Problem Solving:
Data Structures, Algorithms, Python Modules and Coding Interview Problem Patterns

Li Yin¹

February 6, 2022

¹ https://liyinscience.com
Contents
0 Preface
I Introduction
5 Introduction to Combinatorics
5.1 Permutation
5.1.1 n Things in m Positions
5.1.2 Recurrence Relation and Math Induction
5.1.3 See Permutation in Problems
5.2 Combination
5.2.1 Recurrence Relation and Math Induction
5.3 Partition
5.3.1 Integer Partition
5.3.2 Set Partition
5.4 Array Partition
5.5 Merge
5.6 More Combinatorics
6 Recurrence Relations
6.1 Introduction
6.2 General Methods to Solve Linear Recurrence Relation
6.2.1 Iterative Method
8 Bit Manipulation
8.1 Python Bitwise Operators
8.2 Python Built-in Functions
8.3 Two's-complement Binary
8.4 Useful Combined Bit Operations
8.5 Applications
8.6 Exercises
List of Figures
2.1 The State Space Graph. This may appear as a tree, but we can redraw it as a graph.
2.2 State Transfer Process on a linear structure
2.3 State Transfer Process on the tree
2.4 Linear Search on explicit linear data structure
2.5 Binary Search on an implicit Tree Structure
2.6 The State Space Graph
2.7 State Transfer Tree Structure for LIS; each path represents a possible solution. Each arrow represents a move: find an element among the following elements that is larger than the current node.
15.1 The whole process for insertion sort: gray marks the item to be processed, and yellow marks the position after which the gray item is to be inserted into the sorted region.
15.2 One pass for bubble sort
15.3 The whole process for selection sort
15.4 Merge Sort: the dividing process is marked with dark arrows and the merging process with gray arrows, with the merged list marked in gray too.
15.5 Lomuto's Partition. Yellow, white, and gray mark regions (1), (2), and (3), respectively.
15.6 Bucket Sort
15.7 Counting Sort: the process of counting occurrences and computing the prefix sum.
15.8 Counting Sort: sorting keys according to the prefix sum.
15.9 Radix Sort: LSD sorting integers in iteration
15.10 Radix Sort: MSD sorting strings in recursion. The black and grey arrows indicate the forward and backward pass in recursion, respectively.
15.11 The time complexity of common sorting algorithms
18.1 Graph Model for LIS; each path represents a possible solution.
18.2 The solution to LIS.
20.18 The update on D for Fig. 20.12. The gray filled spots mark the nodes that updated their estimate values, with their predecessors indicated by incoming red arrows.
20.19 The tree structure indicating the updates on D, with the shortest-path tree marked by red arrows.
20.20 The execution of the Bellman-Ford algorithm with ordering [s, t, y, z, x].
20.21 The execution of the Bellman-Ford algorithm on a DAG using topologically sorted vertices. The red color marks the shortest-paths tree.
20.22 The execution of Dijkstra's algorithm on a non-negative weighted graph. Red circled vertices represent the priority queue, and blue circled vertices represent the set S. Eventually, the blue colored edges represent the shortest-paths tree.
20.23 All shortest-path trees starting from each vertex.
22.1 The process of the brute force exact pattern matching
22.2 The Skipping Rule
22.3 The Sliding Rule
22.4 Proof of Lemma
22.5 Z function property
22.6 Cyclic Shifts
22.7 Building a Trie from Patterns
22.8 Trie vs. Compact Trie
22.9 Trie Structure
29.1 State Transfer Tree Structure for LIS; each path represents a possible solution. Each arrow represents a move: find an element among the following elements that is larger than the current node.
29.2 Word Break with DFS. For the tree, each arrow means checking the word = parent-child and then recursively checking the result of the child.
29.3 Caption
29.4 One-Time Graph Traversal. Different colors mean different levels of traversal.
29.5 Caption
29.6 Caption
29.7 Tree Structure for one-dimensional coordinates
29.8 Longest Common Subsequence
29.9 Caption
Preface
Interviews, and Python objects and modules. I tried hard to do a good job. This book differs from books that focus on extracting the exact formulation of problems from the fuzzy and obscure real world. We focus on learning the principles of algorithm design and analysis and on practicing them using well-defined classical problems. This knowledge will also help you define a problem more easily in your job.
Li Yin
http://liyinscience.com
8/30/2019
Acknowledgements
1 Reading of This Book
1.1 Structures
I summarize here the characteristics that potentially set this book apart from other books on the market: starting with what I consider the core principles of algorithm design–the "source" of the wisdom I was after, as mentioned in the preface–then illustrating the concise organization of the content, and finally highlighting other unique features of this book.
Core Principles
Algorithm problem solving follows a few core principles: Search and Combinatorics, Reduce and Conquer, and Optimization via Space-Time Trade-off or Greedy strategies. We specifically put these principles in one single part of the book–Part IV.
1. In Part IV (Search and Combinatorics), I teach how to formulate problems as search problems, using combinatorics from the field of math to enumerate the state space—the solution space of all possibilities. Then we further optimize and improve efficiency through "backtracking" techniques.
Figure 1.1: Four umbrellas: each row indicates corresponding parts as out-
lined in this book.
4. Coding interview problem patterns: We close the book by analyzing and categorizing problems by patterns. We address classical and best solutions for each problem pattern.
Q&A
What do we not cover? In the spectrum of coding interviews and the spectrum of algorithms, we do not include:
• Although this book is a comprehensive combination of Algorithmic Problem Solving and Coding Interview Coaching, I decided not to provide preparation guidelines for the topic of System Design, to avoid deviating from our main topic. An additional reason is that, personally, I have no experience with this topic yet, and it is not a topic I am currently interested in either, so a better option is to look for it in another book.
Problem Setting Compared with other books that talk about problem solving (e.g., Problem Solving with Algorithms and Data Structures), we do not talk about problems in complex settings. We want the audience to have a simple setting so that they can focus more on analyzing the algorithm's or data structure's behaviors. This way, we keep our code clean, and it also serves the purpose of the coding interview, in which interviewees are required to write simpler and less code than in a real engineering problem because of the time limit.
Therefore, the purpose of this book is three-fold: to answer your questions about the interview process, to prepare you fully for the "coding interview", and, most importantly, to master algorithm design and analysis principles, sense the beauty of them, and in the future use them in your work.
more graph and tree based algorithms. Also, DFS is a good example of recursive programming.
On the chapter level, if you are confident enough with some chapters or you think they are too trivial, just skim them, given that the book is designed to be self-contained across multiple fields (programming languages, algorithmic problem solving, and coding interview problem patterns).
The content within the book is almost always partitioned into paragraphs with titles. This conveniently allows us to skip parts that are just for enrichment purposes, such as "stories". This helps us skim within each chapter.
Part I
Introduction
2 The Global Picture of Algorithmic Problem Solving
2.1 Introduction
In the past, a person capable of solving complex math or physics computation problems faster than ordinary people stood out and was highly sought after. For example, during World War II, the codebreaking effort Alan Turing worked on famously recruited people who were fast at solving crossword puzzles. These kinds of stories died out with the rise of powerful machines, with which the magic wand was handed over to programmers–the ones able to harness the continually growing computation power of the hardware to solve, with algorithms, problems that once only a handful of people, or no one, could solve.
There are many kinds of programmers. Some of them code up the real-world, obvious and easy rules to implement applications; some others tackle more computational problems with knowledge of math, calculus, geometry, physics, and so on. We give a universal definition of algorithmic problem solving–information processing. Three essential parts include: data structures, algorithms, and programming languages. Knowing some basic data structures, some types of programming languages, and some basic algorithms is enough for the first type of programmer. They might focus
more on the front-end, such as mobile design or webpage design. The second type of programmer, however, needs to be equipped with more advanced data structures and algorithm design and analysis techniques. Sadly, that is all just a start; the real power lies in the combination of these algorithm design methodologies and other subjects. Math, among all, is the most important, for both design and analysis, as we will see in this book. Still, a candidate with strong algorithmic skills is off to a good start; with some basic math knowledge, we can almost always manage to solve problems with brute force searching, and some others with dynamic programming.
Let us continue to define algorithmic problem solving as information processing–just what it is, and not how, at this moment.
2.1.1 What?
Introduction to Data Structures Information is the data we care about, which needs to be structured. We can think of a data structure as our low-level file manager, which needs to support four basic operations: 'find' the file that belongs to Bob, 'add' Emily's file, 'delete' Shawn's file, and 'modify' Bob's file. Why structured? If you were the file manager, would you just throw all the hundreds of files over the floor, or toss them loose into a drawer? Nope, you line them up in the drawer, or you even put a name on top of each file and order the files by first name. The way data is structured in a program is similar to a real-world system: simply lining items up, or organizing them like a tree structure when there is some hierarchical ordering, as appears in institutions and companies.
We can code these rules in a certain programming language and let the computer take over, and if the task does not demand billions of operations, the computer will get the result far faster than humans are capable of; this is why we need computers anyway.
2.1.2 How?
Knowing what it is, now you would ask how. How can we know how to organize our data, how to design and analyze our algorithms, and how to program them? We need to study existing and well-designed data structures, algorithm design principles, and algorithm analysis techniques; understand and analyze our problems; and study the classical algorithms that our predecessors invented for solving classical problems. Only then, when we see a problem, old or new, are we prepared: we compare it with problems we know how to solve. If it is exactly the same, congratulations, we can solve our problem; if it is similar to a certain category of problems, at least we start from a direction and not from scratch; if it is totally new, at least we have our algorithm design principles and analysis techniques, and we can design a solution after understanding the problem and relating it to all our skills. Of course, there are problems that nobody has been able to solve yet. We will study this in the book so that you can identify when the problem you are solving is too hard.
2.2 Introduction
Algorithms are Not New Algorithms should not be considered purely abstract and obscure. They originate from real-life problem solving, including the time before computers (machines) even existed. Recurrences were studied as early as 1202 by L. Fibonacci, for whom the Fibonacci numbers are named. Algorithms, as sets of rules and actions to solve a problem, leverage any form of knowledge–math, physics. Math stands out among all, as it is our tool to understand problems, represent relations, solve problems, and analyze complexity. In this book, we use math in the most practical way and only at places where it really matters. The difference is that a computer program written in a certain computer language to execute the algorithm is generally far more efficient than doing it in person.
Algorithms are Everywhere in our daily life. Assume you are given a group of people, and your task is to find out whether there is a person in the group who was born on a certain day. The most intuitive way is to check each of them and see if his or her birthday matches the target; this needs you to go a full round of this group of people. If you observe that this group of people is grouped by month, then you can narrow down the number of checks by checking only the subgroup that matches the month of your target day. The first way is the easiest and most straightforward way to solve a problem, which is called brute force. The second one involves more observation and might take less time to get the answer. However, they both have one thing in common: they need us to narrow down the possibilities; in the first way, we narrow it down one by one, and in the second, we narrow it down by almost 11/12 of the original possibilities at once. We can say solving the problem is to find its solution in the solution space, and a different way of finding the solution is called a different algorithm.
solution space and simulating the process; finding the series of actions
between the input and output instance.
For example, we formulate the problem of drawing from the pool as: given a list of unsorted integers, find whether the number 11 is in the list; return true or false.

Example:
Given the list: [1, 34, 8, 15, 0, 7]
Return False because 11 does not appear in the list.
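As a quick illustration, here is a minimal brute-force check in Python; the function name is_in_list and the printed call are our own sketch, not the book's code.

# Brute force: scan every item until the target is found.
def is_in_list(nums, target):
    for num in nums:
        if num == target:
            return True
    return False

print(is_in_list([1, 34, 8, 15, 0, 7], 11))  # False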
Problem Categories
1. P Problems:
2. NP Problems:
There are more types, such as undecidable problems and the halting problem; feel free to look them up if interested.
Figure 2.1: The State Space Graph. This may appear as a tree, but we can redraw it as a graph.
1. Initial State: the state where our algorithm starts. In our example, we can scan the whole list starting from the leftmost position 0, which we denote as S(0). Note that a state does not have to equal a point on the input instance; it can be a range–such as from position 0 to 5, or from 0 to 2–or any state you define.
3. State Transfer Model: decides the state that results from doing an action a at state s. We denote it as T(a, s). For example, if we are at position 1 and move one step, MOVE(1), then we can reach state 2, which can be denoted as 2 = T(MOVE(1), 1).
4. State Space: the set of all states reachable from the initial state by any sequence of actions; in our case, it can be 0, 1, 2, 3, 4, 5. We can infer the state space of the problem from the initial state, actions, and transfer model. The state space forms a directed network or graph in which the nodes are states and the links between nodes are actions. Graph, with all its flexibility, is a universal and natural way to represent relations. For example, if we limit the maximum moves we can make at each state to two, the state space will be formed as shown in Fig. 2.6.
In practice, drawing the graph as a tree structure is another option.
Apply Data Structures With the state space graph, our problem is abstracted to finding a node with value 4, and graph algorithms–more specifically, graph search–can be applied to solve the problem. It does not take an expert to tell us, "This graph just complicated the situation, because our intuition can lead us to a much simpler and more straightforward solution: scan the items from the leftmost to the rightmost one by one." True! As depicted in Fig. 2.2, the problem can be modeled using a linear structure, possibly a list or linked list, and we only need to consider one action out of all options, MOVE(1); then our search covers the whole state space, which makes the algorithm we designed complete¹. On the other hand, in the state space graph, if we insist on moving two steps each time, we will not be able to cover the whole state space, and might end up not finding our target, which indicates this algorithm is incomplete.
Instead of using a linear data structure, we can restructure the states as a tree if we refine a state as a range of items. The initial state is the whole subarray in which the target can be found, denoted as S(0, 5). Starting from the initial state, each time we divide the space into two halves: S(s, m) and S(m, e), where s and e are the start and end indices, respectively, and m = (s + e)//2, meaning the integer part of (s + e) divided by 2. We do this to all nodes repeatedly, and we will have another state transfer graph, shown in Fig. 2.3. From this graph we can see that the last node is where we cannot divide any further.
¹ Check complexity analysis.
Given the state transfer graph in Fig. 2.2, we simply iterate over each state and compare each item with our target to see if it is equal; if so, we have found our target and return; if not, we continue to the end. This simple search method is depicted in Fig. 2.4. What if we know that the data is already organized in ascending order? With the tree data structure, when given a specific target, we only need to choose one action from the action set: either move left or right to search, depending on whether the target is larger or smaller than the item in the middle of the state. When 4 is the target, we have the search process depicted in Fig. 2.5.
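A minimal sketch of this idea in Python is shown below; it assumes the list is sorted in ascending order, and the function and variable names are our own.

# Binary search: repeatedly halve the state S(s, e) on sorted data.
def binary_search(nums, target):
    s, e = 0, len(nums) - 1
    while s <= e:
        m = (s + e) // 2        # middle of the current state
        if nums[m] == target:
            return m            # goal test passes
        elif nums[m] < target:
            s = m + 1           # target is larger: move right
        else:
            e = m - 1           # target is smaller: move left
    return -1                   # target is not in the list

print(binary_search([1, 2, 3, 4, 5, 6], 4))  # 3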
All these state spaces, data structures, algorithms, and analyses might appear overwhelming to you for now. But as you learn, you will find that some of these steps are not necessary; still, knowing these elements is good for analyzing and learning new algorithms. Think of it more as gathering terminology into your language base.
1. Being smarter, so that we can decrease the cost, increase the speed, and yet still give out the exact solution we are looking for. This comes down to optimization, for which we have divide and conquer (Chapter ??), dynamic programming (Chapter ??), and greedy algorithms (Chapter ??). What is the commonality between them? They all in some way need
node; and m, the maximum length of any path in the state space. Time is often measured in terms of the number of nodes in the search tree, and space in terms of the maximum number of nodes stored in memory. For the most part we describe time and space complexity for search on a tree; for a graph, the answer depends on how "redundant" the paths or "loops" in the state space are.
Part ??, which might give us almost the best efficiency we can find.
2.7 Exercise
2.7.1 Knowledge Check
Longest Increasing Subsequence
Given a list of items A = [1, 2, 3, 4, 5, 6], find the position of item with value
4.
1. Initial State: the state where our algorithm starts. In our example, we can scan the whole list starting from the leftmost position 0, denoted S(0).
5. Goal Test: the goal test determines whether a given state is a goal state. Sometimes there is an explicit set of possible goal states, and the test simply checks whether the given state is one of them, such as in this example, where the goal state is 4. Sometimes the goal is specified by an abstract property rather than an explicitly enumerated set of states. For example, in constraint satisfaction problems (CSPs) such as the n-queens puzzle, the goal is to reach a state in which not a single pair of queens attacks each other.
Figure 2.7: State Transfer Tree Structure for LIS; each path represents a possible solution. Each arrow represents a move: find an element among the following elements that is larger than the current node.
1. Model the problem as a directed graph, where each node is an element of the array, and an edge from u to v means v > u. The problem now becomes finding the longest path from any node to any node in this directed graph.
2. Model the problem as a tree. The tree starts from an empty root node; at each level i, the tree has n-i possible children: nums[i+1], nums[i+2], ..., nums[n-1]. There will only be an edge if the child's value is larger than its parent's. Or we can model the tree as a multi-choice tree: as in a combination problem, each element can either be chosen or not chosen. We would end up with two branches, and the chosen nodes would form a path of the LIS; therefore, the longest LIS exists at the leaf node of the longest path (see the sketch after this list).
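The following Python sketch is our own illustration of the first model: it enumerates increasing subsequences with DFS. It runs in exponential time; later chapters improve on this with dynamic programming.

# DFS over the LIS state-transfer structure: from each element, extend
# the current path with any later element that is strictly larger.
def longest_increasing_subsequence(nums):
    best = []

    def dfs(i, path):
        nonlocal best
        if len(path) > len(best):
            best = path[:]
        for j in range(i + 1, len(nums)):
            if not path or nums[j] > path[-1]:
                path.append(nums[j])
                dfs(j, path)
                path.pop()

    dfs(-1, [])
    return best

print(longest_increasing_subsequence([10, 9, 2, 5, 3, 7]))  # [2, 5, 7]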
3 Coding Interviews and Resources
• Exploratory chat with recruiters: Either you applied for the position and passed the initial screening, or you were luckily found by recruiters; they will contact you to schedule a short chat, normally over the phone. During the phone call, the recruiter introduces the company and the position, and asks about your field of interest, just to check the degree of interest on either side and decide whether the process should continue.
• On-site interviews: If you have passed the first two rounds of interviews, you will be invited to the on-site interviews, which are the most fun and exciting, but might also be the most daunting and tiring part of the whole process, since they can last anywhere from four hours to the entire day. The company will offer both transportation and accommodation to get you there. The on-site interview consists of 4 to 6 one-on-one rounds, each with an engineer on the team and lasting 45-60 minutes; because of the long process, a lunch interview is typically included. There are some extras, which may or may not be included: a group presentation, a recruiter conversation, or a conversation with the hiring manager or team manager. Presentations tend to happen for research scientist or higher-level positions. The on-site interview is more diverse than the screening interview: introduction, coding interviews, brain-teaser questions, behavioral questions, and questions related to the field of the position, such as machine learning or web development. The lunch interview is just hanging out with whoever is arranged to be with you, chatting while eating, and in some cases being shown around the company.
In some cases, you may have to do an online assignment, which happens more with some start-ups and second-tier tech companies; it requires you to spend at least two hours solving problems without any promise that it will lead to real interviews. Personally, I have done that twice, with companies such as ; and I never heard back from them. I fully resent such assignments; they are unfair because they waste my time but not the company's, I learned nothing, and the process was boring as hell! Ever since then, I have decided to stop the interview whenever such a chore is demanded!
Both the first and the second stages serve as an initial screening process; the purpose is obvious: it is a trade-off because of the cost, since the remaining interview process can be quite costly in terms of finance–accommodation and transportation if you get the on-site interviews–and in terms of time–the time cost on each side, but mainly the 5-8 hours spent on each interviewee by multiple engineers of the hiring company.
Sometimes the process differs slightly between internship and full-time positions; interns typically do not need on-site interviews. For new graduates, getting an internship first and turning it into a full-time offer can ease the whole job-hunting process a bit. More experienced engineers might get invited on-site without the screening.
Writing code on paper was a common, natural and effective way for programmers to prepare code before typing it into the computer in that era. As Bill Gates described the experience in a commencement speech at his alma mater, Lakeside School: "You had to type up your program off-line and create this paper tape—and then you would dial up the computer and get on, and get the paper in there, and while you were programming, everybody would crowd around, shouting: 'Hey, you made a typing mistake.' 'Hey, you messed this up.' 'Hey, you're taking too much time.'"
Writing code was conducted on whiteboards rather than on paper in the 1990s, when software engineering was growing exponentially with the rise of the
Discussion
Is it a good way to test and select talented candidates? There are different opinions; in general, people either favor it or oppose it. Stating the reasons on each side is boring and does not bear much value. Let us see what people say about it in their own words.
Me: How do you think about coding interviews?
Susie Chen: Ummmmmm well there was like one full month I only did Leetcode before interviewing with Facebook. LOL, was a bad experience but worth it hahahah. (Susie was an intern at Facebook, with a bachelor's degree from University of.)
.....................................................................
Me: How does the coding interview play its role for new graduates and experienced engineers?
Eric Lin :
• Common: Both require previous proj demo/desc. Your work
matters more than your score. Ppl care more about the actual
experience than the paper work.
3.2.1 Tips
Tips for Preparation
4. If you are not fluent in English, practice even more using English! This is the situation for a lot of international STEM students, including me.
I wish I had known the last three tips when I first started to prepare for interviews with Google back in 2016 (at least I followed the first tip and went for my dream company without hesitation, huh), one year into my PhD. It was my very first try; I prepared for a month (I mean at least 8 hours a day), reading and finishing all problems from Cracking the Coding Interview. I failed the screening interview, which was conducted over the phone with a shared Google document. I was super duper nervous, taking a long time just to understand the question itself given my poor spoken English at the time, and the noise from the phone made the situation worse (talking with people on the phone scares me more than the ghost from The Shining, by Stephen King). At that time, I also did not have a clue about LeetCode.
5 Tips for the Interview Here, we summarize five tips for when we are doing a real interview or mocking one beforehand during preparation.
1. Identify Problem Types Quickly: When given a problem, we read through the description to first understand the task clearly, run small examples from input to output, and see how it works intuitively in our mind. After this process, we should be able to identify the type of the problem. There are 10 main categories, and their distribution on LeetCode also reflects the frequency of each type in real coding interviews.
3.2.2 Resources
Online Judge System
Leetcode LeetCode is a website where you can practice on real interview questions used by tech companies such as Facebook, Amazon, Google, and so on.
Here are a few tips to navigate the usage of LeetCode:
• Use test cases to debug: Before we submit our code on LeetCode, we should first use the test case function shown in Fig. 3.4 to debug and verify our code. This is also the right mindset and process in a real interview.
Communities
If you understand Chinese, there is a good community¹ where people share information about interviews, career advice, and job package comparisons.
¹ http://www.1point3acres.com/bbs/
Part II
4 Abstract Data Structures
4.1 Introduction
Leaving alone statements that "data structures are building blocks of algorithms", they are just mimicking, in the digital sphere, how things and events are organized in the real world. Imagine that a data structure is an old-school file manager that has some basic operations: searching, modifying, inserting, deleting, and potentially sorting. In this chapter, we simply learn how a file manager 'lays out' his or her files (structures) and each layout's corresponding operations that support his or her work.
We say the data structures introduced in this chapter are abstract or idiomatic, because they are conventionally defined structures. Understanding these abstract data structures is like learning the terminology of computer science. We further provide each abstract data structure's corresponding Python data structure in Part III.
There are generally three broad ways to organize data: linear, tree-like, and graph-like, which we introduce in the following three sections.
Items We use the notion of items throughout this book as a generic name for unspecified data types.
Records
Static Array An array or static array is a container that holds a fixed-size sequence of items stored at contiguous memory locations, where each item is identified by an array index or key. The array representation is shown in Fig. 4.1. Since it uses contiguous memory locations, once we know the physical position of the first element, an offset based on the data type can be used to access any item in the array in O(1), which is characterized as random access. Because the items are physically stored contiguously, one after the other, the array is the most efficient data structure to store and access items. Specifically, the array is designed and used for fast random access of data.
Dynamic Array With a static array, once we have declared its size, we are not allowed to do any operation that would change that size; that is, we are banned from inserting or deleting any item at any position of the array. In order to be able to change size, we can go for a dynamic array: static arrays and dynamic arrays differ in whether the size is fixed. A simple dynamic array can be constructed by allocating a static array, typically larger than the number of elements immediately required. The elements of the dynamic array are stored contiguously at the start of the underlying array, and the remaining positions towards the end of the underlying array are reserved, or unused. Elements can be added at the end of a dynamic array in constant time by using the reserved space, until this space is completely consumed. When all space is consumed and an additional element is to be added, the underlying fixed-size array needs to be increased in size. Resizing is typically expensive because it involves allocating a new underlying array and copying each element from the original array. Elements can be removed from the end of a dynamic array in constant time, as no resizing is required. The number of elements used by the dynamic array contents is its logical size or size, while the size of the underlying array is called the dynamic array's capacity or physical size, which is the maximum possible size without relocating data. Moreover, if
the memory size of the array is beyond the memory size of your computer, it could be impossible to fit the entire array in, and then we would resort to other data structures that do not require physical contiguity, such as the linked list, tree, heap, and graph that we introduce next.
• Random access: it takes O(1) time to access an item in the array given its index;
• Search and iteration: it takes O(n) time to iterate over all the elements of the array. Similarly, searching for an item by value through iteration takes O(n) time too.
No matter it’s static or dynamic array, they are static data structures; the
underlying implementation of dynamic array is static array. When frequent
need of insertion and deletion, we need dynamic data structures, The concept
of static array and dynamic array exist in programming languages such as
C–for example, we declare int a[10] and int* a = new int[10], but not
in Python, which is fully dynamically typed(need more clarification).
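To make the resizing behavior concrete, here is a minimal dynamic array sketch in Python built on a fixed-capacity list; the class name and the doubling factor are our own illustrative choices.

# A toy dynamic array: append doubles the capacity when full.
class DynamicArray:
    def __init__(self):
        self._capacity = 1                    # physical size
        self._size = 0                        # logical size
        self._data = [None] * self._capacity

    def append(self, item):
        if self._size == self._capacity:
            self._resize(2 * self._capacity)  # expensive: copies every item
        self._data[self._size] = item
        self._size += 1

    def _resize(self, new_capacity):
        new_data = [None] * new_capacity
        for i in range(self._size):
            new_data[i] = self._data[i]
        self._data = new_data
        self._capacity = new_capacity

    def __getitem__(self, index):             # O(1) random access
        return self._data[index]

arr = DynamicArray()
for x in range(5):
    arr.append(x)
print(arr[4])  # 4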
Singly and Doubly Linked List When each node has only one pointer, the list is called a singly linked list, which means we can only scan nodes in one direction; when there are two pointers, one pointing to the node's predecessor and another to its successor, it is called a doubly linked list, which supports traversal in both forward and backward directions.
• Search and Iteration: O(n) time for linked list to iterate all items.
Similarly to search an item by value through iteration takes O(n) time
too.
• Extra memory space for a pointer is required with each element of the
list.
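A minimal sketch of a singly linked list node and an O(n) traversal in Python (the names are ours):

# Singly linked list: each node stores a value and a pointer to its successor.
class ListNode:
    def __init__(self, val):
        self.val = val
        self.next = None

# Build 1 -> 2 -> 3, then iterate: O(n) time, one extra pointer per element.
head = ListNode(1)
head.next = ListNode(2)
head.next.next = ListNode(3)

node = head
while node:
    print(node.val, end=' ')  # 1 2 3
    node = node.next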
This process is shown in Fig. 4.4. We can simply think of a stack as a stack of plates: we always put back and fetch a plate from the top of the pile. A queue is just like a real-life line: to be first served with your delicious ice cream, you need to be at the head of the line.
Implementation-wise, stacks and queues are simply dynamic arrays to which we add items by appending at the end; they only differ in the delete operation: for a stack, we delete the item from the end; for a queue, we delete the item from the front instead. Of course, we can also implement them with any other linear data structure, such as a linked list. Conventionally, the add and delete operations are called "push" and "pop" for a stack, and "enqueue" and "dequeue" for a queue.
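A hedged minimal sketch in Python: a list already behaves as a stack, and collections.deque gives an efficient queue (popping from the front of a plain list is O(n), which is why deque is preferred).

from collections import deque

# Stack: push and pop at the end of a list, both O(1) amortized.
stack = []
stack.append(1)          # push
stack.append(2)
print(stack.pop())       # 2, Last In First Out

# Queue: enqueue at the back, dequeue from the front, both O(1).
queue = deque()
queue.append(1)          # enqueue
queue.append(2)
print(queue.popleft())   # 1, First In First Out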
Operations Stacks and queues support limited access and limited insertion and deletion; search and iteration rely on the underlying data structure.
Stacks and queues are widely used in computer science. First, they are used to implement the three fundamental searching strategies–depth-first, breadth-first, and priority-first search. Also, a stack is a recursive data structure, as it can be defined as: a stack is either empty, or it consists of a top item on top of the rest, which is itself a stack.
Hashing Functions
The essence of designing hash functions is uniformity and randomness. We further use h(k, m) to represent our hash function, which points out that it takes two variables as input: the key k, and the size m of the table where values are saved. One essential rule for hashing is that if two keys are equal, the hash function should produce the same value (h(s, m) = h(t, m) if s = t). And we try our best to minimize collisions, making it unlikely for two distinct keys to hash to the same value. With n keys distributed over m slots, the expected number of keys per slot is α = n/m, which is called the loading factor and is a critical statistic for designing hash functions and analyzing their performance. Besides, a good hash function satisfies the condition of simple uniform hashing: each key is equally likely to be mapped to any of the m slots. But usually it is not possible to check this condition, because one rarely knows the probability distribution according to which the keys are drawn.
Because the keys share a common factor c = 2 with the bucket size m = 4, this decreases the range of the remainder to m/c of its original range. As shown in the example, the remainder is just {0, 2}, which is only half of the space of m. The real loading factor increases to cα. Using a prime number is an easy way to avoid this, since a prime number has no factors other than 1 and itself.
If the size of the table cannot easily be adjusted to a prime number, we can use h(k, m) = (k % p) % m, where p is a prime number which should be chosen from the range m < p < |U|.
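A quick Python sketch contrasting the two choices (the key set is our own example):

# Division hashing: a composite m wastes slots when keys share a factor.
keys = [10, 20, 30, 40, 50]
print([k % 4 for k in keys])        # [2, 0, 2, 0, 2]: only slots {0, 2} used

# h(k, m) = (k % p) % m with a prime p > m spreads the keys more evenly.
p, m = 7, 4
print([(k % p) % m for k in keys])  # [3, 2, 2, 1, 1]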
Resolving Collision
Collisions are unavoidable given that m < n, and sometimes it is just pure bad luck that the data you have and the chosen hash function produce lots of collisions; thus, we need mechanisms to resolve possible collisions. We introduce three methods: chaining, open addressing, and perfect hashing.
Figure 4.6: Hashtable chaining to resolve the collision.
Chaining An easy approach is to chain the keys that have the same hash value using a linked list (either singly or doubly linked). For example, when h(k, m) = k % 4 and keys = [10, 20, 30, 40, 50], the keys 10, 30, 50 are mapped to the same slot 2. Therefore, we chain them up at index 2 using a singly linked list, as shown in Fig. 4.6. This method has the following characteristics:
• the worst-case behavior is when all keys are mapped to the same slot.
The advantage of chaining is its flexible size, because we can always add more items by chaining them on; this is useful when the size of the input set is unknown or too large. However, the advantage comes at the price of extra space due to the use of pointers.
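A minimal chaining hash table in Python, using plain lists as the chains (our own sketch):

# Chaining: each slot holds a list of all keys hashing to that slot.
class ChainedHashTable:
    def __init__(self, m=4):
        self.m = m
        self.slots = [[] for _ in range(m)]

    def insert(self, key):
        self.slots[key % self.m].append(key)

    def search(self, key):
        return key in self.slots[key % self.m]  # O(length of the chain)

table = ChainedHashTable()
for k in [10, 20, 30, 40, 50]:
    table.insert(k)
print(table.slots)       # [[20, 40], [], [10, 30, 50], []]
print(table.search(30))  # True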
Open Addressing In open addressing, all items are stored in the hash table itself; this requires the size of the hash table to satisfy m ≥ n, making each slot either contain an item or be empty, with load factor α ≤ 1. This avoids the use of pointers, saving space. So, here is the question: what would you do if there is a collision?
i marks the number of tries. Now, try to delete an item from a linear probing table where we know that T[p1] = k1, T[p1 + 1] = k2, T[p1 + 2] = k3. Say we delete k1: we repeat the hash function, find it, and delete it from p1; this works well, and we have T[p1] = NULL. Then we need to delete or search for k2: we first probe p1, find that it is already empty, and conclude that k2 must have been deleted. Do you see the problem here? k2 is actually at p1 + 1, but from this process we could not know that. A simple resolution: instead of really deleting the value, we put a flag, say deleted, at any position where a key is supposed to be deleted. Now, to delete k1, we mark p1 as deleted. This time, when we try to delete k2, we first go to p1, see that its value does not equal k2, and know we should move on to p1 + 1 and check its value: it matches, nice, and we put a marker there again.
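The tombstone idea can be sketched in Python as follows; the slot layout, sentinel objects, and names are our own.

# Linear probing with a 'deleted' tombstone: searches keep probing past
# removed slots instead of stopping at the first empty one.
EMPTY, DELETED = object(), object()

class LinearProbingTable:
    def __init__(self, m=8):
        self.m = m
        self.T = [EMPTY] * m

    def insert(self, key):
        i = key % self.m
        while self.T[i] is not EMPTY and self.T[i] is not DELETED:
            i = (i + 1) % self.m       # probe the next slot
        self.T[i] = key

    def delete(self, key):
        i = key % self.m
        while self.T[i] is not EMPTY:  # stop only at a truly empty slot
            if self.T[i] == key:
                self.T[i] = DELETED    # leave a tombstone, not EMPTY
                return True
            i = (i + 1) % self.m
        return False

t = LinearProbingTable()
for k in (8, 16, 24):       # all three collide at slot 0
    t.insert(k)
t.delete(8)
print(t.delete(16))         # True: probing continues past the tombstone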
*Perfect Hashing
4.3 Graphs
4.3.1 Introduction
Graph is a natural way to represent connections and reasoning between things or events. A graph is made up of vertices (nodes or points) which are connected by edges (arcs or lines). A graph structure is shown in Fig. 4.7. We use G to denote the graph, and V and E to refer to its collections of vertices and edges, respectively. Accordingly, |V| and |E| are used to denote the number of nodes and edges in the graph. An edge between vertices u and v is denoted as a pair (u, v); depending on the type of the graph, the pair can be either ordered or unordered.
There are many fields that heavily rely on graphs, such as probabilistic graphical models applied in computer vision, route problems, network flow in network science, link structures of websites in social media, and so on. We present graph as a data structure. However, graph is really a broad way to model problems; for example, we can model the possible solution space as a graph and apply graph search to find the possible solutions to a problem. So, do not let the physical graph data structures limit our imagination.
The representation of graphs is deferred to Chapter ??. In the next section, we introduce the types of graphs.
vertices other than the start and the end vertices being the same. In an undirected graph, a (simple) cycle is a path that starts and ends at the same vertex, has no repeated vertices other than the first and last, and has length at least three. In this book we will exclusively talk about simple cycles and hence, as with paths, we will often drop "simple". A graph is acyclic if it contains no cycles. Directed acyclic graphs are often abbreviated as DAGs.
vertices within the same set are connected to each other or adjacent. A bipartite graph is a graph with no odd cycles; equivalently, it is a graph that may be properly colored with two colors. See Fig. 4.8.
4.3.3 Reference
1. http://www.cs.cmu.edu/afs/cs/academic/class/15210-f14/www/lectures/graph-intro.pdf
4.4 Trees
Trees in Interviews The most widely used trees are the binary tree and the binary search tree, which are also the most popular tree problems you will encounter in real interviews. There is a large chance you will be asked to solve a binary tree or binary search tree related problem in a real coding interview, especially as a new graduate who has no real industrial experience and has had little chance to put the related knowledge into practice.
4.4.1 Introduction
A tree is essentially a simple graph which is (1) connected, (2) acyclic, and (3) undirected. Connecting n nodes without a cycle requires n − 1 edges. Adding one edge will create a cycle, and removing one edge will divide a tree into two components. A tree can be represented as a graph using the representations we learned in the last section; such a tree is called a free tree. A forest is a set of n ≥ 0 disjoint trees.
However, free trees are not commonly seen and applied in computer science (nor in coding interviews), and there is a better alternative–rooted trees. In a rooted tree, a special node is singled out, called the root, and all the edges are oriented to point away from the root. The root node and one-way structure enable the rooted tree to indicate a hierarchical relation between nodes, which is not so in the free tree. A comparison between the free tree and the rooted tree is shown in Fig. 4.9.
Rooted Trees
A rooted tree introduces a parent-child, sibling relationship between
nodes to indicate the hierarchy relation.
Figure 4.9: Example of Trees. Left: Free Tree, Right: Rooted Tree with
height and depth denoted
Three Types of Nodes Just like a real tree, we have the root, branches, and finally the leaves. The first node of the tree is called the root node, which is likely to be connected to several underlying children node(s), making the root node the parent node of its children. Besides the root node, there are two other kinds of nodes: inner nodes and leaf nodes. A leaf node can be found at the last level of the tree and has no further children. An inner node is any node in the tree that has both a parent node and children, which is also any node that cannot be characterized as either a leaf or root node. A node can be both the root and a leaf node at the same time, if it is the only node composing the tree.
• Depth: The depth (or level) of a node is the number of edges from
the node to the tree’s root node. The depth of the root node is 0.
• Height: The height of a tree is the height of its root node, or equivalently, the depth of its deepest node.
• Trees can always be used to organize data and can come with efficient information retrieval. Because of the recursive tree structure, divide and conquer can easily be applied to trees (a problem can most likely be divided into subproblems related to its subtrees). For example, we have the Segment Tree, Binary Search Tree, and Binary Heap, and for pattern matching, we have tries and suffix trees.
1. Unlike arrays and linked lists, a tree is hierarchical: (1) we can store information that naturally forms a hierarchy, e.g., the file system on a computer or the employee relations at a company; (2) we can organize the keys of the tree with an ordering, e.g., the Binary Search Tree, the Segment Tree, or the Trie used to implement prefix lookup for strings.
2. Trees are relevant to the study of analysis of algorithms not only be-
cause they implicitly model the behavior of recursive programs but also
because they are involved explicitly in many basic algorithms that are
widely used.
For a rooted tree, if each node has no more than N children, it is called an N-ary tree. When N = 2, it is further distinguished as a binary tree, where the two possible children are typically called the left child and right child. Fig. 4.10 shows a comparison of a 6-ary tree and a binary tree. The binary tree is more common than the N-ary tree because it is simpler and more concise, thus making it more popular in coding interviews.
Types of Binary Tree There are four common types of binary tree:
1. Full Binary Tree: A binary tree is full if every node has either 0 or 2 children. We can also say that a full binary tree is a binary tree in which all nodes except leaves have two children. In a full binary tree, the number of leaves |L| and the number of all other non-leaf nodes |NL| satisfy |L| = |NL| + 1. When additionally every level is completely filled, the total number of nodes n relates to the height h as:

n = 2^0 + 2^1 + 2^2 + ... + 2^h   (4.2)
  = 2^(h+1) − 1   (4.3)
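A tiny Python sanity check of these relations, assuming a perfect binary tree (every level completely filled):

# For a perfect binary tree of height h: n = 2^(h+1) - 1 nodes in total,
# and the leaf count exceeds the non-leaf count by exactly one.
for h in range(5):
    n = sum(2**level for level in range(h + 1))
    assert n == 2**(h + 1) - 1
    leaves, non_leaves = 2**h, 2**h - 1
    assert leaves == non_leaves + 1
print("relations hold for h = 0..4")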
5 Introduction to Combinatorics
• the existence of such structures that satisfy certain given criteria; these are usually called constraint satisfaction problems (CSPs).
5.1 Permutation
Given a list of integers [1, 2, 3], how many ways can we order these three numbers? Imagine that we have three positions for these three integers. The first position can choose from 3 integers, leaving the second position with 2 options. Further, when we reach the last position, it can only choose whatever is left, so we have 1. The total count will be 3 × 2 × 1.
Similarly, for n distinct numbers, we get the number of permutations easily as n × (n − 1) × ... × 1. A factorial, denoted as n!, is used to abbreviate it. Worth noticing, the factorial sequence grows even quicker than an exponential sequence such as 2^n.
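We can check these counts directly with Python's standard library (math.perm requires Python 3.8+):

import math

print(math.factorial(3))           # 6 orderings of [1, 2, 3]
print(math.perm(5, 2))             # P(5, 2) = 20 ordered placements
print(math.factorial(10), 2**10)   # 3628800 vs 1024: n! outgrows 2^n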
P(n, m) = n × P(n − 1, m − 1)   (5.4)
        = n × (n − 1) × P(n − 2, m − 2)   (5.5)
        ...   (5.6)
        = n × (n − 1) × ... × (n − m + 2) × P(n − m + 1, 1)   (5.7)
Since P(n − m + 1, 1) = n − m + 1, this gives P(n, m) = n!/(n − m)!.
5.2 Combination
Same as before, we have to choose m things out of n, but with one difference–the order does not matter. How many ways do we have? This problem is called combination, and it is denoted as C(n, m). For example, for [1, 2, 3], C(3, 2) = [1, 2], [2, 3], [1, 3]. Comparatively, P(3, 2) = [1, 2], [2, 1], [2, 3], [3, 2], [1, 3], [3, 1].
To get combinations, we can leverage and apply permutations first. However, this results in over-counting. As shown in our example, when there are two things in the combination, a permutation double counts it. If there are m things, we over-count by m! times. Therefore, if we divide the permutation count by all permutations of m things, we get our formula for combination:

C(n, m) = P(n, m) / P(m, m)   (5.8)
        = n! / ((n − m)! m!)   (5.9)
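Again, a short verification with Python's standard library:

import math

n, m = 3, 2
print(math.perm(n, m))                       # 6
print(math.comb(n, m))                       # 3
print(math.perm(n, m) // math.factorial(m))  # 3: dividing out the m! orderings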
• Use the k-th item: then we just need to choose the remaining items from the other n − 1 items, resulting in C(n − 1, k − 1).
• Do not use the k-th item: this means we need to pick k items from the other n − 1 items, resulting in C(n − 1, k).
5.3 Partition
We discuss three types of partitions: (1) integer partition, (2) set partition, and (3) array partition. In this section, counting becomes less obvious compared with combination and permutation; this is where we rely more on recurrence relations and math induction.
If we draw out the transfer graph, we can see a lot of overlap between some states. Therefore, we add one more limitation on the condition: s > 1.
{a3}, {a1, a2, a4};
{a4}, {a1, a2, a3};
{a1, a2}, {a3, a4};
{a1, a3}, {a2, a4};
{a1, a4}, {a2, a3}
Let us denote the total number of ways as s(n, k). As seen in the example, given 2 groups and 4 items, there are two combinations of group sizes: 1+3 and 2+2. For the combination 1, 3, this is equivalent to choosing one item from the set to put in the first subset, C(4, 1), and then choosing 3 items for the other subset, C(3, 3). For the combination 2, 2, we have C(4, 2) for one subset and C(2, 2) for the other. However, because the ordering of the subsets does not matter, we need to divide by 2!. The set partition problem thus consists of two steps:
From this solution, it is hard to get a closed form for s(n, k).
Find the Recurrence Relation There is just one way left to handle this problem: let us try the incremental method–finding a recurrence relation. We first start with s(0, 0) = 0, and we can also easily get s(n, 0) = 0. Now, with mathematical induction, we assume we have solved a subproblem, say s(n − 1, k − 1); can we induce s(n, k)? What do we need?
Now we have n − 1 items in k − 1 groups, and there is one additional group and one additional item. There are two ways:
• Put the additional item into the additional group. In this way, the count is simply the same as s(n − 1, k − 1).
• Spread the n − 1 items over all k groups, that is, s(n − 1, k); our additional item then has k options, making k × s(n − 1, k) in total.
Combining the counts of these two ways, we get the recurrence relation s(n, k) = s(n − 1, k − 1) + k × s(n − 1, k).
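A direct Python translation of this recurrence (a sketch; note we treat s(0, 0) as 1, the usual convention for the Stirling numbers of the second kind, so that the recursion bottoms out correctly):

from functools import lru_cache

# s(n, k) = s(n-1, k-1) + k * s(n-1, k): the recurrence derived above.
@lru_cache(maxsize=None)
def s(n, k):
    if n == 0 and k == 0:
        return 1   # the empty partition
    if n == 0 or k == 0:
        return 0
    return s(n - 1, k - 1) + k * s(n - 1, k)

print(s(4, 2))  # 7: four 1+3 splits plus three 2+2 splits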
This can be done recursively: we will have a recursive function with depth
m.
Does the ordering of the for loop matter? Actually it does not.
5.5 Merge
You would actually see that for n = 4, there are 16 possible subsequences, which is 2^4. This is no coincidence. Imagine that each item in the array has two options: either be chosen into the possible subsequence or not chosen, which makes it 2^n.

ss = 2^n   (5.15)

If it is the case that ordering does not matter, for n distinct things, the number of possible subsets, also called the power set, will likewise be 2^n.
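We can confirm the 2^n count with a short sketch using itertools:

from itertools import combinations

nums = [1, 2, 3, 4]
# The power set: all subsets of every size r = 0..n.
power_set = [c for r in range(len(nums) + 1) for c in combinations(nums, r)]
print(len(power_set))  # 16 == 2**4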
6 Recurrence Relations
6.1 Introduction
Definition and Concepts A recurrence relation is a function expressed in terms of the same function. More precisely, as defined in mathematics, a recurrence relation is an equation that recursively defines a sequence or multi-dimensional array of values; once one or more initial terms are given, each further term of the sequence or array is defined as a function of the preceding terms. The Fibonacci sequence is one of the most famous recurrence relations
• Polynomial: a_n = a_(n−1) + 1, a_1 = 1 → a_n = n.
• Exponential: a_n = 2 × a_(n−1), a_1 = 1 → a_n = 2^(n−1).
• Factorial: a_n = n × a_(n−1), a_1 = 1 → a_n = n!
the functional relation between T(n) and n that it cares about, not each exact value.
In this section, we focus on solving the recurrence relation using math to get a closed-form solution. Categorizing the recurrence relations can help us pinpoint each type's solving methods.
= T(n/2^3) + 3 O(1)
= ...
= T(1) + k O(1)   (6.7)
We have n/2^k = 1; solving this equation gives k = log_2 n. Most likely T(1) = O(1) will be the initial condition; we substitute this and get T(n) = O(log_2 n).
However, when we try to apply iteration to the third recurrence, T(n) = 3T(n/4) + O(n), it might be tempting to assume that T(n) = O(n log n), due to the fact that T(n) = 2T(n/2) + O(n) leads to this time complexity.
T(n) = 3T(n/4) + O(n)   (6.8)
     = 3(3T(n/4^2) + n/4) + n = 3^2 T(n/4^2) + n(1 + 3/4)
Since the number of terms of T(n) grows, the iteration can look messy. We can use a recursion tree to better visualize the process of iteration. In a recursion tree, each node represents the value of a single subproblem, and a leaf is a base-case subproblem. As a start, we expand T(n) as a root node with value n, which has three children, each representing a subproblem T(n/4). We further do the same with each leaf node, until the subproblem is trivial and becomes a base case; in this case, it is the base case T(1). In practice, we just need to draw a few layers to find the rule. The cost will be the sum of the costs of all layers. The process can be seen in Fig. 6.1. Through the expansion
Figure 6.1: The process to construct a recursion tree for T(n) = 3T(⌊n/4⌋) + O(n). There are in total k+1 levels.
with iteration and recursion tree, our time complexity function becomes:
T(n) = Σ_(i=1..k) L_i + L_(k+1)   (6.12)
     = n Σ_(i=1..k) (3/4)^(i−1) + 3^k T(n/4^k)   (6.13)
In the process, we can see that Eq. 6.13 and Eq. 6.7 are of the same form.
T(n) ≤ 1/(1 − 3/4) · n + 3^(log_4 n) · T(1) = 4n + n^(log_4 3) ≤ 5n   (6.15)
     = O(n)   (6.16)
1. Base case: proves that the property holds for the number 0.
2. Induction step: proves that, if the property holds for one natural num-
ber n, then it holds for the next natural number n + 1.
It is not hard to find the rule and guess T(n) = 2^n − 1. Now, we prove this equation by induction:
T(n) = 2T(n − 1) + 1   (6.17)
     = 2(2^(n−1) − 1) + 1   (6.18)
     = 2^n − 1   (6.19)
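A quick sanity check of the guess against the recurrence (our own test, not from the text):

# Verify T(n) = 2T(n-1) + 1 with T(1) = 1 matches the closed form 2^n - 1.
T = 1
for n in range(2, 11):
    T = 2 * T + 1
    assert T == 2**n - 1
print("closed form verified for n = 1..10")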
We can see that c cancels and the left side is always greater than the right side. Thus we learn that c·2^n is too large a guess, and that the multiplicative constant c plays no role in the induction step.
¹ Visit https://en.wikipedia.org/wiki/Recurrence_relation for details.
γ^n = a_n   (6.22)
    = c_1 γ^(n−1) + c_2 γ^(n−2) + ... + c_k γ^(n−k).   (6.23)
By dividing both sides of the equation by γ^(n−k), we get the simplified equation, which is called the characteristic equation of the recurrence relation in the form of Eq. 6.3.
Within the context of computer science, the degree is mostly at most 2. Here, we introduce the formula for solving the characteristic roots of a characteristic equation of the following form:
0 = ax^2 + bx + c   (6.31)
Symbolic Differentiation
Binomial theorem:

Σ_(k=0..n) C(n, k) x^k = (1 + x)^n   (6.37)
6.6 Exercises
1. Compute the factorial sequence using a while loop.

gcd(x, y) = x              if y = 0,
gcd(x, y) = gcd(y, x % y)  if y > 0.   (6.38)

Function definition:
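A direct Python translation of Eq. 6.38 might look like this (a sketch of ours):

# Euclid's algorithm, translating the recurrence above directly.
def gcd(x, y):
    if y == 0:
        return x
    return gcd(y, x % y)

print(gcd(48, 18))  # 6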
6.7 Summary
If a recursive algorithm can be further optimized, the optimization method can be either divide and conquer or decrease and conquer. We have put much effort into solving the recurrence relations of both: the linear recurrence relation for decrease and conquer, and the divide-and-conquer recurrence relation for divide and conquer. Right now, do not struggle and be eager to know what divide or decrease and conquer is; it will be explained in the next two chapters.
Further, the Akra-Bazzi Method⁴ applies to recurrences such as T(n) = T(n/3) + T(2n/3) + O(n). Please look into more details if interested. Generating functions can also be used to solve linear recurrences.
Part III
After the warm-up, we prepare ourselves with hands-on skills–basic programming with Python 3, including two function types–iteration and recursion–and connecting the dots between the abstract data structures and Python
7.1 Introduction
Figure 7.1: Iteration vs recursion: in recursion, the line denotes the top-
down process and the dashed line is the bottom-up process.
In this section, we first learn iteration and the Python syntax used to implement it. We then examine a classic and elementary example—the factorial sequence—to catch a glimpse of how iteration and recursion can be applied to solve a problem. Then, we discuss recursion in more detail. We end this section by comparing iteration and recursion: their pros and cons and the relation between them.
7.2 Iteration
In simple terms, an iterative function is one that loops to repeat some part
of the code. In Python, loops can be expressed with the for and while loop.
Enumerating the numbers from 1 to 10 is a simple iteration. Implementation-wise:
• for is usually used together with the function range(start, stop, step), which creates a sequence of numbers from start to stop in the range [start, stop), incrementing by step (1 by default). Thus, we set start to 1 and stop to 11 to get the numbers from 1 to 10.
# enumerate 1 to 10 with a for loop
for i in range(1, 11):
    print(i, end=' ')
7.4 Recursion
In this section, we reveal how the recursion mechanism works: function calls
and stack, two passes.
Recursive Calls and Stacks The recursive function calls of the recursive factorial we implemented in the last section can be demonstrated as in Fig. 7.2. The execution of the recursive function f(n) pays two visits to each recursive call f(i), i ∈ [1, n], through two passes—top-down and bottom-up—as we have illustrated in Fig. 7.1. The recursion mechanism handles this process via a stack data structure, which follows a Last In First Out (LIFO) principle to record all function calls, as traced in the sketch below.
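A minimal sketch of ours, assuming the recursive factorial f(n) = n·f(n-1) from the last section; the first print fires on the way down (frame pushed) and the second on the way up (frame popped):

def f(n):
    print('push f(%d)' % n)                 # top-down pass
    result = 1 if n <= 1 else n * f(n - 1)
    print('pop  f(%d) = %d' % (n, result))  # bottom-up pass
    return result

f(3)
# push f(3)
# push f(2)
# push f(1)
# pop  f(1) = 1
# pop  f(2) = 2
# pop  f(3) = 6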
Tail Recursion A function is tail recursive when it calls itself at the end (the “tail”) of the function, so that no computation is done after the recursive call returns. Many compilers optimize such a recursive call into an iterative one.
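For instance, a sketch of ours of a tail-recursive factorial: the partial product is carried in an accumulator, so nothing remains to be done after the recursive call returns (note that CPython still does not reuse the stack frame, as discussed below):

def fact_tail(n, acc=1):
    if n <= 1:
        return acc
    return fact_tail(n - 1, acc * n)  # the recursive call is the very last operation

print(fact_tail(5))  # 120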
When writing a recursive function, check the following:
1. End condition, base cases, and return values: either return an answer for base cases or None, and use it to end the recursive calls.
2. Recursion depth: whether the recursion goes too deep and exceeds the assigned memory limit of the executing machine.
3. Variables: what the local and global variables are. In Python, any pointer type of data can be used as a global variable by putting it in the parameters.
4. Construct the current result: when to collect the results from the subtrees and combine them to get the result for the current node.
5. Check the depth: whether the program will lead to stack overflow.
In a tail-recursive function, the recursive call is the last operation of the function. This means that when we perform the next recursive call, the current stack frame (occupied by the current function call) is not needed anymore. This allows us to optimize the code: we simply reuse the current stack frame for the next recursive step, and repeat this process for all the other function calls.
Using regular recursion, each recursive call pushes another entry onto the call stack, and when the functions return they are popped from the stack. In the case of tail recursion, we can optimize so that only one stack entry is used for all the recursive calls of the function. This means that even on large inputs there can be no stack overflow. This is called tail recursion optimization.
Languages such as Lisp and C/C++ have this sort of optimization, but the Python interpreter does not perform tail recursion optimization. Because of this, the recursion limit of Python is set to a small value (on the order of 10^3). So when you provide a large input to a recursive function, you will get an error; this is done to avoid a stack overflow. The Python interpreter limits the recursion depth so that infinite recursions are avoided.
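We can inspect, and cautiously raise, this limit with the sys module (a sketch; the exact default depends on the interpreter):

import sys

print(sys.getrecursionlimit())  # commonly 1000 by default
sys.setrecursionlimit(10 ** 5)  # raise at your own risk: deep recursion can still exhaust the C stack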
7.6 Exercises
1. Compute factorial sequence using while loop.
7.7 Summary
In this chapter, we learned the two basic function types—iteration and recursion—through the factorial example, examined how recursive calls are handled with a call stack through the top-down and bottom-up passes, and discussed tail recursion and Python's recursion limit. These hands-on skills prepare us for divide and conquer and decrease and conquer, which will be explained in the next two chapters.
8
Bit Manipulation
Many books on algorithmic problem solving seem to forget about one topic—bits and bit manipulation. Bits are how data is represented and saved on the hardware, so knowing this concept and how to manipulate bits in Python can help us devise more efficient algorithms—in either space or time complexity—in later chapters.
For example: how to convert a char or integer to bits, and how to get, set, and clear each bit. We also cover some more advanced bit manipulation operations. After this, we will see examples of how to apply bit manipulation to real-life problems.
x << y Returns x with the bits shifted to the left by y places (the new bits on the right-hand side are zeros). This is the same as multiplying x by 2^y.
x >> y Returns x with the bits shifted to the right by y places. This is the same as dividing x by 2^y, with the same result as the // operator. This right shift is also called arithmetic right shift: it fills in the new bits with the value of the sign bit.
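A quick sketch of both shifts:

print(5 << 2)    # 20, same as 5 * 2**2
print(20 >> 2)   # 5, same as 20 // 2**2
print(-20 >> 2)  # -5: the arithmetic shift fills with the sign bit, matching -20 // 4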
x & y "Bitwise and". Each bit of the output is 1 if the corresponding bit of x AND of y is 1; otherwise it is 0. It has the following property:

# keep 1 or 0 the same as the original
1 & 1 = 1
0 & 1 = 0
# set to 0 with & 0
1 & 0 = 0
0 & 0 = 0
x ^ y "Bitwise exclusive or". Each bit of the output is the same as the corresponding bit in x if that bit in y is 0, and it is the complement of the bit in x if that bit in y is 1. It has the following basic properties:

# toggle 1 or 0 with ^ 1
1 ^ 1 = 0
0 ^ 1 = 1

# keep 1 or 0 with ^ 0
1 ^ 0 = 1
0 ^ 0 = 0
Logical right shift The logical right shift differs from the arithmetic right shift above: after shifting, it puts 0s in the most significant bits. It is indicated with the >>> operator in Java. In Python there is no such operator, but we can implement one easily using the bitstring module, which pads with zeros when using the >>= operator:

>>> from bitstring import BitArray
>>> a = BitArray(int=-1000, length=32)
>>> a.int
-1000
>>> a >>= 3
>>> a.int
536870787
However, bin() does not return binary bits that apply the two's complement rule. For example, for a negative value:

a1 = bin(-88)
# output
# -0b1011000
int(x, base=10) The int() method takes a string x and returns an integer in the corresponding base. The common bases are 2, 10, and 16 (hex).

b = int('01011000', 2)
c = int('88', 10)
print(b, c)
# output
# 88 88
chr() The chr() method takes a single integer parameter and returns a character (a string) whose Unicode code point is that integer. If the integer i is outside the valid range, a ValueError is raised.

d = chr(88)
print(d)
# output
# X
ord() The ord() method takes a string representing one Unicode character and returns an integer representing the Unicode code point of that character.

e = ord('a')
print(e)
# output
# 97
Σ_{i=0}^{N-1} 2^i = 2^{N-1} + 2^{N-2} + ... + 2^2 + 2^1 + 2^0 = 2^N - 1    (8.1)
This is helpful if we just need two’s complement result instead of getting the
binary representation.
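A sketch of this use of Eq. 8.1: masking a negative number with 2^N - 1 yields its N-bit two's complement directly:

N = 8
x = -88
print(bin(x & ((1 << N) - 1)))
# 0b10101000, the 8-bit two's complement of -88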
Get ith Bit To do this, we use the property of the AND operator: a bit ANDed with 1 stays the same as the original, while a bit ANDed with 0 is set to 0.

# for n bits, i in range [0, n-1]
def get_bit(x, i):
    mask = 1 << i
    if x & mask:
        return 1
    return 0
print(get_bit(5, 1))
# output
# 0
Alternatively, we can right shift x by i, and AND the result with a single 1:

def get_bit2(x, i):
    return x >> i & 1
print(get_bit2(5, 1))
# output
# 0
Toggle ith Bit Toggling means turning a bit to 1 if it was 0, and to 0 if it was 1. We use the XOR operator here due to its properties:

mask = 1 << i
x = x ^ mask
Clear Bits In some cases, we need to clear a range of bits and set them to 0; our base mask needs to put 1s at all those positions. Before we solve this problem, we need to know a property of binary subtraction. Check whether you can find the property in the examples below.
The property is: in n - 1, all the bits to the right of the rightmost 1 of n are flipped, and so is the rightmost 1 itself. Using this amazing property, we can create our mask:

# base mask
i = 5
mask = 1 << i
mask = mask - 1
print(bin(mask))
# output
# 0b11111
With this base mask, we can clear bits: (1) all bits from the most significant bit down to the ith bit (leftmost to ith) by using the above mask; (2) all bits from the least significant bit up to the ith bit by using ~mask as the mask. The Python code is as follows:
# i, i-1, i-2, ..., 2, 1, 0: keep these positions
def clear_bits_left_right(val, i):
    print('val ', bin(val))
    mask = (1 << i) - 1
    print('mask', bin(mask))
    return bin(val & mask)
# i, i-1, i-2, ..., 2, 1, 0: erase these positions
def clear_bits_right_left(val, i):
    print('val ', bin(val))
    mask = (1 << i) - 1
    print('mask', bin(~mask))
    return bin(val & (~mask))
Get the lowest set bit Suppose we are given '0010,1100'; we need to get the lowest set bit and return '0000,0100'. And for 1100, we get 0100. If we AND a number with its two's complement, as shown in Eq. 8.2 and 8.4, only the rightmost 1 bit is kept and all the others are cleared to 0. This can be done with the expression x & (-x), where -x is the two's complement of x.
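A one-line sketch:

def lowest_set_bit(x):
    return x & (-x)  # only the rightmost 1 bit survives the AND

print(bin(lowest_set_bit(0b00101100)))
# 0b100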
Clear the lowest set bit In many situations we want to strip off the lowest set bit, for example in the Binary Indexed Tree data structure, or when counting the number of set bits in a number. We use the following operation:

def strip_last_set_bit(val):
    print(bin(val))
    return bin(val & (val - 1))
print(strip_last_set_bit(5))
# output
# 0b101
# 0b100
8.5 Applications
Recording States Some algorithms, such as combination, permutation, and graph traversal, require us to record the states of the input array. Instead of using an array of the same size, we can use a single integer, where each bit's location indicates the state of the element with the same index in the array. For example, to record the state of an array of length 8:

used = 0
for i in range(8):
    if used & (1 << i):  # check state at i
        continue
    used = used | (1 << i)  # set state at i as used
print(bin(used))
Input: [2,2,1]
Output: 1
Example 2:
Input: [4,1,2,1,2]
Output: 4

Output: 28
Explanation: The maximum result is 5 ^ 25 = 28.
Solution 1: Build the max bit by bit. First, let's convert these integers into binary representation by hand:

3   0000,0011
10  0000,1010
5   0000,0101
25  0001,1001
2   0000,0010
8   0000,1000

If we look at the highest position i where one number has a one and all the others have zero, then we know the maximum XOR m has a 1 at that bit. Now we look at the two bits i, i-1: the possible maximum appends a 0 or 1 at the end of m, giving the candidate (m << 1) + 1. The key XOR property is that if m ^ a = b, then a ^ b = m; so if, for the candidate m, XORing it with one prefix yields another prefix that exists in the set, the candidate is achievable and m becomes (m << 1) + 1. We carry on this process bit by bit:
def findMaximumXOR(self, nums):
    """
    :type nums: List[int]
    :rtype: int
    """
    answer = 0
    for i in range(32)[::-1]:
        answer <<= 1  # multiply it by two
        # shift right: num >> i divides by 2^i, keeping the first (32-i) bits
        prefixes = {num >> i for num in nums}
        answer += any((answer + 1) ^ p in prefixes for p in prefixes)
    return answer
With Mask

Input: 00000010100101000001111010011100
Output: 00111001011110000010100101000000
Explanation: The input binary string 00000010100101000001111010011100 represents the unsigned integer 43261596, so return 964176192, whose binary representation is 00111001011110000010100101000000.
Example 2:
Input: 11111111111111111111111111111101
Output: 10111111111111111111111111111111
Explanation: The input binary string 11111111111111111111111111111101 represents the unsigned integer 4294967293, so return 3221225471, whose binary representation is 10111111111111111111111111111111.
Solution: Get bit and set bit with masks. We iterate over the bit positions from the most significant to the least significant, get the bit at position i with a mask, and set the corresponding bit in ans with a mask indicating position 31 - i:
# @param n, an integer
# @return an integer
def reverseBits(self, n):
    ans = 0
    for i in range(32)[::-1]:  # from high to low
        mask = 1 << i
        set_mask = 1 << (31 - i)
        if (mask & n) != 0:  # get bit
            # set bit
            ans |= set_mask
    return ans
Input: [5,7]
Output: 4
Example 2:
Input: [0,1]
Output: 0
8.6 Exercises
1. Write a function to determine the number of bits required to convert integer A to integer B.

def bitswaprequired(a, b):
    count = 0
    c = a ^ b
    while c != 0:
        count += c & 1
        c = c >> 1
    return count
print(bitswaprequired(12, 7))
2. 389. Find the Difference (easy). Given two strings s and t consisting of only lowercase letters, string t is generated by randomly shuffling string s and then adding one more letter at a random position. Find the letter that was added in t.

Example:
Input:
s = "abcd"
t = "abcde"
Output:
e
Explanation:
'e' is the letter that was added.
9.1 Introduction
• For linked list, stack, and queue, we either implement them with built-in data types or use Python modules.
All these sequence-type data structures share the most common methods and operations, shown in Tables 9.4 and 9.5. Note that in Python, indexing starts from 0.
Let us examine each type of sequence further to understand its performance and its relation to the array data structure.
9.2.2 Range
Range Syntax
The range object has three attributes—start, stop, step—and a range object can be created as range(start, stop, step). These attributes need to be integers—both negative and positive work—to define a range, which will be [start, stop). The default value of start is 0 and of step is 1. For example:

>>> a = range(10)
>>> b = range(0, 10, 2)
>>> a, b
(range(0, 10), range(0, 10, 2))
Like any other sequence type, range is iterable and can be indexed and sliced. This is just how the behavior of the range class is defined in the underlying C code: a range does not store all the integers in the range; each is generated only when something specifically asks for it.
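A quick sketch of ours illustrating this laziness (the exact byte count may vary by platform and version):

import sys

big = range(10 ** 9)
print(sys.getsizeof(big))  # small and constant (e.g. 48): only start/stop/step are stored
print(big[999999999])      # 999999999, computed on demand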
9.2.3 String
String is a static array whose items are characters, represented using ASCII or Unicode 1. String is immutable, which means that once it is created we can no longer modify its content or extend its size. A string is more compact than storing the characters in a list, because its backing array is not assigned any extra space.
String Syntax
Join The str.join() method will concatenate two strings, but in a way
that passes one string through another. For example, we can use the
str.join() method to add whitespace to that string, which we can do
like so:
1 b a l l o o n = "Sammy has a b a l l o o n . "
2 print ( " " . join ( balloon ) )
3 #Ouput
4 S a mm y h a s a b a l l o o n .
1 In Python 3, all strings are represented in Unicode. In Python 2 they were stored internally as 8-bit ASCII, hence it was required to attach 'u' to make them Unicode; this is no longer necessary.
Split Just as we can join strings together, we can also split strings up using the str.split() method. This method separates the string by whitespace if no other parameter is given.

print(balloon.split())
# Output
# ['Sammy', 'has', 'a', 'balloon.']

We can also use str.split() to remove certain parts of an original string. For example, let's remove the letter 'a' from the string:

print(balloon.split("a"))
# Output
# ['S', 'mmy h', 's ', ' b', 'lloon.']

Now the letter a has been removed and the string has been separated where each instance of the letter a had been, with whitespace retained.
Replace The str.replace() method takes an original string and returns an updated string with some replacement.
Let's say that the balloon that Sammy had is lost. Since Sammy no longer has this balloon, we will change the substring "has" in the original string balloon to "had" in a new string:

print(balloon.replace("has", "had"))
# Output
# Sammy had a balloon.
String Functions
Because string is one of the most fundamental built-in data types, it is necessary to know its common built-in methods, shown in Tables 9.1 and 9.2. Boolean methods that check whether characters are lowercase, uppercase, or title case can help us sort our data appropriately, and give us the opportunity to standardize the data we collect by checking and then modifying strings as needed.
9.2.4 List
The underlying abstract data structure of the list data type is the dynamic array, meaning we can add, delete, and modify items in the list. It supports random access by indexing. List is the most widely used sequence type due to its mutability.
Even though list supports data of arbitrary types, doing so is usually not preferred; using tuple or namedtuple is better practice and offers more clarity.
# new an empty list
lst = []
lst2 = [2, 2, 2, 2]  # new a list with initialization
lst3 = [3] * 5  # new a list of size 5 with 3 as initialization
print(lst, lst2, lst3)
# output
# [] [2, 2, 2, 2] [3, 3, 3, 3, 3]
Add Item We can add items into a list through insert(index, value)—inserting an item at a position in the original list—or list.append(value)—appending an item at the end of the list.

# INSERTION
lst.insert(0, 1)  # insert an element at index 0; since lst is empty, lst.insert(1, 1) has the same effect
print(lst)

lst2.insert(2, 3)
print(lst2)
# output
# [1]
# [2, 2, 3, 2, 2]
# APPEND
for i in range(2, 5):
    lst.append(i)
print(lst)
# output
# [1, 2, 3, 4]
Delete Item
Get Size of the List We can use the len built-in function to find out the number of items stored in the list.

print(len(lst2))
# 4
And now, let us get the memory size of lst_lst and of each list item in it:

import sys
for lst in lst_lst:
    print(sys.getsizeof(lst), end=' ')
print(sys.getsizeof(lst_lst))

We can see that a list of integers takes the same memory size as a list of strings of equal length.
insert and append Whenever insert or append is called, assuming the original length is n, Python compares n + 1 with the allocated length. If you append or insert into a Python list and the backing array isn't big enough, the backing array must be expanded. When this happens, the backing array is grown by approximately 12.5%, using the following formula (from CPython's C source):

new_allocated = (size_t)newsize + (newsize >> 3) + (newsize < 9 ? 3 : 6);
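A sketch of ours of a script that reproduces output of this shape; it infers the number of allocated slots from sys.getsizeof, assuming 8-byte pointers on a 64-bit build (the 'bytes' column below therefore counts slots):

import sys

lst = []
base = sys.getsizeof(lst)  # size of the empty list object itself
for i in range(1, 18):
    lst.append(i)
    slots = (sys.getsizeof(lst) - base) // 8  # allocated slots
    print('size :', len(lst), 'bytes :', slots, 'id :', id(lst))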
size : 1  bytes : 4  id : 140682152394952
size : 2  bytes : 4  id : 140682152394952
size : 3  bytes : 4  id : 140682152394952
size : 4  bytes : 4  id : 140682152394952
size : 5  bytes : 8  id : 140682152394952
size : 6  bytes : 8  id : 140682152394952
size : 7  bytes : 8  id : 140682152394952
size : 8  bytes : 8  id : 140682152394952
size : 9  bytes : 16  id : 140682152394952
size : 10  bytes : 16  id : 140682152394952
size : 11  bytes : 16  id : 140682152394952
size : 12  bytes : 16  id : 140682152394952
size : 13  bytes : 16  id : 140682152394952
size : 14  bytes : 16  id : 140682152394952
size : 15  bytes : 16  id : 140682152394952
size : 16  bytes : 16  id : 140682152394952
size : 17  bytes : 25  id : 140682152394952

The output shows the growth pattern [0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...].
Amortized, append takes O(1). However, insert takes O(n), because it has to first shift all items in [pos, end] one position to the right, and then place the item at pos with random access.
We have already seen how to use append and insert. Table 9.3 shows the other common list methods; they are used as list.methodName().
Two-dimensional List
A two-dimensional list is a list within a list. In this type of array, the position of a data element is referred to by two indices instead of one, so it represents a table with rows and columns of data. For example, we can declare the following 2-d array:

ta = [[11, 3, 9, 1], [25, 6, 10], [10, 8, 12, 5]]

The scalar data in a two-dimensional list can be accessed using two indices: one referring to the main (parent) list and another referring to the position of the data in the inner list. If we mention only one index, the entire inner list is printed for that index position. The example below illustrates how it works:

print(ta[0])
print(ta[2][1])
In the above example, we create a 2-d list and initialize it with values. There are also ways to create an empty 2-d list, or to fix the dimension of the outer list and leave the inner lists empty:

# empty two-dimensional list
empty_2d = [[]]

# fix the outer dimension
fix_out_d = [[] for _ in range(5)]
print(fix_out_d)
All the other operations, such as delete, insert, and update, are the same as for the one-dimensional list.

# assuming m1 and m2 were each declared as [[0] * cols for _ in range(rows)]
m1[1][2] = 1
m2[1][2] = 1
print(m1, m2)
However, we cannot declare it in the following way, because we would end up with copies of the same inner list, so modifying one element in an inner list ends up changing it in all of them at the corresponding position—unless that behavior suits the situation.

# wrong declaration
m4 = [[0] * cols] * rows
m4[1][2] = 1
print(m4)

With output:

[[0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0]]
Access Rows and Columns In real problem solving, we might need to access rows and columns. Accessing rows is quite easy, since it follows the declaration of the two-dimensional list:

# accessing rows
for row in m1:
    print(row)

There's also a handy “idiom” for transposing a nested list, turning 'columns' into 'rows':

transposedM1 = list(zip(*m1))
print(transposedM1)
9.2.5 Tuple
A tuple has a static array as its backing abstract data structure in C, and it is immutable—we cannot add, delete, or replace items once it is created and assigned a value. You might think: if list is a dynamic array without the restrictions of tuple, why would we need tuple at all?
Tuple VS List The main benefit of tuple's immutability is that it is hashable: we can use tuples as keys in hash-table (dictionary) types, whereas a mutable type such as list cannot be used this way. Besides, when the data does not need to change, the tuple's immutability guarantees that it remains write-protected, and iterating over an immutable sequence is faster than over a mutable one, giving a slight performance boost. Also, we generally use tuple to store a variety of data types. For example, in a class score system, for a student we might want to store the name, student id, and test score, and we can write ('Bob', 12345, 89).
Tuple Syntax
New and Initialize Tuple Tuples are created by separating the items with commas; they are commonly wrapped in parentheses for better readability. A tuple can also be created via the built-in function tuple(): if the argument to tuple() is a sequence, this creates a tuple of the elements of that sequence, which also realizes type conversion.
An empty tuple:

tup = ()
tup3 = tuple()

When there is only one item, put a comma behind it so that it is not treated as an ordinary parenthesized value, which is a bit bizarre!

tup2 = ('crack',)
tup1 = ('crack', 'leetcode', 2018, 2019)
However, for items that are themselves mutable, we can still manipulate their contents. For example, we can use an index to access a list at the last position of a tuple and modify that list:

tup = ('a', 'b', [1, 2, 3])  # assumed declaration, implied by the output below
tup[-1][0] = 4
# ('a', 'b', [4, 2, 3])
Understand Tuple
The backing structure is a static array, which means that the way the tuple is structured is similar to list, other than being write-protected. We will just touch briefly on its properties.
Tuple Object and Pointers The tuple object itself takes 48 bytes. All the rest is similar to the corresponding section on list.
lst_tup = [(), (1,), ('1',), (1, 2), ('1', '2')]
import sys
for tup in lst_tup:
    print(sys.getsizeof(tup), end=' ')
Named Tuples
In a named tuple, we can give the whole record a name, say “Computer_Science” to indicate the class name, and we can give each item a name, say 'name', 'id', and 'score'. We need to import the namedtuple class from the module collections. For example:

record1 = ('Bob', 12345, 89)
from collections import namedtuple
Record = namedtuple('Computer_Science', 'name id score')
record2 = Record('Bob', id=12345, score=89)
print(record1, record2)
9.2.6 Summary
All these sequence-type data structures share the most common methods and operations, shown in Tables 9.4 and 9.5. Note that in Python, indexing starts from 0.
9.2.7 Bonus
Circular Array The corresponding problems include:
1. 503. Next Greater Element II
9.2.8 Exercises
1. 985. Sum of Even Numbers After Queries (easy)
2. 937. Reorder Log Files
You have an array of logs. Each log is a space delimited string of
words.
For each log, the first word in each log is an alphanumeric identifier.
Then, either:
Each word after the identifier will consist only of lowercase letters, or;
Each word after the identifier will consist only of digits.
We will call these two varieties of logs letter-logs and digit-logs. It is
guaranteed that each log has at least one word after its identifier.
Reorder the logs so that all of the letter-logs come before any digit-log.
The letter-logs are ordered lexicographically ignoring identifier, with
the identifier used in case of ties. The digit-logs should be put in their
original order.
Return the final order of the logs.
Table 9.5: Common out-of-place operators for sequence data types in Python

Operation              Description
s + r                  Concatenates two sequences of the same type
s * n                  Makes n copies of s, where n is an integer
v1, v2, ..., vn = s    Unpacks n variables from s
s[i]                   Indexing—returns the ith element of s
s[i:j:stride]          Slicing—returns elements between i and j, with optional stride
x in s                 Returns True if element x is in s
x not in s             Returns True if element x is not in s
Example 1:

Input: ["a1 9 2 3 1", "g1 act car", "zo4 4 7", "ab1 off key dog", "a8 act zoo"]
Output: ["g1 act car", "a8 act zoo", "ab1 off key dog", "a1 9 2 3 1", "zo4 4 7"]

Note:

0 <= logs.length <= 100
3 <= logs[i].length <= 100
logs[i] is guaranteed to have an identifier, and a word after the identifier.
def reorderLogFiles(logs):
    digits, letters = [], []
    for log in logs:
        splited = log.split(' ')
        id, type = splited[0], splited[1]
        if type.isnumeric():
            digits.append(log)
        else:
            letters.append((' '.join(splited[1:]), id))
    letters.sort()  # default sorting: by the first element and then the second in the tuple
    return [id + ' ' + other for other, id in letters] + digits
A linked list consists of nodes; for a singly linked list, each node consists of at least two variables: val, to save the data, and next, a pointer that points to the successive node. The Node class is given as:

class Node(object):
    def __init__(self, val=None):
        self.val = val
        self.next = None
In a singly linked list, we usually start with a head node which points to the first node in the list; only with this single node are we able to trace all the other nodes. For simplicity, we demonstrate the process without using a class, but we provide a class implementation named SinglyLinkeList in our online Python source code. Now, let us create an empty (dummy) node named head:

head = Node()
The first case is simply bad: we would generate a new node but could not track the head through an in-place operation. With the dummy node, however, only the second case appears. The code is:

def append(head, val):
    node = Node(val)
    cur = head
    while cur.next:
        cur = cur.next
    cur.next = node
    return
Now, let us create the exact same linked list as in Fig. 9.1:

for val in ['A', 'B', 'C', 'D']:
    append(head, val)
In the search operation, we find a node by value and return that node; otherwise, we return None.

def search(head, val):
    for node in gen(head):
        if node.val == val:
            return node
    return None
Delete Operation For deletion, there are two scenarios: deleting a node by value when we are given the head node, and deleting a given node, such as the node we got from searching 'B'.
The first case requires us to locate the node first, and then rewire the pointers between the predecessor and successor of the node being deleted. Again, if we did not have a dummy node, we would have two cases: if the node is the head node, we repoint the head to the next node; otherwise, we connect the previous node to the deleting node's next node, and the head pointer remains untouched. With a dummy node, we only have the second situation. In the process, we use an additional variable prev to track the predecessor.

def delete(head, val):
    cur = head.next  # start from the dummy node
    prev = head
    while cur:
        if cur.val == val:
            # rewire
            prev.next = cur.next
            return
        prev = cur
        cur = cur.next
Now the output will indicate we only have two nodes left:
1 C D
The second case might seem impossible at first—we do not know the node's predecessor. The trick is to copy the value of the next node into the current node, and delete the next node instead, by pointing the current node to the node after next. This only works when the deleting node is not the last node; when it is, we have no way to completely delete it, but we can make it “invalid” by setting its val and next to None.
def deleteByNode(node):
    if node.next:
        # copy the next node's value, then bypass the next node
        node.val = node.next.val
        node.next = node.next.next
    else:
        # the last node: make it invalid
        node.val = None
        node.next = None
Now, let us try deleting the node 'B' via our previously found node:

deleteByNode(node)
for n in gen(head):
    print(n.val, end=' ')
Clear When we need to clear all the nodes of the linked list, we just set the node next to the dummy head to None and reset the size:

def clear(self):
    self.head.next = None
    self.size = 0
Question: Some linked lists only allow inserting a node at the tail, which is append; some others allow insertion at any location. To get the length of the linked list easily in O(1), we need a variable to track its size.
For a doubly linked list, we can conduct a complete search through prev even if the given starting node is an arbitrary node. For an SLL this is not an option, because we would not be able to conduct a complete search—we can only search among the items behind the given node. Bidirectional search makes sense when the data is ordered in some way, or when the program is parallel.
def gen(head):
    cur = head.next
    while cur:
        yield cur
        cur = cur.next

def search(head, val):
    for node in gen(head):
        if node.val == val:
            return node
    return None
Comparison We can see a slight advantage of DLL over SLL, but it comes at the cost of handling the extra prev pointer. This is only an advantage when bidirectional searching is the dominant factor for efficiency; otherwise, it is better to stick with SLL.
Tips From our implementation, in some cases we still need to worry about whether a node is the last node or not. The coding logic can be simplified further if we put a dummy node at the end of the linked list too.
9.3.3 Bonus
Circular Linked List A circular linked list is a variation of the linked list in which the first node connects to the last node. To make a circular linked list from a normal linked list: in a singly linked list, we simply set the last node's next pointer to the first node; in a doubly linked list, besides setting the last node's next pointer, we set the prev pointer of the first node to the last node, making it circular in both directions.
Compared with a normal linked list, a circular linked list saves us time going from the last node to the first (in both SLL and DLL), or from the first node to the last (in DLL), by doing it in a single step through the extra connection. Because it is a circle, whenever a search with a while loop is needed, we need to take care of the end condition: make sure we have searched a whole cycle by comparing the iterating node with the starting node.
Input: 1->1->2
Output: 1->2
Example 2:
Input: 1->1->2->3->3
Output: 1->2->3
Analysis
Recursion If we use recursion and return the node, then at each step we can compare our node with the returned node (which lies behind the current node), and the same logic applies. Drawing out an example helps: with 1->1->1, the last 1 is returned, and at the second-to-last 1 we compare the two; since they are equal, we delete the last 1. We then backtrack to the first 1 with the second-to-last 1 as the returned node, and compare again. This code is the simplest among all the solutions.
def recursive(node):
    if node.next is None:
        return node

    next = recursive(node.next)
    if next.val == node.val:
        node.next = node.next.next
    return node
9.3.5 Exercises
Basic operations:
1. 237. Delete Node in a Linked List (easy, delete only given current
node)
6. Sort List
7. Reorder List
Fast-slow pointers:
In the remaining section, we discuss the implementation with built-in data types or with built-in modules. After this, we will learn more advanced queues and stacks: the priority queue and the monotone queue, which can be used to solve medium to hard problems on LeetCode.
Stack The implementation of a stack is simply adding and deleting elements at the end of a list:

# stack
s = []
s.append(3)
s.append(4)
s.append(5)
s.pop()
Queue For a queue, we can append at the end and always pop from the first index, or we can insert at the first index and pop the last element:

# queue
# 1: use append and pop
q = []
q.append(3)
q.append(4)
q.append(5)
q.pop(0)
Stack and Singly Linked List with a Top Pointer Because in a stack we only need to add or delete the item at the top, we use one pointer pointing at the top item, and the linked list's next connects each item to the one below it, in a direction from the top to the bottom.
# stack with linked list
'''a <- b <- c <- top'''
class Stack:
    def __init__(self):
        self.top = None
        self.size = 0

    # push
    def push(self, val):
        node = Node(val)
        if self.top:  # connect top and node
            node.next = self.top
        # reset the top pointer
        self.top = node
        self.size += 1

    def pop(self):
        if self.top:
            val = self.top.val
            if self.top.next:
                self.top = self.top.next  # reset top
            else:
                self.top = None
            self.size -= 1
            return val
        else:  # no element to pop
            return None
Queue and Singly Linked List with Two Pointers For a queue, we need to access items from both ends, so we use two pointers pointing at the head and the tail of the singly linked list. The linking direction is from the head to the tail.
# queue with linked list
'''head -> a -> b -> tail'''
class Queue:
    def __init__(self):
        self.head = None
        self.tail = None
        self.size = 0

    # push
    def enqueue(self, val):
        node = Node(val)
        if self.head and self.tail:  # connect tail and node
            self.tail.next = node
            self.tail = node
        else:
            self.head = self.tail = node
        self.size += 1

    def dequeue(self):
        if self.head:
            val = self.head.val
            if self.head.next:
                self.head = self.head.next  # reset head
            else:
                self.head = None
                self.tail = None
            self.size -= 1
            return val
        else:  # no element to pop
            return None
Also, Python provides two built-in modules, deque and queue, for this purpose. We will detail them in the next section.
from collections import deque

s = deque([3, 4])
s.append(5)
s.pop()
The queue module offers the datatypes shown in Table 9.7. In Python 3, the module is the lowercase queue, but in Python 2.x it was Queue; in this book, we use Python 3.
Table 9.7: Datatypes in the queue module. maxsize is an integer that sets the upper bound on the number of items that can be placed in the queue. Insertion will block once this size has been reached, until queue items are consumed. If maxsize is less than or equal to zero, the queue size is infinite.
Class                                  Description
class queue.Queue(maxsize=0)           Constructor for a FIFO queue.
class queue.LifoQueue(maxsize=0)       Constructor for a LIFO queue.
class queue.PriorityQueue(maxsize=0)   Constructor for a priority queue.
Table 9.8: Methods of the three classes in the queue module; here we focus on the single-thread setting.

Method                                 Description
Queue.put(item[, block[, timeout]])    Put item into the queue.
Queue.get([block[, timeout]])          Remove and return an item from the queue.
Queue.qsize()                          Return the approximate size of the queue.
Queue.empty()                          Return True if the queue is empty, False otherwise.
Queue.full()                           Return True if the queue is full, False otherwise.
Now, using Queue() and LifoQueue() to implement a queue and a stack, respectively, is straightforward:

# python 3
import queue
# implementing a queue
q = queue.Queue()
for i in range(3, 6):
    q.put(i)

import queue
# implementing a stack
s = queue.LifoQueue()

for i in range(3, 6):
    s.put(i)
9.4.4 Bonus
Circular Linked List and Circular Queue The circular queue is a linear data structure in which the operations are performed based on the FIFO principle and the last position is connected back to the first position to make a circle. It is also called a “ring buffer”. A circular queue can be implemented either with a list or with a circular linked list. If we use a list, we initialize our queue with a fixed size and None as the values. To find the position for enqueue(), we use rear = (rear + 1) % size. Similarly, for dequeue(), we use front = (front + 1) % size to find the next front position.
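A minimal sketch of ours of a list-backed circular queue, tracking a front index and an item count:

class CircularQueue:
    def __init__(self, k):
        self.q = [None] * k  # fixed-size buffer
        self.capacity = k
        self.front = 0       # index of the current front item
        self.count = 0       # number of stored items

    def enqueue(self, val):
        if self.count == self.capacity:
            return False  # full
        rear = (self.front + self.count) % self.capacity
        self.q[rear] = val
        self.count += 1
        return True

    def dequeue(self):
        if self.count == 0:
            return None  # empty
        val = self.q[self.front]
        self.q[self.front] = None
        self.front = (self.front + 1) % self.capacity
        self.count -= 1
        return val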
9.4.5 Exercises
Queue and Stack
Analysis: This is a typical buffer problem. If the data exceeds the buffer window, we squeeze out the earliest entries. Thus, a queue can be used to save each t, and on every ping we squeeze out any time not in the range [t-3000, t]:
import collections

class RecentCounter:

    def __init__(self):
        self.ans = collections.deque()

    def ping(self, t):
        """
        :type t: int
        :rtype: int
        """
        self.ans.append(t)
        while self.ans[0] < t - 3000:
            self.ans.popleft()
        return len(self.ans)
Monotone Queue
Obvious applications:
Hash Set Design a HashSet without using any built-in hash table libraries. To be specific, your design should include these functions (705. Design HashSet):

add(value): Insert a value into the HashSet.
contains(value): Return whether the value exists in the HashSet or not.
remove(value): Remove a value in the HashSet. If the value does not exist in the HashSet, do nothing.

For example:

MyHashSet hashSet = new MyHashSet();
hashSet.add(1);
hashSet.add(2);
hashSet.contains(1);    // returns true
hashSet.contains(3);    // returns false (not found)
hashSet.add(2);
hashSet.contains(2);    // returns true
hashSet.remove(2);
hashSet.contains(2);    // returns false (already removed)

Note: (1) All values will be in the range [0, 1000000]. (2) The number of operations will be in the range [1, 10000].
class MyHashSet:

    def _h(self, k, i):
        return (k + i) % 10001

    def __init__(self):
        """
        Initialize your data structure here.
        """
        self.slots = [None] * 10001
        self.size = 10001

    def add(self, key: 'int') -> 'None':
        i = 0
        while i < self.size:
            k = self._h(key, i)
            if self.slots[k] == key:
                return
            elif not self.slots[k] or self.slots[k] == -1:
                self.slots[k] = key
                return
            i += 1
        # double the size and try again
        self.slots = self.slots + [None] * self.size
        self.size *= 2
        return self.add(key)

    def remove(self, key: 'int') -> 'None':
        i = 0
        while i < self.size:
            k = self._h(key, i)
            if self.slots[k] == key:
                self.slots[k] = -1  # tombstone
                return
            elif self.slots[k] is None:
                return
            i += 1
        return

    def contains(self, key: 'int') -> 'bool':
        """
        Returns true if this set contains the specified element
        """
        i = 0
        while i < self.size:
            k = self._h(key, i)
            if self.slots[k] == key:
                return True
            elif self.slots[k] is None:
                return False
            i += 1
        return False
Hash Map Design a HashMap without using any built-in hash table libraries. To be specific, your design should include these functions (706. Design HashMap (easy)):
• put(key, value): Insert a (key, value) pair into the HashMap. If the value already exists in the HashMap, update the value.
class MyHashMap:
    def _h(self, k, i):
        return (k + i) % 10001  # [0, 10001]

    def __init__(self):
        """
        Initialize your data structure here.
        """
        self.size = 10002
        self.slots = [None] * self.size

    def put(self, key: 'int', value: 'int') -> 'None':
        """
        value will always be non-negative.
        """
        i = 0
        while i < self.size:
            k = self._h(key, i)
            if not self.slots[k] or self.slots[k][0] in [key, -1]:
                self.slots[k] = (key, value)
                return
            i += 1
        # double the size and try again
        self.slots = self.slots + [None] * self.size
        self.size *= 2
        return self.put(key, value)

    def get(self, key: 'int') -> 'int':
        """
        Returns the value to which the specified key is mapped,
        or -1 if this map contains no mapping for the key
        """
        i = 0
        while i < self.size:
            k = self._h(key, i)
            if not self.slots[k]:
                return -1
            elif self.slots[k][0] == key:
                return self.slots[k][1]
            else:  # if it is deleted, keep probing
                i += 1
        return -1

    def remove(self, key: 'int') -> 'None':
        """
        Removes the mapping of the specified value key if this
        map contains a mapping for the key
        """
        i = 0
        while i < self.size:
            k = self._h(key, i)
            if not self.slots[k]:
                return
            elif self.slots[k][0] == key:
                self.slots[k] = (-1, None)
                return
            else:  # if it is deleted, keep probing
                i += 1
        return
Python 2.X VS Python 3.X In Python 2.x, we can use slicing to access keys() or items() of a dictionary. In Python 3.x, however, the same syntax gives us TypeError: 'dict_keys' object does not support indexing. Instead, we need to use the function list() to convert the view to a list and then slice it. For example:

# Python 2.x
dict.keys()[0]

# Python 3.x
list(dict.keys())[0]
set Data Type

Method                         Description
remove()                       Removes an element from the set
add()                          Adds an element to the set
copy()                         Returns a shallow copy of the set
clear()                        Removes all elements from the set
difference()                   Returns the difference of two sets
difference_update()            Updates the calling set with the difference of sets
discard()                      Removes an element from the set
intersection()                 Returns the intersection of two or more sets
intersection_update()          Updates the calling set with the intersection of sets
isdisjoint()                   Checks whether two sets are disjoint
issubset()                     Checks if a set is a subset of another set
issuperset()                   Checks if a set is a superset of another set
pop()                          Removes an arbitrary element
symmetric_difference()         Returns the symmetric difference of two sets
symmetric_difference_update()  Updates the set with the symmetric difference
union()                        Returns the union of sets
update()                       Adds elements to the set
If we want to put a string in a set, it behaves like this:

>>> a = set('aardvark')
>>> a
{'d', 'v', 'a', 'r', 'k'}
>>> b = {'aardvark'}  # or set(['aardvark']): convert a list of strings to a set
>>> b
{'aardvark'}
# or put a tuple in the set
a = {('aardvark',)}  # or set([('aardvark',)])

Compare the difference between {'aardvark'} and set('aardvark'): with a single word argument, set() iterates over its characters.
dict Data Type

Method        Description
clear()       Removes all elements from the dictionary
copy()        Returns a copy of the dictionary
fromkeys()    Returns a dictionary with the specified keys and values
get()         Returns the value of the specified key
items()       Returns a list containing a tuple for each key-value pair
keys()        Returns a list containing the dictionary's keys
pop()         Removes the element with the specified key and returns its value
popitem()     Removes the last inserted key-value pair
setdefault()  Returns the value of the specified key; if the key does not exist, inserts the key with the specified value
update()      Updates the dictionary with the specified key-value pairs
values()      Returns a list of all the values in the dictionary
See usage examples at https://www.programiz.com/python-programming/dictionary.
Collections Module
OrderedDict Standard dictionaries are unordered, which means that any time you loop through a dictionary you will visit every key, but you are not guaranteed to get them in any particular order. The OrderedDict from the collections module is a special type of dictionary that keeps track of the order in which its keys were inserted. Iterating over the keys of an OrderedDict has predictable behavior, which can simplify testing and debugging by making all the code deterministic.
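For example:

from collections import OrderedDict

d = OrderedDict()
d['banana'] = 3
d['apple'] = 1
d['pear'] = 2
for key in d:
    print(key)  # always banana, apple, pear: the insertion order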
defaultdict The defaultdict class from the collections module simplifies this process by pre-assigning a default value when a key is not present. For each value type it has a different default value; for example, for int the default value is 0. A defaultdict works exactly like a normal dict, but it is initialized with a function (the “default factory”) that takes no arguments and provides the default value for a nonexistent key. Therefore, a defaultdict never raises a KeyError; any key that does not exist gets the value returned by the default factory. For example, the following code uses a lambda function to provide 'Vanilla' as the default value for unassigned keys, and the second code snippet functions as a counter.
from collections import defaultdict
ice_cream = defaultdict(lambda: 'Vanilla')
ice_cream['Sarah'] = 'Chunky Monkey'
ice_cream['Abdul'] = 'Butter Pecan'
print(ice_cream['Sarah'])
# Chunky Monkey
print(ice_cream['Joe'])
# Vanilla
from collections import defaultdict
counts = defaultdict(int)  # the default value for int is 0
counts['counter'] += 1
Counter
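The Counter class from the same module counts hashable objects; a minimal sketch:

from collections import Counter

count = Counter('aardvark')
print(count)                 # Counter({'a': 3, 'r': 2, 'd': 1, 'v': 1, 'k': 1})
print(count.most_common(1))  # [('a', 3)]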
9.5.3 Exercises
1. 349. Intersection of Two Arrays (easy)
Answer: simply use a hash set of tuples to save each distinct sending email address as a (local name, domain name) pair:
class Solution:
    def numUniqueEmails(self, emails):
        """
        :type emails: List[str]
        :rtype: int
        """
        if not emails:
            return 0
        handledEmails = set()
        for email in emails:
            local_name, domain_name = email.split('@')
            local_name = local_name.split('+')[0]
            local_name = local_name.replace('.', '')
            handledEmails.add((local_name, domain_name))
        return len(handledEmails)
9.6.1 Introduction
Graph representations need to give users the full information of the graph itself, G = (V, E)—its vertices, edges, and weights—and to distinguish whether it is directed or undirected, weighted or unweighted. There are generally four ways: (1) adjacency matrix, (2) adjacency list, (3) edge list, and (4) optionally, a tree structure, if the graph is a free tree. Each is preferred in different situations. An example is shown in Fig 9.3.
In an undirected graph, an edge that connects v from u also works the other way around. To represent an undirected graph, we have to double the number of edges shown in the structure: it becomes 2|E| in all of our representations.
Adjacency Matrix

am = [[0] * 7 for _ in range(7)]
# set 8 edges
am[0][1] = am[1][0] = 1
am[0][2] = am[2][0] = 1
am[1][2] = am[2][1] = 1
am[1][3] = am[3][1] = 1
am[2][4] = am[4][2] = 1
am[3][4] = am[4][3] = 1
am[4][5] = am[5][4] = 1
am[5][6] = am[6][5] = 1
Applications The adjacency matrix usually fits a dense graph well, where the number of edges is close to |V|^2, leaving only a small portion of the matrix blank and unused. Checking if an edge exists between two vertices takes only O(1). However, an adjacency matrix requires exactly O(V) to enumerate the neighbors of a vertex v—an operation commonly used in many graph algorithms—even if the vertex has only a few neighbors. Moreover, when the graph is sparse, an adjacency matrix is inefficient in both space and iteration cost; a better option is the adjacency list.
Adjacency List
An adjacency list is a more compact and space-efficient form of graph representation than the adjacency matrix above. In an adjacency list, we have a vertex-indexed list of V vertices, and for each vertex v we store another list of its neighboring vertices, which can be represented with an array or a linked list. For example, with the adjacency list [[1, 2, 3], [3, 1], [4, 6, 1]], node 0 connects to 1, 2, 3; node 1 connects to 3, 1; and node 2 connects to 4, 6, 1.
In Python, we can use a normal 2-d array to represent the adjacency list. The same graph from the example is represented with the following code:
al = [[] for _ in range(7)]
# set 8 edges, each stored in both directions for the undirected graph
al[0] = [1, 2]
al[1] = [0, 2, 3]
al[2] = [0, 1, 4]
al[3] = [1, 4]
al[4] = [2, 3, 5]
al[5] = [4, 6]
al[6] = [5]
Edge List
The edge list is a one-dimensional list of edges, where the index of the list does not relate to any vertex, and each edge is usually in the form (starting vertex, ending vertex, weight). We can use either a list or a tuple to represent an edge. The edge list representation of the example is:

el = []
el.extend([[0, 1], [1, 0]])
el.extend([[0, 2], [2, 0]])
el.extend([[1, 2], [2, 1]])
el.extend([[1, 3], [3, 1]])
el.extend([[3, 4], [4, 3]])
el.extend([[2, 4], [4, 2]])
el.extend([[4, 5], [5, 4]])
el.extend([[5, 6], [6, 5]])
Applications The edge list is not as widely used as the adjacency matrix and adjacency list; it is usually needed only as a subroutine of an algorithm implementation—such as in Kruskal's algorithm for finding the Minimum Spanning Tree (MST)—where we might need to order the edges by weight.
Tree Structure
Weighted Graph If we need weights for each edge, we can use a two-dimensional dictionary. We use 10 as the weight of all edges just to demonstrate:

from collections import defaultdict

dw = defaultdict(dict)
for v1, v2 in el:
    vn1 = chr(v1 + ord('a'))
    vn2 = chr(v2 + ord('a'))
    dw[vn1][vn2] = 10
print(dw)
We can access an edge and its weight through dw[v1][v2]. The output of this structure is:

defaultdict(<class 'dict'>, {'a': {'b': 10, 'c': 10}, 'b': {'a': 10, 'c': 10, 'd': 10}, 'c': {'a': 10, 'b': 10, 'e': 10}, 'd': {'b': 10, 'e': 10}, 'e': {'d': 10, 'c': 10, 'f': 10}, 'f': {'e': 10, 'g': 10}, 'g': {'f': 10}})
Binary Tree Node In a binary tree, each node has at most two children pointers, which we define as left and right. The binary tree node is defined as:

class BinaryNode:
    def __init__(self, val):
        self.left = None
        self.right = None
        self.val = val
N-ary Tree Node For an N-ary node, we initialize the node's children list with an additional argument n:

class NaryNode:
    def __init__(self, n, val):
        self.children = [None] * n
        self.val = val
Construct A Tree Now that we have defined the tree node, constructing the tree in the figure is a series of operations:

    1
   / \
  2   3
 / \   \
4   5   6

root = BinaryNode(1)
left = BinaryNode(2)
right = BinaryNode(3)
root.left = left
root.right = right
left.left = BinaryNode(4)
left.right = BinaryNode(5)
right.right = BinaryNode(6)
def constructTree(a, idx):
    '''
    a: the input array
    idx: index to indicate the location of the current node
    '''
    if idx >= len(a):
        return None
    if a[idx]:
        node = BinaryNode(a[idx])
        node.left = constructTree(a, 2 * idx + 1)
        node.right = constructTree(a, 2 * idx + 2)
        return node
    return None
Now, we call this function and pass it our input array:

nums = [1, 2, 3, 4, 5, None, 6]
root = constructTree(nums, 0)
In the next section, we discuss tree traversal methods, and we will use those
methods to print out the tree we just build.
Example 1:
Input: root = [3,5,1,6,2,0,8,null,null,7,4], p = 5, q = 1
Output: 3
Explanation: The LCA of nodes 5 and 1 is 3.

Example 2:
Input: root = [3,5,1,6,2,0,8,null,null,7,4], p = 5, q = 4
Output: 5
Explanation: The LCA of nodes 5 and 4 is 5, since a node can be a descendant
of itself according to the LCA definition.
Solution: Divide and Conquer. There are two cases for the LCA: 1) the
two nodes are found in different subtrees, as in example 1; 2) the two nodes
are in the same subtree, as in example 2. During the traversal, we compare the
current node with p and q; if it equals either of them, we return the current
node. Therefore, in example 1, at node 3 the left subtree returns node 5 and
the right subtree returns node 1, so node 3 is the LCA. In example 2, node 5
returns itself; at node 3 the right subtree returns None, which makes node 5
the only valid return and thus the final LCA. The time complexity is O(n).

def lowestCommonAncestor(self, root, p, q):
    """
    :type root: TreeNode
    :type p: TreeNode
    :type q: TreeNode
    :rtype: TreeNode
    """
    if not root:
        return None
    if root == p or root == q:
        # found one valid node (case 1: stop at 5 and 1; case 2: stop at 5)
        return root
    left = self.lowestCommonAncestor(root.left, p, q)
    right = self.lowestCommonAncestor(root.right, p, q)
    if left is not None and right is not None:
        # p and q are in different subtrees
        return root
    # at most one side found a valid node
    return left if left is not None else right
9.8 Heap
Heap is a tree based data structure that satisfies the heap ordering prop-
erty. The ordering can be one of two types:
• the min-heap property: the value of each node is greater than or equal
(≥) to the value of its parent, with the minimum-value element at the
root.
• the max-heap property: the value of each node is less than or equal to
(≤) the value of its parent, with the maximum-value element at the
root.
Figure 9.4: A max-heap visualized as a binary tree on the left, and
implemented with an array on the right.

• For a node at index k, its parent is located at index k/2 (in Python 3, use
integer division k // 2).

When we push an item in, the item is initially appended to the end of the
heap. If the new item is smaller than its ancestors, such as 5 in our example,
the heap property is violated along the path from the end of the heap to the
root. To repair the violation, we traverse this path and compare the added
item with its parent; this process is called percolation up, and the
comparison is repeated until the parent is smaller than or equal to the
percolating element:

• if the parent is smaller than or equal to the added item, no action is
needed and the traversal terminates; e.g., adding item 18 leads to no action.
• otherwise, swap the item with its parent, and move to the parent's position
to keep traversing.
Each step fixes the heap ordering property for a subtree. The time complexity
is proportional to the height of the complete tree, which is O(log n).
To generalize the process, a _float() function is first implemented, which
enforces the min-heap ordering property on the path from a given index to the
root.

def _float(idx, heap):
    # one-based heap: the parent of idx is idx // 2
    while idx // 2:
        p = idx // 2
        # violation: the child is smaller than its parent
        if heap[idx] < heap[p]:
            heap[idx], heap[p] = heap[p], heap[idx]
        else:
            break
        idx = p
    return
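With _float() in place, a push is just an append followed by a float; a minimal sketch on the same one-based heap layout (the function name push is ours):

def push(heap, val):
    # append at the end of the heap, then restore the min-heap
    # ordering on the path back to the root
    heap.append(val)
    _float(idx=len(heap) - 1, heap=heap)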
• if one of its children has a smaller value than this item, swap this item
with that child and move down to that child's location, then continue.

Figure 9.6: Left: delete node 5 and move node 12 to the root. Right: 6 is the
smallest among 12, 6, and 7, so swap node 6 with node 12.

Similarly, this process is called percolation down. As with insertion, the
complexity is O(log n). We demonstrate this process with two cases:

• if the item is the root, which is the minimum item 5 in our min-heap
example, we move 12 to the root first. Then we compare 12 with its two
children, 6 and 7; we swap 12 with 6, and continue. The process is shown in
Fig. 9.6.
• if the item is any node other than the root, say node 7 in our example, the
process is exactly the same: we move 12 to node 7's position; comparing 12
with its children 10 and 15, 10 and 12 are swapped. With this, the heap
ordering property is sustained.
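The pop() listed next relies on a _sink() helper, the mirror image of _float(); a minimal sketch on the same one-based layout (index 0 unused):

def _sink(idx, heap):
    n = len(heap) - 1  # index of the last item in the one-based heap
    while 2 * idx <= n:
        c = 2 * idx
        # pick the smaller child if the right child exists
        if c + 1 <= n and heap[c + 1] < heap[c]:
            c += 1
        if heap[c] < heap[idx]:
            heap[idx], heap[c] = heap[c], heap[idx]
            idx = c
        else:
            break
    return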
def pop(heap):
    val = heap[1]
    # move the last item into the root position, then sink it down;
    # guard the case where the root was the only item left
    last = heap.pop()
    if len(heap) > 1:
        heap[1] = last
        _sink(idx=1, heap=heap)
    return val
Python's built-in heapq module implements a min-heap that differs from ours
in one aspect: it uses zero-based indexing. Three other functions, nlargest,
nsmallest, and merge, come in handy in practice. These functions are listed
and described in Table 9.9.

Min-Heap Given the exemplary list h = [21, 1, 45, 78, 3, 5], we call
heapify() to convert it to a min-heap.

from heapq import heappush, heappop, heapify
h = [21, 1, 45, 78, 3, 5]
heapify(h)

The heapified result is h = [1, 3, 5, 78, 21, 45]. Let's try heappop and heappush:

heappop(h)
heappush(h, 15)
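heappop(h) returns the smallest item 1 and leaves h = [3, 21, 5, 78, 45]; heappush(h, 15) then appends 15, giving h = [3, 21, 5, 78, 45, 15]. The merge function lazily merges multiple sorted inputs into a single sorted iterator; a small sketch, where the input lists a and b are made up:

from heapq import merge

a = [1, 3, 5]
b = [2, 4, 6]
ab = merge(a, b)  # returns an iterator, not a list
print(ab)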
Printing ab directly only gives us a generator object with its address in
memory:

<generator object merge at 0x7fdc93b389e8>

We can use a list comprehension to iterate through ab and save the sorted
items into a list:

ab_lst = [n for n in ab]
However, if we have multiple tasks with the same priority, the relative order
of these tied tasks cannot be sustained. This is because list items are
compared with the whole list as the key: the first items are compared first,
and whenever there is a tie, the next items are compared. For example, suppose
our example has multiple items with 3 as the first value:

h = [[3, 'e'], [3, 'd'], [10, 'c'], [5, 'b'], [3, 'a']]
heapify(h)

The printout indicates that the relative ordering of items [3, 'e'], [3, 'd'],
[3, 'a'] is not kept:

[[3, 'a'], [3, 'd'], [10, 'c'], [5, 'b'], [3, 'e']]
Keeping the relative order of tasks with the same priority is a requirement
for the priority queue abstract data structure. We will see in the next
section how a priority queue can be implemented with heapq.
Modify Items in heapq In the heap, we can change the value of any item just
as we can in a list. However, the change can violate the heap ordering
property, so we need a way to fix it. There are two private functions to use,
depending on the kind of change:

• _siftdown(heap, startpos, pos): pos is where the new violation is; startpos
is up to where we want to restore the heap invariant, usually set to 0.
Because _siftdown() goes backwards, comparing the node with its parents, we
call it to fix the heap when an item's value is decreased.

import heapq

# the heap here is assumed to be the example from the last section
heap = [[3, 'e'], [3, 'd'], [10, 'c'], [5, 'b'], [3, 'a']]
heapq.heapify(heap)
print(heap)

heap[0] = [6, 'a']
# increased value: restore the invariant downwards with _siftup
heapq._siftup(heap, 0)
print(heap)
# decreased value: restore the invariant upwards with _siftdown
heap[2] = [3, 'a']
heapq._siftdown(heap, 0, 2)
print(heap)
3. If two items have the same priority, they are served according to their
order in the queue.
1. CPU Scheduling,
• Sort stability: when we get two tasks with equal priorities, we want to
return them in the same order as they were originally added. A potential
solution is to extend the original 2-element list [priority, task] into a
3-element list [priority, count, task]. A list is preferred because a tuple
does not allow item assignment. The entry count indicates the original order
of the task in the list and serves as a tie-breaker, so that two tasks with
the same priority are returned in the order they were added, preserving sort
stability. Also, since no two entry counts are the same, the task will never
be directly compared with another task in the comparison. For example, using
the same example as in the last section:

import itertools

counter = itertools.count()
h = [[3, 'e'], [3, 'd'], [10, 'c'], [5, 'b'], [3, 'a']]
h = [[p, next(counter), t] for p, t in h]
def remove_task(task_id):
    # the def line is assumed; entry_finder maps each task to its heap entry
    entry_finder.pop(task_id)  # delete from the dictionary
    return

# Remove task 'd'
remove_task('d')
# Update task 'b''s priority to 14
remove_task('b')
new_item = [14, next(counter), 'b']
heappush(heap, new_item)
entry_finder['b'] = new_item
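To complete the picture, a pop_task() that skips stale heap entries, modeled on the standard heapq priority-queue recipe (the name pop_task and the identity check are our assumptions):

def pop_task():
    # pop entries until one that is still registered in entry_finder;
    # entries dropped by remove_task are silently skipped
    while heap:
        entry = heappop(heap)
        task = entry[-1]
        if entry_finder.get(task) is entry:
            del entry_finder[task]
            return task
    raise KeyError('pop from an empty priority queue')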
Customized Object If we want a higher value to mean higher priority, we can
do so with a customized object that implements the two comparison operators
< and == via the magic functions __lt__() and __eq__(). The code is:

class Job:
    def __init__(self, priority, task):
        self.priority = priority
        self.task = task

    def __lt__(self, other):
        try:
            # reversed comparison: a larger priority ranks first
            return self.priority > other.priority
        except AttributeError:
            return NotImplemented

    def __eq__(self, other):
        try:
            return self.priority == other.priority
        except AttributeError:
            return NotImplemented
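With __lt__() reversed like this, pushing Job objects onto a heapq turns it into a max-priority queue; a small usage sketch with made-up tasks:

from heapq import heappush, heappop

jobs = []
heappush(jobs, Job(3, 'write report'))
heappush(jobs, Job(10, 'fix outage'))
heappush(jobs, Job(5, 'review code'))
print(heappop(jobs).task)  # 'fix outage': the highest priority comes out first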
Hands-on Example
Top K Frequent Elements (L347, medium) Given a non-empty array of integers,
return the k most frequent elements.

Example 1:
Input: nums = [1,1,1,2,2,3], k = 2
Output: [1,2]

Example 2:
Input: nums = [1], k = 1
Output: [1]

Analysis: We first use a hashmap to gather each item and its frequency. Then
the problem becomes obtaining the top k most frequent items from the counter:
we can either sort or use a heap. The exemplary code here is for the purpose
of getting familiar with the related Python modules.
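One possible sketch combines Counter with heapq.nlargest (the function name top_k_frequent is ours; this is not necessarily the book's own listing):

from collections import Counter
from heapq import nlargest

def top_k_frequent(nums, k):
    count = Counter(nums)  # item -> frequency
    # take the k pairs with the largest frequency, then keep only the items
    return [item for item, _ in nlargest(k, count.items(), key=lambda p: p[1])]

print(top_k_frequent([1, 1, 1, 2, 2, 3], 2))  # [1, 2]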
9.10 Bonus
Fibonacci Heap With a Fibonacci heap, insert() and getHighestPriority()
can be implemented in O(1) amortized time and deleteHighestPriority() in
O(log n) amortized time.
9.11 Exercises
Selection with keyword: kth. These problems can be solved by sorting, using
a heap, or using quickselect.
1. 703. Kth Largest Element in a Stream (easy)
2. 215. Kth Largest Element in an Array (medium)
3. 347. Top K Frequent Elements (medium)
4. 373. Find K Pairs with Smallest Sums (medium)
5. 378. Kth Smallest Element in a Sorted Matrix (medium)
Priority queue, quicksort, or quickselect:
1. 23. Merge k Sorted Lists (hard)
2. 253. Meeting Rooms II (medium)
3. 621. Task Scheduler (medium)
Part IV
This part embodies the principles of algorithm design and analysis techniques,
the central part of this book.

Before we start, I want to emphasize that tree and graph data structures,
especially trees, are great visualization tools to assist us with algorithm
design and analysis. A tree is a recursive structure; it can be used to
visualize almost any recursion-based algorithm design, and even to compute the
complexity, in which case it is specifically called a recursion tree.

In the next three chapters, we introduce the principles of algorithm analysis
(Chapter 10) and the fundamental algorithm design principles: divide and
conquer (Chapter 13) and reduce and conquer (Chapter IV). In Algorithm
Analysis, we familiarize ourselves with common concepts and techniques to
analyze the performance of algorithms: running time and space complexity.
Divide and conquer is a widely used principle in algorithm design; in this
book, we dedicate a whole chapter to its sibling design principle, reduce and
conquer, which is essentially a superset of the optimization design
principles, dynamic programming and greedy algorithms, that are further
detailed in Chapter 15 and Chapter 17.
Algorithm Complexity Analysis
10.1 Introduction
In reality, it is impossible to predict the exact behavior of an algorithm;
complexity analysis therefore only tries to extract the main influencing
factors and ignores trivial details. Complexity analysis is thus only
approximate, but it works.

What are the different cases? When two input instances have exactly the same
size but different values, such as one input array that is already sorted and
another that is totally random, the time taken in these two cases will
possibly vary, depending on the sorting algorithm you chose. In complexity
analysis, best-case, worst-case, and average-case analysis are used to
differentiate the behavior of the same algorithm applied to different input
instances.
1. Worst-case: The behavior of the algorithm or an operation of a data
structure with respect to the worst possible input instance. This gives us a
way to measure the upper bound on the running time for any input, denoted as
O. Knowing it gives us a guarantee that the algorithm will never take any
longer.

2. Average-case: The expected behavior when the input is randomly drawn from a
given distribution. Average-case running time is used as an estimated
complexity for a typical case. The expected case offers us the asymptotic
bound Θ. Computing the average-case running time entails knowing all possible
input sequences, the probability distribution of occurrence of these
sequences, and the running times for the individual sequences. Often it is
assumed that all inputs of a given size are equally likely.

3. Best-case: The best possible behavior, when the input data is arranged in a
way that makes the algorithm run the least amount of time.
Toy Example: Selection Sort Given a list of integers, sort the items
incrementally.

For example, given the list A = [10, 3, 9, 2, 8, 7, 9], the sorted list will
be A = [2, 3, 7, 8, 9, 9, 10].

There are many sorting algorithms; in this case, let us examine selection
sort. Given the input array A of size n, we have indices [0, n − 1]. In
selection sort, each time we select the current largest item and swap it with
the item at its corresponding position in the sorted list, dividing the list
into two parts: the unsorted part on the left and the sorted part on the
right. For example, at the first pass, we choose 10 from A[0, n − 1] and swap
it with A[n − 1], which is 9; at the second pass, we choose the largest item
9 from A[0, n − 2] and swap it with 7 at A[n − 2], and so on. In total, after
n − 1 passes we will get an incrementally sorted array. More details of
selection sort can be found in Chapter 15.

In the implementation, we use ti to denote the target position and li the
index of the largest item, which can only be found by scanning. We show the
Python code:
1  def selectSort(a):                      #  cost   times
2      '''Implement selection sort'''
3      n = len(a)
4      for i in range(n - 1):              #  n-1 passes
5          ti = n - 1 - i                  #  c      n-1
6          li = 0                          #  c      n-1
7          for j in range(n - i):
8              if a[j] > a[li]:            #  c      sum_{i=0}^{n-2}(n-i)
9                  li = j                  #  c      sum_{i=0}^{n-2}(n-i)
10         # swap li and ti
11         print('swap', a[li], a[ti])
12         a[ti], a[li] = a[li], a[ti]     #  c      n-1
13         print(a)
14     return a
First, we ignore the distinction between different operation types and treat
them all alike with a cost of c. In the above code, the lines annotated with
cost and times are operations. In line 5, we first point at the target
position ti; because of the for loop above it, this operation is executed
n − 1 times, and the same holds for lines 6 and 12. The operations in lines 8
and 9 are executed Σ_{i=0}^{n−2}(n − i) times due to the two nested for loops,
where the range of j depends on the outer loop variable i. We get

T(n) = 3c(n − 1) + 2c·Σ_{i=0}^{n−2}(n − i)               (10.1)
     = 3c(n − 1) + 2c(n + (n − 1) + (n − 2) + ... + 2)
     = 3c(n − 1) + 2c·(n − 1)(2 + n)/2
     = cn² + cn − 2c + 3cn − 3c
     = cn² + 4cn − 5c                                     (10.2)
     = an² + bn + d, for constants a, b, d                (10.3)
We should note that only if f(n) = O(g(n)) and f(n) = Ω(g(n)) can we have
f(n) = Θ(g(n)).

Input Size and Running Time In general, the time taken by an algorithm grows
with the size of its input, so it is universal to describe the running time of
a program as a function f(n) of the size of its input, with the input size
denoted as n.

The notion of input size depends on the specific problem and data structure.
For example, the size of an array can be denoted by an integer n, or by the
total number of bits when it comes to binary notation; and sometimes, if the
input is a matrix or a graph, we need two integers, such as (m, n) for a
two-dimensional matrix or (V, E) for the vertices and edges of a graph. We use
the function T to denote the running time: with input size n, the running time
is T(n); given (m, n), it is T(m, n).

• Worst-case: T(n) = an² + bn + c, so we can say T(n) = Θ(n²), which indicates
that T(n) = Ω(n²) and T(n) = O(n²).
• Best-case: T(n) = an, so we can say T(n) = Θ(n), which indicates that
T(n) = Ω(n) and T(n) = O(n).
As in Chapter ??, there are generally two ways of reducing a problem: divide
and conquer, and reduce by constant size, which is actually a non-homogeneous
recurrence relation.

In Chapter II, we showed how to solve linear recurrence relations and get
exact answers; it was seemingly complex and terrifying. The good news is that
complexity analysis is about estimating the cost, so we can loosen up a bit:
sometimes a lower or upper bound is good enough, and the base case will almost
always be O(1) = 1.
Iterative Method

T(n) = T(n/2) + O(1)
     = T(n/2²) + 2·O(1)
     = T(n/2³) + 3·O(1)
     = ...
     = T(n/2^k) + k·O(1)                                  (10.6)

We let n/2^k = 1; solving this equation gives k = log₂ n. Most likely
T(1) = O(1) will be the initial condition; substituting it, we get
T(n) = O(log₂ n).

However, when we try to apply iteration to the third recurrence,
T(n) = 3T(n/4) + O(n), it might be tempting to assume that T(n) = O(n log n),
due to the fact that T(n) = 2T(n/2) + O(n) leads to that time complexity.
Recursion Tree

Since the number of terms of T(n) grows, the iteration can look messy. We can
use a recursion tree to better visualize the process of iteration. In a
recursion tree, each node represents the cost of a single subproblem, and each
leaf is a subproblem. As a start, we expand T(n) as a root node with value n;
it has three children, each representing a subproblem T(n/4). We do the same
with each leaf node until the subproblem is trivial and becomes a base case,
which here is T(1). In practice, we just need to draw a few layers to find the
rule. The total cost is the sum of the costs of all layers. The process can be
seen in Fig. 10.3. Through the expansion with iteration and the recursion
tree, our time complexity function becomes:

T(n) = Σ_{i=1}^{k} L_i + L_{k+1}                          (10.11)
     = n·Σ_{i=1}^{k} (3/4)^{i−1} + 3^k·T(n/4^k)           (10.12)

In the process, we can see that Eq. 10.13 and Eq. 10.7 are the same. Because
T(n/4^k) = T(1) = 1, we have k = log₄ n.

T(n) ≤ n·Σ_{i=1}^{∞} (3/4)^{i−1} + 3^k·T(n/4^k)           (10.13)
     ≤ 1/(1 − 3/4)·n + 3^{log₄ n}·T(1) = 4n + n^{log₄ 3} ≤ 5n   (10.14)
     = O(n)                                                (10.15)
Mathematical Induction

Mathematical induction is a mathematical proof technique, essentially used to
prove that a property P(n) holds for every natural number n, i.e., for
n = 0, 1, 2, 3, and so on. Therefore, in order to use induction, we need to
make a guess of the closed-form solution for a_n. Induction requires two cases
to be proved.

1. Base case: prove that the property holds for the number 0.
2. Induction step: prove that, if the property holds for one natural number n,
then it holds for the next natural number n + 1.

For T(n) = 2T(n − 1) + 1 with T(0) = 0, we can have the following result by
expanding T(i), i ∈ [0, 7]:

n    0  1  2  3  4   5   6   7
T(n) 0  1  3  7  15  31  63  127

It is not hard to find the rule and guess T(n) = 2^n − 1. Now, we prove this
equation by induction:

1. Show that the base case is true: T(0) = 2⁰ − 1 = 0.
2. Induction step: assume T(n − 1) = 2^{n−1} − 1 holds; then

T(n) = 2T(n − 1) + 1                                       (10.16)
     = 2(2^{n−1} − 1) + 1                                  (10.17)
     = 2^n − 1                                             (10.18)
where a ≥ 1, b > 1, and f(n) is a given function, usually of the form
f(n) = cn^k. The second, more general type allows multiple subproblem sizes:

T(n) = Σ_{i=1}^{k} a_i·T(n/b_i) + f(n)                     (10.20)

Considering that the first type is much more commonly seen than the other, we
only learn how to solve the first type; in fact, I assure you that within this
book, the second type will never appear.

Now, assume T(1) = c for simplicity and to get rid of the constant part in our
sequence. Then, with the substitution n = b^m, we get a geometric series,
which is a good sign for obtaining a closed-form expression. We first
summarize all possible substitutions that will help our further analysis.

2. b^k = a: with b^k/a = 1, T(n) = O(a^m·m). With Eq. 10.35 and Eq. 10.33, our
upper bound is:

3. b^k > a: in this case, we denote b^k/a = d (d is a constant and d > 1).
Using the standard formula for summing a geometric series:

T(n) = c·a^m·(d^{m+1} − 1)/(d − 1) = O(a^m·d^m)            (10.38)
     = O(b^{mk}) = O(n^k) = O(f(n))                        (10.39)
Master Method

Comparing b^k with a is equivalent to comparing b^{km} with a^m. From the
above substitution, it is further equivalent to comparing f(n) with
n^{log_b a}. This is where the master method kicks in, and we will see how it
helps us apply these three cases to real situations.

Compare f(n)/c = n^k with n^{log_b a}. Intuitively, the larger of the two
functions dominates the solution to the recurrence. Now, we rephrase the three
cases using the master method for ease of memorization.

3. If n^k = n^{log_b a}, then:
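For reference, the standard textbook statement that these cases condense into, for T(n) = aT(n/b) + Θ(n^k) with a ≥ 1 and b > 1 (the numbering here follows the common convention and may differ from the author's ordering above):

1. If n^k = O(n^{log_b a − ε}) for some ε > 0 (i.e., b^k < a), then T(n) = Θ(n^{log_b a}).
2. If n^k = Θ(n^{log_b a}) (i.e., b^k = a), then T(n) = Θ(n^k log n).
3. If n^k = Ω(n^{log_b a + ε}) for some ε > 0 (i.e., b^k > a), then T(n) = Θ(n^k).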
9      j = i - 1
10
11     while j >= 0 and sl[j] > key:  # compare key with the sorted region
12         sl[j + 1] = sl[j]          # shift sl[j] backward
13         j -= 1
14     sl[j + 1] = key
15     print(sl)
16     return sl

The first for loop in line 7 is sure to have n − 1 passes. However, for the
inner while loop, the real number of executions of the statements in lines 12
and 13 depends on the state of sl relative to key. Suppose we sort the input
array a incrementally, so that the result is A = [2, 3, 7, 8, 9, 9, 10]. If
the input array is already sorted, then no item in the sorted region can be
larger than our key, which results in only the execution of line 14. This is
the best case; we can denote the running time of the while loop by Ω(1)
because it takes constant time in its best case. However, if the input array
is reversed with respect to the desired sorting, i.e., decreasingly sorted
such as A = [10, 9, 9, 8, 7, 3, 2], then the inner while loop runs n − i
times, which we denote by O(n). We can write our running time equations as:
Using simple iteration, we can solve these formulas and obtain the asymptotic
upper and lower bounds for the time complexity of insertion sort. For the
average case, we can assume that each pass needs half of the n − i
comparisons, which gives the following equation:
1. Consider each operation separately: look at each operation incurred in the
algorithm or data structure separately, and offer a worst-case running time O
and an average running time Θ for each operation. For the whole algorithm, sum
these over how many times each operation is incurred.

Amortized analysis does not look at each operation on a given data structure
purely in isolation; it averages the time required to perform a sequence of
different data-structure operations over all performed operations. With
amortized analysis, we might see that even though a single operation can be
expensive, its cost amortized over all operations is small. Unlike
average-case analysis, no probability is involved. From the example later, we
will see that amortized analysis views the data structure in an applicable
scenario: to complete the given tasks, what is the average cost of each
operation, achievable for any input? Therefore, for the same cost function
O(f(n)): worst-case ≥ amortized ≥ average.
There are three types of amortized analysis:
1. Aggregate Analysis:
2. Accounting Method:
3. Potential method:
10.7 Summary
For your convenience, we provide a table that shows the frequently used
recurrence equations' time complexities.
Figure 10.4: The cheat sheet for time and space complexity with recurrence
functions. For example, T(n) = T(n−1) + T(n−2) + ... + T(1) + O(n−1) = 3^n.
The classes are called factorial, exponential, quadratic, linearithmic,
linear, logarithmic, and constant.
10.8 Exercises
10.8.1 Knowledge Check
1. Use iteration and recursion tree to get the time complexity of T (n) =
T (n/3) + 2T (2n/3) + O(n).
Search Strategies
11.1 Introduction
Linear and tree-like data structures are all subsets of graphs, making graph
search universal among searching algorithms. There are many searching
algorithms¹.

1 https://en.wikipedia.org/wiki/Category:Search_algorithms
Linear Search As the naive baseline compared with other searching algorithms,
linear search, a.k.a. sequential search, simply traverses the linear data
structure sequentially, checking items until a target is found. It consists of
a for/while loop, giving O(n) time complexity, and no extra space is needed.
For example, we search list A for a target t:

def linearSearch(A, t):
    '''A is the array, and t is the target'''
    for i, v in enumerate(A):
        if v == t:
            return i
    return -1

Linear search is rarely used in practice due to its lack of efficiency
compared with other searching methods, such as hashmaps and binary search,
that we will learn soon.
The search process constructs a search tree where the root is the start state.
Loops in a graph may cause the search tree to be infinite even if the state
space is small. In this section, we only use acyclic graphs or trees to
demonstrate the general search methods. In an acyclic graph, there might exist
multiple paths from source to target; for example, the graph shown in Fig. ??
has multiple such paths. Further, in the graph search section, we discuss how
to handle cycles and explain single-path graph search. Changing the ordering
in the frontier set leads to different search strategies.

from collections import defaultdict
al = defaultdict(list)
al['S'] = [('A', 4), ('B', 5)]
al['A'] = [('G', 7)]
al['B'] = [('G', 3)]
With uninformed search, we only know the goal test and the adjacent nodes,
without knowing which non-goal states are better. Assuming and limiting the
state space to a tree for now, we need not worry about repeated states.

There are a few ways to order nodes in the frontier without domain-specific
information:

• Queue, where nodes are first in, first out (FIFO) from the frontier set.
This is called breadth-first search.
• Stack, where nodes are last in, first out (LIFO) from the frontier set. This
is called depth-first search.
• Priority queue, where nodes are sorted increasingly by the path cost from
the source. This is called uniform-cost search.
Figure 11.3: Breadth-first search on a simple search tree. At each stage, the
node to be expanded next is indicated by a marker.

Q=[B, C]
Expand B, add D and E into Q
Q=[C, D, E]
Expand C, add F and G into Q
Q=[D, E, F, G]
Finish expanding D
Q=[E, F, G]
Finish expanding E
Q=[F, G]
Finish expanding F
Q=[G]
Finish expanding G
Q=[]

Call the function as bfs(al, 'S'); the output is:
S A B G G
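A minimal sketch of a bfs consistent with the adjacency dictionary al defined above and with the printed output (no visited set or goal test, which is why G is printed twice):

from collections import deque

def bfs(al, s):
    q = deque([s])
    while q:
        n = q.popleft()      # FIFO: expand the earliest added node
        print(n, end=' ')
        for v, _ in al[n]:   # push all adjacent nodes
            q.append(v)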
Time Complexity We can clearly see that BFS scans each node in the tree
exactly once; if the tree has n nodes, the time complexity is O(n). However,
the search can terminate once the goal is found, which can take fewer than n
expansions, so we measure the time complexity by counting the number of nodes
expanded while the search runs. Assume the tree has a branching factor b at
each non-leaf node and the goal node is located at depth d; summing the number
of nodes from depth 0 to depth d gives a total of
1 + b + b² + ... + b^d = O(b^d).
Depth-first search, on the other hand, always expands the deepest node from
the frontier first. As shown in Fig. 11.4, depth-first search starts at the
root node and continues branching down a particular path. Using S to denote
the frontier set, which is indeed a stack, the search process is explained:

S=[A]
Expand A, add C and B into S
S=[C, B]
Expand B, add E and D into S
S=[C, E, D]
Expand D
S=[C, E]
Expand E
S=[C]
Expand C, add G and F into S
S=[G, F]
Expand F
S=[G]
Expand G
S=[]
Call the function as dfs(al, 'S'); the output is:
S A G B G
Call the function as dfs_iter(al, 'S'); the output is:
S B G A G
We observe that the ordering is not exactly the same as that of the recursive
counterpart. To keep the ordering consistent, we simply add the adjacent nodes
in reversed order: in practice, we replace g[n] with g[n][::-1].
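Minimal sketches of dfs and dfs_iter consistent with the outputs above (no visited set or goal test):

def dfs(al, n):
    # recursive: print the node, then branch into each adjacent node
    print(n, end=' ')
    for v, _ in al[n]:
        dfs(al, v)

def dfs_iter(al, s):
    # iterative: an explicit stack replaces the recursion
    stack = [s]
    while stack:
        n = stack.pop()
        print(n, end=' ')
        for v, _ in al[n]:
            stack.append(v)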
Properties DFS may not terminate without a fixed depth bound limiting the
number of nodes it expands. DFS is not complete, because it always deepens the
search, and in some cases the supply of nodes, even within the cut-off depth
bound, can be infinite. DFS is not optimal: in our example, if the goal node
is C, it goes through nodes A, B, D, and E before it finds node C, while BFS
only goes through nodes A and C. However, when we are lucky, DFS can find long
solutions quickly.

Time Complexity DFS might need to explore all nodes in the graph to find the
target; thus its worst-case time complexity is determined not by the depth of
the goal but by the total depth of the graph, d. DFS has the same worst-case
time complexity as BFS, which is O(b^d).

Space Complexity The stack stores at most a single path from the root to a
leaf (goal) node, along with the remaining unexpanded siblings, so that once
all children have been visited, the search can backtrack to a parent node and
know which sibling to explore next. Therefore, the space needed for DFS is
O(bd). In most cases the branching factor is a constant, which makes the space
complexity mainly influenced by the depth of the search tree. Obviously, DFS
has great space efficiency, which is why it is adopted as the basic technique
in many areas of computer science, such as solving constraint satisfaction
problems (CSPs). The backtracking technique we are about to introduce further
optimizes the space complexity on the basis of DFS.
Here, our source is 'S' and the goal is 'G'. We are set to find a path from
source to goal with minimum cost. The process is shown as:

Q = [(0, S)]
Expand S, add A and B
Q = [(4, A), (5, B)]
Expand A, add G
Q = [(5, B), (11, G)]
Expand B, add G
Q = [(8, G), (11, G)]
Expand G, goal found, terminate.
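A minimal sketch of uniform-cost search with heapq, consistent with the trace above; it returns the cheapest path cost from the source to the goal:

import heapq

def ucs(al, s, t):
    q = [(0, s)]  # priority queue ordered by path cost
    while q:
        cost, n = heapq.heappop(q)
        if n == t:            # goal test at expansion time
            return cost
        for v, c in al[n]:
            heapq.heappush(q, (cost + c, v))
    return None

Calling ucs(al, 'S', 'G') returns 8, matching the trace.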
Time and Space Complexity Similar to BFS, both the worst-case time and space
complexity are O(b^d). When all edge costs equal c and C* is the cost of the
best goal path, the time and space complexity can be more precisely
represented as O(b^{C*/c}).
depth = 0: S = [S]
Test S, goal not found
depth = 1: S = [S]
Expand S, S = [B, A]
Test A, goal not found
Test B, goal not found
depth = 2: S = [S]
Expand S, S = [B, A]
Expand A, S = [B, G]
Test G, goal found, STOP

The implementation of DLS is easiest with recursive DFS: we count down a
variable maxDepth in the function and only do goal testing when this variable
reaches zero. The code is:
def dls(graph, cur, t, maxDepth):
    # end condition: test the goal only at the depth limit
    if maxDepth == 0:
        if cur == t:
            return True
    if maxDepth < 0:
        return False

    # recur for adjacent vertices
    for n, _ in graph[cur]:
        if dls(graph, n, t, maxDepth - 1):
            return True
    return False

With the help of function dls, the implementation of IDS is just an iterative
call to this subroutine:

def ids(graph, s, t, maxDepth):
    for i in range(maxDepth):
        if dls(graph, s, t, i):
            return True
    return False
Properties Through depth-limited DFS, IDS keeps the advantages of DFS:

• Limited space, linear in the depth and branching factor, giving O(bd) space
complexity.
• In practice, even with the redundant effort, it still finds longer paths
more quickly than BFS does.
Bidirectional search applies breadth-first search from both the start and the
goal node, with one BFS moving forward from the start and one moving backward
from the goal, until their frontiers meet. This process is shown in Fig. 11.5.
As we see, each BFS process only visits O(b^{d/2}) nodes, compared with the
O(b^d) nodes a single BFS visits. This improves both time and space efficiency
by a factor of b^{d/2} compared with vanilla BFS.

Implementation Because the BFS that starts from the goal needs to move
backwards, the easy way is to create another copy of the graph in which each
edge has the opposite direction of the original. With this reversed graph, we
can run a forward BFS from the goal.

We apply BFS level by level instead of updating the queue one node at a time.
For better efficiency when intersecting the frontier sets of the two BFSes, we
use the set data structure instead of a list or a FIFO queue.
Using Fig. 11.2 as an example, with source 'S' and goal 'G', if we proceed
with both BFSes simultaneously, the process looks like this:

qs = ['S']
qt = ['G']
Check intersection, and proceed
qs = ['A', 'B']
qt = ['A', 'B']
Check intersection, frontiers meet, STOP

There is no problem in this case; however, the above process ends up missing
the goal node if we change our goal to 'A'. That process looks like:

qs = ['S']
qt = ['A']
Check intersection, and proceed
qs = ['A', 'B']
qt = ['S']
Check intersection, and proceed
qs = ['G']
qt = []
STOP

This is because, for a source and goal whose shortest path has odd length, if
we advance both search processes simultaneously, we will always end up missing
the intersection. Therefore, we advance the two BFSes alternately, one at a
time, to avoid such trouble.
The code for the one-level-at-a-time BFS with sets and for the intersection
check is:

def bfs_level(graph, q, bStep):
    if not bStep:
        return q
    nq = set()
    for n in q:
        for v, c in graph[n]:
            nq.add(v)
    return nq

def intersect(qs, qt):
    if qs & qt:  # non-empty intersection
        return True
    return False

def bi_bfs(graph, bgraph, s, t):
    # the first lines of this listing are assumed; bgraph is the
    # reversed copy of graph
    qs = {s}
    qt = {t}
    step = 0
    while qs and qt:
        if intersect(qs, qt):
            return True
        qs = bfs_level(graph, qs, step % 2 == 0)
        qt = bfs_level(bgraph, qt, step % 2 == 1)
        step = 1 - step
    return False
11.2.6 Summary
Print Paths So far we have talked about paths, but we never discussed how to
track them. In this section, we first see how to track paths, and then, with
the tracked paths, how to detect cycles to avoid getting into infinite loops.
More Efficient Graph Search The last section was all about tree search; in a
large graph, however, this is not efficient, since some nodes are visited
multiple times if they happen to lie on multiple paths between the source and
another node. Usually, depending on the application scenario, graph search
remembers the already-expanded nodes/states and avoids expanding them again by
checking each about-to-be-expanded node against the frontier set and the
explored set. In this section, we introduce graph search suited to
general-purpose graph problems.
Visiting States We have already explained that we can use three colors,
WHITE, GRAY, and BLACK, to denote nodes in the unexpanded, frontier, and
explored sets, respectively. We do so to avoid the hassle of tracking three
different sets; with visiting states, it all simplifies to a color check. We
define a STATE class for convenience.

class STATE:
    white = 0
    gray = 1
    black = 2
Figure 11.6: Exemplary Graph: Free Tree, Directed Cyclic Graph, and
Undirected Cyclic Graph.
In this section, we use Fig. 11.6 as our exemplary graphs. Each one's data
structure is defined as:

• Free Tree:

ft = [[1], [2], [4], [], [3, 5], []]

ucg = [[1, 2], [0, 2, 3], [0, 1, 4], [1, 4], [2, 3, 5], [4]]
However, if we call it on the cyclic graph, dfs(dcg, 0), it runs into stack
overflow.

Now we call function dfs for ft, dcg, and ucg; the paths and orders for each
example are listed:

• For the free tree and the directed cyclic graph, they have the same output.
The orders are:

[0, 1, 2, 4, 3, 5]

These paths mark the search tree; we visualize the search tree for each
exemplary graph in Fig. 11.7.
Figure 11.7: Search trees for the exemplary graphs: Free Tree, Directed Cyclic
Graph, and Undirected Cyclic Graph.
def dfgs(g, vi, visited, path):
    # 'orders' and 'paths' are globals that collect the visiting order
    # and the finished root-to-leaf paths
    visited.add(vi)
    orders.append(vi)
    bEnd = True  # node without unvisited adjacent nodes
    for nv in g[vi]:
        if nv not in visited:
            if bEnd:
                bEnd = False
            dfgs(g, nv, visited, path + [nv])
    if bEnd:
        paths.append(path)
Did you notice that the depth-first graph search on the undirected cyclic
graph shown in Fig. 11.6 has the same visiting order of nodes and the same
search tree as the free tree and the directed cyclic graph?
Efficient Path Backtrace In graph search, each node is added to the frontier
and expanded only once, so the search tree of a graph with |V| vertices has
only |V| − 1 edges. Tracing paths by saving each path as a list in the
frontier set is costly, since every partial path in the search tree is stored
repeatedly.

Now, we modify the dfs code as follows to find a given state (vertex) and
obtain the path from source to target:

def dfgs(g, vi, s, t, visited, parent):
    visited.add(vi)
    if vi == t:
        return backtrace(parent, s, t)

    for nv in g[vi]:
        if nv not in visited:
            parent[nv] = vi
            fpath = dfgs(g, nv, s, t, visited, parent)
            if fpath:
                return fpath

    return None
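The backtrace helper assumed above can be sketched as walking the parent dictionary backwards from the target to the source:

def backtrace(parent, s, t):
    path = [t]
    while path[-1] != s:
        path.append(parent[path[-1]])
    return path[::-1]  # reverse to get the source-to-target order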
The whole depth-first graph search tree constructed from the parent dict is
delineated in Fig. 11.8 for the given example.

Time and Space Complexity For depth-first graph search, we use aggregate
analysis. The search process covers all edges |E| and vertices |V|, which
makes the time complexity O(|V| + |E|). For space, it uses O(|V|) in the worst
case to store the stack of vertices on the current search path, along with the
set of already-visited vertices.
Applications
Depth-first tree search is adopted as the basic workhorse in many areas of AI,
such as solving CSPs, as it is a brute-force solution. In Chapter
Combinatorial Search, we will learn how the "backtracking" technique, along
with others, can be applied to speed things up. Depth-first graph search is
widely used to solve graph-related tasks in non-exponential time, such as
cycle checking (linear time) and shortest paths.

Questions to ponder:
• Only track the longest paths.
Now we call function bfs for ft, dcg, and ucg; the paths and orders for each
example are listed:

• For the free tree and the directed cyclic graph, they have the same output.
The orders are:

[0, 1, 2, 4, 3, 5]
[0, 1, 2, 2, 3, 1, 4, 4, 4, 3, 3, 5, 3, 5, 2, 5, 4, 1, 5]
Properties We can see that the visiting orders of nodes differ from their
depth-first tree search counterparts. However, the corresponding search tree
for each graph in Fig. 11.6 is the same as its depth-first counterpart
illustrated in Fig. 11.7. This highlights how different search strategies
differ in the visiting order of nodes, but not in the search tree, which
depicts the search space: all possible paths.
Now, use the undirected cyclic graph as an example to find the path from
source 0 to target 5:

bfgs(ucg, 0, 5)
[0, 2, 4, 5]

This found path is also the shortest path between the two vertices as measured
by length. The whole breadth-first graph search tree constructed from the
parent dict is delineated in Fig. 11.9 for the given example.

There are two important characteristics about tree search and graph search:
same search tree. However, whenever cycles exist, the depth-first graph search
tree might differ from the breadth-first graph search tree.
1. Back edges, which connect a node back to one of its ancestors in the
depth-first forest Gπ.

Figure 11.11: Classification of edges: black marks tree edges, red marks back
edges, yellow marks forward edges, and blue marks cross edges.

3. Cross edges point from a node to a previously visited node that is neither
an ancestor nor a descendant in the depth-first forest Gπ. They are marked as
blue edges in Fig. 11.11.

We can decide the type of each edge during the DFS execution using the
visiting states: for an edge (u, v), it depends on whether we have visited v
before in the DFS and, if so, the relationship between u and v.
Now, we call the above function with the directed graph in Fig. 11.11.

v = len(dcg)
colors = [STATE.white] * v
dfs.t = -1
dfs.discover, dfs.finish = [-1] * v, [-1] * v
dfs(dcg, 0, colors)

        nodes.add(i)
    else:
        print(i, ')', end=' ')
We can easily see that ordering the nodes according to the discovery and
finishing times yields a well-defined expression, in the sense that the
parentheses are properly nested.

Questions to ponder:
• Implement the iterative version of the recursive code.

Figure 11.12: The process of breadth-first graph search. The black arrows
denote the relation between u and its unvisited neighbors v, and the red arrow
marks the backtrack edge.
Shortest Path

Applications
The common problems that can be solved by BFS are those that need only one
solution, the best one, such as getting the shortest path. As we will learn
later, breadth-first search is commonly used as an archetype for solving graph
optimization problems, such as Prim's minimum-spanning-tree algorithm and
Dijkstra's single-source shortest-paths algorithm.
Introduction
Depth-first search starts at the root node and continues branching down a
particular path; it selects a child node that is at the deepest level of the
tree from the frontier to expand next, and defers the expansion of this node's
siblings. Only when the search hits a dead end (a node that has no child) does
the search "backtrack" to its parent node and continue down the other siblings
that were deferred. A tree can be traversed recursively: we print the value of
the current node, then apply recursive calls to the left and right children;
by treating each node as a subtree, a recursive call on a node naturally
handles the traversal of that subtree. The code is quite straightforward:

def recursive(node):
    if not node:
        return
    print(node.val, end=' ')
    recursive(node.left)
    recursive(node.right)
Now, we call this function with the tree shown in Fig. 11.13; the output,
which indicates the traversal order, is:

1 2 4 5 3 6
Figure 11.14: Left: PreOrder, Middle: InOrder, Right: PostOrder. The red
arrows mark the traversal ordering of nodes.

The visiting order between the current node, its left child, and its right
child determines the following different types of recursive tree traversals:
Return Values
Here, we want to do the task in a different way: rather than just printing the
visiting order, we write the ordering into a list and return that list. How
would we do it? The process is the same, except that we need to return
something (not None, which is the default in Python). An empty node returns an
empty list [], and a single node returns [1].

Let us use PreOrder traversal as an example. To make it easier to understand,
the same queen this time wants to do the same job in a different way: she
wants to gather all the data from these different states into her own hands.
She assumes the two generals A and B will each return the list of their
subtree, safe and sound. Her job is then to combine the list returned from the
left subtree, her own data, and the list returned from the right subtree. The
left general brings back A = [2, 4, 5] and the right general brings back
B = [3, 6]; the final result is [1] + A + B = [1, 2, 4, 5, 3, 6]. The Python
code is given:
def PreOrder(root):
    if root is None:
        return []
    left = PreOrder(root.left)
    right = PreOrder(root.right)
    ans = [root.val] + left + right
    return ans
Complexity Analysis
It is straightforward to see that the traversal visits each node twice, once
in the forward pass and once in the backward pass of the recursive call,
making the time complexity linear in the total number of nodes, O(n). The
other way is through the recurrence relation T(n) = 2T(n/2) + O(1), which
gives O(n) too.
• At first, we start from the root, and put it into the stack, which is 1
in our example.
• Our frontier set has only one node, thus we have to pop out node 1
and expand the frontiner set. When we are expanding node 1, we add
its children into the frontier set by pushing them into the stack. In
228 11. SEARCH STRATEGIES
the preorder traversal, the left child should be first expanded from the
frontier stack, indicating we should push the left child into the stack
afterward the right child is pushed into. Therefore, we add node 3 and
2 into the stack.
• When one branch down process terminates, we pop out a node from
stack, and we set cur=node.right, so that we expand the branching
process to its right sibling.
We illustrate this process in Fig. 11.17. The ordering of items pushed into
the stack is the preorder traversal ordering, which is [1, 2, 4, 5, 3, 6]. And
230 11. SEARCH STRATEGIES
the ordering of items being popped out of the stack is the inorder traversal
ordering, which is [4, 2, 5, 1, 3, 6].
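One common way to realize this process is sketched below: the push order gives the preorder and the pop order gives the inorder (using the BinaryNode tree built earlier):

def preorder_inorder_iter(root):
    pushed, popped = [], []  # push order = preorder, pop order = inorder
    stack, cur = [], root
    while stack or cur:
        # branch down the left spine, pushing nodes along the way
        while cur:
            pushed.append(cur.val)
            stack.append(cur)
            cur = cur.left
        # dead end: pop a node and branch into its right subtree
        node = stack.pop()
        popped.append(node.val)
        cur = node.right
    return pushed, popped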
Instead of traversing the tree recursively, deepening each time, the
alternative is to visit nodes level by level, as illustrated in Fig. ?? for
our exemplary binary tree. We first visit the root node 1, then its children 2
and 3. Next, we visit 2's and 3's children in order: nodes 4, 5, and 6. This
type of level-order tree traversal uses the breadth-first search strategy,
which differs from the depth-first search strategy covered so far. As seen in
the example, the root node is expanded first, then all successors of the root,
and so on, level by level. We can also find the rule: the nodes that come
first get expanded first. For example, 2 is visited before 3, so we expand 2's
children first, giving 4 and 5; next, we expand 3's children. This
first-come, first-expanded behavior tells us we can rely on a queue to
implement BFS.
Simple Implementation We start from the root, our first level, and put it in
a list named nodes_same_level. Then, in a while loop, each iteration visits
all children of the nodes in nodes_same_level from the last level. We put all
these children in a temporary list temp, and before the loop ends, we assign
temp to nodes_same_level, until at the deepest level no more children are
found, temp remains empty, and the while loop terminates.

def LevelOrder(root):
    if not root:
        return
    nodes_same_level = [root]
    while nodes_same_level:
        temp = []
        for n in nodes_same_level:
            print(n.val, end=' ')
            if n.left:
                temp.append(n.left)
            if n.right:
                temp.append(n.right)
        nodes_same_level = temp
The above outputs the following with our exemplary binary tree:

1 2 3 4 5 6
Triangle (L120)
Given a triangle, find the minimum path sum from top to bottom. Each step you
may move to adjacent numbers on the row below.

Example:
Given the following triangle:
[
     [2],
    [3,4],
   [6,5,7],
  [4,1,8,3]
]
The minimum path sum from top to bottom is 11 (i.e., 2 + 3 + 5 + 1 = 11).
Analysis: First, we can use a DFS traversal as required by the problem and
use a nonlocal variable to save the minimum value. The time complexity is
O(2^n), so when we try to submit this code, we get a TLE (Time Limit Exceeded)
error. The code is as follows:

import sys

def min_path_sum(t):
    '''
    Purely complete search
    '''
    min_sum = sys.maxsize
    def dfs(i, j, cur_sum):
        nonlocal min_sum
        # edge case: past the last row
        if i == len(t) or j == len(t[i]):
            # gather the sum
            min_sum = min(min_sum, cur_sum)
            return
        # only two edges/choices at this step
        dfs(i + 1, j, cur_sum + t[i][j])
        dfs(i + 1, j + 1, cur_sum + t[i][j])
    dfs(0, 0, 0)
    return min_sum
11.6 Exercises
11.6.1 Coding Practice
Property of Graph
Combinatorial Search
12.1 Introduction
Combinatorial search problems consist of n items and a requirement to find a
solution, i.e., a set of L < n items that satisfies specified conditions or
constraints. For example, in a sudoku problem, a 9 × 9 grid is partially
filled with numbers between 1 and 9; we fill the empty spots with numbers that
satisfy the following conditions:

This sudoku, together with one possible solution, is shown in Fig. 12.1. In
this case, we have 81 items, and we are required to fill 51 empty spots
subject to the above three constraints.
O(c·d^L)    (12.1)

where there are L variables, each with domain size d, and there are c
constraints to check.

a₁x₁ + ... + aₙxₙ ≤ c    (12.2)

1. It is space efficient, thanks to the use of DFS: candidates are built
incrementally, and their validity as part of a solution is checked right away.
2. It is time efficient, in that some partial candidates can be pruned when
the algorithm determines that they cannot lead to a final complete solution.
Because the ordering of the variables s₀, ..., s_{L−1} can potentially affect
the size of the search space, backtracking search relies on one or more
heuristics to select which variable to consider next. Look-ahead is one such
heuristic, preferably applied to check the effects of choosing a given
variable, to evaluate it, or to decide the order of values to assign to it.

There are also breadth-first-search-based strategies that might work better
than backtracking; for combinatorial optimization problems, best-first
branch-and-bound search might be more efficient than its depth-first
counterpart.

1. Branch and prune: this method prunes unqualified branches using the
constraints of the problem. It is usually applied to solve constraint
satisfaction problems (CSPs).
12.2 Backtracking
In this section, we first introduce the technique of backtracking, and then
demonstrate it by implementing the common enumerative combinatorics seen in
Chapter Discrete Programming.
12.2.1 Introduction
Backtracking search is an exhaustive search algorithm (depth-first search)
that systematically assigns all possible combinations of values to the
variables and checks whether these assignments constitute a solution.
Backtracking is all about choices and consequences, and it shows the following
two properties:

visiting each vertex no more than once. This property makes backtracking the
most promising way to solve CSPs and combinatorial optimization problems.

The process should be much clearer once we have learned the examples in the
following subsections.
12.2.2 Permutations
Given a list of items, generate all possible permutations of these items. If
the set has duplicated items, only enumerate all unique permutations.

No Duplicates (L46. Permutations)
When there are no duplicates, from Chapter Discrete Programming, we know the
number of all permutations is:

p(n, m) = n!/(n − m)!    (12.3)

where m is the number of items we choose from the total n items to make the
permutations.

For example:
a = [1, 2, 3]
There are 6 total permutations:
[1, 2, 3], [1, 3, 2],
[2, 1, 3], [2, 3, 1],
[3, 1, 2], [3, 2, 1]
However, we have only managed to enumerate the search space, not yet
systematically or recursively via the depth-first search process. With DFS, we
depict the traversal order of the vertices in the virtual search space with
red arrows in Fig. 12.2. The backward arrows mark the "backtracking" process,
where we have to reset the state to the upper level.

        curr.pop()
        used[i] = False
    return
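Only the tail of the listing survives above; a full sketch of the backtracking permutation it plausibly belongs to (the names follow the fragment):

def permute(a, used, curr, ans):
    if len(curr) == len(a):
        ans.append(curr[:])   # a complete permutation: one leaf
        return
    for i in range(len(a)):
        if used[i]:           # skip items already placed
            continue
        used[i] = True
        curr.append(a[i])
        permute(a, used, curr, ans)
        curr.pop()            # backtrack: reset the state
        used[i] = False
    return

ans = []
permute([1, 2, 3], [False] * 3, [], ans)  # ans holds all 6 permutations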
pd(n, n) = p(n, n)/(x₀!·x₁!·...·x_{d−1}!),    (12.4)

w.r.t. Σ_{i=0}^{d−1} xᵢ ≤ n    (12.5)
Discussion
From the example of permutation, we have demonstrated how backtracking
constructs candidates within an implicit search tree structure: the root node
is the initial state, internal nodes represent intermediate states, and the
leaves are our candidates, of which there are n! for the p(n, n) permutation.
In this subsection, we point out the unique properties and the computational
and space complexities.
Two Passes Backtracking builds an implicit search tree on the fly and does
not memorize any intermediate state. It visits the vertices of the search tree
in two passes:

1. Forward pass: it builds the solution incrementally and reaches the leaf
nodes in a DFS fashion. One example of a forward pass is
[] -> [1] -> [1, 2] -> [1, 2, 3].
2. Backward pass: as the returning process of the DFS recursion, it backtracks
to the previous state. One example of a backward pass is
[1, 2, 3] -> [1, 2] -> [1].

First, the forward pass builds the solution incrementally. The change of curr
in the source code traces all vertices and the process of backtracking: it
starts with [] and ends with []. This is the core characteristic of
backtracking. We print the process for the example:

[]->[1]->[1, 2]->[1, 2, 3]-> backtrack: [1, 2]
backtrack: [1]
[1, 3]->[1, 3, 2]-> backtrack: [1, 3]
backtrack: [1]
backtrack: []
[2]->[2, 1]->[2, 1, 3]-> backtrack: [2, 1]
backtrack: [2]
[2, 3]->[2, 3, 1]-> backtrack: [2, 3]
backtrack: [2]
backtrack: []
[3]->[3, 1]->[3, 1, 2]-> backtrack: [3, 1]
backtrack: [3]
[3, 2]->[3, 2, 1]-> backtrack: [3, 2]
backtrack: [3]
backtrack: []
In a tree, the number of edges |E| is |V| − 1, making the time complexity
O(|V| + |E|) the same as O(|V|). Since p(n, n) alone is n!, enumerating all
permutations is an NP-hard problem.
12.2.3 Combinations
Given a list of n items, generate all possible combinations of these items. If
the input has duplicated items, only enumerate unique combinations.

C(n, m) = P(n, m)/P(m, m) = n!/((n − m)!·m!)    (12.6)

For example, when a = [1, 2, 3], there are in total 8 m-subsets:

C(3, 0): []
C(3, 1): [1], [2], [3]
C(3, 2): [1, 2], [1, 3], [2, 3]
C(3, 3): [1, 2, 3]
1. In the for loop that iterates over all possible candidates, we limit the
candidates to those with larger indexes only.
2. We do not have to use a data structure to track the state of each
candidate, because any candidate with a larger index is a valid candidate. We
use start to track the starting position of valid candidates.

The code for combination is:

def C_n_k(a, n, k, start, d, curr, ans):
    if d == k:  # end condition
        ans.append(curr[::])
        return

    for i in range(start, n):
        curr.append(a[i])
        C_n_k(a, n, k, i + 1, d + 1, curr, ans)
        curr.pop()
    return
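Calling it to enumerate C(3, 2) on the example list:

ans = []
C_n_k([1, 2, 3], 3, 2, 0, 0, [], ans)
print(ans)  # [[1, 2], [1, 3], [2, 3]]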
This process can be better visualized as a tree, as in Fig. ??. We can see this process results in 2^n leaves; compared with our previous implementation, which has a total of 2^n nodes, it is slightly less efficient. The code is:
def powerset(a, n, d, curr, ans):
    if d == n:
        ans.append(curr[::])
        return

    # Case 1: select the item
    curr.append(a[d])
    powerset(a, n, d + 1, curr, ans)
    # Case 2: do not select the item
    curr.pop()
    powerset(a, n, d + 1, curr, ans)
    return
Time Complexity The total number of nodes within the implicit search space of combination shown in Fig. 12.5 is $\sum_{k=0}^{n} C_n^k = 2^n$. When the input contains duplicates, with m distinct items of multiplicities $x_i$, the total count of unique combinations is
$$\sum_{k=0}^{n} c(n, k) = \prod_{i=0}^{m-1} (x_i + 1) \qquad (12.7)$$
However, counting c(n, k) with duplicates in the input relies on the specific input and the specific distribution of these items. We are still able to count by enumerating with backtracking search.
Analysis The backtracking search here is the same as applying DFS on an explicit graph, with one extra point: a state path which might hold up to n items (the total number of vertices of the graph). In the implementation, the path vector is modified dynamically to track all paths constructed as the DFS proceeds. The code is offered as:
def all_paths(g, s, path, ans):
    ans.append(path[::])
    for v in g[s]:
        path.append(v)
        all_paths(g, v, path, ans)
        path.pop()
You can run the above code in Google Colab to see how it works on our given example.
• The subsequences and subsets share the same items when the ordering of the subsequences is ignored.
Therefore, our code to handle duplicates should differ from that of a powerset. In the case of the powerset, the algorithm first sorts items so that all duplicates are adjacent to each other, making the check for repetition as simple as comparing an item with its predecessor. However, in a given sequence, the duplicated items are not adjacent most of the time, so we have to do things differently. We draw the search tree of enumerating all subsequences of string "1232" in Fig. 12.7. From the figure, we can observe that to avoid redundant branches, we simply check if a new item in the subsequence repeats by comparing it with all of its predecessors in range [s, i]. The code for checking repetition is:
def check_repetition(start, i, a):
    for j in range(start, i):
        if a[i] == a[j]:
            return True
    return False
Figure 12.7: The search tree of subsequences. The red-circled nodes are redundant nodes. Each node has a variable s to indicate the starting index of candidates to add to the current subsequence; i indicates the candidate to add to the current node.
    for i in range(start, n):
        if check_repetition(start, i, a):
            continue
        curr.append(a[i])
        subseqs(a, n, i + 1, d + 1, curr, ans)
        curr.pop()
    return
Sudoku (L37)
A Sudoku grid, shown in Fig. 12.8, is an n² × n² grid arranged into n² mini-grids of size n × n, each containing the values 1, ..., n², such that no value is repeated in any row, column, or mini-grid.
Search Space First, we analyze the number of distinct states in the search space, which relies on how we construct the intermediate states and on our knowledge of enumerative combinatorics. We discuss two different formulations on a 9 × 9 grid:
1. For each empty cell in the puzzle, we create a set by taking values 1, ..., 9 and removing from it those values that appear as givens in the same row, column, or mini-grid.
The two formulations each take a different approach to the state space, making their corresponding backtracking searches differ too. We mainly focus on the first formulation with backtracking search.
• The spot to select next, according to the ordering rule we choose.
To get the domain set according to the constraints, a simple set operation is executed: A − (row_state[i] | col_state[j] | block_state[i//3][j//3]). In the solver, each time we pick a spot, we first update all remaining spots in unfilled and then choose the one with the minimal domain. This process takes O(n), which is trivial compared with the cost of the search: 9 operations to compute the domain set of a single spot, 9n for n spots, plus another n added to 9n to choose the one with the smallest size. The solver is implemented as:
def _ret_len(self, args):
    i, j = args
    option = self.A - (self.row_state[i] | self.col_state[j]
                       | self.block_state[i // 3][j // 3])
    return len(option)

def solve(self):
    if len(self.unfilled) == 0:
        return True
    # Dynamic variable ordering
    i, j = min(self.unfilled, key=self._ret_len)
    # Forward looking
    option = self.A - (self.row_state[i] | self.col_state[j]
                       | self.block_state[i // 3][j // 3])
    if len(option) == 0:
        return False
    self.unfilled.remove((i, j))
    for c in option:
        self.set_state(i, j, c)
        if self.solve():
            return True
        else:
            # Backtracking: undo the assignment
            self.reset_state(i, j, c)
    # Backtracking: restore the spot
    self.unfilled.append((i, j))
    return False
1. Random-restart hill-climbing.
2. Simulated annealing.
3. Genetic Algorithms.
4. Tabu search.
1. Choose the decision variables that typically encode the result we are interested in. For instance, in a subset problem, each item is a variable, and each variable includes two decisions, take or not take, making its value set {0, 1}.
Branch and Bound Branch and bound (BB, B&B, or BnB) is an algorithm design paradigm for discrete and combinatorial optimization problems, as well as mathematical optimization. A branch-and-bound algorithm consists of a systematic enumeration of candidate solutions by means of state space search: the set of candidate solutions is thought of as forming a rooted tree with the full set at the root. The algorithm explores branches of this tree, which represent subsets of the solution set. Before enumerating the candidate solutions of a branch, the branch is checked against upper and lower estimated bounds on the optimal solution, and is discarded if it cannot produce a better solution than the best one found so far by the algorithm. "Branching" splits the problem into a number of subproblems, and "bounding" finds an optimistic estimation of the best solution to the subproblems: either maximizing the upper bound or minimizing the lower bound. To get the optimistic estimation, we have to relax constraints. In this section, we exemplify both a minimization problem (TSP) and a maximization problem (knapsack).
• Best-First: it selects the node with the best estimation among the frontier set to expand each time. In the worst scenario, the whole search tree has to be saved, as when the estimation is extremely optimistic not a single branch is pruned in the process.
Search Space In this problem, x_i denotes each item, and w_i, v_i its corresponding weight and value, with i ∈ [0, n − 1]. Each item can either be selected or left behind, indicating x_i ∈ {0, 1}. The total weight of the selected items cannot exceed the capacity c:
$$\max_{x} \sum_{i=0}^{n-1} v_i x_i \qquad (12.10)$$
$$\text{s.t.} \quad \sum_{i=0}^{n-1} w_i x_i \le c \qquad (12.11)$$
$$x_i \in \{0, 1\} \qquad (12.12)$$
With each variable having two choices, our search space is as large as 2^n.
def __init__(self, c, v, w):
    self.best = 0
    self.c = c
    self.n = len(v)
    self.items = [(vi / wi, wi, vi) for vi, wi in zip(v, w)]
    self.items.sort(key=lambda x: x[0], reverse=True)

def estimate(self, idx, curval, left_cap):
    est = curval
    # use the v/w ratio to estimate
    for i in range(idx, self.n):
        ratio, wi, _ = self.items[i]
        if left_cap - wi >= 0:  # use all of the item
            est += ratio * wi
            left_cap -= wi
        else:  # use part of the item
            est += ratio * left_cap
            left_cap = 0
    return est
def dfs(self, idx, est, val, left_cap, status):
    if idx == self.n:
        self.best = max(self.best, val)
        return
    print(status, val, left_cap, est)

    _, wi, vi = self.items[idx]
    # Case 1: choose the item
    if left_cap - wi >= 0:  # prune by constraint
        # Bound by estimate, increase value and volume
        if est > self.best:
            status.append(True)
            nest = self.estimate(idx + 1, val + vi, left_cap - wi)
            self.dfs(idx + 1, nest, val + vi, left_cap - wi, status)
            status.pop()

    # Case 2: do not choose the item
    if est > self.best:
        status.append(False)
        nest = self.estimate(idx + 1, val, left_cap)
        self.dfs(idx + 1, nest, val, left_cap, status)
        status.pop()
    return
Within Best-First search, we use a priority queue keyed on the estimated value, and each time the node with the largest estimated value within the frontier set is expanded first. Similarly, with branch and bound, we prune any branch whose estimated value can never surpass the best solution found so far. The search space is the same as in Fig. 12.9, except that the search process differs from depth-first. In the implementation, the priority queue is a min-heap where the minimum value is popped out first; thus we store the negative estimated value so that the largest value conveniently pops out first, instead of writing code to implement a max-heap.
def bfs(self):
    # assumes heapq has been imported at the module level
    # track val, cap, and idx: which item to add next
    q = [(-self.estimate(0, 0, self.c), 0, self.c, 0)]  # estimate, val, left_cap, idx
    self.best = 0
    while q:
        est, val, left_cap, idx = heapq.heappop(q)
        est = -est
        if idx == self.n:
            self.best = max(self.best, val)
            continue
        _, wi, vi = self.items[idx]

        # Case 1: choose the item
        if left_cap - wi >= 0:  # prune by constraint
            nest = self.estimate(idx + 1, val + vi, left_cap - wi)
            if nest > self.best:
                heapq.heappush(q, (-nest, val + vi, left_cap - wi, idx + 1))

        # Case 2: do not choose the item
        nest = self.estimate(idx + 1, val, left_cap)
        if nest > self.best:
            heapq.heappush(q, (-nest, val, left_cap, idx + 1))
    return
Given a set of cities and the distances between every pair, find the shortest possible path that visits every city exactly once and returns to the origin city. For example, with the graph shown in Fig. 12.10, the shortest such path is [0, 1, 3, 2, 0] with a path weight of 80.
Speedups Since we only care about the minimum cost, any partial result whose cost is already larger than the minimum cost of all known complete solutions can be pruned. This is the branch and bound method that we have introduced, which is often used in combinatorial optimization.
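As an illustration only (not the book's listing), a minimal sketch of this cost-based pruning on TSP might look as follows; the function name tsp_bnb and the adjacency-matrix input dist are assumptions:

import math

def tsp_bnb(dist):
    # dist: n x n matrix of non-negative pairwise distances (assumption)
    n = len(dist)
    best = [math.inf]

    def dfs(v, visited, cost, path):
        # Bound: a partial tour already costing no less than the best
        # complete tour can never improve it, so prune this branch.
        if cost >= best[0]:
            return
        if len(path) == n:
            best[0] = min(best[0], cost + dist[v][0])  # close the tour
            return
        for u in range(n):
            if u not in visited:
                visited.add(u)
                dfs(u, visited, cost + dist[v][u], path + [u])
                visited.remove(u)

    dfs(0, {0}, 0, [0])
    return best[0]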
Other Solutions
Whenever we are faced with optimization, we can also consider the other two algorithm design paradigms: Dynamic Programming and Greedy Algorithms. In fact, the above two problems both have corresponding dynamic programming solutions: for the knapsack problem, a pseudo-polynomial solution is possible; for TSP, though it is still of exponential time complexity, it is much better than O(n!). We will further discuss these two problems in Chapter Dynamic Programming.
12.5 Exercises
1. 77. Combinations
6. N-queen
7. Map-coloring
13.1 Introduction
Story Imagine that your mom asks you to get 10,000 pounds of corn. What would you do? First, you would think: where should I get the corn? I can go to Walmart, or I can grow the corn on a farm. This is when one problem/task is reduced to some other problems/tasks; solving the other ones means you solved your assignment from your mom. This is one example of reduction: converting problem A to problem B.
Now, you are at Walmart and ready to load the 10,000 pounds of bagged corn, but the trunk of your car cannot fit all of it at once. You decide to do 10 rounds of loading and transporting home. Now, your task becomes loading 1,000 pounds of corn. After you are done with this, you have solved a subtask: getting 1,000 pounds of corn. In the second round, you load another 1,000 pounds; you have solved another subtask: getting 2,000 pounds of corn in total. After 10 rounds, you will have solved the original task. This is the other side of reduction: reducing one problem to one or multiple smaller instances of itself.
The term 'self-reduction' is not commonly used, nor even put under the umbrella of 'reduction'. In other materials, you might see the content of self-reduction appear in the form of mathematical induction. Self-reduction and mathematical induction are inseparable: self-reduction can be represented with a recurrence relation, mathematical induction is the most straightforward and powerful tool to prove its correctness, and their concentration aligns: "concentrating on reducing a problem and solving subproblems rather than solving it directly".
Mathematical induction can guide us to reduce the problem: we assume we know the solutions a_{n/b} or a_{n-k} for problems of size n/b or n − k, and we focus on how to construct the solution a_n from the solutions to our subproblems, such as a_{n/b} and a_{n-k}.
We will further see the distinction of these two characteristics of problems
in our following examples.
2. Conquer: this step means that in the bottom-up pass, when we have the solutions of the a subproblems, each of size n/b, available, we need to combine these solutions for our current problem of size n.
• Various sorting algorithms like Merge Sort, Quick Sort (Chapter 15);
• Heap(Section ??);
Solution: divide and conquer. T(n) = max(T(left), T(right), T(cross)); the max is for merging, and T(cross) is for the case where the potential subarray crosses the middle point. For the complexity, T(n) = 2T(n/2) + n; the master method gives us O(n lg n). We write the following Python code:
def maxSubArray(self, nums):
    """
    :type nums: List[int]
    :rtype: int
    """
    def getCrossMax(low, mid, high):
        left_sum, right_sum = 0, 0
        left_max, right_max = float('-inf'), float('-inf')
        left_i, right_j = -1, -1
        for i in range(mid, low - 1, -1):  # scan [low, mid] backward
            left_sum += nums[i]
            if left_sum > left_max:
                left_max = left_sum
                left_i = i
        for j in range(mid + 1, high + 1):  # scan [mid+1, high]
            right_sum += nums[j]
            if right_sum > right_max:
                right_max = right_sum
                right_j = j
        return (left_i, right_j, left_max + right_max)

    def maxSubarray(low, high):
        if low == high:
            return (low, high, nums[low])
        mid = (low + high) // 2
        rslt = []
        rslt.append(maxSubarray(low, mid))        # [low, mid]
        rslt.append(maxSubarray(mid + 1, high))   # [mid+1, high]
        rslt.append(getCrossMax(low, mid, high))  # crossing subarray
        return max(rslt, key=lambda x: x[2])

    return maxSubarray(0, len(nums) - 1)[2]
Overlapping Subproblems
When the number of subproblems appearing in this relation is larger than or equal to 2, the subproblems might overlap. This implies that a straightforward recursion-based solution without optimization will be expensive, because these overlapped problems are solved again and again; optimization is possible with dynamic programming or greedy algorithms shown in Part ??, which use a caching mechanism, saving the solution of each subproblem and thus avoiding recomputation. However, to stick to just the reduction itself, we delay our examples' possible optimization to Part ??.
Subproblem Space
Counting all possible subproblems, the subproblem space, is important for us to understand the complexity. For an array, a subproblem can be a subarray [a_i, ..., a_j], i < j, i ∈ [0, n − 1], j ∈ [0, n − 1], which makes the number of potential subproblems n². Sometimes it is enough to fix a_i = a_0, so that the subarray always starts from the beginning. The reduction by constant size is less likely to be seen in tree structures, which are more organized around dividing.
The above is the classical Fibonacci sequence: to get the Fibonacci number at position n, we first need to know the answers to subproblems f(n−1) and f(n−2). We can solve it easily using a recursive function:
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)
The above recursive function has the recursion tree shown in Fig. ??. We also draw the recursion tree of the recursive calls for merge sort in Fig. 13.2. We notice that we call f(2) multiple times for Fibonacci, but in merge sort each call is unique and won't be called more than once. The recurrence function of merge sort is T(n) = 2T(n/2) + n, and for the Fibonacci sequence it is T(n) = T(n − 1) + T(n − 2) + 1.
Practical Guideline Applying the master theorem to either divide and conquer or reduction by constant size tells us the asymptotic time complexity: divide and conquer is either polynomial or logarithmic, while reducing by constant size can go up to exponential when f(n) ≥ n. This reminds us that when our problem's state space is beyond exponential, we might better try reducing by constant size; and if the problem's state space is within polynomial, a divide and conquer should work better to further boost the efficiency.
13.5 A to B
13.5.1 Concepts
Definition Reduction or transformation between two problems A and B is to say that a solution of one problem is also a solution of the other. Its common applications are:
1. Design algorithms: given an algorithm for B, we can solve our problem A.
13.7 Exercises
1. Binary Search.
3. Skyline problem.
14 Decrease and Conquer
Want to do even better than linear complexity? Decrease and conquer reduces one problem into only one smaller subproblem, and the most common case is to reduce the state space to half of its original size. If the combining step takes only constant time, we get an elegant recurrence relation: T(n) = T(n/2) + O(1).
14.1 Introduction
All the searching we have discussed before never assumed any ordering between the items, and searching for an item in an unordered space is doomed to have a time complexity linear in the space size. This is about to change in this chapter.
Think about these two questions: What if we have a sorted list instead of an arbitrary one? What if the parent and children nodes within a tree are ordered in some way? With such special ordering between items in a data structure, can we increase the searching efficiency and do better than the blind one-by-one scan of the state space? The answer is YES.
Let's take advantage of the ordering and the decrease and conquer methodology. To find a target in a space of size n, we first divide it into two subspaces, each of size n/2, say from the middle of the array. If the array is increasingly ordered, all items in the left subspace are smaller than all items in the right subspace. If we compare our target with the item in the middle, we will know whether the target is on the left or right side. With just one step, we reduce our state space by half. We repeat this process on the reduced space until we find the target. This process is called Binary Search. Binary search has the recurrence relation T(n) = T(n/2) + O(1).
Find the Exact Target This is the most basic application of binary search. We can set two pointers, l and r, which point to the first and last position, respectively. Each time we compute the middle position m = (l+r)//2, and check if the item num[m] is equal to the target t.
• If the target is smaller than num[m], move to the left half by setting the right pointer to the position right before the middle position, r = m − 1.
• If the target is larger than num[m], move to the right half by setting the left pointer to the position right after the middle position, l = m + 1.
Repeat the process until we find the target or we have searched the whole space. The criterion for having finished the whole space is when l becomes larger than r. Therefore, in the implementation we use a while loop with condition l ≤ r to make sure we scan the search space only once. The process of applying binary search on our exemplary array is depicted in Fig. 14.1, and the Python code is given as:
def standard_binary_search(lst, target):
    l, r = 0, len(lst) - 1
    while l <= r:
        mid = l + (r - l) // 2
        if lst[mid] == target:
            return mid
        elif lst[mid] < target:
            l = mid + 1
        else:
            r = mid - 1
    return -1  # target is not found
Applying the standard binary search returns 3 as the target position, which is the second 4 in the array. This does not seem like a problem at first. However, what if we want to know the predecessor or successor (3 or 5) of this target? In a distinct array, the predecessor and successor would be adjacent to the target. However, when the target has duplicates, the predecessor is before the first target and the successor is after the last target. Therefore, returning an arbitrary one is not helpful.
In another case, what if our target is 6, and we first want to see if it exists in the array? If it does not, we would like to insert it into the array and still keep the array sorted. The above implementation simply returns −1, which is not helpful at all.
The lower and upper bounds of a binary search are the lowest and highest positions where the value could be inserted without breaking the ordering.
• With index 2 as the lower bound l, items at i ∈ [0, l−1] satisfy a[i] < t, a[l] = t, and items at i ∈ [l, n) satisfy a[i] ≥ t. A lower bound is also the first position that has a value v ≥ t. This case is shown in Fig. 14.2.
• With the upper bound u, items at i ∈ [0, u−1] satisfy a[i] ≤ t, a[u−1] = t, and items at i ∈ [u, n) satisfy a[i] > t. An upper bound is also the first position that has a value v > t. This case is shown in Fig. 14.3.
Figure 14.4: Binary Search: Lower and Upper Bound of target 5 is the same.
right side of the state space, making l = mid + 1 = 6. Now, in the right state space, the middle pointer will always have values larger than 4; thus the search will only move to the left side of the space, which only changes the right pointer r and leaves the left pointer l untouched when the program ends. Therefore, l will still return our final upper bound index. The Python code is as follows:
def upper_bound_bs(nums, t):
    l, r = 0, len(nums) - 1
    while l <= r:
        mid = l + (r - l) // 2
        if t >= nums[mid]:  # move as right as possible
            l = mid + 1
        else:
            r = mid - 1
    return l
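The lower-bound listing falls on a page that did not survive extraction; a symmetric sketch mirroring upper_bound_bs above (the name lower_bound_bs is ours) would be:

def lower_bound_bs(nums, t):
    l, r = 0, len(nums) - 1
    while l <= r:
        mid = l + (r - l) // 2
        if t <= nums[mid]:  # move as left as possible
            r = mid - 1
        else:
            l = mid + 1
    return l

Run on [1, 2, 4, 4, 4, 5] with t = 4, it returns index 2, matching the lower bound described earlier.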
Bonus For the lower bound, if we return the position l − 1, we get the last position whose value is < target. Similarly, for the upper bound, returning l − 1 gives the last position whose value is ≤ target.
14.2.2 Applications
Binary search is a powerful problem-solving tool. Let's go beyond the sorted array: how about when the array is sorted in a way that is not as monotonic as what we are familiar with, or how about solving math functions with binary search, whether they are continuous or discrete, equations or inequalities?
call isBadVersion(3) -> false
call isBadVersion(5) -> true
call isBadVersion(4) -> true
Then 4 is the first bad version.
Analysis and Design In this case, we have a search space in range [1, n]. Think of the value at each position as the result of the function isBadVersion(i). Assume the first bad version is at position b; then the values over the positions follow the pattern [F, ..., F, T, ..., T]. We can apply binary search on the search space [1, n]: finding the first bad version is the same as finding the first position where we could insert the value True, the lower bound of value True. Therefore, whenever the value we find is True, we move to the left space to try to find its first location. The Python code is given below:
def firstBadVersion(n):
    l, r = 1, n
    while l <= r:
        mid = l + (r - l) // 2
        if isBadVersion(mid):
            r = mid - 1
        else:
            l = mid + 1
    return l
target = 8
Output: -1
Analysis and Design In the rotated sorted array, the array is not purely monotonic. Instead, there will be at most one drop in the array because of the rotation; we denote the high and the low item around the drop as a_h and a_l, respectively. This drop cuts the array into two parts, a[0 : h + 1] and a[l : n], and both parts are sorted in ascending order. If the middle point falls within the left part, the left side of the state space will be sorted, and if it falls within the right part, the right side of the state space will be sorted. Therefore, in any situation, there will always be one side of the state space that is sorted. To check which side is sorted, simply compare the value at the middle pointer with that at the left pointer.
• Otherwise, when they are equal, which is only possible when there is no left part remaining, we have to move to the right part. For example, when nums = [1, 3], we move to the right part.
With a sorted half of the state space, we can check if our target is within the sorted half: if it is, we shrink the state space to the sorted half; otherwise, we have to move to the other, unknown half. The Python code is shown as:
def RotatedBinarySearch(nums, t):
    l, r = 0, len(nums) - 1
    while l <= r:
        mid = l + (r - l) // 2
        if nums[mid] == t:
            return mid
        # Left side is sorted
        if nums[l] < nums[mid]:
            if nums[l] <= t < nums[mid]:
                r = mid - 1
            else:
                l = mid + 1
        # Right side is sorted
        elif nums[l] > nums[mid]:
            if nums[mid] < t <= nums[r]:
                l = mid + 1
            else:
                r = mid - 1
        # Left and middle index are the same, move to the right
        else:
            l = mid + 1
    return -1
Arranging Coins (L441, easy) You have a total of n coins that you want to form into a staircase shape, where every k-th row must have exactly k coins. Given n, find the total number of full staircase rows that can be formed. n is a non-negative integer and fits within the range of a 32-bit signed integer.
Example 1:
n = 5
The coins can form the following rows:
*
* *
* *
Because the 3rd row is incomplete, we return 2.
Analysis and Design Each row x has x coins; summing up, we get 1 + 2 + ... + x = x(x+1)/2. The problem is equivalent to finding the last integer x that makes x(x+1)/2 ≤ n. Of course, this is just a quadratic equation, which can be easily solved if you remember the formula, as in the following Python code:
import math

def arrangeCoins(n: int) -> int:
    return int((math.sqrt(1 + 8 * n) - 1) // 2)
            else:
                r = mid - 1
        return l
    return bisect_right() - 1
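Only the tail of the binary-search variant survives above. A complete sketch in the same spirit, finding the upper bound of valid x and returning one less (the name arrangeCoinsBS is ours), might be:

def arrangeCoinsBS(n: int) -> int:
    # find the last x such that x * (x + 1) / 2 <= n
    l, r = 0, n
    while l <= r:
        mid = l + (r - l) // 2
        if mid * (mid + 1) // 2 <= n:  # move as right as possible
            l = mid + 1
        else:
            r = mid - 1
    return l - 1

For n = 5, it returns 2, matching the example.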
With l and r representing the left and right child of node x, there are two definitions other than the binary search tree definition we just introduced: (1) l.key ≤ x.key < r.key and (2) l.key < x.key ≤ r.key. In these two cases, the resulting BSTs allow us to have duplicates. The exemplary implementation follows the definition that does not allow duplicates.
14.3.1 Operations
In order to build a BST, we need to insert a series of items into the tree organized by the search tree property. And in order to insert, we need to search for a proper position first and then insert the new item while sustaining the search tree property. Thus, we introduce these operations in the order of search, insert, and generate.
Search The search is highly similar to binary search in an array. It starts from the root. Unless the node's value equals the target, the search proceeds to either the left or right child depending upon the comparison result. The search process terminates either when the target is found or when an empty node is reached. It can be implemented either recursively or iteratively with a time complexity of O(h), where h is the height of the tree, which is roughly log n if the tree is balanced enough. The recursive search is shown as:
def search(root, t):
    if not root:
        return None
    if root.val == t:
        return root
    elif t < root.val:
        return search(root.left, t)
    else:
        return search(root.right, t)
Figure 14.6: The red colored path from the root down to the position where
the key 9 is inserted. The dashed line indicates the link in the tree that is
added to insert the item.
Insert Assume we are inserting a node with key 9 into the tree shown in Fig. 27.1. We start from the root, compare 9 with 8, and go to node 10. Next, the search process leads us to the left child of node 10, and this is where we should put node 9. The process is shown in Fig. 14.6.
The process itself is easy and clean. Now comes the implementation. We treat each node as a subtree: whenever the search goes into a node, the algorithm hands over the insertion task entirely to that node, and assumes it has inserted the new node and returned its updated subtree. The main program simply resets its left or right child with the return value from its children. The insertion of the new node happens when the search hits an empty node, which returns a new node with the target value. The implementation is given as:
def insert(root, t):
    if not root:
        return BiNode(t)
    if root.val == t:
        return root
    elif t < root.val:
        root.left = insert(root.left, t)
        return root
    else:
        root.right = insert(root.right, t)
        return root
1. When the parent node is None, which means the tree is empty: we assign the root with a new node of the target value.
2. When the target's value is larger than the parent node's: we put a new node as the right child of the parent node.
3. When the target's value is smaller than the parent node's: we put a new node as the left child of the parent node.
A minimal iterative sketch of these three cases is shown below.
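This is a sketch only, assuming the same BiNode class as above; the name insert_iter is ours, not the book's:

def insert_iter(root, t):
    node = BiNode(t)
    if not root:                  # Case 1: empty tree
        return node
    p = root
    while True:
        if t == p.val:            # no duplicates allowed
            return root
        elif t > p.val:
            if not p.right:       # Case 2: insert as right child
                p.right = node
                return root
            p = p.right
        else:
            if not p.left:        # Case 3: insert as left child
                p.left = node
                return root
            p = p.left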
Find the Minimum and Maximum Key Because the minimum key is at the leftmost node within the tree, the search process always traverses into the left subtree and returns the last non-empty node, which is our minimum node. The time complexity is the same as searching for any key, which is O(log n) for a balanced tree.
def minimum(root):
    if not root:
        return None
    if not root.left:
        return root
    return minimum(root.left)
To find the maximum node, replacing left with right will do. Also, sometimes we need to find two additional items related to a given node: its successor and predecessor. The structure of a binary search tree allows us to determine the successor or the predecessor of a node without ever comparing keys.
Let us try something else. In the BST shown in Fig. 14.6, node 3's successor is node 4. For node 4, its successor is node 6. For node 7, its successor is node 8. What are the cases here?
• An easy case is when a node has a right subtree: its successor is the minimum node within its right subtree.
• However, if a node does not have a right subtree, there are two more cases:
The above two rules can be merged: starting from the target node, traverse backward through its parents and find the first two nodes that are in a left child–parent relation. The parent node in that relation is our target successor. Because the left subtree is always smaller than a node, as we go backward, if a node is smaller than its parent, it tells us that the current node is smaller than that parent node too.
We write three functions to implement the successor:
• Function findNodeAddParent finds the target node and adds a parent attribute p to each node along the search path, pointing to its parent. The code is:
def findNodeAddParent(root, t):
    if not root:
        return None
    if t == root.val:
        return root
    elif t < root.val:
        if root.left:  # guard against an absent target
            root.left.p = root
        return findNodeAddParent(root.left, t)
    else:
        if root.right:
            root.right.p = root
        return findNodeAddParent(root.right, t)
• Function reverse finds the first left child–parent relation when traversing backward from a node to its parents.
def reverse(node):
    if not node or not node.p:
        return None
    # node is a left child: its parent is the successor
    if node is node.p.left:
        return node.p
    return reverse(node.p)
The expected time complexity is O(log n). The worst case is when the tree lines up with no branching, which makes it O(n). Similarly, we can use inorder traversal:
def predecessorInorder(root, node):
    if not node:
        return None
    if node.left is not None:
        return maximum(node.left)
    # Inorder traversal
    pred = None
    while root:
        if node.val > root.val:
            pred = root
            root = root.right
        elif node.val < root.val:
            root = root.left
        else:
            break
    return pred
2. Node to be deleted has only one child: copy the child to the node and delete the child. For example, to delete node 14, we need to copy node 13 to node 14.
Next, we implement the above three cases in function _delete, which is given the node to delete and returns the processed subtree with its root node deleted.
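The body of _delete itself does not survive extraction here; a minimal sketch of the three standard cases, assuming the minimum helper above and using the successor for the two-children case, could be:

def _delete(root):
    # Case 1: no child, simply remove the node
    if not root.left and not root.right:
        return None
    # Case 2: only one child, replace the node with that child
    if not root.left:
        return root.right
    if not root.right:
        return root.left
    # Case 3: two children, copy the successor's key (the minimum of
    # the right subtree) into this node, then delete the successor
    succ = minimum(root.right)
    root.val = succ.val
    root.right = delete(root.right, succ.val)
    return root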
Finally, we call the above two functions to delete a node with a target key.
def delete(root, t):
    if not root:
        return
    if root.val == t:
        root = _delete(root)
        return root
    elif t > root.val:
        root.right = delete(root.right, t)
        return root
    else:
        root.left = delete(root.left, t)
        return root
If we use either of the other two definitions we introduced that allow duplicates, things can be more complicated. For example, if we use the definition x.left.key ≤ x.key < x.right.key, we will end up with a tree that looks like Fig. 14.7:
Note that the duplicates are not in contiguous levels. This is a big issue when allowing duplicates in a BST representation: because duplicates may be separated by any number of levels, detecting them is difficult.
An option to avoid this issue is to not represent duplicates structurally (as separate nodes) but instead use a counter that counts the number of occurrences of the key. The previous example would be represented as in Fig. 14.8:
This simplifies the related operations at the expense of some extra bytes and counter operations. Since a heap is a complete binary tree, it has the smallest possible height: a heap with N nodes always has O(log N) height.
However, because there are n² subarrays, the space cost is polynomial, which is definitely not good. Another problem: what if we need to change the value of an item? We would have to update n nodes in the dictionary, namely every node whose range includes the item.
We can balance the search, update, and space of the dictionary approach to logarithmic time with the technique of decrease and conquer. In binary search, we keep dividing our search space into halves recursively until the search space can no longer be divided. We can apply the same dividing process here and construct a binary tree where each node has l and r to indicate the range that node represents. For example, if our array has index range [0, 5], its left subtree will cover [0, mid] and its right subtree [mid+1, 5]. A binary tree built in this binary search manner is shown in Fig. 14.9.
To get the answer for the range query [0, 5], we just return the value at the root node. If the range is [0, 1], which is on the left side of the tree, we go to the left branch, cutting half of the search space. For a range that happens to span two nodes, such as [1, 3], which needs parts of nodes [0, 1] and [2, 5], we search [1, 1] in the left subtree and [2, 3] in the right subtree and combine the results. Any search stays within O(log n), relating to the height of the tree.
Segment tree The above binary tree is called a segment tree. From our analysis, we can see that a segment tree is a static full binary tree. 'Static' here means that once the data structure is built, its shape cannot be modified or extended. However, we can still propagate updates of values in the original array into the segment tree. Segment trees are applied widely to efficiently answer numerous dynamic range query problems (in logarithmic time), such as finding the minimum,
4. If the parent node is in range [i, j], then we separate this range at the middle position m = (i + j)//2; the left child takes range [i, m], and the right child takes range [m + 1, j].
Because in each step of building the segment tree the interval is divided into two halves, the height of the segment tree is log n. And there will be in total n leaves and n − 1 internal nodes, which makes the total number of nodes in the segment tree 2n − 1, indicating a linear space cost. Besides an explicit tree, an implicit tree implemented with an array can be used too, similar to the case of the heap data structure.
14.4.1 Implementation
The implementation of a functional segment tree consists of three core operations: tree construction, range query, and value update, named _buildSegmentTree(), RangeQuery(), and update(), respectively. We demonstrate the implementation with the Range Sum Query (RSQ) problem, but we try to generalize the process so that the template can easily be reused for other range query problems. In our implementation, we use an explicit tree data structure for both convenience and ease of understanding. We define a general tree node data structure as:
class TreeNode:
    def __init__(self, val, s, e):
        self.val = val
        self.s = s    # start of the range this node covers
        self.e = e    # end of the range this node covers
        self.left = None
        self.right = None
Given nums = [2, 9, 4, 5, 8, 7]
sumRange(0, 2) -> 15
update(1, 3)
sumRange(0, 2) -> 9
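The book's _buildSegmentTree listing does not survive extraction at this point; a minimal sketch for RSQ, assuming the TreeNode above, might be:

def _buildSegmentTree(nums, s, e):
    # a leaf node covers a single position
    if s == e:
        return TreeNode(nums[s], s, e)
    m = (s + e) // 2
    node = TreeNode(0, s, e)
    node.left = _buildSegmentTree(nums, s, m)
    node.right = _buildSegmentTree(nums, m + 1, e)
    # for RSQ, a parent's value is the sum of its children's values
    node.val = node.left.val + node.right.val
    return node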
Range Query Each query with range [i, j], i < j, i ≥ s, j ≤ e, will be answered at a node or by combining multiple nodes. In the query process, check the following cases:
• Whether range [i, j] matches the node's range [s, e]: if it matches, return the value of the node; otherwise, proceed to the other cases.
– For the first two cases, a recursive call on that branch returns our result.
– For the third case, where the range crosses the two subspaces, two recursive calls on both children of the current node are needed: the left one handles range [i, m], and the right one handles range [m + 1, j]. The final result is a combination of these two. These cases translate into the sketch below.
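A sketch of _rangeQuery for RSQ following the cases above (again an assumption, not the original listing):

def _rangeQuery(root, i, j):
    # exact match of the node's range
    if root.s == i and root.e == j:
        return root.val
    m = (root.s + root.e) // 2
    # range falls entirely in the left child
    if j <= m:
        return _rangeQuery(root.left, i, j)
    # range falls entirely in the right child
    if i > m:
        return _rangeQuery(root.right, i, j)
    # range crosses both children: combine the two halves
    return _rangeQuery(root.left, i, m) + _rangeQuery(root.right, m + 1, j)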
Update To update nums[1] = 3, all nodes on the path from the root to the leaf node are affected and need to be updated to incorporate the change at the leaf. We search through the tree with the range [1, 1], just like we did within _rangeQuery, except that we no longer need the case of crossing two ranges. Once we reach the leaf node, we update that node's value to the new value; then as the recursion backtracks to the parents, we recompute each parent node's value from the results of its children. This operation takes O(log n) time, and we can do it in place since the structure of the tree is not changed.
def _update(root, s, e, i, val):
    if s == e == i:
        root.val = val
        return
    m = (s + e) // 2
    if i <= m:
        _update(root.left, s, m, i, val)
    else:
        _update(root.right, m + 1, e, i, val)
    root.val = root.left.val + root.right.val
    return
14.5 Exercises
1. 144. Binary Tree Preorder Traversal
14.5.1 Exercises
14.1 35. Search Insert Position (easy). Given a sorted array and a
target value, return the index if the target is found. If not, return the
index where it would be if it were inserted in order.
You can assume that there are no duplicates in the array.
Example 1:
Input: [1, 3, 5, 6], 5
Output: 2
Example 2:
Input: [1, 3, 5, 6], 2
Output: 1
Example 3:
Input: [1, 3, 5, 6], 7
Output: 4
Example 4:
Input: [1, 3, 5, 6], 0
Output: 0
# exclusive version
def searchInsert(self, nums, target):
    l, r = 0, len(nums)  # start from 0, end at len (exclusive)
    while l < r:
        mid = (l + r) // 2
        if nums[mid] < target:  # move to the right side
            l = mid + 1
        elif nums[mid] > target:  # move to the left side, not mid-1
            r = mid
        else:  # found the target
            return mid
    # where the position should go
    return l
# inclusive version
def searchInsert(self, nums, target):
    l = 0
    r = len(nums) - 1
    while l <= r:
        m = (l + r) // 2
        if target > nums[m]:  # search the right half
            l = m + 1
        elif target < nums[m]:  # search the left half
            r = m - 1
        else:
            return m
    return l
[
  [1,  3,  5,  7],
  [10, 11, 16, 20],
  [23, 30, 34, 50]
]
Given target = 3, return true.
2. 153. Find Minimum in Rotated Sorted Array (medium). The key here is to compare the mid with the left side: if mid − 1 has a larger value, then mid is the minimum.
Sorting is the most basic building block for many other algorithms and is often considered the very first step that eases and reduces the original problem to an easier one.
15.1 Introduction
Sorting In computer science, a sorting algorithm is designed to rearrange items of a given array in a certain order based on each item's key. The most frequently used orders are numerical order and lexicographical order. For example, given an array of size n, sort items in increasing order of their numerical values:
Array  = [9, 10, 2, 8, 9, 3, 7]
Sorted = [2, 3, 7, 8, 9, 9, 10]
Sorting and selection often go hand in hand: either we first sort and then select the desired order through indexing, or we derive a selection algorithm from a corresponding sorting algorithm. Due to this relation, this chapter is mainly about introducing sorting algorithms, and occasionally we introduce their corresponding selection algorithms alongside.
Lexicographical Order For a list of strings, sorting them puts them in lexicographical order. The order is decided by a comparison function, which compares corresponding characters of the two strings from left to right. In the process, the first pair of characters that differ from each other determines the ordering: the string with the smaller character in that pair is smaller than the other string.
Characters are compared using the Unicode character set. All uppercase letters come before lowercase letters. If two letters are of the same case, then alphabetic order is used to compare them. For example:
'ab' < 'bc'   (differs at i = 0)
'abc' < 'abd' (differs at i = 2)
A special case appears when two strings are of different lengths and the shorter one s is a prefix of the longer one t; then it is considered that s < t. For example:
'ab' < 'abab' ('ab' is a prefix of 'abab')
What's more, Python can compare other types of sequences, such as lists and tuples, using lexicographical order too:
c1 = [1, 2, 3] < [2, 3]    # True
c2 = (1, 2) > (1, 2, 3)    # False
c3 = [1, 2] == [1, 2]      # True
max([4, 8, 9, 20, 3])  # 20
With a dictionary:
dict1 = {'a': 5, 'b': 8, 'c': 3}
k1 = max(dict1)                          # max over keys: 'c'
k2 = max(dict1, key=dict1.get)           # key with the max value: 'b'
k3 = max(dict1, key=lambda x: dict1[x])  # same as k2: 'b'
implement __eq__, __ne__, and only one of the ordering operators, and use the functools.total_ordering decorator to fill in the rest. For example, we write a class Person:
from functools import total_ordering

@total_ordering
class Person(object):
    def __init__(self, firstname, lastname):
        self.first = firstname
        self.last = lastname

    def __eq__(self, other):
        return ((self.last, self.first) == (other.last, other.first))

    def __ne__(self, other):
        return not (self == other)

    def __lt__(self, other):
        return ((self.last, self.first) < (other.last, other.first))

    def __repr__(self):
        return "%s %s" % (self.first, self.last)
Then, we are able to use any of the comparison operators on our class:
p1 = Person('Li', 'Yin')
p2 = Person('Bella', 'Smith')
p1 > p2   # True: 'Yin' > 'Smith'
Figure 15.1: The whole process for insertion sort: Gray marks the item to
be processed, and yellow marks the position after which the gray item is to
be inserted into the sorted region.
The key step is to find the proper position in the sorted region [0, i − 1] at which to insert a[i]. There are two different directions of iteration over the sorted region: forward and backward. We use a pointer j in the sorted region.
• Forward: j iterates over range [0, i − 1]. We compare a[j] with a[i] and stop at the first place where a[j] > a[i] (to keep the sort stable). All items a[j : i − 1] are then shifted backward by one position, and a[i] is placed at index j. Here we need up to i comparisons and shifts.
In the forward direction, the shifting process still requires us to traverse the range in reverse; therefore the backward iteration makes better sense.
For example, given an array a = [9, 10, 2, 8, 9, 3]: at first, 9 by itself is a sorted region. We demonstrate the backward iteration process. In the first pass, 10 is compared with 9 and stays where it is. In the second pass, 2 is compared with 10 and 9, and is then put at the first position. The whole process of this example is demonstrated in Fig. 15.1.
• When j = −1, it means we need to insert the target at the first position, which is j + 1 = 0.
• When t >= a[j], we need to insert the target one position behind j, which is j + 1.
Putting these together gives the sketch below.
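A minimal sketch of insertion sort with backward iteration, our own but consistent with the description above:

def insertionSort(a):
    n = len(a)
    for i in range(1, n):
        t = a[i]      # item to insert into the sorted region a[0:i]
        j = i - 1
        # shift larger items one position backward
        while j >= 0 and a[j] > t:
            a[j + 1] = a[j]
            j -= 1
        # either j == -1 or t >= a[j]; insert at j + 1
        a[j + 1] = t
    return a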
Bubble sort compares each pair of adjacent items in an array and swaps them if they are out of order. Given an array of size n, a single pass makes n − 1 pair comparisons, and at the end of the pass, one item is put in place.
Passes For example, Fig. 15.2 shows the first pass of sorting array [9, 10, 2, 8, 9, 3]. When comparing a pair (a_i, a_{i+1}), if a_i > a_{i+1}, we swap these two items. We can clearly see that after one pass, the largest item 10 is in place. The next pass only compares pairs within the unrestricted window [0, 4]. This is what "bubble" means in the name: after a pass, the largest item in the unrestricted window bubbles up to the end of the window and becomes in place.
When the pair has equal values, we do not need to swap them. The advantages of doing so are (1) saving unnecessary swaps and (2) keeping the original order of items with the same keys. This makes bubble sort a stable sort.
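The implementation the next paragraph refers to falls on a page lost to extraction; a minimal sketch consistent with the description, before the early-exit optimization, might be:

def bubbleSort(a):
    n = len(a)
    for i in range(n - 1):
        # after pass i, the largest item in window [0, n-1-i] is in place
        for j in range(n - 1 - i):
            if a[j] > a[j + 1]:  # swap only when strictly out of order
                a[j], a[j + 1] = a[j + 1], a[j]
    return a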
The above implementation runs in O(n²) even if the array is already sorted. We can optimize the inner for loop by stopping the whole program if no swap is detected in a single pass. When the input is nearly sorted, this strategy gets us O(n) time complexity.
Selection Sort
In bubble sort, each pass puts the largest element of the valid window in place through a series of swapping operations. Selection sort makes a slight optimization: it searches for the largest item in the current unrestricted window and swaps it directly with the last item in the window. This avoids the constant swapping that occurs in bubble sort. The whole sorting process for the same array is shown in Fig. 15.3.
def selectionSort(a):
    n = len(a)
    for i in range(n - 1):
        ti = n - 1 - i   # target position: the end of the unrestricted window
        li = 0           # index of the largest item found so far
        for j in range(n - i):
            if a[j] >= a[li]:
                li = j
        # swap li and ti
        a[ti], a[li] = a[li], a[ti]
    return
We have learned a few comparison-based sorting algorithms, and they all have an upper bound of n² in time complexity due to the number of comparisons that must be executed. Can we do better than O(n²), and how?
In the worst case, a comparison-based sort corresponds to a decision tree with l leaves and height h, which must satisfy:
$$n! \le l \le 2^h \qquad (15.1)$$
$$2^h \ge n! \qquad (15.2)$$
$$h \ge \log(n!) \qquad (15.3)$$
$$h = \Omega(n \log n) \qquad (15.4)$$
Divide In the divide stage, the original problem is a[s...e], where s and e are the start and end indices of the subarray, respectively. The divide process splits its parent problem into two halves at the middle index m = (s + e)//2: a[s...m] and a[m + 1...e]. This recursion keeps moving downward until the size of the subproblem becomes one, when s = e, which is the base case, since a list of size 1 is naturally sorted. The process of dividing is shown in Fig. 15.4.
Merge When we obtain two sorted sublists from the left and right side, the result of the current subproblem is the merge of the two sorted lists into one. The merge is done with the two-pointer method: we allocate a new list, put one pointer at the start of each sublist, and each time append the smaller of the two items indicated by the pointers into the new list. Once the smaller item is chosen, we move its corresponding pointer to the next item in that sublist. We continue this process until either pointer reaches the end. Then the remainder of the sublist whose pointer has not yet reached the end is copied to the end of the newly generated list. The subprocess is shown in Fig. 15.4 and its implementation is as follows:
def merge(l, r):
    ans = []
    # Two pointers, one into l and one into r
    i = j = 0
    n, m = len(l), len(r)

    while i < n and j < m:
        if l[i] <= r[j]:
            ans.append(l[i])
            i += 1
        else:
            ans.append(r[j])
            j += 1

    ans += l[i:]
    ans += r[j:]
    return ans

Figure 15.4: Merge Sort: The dividing process is marked with dark arrows and the merging process with gray arrows, with the merged list marked in gray too.
In the code, we use l[i] <= r[j] instead of l[i] < r[j] because when the left and right sublists contain items with equal keys, we put the ones from the left first into the merged list, so that the sort is stable. However, we used O(n) temporary space to save the merged result, making merge sort an out-of-place sorting algorithm.
def mergeSort(a, s, e):
    # Base case: a single item is naturally sorted
    if s == e:
        return [a[s]]
    m = (s + e) // 2

    l = mergeSort(a, s, m)
    r = mergeSort(a, m + 1, e)
    return merge(l, r)
Thus, we get O(n log n) as the upper bound for merge sort, which is asymptotically optimal among comparison-based sorts.
15.4.2 HeapSort
To sort the given array in increasing order, we can use a min-heap. We first heapify the given array. Then, to get a sorted list, we simply pop out items until the heap is empty; the popped items come out in sorted order.
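A minimal sketch with Python's built-in heapq (not the book's listing); note that it consumes the input list:

import heapq

def heapSort(a):
    heapq.heapify(a)  # O(n) bottom-up heapify into a min-heap
    # popping repeatedly returns items in increasing order
    return [heapq.heappop(a) for _ in range(len(a))]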
Complexity Analysis The heapify takes O(n), and the popping process takes O(log n + log(n − 1) + ... + 0) = O(log(n!)), which has an upper bound of O(n log n).
Partition and Pivot In the partition, quick sort chooses a pivot item from the subarray, either randomly or intentionally. Given a subarray A[s, e], the pivot can be located at s or e, or at a random position in range [s, e]. The partition then rearranges A[s, e] into three parts according to the value of the pivot: A[s, p − 1], A[p], and A[p + 1, e], where p is the position at which the pivot is placed. The left and right parts of the pivot satisfy the following conditions:
• A[i] ≤ A[p], i ∈ [s, p − 1],
Conquer After the partition, one item, the pivot A[p], is placed in the right place. Next, we only need to handle two subproblems: sorting A[s, p − 1] and A[p + 1, e] by recursively calling the quicksort function. We can write down the main steps of quick sort as:
def quickSort(a, s, e):
    # Base case
    if s >= e:
        return
    p = partition(a, s, e)

    # Conquer
    quickSort(a, s, p - 1)
    quickSort(a, p + 1, e)
    return
In the next two subsections, we talk about partition algorithms. The requirement for this step is to do it in place, just through a series of swapping operations.
Lomuto's Partition
We use the example A = [3, 5, 2, 1, 6, 4] to demonstrate this partition method. Assume our given range for partition is [s, e], and p = A[e] is chosen as the pivot. We use the two-pointer technique with i, j to maintain three regions in subarray A[s, e]: (1) region [s, i] with items smaller than or equal to p; (2) region [i + 1, j − 1] with items larger than p; (3) unrestricted region [j, e − 1]. These three regions and the partition process on the example are shown in Fig. 15.5.
Figure 15.5: Lomuto's Partition. Yellow, white, and gray mark regions (1), (2), and (3), respectively.
– If the current item A[j] belongs to region (2), that is to say A[j] > p, we just increment pointer j;
– Otherwise, when A[j] <= p, this item should go to region (1). We accomplish this by swapping it with the first item of region (2), at i + 1. Region (1) then grows by one and region (2) shifts one position backward.
• After the for loop, we put our pivot at the first position of region (2) by swapping. Now the whole subarray is successfully partitioned into three regions as needed, and we return the index where the pivot sits, i + 1, as the partition index. These steps are sketched below.
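A sketch of Lomuto's partition matching the three regions described; the book's own listing does not survive extraction at this point:

def partition(a, s, e):
    p = a[e]      # pivot: the last item
    i = s - 1     # region (1) is a[s : i+1], items <= p
    for j in range(s, e):
        if a[j] <= p:  # item belongs to region (1)
            i += 1
            a[i], a[j] = a[j], a[i]
    # place the pivot at the first position of region (2)
    a[i + 1], a[e] = a[e], a[i + 1]
    return i + 1

On A = [3, 5, 2, 1, 6, 4], this returns 3, with A rearranged to [3, 2, 1, 4, 6, 5].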
Complexity Analysis The worst case of the partition appears when the input array is already sorted or reverse sorted. In this case, a problem of size n is partitioned into one subproblem of size n − 1 and one empty subproblem. The recurrence function is T(n) = T(n − 1) + O(n), which has a time complexity of O(n²). The best case appears when each subproblem is divided in half, as in merge sort, where the time complexity is O(n log n). Randomly picking the pivot from A[s, e] and swapping it with A[e] helps us achieve a stable performance with average O(n log n) time complexity.
Quick Select
Quick select is a variant of quick sort that is used to find the k-th smallest item in a list in linear time. While quicksort recurses on both sides of the partition index, quick select only recurses on the side that contains the k-th smallest item. Similar to binary search, the comparison of k with the partition index p results in three cases. Based on this structure, quick select has the recurrence T(n) = T(n/2) + O(n), which gives O(n) time on average:
def quickSelect(a, s, e, k, partition=partition):
    if s >= e:
        return a[s]

    p = partition(a, s, e)
    if p == k:
        return a[p]
    if k > p:
        return quickSelect(a, p + 1, e, k, partition)
    else:
        return quickSelect(a, s, p - 1, k, partition)
import numpy as np
np.random.seed(1)
a = np.random.uniform(0, 1, 10)
a = np.round(a, decimals=2)
Complexity Analysis
$$i = n \, \frac{a[i] - minV}{maxV - minV} \qquad (15.8)$$
Prefix sums are trivial to compute with the following simple recurrence relation in O(n):
$$y_i = y_{i-1} + x_i, \quad i \ge 1 \qquad (15.10)$$
Counting Sort
Given an input array [1, 4, 1, 2, 7, 5, 2], let's see how exactly counting sort works by explaining it in three steps. Because our input array comes with duplicates, we distinguish the duplicates by their relative order, shown in parentheses. Ideally, for this input, we want it to be sorted as:
Index:  0     1     2     3     4  5  6
Key:    1(1)  4     1(2)  2(1)  7  5  2(2)
Sorted: 1(1)  1(2)  2(1)  2(2)  4  5  7
Figure 15.7: Counting Sort: The process of counting occurrences and computing the prefix sum.
Denote the prefix sum array as ps. For key i, ps_{i−1} tells us the number of items with keys smaller than i. This information can be used to place key i directly into its correct position. For example, for key 2, summing over its previous keys' occurrences (ps_1) gives us 2, indicating that we can put key 2 at position 2. However, key 2 appears two times, and the last position of key 2 is indicated by ps_2 − 1, which is 3. Therefore, for any key i, its locations in the sorted array are in the range [ps_{i−1}, ps_i). We could simply scan the prefix sum array and use the prefix sums as locations for the keys indicated by the indices of the prefix sum array. However, this method is limited to integer inputs, and it is unable to keep the relative ordering of items with the same key.
3. Sort Keys with Prefix Sum Array: First, suppose we loop over the input keys from position 0 to n − 1. For key_i, we decrease its prefix sum by one, ps_{key_i} = ps_{key_i} − 1, to get the last position at which we can place this key in the sorted array. The whole process is shown in Fig. 15.8. We see that items with the same key end up sorted in reverse relative order. Looping over the keys of the input in reverse order corrects this, making counting sort a stable sorting algorithm.
The implementation of the three main steps is nearly the same as what we have discussed, other than recasting the key (shifting it by the minimum key). In the process, we use two auxiliary arrays: a count array for counting and accumulating the occurrences of keys with O(k) space, and an order array for storing the sorted result with O(n) space, giving our implementation a space complexity of O(n + k). The Python code is shown as:
def countSort(a):
    minK, maxK = min(a), max(a)
    k = maxK - minK + 1
    count = [0] * k
    n = len(a)
    order = [0] * n
    # Get occurrence
    for key in a:
        count[key - minK] += 1

    # Get prefix sum
    for i in range(1, k):
        count[i] += count[i - 1]

    # Put key in position
    for i in range(n - 1, -1, -1):
        key = a[i] - minK
        count[key] -= 1  # to get the index as position
        order[count[key]] = a[i]
    return order
We see that the integers are ordered by their number of digits, whereas in the sorted strings, the length of a string does not usually decide the ordering.
Within radix sorting, it is usually either bucket sort or counting sort that does the sorting, using one radix as the key at a time. Based upon the order in which digits are processed, we have two types of radix sort: Most Significant Digit (MSD) radix sort, which starts from the left-most radix and goes all the way to the right-most radix, and Least Significant Digit (LSD) radix sort, which goes the other way. We address the details of the two forms of radix sort, MSD and LSD, using our two examples.
• As shown in Fig. 15.9, in the first pass, the least significant digit (1s place) is used as the key to sort. After this pass, the numbers are ordered by their unit digit.
• In the second pass, the 10s place digit is used. After this pass, we see that the numbers with at most two digits, namely 24, 45, 75, 90 in our example, are in order.
• In the third and last pass, the 100s place digit is used. For numbers that lack a 100s place digit, 0 is used. Afterwards, all numbers are in order.
We have to notice that the sorting will not work unless the sorting subroutine we apply is stable. For example, in our last pass there are four zeros, meaning those numbers share the same key value. If their relative ordering is not kept, the previous sorting effort is wasted.
As we see, for digit 8 we need 178, for digit 7 we need 17, and for digit 1 we only need 1. Here 178, 17, 1 are the prefixes up to the digit we need. We can obtain these prefixes via a base exp:
exp = 1,   (178 // exp) = 178, 178 % 10 = 8
exp = 10,  (178 // exp) = 17,  17 % 10 = 7
exp = 100, (178 // exp) = 1,   1 % 10 = 1
We can also get the digits by looping, each time dividing our number by 10. For example, the following code will output [8, 7, 1]:

a = 178
digits = []
while a > 0:
    digits.append(a % 10)
    a = a // 10
• Because decimal numbers have only 10 distinct digits in total, we only need a count array of size 10.
variable n; thus radix sorting integers with counting sort as the subroutine has a linear time complexity. Because the counting sort subroutine is stable, radix sort is a stable sorting algorithm too.
With the auxiliary count and order arrays, it has O(n) space complexity, making LSD integer sorting an out-of-place sorting algorithm.
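Putting the pieces together, here is a sketch of LSD radix sort for non-negative integers, with a stable counting pass per digit (the function name radix_sort_lsd is ours):

def radix_sort_lsd(a):
    if not a:
        return a
    exp = 1
    while max(a) // exp > 0:       # one pass per decimal digit
        count = [0] * 10           # only 10 slots: one per digit
        order = [0] * len(a)
        for x in a:
            count[(x // exp) % 10] += 1
        for i in range(1, 10):     # prefix sum gives end positions
            count[i] += count[i - 1]
        for x in reversed(a):      # reverse scan keeps the pass stable
            d = (x // exp) % 10
            count[d] -= 1
            order[count[d]] = x
        a = order
        exp *= 10
    return a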
Figure 15.10: Radix Sort: MSD sorting strings in recursion. The black and
grey arrows indicate the forward and backward pass in recursion, respec-
tively.
For better demonstration, we add two more strings: “ap” and “pear”. String “ap” shows what happens when strings fall in the same bucket but one is shorter and has no valid letter to compare with, and string “pear” showcases how the algorithm handles duplicates.
• At first, the recursion buckets the strings by their first letter, as the process where i = 0 shows in Fig. 15.10. There are three buckets, with letters ‘a’, ‘b’, and ‘p’.
Basics
Using the above two built-in methods to sort a list of integers is as simple as:
lst = [4, 5, 8, 1, 2, 7]
lst.sort()
Printing out lst shows that the sorting happens in-place within lst.
[1, 2, 4, 5, 7, 8]
In contrast, sorted() returns a new sorted list and leaves the original intact. With a fresh lst = [4, 5, 8, 1, 2, 7] and new_lst = sorted(lst), we print out:

new_lst, lst
([1, 2, 4, 5, 7, 8], [4, 5, 8, 1, 2, 7])
Let's sort another iterable object, say a tuple of strings:

fruit = ('apple', 'pear', 'berry', 'peach', 'apricot')
new_fruit = sorted(fruit)

Printing out new_fruit, we also see that it returned a list instead of a tuple:

['apple', 'apricot', 'berry', 'peach', 'pear']
Timsort Both methods use the same sorting algorithm, Timsort, and take the same parameters. Timsort is a hybrid stable sorting algorithm, derived from merge sort and insertion sort, designed to perform well on many kinds of real-world data. It uses techniques from Peter McIlroy's “Optimistic Sorting and Information Theoretic Complexity” (January 1993). It was implemented by Tim Peters in 2002 for use in the Python programming language. The algorithm finds subsequences of the data that are already ordered and uses that knowledge to sort the remainder more efficiently.
Arguments
Both methods take two keyword-only arguments, key and reverse, with default values None and False, respectively.
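For example, a quick sketch with reverse (presumably the usage the next paragraph refers to):

lst = [4, 5, 8, 1, 2, 7]
lst.sort(reverse=True)
# lst is now [8, 7, 5, 4, 2, 1]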
This is equivalent to customizing a class Int and overriding its __lt__() special method as:

class Int(int):
    def __init__(self, val):
        self.val = val
    def __lt__(self, other):
        return other.val < self.val
Now, sorting the same list without setting reverse gets us exactly the same result:

lst = [Int(4), Int(5), Int(8), Int(1), Int(2), Int(7)]
lst.sort()
Customize key We have mainly two options to customize the key argument: (1) through a lambda function, or (2) through a pre-defined function. Either way, the function takes exactly one argument. For example, to sort the following list of tuples using the second item of each tuple as the key:

lst = [(8, 1), (5, 7), (4, 1), (1, 3), (2, 4)]
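A sketch with a pre-defined function (the helper name get_second is ours for illustration):

def get_second(item):
    return item[1]

new_lst = sorted(lst, key=get_second)
# new_lst: [(8, 1), (4, 1), (1, 3), (2, 4), (5, 7)]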
The same result can be achieved via a lambda function, which is more convenient:

new_lst = sorted(lst, key=lambda x: x[1])
class Student(object):
    def __init__(self, name, grade, age):
        self.name = name
        self.grade = grade
        self.age = age

    # To support indexing
    def __getitem__(self, key):
        return (self.name, self.grade, self.age)[key]

    def __repr__(self):
        return repr((self.name, self.grade, self.age))
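The attrgetter example below assumes a list of students; a plausible setup would be:

students = [
    Student('john', 'A', 15),
    Student('jane', 'B', 12),
    Student('dave', 'B', 10),
]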
attrgetter can take multiple arguments; for example, to sort the list first by ‘grade’ and then by ‘age’, we can do:

from operator import attrgetter
sorted(students, key=attrgetter('grade', 'age'))
15.1 Insertion Sort List (147). Sort a linked list using insertion sort.
A graphical example of insertion sort: the partially sorted list (black) initially contains only the first element of the list. With each iteration, one element (red) is removed from the input data and inserted in place into the sorted list.
Algorithm of Insertion Sort: Insertion sort iterates, consuming one input element each repetition and growing a sorted output list. At each iteration, insertion sort removes one element from the input data, finds the location it belongs to within the sorted list, and inserts it there. It repeats until no input elements remain.
Example 1:
Input: 4->2->1->3
Output: 1->2->3->4

Example 2:
Input: -1->5->3->4->0
Output: -1->0->3->4->5
Example 2:
Input: [[1,4],[4,5]]
Output: [[1,5]]
Explanation: Intervals [1,4] and [4,5] are considered overlapping.
15.3 Valid Anagram (242, easy). Given two strings s and t, write a function to determine if t is an anagram of s.

Example 1:
Input: s = "anagram", t = "nagaram"
Output: true

Example 2:
Input: s = "rat", t = "car"
Output: false
Note: You may assume the string contains only lowercase alphabets. Follow up: What if the inputs contain unicode characters? How would you adapt your solution to such a case?
15.5 Sort Colors (leetcode: 75). Given an array with n objects colored red, white, or blue, sort them so that objects of the same color are adjacent, with the colors in the order red, white, and blue. Here, we use the integers 0, 1, and 2 to represent red, white, and blue, respectively. Note: You are not supposed to use the library's sort function for this problem.
15.6 148. Sort List (sort linked list using merge sort or quick
sort).
Solutions
1. Solution: insertion sort itself is easy; we need to compare the current node with all previously sorted elements. However, to do it in a linked list, we need to know how to iterate over the elements and how to build a new list. In this algorithm, we need two while loops: the first goes through the nodes from the second to the last, and the second goes through the sorted list, which starts with one element, comparing each sorted value with the current node. There are three cases after the comparison: if cmp_node does not move, we need to put the current node in front of the previous head, and cur_node becomes the new head; if cmp_node stops past the back, the current node becomes the new end, so we set its next to None, using the saved pre_node; if it stops in the middle, we put cur_node in between pre_node and cmp_node.
def insertionSortList(self, head):
    """
    :type head: ListNode
    :rtype: ListNode
    """
    if head is None:
        return head
    cur_node = head.next
    head.next = None  # sorted list only has one node, a new list
    while cur_node:
        next_node = cur_node.next  # save the next node
        cmp_node = head
        # compare node with all previous nodes
        pre_node = None
        while cmp_node and cmp_node.val <= cur_node.val:
            pre_node = cmp_node
            cmp_node = cmp_node.next

        if cmp_node == head:  # put in the front
            cur_node.next = head
            head = cur_node
        elif cmp_node is None:  # put at the back
            cur_node.next = None  # current node is the end, so set it to None
            pre_node.next = cur_node
            # head is not changed
        else:  # in the middle, insert
            pre_node.next = cur_node
            cur_node.next = cmp_node
        cur_node = next_node
    return head
from heapq import heappush

def merge(self, intervals):
    """
    :rtype: List[Interval]
    """
    if not intervals:
        return []
    # sorting the intervals: O(n log n)
    intervals.sort(key=lambda x: (x.start, x.end))
    h = [intervals[0]]
    # iterate the intervals to add
    for i in intervals[1:]:
        s, e = i.start, i.end
        bAdd = False
        for idx, pre_interval in enumerate(h):
            s_before, e_before = pre_interval.start, pre_interval.end
            if s <= e_before:  # overlap, merge into the same interval
                h[idx].end = max(e, e_before)
                bAdd = True
                break
        if not bAdd:
            # no overlap, push to the heap
            heappush(h, i)
    return h
3. Solution: there are many ways to do it; the easiest is to sort the letters in each string and check whether the results are the same. Or we can use an array of size 26 to save the count of each letter in one string and then check each letter of the other string.
def isAnagram(self, s, t):
    """
    :type s: str
    :type t: str
    :rtype: bool
    """
    return ''.join(sorted(list(s))) == ''.join(sorted(list(t)))
table = [0] * 26  # letter counts: +1 for letters of s, -1 for letters of t
for c1, c2 in zip(s, t):
    table[ord(c1) - ord('a')] += 1
    table[ord(c2) - ord('a')] -= 1
for n in table:
    if n != 0:
        return False
return True
For the follow-up, use a hash table instead of a fixed-size counter. Imagine allocating a large array to fit the entire range of unicode characters, which could go up to more than 1 million. A hash table is a more generic solution and can adapt to any range of characters.
Dynamic Programming
Note that not all problems of the above forms can certainly be solved with dynamic programming; a problem must show two properties, overlapping subproblems and optimal substructure, for dynamic programming to apply. These two properties will be defined and explained in this chapter.
16.1.1 Concepts
Memoization works in such a way that the very first time a subproblem is solved, its result is saved in a hashmap; whenever the same subproblem is met again, the saved solution is returned directly instead of being recomputed. The key elements of this style of dynamic programming are:
Comparison Figure 16.1 records the two different methods, which we call memoization and tabulation for short. Memoization and tabulation yield the same asymptotic time complexity; however, the tabulation approach often has much better constant factors, since it has less overhead from procedure calls.
The memoization method suits beginners who have a decent understanding of divide and conquer. However, once you study further and have enough practice, tabulation should become more intuitive than the recursive solution. Usually, a dynamic programming solution to a problem refers to the solution with tabulation.
We walk through two examples, the Fibonacci sequence (Subsection 16.1.3) and the Longest Increasing Subsequence (Subsection ??), in the remainder of this section to showcase memoization and tabulation in practice.
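As a quick preview on the Fibonacci sequence, a minimal sketch contrasting the two styles:

def fib_memo(n, memo=None):
    # top-down: recursion plus a hashmap that caches solved subproblems
    if memo is None:
        memo = {0: 0, 1: 1}
    if n not in memo:
        memo[n] = fib_memo(n - 1, memo) + fib_memo(n - 2, memo)
    return memo[n]

def fib_tab(n):
    # bottom-up: start from the base cases and fill the table iteratively
    dp = [0] * (n + 1)
    if n >= 1:
        dp[1] = 1
    for i in range(2, n + 1):
        dp[i] = dp[i - 1] + dp[i - 2]
    return dp[n]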
Complete Search Because the relation between the current state and previous states is directly given, it is straightforward to solve the problem.
2. Then, we initialize the results of base cases, which either were given or can be obtained easily by simple deduction.
The subproblem space and the state choice together not only formulate the recurrence relation, with which we largely have the implementation in hand, but also decide the time and space complexity needed to tackle our dynamic programming problem.
• Dos: Dynamic programming fits for optimizing problems that have either exponential or polynomial complexity under complete search.
• Don'ts: Situations where dynamic programming does not fit:
1. When the naive solution already has a low time complexity, such as O(n^2) or O(n^3).
2. When the input dataset is a set rather than an array, string, or matrix; 90% of the time we will not use DP.
3. When the overlapping subproblems property applies but it is not an optimization problem, so we cannot identify the optimal substructure property and thus its recurrence function. For example, given the same problem context as in the Dos but where we are instead required to obtain or print all solutions, we need to retreat to DFS+memo (top-down) instead of DP.
2. Come up with the most naive solution ASAP and analyze its time complexity. Is it a typical DFS solution? Try drawing a SUBPROBLEM GRAPH for visualization. Is there room for optimization?
3. Apply Section 16.2.1: Is there overlapping? Can you define the optimal substructure/recurrence function?
4. If the conclusion is YES, try to define the five key elements so that we can solve it with the preferable tabulation. If you can figure it out intuitively just like that, great! What if not? Maybe retreat to memoization, which combines divide and conquer, DFS, and a memo.
For example, if the subproblem space is n, and each state i relies on (1) only one or two previous states, as we have seen in the example of the Fibonacci sequence, the time complexity is O(n); or (2) all previous states in the range [0, i − 1], as seen in the example of the Longest Increasing Subsequence, which can be viewed as O(n) work per subproblem, bringing the complexity up to O(n^2).
Cons of memoization (top-down):
1. Slower if many subproblems are revisited, due to the overhead of recursive calls.
2. If there are n states, it can use up to O(n) table size, which might lead to Memory Limit Exceeded (MLE) for some hard problems.
3. Faces stack overflow due to the recursive calls.

Cons of tabulation (bottom-up):
1. For programmers who are inclined to recursion, this may not be intuitive.
[
     [2],
    [3,4],
   [6,5,7],
  [4,1,8,3]
]
The minimum path sum from top to bottom is 11 (i.e., 2 + 3 + 5 + 1 = 11).
Analysis
1. We quickly read the question and find the key word: minimum.
2. We come up with the most naive solution, DFS, which we have already covered in an earlier chapter. With a quick drawing of the DFS traversal graph, we find that some nodes are visited repeatedly.
3. Apply the two properties: First, define the subproblem. Each node in the triangle is decided by two indices (i, j), the row and column index respectively. The subproblem can be straightforward: the minimum sum from the starting point (0, 0) to the current position (i, j). The subproblem graph will be exactly the same as the graph we used in DFS, and we identify the overlapping easily.
Now, develop the recurrence function. To build up the solution for state (i, j), we need two other states, (i − 1, j) and (i − 1, j − 1), plus one value from the current state. The function will be: f(i, j) = min(f(i − 1, j), f(i − 1, j − 1)) + t[i][j].
4. Five key elements: we need to figure out how to assign and initialize the dp space and do the iteration. To get the boundary conditions:
(a) by observation: the first element at (0, 0) has neither of the two states f(i − 1, j), f(i − 1, j − 1) available; the leftmost and rightmost elements of the triangle have only one of the two states.
(b) by simple math induction: i ∈ [0, n − 1], j ∈ [0, i]. When i = 0, j = 0, f(i, j) = t[i][j]; when i ∈ [1, n − 1], j = 0, f(i − 1, j − 1) is invalid; and when j = i, f(i − 1, j) is invalid.
The answer would be the minimum value of dp at the last row. The
Python code is given:
def min_path_sum(t):
    n = len(t)
    # initialized to 0 for f()
    dp = [[0 for c in range(r + 1)] for r in range(n)]
    # initialize the first point, the top
    dp[0][0] = t[0][0]
    # initialize the left col and the right col of the triangle
    for i in range(1, n):
        dp[i][0] = dp[i - 1][0] + t[i][0]
        dp[i][i] = dp[i - 1][i - 1] + t[i][i]
    for i in range(1, n):
        for j in range(1, i):
            dp[i][j] = t[i][j] + min(dp[i - 1][j], dp[i - 1][j - 1])
    return min(dp[-1])
Space Optimization From the recurrence function, we can see that the current state only relates to two states from the previous row. We can reuse the original triangle matrix itself to save the states. If we follow forward induction as in the previous solution, we still have the edge cases: some states have only one or no previous state to decide their current value. We can write our code as:
from copy import deepcopy

def min_path_sum(t):
    '''
    Space optimization with forward induction
    '''
    t = deepcopy(t)
    if not t:
        return 0
    n = len(t)
    for i in range(0, n):
        for j in range(0, i + 1):
            if i == 0 and j == 0:
                continue
            elif j == 0:
                t[i][j] = t[i][j] + t[i - 1][j]
            elif j == i:
                t[i][j] = t[i][j] + t[i - 1][j - 1]
            else:
                t[i][j] = t[i][j] + min(t[i - 1][j], t[i - 1][j - 1])
    return min(t[-1])
The problem is analyzed following our two properties and solved following our five-step guideline and five elements.
1. First step: we read the problem and quickly catch the key word, maximum.
2. Second step: the naive solution. From other chapters, we have seen how maximum subarray can be approached as either graph search (O(2^n); more details later) or linear search along the solution space (O(n^3), or O(n^2) if tweaked with the computation of the subarray sums).
4. Step 4: Given all the conclusions, we can start on the five key elements. The solution below requires us to start from the maximum index and work in reverse order; this is called backward induction in materials explaining dynamic programming from the angle of optimization. We always need to account for the empty array, for which the maximum subarray should give zero as the result. This makes our total number of states n + 1 instead of n. In the backward induction, this empty state is located at index n in a list of size n + 1.
def maximum_subarray_dp(a):
    '''
    Backward induction dp solution
    '''
    # assignment and initialization
    dp = [0] * (len(a) + 1)
    # fill out the dp space in reverse order
    # we do not need to fill the base case dp[n]
    for i in reversed(range(len(a))):
        dp[i] = max(dp[i + 1] + a[i], a[i])
    print(dp)
    return max(dp)
If we track only the global maximum subarray value and use a single state variable to replace the dp array, we can decrease the space complexity from O(n) to O(1).
def maximum_subarray_dp_sp(a):
    '''
    dp solution with space optimization
    '''
    # assignment and initialization
    state = 0
    maxsum = 0
    # fill out the dp space in reverse order
    # we do not need to fill the base case dp[n]
    for i in reversed(range(len(a))):
        state = max(state + a[i], a[i])
        maxsum = max(maxsum, state)
    return maxsum
All of the above steps are for deep analysis purposes. Once you are more experienced, you can go directly to the five elements of tabulation and develop the solution without connecting it to the naive solution. Also, this is actually Kadane's algorithm, which will be further detailed in Chapter ??.
16.4 Exercises
16.4.1 Knowledge Check
1. The completeness of Dynamic programming.
Input:
[
  [0,0,0],
  [0,1,0],
  [0,0,0]
]
Output: 2
Explanation: There is one obstacle in the middle of the 3x3 grid above. There are two ways to reach the bottom-right corner:
1. Right -> Right -> Down -> Down
2. Down -> Down -> Right -> Right
Sequence Type
same = k
diff = k * (k - 1)
for i in range(3, n + 1):
    pre_diff = diff
    diff = (same + diff) * (k - 1)
    same = pre_diff
return (same + diff)
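The fragment above appears to be the core loop of the Paint Fence recurrence; a self-contained sketch under that assumption (n posts, k colors, no more than two adjacent posts sharing a color):

def numWays(n, k):
    if n == 0:
        return 0
    if n == 1:
        return k
    same = k            # last two posts share a color
    diff = k * (k - 1)  # last two posts differ
    for i in range(3, n + 1):
        # a post may repeat the previous color only if the previous two differed
        same, diff = diff, (same + diff) * (k - 1)
    return same + diff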
Explanation:
As shown below, there are 3 ways you can generate "rabbit" from S.
(The caret symbol ^ means the chosen letters)

rabbbit
^^^^ ^^
rabbbit
^^ ^^^^
rabbbit
^^^ ^^^
Example 2:
Input: s1 = "aabcc", s2 = "dbbca", s3 = "aadbbbaccc"
Output: false
Splitting Type DP
    }
    // search for right min
    localMin = nums[size - 1];
    for (int i = size - 2; i >= 0; i--) {
        localMin = Math.min(nums[i], localMin + nums[i]);
        right_min[i] = Math.min(right_min[i + 1], localMin);
    }
    // search for separate position
    int diff = 0;
    for (int i = 0; i < size - 1; i++) {
        diff = Math.max(Math.abs(left_max[i] - right_min[i + 1]), diff);
        diff = Math.max(Math.abs(left_min[i] - right_max[i + 1]), diff);
    }
    return diff;
}
Example 2:
Input: [-2,0,-1]
Output: 0
Explanation: The result cannot be 2, because [-2,-1] is not a subarray.
def maxProduct(nums):
    if not nums:
        return 0
    n = len(nums)
    min_local, max_local = [0] * n, [0] * n
    max_so_far = nums[0]
    min_local[0], max_local[0] = nums[0], nums[0]
    for i in range(1, n):
        if nums[i] > 0:
            max_local[i] = max(max_local[i - 1] * nums[i], nums[i])
            min_local[i] = min(min_local[i - 1] * nums[i], nums[i])
        else:
            max_local[i] = max(min_local[i - 1] * nums[i], nums[i])
            min_local[i] = min(max_local[i - 1] * nums[i], nums[i])
        max_so_far = max(max_so_far, max_local[i])
    return max_so_far
Example 2:
Input: [1,2,3,4,5]
Output: 4
Explanation: Buy on day 1 (price = 1) and sell on day 5 (price = 5), profit = 5-1 = 4.
             Note that you cannot buy on day 1, buy on day 2 and sell them later, as you are
             engaging in multiple transactions at the same time. You must sell before buying again.

Example 3:
Input: [7,6,4,3,1]
Output: 0
Explanation: In this case, no transaction is done, i.e. max profit = 0.
    mono_stack = []
    profit = 0
    for p in prices:
        if not mono_stack:
            mono_stack.append(p)
        else:
            if p < mono_stack[-1]:
                mono_stack.append(p)
            else:
                # kick out till it is decreasing
                if mono_stack and mono_stack[-1] < p:
                    price = mono_stack.pop()
                    profit += p - price

                while mono_stack and mono_stack[-1] < p:
                    price = mono_stack.pop()
                mono_stack.append(p)
    return profit
Example 2:
Input: [3,2,6,5,0,3], k = 2
Output: 7
Explanation: Buy on day 2 (price = 2) and sell on day 3 (price = 6), profit = 6-2 = 4.
             Then buy on day 5 (price = 0) and sell on day 6 (price = 3), profit = 3-0 = 3.
Input: [1,12,-5,-6,50,3], k = 4
Output: 12.75
Explanation:
when length is 5, maximum average value is 10.8,
when length is 6, maximum average value is 9.16667.
Thus return 12.75.

Note:
1 <= k <= n <= 10,000.
Elements of the given array will be in range [-10,000, 10,000].
An answer with calculation error less than 10^-5 will be accepted.
Challenge
O(n x m) memory is acceptable; can you do it in O(m) memory? Hint: similar to Backpack I; the difference is that for dp[j] we want to maximize the value, not the volume, so we just replace f[i-A[i]]+A[i] with f[i-A[i]]+V[i].
Explanation:
Swap A[3] and B[3]. Then the sequences are:
A = [1, 3, 5, 7] and B = [1, 2, 3, 4]
which are both strictly increasing.

Note:
A, B are arrays with the same length, and that length will be in the range [1, 1000].
A[i], B[i] are integer values in the range [0, 2000].
Simple DFS. The brute-force solution is to generate all valid sequences and find the minimum number of swaps needed. Because each element can either be swapped or not, the time complexity is O(2^n). Whether we need to swap at index i depends only on four elements at two states, (A[i], B[i], A[i-1], B[i-1]), at states i and i-1 respectively. At first, for each path we keep the last visited elements a and b, the elements picked for A and B respectively. Then
import sys

def minSwap(self, A, B):
    if not A or not B:
        return 0

    def dfs(a, b, i):  # the last element of the state
        if i == len(A):
            return 0
        if i == 0:
            # not swap vs swap
            count = min(dfs(A[i], B[i], i + 1), dfs(B[i], A[i], i + 1) + 1)
            return count
        count = sys.maxsize

        if A[i] > a and B[i] > b:  # not swap
            count = min(dfs(A[i], B[i], i + 1), count)
        if A[i] > b and B[i] > a:  # swap
            count = min(dfs(B[i], A[i], i + 1) + 1, count)
        return count

    return dfs([], [], 0)
def minSwap(self, A, B):
    if not A or not B:
        return 0

    def dfs(a, b, i, memo, swapped):  # the last element of the state
        if i == len(A):
            return 0
        if (swapped, i) not in memo:
            if i == 0:
                memo[(swapped, i)] = min(
                    dfs(A[i], B[i], i + 1, memo, False),
                    dfs(B[i], A[i], i + 1, memo, True) + 1)
                return memo[(swapped, i)]
            count = sys.maxsize

            if A[i] > a and B[i] > b:  # not swap
                count = min(count, dfs(A[i], B[i], i + 1, memo, False))
            if A[i] > b and B[i] > a:  # swap
                count = min(count, dfs(B[i], A[i], i + 1, memo, True) + 1)
            memo[(swapped, i)] = count

        return memo[(swapped, i)]

    return dfs([], [], 0, {}, False)
Solution: here we not only need to count all the solutions, we need to record them. Before using dynamic programming, we can use DFS, and we need a function to check whether a split substring is a palindrome. The time complexity is T(n) = T(n − 1) + T(n − 2) + ... + T(1) + O(n), which gives a complexity of O(2^n). This is also called a backtracking algorithm. The running time is 152 ms.
def partition(self, s):
    """
    :type s: str
    :rtype: List[List[str]]
    """
    # s = "bb"
    # the whole purpose is to find pal, which means it is a DFS
    def bPal(s):
        return s == s[::-1]
    def helper(s, path, res):
        if not s:
            res.append(path)
        for i in range(1, len(s) + 1):
            if bPal(s[:i]):
                helper(s[i:], path + [s[:i]], res)
    res = []
    helper(s, [], res)
    return res
This is an example where, if we want to print out all the solutions, we need DFS and backtracking; dynamic programming can hardly save time here.
16.5 Summary
Steps of Solving Dynamic Programming Problems
We read through the problems; most of them use array or string data structures. We search for key words: “min/max number”, “Yes/No”, in “subsequence”-type problems. After this process, we have made sure that we are going to solve the problem with dynamic programming. Then, we use the following steps to solve it:
1. .
2. Create new storage (a list) f to store the answers, where f_i denotes the answer for the subarray that starts at 0 and ends at i. (Typically, one extra space is needed.) This step implicitly tells us how we do divide and conquer: we first divide the sequence S into S(1, n) and a_0, and we reason about the relation between these elements.
4. We initialize the storage and figure out where in the storage the final answer is (f[-1], max(f), min(f), or f[0]).
Greedy Algorithms
can be approximate yet efficient. Maybe it will inspire us in other fields.
17.1 Exploring
Example 1:
Input: [[1,2],[2,3],[3,4],[1,3]]
Output: 1
Explanation: [1,3] can be removed and the rest of the intervals are non-overlapping.
o = max Σ_{i=0}^{n−1} x_i    (17.1)
However, if we sort the items by either start or end time, checking an item's compatibility with a combination only requires comparing it with the combination's last item.
Dynamic Programming
    # print(LIS)
    return len(intervals) - max(LIS)
And the final answer will be dp[-1]. With the sorting, the part with the dynamic programming only takes O(n), making the total time O(n log n), dominated by the sorting. The code differs by only one line from the above approach:
def eraseOverlapIntervals(intervals: List[List[int]]) -> int:
    if not intervals:
        return 0
    intervals.sort(key=lambda x: x[0])
    n = len(intervals)
    dp = [0] * (n + 1)

    for i in range(n):
        max_before = 0
        for j in range(i, -1, -1):
            if intervals[i][0] >= intervals[j][1]:
                max_before = max(max_before, dp[j + 1])
                break
        dp[i + 1] = max(dp[i], max_before + 1)
    # print(LIS)
    return n - dp[-1]
Greedy Algorithm
In the previous solution, the process looks like this: if sorted by end time, first we have e with m = 1; for a, which is not compatible with e, according to the previous recurrence relation m = 1, with either a or e in the optimal
import sys

def eraseOverlapIntervals(intervals: List[List[int]]) -> int:
    if not intervals:
        return 0
    min_rmv = 0
    intervals.sort(key=lambda x: x[1])
    last_end = -sys.maxsize
    for i in intervals:
        if i[0] >= last_end:  # non-overlap
            last_end = i[1]
        else:
            min_rmv += 1

    return min_rmv
If we sort the intervals by start time instead, we need to tweak the code a bit: whenever one interval is incompatible with the previous one, we check whether it has an earlier end time than the previous one; if so, we replace the previous interval with it. Because it has a later start time and an earlier end time, whatever optimal set the previous interval belongs to, replacing it with the current one will not overlap and will be more promising.
for i in intervals:
    if i[0] < last_end:  # overlap, delete this one, do not update the end
        if i[1] < last_end:
            last_end = i[1]
        min_rmv += 1
    else:
        last_end = i[1]
Questions to Ponder
promising one. For example, for [[1, 11]], the optimal solution is [1, 11]; for [1, 11], [2, 12], the optimal solution is still [1, 11], even though [2, 12] is another optimal solution for this subproblem. For [1, 11], [2, 12], [13, 14], [11, 22], the greedy approach gives us [1, 11], [13, 14] as our optimal solution, while with dynamic programming we can still find another optimal solution: [1, 11], [11, 22].
We clearly see that to get the best solution, we have to rely on the optimal solutions of all preceding subproblems. If we insist on applying a greedy algorithm, this is how the process looks:
subproblems
[2], LIS = [2]
[2, 1], LIS = [2], only compare [2] and 1
[2, 1, 3], LIS = [2, 3]
[2, 1, 3, 7], LIS = [2, 3, 7]
[2, 1, 3, 7, 5], LIS = [2, 3, 7]
[2, 1, 3, 7, 5, 6], LIS = [2, 3, 7]
LIS = [2, 3, 7] is locally optimal but not part of the global optimal solutions, which are [1, 3, 5, 6] and [2, 3, 5, 6]. In our non-overlapping interval problem, if one interval is optimal in the local subproblem, it will surely be part of the optimal solution to the final problem (globally).
Practical Guideline
It is clear that, like dynamic programming, greedy algorithms solve optimization problems subject to a set of constraints. For example:
• Maximize the number of events you can attend, but do not attend any
overlapping events.
• Minimize the cost of all edges chosen, but do not disconnect the graph.
However,
• Hard to get it right: Once you have found the right greedy approach,
designing greedy algorithms can be easy. However, finding the right
rule can be hard.
17.3 *Proof
The main challenge in greedy algorithms is proving their correctness, which is important in theoretical study. In real coding practice, however, we can compare against a dynamic programming solution and scrutinize different kinds of examples to make sure the greedy algorithm and the dynamic programming solution produce the same results. Still, let us learn these proof techniques as another powerful tool.
17.3.1 Introduction
First, we introduce two general techniques/arguments to prove the correctness of a greedy algorithm in a step-by-step fashion using mathematical induction: Greedy Stays Ahead and Exchange Arguments.
4. Prove optimality. Using the fact that greedy stays ahead, prove that the greedy algorithm must produce an optimal solution. This argument is often done by contradiction: assume the greedy solution is not optimal and use the fact that greedy stays ahead to derive a contradiction.
The main challenge with this style of argument is finding the right measure-
ments to make.
Exchange Arguments
It proves that the greedy solution is optimal by showing that we can iteratively transform any optimal solution into the greedy solution produced by the greedy algorithm without worsening its cost. This transformation matches the word “exchange”. Exchange arguments are a more versatile technique than greedy stays ahead. They can be generalized into three steps:
1. Define the solution: define our greedy solution as G = {g_1, ..., g_k} and compare it against some optimal solution O = {o_1, ..., o_m}.
Guideline
We will simply go through the list, but the point is that we should use the proof methods as a way to design the greedy algorithm on top of the dynamic programming.
Analysis First, let us assume all intervals have distinct deadlines. A naive solution is to try all permutations of the n intervals and find the one with the minimum lateness. But what if we start from a random order, compute its lateness, and each time exchange two adjacent items to see whether the change decreases the total lateness?
______a_i____a_j
Say our items are a_i and a_j. There are four cases according to d_i, d_j, t_i, t_j. At first, the lateness values are s + t_i − d_i and s + t_i + t_j − d_j. After the exchange, we have s + t_j − d_j and s + t_j + t_i − d_i. Now i will definitely be more late, while j will be less late. Let us compare the additional lateness of i with the decreased lateness of j:

s + t_j + t_i − d_i − (s + t_i − d_i)  ≷  s + t_i + t_j − d_j − (s + t_j − d_j)    (17.8)
t_j  ≷  t_i    (17.9)
l[i] = 0 if f[i] ≤ d[i]; otherwise l[i] = f[i] − d[i].    (17.10)
That is to say, we do not reward intervals that are not late with negative values. Things get more complex.
1. If none of them is late, then exchanging or not makes no difference to the total lateness.

______a_i____a_j__d_i__d_j
We can find no particular rule (not sorting by d, not by t, and not by d − t, which is called slack time) that solves it greedily in polynomial time. However, can we use dynamic programming?
and the optimal solution is still simply the same order; this indicates the optimal substructure.
Let us assume we have found the best order O for the subarray intervals[0 : i]; now we have to prove that the best solution for intervals[0 : i + 1] can be obtained by inserting interval[i] into O. Assume the position we insert at is j, so O[0 : j] will not be affected at all; we care about O[j : i]. First, we would have to prove that no matter where we insert interval[i], the ordering of O must stay unchanged for the solution to be optimal. If the insert position is at the end of O, the ordering does not need to change. For the other positions, however, it is really difficult to prove without enough math and optimization knowledge.
Let us assume the start time is s for j, j + 1; we know:
We cannot prove it; trying this method out gives us a wrong answer. So far, all our attempts at a greedy algorithm have failed miserably.
When there is a tie at the deadline, if we schedule the one that takes the most time first, we end up with higher lateness. For example, if our solution is [5, 5], [4, 6], [2, 6], the lateness is (5 + 4 − 6) + (9 + 2 − 6) = 3 + 5 = 8 instead of 6.
all the other edges are the same. For example, in the graph we say e = (1, 5). With the constraint that only one edge differs, f has to be one of the edges (2, 3), (3, 5); adding e to T forms a cycle, so T* cannot have both edges (2, 3) and (3, 5) at the same time, and thus one of them, referred to as f, has to be removed from T*. It is always true that cost(e) ≥ cost(f), because otherwise the greedy approach would have chosen e instead of f.
For the optimal approach, if we replace e with f, then we have cost(T) = cost(T* − e + f) ≤ cost(T*). This means that, with this swap of e and f between G and O, the cost of the greedy approach is still at most the optimal cost; transforming the optimal solution into the greedy solution does not worsen it.
17.5.1 Scheduling
We have seen two scheduling problems. Scheduling is a time-based problem that naturally follows a leftmost-to-rightmost order along the timeline, and it is about scheduling tasks/meetings onto the allowed resources. We need to pay attention to the following contexts:
• What are the conditions? Are both start and end times fixed, or are they highly flexible, bounded only by an earliest possible start time and a latest end time?
def minMeetingRooms(intervals):
    if not intervals:
        return 0
    points = []
    for s, e in intervals:
        points.append((s, +1))
        points.append((e, -1))
    points = sorted(points, key=lambda x: (x[0], x[1]))
    ans = 0
    total = 0
    for _, flag in points:
        total += flag
        ans = max(ans, total)
    return ans
Label Assignment
We can modify the previous code to incorporate label assignment. We separate the start and end times into two independent lists, because we assign a room only when we meet a start time, and we sort both lists:
starts: 2(0)  4(4)  9(2) 16(3) 36(1)
ends:   9(4) 15(0) 23(3) 29(2) 45(1)
We put two pointers, sp and ep, at the start of the start-time list and end-time list respectively. We need zero rooms at first. For the start pointer at 2, we assign room one to interval 0 because 2 < 9; no room is freed to reuse. Then sp moves to 4. Since 4 < 9, no room is freed, so we assign room two to interval 4. With sp at 9, 9 ≥ 9, meaning we can reuse the room belonging to interval 4; thus we assign room two to interval 2. Now, moving both sp and ep, we compare 16 > 15, meaning interval 3 can reuse the room belonging to interval 0, so we assign room one to interval 3. Next, we compare 36 > 23, so interval 1 takes room one from interval 3. Since one of the pointers has reached the end of its list, the process ends.
def minMeetingRooms(intervals):
    starts, ends = [], []
    for i, (s, e) in enumerate(intervals):
        starts.append((s, i))
        ends.append((e, i))
    starts.sort(key=lambda x: x[0])
    ends.sort(key=lambda x: x[0])
    n = len(intervals)
    rooms = [0] * n
    sp, ep = 0, 0
    label = 0
    while sp < n:
        index = starts[sp][1]
        # Assign a new room
        if starts[sp][0] < ends[ep][0]:
            rooms[index] = label
            label += 1
        else:  # Reuse a room
            room_of_end = rooms[ends[ep][1]]
            rooms[index] = room_of_end
            ep += 1
        sp += 1
    print(rooms)
    return label
The above method is natural but indeed greedy! We sort the intervals by start time; in the worst case, we assign each meeting a different room. However, the number of rooms can be reduced if we reuse any previously assigned meeting room that is free at the moment. The depth is controlled by the timeline. For example, for interval c, if both a and e are free at that moment, does it matter which meeting room we put c in? Nope. Because no matter which room it is in, interval d will overlap with it and thus cannot use its meeting room, but there is still the one left from either a or e. This is how this problem essentially differs from the maximum non-overlapping problems. The greedy part is that we always reassign the room belonging to the earliest finished meeting. A non-greedy and naive way is to check all preceding meeting rooms and find one that is available.
The “you have to” property Did you notice that, for resource assignment, we mostly have little choice, because we have to assign a room; the only choice is which room. We are greedy in that we reuse a room whenever we can. All the solutions, whether they reassign the earliest finished meeting room or just a random or arbitrary one, do it for a single purpose: reduce the number of resources whenever they can.
An easy way to understand this problem is to notice that for each meeting you HAVE TO assign a room. The worst case is that we assign a different room to every single meeting, and then we do not even need to sort the intervals. Well, how can we optimize it and minimize the number of rooms? We have to reuse a room whenever possible. Therefore, we need to sort the meetings by start time. The first meeting has no choice but to be assigned a new meeting room. For the second meeting, we have two options: either assign another room or reuse one that is available now.
• Does it matter which one to reuse? Nope. Why? Because the smallest number of rooms needed is decided by how many meetings collide at a single time point. No matter which available room you put this meeting in, for the following meetings the number of available rooms is always the same: any rooms that are freed from preceding meetings.
Here is the thing. When we are scanning from the leftmost interval to the rightmost by start time,
Proof We have already proved it informally. Our greedy approach ends up with compatible/feasible solutions where, in each meeting room, no two overlapping meetings are scheduled.
17.5.2 If we use the greedy algorithm above, we can schedule every interval with d resources, which is optimal.
Figure 17.4: Left: sort by start time, Right: sort by finish time.
Figure 17.5: Left: sort by start time, Right: sort by finish time.
Optimizations
We use a list rooms, which starts empty; for each room, we only keep its end time. After sorting by start time, we go through each interval and try to put it in a room: if it does not overlap, we put the interval in that room and update the room's end time. If no available room is found, we assign a new room instead. With this strategy, we end up with O(nd) time complexity. When d is small enough, this saves time.
def minMeetingRooms(intervals):
    intervals = sorted(intervals, key=lambda x: x[0])
    rooms = []  # a list that tracks the end time of each room
    for s, e in intervals:
        bFound = False
        for i, re in enumerate(rooms):
            if s >= re:
                rooms[i] = e
                bFound = True
                break
        if not bFound:
            rooms.append(e)
    return len(rooms)
Priority Queue Is there a way to fully get rid of the factor d? In the previous version we loop over all rooms to check whether one is available, but we do not even care which one it is; we just need one! So instead we replace rooms with a priority queue backed by a min-heap, making sure each time we only check the room with the earliest end time: if it does not overlap, put this meeting into that room and update its finish time; otherwise, assign a new room.
import heapq

def minMeetingRooms(intervals):
    intervals = sorted(intervals, key=lambda x: x[0])
    rooms = []  # a min-heap that tracks the end time of each room
    for s, e in intervals:
        # just check the room that ends earliest instead of checking them all
        if rooms and rooms[0] <= s:
            heapq.heappop(rooms)  # reuse this room: replace its end time
        heapq.heappush(rooms, e)
    return len(rooms)
17.5.2 Partition
763. Partition Labels A string S of lowercase letters is given. We want
to partition this string into as many parts as possible so that each letter
appears in at most one part, and return a list of integers representing the
size of these parts.
Example 1:
This gives us an O(n^2) solution. Not bad! In the dynamic programming solution, when we are solving the subproblem “abaefegdeh”, we check all previous subproblems' solutions. However, if we observe, we only need the optimal solution of the previous subproblem “abaefegde”, namely “aba”, “efegde”, to figure out the current optimal solution: simply check whether ‘h’ is in any part of the preceding optimal solution and merge the parts from the earliest part that contains ‘h’ to the last. Now, we can also observe this between subproblems, for example “abaefegdehi”:
a, o = {a}
ab, o = {a}, {b}
aba, merge, o = {a, b}
abae, o = {a, b}, {e}
abaef, o = {a, b}, {e}, {f}
abaefe, e exists, merge, o = {a, b}, {e, f}
abaefeg
abaefegd
abaefegde
abaefegdeh
abaefegdehi
this part, or within the last range, or it is the end of the range, and a different process is applied in each case. The Python code shows more details of this greedy algorithm:
from collections import defaultdict

class Solution:
    def partitionLabels(self, S: str) -> List[int]:
        n = len(S)
        loc = defaultdict(int)
        for i, c in enumerate(S):
            loc[c] = i  # get the last location of each char
        last_loc = -1
        prev_loc = -1
        ans = []
        for i, c in enumerate(S):
            # prev_loc = min(prev_loc, i)
            last_loc = max(last_loc, loc[c])
            if i == last_loc:  # a good one
                ans.append(last_loc - prev_loc)
                prev_loc = last_loc

        return ans
First, this is an exponential problem because we might need to try all permutations of the files. Merging the first two files costs 10 + 5, merging the result further with 100 costs 10 + 5 + 100, and so on. Now, let us write the cost of the original order:
From the objective function, because all files have positive sizes, to minimize it we have to make sure F_0 is the smallest item in the array, because it is counted the most times, F_1 is the second smallest, and so on. We can easily figure out that sorting the files in increasing order and merging them in this order results in the least total cost. This is a very simple and natural greedy approach.
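A sketch under the sequential-merge cost model described above (the function name min_merge_cost is ours; each merge costs the combined size of the two files):

def min_merge_cost(sizes):
    if len(sizes) < 2:
        return 0
    sizes = sorted(sizes)      # smallest files first: they are re-read the most
    total = 0
    running = sizes[0]
    for s in sizes[1:]:
        running += s           # merge the running file with the next file
        total += running       # pay the combined size for this merge
    return total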
Data Compression
17.5.4 Fractional Knapsack
17.5.5 Graph Algorithms
17.6 Exercises
• 630. Course Schedule III (hard)
18 Hands-On Algorithmic Problem Solving
The purpose of this chapter is to see how the algorithm design principles we have learned can be applied to problem solving. We approach problems with different algorithm design principles, step by step, to see how the time and space complexity change.
Input: [10, 9, 2, 5, 3, 7, 101, 18]
Output: 4
Explanation: The longest increasing subsequence is [2, 3, 7, 101], therefore the length is 4.
problem as finding the longest path in the search tree, which is a binary tree of height n. We can write the Python code:
import sys

def lengthOfLIS(self, nums: List[int]) -> int:
    def dfs(nums, idx, cur_len, last_num, ans):
        if idx >= len(nums):
            ans[0] = max(ans[0], cur_len)
            return
        if nums[idx] > last_num:
            dfs(nums, idx + 1, cur_len + 1, nums[idx], ans)
        dfs(nums, idx + 1, cur_len, last_num, ans)
    ans = [0]
    last_num = -sys.maxsize
    dfs(nums, 0, 0, last_num, ans)
    return ans[0]
18.1.2 Self-Reduction
Now, let us use an example smaller than before, say [2, 5, 3, 7], which has an LIS of length 3, [2, 3, 7]. Let us consider each state not as atomic but as a subproblem. The tree is the same, but we translate each node differently. We start to consider the problem top-down: we have problem [2, 5, 3, 7] and start index = 0, meaning we start from item 2; then our problem can be divided into two situations:
• not take 2: we find the LIS length of the subproblem [5, 3, 7]. In this case, our subsequence can start from any of these 3 items; we indicate this by not changing the previous value. Using idx to indicate the subproblem/subarray, we call dfs with idx + 1.
• take 2: we need to find the LIS length of the subproblem [5, 3, 7] whose items must all be larger than 2. Thus, we set last_num to 2 in the recursive call.
Therefore, our code becomes:
def lengthOfLIS(self, nums: List[int]) -> int:
    def dfs(nums, idx, last_num):
        if not nums:
            return 0
        if idx >= len(nums):
            return 0
        len1 = 0
        if nums[idx] > last_num:
            len1 = 1 + dfs(nums, idx + 1, nums[idx])
        len2 = dfs(nums, idx + 1, last_num)
        return max(len1, len2)

    last_num = -sys.maxsize
    return dfs(nums, 0, last_num)
In this solution, the time complexity has not improved yet, but from this approach, we can further increase the efficiency with dynamic programming.
18.2 A to B
Another approach is to use the concept of “prefix” or “suffix”. The LIS must
start from one of the items in the array. Finding the length of the LIS in
the original array can be achieved by comparing n subproblems, the length
of LIS of:
[2, 5, 3, 7], LIS starts at 2,
[5, 3, 7],    LIS starts at 5,
[3, 7],       LIS starts at 3,
[7],          LIS starts at 7.
18.2.1 Self-Reduction
We model the problem as in Fig. 18.1. Same here: our problem becomes finding the longest path in an N-ary tree instead of a binary tree. Define f(i) as the length of the LIS starting at index i in the array; then its relation with the other states is f(i) = max_j(f(j)) + 1, for j > i and a[j] > a[i], with f(n) = 0. Here, the base case is when there is no element to start from, which gives an LIS of length 0.
Figure 18.1: Graph Model for LIS, each path represents a possible solution.
def lengthOfLIS(self, nums: List[int]) -> int:
    def dfs(nums, idx, cur_num):
        max_len = 0
        # Generate the next node
        for i in range(idx + 1, len(nums)):
            if nums[i] > cur_num:
                max_len = max(max_len, 1 + dfs(nums, i, nums[i]))
        return max_len
    return dfs(nums, -1, -sys.maxsize)
ending at any previous index, plus one. The whole analysis process is illustrated in Fig. 18.2.
the array. To initialize, we set dp[0] = 0, and the answer is max(dp). The time complexity is O(n^2) because we need two for loops: an outer loop over i and an inner loop over j. The space complexity is O(n). The Python code is:
import sys

def lis(a):
    # define the dp array
    dp = [0] * (len(a) + 1)
    a = [-sys.maxsize] + a
    print(a)
    for i in range(len(a)):  # end with index i-1
        for j in range(i):
            if a[j] < a[i]:
                dp[i] = max(dp[i], dp[j] + 1)
    print(dp)
    return max(dp)
A further speedup replaces the inner loop with binary search: dp keeps the smallest possible tail of an increasing subsequence of each length, and each new number either extends the longest subsequence or replaces the first tail that is not smaller than it, giving O(n log n) time:

def lengthOfLIS(self, nums):
    """
    :type nums: List[int]
    :rtype: int
    """
    def binarySearch(arr, l, r, num):
        while l < r:
            mid = l + (r - l) // 2
            if num > arr[mid]:
                l = mid + 1
            elif num < arr[mid]:
                r = mid
            else:
                return mid
        return l

    if not nums:
        return 0
    dp = [0 for _ in range(len(nums))]  # save the smallest tails so far
    length = 0
    for idx in range(0, len(nums)):
        # Find the insertion point of nums[idx] in dp[0:length]
        pos = binarySearch(dp, 0, length, nums[idx])
        # If it is not at the end, we replace the existing tail
        dp[pos] = nums[idx]
        if pos == length:
            length += 1
        print(dp)
    return length
The same algorithm, with the running length renamed LIS:

def lengthOfLIS(self, nums):
    """
    :type nums: List[int]
    :rtype: int
    """
    def binarySearch(arr, l, r, num):
        while l < r:
            mid = l + (r - l) // 2
            if num > arr[mid]:
                l = mid + 1
            elif num < arr[mid]:
                r = mid
            else:
                return mid
        return l

    if not nums:
        return 0
    dp = [0 for _ in range(len(nums))]
    LIS = 0
    for i in range(len(nums)):
        pos = binarySearch(dp, 0, LIS, nums[i])
        dp[pos] = nums[i]
        if pos == LIS:
            LIS += 1
    return LIS
Part V
Classical Algorithms
As the name suggests, the two pointers technique involves two pointers that start and move with the following two patterns:
19.1.1 Array
Remove Duplicates from Sorted Array(L26)
Given a sorted array a = [0, 0, 1, 1, 1, 2, 2, 3, 3, 4], remove the duplicates in-place such that each element appears only once and return the new length. Do not allocate extra space for another array; you must do this by modifying the input array in-place with O(1) extra memory. In the given example, there are in total 5 unique items and 5 is returned.
Analysis We set both the slower pointer i and the faster pointer j at the first item in the array. Recall that slow-fast pointers cut the space of the sorted array into three parts; we can define them as:
1. unique items in region [0, i],
2. processed items (possibly duplicates) in region [i+1, j],
3. unexplored items in the region after j.
In the process, we compare the items pointed at by the two pointers; once the two items are not equal, we have found a new unique item. We advance the slow pointer by one position and copy the unique item at the faster pointer there, keeping region [0, i] free of duplicates.
With our example, at first i = j = 0; region one has one item, which is trivially unique, and region two has zero items. Part of the process is illustrated as:
i  j  [0, i]     [i+1, j]      process
0  0  [0]        []            item 0 == 0, j+1 = 1
0  1  [0]        [0]           item 0 == 0, j+1 = 2
0  2  [0]        [0, 1]        item 0 != 1, i+1 = 1, copy 1 to index 1, j+1 = 3
1  3  [0, 1]     [1, 1]        item 1 == 1, j+1 = 4
1  4  [0, 1]     [1, 1, 1]     item 1 == 1, j+1 = 5
1  5  [0, 1]     [1, 1, 1, 2]  item 1 != 2, i+1 = 2, copy 2 to index 2, j+1 = 6
2  6  [0, 1, 2]  [1, 1, 2, 2]
After calling the above function on our given example, array a becomes [0, 1, 2, 3, 4, 2, 2, 3, 3, 4]. Check the source code for the whole visualized process.
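The routine itself is not printed here; a minimal sketch following the analysis above (the actual version lives in the book's source code) could look like:

def removeDuplicates(a):
    if not a:
        return 0
    i = 0  # region [0, i] holds the unique items found so far
    for j in range(1, len(a)):
        if a[j] != a[i]:
            i += 1
            a[i] = a[j]  # copy the new unique item next to the unique region
    return i + 1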
Given an array of n positive integers and a positive integer s, find the min-
imal length of a contiguous subarray of which the sum ≥ s. If there isn’t
one, return 0 instead.
Example:
Input: s = 7, nums = [1, 4, 1, 2, 4, 3]
Output: 2
Explanation: the subarray [4, 3] has the minimal length under the problem constraint.
However, we can use two pointers i and j (i ≤ j), both initially pointing at the first item. These two pointers define a subarray a[i : j + 1], and we care about the region [i, j]. As we increase pointer j, we keep adding a positive item into the sum of the subarray, making the subarray sum monotonically increasing. Oppositely, if we increase pointer i, we remove a positive item from the subarray, making the sum monotonically decreasing. The detailed steps of the two pointers technique in this case are:
2. Get the optimal subarray for all subproblems (subarrays) that end with the current j, which is e0 at the moment. We do this by forwarding pointer i to shrink the window until sum ≥ s no longer holds. Let's assume pointer i stops at index s0. Now we have found the optimal solution for subproblems a[0 : i, 0 : j] (denoting subarrays with the start point in range [0, i] and the end point in range [0, j]).
Because pointers i and j each move at most n steps, the total number of operations is at most 2n, making the time complexity O(n). The above question would be trivial if the maximum subarray length were asked instead.
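A minimal sketch of these steps; the name minSubArrayLen is assumed, and the exact version in the book's source may differ:

def minSubArrayLen(s, nums):
    i, win_sum = 0, 0
    ans = float('inf')
    for j in range(len(nums)):
        win_sum += nums[j]           # expand the window by moving j
        while win_sum >= s:          # shrink from the left while the condition holds
            ans = min(ans, j - i + 1)
            win_sum -= nums[i]
            i += 1
    return 0 if ans == float('inf') else ans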
Analysis We apply two pointers, with the region between pointers i and j as our testing substring. For this problem, the condition on the window [i, j] is that it contains all characters from T. The intuition is that we keep expanding the window by moving j forward until all characters in T are found. Afterwards, we contract the window so that we can find the minimum window with the condition satisfied. Instead of using another data structure to track the state of the current window, we can depict the pattern T as a dictionary, with the unique characters as keys and the number of occurrences of each character as values. We use another variable count to track the number of unique characters not yet fully found. Together they track the state of the moving window [i, j]: the value in the dictionary indicates how many occurrences of that character are still missing, and count represents how many unique characters are not yet fully found. We depict the state in Fig. 19.2.
Along the expanding and shrinking of the window that comes with the movement of pointers i and j, we track the state as follows:
• When forwarding j, we encompass S[j] in the window. If S[j] is a key in the dictionary, decrease its value by one. Further, if the value reaches the threshold 0, we decrease count by one, meaning we are short of one less character in the window.
Figure 19.3: The partial process of applying two pointers. The grey shaded arrow indicates the pointer that is on the move.
Part of this process with our example is shown in Fig. 19.3. And the Python
code is given as:
from collections import Counter

def minWindow(s, t):
    dict_t = Counter(t)
    count = len(dict_t)
    i, j = 0, 0
    ans = []
    minLen = float('inf')
    while j < len(s):
        c = s[j]
        if c in dict_t:
            dict_t[c] -= 1
            if dict_t[c] == 0:
                count -= 1
        # Shrink the window
        while count == 0 and i < j:
            curLen = j - i + 1
            if curLen < minLen:
                minLen = j - i + 1
                ans = [s[i:j + 1]]
            elif curLen == minLen:
                ans.append(s[i:j + 1])

            c = s[i]
            if c in dict_t:
                dict_t[c] += 1
                if dict_t[c] == 1:
                    count += 1
            i += 1

        j += 1
    return ans
Example 1 (odd length):
Input: [1, 2, 3, 4, 5]
Output: Node 3 from this list (Serialization: [3, 4, 5])
Example 2 (even length):
Input: [1, 2, 3, 4, 5, 6]
Output: Node 4 from this list (Serialization: [4, 5, 6])
When the slow pointer reaches item 3, the fast pointer is at the last item in the first example, which has odd length. Further, when the slow pointer reaches item 4, the faster pointer reaches the empty node past the last item in the second example, which has even length. Therefore, in the implementation, we check two conditions in the while loop: that fast and fast.next are both not None.
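A minimal sketch of this middle-node routine, with a hypothetical bare-bones ListNode added for completeness:

class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

def middleNode(head):
    slow = fast = head
    while fast and fast.next:  # the two conditions discussed above
        slow = slow.next
        fast = fast.next.next
    return slow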
When a linked list has a cycle, as shown in Fig. 19.5, iterating over the list gets the program stuck in an infinite loop. The pointer starts from the head, traverses to the start of the loop, then comes back to the start of the loop again and continues this process endlessly. To avoid being stuck in such a "trap", we may have to solve the following three problems: detecting whether a cycle exists, finding the start node of the cycle, and removing the cycle.
The solution uses exactly the same slow-fast pointer traversal through the linked list as our last example. With the slow pointer iterating one item at a time, and the faster pointer at double pace, these two pointers will definitely meet at one item in the loop. In our example, they will meet at node 6. So, is it possible that they meet in the non-loop region that starts from the head and ends at the start node of the loop? The answer is no: the faster pointer traverses through the non-loop region only once and is always ahead of the slow pointer there, making it impossible for them to meet in this region. This method is called Floyd's Cycle Detection, aka Floyd's Tortoise and Hare Cycle Detection. Let's see in more detail how to solve our three problems with this method.
Check Linked List Cycle (L141) Compared with the code in the last example, we only need to check whether the slow and fast pointers are pointing at the same node: if they are, we are certain that there must be a loop in the list and return True; otherwise return False.
def hasCycle(head):
    slow = fast = head
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
        if slow == fast:
            return True
    return False
For a given linked list, assume the slow and fast pointers meet at a node somewhere in the cycle. As shown in Fig. 19.6, we denote three nodes: the head (h), the start node of the cycle (s), and the meeting node in the cycle (m). We denote the distance between h and s as x, the distance between s and m as y, and the distance from m back around to s as z. Because the faster pointer traverses the list at double speed, when it meets up with the slow pointer, the distance it has traveled (x + y + z + y) is twice the distance traveled by the slow pointer (x + y):

2(x + y) = x + y + z + y, which gives x = z.

From the above equation, we obtain the equality between x and z: x is the distance of the start node of the cycle from the head, y is the distance from the start node to the meeting node, and z is the remaining distance from the meeting node back to the start node. Therefore, after we have detected the cycle as in the last example, we can reset the slow pointer to the head of the linked list. Then we make the slow and the fast pointers both traverse at the same pace, one node at a time, until they meet at a node, where we stop the traversal. The node where they stop is the start node of the cycle. The code is given as:
def detectCycle(head):
    slow = fast = head

    def getStartNode(slow, fast, head):
        # Reset slow pointer
        slow = head
        while fast and slow != fast:
            slow = slow.next
            fast = fast.next
        return slow

    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
        # A cycle is detected
        if slow == fast:
            return getStartNode(slow, fast, head)

    return None
Remove Linked List Cycle We can remove the cycle by redirecting the last node in the cycle, which in the example in Fig. 19.5 is node 6, to an empty node. Therefore, we modify the above code to make the slow and fast pointers stop at the last node instead of the start node of the loop. This subroutine is implemented as:
def resetLastNode(slow, fast, head):
    slow = head
    while fast and slow.next != fast.next:
        slow = slow.next
        fast = fast.next
    fast.next = None
The complete code to remove a cycle is provided in the Google Colab notebook together with running examples.
2. If t > a[i] + a[j], we have to increase the sum; we can only do this by moving pointer i forward.
3. If t < a[i] + a[j], we have to decrease the sum; we can only do this by moving pointer j backward.
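A minimal sketch of this movement rule on a sorted array; the name two_sum_sorted and the target t are assumptions:

def two_sum_sorted(a, t):
    i, j = 0, len(a) - 1
    while i < j:
        cur = a[i] + a[j]
        if cur == t:
            return (i, j)
        elif cur < t:
            i += 1   # increase the sum by moving i forward
        else:
            j -= 1   # decrease the sum by moving j backward
    return None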
The 4 subarrays are listed below:
[1, 0, 1], index (0, 2)
[1, 0, 1, 0], index (0, 3)
[0, 1, 0, 1], index (1, 4)
[1, 0, 1], index (2, 4)
However, the above code only returns 3, instead of 4 as shown in the example. By printing out pointers i and j, we can see the above code misses the case (2, 4). Why? Because we restrict the subarray sum in range [i, j] to be smaller than or equal to S, while 0s might appear at the front or at the rear of the subarray.
The solution is to add another pointer ih to handle the missed case: when the sum equals S, count the total occurrences of 0 at the front. Compared with the above solution, the code differs only slightly, with the additional pointer and one extra while loop to deal with the case. We also need to ensure ih ≤ j; otherwise, the while loop would fail on an example with only zeros and a target sum of 0.
def numSubarraysWithSum(a, S):
    i, i_h, j = 0, 0, 0
    win_sum = 0
    ans = 0
    while j < len(a):
        win_sum += a[j]
        while i < j and win_sum > S:
            win_sum -= a[i]
            i += 1
        # Move i_h to count all zeros in the front
        i_h = i
        while i_h < j and win_sum == S and a[i_h] == 0:
            ans += 1
            i_h += 1

        if win_sum == S:
            ans += 1
        j += 1
    return ans
19.4 Summary
Two pointers is a powerful tool for solving problems on linear data structures, such as "certain" subarray and substring problems as we have shown in the examples. The "window" secluded between the two pointers can be viewed as a sliding window: it slides forward as the slower pointer moves forward. Two important properties are generally required for this technique to work:
1. Sliding Window Property: For example, given an array, imagine that we have a fixed-size window as shown in Fig. 19.7, and we can slide it forward one position at a time, computing the sum of each window. The brute-force solution would be of O(kn) complexity, where k is the window size and n is the array size, using two nested for loops: one to set the starting point, and the other to compute the sum in O(k). However, the sum of the current window (Sc) can be computed from the last window (Sl), the item ai that just slid out, and the item aj that just slid in: Sc = Sl − ai + aj. Obtaining the state of the window between two pointers in O(1) this way is what we call the Sliding Window Property (see the sketch after this list).
Usually, an array with numerical values satisfies the sliding window property if we are to compute its sum or product. For substrings, as shown in our minimum window substring example, we can get the state of the current window by referring to the state of the last window in O(1) with the assistance of a dictionary. For substrings this is more obscure, and the general requirement is that the state of the substring does not relate to the order of the characters (an anagram-like state).
2. Monotonicity: For subarray sum/product, the array should comprise all positive (or all negative) values so that the prefix sum/product is monotone: moving the faster pointer and the slower pointer forward results in opposite changes to the state. The same goes for the substring problems, where we saw in the minimum window substring example that the change of the state, count and the values in the dictionary, is monotone: each either increases or decreases with the movement of the two pointers.
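The sketch promised above: a minimal demonstration of the sliding window property for window sums, assuming a fixed window size k:

def window_sums(a, k):
    sums = [sum(a[:k])]                          # first window in O(k)
    for j in range(k, len(a)):
        sums.append(sums[-1] - a[j - k] + a[j])  # S_c = S_l - a_i + a_j, in O(1)
    return sums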
19.5 Exercises
1. 3. Longest Substring Without Repeating Characters
2. 674. Longest Continuous Increasing Subsequence (easy)
3. 438. Find All Anagrams in a String
4. 30. Substring with Concatenation of All Words
5. 159. Longest Substring with At Most Two Distinct Characters
6. 567. Permutation in String
7. 340. Longest Substring with At Most K Distinct Characters
8. 424. Longest Repeating Character Replacement
20
Advanced Graph Algorithms
This chapter applies the basic search strategies and two advanced algorithm design methodologies, Dynamic Programming and Greedy Algorithms, to a variety of classical graph problems:
• On the other hand, Minimum Spanning Tree (MST) and shortest-path algorithms entail our mastering of breadth-first graph search.
DFS to Solve Cycle Detection Recall the process of DFS graph search, where a vertex has three possible states: white, gray, and black. A back edge appears when, from the current vertex u, we reach an adjacent vertex v that is in the gray state; such an edge connects u back to its ancestor v, and in a directed graph it closes a cycle. When the graph is undirected, we have discussed that it has only tree edges and back edges. Thus, we use two states: visited and not visited. For edge (u, v), we check two conditions:
2. avoiding cycles of length one, which would be any existing edge within the graph. We can easily achieve this by tracking the predecessor p of the exploring vertex during the search, and making sure the adjacent vertex is not the predecessor itself: v ≠ p.
                return True
        else:
            if v != p:  # both black and gray
                print(f'Cycle starts at node {v}.')
                return True
    return False
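The head of this function is lost in typesetting; a hypothetical completion consistent with the surviving fragment (the names has_cycle, g, and visited are assumptions) is:

def has_cycle(g, u, p, visited):
    # DFS with two states (visited or not) and the predecessor p of u
    visited[u] = True
    for v in g[u]:
        if not visited[v]:
            if has_cycle(g, v, u, visited):
                return True
        else:
            if v != p:  # both black and gray
                print(f'Cycle starts at node {v}.')
                return True
    return False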
For example, with the digraph in Fig. 20.3, the process is:
S       Removed Edges
0, 3 are the in-degree 0 nodes
Add 0   (0, 1)
1, 3 are the current in-degree 0 nodes
Add 1   (1, 2)
3 is the only in-degree 0 node
Add 3   (3, 2), (3, 4), (3, 5)
2, 4, 5 are the in-degree 0 nodes
Add 2
Add 4
Add 5   (5, 6)
6 is the only in-degree 0 node
Add 6
V - S empty, stop
Calling topo_sort on the graph, we have the sorted ordering:
[3, 5, 6, 4, 0, 1, 2]
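The in-degree removal process traced above is known as Kahn's algorithm; a minimal sketch is below (the book's topo_sort may be DFS-based, which would explain a different but equally valid ordering):

from collections import deque

def topo_sort_kahn(g):
    # g is an adjacency list indexed 0..n-1
    n = len(g)
    indegree = [0] * n
    for u in range(n):
        for v in g[u]:
            indegree[v] += 1
    q = deque(u for u in range(n) if indegree[u] == 0)
    order = []
    while q:
        u = q.popleft()
        order.append(u)
        for v in g[u]:          # removing u's leaving edges
            indegree[v] -= 1
            if indegree[v] == 0:
                q.append(v)
    return order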
There are a total of n courses that you have to take. Some courses may have prerequisites; for example, course 1 has to be taken before course 0, which is expressed as the pair [0, 1]. Given the total number of courses and the prerequisite pairs, return an ordering of courses you should take to finish all courses. If it is impossible to finish, return an empty array.
The time complexity will be O(|V | + |E|) and the space complexity will be
O(|V |). Since the code is trivial, we only demonstrate it in the notebook.
Union Find
We represent each connected component as a set. For the exemplary graph in Fig. 20.4, we have two sets: {0, 1, 2, 3, 4} and {5, 6}. Unlike the graph-search based approach, where the edges are visited in a certain order, in the union-find approach the ordering of edges to be visited can be arbitrary. The algorithm using union-find is:
• For each edge (u, v) in E, union the two sets that vertices u and v previously belong to.
from collections import defaultdict

def connectedComponent(g):
    n = len(g)
    # Initialize disjoint set
    ds = DisjointSet(n)

    for i in range(n):
        for j in g[i]:  # for edge i <-> j
            ds.union(i, j)
    return ds.get_num_sets(), ds.get_all_sets()
        self.index = 0

    def add_edge(self, u, v):
        if u not in self.node_index:
            self.node_index[u], self.index_node[self.index] = self.index, u
            self.ds.p.append(self.index)
            self.ds.n += 1
            self.index += 1

        if v not in self.node_index:
            self.node_index[v], self.index_node[self.index] = self.index, v
            self.ds.p.append(self.index)
            self.ds.n += 1
            self.index += 1
        u, v = self.node_index[u], self.node_index[v]
        self.ds.union(u, v)
        return

    def get_num_sets(self):
        return self.ds.get_num_sets()

    def get_all_sets(self):
        sets = self.ds.get_all_sets()
        return {self.index_node[key]: set([self.index_node[i] for i in list(value)])
                for key, value in sets.items()}
Examples
1. 547. Number of Provinces(medium)
in graph G. Run DFS in the reversed finishing order; then an SCC will include any vertex along the traversal that hasn't been put into an SCC yet. In our example, the process is:
0: find {0, 1, 2, 3}
4: find {4}
5: find {5}
6: find {6}
def scc(g):
    rg = reverse_graph(g)
    orders = topo_sort_scc(g)

    # Track states
    colors = [STATE.white] * len(g)
    sccs = []

    # Traverse the reversed graph
    for u in orders:
        if colors[u] != STATE.white:
            continue
        scc = []
        dfs(rg, u, colors, scc)
        sccs.append(scc)
    return sccs
Examples
1. 1520. Maximum Number of Non-Overlapping Substrings (hard): set up 26 nodes, one for each letter. A node represents the substring spanning from that letter's first to its last occurrence. Given a string abacdb, for a (0-2), add an edge from a to any other letter occurring between its start and end. Then we have a directed graph. An SCC (loop) between a and b means substring a has an occurrence of b and substring b has an occurrence of a, which conflicts with condition 2, so they have to be combined. The results are the SCCs that are leaves in the contracted SCC graph. We can think of the SCC graph as acyclic, i.e., a forest; if we choose an internal node, we can't choose any of its leaves, so choosing the leaves maximizes the count. Another solution uses two pointers: https://zxi.mytechroad.com/blog/greedy/leetcode-1520-maximum-number-of-non-overlapping
• Start with a forest consisting of |V| trees, each containing only one node. We design a method to merge these trees into a final connected MST by selecting one edge at a time. This is the path taken by Kruskal's algorithm.
• Start with a root node, which can be any vertex selected from G, and grow the tree by spanning to more nodes iteratively. In the process, we maintain two disjoint sets of vertices: one containing vertices that are in the growing spanning tree S, and the other tracking all remaining vertices V − S. This is the path taken by Prim's algorithm.
Generate Spanning Tree with Union-Find For each edge (u, v):
• if u and v belong to the same tree, adding this edge would form a cycle, thus we discard it;
• otherwise, combine these two trees and add this edge into A.
(2, 5), (3, 4), (4, 5), (1, 3)]. As initialization, we assign a set id for each vertex, marked in red and placed above its corresponding vertex. The process is:
edge    logic                         action
(1, 2)  1's set_id 1 != 2's set_id 2  merge set 2 to set 1
(3, 5)  3's set_id 3 != 5's set_id 5  merge set 5 to set 3
(2, 3)  2's set_id 1 != 3's set_id 3  merge set 3 to set 1
(2, 5)  2's set_id 1 == 5's set_id 1  continue
(3, 4)  3's set_id 1 != 4's set_id 4  merge set 4 to set 1
(4, 5)  4's set_id 1 == 5's set_id 1  continue
(1, 3)  1's set_id 1 == 3's set_id 1  continue
This process produces [(1, 2), (3, 5), (2, 3), (3, 4)] as the edges of the final MST. We can have slightly better performance if we stop iterating through edges once we have selected |V| − 1 of them. The implementation is as simple as:
from typing import Dict

def kruskal(g: Dict):
    # g is a dict with node: adjacent (node, weight) pairs
    vertices = list(g.keys())
    n = len(vertices)
    ver_idx = {v: i for i, v in enumerate(vertices)}

    # Initialize a disjoint set
    ds = DisjointSet(n)

    # Collect and sort all edges
    edges = []
    for u in vertices:
        for v, w in g[u]:
            if (v, u, w) not in edges:
                edges.append((u, v, w))
    edges.sort(key=lambda x: x[2])

    # Main section
    A = []
    for u, v, w in edges:
        if ds.find(ver_idx[u]) != ds.find(ver_idx[v]):
            ds.union(ver_idx[u], ver_idx[v])
            print(f'{u} -> {v}: {w}')
            A.append((u, v, w))
    return A
For the exemplary graph, we denote a weighted edge as a (key, value) pair, where the value is a tuple of two items: the first being the other endpoint from the key vertex and the second being the weight of the edge. The graph is thus represented by a dictionary, {1: [(2, 2), (3, 12)], 2: [(1, 2), (3, 4), (5, 5)], 3: [(1, 12), (2, 4), (4, 6), (5, 3)], 4: [(3, 6), (5, 7)], 5: [(2, 5), (3, 3), (4, 7)]}. Running kruskal on this dictionary returns the following edges:
[(1, 2, 2), (3, 5, 3), (2, 3, 4), (3, 4, 6)]
Complexity Analysis The sorting takes O(|E| log |E|) time. The cost of checking each edge's set id and merging two trees into one is decided by the complexity of the disjoint set; it can range from O(log |V|) to O(|V|) per operation. Therefore, the time complexity is bounded by the sorting time, i.e., O(|E| log |E|).
Figure 20.9: A cut, denoted with the red curve, partitions V into {1, 2, 3} and {4, 5}.
Figure 20.10: Prim's Algorithm: at each step, we manage the cross edges.
Implementation
One key step is to track all valid cross edges and to be able to select the minimum edge from the set. Naturally, we use a priority queue pq. It can be organized in two ways:
• Priority Queue by Edges: considering the set S as a frontier set, pq maintains all edges expanded from the frontier set.
Priority Queue by Edges For the example shown in Fig. 20.10, at first the frontier set has only vertex 1, so we have edges (1, 2), (1, 3) in pq. Once edge (1, 2) is popped out, as it has the smallest weight, we explore all outgoing edges of vertex 2 to nodes in V − S, adding (2, 3), (2, 5) to pq, resulting in pq = (2, 3), (2, 5), (1, 3). Then we pop edge (2, 3) and explore the outgoing edges of vertex 3, adding (3, 4), (3, 5), with pq = (2, 5), (1, 3), (3, 4), (3, 5). At this moment, we can see that edge (1, 3) is no longer a cross edge. Therefore, whenever we are about to add the light edge into the expanding tree, we check if both of its endpoints are already in set S. If so, we skip this edge and use the next valid light edge. Repeating this process gets us the set of edges A forming an MST. The Python code is as:
import queue

def _get_light_edge(pq, S):
    while not pq.empty():
        # Pick the light edge
        w, u, v = pq.get()
        # Filter out non-cross edges
        if v not in S:
            S.add(v)
            return (u, v, w)
    return None

def prim(g):
    cur = 1
    n = len(g.items())
    S = {cur}  # spanning tree set
    pq = queue.PriorityQueue()
    A = []

    while len(S) < n:
        # Expand edges for the exploring vertex
        for v, w in g[cur]:
            if v not in S:
                pq.put((w, cur, v))

        le = _get_light_edge(pq, S)
        if le:
            A.append(le)
            cur = le[1]  # set the exploring vertex
        else:
            print(f'Graph {g} is not connected.')
            break
    return A
When enqueuing an edge with pq.put((w, cur, v)), we use a 3-item tuple: the edge cost first, then the endpoint in set S, then the endpoint in V − S, to align with the fact that PriorityQueue() uses the first item of a tuple as the key for sorting. The while loop is similar to our breadth-first search and terminates in either of two conditions:
• when the set S is as large as the set V, checked via the size of S;
• when we cannot find a light edge, which happens when the graph is not connected.
nodes that are still in V − S to see if we are able to find an even "lighter" edge. Applying this process on the given example:
1. First, we have the start vertex 1 with the smallest cost; pop it out and explore edges (1, 2), (1, 3), resulting in (a) modifying tasks 2 and 3's costs to 2 and 12, respectively, and (b) setting 2 and 3's predecessor to 1.
2. Pop out vertex 2, explore edges (2, 3), (2, 5), resulting in (a) modifying tasks 3 and 5's costs to 4 and 5, respectively, and (b) setting 3 and 5's predecessor to 2.
3. Pop out vertex 3, explore edges (3, 5), (3, 4), resulting in (a) modifying tasks 5 and 4's costs to 3 and 6, respectively, and (b) setting 5 and 4's predecessor to 3.
4. Pop out vertex 5, explore edge (5, 4): since the new cross edge (5, 4) has a larger cost than the previously recorded cross edge reaching vertex 4, vertex 4 in the queue is not modified.
5. Pop out vertex 4; with no more new edges to expand, terminate the program.
This process results in exactly the same MST as the implementation by edges. However, it adds an additional challenge to the implementation of the priority queue: we have to modify an enqueued item's record during the life cycle of the queue. In the Python implementation, we use our customized PriorityQueue() from Section ?? (also included in the notebook). The main process of the algorithm is:
def prim2(g):
    n = len(g.items())
    pq = PriorityQueue()
    A = []
    # Initialization
    # task: vertex, priority: edge cost, info: predecessor vertex
    for i in range(n):
        pq.add_task(task=i + 1, priority=float('inf'), info=None)

    S = {1}
    pq.add_task(1, 0, info=1)

    while len(S) < n:
        u, p, w = pq.pop_task()
        if w == float('inf'):
            print(f'Graph {g} is not connected.')
            break
        A.append((p, u, w))
        S.add(u)
        for v, w in g[u]:
            if v not in S and w < pq.entry_finder[v][0]:
                pq.add_task(v, w, u)

    return A
25
Examples
1. 1584. Min Cost to Connect All Points (medium)
2. 1579. Remove Max Number of Edges to Keep Graph Fully Traversable
(hard)
The weight of a path $p = \langle v_0, v_1, \dots, v_k \rangle$ is the sum of its edge weights, $w(p) = \sum_{i=1}^{k} w(v_{i-1}, v_i)$. The shortest-path problem between $v_i$ and $v_j$ is to find the shortest-path weight $\sigma(v_i, v_j)$ along with the shortest path $p$:

$$\sigma(v_i, v_j) = \begin{cases} \min\{w(p) : v_i \xrightarrow{p} v_j\} & \text{if there is a path from } v_i \text{ to } v_j \\ \infty & \text{otherwise} \end{cases} \qquad (20.2)$$
For example, for the graph shown in Fig. 20.12, the shortest-path weight and its corresponding shortest path between s and any other vertex in V are listed as:

(source, target)  shortest-path weight  shortest path
(s, s)            0                     s
(s, y)            7                     (s, y)
(s, x)            4                     (s, y, x)
(s, t)            2                     (s, y, x, t)
(s, z)            -2                    (s, y, x, t, z)
the cycle (t, x, t). Because the cycle has a positive path weight 5 + (−2) = 3, the path (s, t) remains shorter than any path that includes the cycle. However, if we swap the weight of edge (t, x) with that of (x, t), then the same cycle (t, x, t) has negative path weight (−5) + 2 = −3; repeating the cycle within the path infinitely gives a cost of −∞. Therefore, for a graph where the weights can be both negative and positive, one requirement posed on a single-source shortest-path algorithm, recursive or iterative, is to detect any negative-weight cycle that is reachable from the source. Once we rule out all negative-weight cycles, the remainder of the algorithm can focus only on shortest paths of at most |V| − 1 edges, and the resulting shortest paths will contain neither negative- nor positive-weight cycles.
To obtain all possible paths, we call the function all_paths() with the following code:

g = {
    't': [('x', 5), ('y', 8), ('z', -4)],
    'x': [('t', -2)],
    'y': [('x', -3), ('z', 9)],
    'z': [('x', 7)],
    's': [('t', 6), ('y', 7)],
}
ans = []
all_paths(g, 's', ['s'], 0, ans)
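all_paths itself is not shown here; a hypothetical sketch matching the call signature above, which enumerates every simple path from the source via DFS, is:

def all_paths(g, u, path, cost, ans):
    ans.append((list(path), cost))        # record the path reaching u and its cost
    for v, w in g.get(u, []):
        if v not in path:                 # keep paths simple
            path.append(v)
            all_paths(g, v, path, cost + w, ans)
            path.pop()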
Figure 20.13: All paths from source vertex s for graph in Fig. 20.12 and its
shortest paths.
to any other vertex from this result, which is shown on the right side of Fig. 20.13. All possible paths starting from the source vertex can be viewed as a tree, and the shortest paths from the source to all other vertices within the graph form a subtree of it, known as the shortest-paths tree. Formally, a shortest-paths tree rooted at s is a directed subgraph G′ = (V′, E′), where V′ ⊆ V and E′ ⊆ E, such that:
1. V′ is the set of vertices reachable from s in G,
2. for each v ∈ V′, the unique simple path from s to v in G′ is a shortest path from s to v in G.
Optimization
As we see, the shortest-path problem is a truly combinatorial optimization problem, making it one of the best demonstration examples of the algorithm design principles: Dynamic Programming and Greedy Algorithms. On the other hand, depending on the characteristics of the target graph, whether it is dense or sparse, a directed acyclic graph (DAG) or not, we can further optimize efficiency beyond the design principle. However, in this chapter, we focus on the gist: how to solve all-pairs shortest-path problems with dynamic programming.
First, we use an adjacency matrix to represent our weight matrix W of size |V| × |V|. In the process, we track the shortest-path weight estimate D and additionally the predecessor matrix Π. Both D and Π are of the same size as W. wij indicates the weight of the edge with start point i and endpoint j:

$$W(i, j) = \begin{cases} 0 & \text{if } i = j \\ w_{ij} & \text{if } i \neq j \text{ and } (i, j) \in E \\ \infty & \text{if } i \neq j \text{ and } (i, j) \notin E \end{cases} \qquad (20.3)$$
With this definition, we show a simple directed graph in Fig. 20.14 along with its W.
Figure 20.14: The simple graph and its adjacency matrix representation.
path between a and d found so far. First, we define the subproblem as the shortest path between a and d with maximum path length (MPL) m. With this definition, we show two possible ways of dividing the subproblem:

$$D^m(a, d) = \min\Big(D^{m/2}(a, d),\; \min_x\big(D^{m/2}(a, x) + D^{m/2}(x, d)\big)\Big) \qquad (20.5)$$

$$D^k(i, j) = \min\Big(D^{\{0,\dots,k-1\}}(i, j),\; D^{\{0,\dots,k-1\}}(i, k) + D^{\{0,\dots,k-1\}}(k, j)\Big) \qquad (20.6)$$
As we see, each recurrence update takes only constant time. At the end, after we have considered all possible intermediate nodes, we reach the optimal solution. This approach gives the best time complexity so far, O(|V|^3). We demonstrate the update process in Fig. 20.17. At pass C, using C as the intermediate node, we end up using only the C-th row and C-th column to update our matrix.
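A minimal sketch of this recurrence (a plain Floyd-Warshall, assuming W is the |V| × |V| weight matrix from Eq. 20.3):

def floyd_warshall(W):
    n = len(W)
    D = [row[:] for row in W]
    for k in range(n):              # allow vertex k as an intermediate node
        for i in range(n):
            for j in range(n):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D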
As we shall see later, the first way is similar to Bellman-Ford, the second is a repeated-squaring version of Bellman-Ford, and the third is the Floyd-Warshall algorithm.
For the graph in Fig. 20.12, the updating on D using s as source is visualized
in Fig. 20.18. Connecting all red arrows along with the shaded gray nodes,
Figure 20.18: The update on D for Fig. 20.12. The gray filled spots mark the nodes whose estimate values were updated, with each predecessor indicated by an incoming red arrow.
we have a tree structure: with each update on D, we expand the tree by one more level, updating the best estimate of reaching each target node with one more possible edge. We visualize this tree structure in Fig. 20.19. We read the tree like this: if we are at most one edge away from s, we get t as small as 6; if we are three edges away, t is able to gain a smaller value through its predecessor x, which is at most 2 edges away. After the last round of updates, when the tree reaches height |V| − 1, the predecessor vector Π gives out the shortest-paths tree: each edge of the shortest-paths tree is obtained by connecting each vertex with its predecessor. The shortest-paths tree is marked in red in Fig. 20.19.
Figure 20.19: The tree structure indicates the updates on D, and the short-
est path tree marked by red arrows.
def bellman_ford(g, s):  # hypothetical header; the original first lines are lost in typesetting
    n = len(g)
    # Assign a numerical index for each key
    V = g.keys()
    # Key to index
    ver2idx = dict(zip(V, [i for i in range(n)]))
    # Index to key
    idx2ver = dict(zip([i for i in range(n)], V))
    # Initialize the distance estimates and predecessors
    si = ver2idx[s]
    D = [float('inf') if i != si else 0 for i in range(n)]
    P = [None] * n

    # n-1 passes
    for i in range(n - 1):
        # Relax all edges
        for u in V:
            ui = ver2idx[u]
            for v, w in g[u]:
                vi = ver2idx[v]
                # Update the minimum path value and predecessor
                if D[vi] > D[ui] + w:
                    D[vi] = D[ui] + w
                    P[vi] = ui
        print(f'D{i+1}: {D}')
    return D, P, ver2idx, idx2ver
During each pass, we relax the estimates D with the following ordering of edges:

's': [('t', 6), ('y', 7)],
't': [('x', 5), ('y', 8), ('z', -4)],
'x': [('t', -2)],
'y': [('x', -3), ('z', 9)],
'z': [('x', 7)],
Printing out the updates of D, we can see that it converges to the optimal values faster than the previous strict dynamic programming version.
1. a special linear ordering of vertices whose leaving edges we relax, which leads us to the shortest paths in just one pass of the Bellman-Ford algorithm,
2. and some greedy approach that takes only one pass of relaxation, similar to breadth-first graph search or Prim's algorithm.
In Fig. 20.18, suppose we relax the leaving edges of vertices in the linear order [s, t, y, z, x]; the process is as follows:

vertex  edges relaxed           vertices
s       (s, t), (s, y)          {t: 6, y: 7}
t       (t, x), (t, y), (t, z)  {x: 11, z: 2, t: 6, y: 7}
y       (y, x), (y, z)          {x: 4, z: 2, t: 6, y: 7}
z       (z, x)                  {x: 4, z: 2, t: 6, y: 7}
x       (x, t)                  {t: 2, x: 4, z: 2, y: 7}
The process is also visualized in Fig. 20.20. We see that only vertex z did not find its shortest-path weight. Why? From s to z, there are paths (s, t, z), (s, y, z), (s, t, y, z), (s, y, x, t, z). If we want to make sure that after one pass of updates vertex z reaches its minimum shortest-path weight, we have to make sure its predecessors, vertices y and t, all reach their minimum path weights first, and the same rule applies to their predecessors. In this graph, the predecessors are:
vertex  predecessor
s       None
t       s, x
y       s, t
x       t, y, z
z       y, t
From the listing, we see that the pair t and x conflict with each other: t needs x as a predecessor and x needs t as a predecessor. Tracking down this clue, we find that it is due to the fact that t and x coexist in a cycle.
Order Vertices with Topological Sort Taking away edge (x, t), we are able to obtain a topological ordering of the vertices, which is [s, t, y, z, x]. Relaxing the leaving edges of vertices in this order guarantees reaching the globally optimal shortest-path weights that would otherwise take |V| − 1 passes of Bellman-Ford with an arbitrary ordering of vertices. The shortest-paths tree is shown in Fig. 20.21.
So far, we have discovered an O(|V| + |E|) linear algorithm for the single-source shortest-path problem when the given graph is directed, weighted, and acyclic. The algorithm consists of two steps: topological sorting of the vertices in G, and one pass of the Bellman-Ford algorithm using the reordered vertices instead of an arbitrary ordering. Calling the topo_sort function from Section 20.2, we have our Python code:
def bellman_ford_dag(g, s):
    n = len(g)
    # Key to index
    ver2idx = dict(zip(g.keys(), [i for i in range(n)]))
    # Index to key
    idx2ver = dict(zip([i for i in range(n)], g.keys()))
    # Convert g to indices
    ng = [[] for _ in range(n)]
    for u in g.keys():
        for v, _ in g[u]:
            ui = ver2idx[u]
            vi = ver2idx[v]
            ng[ui].append(vi)
    V = topo_sort(ng)
    # Initialize the dp array with distance estimate and predecessor
    si = ver2idx[s]
    dp = [(float('inf'), None) for i in range(n)]
    dp[si] = (0, None)

    # Relax all edges
    for ui in V:
        u = idx2ver[ui]
        for v, w in g[u]:
            vi = ver2idx[v]
            # Update the minimum path value and predecessor
            if dp[vi][0] > dp[ui][0] + w:
                dp[vi] = (dp[ui][0] + w, ui)
    return dp
the queue. There are two ways to apply the priority queue:
• Add all vertices into the queue at once at the beginning. Then only dequeue and modification operations are needed.
• Add a vertex to the queue only when it has been relaxed and has a non-∞ shortest-path estimate. The process of Dijkstra's algorithm on a non-negatively weighted graph using this approach is demonstrated in Fig. 20.22, and the code is as follows:
def dijkstra(g, s):
    Q = PriorityQueue()
    S = []
    # task: vertex id, priority: shortest-path estimate, info: predecessor
    Q.add_task(task=s, priority=0, info=None)
    visited = set()
    while not Q.empty():
        # Use the light vertex
        u, up, ud = Q.pop_task()
        visited.add(u)
        S.append((u, ud, up))

        # Relax adjacent vertices
        for v, w in g[u]:
            # Already found the shortest path for this id
            if v in visited:
                continue

            vd, vp = Q.get_task(v)
            # First time to add the task, or already in the queue but needs an update
            if not vd or ud + w < vd:
                Q.add_task(task=v, priority=ud + w, info=u)
    return S
• and if the graph is acyclic (and thus free of negative-weight cycles), we can run one pass of the Bellman-Ford algorithm with the vertices relaxed in a topologically sorted linear ordering.
Depending on which category the given graph G falls into, a naive and natural solution to the all-pairs shortest-path problem is to run the corresponding single-source algorithm |V| times, once for each vertex viewed as the source, scaling the complexity by |V|.
with initialization:

$$D^0(i, j) = \begin{cases} 0 & \text{if } i = j, \\ \infty & \text{otherwise.} \end{cases} \qquad (20.12)$$
2. For every pair of vertices i and j, we update d and π using the recurrence relations in Eq. 20.10 and Eq. 20.13, respectively, for |V| − 1 passes.
3. Run the |V|-th pass to decide whether any negative-weight cycle exists in each rooted shortest-path tree.
The L matrix has all zeros along the diagonal; in this case it is:

[[0,    2,  4,  7, -2],
 [inf,  0,  3,  8, -4],
 [inf, -2,  0,  6, -6],
 [inf, -5, -3,  0, -9],
 [inf,  5,  7, 13,  0]]
$$L^1 = L^0 \cdot W = W, \quad L^2 = L^1 \cdot W = W^2, \quad L^3 = L^2 \cdot W = W^3, \quad \dots, \quad L^{n-1} = L^{n-2} \cdot W = W^{n-1} \qquad (20.14)$$

With repeated squaring, we instead compute:

$$L^1 = W, \quad L^2 = W \cdot W, \quad L^4 = W^2 \cdot W^2, \quad \dots \qquad (20.15)$$
In this chapter, we extend the data structures learned in the first part with more advanced ones. These data structures are not as widely used as the basic ones; however, they often appear in implementations of more advanced algorithms, or they can be more efficient than algorithms relying on a more basic version.
The process of the monotone decreasing stack is shown in Fig. 21.1. Sometimes we can relax the strict monotonic condition and allow the stack or queue to have repeated values.
To get the features of the monotonic stack, take [5, 3, 1, 2, 4] as an example; if it is increasing:
• Popping out to get the smaller/larger item to the right: when we pop an element out, the incoming item is, for the kicked-out item, its nearest smaller or larger item to the right. For example, in step 2 of the increasing stack, 3 forces 5 to be popped out; for 5, 3 is the first smaller item to its right. Therefore, if an item is popped out, the current item that is about to be pushed in is (1) for an increasing stack, the nearest smaller item to its right, and (2) for a decreasing stack, the nearest larger item to its right. In this case, we get [3, 1, -1, -1, -1] and [-1, 4, 2, 4, -1], respectively.
            firstLargerToLeft[i] = A[stack[-1]]
        stack.append(i)
    return firstLargerToLeft, firstLargerToRight, stack
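Only the tail of that routine survives above; a hypothetical completion consistent with its names and return values is:

def first_larger(A):
    n = len(A)
    firstLargerToLeft = [-1] * n
    firstLargerToRight = [-1] * n
    stack = []  # a decreasing stack of indices
    for i in range(n):
        while stack and A[stack[-1]] < A[i]:
            # A[i] kicks out smaller items: it is their nearest larger item to the right
            firstLargerToRight[stack.pop()] = A[i]
        if stack:
            # the stack top is the nearest larger item to the left of A[i]
            firstLargerToLeft[i] = A[stack[-1]]
        stack.append(i)
    return firstLargerToLeft, firstLargerToRight, stack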
For the above problem, if we do it with brute force, we use one for loop to point at the current element and another nested for loop to look for the first element that is larger than the current one, which gives us O(n^2) time complexity. If we think about the BCR and trade space for efficiency by using a monotone stack instead, we gain O(n) linear time at O(n) space complexity.
The monotone stack is especially useful in subarray problems where we need to find the first smaller/larger item to the left/right side of an item in the array. To better understand the features and applications of the monotone stack, let us look at some examples. First, we recommend the audience to practice on the obvious applications listed in the LeetCode Problems section before moving on to the examples:
There is one problem that is pretty interesting:
Window position           Max
---------------           ---
[1  3  -1] -3  5  3  6  7   3
 1 [3  -1  -3] 5  3  6  7   3
 1  3 [-1  -3  5] 3  6  7   5
 1  3  -1 [-3  5  3] 6  7   5
 1  3  -1  -3 [5  3  6] 7   6
 1  3  -1  -3  5 [3  6  7]  7
Analysis: In the process of moving the window, any item that is smaller than a later item within the window will never be the max; therefore, we can use a decreasing stack to remove such troughs. If the window size is the same as the array's, then the maximum value is the first element in the stack (the bottom). With the sliding window, we record the max at each iteration once the window size reaches k, and at each iteration we may need to remove the out-of-window item from the stack. For the example [5, 3, 1, 2, 4] with k = 3, the stack holds [5, 3, 4] over time: at step 3 we get 5; at step 4 we remove 5 from the stack and get 3; at step 5 we remove 3 if it is in the stack and get 4. With the monotone stack, we decrease the time complexity from O(kn) to O(n).
import collections

def maxSlidingWindow(self, nums, k):
    ds = collections.deque()
    ans = []
    for i in range(len(nums)):
        while ds and nums[i] >= nums[ds[-1]]:
            ds.pop()
        ds.append(i)
        if i >= k - 1:
            ans.append(nums[ds[0]])  # append the current maximum
        if i - k + 1 == ds[0]:
            # if the front (also the maximum) is about to fall out of the window, pop it
            ds.popleft()
    return ans
Input: [3, 1, 2, 4]
Output: 17
Explanation: Subarrays are [3], [1], [2], [4], [3, 1], [1, 2], [2, 4], [3, 1, 2], [1, 2, 4], [3, 1, 2, 4].
Minimums are 3, 1, 2, 4, 1, 1, 2, 1, 1, 1. Sum is 17.
If there are duplicates, such as in [3, 1, 4, 1], then for the first 1 we count the subarrays [3, 1], [1], [1, 4], [1, 4, 1], and for the second 1 we count [4, 1], [1] instead. Therefore, we set the right span to find the >= item. The problem is now converted to finding the first smaller item on the left side and the first smaller-or-equal item on the right side. From the features we drew above, we need an increasing stack: from the pushing in, we find the first smaller item to the left, and from the popping out, the current item is, for each popped item, the first smaller item on its right side. The code is as:
def sumSubarrayMins(self, A):
    n, mod = len(A), 10**9 + 7
    left, s1 = [1] * n, []
    right = [n - i for i in range(n)]
    for i in range(n):  # find first smaller to the left from pushing in
        while s1 and A[s1[-1]] > A[i]:  # can be equal
            index = s1.pop()
            right[index] = i - index  # kicked out
        if s1:
            left[i] = i - s1[-1]
        else:
            left[i] = i + 1
        s1.append(i)
    return sum(a * l * r for a, l, r in zip(A, left, right)) % mod
only if we get the same answer both times when we ask for the representative twice without modifying the set. Choosing the smallest member in a set as the representative is an example of a prespecified rule. According to its typical applications, such as implementing Kruskal's minimum spanning tree algorithm and tracking connected components dynamically, a disjoint set should support the following operations:
1. make_set(x): create a new set whose only member is x. To keep the sets disjoint, this member should not already be in some existing set.
2. union(x, y): unite the two dynamic sets that contain x and y, say Sx and Sy, into a new set that is the union of the two. In practice, we merge one set into the other, say Sy into Sx, and then remove/destroy Sy. This is more efficient than creating a new union set and destroying the other two.
If our coding is right, each item must already belong to a set when the find_set function is called; if not, we call make_set first. Each existing set has at least one item. For the union function, we merge the set that has fewer items into the one with more items.
class DisjointSet():
    '''Implement a basic disjoint set'''
    def __init__(self, items):
        self.n = len(items)
        # At first each set has only one item; item -> set id
        self.item_set = dict(zip(items, [i for i in range(self.n)]))
        # set id -> list of items; each item always belongs to exactly one set
        self.set_item = dict(zip([i for i in range(self.n)], [[item] for item in items]))

    def make_set(self, item):
        '''make a set for a new incoming item'''
        if item in self.item_set:
            return

        self.item_set[item] = self.n
        self.set_item[self.n] = [item]  # register the new singleton set as well
        self.n += 1

    def find_set(self, item):
        if item in self.item_set:
            return self.item_set[item]
        else:
            print('not in the set yet:', item)
            return None

    def union(self, x, y):
        id_x = self.find_set(x)
        id_y = self.find_set(y)
        if id_x == id_y:
            return

        sid, lid = id_x, id_y
        if len(self.set_item[id_x]) > len(self.set_item[id_y]):
            sid, lid = id_y, id_x
        # Merge the items of the smaller set sid into the larger set lid
        for item in self.set_item[sid]:
            self.item_set[item] = lid
        self.set_item[lid] += self.set_item[sid]
        del self.set_item[sid]
        return
Complexity For n items, we spend O(n) time initializing the two hashmaps. With the help of the hashmap, find_set takes only O(1) time; accumulated over n calls this gives O(n). The union function takes more effort to analyze. From another angle, an item x only updates its set id when its set is merged into another set x1. After the first such update, the resulting set x1 has at least two items. The second update merges x1 into some x2, and because the smaller set is always merged into the larger one, the resulting set x2 has at least 4 items. Continuing this way up to k updates, and because a set can have at most n items, each item sees at most log n updates. Over n items, this makes the upper bound for all unions O(n log n).
However, our implementation has an additional cost inside union, where we merge the lists. This cost can easily be made constant by using a linked list. Even with Python lists, there are different ways to concatenate one list onto another:
1. Use the + operator: the time complexity of concatenating two lists A and B is O(len(A) + len(B)). This is because you aren't adding to one list but instead creating a whole new list and populating it with elements from both A and B, requiring you to iterate through both.
2. extend(lst): extend does not create a new list but adds to the original, so its cost is O(len(lst)) only, independent of the destination list's length. On the other hand, l += [i] modifies the original list and behaves like extend.
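A quick illustration of the difference, assuming standard CPython list semantics:

A = [1, 2, 3]
B = [4, 5]
C = A + B      # builds a brand-new list: O(len(A) + len(B))
A.extend(B)    # appends in place: O(len(B)); A is now [1, 2, 3, 4, 5]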
that use this structure are not faster than the linked-list version. By introducing two heuristics, "union by rank" and "path compression", we can achieve an asymptotically optimal disjoint-set data structure.
Naive Version
We first create a Node class which stores the item and a parent pointer. An item can be any immutable piece of data carrying the information that represents a node.
class Node:
    def __init__(self, item):
        self.item = item  # save node information
        self.parent = None
We need one dict, item_finder, and one set, sets, to track nodes and sets. From item_finder we map (item, node) to find the node, and from the node we can further find its set representative or execute a union operation. sets tracks all the representative nodes; when we union two sets, the one merged into the other is deleted from sets. In this naive version, make_set creates a tree with only one node. find_set starts from the node and traverses all the way back to its final parent, reached when node.parent == node. A union operation simply points one tree's root node to the root of the other through parent. The code is as follows:
class DisjointSet():
    '''Implement with disjoint-set forest'''
    def __init__(self, items):
        self.n = len(items)
        self.item_finder = dict()
        self.sets = set()  # sets will have only the parent nodes

        for item in items:
            node = Node(item)
            node.parent = node
            self.item_finder[item] = node  # from item we can find the node
            self.sets.add(node)

    def make_set(self, item):
        '''make set for a new incoming item'''
        if item in self.item_finder:
            return

        node = Node(item)
        node.parent = node
        self.item_finder[item] = node
        self.sets.add(node)
        self.n += 1

    def find_set(self, item):
        # from item -> node -> parent to the set representative
        if item not in self.item_finder:
            print('not in the set yet:', item)
            return None
        node = self.item_finder[item]
        while node.parent != node:
            node = node.parent
        return node

    def union(self, x, y):
        node_x = self.find_set(x)
        node_y = self.find_set(y)
        if node_x.item == node_y.item:
            return

        # Point the root of one tree to the root of the other:
        # merge x to y
        node_x.parent = node_y
        # Remove one set
        self.sets.remove(node_x)
        return

    def __str__(self):
        ans = ''
        for root in self.sets:
            ans += 'set: ' + str(root.item) + '\n'
        return ans

    def print_set(self, item):
        if item in self.item_finder:
            node = self.item_finder[item]
In the above implementation, both make_set and union take O(1) time. The main cost is incurred in find_set, which traverses a path from a node to its root. If we assume each tree in the disjoint-set forest is balanced, the upper bound of this operation is O(log n). However, if a tree degenerates into a linear linked list, the time complexity goes to O(n). This puts the total time complexity between O(n log n) and O(n^2).
Heuristics
Union by Rank As we have seen from the above example, a sequence of n − 1 union operations may create a tree that is just a linear chain of n nodes. Union by rank, which is similar to the weighted-union heuristic we used with the linked-list implementation, is applied to avoid this worst case. For each node, besides the parent pointer, we add a rank to track an upper bound on the height of the node (the number of edges in the longest simple path between the node and a descendant leaf). In union by rank, we make the root with smaller rank point to the root with larger rank.
In the initialization, and make_set operation, a single noded tree has an
initial rank of 0. In union(x, y), there will exist three cases:
Case 1 x . rank == y . rank :
j o i n x to y
y . rank += 1
Case 2 : x . rank < y . rank :
21.2. DISJOINT SET 487
j o i n y to x
x . rank += 1
Case 3 : x . rank > y . rank :
j o i n y to x
x ' s rank s t a y unchanged
Now, with rank added to the node, we modify the naive implementation:

class Node:
    def __init__(self, item):
        self.item = item  # save the node information
        self.parent = None
        self.rank = 0
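The modified union is not shown at this point in the text; a minimal
sketch that applies the three cases above to the two roots returned by
find_set could look like:

def union(self, x, y):
    node_x, node_y = self.find_set(x), self.find_set(y)
    if node_x is node_y:
        return
    # make the root with smaller rank point to the root with larger rank
    if node_x.rank > node_y.rank:
        node_x, node_y = node_y, node_x
    node_x.parent = node_y       # node_x now has the smaller (or equal) rank
    if node_x.rank == node_y.rank:
        node_y.rank += 1         # only a tie increases the surviving root's rank
    self.sets.remove(node_x)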
For example, after items 0, 2, 3, and 4 are unioned into the set
represented by item 1, printing the structure and each item's path gives:

set: 1
0 ->1 ->
1 ->
2 ->1 ->
3 ->1 ->
4 ->1 ->
21.4 Exercises
21.4.1 Knowledge Check
21.4.2 Coding Practice
Disjoint Set
Figure 22.1: The process of the brute force exact pattern matching
For LeetCode problems, most of the time a brute-force solution will not be
accepted and will receive TLE (Time Limit Exceeded). In real applications,
such as human genome matching, the text can have an approximate size of
3 × 10⁹ and the pattern can be very long too, such as 10⁸. Therefore,
other, faster algorithms are needed to improve the efficiency.
The other algorithms require us to preprocess the pattern, the text, or
both. In this book, we mainly discuss three algorithms:
the pattern itself, we know 'a' will match 'a', and beyond that we do not
have enough information; therefore step 4 is necessary to compare 'c' with
'b' in the pattern. In this example, steps 4, 5, 6, 7 are all needed, but
steps 4, 5, and 6 each end up doing only one or two comparisons.
The reason why steps 2 and 3 can be skipped is shown in Fig. 22.2. If we
analyze the pattern first, we know that at steps 2 and 3, "bra" does not
equal "abr" and "ra" does not equal "ab", while at step 4 we do have "a"
equal to "a". If we look further at the relation of these pairs, we see
that they are a suffix and a prefix of the same length of the pattern.
Inspired by this, we define a border of string S as a prefix of S which
equals a suffix of S of the same length, but is not equal to the whole of
S. For example:
'a'  is a border of 'arba'
'ab' is a border of 'abcdab'
'ab' is not a border of 'ab'
def naiveLps(p: str):
    dp = [0] * len(p)
    for i in range(1, len(p)):
        for l in range(i, 0, -1):  # from maximum length down to length 1
            prefix = p[0:l]
            suffix = p[i-l+1:i+1]
            if prefix == suffix:
                dp[i] = l
                break
    return dp
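Run an example; it agrees with the linear prefix function shown later:

p = 'abcabcd'
print(naiveLps(p))
# output
# [0, 0, 0, 1, 2, 3, 0]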
function in linear time, we first need two properties (facts) that enable
two further optimizations:
2. Lemma: if l(i) > 0, then all borders of P[0...i] except the longest one
are also borders of P[0...l(i) − 1]. Proof sketch: as shown in Fig. 22.4,
l(i) is the longest border of P[0...i]. Let µ be another, shorter border
of P[0...i] with |µ| < l(i). Since the first l(i) characters equal the
last l(i) characters, the prefix µ also appears as a suffix of
P[0...l(i) − 1]; hence µ is a border of P[0...l(i) − 1] as well.
Now, with this knowledge we can make the following two further
optimizations:
2. With 2, we can get rid of the O(n) string comparison at each step. To
accomplish this, we use all the information computed in previous steps:
all borders of P[0...i] (assume there are k of them in total) can be
enumerated from the longest to the shortest as b_0 = π(i),
b_1 = π(b_0 − 1), ..., b_{k−1} = π(b_{k−2} − 1) (with b_{k−1} = 0).
Therefore, at the step positioned at i + 1, instead of comparing the
string s[0...π(i)] with s[i − (π(i) − 1)...i], only a comparison of the
characters s[π(i)] and s[i] is needed.
def prefix_function(s):
    n = len(s)
    pi = [0] * n
    for i in range(1, n):
        # compute l(i)
        j = pi[i-1]
        # try all borders of s[0...i-1], from the longest to the shortest
        while j > 0 and s[i] != s[j]:
            j = pi[j-1]
        # check the character
        if s[i] == s[j]:
            pi[i] = j + 1
    return pi
Run an example:

S = 'abcabcd'
print('The prefix function of:', S, 'is', prefix_function(S))

# The prefix function of: abcabcd is [0, 0, 0, 1, 2, 3, 0]
1. For all i, π[i] ≤ m, because the '$' placed between the pattern and the
text acts as a separator that never matches any other character.
Because π[i] ≤ m for all i: for i in [0, m−1] we save the border in π; for
i in [m, n+m−1] we only keep a variable j that tracks the last border.
This decreases the space complexity to O(m). The Python implementation is
given as:
def KMP(p, t):
    m = len(p)
    s = p + '$' + t
    n = len(s)
    pi = [0] * m
    j = pi[0]
    ans = []
    for i in range(1, n):
        # compute l(i):
        # try all borders of s[0...i-1], from the longest to the shortest
        while j > 0 and s[i] != s[j]:
            j = pi[j-1]
        # check the character
        if s[i] == s[j]:
            j += 1
        # record the result
        if j == m:
            ans.append(i - 2*m)
        # save the border if i in [0, m-1]
        if i < m:
            pi[i] = j
    return ans
Run an example:

t = 'textbooktext'
p = 'text'
print(KMP(p, t))
# output
# [0, 8]
The same search can also be implemented without concatenating the pattern
and the text, by keeping one index in the text and one in the pattern and
falling back with the LPS (failure) function of the pattern alone:

def KMP(p, s):
    f = LPS(p)  # failure function of the pattern, i.e. prefix_function(p) above
    m, n = len(p), len(s)

    i = 0  # index in s
    j = 0  # index in p
    ans = []
    while i < n:
        if p[j] == s[i]:
            i += 1
            j += 1
            if j == m:  # a full match ends at i-1
                print("Found pattern at index", i - j)
                ans.append(i - j)
                j = f[j-1]  # fall back so overlapping matches are found
        else:  # mismatch at i and j
            if j != 0:  # if j can retreat with lps, then i stays the same
                j = f[j-1]
            else:
                i += 1  # if j needs to start over, i moves too
    return ans

# the test inputs are omitted in the original; this classic pair
# reproduces the printed result
s = 'aabaacaadaabaaba'
p = 'aaba'
print(KMP(p, s))
# [0, 9, 12]
Compressing a string
22.1.3 Z-function
Definition and Implementation
Z-function for a string s of length n is defined as an array z[i] = k, i ∈
[1, n − 1]. At item z[i] = k stores the longest substring starting at index i
which is also a prefix of string s. To notice, the length of the substring has
to be smaller than the whole length, therefore, z[0] = 0. In other words, it
means the the length of the longest common prefix between s and substring
s[i : n]. For example:
Another example:

"aaabaab" - [0, 2, 1, 0, 2, 1, 0]
a    0
a    substring 'aa' = prefix 'aa'
a    substring 'a'  = prefix 'a'
b    0
a    substring 'aa' = prefix 'aa'
a    substring 'a'  = prefix 'a'
b    0
position i, [l, r] is the window of one of its preceding non-zero z[p],
p < i, which has the furthest right boundary r. We can think of it as the
rightmost matched window, wherein s[l, r] = s[0, r − l + 1] (s[0, i − l]
is marked in yellow in the figure). We divide the computation into two
cases: if i ≥ r, we compute z[i] by comparing with the prefix from
scratch; if i < r, we can initialize z[i] = min(z[i − l], r − i) using the
value already computed inside the window, and then try to extend the
match beyond r.
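The code examples below call a linear-time Z-function as linearZF; a
minimal sketch built on the window rule just described (the function name
is taken from those calls):

def linearZF(s):
    n = len(s)
    z = [0] * n
    l, r = 0, 0  # the rightmost matched window [l, r)
    for i in range(1, n):
        if i < r:
            # reuse the value computed inside the window as a lower bound
            z[i] = min(r - i, z[i - l])
        # try to extend the match beyond r
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        # update the rightmost window
        if i + z[i] > r:
            l, r = i, i + z[i]
    return z

print(linearZF('aaabaab'))  # [0, 2, 1, 0, 2, 1, 0]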
Applications
The applications of the Z-function are largely similar to those of the
prefix function, so they are explained only briefly here in comparison.
If you have trouble understanding this section, please read the prefix
function section first.
For single-pattern matching, we concatenate the pattern, a separator, and
the text, and report every position whose Z-value equals the pattern
length (the opening lines are cut off; the function name matchZF is
assumed):

def matchZF(p, t):
    s = p + '$' + t
    m = len(p)
    z = linearZF(s)
    ans = []
    for i, v in enumerate(z):
        if v == m:
            ans.append(i - m - 1)  # position in t: skip the pattern and '$'
    return ans
We know the maximum for dp[i] is i + 1; however, for cases like "aaa", the
situation is different:

subproblem 1: 'a',   dp[0] = 1
subproblem 2: 'aa',  dp[1] = 1, new substring 'aa',  because 'a'_1 == 'a'_0
subproblem 3: 'aaa', dp[2] = 1, new substring 'aaa', because 'a_0a_1' == 'a_1a_2', 'a_2' == 'a_0'

For each subproblem i, we take the string s[0...i] and reverse it. Running
the Z-function on this reversed string tells us how many of its prefixes
occur somewhere else in it; the longest such occurrence is the maximum
value of the Z-function. If max(z) = k, then the suffix of length k of
s[0...i] already occurred earlier, and with this maximum value all shorter
prefixes occur too. Therefore the number of new substrings ending at i is
dp[i] = i + 1 − max(z), and the total time complexity is O(n²).
def distinctSubstrs(s):
    n = len(s)
    if n < 1:
        return 0
    ans = 1  # for dp[0]
    for i in range(1, n):
        reverse_str = s[0:i+1][::-1]
        z = linearZF(reverse_str)
        ans += (i + 1 - max(z))
    return ans
Run an example:

s = 'abab'
print(distinctSubstrs(s))
# output
# 7
2. If frequent queries are made on the same text with different given
patterns, and if m << n, then KMP becomes impractical, since each query
still costs time proportional to the text length.
The solution to this second problem is to preprocess and store the text,
in order to obtain an algorithm whose per-query time depends only on the
length of the pattern. Building a suffix trie of the text is such a
solution.
Suffix Tree If we compress the above suffix trie, we get a suffix tree.
Suffix Array The suffix array is applied further, with the benefit of
saving storage space.
Suffix Tree VS Suffix Array Each data structure has its own pros and
cons. In practice, conversion between the two can be implemented in O(n)
time, so we can construct one first and convert it to the other later.
we have 'ab$' and 'abab$'. At position 2, '$' is smaller than 'a' or any
other character, so 'ab$' is still smaller than 'abab$'. Therefore, adding
this special character does not change the sorted order, and it removes
the prefix special case when comparing two different strings.
Naive solution with O(n² log n) time complexity With this knowledge, we
set s = s + '$', generate all suffixes, and sort them. A comparison-based
sorting algorithm makes O(n log n) comparisons, and each comparison takes
an additional O(n), which makes the total time complexity O(n² log n).
def generateSuffixArray(s):
    s = s + '$'
    n = len(s)
    suffixArray = [None] * n
    # generate all suffixes
    for i in range(n):
        suffixArray[i] = s[i:]
    suffixArray.sort()
    print(suffixArray)
    # save space by storing only the order of the suffixes,
    # which is each suffix's starting index
    for idx, suffix in enumerate(suffixArray):
        suffixArray[idx] = n - len(suffix)
    print(suffixArray)
    return suffixArray
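Run an example on the string used throughout this section:

generateSuffixArray('ababaa')
# ['$', 'a$', 'aa$', 'abaa$', 'ababaa$', 'baa$', 'babaa$']
# [6, 5, 4, 2, 0, 3, 1]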
Cyclic Shifts For our example, starting at position 0 we get the first
cyclic shift 'ababaa$'; at position 1 we get the second cyclic shift
'babaa$a'; and so on until the last position of the string. Now, let us
see what happens if we sort all of the cyclic shifts:
     Cyclic shift    Sorted       Corresponding suffix
0:   ababaa$         $ababaa      $
1:   babaa$a         a$ababa      a$
2:   abaa$ab         aa$abab      aa$
3:   baa$aba         abaa$ab      abaa$
4:   aa$abab         ababaa$      ababaa$
5:   a$ababa         baa$aba      baa$
6:   $ababaa         babaa$a      babaa$
The number of cyclic shifts is the same as the number of suffixes of the
string. Observing the above example, sorting the cyclic shifts gives us
the sorted suffixes if we remove all characters after '$' in each cyclic
shift. This conclusion holds for all strings because '$' is smaller than
all other characters: since '$' sits at a different position in each
cyclic shift, the comparison of two shifts ends as soon as one reaches its
'$', because that shift is then the smaller one. Therefore, the characters
after '$' never affect the sorting. We now know that, with '$' appended,
sorting the cyclic shifts and sorting the suffixes of s are equivalent.
If we can sort the cyclic shifts of the string faster, we obtain a more
efficient suffix-sorting algorithm. One natural choice is radix sort:
first sort the cyclic shifts by the last character using counting sort,
then by the second-to-last character, and so on until the first
character. Sorting the whole array of cyclic shifts on one character
position takes O(n), and we run n rounds, which makes the whole sort
O(n²) time with O(n) space. However, we can improve this further by using
special properties of the cyclic shifts.
Partial Cyclic Shifts Different from the full cyclic shifts, a partial
cyclic shift C_i^L is the cyclic substring of length L starting at index
i. For the above example, the partial cyclic shifts of length 1, 2, and 4
are:

C^7        C^1   C^2   C^4
ababaa$    a     ab    abab
babaa$a    b     ba    baba
abaa$ab    a     ab    abaa
baa$aba    b     ba    baa$
aa$abab    a     aa    aa$a
a$ababa    a     a$    a$ab
$ababaa    $     $a    $aba

Carefully observe the relation between the pairs (C^1, C^2) and
(C^2, C^4): C^1 and the second half of each string in C^2 draw from the
same key set, and the same rule applies to C^2 and the second half of
each string in C^4.
Order and Class The order is the sorted sequence of partial cyclic
shifts, stored by starting index. For example, for C^1 the order is
[6, 0, 2, 4, 5, 1, 3], which represents [$, a, a, a, a, b, b]. The class
is an array in which each item class[i] corresponds to C_i and counts the
number of distinct partial cyclic shifts of the same length that are
strictly smaller than C_i. For 'ababaa$', the class of length 1 is
[1, 2, 1, 2, 1, 1, 0]. The reason to introduce the class is the rule
above: the first and second halves of the doubled partial cyclic shifts
share the same key set, and the class serves as the converted integer key
of the corresponding partial cyclic shift.
def getCharOrder(s):
    n = len(s)
    numChars = 256
    count = [0] * numChars  # totally 256 chars
    order = [0] * n

    # count the occurrence of each char
    for c in s:
        count[ord(c)] += 1

    # prefix sum of each char
    for i in range(1, numChars):
        count[i] += count[i-1]

    # assign from the end down to keep the sort stable
    for i in range(n-1, -1, -1):
        count[ord(s[i])] -= 1
        # put the index into the order instead of the suffix string
        order[count[ord(s[i])]] = i

    return order
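Running it on the example string reproduces the order given above:

print(getCharOrder('ababaa$'))
# [6, 0, 2, 4, 5, 1, 3]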
that the second half of each doubled shift C_i^{2L} is already sorted, we
just need to sort by the first half with a counting sort that uses the
class array of the previous partial cyclic shifts as keys. The time
complexity of this step is O(n) too. The Python implementation of
computing the doubled partial cyclic shifts' order is:
'''It is a counting sort using the class of the first half as the key'''
def sortDoubled(s, L, order, cls):
    n = len(s)
    count = [0] * n
    new_order = [0] * n

    # the key is the class
    for i in range(n):
        count[cls[i]] += 1

    # prefix sum
    for i in range(1, n):
        count[i] += count[i-1]

    # assign from the end down to be stable; sort by the first half
    for i in range(n-1, -1, -1):
        start = (order[i] - L + n) % n  # the start index of the first half
        count[cls[start]] -= 1
        new_order[count[cls[start]]] = start

    return new_order
def buildSuffixArray(s):  # the header is cut off in the original; this name is assumed
    s = s + '$'
    n = len(s)
    order = getCharOrder(s)
    cls = getCharClass(s, order)
    print(order, cls)
    L = 1
    while L < n:
        order = sortDoubled(s, L, order, cls)
        cls = updateClass(order, cls, L)
        print(order, cls)
        L *= 2

    return order
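The helpers getCharClass and updateClass are called above but not shown in
this part of the text; minimal sketches consistent with the order/class
definitions could look like:

def getCharClass(s, order):
    n = len(s)
    cls = [0] * n
    for i in range(1, n):
        cls[order[i]] = cls[order[i-1]]
        if s[order[i]] != s[order[i-1]]:
            cls[order[i]] += 1  # a new, strictly larger equivalence class
    return cls

def updateClass(order, cls, L):
    n = len(order)
    new_cls = [0] * n
    for i in range(1, n):
        cur, prev = order[i], order[i-1]
        # compare doubled shifts by their (first half, second half) class pair
        new_cls[cur] = new_cls[prev]
        if (cls[cur], cls[(cur + L) % n]) != (cls[prev], cls[(prev + L) % n]):
            new_cls[cur] += 1
    return new_cls

On 'ababaa$', getCharClass returns [1, 2, 1, 2, 1, 1, 0], matching the
class of length 1 given above.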
Applications
Number of Distinct Substrings of a string
22.3 Bonus
can construct a trie of all patterns as shown in Section 27.3. For
example, Fig. 22.7 shows a trie built with all patterns.
Now, let us do trie matching in exactly the same way as the brute-force
pattern matching algorithm, by sliding the pattern trie along the text at
each position. For each comparison, walk down the trie by spelling out
symbols of the text; a pattern from the pattern list matches the text
each time we reach a leaf. Try text = "panamabananas". We first walk down
the branch p->a->n and stop at the leaf, thus finding the pattern 'pan'.
With trie matching, the runtime is decreased to O(n · maxᵢ mᵢ), plus the
trie construction time O(Σᵢ mᵢ).
However, merging all patterns into a trie makes it impossible to use
advanced single-pattern matching algorithms such as KMP.
More Pattern Matching Tasks There are more types of matching beyond
finding the exact occurrences of one string in another.
3. Palindrome Matching.
Compact Trie If we assign only one letter per edge, we are not taking
full advantage of the trie's tree structure. It is more useful to
consider compact or compressed tries: tries where we remove the
one-letter-per-edge constraint and contract non-branching paths by
concatenating the letters on those paths. In this way, every internal
node branches out, and every node traversed represents a choice between
different words. The compressed trie that corresponds to our example trie
is also shown in Figure 27.4.
|Σ| is the alphabet size, and N is the total number of nodes in the trie.
Note: you may assume that all inputs consist of lowercase letters a-z,
and that all inputs are non-empty strings.
def startWith(self, word):
    node = self.root
    for c in word:
        loc = ord(c) - ord('a')
        # case 1: not all letters matched
        if node.children[loc] is None:
            return False
        node = node.children[loc]
    # case 2: the whole prefix is matched
    return True
Now complete the given Trie class with TrieNode and the __init__
function.

class Trie:
    class TrieNode:
        def __init__(self):
            self.is_word = False
            # the index of a child slot represents a char
            self.children = [None] * 26

    def __init__(self):
        """
        Initialize your data structure here.
        """
        self.root = self.TrieNode()  # root carries no value
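The matching insert, not shown here, walks the same children array; a
minimal sketch:

def insert(self, word):
    node = self.root
    for c in word:
        loc = ord(c) - ord('a')
        if node.children[loc] is None:
            node.children[loc] = self.TrieNode()  # create the missing child
        node = node.children[loc]
    node.is_word = True  # mark the end of a complete word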
22.1 336. Palindrome Pairs (hard). Given a list of unique words, find
all pairs of distinct indices (i, j) in the given list, so that the
concatenation of the two words, i.e. words[i] + words[j], is a
palindrome.

Example 1:

Input: ["abcd", "dcba", "lls", "s", "sssll"]
Output: [[0, 1], [1, 0], [3, 2], [2, 4]]
Explanation: The palindromes are ["dcbaabcd", "abcddcba", "slls", "llssssll"]

Example 2:

Input: ["bat", "tab", "cat"]
Output: [[0, 1], [1, 0]]
Explanation: The palindromes are ["battab", "tabbat"]
The trie stores every word reversed; the helper class and the palindrome
check, whose opening lines fall on the previous page, can be completed as
follows (the reconstructed parts are marked in comments):

def is_palindrome(word):
    # two-pointer palindrome check (the first lines are reconstructed)
    i, j = 0, len(word) - 1
    while i < j:
        if word[i] != word[j]:
            return False
        i += 1
        j -= 1
    return True


class Trie:
    # reconstructed helper: a node of a trie over reversed words
    def __init__(self):
        self.links = {}
        self.index = None          # index of the word ending at this node
        self.pali_indices = set()  # words whose remaining suffix is a palindrome

    def insert(self, word, i):
        trie = self
        for j, ch in enumerate(word):
            if is_palindrome(word[j:]):
                trie.pali_indices.add(i)
            if ch not in trie.links:
                trie.links[ch] = Trie()
            trie = trie.links[ch]
        trie.index = i


class Solution:
    def palindromePairs(self, words):
        '''Find pairs of palindromes in O(n*k^2) time and O(n*k) space.'''
        root = Trie()
        res = []
        for i, word in enumerate(words):
            if not word:
                continue
            root.insert(word[::-1], i)
        for i, word in enumerate(words):
            if not word:
                continue
            trie = root
            for j, ch in enumerate(word):
                if ch not in trie.links:
                    break
                trie = trie.links[ch]
                if is_palindrome(word[j+1:]) and trie.index is not None and trie.index != i:
                    # this word completes to a palindrome and the prefix is a word
                    res.append([i, trie.index])
            else:
                # this word is a reverse suffix of other words; combine with
                # those that complete to a palindrome
                for pali_index in trie.pali_indices:
                    if i != pali_index:
                        res.append([i, pali_index])
        if '' in words:
            j = words.index('')
            for i, word in enumerate(words):
                if i != j and is_palindrome(word):
                    res.append([i, j])
                    res.append([j, i])
        return res
palindrome when considering the empty string as a prefix for the other
word.
class Solution(object):
    def palindromePairs(self, words):
        # 0 means the word is not reversed, 1 means the word is reversed
        words, length, result = (sorted([(w, 0, i, len(w)) for i, w in enumerate(words)] +
                                        [(w[::-1], 1, i, len(w)) for i, w in enumerate(words)]),
                                 len(words) * 2, [])

        # after the sorting, identical strings sit next to each other,
        # one marked 0 and one marked 1
        for i, (word1, rev1, ind1, len1) in enumerate(words):
            for j in range(i + 1, length):
                word2, rev2, ind2, _ = words[j]
                if word2.startswith(word1):  # word2 might be longer
                    if ind1 != ind2 and rev1 ^ rev2:  # one is reversed, one is not
                        rest = word2[len1:]
                        if rest == rest[::-1]:
                            # if rev2 is the reversed one, the pair goes from ind1 to ind2
                            result += ([ind1, ind2],) if rev2 else ([ind2, ind1],)
                else:
                    # break is powerful here: from this point on no later word
                    # shares the prefix, so we only examine possible reversed matches
                    break
        return result
There are several other data structures, like balanced trees and hash
tables, which give us the ability to search for a word in a dataset of
strings. Then why do we need a trie? Although a hash table has O(1)
average time complexity for looking up a key, it is not efficient for the
following operations:
23
Math and Probability Problems
In this chapter, we specifically discuss math-related problems. Normally,
the problems appearing in this section can be solved with the programming
methodology we have learned. However, that might be inefficient (we will
get a TLE error on LeetCode) because it ignores math properties that can
boost the efficiency. Thus, learning the most relevant math knowledge can
make our life easier.
23.1 Numbers
23.1.1 Prime Numbers
A prime number is an integer greater than 1 that is only divisible by 1
and itself. The first few prime numbers are: 2, 3, 5, 7, 11, 13, 17, 19, 23, ...
Some interesting facts about prime numbers:
2. 2 and 3 are the only two consecutive natural numbers that are both prime.
There is actually a lot of room to optimize the algorithm. First, instead
of checking up to n, we can check up to √n, because a larger factor of n
must be a multiple of a smaller factor that has already been checked.
Also, because even numbers bigger than 2 are not prime, we can set the
step to 2. The algorithm can be improved further with feature 3, that all
primes are of the form 6k ± 1, with the exception of 2 and 3, together
with feature 4, which implicitly states that every non-prime integer is
divisible by a prime number smaller than itself. So a more efficient
method is to test whether n is divisible by 2 or 3, and then to check
through all numbers of the form 6k ± 1.
def isPrime(n):
    # corner cases
    if n <= 1:
        return False
    if n <= 3:
        return True

    if n % 2 == 0 or n % 3 == 0:
        return False

    # 6k-1 or 6k+1, step 6, up till sqrt(n); when i=5, check 5 and 7 (6k-1, 6k+1)
    for i in range(5, int(n**0.5)+1, 6):
        if n % i == 0 or n % (i+2) == 0:
            return False
    return True
class Solution:
    # precompute: all numbers of the form 2^a * 3^b * 5^c within 32-bit
    # range (the generation lines are cut off; these bounds are assumed)
    ugly = [2**a * 3**b * 5**c
            for a in range(32) for b in range(20) for c in range(14)]
    ugly.sort()

    def nthUglyNumber(self, n):
        """
        :type n: int
        :rtype: int
        """
        return self.ugly[n-1]
The second way is to generate only up to the nth ugly number, with:

class Solution:
    n = 1690
    ugly = [1]
    i2 = i3 = i5 = 0
    for i in range(n-1):
        u2, u3, u5 = 2 * ugly[i2], 3 * ugly[i3], 5 * ugly[i5]
        umin = min(u2, u3, u5)
        ugly.append(umin)
        if umin == u2:
            i2 += 1
        if umin == u3:
            i3 += 1
        if umin == u5:
            i5 += 1

    def nthUglyNumber(self, n):
        """
        :type n: int
        :rtype: int
        """
        return self.ugly[n-1]
23.1.3 Combinatorics
1. 611. Valid Triangle Number
Follow up: could you optimize your algorithm to use only O(k) extra
space? Solution: generate rows from index 0 to k.
def getRow(self, rowIndex):
    if rowIndex == 0:
        return [1]
    ans = [1]
    for i in range(rowIndex):
        tmp = [1] * (i+2)
        for j in range(1, i+1):
            tmp[j] = ans[j-1] + ans[j]
        ans = tmp
    return ans
Triangle Counting
Analysis: the first solution is to get all the digits, e.g. [1, 2],
generate all permutations [[1, 2], [2, 1]], convert them back to
integers, and sort the generated integers so that we can pick the next
larger one. But the time complexity is O(n!).
Now, let us think about more examples to find the rule here:

435798 -> 435879
1432   -> 2134

For each digit, we look to its right and find the smallest digit that has
a larger value. Then, scanning positions from the last digit leftwards,
the first position that has such a digit is where we swap; if no position
has one, as in 21, there is no answer and we return -1. This is a
"smallest larger element to the right" process; the values found for the
two examples are:
[5, 5, 7, 8, -1, -1]
[2, -1, -1, -1]

For the remaining digits after the swap position, we sort them and put
them back to get the smallest value.
import sys

class Solution:
    def getDigits(self, n):
        digits = []
        while n:
            digits.append(n % 10)  # the least significant position first
            n = n // 10
        return digits

    def getSmallestLargerElement(self, nums):
        if not nums:
            return []
        rst = [-1] * len(nums)

        for i, v in enumerate(nums):
            smallestLargerNum = sys.maxsize
            index = -1
            for j in range(i+1, len(nums)):
                if nums[j] > v and smallestLargerNum > nums[j]:
                    index = j
                    smallestLargerNum = nums[j]
            if smallestLargerNum < sys.maxsize:
                rst[i] = index
        return rst

    def nextGreaterElement(self, n):
        """
        :type n: int
        :rtype: int
        """
        if n == 0:
            return -1

        digits = self.getDigits(n)
        digits = digits[::-1]

        rst = self.getSmallestLargerElement(digits)
        stop_index = -1

        # switch: find the rightmost position that has a smallest larger element
        for i in range(len(rst)-1, -1, -1):
            if rst[i] != -1:
                stop_index = i
                digits[i], digits[rst[i]] = digits[rst[i]], digits[i]
                break
        if stop_index == -1:
            return -1

        # sort from stop_index+1 to the end
        digits[stop_index+1:] = sorted(digits[stop_index+1:])

        # convert the digitized answer back to an integer
        nums = 0
        digit = 1
        for i in digits[::-1]:
            nums += digit * i
            digit *= 10
        if nums > 2147483647:
            return -1

        return nums
A special case is when one number is zero: the GCD is then the value of
the other, gcd(a, 0) = a.
The basic algorithm is: get all divisors of each number, and then find
the largest common value. Now, let's see how to improve this algorithm.
We can reformulate the last example as:

36  = 2 * 2 * 3 * 3
60  = 2 * 2 * 3 * 5
GCD = 2 * 2 * 3 = 12
1. gcd(a, 0) = a
2. gcd(a, a) = a
Based on the above features, we can use the Euclidean algorithm to obtain
the GCD:

def euclid(a, b):
    while a != b:
        # replace the larger number by its difference with the smaller number
        if a > b:
            a = a - b
        else:
            b = b - a
    return a

print(euclid(36, 60))
The only problem with the subtraction-based Euclidean algorithm is that
it can take many subtraction steps to find the GCD when one of the given
numbers is much bigger than the other. A more efficient variant replaces
the subtraction with the remainder operation; it stops when reaching a
zero remainder, and it never requires more steps than five times the
number of digits (base 10) of the smaller integer.
The recursive version:

def euclidRemainder(a, b):
    if a == 0:
        return b
    return euclidRemainder(b % a, a)  # recurse on itself (the original mistakenly calls gcd)
lcm(a, b) = (a × b) / gcd(a, b)        (23.1)
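Following Eq. (23.1), a one-line helper on top of euclidRemainder (a
minimal sketch):

def lcm(a, b):
    # Eq. (23.1): lcm(a, b) = a * b / gcd(a, b)
    return a * b // euclidRemainder(a, b)

print(lcm(4, 6))  # 12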
Long Multiplication
23.2 29. Divide Two Integers (medium). Given two integers dividend and
divisor, divide the two integers without using multiplication, division,
or the mod operator. Return the quotient after dividing dividend by
divisor. The integer division should truncate toward zero.

Example 1:

Input: dividend = 10, divisor = 3
Output: 3

Example 2:

Input: dividend = 7, divisor = -3
Output: -2
Analysis: we can get the sign of the result first, and then convert the
dividend and divisor to their absolute values. We also handle the
boundary condition that the divisor is larger than the dividend, in which
case we return 0 directly. The code is given:
def divide(self, dividend, divisor):
    def divide(dd):  # the largest digit val such that divisor * val <= dd
        s, r = 0, 0
        for i in range(9):
            tmp = s + divisor
            if tmp <= dd:
                s = tmp
            else:
                return str(i), str(dd - s)
        return str(9), str(dd - s)

    if dividend == 0:
        return 0
    sign = -1
    if (dividend > 0 and divisor > 0) or (dividend < 0 and divisor < 0):
        sign = 1
    dividend = abs(dividend)
    divisor = abs(divisor)
    if divisor > dividend:
        return 0
    ans, did, dr = [], str(dividend), str(divisor)
    n = len(dr)
    pre = did[:n-1]
    for i in range(n-1, len(did)):
        dd = pre + did[i]
        dd = int(dd)
        v, pre = divide(dd)
        ans.append(v)

    ans = int(''.join(ans)) * sign

    if ans > (1 << 31) - 1:
        ans = (1 << 31) - 1
    return ans
In programming tasks, such problems are either solvable with some
closed-form formula, or one has no choice but to enumerate the complete
search space.
23.6 Geometry
In this section, we will discuss coordinate-related problems.
939. Minimum Area Rectangle (medium)
Given a set of points in the xy-plane, determine the minimum area of a
rectangle formed from these points, with sides parallel to the x and y
axes. If there isn't any rectangle, return 0.

Example 1:

Input: [[1, 1], [1, 3], [3, 1], [3, 3], [2, 2]]
Output: 4

Example 2:

Input: [[1, 1], [1, 3], [3, 1], [3, 3], [4, 1], [4, 3]]
Output: 2
A brute-force solution enumerates point subsets with backtracking and
keeps the smallest rectangle area found; its closing lines are:

            combine(points, i+1, curr + [points[i]], ans)
        return

    ans = [sys.maxsize]
    combine(points, 0, [], ans)
    return ans[0] if ans[0] != sys.maxsize else 0
Traverse the linked list using two pointers, moving one pointer by one
step and the other by two. If the pointers meet at some node, there is a
loop; if they never meet, the linked list doesn't have a loop. Once you
detect a cycle, think about finding its starting point.
def detectCycle(self, A):
    # find the "intersection" of a fast and a slow pointer
    p_f = p_s = A
    while p_f and p_f.next:
        p_f = p_f.next.next
        p_s = p_s.next
        if p_f == p_s:
            p_s = A  # restart slow at the head; they meet at the cycle start
            while p_f != p_s:
                p_f, p_s = p_f.next, p_s.next
            return p_s
    return None
23.8 Exercise
23.8.1 Number
313. Super Ugly Number
Super ugly numbers are positive numbers all of whose prime factors are in
the given prime list primes of size k. For example,
[1, 2, 4, 7, 8, 13, 14, 16, 19, 26, 28, 32] is the sequence of the first
12 super ugly numbers given primes = [2, 7, 13, 19] of size 4.

Note:
(1) 1 is a super ugly number for any given primes.
(2) The given numbers in primes are in ascending order.
(3) 0 < k <= 100, 0 < n <= 10^6, 0 < primes[i] < 1000.
(4) The nth super ugly number is guaranteed to fit in a 32-bit signed integer.
from sys import maxsize

def nthSuperUglyNumber(self, n, primes):
    """
    :type n: int
    :type primes: List[int]
    :rtype: int
    """
    nums = [1]
    idexs = [0] * len(primes)  # the current index into nums for each prime
    for i in range(n-1):
        min_v = maxsize
        min_j = []
        for j, idex in enumerate(idexs):
            v = nums[idex] * primes[j]
            if v < min_v:
                min_v = v
                min_j = [j]
            elif v == min_v:
                min_j.append(j)  # we can get multiple j's if there is a tie
        # append the minimum and advance every tied pointer
        # (the closing lines are cut off by the page break)
        nums.append(min_v)
        for j in min_j:
            idexs[j] += 1
    return nums[-1]
Problem-Patterns
24
Array Questions(15%)
In this chapter, we mainly discuss array-based questions. We first
categorize these problems into different types, and then each type can
usually be solved and optimized with nearly the best efficiency.
Given an array, a subsequence is composed of elements whose subscripts
are increasing in the original array. A subarray is a special case of a
subsequence that is contiguous. A subset contains any possible
combination of elements of the original array. For example, for the
array [1, 2, 3, 4]:

Subsequence
[1, 3]
[1, 4]
[1, 2, 4]

Subarray
[1, 2]
[2, 3]
[2, 3, 4]

Subset includes subsets of every length:
length 0: []
length 1: [1], [2], [3], [4]
length 2: [1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]
Here, array means a one-dimensional list. For array problems, math plays
an important role. The rules are as follows:
dynamic programming.
Before we get into solving each type of problem, we first introduce the
algorithms we will need in this chapter, including two pointers (three
pointers or sliding window), prefix sum, and Kadane's algorithm. Kadane's
algorithm can be explained as a sequence type of dynamic programming.
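Since Kadane's algorithm is referred to repeatedly below, here is a
minimal sketch of it, phrased as that sequence-type DP (dp is the best
sum of a subarray ending at the current element):

def kadane(nums):
    # extend the previous run only when it helps, i.e. when it is positive
    best = dp = nums[0]
    for x in nums[1:]:
        dp = max(x, dp + x)
        best = max(best, dp)
    return best

print(kadane([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # 6, the subarray [4, -1, 2, 1]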
After this chapter, we need to learn the steps to solve these problems:
1. Analyze the problem and categorize it. Knowing the naive solution's
time complexity can help us identify the type.
2. If we cannot find what type it is, see whether we can convert it. If
not, try to identify a simpler version of this problem, and then upgrade
the simple solution to the more complex one.
5. Check the special cases. (Usually very important for this type of
problem.)
24.1 Subarray
Note: for subarrays the most important feature is contiguity; here, we
definitely will not use sorting. Given an array of size n, the total
number of subarrays is 1 + 2 + ... + n = n(n + 1)/2, which makes the time
complexity of a naive solution with two nested for/while loops O(n²) or
O(n³).
There are two types of problems related to subarrays: range query and
optimization-based subarray. Range query problems include querying the
minimum/maximum or sum of all elements in a given range [i, j] of an
array; range query has more standard solutions, either by searching or
with a segment tree:

Range Query

2. A sliding window can be used to find a subarray where the sum or
product inside the window is monotone (either monotonically increasing
or decreasing). This normally requires that the array elements are all
positive or all negative, so that the sliding window covers the whole
search space; otherwise we can't use a sliding window.

3. For all problems related to subarray sum/product, with either vague or
absolute conditions, we have a universal algorithm: save the prefix sums
(sometimes together with the index) in a sorted array, and use binary
search to find all possible starting points of the window.
Solution: brute force is to use two for loops, the first for the start,
the second for the end, and then take the maximum value. To optimize, we
can use divide and conquer, O(n log n), versus the brute force O(n³)
(two nested for loops plus n for computing the sum). The divide and
conquer method was shown in that chapter. A more efficient algorithm uses
the prefix sum; please check Section ?? for the answer.
Now, what is the sliding window solution? The key step in a sliding
window is deciding when to move the first pointer of the window
(shrinking the window). The window must include the current element j.
For the maximum subarray, to increase the sum of the window, we abandon
any previous elements if they have a negative sum.
from sys import maxsize
class Solution:
    def maxSubArray(self, nums):
        """
        :type nums: List[int]
        :rtype: int
        """
        if not nums:
            return 0
        i, j = 0, 0  # i <= j
        maxValue = -maxsize
        window_sum = 0
        while j < len(nums):
            window_sum += nums[j]
            j += 1
            maxValue = max(maxValue, window_sum)
            while i < j and window_sum < 0:
                window_sum -= nums[i]
                i += 1
        return maxValue
Example 2:
Given nums = [-2, -1, 2, 1], k = 1,
return 2. (because the subarray [-1, 2] sums to 1 and is the longest)

Follow Up:
Can you do it in O(n) time?
def maxSubArrayLen(self, nums, k):
    """
    :type nums: List[int]
    :type k: int
    :rtype: int
    """
    prefix_sum = 0
    dict = {0: -1}  # this means for index -1, the sum is 0
    max_len = 0
    for idx, n in enumerate(nums):
        prefix_sum += n
        # save each prefix sum together with the first index where it appears
        if prefix_sum not in dict:
            dict[prefix_sum] = idx
        # track the maximum length so far
        if prefix_sum - k in dict:
            max_len = max(max_len, idx - dict[prefix_sum - k])
    return max_len
Another example that asks for a different pattern but can be converted
to, or is equivalent to, the last problem:
24.3 525. Contiguous Array. Given a binary array, find the maximum length
of a contiguous subarray with an equal number of 0s and 1s. Note: the
length of the given binary array will not exceed 50,000.

Example 1:
Input: [0, 1]
Output: 2
Explanation: [0, 1] is the longest contiguous subarray with an equal
number of 0 and 1.

Example 2:
Input: [0, 1, 0]
Output: 2
Explanation: [0, 1] (or [1, 0]) is a longest contiguous subarray with an
equal number of 0 and 1.

Treating each 0 as -1 converts this into finding the maximum-length
subarray with sum 0, so the same prefix-sum hashmap applies (the opening
lines are completed to match the surviving loop):

def findMaxLength(self, nums):
    mapp = {0: -1}
    cur_sum = 0
    max_len = 0
    for idx, v in enumerate(nums):
        cur_sum += 1 if v == 1 else -1  # count 0 as -1
        if cur_sum in mapp:
            max_len = max(max_len, idx - mapp[cur_sum])
        else:
            mapp[cur_sum] = idx

    return max_len
Example 2:
Input: [2, 2, 2, 2, 2]
Output: 1
Explanation: The longest continuous increasing subsequence is [2], its
length is 1.
Note: length of the array will not exceed 10,000.

def findLengthOfLCIS(self, nums):
    # sliding window (the opening lines are completed to match the loop)
    if not nums:
        return 0
    i, j = 0, 0
    max_length = 0
    while j < len(nums):
        j += 1  # slide the window
        max_length = max(max_length, j - i)
        # when the condition is violated, reset the window
        if j < len(nums) and nums[j-1] >= nums[j]:
            i = j

    return max_length
Input: s = 7, nums = [2, 3, 1, 2, 4, 3]
Output: 2
Explanation: the subarray [4, 3] has the minimal length under the problem
constraint.

from bisect import bisect_right
def minSubArrayLen(self, s, nums):
    # the opening lines are cut off; this completion keeps the surviving
    # bisect logic: prefix sums are sorted because nums are positive
    ps = [0]
    ans = float('inf')
    j = 0
    while j < len(nums):
        ps.append(ps[-1] + nums[j])
        # rightmost prefix index with ps[index] <= ps[-1] - s
        index = bisect_right(ps, ps[-1] - s)
        if index > 0:
            index -= 1
            ans = min(ans, j - index + 1)
        j += 1
    return ans if ans != float('inf') else 0
4 """
5 : type nums : L i s t [ i n t ]
6 : type k : i n t
7 : rtype : int
8 """
9 ' ' ' r e t u r n t h e number o f s u b a r r a y s t h a t e q u a l t o k
'''
10 d i c t = c o l l e c t i o n s . d e f a u l t d i c t ( i n t ) #t h e v a l u e i s
t h e number o f t h e sum o c c u r s
11 d i c t [0]=1
12 prefix_sum , count =0, 0
13 f o r v i n nums :
14 prefix_sum += v
15 count += d i c t [ prefix_sum−k ] # i n c r e a s e t h e
counter of the appearing value k , d e f a u l t i s 0
16 d i c t [ prefix_sum ] += 1 # update t h e count o f
p r e f i x sum , i f i t i s f i r s t time , t h e d e f a u l t v a l u e i s 0
17 r e t u r n count
Input: A = [4, 5, 0, -2, -3, 1], K = 5
Output: 7
Explanation: There are 7 subarrays with a sum divisible by K = 5:
[4, 5, 0, -2, -3, 1], [5], [5, 0], [5, 0, -2, -3], [0], [0, -2, -3], [-2, -3]

Analysis: for the above array, the prefix sums are [0, 4, 9, 9, 7, 4, 5].
Let P[i+1] = A[0] + A[1] + ... + A[i]. Then each subarray sum can be
written as P[j] − P[i] (for j > i), and for the current index j we need
the number of i with (P[j] − P[i]) % K == 0, i.e. P[j] % K == P[i] % K.
So, differently from the sum == K case, we look up P[j] % K in the
hashmap instead of P[j] − K, and we save each prefix sum as its modulo of
K. For the example, the final dict is {0: 2, 4: 4, 2: 1}.
from collections import defaultdict
class Solution:
    def subarraysDivByK(self, A, K):
        """
        :type A: List[int]
        :type K: int
        :rtype: int
        """
        a_sum = 0
        p_dict = defaultdict(int)
        p_dict[0] = 1  # the empty prefix still counts as one 0
        ans = 0
        for i, v in enumerate(A):
            a_sum += v
            a_sum %= K
            if a_sum in p_dict:
                ans += p_dict[a_sum]
            p_dict[a_sum] += 1  # save the modulo instead of the raw sum
        return ans
Follow up: if you have figured out the O(n) solution, try coding another
solution whose time complexity is O(n log n).
Analysis: for this problem, we can still use prefix sums saved in a
hashmap. However, since the condition is sum >= s, with a hashmap we
would need to search through all keys <= prefix_sum − s. The time
complexity rises to O(n²) if we use linear search, and we would receive a
TLE error.
def minSubArrayLen(self, s, nums):
    """
    :type s: int
    :type nums: List[int]
    :rtype: int
    """
    if not nums:
        return 0
    dict = collections.defaultdict(int)
    dict[0] = -1  # pre_sum 0 with index -1
    prefixSum = 0
    minLen = sys.maxsize
    for idx, n in enumerate(nums):
        prefixSum += n
        for key, value in dict.items():
            if key <= prefixSum - s:
                minLen = min(minLen, idx - value)
        dict[prefixSum] = idx  # save the last index
    return minLen if 1 <= minLen <= len(nums) else 0
def minSubArrayLen(self, s, nums):
    """
    :type s: int
    :type nums: List[int]
    :rtype: int
    """
    def bSearch(nums, i, j, target):
        while i < j:
            mid = (i + j) // 2
            if nums[mid] == target:
                return mid
            elif nums[mid] < target:
                i = mid + 1
            else:
                j = mid - 1
        return i

    if not nums:
        return 0
    rec = [0] * len(nums)  # prefix sums
    rec[0] = nums[0]
    if rec[0] >= s:
        return 1
    minlen = len(nums) + 1
    for i in range(1, len(nums)):
        rec[i] = rec[i-1] + nums[i]
        if rec[i] >= s:
            index = bSearch(rec, 0, i, rec[i] - s)
            if rec[index] > rec[i] - s:
                index -= 1
            minlen = min(minlen, i - index)
    return minlen if minlen != len(nums) + 1 else 0
24.9 713. Subarray Product Less Than K. You are given an array of
positive integers nums. Count and print the number of (contiguous)
subarrays where the product of all the elements in the subarray is less
than k.

Example 1:
Input: nums = [10, 5, 2, 6], k = 100
Output: 8
Explanation: The 8 subarrays that have product less than 100 are: [10],
[5], [2], [6], [10, 5], [5, 2], [2, 6], [5, 2, 6].

Because all elements are positive, a sliding window fits this scenario
and gives O(n) time complexity with O(1) extra space; a sketch follows.
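A minimal sliding-window sketch (assuming positive integers, per the
problem statement):

def numSubarrayProductLessThanK(nums, k):
    if k <= 1:
        return 0
    prod, ans, left = 1, 0, 0
    for right, v in enumerate(nums):
        prod *= v
        while prod >= k:  # shrink until the product is valid again
            prod //= nums[left]
            left += 1
        # every subarray ending at right with start in [left, right] qualifies
        ans += right - left + 1
    return ans

print(numSubarrayProductLessThanK([10, 5, 2, 6], 100))  # 8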
24.10 862. Shortest Subarray with Sum at Least K. Return the length of
the shortest, non-empty, contiguous subarray of A with sum at least K.
If there is no non-empty subarray with sum at least K, return -1.

Example 1:
Input: A = [1], K = 1
Output: 1

Example 2:
Input: A = [1, 2], K = 4
Output: -1

Example 3:
Input: A = [2, -1, 2], K = 3
Output: 3

Note: 1 <= A.length <= 50000, −10^5 <= A[i] <= 10^5, 1 <= K <= 10^9.
Analysis: the only difference from the last problem is the presence of
negative values. Because of the negatives, the shrinking method no longer
works: when we shrink the window, the sum of the smaller window might
even grow if we just cut out a negative value. For instance, with
[84, -37, 32, 40, 95] and K = 167, the right answer is [32, 40, 95], but
the naive window ends with i = 0, j = 4. So how do we handle the negative
values?
Solution 1: prefix sum and binary search on the prefix sums (TLE):

def shortestSubarray(self, A, K):
    def bisect_right(lst, target):
        l, r = 0, len(lst) - 1
        while l <= r:
            mid = l + (r - l) // 2
            if lst[mid][0] <= target:
                l = mid + 1
            else:
                r = mid - 1
        return l

    acc = 0
    ans = float('inf')
    prefixSum = [(0, -1)]  # (value, index)
    for i, n in enumerate(A):
        acc += n
        index = bisect_right(prefixSum, acc - K)
        for j in range(index):
            ans = min(ans, i - prefixSum[j][1])
        index = bisect_right(prefixSum, acc)
        prefixSum.insert(index, (acc, i))
    return ans if ans != float('inf') else -1
Now, let us analyze a simple example that includes both 0 and a negative
number: [2, -1, 2, 0, 1], K = 3, with prefix sums [2, 1, 3, 3, 4]; the
subarrays whose sum is at least three are [2, -1, 2], [2, -1, 2, 0] and
[2, 0, 1]. First, draw the prefix sums on an x-y axis: when we encounter
a negative number the prefix sum decreases, and when we encounter a zero
it stays flat. For the zero case, p[2] = p[3]: if a subarray ending with
index 2 is considered, then 3 is not needed. For the negative case,
p[0] = 2 > p[1] = 1 due to A[1] < 0: p[1] is always a better starting
choice than p[0] (it is smaller, so more likely to satisfy the sum, and
it gives a shorter distance). Therefore, we can keep the candidate prefix
sums monotonically increasing, just as for an all-positive array, by
maintaining a monotone queue.
class Solution:
    def shortestSubarray(self, A, K):
        P = [0] * (len(A) + 1)
        for idx, x in enumerate(A):
            P[idx+1] = P[idx] + x

        ans = len(A) + 1  # N+1 is impossible
        monoq = collections.deque()
        for y, Py in enumerate(P):
            # both a negative and a zero kick out any previous
            # larger-or-equal prefix value
            while monoq and Py <= P[monoq[-1]]:
                monoq.pop()
            # once a start x is used, it never needs to be considered again
            # (similar to moving the first index forward in a sliding window)
            while monoq and Py - P[monoq[0]] >= K:
                ans = min(ans, y - monoq.popleft())
            monoq.append(y)

        return ans if ans < len(A) + 1 else -1
Example 1:

Input: A = [1, 0, 1, 0, 1], S = 2
Output: 4
Explanation: the 4 subarrays with sum 2 are the ones at positions 0-2,
0-3, 1-4, and 2-4.

Note:

A.length <= 30000
0 <= S <= A.length
A[i] is either 0 or 1.
Answer: this is exactly the third variant of the maximum-subarray family:
counting subarrays with a certain sum value. We solve it using prefix
sums and a hashmap that saves the count of each prefix value.
import collections
class Solution:
    def numSubarraysWithSum(self, A, S):
        """
        :type A: List[int]
        :type S: int
        :rtype: int
        """
        dict = collections.defaultdict(int)  # the value is the number of times the sum occurs
        dict[0] = 1  # the prefix sum starts from 0 and its count is 1
        prefix_sum, count = 0, 0
        for v in A:
            prefix_sum += v
            # add the count of earlier prefixes exactly S smaller; default is 0
            count += dict[prefix_sum - S]
            # update the count of this prefix sum; the default value is 0
            dict[prefix_sum] += 1
        return count
An equivalent counting formulation records, for each prefix x, that a
later prefix equal to x + S closes a subarray of sum S (the opening lines
are completed to match the surviving loop):

from collections import Counter
def numSubarraysWithSum(self, A, S):
    P = [0]
    for x in A:
        P.append(P[-1] + x)
    count = Counter()
    ans = 0
    for x in P:
        ans += count[x]
        count[x + S] += 1

    return ans
Example 1:
Input: [23, 2, 4, 6, 7], k = 6
Output: True
Explanation: because [2, 4] is a continuous subarray of size 2 that sums
up to 6.

Example 2:
Input: [23, 2, 6, 4, 7], k = 6
Output: True
Explanation: because [23, 2, 6, 4, 7] is a continuous subarray of size 5
that sums up to 42.

Note:
The length of the array won't exceed 10,000.
You may assume the sum of all the numbers is in the range of a signed
32-bit integer.
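These examples ask whether the array has a continuous subarray of size at
least 2 that sums to a multiple of k (LeetCode 523); a minimal
prefix-sum-modulo sketch, with the method name following that problem:

def checkSubarraySum(self, nums, k):
    seen = {0: -1}  # prefix_sum % k -> the earliest index it appears at
    prefix = 0
    for i, v in enumerate(nums):
        prefix += v
        key = prefix % k if k != 0 else prefix
        if key in seen:
            if i - seen[key] >= 2:  # subarray of size at least 2
                return True
        else:
            seen[key] = i
    return False

Two prefixes with the same remainder bound a subarray whose sum is a
multiple of k; keeping only the earliest index for each remainder
maximizes the span, so the size-2 check is sufficient.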
Example 1:

Input: "abc"
Output: 7
Explanation: The 7 distinct subsequences are "a", "b", "c", "ab", "ac",
"bc", and "abc".

Example 2:

Input: "aba"
Output: 6
Explanation: The 6 distinct subsequences are "a", "b", "ab", "ba", "aa"
and "aba".

Example 3:

Input: "aaa"
Output: 3
Explanation: The 3 distinct subsequences are "a", "aa" and "aaa".
j<i
10 i f c == S [ j ] :
11 continue
12 else :
13 dp [ i ] += dp [ j ]
14 dp [ i ] %= MOD
15 r e t u r n sum ( dp ) % MOD
24.2.1 Others
For example, the following question can be used as a follow-up to the
question Longest Continuous Increasing Subsequence:
300. Longest Increasing Subsequence
673. Number of Longest Increasing Subsequence
Given an unsorted array of integers, find the number of longest
increasing subsequences.

Example 1:

Input: [1, 3, 5, 4, 7]
Output: 2
Explanation: The two longest increasing subsequences are [1, 3, 4, 7] and
[1, 3, 5, 7].

Example 2:
Input: [2, 2, 2, 2, 2]
Output: 5
Explanation: The length of the longest increasing subsequence is 1, and
there are 5 subsequences of length 1, so output 5.
Note: length of the given array will not exceed 2000 and the answer is
guaranteed to fit in a 32-bit signed int.
state: f[i]

from sys import maxsize
class Solution:
    def findNumberOfLIS(self, nums):
        """
        :type nums: List[int]
        :rtype: int
        """
        if not nums:
            return 0
        rlst = []

        def recursive(idx, tail, res):
            # returns the longest length reachable from idx with last value tail;
            # every completed path is collected in rlst
            if idx == len(nums):
                rlst.append(res)
                return 0
            if nums[idx] > tail:
                addLen = 1 + recursive(idx+1, nums[idx], res + [nums[idx]])
                notAddLen = recursive(idx+1, tail, res)
                return max(addLen, notAddLen)
            else:
                return recursive(idx+1, tail, res)

        ans = recursive(0, -maxsize, [])
        count = 0
        for lst in rlst:
            if len(lst) == ans:
                count += 1

        return count
The O(n²) DP keeps, for each index, the LIS length ending there and how
many ways achieve it (the loop head is completed to match the surviving
lines):

def findNumberOfLIS(self, nums):
    n = len(nums)
    if n == 0:
        return 0
    lengths = [1] * n  # lengths[idx]: LIS length ending at idx
    counts = [1] * n   # counts[idx]: number of such LIS
    for idx in range(n):
        for i in range(idx):
            if nums[i] < nums[idx]:
                if lengths[i] + 1 > lengths[idx]:
                    lengths[idx] = lengths[i] + 1
                    counts[idx] = counts[i]           # change the count
                elif lengths[i] + 1 == lengths[idx]:  # if it is a tie
                    counts[idx] += counts[i]          # increase the current count by counts[i]

    longest = max(lengths)
    return sum(c for i, c in enumerate(counts) if lengths[i] == longest)
Solution: setting aside the O(n) requirement, we can sort to get
[1, 2, 3, 4, 100, 200], and then use two pointers to get [1, 2, 3, 4].
How about O(n)? We can pop a number from the set, say 4, then repeatedly
check for first − 1 to collect every smaller neighbor (here 3, 2, 1), and
symmetrically check for last + 1 to collect the larger ones, removing all
of these numbers from the set as we go.
def longestConsecutive(self, nums):
    nums = set(nums)
    maxlen = 0
    while nums:
        first = last = nums.pop()
        while first - 1 in nums:  # keep finding the smaller one
            first -= 1
            nums.remove(first)
        while last + 1 in nums:  # keep finding the larger one
            last += 1
            nums.remove(last)
        maxlen = max(maxlen, last - first + 1)
    return maxlen
B ⊆ A. There are two kinds of subset problems: if the order inside the
subset doesn't matter, it is a combination problem; otherwise, it is a
permutation problem. To solve the problems in this section, we need to
refer to backtracking in Sec ??. When the subset has a fixed constant
length, a hashmap can be used to lower the complexity by one power of n.
Subset VS Subsequence: in a subsequence, the elements keep the original
order from the original sequence, while in a set concept there is no
ordering, only a set of elements.
In this type of question, we are asked to return subsets of a list; for
this type of question, backtracking ?? can be applied.
24.3.1 Combination
The solutions of this section are heavily correlated with Section ??.
78. Subsets

Given a set of distinct integers, nums, return all possible subsets (the
power set).

Note: the solution set must not contain duplicate subsets.

Example:

Input: nums = [1, 2, 3]
Output:
[
  [3],
  [1],
  [2],
  [1,2,3],
  [1,3],
  [2,3],
  [1,2],
  []
]
def subsets(self, nums):
    # the wrapper lines are cut off; with k = len(nums), appending curr on
    # entry emits every prefix of every combination, i.e. the power set
    n = len(nums)
    k = n

    def C_n_k(d, k, s, curr, ans):
        ans.append(curr)
        if d == k:  # the length is satisfied
            return
        for i in range(s, n):
            curr.append(nums[i])
            # i+1 because no repeat; make sure to pass a deep copy curr[:]
            C_n_k(d+1, k, i+1, curr[:], ans)
            curr.pop()

    ans = []
    C_n_k(0, k, 0, [], ans)
    return ans
Incremental. Backtracking is not the only way for the above problem.
There is another way to do it iterative, observe the following process. We
can just keep append elements to the end of of previous results.
[1, 2, 3, 4]
l = 0: []
l = 1: for 1: []+[1] -> [1]; power set of [1]
l = 2: for 2: []+[2], [1]+[2] -> [2], [1,2]; power set of [1,2]
l = 3: for 3: []+[3], [1]+[3], [2]+[3], [1,2]+[3] -> [3], [1,3], [2,3], [1,2,3]; power set of [1,2,3]
l = 4: for 4: []+[4], [1]+[4], [2]+[4], [1,2]+[4], [3]+[4], [1,3]+[4], [2,3]+[4], [1,2,3]+[4]; power set of [1,2,3,4]
def subsets(self, nums):
    result = [[]]  # two-dimensional; already contains the one element []
    for num in nums:
        new_results = []
        for r in result:
            new_results.append(r + [num])
        result += new_results
    return result
90. Subsets II
Given a collection of integers that might contain duplicates, nums, return all possible subsets (the power set).

Note: The solution set must not contain duplicate subsets.

Example:

Input: [1,2,2]
Output:
[
  [2],
  [1],
  [1,2,2],
  [2,2],
  [1,2],
  []
]
So it would be more efficient if we first save all the numbers of the array in a dictionary. For the above case, dic = {1:1, 2:2}. Each time we generate a result, we use 2 up to two times. In the same way, we can use the dictionary with backtracking too.
class Solution(object):
    def subsetsWithDup(self, nums):
        """
        :type nums: List[int]
        :rtype: List[List[int]]
        """
        if not nums:
            return [[]]
        res = [[]]
        dic = collections.Counter(nums)
        for key, val in dic.items():
            tmp = []
            for lst in res:
                for i in range(1, val + 1):
                    tmp.append(lst + [key] * i)
            res += tmp
        return res
77. Combinations

Given two integers n and k, return all possible combinations of k numbers out of 1...n.

Example:

Input: n = 4, k = 2
Output:
[
  [2,4],
  [3,4],
  [2,3],
  [1,2],
  [1,3],
  [1,4],
]
    for i in range(s, len(nums)):
        # if nums[i] > target:
        #     return
        self.combine(nums, target - nums[i], i, curr + [nums[i]], ans)  # use i instead of i+1 because we can reuse the element

    # (a second listing: the counter-based variant)
    return ans

def combine(self, nums, target, s, curr, ans):
    if target < 0:
        return
    if target == 0:
        ans.append(curr)
        return
    for idx in range(s, len(nums)):
        num, count = nums[idx]
        for c in range(count):
            self.combine(nums, target - num*(c+1), idx + 1, curr + [num]*(c+1), ans)
DFS + MEMO. This problem is similar to 39. Combination Sum. For [2, 3, 5] and target = 8, compare:

[2, 3, 5], target = 8
39. Combination Sum: there is ordering (each time the start index is the same as or larger than before)
[
  [2,2,2,2],
  [2,3,3],
  [3,5]
]
377. Combination Sum IV: there is no ordering (each time the start index is the same as before); try every element.
[
  [2,2,2,2],
  [2,3,3],
* [3,3,2]
* [3,2,3]
  [3,5],
* [5,3]
]
def combinationSum4(self, nums, target):
    """
    :type nums: List[int]
    :type target: int
    :rtype: int
    """
    nums.sort()
    n = len(nums)
    def DFS(idx, memo, t):
        if t < 0:
            return 0
        if t == 0:
            return 1
        if t not in memo:
            count = 0
            for i in range(idx, n):
                count += DFS(idx, memo, t - nums[i])
            memo[t] = count
        return memo[t]
    return DFS(0, {}, target)
Because here we do not need to enumerate all the possible solutions, we can use dynamic programming, which will be shown in Section ??.
24.3.3 K Sum
In this subsection, we are still trying to get subsets that sum up to a target, but the length here is fixed — normally 2, 3, or 4. Because it is still a combination problem, we can use backtracking. Second, because of the fixed length, we can use multiple pointers to build up the candidate subsets of that length. In some cases the fixed length also lets us use a hashmap to simplify the complexity.
1. Two Sum Given an array of integers, return indices of the two num-
bers such that they add up to a specific target.
You may assume that each input would have exactly one solution, and
you may not use the same element twice.
Example:

Given nums = [2, 7, 11, 15], target = 9,

Because nums[0] + nums[1] = 2 + 7 = 9,
return [0, 1].
Hashmap. Using backtracking or brute force gives us O(n^2) time complexity. Instead, we can save the numbers in a dictionary and then just check whether target-num is in the dictionary, which gives O(n) time. We show both a two-pass and a one-pass hashmap.
# two-pass hashmap
def twoSum(self, nums, target):
    """
    :type nums: List[int]
    :type target: int
    :rtype: List[int]
    """
    dict = collections.defaultdict(int)
    for i, t in enumerate(nums):
        dict[t] = i
    for i, t in enumerate(nums):
        if target - t in dict and i != dict[target - t]:
            return [i, dict[target - t]]

# one-pass hashmap
def twoSum(self, nums, target):
    """
    :type nums: List[int]
    :type target: int
    :rtype: List[int]
    """
    dict = collections.defaultdict(int)
    for i, t in enumerate(nums):
        if target - t in dict:
            return [dict[target - t], i]
        dict[t] = i
15. 3Sum
Given an array S of n integers, are there elements a, b, c in S such that
a + b + c = 0? Find all unique triplets in the array that give a sum of zero.
Note: The solution set must not contain duplicate triplets.
For example, given array S = [-1, 0, 1, 2, -1, -4],
A solution set is:
[
  [-1, 0, 1],
  [-1, -1, 2]
]
Solution: Use three pointers and no extra space: i is the start point ranging over [0, len-2], and l, r are the other two pointers, with l = i+1 and r = len-1 at the beginning. The saving in time complexity comes entirely from sorting.

[-4, -1, -1, 0, 1, 2]
  i   l ->       <- r
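A sketch of this sorting + two-pointers approach (a standard implementation consistent with the description above, not code taken from the text):

def threeSum(self, nums):
    nums.sort()
    res = []
    for i in range(len(nums) - 2):
        if i > 0 and nums[i] == nums[i-1]:  # skip duplicate anchors
            continue
        l, r = i + 1, len(nums) - 1
        while l < r:
            s = nums[i] + nums[l] + nums[r]
            if s < 0:
                l += 1
            elif s > 0:
                r -= 1
            else:
                res.append([nums[i], nums[l], nums[r]])
                while l < r and nums[l] == nums[l+1]:  # skip duplicates on both sides
                    l += 1
                while l < r and nums[r] == nums[r-1]:
                    r -= 1
                l += 1
                r -= 1
    return res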
Use hashmap:

def threeSum(self, nums):
    """
    :type nums: List[int]
    :rtype: List[List[int]]
    """
    res = []
    nums = sorted(nums)
    if not nums:
        return []
    if nums[-1] < 0 or nums[0] > 0:
        return []
    end_position = len(nums) - 2
    dic_nums = {}
    for i in range(1, len(nums)):
        dic_nums[nums[i]] = i  # for equal numbers, save the last index

    for i in range(end_position):
        target = 0 - nums[i]
        if i > 0 and nums[i] == nums[i-1]:  # avoid repeats
            continue
        if target < nums[i]:  # if the target is smaller than this, we cannot find it on the right side
            break
        for j in range(i + 1, len(nums)):
            if j > i + 1 and nums[j] == nums[j-1]:  # avoid repeats
                continue
            complement = target - nums[j]
            if complement < nums[j]:  # the remaining numbers are bigger than the complement; no need to keep searching
                break
            if complement in dic_nums and dic_nums[complement] > j:  # make sure the complement sits to the right of nums[j]
                res.append([nums[i], nums[j], complement])
    return res
18. 4Sum

def fourSum(self, nums, target):
    def findNsum(nums, target, N, result, results):
        if len(nums) < N or N < 2 or target < nums[0]*N or target > nums[-1]*N:  # early termination
            return
        if N == 2:  # two pointers solve the sorted 2-sum problem
            l, r = 0, len(nums) - 1
            while l < r:
                s = nums[l] + nums[r]
                if s == target:
                    results.append(result + [nums[l], nums[r]])
                    l += 1
                    r -= 1
                    while l < r and nums[l] == nums[l-1]:
                        l += 1
                    while l < r and nums[r] == nums[r+1]:
                        r -= 1
                elif s < target:
                    l += 1
                else:
                    r -= 1
        else:  # recursively reduce N
            for i in range(len(nums) - N + 1):
                if i == 0 or (i > 0 and nums[i-1] != nums[i]):
                    findNsum(nums[i+1:], target - nums[i], N - 1, result + [nums[i]], results)  # reduce nums size, reduce target, save result

    results = []
    findNsum(sorted(nums), target, 4, [], results)
    return results
454. 4Sum II
Given four lists A, B, C, D of integer values, compute how many tuples
(i, j, k, l) there are such that A[i] + B[j] + C[k] + D[l] is zero.
To make the problem a bit easier, all A, B, C, D have the same length N, where 0 ≤ N ≤ 500. All integers are in the range of -2^28 to 2^28 - 1 and the result is guaranteed to be at most 2^31 - 1.
Example:

Input:
A = [ 1,  2]
B = [-2, -1]
C = [-1,  2]
D = [ 0,  2]

Output:
2

Explanation:
The two tuples are:
1. (0, 0, 0, 1) -> A[0] + B[0] + C[0] + D[1] = 1 + (-2) + (-1) + 2 = 0
2. (1, 1, 0, 0) -> A[1] + B[1] + C[0] + D[0] = 2 + (-1) + (-1) + 0 = 0
Solution: Brute force with four for-loops is O(N^4). If we use divide and conquer — sum over the first half (A and B) and save the counts in a dictionary (a Counter) — the time complexity drops to O(2N^2). A 6Sum could be reduced to O(2N^3) the same way, and an 8Sum to O(2N^4).
def fourSumCount(self, A, B, C, D):
    AB = collections.Counter(a + b for a in A for b in B)
    return sum(AB[-c-d] for c in C for d in D)
Summary
As we have seen from the examples in this section, backtracking (Section ??) offers a universal solution to combination problems. There is also an iterative solution which suits the power-set purpose, and we include its code here again:
def subsets(self, nums):
    result = [[]]  # two-dimensional; already contains the one element []
    for num in nums:
        new_results = []
        for r in result:
            new_results.append(r + [num])
        result += new_results
    return result
24.3.4 Permutation
46. Permutations
Given a collection of distinct numbers, return all possible permutations.

For example,
[1,2,3] has the following permutations:

[
  [1,2,3],
  [1,3,2],
  [2,1,3],
  [2,3,1],
  [3,1,2],
  [3,2,1]
]
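No solution listing survives at this point in the text; a minimal backtracking sketch (one standard approach, consistent with Section ??) is:

def permute(self, nums):
    ans = []
    def backtrack(curr, remaining):
        if not remaining:  # a full permutation has been built
            ans.append(curr)
            return
        for i in range(len(remaining)):
            # choose remaining[i], then permute whatever is left
            backtrack(curr + [remaining[i]], remaining[:i] + remaining[i+1:])
    backtrack([], nums)
    return ans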
47. Permutations II
Given a collection of numbers that might contain duplicates, return all
possible unique permutations.
For example,
1 [ 1 , 1 , 2 ] have t h e f o l l o w i n g u niqu e p e r m u t a t i o n s :
2
3 [
4 [1 ,1 ,2] ,
5 [1 ,2 ,1] ,
6 [2 ,1 ,1]
7 ]
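Again no solution is printed here; a hedged sketch that avoids duplicate permutations by backtracking over a counter of distinct values (a standard technique, not necessarily the author's):

from collections import Counter

def permuteUnique(self, nums):
    ans = []
    counter = Counter(nums)
    def backtrack(curr):
        if len(curr) == len(nums):
            ans.append(curr[:])
            return
        for num in counter:  # iterate distinct values only
            if counter[num] > 0:
                counter[num] -= 1
                curr.append(num)
                backtrack(curr)
                curr.pop()       # undo the choice
                counter[num] += 1
    backtrack([])
    return ans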
24.5 Intervals
Sweep Line is a type of algorithm mainly used to solve problems with one-dimensional intervals. Let us look at one example:

253. Meeting Rooms II
Given an array of meeting time intervals consisting of start and end times
[[s1,e1],[s2,e2],...] (si < ei), find the minimum number of conference rooms
required.
Example 1:

Input: [[0, 30], [5, 10], [15, 20]]
Output: 2

Example 2:

Input: [[7, 10], [2, 4]]
Output: 1
It would help a lot to first draw one example with coordinates. The simplest situation, where we only need one meeting room, is when there is no intersection between the time intervals. If we add one interval that intersects with just one of the previous intervals, we need a second meeting room. One brute-force implementation keeps a votes array over all time slots:

        # (only the tail of this listing survives in the source)
        for i in range(s + 1, e + 1):
            votes[i] += 1
            num_rooms = max(num_rooms, votes[i])
    return num_rooms
One-dimensional Implementation
To get the maximum number of intersections of all the intervals, it is not necessary to scan every time slot; we can scan just the key slots — the starts and ends. We put every start and end into one array, marking a start with 1 and an end with 0, and then sort this array. To get the maximum intersection, we go through the sorted array: on a start, the current number of rooms needed increases by one; on an end, a meeting room is freed, so we decrease the on-going count by one. A global variable tracks the maximum number of rooms needed over the whole process. The time complexity is now decided by the 2n slots together with the sorting algorithm, which makes it O(n log n), with O(n) space. This sped-up algorithm is called the Sweep Line algorithm. Before writing the code, we check one special case: a slot that is the start of one interval and the end of another. We must decrease before we increase, so the sort should be on the first element of the tuple and then on the second (ends, marked 0, come before starts, marked 1, at equal times). For example, for the simple case [[13, 15], [1, 13]] we need a maximum of one meeting room. Thus it can be implemented as:
def minMeetingRooms(self, intervals):
    if not intervals:
        return 0
    # solution 2
    slots = []
    # put slots onto a one-dimensional axis
    for i in intervals:
        slots.append((i.start, 1))
        slots.append((i.end, 0))
    # sort these slots on this dimension
    # slots.sort(key=lambda x: (x[0], x[1]))
    slots.sort()

    # now execute the counting
    crt_room, max_room = 0, 0
    for s in slots:
        if s[1] == 0:  # a meeting ends: free a room
            crt_room -= 1
        else:
            crt_room += 1
            max_room = max(max_room, crt_room)
    return max_room
Min-heap Implementation
Instead of opening an array to save all the time slots, we can directly sort the intervals by start time. As in Fig. 24.4, we go through the intervals and track their end times in a min-heap. The first end time we encounter is 30, so we put it in the heap. Then we visit the next interval [5, 10]: its start 5 is smaller than the previous end time 30, meaning this interval intersects a previous one, so the number of rooms increases to 2, and we push 10 into the heap. Next we visit [15, 20]: its start 15 is larger than the smallest end time in the heap, 10, which means these two intervals can be merged into one, [5, 20], so we update the end time 10 to 20.
This way, the time complexity is still decided by the sorting algorithm, while the space complexity depends on the input: it varies from O(1) (no intersections) to O(n) (all the meetings intersect in at least one time slot).
from heapq import heappush, heappop

def minMeetingRooms(self, intervals):
    if not intervals:
        return 0
    intervals.sort(key=lambda x: x.start)
    h = [intervals[0].end]
    rooms = 1
    for i in intervals[1:]:
        s, e = i.start, i.end
        e_before = h[0]
        if s < e_before:  # overlap
            heappush(h, i.end)
            rooms += 1
        else:  # no overlap: merge
            heappop(h)       # kick out 10 in our example
            heappush(h, e)   # replace 10 with 20
    return rooms
Map-based Implementation

class Solution {
public:
    int minMeetingRooms(vector<Interval>& intervals) {
        map<int, int> mp;
        for (auto val : intervals) {
            ++mp[val.start];
            --mp[val.end];
        }
        int max_room = 0, crt_room = 0;
        for (auto val : mp) {
            crt_room += val.second;
            max_room = max(max_room, crt_room);
        }
        return max_room;
    }
};
Input: A = [[0,2],[5,10],[13,23],[24,25]], B = [[1,5],[8,12],[15,24],[25,26]]
Output: [[1,2],[5,5],[8,10],[15,23],[24,24],[25,25]]
Reminder: The inputs and the desired output are lists of Interval objects, not arrays or lists.
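The problem statement for this example is missing here; it matches the interval-list-intersection pattern, for which a hedged two-pointer sketch (using plain [start, end] lists instead of Interval objects) is:

def intervalIntersection(A, B):
    res = []
    i = j = 0
    while i < len(A) and j < len(B):
        lo = max(A[i][0], B[j][0])  # latest start
        hi = min(A[i][1], B[j][1])  # earliest end
        if lo <= hi:                # the two current intervals overlap
            res.append([lo, hi])
        # advance the interval that ends first
        if A[i][1] < B[j][1]:
            i += 1
        else:
            j += 1
    return res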
24.6 Intersection
For problems asking for the intersection of lists, we can use a hashmap, which takes O(m + n) time. Alternatively, we can sort first and use two pointers, one starting at the beginning of each array. Examples are shown below.
Note:
Follow up:
(a) What if the given array is already sorted? How would you opti-
mize your algorithm?
(b) What if nums1’s size is small compared to nums2’s size? Which
algorithm is better?
(c) What if elements of nums2 are stored on disk, and the memory is
limited such that you cannot load all elements into the memory
at once?
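A hedged sketch of the two approaches named above for intersecting two arrays, as referenced before the follow-up questions (both variants are standard; neither is taken verbatim from the text):

def intersection_hash(nums1, nums2):
    # hashmap/set approach: O(m + n)
    return list(set(nums1) & set(nums2))

def intersection_two_pointers(nums1, nums2):
    # sort first, then walk both arrays with one pointer each
    nums1.sort()
    nums2.sort()
    i = j = 0
    res = []
    while i < len(nums1) and j < len(nums2):
        if nums1[i] == nums2[j]:
            if not res or res[-1] != nums1[i]:  # avoid duplicates in the result
                res.append(nums1[i])
            i += 1
            j += 1
        elif nums1[i] < nums2[j]:
            i += 1
        else:
            j += 1
    return res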
Example:

Input: [0, 1, 0, 3, 12]
Output: [1, 3, 12, 0, 0]
24.8 Exercises
24.8.1 Subsequence (with DP)
1. 594. Longest Harmonious Subsequence
We define a harmonious array as an array where the difference between its maximum value and its minimum value is exactly 1.
Now, given an integer array, you need to find the length of its longest harmonious subsequence among all its possible subsequences.
Example 1:
Input: [1, 3, 2, 2, 5, 2, 3, 7]
Output: 5
Explanation: The longest harmonious subsequence is [3,2,2,2,3].

Note: The length of the input array will not exceed 20,000.
Solution: First, use a Counter to count the whole array. Then visit the counter dictionary: for each key, check key+1; only when both counts are non-zero do we have a valid harmonious subsequence, of length count[key] + count[key+1].
from collections import Counter
class Solution:
    def findLHS(self, nums):
        """
        :type nums: List[int]
        :rtype: int
        """
        if not nums or len(nums) < 2:
            return 0
        count = Counter(nums)
        maxLen = 0
        # (the rest of this listing is missing in the source; the loop below
        #  completes it following the solution described above)
        for num in count:
            if num + 1 in count:
                maxLen = max(maxLen, count[num] + count[num + 1])
        return maxLen
Note:
Both strings' lengths will not exceed 100. Only letters from a-z will appear in the input strings.
Solution: With a few more examples, we can find the following rule: "aba", "aba" returns -1.
def findLUSlength(self, a, b):
    """
    :type a: str
    :type b: str
    :rtype: int
    """
    if len(b) != len(a):
        return max(len(a), len(b))
    # the lengths are the same
    return len(a) if a != b else -1
Explanation: Replace the two 'A's with two 'B's or vice versa.
Example 2:

Input:
s = "AABABBA", k = 1

Output:
4

Explanation: Replace the one 'A' in the middle with 'B' and form "AABBBBA". The substring "BBBB" has the longest repeating letters, which is 4.
Solution: The brute-force recursive solution tries, at each position, to replace the char (when it differs from the target char) or to leave it; this gets TLE.

# brute force: recursively try every replacement (TLE)
def replace(news, idx, re_char, k):
    nonlocal maxLen
    if k == 0 or idx == len(s):
        maxLen = max(maxLen, getLen(news))
        return

    if s[idx] != re_char:  # replace
        news_copy = news[:idx] + re_char + news[idx+1:]
        replace(news_copy, idx + 1, re_char, k - 1)
    replace(news[:], idx + 1, re_char, k)

# what if we only have one char
# for char1 in chars.keys():
#     replace(s[:], 0, char1, k)
To get the BCR (best conceivable runtime), think about a sliding window. The number of replacements needed to make a window all one letter is: window length - max(number of occurrences of letter i in the window), for i = 'A' to 'Z'; the constraint requires this quantity to be <= k. So we use a sliding window, record the max occurrence, and shrink the window when the constraint is violated. For example, with strs = "BBCABBBAB" and k = 2: when i=0 and j=7, 8-5=3>2 at letter A, so we need to shrink; maxCharCount changes to 4 and i=1, so 8-1-4=3; then i=2 gives 8-2-3=3, and 8-3-3=2, so i=3 and the current length is 5.
def characterReplacement(self, s, k):
    """
    :type s: str
    :type k: int
    :rtype: int
    """
    i, j = 0, 0  # sliding window
    counter = [0] * 26
    ans = 0
    maxCharCount = 0
    while j < len(s):
        counter[ord(s[j]) - ord('A')] += 1
        maxCharCount = max(maxCharCount, counter[ord(s[j]) - ord('A')])
        while j - i + 1 - maxCharCount > k:  # now shrink the window
            counter[ord(s[i]) - ord('A')] -= 1
            i += 1
            # update the max count inside the window
            maxCharCount = max(counter)
        ans = max(ans, j - i + 1)
        j += 1

    return ans
Input:
s = "ababbc", k = 2

Output:
5

Now, we use a pointer mid starting from 0. If the whole string satisfies the condition, return len(s). Otherwise, use two while loops to separate the string into three substrings: left, mid, and right — left satisfies the condition, mid does not, and right is unknown; then recurse on left and right.
from collections import Counter, defaultdict
class Solution:
    def longestSubstring(self, s, k):
        """
        :type s: str
        :type k: int
        :rtype: int
        """
        if not s:
            return 0
        if len(s) < k:
            return 0
        count = Counter(char for char in s)
        mid = 0  # on the left side, 0..mid holds satisfied elements
        while mid < len(s) and count[s[mid]] >= k:
            mid += 1
        if mid == len(s):
            return len(s)
        left = self.longestSubstring(s[:mid], k)  # e.g. "ababb"
        # from pre_mid to cur_mid, skip the chars that can't satisfy the condition
        while mid < len(s) and count[s[mid]] < k:
            mid += 1
        # now keep doing the same on the right side
        right = self.longestSubstring(s[mid:], k)
        return max(left, right)
24.8.2 Subset
216. Combination Sum III
Find all possible combinations of k numbers that add up to a number
n, given that only numbers from 1 to 9 can be used and each combination
should be a unique set of numbers.
Note:

All numbers will be positive integers.
The solution set must not contain duplicate combinations.

Example 1:
Input: k = 3, n = 7
Output: [[1,2,4]]

Example 2:
Input: k = 3, n = 9
Output: [[1,2,6], [1,3,5], [2,3,4]]
def combinationSum3(self, k, n):
    """
    :type k: int
    :type n: int
    :rtype: List[List[int]]
    """
24.8.3 Intersection
160. Intersection of Two Linked Lists (Easy)
Write a program to find the node at which the intersection of two singly
linked lists begins.
For example, the following two linked lists:
A: a1 −> a2
\
c1 −> c2 −> c3
/
B: b1 −> b2 −> b3
• The linked lists must retain their original structure after the function
returns.
• You may assume there are no cycles anywhere in the entire linked
structure.
• Your code should preferably run in O(n) time and use only O(1) mem-
ory.
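No solution is printed here; a common O(n)-time, O(1)-memory two-pointer sketch (a standard technique, not the book's own listing):

def getIntersectionNode(headA, headB):
    # Each pointer walks its own list and then the other's; both travel
    # a+c+b vs. b+c+a steps, so they meet at the intersection (or at None).
    pA, pB = headA, headB
    while pA is not pB:
        pA = pA.next if pA else headB
        pB = pB.next if pB else headA
    return pA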
25 Linked List, Stack, Queue, and Heap Questions (12%)
Circular Linked List. For a circular linked list, the most important thing when traversing is to set up the end condition of the while loop correctly.
25.1 708. Insert into a Cyclic Sorted List (medium). Given a node from a cyclic linked list which is sorted in ascending order, write a function to insert a value into the list such that it remains a cyclic sorted list. The given node can be a reference to any single node in the list, and may not necessarily hold the smallest value in the cyclic list.
Analysis: We traverse the list at most one round. The potential insert positions depend on the insert value. Suppose the linked-list values lie in the range [s, e], s <= e, and the insert value is m:

1. m in [s, e]: we insert in the middle of the list.
2. m >= e or m <= s: we insert at the boundary between end and start; we detect that boundary as the node whose value is larger than its successor's value.
3. If after one full loop we cannot find a place, we insert at the end. For example, 2->2->2 inserting 3, or 2->3->4->2 inserting 2.
def insert(self, head, insertVal):
    if not head:  # 0 nodes
        head = Node(insertVal, None)
        head.next = head
        return head

    cur = head
    while cur.next != head:
        if cur.val <= insertVal <= cur.next.val:  # insert in the middle
            break
        elif cur.val > cur.next.val:  # boundary between end and start
            if insertVal >= cur.val or insertVal <= cur.next.val:
                break
            cur = cur.next
        else:
            cur = cur.next
    # insert
    node = Node(insertVal, None)
    node.next, cur.next = cur.next, node
    return head
class MyCircularQueue:
    # (reconstructed head; the first listing lines are missing in the source)
    class Node:
        def __init__(self, value):
            self.val = value
            self.next = None

    def __init__(self, k):
        self.size = k
        self.cur_size = 0
        self.head = self.tail = None

    def enQueue(self, value):
        if self.cur_size >= self.size:
            return False
        new_node = MyCircularQueue.Node(value)
        if self.cur_size == 0:
            self.tail = self.head = new_node
        else:
            self.tail.next = new_node
            new_node.next = self.head
            self.tail = new_node
        self.cur_size += 1
        return True

    def deQueue(self):
        if self.cur_size == 0:
            return False
        # delete the head node
        val = self.head.val
        if self.cur_size == 1:
            self.head = self.tail = None
        else:
            self.head = self.head.next
        self.cur_size -= 1
        return True

    def Front(self):
        return self.head.val if self.head else -1

    def Rear(self):
        return self.tail.val if self.tail else -1

    def isEmpty(self):
        return True if self.cur_size == 0 else False

    def isFull(self):
        return True if self.cur_size == self.size else False
25.4 346. Moving Average from Data Stream (easy). Given a stream of integers and a window size, calculate the moving average of all integers in the sliding window.
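No solution listing appears here; a minimal sketch using a queue plus a running sum (an assumption about the intended approach, not the book's code):

from collections import deque

class MovingAverage:
    def __init__(self, size):
        self.size = size
        self.queue = deque()
        self.total = 0

    def next(self, val):
        self.queue.append(val)
        self.total += val
        if len(self.queue) > self.size:  # evict the element leaving the window
            self.total -= self.queue.popleft()
        return self.total / len(self.queue)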
Now, try the BCR, which is O(n). The maximum area is among the areas that use each height as the rectangle height, multiplied by the widest width that works. For the above example, we would choose the maximum among 2x1, 1x6, 5x2, 6x1, 2x4, 3x1. The important step is to find the possible width for each element: for element 2, if the following heights kept increasing, its width would grow; but since the following height 1 is smaller, 2 is popped out and contributes 2x1. This is exactly the behavior of a monotonic increasing stack: when an element is popped, we have found the next element smaller than it, so its width span ends there. What about equal consecutive numbers, as in 6,6,6,6,6? We pop the previous one and append the current one. The structure we use here is called a Monotonic Stack: it only lets increasing elements in, and once a smaller or equal one arrives, it kicks out the previous larger-or-equal elements.
def largestRectangleArea(self, heights):
    """
    :type heights: List[int]
    :rtype: int
    """
    if not heights:
        return 0
    maxsize = max(heights)

    stack = [-1]

    # the stack will only grow (monotonic increasing)
    for i, h in enumerate(heights):
        if stack[-1] != -1:
            if h > heights[stack[-1]]:
                stack.append(i)
            else:
                # start to pop and compute the area
                while stack[-1] != -1 and h <= heights[stack[-1]]:  # same or equal needs to be popped out
                    idx = stack.pop()
                    v = heights[idx]
                    maxsize = max(maxsize, (i - stack[-1] - 1) * v)
                stack.append(i)
        else:
            stack.append(i)
    # handle what is left in the stack
    while stack[-1] != -1:
        idx = stack.pop()
        v = heights[idx]
        maxsize = max(maxsize, (len(heights) - stack[-1] - 1) * v)
    return maxsize
                rslt = check(i, j, w, h)
                if rslt == 0:  # we definitely need to break, or else we get a wrong result
                    break
                maxsize = max(maxsize, check(i, j, w, h))
    return maxsize

This still cannot get accepted (TLE), so we need another solution: reuse the largest rectangle in histogram.
def maximalRectangle(self, matrix):
    """
    :type matrix: List[List[str]]
    :rtype: int
    """
    if not matrix:
        return 0
    if len(matrix[0]) == 0:
        return 0

    def getMaxAreaHist(heights):
        if not heights:
            return 0
        maxsize = max(heights)

        stack = [-1]

        # the stack will only grow (monotonic increasing)
        for i, h in enumerate(heights):
            if stack[-1] != -1:
                if h > heights[stack[-1]]:
                    stack.append(i)
                else:
                    # start to pop and compute the area
                    while stack[-1] != -1 and h <= heights[stack[-1]]:  # same or equal needs to be popped out
                        idx = stack.pop()
                        v = heights[idx]
                        maxsize = max(maxsize, (i - stack[-1] - 1) * v)
                    stack.append(i)
            else:
                stack.append(i)
        # handle what is left in the stack
        while stack[-1] != -1:
            idx = stack.pop()
            v = heights[idx]
            maxsize = max(maxsize, (len(heights) - stack[-1] - 1) * v)
        return maxsize

    row, col = len(matrix), len(matrix[0])
    heights = [0] * col  # save the accumulated column heights up to this row
    maxsize = 0
    for r in range(row):
        for c in range(col):
            if matrix[r][c] == '1':
                heights[c] += 1
            else:
                heights[c] = 0
        # print(heights)
        maxsize = max(maxsize, getMaxAreaHist(heights))
    return maxsize
Monotonic Stack
Example 2:

Input: [1, 2, 3, 4, 5]
Output: 4
Explanation: Buy on day 1 (price = 1) and sell on day 5 (price = 5), profit = 5-1 = 4.
             Note that you cannot buy on day 1, buy on day 2 and sell them later, as you
             would be engaging in multiple transactions at the same time. You must sell
             before buying again.

Example 3:

Input: [7, 6, 4, 3, 1]
Output: 0
Explanation: In this case, no transaction is done, i.e. max profit = 0.
def maxProfit(self, prices):
    # (reconstructed signature; the listing starts mid-function in the source)
    mono_stack = []
    profit = 0
    for p in prices:
        if not mono_stack:
            mono_stack.append(p)
        else:
            if p < mono_stack[-1]:
                mono_stack.append(p)
            else:
                # kick out until the stack is decreasing again
                if mono_stack and mono_stack[-1] < p:
                    price = mono_stack.pop()
                    profit += p - price

                while mono_stack and mono_stack[-1] < p:
                    price = mono_stack.pop()
                mono_stack.append(p)
    return profit
Also, there are other solutions that use O(1) space. Say the given array is [7, 1, 5, 3, 6, 4]. If we plot the numbers of the given array on a graph, we notice that the points of interest are the consecutive valleys and peaks; if we skip one of the peaks (trying to obtain more profit), we end up losing the profit over one of the transactions, leading to an overall lesser profit.
class Solution {
    public int maxProfit(int[] prices) {
        int i = 0;
        int valley = prices[0];
        int peak = prices[0];
        int maxprofit = 0;
        while (i < prices.length - 1) {
            while (i < prices.length - 1 && prices[i] >= prices[i + 1])
                i++;
            valley = prices[i];
            while (i < prices.length - 1 && prices[i] <= prices[i + 1])
                i++;
            peak = prices[i];
            maxprofit += peak - valley;
        }
        return maxprofit;
    }
}
This solution follows the logic of the peak-valley approach, with a slight variation. Instead of looking for every peak following a valley, we simply crawl over the slope and keep adding the profit obtained from every consecutive transaction. We still use the peaks and valleys effectively, but we need not track the costs corresponding to them: we directly keep adding the difference between consecutive numbers of the array whenever the second number is larger than the first, and the total sum we obtain is the maximum profit. This approach simplifies the solution. It can be made clearer with the example [1, 7, 2, 3, 6, 7, 6, 7]: from its graph, we can observe that the sum of the small ascents A + B + C equals the difference D between the heights of the corresponding consecutive peak and valley.
class Solution {
    public int maxProfit(int[] prices) {
        int maxprofit = 0;
        for (int i = 1; i < prices.length; i++) {
            if (prices[i] > prices[i - 1])
                maxprofit += prices[i] - prices[i - 1];
        }
        return maxprofit;
    }
}
Input: tasks = ["A", "A", "A", "B", "B", "B"], n = 2
Output: 8
Explanation: A -> B -> idle -> A -> B -> idle -> A -> B.

Figure 25.5: Task Scheduler. Left is the first step; right is the arrangement we end up with.
def leastInterval(self, tasks, n):
    # (reconstructed head; the listing starts mid-function in the source)
    c = collections.Counter(tasks)
    f = [count for _, count in c.items()]
    f.sort(reverse=True)
    idle_time = (f[0] - 1) * n

    for i in range(1, len(f)):
        idle_time -= min(f[i], f[0] - 1)
    return idle_time + len(tasks) if idle_time > 0 else len(tasks)
26 String Questions (15%)
String problems can be divided into two categories: single-string problems and two-string pattern matching.
For the single-string problems, the first type asks us to perform operations that meet certain requirements on one string. (1) For the ad hoc, easy string-processing problems, we only need to read the requirement carefully and apply basic programming skills and data structures; sometimes we also need to be familiar with string libraries such as re beyond the basic built-in string functions. We list some LeetCode problems of this type in Section 26.1. (2) There are also more challenging problems: finding the longest/shortest substring or subsequence, or counting the substrings/subsequences, that satisfy certain requirements. Usually the subsequence version is harder than the substring version. In this chapter we list the following types in Section 26.3.
26.3.1 Palindrome
A palindrome is a sequence of characters that reads the same forward and backward. To identify whether a sequence such as "abba" is a palindrome, we just check if s == s[::-1]. The structure is recursive: if we know "bb" is a palindrome, then "abba" is a palindrome iff s[0] == s[3]. Because of this structure, in problems about finding palindromic substrings we can apply dynamic programming and other algorithms to beat the naive solution.
To validate a palindrome we can also use two pointers, one at the start and the other at the end, moving both toward the middle.
def validPalindrome(self, s):
    if not s:
        return True

    i, j = 0, len(s) - 1
    while i <= j:
        if s[i] == s[j]:
            i += 1
            j -= 1
        else:
            left = s[i+1:j+1]   # delete s[i]
            right = s[i:j]      # delete s[j]
            return left == left[::-1] or right == right[::-1]
    return True
From the example, we first note that this dp matrix only has valid values in its upper triangle, since i <= j. For j - i > 2 (length >= 4), dp[i][j] = 1 iff s[i] == s[j] and dp[i+1][j-1] == 1. Since dp[i][j] depends on positions i+1 and j-1, we must iterate i in reverse and j incrementally.
def countSubstrings(self, s):
    """
    :type s: str
    :rtype: int
    """
    n = len(s)
    dp = [[0 for _ in range(n)] for _ in range(n)]  # dp[i][j]: s[i..j] is a palindrome
    res = 0
    for i in range(n-1, -1, -1):
        for j in range(i, n):
            if j - i > 2:  # length >= 4
                dp[i][j] = (s[i] == s[j] and dp[i+1][j-1])
            else:          # lengths 1, 2, and 3
                dp[i][j] = (s[i] == s[j])
            if dp[i][j]:
                res += 1
    return res
left = 1, right = 2, i = 3, i//2 = 1, i%2 = 1
left = 2, right = 2, i = 4, i//2 = 2, i%2 = 0
def countSubstrings(self, S):
    n = len(S)
    ans = 0
    for i in range(2*n - 1):  # each of the 2n-1 possible centers
        l = i // 2
        r = l + i % 2
        while l >= 0 and r < n and S[l] == S[r]:
            ans += 1
            l -= 1
            r += 1
    return ans
Example 2:

Input:
"cbbd"
Output:
2
One possible longest palindromic subsequence is "bb".
def longestPalindromeSubseq(self, s):
    if not s:
        return 0
    if s == s[::-1]:
        return len(s)

    rows = len(s)
    dp = [[0 for col in range(rows)] for row in range(rows)]
    for i in range(rows):
        dp[i][i] = 1

    for l in range(2, rows + 1):  # l is the span length
        for i in range(rows - l + 1):  # start 0, end len-l+1
            j = i + l - 1
            if s[i] == s[j]:
                dp[i][j] = dp[i+1][j-1] + 2
            else:
                dp[i][j] = max(dp[i][j-1], dp[i+1][j])
    return dp[0][rows-1]
26.3.2 Calculator
In this section, a basic calculator supports the operators '+', '-', '*', '/', and parentheses. ('+', '-') and ('*', '/') have different priorities, and parentheses change the priority too. The basic step is to obtain the integers digit by digit from the string, and, if the previous sign is '-', to make the number negative. Given a string expression (a+b/c)*(d-e)+((f-g)-(h-i))+(j-k), the idea is to reduce it step by step. The rules are: 1) Reduce '*' and '/': we handle them when we encounter the following operator or reach the end of the string. Whenever we encounter a sign (operator), we check the previous sign; if the previous sign is '/' or '*', we combine the previous number with the current number into one. 2) Reduce each parenthesized group into one value: (d-e) is reduced to a single value, and because the previous sign is '*', it is further combined with the value of (a+b/c) into one value. If we push the reduced results onto a stack, the stack holds one value per additive term, and we just sum them up at the end. To avoid boundary conditions, we can append '+' at the end of the string. In the later part, we explain more about how to handle these two kinds of reduction. There are different levels of calculators:
2. sum over the positive and negative values in the stack

3. '+', '-', '*', '/', without parentheses: this is similar to Case 1 except for '*' and '/'. For example, in a-b/c/d*e, when we reach c we pop the top element of the stack and compute (-b/c)=f, appending f back onto the stack. When we reach d, we similarly compute (f/d)=g and append g onto the stack.

1. iterate through the chars:
     if a digit: accumulate the integer
     if c in ['+', '-', '*', '/'] or c is the last char:
         if presign == '-':
             num = -num
         # we reduce the current num with the previous one
         elif presign in ['*', '/']:
             num = operator(stack.pop(), presign, num)
         stack.append(num)
         num = 0
         presign = c
2. sum over the positive and negative values in the stack
Example 2:
Input: "2-1 + 2"
Output: 3
Example 3:
Input: "(1+(4+5+2)-3)+(6+8)"
Output: 23
def calculate(self, s):
    s = s + '+'
    ans = num = 0  # num accumulates each number
    sign = '+'
    stack = collections.deque()
    for c in s:
        if c.isdigit():  # get the number
            num = 10*num + int(c)
        elif c in ['-', '+', ')']:
            if sign == '-':
                num = -num
            if c == ')':
                while stack and stack[-1] != '(':
                    num += stack.pop()
                stack.pop()          # pop '('
                sign = stack.pop()   # restore the sign pushed before '('
            else:
                stack.append(num)
                num = 0
                sign = c
        elif c == '(':  # left parenthesis: push the pending sign and '('
            stack.append(sign)
            stack.append('(')
            num = 0
            sign = '+'

    while stack:
        ans += stack.pop()
    return ans
"1 + 1" = 2
" 6−4 / 2 " = 4
" 2∗(5+5 ∗2) /3+(6/2+8) " = 21
"(2+6∗ 3+5− (3∗14/7+2) ∗ 5 ) +3"=−12
Solution: Case 4
def calculate(self, s):
    ans = num = 0
    stack = collections.deque()
    n = len(s)
    presign = '+'
    s = s + '+'
    def op(pre, op, cur):
        if op == '*':
            return pre * cur
        if op == '/':
            return -(abs(pre) // cur) if pre < 0 else pre // cur  # truncate toward zero
    for i, c in enumerate(s):
        if c.isdigit():
            num = 10*num + int(c)
        elif c in ['+', '-', '*', '/', ')']:
            if presign == '-':
                num = -num
            elif presign in ['*', '/']:
                num = op(stack.pop(), presign, num)
            # (the source listing is cut off here by a page break; the lines
            #  below reconstruct the rest following the Case-1 pattern)
            if c == ')':
                while stack and stack[-1] != '(':
                    num += stack.pop()
                stack.pop()            # pop '('
                presign = stack.pop()  # restore the sign pushed before '('
            else:
                stack.append(num)
                num = 0
                presign = c
        elif c == '(':
            stack.append(presign)
            stack.append('(')
            num = 0
            presign = '+'
    while stack:
        ans += stack.pop()
    return ans
26.3.3 Others
Possible methods: two pointers; one loop plus two pointers (a sliding window).
def findAnagrams(self, s, p):
    """
    :type s: str
    :type p: str
    :rtype: List[int]
    """
    if len(s) < len(p) or not s:
        return []
    # frequency table
    table = {}
    for c in p:
        table[c] = table.get(c, 0) + 1

    begin, end = 0, 0
    r = []
    counter = len(table)
    while end < len(s):
        end_char = s[end]
        if end_char in table.keys():
            table[end_char] -= 1
            if table[end_char] == 0:
                counter -= 1

        # grow the window: from A to AD, ...
        end += 1

        while counter == 0:  # all chars of p are in the window; start to trim it
            # save the result: just the beginning index
            if end - begin == len(p):
                r.append(begin)
            # move the window forward
            start_char = s[begin]
            if start_char in table:  # reverse the count
                table[start_char] += 1
                if table[start_char] == 1:  # only increase counter when it becomes 1
                    counter += 1

            begin += 1
    return r
26.7 Exercises
26.7.1 Palindrome
1. Valid Palindrome (L125, *). Given a string, determine if it is a palindrome, considering only alphanumeric characters and ignoring cases. Note: For the purpose of this problem, we define the empty string as a valid palindrome.
Example 2:

Input:
"abccccdd"
Output:
7
Explanation:
One longest palindrome that can be built is "dccaccd", whose length is 7.
27 Tree Questions (10%)

Operations

When looking for a key in a tree (or for a place to insert a new key), we traverse the tree from root to leaf, making comparisons with the keys stored in the nodes and deciding, on the basis of each comparison, whether to continue searching in the left or the right subtree. On average, each comparison allows the operations to skip about half of the tree, so each SEARCH, INSERT, or DELETE takes time proportional to the logarithm of the number of items stored in the tree. This is much better than the linear time required to find items by key in an (unsorted) array, but slower than the corresponding operations on hash tables.
In order to build a BST, we INSERT a series of elements into the tree organized by the search-tree property, and in order to INSERT, we must SEARCH for the position at which to insert the element. Thus, we introduce these operations in the order SEARCH, INSERT, and GENERATE.
Figure 27.2: The lightly shaded nodes indicate the simple path from the root
down to the position where the item is inserted. The dashed line indicates
the link in the tree that is added to insert the item.
# recursive searching
def search(root, key):
    # (reconstructed head; the listing starts mid-function in the source)
    if root is None or root.val == key:
        return root

    # key is greater than root's key
    if root.val < key:
        return search(root.right, key)

    # key is smaller than root's key
    return search(root.left, key)
Also, we can write it in an iterative way, which avoids the space overhead of recursive calls:
# iterative searching
def iterative_search(root, key):
    while root is not None and root.val != key:
        if root.val < key:
            root = root.right
        else:
            root = root.left
    return root
The above approach needs a return value and reassigns the left or right child every time. We can instead use the following code, which may look more complex with its if conditions but runs faster and assigns a new node only once, at the end.
# recursive insertion
def insertion(root, val):
    if root is None:
        root = TreeNode(val)
        return
    if val > root.val:
        if root.right is None:
            root.right = TreeNode(val)
        else:
            insertion(root.right, val)
    else:
        if root.left is None:
            root.left = TreeNode(val)
        else:
            insertion(root.left, val)
We can also search iteratively while saving the previous node. The while loop stops when it hits an empty node. There are three cases for the previous node:

1. The previous node is None, which means the tree is empty, so we assign a root node with the value.
2. The previous node has a value larger than the key, so we put the key as its left child.
3. The previous node has a value smaller than the key, so we put the key as its right child.
# iterative insertion
def iterativeInsertion(root, key):
    pre_node = None
    node = root
    while node is not None:
        pre_node = node
        if key < node.val:
            node = node.left
        else:
            node = node.right
    # we reached the leaf position; its parent is pre_node
    if pre_node is None:
        root = TreeNode(key)
    elif pre_node.val > key:
        pre_node.left = TreeNode(key)
    else:
        pre_node.right = TreeNode(key)
    return root
BST Generation. First, declare an empty BST root. Given a list, we just call INSERT for each element; the time complexity is O(n log n).

datas = [8, 3, 10, 1, 6, 14, 4, 7, 13]
BST = None
for key in datas:
    BST = iterativeInsertion(BST, key)
print(LevelOrder(BST))
# output
# [8, 3, 10, 1, 6, 14, 4, 7, 13]
    50                          50
   /  \       delete(20)       /  \
  30    70    --------->     30    70
 /  \   / \                    \   / \
20  40 60  80                  40 60  80

    50                          50
   /  \       delete(30)       /  \
  30    70    --------->     40    70
    \   / \                       / \
    40 60  80                    60  80

    50                          60
   /  \       delete(50)       /  \
  40    70    --------->     40    70
       /  \                         \
      60  80                        80

The important thing to note is that an inorder successor is needed only when the right child is not empty. In that case, the inorder successor can be obtained by finding the minimum value in the right child of the node.
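A hedged sketch of deletion following the three cases illustrated above (deleteNode is an assumed name; this is a standard implementation, not the book's own listing):

def deleteNode(root, key):
    if root is None:
        return None
    if key < root.val:
        root.left = deleteNode(root.left, key)
    elif key > root.val:
        root.right = deleteNode(root.right, key)
    else:
        if root.left is None:    # zero children or only a right child
            return root.right
        if root.right is None:   # only a left child
            return root.left
        # two children: copy the inorder successor (min of right subtree), then delete it
        succ = root.right
        while succ.left:
            succ = succ.left
        root.val = succ.val
        root.right = deleteNode(root.right, succ.val)
    return root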
Features of BST
Minimum and Maximum. The operation is similar to search: to find the minimum, we always traverse into the left subtree; for the maximum, we just replace "left" with "right". The time complexity is the same, O(lg n).
# recursive
def get_minimum(root):
    if root is None:
        return None
    if root.left is None:  # a leaf or a node with no left subtree
        return root
    return get_minimum(root.left)

# iterative
def iterative_get_minimum(root):
    while root.left is not None:
        root = root.left
    return root
However, if your tree node has no parent pointer, you cannot traverse back to its ancestors, and we only have one option: use the inorder tree traversal and find the element right after the node. Consider the right subtree of the node:

1) If it is not None, then the successor is the minimum node in the right subtree, e.g. successor(12) = 13 = min(12.right).
2) If it is None, then the successor is one of its ancestors. We traverse down from the root until we find the current node; the most recent ancestor from which we turned left is the successor, e.g. successor(2) = 5.
def SuccessorInorder(root, n):
    # Step 1 of the above algorithm
    if n.right is not None:
        return get_minimum(n.right)
    # Step 2 of the above algorithm
    succ = None
    while root is not None:
        if n.val > root.val:
            root = root.right
        elif n.val < root.val:
            succ = root
            root = root.left
        else:  # we found the node, no need to traverse further
            break
    return succ
The worst case for finding the successor or predecessor in a BST is bounded by the height of the tree: we may descend into one subtree of the current node and also walk back through its parents and grandparents. The expected time complexity is O(lg n), and the worst case, when the tree lines up with no branching, is O(n).
Now we put a table here to summarize the space and time complexity of each operation.
4. If a parent node covers the range [i, j], we split the range at the middle position m = (i + j)/2; the left child takes the range [i, m] and the right child takes [m+1, j].

Because at each step of building the segment tree the interval is divided into two halves, the height of the segment tree is log N. There are N leaves and N-1 internal nodes in total, which makes the total number of nodes 2N - 1 and makes the segment tree a full binary tree.
Here, we use the Range Sum Query (RSQ) problem to demonstrate how a segment tree works:

Example:

Given nums = [1, 3, 5]

sumRange(0, 2) -> 9
update(1, 2)
sumRange(0, 2) -> 8

Note:
Build Segment Tree. Because each leaf of the tree holds a single element, we can use divide and conquer to build the tree recursively. For a given node, in the 'divide' step we first build and return its left and right children (including computing their sums), and in the 'conquer' step we compute this node's sum from its children's sums and attach the two children. Because there are 2n - 1 nodes in total, both the time and space complexity are O(n).
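A hedged sketch of this build step (the book's own build listing does not survive here; TreeNode and _buildST are assumed names):

def _buildST(self, nums, s, e):
    if s > e:
        return None
    node = TreeNode(nums[s])
    if s == e:                 # a leaf holds a single element
        return node
    m = (s + e) // 2
    node.left = self._buildST(nums, s, m)        # divide
    node.right = self._buildST(nums, m + 1, e)
    node.val = node.left.val + node.right.val    # conquer: sum of children
    return node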
def _updateNode(self, i, val, root, s, e):
    # (reconstructed head; the listing starts mid-function in the source)
    if s == e:  # reached the leaf responsible for index i
        root.val = val
        return
    m = (s + e) // 2
    if i <= m:
        self._updateNode(i, val, root.left, s, m)
    else:
        self._updateNode(i, val, root.right, m+1, e)
    root.val = root.left.val + root.right.val
    return
Range Sum Query. Each query range [i, j] is a combination of one or more of the tree's ranges. For instance, in the segment tree shown in Fig 27.3, the range [2, 4] is the combination of [2, 3] and [4, 4]. The process is similar to updating: we start from the root and compute its middle index m. 1) If [i, j] is the same as [s, e] (i == s and j == e), return the node's value. 2) If [i, j] lies within [s, m] (j <= m), search only the left branch. 3) If [i, j] lies within [m+1, e] (i > m), search only the right branch. 4) Otherwise, search both branches — the left with target [i, m] and the right with target [m+1, j] — and return the sum of both sides. The time complexity is still O(log n).
def _rangeQuery(self, root, i, j, s, e):
    if s > e or i > j:
        return 0
    if s == i and j == e:
        return root.val if root is not None else 0

    m = (s + e) // 2

    if j <= m:
        return self._rangeQuery(root.left, i, j, s, m)
    elif i > m:
        return self._rangeQuery(root.right, i, j, m+1, e)
    else:
        return self._rangeQuery(root.left, i, m, s, m) + self._rangeQuery(root.right, m+1, j, m+1, e)

def update(self, i, val):
    self._updateNode(i, val, self.st, 0, self.n - 1)

def sumRange(self, i, j):
    return self._rangeQuery(self.st, i, j, 0, self.n - 1)
A segment tree can be used here to lower the complexity of each query to O(log n).
Compact Trie. If we assign only one letter per edge, we are not taking full advantage of the trie's tree structure. It is more useful to consider compact or compressed tries: tries where we remove the one-letter-per-edge constraint and contract non-branching paths by concatenating the letters on these paths. In this way, every node branches out, and every node traversed represents a choice between two different words. The compressed trie that corresponds to our example trie is also shown in Figure 27.4.
Here |Σ| is the alphabet size and N is the total number of nodes in the trie.
Note: You may assume that all inputs consist of lowercase letters a-z. All inputs are guaranteed to be non-empty strings.
def search(self, word):
    node = self.root
    for c in word:
        loc = ord(c) - ord('a')
        # case 1: not all letters matched
        if node.children[loc] is None:
            return False
        node = node.children[loc]
    # case 2
    return True if node.is_word else False

def startWith(self, word):
    node = self.root
    for c in word:
        loc = ord(c) - ord('a')
        # case 1: not all letters matched
        if node.children[loc] is None:
            return False
        node = node.children[loc]
    # case 2
    return True
Now complete the given Trie class with TrieNode and the __init__ function.

class Trie:
    class TrieNode:
        def __init__(self):
            self.is_word = False
            self.children = [None] * 26  # each slot's index represents a char

    def __init__(self):
        """
        Initialize your data structure here.
        """
        self.root = self.TrieNode()  # the root node carries no char
27.1 336. Palindrome Pairs (hard). Given a list of unique words, find all pairs of distinct indices (i, j) in the given list, so that the concatenation of the two words, i.e. words[i] + words[j], is a palindrome.

Example 1:

Input: ["abcd", "dcba", "lls", "s", "sssll"]
Output: [[0,1], [1,0], [3,2], [2,4]]
Explanation: The palindromes are ["dcbaabcd", "abcddcba", "slls", "llssssll"]

Example 2:

Input: ["bat", "tab", "cat"]
Output: [[0,1], [1,0]]
Explanation: The palindromes are ["battab", "tabbat"]
        for j, ch in enumerate(word):
            if ch not in trie.links:
                break
            trie = trie.links[ch]
            if is_palindrome(word[j+1:]) and trie.index is not None and trie.index != i:
                # this word completes to a palindrome and the prefix is a word: record it
                res.append([i, trie.index])
        else:
            # this word is a reverse suffix of other words; combine it with
            # those that complete to a palindrome
            for pali_index in trie.pali_indices:
                if i != pali_index:
                    res.append([i, pali_index])
    if '' in words:
        j = words.index('')
        for i, word in enumerate(words):
            if i != j and is_palindrome(word):
                res.append([i, j])
                res.append([j, i])
    return res
            if ind1 != ind2 and rev1 ^ rev2:  # one is reversed, one is not
                rest = word2[len1:]
                if rest == rest[::-1]:
                    # if rev2 is the reversed one, the pair goes from ind1 to ind2
                    result += ([ind1, ind2],) if rev2 else ([ind2, ind1],)
            else:
                break  # break early: this way we only deal with possible reversed pairs
    return result
There are several other data structures, like balanced trees and hash tables, that let us search for a word in a dataset of strings. Then why do we need a trie? Although a hash table offers O(1) lookup of a key, it is not efficient for the following operations:
27.4 Bonus
Solve Duplicate Problem in BST When there are duplicates, things can be more complicated, and the college algorithm book does not really tell us what to do with them. If you use the definition "left <= root < right" and you have a tree like:
3
/ \
2 4
then adding a “3” duplicate key to this tree will result in:
3
/ \
2 4
\
3
so checking for a duplicate's existence is not as simple as just checking the immediate children of a node.
An option to avoid this issue is to not represent duplicates structurally
(as separate nodes) but instead use a counter that counts the number of
occurrences of the key. The previous example would then have a tree like:
  3(1)
 /   \
2(1) 4(1)
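A minimal sketch of this counter-based insertion; the Node fields and function name are illustrative assumptions:

class Node:
    def __init__(self, key):
        self.key = key
        self.count = 1  # number of occurrences of this key
        self.left = None
        self.right = None

def insert(root, key):
    if root is None:
        return Node(key)
    if key == root.key:
        root.count += 1  # duplicate: bump the counter, no new node
    elif key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

Deletion is symmetric: decrement the counter and only unlink the node when its count drops to zero.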
has a PPID of 0, which means this process has no parent process. All the PIDs will be distinct positive integers.

We use two lists of integers to represent a list of processes, where the first list contains the PID of each process and the second list contains the corresponding PPID.
Now given the two lists, and a PID representing a process you want
to kill, return a list of PIDs of processes that will be killed in the
end. You should assume that when a process is killed, all its children
processes will be killed. No order is required for the final answer.
Example 1:
Input:
pid =  [1, 3, 10, 5]
ppid = [3, 0, 5, 3]
kill = 5
Output: [5, 10]
Explanation:
     3
    / \
   1   5
      /
    10
Kill 5 will also kill 10.
Analysis: The parent-child relation forms a tree-like data structure, which is also a graph. Instead of building an explicit tree first, we define the graph as a defaultdict indexed by the parent node, with the children kept in a list. In such a graph, finding the processes to kill is the same as running a DFS/BFS from the kill node and saving every node visited along the way. Here we give the BFS variant (the code pops from the front of a queue).
from collections import defaultdict, deque

def killProcess(self, pid, ppid, kill):
    """
    :type pid: List[int]
    :type ppid: List[int]
    :type kill: int
    :rtype: List[int]
    """
    # build the graph: parent id -> list of child ids
    graph = defaultdict(list)
    for p_id, id in zip(ppid, pid):
        graph[p_id].append(id)

    q = deque([kill])
    path = set()
    while q:
        id = q.popleft()  # pop from the front: BFS
        path.add(id)
        for neig in graph[id]:
            if neig in path:
                continue
            q.append(neig)
    return list(path)
X X X X
X O O X
X X O X
X O X X

After running your function, the board should be:

X X X X
X X X X
X X X X
X O X X
moves = [(0, -1), (0, 1), (-1, 0), (1, 0)]
# find all 'O's connected to the border, mark them as visited,
# then flip all the remaining 'O's to 'X'
visited = [[False for c in range(cols)] for r in range(rows)]

def dfs(x, y):  # (x, y) is a border 'O'
    for dx, dy in moves:
        nx = x + dx
        ny = y + dy
        if nx < 0 or nx >= rows or ny < 0 or ny >= cols:
            continue
        if board[nx][ny] == 'O' and not visited[nx][ny]:
            visited[nx][ny] = True
            dfs(nx, ny)

# first and last column
for i in range(rows):
    if board[i][0] == 'O' and not visited[i][0]:
        visited[i][0] = True
        dfs(i, 0)
    if board[i][-1] == 'O' and not visited[i][-1]:
        visited[i][-1] = True
        dfs(i, cols-1)
# first and last row
for j in range(cols):
    if board[0][j] == 'O' and not visited[0][j]:
        visited[0][j] = True
        dfs(0, j)
    if board[rows-1][j] == 'O' and not visited[rows-1][j]:
        visited[rows-1][j] = True
        dfs(rows-1, j)
for i in range(rows):
    for j in range(cols):
        if board[i][j] == 'O' and not visited[i][j]:
            board[i][j] = 'X'
        return
    moves = [(0, -1), (0, 1), (-1, 0), (1, 0)]
    # find all 'O's connected to the border and mark them as '-1',
    # then flip all the other 'O's to 'X',
    # and finally change the '-1's back to 'O's

    def dfs(x, y):  # (x, y) is a border 'O'
        for dx, dy in moves:
            nx = x + dx
            ny = y + dy
            if nx < 0 or nx >= rows or ny < 0 or ny >= cols:
                continue
            if board[nx][ny] == 'O':
                board[nx][ny] = '-1'
                dfs(nx, ny)
        return

    # first and last column
    for i in range(rows):
        if board[i][0] == 'O':
            board[i][0] = '-1'
            dfs(i, 0)
        if board[i][-1] == 'O':
            board[i][-1] = '-1'
            dfs(i, cols-1)
    # first and last row
    for j in range(cols):
        if board[0][j] == 'O':
            board[0][j] = '-1'
            dfs(0, j)
        if board[rows-1][j] == 'O':
            board[rows-1][j] = '-1'
            dfs(rows-1, j)
    for i in range(rows):
        for j in range(cols):
            if board[i][j] == 'O':
                board[i][j] = 'X'
            elif board[i][j] == '-1':
                board[i][j] = 'O'
Input: n = 5 and edges = [[0, 1], [1, 2], [3, 4]]

     0          3
     |          |
     1 --- 2    4

Output: 2

Example 2:
Input: n = 5 and edges = [[0, 1], [1, 2], [2, 3], [3, 4]]

     0           4
     |           |
     1 --- 2 --- 3

Output: 1
Solution: Use DFS. First, given n nodes and no edges, there are n components; each edge can then merge two of them. The skeleton is:

for n in vertices:
    if n not visited:
        DFS(n)  # this is a component: traverse its connected vertices and mark them as visited
Before we start the main part, it is easier to convert the edge list into an undirected graph using an adjacency list. Because the graph is undirected, each edge contributes two entries in the adjacency list, one for each direction.
def countComponents(self, n, edges):
    """
    :type n: int
    :type edges: List[List[int]]
    :rtype: int
    """
    if not edges:
        return n

    def dfs(i):
        for nb in g[i]:
            if not visited[nb]:
                visited[nb] = True
                dfs(nb)
        return

    # convert edges into an adjacency list
    g = [[] for i in range(n)]
    for i, j in edges:
        g[i].append(j)
        g[j].append(i)

    # find components
    visited = [False] * n
    ans = 0
    for i in range(n):
        if not visited[i]:
            visited[i] = True
            dfs(i)
            ans += 1
    return ans
Input:
11110
11010
11000
00000
Output: 1

Example 2:
Input:
11000
11000
00100
00011
Output: 3
Solution: DFS without extra space. We use DFS and mark the visited components as '-1' in the grid.
def numIslands(self, grid):
    """
    :type grid: List[List[str]]
    :rtype: int
    """
    if not grid:
        return 0
    rows, cols = len(grid), len(grid[0])
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]

    def dfs(x, y):
        for dx, dy in moves:
            nx, ny = x + dx, y + dy
            if nx < 0 or ny < 0 or nx >= rows or ny >= cols:
                continue
            if grid[nx][ny] == '1':
                grid[nx][ny] = '-1'
                dfs(nx, ny)
        return

    ans = 0
    for i in range(rows):
        for j in range(cols):
            if grid[i][j] == '1':
                grid[i][j] = '-1'
                dfs(i, j)
                ans += 1
    return ans
Input: [[0,1],[1,0]]
Output: 1

Example 2:
Input: [[0,1,0],[0,0,0],[0,0,1]]
Output: 2

Example 3:
Input: [[1,1,1,1,1],[1,0,0,0,1],[1,0,1,0,1],[1,0,0,0,1],[1,1,1,1,1]]
Output: 1
Note: In fact, there is no polynomial-time solution available for this problem, as it is a known NP-hard problem.
28.6 943. Find the Shortest Superstring (hard). Given an array A of strings, find any smallest string that contains each string in A as a substring. We may assume that no string in A is a substring of another string in A.

Example 1:

Example 2:
Input: ["catg","ctaagt","gcta","ttca","atgcatc"]
Output: "gctaagttcatgcatc"
            break
    return G

def dfs(used, d, curr, path, ans, best_path):
    if curr >= ans[0]:
        return
    if d == n:
        ans[0] = curr
        best_path[0] = path
        return
    for i in range(n):
        if used & (1 << i):
            continue
        if curr == 0:
            dfs(used | (1 << i), d+1, curr + len(A[i]),
                path + [i], ans, best_path)
        else:
            dfs(used | (1 << i), d+1, curr + len(A[i]) - G[path[-1]][i],
                path + [i], ans, best_path)
    return

G = getGraph(A)
ans = [0]
for a in A:
    ans[0] += len(a)

final_path = [[i for i in range(n)]]

visited = 0  # a bitmask instead of [False for i in range(n)]
dfs(visited, 0, 0, [], ans, final_path)

# generate the result string from the best path
final_path = final_path[0]
res = A[final_path[0]]
for i in range(1, len(final_path)):
    last = final_path[i-1]
    cur = final_path[i]
    l = G[last][cur]
    res += A[cur][l:]
return res
1. Single Sequence (50%): This is an easy type too. The state represents the answer for the subsequence that ends at, and includes, the current element. Dividing the problem this way lets us obtain the state transfer function easily and find a pattern.
Input: n = 3, k = 2
Output: 6
Explanation: Take c1 as color 1, c2 as color 2. All possible ways are:
    pre_diff = diff
    diff = (same + diff) * (k - 1)
    same = pre_diff
return same + diff
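Since the listing shows only the tail of the routine, here is a complete version for reference; the function name and the same/diff initialization are assumptions consistent with the loop body:

def numWays(n, k):
    if n == 0:
        return 0
    if n == 1:
        return k
    # for the first two posts: k ways to paint them the same,
    # k*(k-1) ways to paint them differently
    same, diff = k, k * (k - 1)
    for _ in range(3, n + 1):
        pre_diff = diff
        diff = (same + diff) * (k - 1)  # the new post differs from the previous one
        same = pre_diff                 # the new post equals the previous one
    return same + diff

For n = 3, k = 2 this returns 6, matching the example above.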
29.2 Paint House (L256, *). There is a row of n houses; each house can be painted with one of three colors: red, blue, or green. The cost of painting each house with a certain color is different. You have to paint all the houses such that no two adjacent houses have the same color.

The cost of painting each house with a certain color is represented by an n x 3 cost matrix. For example, costs[0][0] is the cost of painting house 0 with color red; costs[1][2] is the cost of painting house 1 with color green, and so on. Find the minimum cost to paint all houses. Note: All costs are positive integers.
Example:
Input: [[17,2,17],[16,16,5],[14,3,19]]
Output: 10
Explanation: Paint house 0 blue, house 1 green, and house 2 blue.
Minimum cost: 2 + 5 + 3 = 10.
def minCost(self, costs):
    if not costs:
        return 0
    c1, c2, c3 = costs[0]
    n = len(costs)
    for i in range(1, n):
        nc1 = costs[i][0] + min(c2, c3)
        nc2 = costs[i][1] + min(c1, c3)
        nc3 = costs[i][2] + min(c1, c2)
        c1, c2, c3 = nc1, nc2, nc3
    return min(c1, c2, c3)
money stashed; the only constraint stopping you from robbing each of them is that adjacent houses have security systems connected, and the system will automatically contact the police if two adjacent houses are broken into on the same night.
Given a list of non-negative integers representing the amount of money
of each house, determine the maximum amount of money you can rob
tonight without alerting the police.
Solution: Induction and Multi-choiced State. Each house has two choices: rob or not rob. Thus the profit at each house can be deduced as follows:

1 house:  dp[1].rob = p[1], dp[1].not_rob = 0; return max(dp[1].rob, dp[1].not_rob)
2 houses: if we rob house 2, we definitely cannot rob house 1: dp[2].rob = dp[1].not_rob + p[2].
          if we do not rob house 2, we can choose to rob house 1 or not: dp[2].not_rob = max(dp[1].rob, dp[1].not_rob)
def rob(self, nums):
    if not nums:
        return 0
    if len(nums) == 1:
        return nums[0]
    rob = nums[0]
    not_rob = 0
    for i in range(1, len(nums)):
        new_rob = not_rob + nums[i]
        new_not_rob = max(rob, not_rob)
        rob, not_rob = new_rob, new_not_rob
    return max(rob, not_rob)
29.4 House Robber III (L337, medium). The thief has found himself
a new place for his thievery again. There is only one entrance to this
area, called the "root." Besides the root, each house has one and only
one parent house. After a tour, the smart thief realized that "all houses
in this place forms a binary tree". It will automatically contact the
police if two directly-linked houses were broken into on the same night.
Determine the maximum amount of money the thief can rob tonight
without alerting the police.
Example 1:
Input: [3,2,3,null,3,null,1]
     3
    / \
   2   3
    \   \
     3   1
Output: 7
Explanation: Maximum amount of money the thief can rob = 3 + 3 + 1 = 7.

Example 2:
Input: [3,4,5,1,3,null,1]
     3
    / \
   4   5
  / \   \
 1   3   1
Output: 9
Explanation: Maximum amount of money the thief can rob = 4 + 5 = 9.
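No listing accompanies this problem here; the rob/not_rob state from House Robber generalizes directly from the chain to the tree. A sketch, assuming the usual TreeNode with val, left, and right:

def rob(self, root):
    def dfs(node):
        # returns (best if we rob this node, best if we skip it)
        if not node:
            return 0, 0
        rob_l, skip_l = dfs(node.left)
        rob_r, skip_r = dfs(node.right)
        rob_here = node.val + skip_l + skip_r  # children must be skipped
        skip_here = max(rob_l, skip_l) + max(rob_r, skip_r)
        return rob_here, skip_here
    return max(dfs(root))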
Input: [-2,1,-3,4,-1,2,1,-5,4]
Output: 6
Explanation: [4,-1,2,1] has the largest sum = 6.
Follow up: If you have figured out the O(n) solution, try coding an-
other solution using the divide and conquer approach, which is more
subtle.
Solution 1: Prefix Sum. For the maximum subarray problem, with prefix sums y, the answer is max(y_j − y_i) over j > i, j ∈ [0, n−1], which is equivalent to max_j (y_j − min_{i<j} y_i), j ∈ [0, n−1]. We can therefore solve the maximum subarray problem with prefix sums in linear O(n) time, whereas brute force is O(n^3) and divide and conquer is O(n lg n). For example, given the array [−2, −3, 4, −1, −2, 1, 5, −3], we have the following results. The code follows the table.
Table 29.3: Process of using prefix sum for the maximum subarray

Array                -2  -3   4  -1  -2   1   5  -3
prefix sum           -2  -5  -1  -2  -4  -3   2  -1
updated prefix sum   -2  -3   4   3   1   2   7   4
current max sum      -2  -2   4   4   4   4   7   7
min prefix sum       -2  -5  -5  -5  -5  -5  -5  -5
import sys  # for sys.maxsize
# or we can use: import math and -math.inf

# Function to compute maximum
# subarray sum in linear time.
def maximumSumSubarray(nums):
    if not nums:
        return 0
    prefixSum = 0
    globalA = -sys.maxsize
    minSub = 0
    for i in range(len(nums)):
        prefixSum += nums[i]
        globalA = max(globalA, prefixSum - minSub)
        minSub = min(minSub, prefixSum)
    return globalA

# Driver program

# Test case 1
arr1 = [-2, -3, 4, -1, -2, 1, 5, -3]
print(maximumSumSubarray(arr1))

# Test case 2
arr2 = [4, -8, 9, -4, 1, -8, -1, 6]
print(maximumSumSubarray(arr2))
As we can see, we do not need extra space to save all the prefix sums, because at each step we only use the prefix sum at the current index.
Solution 2: Kadane's Algorithm. Another, easier perspective is dynamic programming: the keyword "maximum" in the question marks it as the optimization type identified in the dynamic programming chapter.
dp: the maximum subarray sum up to index i, which includes the current element nums[i]. We need n+1 space because we use i-1.
Init: all 0
State transfer function: dp[i] = max(dp[i-1] + nums[i], nums[i]); at each element we can either continue the previous subarray or start a new one.
Result: max(dp)
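A minimal sketch of this formulation (the function name is an assumption; since keeping dp[0] = 0 for the empty prefix would make max(dp) return 0 on an all-negative input, we take the max over dp[1:]):

def maxSubArrayDP(nums):
    n = len(nums)
    dp = [0] * (n + 1)  # dp[i]: max subarray sum ending at nums[i-1]
    for i in range(1, n + 1):
        # either extend the previous subarray or start fresh at nums[i-1]
        dp[i] = max(dp[i-1] + nums[i-1], nums[i-1])
    return max(dp[1:])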
The algorithm can be easily modified to keep track of the starting and ending indices of the maximum subarray (updated whenever max_so_far changes), and to handle the case where we allow zero-length subarrays (with implicit sum 0) when all elements are negative.
Because of the way this algorithm uses optimal substructures (the max-
imum subarray ending at each position is calculated in a simple way from
a related but smaller and overlapping subproblem: the maximum subarray
ending at the previous position) this algorithm can be viewed as a sim-
ple/trivial example of dynamic programming.
Prefix Sum to get the BCR: convert this problem to the best time to buy and sell stock problem over the prefix sums [0, −2, −1, −4, 0, −1, 1, 2, −3, 1], which is to find the maximum gain, in O(n). The difference is that we reset prefix_sum to 0 whenever it drops below 0. Or we can try two pointers.
from sys import maxsize

def maxSubArray(self, nums):
    """
    :type nums: List[int]
    :rtype: int
    """
    max_so_far = -maxsize - 1
    prefix_sum = 0
    for i in range(0, len(nums)):
        prefix_sum += nums[i]
        if max_so_far < prefix_sum:
            max_so_far = prefix_sum

        if prefix_sum < 0:
            prefix_sum = 0
    return max_so_far
Now, let us see how to do the maximum subarray with the product operation instead of the sum.

Input: [2,3,-2,4]
Output: 6
Explanation: [2,3] has the largest product 6.

Example 2:
Input: [-2,0,-1]
Output: 0
Explanation: The result cannot be 2, because [-2,-1] is not a subarray.
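No listing is given for the product version, so the following is a sketch of the standard approach: track both the maximum and the minimum product ending at the current element, since multiplying by a negative number swaps their roles (the function name follows LeetCode's maxProduct):

def maxProduct(nums):
    ans = cur_max = cur_min = nums[0]
    for x in nums[1:]:
        # x alone, or x extending the best/worst product so far
        candidates = (x, cur_max * x, cur_min * x)
        cur_max, cur_min = max(candidates), min(candidates)
        ans = max(ans, cur_max)
    return ans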
s:  ) ( ) ( ) (
dp: 0 0 2 0 4 0

s:  ) ( ) ( ( ( ) ) ) (
dp: 0 0 2 0 0 0 2 4 8 0
Thus, when we are at a position holding ')', we look at i-1; there are two cases:

1) if s[i-1] == '(', it is a closure: dp[i] += 2, and then we check dp[i-2] to connect with the previous longest length. For example, in case 1, ")()()" gives dp[i] = 4.
2) if s[i-1] == ')', we check position i-1-dp[i-1]; in case 2, where dp[i] = 8, we check whether the character at that position is '('. If it is, we increase the count by 2 and connect it with the length at the position before the matched '('.
def longestValidParentheses(self, s):
    """
    :type s: str
    :rtype: int
    """
    if not s:
        return 0
    dp = [0] * len(s)
    for i in range(1, len(s)):
        c = s[i]
        if c == ')':  # check the previous position
            if s[i-1] == '(':  # this is the closure
                dp[i] += 2
                if i - 2 >= 0:  # connect with the previous length
                    dp[i] += dp[i-2]
            if s[i-1] == ')':  # look at i-1-dp[i-1] for '('
                if i - 1 - dp[i-1] >= 0 and s[i-1-dp[i-1]] == '(':
                    dp[i] = dp[i-1] + 2
                    if i - 1 - dp[i-1] - 1 >= 0:  # connect with the previous length
                        dp[i] += dp[i-1-dp[i-1]-1]
    return max(dp)
# input:  "(()))())("
# output: [0, 0, 2, 4, 0, 0, 2, 0, 0]
        stack.append(i)
    else:
        ans = max(ans, i - stack[-1])
return ans
29.1.4 Exercise
1. 639. Decode Ways II (hard)
29.2.1 Subsequence
29.8 Longest Increasing Subsequence (L300, medium). Given an unsorted array of integers, find the length of the longest increasing subsequence.

Example:
Input: [10,9,2,5,3,7,101,18]
Output: 4
Explanation: The longest increasing subsequence is [2,3,7,101], therefore the length is 4.

Note: (1) There may be more than one LIS combination; it is only necessary to return the length. (2) Your algorithm should run in O(n^2) complexity.

Follow up: Could you improve it to O(n log n) time complexity?
Solution 1: Induction. For each subproblem, we show the result as follows. Each state dp[i] represents the length of the longest increasing subsequence that ends with nums[i]. The result for dp[i] depends on all of the previous i−1 subproblems, as shown in Eq. 29.1.
Figure 29.1: State Transfer Tree Structure for LIS; each path represents a possible solution. Each arrow represents a move: find an element among the following elements that is larger than the current node.
subproblem: [], [10], [10,9], [10,9,2], [10,9,2,5], [10,9,2,5,3], [10,9,2,5,3,7], ...
ans:         0,    1,      1,        1,          2,            2,              3, ...
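A minimal sketch of this O(n^2) induction (the function name follows LeetCode's lengthOfLIS):

def lengthOfLIS(nums):
    if not nums:
        return 0
    n = len(nums)
    dp = [1] * n  # dp[i]: length of the LIS that ends with nums[i]
    for i in range(1, n):
        for j in range(i):
            if nums[j] < nums[i]:  # nums[i] can extend the LIS ending at j
                dp[i] = max(dp[i], dp[j] + 1)
    return max(dp)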
29.2.2 Splitting
Need to figure out how to fill out the two-dimensional dp matrix for splitting.
29.9 Word Break (L139, **). Given a non-empty string s and a dictionary wordDict containing a list of non-empty words, determine if s can be segmented into a space-separated sequence of one or more dictionary words.

Example 2:

Example 3:
Input: s = "catsandog", wordDict = ["cats", "dog", "sand", "and", "cat"]
Output: false
Thus, deduction still works here. We manually write down the result of each subproblem. Suppose we are trying to get the answer for 'leet': how does it work? If 'lee' is true and 't' is in the dictionary, then 'leet' is true. Or, if 'le' is true and 'et' is in the dictionary, 'leet' is true. Unlike the problems before, the answer for 'leet' can only be constructed by checking all the previous smaller subproblems.
def wordBreak(self, s, wordDict):
    wordDict = set(wordDict)
    n = len(s)
    dp = [False] * (n + 1)
    dp[0] = True  # True for the empty str ''
    for i in range(1, n + 1):
        for j in range(i):
            if dp[j] and s[j:i] in wordDict:  # check the previous result and the new word s[j:i]
                dp[i] = True

    return dp[-1]
Figure 29.2: Word Break with DFS. In the tree, each arrow means: check the word between parent and child, then recursively check the result of the child.
        return memo[start]

    for i in range(start, end + 1):
        word = s[start:i]  # i is the splitting point
        if word in wordDict:
            if i not in memo:
                memo[i] = DFS(i, end, memo)
            if memo[i]:
                return True
    memo[start] = False

    return memo[start]

return DFS(0, n, {})
Solution: use two dp arrays: one to track whether a substring is a palindrome, and the other to compute the minimum cuts.
def minCut(self, s):
    """
    :type s: str
    :rtype: int
    """
    pal = [[False for _ in range(len(s))] for _ in range(len(s))]
    cuts = [len(s) - i - 1 for i in range(len(s))]
    for start in range(len(s) - 1, -1, -1):
        for end in range(start, len(s)):
            if s[start] == s[end] and (end - start < 2 or pal[start+1][end-1]):
                pal[start][end] = True
                if end == len(s) - 1:
                    cuts[start] = 0
                else:
                    cuts[start] = min(cuts[start], 1 + cuts[end+1])
    return cuts[0]
29.11 Best Time to Buy and Sell Stock III (L123, hard). Say you have an array for which the ith element is the price of a given stock on day i. Design an algorithm to find the maximum profit. You may complete at most two transactions.
Input: [3,3,5,0,0,3,1,4]
Output: 6
Explanation: Buy on day 4 (price = 0) and sell on day 6 (price = 3), profit = 3-0 = 3.
Then buy on day 7 (price = 1) and sell on day 8 (price = 4), profit = 4-1 = 3.

Example 2:
Input: [1,2,3,4,5]
Output: 4
Explanation: Buy on day 1 (price = 1) and sell on day 5 (price = 5), profit = 5-1 = 4.
Note that you cannot buy on day 1, buy on day 2 and sell them later, as you are engaging multiple transactions at the same time. You must sell before buying again.

Example 3:
Input: [7,6,4,3,1]
Output: 0
Explanation: In this case, no transaction is done, i.e. max profit = 0.
        max_global_profit = max(max_global_profit, prices[i] - min_local)
        min_local = min(min_local, prices[i])
    return max_global_profit

if not prices:
    return 0
n = len(prices)
min_local = prices[0]
preProfit, postProfit = [0]*n, [0]*n

for i in range(n):
    preProfit[i] = maxProfitI(0, i)
    postProfit[i] = maxProfitI(i, n-1)
maxProfit = max([pre + post for pre, post in zip(preProfit, postProfit)])
return maxProfit
To avoid repeated work, we can use one for loop to compute all the values of preProfit and another for postProfit. For postProfit, we traverse the array from the end to the start; this way we track max_local, the profit is max_local - prices[i], and both passes keep a global max profit. The code is as follows:
def maxProfit(self, prices):
    """
    :type prices: List[int]
    :rtype: int
    """
    if not prices:
        return 0
    n = len(prices)

    preProfit, postProfit = [0]*n, [0]*n
    # get preProfit, from 0 to n-1: track min_local and the global max profit
    min_local = prices[0]
    max_global_profit = 0
    for i in range(1, n):
        max_global_profit = max(max_global_profit, prices[i] - min_local)
        min_local = min(min_local, prices[i])
        preProfit[i] = max_global_profit
    # get postProfit, from n-1 to 0: track max_local and the global max profit
    max_local = prices[-1]
    max_global_profit = 0
    for i in range(n-1, -1, -1):
        max_global_profit = max(max_global_profit, max_local - prices[i])
        max_local = max(max_local, prices[i])
        postProfit[i] = max_global_profit
    # iterate preProfit and postProfit to get the maximum profit
    maxProfit = max([pre + post for pre, post in zip(preProfit, postProfit)])
    return maxProfit
29.3.1 Interval

Problems include Stone Game, Burst Balloons, and Scramble String. The feature of this type of dynamic programming is that we try to get the min/max/count over a range of the array, and the state transfer function builds the answer for a big range from smaller ranges.
29.12 486. Predict the Winner (medium). Given an array of scores that are non-negative integers. Player 1 picks one of the numbers from either end of the array, followed by player 2, then player 1 again, and so on. Each time a player picks a number, that number will not be available for the next player. This continues until all the scores have been chosen. The player with the maximum score wins.

Given an array of scores, predict whether player 1 is the winner. You can assume each player plays to maximize his score.

Example 1:
Input: [1, 5, 2]
Output: False
Explanation: No matter which end player 1 starts from, player 2 can always take the 5, so player 1 ends with at most 1 + 2 = 3 points against player 2's 5. Hence player 1 can never win, and you need to return False.
Note:
Solution: At first, we cannot use f[i] alone to denote the state, because we can choose elements from both the left and the right side; we use f[i][j] instead, which represents the maximum score the current player can get from the range i to j. Second, when we deal with accumulated values, we can use sum[i][j] to represent the sum of the range i to j. Each player acts to maximize their total points f[i][j]; there are two choices, left or right, which leave f[i+1][j] or f[i][j-1] respectively for the other player. To gain the maximum score in range [i, j], we make sure the opponent is left with the minimum of f[i+1][j] and f[i][j-1]. Therefore, we have the state transfer function: f[i][j] = sum[i][j] − min(f[i+1][j], f[i][j−1]). Each subproblem relies on only two subproblems, which makes the total time complexity O(n^2). This is actually a game-theory type. According to the function, if the range has length 1 (i == j), the value is nums[i], which is the initialization. For the loops, the first for loop goes over the range length l from 2 to n; the second for loop picks the start index i in [0, n−l]; then the end index is j = i + l − 1. The answer for this problem is whether f[0][n−1] >= sum/2; if it is, the answer is true.

The process of the for loops is: initialize the diagonal elements, then fill out the elements of the upper triangle, above the diagonal.
def PredictTheWinner(nums):
    """
    :type nums: List[int]
    :rtype: bool
    """
    if not nums:
        return False
    if len(nums) == 1:
        return True
    # sum[i, j] = sums[j+1] - sums[i]
    sums = nums[:]
    for i in range(1, len(nums)):
        sums[i] += sums[i-1]
    sums.insert(0, 0)

    dp = [[0 for col in range(len(nums))] for row in range(len(nums))]
    for i in range(len(nums)):
        dp[i][i] = nums[i]

    for l in range(2, len(nums) + 1):
        for i in range(0, len(nums) - l + 1):  # start i in [0, n-l]
            j = i + l - 1
            dp[i][j] = (sums[j+1] - sums[i]) - min(dp[i+1][j], dp[i][j-1])
    n = len(nums)
    return dp[0][n-1] >= sums[-1] / 2
Actually, we can use a simpler pair of for loops. However, the code is harder to understand compared with the standard version: here dp[i][j] stores the score difference the current player can achieve over the opponent on the range [i, j], so the final answer is dp[0][n−1] >= 0.
for i in range(n-1, -1, -1):
    dp[i][i] = nums[i]  # initialization
    for j in range(i+1, n):
        dp[i][j] = max(nums[j] - dp[i][j-1], nums[i] - dp[i+1][j])
Given [3, 1, 5, 8]
Return 167
n = len(nums)
nums.insert(0, 1)
nums.append(1)

c = [[0 for _ in range(n+2)] for _ in range(n+2)]
for l in range(1, n+1):        # length in [1, n]
    for i in range(1, n-l+2):  # start in [1, n-l+1]
        j = i + l - 1          # end = i + l - 1

        # the transfer function is a for loop over k
        for k in range(i, j+1):
            c[i][j] = max(c[i][j],
                          c[i][k-1] + nums[i-1]*nums[k]*nums[j+1] + c[k+1][j])
# the answer covers the range from 1 to n
return c[1][n]
Input: "bbbab"
Output: 4
One possible longest palindromic subsequence is "bbbb".

Example 2:
Input: "cbbd"
Output: 2
One possible longest palindromic subsequence is "bb".
Solution: for this problem, the state dp[i][j] means the length of the longest palindromic subsequence from i to j, with dp[i][i] = 1. Then we iterate over range lengths to fill in the dp matrix (the upper triangle).
def longestPalindromeSubseq(self, s):
    """
    :type s: str
    :rtype: int
    """
    nums = s
    if not nums:
        return 0
    if len(nums) == 1:
        return 1

    def isPalindrome(s):
        l, r = 0, len(s) - 1
        while l <= r:
            if s[l] != s[r]:
                return False
            l += 1
            r -= 1
        return True

    if isPalindrome(s):  # to speed up
        return len(s)

    rows = len(nums)
    dp = [[0 for col in range(rows)] for row in range(rows)]
    for i in range(0, rows):
        dp[i][i] = 1

    for l in range(2, rows + 1):          # use a length
        for i in range(0, rows - l + 1):  # start 0, end rows-l
            j = i + l - 1
            if s[i] == s[j]:
                dp[i][j] = dp[i+1][j-1] + 2
            else:
                dp[i][j] = max(dp[i][j-1], dp[i+1][j])
    return dp[0][rows-1]
        int len = 0;
        for (int j = i + 1; j < n; ++j) {
            int t = dp[j];
            if (s[i] == s[j]) {
                dp[j] = len + 2;
            }
            len = max(len, t);
        }
    }
    for (int num : dp) res = max(res, num);
    return res;
  }
};
Counting

In this type, every location in the coordinate space is visited only once. Thus, it gives O(mn) time complexity.
62. Unique Paths

A robot is located at the top-left corner of a m x n grid (marked 'Start' in the diagram below).

The robot can only move either down or right at any point in time. The robot is trying to reach the bottom-right corner of the grid (marked 'Finish' in the diagram below).

How many possible unique paths are there?

Above is a 3 x 7 grid. How many possible unique paths are there?

Note: m and n will be at most 100.

Example 1:
Input: m = 3, n = 2
Output: 3
Explanation:
From the top-left corner, there are a total of 3 ways to reach the bottom-right corner:
1. Right -> Right -> Down
2. Right -> Down -> Right
3. Down -> Right -> Right

Example 2:
Input: m = 7, n = 3
Output: 28
BFS. Fig. 29.4 shows the BFS traversal process in the matrix. We can clearly see that each node and edge is visited only once. The BFS solution is straightforward and is the best solution. We use bfs to track the nodes in the queue at each level, and dp to record the number of unique paths to location (i, j). Because each location is visited only once, using the same dp at each level causes no conflict.
Figure 29.4: One-Time Graph Traversal. Different colors mean different levels of traversal.

# BFS
def uniquePaths(self, m, n):
    dp = [[0 for _ in range(n)] for _ in range(m)]
    dp[0][0] = 1
    bfs = set([(0, 0)])
    dirs = [(1, 0), (0, 1)]
    while bfs:
        new_bfs = set()
        for x, y in bfs:
            for dx, dy in dirs:
                nx, ny = x + dx, y + dy
                if 0 <= nx < m and 0 <= ny < n:
                    dp[nx][ny] += dp[x][y]
                    new_bfs.add((nx, ny))
        bfs = new_bfs
    return dp[m-1][n-1]
def combinationSum4(self, nums, target):
    """
    :type nums: List[int]
    :type target: int
    :rtype: int
    """
    nums.sort()
    dp = [0] * (target + 1)
    dp[0] = 1
    for t in range(1, target + 1):
        for num in nums:
            if t - num >= 0:
                dp[t] += dp[t-num]
            else:
                break  # nums is sorted, so no later num fits either
    return dp[-1]
Optimization

64. Minimum Path Sum (medium)

Given a m x n grid filled with non-negative numbers, find a path from top left to bottom right which minimizes the sum of all numbers along its path.

Note: You can only move either down or right at any point in time.

Example 1:
[[1,3,1],
 [1,5,1],
 [4,2,1]]

        dp[r][0] = dp[r-1][0] + grid[r][0]

    for r in range(1, rows):
        for c in range(1, cols):
            dp[r][c] = grid[r][c] + min(dp[r-1][c], dp[r][c-1])
    return dp[-1][-1]
Further inspecting the above code, it can be seen that maintaining pre is only for recovering pre[i], which is simply cur[i] before its update. So it is enough to use a single vector, and the space is further optimized to O(n).
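A sketch of the single-vector version (the function name is assumed; the elided original used pre and cur):

def minPathSum(grid):
    rows, cols = len(grid), len(grid[0])
    cur = [0] * cols
    cur[0] = grid[0][0]
    for c in range(1, cols):  # first row: can only come from the left
        cur[c] = cur[c-1] + grid[0][c]
    for r in range(1, rows):
        cur[0] += grid[r][0]  # first column: can only come from above
        for c in range(1, cols):
            # before this update, cur[c] still holds the value from the row above
            cur[c] = grid[r][c] + min(cur[c], cur[c-1])
    return cur[-1]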
Two-dimensional Coordinate
935. Knight Dialer (Medium)
A chess knight can move as indicated in the chess diagram below:
This time, we place our chess knight on any numbered key of a phone pad (indicated above), and the knight makes N-1 hops. Each hop must be from one key to another numbered key.

Each time it lands on a key (including the initial placement of the knight), it presses the number of that key, pressing N digits total.
Most Naive BFS. Analysis: First, we need to figure out, for each number, what the possible next moves are. We get this dictionary: moves = {0: [4, 6], 1: [6, 8], 2: [7, 9], 3: [4, 8], 4: [0, 3, 9], 5: [], 6: [0, 1, 7], 7: [2, 6], 8: [1, 3], 9: [2, 4]}. This is not exactly a coordinate problem; because we can make endless moves, we have a graph. The brute force: put [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] as the start positions and use BFS to control the steps; the total number of paths is the sum over all the leaves. At each step, we do two things: 1) generate a list of all possible next numbers; 2) on reaching the leaves, count all the nodes.
# naive BFS solution
def knightDialer(self, N):
    """
    :type N: int
    :rtype: int
    """
    if N == 1:
        return 10
    moves = {0: [4, 6], 1: [6, 8], 2: [7, 9], 3: [4, 8], 4: [0, 3, 9],
             5: [], 6: [0, 1, 7], 7: [2, 6], 8: [1, 3], 9: [2, 4]}  # 4 and 6 have three moves

    bfs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]  # all starting points
    step = 1
    while bfs:
        new = []
        for i in bfs:
            new += moves[i]
        step += 1
        bfs = new
        if step == N:
            return len(bfs) % (10**9 + 7)
Optimized BFS. However, the brute force BFS only passed 18/120 test cases. To improve it, note that we only need a counter recording how many paths end at each number at that level. This way, bfs is replaced with a counter. Now, the new code is:
# optimized BFS, which is exactly a DP
def knightDialer(self, N):
    MOD = 10**9 + 7
    if N == 1:
        return 10
    moves = {0: [4, 6], 1: [6, 8], 2: [7, 9], 3: [4, 8], 4: [0, 3, 9],
             5: [], 6: [0, 1, 7], 7: [2, 6], 8: [1, 3], 9: [2, 4]}

    bfs = [1] * 10
    step = 1

    while bfs:
        new = [0] * 10
        for idx, count in enumerate(bfs):
            for m in moves[idx]:
                new[m] += count
                new[m] %= MOD
        step += 1
        bfs = new
        if step == N:
            return sum(bfs) % MOD
    moves = {0: [4, 6], 1: [6, 8], 2: [7, 9], 3: [4, 8], 4: [0, 3, 9],
             5: [], 6: [0, 1, 7], 7: [2, 6], 8: [1, 3], 9: [2, 4]}

    dp = [1] * 10

    for step in range(N - 1):
        new_dp = [0] * 10
        for idx, count in enumerate(dp):
            for m in moves[idx]:
                new_dp[m] += count
                new_dp[m] %= MOD
        dp = new_dp

    return sum(dp) % MOD
Optimized BFS. Analysis: Each time we can make 8 moves; thus after K steps, there are 8^K total paths. We just need the number of paths that stay within the board (valid paths). The first step is to write down the possible moves or directions. Then we initialize a two-dimensional array dp to record the number of paths ending at (i, j) after k steps. Using a BFS solution, at each step we only need to save the set of positions reachable at that step.
# Optimized BFS solution
def knightProbability(self, N, K, r, c):
    dirs = [[-2, -1], [-2, 1], [-1, -2], [-1, 2],
            [1, -2], [1, 2], [2, -1], [2, 1]]
    dp = [[0 for _ in range(N)] for _ in range(N)]
    total = 8**K
    last_pos = set([(r, c)])
    dp[r][c] = 1

    for step in range(K):
        new_pos = set()
        new_dp = [[0 for _ in range(N)] for _ in range(N)]
        for x, y in last_pos:
            for dx, dy in dirs:
                nx = x + dx
                ny = y + dy
                if 0 <= nx < N and 0 <= ny < N:
                    new_dp[nx][ny] += dp[x][y]
                    new_pos.add((nx, ny))
        last_pos = new_pos
        dp = new_dp

    return float(sum(map(sum, dp))) / total
One-dimensional Coordinate
For one-dimensional, it is the same as two-dimensional, and it could be even
simpler.
70. Climbing Stairs (Easy)
1 You a r e c l i m b i n g a s t a i r c a s e . I t t a k e s n s t e p s t o r e a c h t o t h e
top .
2
3 Each time you can e i t h e r c l i m b 1 o r 2 s t e p s . In how many
d i s t i n c t ways can you c l i m b t o t h e top ?
4
5 Note : Given n w i l l be a p o s i t i v e i n t e g e r .
6
7 Example 1 :
8
9 Input : 2
10 Output : 2
11 E x p l a n a t i o n : There a r e two ways t o c l i m b t o t h e top .
12 1. 1 step + 1 step
13 2. 2 steps
14
15 Example 2 :
16
17 Input : 3
18 Output : 3
19 E x p l a n a t i o n : There a r e t h r e e ways t o c l i m b t o t h e top .
20 1. 1 step + 1 step + 1 step
21 2. 1 step + 2 steps
22 3. 2 steps + 1 step
BFS. Fig. 29.7 demonstrates the state transfer relations between different positions. First, we can solve it with our standard BFS. In this problem we do not know the depth of the traversal in advance, so the loop simply runs while bfs is not empty. Eventually bfs becomes empty and the dp of that final level is all zeros, so we use a variable ans to accumulate dp[-1], the paths that have reached the top, at every level.
# BFS
def climbStairs(self, n):
    dp = [0] * (n + 1)
    dp[0] = 1  # init starting point 0 to 1
    dirs = [1, 2]
    bfs = set([0])
    ans = 0
    while bfs:
        new_dp = [0] * (n + 1)
        new_bfs = set()
        for i in bfs:  # positions
            for dx in dirs:
                nx = i + dx
                if 0 <= nx <= n:
                    new_dp[nx] += dp[i]
                    new_bfs.add(nx)
        ans += dp[-1]
        bfs, dp = new_bfs, new_dp
    return ans
The BFS and the dynamic programming solutions have the same time and space complexity.
29.4.3 Generalization

1. State: f[x] or f[x][y] denotes the optimal value, the count, or the feasibility of the whole solution up to axis x for 1D, or up to (x, y) for 2D;

2. Function: usually we relate f[x] to f[x−1], or f[x][y] to f[x−1][y] and f[x][y−1];

3. Initialization: for f[x] we initialize the starting point; sometimes we need one extra slot, with size n+1. For f[x][y] we need to initialize the elements of row 0 and column 0;

4. Answer: usually it is f[n−1] or f[m−1][n−1];
Note: there are ways to optimize the space complexity from O(m∗n) to O(m+n) and even to O(1), which we get by reusing the original grid or array to save the state results directly.
One-Time Traversal
Multiple-Dimensional Traversal
def LCSLen(S1, S2):
    if not S1 or not S2:
        return 0
    n, m = len(S1), len(S2)
    f = [[0] * (m+1) for _ in range(n+1)]
    # init: f[0][0] = 0
    for i in range(n):
        for j in range(m):
            f[i+1][j+1] = f[i][j] + 1 if S1[i] == S2[j] else max(f[i][j+1], f[i+1][j])
    print(f)
    return f[-1][-1]

S1 = "ABCD"
S2 = "ABD"
LCSLen(S1, S2)
# output
# [[0, 0, 0, 0], [0, 1, 1, 1], [0, 1, 2, 2], [0, 1, 2, 2], [0, 1, 2, 3]]
# 3
29.16 72. Edit Distance (hard). Given two words word1 and word2,
find the minimum number of operations required to convert word1
to word2. You have the following 3 operations permitted on a word:
Insert a character, Delete a character, Replace a character.
Example 1 :
Example 2 :
the previous i chars in S1 to be the same as the first j chars in S2. The upper bound of the minimum edit distance is max(m, n), achieved by replacing and inserting. The most important step is to decide the transfer function that yields the current state f[i][j]. If directly filling in the matrix is obscure, we can first try the recursion:
DFS ( " h o r s e " , " r o s e " )
= DFS ( " h o r s " , " r o s " ) # no e d i t a t e
= DFS ( " hor " , " r o " ) # no e d i t a t s
= 1+ min (DFS ( " ho " , " r o " ) , # d e l e t e " r " from l o n g e r one
DFS ( " hor " , " r " ) , # i n s e r t " o " a t t h e l o n g e r one , l e f t
" hor " and " r " t o match
DFS ( " ho " , " r " ) ) , # r e p l a c e " r " i n t h e l o n g e r one with
" o " i n t h e s h o r t e r one , l e f t " ho " and " r " t o match
rabbbit
^^^^ ^^
rabbbit
^^ ^^^^
rabbbit
^^^ ^^^
Example 2 :
babgbag
^^ ^
babgbag
^^ ^
babgbag
^ ^^
babgbag
^ ^^
babgbag
^^^
a 1 1 0 0
b 1 1 1 0
b 1 1 2 0
a 1 2 2 2
def numDistinct(self, s, t):
    if not s or not t:
        if not s and t:
            return 0
        else:
            return 1

    rows, cols = len(s), len(t)
    if cols > rows:
        return 0
    if cols == rows:
        return 1 if s == t else 0

    # initialize
    dp = [[0 for c in range(cols+1)] for r in range(rows+1)]
    for r in range(rows):
        dp[r+1][0] = 1
    dp[0][0] = 1

    # fill out the lower part
    for i in range(rows):
        for j in range(min(i+1, cols)):
            if i == j:  # diagonal
                if s[i] == t[j]:
                    dp[i+1][j+1] = dp[i][j]
            else:  # lower half of the matrix
                if s[i] == t[j]:
                    # dp[i][j] because the chars are equal, so also count the previous (i, j)
                    dp[i+1][j+1] = dp[i][j+1] + dp[i][j]
                else:
                    # the count stays whatever it was before this char of s
                    dp[i+1][j+1] = dp[i][j+1]
    return dp[-1][-1]
29.18 44. Wildcard Matching (hard). Given an input string (s) and
a pattern (p), implement wildcard pattern matching with support for
’?’ and ’*’.
’?’ Matches any single character. ’*’ Matches any sequence of charac-
ters (including the empty sequence).
The matching should cover the entire input string (not partial).
Note:
s could be empty and contain only lowercase letters a-z. p could be empty and contain only lowercase letters a-z, and characters like ? or *.
Example 1:
Input:
s = "aa"
p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".

Example 2:
Input:
s = "aa"
p = "*"
Output: true
Explanation: '*' matches any sequence.

Example 3:
Input:
s = "cb"
p = "?a"
Output: false
Explanation: '?' matches 'c', but the second letter is 'a', which does not match 'b'.

Example 4:
Input:
s = "adceb"
p = "*a*b"
Output: true
Explanation: The first '*' matches the empty sequence, while the second '*' matches the substring "dce".
            for i in range(pi, np):
                if p[i] != '*':
                    return False
            return True
        else:  # string characters remain but the pattern is exhausted
            return False

    if p[pi] in ['?', '*']:
        if p[pi] == '?':
            return helper(si + 1, pi + 1)
        else:
            for i in range(si, ns + 1):  # '*' can match up to the whole rest of s
                if helper(i, pi + 1):
                    return True
            return False
    else:
        if p[pi] != s[si]:
            return False
        return helper(si + 1, pi + 1)

return helper(0, 0)
def isMatch(self, s, p):
    ns, np = len(s), len(p)
    dp = [[False for c in range(ns+1)] for r in range(np+1)]

    # initialize: empty pattern matches empty string;
    # a leading run of '*' can also match the empty string
    dp[0][0] = True
    for r in range(1, np+1):
        if p[r-1] == '*' and dp[r-1][0]:
            dp[r][0] = True

    # dp main
    for r in range(1, np+1):
        for c in range(1, ns+1):
            if p[r-1] == '?':
                dp[r][c] = dp[r-1][c-1]
            elif p[r-1] == '*':
                # '*' matches the empty sequence (dp[r-1][c]) or one more char (dp[r][c-1])
                dp[r][c] = dp[r-1][c] or dp[r][c-1]
            else:
                dp[r][c] = dp[r-1][c-1] and p[r-1] == s[c-1]
    return dp[np][ns]
29.5.3 Summary

The four elements include:

2. function: f[i][j] studies how to match the ith element in the first string with the jth element in the second string;

3. initialization: f[i][0] for the first column and f[0][j] for the first row;

4. answer: f[n][m]
29.6 Knapsack

The problems in this section are defined as: Given n items, where item i has cost C_i and value V_i, we choose a subset of items whose total cost either 1) equals an amount S or 2) is bounded by an amount S. We are required to obtain either 1) the maximum value or 2) the minimum number of items. Depending on whether we can use one item multiple times, we have three categories.

How to solve these three types of questions will be explained, with Python examples, in the next three subsections (Sections 29.6.1, 29.6.2, and 29.6.3), using the second type of restriction: the total cost is bounded by an amount S.
The problem itself is a combination problem with a restriction, therefore we can definitely use DFS as the naive solution. Moreover, the problem is not simply about enumerating all the combinations; it is an optimization problem, which is what makes the difference when we solve it with memoization. Thus, dynamic programming is not our only choice. We can refer to Section ?? and Section ?? for the DFS-based solution and reasoning.
LeetCode problems:
value we can gain with the subproblem of items (0, i) and a cost budget of c. Thus, the size of the dp matrix is n × (C+1), which makes the time complexity O(n × C). Like any coordinate-type dynamic programming problem, we iterate with two for loops, one for i and the other for c; which one is inside or outside does not matter here. The state transfer function takes the maximum value of 1) not choosing this item and 2) choosing this item, which adds v[i] to the value of the first i−1 items with cost c−c[i]:

dp[i][c] = max(dp[i−1][c], dp[i−1][c−c[i]] + v[i]).
def knapsack01DP(c, v, C):
    dp = [[0 for _ in range(C+1)] for _ in range(len(c)+1)]
    for i in range(len(c)):
        for w in range(C+1):
            if w < c[i]:
                dp[i+1][w] = dp[i][w]  # item i does not fit: carry the value over
            else:
                dp[i+1][w] = max(dp[i][w], dp[i][w-c[i]] + v[i])
    return dp[-1][-1]
Optimize Space. Because when we update dp we only use the previous row (up and to the upper left) to update the current row, we can reduce the space to O(C). If we keep the same code as above with a single one-dimensional dp, the later updates would read already-updated results from the same row, which amounts to using each item multiple times; that is actually the most efficient solution to the unbounded knapsack problem in the next section. To avoid this we have two choices: 1) use a temporary one-dimensional new dp for each i, or 2) update the cost in reverse order so that we never read a newly updated result.
def knapsack01OptimizedDP1(c, v, C):
    dp = [0 for _ in range(C+1)]
    for i in range(len(c)):
        new_dp = dp[:]  # carry over the values where item i does not fit
        for w in range(c[i], C+1):
            new_dp[w] = max(dp[w], dp[w-c[i]] + v[i])
        dp = new_dp
    return dp[-1]

def knapsack01OptimizedDP2(c, v, C):
    dp = [0 for _ in range(C+1)]
    for i in range(len(c)):
        for w in range(C, c[i]-1, -1):  # reverse order: never reuse item i
            dp[w] = max(dp[w], dp[w-c[i]] + v[i])
    return dp[-1]
For the convenience of the later sections, we modularize the final code as:
def knapsack01(cost, val, C, dp):
    for j in range(C, cost-1, -1):
        dp[j] = max(dp[j], dp[j-cost] + val)
    return dp

def knapsack01Final(c, v, C):
    n = len(c)
    dp = [0 for _ in range(C+1)]
    for i in range(n):
        knapsack01(c[i], v[i], C, dp)
    return dp[-1]
def knapsackUnbound(cost, val, C, dp):
    for j in range(cost, C+1):
        dp[j] = max(dp[j], dp[j-cost] + val)
    return dp

def knapsackUnboundFinal(c, v, C):
    n = len(c)
    dp = [0 for _ in range(C+1)]
    for i in range(n):
        knapsackUnbound(c[i], v[i], C, dp)
    return dp[-1]
Reduce to Unbounded Knapsack. If n[i] >= C/c[i], ∀i, then the Bounded
Knapsack can be reduced to Unbounded Knapsack.
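A sketch of how the bounded case can reuse the two routines above; the function name and the n_items parameter (the copy count of each item) are assumptions:

def knapsackBoundedFinal(c, v, n_items, C):
    dp = [0 for _ in range(C + 1)]
    for ci, vi, ni in zip(c, v, n_items):
        if ni >= C // ci:
            # enough copies that item i behaves as unbounded
            knapsackUnbound(ci, vi, C, dp)
        else:
            # otherwise treat each copy as a separate 0-1 item
            for _ in range(ni):
                knapsack01(ci, vi, C, dp)
    return dp[-1]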
29.6.4 Generalization

The four elements of the knapsack problems include:

2. State transfer function: dp[i][c] = f(dp[i−1][c−c[i]], dp[i−1][c]). For example, if we want:

4. Answer: dp[-1][-1], the last cell of the dp matrix.
Input: coins = [1, 2, 5], amount = 11
Output: 3
Explanation: 11 = 5 + 5 + 1

Example 2:
Input: coins = [2], amount = 3
Output: -1
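No listing accompanies this example here; a minimal unbounded-knapsack-style sketch (the function name follows LeetCode's coinChange):

def coinChange(coins, amount):
    INF = float('inf')
    dp = [0] + [INF] * amount  # dp[t]: fewest coins that sum to t
    for t in range(1, amount + 1):
        for coin in coins:
            if t - coin >= 0:
                dp[t] = min(dp[t], dp[t-coin] + 1)
    return dp[amount] if dp[amount] != INF else -1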
Example 2:
Input: [1, 2, 3, 5]
Output: false
Explanation: The array cannot be partitioned into equal sum subsets.
s = sum(nums)
if s % 2:
    return False
# 0-1 knapsack
dp = [False] * (s//2 + 1)
dp[0] = True

for i in range(len(nums)):
    for j in range(s//2, nums[i]-1, -1):
        dp[j] = dp[j] or dp[j-nums[i]]

return dp[-1]
29.7 Exercise
29.7.1 Single Sequence
Unique Binary Search Tree
Interleaving String
Race Car
29.7.2 Coordinate
746. Min Cost Climbing Stair (Easy)
On a staircase, the i-th step has some non-negative cost cost[i] assigned (0-indexed).
dp[1] = 0
for i in range(2, len(cost) + 1):
    dp[i] = min(dp[i-1] + cost[i-1], dp[i-2] + cost[i-2])
return dp[-1]
Multiple Time Coordinate. The only difference compared with our earlier examples is that we accumulate the out-of-boundary paths each time the next location is not within bounds.
def findPaths(self, m, n, N, i, j):
    MOD = 10**9 + 7
    dirs = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    dp = [[0 for _ in range(n)] for _ in range(m)]
    dp[i][j] = 1
    ans = 0

    for step in range(N):
        new_dp = [[0 for _ in range(n)] for _ in range(m)]
        for x in range(m):
            for y in range(n):
                if dp[x][y] == 0:  # only check reachable locations at this step
                    continue
                for dx, dy in dirs:
                    nx, ny = x + dx, y + dy
                    if 0 <= nx < m and 0 <= ny < n:
                        new_dp[nx][ny] += dp[x][y]
                    else:
                        ans += dp[x][y]
                        ans %= MOD
        dp = new_dp

    return ans
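For instance, on a 2 x 2 grid with at most two moves and the ball starting at (0, 0), six paths cross the boundary (a quick check; Solution is a hypothetical wrapper class for the method above):

sol = Solution()
print(sol.findPaths(2, 2, 2, 0, 0))  # 6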
Coordinate. 63. Unique Paths II.

def uniquePathsWithObstacles(self, obstacleGrid):
    """
    :type obstacleGrid: List[List[int]]
    :rtype: int
    """
    if not obstacleGrid or obstacleGrid[0][0] == 1:
        return 0
    m, n = len(obstacleGrid), len(obstacleGrid[0])
    dp = [[0 for c in range(n)] for r in range(m)]
    dp[0][0] = 1  # starting point; the guard above ensures it is obstacle-free

    # init the first column
    for r in range(1, m):
        dp[r][0] = dp[r-1][0] if obstacleGrid[r][0] == 0 else 0
    # init the first row
    for c in range(1, n):
        dp[0][c] = dp[0][c-1] if obstacleGrid[0][c] == 0 else 0

    for r in range(1, m):
        for c in range(1, n):
            dp[r][c] = dp[r-1][c] + dp[r][c-1] if obstacleGrid[r][c] == 0 else 0
    return dp[-1][-1]
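Since each cell only needs the value above it and to its left, the table can be compressed to a single row (a minimal space-optimized sketch, not from the text):

def uniquePathsWithObstacles1D(obstacleGrid):
    if not obstacleGrid or obstacleGrid[0][0] == 1:
        return 0
    n = len(obstacleGrid[0])
    dp = [0] * n
    dp[0] = 1
    for row in obstacleGrid:
        for c in range(n):
            if row[c] == 1:
                dp[c] = 0            # blocked cells contribute no paths
            elif c > 0:
                dp[c] += dp[c - 1]   # dp[c] still holds the count from above
    return dp[-1]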
Double Sequence. The code below solves 712. Minimum ASCII Delete Sum for Two Strings:

def minimumDeleteSum(self, s1, s2):
    word1, word2 = s1, s2
    if not word1:
        if not word2:
            return 0
        else:
            return sum([ord(c) for c in word2])
    if not word2:
        return sum([ord(c) for c in word1])

    rows, cols = len(word1), len(word2)

    dp = [[0 for col in range(cols+1)] for row in range(rows+1)]
    for i in range(1, rows+1):
        dp[i][0] = dp[i-1][0] + ord(word1[i-1])  # delete in word1
    for j in range(1, cols+1):
        dp[0][j] = dp[0][j-1] + ord(word2[j-1])  # delete in word2

    for i in range(1, rows+1):
        for j in range(1, cols+1):
            if word1[i-1] == word2[j-1]:
                dp[i][j] = dp[i-1][j-1]
            else:
                # delete in word2 vs delete in word1
                dp[i][j] = min(dp[i][j-1] + ord(word2[j-1]),
                               dp[i-1][j] + ord(word1[i-1]))
    return dp[rows][cols]
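A quick check (Solution again being a hypothetical wrapper class): deleting 's' (ASCII 115) from "sea" and 't' (ASCII 116) from "eat" leaves "ea" in both, so the answer is 115 + 116 = 231.

sol = Solution()
print(sol.minimumDeleteSum("sea", "eat"))  # 231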
Part VIII
Appendix

30 Cool Python Guide
We use the type() built-in function to see an object's underlying type (class), for example:

>>> type([1, 2, 3, 4])
<class 'list'>
>>> type(1)
<class 'int'>
>>> type(range(10))
<class 'range'>
>>> type('abc')
<class 'str'>
Properties

In-place VS Standard Operations An in-place operation changes the content of a given object (e.g., a list, vector, or matrix/tensor) directly, without making a copy; the operator that performs it is called an in-place operator. E.g., a += b is equivalent to a = operator.iadd(a, b). A standard operation, on the other hand, returns a new instance of the object.
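As a quick illustration (a small sketch, not from the text), compare the identities before and after each kind of operation:

import operator

a = [1, 2]
before = id(a)
a = operator.iadd(a, [3])   # in-place: the same list object is mutated
print(id(a) == before)      # True
a = operator.add(a, [4])    # standard: a brand-new list is returned
print(id(a) == before)      # False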
Examples

Behavior of Mutable Objects Let us see an example: we create three variables a, b, and c, where a and b are assigned objects of the same value, and c is assigned the variable a.

>>> a = [1, 2, 3]
>>> b = [1, 2, 3]
>>> c = a
>>> id(a), id(b), id(c)
(140222162413704, 140222017592328, 140222162413704)
We see that a and b have different identities, meaning each points to a different location in memory; they are indeed two independent objects. Now, let us compare a and c the same way:

>>> a == c, a is c, type(a) is type(c)
(True, True, True)
Ta-daa! They have the same identity, meaning they point to the same piece of memory, and c is more like an alias for a. Now, let us change a value in a using in-place operations and check the ids:

>>> a[2] = 4
>>> id(a), id(b), id(c)
(140222162413704, 140222017592328, 140222162413704)
>>> a += [5]
>>> id(a), id(b), id(c)
(140222162413704, 140222017592328, 140222162413704)
We see no change of identity, only of values. Now, let us use a standard operation and observe the behavior:

>>> a = a + [5]
>>> a
[1, 2, 4, 5, 5]
>>> id(a), id(b), id(c)
(140222017592392, 140222017592328, 140222162413704)
This time a is rebound to a brand-new list object with a new identity, while b and c keep their original identities; c still points to the old list.

Behavior of Immutable Objects For immutable objects such as strings, variables created with the same value may all share the same identity: since the object can never change, Python can let them all point to the same piece of memory, which uses memory more efficiently. If we then try to change the value of such a variable with the += operator, which is in-place only for mutable objects, we see that a new instance of the string object is still created, with a new id such as 140222017638952.
Python Data Types Python contains 11 built-in data types. These include four scalar types (int, float, complex, and bool), four sequence types (string, list, tuple, and range), one mapping type (dict), and two set types (set and frozenset). All four scalar types, together with string, tuple, range, and frozenset, are immutable; the others are mutable. Each of these can be manipulated using:
• Operators
• Functions
• Data-type methods
We can also write a .py file ourselves and import it. We provide references to some of the popular and useful built-in modules that are not covered in Part ?? in Section 30.9 of this chapter; they are:
• re
get/                      # first subpackage
    __init__.py
    info.py
    points.py
    transactions.py
create/                   # second subpackage
    __init__.py
    api.py
    platform.py

When we import any package, the Python interpreter searches its subdirectories/subpackages.
A library is a collection of various packages; conceptually, there is no difference between a package and a Python library. Have a look at the requests/requests library: we use it as a package.
builder_list = []                # gather the pieces in one mutable list
for data in container:           # (these two setup lines are reconstructed)
    builder_list.append(str(data))
"".join(builder_list)

### Another way is to use a list comprehension
"".join([str(data) for data in container])

### or use the map function
"".join(map(str, container))
This code takes advantage of the mutability of a single list object to gather
your data together and then allocate a single result string to put your data
in. That cuts down on the total number of objects allocated by almost half.
Another pitfall related to mutability is the following scenario:
def my_function(param=[]):
    param.append("thing")
    return param

my_function()  # returns ["thing"]
my_function()  # returns ["thing", "thing"]
What you might think would happen is that, by giving an empty list as the default value of param, a new empty list is allocated each time the function is called with no list passed in. But what actually happens is that every call that uses the default will be using the same list. This is because Python (a) evaluates a function definition only once, (b) evaluates default arguments as part of the function definition, and (c) therefore allocates one mutable list that is shared across every call of that function.
Do not put a mutable object as the default value of a function parameter.
Immutable types are perfectly safe. If you want to get the intended effect,
do this instead:
def my_function2(param=None):
    if param is None:
        param = []
    param.append("thing")
    return param
Identity operators Identity operators are used to check whether two values (or variables) are located in the same part of memory. Two variables that are equal are not necessarily identical, as we have shown in the last section.
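A short illustration of == versus is (a minimal sketch, reusing the list example from above):

a = [1, 2, 3]
b = [1, 2, 3]
c = a
print(a == b, a is b)  # True False: equal values, two distinct objects
print(a == c, a is c)  # True True:  c is just an alias for a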
30.3 Function
30.3.1 Python Built-in Functions
Check out here https://docs.python.org/3/library/functions.html.
Built-in Data Types We have functions like int(), float(), str(), tuple(), list(), set(), dict(), bool(), chr(), and ord(). These functions can be used for initialization, and also for type conversion between different data types.
A lambda function can have zero to multiple arguments but only one expression, which is evaluated and returned. For example, we define a lambda function that takes one argument x and returns x²:

square1 = lambda x: x**2
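A lambda can equally take several arguments, or none (a small sketch extending the example above):

square1 = lambda x: x**2        # the one-argument lambda from above
add = lambda x, y: x + y        # two arguments
answer = lambda: 42             # zero arguments
print(square1(3), add(2, 5), answer())  # 9 7 42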
Map
Map applies a function to all the items in an input_list. Here is the
blueprint:
map(function_to_apply, list_of_inputs)
Most of the time we want to pass all the list elements to a function one-by-one and then collect the output. For instance:

items = [1, 2, 3, 4, 5]
squared = []
for i in items:
    squared.append(i**2)
Map allows us to implement this in a much simpler and nicer way. Here you go:

items = [1, 2, 3, 4, 5]
squared = list(map(lambda x: x**2, items))
Most of the time we use lambdas with map, so I did the same. Instead of a list of inputs we can even have a list of functions! Here we use x(i) to call the function, where x is replaced with each function in funcs, and i is the input to the function.
def multiply(x):
    return x * x

def add(x):
    return x + x

funcs = [multiply, add]
for i in range(5):
    value = list(map(lambda x: x(i), funcs))
    print(value)

# Output:
# [0, 0]
# [1, 2]
# [4, 4]
# [9, 6]
# [16, 8]
Filter
As the name suggests, filter creates a list of elements for which a function
returns true. Here is a short and concise example:
number_list = range(-5, 5)
less_than_zero = list(filter(lambda x: x < 0, number_list))
print(less_than_zero)

# Output: [-5, -4, -3, -2, -1]
The filter resembles a for loop, but it is a built-in function and faster.
Note: if map and filter do not appear beautiful to you, you can read about list/dict/tuple comprehensions.
Reduce

Reduce is a really useful function for performing some computation on a list and returning the result. It applies a rolling computation to sequential pairs of values in a list, for example, to compute the product of a list of integers. The normal way you might go about doing this task in Python is using a basic for loop:

product = 1
nums = [1, 2, 3, 4]  # named nums to avoid shadowing the built-in list
for num in nums:
    product = product * num

# product = 24
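With reduce itself, the rolling computation becomes a one-liner; in Python 3, reduce lives in the functools module:

from functools import reduce

product = reduce(lambda acc, num: acc * num, [1, 2, 3, 4])
print(product)  # 24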
30.4 Class
30.4.1 Special Methods
From [1]: http://www.informit.com/articles/article.aspx?p=453682&seqNum=6. All the built-in data types implement a collection of special object
methods. The names of special methods are always preceded and followed
by double underscores (__). These methods are automatically triggered by
the interpreter as a program executes. For example, the operation x + y
is mapped to an internal method, x.__add__(y), and an indexing opera-
tion, x[k], is mapped to x.__getitem__(k). The behavior of each data type
depends entirely on the set of special methods that it implements.
User-defined classes can define new objects that behave like the built-
in types simply by supplying an appropriate subset of the special methods
described in this section. In addition, built-in types such as lists and dictio-
naries can be specialized (via inheritance) by redefining some of the special
methods. In this book, we only list the essential ones so that it speeds up
our interview preparation.
Table 30.6: Special Methods for Object Creation, Destruction, and Representation

Method                                Description
*__init__(self[,*args[,**kwargs]])    Called to initialize a new instance
__del__(self)                         Called to destroy an instance
*__repr__(self)                       Creates a full string representation of an object
__str__(self)                         Creates an informal string representation
__cmp__(self,other)                   Compares two objects and returns negative, zero, or positive
__hash__(self)                        Computes a 32-bit hash index
__nonzero__(self)                     Returns 0 or 1 for truth-value testing
__unicode__(self)                     Creates a Unicode string representation
If we have no __repr__(), the output for the following test cases is:

8766662474223
<__main__.Student object at 0x7f925cd79ef0>
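The Student class itself is not shown in this excerpt; a minimal sketch (attribute names assumed, not from the text) of how defining __repr__ changes that output:

class Student:
    def __init__(self, name, sid):
        self.name = name
        self.sid = sid

    def __repr__(self):
        # a full, unambiguous representation; also used by print()
        # here because no separate __str__ is defined
        return 'Student(name={!r}, sid={})'.format(self.name, self.sid)

print(Student('Li', 1))  # Student(name='Li', sid=1)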
Table 30.7: Special Methods for Comparison
Method Description
__lt__(self,other) self < other
__le__(self,other) self <= other
__gt__(self,other) self > other
__ge__(self,other) self >= other
__eq__(self,other) self == other
__ne__(self,other) self != other
From the above outputs, we can see that in the first case the colors1 list stays the same, but in the second case it changes although we are assigning a value to colors2. The result may or may not be what we want. In Python, assigning one list to another directly behaves like a pointer in C++: both names point to the same physical address. In the first case, colors2 is reassigned to a new list with a new address, so colors2 points to this new list instead, which leaves the values of colors1 untouched. We can visualize this process as follows:

[Figure: (a) the copy process for code 1; (b) the copy process for code 2]

However, we often need to make a copy and leave the original list or string unchanged. Lists come in a variety of forms, from one-dimensional and two-dimensional to multi-dimensional.
list1 = ['a', 'b', 'c', 'd']   # reconstructed setup: a shallow copy of a
list2 = list1[:]               # flat list, then change one element
list2[1] = 'x'
print(list2)
# ['a', 'x', 'c', 'd']
print(list1)
# ['a', 'b', 'c', 'd']
But as soon as a list contains sublists, we have the same difficulty: the copy holds just pointers to the sublists.
lst1 = ['a', 'b', ['ab', 'ba']]
lst2 = lst1[:]
If you assign a new value to the 0th element of one of the two lists, there will be no side effect. Problems arise if you change one of the elements of the sublist.
>>> lst1 = ['a', 'b', ['ab', 'ba']]
>>> lst2 = lst1[:]
>>> lst2[0] = 'c'
>>> lst2[2][1] = 'd'
>>> print(lst1)
['a', 'b', ['ab', 'd']]
The following script uses our example above and this method:
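The script itself is missing from this excerpt; a minimal sketch consistent with the output below, assuming the method in question is copy.deepcopy:

from copy import deepcopy

lst1 = ['a', 'b', ['ab', 'ba']]
lst2 = deepcopy(lst1)   # recursively copies the sublists too

lst2[0] = 'c'
lst2[2][1] = 'd'        # mutate the copied sublist

print(lst2)
print(lst1)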
If we save this script under the name deep_copy.py and call it with "python deep_copy.py", we receive the following output:
$ python deep_copy.py
['c', 'b', ['ab', 'd']]
['a', 'b', ['ab', 'ba']]
30.7 Loops

When we need to loop in our algorithms, we have two choices: for and while. Learning the basic grammar of the for loop helps us program more efficiently. Usually the for loop is used to iterate over a sequence or matrix data. For example, the following grammar works for either a string or a list.
# for loop over a list to get the values directly
a = [5, 4, 3, 2, 1]
for num in a:
    print(num)
# for loop over a list using the index
for idx in range(len(a)):
    print(a[idx])
# for loop over a list getting both index and value directly
for idx, num in enumerate(a):
    print(idx, num)
Sometimes we want to iterate over two lists jointly at the same time. We can use zip to join them together; note that zip stops at the end of the shorter list, so equal lengths keep all elements paired. All the other for-loop forms work just as above. For example:
a, b = [1, 2, 3, 4, 5], [5, 4, 3, 2, 1]
for idx, (num_a, num_b) in enumerate(zip(a, b)):
    print(idx, num_a, num_b)
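If the lists may differ in length and the tail should not be dropped, itertools.zip_longest pads the shorter one (a small sketch, not from the text):

from itertools import zip_longest

a, b = [1, 2, 3], [5, 4]
for num_a, num_b in zip_longest(a, b, fillvalue=0):
    print(num_a, num_b)  # the shorter list is padded with 0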
# counting occurrences with Counter (the setup lines are missing
# from this excerpt; a is the list being counted)
from collections import Counter
cnt = Counter(a)
print(cnt.most_common(3))
5. Reversing
# 1. reversing strings or lists
a = 'crackingleetcode'
b = [1, 2, 3, 4, 5]
print(a[::-1], b[::-1])
# 2. iterate over each char of the string or list contents in
#    reverse order efficiently; here we use zip to pair them
for char, num in zip(reversed(a), reversed(b)):
    print(char, num)
# 3. reverse each digit in an integer or float number
num = 123456789
print(int(str(num)[::-1]))
from bisect import bisect_left, bisect_right

def index(a, x):
    'Locate the leftmost value exactly equal to x'
    i = bisect_left(a, x)
    if i != len(a) and a[i] == x:
        return i
    raise ValueError

def find_lt(a, x):
    'Find rightmost value less than x'
    i = bisect_left(a, x)
    if i:
        return a[i-1]
    raise ValueError

def find_le(a, x):
    'Find rightmost value less than or equal to x'
    i = bisect_right(a, x)
    if i:
        return a[i-1]
    raise ValueError

def find_gt(a, x):
    'Find leftmost value greater than x'
    i = bisect_right(a, x)
    if i != len(a):
        return a[i]
    raise ValueError

def find_ge(a, x):
    'Find leftmost item greater than or equal to x'
    i = bisect_left(a, x)
    if i != len(a):
        return a[i]
    raise ValueError
30.9.3 collections

collections is a module in Python that implements specialized container data types as alternatives to Python's general-purpose built-in containers: dict, list, set, and tuple. The included container types are summarized in Table 30.8. Most of them we have learned in Part ??; therefore, in the table we simply put the reference. Before we use them, we need to import each data type:

from collections import deque, Counter, OrderedDict, defaultdict, namedtuple
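As a quick reminder of what a few of these containers offer (a small sketch, not from the text):

from collections import Counter, defaultdict, deque, namedtuple

dq = deque([1, 2, 3]); dq.appendleft(0)        # O(1) appends at both ends
cnt = Counter('abracadabra')                   # counts each character
dd = defaultdict(list); dd['key'].append(1)    # missing keys get a default
Point = namedtuple('Point', ['x', 'y'])        # lightweight record type
print(dq, cnt.most_common(2), dict(dd), Point(1, 2))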