CPgrahs
Graph
We Are All Connected
Heroes TV Series
Many real-life problems can be classified as graph problems. Some have efficient solutions. Some do not have
them yet. In this chapter, we learn various graph problems with known efficient solutions, ranging from basic
traversal to minimum spanning trees, shortest paths, and network flow algorithms.
4.1
In this chapter, we discuss graph problems that commonly appear in programming contests,
the algorithms to solve them, and the practical implementations of these algorithms. The issue of
how to store graph information was discussed earlier in Section 2.3.1.
We assume that readers are familiar with the following terminology: Vertices/Nodes,
Edges, Un/Weighted, Un/Directed, In/Out Degree, Self-Loop/Multiple Edges (Multigraph) versus
Simple Graph, Sparse/Dense, Path, Cycle, Isolated versus Reachable Vertices, (Strongly) Connected Component, Sub-Graph, Tree/Forest, Complete Graph, Directed Acyclic Graph, Bipartite Graph, Euler/Hamiltonian Path/Cycle. If you encounter any unfamiliar terms, please go to
Wikipedia [46] and search for that particular term.
Table 4.1 summarizes our research so far on graph problems in recent ACM ICPC Asia regional
contests and their ICPC Live Archive [11] IDs. Although there are many graph problems (many
of which are discussed in this chapter), they appear only once or twice in a problem set. The question
is: which ones do we have to learn? If you want to do well in contests, you have no choice but
to study all of this material.
4.2
LA     Problem Name                 Source
2817   The Suspects                 Kaohsiung03
2818   Geodetic Set Problem         Kaohsiung03
3133   Finding Nemo                 Beijing04
3138   Color a Tree                 Beijing04
3171   Oreon                        Manila06
3290   Invite Your Friends          Dhaka05
3294   Ultimate Bamboo Eater        Dhaka05
3678   Bug Sensor Problem           Kaohsiung06
4099   Sub-dictionary               Iran07
4109   USHER                        Singapore07
4110   RACING                       Singapore07
4138   Anti Brute Force Lock        Jakarta08
4271   Necklace                     Hefei08
4272   Polynomial-time Red...       Hefei08
4407   Gun Fight                    KLumpur08
4408   Unlock the Lock              KLumpur08
4524   Interstar Transport          Hsinchu09
4637   Repeated Substitution ...    Japan09
4645   Infected Land                Japan09
Table 4.1: Some Graph Problems in Recent ACM ICPC Asia Regional
DFS_WHITE = -1), DFS starts from a vertex u, marks u as visited (sets dfs_num[u] to DFS_BLACK = 1),
and then for each unvisited neighbor v of u (i.e. edge u → v exists in the graph), recursively visits
v. The snippet of DFS code is shown below:
typedef pair<int, int> ii; // we will frequently use these two data type shortcuts
typedef vector<ii> vii;

// all sample codes involving TRvii use this macro
#define TRvii(c, it) \
  for (vii::iterator it = (c).begin(); it != (c).end(); it++)

void dfs(int u) { // DFS for normal usage
  printf(" %d", u); dfs_num[u] = DFS_BLACK; // this vertex is visited, mark it
  TRvii (AdjList[u], v) // try all neighbors v of vertex u
    if (dfs_num[v->first] == DFS_WHITE) // avoid cycle
      dfs(v->first); // v is a (neighbor, weight) pair
}
The time complexity of this DFS implementation depends on the graph data structure used. In
a graph with V vertices and E edges, dfs runs in O(V + E) and O(V^2) if the graph is stored as
an Adjacency List and an Adjacency Matrix, respectively.
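To see where the O(V^2) bound comes from, here is a minimal sketch of DFS on an Adjacency Matrix. The function name dfsMatrix and the counter cellsScanned are our own illustrative choices, not part of the code used elsewhere in this chapter. Every visited vertex scans one full row of V cells, so a traversal that reaches all V vertices inspects V * V cells.

```cpp
#include <cassert>
#include <vector>
using namespace std;

// Hypothetical helper (assumption, not from the text): DFS on an Adjacency Matrix.
// Returns the number of vertices reached from u; cellsScanned accumulates
// how many matrix cells were inspected along the way.
int dfsMatrix(const vector<vector<int> > &adjMat, vector<bool> &visited,
              int u, long long &cellsScanned) {
  assert(u >= 0 && u < (int)adjMat.size());
  visited[u] = true;
  int reached = 1;
  for (int v = 0; v < (int)adjMat.size(); v++) { // scan the whole row: O(V)
    cellsScanned++;
    if (adjMat[u][v] != 0 && !visited[v])        // edge u -> v, v not yet visited
      reached += dfsMatrix(adjMat, visited, v, cellsScanned);
  }
  return reached;
}
```

With an Adjacency List, the inner loop would only touch the actual neighbors of u, which is what brings the total work down to O(V + E).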
Figure 4.1: Sample graph for the early part of this section
On the sample graph in Figure 4.1, calling dfs(0), i.e. DFS from start vertex u = 0, will trigger
this sequence of visitation: 0 → 1 → 2 → 3 → 4. This sequence is depth-first, i.e. DFS goes to
the deepest possible vertex from the start vertex before attempting another branch. Note that
this sequence of visitation depends very much on the way we order the neighbors of a vertex, i.e. the
sequence 0 → 1 → 3 → 2 (backtrack to 3) → 4 is also a possible visitation sequence. Also notice
that one call of dfs(u) will only visit the vertices that are connected to vertex u. That is why
vertices 5, 6, and 7 in Figure 4.1 are still unvisited after calling dfs(0).
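The dependence on neighbor order can be tried out with the self-contained sketch below. It assumes a hypothetical undirected graph with 8 vertices and edges 0-1, 1-2, 1-3, 2-3, 3-4, 6-7 (vertex 5 isolated), chosen only to be consistent with the sequences quoted above; it is not guaranteed to be identical to Figure 4.1. The helpers buildGraph, dfsVisit, and dfsOrder are illustrative names of our own.

```cpp
#include <cassert>
#include <utility>
#include <vector>
using namespace std;

typedef pair<int, int> ii;   // (neighbor, weight), as in the text
typedef vector<ii> vii;

const int DFS_WHITE = -1, DFS_BLACK = 1;

// Hypothetical helper: builds an undirected Adjacency List from an edge list.
// Neighbors are stored in the order the edges are given (all weights are 0).
vector<vii> buildGraph(int V, const vector<ii> &edges) {
  vector<vii> adj(V);
  for (size_t e = 0; e < edges.size(); e++) {
    adj[edges[e].first].push_back(ii(edges[e].second, 0));
    adj[edges[e].second].push_back(ii(edges[e].first, 0));
  }
  return adj;
}

// Same logic as dfs(u) in the text, but records the order instead of printing it.
void dfsVisit(const vector<vii> &adj, vector<int> &dfs_num, int u, vector<int> &order) {
  dfs_num[u] = DFS_BLACK;
  order.push_back(u);
  for (size_t j = 0; j < adj[u].size(); j++)
    if (dfs_num[adj[u][j].first] == DFS_WHITE)
      dfsVisit(adj, dfs_num, adj[u][j].first, order);
}

// Returns the vertices reachable from start, in DFS visitation order.
vector<int> dfsOrder(const vector<vii> &adj, int start) {
  assert(start >= 0 && start < (int)adj.size());
  vector<int> dfs_num(adj.size(), DFS_WHITE), order;
  dfsVisit(adj, dfs_num, start, order);
  return order;
}
```

Listing the edges as 0-1, 1-2, 1-3, 2-3, 3-4, 6-7 makes dfsOrder(adj, 0) return 0 1 2 3 4, whereas listing edge 1-3 before edge 1-2 yields 0 1 3 2 4. In both cases vertices 5, 6, and 7 are absent from the result.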
The DFS code shown here is very similar to the recursive backtracking code shown earlier in
Section 3.1. If we compare the pseudocode of a typical backtracking code (replicated below) with
the DFS code shown above, we can see that the main difference is just whether we flag visited
vertices. DFS does. Backtracking does not. By not revisiting vertices, DFS runs in O(V + E), but
the time complexity of backtracking goes up exponentially.
void backtracking(state) {
  if (hit end state or invalid state) // invalid state includes states that cause cycling
    return; // we need terminating/pruning condition
  for each neighbor of this state // regardless of whether it has been visited or not
    backtracking(neighbor);
}
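As a rough illustration of this gap, the sketch below counts recursive calls on a complete graph. It is our own experiment under stated assumptions, not code from Section 3.1: dfsCalls flags vertices permanently, while backtrackCalls treats only the vertices on the current path as invalid (to avoid cycling) and releases them on return. The names dfsCalls, backtrackCalls, and completeGraph are hypothetical.

```cpp
#include <cassert>
#include <vector>
using namespace std;

// DFS: a vertex is flagged once and never entered again.
long long dfsCalls(const vector<vector<int> > &adj, vector<bool> &visited, int u) {
  visited[u] = true;
  long long calls = 1;
  for (size_t j = 0; j < adj[u].size(); j++)
    if (!visited[adj[u][j]]) calls += dfsCalls(adj, visited, adj[u][j]);
  return calls;
}

// Backtracking: a vertex is blocked only while it is on the current path,
// so the recursion explores every simple path that starts at the first vertex.
long long backtrackCalls(const vector<vector<int> > &adj, vector<bool> &onPath, int u) {
  onPath[u] = true;
  long long calls = 1;
  for (size_t j = 0; j < adj[u].size(); j++)
    if (!onPath[adj[u][j]]) calls += backtrackCalls(adj, onPath, adj[u][j]);
  onPath[u] = false; // release the vertex when we backtrack
  return calls;
}

// Builds the complete undirected graph on n vertices.
vector<vector<int> > completeGraph(int n) {
  assert(n >= 1);
  vector<vector<int> > adj(n);
  for (int u = 0; u < n; u++)
    for (int v = 0; v < n; v++)
      if (u != v) adj[u].push_back(v);
  return adj;
}
```

On the complete graph with 10 vertices, dfsCalls makes 10 calls while backtrackCalls makes 986410, since the latter satisfies f(m) = 1 + m * f(m - 1) with f(0) = 1 for m remaining vertices.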
Other Applications
DFS is not only useful for traversing a graph. It can be used to solve many other graph problems.
Finding Connected Components in Undirected Graph
The fact that one single call of dfs(u) will only visit vertices that are actually connected to u can
be utilized to find (and to count) the connected components of an undirected graph (see further
below for a similar problem on directed graph). We can simply use the following code to restart
DFS from one of the remaining unvisited vertices to find the next connected component (until all
are visited):
// all sample codes involving REP use this macro
#define REP(i, a, b) \
  for (int i = int(a); i <= int(b); i++)

// inside int main()
numComponent = 0;
memset(dfs_num, DFS_WHITE, sizeof dfs_num); // valid because DFS_WHITE = -1 sets every byte to 0xFF
REP (i, 0, V - 1)
  if (dfs_num[i] == DFS_WHITE) {
    printf("Component %d, visit:", ++numComponent);
    dfs(i);
    printf("\n");
  }
printf("There are %d connected components\n", numComponent);
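If a problem also asks which component each vertex belongs to, the same restart idea can record a label instead of printing. The following is a minimal sketch under our own naming (labelFrom, labelComponents, UNLABELED are assumptions, not identifiers used elsewhere in this chapter), with the graph given as a plain vector of neighbor lists.

```cpp
#include <cassert>
#include <vector>
using namespace std;

const int UNLABELED = -1; // plays the role of DFS_WHITE

// DFS that stamps every vertex reachable from u with the component number id.
void labelFrom(const vector<vector<int> > &adj, vector<int> &comp, int u, int id) {
  comp[u] = id;
  for (size_t j = 0; j < adj[u].size(); j++) {
    int v = adj[u][j];
    assert(v >= 0 && v < (int)adj.size()); // guard against malformed input
    if (comp[v] == UNLABELED) labelFrom(adj, comp, v, id);
  }
}

// Returns the number of connected components of an undirected graph and
// fills comp[v] with the 0-based component number of each vertex v.
int labelComponents(const vector<vector<int> > &adj, vector<int> &comp) {
  comp.assign(adj.size(), UNLABELED);
  int numComponent = 0;
  for (int i = 0; i < (int)adj.size(); i++)
    if (comp[i] == UNLABELED)      // restart DFS from an unvisited vertex
      labelFrom(adj, comp, i, numComponent++);
  return numComponent;
}
```

Two vertices u and v are then in the same component exactly when comp[u] == comp[v], which answers connectivity queries in O(1) after the O(V + E) labeling pass.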
Exercise: We can also use Union-Find Disjoint Sets to solve this graph problem. How?
Exercise: Flood Fill is more commonly performed on a 2-D grid (an implicit graph). Try to solve UVa
352, 469, 572, etc.
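As a starting point for such grid problems, here is a generic flood fill sketch. It is not a solution to any of the listed UVa problems; the function names, the use of a vector<string> grid, and the diagonal switch between 4 and 8 neighbors are all our own assumptions, so check each problem statement for the connectivity it actually requires.

```cpp
#include <cassert>
#include <string>
#include <vector>
using namespace std;

// Row/column offsets: the first 4 entries are the N, E, S, W moves,
// the last 4 entries are the diagonal moves.
const int dr[8] = {-1, 0, 1, 0, -1, 1, 1, -1};
const int dc[8] = {0, 1, 0, -1, 1, 1, -1, -1};

// Recolors the region of cells equal to 'from' that contains (r, c) with 'to'
// and returns its size. Requires from != to, since the new color is the visited flag.
int floodFill(vector<string> &grid, int r, int c, char from, char to, bool diagonal) {
  if (r < 0 || r >= (int)grid.size()) return 0;    // outside the grid
  if (c < 0 || c >= (int)grid[r].size()) return 0;
  if (grid[r][c] != from) return 0;                // wrong color or already filled
  grid[r][c] = to;
  int filled = 1;
  int moves = diagonal ? 8 : 4;
  for (int d = 0; d < moves; d++)
    filled += floodFill(grid, r + dr[d], c + dc[d], from, to, diagonal);
  return filled;
}

// Counts the regions of color 'from'. The grid is taken by value so the
// caller's copy stays untouched.
int countRegions(vector<string> grid, char from, char to, bool diagonal) {
  assert(from != to);
  int regions = 0;
  for (int r = 0; r < (int)grid.size(); r++)
    for (int c = 0; c < (int)grid[r].size(); c++)
      if (grid[r][c] == from) { floodFill(grid, r, c, from, to, diagonal); regions++; }
  return regions;
}
```

The grid cells are the vertices and the allowed moves are the edges, so no explicit Adjacency List is needed. This is the same restart-DFS idea used above for counting connected components.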
Graph Edges Property Check via DFS Spanning Tree
Running DFS on a connected component of a graph will form a DFS spanning tree (or spanning
forest if the graph has more than one component and DFS is run on each component). With one
more vertex state: DFS_GRAY = 2 (visited but not yet completed) on top of DFS_WHITE (unvisited)
and DFS_BLACK (visited and completed), we can use this DFS spanning tree (or forest) to classify
graph edges into four types:
1. Tree edges: those traversed by DFS, i.e. from a vertex with DFS_GRAY to a vertex with DFS_WHITE.
2. Back edges: part of a cycle, i.e. from a vertex with DFS_GRAY to a vertex that is also DFS_GRAY.
Note that usually we do not count a bi-directional edge as forming a cycle
(we need to remember dfs_parent to distinguish this, see the code below).
3. Forward/Cross edges: from a vertex with DFS_GRAY to a vertex with DFS_BLACK.
These two types of edges are not typically used in programming contest problems.
Figure 4.2 shows an animation (from top left to bottom right) of calling dfs(0), then dfs(5), and
finally dfs(6) on the sample graph in Figure 4.1. We can see that 1 → 2 → 3 → 1 is a (true) cycle
and we classify edge (3 → 1) as a back edge, whereas 0 → 1 → 0 is not a cycle; edge (1 → 0) is
just a bi-directional edge. The code for this DFS variant is shown below.
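The classification rules above can be sketched as follows. This is our own minimal rendering of the three-state idea, not necessarily the exact code of this chapter: the struct EdgeCounts and the function classifyEdges are hypothetical names, the graph is a plain vector of neighbor lists, and an undirected edge is assumed to be stored as two directed edges.

```cpp
#include <cassert>
#include <vector>
using namespace std;

const int DFS_WHITE = -1, DFS_BLACK = 1, DFS_GRAY = 2; // values from the text

struct EdgeCounts { int tree, back, bidirectional, forwardCross; };

void graphCheck(const vector<vector<int> > &adj, vector<int> &dfs_num,
                vector<int> &dfs_parent, int u, EdgeCounts &cnt) {
  dfs_num[u] = DFS_GRAY;                      // visited but not yet completed
  for (size_t j = 0; j < adj[u].size(); j++) {
    int v = adj[u][j];
    assert(v >= 0 && v < (int)adj.size());    // guard against malformed input
    if (dfs_num[v] == DFS_WHITE) {            // GRAY to WHITE: tree edge
      cnt.tree++;
      dfs_parent[v] = u;
      graphCheck(adj, dfs_num, dfs_parent, v, cnt);
    } else if (dfs_num[v] == DFS_GRAY) {      // GRAY to GRAY
      if (v == dfs_parent[u]) cnt.bidirectional++; // just the edge back to the parent
      else cnt.back++;                             // true cycle
    } else {                                  // GRAY to BLACK
      cnt.forwardCross++;
    }
  }
  dfs_num[u] = DFS_BLACK;                     // visited and completed
}

// Runs graphCheck from every unvisited vertex, i.e. over the whole spanning forest.
EdgeCounts classifyEdges(const vector<vector<int> > &adj) {
  vector<int> dfs_num(adj.size(), DFS_WHITE), dfs_parent(adj.size(), -1);
  EdgeCounts cnt = {0, 0, 0, 0};
  for (int i = 0; i < (int)adj.size(); i++)
    if (dfs_num[i] == DFS_WHITE) graphCheck(adj, dfs_num, dfs_parent, i, cnt);
  return cnt;
}
```

On a hypothetical undirected graph with edges 0-1, 1-2, 1-3, 2-3, 3-4, 6-7 and neighbors visited in increasing order, the only back edge found is 3 → 1, matching the cycle 1 → 2 → 3 → 1 described above. A graph contains a cycle exactly when cnt.back is non-zero. Note that this simple parent test would also treat a duplicate edge to the parent as bi-directional.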
4.3
take out the front most vertex u from the queue and enqueue each unvisited neighbor of u. With
the help of the queue, BFS will visit vertex s and all vertices in the connected component that
contains s layer by layer. This is why it is called breadth-first. BFS also runs in O(V + E)
on a graph represented using an Adjacency List.
Implementing BFS is easy if we utilize the C++ STL. We use a queue to order the sequence
of visitation and a map to record whether a vertex has been visited, which at the same time also
records the distance (layer number) of each vertex from the source vertex. This feature is important as
it can be used to solve a special case of the Single-Source Shortest Paths problem (discussed below).
queue<int> q; map<int, int> dist;
q.push(s); dist[s] = 0; // start from source
while (!q.empty()) {
  int u = q.front(); q.pop(); // queue: layer by layer!
  printf("Visit %d, Layer %d\n", u, dist[u]);
  TRvii (AdjList[u], v) // for each neighbor of u
    if (!dist.count(v->first)) { // dist.find(v->first) == dist.end() also works
      dist[v->first] = dist[u] + 1; // if v not visited before + reachable from u
      q.push(v->first); // enqueue v for next steps
    }
}
Exercise: This implementation uses map<STATE-TYPE, int> dist to store distance information.
This may be useful if STATE-TYPE is not an integer, e.g. a pair<int, int> of (row, col) coordinates.
However, this trick adds a log V factor to the O(V + E) BFS complexity. Please rewrite this
implementation to use vector<int> dist instead!
Other Applications
Single-Source Shortest Paths (SSSP) on Unweighted Graph
The fact that BFS visits vertices of a graph layer by layer from a source vertex makes BFS a good
solver for the Single-Source Shortest Paths (SSSP) problem on unweighted graphs. This is because in
an unweighted graph, the distance between two neighboring vertices connected by an edge is simply
one unit. Thus the layer count of a vertex that we have seen previously is precisely the shortest
path length from the source to that vertex. For example in Figure 4.7, the shortest path from the
vertex labeled 35 to the vertex labeled 30 has length 3, as 30 is in the third layer in the BFS sequence
of visitation. Reconstructing the shortest path 35 → 15 → 10 → 30 is easy if we store the BFS
spanning tree, i.e. vertex 30 remembers 10 as its parent, vertex 10 remembers 15, and vertex 15
remembers 35 (the source).
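The parent-remembering idea can be sketched as below. This is a minimal illustration under our own assumptions: vertices are numbered 0 to V - 1 (so the labels of Figure 4.7 are not used), the graph is a plain vector of neighbor lists, distances are kept in a vector<int> with -1 meaning unvisited, and the names bfs, reconstructPath, and UNVISITED are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <queue>
#include <vector>
using namespace std;

const int UNVISITED = -1;

// BFS from source s. Fills dist[v] with the layer number of v (UNVISITED if
// unreachable) and parent[v] with the predecessor of v in the BFS spanning tree.
void bfs(const vector<vector<int> > &adj, int s, vector<int> &dist, vector<int> &parent) {
  assert(s >= 0 && s < (int)adj.size());
  dist.assign(adj.size(), UNVISITED);
  parent.assign(adj.size(), -1);
  queue<int> q;
  q.push(s); dist[s] = 0;
  while (!q.empty()) {
    int u = q.front(); q.pop();
    for (size_t j = 0; j < adj[u].size(); j++) {
      int v = adj[u][j];
      if (dist[v] == UNVISITED) {   // first time we see v: this is its layer
        dist[v] = dist[u] + 1;
        parent[v] = u;              // v remembers u as its parent
        q.push(v);
      }
    }
  }
}

// Walks the parent pointers from t back to the source, then reverses them.
// Returns an empty path if t was not reached by the BFS.
vector<int> reconstructPath(const vector<int> &dist, const vector<int> &parent, int t) {
  vector<int> path;
  if (dist[t] == UNVISITED) return path;
  for (int v = t; v != -1; v = parent[v]) path.push_back(v);
  reverse(path.begin(), path.end());
  return path;
}
```

For a path returned by reconstructPath, the number of edges (path.size() - 1) always equals dist[t]. When several shortest paths exist, the one reported depends on the order of the neighbors in adj.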
Which graph traversal algorithm to choose? Table 4.2 can be helpful.
        O(V + E) DFS                              O(V + E) BFS
Pro     Uses less memory                          Can solve SSSP on unweighted graphs
Cons    Cannot solve SSSP on unweighted graphs    Uses more memory
Code    Slightly easier to code                   Slightly longer to code