Final
You can use the following NP-complete (NPC) problems covered in class or in the homework for your reductions:
SAT, 3SAT, Vertex Cover, Set Cover, Independent Set, Clique, Subset Sum, Knapsack, 3-Partition,
3-Dimensional Matching, Coloring, 3-Coloring, (directed or undirected) Hamiltonian Cycle/Path, Traveling
Salesman, Dominating Set, and Max Cut.
You can assume that (a poly-sized) linear program can be solved in poly-time.
Moreover, you can assume the min-cost matching (on bipartite or general graphs) and the min-cost flow
problem (e.g., find a flow of value k and the cost of the flow is minimized) can be solved in poly-time.
We may need concentration inequalities like Chebyshev’s inequality and/or Chernoff Bound.
(Chebyshev’s inequality) Let X be a random variable with finite expected value µ and finite non-zero
variance σ². Then for any real number k > 0,
$$\Pr(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}.$$
(Chernoff Bound) Let $X_1, \dots, X_n$ be n i.i.d. random variables in [0, 1]. Let $\mu = E[X_i]$. Then for any
ε > 0, we have that
$$\Pr\left[\left|\frac{1}{n}\sum_{i=1}^{n} X_i - \mu\right| \ge \epsilon\right] \le 2\exp\left(-\frac{n\epsilon^2}{2}\right).$$
Other versions should work equally well for this exam. You can use any of them.
1. (3 pts) Consider a trail of length L miles with n stopping points located at distances
$x_1, x_2, \dots, x_n$ from the start of the trail. Each day you can hike at most d miles, and you must
end each day at a stopping point unless you have finished the trail.
Design a polynomial-time algorithm that finds the minimum number of days needed to complete the trail,
and prove the correctness of your algorithm. You can assume that a feasible solution always exists.
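One natural candidate for Problem 1 is the greedy rule "each day, walk to the farthest stopping point within d miles". The sketch below only illustrates that rule and is not a proof of its optimality; the function name `min_hiking_days` and the convention of treating the trail end L as a final stopping point are assumptions made for this example.

```python
def min_hiking_days(stops, L, d):
    """Greedy sketch: each day hike to the farthest reachable stopping point.

    stops: distances of the stopping points from the trail start (assumed 0 < x < L)
    L:     trail length; the trail end is treated as one more stopping point
    d:     maximum distance that can be hiked in one day
    """
    points = sorted(stops) + [L]
    days = 0
    pos = 0
    i = 0
    while pos < L:
        farthest = None
        # Scan forward over every point reachable from the current position.
        while i < len(points) and points[i] - pos <= d:
            farthest = points[i]
            i += 1
        if farthest is None:
            raise ValueError("no feasible schedule for these inputs")
        pos = farthest
        days += 1
    return days
```

After sorting, the scan touches each stopping point once, so the running time is dominated by the O(n log n) sort.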
2. (3pts) (Solve one of the following two problems; both are textbook exercises. You do not need to
solve both.) (1) Path selection problem: We are given a directed graph G and a set of directed paths
$P_1, \dots, P_c$ in G (the $P_i$ may share nodes or edges). The question is whether we can choose at least k of
the paths such that no two of the selected paths share any node. Show that the problem is NPC.
(2) In class, we learned that finding k edge/vertex-disjoint paths from s to t can be solved using flow.
However, the following variant turns out to be NPC. We are given a directed graph G and k pairs of nodes
$(s_1, t_1), \dots, (s_k, t_k)$. The problem asks whether there exist k vertex-disjoint paths $P_1, \dots, P_k$ such
that $P_i$ goes from $s_i$ to $t_i$. Prove that the problem is NPC.
3. (4 pts) Let S be the set {1, 2, . . . , mn}. We partition S into m sets A1 , A2 , . . . , Am of size n. Let a
second partitioning of S into m sets of size n be B1 , B2 , . . . , Bm .
(a) Prove that the sets $A_i$ can be renumbered such that $A_i \cap B_i \neq \emptyset$ for every i.
(b) Design a polynomial-time algorithm that finds such a renumbering.
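One possible route for part (b) is to view the renumbering as a perfect matching between the B-sets and the A-sets, joining two sets whenever they intersect. The sketch below uses a simple augmenting-path matching; the function name `renumber`, the 0-based indices, and the returned permutation format are assumptions for illustration only.

```python
def renumber(A, B):
    """Return perm such that A[perm[i]] intersects B[i] for every i.

    A, B: lists of m disjoint sets, each list partitioning the same ground set.
    """
    m = len(A)
    owner = {}                      # element -> index of the A-set containing it
    for j, s in enumerate(A):
        for x in s:
            owner[x] = j
    # adj[i] lists the A-sets that share at least one element with B[i].
    adj = [sorted({owner[x] for x in B[i]}) for i in range(m)]
    match_of_A = [-1] * m           # match_of_A[j] = i means A[j] is paired with B[i]

    def try_augment(i, seen):
        # Standard augmenting-path search starting from B[i].
        for j in adj[i]:
            if j in seen:
                continue
            seen.add(j)
            if match_of_A[j] == -1 or try_augment(match_of_A[j], seen):
                match_of_A[j] = i
                return True
        return False

    for i in range(m):
        if not try_augment(i, set()):
            raise ValueError("no perfect matching; inputs are not valid partitions")
    perm = [-1] * m
    for j, i in enumerate(match_of_A):
        perm[i] = j
    return perm
```

Renaming A[perm[i]] as the new i-th set then gives a nonempty intersection with B[i] for every i. That a perfect matching always exists is exactly what part (a) asks you to prove.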
4. (5 pts) (Interference-free schedule) We are given an undirected graph G(V, E) and two robots
A and B. Initially, A is located at node a and B at node b. We would like to find a schedule that moves
A to node c and B to node d. In each time slot, a robot can either move from its current node to one of its
neighbors in G or stay still. We also require the schedule to be interference-free: at every time slot, A
and B must be at distance at least r from each other. Give a polynomial-time algorithm that finds an
interference-free schedule. You can assume that a and b are at distance greater than r, and so are c and d.
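One way to approach Problem 4 is to search over joint configurations (position of A, position of B), keeping only configurations whose distance in G is at least r. The sketch below is a breadth-first search over these pairs. It assumes both robots may move in the same time slot and that the distance constraint is checked only at the end of each slot; the names `bfs_dist` and `find_schedule` and the adjacency-dictionary input are illustrative choices, not part of the problem statement.

```python
from collections import deque


def bfs_dist(adj, src):
    """Hop distances from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def find_schedule(adj, a, b, c, d, r):
    """Return a list of (pos_A, pos_B) configurations, or None if none exists."""
    dist = {u: bfs_dist(adj, u) for u in adj}

    def ok(u, v):
        # Unreachable pairs are treated as infinitely far apart.
        return dist[u].get(v, float("inf")) >= r

    start, goal = (a, b), (c, d)
    if not ok(*start) or not ok(*goal):
        return None
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        u, v = state
        for nu in [u] + list(adj[u]):          # A stays or moves
            for nv in [v] + list(adj[v]):      # B stays or moves
                nxt = (nu, nv)
                if nxt not in parent and ok(nu, nv):
                    parent[nxt] = state
                    queue.append(nxt)
    return None
```

There are at most |V|² configurations, so the search runs in time polynomial in the size of G.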
A(x̂, ŷ) < min {A(x̂ − 1, ŷ), A(x̂ + 1, ŷ), A(x̂, ŷ − 1), A(x̂, ŷ + 1)} .
For convenience, you can simply assume A(i, 0) = A(i, N + 1) = A(0, i) = A(N + 1, i) = +∞ for any
1 ≤ i ≤ N. Now please answer the following questions.
(Hint: the questions are independent of each other, so you can answer them in any order.)
(a) (2pt) Gradient descent is a popular method for finding local optima. Specifically, we can start
from any coordinate $(x_0, y_0)$ and, in each step $t > 0$, move to a neighboring grid cell $(x_t, y_t)$
(i.e., $|x_t - x_{t-1}| + |y_t - y_{t-1}| = 1$) satisfying $A(x_t, y_t) < A(x_{t-1}, y_{t-1})$. (If there is more than
one such neighboring cell, choose an arbitrary one.) We repeat this process until $(x_t, y_t)$
is a local optimum.
Despite its simplicity, gradient descent may not be an efficient algorithm. Please construct a
counterexample on which the above gradient descent algorithm needs to query $\Omega(N^2)$ grid cells
to find a local optimum. For convenience, you can simply assume that we always start from
$(x_0, y_0) = (1, 1)$.
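To make the descent procedure in part (a) concrete, the sketch below runs it on one possible hard instance: a "snake" corridor whose values decrease along a path that winds through every other row, separated by rows of large "wall" values. The construction, the 0-based indexing (so the start is (0, 0) rather than (1, 1)), and the names `snake_grid` and `gradient_descent` are assumptions made for this illustration.

```python
def snake_grid(n):
    """Build an n x n grid whose only descending route is a long winding path."""
    path = []
    for row in range(0, n, 2):
        # Even corridors run left to right, odd corridors right to left.
        cols = range(n) if (row // 2) % 2 == 0 else range(n - 1, -1, -1)
        path.extend((row, c) for c in cols)
        if row + 2 < n:
            # One connector cell cuts through the wall row below the corridor.
            path.append((row + 1, path[-1][1]))
    big = n * n
    # Wall cells: all strictly larger than every value on the path.
    grid = [[big + 1 + r * n + c for c in range(n)] for r in range(n)]
    for k, (r, c) in enumerate(path):
        grid[r][c] = big - k        # strictly decreasing along the path
    return grid, path


def gradient_descent(grid, start=(0, 0)):
    """Follow strictly smaller neighbors until none exists; return visited cells."""
    n = len(grid)
    r, c = start
    visited = [(r, c)]
    while True:
        lower = [(nr, nc)
                 for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                 if 0 <= nr < n and 0 <= nc < n and grid[nr][nc] < grid[r][c]]
        if not lower:
            return visited
        r, c = lower[0]             # arbitrary choice among smaller neighbors
        visited.append((r, c))
```

On this instance every path cell has exactly one smaller neighbor, so the descent is forced to walk the whole corridor, which contains roughly N²/2 cells.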
(b) (3pt) Propose an efficient algorithm that finds a local optimum in $o(N^2)$ time. (Hint: in fact,
querying only $O(N)$ grid cells is sufficient.)
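As a warm-up for part (b), the sketch below shows a simpler divide-and-conquer search that queries O(N log N) cells: take the middle column, find its minimum, and recurse into the half containing a smaller horizontal neighbor. This already runs in o(N²) time but does not reach the O(N) bound in the hint. It assumes all grid values are distinct and uses 0-based indices; `find_local_min` and `is_local_min` are hypothetical names.

```python
def find_local_min(A):
    """Return (row, col) of a local minimum of the square grid A (distinct values)."""
    n = len(A)
    lo, hi = 0, n - 1                       # current range of candidate columns
    while True:
        mid = (lo + hi) // 2
        # Row holding the minimum of the middle column (n queries).
        r = min(range(n), key=lambda i: A[i][mid])
        v = A[r][mid]
        if mid > lo and A[r][mid - 1] < v:
            hi = mid - 1                    # a smaller value lies to the left
        elif mid < hi and A[r][mid + 1] < v:
            lo = mid + 1                    # a smaller value lies to the right
        else:
            return r, mid


def is_local_min(A, r, c):
    """Check that A[r][c] is strictly smaller than all of its in-grid neighbors."""
    n = len(A)
    return all(A[r][c] < A[nr][nc]
               for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
               if 0 <= nr < n and 0 <= nc < n)
```

The column adjacent to a discarded boundary column always contains a value smaller than everything in that boundary column, which is why a cell returned from the narrowed range is still a local minimum of the full grid.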
6. (5pts) There is a set of pages P = {1, 2, ..., n} that can be broadcast by a server. Assume that time is
discrete and there are T time slots. At each time slot, the broadcast server can broadcast exactly one
page, and all users receive that page. There are several users. Each user u is associated
with a page $p_u \in P$ and a time interval $[b_u, e_u]$. User u is satisfied if the server broadcasts page $p_u$ at any
time during $[b_u, e_u]$.
Design an algorithm to schedule the broadcast server in order to satisfy as many users as possible.
The problem is NP-hard (you do not have to prove it). Design a poly-time approximation algorithm
that can achieve a constant factor approximation ratio.
∗ Please answer 3 of the following 5 questions. If you answer more than 3 questions, we
will score you on the 3 problems for which you get the highest scores.
where $x_i$ is a point chosen uniformly at random from [0, 1]. Argue why this is a good estimate.
For fixed values ε and δ, show how many samples (in terms of ε and δ) we need to guarantee that
$$\Pr[|g - \hat{g}| \le \epsilon] \ge 1 - \delta.$$
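As an illustration of the additive-error guarantee, the sketch below computes a sample size from the Chernoff bound stated at the top of the exam by solving 2 exp(−nε²/2) ≤ δ for n, and then averages f at uniform random points. It assumes that f takes values in [0, 1], that g is the integral of f over [0, 1], and that ĝ is the sample mean; these are assumptions of this example, as are the names `samples_needed` and `estimate_integral`.

```python
import math
import random


def samples_needed(eps, delta):
    """Smallest n with 2 * exp(-n * eps**2 / 2) <= delta."""
    return math.ceil(2.0 * math.log(2.0 / delta) / eps ** 2)


def estimate_integral(f, eps, delta, rng=None):
    """Monte Carlo estimate of the integral of f over [0, 1].

    Assumes 0 <= f(x) <= 1 so that the Chernoff bound for [0, 1]-valued
    variables applies to the samples f(x_i).
    """
    rng = rng or random.Random()
    n = samples_needed(eps, delta)
    return sum(f(rng.random()) for _ in range(n)) / n
```

Under these assumptions the sample count grows like ε⁻² log(1/δ), independently of f.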
(b) (1pts) The above approach provides an estimate with additive error ε. However, for some
applications, g can be extremely small (say $10^{-10}$) and a typical additive error (say ε = 0.0001) is
not enough. In this case, we would like an estimate with multiplicative error. More precisely,
for fixed values ε and δ, we would like an estimate ĝ such that
$$\Pr[|g - \hat{g}| \le \epsilon g] \ge 1 - \delta.$$
If we still use uniform sampling, show how many samples we need. You can assume that you
know a value v, which is a very coarse estimate of g such that g/10 ≤ v ≤ 10g (in this case,
the number of samples may depend on ε, δ and $v^{-1}$).
(c) (2pts) Suppose we have another function h(x), which is a very rough estimate of f(x). In
particular, we know that f(x)/10 ≤ h(x) ≤ 10f(x) and we know the value $A = \int_0^1 h(x)\,dx$.
Suppose we can take n i.i.d. samples $x_i$ according to a probability density that is proportional to
h(x) (i.e., according to the pdf h(x)/A). Show that there is a method that estimates
g with multiplicative error ε (with probability 1 − δ), where the number of samples n is polynomial
in $\epsilon^{-1}$ and $\log(1/\delta)$. You need to state your algorithm, why it gives a good estimate, and
the analysis of the number of samples. (Side note: this problem shows that knowing a
rough approximation of f can greatly reduce the sample complexity if you use a better sampling
method.)
8. For a finite set V of points in the Euclidean plane $\mathbb{R}^2$, the Voronoi diagram consists of the regions
$$P_v = \{x \in \mathbb{R}^2 : \|x - v\| \le \|x - u\| \text{ for all } u \in V\}$$
for each v ∈ V. Essentially, $P_v$ consists of all points that are closest to v. The Delaunay triangulation
of V is the graph D(V, E) where
(a) (3pts) Show that a minimum spanning tree on V is a subgraph of D(V, E).
(b) (2pts) Assume a Delaunay triangulation can be computed in O(n log n) time. Show that a min-
imum spanning tree can be also computed in O(n log n) time.
9. (Approximate Counting 0-1 KNAPSACK) For the 0-1 KNAPSACK problem, we are given a set of n
objects of weight 0 ≤ a1 ≤ a2 ≤ · · · ≤ an ≤ b, and the capacity C. All these numbers are positive
integers.
(1) (1 pts) We want to count the number N of subsets S such that $\sum_{i \in S} a_i \le C$. Your algorithm
can run in pseudopolynomial time (i.e., poly(n, C)).
(2) (1 pts) We would like to sample one solution uniformly from all solutions to $\sum_{i \in S} a_i \le C$
(i.e., each subset S is sampled with probability 1/N). Show how to do this in pseudopolynomial time.
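Parts (1) and (2) can be illustrated with one dynamic-programming table. In the sketch below, `cnt[i][c]` is the number of subsets of the first i objects with total weight at most c, so N is `cnt[n][C]`; sampling then walks the table backwards and includes object i with probability equal to the fraction of remaining solutions that contain it. The function names and the 0-based item indices are assumptions for this example.

```python
import random


def count_table(weights, C):
    """cnt[i][c] = number of subsets of the first i weights with sum <= c."""
    cnt = [[1] * (C + 1)]                   # only the empty subset when i = 0
    for a in weights:
        prev = cnt[-1]
        # Either skip the new object or include it (if it fits).
        cnt.append([prev[c] + (prev[c - a] if a <= c else 0)
                    for c in range(C + 1)])
    return cnt


def sample_solution(weights, C, rng=None):
    """Return a uniformly random feasible subset as a sorted list of indices."""
    rng = rng or random.Random()
    cnt = count_table(weights, C)
    chosen, c = [], C
    for i in range(len(weights), 0, -1):
        a = weights[i - 1]
        with_item = cnt[i - 1][c - a] if a <= c else 0
        # Include object i-1 with probability with_item / cnt[i][c].
        if rng.randrange(cnt[i][c]) < with_item:
            chosen.append(i - 1)
            c -= a
    return sorted(chosen)
```

Both routines use O(nC) table entries, which is pseudopolynomial in the sense of part (1).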
(3) (1 pts) We are given a very large set U . We can take samples uniformly from U . Suppose you want
to estimate the cardinality of a subset T ⊂ U . For any element u ∈ U , you can determine whether
u ∈ T or not in O(1) time. Suppose |U|/|T| = α. Show that it is possible to estimate the cardinality
of T with relative error ε with probability 1 − δ in $\mathrm{poly}(\alpha, \epsilon^{-1}, \log(1/\delta))$ time.
(4) (2 pts) Design a $\mathrm{poly}(n, \epsilon^{-1}, \log(1/\delta))$-time algorithm to approximate the number N with relative error ε
with probability 1 − δ. (Hint: first create a scaled problem as follows: Let $a'_i = \lfloor n^2 a_i / b \rfloor$. Let $N'$
be the number of subsets S such that $\sum_{i \in S} a'_i \le n^2$. Show that $N' \le (n + 1)N$. For any solution of the
scaled problem, show that we can delete at most 1 element to meet the original constraint.)
10. (5pts) Consider a random bipartite graph G(U, V ; E) with |U | = n, |V | = n. For each pair of nodes
u and v, (u, v) ∈ E with probability 1/n. Let M (E) be the size of the maximum matching for
G(U, V ; E). Show that
E[M (E)] = Ω(n).
(Note that E is a random set, so E[M (E)] makes sense).
11. (5pts) Consider the following stochastic matching problem. We have a random bipartite graph model
where each possible edge e is present independently with some probability pe . The probabilities pe are
given as input. Given these probabilities, we want to build a large matching in the randomly generated
graph. However, the only way we can find out whether an edge is present or not is to query it, and
if the edge is indeed present in the graph, we are forced to add it to our matching. Further, for each
vertex i, we are allowed to query at most $t_i$ edges incident on i (the numbers $t_i$ are called patience
levels and are also part of the input). Our goal is to design an adaptive query algorithm that maximizes
the expected size of the matching.
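Our strategy is based on the following LP (δ(i) is the set of edges incident on vertex i):
$$
\begin{aligned}
\max\ & \sum_{e} x_e \\
\text{s.t.}\ & \sum_{e \in \delta(i)} y_e \le t_i, && \forall i \in V \\
& \sum_{e \in \delta(i)} x_e \le 1, && \forall i \in V \\
& x_e = p_e y_e, && \forall e \in E \\
& y_e \in [0, 1], && \forall e \in E.
\end{aligned}
$$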
• First solve the LP and obtain the optimal solution (xe , ye ) (xe = pe ye ).
• Pick a permutation π on edges uniformly at random.
• For each edge e in the ordering π, do the following: If e is not available (at least one of its
endpoints is matched or has lost its patience, i.e., we have already probed ti edges incident on
that vertex i) then do not probe it. If e is still available then probe it with probability ye /α where
α is a constant which you can choose.
(a) (1pt) Note that this LP is somewhat different from the LP relaxation we studied in class. Here
ye is not the relaxation of some integer variable. Prove that the optimal LP solution is an upper
bound of the expected matching size obtained by any optimal strategy.
(b) (4pt) Prove that this algorithm achieves a constant-factor approximation (using an appropriate
constant α). (Hint: use a somewhat large constant α. Let $I_e$ be the event that e is available when we
consider probing it in the random permutation π. Try to show that $\Pr[I_e]$ is at least some constant.)
Note that even if you do not know how to formally prove (a), you can still try to answer (b).