Daa Module 6
Back Tracking: -The Control Abstraction – The N Queen’s Problem, 0/1 Knapsack Problem
Branch and Bound:Travelling Salesman Problem.
Introduction to Complexity Theory :-Tractable and Intractable Problems- The P and NP
Classes- Polynomial Time Reductions - The NP- Hard and NP-Complete Classes
BACKTRACKING
GENERAL METHOD
The principal idea is to construct solutions one component at a time and evaluate such
partially constructed candidates as follows.
If a partially constructed solution can be developed further without violating the problem‘s
constraints, it is done by taking the first remaining legitimate option for the next component.
If there is no legitimate option for the next component, no alternatives for any remaining
component need to be considered. In this case, the algorithm backtracks to replace the last
component of the partially constructed solution with its next option.
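As a minimal sketch (in Python; the function and parameter names are illustrative, not from the text), the control abstraction above can be written recursively. Here `options` plays the role of generating the next X(k) and `promising` plays the role of the bounding functions Bi:

```python
def backtrack(solution, n, options, promising):
    """Generic backtracking: extend a partial solution one component at a time.

    solution  -- list of components chosen so far
    n         -- total number of components required
    options   -- function giving candidate values for the next component
    promising -- bounding function: True if the partial solution can still
                 lead to a complete solution
    """
    if len(solution) == n:          # a complete solution has been built
        yield list(solution)
        return
    for choice in options(solution):
        solution.append(choice)     # tentatively take the next option
        if promising(solution):     # develop it only if no constraint is violated
            yield from backtrack(solution, n, options, promising)
        solution.pop()              # backtrack: undo the last component
```

For instance, enumerating binary strings of length 3 with no two consecutive 1s uses `options = lambda s: [0, 1]` and `promising = lambda s: s[-2:] != [1, 1]`.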
STATE-SPACE TREE
• The time required by a backtracking algorithm (i.e., its efficiency) depends on four factors:
(i) the time to generate the next X(k);
(ii) the number of X(k) satisfying the explicit constraints;
(iii) the time for the bounding functions Bi;
(iv) the number of X(k) satisfying Bi for all i.
A backtracking algorithm on one problem instance might generate only O(n) nodes while on
a different instance might generate almost all nodes in the state space tree.
N-QUEENS PROBLEM
The problem is to place n queens on an n-by-n chessboard so that no two queens attack each
other by being in the same row, in the same column, or on the same diagonal. For n = 1,
the problem has a trivial solution, and it is easy to see that there is no solution for n = 2 and
n =3. So let us consider the four-queens problem and solve it by the backtracking technique.
Since each of the four queens has to be placed in its own row, all we need to do is to assign a
column for each queen on the board presented in the following figure.
The explicit constraints using this formulation are: Si = {1, 2, ..., n}, 1 ≤ i ≤ n. The solution
space consists of all n-tuples (x1, x2, ..., xn).
The implicit constraints for this problem are:
1. No two xi ’s can be same (i.e., all queens must be placed on different columns).
2. No two queens can be on the same diagonal.
Steps to be followed
We start with the empty board and then place queen 1 in the first possible position of its row,
which is in column 1 of row 1.
Then we place queen 2, after trying unsuccessfully columns 1 and 2, in the first acceptable
position for it, which is square (2,3), the square in row 2 and column 3. This proves to be a
dead end because there is no acceptable position for queen 3. So, the algorithm backtracks and
puts queen 2 in the next possible position at (2,4).
Then queen 3 is placed at (3,2), which proves to be another dead end.
The algorithm then backtracks all the way to queen 1 and moves it to (1,2). Queen 2 then
goes to (2,4), queen 3 to (3,1), and queen 4 to (4,3), which is a solution to the problem.
(x denotes an unsuccessful attempt to place a queen in the indicated column. The numbers
above the nodes indicate the order in which the nodes are generated)
If other solutions need to be found, the algorithm can simply resume its operations at the leaf
at which it stopped. Alternatively, we can use the board‘s symmetry for this purpose.
• If we imagine the squares of the chessboard being numbered as indices of a two-dimensional
array A[1..n, 1..n], then every element on the same diagonal running from upper left to lower
right has the same "row − column" value.
• Similarly every element on the same diagonal running from upper right to lower left has the
same "row + column" value. Suppose that two queens are placed at positions (i, j) and (k, l).
They are on the same diagonal if and only if
i − j = k − l or i + j = k + l, or equivalently, j − l = i − k or j − l = k − i.
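Using exactly this diagonal test, a Python sketch of the n-queens backtracking algorithm (0-based indices rather than the 1-based A[1..n, 1..n] above; names are ours) is:

```python
def solve_n_queens(n):
    """Return all solutions as tuples giving the column of the queen in each row.

    Queens at (row, col) and (r, c) clash iff col == c (same column),
    row - col == r - c (same \\ diagonal), or row + col == r + c (same / diagonal).
    Rows never clash because each queen gets its own row by construction.
    """
    solutions = []

    def place(cols):
        row = len(cols)
        if row == n:                             # all n queens placed
            solutions.append(tuple(cols))
            return
        for col in range(n):
            if all(col != c                      # not the same column
                   and row - col != r - c        # not the same \ diagonal
                   and row + col != r + c        # not the same / diagonal
                   for r, c in enumerate(cols)):
                place(cols + [col])              # develop this partial solution

    place([])
    return solutions
```

As the text notes, this finds no solutions for n = 2 and n = 3, two solutions for n = 4, and 92 for n = 8.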
0/1 KNAPSACK PROBLEM
Given n positive weights wi, n positive profits pi, and a positive number M which is the
knapsack capacity, the 0/1 knapsack problem calls for choosing a subset of the weights such
that
Σ (i = 1 to n) wi·xi ≤ M and
Σ (i = 1 to n) pi·xi is maximized.
The solution space for this problem consists of the 2^n distinct ways to assign zero or one
values to the xi's.
Thus the solution space is the same as that for the sum of the subsets problem.
Bounding function is needed to help kill some live nodes without actually expanding them.
A good bounding function for this problem is obtained by using an upper bound on the value
of the best feasible solution obtainable by expanding the given live node and any of its
descendants. If this upper bound is not higher than the value of the best solution determined
so far then that live node may be killed.
procedure BOUND(p,w,k,M)
// p: the current profit total
// w: the current weight total
// k : the index of the last removed item
// M : the knapsack size
// the return result is a new profit
global n , P(1:n) , W(1:n)
integer k, i ,real b,c,p,w, M
b := p ; c := w
for i := k+1 to n do
c := c + W(i)
if c < M then b := b + P(i)
else return (b + (1 - (c - M)/W(i))*P(i))
endif
repeat
return (b)
end BOUND
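A Python restatement of procedure BOUND (0-based indices; it assumes, as stated with procedure Knapsack below, that items are pre-sorted so P(i)/W(i) is non-increasing):

```python
def bound(p, w, k, M, P, W):
    """Upper bound on the best profit reachable from a node with current
    profit p, current weight w, and items k..n-1 still undecided.

    Assumes P[i]/W[i] is non-increasing. Items are added greedily; the first
    item that overflows capacity M is added fractionally, which yields a
    valid upper bound for the 0/1 choices.
    """
    b, c = p, w
    for i in range(k, len(W)):
        c += W[i]
        if c < M:
            b += P[i]
        else:
            # add only the fitting fraction of item i and stop
            return b + (1 - (c - M) / W[i]) * P[i]
    return b
```

For example, with P = (10, 10, 12, 18), W = (2, 4, 6, 9) and M = 15, the bound at the root is 32 + (1/3)·18 = 38.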
Remark :
It follows that the bound for a feasible left child ( x(k) = 1) of a node Z is the same as that for
Z. Hence, the bounding function need not be used whenever the backtracking algorithm
makes a move to the left child of a node. Since the backtracking algorithm will attempt a
left child move whenever given a choice between a left and a right child, the bounding
function need be used only after a series of successful left child moves (i.e., moves to
feasible left children).
procedure Knapsack(M,n,W,P, fw,fp,X)
// M : the size of the knapsack
// n : the number of the weights and profits
// W(1:n) : the weights
// P(1:n) : the corresponding profits ; P(i)/W(i) >= P(i+1)/W(i+1),
// fw : the final weight of the knapsack
// fp : the final maximum profit
// X(1:n), either zero or one ; X(k) = 0 if W(k) is not in the knapsack else X(k) = 1
1. integer n,k, Y(1:n), i , X(1:n) ; real M, W(1:n), P(1:n), fw, fp, cw, cp ;
2. cw := cp := 0 ; k := 1 ; fp := -1 // cw = current weight, cp = current profit
3. loop
4. while k <= n and cw + W(k) <= M do // place k into knapsack
5. cw := cw + W(k) ; cp := cp + P(k) ; Y(k) := 1 ; k := k+1
6. repeat
7. if k > n then fp := cp; fw := cw ; k := n ; X := Y // update the solution
8. else Y(k) := 0 // M is exceeded so object k does not fit
9. endif
10. while BOUND(cp,cw,k,M) <= fp do // after fp is set above, BOUND = fp
11. while k!= 0 and Y(k) != 1 do
12. k := k -1 // find the last weight included in the knapsack
13. repeat
14. if k = 0 then return endif // the algorithm ends here
15. Y(k) := 0 ; cw := cw - W(k) ; cp := cp - P(k) // remove the k-th item
16. repeat
17. k := k+1
18. repeat
19. end knapsack
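For reference, a recursive Python sketch of the same algorithm (variable names are ours; it combines the bounding function above with the left-child-first rule from the remark):

```python
def knapsack(M, P, W):
    """0/1 knapsack by backtracking with a greedy fractional upper bound.

    Assumes P[i]/W[i] is non-increasing. Returns (best profit, best X),
    where X[i] = 1 iff item i is in the knapsack.
    """
    n = len(W)
    best = {"fp": -1, "x": None}

    def ub(p, w, k):                         # upper bound, as in BOUND above
        b, c = p, w
        for i in range(k, n):
            c += W[i]
            if c < M:
                b += P[i]
            else:
                return b + (1 - (c - M) / W[i]) * P[i]
        return b

    def go(k, cw, cp, x):
        if k == n:                           # a leaf: record if it beats fp
            if cp > best["fp"]:
                best["fp"], best["x"] = cp, x[:]
            return
        if cw + W[k] <= M:                   # left child: include item k
            x.append(1)                      # (no bound test needed here)
            go(k + 1, cw + W[k], cp + P[k], x)
            x.pop()
        if ub(cp, cw, k + 1) > best["fp"]:   # right child only if promising
            x.append(0)
            go(k + 1, cw, cp, x)
            x.pop()

    go(0, 0, 0, [])
    return best["fp"], best["x"]
```

On the instance P = (10, 10, 12, 18), W = (2, 4, 6, 9), M = 15 this returns profit 38 with X = (1, 1, 0, 1).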
BRANCH AND BOUND
Procedure B&B()
begin
E: nodepointer;
E := new(node); -- this is the root node which
-- is the dummy start node
H: heap; -- A heap for all the live nodes
-- H is a min-heap for minimization problems,
-- and a max-heap for maximization problems.
while (true) do
if (E is a final leaf) then
-- E is an optimal solution
print out the path from E to the root;
return;
endif
Expand(E);
if (H is empty) then
report that there is no solution;
return;
endif
E := delete-top(H);
endwhile
end
Procedure Expand(E)
begin
- Generate all the children of E;
- Compute the approximate cost value CC of each child;
- Insert each child into the heap H;
end
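The skeleton above can be sketched in Python with a min-heap (for a minimization problem). The helper functions passed in are assumptions standing in for the problem-specific parts:

```python
import heapq

def branch_and_bound(root, is_final_leaf, children, cost):
    """Best-first branch and bound (minimization), mirroring B&B() above.

    root          -- the dummy start node
    is_final_leaf -- True if a node represents a complete solution
    children      -- Expand(E): the child nodes of E
    cost          -- the approximate cost CC used to order the min-heap;
                     it must be a lower bound for optimality of the result
    """
    heap = []
    counter = 0                       # tie-breaker so nodes need not be comparable
    E = root
    while True:
        if is_final_leaf(E):
            return E                  # first final leaf popped is optimal
        for child in children(E):     # Expand(E): generate and insert children
            counter += 1
            heapq.heappush(heap, (cost(child), counter, child))
        if not heap:
            return None               # heap empty: no solution exists
        _, _, E = heapq.heappop(heap) # E := delete-top(H)
```

Because the heap always yields the live node of smallest bound, the first complete leaf removed from it cannot be beaten by any remaining node.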
TRAVELLING SALESMAN PROBLEM
Given a graph (cities) and weights on the edges (distances), find a minimum weight tour of
the cities:
– Start in a particular city
– Visit all other cities (exactly once each)
– Return to the starting city
• Cannot be done by brute force, as this has worst-case exponential (or worse) running time
– So we will look to backtracking with pruning to make it run in a reasonable amount of time
in most cases
• We will build our state space by:
– Having our children be all the potential cities we can go to next
– Having the depth of the tree be equal to the number of cities in the graph
• we need to visit each city exactly once
• Now we need to add bounding to this problem
– It is a minimization problem so we need to find a lower bound
• We can use:
– The current cost of getting to the node plus
– An underestimate of the future cost of going through the rest of the cities
Top-level outline of the algorithm
1. Compute a bound for the root node (the starting city) and mark it unexplored.
2. Repeat the following step until a solution (i.e., a complete circuit, represented by a
terminal node) has been found and no unexplored non-terminal node has a smaller bound
than the length of the best solution found: – Choose an unexplored non-terminal node with
the smallest bound, and process it
3. When a solution has been found and no unexplored non-terminal node has a smaller
bound than the length of the best solution found, then the best solution found is optimal.
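Putting the pieces together, a hedged Python sketch of branch and bound for TSP on a distance matrix (our names throughout), using the kind of lower bound suggested above: cost so far plus, as an underestimate of the future cost, the cheapest outgoing edge of each city still to be left:

```python
import heapq

def tsp_branch_and_bound(dist):
    """Best-first branch and bound for TSP; dist is a symmetric matrix.

    Lower bound = cost of the partial path + cheapest edge leaving the
    current city + cheapest edge leaving each unvisited city. Every tour
    completion must use one edge out of each of those cities, so this
    never overestimates the true tour cost.
    """
    n = len(dist)
    cheapest = [min(dist[i][j] for j in range(n) if j != i) for i in range(n)]

    def lower_bound(path, cost):
        remaining = [v for v in range(n) if v not in path]
        return cost + cheapest[path[-1]] + sum(cheapest[v] for v in remaining)

    best_cost, best_tour = float("inf"), None
    heap = [(lower_bound([0], 0), [0], 0)]       # start at city 0
    while heap:
        bound, path, cost = heapq.heappop(heap)
        if bound >= best_cost:
            continue                             # prune: cannot beat best tour
        if len(path) == n:                       # all cities visited: close tour
            total = cost + dist[path[-1]][0]
            if total < best_cost:
                best_cost, best_tour = total, path + [0]
            continue
        for v in range(n):
            if v not in path:                    # children: each unvisited city
                c = cost + dist[path[-1]][v]
                heapq.heappush(heap, (lower_bound(path + [v], c), path + [v], c))
    return best_cost, best_tour
```

On the common 4-city example with distances [[0,10,15,20],[10,0,35,25],[15,35,0,30],[20,25,30,0]] the optimal tour has cost 80.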
Example: Traveling Salesman Problem
Decision Problem
There are many problems for which the answer is a Yes or a No. These types of problems
are known as decision problems. For example,
Whether a given graph can be colored by only 4-colors.
Finding a Hamiltonian cycle in a graph is not a decision problem, whereas checking whether
a graph is Hamiltonian or not is a decision problem.
• Let’s start by reminding ourselves of some common functions, ordered by how fast they
grow.
constant O(1)
logarithmic O(log n)
linear O(n)
n-log-n O(n log n)
quadratic O(n^2)
cubic O(n^3)
exponential O(k^n), e.g. O(2^n)
factorial O(n!)
super-exponential e.g. O(n^n)
• Computer Scientists divide these functions into two classes:
Polynomial functions: Any function that is O(n^k), i.e. bounded from above by n^k for some
constant k. E.g. O(1), O(log n), O(n), O(n log n), O(n^2), O(n^3). This is really a different
definition of the word 'polynomial' from the one we had in a previous lecture.
Exponential functions: The remaining functions. E.g. O(2^n), O(n!), O(n ^n)
• On the basis of this classification of functions into polynomial and exponential, we can
classify algorithms:
• Here are examples of tractable problems (ones with known polynomial-time algorithms):
– Sorting a list
• Here are examples of intractable problems (ones that have been proven to have no
polynomial-time algorithm).
– Some of them require a non-polynomial amount of output, so they clearly will take
a non-polynomial amount of time, e.g.:
∗ Towers of Hanoi: we can prove that any algorithm that solves this problem must
have a worst-case running time that is at least 2^n − 1.
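A quick illustration of why: the standard recursive solution satisfies T(n) = 2T(n−1) + 1, which solves to T(n) = 2^n − 1 moves, and every move must be output, so the running time is exponential purely because of the output size.

```python
def hanoi(n, src="A", aux="B", dst="C", moves=None):
    """Move n disks from src to dst and return the list of moves made.

    Recurrence: T(n) = 2*T(n-1) + 1, hence len(result) == 2**n - 1.
    """
    if moves is None:
        moves = []
    if n > 0:
        hanoi(n - 1, src, dst, aux, moves)   # park n-1 disks on the spare peg
        moves.append((src, dst))             # move the largest disk
        hanoi(n - 1, aux, src, dst, moves)   # stack the n-1 disks on top of it
    return moves
```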
• Generally we think of problems that are solvable by polynomial time algorithms as being
tractable, and problems that require superpolynomial time as being intractable.
• Sometimes the line between what is an ‘easy’ problem and what is a ‘hard’ problem is a
fine one.
• For example, “Find the shortest path from vertex x to vertex y in a given weighted graph”.
This can be solved efficiently without much difficulty.
• However, if we ask for the longest path (without cycles) from x to y, we have a problem for
which no one knows a solution better than an exhaustive search
P and NP class
P is the class of all decision problems that are polynomially bounded. The implication is that
a decision problem X ∈ P can be solved in polynomial time on a deterministic computation
model (such as a deterministic Turing machine).
NP represents the class of decision problems which can be solved in polynomial time by a
non-deterministic model of computation. That is, a decision problem X ∈ NP can be solved
in polynomial-time on a non-deterministic computation model (such as a non-deterministic
Turing machine). A non-deterministic model can make the right guesses on every move and
race towards the solution much faster than a deterministic model.
Deterministic v Non-Deterministic
• Let us now define some terms – P: The set of all problems that can be solved by
deterministic algorithms in polynomial time
• By deterministic we mean that at any time during the operation of the algorithm, there is
only one thing that it can do next
• A nondeterministic algorithm, when faced with a choice of several options, has the power to
"guess" the right one
• Using this idea we can define NP problems as, – NP:The set of all problems that can be
solved by nondeterministic algorithms in polynomial time.
As an example, let us consider the decision version of TSP: Given a complete, weighted
graph and an integer k, does there exist a Hamiltonian cycle with total weight at most k?
A smart non-deterministic algorithm for the above problem starts with a vertex, guesses
the correct edge to choose, proceeds to the next vertex, guesses the correct edge to choose
there, etc. and in polynomial time discovers a Hamiltonian cycle of least cost and
provides an answer to the above problem. This is the power of non-determinism. A
deterministic algorithm here will have no choice but to take super-polynomial time to
answer the above question.
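What a deterministic machine can do in polynomial time is check a guessed tour. A sketch of such a verifier (names are ours) shows why the decision version of TSP is in NP, since the certificate is just the tour itself:

```python
def verify_tsp_certificate(dist, k, tour):
    """Polynomial-time check of a 'yes' certificate for decision TSP:
    is there a Hamiltonian cycle of total weight at most k?

    tour lists the n cities in visiting order; the cycle closes from the
    last city back to the first. Verification is O(n), which is what
    places the decision problem in NP.
    """
    n = len(dist)
    if sorted(tour) != list(range(n)):   # must visit every city exactly once
        return False
    weight = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
    return weight <= k
```

A non-deterministic machine "guesses" a tour that this check accepts, if one exists.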
P versus NP
Every decision problem that is solvable by a deterministic polynomial time algorithm is also
solvable by a polynomial time non-deterministic algorithm.
All problems in P can be solved with polynomial time algorithms, whereas all problems
in NP - P are intractable.
It is not known whether P = NP. However, many problems are known in NP with the
property that if they belong to P, then it can be proved that P = NP.
If P ≠ NP, there are problems in NP that are neither in P nor in NP-Complete.
The problem belongs to class P if it’s easy to find a solution for the problem. The problem
belongs to NP, if it’s easy to check a solution that may have been very tedious to find.
Is P = NP?
• Obviously, any problem in P is also in NP, but it is not known whether the converse holds
• To show that a problem is in NP, we need only find a polynomial-time algorithm to check
that a given solution (the guessed solution) is valid.
• But the idea of nondeterminism seems silly. A problem is in NP if we can ‘guess’ a solution
and verify it in polynomial time!!
• No one has yet been able to find a problem that can be proven to be in NP but not in P
NP-completeness
The theory of NP-completeness is a solution to the practical problem of applying complexity
theory to individual problems. NP-complete problems are defined in a precise sense as the
hardest problems in NP. Even though we don't know whether there is any problem in NP that is
not in P, we can point to an NP-complete problem and say that if there are any hard problems
in NP, that problem is one of the hard ones.
So if we believe that P and NP are unequal, and we prove that some problem is NP-complete,
we should believe that it doesn't have a fast algorithm.
For unknown reasons, most problems we've looked at in NP turn out either to be in P or NP-
complete. So the theory of NP-completeness turns out to be a good way of showing that a
problem is likely to be hard, because it applies to a lot of problems. But there are problems
that are in NP, not known to be in P, and not likely to be NP-complete; for instance the code-
breaking example I gave earlier.
Examples of NP-Complete
• Travelling Salesman Problem: Given a set of cities and distances between all pairs, find a
tour of all the cities of distance less than M.
• Hamiltonian Cycle: Given a graph, find a simple cycle that includes all the vertices.
• Partition: Given a set of integers, can they be divided into two sets whose sum is equal?
• Vertex Cover: Given a graph and an integer N, is there a set of fewer than N vertices which
touches all the edges?
• At present no algorithms exist that are guaranteed to solve any of the NP-complete problems
efficiently
• Remember, if we could find one then we could solve all the NP-complete problems
• In the meantime can we find 'adequate' solutions?
• One approach is to seek an approximate solution which may not be the optimal but is close
to the optimal
• Another approach is to focus on the average case and develop an algorithm that works for
most, but not all, cases
It is easy to show that P ⊆ NP. However, it is unknown whether P = NP. In fact, this question
is perhaps the most celebrated of all open problems in Computer Science.
Polynomial Time Reductions
Let A and B be two problems whose instances require as an answer either a "yes" or a "no"
(3SAT and Hamiltonian cycle are two good examples). A reduction from A to B is a polynomial
time algorithm R which transforms inputs of A to equivalent inputs of B. That is, given an
input x to problem A, R will produce an input R(x) to problem B, such that x is a "yes" input
of A if and only if R(x) is a "yes" input of B. A reduction from A to B, together with a
polynomial time algorithm for B, constitutes a polynomial algorithm for A. For any input x of
A of size n, the reduction R takes time p(n) (a polynomial) to produce an equivalent input
R(x) of B. Now, this input R(x) can have size at most p(n), since this is the largest input R can
conceivably construct in p(n) time. If we now submit this input to the assumed algorithm for
B, running in time q(m) on inputs of size m, where q is another polynomial, then we get the
right answer for x within a total number of steps at most p(n) + q(p(n)), also a polynomial.
We have seen many reductions so far, establishing that problems are easy (e.g., from
matching to max-flow). In this part of the class we shall use reductions in a more sophisticated
and counterintuitive context, in order to prove that certain problems are hard. If we reduce A
to B, we are essentially establishing that, give or take a polynomial, A is no harder than B.
We could write this as A ≤ B, an inequality between the complexities of the two problems. If
we know B is easy, this establishes that A is easy. If we know A is hard, this establishes that
B is hard.
We say that A is easier than B, and write A < B, if we can write down an algorithm for
solving A that uses a small number of calls to a subroutine for B (with everything outside the
subroutine calls being fast, polynomial time). There are several minor variations of this
definition depending on the detailed meaning of "small" -- it may be a polynomial number of
calls, a fixed constant number, or just one call.
So "easier" in this context means that if one problem can be solved in polynomial time, so
can the other. It is possible for the algorithms for A to be slower than those for B, even
though A < B.
As an example, consider the Hamiltonian cycle problem. Does a given graph have a cycle
visiting each vertex exactly once? Here's a solution, using longest path as a subroutine:
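The missing solution can be sketched as follows (a reconstruction under assumptions: `longest_path(G, u, v)` is taken to be an oracle returning the number of edges on a longest simple path from u to v, and the graph exposes its vertex and edge lists):

```python
def has_hamiltonian_cycle(G, longest_path):
    """Decide Hamiltonian cycle using a longest-path subroutine.

    G has attributes G.vertices (a list) and G.edges (a list of pairs).
    A Hamiltonian cycle exists iff some edge (u, v) closes a simple
    u-to-v path with n-1 edges, i.e. one covering all n vertices.
    """
    n = len(G.vertices)
    for (u, v) in G.edges:                   # one subroutine call per edge
        if longest_path(G, u, v) == n - 1:   # a Hamiltonian path from u to v
            return True                      # ...plus edge (u, v) closes the cycle
    return False
```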
This algorithm makes m calls to a longest path subroutine, and does O(m) work outside those
subroutine calls, so it shows that Hamiltonian cycle < longest path. (It doesn't show that
Hamiltonian cycle is in P, because we don't know how to solve the longest path subproblems
quickly.)
Cook's Theorem
Problem:
Satisfiability (SAT). A Boolean variable has two possible values: True (1) or False (0). A
literal is either a Boolean variable x or its negation ¬x. A Boolean formula is composed of
literals joined by the operations AND (conjunction, ∧), OR (disjunction, ∨), and perhaps
further negations.
An assignment gives a truth value to every variable in a Boolean formula. An assignment is
said to be satisfying if the formula evaluates to 1. A clause is a set of literals joined by OR.
Note that "if-then" conditions can be rewritten as clauses. A Boolean formula is in
conjunctive normal form (CNF) if it consists of clauses joined by AND. We remark that
every Boolean function can be written equivalently as a CNF formula.
Given: a Boolean formula, either in general form or in CNF.
Goal: Find a satisfying assignment (if one exists).
Motivation: This is a fundamental problem in logic and related fields like Artificial
Intelligence. The following is only an example of a more concrete application scenario.
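Checking a given assignment, as opposed to finding one, is easy, which is exactly what puts SAT in NP. A small sketch using a common integer encoding of literals (a positive integer i stands for the variable x_i, a negative one for ¬x_i; the encoding is ours, not from the text):

```python
def is_satisfying(cnf, assignment):
    """Check a truth assignment against a CNF formula in polynomial time.

    cnf is a list of clauses; each clause is a list of integer literals.
    assignment maps each variable number to True/False. The formula is
    satisfied iff every clause (an OR of literals) has a true literal.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )
```

For example, for (x1 ∨ ¬x2) ∧ (x2 ∨ x3) ∧ (¬x1 ∨ ¬x3), the assignment x1 = x2 = 1, x3 = 0 is satisfying.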