Daa Module 6

MODULE VI

Backtracking: The Control Abstraction – The N-Queens Problem, 0/1 Knapsack Problem
Branch and Bound: Travelling Salesman Problem.
Introduction to Complexity Theory: Tractable and Intractable Problems – The P and NP
Classes – Polynomial Time Reductions – The NP-Hard and NP-Complete Classes
BACKTRACKING

Backtracking is a general algorithm for finding some or all solutions to a computational
problem, especially a constraint satisfaction problem. It can only be used for
problems that admit the concept of a “partial candidate solution” and allow a quick
test of whether a partial candidate can possibly be completed to a valid solution.
Backtracking: Depth first generation with bounding function is known as Backtracking.
• The desired solution must be expressible as an n-tuple (x1, x2, ..., xn) where xi ∈ Si and Si is a
finite set.
• Often the problem to be solved calls for finding a tuple (x1, x2, ..., xn) that satisfies (maximizes
or minimizes) a criterion function P(x1, x2, ..., xn).
• Suppose mi is the size of set Si. Then there are m = m1 · m2 · ... · mn n-tuples that are
possible candidates. The backtracking algorithm, on average, requires far fewer than m trials.
• Build up the vector one component at a time and use modified criterion
functions Pi(x1, x2, ..., xi) (also called bounding functions) to test whether the vector
being formed has any chance of success.
• If it is realized that the partial vector (x1, x2, ..., xi) can in no way lead to an optimal solution,
then mi+1 · mi+2 · ... · mn possible test vectors can be ignored entirely.
• Many problems require the solution to satisfy a complex set of constraints. These constraints
can be divided into two categories:
- Explicit
- Implicit
Explicit constraints are rules that restrict each xi to take values only from a given set.
Examples: xi = 0 or 1, i.e. Si = {0, 1}; or li ≤ xi ≤ ui, i.e. Si = {α : li ≤ α ≤ ui}.
• All tuples that satisfy the explicit constraints define a possible solution space.
Implicit constraints determine which of the tuples in the solution space actually satisfy the
criterion function.

GENERAL METHOD
The principal idea is to construct solutions one component at a time and evaluate such
partially constructed candidates as follows.
If a partially constructed solution can be developed further without violating the problem‘s
constraints, it is done by taking the first remaining legitimate option for the next component.
If there is no legitimate option for the next component, no alternatives for any remaining
component need to be considered. In this case, the algorithm backtracks to replace the last
component of the partially constructed solution with its next option.
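
The general method above can be sketched in a few lines of Python. This is only an illustrative skeleton, not the control abstraction from any particular textbook: the names `backtrack`, `choices` and `promising` are assumptions introduced here. `choices` returns the candidate values for the next component, and `promising` plays the role of the bounding function.

```python
from typing import Callable, Iterator, List, Sequence


def backtrack(
    n: int,
    choices: Callable[[List[int]], Sequence[int]],
    promising: Callable[[List[int]], bool],
) -> Iterator[List[int]]:
    """Yield every complete n-tuple all of whose prefixes pass `promising`."""
    partial: List[int] = []

    def extend() -> Iterator[List[int]]:
        if len(partial) == n:          # all components chosen: a solution
            yield list(partial)
            return
        for value in choices(partial):  # try each option for the next component
            partial.append(value)
            if promising(partial):      # extend only promising partial solutions
                yield from extend()
            partial.pop()               # backtrack: undo the last choice

    yield from extend()


# Example: the permutations of {0, 1, 2}, i.e. tuples with no repeated value.
perms = list(backtrack(3, lambda p: range(3), lambda p: len(set(p)) == len(p)))
print(perms)
```

The search is depth first: a component is undone (`partial.pop()`) as soon as every option below it has been tried or rejected.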
STATE-SPACE TREE

It is convenient to implement this kind of processing by constructing a tree of choices being
made, called the state-space tree. Its root represents an initial state before the search for a
solution begins. The nodes of the first level in the tree represent the choices made for the
first component of a solution; the nodes of the second level represent the choices for the
second component, and so on.
A node in a state-space tree is said to be promising if it corresponds to a partially
constructed solution that may still lead to a complete solution; otherwise, it is
called nonpromising.
Leaves represent either nonpromising dead ends or complete solutions found by the
algorithm.
Terminology regarding tree organization of solution space
• Problem State: Each node in a tree defines a problem state.
• State space: All paths from the root to other nodes define the state space of the problem.
• Solution states: Solution states are those problem states S for which the path from the
root to S defines a tuple in the solution space.
• Answer states: Answer states are those problem states S for which the path from the root
to S defines a tuple which is the member of the set of solutions (i.e. satisfies implicit
constraints).
• State space tree: Tree organization of the solution space is state space tree.
• Live node: A node which has been generated but not all of whose children have yet been
generated is called a live node.
• Dead node: A dead node is a generated node that is not to be expanded further, or one for
which all its children have been generated.
• E-node: The live node whose children are currently being generated is called the E-node.
• In both methods of generating the problem states, we will have a list of live nodes.

• The time required by a backtracking algorithm (its efficiency) depends on four factors:
(i) The time to generate the next X(k);
(ii) The number of X(k) satisfying the explicit constraints;
(iii) The time for the bounding functions Bi;
(iv) The number of X(k) satisfying the Bi for all i.
A backtracking algorithm on one problem instance might generate only O(n) nodes while on
a different instance might generate almost all nodes in the state space tree.
N-QUEENS PROBLEM

The problem is to place n queens on an n-by-n chessboard so that no two queens attack each
other by being in the same row, in the same column, or on the same diagonal. For n = 1,
the problem has a trivial solution, and it is easy to see that there is no solution for n = 2 and
n = 3. So let us consider the four-queens problem and solve it by the backtracking technique.
Since each of the four queens has to be placed in its own row, all we need to do is to assign a
column for each queen on the board presented in the following figure.
The explicit constraints using this formulation are Si = {1, 2, ..., n}, 1 ≤ i ≤ n. The solution
space therefore consists of n^n n-tuples.
The implicit constraints for this problem are:
1. No two xi ’s can be same (i.e., all queens must be placed on different columns).
2. No two queens can be on the same diagonal.

Steps to be followed

We start with the empty board and then place queen 1 in the first possible position of its row,
which is in column 1 of row 1.
Then we place queen 2, after trying unsuccessfully columns 1 and 2, in the first acceptable
position for it, which is square (2,3), the square in row 2 and column 3. This proves to be a
dead end because there is no acceptable position for queen 3. So the algorithm backtracks and
puts queen 2 in the next possible position at (2,4).
Then queen 3 is placed at (3,2), which proves to be another dead end.

The algorithm then backtracks all the way to queen 1 and moves it to (1,2). Queen 2 then
goes to (2,4), queen 3 to (3,1), and queen 4 to (4,3), which is a solution to the problem.
(x denotes an unsuccessful attempt to place a queen in the indicated column. The numbers
above the nodes indicate the order in which the nodes are generated)
If other solutions need to be found, the algorithm can simply resume its operations at the leaf
at which it stopped. Alternatively, we can use the board‘s symmetry for this purpose.
• If we imagine the squares of the chessboard being numbered as the indices of a two-dimensional
array A[1..n, 1..n], then every element on the same diagonal running from upper left to lower
right has the same “row − column” value.
• Similarly, every element on the same diagonal running from upper right to lower left has the
same “row + column” value. Suppose that two queens are placed at positions (i, j) and (k, l).
They are on the same diagonal iff
i - j = k - l or i + j = k + l.
The first equation implies j - l = i - k and the second implies j - l = k - i, so two queens lie
on the same diagonal iff |j - l| = |i - k|.

Algorithm for n-queen:
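
A minimal Python sketch of the approach described above, placing one queen per row and testing columns and diagonals with the |j - l| = |i - k| condition. The function names `place` and `n_queens` are illustrative assumptions; columns are 0-based internally and reported 1-based to match the text.

```python
from typing import List


def place(x: List[int], k: int, col: int) -> bool:
    """True if a queen may go in row k, column col, given rows 0..k-1 in x."""
    for row in range(k):
        same_column = x[row] == col
        same_diagonal = abs(x[row] - col) == abs(row - k)
        if same_column or same_diagonal:
            return False
    return True


def n_queens(n: int) -> List[List[int]]:
    """Return all solutions; each lists the 1-based column of the queen in each row."""
    solutions: List[List[int]] = []
    x = [0] * n  # x[k] is the column chosen for row k

    def solve(k: int) -> None:
        if k == n:                      # every row has a queen
            solutions.append([c + 1 for c in x])
            return
        for col in range(n):            # try columns left to right
            if place(x, k, col):
                x[k] = col
                solve(k + 1)            # returning here is the backtrack step

    solve(0)
    return solutions


print(n_queens(4))  # first solution found is the (2, 4, 1, 3) placement traced above
```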


0/1 KNAPSACK PROBLEM

Given n positive weights wi, n positive profits pi, and a positive number M which is the
knapsack capacity, the 0/1 knapsack problem calls for choosing a subset of the weights such
that
$\sum_{i=1}^{n} w_i x_i \le M$ and
$\sum_{i=1}^{n} p_i x_i$ is maximized.

The x's constitute a zero-one valued vector.

The solution space for this problem consists of the 2^n distinct ways to assign zero or one
values to the x's.
Thus the solution space is the same as that for the sum of subsets problem.

Bounding function is needed to help kill some live nodes without actually expanding them.
A good bounding function for this problem is obtained by using an upper bound on the value
of the best feasible solution obtainable by expanding the given live node and any of its
descendants. If this upper bound is not higher than the value of the best solution determined
so far then that live node may be killed.

Here we use the fixed tuple size formulation.

If at node Z the values of xi, 1 ≤ i ≤ k, have already been determined, then an upper bound for
Z can be obtained by relaxing the requirement xi = 0 or 1 to 0 ≤ xi ≤ 1 for k+1 ≤ i ≤ n
and using the greedy method to solve the relaxed problem.

Procedure BOUND(p,w,k,M) determines an upper bound on the best solution obtainable by
expanding any node Z at level k+1 of the state space tree.

The object weights and profits are W(i) and P(i).

p = $\sum_{i=1}^{k} P(i) X(i)$, and it is assumed that P(i)/W(i) ≥ P(i+1)/W(i+1), 1 ≤ i < n.

Algorithm for 0/1 knapsack:

procedure BOUND(p,w,k,M)
// p : the current profit total
// w : the current weight total
// k : the index of the last removed item
// M : the knapsack size
// the return result is an upper bound on the achievable profit
global n, P(1:n), W(1:n)
integer k, i ; real b, c, p, w, M
b := p ; c := w
for i := k+1 to n do
    c := c + W(i)
    if c < M then b := b + P(i)
    else return (b + (1 - (c - M)/W(i))*P(i))
    endif
repeat
return (b)
end BOUND
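
For readers who prefer runnable code, here is a hedged Python transcription of BOUND. It is a sketch under two assumptions that differ from the pseudocode: indices are 0-based, so `k` is the number of items already decided, and `P` and `W` are passed as arguments rather than globals. Items must already be sorted by non-increasing profit/weight ratio. The four-item instance is hypothetical.

```python
from typing import Sequence


def bound(p: float, w: float, k: int, M: float,
          P: Sequence[float], W: Sequence[float]) -> float:
    """Greedy fractional upper bound on profit, given items 0..k-1 are decided."""
    b, c = p, w
    for i in range(k, len(P)):
        c += W[i]
        if c < M:
            b += P[i]                    # item i fits entirely
        else:
            # only the fraction of item i that still fits is counted
            return b + (1 - (c - M) / W[i]) * P[i]
    return b


# Hypothetical instance: ratios 5, 2.5, 2, 2 are non-increasing.
P = [10, 10, 12, 18]
W = [2, 4, 6, 9]
print(bound(0, 0, 0, 15, P, W))  # root bound: items 0..2 whole plus 3/9 of item 3
```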
Remark:

It follows that the bound for a feasible left child (X(k) = 1) of a node Z is the same as that for
Z. Hence, the bounding function need not be used whenever the backtracking algorithm
makes a move to the left child of a node. Since the backtracking algorithm will attempt
to make a left child move whenever given a choice between a left and a right child, the bounding
function need be used only after a series of successful left child moves (i.e., moves to feasible
left children).
procedure Knapsack(M,n,W,P,fw,fp,X)
// M : the size of the knapsack
// n : the number of weights and profits
// W(1:n) : the weights
// P(1:n) : the corresponding profits ; P(i)/W(i) >= P(i+1)/W(i+1)
// fw : the final weight of the knapsack
// fp : the final maximum profit
// X(1:n) : either zero or one ; X(k) = 0 if W(k) is not in the knapsack, else X(k) = 1

integer n, k, Y(1:n), i, X(1:n) ; real M, W(1:n), P(1:n), fw, fp, cw, cp
cw := cp := 0 ; k := 1 ; fp := -1 // cw = current weight, cp = current profit
loop
    while k <= n and cw + W(k) <= M do // place k into knapsack
        cw := cw + W(k) ; cp := cp + P(k) ; Y(k) := 1 ; k := k+1
    repeat
    if k > n then fp := cp ; fw := cw ; k := n ; X := Y // update the solution
    else Y(k) := 0 // M is exceeded so object k does not fit
    endif
    while BOUND(cp,cw,k,M) <= fp do // after fp is set above, BOUND = fp
        while k != 0 and Y(k) != 1 do
            k := k - 1 // find the last weight included in the knapsack
        repeat
        if k = 0 then return endif // the algorithm ends here
        Y(k) := 0 ; cw := cw - W(k) ; cp := cp - P(k) // remove the k-th item
    repeat
    k := k + 1
repeat
end Knapsack
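
The same idea can be written recursively. The sketch below is not a line-by-line translation of the iterative procedure above; it is an assumed, simplified variant that tries the left child (take item k) first and uses the greedy fractional bound only before moving to a right child, as the remark suggests. Names are illustrative, indices are 0-based, and items are assumed sorted by non-increasing profit/weight ratio.

```python
from typing import List, Tuple


def knapsack_backtrack(M: float, W: List[float], P: List[float]) -> Tuple[float, List[int]]:
    """Return (best profit, 0/1 selection vector) for capacity M."""
    n = len(W)
    best_profit: float = -1
    best_x: List[int] = []
    y = [0] * n  # current partial selection

    def bound(cp: float, cw: float, k: int) -> float:
        # Greedy fractional upper bound over the undecided items k..n-1.
        b, c = cp, cw
        for i in range(k, n):
            c += W[i]
            if c < M:
                b += P[i]
            else:
                return b + (1 - (c - M) / W[i]) * P[i]
        return b

    def search(k: int, cw: float, cp: float) -> None:
        nonlocal best_profit, best_x
        if k == n:
            if cp > best_profit:
                best_profit, best_x = cp, y.copy()
            return
        if cw + W[k] <= M:                     # left child: item k fits
            y[k] = 1
            search(k + 1, cw + W[k], cp + P[k])
            y[k] = 0
        if bound(cp, cw, k + 1) > best_profit:  # right child only if it can still win
            search(k + 1, cw, cp)

    search(0, 0, 0)
    return best_profit, best_x


# Hypothetical instance with capacity 15.
profit, selection = knapsack_backtrack(15, [2, 4, 6, 9], [10, 10, 12, 18])
print(profit, selection)
```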
BRANCH AND BOUND

• Branch and bound (B&B) is a systematic method for solving optimization problems.
• B&B is a rather general optimization technique that applies where the greedy method
and dynamic programming fail.
• However, it is much slower. Indeed, it often leads to exponential time complexities in
the worst case.
• On the other hand, if applied carefully, it can lead to algorithms that run reasonably
fast on average.
• The general idea of B&B is a BFS-like search for the optimal solution, but not all
nodes get expanded (i.e., their children generated). Rather, a carefully selected
criterion determines which node to expand and when, and another criterion tells the
algorithm when an optimal solution has been found.

The General Branch and Bound Algorithm

• Each solution is assumed to be expressible as an array X[1:n].
• A predictor, called an approximate cost function CC, is assumed to have been defined.

Procedure B&B()
begin
    E: nodepointer;
    E := new(node);   -- this is the root node, which
                      -- is the dummy start node
    H: heap;          -- a heap for all the live nodes
                      -- H is a min-heap for minimization problems,
                      -- and a max-heap for maximization problems.
    while (true) do
        if (E is a final leaf) then
            -- E is an optimal solution
            print out the path from E to the root;
            return;
        endif
        Expand(E);
        if (H is empty) then
            report that there is no solution;
            return;
        endif
        E := delete-top(H);
    endwhile
end

Procedure Expand(E)
begin
    - Generate all the children of E;
    - Compute the approximate cost value CC of each child;
    - Insert each child into the heap H;
end

TRAVELING SALESMAN PROBLEM

Given a graph (cities) and weights on the edges (distances), find a minimum-weight tour of
the cities:
– Start in a particular city
– Visit all other cities (exactly once each)
– Return to the starting city
• Brute force is impractical: with n cities there are (n − 1)! candidate tours, so its running
time is worse than exponential
– So we will look to backtracking with pruning to make it run in a reasonable amount of time
in most cases
• We will build our state space by:
– Having our children be all the potential cities we can go to next
– Having the depth of the tree be equal to the number of cities in the graph
• We need to visit each city exactly once
• Now we need to add bounding to this problem
– It is a minimization problem, so we need to find a lower bound
• We can use:
– The current cost of getting to the node, plus
– An underestimate of the future cost of going through the rest of the cities
Top-level outline of the algorithm

1. Draw and initialize the root node.

2. Repeat the following step until a solution (i.e., a complete circuit, represented by a
terminal node) has been found and no unexplored non-terminal node has a smaller bound
than the length of the best solution found:
– Choose an unexplored non-terminal node with the smallest bound, and process it.

3. When a solution has been found and no unexplored non-terminal node has a smaller
bound than the length of the best solution found, the best solution found is optimal.
Example: Traveling Salesman Problem
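
A minimal Python sketch of the outline above, run on a small hypothetical 4-city distance matrix. The lower bound used here is one simple choice and an assumption of this sketch: the cost so far, plus the cheapest outgoing edge of the current city and of every unvisited city. Other bounds (for example, reduced cost matrices) are common. The function name and the matrix are illustrative, and at least two cities are assumed.

```python
import heapq
from typing import List, Tuple


def tsp_branch_and_bound(dist: List[List[float]]) -> Tuple[float, List[int]]:
    """Best-first branch and bound for TSP, starting and ending at city 0."""
    n = len(dist)
    # Cheapest edge leaving each city: an underestimate of the cost of leaving it.
    min_out = [min(dist[i][j] for j in range(n) if j != i) for i in range(n)]

    def lower_bound(cost: float, path: List[int]) -> float:
        visited = set(path)
        rest = sum(min_out[c] for c in range(n) if c not in visited)
        return cost + min_out[path[-1]] + rest

    best_cost, best_tour = float("inf"), []
    heap = [(lower_bound(0, [0]), 0, [0])]        # (bound, cost so far, path)
    while heap:
        bound, cost, path = heapq.heappop(heap)   # live node with the smallest bound
        if bound >= best_cost:
            break                                 # no live node can beat the best tour
        if len(path) == n:                        # terminal node: close the circuit
            total = cost + dist[path[-1]][0]
            if total < best_cost:
                best_cost, best_tour = total, path + [0]
            continue
        for city in range(n):                     # children: every unvisited city
            if city not in path:
                new_cost = cost + dist[path[-1]][city]
                new_path = path + [city]
                b = lower_bound(new_cost, new_path)
                if b < best_cost:                 # prune nodes that cannot improve
                    heapq.heappush(heap, (b, new_cost, new_path))
    return best_cost, best_tour


# Hypothetical symmetric instance.
dist = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]
cost, tour = tsp_branch_and_bound(dist)
print(cost, tour)
```

The loop stops exactly when the condition in step 3 holds: a tour has been found and the smallest bound among the remaining live nodes is not smaller than its length.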

INTRODUCTION TO COMPLEXITY THEORY


Turing machine
• A Turing machine is a hypothetical device that manipulates symbols on a strip of tape
according to a table of rules.
• Despite its simplicity, a Turing machine can be adapted to simulate the logic of any
computer algorithm, and is particularly useful in explaining the functions of a CPU inside a
computer.
In Computer Science, many problems are solved where the objective is to maximize or
minimize some values, whereas in other problems we try to find whether there is a solution
or not. Hence, the problems can be categorized as follows –
Optimization Problem
Optimization problems are those for which the objective is to maximize or minimize some
value. For example,
• Finding the minimum number of colors needed to color a given graph.
• Finding the shortest path between two vertices in a graph.

Decision Problem
There are many problems for which the answer is a Yes or a No. These types of problems
are known as decision problems. For example,
• Whether a given graph can be colored using only 4 colors.
• Finding a Hamiltonian cycle in a graph is not a decision problem, whereas checking whether
a graph is Hamiltonian is a decision problem.
• Let’s start by reminding ourselves of some common functions, ordered by how fast they
grow.

constant O(1)
logarithmic O(log n)
linear O(n)
n-log-n O(n × log n)
quadratic O(n^2)
cubic O(n^3)
exponential O(k^n), e.g. O(2^n)
factorial O(n!)
super-exponential, e.g. O(n^n)
• Computer Scientists divide these functions into two classes:

Polynomial functions: Any function that is O(n^k), i.e. bounded from above by n^k for some
constant k. E.g. O(1), O(log n), O(n), O(n × log n), O(n^2), O(n^3). This is really a different
definition of the word ‘polynomial’ from the one we had in a previous lecture.

Exponential functions: The remaining functions. E.g. O(2^n), O(n!), O(n^n).
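
To get a feel for the gap between the two classes, the short Python sketch below tabulates a few of these functions for small n. The helper name `growth_row` and the sample values of n are arbitrary choices made here for illustration.

```python
import math
from typing import Dict


def growth_row(n: int) -> Dict[str, float]:
    """Values of some common growth functions at input size n."""
    return {
        "log n": math.log2(n),
        "n": n,
        "n log n": n * math.log2(n),
        "n^2": n ** 2,
        "n^3": n ** 3,
        "2^n": 2 ** n,
        "n!": math.factorial(n),
    }


for n in (10, 20, 30):
    row = growth_row(n)
    print(n, row["n^3"], row["2^n"], row["n!"])
```

Already at n = 30 the cubic value is 27000 while 2^n exceeds one billion, and n! is larger still.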

• On the basis of this classification of functions into polynomial and exponential, we can
classify algorithms:

Polynomial-Time Algorithm: an algorithm whose order-of-magnitude time performance is
bounded from above by a polynomial function of n, where n is the size of its inputs.

Exponential Algorithm: an algorithm whose order-of-magnitude time performance is not
bounded from above by a polynomial function of n.

Tractable Problem: a problem that is solvable by a polynomial-time algorithm. The upper
bound is polynomial.

• Here are examples of tractable problems (ones with known polynomial-time algorithms):

– Searching an unordered list

– Searching an ordered list

– Sorting a list

– Multiplication of integers (even though there’s a gap)

– Finding a minimum spanning tree in a graph

Intractable Problem: a problem that cannot be solved by a polynomial-time algorithm. The
lower bound is exponential.

• Here are examples of intractable problems (ones that have been proven to have no
polynomial-time algorithm).

– Some of them require a non-polynomial amount of output, so they clearly will take
a non-polynomial amount of time, e.g.:
∗ Towers of Hanoi: we can prove that any algorithm that solves this problem must
have a worst-case running time that is at least 2^n − 1.

∗ List all permutations (all possible orderings) of n numbers.
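
Both examples can be checked on small inputs with a few lines of Python. The sketch below uses the standard recursive Towers of Hanoi solution, which produces exactly 2^n − 1 moves, and counts permutations with `itertools.permutations`. The peg labels "A", "B", "C" are arbitrary.

```python
from itertools import permutations
from typing import List, Tuple


def hanoi(n: int, src: str = "A", dst: str = "C", via: str = "B") -> List[Tuple[str, str]]:
    """Return the list of (from_peg, to_peg) moves that transfers n disks."""
    if n == 0:
        return []
    return (
        hanoi(n - 1, src, via, dst)      # move n-1 disks out of the way
        + [(src, dst)]                   # move the largest disk
        + hanoi(n - 1, via, dst, src)    # move the n-1 disks back on top
    )


for n in range(1, 6):
    print(n, len(hanoi(n)), 2 ** n - 1)

# The output of "list all permutations" has n! entries, e.g. 24 for n = 4.
print(len(list(permutations(range(4)))))
```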

Tractable and Intractable

• Generally we think of problems that are solvable by polynomial time algorithms as being
tractable, and problems that require superpolynomial time as being intractable.

• Sometimes the line between what is an ‘easy’ problem and what is a ‘hard’ problem is a
fine one.

• For example, “Find the shortest path from vertex x to vertex y in a given weighted graph”.
This can be solved efficiently without much difficulty.

• However, if we ask for the longest path (without cycles) from x to y, we have a problem for
which no one knows a solution better than an exhaustive search

P and NP class

P is the class of all decision problems that are polynomially bounded. That is,
a decision problem X ∈ P can be solved in polynomial time on a deterministic computation
model (such as a deterministic Turing machine).

NP represents the class of decision problems which can be solved in polynomial time by a
non-deterministic model of computation. That is, a decision problem X ∈ NP can be solved
in polynomial time on a non-deterministic computation model (such as a non-deterministic
Turing machine). A non-deterministic model can make the right guesses on every move and
race towards the solution much faster than a deterministic model.

Deterministic v Non-Deterministic

• Let us now define some terms – P: The set of all problems that can be solved by
deterministic algorithms in polynomial time

• By deterministic we mean that at any time during the operation of the algorithm, there is
only one thing that it can do next

• A nondeterministic algorithm, when faced with a choice of several options, has the power to
“guess“ the right one

• Using this idea we can define NP problems as – NP: The set of all problems that can be
solved by nondeterministic algorithms in polynomial time.

• As an example, let us consider the decision version of TSP: Given a complete, weighted
graph and an integer k, does there exist a Hamiltonian cycle with total weight at most k?
• A smart non-deterministic algorithm for the above problem starts with a vertex, guesses
the correct edge to choose, proceeds to the next vertex, guesses the correct edge to choose
there, and so on, and in polynomial time discovers a Hamiltonian cycle of least cost and
provides an answer to the above problem. This is the power of non-determinism. No
deterministic algorithm is currently known that answers the above question in polynomial
time.

P versus NP
Every decision problem that is solvable by a deterministic polynomial time algorithm is also
solvable by a polynomial time non-deterministic algorithm.
All problems in P can be solved with polynomial time algorithms, whereas all problems
in NP - P are intractable.
It is not known whether P = NP. However, many problems are known in NP with the
property that if they belong to P, then it can be proved that P = NP.
If P ≠ NP, there are problems in NP that are neither in P nor in NP-Complete.
The problem belongs to class P if it’s easy to find a solution for the problem. The problem
belongs to NP, if it’s easy to check a solution that may have been very tedious to find.

Is P = NP?

• Any problem in P is also in NP; whether the converse holds is not known

• To show that a problem is in NP, we need only find a polynomial-time algorithm to check
that a given solution (the guessed solution) is valid.

• But the idea of nondeterminism seems silly. A problem is in NP if we can ‘guess’ a solution
and verify it in polynomial time!!

• No one has yet been able to find a problem that can be proven to be in NP but not in P
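
The "guess and verify" view can be made concrete for the decision version of TSP. The sketch below is an assumed illustration, not part of the original notes: the certificate is a proposed ordering of the cities, and the check runs in time polynomial in the number of cities. The function name and the distance matrix are hypothetical.

```python
from typing import List


def verify_tsp_certificate(dist: List[List[float]], k: float, tour: List[int]) -> bool:
    """Check that `tour` visits every city exactly once with total weight <= k."""
    n = len(dist)
    if sorted(tour) != list(range(n)):   # must be a permutation of all cities
        return False
    # Sum the edges along the tour, including the edge back to the start.
    total = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
    return total <= k


dist = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]
print(verify_tsp_certificate(dist, 80, [0, 1, 3, 2]))  # tour weight 10+25+30+15
print(verify_tsp_certificate(dist, 79, [0, 1, 3, 2]))
```

Verifying a given tour is easy; what is not known is how to find one in polynomial time without the guess.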

NP-completeness
The theory of NP-completeness is a solution to the practical problem of applying complexity
theory to individual problems. NP-complete problems are defined in a precise sense as the
hardest problems in NP. Even though we don't know whether there is any problem in NP that is
not in P, we can point to an NP-complete problem and say that if there are any hard problems
in NP, that problem is one of the hard ones.

(Conversely, if everything in NP is easy, those problems are easy. So NP-completeness can be
thought of as a way of making the big P=NP question equivalent to smaller questions about
the hardness of individual problems.)

So if we believe that P and NP are unequal, and we prove that some problem is NP-complete,
we should believe that it doesn't have a fast algorithm.

For unknown reasons, most problems we've looked at in NP turn out either to be in P or NP-
complete. So the theory of NP-completeness turns out to be a good way of showing that a
problem is likely to be hard, because it applies to a lot of problems. But there are problems
that are in NP, not known to be in P, and not likely to be NP-complete; for instance the code-
breaking example I gave earlier.

Examples of NP-Complete

• Travelling Salesman Problem: Given a set of cities and distances between all pairs, is there a
tour of all the cities of total distance less than M?

• Hamiltonian Cycle: Given a graph, is there a simple cycle that includes all the vertices?

• Partition: Given a set of integers, can they be divided into two sets whose sums are equal?

• Integer Linear Programming: Given a linear program, is there an integer solution?

• Vertex Cover: Given a graph and an integer N, is there a set of fewer than N vertices which
touches all the edges?

NP-Hard Class

• NP-hard (Non-deterministic Polynomial-time hard) is a class of problems that are,
informally, "at least as hard as the hardest problems in NP".
• More precisely, a problem H is NP-hard when every problem L in NP can be reduced in
polynomial time to H.
• As a consequence, finding a polynomial algorithm to solve any NP-hard problem would
give polynomial algorithms for all the problems in NP, which is unlikely as many of them
are considered hard.

Solving These Problems

• At present no algorithms are known that are guaranteed to solve any of the NP-complete
problems efficiently

• Remember, if we could find one then we could solve all the NP-complete problems

• In the meantime, can we find ‘adequate’ solutions?
• One approach is to seek an approximate solution which may not be the optimal but is close
to the optimal

• Another approach is to focus on the average case and develop an algorithm that works for
most, but not all, cases

It is easy to show that P ⊆ NP. However, it is unknown whether P = NP. In fact, this question
is perhaps the most celebrated of all open problems in Computer Science.

POLYNOMIAL TIME REDUCTIONS

Let A and B be two problems whose instances require as an answer either a "yes" or a "no"
(3SAT and Hamilton cycle are two good examples). A reduction from A to B is a polynomial
time algorithm R which transforms inputs of A to equivalent inputs of B. That is, given an
input x to problem A, R will produce an input R(x) to problem B, such that x is a "yes" input
of A if and only if R(x) is a "yes" input of B. A reduction from A to B, together with a
polynomial time algorithm for B, constitutes a polynomial algorithm for A. For any input x of
A of size n, the reduction R takes time p(n), a polynomial, to produce an equivalent input
R(x) of B. Now, this input R(x) can have size at most p(n), since this is the largest input R can
conceivably construct in p(n) time. If we now submit this input to the assumed algorithm for
B, running in time q(m) on inputs of size m, where q is another polynomial, then we get the
right answer for x within a total number of steps at most p(n) + q(p(n)), which is also a polynomial.

We have seen many reductions so far, establishing that problems are easy (e.g., from
matching to max-flow). In this part of the class we shall use reductions in a more sophisticated
and counterintuitive context, in order to prove that certain problems are hard. If we reduce A
to B, we are essentially establishing that, give or take a polynomial, A is no harder than B.
We could write this as A ≤ B, an inequality between the complexities of the two problems. If
we know B is easy, this establishes that A is easy. If we know A is hard, this establishes that B is
hard.

We say that A is easier than B, and write A < B, if we can write down an algorithm for
solving A that uses a small number of calls to a subroutine for B (with everything outside the
subroutine calls being fast, polynomial time). There are several minor variations of this
definition depending on the detailed meaning of "small" -- it may be a polynomial number of
calls, a fixed constant number, or just one call.

Then if A < B, and B is in P, so is A: we can write down a polynomial algorithm for A by
expanding the subroutine calls to use the fast algorithm for B.

So "easier" in this context means that if one problem can be solved in polynomial time, so
can the other. It is possible for the algorithms for A to be slower than those for B, even
though A < B.

As an example, consider the Hamiltonian cycle problem. Does a given graph have a cycle
visiting each vertex exactly once? Here's a solution, using longest path as a subroutine:

for each edge (u,v) of G
    if there is a simple path of length n-1 from u to v
        return yes // path + edge form a cycle
return no

This algorithm makes m calls to a longest path subroutine, and does O(m) work outside those
subroutine calls, so it shows that Hamiltonian cycle < longest path. (It doesn't show that
Hamiltonian cycle is in P, because we don't know how to solve the longest path subproblems
quickly.)
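
The structure of this reduction can be sketched in Python. Since no fast longest-path algorithm is known, the subroutine `has_simple_path` below is a brute-force stand-in introduced purely for illustration; only the outer loop reflects the reduction. The graph is assumed to be simple and undirected with at least three vertices, given as a dictionary of adjacency sets.

```python
from itertools import permutations
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]


def has_simple_path(adj: Graph, u: Hashable, v: Hashable, length: int) -> bool:
    """Brute-force stand-in: is there a simple u-v path with exactly `length` edges?"""
    others = [x for x in adj if x not in (u, v)]
    for middle in permutations(others, length - 1):
        path = (u, *middle, v)
        if all(b in adj[a] for a, b in zip(path, path[1:])):
            return True
    return False


def has_hamiltonian_cycle(adj: Graph) -> bool:
    """One subroutine call per edge, as in the reduction above."""
    n = len(adj)
    for u in adj:
        for v in adj[u]:                       # each edge (u, v) of G
            if has_simple_path(adj, u, v, n - 1):
                return True                    # path + edge form a cycle
    return False


square = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}   # a 4-cycle
line = {0: {1}, 1: {0, 2}, 2: {1}}                      # a path on 3 vertices
print(has_hamiltonian_cycle(square), has_hamiltonian_cycle(line))
```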

Cook's Theorem

A problem A in NP is NP-complete when, for every other problem B in NP, B < A.

Theorem: an NP-complete problem exists.

Problem:

Satisfiability (SAT): A Boolean variable has two possible values: True (1) or False (0). A
literal is either a Boolean variable x or its negation ¬x. A Boolean formula is composed of
literals joined by the operations AND (conjunction, ∧), OR (disjunction, ∨), and perhaps further
negations. An assignment gives a truth value to every variable in a Boolean formula. An
assignment is said to be satisfying if the formula evaluates to 1. A clause is a set of literals
joined by OR. Note that “if-then” conditions can be rewritten as clauses. A Boolean formula
is in conjunctive normal form (CNF) if it consists of clauses joined by AND. We remark that
every Boolean function can be written equivalently as a CNF formula.
Given: a Boolean formula, either in general form or in CNF.
Goal: Find a satisfying assignment (if one exists).
Motivation: This is a fundamental problem in logic and related fields like Artificial
Intelligence. The following is only an example of a more concrete application scenario.
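
To make the definitions of literal, clause, CNF and satisfying assignment concrete, here is a minimal Python sketch. The encoding is an assumption of this example: a literal is a nonzero integer, where k stands for variable k and -k for its negation, and a CNF formula is a list of clauses. The brute-force search tries all 2^n assignments, so it illustrates the problem, not an efficient solution.

```python
from itertools import product
from typing import Dict, List, Optional

Clause = List[int]  # e.g. [1, -2] means (x1 OR NOT x2)


def evaluate(cnf: List[Clause], assignment: Dict[int, bool]) -> bool:
    """A CNF formula is true iff every clause contains at least one true literal."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )


def brute_force_sat(cnf: List[Clause], num_vars: int) -> Optional[Dict[int, bool]]:
    """Try all 2^num_vars assignments; return a satisfying one or None."""
    for values in product([False, True], repeat=num_vars):
        assignment = {i + 1: v for i, v in enumerate(values)}
        if evaluate(cnf, assignment):
            return assignment
    return None


# (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3)
formula = [[1, -2], [2, 3], [-1, -3]]
result = brute_force_sat(formula, 3)
print(result)
```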
