DAA Unit 1
(REGULATION 2017)
CS8451 DESIGN AND ANALYSIS OF ALGORITHMS
OBJECTIVES:
To understand and apply the algorithm analysis techniques.
To critically analyze the efficiency of alternative algorithmic solutions for the same problem
To understand different algorithm design techniques.
To understand the limitations of Algorithmic power.
UNIT I - INTRODUCTION 9
Notion of an Algorithm – Fundamentals of Algorithmic Problem Solving – Important Problem Types –
Fundamentals of the Analysis of Algorithmic Efficiency –Asymptotic Notations and their properties. Analysis
Framework – Empirical analysis – Mathematical analysis for Recursive and Non-recursive algorithms –
Visualization
UNIT II - BRUTE FORCE AND DIVIDE-AND-CONQUER 9
Brute Force – Computing a^n – String Matching – Closest-Pair and Convex-Hull Problems – Exhaustive Search –
Travelling Salesman Problem – Knapsack Problem – Assignment problem. Divide and Conquer Methodology –
Binary Search – Merge sort – Quick sort – Heap Sort – Multiplication of Large Integers – Closest-Pair and
Convex-Hull Problems.
UNIT III - DYNAMIC PROGRAMMING AND GREEDY TECHNIQUE 9
Dynamic programming – Principle of optimality – Coin changing problem, Computing a Binomial Coefficient –
Floyd’s algorithm – Multi stage graph – Optimal Binary Search Trees – Knapsack Problem and Memory
functions. Greedy Technique – Container loading problem – Prim’s algorithm and Kruskal’s Algorithm – 0/1
Knapsack problem, Optimal Merge pattern – Huffman Trees.
UNIT IV - ITERATIVE IMPROVEMENT 9
The Simplex Method – The Maximum-Flow Problem – Maximum Matching in Bipartite Graphs, Stable marriage
Problem.
UNIT V - COPING WITH THE LIMITATIONS OF ALGORITHM POWER 9
Lower-Bound Arguments – P, NP, NP-Complete and NP-Hard Problems. Backtracking – n-Queen problem –
Hamiltonian Circuit Problem – Subset Sum Problem. Branch and Bound – LIFO Search and FIFO search –
Assignment problem – Knapsack Problem – Travelling Salesman Problem – Approximation Algorithms for
NP-Hard Problems – Travelling Salesman problem – Knapsack problem.
TOTAL: 45 PERIODS
OUTCOMES:
At the end of the course, the students should be able to:
Design algorithms for various computing problems.
Analyze the time and space complexity of algorithms.
Critically analyze the different algorithm design techniques for a given problem.
Modify existing algorithms to improve efficiency.
TEXT BOOKS:
1. Anany Levitin, “Introduction to the Design and Analysis of Algorithms”, Third Edition, Pearson
Education, 2012.
2. Ellis Horowitz, Sartaj Sahni and Sanguthevar Rajasekaran, “Computer Algorithms/C++”, Second Edition,
Universities Press, 2007.
REFERENCES:
1. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein, “Introduction to
Algorithms”, Third Edition, PHI Learning Private Limited, 2012.
2. Alfred V. Aho, John E. Hopcroft and Jeffrey D. Ullman, “Data Structures and Algorithms”, Pearson
Education, Reprint 2006.
3. Harsh Bhasin, “Algorithms Design and Analysis”, Oxford University Press, 2016.
4. S. Sridhar, “Design and Analysis of Algorithms”, Oxford University Press, 2014.
UNIT I INTRODUCTION
Algorithm
An algorithm is a step-by-step procedure that takes an input and solves a problem in a finite
amount of time to obtain the required output.
Characteristics of an algorithm:
Input: Zero or more quantities are externally supplied.
Output: At least one quantity is produced.
Definiteness: Each instruction is clear and unambiguous.
Finiteness: If the instructions of an algorithm are traced, then for all cases the algorithm must
terminate after a finite number of steps.
Efficiency: Every instruction must be very basic and must run in a short time.
Example:
The greatest common divisor (GCD) of two nonnegative integers m and n (not both zero),
denoted gcd(m, n), is defined as the largest integer that divides both m and n evenly, i.e., with a
remainder of zero.
Euclid’s algorithm is based on repeatedly applying the equality gcd(m, n) = gcd(n, m mod n),
where m mod n is the remainder of the division of m by n, until m mod n is equal to 0. Since gcd(m,
0) = m, the last value of m is also the greatest common divisor of the initial m and n.
gcd(60, 24) can be computed as follows: gcd(60, 24) = gcd(24, 12) = gcd(12, 0) = 12.
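The repeated application of gcd(m, n) = gcd(n, m mod n) can be sketched in Python as follows. This is a minimal illustration, not part of the original notes; the function name gcd and the input check are assumptions made for the example.

```python
def gcd(m: int, n: int) -> int:
    """Greatest common divisor of two nonnegative integers, not both zero."""
    if m < 0 or n < 0 or (m == 0 and n == 0):
        raise ValueError("m and n must be nonnegative and not both zero")
    while n != 0:
        # Apply gcd(m, n) = gcd(n, m mod n) until the remainder becomes 0.
        m, n = n, m % n
    return m  # gcd(m, 0) = m


print(gcd(60, 24))  # 12
```

The loop passes through the same pairs as the worked example: (60, 24), (24, 12), (12, 0).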
(ii) Decision making
Decisions are made on the following:
(a) Ascertaining the Capabilities of the Computational Device
In a random-access machine (RAM), instructions are executed one after another (the
central assumption is that one operation is executed at a time). Accordingly, algorithms
designed to be executed on such machines are called sequential algorithms.
In some newer computers, operations are executed concurrently, i.e., in parallel.
Algorithms that take advantage of this capability are called parallel algorithms.
The choice of computational devices such as processor and memory is mainly based on
space and time efficiency.
(b) Choosing between Exact and Approximate Problem Solving
The next principal decision is to choose between solving the problem exactly and
solving it approximately.
An algorithm that solves the problem exactly and produces a correct result is called
an exact algorithm.
If the problem is so complex that an exact solution cannot be obtained, then we have to
choose an approximation algorithm, i.e., one that produces an
approximate answer. E.g., extracting square roots, solving nonlinear equations, and
evaluating definite integrals.
(c) Algorithm Design Techniques
An algorithm design technique (or “strategy” or “paradigm”) is a general approach
to solving problems algorithmically that is applicable to a variety of problems from
different areas of computing.
Algorithms + Data Structures
Although algorithms and data structures are independent, they are combined
to develop a program. Hence the choice of a proper data structure is required
before designing the algorithm.
Implementation of a program is possible only with the help of both algorithms and data
structures.
An algorithmic strategy / technique / paradigm is a general approach by which
many problems can be solved algorithmically. E.g., Brute Force, Divide and
Conquer, Dynamic Programming, Greedy Technique and so on.
Pseudocode and flowchart are the two options that are most widely used nowadays for specifying
algorithms.
a. Natural Language
It is very simple and easy to specify an algorithm using natural language. But many times
the specification of an algorithm in natural language is not clear, and we get only a brief
specification.
Example: An algorithm to perform addition of two numbers.
Step 1: Read the first number, say a.
Step 2: Read the second number, say b.
Step 3: Add the above two numbers and store the result in c.
Such a specification creates difficulty while actually implementing it. Hence many programmers
prefer to specify an algorithm by means of pseudocode.
b. Pseudocode
Pseudocode is a mixture of a natural language and programming language constructs.
Pseudocode is usually more precise than natural language.
The left arrow “←” is used for assignment and two slashes “//” for comments; if conditions
and for and while loops are also used.
FIGURE 1.4 Flowchart symbols and an example flowchart for the addition of two integers.
Once an algorithm has been specified, its correctness must be proved.
An algorithm must yield the required result for every legitimate input in a finite amount of
time.
For example, the correctness of Euclid’s algorithm for computing the greatest common
divisor stems from the correctness of the equality gcd(m, n) = gcd(n, m mod n).
A common technique for proving correctness is to use mathematical induction because an
algorithm’s iterations provide a natural sequence of steps needed for such proofs.
The notion of correctness for approximation algorithms is less straightforward than it is for
exact algorithms. The error produced by the algorithm should not exceed a predefined
limit.
(iii) String processing
A string is a sequence of characters from an alphabet.
Strings of particular interest are text strings, which comprise letters, numbers, and special
characters; bit strings, which comprise zeros and ones; and gene sequences, which can be
modeled by strings of characters from the four-character alphabet {A, C, G, T}. String
processing is very useful in bioinformatics.
Searching for a given word in a text is called string matching.
(v) Combinatorial problems
These are problems that ask, explicitly or implicitly, to find a combinatorial object such as a
permutation, a combination, or a subset that satisfies certain constraints.
A desired combinatorial object may also be required to have some additional property such
as a maximum value or a minimum cost.
In practice, combinatorial problems are the most difficult problems in computing.
The traveling salesman problem and the graph coloring problem are examples of
combinatorial problems.
(vi) Geometric problems
Geometric algorithms deal with geometric objects such as points, lines, and polygons.
Geometric algorithms are used in computer graphics, robotics, and tomography.
The closest-pair problem and the convex-hull problem come under this category.
(vii) Numerical problems
Numerical problems are problems that involve mathematical equations, systems of
equations, computing definite integrals, evaluating functions, and so on.
The majority of such mathematical problems can be solved only approximately.
TABLE 1.1 Values (approximate) of several functions important for analysis of algorithms
n      √n        log_2 n  n      n log_2 n  n^2       n^3       2^n        n!
1      1         0        1      0          1         1         2          1
2      1.4       1        2      2          4         8         4          2
4      2         2        4      8          16        64        16         24
8      2.8       3        8      2.4·10^1   64        5.1·10^2  2.6·10^2   4.0·10^4
10     3.2       3.3      10     3.3·10^1   10^2      10^3      10^3       3.6·10^6
16     4         4        16     6.4·10^1   2.6·10^2  4.1·10^3  6.5·10^4   2.1·10^13
10^2   10        6.6      10^2   6.6·10^2   10^4      10^6      1.3·10^30  9.3·10^157
10^3   31        10       10^3   1.0·10^4   10^6      10^9
10^4   10^2      13       10^4   1.3·10^5   10^8      10^12
10^5   3.2·10^2  17       10^5   1.7·10^6   10^10     10^15
10^6   10^3      20       10^6   2.0·10^7   10^12     10^18
For n ≥ 10^3, the entries for 2^n and n! are a very big computation.
In the worst case, there is no matching element, or the first matching element is found
in the last position of the list. In the best case, the matching element is in the first position of the list.
Worst-case efficiency
The worst-case efficiency of an algorithm is its efficiency for the worst-case input of size n,
i.e., an input for which the algorithm runs the longest among all possible inputs of that size.
KLNCE/B.TECH/INFORMATION TECHNOLOGY/IV SEM/CS8451-DAA/2020-2021
For an input of size n, the running time is C_worst(n) = n.
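The best and worst cases described above match the behaviour of a search that scans a list from left to right. The sketch below assumes that algorithm (sequential search); the function name sequential_search and the sample list are made up for the illustration. It counts key comparisons so that the best case (1 comparison) and the worst case (n comparisons) can be observed directly.

```python
from typing import Sequence, Tuple


def sequential_search(items: Sequence[int], key: int) -> Tuple[int, int]:
    """Return (index of key or -1, number of key comparisons made)."""
    comparisons = 0
    for index, value in enumerate(items):
        comparisons += 1  # basic operation: compare one element with the key
        if value == key:
            return index, comparisons
    return -1, comparisons


data = [9, 4, 7, 1, 5]
print(sequential_search(data, 9))  # best case: (0, 1)
print(sequential_search(data, 5))  # worst case, match in last position: (4, 5)
print(sequential_search(data, 8))  # worst case, no match: (-1, 5)
```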
Yet another type of efficiency is called amortized efficiency. It applies not to a single run of
an algorithm but rather to a sequence of operations performed on the same data structure.
Asymptotic notation is a notation used to make meaningful statements about the
efficiency of a program.
The efficiency analysis framework concentrates on the order of growth of an algorithm’s
basic operation count as the principal indicator of the algorithm’s efficiency.
To compare and rank such orders of growth, computer scientists use three notations:
O - Big oh notation
Ω - Big omega notation
Θ - Big theta notation
Let t(n) and g(n) be any nonnegative functions defined on the set of natural numbers.
The algorithm’s running time t(n) is usually indicated by its basic operation count C(n), and g(n) is
some simple function to compare the count with.
Example 1:
(1/2) n(n − 1) ≤ (1/2) n^2 for all n ≥ 0,
i.e., (1/4) n^2 ≤ (1/2) n(n − 1) ≤ (1/2) n^2 for all n ≥ 2.
Hence (1/2) n(n − 1) ∈ Θ(n^2).
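As a quick sanity check of Example 1, the two bounds on n(n − 1)/2 can be tested numerically. The sketch below assumes the constants 1/4 and 1/2 and the threshold n = 2; a finite check like this illustrates the inequality but does not prove it.

```python
def t(n: int) -> float:
    """The function from the example: n(n - 1)/2."""
    return n * (n - 1) / 2


def bounds_hold(n: int) -> bool:
    """Check (1/4)n^2 <= n(n - 1)/2 <= (1/2)n^2 for one value of n."""
    return 0.25 * n * n <= t(n) <= 0.5 * n * n


print(all(bounds_hold(n) for n in range(2, 1001)))  # True
print(bounds_hold(1))  # False: the lower bound needs n >= 2
```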
Note: asymptotic notation can be thought of as "relational operators" for functions similar to the
corresponding relational operators for values.
= ⇒ Θ(), ≤ ⇒ O(), ≥ ⇒ Ω(), < ⇒ o(), > ⇒ ω()
THEOREM: If t1(n) ∈ O(g1(n)) and t2(n) ∈ O(g2(n)), then t1(n) + t2(n) ∈ O(max{g1(n), g2(n)}).
PROOF: The proof extends to orders of growth the following simple fact about four arbitrary real
numbers a1, b1, a2, b2: if a1 ≤ b1 and a2 ≤ b2, then a1 + a2 ≤ 2 max{b1, b2}.
Since t1(n) ∈ O(g1(n)), there exist some positive constant c1 and some nonnegative integer
n1 such that
t1(n) ≤ c1 g1(n) for all n ≥ n1.
Similarly, since t2(n) ∈ O(g2(n)), there exist some positive constant c2 and some nonnegative
integer n2 such that
t2(n) ≤ c2 g2(n) for all n ≥ n2.
Let us denote c3 = max{c1, c2} and consider n ≥ max{n1, n2} so that we can use
both inequalities. Adding them yields the following:
t1(n) + t2(n) ≤ c1 g1(n) + c2 g2(n)
              ≤ c3 g1(n) + c3 g2(n)
              = c3 [g1(n) + g2(n)]
              ≤ c3 · 2 max{g1(n), g2(n)}.
Hence, t1(n) + t2(n) ∈ O(max{g1(n), g2(n)}), with the constants c and n0 required by the
definition of O being 2c3 = 2 max{c1, c2} and max{n1, n2}, respectively.
The property implies that the algorithm’s overall efficiency will be determined by the part
with a higher order of growth, i.e., its least efficient part.
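The constants produced by the proof can be tried out on a concrete pair of functions. In the sketch below, t1, t2 and their bounding constants are hypothetical choices made only for illustration: t1(n) = n(n − 1)/2 ≤ 0.5·n^2 for n ≥ 0, and t2(n) = 3n + 2 ≤ 5n for n ≥ 1. The code then checks the combined bound with c = 2 max{c1, c2} and n0 = max{n1, n2}.

```python
def t1(n: int) -> float:
    return n * (n - 1) / 2  # t1(n) <= c1 * g1(n) with c1 = 0.5, g1(n) = n^2, n1 = 0


def t2(n: int) -> float:
    return 3 * n + 2  # t2(n) <= c2 * g2(n) with c2 = 5, g2(n) = n, n2 = 1


c1, n1 = 0.5, 0
c2, n2 = 5.0, 1
c = 2 * max(c1, c2)  # the constant 2*c3 from the proof
n0 = max(n1, n2)     # the threshold max{n1, n2} from the proof


def bound_holds(n: int) -> bool:
    """Check t1(n) + t2(n) <= c * max{g1(n), g2(n)} for one value of n."""
    return t1(n) + t2(n) <= c * max(n * n, n)


print(c, n0)  # 10.0 1
print(all(bound_holds(n) for n in range(n0, 1001)))  # True
```

Here the quadratic part dominates, so the sum is in O(n^2), in line with the remark that the least efficient part determines the overall efficiency.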
Summation formulas
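The formulas themselves are missing at this point in the notes. Later examples refer to formula (S1) and rule (R1); assuming these labels follow the usual convention of the Levitin textbook listed above, the standard rules and formulas are:

```latex
\begin{align*}
\text{(R1)}\quad & \sum_{i=l}^{u} c\,a_i = c \sum_{i=l}^{u} a_i \\
\text{(R2)}\quad & \sum_{i=l}^{u} (a_i \pm b_i) = \sum_{i=l}^{u} a_i \pm \sum_{i=l}^{u} b_i \\
\text{(S1)}\quad & \sum_{i=l}^{u} 1 = u - l + 1 \\
\text{(S2)}\quad & \sum_{i=0}^{n} i = \sum_{i=1}^{n} i = \frac{n(n+1)}{2} \approx \frac{1}{2}n^2 \in \Theta(n^2)
\end{align*}
```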
Algorithm analysis
For simplicity, we consider n itself as an indicator of this algorithm’s input size.
The basic operation of the algorithm is multiplication, whose number of executions we
denote M(n). The function F(n) is computed according to the formula F(n) = F(n − 1) · n
for n > 0.
The number of multiplications M(n) needed to compute it must therefore satisfy the equality
M(n) = M(n − 1) + 1 for n > 0,
where M(n − 1) multiplications are spent to compute F(n − 1), and one more multiplication is
needed to multiply the result by n.
Recurrence relations
The last equation defines the sequence M(n) that we need to find. This equation defines
M(n) not explicitly, i.e., as a function of n, but implicitly as a function of its value at another point,
namely n − 1. Such equations are called recurrence relations or recurrences.
Our goal is to solve the recurrence relation M(n) = M(n − 1) + 1, i.e., to find an explicit formula for
M(n) in terms of n only.
To determine a solution uniquely, we need an initial condition that tells us the value with
which the sequence starts. We can obtain this value by inspecting the condition that makes the
algorithm stop its recursive calls:
if n = 0 return 1.
This tells us two things. First, since the calls stop when n = 0, the smallest value of n for
which this algorithm is executed and hence M(n) defined is 0. Second, by inspecting the
pseudocode’s exiting line, we can see that when n = 0, the algorithm performs no multiplications.
Thus, the recurrence relation and initial condition for the algorithm’s number of multiplications
M(n) are:
M(n) = M(n − 1) + 1 for n > 0,
M(0) = 0.
Method of backward substitutions
M(n) = M(n − 1) + 1            substitute M(n − 1) = M(n − 2) + 1
     = [M(n − 2) + 1] + 1
     = M(n − 2) + 2            substitute M(n − 2) = M(n − 3) + 1
     = [M(n − 3) + 1] + 2
     = M(n − 3) + 3
     …
     = M(n − i) + i
     …
     = M(n − n) + n = M(0) + n
     = n.
Therefore, M(n) = n.
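The result M(n) = n can be observed by instrumenting a recursive factorial that follows F(n) = F(n − 1) · n with the stopping condition n = 0. This is a minimal sketch; the function name factorial and the returned multiplication counter are assumptions added for the illustration.

```python
from typing import Tuple


def factorial(n: int) -> Tuple[int, int]:
    """Return (n!, number of multiplications performed)."""
    if n < 0:
        raise ValueError("n must be nonnegative")
    if n == 0:
        return 1, 0  # initial condition: M(0) = 0
    value, mults = factorial(n - 1)  # M(n - 1) multiplications
    return value * n, mults + 1      # one more multiplication by n


print(factorial(5))  # (120, 5)
```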
ALGORITHM TOH(n, A, C, B)
//Moves disks from the source peg to the destination peg recursively
//Input: n disks and 3 pegs A (source), C (destination), and B (auxiliary)
//Output: Disks moved to the destination in the same order as on the source
if n = 1
    Move disk from A to C
else
    //Move top n − 1 disks from A to B using C
    TOH(n − 1, A, B, C)
    //Move the largest disk from A to C
    Move disk from A to C
    //Move top n − 1 disks from B to C using A
    TOH(n − 1, B, C, A)
…
= 2^i M(n − i) + 2^{i−1} + 2^{i−2} + . . . + 2 + 1 = 2^i M(n − i) + 2^i − 1.
…
Since the initial condition is specified for n = 1, which is achieved for i = n − 1,
M(n) = 2^{n−1} M(n − (n − 1)) + 2^{n−1} − 1 = 2^{n−1} M(1) + 2^{n−1} − 1 = 2^{n−1} + 2^{n−1} − 1 = 2^n − 1.
Thus, we have an exponential-time algorithm.
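The count 2^n − 1 can be confirmed by running the recursive procedure and recording every move. The sketch below follows the parameter order of TOH (source, destination, auxiliary) and performs one move of the largest disk between the two recursive calls; the helper count_moves and the list of recorded moves are assumptions added for the illustration.

```python
from typing import List, Tuple


def toh(n: int, source: str, dest: str, aux: str,
        moves: List[Tuple[str, str]]) -> None:
    """Append the moves that transfer n disks from source to dest."""
    if n == 1:
        moves.append((source, dest))
    else:
        toh(n - 1, source, aux, dest, moves)  # top n - 1 disks: source -> aux
        moves.append((source, dest))          # largest disk: source -> dest
        toh(n - 1, aux, dest, source, moves)  # top n - 1 disks: aux -> dest


def count_moves(n: int) -> int:
    """Number of single-disk moves made for n disks."""
    moves: List[Tuple[str, str]] = []
    toh(n, "A", "C", "B", moves)
    return len(moves)


print(count_moves(3))   # 7
print(count_moves(10))  # 1023
```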
EXAMPLE 3: An investigation of a recursive version of the algorithm which finds the number of
binary digits in the binary representation of a positive decimal integer.
ALGORITHM BinRec(n)
//Input: A positive decimal integer n
//Output: The number of binary digits in n’s binary representation
if n = 1 return 1
else return BinRec(⌊n/2⌋) + 1
Algorithm analysis
The number of additions made in computing BinRec(⌊n/2⌋) is A(⌊n/2⌋), plus one more
addition is made by the algorithm to increase the returned value by 1. This leads to the recurrence
A(n) = A(⌊n/2⌋) + 1 for n > 1.
Since the recursive calls end when n is equal to 1 and there are no additions made
then, the initial condition is A(1) = 0.
The standard approach to solving such a recurrence is to solve it only for n = 2^k:
A(2^k) = A(2^{k−1}) + 1 for k > 0,
A(2^0) = 0.
Backward substitutions for n = 2^k:
A(2^k) = A(2^{k−1}) + 1                          substitute A(2^{k−1}) = A(2^{k−2}) + 1
       = [A(2^{k−2}) + 1] + 1 = A(2^{k−2}) + 2   substitute A(2^{k−2}) = A(2^{k−3}) + 1
       = [A(2^{k−3}) + 1] + 2 = A(2^{k−3}) + 3
       ...
       = A(2^{k−i}) + i
       ...
       = A(2^{k−k}) + k.
Thus, we end up with A(2^k) = A(1) + k = k, or, after returning to the original variable n = 2^k and
hence k = log_2 n,
A(n) = log_2 n ∈ Θ(log n).
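The recurrence A(n) = A(⌊n/2⌋) + 1 with A(1) = 0 can be checked by counting additions in a direct implementation of BinRec. This is a sketch; the name bin_rec and the returned addition counter are assumptions added for the illustration.

```python
from typing import Tuple


def bin_rec(n: int) -> Tuple[int, int]:
    """Return (number of binary digits of n, number of additions made)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if n == 1:
        return 1, 0  # initial condition: A(1) = 0
    digits, additions = bin_rec(n // 2)  # n // 2 is the floor of n/2
    return digits + 1, additions + 1


print(bin_rec(8))     # (4, 3): 8 = 1000 in binary, log2(8) = 3 additions
print(bin_rec(1000))  # (10, 9)
```

For n that is not a power of 2, the number of additions equals ⌊log_2 n⌋, which is still in Θ(log n).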
EXAMPLE 1: Consider the problem of finding the value of the largest element in a list of n
numbers. Assume that the list is implemented as an array for simplicity.
ALGORITHM MaxElement(A[0..n − 1])
//Determines the value of the largest element in a given array
//Input: An array A[0..n − 1] of real numbers
//Output: The value of the largest element in A
maxval ← A[0]
for i ← 1 to n − 1 do
    if A[i] > maxval
        maxval ← A[i]
return maxval
Algorithm analysis
The measure of an input’s size here is the number of elements in the array, i.e., n.
There are two operations in the for loop’s body:
o The comparison A[i] > maxval and
o The assignment maxval ← A[i].
The comparison is considered the algorithm’s basic operation because it is executed on each
repetition of the loop, whereas the assignment is not.
The number of comparisons will be the same for all arrays of size n; therefore, there is no
need to distinguish among the worst, average, and best cases here.
Let C(n) denote the number of times this comparison is executed. The algorithm makes
one comparison on each execution of the loop, which is repeated for each value of the
loop’s variable i within the bounds 1 and n − 1, inclusive. Therefore, the sum for C(n) is
calculated as follows:
C(n) = Σ_{i=1}^{n−1} 1 = n − 1 ∈ Θ(n).
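A direct Python rendering of MaxElement with a comparison counter shows that exactly n − 1 comparisons are made for any array of size n. The function name max_element and the counter are assumptions added for the illustration.

```python
from typing import Sequence, Tuple


def max_element(a: Sequence[float]) -> Tuple[float, int]:
    """Return (largest element of a, number of comparisons A[i] > maxval)."""
    if not a:
        raise ValueError("the array must not be empty")
    maxval = a[0]
    comparisons = 0
    for i in range(1, len(a)):
        comparisons += 1  # basic operation, executed once per iteration
        if a[i] > maxval:
            maxval = a[i]
    return maxval, comparisons


print(max_element([3, 9, 2, 7]))  # (9, 3)
```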
EXAMPLE 2: Consider the element uniqueness problem: check whether all the Elements in a
given array of n elements are distinct.
ALGORITHM UniqueElements(A[0..n − 1])
//Determines whether all the elements in a given array are distinct
//Input: An array A[0..n − 1]
//Output: Returns “true” if all the elements in A are distinct and “false” otherwise
for i ← 0 to n − 2 do
    for j ← i + 1 to n − 1 do
        if A[i] = A[j] return false
return true
Algorithm analysis
The natural measure of the input’s size here is again n (the number of elements in the array).
Since the innermost loop contains a single operation (the comparison of two elements), we
should consider it as the algorithm’s basic operation.
The number of element comparisons depends not only on n but also on whether there are
equal elements in the array and, if there are, which array positions they occupy. We will
limit our investigation to the worst case only.
One comparison is made for each repetition of the innermost loop, i.e., for each value of the
loop variable j between its limits i + 1 and n − 1; this is repeated for each value of the outer
loop, i.e., for each value of the loop variable i between its limits 0 and n − 2.
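Summing one comparison over all pairs i < j gives a worst-case count of (n − 1)n/2 ∈ Θ(n^2). The sketch below counts comparisons in a Python version of UniqueElements; the name unique_elements and the counter are assumptions added for the illustration.

```python
from typing import Sequence, Tuple


def unique_elements(a: Sequence[int]) -> Tuple[bool, int]:
    """Return (True if all elements are distinct, number of comparisons)."""
    comparisons = 0
    n = len(a)
    for i in range(n - 1):          # i = 0 .. n - 2
        for j in range(i + 1, n):   # j = i + 1 .. n - 1
            comparisons += 1        # basic operation: A[i] = A[j]
            if a[i] == a[j]:
                return False, comparisons
    return True, comparisons


print(unique_elements([4, 1, 3, 2]))  # (True, 6): worst case, 4 * 3 / 2 comparisons
print(unique_elements([1, 2, 3, 3]))  # (False, 6): worst case, equal pair found last
print(unique_elements([4, 1, 3, 4]))  # (False, 3)
```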
EXAMPLE 3: Consider matrix multiplication. Given two n × n matrices A and B, find the time
efficiency of the definition-based algorithm for computing their product C = AB. By definition, C
is an n × n matrix whose elements are computed as the scalar (dot) products of the rows of matrix A
and the columns of matrix B:
where C[i, j ]= A[i, 0]B[0, j]+ . . . + A[i, k]B[k, j]+ . . . + A[i, n − 1]B[n − 1, j] for every pair of
indices 0 ≤ i, j ≤ n − 1.
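The total number of multiplications M(n) is expressed by the following triple sum:
M(n) = Σ_{i=0}^{n−1} Σ_{j=0}^{n−1} Σ_{k=0}^{n−1} 1.
Now, we can compute this sum by using formula (S1) and rule (R1):
M(n) = Σ_{i=0}^{n−1} Σ_{j=0}^{n−1} n = Σ_{i=0}^{n−1} n^2 = n^3.
To estimate the running time of the algorithm on a particular machine, we can use the product
T(n) ≈ c_m M(n) = c_m n^3,
where c_m is the time of one multiplication on the machine in question.
If we consider the time spent on the additions too, then the total time on the machine is
T(n) ≈ c_m M(n) + c_a A(n) = c_m n^3 + c_a n^3 = (c_m + c_a) n^3,
where c_a is the time of one addition and A(n) = n^3 is the number of additions.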
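The pseudocode of the definition-based algorithm is not reproduced in these notes, so the following is a sketch of the usual triple loop, assuming square matrices stored as lists of rows. The name matrix_multiply and the multiplication counter are assumptions added for the illustration; the counter should equal n^3.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def matrix_multiply(a: Matrix, b: Matrix) -> Tuple[Matrix, int]:
    """Return (product of two n x n matrices, number of multiplications)."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    multiplications = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # C[i, j] accumulates A[i, k] * B[k, j] over k = 0 .. n - 1
                c[i][j] += a[i][k] * b[k][j]
                multiplications += 1
    return c, multiplications


product, count = matrix_multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product)  # [[19.0, 22.0], [43.0, 50.0]]
print(count)    # 8
```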
EXAMPLE 4 The following algorithm finds the number of binary digits in the binary
representation of a positive decimal integer.
ALGORITHM Binary(n)
//Input: A positive decimal integer n
//Output: The number of binary digits in n’s binary representation
count ← 1
while n > 1 do
    count ← count + 1
    n ← ⌊n/2⌋
return count
Algorithm analysis
An input’s size is n.
The loop variable takes on only a few values between its lower and upper limits.
Since the value of n is about halved on each repetition of the loop, the answer should be
about log_2 n.
The exact formula for the number of times the comparison n > 1 will be executed is actually
⌊log_2 n⌋ + 1.
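The formula ⌊log_2 n⌋ + 1 can be checked by counting how often the test n > 1 is evaluated, including the final evaluation that ends the loop. This is a sketch; the name binary_digit_count and the comparison counter are assumptions added for the illustration.

```python
from typing import Tuple


def binary_digit_count(n: int) -> Tuple[int, int]:
    """Return (number of binary digits of n, number of times n > 1 is tested)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    count = 1
    comparisons = 0
    while True:
        comparisons += 1  # one evaluation of the loop condition n > 1
        if not n > 1:
            break
        count += 1
        n //= 2           # floor of n/2
    return count, comparisons


print(binary_digit_count(8))  # (4, 4): floor(log2 8) + 1 = 4 comparisons
print(binary_digit_count(1))  # (1, 1)
```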
Visualization
Algorithm visualization can be defined as the use of images to convey some useful information
about algorithms. That information can be a visual illustration of an algorithm’s operation, of its
performance on different kinds of inputs, or of its execution speed versus that of other algorithms for
the same problem. To accomplish this goal, an algorithm visualization uses graphic elements (points,
line segments, two- or three-dimensional bars, and so on) to represent some “interesting events” in
the algorithm’s operation. There are two principal variations of algorithm visualization:
Static algorithm visualization
Dynamic algorithm visualization, also called algorithm animation
Static algorithm visualization shows an algorithm’s progress through a series of still images.
Algorithm animation, on the other hand, shows a continuous, movie-like presentation of an
algorithm’s operations.