UNIT-I
I. Algorithm
What is an Algorithm? An algorithm is a set of steps to complete a task. For example, consider the task of making a cup of tea.
"A set of steps to accomplish or complete a task, described precisely enough that a computer can run it."
Described precisely: it is very difficult for a machine to know how much water or milk is to be added, etc., in the tea-making algorithm above. Such algorithms run on computers or other computational devices.
An algorithm is a finite set of instructions that, if followed, accomplishes a particular task. In addition, all
algorithms must satisfy the following criteria:
• Input: zero or more quantities are externally supplied.
• Output: at least one quantity is produced.
• Definiteness: each instruction is clear and unambiguous.
• Finiteness: the algorithm terminates after a finite number of steps for all cases.
• Effectiveness: every instruction must be basic enough to be feasible.
3. Algorithm analysis: how much space is required and how much time does it take to run the algorithm? An algorithm is a procedure (a finite set of well-defined instructions) for accomplishing some task which, given an initial state, terminates in a defined end-state. The computational complexity and efficient implementation of the algorithm are important in computing, and this depends on suitable data structures.
4. Implementation: decide on the programming language to use (C, C++, Java, assembly, etc.).
5. Testing: integrate feedback from users, fix bugs, and ensure compatibility across different versions.
Keeping illegal inputs out is the responsibility of the algorithmic problem specification, while handling special classes of unusual or undesirable inputs is the responsibility of the algorithm itself.
• How to devise algorithms: design strategies such as divide and conquer, branch and bound, and dynamic programming.
• How to validate algorithms: check that the algorithm computes the correct answer for all possible legal inputs. This first phase is algorithm validation. Once the algorithm is turned into a program, a second phase, program proving or program verification, begins.
• In its first form, the program is annotated with a set of assertions about the input and output variables of the program, expressed in the predicate calculus.
• How to analyze algorithms: analysis of algorithms (performance analysis) refers to the task of determining how much computing time and storage an algorithm requires. Testing a program consists of two phases:
• Debugging: the process of executing programs on sample data sets to determine whether faulty results occur and, if so, to correct them.
• Profiling (performance measurement): the process of executing a correct program on data sets and measuring the time and space it takes to compute the results.
PSEUDOCODE:
• In text mode, an algorithm is most often expressed in a form close to a high-level language such as C or Pascal.
• Pseudocode: a high-level description of an algorithm, more structured than plain English.
Control flow
• If … then … [else …]
• While … do …
• Repeat … until …
• For … do …
• Method call
• Return value: return expression
• Expressions and assignment
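As a sketch of how these conventions translate into a real language (this mapping is ours, not part of the original notes), the same constructs in C:

#include <stdio.h>

int main(void) {
    int n = 5, s = 0;

    /* For ... do ... */
    for (int i = 1; i <= n; i++)
        s = s + i;                    /* expression / assignment */

    /* If ... then ... [else ...] */
    if (s > 10)
        printf("s = %d exceeds 10\n", s);
    else
        printf("s = %d\n", s);

    /* While ... do ... */
    int i = n;
    while (i > 0)
        i = i - 1;

    /* Repeat ... until cond  becomes  do { ... } while (!cond) in C */
    int j = 0;
    do {
        j = j + 1;
    } while (!(j >= n));

    return 0;                         /* return expression */
}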
Recursive Algorithm
A recursive function is a function that is defined in terms of itself. An algorithm that calls itself is directly recursive. Algorithm A is said to be indirectly recursive if it calls another algorithm which in turn calls A.
Towers of Hanoi: A very elegant solution results from the use of recursion. Assume that the number of disks is n. To get the largest disk to the bottom of tower B, we move the remaining n - 1 disks to tower C and then move the largest to tower B.
This algorithm is invoked by TowersOfHanoi(n, A, B, C). Observe that our solution for an n-disk problem is formulated in terms of solutions to two (n - 1)-disk problems.
Algorithm TowersOfHanoi(n, x, y, z)
// Move the top n disks from tower x to tower y, using tower z as intermediate.
{
    if (n >= 1) then
    {
        TowersOfHanoi(n - 1, x, z, y);
        write ("move top disk from tower", x, "to top of tower", y);
        TowersOfHanoi(n - 1, z, y, x);
    }
}
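A runnable C version of this algorithm (a sketch; the function name and output format mirror the pseudocode above):

#include <stdio.h>

/* Move the top n disks from tower x to tower y, using tower z as spare. */
void towers_of_hanoi(int n, char x, char y, char z) {
    if (n >= 1) {
        towers_of_hanoi(n - 1, x, z, y);   /* move n-1 disks from x to z */
        printf("move top disk from tower %c to top of tower %c\n", x, y);
        towers_of_hanoi(n - 1, z, y, x);   /* move n-1 disks from z to y */
    }
}

int main(void) {
    towers_of_hanoi(3, 'A', 'B', 'C');     /* invoked as TowersOfHanoi(n, A, B, C) */
    return 0;
}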
PERFORMANCE ANALYSIS
Space Complexity:
The space requirement S(P) of any algorithm P may therefore be written as S(P) = c + Sp(instance characteristics), where c is a constant denoting the fixed part of the space and Sp the variable part.
Time Complexity:
The time T(P) taken by a program P is the sum of the compile time and the run time. The compile time does not depend on the instance characteristics. Also, we may assume that a compiled program will be run several times without recompilation. Consequently, we concern ourselves with just the run time of a program, denoted tp(instance characteristics). Because many of the factors tp depends on are not known at the time a program is conceived, it is reasonable to attempt only to estimate tp. If we knew the characteristics of the compiler to be used, we could write
tp(n) = ca ADD(n) + cs SUB(n) + cm MUL(n) + cd DIV(n) + ...,
where n denotes the instance characteristics; ca, cs, cm, cd, and so on, respectively, denote the time needed for an addition, subtraction, multiplication, division, and so on; and ADD, SUB, MUL, DIV, and so on are functions whose values are the numbers of additions, subtractions, multiplications, divisions, and so on performed on an instance with characteristic n.
while ((expr)) do
Each execution of the control part of a while statement is given a step count equal to the number of step counts assignable to (expr). The step count for each execution of the control part of a for statement is one, unless the counts attributable to (expr1) and (expr2) are functions of the instance characteristics.
Algorithm Sum(a, n)
{
    s := 0.0;
    count := count + 1; // count is global; it is initially zero.
    for i := 1 to n do
    {
        count := count + 1; // For for
        s := s + a[i]; count := count + 1; // For assignment
    }
    count := count + 1; // For last time of for
    count := count + 1; // For the return
    return s;
}
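The same counting can be carried out in a runnable C sketch (the global count variable is introduced purely for analysis, as in the algorithm above):

#include <stdio.h>

int count = 0;  /* global step counter, initially zero */

float sum(const float a[], int n) {
    float s = 0.0f;
    count++;                     /* for the assignment s := 0.0 */
    for (int i = 0; i < n; i++) {
        count++;                 /* for the for-loop control */
        s += a[i];
        count++;                 /* for the assignment inside the loop */
    }
    count++;                     /* for the last test of the for loop */
    count++;                     /* for the return */
    return s;
}

int main(void) {
    float a[] = {1.0f, 2.0f, 3.0f};
    printf("sum = %g, steps = %d\n", sum(a, 3), count);  /* steps = 2n + 3 */
    return 0;
}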
When analyzing a recursive program for its step count, we often obtain a recursive formula for the step count, for example
tRSum(n) = 2 if n = 0, and tRSum(n) = 2 + tRSum(n - 1) if n > 0.
These recursive formulas are referred to as recurrence relations. One way of solving this recurrence is to repeatedly substitute for tRSum:
tRSum(n) = 2 + tRSum(n - 1)
         = 2 + 2 + tRSum(n - 2)
         = 2(2) + tRSum(n - 2)
         ...
         = n(2) + tRSum(0)
         = 2n + 2, n >= 0.
So the step count for RSum (Algorithm 1.7) is 2n + 2.
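RSum itself is not listed in these notes; the following C sketch shows a plausible recursive sum whose step behaviour matches the recurrence above:

/* Recursive sum of a[0..n-1]. Each invocation contributes 2 steps
   (one for the test, one for the return), so the step count satisfies
   t(n) = 2 + t(n-1) with t(0) = 2, i.e. t(n) = 2n + 2. */
float rsum(const float a[], int n) {
    if (n <= 0)
        return 0.0f;
    return rsum(a, n - 1) + a[n - 1];
}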
The time complexity of an algorithm is given by the number of steps taken by the algorithm to compute the function it was written for. The number of steps is itself a function of the instance characteristics.
Matrix addition
Algorithm Add(a, b, c, m, n)
{
    for i := 1 to m do
        for j := 1 to n do
            c[i, j] := a[i, j] + b[i, j];
}
The statement f(n) = O(g(n)) states only that g(n) is an upper bound on the value of f(n) for all n, n >= n0. For example, if f(n) = am n^m + ... + a1 n + a0, then for n >= 1,
f(n) <= Σ |ai| n^i <= n^m Σ |ai| n^(i-m) <= n^m Σ |ai|,
so f(n) = O(n^m).
O (Big O) Notation
For a given function g(n), O(g(n)) is defined as the set
O(g(n)) = { f(n) : there exist constants c > 0 and n0 ∈ N such that 0 ≤ f(n) ≤ c g(n) for all n ≥ n0 }.
In other words, a function f(n) is said to belong to O(g(n)) if there exists a positive constant c such that 0 ≤ f(n) ≤ c g(n) for sufficiently large values of n. Fig 1.2 gives an intuitive picture of functions f(n) and g(n), where f(n) = O(g(n)). For all values of n at and to the right of n0, the value of f(n) lies at or below c g(n). So g(n) is said to be an asymptotic upper bound for f(n).
Prove that 3n^3 + 2n^2 + 4n + 3 = O(n^3).
Here, f(n) = 3n^3 + 2n^2 + 4n + 3 and g(n) = n^3. To prove f(n) = O(g(n)) we must determine positive constants c and n0 such that 3n^3 + 2n^2 + 4n + 3 ≤ c n^3 for all n ≥ n0. Dividing the whole inequality by n^3, we get 3 + 2/n + 4/n^2 + 3/n^3 ≤ c. For n ≥ 1 the left-hand side is at most 3 + 2 + 4 + 3 = 12, so by choosing c = 12 and n0 = 1 we have f(n) = O(g(n)).
Θ (Theta) Notation
For a given function g(n), Θ(g(n)) is defined as the set
Θ(g(n)) = { f(n) : there exist constants c1 > 0, c2 > 0 and n0 ∈ N such that 0 ≤ c1 g(n) ≤ f(n) ≤ c2 g(n) for all n ≥ n0 }.
In other words, a function f(n) is said to belong to Θ(g(n)) if there exist positive constants c1 and c2 such that 0 ≤ c1 g(n) ≤ f(n) ≤ c2 g(n) for sufficiently large values of n. Fig 1.1 gives an intuitive picture of functions f(n) and g(n), where f(n) = Θ(g(n)). For all values of n at and to the right of n0, the value of f(n) lies at or above c1 g(n) and at or below c2 g(n). In other words, for all n ≥ n0, the function f(n) is equal to g(n) to within a constant factor. So g(n) is said to be an asymptotically tight bound for f(n).
For example, let f(n) = (1/2)n^2 - 3n and g(n) = n^2. To prove f(n) = Θ(g(n)) we must determine positive constants c1, c2 and n0 such that c1 n^2 ≤ (1/2)n^2 - 3n ≤ c2 n^2 for all n ≥ n0.
Dividing the whole inequality by n^2, we get c1 ≤ 1/2 - 3/n ≤ c2.
We can make the right-hand inequality hold for any value of n ≥ 1 by choosing c2 ≥ 1/2. Similarly, we can make the left-hand inequality hold for any value of n ≥ 7 by choosing c1 ≤ 1/14. Thus, by choosing c1 = 1/14, c2 = 1/2 and n0 = 7, we have f(n) = Θ(g(n)); that is, (1/2)n^2 - 3n = Θ(n^2).
Ω (Omega) Notation
For a given function g(n), Ω(g(n)) is defined as the set
Ω(g(n)) = { f(n) : there exist constants c > 0 and n0 ∈ N such that 0 ≤ c g(n) ≤ f(n) for all n ≥ n0 }.
In other words, a function f(n) is said to belong to Ω(g(n)) if there exists a positive constant c such that 0 ≤ c g(n) ≤ f(n) for sufficiently large values of n. Fig 1.3 gives an intuitive picture of functions f(n) and g(n), where f(n) = Ω(g(n)). For all values of n at and to the right of n0, the value of f(n) lies at or above c g(n). So g(n) is said to be an asymptotic lower bound for f(n).
Prove that 3n^3 + 2n^2 + 4n + 3 = Ω(n^3). Here, f(n) = 3n^3 + 2n^2 + 4n + 3 and g(n) = n^3. To prove f(n) = Ω(g(n)) we must determine positive constants c and n0 such that c n^3 ≤ 3n^3 + 2n^2 + 4n + 3 for all n ≥ n0. Dividing the whole inequality by n^3, we get c ≤ 3 + 2/n + 4/n^2 + 3/n^3. We can make the inequality hold for any value of n ≥ 1 by choosing c ≤ 3. Thus, by choosing c = 3 and n0 = 1 we have f(n) = Ω(g(n)).
The growth patterns of the order notations are listed below:
O(1) < O(log n) < O(n) < O(n log n) < O(n^2) < O(n^3) < O(c^n) < O(n!)
Example: prove that 7n^3 + 7 = Θ(n^3). We must determine positive constants c1, c2 and n0 such that c1 n^3 ≤ 7n^3 + 7 ≤ c2 n^3 for all n ≥ n0. Dividing the whole inequality by n^3, we get c1 ≤ 7 + 7/n^3 ≤ c2.
We can make the right-hand inequality hold for any value of n ≥ 1 by choosing c2 ≥ 14. Similarly, we can make the left-hand inequality hold for any value of n ≥ 1 by choosing c1 ≤ 7. Thus, by choosing c1 = 7, c2 = 14 and n0 = 1, we have 7n^3 + 7 = Θ(n^3). The common order notations and their names are listed below:
Notation     Name
O(1)         Constant
O(log n)     Logarithmic
O(n)         Linear
O(n log n)   Linearithmic
O(n^2)       Quadratic
O(c^n)       Exponential
O(n!)        Factorial
A comparison of typical running times of different order notations for different input sizes is listed below:

log2 n   n    n log2 n   n^2    n^3     2^n
0        1    0          1      1       2
1        2    2          4      8       4
2        4    8          16     64      16
3        8    24         64     512     256
4        16   64         256    4096    65536
5        32   160        1024   32768   4294967296
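The entries can be checked with a small C program (a sketch of ours; it prints the same columns for n = 2^k):

#include <stdio.h>

int main(void) {
    /* Reproduce the comparison table for n = 1, 2, 4, ..., 32. */
    printf("%6s %5s %9s %6s %8s %12s\n",
           "log2n", "n", "n log2 n", "n^2", "n^3", "2^n");
    for (unsigned long long k = 0; k <= 5; k++) {
        unsigned long long n = 1ULL << k;          /* n = 2^k, so log2 n = k */
        printf("%6llu %5llu %9llu %6llu %8llu %12llu\n",
               k, n, k * n, n * n, n * n * n, 1ULL << n);
    }
    return 0;
}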
Little o notation ["little oh"]: The function f(n) = o(g(n)) (read as "f of n is little oh of g of n") iff
o(g(n)) = { f(n) : for every constant c > 0, there exists a constant n0 > 0 such that 0 ≤ f(n) < c g(n) for all n ≥ n0 }.
Equivalently, lim (n→∞) f(n)/g(n) = 0. For example:
n^1.9999 = o(n^2)
n^2 / log n = o(n^2)
n^2 ≠ o(n^2)
Worst case: this is the upper bound on execution time over all inputs. It guarantees that, irrespective of the type of input, the algorithm will not take longer than the worst-case time.
Best case: this is the lower bound on execution time over all inputs. It guarantees that under any circumstances the algorithm will execute for at least the best-case time; that is, it is the minimum time required by the algorithm for any input.
Average case: this is the execution time taken by the algorithm on a random input. In this case the algorithm takes a time that lies between the lower and upper bounds.
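As a concrete illustration (our example, not from the notes): for linear search, a key at the first position gives the best case, a missing key gives the worst case, and a uniformly random position gives the average case.

/* Linear search: best case Θ(1) (key at a[0]); worst case Θ(n)
   (key absent or at a[n-1]); average case Θ(n), about n/2
   comparisons if the key is equally likely to be anywhere. */
int linear_search(const int a[], int n, int key) {
    for (int i = 0; i < n; i++)
        if (a[i] == key)
            return i;        /* found: return the index */
    return -1;               /* not found */
}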
DIVIDE-AND-CONQUER
Given a function to compute on n inputs, the divide-and-conquer strategy suggests splitting the inputs into k distinct subsets, 1 < k ≤ n, yielding k subproblems. These subproblems must be solved, and then a method must be found to combine the subsolutions into a solution of the whole.
Small(P) is a Boolean-valued function that determines whether the input size is small enough that the answer can be computed without splitting. If this is so, the function S is invoked. Otherwise the problem P is divided into smaller subproblems. These subproblems P1, P2, ..., Pk are solved by recursive applications of DAndC. Combine is a function that determines the solution to P using the solutions to the k subproblems.
The general methodology applied in the divide-and-conquer technique is as follows:
Step 1: Divide the problem into two or more independent subproblems (not necessarily of the same type).
Step 2: Solve (conquer) each subproblem recursively down to the smallest possible size.
Step 3: Combine the solutions of the subproblems into a solution to the whole problem.
BINARY SEARCH
Binary search is a well-known instance of the divide-and-conquer method. For binary search, the divide-and-conquer strategy is applied recursively to a given sorted array as follows. Divide: divide the array at the middle, creating two sub-arrays, a left sub-array and a right sub-array.
For a given sorted array of N elements and a given key (the value to be searched for in the sorted array), the basic idea of binary search is as follows:
1. Find the middle element of the array.
2. Compare the middle element with the key.
3. There are three cases:
   If the middle element is the key, the search is successful.
   If the key is less than the middle element, search only the lower half of the array.
   If the key is greater than the middle element, search only the upper half of the array.
4. Repeat steps 1, 2 and 3 until the key is found or the sub-array size becomes one.
Solution:
Step 1: Here Lower = 0, Upper = 5, Item = 10.
Step 2: First we calculate the middle index of the array A: Mid = (Lower + Upper) / 2 = (0 + 5) / 2 = 2.
Step 3: Here Lower < Upper and A[Mid] != 10, so continue with Steps 4 and 5.
Step 4: Since A[Mid] < 10, search the upper half: Lower = Mid + 1 = 3, Upper = 5. The remaining sub-array is A[3..5] = 10, 12, 13.
Step 5: Mid = (Lower + Upper) / 2 = (3 + 5) / 2 = 4. A[Mid] = 12 > 10, so Upper = Mid - 1 = 3.
Step 6: Mid = (3 + 3) / 2 = 3. Here Lower = Upper and A[Mid] == 10, so PRINT "Search successful".
Step 7: End.
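The trace above corresponds to the following iterative C sketch (0-based indices, as in the example):

/* Binary search in sorted a[0..n-1]; returns the index of item, or -1. */
int binary_search(const int a[], int n, int item) {
    int lower = 0, upper = n - 1;
    while (lower <= upper) {
        int mid = (lower + upper) / 2;   /* middle element */
        if (a[mid] == item)
            return mid;                  /* search successful */
        else if (a[mid] < item)
            lower = mid + 1;             /* key lies in the upper half */
        else
            upper = mid - 1;             /* key lies in the lower half */
    }
    return -1;                           /* search unsuccessful */
}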
The best-case analysis is easy. For a successful search, only one element comparison is needed. For an unsuccessful search, Theorem 3.2 states that ⌊log n⌋ element comparisons are needed in the best case. In conclusion, we can completely describe the computing time of binary search by giving formulas for the best, average, and worst cases:
Successful searches: best Θ(1), average Θ(log n), worst Θ(log n).
Unsuccessful searches: best Θ(log n), average Θ(log n), worst Θ(log n).
MERGE SORT
Merge sort is also one of the divide-and-conquer class of algorithms. It is a sorting algorithm that sorts an unordered list of elements. Merge sort is a recursive algorithm that splits the array into two sub-arrays, sorts each sub-array, and then merges the two sorted sub-arrays into a single sorted array.
Given a sequence of n elements (also called keys) a[1], ..., a[n], the general idea is to imagine them split into two sets a[1], ..., a[⌊n/2⌋] and a[⌊n/2⌋ + 1], ..., a[n]. Each set is individually sorted, and the resulting sorted sequences are merged to produce a single sorted sequence of n elements. The base case of the recursion is a sub-array of size 1 (or 0). The merge sort algorithm closely follows the divide-and-conquer strategy, and it is also well suited to external sorting.
Divide: divide the N-element array to be sorted into two sub-arrays of N/2 elements each.
Conquer: sort the sub-arrays recursively using merge sort.
Combine: merge the two sorted sub-arrays to produce the final sorted array.
Suppose we have to sort an array of N elements, A[p .. r]. Initially p = 1 and r = N.
Divide step: if the given array A has zero or one element, simply return; it is already sorted. Otherwise, split A[p .. r] into two sub-arrays A[p .. q] and A[q + 1 .. r], each containing about half of the elements of A[p .. r]. That is, q is the halfway point of A[p .. r].
Conquer step: conquer by recursively sorting the two sub-arrays A[p .. q] and A[q + 1 .. r].
Combine step: combine the elements back in A[p .. r] by merging the two sorted sub-arrays A[p .. q] and A[q + 1 .. r] into a single sorted sequence.
Algorithm MergeSort(low, high)
// a[low : high] is a global array to be sorted.
{
    if (low < high) then
    {
        mid := ⌊(low + high)/2⌋; // divide the array at the midpoint
        MergeSort(low, mid);
        MergeSort(mid + 1, high);
        Merge(low, mid, high);
    }
}
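A self-contained C sketch of MergeSort together with Merge (the auxiliary array b and the size bound are our choices, not from the notes):

#include <stdio.h>
#include <string.h>

#define MAX_N 100

static int b[MAX_N];   /* auxiliary array used by merge */

/* Merge sorted runs a[low..mid] and a[mid+1..high] into sorted a[low..high]. */
static void merge(int a[], int low, int mid, int high) {
    int i = low, j = mid + 1, k = low;
    while (i <= mid && j <= high)
        b[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= mid)  b[k++] = a[i++];   /* leftovers from the left run */
    while (j <= high) b[k++] = a[j++];   /* leftovers from the right run */
    memcpy(&a[low], &b[low], (size_t)(high - low + 1) * sizeof(int));
}

void merge_sort(int a[], int low, int high) {
    if (low < high) {
        int mid = (low + high) / 2;
        merge_sort(a, low, mid);          /* sort the left half */
        merge_sort(a, mid + 1, high);     /* sort the right half */
        merge(a, low, mid, high);         /* combine the sorted halves */
    }
}

int main(void) {
    int a[] = {12, 31, 25, 8, 32, 17, 40, 42};   /* the example array below */
    merge_sort(a, 0, 7);
    for (int i = 0; i < 8; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}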
Ex: Consider the array of eight elements a[1:8] = (12, 31, 25, 8, 32, 17, 40, 42). Algorithm MergeSort begins by splitting a[] into two sub-arrays each of size four (a[1:4] and a[5:8]). The elements in a[1:4] are then split into two sub-arrays of size two (a[1:2] and a[3:4]). Then the items in a[1:2] are split a final time into one-element sub-arrays, and now the merging begins. Note that no movement of data has yet taken place. A record of the sub-arrays is implicitly maintained by the recursive mechanism.
According to merge sort, first divide the given array into two equal halves. Merge sort keeps dividing the list into equal parts until it cannot be divided further.
As there are eight elements in the given array, it is divided into two arrays of size 4.
Now, again divide these two arrays into halves. As they are of size 4, divide them into new arrays of size 2.
Now, again divide these arrays to get the atomic values that cannot be divided further.
In combining, first compare the elements of each array and then combine them into another array in sorted order. So, first compare 12 and 31; both are already in sorted positions. Then compare 25 and 8, and in the list of two values, put 8 first followed by 25. Then compare 32 and 17, sort them, and put 17 first followed by 32. After that, compare 40 and 42, and place them sequentially.
In the next iteration of combining, compare the arrays of two data values and merge them into arrays of four values in sorted order.
Now, there is a final merging of the arrays. After the final merge, the array will look like (8, 12, 17, 25, 31, 32, 40, 42).
Algorithm
Merge(array A, int p, int q, int r) { // merges A[p..q] with A[q+1..r]
    array B[p..r]                     // auxiliary array
    i = p; j = q+1; k = p
    while (i <= q and j <= r) { B[k++] = (A[i] <= A[j]) ? A[i++] : A[j++] }
    while (i <= q) { B[k++] = A[i++] }   // copy remainder of left run
    while (j <= r) { B[k++] = A[j++] }   // copy remainder of right run
    copy B[p..r] back to A[p..r]
}
Finally, merging the two sorted lists takes cn time, by the comments made above. In conclusion we have
T(n) = 2T(n/2) + cn
     = 4T(n/4) + 2cn
     = 4(2T(n/8) + cn/4) + 2cn = 8T(n/8) + 3cn
     ...
     = 2^k T(1) + kcn
     = an + cn log n
It is easy to see that if n = 2^k then k = log n; solving the above recurrence, merge sort has a time complexity of Θ(n log n).
Now, let us see the time complexity of merge sort in the best case, average case, and worst case, along with its space complexity.
1. Time Complexity
Case           Time Complexity
Best Case      O(n log n)
Average Case   O(n log n)
Worst Case     O(n log n)
2. Space Complexity: merge sort requires O(n) auxiliary space for the temporary array used in merging.
QUICK SORT
In quick sort, the division into two sub-arrays is made so that the sorted sub-arrays do not need to be merged later. This is accomplished by rearranging the elements in a[1:n] such that a[i] ≤ a[j] for all i between 1 and m and all j between m + 1 and n, for some m, 1 ≤ m < n. Thus, the elements in a[1:m] and a[m + 1:n] can be independently sorted.
Combine: no work is needed to combine the sub-arrays, because they are sorted in place.
• The divide step is performed by a procedure PARTITION, which returns the index q that marks the position separating the sub-arrays.
QUICKSORT (A, p, r)
if p < r
    then q ← PARTITION(A, p, r)
         QUICKSORT (A, p, q - 1)
         QUICKSORT (A, q + 1, r)
Partitioning
Partition sub array A [p . . . r] by the following procedure:
PARTITION (A, p, r)
x ← A[r]
i ← p - 1
for j ← p to r - 1
    do if A[j] ≤ x
        then i ← i + 1
             exchange A[i] ↔ A[j]
exchange A[i + 1] ↔ A[r]
return i + 1
PARTITION always selects the last element A[r] in the sub-array A[p .. r] as the pivot, the element around which to partition. As the procedure executes, the array is partitioned into four regions (elements ≤ pivot, elements > pivot, unexamined elements, and the pivot itself), some of which may be empty.
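A C sketch of QUICKSORT with the PARTITION procedure above (Lomuto scheme, last element as pivot; note that the walkthrough below uses a different, leftmost-pivot variant):

/* Exchange two array elements. */
static void swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* PARTITION as above: pivot x = A[r]; returns the pivot's final index. */
static int partition(int a[], int p, int r) {
    int x = a[r];                  /* pivot */
    int i = p - 1;
    for (int j = p; j <= r - 1; j++)
        if (a[j] <= x) {
            i = i + 1;
            swap(&a[i], &a[j]);    /* grow the region of elements <= pivot */
        }
    swap(&a[i + 1], &a[r]);        /* place the pivot between the regions */
    return i + 1;
}

void quicksort(int a[], int p, int r) {
    if (p < r) {
        int q = partition(a, p, r);
        quicksort(a, p, q - 1);    /* sort the left sub-array */
        quicksort(a, q + 1, r);    /* sort the right sub-array */
    }
}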
To understand the working of quick sort, let us trace it on the unsorted array a = (24, 9, 29, 14, 19, 27). It will make the concept more clear and understandable.
In the given array, we consider the leftmost element as the pivot. So, in this case, a[left] = 24, a[right] = 27 and a[pivot] = 24.
Since the pivot is at the left, the algorithm starts from the right and moves towards the left.
Since a[pivot] < a[right], the algorithm moves the right pointer one position towards the left.
Now, a[left] = 24, a[right] = 19, and a[pivot] = 24.
Because a[pivot] > a[right], the algorithm swaps a[pivot] with a[right], and the pivot moves to the right, giving a = (19, 9, 29, 14, 24, 27).
Now, a[left] = 9, a[right] = 24, and a[pivot] = 24. As a[pivot] > a[left], the algorithm moves the left pointer one position to the right.
Now, a[left] = 29, a[right] = 24, and a[pivot] = 24. As a[pivot] < a[left], swap a[pivot] and a[left]; the pivot is now at the left, giving a = (19, 9, 24, 14, 29, 27).
Since the pivot is at the left, the algorithm starts from the right and moves to the left. Now, a[left] = 24, a[right] = 29, and a[pivot] = 24. As a[pivot] < a[right], the algorithm moves the right pointer one position to the left.
Now, a[pivot] = 24, a[left] = 24, and a[right] = 14. As a[pivot] > a[right], swap a[pivot] and a[right]; the pivot is now at the right, giving a = (19, 9, 14, 24, 29, 27).
Now, a[pivot] = 24, a[left] = 14, and a[right] = 24. The pivot is at the right, so the algorithm starts from the left and moves to the right.
Now, a[pivot] = 24, a[left] = 24, and a[right] = 24. The pivot, left, and right all point to the same element, which signals the termination of the procedure.
Element 24, the pivot, is now placed at its exact position: the elements to the right of 24 are greater than it, and the elements to the left of 24 are smaller than it.
Now, in a similar manner, the quick sort algorithm is separately applied to the left and right sub-arrays. After sorting is done, the array will be (9, 14, 19, 24, 27, 29).
• If the sub-arrays are balanced, then quick sort can run as fast as merge sort.
• If they are unbalanced, then quick sort can run as slowly as insertion sort.
Worst case
• Occurs when the partition is maximally unbalanced: one sub-array has n - 1 elements and the other has 0. The recurrence is
T(n) = T(n - 1) + Θ(n)
     = O(n^2).
Best case
• Occurs when the sub-arrays are completely balanced every time.
• Each sub-array has at most n/2 elements.
• This gives the recurrence
T(n) = 2T(n/2) + Θ(n) = O(n log n).