Introduction To Algorithms: Unit 1
Introduction to Algorithms
Syllabus :
Introduction : ‘O’, ‘Ω’ and ‘Θ’ asymptotic notations, Average, Best and Worst
case analysis of algorithms for Time and Space complexity, Amortized
Analysis, Solving Recurrence Equations, Proof Techniques: by Contradiction,
by Mathematical Induction.
Priority Queues : Heaps & Heap sort.
The word algorithm comes from the name of the 9th century Persian
mathematician Abu Abdullah Muhammad ibn Musa al-Khwarizmi, whose
works introduced Indian numerals and algebraic concepts. The word algorism
originally referred only to the rules of performing arithmetic using Arabic
numerals but evolved, via the European Latin translation of al-Khwarizmi’s name,
into algorithm by the 18th century. The word has since come to include all definite
procedures for solving problems or performing tasks.
In mathematics, computing, linguistics and related subjects, an
algorithm is a finite sequence of instructions, often used for calculation and
data processing. Formally, it is a type of effective method in which a list of
well-defined instructions for completing a task will, when given an initial state,
proceed through a well-defined series of successive states, eventually
terminating in an end-state.
Big O-notation
Example 3 : The function 3n + 2 = Θ(n), as 3n + 2 ≥ 3n for all n ≥ 2 and 3n + 2 ≤
4n for all n ≥ 2, so c1 = 3, c2 = 4 and n0 = 2.
The theta notation is more precise than both the big oh and omega notations.
The function f(n) = Θ(g(n)) if and only if g(n) is both an upper and a lower bound on f(n).
f(n) = Θ(g(n))
Fig. 1.1
f (n) = Ο (g(n))
Fig. 1.2
Example 3 : 10n^2 + 4n + 2 = O(n^2) because 10n^2 + 4n + 2 ≤ 11n^2 for all n ≥ 5.
f (n) = Ω (g(n))
Fig. 1.3
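The constants quoted in the examples above (c1 = 3, c2 = 4, n0 = 2 for 3n + 2, and c = 11, n0 = 5 for 10n^2 + 4n + 2) can be spot-checked numerically. The Python sketch below is only a finite check over an arbitrarily chosen range, not a proof; the helper name holds_for_range and the upper limit are assumptions made for illustration.

```python
def holds_for_range(predicate, n0, upper=10_000):
    """Return True if predicate(n) holds for every integer n0 <= n < upper."""
    return all(predicate(n) for n in range(n0, upper))


# 3n + 2 = Theta(n) with c1 = 3, c2 = 4, n0 = 2
theta_ok = holds_for_range(lambda n: 3 * n <= 3 * n + 2 <= 4 * n, n0=2)

# 10n^2 + 4n + 2 = O(n^2) with c = 11, n0 = 5
big_o_ok = holds_for_range(lambda n: 10 * n * n + 4 * n + 2 <= 11 * n * n, n0=5)

# With c = 11 the inequality fails at n = 4, so n0 cannot be lowered to 4.
fails_at_4 = not (10 * 4 * 4 + 4 * 4 + 2 <= 11 * 4 * 4)
```

A different constant c would generally give a different threshold n0; the definitions only require that some valid pair exists.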
• determine the robustness of the program e.g., how well does it deal with
unexpected or erroneous inputs?
In this text, we are concerned primarily with the running time. We also
consider the memory space needed to execute the program. There are many
factors that affect the running time of a program. Among these are the
algorithm itself, the input data, and the computer system used to run the
program. The performance of a computer is determined by
• the hardware :
• disk available;
• the programming language in which the algorithm is specified;
Statement                     Steps/exec   Frequency   Total steps
1 Algorithm Sum(a[], n)           0            -            0
2 {                               0            -            0
3   s = 0.0;                      1            1            1
4   for i = 1 to n do             1          n + 1        n + 1
5     s = s + a[i];               1            n            n
6   return s;                     1            1            1
7 }                               0            -            0
Total                                                     2n + 3
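The total of 2n + 3 can be reproduced by instrumenting the summation with a step counter. The sketch below is a hedged illustration in Python, assuming the usual convention of one step per assignment, per loop test (including the final failing test) and per return; the function name sum_with_step_count is hypothetical.

```python
def sum_with_step_count(a):
    """Sum the list a and count steps: one per assignment, loop test and return."""
    steps = 0
    s = 0.0
    steps += 1                 # s = 0.0
    for x in a:
        steps += 1             # loop test that succeeds (n times in total)
        s += x
        steps += 1             # s = s + a[i]
    steps += 1                 # final loop test that fails
    steps += 1                 # return s
    return s, steps


total, steps = sum_with_step_count([1.0, 2.0, 3.0, 4.0])
```

For a list of n elements the counter ends at 1 + (n + 1) + n + 1 = 2n + 3, matching the table total.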
Example 2
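Statement                     Steps/exec   Frequency   Total steps
1 Algorithm Sum(a[], n, m)        0            -            0
2 {                               0            -            0
3   for i = 1 to n do             1          n + 1        n + 1
4     for j = 1 to m do           1        n(m + 1)     n(m + 1)
5       s = s + a[i][j];          1           nm           nm
6   return s;                     1            1            1
7 }                               0            -            0
Total                                                  2nm + 2n + 2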
Amortized analysis
The aggregate method, though simple, lacks the precision of the other two
methods. In particular, the accounting and potential methods allow a
specific amortized cost to be allocated to each operation.
Accounting Method
Potential method
Amortized costs can provide a clean abstraction of data-structure
performance.
Any of the analysis methods can be used when an amortized analysis is
called for, but each method has some situations where it is arguably the
simplest or most precise.
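As a concrete illustration of the aggregate and accounting views, consider a stack that supports push and a "multipop" that removes up to k elements. This example is not taken from the text above; it is a standard textbook illustration, and the class and method names in the Python sketch below are assumptions. A single multipop may cost up to n units, yet any sequence containing n pushes costs at most 2n units in total, because each element can be popped at most once after it is pushed.

```python
class CountingStack:
    """Stack that records actual cost: one unit per element pushed or popped."""

    def __init__(self):
        self.items = []
        self.actual_cost = 0

    def push(self, x):
        self.items.append(x)
        self.actual_cost += 1

    def multipop(self, k):
        """Pop up to k elements; the cost is the number actually popped."""
        popped = 0
        while self.items and popped < k:
            self.items.pop()
            popped += 1
        self.actual_cost += popped
        return popped


stack = CountingStack()
n_pushes = 1000
for i in range(n_pushes):
    stack.push(i)
    if i % 100 == 99:
        stack.multipop(250)   # an occasional expensive operation
```

In accounting terms, charging 2 units per push (one for the push itself, one saved as credit for the eventual pop) and 0 per multipop covers every actual cost, so the amortized cost per operation is O(1).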
Proof Let P denote the set of all prime numbers. Assume for a contradiction that
P is a finite set. The set P is not empty since it contains at least the integer 2. Since P is
finite and nonempty, it makes sense to multiply all its elements. Let x denote that
product, and let y denote x + 1. Consider the smallest integer d that is larger than 1
and that is a divisor of y . Such an integer certainly exists since y is larger than 1 and
we do not require that d be different from y . First note that d itself is prime, for
otherwise any proper divisor of d would also divide y and be smaller than d, which
would contradict the definition of d. Therefore, according to our assumption that P
contains each and every prime, d belongs to P. This shows that d is also a divisor
of x since x is the product of a collection of integers including d. We have reached
the conclusion that d exactly divides both x and y. But recall that y = x + 1.
Therefore, we have obtained an integer d larger
than 1 that divides two consecutive integers x and y. This is clearly impossible: if
indeed d divides x, then the division of y by d will necessarily leave 1 as remainder.
The inescapable conclusion is that the original assumption was equally impossible.
But the original assumption was that the set P of all primes is finite, and therefore its
impossibility establishes that the set P is in fact infinite.
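The construction used in the proof can be run on any finite list of primes: multiply them, add 1, and take the smallest divisor larger than 1. The Python sketch below illustrates this under the assumption that the input list contains only primes; the function names are hypothetical. It shows that the divisor d obtained is a prime missing from the list, which is exactly what contradicts the finiteness assumption.

```python
from math import prod


def smallest_divisor_above_one(y):
    """Return the smallest integer d > 1 that divides y (requires y > 1)."""
    d = 2
    while d * d <= y:
        if y % d == 0:
            return d
        d += 1
    return y  # no divisor up to sqrt(y), so y itself is prime


def new_prime_outside(primes):
    """Given a finite, nonempty list of primes, return a prime not in the list."""
    x = prod(primes)      # x is divisible by every prime in the list
    y = x + 1             # y leaves remainder 1 when divided by any of them
    return smallest_divisor_above_one(y)


listed = [2, 3, 5, 7, 11, 13]
witness = new_prime_outside(listed)   # 30031 = 59 * 509, so the witness is 59
```

Note that y itself need not be prime (here 30031 is composite); the proof only needs its smallest divisor d.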
Theorem 2 There exist two irrational numbers x and y such that x^y is rational.
For instance, if it is correct that P (x) is true for all x in X, but we are careless
in applying this rule to some y that does not belong to X, we may erroneously
believe that P(y) holds. Similarly, if our belief that P(x) is true for all x in X is
based on careless inductive reasoning, then P(y) may be false even if indeed y
belongs to X. In conclusion, deductive reasoning can yield a wrong result, but
only if the rules that are followed are incorrect or if they are not followed
properly.
For example
1^3 = 1 = 1^2
1^3 + 2^3 = 9 = 3^2
1^3 + 2^3 + 3^3 = 36 = 6^2
1^3 + 2^3 + 3^3 + 4^3 = 100 = 10^2
1^3 + 2^3 + 3^3 + 4^3 + 5^3 = 225 = 15^2
From the above examples it is natural to infer that the sum of the cubes of the first n
positive integers is always a perfect square. It turns out in this case that
inductive reasoning yields a correct law.
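The pattern in the examples can be extended mechanically. The Python sketch below recomputes the five rows and then tests the stronger, well-known statement that the sum of the first n cubes equals the square of 1 + 2 + ... + n = n(n + 1)/2. A finite check of this kind is still only inductive evidence; the function names are chosen for illustration.

```python
def sum_of_cubes(n):
    """Return 1^3 + 2^3 + ... + n^3."""
    return sum(k ** 3 for k in range(1, n + 1))


def triangular(n):
    """Return 1 + 2 + ... + n."""
    return n * (n + 1) // 2


# (n, sum of cubes, square root of that sum) for the five rows shown above
rows = [(n, sum_of_cubes(n), triangular(n)) for n in range(1, 6)]

# Evidence for larger n: still not a proof, which needs mathematical induction.
law_holds = all(sum_of_cubes(n) == triangular(n) ** 2 for n in range(0, 500))
```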
1.5.2.1 The principle of Mathematical Induction
Consider any property P of the integers. For instance, P(n) could be "sq(n) = n^2",
or "the sum of the cubes of the first n integers is equal to the square of the sum of
those integers", or "n^3 < 2^n". The first two properties hold for every n ≥ 0, whereas
the third holds provided n ≥ 10. Consider also an integer a, known as the basis. If
You are also given a supply of tiles, each of which looks like a 2 x 2 board with one square
removed, as illustrated in Figure 1.5(b). Your puzzle is to cover the board with these tiles so
that each square is covered exactly once, with the exception of the special square, which is not
covered at all. Such a covering is called a tiling. Figure 1.5(d) gives a solution to the instance
given in Figure 1.5(a).
Proof The proof is by mathematical induction on the integer n such that m = 2^n.
o Basis: The case n = 0 is trivially satisfied. Here m = 1, and the 1x1 "board" is a
single square, which is necessarily special. Such a board is tiled by doing nothing!
(If you do not like this argument, check the next simplest case: if n = 1, then m = 2
and any 2 x 2 board from which you remove one square looks exactly like a tile by
definition.)
o Induction step: Consider any n ≥ 1. Let m = 2^n. Assume the induction hypothesis
that the theorem is true for 2^(n-1) x 2^(n-1) boards. Consider an m x m board, containing
one arbitrarily placed special square. Divide the board into 4 equal sub-boards by
halving it horizontally and vertically. The original special square now belongs to
exactly one of the sub-boards. Place one tile in the middle of the original board so as
to cover exactly one square of each of the other three sub-boards; see Figure 1.5(c).
Call each of the three squares thus covered "special" for the corresponding
sub-board. We are left with four 2^(n-1) x 2^(n-1) sub-boards, each containing one special
square. By our induction hypothesis, each of these sub-boards can be tiled. The final
solution is obtained by combining the tilings of the sub-boards together with the
tile placed in the middle of the original board.
Since the theorem is true when m = 2^0, and since its truth for m = 2^n follows
from its assumed truth for m = 2^(n-1) for all n ≥ 1, it follows from the principle of
mathematical induction that the theorem is true for all m provided m is a power
of 2.
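The induction step translates directly into a recursive procedure: place one tile at the centre so that it covers one square in each sub-board that lacks the special square, then recurse on the four sub-boards. The Python sketch below is one possible rendering under assumed conventions (rows and columns numbered from 0, the special square stored as 0, each tile given a positive integer label); none of these names come from the text.

```python
def tile_board(n, special_row, special_col):
    """Tile a 2^n x 2^n board, leaving only the special square uncovered.

    Returns a grid in which 0 marks the special square and each positive
    integer labels the three squares covered by one tile.
    """
    m = 2 ** n
    board = [[None] * m for _ in range(m)]
    board[special_row][special_col] = 0
    next_tile = [1]  # mutable counter shared by the recursive calls

    def solve(top, left, size, srow, scol):
        if size == 1:
            return  # basis: a 1 x 1 board is just its special square
        half = size // 2
        tile = next_tile[0]
        next_tile[0] += 1
        # Each entry: sub-board origin and its square adjacent to the centre.
        quadrants = [
            (top, left, top + half - 1, left + half - 1),
            (top, left + half, top + half - 1, left + half),
            (top + half, left, top + half, left + half - 1),
            (top + half, left + half, top + half, left + half),
        ]
        for qtop, qleft, crow, ccol in quadrants:
            inside = qtop <= srow < qtop + half and qleft <= scol < qleft + half
            if inside:
                solve(qtop, qleft, half, srow, scol)
            else:
                board[crow][ccol] = tile      # covered by the central tile
                solve(qtop, qleft, half, crow, ccol)

    solve(0, 0, m, special_row, special_col)
    return board


board = tile_board(3, 5, 2)   # an 8 x 8 board with the special square at (5, 2)
```

An m x m board needs (m^2 - 1)/3 tiles, so the 8 x 8 example uses 21 of them.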
Design & Analysis of Algorithms (PU) 1-2 Introduction to Algorithms
o Induction step: Consider any number n of horses in H. Call these horses h1, h2, ...,
hn. Assume the induction hypothesis that any set of n - 1 horses contains only
horses of a single colour (but of course the horses in one set could a priori be a
different colour from the horses in another). Let H1 be the set obtained by
removing horse h1 from H, and let H2 be defined similarly; see Figure 1.6.
H1 : h2 h3 h4 h5
H2 : h1 h3 h4 h5
Figure 1.6. Horses of the same colour (n = 5)
There are n - 1 horses in each of these two new sets. Therefore, the induction
hypothesis applies to them. In particular, all the horses in H1 are of a single colour,
say c1, and all the horses in H2 are also of a single colour, say c2. But is it really
possible for colour c1 to be different from colour c2? Surely not, since horse hn
belongs to both sets and therefore both c1 and c2 must be the colour of that horse!
Since all the horses in H belong
to either H1 or H2 (or both), we conclude that they are all the same colour
c = c1 = c2. This completes the induction step and the proof by mathematical
induction.
We are now ready to formulate a more general principle of mathematical induction. Consider
any property P of the integers, and two integers a and b such that a ≤ b. If
Theorem 5 Every positive composite integer can be expressed as a product of prime numbers.
Proof The proof is by generalized mathematical induction. In this case, there is no need for a basis.
o Induction step: Consider any composite integer n ≥ 4. (Note that 4 is the smallest positive
composite integer, hence it would make no sense to consider smaller values of n.) Assume the
induction hypothesis that any positive composite integer smaller than n can be expressed as a
product of prime numbers. (In the smallest case n = 4, this induction hypothesis is vacuous.)
Consider the smallest integer d that is larger than 1 and that is a divisor of n. As argued in the
proof of Theorem 1, d is necessarily prime. Let m = n/d. Note that 1 < m < n because n is
composite and d > 1. There are two cases.
- If m is prime, we have decomposed n as the product of two primes: n = d x m.
- If m is composite, it is positive and smaller than n, and therefore the induction hypothesis
applies: m can be expressed as a product of prime numbers, say m = p1 p2 ... pk.
Therefore n = d x m can be expressed as n = d p1 p2 ... pk, also a product of prime
numbers.
In either case, this completes the proof of the induction step and thus of the
theorem.
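The induction step is constructive: split off the smallest divisor d > 1 and handle m = n/d in the same way. The Python sketch below follows that idea; as an assumption beyond the theorem, it also accepts prime inputs (returning a one-element list), and the function names are hypothetical.

```python
def smallest_divisor_above_one(n):
    """Return the smallest integer d > 1 that divides n (requires n > 1)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


def prime_factors(n):
    """Return the prime factors of an integer n >= 2 in nondecreasing order."""
    d = smallest_divisor_above_one(n)   # d is prime, as argued in the proof
    m = n // d
    if m == 1:
        return [d]                      # n itself is prime
    return [d] + prime_factors(m)       # m < n, so the recursion terminates


factors = prime_factors(360)
```

Because m is strictly smaller than n at every call, the recursion mirrors the use of the induction hypothesis on smaller integers.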
A heap is a special kind of rooted tree that can be implemented efficiently in an array.
Heapsort is a sorting algorithm that uses the heap data structure.
Heap data structure
• Heap is a nearly complete binary tree.
• Height of node = number of edges on a longest simple path from the node down to a
leaf.
• Height of heap = height of root = Θ(log n).
• A heap can be stored as an array A.
Root of tree is A[1].
Parent of A[i] = A[⌊i/2⌋].
Left child of A[i] = A[2i].
Right child of A[i] = A[2i + 1].
Computing these indices is fast with a binary-representation implementation (shifts).
Heap property
• For max-heaps (largest element at root), max-heap property: for all nodes i, excluding
the root, A[PARENT(i )] ≥ A[i ].
• For min-heaps (smallest element at root), min-heap property: for all nodes i , excluding
the root, A[PARENT(i )] ≤ A[i ].
By induction and transitivity of ≤, the max-heap property guarantees that the maximum
element of a max-heap is at the root. A similar argument holds for min-heaps. The heapsort
algorithm uses max-heaps. In general, heaps can be k-ary trees instead of binary.
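The array layout and the max-heap property can be expressed in a few lines. The Python sketch below keeps the 1-based indexing of the text by leaving slot 0 of the list unused, which is an assumption made for readability; the sample array is likewise chosen only for illustration.

```python
def parent(i):
    return i >> 1            # floor(i / 2), a right shift


def left(i):
    return i << 1            # 2i, a left shift


def right(i):
    return (i << 1) | 1      # 2i + 1


def is_max_heap(a, n):
    """Check A[PARENT(i)] >= A[i] for every non-root node i of a[1..n].

    a[0] is an unused placeholder so that indices match the 1-based text.
    """
    return all(a[parent(i)] >= a[i] for i in range(2, n + 1))


heap = [None, 16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
heap_ok = is_max_heap(heap, 10)
```

When the check passes, the maximum element is at index 1, as the induction argument above guarantees.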
The way MAX-HEAPIFY works
• Continue this process of comparing and swapping down the heap, until subtree rooted
at i is max-heap. If we hit a leaf, then the subtree rooted at the leaf is trivially a max-
heap.
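The comparing-and-swapping process can be sketched as follows. This Python version assumes the usual formulation of MAX-HEAPIFY, in which the subtrees rooted at the children of i are already max-heaps and A[i] is exchanged with its larger child until the property holds; it uses a loop rather than recursion and leaves slot 0 unused to keep 1-based indices. The sample array is an assumption, not the example from the text.

```python
def max_heapify(a, i, n):
    """Restore the max-heap property at node i of a[1..n] (a[0] unused).

    Assumes the subtrees rooted at 2i and 2i + 1 are already max-heaps.
    """
    while True:
        l, r = 2 * i, 2 * i + 1
        largest = i
        if l <= n and a[l] > a[largest]:
            largest = l
        if r <= n and a[r] > a[largest]:
            largest = r
        if largest == i:
            return            # subtree rooted at i is now a max-heap
        a[i], a[largest] = a[largest], a[i]
        i = largest           # continue down the heap


a = [None, 16, 4, 10, 14, 7, 9, 3, 2, 8, 1]
max_heapify(a, 2, 10)         # the value 4 at node 2 sinks to node 9
```

Each iteration moves one level down, so the running time is bounded by the height of the node, which is O(log n).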
Run MAX-HEAPIFY on the following heap example :
To delete the maximum key from the max heap, we use an algorithm called Adjust.
Adjust takes as input the array a[ ] and the integers i and n.
Algorithm 1.2
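Algorithm Adjust(a, i, n)
// The complete binary trees with roots 2i and 2i + 1 are combined with node i
// to form a heap rooted at i. No node has an address greater than n or less than 1.
{
    j := 2i; item := a[i];
    while (j ≤ n) do
    {
        if ((j < n) and (a[j] < a[j + 1])) then j := j + 1;
        // Compare left and right child and let j be the larger child.
        if (item ≥ a[j]) then break;
        // A position for item is found.
        a[⌊j/2⌋] := a[j]; j := 2j;
    }
    a[⌊j/2⌋] := item;
}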
Algorithm 1.3
A sorting Algorithm
Algorithm Sort(a, n)
// Sort the elements a[1 : n].
{
    for i := 1 to n do Insert(a, i);
    for i := n to 1 step -1 do
    {
        DelMax(a, i, x); a[i] := x;
    }
}
1.6.2 Heapsort
The best-known example of the use of a heap arises in its application to sorting. A
conceptually simple sorting strategy has been given before, in which the maximum value is
continually removed from the remaining unsorted elements. A sorting algorithm that
incorporates the fact that n elements can be inserted in O(n) time is given in Algorithm 1.4 as
follows :
Algorithm 1.4
Heapsort Algorithm
Algorithm HeapSort(a, n)
// a[1 : n] contains n elements to be sorted. HeapSort rearranges them in place into
// nondecreasing order.
{
    // Transform the array into a heap.
    for i := ⌊n/2⌋ to 1 step -1 do Adjust(a, i, n);
    // Interchange the new maximum with the element at the end of the array.
    for i := n to 2 step -1 do
    {
        t := a[i]; a[i] := a[1]; a[1] := t;
        Adjust(a, 1, i - 1);
    }
}
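For readers who want to run the algorithm, the following Python sketch mirrors HeapSort and the sift-down routine Adjust. It is an illustrative translation rather than the text's own code: it copies the input into a list with an unused slot 0 so that the 1-based index arithmetic carries over unchanged, and it returns a new list instead of sorting in place.

```python
def adjust(a, i, n):
    """Sift a[i] down so that the subtree rooted at i in a[1..n] is a max-heap."""
    item = a[i]
    j = 2 * i
    while j <= n:
        if j < n and a[j] < a[j + 1]:
            j += 1                    # j indexes the larger child
        if item >= a[j]:
            break                     # a position for item is found
        a[j // 2] = a[j]              # move the larger child up
        j *= 2
    a[j // 2] = item


def heap_sort(values):
    """Return a new list with the values in nondecreasing order."""
    a = [None] + list(values)         # a[0] unused: 1-based indexing as in the text
    n = len(values)
    for i in range(n // 2, 0, -1):    # build the heap bottom-up
        adjust(a, i, n)
    for i in range(n, 1, -1):
        a[1], a[i] = a[i], a[1]       # move the current maximum to its final slot
        adjust(a, 1, i - 1)
    return a[1:]


result = heap_sort([5, 3, 8, 1, 9, 2])
```

Building the heap takes O(n) time and each of the n - 1 calls to adjust in the second loop takes O(log n), giving O(n log n) overall.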
Q. 1 Flowcharting and pseudocode are 2 different design tools for an algorithm. How do
they differ and how are they similar ?
Ans. : Both flowcharting and pseudocode are used to design individual parts of a program.
But flowcharting gives a pictorial representation of the logical flow of an algorithm.
This is in contrast to the other design tool, pseudocode, which provides a textual (part
English, part structured code) design solution.
Q. 2 What are the factors which contribute to the running time of a program ?
Ans. : There are basically four factors on which the running time of a program depends. They
are :
(i) The input to the program.
(ii) The quality of code generated by the compiler used to create the object code.
(iii) The nature and speed of the instructions on the machine used to execute the program.
(iv) The time complexity of the algorithm underlying the program.
Q. 3 The input to the program contributes to the running time of a program – Explain.
Ans. : This indicates that the running time of a program should be defined as a function of the
input. In most cases, the running time depends not on the exact input (i.e. the particular
values supplied) but only on the size of the input. For example, sorting
five numbers with some algorithm takes less time than
sorting ten numbers with the same algorithm.
(i) Integrity :
(ii) Clarity :
Refers to the overall readability of a program, with emphasis on its underlying logic.
(iii) Simplicity :
The clarity and accuracy of a program are usually enhanced by keeping the things as
simple as possible, consistent with the overall program objectives.
(iv) Efficiency :
(v) Modularity :
(vi) Generality :
Program must be as general as possible. (viz., rather than keeping fixed values for
variables, it is better to read them).
Q. 5 In what way is the asymmetry between Big-Oh notation and Big-Omega notation
helpful ?
Ans. : There are many situations where an algorithm runs faster on some inputs but not
on all inputs. For example, suppose we have an algorithm that determines whether
its input is of prime length. This algorithm runs very fast whenever the input
length is even. Hence, we cannot find a good lower bound on the running time that is
true for all n ≥ n0.
Q. 6 What are the basic components, which contribute to the space complexity ?
Ans. : Space complexity is the amount of memory, the program needs for its execution. There
are basically two components, which need to be considered while determining the
space complexity. They are :
(i) A fixed part, which is independent of the characteristics (viz., number, size) of the inputs
and outputs. This part typically includes the instruction space
(i.e. space for the code), space for simple variables and fixed-size component variables
(also called aggregate), space for constants, and so on.
(ii) A variable part which consists of the space needed by component variables whose size is
dependent on the particular problem instance being solved, space needed by reference
variables (this depends on instance characteristics) and the recursion stack space.
∴ S(P) = c + SP,
where S(P) is the space requirement of program P, SP is the part that depends on the
instance characteristics, and c is a constant.
If we ignore the constant, we can say that the complexity is of the order of n. The
notation used is Big O i.e. O(n).
(a) 24
(b)
(d) n + n^2 + n^3
Q. 9 Show the step table (indicating number of steps per execution, frequency of each of
the statement) for the following segment :
Q. 12 Find out the frequency count for the following piece of code :
x = 5; y = 5;
for (i = 2; i <= x; i++)
    for (j = y; j >= 0; j--)
    {
        if (i == j)
            printf("xxx");
        else
            break;
    }
Ans. :
(i) x = 5; y = 5;
(iv) { if (i == j)
(vi) else
(vii) break;
(viii) }
Note here that the break statement will cause the termination of inner loop
(i.e. j loop).
i value   j value   Remarks
2         5         Since i ≠ j, control goes to statement (vii).
3         5         Since i ≠ j, control goes to statement (vii).
4         5         Since i ≠ j, control goes to statement (vii).
5         5         Since i = j, control goes to statement (v).
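The trace can be confirmed by simulating the loop nest and counting how often each statement runs. The Python sketch below is a hedged transliteration of the C fragment in the question; the counter names are assumptions. It also shows a detail that the table does not list: for i = 5 the inner loop runs a second time with j = 4 before breaking.

```python
def frequency_counts(x=5, y=5):
    """Count executions of the if-test, the printf and the break."""
    counts = {"if_test": 0, "printf": 0, "break": 0}
    for i in range(2, x + 1):          # for (i = 2; i <= x; i++)
        j = y
        while j >= 0:                  # for (j = y; j >= 0; j--)
            counts["if_test"] += 1
            if i == j:
                counts["printf"] += 1  # printf("xxx")
            else:
                counts["break"] += 1
                break                  # leaves the inner (j) loop only
            j -= 1
    return counts


counts = frequency_counts()
```

With x = y = 5 the test executes 5 times, the printf once and the break 4 times.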
Note : The reader can write a simple ‘C’ code and check the answer by step-execution.
i.e. 1 (1 for a = a + 2)
= n
= n2
Q. 16 If the algorithm doIt has the complexity 5n, calculate the run-time complexity of the
following program segment :
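i=1
loop i <= n
doIt (…)
i=i+1
Ans. : doIt has the complexity 5n, and the loop body executes n times. The segment
therefore takes n × 5n = 5n^2 steps, i.e. its run-time complexity is O(n^2).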
Q. 18 An algorithm runs on a given input of size n. If n is 4096, the run time is 512 milliseconds. If
n is 16384, the run time is 1024 milliseconds. What is the complexity ? What is the big –
‘O’ notation ?
Ans. :
n1 = 4096, n2 = 16384
f(n1) = 512, f(n2) = 1024
n2 = 4 × n1
f(n2) = 2 × f(n1)
Since n increases by a factor of four while f(n) increases by a factor of only two, the
complexity is n^(1/2). The
big – ‘O’ notation = ‘O’(n^(1/2)).
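The same reasoning can be written as a small calculation. The Python sketch below assumes, as the answer implicitly does, that the run time follows a power law f(n) = c · n^k; under that assumption k = log(f(n2)/f(n1)) / log(n2/n1). Two measurements cannot establish the model itself, so this is only an estimate, and the function name is hypothetical.

```python
import math


def estimated_exponent(n1, t1, n2, t2):
    """Estimate k in t = c * n^k from two (size, time) measurements."""
    return math.log(t2 / t1) / math.log(n2 / n1)


k = estimated_exponent(4096, 512, 16384, 1024)   # log 2 / log 4
```

Here k comes out as one half, in agreement with the O(n^(1/2)) answer.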
Review Questions
Q. 1 What is an algorithm ?
Q. 4 Compare the two functions n^2 and 2^n/4 for various values of n. Determine when the
second becomes larger than the first.
Q. 5 Determine the frequency counts for all statements in the following two algorithm
segments:
(a)
1. for i := 1 to n do
2.     for j := 1 to i do
3.         for k := 1 to j do
4.             x := x + 1;
(b)
1. i := 1;
2. while (i <= n) do
3. {
4.     x := x + 1;
5.     i := i + 1;
6. }
Q. 6 Find two functions f(N) and g(N) such that neither f(N) = O(g(N)) nor g(N) = O(f(N)).
b. Implement the code in the language of your choice, and give the running time for
several values of N.
Q. 8 Given the following recursive function to insert an element x into the highest zero
element of an array a :
Insert (x, a[ ], i)
What are the best, worst and average time and space complexities of Insert (x, a, n)
using the recursion tree method. Verify your solutions using the substitution method.
1. 20 is O(1)
2. n(n − 1)/2 is O(n^2)
4. Σ i^k (for i = 1 to n) is O(n^(k+1))
Dec 2006
Q. 1 (a) Prove by contradiction that “there are infinitely many prime numbers.” (8 Marks)
OR
(b) State whether the following functions are CORRECT or INCORRECT and justify your
answer.
(i) 3n + 2 = O (n)
(ii) 100n + 6 = O(n)
(iii) 10n^2 + 4n + 2 = O(n^2) (6 Marks)
May 2007
Q. 1 (a) Prove by generalized mathematical induction that “every positive integer can be expressed as
a product of prime numbers.” (8 Marks)
(b) Prove by contradiction: there exist two irrational numbers x and y such that x^y is rational.
(8 Marks)
OR
Q. 2 (a) Prove by mathematical induction that the sum of the cubes of the first n positive integers is
equal to the square of the sum of these integers. (8 Marks)
T(n) = O(n)
T(1) = Θ(1)
Show that the above recurrence is asymptotically bounded by Θ(n). (8 Marks)
Dec 2007
(b) Explain in brief Amortized analysis. Find the amortized cost with respect to stack operations.
(10 Marks)
OR
May 2008
(c) Prove by contradiction that “there are infinitely many prime numbers” (8 Marks)
OR
Q. 2 (a) Name and explain in two or three sentences three popular methods to arrive at amortized
costs for the operations. (6 Marks)
(b) If f(n) = a_m n^m + … + a_1 n + a_0, then prove that f(n) = O(n^m). (8 Marks)
(c) State whether the following equalities are correct or incorrect. (4 Marks)
(i) 5n^2 – 6n = Θ(n^2)
(ii) n! = O(n^m)
(iii) n^3 + 10^6 n^2 = Θ(n^2)
(iv) 6n^3/(log n + 1) = O(n^3).
Dec 2008
Q. 1 (a) What are the basic components which contribute to the space
complexity ? Compute the space needed by the following algorithms and
justify your answer.