MC4101 ADSA - Unit I


UNIT- I
Algorithms – Algorithms as a Technology -Time and Space complexity of algorithms-
Asymptotic analysis-Average and worst-case analysis-Asymptotic notation-Importance of
efficient algorithms- Program performance measurement - Recurrences: The Substitution Method
– The Recursion-Tree Method – Data structures and algorithms.

Overview of Data Structures


Introduction to Data Structures
A data structure is a way of collecting and organizing data so that we can perform operations on it effectively. Data structures are about representing data elements in terms of some relationship, for better organization and storage. For example, we may have the data player's name "Virat" and age 26. Here "Virat" is of String data type and 26 is of integer data type.
We can organize this data as a record, such as a Player record, and then collect and store players' records in a file or database as a data structure. For example: "Dhoni" 30, "Gambhir" 31, "Sehwag" 33.
In simple language, data structures are structures programmed to store ordered data so that various operations can be performed on it easily. A data structure represents how data is to be organized in memory. It should be designed and implemented in such a way that it reduces complexity and increases efficiency.

Basic types of Data Structures


Anything that can store data can be called a data structure; hence Integer, Float, Boolean, Char, etc., are all data structures. They are known as Primitive Data Structures.
We also have some complex data structures, which are used to store large and connected data. Some examples of abstract data structures are:

 Linked List
 Tree
 Graph
 Stack, Queue etc.
All these data structures allow us to perform different operations on data. We select a data structure based on which type of operation is required. We will look into these data structures in more detail in later lessons.

The data structures can also be classified on the basis of the following characteristics:

Characteristic: Description

Linear: In linear data structures, the data items are arranged in a linear sequence. Example: Array

Non-Linear: In non-linear data structures, the data items are not in sequence. Example: Tree, Graph

Homogeneous: In homogeneous data structures, all the elements are of the same type. Example: Array

Non-Homogeneous: In non-homogeneous data structures, the elements may or may not be of the same type. Example: Structures

Static: Static data structures are those whose sizes and associated memory locations are fixed at compile time. Example: Array

Dynamic: Dynamic data structures are those which expand or shrink as the program needs during execution; their associated memory locations also change. Example: Linked List created using pointers
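As a rough illustration of the static/dynamic distinction above, here is a minimal C sketch (the names marks and Node are chosen here for illustration and do not come from the text): the array's size is fixed at compile time, while the linked-list node is created and released while the program runs.

#include <stdio.h>
#include <stdlib.h>

/* Static structure: size fixed at compile time. */
int marks[5] = {90, 85, 70, 60, 95};

/* Dynamic structure: nodes are created and freed at run time. */
struct Node {
    int data;
    struct Node *next;
};

int main(void) {
    struct Node *head = malloc(sizeof *head);   /* structure grows on demand */
    head->data = 10;
    head->next = NULL;
    printf("static element: %d, dynamic element: %d\n", marks[0], head->data);
    free(head);                                 /* and shrinks on demand */
    return 0;
}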

What is an Algorithm ?
An algorithm is a finite set of instructions or logic, written in order, to accomplish a certain predefined task. An algorithm is not the complete code or program; it is just the core logic (solution) of a problem, which can be expressed either as an informal high-level description, as pseudocode, or using a flowchart.
Every Algorithm must satisfy the following properties:

1. Input- There should be 0 or more inputs supplied externally to the algorithm.


2. Output- There should be at least 1 output obtained.
3. Definiteness- Every step of the algorithm should be clear and well defined.
4. Finiteness- The algorithm should have finite number of steps.
5. Correctness- Every step of the algorithm must generate a correct output.
An algorithm is said to be efficient and fast if it takes less time to execute and consumes less memory space. The performance of an algorithm is measured on the basis of the following properties:

1. Time Complexity
2. Space Complexity

Space Complexity
It is the amount of memory space required by the algorithm during the course of its execution. Space complexity must be taken seriously for multi-user systems and in situations where limited memory is available.
An algorithm generally requires space for the following components:

 Instruction Space: the space required to store the executable version of the program. This space is fixed for a given program, but varies with the number of lines of code.
 Data Space: the space required to store the values of all constants and variables.
 Environment Space: the space required to store the environment information needed to resume a suspended function.

Time Complexity
Time complexity is a way to represent the amount of time needed by the program to run to completion. We will study this in detail in later sections.

Time Complexity of Algorithms


Time complexity of an algorithm signifies the total time required by the program to run to completion. The time complexity of algorithms is most commonly expressed using big O notation.
Time complexity is most commonly estimated by counting the number of elementary operations performed by the algorithm. Since an algorithm's performance may vary with different types of input data, we usually use the worst-case time complexity, because that is the maximum time taken for any input of a given size.

Calculating Time Complexity


Now let's move on to the next big topic related to time complexity: how to calculate it. It can be confusing at times, but we will try to explain it in the simplest way.
The most common metric for calculating time complexity is Big O notation. This removes all constant factors so that the running time can be estimated in relation to N as N approaches infinity. In general you can think of it like this:
statement;
Above we have a single statement. Its time complexity is constant: the running time of the statement does not change in relation to N.
for(i=0; i < N; i++)
{
statement;
}
The time complexity for the above algorithm will be Linear. The running time of the loop is
directly proportional to N. When N doubles, so does the running time.

for(i=0; i < N; i++)
{
    for(j=0; j < N; j++)
    {
        statement;
    }
}
This time, the time complexity for the above code will be quadratic. The running time of the two loops is proportional to the square of N. When N doubles, the running time increases fourfold.

while (low <= high)
{
    mid = (low + high) / 2;
    if (target < list[mid])
        high = mid - 1;
    else if (target > list[mid])
        low = mid + 1;
    else
        break;
}
This is an algorithm that repeatedly halves a set of numbers to search for a particular value (we will study this in detail later). This algorithm has logarithmic time complexity. The running time is proportional to the number of times N can be divided by 2 (N is high - low here), because the algorithm divides the working area in half with each iteration.

void quicksort(int list[], int left, int right)
{
    if (left >= right) return;          /* base case: one element or fewer */
    int pivot = partition(list, left, right);
    quicksort(list, left, pivot - 1);
    quicksort(list, pivot + 1, right);
}
Taking the previous algorithm forward, above we have the core logic of Quick Sort (we will study this in detail later). In Quick Sort we divide the list into halves every time, and across each level of division roughly N elements are processed (where N is the size of the list). Hence the time complexity is N*log(N): a linear amount of work per level combined with a logarithmic number of levels. A sketch of the partition routine assumed by the code is given below.
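The snippet above assumes a partition routine that it does not show. One common way to write it is the Lomuto scheme with the last element as pivot; this particular choice is an assumption made here for illustration, since the notes do not specify a scheme.

/* One possible partition routine (Lomuto scheme, last element as pivot).
   After it returns, elements <= pivot sit to the pivot's left and larger
   elements to its right; the return value is the pivot's final index. */
int partition(int list[], int left, int right) {
    int pivot = list[right];
    int i = left - 1;
    for (int j = left; j < right; j++) {
        if (list[j] <= pivot) {
            i++;
            int tmp = list[i]; list[i] = list[j]; list[j] = tmp;
        }
    }
    int tmp = list[i + 1]; list[i + 1] = list[right]; list[right] = tmp;
    return i + 1;
}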
NOTE : In general, doing something with every item in one dimension is linear, doing something
with every item in two dimensions is quadratic, and dividing the working area in half is
logarithmic.

Types of Notations for Time Complexity


Now we will discuss and understand the various notations used for Time Complexity.

1. Big Oh denotes "fewer than or the same as" <expression> iterations.

2. Big Omega denotes "more than or the same as" <expression> iterations.
3. Big Theta denotes "the same as" <expression> iterations.
4. Little Oh denotes "fewer than" <expression> iterations.
5. Little Omega denotes "more than" <expression> iterations.

…………………………………………………………………………………………………...

Arrays:

Whenever we want to work with a large number of data values, we need to use that many different variables. As the number of variables increases, the complexity of the program also increases and programmers get confused with the variable names. There may be situations in

 Step 1: Check whether the queue is Empty (front == NULL).
 Step 2: If it is Empty, then display 'Queue is Empty!!!' and terminate the function.
 Step 3: If it is Not Empty, then define a Node pointer 'temp' and initialize it with front.
 Step 4: Display 'temp → data --->' and move temp to the next node. Repeat the same until 'temp' reaches 'rear' (temp → next != NULL).
 Step 5: Finally, display 'temp → data ---> NULL'.

APPLICATIONS

a) To store a set of programs which are to be given access to a hard disk according to their priority.
b) For representing a city region telephone network.
c) To store a set of fixed key words which are referenced very frequently.
d) To represent an image in the form of a bitmap.
e) To implement back functionality in the internet browser.
f) To store dynamically growing data which is accessed very frequently, based upon a key value.
g) To implement a printer spooler so that jobs can be printed in the order of their arrival.
h) To record the sequence of all the pages browsed in one session.
i) To implement the undo function in a text editor.
j) To store information about the directories and files in a system.

Algorithm Analysis
Efficiency of an algorithm can be analyzed at two different stages, before implementation and after implementation. They are the following −
 A Priori Analysis − This is a theoretical analysis of an algorithm. Efficiency of an algorithm is measured by assuming that all other factors, for example processor speed, are constant and have no effect on the implementation.
 A Posteriori Analysis − This is an empirical analysis of an algorithm. The selected algorithm is implemented in a programming language and executed on a target machine. In this analysis, actual statistics such as running time and space required are collected.
We shall learn about a priori algorithm analysis. Algorithm analysis deals with the execution or running time of the various operations involved. The running time of an operation can be defined as the number of computer instructions executed per operation.

Algorithm Complexity
Suppose X is an algorithm and n is the size of input data, the time and space used by the
algorithm X are the two main factors, which decide the efficiency of X.
 Time Factor − Time is measured by counting the number of key operations such as
comparisons in the sorting algorithm.
 Space Factor − Space is measured by counting the maximum memory space required by
the algorithm.
The complexity of an algorithm f(n) gives the running time and/or the storage space required by
the algorithm in terms of n as the size of input data.

Space Complexity
Space complexity of an algorithm represents the amount of memory space required by the
algorithm in its life cycle. The space required by an algorithm is equal to the sum of the
following two components −
 A fixed part that is a space required to store certain data and variables, that are
independent of the size of the problem. For example, simple variables and constants used,
program size, etc.
 A variable part is a space required by variables, whose size depends on the size of the
problem. For example, dynamic memory allocation, recursion stack space, etc.
Space complexity S(P) of any algorithm P is S(P) = C + SP(I), where C is the fixed part and SP(I) is the variable part of the algorithm, which depends on instance characteristic I. Following is a simple example that tries to explain the concept −

Algorithm: SUM(A, B)
Step 1 - START
Step 2 - C ← A + B + 10
Step 3 - Stop
Here we have three variables A, B, and C and one constant. Hence S(P) = 1 + 3. Now, space depends on the data types of the given variables and constants, and it will be multiplied accordingly.
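To make the fixed/variable split concrete, here is a hedged C sketch (the function and variable names are illustrative only): the scalar variables form the fixed part C, while the n-element array contributes the variable part SP(I), which grows with the instance.

#include <stdlib.h>

/* Fixed part: the scalars sum and i, plus the pointer itself (a constant
   number of words).  Variable part: the n-element array, whose size
   depends on the problem instance. */
long sum_of_squares(int n) {
    long sum = 0;
    int *square = malloc(n * sizeof *square);   /* O(n) words of data space */
    for (int i = 0; i < n; i++) {
        square[i] = i * i;
        sum += square[i];
    }
    free(square);
    return sum;
}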

Time Complexity
Time complexity of an algorithm represents the amount of time required by the algorithm to run
to completion. Time requirements can be defined as a numerical function T(n), where T(n) can
be measured as the number of steps, provided each step consumes constant time.
For example, addition of two n-bit integers takes n steps. Consequently, the total computational
time is T(n) = c ∗ n, where c is the time taken for the addition of two bits. Here, we observe that
T(n) grows linearly as the input size increases.

Asymptotic analysis of an algorithm refers to defining the mathematical bounds of its run-time performance. Using asymptotic analysis, we can conclude the best-case, average-case, and worst-case scenarios of an algorithm.

Asymptotic analysis is input bound, i.e., if there is no input to the algorithm, it is concluded to work in constant time. Other than the "input", all other factors are considered constant.

Asymptotic analysis refers to computing the running time of any operation in mathematical units of computation. For example, the running time of one operation may be computed as f(n) and for another operation as g(n²). This means the running time of the first operation will increase linearly with the increase in n, while the running time of the second operation will increase quadratically as n increases. Similarly, the running times of both operations will be nearly the same if n is significantly small.

Usually, the time required by an algorithm falls under three types −

Best Case − Minimum time required for program execution.

Average Case − Average time required for program execution.

Worst Case − Maximum time required for program execution.



Asymptotic Notations

Following are the commonly used asymptotic notations to calculate the running time complexity
of an algorithm.

Ο Notation

Ω Notation

θ Notation

Big Oh Notation, Ο

The notation Ο(n) is the formal way to express the upper bound of an algorithm's running time. It
measures the worst case time complexity or the longest amount of time an algorithm can possibly
take to complete.

For example, for a function f(n)

Ο(f(n)) = { g(n) : there exist c > 0 and n0 such that g(n) ≤ c·f(n) for all n > n0 }

Omega Notation, Ω

The notation Ω(n) is the formal way to express the lower bound of an algorithm's running time. It measures the best case time complexity, or the minimum amount of time an algorithm can possibly take to complete.

For example, for a function f(n)



Ω(f(n)) = { g(n) : there exist c > 0 and n0 such that g(n) ≥ c·f(n) for all n > n0 }

Theta Notation, θ

The notation θ(n) is the formal way to express both the lower bound and the upper bound of an
algorithm's running time. It is represented as follows −

θ(f(n)) = { g(n) : g(n) = Ο(f(n)) and g(n) = Ω(f(n)) }

Common Asymptotic Notations

Following is a list of some common asymptotic notations −

constant − Ο(1)

logarithmic − Ο(log n)

linear − Ο(n)

n log n − Ο(n log n)

quadratic − Ο(n²)

cubic − Ο(n³)

polynomial − n^Ο(1)

exponential − 2^Ο(n)

Time complexity :

Big O notation
f(n) = O(g(n)) means
There are positive constants c and k such that:
0<= f(n) <= c*g(n) for all n >= k.

For large problem sizes the dominant term (the one with the highest exponent) almost completely determines the value of the complexity expression. So abstract complexity is expressed in terms of the dominant term for large N. Multiplicative constants are also ignored.

N^2 + 3N + 4 is O(N^2), since for N > 4, N^2 + 3N + 4 < 2N^2 (c = 2 and k = 4)

O(1) constant time

This means that the algorithm requires the same fixed number of steps regardless of the size of the task.

Examples:
1) A statement involving basic operations.
Here are some examples of basic operations:
one arithmetic operation (e.g., +, *)
one assignment
one test (e.g., x == 0)
one read (accessing an element from an array)

2) A sequence of statements involving basic operations:
statement 1;
statement 2;
..........
statement k;
Time for each statement is constant and the total time is also constant: O(1)

O(n) linear time


This means that the algorithm requires a number of steps proportional to the size of the task.

Examples:
1. Traversing an array.
2. Sequential/Linear search in an array.
3. Best case time complexity of Bubble sort (i.e., when the elements of the array are already in sorted order).

The basic structure is:
for (i = 0; i < N; i++) {
sequence of statements of O(1)
}

The loop executes N times, so the total time is N*O(1) which is O(N).

O(n^2) quadratic time


The number of operations is proportional to the size of the task squared.

Examples:
1) Worst case time complexity of Bubble, Selection and Insertion sort.

Nested loops:
1.
for (i = 0; i < N; i++) {
for (j = 0; j < M; j++) {
sequence of statements of O(1)
}
}
The outer loop executes N times and inner loop executes M times so the time complexity is
O(N*M)

2.
for (i = 0; i < N; i++) {
for (j = 0; j < N; j++) {
sequence of statements of O(1)
}
}
Now the time complexity is O(N^2)

3. Let's consider nested loops where the number of iterations of the inner loop depends on the value of the outer loop's index.

for (i = 0; i < N; i++) {
    for (j = i+1; j < N; j++) {
        sequence of statements of O(1)
    }
}
Let us see how many iterations the inner loop has:

Value of i    Number of iterations of inner loop
0             N-1
1             N-2
.....         .......
N-3           2
N-2           1
N-1           0

So the total number of times the "sequence of statements" within the two loops executes is
(N-1) + (N-2) + ..... + 2 + 1 + 0, which is N*(N-1)/2 = (1/2)*N^2 - (1/2)*N,
and we can say that it is O(N^2) (we can ignore the multiplicative constant, and for large problem sizes the dominant term determines the time complexity).
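A quick way to convince yourself of the N*(N-1)/2 count is to run the nested loop with a counter; the sketch below (variable names are arbitrary) prints the same value for both.

#include <stdio.h>

int main(void) {
    int N = 10;
    long count = 0;
    for (int i = 0; i < N; i++)
        for (int j = i + 1; j < N; j++)
            count++;                  /* stands in for the O(1) statements */
    /* prints 45 for both when N = 10 */
    printf("executions = %ld, N*(N-1)/2 = %d\n", count, N * (N - 1) / 2);
    return 0;
}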

O(log n) logarithmic time


Examples:
1. Binary search in a sorted array of n elements.

O(n log n) "n log n " time



Examples:
1. MergeSort, QuickSort etc.

O(a^n)(a>1) exponential time


Examples:
1. Recursive Fibonacci implementation
2. Towers of Hanoi

The best time in the above list is obviously constant time, and the worst is exponential time, which, as we have seen, quickly overwhelms even the fastest computers for relatively small n. Polynomial growth (linear, quadratic, cubic, etc.) is considered manageable compared to exponential growth.

Using the "<" sign informally, we can say that the order of growth is
O(1) < O(log n) < O(n) < O(n log n) < O(n^2) < O(n^3) < O(a^n), where a > 1

A word about Big O when a function is the sum of several terms


If a function (which describes the order of growth of an algorithm) is a sum of several terms, its order of growth is determined by the fastest growing term. In particular, if we have a polynomial

p(n) = a_k n^k + a_{k-1} n^{k-1} + ..... + a_1 n + a_0

its growth is of the order n^k:

p(n) = O(n^k)

Example:

{perform any statement S1}                O(1)
for (i=0; i < n; i++) {
    {perform any statement(s) S2}         O(n)
    {run through another loop n times}    O(n^2)
}

Total Execution Time: O(1) + O(n) + O(n^2), therefore O(n^2)

Statements with method calls:


When a statement involves a method call, the complexity of the statement includes the complexity
of the method call. Assume that you know that method f takes constant time, and that method g
takes time proportional to (linear in) the value of its parameter k. Then the statements below have
the time complexities indicated.
f(k); // O(1)
g(k); // O(k)

When a loop is involved, the same rule applies. For example:


for (j = 0; j < N; j++) g(N);

has complexity O(N^2). The loop executes N times and each method call g(N) has complexity O(N).

Other Examples

1. for (j = 0; j < N; j++) f(j);


This is O(N) since f(j) is O(1) and it is executed N times.

2. for (j = 0; j < N; j++) g(j);


The first time the loop executes, j is 0 and g(0) takes "no operations". The next time, j is 1 and g(1) takes 1 operation. The last time the loop executes, j is N-1 and g(N-1) takes N-1 operations. The total time is the sum of the first N-1 numbers and is O(N^2).
3. for (j = 0; j < N; j++) g(k);
Each time through the loop g(k) takes k operations and the loop executes N times. Since
you don't know the relative size of k and N, the overall complexity is O(N * k).

So why should we bother about time complexity?


Suppose the time taken by one operation = 1 microsecond and the problem size N = 100.

Complexity    Time
N             0.0001 sec.
N^2           0.01 sec.
N^4           100 sec.
2^N           4 x 10^16 years

Let's look at the problem of computing Fibonacci numbers:

A recursive solution:
public long Fib1(long n){
    if ((n == 1) || (n == 2)) return 1;
    return Fib1(n-1) + Fib1(n-2);
}

Running Time Analysis:


Let T(n) be the number of steps needed to compute F(n) (the nth Fibonacci number). We can see that

T(n) = T(n-1) + T(n-2) + 1 .....................................................(i)
with T(1) = T(2) = 1.

T(n) is exponential in n. It takes approximately 2^(0.7n) steps to compute F(n). The proof is out of the scope of this course. F(200) would take about 2^140 steps, which is more than the age of the universe!

What takes so long?


The same subproblems get solved over and over again!
F(5)
↙ ↘
F(4) F(3)
↙↘ ↙↘
F(3) F(2) F(2) F(1)
↙↘
F(2) F(1)

A better solution:
Solve F1, F2, ..., Fn. Solve them in order and save their values!

Function Fib2(n){
    Create an array fib[1..n]
    fib[1] = 1
    fib[2] = 1
    for i = 3 to n:
        fib[i] = fib[i-1] + fib[i-2]
    return fib[n]
}

The time complexity of this algorithm is O(n): the number of steps required is proportional to n. F(200) is now reasonable, and so are F(2000) and F(20000).
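A direct C rendering of Fib2 is sketched below; the malloc-based array and the name fib2 are choices made here. Note that the numeric value of F(n) overflows 64-bit integers around n = 93, so for large n it is only the linear step count, not the value itself, that stays manageable.

#include <stdlib.h>

/* Bottom-up Fibonacci: O(n) additions and O(n) extra space. */
long fib2(int n) {
    if (n <= 2) return 1;                        /* F(1) = F(2) = 1, as in Fib1 */
    long *fib = malloc((n + 1) * sizeof *fib);   /* fib[i] holds F(i) */
    fib[1] = 1;
    fib[2] = 1;
    for (int i = 3; i <= n; i++)
        fib[i] = fib[i - 1] + fib[i - 2];        /* reuse the saved values */
    long result = fib[n];
    free(fib);
    return result;
}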

Moral: The right algorithm makes all the difference

Some Important Recurrence Relations :


Recurrence                Algorithm                                        Big Oh Solution
T(n) = T(n/2) + O(1)      Binary Search                                    O(log n)
T(n) = T(n-1) + O(1)      Sequential Search                                O(n)
T(n) = T(n-1) + O(n)      Selection Sort (other n^2 sorts in worst case)   O(n^2)
T(n) = 2T(n/2) + O(n)     Merge Sort & Quicksort                           O(n log n)

Tower of Hanoi
The Tower of Hanoi is a mathematical puzzle. It consists of three rods, and a number of disks of
different sizes which can slide onto any rod. The puzzle starts with the disks neatly stacked in
order of size on one rod, the smallest at the top, thus making a conical shape.

The objective of the puzzle is to move the entire stack to another rod, obeying the following rules:

• Only one disk may be moved at a time.


• Each move consists of taking the upper disk from one of the rods and sliding it onto
another rod, on top of the other disks that may already be present on that rod. No disk may
be placed on top of a smaller disk.


We want to write a recursive method, THanoi(n,A,B,C) which moves n disks from peg A to peg C
using peg B for intermediate transfers.

Observation: THanoi(n, A, B, C) is equivalent to performing following tasks:


THanoi(n-1, A, C, B) (which means moving n-1 disks from A to B using C)
Transferring the largest disk from peg A to peg C
THanoi(n-1, B, A, C) (which means moving n-1 disks from B to C using A)

Stopping condition: n = 1

The time complexity of the above algorithm can be determined using the following recurrence relation.
Let T(n) be the number of steps required to solve the puzzle for n disks. It is clearly evident from the above observation that the solution for n disks is equivalent to solving the puzzle twice for n-1 disks, plus a single step involving the transfer of a disk from the starting peg to the final peg, which takes constant time.
Thus,
T(n) = T(n-1) + T(n-1) + O(1) = 2*T(n-1) + O(1)
The solution to this recurrence relation is exponential in n, and so T(n) is of exponential order. The proof is out of scope of this course.

This is an example where recursion is much easier to formulate than a loop based solution.
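A minimal C rendering of THanoi is sketched below; printing each move and passing the peg labels as characters are illustrative choices, not part of the original description.

#include <stdio.h>

/* Move n disks from peg A to peg C, using peg B for intermediate transfers. */
void thanoi(int n, char A, char B, char C) {
    if (n == 1) {                          /* stopping condition */
        printf("move disk 1 from %c to %c\n", A, C);
        return;
    }
    thanoi(n - 1, A, C, B);                /* move n-1 disks from A to B using C */
    printf("move disk %d from %c to %c\n", n, A, C);   /* largest disk: A -> C */
    thanoi(n - 1, B, A, C);                /* move n-1 disks from B to C using A */
}

int main(void) {
    thanoi(3, 'A', 'B', 'C');              /* prints 2^3 - 1 = 7 moves */
    return 0;
}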

Polynomial vs. Exponential Running Time:

The time complexity (generally referred to as running time) of an algorithm is expressed as the amount of time taken by the algorithm for some size of the input to the problem. Big O notation is commonly used to express the time complexity of any algorithm, as this suppresses the lower-order terms and is described asymptotically. Time complexity is estimated by counting the operations (provided as instructions in a program) performed in an algorithm, where each operation takes a fixed amount of time to execute. Generally, time complexities are classified as constant, linear, logarithmic, polynomial, exponential, etc. Among these, the polynomial and exponential classes are the most prominently considered and define the complexity of an algorithm. Both are always influenced by the size of the input.

Polynomial Running Time

An algorithm is said to be solvable in polynomial time if the number of steps required to complete the algorithm for a given input is O(n^k) for some non-negative integer k, where n is the size of the input. Polynomial-time algorithms are said to be "fast." Most familiar mathematical operations such as addition, subtraction, multiplication, and division, as well as computing square roots, powers, and logarithms, can be performed in polynomial time. Computing the digits of most interesting mathematical constants, including pi and e, can also be done in polynomial time.

All basic arithmetic operations (i.e., addition, subtraction, multiplication, division), comparison operations, and sort operations are considered polynomial time algorithms.

Exponential Running Time:

There is a set of problems which can be solved by exponential time algorithms, but for which no polynomial time algorithm is known. An algorithm is said to be exponential time if T(n) is upper bounded by 2^poly(n), where poly(n) is some polynomial in n. More formally, an algorithm is exponential time if T(n) is bounded by O(2^(n^k)) for some constant k.

Algorithms which have exponential time complexity grow much faster than polynomial algorithms. The difference lies in where the variable sits in the expression for the running time. Expressions with polynomial time complexity have the variable in the bases of their terms, for example n^3 + 2n^2 + 1; notice n is in the base, NOT the exponent. In exponential expressions, the variable is in the exponent, for example 2^n. As said before, exponential time grows much faster. If n is equal to 1000 (a reasonable input for an algorithm), then 1000^3 is 1 billion, while 2^1000 is simply huge. For reference, there are about 2^80 hydrogen atoms in the sun, which is already much more than 1 billion.

Average, Best, and Worst Case Complexities:

We can have three cases to analyze an algorithm:


1) Worst Case
2) Average Case
3) Best Case

Worst Case:

In worst case analysis, we calculate an upper bound on the running time of an algorithm. We must know the case that causes the maximum number of operations to be executed. For linear search, the worst case happens when the element to be searched (x in the sketch below) is not present in the array. When x is not present, the search() function compares it with all the elements of arr[] one by one. Therefore, the worst case time complexity of linear search would be Θ(n).
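The search() function referred to here is not reproduced in these notes; a typical linear search would look like the following sketch (the names search, arr and x are the conventional ones assumed above).

/* Linear search: returns the index of x in arr[0..n-1], or -1 if absent.
   Worst case: x is not present, so all n elements are compared -> Θ(n).
   Best case: x sits at index 0, so one comparison suffices -> Θ(1). */
int search(int arr[], int n, int x) {
    for (int i = 0; i < n; i++)
        if (arr[i] == x)
            return i;
    return -1;
}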

Average Case:

In average case analysis, we take all possible inputs, calculate the computing time for each of them, sum all the calculated values, and divide the sum by the total number of inputs. We must know (or predict) the distribution of cases. For the linear search problem, let us assume that all cases are uniformly distributed (including the case of x not being present in the array). So we sum all the cases and divide the sum by (n+1). The average case time complexity is then:

Average case time = ( Σ_{i=1}^{n+1} θ(i) ) / (n+1)
                  = θ( (n+1)(n+2)/2 ) / (n+1)
                  = Θ(n)

Best Case:

In best case analysis, we calculate a lower bound on the running time of an algorithm. We must know the case that causes the minimum number of operations to be executed. In the linear search problem, the best case occurs when x is present at the first location. The number of operations in the best case is constant (not dependent on n), so the time complexity in the best case would be Θ(1).

Most of the time, we do worst case analysis to analyze algorithms. In worst case analysis, we guarantee an upper bound on the running time of an algorithm, which is useful information.
Average case analysis is not easy to do in most practical cases and is rarely done, because we must know (or predict) the mathematical distribution of all possible inputs.
Best case analysis is of little value: guaranteeing a lower bound on an algorithm doesn't provide much information, since in the worst case the algorithm may still take years to run.
For some algorithms, all the cases are asymptotically the same, i.e., there are no separate worst and best cases. For example, Merge Sort does Θ(n log n) operations in all cases. Most of the other sorting algorithms have distinct worst and best cases. For example, in the typical implementation of Quick Sort (where the pivot is chosen as a corner element), the worst case occurs when the input array is already sorted and the best case occurs when the pivot elements always divide the array into two halves. For insertion sort, the worst case occurs when the array is reverse sorted and the best case occurs when the array is already sorted in the same order as the output.

Analysis of Recursive Algorithms

Example: Factorial
n! = 1·2·3···n and 0! = 1 (the initial case)
So the recursive definition is n! = n·(n-1)!

Algorithm F(n)
if n = 0 then return 1 // base case
else return F(n-1)·n // recursive call

Basic operation? The multiplication during the recursive call.

The formula for the number of multiplications,

M(n) = M(n-1) + 1,

is a recursive formula too. This is typical.

We need the initial case which corresponds to the base case


M(0) = 0
There are no multiplications

Solve by the method of backward substitutions


M(n) = M(n-1) + 1
= [M(n-2) + 1] + 1 = M(n-2) + 2    (substituted M(n-2) for M(n-1))
= [M(n-3) + 1] + 2 = M(n-3) + 3    (substituted M(n-3) for M(n-2))
... a pattern evolves
= M(0) + n
= n
Not surprising!
Therefore M(n) ∈ Θ(n)
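The count M(n) = n can be checked directly with a small C sketch of the same algorithm; the multiplication counter and the main function are additions made here for illustration.

#include <stdio.h>

static long mult_count = 0;                /* counts the basic operation */

long factorial(int n) {
    if (n == 0) return 1;                  /* base case: no multiplication */
    mult_count++;
    return factorial(n - 1) * n;           /* exactly one multiplication per call */
}

int main(void) {
    long f = factorial(10);
    printf("10! = %ld, multiplications = %ld\n", f, mult_count);  /* 3628800, 10 */
    return 0;
}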

Procedure for Recursive Algorithm


1. Specify the problem size
2. Identify the basic operation
3. Decide whether worst, best, or average case is being analyzed
4. Write a recurrence relation for the number of basic operations. Don't forget the initial conditions (IC)
5. Solve the recurrence relation and determine the order of growth

Example: Tower of Hanoi


Explain the problem using figure

Demo and show recursion

1. Problem size is n, the number of discs


2. The basic operation is moving a disc from rod to another
3. There is no worst or best case
4. Recursive relation for moving n discs
M(n) = M(n-1) + 1 + M(n-1) = 2M(n-1) + 1
IC: M(1) = 1
5. Solve using backward substitution
M(n) = 2M(n-1) + 1
= 2[2M(n-2) + 1] + 1 = 2^2 M(n-2) + 2 + 1
= 2^2 [2M(n-3) + 1] + 2 + 1 = 2^3 M(n-3) + 2^2 + 2 + 1
...
M(n) = 2^i M(n-i) + Σ_{j=0}^{i-1} 2^j = 2^i M(n-i) + 2^i - 1
...
M(n) = 2^(n-1) M(n-(n-1)) + 2^(n-1) - 1 = 2^(n-1) M(1) + 2^(n-1) - 1 = 2^(n-1) + 2^(n-1) - 1 = 2^n - 1

M(n) ∈ Θ(2^n)

Terrible. Can we do better?

Where did the exponential term come from? Because two recursive calls are made. Suppose three recursive calls were made; what would the order of growth be?

Lesson learned: Be careful with recursive algorithms; they can grow exponentially, especially if the problem size is measured by the depth of the recursion tree and the operation count is the total number of nodes.

Example: Binary Representation


Algorithm BinRec(n)
if n = 1 then return 1
else return BinRec(floor(n/2)) + 1

1. Problem size is n
2. Basic operation is the addition in the recursive call
3. There is no difference between worst and best case
4. Recursive relation including initial conditions
A(n) = A(floor(n/2)) + 1
IC: A(1) = 0
5. Solve the recurrence relation.
The division and floor function in the argument of the recursive call make the analysis difficult.
We could make the variable substitution n = 2^k to get rid of the floor,
but the substitution skips a lot of values for n.
The smoothness rule (see appendix B) says that this is ok.

Smoothness rule
If T(n) is eventually non-decreasing and f(n) is smooth (eventually non-decreasing and f(2n) ∈ Θ(f(n))),
then T(n) ∈ Θ(f(n)) for n a power of b implies T(n) ∈ Θ(f(n)) for all n.

Works for O and Ω.

Substitute n = 2^k (so k = lg n):

A(2^k) = A(2^(k-1)) + 1 with IC A(2^0) = 0

A(2^k) = [A(2^(k-2)) + 1] + 1 = A(2^(k-2)) + 2
       = [A(2^(k-3)) + 1] + 2 = A(2^(k-3)) + 3
...
       = A(2^(k-i)) + i
...
       = A(2^(k-k)) + k
A(2^k) = k

Substituting back k = lg n:

A(n) = lg n ∈ Θ(lg n)
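The result A(n) = ⌊lg n⌋ can also be checked experimentally; in this sketch the addition counter and the driver loop are added for illustration (integer division n/2 plays the role of the floor).

#include <stdio.h>

static int add_count = 0;                  /* counts the basic operation */

int binrec(int n) {
    if (n == 1) return 1;
    add_count++;
    return binrec(n / 2) + 1;              /* n/2 is floor(n/2) for ints */
}

int main(void) {
    for (int n = 1; n <= 16; n *= 2) {
        add_count = 0;
        int bits = binrec(n);
        printf("n = %2d  bits = %d  additions = %d\n", n, bits, add_count);
    }
    return 0;    /* additions are 0, 1, 2, 3, 4 for n = 1, 2, 4, 8, 16 */
}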

Example: Fibonacci Number Sequence

A classic example of a more elaborate recurrence relation.

Fibonacci Sequence: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...

Fibonacci (1202) proposed it as a model for the growth of rabbits.
It can be defined by the simple recurrence
F(n) = F(n-1) + F(n-2) for n > 1
IC: F(0) = 0 and F(1) = 1 (why do we need two ICs?)

Homogeneous second-order linear recurrence with constant coefficients


a·x(n) + b·x(n-1) + c·x(n-2) = 0
Homogeneous because the recurrence equals zero.
Why second-order linear? Substitute the proposed solution x(n) = r^n:
a·r^n + b·r^(n-1) + c·r^(n-2) = 0
Divide by r^(n-2):
a·r^2 + b·r + c = 0, the characteristic equation, a second-order polynomial.
The real roots r1 and r2 give the solutions
x(n) = α·r1^n + β·r2^n
where α and β are determined from the initial conditions.

Apply to Fibonacci Recursion


F(n) - F(n-1) - F(n-2) = 0, homogeneous second order with constant coefficients
r^2 - r - 1 = 0, the characteristic equation

r1,2 = (1 ± √5)/2 = φ or φ'   { φ = (1+√5)/2 and φ' = (1-√5)/2 }

The general form of the solution is
F(n) = αφ^n + βφ'^n, where α and β are unknowns.
Using the ICs:
α + β = 0
φα + φ'β = 1
Solving by substituting the first equation into the second gives α = 1/√5 and β = -1/√5.
So
F(n) = (1/√5)(φ^n - φ'^n) = φ^n/√5 rounded to the nearest integer
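The closed form can be compared against the recurrence for small n with the sketch below; it is illustrative only, and because of floating-point round-off the rounded formula stays exact only up to roughly n = 70 with doubles.

#include <stdio.h>
#include <math.h>

int main(void) {
    double phi = (1.0 + sqrt(5.0)) / 2.0;
    long a = 0, b = 1;                     /* F(0) and F(1) from the recurrence */
    for (int n = 0; n <= 20; n++) {
        long binet = (long)llround(pow(phi, n) / sqrt(5.0));
        printf("F(%2d) = %ld  Binet = %ld\n", n, a, binet);
        long next = a + b;                 /* F(n+2) = F(n+1) + F(n) */
        a = b;
        b = next;
    }
    return 0;
}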

Example: Recursive Algorithm for Fibonacci Numbers


Algorithm F(n)
if n ≤ 1 then return n
else return F(n-1) + F(n-2)

1. Problem size is n, the sequence number for the Fibonacci number


2. Basic operation is the sum in recursive call
3. No difference between worst and best case
4. Recurrence relation
A(n) = A(n-1) + A(n-2) + 1
IC: A(0) = A(1) = 0
or
A(n) - A(n-1) - A(n-2) = 1, an inhomogeneous recurrence because of the 1

In general, the solution to the inhomogeneous problem is equal to the sum of the solution to the homogeneous problem plus a particular solution to the inhomogeneous part. The undetermined coefficients of the solution to the homogeneous problem are used to satisfy the ICs.

In this case A(n) = B(n) + I(n), where

A(n) is the solution to the complete inhomogeneous problem,
B(n) is the solution to the homogeneous problem, and
I(n) is a particular solution to the inhomogeneous part of the problem.

We guess at I(n) and then determine new ICs for the homogeneous problem for B(n).

For this problem the correct guess is I(n) = -1.

Substituting A(n) = B(n) - 1 into the recurrence gives

B(n) - B(n-1) - B(n-2) = 0 with IC B(0) = B(1) = 1

This is the same as the relation for F(n), with different ICs.

We do not really need the exact solution; we can conclude
A(n) = B(n) - 1 = F(n+1) - 1 ∈ Θ(φ^n), exponential

There is also the Master Theorem, which gives the asymptotic order of growth for many common recurrences.

Iterative algorithm

Algorithm Fib(n)
F[0] ← 0; F[1] ← 1
for i ← 2 to n do
    F[i] ← F[i-1] + F[i-2]
return F[n]

Order of growth is Θ(n).

Why is it so much better than the recursive algorithm? Draw the recursion tree.
