DAA - Module 1


DESIGN AND ANALYSIS OF ALGORITHMS (18CS42)

MODULE-1
INTRODUCTION

1.1 What is an Algorithm? (T2:1.1)


1.2 Algorithm Specification (T2:1.2)
1.3 Analysis Framework (T1:2.1)
1.4 Performance Analysis: Space complexity, Time complexity (T2:1.3).
1.5 Asymptotic Notations (T1:2.2, 2.3, and 2.4)
1.5.1 Big-Oh notation (O)
1.5.2 Omega notation (Ω)
1.5.3 Theta notation (Θ)
1.5.4 Mathematical analysis of Non-Recursive and recursive Algorithms with Examples.
1.6 Important Problem Types:
1.6.1 Sorting, Searching
1.6.2 String processing
1.6.3 Graph Problems
1.6.4 Combinatorial Problems.
1.7 Fundamental Data Structures
1.7.1 Stack
1.7.2 Queues
1.7.3 Graphs
1.7.4 Trees
1.7.5 Sets and Dictionaries

Manjula L, Assistant Professor, Dept of CSE, RNSIT Page 2


DESIGN AND ANALYSIS OF ALGORITHMS (18CS42)

1.1 What is an Algorithm?


The word algorithm comes from the name of a Persian author, Abu Ja'far Muhammad ibn Musa al-Khwarizmi (c. 825 A.D.), who wrote a textbook on mathematics.

Definition: “An algorithm is a sequence of unambiguous instructions for solving a problem,


i.e., for obtaining a required output for any legitimate input in a finite amount of time”.
(Or)
An algorithm is a finite set of instructions that, if followed, accomplishes a particular task.

All algorithms must satisfy the following criteria:


i. Input: There are zero or more quantities which are externally supplied.
ii. Output: At least one quantity is produced.
iii. Definiteness: Each instruction must be clear and unambiguous.
iv. Finiteness: If we trace out the instructions of the algorithm, then for all valid cases the algorithm will terminate after a finite number of steps.
v. Effectiveness: Every instruction must be sufficiently basic that it can in principle be carried out by a person using only a pencil and paper. It is not enough that each operation be definite as in (iii); it must also be feasible.
Algorithms that are definite and effective are also called computational procedures. A program is the expression of an algorithm in a programming language. The diagrammatic representation of the notion of an algorithm is given below:

Figure 1.1: Notion of an algorithm


The important properties of the algorithm are

• The algorithm should be unambiguous.


• The range of inputs for which an algorithm works has to be specified carefully.
• The same algorithm can be represented in several different ways.
• Several algorithms for solving the same problem may exist.
• Algorithms for the same problem can be based on very different ideas and can solve the
problem with dramatically different speeds.

The study of algorithms includes four distinct areas.

1. How to devise algorithms: Mastering various design strategies helps to devise new
algorithms.
2. How to validate algorithms: checking whether the algorithm gives the correct answer for all possible inputs. After validation the program can be written.
3. How to analyze algorithms: analysis of algorithms, or performance analysis, refers to the task of determining how much computing time and storage an algorithm requires.
4. How to test a program: testing a program has two phases, debugging and profiling (performance measurement).
a. Debugging is the process of executing programs on sample data sets to determine
whether faulty results occur and so correct them.
b. Profiling is the process of executing a correct program on data sets and measuring
the time and space it takes to compute the results.

1.2 Algorithm specification


i) Pseudocode conventions: we can describe an algorithm in a natural language such as English. Graphics/flowcharts are another method of representation.
Pseudo code conventions are:
• Comments begin with // and continue to the end of the line.
• Blocks are indicated with matching braces { and }. A compound statement can be represented as a block. An identifier begins with a letter. Data types are not explicitly declared. Compound data types can be formed with records.
Example:
node = record
{
    datatype_1 data_1;
    :
    datatype_n data_n;
    node *link;
}
This is a self-referential structure; the data items of a record can be accessed with -> and the period (.).
• Assignment of values to variables is done using the assignment statement.
<variable>:=<expression>;
• Boolean values true and false are used.
• Logical operators and, or, and not, and relational operators <, <=, >, >= are also supported.
• Conditional statements have the following forms:
If <condition> then <statement>
If <condition> then <statement1> else <statement2>
• We can use the case statement:
case
{
    :<condition 1>: <statement 1>
    :<condition 2>: <statement 2>
    :
    :<condition n>: <statement n>
    :else: <statement n+1>
}
Here each <statement> can be simple or compound.
• Input and output are specified by read and write.
• An algorithm consists of a heading and a body. The heading has the syntax
Algorithm name(<parameter list>)
where name is the name of the procedure/algorithm and <parameter list> is
the list of parameters.
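As an illustrative sketch (my example, not from the textbook), the conventions above can be combined into one complete algorithm that sums the n numbers stored in an array a[1..n]:

```
Algorithm Sum(a, n)
// Computes s, the sum of the n numbers stored in the array a[1..n]
{
    s := 0.0;
    for i := 1 to n do
        s := s + a[i];
    return s;
}
```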

1.3 The Analysis Framework


Analysis of algorithms means investigating an algorithm's efficiency with respect to two resources: running time and memory space.

Analysis framework is a systematic approach that can be applied for analyzing the efficiency
through:
• Time efficiency
• Space efficiency.

Time efficiency, also called time complexity, indicates how fast an algorithm in question runs.
Space efficiency, also called space complexity, refers to the amount of memory units required by the algorithm in addition to the space needed for its input and output.

Important factors for Analysis Framework are:


• Measuring an Input’s Size.
• Units for Measuring Running Time
• Orders of Growth
• Worst-Case, Best-Case, and Average-Case Efficiencies

Figure: The analysis framework for algorithms: measuring input size, measuring running time (time complexity), measuring space complexity, and computing the orders of growth and the best-case, worst-case, and average-case efficiencies.


1. Measuring an Input’s Size:


For an algorithm, the parameter n specifies the input size. A common observation is that "all algorithms run longer on larger inputs". Therefore it is logical to investigate an algorithm's efficiency as a function of the parameter n indicating input size.
The choice of input-size measure varies with the type of problem. In many cases selecting n is straightforward. Example: searching, sorting, etc.
In some varieties of problems, for example the product of matrices or diagnosing a disease from an X-ray, n cannot be estimated so directly. For such algorithms, size is measured as
    b = ⌊log2 n⌋ + 1
where b = number of bits and n = the input parameter.
This metric usually gives a better idea about the efficiency of algorithms.

2. Units for Measuring Running Time

An algorithm's efficiency must be measured with a metric that does not depend on extraneous factors such as the computer used or the compiler used to run the program.
One possible approach is to count the number of times each of the algorithm’s operations is
executed. The most important component used to measure the running time of the algorithm
is called the basic operation.
Identification of the basic operation of an algorithm: it is usually the most time-
consuming operation in the algorithm’s innermost loop.
For example, most sorting algorithms work by comparing elements (keys) of a list being
sorted with each other; for such algorithms, the basic operation is a key comparison.

Let cop be the execution time of an algorithm’s basic operation on a particular computer, and let
C(n) be the number of times this operation needs to be executed for this algorithm. Then we can
estimate the running time T (n) of a program implementing this algorithm on that computer by the
formula

T (n) ≈ cop * C (n)


Problem: How much longer will the algorithm run if we double its input size? Assume C(n) = ½n(n + 1).

Since C(2n)/C(n) = ½·2n(2n + 1) / (½·n(n + 1)) = 2(2n + 1)/(n + 1) ≈ 4 for large n, the answer is about 4 times longer.
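This ratio can also be checked numerically; a small sketch, assuming the count C(n) = ½n(n + 1) from the problem statement:

```python
def C(n):
    # Basic-operation count assumed in the problem: C(n) = n(n+1)/2
    return n * (n + 1) // 2

for n in [10, 100, 1000]:
    print(n, C(2 * n) / C(n))   # the ratio approaches 4 as n grows
```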

3. Orders of Growth

Order of growth of an algorithm is a way of predicting how the execution time of a program changes with input size. The magnitudes of these growth functions have significance for the analysis of algorithms; seven efficiency classes are listed row-wise in Table 2.1 (the basic efficiency classes).

The function growing the slowest among these is the logarithmic function. The exponential function 2^n and the factorial function n! grow so fast that their values become astronomically large even for rather small values of n. Algorithms that require an exponential number of operations are practical for solving only problems of very small sizes.

4. Worst-Case, Best-Case, and Average-Case Efficiencies


The running time of many algorithms depends not only on the input size but also on the specifics of a particular input. Consider, as an example, sequential search: a straightforward algorithm that searches for a given item (some search key K) in a list of n elements by checking successive elements of the list until either a match with the search key is found or the list is exhausted. The running time of this algorithm can be quite different for the same list size n.

ALGORITHM SequentialSearch(A[0..n − 1], K)
//Searches for a given value in a given array by sequential search
//Input: An array A[0..n − 1] and a search key K
//Output: The index of the first element in A that matches K
//        or −1 if there are no matching elements
i ← 0
while i < n and A[i] ≠ K do
    i ← i + 1
if i < n return i
else return −1

Worst case efficiency: when there are no matching elements or the first matching element happens
to be the last one on the list, the algorithm makes the largest number of key comparisons among all
possible inputs of size n:
Cworst (n) = n.

The worst-case efficiency of an algorithm is its efficiency for the worst-case input of size
n, which is an input (or inputs) of size n for which the algorithm runs the longest among all
possible inputs of that size.

The worst-case analysis provides very important information about an algorithm’s efficiency
by bounding its running time from above. In other words, it guarantees that for any instance of size
n, the running time will not exceed Cworst (n), its running time on the worst-case inputs.


The best-case efficiency of an algorithm is its efficiency for the best-case input of size n, which
is an input (or inputs) of size n for which the algorithm runs the fastest among all possible
inputs of that size.

Accordingly, we can analyze the best case efficiency as follows. First, we determine the kind of
inputs for which the count C(n) will be the smallest among all possible inputs of size n. For
example, the best-case inputs for sequential search are lists of size n with their first element equal to
a search key; accordingly, Cbest(n) = 1 for this algorithm.

The average-case efficiency of an algorithm is the average time taken (the number of times the basic operation is executed) to solve all possible instances of the input.

The motivation for average-case analysis is that neither the worst-case analysis nor its best-case counterpart yields the necessary information about an algorithm's behavior on a "typical" or "random" input.
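As a concrete sketch of these cases, here is sequential search instrumented to report the number of key comparisons (the counter is my addition, not part of the textbook algorithm):

```python
def sequential_search(a, key):
    """Index of the first match of key in a (or -1), plus the number
    of key comparisons performed (the algorithm's basic operation)."""
    comparisons = 0
    for i, x in enumerate(a):
        comparisons += 1
        if x == key:
            return i, comparisons
    return -1, comparisons

print(sequential_search([7, 3, 5], 7))   # best case:  (0, 1), Cbest(n) = 1
print(sequential_search([7, 3, 5], 9))   # worst case: (-1, 3), Cworst(n) = n
```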

1.4 Performance Analysis (TB-2-1.3)


Performance analysis is the criterion for judging algorithms. The two important criteria are:
• Space complexity
• Time complexity

*Space complexity: The space complexity of an algorithm is the amount of memory it needs to run to completion.

The space needed by an algorithm is the sum of the following components:
i) A fixed part: the space that is independent of the characteristics of the inputs and outputs.
Example: space for simple variables, fixed-size component variables, and constants.
ii) A variable part: the space needed by component variables whose size depends on the particular problem instance being solved.
Example: the space needed by referenced variables and the recursion stack.


Therefore the space requirement S(P) of any algorithm P may be given as
S(P) = C + Sp(instance characteristics),
where C is a constant and Sp is the instance-characteristics term, which is the main focus in analyzing the space complexity of any algorithm. Sp is always problem specific.

Example 1:

Here the variables a, b, and c are independent of the instance characteristics, so Sp = 0.

2. Algorithm to compute the sum of n numbers:


Iterative version of the Sum algorithm: the space needed by n is one word, since it is of type integer. So,

S(P) = C + Sp
C: the simple variables s, n, and i need one word each, so C = 3
Sp: the array a[ ] needs n words, so Sp = n
S(P) = 3 + n

Ssum(n) ≥ n + 3


Recursive algorithm to compute the sum of n numbers:

In recursive algorithms the instances are characterized by n. The recursion stack needs space, which in turn is used to store the formal parameters, the local variables, and the return address. So each call to Rsum requires at least 3 words:
• space for the value of n
• the return address
• a pointer to a[ ]
Since the depth of the recursion (the number of nested calls) is n + 1,
S(P) = 3(n + 1)
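The recursion-depth accounting can be observed in a runnable sketch of the recursive sum (the name rsum follows the text; the driver values are illustrative):

```python
def rsum(a, n):
    """Recursive sum of a[0..n-1]. Each call adds one frame (parameter,
    return address, array reference) to the recursion stack, and the
    total number of calls made is n + 1."""
    if n <= 0:
        return 0
    return rsum(a, n - 1) + a[n - 1]

print(rsum(list(range(1, 11)), 10))   # 55
```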

*Time complexity:
The time T(P) taken by a program P is the sum of its compile time and its run time:
T(P) = compile time + run time
The compile time can be ignored, since a compiled program will be run several times without recompilation. The run time is denoted by tp.
Many factors affect tp, such as the characteristics of the compiler used and the number of additions, subtractions, multiplications, divisions, compares, loads, stores, and so on that would be made by the code for P.


A formula in terms of these operation counts could be used, with a time assigned to each operation, but operation execution times are machine dependent. So tp must be deduced in a machine-independent way, i.e., it is ideal to calculate tp by identifying the program steps and counting them.

Program step: loosely defined as a syntactically or semantically meaningful segment
of a program whose execution time is independent of the instance
characteristics.

There are two ways of computing the program step.


1. Program step-count method
2. Tabular method

1. Program step count method

Example: For program to find the sum of n numbers


Here the count variable is declared as global and is incremented by the step count of each statement that executes. In the algorithm, the for loop increases count by a total of 2n, so the final count after the program terminates is 2n + 3: each invocation of the Sum algorithm executes a total of 2n + 3 steps. For the recursive Sum algorithm the step count is computed as below.

2. Tabular method: to determine the step count of an algorithm, build a table in which we list the total number of steps contributed by each statement. Two things must be identified:
s/e: steps per execution of the statement
Frequency: number of times the statement is executed.

Example:

1.5 Asymptotic Notations and Basic Efficiency Classes

To compare and rank orders of growth, we use three notations: O (big oh), Ω (big
omega), and Θ (big theta).


O-notation:
A function t(n) is said to be in O(g(n)), denoted t(n)∈O(g(n)), if t(n) is bounded above by some

constant multiple of g(n) for all large n, i.e., if there exist some positive constant c and some
nonnegative integer n0 , such that
t(n) ≤ c g(n) for all n≥ n0
The definition illustration is shown in the below figure

.
Figure : O-notation

Ex: 3n^3 + 2n^2 ∈ O(n^3)
According to the definition of O-notation,
t(n) ≤ c·g(n) for all n ≥ n0
i.e. 3n^3 + 2n^2 ≤ c·n^3. For n ≥ 2 the factor 2 can be replaced by n (since 2n^2 ≤ n·n^2 = n^3):
3n^3 + 2n^2 ≤ 3n^3 + n^3 = 4n^3
so c = 4 and n0 = 2.
For more examples refer to the material uploaded.
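The claimed constants can be sanity-checked numerically; a small sketch (the upper test bound of 1000 is arbitrary):

```python
# Check 3n^3 + 2n^2 <= c*n^3 with c = 4 for all n >= n0 = 2.
c, n0 = 4, 2
for n in range(n0, 1000):
    assert 3 * n**3 + 2 * n**2 <= c * n**3
print("3n^3 + 2n^2 <= 4n^3 holds for all tested n >= 2")
```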
Ω-notation:
A function t(n) is said to be in Ω(g(n)), denoted t(n) ∈ Ω(g(n)), if t(n) is
bounded below by some constant multiple of g(n) for all large n, i.e., if there
exist some positive constant c and some nonnegative integer n0 such that
t(n) ≥ c·g(n) for all n ≥ n0
The definition is illustrated in the figure below.


Ex: n! ∈ Ω(2^n)

According to the definition of the notation,

t(n) ≥ c·g(n) for all n ≥ n0

i.e. n! ≥ c·2^n
With c = 1 and n0 = 4 the above inequality is satisfied.

Θ-notation:

A function t(n) is said to be in Θ(g(n)), denoted t(n) ∈ Θ(g(n)), if t(n) is bounded both above and below by some positive constant multiples of g(n) for all large n, i.e., if there exist some positive constants c1 and c2 and some nonnegative integer n0 such that
c2·g(n) ≤ t(n) ≤ c1·g(n) for all n ≥ n0
The definition is illustrated in the figure below.

Ex: ½n(n − 1) ∈ Θ(n^2)

According to the definition of Θ-notation,

c2·g(n) ≤ t(n) ≤ c1·g(n) for all n ≥ n0

i.e. c2·n^2 ≤ ½n(n − 1) ≤ c1·n^2

Right inequality (upper bound):

½n(n − 1) = ½n^2 − ½n ≤ ½n^2 for all n ≥ 0


Left inequality (lower bound):

½n(n − 1) = ½n^2 − ½n ≥ ½n^2 − ½n·½n = ¼n^2 for all n ≥ 2

With c2 = ¼, c1 = ½ and n0 = 2 the above inequalities are satisfied.
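These two bounds can also be checked numerically; a small sketch (the upper test bound of 1000 is arbitrary):

```python
# Check c2*n^2 <= n(n-1)/2 <= c1*n^2 with c2 = 1/4, c1 = 1/2 for n >= n0 = 2.
for n in range(2, 1000):
    t = n * (n - 1) / 2
    assert 0.25 * n**2 <= t <= 0.5 * n**2
print("1/4 n^2 <= n(n-1)/2 <= 1/2 n^2 holds for all tested n >= 2")
```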

THEOREM:
If t1(n)∈O(g1(n)) and t2(n)∈O(g2(n)), then t1(n)+t2(n)∈O(max{g1(n),g2(n)}).
(The analogous assertions are true for the Θ and Ω notations as well.)

Proof: Let us take four arbitrary real numbers a1, b1, a2 and b2: if a1 ≤ b1 and a2 ≤ b2, then
a1 + a2 ≤ 2 max{b1, b2}.

Since t1(n) ∈ O(g1(n)), there exist some positive constant c1 and some nonnegative integer n1
such that
t1(n) ≤ c1g1(n) for all n ≥ n1.

Similarly, since t2(n)∈O(g2(n)),


t2(n)≤c2g2(n) for all n≥n2.
Let c3 = max{c1, c2} and consider n ≥ max{n1, n2}. Adding the above two inequalities:
t1(n)+ t2(n) ≤ c1g1(n)+ c2g2(n)
≤ c3g1(n)+ c3g2(n)
= c3[g1(n)+g2(n)]
≤ c3 2 max{g1(n),g2(n)}

Hence t1(n)+t2(n)∈O(max{g1(n),g2(n)}) with the constants c=2c3=2max{c1,c2} &


n0=max{n1,n2}

Comparing Orders of Growth using limits:-


The three principal cases are:

    lim(n→∞) t(n)/g(n) = 0     : t(n) has a smaller order of growth than g(n)
    lim(n→∞) t(n)/g(n) = c > 0 : t(n) has the same order of growth as g(n)
    lim(n→∞) t(n)/g(n) = ∞     : t(n) has a larger order of growth than g(n)

The first two cases mean that t(n) ∈ O(g(n)), the last two cases mean that t(n) ∈ Ω(g(n)), and the second case means that t(n) ∈ Θ(g(n)).
The limit-based approach is often more convenient than the approach based on the definitions because it can take advantage of the powerful calculus techniques developed for computing limits, such as L'Hôpital's rule
    lim(n→∞) t(n)/g(n) = lim(n→∞) t′(n)/g′(n)
and Stirling's formula
    n! ≈ √(2πn) (n/e)^n for large values of n.

Example 1: Compare the orders of growth of ½n(n − 1) and n^2.
    lim(n→∞) ½n(n − 1)/n^2 = ½ lim(n→∞) (n^2 − n)/n^2 = ½ lim(n→∞) (1 − 1/n) = ½
Here the limit is a positive constant, i.e. the functions have the same order of growth:
½n(n − 1) ∈ Θ(n^2)

Example 2: Compare the orders of growth of log2 n and √n.
    lim(n→∞) (log2 n)/√n = lim(n→∞) ((log2 e)/n) / (1/(2√n)) = 2 log2 e · lim(n→∞) 1/√n = 0   (by L'Hôpital's rule)
Here the limit is equal to zero, i.e. log2 n has a smaller order of growth than √n:
log2 n ∈ O(√n)

Example 3: Compare the orders of growth of n! and 2^n.
Using Stirling's formula,
    lim(n→∞) n!/2^n = lim(n→∞) √(2πn) (n/e)^n / 2^n = lim(n→∞) √(2πn) (n/2e)^n = ∞
Here the limit is equal to ∞, i.e. n! has a larger order of growth than 2^n:
n! ∈ Ω(2^n)
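The three limit cases can be illustrated numerically; a sketch (the sample values of n are arbitrary):

```python
import math

for n in [10, 100, 1000]:
    same  = (n * (n - 1) / 2) / n**2        # tends to 1/2: same order as n^2
    small = math.log2(n) / math.sqrt(n)     # tends to 0: smaller order than sqrt(n)
    print(n, round(same, 4), round(small, 4))

print(math.factorial(20) > 2**20)           # n! eventually dwarfs 2^n
```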

Basic Efficiency classes:-

1 (constant): Short of best-case efficiencies, very few reasonable examples can be given, since an algorithm's running time typically goes to infinity when its input size grows infinitely large.

log n (logarithmic): Typically a result of cutting a problem's size by a constant factor on each iteration of the algorithm (see Section 4.4). Note that a logarithmic algorithm cannot take into account all its input or even a fixed fraction of it: any algorithm that does so will have at least linear running time.

n (linear): Algorithms that scan a list of size n (e.g., sequential search) belong to this class.

n log n (n-log-n): Many divide-and-conquer algorithms (see Chapter 5), including mergesort and quicksort in the average case, fall into this category.

n^2 (quadratic): Typically characterizes the efficiency of algorithms with two embedded loops (see the next section). Elementary sorting algorithms and certain operations on n × n matrices are standard examples.

n^3 (cubic): Typically characterizes the efficiency of algorithms with three embedded loops (see the next section). Several nontrivial algorithms from linear algebra fall into this class.

2^n (exponential): Typical for algorithms that generate all subsets of an n-element set. Often, the term "exponential" is used in a broader sense to include this and larger orders of growth as well.

n! (factorial): Typical for algorithms that generate all permutations of an n-element set.


1.6 Mathematical Analysis of Non-recursive Algorithms


General Plan for Analyzing Time Efficiency of Non-recursive Algorithms:-
1. Decide on a parameter (or parameters) indicating an input's size.
2. Identify the algorithm's basic operation.
3. Check whether the number of times the basic operation is executed depends only on the size
of an input. If it also depends on some additional property, the worst-case, average- case, and,
if necessary, best-case efficiencies have to be investigated separately.
4. Set up a sum expressing the number of times the algorithm's basic operation is
executed.
5. Using standard formulas and rules of sum manipulation either find a closed form
formula for the count or, at the very least, establish its order of growth.
We frequently use two basic rules of sum manipulation:
    Σ c·ai = c Σ ai        Σ (ai ± bi) = Σ ai ± Σ bi
and two summation formulas:
    Σ(i=l to u) 1 = u − l + 1        Σ(i=0 to n) i = n(n + 1)/2 ≈ ½n^2

Example 1: Finding the value of the largest element in a list of n numbers.


ALGORITHM MaxElement(A[0..n − 1])
//Determines the value of the largest element in a given array
//Input: An array A[0..n − 1] of real numbers
//Output: The value of the largest element in A
maxval ← A[0]
for i ← 1 to n − 1 do
    if A[i] > maxval
        maxval ← A[i]
return maxval


Analysis:
1. The measure of input's size here is the number of elements in the array, i.e., n.
2. There are two basic operations in the algorithm: the comparison A[i] > maxval and the assignment maxval ← A[i]. Since the comparison is executed on each repetition of the loop and the assignment is not, we take the comparison to be the algorithm's basic operation.
3. The basic operation of the algorithm depends only on the size of the input, so we need
to analyze only one kind of efficiency.
4. Let C(n) denote the number of times this comparison is executed. The algorithm makes one comparison on each value of the loop's variable i within the bounds 1 and n − 1. Therefore, we get the following sum for C(n):
    C(n) = Σ(i=1 to n−1) 1
5. Solving by the standard summation formulas: C(n) = (n − 1) − 1 + 1 = n − 1 ∈ Θ(n).
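A runnable sketch of MaxElement with the comparison count made explicit (the counter is my addition):

```python
def max_element(a):
    """Largest value in a, plus the comparison count C(n) = n - 1."""
    maxval, comparisons = a[0], 0
    for x in a[1:]:
        comparisons += 1
        if x > maxval:
            maxval = x
    return maxval, comparisons

print(max_element([3, 9, 2, 7]))   # (9, 3): n = 4 elements, 3 comparisons
```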

Example 2: Element uniqueness problem:


ALGORITHM UniqueElements(A[0..n − 1])
//Determines whether all the elements in a given array are distinct or not
//Input: An array A[0..n − 1]
//Output: Returns "true" if all the elements in A are distinct and "false" otherwise
for i ← 0 to n − 2 do
    for j ← i + 1 to n − 1 do
        if A[i] = A[j] return false
return true
Analysis:
1. The measure of input's size here is the number of elements in the array, i.e., n.
2. The comparison of two elements is the algorithm's basic operation.


3. The number of element comparisons will depend not only on size of input but also on
whether there are equal elements in the array and, if there are, which array positions
they occupy. So we need to analyze best case, worst case & average case separately.
4. Worst-case analysis:
There are two kinds of worst-case inputs: arrays with no equal elements, and arrays in which the last two elements are the only pair of equal elements. For such inputs, one comparison is made for each repetition of the innermost loop, i.e., for each value of the loop's variable j between its limits i + 1 and n − 1; and this is repeated for each value of the outer loop, i.e., for each value of the loop's variable i between its limits 0 and n − 2. So we get the basic operation count:
    Cworst(n) = Σ(i=0 to n−2) Σ(j=i+1 to n−1) 1 = Σ(i=0 to n−2) (n − 1 − i)
5. Solving with the standard summation formulas and rules: Cworst(n) = n(n − 1)/2 ∈ Θ(n^2).
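A runnable sketch of UniqueElements with the comparison count made explicit (the counter is my addition):

```python
def unique_elements(a):
    """Whether all elements of a are distinct, plus the comparison count."""
    n, comparisons = len(a), 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            comparisons += 1
            if a[i] == a[j]:
                return False, comparisons
    return True, comparisons

distinct, count = unique_elements(list(range(6)))
print(distinct, count)   # True 15: the worst case makes n(n-1)/2 = 15 comparisons
```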

Example 3: Matrix Multiplication


ALGORITHM MatrixMultiplication(A[0..n − 1, 0..n − 1], B[0..n − 1, 0..n − 1])
//Multiplies two n-by-n matrices by the definition-based algorithm
//Input: Two n-by-n matrices A and B
//Output: Matrix C = AB
for i ← 0 to n − 1 do
    for j ← 0 to n − 1 do
        C[i, j] ← 0
        for k ← 0 to n − 1 do
            C[i, j] ← C[i, j] + A[i, k] * B[k, j]
return C
Analysis:
1. The measure of input’s size is matrix order n.
2. The algorithm's innermost loop has two arithmetical operations-multiplication and
addition, but as per the property of asymptotic notation we consider multiplication as the
algorithm's basic operation.
3. The basic operation count of the algorithm depends only on the size of the input, so we
need to analyze only one kind of efficiency.
4. Let M(n) be the total number of multiplications executed by the algorithm. There is just one multiplication executed on each repetition of the algorithm's innermost loop, which is governed by the variable k ranging from 0 to n − 1. Therefore, the number of multiplications made for every pair of specific values of variables i and j is
    Σ(k=0 to n−1) 1 = n
and the total number of multiplications M(n) is expressed by the following triple sum:
    M(n) = Σ(i=0 to n−1) Σ(j=0 to n−1) Σ(k=0 to n−1) 1
5. Solving with the standard summation formulas and rules: M(n) = n^3 ∈ Θ(n^3).
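A runnable sketch of the definition-based algorithm with the multiplication count made explicit (the counter is my addition):

```python
def matrix_multiply(A, B):
    """Definition-based product of two n-by-n matrices, plus the
    multiplication count M(n) = n^3."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    mults = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
                mults += 1
    return C, mults

C, m = matrix_multiply([[1, 0], [0, 1]], [[2, 3], [4, 5]])
print(C, m)   # [[2, 3], [4, 5]] 8  (n = 2, so 2^3 = 8 multiplications)
```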

Example 4: Finding the number of binary digits in the binary representation of a


positive decimal integer(non-recursive algorithm).

ALGORITHM Binary(n)
//Input: A positive decimal integer n
//Output: The number of binary digits in n's binary representation
count ← 1
while n > 1 do
    count ← count + 1
    n ← ⌊n/2⌋
return count
Analysis:

1. The size of the input is value of n.


2. The comparison n > 1 that determines whether the loop's body will be executed. So
comparison will be the key operation of the algorithm.
3. The basic operation count of the algorithm depends only on the size of the input, so we
need to analyze only one kind of efficiency.
4. Since the value of n is about halved on each repetition of the loop, the answer should be
about log2 n.

5. So C(n)∈ Θ(log2 n).
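The halving argument can be checked against the closed form ⌊log2 n⌋ + 1; a sketch (sample values are arbitrary):

```python
import math

def binary_digit_count(n):
    """Binary digits of a positive integer n, by repeated halving."""
    count = 1
    while n > 1:
        count += 1
        n //= 2           # n <- floor(n / 2)
    return count

for n in [1, 2, 15, 16, 1000]:
    assert binary_digit_count(n) == math.floor(math.log2(n)) + 1
print("loop count matches floor(log2 n) + 1 for the sampled values")
```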

1.7 Mathematical Analysis of Recursive Algorithms


General Plan for Analyzing Time Efficiency of Recursive Algorithms
1. Decide on a parameter (or parameters) indicating an input's size.
2. Identify the algorithm's basic operation.
3. Check whether the number of times the basic operation is executed can vary on
different inputs of the same size; if it can, the worst -case, average-case, and best- case
efficiencies must be investigated separately.
4. Set up a recurrence relation, with an appropriate initial condition, for the number of times
the basic operation is executed.
5. Solve the recurrence or at least ascertain the order of growth of its solution.

Example 1: Compute the factorial function F(n) = n! for an arbitrary nonnegative


integer n.
ALGORITHM F(n)
//Computes n! recursively
//Input: A nonnegative integer n
//Output: The value of n!
if n = 0 return 1
else return F(n − 1) * n

Analysis:
1. The size of the input is value of n.
2. The basic operation of the algorithm is multiplication, let M(n) denote number of
executions of Multiplications.
3. The basic operation count of the algorithm depends only on the size of the input, so
we need to analyze only one kind of efficiency.
4. F(n) is computed as F(n)=F(n - 1) * n, so the number of multiplications needed to
compute F(n) is multiplications needed to compute F(n-1) plus 1 to multiply the result
with n.

To solve the above recurrence relation we need an initial condition i.e. the value with which
the sequence starts. This can be obtained by looking at the condition that makes the recursive
call to stop. Here
if n=0 return 1
so the recurrence relation with initial condition is
M(n)=M(n-1)+1
M(0)=0
5. Solve the above recurrence relation by using backward substitution method.
M(n) =M(n-1)+1 //Substitute M(n-1)=M(n-2)+1
= (M(n-2)+1)+1
=M(n-2)+2 //Substitute M(n-2)=M(n-3)+1
= (M(n-3)+1)+2
=M(n-3)+3
.…

=M(n-i)+i
.…


=M(n-n)+n
=M(0)+n
M(n) =0+n //Initial Condition

M(n) =n
The number of multiplications required is: M(n)=n

∴ The Time Complexity: T(n)∈Θ(n).
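The count M(n) = n can be observed directly in a runnable sketch (the multiplication counter is my addition):

```python
def factorial_with_count(n):
    """n! computed recursively, plus the multiplication count M(n) = n."""
    if n == 0:
        return 1, 0
    value, mults = factorial_with_count(n - 1)
    return value * n, mults + 1

print(factorial_with_count(5))   # (120, 5): five multiplications for 5!
```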

Example 2: Tower of Hanoi

We have n disks of different sizes & three pegs. Initially all the disks are on the first peg
such that largest is on the bottom & smallest is on the top. We have to move all the disks to the
third peg, using the second one as an auxiliary. We can move only one disk at a time, and a larger disk can never be placed on top of a smaller one. This problem can be solved by a recursive technique.
When n>1(number of disks), we first move recursively n-1 disks from peg1 to peg2 with peg3 as
auxiliary, then move the largest disk from peg1 to peg3. Finally move the n-1 disks recursively
from peg2 to peg3 with peg1 as auxiliary.

Figure : Recursive solution to the Tower of Hanoi puzzle.


Analysis:
1. The input parameter is the number of disks we have to move i.e. n.
2. The key operation of the algorithm is Movement of disks.
3. The number of disks movements depends only on number of disks we have to move i.e.
n.
4. The recurrence relation for basic operation count is:


M(n)=M(n-1)+1+M(n-1) for n>1


M(n)=2M(n-1)+1
And the initial condition is M(1)=1
5. Solve the above recurrence relation by using backward substitution method.
M(n) = 2M(n−1) + 1                          //Substitute M(n−1) = 2M(n−2) + 1
     = 2[2M(n−2) + 1] + 1
     = 2^2 M(n−2) + 2 + 1                    //Substitute M(n−2) = 2M(n−3) + 1
     = 2^2 [2M(n−3) + 1] + 2 + 1
     = 2^3 M(n−3) + 2^2 + 2 + 1
     ...
     = 2^i M(n−i) + 2^(i−1) + ... + 2^2 + 2 + 1
     ...
     = 2^(n−1) M(n−(n−1)) + 2^(n−2) + ... + 2^2 + 2 + 1
     = 2^(n−1) M(1) + 2^(n−2) + 2^(n−3) + ... + 2^2 + 2 + 1

M(n) = 2^0 + 2^1 + 2^2 + ... + 2^(n−3) + 2^(n−2) + 2^(n−1)    //Initial condition M(1) = 1

The above series is a G.P. with sum Sn = a1(r^n − 1)/(r − 1); here a1 = 1, r = 2, and the number of terms is n. So

M(n) = 1·(2^n − 1)/(2 − 1) = 2^n − 1

The number of disk movements required is M(n) = 2^n − 1.

∴ The Time Complexity: T(n) ∈ Θ(2^n).
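The recurrence and its closed form can be cross-checked in a runnable sketch:

```python
def hanoi_moves(n):
    """Disk moves made by the recursive Tower of Hanoi solution:
    M(n) = 2M(n-1) + 1 with M(1) = 1."""
    if n == 1:
        return 1
    return 2 * hanoi_moves(n - 1) + 1

for n in [1, 2, 3, 10]:
    assert hanoi_moves(n) == 2**n - 1   # matches the closed form 2^n - 1
print(hanoi_moves(10))                  # 1023
```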

When a recursive algorithm makes more than a single call to itself, it can be useful for analysis
purposes to construct a tree of its recursive calls. In this tree, nodes correspond to recursive calls,
and we can label them with the value of the parameter (or, more generally, parameters) of the
calls. For the Tower of Hanoi example, the tree is given in Figure. By counting the number of
nodes in the tree, we can get the total number of calls made by the Tower of Hanoi algorithm:

C(n) = Σ 2^l (summed over the levels l = 0 to n-1 of the tree in the Figure) = 2^n − 1.


Figure: Tree of recursive calls made by the recursive algorithm for the Tower of Hanoi
puzzle.

Example 3: Finding the number of binary digits in the binary representation of a positive
decimal integer.

ALGORITHM BinRec(n)

//Input: A positive decimal integer n.

//Output: The number of binary digits in n's binary representation.

if n=1 return 1

else return BinRec(⌊n/2⌋)+1
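A direct Python transcription of BinRec (with ⌊n/2⌋ written as integer division) can be checked against Python's built-in bit_length():

```python
def bin_rec(n):
    """Number of binary digits in the binary representation of a positive integer n."""
    if n == 1:
        return 1
    return bin_rec(n // 2) + 1  # one addition per halving of n

# The recursive count agrees with Python's built-in bit count
for n in (1, 2, 7, 8, 1000):
    assert bin_rec(n) == n.bit_length()
print(bin_rec(1000))  # 10
```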

Analysis:

1. The size of the input is the value of n.

2. The basic operation of the algorithm is addition.

3. The basic operation count depends only on the value of the input parameter.
4. The recurrence relation for the basic operation count is:
The number of additions required: A(n) = A(⌊n/2⌋) + 1
The initial condition is A(1) = 0
5. Solve the above recurrence relation by using the backward substitution method.
Assume n = 2^k for simplification.
A(2^k) = A(2^(k-1))+1              //Substitute A(2^(k-1)) = A(2^(k-2))+1
       = [A(2^(k-2))+1]+1
       = A(2^(k-2))+2              //Substitute A(2^(k-2)) = A(2^(k-3))+1
       = [A(2^(k-3))+1]+2
       = A(2^(k-3))+3
       ...
       = A(2^(k-i))+i
       ...
       = A(2^(k-k))+k
       = A(2^0)+k
       = A(1)+k                    //Initial condition is A(1)=0
A(2^k) = 0+k = k
But n = 2^k, so log2 n = log2 2^k = k, i.e., k = log2 n.

A(2^k) = k
∴ Key operation count A(n) = log2 n

∴ The Time Complexity: T(n) ∈ Θ(log n)

1.8 Important Problem types: The most important problem types are

1. Sorting
2. Searching
3. String processing
4. Graph problems
5. Combinatorial problems
6. Geometric problems
7. Numerical problems

These problems are used in the subject to illustrate different algorithm design techniques and
methods of algorithm analysis.

1. Sorting

The sorting problem is to rearrange the items of a given list in nondecreasing order.
Usually, items are sorted with respect to a specially chosen piece of information called a key:
for example, we can choose to sort student records in alphabetical order of names, by student
number, or by student grade-point average.


Two notable properties of sorting algorithms are stability and memory usage.

A sorting algorithm is called stable if it preserves the relative order of any two equal
elements in its input. In other words, if an input list contains two equal elements in positions i and j
where i < j, then in the sorted list they must be in positions i′ and j′, respectively, such that i′ < j′.
This property can be desirable if, for example, we have a list of students sorted
alphabetically and we want to sort it according to student GPA: a stable algorithm will yield a list in
which students with the same GPA will still be sorted alphabetically.
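Python's built-in sort is stable, so the GPA example can be demonstrated directly (the names and GPAs below are made up):

```python
# Records already in alphabetical order of names
students = [("Asha", 3.9), ("Ben", 3.5), ("Chen", 3.9), ("Dev", 3.5)]

# Stable sort by GPA (descending): equal GPAs keep their alphabetical order
by_gpa = sorted(students, key=lambda s: s[1], reverse=True)
print(by_gpa)  # [('Asha', 3.9), ('Chen', 3.9), ('Ben', 3.5), ('Dev', 3.5)]
```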

The second notable feature of a sorting algorithm is the amount of extra memory the
algorithm requires. An algorithm is said to be in-place if it does not require extra memory, except,
possibly, for a few memory units. There are important sorting algorithms that are in-place and those
that are not.

2. Searching

The searching problem deals with finding a given value, called a search key, in a given set. There
are plenty of searching algorithms to choose from; examples include sequential search and binary search.
These algorithms are of particular importance for real-world applications because they are
indispensable for storing and retrieving information from large databases.


For searching, too, there is no single algorithm that fits all situations best. Some algorithms
work faster than others but require more memory; some are very fast but applicable only to sorted
arrays; and so on. Unlike with sorting algorithms, there is no stability problem, but different issues
arise.
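The trade-off just described, sequential search working on any list versus binary search requiring a sorted array, can be sketched as follows:

```python
def sequential_search(a, key):
    """O(n): works on any list, sorted or not."""
    for i, x in enumerate(a):
        if x == key:
            return i
    return -1

def binary_search(a, key):
    """O(log n): requires a sorted list."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == key:
            return mid
        elif a[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

a = [3, 14, 27, 31, 39, 42, 55, 70]  # sorted input
print(sequential_search(a, 31), binary_search(a, 31))  # 3 3
```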
3. String Processing
In recent decades, applications dealing with nonnumerical data have intensified the interest of
researchers in string-handling algorithms.
A string is a sequence of characters from an alphabet. Strings of particular interest are text
strings, which comprise letters, numbers, and special characters; bit strings, which comprise zeros
and ones; and gene sequences, which can be modeled by strings of characters from the four-
character alphabet {A,C, G, T}.
There are many string-processing algorithms in computer science; one particular problem,
that of searching for a given word in a text, has attracted special attention from researchers. They
call it string matching.
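A brute-force sketch of string matching (faster algorithms such as Knuth-Morris-Pratt or Boyer-Moore exist; this only illustrates the problem itself):

```python
def brute_force_match(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    for i in range(n - m + 1):         # try each possible alignment of the pattern
        if text[i:i + m] == pattern:
            return i
    return -1

print(brute_force_match("GATTACA", "TAC"))  # 3
```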

4. Graph Problems

A Graph consists of a finite set of vertices (or nodes) and set of Edges which connect a pair of
nodes.
Graphs are used to solve many real-life problems. Graphs are used to represent networks. The
networks may include paths in a city or telephone network or circuit network. Graphs are also used
in social networks like LinkedIn, Facebook. For example, in Facebook, each person is represented
with a vertex (or node). Each node is a structure and contains information like person id, name,
gender, locale etc.
Graphs can be used for modeling a wide variety of applications, including transportation,
communication, social and economic networks, project scheduling, and games. Studying different
technical and social aspects of the Internet in particular is one of the active areas of current research
involving computer scientists, economists, and social scientists. Basic graph algorithms include
graph-traversal algorithms (how can one reach all the points in a network?), shortest-path algorithms
(what is the best route between two cities?), and topological sorting for graphs with directed edges.
Some graph problems are computationally very hard; well-known examples are the traveling
salesman problem and the graph-coloring problem.


5. Combinatorial Problems
A combinatorial problem deals with, given a finite collection of objects and a set of
constraints, finding an object of the collection that satisfies all the constraints. Combinatorial problems
involve arrangements of elements from a finite set and selections from a finite set.
These problems can be divided into three basic types: (1) enumeration
problems, (2) existence problems, and (3) optimization problems.

In enumeration problems the goal is either to find how many arrangements there are satisfying the
given properties or to produce a list of arrangements satisfying the given properties.
In existence problems the goal is to decide whether or not an arrangement exists satisfying the given
properties.
In optimization problems the goal is to find where a given function of several variables takes on an
extreme value (maximum or minimum) over a given finite domain.
The traveling salesman problem and the graph-coloring problem are examples of
combinatorial problems. These are problems that ask, explicitly or implicitly, to find a
combinatorial object, such as a permutation, a combination, or a subset, that satisfies certain
constraints.
A desired combinatorial object may also be required to have some additional property, such
as a maximum value or a minimum cost. Combinatorial problems are, generally speaking, the most
difficult problems in computing, from both a theoretical and a practical standpoint.
Some combinatorial problems can be solved by efficient algorithms, but they should be
considered fortunate exceptions to the rule. The shortest-path problem mentioned earlier is among
such exceptions.
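An exhaustive-search sketch of the traveling salesman problem shows why such problems are hard: the number of candidate tours grows factorially with the number of cities (the distance matrix below is made up):

```python
from itertools import permutations

# Symmetric distance matrix for 4 cities (illustrative values)
d = [[0, 2, 9, 10],
     [2, 0, 6, 4],
     [9, 6, 0, 3],
     [10, 4, 3, 0]]

def tsp_brute_force(d):
    """Try every tour starting at city 0; return (cost, tour) of the cheapest."""
    n = len(d)
    best = None
    for perm in permutations(range(1, n)):        # (n-1)! candidate tours
        tour = (0,) + perm + (0,)
        cost = sum(d[tour[i]][tour[i + 1]] for i in range(n))
        if best is None or cost < best[0]:
            best = (cost, tour)
    return best

print(tsp_brute_force(d))  # (18, (0, 1, 3, 2, 0))
```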

6. Geometric Problems:
Geometric algorithms deal with geometric objects such as points, lines, and polygons. The ancient
Greeks developed procedures for a variety of geometric constructions; today, of course, people are
interested in geometric algorithms with quite different applications in mind, such as computer
graphics, robotics, and tomography. Two classic problems of computational geometry are the
closest-pair problem and the convex-hull problem.
The closest-pair problem is self-explanatory: given n points in the plane, find the closest pair
among them. The convex-hull problem asks to find the smallest convex polygon that would include
all the points of a given set.
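A brute-force sketch of the closest-pair problem (Θ(n²) comparisons; faster divide-and-conquer algorithms exist):

```python
from math import dist

def closest_pair(points):
    """Return (distance, pair) for the closest pair among the given points."""
    best = (float("inf"), None)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):       # compare every pair once
            d = dist(points[i], points[j])
            if d < best[0]:
                best = (d, (points[i], points[j]))
    return best

pts = [(0, 0), (5, 5), (1, 1), (9, 0)]
print(closest_pair(pts))  # closest pair is (0, 0) and (1, 1), distance √2
```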


7. Numerical Problems
Numerical problems, another large special area of applications, are problems that involve
mathematical objects of continuous nature: solving equations and systems of equations,
computing definite integrals, evaluating functions, and so on. The majority of such mathematical
problems can be solved only approximately. Another principal difficulty stems from the fact that
such problems typically require manipulating real numbers, which can be represented in a computer
only approximately. Moreover, a large number of arithmetic operations performed on approximately
represented numbers can lead to an accumulation of the round-off error to a point where it can
drastically distort an output produced by a seemingly sound algorithm.
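The round-off accumulation mentioned above is easy to observe: 0.1 has no exact binary representation, so summing it ten times does not give exactly 1.0:

```python
total = 0.0
for _ in range(10):
    total += 0.1          # each addition carries a tiny representation error

print(total == 1.0)       # False
print(abs(total - 1.0))   # tiny, but nonzero, accumulated round-off error
```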

Many sophisticated algorithms have been developed over the years in this area, and they
continue to play a critical role in many scientific and engineering applications. But in the last 30
years or so, the computing industry has shifted its focus to business applications. These new
applications require primarily algorithms for information storage, retrieval, transportation through
networks, and presentation to users. As a result of this revolutionary change, numerical analysis has
lost its formerly dominating position in both industry and computer science programs. Still, it is
important for any computer-literate person to have at least a rudimentary idea about numerical
algorithms.

1.9 Important Data Structures

Since majority of algorithms operate on data, particular ways of organizing data play a critical role
in the design and analysis of algorithms.
A data structure can be defined as a particular scheme of organizing related data items. The
nature of the data items is dictated by the problem at hand; they can range from elementary data
types (e.g., integers or characters) to data structures (e.g., a one-dimensional array of one-
dimensional arrays is often used for implementing matrices).
There are a few data structures that have proved to be particularly important for computer
algorithms.


Linear Data Structures

The two most important elementary data structures are the array and the linked list. A (one-
dimensional) array is a sequence of n items of the same data type that are stored contiguously in
computer memory and made accessible by specifying a value of the array's index.

Fig a: Array of n elements

• In the majority of cases, the index is an integer either between 0 and n – 1 or


between 1 and n. Some computer languages allow an array index to range between
any two integer bounds low and high.
• Each and every element of an array can be accessed in the same constant amount of
time regardless of where in the array the element in question is located. Arrays are
used for implementing a variety of other data structures.
• A string is a sequence of characters from an alphabet terminated by a special character
indicating the string's end. Strings composed of zeros and ones are called binary
strings or bit strings.

A linked list: is a sequence of zero or more elements called nodes, each containing two kinds of
information: some data and one or more links called pointers to other nodes of the linked list.

Fig b: Singly linked list of n elements.


In a singly linked list (Fig b), each node except the last one contains a single pointer to the
next element. To access a particular node of a linked list, one starts with the list's first node and
traverses the pointer chain until the particular node is reached. Thus, the time needed to access an
element of a singly linked list, unlike that of an array, depends on where in the list the element is
located. On the positive side, linked lists do not require any preliminary reservation of memory.


Insertions and deletions can be made quite efficiently in a linked list by reconnecting a few
appropriate pointers. It is often convenient to start a linked list with a special node called the header.
This node may contain information about the linked list itself, such as its current length; it may also
contain, in addition to a pointer to the first element, a pointer to the linked list’s last element.
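A minimal singly linked list with a header node, along the lines described above (the class and method names are illustrative):

```python
class Node:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next

class LinkedList:
    def __init__(self):
        self.head = Node(None)   # header node: carries no data
        self.length = 0          # information about the list kept with the header

    def push_front(self, data):
        """Insert at the front by reconnecting one pointer."""
        self.head.next = Node(data, self.head.next)
        self.length += 1

    def to_list(self):
        """Traverse the pointer chain from the first real node."""
        out, node = [], self.head.next
        while node:
            out.append(node.data)
            node = node.next
        return out

lst = LinkedList()
for x in (3, 2, 1):
    lst.push_front(x)
print(lst.to_list(), lst.length)  # [1, 2, 3] 3
```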

Doubly linked list: another extension of the singly linked list, in which every node, except the
first and the last, contains pointers to both its successor and its predecessor.

Fig c: Doubly linked list of n elements

The array and linked list are two principal choices in representing a more abstract data structure
called a linear list or simply a list.

A list is a finite sequence of data items, i.e., a collection of data items arranged in a certain linear
order. The basic operations performed on this data structure are searching for, inserting, and deleting
an element. Two special types of lists, stacks and queues, are particularly important.

A stack is a list in which insertions and deletions can be done only at the end. This end is
called the top because a stack is usually visualized not horizontally but vertically—akin to a stack of
plates whose “operations” it mimics very closely. As a result, when elements are added to (pushed
onto) a stack and deleted from (popped off) it, the structure operates in a “last-in–first-out” (LIFO)
fashion—exactly like a stack of plates if we can add or remove a plate only from the top.
Stacks have a multitude of applications; in particular, they are indispensable for
implementing recursive algorithms.
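In Python, a list used at one end only behaves exactly as a stack:

```python
stack = []
stack.append('a')    # push
stack.append('b')
stack.append('c')
print(stack.pop())   # 'c' -- last in, first out
print(stack.pop())   # 'b'
```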
A queue, on the other hand, is a list from which elements are deleted from one end of the
structure, called the front (this operation is called dequeue),and new elements are added to the other
end, called the rear (this operation is called enqueue).
Consequently, a queue operates in a “first-in–first-out” (FIFO) fashion—akin to a queue of
customers served by a single teller in a bank. Queues also have many important applications,


including several algorithms for graph problems. Many important applications require selection of
an item of the highest priority among a dynamically changing set of candidates.
A data structure that seeks to satisfy the needs of such applications is called a priority queue.
A priority queue is a collection of data items from a totally ordered universe (most often, integer or
real numbers). The principal operations on a priority queue are finding its largest element, deleting
its largest element, and adding a new element.
A better implementation of a priority queue is based on an ingenious data structure called
the heap. We discuss heaps and an important sorting algorithm based on them in Section 6.4.
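A queue and a priority queue can be sketched with the standard library: collections.deque for FIFO operations and heapq for a heap (heapq is a min-heap, so values are negated here to obtain the largest element first):

```python
from collections import deque
import heapq

# FIFO queue: enqueue at the rear, dequeue from the front
q = deque()
q.append('first')
q.append('second')
print(q.popleft())            # 'first' -- first in, first out

# Priority queue: largest element first (negate values to simulate a max-heap)
pq = []
for priority in (3, 1, 4, 1, 5):
    heapq.heappush(pq, -priority)
print(-heapq.heappop(pq))     # 5, the largest element
```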

Graphs

A graph is informally thought of as a collection of points in the plane called “vertices” or “nodes,”
some of them connected by line segments called “edges” or “arcs.”
Formally, a graph G = (V, E) is defined by a pair of two sets: a finite nonempty set V of
items called vertices and a set E of pairs of these items called edges. If these pairs of vertices are
unordered, i.e., a pair of vertices (u, v) is the same as the pair (v, u), we say that the vertices u and v
are adjacent to each other and that they are connected by the undirected edge (u, v).
We call the vertices u and v endpoints of the edge (u, v) and say that u and v are incident to
this edge; we also say that the edge (u, v) is incident to its endpoints u and v. A graph G is called
undirected if every edge in it is undirected. If a pair of vertices (u, v) is not the same as the pair (v,
u), we say that the edge (u, v) is directed from the vertex u, called the edge’s tail, to the vertex v,
called the edge’s head. We also say that the edge (u, v) leaves u and enters v. A graph whose every
edge is directed is called directed. Directed graphs are also called digraphs.

It is normally convenient to label vertices of a graph or a digraph with letters, integer
numbers, or, if an application calls for it, character strings (Fig d). The graph depicted in
Fig d (a) has six vertices and seven undirected edges:

Fig d: Undirected graph Directed graph



V = {a, b, c, d, e, f }, E = {(a, c), (a, d), (b, c), (b, f ), (c, e), (d, e), (e, f )}.

The digraph depicted in Figure has six vertices and eight directed edges:
V = {a, b, c, d, e, f }, E = {(a, c), (b, c), (b, f ), (c, e), (d, a), (d, e), (e, c), (e, f )}.

Our definition of a graph does not forbid loops, or edges connecting vertices to themselves.
A graph with relatively few possible edges missing is called dense. A graph with few edges relative
to the number of its vertices is called sparse.
Whether we are dealing with a dense or sparse graph may influence how we choose to
represent the graph and, consequently, the running time of an algorithm being designed or used.

Graph Representations Graphs for computer algorithms are usually represented in one of two
ways: the adjacency matrix and adjacency lists.
The adjacency matrix of a graph with n vertices is an n × n boolean matrix with one row
and one column for each of the graph’s vertices, in which the element in the ith row and the j th
column is equal to 1 if there is an edge from the ith vertex to the jth vertex, and equal to 0 if there is
no such edge. For example, the adjacency matrix (a) and the adjacency lists (b) for the graph of
Fig d (a) are shown below.

Note that the adjacency matrix of an undirected graph is always symmetric, i.e., A[i, j] = A[j, i] for
every 0 ≤ i, j ≤ n − 1.


The adjacency lists of a graph or a digraph is a collection of linked lists, one for each vertex,
that contain all the vertices adjacent to the list's vertex (i.e., all the vertices connected to it by an
edge). For example, Fig d (b) represents the graph in Fig d (a) via its adjacency lists.
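The undirected graph of Fig d (a) can be stored either way; below, a nested list serves as the adjacency matrix and a dict of lists as the adjacency lists:

```python
V = ['a', 'b', 'c', 'd', 'e', 'f']
E = [('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'f'),
     ('c', 'e'), ('d', 'e'), ('e', 'f')]

idx = {v: i for i, v in enumerate(V)}

# Adjacency matrix: symmetric because the graph is undirected
A = [[0] * len(V) for _ in V]
for u, v in E:
    A[idx[u]][idx[v]] = A[idx[v]][idx[u]] = 1

# Adjacency lists: one list of neighbours per vertex
adj = {v: [] for v in V}
for u, v in E:
    adj[u].append(v)
    adj[v].append(u)

print(A[idx['a']])   # row for a: edges to c and d
print(adj['e'])      # ['c', 'd', 'f']
```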

Weighted Graphs A weighted graph (or weighted digraph) is a graph (or digraph) with numbers
assigned to its edges. These numbers are called weights or costs.
If a weighted graph is represented by its adjacency matrix, then its element A[i, j ] will
simply contain the weight of the edge from the ith to the jth vertex if there is such an edge and a
special symbol, e.g.,∞, if there is no such edge. Such a matrix is called the weight matrix or cost
matrix.

a) Weighted graph b) Weighted matrix c) Adjacency list

Paths and Cycles Among the many properties of graphs, two are important for a great number of
applications: connectivity and acyclicity. Both are based on the notion of a path.
A path from vertex u to vertex v of a graph G can be defined as a sequence of adjacent
(connected by an edge) vertices that starts with u and ends with v. If all vertices of a path are
distinct, the path is said to be simple.
The length of a path is the total number of vertices in the vertex sequence defining the path
minus 1, which is the same as the number of edges in the path. For example, a, c, b, f is a simple
path of length 3 from a to f in the graph in Figure 1.6a, whereas a, c, e, c, b, f is a path (not simple)
of length 5 from a to f.

A directed path is a sequence of vertices in which every consecutive pair of the vertices is
connected by an edge directed from the vertex listed first to the vertex listed next. For example, a, c,
e, f is a directed path from a to f in the graph in Figure 1.6b.


A graph is said to be connected if for every pair of its vertices u and v there is a path from u to v. If
we make a model of a connected graph by connecting some balls representing the graph’s vertices
with strings representing the edges, it will be a single piece.
If a graph is not connected, such a model will consist of several connected pieces that are
called connected components of the graph. Formally, a connected component is a maximal (not
expandable by including another vertex and an edge) connected subgraph of a given graph. For
example, the graphs in Figures 1.6a and 1.8a are connected, whereas the graph in Figure below is
not, because there is no path, for example, from a to f. The graph in the below Figure has two
connected components with vertices {a, b, c, d, e} and {f, g, h, i}, respectively. Graphs with several
connected components do happen in real-world applications.
A cycle is a path of a positive length that starts and ends at the same vertex and does not
traverse the same edge more than once. For example, f, h, i, g, f is a cycle in the graph in Figure 1.9.
A graph with no cycles is said to be acyclic.

Figure : connected components of the graph

Trees

A tree (more accurately, a free tree) is a connected acyclic graph (Figure below).
A graph that has no cycles but is not necessarily connected is called a forest: each of its
connected components is a tree. A subgraph of a given graph G = (V, E) is a graph G′ = (V′, E′)
such that V′ ⊆ V and E′ ⊆ E.

(a) Tree. (b) Forest.


Trees have several important properties other graphs do not have. In particular, the number of edges
in a tree is always one less than the number of its vertices: |E| = |V| − 1.

Rooted Trees Another very important property of trees is the fact that for every two vertices in a
tree, there always exists exactly one simple path from one of these vertices to the other. This
property makes it possible to select an arbitrary vertex in a free tree and consider it as the root of the
so-called rooted tree.
A rooted tree is usually depicted by placing its root on the top (level 0 of the tree), the
vertices adjacent to the root below it (level 1), the vertices two edges apart from the root
still below (level 2), and so on.

Figure below presents such a transformation from a free tree to a rooted tree. Rooted trees
play a very important role in computer science, a much more important one than free trees do; in
fact, for the sake of brevity, they are often referred to as simply “trees.” An obvious application of
trees is for describing hierarchies, from file directories to organizational charts of enterprises. There
are many less obvious applications, such as implementing dictionaries and data encoding.

(a) Free tree. (b) Its transformation into a rooted tree.

Among the applications of trees, we should mention the so-called state-space trees that underlie
two important algorithm design techniques: backtracking and branch-and-bound.
• For any vertex v in a tree T , all the vertices on the simple path from the root to that vertex
are called ancestors of v.
• The vertex itself is usually considered its own ancestor; the set of ancestors that excludes the
vertex itself is referred to as the set of proper ancestors.


• If (u, v) is the last edge of the simple path from the root to vertex v (and u ≠ v), u is said to
be the parent of v and v is called a child of u;
• vertices that have the same parent are said to be siblings.
• A vertex with no children is called a leaf ;
• a vertex with at least one child is called parental. All the vertices for which a vertex v is an
ancestor are said to be descendants of v;
• the proper descendants exclude the vertex v itself.
• All the descendants of a vertex v with all the edges connecting them form the subtree of T
rooted at that vertex.
• Thus, for the tree in Figure b, the root of the tree is a; vertices d, g, f, h, and i are leaves, and
vertices a, b, e, and c are parental; the parent of b is a; the children of b are c and g; the
siblings of b are d and e; and the vertices of the subtree rooted at b are {b, c, g, h, i}.
• The depth of a vertex v is the length of the simple path from the root to v.
• The height of a tree is the length of the longest simple path from the root to a leaf.

Ordered Trees An ordered tree is a rooted tree in which all the children of each vertex are ordered.
It is convenient to assume that in a tree’s diagram, all the children are ordered left to right.

A binary tree can be defined as an ordered tree in which every vertex has no more than two children
and each child is designated as either a left child or a right child of its parent; a binary tree may also
be empty. The binary tree with its root at the left (right) child of a vertex in a binary tree is called the
left (right) subtree of that vertex.

In the Figure below, some numbers are assigned to vertices of the binary tree. Note that the number
assigned to each parental vertex is larger than all the numbers in its left subtree and smaller than all
the numbers in its right subtree. Such trees are called binary search trees. (This figure shows a binary
tree and its binary search tree representation.)
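A minimal binary search tree sketch using the ordering property just described (illustrative, without balancing):

```python
class BSTNode:
    def __init__(self, value):
        self.value = value
        self.left = None     # all values in the left subtree are smaller
        self.right = None    # all values in the right subtree are larger

def insert(root, value):
    if root is None:
        return BSTNode(value)
    if value < root.value:
        root.left = insert(root.left, value)
    else:
        root.right = insert(root.right, value)
    return root

def search(root, value):
    """Follow the ordering property: at most one comparison per level."""
    if root is None:
        return False
    if value == root.value:
        return True
    return search(root.left if value < root.value else root.right, value)

root = None
for v in (8, 3, 10, 1, 6):
    root = insert(root, v)
print(search(root, 6), search(root, 7))  # True False
```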


Binary trees and binary search trees have a wide variety of applications in computer science;
you will encounter some of them throughout the book. In particular, binary search trees can be
generalized to more general types of search trees called multiway search trees, which are
indispensable for efficient access to very large data sets. As you will see later in the book, the
efficiency of most important algorithms for binary search trees and their extensions depends on the
tree’s height. Therefore, the following inequalities for the height h of a binary tree with n nodes are
especially important for the analysis of such algorithms: ⌊log2 n⌋ ≤ h ≤ n − 1.

A binary tree is usually implemented for computing purposes by a collection of nodes


corresponding to vertices of the tree. Each node contains some information associated with the
vertex (its name or some value assigned to it) and two pointers to the nodes representing the left
child and right child of the vertex, respectively. All the siblings of a vertex are linked via the
nodes’ right pointers in a singly linked list, with the first element of the list pointed to by the left
pointer of their parent.
Figure 1.14a illustrates this representation for the tree in Figure 1.11b. It is not difficult to
see that this representation effectively transforms an ordered tree into a binary tree said to be
associated with the ordered tree. We get this representation by “rotating” the pointers about 45
degrees clockwise (see Figure 1.14b).


Sets and Dictionaries

The notion of a set plays a central role in mathematics. A set can be described as an unordered
collection (possibly empty) of distinct items called its elements.

A specific set is defined either by an explicit listing of its elements (e.g., S = {2,3, 5, 7}) or by
specifying a property that all the set’s elements and only they must satisfy (e.g., S = {n: n is a prime
number smaller than 10}).
The most important set operations are: checking membership of a given item in a given set; finding
the union of two sets, which comprises all the elements in either or both of them; and finding the
intersection of two sets, which comprises all the common elements in the sets.

Sets can be implemented in computer applications in two ways. The first considers only sets
that are subsets of some large set U, called the universal set. If set U has n elements, then any subset
S of U can be represented by a bit string of size n, called a bit vector, in which the ith element is 1 if
and only if the ith element of U is included in set S. Thus, to continue with our example, if U = {1, 2,
3, 4, 5, 6, 7, 8, 9}, then S = {2, 3, 5, 7} is represented by the bit string 011010100. This way of
representing sets makes it possible to implement the standard set operations very fast, but at the
expense of potentially using a large amount of storage.
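Python integers can serve as bit vectors; the example S = {2, 3, 5, 7} within U = {1, ..., 9} becomes the following (bit i − 1 stands for element i; note that Python prints the most significant bit first, so the printed string appears reversed relative to the text's 011010100):

```python
def to_bitvector(s):
    """Represent a subset of {1, 2, ...} as an integer bit vector."""
    bv = 0
    for x in s:
        bv |= 1 << (x - 1)   # bit (x - 1) stands for element x
    return bv

S = to_bitvector({2, 3, 5, 7})
T = to_bitvector({1, 2, 7})

print(bin(S))        # membership pattern of S
print(bin(S & T))    # intersection: {2, 7}
print(bin(S | T))    # union: {1, 2, 3, 5, 7}
assert S & (1 << (5 - 1))   # membership test: 5 is in S
```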

The second and more common way to represent a set for computing purpose is to use the list
structure to indicate the set’s elements. Of course, this option, too, is feasible only for finite sets;
fortunately, unlike mathematics, this is the kind of sets most computer applications need.

Note, however, there are two principal points of distinction between sets and lists. First, a set cannot
contain identical elements; a list can. This requirement for uniqueness is sometimes circumvented
by the introduction of a multiset, or bag, an unordered collection of items that are not necessarily
distinct. Second, a set is an unordered collection of items; therefore, changing the order of its
elements does not change the set. A list, defined as an ordered collection of items, is exactly the
opposite.

This is an important theoretical distinction, but fortunately it is not important for many
applications. It is also worth mentioning that if a set is represented by a list, depending on the
application at hand, it might be worth maintaining the list in a sorted order.

In computing, the operations we need to perform for a set or a multiset most often are searching for
a given item, adding a new item, and deleting an item from the collection. A data structure that
implements these three operations is called the dictionary. Note the relationship between this data
structure and the problem of searching mentioned in Section 1.3; obviously, we are dealing here
with searching in a dynamic context. Consequently, an efficient implementation of a dictionary has
to strike a compromise between the efficiency of searching and the efficiencies of the other two
operations.
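Python's built-in set already implements the three dictionary operations, each in expected constant time via hashing:

```python
d = set()
d.add("apple")           # add a new item
d.add("banana")
print("apple" in d)      # search -> True
d.discard("apple")       # delete an item
print("apple" in d)      # False
```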
