Daa Unit-1
[Flowchart: a solution is devised as an algorithm using an algorithm design technique; its correctness is proved (if not, the design is revised); the algorithm is then analysed (if it is not efficient, the design is revised); finally it is expressed as pseudocode, a flowchart, or a program in a programming language.]
Fig.1.1:Specification of Algorithm
It is very easy to specify an algorithm using natural language. But many times the specification of an algorithm in natural language is not clear and may require a more detailed description.
Example: Write an algorithm to perform addition of two numbers.
Step 1: Read the first number, say ‘a’.
Step 2: Read the second number, say ‘b’.
Step 3: Add the two numbers and store the result in a variable ‘c’.
Step 4: Display the result.
Such a specification creates difficulty while actually implementing it (difficulty in converting it into source code). Hence many programmers prefer to specify an algorithm by means of pseudo-code.
Another way of representing an algorithm is a flow chart. A flow chart is a graphical representation of an algorithm, but the flowchart method works well only if the algorithm is small and simple.
Pseudo-Code for expressing Algorithms
Besides the plain algorithm description, there are two more representations used by programmers: the flow chart and pseudo-code. A flowchart is a graphical representation of an algorithm. Pseudo-code is a representation of an algorithm in which the instruction sequence is given with the help of programming constructs. It is not a programming language, since no pseudo-language compiler exists.
The general conventions for writing pseudo-code are presented below:
1. Comments begin with // and continue until the end of the line.
2. A block of statements (compound statement) is represented using { and }, for example in if statements, while loops, functions, etc.
3.Example
{
Statement 1;
Statement 2;
.........
.........
}
4. The delimiter ; is used at the end of each statement.
5. An identifier begins with a letter. Example: sum, sum5, a; but not 5sum, 4a, etc.
6. Assignment of values to variables is done using the assignment operator := or ←.
7. There are two Boolean values, TRUE and FALSE.
Logical operators: AND, OR, NOT.
Relational operators: <, >, ≥, ≤, =, ≠.
Arithmetic operators: +, -, *, /, %.
8. The conditional statement if-then or if-then-else is written in the following form:
If (condition) then (statement)
If (condition) then (statement-1) else (statement-2)
'If' is a powerful statement used to make decisions based on a condition. If the condition is true, the corresponding block of statements is executed.
Example
if(a>b) then
{
write("a is big");
}
else
{
write("b is big");
}
9. Case statement:
case
{
:(condition-1): (statement-1)
:(condition-2): (statement-2)
..............
:(condition-n): (statement-n)
else
(statement-(n+1));
}
If condition-1 is true, statement-1 is executed and the case statement is exited. If condition-1 is false, condition-2 is evaluated. If condition-2 is true, statement-2 is executed, and so on. If none of the conditions is true, statement-(n+1) is executed and the case statement is exited. The else clause is optional.
10. Loop statements:
i). For loop:
The general form of the for loop is
for variable := value-1 to value-n step increment do
{
Statement-1;
Statement-2;
.......
Statement-n;
}
Example:
for i := 1 to 10 do
{
write(i); // displaying numbers from 1 to 10
}
ii). While loop:
The general form of the while loop is
while <condition> do
{
<statement 1>
<statement 2>
........
<statement n>
}
iii). Repeat-until loop:
Example 1:
i := 1;
repeat
{
write(i); // displaying numbers from 1 to 10
i := i + 1;
} until (i > 10);
Example 2:
i := 1;
while (i <= 10) do
{
write(i); // displaying numbers from 1 to 10
i := i + 1;
}
11. Break: this statement causes an exit from the loop.
12. Elements of an array are accessed using [ ].
For example, if A is a one-dimensional array, then the ith element can be accessed using A[i]. If A is a two-dimensional array, then the (i, j)th element can be accessed using A[i, j].
13. Procedures (functions): There is only one type of procedure: an algorithm, which consists of a heading and a body.
Syntax:
Algorithm Name (<parameter list>)
{
body of the procedure
}
14. Compound data types can be formed with records.
Syntax:
Name = record
{
data-type-1 data-1;
data-type-2 data-2;
...
data-type-n data-n;
}
Example:
Employee = record
{
int no;
char name[10];
float salary;
}
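As an illustration (outside the pseudocode conventions themselves), the Employee record above can be sketched in Python with a dataclass; the field values used here are made up:

```python
from dataclasses import dataclass

@dataclass
class Employee:
    no: int        # employee number (int no in the record)
    name: str      # name (char name[10] in the record)
    salary: float  # salary (float salary in the record)

# Hypothetical values, purely for illustration.
e = Employee(no=1, name="Ravi", salary=25000.0)
print(e.name)
```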
6.Performance Analysis:
Performance analysis, or analysis of algorithms, refers to the task of determining the efficiency of an algorithm, i.e. how much computing time and storage the algorithm requires to run (or execute). This analysis helps in judging the value of one algorithm over another.
To judge an algorithm, two things in particular are taken into consideration:
1. Space complexity
2. Time complexity.
Space Complexity: The space complexity of an algorithm (program) is the amount of
memory it needs to run to completion. The space needed by an algorithm has the following
components.
1.Instruction Space.
2.Data Space.
3.Environment Stack Space.
Instruction Space: Instruction space is the space needed to store the compiled version of the program instructions. The amount of instruction space needed depends on factors such as:
i). The compiler used to compile the program into machine code.
ii). The compiler options in effect at the time of compilation.
iii). The target computer, i.e. the computer on which the algorithm runs.
Note that one compiler may produce less code than another when the same program is compiled by the two.
Data Space: Data space is the space needed to store all constant and variable values. Data
space has two components.
i).Space needed by constants, for example 0, 1, 2.134.
ii).Space needed by dynamically allocated objects such as arrays, structures, classes.
Environment Stack Space: Environment stack space is used during the execution of functions. Each time a function is invoked, the following data are saved on the environment stack:
i). The return address.
ii). The values of local variables.
iii). The values of the formal parameters of the function being invoked.
Environment stack space is mainly used by recursive functions. Thus, the space requirement of any program P may be written as
Space complexity S(P) = C + Sp(instance characteristics).
This equation shows that the total space needed by a program is divided into two parts.
• A fixed space requirement (C), independent of the instance characteristics of the inputs and outputs:
- Instruction space
- Space for simple variables, fixed-size structure variables, and constants.
• A variable space requirement (Sp(I)), dependent on the instance characteristics I:
- This part includes dynamically allocated space and the recursion stack space.
Examples of instance characteristics follow.
Example 1:
Algorithm NEC(float x, float y, float z)
{
return (x + y + y * z + (x + y + z)) / (x + y) + 4.0;
}
In the above algorithm there are no instance characteristics, and the space needed by x, y and z is independent of the instance characteristics. Therefore we can write
S(NEC) = 3 + 0 = 3
(one word each for x, y and z).
∴ The space complexity is O(1).
Example 2:
Algorithm ADD(float X[], int n)
{
sum := 0.0;
for i := 1 to n do
sum := sum + X[i];
return sum;
}
Here at least n words are needed, since X must be large enough to hold the n elements to be summed. The problem instance is characterized by n, the number of elements to be summed. So we can write
S(ADD) = 3 + n
(3: one word each for n, i and sum; n: for the array X[]).
∴ The space complexity is O(n).
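A Python sketch of the ADD algorithm above, with the space accounting from the text repeated in the docstring:

```python
def add(x, n):
    """Sum the first n elements of x.

    Space: the array x needs n words; n, i and sum need one word each,
    so S(ADD) = 3 + n, i.e. the space complexity is O(n).
    """
    s = 0.0
    for i in range(n):
        s += x[i]
    return s

print(add([1.0, 2.0, 3.0, 4.0], 4))  # → 10.0
```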
Time Complexity
The time complexity of an algorithm is the amount of computer time it needs to run to completion. We can measure the time complexity of an algorithm with two approaches:
1. A priori analysis (compile time).
2. A posteriori analysis (run/execution time).
In a priori analysis we analyse the behaviour of the algorithm before it is executed; it concentrates on determining the order of execution of statements. In a posteriori analysis we measure the execution time while the algorithm runs. A posteriori analysis gives accurate values, but it is very costly.
The time T(P) taken by a program P is the sum of its compile time and its execution time. The compile time does not depend on the instance characteristics (the size of the input), so we confine ourselves to the run time, which does depend on the size of the input. This run time is denoted by tp(instance characteristics).
The following equation determines the number of additions, subtractions, multiplications, divisions, compares, loads, stores and so on that would be made by the code for P:
tp(n) = Ca·ADD(n) + Cs·SUB(n) + Cm·MUL(n) + Cd·DIV(n) + ...
where n denotes the instance characteristics; Ca, Cs, Cm, Cd and so on denote the time needed for an addition, subtraction, multiplication, division and so on; and ADD, SUB, MUL, DIV and so on are functions whose values are the numbers of additions, subtractions, multiplications, divisions and so on. In practice, it is an impossible task to find the time complexity this way.
Another method is the step count. Using the step count, we can determine the number of steps needed by a program to solve a particular problem in two ways.
Method 1: Introduce a global variable count, initialized to zero. Each time a statement in the original program is executed, count is incremented by the step count of that statement.
Example:
Algorithm Sum(a, n)
{
s := 0;
count := count + 1; // for the assignment s := 0
for i := 1 to n do
{
count := count + 1; // for the for statement
s := s + a[i];
count := count + 1; // for the assignment
}
count := count + 1; // for the last test of the for loop
count := count + 1; // for the return statement
return s;
}
Thus the total number of steps is 2n + 3.
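Method 1 can be tried out directly in Python; the count increments mirror the pseudocode above (the function name is illustrative):

```python
count = 0  # global step counter, initialized to zero

def algorithm_sum(a, n):
    global count
    s = 0
    count += 1              # for the assignment s := 0
    for i in range(n):
        count += 1          # for each test of the for condition
        s += a[i]
        count += 1          # for the assignment s := s + a[i]
    count += 1              # for the final (failing) test of the for loop
    count += 1              # for the return statement
    return s

algorithm_sum([1, 2, 3, 4, 5], 5)
print(count)  # → 13, i.e. 2n + 3 with n = 5
```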
Method 2: The second method to determine the step count of an algorithm is to build a table in which we list the total number of steps contributed by each statement.
Example:

Statement                  | s/e | Frequency | Total steps
1. Algorithm Sum(a, n)     |  0  |     -     |  0
2. {                       |  0  |     -     |  0
3. s := 0;                 |  1  |     1     |  1
4. for i := 1 to n do      |  1  |    n+1    |  n+1
5. s := s + a[i];          |  1  |     n     |  n
6. return s;               |  1  |     1     |  1
7. }                       |  0  |     -     |  0
Total                      |     |           |  2n + 3
Consider searching for an element in the array A = {3, 4, 5, 6, 7, 9, 10, 12, 15}.
Best Case: Suppose we search for the element 3. First A(1) is compared with 3 and a match occurs, so the number of comparisons is only one. The search takes the minimum number of comparisons, so this is the best case. The time complexity is O(1).
Average Case: Suppose we search for the element 7. First A(1) is compared with 7 (is 3 = 7?), and no match occurs. Next A(2) is compared with 7, no match. Then A(3) and A(4) are compared with 7, no match; up to now 4 comparisons have taken place. Now A(5) is compared with 7 (7 = 7) and a match occurs. The number of comparisons is 5. The search takes an average number of comparisons, so this comes under the average case.
Note: If there are n elements, then on average we require n/2 comparisons.
∴ The time complexity is O(n/2) = O(n) (we neglect the constant).
Note: If the element is not found in the array, then we have to search the entire array, so this comes under the worst case.
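The three cases can be seen by instrumenting a linear search over the array used above; the helper returns the index found together with the number of comparisons made (the function name is illustrative):

```python
def linear_search(a, key):
    """Return (index, comparisons), or (-1, comparisons) if key is absent."""
    comparisons = 0
    for i, value in enumerate(a):
        comparisons += 1
        if value == key:
            return i, comparisons
    return -1, comparisons

a = [3, 4, 5, 6, 7, 9, 10, 12, 15]
print(linear_search(a, 3))   # best case: (0, 1) — one comparison
print(linear_search(a, 7))   # average-style case: (4, 5) — five comparisons
print(linear_search(a, 8))   # worst case, absent: (-1, 9) — entire array scanned
```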
7.Asymptotic Notation:
Asymptotic notation gives a useful, machine-independent measurement of time complexity. Asymptotic complexity gives an idea of how rapidly the space requirement or time requirement grows as the problem size increases. For example, for a computing device that can execute 1000 complex operations per second, it tells us the size of the problem that can be solved in a second, a minute or an hour by algorithms of different asymptotic complexity. In general, asymptotic complexity is a measure of the algorithm, not of the problem. Usually the complexity of an algorithm is given as a function relating the input length to the number of steps (time complexity) or storage locations (space complexity). For example, the running time may be expressed as a function of the input size n as follows:
f(n) = n^4 + 100n^2 + 10n + 50 (running time)
There are four important asymptotic notations:
1. Big oh notation (O)
2. Omega notation (Ω)
3. Theta notation (θ)
4. Little oh notation (o)
Let f(n) and g(n) be two non-negative functions.
Big oh notation
Big oh notation is denoted by 'O'. It is used to describe the efficiency of an algorithm and represents the upper bound on an algorithm's running time. Using big O notation we can state the largest amount of time taken by the algorithm to complete.
Definition: Let f(n) and g(n) be two non-negative functions. We say that f(n) is O(g(n)) if and only if there exist positive constants c and n0 such that
f(n) ≤ c·g(n) for all n ≥ n0.
Here g(n) is an upper bound for f(n).
Example: Let f(n) = 2n^4 + 5n^2 + 2n + 3
≤ 2n^4 + 5n^4 + 2n^4 + 3n^4
= (2 + 5 + 2 + 3)n^4
= 12n^4.
So f(n) ≤ 12n^4 for n ≥ 1.
This implies g(n) = n^4, c = 12 and n0 = 1, hence
f(n) = O(n^4).
The above definition states that the function f is at most c times the function g when n is greater than or equal to n0. This notation therefore provides an upper bound for the function f, i.e. g(n) is an upper bound on the value of f(n) for all n ≥ n0.
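The constants found above (c = 12, n0 = 1) can be spot-checked numerically in Python; this only tests a finite range of n, so it is a sanity check, not a proof:

```python
def f(n):
    return 2 * n**4 + 5 * n**2 + 2 * n + 3

c, n0 = 12, 1
# Check f(n) <= c * g(n) with g(n) = n^4 over a finite range of n >= n0.
assert all(f(n) <= c * n**4 for n in range(n0, 1000))
print("f(n) <= 12 * n^4 holds for all tested n >= 1")
```

Note the bound is tight at n = 1, where f(1) = 12 = 12·1^4.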
Big omega notation
Big omega notation is denoted by 'Ω'. It is used to represent the lower bound on an algorithm's running time. Using big omega notation we can state the shortest amount of time taken by the algorithm to complete.
Definition: The function f(n) = Ω(g(n)) (read as 'f of n is omega of g of n') if and only if there exist positive constants c and n0 such that
f(n) ≥ c·g(n) for all n ≥ n0.
Example:
Let f(n) = 2n^4 + 5n^2 + 2n + 3
≥ 2n^4 (as n → ∞, the lower order terms are insignificant).
So f(n) ≥ 2n^4 for n ≥ 1.
This implies g(n) = n^4, c = 2 and n0 = 1, hence
f(n) = Ω(n^4).
Big theta notation
Big theta notation is denoted by 'θ'. It lies between the upper and lower bounds on an algorithm's running time.
Definition: Let f(n) and g(n) be two non-negative functions. We say that f(n) is θ(g(n)) if and only if there exist positive constants c1, c2 and n0 such that
c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0.
The above definition states that the function f(n) lies between c1 times the function g(n) and c2 times the function g(n), where c1 and c2 are positive constants. This notation provides both lower and upper bounds for the function f(n), i.e. g(n) is both a lower and an upper bound on the value of f(n) for large n. In other words, f(n) = θ(g(n)) if and only if f(n) is both O(g(n)) and Ω(g(n)), i.e. g(n) is both an upper and a lower bound on f(n).
Example:
f(n) = 2n^4 + 5n^2 + 2n + 3
2n^4 ≤ 2n^4 + 5n^2 + 2n + 3 ≤ 12n^4
2n^4 ≤ f(n) ≤ 12n^4, for n ≥ 1
g(n) = n^4, c1 = 2, c2 = 12 and n0 = 1, hence
f(n) = θ(n^4).
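Likewise, the theta constants c1 = 2 and c2 = 12 can be spot-checked numerically (again only over a finite range of n, as a sanity check):

```python
def f(n):
    return 2 * n**4 + 5 * n**2 + 2 * n + 3

c1, c2, n0 = 2, 12, 1
# Check c1 * n^4 <= f(n) <= c2 * n^4 over a finite range of n >= n0.
assert all(c1 * n**4 <= f(n) <= c2 * n**4 for n in range(n0, 1000))
print("2*n^4 <= f(n) <= 12*n^4 holds for all tested n >= 1")
```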
Little 'oh' notation
Little oh notation is denoted by 'o'. The asymptotic upper bound provided by O-notation may or may not be asymptotically tight. The bound 2n^2 = O(n^2) is asymptotically tight, but the bound 2n = O(n^2) is not. We use o-notation to denote an upper bound that is not asymptotically tight.
Definition: f(n) = o(g(n)) if and only if, for any positive constant c > 0, there exists an n0 > 0 such that f(n) < c·g(n) for all n > n0.
Table 1.3: Time Complexity
Sets & Disjoint Set Union: introduction, union and find operations:
A set is represented by the elements it stores. We assume that the sets being represented are pairwise disjoint (that is, if Si and Sj, i ≠ j, are two sets, then there is no element that is in both Si and Sj).
Algorithm 2.13
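A minimal Python sketch of the union and find operations, using union by rank and path halving (a common variant of path compression; the class name is illustrative):

```python
class DisjointSet:
    """Disjoint-set (union-find) over elements 0..n-1."""

    def __init__(self, n):
        self.parent = list(range(n))  # each element starts in its own set
        self.rank = [0] * n

    def find(self, x):
        # Path halving: point every other node on the path at its grandparent.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False            # already in the same set
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx        # attach the shorter tree under the taller
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True

ds = DisjointSet(5)
ds.union(0, 1)
ds.union(1, 2)
print(ds.find(0) == ds.find(2))  # → True (0, 1, 2 are in one set)
print(ds.find(0) == ds.find(4))  # → False (4 is still in its own set)
```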
Basic Traversal and Search Techniques
Techniques for Binary Trees
Binary Tree
A binary tree is a finite set of nodes which is either empty or consists of a root and two disjoint binary trees called the left subtree and the right subtree.
In a traversal of a binary tree, each element of the binary tree is visited exactly once. During the visit of an element, all actions (clone, display, evaluate the operator, etc.) are taken with respect to that element. When traversing a binary tree, we follow a linear order, e.g. L, D, R, where
L -> moving left
D -> printing the data
R -> moving right
For Fig. 1:
In-order: A-B-C-D-E-F-G-H-I
Post-order: A-C-E-D-B-H-I-G-F
Pre-order: F-B-A-D-C-E-G-I-H
Algorithm preorder(x)
Input: x is the root of a subtree.
1. if x ≠ NULL
2. then output key(x);
3. preorder(left(x));
4. preorder(right(x));

Algorithm postorder(x)
Input: x is the root of a subtree.
1. if x ≠ NULL
2. then postorder(left(x));
3. postorder(right(x));
4. output key(x);

Algorithm inorder(x)
Input: x is the root of a subtree.
1. if x ≠ NULL
2. then inorder(left(x));
3. output key(x);
4. inorder(right(x));
Exercises
5.2 Techniques for Graphs
Representation of graphs
(i) Adjacency matrix: a V x V array, with matrix[i][j] storing whether there is an edge between the ith vertex and the jth vertex. This matrix is also called a "bit matrix" or "Boolean matrix".
(ii) Adjacency list: one linked list per vertex, each storing the directly reachable vertices.
(iii) Linked list or edge list.
"The process of traversing all the nodes or vertices of a graph is called graph traversal."
Depth First Search (DFS) explores each possible path to its conclusion before another path is tried. In other words, go as far as you can; if you don't have an unvisited node to go to, go back and try another way. This is simply called "backtracking".
(i) Select an unvisited node v, visit it, and treat it as the current node.
(ii) Find an unvisited neighbour of the current node, visit it, and make it the new current node.
(iii) If the current node has no unvisited neighbours, backtrack to its parent and make that the new current node.
(iv) Repeat steps (ii) and (iii) until no more nodes can be visited.
(v) Repeat from step (i) for any remaining nodes.
Implementation of DFS
DFS(u)
{
Mark u as visited
For each vertex v directly reachable from u
If v is unvisited
DFS(v)
}
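The DFS pseudocode above can be sketched in Python with an adjacency-list dictionary (the sample graph is made up):

```python
def dfs(graph, u, visited=None, order=None):
    """Recursive depth-first search; graph is an adjacency-list dict."""
    if visited is None:
        visited, order = set(), []
    visited.add(u)              # mark u as visited
    order.append(u)
    for v in graph[u]:          # each vertex directly reachable from u
        if v not in visited:
            dfs(graph, v, visited, order)
    return order

graph = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
print(dfs(graph, "A"))  # → ['A', 'B', 'D', 'C']
```

Note how D is reached via B before C is tried: each path is explored to its conclusion before backtracking.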
Visited vertex: the node or vertex which has been visited is called a "visited vertex", i.e. it can be called the "current node".
Discovery edge: the opposite of an unexplored edge; an edge which has already been traversed is known as a discovery edge.
Back edge: if the current node has no unvisited neighbours we need to backtrack to its parent node. The edge used in backtracking is called a back edge.
For the following graph the steps for tracing are as follows:
Properties of DFS
i) DFS (G, v) visits all the vertices and edges in the connected component of v.
ii) The discovery edges labeled by DFS (G, v) form a spanning tree of the connected
component of v.
Tracing of graph using Depth First Search
Exercise
1.
Depth: W-U-V-Y-X-Z
2.
Depth: A-B-C-E-D
Depth: 1-2-3-4-5-6-7-8-9-10-11-12.
5.3 Breadth First Search
It is one of the simplest algorithms for searching or visiting each vertex in a graph. In this method each node on the same level is checked before the search proceeds to the next level. BFS makes use of a queue to store visited vertices, expanding the path from the earliest visited vertices.
Breadth: a-b-c-d-e-f-g-h-i-j-k
Implementation of BFS
Explored vertex: A vertex is said to be explored if all the adjacent vertices of v are visited.
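A minimal queue-based sketch of the BFS implementation, matching the level-by-level description above (the sample graph is made up):

```python
from collections import deque

def bfs(graph, start):
    """Visit each vertex level by level using a queue; graph is an adjacency-list dict."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        u = queue.popleft()
        order.append(u)             # u is now explored
        for v in graph[u]:          # visit all unvisited neighbours of u
            if v not in visited:
                visited.add(v)
                queue.append(v)
    return order

graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(bfs(graph, "a"))  # → ['a', 'b', 'c', 'd']
```

Both b and c (level 1) are explored before d (level 2), as the definition of BFS requires.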
Example 1: Breadth first search for the following graph:
Properties of BFS
Complexity of BFS
BFS: a f h e g i d j k c l n b m o
BFS: 7-11-8-2-9-10-5-3
BFS: A-B-C-D-E-F-G-H
Connected component: If G is a connected undirected graph, then we can visit all the vertices of the graph in the first call to BFS. The subgraph which we obtain after traversing the graph using BFS represents the connected component of the graph.
Thus BFS can be used to determine whether G is connected. All the newly visited vertices on a call to BFS represent the vertices in one connected component of graph G. The subgraph formed by these vertices makes up the connected component.
Spanning tree of a graph: Consider the set of all edges (u, w), where each vertex w is adjacent to u and has not yet been visited. According to the BFS algorithm, this set of edges gives a spanning tree of G if G is connected. We obtain a depth-first-search spanning tree similarly.
These are the BFS and DFS spanning trees of the graph G
Bi-connected Components
i) After deleting vertex B and the edges incident on B, the given graph is divided into two components.
ii) After deleting vertex E and the edges incident on E, the resulting components are as shown.
iii) After deleting vertex F and the edges incident on F, the given graph is divided into two components.
Note: An articulation point is an undesirable feature in a communication network, since the network is split into disconnected pieces whenever the joint node fails.
Notation:
Vi - a vertex belonging to Bi
Bi - a bi-connected component
i - vertex number, 1 to k
a - an articulation point
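An articulation point can be checked by brute force: delete each vertex in turn and test whether the remaining vertices stay connected. This is a simple O(V·(V+E)) sketch for small teaching graphs, not the efficient DFS-based algorithm:

```python
def reachable(graph, start, removed):
    """Vertices reachable from start, ignoring the removed vertex."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in graph[u]:
            if v != removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def articulation_points(graph):
    """v is an articulation point if deleting v disconnects the rest of the graph."""
    points = set()
    vertices = list(graph)
    for v in vertices:
        rest = [u for u in vertices if u != v]
        if len(rest) <= 1:
            continue
        # If some remaining vertex is unreachable from rest[0], v is an articulation point.
        if not set(rest) <= reachable(graph, rest[0], v):
            points.add(v)
    return points

# Path graph 1-2-3: removing 2 disconnects 1 from 3, so 2 is an articulation point.
g = {1: [2], 2: [1, 3], 3: [2]}
print(articulation_points(g))  # → {2}
```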
Exercise
***********