Unit-IV Notes
Preliminaries of Tree ADT - Binary Trees - The Search Tree ADT - Binary Search Trees - AVL
Trees - Tree Traversals - B-Trees - Heap Tree - Preliminaries of Graph ADT - Representation of
Graph - Graph Traversal - BFS - DFS - Applications of Graph - Shortest-Path Algorithms -
Dijkstra's Algorithm - Minimum Spanning Tree - Prim's Algorithm
Degree of tree = $\max_{\text{node} \in \text{tree}} \{\, \text{degree}(\text{node}) \,\}$
Degree of tree = 3
Parent = node that has subtrees
Children = the roots of the subtrees of parent
4.2 Binary Trees
A binary tree is a tree in which no node can have more than two children. Each node can have 0, 1
or 2 children, as shown in Fig 4.2.
[Fig 4.2: Binary tree]
In this tree, each node contains two pointers, a left pointer and a right pointer, which point to the
left and right child respectively. Nodes 3, 5 and 6 are leaf nodes, so both their left and right
pointers are NULL.
Properties of Binary Tree
a) At each level i, the maximum number of nodes is 2^i.
b) The height of a tree is defined as the length of the longest path from the root node to a leaf node.
For example, a tree of height 3 has at most (1+2+4+8) = 15 nodes.
c) The minimum number of nodes possible at height h is equal to h+1.
d) If the number of nodes is minimum, then the height of the tree would be maximum. Similarly, if
the number of nodes is maximum, then the height of the tree would be minimum.
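The counting properties above can be checked with a few lines of Python. This is a minimal sketch; the function names are illustrative and assume that the root is at level 0 and that height is measured in edges.

```python
def max_nodes_at_level(i):
    """Maximum number of nodes on level i (root is level 0)."""
    return 2 ** i


def max_nodes(height):
    """Maximum number of nodes in a binary tree of the given height."""
    return sum(max_nodes_at_level(i) for i in range(height + 1))


def min_nodes(height):
    """Minimum number of nodes: one node per level (a single chain)."""
    return height + 1


print(max_nodes(3))  # 1 + 2 + 4 + 8 = 15
print(min_nodes(3))  # 4
```

The closed form for the maximum is 2^(h+1) - 1, which agrees with the sum computed level by level.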
Fig 4.5 Binary search tree
Basic Operations
1. Search
2. Insert
3. Pre-order Traversal
4. In-order Traversal
5. Post-order Traversal
1. Search Operation
Whenever an element is to be searched, start searching from the root node. If the search key is equal
to the key of the current node, the element is found. If it is less than the key of the current node,
search for the element in the left subtree; otherwise, search for it in the right subtree. The search
fails when an empty subtree is reached. Fig 4.6 shows an example of the search operation.
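The search rule can be sketched in Python as follows. The `Node` class, the function name `bst_search` and the sample keys are assumptions made for illustration; they are not taken from Fig 4.6.

```python
class Node:
    """A binary search tree node (illustrative helper)."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def bst_search(node, target):
    """Return the node holding target, or None if it is not in the tree."""
    while node is not None:
        if target == node.key:
            return node
        # Smaller keys live in the left subtree, larger keys in the right.
        node = node.left if target < node.key else node.right
    return None


# Hand-built sample tree with hypothetical keys.
root = Node(60, Node(25, Node(17), Node(35)), Node(100, Node(80)))
print(bst_search(root, 35) is not None)  # True
print(bst_search(root, 90) is not None)  # False
```

Each comparison discards one subtree, so the number of steps is bounded by the height of the tree.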
Fig 4.7 Find Min and max values
2. Insertion Operation
When a binary search tree is constructed, the keys are added one at a time. As each key is inserted,
a new node is created for it and linked into its proper position within the tree. Suppose we
want to build a binary search tree from the key list [60, 25, 100, 35, 17, 80] by inserting the keys in
the order they are listed. The resulting tree is shown in Fig 4.8.
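A minimal Python sketch of this insertion process, using the key list from the text. The `Node` class and the helper names are illustrative; sending duplicate keys to the right subtree is an assumption, since the notes do not say how duplicates are handled.

```python
class Node:
    """A binary search tree node (illustrative helper)."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def bst_insert(root, key):
    """Insert key into the subtree rooted at root and return the new root."""
    if root is None:
        return Node(key)          # fell off the tree: link the new node here
    if key < root.key:
        root.left = bst_insert(root.left, key)
    else:                         # assumption: duplicates go to the right
        root.right = bst_insert(root.right, key)
    return root


def inorder(node):
    """Return the keys in sorted (left-node-right) order."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


root = None
for key in [60, 25, 100, 35, 17, 80]:
    root = bst_insert(root, key)

print(root.key, root.left.key, root.right.key)  # 60 25 100
print(inorder(root))  # [17, 25, 35, 60, 80, 100]
```

The first key becomes the root, and the in-order listing of the finished tree is the sorted key list.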
Fig 4.9 Removing leaf node
Removing an interior node with one child is shown in Fig 4.10.
Removing an interior node with two children is shown in Fig 4.11.
For node 12, its predecessor is node 4 and its successor is node 23. Removing an interior node
with two children requires three steps:
1. Find the logical successor, S, of the node to be deleted, N.
2. Copy the key from node S to node N.
3. Remove node S from the tree.
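The three steps can be sketched in Python as below. The helper names and the sample key list are assumptions chosen so that node 12 has predecessor 4 and successor 23, as in the text; the actual tree of Fig 4.11 may differ.

```python
class Node:
    """A binary search tree node (illustrative helper)."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def bst_insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    else:
        root.right = bst_insert(root.right, key)
    return root


def bst_delete(root, key):
    """Remove key from the subtree rooted at root and return the new root."""
    if root is None:
        return None
    if key < root.key:
        root.left = bst_delete(root.left, key)
    elif key > root.key:
        root.right = bst_delete(root.right, key)
    elif root.left is None:       # leaf or one child: splice the node out
        return root.right
    elif root.right is None:
        return root.left
    else:
        succ = root.right         # step 1: successor is the leftmost node
        while succ.left is not None:      # of the right subtree
            succ = succ.left
        root.key = succ.key       # step 2: copy the successor's key into N
        root.right = bst_delete(root.right, succ.key)  # step 3: remove S
    return root


def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)


root = None
for key in [60, 12, 90, 4, 41, 23]:   # hypothetical keys
    root = bst_insert(root, key)
root = bst_delete(root, 12)
print(root.left.key)   # 23
print(inorder(root))   # [4, 23, 41, 60, 90]
```

Because the successor is the leftmost node of the right subtree, it has at most one child, so step 3 always reduces to one of the simpler removal cases.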
With each node in an AVL tree, we associate a balance factor, which indicates the height
difference between the left and right branch. The balance factor can be one of three states:
left high (>): When the left subtree is higher than the right subtree.
equal high (=): When the two subtrees have equal height.
right high (<): When the right subtree is higher than the left subtree.
Insertions
Inserting a key into an AVL tree begins with the same process used with a binary search tree. We
search for the new key in the tree and add a new node at the child link where we fall off the tree.
When a new key is inserted into an AVL tree, the balance property of the tree must be maintained.
If the insertion of the new key causes any of the subtrees to become unbalanced, they will have to
be rebalanced, as shown in Fig 4.13.
i) Case 1: The balance factor of the pivot node (P) is left high before the insertion and the new
key is inserted into the left child (C) of the pivot node. To rebalance the subtree, the pivot node has
to be rotated right over its left child. The rotation is accomplished by changing the links such that P
becomes the right child of C and the right child of C becomes the left child of P, as shown in Fig
4.15.
ii) Case 2: This case involves three nodes: the pivot (P), the left child of the pivot (C), and the right
child (G) of C. For this case to occur, the balance factor of the pivot is left high before the insertion
and the new key is inserted into the right subtree of C. Node C has to be rotated left over node G and
the pivot node has to be rotated right over its left child. The link changes are: setting the right child
of G as the new left child of the pivot node, changing the left child of G to become the right child of
C, and setting C to be the new left child of G, as shown in Fig 4.16.
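The two rotation cases can be sketched in Python as follows. The `Node` class, the function names and the sample keys are illustrative assumptions; heights are recomputed on demand for clarity, whereas a practical AVL tree would store a balance factor or height in each node.

```python
class Node:
    """An AVL tree node (illustrative helper)."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def height(node):
    """Height in edges; an empty subtree has height -1."""
    return -1 if node is None else 1 + max(height(node.left), height(node.right))


def balance(node):
    """Positive: left high, zero: equal high, negative: right high."""
    return height(node.left) - height(node.right)


def rotate_right(p):
    """Case 1: rotate pivot P right over its left child C."""
    c = p.left
    p.left = c.right      # right child of C becomes left child of P
    c.right = p           # P becomes right child of C
    return c              # C is the new root of the subtree


def rotate_left(c):
    """Rotate C left over its right child G."""
    g = c.right
    c.right = g.left      # left child of G becomes right child of C
    g.left = c            # C becomes left child of G
    return g


def rotate_left_right(p):
    """Case 2: rotate C left over G, then rotate P right over G."""
    p.left = rotate_left(p.left)
    return rotate_right(p)


case1 = rotate_right(Node(30, Node(20, Node(10))))
print(case1.key, case1.left.key, case1.right.key)  # 20 10 30

case2 = rotate_left_right(Node(30, Node(10, None, Node(20))))
print(case2.key, case2.left.key, case2.right.key)  # 20 10 30
```

In both cases the subtree root changes, so the caller has to relink the returned node into the parent of the old pivot.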
Deletions
When an entry is removed from an AVL tree, we must ensure the balance property is maintained.
For example, suppose we want to remove key 17 from the AVL tree in Figure 14.21(a). After
removing the leaf node, the subtree rooted at node 25 is out of balance, as shown below. A left
rotation has to be performed pivoting on node 25 to correct the imbalance, as shown in Fig 4.18.
(2) Traverse the right subtree of R in postorder.
(3) Process the root R.
Observe that each algorithm contains the same three steps, and that the left subtree of R is
always traversed before the right subtree. The difference between the algorithms is the time at
which the root R is processed. The three algorithms are sometimes called, respectively, the node-
left-right (NLR) traversal, the left-node-right (LNR) traversal and the left-right-node (LRN)
traversal.
Traversal algorithms
Preorder Traversal
Consider the following case where we have 6 nodes in the tree: A, B, C, D, E, F. The
traversal always starts from the root of the tree. The node A is the root and hence it is visited first,
and the value at this node is processed. Next, if the node has a left child, the preorder procedure is
applied on the left subtree. Finally, if the node has a right subtree, the preorder procedure is
applied on the right subtree.
Since there exists a left subtree for node A, B is now considered as the root of the left
subtree of A and the preorder procedure is applied. Hence B is processed next, and then it
is checked whether B has a left subtree, as shown in Fig 4.19.
Fig. 4.19 Preorder Traversal
The algorithm for the above method is presented in the pseudo-code form below:
Algorithm
PREORDER(ROOT)
    temp = ROOT
    If temp = NULL
        Return
    End if
    Print info(temp)
    If left(temp) ≠ NULL
        PREORDER(left(temp))
    End if
    If right(temp) ≠ NULL
        PREORDER(right(temp))
    End if
End PREORDER
Inorder Traversal
In the inorder traversal method, the left subtree of the current node is visited first, then
the current node is processed, and at last the right subtree of the current node is visited. In the
following example, the traversal starts with the root of the binary tree. The node A is the root and
it is checked whether it has a left subtree. Since it does, the inorder traversal procedure is applied on
the left subtree of the node A, whose root is B, and then on the left subtree of B, whose root is D.
Node D does not have a left subtree. Hence the node D is
processed and then it is checked whether there is a right subtree for node D. Since there is no right
subtree, control returns to the previous call, which was applied on B. Since the left subtree of B
has already been visited, B is now processed. It is then checked whether B has a right subtree. If so,
the inorder traversal method is applied on the right subtree of the node B. This recursive procedure is
followed until all the nodes are visited, as shown in Fig 4.20.
Fig 4.20 Inorder Traversal
Algorithm
INORDER(ROOT)
    temp = ROOT
    If temp = NULL
        Return
    End if
    If left(temp) ≠ NULL
        INORDER(left(temp))
    End if
    Print info(temp)
    If right(temp) ≠ NULL
        INORDER(right(temp))
    End if
End INORDER
Postorder Traversal
In the postorder traversal method, the left subtree is visited first, then the right subtree, and at
last the current node is processed. In the following example, A is the root node. Since A has a
left subtree, the postorder traversal method is applied recursively on the left subtree of A. Then,
when the left subtree of A is completely processed, the postorder traversal method is recursively
applied on the right subtree of the node A. When the right subtree is completely processed, the
current node A is processed, as shown in Fig 4.21.
Algorithm
POSTORDER(ROOT)
    temp = ROOT
    If temp = NULL
        Return
    End if
    If left(temp) ≠ NULL
        POSTORDER(left(temp))
    End if
    If right(temp) ≠ NULL
        POSTORDER(right(temp))
    End if
    Print info(temp)
End POSTORDER
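The three traversal algorithms can be written in Python as shown below. This is a sketch: the `Node` class and the shape of the six-node tree are assumptions (the text only fixes that B is the left child of A and D is the left child of B), and the functions collect the visited values in a list instead of printing them.

```python
class Node:
    """A binary tree node holding a value in info (illustrative helper)."""
    def __init__(self, info, left=None, right=None):
        self.info = info
        self.left = left
        self.right = right


def preorder(node, out):
    """Node-left-right (NLR)."""
    if node is None:
        return out
    out.append(node.info)
    preorder(node.left, out)
    preorder(node.right, out)
    return out


def inorder(node, out):
    """Left-node-right (LNR)."""
    if node is None:
        return out
    inorder(node.left, out)
    out.append(node.info)
    inorder(node.right, out)
    return out


def postorder(node, out):
    """Left-right-node (LRN)."""
    if node is None:
        return out
    postorder(node.left, out)
    postorder(node.right, out)
    out.append(node.info)
    return out


# Assumed shape: A has children B and C, B has children D and E,
# and F is the left child of C.
root = Node("A", Node("B", Node("D"), Node("E")), Node("C", Node("F")))

print("".join(preorder(root, [])))   # ABDECF
print("".join(inorder(root, [])))    # DBEAFC
print("".join(postorder(root, [])))  # DEBFCA
```

The three functions differ only in where the `out.append` call sits relative to the two recursive calls, which mirrors the observation that the algorithms differ only in when the root is processed.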
4.6 B TREES
A B-tree of order m is an extension of a multiway search tree of order m. This type of tree is
used when the data to be accessed or stored is located on secondary storage devices, because it
allows large amounts of data to be stored in a node. A B-tree of order m is a multiway search
tree in which:
1. The root has at least two subtrees unless it is the only node in the tree.
2. Each node that is neither the root nor a leaf has at most m nonempty children and at least ⌈m/2⌉
nonempty children.
3. The number of keys in each node that is neither the root nor a leaf is one less than the number of
its nonempty children.
4. All leaves are on the same level.
Searching
An algorithm for finding a key in B-tree is simple. Start at the root and determine which pointer to
follow based on a comparison between the search value and key fields in the root node. Follow the
appropriate pointer to a child node. Examine the key fields in the child node and continue to follow
the appropriate pointers until the search value is found or a leaf node is reached that doesn't contain
the desired search value.
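The search procedure can be sketched in Python as follows. The `BTreeNode` class, its field names and the sample tree are assumptions made for illustration; each node keeps its keys sorted, and `children[i]` holds the keys that fall between `keys[i-1]` and `keys[i]`.

```python
from bisect import bisect_left


class BTreeNode:
    """A B-tree node with sorted keys; children is empty for a leaf."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def btree_search(node, target):
    """Return (node, index) where target is stored, or None if absent."""
    while node is not None:
        i = bisect_left(node.keys, target)   # first key >= target
        if i < len(node.keys) and node.keys[i] == target:
            return node, i
        if not node.children:                # reached a leaf without a match
            return None
        node = node.children[i]              # follow the appropriate pointer
    return None


# Hypothetical B-tree of order 3 with all leaves on the same level.
root = BTreeNode([20, 40], [
    BTreeNode([5, 10]),
    BTreeNode([25, 30]),
    BTreeNode([45, 50]),
])

node, index = btree_search(root, 30)
print(node.keys, index)          # [25, 30] 1
print(btree_search(root, 35))    # None
```

Only one node per level is examined, which is what makes the structure suitable for secondary storage: each node visit corresponds to one block read.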
Insertion
The condition that all leaves must be on the same level forces a characteristic behavior of B-trees,
namely that B-trees are not allowed to grow at their leaves; instead they are forced to grow at
the root. When inserting into a B-tree, a value is inserted directly into a leaf, as shown in Fig 4.22
and Fig 4.23. This leads to three common situations that can occur:
1. A key is placed into a leaf that still has room.
2. The leaf in which a key is to be placed is full.
3. The root of the B-tree is full.
Fig 4.22 B-tree Insertion
Fig 4.23 steps in B-tree insertion operation
Deletion
The deletion process is basically a reversal of the insertion process: rather than splitting
nodes, nodes may be merged so that the B-tree properties, namely the requirement that
a node must be at least half full, are maintained, as shown in Fig 4.24, Fig 4.25 and Fig 4.26.
There are two main cases to be considered:
1. Deletion from a leaf
2. Deletion from a non-leaf
Fig 4.24 Deletion
Fig 4.25 Deletion
Fig 4.26 steps in B Tree deletion operation
4.7 HEAP TREE
A heap is a special case of a balanced binary tree data structure where the root-node key is compared
with its children and arranged accordingly. A heap is a complete binary tree in which the nodes are
organized based on their data entry values. There are two variants of the heap structure. A max-heap
has the property, known as the heap order property, that for each non-leaf node V, the value
in V is greater than or equal to the values of its children. The largest value in a max-heap will always
be stored in the root, while the smallest values will be stored in the leaf nodes. The min-heap has the
opposite property: for each non-leaf node V, the value in V is smaller than or equal to the values of
its children, as shown in Fig 4.27 and Fig 4.28.
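The heap order property can be checked with a short Python sketch. It assumes the common array layout of a complete binary tree, in which the children of the node at index i are at indices 2i+1 and 2i+2; the notes do not describe this layout, and the function names and sample values are illustrative. Python's standard `heapq` module builds min-heaps in the same layout.

```python
import heapq


def is_max_heap(a):
    """True if every node is >= each of its children (array layout)."""
    n = len(a)
    return all(a[i] >= a[c]
               for i in range(n)
               for c in (2 * i + 1, 2 * i + 2) if c < n)


def is_min_heap(a):
    """True if every node is <= each of its children (array layout)."""
    n = len(a)
    return all(a[i] <= a[c]
               for i in range(n)
               for c in (2 * i + 1, 2 * i + 2) if c < n)


max_heap = [100, 84, 71, 60, 23, 12, 29]   # hypothetical values
print(is_max_heap(max_heap))   # True
print(is_min_heap(max_heap))   # False

values = [60, 23, 12, 100, 84]
heapq.heapify(values)          # rearranges the list into a min-heap in place
print(is_min_heap(values), values[0])   # True 12
```

In a max-heap the largest value is at index 0 (the root), and in a min-heap the smallest value is at index 0.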
Example
Fig 4.28 Heap tree insertion
Figure 4.29 A Graph
Graph Terminology
1. Vertex: An individual data element of a graph is called a vertex. A vertex is also known as a node.
In the above example graph, A, B, C, D and E are the vertices.
2. Edge : An edge is a connecting link between two vertices. Edge is also known as Arc. An edge is
represented as (starting Vertex, ending Vertex).
2. Directed Graph
A graph with only directed edges is said to be directed graph.
3. Complete Graph
A graph in which every node is adjacent to all other nodes present in the graph is known as a
complete graph. An undirected complete graph contains n(n-1)/2 edges, where n
is the number of vertices present in the graph. The following figure shows a complete graph.
4. Regular Graph
A regular graph is a graph in which every node has the same degree, i.e., each node is adjacent
to the same number of other nodes.
5. Cycle Graph
A graph that contains a cycle is called a cycle graph. In a cycle the first and last nodes are the same:
a closed simple path is a cycle.
6. Acyclic Graph
A graph without a cycle is called an acyclic graph.
7. Weighted Graph
A graph is said to be weighted if a non-negative value is assigned to each edge of the
graph. The value is equal to the length between the two vertices. A weighted graph is also called a
network.
Outgoing Edge
A directed edge is said to be an outgoing edge on its origin vertex.
Incoming Edge
A directed edge is said to be incoming edge on its destination vertex.
Degree
Total number of edges connected to a vertex is said to be degree of that vertex.
Indegree
Total number of incoming edges connected to a vertex is said to be indegree of that vertex.
Outdegree
Total number of outgoing edges connected to a vertex is said to be outdegree of that vertex.
Parallel edges or Multiple edges
Two undirected edges that have the same end vertices, or two directed edges that have
the same origin and the same destination, are called parallel edges or multiple edges.
Self-loop
An edge (undirected or directed) is a self-loop if its two endpoints coincide.
Simple Graph
A graph is said to be simple if there are no parallel and self-loop edges.
Adjacent nodes
When there is an edge from one node to another then these nodes are called adjacent nodes.
Incidence
In an undirected graph the edge between v1 and v2 is incident on node v1 and v2.
Walk
A walk is defined as a finite alternating sequence of vertices and edges, beginning and ending with
vertices, such that each edge is incident with the vertices preceding and following it.
Closed walk
A walk that begins and ends at the same vertex is called a closed walk; otherwise it is an open
walk. If e1, e2, e3 and e4 are the edges joining the pairs of vertices (v1,v2), (v2,v4), (v4,v3) and (v3,v1)
respectively, then v1 e1 v2 e2 v4 e3 v3 e4 v1 is a closed walk or circuit.
Path
An open walk in which no vertex appears more than once is called a path. If e1 and e2 are the two
edges between the pairs of vertices (v1,v3) and (v1,v2) respectively, then v3 e1 v1 e2 v2 is a path.
Length of a path
The number of edges in a path is called the length of that path. In the following, the length of the
path is 3.
Circuit
A closed walk in which no vertex (except the initial and the final vertex) appears more than once is
called a circuit. For example, a triangle is a circuit having three vertices and three edges.
Sub Graph
A graph S is said to be a subgraph of a graph G if all the vertices and all the edges of S are in G,
and each edge of S has the same end vertices in S as in G. In other words, a subgraph of G is
a graph G’ such that V(G’) ⊆ V(G) and E(G’) ⊆ E(G).
Connected Graph
A graph G is said to be connected if there is at least one path between every pair of vertices in G.
Otherwise, G is disconnected. This graph is disconnected because the vertex v1 is not connected
with the other vertices of the graph.
Degree
In an undirected graph, the number of edges connected to a node is called the degree of that node
or the degree of a node is the number of edges incident on it. In the above graph, degree of vertex
v1 is 1, degree of vertex v2 is 3, degree of v3 and v4 is 2 in a connected graph.
Indegree
The indegree of a node is the number of edges connecting to that node or in other words edges
incident to it.
Outdegree
The outdegree of a node (or vertex) is the number of edges going outside from that node.
In this representation, a graph is represented using a matrix of size total number of
vertices by total number of vertices; for example, a graph with 4 vertices is represented using a
matrix of size 4 x 4. In this matrix, rows and columns both represent vertices. The matrix is filled
with either 1 or 0. Here, 1 represents that there is an edge from the row vertex to the column vertex
and 0 represents that there is no edge from the row vertex to the column vertex.
Adjacency Matrix: let G = (V, E) be a graph with n vertices, n ≥ 1. The adjacency matrix of G is a
2-dimensional n x n matrix A, where A(i, j) = 1 iff (vi, vj) ∈ E(G) (<vi, vj> for a digraph), and
A(i, j) = 0 otherwise.
example : for undirected graph
The adjacency matrix for an undirected graph is symmetric; the adjacency matrix for a
digraph need not be symmetric. The space needed to represent a graph using an adjacency matrix
is n^2 bits. To identify the edges in a graph, adjacency matrices will require at least O(n^2) time.
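A minimal Python sketch of the adjacency matrix representation. The function name, the 0-based vertex numbering and the sample edge list are assumptions for illustration.

```python
def adjacency_matrix(n, edges, directed=False):
    """Build an n x n 0/1 matrix from a list of (i, j) vertex pairs."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i][j] = 1              # edge from row vertex i to column vertex j
        if not directed:
            a[j][i] = 1          # an undirected edge is stored in both directions
    return a


edges = [(0, 1), (0, 2), (1, 2), (2, 3)]   # hypothetical 4-vertex graph
a = adjacency_matrix(4, edges)
for row in a:
    print(row)
# [0, 1, 1, 0]
# [1, 0, 1, 0]
# [1, 1, 0, 1]
# [0, 0, 1, 0]

# The matrix of an undirected graph is symmetric.
print(all(a[i][j] == a[j][i] for i in range(4) for j in range(4)))  # True
# The degree of a vertex is the sum of its row.
print([sum(row) for row in a])  # [2, 2, 3, 1]
```

Listing all edges requires looking at every entry, which is where the O(n^2) bound comes from regardless of how few edges the graph has.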
Adjacency List
In this representation, every vertex of the graph contains a list of its adjacent vertices. The n rows of
the adjacency matrix are represented as n chains. The nodes in chain i represent the vertices that are
adjacent to vertex i.
It can be represented in two forms. In one form, an array is used to store the n vertices and a chain is
used to store the adjacencies of each vertex, so that we can access the adjacency list for any vertex in
O(1) time. Adjlist[i] is a pointer to the first node in the adjacency list for vertex i. Example: consider
the following directed graph representation implemented using a linked list.
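A minimal Python sketch of the array-of-chains form. Python lists stand in for the linked chains here, which is an assumption of this sketch; the function name and the sample directed graph are also illustrative.

```python
def adjacency_list(n, edges, directed=True):
    """Return adj where adj[i] lists the vertices adjacent to vertex i."""
    adj = [[] for _ in range(n)]     # one chain per vertex, indexed in O(1)
    for u, v in edges:
        adj[u].append(v)
        if not directed:
            adj[v].append(u)
    return adj


edges = [(0, 1), (0, 2), (1, 2), (2, 0), (2, 3)]   # hypothetical digraph
adj = adjacency_list(4, edges)
print(adj)                       # [[1, 2], [2], [0, 3], []]

# Outdegree is the length of a vertex's own chain.
print([len(chain) for chain in adj])                         # [2, 1, 2, 0]
# Indegree counts how often a vertex appears in all chains.
print([sum(chain.count(v) for chain in adj) for v in range(4)])  # [1, 1, 2, 1]
```

The total length of all chains equals the number of edges for a directed graph (twice that for an undirected graph), so sparse graphs use far less space than with an n x n matrix.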
Adjacency Multilists
In the adjacency-list representation of an undirected graph each edge (u, v) is represented by two
entries one on the list for u and the other on that list for v. This can be accomplished easily if the
adjacency lists are actually maintained as multilists (i.e., lists in which nodes may be shared among
several lists). For each edge there will be exactly one node but this node will be in two lists (i.e. the
adjacency lists for each of the two nodes to which it is incident).
Weighted edges
In many applications the edges of a graph have weights assigned to them. These weights may
represent the distance from one vertex to another or the cost of going from one vertex to an
adjacent vertex. In these applications the adjacency matrix entries A[i][j] would keep this
information too. When adjacency lists are used, the weight information may be kept in the list
nodes by including an additional field, weight. A graph with weighted edges is called a network.
Fig 28 BFS Traversal graph
5) GPS Navigation systems: Breadth First Search is used to find all neighboring locations.
6) Broadcasting in Network: In networks, a broadcasted packet follows Breadth First Search to
reach all nodes.
7) In Garbage Collection: Breadth First Search is used in copying garbage collection using
Cheney’s algorithm.
8) Cycle detection in undirected graph: In undirected graphs, either Breadth First Search or Depth
First Search can be used to detect cycle. In directed graph, only depth first search can be used.
9) Ford–Fulkerson algorithm: In the Ford-Fulkerson algorithm, we can use either Breadth First or
Depth First Traversal to find the maximum flow. Breadth First Traversal is preferred as it reduces
the worst case time complexity to O(VE^2).
10) To test if a graph is bipartite: We can use either Breadth First or Depth First Traversal.
11) Path finding: We can use either Breadth First or Depth First Traversal to find if there is a path
between two vertices.
12) Finding all nodes within one connected component: We can either use Breadth First or Depth
First Traversal to find all nodes reachable from a given node.
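Applications 11 and 12 can be illustrated with a short breadth first search in Python. This is a sketch under assumptions: the graph is a dictionary mapping each vertex to a list of its neighbours, and the function name and sample graph are hypothetical.

```python
from collections import deque


def bfs_order(adj, start):
    """Return the vertices reachable from start in breadth first order."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        u = queue.popleft()          # take the oldest discovered vertex
        order.append(u)
        for v in adj[u]:
            if v not in visited:     # mark on discovery so each vertex
                visited.add(v)       # enters the queue at most once
                queue.append(v)
    return order


# Hypothetical undirected graph with two connected components.
adj = {
    "A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
    "D": ["B", "C", "E"], "E": ["D"],
    "X": ["Y"], "Y": ["X"],
}

component = bfs_order(adj, "A")
print(component)            # ['A', 'B', 'C', 'D', 'E']
print("E" in component)     # True: there is a path from A to E
print("X" in component)     # False: X is in another component
```

The returned list is exactly the connected component containing the start vertex, so a path to another vertex exists if and only if that vertex appears in the list.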
Applications of Depth First Search
1) For an unweighted connected graph, DFS traversal of the graph produces a spanning tree. Since
every spanning tree of such a graph has the same number of edges, it is also a minimum spanning tree.
2) Detecting a cycle in a graph: A graph has a cycle if and only if we see a back edge during DFS. So
we can run DFS on the graph and check for back edges.
3) Path finding: We can specialize the DFS algorithm to find a path between two given vertices u
and z.
i) Call DFS(G, u) with u as the start vertex.
ii) Use a stack S to keep track of the path between the start vertex and the current vertex.
iii) As soon as destination vertex z is encountered, return the path as the contents of the
stack
4) Topological Sorting
5) To test if a graph is bipartite: We can augment either BFS or DFS so that when we first discover a
new vertex, we color it opposite to its parent, and for each other edge, we check that it doesn’t link
two vertices of the same color. The first vertex in any connected component can be red or black.
6) Finding strongly connected components of a graph: A directed graph is called strongly
connected if there is a path from each vertex in the graph to every other vertex.
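The path-finding steps i) to iii) of application 3 can be sketched in Python as follows. The graph format (a dictionary of neighbour lists), the function name and the sample vertices are assumptions for illustration.

```python
def dfs_path(adj, u, z):
    """Return a path from u to z as a list of vertices, or None."""
    visited = set()
    stack = []                    # path from the start vertex to the current vertex

    def visit(v):
        visited.add(v)
        stack.append(v)
        if v == z:                # destination reached: the stack is the path
            return True
        for w in adj[v]:
            if w not in visited and visit(w):
                return True
        stack.pop()               # dead end: backtrack
        return False

    return stack if visit(u) else None


# Hypothetical undirected graph; vertex q is isolated.
adj = {
    "u": ["a", "b"], "a": ["u", "c"], "b": ["u", "z"],
    "c": ["a"], "z": ["b"], "q": [],
}

print(dfs_path(adj, "u", "z"))   # ['u', 'b', 'z']
print(dfs_path(adj, "u", "q"))   # None
```

The path returned by DFS is a valid path but not necessarily the shortest one; on an unweighted graph BFS would be the tool for shortest paths. The recursion also assumes the graph is small enough not to exceed Python's recursion limit.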
Given a graph where edges are labeled with weights (or distances) and a source vertex, what is the
shortest path between the source and some other vertex? Problems requiring us to answer such
queries are broadly known as shortest-paths problems. Shortest-paths problems come in several
flavors. For example, the single-source shortest path problem requires finding the shortest paths
between a given source and all other vertices; the single-pair shortest path problem requires finding
the shortest path between a given source and a given destination vertex; the all-pairs shortest path
problem requires finding the shortest paths between all pairs of vertices.
Dijkstra’s algorithm is an iterative algorithm that provides us with the shortest path from one
particular starting node to all other nodes in the graph. To keep track of the total cost from the start
node to each destination we will make use of the distance instance variable in the vertex class. The
distance instance variable will contain the current total weight of the smallest weight path from the
start to the vertex. Dijkstra’s algorithm finds the shortest paths from a single source in a weighted
graph containing only non-negative edge weights. It uses a priority queue (or a priority-based
dictionary) to select the node / vertex nearest to the source whose edges have not yet been relaxed.
The time complexity of Dijkstra’s algorithm is O((E+V) log V) for an adjacency list implementation
of a graph, where V is the number of vertices and E is the number of edges in the graph.
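A minimal Python sketch of Dijkstra's algorithm with a binary heap as the priority queue. Instead of a vertex class with a distance instance variable, this sketch keeps distances in a dictionary; the graph format (vertex mapped to a list of `(neighbour, weight)` pairs), the function name and the sample graph are assumptions.

```python
import heapq


def dijkstra(graph, source):
    """Return the shortest distance from source to every vertex."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    heap = [(0, source)]                 # (distance so far, vertex)
    while heap:
        d, u = heapq.heappop(heap)       # nearest vertex not yet finalized
        if d > dist[u]:
            continue                     # stale entry: u was already settled
        for v, w in graph[u]:            # relax every edge leaving u
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Hypothetical directed graph with non-negative weights.
graph = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 1)],
    "C": [("B", 2), ("D", 5)],
    "D": [],
}

print(dijkstra(graph, "A"))   # {'A': 0, 'B': 3, 'C': 1, 'D': 4}
```

Pushing a new heap entry instead of decreasing a key in place is a common simplification with `heapq`; stale entries are skipped when popped, and the running time stays within O((E+V) log V).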
Repeating the above steps until the set contains all vertices of the given graph gives the
shortest paths.
4.11.2 PRIM’S ALGORITHM
Prim’s algorithm is also a greedy algorithm. It starts with an empty spanning tree. The
idea is to maintain two sets of vertices: the vertices already included in the MST and the vertices
not yet included.
At every step, it considers all the edges that connect the two sets and picks the minimum weight
edge among them. After picking the edge, it moves the other endpoint of the edge to the set
containing the MST.
1. Create MST set that keeps track of vertices already included in MST.
2. Assign key values to all vertices in the input graph. Initialize all key values as INFINITE (∞).
Assign key values like 0 for the first vertex so that it is picked first.
3. While MST set doesn't include all vertices:
a. Pick a vertex u which is not in MST set and has the minimum key value. Include u in
MST set.
b. Update the key values of all adjacent vertices of u. To update, iterate through all
adjacent vertices. For every adjacent vertex v, if the weight of edge u-v is less than the
previous key value of v, update the key value to the weight of u-v.
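The numbered steps can be sketched in Python as follows. The graph format (a dictionary of dictionaries holding edge weights), the function name, the `parent` bookkeeping used to report the chosen edges, and the sample graph are assumptions; the sketch also assumes the graph is connected and undirected.

```python
def prim(graph, start):
    """Return (list of MST edges, total MST weight) for a connected graph."""
    key = {v: float("inf") for v in graph}   # step 2: all keys are infinite
    parent = {v: None for v in graph}
    key[start] = 0                           # so that start is picked first
    in_mst = set()                           # step 1: the MST set

    while len(in_mst) < len(graph):          # step 3
        # 3a: pick the vertex outside the MST set with the minimum key.
        u = min((v for v in graph if v not in in_mst), key=lambda v: key[v])
        in_mst.add(u)
        # 3b: update the keys of the vertices adjacent to u.
        for v, w in graph[u].items():
            if v not in in_mst and w < key[v]:
                key[v] = w
                parent[v] = u

    edges = [(parent[v], v, key[v]) for v in graph if parent[v] is not None]
    return edges, sum(key.values())


# Hypothetical undirected weighted graph.
graph = {
    "A": {"B": 2, "C": 3},
    "B": {"A": 2, "C": 1, "D": 4},
    "C": {"A": 3, "B": 1, "D": 5},
    "D": {"B": 4, "C": 5},
}

edges, total = prim(graph, "A")
print(edges)   # [('A', 'B', 2), ('B', 'C', 1), ('B', 'D', 4)]
print(total)   # 7
```

Selecting the minimum by a linear scan gives O(V^2) time, which follows the steps literally; replacing the scan with a priority queue is the usual refinement for sparse graphs.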
Example: [Figures for Step 1 to Step 5]