DS Notes - Tree
A node is a structure which may contain a value, a condition, or represent a separate data structure
(which could be a tree of its own). Each node in a tree has zero or more child nodes, which are
below it in the tree (by convention, trees grow down, not up as they do in nature). A node that has a
child is called the child's parent node (or ancestor node, or superior). A node has at most one
parent.
Nodes that do not have any children are called leaf nodes. They are also referred to as terminal
nodes.
The height of a node is the length of the longest downward path from that node to a leaf. The
height of the root is the height of the tree. The depth of a node is the length of the path from the
node to the root (i.e., its root path). Node heights are commonly needed when manipulating the
various self-balancing trees, AVL trees in particular. Conventionally, a height of -1 corresponds to
a subtree with no nodes, whereas zero corresponds to a subtree with one node.
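As a minimal sketch of these definitions, the following Java code computes heights and depths using the -1 convention for an empty subtree. The names HNode, HeightDemo, height and depth are hypothetical and chosen only for this illustration.

```java
// Hypothetical node type used only for this illustration.
class HNode {
    int value;
    HNode left, right;
    HNode(int value, HNode left, HNode right) {
        this.value = value;
        this.left = left;
        this.right = right;
    }
}

class HeightDemo {
    // Height in edges: -1 for an empty subtree, 0 for a single node.
    static int height(HNode n) {
        if (n == null) return -1;
        return 1 + Math.max(height(n.left), height(n.right));
    }

    // Depth (in edges from the root) of the node holding target; -1 if absent.
    static int depth(HNode root, int target) {
        if (root == null) return -1;
        if (root.value == target) return 0;
        int d = depth(root.left, target);
        if (d < 0) d = depth(root.right, target);
        return d < 0 ? -1 : d + 1;
    }

    public static void main(String[] args) {
        HNode leaf = new HNode(4, null, null);
        HNode root = new HNode(1, new HNode(2, leaf, null), new HNode(3, null, null));
        System.out.println(height(null));    // -1
        System.out.println(height(root));    // 2
        System.out.println(depth(root, 3));  // 1
    }
}
```

Returning -1 for the empty subtree lets the recursive case be a single expression, which is one reason the convention is popular in AVL tree code.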
The topmost node in a tree is called the root node. Being the topmost node, the root node will not
have parents. It is the node at which operations on the tree commonly begin (although some
algorithms begin with the leaf nodes and work up ending at the root). All other nodes can be
reached from it by following edges or links. (In the formal definition, each such path is also
unique). In diagrams, it is typically drawn at the top. In some trees, such as heaps, the root node has
special properties. Every node in a tree can be seen as the root node of the subtree rooted at that
node.
An internal node or inner node is any node of a tree that has child nodes and is thus not a leaf
node.
A subtree of a tree T is a tree consisting of a node in T and all of its descendants in T. (This is
different from the formal definition of subtree used in graph theory.[1]) The subtree corresponding
to the root node is the entire tree; the subtree corresponding to any other node is called a proper
subtree (in analogy to the term proper subset).
The binary tree is a fundamental data structure used in computer science. When its nodes are kept
in key order (a binary search tree), it is a useful data structure for rapidly storing sorted data and
rapidly retrieving stored data. A binary tree is composed of nodes, each of which stores data and
also links to up to two child nodes, which can be visualized spatially as below the first node, with
one placed to the left and one placed to the right. It is the relationship between a parent node and
the children it links to which makes the binary search tree such an efficient data structure. The
child on the left has a lesser key value (i.e., the value used to search for a node in the tree), and
the child on the right has an equal or greater key value. As a result, the nodes on the farthest left
of the tree have the lowest values, whereas the nodes on the farthest right of the tree have the
greatest values. More importantly, each child is itself the root of a new, smaller binary tree. Due
to this recursive nature, it is possible to easily access and insert data in a binary tree using search
and insert functions recursively called on successive nodes.
Introduction
· The depth of a node is the number of edges from the root to the node.
· The height of a node is the number of edges from the node to the deepest leaf.
· The height of a tree is the height of the root.
· A full binary tree is a binary tree in which each node has exactly zero or two children.
· A complete binary tree is a binary tree, which is completely filled, with the possible
exception of the bottom level, which is filled from left to right.
A complete binary tree is a very special tree: it provides the best possible ratio between the
number of nodes and the height. The height h of a complete binary tree with n nodes is O(log n).
We can easily prove this by counting nodes on each level, starting with the root, assuming that
each level has the maximum number of nodes:
n = 1 + 2 + 4 + ... + 2^(h-1) + 2^h = 2^(h+1) - 1
Solving this for h, we obtain
h = O(log n)
Advantages of trees
Trees are so useful and frequently used, because they have some very serious advantages:
3.3 TRAVERSALS
A traversal is a process that visits all the nodes in the tree. Since a tree is a nonlinear data
structure, there is no unique traversal. We will consider several traversal algorithms, which we
group into the following two kinds:
· depth-first traversal
· breadth-first traversal
There are three different types of depth-first traversal:
· PreOrder traversal - visit the parent first and then the left and right children;
· InOrder traversal - visit the left child, then the parent and then the right child;
· PostOrder traversal - visit the left child, then the right child and then the parent.
There is only one kind of breadth-first traversal--the level order traversal. This traversal visits
nodes by levels from top to bottom and from left to right.
PreOrder - 8, 5, 9, 7, 1, 12, 2, 4, 11, 3
InOrder - 9, 5, 1, 7, 2, 12, 8, 4, 3, 11
PostOrder - 9, 1, 2, 12, 7, 5, 3, 11, 4, 8
LevelOrder - 8, 5, 4, 9, 7, 11, 1, 12, 3, 2
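The three depth-first traversals can be sketched recursively as follows. The tree built in sampleTree() is an assumption: its shape is inferred from the PreOrder and InOrder sequences listed above (together they determine a binary tree uniquely), and the names TNode and DepthFirstDemo are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node type for this illustration.
class TNode {
    int data;
    TNode left, right;
    TNode(int data, TNode left, TNode right) {
        this.data = data;
        this.left = left;
        this.right = right;
    }
}

class DepthFirstDemo {
    static void preOrder(TNode n, List<Integer> out) {
        if (n == null) return;
        out.add(n.data);            // parent first
        preOrder(n.left, out);
        preOrder(n.right, out);
    }

    static void inOrder(TNode n, List<Integer> out) {
        if (n == null) return;
        inOrder(n.left, out);
        out.add(n.data);            // parent between the two subtrees
        inOrder(n.right, out);
    }

    static void postOrder(TNode n, List<Integer> out) {
        if (n == null) return;
        postOrder(n.left, out);
        postOrder(n.right, out);
        out.add(n.data);            // parent last
    }

    static TNode leaf(int v) { return new TNode(v, null, null); }

    // Assumed shape, reconstructed from the listed traversals.
    static TNode sampleTree() {
        TNode seven = new TNode(7, leaf(1), new TNode(12, leaf(2), null));
        TNode five = new TNode(5, leaf(9), seven);
        TNode four = new TNode(4, null, new TNode(11, leaf(3), null));
        return new TNode(8, five, four);
    }

    public static void main(String[] args) {
        TNode root = sampleTree();
        List<Integer> pre = new ArrayList<>();
        List<Integer> in = new ArrayList<>();
        List<Integer> post = new ArrayList<>();
        preOrder(root, pre);
        inOrder(root, in);
        postOrder(root, post);
        System.out.println("PreOrder  - " + pre);
        System.out.println("InOrder   - " + in);
        System.out.println("PostOrder - " + post);
    }
}
```

The three methods differ only in where the out.add call sits relative to the two recursive calls.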
In the next picture we demonstrate the order of node visitation. Number 1 denotes the first node
in a particular traversal and 7 denotes the last node.
These common traversals can be represented as a single algorithm by assuming that we visit each
node three times. An Euler tour is a walk around the binary tree where each edge is treated as a
wall, which you cannot cross. In this walk each node is visited three times: on the left, from
below, and on the right. The Euler tour in which we visit nodes on the left produces a preorder
traversal. When we visit nodes from below, we get an inorder traversal. And when we visit
nodes on the right, we get a postorder traversal.
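A minimal sketch of the Euler tour idea: one recursive walk touches every node three times and records it in a different output at each touch. The names ENode and EulerTourDemo and the five-node tree in main are hypothetical, used only for illustration.

```java
// Hypothetical node type for this illustration.
class ENode {
    char label;
    ENode left, right;
    ENode(char label, ENode left, ENode right) {
        this.label = label;
        this.left = left;
        this.right = right;
    }
}

class EulerTourDemo {
    // A single walk; the three appends correspond to the three visits.
    static void eulerTour(ENode n, StringBuilder pre, StringBuilder in, StringBuilder post) {
        if (n == null) return;
        pre.append(n.label);                 // visit on the left
        eulerTour(n.left, pre, in, post);
        in.append(n.label);                  // visit from below
        eulerTour(n.right, pre, in, post);
        post.append(n.label);                // visit on the right
    }

    public static void main(String[] args) {
        ENode b = new ENode('B', new ENode('D', null, null), new ENode('E', null, null));
        ENode root = new ENode('A', b, new ENode('C', null, null));
        StringBuilder pre = new StringBuilder();
        StringBuilder in = new StringBuilder();
        StringBuilder post = new StringBuilder();
        eulerTour(root, pre, in, post);
        System.out.println(pre);   // ABDEC
        System.out.println(in);    // DBEAC
        System.out.println(post);  // DEBCA
    }
}
```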
3.4 BINARY SEARCH TREES
We consider a particular kind of binary tree called a Binary Search Tree (BST). The basic idea
behind this data structure is to provide a repository that supports efficient sorting, searching and
retrieval of data.
We implement a binary search tree using a private inner class BSTNode. In order to support
the binary search tree property, we require that data stored in each node is Comparable:
Insertion
The insertion procedure is quite similar to searching. We start at the root and recursively go
down the tree searching for a location in a BST to insert a new node. If the element to be inserted
is already in the tree, we are done (we do not insert duplicates). The new node will always
replace a NULL reference.
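The insertion procedure can be sketched as below. This is not the original listing from these notes but an illustration under stated assumptions: the class name BSTInsertDemo, the size counter and the inOrder helper are hypothetical, while the private inner class BSTNode and the Comparable bound follow the description above.

```java
class BSTInsertDemo<T extends Comparable<T>> {
    // Private inner node class, as described in the text.
    private class BSTNode {
        T data;
        BSTNode left, right;
        BSTNode(T data) { this.data = data; }
    }

    private BSTNode root;
    private int size;

    public void insert(T item) {
        root = insert(root, item);
    }

    // Walks down as in a search; the new node takes the place of a null reference.
    private BSTNode insert(BSTNode node, T item) {
        if (node == null) {
            size++;
            return new BSTNode(item);
        }
        int cmp = item.compareTo(node.data);
        if (cmp < 0) node.left = insert(node.left, item);
        else if (cmp > 0) node.right = insert(node.right, item);
        // cmp == 0: the element is already in the tree, so nothing is inserted
        return node;
    }

    public int size() { return size; }

    // Keys in sorted order, separated by spaces (helper for checking the result).
    public String inOrder() {
        StringBuilder sb = new StringBuilder();
        inOrder(root, sb);
        return sb.toString().trim();
    }

    private void inOrder(BSTNode n, StringBuilder sb) {
        if (n == null) return;
        inOrder(n.left, sb);
        sb.append(n.data).append(' ');
        inOrder(n.right, sb);
    }

    public static void main(String[] args) {
        BSTInsertDemo<Integer> tree = new BSTInsertDemo<>();
        for (int x : new int[] {8, 3, 10, 1, 6, 3}) tree.insert(x);
        System.out.println(tree.size());     // 5, the second 3 is ignored
        System.out.println(tree.inOrder());  // 1 3 6 8 10
    }
}
```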
Draw a binary search tree by inserting the above numbers from left to right.
Searching
Searching in a BST always starts at the root. We compare the data stored at the root with the key
we are searching for (let us call it toSearch). If the node does not contain the key, we proceed
either to the left or to the right child, depending upon the comparison. If the result of the
comparison is negative (toSearch is smaller than the node's data), we go to the left child;
otherwise, to the right child. The recursive structure of a BST yields a recursive algorithm.
Searching in a BST has O(h) worst-case runtime complexity, where h is the height of the tree.
Since a binary search tree with n nodes has at least floor(log2 n) + 1 levels, the worst case takes
on the order of log n comparisons even in the best-shaped tree. Unfortunately, a binary search
tree can degenerate into a linked list, raising the search time to O(n).
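To make the O(h) bound concrete, here is a small sketch of recursive search, together with a comparison of the heights obtained from two insertion orders of the same keys. The names SNode, BSTSearchDemo and build are hypothetical, and int keys are used for brevity.

```java
// Hypothetical node type for this illustration.
class SNode {
    int key;
    SNode left, right;
    SNode(int key) { this.key = key; }
}

class BSTSearchDemo {
    static SNode insert(SNode node, int key) {
        if (node == null) return new SNode(key);
        if (key < node.key) node.left = insert(node.left, key);
        else if (key > node.key) node.right = insert(node.right, key);
        return node;
    }

    // Returns the node holding toSearch, or null if it is not in the tree.
    static SNode search(SNode node, int toSearch) {
        if (node == null) return null;
        if (toSearch == node.key) return node;
        return toSearch < node.key ? search(node.left, toSearch)
                                   : search(node.right, toSearch);
    }

    static int height(SNode n) {
        return n == null ? -1 : 1 + Math.max(height(n.left), height(n.right));
    }

    static SNode build(int... keys) {
        SNode root = null;
        for (int k : keys) root = insert(root, k);
        return root;
    }

    public static void main(String[] args) {
        SNode balanced = build(4, 2, 6, 1, 3, 5, 7);
        SNode chain = build(1, 2, 3, 4, 5, 6, 7);      // sorted input degenerates
        System.out.println(height(balanced));          // 2
        System.out.println(height(chain));             // 6
        System.out.println(search(chain, 7) != null);  // true
        System.out.println(search(chain, 8) != null);  // false
    }
}
```

Inserting keys in sorted order produces the linked-list shape mentioned above, so finding 7 in the second tree walks through all seven nodes.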
Deletion
Deletion is somewhat trickier than insertion. There are several cases to consider. A node to be
deleted (let us call it toDelete)
· is not in the tree;
· is a leaf;
· has only one child;
· has two children.
If toDelete is not in the tree, there is nothing to delete. If toDelete is a leaf, we simply detach it
from its parent. If toDelete has only one child, the procedure of deletion is identical to deleting a
node from a linked list: we just bypass the node being deleted.
Deletion of an internal node with two children is less straightforward. If we delete such a node,
we split the tree into two subtrees, and therefore some children of the internal node won't be
accessible after deletion. In the picture below we delete 8:
The deletion strategy is the following: replace the node being deleted with the largest node in the
left subtree and then delete that largest node. By symmetry, the node being deleted can be
swapped with the smallest node in the right subtree.
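The cases above can be sketched as one recursive method. This is an illustration rather than the notes' own code: DNode and BSTDeleteDemo are hypothetical names, keys are plain ints, and the sample keys in main are made up. The two-children case uses the largest node of the left subtree, as described.

```java
// Hypothetical node type for this illustration.
class DNode {
    int key;
    DNode left, right;
    DNode(int key) { this.key = key; }
}

class BSTDeleteDemo {
    static DNode insert(DNode node, int key) {
        if (node == null) return new DNode(key);
        if (key < node.key) node.left = insert(node.left, key);
        else if (key > node.key) node.right = insert(node.right, key);
        return node;
    }

    // Returns the root of the subtree after removing toDelete from it.
    static DNode delete(DNode node, int toDelete) {
        if (node == null) return null;                  // not in the tree
        if (toDelete < node.key) {
            node.left = delete(node.left, toDelete);
        } else if (toDelete > node.key) {
            node.right = delete(node.right, toDelete);
        } else if (node.left == null) {
            return node.right;                          // leaf or right child only: bypass
        } else if (node.right == null) {
            return node.left;                           // left child only: bypass
        } else {
            DNode largest = node.left;                  // largest node in the left subtree
            while (largest.right != null) largest = largest.right;
            node.key = largest.key;                     // copy it into this node
            node.left = delete(node.left, largest.key); // then delete the copied node
        }
        return node;
    }

    static String inOrder(DNode n) {
        return n == null ? "" : inOrder(n.left) + n.key + " " + inOrder(n.right);
    }

    public static void main(String[] args) {
        DNode root = null;
        for (int k : new int[] {8, 3, 10, 1, 6, 14, 4, 7}) root = insert(root, k);
        root = delete(root, 8);
        System.out.println(root.key);               // 7
        System.out.println(inOrder(root).trim());   // 1 3 4 6 7 10 14
    }
}
```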
where the PreOrderIterator class is implemented as a private inner class of the BST class.
The main difficulty is with the next() method, which requires the implicit recursion stack to be
implemented explicitly. We will be using Java's Stack. The algorithm starts with the root and
pushes it on a stack. When a user calls the next() method, we check if the top element has a
left child. If it has a left child, we push that child on the stack and return the parent node. If there
is no left child, we check for a right child. If it has a right child, we push that child on the stack
and return the parent node. If there is no right child, we move back up the tree (by popping
elements from the stack) until we find a node with a right child. Here is the next() implementation
The following example shows the output and the state of the stack during each call to next().
Note that the algorithm works on any binary tree, not necessarily a binary search tree.
Output: 1 2 4 6 5 7 8 3
[Figure: state of the stack after each call to next()]
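The next() listing itself is not reproduced in these notes. As a sketch of the algorithm described above, under the assumption of a standalone generic node type, an iterator could look like this. PNode is a hypothetical name, and here the iterator is a top-level class rather than an inner class of BST so that the example is self-contained.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Stack;

// Hypothetical node type for this illustration.
class PNode<T> {
    T data;
    PNode<T> left, right;
    PNode(T data, PNode<T> left, PNode<T> right) {
        this.data = data;
        this.left = left;
        this.right = right;
    }
}

class PreOrderIterator<T> implements Iterator<T> {
    private final Stack<PNode<T>> stk = new Stack<>();

    PreOrderIterator(PNode<T> root) {
        if (root != null) stk.push(root);
    }

    public boolean hasNext() {
        return !stk.isEmpty();
    }

    public T next() {
        if (stk.isEmpty()) throw new NoSuchElementException();
        PNode<T> cur = stk.peek();            // the node returned by this call
        if (cur.left != null) {
            stk.push(cur.left);               // its left child comes next
        } else {
            // No left child: pop until some node on the path has a right child.
            PNode<T> tmp = stk.pop();
            while (tmp.right == null) {
                if (stk.isEmpty()) return cur.data;   // traversal is finished
                tmp = stk.pop();
            }
            stk.push(tmp.right);
        }
        return cur.data;
    }
}
```

A typical use is a loop of the form while (it.hasNext()) process(it.next()), where process stands for whatever the caller does with each element.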
A non-recursive preorder traversal can be elegantly implemented in just three lines of code. If
you understand next()'s implementation above, it should be no problem to grasp this one:
return cur.data;
}
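Only the tail of that listing survives above. One way to write a three-line next() consistent with the description is sketched below; the names ItNode and StackPreOrderIterator are hypothetical, and the empty-stack check is an addition for safety.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Stack;

// Hypothetical node type for this illustration.
class ItNode<T> {
    T data;
    ItNode<T> left, right;
    ItNode(T data, ItNode<T> left, ItNode<T> right) {
        this.data = data;
        this.left = left;
        this.right = right;
    }
}

class StackPreOrderIterator<T> implements Iterator<T> {
    private final Stack<ItNode<T>> stk = new Stack<>();

    StackPreOrderIterator(ItNode<T> root) {
        if (root != null) stk.push(root);
    }

    public boolean hasNext() {
        return !stk.isEmpty();
    }

    public T next() {
        if (stk.isEmpty()) throw new NoSuchElementException();
        ItNode<T> cur = stk.pop();
        if (cur.right != null) stk.push(cur.right);  // pushed first, so popped last
        if (cur.left != null) stk.push(cur.left);    // popped next: left before right
        return cur.data;
    }
}
```

Pushing the right child before the left one is what makes the left subtree come off the stack first.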
Level order traversal processes the nodes level by level. It first processes the root, then its
children, then its grandchildren, and so on. Unlike the other traversal methods, it has no natural
recursive version.
The traversal algorithm is similar to the non-recursive preorder traversal algorithm. The only
difference is that the stack is replaced with a FIFO queue.
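A minimal sketch of the queue-based level order traversal follows. The names LNode and LevelOrderDemo are hypothetical, and the sample tree is the shape inferred from the PreOrder and InOrder sequences listed in the traversal section (8, 5, 9, ... and 9, 5, 1, ...), which is an assumption since the figure is not available here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical node type for this illustration.
class LNode {
    int data;
    LNode left, right;
    LNode(int data, LNode left, LNode right) {
        this.data = data;
        this.left = left;
        this.right = right;
    }
}

class LevelOrderDemo {
    static List<Integer> levelOrder(LNode root) {
        List<Integer> out = new ArrayList<>();
        Queue<LNode> queue = new ArrayDeque<>();
        if (root != null) queue.add(root);
        while (!queue.isEmpty()) {
            LNode n = queue.remove();                    // oldest node first (FIFO)
            out.add(n.data);
            if (n.left != null) queue.add(n.left);       // children wait behind the
            if (n.right != null) queue.add(n.right);     // rest of the current level
        }
        return out;
    }

    static LNode leaf(int v) { return new LNode(v, null, null); }

    // Assumed shape, reconstructed from the traversals listed earlier.
    static LNode sampleTree() {
        LNode seven = new LNode(7, leaf(1), new LNode(12, leaf(2), null));
        LNode five = new LNode(5, leaf(9), seven);
        LNode four = new LNode(4, null, new LNode(11, leaf(3), null));
        return new LNode(8, five, four);
    }

    public static void main(String[] args) {
        System.out.println(levelOrder(sampleTree()));
    }
}
```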
Arrays can be used to represent complete binary trees. Remember that in a complete binary tree,
all of the depths are full, except perhaps for the deepest. At the deepest depth, the nodes are as
far left as possible. For example, below is a complete binary tree with 9 nodes; each node
contains a character. In this example, the first 7 nodes completely fill the levels at depth 0 (the
root), depth 1 (the root's children), and depth 2. There are 2 nodes at depth 3, and these are as far
left as possible.
The 9 characters that the tree contains can be stored in an array of characters, starting with the
root's character in the [0] location, the 2 nodes with depth 1 are placed after the root, and so on.
The entire representation of the tree by an array is shown in the figure below.
1. The data from the root always appears in the [0] component of the array.
2. Suppose that the data for a nonroot appears in component [i] of the array. Then the data
for its parent is always at location [(i-1)/2] (using integer division).
3. Suppose that the data for a node appear in component [i] of the array. Then its children (if
they exist) always have their data at these locations:
o Left child at component [2i+1];
o Right child at component [2i+2].
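The three rules translate directly into index arithmetic. In the sketch below the class name ArrayTreeDemo and the characters stored in the array are hypothetical (the figure with the actual characters is not available here); the formulas are the ones stated above.

```java
class ArrayTreeDemo {
    static int parent(int i) { return (i - 1) / 2; }  // meaningful for i > 0 only
    static int left(int i)   { return 2 * i + 1; }
    static int right(int i)  { return 2 * i + 2; }

    // A child exists only if its index falls inside the used part of the array.
    static boolean hasLeft(int i, int size)  { return left(i) < size; }
    static boolean hasRight(int i, int size) { return right(i) < size; }

    public static void main(String[] args) {
        // A complete binary tree with 9 nodes; made-up contents.
        char[] tree = {'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I'};
        int i = 3;                                    // the node holding 'D'
        System.out.println(tree[parent(i)]);          // B
        System.out.println(tree[left(i)]);            // H
        System.out.println(tree[right(i)]);           // I
        System.out.println(hasLeft(4, tree.length));  // false: 'E' is a leaf
    }
}
```

No child references are stored at all; the position in the array encodes the structure, which is why this representation only suits complete binary trees.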
A binary tree can be represented by its individual nodes. Each node will contain references to its
left child and right child. The node also has at least one instance variable to hold some data. An
entire tree is represented as a reference to the root node.
class BTNode
{
public char data;
public BTNode left;
public BTNode right;
}
Given the above BTNode definition, we'll be able to represent a binary tree of characters. The
example below illustrates such a representation.
Pre-order Traversal
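void Preorder (BTNode root)
{
    // Not all nodes have one or both children.
    // Easiest to deal with this once
    // Also covers the case of an empty tree
    if (root == null)
        return;

    // Visit the node first, then its left and right subtrees
    System.out.print(root.data + " ");
    Preorder(root.left);
    Preorder(root.right);
}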
In-order Traversal
Post-order Traversal
For a more general purpose, we can redefine the class BTNode, such that each node could hold
data that is a Java Object.
class BTNode
{
private Object data;
private BTNode left;
private BTNode right;
...
}
This way, we will be able to use BTNode to organize many different types of data into tree
structures (similar to the way we use Node to organize data into linked lists in our previous
assignments). Here is a fairly comprehensive definition of a BTNode class in BTNode.java.
For many tasks, we need to arrange things in an order proceeding from smaller to larger. We can
take advantage of this order when storing the elements in the nodes of a binary tree, both to
maintain the desired order and to find elements easily. One tree of this kind is called a binary
search tree.
A binary search tree has the following 2 characteristics for every node n in the tree:
1. Every element in n's left subtree is less than or equal to the element in node n.
2. Every element in n's right subtree is greater than the element in node n.
For example, suppose we want to store the numbers {3, 9, 17, 20, 45, 53, 53, 54} in a binary
search tree. The figure below shows a binary search tree with these numbers.
Let's try to compare storing the numbers in a binary search tree (as shown above) with an array
or a linked list. To count the number of occurrences of an element in an array or a linked list, it is
necessary to examine every element. Even if we are interested only in whether or not an element
appears in the numbers, we will often look at many elements before we come across the one we
seek.
With a binary search tree, searching for an element is often much quicker. To look for an
element in a binary search tree, the largest number of elements we'll ever have to look at is the
depth of the tree plus one.
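As a sketch of why counting occurrences is cheap in such a tree, the code below follows a single root-to-leaf path. It relies on the rule stated above that equal elements go into the left subtree. The names CNode and CountDemo are hypothetical, and the insertion order used in main is an assumption, since the figure showing the actual tree shape is not available here.

```java
// Hypothetical node type for this illustration.
class CNode {
    int key;
    CNode left, right;
    CNode(int key) { this.key = key; }
}

class CountDemo {
    // Duplicates are allowed: elements less than or equal to a node go left.
    static CNode insert(CNode node, int key) {
        if (node == null) return new CNode(key);
        if (key <= node.key) node.left = insert(node.left, key);
        else node.right = insert(node.right, key);
        return node;
    }

    // Follows one path down the tree, so it looks at no more than depth + 1 nodes.
    static int count(CNode node, int target) {
        int found = 0;
        while (node != null) {
            if (target == node.key) found++;
            // Any further copy of target can only be in the left subtree of an equal node.
            node = (target <= node.key) ? node.left : node.right;
        }
        return found;
    }

    public static void main(String[] args) {
        CNode root = null;
        for (int k : new int[] {45, 9, 53, 3, 17, 54, 20, 53}) root = insert(root, k);
        System.out.println(count(root, 53));  // 2
        System.out.println(count(root, 20));  // 1
        System.out.println(count(root, 7));   // 0
    }
}
```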
3.4.3 Heaps
A heap is a binary tree where the elements are arranged in a certain order proceeding from
smaller to larger. In this way, a heap is similar to a binary search tree (discussed previously), but
the arrangement of the elements in a heap follows rules that are different from a binary search
tree:
1. In a heap, the element contained by each node is greater than or equal to the elements of
that node's children.
2. The tree is a complete binary tree, so that every level except the deepest must contain as
many nodes as possible; and at the deepest level, all the nodes are as far left as possible.
As an example, suppose that elements are integers. Below are 3 trees with 6 elements. Only one
is a heap--which one?
The tree on the left is not a heap because it is not a complete binary tree. The middle tree is not a
heap because one of the nodes (containing 52) has a value that is smaller than its child. The tree
on the right is a heap.
A heap is a complete binary tree, therefore it can be represented using an array (as we
discussed at the beginning of these notes). Heaps provide an efficient implementation of priority
queues.
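A minimal sketch of an array-based heap used as a priority queue is shown below. It follows the two heap rules above (each parent is greater than or equal to its children, and the tree is complete) and the array layout with children at 2i+1 and 2i+2. The class name MaxHeapDemo, its method names and the sample values are hypothetical.

```java
import java.util.Arrays;

class MaxHeapDemo {
    private int[] a = new int[16];
    private int n;                       // number of elements currently stored

    void insert(int x) {
        if (n == a.length) a = Arrays.copyOf(a, 2 * n);
        int i = n++;
        a[i] = x;                        // next free slot keeps the tree complete
        while (i > 0 && a[(i - 1) / 2] < a[i]) {   // sift up past smaller parents
            swap(i, (i - 1) / 2);
            i = (i - 1) / 2;
        }
    }

    int removeMax() {
        if (n == 0) throw new IllegalStateException("heap is empty");
        int max = a[0];
        a[0] = a[--n];                   // move the last element to the root
        int i = 0;
        while (true) {                   // sift down towards the larger child
            int l = 2 * i + 1, r = 2 * i + 2, big = i;
            if (l < n && a[l] > a[big]) big = l;
            if (r < n && a[r] > a[big]) big = r;
            if (big == i) break;
            swap(i, big);
            i = big;
        }
        return max;
    }

    private void swap(int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    // Checks rule 1 on an array that is read as a complete binary tree.
    static boolean isHeap(int[] t) {
        for (int i = 1; i < t.length; i++) {
            if (t[(i - 1) / 2] < t[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        MaxHeapDemo heap = new MaxHeapDemo();
        for (int x : new int[] {19, 45, 4, 27, 35}) heap.insert(x);
        System.out.println(heap.removeMax());                  // 45
        System.out.println(heap.removeMax());                  // 35
        System.out.println(isHeap(new int[] {10, 52, 7}));     // false
    }
}
```

Both insert and removeMax move along a single root-to-leaf path, so each takes O(log n) steps.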
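The linked representation uses three parallel arrays, INFO, LEFT and RIGHT, and a pointer
variable ROOT. Each node N of T will correspond to a location K such that:
· INFO[K] contains the data at node N;
· LEFT[K] contains the location of the left child of node N;
· RIGHT[K] contains the location of the right child of node N.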
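/* Program 1: build a binary tree recursively from user input */
#include <stdio.h>
#include <stdlib.h>

typedef struct node
{
    int data;
    struct node *left;
    struct node *right;
} node;

/* Reads the tree in preorder; entering -1 means "no node here" */
node *create()
{
    node *p;
    int x;
    printf("Enter data(-1 for no data):");
    scanf("%d", &x);
    if (x == -1)
        return NULL;
    p = (node *)malloc(sizeof(node));
    p->data = x;
    printf("Enter left child of %d:\n", x);
    p->left = create();
    printf("Enter right child of %d:\n", x);
    p->right = create();
    return p;
}

int main()
{
    node *root;
    root = create();
    return 0;
}

/* Program 2: build a complete binary tree with the help of a queue */
#include <stdio.h>
#include <stdlib.h>
#define SIZE 50

// A tree node
struct node
{
    int data;
    struct node *right, *left;
};

// A queue node
struct Queue
{
    int front, rear;
    int size;
    struct node **array;
};

// A utility function to create a new tree node
struct node *newNode(int data)
{
    struct node *temp = (struct node *)malloc(sizeof(struct node));
    temp->data = data;
    temp->left = temp->right = NULL;
    return temp;
}

// A utility function to create a new queue
struct Queue *createQueue(int size)
{
    struct Queue *queue = (struct Queue *)malloc(sizeof(struct Queue));
    queue->front = queue->rear = -1;
    queue->size = size;
    queue->array = (struct node **)malloc(queue->size * sizeof(struct node *));
    int i;
    for (i = 0; i < size; ++i)
        queue->array[i] = NULL;
    return queue;
}

// Standard queue helpers
int isEmpty(struct Queue *queue) { return queue->front == -1; }
int isFull(struct Queue *queue) { return queue->rear == queue->size - 1; }
int hasOnlyOneItem(struct Queue *queue) { return queue->front == queue->rear; }

void Enqueue(struct node *root, struct Queue *queue)
{
    if (isFull(queue))
        return;
    queue->array[++queue->rear] = root;
    if (isEmpty(queue))
        ++queue->front;
}

struct node *Dequeue(struct Queue *queue)
{
    if (isEmpty(queue))
        return NULL;
    struct node *temp = queue->array[queue->front];
    if (hasOnlyOneItem(queue))
        queue->front = queue->rear = -1;
    else
        ++queue->front;
    return temp;
}

struct node *getFront(struct Queue *queue) { return queue->array[queue->front]; }

// A utility function to check if a tree node has both left and right children
int hasBothChild(struct node *temp)
{
    return temp && temp->left && temp->right;
}

// Inserts a new node so that the tree stays complete
void insert(struct node **root, int data, struct Queue *queue)
{
    struct node *temp = newNode(data);
    // If the tree is empty, the new node becomes the root
    if (!*root)
        *root = temp;
    else
    {
        // get the front node of the queue.
        struct node *front = getFront(queue);
        // If the left child of this front node doesn't exist, set the
        // left child as the new node
        if (!front->left)
            front->left = temp;
        // If the right child of this front node doesn't exist, set the
        // right child as the new node
        else if (!front->right)
            front->right = temp;
        // If the front node has both the left child and right child,
        // Dequeue() it.
        if (hasBothChild(front))
            Dequeue(queue);
    }
    // Enqueue() the new node for later insertions
    Enqueue(temp, queue);
}

// Standard level order traversal
void levelOrder(struct node *root)
{
    struct Queue *queue = createQueue(SIZE);
    Enqueue(root, queue);
    while (!isEmpty(queue))
    {
        struct node *temp = Dequeue(queue);
        printf("%d ", temp->data);
        if (temp->left)
            Enqueue(temp->left, queue);
        if (temp->right)
            Enqueue(temp->right, queue);
    }
}

int main()
{
    struct node *root = NULL;
    struct Queue *queue = createQueue(SIZE);
    int i;
    for (i = 1; i <= 12; ++i)
        insert(&root, i, queue);
    levelOrder(root);
    return 0;
}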
Traversal is like searching the tree except that in traversal the goal is to move through the tree in
some particular order. In addition, all nodes are processed in the traversal but searches cease
when the required node is found.
If the order of traversal is not specified and the tree contains n nodes, then the number of paths
that could be taken through the n nodes would be n factorial and therefore the information in the
tree would be presented in some format determined by the path. Since there are many different
paths, no real uniformity would exist in the presentation of information.
Therefore, three different orders are specified for tree traversals. These are called:
* pre-order
* in-order
* post-order
Because the definition of a binary tree is recursive and defined in terms of the left and right
subtrees and the root node, the choices for traversals can also be defined from this definition. In
pre-order traversals, each node is processed before (pre) either of its sub-trees. In in-order, each
node is processed after all the nodes in its left sub-tree but before any of the nodes in its right
subtree (they are done in order from left to right). In post-order, each node is processed after
(post) all nodes in both of its sub-trees.
Each order has different applications and yields different results. Consider the tree shown below
(which has a special name - an expression tree):
        *
       / \
      /   \
     /     \
    +       +
   / \     / \
  /   \   /   \
 a     b c     7
The following would result from each traversal
* pre-order : *+ab+c7
* in-order : a+b*c+7
* post-order: ab+c7+*
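The three results above can be reproduced with a short sketch. The names ExprNode and ExprTreeDemo are hypothetical, and the infix method is an addition that is not part of the notes: it shows that the plain in-order string a+b*c+7 loses the grouping of the tree, which parentheses restore.

```java
// Hypothetical node type for this illustration.
class ExprNode {
    String token;
    ExprNode left, right;
    ExprNode(String token, ExprNode left, ExprNode right) {
        this.token = token;
        this.left = left;
        this.right = right;
    }
}

class ExprTreeDemo {
    static String preOrder(ExprNode n) {
        return n == null ? "" : n.token + preOrder(n.left) + preOrder(n.right);
    }

    static String inOrder(ExprNode n) {
        return n == null ? "" : inOrder(n.left) + n.token + inOrder(n.right);
    }

    static String postOrder(ExprNode n) {
        return n == null ? "" : postOrder(n.left) + postOrder(n.right) + n.token;
    }

    // In-order with parentheses around every operator node.
    // Assumes each operator node has both operands.
    static String infix(ExprNode n) {
        if (n.left == null && n.right == null) return n.token;
        return "(" + infix(n.left) + n.token + infix(n.right) + ")";
    }

    static ExprNode operand(String s) { return new ExprNode(s, null, null); }

    // The expression tree drawn above.
    static ExprNode sample() {
        ExprNode sum1 = new ExprNode("+", operand("a"), operand("b"));
        ExprNode sum2 = new ExprNode("+", operand("c"), operand("7"));
        return new ExprNode("*", sum1, sum2);
    }

    public static void main(String[] args) {
        ExprNode root = sample();
        System.out.println(preOrder(root));   // *+ab+c7
        System.out.println(inOrder(root));    // a+b*c+7
        System.out.println(postOrder(root));  // ab+c7+*
        System.out.println(infix(root));      // ((a+b)*(c+7))
    }
}
```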
Recursive functions for all three types of traversal
void preorder(node *ptr)
{
    if (ptr == NULL)
        return;
    printf("%d", ptr->info);
    preorder(ptr->lchild);
    preorder(ptr->rchild);
}

void inorder(node *ptr)
{
    if (ptr == NULL)
        return;
    inorder(ptr->lchild);
    printf("%d", ptr->info);
    inorder(ptr->rchild);
}

void postorder(node *ptr)
{
    if (ptr == NULL)
        return;
    postorder(ptr->lchild);
    postorder(ptr->rchild);
    printf("%d", ptr->info);
}
Eg:
Preorder traversal: To traverse a binary tree in preorder, the following operations are carried out:
(i) visit the root, (ii) traverse the left subtree, and (iii) traverse the right subtree. Therefore, the
preorder traversal of the above tree will output: 7, 1, 0, 3, 2, 5, 4, 6, 9, 8, 10
Inorder traversal: To traverse a binary tree in inorder, the following operations are carried out:
(i) traverse the left subtree, (ii) visit the root, and (iii) traverse the right subtree. Therefore, the
inorder traversal of the above tree will output: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Postorder traversal: To traverse a binary tree in postorder, the following operations are carried
out: (i) traverse the left subtree, (ii) traverse the right subtree, and (iii) visit the root. Each subtree
is itself traversed in the same order, so the leaves are output first and the internal nodes follow as
the traversal works its way back up. Therefore, the postorder traversal of the above tree will
output: 0, 2, 4, 6, 5, 3, 1, 8, 10, 9, 7
"A binary tree is threaded by making all right child pointers that would normally be null point
to the inorder successor of the node (if it exists) , and all left child pointers that would normally
be null point to the inorder predecessor of the node."
A threaded binary tree makes it possible to traverse the values in the binary tree via a linear
traversal that is more rapid than a recursive in-order traversal. It is also possible to discover the
parent of a node from a threaded binary tree, without explicit use of parent pointers or a stack,
albeit slowly.
Types of threaded binary trees
Let's make the threaded binary tree out of a normal binary tree.
The inorder traversal for the above tree is D B A E C. So, the respective threaded binary
tree will be:
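As a sketch of how threads allow an in-order traversal without recursion or a stack, the code below uses right threads only (left threads to the predecessor are omitted for brevity). The tree shape is an assumption, since the figure is not available here: A is the root with children B and C, D is the left child of B, and E is the left child of C, which gives the inorder sequence D B A E C. The names ThNode and ThreadedDemo are hypothetical.

```java
// Hypothetical node type for this illustration.
class ThNode {
    char label;
    ThNode left, right;
    boolean rightIsThread;   // true: right points to the inorder successor, not a child
    ThNode(char label) { this.label = label; }
}

class ThreadedDemo {
    static ThNode leftmost(ThNode n) {
        while (n != null && n.left != null) n = n.left;
        return n;
    }

    // Linear in-order walk: follow a thread if there is one,
    // otherwise go to the leftmost node of the right subtree.
    static String inOrder(ThNode root) {
        StringBuilder sb = new StringBuilder();
        ThNode cur = leftmost(root);
        while (cur != null) {
            sb.append(cur.label).append(' ');
            cur = cur.rightIsThread ? cur.right : leftmost(cur.right);
        }
        return sb.toString().trim();
    }

    // Assumed tree shape with its right threads set by hand.
    static ThNode sample() {
        ThNode a = new ThNode('A'), b = new ThNode('B'), c = new ThNode('C');
        ThNode d = new ThNode('D'), e = new ThNode('E');
        a.left = b;
        a.right = c;
        b.left = d;
        c.left = e;
        d.right = b; d.rightIsThread = true;   // successor of D is B
        b.right = a; b.rightIsThread = true;   // successor of B is A
        e.right = c; e.rightIsThread = true;   // successor of E is C
        return a;                              // C is last, so its right stays null
    }

    public static void main(String[] args) {
        System.out.println(inOrder(sample()));   // D B A E C
    }
}
```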