ADSA Unit-3
__________________________________________________________________________________
Red-Black Trees, Splay Trees, Applications. Hash Tables: Introduction, Hash Structure, Hash
functions, Linear Open Addressing, Chaining and Applications.
_________________________________________________________________________________
Property #3: The children of a RED colored node must be colored BLACK. (There should not be two
consecutive RED nodes.)
Property #5: Every new node must be inserted with RED color.
Property #6: Every leaf (i.e. NULL node) must be colored BLACK.
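Property #3 and Property #6 can be checked mechanically. The following is a minimal sketch, not a
full Red-Black Tree validator: the names RBNode, color_of and no_red_red are hypothetical and chosen
only for illustration, and only the "no RED node has a RED child" rule is tested.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { RED, BLACK } Color;

/* Hypothetical node layout used only for this sketch. */
typedef struct RBNode {
    int key;
    Color color;
    struct RBNode *left, *right;
} RBNode;

/* Property #6: a NULL child is a leaf and counts as BLACK. */
static Color color_of(const RBNode *n)
{
    return n == NULL ? BLACK : n->color;
}

/* Property #3: returns 1 if no RED node in the subtree has a RED child,
 * and 0 as soon as two consecutive RED nodes are found. */
int no_red_red(const RBNode *n)
{
    if (n == NULL)
        return 1;
    if (n->color == RED &&
        (color_of(n->left) == RED || color_of(n->right) == RED))
        return 0;
    return no_red_red(n->left) && no_red_red(n->right);
}
```

Calling no_red_red on the root is enough, because the function visits every node of the tree.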
Example
Following is a Red-Black Tree which is created by inserting numbers from 1 to 9.
The above tree is a Red-Black Tree in which every node satisfies all the properties of a Red-Black
Tree.
Every Red-Black Tree is a binary search tree, but not every binary search tree is a Red-Black
Tree.
2. Rotation
The insertion operation in Red Black tree is performed using the following steps...
Step 1 - Check whether tree is Empty.
Step 2 - If tree is Empty then insert the newNode as Root node with color Black and exit from the
operation.
Step 3 - If tree is not Empty then insert the newNode as leaf node with color Red.
Step 4 - If the parent of newNode is Black then exit from the operation.
Step 5 - If the parent of newNode is Red then check the color of the sibling of newNode's parent
(the uncle of newNode).
Step 6 - If it is colored Black or NULL then make suitable Rotation and Recolor it.
Step 7 - If it is colored Red then perform Recolor. Repeat the same until tree becomes Red Black Tree.
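The steps above can be sketched in C as follows. This is one possible implementation under stated
assumptions, not the only way to code it: the names RBNode, rotate_left, rotate_right and rb_insert
are hypothetical, each node stores a parent pointer, and a NULL child is treated as a BLACK leaf.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { RED, BLACK } Color;

typedef struct RBNode {
    int key;
    Color color;
    struct RBNode *left, *right, *parent;
} RBNode;

/* Rotate x down to the left; its right child y takes its place. */
static void rotate_left(RBNode **root, RBNode *x)
{
    RBNode *y = x->right;
    x->right = y->left;
    if (y->left) y->left->parent = x;
    y->parent = x->parent;
    if (!x->parent) *root = y;
    else if (x == x->parent->left) x->parent->left = y;
    else x->parent->right = y;
    y->left = x;
    x->parent = y;
}

/* Mirror image of rotate_left. */
static void rotate_right(RBNode **root, RBNode *x)
{
    RBNode *y = x->left;
    x->left = y->right;
    if (y->right) y->right->parent = x;
    y->parent = x->parent;
    if (!x->parent) *root = y;
    else if (x == x->parent->left) x->parent->left = y;
    else x->parent->right = y;
    y->right = x;
    x->parent = y;
}

void rb_insert(RBNode **root, int key)
{
    RBNode *z = malloc(sizeof *z);
    assert(z != NULL);
    z->key = key;
    z->color = RED;                       /* new nodes start RED */
    z->left = z->right = z->parent = NULL;

    /* Steps 1-3: ordinary BST insertion as a leaf. */
    RBNode *p = NULL, *cur = *root;
    while (cur) {
        p = cur;
        cur = key < cur->key ? cur->left : cur->right;
    }
    z->parent = p;
    if (!p) *root = z;
    else if (key < p->key) p->left = z;
    else p->right = z;

    /* Steps 4-7: loop only while the parent is RED. A RED parent is
     * never the root, so the grandparent g always exists here. */
    while (z->parent && z->parent->color == RED) {
        RBNode *g = z->parent->parent;
        int parent_is_left = (z->parent == g->left);
        RBNode *uncle = parent_is_left ? g->right : g->left;

        if (uncle && uncle->color == RED) {
            /* Step 7: RED uncle, recolor and repeat from g. */
            z->parent->color = BLACK;
            uncle->color = BLACK;
            g->color = RED;
            z = g;
        } else if (parent_is_left) {
            /* Step 6: BLACK or NULL uncle, rotate and recolor. */
            if (z == z->parent->right) { z = z->parent; rotate_left(root, z); }
            z->parent->color = BLACK;
            g->color = RED;
            rotate_right(root, g);
        } else {
            if (z == z->parent->left) { z = z->parent; rotate_right(root, z); }
            z->parent->color = BLACK;
            g->color = RED;
            rotate_left(root, g);
        }
    }
    (*root)->color = BLACK;               /* Step 2: the root is BLACK */
}
```

Starting from an empty tree (a NULL root pointer) and calling rb_insert with 1, 2, ..., 9 in order
leaves 4 at the root with this sketch.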
The deletion operation in a Red-Black Tree is similar to the deletion operation in a BST. But after
every deletion operation, we need to check the Red-Black Tree properties. If any of the properties
is violated, perform a suitable operation (Recolor, Rotation, or Rotation followed by Recolor) to
make it a Red-Black Tree again.
Every Splay tree must be a binary search tree, but it need not be a balanced tree.
Insertion Operation in Splay Tree
The insertion operation in Splay tree is performed using following steps...
Step 1 - Check whether tree is Empty.
Step 2 - If tree is Empty then insert the newNode as Root node and exit from the operation.
Step 3 - If tree is not Empty then insert the newNode as leaf node using Binary Search tree
insertion logic.
Step 4 - After insertion, Splay the newNode.
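A minimal sketch of these four steps in C is given below. It assumes distinct keys and uses a
recursive splay routine, which is only one of several ways to implement splaying; the names
SplayNode, splay and splay_insert are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct SplayNode {
    int key;
    struct SplayNode *left, *right;
} SplayNode;

static SplayNode *rotate_right(SplayNode *x)
{
    SplayNode *y = x->left;
    x->left = y->right;
    y->right = x;
    return y;
}

static SplayNode *rotate_left(SplayNode *x)
{
    SplayNode *y = x->right;
    x->right = y->left;
    y->left = x;
    return y;
}

/* Move the node holding key to the root of the subtree.
 * splay_insert below guarantees the key is present. */
static SplayNode *splay(SplayNode *root, int key)
{
    if (root == NULL || root->key == key)
        return root;

    if (key < root->key) {
        if (root->left == NULL)
            return root;
        if (key < root->left->key) {              /* zig-zig */
            root->left->left = splay(root->left->left, key);
            root = rotate_right(root);
        } else if (key > root->left->key) {       /* zig-zag */
            root->left->right = splay(root->left->right, key);
            if (root->left->right != NULL)
                root->left = rotate_left(root->left);
        }
        return root->left == NULL ? root : rotate_right(root);
    }

    if (root->right == NULL)
        return root;
    if (key > root->right->key) {                 /* zig-zig */
        root->right->right = splay(root->right->right, key);
        root = rotate_left(root);
    } else if (key < root->right->key) {          /* zig-zag */
        root->right->left = splay(root->right->left, key);
        if (root->right->left != NULL)
            root->right = rotate_right(root->right);
    }
    return root->right == NULL ? root : rotate_left(root);
}

/* Returns the new root of the tree. Assumes key is not already stored. */
SplayNode *splay_insert(SplayNode *root, int key)
{
    SplayNode *n = malloc(sizeof *n);
    assert(n != NULL);
    n->key = key;
    n->left = n->right = NULL;

    if (root == NULL)                 /* Steps 1-2: empty tree */
        return n;

    SplayNode *cur = root;            /* Step 3: BST insertion as a leaf */
    for (;;) {
        SplayNode **next = key < cur->key ? &cur->left : &cur->right;
        if (*next == NULL) {
            *next = n;
            break;
        }
        cur = *next;
    }
    return splay(root, key);          /* Step 4: splay the new node */
}
```

After every call the newly inserted key sits at the root, which is why the caller must keep the
returned pointer: root = splay_insert(root, key).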
Applications:
Some applications of the trees are:
1. XML parsers use tree algorithms.
2. Decision-tree algorithms used in machine learning are built on tree structures.
3. Databases use tree data structures for indexing.
4. The Domain Name System (DNS) uses a tree structure.
5. File explorers on computers and mobile devices present the file system as a tree.
6. BSTs are used in computer graphics.
7. On question-and-answer websites like Quora, comments are children of the question they are
posted under.
We've seen searches that allow you to look through data in O(n) time, and searches that allow you to
look through data in O(log n) time, but imagine a way to find exactly what you want in O(1) time. Think
it's not possible? Think again! Hash tables allow the storage and retrieval of data in an average time
of O(1).
At its most basic level, a hash table data structure is just an array. Data is stored into this array at
specific indices designated by a hash function. A hash function is a mapping between the set of input
data and a set of integers.
With hash tables, there always exists the possibility that two data elements will hash to the same integer
value. When this happens, a collision results (two data members try to occupy the same place in the hash
table array), and methods have been devised to deal with such situations. In this guide, we will cover
two methods, linear probing and separate chaining, focusing on the latter.
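To make the first of these methods concrete, here is a minimal sketch of linear probing (linear open
addressing) in C. It is an illustration under assumptions, not a complete table: the names ProbeTable,
hash_str, probe_insert and probe_find are hypothetical, the table has a fixed size of 12, keys are
strings hashed by summing their character codes, and deletion is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TABLE_SIZE 12

/* An empty slot is NULL; the table stores pointers to caller-owned strings. */
typedef struct {
    const char *slots[TABLE_SIZE];
} ProbeTable;

/* Sum of the character codes, reduced to a valid index. */
static unsigned hash_str(const char *s)
{
    unsigned sum = 0;
    for (; *s; s++)
        sum += (unsigned char)*s;
    return sum % TABLE_SIZE;
}

/* Store key in the first free slot at or after its home index,
 * wrapping around the end of the array.
 * Returns the slot used, or -1 if the table is full. */
int probe_insert(ProbeTable *t, const char *key)
{
    unsigned home = hash_str(key);
    for (unsigned i = 0; i < TABLE_SIZE; i++) {
        unsigned idx = (home + i) % TABLE_SIZE;
        if (t->slots[idx] == NULL || strcmp(t->slots[idx], key) == 0) {
            t->slots[idx] = key;
            return (int)idx;
        }
    }
    return -1;
}

/* Follow the same probe sequence; an empty slot means the key is absent.
 * Returns the slot holding key, or -1 if it is not in the table. */
int probe_find(const ProbeTable *t, const char *key)
{
    unsigned home = hash_str(key);
    for (unsigned i = 0; i < TABLE_SIZE; i++) {
        unsigned idx = (home + i) % TABLE_SIZE;
        if (t->slots[idx] == NULL)
            return -1;
        if (strcmp(t->slots[idx], key) == 0)
            return (int)idx;
    }
    return -1;
}
```

For example, "ab" and "ba" contain the same characters, so both hash to index 3. Inserting "ab" first
places it in slot 3, and "ba" then lands in the next free slot, 4.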
A hash table is made up of two parts: an array (the actual table where the data to be searched is stored)
and a mapping function, known as a hash function. The hash function is a mapping from the input space
to the integer space that defines the indices of the array. In other words, the hash function provides a
way for assigning numbers to the input data such that the data can then be stored at the array index
corresponding to the assigned number.
Let's take a simple example. First, we start with a hash table array of strings (we'll use strings as the data
being stored and searched in this example). Let's say the hash table size is 12:
Next we need a hash function. There are many possible ways to construct a hash function. We'll discuss
these possibilities more in the next section. For now, let's assume a simple hash function that takes a
string as input. The returned hash value will be the sum of the ASCII values of the characters that
make up the string, mod the size of the table:
int hash(char *str, int table_size)
{
    int sum = 0;   /* must start at zero before accumulating */

    /* Make sure a valid string was passed in */
    if (str == NULL)
        return -1;

    /* Sum up all the characters in the string */
    for ( ; *str; str++)
        sum += *str;

    /* Return the sum mod the table size */
    return sum % table_size;
}
Now that we have a framework in place, let's try using it. First, let's store a string into the
table: "Steve".
We run "Steve" through the hash function, and find that hash("Steve",12) yields 3:
Let's try another string: "Spark". We run the string through the hash function and find
that hash("Spark",12) yields 9 (the character codes sum to 513, and 513 mod 12 is 9). Fine. We insert
it into the hash table:
Let's look at the above example again, this time with our modified data structure:
Figure: After adding "Steve" to the table
Figure: After adding "Spark", which hashes to 9
Problem : How does a hash table allow for O(1) searching? What is the worst case efficiency of a look
up in a hash table using separate chaining?
A hash table uses hash functions to compute an integer value for data. This integer value can then be
used as an index into an array, giving us constant time access to the requested data. However, using
separate chaining, we won't always achieve the best and average case efficiency of O(1). If the hash
table is too small for the data set and/or the hash function is bad, elements can start to build up at
one index in the array. Theoretically, all n elements could end up in the same linked list. Therefore, a
search in the worst case is equivalent to looking up a data element in a linked list, something we already
know to be O(n) time. However, with a good hash function and a well created hash table, the chances of
this happening are, for all intents and purposes, negligible.
Problem : The bigger the ratio between the size of the hash table and the number of data elements, the
less chance there is for collision. What is a drawback to making the hash table big enough that the
chance of collision is negligible?
Wasted memory space.
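The worst case described in the first problem can be seen in a small separate chaining sketch. The code
below is an illustration under assumptions rather than a complete table: the names ChainTable, Entry,
bucket_of, chain_insert, chain_contains and chain_length are hypothetical, there are 12 buckets, keys
are hashed by summing their character codes, and entries are never freed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NUM_BUCKETS 12

/* One link in a bucket's chain; the key string is owned by the caller. */
typedef struct Entry {
    const char *key;
    struct Entry *next;
} Entry;

typedef struct {
    Entry *buckets[NUM_BUCKETS];   /* NULL means an empty chain */
} ChainTable;

/* Sum of the character codes, reduced to a bucket index. */
static unsigned bucket_of(const char *s)
{
    unsigned sum = 0;
    for (; *s; s++)
        sum += (unsigned char)*s;
    return sum % NUM_BUCKETS;
}

/* Walk the chain in the key's bucket; cost grows with the chain length. */
int chain_contains(const ChainTable *t, const char *key)
{
    for (const Entry *e = t->buckets[bucket_of(key)]; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return 1;
    return 0;
}

/* Add key at the head of its bucket's chain unless it is already stored. */
void chain_insert(ChainTable *t, const char *key)
{
    if (chain_contains(t, key))
        return;
    Entry *e = malloc(sizeof *e);
    assert(e != NULL);
    unsigned b = bucket_of(key);
    e->key = key;
    e->next = t->buckets[b];
    t->buckets[b] = e;
}

/* Number of entries chained in bucket b. */
size_t chain_length(const ChainTable *t, unsigned b)
{
    size_t n = 0;
    for (const Entry *e = t->buckets[b]; e != NULL; e = e->next)
        n++;
    return n;
}
```

With this hash, "Steve", "ab" and "ba" all fall into bucket 3, so a lookup for any of them walks a
chain of three entries while the other eleven buckets stay empty. If every key collided like this, a
lookup would cost O(n).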
Problem : How could a linked list and a hash table be combined to allow someone to run through the
list from item to item while still maintaining the ability to access an individual element in O(1) time?
Hash Functions
As mentioned briefly in the previous section, there are multiple ways of constructing a hash function.
Remember that a hash function takes the data as input (often a string), and returns an integer in the
range of possible indices into the hash table. Every hash function must do that, including the bad ones.
So what makes for a good hash function?
Rule 1: If something else besides the input data is used to determine the hash, then the hash value is not
as dependent upon the input data, thus allowing for a worse distribution of the hash values.
Rule 2: If the hash function doesn't use all the input data, then slight variations to the input data would