08_Hashing.pptx

The document discusses hashing as a data storage technique that allows for efficient data retrieval without the need for sorting, achieving best-case time complexity of O(1). It describes two forms of hashing: open hashing, which allows unlimited storage space, and closed hashing, which uses fixed space and requires collision resolution strategies. Various hashing functions and collision resolution methods, including separate chaining and open addressing, are also explained to manage data storage effectively.

Uploaded by

pro gaming

Data Structures (DS)

GTU # 3130702

Unit-4: Hashing & File Structure (Hashing)
What is Hashing?
Sequential search requires O(n) comparisons on average to locate an element, which is undesirable for a large database of elements.
Binary search requires far fewer comparisons, O(log n) on average, but it has the additional requirement that the data be sorted. Even with the best comparison-based sorting algorithms, sorting n elements requires O(n log n) comparisons.
Hashing is another widely used technique for storing data. It does away with the requirement of keeping data sorted (as in binary search), and its best-case time complexity is of constant order, O(1). In its worst case, a hashing algorithm behaves like linear search.
Best-case time complexity of searching using hashing = O(1)
Worst-case time complexity of searching using hashing = O(n)
What is Hashing?
In hashing, the record for a key value "key" is located directly by calculating its address from the key value.
The address or location of an element or record x is obtained by computing some arithmetic function f.
f(key) gives the address of x in the table.
[Figure: Mapping of a record in the hash table. A record is passed through f() to obtain an address in a hash table with locations 1 to 7.]
Hash Table Data Structure
There are two different forms of hashing.
1. Open hashing or external hashing
Open or external hashing allows records to be stored in unlimited space (which could be a hard disk).
It places no limitation on the size of the tables.

2. Closed hashing or internal hashing

Closed or internal hashing uses a fixed space for storage and thus limits the size of the hash table.
Open Hashing Data Structure
The basic idea is that the records [elements] are partitioned into B classes, numbered 0, 1, 2, …, B-1.
A hashing function f(x) maps a record with key x to an integer value between 0 and B-1.
Each bucket in the bucket table is the head of the linked list of records mapped to that bucket.

[Figure: The open hashing data organization. A bucket table with entries 0, 1, …, B-1, each pointing to a list of elements.]
Closed Hashing Data Structure
A closed hash table keeps the elements in the bucket itself.
Only one element can be put in the bucket.
If we try to place an element in the bucket and find it already holds an element, then we say that a collision has occurred.
In case of a collision, the element should be rehashed to an alternate empty location within the bucket table.
In closed hashing, collision handling is a very important issue.

Bucket  Element
0       A
1
2       C
3
4
5       B
Hashing Functions
Characteristics of a Good Hash Function
A good hash function avoids collisions.
A good hash function tends to spread keys evenly in the array.
A good hash function is easy to compute.

Different hashing functions

1. Division-Method
2. Midsquare Methods
3. Folding Method
4. Digit Analysis
5. Length Dependent Method
6. Algebraic Coding
7. Multiplicative Hashing
Division-Method
In this method, we use modular arithmetic to divide the key value by some integer divisor m (which may be the table size).
The remainder gives the location where the element can be placed.
We can write L = (K mod m) + 1, where
L = location in table/file
K = key value
m = table size/number of slots in file

Suppose K = 23 and m = 10. Then

L = (23 mod 10) + 1 = 3 + 1 = 4
The key whose value is 23 is placed in the 4th location.
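The division method above can be sketched in a few lines of Python. The function name `division_hash` is chosen here for illustration and does not come from the slides; the `+ 1` keeps the 1-based locations used in the formula.

```python
def division_hash(key: int, m: int) -> int:
    """Return the 1-based location L = (key mod m) + 1."""
    if m <= 0:
        raise ValueError("m must be a positive integer")
    return (key % m) + 1


# The slide's example: K = 23, m = 10.
print(division_hash(23, 10))  # 4
```

If the table is indexed from 0, as Python lists are, the `+ 1` would simply be dropped.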
Midsquare Methods
In this method, we square the value of the key and take the number of digits required to form an address from the middle of the squared value.
Suppose the key value is 16.
Its square is 256.
If we want a two-digit address,
we select 56 as the address (i.e. two digits starting from the middle of 256).
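A minimal sketch of the midsquare idea follows. It assumes the convention implied by the example (start at the middle digit of the square and read to the right); other texts centre the selected digits differently, so the helper `midsquare_hash` is illustrative only.

```python
def midsquare_hash(key: int, digits: int) -> int:
    """Square the key and take `digits` digits starting at the middle digit."""
    squared = str(key * key)
    start = len(squared) // 2  # assumed convention: begin at the middle digit
    return int(squared[start:start + digits])


# The slide's example: 16 squared is 256, and the two-digit address is 56.
print(midsquare_hash(16, 2))  # 56
```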
Folding Method
Most machines have a small number of primitive data types for which there are arithmetic instructions.
Frequently, the key to be used will not fit easily into one of these data types.
It is not desirable to simply discard the portion of the key that does not fit into such an arithmetic data type.
The solution is to combine the various parts of the key in such a way that all parts of the key affect the final result. Such an operation is termed folding of the key.
That is, the key is partitioned into a number of parts, each part having the same length as the required address.
Add the values of the parts, ignoring the final carry, to get the required address.
Folding Method
This is done in two ways.
Fold-shifting: Here the actual values of the parts of the key are added.
Suppose the key is 12345678 and the required address is of two digits.
Break the key into: 12, 34, 56, 78
Adding these, we get 12 + 34 + 56 + 78 = 180; ignoring the leading 1, we get 80 as the location.

Fold-boundary: Here the outer parts of the key are reversed before the parts are added.
Suppose the key is 12345678 and the required address is of two digits.
Break the key into: 21, 34, 56, 87
Adding these, we get 21 + 34 + 56 + 87 = 198; ignoring the leading 1, we get 98 as the location.
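Both folding variants can be sketched as below. The helper names (`split_parts`, `fold_shift`, `fold_boundary`) are made up for this illustration, and "ignoring the final carry" is interpreted as keeping only the lowest `width` digits of the sum.

```python
def split_parts(key: int, width: int) -> list:
    """Split the decimal digits of key into chunks of `width` digits."""
    s = str(key)
    return [s[i:i + width] for i in range(0, len(s), width)]


def fold_shift(key: int, width: int) -> int:
    """Add the parts as they are and drop any carry beyond `width` digits."""
    total = sum(int(part) for part in split_parts(key, width))
    return total % 10 ** width


def fold_boundary(key: int, width: int) -> int:
    """Reverse the first and last parts, then add and drop the carry."""
    parts = split_parts(key, width)
    parts[0] = parts[0][::-1]
    parts[-1] = parts[-1][::-1]
    total = sum(int(part) for part in parts)
    return total % 10 ** width


print(fold_shift(12345678, 2))     # 80
print(fold_boundary(12345678, 2))  # 98
```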
Digit Analysis
This hashing function is distribution-dependent.
Here we make a statistical analysis of the digits of the keys and select those digits (of fixed position) which occur quite frequently.
Then we reverse or shift the selected digits to get the address.
For example,
The key is: 9861234
If the statistical analysis has revealed that the third and fifth position digits occur quite frequently,
we choose the digits in these positions from the key.
So we get 62. Reversing it, we get 26 as the address.
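The selection step of the example can be sketched as follows. The statistical analysis itself is not shown; the positions are assumed to be already known and are passed in as 1-based indices. The name `digit_analysis_hash` is hypothetical.

```python
def digit_analysis_hash(key: int, positions, reverse: bool = True) -> int:
    """Pick the digits at the given 1-based positions, optionally reversed."""
    s = str(key)
    selected = "".join(s[p - 1] for p in positions)
    if reverse:
        selected = selected[::-1]
    return int(selected)


# The slide's example: positions 3 and 5 of 9861234 give 62, reversed to 26.
print(digit_analysis_hash(9861234, (3, 5)))  # 26
```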
Length Dependent Method
In this type of hashing function, we use the length of the key along with some portion of the key.
In the direct method, the length of the key along with some portion of the key is used to produce the address directly.
In the indirect method, the length of the key along with some portion of the key is used to obtain an intermediate value.
Algebraic Coding
Here an n-bit key value is represented as a polynomial.
The divisor polynomial is then constructed based on the address range required.
The key polynomial is divided by the divisor polynomial using modular division to get the address polynomial.
Let $f(x)$ = polynomial of the n-bit key = $a_1 + a_2 x + \dots + a_n x^{n-1}$
$d(x)$ = divisor polynomial = $d_1 + d_2 x + \dots + d_n x^{n-1}$
The required address polynomial is $f(x) \bmod d(x)$
Multiplicative Hashing
This method obtains the address of a key from a multiplication.
Let k be a non-negative key and c a constant with 0 < c < 1.
Compute kc mod 1, which is the fractional part of kc.
Multiply this fractional part by m (the table size) and take the floor of the result to get the address:

$h(k) = \lfloor m \, (kc \bmod 1) \rfloor$
$0 \le h(k) < m$
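A short sketch of multiplicative hashing is given below. The slides do not fix a value for c; the default used here, (sqrt(5) - 1) / 2, is a commonly suggested choice and is an assumption of this example, as is the function name.

```python
import math


def multiplicative_hash(k: int, m: int, c: float = (math.sqrt(5) - 1) / 2) -> int:
    """Return floor(m * frac(k * c)) for a non-negative key k and 0 < c < 1."""
    if k < 0 or not 0 < c < 1:
        raise ValueError("need k >= 0 and 0 < c < 1")
    fractional = (k * c) % 1.0  # fractional part of k*c
    return math.floor(m * fractional)


# Every address falls in the range 0 .. m-1.
print([multiplicative_hash(k, 10) for k in range(1, 6)])
```

Because this relies on floating-point arithmetic, very large keys may lose precision; production implementations usually work with fixed-width integer arithmetic instead.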
Collision Resolution Strategies
Collision resolution is the main problem in hashing.
If the element to be inserted is mapped to a location where an element is already stored, then we have a collision, and it must be resolved.
There are several strategies for collision resolution. The most commonly used are :
Separate chaining - used with open hashing
Open addressing - used with closed hashing
Separate chaining
In this strategy, a separate list of all elements mapped to the same value is maintained.
Separate chaining is based on collision avoidance.
If memory space is tight, separate chaining should be avoided.
Additional memory space is consumed by the links that store the addresses of linked elements.
Hashing function should ensure even distribution of elements among buckets; otherwise the
timing behaviour of most operations on hash table will deteriorate.
Separate chaining
[Figure: A separate chaining hash table]
Location  Chain
0         10 -> 50
1
2         12 -> 32 -> 62
3
4         4 -> 24
5
6
7         7
8
9         9 -> 69
Example - Separate chaining
Example: The integers given below are to be inserted in a hash table with 5 locations using chaining to resolve collisions.
Construct the hash table using the simplest hash function.
1, 2, 3, 4, 5, 10, 21, 22, 33, 34, 15, 32, 31, 48, 49, 50
An element can be mapped to a location in the hash table using the mapping function key % 5.

Hash Table Location   Mapped elements
0                     5, 10, 15, 50
1                     1, 21, 31
2                     2, 22, 32
3                     3, 33, 48
4                     4, 34, 49
Example - Separate chaining
Hash Table
0 -> 5 -> 10 -> 15 -> 50
1 -> 1 -> 21 -> 31
2 -> 2 -> 22 -> 32
3 -> 3 -> 33 -> 48
4 -> 4 -> 34 -> 49
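The worked example can be reproduced with a small sketch. Python lists stand in for the linked lists here, which is a simplification of this illustration; the function name `build_chained_table` is also hypothetical. New keys are appended at the end of each chain.

```python
def build_chained_table(keys, size):
    """Insert keys into `size` buckets using key % size and chaining."""
    table = [[] for _ in range(size)]  # one chain per bucket
    for key in keys:
        table[key % size].append(key)
    return table


keys = [1, 2, 3, 4, 5, 10, 21, 22, 33, 34, 15, 32, 31, 48, 49, 50]
table = build_chained_table(keys, 5)
for location, chain in enumerate(table):
    print(location, chain)
```

Searching for a key then only requires scanning the single chain at `key % size` rather than the whole table.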
Open Addressing
Separate chaining requires additional memory space for pointers.
Open addressing hashing is an alternate method of handling collision.
In open addressing, if a collision occurs, alternate cells are tried until an empty cell is found. Three common probing strategies are:
a. Linear probing
b. Quadratic probing
c. Double hashing.
Linear Probing
In linear probing, whenever there is a collision, cells are searched sequentially (with wraparound) for an empty cell.
The table below shows the result of inserting the keys {5, 18, 55, 78, 35, 15} using the hash function f(key) = key % 10 and the linear probing strategy.

Index  Empty  After 5  After 18  After 55  After 78  After 35  After 15
0                                                              15
1
2
3
4
5             5        5         5         5         5         5
6                                55        55        55        55
7                                                    35        35
8                      18        18        18        18        18
9                                          78        78        78
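The insertion sequence in the table can be checked with a minimal sketch. Empty cells are represented by `None`, and the function name `linear_probe_insert` is an assumption of this example.

```python
def linear_probe_insert(table, key):
    """Insert key using key % len(table) and linear probing with wraparound."""
    size = len(table)
    home = key % size
    for step in range(size):
        index = (home + step) % size  # wrap around at the end of the table
        if table[index] is None:
            table[index] = key
            return index
    raise OverflowError("hash table is full")


table = [None] * 10
for key in (5, 18, 55, 78, 35, 15):
    linear_probe_insert(table, key)
print(table)  # [15, None, None, None, None, 5, 55, 35, 18, 78]
```

Note how 55, 35 and 15 all hash to cell 5 and end up in cells 6, 7 and 0 after probing past occupied neighbours.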
Linear Probing
Linear probing is easy to implement, but it suffers from "primary clustering".
When many keys are mapped to the same location (clustering), linear probing will not distribute these keys evenly in the hash table.
These keys will be stored in the neighbourhood of the location to which they are mapped.
This leads to clustering of keys around the point of collision.
Quadratic probing
One way of reducing "primary clustering" is to use quadratic probing to resolve collision.
Suppose the "key" is mapped to location j and cell j is already occupied.
In quadratic probing, the locations j, (j+1), (j+4), (j+9), ... (taken modulo the table size) are examined to find the first empty cell where the key is to be inserted.
This technique reduces primary clustering.
However, it does not ensure that all cells in the table will be examined to find an empty cell.
Thus, it is possible that a key will not be inserted even if there is an empty cell in the table.
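The following sketch illustrates both the probe sequence and the failure case just described. The table size of 8 and the keys are chosen for this illustration only, and the function name is hypothetical. With 8 cells, the offsets i*i modulo 8 only ever take the values 0, 1 and 4.

```python
from typing import Optional


def quadratic_probe_insert(table, key) -> Optional[int]:
    """Probe j, j+1, j+4, j+9, ... (mod table size); return None on failure."""
    size = len(table)
    home = key % size
    for i in range(size):
        index = (home + i * i) % size
        if table[index] is None:
            table[index] = key
            return index
    return None  # no empty cell was reached by the probe sequence


table = [None] * 8
for key in (0, 8, 16):
    quadratic_probe_insert(table, key)  # these land in cells 0, 1 and 4
result = quadratic_probe_insert(table, 24)
print(result, table.count(None))  # None 5
```

Key 24 cannot be placed even though five cells are still empty, because its probe sequence only revisits cells 0, 1 and 4.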
Double Hashing
This method requires two hashing functions, f1(key) and f2(key).
The problem of clustering can be handled through double hashing.
Function f1(key) is known as the primary hash function.
In case the address obtained by f1(key) is already occupied by a key, the function f2(key) is evaluated.
The second function f2(key) is used to compute the increment to be added to the address obtained by the first hash function f1(key) in case of a collision.
The search for an empty location is made successively at the addresses
f1(key) + f2(key),
f1(key) + 2 * f2(key),
f1(key) + 3 * f2(key), ...
(each taken modulo the table size).
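A minimal sketch of double hashing follows. The slides do not define f1 and f2; this example assumes f1(key) = key % size and f2(key) = r - (key % r) with r = 7, a common textbook pairing in which f2 is never zero. The table size of 11 and the function name are likewise assumptions.

```python
def double_hash_insert(table, key, r=7):
    """Probe f1, f1 + f2, f1 + 2*f2, ... (mod table size) for an empty cell."""
    size = len(table)
    start = key % size       # assumed f1(key)
    step = r - (key % r)     # assumed f2(key); always between 1 and r
    for i in range(size):
        index = (start + i * step) % size
        if table[index] is None:
            table[index] = key
            return index
    raise OverflowError("no empty cell found")


table = [None] * 11
for key in (5, 16, 27):      # all three keys have f1(key) = 5
    double_hash_insert(table, key)
print(table[5], table[10], table[6])  # 5 16 27
```

Although the three keys collide at cell 5, their different increments (5 for key 16, 1 for key 27) send them to different cells, which is how double hashing avoids the clustering seen with linear probing.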
Thank
You
