Hashing

Hashing is a technique for mapping keys to values in an associative array called a hash table. It has two main components: a hash function that generates a hash value from the key, and a hash table data structure that stores the key-value pairs. Common hash functions include the division, folding, mid-square, and digit-extraction methods. A collision occurs when two keys map to the same hash value. Collisions are handled either by separate chaining or by open addressing (linear probing, quadratic probing, or double hashing).

Uploaded by

Lamia Alam

HASHING

Hashing
• Mathematical concept
– Defines any number as one of a set of numbers in a given interval
– Cuts a number down to a part of it
– Used in discrete mathematics, e.g. graph theory and set theory
– Used in searching techniques
– Used in encryption methods

Hash Functions and Hash Tables
• Hashing has two major components
– Hash function h
– Hash table data structure of size N
• A hash function h maps keys (a key is the identifying element of a record) to a hash value, or hash key, which refers to a specific location in the hash table
• Example:
h(x) = x mod N
is a hash function for integer keys
• The integer h(x) is called the hash value of key x

Hash Functions and Hash Tables
• A hash table data structure is an array, or array-type ADT, of some fixed size, containing the keys.
• It is an array in which records are not stored consecutively; their place of storage is calculated using the key and a hash function

[Figure: key → hash function → array index]

• Hashed key: the result of applying a hash function to a key
• Keys and entries are scattered throughout the array
• Combines the main advantages of both arrays and trees
• Hashing depends mainly on two factors:
(a) Hash function (b) Collision resolution
• Table size is also a (minor) factor in hashing; table indices run from 0 to tablesize-1.

Table Size
• Hash table size
– Should be appropriate for the hash function used
– Too big will waste memory; too small will increase collisions and may eventually force rehashing (copying into a larger table)

Example
• We design a hash table for a dictionary storing items (SSN, Name), where SSN (social security number) is a nine-digit positive integer
• The actual data is not stored in the hash table
• The table pinpoints the location of the actual data or set of data
• Our hash table uses an array of size N = 10,000 and the hash function
h(x) = last four digits of x

Index   Entry
0       (empty)
1       025-612-0001
2       981-101-0002
3       (empty)
4       451-229-0004
…
9997    (empty)
9998    200-751-9998
9999    (empty)
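
The hash function in this example can be sketched in Python as follows. This is only an illustration: the function name `last_four_digits` is an assumption, and the key is passed as a string so that the hyphens shown in the table can be stripped first.

```python
def last_four_digits(ssn: str) -> int:
    """h(x) = last four digits of x, for a table of size N = 10,000."""
    digits = int(ssn.replace("-", ""))   # drop the hyphens used on the slide
    return digits % 10_000               # keep only the last four digits

# Each key from the slide lands at the index shown next to it in the table.
for ssn in ["025-612-0001", "981-101-0002", "451-229-0004", "200-751-9998"]:
    print(ssn, "->", last_four_digits(ssn))
```

Taking the last four digits is the same as computing x mod 10,000, which is why the array needs exactly 10,000 cells.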

Hash Function
• The mapping of keys into the table is called the hash function
• A good hash function:
– Ideally distributes keys and entries evenly throughout the table
– Is easy and quick to compute
– Minimizes collisions, where the position given by the hash function is already occupied
– Is applicable to all objects

• Different types of hash functions are used for mapping keys into tables.
(a) Division Method
(b) Folding Method
(c) Mid-square Method
(d) Subtraction Method
(e) Digit-Extraction Method
(f) Rotation Method

1. Division Method
• Choose a number m larger than the number n of keys in the key set K.
• The number m is usually chosen to be a prime number.
• The hash function H is defined as
H(k) = k (mod m) or H(k) = k (mod m) + 1
• Here k (mod m) denotes the remainder when k is divided by m
• The second formula is used when the address range is from 1 to m.

• Example:
Elements are: 3205, 7148, 2345
Table addresses: 00 – 99
m = 97 (prime)
H(3205) = 4, H(7148) = 67, H(2345) = 17
• For the second formula, add 1 to the remainders: 5, 68, 18.
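
The division method can be checked with a short Python sketch. The function name `division_hash` and the `one_based` flag are assumptions made for this illustration; the keys and m = 97 come from the example.

```python
def division_hash(k: int, m: int, one_based: bool = False) -> int:
    """H(k) = k mod m, or k mod m + 1 when addresses run from 1 to m."""
    remainder = k % m
    return remainder + 1 if one_based else remainder

keys = [3205, 7148, 2345]
print([division_hash(k, 97) for k in keys])                  # [4, 67, 17]
print([division_hash(k, 97, one_based=True) for k in keys])  # [5, 68, 18]
```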

2. Folding Method
• The key k is partitioned into a number of parts k1, k2, …, kn
• The parts are then added together, ignoring the final carry.
• One of the parts may also be reversed before adding (right or left justified; mostly right)
H(k) = k1 + k2 + … + kn

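• Example (two-digit parts; the right-hand form reverses the second part before adding):
H(3205) = 32 + 05 = 37 or H(3205) = 32 + 50 = 82
H(7148) = 71 + 48 = 119 → 19 or H(7148) = 71 + 84 = 155 → 55
H(2345) = 23 + 45 = 68 or H(2345) = 23 + 54 = 77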
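
A minimal Python sketch of the folding method follows. It assumes two-digit parts cut from the left of the key and, for the reversed variant, reverses every second part; the function name and parameters are illustrative, not taken from the slides. Note that 23 + 45 = 68 and 23 + 54 = 77.

```python
def folding_hash(key: int, part_len: int = 2, reverse_second: bool = False) -> int:
    """Split the key's digits into parts, add them, and drop the final carry."""
    digits = str(key)
    parts = [digits[i:i + part_len] for i in range(0, len(digits), part_len)]
    if reverse_second:
        # Reverse the 2nd, 4th, ... part before adding (assumed convention).
        parts = [p[::-1] if i % 2 == 1 else p for i, p in enumerate(parts)]
    total = sum(int(p) for p in parts)
    return total % 10 ** part_len        # ignoring the carry = keeping part_len digits

for k in [3205, 7148, 2345]:
    print(k, folding_hash(k), folding_hash(k, reverse_second=True))
```

Dropping the carry with a modulus keeps the result inside the two-digit address range 00 to 99.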

3. Mid-Square Method
• The key k is squared. Then the hash function H is defined as
H(k) = l
• Here l is obtained by deleting digits from both ends of k².
• The same digit positions must be used for all the keys.

• Example:
k:     3205      7148      2345
k²:    10272025  51093904  5499025
H(k):  72        93        99
• The 4th and 5th digits, counted from the right, have been selected.
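
The mid-square example can be reproduced with the sketch below. The function name is an assumption, and the code assumes k² has at least five digits so that the 4th and 5th digits from the right exist.

```python
def mid_square_hash(k: int) -> int:
    """Square k and keep the 4th and 5th digits counted from the right."""
    squared = str(k * k)
    return int(squared[-5:-3])   # characters at positions 5 and 4 from the right

print([mid_square_hash(k) for k in [3205, 7148, 2345]])   # [72, 93, 99]
```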

4. Subtraction Method

5. Digit-Extraction Method
• The 1st, 3rd and 4th digits have been selected and used as an address
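
As a rough sketch of digit extraction, the snippet below picks the 1st, 3rd and 4th digits of a key. The key 379452 is a hypothetical value chosen for illustration (the slide's own example did not survive extraction), and the function name is likewise an assumption.

```python
def digit_extraction_hash(key: int, positions=(1, 3, 4)) -> int:
    """Build the address from the digits at the given 1-based positions."""
    digits = str(key)
    return int("".join(digits[p - 1] for p in positions))

print(digit_extraction_hash(379452))   # 394
```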
6. Rotation Method


Collision Resolution Strategies
• If two keys map to the same hash table index, we have a collision.
• As the number of elements in the table increases, the likelihood of a collision increases, so make the table as large as practical
• Collisions may still happen, so we need a collision resolution strategy

• Two approaches are used to resolve collisions.
(a) Separate chaining: chain together several keys/entries in each position.
(b) Open addressing: store the key/entry in a different position.
• Probing: if the table position given by the hashed key is already occupied, increase the position by some amount until an empty position is found

Separate Chaining
• The idea is to keep a list of all elements that hash to the same value.
– The array elements are pointers to the first nodes of the lists.
– A new item is inserted at the front of the list.
• Advantages:
– Better space utilization for large items.
– Simple collision handling: search the linked list.
– Overflow: we can store more items than the hash table size.
– Deletion is quick and easy: delete from the linked list.

Example
Keys: 0, 1, 4, 9, 16, 25, 36, 49, 64, 81
hash(key) = key % 10

Index   Chain (most recently inserted first)
0       0
1       81 → 1
2       (empty)
3       (empty)
4       64 → 4
5       25
6       36 → 16
7       (empty)
8       (empty)
9       49 → 9
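
The chained table in this example can be rebuilt with a few lines of Python. This is a sketch under one simplifying assumption: each chain is a Python list standing in for a linked list, so inserting at the front is written `insert(0, key)` rather than as a pointer update. The function names are illustrative.

```python
def build_chains(keys, table_size=10):
    """Hash each key with key % table_size and put it at the front of its chain."""
    table = [[] for _ in range(table_size)]
    for key in keys:
        table[key % table_size].insert(0, key)   # new item goes at the front
    return table

def contains(table, key):
    """Search only the one chain that the key hashes to."""
    return key in table[key % len(table)]

table = build_chains([0, 1, 4, 9, 16, 25, 36, 49, 64, 81])
for index, chain in enumerate(table):
    print(index, chain)
```

Because 81 is inserted after 1, it appears first in the chain at index 1, matching the front-insertion rule stated for separate chaining.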

Open Addressing

• Types of open addressing are

1. Linear Probing
2. Quadratic Probing
3. Double Hashing.

1. Linear Probing
• Locations are checked from the hash location k to the end of the table, and the element is placed in the first empty slot
• If the bottom of the table is reached, checking "wraps around" to the start of the table. The modulus operation is used for this purpose
• Thus, if linear probing is used, the search and insert routines must continue down the table until a match or an empty location is found

• Linear probing is guaranteed to find a slot for the insertion if there is still an empty slot in the table.
• A prime table size alone is probably not an appropriate size; the size should also be at least 30% larger than the maximum number of elements ever to be stored in the table.
• If the load factor is greater than 50% - 70%, the time to search for or add a record will increase.

Probe sequence for H(k) = h:
h, h+1, h+2, h+3, …, h+i (each taken mod the table size)

• However, linear probing also tends to promote clustering within the table.
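
Clustering is easy to see in a small experiment. The sketch below assumes a table of size 10 and the hash function key % 10, neither of which comes from the slides; the function name is also illustrative.

```python
def linear_probe_insert(table, key):
    """Insert key using h, h+1, h+2, ... (mod table size); return the slot used."""
    size = len(table)
    home = key % size
    for i in range(size):
        slot = (home + i) % size     # the modulus makes the probe wrap around
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("table is full")

table = [None] * 10
slots = [linear_probe_insert(table, k) for k in [8, 18, 28, 9, 38]]
print(slots)   # [8, 9, 0, 1, 2]
```

Three of the keys hash to slot 8 and one to slot 9, yet together they fill the contiguous run 8, 9, 0, 1, 2. Any later key that hashes into that run has to walk to its end, which makes the run longer still.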

2. Quadratic Probing
• Quadratic probing is a solution to the clustering problem
– Linear probing adds 1, 2, 3, etc. to the original hashed key
– Quadratic probing adds 1², 2², 3², etc. to the original hashed key
• However, whereas linear probing guarantees that all empty positions will be examined if necessary, quadratic probing does not

• If the table size is prime, this will try approximately half the table slots.
• More generally, with quadratic probing, insertion may be impossible if the table is more than half full!

Probe sequence for H(k) = h:
h, h+1, h+4, h+9, h+16, …, h+i² (each taken mod the table size)
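
The "approximately half the slots" claim can be checked numerically. The sketch below lists the slots visited by the quadratic probe sequence; the table sizes 13 and 16 and the starting positions are arbitrary choices for illustration.

```python
def quadratic_probe_slots(h: int, table_size: int):
    """Slots visited by h, h+1, h+4, h+9, ... taken mod table_size."""
    return [(h + i * i) % table_size for i in range(table_size)]

prime_slots = quadratic_probe_slots(5, 13)
print(len(set(prime_slots)))                     # 7 distinct slots out of 13
print(len(set(quadratic_probe_slots(0, 16))))    # 4 distinct slots out of 16
```

With the prime size 13 the sequence reaches 7 of the 13 slots, about half, so an insertion can fail once more than half of the table is occupied. With the non-prime size 16 it reaches only 4 slots.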

3. Double Hashing
• A second hash function H' is used to resolve the collision.
• Suppose H(k) = h and H'(k) = h' ≠ m
• Then we search the locations with addresses
h, h+h', h+2h', h+3h', … (each taken mod m)
• If m is prime, this sequence accesses all the locations.


Double Hashing
• Double hashing uses a secondary hash function d(k) and handles collisions by placing an item in the first available cell of the series
(h + j·d(k)) mod N for j = 0, 1, …, N − 1
where h = h(k) is the primary hash value
• The secondary hash function d(k) cannot have zero values
• The table size N must be a prime to allow probing of all the cells
• Common choice of compression map for the secondary hash function:
d2(k) = q − (k mod q)
where
– q < N
– q is a prime
• The possible values for d2(k) are 1, 2, …, q

Example of Double Hashing
• Consider a hash table storing integer keys that handles collisions with double hashing
– N = 13
– h(k) = k mod 13
– d(k) = k mod 7 (nonzero for every key used here)
• Insert keys 18, 41, 22, 44, 59, 32, 31, 73, in this order

k    h(k)  d(k)  Probes
18   5     4     5
41   2     6     2
22   9     1     9
44   5     2     5, 7
59   7     3     7, 10
32   6     4     6
31   5     3     5, 8
73   8     3     8, 11

Resulting table:
Index:  0  1  2   3  4  5   6   7   8   9   10  11  12
Key:    -  -  41  -  -  18  32  44  31  22  59  73  -
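
The probe sequences in this example can be replayed with the sketch below. It uses N = 13, h(k) = k mod 13 and d(k) = k mod 7 as stated in the example; the function name and the explicit check for a zero step are assumptions added for illustration.

```python
def double_hash_insert(table, key, q=7):
    """Insert key with double hashing; return the list of cells probed."""
    n = len(table)
    start = key % n          # h(k) = k mod N
    step = key % q           # d(k) = k mod 7, as in this example
    if step == 0:
        raise ValueError("d(k) must not be zero")
    probes = []
    for j in range(n):
        slot = (start + j * step) % n
        probes.append(slot)
        if table[slot] is None:
            table[slot] = key
            return probes
    raise RuntimeError("no free cell found")

table = [None] * 13
for key in [18, 41, 22, 44, 59, 32, 31, 73]:
    print(key, double_hash_insert(table, key))
print(table)
```

Keys 44, 59, 31 and 73 each collide once and settle on their second probe (cells 7, 10, 8 and 11), so key 73 ends up in cell 11.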

Building a Hash Table
• The simplest kind of hash table is an array of records.
• This example has 701 records, indexed [0] to [700].
• Each record has a special field, called its key.
• In this example, the key is a long integer field called Number.
• The number might be a person's identification number, and the rest of the record has information about the person.
• Typical way to create a hash value:
(Number mod 701)
• When a hash table is in use, some spots contain valid records, and other spots are "empty".

Index   Record
[0]     (empty)
[1]     Number 281942902
[2]     Number 233667136
[3]     (empty)
[4]     Number 506643548
[5]     (empty)
...
[700]   Number 155778322

Inserting a New Record
• New record: Number 580625685
• Typical way to create a hash value:
(Number mod 701)
• What is (580625685 % 701)? It is 3.
• The hash value is used for the location of the new record, so the record goes in [3].

Index   Record
[0]     (empty)
[1]     Number 281942902
[2]     Number 233667136
[3]     Number 580625685
[4]     Number 506643548
[5]     (empty)
...
[700]   Number 155778322

Collisions
• Here is another new record to insert, Number 701466868, with a hash value of 2.
• This is called a collision, because there is already another valid record at [2].
• When a collision occurs, move forward until you find an empty spot.
• The new record goes in the empty spot: [3] and [4] are also occupied, so it goes in [5].

Index   Record
[0]     (empty)
[1]     Number 281942902
[2]     Number 233667136
[3]     Number 580625685
[4]     Number 506643548
[5]     Number 701466868
...
[700]   Number 155778322
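
The insertion walk-through can be replayed in Python. This is a sketch only: the table stores just the Number field rather than whole records, and the function name `insert` is an assumption.

```python
TABLE_SIZE = 701

def insert(table, number):
    """Place number at (number mod 701), moving forward past occupied spots."""
    index = number % TABLE_SIZE
    for _ in range(TABLE_SIZE):
        if table[index] is None:
            table[index] = number
            return index
        index = (index + 1) % TABLE_SIZE   # wrap from [700] back to [0]
    raise RuntimeError("table is full")

table = [None] * TABLE_SIZE
for number in [281942902, 233667136, 506643548, 155778322]:
    insert(table, number)                  # these land in [1], [2], [4], [700]

print(insert(table, 580625685))            # 3
print(insert(table, 701466868))            # 5 (hash value 2; [2], [3], [4] are taken)
```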

Searching for a Key
• The data that's attached to a key can be found fairly quickly.
• Example: search for Number 701466868.
• Calculate the hash value: it is 2.
• Check that location of the array for the key: the record at [2] is not a match.
• Keep moving forward until you find the key, or you reach an empty spot: [3] and [4] are not matches; [5] is.
• When the item is found, the information can be copied to the necessary location.
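
The search can be sketched as follows. The table is filled in directly with the positions shown in the walk-through; the function name `search` and the returned list of examined spots are assumptions added to make the probe order visible.

```python
TABLE_SIZE = 701

def search(table, number):
    """Return (index or None, list of spots examined)."""
    index = number % TABLE_SIZE
    examined = []
    for _ in range(TABLE_SIZE):
        examined.append(index)
        if table[index] is None:          # empty spot: the key is not present
            return None, examined
        if table[index] == number:
            return index, examined
        index = (index + 1) % TABLE_SIZE
    return None, examined

table = [None] * TABLE_SIZE
table[1], table[2], table[3] = 281942902, 233667136, 580625685
table[4], table[5], table[700] = 506643548, 701466868, 155778322

print(search(table, 701466868))   # (5, [2, 3, 4, 5])
print(search(table, 703))         # (None, [2, 3, 4, 5, 6])
```

The second call looks for a key that is not in the table: it also hashes to 2, and the search stops at the first empty spot, [6].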

Deleting a Record
• Records may also be deleted from a hash table. In this example the record at [4], Number 506643548, is deleted.
• But the location must not be left as an ordinary "empty spot" since that could interfere with searches.
• The location must be marked in some special way so that a search can tell that the spot used to have something in it.

Index   Record
[0]     (empty)
[1]     Number 281942902
[2]     Number 233667136
[3]     Number 580625685
[4]     (deleted; marked as previously used)
[5]     Number 701466868
...
[700]   Number 155778322
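
One common way to mark a deleted spot is a sentinel value, sketched below. The names `EMPTY`, `DELETED`, `find` and `delete` are assumptions for this illustration; the slides do not say how the mark is represented.

```python
TABLE_SIZE = 701
EMPTY = None
DELETED = object()   # special marker: "this spot used to have something in it"

def find(table, number):
    """Linear-probing search that stops only at a truly empty spot."""
    index = number % TABLE_SIZE
    for _ in range(TABLE_SIZE):
        entry = table[index]
        if entry is EMPTY:
            return None
        if entry is not DELETED and entry == number:
            return index
        index = (index + 1) % TABLE_SIZE   # step over records and DELETED marks
    return None

def delete(table, number):
    index = find(table, number)
    if index is not None:
        table[index] = DELETED

table = [EMPTY] * TABLE_SIZE
table[1], table[2], table[3] = 281942902, 233667136, 580625685
table[4], table[5], table[700] = 506643548, 701466868, 155778322

naive = list(table)
naive[4] = EMPTY                           # wrong: leaves an ordinary empty spot
print(find(naive, 701466868))              # None, the record at [5] is now unreachable

delete(table, 506643548)                   # right: leaves a DELETED mark at [4]
print(find(table, 701466868))              # 5
```

The naive version shows the interference described above: the search for 701466868 starts at [2], reaches the emptied spot [4], and gives up before it gets to [5].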

Applications of Hashing
• Compilers use hash tables to keep track of declared variables
• A hash table can be used for on-line spelling checkers: if misspelling detection (rather than correction) is important, an entire dictionary can be hashed and words checked in constant time
• Game-playing programs use hash tables to store seen positions, thereby saving computation time if the position is encountered again
• Hash functions can be used to quickly check for inequality: if two elements hash to different values, they must be different
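
Two of these applications can be illustrated briefly in Python. The word list is a tiny made-up stand-in for a real dictionary, the function names are assumptions, and SHA-256 from the standard `hashlib` module is just one possible hash function for the inequality check.

```python
import hashlib

# Python's built-in set is a hash table, so membership tests take
# constant time on average.
dictionary = {"hash", "table", "collision", "probe"}

def is_misspelled(word: str) -> bool:
    """Detect (not correct) a misspelling by a hashed dictionary lookup."""
    return word.lower() not in dictionary

def definitely_different(a: bytes, b: bytes) -> bool:
    """True means a != b for certain; False means a full comparison is still needed."""
    return hashlib.sha256(a).digest() != hashlib.sha256(b).digest()

print(is_misspelled("colision"))             # True
print(definitely_different(b"abc", b"abd"))  # True
```

The inequality check works in one direction only: different hash values prove the inputs differ, while equal hash values do not prove they are the same.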
