Tutorial 3 - Part1


Hash Maps

Tutorial 3
Ahmed Fahmy
ECE 250 @uWaterloo
Motivation

• Look-ups for key-value pairs
• For example:
• Does an item (key) exist (value) in the data structure?
• Given a student name (key), would they pass ECE250 (value)?
• Complexity
• Linked lists
• Trees
• Vectors...?

Student (key)   Pass (value)
Ahmed           1
Lulu            0
Maaz            0
John            1
Gloria          1

ADT: Dictionary
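As a minimal sketch of the dictionary ADT, the student table above can be stored in the standard library's hash map. The function name `makeResults` is hypothetical and only serves this illustration.

```cpp
#include <string>
#include <unordered_map>

// Builds the name -> pass (1) / fail (0) table from the slide.
std::unordered_map<std::string, int> makeResults() {
    return {
        {"Ahmed", 1}, {"Lulu", 0}, {"Maaz", 0}, {"John", 1}, {"Gloria", 1}
    };
}

// Look-up by key: returns true only if the student exists and passes.
bool passes(const std::unordered_map<std::string, int>& results,
            const std::string& name) {
    auto it = results.find(name);          // average O(1) look-up
    return it != results.end() && it->second == 1;
}
```

For example, `passes(makeResults(), "Ahmed")` is true, while a name that is not in the table yields false.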
Motivation

• Using vectors for lookups
• If keys are integers
• Space complexity depends on the key range!
• A quick solution would be mapping each key to an index in a smaller range
• Problem: collisions!
• Later...
• If keys are objects:
• Strings
• User-defined class
• Solution: transform the object into some integer

[Figure: a vector with cells indexed 0 to 4]
Hashing

• Hash: give each object an unsigned int (hash) value, ideally a different one for each distinct object.
• Requirements:
• Fast
• An object will always have the same hash value
• Uniform probability for a collision (very important)
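As one concrete illustration of these requirements, here is a sketch of the 32-bit FNV-1a string hash. It is chosen only as a well-known example, not as the hash the course prescribes. It does one XOR and one multiply per byte (fast) and depends only on the bytes of the string (same input, same value).

```cpp
#include <cstdint>
#include <string>

// 32-bit FNV-1a hash of a string.
std::uint32_t fnv1a(const std::string& s) {
    std::uint32_t h = 2166136261u;   // FNV offset basis
    for (unsigned char c : s) {
        h ^= c;                      // mix in the next byte
        h *= 16777619u;              // FNV prime; wraps modulo 2^32
    }
    return h;
}
```

The returned value still has to be reduced to a table index, which is a separate step.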
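Mapping

• We can use the modulus: index = hash % size
• A bit slow operation!
• Solution: make size a power of two (size = 2^m), so the modulus becomes a bit mask: index = hash & (size - 1)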
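The sketch below compares two ways of mapping a hash value to a cell index. It assumes the slow operation meant here is the modulus and that the fix is a power-of-two table size, as the `(1 << m) - 1` mask in the later code suggests. The function names are hypothetical.

```cpp
#include <cassert>

// General mapping: works for any size > 0, but needs an integer division.
unsigned int indexByModulus(unsigned int h, unsigned int size) {
    assert(size > 0);
    return h % size;
}

// Power-of-two mapping: size must equal 1 << m for some m.
unsigned int indexByMask(unsigned int h, unsigned int size) {
    assert(size != 0 && (size & (size - 1)) == 0);  // exactly one bit set
    return h & (size - 1);                          // keep the low m bits
}
```

For a power-of-two size both functions return the same index, so the mask can replace the modulus without changing where keys land.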
Collision
unsigned int hash(type obj, unsigned int size) {
    // size must be a power of two (size == 1 << m); the mask keeps the low m bits
    return obj.hash() & (size - 1);
}

• Insert
• 4, 10, 33, 2
• With 8 cells, 2 maps to cell 2, which already holds 10: a collision
• Chaining
• Open-addressing

Index   0   1   2   3   4   5   6   7
Key     -   33  10  -   4   -   -   -
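Chaining can be sketched as a vector of linked lists, one list per cell. The struct below is a hypothetical illustration for non-negative `int` keys that uses the key itself as its hash value, which is an assumption made to match the 4, 10, 33, 2 example.

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Chained hash table for int keys; size must be a power of two.
struct ChainedTable {
    std::vector<std::list<int>> buckets;

    explicit ChainedTable(std::size_t size) : buckets(size) {}

    // The key is its own hash value; the mask keeps the low bits.
    std::size_t index(int key) const {
        return static_cast<std::size_t>(key) & (buckets.size() - 1);
    }

    // Appends the key to its bucket's chain unless it is already stored.
    void insert(int key) {
        if (!contains(key)) buckets[index(key)].push_back(key);
    }

    // Walks only the chain of the key's bucket.
    bool contains(int key) const {
        for (int k : buckets[index(key)])
            if (k == key) return true;
        return false;
    }
};
```

Inserting 4, 10, 33, 2 into a table of 8 buckets leaves 10 and 2 in the same chain at bucket 2.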
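Linear-Probing
unsigned int hash(type obj, unsigned int size) {
    // size must be a power of two (size == 1 << m); the mask keeps the low m bits
    return obj.hash() & (size - 1);
}

• Insert
• 4, 10, 33, 2
• Check next location
• 2 maps to cell 2, which holds 10, so it moves on to cell 3
• Search
• Stop when the key is found, an empty cell is reached, or every cell has been checked (table full)

Table before inserting 2:
Index   0   1   2   3   4   5   6   7
Key     -   33  10  -   4   -   -   -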
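The insert and search rules above can be sketched as follows. The struct is a hypothetical illustration for non-negative `int` keys that uses the key itself as its hash value, and it leaves out deletion, which would need extra markers in the cells.

```cpp
#include <cstddef>
#include <vector>

// Open-addressing table with linear probing; size must be a power of two.
struct LinearProbingTable {
    std::vector<int> keys;    // stored keys, meaningful only where used[i] is true
    std::vector<bool> used;   // occupancy flag per cell

    explicit LinearProbingTable(std::size_t size) : keys(size, 0), used(size, false) {}

    std::size_t home(int key) const {
        return static_cast<std::size_t>(key) & (keys.size() - 1);
    }

    // Returns false when every cell is occupied by another key.
    bool insert(int key) {
        std::size_t i = home(key);
        for (std::size_t n = 0; n < keys.size(); ++n) {
            if (!used[i] || keys[i] == key) {
                keys[i] = key;
                used[i] = true;
                return true;
            }
            i = (i + 1) & (keys.size() - 1);   // check next location, wrapping
        }
        return false;
    }

    // Stops at an empty cell or after every cell has been checked.
    bool contains(int key) const {
        std::size_t i = home(key);
        for (std::size_t n = 0; n < keys.size(); ++n) {
            if (!used[i]) return false;        // empty: key cannot be further on
            if (keys[i] == key) return true;
            i = (i + 1) & (keys.size() - 1);
        }
        return false;                          // table full, key absent
    }
};
```

With 8 cells and the keys 4, 10, 33, 2, the key 2 ends up in cell 3, directly after 10.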
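Double Hashing

• Often needs the fewest probes among open-addressing schemes, because colliding keys follow different probe sequences
• Hash again to get the next cell index: the i-th probe is (h1(key) + i * h2(key)) mod size
• Different hash functions for the initial value and jump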
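A minimal sketch of a double-hashing probe sequence, assuming a power-of-two table size and the common form "start plus i times jump". Both hash functions below are hypothetical choices made for illustration. The jump is forced to be odd so that it is coprime with the table size and the sequence visits every cell.

```cpp
#include <cstddef>
#include <vector>

// Returns the cells probed for a key, in order; size must be a power of two.
std::vector<std::size_t> probeSequence(unsigned int key, std::size_t size) {
    std::size_t mask = size - 1;
    std::size_t start = key & mask;               // first hash: low bits of the key
    std::size_t jump = ((key >> 2) & mask) | 1;   // second hash: other bits, forced odd
    std::vector<std::size_t> seq;
    for (std::size_t i = 0; i < size; ++i)
        seq.push_back((start + i * jump) & mask); // i-th probe
    return seq;
}
```

With 8 cells, the keys 10 and 2 both start at cell 2 but then diverge: 10 jumps by 3 and 2 jumps by 1, so they do not compete for the same run of cells.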


Quality of Hashing
• How can we assess the quality of a hash function?
• Load factor: expected number of keys to have the same hash value (stored keys divided by number of cells)
• Another way to define it:
• How many times we probe "on average" to find an item?
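The "average number of probes" measure can be made concrete with a small experiment. The sketch below assumes linear probing, distinct keys used as their own hash values, a power-of-two table size, and no deletions. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Inserts the keys with linear probing and returns the average number of
// cells probed to find a stored key.
double averageProbes(const std::vector<unsigned int>& keys, std::size_t size) {
    assert(!keys.empty() && keys.size() <= size);
    std::vector<bool> used(size, false);
    std::size_t totalProbes = 0;
    for (unsigned int key : keys) {
        std::size_t i = key & (size - 1);
        std::size_t probes = 1;
        while (used[i]) {                 // occupied: move to the next cell
            i = (i + 1) & (size - 1);
            ++probes;
        }
        used[i] = true;
        totalProbes += probes;            // a later search repeats these probes
    }
    return static_cast<double>(totalProbes) / static_cast<double>(keys.size());
}
```

For the keys 4, 10, 33, 2 in 8 cells the load factor is 4 / 8 = 0.5, and the average is (1 + 1 + 1 + 2) / 4 = 1.25 probes, because only the key 2 needs a second probe.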
Quality of Hashing
• Let us have an experiment:
• Pick a hash function
• Insert random numeric strings into a hash map
• Draw the hash map as a picture:
• Each pixel is a cell
• Colored if cell is occupied
• White if cell is empty
• Hash functions pictured [1]: SDBM, DJB2a, FNV-1, FNV-1a, Murmur2
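The experiment can be approximated in text form. The sketch below is not the code behind the pictures in [1]. It assumes 32-bit FNV-1a as the hash function, a power-of-two number of cells, and a seeded random generator, and it prints one character per cell instead of one pixel. All names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>

// 32-bit FNV-1a, used here only as an example hash function.
std::uint32_t fnv1aHash(const std::string& s) {
    std::uint32_t h = 2166136261u;
    for (unsigned char c : s) {
        h ^= c;
        h *= 16777619u;
    }
    return h;
}

// Returns one character per cell: '#' if occupied, '.' if empty.
// cells must be a power of two; seed makes the run repeatable.
std::string occupancyPicture(std::size_t cells, std::size_t count, unsigned int seed) {
    std::string picture(cells, '.');
    std::mt19937 rng(seed);
    std::uniform_int_distribution<unsigned int> digits(0, 999999999u);
    for (std::size_t n = 0; n < count; ++n) {
        std::string key = std::to_string(digits(rng));   // random numeric string
        picture[fnv1aHash(key) & (cells - 1)] = '#';     // mark the key's cell
    }
    return picture;
}
```

Printing the returned string in rows of fixed width gives a rough picture: a good hash function should scatter the `#` marks without visible stripes or clumps.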
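Problem Solving

• Remove Duplicates from (unsorted) vector
• Complexity: O(n) expected time, O(n) extra space

#include <unordered_set>
#include <vector>
using namespace std;

void removeDubFast(vector<int>& v){ // un/sorted vector v
    unordered_set<int> m;           // keeps one copy of each value
    for (int i:v)
        m.insert(i);                // duplicates are ignored by the set
    v.clear();
    for (int i:m)                   // note: the original order is not preserved
        v.push_back(i);
}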
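If the original order of the elements matters, a variant of the same idea can keep the first occurrence of each value in place. This is a hedged sketch and not part of the tutorial's solution. The name `removeDupStable` is hypothetical.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Removes duplicates while keeping the first occurrence of each value in order.
void removeDupStable(std::vector<int>& v) {
    std::unordered_set<int> seen;
    std::size_t out = 0;                 // next write position, never ahead of the read position
    for (int x : v)
        if (seen.insert(x).second)       // .second is true only for a new value
            v[out++] = x;                // compact in place
    v.resize(out);                       // drop the leftover tail
}
```

For example, the vector 3, 1, 3, 2, 1 becomes 3, 1, 2. The expected running time stays linear in the size of the vector.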
Real Performance
[Figure: performance plot, x-axis from 0 to 966000, y-axis from 0 to 450000000]
Thank You
References

• [1] https://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed
