Hash Tables: Map Dictionary Key "Address."
Key     Definition
am      First person singular number indicative of be; also, before noon
…       …
More applications of maps
(University record)
Key = student id.
Value = information about the student
(Social media)
Key = user name / email id.
Value = user information.
hash function
Hash codes
[Figure: a hash function h maps a key K to h(key).]
Java relies on 32-bit hash codes, so for the base types
byte, short, int, and char, we can get a hash code by
casting the value to the type int
and using the integer representation.
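As a minimal sketch of the casting idea above (the class name BaseTypeHash and the method name hash are hypothetical, chosen only for illustration), each small base type can simply be widened to int:

```java
// Sketch: hash codes for the small base types, obtained by widening to int.
// BaseTypeHash and hash(...) are illustrative names, not part of any library.
class BaseTypeHash {
    static int hash(byte x)  { return x; }  // byte widens to int implicitly
    static int hash(short x) { return x; }  // short widens to int implicitly
    static int hash(char x)  { return x; }  // char becomes its UTF-16 code unit value
    static int hash(int x)   { return x; }  // an int is already a 32-bit hash code

    public static void main(String[] args) {
        System.out.println(hash('a'));          // prints 97
        System.out.println(hash((byte) -3));    // prints -3
        System.out.println(hash((short) 1234)); // prints 1234
    }
}
```

Note that the result can be negative, so a hash code still has to be compressed into a valid array index before it is used.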
For a string object s = s[0] s[1] … s[n-1], the following
method is used to compute the hash code h(s):
Compression functions
Let us get back to the dictionary of all 2-letter words.
There are 26 × 26 = 676 possible keys, but perhaps only 70
meaningful words. Let us convert each 2-letter
word xy to an integer i as follows (using f(a) = 0, f(b) =
1, f(c) = 2, …, f(z) = 25):
i = 26 · f(x) + f(y)
Division method. If we plan on storing these words
in an array of size 100 (i.e. N = 100), then a simple
compression function is mod 100 (if N = bucket array
size, then use mod N). Thus, the integer i will be placed
in location i mod N of the hash table.
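The two steps above (hash code, then compression by the division method) can be sketched as follows. The class and method names are hypothetical, and the sketch assumes the word consists of exactly two lowercase letters a to z:

```java
// Sketch: hash code and division-method compression for 2-letter words.
// TwoLetterHash, code and compress are illustrative names only.
class TwoLetterHash {
    // i = 26 * f(x) + f(y), with f(a) = 0, ..., f(z) = 25.
    // Assumes word has exactly two lowercase letters.
    static int code(String word) {
        int fx = word.charAt(0) - 'a';
        int fy = word.charAt(1) - 'a';
        return 26 * fx + fy;
    }

    // Division method: place integer i at location i mod N.
    static int compress(int i, int n) {
        return i % n;
    }

    public static void main(String[] args) {
        int n = 100; // bucket array size N
        for (String w : new String[] {"am", "of", "to"}) {
            int i = code(w);
            System.out.println(w + " -> " + i + " -> " + compress(i, n));
        }
        // prints:
        // am -> 12 -> 12
        // of -> 369 -> 69
        // to -> 508 -> 8
    }
}
```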
insert(key, value)
Compute the key's hash code.
Compress it to determine the entry's bucket.
Insert the entry (key and value together)
into that bucket (and deal with collision)
find(key)
Hash the key to determine its bucket.
Search the entry with the given key.
If found, return the entry, else, return null.
delete(key)
Hash the key to determine its bucket.
Search the list for an entry with the given key.
Remove it from the list if found.
Return the entry or null if not found.
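The three operations above can be sketched in Java as below. This is one possible reading, not a definitive implementation: it assumes each bucket is a linked list (separate chaining), uses String keys and values for simplicity, and overwrites the value when a key is inserted twice. The names ChainedHashTable, Entry and bucketOf are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Sketch of insert / find / delete with one linked list per bucket.
class ChainedHashTable {
    static class Entry {
        final String key;
        String value;
        Entry(String key, String value) { this.key = key; this.value = value; }
    }

    private final List<LinkedList<Entry>> buckets = new ArrayList<>();

    ChainedHashTable(int n) {
        for (int i = 0; i < n; i++) buckets.add(new LinkedList<>());
    }

    // Hash code, then compression; the mask keeps the index non-negative.
    private LinkedList<Entry> bucketOf(String key) {
        int index = (key.hashCode() & 0x7FFFFFFF) % buckets.size();
        return buckets.get(index);
    }

    void insert(String key, String value) {
        Entry existing = find(key);
        if (existing != null) { existing.value = value; return; } // assumption: overwrite
        bucketOf(key).add(new Entry(key, value));
    }

    // Returns the entry with the given key, or null if there is none.
    Entry find(String key) {
        for (Entry e : bucketOf(key)) {
            if (e.key.equals(key)) return e;
        }
        return null;
    }

    // Removes and returns the entry with the given key, or null if not found.
    Entry delete(String key) {
        Entry e = find(key);
        if (e != null) bucketOf(key).remove(e);
        return e;
    }

    public static void main(String[] args) {
        ChainedHashTable table = new ChainedHashTable(5);
        table.insert("am", "before noon");
        System.out.println(table.find("am").value);  // prints before noon
        table.delete("am");
        System.out.println(table.find("am"));        // prints null
    }
}
```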
Collision avoidance
Hashing with separate chaining
(Also called Chain hashing)
[Figure: a key is passed through the hash function to one of the buckets 0 to 10.]
Quadratic Probing
In case of a collision at slot i, try looking for slots i + 1²,
i + 2², i + 3², … until a free slot is found.
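A minimal sketch of insertion with quadratic probing follows. It assumes integer keys, an open-addressing table where null marks a free slot, and probe positions that wrap around modulo the table size (the wrap-around and the give-up-after-n-probes rule are assumptions of this sketch). The class and method names are hypothetical.

```java
// Sketch: insertion with quadratic probing into an open-addressing table.
class QuadraticProbing {
    // Stores key in the first free slot among home, home + 1^2, home + 2^2, ...
    // (taken modulo the table size). Returns the slot used, or -1 if no free
    // slot was found within table.length probes.
    static int insert(Integer[] table, int key) {
        int n = table.length;
        int home = Math.floorMod(key, n);  // compression: key mod N, never negative
        for (int j = 0; j < n; j++) {
            int slot = (home + j * j) % n;
            if (table[slot] == null) {
                table[slot] = key;
                return slot;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Integer[] table = new Integer[11];
        System.out.println(insert(table, 6));   // prints 6  (home slot is free)
        System.out.println(insert(table, 17));  // prints 7  (6 taken, try 6 + 1)
        System.out.println(insert(table, 28));  // prints 10 (6, 7 taken, try 6 + 4)
        System.out.println(insert(table, 39));  // prints 4  (try (6 + 9) mod 11)
    }
}
```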
Typically we expect O(1) time performance for each of the
operations. This may not be feasible if the load factor n/N is
large (> 0.75) or there are too many collisions. The
performance can slowly degenerate towards O(n).
One option is to enlarge the hash table when the load factor
becomes too large. Allocate a new array (typically at least
twice as long as the old), and then walk through all the entries
in the old array and “rehash” them into the new.
[Note: you can't just copy the entries of the old array into the
slots of the new array, because the compression functions of the
two arrays will be different. You have to rehash each entry
individually.]
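The need to rehash each entry can be seen in a small sketch. It assumes integer keys stored in chained buckets and a compression function of key mod N; the names RehashDemo, newTable, insert and rehash are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: enlarging a chained hash table by rehashing every entry.
class RehashDemo {
    static List<List<Integer>> newTable(int n) {
        List<List<Integer>> table = new ArrayList<>();
        for (int i = 0; i < n; i++) table.add(new ArrayList<>());
        return table;
    }

    // Compression function depends on the table size: key mod N.
    static void insert(List<List<Integer>> table, int key) {
        table.get(Math.floorMod(key, table.size())).add(key);
    }

    // Allocates a table twice as long and re-inserts each key individually.
    static List<List<Integer>> rehash(List<List<Integer>> old) {
        List<List<Integer>> bigger = newTable(2 * old.size());
        for (List<Integer> bucket : old) {
            for (int key : bucket) insert(bigger, key);
        }
        return bigger;
    }

    public static void main(String[] args) {
        List<List<Integer>> table = newTable(5);
        insert(table, 7);
        insert(table, 12);
        insert(table, 22);
        System.out.println(table);          // [[], [], [7, 12, 22], [], []]
        System.out.println(rehash(table));  // [[], [], [12, 22], [], [], [], [], [7], [], []]
    }
}
```

With N = 5 all three keys share bucket 2, but with N = 10 key 7 belongs in bucket 7, so copying bucket 2 across unchanged would leave 7 where find could no longer locate it.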
[Figure: two table diagrams with slots 6 and 10 marked, showing entries Y, Z, W, T and the placement of key X, which hashes to index 6.]
// N is the bucket array size; s is the string key.
int hash = 0;
for (int i = 0; i < s.length(); i++)
    hash = (31 * hash + s.charAt(i)) % N;
y_0 = s[0]
y_1 = 31*y_0 + s[1]
y_2 = 31*y_1 + s[2]
y_3 = 31*y_2 + s[3]
…
h(s) = y_(n-1) = 31*y_(n-2) + s[n-1]
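The recurrence above can be sketched as a short loop. Starting the accumulator at 0 gives the same y_0 = s[0] after the first step, and int arithmetic wraps around at 32 bits. The class name StringHash and method name h are hypothetical; the comparison with String.hashCode() relies on the formula given in its documentation.

```java
// Sketch: the polynomial string hash y_i = 31 * y_(i-1) + s[i].
class StringHash {
    static int h(String s) {
        int y = 0;                      // 31 * 0 + s[0] = s[0], so y_0 = s[0]
        for (int i = 0; i < s.length(); i++) {
            y = 31 * y + s.charAt(i);   // overflow wraps around in 32-bit int
        }
        return y;
    }

    public static void main(String[] args) {
        System.out.println(h("am"));                    // prints 3116 (97 * 31 + 109)
        System.out.println(h("am") == "am".hashCode()); // prints true
    }
}
```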
Example. Consider the following keys to be entered
into a hash table, first into a table of size 5.
It masks the sign bit and converts it into a positive integer.
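The expression this remark refers to is not shown here; a common Java idiom with this effect, given as an assumption, is to AND the hash code with 0x7FFFFFFF before taking the remainder. The class name SignMask and method name bucket are hypothetical.

```java
// Sketch: clearing the sign bit so the compressed index is never negative.
class SignMask {
    static int bucket(Object key, int n) {
        // 0x7FFFFFFF has every bit set except the sign bit.
        return (key.hashCode() & 0x7FFFFFFF) % n;
    }

    public static void main(String[] args) {
        Integer key = -7;                          // Integer's hash code is its value
        System.out.println(key.hashCode() % 5);    // prints -2, not a valid index
        System.out.println(bucket(key, 5));        // prints 1
    }
}
```

Without the mask, Java's % operator returns a negative remainder for a negative hash code, which cannot be used as an array index.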
Quadratic probing is slow since it involves multiplication.
Here is how you can speed it up to calculate the next index.
Initially k = -1
k = k + 2
index = (index + k) % hashTableSize
Why does it work?
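One way to see it: consecutive squares differ by consecutive odd numbers, since j² − (j−1)² = 2j − 1, so adding 1, 3, 5, … to the index in turn reaches i + 1², i + 2², i + 3², … without any multiplication. The sketch below checks this numerically; the class and method names are hypothetical, and both versions wrap around modulo the table size.

```java
import java.util.Arrays;

// Sketch: incremental quadratic probing versus the direct formula.
class IncrementalProbe {
    // Probe sequence using only additions: k runs through 1, 3, 5, ...
    static int[] incremental(int home, int hashTableSize, int count) {
        int[] out = new int[count];
        int index = home;
        int k = -1;
        for (int j = 0; j < count; j++) {
            k = k + 2;
            index = (index + k) % hashTableSize;
            out[j] = index;
        }
        return out;
    }

    // Probe sequence using the direct formula (home + j^2) mod size.
    static int[] direct(int home, int hashTableSize, int count) {
        int[] out = new int[count];
        for (int j = 1; j <= count; j++) {
            out[j - 1] = (home + j * j) % hashTableSize;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(incremental(6, 11, 4))); // [7, 10, 4, 0]
        System.out.println(Arrays.toString(direct(6, 11, 4)));      // [7, 10, 4, 0]
    }
}
```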