Cuckoo Hashing: Hardware Implementations: Adam Kirsch Michael Mitzenmacher


Motivation
• Hash tables are ubiquitous.
• Highly useful in router hardware.
– Measurement and monitoring tasks.
• Desiderata:
– Few (parallel) memory accesses.
– High space utilization.
– Low failure probability.
– Hardware-level simplicity.
• What are good hash table designs for hardware?
State of the Art: Multiple Choice Hashing
• Each element is placed in the least loaded of its d locations. (With 1 element per cell, look for 1 empty cell out of d.)
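A minimal sketch of the placement rule above, assuming one candidate bucket per subtable. The helper names (`choices`, `insert_least_loaded`), the BLAKE2-based hash, and the tie-breaking rule are illustrative assumptions, not details from the slides.

```python
import hashlib

def choices(key, d, buckets_per_subtable):
    """Return one candidate bucket index per subtable (d choices in total)."""
    out = []
    for i in range(d):
        # Assumption: salting one hash with the subtable number gives d hash functions.
        digest = hashlib.blake2b(f"{i}:{key}".encode(), digest_size=8).digest()
        out.append(int.from_bytes(digest, "big") % buckets_per_subtable)
    return out

def insert_least_loaded(subtables, key, capacity=1):
    """Place key in the least loaded of its d candidate buckets.

    Returns True on success, False if every candidate bucket is full.
    """
    d = len(subtables)
    idx = choices(key, d, len(subtables[0]))
    # Ties go to the lowest-numbered subtable (min returns the first minimum).
    best = min(range(d), key=lambda i: len(subtables[i][idx[i]]))
    bucket = subtables[best][idx[best]]
    if len(bucket) >= capacity:
        return False
    bucket.append(key)
    return True
```

With `capacity=1` this reduces to the parenthetical case: the insert succeeds exactly when at least one of the d candidate cells is empty.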
Cuckoo Hashing and Moves
• Cuckoo hashing paradigm: give each
element d choices, and move elements
among choices as needed.
Original Cuckoo Hashing
• 2 subtables, left and right. Each element gets one
location per subtable.
• Place new element in left subtable.
– If an element is already there, kick it out and move it to its location in the right subtable.
– If an element is already there, kick it out and move it to its location in the left subtable…
– Repeat until everything is placed.
• Works with high probability as long as load is less
than ½.
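The insertion loop on this slide can be sketched as follows. This is a hedged illustration: the hash helper `h`, the `max_moves` cutoff, and the use of `None` for an empty slot are assumptions added for the example (keys must therefore not be `None`, and duplicate inserts are not handled).

```python
import hashlib

def h(key, side, size):
    """Slot of key in subtable `side` (0 = left, 1 = right). Illustrative hash."""
    digest = hashlib.blake2b(f"{side}:{key}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % size

def cuckoo_insert(tables, key, max_moves=100):
    """Insert key into tables = [left, right]; each is a list of slots.

    Returns None on success, or the key left without a home after
    max_moves placement attempts.
    """
    side = 0  # new elements start in the left subtable
    for _ in range(max_moves):
        slot = h(key, side, len(tables[side]))
        evicted = tables[side][slot]
        tables[side][slot] = key       # place the key, kicking out any occupant
        if evicted is None:
            return None                # the slot was empty: done
        key = evicted                  # the evicted element moves next
        side = 1 - side                # ...to its slot in the other subtable
    return key

def cuckoo_lookup(tables, key):
    """A key can only live in one of its two slots, so lookup probes both."""
    return any(tables[s][h(key, s, len(tables[s]))] == key for s in (0, 1))
```

The unbounded chain of evictions in the loop is what later slides replace with a bounded amount of work per insertion.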
Better Cuckoo Hashing
• More choices
• More elements per bucket
• Generally kick out a random item.
• Such schemes are not fully analyzed.
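One way to read the generalization above is as a random-walk insertion over d subtables with multi-element buckets. The sketch below is an assumption-laden illustration: the names, the hash, the `max_moves` cutoff, and the choice to pick the victim uniformly from a uniformly chosen candidate bucket are not taken from the slides.

```python
import hashlib
import random

def candidates(key, d, size):
    """One candidate bucket index per subtable. Illustrative hash."""
    return [
        int.from_bytes(
            hashlib.blake2b(f"{i}:{key}".encode(), digest_size=8).digest(), "big"
        ) % size
        for i in range(d)
    ]

def insert_random_walk(subtables, key, capacity=1, max_moves=50, rng=random):
    """Insert key, kicking out a random item whenever all d buckets are full.

    Returns None on success, or the key still unplaced after max_moves attempts.
    """
    d = len(subtables)
    for _ in range(max_moves):
        idx = candidates(key, d, len(subtables[0]))
        for i in range(d):
            bucket = subtables[i][idx[i]]
            if len(bucket) < capacity:
                bucket.append(key)
                return None
        # Every candidate bucket is full: swap the key with a random occupant.
        i = rng.randrange(d)
        bucket = subtables[i][idx[i]]
        j = rng.randrange(len(bucket))
        key, bucket[j] = bucket[j], key
    return key
```

This simple version does not stop an evicted item from immediately kicking out the item that displaced it; practical variants often exclude the bucket the item just left.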
What’s Wrong with Cuckoo Hashing?
• Lots of moves per insert in worst case.
– Average is constant.
– But maximum is Ω(log n) with non-trivial (inverse-polynomial) probability.
• Router hardware settings: may need
bounded number of memory accesses per
insert.
Moves Needed per Insertion
The Power of One Move
• Previous work (submitted): How much gain from
allowing just one move?
• Framework: allow small content-addressable
memory (CAM) to handle unsolvable collisions
[max 0.2%].
• Multiple schemes analyzed.
• With 4 choices, insertions only (no deletions),
factor of 2 or larger improvement in space.
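The slides do not say which one-move schemes were analyzed, so the following is only one plausible instance of the framework, written under stated assumptions: try the d candidate cells, then try to relocate a single occupant to one of its own alternatives, and otherwise fall back to a small CAM modelled as a Python set. All names and the scan order are hypothetical.

```python
import hashlib

def candidates(key, d, size):
    """One candidate cell index per subtable. Illustrative hash."""
    return [
        int.from_bytes(
            hashlib.blake2b(f"{i}:{key}".encode(), digest_size=8).digest(), "big"
        ) % size
        for i in range(d)
    ]

def insert_one_move(subtables, cam, key, cam_capacity):
    """Insert with at most one move. Returns 'placed', 'moved', 'cam' or 'overflow'."""
    d, size = len(subtables), len(subtables[0])
    idx = candidates(key, d, size)
    # 1. Use an empty candidate cell if there is one.
    for i in range(d):
        if subtables[i][idx[i]] is None:
            subtables[i][idx[i]] = key
            return "placed"
    # 2. Otherwise try to move exactly one occupant to one of its other choices.
    for i in range(d):
        occupant = subtables[i][idx[i]]
        alt = candidates(occupant, d, size)
        for j in range(d):
            if j != i and subtables[j][alt[j]] is None:
                subtables[j][alt[j]] = occupant
                subtables[i][idx[i]] = key
                return "moved"
    # 3. Unsolvable with one move: hand the key to the CAM if it has room.
    if len(cam) < cam_capacity:
        cam.add(key)
        return "cam"
    return "overflow"
```

A lookup in this sketch would probe the d candidate cells and the CAM together.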
Pros/Cons of One Move Systems
• Pros
– Simple to implement
– Efficient
– High space utilization for insertion-only
– Analyzable and optimizable
• Cons
– Performance suffers in settings with churn
– Better space utilization possible with more moves
The New Idea
• Use the CAM as a queue for move operations.
• Lookup: check the hash table and the CAM-queue.
• Try move operations from queue as available.
– Move attempt = 1 parallel memory lookup.
• De-amortization
– Use queue to make worst-case performance same
as average-case performance.
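A minimal sketch of the queue idea, assuming single-element cells, a random victim on a failed attempt, and an unbounded FIFO standing in for the CAM. The class name, the hash, and these policies are illustrative assumptions. The point is that `insert` only enqueues, each `step` costs one parallel probe of d cells, and `lookup` consults both the table and the queue.

```python
import hashlib
import random
from collections import deque

class QueuedCuckooTable:
    """d-choice cuckoo table whose pending items wait in a queue (the CAM stand-in)."""

    def __init__(self, d=4, size=16, seed=0):
        self.d, self.size = d, size
        self.subtables = [[None] * size for _ in range(d)]
        self.queue = deque()            # assumption: unbounded; a real CAM is small
        self.rng = random.Random(seed)

    def _slot(self, key, i):
        digest = hashlib.blake2b(f"{i}:{key}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.size

    def lookup(self, key):
        """Check the d candidate cells and the queue."""
        in_table = any(self.subtables[i][self._slot(key, i)] == key
                       for i in range(self.d))
        return in_table or key in self.queue

    def insert(self, key):
        """Constant work: the key just joins the queue."""
        self.queue.append(key)

    def step(self):
        """One move attempt. Returns True if an item settled into an empty cell."""
        if not self.queue:
            return False
        key = self.queue.popleft()
        slots = [self._slot(key, i) for i in range(self.d)]
        for i, s in enumerate(slots):
            if self.subtables[i][s] is None:
                self.subtables[i][s] = key
                return True
        # All d cells are occupied: take one over; its occupant becomes pending.
        i = self.rng.randrange(self.d)
        victim = self.subtables[i][slots[i]]
        self.subtables[i][slots[i]] = key
        self.queue.append(victim)
        return False
```

A caller would run a fixed number of `step()` calls per insertion, which bounds the memory accesses per insert while the queue absorbs the occasional long eviction chain.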
Queue Policy
• Key point: better to give priority to “new”
insertions over moves.
– New insertions have d choices; moves effectively have d – 1.
• Intuition suggests older items may be less likely to
be successfully placed.
– True in practice.
• Full priority queue may be too complex.
• Simple strategy: new items placed at front, failed moves placed at back.
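The simple strategy maps directly onto a double-ended queue. The sketch below is illustrative only: the class and method names are hypothetical, and it does not model a capacity limit.

```python
from collections import deque

class MoveQueue:
    """Pending-operation queue: new items go to the front, failed moves to the back."""

    def __init__(self):
        self._items = deque()

    def push_new(self, key):
        self._items.appendleft(key)    # new insertions get priority

    def push_failed(self, key):
        self._items.append(key)        # items displaced by a failed move wait longest

    def pop_next(self):
        """Next item to attempt, or None if nothing is pending."""
        return self._items.popleft() if self._items else None

    def __contains__(self, key):
        return key in self._items      # lookups must also see queued items

    def __len__(self):
        return len(self._items)
```

For example, after `push_new("a")`, `push_failed("x")`, `push_new("b")`, successive `pop_next()` calls return `"b"`, `"a"`, `"x"`. In this sketch, new items that arrive back to back are served most recent first.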
Probability of Success vs. Age
Experimental Evaluation
• Table of size 32768, 4 subtables.
• Target utilization u.
• Insert 32768u elements, then alternate insertions/deletions to get to steady state.
• Allow ops queue operations (parallel memory operations) per insertion, where ops is a parameter of the experiment.
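The workload described above can be sketched as a generator of operations. Several details are assumptions because the slides do not specify them: keys are consecutive integers, each deletion removes a uniformly random live key, the steady-state phase starts with a deletion, and the fill count is rounded down.

```python
import random

def workload(table_size=32768, u=0.9, steady_ops=1000, seed=0):
    """Yield ("insert", key) and ("delete", key) operations.

    Phase 1 inserts int(table_size * u) elements; phase 2 alternates
    deletions and insertions so occupancy stays near the target utilization.
    """
    rng = random.Random(seed)
    n = int(table_size * u)
    for key in range(n):
        yield ("insert", key)
    live = list(range(n))
    next_key = n
    for step in range(steady_ops):
        if step % 2 == 0:
            # Delete a random live key (swap with the last entry, then pop).
            j = rng.randrange(len(live))
            live[j], live[-1] = live[-1], live[j]
            yield ("delete", live.pop())
        else:
            live.append(next_key)
            yield ("insert", next_key)
            next_key += 1
```

A driver would feed these operations to the table under test and perform ops queue operations after each insertion.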
Analysis
• Currently we do not know how to analyze such
systems.
– For d > 2 choices, lots of open questions in cuckoo
hashing analysis.
– Analyzing d = 2 may be possible, but very low space
utilization.
• See [Kutzelnigg], asymptotic analysis of cuckoo hashing.
• Need to understand distribution of move
operations/element to analyze queue.
Conclusions and Open Questions
• Moving elements leads to much better space utilization in hash
tables, at a price.
• Cuckoo hashing appears implementable, with per-insert move
guarantees based on de-amortization via a CAM
queue.
• Analysis in an idealized model?
– Even analysis for basic cuckoo hashing open.
• Performance on real traffic?
– Bursty insertions/deletions?
– Distribution of element lifetimes?
• Proper sizing of CAM queue?
– How does overflow probability scale?
