Private Information Retrieval: Amir Houmansadr
#4417749:
• clothes for age 60
• 60 single men
• best retirement city
• jarrett arnold
• jack t. arnold
• jaylene and jarrett arnold
• gwinnett county yellow pages
• rescue of older dogs
• movies for dogs
• sinus infection
Thelma Arnold, 62-year-old widow, Lilburn, Georgia
Observation
The owners of the database know a lot about the users!
Yes, we can:
[Diagram: user U and database D connected by a secure link]
A new primitive:
Private Information Retrieval (PIR)
Private Information Retrieval (PIR) [CGKS95]
• User wishes
– to retrieve x_i, and
– to keep i private
Private Information Retrieval (PIR)
[Diagram: the SERVER holds x = x_1, x_2, ..., x_n ∈ {0,1}^n; the USER holds an index i ∈ {1, ..., n} and wants x_i]
Non-Private Protocol
[Diagram: the USER, holding i ∈ {1, ..., n}, sends i to the SERVER, which holds x = x_1, x_2, ..., x_n and returns x_i]
NO privacy!!!
Communication: 1
Trivial Private Protocol
[Diagram: the SERVER sends all of x = x_1, x_2, ..., x_n to the USER, who reads off x_i]
Not optimal!
Other solutions?
• User asks for additional random indices.
Drawback: leaks information, reduces
communication efficiency
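A minimal sketch of the decoy-index idea, to make the drawback concrete. The helper names (`decoy_query`, `server_answer`) and the parameter k (number of decoys) are assumptions for illustration and do not come from the slides. Indices are 0-based here.

```python
import math
import secrets

def decoy_query(i, n, k):
    """Hide target index i among k distinct random decoy indices (hypothetical helper)."""
    rng = secrets.SystemRandom()
    others = [j for j in range(n) if j != i]
    query = rng.sample(others, k) + [i]
    rng.shuffle(query)  # position in the query must not reveal the target
    return query

def server_answer(x, query):
    """Server returns the bit at every requested index."""
    return {j: x[j] for j in query}

x = [0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0]  # example database
n, i, k = len(x), 4, 3

query = decoy_query(i, n, k)
answer = server_answer(x, query)
recovered = answer[i]

# The server knows i is one of the k+1 indices it saw, so privacy is only partial.
guess_probability = 1 / len(query)
# Cost grows with k: (k+1) indices up, (k+1) bits down.
bits_sent = len(query) * math.ceil(math.log2(n)) + len(answer)
```

With k decoys the server still narrows i down to k+1 candidates, and hiding i among all n indices would cost at least as much as downloading the whole database.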
Goal: communication sub-linear in n
Approach I: k-Server PIR
[Diagram: servers S1 and S2 each hold a copy of x ∈ {0,1}^n, e.g. 0 1 0 0 1 1 0 1 0 0 1 0; user U holds index i]
A 2-server Information-Theoretic PIR
[Diagram: S1 and S2 each hold the n-bit database 0 1 0 0 1 1 0 1 0 0 1 0; user U holds index i and picks a subset Q1 ⊆ {1,…,n}, here with i ∉ Q1]
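Protocol I: 2-server PIR
[Diagram: S1 and S2 each hold the n-bit database x; user U holds index i]
1. U picks Q1 ⊆ {1,…,n} (in the picture, i ∉ Q1) and sets Q2 = Q1 + {i}
2. U sends Q1 to S1 and Q2 to S2
3. S1 replies a1 = ⊕_{j ∈ Q1} x_j; S2 replies a2 = ⊕_{j ∈ Q2} x_j
4. U computes x_i = a1 ⊕ a2
Weakness: Servers should not collude!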
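The 2-server protocol can be sketched in a few lines. The function names (`make_queries`, `answer`, `retrieve`) are illustrative and do not come from the slides. The sketch assumes the standard formulation, in which Q1 is a uniformly random subset and Q2 is the symmetric difference of Q1 and {i}; this coincides with Q2 = Q1 + {i} in the pictured case i ∉ Q1. Indices are 0-based.

```python
import secrets

def make_queries(i, n):
    """User side: random subset Q1, and Q2 = Q1 with membership of i flipped."""
    q1 = {j for j in range(n) if secrets.randbits(1)}
    q2 = q1 ^ {i}  # symmetric difference: adds i if absent, removes it if present
    return q1, q2

def answer(x, q):
    """Server side: XOR of the database bits selected by the query set."""
    a = 0
    for j in q:
        a ^= x[j]
    return a

def retrieve(x, i):
    """Run the protocol end to end; both servers hold the same database x."""
    q1, q2 = make_queries(i, len(x))
    a1 = answer(x, q1)  # computed by S1, which sees only Q1
    a2 = answer(x, q2)  # computed by S2, which sees only Q2
    # Every x_j with j != i appears in both answers or in neither and cancels.
    return a1 ^ a2

x = [0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0]  # example database
results = [retrieve(x, i) for i in range(len(x))]
```

Each query set on its own is a uniformly random subset, so a single server learns nothing about i; two colluding servers can compare Q1 and Q2 and read off i directly. Each query here is an n-bit subset description, so this simple version does not yet achieve sub-linear communication.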
Computational PIR
• Only one server, no need to trust that servers do not collude
Prateek Mittal
University of Illinois Urbana-Champaign
[Diagram: Tor circuit through Guard, Middle, and Exit relays, built from a signed server list (relay descriptors)]
1. Load balancing
2. Exit policy
Performance Problem in Tor’s Architecture:
Global View
• Global view
– Not scalable
[Diagram: client asks the Directory Servers "List of servers?"]
Need solutions without global system view
[Torsk, CCS 09]
Current Solution:
Peer-to-peer Paradigm
• Morphmix [WPES 04]
– Broken [PETS 06]
• Salsa [CCS 06]
– Broken [CCS 08, WPES 09]
• NISAN [CCS 09]
– Broken [CCS 10]
• Torsk [CCS 09]
– Broken [CCS 10]
• ShadowWalker [CCS 09]
– Broken and fixed(??) [WPES 10]
Private Information Retrieval (PIR)
• Information theoretic PIR
– Multi-server protocol
– Threshold number of servers don’t collude
• Computational PIR
– Single server protocol
– Computational assumption on server
[Diagram: ITPIR with databases A, B, and C returning responses RA, RB, and RC; computational PIR with a single database A]
ITPIR-Tor: Database Locations
• Tor places significant trust in guard relays
– 3 compromised guard relays suffice to undermine user anonymity in Tor.
• Choose client’s guard relays to be directory servers
• At least one guard relay is honest
– ITPIR guarantees user privacy
– Equivalent security to the current Tor network
• All guard relays compromised
– ITPIR does not provide privacy
– But in this case, Tor anonymity broken
[Diagram: Guards, Middle, and Exit relays with the exit relay compromised or honest; attacks shown: Deny Service, End-to-end Timing Analysis]
ITPIR-Tor
Database Organization and Formatting
• Middles, exits
– Separate databases
• Exit policies
– Standardized exit policies
– Relays grouped by exit policies
• Load balancing
– Relays sorted by bandwidth
[Diagram: Middles database (descriptors m1 to m8) and Exits database (descriptors e1 to e8), sorted by relay bandwidth; exits grouped into Exit Policy 1, Exit Policy 2, and non-standard exit policies]
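The database layout described on this slide could be sketched as follows. Everything concrete here is an assumption for illustration: the `Relay` record, the policy labels, the descending sort order, and the choice to keep exit relays out of the middles database are not specified in the slides.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for the standardized exit policies.
STANDARD_POLICIES = ("Exit Policy 1", "Exit Policy 2")

@dataclass(frozen=True)
class Relay:
    name: str
    bandwidth: int
    is_exit: bool
    exit_policy: Optional[str] = None

def build_databases(relays):
    """Split relays into a middles database and an exits database.

    Middles are sorted by bandwidth; exits are grouped by exit policy
    (standard groups first, non-standard last) and sorted within each group.
    """
    middles = sorted((r for r in relays if not r.is_exit),
                     key=lambda r: -r.bandwidth)
    groups = defaultdict(list)
    for r in relays:
        if r.is_exit:
            key = r.exit_policy if r.exit_policy in STANDARD_POLICIES else "Non-standard"
            groups[key].append(r)
    exits = []
    for key in (*STANDARD_POLICIES, "Non-standard"):
        exits.extend(sorted(groups[key], key=lambda r: -r.bandwidth))
    return middles, exits

relays = [
    Relay("m1", 500, False),
    Relay("m2", 900, False),
    Relay("e1", 300, True, "Exit Policy 2"),
    Relay("e2", 700, True, "Exit Policy 1"),
    Relay("e3", 400, True, "Exit Policy 1"),
    Relay("e4", 800, True, "custom"),
]
middles, exits = build_databases(relays)
```

Grouping by policy and sorting by bandwidth gives each database a predictable layout, so that a client holding only summary information can pick a suitable index to query.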
ITPIR-Tor Architecture
[Diagram: client, guard relays acting as PIR directory servers, and the trusted directory authority; Middles (m1 to m8) and Exits (e1 to e8) databases]
1. Download PIR database
2. Initial connect
3. Signed meta-information
4. Load balanced index selection
5. 18 PIR queries (middle/exit)
6. PIR response
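Step 4 (load balanced index selection) can be illustrated with a bandwidth-weighted draw. This sketch assumes the signed meta-information lets the client learn a bandwidth weight for each database index; the slides do not state the exact format, and `pick_index` is a hypothetical helper.

```python
import bisect
import itertools
import random

def pick_index(bandwidths, rng=random):
    """Pick a database index with probability proportional to its bandwidth."""
    cumulative = list(itertools.accumulate(bandwidths))
    point = rng.random() * cumulative[-1]          # uniform in [0, total)
    index = bisect.bisect_right(cumulative, point)  # first slot whose range covers point
    return min(index, len(bandwidths) - 1)          # guard against float rounding

# Example: index 1 has zero bandwidth and should never be chosen.
bandwidths = [1, 0, 3]
rng = random.Random(0)
counts = [0, 0, 0]
for _ in range(10_000):
    counts[pick_index(bandwidths, rng)] += 1
```

The client would then issue its PIR query for the chosen index, so the directory servers learn neither the index nor the relay it corresponds to.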
Performance Evaluation
• Percy [Goldberg, Oakland 2007]
– Multi-server ITPIR scheme
• 2.5 GHz, Ubuntu
• Descriptor size 2100 bytes
– Max size in the current database
• Exit database size
– Half of middle database
• Methodology: Vary number of relays
– Total communication
– Server computation
Performance Evaluation:
Communication Overhead
[Plot: communication overhead; labeled values 12 KB, 216 KB, 1.1 MB]
• Current Tor network: 5x--100x improvement
• Advantage of PIR-Tor becomes larger due to its sublinear scaling: 100x--1000x improvement
Performance Evaluation:
Server Computational Overhead
[Plot: server computation]
• Current Tor network: less than 0.5 sec
• 100,000 relays: about 10 seconds (does not impact user latency)
Performance Evaluation:
Scaling Scenarios
[Table columns: Scenario; Explanation; Relay; Clients; Tor Communication (per client); ITPIR Communication (per client); ITPIR Core Utilization]
Conclusion
• PIR can be used to replace descriptor
download in Tor.
– Improves scalability
• 10x current network size: very feasible
• 100x current network size: plausible
– Easy to understand security properties
• Side conclusion: Yes, PIR can have practical
uses!
• Questions?
Acknowledgement
• Some of the slides, content, or pictures are borrowed from
the following resources, and some pictures are obtained
through Google search without being referenced below: