
4/2/2007

BitTorrent
CS514
Vivek Vishnumurthy, TA

Common Scenario
• Millions want to download the same popular huge files (for free)
  – ISO’s
  – Media (the real example!)
• Client-server model fails
  – Single server fails
  – Can’t afford to deploy enough servers

IP Multicast?
• Recall: IP Multicast not a real option in general settings
  – Not scalable
  – Only used in private settings
• Alternatives
  – End-host based Multicast
  – BitTorrent
  – Other P2P file-sharing schemes (later in lecture)

[Figure legend: Source, Router, “Interested” End-host]

Client-Server
[Figures (two panels, the second labeled “Overloaded!”): Source, Router, “Interested” End-host]


IP multicast
[Figure: Source, Router, “Interested” End-host]

End-host based multicast
[Figure: Source, Router, “Interested” End-host]

End-host based multicast
• “Single-uploader” → “Multiple-uploaders”
  – Lots of nodes want to download
  – Make use of their uploading abilities as well
  – Node that has downloaded (part of) file will then upload it to other nodes.
    • Uploading costs amortized across all nodes

End-host based multicast
• Also called “Application-level Multicast”
• Many protocols proposed early this decade
  – Yoid (2000), Narada (2000), Overcast (2000), ALMI (2001)
• All use single trees
• Problem with single trees?

End-host multicast using single tree
[Figures (two builds): Source]


End-host multicast using single tree
[Figure: Source; label “Slow data transfer”]

End-host multicast using single tree
• Tree is “push-based” – node receives data, pushes data to children
• Failure of “interior”-node affects downloads in entire subtree rooted at node
• Slow interior node similarly affects entire subtree
• Also, leaf-nodes don’t do any sending!
• Though later multi-tree / multi-path protocols (Chunkyspread (2006), Chainsaw (2005), Bullet (2003)) mitigate some of these issues
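
To make the single-tree problems concrete, here is a minimal Python sketch. The tree shape and node names are made up for illustration; they do not come from any of the protocols listed. It shows which nodes lose data when one interior node fails, and which nodes (the leaves) never upload.

```python
from collections import deque

# Hypothetical single multicast tree: parent -> list of children.
TREE = {
    "source": ["a", "b"],
    "a": ["c", "d"],
    "b": ["e"],
    "c": [], "d": ["f"], "e": [], "f": [],
}

def affected_by_failure(tree, failed):
    """Return every node below `failed`, i.e. the nodes that stop receiving data."""
    cut_off = []
    queue = deque(tree.get(failed, []))
    while queue:
        node = queue.popleft()
        cut_off.append(node)
        queue.extend(tree.get(node, []))
    return cut_off

def leaves(tree):
    """Nodes with no children: in a push-based tree they never send anything."""
    return sorted(n for n, kids in tree.items() if not kids)
```

In this toy tree, losing interior node "a" cuts off three of the six receivers, while the leaves contribute no upload capacity at all.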

BitTorrent
• Written by Bram Cohen (in Python) in 2001
• “Pull-based” “swarming” approach
  – Each file split into smaller pieces
  – Nodes request desired pieces from neighbors
    • As opposed to parents pushing data that they receive
  – Pieces not downloaded in sequential order
  – Previous multicast schemes aimed to support “streaming”; BitTorrent does not
• Encourages contribution by all nodes

BitTorrent Swarm
• Swarm
  – Set of peers all downloading the same file
  – Organized as a random mesh
• Each node knows list of pieces downloaded by neighbors
• Node requests pieces it does not own from neighbors
  – Exact method explained later

How a node enters a swarm for file “popeye.mp4”
• File popeye.mp4.torrent hosted at a (well-known) webserver
• The .torrent has address of tracker for file
• The tracker, which runs on a webserver as well, keeps track of all peers downloading file
[Figure, step 1: www.bittorrent.com, Peer]


[Figure, steps 2 and 3 of “How a node enters a swarm”: www.bittorrent.com, Peer, Tracker, Swarm]
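
The entry procedure can be mimicked with a toy, in-memory tracker. This is only a sketch under loud assumptions: `ToyTracker`, `announce`, `enter_swarm`, and the address "tracker.example" are hypothetical names, the .torrent is modeled as a plain dict, and no real tracker protocol or network traffic is involved.

```python
class ToyTracker:
    """In-memory stand-in for a tracker; a real one is reached over the network."""

    def __init__(self):
        self.swarms = {}  # file name -> list of peer addresses, in join order

    def announce(self, file_name, peer):
        """Register `peer` for `file_name` and return the other peers in the swarm."""
        swarm = self.swarms.setdefault(file_name, [])
        others = [p for p in swarm if p != peer]
        if peer not in swarm:
            swarm.append(peer)
        return others

def enter_swarm(torrent, trackers, me):
    """Read the tracker address from the .torrent, then ask that tracker for peers."""
    tracker = trackers[torrent["tracker"]]
    return tracker.announce(torrent["file"], me)

# Hypothetical setup: one tracker, and a .torrent that points at it.
trackers = {"tracker.example": ToyTracker()}
torrent = {"file": "popeye.mp4", "tracker": "tracker.example"}
```

The first peer to call `enter_swarm` gets an empty list back; later peers get the addresses of those already registered and can start requesting pieces from them.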

Contents of .torrent file
• URL of tracker
• Piece length – Usually 256 KB
• SHA-1 hashes of each piece in file
  – For reliability
• “files” – allows download of multiple files

Terminology
• Seed: peer with the entire file
  – Original Seed: The first seed
• Leech: peer that’s downloading the file
  – Fairer term might have been “downloader”
• Sub-piece: Further subdivision of a piece
  – The “unit for requests” is a subpiece
  – But a peer uploads only after assembling complete piece
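
The piece length and per-piece SHA-1 hashes in the .torrent file suggest how a downloader can check each piece before passing it on. The sketch below uses Python's standard `hashlib`; the helper names are illustrative, and the real .torrent encoding is not reproduced here.

```python
import hashlib

PIECE_LENGTH = 256 * 1024  # 256 KB, the usual piece length

def split_into_pieces(data, piece_length=PIECE_LENGTH):
    """Cut the file contents into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + piece_length] for i in range(0, len(data), piece_length)]

def piece_hashes(data, piece_length=PIECE_LENGTH):
    """SHA-1 digest of each piece, as the creator of a .torrent would record them."""
    return [hashlib.sha1(p).digest() for p in split_into_pieces(data, piece_length)]

def verify_piece(index, piece, hashes):
    """Accept a downloaded piece only if its SHA-1 matches the recorded hash."""
    return hashlib.sha1(piece).digest() == hashes[index]
```

Checking whole pieces this way fits the rule that a peer uploads a piece only after assembling it completely: sub-pieces cannot be verified on their own.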

Peer-peer transactions: Choosing pieces to request
• Rarest-first: Look at all pieces at all peers, and request piece that’s owned by fewest peers
  – Increases diversity in the pieces downloaded
    • avoids case where a node and each of its peers have exactly the same pieces; increases throughput
  – Increases likelihood all pieces still available even if original seed leaves before any one node has downloaded entire file

Choosing pieces to request
• Random First Piece:
  – When peer starts to download, request random piece.
    • So as to assemble first complete piece quickly
    • Then participate in uploads
  – When first complete piece assembled, switch to rarest-first
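
The two selection rules can be combined in a few lines. This is a sketch, not the reference client's code: `choose_piece` and its arguments are hypothetical, pieces are identified by integer index, and ties between equally rare pieces are broken at random (an assumption).

```python
import random
from collections import Counter

def choose_piece(have, peer_pieces, rng=random):
    """Pick the next piece index to request, or None if nothing useful is on offer.

    have: set of piece indices this node has completed.
    peer_pieces: dict mapping peer name -> set of piece indices that peer has.
    """
    # Count how many peers own each piece we still lack.
    counts = Counter(p for pieces in peer_pieces.values() for p in pieces if p not in have)
    if not counts:
        return None
    if not have:
        # Random first piece: no complete piece yet, so pick any available one.
        return rng.choice(sorted(counts))
    # Rarest-first: the piece with the fewest owners wins; ties broken at random.
    fewest = min(counts.values())
    return rng.choice(sorted(p for p, c in counts.items() if c == fewest))
```

For example, with `have = {0}` and peers holding `{0, 1, 2}` and `{0, 1}`, piece 2 has a single owner and is requested before piece 1.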


Choosing pieces to request
• End-game mode:
  – When requests sent for all sub-pieces, (re)send requests to all peers.
  – To speed up completion of download
  – Cancel request for downloaded sub-pieces

Tit-for-tat as incentive to upload
• Want to encourage all peers to contribute
• Peer A said to choke peer B if it (A) decides not to upload to B
• Each peer (say A) unchokes at most 4 interested peers at any time
  – The three with the largest upload rates to A
    • Where the tit-for-tat comes in
  – Another randomly chosen (Optimistic Unchoke)
    • To periodically look for better choices
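
The unchoking rule (three fastest uploaders plus one optimistic pick) could look roughly like this. The function name, the rate dictionary, and the way rates are measured are assumptions for illustration; how often the choice is re-evaluated is left out.

```python
import random

def choose_unchoked(upload_rate_to_us, interested, rng=random, regular_slots=3):
    """Return the peers to unchoke: the fastest uploaders plus one optimistic pick.

    upload_rate_to_us: dict mapping peer -> rate at which that peer uploads to us.
    interested: set of peers that currently want data from us.
    """
    # Tit-for-tat: reward the interested peers that upload to us fastest.
    ranked = sorted(interested, key=lambda p: upload_rate_to_us.get(p, 0), reverse=True)
    unchoked = ranked[:regular_slots]
    # Optimistic unchoke: one random peer from the rest, to probe for better partners.
    rest = ranked[regular_slots:]
    if rest:
        unchoked.append(rng.choice(rest))
    return unchoked
```

Every interested peer not in the returned list stays choked. A peer that has uploaded nothing can still be picked for the optimistic slot, which is how newcomers get their first pieces.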

Anti-snubbing
• A peer is said to be snubbed if each of its peers chokes it
• To handle this, snubbed peer stops uploading to its peers
• Optimistic unchoking done more often
  – Hope is that will discover a new peer that will upload to us

Why BitTorrent took off
• Better performance through “pull-based” transfer
  – Slow nodes don’t bog down other nodes
• Allows uploading from hosts that have downloaded parts of a file
  – In common with other end-host based multicast schemes

Why BitTorrent took off
• Practical Reasons (perhaps more important!)
  – Working implementation (Bram Cohen) with simple well-defined interfaces for plugging in new content
  – Many recent competitors got sued / shut down
    • Napster, Kazaa
  – Doesn’t do “search” per se. Users use well-known, trusted sources to locate content
    • Avoids the pollution problem, where garbage is passed off as authentic content

Pros and cons of BitTorrent
• Pros
  – Proficient in utilizing partially downloaded files
  – Discourages “freeloading”
    • By rewarding fastest uploaders
  – Encourages diversity through “rarest-first”
    • Extends lifetime of swarm
• Works well for “hot content”


Pros and cons of BitTorrent
• Cons
  – Assumes all interested peers active at same time; performance deteriorates if swarm “cools off”
  – Even worse: no trackers for obscure content

Pros and cons of BitTorrent
• Dependence on centralized tracker: pro/con?
  – Single point of failure: New nodes can’t enter swarm if tracker goes down
  – Lack of a search feature
    • ☺ Prevents pollution attacks
    • Users need to resort to out-of-band search: well known torrent-hosting sites / plain old web-search

“Trackerless” BitTorrent
• To be more precise, “BitTorrent without a centralized-tracker”
• E.g.: Azureus
• Uses a Distributed Hash Table (Kademlia DHT)
• Tracker run by a normal end-host (not a web-server anymore)
  – The original seeder could itself be the tracker
  – Or have a node in the DHT randomly picked to act as the tracker

Why is (studying) BitTorrent important?
[Figure (From CacheLogic, 2004)]
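
For the trackerless variant, one plausible way to have a DHT node "picked" as tracker is to hash the file name into the ID space and take the node closest to that key. Kademlia measures closeness as the XOR of two IDs; everything else here (deriving IDs with SHA-1 of a name, the function names, the host names) is an assumption for illustration, not the scheme of any particular client.

```python
import hashlib

def node_id(name):
    """160-bit ID derived with SHA-1; hashing a name is an assumption of this sketch."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

def xor_distance(a, b):
    """Kademlia's distance between two IDs is their bitwise XOR."""
    return a ^ b

def pick_tracker_node(file_name, nodes):
    """Choose the node whose ID is closest to the file's key under XOR distance."""
    key = node_id(file_name)
    return min(nodes, key=lambda n: xor_distance(node_id(n), key))
```

Because the key is a hash, the chosen node is effectively random, yet every peer that knows the file name arrives at the same node without consulting a central server.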

Why is (studying) BitTorrent important?
• BitTorrent consumes significant amount of internet traffic today
  – In 2004, BitTorrent accounted for 30% of all internet traffic (Total P2P was 60%), according to CacheLogic
  – Slightly lower share in 2005 (possibly because of legal action), but still significant
  – BT always used for legal software (linux iso) distribution too
  – Recently: legal media downloads (Fox)

Other file-sharing systems
• Prominent earlier: Napster, Kazaa, Gnutella
• Current popular file-sharing client: eMule
  – Connects to the ed2k and Kad networks
  – ed2k has a supernode-ish architecture (distinction between servers and normal clients)
  – Kad based on the Kademlia DHT


File-sharing systems…
• (Anecdotally) Better than BitTorrent in finding obscure items
• Vulnerable to:
  – Pollution attacks: Garbage data inserted with the same file name; hard to distinguish
  – Index-poisoning attacks (sneakier): Insert bogus entries pointing to non-existent files
  – Kazaa reportedly has more than 50% pollution + poisoning

References
• BitTorrent
  – “Incentives build robustness in BitTorrent”, Bram Cohen
  – BitTorrent Protocol Specification: http://www.bittorrent.org/protocol.html
• Poisoning/Pollution in DHT’s:
  – “Index Poisoning Attack in P2P file sharing systems”
  – “Pollution in P2P File Sharing Systems”
