Computer Networks

P2P Overlay Networks

Mayank Pandey
Overlay Networks
• An overlay network is a computer network
– that is built on top of another network
• Nodes in the overlay network are
– connected by virtual or logical links
• Each logical link may pass through many physical links in the underlying network

Computer Networks: A Systems Approach, Peterson and Davie



Overlay Networks
• Logical network implemented on top of some underlying network:
– the Internet was built as an overlay over the telephone network
• The links that connect the overlay nodes are
– implemented as tunnels through the underlying network
– For example, tunneling in:
• VPN (Virtual Private Networks)
• IPv6 deployment (IP in IP tunnel)
• Multicast Backbone (MBONE, IP in IP tunnel)



Overlay Networks

Computer Networks: A Systems Approach, Peterson and Davie

• Three overlay nodes (A, B, and C) connected by a pair of tunnels.
• Overlay node B makes a forwarding decision for packets from A to C based on the inner header (IHdr)
– and then attaches an outer header (OHdr) that identifies C as the destination in the underlying network.
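
A minimal sketch (illustrative names, not from the slides or a real API) of the forwarding step just described: node B decides based on the inner header, then rewrites the outer header for the underlying network.

```python
# A minimal sketch: node B forwards a tunneled packet by deciding on the
# inner header (IHdr), then attaching a fresh outer header (OHdr).

from dataclasses import dataclass

@dataclass
class Packet:
    inner_dst: str   # destination in the overlay address space (IHdr)
    outer_dst: str   # destination in the underlying network (OHdr)
    payload: bytes

# B's overlay routing table: overlay address -> next hop's underlay address
OVERLAY_ROUTES = {"C-overlay": "192.0.2.3"}   # hypothetical addresses

def forward_at_b(pkt: Packet) -> Packet:
    """Forwarding decision based on IHdr; rewrite the OHdr."""
    next_hop = OVERLAY_ROUTES[pkt.inner_dst]
    return Packet(pkt.inner_dst, next_hop, pkt.payload)

pkt = Packet("C-overlay", "B-underlay", b"data from A")
print(forward_at_b(pkt).outer_dst)   # 192.0.2.3
```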



Overlay Networks

Computer Networks: A Systems Approach, Peterson and Davie

• Nodes A, B, and C are able to interpret both the inner and outer header
– intermediate routers understand only the outer header.
• A, B, and C have addresses in both the overlay network and the underlying network
– not necessarily the same
Overlay Networks

Computer Networks: A Systems Approach, Peterson and Davie

• The underlying address (of A, B, C) might be a 32-bit IP address
• Their overlay address might be an experimental 128-bit address.
– In fact, the overlay need not use conventional addresses at all
– it may route based on URLs, domain names, an XML query, or even the content of the packet.



Application Layer Overlay Network

• Also known as P2P overlays
– Nodes → end hosts (the processes running on them)
• may act as routers of the network
– Edges → transport-layer virtual links



Applications:
• Content Distribution: scalable approaches to sharing data with users across the Internet
– File storage and sharing: Gnutella, BitTorrent, InterPlanetary File System (IPFS)
– Content delivery networks (CDNs): Akamai, Limelight
– Streaming media: Spotify, Sonos
– Software update distribution: Linux, World of Warcraft
• Distributed Computing: delegating the work for an application across many computers
– Privacy and censorship resistance: Tor, Freenet
– Cryptocurrency: Bitcoin
– Botnets and malware: Storm
• Collaboration: providing real-time human communication
– Voice over IP (VoIP): Skype
– Instant messaging: Tox
• Platforms: building applications
– Java: JXTA



P2P Networks: Overview
• Music-sharing apps:
– Napster and later KaZaA
• introduced the term "peer-to-peer"
– in the context of sharing MP3 files
• Don't download music from a central site:
– access music files directly from whoever on the Internet has a copy on their computer



P2P Networks: Overview
• Generally, a peer-to-peer network allows a:
– community of users to pool their resources
• content, storage, network bandwidth, CPU, etc.
– in a distributed and decentralized manner
• P2P networks provide access to a larger
– archival store, computation capability, audio/video streaming and conferencing
• than any one user could afford individually.



P2P Networks: Overview
• Decentralized and Self-Organized:
– nodes organize themselves into a network
• without any centralized coordination
• Locating an object of interest and downloading it
– happen without a centralized authority
• The system is able to scale to millions of nodes
– Nodes are hosts willing to share objects
– Transport-layer links connect these nodes
• used to traverse a sequence of machines to locate an object



P2P Networks: Types
• Two main P2P Network types
– Unstructured (based on searching)
“Network has structure, but peers are free to join any where
and objects can be stored anywhere”
– Structured (based on addressing)
“Network structure determines where peers belong in the
network and where objects are stored “



P2P Networks: Another Classification
• 1st generation
– Typically: Napster
• Only file download P2P, Publish and Search Centralized
• 2nd generation
– Typically: Gnutella
• Pure P2P
• 3rd generation
– Typically: Super-peer networks
• Hybrid of P2P and Centralized
• 4th generation
– Typically: Distributed hash tables
• Structured P2P overlays



P2P Networks: Beginning
• A killer application: Napster
– Free music over the Internet
• Key idea:
– share the content, storage and bandwidth of
individual (home) users
• Each user stores a subset of files
• Each user has access (can download) files from
all users in the system



Napster: Challenges
• Find where a particular file is stored
– Scalability: up to hundreds of thousands or millions of systems
– Dynamicity: systems can come and go at any time



Napster: History
• The rise
– January 1999: Napster version 1.0
– May 1999: company founded
– December 1999: first lawsuits
– 2000: 80 million users
• The fall
– Mid 2001: out of business due to lawsuits



Napster Technology
• User installs the software
– Downloads the client program
– Registers name, password, local directory, etc.
• Client contacts Napster (via TCP)
– Provides a list of music files it will share
– Napster's central server updates the directory
• Client searches on a title or performer
– Napster identifies online clients with the file and provides their IP addresses
• Client requests the file from the chosen supplier
– Supplier transmits the file to the client
– Both client and supplier report status to Napster
[Figure: peers Bob and Alice, with a centralized directory server mediating search before the direct file transfer]
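
A minimal sketch of the centralized-directory idea above (assumed names, not the real Napster wire protocol): the server only maps titles to online suppliers; the file transfer itself is peer-to-peer.

```python
# A minimal sketch: the central server maintains a directory of who shares
# what; the download then proceeds directly between peers.

directory: dict[str, set[str]] = {}   # title -> addresses of online suppliers

def publish(peer_addr: str, titles: list[str]) -> None:
    """A client registers the files it is willing to share."""
    for title in titles:
        directory.setdefault(title, set()).add(peer_addr)

def search(title: str) -> set[str]:
    """The server answers a search with suppliers' addresses."""
    return directory.get(title, set())

publish("alice.example:6699", ["song.mp3"])
print(search("song.mp3"))   # client then fetches directly from Alice
```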



Napster: Properties
• Publishing and searching are centralized
– Only the download happens in a P2P manner
• No file-transfer load on a single provider
• Server’s directory continually updated
– Always know what music is currently available
– Point of vulnerability for legal action
• Proprietary protocol
• Bandwidth issues
– Suppliers ranked by apparent bandwidth & response time
• Single point of failure, Performance bottleneck, Copyright
infringement



Pure P2P system: Gnutella
• Gnutella history
– 2000: J. Frankel & T. Pepper released Gnutella
– Soon after: many other clients (e.g., Morpheus, Limewire, Bearshare)
– 2001: protocol enhancements, e.g., "ultrapeers"
• Query flooding
– Join: contact a few nodes to become neighbours
– Publish: no need!
– Search: ask neighbours, who ask their neighbours
– Fetch: get file directly from another node
Gnutella: Query Flooding
• Fully distributed
– No central server
• Public domain protocol
– Many Gnutella clients implementing the protocol
• Overlay network: a graph
– Edge between peers X and Y if there's a TCP connection
– All active peers and edges form the overlay network
– A given peer will typically be connected with < 10 overlay neighbors



Gnutella: Protocol
• Query messages sent over existing TCP connections
• Peers forward Query messages to their neighbours
• QueryHit sent over the reverse path
• File transfer: HTTP
• Scalability: limited-scope flooding
[Figure: Query messages flooding across the overlay; QueryHits returning along the reverse path]


Gnutella: Peer Joining
• Joining peer X must find some other peers
– Start with a list of candidate peers
• A bootstrap server is needed
– X sequentially attempts TCP connections with peers on
list until connection setup with Y
• X sends Ping message to Y
– Y forwards Ping message.
– All peers receiving Ping message respond with Pong
message
• X receives many Pong messages
– X can then set up additional TCP connections



Gnutella: Pros and Cons
• Advantages
– Fully decentralized
– Search cost distributed
– Processing per node permits powerful search
semantics
• Disadvantages
– Search scope may be quite large
– Search time may be quite long
– High overhead, and nodes come and go often



Hybrid P2P system: KaZaA
• KaZaA history
– 2001: created by Dutch company (Kazaa BV)
• Smart query flooding
– Join: on start, the client contacts a super-node (and
may later become one)
– Publish: client sends list of files to its super-node
– Search: send query to super-node, and the super-nodes
flood queries among themselves
– Fetch: get file directly from peer(s)



KaZaA: Exploiting Heterogeneity
• Each peer is either a group
leader or assigned to a
group leader
– TCP connection between
peer and its group leader
– TCP connections between
some pairs of group leaders
• Group leader tracks the content in all its children
[Figure legend: ordinary peer; group-leader peer; neighboring relationships in the overlay network]
KaZaA: Motivation for Super-Nodes
• Query consolidation
– Many connected nodes may have only a few files
– Propagating query to a sub-node may take more time than
for the super-node to answer itself
• Stability
– Super-node selection favors nodes with high up-time
– How long you’ve been on is a good predictor of how long
you’ll be around in the future



BitTorrent
• BitTorrent history and motivation
– 2002: B. Cohen debuted BitTorrent
– Key motivation: popular content
• Popularity exhibits temporal locality
– Focused on efficient fetching, not searching
• To handle very large files
• If there are few publishers and many downloaders
– the publishers may bog down



BitTorrent: Simultaneous Downloading
• Divide large file into many pieces
– Replicate different pieces on different peers
• Peer can (hopefully) assemble the entire file
• Allows simultaneous downloading
– Retrieving different parts of the file from
different peers at the same time
• And uploading parts of the file to peers
– Important for very large files



BitTorrent: Simultaneous Downloading
• Replication: natural side effect of downloading
– As soon as a peer downloads a particular piece
• it becomes another source for that piece.
– The more peers downloading pieces of the file, the
more piece replication occurs
• distributes the load proportionately
• more total bandwidth is available to share the file with
others.
• Pieces are downloaded in random order
– to avoid a situation where peers find themselves lacking the same set
of pieces.
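
A minimal sketch of random-order piece selection (illustrative only; production clients typically refine this to rarest-first after the first pieces):

```python
# A minimal sketch: pick the next piece to request uniformly at random from
# the pieces still missing, so peers don't all lack the same pieces.

import random

def next_piece(have, total_pieces):
    """Return a random missing piece index, or None when complete."""
    missing = [i for i in range(total_pieces) if i not in have]
    return random.choice(missing) if missing else None

print(next_piece({0, 2}, 4))   # prints 1 or 3, chosen at random
```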



BitTorrent: Swarms
• A file is shared via an independent BitTorrent network called a swarm
• The lifecycle of a swarm
– starts as a singleton peer with a complete copy of the file
– a node that wants to download joins the swarm
• becomes its second member and begins downloading pieces
– becomes another source for the pieces it has downloaded
• Other nodes join the swarm and begin downloading pieces from multiple peers, not just the original peer
– If the file remains in high demand
• new peers keep on replacing those who leave the swarm
• the swarm could remain active, or it could shrink back to the original peer, based on demand and popularity



Swarm:

Computer Networks: A Systems Approach, Peterson and Davie



BitTorrent Components
• Seed: peer with the entire file
– fragmented into pieces
• Leecher: peer with an incomplete copy of the file
• Torrent file: passive component
– contains meta-information about the file and swarm:
• The target file's size
• The piece size
• SHA-1 hash values pre-computed from each piece
• The URL of the swarm's tracker
• Tracker: server that tracks a swarm's current membership
– allows peers to find each other
– returns a list of random peers
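
A minimal sketch of how the per-piece SHA-1 values could be computed (assumed piece size; the real .torrent format bencodes this metadata):

```python
# A minimal sketch: pre-compute a SHA-1 hash for each fixed-size piece.

import hashlib

PIECE_SIZE = 256 * 1024   # 256 KiB, a common choice

def piece_hashes(path: str) -> list[bytes]:
    hashes = []
    with open(path, "rb") as f:
        while chunk := f.read(PIECE_SIZE):
            hashes.append(hashlib.sha1(chunk).digest())
    return hashes   # downloaders re-hash each received piece to verify it
```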
Structured P2P: The DHTs
• Why do we need DHTs?
– Searching in P2P networks is not efficient
• either a centralized system with all its problems
• or a distributed system with all its problems
– The actual file-transfer process in P2P networks is scalable
• files transfer directly between peers
– Searching does not scale in the same way
• Original motivation for DHTs
– more efficient searching and object location
• Put another way:
– use addressing instead of searching



Recall: Hash Tables
• Hash tables are a well-known data structure
– Hash tables allow insertions, deletions, and finds in constant
(average) time
• Hash table is a fixed-size array
– Elements of array also called hash buckets
• Hash function maps keys to elements in the array
• Properties of good hash functions:
– Fast to compute
– Good distribution of keys into hash table
– Example: SHA-1 algorithm



Hash Tables: Example

• Hash function: hash(x) = x mod 10
• Insert numbers 0, 1, 4, 9, 16, and 25
• Easy to find whether a given key is present in the table
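
The slide's example as a runnable sketch:

```python
# hash(x) = x mod 10 maps each key to one of 10 buckets in a fixed-size array.

table = [[] for _ in range(10)]      # 10 hash buckets

for x in [0, 1, 4, 9, 16, 25]:
    table[x % 10].append(x)          # insert

def contains(x):
    return x in table[x % 10]        # constant expected time

print(contains(16), contains(3))     # True False
```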



DHT: Idea
• Hash tables are for fast
lookups
– Idea: Distribute hash buckets
to peers
• Result is Distributed Hash
Table (DHT)
– Need efficient mechanism for
• finding which peer is responsible
for which bucket and
• routing between them



DHT: Principle
• In a DHT, each node is
responsible for
– one or more hash buckets
• As nodes join and leave
– responsibilities change
• Nodes communicate to
– find the responsible node



DHT: Principle



DHT: Examples
• Chord
• CAN
• Tapestry
• Several others exist too
• Pastry, Plaxton, Kademlia, Koorde, Symphony,
P-Grid, CARP, …



Chord
• Chord was developed at MIT
– Originally published in 2001 at SIGCOMM
conference
• Paper has mathematical proofs of correctness and
performance
• Many projects at MIT around Chord
– CFS storage system
– Ivy storage system
– Plus many others…



Chord: Basics
• Chord uses SHA-1 hash function
– Results in a 160-bit object/node identifier
• Same hash function for objects and nodes
– Node ID hashed from IP address
– Object ID hashed from object name
• Object names: Based on the context
• SHA-1 gives a 160-bit identifier space
– Organized in a ring which wraps around
– Nodes keep track of predecessor and successor
• Node responsible for objects between
– its predecessor and itself
• Overlay is often called “Chord ring”
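
A minimal sketch (illustrative, not the Chord protocol itself) of this mapping: hash node addresses and object names into one identifier space, and assign each object to the first node at or after its ID on the ring.

```python
# A minimal sketch: same SHA-1 hash for nodes and objects; each object
# belongs to its successor node on the ring.

import hashlib

def chord_id(key: str) -> int:
    return int.from_bytes(hashlib.sha1(key.encode()).digest(), "big")

def responsible_node(node_ids: list[int], obj_id: int) -> int:
    ring = sorted(node_ids)
    for n in ring:
        if n >= obj_id:
            return n
    return ring[0]                    # wrap around the ring

nodes = [chord_id(ip) for ip in ["10.0.0.1", "10.0.0.2", "10.0.0.3"]]
print(responsible_node(nodes, chord_id("Foo")))
```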
Chord: Different procedures
• Joining the network
• Storing in to the network
• Retrieving from the network
• Periodic stabilization
– To keep the overlay structure intact



Joining: Step by Step Example
• Existing network with
nodes on 0, 1 and 4
– Hash id of nodes are 0,1
and 4
• Many different ways to
implement Chord
– Here only conceptual
example
– Covers all important
aspects
Joining: Step by Step Example
• New node wants to join
– Hash of the new node: 6
• Known node in network:
– Node1
• Contact Node1
– With own hash id



Joining: Situation Before Join



Joining: Contact known node



Joining: Join gets routed along the network



Joining: Successor of New Node Found



Joining Successful + Data Transfer
• Joining is successful
– Old responsible node
transfers data that
should be in new node
– New node informs
Node4 about new
successor (not shown)



Joining: All Is Done



Storing a Value
• Node 6 wants to
store object with
name “Foo” and
value 5
• hash(Foo) = 2



Storing a value: (Contd.)



Storing a value: (Contd.)



Retrieving a Value
• Node 1 wants to get
object with name “Foo”
– hash(Foo) = 2
– Foo is stored on node 4



Handling Dynamism: Joining
• Stabilization Algorithm:
– Ask successor to tell about its predecessor
• Suppose 13 joins and sets its successor as 15 and its predecessor as nil
• 13 runs stabilization
– 15 sets its pred as 13
• 11 runs stabilization
– 11 sets its succ as 13
– 13 sets its pred as 11
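
A minimal sketch of the exchange above (illustrative Python; real Chord runs stabilize and notify as periodic RPCs between nodes):

```python
# A minimal sketch of Chord's periodic stabilization.

class Node:
    def __init__(self, ident):
        self.id, self.succ, self.pred = ident, self, None

def between(x, a, b):                  # x in (a, b) on the circular ID space
    return (a < x < b) if a < b else (x > a or x < b)

def stabilize(n):
    """n asks its successor about its predecessor."""
    x = n.succ.pred
    if x is not None and between(x.id, n.id, n.succ.id):
        n.succ = x                     # a node joined between us and succ
    notify(n.succ, n)

def notify(n, cand):
    """n adopts cand as predecessor if cand is closer."""
    if n.pred is None or between(cand.id, n.pred.id, n.id):
        n.pred = cand

# Replaying the slides: 13 joins with succ = 15, pred = nil
n11, n13, n15 = Node(11), Node(13), Node(15)
n11.succ, n15.pred = n15, n11
n13.succ = n15
stabilize(n13)                         # 15 sets its pred to 13
stabilize(n11)                         # 11 sets succ to 13; 13 sets pred to 11
print(n11.succ.id, n13.pred.id, n15.pred.id)   # 13 11 13
```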



Handling Dynamism: Leaving
• Some more state needed
– Successor list
• containing the K immediate successors
• Some algorithm needed
– Check predecessor (heartbeat messages)
• if it fails, set it to nil (stabilization can handle this)
– During stabilization, if the successor does not respond:
• replace it with the closest alive successor



Chord: Scalable Routing
• Routing happens by passing message to successor
• What happens when there are 1 million nodes?
– On average, need to route halfway across the ring
– In other words, 0.5 million hops! Complexity is O(n)
• How to make routing scalable?
– Answer: Finger tables
• Basic Chord keeps track of predecessor and successor
• Finger tables keep track of more nodes
– Allow for faster routing by jumping long way across the ring
– Routing scales well, but need more state information
• Finger tables not needed for correctness, only for performance
improvement



Chord: Finger Tables
• In an m-bit identifier space, a node has up to m fingers
– Fingers are stored in the finger table
• Row i in the finger table at node n contains the first node s
– that succeeds n by at least 2^(i-1) on the ring
• In other words:
– finger[i] = successor(n + 2^(i-1))
• The first finger is the successor
– Distance to finger[i] is at least 2^(i-1)
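
A minimal sketch of finger-table construction (illustrative; uses a global sorted node list in place of real lookups, with the five example node IDs from the next slide):

```python
# finger[i] = successor(n + 2^(i-1)) in an m-bit identifier space.

M = 4                                  # 4-bit identifier space
RING = 2 ** M
nodes = sorted([0, 2, 5, 6, 11])

def successor(i: int) -> int:
    i %= RING
    return next((n for n in nodes if n >= i), nodes[0])

def finger_table(n: int) -> list[int]:
    return [successor(n + 2 ** (i - 1)) for i in range(1, M + 1)]

print(finger_table(0))                 # [2, 2, 5, 11]: successors of 1, 2, 4, 8
```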



Chord: Scalable Routing
• Finger intervals increase with distance from node n
– If close, short hops; if far, long hops
• Two key properties:
– Each node only stores information about a small number of nodes
• Example has five nodes at 0, 2, 5, 6 and 11
– 4-bit ID space → 4 rows of fingers
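
Continuing the sketch above under the same simplified model: a lookup hops to the closest preceding finger, roughly halving the remaining ring distance each time.

```python
# A minimal sketch of finger-based routing on the example ring.

M, RING = 4, 16
nodes = sorted([0, 2, 5, 6, 11])

def successor(i): return next((n for n in nodes if n >= i % RING), nodes[0])
def fingers(n): return [successor(n + 2 ** (i - 1)) for i in range(1, M + 1)]

def between(x, a, b):                  # x in (a, b] on the ring
    return (a < x <= b) if a < b else (x > a or x <= b)

def find_successor(n, key):
    hops = 0
    while not between(key, n, fingers(n)[0]):
        # jump to the closest finger preceding the key (numeric max suffices
        # for this example; a full implementation compares in ring order)
        n = max((f for f in fingers(n) if between(f, n, key)),
                default=fingers(n)[0])
        hops += 1
    return fingers(n)[0], hops

print(find_successor(0, 9))            # (11, 2): node 11 holds key 9
```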



Chord: Final Words
• Search performance of “pure” Chord is O(n)
– Number of nodes is n
• With finger tables
– need O(log n) hops to find the correct node

