Grid vs. Peer-to-Peer: Project Report No. 2
Grid and P2P share the same general approach: the creation of overlay structures that coexist with, but
need not correspond in structure to, underlying organizational structures.
Grid computing addresses infrastructure but not yet failure, whereas P2P addresses failure but not yet
infrastructure. The interests of the two communities are likely to grow together over time.
1.1 Definition
Grids are sharing environments implemented by the deployment of a persistent, standards-based
service infrastructure that supports the creation of, and resource sharing within, distributed
communities.
P2P systems deal with many more participants but offer limited and specialized services, have been
less concerned with qualities of service, and have made few if any assumptions about trust.
Grids have incrementally scaled the deployment of relatively sophisticated services and applications,
connecting small numbers of sites into collaborations engaged in complex scientific applications. As
system scale increases, Grid developers are now facing and addressing problems relating to
autonomic configuration and management.
P2P communities developed rapidly around sharing and are now seeking to expand to more
sophisticated applications as well as continuing to address system management.
P2P has been popularized by grassroots, mass-culture file-sharing and highly parallel
computing applications that scale in some instances to hundreds of thousands of nodes.
1.2.2 Resources
Grids integrate resources that are more powerful, more diverse, and better connected than the
typical P2P resource.
A Grid resource might be a cluster, storage system, database, or scientific instrument of
considerable value that is administered in an organized fashion according to some well-defined
policy. This explicit administration enhances the resource's ability to deliver desired qualities of
service and can facilitate, e.g., software upgrades, but it can also increase the cost of integrating
the resource into a Grid. Explicit administration, higher cost of membership, and the stronger
community links within scientific VOs mean that resource availability tends to be higher and
more uniform.
P2P systems often deal with intermittent participation and highly variable behavior. Their major
resources are home computers. The difference in capabilities between home and work computers is
illustrated by the average CPU time per work unit in SETI@home: home computers are about 30%
slower than work computers (13:45 vs. 10:16 hours per work unit).
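
The arithmetic behind the SETI@home comparison can be checked with a short sketch. The helper name `hours` is hypothetical, and the only inputs are the two times quoted above. "Slower" can be read either as extra time per work unit or as lower throughput, so the sketch computes both. Under these figures the home computers need roughly 34% more time, which is roughly 25% lower throughput, and the quoted 30% falls between the two readings.

```python
def hours(hh_mm: str) -> float:
    """Convert an 'HH:MM' duration string into fractional hours."""
    h, m = hh_mm.split(":")
    return int(h) + int(m) / 60

home = hours("13:45")   # average CPU time per work unit, home computers
work = hours("10:16")   # average CPU time per work unit, work computers

# Reading 1: how much longer a home computer takes per work unit.
extra_time = home / work - 1
# Reading 2: how much lower the home computer's throughput is.
lower_throughput = 1 - work / home

print(f"extra time per work unit: {extra_time:.1%}")
print(f"lower throughput:         {lower_throughput:.1%}")
```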
1.2.3 Applications
Grid applications tend to be far more data intensive, in part because of better network
connectivity, which also allows for more flexibility in Grid application design.
P2P systems have far larger communities, although the level of activity is about the same as in
Grids. First-generation systems use centralized structures, second-generation systems are
flooding-based, and third-generation systems are based on distributed hash tables. First- and
second-generation systems have been characterized at the level of both individual nodes (behavior,
resources) and network properties (topological properties, scale, traffic), revealing not only
general resilience but also unexpected emergent properties. Third-generation systems have been
characterized by simulation studies rather than large-scale deployments. Scalable autonomic
management has been achieved, but only in narrow domains.
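
The difference between the second and third generations can be illustrated with a minimal sketch. It is not the protocol of any particular system. The function names, the dictionary layout of `graph`, and the use of SHA-1 for ring positions are all assumptions made for illustration. A flooding search contacts peers hop by hop until a TTL runs out, whereas a distributed hash table maps a file name directly to the node responsible for it.

```python
import hashlib
from bisect import bisect_left

def flood_search(graph, start, wanted, ttl):
    """Second-generation style: flood outward from `start`, at most `ttl` hops.
    Returns (holder or None, number of nodes contacted)."""
    seen = {start}
    frontier = [start]
    contacted = 1
    if wanted in graph[start]["files"]:
        return start, contacted
    for _ in range(ttl):
        next_frontier = []
        for node in frontier:
            for peer in graph[node]["peers"]:
                if peer in seen:
                    continue
                seen.add(peer)
                contacted += 1
                if wanted in graph[peer]["files"]:
                    return peer, contacted
                next_frontier.append(peer)
        frontier = next_frontier
    return None, contacted

def ring_position(name):
    """Hash a node or file name onto the identifier ring (assumed: SHA-1)."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16)

def dht_owner(node_names, file_name):
    """Third-generation style: the owner is the first node at or after the
    file's position on the ring, wrapping around at the end."""
    ring = sorted((ring_position(n), n) for n in node_names)
    positions = [pos for pos, _ in ring]
    idx = bisect_left(positions, ring_position(file_name)) % len(ring)
    return ring[idx][1]

# Hypothetical four-node chain a - b - c - d, with the file held by d.
graph = {
    "a": {"peers": ["b"], "files": set()},
    "b": {"peers": ["a", "c"], "files": set()},
    "c": {"peers": ["b", "d"], "files": set()},
    "d": {"peers": ["c"], "files": {"song"}},
}
print(flood_search(graph, "a", "song", ttl=3))
print(dht_owner(list(graph), "song"))
```

In the flooding case the cost grows with the number of peers reached and the search fails if the TTL is too small. In the hash-table case the owner is determined by the key alone, at the price of placing data where the hash dictates.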
First, some services are specific to particular regimes, e.g., mechanisms that make up for the
inherent lack of incentives for cooperation in P2P.
Second, functionality requirements can conflict, e.g., Grids require accountability whereas P2P
systems require anonymity.
Third, common services may start from different hypotheses, as in the case of trust.
1.2.7 Future Directions
Grid and P2P are both concerned with the pooling and coordinated use of resources within
distributed communities and are constructed as overlay structures that operate largely
independently of institutional relationships.
3 Issues of P2P
This can be inefficient. However, it is sometimes the only plan that satisfies the semantics of a query.
<GUID Keys>
* Freenet GUID keys are calculated using SHA-1 secure hashes.
* A GUID is a location-independent, globally unique identifier based on the contents of the file.
* Hashing ensures that similar works are scattered throughout the network, so a single node's
failure is unlikely to make all works on a given subject unavailable, which increases robustness.
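
A content-based key of this kind can be sketched with Python's standard `hashlib`. The function name `content_guid` is hypothetical, and the sketch only shows the hashing step, not Freenet's actual key formats. Two nearly identical inputs produce unrelated 40-digit keys, which is why similar works end up scattered across the key space.

```python
import hashlib

def content_guid(data: bytes) -> str:
    """Return a location-independent key derived only from the file contents."""
    return hashlib.sha1(data).hexdigest()

draft_1 = content_guid(b"Grid vs. Peer-to-Peer, draft 1")
draft_2 = content_guid(b"Grid vs. Peer-to-Peer, draft 2")

# The inputs differ by one character, yet the keys share no useful structure.
print(draft_1)
print(draft_2)
```

Because the key depends only on the contents, any node can recompute it to verify that the data it received matches the key it was asked to store.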
- On receiving an insert, a node checks its data store to see if the key already exists.
- If the key does not exist in the node's data store, the node looks up the closest key and forwards
the message to the corresponding node as it would for a query.
- If the TTL expires without collision, the final node returns an "all clear" message. The user then
sends the data down the path established by the initial insert message.
- Each node along the path verifies the data against its GUID, stores it, and creates a routing table
entry that lists the data holder as the final node in this chain.
- If the insert encounters a loop or a dead end, it backtracks to the second-nearest key, then the
third-nearest, and so on, until it succeeds.
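
The insert steps above can be sketched as a two-phase simulation. This is a simplified illustration under stated assumptions, not Freenet's implementation. The names `Node`, `guid`, `probe`, and `insert` are hypothetical, keys are compared by numeric distance, a collision simply aborts the insert, and backtracking is omitted (a node with no unvisited neighbour is treated as the end of the path).

```python
import hashlib

class Node:
    def __init__(self, name):
        self.name = name
        self.store = {}     # key -> data held locally
        self.routing = {}   # key -> Node believed to hold that key

def guid(data: bytes) -> int:
    """Content-based key: the SHA-1 hash of the data as an integer."""
    return int(hashlib.sha1(data).hexdigest(), 16)

def probe(start, key, ttl):
    """Phase 1: forward the insert message. Returns (path, collided)."""
    path, node = [start], start
    while True:
        if key in node.store:
            return path, True            # key already exists: collision
        if ttl == 0:
            return path, False           # TTL expired: "all clear"
        ttl -= 1
        candidates = [(k, n) for k, n in node.routing.items() if n not in path]
        if not candidates:
            return path, False           # simplification: no backtracking
        # Forward to the node associated with the closest known key.
        _, node = min(candidates, key=lambda kn: abs(kn[0] - key))
        path.append(node)

def insert(start, data, ttl):
    """Phase 2: send the data down the path established by the probe.
    Returns the node names on the path, or None on a collision."""
    key = guid(data)
    path, collided = probe(start, key, ttl)
    if collided:
        return None
    final = path[-1]
    for node in path:
        if guid(data) != key:            # each node verifies data against its key
            raise ValueError("data does not match key")
        node.store[key] = data
        node.routing[key] = final        # data holder recorded as the final node
    return [n.name for n in path]

# Hypothetical three-node network: a knows b, b knows a and c.
a, b, c = Node("a"), Node("b"), Node("c")
a.routing[10] = b
b.routing[20] = c
b.routing[5] = a
print(insert(a, b"hello", ttl=5))
```

A second insert of the same data from `a` would hit the stored copy immediately and be reported as a collision.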
4.1.3 Routing
- Steepest-ascent hill-climbing search: Each node forwards queries to the node that it thinks is
closest to the target.
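
A minimal sketch of this greedy routing follows. It assumes small integer keys and a hypothetical `network` dictionary, and it borrows the backtracking rule described for inserts (try the second-nearest key when the nearest leads to a dead end). The function name `route` and the hops-to-live parameter `htl` are illustrative choices, not names from the source.

```python
def route(node, target, network, htl, visited=None):
    """Greedy (hill-climbing) search with backtracking.
    Returns the list of node names leading to a holder of `target`, or None."""
    if visited is None:
        visited = set()
    visited.add(node)
    if target in network[node]["store"]:
        return [node]
    if htl == 0:
        return None
    table = network[node]["routing"]     # known key -> neighbour name
    # Try neighbours in order of how close their known key is to the target.
    for key in sorted(table, key=lambda k: abs(k - target)):
        neighbour = table[key]
        if neighbour in visited:
            continue                     # avoid loops
        path = route(neighbour, target, network, htl - 1, visited)
        if path is not None:
            return [node] + path
    return None                          # dead end: the caller backtracks

# Hypothetical network: A's nearest key (48) points to B, which is a dead end.
# The second-nearest key (90) points to C, which holds key 50.
network = {
    "A": {"store": set(), "routing": {48: "B", 90: "C"}},
    "B": {"store": set(), "routing": {}},
    "C": {"store": {50}, "routing": {}},
}
print(route("A", 50, network, htl=3))
```

Each node decides using only its own routing table, so the choice is locally best rather than globally optimal, which is why a fallback to the next-nearest key is needed.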
References
[1] On Death, Taxes, and the Convergence of Peer-to-Peer and Grid Computing. Ian Foster and Adriana
Iamnitchi. Department of Computer Science, University of Chicago, Chicago, IL 60615; Mathematics and
Computer Science Division, Argonne National Laboratory, Argonne, IL 60439.
[2] Framework for Peer-to-Peer Distributed Computing in a Heterogeneous, Decentralised
Environment. Jerome Verbeke, Neelakanth Nadgir, Greg Ruetsch, and Ilya Sharapov. Sun Microsystems,
Inc., Palo Alto, CA 94303.
[4] A Peer-to-Peer Approach to Resource Location in Grid Environments. Adriana Iamnitchi
(Computer Science Dept., The University of Chicago), Ian Foster (MCS Division, Argonne National
Laboratory), and Daniel C. Nurmi (MCS Division, Argonne National Laboratory).
[6] A Unified Peer-to-Peer Database Framework for Scalable Service and Resource Discovery.
Wolfgang Hoschek. CERN IT Division, European Organization for Nuclear Research.