The document discusses writes-follow-reads consistency, under which a write by a process that follows its own earlier read of the same data item is applied to the value that was read or to a more recent one. It covers data replication and the need for it (higher availability, reduced latency, read scalability, and fault tolerance), along with types of replication such as asynchronous, synchronous, active, and passive replication. It also outlines the master-slave, client-server, and peer-to-peer replication models, highlighting their functionality and challenges.
Distributed Shared Memory
5. Writes-Follow-Reads Consistency
• A write operation by a process on a data item x that follows a previous read operation on x by the same process is guaranteed to take place on the same or a more recent value of x than the one that was read.
• E.g., you can view reactions to a submitted article only if you have the initial posting.

Replication
• Data replication is the process of creating multiple copies of data, called replicas, and storing them in various locations for backup, fault tolerance, and improved overall network accessibility.
• Replicas can be stored on on-site and off-site servers, on cloud-based hosts, or all within the same system.

Need for Data Replication
• Higher Availability: Data is replicated over numerous locations so that users can access it even if some of the copies are unavailable due to site failures.
• Reduced Latency: By keeping data geographically closer to the customer, replication helps to reduce query latency. E.g., Netflix keeps a replicated copy of its data closer to the user.
• Read Scalability: Read queries can be served from any of the replicated copies of the same data, which increases overall query throughput.
• Fault Tolerance

Replica Placement
• The placement problem should be split into two subproblems:
1. Placing Replica Servers: Replica-server placement is concerned with finding the best locations to place a server that can host a data store.
2. Placing Content: Content placement deals with finding the best servers on which to place the content.

1. Content Replication and Placement:
• Permanent Replicas: These can be considered the initial set of replicas that constitute a distributed data store. In many cases, the number of permanent replicas is small.
• E.g., a website. Website distribution comes in two forms.
• The first form is one in which the files that constitute a site are replicated across a limited number of servers at a single location.
Whenever a request comes in, it is forwarded to one of the servers using a round-robin strategy.
• The second form is called mirroring. In this case, a website is copied to a limited number of servers, called mirror sites, which are geographically spread across the Internet. In most cases, clients simply choose one of the mirror sites from a list offered to them.

• Server-Initiated Replicas:
• In contrast to permanent replicas, server-initiated replicas are copies of a data store that exist to enhance performance and are created at the initiative of the data store.
• E.g., a web server placed in New York. Normally, this server can handle incoming requests quite easily, but over a couple of days a sudden burst of requests may come in from an unexpected location far from the server. In that case, it may be worthwhile to install a number of temporary replicas in the regions the requests are coming from.
• Hosting services can dynamically replicate files to servers where those files are needed to enhance performance, that is, close to the demanding clients.
• The algorithm for dynamic replication takes two issues into account. First, replication can take place to reduce the load on a server. Second, specific files on a server can be migrated or replicated to servers placed in the proximity of clients that issue many

• Client-Initiated Replicas: Client-initiated replicas are more commonly known as caches. In essence, a cache is a local storage facility that a client uses to temporarily store a copy of the data it has just requested. In principle, managing the cache is left entirely to the client. The data store from which the data was fetched plays no part in keeping the cached data consistent.
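As a rough illustration of a client-initiated replica, the sketch below shows a client that keeps a local copy of data it has just fetched and decides on its own when that copy is too old to use. The `ClientCache` class, the `fetch_from_store` callable, and the time-to-live policy are all assumptions made for this example; the slides do not prescribe any particular cache-management policy.

```python
import time
from typing import Callable, Dict, Tuple


class ClientCache:
    """Hypothetical client-side cache: the client alone manages freshness."""

    def __init__(
        self,
        fetch_from_store: Callable[[str], str],  # assumed way to reach the data store
        ttl_seconds: float,                      # assumed expiry policy chosen by the client
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_store
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (cached value, time at which it was fetched)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            # Served locally; the data store is not contacted and is not
            # asked whether the value has changed in the meantime.
            return entry[0]
        # Missing or expired: fetch again and remember the new copy.
        value = self._fetch(key)
        self._entries[key] = (value, now)
        return value
```

Because the data store never notifies the client, a cached value can be stale until the client's own expiry rule triggers a new fetch. That is the trade-off the slide describes: low latency for the client, with consistency left to the client.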
Types of Data Replication
• Asynchronous vs synchronous replication
• Active vs passive replication
• Based on server model
o Single Leader Architecture
o Multi Leader Architecture
o No Leader Architecture
• Based on replication scheme
o Full Data Replication
o Partial Data Replication

1. Asynchronous Replication: The replica is modified after the commit (save) has been issued on the database, so it may briefly lag behind.
2. Synchronous Replication: The replica is modified immediately, as soon as changes are made to the relation (table).

1. Active Replication:
• It is a non-centralized replication mechanism. The central idea is that all replicas receive and process the same set of client requests.
• Consistency is ensured by assuming that replicas generate the same output when given the same input in the same sequence. This assumption means that servers respond to requests deterministically.
• Clients do not address a single server, but rather a group of servers.
2. Passive Replication:
• Client requests are processed by just one server (the primary).
• The primary server changes the status of the other

Based on Server Model
1. Single Leader Architecture:
o One server accepts client writes, and replicas pull data from it.
o This is the most popular and traditional approach.
2. Multi Leader Architecture:
o Multiple servers can accept writes and serve as a model for replicas.
o To avoid delay, the copies should be spread out and a leader should be near each of them.
3. No Leader Architecture:
o Every server can receive writes and function as a model for replicas.
o While it provides maximum flexibility, it makes synchronization difficult.

Based on Replication Scheme
1. Full Data Replication:
o It refers to the replication of the whole database across all sites.
o Since results can be obtained from any local server, full replication speeds up the execution of global queries.
o The drawback is that updates are often slow, which makes it challenging to keep the copies at all locations current.
2. Partial Data Replication:
o Here, only selected parts of the database are replicated, based on the significance of the data at each site.
o The number of copies can be anything from one to the total number of nodes in the distributed system.
o This kind of replication can be effective for members of sales and marketing teams, where a partial database is maintained on

Replication Models
1. Master-Slave Model
2. Client-Server Model
3. Peer-to-Peer Model

1. Master-Slave Model
• In this model, one of the copies is the master replica and all the other copies are slaves.
• Slaves should always be identical to the master. The functionality of the slaves is very limited, so the configuration is very simple.
• The slaves are essentially read-only.
• Most master-slave services ignore all updates or modifications performed at a slave and undo them during synchronization, making the slave identical to the master.
• Modifications and updates can be reliably performed at the master, and the slaves must synchronize directly with the master.

2. Client-Server Model
• The functionality of the clients in this model is more complex than that of the slaves in the master-slave model.
• It allows multiple inter-communicating servers, and all types of data modifications and updates can be generated at the client.
• One replication system in which this model is successfully implemented is Coda.
• In client-server replication, all updates must be propagated first to the server, which then updates all the other clients.
• In this model, one replica of the data is designated as the special server replica.
• All updates created at other replicas must be registered with the server before they can be propagated further. Since all updates

3. Peer-to-Peer Model
• Here all the replicas (copies) are of equal importance; they are all peers.
• Any replica can synchronize with any other replica, and any file system modification or update can be applied at any replica.
• Peer-to-peer systems can propagate updates faster by making use of any available connectivity.
• They provide a very rich and robust communication framework, but they are more complex to implement.
• Another problem with this model is scalability.
• As synchronization and communication are allowed between any replicas, this results in exceedingly large complicated data
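To make the scalability concern concrete, the sketch below counts how many replica pairs may need to synchronize with each other under the master-slave and peer-to-peer models. The function names are hypothetical, and counting pairs is only a crude proxy for synchronization cost; it is an assumption of this example, not a measure taken from the slides.

```python
def master_slave_sync_pairs(replicas: int) -> int:
    """Each slave synchronizes only with the master: n - 1 pairs."""
    if replicas < 1:
        raise ValueError("need at least one replica")
    return replicas - 1


def peer_to_peer_sync_pairs(replicas: int) -> int:
    """Any replica may synchronize with any other: n * (n - 1) / 2 pairs."""
    if replicas < 1:
        raise ValueError("need at least one replica")
    return replicas * (replicas - 1) // 2


if __name__ == "__main__":
    for n in (5, 10, 100):
        print(n, master_slave_sync_pairs(n), peer_to_peer_sync_pairs(n))
```

With 100 replicas, the master-slave count is 99 while the peer-to-peer count is 4950: the number of possible synchronization pairs grows quadratically when every replica may talk to every other.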