DC UT1 CompsA


Module 1

1. What do you mean by Distributed Computing? Enlist goals and issues of Distributed Systems.
Distributed computing refers to a system in which processing and data storage are distributed across multiple devices or systems, rather than being handled by a single central device.
In a distributed system, each device or system has its own processing capabilities and
may also store and manage its own data.
These devices or systems work together to perform tasks and share resources, with no
single device serving as the central hub.
One example of a distributed computing system is a cloud computing system, where
resources such as computing power, storage, and networking are delivered over the
Internet and accessed on demand.
In this type of system, users can access and use shared resources through a web
browser or other client software.

Components
● Devices or Systems: The devices or systems in a distributed system have their
own processing capabilities and may also store and manage their own data.
● Network: The network connects the devices or systems in the distributed
system, allowing them to communicate and exchange data.
● Resource Management: Distributed systems often have some type of resource
management system in place to allocate and manage shared resources such as
computing power, storage, and networking.

Goals and Issues


● Heterogeneity
● Openness
● Scalability
● Security
● Failure Handling
● Concurrency
● Transparency

2. Describe Grid and Cluster computing models w.r.t. architecture, advantages, and disadvantages.
Cluster Computing Model
Cluster computing is a collection of tightly or loosely connected computers that work
together so that they act as a single entity.
The connected computers execute operations all together thus creating the idea of a
single system.
The clusters are generally connected through fast local area networks (LANs).

Architecture:
● It is designed with an array of interconnected individual computers and computer
systems operating collectively as a single standalone system.
● It is a group of workstations or computers working together as a single,
integrated computing resource connected via high-speed interconnects.
● A node – Either a single or multiprocessor network having memory, input, and
output functions and an operating system.
● Two or more nodes are connected on a single line or every node might be
connected individually through a LAN connection.

Advantages:
● High Performance
● Easy to manage
● Scalable
● Availability
● Flexibility

Disadvantages:
● High cost
● Problems in finding faults
● More space is needed

Grid Computing Model


Grid Computing can be defined as a network of computers working together to perform a task that would otherwise be difficult for a single machine.
All machines on that network work under the same protocol to act as a virtual
supercomputer.
The task that they work on may include analyzing huge datasets or simulating situations
that require high computing power.
Computers on the network contribute resources like processing power and storage
capacity to the network.

Architecture:
● Fabric Layer: It is the lowest layer and offers interfaces to local resources at a
specific site. These interfaces are customized to permit sharing of resources
within a virtual organization.
● Connectivity layer: This layer contains communication protocols to offer support
for grid transactions that span the usage of multiple resources. Protocols are
required to send data among resources or to access a resource from a distant
location. Also, this layer will have security protocols to authenticate users and
resources. If the user's program is authenticated instead of the user themselves, the connectivity layer handles the delegation of rights from the user to the program.
● Resource layer: This layer manages a single resource. It uses the functions offered by the connectivity layer and directly calls the interfaces made available by the fabric layer. As an example, this layer will provide
functions for getting configuration information on a particular resource or
generally, to carry out specific operations such as process creation or reading
data. Hence this layer is responsible for access control, and hence will be
dependent on the authentication carried out as part of the connectivity layer.
● Collective Layer: It handles access to multiple resources and contains services
for resource discovery, allocation and scheduling of tasks onto multiple
resources, data replication, and so on. This layer contains many diverse
protocols for a variety of functions, reflecting the broad range of services it may
provide to a virtual organization.
● Application Layer: This layer contains applications that function within a virtual organization and utilize the grid computing environment.

Advantages:
● It is not centralized, as there are no servers required, except the control node
which is just used for controlling and not for processing.
● Multiple heterogeneous machines i.e. machines with different Operating Systems
can use a single grid computing network.
● Tasks can be performed parallelly across various physical locations and the
users don’t have to pay for them (with money).

Disadvantages:
● The grid software is still at an early stage of evolution.
● A super fast interconnect between computer resources is the need of the hour.
● Licensing across many servers may make it prohibitive for some applications.
● Many groups are reluctant to share resources.

3. Compare NOS, DOS, and middleware with neat diagrams.


Distributed Operating System (DOS): an operating system that manages a collection of independent computers and makes them appear to its users as a single computer. It follows an n-tier client-server architecture.

Network Operating System (NOS): also referred to as the Dialoguer; the software that runs on a server and enables the server to manage data, users, groups, security, applications, and other networking functions. It follows a 2-tier client-server architecture.

Middleware: computer software that provides services to software applications beyond those available from the operating system.

| Item | DOS (Multiprocessor) | DOS (Multicomputer) | NOS | Middleware |
|---|---|---|---|---|
| Degree of transparency | Very high | High | Low | High |
| Same OS on all nodes | Yes | Yes | No | No |
| Number of copies of OS | 1 | N | N | N |
| Basis of communication | Shared memory | Messages | Files | Model specific |
| Resource management | Global, central | Global, distributed | Per node | Per node |

4. Differentiate between replication and caching. An experimental file server is up 75% of the time and down 25% of the time due to bugs. How many times does this file server have to be replicated to give an availability of at least 99%?
Replication:
1. Replication not only increases availability and reliability but also helps to balance
the load between components leading to better performance. Also, having a copy
nearby can hide many of the communication latency problems.
2. Make copies of data available at different machines:
● Replicated file servers (mainly for fault tolerance)
● Replicated databases
● Mirrored websites
● Large-scale distributed shared memory systems

Caching:
1. Caching is a special form of replication, caching results in making a copy of a
resource, generally in the proximity of the client accessing that resource.
2. Allow client processes to access local copies:
● Web caches (browser/web proxy)
● File caching (at server and client)

Down-time of one server = 25% = 0.25
Allowed unavailability = 100% - 99% = 1% = 0.01
With n independent replicas, all n are down simultaneously with probability (0.25)^n, so we need:
(0.25)^n ≤ 0.01
n ≥ log(0.01) / log(0.25) ≈ 3.32
Therefore the server must be replicated n = 4 times.
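The calculation above generalizes to any down-time and availability target. The sketch below is illustrative (`replicas_needed` is a hypothetical helper) and assumes replica failures are independent.

```python
import math

def replicas_needed(p_down: float, availability: float) -> int:
    """Smallest n with p_down**n <= 1 - availability, i.e. the chance
    that all n independent replicas are down at once stays within the
    allowed unavailability."""
    allowed = 1.0 - availability
    return math.ceil(math.log(allowed) / math.log(p_down))

# one server is down 25% of the time; we want at least 99% availability
print(replicas_needed(0.25, 0.99))  # -> 4
```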
5. What is Middleware? Enlist the services of Middleware.
In the context of distributed applications, middleware refers to software that offers
additional services above and beyond those offered by the operating system to allow
data management and communication across the various distributed system
components.
Complex distributed applications are supported and made easier by middleware.
Middleware often enables interoperability between applications that run on different
operating systems, by supplying services so that the application can exchange data in a
standards-based way.
Middleware sits "in the middle" between application software that may be working on
different operating systems.
Middleware comes in many forms, including database middleware, transactional
middleware, intelligent middleware, content-centric middleware, and message-oriented
middleware.
Middleware provides a variety of services, including control services, communication
services, and security services.

Services offered by Middleware


● Communication services
● Information services
● Control Services
● Security Services
● Persistence
● Messaging
● Querying
● Concurrency
Module 2

6. Explain the different types of Message Oriented Communication (Transient, Persistent, Synchronous, Asynchronous) with neat diagrams and examples.
Processes in a distributed system must communicate with each other in order to work
together.
Message-oriented communication can be viewed along 2 axes: persistence (whether the system is persistent or transient) and synchronicity (whether it is synchronous or asynchronous).
Persistence: In persistent communication, messages are stored at each intermediate hop along the way until the next node is ready to take delivery. This is also called a store-and-forward delivery paradigm. Examples: the postal system (pony express), email, etc. In transient communication, messages are buffered only for short periods of time (as long as the sending and receiving applications are executing). If a message cannot be delivered or the next host is down, it is discarded. Example: general TCP/IP communication.
Synchronicity: In synchronous communication, the sender blocks further operations
until some sort of acknowledgment or response is received, hence the name-blocking
communication. In asynchronous or non-blocking communication, the sender continues
execution without waiting for any acknowledgment or response. This form needs a local
buffer at the sender to deal with it at a later stage.
Persistent synchronous communication: The sender is blocked when it sends the
message, waiting for an acknowledgment to come back. The message is stored in a
local buffer, waiting for the receiver to run and receive the message. Some instant
message applications, such as Blackberry messenger, are good examples. When you
send out a message, the app shows you the message is "delivered" but not "read". After
the message is read, you will receive another acknowledgment.
Transient asynchronous communication: Since the message is transient, both
entities must be running. Also, the sender doesn't wait for responses because it is
asynchronous. UDP is an example.
Receipt-based transient synchronous communication: The acknowledgment sent
back from the receiver indicates that the message has been received by the other end.
The receiver might be working on some other process.
Delivery-based transient synchronous communication: The acknowledgment
comes back to the sender when the other end takes control of the message.
Asynchronous RPC is an example.
Response-based transient synchronous communication: The sender blocks until
the receiver processes the request and sends back a response. RPC is an example.
There is no clear mapping of TCP to any single type of communication. From an application standpoint, it maps to transient asynchronous communication. However, from a protocol standpoint, it maps to receipt-based transient synchronous communication if the window size is one.

7. Explain the working of RPC with neat diagrams.


RPC is used to implement the client/server model. It is easier to program than sockets.
When the client process makes a call to the remote function, the following steps are
executed to complete RPC
1. Client procedure calls client stub in normal way
2. Client stub builds message, calls local OS
3. The client's OS sends a message to the remote OS
4. Remote OS gives a message to server stub
5. Server stub unpacks parameters and calls the server’s routine
6. The server does work, returns result to the stub
7. Server stub packs it in the message and calls local OS
8. The server's OS sends a message to the client's OS
9. The client's OS gives a message to client stub
10. Stub unpacks result, returns to client
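The role of the two stubs in the steps above can be sketched with an in-process mock. This is purely illustrative: `pickle` stands in for the marshalling, and the direct call from `client_stub` to `server_stub` stands in for the OS and network transport of steps 3-4 and 8-9.

```python
import pickle

def add(a, b):                        # step 6: the server's actual routine
    return a + b

SERVER_ROUTINES = {"add": add}

def server_stub(message: bytes) -> bytes:
    # steps 5 and 7: unpack the parameters, call the routine, pack the result
    name, args = pickle.loads(message)
    result = SERVER_ROUTINES[name](*args)
    return pickle.dumps(result)

def client_stub(name, *args):
    # steps 2 and 10: marshal the call, unmarshal the reply
    message = pickle.dumps((name, args))
    reply = server_stub(message)      # steps 3-4 and 8-9 would cross the network
    return pickle.loads(reply)

print(client_stub("add", 2, 3))       # the remote call looks like a local one -> 5
```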

8. How does RPC differ from a normal procedure call?


| RPC | LPC |
|---|---|
| Responds to remote as well as local procedure calls | Responds only to local procedure calls |
| Uses a network connection for remote calls | Does not require a network to make a call |
| Adds network latency to the software execution time | Local calls are always faster than network-based remote calls |
| Can fail due to network connectivity problems | Connection failure does not occur |
| Requires remote setup for specific procedures | Operates implicitly |
| Supports cross-platform communication | Procedures are called within the same program and system |

9. Discuss the components of RMI in distributed systems.


RMI stands for Remote Method Invocation. It is a mechanism that allows an object
residing in one system (JVM) to access/invoke an object running on another JVM.

RMI is used to build distributed applications; it provides remote communication between Java programs. It is provided in the package java.rmi.

Architecture of an RMI Application


In an RMI application, we write two programs, a server program (resides on the server)
and a client program (resides on the client).
● Inside the server program, a remote object is created and a reference of that
object is made available for the client (using the registry).
● The client program requests the remote objects on the server and tries to invoke
its methods.

Components of RMI
Transport Layer: This layer connects the client and the server. It manages the existing
connection and also sets up new connections.
Stub: A stub is a representation (proxy) of the remote object at the client. It resides in
the client system; it acts as a gateway for the client program.
Skeleton: This is the object that resides on the server side. The stub communicates with this skeleton to pass requests to the remote object.
RRL(Remote Reference Layer): It is the layer that manages the references made by
the client to the remote object.

10. What do you mean by stream communication? How to specify the required
QoS in a stream communication?
Stream-oriented communication is a form of communication in which timing plays an
important role.
Stream-oriented communication is also referred to as continuous streams of data.

Features
● Supports for continuous media
● Streams in distributed systems
● Stream management

Transmission modes
● Asynchronous: data items in the stream are transmitted one after the other, with no further timing constraints.
● Synchronous: a maximum end-to-end delay is defined for each unit in the stream.
● Isochronous: both a maximum and a minimum end-to-end delay are defined, so the delay variance (jitter) is bounded.

Characteristics
1. Streams are unidirectional.
2. Generally a single source, one or more sinks.
3. Often either the sink or the source is wrapped around hardware (e.g., camera, CD device, TV monitor).
4. Simplex stream: data flows in a single direction.
5. Complex stream: several related sub-streams flow together. Example: video with subtitles.

Streams and Quality of Service:


Timing (and other nonfunctional) requirements are generally expressed as Quality of
Service (QoS) requirements.
These requirements describe what is needed from the underlying distributed system
and network to ensure that, for example, the temporal relationships in a stream can be
preserved.
QoS for continuous data streams mainly concerns timeliness, volume, and reliability.
From an application's perspective, in many cases, it boils down to specifying a few
important properties:
● The required bit rate at which data should be transported.
● The maximum delay until a session has been set up (i.e., when an application
can start sending data).
● The maximum end-to-end delay (i.e., how long it will take until a data unit makes
it to a recipient).
● The maximum delay variance, or jitter.
● The maximum round-trip delay.

Module 3

11. Differentiate between Physical and logical clock synchronization


Physical clock
It is a physical process and also a method of measuring that process to record the
passage of time.
For example, the rotation of the Earth is measured in solar days. Most of the physical
clocks are based on cyclic processes such as celestial rotation.

Logical clock
It is a mechanism for capturing causal and chronological relationships in a distributed
system.
A physically synchronous global clock may not be present in a distributed system. In
such systems, a logical clock allows the global ordering of events from different
processes.

Vector clock
It is an algorithm for generating a partial ordering of events in a distributed system. It
detects causality violations.
Like the Lamport timestamps, interprocess messages contain the state of the sending
process's logical clock.
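A minimal vector clock can be sketched as follows (an illustrative implementation, not taken from the notes): each process keeps one counter per process, increments its own slot on every event, and takes an element-wise maximum when it receives a timestamp.

```python
class VectorClock:
    def __init__(self, n_processes: int, pid: int):
        self.v = [0] * n_processes
        self.pid = pid

    def local_event(self):
        self.v[self.pid] += 1

    def send(self):
        # sending counts as an event; the message carries the whole vector
        self.local_event()
        return list(self.v)

    def receive(self, timestamp):
        # merge element-wise, then count the receive as an event
        self.v = [max(a, b) for a, b in zip(self.v, timestamp)]
        self.local_event()

p0, p1 = VectorClock(2, 0), VectorClock(2, 1)
ts = p0.send()   # p0 is now [1, 0]
p1.receive(ts)   # p1 merges and ticks its own slot
print(p1.v)      # -> [1, 1]
```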

12. Discuss Cristian's and Berkeley's physical clock synchronization algorithms and compare them.
Cristian’s Algorithm
Cristian’s Algorithm is a clock synchronization algorithm used by client processes to synchronize time with a time server. It works well in low-latency networks, where the Round Trip Time is short compared to the required accuracy, but it is not well suited to failure-prone distributed systems/applications. Here Round Trip Time refers to the duration between the start of a Request and the end of the corresponding Response.

Algorithm:
1. The process on the client machine sends a request for the clock time (time at the server) to the Clock Server at time T0.
2. The Clock Server listens to the request made by the client process and returns
the response in form of clock server time.
3. The client process fetches the response from the Clock Server at time T1 and
calculates the synchronized client clock time using the formula given below.
Tclient = Tserver + (T1 - T0)/2
where
Tclient refers to the synchronized clock time,
Tserver refers to the clock time returned by the server,
T0 refers to the time at which the request was sent by the client process,
T1 refers to the time at which response was received by the client process

Working/Reliability of the above formula:


T1 - T0 is the combined time taken by the network and the server to transfer the request, process it, and return the response to the client process, assuming that the network latency of the request and of the response are approximately equal.
The time at the client side differs from the actual time by at most (T1 - T0)/2 seconds. Hence the synchronization error satisfies:
error ∈ [-(T1 - T0)/2, (T1 - T0)/2]
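The synchronization formula can be written as a small helper (an illustrative function with example values, not part of the notes):

```python
def cristian_time(t0: float, t1: float, t_server: float) -> float:
    """Tclient = Tserver + (T1 - T0) / 2; the error is at most (T1 - T0) / 2."""
    return t_server + (t1 - t0) / 2

# request sent at T0 = 5.0 s, reply received at T1 = 5.4 s, and the
# server reported 100.0 s: the client sets its clock to about 100.2 s
print(cristian_time(5.0, 5.4, 100.0))
```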

Berkeley’s Algorithm
Berkeley’s Algorithm is a clock synchronization technique used in distributed systems.
The algorithm assumes that each machine node in the network either doesn’t have an
accurate time source or doesn’t possess a UTC server.

Algorithm:
1. An individual node is chosen as the master node from the pool of nodes in the network. This node acts as the master, and the rest of the nodes act as slaves. The master node is chosen using an election process/leader election algorithm.
2. The master node periodically pings the slave nodes and fetches their clock times using Cristian’s algorithm.

master sends requests to slave nodes


slave nodes send back time given by their system clock

3. Master node calculates the average time difference between all the clock times
received and the clock time given by the master’s system clock itself. This
average time difference is added to the current time at the master’s system clock
and broadcasted over the network.
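Step 3 can be sketched as follows (an illustrative helper with made-up clock values; in practice the master would also correct the fetched times for network delay):

```python
def berkeley_adjustments(master_time: float, slave_times: list) -> list:
    """The master averages every clock (its own included) and returns the
    offset each node, master first, must apply to reach that average."""
    clocks = [master_time] + slave_times
    average = sum(clocks) / len(clocks)
    return [average - t for t in clocks]

# master reads 180 s, slaves read 170 s and 205 s; the average is 185 s,
# so the adjustments broadcast are +5, +15, and -20 seconds respectively
print(berkeley_adjustments(180.0, [170.0, 205.0]))  # -> [5.0, 15.0, -20.0]
```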

13. Explain different averaging algorithms for physical clock synchronization in distributed systems. At what intervals will the clocks of a distributed system be synchronized?
The averaging algorithm is a decentralized algorithm. One such algorithm works by dividing time into fixed-length intervals. Let T0 be any agreed-upon moment in the past. The kth interval starts at time T0 + kS and runs until T0 + (k+1)S, where S is a system parameter.
At the start of each interval, every machine broadcasts its clock time. These broadcasts happen at different moments because the machines' clocks run at different speeds. After a machine finishes its broadcast, it starts a local timer to collect the broadcasts from other machines during some time interval L. When all the broadcasts have been collected, each machine uses one of the following algorithms:
● Average the time values collected from all machines.
● Another version is to discard n highest and n lowest time values and take the
average of the rest.
● Another variation is to add the propagation time of the message from the source
machine in the received time. Propagation time can be calculated using the
known topology of the network.
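The second variant above (discard the n highest and n lowest values, then average) can be sketched like this; the function name and sample values are illustrative only:

```python
def averaged_time(broadcasts: list, n_discard: int = 1) -> float:
    """Discard the n highest and n lowest broadcast clock values,
    then average the rest, so a wildly wrong clock cannot skew the result."""
    trimmed = sorted(broadcasts)[n_discard:len(broadcasts) - n_discard]
    return sum(trimmed) / len(trimmed)

# five machines broadcast at the start of an interval; one clock is far off,
# and trimming drops it before the average is taken (result is about 100.1)
print(averaged_time([100.1, 99.9, 100.0, 100.2, 250.0]))
```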
14. Describe the Lamport algorithm with an example.
For the synchronization of logical clocks, Lamport defined the happens-before relation. The expression x -> y is read as 'x happens before y'. This means all processes agree that event x occurs first and event y occurs after x. The following two situations establish the happens-before relation:

● If x and y are events in the same process and event x occurs before y, then x -> y is true.
● If x is the event of sending a message by one process and y is the event of receiving the same message by another process, then x -> y is true. Practically, a message takes nonzero time to reach another process.

If x -> y and y -> z, then x -> z; the happens-before relation is transitive. If two processes do not exchange messages directly or indirectly, then neither x -> y nor y -> x holds, where x and y are events occurring in these processes. In this case, the events are said to be concurrent.

Let T(x) be the time value of event x. If x -> y, then T(x) < T(y). If x and y are events in the same process and x occurs before y, then T(x) < T(y). If x is the event of sending a message by one process and y is the event of receiving that message by another process, then all processes must agree on the values T(x) and T(y), with T(x) < T(y). Clock time T is assumed to move forward and must never decrease; a clock can be corrected only by adding a positive value.

Three processes A, B, and C are shown, running on different machines, each with a clock running at its own speed. The clock has ticked 5 times in process A, 7 times in process B, and 9 times in process C. The rates differ because of the different crystals in the timers.
At time 5, process A sends message p to process B, and it is received at time 14. If the message contains the sending time, process B concludes that the message took 9 ticks to travel from A to B. The same is true of message q from process B to process C.

Process B sends message s at time 56 and it is received by process A at time 45. Similarly, process C sends message r at time 54 and it is received by process B at time 49. This should not happen and must be prevented: the sending time is larger than the receiving time.

Message r from process C leaves at time 54. By the happens-before relation, it should reach process B at time 55 or later. The same is true for message s, which should reach process A at time 63 or later. The receiver therefore fast-forwards its clock to one more than the sending time; that is why the sender always includes its sending time in the message. If two events occur in sequence, the clock must tick at least once between them: if a process sends and receives a message in quick succession, the clock must advance by at least 1 between the send and the receive.

If two events occur at the same time in different processes, they can be distinguished by a decimal point: if both occur at time 20, one is recorded as 20.1 and the other as 20.2.

● If x -> y in the same process, then T(x) < T(y).
● If x is the event of sending a message and y is the event of receiving that message, then T(x) < T(y).
● For all distinct events x and y, T(x) ≠ T(y).

The total ordering of all the events can be carried out with the above algorithm.
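The clock rules above can be sketched as a small class (an illustrative implementation; the numbers reuse the example's message r, stamped 54, arriving at a lagging receiver):

```python
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self) -> int:
        # a local event or a send: the clock must advance by at least 1
        self.time += 1
        return self.time

    def receive(self, sent_time: int) -> int:
        # fast-forward to one past the sender's timestamp if we are behind
        self.time = max(self.time, sent_time) + 1
        return self.time

receiver = LamportClock()
receiver.time = 40
print(receiver.receive(54))  # a message stamped 54 arrives: clock jumps to 55
```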

15. Discuss Election algorithms with their steps and example:


Many distributed algorithms are designed so that one process acts as a coordinator and plays some special role. This can be any process. If this process fails, some approach is required to assign the role to another process; in this case, an election is held to choose the coordinator.

In a group of processes, if all processes are the same then the only way to assign this
responsibility is on the basis of some criterion. This criterion can be the identifier which
is some number assigned to the process. For example, it could be the network address
of the machine on which the process is running. This assumption considers that only
one process is running on the machine.

The election algorithm locates the process with the highest number in order to elect it as
a coordinator. Every process is aware of the process number of other processes. But,
processes do not know which processes are currently up or which ones are currently
crashed. The election algorithm ensures that all processes will agree on the newly
elected coordinator.

Bully Election Algorithm


In this algorithm, if any process finds that the coordinator is not responding to its
request, then it initiates the election. Suppose, process P initiates the election. The
election is carried out as follows:

● Process P sends an Election message to all processes with higher numbers than its own.
● If no one replies to P's Election message, P wins the election.
● If any higher-numbered process responds, it takes over; P's work is done.

Any process may receive an Election message from a lower-numbered colleague. The receiver replies with an OK message to the sender, conveying that it is alive and will take over the job. If the receiver is not already holding an election, it then starts one itself.
Eventually, all processes give up except the highest-numbered one. This process wins the election and sends a Coordinator message to all processes to announce that it is now the new coordinator.

Initially, process 17 (the highest-numbered) is the coordinator.

Process 14 notices that coordinator 17 has just crashed, as it has not responded to process 14's request. Process 14 now initiates the election by sending an Election message to the higher-numbered processes 15, 16, and 17.

Processes 15 and 16 respond with OK messages. Process 17 has already crashed and hence does not reply. The job of process 14 is now over.

Now processes 15 and 16 each hold an election by sending Election messages to their higher-numbered processes. As process 17 crashed, only process 16 replies to 15 with an OK message.

Finally, process 16 wins the election and sends a Coordinator message to all the processes to inform them that it is now the new coordinator. In this algorithm, if two processes simultaneously detect that the coordinator has crashed, both initiate an election. Every higher-numbered process then receives two Election messages; it simply ignores the second, and the election carries on as usual.
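The outcome of the message exchange above can be sketched as a recursion (an illustrative model, not a real message-passing implementation): the initiator challenges every higher id, any live higher process takes over, and so the highest live id always wins.

```python
def bully_election(initiator: int, alive: set) -> int:
    """Model of the bully algorithm's outcome: the initiator sends
    Election messages to all higher ids; if a live higher process
    replies OK, it takes over the election."""
    higher = [p for p in alive if p > initiator]
    if not higher:
        return initiator                      # no OK replies: initiator wins
    return bully_election(min(higher), alive)  # a higher process takes over

# process 17 has crashed; process 14 starts the election among the survivors
print(bully_election(14, {10, 14, 15, 16}))  # -> 16
```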
Ring Election Algorithm
This ring algorithm does not use tokens. Processes are physically or logically ordered in a ring, so each process knows its successor. When any process notices that the coordinator has crashed, it builds an Election message containing its own process number and sends it to its successor.

If the successor is down, the process sends the message to the next process along the ring; if it finds several crashed processes in sequence, it locates the next running process. Each receiver of the Election message appends its own number and forwards the message to its successor.
In this way, the message eventually returns to the process that started the election. This incoming message contains that process's own number along with the numbers of all the processes that received the message.

At this point, the message is converted into a Coordinator message and circulated along the ring once more to inform the processes of the new coordinator and of the members of the ring. Of course, the highest process number in the collected list is the one chosen as the new coordinator by the process that started the election.
Process 5 notices the crash of the coordinator, which was initially process 7. It then sends an Election message containing its number, 5, to process 6. Process 6 appends its own number and, as process 7 has crashed, forwards the message to its new successor, process 0. In this way, the message is received by all the processes in the ring.

Eventually, the message arrives back at process 5, which initiated the election. The highest number in the list is 6, so a Coordinator message is circulated in the same manner to inform all the processes that process 6 is now the new coordinator.
