DC UT1 CompsA
Components
● Devices or Systems: The devices or systems in a distributed system have their
own processing capabilities and may also store and manage their own data.
● Network: The network connects the devices or systems in the distributed
system, allowing them to communicate and exchange data.
● Resource Management: Distributed systems often have some type of resource
management system in place to allocate and manage shared resources such as
computing power, storage, and networking.
Architecture:
● It is designed as an array of interconnected individual computers and computer
systems operating collectively as a single integrated system.
● It is a group of workstations or computers working together as a single,
integrated computing resource connected via high-speed interconnects.
● A node – either a uniprocessor or multiprocessor system with its own memory,
input/output facilities, and operating system.
● Two or more nodes may be connected on a single line, or every node may be
connected individually through a LAN connection.
Advantages:
● High Performance
● Easy to manage
● Scalable
● Availability
● Flexibility
Disadvantages:
● High cost
● Problems in finding faults
● More space is needed
Architecture:
● Fabric Layer: It is the lowest layer and offers interfaces to local resources at a
specific site. These interfaces are customized to permit sharing of resources
within a virtual organization.
● Connectivity layer: This layer contains communication protocols to offer support
for grid transactions that span the usage of multiple resources. Protocols are
required to send data among resources or to access a resource from a distant
location. Also, this layer will have security protocols to authenticate users and
resources. If a user's program is authenticated on behalf of the user (rather than
the user directly), the connectivity layer also handles the delegation of rights
from the user to that program.
● Resource layer: This layer manages a single resource. It uses the functions
offered by the connectivity layer and directly calls the interfaces made available
by the fabric layer. For example, this layer provides functions for obtaining
configuration information on a particular resource or, more generally, for carrying
out specific operations such as process creation or reading data. This layer is
therefore responsible for access control, and so it depends on the authentication
carried out as part of the connectivity layer.
● Collective Layer: It handles access to multiple resources and contains services
for resource discovery, allocation and scheduling of tasks onto multiple
resources, data replication, and so on. This layer contains many diverse
protocols for a variety of functions, reflecting the broad range of services it may
provide to a virtual organization
● Application Layer: This layer contains applications that function within a virtual
organization and which utilize the grid computing environment
Advantages:
● It is not centralized: no servers are required, except the control node, which is
used only for control and not for processing.
● Multiple heterogeneous machines, i.e., machines with different operating systems,
can use a single grid computing network.
● Tasks can be performed in parallel across various physical locations, and the
users do not have to pay for them.
Disadvantages:
● The software of the grid is still in the evolution stage.
● A very fast interconnect between the computing resources is still needed.
● Licensing across many servers may make it prohibitive for some applications.
● Many groups are reluctant to share resources.
Comparison (number of copies of the OS): multiprocessor distributed OS – 1;
multicomputer distributed OS – N; network OS – N; middleware-based system – N.
Caching:
1. Caching is a special form of replication: it results in making a copy of a
resource, generally in the proximity of the client accessing that resource.
2. Allow client processes to access local copies:
● Web caches (browser/web proxy)
● File caching (at server and client)
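As a small illustration of client-side caching, the sketch below keeps local copies of remote resources in a map, so repeated accesses avoid a network round trip. The class and the fetchFromServer() helper are hypothetical placeholders, not part of any real library.

import java.util.HashMap;
import java.util.Map;

// Minimal sketch of client-side caching of remote resources.
public class ClientCache {
    private final Map<String, String> localCopies = new HashMap<>();

    public String get(String resourceId) {
        // Serve from the local copy if this resource was already cached.
        String cached = localCopies.get(resourceId);
        if (cached != null) {
            return cached;
        }
        // Otherwise fetch from the remote server and keep a local copy.
        String fresh = fetchFromServer(resourceId);
        localCopies.put(resourceId, fresh);
        return fresh;
    }

    private String fetchFromServer(String resourceId) {
        // Placeholder for the real remote access (e.g., an HTTP request or RMI call).
        return "contents of " + resourceId;
    }

    public static void main(String[] args) {
        ClientCache cache = new ClientCache();
        System.out.println(cache.get("/index.html")); // remote fetch, then cached
        System.out.println(cache.get("/index.html")); // served from the local copy
    }
}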
Example (replication for availability): each replica is down 25% of the time (0.25),
and the required availability is 99%, i.e., the allowed downtime is 1% (0.01).
The service is unavailable only when all n replicas are down at the same time, so we
need 0.25^n <= 0.01.
Taking logarithms: n log(0.25) = log(0.01), so n = log(0.01) / log(0.25) ≈ 3.32.
Rounding up, n = 4 replications.
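A small Java sketch of the same calculation (illustrative only; the variable names are arbitrary):

// Each replica is unavailable with probability p; we want overall unavailability <= target.
public class ReplicationCount {
    public static void main(String[] args) {
        double p = 0.25;       // per-replica downtime (25%)
        double target = 0.01;  // allowed overall downtime (100% - 99%)

        // Need p^n <= target, i.e. n >= log(target) / log(p).
        double n = Math.log(target) / Math.log(p);
        int replicas = (int) Math.ceil(n);

        System.out.println("n >= " + n);              // about 3.32
        System.out.println("replicas = " + replicas); // 4
    }
}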
5. What is Middleware? Enlist the services of Middleware.
In the context of distributed applications, middleware refers to software that offers
additional services above and beyond those offered by the operating system to allow
data management and communication across the various distributed system
components.
Complex distributed applications are supported and made easier by middleware.
Middleware often enables interoperability between applications that run on different
operating systems, by supplying services so that the application can exchange data in a
standards-based way.
Middleware sits "in the middle" between application software that may be working on
different operating systems.
Middleware comes in many forms, including database middleware, transactional
middleware, intelligent middleware, content-centric middleware, and message-oriented
middleware.
Middleware provides a variety of services, including control services, communication
services, and security services.
Remote call vs. local call:
● A remote call uses a network connection; a local call does not require a network.
● A remote call adds a latency factor to the software execution time; local calls are
always faster than network-based remote calls.
● A remote call may fail due to network connectivity problems; a local call has no
connection failure.
Components of RMI
Transport Layer: This layer connects the client and the server. It manages the existing
connection and also sets up new connections.
Stub: A stub is a representation (proxy) of the remote object at the client. It resides in
the client system; it acts as a gateway for the client program.
Skeleton: This is the object that resides on the server side. The stub communicates
with this skeleton to pass requests to the remote object.
RRL(Remote Reference Layer): It is the layer that manages the references made by
the client to the remote object.
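A minimal Java RMI sketch can illustrate how these components interact. The interface name Hello, the implementation HelloImpl, and the registry name used below are illustrative choices, not part of the notes above; in modern Java the stub is generated dynamically at run time, so no separate skeleton class is written by hand.

import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// The remote interface: the client calls it through a stub.
interface Hello extends Remote {
    String sayHello() throws RemoteException;
}

// Server-side implementation of the remote object.
class HelloImpl extends UnicastRemoteObject implements Hello {
    HelloImpl() throws RemoteException { super(); }

    public String sayHello() throws RemoteException {
        return "Hello from the remote object";
    }
}

public class RmiSketch {
    public static void main(String[] args) throws Exception {
        // Server side: export the remote object and register it by name.
        Registry registry = LocateRegistry.createRegistry(1099);
        registry.rebind("Hello", new HelloImpl());

        // Client side: look up the stub and invoke the method as if it were local.
        Hello stub = (Hello) registry.lookup("Hello");
        System.out.println(stub.sayHello());
    }
}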
10. What do you mean by stream communication? How to specify the required
QoS in a stream communication?
Stream-oriented communication is a form of communication in which timing plays an
important role.
Stream-oriented communication is also referred to as continuous streams of data.
Features
● Supports for continuous media
● Streams in distributed systems
● Stream management
Transmission mode
● Synchronous
1. A maximum end-to-end delay is specified for each unit in the data stream.
2. There is an upper time limit, but no lower bound (data may arrive arbitrarily early).
● Asynchronous
1. Data items are transmitted one after the other.
2. There are no further timing constraints (no time limit).
● Isochronous
1. Both a maximum and a minimum end-to-end delay are specified.
2. The delay variance (jitter) is therefore bounded.
Characteristics
1. Streams are unidirectional.
2. Generally there is a single source and one or more sinks.
3. Often either the source or the sink is wrapped around hardware (e.g., a camera,
a CD device, a TV monitor).
4. Simplex stream: a single flow of data.
5. Complex stream: multiple related flows of data. Example: video with subtitles.
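In practice, the required QoS for a stream is usually written down as a flow specification listing bandwidth, delay, jitter, and loss bounds that match the transmission modes above. The sketch below is only an illustration of such a specification; the class and field names (StreamQosSpec, maxEndToEndDelayMs, etc.) are assumptions made for this example, not a standard API.

// Illustrative flow-specification sketch for stream QoS.
public class StreamQosSpec {
    int requiredBitRateBps;   // required bit rate (bits per second)
    int maxBurstSizeBytes;    // maximum burst the stream may produce
    int maxEndToEndDelayMs;   // upper bound on end-to-end delay
    int minEndToEndDelayMs;   // lower bound (relevant for isochronous streams)
    int maxJitterMs;          // allowed delay variance between data units
    double maxLossRate;       // tolerated fraction of lost data units

    public static void main(String[] args) {
        // Example: a video stream needing ~4 Mbps, <=150 ms delay, <=10 ms jitter.
        StreamQosSpec spec = new StreamQosSpec();
        spec.requiredBitRateBps = 4_000_000;
        spec.maxBurstSizeBytes = 64 * 1024;
        spec.maxEndToEndDelayMs = 150;
        spec.minEndToEndDelayMs = 50;
        spec.maxJitterMs = 10;
        spec.maxLossRate = 0.01;
        System.out.println("Max delay: " + spec.maxEndToEndDelayMs + " ms");
    }
}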
Module 3
Logical clock
It is a mechanism for capturing causal and chronological relationships in a distributed
system.
A physically synchronous global clock may not be present in a distributed system. In
such systems, a logical clock allows the global ordering of events from different
processes.
Vector clock
It is an algorithm for generating a partial ordering of events in a distributed system. It
detects causality violations.
Like the Lamport timestamps, interprocess messages contain the state of the sending
process's logical clock.
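A minimal sketch of a vector clock for a fixed number of processes, following the description above (class and method names are illustrative):

import java.util.Arrays;

// Minimal vector clock sketch for a fixed number of processes.
public class VectorClock {
    private final int[] clock;
    private final int myId;

    public VectorClock(int numProcesses, int myId) {
        this.clock = new int[numProcesses];
        this.myId = myId;
    }

    // Local event: increment this process's own entry.
    public void tick() {
        clock[myId]++;
    }

    // Before sending a message, tick and attach a copy of the clock.
    public int[] send() {
        tick();
        return clock.clone();
    }

    // On receipt, take the element-wise maximum and then tick.
    public void receive(int[] received) {
        for (int i = 0; i < clock.length; i++) {
            clock[i] = Math.max(clock[i], received[i]);
        }
        tick();
    }

    public int[] snapshot() {
        return clock.clone();
    }

    public static void main(String[] args) {
        VectorClock p0 = new VectorClock(2, 0);
        VectorClock p1 = new VectorClock(2, 1);
        int[] msg = p0.send();   // p0 becomes [1, 0]
        p1.receive(msg);         // p1 becomes [1, 1]
        System.out.println(Arrays.toString(p1.snapshot()));
    }
}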
Cristian's Algorithm
Algorithm:
1. The process on the client machine sends a request for the clock time (the time
at the server) to the Clock Server at time T0.
2. The Clock Server listens to the request made by the client process and returns
the response in the form of its clock time.
3. The client process fetches the response from the Clock Server at time T1 and
calculates the synchronized client clock time using the formula given below.
Tclient = Tserver + (T1 - T0)/2
where
Tclient refers to the synchronized clock time,
Tserver refers to the clock time returned by the server,
T0 refers to the time at which the request was sent by the client process,
T1 refers to the time at which response was received by the client process
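A sketch of the client-side computation in Cristian's algorithm, using the formula above; requestServerTime() is a hypothetical stand-in for the real network request to the Clock Server.

// Client-side adjustment in Cristian's algorithm (illustrative sketch).
public class CristianClient {
    public static void main(String[] args) {
        long t0 = System.currentTimeMillis();   // T0: request sent
        long serverTime = requestServerTime();  // Tserver returned by the Clock Server
        long t1 = System.currentTimeMillis();   // T1: response received

        // Tclient = Tserver + (T1 - T0) / 2, assuming symmetric network delay.
        long clientTime = serverTime + (t1 - t0) / 2;
        System.out.println("Synchronized client time: " + clientTime);
    }

    private static long requestServerTime() {
        // Placeholder: pretend the server's clock is 500 ms ahead of ours.
        return System.currentTimeMillis() + 500;
    }
}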
Berkeley’s Algorithm
Berkeley’s Algorithm is a clock synchronization technique used in distributed systems.
The algorithm assumes that each machine node in the network either doesn’t have an
accurate time source or doesn’t possess a UTC server.
Algorithm:
1. An individual node is chosen as the master node from the pool of nodes in the
network. This node is the main node in the network; it acts as the master and
the rest of the nodes act as slaves. The master node is chosen using an election
process / leader election algorithm.
2. The master node periodically pings the slave nodes and fetches the clock time at
each of them using Cristian's algorithm.
3. The master node calculates the average time difference between all the clock
times received and the clock time given by the master's own system clock. This
average time difference is added to the current time at the master's system clock
and broadcast over the network.
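A sketch of the master's averaging step in Berkeley's algorithm; the clock values below are hard-coded stand-ins for the values the master would fetch from the slaves over the network.

import java.util.Arrays;

// Averaging step of Berkeley's algorithm (illustrative sketch).
public class BerkeleyMaster {
    public static void main(String[] args) {
        long masterClock = 1000;
        long[] slaveClocks = {950, 1030, 1010};

        // Average the offsets of all clocks (the master's offset from itself is 0).
        long offsetSum = 0;
        for (long slave : slaveClocks) {
            offsetSum += (slave - masterClock);
        }
        long avgOffset = offsetSum / (slaveClocks.length + 1);

        // The agreed time is the master's clock plus the average offset;
        // each node is then told how much to adjust its own clock.
        long agreedTime = masterClock + avgOffset;
        System.out.println("Slaves polled: " + Arrays.toString(slaveClocks));
        System.out.println("Agreed time: " + agreedTime);
        for (long slave : slaveClocks) {
            System.out.println("Adjustment for slave at " + slave + ": " + (agreedTime - slave));
        }
    }
}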
Happened-Before Relation (Lamport's Logical Clock)
● If x and y are events in the same process and event x occurs before y, then x
-> y is true.
● If x is the event of sending a message by one process and y is the event of
receiving the same message by another process, then x -> y is true. In practice,
a message takes nonzero time to reach the other process.
If x -> y and y -> z, then x -> z; the happen-before relation is transitive. If two
processes do not exchange messages, directly or indirectly, then neither x -> y nor
y -> x holds, where x and y are events occurring in these processes. In this case,
the events are said to be concurrent.
Let T(x) be the time value of event x. If x -> y, then T(x) < T(y). If x and y are
events in the same process and x occurs before y, then x -> y and therefore
T(x) < T(y). If x is the event of sending a message by one process and y is the event
of receiving the same message by another process, then x -> y, and all processes
should agree on the values T(x) and T(y) with T(x) < T(y). Clock time T is assumed to
move forward and should never decrease; a clock is corrected only by adding a
positive value.
Three processes A, B, and C are shown. These processes run on different machines,
and each clock runs at its own speed. The clock has ticked 5 times in process A,
7 times in process B, and 9 times in process C. The rates differ because of
differences in the timer crystals.
At time 5, process A sends message p to process B, and it is received at time 14. If
the message contains its sending time, process B concludes that the message took
9 ticks to travel from process A to process B. The same is true for message q from
process B to process C.
Message r from process C leaves at time 54. As per the happen-before relation, it
should reach process B at time 55 or later. The same is true for message s, which
should reach process A at time 63 or later. The receiver therefore fast-forwards its
clock to one more than the sending time; that is why the sender always includes the
sending time in the message. If two events occur in sequence, the clock must tick at
least once between them. If a process sends and receives messages in quick
succession, the clock must still be advanced by at least 1 between the send and
receive events.
If two events occur at the same time in different processes, they are separated by
appending a decimal fraction. If two such events occur at time 20, the former is
considered to occur at 20.1 and the latter at 20.2.
The total ordering of all the events can be carried out with the above algorithm.
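A minimal sketch of a Lamport logical clock implementing the correction rules above (class and method names are illustrative):

// Lamport logical clock sketch.
public class LamportClock {
    private long time = 0;

    // The clock ticks at least once between any two local events.
    public long tick() {
        return ++time;
    }

    // Attach the current time when sending a message.
    public long send() {
        return tick();
    }

    // On receipt, fast-forward past the sender's timestamp if necessary.
    public long receive(long senderTimestamp) {
        time = Math.max(time, senderTimestamp) + 1;
        return time;
    }

    public static void main(String[] args) {
        LamportClock a = new LamportClock();
        LamportClock b = new LamportClock();
        long ts = a.send();           // A sends at time 1
        long arrival = b.receive(ts); // B's clock jumps to at least 2
        System.out.println("A sent at " + ts + ", B received at " + arrival);
    }
}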
Bully Election Algorithm
In a group of processes, if all processes are identical, then the only way to assign
this responsibility is on the basis of some criterion. This criterion can be an
identifier, which
is some number assigned to the process. For example, it could be the network address
of the machine on which the process is running. This assumption considers that only
one process is running on the machine.
The election algorithm locates the process with the highest number in order to elect it as
a coordinator. Every process is aware of the process number of other processes. But,
processes do not know which processes are currently up or which ones are currently
crashed. The election algorithm ensures that all processes will agree on the newly
elected coordinator.
● Process P sends an Election message to all processes having a higher number
than itself.
● If no one replies to P's Election message, then P wins the election.
● If any higher-numbered process responds, it takes over the election; P's work is
done.
Any process may receive an Election message from a lower-numbered colleague. The
receiver replies with an OK message to the sender of the Election message; this reply
conveys that the receiver is alive and will take over the further job. If the receiver
is not already holding an election, it starts the election process itself.
Eventually, all processes give up except the highest-numbered one. This process wins
the election and sends a Coordinator message to all processes to convey that it is
now the new coordinator.
Now processes 15 and 16 each hold an election by sending an Election message to
their higher-numbered processes. As process 17 has crashed, only process 16 replies
to 15 with an OK message.
Finally, process 16 wins the election and sends a Coordinator message to all the
processes to inform them that it is now the new coordinator. In this algorithm, if
two processes detect simultaneously that the coordinator has crashed, both initiate
an election. Every higher-numbered process will then receive two Election messages;
it simply ignores the second one, and the election carries on as usual.
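A simplified, single-machine simulation of the Bully algorithm's outcome, assuming the set of live processes is already known; a real implementation would exchange Election, OK, and Coordinator messages over the network.

import java.util.Set;
import java.util.TreeSet;

// Bully election sketch: the highest-numbered live process becomes coordinator.
public class BullyElection {
    public static int holdElection(int initiator, Set<Integer> aliveProcesses) {
        int coordinator = initiator;
        for (int candidate : aliveProcesses) {
            // A higher-numbered live process replies OK and takes over the election.
            if (candidate > coordinator) {
                coordinator = candidate;
            }
        }
        // The winner would now broadcast a Coordinator message to everyone.
        return coordinator;
    }

    public static void main(String[] args) {
        // Process 17 (the old coordinator) has crashed; process 15 starts an election.
        Set<Integer> alive = new TreeSet<>(Set.of(12, 13, 15, 16));
        System.out.println("New coordinator: " + holdElection(15, alive)); // 16
    }
}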
Ring Election Algorithm
This ring algorithm does not use tokens. Processes are physically or logically
ordered in a ring, so each process has a successor and knows who that successor is.
When any process notices that the coordinator has crashed, it builds an Election
message and sends it to its successor. This message contains the process number of
the sending process.
If the successor is down, the process sends the message to the next process along
the ring; in this way it locates the next running process even if it finds several
crashed processes in sequence. Each receiver of the Election message likewise
appends its own number to the message and forwards it to its successor.
In this way, the message eventually returns to the process that started the
election. This incoming message contains that process's own number along with the
process numbers of all the processes that received the message.
At this point, the message is converted into a Coordinator message. This Coordinator
message is circulated along the ring once more to inform the processes about the new
coordinator and the members of the ring. Of course, the new coordinator is the
highest process number in the list, chosen by the process that started the election.
Process 5 notices the crash of the coordinator, which was initially process 7. It
then sends an Election message containing its own number (5) to process 6. As
process 7 has crashed, process 6 appends its number to the list and forwards the
message to its next live successor, process 0. In this way, the message is received
by all the processes in the ring.
Eventually, the message arrives back at process 5, which had initiated the election.
The highest number in the list is 6, so the message is circulated once more to
inform all the processes that process 6 is now the new coordinator.
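A simplified simulation of the ring election above, assuming the ring order and the set of live processes are known locally; a real implementation would pass the Election and Coordinator messages from node to node.

import java.util.ArrayList;
import java.util.List;

// Ring election sketch: the Election message circulates once around the ring
// collecting the numbers of live processes, and the initiator picks the highest.
public class RingElection {
    public static int holdElection(int initiatorIndex, boolean[] alive, int[] processIds) {
        List<Integer> collected = new ArrayList<>();
        int n = processIds.length;

        // Walk around the ring starting from the initiator, skipping crashed nodes.
        for (int step = 0; step < n; step++) {
            int idx = (initiatorIndex + step) % n;
            if (alive[idx]) {
                collected.add(processIds[idx]); // each live process appends its number
            }
        }

        // Back at the initiator: the highest collected number becomes coordinator.
        int coordinator = collected.get(0);
        for (int id : collected) {
            coordinator = Math.max(coordinator, id);
        }
        return coordinator; // a Coordinator message would now circulate with this value
    }

    public static void main(String[] args) {
        int[] ids = {0, 1, 2, 3, 4, 5, 6, 7};
        boolean[] alive = {true, true, true, true, true, true, true, false}; // process 7 crashed
        System.out.println("New coordinator: " + holdElection(5, alive, ids)); // 6
    }
}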