cs516 Unit III
CLOCK SYNCHRONIZATION
Contents
o Need for clock synchronization
o Lamport’s algorithm (logical clock)
o Physical clock synchronization algorithms
o Cristian's algorithm (Passive)
o Berkeley Algorithm (Active)
Need for clock synchronization
When each machine has its own clock, an event that occurred after another event may
nevertheless be assigned an earlier time
No common clock or other precise global time source exists in a distributed system
This could lead to unexpected behavior and various system failures
o Example: Building output using make tool, which relies on timestamp of source files
and object files
Lamport's Algorithm
Synchronizes logical clocks
happens-before relation
o a -> b read as a happens before b, means all processes agree that event a
happens before event b
o transitive relation : a -> b and b -> c, then a -> c
Each message carries the sending time as per its clock
If the receiver's clock shows a value prior to the sending time, it fast forwards its clock to
one more than the sending time
Between every two events, the clock must tick at least once (a short sketch of these rules follows the example below)
o Example: three processes, each with its own clock, ticking 6, 8, and 10 times per time unit
o A late-stamped message arriving at process 1 forces its reading of 56 up to 61; its subsequent readings become 69, 77, 85
o Similarly, process 0's reading of 54 is corrected to 70, and the next one to 76
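A minimal Python sketch of these clock rules (the class and method names are illustrative, not from the notes):

class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event: between every two events the clock must tick at least once.
        self.time += 1
        return self.time

    def send(self):
        # Each message carries the sending time as per the local clock.
        return self.tick()

    def receive(self, sending_time):
        # If the local clock is behind the sending time, fast forward past it.
        self.time = max(self.time, sending_time) + 1
        return self.time

# Illustrative use: a receiver whose clock reads 56 gets a message stamped 60.
p = LamportClock()
p.time = 56
print(p.receive(60))   # 61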
Cristian's Algorithm
One machine has a WWV receiver (the time server) and the goal is to have all the other
machines keep synchronized with it
Periodically each machine sends a message to the time server asking it for the current
time
The time server responds as fast as it can with a message containing its current time
The time server is passive
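A sketch of one client round, assuming a request_server_time() stub that performs the call to the time server; the half-round-trip correction is the usual refinement for message delay and is not spelled out in the notes:

import time

def cristian_sync(request_server_time):
    t0 = time.monotonic()                # local time when the request is sent
    server_time = request_server_time()  # passive server replies with its current time
    t1 = time.monotonic()                # local time when the reply arrives
    # Assume the reply took roughly half the measured round trip.
    return server_time + (t1 - t0) / 2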
Berkeley Algorithm
The time server is active, polling every machine periodically and asking its time
Based on the answers, it computes an average time and tells all the other machines to
advance their clocks to the new time or slow their clocks down until some specified
reduction has been achieved
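An illustrative sketch of one polling round (the function name and time units are made up for the example); the full algorithm would additionally ignore clock readings that are too far off:

def berkeley_round(server_time, client_times):
    # The active server polls the clients, averages all clocks (its own included),
    # and tells each machine the relative adjustment to apply.
    all_times = [server_time] + list(client_times)
    average = sum(all_times) / len(all_times)
    return [average - t for t in all_times]   # adjustment per machine

# Server reads 3:00, clients read 3:05 and 2:55 (in minutes): everyone converges on 3:00.
print(berkeley_round(180, [185, 175]))   # [0.0, -5.0, 5.0]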
MUTUAL EXCLUSION
When one process enters a critical region to read or update certain shared data structures,
mutual exclusion ensures that no other process will use the shared data structures at the
same time
Centralized Algorithm
One process is elected as the coordinator (Ex: the one running on the machine with the
highest network address)
Whenever a process wants to enter a critical region, it sends a request message to the
coordinator stating which critical section it wants to enter and asking for permission
If no other process is currently in that critical region, the coordinator sends back a reply -
granting permission
When the reply arrives, the requesting process enters the critical region
If some other process is already in that critical region, the coordinator cannot grant permission
The actual method of denying permission is system dependent, e.g., sending no reply or a
"Permission denied" message
When the process exits the critical region, it sends a message to the coordinator releasing its
exclusive access
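A sketch of the coordinator's bookkeeping for a single critical region; message passing is abstracted into a grant callback and the names are illustrative:

from collections import deque

class Coordinator:
    def __init__(self):
        self.holder = None    # process currently in the critical region, if any
        self.queue = deque()  # requests that could not be granted yet

    def request(self, pid, grant):
        if self.holder is None:
            self.holder = pid
            grant(pid)                       # reply granting permission
        else:
            self.queue.append((pid, grant))  # deny by queueing (no reply for now)

    def release(self, pid):
        assert pid == self.holder
        if self.queue:
            self.holder, grant = self.queue.popleft()
            grant(self.holder)               # next waiter enters the critical region
        else:
            self.holder = None

c = Coordinator()
c.request(1, grant=lambda p: print("OK to", p))   # OK to 1
c.request(2, grant=lambda p: print("OK to", p))   # queued, no reply yet
c.release(1)                                      # OK to 2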
Distributed Algorithm
When a process wants to enter a critical section, it builds a message containing the name
of the critical region, its process number and the current time
It sends the message to all other processes, including itself
If the receiver is not in the critical region and does not want to enter it, it sends back an
OK message to the sender
If the receiver is already in the critical region, it does not reply. It queues the request.
If the receiver wants to enter the critical region, it compares the timestamp of the
incoming message with the one that it has sent. The lowest one wins.
If the incoming message is lower, the receiver sends back an OK message
If its own message has a lower timestamp, the receiver queues the incoming request and
sends nothing
After sending out requests, the process sits back and waits until everyone else has given
permission
After it exits the critical region, it sends OK message to all processes on its queue and
deletes them all
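A sketch of the decision a receiver makes for an incoming request in this distributed algorithm; using the process number to break timestamp ties is an assumption the notes leave implicit:

def handle_request(state, my_request, incoming):
    # state: 'RELEASED' (not in, not interested), 'HELD' (inside), or 'WANTED'.
    # my_request / incoming: (timestamp, process_number) pairs.
    if state == 'RELEASED':
        return 'OK'        # not in the region and not interested: reply OK
    if state == 'HELD':
        return 'QUEUE'     # already inside: queue the request, reply later
    # Both want the region: the lowest (timestamp, process number) wins.
    return 'OK' if incoming < my_request else 'QUEUE'

print(handle_request('WANTED', my_request=(12, 1), incoming=(9, 3)))   # OK
print(handle_request('WANTED', my_request=(8, 1), incoming=(9, 3)))    # QUEUE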
Token Ring Algorithm
The processes are organized in a logical ring around which a token circulates; only the process
holding the token may enter the critical region, and it passes the token along when it exits (or
when it does not want to enter)
ELECTION ALGORITHMS
A Ring Algorithm
When any process notices that the coordinator is not functioning, it builds an ELECTION
message containing its own process number and sends the message to its successor.
If the successor is down, the sender skips over the successor and goes to the next member
along the ring, or the one after that, until a running process is located.
At each step, the sender adds its own process number to the list in the message.
Eventually, the message gets back to the process that started it all.
That process recognizes this event when it receives an incoming message containing its own
process number.
At that point, the message type is changed to COORDINATOR and circulated once again, this
time to inform everyone else who the coordinator is (the list member with the highest
number) and who the members of the new ring are.
When this message has circulated once, it is removed and everyone goes back to work.
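A small simulation of one election pass around the ring (real message passing is replaced by a loop, and the starter is assumed to be alive):

def start_election(processes, alive, starter):
    # processes: process numbers in ring order; alive: the ones still running.
    n = len(processes)
    member_list = []
    i = processes.index(starter)
    while True:
        member_list.append(processes[i])     # add own process number to the message
        j = (i + 1) % n
        while processes[j] not in alive:     # skip over crashed successors
            j = (j + 1) % n
        if processes[j] == starter:          # message has come back to the starter
            break
        i = j
    coordinator = max(member_list)           # the list member with the highest number
    return coordinator, member_list

print(start_election([0, 1, 2, 3, 4], alive={0, 1, 3, 4}, starter=1))
# -> (4, [1, 3, 4, 0]): process 2 was skipped, 4 has the highest number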
ATOMIC TRANSACTIONS
The Transaction Model
o Stable Storage
o Transaction Primitives
o ACID Properties
o Nested Transaction
Implementation
o Private Workspace
o Writeahead Log
o Two-phase Commit Protocol
Concurrency Control
o Locking
o Two-phase locking
o Optimistic Concurrency Control
o Timestamps
A transaction is a higher-level abstraction that hides technical issues (like mutual exclusion,
critical region management, deadlock prevention, and crash recovery) and allows the
programmer to concentrate on the algorithms and how the processes work together in parallel.
Example banking application:
o Withdraw(amount, account1).
o Deposit(amount, account2).
If the telephone connection is broken after the first one but before the second one, the first
account will have been debited but the second one will not have been credited.
The money vanishes into thin air
The key is rolling back to the initial state if the transaction fails to complete
RAM memory
o wiped out when the power fails or a machine crashes
Disk storage
o survives CPU failures but can be lost in disk head crashes
Stable storage
o designed to survive anything except major calamities such as floods and earthquakes
o can be implemented with a pair of ordinary disks
Transaction Primitives
1. BEGIN_TRANSACTION: Mark the start of a transaction.
2. END_TRANSACTION: Terminate the transaction and try to commit.
3. ABORT_TRANSACTION: Kill the transaction; restore the old values.
4. READ: Read data from a file (or other object).
5. WRITE: Write data to a file (or other object).
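A sketch of how these primitives combine for the banking example above; tx is an assumed transaction object exposing the five primitives:

def transfer(tx, amount, account1, account2):
    tx.begin_transaction()
    try:
        tx.write(account1, tx.read(account1) - amount)  # withdraw(amount, account1)
        tx.write(account2, tx.read(account2) + amount)  # deposit(amount, account2)
        tx.end_transaction()        # try to commit: both updates happen, or neither
    except Exception:
        tx.abort_transaction()      # restore the old values
        raise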
Nested Transactions
Transactions may contain subtransactions, often called nested transactions.
The top-level transaction may fork off children that run in parallel with one another, on
different processors, to gain performance or simplify programming
Implementation
Private Workspace
when a process starts a transaction, it is given a private workspace containing all the files
(and other objects) to which it has access.
Until the transaction either commits or aborts, all of its reads and writes go to the private
workspace, rather than the "real" one, by which we mean the normal file system
Writeahead Log
sometimes called an intentions list
files are actually modified in place, but before any block is changed, a record is written to the
writeahead log on stable storage telling
o which transaction is making the change,
o which file and block is being changed
o what the old and new values are.
Only after the log has been written successfully is the change made to the file.
If the transaction succeeds and is committed, a commit record is written to the log
If the transaction aborts, the log can be used to back up to the original state (rollback)
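A minimal in-memory sketch of the idea (a Python list stands in for the log on stable storage; the record fields follow the three bullets above):

log = []                     # stand-in for the writeahead log on stable storage
data = {"x": 0, "y": 0}

def write(tid, key, new_value):
    log.append({"tid": tid, "key": key, "old": data[key], "new": new_value})
    data[key] = new_value    # change the data only after the log record is written

def rollback(tid):
    # Use the log to back up to the original state, newest record first.
    for record in reversed([r for r in log if r["tid"] == tid]):
        data[record["key"]] = record["old"]

write(1, "x", 3)
write(1, "y", 7)
rollback(1)
print(data)                  # back to {'x': 0, 'y': 0}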
Concurrency Control
When multiple transactions are executing simultaneously in different processes (on different
processors), some mechanism is needed to keep them out of each other's way.
Locking
when a process needs to read or write a file (or other object) as part of a transaction, it first
locks the file.
the lock manager maintains a list of locked files, and rejects all attempts to lock files that are
already locked by another process
The issue of how large an item to lock is called the granularity of locking.
The finer the granularity, the more precise the lock can be, and the more parallelism can be
achieved
Two-phase locking
in the first (growing) phase the transaction acquires all the locks it needs; in the second
(shrinking) phase it releases them, and no new lock is acquired after the first release
Timestamps
assign each transaction a timestamp at the moment it does BEGIN_TRANSACTION
Every file in the system has a read timestamp and a write timestamp associated with it
DEADLOCKS
Centralized Deadlock Detection
Consider a system with processes A and B running on machine 0, and process C running on
machine 1.
Three resources exist: R, S, and T.
A holds S but wants R, which it cannot have because B is using it
C has T and wants S
As soon as B finishes, A can get R and finish, releasing S for C
This configuration is safe.
After a while, B releases R and asks for T, a perfectly legal and safe swap.
Machine 0 sends a message to the coordinator announcing the release of R
Machine 1 sends a message to the coordinator announcing the fact that B is now waiting for
its resource, T.
Assume that the message from machine 1 arrives first, leading the coordinator to incorrectly
conclude that a deadlock exists and kill some process
Such a situation is called a false deadlock
Wait-die algorithm
When one process is about to block waiting for a resource that another process is using, a
check is made to see which has a larger timestamp (i.e., is younger).
We can then allow the wait only if the waiting process has a lower timestamp (is older) than
the process waited for.
In this manner, following any chain of waiting processes, the timestamps always increase, so
cycles are impossible.
Wound-wait algorithm
one transaction is supposedly wounded (it is actually killed) and the other waits.
If an old process wants a resource held by a young one, the old process preempts the young
one, whose transaction is then killed
The young one probably starts up again immediately, and tries to acquire the resource,
forcing it to wait.
Compared with Wait-die algorithm,
o if the young one wants a resource held by the old one, the young one is killed.
o It will undoubtedly start up again and be killed again.
o This cycle may go on many times before the old one releases the resource.
Wound-wait does not have this property
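The two rules side by side, as a sketch (the timestamps are the transactions' BEGIN_TRANSACTION timestamps, so smaller means older):

def wait_die(requester_ts, holder_ts):
    # Older (smaller timestamp) requesters may wait; younger ones are killed.
    return "WAIT" if requester_ts < holder_ts else "DIE"

def wound_wait(requester_ts, holder_ts):
    # Older requesters preempt (wound) the holder; younger ones wait.
    return "WOUND_HOLDER" if requester_ts < holder_ts else "WAIT"

# An old transaction (timestamp 10) wants a resource held by a young one (20):
print(wait_die(10, 20), wound_wait(10, 20))    # WAIT  WOUND_HOLDER
# A young transaction (20) wants a resource held by an old one (10):
print(wait_die(20, 10), wound_wait(20, 10))    # DIE   WAIT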
THREADS
Thread Usage
to allow parallelism to be combined with sequential execution and blocking system calls.
a) Dispatcher/worker model.
b) Team model.
c) Pipeline model.
Model Characteristics
Thread management
Two alternatives are possible here,
static threads
o the choice of how many threads there will be is made when the program is written or
when it is compiled.
o Each thread is allocated a fixed stack.
o This approach is simple, but inflexible
dynamic threads
o allow threads to be created and destroyed on-the-fly during execution
o The thread creation call usually specifies the thread's main program (as a pointer to a
procedure) and a stack size, and may specify other parameters as well, for example, a
scheduling priority.
o The call usually returns a thread identifier to be used in subsequent calls involving the
thread
o In this model, a process starts out with one (implicit) thread, but can create one or
more threads as needed, and these can exit when finished
Threads can be terminated in one of two ways.
o A thread can exit voluntarily when it finishes its job,
o it can be killed from outside
Shared Data
data that are shared among multiple threads, such as the buffers in a producer-consumer
system.
Access to shared data is usually programmed using critical regions, to prevent multiple
threads from trying to access the same data at the same time.
Critical regions are most easily implemented using semaphores, monitors, and similar
constructions
Mutex
one of two states, unlocked or locked
Operations: LOCK, UNLOCK, TRYLOCK
mutexes are used for short-term locking, mostly for guarding the entry to critical regions
Condition variable
used for long-term waiting until a resource becomes available
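A sketch of the usual pairing using Python's threading module: the mutex guards the shared buffer for a short time, while the condition variable lets a consumer wait long-term until an item becomes available:

import threading

buffer = []
mutex = threading.Lock()                       # short-term locking around the buffer
item_available = threading.Condition(mutex)    # long-term waiting for an item

def producer():
    with mutex:                          # lock mutex
        buffer.append("item")
        item_available.notify()          # wake up a waiting consumer
                                         # mutex is unlocked on leaving the with-block

def consumer():
    with mutex:                          # lock mutex
        while not buffer:
            item_available.wait()        # releases the mutex while waiting
        return buffer.pop(0)             # safe: the mutex is held again here

t = threading.Thread(target=producer)
t.start()
print(consumer())
t.join()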
Global Variable
variables that are global to a thread but not global to the entire program do cause trouble
Solutions
o prohibit global variables altogether
o assign each thread its own private global variables
Scheduling
Threads can be scheduled using various scheduling algorithms, including priority scheduling,
round robin, and others.
Threads packages often provide calls to give the user the ability to specify the scheduling
algorithm and set the priorities, if any
Threads and RPC
When a server thread, S, starts up, it exports its interface by telling the kernel about it.
The interface defines which procedures are callable, what their parameters are, and so on.
When a client thread C starts up, it imports the interface from the kernel and is given a
special identifier to use for the call.
The kernel now knows that C is going to call S later, and creates special data structures to
prepare for the call.
SYSTEM MODELS
1. Workstation model
2. Processor pool model
3. Hybrid form
Workstation Model
The system consists of workstations (high-end personal computers) scattered throughout a
building or campus and connected by a high-speed LAN.
1. Diskless workstations
a. Do not have local disks
b. File system must be implemented by one or more remote file servers
2. Diskful workstations / Disky workstations
a. Have local disks
When the workstations have private disks, these disks can be used in one of at least four ways
1. Paging and temporary files
2. Paging temporary and system binaries
3. Paging temporary files, system binaries and file caching
4. Complete local file system
Disk use | Advantages | Disadvantages
Diskless | Low cost, easy hardware and software maintenance, symmetry and flexibility | Heavy network usage; file servers may become bottlenecks
Paging, scratch files | Reduces network load over diskless case | Higher cost due to large number of disks needed
Paging, scratch files, binaries, file caching | Still lower network load; reduces load on file servers as well | Higher cost; cache consistency problems
A Hybrid Model
Provide each user with a personal workstation and to have a processor pool in addition
PROCESSOR ALLOCATION
Design considerations
Migratory or Nonmigratory nature of process
CPU Utilization
o Maximize this number
o Make sure that every CPU has something to do
Response time
o Minimize mean response time
Response ratio
o Amount of time it takes to run a process on some machine, divided by how long it
would take on some unloaded benchmark processor.
Centralized versus distributed algorithms
o Collecting all the information in one place allows a better decision to be made, but is
less robust and can put a heavy load on the central machine
Optimal versus suboptimal algorithms
Local versus global algorithms
o transfer policy
Sender-initiated versus receiver-initiated algorithms
o location policy
A Centralized Algorithm
heuristic algorithm that does not require any advance information
called up-down
a coordinator maintains a usage table with one entry per personal workstation
concerned with giving each workstation owner a fair share of the computing power
A Hierarchical Algorithm
organize them in a logical hierarchy independent of the physical structure of the network
Some of the machines are workers and others are managers
For each group of k workers, one manager machine (the "department head") is assigned the
task of keeping track of who is busy and who is idle
If the manager receiving the request thinks that it has too few processors available, it passes
the request upward in the tree to its boss
A Bidding Algorithm
The key players are the processes, which must buy CPU time to get their work done, and the
processors, which auction their cycles off to the highest bidder.
Each processor advertises its approximate price by putting it in a publicly readable file
Different processors may have different prices, depending on their
o speed,
o memory size,
o presence of floating-point hardware
o other features
o indication of the service provided like expected response time
When a process wants to start up a child process, it goes around and checks out who is
currently offering the service that it needs.
It then determines the set of processors whose services it can afford.
From this set, it computes the best candidate, where "best" may mean cheapest, fastest, or
best price/performance, depending on the application.
It then generates a bid and sends the bid to its first choice.
The bid may be higher or lower than the advertised price.
Processors collect all the bids sent to them and make a choice, picking the highest one.
The winners and losers are informed, and the winning process is executed.
The published price of the server is then updated to reflect the new going rate.
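An illustrative sketch of the client side of one bidding round; the data layout and the "cheapest processor that is fast enough" policy are assumptions made for the example:

def place_bid(advertised, budget, need_speed):
    # Keep only the processors whose advertised service the process can afford.
    affordable = {p: info for p, info in advertised.items()
                  if info["price"] <= budget and info["speed"] >= need_speed}
    if not affordable:
        return None
    best = min(affordable, key=lambda p: affordable[p]["price"])
    bid = affordable[best]["price"]      # the bid may also be above or below this
    return best, bid

advertised = {"cpu1": {"price": 5, "speed": 10},
              "cpu2": {"price": 3, "speed": 8},
              "cpu3": {"price": 9, "speed": 20}}
print(place_bid(advertised, budget=6, need_speed=8))   # ('cpu2', 3)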
SCHEDULING IN DISTRIBUTED SYSTEMS
Co-scheduling
takes interprocess communication patterns into account while scheduling, to ensure that all
members of a group run at the same time
(Figure: (a) Two jobs running out of phase with each other. (b) Scheduling matrix for eight
processors, each with six time slots; the Xs indicate allocated slots.)
have each processor use a round-robin scheduling algorithm with all processors first running
the process in slot 0 for a fixed period, then all processors running the process in slot 1 for a
fixed period, and so on.
A broadcast message could be used to tell each processor when to do process switching, to
keep the time slices synchronized.
A variant breaks the matrix into rows and concatenates the rows to form one long row. With k
processors, any k consecutive slots belong to different processors
FAULT TOLERANCE
Component Faults
System Failures
Synchronous versus Asynchronous Systems
Component Faults
A fault is a malfunction, possibly caused by
a design error
a manufacturing error
a programming error,
physical damage,
deterioration in the course of time,
harsh environmental conditions
unexpected inputs,
operator error,
rodents eating part of it, and
many other causes
Classified as
transient
o occur once and then disappear
o A bird flying through the beam of a microwave transmitter
intermittent
o occurs, then vanishes of its own accord, then reappears, and so on
o A loose contact on a connector
permanent
o continues to exist until the faulty component is repaired.
o Burnt-out chips, software bugs, and disk head crashes
System Failures
Two types of processor faults can be distinguished
Fail-silent faults
o a faulty processor just stops and does not respond to subsequent input or produce
further output, except perhaps to announce that it is no longer functioning
o also called fail-stop faults
Byzantine faults
o a faulty processor continues to run, issuing wrong answers to questions, and possibly
working together maliciously
Use of Redundancy
Three kinds are possible:
information redundancy
o extra bits are added to allow recovery from garbled bits
o Ex: Hamming code
time redundancy
o an action is performed, and then, if need be, it is performed again
physical redundancy
o extra equipment is added to make it possible for the system as a whole to tolerate the
loss or malfunctioning of some components
o two ways to organize these extra processors:
active replication
primary backup
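With active replication, the same request is run on several processors and their answers are voted on; a sketch of a majority voter (the function name and the error handling are illustrative):

from collections import Counter

def vote(replies):
    # Majority voting over the answers from actively replicated processors.
    # With triple replication, one Byzantine (wrong) answer is outvoted.
    value, count = Counter(replies).most_common(1)[0]
    if count > len(replies) // 2:
        return value
    raise RuntimeError("no majority - too many faulty replicas")

print(vote([42, 42, 41]))   # 42: the single faulty reply is masked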
REAL-TIME DISTRIBUTED SYSTEMS
DESIGN ISSUES
Clock Synchronization
Same as earlier
Predictability
it should be clear at design time that the system can meet all of its deadlines,
even at peak load
it is often known what the worst-case behavior of these processes is
Fault Tolerance
Many real-time systems control safety-critical devices
Active replication – sometimes used
Primary-backup schemes are less popular because deadlines may be missed during cutover to
the backup
fault-tolerant real-time systems must be able to cope with the maximum number of faults
and the maximum load at the same time
Some real-time systems have the property that they can be stopped cold when a serious
failure occurs. A system that can halt operation like this without danger is said to be fail-safe.
o Ex: if a railroad signaling system unexpectedly blacks out, all trains stop, so the failure is safe
Language Support
specialized real-time languages can potentially be of great assistance
it should be easy to express the work as a collection of short tasks (e.g., lightweight
processes or threads) that can be scheduled independently, subject to user-defined
precedence and mutual exclusion constraints
maximum execution time of every task can be computed at compile time
Recursion cannot be tolerated
need a way to deal with time itself
special variable, clock, should be available, containing the current time in ticks
Range of a 32-bit clock before overflowing for various resolutions
o 1 ns => 4 secs; 1 us => 72 mins; 1 ms => 50 days; 1 sec => 136 years
way to express minimum and maximum delays
way to express what to do if an expected event does not occur within a certain interval
useful to have a statement of the form every(25 msec) { ... } that causes the statements
within the curly brackets to be executed every 25 msec
Real-Time Communication
Time Division Multiple Access (TDMA)
Real-Time Scheduling
Characterized by
Rate monotonic algorithm
designed for preemptively scheduling periodic tasks with no ordering or mutual exclusion
constraints on a single processor
In advance, each task is assigned a priority equal to its execution frequency.
For example, a task run every 20 msec is assigned priority 50 and a task run every 100 msec
is assigned priority 10.
At run time, the scheduler always selects the highest priority task to run, preempting the
current task if need be.
Earliest deadline first
Whenever an event is detected, the scheduler adds it to the list of waiting tasks.
This list is always kept sorted by deadline, closest deadline first.
For a periodic task, the deadline is the next occurrence.
The scheduler then just chooses the first task on the list, the one closest to its deadline
Least laxity
algorithm first computes for each task the amount of time it has to spare, called the laxity
(slack).
o For a task that must finish in 200 msec but has another 150 msec to run, the laxity is
50 msec
This algorithm chooses the task with the least laxity,
o that is, the one with the least breathing room.
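A sketch contrasting the two deadline-driven choices on the example above (task A must finish by 200 msec and still needs 150 msec to run, so its laxity is 50 msec); task B is invented for the comparison:

def earliest_deadline_first(tasks, now):
    # tasks: list of (name, deadline, remaining_run_time); pick the closest deadline.
    return min(tasks, key=lambda t: t[1])

def least_laxity(tasks, now):
    # Pick the task with the least breathing room: laxity = deadline - now - remaining.
    return min(tasks, key=lambda t: t[1] - now - t[2])

tasks = [("A", 200, 150),    # must finish by 200, needs 150 more -> laxity 50
         ("B", 180, 20)]     # must finish by 180, needs 20 more  -> laxity 160
print(earliest_deadline_first(tasks, now=0))   # ('B', 180, 20)
print(least_laxity(tasks, now=0))              # ('A', 200, 150)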
Static Scheduling
Static scheduling is done before the system starts operating.
The input consists of a list of all the tasks and the times that each must run.
The goal is to find an assignment of tasks to processors and for each processor, a static
schedule giving the order in which the tasks are to be run.
Thanks to my family members who supported me while I spent hours and hours to prepare this.
Your feedback is welcome at GHCRajan@gmail.com