Operating Systems: Theory and Practice
Contents

1 Overview
1.1 Theoretical Domain
1.2 Practical Domain
2 What is an Operating System?
3 Process Synchronisation
3.1 Common synchronisation problems
3.2 Mutual exclusion
3.3 Semaphores
3.4 Producer/Consumer problem via semaphores
3.5 Reader/Writer problem via semaphores
3.6 Exercises
4 Inter-Process Communication under UNIX
4.1 Shared Memory
4.2 Semaphores
5 Group Programming Assignment
6 Deadlock
6.1 A definition for deadlock
6.2 Resource Allocation Graphs
6.3 Resource allocation examples
6.4 Deadlock Prevention
6.4.1 Prevention by Preemption
6.4.2 Prevention by Linear Ordering of Resources
6.4.3 The Banker's Algorithm
6.5 Exercise
7 Scheduling
7.1 Introduction
7.2 F.C.F.S. Scheduling
7.3 S.T.F. Scheduling
7.4 Priority Scheduling
7.5 Preemptive Scheduling
7.6 Round Robin Scheduling
7.7 Scheduling Tasks on more than one Processor
7.8 Preemptive Schedules for more than one Processor
7.9 Scheduling Dependent Tasks
7.9.1 A-Scheduling
7.9.2 Examples
7.9.3 B-Scheduling
7.9.4 Example
9 Computer Security
9.1 Introduction
9.2 Encryption Systems
9.3 Examples
9.3.1 Simple Substitution
9.3.2 The Vigenère Cipher
9.3.3 One-Time Pads
9.4 Introduction to Number Theory
9.4.1 Congruences
9.4.2 The Greatest Common Divisor
9.4.3 Euclid's Algorithm
9.4.4 Powers modulo a Prime
9.4.5 Primitive Roots
9.5 The Discrete Logarithm Problem
9.6 The Diffie-Hellman Key exchange procedure
10 Further Reading
11 Appendices
11.1 Appendix A: Introduction to Unix
1 Overview
In this course you explore many challenges involved with the design and implementation of
operating systems. From the perspective of design, you learn about the theory underlying
the functionality that an operating system provides, and from the perspective of imple-
mentation, you discover what it is like to build programs that operate at the same level as
the operating system.
Almost as importantly, this course also introduces you to the C and C++ programming
languages. You will learn a minimal set of techniques, but they will take you a very long
way because, as you will soon discover, system programming is about making an operating
system do whatever you would like it to do.
Most of the problems faced are in multi-process operating systems, where the operating
system gives the impression that more than one process is running at a time. These
processes may be owned by one, or more, of many valid users and so you also explore
the challenges involved with multi-user operating systems.
DOS is a single-process, single-user, operating system with every process running to com-
pletion before the next process starts. UNIX is a multi-process, multi-user, operating
system. DOS and UNIX, however, are both multi-tasking operating systems. This is not
a contradiction. Multi-tasking refers to the many tasks that an operating system can per-
form, whereas multi-processing refers to many processes running simultaneously, regardless
of whether they are all an instance of the same task or not.
The theoretical side of this course covers:
• Process Synchronisation
• Deadlock
• Scheduling
• Virtual Memory and Paging
• Computer Security
The practical side of this course entails practical sessions in the computer laboratories on
campus, and two programming assignments.
You also learn how to use the basic tools that are required for system programming.
In your solo programming assignment you build a dynamic web site with CGI technology.
This will help you tie the application of system programming techniques to the most
important aspects of operating systems.
In your group programming assignment you and your group build a parallel-processing
task-farm framework using UNIX IPC libraries. A task-farm is used for high-performance
computing tasks of a certain type. Although this might seem like an ambitious task, all
you have to do is make sure you complete the practical sessions and solo assignment.
Review them until you understand them. If you do this, you will have more than adequate
knowledge and experience to excel in the practical aspect of this course.
2 What is an Operating System?

From your computer literacy knowledge, you should recall that there are two types of
software application: System Software and Application Software. System software
manages the operation of the computer itself. System software interfaces with the hard-
ware. An operating system is system software. Application software, on the other hand,
performs work for the user, and accomplishes this by interfacing with system software.
Computer hardware and software concepts can be difficult to grasp because they are both
abstract and complex. There are, therefore, a number of common, simplified ways of
looking at operating systems.
One view is that of a program providing its users with a convenient interface. An operating
system is a layer of software on top of the bare hardware. This presents the programmer
with a virtual machine that is easier to understand and program.
For example, what would a programmer have to program if she wanted to access the
stiffy-disk drive on her notebook? To write a program that reads some data from a
stiffy-disk by interacting with the hardware alone, she must drive the controller chip,
which accepts 16 commands. These include commands for reading, writing, moving the disk
arm, recalibrating the controller, and so on. The Read command requires 13 parameters,
packed into 9 bytes. When a read is complete, the controller returns 23 status and error
fields packed into 7 bytes. Of course, hard disks, CDs, etc. have similar complexity but
different details.
Now, what would the program look like if the information has to be interpreted as textual
information and then displayed on the screen as a matrix of pixels?
Rather than traumatise you with further examples of a day in the life of a programming
nightmare, let’s consider the tried and tested solution to this common programming prob-
lem: bring the operating system to the rescue!
From nearly four decades of programming experience, the solutions to many common
problems that occur when writing application software have been collected together and
generalised.
As a virtual machine, the operating system provides an abstraction of the bare hardware.
It hides the truth of the hardware from the programmer and presents a nice, simple view of
named files that can be read and written. The operating system also conceals a lot of un-
pleasant business concerning interrupts, timers, memory management, and other low level
features. This layer of software presents such a simple machine that most application
software need not actually know what hardware it is running on - all computers look the
same to it.
This is a very useful way of thinking about the role that an operating system plays in a
computer system.
The alternate view of the role an operating system plays is that of a program that manages
all the pieces of a complex system. It provides for an orderly and controlled allocation of
the processor/s, memories, and I/O devices among the various programs competing for
them.
For example, if a printer is a shared resource, surely the computer cannot allow simulta-
neous access to the printer from different programs. In this scenario, the operating system
can buffer all output destined for the printer on disk. This buffering technique is called
spooling. When one program is finished, the operating system can copy its output to the
printer while other programs continue generating more output, oblivious to the fact that
their output is not really going to the printer yet.
3 Process Synchronisation
A process is a program whose execution has started but not yet terminated. At any one
moment a process need not be actually executing.
An operating system must maintain a data structure that describes the current status of
all live processes. This structure is called a process control block or PCB and commonly
contains fields such as the process identifier, the process state, the program counter and
register contents, memory limits, accounting information, and scheduling information.
During the life of any particular process an operating system must perform operations
that can result in a change to the information in the PCB of the process. Typical
operations include creating a process, destroying a process, suspending and resuming a
process, and changing a process's priority.
3.1 Common synchronisation problems

A set of processes is called determinate if, given the same input, the same results are
produced regardless of the order of execution of the processes. Determinate systems are
easy to control: the operating system can let them execute in any order and the results
are always the same. In real life, however, sets of processes are usually not determinate,
and the operating system must synchronise their execution so that a preferred result is
attained. Some common synchronisation problems are:
• Mutual exclusion problem: In many computer systems, processes must cooperate with
each other when accessing shared data. The designer must ensure that only one
process at a time modifies the shared information. During the execution of such
critical sections of code mutual exclusion must be ensured.
• Producer-Consumer problem: In this problem a set of producer processes provide
messages to a set of consumer processes. They all share a common pool of space into
which messages may be placed by the producers and removed by the consumers.
• Reader-Writer problem: In this problem reader processes access shared data but do
not alter it while writer processes change the contents of the shared data. Any number
of readers should be allowed to proceed concurrently in the absence of a writer, but
writers must insist on mutual exclusion while in their critical section.
It turns out that if one can solve the mutual exclusion problem then all the other common
synchronisation problems are solvable. First we will examine some historical attempts
to solve the mutual exclusion problem via software; then we will introduce the hardware
concept of a semaphore, which provides a modern solution.
3.2 Mutual exclusion

Consider a system of two cooperating processes P0 and P1. Each process has a segment
of code, called a critical section, in which the process may be reading or writing common
variables. The important feature of such a system is that while the CPU is executing the
critical section of one process, it must be prevented from executing the critical section
of the other process. We say that the execution of critical sections must be mutually
exclusive in time.
To solve the mutual exclusion problem we must design a protocol which the processes use
to cooperate and ensure mutual exclusion. Each process must request permission to enter
its critical section by executing so-called entry code, and after completing its critical
section must execute exit code so that the next process can enter its critical section.
The following constraints must be observed by any practical mutual exclusion solution.

1 Only one process at a time may be executing inside its critical section.

2 No assumptions may be made concerning the relative execution speeds of the cooper-
ating processes.

3 When one process is in a non-critical section of code it may not prevent the other
process from entering its critical section.

4 When both processes want to enter their critical section the decision about which one
to grant access to cannot be postponed indefinitely.
Our first attempt at a solution to the mutual exclusion problem with constraints is to let
both processes share a common variable called turn initialised to 0 or 1, and then use the
protocol that if turn = i then process Pi is allowed to execute in its critical section. Each
cooperating process would loop as follows:
program P(i)
common variable: turn: 0..1 = 0;
repeat
    while turn <> i do nothing;   {entry code: busy wait for our turn}
    critical section
    turn = j;                     {exit code: give the turn to the other process}
    non-critical section
until false;
This solution ensures that only one process at a time can be in its critical section, however,
constraint number 3 is not satisfied since strict alternation of processes in the execution of
their critical section is required.
The problem with this attempted solution is that it fails to remember the state of each
process but remembers only which process is next. To remedy the situation we could use
a common flag for each process. The idea is for a process to set its flag before entering
its critical section and only do this if the other process’s flag is unset. The cooperating
processes would loop as follows:
program P(i)
common variables:
flag: array[0..1] of boolean = {false, false};
repeat
    while flag[j] do nothing;     {entry code: wait until the other flag is unset}
    flag[i] = true;               {then set our own flag}
    critical section
    flag[i] = false;              {exit code}
    non-critical section
until false;
Unfortunately this algorithm does not ensure mutual exclusion. Consider the following
sequence of events:

t0: P0 finds flag[1] = false and leaves its while loop;
t1: P1 finds flag[0] = false and leaves its while loop;
t2: P0 sets flag[0] = true and enters its critical section;
t3: P1 sets flag[1] = true and enters its critical section.
This sequence of events allows P0 and P1 to enter their critical sections at the same time
and mutual exclusion is not ensured. The problem is with the non-atomic nature of the
entry code. It does not help to interchange the order of the assignment and the while loop
in the entry code since in that case both processes may exclude each other indefinitely and
thus violate constraint number 4.
It appears that no simple solution to the mutual exclusion problem exists, but a correct
solution was discovered in 1964 by the Dutch mathematician Dekker. This solution combines
both of the previous attempts as follows:
program P(i)
common variables:
flag: array[0..1] of boolean = {false, false};
turn: 0..1 = 0;
repeat
    flag[i] = true;
    while flag[j] do
        if turn = j then
        begin
            flag[i] = false;
            while turn = j do nothing;
            flag[i] = true;
        end;
    critical section
    turn = j;
    flag[i] = false;
    non-critical section
until false;
We leave it to the reader to convince himself that Dekker’s solution ensures mutual exclu-
sion and that indefinite blocking cannot occur.
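Dekker's protocol translates almost line for line into modern C++. The following sketch
is ours, not part of the original notes; it uses std::atomic with the default sequentially
consistent ordering, which busy-waiting algorithms of this kind require:

#include <atomic>
#include <iostream>
#include <thread>

std::atomic<bool> flag[2] = {false, false};
std::atomic<int> turn{0};
long counter = 0;                         // shared variable the sections protect

void process(int i) {
    int j = 1 - i;
    for (int k = 0; k < 100000; ++k) {
        flag[i] = true;                   // entry code: announce interest
        while (flag[j]) {
            if (turn == j) {              // the other side's turn: back off
                flag[i] = false;
                while (turn == j) { }     // busy wait for the turn
                flag[i] = true;
            }
        }
        ++counter;                        // critical section
        turn = j;                         // exit code: hand over the turn
        flag[i] = false;
    }
}

int main() {
    std::thread p0(process, 0), p1(process, 1);
    p0.join(); p1.join();
    std::cout << counter << std::endl;    // prints 200000 iff exclusion held
}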
When more than two processes are involved in the mutual exclusion problem the solution
is more complicated. In 1965 another Dutch mathematician, Dijkstra, solved the n process
problem. His solution was refined by Knuth and then de Bruijn, and finally by Eisenberg
and McGuire in 1972 to produce the following n process solution that satisfies not only
all the constraints 1 to 4, but is also fair in the sense that every requesting process
eventually enters its critical section.
program P(i)
common variables:
flag: array[0..n-1] of (idle, want, in) = {idle, ...};
turn: 0..n-1 = 0;
ordinary variables:
j: integer;
repeat
    repeat
        flag[i] = want;
        j = turn;
        while j <> i do
            if flag[j] <> idle then
                j = turn
            else
                j = (j+1) mod n;
        flag[i] = in;
        j = 0;
        while (j < n) and (j = i or flag[j] <> in) do inc(j);
    until (j >= n) and (turn = i or flag[turn] = idle);
    turn = i;
    critical section
    j = (turn+1) mod n;
    while (j <> turn) and (flag[j] = idle) do j = (j+1) mod n;
    turn = j;
    flag[i] = idle;
    non-critical section
until false;
We leave it to the reader to convince himself that the Eisenberg and McGuire solution
ensures mutual exclusion and that indefinite blocking cannot occur.
3.3 Semaphores
In modern computer instruction sets, the problem of entry and exit code for mutual
exclusion is solved by supplying atomic instructions that do the job. The data structure
associated with these instructions is called the semaphore: an integer S together with a
queue Q(S) of sleeping processes, with a full definition as follows:

P(S): if S ≥ 1 then S = S − 1; otherwise the executing process places itself on Q(S) and
goes to sleep.

V(S): if Q(S) is non-empty then wake up one waiting process and make it available for
execution; otherwise S = S + 1.
The operating system must offer P and V as indivisible instructions. This means that
once they start executing they cannot be interrupted until they have completed. Note also
that in the definition of V(S), no rules are laid down to identify which waiting process is
reactivated. In most operating systems this decision is implementation dependent.
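To make the definition concrete, here is a minimal sketch of a counting semaphore with
these P and V semantics, written in portable C++ with a mutex and condition variable.
The class and its names are our own illustration; incrementing S before waking a sleeper
has the same net effect as the wake-without-increment wording above.

#include <condition_variable>
#include <mutex>

class CountingSemaphore {
public:
    explicit CountingSemaphore(int initial) : S(initial) {}

    void P() {                                   // if S >= 1, decrement; else sleep
        std::unique_lock<std::mutex> lk(m);
        q.wait(lk, [this] { return S >= 1; });   // sleeping here models Q(S)
        --S;
    }

    void V() {                                   // wake a sleeper, else S = S + 1
        std::lock_guard<std::mutex> lk(m);
        ++S;                                     // increment first, then notify:
        q.notify_one();                          // net effect matches the definition
    }

private:
    int S;
    std::mutex m;
    std::condition_variable q;                   // plays the role of Q(S)
};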
To solve the mutual exclusion problem using a semaphore the following scheme can be
used:
common semaphore: S = 1;

Process i:                  Process j:
loop                        loop
    ...                         ...
    P(S)                        P(S)
    ...                         ...
    critical section            critical section
    ...                         ...
    V(S)                        V(S)
    ...                         ...
    non-critical section        non-critical section
    ...                         ...
endloop                     endloop
To see that this scheme will work consider the following two scenarios:

1: Process i goes in and out of its critical section while processes j, k, ... do not
attempt entry. That is: S = 1; i: P(S); S = 0; i enters; i: V(S); S = 1; i exits;
and the initial configuration is restored.

2: Process i goes in, j attempts entry, and k, l, ... are not interested. That is:
S = 1; i: P(S); S = 0; i enters; j: P(S); j waits; i: V(S); i exits, j enters;
j: V(S); S = 1; j exits; and the initial configuration is restored.
3.4 Producer/Consumer problem via semaphores

In this problem we have many producers producing messages which are consumed by many
consumers. However, there are only a finite number, N, of message buffers, used as a
circular pool indexed by in and out (both initially 0). The semaphores mutexP and mutexC
(both initialised to 1) serialise the producers and the consumers respectively, while
nrempty (initialised to N) counts empty cells and nrfull (initialised to 0) counts full
cells:
producer i:                             consumer j:
loop                                    loop
    ...                                     ...
    create a message m                      P(mutexC)
    P(mutexP)                               {one consumer at a time}
    {one producer at a time}                P(nrfull)
    P(nrempty)                              {wait for a message}
    {wait for an empty cell}                m = buff[out]
    buff[in] = m                            out = (out+1) mod N
    in = (in+1) mod N                       V(nrempty)
    V(nrfull)                               {signal an empty cell}
    {signal a full cell}                    V(mutexC)
    V(mutexP)                               {let the next consumer in}
    {let the next producer in}              consume the message m
    ...                                     ...
endloop                                 endloop
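The same bounded buffer can be sketched directly in C++20, whose std::counting_semaphore
provides P and V as acquire and release. The buffer size, message count and names below
are our own illustrative choices:

#include <array>
#include <iostream>
#include <mutex>
#include <semaphore>
#include <thread>

constexpr int N = 8;                       // number of message buffers
std::array<int, N> buff;
int in = 0, out = 0;
std::counting_semaphore<N> nrempty(N);     // counts empty cells
std::counting_semaphore<N> nrfull(0);      // counts full cells
std::mutex mutexP, mutexC;                 // one producer / one consumer at a time

void producer() {
    for (int m = 0; m < 100; ++m) {        // create messages 0..99
        std::scoped_lock lk(mutexP);
        nrempty.acquire();                 // P(nrempty): wait for an empty cell
        buff[in] = m;
        in = (in + 1) % N;
        nrfull.release();                  // V(nrfull): signal a full cell
    }
}

void consumer() {
    for (int k = 0; k < 100; ++k) {
        std::scoped_lock lk(mutexC);
        nrfull.acquire();                  // P(nrfull): wait for a message
        int m = buff[out];
        out = (out + 1) % N;
        nrempty.release();                 // V(nrempty): signal an empty cell
        std::cout << m << '\n';            // consume the message
    }
}

int main() {
    std::thread p(producer), c(consumer);
    p.join(); c.join();
}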
3.5 Reader/Writer problem via semaphores

In this problem we have a number of writer processes that must exclude all readers and
other writers when in their critical section. We also have a number of reader processes
which can perform their read operations concurrently, but which must exclude writers. The
following semaphore solution gives priority to the readers. The shared counter nr
(initialised to 0) records the number of active readers; mutexR (initialised to 1)
protects nr and admits readers one at a time, while mutexW (initialised to 1) excludes
writers:

reader i:                               writer j:
loop                                    loop
    ...                                     ...
    P(mutexR)                               P(mutexW)
    {readers enter one at a time}           {wait for readers and writers to leave}
    nr = nr+1                               ...
    if nr = 1 then P(mutexW)                critical section
    {first reader locks out writers}        ...
    V(mutexR)                               V(mutexW)
    ...                                     {let readers or the next writer in}
    critical section                        ...
    ...                                     non-critical section
    P(mutexR)                               ...
    {readers exit one at a time}        endloop
    nr = nr-1
    if nr = 0 then V(mutexW)
    {last reader lets writers in}
    V(mutexR)
    {allow other readers in/out}
    ...
    non-critical section
    ...
endloop
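Modern C++ packages this pattern as std::shared_mutex. The short sketch below is our own
illustration, and, unlike the scheme above, the standard does not promise reader
priority; it simply lets many readers share the lock while a writer takes it exclusively:

#include <cstdio>
#include <shared_mutex>
#include <thread>
#include <vector>

std::shared_mutex rw;                             // readers share it, a writer owns it
int shared_data = 0;

void reader(int id) {
    std::shared_lock<std::shared_mutex> lk(rw);   // many readers may hold this at once
    std::printf("reader %d sees %d\n", id, shared_data);
}

void writer() {
    std::unique_lock<std::shared_mutex> lk(rw);   // exclusive access for the writer
    ++shared_data;
}

int main() {
    std::vector<std::thread> ts;
    ts.emplace_back(writer);
    for (int i = 0; i < 3; ++i)
        ts.emplace_back(reader, i);
    for (auto& t : ts)
        t.join();
}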
3.6 Exercises
1: Use a semaphore with P and V operations to control the traffic flow at the intersection
of two one-way streets. The following rules should be satisfied:
– Only one car can be crossing at any given time.
– When cars are approaching the intersection from both directions they should take
turns at crossing so as to prevent indefinite postponement in either street.
– A car approaching from one street should always be allowed to cross the intersec-
tion if there are no cars approaching from the other street.
A solution to this problem is two algorithms for crossing. One algorithm for cars
coming from one direction and another algorithm for cars coming from the other
direction.
2: Consider a barbershop that has three barber chairs, three barbers, one till, and
one cashier. The shop also contains a sofa that can seat four waiting customers,
and standing room area for further waiting customers. Assume that at most twenty
customers can be inside the barbershop at any one time.
A customer enters the shop, provided it is not full and once inside takes a seat on the
sofa or stands if the sofa is fully occupied. When a barber is free the customer who has
been on the sofa for the longest is served and if there are any standing customers the
one who has been in the shop the longest takes a seat on the sofa. When a customer’s
haircut is finished the cashier accepts payment and gives the customer his receipt.
Because there is only one till payment is accepted for one customer at a time. The
barbers divide their time between cutting hair and sleeping in their barber chair if
there are no customers to be served.
Solve this concurrency problem by writing three algorithms. One each for customers,
barbers and cashiers. Make a table of all the semaphores you use indicating what the
P and V operations denote for each.
3: There are five philosophers sitting at a round table. On the table are five plates,
five forks (one to the left of each plate), and a bottomless serving bowl of spaghetti at
the centre of the table. The philosophers spend their time either thinking or eating.
Thinking is easy as the philosopher does not require any utensils to do it. Eating
on the other hand requires two forks, one from the left and one from the right. On
completion of an eating period the philosopher will replace the two forks in their
original positions and return to thinking.
Design an algorithm for a philosopher to follow that allows all philosophers to think
and eat to their hearts' content.
4 Inter-Process Communication under UNIX
UNIX now offers new powerful interprocess communication (IPC) facilities. These include
message queues, shared memory segments and semaphores. In this section we show how to
call UNIX IPC routines to implement shared memory and semaphores. UNIX offers two
shell commands to monitor the current IPC state and to delete unwanted IPC structures.
The formats are:
ipcs [options]
ipcrm [options]
To tackle the group project described later on in this document, you will have to
implement shared memory and semaphores in C++. Make sure you have read the reference
material and on-line tutorials before you start with IPC.
The following two sections provide code-studies of both UNIX IPC shared memory and
semaphores. Use these examples to test your understanding of C and C++ and expect to
use the classes defined here in your practical sessions and group assignment. You may, of
course, feel free to test your understanding of the code and C++ by implementing your
own Linux IPC class library.
4.1 Shared Memory

Recall that each process has its own private memory space. One way to share data between
processes is to use a section of public memory that belongs to no particular process, and
some kind of memory address-mapping mechanism to represent the same public segment
in each process that needs to access that segment.
Shared memory is a subset of main computer memory which can be shared by one or more
processes. Shared memory segments are attached to the data segment of a process and
are mapped to a different address in each process. Each process has no idea of which other
processes might have references to the shared memory segment.
UNIX provides the system call shmget to create a shared memory segment along with its
associated data structures. The segment is then attached to a process's address space
through the use of the system call shmat. shmget returns the shared memory identifier
shmid, which is later used by the system call shmctl to control the segment (and, for
example, to remove it). The system call shmdt is used to detach a shared memory segment
from a process's address space.
Full specifications for the above system calls can be found in the man pages. The
following sample program, written in C and C++ (the source of which can be downloaded
from the course web site), demonstrates how to create and use a shared memory segment in
C++. The benefit of using C++ is that it encapsulates some fairly detailed C system calls
at a higher, less error-prone level, while also exploiting many other benefits of C++,
such as object-orientation, strict type-checking, and functionally powerful compilers and
debuggers.
The first listing is the C++ header file containing the declaration of the SharedMem class
in the ipc namespace:
#ifndef _SHAREDMEM_H_
#define _SHAREDMEM_H_
/*
* A VERY simplified shared memory class for use in a std UNIX
* environment.
*
* Exit codes for class operations:
*
* 1 - Unable to allocate memory
 * 2 - Unable to map memory
* 3 - Could not remove shared memory
*/
#include <iostream>
#include <cstdio>
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/ipc.h>
#include <sys/shm.h>
namespace ipc {
template<class S_type>
class SharedMem {
public:
/*
* This method generates the shared memory segment.
* The size of the segment is set by the data type.
* Once created, the segment is attached. The creating
* PID is saved in the my_pid data member.
*/
4 INTER-PROCESS COMMUNICATION UNDER UNIX 17
SharedMem(); // Constructor
/*
* This method removes the shared memory segment from
* the system if the calling function is the process that
* created the segment.
*/
~SharedMem(); // Destructor - remove shared memory
/*
* Put assigns a value to the shared memory segment.
*/
void Put(const S_type);
/*
* Get retrieves the current value stored in the memory segment.
*/
S_type Get();
private:
    int shmid;           // ID of the shared memory segment
    S_type *shm_ptr;     // Address at which the segment is attached
    pid_t my_pid;        // PID of the creating process
};

} // namespace ipc
#endif //_SHAREDMEM_H_
The class declared in the header file is then defined in the following C++ source file:
/*
 * Shared memory implementation
 */
#include <string>
#include <vector>
#include "SharedMem.h"

namespace ipc {

template<class S_type>
SharedMem<S_type>::SharedMem() {
    my_pid = getpid();   // Save PID of creating process
    if ((shmid = shmget(IPC_PRIVATE, sizeof(S_type), IPC_CREAT | 0660)) < 0)
        exit(1);
    // Note: shmat signals failure by returning (void *) -1, not NULL
    if ((shm_ptr = (S_type *) shmat(shmid, NULL, 0)) == (S_type *) -1)
        exit(2);
}

template<class S_type>
SharedMem<S_type>::~SharedMem() {
    shmdt((void *) shm_ptr);              // Detach from this process
    if (getpid() == my_pid)               // Only the creator removes the segment
        if (shmctl(shmid, IPC_RMID, NULL) < 0)
            exit(3);
}

template<class S_type>
void SharedMem<S_type>::Put(const S_type value) {
    *shm_ptr = value;                     // Write a value into the shared segment
}

template<class S_type>
S_type SharedMem<S_type>::Get() {
    return *shm_ptr;                      // Read the current value back
}

// Force instantiation
template class SharedMem<bool>;
template class SharedMem<int>;
template class SharedMem<char>;
template class SharedMem<float>;
template class SharedMem<double>;

} // namespace ipc
An example of using the SharedMem class is as follows. It is a simple tally that can be
accessed by more than one process:
#include <iostream>
#include <string>
#include <vector>
#include <fstream>
#include "ipc/SharedMem.h"
Different processes accessing the same shared memory must be synchronised. This is
implemented through the use of semaphores, the topic of the next section.
4.2 Semaphores
UNIX supplies a system call, semget to set up a semaphore with its associated data
structure. semget returns a unique positive integer known as the semaphore identifier,
semid. The semid is subsequently used by the semop system call which updates the values
in the semaphore data structure.
The UNIX semaphore structure contains the variables, semval, semzcnt and sempid. The
variable semval is a non-negative integer whose value is changed by the semop system call.
semval corresponds to the semaphore integer described above in these notes. semzcnt
is an unsigned short integer that represents the number of processes that are suspended
waiting for semval to reach zero. sempid holds the id of the process that performed the
last semaphore operation on this semaphore.
Full specifications for semget and semop can be found in the man pages.
The following sample program, written in C and C++, (the source of which can be down-
loaded from the course website) demonstrates how to create and destroy a semaphore and
also how to implement the P and V mutual exclusion operators.
The first listing is the C++ header file. It contains a declaration of the Semaphore class
in the ipc namespace:
#ifndef _SEMAPHORE_H_
#define _SEMAPHORE_H_
#include <iostream>
#include <cstdio>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <stdlib.h>
#include <unistd.h>
namespace ipc {
/*
 * A VERY simplified semaphore class for use in a std UNIX
 * environment.
 */
class Semaphore {
public:
Semaphore(); // Constructor
~Semaphore(); // Destructor - remove semaphore
int P(); // LOCK (decrement semaphore)
void V(); // UNLOCK (increment semaphore)
int Z(); // WAIT while semaphore is NOT 0
void Put(const int ); // Assign a value to semaphore
int Get(); // Return value of the semaphore
private:
#if defined(__GNU_LIBRARY__) && !defined(_SEM_SEMUN_UNDEFINED)
// definition in <sys/sem.h>
#else
union semun { // We define:
int val; // value for SETVAL
struct semid_ds *buf; // buffer for IPC_STAT, IPC_SET
unsigned short int *array; // array for GETALL, SETALL
struct seminfo *__buf; // buffer for IPC_INFO
};
#endif
union semun arg; // For semctl call
struct sembuf zero, lock, unlock; // hoo ha’s for P,V & Z operations
int semid; // ID of semaphore
pid_t my_pid; // PID of creator
};
} // namespace ipc
#endif //_SEMAPHORE_H_
The second listing is the definition of the Semaphore class in a C++ source file:
#include "Semaphore.h"
namespace ipc {
Semaphore::Semaphore() {
zero.sem_num = 0,
zero.sem_op = 0,
zero.sem_flg = SEM_UNDO;
lock.sem_num = 0,
lock.sem_op = -1,
lock.sem_flg = SEM_UNDO;
unlock.sem_num = 0,
unlock.sem_op = 1,
unlock.sem_flg = SEM_UNDO;
my_pid = getpid();
if ((semid = semget(IPC_PRIVATE, 1, 0660)) == -1) {
exit(1);
}
Put(0); // Default - set to zero @ start
}
// LOCK semaphore
// Atomic test & decrement
int Semaphore::P() {
if (semop(semid, &lock, 1) == -1)
exit(3);
return 0;
}
// UNLOCK semaphore
// Increment semaphore
void Semaphore::V() {
if (semop(semid, &unlock, 1) == -1)
exit(4);
}
4 INTER-PROCESS COMMUNICATION UNDER UNIX 22
} // namespace ipc
An example of using the Semaphore class follows the same pattern as the shared memory
tally:

#include <iostream>
#include <string>
#include <vector>
#include <fstream>
#include "ipc/Semaphore.h"
The following section describes some previous group assignments that were developed using
UNIX IPC.

5 Group Programming Assignment
Consider the game of Musical Chairs. The game is controlled by a Disc Jockey whose task
it is to play short segments of music.
The game starts with n players who dance around n − 1 chairs while a disc jockey plays
a short segment of music. When the music stops playing the n players all scramble for
the n − 1 chairs and the unsuccessful player, (the one who did not get a chair) is out of
the game. Once the unsuccessful player leaves the game the music restarts, the remaining
players start dancing, a single chair is removed from the dance floor and the next round
continues with n − 1 players competing for n − 2 chairs.
In the final round of the game, 2 players compete for 1 chair and the successful player in
the final round is declared the overall winner of the game.
Make use of semaphores and shared memory to solve the Musical Chairs problem. Write
one code for the disc jockey to follow and one code for the players to follow. Make sure
that you clearly specify all shared memory data with their initial values.
Design and code in Perl the game Snakes to run on a UNIX system from telnet terminals.
Make use of UNIX shared memory to store the current state of the SnakePit. Make use
of UNIX semaphores to solve all synchronisation problems that appear in your design.
The class will be split into groups. Each group must submit a complete solution to the
Snakes game. Each solution will consist of two Perl programs, a Monitor and a Player.
During the playing of Snakes, one copy of the Monitor program will be running while
many copies of the Player program may be running.
In the game of Snakes, the Monitor will set up a new game and allow a Player to join as
a new snake in the SnakePit. As each player joins the game he is allowed to choose the
type of snake he represents. Common types could be:
Mamba
Puff-adder
Cobra
Python
Each snake type will have different fighting skills. It is up to your group to define the set
of fighting skills for each snake type.
When the game starts Players move their snakes in the snake pit using the arrow keys.
Snakes can attack other snakes when their bodies intersect and the outcome of the en-
counter will be determined by the current attributes of the two snakes involved in the
encounter. Snakes can grow in length when they consume food that they encounter in the
snake pit. Food may be thrown into the pit at random times determined by the monitor.
At any one time, the Monitor displays the state of the whole snake pit. Players on the
other hand can only see part of the snake pit. (ie: in the immediate vicinity of their snake).
A player’s snake should be centred in his field of view and if a player decides to move his
snake to the right say then the SnakePit and its contents should move to the left.
Specifications for each snake type should be loaded at the start from an ASCII text file so
that the group can experiment with different specifications to determine those that result
in the best game.
Player programs send their game commands to the monitor. The monitor must stack up
incoming commands against each player. The monitor executes the commands by cycling
through the players' stacks and executing the command at the bottom of each player's stack
(if any). In this way it is possible for a quick thinking player to get more than one command
executed before his slower counterpart executes any. After executing each command the
state of the battle must be updated and the new state reflected on the monitor screen and
all the player’s screens. Access to the command stack must be controlled through the use
of semaphores.
The group must decide how a game is won and what messages to display on a player’s
screen when he wins the game or when his snake dies before the game is over.
At the end of this project each group will be asked to demonstrate their SNAKES game
to the rest of the class. Each group must also hand in a design document, outlining the
design of their SNAKES game, especially how semaphores were used to solve the multi-user
clashing problem. Each group must also construct a web-site that delivers their SNAKES
game to the world as a gzipped tar file containing code, installation instructions and a
player’s user manual.
Design and code in Perl the game BattleShips to run on a UNIX system from telnet
terminals. Make use of UNIX shared memory to store the current state of the BattleShips
board. Make use of UNIX semaphores to solve all synchronisation problems that appear
in your design.
The class will be split into 6 groups according to machine name. Each group must submit
a complete solution to the BattleShips game. Each solution will consist of two Perl
programs, a Monitor and a Player. During the playing of BattleShips, one copy of
the Monitor program will be running while many copies of the Player program may be
running.
In the game of BattleShips, the Monitor will set up a new game and allow a Player to
join one of two fleets taking part in a battle. As each player joins a fleet, he is allowed
to choose the type of ship he represents. Common types are:
Battleship
Destroyer
Cruiser
Frigate
Submarine
When the game starts Players move their ships on the board using the arrow keys.
Players can also fire missiles at other ships taking part in the battle. The different
ships will have different attributes. For example, Battleships can only turn slowly but
have relatively large fire-power. Ships must return to their home port to replenish supplies
when these run low.
At any one time, the Monitor displays the state of the whole battle. Players on the other
hand can only see part of the battle board. (ie: in the immediate vicinity of their ship).
Some ships may be able to fire missiles further than their field of view. Ships from the
same fleet should be able to pass messages to each other.
Specifications for each ship should be loaded at the start from an ASCII text file so that
the players can experiment with different specifications to determine those that result in
the best game.
The board should consist of an ASCII window on the world. The world is all sea apart
from two islands which have home ports for the ships in their fleets. Use ASCII characters
to represent the ships and the islands and the ports. The world is round and any ship
sailing over an edge appears at the opposite edge.
Players send their battle commands to the monitor. The monitor must stack up incoming
commands against each player. The monitor executes the commands by cycling through the
players and executing the command at the bottom of each player's stack (if any). In this way it
is possible for a quick thinking player to get more than one command executed before his
slower counterpart executes any. After executing each command the state of the battle
must be updated and the new state reflected on the monitor screen and all the player’s
screens.
Each group will be awarded a mark for their project. Individual members of a group could
be asked a spot quiz on their projects to determine their contribution to the group effort.
Consider the following one-player game: there are nine clocks in a 3×3 array (figure 1).
The goal is to return all the dials to 12 o'clock.
There are nine different allowed ways to turn the dials on the clocks. Each such way is
called a move, and the moves are numbered 1 to 9. Making a move turns the dials 90 degrees
clockwise on those clocks which are affected according to figure 2 below.
Move   Clocks turned
1      ABDE
2      ABC
3      BCEF
4      ADG
5      BDEFH
6      CFI
7      DEGH
8      GHI
9      EFHI
(Figure 2)
If we represent the dial positions by

0 = 12 o'clock
1 = 3 o'clock
2 = 6 o'clock
3 = 9 o'clock

then the sequence of moves (5, 8, 4, 9) will return all the clocks to noon as follows:
3 3 0        3 0 0        3 0 0        0 0 0        0 0 0
2 2 2   5->  3 3 3   8->  3 3 3   4->  0 3 3   9->  0 0 0
2 1 2        2 2 2        3 3 3        0 3 3        0 0 0
Your operating systems group must construct a multi-player version of this game. The
multi-player version of the game should allow for a maximum of six players. Each player
should view a screen showing the state of all nine clocks.
The object of the multi-player version of the clocks game is still to solve the clock problem,
but by allowing any of the players to make moves at any time. Whenever a move is made
the system should allocate a score for that move and update that player’s total score
accordingly. Once the clock puzzle has been solved, the player with the highest total score
wins.
Scoring could be done as follows: when a player makes a move, his (or her) total score
gets incremented according to a scoring scheme of your group's own design.
In your design of the multi-player clocks game, you should allow for all moves occurring
in some time interval after an accepted move to be aborted. Also, as soon as a move is
accepted by the system, all players should be warned that no new moves will be accepted
until the time interval has elapsed.
Use Perl to design and code the game Clocks to run on your UNIX box from telnet
terminals. Write two programs, one called ClockWatcher to act as an administrator for
your game, and the other called ClockPlayer for each player to run. Make use of UNIX
shared memory to store the current state of the game. Make use of UNIX semaphores to
solve all synchronisation problems that appear in your design. Make use of the ncurses
library to display the current state of the game on a telnet terminal. Make sure that all
your programs have adequate help available to users. Produce a group report for your
project in the form of a web site on your machine. Your game should be distributable from
your web site as a tar archive containing appropriate README and INSTALL files.
The Game of Life is the most well-known cellular automaton, invented by John Conway
and popularised in Martin Gardner’s Scientific American column starting in October 1970.
The game was originally played by hand with counters, but implementation on a computer
greatly increased the ease of exploring patterns.
The Life cellular automaton is run by filling in a number of cells on a 2-D grid. Each
generation then switches cells on or off depending on the state of the cells that surround
it in the previous generation. The rules are defined as follows. All eight of the cells
surrounding the current one are checked to see if they are on or not. Any cells that are on
are counted, and this count is then used to determine what will happen to the current cell.
1 Death: if the count is less than 2 or greater than 3, the current cell is switched off.
2 Survival: if (a) the count is exactly 2, or (b) the count is exactly 3 and the current
cell is on, the current cell is left unchanged.
3 Birth: if the current cell is off and the count is exactly 3, the current cell is switched
on.
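These three rules reduce to a single neighbour count per cell. A small illustrative sketch
(ours; the board size and boundary handling are arbitrary choices) of one generation:

#include <array>

constexpr int W = 19, H = 19;                       // board size (illustrative)
using Board = std::array<std::array<bool, W>, H>;

// Compute the next generation according to the rules above.
Board step(const Board& b) {
    Board next{};
    for (int r = 0; r < H; ++r)
        for (int c = 0; c < W; ++c) {
            int count = 0;                          // count the eight neighbours
            for (int dr = -1; dr <= 1; ++dr)
                for (int dc = -1; dc <= 1; ++dc) {
                    if (dr == 0 && dc == 0) continue;
                    int nr = r + dr, nc = c + dc;
                    if (0 <= nr && nr < H && 0 <= nc && nc < W && b[nr][nc])
                        ++count;
                }
            // Birth needs exactly 3; survival needs 2 or 3; all else dies.
            next[r][c] = (count == 3) || (count == 2 && b[r][c]);
        }
    return next;
}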
A pattern which does not change from one generation to the next is known as a still life,
and is said to have period 1. Conway originally believed that no pattern could produce an
infinite number of cells, and offered a $50 prize to anyone who could find a counterexample
before the end of 1970 (Gardner 1983, p. 216). Many counterexamples were subsequently
found, including guns and puffer trains.

A Life pattern which has no father pattern is known as a Garden of Eden (for obvious
biblical reasons). The first such pattern was not found until 1971, and at least 3 are now
known. It is not, however, known if a pattern exists which has a father pattern, but no
grandfather pattern.
Similar cellular automaton games with different rules are HexLife and HighLife. HashLife
is a Life algorithm that achieves remarkable speed by storing sub-patterns in a hash table,
and using them to skip forward, sometimes thousands of generations at a time.
For your group project this year you must design and implement a 2-player game of life.
Player 1 starts with 8 blue counters and player 2 starts with 8 red counters. The game
takes place on a 19 x 19 life board. At the start of the game players place 5 counters
consecutively on the life board at locations of their choice. The game of life algorithm
then takes over using the same rules as above except that the following further rule is
implemented:
Once the game of life is running, players may interrupt the run up to 3 times in order to
place one of their remaining counters. The game ends after a fixed number of iterations
and the majority surviving colour is declared the winner.
Design and code in C the game TwoLife to run on your UNIX box from telnet terminals.
Make use of UNIX shared memory to store the current state of the game. Make use of
UNIX semaphores to solve all synchronisation problems that appear in your design. Make
use of the ncurses library to display the current state of the game on a telnet terminal.
Design and code a multi-player version of Go-Moku. Your design must allow players to
alternately place markers on a rectangular board until one of them obtains five markers in
a row (horizontal, vertical or diagonal).
When a game is in progress each player must be able to view the current state of the
game on his workstation. Also a game coordinator must be able to view the game on his
workstation without taking part as a player.
Your solution must allow for two to four players and one game coordinator. The final
product must detect a win or a drawn situation.
If a player takes too long to play when it is his turn then the system must make a move
for him. (Just use the nearest non-filled square to his last move). The timeout parameter
must be settable by the game co-ordinator.
The class will be split into a number of groups. Each group must submit a complete
solution to the Go-Moku problem. Each group’s solution will be demonstrated to the rest
of the class towards the end of the semester. Programs must be written in C to run on
a UNIX system with telnet terminals. The current state of the game must be stored
in shared memory and UNIX semaphores must be employed to solve any synchronisation
problems in your design.
Design and code in C the game MATH 24 to run on a UNIX system from telnet terminals.
Make use of UNIX shared memory to store MATH 24 questions, answers and score boards.
Make use of UNIX semaphores to solve all synchronisation problems that appear in your
design.
The class will be split into 5 groups of ±10 students. Each group must submit a com-
plete solution to the MATH 24 game. Each solution will consist of two C programs, a
QuizMaster and Players. During the playing of MATH 24, one copy of the QuizMaster
will be running while many copies of the Players program may be running.
In the game MATH 24 the QuizMaster selects 4 integers at random in the range 1 to 12
and presents them to all the Players. The Players combine the integers using the binary
operations +, −, × and ÷ and any legal combination of brackets. The resulting expression
must evaluate to 24 hence the name of the game. The rules of the game state that all four
integers must be used once and only once in this expression. The first of the Players to
produce a correct expression wins the game.
Note that in general a solution is not unique so in your design allow the QuizMaster to
allocate points to the Players depending on what order they solve the problem. (for
example: 6-points for the first correct solution, 4-points for the second correct solution,
etc.) Your QuizMaster should allow the game to run continuously, with a time-limit on
each round and a score-board available to the Players indicating their correct solutions
and current points. You may like to implement a negative score to penalise Players that
submit incorrect expressions.
Each group will be awarded a mark for their project. Members of a group will be asked a
spot quiz on their projects and some combination of group mark and quiz mark will make
up the final project mark.
In order to write the Players program it is necessary to be able to timeout the user if
she is too slow in entering a solution to the MATH 24 game. Under UNIX, timeouts
can be implemented using the signal system call. You can use the man pages to get full
specifications for signal calls.
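A minimal sketch of such a timeout (our illustration; the 30-second limit is arbitrary)
installs a SIGALRM handler without SA_RESTART, so that the blocked read is interrupted
when the alarm fires:

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

volatile sig_atomic_t timed_out = 0;
void on_alarm(int) { timed_out = 1; }

int main() {
    struct sigaction sa = {};
    sa.sa_handler = on_alarm;          // no SA_RESTART: let the read be interrupted
    sigaction(SIGALRM, &sa, NULL);
    alarm(30);                         // SIGALRM will arrive in 30 seconds
    char expr[256];
    if (fgets(expr, sizeof expr, stdin) == NULL && timed_out)
        printf("Too slow - this round is over.\n");
    else
        alarm(0);                      // answer arrived in time: cancel the alarm
    return 0;
}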
Design and code an auctioneering system. Make use of semaphores to solve any
synchronisation problems that appear in your design.
The class will be split into groups. Each group will produce an auctioneer program and a
bidder program. Each group will have a group-leader who will oversee all design questions.
The auctioneer program must be able to:
• Set up descriptions and reserve prices of items to be sold. This set-up procedure takes
place before the auction starts.
• Monitor bidder registrations before the auction.
• End bidder registrations.
• Start the bidding on each item.
• Monitor the bidding on each item.
• End the bidding on each item.
• Log the highest bid and the successful bidder for each item.
• Restart in the case of a power failure.
Each group will be awarded a mark for their project. Members of a group will be asked a
spot quiz on their projects and some combination of group mark and quiz mark will make
up the final project mark.
6 Deadlock
At the end of the previous section we saw that a simple synchronisation algorithm can end
up in a deadlocked state when more than one process is competing for a few resources.

6.1 A definition for deadlock

A set of processes is in a state of deadlock when every process in the set is waiting for a
resource that can only be released by another process in the set.
A deadlock situation can arise only if the following necessary condition holds:
• circular hold and wait: There must exist a set of waiting processes {p1, p2, ..., pn}
such that p1 is waiting for a resource held by p2, p2 is waiting for a resource that is
held by p3, ..., pn is waiting for a resource that is held by p1. The resources involved
must be held in a non-sharable mode.
6.2 Resource Allocation Graphs

Deadlocks can be described more precisely in terms of a directed bipartite graph G(V, E),
called a resource allocation graph, where the set of vertices V is partitioned into two
types: P = {p1, p2, ..., pn}, the set of processes, and R = {r1, r2, ..., rm}, the set of
resource types. Each element in the set E of edges is an ordered pair (pi, rj) or (rj, pi).
If (pi, rj) ∈ E then there is a directed edge from process pi to resource type rj
indicating that process pi has requested an instance of resource type rj and is currently
waiting for that resource. If (rj, pi) ∈ E then there is a directed edge from resource
type rj to process pi indicating that an instance of resource type rj has been allocated
to process pi. These edges are called request edges and assignment edges respectively.
Pictorially we represent each process pi as a circle and each resource type as a square.
Since a resource type rj may have more than one instance we represent each such instance
as a dot within the square. A request edge points to a square while an assignment edge
starts at one of the dots and points to a circle.
6.3 Resource allocation examples

• If each resource type has exactly one instance then a cycle implies that a deadlock
has occurred.
• If any resource involved in a cycle has more than one instance then the cycle does not
necessarily imply a deadlock.
• A system is deadlocked iff when resources are partitioned according to instances there
is no way to draw request edges without introducing a cycle.
6.4 Deadlock Prevention

The only way to prevent deadlock from occurring is to ensure that a circular hold and wait
condition never occurs. We will investigate three methods for doing this.

6.4.1 Prevention by Preemption
To prevent the hold and wait condition we allow implicit preemption. If a process that is
holding some resources requests another resource that cannot be immediately allocated to
it then all resources currently held are preempted and implicitly released. The preempted
resources are added to the list of resources for which the process is waiting. The process
will only be restarted when it can regain all its old resources as well as the new one that
it requested.
This is not the best deadlock prevention scheme available. For example, if a line printer
is continually preempted, then the operator would have a terrible time sorting out which
printed pages belonged to which process.
6.4.2 Prevention by Linear Ordering of Resources

In order to ensure that the circular hold and wait condition never happens one can impose
a linear ordering of all resource types. To do this one must define a 1-1 function F that
maps resource types to integers. For example, suppose our resource types are card readers,
disk drives, tape decks and line printers, and:
F(cr) = 1
F(dd) = 5
F(td) = 7
F(lp) = 12
and suppose that we insist that each process can only request resources in increasing order
of enumeration. To do this we require that whenever a process requests a resource rj it
first releases any resource ri such that F(ri) > F(rj). If this protocol is followed then a
circular hold and wait condition cannot happen.
To see this, assume that a circular hold and wait condition exists with processes
{p0, p1, ..., pn−1} involved. Assume that pi is waiting for resource ri which is held by
pi+1 (mod n on all the indices). Then since pi+1 is holding ri while requesting ri+1 we
must have F(ri) < F(ri+1) for all i. In other words:
F(r0) < F(r1) < F(r2) < ... < F(rn−1) < F(r0). This is a contradiction, and thus the
circular hold and wait cannot exist if the protocol is adhered to.
6.4.3 The Banker's Algorithm

The following deadlock prevention scheme ensures that a circular hold and wait condition
cannot occur by demanding that at all times the system is in a safe state. More formally,
a system of processes is in a safe state if there exists an ordering {p0, p1, ..., pn−1}
such that for each pi the resources which pi can still request can be satisfied by the
available resources plus the resources held by all the pj with j < i.
To check whether or not a collection of processes is in a safe state the operating system
must maintain several data structures containing the current state of resource
allocations. Let n be the number of processes and m the number of resource types. The
banker's algorithm requires the following data structures:

Available: a vector of length m; Available[j] is the number of currently available
instances of resource type rj.

Max: an n × m matrix; Max[i, j] is the maximum number of instances of rj that process pi
may ever request.

Allocation: an n × m matrix; Allocation[i, j] is the number of instances of rj currently
allocated to pi.

Need: an n × m matrix; Need[i, j] = Max[i, j] − Allocation[i, j] is the number of
instances of rj that pi may still request.
Given that the above data structures are kept up to date by the operating system, the
algorithm for checking whether or not the system is in a safe state is quite simple:
1: Let Work and Finish be vectors of length m and n respectively. Initialise
Work = Available and Finish[i] = False for all i.

2: Find an i such that:
a: Finish[i] = False
b: Need[i] ≤ Work
If no such i exists then goto step 4.

3: Set:
a: Work = Work + Allocation[i]
b: Finish[i] = True
and goto step 2.

4: If Finish[i] = True for all i then the state is safe.
Now to complete the banker's algorithm we must specify what is to be done when a request
for resources comes in from process pi. Suppose pi issues a request vector Request[i] with
Request[i, j] = k if process pi wants k more instances of resource type rj. The operating
system must then take the following action:

1: If Request[i] > Need[i] then we have an error since the process has requested more
resources than initially allowed for in Max[i].

2: If Request[i] > Available then the process pi must wait.

3: The operating system pretends to allocate the resources as follows:
a: Available = Available − Request[i]
b: Allocation[i] = Allocation[i] + Request[i]
c: Need[i] = Need[i] − Request[i]
If the resulting resource state is safe then the allocations are made and the process pi
continues. If the new state is unsafe then pi must wait and the old state is restored.
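Both the safety test and the request handling translate directly into code. The following
compact C++ sketch is our own illustration; the vectors and matrices mirror Available,
Allocation and Need above, and the driver is left out:

#include <vector>

using Vec = std::vector<int>;
using Mat = std::vector<Vec>;

// Steps 1-4 of the safety algorithm: can the processes be ordered so that
// each one's remaining Need is always coverable by Work?
bool isSafe(const Vec& available, const Mat& need, const Mat& alloc) {
    int n = (int) need.size(), m = (int) available.size();
    Vec work = available;                        // Work = Available
    std::vector<bool> finish(n, false);          // Finish[i] = False
    bool progress = true;
    while (progress) {
        progress = false;
        for (int i = 0; i < n; ++i) {
            if (finish[i]) continue;
            bool fits = true;                    // Need[i] <= Work ?
            for (int j = 0; j < m; ++j)
                if (need[i][j] > work[j]) { fits = false; break; }
            if (fits) {                          // pi can run to completion:
                for (int j = 0; j < m; ++j)      // reclaim its allocation
                    work[j] += alloc[i][j];
                finish[i] = true;
                progress = true;
            }
        }
    }
    for (bool f : finish)
        if (!f) return false;                    // some process can never finish
    return true;
}

// Pretend to grant Request for process i; keep the grant only if safe.
bool tryGrant(int i, const Vec& request, Vec& available, Mat& alloc, Mat& need) {
    int m = (int) available.size();
    for (int j = 0; j < m; ++j)
        if (request[j] > need[i][j] || request[j] > available[j])
            return false;                        // error, or pi must wait
    for (int j = 0; j < m; ++j) {                // provisional allocation
        available[j] -= request[j];
        alloc[i][j] += request[j];
        need[i][j] -= request[j];
    }
    if (isSafe(available, need, alloc))
        return true;                             // safe: the grant stands
    for (int j = 0; j < m; ++j) {                // unsafe: restore the old state
        available[j] += request[j];
        alloc[i][j] -= request[j];
        need[i][j] += request[j];
    }
    return false;
}

This can be checked directly against the data in the exercise below.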
6.5 Exercise
1) Consider the following resource allocation data for five processes competing for four
resource types.
Available = [ 1 5 2 0 ]

      [ 0 0 1 2 ]                [ 0 0 1 2 ]          [ 0 0 0 0 ]
      [ 1 7 5 0 ]                [ 1 0 0 0 ]          [ 0 7 5 0 ]
Max = [ 2 3 5 6 ]   Allocation = [ 1 3 5 4 ]   Need = [ 1 0 0 2 ]
      [ 0 6 5 2 ]                [ 0 6 3 2 ]          [ 0 0 2 0 ]
      [ 0 6 5 6 ]                [ 0 0 1 4 ]          [ 0 6 4 2 ]
a) Show that this system is in a safe state.
b) If a request, [0, 4, 2, 0], comes in from the second process, will the operating system
grant it?
7 Scheduling
7.1 Introduction
CPU scheduling deals with the problem of deciding which of the processes in the READY
queue is to be allocated the CPU. The criteria used for comparing different CPU scheduling
algorithms include:
• Utilisation: The idea is to keep the CPU as busy as possible. In real systems CPU
utilisation could vary from 40 to 90 percent.
• Throughput: One way to measure the work done by the CPU is to count the number
of tasks that are completed per unit of time.
• Turnaround Time: The interval of time from submission to completion. Includes,
waiting time, executing time and I/O time.
• Waiting Time: Some scheduling algorithms just try to minimise the waiting time rather
than the complete turnaround time.
• Response Time: The time from submission of a request until the first response is
produced. Often used as a criterion in interactive systems.
7.2 F.C.F.S. Scheduling

In First-Come-First-Served scheduling the CPU is allocated to tasks in the order in which
they arrive. The performance of FCFS is, however, often quite poor. Consider the following
three tasks with known CPU burst times. We can compute the average turnaround time (ATT)
to service these three CPU bursts:

Task  Burst Time
T1    24
T2    3
T3    3

If the tasks arrive in the order 1, 2 and 3 to a FCFS scheduler then the average turnaround
time can be computed with the aid of a Gantt chart:

T1 T2 T3

ATT = (24 + 27 + 30)/3 = 27
On the other hand, if the tasks arrive in the reverse order, 3 then 2 then 1, a much better
result is obtained.

T3 T2 T1

ATT = (3 + 6 + 30)/3 = 13
Thus we conclude that the average turnaround time for FCFS scheduling is not in general
minimal.
FCFS also behaves badly when one CPU-bound task runs alongside many I/O-bound tasks:

• The CPU bound task gets the CPU and holds it. The I/O bound tasks all wait for
the CPU in the READY queue. The I/O tasks are IDLE.
• The CPU task finishes and moves to an I/O device. All the I/O tasks finish quickly
and now wait in the I/O queue. The CPU is now IDLE.
There is a convoy effect as these two situations repeat resulting in plenty of IDLE time.
The way around this problem is to allow shorter tasks to go first!
7.3 S.T.F. Scheduling

In Shortest-Task-First scheduling the CPU is assigned to that task with the smallest next
CPU burst time. For example:
Gantt chart: T2 runs from 0 to 3, T1 from 3 to 9, T4 from 9 to 16, T3 from 16 to 24.

ATT = (3 + 9 + 16 + 24)/4 = 13
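The turnaround arithmetic above is easy to mechanise. Here is a small sketch; it assumes
all tasks arrive at time 0, and it uses burst times 6, 3, 8 and 7 for T1, T2, T3 and T4 in
the STF example, values which are consistent with the completion times 3, 9, 16 and 24
shown above.

    def average_turnaround(bursts):
        time, total = 0, 0
        for burst in bursts:
            time += burst        # completion time of this task
            total += time        # turnaround = completion - arrival (= 0)
        return total / len(bursts)

    print(average_turnaround([24, 3, 3]))            # FCFS order: 27.0
    print(average_turnaround([3, 3, 24]))            # reverse order: 13.0
    print(average_turnaround(sorted([6, 3, 8, 7])))  # STF order: 13.0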
Theorem

STF scheduling is optimal in that it yields the minimum average waiting time.

Proof

If the tasks are executed in the order T1 , T2 , . . . , Tn with burst times t1 , t2 , . . . , tn then
task Ti must wait for the bursts of all the tasks before it, so the total waiting time is

(n − 1)t1 + (n − 2)t2 + · · · + (n − i)ti + · · · + 0tn

which is minimised if ti ≤ tj whenever i < j since ti has a larger weight than tj . Q.E.D.
Although STF is optimal in the shortest waiting time sense, it is un-implementable. There
is no way that the operating system knows the length of the next CPU burst time.
One approach is to try and approximate STF scheduling. We try to predict the length of
the next CPU burst by considering the burst time history of the task.
For example let tn be the length of the nth CPU burst and let τn be our predicted value
for the nth CPU burst. Then let
τn+1 = αtn + (1 − α)τn
where α is a parameter satisfying 0 ≤ α ≤ 1 which controls the relative weight of recent
and past history in our prediction.
If α = 0 then τn+1 = τn and our estimate never changes. If α = 1 then τn+1 = tn and only
the most recent CPU burst time is taken into account. However if 0 < α < 1 then the
following expansion shows how past history is incorporated:

τn+1 = α tn + (1 − α)α tn−1 + (1 − α)^2 α tn−2 + · · · + (1 − α)^(n+1) τ0

We see that each successive term has less weight than its predecessor. Note that the initial
τ0 can be a system constant.
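A sketch of the predictor, holding the running estimate in a closure:

    def make_predictor(alpha, tau0):
        tau = tau0                        # the initial system constant
        def predict(last_burst=None):
            nonlocal tau
            if last_burst is not None:    # fold in the newest burst
                tau = alpha * last_burst + (1 - alpha) * tau
            return tau
        return predict

    predict = make_predictor(alpha=0.5, tau0=10)
    for burst in [6, 4, 6, 4, 13, 13, 13]:
        print(round(predict(burst), 2))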
7.4 Priority Scheduling

In this type of scheduling a priority is associated with each task and the CPU is allocated
to the task with the highest priority. Equal priority tasks are scheduled according to FCFS.
Note that STF scheduling is just priority scheduling with the priority set to 1/τ , the inverse
of the predicted next CPU burst time.
Internal priority algorithms will use factors such as, time limits, memory requirements,
number of open files and ratio of I/O to CPU bursts to compute the priority of a given
task.
External priority algorithms will use factors such as, funds, sponsors and politics to allocate
priorities.
Rumour has it that when they closed down the IBM 7094 at MIT in 1973 they found a
low-priority task that had been submitted in 1967.
7.5 Preemptive Scheduling

FCFS, STF and PRIORITY scheduling algorithms are non-preemptive by nature. Once
the CPU has been allocated to a process it stays allocated until the process releases the
CPU, (either by terminating or by requesting I/O).
FCFS is intrinsically non-preemptive but the other two can be modified to be preemptive
algorithms. For example if a new task arrives in the queue with a shorter expected burst-
time (or a higher priority) than the currently executing task, then the currently executing
task is preempted and the new task is assigned to the CPU.
For example, consider the following four tasks:

Task  Arrival Time  Burst Time
T1         0             8
T2         1             4
T3         2             9
T4         3             5

A preemptive STF schedule runs T1 from 0 to 1, T2 from 1 to 5, T4 from 5 to 10, T1 again
from 10 to 17 and T3 from 17 to 26:

ATT = (17 + 4 + 24 + 7)/4 = 13

If instead the tasks are run non-preemptively in arrival order, T1 from 0 to 8, T2 from 8
to 12, T3 from 12 to 21 and T4 from 21 to 26, a worse result is obtained:

ATT = (8 + 11 + 19 + 23)/4 = 15 1/4

Note that the task arrival times have been subtracted from the completion times in the
above calculations.
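A sketch of this preemptive rule (shortest remaining time first), stepped in whole time
units, with the arrival/burst table above as input:

    def srtf_att(tasks):                    # tasks: list of (arrival, burst)
        remaining = [b for _, b in tasks]
        finish = [None] * len(tasks)
        time = 0
        while any(r > 0 for r in remaining):
            ready = [i for i, (a, _) in enumerate(tasks)
                     if a <= time and remaining[i] > 0]
            if not ready:
                time += 1
                continue
            i = min(ready, key=lambda j: remaining[j])  # shortest remaining
            remaining[i] -= 1                           # run one time unit
            time += 1
            if remaining[i] == 0:
                finish[i] = time
        return sum(f - a for f, (a, _) in zip(finish, tasks)) / len(tasks)

    print(srtf_att([(0, 8), (1, 4), (2, 9), (3, 5)]))   # -> 13.0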
7.6 Round Robin Scheduling

A Round Robin scheduling algorithm is usually used for time sharing systems. A small
unit of time, called a time-slice is defined. The READY queue is treated as a circular
queue and the CPU scheduler goes around the READY queue allocating the CPU to each
process for one time-slice.
For example, with the three tasks from the FCFS example (burst times 24, 3 and 3) and
a time-slice of 4, the schedule runs T1 from 0 to 4, T2 from 4 to 7, T3 from 7 to 10, and
then T1 for its remaining five slices from 10 to 30:

ATT = (30 + 7 + 10)/3 = 15 2/3
Note that an infinite time-slice is equivalent to FCFS scheduling while a very small time
slice is equivalent to each of n processes running on its own processor at 1/n the speed of
the real processor. Again there are overheads with roll-out that have not been taken into
account.
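A sketch of the round robin loop, with the READY queue as a circular queue:

    from collections import deque

    def rr_att(bursts, time_slice):
        queue = deque(range(len(bursts)))   # all tasks arrive at time 0
        remaining = list(bursts)
        finish = [0] * len(bursts)
        time = 0
        while queue:
            i = queue.popleft()
            run = min(time_slice, remaining[i])
            time += run
            remaining[i] -= run
            if remaining[i] > 0:
                queue.append(i)             # back to the end of the queue
            else:
                finish[i] = time
        return sum(finish) / len(bursts)

    print(rr_att([24, 3, 3], 4))            # -> 15.666...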
7.7 Scheduling Tasks on more than one Processor

In this section we investigate algorithms for scheduling tasks on more than one processor.
We assume that the tasks are independent and can run in any order we wish. However, once
a task starts it must run to completion so we will not be looking at preemptive algorithms
in this section. When processing on more than one processor the measure that is usually
optimised is the total throughput time. No optimal algorithm is known for minimising total
throughput time. We will investigate one heuristic algorithm.
We define the largest processing time schedule, or LPT schedule, as the result of an algo-
rithm which, whenever a processor becomes free, assigns that task whose execution time
is the largest of those tasks not yet assigned. For cases when there is a tie, an arbitrary
tie-breaking rule can be employed. Consider the following example:
An LPT schedule for this set of tasks on 3 processors turns out to be an optimal schedule
for 3 processors:
[Gantt chart: the tasks scheduled by LPT on processors P1, P2 and P3 over the time
interval 0 to 8.]
We see that processor 1 finishes last and the total throughput time for the set of tasks
scheduled in this way is:
t(LPT) = 7.5
The LPT schedule is not always an optimal schedule. Consider the following example on
3 processors:
An LPT schedule for this set of tasks has a total throughput of 12.
[Gantt chart: the LPT schedule on P1, P2 and P3 over the time interval 0 to 12.]

t(LPT) = 12
whereas an optimal schedule, an OPT schedule, has a total throughput time of 11.5.
[Gantt chart: the OPT schedule on P1, P2 and P3 over the time interval 0 to 12.]

t(OPT) = 11.5
Just how good is LPT scheduling as compared to an optimal schedule? After a certain
amount of experimentation about the possible shortcomings of an LPT schedule one usually
arrives at the following example for the case of 2 processors:
A 2 processor LPT schedule for this set of tasks has a total throughput of 6.5.

[Gantt charts: the 2 processor LPT schedule and the corresponding OPT schedule, both
drawn over the time interval 0 to 7.]
Constructing LPT and OPT schedules shows that the worst case scenario when 3 processors
are involved is an instance of the following general family.
For the worst case scenario on m processors consider 2m + 1 tasks with execution times
ti = 2m − floor((i + 1)/2) for i = 1, 2, . . . , 2m and with t2m+1 = m. It can be verified by
constructing Gantt charts of the LPT and OPT schedules that:

t(LPT)/t(OPT) = 4/3 − 1/(3m)
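A sketch of the LPT rule, together with the worst case family above for m = 3:

    import heapq

    def lpt(times, m):
        heap = [(0.0, p) for p in range(m)]    # (finish time, processor)
        for t in sorted(times, reverse=True):  # longest task first
            free_at, p = heapq.heappop(heap)   # next processor to free up
            heapq.heappush(heap, (free_at + t, p))
        return max(f for f, _ in heap)         # total throughput time

    m = 3
    worst = [2 * m - (i + 1) // 2 for i in range(1, 2 * m + 1)] + [m]
    print(lpt(worst, m))   # -> 11.0; the optimum is 9 and 11/9 = 4/3 - 1/9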
Theorem

t(LPT)/t(OPT) ≤ 4/3 − 1/(3m)
Proof
The theorem is trivially true for m = 1 since in that case t(LP T ) = t(OP T ). So let m ≥ 2.
Assume, contrary to the theorem, that we have a minimal set
of tasks, {T1 , T2 , · · · , Tn }, with execution times, {t1 , t2 , · · · , tn } and assume that the tasks
are ordered so that t1 ≥ t2 ≥ · · · ≥ tn . With these assumptions the LPT schedule will
always assign the tasks in numerical order.
Now assume that Tk finishes last in the LPT schedule with k < n. Then an LPT schedule for
the set of tasks {T1 , T2 , · · · , Tk } would complete at the same time as an LPT schedule for the
tasks {T1 , T2 , · · · , Tn } and this smaller set of tasks would also invalidate our theorem. But
we assumed that our n tasks formed a minimal set so we have a contradiction and can safely
assume that k = n and Tn finishes strictly last in the LPT schedule for {T1 , T2 , · · · , Tn }.
We shall now show that any OPT schedule for {T1 , T2 , · · · , Tn } can have at most two tasks
per processor. First we note that
t(OPT) ≥ (1/m) Σ_{i=1}^{n} ti
Now let τn denote the starting time of Tn in an LPT schedule for {T1 , T2 , · · · , Tn }. Since
no processor can be idle before Tn begins execution we have:
τn ≤ (1/m) Σ_{i=1}^{n−1} ti
m i=1
and hence:
t(LPT)/t(OPT) = (τn + tn)/t(OPT)
             ≤ tn/t(OPT) + (1/(m t(OPT))) Σ_{i=1}^{n−1} ti
             ≤ (m − 1)tn/(m t(OPT)) + (1/(m t(OPT))) Σ_{i=1}^{n} ti
             ≤ (m − 1)tn/(m t(OPT)) + 1
and since the theorem does not hold for {T1 , T2 , · · · , Tn } we have:
4/3 − 1/(3m) < (m − 1)tn/(m t(OPT)) + 1
which simplifies to t(OPT) < 3tn . Therefore, since Tn has the least execution time, we
conclude that if the theorem is violated then no processor can execute more than two tasks
in an optimal schedule for {T1 , T2 , · · · , Tn }.
To complete the proof we will show that an OPT schedule with at most two tasks per
processor can be transformed into an LPT schedule without increasing the total throughput
time, which contradicts our assumption that the theorem was invalid.
[Diagrams omitted: four exchange transforms, Types A to D. Each moves or swaps tasks
between a pair of processors Pi and Pj (Type C operates within a single processor Pi ),
and none of them increases the total throughput time of the schedule.]
To turn an OPT schedule into an LPT schedule without increasing the total throughput
time we first employ type A and C transforms to ensure that the m longest tasks are
scheduled first with the longest on processor 1 and the mth longest on processor m. We
then employ transforms of type A and B to ensure that the (m + 1)th longest task is
scheduled second on processor m while the (m + 2)th longest task is scheduled second on
processor m − 1. We carry on in reverse order up the list of processors until all tasks are
scheduled in an almost LPT fashion.
The only situation that could prevent a true LPT schedule from being generated would be
if one of the tasks scheduled second on a higher numbered processor completed before one
of the tasks scheduled first on a lower numbered processor. In this case a simple downward
shuffle of type D remedies the problem.
Now, as none of the transforms used increase throughput time, the total throughput time
of the resulting LPT schedule will be the same as the total throughput time of the original
OPT schedule and the contradiction is established.
7.8 Preemptive Schedules for more than one Processor

If we introduce preemption and remove the restriction that once a task has begun it must
run to completion, we shall find that in general total throughput times can be improved.
For a simple example, consider three tasks of unit execution time that must be scheduled
on two processors.
Without preemption one processor must run two of the tasks in sequence, so the best
non-preemptive schedule takes 2 time units. With preemption one task can be split across
the two processors and the schedule completes in 1.5 time units:

[Gantt charts: the non-preemptive schedule, finishing at 2, and the preemptive schedule,
finishing at 1.5.]
Note that in a preemptive schedule, tasks may be stopped and restarted at will and on any
processor but there must not be an overlap of scheduled execution time for any one task.
For any set of independent tasks the following preemptive scheduling result has been known
for some time:
Theorem

tmin = max{ max_i {ti } , (1/m) Σ_{i=1}^{n} ti }
Proof
It is clear that tmin must be a lower bound for the total throughput time since no schedule
can terminate in less time than it takes for the longest task to complete, and since a
schedule can not be more efficient than to keep all the processors busy throughout the
duration of the schedule.
To see that tmin can actually be achieved by a preemptive schedule consider the following
construction:
Sort the list of tasks into execution time order with the longest execution time first so that

tmin = max{ t1 , (1/m) Σ_{i=1}^{n} ti }
Now generate a schedule with total throughput time of tmin by scheduling T1 on the first
processor, T2 in the remaining time on the first processor and any extra time on the second
processor, T3 in the remaining time on the current processor with any extra on the next
processor. Continue in this way until all tasks are scheduled. Note that all processors
will be fully booked if t1 ≤ (1/m) Σ_{i=1}^{n} ti but there will be free time on the higher
numbered processors if this is not the case.
A 3 processor preemptive optimal schedule for the set of tasks with execution times 6,
4.5, 4, 3.5 and 3 has a total throughput of:

tmin = (6 + 4.5 + 4 + 3.5 + 3)/3 = 7
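The wrap-around construction used in this proof is easy to mechanise. A sketch, with a
small tolerance to keep the floating point arithmetic honest:

    def mcnaughton(times, m):
        tmin = max(max(times), sum(times) / m)
        schedule = [[] for _ in range(m)]   # (task, start, stop) triples
        p, t = 0, 0.0
        for i, ti in enumerate(sorted(times, reverse=True)):
            while ti > 1e-9:                # some of task i still unplaced
                if tmin - t < 1e-9:         # processor p is fully booked
                    p, t = p + 1, 0.0
                run = min(ti, tmin - t)     # the piece that fits here
                schedule[p].append((i, t, t + run))
                t += run
                ti -= run
        return tmin, schedule

    tmin, schedule = mcnaughton([6, 4.5, 4, 3.5, 3], 3)
    print(tmin)                             # -> 7.0

Because every execution time is at most tmin, the two pieces of a task that is split across
a processor boundary can never overlap in time.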
7.9 Scheduling Dependent Tasks

Let T = {T1 , T2 , . . . , Tn } be a set of tasks. Suppose the tasks are constrained to run in some
order of precedence. The precedences are specified as a directed graph, G whose nodes are
the set of tasks T and whose directed edges are members of T × T with Ti → Tj if task Ti
must complete before task Tj starts. If there is a sequence of edges Ti → Tj → . . . → Tk
then we say that Ti is a predecessor of Tk and that Tk is a successor of Ti . Two tasks are said
to be independent if neither is a successor of the other. Independent tasks may be executed
in any order or even at the same time if more than one processor is available. Dependent
tasks must be executed in the order specified by the directed edges in the precedence graph.
In the previous section we were concerned with scheduling sets of independent tasks on
m processors. In this section we will investigate two algorithms for scheduling dependent
tasks on m processors. The particular dependence is always given by a precedence graph.
7.9.1 A-Scheduling

A-scheduling makes use of a list ordering sub-algorithm which compares two lists of task
labels, L = (l1 , l2 , . . .) and L' = (l'1 , l'2 , . . .):

Step 1 Sort both lists into decreasing order.
Step 2 If both lists are empty then they are equal; if exactly one list is empty then it is
the smaller.
Step 3 Compare element by element until two unequal elements in the same position are
found, call them lj and l'j . We say that L < L' if lj < l'j else L > L' .
Step 4 If all elements are equal up until the last element of L then L < L' if L' is longer in
length than L else the lists are equal.
and a labelling sub-algorithm which is designed to give each of the n tasks in the precedence
graph a label:
Step 1: An arbitrary task T with no successor is chosen and given the label, 1.
Step 2: Suppose for some k, the labels 1, 2, . . . , k − 1 have already been assigned. Consider
the set of tasks that have not yet received a label but whose successors have already
been labelled. To each of these tasks attach a list of labels of its successors and
choose the task T with the smallest successor list to receive the label k. Note that
the smallest successor list is chosen according to the list ordering algorithm above.
Step 3: Repeat step 2 until all tasks in the precedence graph have received labels.
Once labels have been assigned to each task the scheduling algorithm is simple:
A-Schedule: Whenever a processor becomes free assign that task all of whose predecessors have
already been executed and which has the largest label among those tasks not yet
assigned.
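A sketch of the labelling sub-algorithm and of A-scheduling itself, assuming unit execution
times; here succ maps each task to a list of its immediate successors. Note that Python
compares lists of numbers exactly as steps 1 to 4 of the list ordering algorithm above:

    def label_tasks(tasks, succ):
        labels = {}
        while len(labels) < len(tasks):
            # unlabelled tasks whose successors are all labelled
            ready = [t for t in tasks if t not in labels
                     and all(s in labels for s in succ.get(t, []))]
            # the decreasing list of successor labels for each candidate
            def successor_list(t):
                return sorted((labels[s] for s in succ.get(t, [])),
                              reverse=True)
            chosen = min(ready, key=successor_list)  # smallest list wins
            labels[chosen] = len(labels) + 1         # receives label k
        return labels

    def a_schedule(tasks, succ, m):
        labels = label_tasks(tasks, succ)
        pred = {t: set() for t in tasks}
        for t, successors in succ.items():
            for s in successors:
                pred[s].add(t)
        done, steps = set(), []
        while len(done) < len(tasks):
            ready = sorted((t for t in tasks
                            if t not in done and pred[t] <= done),
                           key=lambda t: -labels[t])  # largest label first
            steps.append(ready[:m])                   # one per processor
            done.update(ready[:m])
        return steps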
Theorem
A-schedules for two processor systems are optimal when all tasks have equal execution
times.
Proof
The proof of the fact that A-schedules are optimal under the above conditions is rather
difficult and will not be attempted in this course. The interested student is referred to ***.
7.9.2 Examples
1) Consider a set of 22 tasks that satisfy the following precedence relations and all have
unit execution times. {T1 → T4 , T2 → T4 , T3 → T4 , T4 → T5 , T5 → T6 , T5 → T7 ,
T6 → T9 , T6 → T10 , T7 → T10 , T8 → T11 , T8 → T12 , T9 → T12 , T10 → T12 , T10 → T13 ,
T10 → T14 , T12 → T15 , T12 → T16 , T13 → T15 , T13 → T16 , T14 → T15 , T14 → T16 ,
T15 → T17 , T16 → T17 , T16 → T18 , T16 → T19 , T18 → T20 , T18 → T21 }.
a) Draw the precedence graph for this set of tasks.
b) Label the tasks according to the labelling algorithm.
c) Produce an A-schedule for the tasks on two processors.
2) Show that an A-schedule is not necessarily optimal when three processors are involved.
Use the following precedence relations to provide a counter example. Assume all tasks
have equal execution times. {T1 → T4 , T2 → T4 , T3 → T4 , T4 → T6 , T5 → T7 , T5 → T8 ,
T5 → T9 , T5 → T10 , T5 → T11 , T5 → T12 }.
3) Show that an A-schedule is not necessarily optimal when the tasks involved do not have
equal execution times. Use the following precedence relations to provide a counter
example. Assume that all tasks except task T3 execute in unit time while task T3
requires two units to execute. {T1 → T4 , T1 → T5 , T2 → T4 , T2 → T5 , T3 → T5 }
7.9.3 B-Scheduling
The B-schedule is optimal on any number of processors for sets of tasks, each of unit
execution time, whose precedence graphs are singly rooted trees. Each task in the tree
except for the root task has exactly one successor task. The structure of the tree must be
such that the independent tasks are the leaves of the tree while the root of the tree is a
task that can only start once all the other tasks in the set have been completed.
The B-schedule requires the concept of a level which is as follows: The root of a tree is
at level 0. All tasks that are predecessors of the root are at level 1. All tasks that are
predecessors of level 1 tasks are at level 2, and so on.
B-Schedule: Whenever a processor becomes free, assign that task, if any, all of whose predeces-
sors have already executed and which is at the highest level of those tasks not yet
assigned. If there is a tie then an arbitrary tie-breaking rule may be used.
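A sketch of the level computation; here succ maps each non-root task to its single successor,
and the root has no entry:

    def levels(tasks, succ):
        level = {}
        def depth(t):
            if t not in level:
                level[t] = 0 if t not in succ else 1 + depth(succ[t])
            return level[t]
        for t in tasks:
            depth(t)
        return level

The B-schedule itself is then the A-schedule loop sketched earlier with the highest level
taking the place of the largest label when choosing among the ready tasks.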
7.9.4 Example
1) Consider a set of 12 tasks that satisfy the following precedence relations and all have
unit execution times. {T1 → T3 , T2 → T3 , T3 → T9 , T4 → T9 , T5 → T10 , T6 → T10 ,
T7 → T10 , T8 → T11 , T9 → T11 , T10 → T12 , T11 → T12 }
a) Draw the precedence tree for this set of tasks.
b) Which task is the root task?
c) Produce a B-schedule for this set of tasks on three processors.
8 Virtual Memory and Paging
8.1 Introduction
We consider a system consisting of two memory levels, main and auxiliary. At time t = 0
assume that one program is residing in auxiliary memory. The program is divided into n
pages each consisting of c contiguous addresses. The program must run in a main memory
consisting of m page frames. If m < n then a paging algorithm is required to calculate
what page must be in which page frame at any particular time t.
Each time the program makes a reference we are only interested in the index of the page
or page frame referenced and not with the individual words within the page. Therefore if
we regard N = {1, 2, 3, . . . , n} as the set of pages and M = {1, 2, 3, . . . , m} as the set of
page frames then at each moment of time there is a page map, ft : N → M ∪ {0} such that
ft (x) = y if page x resides in page frame y at time t
ft (x) = 0 if page x is missing from M at time t
When the processor generates an address α the hardware computes a memory location
β = ft (x)c + γ, where x and γ are determined from α = xc + γ with 0 ≤ γ < c. Note that
if c is a power of 2 then the hardware can be organised to make this computation very
efficient. If ft (x) = 0 then the hardware generates a page fault interrupt.
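For illustration, when c is a power of 2 the translation reduces to a shift and a mask. The
page map f and the page size below are assumptions of the sketch, with 0 again meaning
that the page is missing:

    PAGE_BITS = 10                   # page size c = 2^10 addresses
    C = 1 << PAGE_BITS

    def translate(alpha, f):
        x, gamma = alpha >> PAGE_BITS, alpha & (C - 1)  # alpha = x*c + gamma
        y = f.get(x, 0)
        if y == 0:
            raise RuntimeError("page fault")   # page fault interrupt
        return y * C + gamma                   # beta = f(x)*c + gamma

    f = {3: 5}                       # page 3 resides in page frame 5
    print(translate(3 * C + 17, f))  # -> 5 * C + 17 = 5137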
When a page fault interrupt occurs the operating system must find the missing page in
auxiliary memory, place it in main memory, update the map ft , and attempt the reference
again. This is the task of the paging algorithm.
Now suppose that the average time to access a word in a page in main memory is ∆M and
that the average time to transfer a page from auxiliary memory to main memory is ∆A ;
then an important system parameter is the ratio ∆ = ∆A /∆M . On most operating systems
∆ > 10^4 but good paging algorithms should not rely on this assumption.
If the program generates the reference sequence ω = r1 , r2 , . . . and at denotes the time at
which reference rt is completed, then:

at+1 − at = ∆M if rt+1 is in memory
at+1 − at = ∆M + ∆A otherwise
Now the fault rate, F (ω), is defined as the number of page faults encountered while pro-
cessing reference sequence ω normalised by the length of ω. The expected elapsed time for
a reference is thus:

∆M + F (ω)∆A = ∆M (1 + F (ω)∆)

Thus minimising F (ω) for all possible ω will minimise the running time of the program.
In our study of paging algorithms we will only deal with so-called demand paging. Only
the missing page is fetched from auxiliary memory and page replacements only occur when
main memory is full. In the abstract a demand paging algorithm, A, is a mechanism for
processing a reference sequence,
ω = r1 , r2 , . . . , rt , . . . ,
and generating a sequence of memory states,
S0 , S1 , . . . , St , . . . .
Each memory state St is the set of pages from N which reside in M at time t. The memory
states satisfy the following conditions:
S0 = ∅ , St ⊆ N , ‖St ‖ ≤ m , rt ∈ St and

St = St−1 if rt ∈ St−1
St = St−1 + rt if rt ∉ St−1 and ‖St−1 ‖ < m
St = St−1 + rt − rs if rt ∉ St−1 and ‖St−1 ‖ = m and rs ∈ St−1
Note that rt is the page demanded by the next instruction in the program and rs is the
page chosen for overwriting by the operating system’s replacement policy.
Before we discuss specific paging algorithms we require four further definitions to do with
a reference sequence:
ω = r1 , r2 , . . . , rt , . . . .
Firstly, the forward distance dt (x) at time t for page x is the distance to the first reference
to x after time t:
dt (x) = k if rt+k is the first occurrence of x in rt+1 , rt+2 , . . .
dt (x) = ∞ if x does not appear after rt
Secondly, the backward distance bt (x) is the distance to the most recent reference to x
before time t:
bt (x) = k if rt−k is the last occurrence of x in r1 , r2 , . . . , rt
bt (x) = ∞ if x does not appear in r1 , r2 , . . . , rt
Thirdly, the reference arrival time lt (x) denotes the last time before time t that the reference
x was fetched from auxiliary memory.
lt (x) = max{ i ≤ t | Si − Si−1 = {x} }
And fourthly, the reference frequency #t (x) denotes the number of references to x in
r1 , r2 , . . . , rt .
In the following examples of demand paging algorithms we assume that ‖St−1 ‖ = m and
that rt ∉ St−1 . Also let R(St−1 ) denote the page in St−1 that is replaced so that:
St = St−1 + rt − R(St−1 )
Different replacement rules, R, will give rise to different demand paging algorithms:
LRU Least Recently Used: The page in St−1 that is replaced is the one with the largest
backward distance:
R(St−1 ) = y ⇐⇒ bt−1 (y) = max{bt−1 (z) | z ∈ St−1 }
LFU Least Frequently Used: The page in St−1 that is replaced is the one having received
the least use. (The tie-breaking rule is usually LRU)
R(St−1 ) = y ⇐⇒ #t−1 (y) = min{#t−1 (z) | z ∈ St−1 }
FIFO First In First Out: The page replaced is the one that has been in memory for the
longest time:
R(St−1 ) = y ⇐⇒ lt−1 (y) = min{lt−1 (z) | z ∈ St−1 }
LIFO Last In First Out: The page replaced is the one that has been in memory for the
shortest time:
R(St−1 ) = y ⇐⇒ lt−1 (y) = max{lt−1 (z) | z ∈ St−1 }
BEL Belady’s Optimal Algorithm: The page replaced is the one with the largest forward
distance in the sequence rt+1 , rt+2 , . . ..
R(St−1 ) = y ⇐⇒ dt−1 (y) = max{dt−1 (z) | z ∈ St−1 }
If two or more pages have infinite forward distance then the page with the small-
est page number is chosen for replacement. This rule cannot affect the fault-rate
performance as any page with infinite forward distance is never used again.
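For comparison, the replacement rules can all be driven by one fault-counting loop. A
sketch; the reference sequence in the example is a standard test sequence, not one taken
from these notes:

    def count_faults(omega, m, victim):
        memory, nfaults = [], 0          # list order = load order
        for t, page in enumerate(omega):
            if page in memory:
                continue
            nfaults += 1
            if len(memory) == m:
                memory.remove(victim(memory, omega, t))
            memory.append(page)
        return nfaults

    def fifo(memory, omega, t):
        return memory[0]                 # the longest resident page

    def lru(memory, omega, t):           # largest backward distance
        return min(memory,
                   key=lambda p: max(i for i in range(t) if omega[i] == p))

    def belady(memory, omega, t):        # largest forward distance
        def forward(p):
            later = [i for i in range(t + 1, len(omega)) if omega[i] == p]
            return later[0] if later else float("inf")
        # ties (infinite distance) go to the smallest page number
        return max(memory, key=lambda p: (forward(p), -p))

    omega = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
    for rule in (fifo, lru, belady):
        print(rule.__name__, count_faults(omega, 3, rule))  # 9, 10 and 7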
Note that Belady’s algorithm is un-realizable since it requires a look into the future opera-
tion of the program. However it does provide a useful benchmark against which to measure
the performance of the other realizable algorithms.
Theorem
Belady’s demand paging algorithm is optimal in the sense that it results in the minimum
achievable paging cost when processing any reference sequence ω. (paging costs are mea-
sured in units of page replacements and they only start mounting up once memory is
full)
Proof
Let > denote a linear ordering of the references in ω such that y > z if y has greater
forward distance than z at time t.
Let Ck (S +rt −y, t) denote the cost of processing the references, rt+1 , rt+2 , . . . , rt+k starting
from state S at time t. Note that page rt is entering S and overwriting page y. For Belady’s
algorithm to be optimal we must show that for all k:
y > z ⇒ ∆Ck = Ck (S + rt − z, t) − Ck (S + rt − y, t) ≥ 0
since if this is the case then the y to choose to obtain minimal achievable cost is just the
y with greatest forward distance. We will show that ∆Ck = 0 ∨ 1 by induction on k.
The result is trivial for k = 0 since we are then considering processing the next zero page
references and any algorithm is optimal. Now suppose that y > z ⇒ ∆Cj = 0 ∨ 1 for
j = 0, 1, . . . , k − 1; we must show that the same statement is true when j = k. There are
three cases to be considered.

Note: We need not consider the case rt+1 = y since we assume that y > z ≥ rt+1 .

Case 1: rt+1 ∈ S + rt with rt+1 ≠ z. The next reference is present in both memory
states, neither memory faults, and:

∆Ck = Ck−1 (S + rt − z, t + 1) − Ck−1 (S + rt − y, t + 1)

which is 0 ∨ 1 by the induction hypothesis.

Case 2: rt+1 = z. The memory state S + rt − y contains z and does not fault, whereas
S + rt − z faults and z overwrites the page u with greatest forward distance in
S + rt − z:

∆Ck = [1 + Ck−1 (S + rt − u, t + 1)] − Ck−1 (S + rt − y, t + 1)

If u = y then ∆Ck = 1. Otherwise u > y and by the induction hypothesis
Ck−1 (S + rt − y, t + 1) − Ck−1 (S + rt − u, t + 1) is 0 ∨ 1, so again ∆Ck is 0 ∨ 1.

Case 3: rt+1 ∉ S + rt . In this case we have:
∆Ck = Ck (S + rt − z, t) − Ck (S + rt − y, t)
= [1 + Ck−1 (S + rt − z + rt+1 − u, t + 1)] − [1 + Ck−1 (S + rt − y + rt+1 − v, t + 1)]
= Ck−1 (S + rt − z + rt+1 − u, t + 1) − Ck−1 (S + rt − y + rt+1 − v, t + 1)
where u has greatest forward distance in S + rt − z and v has the greatest forward
distance in S + rt − y. Now let s be the element of S + rt − z − y with the greatest
forward distance then there are three possibilities in the ordering of s, y and z.
a) s > y > z In this case u = v = s and ∆Ck reduces to:
Ck−1 ((S + rt + rt+1 − s) − z, t + 1) − Ck−1 ((S + rt + rt+1 − s) − y, t + 1)
which is 0 ∨ 1 by the induction hypothesis.
b) y > s > z In this case u = y and v = s and ∆Ck reduces to:
Ck−1 ((S + rt + rt+1 − y) − z, t + 1) − Ck−1 ((S + rt + rt+1 − y) − s, t + 1)
which is 0 ∨ 1 by the induction hypothesis since s > z.
c) y > z > s In this case u = y and v = z and ∆Ck reduces to:
Ck−1 (S + rt − z + rt+1 − y, t + 1) − Ck−1 (S + rt − y + rt+1 − z, t + 1)
which is 0.
Thus by induction ∆Ck is 0 ∨ 1 for all k and the optimality of Belady’s algorithm is
established.
9 Computer Security
9.1 Introduction
Computer Security has been the subject of intensive research since multi-user operating
systems were first introduced. Its importance continues to grow as more sensitive infor-
mation is stored, transmitted and processed by computers. Some applications include the
military, banks, credit bureaus and hospitals. Security flaws of computer systems and ap-
proaches to penetration have been enumerated in the literature. Here are some of the more
common flaws:
• The system does not authenticate itself to the user. A common way to steal passwords
is for an intruder to leave a running process which masquerades as the standard
system log-on. After an unsuspecting user enters an identification and a password, the
masquerader records the password, gives an error message (identical to the standard
one provided by the log-on process in the case of a mistyped password) and aborts.
The true log-on process is left to take care of any retry.
• Improper handling of passwords. Passwords may not be encrypted, or the table
of encrypted passwords may be exposed to the general public, or a weak encryption
algorithm may be used.
• Improper implementation: A security mechanism may be well thought out but improp-
erly implemented. For example, timely user abortion of a system process may leave a
penetrator with system administrator access rights.
• Trojan horse: A borrowed program may surreptitiously access information that be-
longs to the borrower and deliver this information to the lender.
• Clandestine code: Under the guise of correcting an error or updating an operating
system, code can be embedded to allow subsequent unauthorised entry to the system.
In this section we will study cryptographic methods for access control and message protec-
tion.
9.2 Encryption Systems

An encryption system involves an encrypting procedure, executed by a sender, which takes
a message (called the plain-text) and a small piece of
information (called the key) and creates an encoded version of the message (called the
cipher-text). The cipher-text is transmitted along an open line to a receiver who must then
use a decrypting procedure together with the key to recover the plain-text. The key is
arranged in advance between sender and receiver.
When we consider the quality of an encryption system, we assume that a third-party trying
to decode the message knows the encryption and decryption procedures and has a copy
of the cipher-text. The only thing missing is the key. We also assume that the sender
does not spend time trying to contrive a difficult to read message but relies entirely on the
encryption system to provide all the needed security.
A more demanding standard for measuring the quality of an encryption system is that it
should be safe against a chosen plain-text attack. It is often possible for the third party to
process a known message through the encryption procedure and thus obtain a plain-text
cipher-text pair from which it may be possible to deduce the key.
9.3 Examples
9.3.1 Simple Substitution

This system involves a simple letter-for-letter substitution method. The key is a rearrange-
ment of the 26 letters of the alphabet. For example if the key is given as:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
actqgwrzdevfbhinsymujxplok
Most messages can be decoded without the key by looking for frequently occurring pairs
of letters. (TH and HE are by far the most common pairings to be found in most English
messages). Once these letters have been identified the rest usually fall into place easily. Of
course this system is useless against a known plain-text attack.
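A sketch of simple substitution with the example key above:

    import string

    KEY = "actqgwrzdevfbhinsymujxplok"
    ENC = str.maketrans(string.ascii_uppercase, KEY)
    DEC = str.maketrans(KEY, string.ascii_uppercase)

    cipher = "HELLO WORLD".translate(ENC)
    print(cipher)                   # -> zgffi piyfq
    print(cipher.translate(DEC))    # -> HELLO WORLD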
9.3.2 The Vigenere Cipher

This cipher works by replacing each letter by another letter a specified number of positions
further down the alphabet. For example, J is 5 positions further down from E and D is 5
positions on from Y. The key in this cipher is a sequence of shift amounts. If the sequence
is of length 10 then the first member of the key is used to process the letters in positions
1, 11, 21, ... in the plain-text. The second member of the key is used to process the
letters in positions 2, 12, 22, ... and so on. For example if we use the key:
3 1 7 23 10 5 19 14 19 24
This type of cipher was considered very secure up until 1600 AD but it is not very difficult
to crack. If the length of the key is known then a guess at the first key element coupled with
a table showing possible two letter combinations in positions 1,2 and 11,12 and 21,22
etc will usually reveal the second element of the key. The same technique can be used to
get the rest of the key. Again this cipher is useless against a known plain-text attack.
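A sketch of the cipher, assuming upper-case letters only and taking a key element of 0 to
leave a letter unchanged:

    KEY = [3, 1, 7, 23, 10, 5, 19, 14, 19, 24]

    def vigenere(text, key, sign=+1):
        out = []
        for i, ch in enumerate(text):
            shift = sign * key[i % len(key)]
            out.append(chr((ord(ch) - ord('A') + shift) % 26 + ord('A')))
        return ''.join(out)

    cipher = vigenere("ATTACKATDAWN", KEY)
    print(vigenere(cipher, KEY, sign=-1))   # -> ATTACKATDAWN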
9.3.3 One-Time Pads

In the previous example, if the key-sequence is long enough then the cipher becomes harder
and harder to crack. In the extreme case when the key-sequence is as long as the plain-
text itself the cipher is theoretically unbreakable (since for any possible plain-text there is
a key for which the given cipher-text comes from that plain-text). This type of cipher has
reportedly been used by spies, who were furnished with notebooks containing page after
page of randomly generated key-sequences. Note that it is essential that each key-sequence
be used only once (hence the name one-time pad).
One-time pads seem practical when an agent is communicating with a central command.
They become less attractive if several agents need to communicate with each other.
9.4 Introduction to Number Theory

To ensure that cipher systems are safe one usually resorts to Number Theory. Before
presenting some number theoretic cipher systems we must revise our number theory back-
ground.
9.4.1 Congruences
The congruence a ≡ b mod n says that when divided by n, a and b have the same remainder.
For example:

19 ≡ 7 mod 12 , −6 ≡ 2 mod 8

In the second example we are using −6 = 8(−1) + 2. Note that we always have a ≡ b mod n
for some 0 ≤ b ≤ n − 1, and we are usually only concerned with that b.
If a ≡ b mod n and c ≡ d mod n then congruences can be added and multiplied:

a + c ≡ b + d mod n , ac ≡ bd mod n

Congruences can not, however, be divided; for example 6 ≡ 18 mod 12 but dividing by 2
gives 3 ≢ 9 mod 12.
9.4.2 The Greatest Common Divisor

For any two numbers, a and b, the number (a, b) is the largest number which divides a and
b evenly. For example: (12, 18) = 6 and (30, 69) = 3.
Theorem 1 For any two non-zero integers a and b, there are two other integers x and y,
with the property that (a, b) is the smallest positive integer that can be expressed as:
(a, b) = ax + by
Proof: Consider the set S of all integers that can be written in the form ax + by, and let
S + be the set of all positive integers in S. Now the set S contains the integers a, −a, b and
−b, so the set S + is not empty. Thus S + must have a least element, call this element d.
We must show that d = (a, b).
First by the division algorithm there are integers q and r such that a = dq + r, with
0 ≤ r < d. Thus r = a − dq and since d ∈ S we can write d = ax + by, so that
r = a − (ax + by)q and a little algebra gives r = a(1 − xq) + b(−yq). Thus r is in S but
since 0 ≤ r < d and since d is the smallest element of S + we must have r = 0. So d|a,
and an identical argument shows that d|b.
Now suppose that c|a and c|b, then a = cu and b = cv. Thus d = ax + by = cux + cvy =
c(ux + vy) which shows that c|d. Thus d is the greatest common divisor of a and b. So
d = (a, b).
9.4.3 Euclid’s Algorithm

Euclid’s algorithm computes d = (a, b) together with the x and y of Theorem 1. Suppose
a < b and write b = aq + r with 0 ≤ r < a; then we can rewrite the equation

ax + by = d

as

ax' + ry = d where x' = x + qy
Now try to solve the equation ax' + ry = d by employing the same technique: r < a so
divide r into a getting a quotient and remainder and rewrite the equation, and so on.
Eventually one ends up with an equation of the form sx' + ty' = d where one of s and t is
0 while the other is d. Consider the following example where we are trying to compute
(30, 69).
30x + 69y = d
30x' + 9y = d    [x' = x + 2y]
3x' + 9y' = d    [y' = y + 3x']
3x'' + 0y' = d   [x'' = x' + 3y']

From the last line of this reduction we can read off that x'' = 1 and y' = 0 is a solution
with d = 3. Note that back-substitution will give x = 7 and y = −3 and the solution to
the original problem is:

30(7) + 69(−3) = 3
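A sketch of the whole reduction; note the argument order, so that the example below
reproduces x = 7 and y = −3 for (30, 69):

    def extended_euclid(a, b):            # returns (d, x, y): a*x + b*y = d
        if b == 0:
            return a, 1, 0
        q, r = divmod(a, b)               # a = b*q + r
        d, x, y = extended_euclid(b, r)   # b*x + r*y = d
        return d, y, x - q * y            # a*y + b*(x - q*y) = d

    print(extended_euclid(69, 30))        # -> (3, -3, 7): 69*(-3) + 30*7 = 3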
It is important to realize that this process is feasible on a computer even if a and b are
several hundred digits long. It is easy to show that the larger of the two coefficients
decreases by a factor of 1/2 every two equations. Thus in twenty iterations the larger
coefficient will decrease by a factor of 2^(−10) < 10^(−3) . The greatest common divisor of
two 600 digit numbers could be computed in no more than 4000 iterations.
Theorem 2 If p is prime, a ≢ 0 mod p and ar ≡ as mod p then r ≡ s mod p.

Proof: Since p is prime we have (a, p) = 1 so there are integers x and y such that
ax + py = 1. Hence ax ≡ 1 mod p and r ≡ (1)r ≡ axr ≡ xar ≡ xas ≡ s mod p.
Corollary 3 If p is prime and a ≢ 0 mod p then for any b there is a y with ay ≡ b mod p.

Proof: In the preceding proof we found an x with ax ≡ 1 mod p. Now just take y = bx
and the result follows.
9.4.4 Powers modulo a Prime

Powers a^k mod p can be computed by repeated squaring: compute a, a^2 , a^4 , a^8 , . . . by
successive squaring modulo p and then multiply together those powers that correspond to
the binary digits of k, so that in all only about 2 log2 k modulo multiplications are needed.
Note that even if the numbers are several hundred digits long then although special routines
must be written to handle the modulo multiplications, these calculations with exponents
will be feasible.
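A sketch of repeated squaring; it performs about 2 log2 k modulo multiplications however
large the numbers become:

    def power_mod(a, k, p):
        result = 1
        a %= p
        while k > 0:
            if k & 1:                     # this binary digit of k is 1
                result = (result * a) % p
            a = (a * a) % p               # the next square
            k >>= 1
        return result

    print(power_mod(7, 128, 13))          # agrees with pow(7, 128, 13)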
We will now develop a series of theorems involving powers of numbers in some modulo
arithmetic field.
Theorem 4 Suppose b ≢ 0 and let d be the smallest number such that b^d ≡ 1. Then for
any e > 0, b^e ≡ 1 implies d|e.

Proof: If d ∤ e then e = dq + r for some 0 < r < d and b^r ≡ b^(e−dq) ≡ b^e (b^d )^(−q) ≡ 1 which
contradicts the definition of d.
Theorem 5 A polynomial of degree d has at most d roots in arithmetic modulo a prime p.

Proof: This theorem is proved in the same way as the corresponding theorem in ordinary
algebra: If x = β is a solution then the polynomial can be written as (x − β) times a
polynomial of degree d − 1 which by induction has at most d − 1 solutions.
In cryptography we often work in a field mod p where p is some large prime number. We
will be interested in elements of this field whose powers take on all possible values in
the field. Such an element is called a primitive root. Here is a more formal definition:

Definition: An element a ≢ 0 of the field mod p is a primitive root if its powers
a, a^2 , . . . , a^(p−1) take on all the non-zero values in the field.

Primitive roots are only useful if we know they exist. The following theorem, which is quite
hard to prove, ensures the existence of a primitive root for appropriately chosen p.

Theorem 6 Every prime p has a primitive root.
Proof: Choose any a ≢ 0 and let d be the smallest positive number for which a^d ≡ 1,
(there must be such a number since a^K ≡ a^L implies a^(K−L) ≡ 1). If d = p − 1 then a is a
primitive root. If d < p − 1, we will find a' and d' with d' > d such that d' is the smallest
number with (a')^(d') ≡ 1, and the process can be repeated until a primitive root is found.
We must show that ce is the smallest number with the property that (a')^(ce) ≡ 1.
Now assume that (a')^x ≡ 1; then we have 1 ≡ (a')^(cx) ≡ b^(cx) . So cx = eM for some integer
M and x = (cK + eL)x = e(KM + Lx). So x = ey for some integer y. We will show that
y is divisible by c and we are done.
Thus x is divisible by ce so ce is the smallest number with the property that (a')^(ce) ≡ 1 but
as already mentioned ce > d and hence a' is a better candidate than a for a primitive root.
This completes the proof and some corollaries follow easily.
Corollary If a is a primitive root mod p then p − 1 is the smallest power d for which
a^d ≡ 1.

Proof: We know that a^d ≡ 1 for some d between 1 and p − 1. If d < p − 1 then the
sequence of powers of a would start repeating before all the numbers between 1 and p − 1
were obtained and then a would not be a primitive root.

Corollary If a is a primitive root mod p then any b ≢ 0 can be written as b ≡ a^x for
some x, and b^(p−1) ≡ 1 mod p.

Proof: Let a be a primitive root; then using the previous corollary we have b^(p−1) ≡
(a^x )^(p−1) ≡ (a^(p−1))^x ≡ 1.

Corollary If x ≡ y mod (p − 1) and b ≢ 0 mod p then b^x ≡ b^y mod p.

Proof: Write x = y + k(p − 1); then b^x ≡ b^y (b^(p−1))^k ≡ b^y.
The existence of a primitive root a for any prime p shows that the equation
ax ≡ b mod p
has a solution for any b ≢ 0. We have seen that given the left hand side of this equation it
is usually feasible to compute the right hand side even when the integers involved are large.
Going the other way however is much harder. Given a primitive root a and any element b
the computation of x to satisfy the above equation is called the discrete logarithm problem.
x is called the discrete logarithm of b with respect to the primitive root a modulo prime p.
Many modern encryption systems are based on the fact that no efficient way of computing
discrete logarithms is known.
To make use of the discrete logarithm problem to build an encryption system one must
have a reliable method of finding at least one primitive root a given any prime p. A little
analysis shows that in most cases it will be sufficient to choose a at random and then test
for primitivity. If a turns out to be not primitive then choose another a at random.
The analysis goes as follows: It is easy to show that if a is a primitive root then a^x is a
primitive root if (x, p − 1) = 1.

Firstly, suppose (a^x )^n ≡ 1 mod p; then:

a^(nx) ≡ 1  →  (p − 1) | nx  →  nx = r(p − 1)

Thus:

(x, p − 1) = 1  →  Ax + B(p − 1) = 1
             →  Anx + Bn(p − 1) = n
             →  Ar(p − 1) + Bn(p − 1) = n
             →  (Ar + Bn)(p − 1) = n
             →  (p − 1) | n

and so we have shown that (a^x )^n ≡ 1 → (p − 1) | n which means a^x is a primitive root.
As a first example of how the intractability of the discrete logarithm problem may be used
in a cryptographic setting consider the problem of two people, A and B, trying to agree
on a secret key knowing that a third party, C, is listening to all communications between
them.
The technique is as follows: A and B agree publicly on a large prime p and a primitive
root a. These numbers will also be known to C. Then A secretly chooses a large number
α while B secretly chooses a large number β. Then a^α mod p and a^β mod p are computed
by A and B respectively and publicly announced. The secret key which will be known
only to A and B can then be computed by them as:

key = (a^β )^α ≡ (a^α )^β ≡ a^(αβ) mod p
Note that for C to compute the secret key he would have to determine either α or β from
his knowledge of p, a, a^α and a^β . In other words he would have to solve the discrete
logarithm problem for large modulo arithmetic which no one to this date has been able to
do.
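A toy run of the exchange; a realistic p would be hundreds of digits long, whereas 227 is
chosen here purely for illustration (2 is a primitive root mod 227):

    p, a = 227, 2
    alpha, beta = 157, 83            # chosen secretly by A and B
    A_public = pow(a, alpha, p)      # announced by A
    B_public = pow(a, beta, p)       # announced by B
    key_A = pow(B_public, alpha, p)  # computed privately by A
    key_B = pow(A_public, beta, p)   # computed privately by B
    assert key_A == key_B            # both equal a^(alpha*beta) mod p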
As another example of the use of the intractability of the discrete logarithm problem
consider the following scheme for protecting code against piracy.
The author of the code selects a large prime p with primitive root a and stores these as
constants in the code. The author also chooses a secret number c for that copy of the code
and stores a^c mod p as a constant in the code.
At startup the code computes a machine identity h. This identity could be a combination
of the BIOS id and manufacturing date together with the hard disk id and formatting date.
The code then computes a^h mod p and makes this number known to the user by displaying
it on the screen. The user then phones the author and tells him over the phone the number
displayed on the screen.
If that user is currently paid-up then the author computes a password (a^h )^c mod p
and then phones the user back to inform him of his password.
The user enters the password from the keyboard and the code computes (a^c )^h mod p to
determine if access is granted.
Note that the user cannot compute the password before the code is run since neither c nor
h is obtainable from non-executing code. Naturally tracing must be prohibited to prevent
the password being detected at access determination time.
The idea of public key encryption is to allow a receiver to set up a system so that anyone
can send him an encoded message, but only the receiver will be able to decode it. The
plan is as follows:
The receiver chooses two large primes p and q. He then computes a number e that is
relatively prime to both p − 1 and q − 1. In other words:
(e, p − 1) = (e, q − 1) = 1

He also computes another number d such that:

ed ≡ 1 mod (p − 1) and ed ≡ 1 mod (q − 1)

Finally the receiver computes the product of p and q:

n = pq
The receiver keeps p, q and d secret and publishes e and n. To send a message M < n
to this receiver, any member of the public can compute M^e mod n and transmit M^e to
the receiver safe in the knowledge that no eavesdropper can recover M from M^e . Rivest,
Shamir and Adleman showed that the receiver can recover M from M^e by computing
(M^e )^d mod n. This public key encryption technique has become widely used and is known
as RSA encryption.
To show that RSA encryption is feasible we must show that it is feasible to compute e and
d from knowledge of p and q and we must also show that (M^e )^d ≡ M mod n. Lastly the
reader must be convinced that it is extremely hard to compute p and q from n so that the
secrecy of d is guaranteed.

Firstly to get e such that (e, p − 1) = (e, q − 1) = 1 just choose e to be prime and greater
than p/2 and q/2.

Secondly, since (e, (p − 1)(q − 1)) = 1, Euclid’s algorithm gives a d with ed ≡ 1 mod
(p − 1)(q − 1), and such a d satisfies both of the required congruences.
Thirdly to show that (M^e )^d ≡ M mod p we use the last corollary from the section on
number theory which states that if x ≡ y mod (p − 1) then b^x ≡ b^y mod p. We have
ed ≡ 1 mod (p − 1) so the corollary tells us that M^(ed) ≡ M^1 mod p. Similarly
M^(ed) ≡ M^1 mod q. Thus M^(ed) − M is divisible by both p and q so (M^e )^d ≡ M mod n.
Lastly to convince yourself that the factors p and q can remain secret even if n is known,
consider the fact that the crude approach of dividing n by all numbers up to √n would
take approximately 10^50 steps for a 100 digit n and in the last 100 years many famous
mathematicians have been unable to devise a significantly better factoring algorithm.
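A toy run of the scheme with deliberately tiny primes (a realistic modulus n would be
hundreds of digits long), using Python's built-in modular inverse:

    p, q = 61, 53
    n = p * q                          # 3233, published
    e = 17                             # prime with (e, p-1) = (e, q-1) = 1
    d = pow(e, -1, (p - 1) * (q - 1))  # ed = 1 mod (p-1) and mod (q-1)

    M = 1234                           # the message, M < n
    C = pow(M, e, n)                   # anyone can encrypt with (e, n)
    assert pow(C, d, n) == M           # only the receiver can decrypt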
A problem with public key encryption is that it is easy for a troublemaker C to send a
message to A pretending to be B. This problem can be solved if both A and B have
published their own public keys.

Suppose B wants to send a message M to A. He first encrypts the message using his own
private decryption key dB to get M^(dB) mod nB . He then prepends his name and encrypts
the result using A’s public encryption key to get (B + (M^(dB) mod nB ))^(eA) mod nA . This
mess is sent to A via an open line. A decrypts the mess using his private decryption key dA
and discovers B’s name at the beginning of an encrypted message. A then decrypts the rest
of the message using B’s public encryption key eB . If the result makes sense A is secure in
the knowledge that only someone knowing B’s private decryption key dB could have sent
the message.
ssh2 (Secure Shell) is a program for logging into a remote machine and executing commands
on a remote machine. It is intended to replace rlogin and rsh, and provide secure, encrypted
communications between two untrusted hosts over an insecure network. X11 connections
and arbitrary TCP/IP ports can also be forwarded over the secure channel.
ssh2 connects and logs into the specified hostname. The user must prove his identity to
the remote machine using some authentication method.
Public key authentication is based on the use of digital signatures. Each user creates a
public / private key pair for authentication purposes. The server knows the user’s public
key, and only the user has the private key. The filenames of private keys that are used in
authentication are set in .ssh2/identification. When the user tries to authenticate himself,
the server checks .ssh2/authorisation for filenames of matching public keys and sends a
challenge to the user end. The user is authenticated by signing the challenge using the
private key.
If other authentication methods fail, ssh2 will prompt for a password. Since all communi-
cations are encrypted, the password will not be available for eavesdroppers.
When the user’s identity has been accepted by the server, the server either executes the
given command, or logs into the machine and gives the user a normal shell on the remote
machine. All communication with the remote command or shell will be automatically
encrypted.
If no pseudo tty has been allocated, the session is transparent and can be used to reliably
transfer binary data.
The session terminates when the command or shell on the remote machine exits and all
X11 and TCP/IP connections have been closed. The exit status of the remote program is
returned as the exit status of ssh2.
Ssh2 automatically maintains and checks a database containing the host public keys. When
logging on to a host for the first time, the host’s public key is stored in a file .ssh2/hostkey-
PORTNUMBER-HOSTNAME.pub in the user’s home directory. If a host’s identification
changes, ssh2 issues a warning and disables the password authentication in order to prevent
a Trojan horse from getting the user’s password. Another purpose of this mechanism is
to prevent man-in-the-middle attacks which could otherwise be used to circumvent the
encryption.
ssh2 has been installed on your unix box. Download ssh2 for windows from www.ssh.com
and try to establish a secure shell connection to your unix box. Remember that ssh2 has
replaced ssh, the original secure shell command. Use man ssh2 to get information on ssh2
options. Also make use of ssh-keygen to generate a private/public pair of keys and set up
ssh2 so that you can start a unix session without transmitting a password.
10 Further Reading
This set of notes is supposed to be self-contained. The following books and articles are not
required reading for this course but they may help you to understand some of the topics
presented.
Rute is a dependency consistent UNIX tutorial. This means that you can read it
from beginning to end in consecutive order. This book can be downloaded from:
http://hughm.cs.unp.ac.za/~murrellh/notes/rute.ps
This book is for Linux enthusiasts who want to know how the Linux Kernel works.
It describes the principles and mechanisms that Linux uses. This book can be
downloaded from: http://hughm.cs.unp.ac.za/~murrellh/notes/tlk.ps
This book contains proofs for many of the more difficult theorems discussed in
this course.
Describes security issues with respect to the UNIX operating system. Read the
article and in particular the password encryption scheme. The UNIX password
encryption is based on DES, the Data Encryption Standard.
11 Appendices
The following texts are part of your notes. Please ask your lecturer if there are any
supplementary texts that would enhance your experience.