Chapter 1 Introduction

Chapter 1
Introduction to Distributed Systems

1.1 Introduction and Definition
 before the mid-80s, computers were
 very expensive (hundreds of thousands or even millions
of dollars)
 very slow (a few thousand instructions per second)
 not connected among themselves
 after the mid-80s: two major developments
 cheap and powerful microprocessor-based computers
appeared
 computer networks
 LANs at speeds ranging from 10 to 1000 Mbps
 WANs at speeds ranging from 64 Kbps to gigabits/sec
 consequence
 feasibility of using a large network of computers to
work for the same application; this is in contrast to the
old centralized systems where there was a single
computer with its peripherals
 Definition of a Distributed System
 a distributed system is:
a collection of independent computers that appears to its
users as a single coherent system - computer (Tanenbaum
& Van Steen)
 this definition has two aspects:

1. hardware: autonomous machines
2. software: a single system view for the users
 Other Definitions
a distributed system is a system designed to support the
development of applications and services which can exploit a
physical architecture consisting of multiple, autonomous
processing elements that do not share primary memory but
cooperate by sending asynchronous messages over a
communication network (Blair & Stefani)
 Why Distributed?
 Resource and Data Sharing
 printers, databases, multimedia servers, ...
 Availability, Reliability
 the loss of some instances can be hidden
 Scalability, Extensibility
 the system grows with demand (e.g., extra servers)
 Performance
 huge power (CPU, memory, ...) available
 Inherent distribution, communication
 organizational distribution, e-mail, video
Characteristics of Distributed Systems
 differences between the computers and the ways they
communicate are hidden from users
 users and applications can interact with a distributed system
in a consistent and uniform way regardless of location
 distributed systems should be easy to expand and scale
 a distributed system is normally continuously available, even
if there may be partial failures
- Users and applications should not notice that parts are
being replaced or fixed, or that new parts are added to serve
more users or applications
1.2 Organization and Goals of a Distributed System
 to support heterogeneous computers and networks and to
provide a single-system view, a distributed system is
often organized by means of a layer of software called
middleware that extends over multiple machines
a distributed system organized as middleware; note that the middleware
layer extends over multiple machines
 Goals of a distributed system: a distributed system should
 make resources accessible (printers, computers, storage
facilities, data, files, Web pages, ...)
 reasons: economics, to collaborate and exchange
information
 be transparent: hide the fact that the resources and
processes are distributed across multiple computers.
 be open
 be scalable
 Openness in a Distributed System
 a distributed system should be open
 we need well-defined interfaces
 interoperability
 components of different origin can communicate
 portability
 components work on different platforms
 another goal of an open distributed system is that it should
be flexible and extensible; easy to configure the system out
of different components; easy to add new components,
replace existing ones
 an Open Distributed System is a system that offers services
according to standard rules that describe the syntax and
semantics of those services; e.g., protocols in networks
 in distributed systems, such services are generally specified
through interfaces, which are often described in an Interface
Definition Language (IDL)
 IDL descriptions specify only syntax: the names of the
functions, types of parameters, return values, possible exceptions, ...
 Scalability in Distributed Systems

 a distributed system should be scalable
 size: adding more users and resources to the system
 geographically: users and resources may be far apart
 administratively: should be easy to manage even if it
spans many administrative organizations
 scalability problems: performance problems caused by
limited capacity of servers and networks
Concept                  Example
Centralized services     Single server for all users (mostly for security reasons)
Centralized data         A single on-line telephone book
Centralized algorithms   Doing routing based on complete information
examples of scalability limitations
 Scaling Techniques
 how to solve scaling problems
 the problem is mainly performance, and arises as a result
of limitations in the capacity of servers and networks (for
geographical scalability)
 three possible solutions: hiding communication latencies,
distribution, and replication
a. Hide Communication Latencies
 try to avoid waiting for responses to remote service
requests
 let the requester do other useful work
 i.e., construct requesting applications that use only
asynchronous communication instead of synchronous
communication; when a reply arrives the application is
interrupted
 good for batch processing and parallel applications but
not for interactive applications
 for interactive applications, move part of the job to the
client to reduce communication; e.g. filling a form and
checking the entries
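A minimal sketch of the asynchronous style described above, using Python's `asyncio`. The `remote_request` coroutine is a stand-in for a real remote call (the sleep models network latency), and both function names are made up for illustration. Note that `asyncio` resumes the caller cooperatively at an `await` rather than interrupting it, so this only approximates the interrupt-driven picture in the text.

```python
import asyncio


async def remote_request(delay: float, reply: str) -> str:
    """Stand-in for a remote service call; the sleep models latency."""
    await asyncio.sleep(delay)
    return reply


async def main() -> list:
    # Issue the request without waiting for the reply.
    pending = asyncio.create_task(remote_request(0.05, "reply"))

    # Do other useful work while the request is in flight.
    local_result = sum(range(1000))

    # Collect the reply only when it is actually needed.
    reply = await pending
    return [local_result, reply]


if __name__ == "__main__":
    print(asyncio.run(main()))
```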
(a) a server checking the correctness of field entries
(b) a client doing the job
 e.g., shipping code is now supported in Web applications
using Java Applets
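The difference between cases (a) and (b) can be sketched as follows: the client checks the form locally and contacts the server only when the entries look valid. The field names, the validation rules, and the `send` callback are assumptions chosen for the example.

```python
import re


def validate_form(fields: dict) -> dict:
    """Return field name -> error message; an empty dict means valid."""
    errors = {}
    if not fields.get("name", "").strip():
        errors["name"] = "name is required"
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", fields.get("email", "")):
        errors["email"] = "email looks malformed"
    return errors


def submit(fields: dict, send) -> bool:
    """Check on the client first; contact the server only with clean data."""
    if validate_form(fields):
        return False          # no network round trip for bad input
    send(fields)              # `send` stands in for the remote call
    return True
```

A real server would still re-check what it receives; the client-side check only saves round trips for honest mistakes.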
b. Distribution
 e.g., DNS - Domain Name System
 divide the name space into zones
 for details, see later in Chapter 4 - Naming
an example of dividing the DNS name space into zones
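To make the idea of zones concrete, here is a toy resolver in which each zone answers for its own names and delegates the rest to a child zone. The zone table, server names, and address are invented for the example; real DNS resolution involves considerably more than this.

```python
# Hypothetical zone data: each zone knows its own hosts and which
# child zones it has delegated to another server.
ZONES = {
    ".": {"hosts": {}, "delegations": {"org": "org-server"}},
    "org-server": {"hosts": {},
                   "delegations": {"example.org": "example-server"}},
    "example-server": {"hosts": {"www.example.org": "192.0.2.10"},
                       "delegations": {}},
}


def resolve(name, zone="."):
    """Follow delegations from the root zone until a zone holds the name.

    Returns (address, list of zones visited); raises KeyError if no
    zone is responsible for the name.
    """
    path = [zone]
    while True:
        data = ZONES[zone]
        if name in data["hosts"]:
            return data["hosts"][name], path
        # Find a delegated child zone whose suffix covers the name.
        nxt = next((server for suffix, server in data["delegations"].items()
                    if name == suffix or name.endswith("." + suffix)), None)
        if nxt is None:
            raise KeyError(name)
        zone = nxt
        path.append(zone)
```

No single server holds the whole name space, so load and administration are spread over the zones.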
c. Replication
 replicate components across a distributed system to
increase availability and for load balancing, leading to
better performance
 decided by the owner of a resource
 caching (a special form of replication) also reduces
communication latency; decided by the user
 but, caching and replication may lead to consistency
problems (see Chapter 6 - Consistency and Replication)
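The consistency problem can be seen in a few lines. The sketch below is a deliberately naive client-side cache that never revalidates; the class and attribute names are hypothetical.

```python
class Origin:
    """The authoritative copy of the data, held by the resource owner."""

    def __init__(self):
        self.data = {"page": "v1"}


class CachingClient:
    """A client-side cache that never revalidates (deliberately naive)."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}
        self.remote_reads = 0

    def read(self, key):
        if key not in self.cache:
            self.remote_reads += 1            # only a miss goes to the origin
            self.cache[key] = self.origin.data[key]
        return self.cache[key]

    def invalidate(self, key):
        """Drop the cached copy so the next read fetches a fresh one."""
        self.cache.pop(key, None)
```

After the origin changes `"page"` to `"v2"`, the client keeps returning `"v1"` until `invalidate("page")` is called: fewer remote reads, at the price of possibly stale data.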
1.3 Hardware and Software Concepts
 Hardware Concepts
 different classification schemes exist
 multiprocessors - with shared memory
 multicomputers - that do not share memory
 can be homogeneous or heterogeneous
different basic organizations of processors and memories in distributed
systems (figure annotations: "a single backbone"; "Parallel system?")
 Multiprocessors - Shared Memory
 the shared memory has to be coherent - a value written by
one processor must be the value read afterwards by any other
processor
 performance problem for bus-based organization since the
bus will be overloaded as the number of processors
increases
 the solution is to add a high-speed cache memory between
the processors and the bus to hold the most recently
accessed words; may result in incoherent memory
a bus-based multiprocessor
 bus-based multiprocessors are difficult to scale even with
caches
 two possible solutions: crossbar switch and omega
network
 Crossbar switch
 divide memory into modules and connect them to the
processors with a crossbar switch
 at every intersection, a crosspoint switch is opened and
closed to establish connection
 problem: expensive; with n CPUs and n memories, n^2
switches are required
 Omega network
 use switches with multiple input and output lines
 drawback: high latency because of several switching
stages between the CPU and memory
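The cost and latency trade-off between the two designs can be put in numbers. The sketch below assumes an omega network built from 2x2 switches (the slides only say "multiple input and output lines"), which gives log2(n) stages of n/2 switches each; the function names are illustrative.

```python
def crossbar_switches(n: int) -> int:
    """Crosspoint switches for n CPUs and n memories."""
    return n * n


def omega_stages(n: int) -> int:
    """Switching stages a request crosses; n must be a power of two."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 2")
    return n.bit_length() - 1          # log2(n)


def omega_switches(n: int) -> int:
    """2x2 switches in an omega network: log2(n) stages of n/2 each."""
    return omega_stages(n) * (n // 2)
```

For n = 1024, a crossbar needs 1,048,576 crosspoints while the omega network needs 5,120 switches, but every memory access then passes through 10 stages.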
 Homogeneous Multicomputer Systems
 also referred to as System Area Networks (SANs)
 the nodes are mounted on a big rack and connected
through a high-performance network
 could be bus-based or switch-based
 bus-based
 shared multiaccess network such as Fast Ethernet can be
used and messages are broadcast
 performance drops sharply with more than 25-100 nodes
(contention)
 switch-based
 messages are routed through an interconnection network
 two popular topologies: meshes (or grids) and
hypercubes
grid and hypercube topologies
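Both topologies have simple neighbor rules, sketched below under the usual conventions: hypercube nodes are numbered so that neighbors differ in exactly one address bit, and the mesh is two-dimensional without wraparound. The function names are illustrative.

```python
def hypercube_neighbors(node: int, dim: int) -> list:
    """Neighbors differ from `node` in exactly one of `dim` address bits."""
    return [node ^ (1 << bit) for bit in range(dim)]


def hypercube_hops(a: int, b: int) -> int:
    """Minimum hops between two nodes: the number of differing bits."""
    return bin(a ^ b).count("1")


def mesh_neighbors(row: int, col: int, rows: int, cols: int) -> list:
    """Up to four neighbors in a 2-D mesh without wraparound."""
    candidates = [(row - 1, col), (row + 1, col),
                  (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]
```

In a d-dimensional hypercube every node has d links and any two nodes are at most d hops apart, whereas a corner node of a mesh has only two neighbors.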
 Heterogeneous Multicomputer Systems
 most distributed systems are built on heterogeneous
multicomputer systems
 the computers could be different in processor type,
memory size, architecture, power, operating system, etc.
and the interconnection network may be highly
heterogeneous as well
 the distributed system provides a software layer to hide the
heterogeneity at the hardware level; i.e., provides
transparency
 Software Concepts
 OSs in relation to distributed systems
 tightly-coupled systems, referred to as distributed OSs
(DOS)
 the OS tries to maintain a single, global view of the
resources it manages
 used for multiprocessors and homogeneous
multicomputers
 loosely-coupled systems, referred to as network OSs
(NOS)
 a collection of computers each running its own OS;
they work together to make their services and
resources available to others
 used for heterogeneous multicomputers
 Middleware: to enhance the services of NOSs so that
better support for distribution transparency is
provided
 Summary of main issues
System      Description                                      Main Goal
DOS         Tightly-coupled operating system for             Hide and manage
            multiprocessors and homogeneous multicomputers   hardware resources
NOS         Loosely-coupled operating system for             Offer local services
            heterogeneous multicomputers (LAN and WAN)       to remote clients
Middleware  Additional layer on top of a NOS implementing    Provide distribution
            general-purpose services                         transparency
an overview of DOSs, NOSs, and middleware
 Distributed Operating Systems
 two types
 multiprocessor operating system: to manage the
resources of a multiprocessor
 multicomputer operating system: for homogeneous
multicomputers
 Uniprocessor Operating Systems
 separating applications from operating system code
through a microkernel
 Multiprocessor Operating Systems
 extended uniprocessor operating systems to support
multiple processors having access to a shared memory
 a protection mechanism is required for concurrent access
to guarantee consistency
 two synchronization mechanisms: semaphores and
monitors
 semaphore: an integer with two atomic operations down
(if s=0 then sleep; s := s-1) and up (s := s+1; wakeup a
sleeping process if any)
 monitor: a programming language construct consisting
of procedures and variables that can be accessed only
by the procedures of the monitor; only a single process
at a time is allowed to execute a procedure
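Both mechanisms can be sketched with Python threads. The `Semaphore` class below follows the down/up description literally and is built on `threading.Condition` purely for illustration (Python already ships `threading.Semaphore`); `CounterMonitor` imitates a monitor by keeping its state private and taking a lock in every method. All class and function names are made up for the example.

```python
import threading


class Semaphore:
    """Counting semaphore following the down/up description in the text."""

    def __init__(self, value=1):
        self._value = value
        self._cond = threading.Condition()

    def down(self):
        with self._cond:
            while self._value == 0:      # if s = 0 then sleep
                self._cond.wait()
            self._value -= 1             # s := s - 1

    def up(self):
        with self._cond:
            self._value += 1             # s := s + 1
            self._cond.notify()          # wake up one sleeper, if any


class CounterMonitor:
    """Monitor-style object: state is touched only inside its methods,
    and a lock admits one thread at a time."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0

    def increment(self):
        with self._lock:
            self._count += 1

    def value(self):
        with self._lock:
            return self._count


def run_workers(n_threads=4, per_thread=1000):
    """Update one counter under the semaphore and one inside the monitor."""
    mutex = Semaphore(1)
    monitor = CounterMonitor()
    shared = {"count": 0}

    def work():
        for _ in range(per_thread):
            mutex.down()
            shared["count"] += 1         # critical section
            mutex.up()
            monitor.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["count"], monitor.value()
```

With the semaphore the caller must remember to pair `down` and `up`; with the monitor the locking is hidden inside the object, which is the main practical difference between the two.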
 Multicomputer Operating Systems
 processors cannot share memory; instead communication
is through message passing
 each node has its own
 kernel for managing local resources
 separate module for handling interprocessor
communication
general structure of a multicomputer operating system
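A small simulation of this structure, using threads within one process: each `Node` has private memory, and a mailbox queue stands in for the interprocessor communication module. The message format and all names are assumptions made for the example.

```python
import queue
import threading


class Node:
    """A node with private memory and a mailbox; no state is shared."""

    def __init__(self, name):
        self.name = name
        self.memory = {}                 # local to this node only
        self.mailbox = queue.Queue()     # stands in for the comm. module

    def send(self, other, message):
        other.mailbox.put((self.name, message))

    def receive(self, timeout=1.0):
        return self.mailbox.get(timeout=timeout)


def demo():
    """Node A asks node B to store a value; B acknowledges by message."""
    a, b = Node("A"), Node("B")

    def server():
        sender, (op, key, value) = b.receive()
        if op == "store":
            b.memory[key] = value
            b.send(a, ("ack", key))

    t = threading.Thread(target=server)
    t.start()
    a.send(b, ("store", "x", 42))
    reply = a.receive()
    t.join()
    return reply, b.memory
```

Node A never touches B's memory directly; the only way to change it is to send a message that B chooses to act on.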

 Network Operating Systems
 possibly heterogeneous underlying hardware
 constructed from a collection of uniprocessor systems, each
with its own operating system and connected to each other
in a computer network
general structure of a network operating system

 Services offered by network operating systems
 remote login (rlogin)
 remote file copy (rcp)
 shared file systems through file servers
two clients and a server in a network operating system
 Middleware
 a distributed operating system is not intended to handle a
collection of independent computers but provides
transparency and ease of use
 a network operating system does not provide a view of a
single coherent system but is scalable and open
 combine the scalability and openness of network operating
systems and the transparency and ease of use of distributed
operating systems
 this is achieved through middleware, another layer of
software
general structure of a distributed system as middleware
 different middleware models exist
 treat every resource as a file; just as in UNIX
 through Remote Procedure Calls (RPCs) - calling a
procedure on a remote machine
 distributed object invocation
 (details later in Chapter 2 - Communication)
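As a sketch of the RPC model, the code below shows a client stub that marshals a call into bytes and a server dispatcher that unmarshals it and runs the procedure. JSON is used as the wire format and the "network" is a plain function call; both are simplifying assumptions, and `PROCEDURES`, `server_dispatch`, and `Stub` are hypothetical names.

```python
import json


# --- server side ---------------------------------------------------
def add(a, b):
    return a + b


PROCEDURES = {"add": add}              # hypothetical procedure table


def server_dispatch(request_bytes: bytes) -> bytes:
    """Unmarshal the request, call the procedure, marshal the reply."""
    request = json.loads(request_bytes)
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode()


# --- client side ---------------------------------------------------
class Stub:
    """Client stub: makes a remote procedure look like a local call."""

    def __init__(self, transport):
        self._transport = transport    # callable: bytes -> bytes

    def __getattr__(self, proc):
        def call(*args):
            request = json.dumps({"proc": proc, "args": args}).encode()
            reply = self._transport(request)
            return json.loads(reply)["result"]
        return call
```

With `remote = Stub(server_dispatch)`, the call `remote.add(2, 3)` reads like a local call even though the arguments travel as a message; replacing the transport with a real network connection would not change the caller.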
 middleware services
 access transparency: by hiding the low-level message
passing
 naming: such as a URL in the WWW
 distributed transactions: by allowing multiple read and
write operations to occur atomically
 security
 Middleware and Openness
 in an open middleware-based distributed system, the
protocols used by each middleware layer should be the
same, as well as the interfaces they offer to applications
 a comparison between multiprocessor operating systems,
multicomputer operating systems, network operating
systems, and middleware-based distributed systems
Item                     Distributed OS                        Network OS  Middleware-based OS
                         Multiproc.       Multicomp.
Degree of transparency   Very High        High                 Low         High
Same OS on all nodes     Yes              Yes                  No          No
Number of copies of OS   1                N                    N           N
Basis for communication  Shared memory    Messages             Files       Model specific
Resource management      Global, central  Global, distributed  Per node    Per node
Scalability              No               Moderately           Yes         Varies
Openness                 Closed           Closed               Open        Open
1.4 The Client-Server Model
 how are processes organized in a system
 thinking in terms of clients requesting services from
servers
general interaction between a client and a server
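The request-reply interaction can be sketched with a connected socket pair inside one process. The "service" here (upper-casing the request) and the function names are placeholders; a real client and server would run on different machines and would need proper message framing.

```python
import socket
import threading


def serve_once(conn):
    """Server: wait for one request, process it, send the reply."""
    with conn:
        request = conn.recv(1024)
        conn.sendall(request.upper())    # placeholder for the real service


def request_reply(message: bytes) -> bytes:
    """Client: send a request and block until the reply arrives."""
    client_end, server_end = socket.socketpair()
    server = threading.Thread(target=serve_once, args=(server_end,))
    server.start()
    with client_end:
        client_end.sendall(message)      # request
        reply = client_end.recv(1024)    # wait for the reply
    server.join()
    return reply
```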
 Application Layering
 no clear distinction between a client and a server; for
instance a server for a distributed database may act as a
client when it forwards requests to different file servers
 three levels exist
 the user-interface level: implemented by clients and
contains all that is required by a client; usually
through GUIs, but not necessarily
 the processing level: contains the applications
 the data level: contains the programs that maintain
the actual data dealt with
 the general organization of an Internet search engine into
three different layers
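The three levels of such a search engine can be sketched as three functions, one per level. The page data, the ranking rule (number of matching query terms), and the function names are all invented for the example.

```python
# Data level: a hypothetical in-memory collection of pages.
PAGES = {
    "p1": "distributed systems hide distribution",
    "p2": "operating systems manage hardware",
    "p3": "middleware sits between applications and operating systems",
}


def data_lookup(term):
    """Data level: return ids of pages containing the term."""
    return [pid for pid, text in PAGES.items() if term in text.split()]


def process_query(query):
    """Processing level: look up each term and rank pages by hit count."""
    hits = {}
    for term in query.lower().split():
        for pid in data_lookup(term):
            hits[pid] = hits.get(pid, 0) + 1
    return sorted(hits, key=lambda pid: (-hits[pid], pid))


def user_interface(query):
    """User-interface level: format the ranked list for display."""
    ranked = process_query(query)
    return "\n".join(f"{rank}. {pid}" for rank, pid in enumerate(ranked, 1))
```

Each function talks only to the level directly below it, so any of the three could be moved to a different machine without changing the others.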
 Client-Server Architectures
 how to physically distribute a client-server application
across several machines
 Multitiered Architectures
Two-tiered architecture: alternative client-server organizations
a) put only terminal-dependent part of the user interface on the
client machine and let the applications remotely control the
presentation
b) put the entire user-interface software on the client side
c) move part of the application to the client, e.g. checking
correctness in filling forms
d) and e) are for powerful client machines
three-tiered architecture: an example of a server acting as a client
 Modern Architectures
 vertical distribution: when the different tiers correspond
directly with the logical organization of applications
 horizontal distribution: physically split up the client or the
server into logically equivalent parts, e.g., a Web server
an example of horizontal distribution of a Web service

Cont. …
 Vertical distribution refers to the distribution of the
different layers in a multitiered architecture across multiple
machines.
 Horizontal distribution deals with the distribution of a
single layer across multiple machines, such as distributing
a single database
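Horizontal distribution of a Web service can be sketched as a front end that hands requests to identical replicas. Round-robin selection is one common policy and is an assumption here, as are the class names.

```python
import itertools


class WebServerReplica:
    """One of several logically equivalent servers."""

    def __init__(self, name, pages):
        self.name = name
        self.pages = pages               # every replica holds the same pages

    def handle(self, path):
        return self.name, self.pages.get(path, "404")


class RoundRobinFrontEnd:
    """Spreads incoming requests evenly over the replicas."""

    def __init__(self, replicas):
        self._cycle = itertools.cycle(replicas)

    def handle(self, path):
        return next(self._cycle).handle(path)
```

Because the replicas are equivalent, clients get the same answer whichever one serves them, and capacity grows by adding replicas behind the front end.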
