
MSc.

Coursework 2017

Parallel Processing
SUPERVISOR: Dr. SAGVAN ALI SALIH

BY: ZAID HUSSEIN BERJIS


Contents
Chapter One: Introduction to Parallel Processing

Chapter One: Introduction to Parallel Processing

SOME INTERESTING FEATURES OF PARALLEL COMPUTERS


• Better quality of solution.

• Better algorithms.

• Better storage distribution.

• Greater reliability.
Parallelism Methods

1) TEMPORAL PARALLELISM
2) DATA PARALLELISM

TABLE 1.1 Comparison of Temporal and Data Parallel Processing

• Temporal (pipelining idea): Job is divided into a set of independent tasks and tasks are assigned for processing.
  Data parallel: Full jobs are assigned for processing.

• Temporal: Tasks should take equal time; pipeline stages should thus be synchronized.
  Data parallel: Jobs may take different times; no need to synchronize the beginning of jobs.

• Temporal: Bubbles in jobs lead to idling of processors.
  Data parallel: Bubbles do not cause idling of processors.

• Temporal: Processors are specialized to do specific tasks efficiently.
  Data parallel: Processors should be general purpose and may not do all tasks efficiently.

• Temporal: Task assignment is static.
  Data parallel: Job assignment may be static, dynamic, or quasi-dynamic.

• Temporal: Not tolerant to processor faults.
  Data parallel: Tolerates processor faults.

• Temporal: Efficient with fine-grained tasks.
  Data parallel: Efficient with coarse-grained tasks and quasi-dynamic scheduling.

• Temporal: Scales well as long as the number of data items to be processed is much larger than the number of processors in the pipeline and the time taken to communicate a task from one processor to the next is negligible.
  Data parallel: Scales well as long as the number of jobs is much greater than the number of processors and the processing time is much higher than the time to distribute data to processors.
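The following is a minimal Python sketch of the temporal (pipelining) idea, assuming three hypothetical stage functions: each thread repeatedly performs one specialized task and hands its result to the next stage through a queue.

```python
import queue
import threading

def stage(worker, inbox, outbox):
    """Run one pipeline stage: apply `worker` to every item until the sentinel arrives."""
    while True:
        item = inbox.get()
        if item is None:          # sentinel: no more jobs
            outbox.put(None)
            break
        outbox.put(worker(item))

# Three specialized tasks (hypothetical stand-ins for real processing steps).
def read(x):    return x * 10
def compute(x): return x + 1
def write(x):   return f"result={x}"

if __name__ == "__main__":
    q1, q2, q3, out = (queue.Queue() for _ in range(4))
    threads = [
        threading.Thread(target=stage, args=(read, q1, q2)),
        threading.Thread(target=stage, args=(compute, q2, q3)),
        threading.Thread(target=stage, args=(write, q3, out)),
    ]
    for t in threads:
        t.start()

    for job in range(5):          # feed jobs into the first stage
        q1.put(job)
    q1.put(None)                  # signal end of input

    while (r := out.get()) is not None:
        print(r)
    for t in threads:
        t.join()
```

Note how the stages must stay roughly balanced: if one stage is much slower than the others, the downstream threads sit idle, which is exactly the synchronization requirement listed for temporal parallelism in Table 1.1.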
Lately there has been a lot of interest generated all over the world in parallel processors and parallel computers. This is due to the fact that all current microprocessors are parallel processors. Each processor in a microprocessor chip is called a core, and such a microprocessor is called a multicore processor. Multicore processors have an on-chip memory of a few megabytes (MB).

Before trying to answer the question “What is a parallel computer?”, let’s review the structure of a single processor computer (Fig. 1.1). It consists of an input unit which accepts (or reads) the list of instructions to solve a problem (a program) and the data relevant to that problem. It has a memory or storage unit in which the program, data and intermediate results are stored, a processing element, which we will abbreviate as PE (also called a Central Processing Unit (CPU)), which interprets and executes instructions, and an output unit which displays or prints the results.

Figure 1.1: Von Neumann Structure

This structure of a computer was proposed by John Von Neumann in the mid-1940s and is known as the Von Neumann Architecture. In this architecture, a program is first stored in the memory. The PE retrieves one instruction of this program at a time, interprets it and executes it. The operation of this computer is thus sequential: the PE can execute only one instruction at a time. The speed of this sequential computer is therefore limited by the speed at which the PE can retrieve instructions and data from the memory and the speed at which it can process the retrieved data.

To increase the speed of processing, one may increase the speed of the PE by increasing the clock speed. The clock speed increased from a few hundred kHz in the 1970s to 3 GHz in 2005. Processor designers found it difficult to increase the clock speed further because the chip was getting overheated. The number of transistors which could be integrated in a chip could, however, be doubled every two years. Thus, processor designers placed many processing “cores” inside the processor chip to increase its effective throughput. The processor retrieves a sequence of instructions from the main memory and stores them in an on-chip memory; the “cores” can then cooperate to execute these instructions in parallel.

A computer which consists of a number of interconnected computers which cooperatively execute a single program to solve a problem is called a parallel computer. Rapid developments in electronics have led to the emergence of processors which can process over 5 billion instructions per second, and such processors cost only around $100. It is thus possible to economically construct parallel computers which use around 4000 such multicore processors to carry out ten trillion (10^13) instructions per second, assuming 50% efficiency (4000 × 5 × 10^9 × 0.5 ≈ 10^13). The more difficult problem is to perceive parallelism in algorithms and to develop a software environment which will enable application programs to utilize this potential parallel processing power.

Parallel Computing Architecture:

Hardware: A parallel computer is a collection of several interconnected nodes that cooperate and communicate with each other in order to solve complex problems by splitting them into parallel tasks.

In parallel computer systems (or parallel computing), Flynn’s taxonomy is frequently used to classify computer architectures. Flynn classifies parallel processor systems according to the number of instruction streams and the number of data streams they can simultaneously manage.

Flynn’s taxonomy classifies computer architecture into four main categories:

• Single Instruction Single Data machine (SISD).

• Single Instruction Multiple Data machine (SIMD).

• Multiple Instruction Single Data machine (MISD).

• Multiple Instruction Multiple Data machine (MIMD).

Single Instruction Single Data (SISD):

A sequential computer which exploits no parallelism in either the instruction or data streams. A single control unit fetches a single instruction stream from memory. The control unit then generates appropriate control signals to direct a single processing unit (PU) to operate on a single data stream, i.e., one operation at a time.
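For contrast with the parallel classes that follow, a minimal Python sketch of the SISD model: a single instruction stream operating on a single data stream, one operation at a time.

```python
# SISD: one instruction stream applied to one data stream, strictly one step at a time.
data = [2, 4, 6, 8]
total = 0
for x in data:        # the lone processing unit performs one operation per step
    total += x * x
print(total)          # 120
```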
Single Instruction Multiple Data (SIMD):

Single instruction, multiple data, or SIMD, systems are parallel systems. As the name
suggests, SIMD systems operate on multiple data streams by applying the same
instruction to multiple data items, so an abstract SIMD system can be thought of as
having a single control unit and multiple ALUs.

An instruction is broadcast from the control unit to the ALUs, and each ALU either applies the instruction to the current data item, or it is idle.

Note that in a SIMD system, the ALUs must operate synchronously, that is, each ALU must wait for the next instruction to be broadcast before proceeding.

Finally, SIMD systems are ideal for parallelizing simple loops that operate on large
arrays of data. Parallelism that’s obtained by dividing data among the processors and
having the processors all apply the same instructions to their subsets of the data is
called data-parallelism.
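A minimal Python sketch of this data-parallel idea, assuming a hypothetical squaring operation: the data are divided among the workers and every worker applies the same operation to its own subset. (Real SIMD hardware applies one machine instruction to several data elements in lockstep; a process pool only approximates the idea at a coarser grain.)

```python
from multiprocessing import Pool

def square_chunk(chunk):
    """Apply the same operation (squaring) to every element of one slice of the data."""
    return [x * x for x in chunk]

if __name__ == "__main__":
    data = list(range(16))
    n_workers = 4
    # Divide the data among the workers; each worker runs the same code on its subset.
    chunks = [data[i::n_workers] for i in range(n_workers)]
    with Pool(n_workers) as pool:
        results = pool.map(square_chunk, chunks)
    print(results)
```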

Multiple Instruction Single Data (MISD):

Multiple instructions operate on one data stream. It is a type of parallel computing architecture where many functional units perform different operations by executing different instructions on the same data set.
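MISD machines are rarely built in practice; the following toy Python sketch (the analysis functions are merely illustrative) conveys the idea of several different instruction streams examining the same single data set.

```python
import threading

# One shared data stream, examined concurrently by several different "instruction streams".
data = [3, 7, 1, 9, 4]
results = {}

def run(name, func):
    results[name] = func(data)   # each thread executes a different operation on the same data

analyses = {"sum": sum, "max": max, "sorted": sorted}
threads = [threading.Thread(target=run, args=(n, f)) for n, f in analyses.items()]
for t in threads: t.start()
for t in threads: t.join()
print(results)
```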

Multiple Instruction Multiple Data (MIMD):

Multiple autonomous processors simultaneously execute different instructions on different data. MIMD architectures include multicore processors and/or distributed systems.
Thus, MIMD systems typically consist of a collection of fully independent processing
units or cores, each of which has its own control unit and its own ALU. Furthermore,
unlike SIMD systems, MIMD systems are usually asynchronous, that is, the processors
can operate at their own speed.
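A minimal MIMD sketch in Python, assuming two hypothetical workloads: independent processes, each with its own control flow, run different instructions on different data and proceed at their own speed.

```python
from multiprocessing import Process

def count_words(text):
    # One processor: its own program operating on its own data.
    print("words:", len(text.split()))

def sum_squares(numbers):
    # Another processor: a different program operating on different data.
    print("sum of squares:", sum(n * n for n in numbers))

if __name__ == "__main__":
    procs = [
        Process(target=count_words, args=("multiple instruction multiple data",)),
        Process(target=sum_squares, args=([1, 2, 3, 4],)),
    ]
    for p in procs:
        p.start()           # both run asynchronously, at their own speed
    for p in procs:
        p.join()
```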

MIMD systems are currently the most common and can be broadly divided, according to the organization of the memory, into three sub-classes:

• Shared memory machines.

• Distributed memory machines across a network in a distributed environment.

• Mixed memory machines.

MIMD architectures are primarily used in a number of application areas, including the following:

• Computer-aided design.

• Computer-aided manufacturing.

• Simulation.

• Modeling.

• Communication switches etc.

Shared Memory Systems:

In a shared-memory system, a collection of autonomous processors is connected to a memory system via an interconnection network, and each processor can access each memory location. All processors can simultaneously access the same memory, and the result of every operation performed on that memory is immediately visible to all other processors.

Shared memory is an efficient means of communication for passing data between processors. Generally, parallel programming with shared memory is easier and more convenient than with the other types of memory architectures, since all processors can simultaneously access the same memory.

However, in practice, this class is best suited to machines with a limited number of processors, because increasing the number of processors may make access to the shared memory a bottleneck.
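A minimal shared-memory sketch using Python threads (the shared counter is a hypothetical example): every thread reads and writes the same memory location, and a lock serializes the concurrent updates.

```python
import threading

counter = 0                      # one memory location visible to all threads
lock = threading.Lock()

def work(n):
    global counter
    for _ in range(n):
        with lock:               # serialize access to the shared location
            counter += 1

threads = [threading.Thread(target=work, args=(10_000,)) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)                   # 40000: every update is visible to all the others
```

The lock also hints at the bottleneck noted above: as more processors contend for the same location, they spend more time waiting on each other.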

Distributed Memory Systems:

A distributed memory system refers to a computer system in which each processor has its own memory space. Different processors communicate over an interconnection network. Thus, in distributed-memory systems the processors usually communicate explicitly by sending messages through the network or by using special functions that provide access to the memory of another processor.

Distributed memory machines can integrate a large number of processors (possibly in the thousands). Thus, parallel applications using this type of memory architecture can be scalable.

The communication model can allow a considerable increase in speed, but its
programming is difficult since the programmers have to handle all communication
operations.
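A minimal message-passing sketch in the spirit of a distributed-memory system (real systems typically use libraries such as MPI; here a multiprocessing pipe stands in for the network): each process keeps only its own local data and communicates by explicitly sending and receiving messages.

```python
from multiprocessing import Process, Pipe

def worker(conn):
    local = conn.recv()                      # receive data into this process's own memory
    conn.send(sum(x * x for x in local))     # send the partial result back explicitly

if __name__ == "__main__":
    data = list(range(8))
    parent_ends, procs = [], []
    for chunk in (data[:4], data[4:]):
        parent, child = Pipe()
        p = Process(target=worker, args=(child,))
        p.start()
        parent.send(chunk)                   # explicit communication: no memory is shared
        parent_ends.append(parent)
        procs.append(p)
    total = sum(conn.recv() for conn in parent_ends)
    for p in procs:
        p.join()
    print(total)                             # 140
```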

Mixed (or hybrid) Memory Systems:

The combination of both shared and distributed memory mechanisms (known as mixed or hybrid memory architectures) provides a flexible means to adapt to various computing platforms. This combination may increase scalability and performance, speed up computation, and permit efficient utilization of the existing hardware capacities.

However, while this type of architecture combines the advantages of both, it may also combine the disadvantages of the two architectures.
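A minimal hybrid sketch, reusing the hypothetical square-and-sum workload: separate processes play the role of distributed-memory nodes and exchange results by message, while the threads inside each process share that process's memory.

```python
from multiprocessing import Process, Queue
import threading

def node(chunk, results):
    """One 'node': its threads share this process's memory; nodes exchange results by message."""
    partial = [0]
    lock = threading.Lock()

    def thread_work(part):
        s = sum(x * x for x in part)
        with lock:
            partial[0] += s              # shared memory inside the node

    half = len(chunk) // 2
    threads = [threading.Thread(target=thread_work, args=(p,))
               for p in (chunk[:half], chunk[half:])]
    for t in threads: t.start()
    for t in threads: t.join()
    results.put(partial[0])              # message passing between nodes

if __name__ == "__main__":
    data = list(range(8))
    results = Queue()
    procs = [Process(target=node, args=(c, results)) for c in (data[:4], data[4:])]
    for p in procs: p.start()
    total = sum(results.get() for _ in procs)
    for p in procs: p.join()
    print(total)                         # 140
```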
