William Stallings Computer Organization and Architecture 9th Edition


+

William Stallings
Computer Organization
and Architecture
9th Edition
+
Chapter 2
Computer Evolution and Performance
+
History of Computers
First Generation: Vacuum Tubes
 ENIAC
 Electronic Numerical Integrator And Computer
 Designed and constructed at the University of Pennsylvania
 Started in 1943 – completed in 1946
 By John Mauchly and John Eckert

 World’s first general purpose electronic digital computer


 Army’s Ballistics Research Laboratory (BRL) needed a way to supply trajectory tables for new
weapons accurately and within a reasonable time frame
 Was not finished in time to be used in the war effort

 Its first task was to perform a series of calculations that were used to help determine the
feasibility of the hydrogen bomb

 Continued to operate under BRL management until 1955 when it was disassembled
ENIAC
 Weighed 30 tons
 Occupied 1500 square feet of floor space
 Contained more than 18,000 vacuum tubes
 140 kW power consumption
 Capable of 5000 additions per second
 Decimal rather than binary machine
 Memory consisted of 20 accumulators, each capable of holding a 10 digit number
 Major drawback was the need for manual programming by setting switches and plugging/unplugging cables
+
John von Neumann
EDVAC (Electronic Discrete Variable Computer)
 First publication of the idea was in 1945

 Stored program concept


 Attributed to ENIAC designers, most notably the mathematician John von
Neumann
 Program represented in a form suitable for storing in memory alongside
the data

 IAS computer
 Princeton Institute for Advanced Studies
 Prototype of all subsequent general-purpose computers
 Completed in 1952
Structure of von Neumann Machine
+
IAS Memory Formats
 The memory of the IAS consists of 1000 storage locations (called words) of 40 bits each
 Both data and instructions are stored there
 Numbers are represented in binary form and each instruction is a binary code
+
Structure of IAS Computer
+ Registers
Memory buffer register (MBR)
• Contains a word to be stored in memory or sent to the I/O unit
• Or is used to receive a word from memory or from the I/O unit

Memory address register (MAR)
• Specifies the address in memory of the word to be written from or read into the MBR

Instruction register (IR)
• Contains the 8-bit opcode of the instruction being executed

Instruction buffer register (IBR)
• Employed to temporarily hold the right-hand instruction from a word in memory

Program counter (PC)
• Contains the address of the next instruction pair to be fetched from memory

Accumulator (AC) and multiplier quotient (MQ)
• Employed to temporarily hold operands and results of ALU operations
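The register descriptions above imply that one 40-bit memory word holds a left-hand and a right-hand instruction, each with an 8-bit opcode. A minimal Python sketch of how such a word could be split is shown here. It assumes the layout usually described for the IAS: two 20-bit instructions per word, each an 8-bit opcode followed by a 12-bit address, with the left instruction in the high-order bits. The 12-bit address width, the function name, and the sample values are assumptions for illustration and do not appear on the slide.

```python
WORD_BITS = 40
OPCODE_BITS = 8
ADDRESS_BITS = 12  # assumption: not stated on the slide
INSTRUCTION_BITS = OPCODE_BITS + ADDRESS_BITS  # 20 bits per instruction


def split_instruction(instruction):
    """Split one 20-bit instruction into (opcode, address)."""
    opcode = instruction >> ADDRESS_BITS
    address = instruction & ((1 << ADDRESS_BITS) - 1)
    return opcode, address


def decode_instruction_word(word):
    """Return ((left_opcode, left_addr), (right_opcode, right_addr))."""
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("an IAS word must fit in 40 bits")
    left = word >> INSTRUCTION_BITS                 # high-order 20 bits
    right = word & ((1 << INSTRUCTION_BITS) - 1)    # low-order 20 bits
    return split_instruction(left), split_instruction(right)


# Hypothetical word: arbitrary opcodes 0x01 and 0x05, addresses 0x0FA and 0x0FB.
sample_word = (0x01 << 32) | (0x0FA << 20) | (0x05 << 12) | 0x0FB
left_instruction, right_instruction = decode_instruction_word(sample_word)
```

In this sketch the right-hand pair is what the IBR would hold while the left-hand instruction executes, and the opcode of the instruction in flight is what the IR would hold.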
+

IAS
Operations
+

Table 2.1 The IAS Instruction Set


+
Commercial Computers
UNIVAC
 1947 – Eckert and Mauchly formed the Eckert-Mauchly Computer Corporation
to manufacture computers commercially

 UNIVAC I (Universal Automatic Computer)


 First successful commercial computer
 Was intended for both scientific and commercial applications
 Commissioned by the US Bureau of Census for 1950 calculations

 The Eckert-Mauchly Computer Corporation became part of the UNIVAC division of the Sperry-Rand Corporation

 UNIVAC II – delivered in the late 1950’s


 Had greater memory capacity and higher performance

 Backward compatible
+
IBM
 Was the major manufacturer of punched-card processing equipment

 Delivered its first electronic stored-program computer (701) in 1953
 Intended primarily for scientific applications

 Introduced 702 product in 1955
 Hardware features made it suitable to business applications

 Series of 700/7000 computers established IBM as the overwhelmingly dominant computer manufacturer
+
History of Computers
Second Generation: Transistors
 Smaller

 Cheaper

 Dissipates less heat than a vacuum tube

 Is a solid state device made from silicon

 Was invented at Bell Labs in 1947

 It was not until the late 1950’s that fully transistorized computers
were commercially available
Table 2.2 Computer Generations

+
Computer Generations
+
Second Generation Computers

 Introduced:
 More complex arithmetic and logic units and control units
 The use of high-level programming languages
 Provision of system software which provided the ability to:
 load programs
 move data to peripherals and libraries
 perform common computations

 Appearance of the Digital Equipment Corporation (DEC) in 1957

 PDP-1 was DEC’s first computer

 This began the mini-computer phenomenon that would become so prominent in the third generation
Table 2.3 Example Members of the IBM 700/7000 Series


IBM 7094 Configuration
History of Computers
Third Generation: Integrated Circuits

 1958 – the invention of the integrated circuit

 Discrete component
 Single, self-contained transistor
 Manufactured separately, packaged in their own containers, and soldered
or wired together onto masonite-like circuit boards
 Manufacturing process was expensive and cumbersome

 The two most important members of the third generation were the
IBM System/360 and the DEC PDP-8
+
Microelectronics
+ Integrated Circuits
 Data storage – provided by memory cells

 Data processing – provided by gates

 Data movement – the paths among components are used to move data from memory to memory and from memory through gates to memory

 Control – the paths among components can carry control signals

 A computer consists of gates, memory cells, and interconnections among these elements

 The gates and memory cells are constructed of simple digital electronic components

 Exploits the fact that such components as transistors, resistors, and conductors can be fabricated from a semiconductor such as silicon

 Many transistors can be produced at the same time on a single wafer of silicon

 Transistors can be connected with a process of metallization to form circuits
+
Wafer, Chip, and Gate Relationship
+
Chip Growth
Moore’s Law

1965; Gordon Moore – co-founder of Intel

Observed number of transistors that could be put on a single chip was doubling every year

The pace slowed to a doubling every 18 months in the 1970’s but has sustained that rate ever since

Consequences of Moore’s law:

The cost of computer logic and memory circuitry has fallen at a dramatic rate
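The doubling rates quoted on this slide can be turned into a rough projection. The sketch below is only an illustration of the arithmetic, assuming idealized exponential growth with a fixed doubling period; the function name and the starting count of 1000 transistors are made up for the example and are not figures from the slides.

```python
def projected_transistors(initial_count, years_elapsed, doubling_period_years):
    """Idealized growth: the count doubles once per doubling period."""
    doublings = years_elapsed / doubling_period_years
    return initial_count * 2 ** doublings


# Hypothetical starting point of 1000 transistors, projected 3 years ahead.
yearly_doubling = projected_transistors(1000, 3, 1.0)     # doubling every year
slower_doubling = projected_transistors(1000, 3, 1.5)     # doubling every 18 months
```

Over the same three years, a yearly doubling gives three doublings while an 18-month doubling gives two, which is why the slowdown in pace matters so much over a decade.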
+
Table 2.4 Characteristics of the System/360 Family

Table 2.5 Evolution of the PDP-8


+
DEC - PDP-8 Bus Structure
+ Later Generations
 LSI: Large Scale Integration
 VLSI: Very Large Scale Integration
 ULSI: Ultra Large Scale Integration
 Semiconductor Memory
 Microprocessors
+ Semiconductor Memory
+
Microprocessors
 The density of elements on processor chips continued to rise
 More and more elements were placed on each chip so that fewer and fewer
chips were needed to construct a single computer processor

 1971 Intel developed 4004


 First chip to contain all of the components of a CPU on a single chip
 Birth of microprocessor

 1972 Intel developed 8008


 First 8-bit microprocessor

 1974 Intel developed 8080


 First general purpose microprocessor
 Faster, has a richer instruction set, has a large addressing capability
Evolution of Intel Microprocessors
a. 1970s Processors
b. 1980s Processors
c. 1990s Processors
d. Recent Processors
+
Microprocessor Speed
Techniques built into contemporary processors include:
 Pipelining
 Processor moves data or instructions into a conceptual pipe with all stages of the pipe processing simultaneously

 Branch prediction
 Processor looks ahead in the instruction code fetched from memory and predicts which branches, or groups of instructions, are likely to be processed next

 Data flow analysis
 Processor analyzes which instructions are dependent on each other’s results, or data, to create an optimized schedule of instructions

 Speculative execution
 Using branch prediction and data flow analysis, some processors speculatively execute instructions ahead of their actual appearance in the program execution, holding the results in temporary locations, keeping execution engines as busy as possible
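The benefit of pipelining, where all stages of the pipe work simultaneously, can be illustrated with a simple cycle count. This is a sketch under idealized assumptions that are not stated on the slide: every stage takes one clock cycle, a new instruction enters the pipe every cycle, and there are no stalls or branch penalties. The function names and the 5-stage, 100-instruction figures are hypothetical.

```python
def unpipelined_cycles(stages, instructions):
    """Each instruction passes through every stage before the next one starts."""
    return stages * instructions


def pipelined_cycles(stages, instructions):
    """Ideal pipeline: fill the pipe once, then finish one instruction per cycle."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1)


def ideal_speedup(stages, instructions):
    """Ratio of unpipelined to pipelined cycle counts."""
    return unpipelined_cycles(stages, instructions) / pipelined_cycles(stages, instructions)


# Hypothetical 5-stage pipe running 100 instructions.
speedup = ideal_speedup(5, 100)
```

As the instruction count grows, the ideal speedup approaches the number of stages; real processors fall short of this because of dependencies and mispredicted branches, which is what the other three techniques on the slide try to mitigate.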
+
Performance Balance
 Adjust the organization and architecture to compensate for the mismatch among the capabilities of the various components

 Architectural examples include:
 Increase the number of bits that are retrieved at one time by making DRAMs “wider” rather than “deeper” and by using wide bus data paths
 Reduce the frequency of memory access by incorporating increasingly complex and efficient cache structures between the processor and main memory
 Change the DRAM interface to make it more efficient by including a cache or other buffering scheme on the DRAM chip
 Increase the interconnect bandwidth between processors and memory by using higher speed buses and a hierarchy of buses to buffer and structure data flow
Typical I/O Device Data Rates
+
Improvements in Chip Organization
and Architecture
 Increase hardware speed of processor
 Fundamentally due to shrinking logic gate size
 More gates, packed more tightly, increasing clock rate
 Propagation time for signals reduced

 Increase size and speed of caches


 Dedicating part of processor chip
 Cache access times drop significantly

 Change processor organization and architecture


 Increase effective speed of instruction execution
 Parallelism
+
Problems with Clock Speed and Logic Density
 Power
 Power density increases with density of logic and clock speed
 Dissipating heat

 RC delay
 Speed at which electrons flow limited by resistance and capacitance of
metal wires connecting them
 Delay increases as RC product increases
 Wire interconnects thinner, increasing resistance
 Wires closer together, increasing capacitance

 Memory latency
 Memory speeds lag processor speeds
+ Processor Trends
Multicore
 The use of multiple processors on the same chip provides the potential to increase performance without increasing the clock rate
 Strategy is to use two simpler processors on the chip rather than one more complex processor
 With two processors larger caches are justified
 As caches became larger it made performance sense to create two and then three levels of cache on a chip
+
Many Integrated Core (MIC)
Graphics Processing Unit (GPU)
MIC
 Leap in performance as well as the challenges in developing software to exploit such a large number of cores
 The multicore and MIC strategy involves a homogeneous collection of general purpose processors on a single chip

GPU
 Core designed to perform parallel operations on graphics data
 Traditionally found on a plug-in graphics card, it is used to encode and render 2D and 3D graphics as well as process video
 Used as vector processors for a variety of applications that require repetitive computations
+ Overview
Intel x86 Architecture (CISC)
 Results of decades of design effort on complex instruction set computers (CISCs)
 Excellent example of CISC design
 Incorporates the sophisticated design principles once found only on mainframes and supercomputers
 In terms of market share Intel is ranked as the number one maker of microprocessors for non-embedded systems

ARM (RISC)
 An alternative approach to processor design is the reduced instruction set computer (RISC)
 The ARM architecture is used in a wide variety of embedded systems and is one of the most powerful and best designed RISC based systems on the market
 8080
 First general purpose microprocessor
 8-bit machine with an 8-bit data path to memory
 Used in the first personal computer (Altair)

 8086
 16-bit machine
 Used an instruction cache, or queue
 First appearance of the x86 architecture

x86 Evolution  8088


 used in IBM’s first personal computer
+
 80286
 Enabled addressing a 16-MByte memory instead of just
1 MByte

 80386
 Intel’s first 32-bit machine
 First Intel processor to support multitasking

 80486
 More sophisticated cache technology and instruction
pipelining
 Built-in math coprocessor
+
x86 Evolution - Pentium
 Pentium
 Superscalar
 Multiple instructions executed in parallel

 Pentium Pro
 Increased superscalar organization
 Aggressive register renaming
 Branch prediction
 Data flow analysis
 Speculative execution

 Pentium II
 MMX technology
 Designed specifically to process video, audio, and graphics data

 Pentium III
 Additional floating-point instructions to support 3D graphics software

 Pentium 4
 Includes additional floating-point and other enhancements for multimedia
x86 Evolution (continued)

 Core
 First Intel x86 microprocessor
with a dual core, referring to the
implementation of two processors
on a single chip

 Core 2
 Extends the architecture to 64 bits
 Recent Core offerings have up to
10 processors per chip
+
Embedded Systems
General definition:

“A combination of computer hardware and software, and perhaps additional mechanical or other parts, designed to perform a dedicated function. In many cases, embedded systems are part of a larger system or product, as in the case of an antilock braking system in a car.”
Table 2.7
Examples of Embedded Systems and Their Markets
+
Embedded Systems
Requirements and Constraints
 Small to large systems, implying different cost constraints and different needs for optimization and reuse
 Relaxed to very strict requirements and combinations of different quality requirements with respect to safety, reliability, real-time and flexibility
 Short to long life times
 Different environmental conditions in terms of radiation, vibrations, and humidity
 Different application characteristics resulting in static versus dynamic loads, slow to fast speed, compute versus interface intensive tasks, and/or combinations thereof
 Different models of computation ranging from discrete event systems to hybrid systems
+ Figure 2.12
Possible Organization of an Embedded System
+
Acorn RISC Machine (ARM)

 Family of RISC-based microprocessors and microcontrollers

 Designs microprocessor and multicore architectures and licenses them to manufacturers

 Chips are high-speed processors that are known for their small die size and low power requirements

 Widely used in PDAs and other handheld devices

 Chips are the processors in iPod and iPhone devices

 Most widely used embedded processor architecture

 Most widely used processor architecture of any kind
+
ARM Evolution
DSP = digital signal processor; SoC = system on a chip
ARM Design Categories
 ARM processors are designed to meet the needs of three system categories:

 Embedded real-time systems
 Systems for storage, automotive body and power-train, industrial, and networking applications

 Application platforms
 Devices running open operating systems including Linux, Palm OS, Symbian OS, and Windows CE in wireless, consumer entertainment and digital imaging applications

 Secure applications
 Smart cards, SIM cards, and payment terminals
+
System Clock
+ Table 2.9 Performance Factors and System Attributes
Benchmarks
For example, consider this high-level language statement:

A = B + C /* assume all quantities in main memory */

With a traditional instruction set architecture, referred to as a complex instruction set computer (CISC), this instruction can be compiled into one processor instruction:

add mem(B), mem(C), mem(A)

On a typical RISC machine, the compilation would look something like this:

load mem(B), reg(1);
load mem(C), reg(2);
add reg(1), reg(2), reg(3);
store reg(3), mem(A)
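The two compilations above can be mimicked with a toy model to show why raw instruction counts are a poor basis for comparing machines. This is a hedged sketch, not a simulator of any real instruction set: memory is modelled as a Python dictionary, registers as another dictionary, and the function names and returned instruction counts are illustrative assumptions.

```python
def run_cisc(memory):
    """One memory-to-memory instruction: add mem(B), mem(C), mem(A)."""
    memory["A"] = memory["B"] + memory["C"]
    return 1  # instructions executed


def run_risc(memory):
    """Load/store sequence: only load and store touch memory."""
    reg = {}
    reg[1] = memory["B"]        # load mem(B), reg(1)
    reg[2] = memory["C"]        # load mem(C), reg(2)
    reg[3] = reg[1] + reg[2]    # add reg(1), reg(2), reg(3)
    memory["A"] = reg[3]        # store reg(3), mem(A)
    return 4  # instructions executed


# Hypothetical operand values.
cisc_memory = {"A": 0, "B": 7, "C": 5}
risc_memory = {"A": 0, "B": 7, "C": 5}
cisc_count = run_cisc(cisc_memory)
risc_count = run_risc(risc_memory)
```

Both versions compute the same result and make the same three memory accesses, yet one executes a single instruction and the other four, so an instructions-per-second figure alone says little about how fast either machine finishes the statement.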
+ Desirable Benchmark Characteristics
 Written in a high-level language, making it portable across different machines
 Representative of a particular kind of programming style, such as system programming, numerical programming, or commercial programming
 Can be measured easily
 Has wide distribution
+
System Performance Evaluation
Corporation (SPEC)
 Benchmark suite
 A collection of programs, defined in a high-level language
 Attempts to provide a representative test of a computer in a particular
application or system programming area

 SPEC
 An industry consortium
 Defines and maintains the best known collection of benchmark suites
 Performance measurements are widely used for comparison and research
purposes
+ SPEC CPU2006
 Best known SPEC benchmark suite

 Industry standard suite for processor intensive applications

 Appropriate for measuring performance for applications that spend most of their time doing computation rather than I/O

 Consists of 17 floating point programs written in C, C++, and Fortran and 12 integer programs written in C and C++

 Suite contains over 3 million lines of code

 Fifth generation of processor intensive suites from SPEC
+ Amdahl’s Law
 Gene Amdahl [AMDA67]

 Deals with the potential speedup of a program using multiple processors compared to a single processor

 Illustrates the problems facing industry in the development of multi-core machines
 Software must be adapted to a highly parallel execution environment to exploit the power of parallel processing

 Can be generalized to evaluate and design technical improvement in a computer system
+
Amdahl’s Law
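The figure for this slide is not reproduced in the text, so here is a small sketch of the usual statement of Amdahl's law: if a fraction f of a program's execution time can be spread perfectly over N processors and the rest is serial, the speedup is 1 / ((1 - f) + f / N). The function names are illustrative, and the model assumes no scheduling or communication overhead.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Speedup over one processor when only part of the work parallelizes."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def speedup_upper_bound(parallel_fraction):
    """Limit of the speedup as the processor count grows without bound."""
    if parallel_fraction >= 1.0:
        return float("inf")
    return 1.0 / (1.0 - parallel_fraction)


# Hypothetical program that is 90% parallelizable.
speedup_on_8 = amdahl_speedup(0.9, 8)
ceiling = speedup_upper_bound(0.9)
```

Under these assumptions, a program that is 90% parallelizable can never run more than about ten times faster, however many cores are added, which is the difficulty for multi-core machines that the previous slide points to.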
+
Little’s Law
 Fundamental and simple relation with broad applications

 Can be applied to almost any system that is statistically in steady state, and in which there is no leakage

 Queuing system
 If server is idle an item is served immediately, otherwise an arriving item joins a queue
 There can be a single queue for a single server or for multiple servers, or multiple queues with one being for each of multiple servers

 Average number of items in a queuing system equals the average rate at which items arrive multiplied by the time that an item spends in the system
 Relationship requires very few assumptions
 Because of its simplicity and generality it is extremely useful
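The relation stated above is commonly written L = λW, where L is the average number of items in the system, λ the average arrival rate, and W the average time an item spends in the system. The sketch below simply applies that arithmetic; the symbol names, function names, and sample numbers are assumptions for illustration, and the result only holds for a system in steady state with no leakage, as the slide notes.

```python
def average_items_in_system(arrival_rate, time_in_system):
    """Little's law: L = lambda * W."""
    return arrival_rate * time_in_system


def average_time_in_system(items_in_system, arrival_rate):
    """Little's law rearranged: W = L / lambda."""
    if arrival_rate <= 0:
        raise ValueError("arrival_rate must be positive")
    return items_in_system / arrival_rate


# Hypothetical server: 2 requests arrive per second, each spends 3 seconds inside.
requests_in_flight = average_items_in_system(2.0, 3.0)
# Hypothetical observation: 6 requests in flight at 2 arrivals per second.
seconds_per_request = average_time_in_system(6.0, 2.0)
```

Any one of the three quantities can be derived from the other two, which is useful when one of them (often the time in the system) is hard to measure directly.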
+ Summary
Chapter 2
Computer Evolution and Performance
 First generation computers
 Vacuum tubes
 Second generation computers
 Transistors
 Third generation computers
 Integrated circuits
 Performance designs
 Microprocessor speed
 Performance balance
 Chip organization and architecture
 Multi-core
 MICs
 GPGPUs
 Evolution of the Intel x86
 Embedded systems
 ARM evolution
 Performance assessment
 Clock speed and instructions per second
 Benchmarks
 Amdahl’s Law
 Little’s Law
