ECE 338 Parallel Computer Architecture Spring 2022
Administrivia
The need for Parallel Computing
Introduction to Parallel Computer Architecture
Nikos Bellas
MS-teams page
Code: 9zj4f0d
To be used exclusively
Office: #422
Phone #: 24210-74704
Android Smartphone
Playstation 4
Facebook Datacenter
[Figure: pipeline diagram. Instructions listed in program order ("Instr. Order"), each passing through the stages Ifetch, Reg, ALU, DMem, Reg.]
2) The Principle of Locality
• The Principle of Locality:
– Programs access a relatively small portion of the address space at any instant of time.
• Two Different Types of Locality:
– Temporal Locality (Locality in Time): If an item is referenced, it will tend to be referenced
again soon (e.g., loops, reuse)
– Spatial Locality (Locality in Space): If an item is referenced, items whose addresses are
close by tend to be referenced soon
(e.g., straight-line code, array access)
• For the last 30 years, computer architecture has relied on locality for memory performance
[Figure: processor (P), cache ($), memory (MEM)]
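The effect of locality on a cache can be sketched with a toy simulation. The following is a minimal illustration, not a model of any real processor: the function name `hit_rate`, the direct-mapped organization, the 64-line by 64-byte cache size, and the three access patterns are all assumptions chosen for the example.

```python
def hit_rate(addresses, num_lines=64, line_bytes=64):
    """Hit rate of a toy direct-mapped cache over a sequence of byte addresses."""
    tags = [None] * num_lines              # block currently held by each cache line
    hits = 0
    for addr in addresses:
        block = addr // line_bytes         # memory block containing this byte
        index = block % num_lines          # cache line that block maps to
        if tags[index] == block:
            hits += 1                      # block already cached
        else:
            tags[index] = block            # miss: bring the block in
    return hits / len(addresses)


N = 4096
WORD = 8                                   # assumed 8-byte array elements

# Spatial locality: walk an array element by element.
sequential = [i * WORD for i in range(N)]

# No locality: every access touches a different block that maps to the same line.
strided = [i * 4096 for i in range(N)]

# Temporal locality: loop 16 times over a 256-element working set that fits in the cache.
loop = sequential[:256] * 16

print("sequential:", hit_rate(sequential))
print("strided:   ", hit_rate(strided))
print("loop:      ", hit_rate(loop))
```

Under these assumptions the sequential walk misses once per 64-byte line and hits on the other seven elements of that line, the strided pattern never hits, and the loop misses only on its first pass.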
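With α the fraction of the original execution time that is enhanced:

\[ \mathrm{ExTime}_{new} = \mathrm{ExTime}_{old} \times \left[ (1 - \alpha) + \frac{\alpha}{\mathrm{Speedup}_{enhanced}} \right] \]

\[ \mathrm{Speedup}_{overall} = \frac{\mathrm{ExTime}_{old}}{\mathrm{ExTime}_{new}} = \frac{1}{(1 - \alpha) + \dfrac{\alpha}{\mathrm{Speedup}_{enhanced}}} \]

Best you could ever hope to do (perfect speedup):

\[ \mathrm{Speedup}_{maximum} = \frac{1}{1 - \alpha} \]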
Amdahl’s Law example
• New CPU 10X faster
• I/O bound server, so 60% time waiting for I/O
\[ \mathrm{Speedup}_{overall} = \frac{1}{(1 - \alpha) + \dfrac{\alpha}{\mathrm{Speedup}_{enhanced}}} = \frac{1}{(1 - 0.4) + \dfrac{0.4}{10}} = 1.56 \]
• Apparently, it’s human nature to be attracted by 10X faster,
vs. keeping in perspective it’s just 1.6X faster
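The arithmetic of this example can be checked with a few lines of Python. This is a small sketch; the function names `amdahl_speedup` and `max_speedup` are illustrative, and α = 0.4 is taken as the fraction of time not spent waiting for I/O.

```python
def amdahl_speedup(alpha, speedup_enhanced):
    """Overall speedup when a fraction alpha of the time is sped up by speedup_enhanced."""
    return 1.0 / ((1.0 - alpha) + alpha / speedup_enhanced)


def max_speedup(alpha):
    """Upper bound on overall speedup as speedup_enhanced grows without limit."""
    return 1.0 / (1.0 - alpha)


# I/O-bound server: 60% of the time is I/O wait, so only 40% benefits from the 10X CPU.
overall = amdahl_speedup(alpha=0.4, speedup_enhanced=10)
bound = max_speedup(alpha=0.4)
print(f"overall speedup: {overall:.2f}")
print(f"upper bound:     {bound:.2f}")
```

Even an infinitely fast CPU would be capped by the upper bound here, because the I/O wait is untouched.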
Compiler X (X)
Inst. Set. X X
Organization X X
Technology X
After mid-2000’s
Transistors still getting smaller (Moore's law) but energy increases!
WHY?
[Figure: a single processor clocked at f (Input to Output) compared with two processors, each clocked at f/2, sharing the Input and producing the Output.]
Before
Capacitance = C
Voltage = V
Clock frequency = f
Power = CV²f

After
Capacitance = 2.2C
Voltage = 0.6V
Clock frequency = f/2
Power = 0.396CV²f
Slower processors allow for lower Vdd voltage.
Emphasis on parallelism NOT on clock frequency
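The before/after power figures follow directly from Power = CV²f. The sketch below reproduces them with normalized units (C = V = f = 1); the function name `dynamic_power` is an illustrative choice, not something from the slides.

```python
def dynamic_power(capacitance, voltage, frequency):
    """Dynamic switching power, P = C * V^2 * f."""
    return capacitance * voltage ** 2 * frequency


# Normalized baseline: C = 1, V = 1, f = 1.
before = dynamic_power(1.0, 1.0, 1.0)

# Parallel design from the slide: 2.2C, 0.6V, f/2.
after = dynamic_power(2.2, 0.6, 0.5)

ratio = after / before
print(f"after / before = {ratio:.3f}")
```

Most of the saving comes from the voltage term: because power scales with V², lowering the voltage to 0.6V alone cuts power to 36% of the baseline.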
Reducing Power: Heterogeneous computing
Specialization
ISA
Microarchitecture
Logic
Circuits
Electrons
Many new issues at the bottom (Look Down)
• No clear, definitive answers to these problems
Computer Architecture Today (III)
• Computing landscape is very different from 10-20 years ago
• Both UP (software and humanity trends) and DOWN (technologies
and their issues), FORWARD and BACKWARD, and the resulting
requirements and constraints