Chapter_4
Performance
ECE369
Defining Performance
Response Time and Throughput
• Response time
– How long it takes to do a task
• Throughput
– Total work done per unit time
• e.g., tasks/transactions/… per hour
• How are response time and throughput
affected by
– Replacing the processor with a faster version?
– Adding more processors?
• We’ll focus on response time for now…
Relative Performance
• Define Performance = 1 / Execution Time
• “X is n times faster than Y” means
$$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n$$
• Example: time taken to run a program
– 10 s on A, 15 s on B
– Execution Time_B / Execution Time_A = 15 s / 10 s = 1.5
– So A is 1.5 times faster than B
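The ratio in this example can be checked with a few lines of Python. This is a minimal sketch; the function name `relative_performance` is illustrative and does not come from the slides.

```python
def relative_performance(time_x: float, time_y: float) -> float:
    """Return n such that X is n times faster than Y.

    Performance = 1 / execution time, so
    Performance_X / Performance_Y = time_y / time_x.
    """
    if time_x <= 0 or time_y <= 0:
        raise ValueError("execution times must be positive")
    return time_y / time_x


# Slide example: the program takes 10 s on A and 15 s on B.
n = relative_performance(time_x=10.0, time_y=15.0)
print(f"A is {n} times faster than B")  # prints "A is 1.5 times faster than B"
```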
Execution Time
• Elapsed time
– Total response time, including all aspects of processing, such as I/O, OS overhead, and idle time
– A useful number, but often not good for comparison purposes
• CPU time
– Time spent processing a given job
• Discounts I/O time and other jobs’ shares
– Can be broken up into system time and user time
Clock Cycles
[Figure: clock timing diagram. Each clock cycle covers data transfer and computation, followed by a state update.]
CPU Time Example
CPI Example
Sequence 1: IC = 5
Clock Cycles = 2×1 + 1×2 + 2×3 = 10
Avg. CPI = 10/5 = 2.0
Sequence 2: IC = 6
Clock Cycles = 4×1 + 1×2 + 1×3 = 9
Avg. CPI = 9/6 = 1.5
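The arithmetic for both sequences can be reproduced with a short script. This is a sketch under one assumption: the class labels A, B, and C are placeholders for the three instruction classes with CPI 1, 2, and 3. The per-class instruction counts are read from the products on the slide (count × CPI).

```python
# Cycles per instruction for each class. The labels A, B, C are assumed names.
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}


def average_cpi(counts):
    """Return (instruction count, clock cycles, average CPI) for a code sequence.

    `counts` maps an instruction class to how many instructions of that
    class the sequence executes.
    """
    ic = sum(counts.values())
    cycles = sum(CPI_BY_CLASS[cls] * n for cls, n in counts.items())
    return ic, cycles, cycles / ic


seq1 = average_cpi({"A": 2, "B": 1, "C": 2})
seq2 = average_cpi({"A": 4, "B": 1, "C": 1})
print(seq1)  # (5, 10, 2.0)
print(seq2)  # (6, 9, 1.5)
```

Sequence 2 executes more instructions yet needs fewer clock cycles, which is why instruction count alone is not a reliable performance measure.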
Performance Summary
• Performance depends on
– Algorithm: affects IC, possibly CPI
– Programming language: affects IC, CPI
– Compiler: affects IC, CPI
– Instruction set architecture: affects IC, CPI, Tc
Component Analysis
Example
• Don't panic: this can easily be worked out from basic principles
Example
CPI Example ( Repeated problem )
$$\frac{\text{seconds}}{\text{program}} = \frac{\text{cycles}}{\text{program}} \times \frac{\text{seconds}}{\text{cycle}}$$
• Suppose we have two implementations of the same instruction set architecture (ISA). For some program,
Let’s Complicate Things A Little bit… ( Repeated problem )
Scary Stuff ( New problem )
Instruction class   Frequency   CPI
ALU                 43%         1
Load                21%         1
Store               12%         2
Branch              24%         2
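$$\text{CPI}_{\text{original}} = \frac{\sum_{i=1}^{n} \text{CPI}_i \times \text{IC}_i}{\text{Instruction Count}} = \sum_{i=1}^{n} \text{CPI}_i \times \left(\frac{\text{IC}_i}{\text{Instruction Count}}\right)$$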
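The weighted-sum form of the CPI formula can be applied directly to the instruction mix on this slide. The sketch below assumes the percentage column is the frequency of each instruction class (IC_i / Instruction Count) and the last column is its CPI. The names `MIX` and `weighted_cpi` are illustrative.

```python
# (frequency, CPI) per instruction class, read from the table on the slide.
# Assumption: the percentages are IC_i / Instruction Count.
MIX = {
    "ALU": (0.43, 1),
    "Load": (0.21, 1),
    "Store": (0.12, 2),
    "Branch": (0.24, 2),
}


def weighted_cpi(mix):
    """Overall CPI = sum over classes of CPI_i * (IC_i / Instruction Count)."""
    total_freq = sum(freq for freq, _ in mix.values())
    if abs(total_freq - 1.0) > 1e-9:
        raise ValueError("frequencies must sum to 1")
    return sum(freq * cpi for freq, cpi in mix.values())


print(round(weighted_cpi(MIX), 2))  # 1.36
```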
Example (Contd.)
Practice problems
What is MIPS?
MIPS Example
Performance Measurement Overview
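$$\text{CPU time} = \text{CPU clock cycles for the program} \times \text{Clock Cycle Time} = \frac{\text{CPU clock cycles for the program}}{\text{Clock Rate}}$$
$$\text{CPI} = \frac{\text{CPU clock cycles for the program}}{\text{IC}}$$
$$\text{CPU time} = \frac{\text{IC} \times \text{CPI}}{\text{Clock Rate}}$$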
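The CPU time relations on this slide translate directly into code. This is a minimal sketch: the function names are illustrative, and the numbers in the usage lines (10^9 instructions, CPI of 2.0, 2 GHz clock) are hypothetical values chosen only to exercise the formulas.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds = IC * CPI / Clock Rate."""
    return instruction_count * cpi / clock_rate_hz


def cpu_time_from_cycles(clock_cycles, clock_cycle_time_s):
    """CPU time in seconds = CPU clock cycles * Clock Cycle Time."""
    return clock_cycles * clock_cycle_time_s


# Hypothetical program: 1e9 instructions, CPI = 2.0, 2 GHz clock.
ic, cpi, rate = 1_000_000_000, 2.0, 2e9
print(cpu_time(ic, cpi, rate))  # 1.0

# Same program expressed as total cycles and cycle time (1 / clock rate).
# The two forms agree up to floating-point rounding.
print(cpu_time_from_cycles(ic * cpi, 1.0 / rate))
```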
Performance Measurement Overview
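$$\text{CPU clock cycles for the program} = \sum_{i=1}^{n} \text{CPI}_i \times \text{IC}_i$$
$$\text{CPU time} = \left(\sum_{i=1}^{n} \text{CPI}_i \times \text{IC}_i\right) \times \text{Clock Cycle Time}$$
$$\text{overall CPI} = \frac{\sum_{i=1}^{n} \text{CPI}_i \times \text{IC}_i}{\text{Instruction Count}} = \sum_{i=1}^{n} \text{CPI}_i \times \left(\frac{\text{IC}_i}{\text{Instruction Count}}\right)$$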
Exercise
Solution
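$$\text{CPI}_{\text{original}} = \frac{\sum_{i=1}^{n} \text{CPI}_i \times \text{IC}_i}{\text{Instruction Count}} = \sum_{i=1}^{n} \text{CPI}_i \times \left(\frac{\text{IC}_i}{\text{Instruction Count}}\right) = 4 \times 25\% + 1.33 \times 75\% \approx 2.0$$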
Amdahl's Law
• Example:
"Suppose a program runs in 100 seconds on a machine, with
multiply operations responsible for 80 seconds of this time. How much
do we have to improve the speed of multiplication if we want the
program to run 4 times faster?"
How about making it 5 times faster?
Amdahl’s Law
1. Speed up = 4
2. Old execution time = 100
3. New execution time = 100/4 = 25
4. The affected part (multiplication) uses 80 s
5. Unaffected part = 100 − 80 = 20 s
6. Execution time new = Execution time unaffected + Execution time affected / Improvement
7. 25 = 20 + 80 / Improvement
8. Improvement = 16
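The same steps can be written as a small function, which also answers the follow-up question about a 5 times speedup. This is a sketch; the name `required_improvement` and the choice to return `None` for an unreachable target are illustrative decisions, not part of the slides.

```python
from typing import Optional


def required_improvement(old_time: float, affected_time: float,
                         target_speedup: float) -> Optional[float]:
    """Return the factor by which the affected part must be sped up.

    Solves: new_time = unaffected_time + affected_time / improvement.
    Returns None when no finite improvement can reach the target.
    """
    new_time = old_time / target_speedup
    unaffected_time = old_time - affected_time
    budget = new_time - unaffected_time  # time left for the affected part
    if budget <= 0:
        return None
    return affected_time / budget


print(required_improvement(100, 80, 4))  # 16.0
print(required_improvement(100, 80, 5))  # None
```

For a 5 times speedup the new execution time would have to be 20 s, which is already consumed by the unaffected part, so no improvement to multiplication alone can reach it.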
Example: Speed up using parallel processors
Example
$$\text{Speedup} = \frac{1}{\frac{0.4}{10} + 0.6} \simeq 1.56$$
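The speedup expression has the general Amdahl's Law form, where a fraction of the execution time is improved by some factor and the rest is unchanged. The sketch below assumes that reading of the numbers on the slide (40% of the time improved by a factor of 10, 60% unaffected). The name `amdahl_speedup` is illustrative.

```python
def amdahl_speedup(enhanced_fraction: float, enhancement_factor: float) -> float:
    """Overall speedup when `enhanced_fraction` of the time is sped up
    by `enhancement_factor` and the remainder is unaffected."""
    if not 0.0 <= enhanced_fraction <= 1.0:
        raise ValueError("enhanced_fraction must be between 0 and 1")
    if enhancement_factor <= 0:
        raise ValueError("enhancement_factor must be positive")
    unaffected = 1.0 - enhanced_fraction
    return 1.0 / (unaffected + enhanced_fraction / enhancement_factor)


print(round(amdahl_speedup(0.4, 10), 4))  # 1.5625
```

Even with an arbitrarily large enhancement factor, the speedup for this split is bounded by 1 / 0.6, because the unaffected 60% always remains.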
Example