Defining Performance
6 Performance
When trying to choose among different computers, performance is an
important attribute.
Understanding how best to measure performance and the limitations
of performance measurements is important in selecting a computer.
Defining Performance
When we say one computer has better performance than another, what
do we mean?
An analogy with passenger airplanes shows how subtle the question of
performance can be. Figure 1.14 lists some typical passenger
airplanes, together with their cruising speed, range, and capacity.
FIGURE 1.14 The capacity, range, and speed for a number of commercial airplanes.
The last column shows the rate at which the airplane transports passengers, which is the
capacity times the cruising speed (ignoring range and takeoff and landing times).
We can define computer performance in several different ways.
If you were running a program on two different desktop computers,
you'd say that the faster one is the desktop computer that gets the job
done first. If you were running a datacenter that had several servers
running jobs submitted by many users, you'd say that the faster
computer was the one that completed the most jobs during a day.
Response time
Also called execution time. The total time required for the computer
to complete a task, including disk accesses, memory accesses, I/O
activities, operating system overhead, CPU execution time, and so on.
Throughput
Also called bandwidth. Another measure of performance, it is the
number of tasks completed per unit time.
To maximize performance, we want to minimize response time or
execution time for some task. Thus, we can relate performance and
execution time for a computer X:
Performance_X = 1 / Execution time_X
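This means that for two computers X and Y, if the performance of X is
greater than the performance of Y, we have
Performance_X > Performance_Y
1 / Execution time_X > 1 / Execution time_Y
Execution time_Y > Execution time_X
That is, the execution time on Y is longer than that on X, if X is faster
than Y.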
In discussing a computer design, we often want to relate the
performance of two different computers quantitatively. We will use the
phrase "X is n times faster than Y" or, equivalently, "X is n times as
fast as Y" to mean
Performance_X / Performance_Y = n
If X is n times as fast as Y, then the execution time on Y is n times as
long as it is on X:
Performance_X / Performance_Y = Execution time_Y / Execution time_X = n
Example
Do the following changes to a computer system increase throughput, decrease
response time, or both?
1. Replacing the processor in a computer with a faster version
2. Adding additional processors to a system that uses multiple processors for
separate tasks, for example, searching the web
Answer
Decreasing response time almost always improves throughput. Hence, in case 1,
both response time and throughput are improved. In case 2, no one task gets work
done faster, so only throughput increases.
If, however, the demand for processing in the second case was almost as large as
the throughput, the system might force requests to queue up. In this case,
increasing the throughput could also improve response time, since it would reduce
the waiting time in the queue. Thus, in many real computer systems, changing
either execution time or throughput often affects the other.
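The two cases can be illustrated with a deliberately simplified model.
It assumes that every task takes the same time, that tasks are
independent, and that no task waits in a queue. The function name and
the numbers are illustrative assumptions, not part of the original
example.

```python
def response_and_throughput(task_time_s, processors):
    """Idealized model: identical, independent tasks and no queueing.

    Returns (response time of one task in seconds, tasks completed per second).
    """
    response_time = task_time_s            # one task still runs on one processor
    throughput = processors / task_time_s  # processors work on separate tasks
    return response_time, throughput


# Hypothetical numbers chosen only for illustration.
baseline = response_and_throughput(task_time_s=2.0, processors=1)
faster_cpu = response_and_throughput(task_time_s=1.0, processors=1)  # case 1
more_cpus = response_and_throughput(task_time_s=2.0, processors=4)   # case 2

print("baseline   (response s, tasks/s):", baseline)
print("faster CPU (response s, tasks/s):", faster_cpu)
print("more CPUs  (response s, tasks/s):", more_cpus)
```

In this model, case 1 lowers response time and raises throughput, while
case 2 raises throughput only. The queueing effect is not modeled.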
Performance and execution time are reciprocals, so increasing
performance requires decreasing execution time.
Measuring Performance
Time is the measure of computer performance: the computer that
performs the same amount of work in the least time is the fastest.
Program execution time is measured in seconds per program. However,
time can be defined in different ways, depending on what we count.
The most straightforward definition of time is called wall clock time,
response time, or elapsed time. These terms mean the total time to
complete a task, including disk accesses, memory accesses,
input/output (I/O) activities, and operating system overhead.
Computers are often shared, however, and a processor may work on
several programs simultaneously. In such cases, the system may try to
optimize throughput rather than attempt to minimize the elapsed time
for one program.
CPU execution time, or simply CPU time, is the time the CPU spends
computing for this task. It does not include time spent waiting for
I/O or running other programs. CPU time can be further divided into
the CPU time spent in the program, called user CPU time, and the
CPU time spent in the operating system performing tasks on behalf of
the program, called system CPU time.
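The difference between elapsed time and CPU time can be observed on a
running system. The following is a minimal sketch, assuming Python's
standard time.perf_counter() for elapsed time and os.times() for the
user and system CPU time of the current process. The helper names
measure and busy_then_wait are hypothetical, and the reported values
depend on the machine and on the timer resolution.

```python
import os
import time


def measure(task):
    """Run task() once and return (elapsed, user_cpu, system_cpu) in seconds."""
    cpu_before = os.times()
    wall_before = time.perf_counter()
    task()
    wall_after = time.perf_counter()
    cpu_after = os.times()
    elapsed = wall_after - wall_before
    user_cpu = cpu_after.user - cpu_before.user        # time in the program itself
    system_cpu = cpu_after.system - cpu_before.system  # time in the OS on its behalf
    return elapsed, user_cpu, system_cpu


def busy_then_wait():
    sum(i * i for i in range(200_000))  # computation: counts toward CPU time
    time.sleep(0.05)                    # waiting: adds to elapsed time only


elapsed, user_cpu, system_cpu = measure(busy_then_wait)
print(f"elapsed    {elapsed:.3f} s")
print(f"user CPU   {user_cpu:.3f} s")
print(f"system CPU {system_cpu:.3f} s")
```

The elapsed time should exceed the sum of user and system CPU time here,
because the sleep is time spent waiting rather than computing.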
CPU execution time
Also called CPU time. The actual time the CPU spends computing for a
specific task.
Relative Performance
Example
If computer A runs a program in 10 seconds and computer B runs the same
program in 15 seconds, how much faster is A than B?
Answer
We know that A is n times as fast as B if
Performance_A / Performance_B = Execution time_B / Execution time_A = n
Thus the performance ratio is
15 / 10 = 1.5
and A is therefore 1.5 times as fast as B.
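The same calculation can be written as a small function. This is only a
sketch; the function name relative_performance is an assumption made
for illustration.

```python
def relative_performance(execution_time_x, execution_time_y):
    """Return n such that X is n times as fast as Y.

    Both execution times must be positive and in the same unit.
    """
    if execution_time_x <= 0 or execution_time_y <= 0:
        raise ValueError("execution times must be positive")
    return execution_time_y / execution_time_x


# Computer A takes 10 seconds and computer B takes 15 seconds.
n = relative_performance(execution_time_x=10.0, execution_time_y=15.0)
print(f"A is {n} times as fast as B")
```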
user CPU time
The CPU time spent in a program itself.
system CPU time
The CPU time spent in the operating system performing tasks on
behalf of the program.
We use the term system performance to refer to elapsed time on an
unloaded system and CPU performance to refer to user CPU time.
Understanding Program Performance
Different applications are sensitive to different aspects of the
performance of a computer system. Many applications, especially those
running on servers, depend heavily on I/O performance, which, in
turn, relies on both hardware and software.
In some application environments, the user may care about
throughput, response time, or a complex combination of the two (e.g.,
maximum throughput with a worst-case response time).
To improve the performance of a program, one must have a clear
definition of what performance metric matters and then proceed to
look for performance bottlenecks.
Almost all computers are constructed using a clock that determines
when events take place in the hardware. These discrete time intervals
are called clock cycles (or ticks, clock ticks, clock periods, clocks,
cycles). Designers refer to the length of a clock period both as the
time for a complete clock cycle (e.g., 250 picoseconds, or 250 ps) and
as the clock rate (e.g., 4 gigahertz, or 4 GHz), which is the inverse of
the clock period.
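Because the clock rate and the clock period are inverses, converting
between them is a single division. The sketch below uses hypothetical
helper names and checks the 250 ps and 4 GHz figures from the text.

```python
def period_to_rate(clock_period_s):
    """Clock rate in hertz for a clock period given in seconds."""
    return 1.0 / clock_period_s


def rate_to_period(clock_rate_hz):
    """Clock period in seconds for a clock rate given in hertz."""
    return 1.0 / clock_rate_hz


period_s = 250e-12                  # 250 picoseconds
rate_hz = period_to_rate(period_s)  # approximately 4e9 Hz, that is, 4 GHz

print(f"{period_s * 1e12:.0f} ps period -> {rate_hz / 1e9:.2f} GHz")
print(f"{rate_hz / 1e9:.2f} GHz -> {rate_to_period(rate_hz) * 1e12:.0f} ps period")
```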
clock cycle
Also called tick, clock tick, clock period, clock, or cycle. The time
for one clock period, usually of the processor clock, which runs at a
constant rate.
clock period
The length of each clock cycle.
CPU Performance and Its Factors
The bottom-line performance measure is CPU execution time.
A simple formula relates the most basic metrics (clock cycles and clock
cycle time) to CPU time:
CPU execution time for a program = CPU clock cycles for a program * Clock cycle time
Alternatively, because clock rate and clock cycle time are inverses,
CPU execution time for a program = CPU clock cycles for a program / Clock rate
We can improve performance by reducing the number of clock cycles
required for a program or the length of the clock cycle.
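The two forms of the formula give the same result, which a short sketch
can confirm. The function names and the cycle count of 20 billion are
hypothetical values chosen for illustration. The 250 ps cycle time and
the 4 GHz clock rate are the inverse pair mentioned earlier.

```python
def cpu_time_from_cycle_time(clock_cycles, clock_cycle_time_s):
    """CPU time in seconds = clock cycles * clock cycle time."""
    return clock_cycles * clock_cycle_time_s


def cpu_time_from_clock_rate(clock_cycles, clock_rate_hz):
    """CPU time in seconds = clock cycles / clock rate."""
    return clock_cycles / clock_rate_hz


cycles = 20e9  # hypothetical cycle count for some program

t_from_period = cpu_time_from_cycle_time(cycles, 250e-12)
t_from_rate = cpu_time_from_clock_rate(cycles, 4e9)

print(f"using cycle time: {t_from_period:.3f} s")
print(f"using clock rate: {t_from_rate:.3f} s")

# Halving the cycle count halves the CPU time at a fixed clock.
print(f"half the cycles : {cpu_time_from_clock_rate(cycles / 2, 4e9):.3f} s")
```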
Instruction Performance
Because the computer has to execute the instructions to run the program, the
execution time must depend on the number of instructions in the program.
The execution time is equal to the number of instructions executed multiplied by the
average time per instruction. Therefore, the number of clock cycles required for a
program can be written as
CPU clock cycles = Instructions for a program * Average clock cycles per instruction
Clock cycles per instruction (CPI)
Average number of clock cycles per instruction for a program or program fragment.
The Classic CPU Performance Equation
Instruction count
The number of instructions executed by the program.
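Basic performance equation in terms of instruction count, CPI, and clock cycle time:
CPU time = Instruction count * CPI * Clock cycle time
or, since the clock rate is the inverse of the clock cycle time:
CPU time = Instruction count * CPI / Clock rate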
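The relationship among instruction count, CPI, and clock cycle time can
be sketched as a function that first computes the clock cycles and then
the CPU time. The function name and the instruction count of one
billion are assumptions made for illustration.

```python
def cpu_time(instruction_count, cpi, clock_cycle_time_s):
    """CPU time in seconds from instruction count, CPI, and clock cycle time."""
    clock_cycles = instruction_count * cpi  # CPI is an average over the program
    return clock_cycles * clock_cycle_time_s


# Hypothetical program: one billion instructions, CPI 2.0, 250 ps cycle.
seconds = cpu_time(instruction_count=1e9, cpi=2.0, clock_cycle_time_s=250e-12)
print(f"CPU time: {seconds:.3f} s")

# Each factor scales CPU time linearly: halving the CPI halves the time.
print(f"with half the CPI: {cpu_time(1e9, 1.0, 250e-12):.3f} s")
```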
Using the Performance Equation
Example
Suppose we have two implementations of the same instruction set architecture. Computer A
has a clock cycle time of 250 ps and a CPI of 2.0 for some program, and computer B has a
clock cycle time of 500 ps and a CPI of 1.2 for the same program. Which computer is faster
for this program and by how much?
Answer
We know that each computer executes the same number of instructions for the program;
let's call this number I. First, find the number of processor clock cycles for each computer:
CPU clock cycles_A = I * 2.0
CPU clock cycles_B = I * 1.2
Now we can compute the CPU time for each computer:
CPU time_A = CPU clock cycles_A * Clock cycle time_A
= I * 2.0 * 250 ps = 500 * I ps
Likewise, for B:
CPU time_B = I * 1.2 * 500 ps = 600 * I ps
Clearly, computer A is faster. The amount faster is given by the ratio of
the execution times:
CPU performance_A / CPU performance_B = Execution time_B / Execution time_A
= (600 * I ps) / (500 * I ps) = 1.2
We can conclude that computer A is 1.2 times as fast as computer B for this program.
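Since both computers execute the same number of instructions, I cancels
in the ratio, and comparing the average time per instruction is enough.
The sketch below checks the arithmetic of this example. The helper name
time_per_instruction_ps is hypothetical.

```python
def time_per_instruction_ps(cpi, clock_cycle_time_ps):
    """Average time per instruction in picoseconds (CPI * clock cycle time)."""
    return cpi * clock_cycle_time_ps


time_a = time_per_instruction_ps(cpi=2.0, clock_cycle_time_ps=250)  # computer A
time_b = time_per_instruction_ps(cpi=1.2, clock_cycle_time_ps=500)  # computer B

# Performance_A / Performance_B = Execution time_B / Execution time_A
ratio = time_b / time_a

print(f"A: {time_a:.0f} ps per instruction")
print(f"B: {time_b:.0f} ps per instruction")
print(f"A is {ratio:.2f} times as fast as B")
```

Although B has the lower CPI, A's shorter clock cycle more than makes up
for it on this program.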