Using MIPS and MFLOPS As Performance Metrics: April 26, 2008
One alternative way to measure CPU performance is MIPS, or million instructions per
second. For a given program, MIPS is given by
\[ \text{MIPS} = \frac{\text{Instruction count}}{\text{Execution time} \times 10^{6}} \tag{1} \]
Since,
\[ \text{Execution time} = \frac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}} \tag{2} \]
Equation 1 becomes
\[ \text{MIPS} = \frac{\text{Clock rate}}{\text{CPI} \times 10^{6}} \tag{3} \]
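The relationship between Equations 1, 2, and 3 can be checked numerically. The following is a minimal Python sketch; the function names and the sample program (10 million instructions, a CPI of 2, a 100-MHz clock) are hypothetical values chosen for illustration and do not come from the text.

```python
def execution_time(instruction_count, cpi, clock_rate_hz):
    """Equation 2: total clock cycles divided by cycles per second."""
    return instruction_count * cpi / clock_rate_hz


def mips_from_time(instruction_count, execution_time_s):
    """Equation 1: instructions executed per second, in millions."""
    return instruction_count / (execution_time_s * 1e6)


def mips_from_cpi(clock_rate_hz, cpi):
    """Equation 3: the instruction count cancels out of Equation 1."""
    return clock_rate_hz / (cpi * 1e6)


# Hypothetical program: 10 million instructions, CPI of 2, 100-MHz clock.
instruction_count = 10_000_000
cpi = 2.0
clock_rate_hz = 100e6

time_s = execution_time(instruction_count, cpi, clock_rate_hz)
mips_eq1 = mips_from_time(instruction_count, time_s)
mips_eq3 = mips_from_cpi(clock_rate_hz, cpi)

print(f"execution time: {time_s:.3f} s")
print(f"MIPS via Equation 1: {mips_eq1:.2f}")
print(f"MIPS via Equation 3: {mips_eq3:.2f}")
```

Both routes give the same rating because substituting Equation 2 into Equation 1 removes the instruction count entirely.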
Since MIPS is a rate of operations per unit time, CPU performance can be specified
as the inverse of execution time, with faster machines having a higher MIPS rating.
However, according to Patterson and Hennessy, there are problems with using MIPS as a
performance metric.
• MIPS is dependent on the instruction set of the CPU, making it difficult to compare
the MIPS ratings of processors with different instruction sets.
• MIPS can vary inversely to performance.
Consider the MIPS rating of a processor with an optional floating-point unit. Since
it generally takes more clock cycles per floating-point instruction than per integer
instruction, floating-point programs using the optional hardware instead of software
floating-point routines take less time but have a lower MIPS rating. A software
floating-point routine executes simpler instructions, resulting in a higher MIPS rating,
but it executes so many more instructions that the overall execution time is longer.
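The floating-point anomaly can be made concrete with invented numbers. In the Python sketch below, every figure is an assumption made for illustration only: a 50-MHz clock, one million floating-point operations, a hardware unit that executes each operation as a single instruction taking 10 cycles, and a software routine that emulates each operation with 50 integer instructions averaging 1.5 cycles each.

```python
CLOCK_HZ = 50e6  # assumed 50-MHz clock


def summarize(instruction_count, cpi, clock_rate_hz=CLOCK_HZ):
    """Return (execution time in seconds, MIPS rating) for one run."""
    time_s = instruction_count * cpi / clock_rate_hz
    mips = instruction_count / (time_s * 1e6)
    return time_s, mips


FP_OPS = 1_000_000  # hypothetical number of floating-point operations

# Hardware FPU: one slow instruction per operation (assumed CPI of 10).
hw_time, hw_mips = summarize(FP_OPS * 1, cpi=10.0)

# Software emulation: 50 simple instructions per operation (assumed CPI of 1.5).
sw_time, sw_mips = summarize(FP_OPS * 50, cpi=1.5)

print(f"hardware FP: {hw_time:.2f} s at {hw_mips:.1f} MIPS")
print(f"software FP: {sw_time:.2f} s at {sw_mips:.1f} MIPS")
```

Under these assumptions the software version reports the higher MIPS rating while taking several times longer to finish, which is the inversion the paragraph describes.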
We can see similar anomalies with optimizing compilers, as the following example
demonstrates.
Example. Let us assume that you have profiled your code and the instruction mix is
detailed in Table 1. We now want to build an optimizing compiler for the CPU. The compiler
discards 50% of the ALU instructions although it cannot reduce loads, stores, or branches.
Assuming a 20-ns clock cycle time (or a 50-MHz clock), what is the MIPS rating for the
optimized code versus the unoptimized code? Does the MIPS rating agree with the ranking
of execution time?
Table 1: The instruction mix and CPIs of individual instructions

Operation        Frequency   CPI
ALU operations   43%         1
Loads            21%         2
Stores           12%         2
Branches         24%         2
Answer. We use the CPU performance formula to compute the CPI of the unoptimized
code as
\[ \text{CPI}_{\text{unoptimized}} = 0.43 \times 1 + 0.21 \times 2 + 0.12 \times 2 + 0.24 \times 2 = 1.57 \]
So,
\[ \text{MIPS}_{\text{unoptimized}} = \frac{50\ \text{MHz}}{1.57 \times 10^{6}} = 31.85 \]
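The performance of the unoptimized code, in terms of execution time, is given by:
\[ \text{Execution time}_{\text{unoptimized}} = \text{Instruction count}_{\text{unoptimized}} \times 1.57 \times (20 \times 10^{-9}\ \text{s}) = 3.14 \times 10^{-8}\ \text{s} \times \text{Instruction count}_{\text{unoptimized}} \]
For the optimized code,
\[ \text{CPI}_{\text{optimized}} = \frac{(0.43/2) \times 1 + 0.21 \times 2 + 0.12 \times 2 + 0.24 \times 2}{1 - 0.43/2} = 1.73 \]
since 50% of the ALU operations have been discarded (.43/2) and the instruction count is
reduced by the missing ALU instructions. Thus,
\[ \text{MIPS}_{\text{optimized}} = \frac{50\ \text{MHz}}{1.73 \times 10^{6}} = 28.90 \]
The performance of the optimized code, in terms of execution time, is
\[ \text{Execution time}_{\text{optimized}} = (0.785 \times \text{Instruction count}_{\text{unoptimized}}) \times 1.73 \times (20 \times 10^{-9}\ \text{s}) = 2.72 \times 10^{-8}\ \text{s} \times \text{Instruction count}_{\text{unoptimized}} \]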
The optimized code is 13% faster, but its MIPS rating is lower! As this example shows,
MIPS can fail to give a true picture of performance in that it does not track execution time.
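The arithmetic of this example can be reproduced with a short Python sketch. The instruction mix, CPIs, and 50-MHz clock come from Table 1 and the problem statement; the function and variable names are illustrative. Because the code does not round the optimized CPI to 1.73 before dividing, it reports a rating of about 28.97 rather than 28.90; the conclusion is unchanged.

```python
CLOCK_HZ = 50e6  # 50-MHz clock, i.e. a 20-ns cycle time

# Instruction mix from Table 1: operation -> (frequency, CPI).
unoptimized = {
    "alu": (0.43, 1),
    "load": (0.21, 2),
    "store": (0.12, 2),
    "branch": (0.24, 2),
}

# The compiler discards 50% of the ALU instructions and nothing else.
optimized = dict(unoptimized)
optimized["alu"] = (0.43 / 2, 1)


def analyze(mix, clock_rate_hz=CLOCK_HZ):
    """Return (CPI, MIPS, seconds per instruction of the original program)."""
    relative_count = sum(freq for freq, _ in mix.values())
    cycles = sum(freq * cpi for freq, cpi in mix.values())
    cpi = cycles / relative_count           # average over the remaining instructions
    mips = clock_rate_hz / (cpi * 1e6)      # Equation 3
    time_s = relative_count * cpi / clock_rate_hz  # Equation 2, per original instruction
    return cpi, mips, time_s


cpi_u, mips_u, time_u = analyze(unoptimized)
cpi_o, mips_o, time_o = analyze(optimized)

print(f"unoptimized: CPI {cpi_u:.2f}, {mips_u:.2f} MIPS, {time_u * 1e9:.1f} ns per original instruction")
print(f"optimized:   CPI {cpi_o:.2f}, {mips_o:.2f} MIPS, {time_o * 1e9:.1f} ns per original instruction")
print(f"optimized code uses {100 * (1 - time_o / time_u):.0f}% less time")
```

Expressing time per instruction of the original program keeps the two runs comparable even though the optimized program executes fewer instructions.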
Another popular alternative to execution time as a performance metric is million
floating-point operations per second, or MFLOPS (megaflops). The formula for MFLOPS is simply
\[ \text{MFLOPS} = \frac{\text{Number of floating-point operations in a program}}{\text{Execution time} \times 10^{6}} \tag{4} \]
The MFLOPS rating is dependent on the machine and on the program, and since MFLOPS
are intended to measure floating-point performance, they are not applicable outside that
range. For example, compilers have an MFLOPS rating of nearly zero no matter how fast
the CPU is, since compilers rarely use floating-point arithmetic. When comparing the
performance of different machines, MFLOPS is not dependable because the set of
floating-point operations is not consistent across machines.
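Equation 4 and the compiler remark can be illustrated with a small Python sketch. Both workloads below are hypothetical: the operation counts and the 2-second run times are invented to show how the rating depends on the program, not measured from any real machine.

```python
def mflops(fp_operations, execution_time_s):
    """Equation 4: floating-point operations per second, in millions."""
    return fp_operations / (execution_time_s * 1e6)


# Hypothetical numeric kernel: 400 million floating-point operations in 2 s.
numeric_rating = mflops(400_000_000, 2.0)

# Hypothetical compiler run on the same machine: also 2 s of work,
# but only 10,000 floating-point operations.
compiler_rating = mflops(10_000, 2.0)

print(f"numeric kernel: {numeric_rating:.3f} MFLOPS")
print(f"compiler run:   {compiler_rating:.3f} MFLOPS")
```

The same machine earns very different ratings on the two programs, so an MFLOPS figure says little without knowing which program produced it.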