…$\epsilon_2$, where $\epsilon$ is the rounding error. If we now add or subtract two numbers $x_1$ and $x_2$ whose errors have standard deviations $\sigma_1$ and $\sigma_2$, then, since the variances of independent errors add, the variance $\sigma^2$ of the result obeys

$\sigma^2 = \sigma_1^2 + \sigma_2^2$. (4.1)
Hence the standard deviation of the sum or difference is

$\sigma = \sqrt{\sigma_1^2 + \sigma_2^2}$. (4.2)
Similarly, if we multiply or divide two numbers then the variance of the result x obeys

$\dfrac{\sigma^2}{x^2} = \dfrac{\sigma_1^2}{x_1^2} + \dfrac{\sigma_2^2}{x_2^2}$. (4.3)
But, as discussed above, the standard deviations on $x_1$ and $x_2$ are given by $\sigma_1 = Cx_1$ and $\sigma_2 = Cx_2$, so that if, for example, we are adding or subtracting our two numbers, meaning Eq. (4.2) applies, then

$\sigma = \sqrt{C^2 x_1^2 + C^2 x_2^2} = C\sqrt{x_1^2 + x_2^2}$. (4.4)
I leave it as an exercise to show that the corresponding result for the error on the product of two numbers $x = x_1 x_2$ or their ratio $x = x_1/x_2$ is

$\sigma = \sqrt{2}\,C x$. (4.5)
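(For readers who want to check their work, the derivation is short: substituting $\sigma_1 = Cx_1$ and $\sigma_2 = Cx_2$ into Eq. (4.3) gives

$\dfrac{\sigma^2}{x^2} = \dfrac{C^2 x_1^2}{x_1^2} + \dfrac{C^2 x_2^2}{x_2^2} = 2C^2$,

and taking the square root gives $\sigma = \sqrt{2}\,C x$.)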
We can extend these results to combinations of more than two numbers. If, for instance, we are calculating the sum of N numbers $x_1 \ldots x_N$ with errors having standard deviation $\sigma_i = Cx_i$, then the variance on the final result is the sum of the variances on the individual numbers:

$\sigma^2 = \sum_{i=1}^{N} \sigma_i^2 = \sum_{i=1}^{N} C^2 x_i^2 = C^2 N\,\overline{x^2}$, (4.6)
where $\overline{x^2}$ is the mean-square value of x. Thus the standard deviation on the final result is

$\sigma = C\sqrt{N\,\overline{x^2}}$. (4.7)
As we can see, this quantity increases in size as N increases (the more numbers we combine, the larger the error on the result), although the increase is a relatively slow one, proportional to the square root of N.
We can also ask about the fractional error on $\sum_i x_i$, i.e., the total error divided by the value of the sum. The size of the fractional error is given by

$\dfrac{\sigma}{\sum_i x_i} = \dfrac{C\sqrt{N\,\overline{x^2}}}{N\bar{x}} = \dfrac{C}{\sqrt{N}}\,\dfrac{\sqrt{\overline{x^2}}}{\bar{x}}$, (4.8)
where $\bar{x} = N^{-1}\sum_i x_i$ is the mean value of x. In other words, the fractional error in the sum actually goes down as we add more numbers.
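One can see this effect directly in Python. The following short script is our own illustration, not part of the text: it sums a million random numbers with ordinary floating-point addition and compares the result with math.fsum, which computes the sum with extended internal precision.

from math import fsum
from random import random

# Sum N random numbers naively, then compare with math.fsum,
# which avoids accumulating intermediate rounding error.
N = 1000000
values = [random() for i in range(N)]

naive = 0.0
for v in values:
    naive += v

accurate = fsum(values)
print("fractional error:", abs(naive - accurate)/accurate)

The printed fractional error should be tiny, many orders of magnitude smaller than any individual term in the sum, in line with the conclusion above.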
At first glance this appears to be pretty good. So what's the problem? Actually, there are a couple of them. One arises when the sizes of the numbers you are adding vary widely: if some are much smaller than others then the smaller ones may get lost. But the most severe problems arise when you are not adding but subtracting numbers. Suppose, for instance, that we have the following two numbers:
x = 1000000000000000
y = 1000000000000001.2345678901234
and we want to calculate the difference $y - x$. Unfortunately, the computer only represents these two numbers to 16 significant figures, which means that as far as the computer is concerned:
x = 1000000000000000
y = 1000000000000001.2
The first number is represented exactly in this case, but the second has been truncated. Now when we take the difference we get $y - x = 1.2$, when the true result would be 1.2345678901234. In other words, instead of 16-figure accuracy, we now have only two figures and the fractional error is several percent of the true value. This is much worse than before.
To put this in more general terms, if the difference between two numbers is
very small, comparable with the error on the numbers, i.e., with the accuracy
of the computer, then the fractional error can become large and you may have
a problem.
EXAMPLE 4.1: THE DIFFERENCE OF TWO NUMBERS
To see an example of this in practice, consider the two numbers

$x = 1, \qquad y = 1 + 10^{-14}\sqrt{2}$. (4.9)

Trivially we see that

$10^{14}(y - x) = \sqrt{2}$. (4.10)
Let us perform the same calculation in Python and see what we get. Here is
the program:
from math import sqrt
x = 1.0
y = 1.0 + (1e-14)*sqrt(2)
print((1e14)*(y-x))
print(sqrt(2))
The penultimate line calculates the value in Eq. (4.10) while the last line prints out the true value of $\sqrt{2}$, so that we can compare the two.

Exercise 4.2: Quadratic equations

a) Write a program that takes as input three numbers a, b, and c, and prints out the two solutions to the quadratic equation $ax^2 + bx + c = 0$ using the standard formula

$x = \dfrac{-b \pm \sqrt{b^2 - 4ac}}{2a}$.
Use your program to compute the solutions of $0.001x^2 + 1000x + 0.001 = 0$.
b) There is another way to write the solutions to a quadratic equation. Multiplying top and bottom of the solution above by $-b \mp \sqrt{b^2 - 4ac}$, show that the solutions can also be written as

$x = \dfrac{2c}{-b \mp \sqrt{b^2 - 4ac}}$.
Add further lines to your program to print these values in addition to the earlier ones and again use the program to solve $0.001x^2 + 1000x + 0.001 = 0$. What do you see? How do you explain it?
c) Using what you have learned, write a new program that calculates both roots of
a quadratic equation accurately in all cases.
This is a good example of how computers don't always work the way you expect them to. If you simply apply the standard formula for the quadratic equation, the computer will sometimes get the wrong answer. In practice the method you have worked out here is the correct way to solve a quadratic equation on a computer, even though it's more complicated than the standard formula. If you were writing a program that involved solving many quadratic equations this method might be a good candidate for a user-defined function: you could put the details of the solution method inside a function to save yourself the trouble of going through it step by step every time you have a new equation to solve.
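As an indication of what such a function might look like, here is a minimal sketch of ours (the function name and structure are not prescribed by the exercise; it assumes real roots and $a \neq 0$):

from math import sqrt

def quadratic_roots(a, b, c):
    # Compute the root that avoids subtracting nearly equal numbers,
    # then recover the other root from the identity x1*x2 = c/a.
    d = sqrt(b*b - 4*a*c)
    if b >= 0:
        q = -0.5*(b + d)
    else:
        q = -0.5*(b - d)
    return q/a, c/q

print(quadratic_roots(0.001, 1000, 0.001))

Choosing the sign this way means the square root is always combined with b so that the two terms have the same sign, and no cancellation occurs in either root.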
Exercise 4.3: Calculating derivatives
Suppose we have a function f(x) and we want to calculate its derivative at a point x. We can do that with pencil and paper if we know the mathematical form of the function, or we can do it on the computer by making use of the definition of the derivative:

$\dfrac{df}{dx} = \lim_{\delta \to 0} \dfrac{f(x + \delta) - f(x)}{\delta}$.
On the computer we can't actually take the limit as $\delta$ goes to zero, but we can get a reasonable approximation just by making $\delta$ small.
a) Write a program that defines a function f(x) returning the value $x(x-1)$, then calculates the derivative of the function at the point x = 1 using the formula above with $\delta = 10^{-2}$. Calculate the true value of the same derivative analytically and compare with the answer your program gives. The two will not agree perfectly. Why not?
b) Repeat the calculation for $\delta = 10^{-4}$, $10^{-6}$, $10^{-8}$, $10^{-10}$, $10^{-12}$, and $10^{-14}$. You should see that the accuracy of the calculation initially gets better as $\delta$ gets smaller, but then gets worse again. Why is this?
We will look at numerical derivatives in more detail in Section 5.10, where we will
study techniques for dealing with these issues and maximizing the accuracy of our
calculations.
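As a starting point for part (a) of the exercise above, here is a possible sketch (the program structure is ours; only the function $f(x) = x(x-1)$ and the value $\delta = 10^{-2}$ come from the exercise):

def f(x):
    return x*(x-1)

x = 1.0
delta = 1e-2

# Forward-difference approximation to the derivative at x
deriv = (f(x + delta) - f(x))/delta
print(deriv)       # approximate value
print(2*x - 1)     # true value: f(x) = x**2 - x, so f'(x) = 2x - 1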
4.3 PROGRAM SPEED
As we have seen, computers are not infinitely accurate. And neither are they infinitely fast. Yes, they work at amazing speeds, but many physics calculations require the computer to perform millions of individual computations to get a desired overall result, and collectively those computations can take a significant amount of time. Some of the example calculations described in Chapter 1 took months to complete, even though they were run on some of the most powerful computers in the world.
One thing we need to get a feel for is how fast computers really are. As a general guide, performing a million mathematical operations is no big problem for a computer: it usually takes less than a second. Adding a million numbers together, for instance, or finding a million square roots, can be done in very little time. Performing a billion operations, on the other hand, could take minutes or hours, though it's still possible provided you are patient. Performing a trillion operations, however, will basically take forever. So a fair rule of thumb is that the calculations we can perform on a computer are ones that can be done with about a billion operations or less.
This is only a rough guide. Not all operations are equal, and it makes a difference whether we are talking about additions or multiplications of single numbers (which are easy and quick) versus, say, calculating Bessel functions or multiplying matrices (which are not). Moreover, the billion-operation rule will change over time because computers get faster. However, computers have been getting faster a lot less quickly in the last few years; progress has slowed. So we're probably stuck with a billion operations for a while.
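You can get a rough feel for the speed of your own computer with a few lines of Python. The following timing test is our own illustration, not from the text; bear in mind that in Python the loop machinery itself accounts for much of the time, so the result is only a rough guide:

from time import time

start = time()
s = 0.0
for i in range(1000000):
    s += 2.5*i          # one multiplication and one addition per pass
elapsed = time() - start
print("two million operations took", elapsed, "seconds")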
EXAMPLE 4.2: QUANTUM HARMONIC OSCILLATOR AT FINITE TEMPERATURE
The quantum simple harmonic oscillator has energy levels $E_n = \hbar\omega(n + \frac{1}{2})$, where $n = 0, 1, 2, \ldots, \infty$. As shown by Boltzmann and Gibbs, the average energy of a simple harmonic oscillator at temperature T is

$\langle E \rangle = \dfrac{1}{Z} \sum_{n=0}^{\infty} E_n e^{-\beta E_n}$, (4.11)

where $\beta = 1/(k_B T)$ with $k_B$ being the Boltzmann constant, and $Z = \sum_{n=0}^{\infty} e^{-\beta E_n}$.
Suppose we want to calculate, approximately, the value of $\langle E \rangle$ when $k_B T = 100$. Since the terms in the sums for $\langle E \rangle$ and Z dwindle in size quite quickly as n becomes large, we can get a reasonable approximation by taking just the first 1000 terms in each sum. Working in units where $\hbar = \omega = 1$, here's a program to do the calculation:
File: qsho.py

from math import exp

terms = 1000
beta = 1/100
S = 0.0
Z = 0.0
for n in range(terms):
    E = n + 0.5
    weight = exp(-beta*E)
    S += weight*E
    Z += weight
print(S/Z)
Note a few features of this program:
1. Constants like the number of terms and the value of $\beta$ are assigned to variables at the beginning of the program. As discussed in Section 2.7, this is good programming style because it makes them easy to find and modify and makes the rest of the program more readable.
2. We used just one for loop to calculate both sums. This saves time, making the program run faster.
3. Although the exponential $e^{-\beta E_n}$ occurs separately in both sums, we calculate it only once each time around the loop and save its value in the variable weight. This also saves time: exponentials take significantly longer to calculate than, for example, additions or multiplications. (Of course "longer" is relative: the times involved are probably still less than a microsecond. But if one has to go many times around the loop, even those short times can add up.)
If we run the program we get this result:
99.9554313409
The calculation (on my desktop computer) takes 0.01 seconds. Now let us try
increasing the number of terms in the sums (which just means increasing the
value of the variable terms at the top of the program). This will make our
approximation more accurate and give us a better estimate of our answer, at
the expense of taking more time to complete the calculation. If we increase the
number of terms to a million then it does change our answer somewhat:
100.000833332
The calculation now takes 1.4 seconds, which is significantly longer, but still a
short time in absolute terms.
Now let's increase the number of terms to a billion. When we do this the calculation takes 22 minutes to finish, but the result does not change at all:
100.000833332
There are three morals to this story. First, a billion operations is indeed doable: if a calculation is important to us we can probably wait twenty minutes for an answer. But it's approaching the limit of what is reasonable. If we increased the number of terms in our sum by another factor of ten the calculation would take 220 minutes, or nearly four hours. A factor of ten beyond that and we'd be waiting a couple of days for an answer.
Second, there is a balance to be struck between time spent and accuracy. In this case it was probably worthwhile to calculate a million terms of the sum: it didn't take long and the result was noticeably, though not wildly, different from the result for a thousand terms. But the change to a billion terms was clearly not worth the effort: the calculation took much longer to complete but the answer was exactly the same as before. We will see plenty of further examples in this book of calculations where we need to find an appropriate balance between speed and accuracy.
Third, it's pretty easy to write a program that will take forever to finish. If we set the program above to calculate a trillion terms, it would take weeks to run. So it's worth taking a moment, before you spend a whole lot of time writing and running a program, to do a quick estimate of how long you expect your calculation to take. If it's going to take a year then it's not worth it: you need to find a faster way to do the calculation, or settle for a quicker but less accurate answer. The simplest way to estimate running time is to make a rough count of the number of mathematical operations the calculation will involve; if the number is significantly greater than a billion, you have a problem.
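The estimate itself can be a one-line calculation. For instance, under the rough assumption (our own, for illustration) that the computer performs about $10^8$ simple operations per second:

operations = 2e9   # rough count of operations in the planned calculation
rate = 1e8         # assumed operations per second; machine-dependent
print("estimated running time:", operations/rate, "seconds")   # prints 20.0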
EXAMPLE 4.3: MATRIX MULTIPLICATION
Suppose we have two $N \times N$ matrices represented as arrays A and B on the computer and we want to multiply them together to calculate their matrix product. Here is a fragment of code to do the multiplication and place the result in a new array called C:
from numpy import zeros

# A and B are assumed to be N x N arrays defined earlier in the program
N = 1000
C = zeros([N,N],float)
for i in range(N):
    for j in range(N):
        for k in range(N):
            C[i,j] += A[i,k]*B[k,j]
We could use this code, for example, as the basis for a user-defined function to multiply arrays together. (As we saw in Section 2.4.4, Python already provides the function dot for calculating matrix products, but it's a useful exercise to write our own code for the calculation. Among other things, it helps us understand how many operations are involved in calculating such a product.)
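A sketch of such a user-defined function might look like this (the name multiply is our own choice; in real work numpy's dot will be much faster):

from numpy import zeros

def multiply(A, B):
    # Direct matrix multiplication from the definition of the matrix
    # product: C[i,j] is the sum over k of A[i,k]*B[k,j].
    N = len(A)
    C = zeros([N,N], float)
    for i in range(N):
        for j in range(N):
            for k in range(N):
                C[i,j] += A[i,k]*B[k,j]
    return C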
How large a pair of matrices could we multiply together in this way if the calculation is to take a reasonable amount of time? The program has three nested for loops in it. The innermost loop, which runs through values of the variable k, goes around N times doing one multiplication operation each time and one addition, for a total of 2N operations. That whole loop is itself executed N times, once for each value of j in the middle loop, giving $2N^2$ operations. And those $2N^2$ operations are themselves performed N times as we go through the values of i in the outermost loop. The end result is that the matrix multiplication takes $2N^3$ operations overall. Thus if N = 1000, as above, the whole calculation would involve two billion operations, which is feasible in a few minutes of running time. Larger values of N, however, will rapidly become intractable. For N = 2000, for instance, we would have 16 billion operations, which could take hours to complete. Thus the largest matrices we can multiply are about $1000 \times 1000$ in size.⁴

⁴ Interestingly, the direct matrix multiplication represented by the code given here is not the fastest way to multiply two matrices on a computer. Strassen's algorithm is an iterative method for multiplying matrices that uses some clever shortcuts to reduce the number of operations needed, so that the total number is proportional to about $N^{2.8}$ rather than $N^3$. For very large matrices this can result in significantly faster computations. Unfortunately, Strassen's algorithm suffers from large numerical errors because of problems with subtraction of nearly equal numbers (see Section 4.2) and for this reason it is rarely used. On paper, an even faster method for matrix multiplication is the Coppersmith-Winograd algorithm, which requires a number of operations proportional to only about $N^{2.4}$, but in practice this method is so complex to program as to be essentially worthless: the extra complexity means that in real applications the method is always slower than direct multiplication.
Exercise 4.4: Calculating integrals
Suppose we want to calculate the value of the integral

$I = \int_{-1}^{1} \sqrt{1 - x^2}\, dx$.
The integrand looks like a semicircle of radius 1:

[Figure: the curve $y = \sqrt{1 - x^2}$ plotted from x = −1 to 1.]

and hence the value of the integral, the area under the curve, must be equal to $\frac{1}{2}\pi = 1.57079632679\ldots$
Alternatively, we can evaluate the integral on the computer by dividing the domain of integration into a large number N of slices of width $h = 2/N$ each and then using the Riemann definition of the integral:

$I = \lim_{N \to \infty} \sum_{k=1}^{N} h\,y_k$,
where

$y_k = \sqrt{1 - x_k^2}$ and $x_k = -1 + hk$.
We cannot in practice take the limit $N \to \infty$, but we can make a reasonable approximation by just making N large.
a) Write a program to evaluate the integral above with N = 100 and compare the result with the exact value. The two will not agree very well, because N = 100 is not a sufficiently large number of slices.
b) Increase the value of N to get a more accurate value for the integral. If we require that the program runs in about one second or less, how accurate a value can you get?
Evaluating integrals is a common task in computational physics calculations. We will
study techniques for doing integrals in detail in the next chapter. As we will see, there
are substantially quicker and more accurate methods than the simple one we have used
here.
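To give an idea of the shape such a program might take, here is a minimal sketch for part (a) of the exercise above (the structure is ours; the max() guard simply protects against rounding pushing $1 - x_k^2$ slightly below zero):

from math import sqrt

N = 100
h = 2/N

# Riemann sum: add up the areas h*y_k of the N slices
I = 0.0
for k in range(1, N+1):
    x = -1 + h*k
    I += h*sqrt(max(0.0, 1 - x*x))

print(I)               # approximate value of the integral
print(1.57079632679)   # exact value, pi/2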