
Workbook 31: Numerical Methods of Approximation

Contents
31.1 Polynomial Approximations 2
31.2 Numerical Integration 28
31.3 Numerical Differentiation 58
31.4 Nonlinear Equations 67

Learning outcomes
In this Workbook you will learn about some numerical methods widely used in
engineering applications.

You will learn how certain data may be modelled, how integrals and derivatives may be
approximated and how estimates for the solutions of non-linear equations may be found.
31.1 Polynomial Approximations

Introduction
Polynomials are functions with useful properties. Their relatively simple form makes them an ideal
candidate to use as approximations for more complex functions. In this second Workbook on Numerical Methods, we begin by showing some ways in which certain functions of interest may be
approximated by polynomials.

 
Prerequisites
Before starting this Section you should . . .
• revise material on maxima and minima of functions of two variables
• be familiar with polynomials and Taylor series

Learning Outcomes
On completion you should be able to . . .
• interpolate data with polynomials
• find the least squares best fit straight line to experimental data

2 HELM (2006):
Workbook 31: Numerical Methods of Approximation

1. Polynomials
A polynomial in x is a function of the form
p(x) = a0 + a1 x + a2 x^2 + · · · + an x^n   (an ≠ 0, n a non-negative integer)
where a0, a1, a2, . . . , an are constants. We say that this polynomial p has degree equal to n. (The
degree of a polynomial is the highest power to which the argument, here x, is raised.) Such
functions are relatively simple to deal with, for example they are easy to differentiate and integrate. In
this Section we will show ways in which a function of interest can be approximated by a polynomial.
First we briefly ensure that we are certain what a polynomial is.

Example 1
Which of these functions are polynomials in x? In the case(s) where f is a polynomial, give its degree.
(a) f(x) = x^2 − 2 − x^(−1),  (b) f(x) = x^4 + x − 6,  (c) f(x) = 1,
(d) f(x) = mx + c, m and c are constants,  (e) f(x) = 1 − x^6 + 3x^3 − 5x^3

Solution
(a) This is not a polynomial because of the x^(−1) term (no negative powers of the argument are allowed in polynomials).
(b) This is a polynomial in x of degree 4.
(c) This is a polynomial of degree 0.
(d) This straight line function is a polynomial in x of degree 1 if m ≠ 0 and of degree 0 if m = 0.
(e) This is a polynomial in x of degree 6.

Task
Which of these functions are polynomials in x? In the case(s) where f is a polynomial, give its degree.
(a) f(x) = (x − 1)(x + 3)   (b) f(x) = 1 − x^7   (c) f(x) = 2 + 3e^x − 4e^(2x)
(d) f(x) = cos(x) + sin^2(x)

Your solution

Answer
(a) This function, like all quadratics, is a polynomial of degree 2.
(b) This is a polynomial of degree 7.
(c) and (d) These are not polynomials in x. Their Maclaurin expansions have infinitely many terms.

We have in fact already seen, in Workbook 16, one way in which some functions may be approximated
by polynomials. We review this next.

2. Taylor series
In Workbook 16 we encountered Maclaurin series and their generalisation, Taylor series. Taylor series are
a useful way of approximating functions by polynomials. The Taylor series expansion of a function
f (x) about x = a may be stated
f(x) = f(a) + (x − a)f'(a) + (1/2!)(x − a)^2 f''(a) + (1/3!)(x − a)^3 f'''(a) + · · ·
(The special case called Maclaurin series arises when a = 0.)
The general idea when using this formula in practice is to consider only points x which are near to a.
Given this it follows that (x − a) will be small, (x − a)2 will be even smaller, (x − a)3 will be smaller
still, and so on. This gives us confidence to simply neglect the terms beyond a certain power, or, to
put it another way, to truncate the series.

Example 2
Find the Taylor polynomial of degree 2 about the point x = 1, for the function
f (x) = ln(x).

Solution
In this case a = 1 and we need to evaluate the following terms:
f(a) = ln(a) = ln(1) = 0,  f'(a) = 1/a = 1,  f''(a) = −1/a^2 = −1.
Hence
ln(x) ≈ 0 + (x − 1) − (1/2)(x − 1)^2 = −3/2 + 2x − x^2/2
which will be reasonably accurate for x close to 1, as you can readily check on a calculator or
computer. For example, for all x between 0.9 and 1.1, the polynomial and logarithm agree to at
least 3 decimal places.
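That agreement is easy to check numerically. Below is a minimal Python sketch (the function name `taylor_ln` is ours, not from the Workbook) comparing the degree-2 Taylor polynomial with the built-in logarithm:

```python
import math

def taylor_ln(x):
    # Degree-2 Taylor polynomial of ln(x) about a = 1:
    # ln(x) is approximately (x - 1) - (1/2)(x - 1)^2
    return (x - 1) - 0.5 * (x - 1) ** 2

# Agreement to at least 3 decimal places for x between 0.9 and 1.1
for x in [0.9, 0.95, 1.0, 1.05, 1.1]:
    assert abs(taylor_ln(x) - math.log(x)) < 5e-4
```

Moving x further from a = 1 makes the neglected (x − 1)^3 and higher terms, and hence the error, grow quickly.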

One drawback with this approach is that we need to find (possibly many) derivatives of f . Also, there
can be some doubt over what is the best choice of a. The statement of Taylor series is an extremely
useful piece of theory, but it can sometimes have limited appeal as a means of approximating functions
by polynomials.
Next we will consider two alternative approaches.


3. Polynomial approximations - exact data


Here and in subsections 4 and 5 we consider cases where, rather than knowing an expression for
the function, we have a list of point values. Sometimes it is good enough to find a polynomial that
passes near these points (like putting a straight line through experimental data). Such a polynomial
is an approximating polynomial and this case follows in subsection 4. Here and in subsection 5 we
deal with the case where we want a polynomial to pass exactly through the given data, that is, an
interpolating polynomial.

Lagrange interpolation
Suppose that we know (or choose to sample) a function f exactly at a few points and that we want
to approximate how the function behaves between those points. In its simplest form this is equivalent
to a dot-to-dot puzzle (see Figure 1(a)), but it is often more desirable to seek a curve that does not
have “corners” in it (see Figure 1(b)).

Figure 1: (a) Linear, or “dot-to-dot”, interpolation, with corners at all of the data points; (b) a smoother interpolation of the data points.
Let us suppose that the data are in the form (x1 , f1 ), (x2 , f2 ), (x3 , f3 ), . . . , these are the points
plotted as crosses on the diagrams above. (For technical reasons, and those of common sense, we
suppose that the x-values in the data are all distinct.)
Our aim is to find a polynomial which passes exactly through the given data points. We want to find
p(x) such that
p(x1 ) = f1 , p(x2 ) = f2 , p(x3 ) = f3 , ...
There is a mathematical trick we can use to achieve this. We define Lagrange polynomials L1 ,
L2 , L3 , . . . which have the following properties:
L1(x) = 1 at x = x1,   L1(x) = 0 at x = x2, x3, x4, . . .
L2(x) = 1 at x = x2,   L2(x) = 0 at x = x1, x3, x4, . . .
L3(x) = 1 at x = x3,   L3(x) = 0 at x = x1, x2, x4, . . .
and so on.
Each of these functions acts like a filter which “turns off” if you evaluate it at a data point other
than its own. For example if you evaluate L2 at any data point other than x2 , you will get zero.
Furthermore, if you evaluate any of these Lagrange polynomials at its own data point, the value you
get is 1. These two properties are enough to be able to write down what p(x) must be:

p(x) = f1 L1 (x) + f2 L2 (x) + f3 L3 (x) + . . .
and this does work, because if we evaluate p at one of the data points, let us take x2 for example,
then
p(x2) = f1 L1(x2) + f2 L2(x2) + f3 L3(x2) + · · · = f2,
since L1(x2) = 0, L2(x2) = 1, L3(x2) = 0, and so on,

as required. The filtering property of the Lagrange polynomials picks out exactly the right f -value
for the current x-value. Between the data points, the expression for p above will give a smooth
polynomial curve.
This is all very well as long as we can work out what the Lagrange polynomials are. It is not hard to
check that the following definitions have the right properties.

Key Point 1
Lagrange Polynomials
L1(x) = [(x − x2)(x − x3)(x − x4) . . .] / [(x1 − x2)(x1 − x3)(x1 − x4) . . .]

L2(x) = [(x − x1)(x − x3)(x − x4) . . .] / [(x2 − x1)(x2 − x3)(x2 − x4) . . .]

L3(x) = [(x − x1)(x − x2)(x − x4) . . .] / [(x3 − x1)(x3 − x2)(x3 − x4) . . .]

and so on.
The numerator of Li(x) does not contain (x − xi).
The denominator of Li(x) does not contain (xi − xi).

In each case the numerator ensures that the filtering property is in place, that is that the functions
switch off at data points other than their own. The denominators make sure that the value taken at
the remaining data point is equal to 1.

Figure 2


Figure 2 shows L1 and L2 in the case where there are five data points (the x positions of these data
points are shown as large dots). Notice how both L1 and L2 are equal to zero at four of the data
points and that L1 (x1 ) = 1 and L2 (x2 ) = 1.
In an implementation of this idea, things are simplified by the fact that we do not generally require
an expression for p(x). (This is good news, for imagine trying to multiply out all the algebra in the
expressions for L1 , L2 , . . . .) What we do generally require is p evaluated at some specific value.
The following Example should help show how this can be done.

Example 3
Let p(x) be the polynomial of degree 3 which interpolates the data
x 0.8 1 1.4 1.6
f (x) −1.82 −1.73 −1.40 −1.11
Evaluate p(1.1).

Solution
We are interested in the Lagrange polynomials at the point x = 1.1 so we consider
L1(1.1) = [(1.1 − x2)(1.1 − x3)(1.1 − x4)] / [(x1 − x2)(x1 − x3)(x1 − x4)] = [(1.1 − 1)(1.1 − 1.4)(1.1 − 1.6)] / [(0.8 − 1)(0.8 − 1.4)(0.8 − 1.6)] = −0.15625.
Similar calculations for the other Lagrange polynomials give
L2 (1.1) = 0.93750, L3 (1.1) = 0.31250, L4 (1.1) = −0.09375,
and we find that our interpolated polynomial, evaluated at x = 1.1 is

p(1.1) = f1 L1(1.1) + f2 L2(1.1) + f3 L3(1.1) + f4 L4(1.1)
       = (−1.82 × −0.15625) + (−1.73 × 0.9375) + (−1.4 × 0.3125) + (−1.11 × −0.09375)
       = −1.670938
       = −1.67 to the number of decimal places to which the data were given.
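The working in Example 3 is easy to automate. The following Python sketch (the helper name `lagrange_eval` is our own) evaluates the Lagrange form directly and reproduces p(1.1):

```python
def lagrange_eval(xs, fs, x):
    """Evaluate the polynomial interpolating the points (xs[i], fs[i]) at x."""
    total = 0.0
    for i, xi in enumerate(xs):
        L = 1.0  # build the Lagrange polynomial L_i evaluated at x
        for j, xj in enumerate(xs):
            if j != i:
                L *= (x - xj) / (xi - xj)
        total += fs[i] * L
    return total

# Reproduce Example 3: p(1.1) = -1.670938, i.e. -1.67 to 2 d.p.
p = lagrange_eval([0.8, 1.0, 1.4, 1.6], [-1.82, -1.73, -1.40, -1.11], 1.1)
assert abs(p - (-1.670938)) < 1e-6
```

At a data point the filtering property makes all but one of the L_i vanish, so the routine returns the tabulated value exactly.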

Key Point 2

Quote the answer only to the same number of decimal places as the given data (or to less places).

Task
Let p(x) be the polynomial of degree 3 which interpolates the data
x 0.1 0.2 0.3 0.4
f (x) 0.91 0.70 0.43 0.52
Evaluate p(0.15).

Your solution

Answer
We are interested in the Lagrange polynomials at the point x = 0.15 so we consider
L1(0.15) = [(0.15 − x2)(0.15 − x3)(0.15 − x4)] / [(x1 − x2)(x1 − x3)(x1 − x4)] = [(0.15 − 0.2)(0.15 − 0.3)(0.15 − 0.4)] / [(0.1 − 0.2)(0.1 − 0.3)(0.1 − 0.4)] = 0.3125.
Similar calculations for the other Lagrange polynomials give
L2 (0.15) = 0.9375, L3 (0.15) = −0.3125, L4 (0.15) = 0.0625,
and we find that our interpolated polynomial, evaluated at x = 0.15 is

p(0.15) = f1 L1(0.15) + f2 L2(0.15) + f3 L3(0.15) + f4 L4(0.15)
        = (0.91 × 0.3125) + (0.7 × 0.9375) + (0.43 × −0.3125) + (0.52 × 0.0625)
        = 0.838750
        = 0.84, to 2 decimal places.

The next Example is very much the same as Example 3 and the Task above. Try not to let the
specific application, and the slight change of notation, confuse you.

Example 4
A designer wants a curve on a diagram he is preparing to pass through the points
x 0.25 0.5 0.75 1
y 0.32 0.65 0.43 0.10
He decides to do this by using an interpolating polynomial p(x). What is the
y-value corresponding to x = 0.8?


Solution
We are interested in the Lagrange polynomials at the point x = 0.8 so we consider
L1(0.8) = [(0.8 − x2)(0.8 − x3)(0.8 − x4)] / [(x1 − x2)(x1 − x3)(x1 − x4)] = [(0.8 − 0.5)(0.8 − 0.75)(0.8 − 1)] / [(0.25 − 0.5)(0.25 − 0.75)(0.25 − 1)] = 0.032.
Similar calculations for the other Lagrange polynomials give
L2 (0.8) = −0.176, L3 (0.8) = 1.056, L4 (0.8) = 0.088,
and we find that our interpolated polynomial, evaluated at x = 0.8 is

p(0.8) = y1 L1(0.8) + y2 L2(0.8) + y3 L3(0.8) + y4 L4(0.8)
       = (0.32 × 0.032) + (0.65 × −0.176) + (0.43 × 1.056) + (0.1 × 0.088)
       = 0.358720
       = 0.36 to 2 decimal places.

In this next Task there are five points to interpolate. It therefore takes a polynomial of degree 4 to
interpolate the data and this means we must use five Lagrange polynomials.

Task
The hull drag f of a racing yacht as a function of the hull speed, v, is known to
be
v 0.0 0.5 1.0 1.5 2.0
f 0.00 19.32 90.62 175.71 407.11
(Here, the units for f and v are N and m s^(−1), respectively.)
Use Lagrange interpolation to fit these data and hence approximate the drag corresponding to a hull speed of 2.5 m s^(−1).

Your solution

Answer
We are interested in the Lagrange polynomials at the point v = 2.5 so we consider

L1(2.5) = [(2.5 − v2)(2.5 − v3)(2.5 − v4)(2.5 − v5)] / [(v1 − v2)(v1 − v3)(v1 − v4)(v1 − v5)]
        = [(2.5 − 0.5)(2.5 − 1.0)(2.5 − 1.5)(2.5 − 2.0)] / [(0.0 − 0.5)(0.0 − 1.0)(0.0 − 1.5)(0.0 − 2.0)] = 1.0

Similar calculations for the other Lagrange polynomials give
L2(2.5) = −5.0,  L3(2.5) = 10.0,  L4(2.5) = −10.0,  L5(2.5) = 5.0
and we find that our interpolated polynomial, evaluated at v = 2.5, is
p(2.5) = f1 L1(2.5) + f2 L2(2.5) + f3 L3(2.5) + f4 L4(2.5) + f5 L5(2.5)
       = (0.00 × 1.0) + (19.32 × −5.0) + (90.62 × 10.0) + (175.71 × −10.0) + (407.11 × 5.0)
       = 1088.05

This gives us the approximation that the hull drag on the yacht at 2.5 m s^(−1) is about 1100 N.
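A sketch of this calculation in Python (the helper `lagrange_eval` is our own). Note that v = 2.5 lies outside the tabulated range, so this is extrapolation and the answer should be treated cautiously:

```python
def lagrange_eval(xs, fs, x):
    """Evaluate the polynomial interpolating (xs[i], fs[i]) at x."""
    total = 0.0
    for i, xi in enumerate(xs):
        L = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                L *= (x - xj) / (xi - xj)
        total += fs[i] * L
    return total

v = [0.0, 0.5, 1.0, 1.5, 2.0]             # hull speed data
f = [0.00, 19.32, 90.62, 175.71, 407.11]  # drag data

drag = lagrange_eval(v, f, 2.5)  # 2.5 is outside the data: extrapolation
assert abs(drag - 1088.05) < 1e-6
```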

The following Example has time t as the independent variable, and two quantities, x and y, as
dependent variables to be interpolated. We will see however that exactly the same approach as
before works.

Example 5
An animator working on a computer generated cartoon has decided that her main
character’s right index finger should pass through the following (x, y) positions on
the screen at the following times t
t 0 0.2 0.4 0.6
x 1.00 1.20 1.30 1.25
y 2.00 2.10 2.30 2.60
Use Lagrange polynomials to interpolate these data and hence find the (x, y)
position at time t = 0.5. Give x and y to 2 decimal places.

Solution
In this case t is the independent variable, and there are two dependent variables: x and y. We are
interested in the Lagrange polynomials at the time t = 0.5 so we consider
L1(0.5) = [(0.5 − t2)(0.5 − t3)(0.5 − t4)] / [(t1 − t2)(t1 − t3)(t1 − t4)] = [(0.5 − 0.2)(0.5 − 0.4)(0.5 − 0.6)] / [(0 − 0.2)(0 − 0.4)(0 − 0.6)] = 0.0625
Similar calculations for the other Lagrange polynomials give
L2 (0.5) = −0.3125, L3 (0.5) = 0.9375, L4 (0.5) = 0.3125


Solution (contd.)
These values for the Lagrange polynomials can be used for both of the interpolations we need to
do. For the x-value we obtain

x(0.5) = x1 L1(0.5) + x2 L2(0.5) + x3 L3(0.5) + x4 L4(0.5)
       = (1.00 × 0.0625) + (1.20 × −0.3125) + (1.30 × 0.9375) + (1.25 × 0.3125)
       = 1.30 to 2 decimal places

and for the y value we get

y(0.5) = y1 L1(0.5) + y2 L2(0.5) + y3 L3(0.5) + y4 L4(0.5)
       = (2.00 × 0.0625) + (2.10 × −0.3125) + (2.30 × 0.9375) + (2.60 × 0.3125)
       = 2.44 to 2 decimal places
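Because the Lagrange weights depend only on the t-values, they can be computed once and reused for both coordinates. A Python sketch (the helper name `lagrange_weights` is ours):

```python
def lagrange_weights(ts, t):
    """Weights L_i(t) for nodes ts, so that p(t) = sum of w_i * f_i."""
    ws = []
    for i, ti in enumerate(ts):
        w = 1.0
        for j, tj in enumerate(ts):
            if j != i:
                w *= (t - tj) / (ti - tj)
        ws.append(w)
    return ws

t = [0.0, 0.2, 0.4, 0.6]
x = [1.00, 1.20, 1.30, 1.25]
y = [2.00, 2.10, 2.30, 2.60]

w = lagrange_weights(t, 0.5)               # one set of weights...
xs = sum(wi * xi for wi, xi in zip(w, x))  # ...reused for x
ys = sum(wi * yi for wi, yi in zip(w, y))  # ...and for y
assert round(xs, 2) == 1.30 and round(ys, 2) == 2.44
```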

Error in Lagrange interpolation


When using Lagrange interpolation through n points (x1, f1), (x2, f2), . . . , (xn, fn), the error in the
estimate of f(x) is given by
E(x) = [(x − x1)(x − x2) . . . (x − xn) / n!] f^(n)(η),  where η lies in the interval spanned by x, x1, . . . , xn.
N.B. The value of η is not known precisely, only the interval in which it lies. Normally x will lie in
the interval [x1, xn] (that is interpolation). If x lies outside the interval [x1, xn] then that is called
extrapolation and a larger error is likely.
Of course we will not normally know what f is (indeed no f may exist for experimental data).
However, sometimes f can at least be estimated. In the following (somewhat artificial) example we
will be told f and use it to check that the above error formula is reasonable.

Example 6
In an experiment to determine the relationship between power gain (G) and power
output (P ) in an amplifier, the following data were recorded.
P 5 7 8 11
G 0.00 1.46 2.04 3.42

(a) Use Lagrange interpolation to fit an appropriate quadratic, q(x), to estimate the gain when the output is 6.5. Give your answer to an appropriate accuracy.

(b) Given that G ≡ 10 log10(P/5), show that the actual error which occurred in
the Lagrange interpolation in (a) lies within the theoretical error limits.

Solution
For a quadratic, q(x), we need to fit three points and those most appropriate (nearest 6.5) are for
P at 5, 7, 8:

q(6.5) = [(6.5 − 7)(6.5 − 8)] / [(5 − 7)(5 − 8)] × 0.00
       + [(6.5 − 5)(6.5 − 8)] / [(7 − 5)(7 − 8)] × 1.46
       + [(6.5 − 5)(6.5 − 7)] / [(8 − 5)(8 − 7)] × 2.04
       = 0 + 1.6425 − 0.5100
       = 1.1325, working to 4 d.p.
       ≈ 1.1 (rounding to sensible accuracy)

(b) We use the error formula

E(x) = [(x − x1) . . . (x − xn) / n!] f^(n)(η),  η ∈ [x, x1, . . . , xn]

Here f ≡ G = 10 log10(P/5) and n = 3. We first differentiate log10(P/5):

d(log10(P/5))/dP = d(log10(P) − log10(5))/dP = d(log10(P))/dP = d(ln(P)/ln 10)/dP = (1/ln 10)(1/P)

so

d^3(log10(P/5))/dP^3 = (1/ln 10)(2/P^3).
Substituting for f^(3)(η) = 10 × (1/ln 10)(2/η^3):

E(6.5) = [(6.5 − 5)(6.5 − 7)(6.5 − 8)] / 3! × (10/ln 10) × (2/η^3),  η ∈ [5, 8]
       = 1.6286/η^3,  η ∈ [5, 8]

Taking η = 5:  Emax = 0.0131
Taking η = 8:  Emin = 0.0031

Taking x = 6.5:
Eactual = G(6.5) − q(6.5) = 10 log10(6.5/5) − 1.1325 = 1.1394 − 1.1325 = 0.0069

The theory is satisfied because Emin < Eactual < Emax .
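The whole of Example 6 can be replayed in a few lines of Python (a sketch; `lagrange_eval` is our own helper and the constant 1.6286 is the bound worked out above):

```python
import math

def lagrange_eval(xs, fs, x):
    total = 0.0
    for i, xi in enumerate(xs):
        L = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                L *= (x - xj) / (xi - xj)
        total += fs[i] * L
    return total

P = [5, 7, 8]                 # the three points nearest 6.5
G = [0.00, 1.46, 2.04]
q = lagrange_eval(P, G, 6.5)  # quadratic estimate of the gain
actual = 10 * math.log10(6.5 / 5)

E_actual = actual - q
E_min = 1.6286 / 8 ** 3          # taking eta = 8
E_max = 1.6286 / 5 ** 3          # taking eta = 5
assert E_min < E_actual < E_max  # actual error lies within the bounds
```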


Task
(a) Use Lagrange interpolation to estimate f(8) to appropriate accuracy, given the table of values below, by means of the appropriate cubic interpolating polynomial.
x    2        5        7        9        10
f(x) 0.980067 0.877583 0.764842 0.621610 0.540302

Your solution

Answer
The most appropriate cubic passes through x at 5, 7, 9, 10:
x = 8;  x1 = 5, x2 = 7, x3 = 9, x4 = 10

p(8) = [(8 − 7)(8 − 9)(8 − 10)] / [(5 − 7)(5 − 9)(5 − 10)] × 0.877583
     + [(8 − 5)(8 − 9)(8 − 10)] / [(7 − 5)(7 − 9)(7 − 10)] × 0.764842
     + [(8 − 5)(8 − 7)(8 − 10)] / [(9 − 5)(9 − 7)(9 − 10)] × 0.621610
     + [(8 − 5)(8 − 7)(8 − 9)] / [(10 − 5)(10 − 7)(10 − 9)] × 0.540302
     = −(1/20) × 0.877583 + (1/2) × 0.764842 + (3/4) × 0.621610 − (1/5) × 0.540302
     = 0.696689

Suitable accuracy is 0.6967 (rounded to 4 d.p.).

(b) Given that the table in (a) represents f (x) ≡ cos(x/10), calculate theoretical bounds for the
estimate obtained:
Your solution

Answer
E(8) = [(8 − 5)(8 − 7)(8 − 9)(8 − 10)] / 4! × f^(4)(η),  5 ≤ η ≤ 10

f(η) = cos(η/10), so f^(4)(η) = (1/10^4) cos(η/10), giving

E(8) = [1/(4 × 10^4)] cos(η/10),  η ∈ [5, 10]

Emin = [1/(4 × 10^4)] cos(1),  Emax = [1/(4 × 10^4)] cos(0.5)

This leads to
0.696689 + 0.000014 ≤ True Value ≤ 0.696689 + 0.000022
⇒ 0.696703 ≤ True Value ≤ 0.696711
We can conclude that the True Value is 0.69670 or 0.69671 to 5 d.p., or 0.6967 to 4 d.p. (the actual
value is 0.696707).


4. Polynomial approximations - experimental data


You may well have experience in carrying out an experiment and then trying to get a straight line to
pass as near as possible to the data plotted on graph paper. This process of adjusting a clear ruler
over the page until it looks “about right” is fine for a rough approximation, but it is not especially
scientific. Any software you use which provides a “best fit” straight line must obviously employ a
less haphazard approach.
Here we show one way in which best fit straight lines may be found.

Best fit straight lines

Let us consider the situation mentioned above of trying to get a straight line y = mx + c to be as
near as possible to experimental data in the form (x1 , f1 ), (x2 , f2 ), (x3 , f3 ), . . . .

Figure 3: the data points (x1, f1), (x2, f2), (x3, f3), . . . and a candidate straight line y = mx + c.
We want to minimise the overall distance between the crosses (the data points) and the straight line.
There are a few different approaches, but the one we adopt here involves minimising the quantity

R = (mx1 + c − f1)^2 + (mx2 + c − f2)^2 + (mx3 + c − f3)^2 + · · · = Σ (mxn + c − fn)^2.

Each term in the sum measures the vertical distance between a data point and the straight line.
(Squaring the distances ensures that distances above and below the line do not cancel each other
out. It is because we are minimising the distances squared that the straight line we will find is called
the least squares best fit straight line.)

In order to minimise R we can imagine sliding the clear ruler around on the page until the line looks
right; that is we can imagine varying the slope m and y-intercept c of the line. We therefore think
of R as a function of the two variables m and c and, as we know from our earlier work on maxima
and minima of functions, the minimisation is achieved when
∂R/∂c = 0   and   ∂R/∂m = 0.
(We know that this will correspond to a minimum because R has no maximum, for whatever value
R takes we can always make it bigger by moving the line further away from the data points.)
Differentiating R with respect to m and c gives
∂R/∂c = 2(mx1 + c − f1) + 2(mx2 + c − f2) + 2(mx3 + c − f3) + · · · = 2 Σ (mxn + c − fn)
and
∂R/∂m = 2(mx1 + c − f1)x1 + 2(mx2 + c − f2)x2 + 2(mx3 + c − f3)x3 + · · · = 2 Σ (mxn + c − fn)xn,

respectively. Setting both of these quantities equal to zero (and cancelling the factor of 2) gives a
pair of simultaneous equations for m and c. This pair of equations is given in the Key Point below.

Key Point 3
The least squares best fit straight line to the experimental data
(x1 , f1 ), (x2 , f2 ), (x3 , f3 ), . . . (xn , fn )
is
y = mx + c
where m and c are found by solving the pair of equations

c (Σ 1) + m (Σ xn) = Σ fn,
c (Σ xn) + m (Σ xn^2) = Σ xn fn,

in which each sum runs over all the data points.
(The term Σ 1 is simply equal to the number of data points, n.)


Example 7
An experiment is carried out and the following data obtained:
xn 0.24 0.26 0.28 0.30
fn 1.25 0.80 0.66 0.20
Obtain the least squares best fit straight line, y = mx + c, to these data. Give c
and m to 2 decimal places.

Solution
For a hand calculation, tabulating the data makes sense:
xn    fn    xn^2    xn·fn
0.24  1.25  0.0576  0.3000
0.26  0.80  0.0676  0.2080
0.28  0.66  0.0784  0.1848
0.30  0.20  0.0900  0.0600
Σ     1.08  2.91    0.2936  0.7528
The quantity Σ 1 counts the number of data points and in this case is equal to 4.
It follows that the pair of equations for m and c are:

4c + 1.08m = 2.91
1.08c + 0.2936m = 0.7528

Solving these gives c = 5.17 and m = −16.45 and we see that the least squares best fit straight
line to the given data is
y = 5.17 − 16.45x
Figure 4 shows how well the straight line fits the experimental data.

Figure 4
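For completeness, here is a Python sketch (the function name `least_squares_line` is ours) that solves the pair of equations of Key Point 3 by elimination and reproduces the values found in Example 7:

```python
def least_squares_line(xs, fs):
    """Return (m, c) for the least squares best fit line y = m*x + c,
    found by solving the two normal equations of Key Point 3."""
    n = len(xs)
    sx = sum(xs)                              # sum of xn
    sf = sum(fs)                              # sum of fn
    sxx = sum(x * x for x in xs)              # sum of xn^2
    sxf = sum(x * f for x, f in zip(xs, fs))  # sum of xn*fn
    # Eliminate c from:  n*c + sx*m = sf  and  sx*c + sxx*m = sxf
    m = (n * sxf - sx * sf) / (n * sxx - sx * sx)
    c = (sf - m * sx) / n
    return m, c

m, c = least_squares_line([0.24, 0.26, 0.28, 0.30], [1.25, 0.80, 0.66, 0.20])
assert round(c, 2) == 5.17 and round(m, 2) == -16.45
```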

Example 8
Find the best fit straight line to the following experimental data:
xn 0.00 1.00 2.00 3.00 4.00
fn 1.00 3.85 6.50 9.35 12.05

Solution
In order to work out all of the quantities appearing in the pair of equations we tabulate our calculations as follows:
xn    fn     xn^2   xn·fn
0.00  1.00   0.00   0.00
1.00  3.85   1.00   3.85
2.00  6.50   4.00   13.00
3.00  9.35   9.00   28.05
4.00  12.05  16.00  48.20
Σ     10.00  32.75  30.00  93.10
The quantity Σ 1 counts the number of data points and is in this case equal to 5.
Hence our pair of equations is
5c + 10m = 32.75
10c + 30m = 93.10
Solving these equations gives c = 1.03 and m = 2.76 and this means that our best fit straight line
to the given data is
y = 1.03 + 2.76x

18 HELM (2006):
Workbook 31: Numerical Methods of Approximation
®

Task
An experiment is carried out and the data obtained are as follows:
xn 0.2 0.3 0.5 0.9
fn 5.54 4.02 3.11 2.16
Obtain the least squares best fit straight line, y = mx + c, to these data. Give c
and m to 2 decimal places.

Your solution

Answer
Tabulating the data gives
xn   fn    xn^2  xn·fn
0.2  5.54  0.04  1.108
0.3  4.02  0.09  1.206
0.5  3.11  0.25  1.555
0.9  2.16  0.81  1.944
Σ    1.9   14.83 1.19  5.813
The quantity Σ 1 counts the number of data points and in this case is equal to 4.
It follows that the pair of equations for m and c are:

4c + 1.9m = 14.83
1.9c + 1.19m = 5.813

Solving these gives c = 5.74 and m = −4.28 and we see that the least squares best fit straight line
to the given data is
y = 5.74 − 4.28x

Task
Power output P of a semiconductor laser diode, operating at 35 °C, as a function
of the drive current I is measured to be
I 70 72 74 76
P 1.33 2.08 2.88 3.31
(Here I and P are measured in mA and mW respectively.)
It is known that, above a certain threshold current, the laser power increases
linearly with drive current. Use the least squares approach to fit a straight line,
P = mI + c, to these data. Give c and m to 2 decimal places.

Your solution

Answer
Tabulating the data gives
I    P     I^2   I × P
70   1.33  4900  93.10
72   2.08  5184  149.76
74   2.88  5476  213.12
76   3.31  5776  251.56
Σ    292   9.60  21336  707.54
The quantity Σ 1 counts the number of data points and in this case is equal to 4.
It follows that the pair of equations for m and c are:

4c + 292m = 9.6
292c + 21336m = 707.54

Solving these gives c = −22.20 and m = 0.34 and we see that the least squares best fit straight
line to the given data is
P = −22.20 + 0.34I.


5. Polynomial approximations - splines


We complete this Section by briefly describing another approach that can be used in the case where
the data are exact.

Why are splines needed?


Fitting a polynomial to the data (using Lagrange polynomials, for example) works very well when
there are a small number of data points. But if there were 100 data points it would be silly to try
to fit a polynomial of degree 99 through all of them. It would be a great deal of work and anyway
polynomials of high degree can be very oscillatory giving poor approximations between the data points
to the underlying function.

What are splines?


Instead of using a single polynomial valid for all x, we use one polynomial for x1 < x < x2, then a different
polynomial for x2 < x < x3, then a different one again for x3 < x < x4, and so on.
We have already seen one instance of this approach in this Section. The “dot to dot” interpolation
that we abandoned earlier (Figure 1(a)) is an example of a linear spline. There is a different straight
line between each pair of data points.
The most commonly used splines are cubic splines. We use a different polynomial of degree three
between each pair of data points. Let s = s(x) denote a cubic spline, then

s(x) = a1 (x − x1)^3 + b1 (x − x1)^2 + c1 (x − x1) + d1   (x1 < x < x2)
s(x) = a2 (x − x2)^3 + b2 (x − x2)^2 + c2 (x − x2) + d2   (x2 < x < x3)
s(x) = a3 (x − x3)^3 + b3 (x − x3)^2 + c3 (x − x3) + d3   (x3 < x < x4)
and so on.

And we need to find a1 , b1 , c1 , d1 , a2 , . . . to determine the full form for the spline s(x). Given the
large number of quantities that have to be assigned (four for every pair of adjacent data points) it
is possible to give s some very nice properties:

• s(x1 ) = f1 , s(x2 ) = f2 , s(x3 ) = f3 , . . . . This is the least we should expect, as it simply states
that s interpolates the given data.

• s'(x) is continuous at the data points. This means that there are no “corners” at the data
points - the whole curve is smooth.

• s''(x) is continuous. This reduces the occurrence of points of inflection appearing at the data
points and leads to a smooth interpolant.

Even with all of these requirements there are still two more properties we can assign to s. A natural
cubic spline is one for which s'' is zero at the two end points. The natural cubic spline is, in some
sense, the smoothest possible spline, for it minimises a measure of the curvature.

How is a spline found?
Now that we have described what a natural cubic spline is, we briefly describe how it is found.
Suppose that there are N data points. For a natural cubic spline we require s''(x1) = s''(xN) = 0,
and the values of s'' taken at the other data points are found from the system of equations in Key Point 4.

Key Point 4
Cubic Spline Equations
    
The second derivatives s''(x2), . . . , s''(xN−1) satisfy the tridiagonal system

  [ k2    h2                  ] [ s''(x2)   ]   [ r2    ]
  [ h2    k3    h3            ] [ s''(x3)   ]   [ r3    ]
  [       .     .     .       ] [ ...       ] = [ ...   ]
  [       hN−3  kN−2  hN−2    ] [ s''(xN−2) ]   [ rN−2  ]
  [             hN−2  kN−1    ] [ s''(xN−1) ]   [ rN−1  ]

in which
h1 = x2 − x1,  h2 = x3 − x2,  h3 = x4 − x3,  h4 = x5 − x4, . . .
k2 = 2(h1 + h2),  k3 = 2(h2 + h3),  k4 = 2(h3 + h4), . . .
r2 = 6[(f3 − f2)/h2 − (f2 − f1)/h1],  r3 = 6[(f4 − f3)/h3 − (f3 − f2)/h2], . . .

Admittedly the system of equations in Key Point 4 looks unappealing, but this is a “nice” system
of equations. It was pointed out at the end of Workbook 30 that some applications lead to systems of
equations involving matrices which are strictly diagonally dominant. The matrix above is of that
type since the diagonal entry is always twice as big as the sum of the off-diagonal entries.
Once the system of equations is solved for the second derivatives s'', the spline s can be found as
follows:
ai = [s''(xi+1) − s''(xi)] / (6hi),   bi = s''(xi)/2,
ci = (fi+1 − fi)/hi − [s''(xi+1) + 2s''(xi)] hi/6,   di = fi
We now present an Example illustrating this approach.


Example 9
Find the natural cubic spline which interpolates the data
xj 1 3 5 8
fj 0.85 0.72 0.34 0.67

Solution
In the notation now established we have h1 = 2, h2 = 2 and h3 = 3. For a natural cubic spline we
require s'' to be zero at x1 and x4. Values of s'' at the other data points are found from the system
of equations given in Key Point 4. In this case the matrix is just 2 × 2 and the pair of equations is:

h1 s''(x1) + 2(h1 + h2) s''(x2) + h2 s''(x3) = 6[(f3 − f2)/h2 − (f2 − f1)/h1]   (with s''(x1) = 0)
h2 s''(x2) + 2(h2 + h3) s''(x3) + h3 s''(x4) = 6[(f4 − f3)/h3 − (f3 − f2)/h2]   (with s''(x4) = 0)

In this case the equations become

[ 8  2 ] [ s''(x2) ]   [ −0.75 ]
[ 2 10 ] [ s''(x3) ] = [  1.8  ]

Solving the coupled pair of equations leads to
s''(x2) = −0.146053,  s''(x3) = 0.209211
We now find the coefficients a1, b1, etc. from the formulae and deduce that the spline is given by

s(x) = −0.01217(x − 1)^3 − 0.016316(x − 1) + 0.85   (1 < x < 3)
s(x) = 0.029605(x − 3)^3 − 0.073026(x − 3)^2 − 0.162368(x − 3) + 0.72   (3 < x < 5)
s(x) = −0.01162(x − 5)^3 + 0.104605(x − 5)^2 − 0.099211(x − 5) + 0.34   (5 < x < 8)

Figure 5 shows how the spline interpolates the data.

Figure 5
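A numerical check of Example 9 (a sketch in Python; the helper name is ours and it is hard-wired to the four-point case, where the Key Point 4 system is just 2 × 2):

```python
def natural_cubic_spline4(xs, fs):
    """Natural cubic spline through four points: returns the list of second
    derivatives [s''(x1), ..., s''(x4)] and per-interval (a, b, c, d)."""
    h = [xs[i + 1] - xs[i] for i in range(3)]
    r2 = 6 * ((fs[2] - fs[1]) / h[1] - (fs[1] - fs[0]) / h[0])
    r3 = 6 * ((fs[3] - fs[2]) / h[2] - (fs[2] - fs[1]) / h[1])
    k2, k3 = 2 * (h[0] + h[1]), 2 * (h[1] + h[2])
    # Solve  k2*s2 + h2*s3 = r2,  h2*s2 + k3*s3 = r3  (s'' = 0 at the ends)
    det = k2 * k3 - h[1] * h[1]
    s = [0.0, (r2 * k3 - h[1] * r3) / det, (k2 * r3 - h[1] * r2) / det, 0.0]
    coeffs = []
    for i in range(3):
        a = (s[i + 1] - s[i]) / (6 * h[i])
        b = s[i] / 2
        c = (fs[i + 1] - fs[i]) / h[i] - (s[i + 1] + 2 * s[i]) * h[i] / 6
        coeffs.append((a, b, c, fs[i]))
    return s, coeffs

s, coeffs = natural_cubic_spline4([1, 3, 5, 8], [0.85, 0.72, 0.34, 0.67])
assert abs(s[1] - (-0.146053)) < 1e-5 and abs(s[2] - 0.209211) < 1e-5
assert abs(coeffs[0][0] - (-0.012171)) < 1e-5  # a1 as in Example 9
```

For more than four points the same h, k, r recipe produces the full tridiagonal system of Key Point 4, which is usually solved with a dedicated tridiagonal (Thomas) solver.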

Task
Find the natural cubic spline which interpolates the data
xj 1 2 3 5
fj 0.1 0.24 0.67 0.91

Your solution

Answer
In the notation now established we have h1 = 1, h2 = 1 and h3 = 2. For a natural cubic spline we
require s'' to be zero at x1 and x4. Values of s'' at the other data points are found from the system
of equations

h1 s''(x1) + 2(h1 + h2) s''(x2) + h2 s''(x3) = 6[(f3 − f2)/h2 − (f2 − f1)/h1]   (with s''(x1) = 0)
h2 s''(x2) + 2(h2 + h3) s''(x3) + h3 s''(x4) = 6[(f4 − f3)/h3 − (f3 − f2)/h2]   (with s''(x4) = 0)

In this case the equations become

[ 4  1 ] [ s''(x2) ]   [  1.74 ]
[ 1  6 ] [ s''(x3) ] = [ −1.86 ]

Solving the coupled pair of equations leads to s''(x2) = 0.534783, s''(x3) = −0.399130.
We now find the coefficients a1 , b1 , etc. from the formulae and deduce that the spline is

s(x) = 0.08913(x − 1)³ + 0.05087(x − 1) + 0.1    (1 < x < 2)
s(x) = −0.15565(x − 2)³ + 0.267391(x − 2)² + 0.318261(x − 2) + 0.24    (2 < x < 3)
s(x) = 0.033261(x − 3)³ − 0.199565(x − 3)² + 0.386087(x − 3) + 0.67    (3 < x < 5)
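As a check on the arithmetic, the system of equations above can be assembled and solved in a few lines. The following Python sketch is not part of the Workbook (the function name is ours); applied to the Task data x = 1, 2, 3, 5 and f = 0.1, 0.24, 0.67, 0.91 it reproduces the values s''(x2) = 0.534783 and s''(x3) = −0.399130 found above.

```python
def natural_spline_second_derivs(x, f):
    """Second derivatives s''(x1), ..., s''(xn) of the natural cubic spline
    through the points (x[i], f[i]); s'' is zero at both end points."""
    n = len(x)
    h = [x[i + 1] - x[i] for i in range(n - 1)]
    m = n - 2                      # number of unknown interior values
    A = [[0.0] * m for _ in range(m)]
    b = [0.0] * m
    for i in range(m):             # row i is the equation centred on x[i+1]
        A[i][i] = 2.0 * (h[i] + h[i + 1])
        if i > 0:
            A[i][i - 1] = h[i]
        if i < m - 1:
            A[i][i + 1] = h[i + 1]
        b[i] = 6.0 * ((f[i + 2] - f[i + 1]) / h[i + 1]
                      - (f[i + 1] - f[i]) / h[i])
    # Gaussian elimination on the (tridiagonal) system - fine for small n
    for i in range(1, m):
        factor = A[i][i - 1] / A[i - 1][i - 1]
        for j in range(m):
            A[i][j] -= factor * A[i - 1][j]
        b[i] -= factor * b[i - 1]
    s = [0.0] * m
    for i in range(m - 1, -1, -1):
        s[i] = (b[i] - sum(A[i][j] * s[j] for j in range(i + 1, m))) / A[i][i]
    return [0.0] + s + [0.0]       # natural end conditions

sdd = natural_spline_second_derivs([1.0, 2.0, 3.0, 5.0],
                                   [0.1, 0.24, 0.67, 0.91])
```

The spline coefficients then follow from the same formulae used by hand, for example a1 = (s''(x2) − s''(x1))/(6 h1).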


Exercises
1. A political analyst is preparing a dossier involving the following data

x 10 15 20 25
f (x) 9.23 8.41 7.12 4.13

She interpolates the data with a polynomial p(x) of degree 3 in order to find an approximation
p(22) to f (22). What value does she find for p(22)?

2. Estimate f (2) to an appropriate accuracy from the table of values below by means of an
appropriate quadratic interpolating polynomial.

x 1 3 3.5 6
f (x) 99.8 295.5 342.9 564.6

3. An experiment is carried out and the data obtained as follows

xn 2 3 5 7
fn 2.2 5.4 6.5 13.2

Obtain the least squares best fit straight line, y = mx + c, to these data. (Give c and m to 2
decimal places.)

4. Find the natural cubic spline which interpolates the data

xj 2 4 5 7
fj 1.34 1.84 1.12 0.02

Answers

1. We are interested in the Lagrange polynomials at the point x = 22 so we consider

L1(22) = (22 − x2)(22 − x3)(22 − x4) / [(x1 − x2)(x1 − x3)(x1 − x4)]
       = (22 − 15)(22 − 20)(22 − 25) / [(10 − 15)(10 − 20)(10 − 25)] = 0.056.

Similar calculations for the other Lagrange polynomials give

L2 (22) = −0.288, L3 (22) = 1.008, L4 (22) = 0.224,

and we find that our interpolated polynomial, evaluated at x = 22 is

p(22) = f1 L1(22) + f2 L2(22) + f3 L3(22) + f4 L4(22)
      = 9.23 × 0.056 + 8.41 × (−0.288) + 7.12 × 1.008 + 4.13 × 0.224
      = 6.197
      = 6.20, to 2 decimal places,

which serves as the approximation to f (22).
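The Lagrange calculation above can be checked numerically. The following Python sketch (not part of the Workbook; the function name is ours) builds each Li(x) as a product of ratios, exactly as in the hand calculation, and evaluates the degree-3 interpolant at x = 22.

```python
def lagrange_eval(xs, fs, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, fs) at x."""
    total = 0.0
    for i, (xi, fi) in enumerate(zip(xs, fs)):
        L = 1.0                       # build L_i(x) as a product of ratios
        for j, xj in enumerate(xs):
            if j != i:
                L *= (x - xj) / (xi - xj)
        total += fi * L
    return total

p22 = lagrange_eval([10, 15, 20, 25], [9.23, 8.41, 7.12, 4.13], 22)
```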

2.
f(2) = [(2 − 1)(2 − 3) / ((3.5 − 1)(3.5 − 3))] × 342.9 + [(2 − 1)(2 − 3.5) / ((3 − 1)(3 − 3.5))] × 295.5
       + [(2 − 3)(2 − 3.5) / ((1 − 3)(1 − 3.5))] × 99.8
     = −274.32 + 443.25 + 29.94
     = 198.87

Estimate is 199 (to 3 sig. fig.)

3. We tabulate the data for convenience:

    xn    fn     xn²   xn fn
    2     2.2    4     4.4
    3     5.4    9     16.2
    5     6.5    25    32.5
    7     13.2   49    92.4
 Σ  17    27.3   87    145.5

The quantity Σ1 counts the number of data points and in this case is equal to 4.
It follows that the pair of equations for m and c are as follows:

4c + 17m = 27.3
17c + 87m = 145.5

Solving these gives c = −1.67 and m = 2.00, to 2 decimal places, and we see that the least
squares best fit straight line to the given data is

y = −1.67 + 2.00x
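The normal equations can be set up and solved directly. This Python sketch (not part of the Workbook; the function name is ours) recomputes the tabulated sums and solves the 2 × 2 system for c and m.

```python
def least_squares_line(xs, fs):
    """Return (c, m) for the least squares best fit line y = mx + c."""
    n = len(xs)
    sx = sum(xs)
    sf = sum(fs)
    sxx = sum(x * x for x in xs)
    sxf = sum(x * f for x, f in zip(xs, fs))
    # Normal equations:  n*c + sx*m = sf   and   sx*c + sxx*m = sxf
    det = n * sxx - sx * sx
    c = (sf * sxx - sx * sxf) / det
    m = (n * sxf - sx * sf) / det
    return c, m

c, m = least_squares_line([2, 3, 5, 7], [2.2, 5.4, 6.5, 13.2])
```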



4. In the notation now established we have h1 = 2, h2 = 1 and h3 = 2. For a natural cubic
spline we require s'' to be zero at x1 and x4 . Values of s'' at the other data points are found
from the system of equations

h1 s''(x1) + 2(h1 + h2) s''(x2) + h2 s''(x3) = 6( (f3 − f2)/h2 − (f2 − f1)/h1 ),    with s''(x1) = 0

h2 s''(x2) + 2(h2 + h3) s''(x3) + h3 s''(x4) = 6( (f4 − f3)/h3 − (f3 − f2)/h2 ),    with s''(x4) = 0

In this case the equations become

[ 6  1 ] [ s''(x2) ]   [ −5.82 ]
[ 1  6 ] [ s''(x3) ] = [  1.02 ]

Solving the coupled pair of equations leads to

s''(x2) = −1.026857    s''(x3) = 0.341143

We now find the coefficients a1 , b1 , etc. from the formulae and deduce that the spline is given
by

s(x) = −0.08557(x − 2)³ + 0.592286(x − 2) + 1.34    (2 < x < 4)
s(x) = 0.228(x − 4)³ − 0.513429(x − 4)² − 0.434571(x − 4) + 1.84    (4 < x < 5)
s(x) = −0.02843(x − 5)³ + 0.170571(x − 5)² − 0.777429(x − 5) + 1.12    (5 < x < 7)

 

Numerical Integration 31.2
Introduction
In this Section we will present some methods that can be used to approximate integrals. Attention
will be paid to how we ensure that such approximations can be guaranteed to be of a certain level
of accuracy.

 

Prerequisites
Before starting this Section you should . . .
• review previous material on integrals and integration

Learning Outcomes
On completion you should be able to . . .
• approximate certain integrals
• be able to ensure that these approximations are of some desired accuracy

1. Numerical integration
The aim in this Section is to describe numerical methods for approximating integrals of the form
∫_a^b f(x) dx

One motivation for this is in the material on probability that appears in Workbook 39. Normal distributions
can be analysed by working out

∫_a^b (1/√(2π)) e^(−x²/2) dx

for certain values of a and b. It turns out that it is not possible, using the kinds of functions most
engineers would care to know about, to write down a function with derivative equal to (1/√(2π)) e^(−x²/2), so
values of the integral are approximated instead. Tables of numbers giving the value of this integral
for different interval widths appeared at the end of Workbook 39, and it is known that these tables are
accurate to the number of decimal places given. How can this be known? One aim of this Section
is to give a possible answer to that question.
It is clear that, not only do we need a way of approximating integrals, but we also need a way of
working out the accuracy of the approximations if we are to be sure that our tables of numbers are
to be relied on.
In this Section we address both of these points, beginning with a simple approximation method.

2. The simple trapezium rule


The first approximation we shall look at involves finding the area under a straight line, rather than
the area under a curve f . Figure 6 shows it best.

[Figure 6: the area under the graph of f between x = a and x = b, approximated by a trapezium
with parallel sides of height f(a) and f(b) and width b − a]

Figure 6

We approximate as follows

∫_a^b f(x) dx = grey shaded area
             ≈ area of the trapezium surrounding the shaded region
             = width of trapezium × average height of the two sides
             = (1/2)(b − a)( f(a) + f(b) )

Key Point 5
Simple Trapezium Rule

The simple trapezium rule for approximating ∫_a^b f(x) dx is given by approximating the area under
the graph of f by the area of a trapezium.

The formula is:

∫_a^b f(x) dx ≈ (1/2)(b − a)( f(a) + f(b) )

Or, to put it another way that may prove helpful a little later on,

∫_a^b f(x) dx ≈ (1/2) × (interval width) × ( f(left-hand end) + f(right-hand end) )

Next we show some instances of implementing this method.

Example 10
Approximate each of these integrals using the simple trapezium rule

(a) ∫_0^(π/4) sin(x) dx    (b) ∫_1^2 e^(−x²/2) dx    (c) ∫_0^2 cosh(x) dx

Solution

(a) ∫_0^(π/4) sin(x) dx ≈ (1/2)(b − a)(sin(a) + sin(b)) = (1/2)(π/4 − 0)(0 + 1/√2) = 0.27768,

(b) ∫_1^2 e^(−x²/2) dx ≈ (1/2)(b − a)(e^(−a²/2) + e^(−b²/2)) = (1/2)(2 − 1)(e^(−1/2) + e^(−2)) = 0.37093,

(c) ∫_0^2 cosh(x) dx ≈ (1/2)(b − a)(cosh(a) + cosh(b)) = (1/2)(2 − 0)(1 + cosh(2)) = 4.76220,

where all three answers are given to 5 decimal places.
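The rule is a one-line function in code. This Python sketch (not part of the Workbook; the function name is ours) reproduces the three values in Example 10.

```python
import math

def simple_trapezium(f, a, b):
    """Approximate the integral of f over [a, b] by a single trapezium."""
    return 0.5 * (b - a) * (f(a) + f(b))

approx_sin = simple_trapezium(math.sin, 0.0, math.pi / 4)
approx_exp = simple_trapezium(lambda x: math.exp(-x * x / 2), 1.0, 2.0)
approx_cosh = simple_trapezium(math.cosh, 0.0, 2.0)
```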

It is important to note that, although we have given these integral approximations to 5 decimal places,
this does not mean that they are accurate to that many places. We will deal with the accuracy of
our approximations later in this Section. Next are some Tasks for you to try.

Task
Approximate the following integrals using the simple trapezium method

(a) ∫_1^5 √x dx    (b) ∫_1^2 ln(x) dx

Your solution

Answer

(a) ∫_1^5 √x dx ≈ (1/2)(b − a)(√a + √b) = (1/2)(5 − 1)(1 + √5) = 6.47214

(b) ∫_1^2 ln(x) dx ≈ (1/2)(b − a)(ln(a) + ln(b)) = (1/2)(2 − 1)(0 + ln(2)) = 0.34657

The answer you obtain for this next Task can be checked against the table of results in Workbook 39
concerning the Normal distribution or in a standard statistics textbook.

Task
Use the simple trapezium method to approximate ∫_0^1 (1/√(2π)) e^(−x²/2) dx

Your solution

Answer
We find that

∫_0^1 (1/√(2π)) e^(−x²/2) dx ≈ (1/2)(1 − 0)(1/√(2π))(1 + e^(−1/2)) = 0.32046

to 5 decimal places.
So we have a means of approximating ∫_a^b f(x) dx. The question remains whether or not it is a good
approximation.

How good is the simple trapezium rule?
We define eT , the error in the simple trapezium rule, to be the difference between the actual value of
the integral and our approximation to it, that is

eT = ∫_a^b f(x) dx − (1/2)(b − a)( f(a) + f(b) )

It is enough for our purposes here to omit some theory and skip straight to the result of interest. In
many different textbooks on the subject it is shown that

eT = −(1/12)(b − a)³ f''(c)
where c is some number between a and b. (The principal drawback with this expression for eT is
that we do not know what c is, but we will find a way to work around that difficulty later.)
It is worth pausing to ask what meaning we can attach to this expression for eT . There are two
factors which can influence eT :
1. If b − a is small then, clearly, eT will most probably also be small. This seems sensible enough -
if the integration interval is a small one then there is “less room” to accumulate a large error.
(This observation forms part of the motivation for the composite trapezium rule discussed later
in this Section.)

2. If f'' is small everywhere in a < x < b then eT will be small. This reflects the fact that we
worked out the integral of a straight line function, instead of the integral of f . If f is a long
way from being a straight line then f'' will be large and so we must expect the error eT to be
large too.
We noted above that the expression for eT is less useful than it might be because it involves the
unknown quantity c. We perform a trade-off to get around this problem. The expression above gives
an exact value for eT , but we do not know enough to evaluate it. So we replace the expression with
one we can evaluate, but it will not be exact. We replace f''(c) with a worst case value to obtain
an upper bound on eT . This worst case value is the largest (positive or negative) value that f''(x)
achieves for a ≤ x ≤ b. This leads to

|eT| ≤ max_{a≤x≤b} |f''(x)| × (b − a)³/12.
We summarise this in Key Point 6.

Key Point 6
Error in the Simple Trapezium Rule

The error, |eT|, in the simple trapezium approximation to ∫_a^b f(x) dx is bounded above by

max_{a≤x≤b} |f''(x)| × (b − a)³/12

Example 11
Work out the error bound (to 6 decimal places) for the simple trapezium method
approximations to

(a) ∫_0^(π/4) sin(x) dx    (b) ∫_0^2 cosh(x) dx

Solution
In each case the trickiest part is working out the maximum value of |f''(x)|.

(a) Here f(x) = sin(x), therefore f'(x) = cos(x) and f''(x) = −sin(x). The function sin(x)
takes values between 0 and 1/√2 when x varies between 0 and π/4. Hence

eT < (1/√2) × (π/4)³/12 = 0.028548 to 6 decimal places.

(b) If f(x) = cosh(x) then f''(x) = cosh(x) too. The maximum value of cosh(x) for x between 0
and 2 will be cosh(2) = 3.762196, to 6 decimal places. Hence, in this case,

eT < 3.762196 × (2 − 0)³/12 = 2.508130 to 6 decimal places.

(In Example 11 we used a rounded value of cosh(2). To be on the safe side, it is best to round this
number up to make sure that we still have an upper bound on eT . In this case, of course, rounding
up is what we would naturally do, because the seventh decimal place was a 6.)

Task
Work out the error bound (to 5 significant figures) for the simple trapezium method
approximations to

(a) ∫_1^5 √x dx    (b) ∫_1^2 ln(x) dx

Your solution
(a)

Answer
If f(x) = √x = x^(1/2) then f'(x) = (1/2)x^(−1/2) and f''(x) = −(1/4)x^(−3/2).
The negative power here means that f'' takes its biggest value in magnitude at the left-hand end
of the interval [1, 5] and we see that max_{1≤x≤5} |f''(x)| = |f''(1)| = 1/4. Therefore

eT < (1/4) × 4³/12 = 1.3333 to 5 s.f.

Your solution
(b)

Answer
Here f(x) = ln(x) hence f'(x) = 1/x and f''(x) = −1/x².
It follows then that max_{1≤x≤2} |f''(x)| = 1 and we conclude that

eT < 1 × 1³/12 = 0.083333 to 5 s.f.

One deficiency in the simple trapezium rule is that there is nothing we can do to improve it. Having
computed an error bound to measure the quality of the approximation we have no way to go back and
work out a better approximation to the integral. It would be preferable if there were a parameter we
could alter to tune the accuracy of the method. The following approach uses the simple trapezium
method in a way that allows us to improve the accuracy of the answer we obtain.

3. The composite trapezium rule
The general idea here is to split the interval [a, b] into a sequence of N smaller subintervals of equal
width h = (b − a)/N . Then we apply the simple trapezium rule to each of the subintervals.
Figure 7 below shows the case where N = 2 (and ∴ h = (1/2)(b − a)). To simplify notation later on we
let f0 = f(a), f1 = f(a + h) and f2 = f(a + 2h) = f(b).

[Figure 7: the composite trapezium rule with N = 2, showing ordinates f0 , f1 , f2 at x = a, a + h, b]

Figure 7

Applying the simple trapezium rule to each subinterval we get

∫_a^b f(x) dx ≈ (area of first trapezium) + (area of second trapezium)
             = (1/2)h(f0 + f1) + (1/2)h(f1 + f2) = (1/2)h( f0 + 2f1 + f2 )

where we remember that the width of each of the subintervals is h, rather than the b − a we had in
the simple trapezium rule.
The next improvement will come from taking N = 3 subintervals (Figure 8). Here h = (1/3)(b − a)
is smaller than in Figure 7 above and we denote f0 = f(a), f1 = f(a + h), f2 = f(a + 2h) and
f3 = f(a + 3h) = f(b). (Notice that f1 and f2 mean something different from what they did in the
N = 2 case.)

[Figure 8: the composite trapezium rule with N = 3, showing ordinates f0 , f1 , f2 , f3 ]

Figure 8

As Figure 8 shows, the approximation is getting closer to the grey shaded area and in this case we
have

∫_a^b f(x) dx ≈ (1/2)h(f0 + f1) + (1/2)h(f1 + f2) + (1/2)h(f2 + f3)
             = (1/2)h( f0 + 2{f1 + f2} + f3 ).

The pattern is probably becoming clear by now, but here is one more improvement. In Figure 9,
N = 4, h = (1/4)(b − a) and we denote f0 = f(a), f1 = f(a + h), f2 = f(a + 2h), f3 = f(a + 3h)
and f4 = f(a + 4h) = f(b).

[Figure 9: the composite trapezium rule with N = 4, showing ordinates f0 to f4 ]

Figure 9

This leads to

∫_a^b f(x) dx ≈ (1/2)h(f0 + f1) + (1/2)h(f1 + f2) + (1/2)h(f2 + f3) + (1/2)h(f3 + f4)
             = (1/2)h( f0 + 2{f1 + f2 + f3} + f4 ).

We generalise this idea into the following Key Point.

Key Point 7
Composite Trapezium Rule

The composite trapezium rule for approximating ∫_a^b f(x) dx is carried out as follows:

1. Choose N , the number of subintervals,

2. ∫_a^b f(x) dx ≈ (1/2)h( f0 + 2{f1 + f2 + · · · + fN−1} + fN ),

where

h = (b − a)/N ,    f0 = f(a), f1 = f(a + h), . . . , fn = f(a + nh), . . . ,

and fN = f(a + N h) = f(b).

Example 12
Using 4 subintervals in the composite trapezium rule, and working to 6 decimal
places, approximate

∫_0^2 cosh(x) dx

Solution
In this case h = (2 − 0)/4 = 0.5.
We require cosh(x) evaluated at five x-values and the results are tabulated below to 6 d.p.

xn     fn = cosh(xn)
0      1.000000
0.5    1.127626
1      1.543081
1.5    2.352410
2      3.762196

It follows that

∫_0^2 cosh(x) dx ≈ (1/2)h( f0 + f4 + 2{f1 + f2 + f3} )
                 = (1/2)(0.5)( 1 + 3.762196 + 2{1.127626 + 1.543081 + 2.352410} )
                 = 3.702107

Task
Using 4 subintervals in the composite trapezium rule approximate

∫_1^2 ln(x) dx

Your solution

Answer
In this case h = (2 − 1)/4 = 0.25.
We require ln(x) evaluated at five x-values and the results are tabulated below to 6 d.p.

xn      fn = ln(xn)
1       0.000000
1.25    0.223144
1.5     0.405465
1.75    0.559616
2       0.693147

It follows that

∫_1^2 ln(x) dx ≈ (1/2)h( f0 + f4 + 2{f1 + f2 + f3} )
               = (1/2)(0.25)( 0 + 0.693147 + 2{0.223144 + 0.405465 + 0.559616} )
               = 0.383700
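The composite rule is also short in code. This Python sketch (not part of the Workbook; the function name is ours) implements the Key Point 7 formula and reproduces the value just found for ln(x) on [1, 2] with N = 4.

```python
import math

def composite_trapezium(f, a, b, N):
    """Composite trapezium rule with N subintervals of width h = (b-a)/N."""
    h = (b - a) / N
    total = f(a) + f(b)                                    # end values, weight 1
    total += 2.0 * sum(f(a + n * h) for n in range(1, N))  # interior, weight 2
    return 0.5 * h * total

approx_ln = composite_trapezium(math.log, 1.0, 2.0, 4)
```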

How good is the composite trapezium rule?
We can work out an upper bound on the error incurred by the composite trapezium method. Fortunately,
all we have to do here is apply the method for the error in the simple rule over and over
again. Let e_T^N denote the error in the composite trapezium rule with N subintervals. Then

|e_T^N| ≤ (h³/12) max_{1st subinterval} |f''(x)| + (h³/12) max_{2nd subinterval} |f''(x)| + . . . + (h³/12) max_{last subinterval} |f''(x)|

        = (h³/12) [ max_{1st subinterval} |f''(x)| + max_{2nd subinterval} |f''(x)| + . . . + max_{last subinterval} |f''(x)| ]    (N terms in the bracket)

This is all very well as a piece of theory, but it is awkward to use in practice. The process of working
out the maximum value of |f''| separately in each subinterval is very time-consuming. We can obtain
a more user-friendly, if less accurate, error bound by replacing each term in the last bracket above
with the biggest one. Hence we obtain

|e_T^N| ≤ N (h³/12) max_{a≤x≤b} |f''(x)|

This upper bound can be rewritten by recalling that N h = b − a, and we now summarise the result
in a Key Point.

Key Point 8
Error in the Composite Trapezium Rule

The error, |e_T^N|, in the N -subinterval composite trapezium approximation to ∫_a^b f(x) dx is bounded
above by

max_{a≤x≤b} |f''(x)| × (b − a)h²/12

Note: the special case when N = 1 is the simple trapezium rule, in which case b − a = h (refer to
Key Point 6 to compare).

The formula in Key Point 8 can be used to decide how many subintervals to use to guarantee a
specific accuracy.

Example 13
The function f is known to have a second derivative with the property that

|f''(x)| < 12

for x between 0 and 4.
Using the error bound given in Key Point 8 determine how many subintervals are
required so that the composite trapezium rule used to approximate

∫_0^4 f(x) dx

can be guaranteed to be in error by less than (1/2) × 10⁻³.

Solution
We require that

12 × (b − a)h²/12 < 0.0005

that is

4h² < 0.0005.

This implies that h² < 0.000125 and therefore h < 0.0111803.
Now N = (b − a)/h = 4/h and it follows that

N > 357.7708

Clearly, N must be a whole number and we conclude that the smallest number of subintervals which
guarantees an error smaller than 0.0005 is N = 358.

It is worth remembering that the error bound we are using here is a pessimistic one. We effectively
use the same (worst case) value for f''(x) all the way through the integration interval. Odds are that
fewer subintervals will give the required accuracy, but the value for N we found here will guarantee
a good enough approximation.
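The calculation in Example 13 rearranges the Key Point 8 bound for h and then rounds N up to a whole number. That step is easy to automate; the following Python sketch is not part of the Workbook and the function name is ours.

```python
import math

def trapezium_subintervals(second_deriv_bound, a, b, tol):
    """Smallest whole N for which the Key Point 8 bound
    second_deriv_bound * (b - a) * h**2 / 12 falls below tol."""
    h_max = math.sqrt(12.0 * tol / (second_deriv_bound * (b - a)))
    return math.ceil((b - a) / h_max)

N = trapezium_subintervals(12.0, 0.0, 4.0, 0.0005)
```

Applied to the Task that follows (bound 14, interval [−1, 4], tolerance 0.0001) it gives N = 1208, agreeing with the hand calculation.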
Next are two Tasks for you to try.

Task
The function f is known to have a second derivative with the property that

|f''(x)| < 14

for x between −1 and 4.
Using the error bound given in Key Point 8 determine how many subintervals are
required so that the composite trapezium rule used to approximate

∫_{−1}^4 f(x) dx

can be guaranteed to have an error less than 0.0001.

Your solution

Answer
We require that

14 × (b − a)h²/12 < 0.0001

that is

70h²/12 < 0.0001

This implies that h² < 0.00001714 and therefore h < 0.0041404.
Now N = (b − a)/h = 5/h and it follows that

N > 1207.6147

Clearly, N must be a whole number and we conclude that the smallest number of subintervals which
guarantees an error smaller than 0.0001 is N = 1208.

Task
It is given that the function e^(−x²/2) has a second derivative that is never greater
than 1 in absolute value.

(a) Use this fact to determine how many subintervals are required for the composite
trapezium method to deliver an approximation to

∫_0^1 (1/√(2π)) e^(−x²/2) dx

that is guaranteed to have an error less than (1/2) × 10⁻².

(b) Find an approximation to the integral that is in error by less than (1/2) × 10⁻².

Your solution
(a)

Answer
We require that (1/√(2π)) × (b − a)h²/12 < 0.005. This means that h² < 0.150398 and therefore, since
N = 1/h, it is necessary for N = 3 for the error bound to be less than ±(1/2) × 10⁻².
N = 1/h, it is necessary for N = 3 for the error bound to be less than ± 21 × 10−2 .

Your solution
(b)

Answer
To carry out the composite trapezium rule, with h = 1/3, we need to evaluate f(x) = (1/√(2π)) e^(−x²/2) at
x = 0, h, 2h, 1. This evaluation gives

f(0) = f0 = 0.39894,    f(h) = f1 = 0.37738,    f(2h) = f2 = 0.31945    and    f(1) = f3 = 0.24197,

all to 5 decimal places. It follows that

∫_0^1 (1/√(2π)) e^(−x²/2) dx ≈ (1/2)h( f0 + f3 + 2{f1 + f2} ) = 0.33910

We know from part (a) that this approximation is in error by less than (1/2) × 10⁻².

Example 14
Determine the minimum number of steps needed to guarantee an error not
exceeding ±0.001, when evaluating

∫_0^1 cosh(x²) dx

using the trapezium rule.

Solution
f(x) = cosh(x²)    f'(x) = 2x sinh(x²)    f''(x) = 2 sinh(x²) + 4x² cosh(x²)

Using the error formula in Key Point 8

E = −(1/12) h² {2 sinh(x²) + 4x² cosh(x²)},    x ∈ [0, 1]

|E|max occurs when x = 1, so

0.001 > (h²/12) {2 sinh(1) + 4 cosh(1)}
h² < 0.012/{2 sinh(1) + 4 cosh(1)}
⇒ h² < 0.001408
⇒ h < 0.037523
⇒ n ≥ 26.651
⇒ n = 27 needed

Task
Determine the minimum number of strips, n, needed to evaluate by the trapezium rule:

∫_0^(π/4) {3x² − 1.5 sin(2x)} dx

such that the error is guaranteed not to exceed ±0.005.

Your solution

Answer
f(x) = 3x² − 1.5 sin(2x)    f''(x) = 6 + 6 sin(2x)

|Error| will be maximum at x = π/4 so that sin(2x) = 1

E = −((b − a)/12) h² f''(x),    x ∈ [0, π/4]

E = −(π/48) h² × 6{1 + sin(2x)},    x ∈ [0, π/4]

|E|max = (π/48) h² (12) = πh²/4

We need πh²/4 < 0.005 ⇒ h² < 0.02/π ⇒ h < 0.07979

Now nh = (b − a) = π/4 so n = π/(4h)

We need n > π/(4 × 0.07979) = 9.844 so n = 10 required
4 × 0.07979

4. Other methods for approximating integrals
Here we briefly describe other methods that you may have heard, or get to hear, about. In the end
they all amount to the same sort of thing, that is we sample the integrand f at a few points in the
integration interval and then take a weighted average of all these f values. All that is needed to
implement any of these methods is the list of sampling points and the weight that should be attached
to each evaluation. Lists of these points and weights can be found in many books on the subject.

Simpson’s rule
This is based on passing a quadratic through three equally spaced points, rather than passing a
straight line through two points as we did for the simple trapezium rule. The composite Simpson’s
rule is given in the following Key Point.

Key Point 9
Composite Simpson's Rule

The composite Simpson's rule for approximating ∫_a^b f(x) dx is carried out as follows:

1. Choose N , which must be an even number of subintervals,

2. Calculate

∫_a^b f(x) dx ≈ (1/3)h( f0 + 4{f1 + f3 + f5 + · · · + fN−1} + 2{f2 + f4 + f6 + · · · + fN−2} + fN )

where

h = (b − a)/N ,    f0 = f(a), f1 = f(a + h), . . . , fn = f(a + nh), . . . ,

and fN = f(a + N h) = f(b).

The formula in Key Point 9 is slightly more complicated than the corresponding one for the composite
trapezium rule. One way of remembering the rule is to learn the pattern

1 4 2 4 2 4 2 ... 4 2 4 2 4 1

which shows that the end point values are multiplied by 1, the values with odd-numbered subscripts
are multiplied by 4 and the interior values with even subscripts are multiplied by 2.

Example 15
Using 4 subintervals in the composite Simpson's rule approximate

∫_0^2 cosh(x) dx.

Solution
In this case h = (2 − 0)/4 = 0.5.
We require cosh(x) evaluated at five x-values and the results are tabulated below to 6 d.p.

xn     fn = cosh(xn)
0      1.000000
0.5    1.127626
1      1.543081
1.5    2.352410
2      3.762196

It follows that

∫_0^2 cosh(x) dx ≈ (1/3)h( f0 + 4f1 + 2f2 + 4f3 + f4 )
                 = (1/3)(0.5)( 1 + 4 × 1.127626 + 2 × 1.543081 + 4 × 2.352410 + 3.762196 )
                 = 3.628083,

where this approximation is given to 6 decimal places.

This approximation to ∫_0^2 cosh(x) dx is closer to the true value of sinh(2) (which is 3.626860
to 6 d.p.) than we obtained when using the composite trapezium rule with the same number of
subintervals.
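The composite Simpson's rule of Key Point 9 can be coded using the 1 4 2 4 ... 2 4 1 weight pattern directly. The following Python sketch is not part of the Workbook (the function name is ours); it reproduces the value in Example 15 and confirms that the result sits within about 0.002 of the true value sinh(2).

```python
import math

def composite_simpson(f, a, b, N):
    """Composite Simpson's rule; N must be an even number of subintervals."""
    if N % 2 != 0:
        raise ValueError("N must be even for Simpson's rule")
    h = (b - a) / N
    total = f(a) + f(b)
    for n in range(1, N):
        weight = 4.0 if n % 2 == 1 else 2.0   # the 1 4 2 4 ... 2 4 1 pattern
        total += weight * f(a + n * h)
    return h * total / 3.0

simpson = composite_simpson(math.cosh, 0.0, 2.0, 4)
exact = math.sinh(2.0)
```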

Task
Using 4 subintervals in the composite Simpson's rule approximate

∫_1^2 ln(x) dx.

Your solution

Answer
In this case h = (2 − 1)/4 = 0.25. There will be five x-values and the results are tabulated below
to 6 d.p.

xn      fn = ln(xn)
1.00    0.000000
1.25    0.223144
1.50    0.405465
1.75    0.559616
2.00    0.693147

It follows that

∫_1^2 ln(x) dx ≈ (1/3)h( f0 + 4f1 + 2f2 + 4f3 + f4 )
               = (1/3)(0.25)( 0 + 4 × 0.223144 + 2 × 0.405465 + 4 × 0.559616 + 0.693147 )
               = 0.386260 to 6 d.p.

How good is the composite Simpson's rule?

On page 39 (Key Point 8) we saw a formula for an upper bound on the error in the composite
trapezium method. A corresponding result for the composite Simpson's rule exists and is given in
the following Key Point.

Key Point 10
Error in Composite Simpson's Rule

The error in the N -subinterval composite Simpson's rule approximation to ∫_a^b f(x) dx is bounded
above by

max_{a≤x≤b} |f^(iv)(x)| × (b − a)h⁴/180

(Here f^(iv) is the fourth derivative of f and h is the subinterval width, so N × h = b − a.)

The formula in Key Point 10 can be used to decide how many subintervals to use to guarantee a
specific accuracy.

Example 16
The function f is known to have a fourth derivative with the property that

|f^(iv)(x)| < 5

for x between 1 and 5. Determine how many subintervals are required so that the
composite Simpson's rule used to approximate

∫_1^5 f(x) dx

incurs an error that is guaranteed less than 0.005 .

Solution
We require that

5 × 4h⁴/180 < 0.005

This implies that h⁴ < 0.045 and therefore h < 0.460578.
Now N = 4/h and it follows that

N > 8.684741

For the composite Simpson's rule N must be an even whole number and we conclude that the
smallest number of subintervals which guarantees an error smaller than 0.005 is N = 10.
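Choosing N for Simpson's rule can be automated in the same way as for the trapezium rule, with one extra step: rounding up to the next even number. The following Python sketch is not part of the Workbook (the function name is ours).

```python
import math

def simpson_subintervals(fourth_deriv_bound, a, b, tol):
    """Smallest even N for which the Key Point 10 bound
    fourth_deriv_bound * (b - a) * h**4 / 180 falls below tol."""
    h_max = (180.0 * tol / (fourth_deriv_bound * (b - a))) ** 0.25
    N = math.ceil((b - a) / h_max)
    return N if N % 2 == 0 else N + 1     # Simpson's rule needs an even N

N = simpson_subintervals(5.0, 1.0, 5.0, 0.005)
```

For the Task that follows (bound 12, interval [2, 6], tolerance 0.0005) the same function gives N = 20, agreeing with the hand calculation.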

Task
The function f is known to have a fourth derivative with the property that

|f^(iv)(x)| < 12

for x between 2 and 6. Determine how many subintervals are required so that the
composite Simpson's rule used to approximate

∫_2^6 f(x) dx

incurs an error that is guaranteed less than 0.0005 .

Your solution

Answer
We require that

12 × 4h⁴/180 < 0.0005

This implies that h⁴ < 0.001875 and therefore h < 0.208090.
Now N = 4/h and it follows that

N > 19.222491

N must be an even whole number and we conclude that the smallest number of subintervals which
guarantees an error smaller than 0.0005 is N = 20.

The following Task is similar to one that we saw earlier in this Section (page 42). Using the composite
Simpson’s rule we can achieve greater accuracy, for a similar amount of effort, than we managed
using the composite trapezium rule.

Task
It is given that the function e^(−x²/2) has a fourth derivative that is never greater
than 3 in absolute value.

(a) Use this fact to determine how many subintervals are required for the composite Simpson's rule
to deliver an approximation to

∫_0^1 (1/√(2π)) e^(−x²/2) dx

that is guaranteed to have an error less than (1/2) × 10⁻⁴.

Your solution

Answer
We require that (3/√(2π)) × (b − a)h⁴/180 < 0.00005.
This means that h⁴ < 0.00751988 and therefore h < 0.294478. Since N = 1/h it is necessary for
N = 4 for the error bound to be guaranteed to be less than ±(1/2) × 10⁻⁴.

(b) Find an approximation to the integral that is in error by less than (1/2) × 10⁻⁴.

Your solution

Answer
In this case h = (1 − 0)/4 = 0.25. We require (1/√(2π)) e^(−x²/2) evaluated at five x-values and the
results are tabulated below to 6 d.p.

xn      (1/√(2π)) e^(−xn²/2)
0       0.398942
0.25    0.386668
0.5     0.352065
0.75    0.301137
1       0.241971

It follows that

∫_0^1 (1/√(2π)) e^(−x²/2) dx ≈ (1/3)h( f0 + 4f1 + 2f2 + 4f3 + f4 )
                             = (1/3)(0.25)( 0.398942 + 4 × 0.386668 + 2 × 0.352065
                               + 4 × 0.301137 + 0.241971 )
                             = 0.341355 to 6 d.p.

We know from part (a) that this approximation is in error by less than (1/2) × 10⁻⁴.

Example 17
Find out how many strips are needed to be sure that

∫_0^4 sinh(2t) dt

is evaluated by Simpson's rule with error less than ±0.0001

Solution
E = −((b − a)/180) h⁴ (16) sinh(2x),    0 < x < 4

|E| ≤ 64h⁴ sinh(8)/180 ≤ 0.0001

⇒ h⁴ ≤ 0.0180/(64 sinh(8)) ⇒ h ≤ 0.0208421

nh = b − a ⇒ n ≥ 4/0.0208421 = 191.92

So n = 192 is needed (minimum even number).

Engineering Example 1

Plastic bottle design

Introduction
Manufacturing containers is a large and varied industry and optimum packaging can save companies
millions of pounds. Although determining the capacity of a container and amount of material needed
can be done by physical experiment, mathematical modelling provides a cost-effective and efficient
means for the designer to experiment.
Problem in words
A manufacturer is designing a new plastic bottle to contain 900 ml of fabric softener. The bottle is
circular in cross section, with a varying radius given by
r = 4 + 0.5z − 0.07z 2 + 0.002z 3
where z is the height above the base in cm.
(a) Find an expression for the volume of the bottle and hence show that the fill level needs to be
approximately 18 cm.
(b) If the wall thickness of the plastic is 1 mm, show that this is always small compared to the
bottle radius.
(c) Hence, find the volume of plastic required to manufacture a bottle which is 20 cm tall (include
the plastic in the base and side walls), using a numerical method.
A graph of the radius against z is shown below:

[Figure 10: the bottle radius r (horizontal axis, −5 to 15) plotted against the height z (vertical axis, 0 to 20 cm)]

Figure 10
Mathematical statement of problem
Calculate all lengths in centimetres.

(a) The formula for the volume of a solid of revolution, revolved round the z axis between z = 0
and z = d, is ∫_0^d πr² dz. We have to evaluate this integral.

(b) To show that the thickness is small relative to the radius we need to find the minimum radius.

(c) Given that the thickness is small compared with the radius, the volume can be taken to be the
surface area times the thickness. Now the surface area of the base is easy to calculate, being
π × 4², but we also need to calculate the surface area for the sides, which is much harder.

For an element of height dz this is 2πr × (the slant height) of the surface between z and z + dz.
The slant height is, analytically, √(1 + (dr/dz)²) × dz, or equivalently the distance between
(r(z), z) and (r(z + dz), z + dz), which is easier to use numerically.

Analytically the surface area to height 20 is ∫_0^20 2πr √(1 + (dr/dz)²) dz; we shall approximate this
numerically. This will give the area of the side surface.
Mathematical analysis

(a) We could calculate this integral exactly, as the volume is ∫_0^d π(4 + 0.5z − 0.07z² + 0.002z³)² dz,
but here we do this numerically (which can often be a simpler approach and possibly is so here).
To do that we need to keep an eye on the likely error, and for this problem we shall ensure
the error in the integrals is less than 1 ml. The formula for the error with the trapezium
rule, with step h and integrated from 0 to 20 (assuming from the problem that we shall
not integrate over a larger range) is (20/12) h² max|f''|. Doing this crudely with f = πg² where
g(z) = 4 + 0.5z − 0.07z² + 0.002z³ we see that

|g(z)| ≤ 4 + 10 + 28 + 16 = 58 (using only positive signs and |z| ≤ 20)

and |g'(z)| ≤ 0.5 + 0.14z + 0.006z² ≤ 0.5 + 2.8 + 2.4 = 5.7 < 6,

and |g''(z)| ≤ 0.14 + 0.012z ≤ 0.38.

Therefore

|f''| = |2π(gg'' + (g')²)| ≤ 2(58 × 0.38 + 6²)π < 117π, so (20/12) h² max|f''| ≤ 613h².

We need h² < 1/613, or h < 0.0403. We will use h = 0.02, and the error will be at most 0.25.

The approximation to the integral from 0 to 18 is

(1/2)πg²(0)(0.02) + Σ_{i=1}^{899} πg²(0.02i)(0.02) + (1/2)πg²(18)(0.02)

(recalling the multiplying factor is a half for the first and last entries in the trapezium rule).
This yields a value of 899.72, which is certainly within 1 ml of 900.
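The h = 0.02 trapezium sum above is easy to reproduce. The following Python sketch is not part of the Engineering Example's text (the function names are ours); it evaluates the sum directly and lands within a few thousandths of a millilitre of 899.72.

```python
import math

def g(z):
    """Bottle radius (cm) at height z (cm)."""
    return 4 + 0.5 * z - 0.07 * z**2 + 0.002 * z**3

def bottle_volume(fill, h=0.02):
    """Trapezium-rule approximation to the integral of pi*g(z)**2
    from z = 0 up to z = fill, with half weights at the two ends."""
    N = round(fill / h)
    total = 0.5 * (math.pi * g(0.0) ** 2 + math.pi * g(fill) ** 2)
    total += sum(math.pi * g(i * h) ** 2 for i in range(1, N))
    return total * h

volume = bottle_volume(18.0)
```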

(b) From the graph the minimum radius looks to be about 2 at about z = 18. Looking more
exactly (found by solving the quadratic to find the places where the derivative is zero, or by
plotting the values and by inspection), the minimum is at z = 18.93, when r = 1.948 cm. So
the thickness is indeed small (always less than 0.06 of the radius at all places.)

(c) For the area of the side surface we shall calculate ∫_0^20 2πr √(1 + (dr/dz)²) dz numerically, using
the trapezium rule with step 0.02 as before. √(1 + (dr/dz)²) dz = √((dz)² + (dr)²), which we
shall approximate at point zn by √((zn+1 − zn)² + (rn+1 − rn)²), so evaluating r(z) at intervals
of 0.02 gives the approximation

πr(0)√((0.02)² + (r(0.02) − r(0))²) + Σ_{i=1}^{999} 2πr(0.02i)√((0.02)² + (r(0.02(i + 1)) − r(0.02i))²)
+ πr(20)√((0.02)² + (r(20) − r(19.98))²).

Calculating this gives 473 cm². Approximating the analytical expression by a direct numerical
calculation gives 474 cm². (The answer is between 473.5 and 473.6 cm², so this variation is
understandable and does not indicate an error.) The bottom surface area is 16π = 50.3 cm²,
so the total surface area we may take to be 474 + 50 = 524 cm², and hence the volume of
plastic is 524 × 0.1 = 52.4 cm³.

Mathematical comment
An alternative to using the trapezium rule is Simpson’s rule which will require many fewer steps.
When using a computer program such as Microsoft Excel having an efficient method may not be
important for a small problem but could be significant when many calculations are needed or com-
putational power is limited (such as if using a programmable calculator).
The reader is invited to repeat the calculations for (a) and (c) using Simpson’s rule.
The analytical answer to (a) is given by
∫₀¹⁸ π(16 + 4z − 0.31z² − 0.054z³ + 0.0069z⁴ − 0.00028z⁵ + 0.000004z⁶) dz

which gives 899.7223 to 4 d.p.
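The part (a) calculation is easy to reproduce by machine. The sketch below (Python rather than the spreadsheet approach mentioned above — either works) applies the composite trapezium rule with the profile g(z) and step h = 0.02 quoted in the text.

```python
import math

def g(z):
    # Radius profile of the vessel (from the text)
    return 4 + 0.5*z - 0.07*z**2 + 0.002*z**3

def f(z):
    # Integrand for the volume: cross-sectional area pi*g(z)^2
    return math.pi * g(z)**2

h = 0.02
n = int(round(18 / h))    # 900 subintervals from 0 to 18

# Composite trapezium rule: half weight at the two end points
volume = h * (0.5*f(0) + sum(f(i*h) for i in range(1, n)) + 0.5*f(18))
# volume is within 1 ml of the analytical value 899.7223
```

Replacing the weights by Simpson's pattern (1, 4, 2, 4, …, 4, 1, times h/3) carries out the check suggested in the mathematical comment above.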

Exercises
1. Using 4 subintervals in the composite trapezium rule approximate
∫₁⁵ √x dx.

2. The function f is known to have a second derivative with the property that

|f″(x)| < 12

for x between 2 and 3. Using the error bound given earlier in this Section determine how many
subintervals are required so that the composite trapezium rule used to approximate
∫₂³ f(x) dx

can be guaranteed to have an error in it that is less than 0.001.

3. Using 4 subintervals in the composite Simpson rule approximate
∫₁⁵ √x dx.

4. The function f is known to have a fourth derivative with the property that
|f⁽ⁱᵛ⁾(x)| < 6
for x between −1 and 5. Determine how many subintervals are required so that the composite
Simpson's rule used to approximate
∫₋₁⁵ f(x) dx

incurs an error that is less than 0.001.

5. Determine the minimum number of steps needed to guarantee an error not exceeding
±0.000001 when numerically evaluating
∫₂⁴ ln(x) dx

using Simpson’s rule.

Answers

1. In this case h = (5 − 1)/4 = 1. We require √x evaluated at five x-values and the results are
tabulated below

xₙ   fₙ = √xₙ
1    1
2    1.414214
3    1.732051
4    2.000000
5    2.236068

It follows that
∫₁⁵ √x dx ≈ h × ½(f₀ + f₄ + 2{f₁ + f₂ + f₃})
= ½(1)(1 + 2.236068 + 2{1.414214 + 1.732051 + 2})
= 6.764298.

2. We require that 12 × (b − a)h²/12 < 0.001. This implies that h < 0.0316228.
Now N = (b − a)/h = 1/h and it follows that

N > 31.6228

Clearly, N must be a whole number and we conclude that the smallest number of subintervals
which guarantees an error smaller than 0.001 is N = 32.

3. In this case h = (5 − 1)/4 = 1.
We require √x evaluated at five x-values and the results are as tabulated in the solution to
Exercise 1. It follows that
∫₁⁵ √x dx ≈ h × ⅓(f₀ + 4f₁ + 2f₂ + 4f₃ + f₄)
= ⅓(1)(1 + 4 × 1.414214 + 2 × 1.732051 + 4 × 2.000000 + 2.236068)
= 6.785675.
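Both of the worked answers above are easy to check with a short program; a minimal sketch of the two composite rules as used in the solutions to Exercises 1 and 3:

```python
import math

def trapezium(f, a, b, n):
    # Composite trapezium rule with n subintervals
    h = (b - a) / n
    return h * (0.5*f(a) + sum(f(a + i*h) for i in range(1, n)) + 0.5*f(b))

def simpson(f, a, b, n):
    # Composite Simpson rule with n subintervals (n must be even)
    h = (b - a) / n
    odds = sum(4 * f(a + i*h) for i in range(1, n, 2))
    evens = sum(2 * f(a + i*h) for i in range(2, n, 2))
    return h * (f(a) + odds + evens + f(b)) / 3

t = trapezium(math.sqrt, 1, 5, 4)   # Exercise 1: 6.764298
s = simpson(math.sqrt, 1, 5, 4)     # Exercise 3: 6.785675
```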

4. We require that 6 × 6h⁴/180 < 0.001. This implies that h⁴ < 0.005 and therefore h < 0.265915.
Now N = 6/h and it follows that N > 22.563619. We know that N must be an even whole
number and we conclude that the smallest number of subintervals which guarantees an error
smaller than 0.001 is N = 24.

5. f(x) = ln(x),  f⁽⁴⁾(x) = −6/x⁴

Error = −(b − a)h⁴f⁽⁴⁾(x)/180, with a = 2, b = 4

|E| = 2h⁴(6/x⁴)/180,  x ∈ [2, 4]

|E|max = (h⁴/15) × (1/2⁴) ≤ 0.000001

⇒ h⁴ ≤ 15 × 2⁴ × 0.000001 ⇒ h ≤ 0.124467

Now nh = (b − a) so
n ≥ 2/0.124467 ⇒ n ≥ 16.069568 ⇒ n = 18 (minimum even number)

Numerical Differentiation 31.3
Introduction
In this Section we will look at ways in which derivatives of a function may be approximated numerically.

 

Prerequisites
Before starting this Section you should . . .
• review previous material concerning differentiation

Learning Outcomes
On completion you should be able to . . .
• obtain numerical approximations to the first and second derivatives of certain functions

1. Numerical differentiation
This Section deals with ways of numerically approximating derivatives of functions. One reason for
dealing with this now is that we will use it briefly in the next Section. But as we shall see in these
next few pages, the technique is useful in itself.

2. First derivatives
Our aim is to approximate the slope of a curve f at a particular point x = a in terms of f (a) and
the value of f at a nearby point where x = a + h. The shorter broken line in Figure 11 may be thought
of as giving a reasonable approximation to the required slope (shown by the longer broken line), if h
is small enough.

[Graph: a curve y = f(x); the tangent at x = a (longer broken line) has slope f′(a), and the chord through the points at x = a and x = a + h (shorter broken line) has a slope approximating f′(a)]

Figure 11

So we might approximate
f′(a) ≈ slope of short broken line = (difference in the y-values)/(difference in the x-values) = (f(a + h) − f(a))/h.
This is called a one-sided difference or forward difference approximation to the derivative of f .
A second version of this arises on considering a point to the left of a, rather than to the right as we
did above. In this case we obtain the approximation
f′(a) ≈ (f(a) − f(a − h))/h
This is another one-sided difference, called a backward difference, approximation to f′(a).
A third method for approximating the first derivative of f can be seen in Figure 12.

[Graph: a curve y = f(x); the tangent at x = a (longer broken line) has slope f′(a), and the chord through the points at x = a − h and x = a + h (shorter broken line) has a slope approximating f′(a)]

Figure 12
Here we approximate as follows
f′(a) ≈ slope of short broken line = (difference in the y-values)/(difference in the x-values) = (f(a + h) − f(a − h))/(2h)
This is called a central difference approximation to f′(a).

Key Point 11
First Derivative Approximations
Three approximations to the derivative f′(a) are

1. the one-sided (forward) difference  (f(a + h) − f(a))/h

2. the one-sided (backward) difference  (f(a) − f(a − h))/h

3. the central difference  (f(a + h) − f(a − h))/(2h)

In practice, the central difference formula is the most accurate.
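The three formulas in Key Point 11 translate directly into code; a minimal sketch, illustrated on f(x) = cos(x) at a = π/3 (the function used in the Examples that follow), where the exact derivative is −sin(π/3):

```python
import math

def forward(f, a, h):
    # One-sided (forward) difference
    return (f(a + h) - f(a)) / h

def backward(f, a, h):
    # One-sided (backward) difference
    return (f(a) - f(a - h)) / h

def central(f, a, h):
    # Central difference
    return (f(a + h) - f(a - h)) / (2*h)

a, h = math.pi/3, 0.01
exact = -math.sin(a)            # -0.86602540 to 8 d.p.
fwd = forward(math.cos, a, h)
bwd = backward(math.cos, a, h)
ctr = central(math.cos, a, h)   # closest of the three to the exact value
```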

These first, rather artificial, examples will help fix our ideas before we move on to more realistic
applications.

Example 18
Use a forward difference, and the values of h shown, to approximate the derivative
of cos(x) at x = π/3.
(a) h = 0.1 (b) h = 0.01 (c) h = 0.001 (d) h = 0.0001
Work to 8 decimal places throughout.

Solution
(a) f′(a) ≈ (cos(a + h) − cos(a))/h = (0.41104381 − 0.5)/0.1 = −0.88956192
(b) f′(a) ≈ (cos(a + h) − cos(a))/h = (0.49131489 − 0.5)/0.01 = −0.86851095
(c) f′(a) ≈ (cos(a + h) − cos(a))/h = (0.49913372 − 0.5)/0.001 = −0.86627526
(d) f′(a) ≈ (cos(a + h) − cos(a))/h = (0.49991339 − 0.5)/0.0001 = −0.86605040

One advantage of doing a simple example first is that we can compare these approximations with
the ‘exact’ value which is

f′(a) = −sin(π/3) = −√3/2 = −0.86602540 to 8 d.p.

Note that the accuracy levels of the four approximations in Example 18 are:

(a) 1 d.p. (b) 2 d.p. (c) 3 d.p. (d) 3 d.p. (almost 4 d.p.)

The errors to 6 d.p. are:

(a) 0.023537 (b) 0.002486 (c) 0.000250 (d) 0.000025

Notice that the errors reduce by about a factor of 10 each time.

Example 19
Use a central difference, and the value of h shown, to approximate the derivative
of cos(x) at x = π/3.
(a) h = 0.1 (b) h = 0.01 (c) h = 0.001 (d) h = 0.0001
Work to 8 decimal places throughout.

Solution
(a) f′(a) ≈ (cos(a + h) − cos(a − h))/(2h) = (0.41104381 − 0.58396036)/0.2 = −0.86458275
(b) f′(a) ≈ (cos(a + h) − cos(a − h))/(2h) = (0.49131489 − 0.50863511)/0.02 = −0.86601097
(c) f′(a) ≈ (cos(a + h) − cos(a − h))/(2h) = (0.49913372 − 0.50086578)/0.002 = −0.86602526
(d) f′(a) ≈ (cos(a + h) − cos(a − h))/(2h) = (0.49991339 − 0.50008660)/0.0002 = −0.86602540

This time successive approximations generally have two extra accurate decimal places indicating a
superior formula. This is illustrated again in the following Task.

Task
Let f (x) = ln(x) and a = 3. Using both a forward difference and a central
difference, and working to 8 decimal places, approximate f′(a) using h = 0.1 and
h = 0.01.
(Note that this is another example where we can work out the exact answer, which
in this case is 1/3.)

Your solution

Answer
Using the forward difference we find, for h = 0.1
f′(a) ≈ (ln(a + h) − ln(a))/h = (1.13140211 − 1.09861229)/0.1 = 0.32789823
and for h = 0.01 we obtain
f′(a) ≈ (ln(a + h) − ln(a))/h = (1.10194008 − 1.09861229)/0.01 = 0.33277901
Using central differences the two approximations to f′(a) are
f′(a) ≈ (ln(a + h) − ln(a − h))/(2h) = (1.13140211 − 1.06471074)/0.2 = 0.33345687
and
f′(a) ≈ (ln(a + h) − ln(a − h))/(2h) = (1.10194008 − 1.09527339)/0.02 = 0.33333457
The accurate answer is, of course, 0.33333333

There is clearly little point in studying this technique if all we ever do is approximate quantities we
could find exactly in another way. The following example is one in which this so-called differencing
method is the best approach.

Example 20
The distance x of a runner from a fixed point is measured (in metres) at intervals
of half a second. The data obtained are
t 0.0 0.5 1.0 1.5 2.0
x 0.00 3.65 6.80 9.90 12.15
Use central differences to approximate the runner’s velocity at times t = 0.5 s and
t = 1.25 s.

Solution
Our aim here is to approximate x′(t). The choice of h is dictated by the available data given in the table.
Using data with t = 0.5 s at its centre we obtain
x′(0.5) ≈ (x(1.0) − x(0.0))/(2 × 0.5) = 6.80 m s⁻¹.
Data centred at t = 1.25 s gives us the approximation
x′(1.25) ≈ (x(1.5) − x(1.0))/(2 × 0.25) = 6.20 m s⁻¹.
Note the value of h used.

Task
The velocity v (in m s−1 ) of a rocket measured at half second intervals is
t 0.0 0.5 1.0 1.5 2.0
v 0.000 11.860 26.335 41.075 59.051
Use central differences to approximate the acceleration of the rocket at times
t = 1.0 s and t = 1.75 s.

Your solution

Answer
Using data with t = 1.0 s at its centre we obtain
v′(1.0) ≈ (v(1.5) − v(0.5))/1.0 = 29.215 m s⁻².
Data centred at t = 1.75 s gives us the approximation
v′(1.75) ≈ (v(2.0) − v(1.5))/0.5 = 35.952 m s⁻².

3. Second derivatives
An approach which has been found to work well for second derivatives involves applying the notion
of a central difference three times. We begin with
f″(a) ≈ (f′(a + ½h) − f′(a − ½h))/h.
Next we approximate the two derivatives in the numerator of this expression using central differences as follows:
f′(a + ½h) ≈ (f(a + h) − f(a))/h  and  f′(a − ½h) ≈ (f(a) − f(a − h))/h.

Combining these three results gives

f″(a) ≈ (f′(a + ½h) − f′(a − ½h))/h
≈ (1/h)[(f(a + h) − f(a))/h − (f(a) − f(a − h))/h]
= (f(a + h) − 2f(a) + f(a − h))/h²

Key Point 12
Second Derivative Approximation
A central difference approximation to the second derivative f″(a) is

f″(a) ≈ (f(a + h) − 2f(a) + f(a − h))/h²
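Key Point 12 in code — a minimal sketch, checked against f(x) = cos(x) at a = π/3, where the second derivative is exactly −cos(π/3) = −0.5:

```python
import math

def second_derivative(f, a, h):
    # Central difference approximation to f''(a)
    return (f(a + h) - 2*f(a) + f(a - h)) / h**2

d2 = second_derivative(math.cos, math.pi/3, 0.01)
# d2 lies very close to the exact value -0.5
```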

Example 21
The distance x of a runner from a fixed point is measured (in metres) at intervals
of half a second. The data obtained are
t 0.0 0.5 1.0 1.5 2.0
x 0.00 3.65 6.80 9.90 12.15
Use a central difference to approximate the runner’s acceleration at t = 1.5 s.

Solution
Our aim here is to approximate x″(t).
Using data with t = 1.5 s at its centre we obtain

x″(1.5) ≈ (x(2.0) − 2x(1.5) + x(1.0))/0.5² = −3.40 m s⁻²,
from which we see that the runner is slowing down.

Exercises
1. Let f(x) = cosh(x) and a = 2. Let h = 0.01 and approximate f′(a) using forward, backward
and central differences. Work to 8 decimal places and compare your answers with the exact
result, which is sinh(2).

2. The distance x, measured in metres, of a downhill skier from a fixed point is measured at
intervals of 0.25 s. The data gathered are

t  0  0.25  0.5   0.75  1     1.25  1.5
x  0  4.3   10.2  17.2  26.2  33.1  39.1

Use a central difference to approximate the skier’s velocity and acceleration at the times
t =0.25 s, 0.75 s and 1.25 s. Give your answers to 1 decimal place.
Answers
1. Forward: f′(a) ≈ (cosh(a + h) − cosh(a))/h = (3.79865301 − 3.76219569)/0.01 = 3.64573199
Backward: f′(a) ≈ (cosh(a) − cosh(a − h))/h = (3.76219569 − 3.72611459)/0.01 = 3.60810972
Central: f′(a) ≈ (cosh(a + h) − cosh(a − h))/(2h) = (3.79865301 − 3.72611459)/0.02 = 3.62692086
The accurate result is sinh(2) = 3.62686041.

2. Velocities at the given times approximated by a central difference are:

20.4 m s⁻¹, 32.0 m s⁻¹ and 25.8 m s⁻¹.

Accelerations at these times approximated by a central difference are:

25.6 m s⁻², 32.0 m s⁻² and −14.4 m s⁻².
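The differencing in Exercise 2 can be applied directly to the tabulated data; a sketch of the calculation:

```python
# Skier data: distance x (m) sampled every h = 0.25 s
x = [0, 4.3, 10.2, 17.2, 26.2, 33.1, 39.1]
h = 0.25

def velocity(i):
    # Central difference for x'(t) at sample i
    return (x[i+1] - x[i-1]) / (2*h)

def acceleration(i):
    # Central difference for x''(t) at sample i
    return (x[i+1] - 2*x[i] + x[i-1]) / h**2

# t = 0.25 s, 0.75 s and 1.25 s correspond to samples 1, 3 and 5
v = [velocity(i) for i in (1, 3, 5)]        # 20.4, 32.0, 25.8 (m/s)
a = [acceleration(i) for i in (1, 3, 5)]    # 25.6, 32.0, -14.4 (m/s^2)
```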

 

Nonlinear Equations 31.4


 

Introduction
In this Section we briefly discuss nonlinear equations (what they are and what their solutions might
be) before noting that many such equations which crop up in applications cannot be solved exactly.
The remainder (and majority) of the Section then goes on to discuss methods for approximating
solutions of nonlinear equations.

Prerequisites
Before starting this Section you should . . .
• understand derivatives of simple functions
• understand the quadratic formula
• understand exponentials and logarithms

Learning Outcomes
On completion you should be able to . . .
• approximate roots of equations by the bisection method and by the Newton-Raphson method
• implement an approximate Newton-Raphson method

1. Nonlinear Equations
A linear equation is one related to a straight line, for example f (x) = mx + c describes a straight
line with slope m and the linear equation f(x) = 0, involving such an f, is easily solved to give
x = −c/m (as long as m ≠ 0). If a function f is not represented by a straight line in this way we
say it is nonlinear.
The nonlinear equation f(x) = 0 may have just one solution, as in the linear case, or it may have
no solutions at all, or it may have many solutions. For example if f (x) = x2 − 9 then it is easy to
see that there are two solutions x = −3 and x = 3. The nonlinear equation f (x) = x2 + 1 has no
solutions at all (unless the application under consideration makes it appropriate to consider complex
numbers).
Our aim in this Section is to approximate (real-valued) solutions of nonlinear equations of the form
f (x) = 0. The definitions of a root of an equation and a zero of a function have been gathered
together in Key Point 13.

Key Point 13
If the value x is such that f (x) = 0 we say that
1. x is a root of the equation f (x) = 0
2. x is a zero of the function f .

Example 22
Find any (real valued) zeros of the following functions. (Give 3 decimal places if
you are unable to give an exact numerical value.)
(a) f (x) = x2 + x − 20 (b) f (x) = x2 − 7x + 5 (c) f (x) = 2x − 3
(d) f (x) = ex + 1 (e) f (x) = sin(x)

Solution

(a) This quadratic factorises easily into f (x) = (x − 4)(x + 5) and so the two zeros of this f are
x = 4, x = −5.

(b) The nonlinear equation x² − 7x + 5 = 0 requires the quadratic formula and we find that the
two zeros of this f are x = (7 ± √(7² − 4 × 1 × 5))/2 = (7 ± √29)/2, which are equal to x = 0.807
and x = 6.193, to 3 decimal places.

Solution (contd.)
(c) Using the natural logarithm function we see that
x ln(2) = ln(3)
from which it follows that x = ln(3)/ ln(2) = 1.585, to 3 decimal places.
(d) This f has no zeros because ex + 1 is always positive.
(e) sin(x) has an infinite number of zeros at x = 0, ±π, ±2π, ±3π, . . . . To 3 decimal places these
are x = 0.000, ±3.142, ±6.283, ±9.425, . . . .

Task
Find any (real valued) zeros of the following functions.
(a) f (x) = x2 + 2x − 15, (b) f (x) = x2 − 3x + 3,
(c) f (x) = ln(x) − 2, (d) f (x) = cos(x).
For parts (a) to (c) give your answers to 3 decimal places if you cannot give an
exact answer; your answers to part (d) may be left in terms of π.

Your solution

Answer
(a) This quadratic factorises easily into f (x) = (x − 3)(x + 5) and so the two zeros of this
f are x = 3, x = −5.

(b) The equation x² − 3x + 3 = 0 requires the quadratic formula and the two zeros of this f are
x = (3 ± √(3² − 4 × 1 × 3))/2 = (3 ± √−3)/2
which are complex values. This f has no real zeros.
(c) Solving ln(x) = 2 gives x = e² = 7.389, to 3 decimal places.
(d) cos(x) has an infinite number of zeros at x = π/2, π/2 ± π, π/2 ± 2π, . . . .
2 2 2

Many functions that crop up in engineering applications do not lend themselves to finding zeros
directly as was achieved in the examples above. Instead we approximate zeros of functions, and this
Section now goes on to describe some ways of doing this. Some of what follows will involve revision
of material you have seen in Workbook 12 concerning Applications of Differentiation.

2. The bisection method


Suppose that, by trial and error for example, we know that a single zero of some function f lies
between x = a and x = b. The root is said to be bracketed by a and b. This must mean that f (a)
and f (b) are of opposite signs, that is that f (a)f (b) < 0.

Example 23
The single positive zero of the function f(x) = x tanh(½x) − 1 models the wave
number of water waves at a certain frequency in water of depth 0.5 (measured
in some units we need not worry about here). Find two points which bracket the
zero of f.

Solution
We simply evaluate f at a selection of x-values.
x     f(x) = x tanh(½x) − 1
0     0 × tanh(0) − 1 = −1
0.5   0.5 × tanh(0.25) − 1 = 0.5 × 0.2449 − 1 = −0.8775
1     1 × tanh(0.5) − 1 = 1 × 0.4621 − 1 = −0.5379
1.5   1.5 × tanh(0.75) − 1 = 1.5 × 0.6351 − 1 = −0.0473
2     2 × tanh(1) − 1 = 2 × 0.7616 − 1 = 0.5232
From this we can see that f changes sign between 1.5 and 2. Thus we can take a = 1.5 and b = 2
as the bracketing points. That is, the zero of f is in the bracketing interval 1.5 < x < 2.

Task
The function f (x) = cos(x) − x has a single positive zero. Find bracketing points
a and b for the zero of f. Arrange for the difference between a and b to be equal to ½.
(NB - be careful to use radians on your calculator!)

Your solution

Answer
We evaluate f for a range of values:
x f (x)
0 1
0.5 0.37758
1 −0.459698
Clearly f changes sign between the bracketing values a = 0.5 and b = 1.
(Other answers are valid of course, it depends which values of f you tried.)

The aim with the bisection method is to repeatedly reduce the width of the bracketing interval
a < x < b so that it “pinches” the required zero of f to some desired accuracy. We begin by
describing one iteration of the bisection method in detail.
Let m = ½(a + b), the mid-point of the interval a < x < b. All we need to do now is to see in
which half (the left or the right) of the interval a < x < b the zero is in. We evaluate f (m). There
is a (very slight) chance that f (m) = 0, in which case our job is done and we have found the zero
of f . Much more likely is that we will be in one of the two situations shown in Figure 13 below. If
f (m)f (b) < 0 then we are in the situation shown in (a) and we replace a < x < b with the smaller
bracketing interval m < x < b. If, on the other hand, f (a)f (m) < 0 then we are in the situation
shown in (b) and we replace a < x < b with the smaller bracketing interval a < x < m.

[Two sketches of f crossing the x-axis between a and b, with mid-point m: in (a) the sign change, and hence the zero, lies between m and b; in (b) it lies between a and m]

Figure 13

Either way, we now have a bracketing interval that is half the size of the one we started with. We
have carried out one iteration of the bisection method. By successively reapplying this approach we
can make the bracketing interval as small as we wish.

Example 24
Carry out one iteration of the bisection method so as to halve the width of the
bracketing interval 1.5 < x < 2 for
f(x) = x tanh(½x) − 1.

Solution
The mid-point of the bracketing interval is m = ½(a + b) = ½(1.5 + 2) = 1.75. We evaluate
f(m) = 1.75 × tanh(½ × 1.75) − 1 = 0.2318,
to 4 decimal places. We found earlier (Example 23) that f(a) < 0 and f(b) > 0; the fact
that f(m) is of the opposite sign to f(a) means that the zero of f lies in the bracketing interval
1.5 < x < 1.75.

Task
Carry out one iteration of the bisection method so as to halve the width of the
bracketing interval 0.5 < x < 1 for
f (x) = cos(x) − x.

Your solution

Answer
Here a = 0.5, b = 1. The mid-point of the bracketing interval is m = ½(a + b) = ½(0.5 + 1) = 0.75.
We evaluate
f (m) = cos(0.75) − 0.75 = −0.0183
We found earlier (Task, pages 58-59) that f (a) > 0 and f (b) < 0, the fact that f (m) is of the
opposite sign to f (a) means that the zero of f lies in the bracketing interval 0.5 < x < 0.75.

So we have a way of halving the size of the bracketing interval. By repeatedly applying this approach
we can make the interval smaller and smaller.
The general procedure, involving (possibly) many iterations, is best described as an algorithm:

1. Choose an error tolerance.

2. Let m = ½(a + b), the mid-point of the bracketing interval.

3. There are three possibilities:

(a) f(m) = 0, this is very unlikely in general, but if it does happen then we have found the
zero of f and we can go to step 7,
(b) the zero is between m and b,
(c) the zero is between a and m.

4. If the zero is between m and b, that is if f (m)f (b) < 0 (as in Figure 13(a)) then let a = m.

5. Otherwise the zero must be between a and m (as in Figure 13(b)) so let b = m.

6. If b − a is greater than the required tolerance then go to step 2.

7. End.
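The algorithm above can be sketched in a few lines of code; here it is applied to the function of Example 23, whose zero we know lies in the bracket 1.5 < x < 2:

```python
import math

def bisection(f, a, b, tol):
    # Assumes a and b bracket a single zero, i.e. f(a)*f(b) < 0
    while b - a > tol:
        m = 0.5 * (a + b)        # step 2: mid-point
        if f(m) == 0:
            return m             # step 3(a): found the zero exactly
        if f(m) * f(b) < 0:
            a = m                # zero lies between m and b
        else:
            b = m                # zero lies between a and m
    return 0.5 * (a + b)

f = lambda x: x * math.tanh(0.5*x) - 1
root = bisection(f, 1.5, 2, 1e-6)   # approximately 1.543405
```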

One feature of this method is that we can predict in advance how much effort is required to achieve
a certain level of accuracy.

Example 25
A given problem using the bisection method starts with the bracketing points
a = 1.5 and b = 2. How many iterations will be required so that the error in the
approximation is less than ½ × 10⁻⁶?

Solution
Before we carry out any iterations we can write that the zero to be approximated is 1.75 ± 0.25 so
that the maximum magnitude of the error in 1.75 may be taken to be equal to 0.25.
Each successive iteration will halve the size of the error, so that after n iterations the error is equal to
(1/2ⁿ) × 0.25
We require that this quantity be less than ½ × 10⁻⁶. Now,
(1/2ⁿ) × 0.25 < ½ × 10⁻⁶ implies that 2ⁿ > ½ × 10⁶.
The smallest value of n which satisfies this inequality can be found by trial and error, or by using
logarithms to see that n > (ln(½) + 6 ln(10))/ln(2). Either way, the smallest integer which will do
the trick is
n = 19.
It takes 19 iterations of the bisection method to ensure the required accuracy.
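The count of 19 can be computed directly rather than by trial and error; a sketch using the reasoning above (initial error (b − a)/2, halved at every iteration):

```python
import math

def bisection_steps(a, b, tol):
    # Smallest n with (b - a)/2 * (1/2)**n < tol
    return math.ceil(math.log2((b - a) / (2 * tol)))

n_example = bisection_steps(1.5, 2, 0.5e-6)   # Example 25
n_task = bisection_steps(3.2, 4, 0.5e-3)      # the Task that follows
```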

Task
A function f is known to have a single zero between the points a = 3.2 and b = 4.
If these values were used as the initial bracketing points in an implementation of
the bisection method, how many iterations would be required to ensure an error
less than ½ × 10⁻³?

Your solution

Answer
We require that
(1/2ⁿ) × (4 − 3.2)/2 < ½ × 10⁻³
or, after a little rearranging,
2ⁿ > (4/5) × 10³.
The smallest value of n which satisfies this is n = 10. (This can be found by trial-and-error or by
using logarithms.)

Pros and cons of the bisection method

Pros

• the method is easy to understand and remember
• the method always works (once you find values a and b which bracket a single zero)
• the method allows us to work out how many iterations it will take to achieve a given error tolerance because we know that the interval will exactly halve at each step

Cons

• the method is very slow
• the method cannot find roots where the curve just touches the x-axis but does not cross it (e.g. double roots)

The slowness of the bisection method will not be a surprise now that you have worked through an
example or two! Significant effort is involved in evaluating f and then all we do is look at this f-value
and see whether it is positive or negative! We are throwing away hard-won information.

Let us be realistic here, the slowness of the bisection method hardly matters if all we are saying is
that it takes a few more fractions of a second of computing time to finish, when compared with a
competing approach. But there are applications in which f may be very expensive (that is, slow) to
calculate and there are applications where engineers need to find zeros of a function many thousands
of times. (Coastal engineers, for example, may employ mathematical wave models that involve finding
the wave number we saw in Example 23 at many different water depths.) It is quite possible that
you will encounter applications where the bisection method is just not good enough.

3. The Newton-Raphson method


You may recall (e.g. Workbook 13.3) that the Newton-Raphson method (often simply called Newton's
method) for approximating a zero of the function f is given by
xₙ₊₁ = xₙ − f(xₙ)/f′(xₙ)
where f′ denotes the first derivative of f and where x₀ is an initial guess to the zero of f. A graphical
way of interpreting how this method works is shown in Figure 14.

[Graph: a curve crossing the x-axis, with successive Newton-Raphson iterates x₀, x₁, x₂, x₃ marked; each tangent line meets the x-axis at the next iterate, closing in on the zero]

Figure 14
At each approximation to the zero of f we extrapolate so that the tangent to the curve meets the
x-axis. This point on the x-axis is the new approximation to the zero of f . As is clear from both
the figure and the mathematical statement of the method above, we require that f 0 (xn ) 6= 0 for
n = 0, 1, 2, . . . .

Example 26
Let us consider the example we met earlier in Example 24. We know that the
single positive zero of
f(x) = x tanh(½x) − 1
lies between 1.5 and 2. Use the Newton-Raphson method to approximate the zero
of f .

Solution
We must work out the derivative of f to use Newton-Raphson. Now
f′(x) = tanh(½x) + ½x sech²(½x)
on differentiating a product and recalling that d/dx tanh(x) = sech²(x). (To evaluate sech on a
calculator recall that sech(x) = 1/cosh(x).)
We must choose a starting value x0 for the iteration and, given that we know the zero to be between
1.5 and 2, we take x₀ = 1.75. The first iteration of Newton-Raphson gives
x₁ = x₀ − f(x₀)/f′(x₀) = 1.75 − f(1.75)/f′(1.75) = 1.75 − 0.231835/1.145358 = 1.547587,
where 6 decimal places are shown. The second iteration gives
x₂ = x₁ − f(x₁)/f′(x₁) = 1.547587 − f(1.547587)/f′(1.547587) = 1.547587 − 0.004585/1.09687 = 1.543407.
Clearly this method lends itself to implementation on a computer and, for example, using a spread-
sheet package, it is not hard to compute a few more iterations. Here is output from Microsoft Excel
where we have included the two lines of hand-calculation above:
n    xₙ    f(xₙ)    f′(xₙ)    xₙ₊₁
0 1.75 0.231835 1.145358 1.547587
1 1.547587 0.004585 1.09687 1.543407
2 1.543407 2.52E − 06 1.095662 1.543405
3 1.543405 7.69E − 13 1.095661 1.543405
4 1.543405 0 1.095661 1.543405
and all subsequent lines are equal to the last line here. The method has converged (very quickly!)
to 1.543405, to six decimal places.

Earlier, in Example 25, we found that the bisection method would require 19 iterations to achieve 6
decimal place accuracy. The Newton-Raphson method gave an answer good to this number of places
in just two or three iterations.
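Newton-Raphson is just as short to code; a sketch using the same f, derivative and starting value as Example 26:

```python
import math

def newton(f, df, x0, tol=1e-10, max_iter=50):
    # Newton-Raphson: x_{n+1} = x_n - f(x_n)/f'(x_n)
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("did not converge")

f = lambda x: x * math.tanh(0.5*x) - 1
# f'(x) = tanh(x/2) + (x/2) sech^2(x/2), with sech(x) = 1/cosh(x)
df = lambda x: math.tanh(0.5*x) + 0.5*x / math.cosh(0.5*x)**2

root = newton(f, df, 1.75)   # converges to 1.543405 in a few iterations
```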

Task
Use the starting value x0 = 0 in an implementation of the Newton-Raphson
method for approximating the zero of
f (x) = cos(x) − x.
(If you are doing these calculations by hand then just perform two or three itera-
tions. Don’t forget to use radians.)

Your solution

Answer
The derivative of f is f′(x) = −sin(x) − 1. The first iteration is
x₁ = x₀ − f(x₀)/f′(x₀) = 0 − (1 − 0)/(−0 − 1) = 1
and the second iteration is
x₂ = x₁ − f(x₁)/f′(x₁) = 1 − (cos(1) − 1)/(−sin(1) − 1) = 1 − (−0.459698)/(−1.841471) = 0.750364,
and so on. There is little to be gained in our understanding by doing more iterations by hand, but
using a spreadsheet we find that the method converges rapidly:
n    xₙ    f(xₙ)    f′(xₙ)    xₙ₊₁
0 0 1 −1 1
1 1 −0.4597 −1.84147 0.750364
2 0.750364 −0.01892 −1.6819 0.739113
3 0.739113 −4.6E − 05 −1.67363 0.739085
4 0.739085 −2.8E − 10 −1.67361 0.739085
5 0.739085 0 −1.67361 0.739085

It is often necessary to find zeros of polynomials when studying transfer functions. Here is a Task
involving a polynomial.

Task
The function f (x) = x3 + 2x + 4 has a single zero near x0 = −1. Use this value
of x0 to perform two iterations of the Newton-Raphson method.

Your solution

Answer
Using the starting value x₀ = −1 you should find that f(x₀) = 1 and f′(x₀) = 5. This leads to
x₁ = x₀ − f(x₀)/f′(x₀) = −1 − 1/5 = −1.2.
The second iteration should give you x₂ = x₁ − f(x₁)/f′(x₁) = −1.2 − (−0.128)/6.32 = −1.17975.
Subsequent iterations will home in on the zero of f . Using a computer spreadsheet gives:
n    xₙ    f(xₙ)    f′(xₙ)    xₙ₊₁
0 −1 1 5 −1.2
1 −1.2 −0.128 6.32 −1.17975
2 −1.17975 −0.00147 6.175408 −1.17951
3 −1.17951 −2E − 07 6.173725 −1.17951
4 −1.17951 0 6.173725 −1.17951
where we have recomputed the hand calculations for the first two iterations.
We see that the method converges to the value −1.17951.

Engineering Example 2

Pressure in an ideal multi-component mixture

Introduction

An ideal multi-component mixture consists of

1. n-pentane (5%)

2. n-hexane (15%)

3. n-heptane (50%)

4. n-octane (30%)

In general, the total pressure, P (Pa) of an ideal four-component mixture is related to the boiling
point, T (K) through the formula:

P = x₁p₁* + x₂p₂* + x₃p₃* + x₄p₄*

where, for component i, the mole fraction is xᵢ and the vapour pressure is pᵢ*, given by the formula:

pᵢ* = exp(Aᵢ − Bᵢ/(T + Cᵢ)),   i = 1, 2, 3, 4

Here pᵢ* is in mm Hg (1 mm Hg = 133.32 Pa), T is the absolute temperature (K) and the constants
Aᵢ, Bᵢ and Cᵢ are given in the table below.

i component xi Ai Bi Ci
1 n-pentane 0.05 15.8333 2477.07 −39.94
2 n-hexane 0.15 15.8366 2697.55 −48.78
3 n-heptane 0.50 15.8737 2911.32 −56.51
4 n-octane 0.30 15.9426 3120.29 −63.63

Problem 1

For the liquid compositions xᵢ given in the table above, plot a graph of the total pressure, P (Pa),
against temperature (K) over the range 250 to 500 K.

Solution
 
pᵢ* = exp(Aᵢ − Bᵢ/(T + Cᵢ)), expressed in millimetres of mercury, and so it is 133.32 times that in
pascals. Therefore, expressed in pascals, we have

P = 133.32 Σ_{i=1}^{4} xᵢ exp(Aᵢ − Bᵢ/(T + Cᵢ))

Plotting this from T = 250 to 500 gives the following graph

[Graph: total pressure P (Pa), on a scale of 0 to 18 × 10⁵, rising with temperature from 250 K to 500 K]

Figure 15

Problem 2

Using the Newton-Raphson method, solve the equations to find the boiling points at total pressures
of 1, 2, 5 and 10 bars. Show the sequence of iterations and perform sufficient calculations for
convergence to three significant figures. Display these solutions on the graph of the total pressure,
P (Pa) against temperature T (K).

Solution

We wish to find T when P = 1, 2, 5 and 10 bars, that is, 10⁵, 2 × 10⁵, 5 × 10⁵ and 10 × 10⁵ Pa.

Reading crude approximations to T from the graph gives a starting point for the Newton-Raphson
process. We see that for 10⁵, 2 × 10⁵, 5 × 10⁵ and 10 × 10⁵ Pa, temperature T is roughly 365, 375, 460
and 485 K, respectively, so we shall use these values as the start of the iteration.

In this case it is easy to calculate the derivative of P with respect to T exactly, rather than numerically,
giving
P′(T) = 133.32 Σ_{i=1}^{4} xi exp(Ai − Bi/(T + Ci)) × Bi/(T + Ci)²

Therefore to solve the equation P (T ) = y, we set T0 to be the starting value above and use the
iteration
Tn+1 = Tn − (P(Tn) − y)/P′(Tn)
For y = 100000 this gives the iterations
T0 T1 T2 T3 T4
365 362.7915 362.7349 362.7349 362.7349
We conclude that, to three significant figures, T = 363 K when P = 100000 Pa.

For y = 200000 this gives the iterations

T0 T1 T2 T3 T4
375 390.8987 388.8270 388.7854 388.7854
We conclude that, to three significant figures, T = 389 K when P = 200000 Pa.
For y = 500000 this gives the iterations
T0 T1 T2 T3 T4 T5
460 430.3698 430.4640 430.2824 430.2821 430.2821
We conclude that, to three significant figures, T = 430 K when P = 500000 Pa.
For y = 1000000 this gives the iterations
T0 T1 T2 T3 T4 T5
475 469.0037 468.7875 468.7873 468.7873 468.7873
We conclude that, to three significant figures, T = 469 K when P = 1000000 Pa.
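The whole calculation can be scripted. The sketch below (the names `P`, `dP` and `boiling_point` are my own) implements the iteration Tn+1 = Tn − (P(Tn) − y)/P′(Tn) with the exact derivative and reproduces the four boiling points found above:

```python
import math

# Constants (A_i, B_i, C_i) and mole fractions x_i from the table
# earlier in the section (n-pentane, n-hexane, n-heptane, n-octane).
components = [
    (15.8333, 2477.07, -39.94, 0.05),
    (15.8366, 2697.55, -48.78, 0.15),
    (15.8737, 2911.32, -56.51, 0.50),
    (15.9426, 3120.29, -63.63, 0.30),
]

def P(T):
    """Total pressure (Pa) at temperature T (K)."""
    return 133.32 * sum(x * math.exp(A - B / (T + C))
                        for A, B, C, x in components)

def dP(T):
    """Exact derivative P'(T), matching the formula in the text."""
    return 133.32 * sum(x * math.exp(A - B / (T + C)) * B / (T + C) ** 2
                        for A, B, C, x in components)

def boiling_point(y, T0, tol=1e-8, max_iter=50):
    """Solve P(T) = y by Newton-Raphson, starting from T0."""
    T = T0
    for _ in range(max_iter):
        T_next = T - (P(T) - y) / dP(T)
        if abs(T_next - T) < tol:
            return T_next
        T = T_next
    return T

# Starting values read from the graph, as in the text.
for y, T0 in [(1e5, 365), (2e5, 375), (5e5, 460), (10e5, 475)]:
    print(f"P = {y:9.0f} Pa:  T = {boiling_point(y, T0):.4f} K")
```

Each run converges in a handful of iterations, just as the hand-computed tables above show.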

An approximate Newton-Raphson method


The Newton-Raphson method is an excellent way of approximating zeros of a function, but it requires
you to know the derivative of f . Sometimes it is undesirable, or simply impossible, to work out the
derivative of a function and here we show a way of getting around this.
We approximate the derivative of f . From Section 31.3 we know that
f′(x) ≈ (f(x + h) − f(x))/h
is a one-sided (or forward) approximation to f′ and another one, using a central difference, is
f′(x) ≈ (f(x + h) − f(x − h))/(2h).
The advantage of the forward difference is that only one extra f -value has to be computed. If f
is especially complicated then this can be a considerable saving when compared with the central
difference which requires two extra evaluations of f . The central difference does have the advantage,
as we saw when we looked at truncation errors, of being a more accurate approximation to f′.
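The accuracy difference is easy to see numerically. The sketch below (my own illustration, not from the workbook; it uses f = sin, whose derivative is known exactly) compares both approximations at the same step size:

```python
import math

def forward_diff(f, x, h):
    """One-sided (forward) estimate of f'(x): one extra f-evaluation."""
    return (f(x + h) - f(x)) / h

def central_diff(f, x, h):
    """Central estimate of f'(x): two extra f-evaluations, more accurate."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Compare both on f = sin, whose derivative at x = 1 is cos(1).
x, h = 1.0, 0.1
exact = math.cos(x)
err_forward = abs(forward_diff(math.sin, x, h) - exact)
err_central = abs(central_diff(math.sin, x, h) - exact)
print(f"forward error: {err_forward:.2e},  central error: {err_central:.2e}")
```

With h = 0.1 the central difference is two orders of magnitude more accurate here, reflecting its higher-order truncation error.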
The spreadsheet program Microsoft Excel has a built in “solver” command which can use Newton’s
method. (It may be necessary to use the “Add in” feature of Excel to access the solver.) In reality
Excel has no way of working out the derivative of the function and must approximate it. Excel gives
you the option of using a forward or central difference to estimate f′.
We now reconsider the problem we met in Examples 24 to 26.

Example 27
We know that the single positive zero of f(x) = x tanh(x/2) − 1 lies between
1.5 and 2. Use the Newton-Raphson method, with an approximation to f′, to
approximate the zero of f .

Solution
There is no requirement for f′ this time, but the nature of this method is such that we will resort
to a computer straight away. Let us choose h = 0.1 in our approximations to the derivative.
Using the one-sided difference to approximate f 0 (x) we obtain this sequence of results from the
spreadsheet program:
n   xn   f(xn)   (f(xn + h) − f(xn))/h   xn+1
0 1.75 0.231835 1.154355 1.549165
1 1.549165 0.006316 1.110860 1.543479
2 1.543479 8.16E − 05 1.109359 1.543406
3 1.543406 1.01E − 06 1.109339 1.543405
4 1.543405 1.24E − 08 1.109339 1.543405
5 1.543405 1.53E − 10 1.109339 1.543405
6 1.543405 1.89E − 12 1.109339 1.543405
7 1.543405 2.31E − 14 1.109339 1.543405
8 1.543405 0 1.109339 1.543405
And using the (more accurate) central difference gives
n   xn   f(xn)   (f(xn + h) − f(xn − h))/(2h)   xn+1
0 1.75 0.231835 1.144649 1.547462
1 1.547462 0.004448 1.095994 1.543404
2 1.543404 −1E − 06 1.094818 1.543405
3 1.543405 7.95E − 10 1.094819 1.543405
4 1.543405 −6.1E − 13 1.094819 1.543405
5 1.543405 0 1.094819 1.543405
We see that each of these approaches leads to the same value (1.543405) that we found with the
Newton-Raphson method.
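The spreadsheet iterations above are easy to mirror in code. This sketch (the helper name `approx_newton` is my own) applies Newton-Raphson with the forward-difference slope and h = 0.1, starting from x0 = 1.75 as in the first table:

```python
import math

def f(x):
    # The function from Example 27.
    return x * math.tanh(0.5 * x) - 1.0

def approx_newton(f, x0, h=0.1, tol=1e-10, max_iter=100):
    """Newton-Raphson with f'(x) replaced by a forward difference."""
    x = x0
    for _ in range(max_iter):
        slope = (f(x + h) - f(x)) / h   # one-sided estimate of f'(x)
        x_next = x - f(x) / slope
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

print(f"{approx_newton(f, 1.75):.6f}")
```

Because a fixed point of the iteration still satisfies f(x) = 0 exactly, the approximate slope slows convergence slightly but does not change the limit.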

Task
Use a spreadsheet to recompute the approximations shown in Example 27, for the
following values of h:
h = 0.001, 0.00001, 0.0000001.

Your solution

Answer
You should find that as h decreases, the numbers get closer and closer to those shown earlier for the
Newton-Raphson method. For example, when h = 0.0000001 we find that for a one-sided difference
the results are
n   xn   f(xn)   (f(xn + h) − f(xn))/h   xn+1
0 1.75 0.231835 1.145358 1.547587
1 1.547587 0.004585 1.096870 1.543407
2 1.543407 2.52E − 06 1.095662 1.543405
3 1.543405 8.08E − 13 1.095661 1.543405
4 1.543405 0 1.095661 1.543405
and those for a central difference with h = 0.0000001 are
n   xn   f(xn)   (f(xn + h) − f(xn − h))/(2h)   xn+1
0 1.75 0.231835 1.145358 1.547587
1 1.547587 0.004585 1.096870 1.543407
2 1.543407 2.52E − 06 1.095662 1.543405
3 1.543405 7.7E − 13 1.095661 1.543405
4 1.543405 0 1.095661 1.543405
It is clear that these two tables very closely resemble the Newton-Raphson results seen earlier.

Exercises
1. It is given that the function

f (x) = x3 + 2x + 8

has a single negative zero.

(a) Find two integers a and b which bracket the zero of f .


(b) Perform one iteration of the bisection method so as to halve the size of the bracketing
interval.

2. Consider a simple electronic circuit with an input voltage of 2.0 V, a resistor of resistance 1000
Ω and a diode. It can be shown that the voltage across the diode can be found as the single
positive zero of
f(x) = 1 × 10⁻¹⁴ exp(x/0.026) − (2 − x)/1000.
Use one iteration of the Newton-Raphson method, and an initial value of x0 = 0.75 to show
that

x1 = 0.724983

and then work out a second iteration.

3. It is often necessary to find the zeros of polynomials as part of an analysis of transfer functions.
The function

f (x) = x3 + 5x − 4

has a single zero near x0 = 1. Use this value of x0 in an implementation of the Newton-Raphson
method performing two iterations. (Work to at least 6 decimal place accuracy.)

4. The smallest positive zero of

f (x) = x tan(x) + 1

is a measure of how quickly certain evanescent water waves decay, and its value, x0 , is near 3.
Use the forward difference
(f(3.01) − f(3))/0.01
to estimate f′(3) and use this value in an approximate version of the Newton-Raphson method
to derive one improvement on x0 .

Answers

1. (a) By trial and error we find that f (−2) = −4 and f (−1) = 5, from which we see that the
required bracketing interval is a < x < b where a = −2 and b = −1.
(b) For an iteration of the bisection method we find the mid-point m = −1.5. Now f (m) =
1.625 which is of the opposite sign to f (a) and hence the new smaller bracketing interval
is a < x < m.
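For readers who prefer code to hand iteration, here is a short Python sketch of the bisection method for this exercise (the helper name `bisect` is my own). It starts from the bracket found in part (a) and simply repeats the halving step of part (b):

```python
def f(x):
    return x ** 3 + 2 * x + 8

def bisect(f, a, b, tol=1e-10):
    """Halve the bracketing interval [a, b] until it is shorter than tol."""
    while b - a > tol:
        m = (a + b) / 2          # mid-point of the current bracket
        if f(a) * f(m) <= 0:
            b = m                # sign change lies in [a, m]
        else:
            a = m                # sign change lies in [m, b]
    return (a + b) / 2

# Part (a): f(-2) = -4 < 0 < 5 = f(-1), so [-2, -1] brackets the zero.
# Part (b): the first mid-point is -1.5 with f(-1.5) = 1.625 > 0,
# so the bracket halves to [-2, -1.5].
print(f"{bisect(f, -2.0, -1.0):.6f}")
```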
2. The derivative of f is f′(x) = (1 × 10⁻¹⁴/0.026) exp(x/0.026) + 1/1000, and therefore the first
iteration of Newton-Raphson gives x1 = 0.75 − 0.032457/1.297439 = 0.724983.
The second iteration gives x2 = 0.724983 − 0.011603/0.496319 = 0.701605.
Using a spreadsheet we can work out some more iterations. The result of this process is
tabulated below
n   xn   f(xn)   f′(xn)   xn+1
2 0.701605 0.003942 0.202547 0.682144
3 0.682144 0.001161 0.096346 0.670092
4 0.670092 0.000230 0.060978 0.666328
5 0.666328 1.56E − 05 0.052894 0.666033
6 0.666033 8.63E − 08 0.052310 0.666031
7 0.666031 2.68E − 12 0.052306 0.666031
8 0.666031 0 0.052306 0.666031

and we conclude that the required zero of f is equal to 0.666031, to 6 decimal places.
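The full iteration tabulated above can be checked with a few lines of Python (the helper name `newton` is my own):

```python
import math

def f(x):
    # Diode voltage equation from Exercise 2.
    return 1e-14 * math.exp(x / 0.026) - (2 - x) / 1000

def df(x):
    # Its exact derivative, as used in the answer.
    return (1e-14 / 0.026) * math.exp(x / 0.026) + 1 / 1000

def newton(x0, n):
    """Perform n Newton-Raphson iterations starting from x0."""
    x = x0
    for _ in range(n):
        x = x - f(x) / df(x)
    return x

print(f"x1 = {newton(0.75, 1):.6f}")   # first iteration
print(f"x  = {newton(0.75, 20):.6f}")  # converged value
```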

3. Using the starting value x0 = 1 you should find that f(x0) = 2 and f′(x0) = 8. This leads to

x1 = x0 − f(x0)/f′(x0) = 1 − 2/8 = 0.75.

The second iteration should give you x2 = x1 − f(x1)/f′(x1) = 0.75 − 0.171875/6.6875 = 0.724299.
Subsequent iterations can be used to ‘home in’ on the zero of f and, using a computer
spreadsheet program, we find that

n   xn   f(xn)   f′(xn)   xn+1
2 0.724299 0.001469 6.573827 0.724076
3 0.724076 1.09E − 07 6.572856 0.724076
4 0.724076 0 6.572856 0.724076
We see that the method converges to the value 0.724076.
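The same sequence can be generated in a few lines of Python (the helper name `newton` is my own):

```python
def f(x):
    return x ** 3 + 5 * x - 4

def df(x):
    # Exact derivative of the cubic.
    return 3 * x ** 2 + 5

def newton(x0, n):
    """Perform n Newton-Raphson iterations starting from x0."""
    x = x0
    for _ in range(n):
        x = x - f(x) / df(x)
    return x

print(f"x1 = {newton(1.0, 1):.6f}")
print(f"x2 = {newton(1.0, 2):.6f}")
print(f"x  = {newton(1.0, 6):.6f}")   # effectively converged
```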

Answers

4. We begin with

f′(3) ≈ (f(3.01) − f(3))/0.01 = 0.02924345684/0.01 = 2.924345684,

to the displayed number of decimal places, and hence an improvement on x0 = 3 is

x1 = 3 − f(3)/2.924345684 = 2.804277,
to 6 decimal places. (It can be shown that the root of f is 2.798386, to 6 decimal places.)
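This single approximate Newton-Raphson step can be verified directly (plain Python, using only the formulas in the exercise):

```python
import math

def f(x):
    return x * math.tan(x) + 1

# Forward-difference estimate of f'(3), exactly as in the exercise.
slope = (f(3.01) - f(3.0)) / 0.01

# One approximate Newton-Raphson step from x0 = 3.
x1 = 3.0 - f(3.0) / slope
print(f"f'(3) estimate = {slope:.9f},  x1 = {x1:.6f}")
```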

