Application of Differential Equations in Biology


Contents

1 Pafnuty Lvovich Chebyshev (1821–1894)
  1.1 Biography
  1.2 Chebyshev's interest in approximation theory

2 Approximation in the L2-norm
  2.1 Best approximation in the L2-norm
    2.1.1 The Gram–Schmidt Process
    2.1.2 Example
    2.1.3 Legendre polynomials

3 Approximation in the uniform norm
  3.1 Existence
  3.2 Uniqueness

4 Chebyshev polynomials
  4.1 Properties of the Chebyshev polynomials

5 How to find the best approximating polynomial in the uniform norm
  5.1 Chebyshev's solution

6 Conclusion

Chapter 1

Pafnuty Lvovich Chebyshev (1821–1894)

Since this thesis is dedicated to the Chebyshev polynomials, in this chapter we discuss who Pafnuty Lvovich Chebyshev was and why he worked on uniform approximation.

The information in this chapter is drawn from The History of Approximation Theory by K. G. Steffens.

1.1 Biography
Pafnuty Lvovich Chebyshev was born on May 4, 1821 in Okatovo, Russia. Because of a physical handicap he could not walk well, which kept him from the usual childhood activities. He soon found a passion instead: constructing mechanisms.

In 1837 he started studying mathematics at Moscow University. One of his teachers was N. D. Brashman, who taught him practical mechanics. In 1841 Chebyshev won a silver medal for his "calculation of the roots of equations", and at the end of that year he was called "most outstanding candidate". He graduated in 1846; his master's thesis was entitled "An Attempt at an Elementary Analysis of Probabilistic Theory". A year later he defended his dissertation "About Integration with the Help of Logarithms", which earned him the right to become a lecturer.

In 1849, he received his doctorate for his work "Theory of Congruences". A year later, he was appointed extraordinary professor at Saint Petersburg University; in 1860 he became ordinary professor there, and 25 years later merited professor. In 1882 he stopped teaching at the university and devoted himself to research.

He did not teach only at Saint Petersburg University: from 1852 to 1858 he also taught practical mechanics at the Alexander Lyceum in Pushkin, a suburb of Saint Petersburg.

Because of his scientific achievements, he was elected junior academician in 1856, and later an extraordinary (1856) and an ordinary (1858) member of the Imperial Academy of Sciences. In that year, he also became an honorary member of Moscow University.

He was honoured many more times besides: in 1856 he became a member of the scientific committee of the ministry of national education; in 1859 he became an ordinary member of the ordnance department of the academy, taking over the headship of the "commission for mathematical questions concerning ordnance and experiments related to the theory of shooting"; in 1860 the Paris academy elected him a corresponding member, and a full foreign member in 1874; and in 1893 he was elected an honorary member of the Saint Petersburg Mathematical Society.
He died at the age of 73, on November 26, 1894 in Saint Petersburg.

1.2 Chebyshev’s interest in approximation
theory
Chebyshev had been interested in mechanisms since childhood. At that time the theory of mechanisms played an important role, because of industrialisation.

In 1852, he travelled to Belgium, France, England and Germany to talk with mathematicians about various subjects, but most important for him was to talk about mechanisms. He also collected a great deal of empirical data about mechanisms, in order to verify his own theoretical results later.

According to Chebyshev, the foundations of approximation theory were established by the French mathematician Jean-Victor Poncelet, who uniformly approximated roots of the form √(a^2 + b^2) and √(a^2 − b^2) by linear expressions.

Another important name in approximation theory was the Scottish mechanical engineer James Watt. His planar joint mechanisms were the most important mechanisms for transforming linear motion into circular motion. The so-called Watt's curve is a tricircular plane algebraic curve of degree six. It is generated by two equal circles (radius b, centres a distance 2a apart): a line segment (length 2c) attaches to a point on each of the circles, and the midpoint of the line segment traces out the Watt curve as the circles rotate.

Figure: Watt's curves for different values of a, b and c.

The Watt's curve inspired Chebyshev to deal with the following problem: determine the parameters of the mechanism so that the maximal error of the approximation of the curve by the tangent on the whole interval is minimized.

In 1853, Chebyshev published his first solutions in his "Théorie des mécanismes, connus sous le nom de parallélogrammes". He tried to give mathematical foundations to the theory of mechanisms, because practical mechanics had not succeeded in finding the mechanism with the smallest deviation from the ideal run. Other techniques did not work either; Poncelet's approach did work, but only in specific cases.

Chebyshev wanted to solve general problems. He formulated the


problem as follows (translated word-by-word from French):

To determine the deviations which one has to add to get an


approximated value for a function f, given by its expansion in powers
of x − a, if one wants to minimize the maximum of these errors between
x = a − h and x = a + h, h being an arbitrarily small quantity.

The formulation of this problem is the start of approximation in the


uniform norm.

Chapter 2

Approximation in the L2-norm

The problem that Chebyshev wanted to solve is an approximation


problem in the uniform norm. In this chapter we show how
approximation in the L2-norm works, so that we can compare it to
approximation in the uniform norm later.

This chapter uses standard concepts from linear algebra, together with some basic definitions that are introduced where needed.

2.1 Best approximation in the L2-norm

Let f(x) ∈ C[a, b]. We want to find the best approximating polynomial p(x) ∈ Pn of degree at most n to the function f(x) with respect to the L2-norm. We can restate this as follows.

Problem 1. Find the best approximating polynomial p ∈ Pn of degree at most n to the function f(x) ∈ C[a, b] in the L2-norm, i.e. such that

‖f − p‖_2 = ( ∫_a^b (f(x) − p(x))^2 dx )^{1/2}

is minimal.
The best approximating polynomial p(x) always exists and is unique.


We are not going to prove this, since this is out of the scope of this
thesis. We will prove existence and uniqueness for approximation in
the uniform norm.

To solve this problem, we want to minimize

‖f − p‖_2^2 = ∫_a^b (f(x) − p(x))^2 dx.

Since the L2-norm is induced by the inner product ⟨f, g⟩ = ∫_a^b f(x) g(x) dx, this is a least-squares problem, and we have the following characterization.
Theorem 1. The best approximating polynomial p(x) ∈ Pn is such that

‖f − p‖_2 ≤ ‖f − q‖_2 for all q ∈ Pn

if and only if

⟨f − p, q⟩ = 0 for all q ∈ Pn,

where ⟨f, g⟩ = ∫_a^b f(x) g(x) dx.

Thus, the integral is minimal if p(x) is the orthogonal projection of the function f(x) on the subspace Pn. Suppose that u1, u2, ..., un form an orthogonal basis for Pn. Then

p(x) = Σ_i ( ⟨f, ui⟩ / ⟨ui, ui⟩ ) ui(x).

Orthogonal polynomials can be obtained by applying the Gram–Schmidt process to any basis of the inner product space V.

2.1.1 The Gram–Schmidt Process

Let {x1, x2, ..., xn} be a basis for the inner product space V. Let

u1 = x1 / ‖x1‖,   p_k = Σ_{i=1}^{k} ⟨x_{k+1}, ui⟩ ui,   u_{k+1} = (x_{k+1} − p_k) / ‖x_{k+1} − p_k‖,

for k = 1, 2, ..., n − 1. Then p_k is the projection of x_{k+1} onto span(u1, u2, ..., u_k), and the set {u1, u2, ..., un} is an orthonormal basis for V.

2.1.2 Example

Find the best approximating quadratic polynomial to the function f(x) = |x| on the interval [−1, 1]. Thus, we want to minimize

∫_{−1}^{1} (|x| − p(x))^2 dx.

This norm is minimal if p(x) is the orthogonal projection of the function f(x) on the subspace of polynomials of degree at most 2.

We start with the basis {1, x, x^2} for the inner product space V. Applying the Gram–Schmidt process (without normalizing) gives the orthogonal basis

u1 = 1,   u2 = x,   u3 = x^2 − 1/3.

So we have to calculate each inner product:

⟨f, u1⟩ = 1,   ⟨u1, u1⟩ = 2,   ⟨f, u2⟩ = 0,   ⟨f, u3⟩ = 1/6,   ⟨u3, u3⟩ = 8/45.

Thus,

p(x) = 1/2 + (15/16)(x^2 − 1/3) = (15/16) x^2 + 3/16,

with maximum error max_{x ∈ [−1,1]} | |x| − p(x) | = 3/16, attained at x = 0.
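The projection can also be checked numerically. The sketch below (our own check, not part of the thesis) solves the normal equations G c = b for the L2([−1, 1]) projection of f(x) = |x| onto span{1, x, x^2}, approximating the integrals with a trapezoidal rule:

```python
import numpy as np

# Project f(x) = |x| onto span{1, x, x^2} in L2([-1, 1]) by solving the
# normal equations  G c = b,  G_ij = <x^i, x^j>,  b_i = <f, x^i>.
x = np.linspace(-1.0, 1.0, 200001)   # fine grid; 0 is a grid point (the kink of |x|)
f = np.abs(x)

def inner(u, v):
    w = u * v                         # trapezoidal rule for the integral of u*v
    return float(np.sum(w[1:] + w[:-1]) * (x[1] - x[0]) / 2)

basis = [x**0, x, x**2]
G = np.array([[inner(u, v) for v in basis] for u in basis])
b = np.array([inner(f, u) for u in basis])
c = np.linalg.solve(G, b)             # coefficients of 1, x, x^2
print(c)                              # ≈ [0.1875, 0, 0.9375] = [3/16, 0, 15/16]
```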

2.1.3 Legendre polynomials

In fact, the polynomials that are orthogonal with respect to the inner product

⟨f, g⟩ = ∫_{−1}^{1} f(x) g(x) dx

are called the Legendre polynomials, named after the French mathematician Adrien-Marie Legendre. The formula for the best approximating polynomial then reads

p(x) = Σ_{k=0}^{n} ( ⟨f, Pk⟩ / ⟨Pk, Pk⟩ ) Pk(x),   with ⟨Pk, Pk⟩ = 2/(2k + 1).

The Legendre polynomials satisfy the recurrence relation

(n + 1) P_{n+1}(x) = (2n + 1) x Pn(x) − n P_{n−1}(x).

The first six Legendre polynomials are:

P0(x) = 1
P1(x) = x
P2(x) = (3x^2 − 1)/2
P3(x) = (5x^3 − 3x)/2
P4(x) = (35x^4 − 30x^2 + 3)/8
P5(x) = (63x^5 − 70x^3 + 15x)/8
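The recurrence translates directly into a short routine; here is a minimal sketch (our own illustration, with coefficient arrays listed lowest degree first; the function name `legendre` is ours):

```python
import numpy as np

def legendre(n):
    """Coefficient arrays (lowest degree first) of P_0..P_n via the
    recurrence (k+1) P_{k+1}(x) = (2k+1) x P_k(x) - k P_{k-1}(x)."""
    P = [np.array([1.0]), np.array([0.0, 1.0])]
    for k in range(1, n):
        xPk = np.concatenate(([0.0], P[k]))   # multiply P_k by x
        Pkm1 = np.pad(P[k - 1], (0, 2))       # pad P_{k-1} up to degree k+1
        P.append(((2 * k + 1) * xPk - k * Pkm1) / (k + 1))
    return P[: n + 1]

for p in legendre(5):
    print(p)
# e.g. P2 has coefficients [-0.5, 0, 1.5], i.e. (3x^2 - 1)/2
```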

Chapter 3

Approximation in the uniform norm

Pafnuty Lvovich Chebyshev was thus the first who came up with the idea of approximating functions in the uniform norm. At the time, he asked himself the following question.

Problem 2. Is it possible to represent a continuous function f(x) on the closed interval [a, b] by a polynomial of degree at most n, with n ∈ Z, in such a way that the maximum error at any point x ∈ [a, b] is controlled? That is, is it possible to construct p(x) so that the error

max_{x ∈ [a, b]} |f(x) − p(x)|

is minimized?

This thesis will give an answer to this question. In this chapter we


show that the best approximating polynomial always exists and that
it is unique.

3.1 Existence
In 1854, Chebyshev found a solution to the problem of best
approximation. He observed the following

Lemma 1. Let f(x) ∈ C[a, b] and let p(x) be a best approximation to f(x) out of Pn. Then there are at least two distinct points x1, x2 ∈ [a, b] such that

f(x1) − p(x1) = E = −(f(x2) − p(x2)),   where E = ‖f − p‖ = max_{x ∈ [a, b]} |f(x) − p(x)|.

That is, f(x) − p(x) attains each of the values ±‖f − p‖.

Proof. This is a proof by contradiction. Write E = ‖f − p‖ = max_{x ∈ [a, b]} |f(x) − p(x)|. If the conclusion of the lemma is false, then we may suppose that f(x1) − p(x1) = E for some x1, but that

f(x) − p(x) > −E for all x ∈ [a, b].

Let e = min_{x ∈ [a, b]} (f(x) − p(x)); then e > −E. Thus, E + e ≠ 0 and so q = p + (E + e)/2 ∈ Pn, with p ≠ q.

We now claim that q(x) is a better approximation to f(x) than p(x). We show this using the inequality stated above:

−(E − e)/2 = e − (E + e)/2 ≤ f(x) − q(x) ≤ E − (E + e)/2 = (E − e)/2

for all x ∈ [a, b]. That is,

‖f − q‖ ≤ (E − e)/2 < E = ‖f − p‖.

Hence, q(x) is a better approximation to f(x) than p(x). This is a contradiction, since we have that p(x) is a best approximation to f(x).

Corollary 1. The best approximating constant to f(x) ∈ C[a, b] is

p0 = ( max_{x ∈ [a, b]} f(x) + min_{x ∈ [a, b]} f(x) ) / 2,

with error

( max_{x ∈ [a, b]} f(x) − min_{x ∈ [a, b]} f(x) ) / 2.

Proof. This is again a proof by contradiction. Let x1 and x2 be such that f(x1) − p0 = −(f(x2) − p0) = ‖f − p0‖. Suppose some other constant d ≠ p0 were a best approximation. Then E = f − d must satisfy lemma 1. But

E(x1) = f(x1) − d,   E(x2) = f(x2) − d,

so E(x1) + E(x2) = f(x1) + f(x2) − 2d = 2(p0 − d) ≠ 0, showing that E(x1) ≠ −E(x2). This contradicts lemma 1.

Next, we will generalize lemma 1 to show that if p is a best approximation out of Pn, where n is the degree of the best approximating polynomial, then there exist at least n + 2 points at which f − p alternates between the values +‖f − p‖ and −‖f − p‖.

We need some definitions to arrive at this generalization.

Definition 1. Let f(x) ∈ C[a, b].

1. x ∈ [a, b] is called a (+) point for f(x) if f(x) = ‖f‖ = max_{t ∈ [a, b]} |f(t)|.

2. x ∈ [a, b] is called a (−) point for f(x) if f(x) = −‖f‖.

3. A set of distinct points a ≤ x0 < x1 < · · · < xn ≤ b is called an alternating set for f(x) if the xi are alternately (+) points and (−) points; that is, if

|f(xi)| = ‖f‖ for all i, and f(xi) = −f(x_{i−1}) for all i = 1, ..., n.

We use these notations to generalize lemma 1 and thus to


characterize a best approximating polynomial.

Theorem 2. Let f(x) ∈ C[a, b], and suppose that p(x) is a best approximation to f(x) out of Pn. Then there is an alternating set for f − p consisting of at least n + 2 points.

Proof. We may suppose that

E = ‖f − p‖ > 0,

since if f(x) ∈ Pn, then f(x) = p(x) and there would be no alternating set.

Consider the (uniformly) continuous function φ = f − p (a continuous function on a compact set is uniformly continuous). Our next step is to divide the interval [a, b] into smaller intervals with a = t0 < t1 < ... < tk = b, so that

|φ(x) − φ(y)| < E/2 whenever x, y ∈ [ti, ti+1].

We want to do this because if [ti, ti+1] contains a (+) point for φ = f − p, then φ is positive on the whole interval [ti, ti+1]:

φ(x) > E/2 > 0 for all x ∈ [ti, ti+1]. ... (4.1)

Similarly, if the interval [ti, ti+1] contains a (−) point, then φ < −E/2 on the whole interval [ti, ti+1]. Hence, no interval can contain both (+) points and (−) points.

We call an interval with a (+) point a (+) interval, and an interval with a (−) point a (−) interval. It is important to notice that no (+) interval can touch a (−) interval: signed intervals of opposite sign are separated by an interval containing a zero of φ.

Our next step is to label the intervals, grouping consecutive signed intervals of the same sign:

I1, ..., I_{k1} (+) intervals,
I_{k1+1}, ..., I_{k2} (−) intervals,
...
I_{k_{m−1}+1}, ..., I_{k_m} (sign (−1)^{m−1}) intervals.

Let S denote the union of all signed intervals I1, ..., I_{k_m}, and let N denote the union of the remaining intervals. S and N are compact sets with S ∪ N = [a, b].

We now want to show that m ≥ n + 2. We do this by assuming m < n + 2 and showing that this yields a contradiction.

The (+) intervals and (−) intervals are strictly separated, hence we can find points z1, ..., z_{m−1} ∈ N such that each z_j lies strictly between the j-th group of signed intervals and the (j+1)-th group.

We can now construct the polynomial which leads to a contradiction:

q(x) = (z1 − x)(z2 − x) · · · (z_{m−1} − x).

Since we assumed that m < n + 2, we have m − 1 < n and hence q(x) ∈ Pn.


The next step is to show that p + λq ∈ Pn is a better approximation to f(x) than p(x).

Our first claim is that q(x) and f − p have the same sign. This is true because q(x) has no zeros on the (±) intervals, and thus is of constant sign on each of them. We have q > 0 on I1, ..., Ik1, because every factor (zj − x) > 0 on these intervals; then q < 0 on Ik1+1, ..., Ik2, because (z1 − x) < 0 while the remaining factors are still positive there; and so on, alternating with each group.

The next step is to find λ. Therefore, let ε = max_{x ∈ N} |f(x) − p(x)|. Recall that N is the union of all subintervals [ti, ti+1] which are neither (+) intervals nor (−) intervals, so by definition ε < E. Choose λ > 0 in such a way that

λ ‖q‖ < min{ E − ε, E/2 }.

Our next step is to show that p + λq is a better approximation to f(x) than p(x). We show this for the two cases x ∈ N and x ∉ N.

Let x ∈ N. Then,

|f(x) − (p(x) + λq(x))| ≤ |f(x) − p(x)| + λ|q(x)| ≤ ε + λ‖q‖ < ε + (E − ε) = E.

Let x ∉ N. Then x is in either a (+) interval or a (−) interval. From equation 4.1, we know that

|f(x) − p(x)| > E/2 > λ‖q‖ ≥ λ|q(x)|.

Thus, f − p and λq(x) have the same sign, and we have

|f(x) − (p(x) + λq(x))| = |f(x) − p(x)| − λ|q(x)| < E,

because q(x) is non-zero on S.

So we arrived at a contradiction: we showed that p + λq is a better approximation to f(x) than p(x), while p(x) was assumed to be a best approximation to f(x). Therefore, our assumption m < n + 2 is false, and hence m ≥ n + 2.

It is important to note that if f − p alternates n + 2 times in sign, then f − p must have at least n + 1 zeros. This means that p(x) agrees with f(x) in at least n + 1 points.

3.2 Uniqueness
In this section, we will show that the best approximating
polynomial is unique.

Theorem 3. Let f(x) ∈ C[a, b]. Then the polynomial of best approximation p(x) to f(x) out of Pn is unique.

Proof. Suppose there are two best approximations p(x) and q(x) to f(x) out of Pn. We want to show that these p(x), q(x) ∈ Pn are the same.

If they are both best approximations, they satisfy

‖f − p‖ = ‖f − q‖ = E = min_{s ∈ Pn} ‖f − s‖.

The average r(x) = (p(x) + q(x))/2 of p(x) and q(x) is then also a best approximation, because

‖f − r‖ = ‖ (f − p)/2 + (f − q)/2 ‖ ≤ ½ ‖f − p‖ + ½ ‖f − q‖ = E.

Thus, ‖f − r‖ = E.

By theorem 2, f − r has an alternating set x0, x1, ..., x_{n+1} consisting of n + 2 points.
For each i,

(f − p)(xi) + (f − q)(xi) = 2 (f − r)(xi) = ±2E (alternating),

while

−E ≤ (f − p)(xi), (f − q)(xi) ≤ E.

This means that

(f − p)(xi) = (f − q)(xi) = ±E (alternating), for each i.

Hence, x0, x1, ..., x_{n+1} is an alternating set for both f − p and f − q.

The polynomial q − p = (f − p) − (f − q) then has at least n + 2 zeros (one at each xi). Because q − p ∈ Pn, we must have p(x) = q(x). This is what we wanted to show: if there are two best approximations, then they are the same, and hence the best approximating polynomial is unique.

We can finally combine our previous results in the following


theorem.

Theorem 4. Let f(x) ∈ C[a, b], and let p(x) ∈ Pn. If f − p has an alternating set containing n + 2 (or more) points, then p(x) is the best approximation to f(x) out of Pn.

Proof. This is a proof by contradiction: we will show that if q(x) were a better approximation to f(x) than p(x), then q(x) would have to equal p(x), which is impossible for a strictly better approximation.

Therefore, let x0, x1, ..., x_{n+1} be the alternating set for f − p, and assume q(x) ∈ Pn is a better approximation to f(x) than p(x).

Thus,

‖f − q‖ < ‖f − p‖.

Then we have

|f(xi) − q(xi)| ≤ ‖f − q‖ < ‖f − p‖ = |f(xi) − p(xi)|

for each i = 0, ..., n + 1.

This means that f(xi) − p(xi) and f(xi) − p(xi) − (f(xi) − q(xi)) = q(xi) − p(xi) must have the same sign (if |b| < |a|, then a and a − b have the same sign). Hence, q − p = (f − p) − (f − q) alternates n + 2 (or more) times in sign, because f − p does too. This means that q − p has at least n + 1 zeros. Since q − p ∈ Pn, we must have q(x) = p(x). This contradicts the strict inequality, thus we conclude that p(x) is the best approximation to f(x) out of Pn.

Thus, from theorem 2 and theorem 4 we know that the polynomial


p(x) is the best approximation to f(x) if and only if f – p alternates
in sign at least n + 2 times, where n is the degree of the best
approximating polynomial. Consequently, f – p has at least n + 1
zeros.

We can illustrate this theorem using an example.

Example 1. Consider the function f(x) = sin(4x) on [−π, π]. Figure 4.1 shows this function together with the best approximating polynomial p0 = 0.

Figure 4.1: Illustration of the function f(x) = sin(4x) with best approximating polynomial p0 = 0.

The error f − p0 = sin(4x) attains the values +1 and −1 alternately at 8 points of [−π, π], so it admits alternating sets of up to 8 points. Using theorem 2 and theorem 4, we find that p0 = p1 = p2 = p3 = p4 = p5 = p6 = 0 are best approximations.

This means that the best approximating polynomial of degree 0 is


p0 = 0. This is true since f – p0 alternates 8 times in sign, much
more than the required n + 2 = 2 times.

We can repeat this procedure: the best approximating polynomial


in P1, is p1 = 0, because then f – p1 alternates again 8 times in
sign, much more than the required n + 2 = 3 times.

The polynomial p7 = 0 is not a best approximation, since f – p7
only alternates 8 times in sign and it should alternate at least
n + 2 = 9 times in sign. So in P7 there exists a better
approximating polynomial than p7 = 0.

Example 2. In this example we show that the function p(x) = x − 1/8 is the best linear approximation to the function f(x) = x^2 on [0, 1] (techniques to find this polynomial will be discussed in chapter 5).

The polynomial of best approximation has degree n = 1, so f − p must alternate at least 1 + 2 = 3 times in sign. Consequently, f − p has at least 1 + 1 = 2 zeros. We see this in figure 4.2: f − p = x^2 − x + 1/8 attains the values +1/8, −1/8, +1/8 at x = 0, 1/2, 1, and has the 2 zeros x = (2 ± √2)/4.

Figure 4.2: The polynomial p(x) = x − 1/8 is the best approximation of degree 1 to f(x) = x^2, because f − p changes sign 3 times.
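These claims are easy to confirm numerically. The sketch below (our own check, not from the thesis) evaluates the error of p(x) = x − 1/8 on a grid and, for comparison, the error of the least-squares line, which turns out to be worse in the uniform norm:

```python
import numpy as np

# Equioscillation check for p(x) = x - 1/8 against f(x) = x^2 on [0, 1]:
# the error x^2 - x + 1/8 attains +1/8, -1/8, +1/8 at x = 0, 1/2, 1.
x = np.linspace(0.0, 1.0, 100001)
err = x**2 - (x - 0.125)
print(err.max(), err.min())          # ≈ 0.125, -0.125

# the least-squares line (approximately x - 1/6) has a larger uniform error
a, b = np.polyfit(x, x**2, 1)
ls_err = np.abs(x**2 - (a * x + b)).max()
print(ls_err)                        # ≈ 1/6 > 1/8
```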

We now know the characteristics of the best approximating


polynomial. The next step is to find the maximum error between
f(x) and the best approximating function p(x).

De la Vallée Poussin proved the following theorem, which provides a lower bound for the error E.

Theorem 5 (de la Vallée Poussin). Let f(x) ∈ C[a, b], and suppose that q(x) ∈ Pn is such that f(xi) − q(xi) alternates in sign at n + 2 points a ≤ x0 < x1 < · · · < x_{n+1} ≤ b. Then

min_{0 ≤ i ≤ n+1} |f(xi) − q(xi)| ≤ min_{p ∈ Pn} ‖f − p‖.

Before proving this theorem, we show in figure 4.3 how it works. Suppose we want to approximate the function f(x) = e^x with a quintic polynomial. In the figure, a quintic polynomial r(x) ∈ P5 is shown, chosen in such a way that f − r changes sign 7 times. This is not the best approximating polynomial. The red curve shows the error for the best approximating polynomial p(x), which also has 7 points at which the error changes sign.

Figure 4.3: Illustration of de la Vallée Poussin's theorem for f(x) = e^x and n = 5. Some polynomial r(x) ∈ P5 gives an error f − r for which we can identify n + 2 = 7 points at which f − r changes sign.

The minimum value of |f(xi) − r(xi)| over these points gives a lower bound for the maximum error of the best approximating polynomial p(x) ∈ P5 [7].

The point of the theorem is the following: since the error f(x) − r(x) changes sign n + 2 times, the best possible error min_{p ∈ Pn} ‖f − p‖ is at least as large as |f(xi) − r(xi)| at one of the points xi that give the changes of sign. So de la Vallée Poussin's theorem gives a nice mechanism for developing lower bounds on min_{p ∈ Pn} ‖f − p‖.

Proof (de la Vallée Poussin). We now prove theorem 5. This is a proof by contradiction. Assume that the inequality does not hold. Then the best approximating polynomial p(x) satisfies

|f(xi) − p(xi)| ≤ max_{x ∈ [a, b]} |f(x) − p(x)| < min_i |f(xi) − q(xi)|.

The middle part of the inequality is the maximum difference of |f(x) − p(x)| over all x ∈ [a, b], so it cannot be smaller than the value at any xi ∈ [a, b]. Thus,

|f(xi) − p(xi)| < |f(xi) − q(xi)| for all i = 0, ..., n + 1. ... (4.2)

Now consider p(x) − q(x) = (f(x) − q(x)) − (f(x) − p(x)), which is a polynomial of degree at most n, since p(x), q(x) ∈ Pn. From 4.2, we know that f(xi) − q(xi) always has larger magnitude than f(xi) − p(xi). Thus, the magnitude of f(xi) − p(xi) is never large enough to overcome that of f(xi) − q(xi), and hence

sign(p(xi) − q(xi)) = sign(f(xi) − q(xi)) for all i.

From the hypothesis we know that f(x) − q(x) alternates in sign at least n + 1 times; thus the polynomial p − q does too. Changing sign n + 1 times means at least n + 1 roots, and the only polynomial of degree at most n with n + 1 roots is the zero polynomial. Thus p(x) = q(x). This contradicts the strict inequality in 4.2. Hence, there must be at least one i for which

|f(xi) − q(xi)| ≤ max_{x ∈ [a, b]} |f(x) − p(x)|.
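To make the bound concrete, here is a small numerical sketch (our own illustration; the particular choice of q is hypothetical) for f(x) = e^x and n = 5, using NumPy's Chebyshev utilities. The residual of q alternates in sign near the 7 extrema of T6, and the smallest residual there bounds the minimax error from below:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# de la Vallée Poussin's bound for f(x) = exp(x), n = 5 on [-1, 1].
# As competitor q we truncate a degree-6 Chebyshev interpolant to degree 5;
# any q whose error alternates n + 2 = 7 times would work.
f = np.exp
coefs = C.chebinterpolate(f, 6)
q = C.Chebyshev(coefs[:6])               # truncate to degree 5
xi = np.cos(np.arange(7) * np.pi / 6)    # the 7 extrema of T6 on [-1, 1]
resid = f(xi) - q(xi)                    # alternates in sign

lower = np.abs(resid).min()              # de la Vallée Poussin lower bound
grid = np.linspace(-1.0, 1.0, 10001)
upper = np.abs(f(grid) - q(grid)).max()  # uniform error of q itself
print(lower, upper)                      # lower <= minimax error <= upper
```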

Chapter 4

Chebyshev polynomials
To show how Chebyshev was able to find the best approximating polynomial, we first need to know what the so-called Chebyshev polynomials are.

Definition 2. We denote the Chebyshev polynomial of degree n by Tn(x), and it is defined as

Tn(x) = cos(n arccos(x)), for each n ≥ 0.

This function looks trigonometric, and it is not clear from the


definition that this defines a polynomial for each n. We will show
that it indeed defines an algebraic polynomial.

For n = 0: T0(x) = cos(0) = 1.

For n = 1: T1(x) = cos(arccos(x)) = x.

For n ≥ 1, we use the substitution θ = arccos(x) to rewrite the definition as

Tn(θ(x)) ≡ Tn(θ) = cos(nθ), where θ ∈ [0, π].

Then we can derive a recurrence relation, using the facts that

T_{n+1}(θ) = cos((n + 1)θ) = cos(θ) cos(nθ) − sin(θ) sin(nθ)

and

T_{n−1}(θ) = cos((n − 1)θ) = cos(θ) cos(nθ) + sin(θ) sin(nθ).

If we add these equations, and substitute back θ = arccos(x), we obtain

T_{n+1}(θ) + T_{n−1}(θ) = 2 cos(nθ) cos(θ)
T_{n+1}(θ) = 2 cos(nθ) cos(θ) − T_{n−1}(θ)
T_{n+1}(x) = 2x cos(n arccos(x)) − T_{n−1}(x).

That is,

T_{n+1}(x) = 2x Tn(x) − T_{n−1}(x). (5.1)

Thus, the recurrence relation yields the following Chebyshev polynomials:

T0(x) = 1
T1(x) = x
T2(x) = 2x T1(x) − T0(x) = 2x^2 − 1
T3(x) = 2x T2(x) − T1(x) = 4x^3 − 3x
T4(x) = 2x T3(x) − T2(x) = 8x^4 − 8x^2 + 1
...
T_{n+1}(x) = 2x Tn(x) − T_{n−1}(x), n ≥ 1.

We see that for n ≥ 1, Tn(x) is a polynomial of degree n with leading coefficient 2^{n−1}.
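The recurrence can be turned into a short routine that produces the coefficients of T0, ..., Tn; the sketch below (our own illustration, with coefficient arrays listed lowest degree first; the function name `chebyshev` is ours):

```python
import numpy as np

def chebyshev(n):
    """Coefficient arrays (lowest degree first) of T_0..T_n via
    T_{k+1}(x) = 2x T_k(x) - T_{k-1}(x)."""
    T = [np.array([1.0]), np.array([0.0, 1.0])]
    for k in range(1, n):
        xTk = np.concatenate(([0.0], T[k]))   # multiply T_k by x
        Tkm1 = np.pad(T[k - 1], (0, 2))       # pad T_{k-1} up to degree k+1
        T.append(2.0 * xTk - Tkm1)
    return T[: n + 1]

for t in chebyshev(4):
    print(t)
# T4 has coefficients [1, 0, -8, 0, 8], i.e. 8x^4 - 8x^2 + 1
```

Note that the leading coefficient of Tn produced this way is 2^(n−1), as the text states.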

In the next figure, the first five Chebyshev polynomials are shown.

Figure 5.1: The first five Chebyshev polynomials.

4.1 Properties of the Chebyshev polynomials


The Chebyshev polynomials have a lot of interesting properties. A
couple of them are listed below.

P 1: The Chebyshev polynomials are orthogonal on (−1, 1) with respect to the weight function w(x) = 1/√(1 − x^2).

Proof. Consider

∫_{−1}^{1} Tn(x) Tm(x) / √(1 − x^2) dx.

Using the substitution θ = arccos(x), this gives dθ = −dx/√(1 − x^2) and

∫_{−1}^{1} Tn(x) Tm(x) / √(1 − x^2) dx = ∫_0^π cos(nθ) cos(mθ) dθ.

Suppose n ≠ m. Since

cos(nθ) cos(mθ) = ½ ( cos((n + m)θ) + cos((n − m)θ) ),

we have

∫_0^π cos(nθ) cos(mθ) dθ = ½ [ sin((n + m)θ)/(n + m) + sin((n − m)θ)/(n − m) ]_0^π = 0.

Suppose n = m ≥ 1. Then

∫_0^π cos^2(nθ) dθ = π/2

(for n = m = 0 the integral equals π). Hence we conclude that the Chebyshev polynomials are orthogonal with respect to the weight function 1/√(1 − x^2).

P 2: The Chebyshev polynomial Tn(x) of degree n ≥ 1 has n simple zeros in [−1, 1] at

x_k = cos( (2k − 1)π / (2n) ), for each k = 1, 2, ..., n.

Proof. Let x_k = cos((2k − 1)π/(2n)). Then

Tn(x_k) = cos(n arccos(x_k)) = cos( (2k − 1)π/2 ) = 0.

The x_k are distinct and Tn(x) is a polynomial of degree n, so all the zeros must have this form.

P 3: Tn(x) assumes its absolute extrema at

x'_k = cos(kπ/n), with Tn(x'_k) = (−1)^k, for each k = 0, 1, ..., n.

Proof. Let x'_k = cos(kπ/n). We have

T'n(x) = d/dx [ cos(n arccos(x)) ] = n sin(n arccos(x)) / √(1 − x^2),

and when k = 1, 2, ..., n − 1 we have

T'n(x'_k) = n sin(n arccos(cos(kπ/n))) / √(1 − (x'_k)^2) = n sin(kπ) / √(1 − (x'_k)^2) = 0.

Since Tn(x) is of degree n, its derivative is of degree n − 1, and all of its zeros occur at these n − 1 distinct points.

The other candidates for extrema of Tn(x) are the endpoints of the interval [−1, 1], that is, x'_n = −1 and x'_0 = 1.

For any k = 0, 1, ..., n we have

Tn(x'_k) = cos(n arccos(cos(kπ/n))) = cos(kπ) = (−1)^k.

So we have a maximum at even values of k and a minimum at odd values of k.
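Both properties P 2 and P 3 are easy to verify numerically; a small sketch (our own check, with n = 7 chosen arbitrarily):

```python
import numpy as np

# Check: the zeros of T_n lie at cos((2k-1)π/(2n)), k = 1..n, and the
# extrema at cos(kπ/n), k = 0..n, where T_n takes the values (-1)^k.
n = 7
Tn = lambda t: np.cos(n * np.arccos(t))

zeros = np.cos((2 * np.arange(1, n + 1) - 1) * np.pi / (2 * n))
extrema = np.cos(np.arange(n + 1) * np.pi / n)

print(np.abs(Tn(zeros)).max())   # ≈ 0
print(Tn(extrema))               # alternates 1, -1, 1, ... starting at k = 0
```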

P 4: The monic Chebyshev polynomials (a monic polynomial has the form x^n + c_{n−1}x^{n−1} + · · · + c2 x^2 + c1 x + c0) are defined as

T̃0(x) = 1 and T̃n(x) = 2^{1−n} Tn(x), for each n ≥ 1.

The recurrence relation of the Chebyshev polynomials implies

T̃2(x) = x T̃1(x) − ½ T̃0(x)

and

T̃_{n+1}(x) = x T̃n(x) − ¼ T̃_{n−1}(x),

for each n ≥ 2.

Proof. We derive the monic Chebyshev polynomials by dividing the Chebyshev polynomials Tn(x) by the leading coefficient 2^{n−1}.

The first five monic Chebyshev polynomials are shown in figure 5.2.

P 5: The zeros of T̃n(x) also occur at

x_k = cos( (2k − 1)π / (2n) ), for each k = 1, 2, ..., n,

and the extrema of T̃n(x) occur at

x'_k = cos(kπ/n), with T̃n(x'_k) = (−1)^k / 2^{n−1}, for each k = 0, 1, ..., n.

Proof. This follows from the fact that T̃n(x) is just a multiple of Tn(x).

Figure 5.2: The first five monic Chebyshev polynomials.

P 6: Let P̃n denote the set of all monic polynomials of degree n. The polynomials T̃n(x), for n ≥ 1, have the property that

1/2^{n−1} = max_{x ∈ [−1,1]} |T̃n(x)| ≤ max_{x ∈ [−1,1]} |Pn(x)|, for all Pn ∈ P̃n.

The equality only occurs if Pn ≡ T̃n.

Proof. This is a proof by contradiction. Therefore, suppose that Pn ∈ P̃n and that

max_{x ∈ [−1,1]} |Pn(x)| < 1/2^{n−1} = max_{x ∈ [−1,1]} |T̃n(x)|.

We want to show that this cannot hold. Let Q = T̃n − Pn. Since T̃n and Pn are both monic polynomials of degree n, Q is a polynomial of degree at most n − 1.

At the n + 1 extreme points x'_k = cos(kπ/n) of T̃n, we have

Q(x'_k) = T̃n(x'_k) − Pn(x'_k) = (−1)^k / 2^{n−1} − Pn(x'_k).

From our assumption we have

|Pn(x'_k)| < 1/2^{n−1} for each k = 0, 1, ..., n,

so we have

Q(x'_k) > 0 when k is even, and Q(x'_k) < 0 when k is odd.

Since Q is continuous, we can apply the Intermediate Value Theorem. This theorem implies that for each j = 0, 1, ..., n − 1 the polynomial Q(x) has at least one zero between x'_j and x'_{j+1}. Thus, Q has at least n zeros in the interval [−1, 1]. But we have that the degree of Q(x) is less than n, so we must have Q ≡ 0. This implies that Pn ≡ T̃n, which is a contradiction, since then both maxima would be equal.

P 7: Tn(x) = 2x T_{n−1}(x) − T_{n−2}(x) for n ≥ 2.

Proof. This is equation 5.1 with n + 1 replaced by n.
P 8: Tm(x) Tn(x) = ½ ( T_{m+n}(x) + T_{m−n}(x) ) for m > n.

Proof. Using the trigonometric identity

cos(A) cos(B) = ½ ( cos(A + B) + cos(A − B) ),

we get

Tm(x) · Tn(x) = cos(m arccos(x)) cos(n arccos(x)) = ½ ( cos((m + n) arccos(x)) + cos((m − n) arccos(x)) ) = ½ ( T_{m+n}(x) + T_{m−n}(x) ).

P 9: Tm(Tn(x)) = T_{mn}(x).

Proof. Tm(Tn(x)) = cos(m arccos(cos(n arccos(x)))) = cos(mn arccos(x)) = T_{mn}(x).

P 10: Tn(x) = ½ ( (x + √(x^2 − 1))^n + (x − √(x^2 − 1))^n ).

Proof. Combining the binomial expansions of the right-hand side makes the odd powers of √(x^2 − 1) cancel. Thus, the right-hand side is a polynomial as well.

Let x = cos(θ). Using the identity cos^2(θ) + sin^2(θ) = 1, we find √(x^2 − 1) = i sin(θ), so

½ ( (cos θ + i sin θ)^n + (cos θ − i sin θ)^n ) = ½ ( e^{inθ} + e^{−inθ} ) = cos(nθ) = Tn(x),

which is the desired result. These polynomials are thus equal for −1 ≤ x ≤ 1, and since two polynomials that agree on an interval agree everywhere, for all x.
P 11: For real x with x ≥ 1, we get

Tn(x) = ½ ( (x + √(x^2 − 1))^n + (x − √(x^2 − 1))^n ) = cosh(n arccosh(x)).

Thus,

Tn(cosh(x)) = cosh(nx) for all real x.

Proof. This follows from property 10.

P 12: .

Proof. This follows from property 10.

P 13: For n odd,

x^n = 2^{−n} Σ_{j=0}^{(n−1)/2} 2 C(n, j) T_{n−2j}(x).

For n even, the sum runs to j = n/2 and the last term 2 C(n, n/2) T0 is replaced by C(n, n/2) T0.

Proof. For x ∈ [−1, 1], let x = cos(θ). Using the binomial expansion we get

x^n = 2^{−n} (e^{iθ} + e^{−iθ})^n = 2^{−n} Σ_{j=0}^{n} C(n, j) e^{i(n−2j)θ},

and pairing the terms for j and n − j gives 2 C(n, j) cos((n − 2j)θ) = 2 C(n, j) T_{n−2j}(x). If n is even, the last term in this sum is C(n, n/2) T0 (because then the central term in the binomial expansion is not doubled).

P 14: Tn and T_{n−1} have no common zeros.

Proof. Assume they do have a common zero x0. Then Tn(x0) = 0 = T_{n−1}(x0). But then, using property 7, we find that T_{n−2}(x0) must be zero too. Repeating this, we find Tk(x0) = 0 for every k < n, including k = 0. This is not possible, since T0(x) = 1 has no zeros. Therefore, we conclude that Tn and T_{n−1} have no common zeros.

P 15: For |x| < 1,

T'n(x) = n sin(n arccos(x)) / √(1 − x^2).

Proof.

T'n(x) = d/dx [ cos(n arccos(x)) ] = −sin(n arccos(x)) · n · ( −1/√(1 − x^2) ) = n sin(n arccos(x)) / √(1 − x^2).

For x = ±1 we interpret the derivative as a limit: with x = cos(θ), let θ → 0 and θ → π. Using L'Hôpital's rule we find

T'n(±1) = lim n sin(nθ)/sin(θ) = lim n^2 cos(nθ)/cos(θ) = (±1)^{n+1} n^2.
Chapter 5

How to find the best approximating polynomial in the uniform norm

In this chapter we first show how Chebyshev was able to solve an


approximation problem in the uniform norm. We will then compare
this to approximation in the L2-norm. After this, we will give some
other techniques and utilities to find the best approximating
function. We close this chapter with some examples.

5.1 Chebyshev’s solution


In this section we show step by step an approximation problem
that Chebyshev was able to solve. This section is derived from A
short course on approximation theory by N. L. Carothers and from
Best Approximation: Minimax Theory by S. Ghorai.

The problem that Chebyshev wanted to solve is the following.

Problem 3. Find the best approximating polynomial p_{n−1} ∈ P_{n−1} of degree at most n − 1 to f(x) = x^n on the interval [−1, 1].

This means we want to minimize the error between f(x) = x^n and p_{n−1}(x), that is, minimize

max_{x ∈ [−1,1]} |x^n − p_{n−1}(x)|.

Hence, we can restate the problem in the following way: find the monic polynomial of degree n of smallest norm in C[−1, 1].

We show Chebyshev's solution in steps.

We show Chebyshev’s solution in steps.

Step 1: Simplify the notation. Let E(x) = xn – p and let M =


We know that E(x) has an alternating set: –1 ≤ x0 < x1 <
· · · < xn ≤ 1 containing (n – 1) + 2 = n + 1 points and E(x)

has at least (n – 1) + 1 = n zeros. So and


E(xi+1) = – E(xi) for all i.

Step 2: E(xi) is a relative extreme value for E(x), so at any xi in (−1, 1) we have E'(xi) = 0. E'(x) is a polynomial of degree n − 1, so it has at most n − 1 zeros. Thus, the alternating set consists of x0 = −1, xn = 1, and the n − 1 interior points x1, ..., x_{n−1} at which E'(xi) = 0.
Step 3: Consider the polynomial M^2 − E^2 ∈ P_{2n}. We have M^2 − (E(xi))^2 = 0 for i = 0, 1, ..., n, and M^2 − E^2 ≥ 0 on [−1, 1], so the interior points x1, ..., x_{n−1} must be double roots of M^2 − E^2. Counting, we already have 2(n − 1) + 2 = 2n roots. This means that x1, ..., x_{n−1} are double roots, that x0 and xn are simple roots, and that these are all the roots of M^2 − E^2.

Step 4: The next step is to consider (1 − x^2) E'(x)^2. We already know from the previous steps that x1, ..., x_{n−1} are zeros of E'(x), hence double roots of E'(x)^2. Hence (1 − x^2) E'(x)^2 has double roots at x1, ..., x_{n−1} and simple roots at x0 = −1 and xn = 1. These are all the roots, since (1 − x^2) E'(x)^2 is in P_{2n}.

Step 5: In the previous steps we found that M^2 − E^2 and (1 − x^2) E'(x)^2 are polynomials of the same degree and with the same roots. This means that these polynomials are the same, up to a constant multiple. We can calculate this constant: since E(x) is a monic polynomial of degree n, its derivative E'(x) has leading coefficient n. Comparing leading coefficients gives

n^2 ( M^2 − E(x)^2 ) = (1 − x^2) E'(x)^2,

that is,

E'(x) / √(M^2 − E(x)^2) = ± n / √(1 − x^2).

E' is of constant sign on some interval, so we can fix the sign of the square root there and do not need the ±-sign. If we integrate our result, we get

E(x) = M cos(n arccos(x) + C).

Substituting x = −1, where |E(−1)| = M, gives

E(−1) = M cos(n arccos(−1) + C) = ±M
cos(nπ + C) = ±1
C = mπ for some integer m,

and therefore

E(x) = ±M cos(n arccos(x)).

From the previous chapter we know that cos(n arccos(x)) is the n-th Chebyshev polynomial Tn(x). Thus, it has degree n and leading coefficient 2^{n−1}. Hence, since E is monic, the solution to problem 3 is

E(x) = 2^{1−n} Tn(x).

We know that |Tn(x)| ≤ 1 for |x| ≤ 1, so the minimal norm is M = 2^{1−n}.

Using theorem 4 and the properties of the Chebyshev polynomials, we can give a slicker solution.

Theorem 6. For any n ≥ 1, the formula p(x) = x^n − 2^{1−n} Tn(x) defines a polynomial p ∈ P_{n−1} satisfying

max_{x ∈ [−1,1]} |x^n − p(x)| = 2^{1−n} ≤ max_{x ∈ [−1,1]} |x^n − q(x)|

for any q ∈ P_{n−1}.

Proof. We know that 2^{1−n} Tn(x) is monic of degree n, i.e. has leading coefficient 1, so p ∈ P_{n−1}.

Let x_k = cos((n − k)π/n) for k = 0, 1, ..., n. Then −1 = x0 < x1 < · · · < xn = 1 and

Tn(x_k) = cos((n − k)π) = (−1)^{n−k}.

We have |Tn(x)| = |Tn(cos(θ))| = |cos(nθ)| ≤ 1 for −1 ≤ x ≤ 1. This means that we have found an alternating set for Tn(x) containing n + 1 points.

So we have that x^n − p(x) = 2^{1−n} Tn(x) satisfies |x^n − p(x)| ≤ 2^{1−n}, and for each k = 0, 1, ..., n,

x_k^n − p(x_k) = 2^{1−n} Tn(x_k) = (−1)^{n−k} 2^{1−n}.

Using theorem 4, we find that p(x) must be the best approximating polynomial to x^n out of P_{n−1}.
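A quick numerical sanity check of theorem 6 (our own illustration; the least-squares polynomial is just one hypothetical competitor q):

```python
import numpy as np

# Theorem 6 for n = 6: the best approximation to x^n out of P_{n-1} has
# uniform error 2^(1-n), and a least-squares competitor cannot beat it.
n = 6
x = np.linspace(-1.0, 1.0, 20001)
Tn = np.cos(n * np.arccos(x))

best_err = np.abs(2.0 ** (1 - n) * Tn).max()   # |x^n - p(x)| = 2^(1-n)|T_n(x)|
q = np.polyfit(x, x**n, n - 1)                 # least-squares competitor in P_{n-1}
ls_err = np.abs(x**n - np.polyval(q, x)).max()

print(best_err, ls_err)   # 2^(1-n) = 0.03125 and a strictly larger value
```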
Corollary 2. The monic polynomial of degree exactly n having smallest norm in C[a, b] is

( (b − a)/2 )^n T̃n( (2x − a − b)/(b − a) ).

Proof. Let p(x) be a monic polynomial of degree n on [a, b]. Make the transformation 2x = (a + b) + (b − a)t, so that t = (2x − a − b)/(b − a) runs over [−1, 1] as x runs over [a, b]. Then q(t) = p(x(t)) is a polynomial of degree n in t with leading coefficient ( (b − a)/2 )^n, and

max_{x ∈ [a, b]} |p(x)| = max_{t ∈ [−1, 1]} |q(t)|.

We can write

q(t) = ( (b − a)/2 )^n q̃(t),

where q̃ is a monic polynomial of degree n on [−1, 1], and

max_{t ∈ [−1, 1]} |q(t)| = ( (b − a)/2 )^n max_{t ∈ [−1, 1]} |q̃(t)|.

Hence, for minimum norm, we must have q̃ = T̃n by property 6.

Combining the results and substituting back, we get

p(x) = ( (b − a)/2 )^n T̃n( (2x − a − b)/(b − a) ).
We can generalize theorem 6 and corollary 2 to find the best


approximating polynomial for functions f(x) of degree n with
leading term a0xn, by a polynomial p(x) of degree at most n – 1 in [–
1, 1].

Corollary 3 (Generalizations). Given a polynomial f(x) of degree n
with leading term a_0 x^n,

1. the best approximation to f(x) by a polynomial p(x) of degree at
most n - 1 in [-1, 1] is given by

p(x) = f(x) - a_0 · 2^{-n+1} T_n(x),

with maximum error a_0 · 2^{-n+1}. We will call this technique
Chebyshev approximation.

2. the polynomial of degree exactly n having smallest norm in
C[a, b] is

a_0 · ((b - a)/2)^n · 2^{-n+1} · T_n((2x - a - b)/(b - a)).

We will call this technique smallest norm.

So we have now found two ways to calculate the best approximating
polynomial p(x) ∈ P_n for functions with leading term a_0 x^n. Either
it is found using Chebyshev approximation by the formula
p(x) = f(x) - a_0 · 2^{-n+1} T_n(x), or by finding the polynomial of
degree n having smallest norm in C[a, b] by the formula
a_0 · ((b - a)/2)^n · 2^{-n+1} · T_n((2x - a - b)/(b - a)).
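To illustrate the first approach on a concrete example (a hypothetical one of our own choosing, not from the thesis): take f(x) = 3x^3 + x on [-1, 1], so a_0 = 3 and n = 3. Chebyshev approximation predicts a maximum error of 3 · 2^{-2} = 0.75.

```python
import math

def cheb_T(n, x):
    """T_n(x) via the recurrence T_{k+1} = 2x*T_k - T_{k-1}."""
    if n == 0:
        return 1.0
    t_prev, t_curr = 1.0, x
    for _ in range(2, n + 1):
        t_prev, t_curr = t_curr, 2 * x * t_curr - t_prev
    return t_curr

def f(x):
    return 3 * x ** 3 + x          # leading term a_0 * x^n with a_0 = 3, n = 3

a0, n = 3.0, 3

def p(x):
    """Best degree-(n-1) approximation on [-1, 1] per corollary 3, part 1."""
    return f(x) - a0 * 2.0 ** (1 - n) * cheb_T(n, x)

grid = [-1 + 2 * k / 10000 for k in range(10001)]
err = max(abs(f(x) - p(x)) for x in grid)
print(err)  # 0.75 = a_0 * 2^{-n+1}
```

Since T_3(x) = 4x^3 - 3x, the approximation simplifies to the linear polynomial p(x) = 13x/4, and the error f(x) - p(x) = (3/4) T_3(x) equioscillates at four points, as theorem 4 requires.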

Chapter 6
Conclusion
In this thesis, we studied Chebyshev approximation. We found
that Chebyshev was the first to approximate functions in the
uniform norm. The problem he wanted to solve was to represent a
continuous function f(x) on the closed interval [a, b] by an
algebraic polynomial of degree at most n, in such a way that the
maximum error is minimized.

We first looked at approximation in the L2-norm, to compare it to
the uniform norm later. In this norm, the solution to the
approximation problem is just an orthogonal projection, which can
be found by applying the Gram-Schmidt process. This involves only
standard calculus, so it is an easy procedure.
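As an illustration of the L2 case (a minimal Python sketch using our own Simpson-rule integrator and the first three Legendre polynomials; the function names are ours), the best quadratic L2 approximation of e^x on [-1, 1] is its orthogonal projection onto span{P_0, P_1, P_2}:

```python
import math

def integrate(g, a=-1.0, b=1.0, m=2000):
    """Composite Simpson's rule for the inner products (m must be even)."""
    h = (b - a) / m
    s = g(a) + g(b)
    for k in range(1, m):
        s += (4 if k % 2 else 2) * g(a + k * h)
    return s * h / 3

# The first three Legendre polynomials, orthogonal on [-1, 1]:
legendre = [lambda x: 1.0, lambda x: x, lambda x: 1.5 * x * x - 0.5]

def l2_projection(f, basis):
    """Orthogonal projection of f onto span(basis) in the L2 inner product."""
    coeffs = [integrate(lambda x, p=p: f(x) * p(x)) /
              integrate(lambda x, p=p: p(x) * p(x)) for p in basis]
    return lambda x: sum(c * p(x) for c, p in zip(coeffs, basis))

proj = l2_projection(math.exp, legendre)
# The coefficient of P_1 is <e^x, x> / <x, x> = (2/e) / (2/3) = 3/e.
print(proj(1.0), math.e)  # the projection tracks e^x closely
```

Because the basis is already orthogonal, the Gram-Schmidt step reduces to computing one inner product per basis polynomial, which is exactly why this norm is so convenient.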

We then looked at the actual problem: approximating functions in
the uniform norm. We succeeded in answering the four questions
asked in the introduction: there always exists a best
approximating polynomial and this polynomial is unique. We
found that a necessary and sufficient condition for the best
approximating polynomial is the following:

The polynomial p(x) is the best approximation to f(x) if and only if
f(x) - p(x) attains its maximum error with alternating sign in at
least n + 2 points, where n is the degree of the best approximating
polynomial. Then, f - p has at least n + 1 zeros.

We found that we can construct the best approximating
polynomial using four techniques:

1. Chebyshev approximation: p(x) = f(x) - a_0 · 2^{-n+1} T_n(x). The
maximum error is E = a_0 · 2^{-n+1}.

2. Smallest norm: p(x) = f(x) - E(x), where
E(x) = a_0 · ((b - a)/2)^n · 2^{-n+1} · T_n((2x - a - b)/(b - a)). The
maximum error occurs either at a terminal point or at a zero of
T_n'(x).

Here, T_n(x) is defined as T_n(x) = cos(n arccos(x)) for -1 ≤ x ≤ 1.
These are the so-called Chebyshev polynomials, which satisfy the
recurrence relation T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x).
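The equivalence of the two definitions is easy to check numerically (a quick Python sketch; the sample points and tolerance are arbitrary choices of ours):

```python
import math

def cheb_T(n, x):
    """T_n(x) via the recurrence T_{n+1}(x) = 2x*T_n(x) - T_{n-1}(x)."""
    if n == 0:
        return 1.0
    t_prev, t_curr = 1.0, x
    for _ in range(2, n + 1):
        t_prev, t_curr = t_curr, 2 * x * t_curr - t_prev
    return t_curr

# For -1 <= x <= 1 the recurrence agrees with cos(n*arccos(x)).
for n in range(8):
    for x in (-1.0, -0.3, 0.0, 0.5, 1.0):
        assert abs(cheb_T(n, x) - math.cos(n * math.acos(x))) < 1e-9
print("recurrence matches cos(n arccos x) on [-1, 1]")
```

The recurrence has the advantage of being defined for all real x, whereas the cosine form only makes sense on [-1, 1].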

For intervals other than [-1, 1], a change of variable is needed
for techniques 1, 3 and 4. This is done using the following
substitution:

x = ((b - a)t + a + b)/2, with t ∈ [-1, 1].

The difference between approximation in the L2-norm and the
uniform norm is that the uniform norm in general gives a smaller
maximum error. However, it is more difficult to find the solution.
So if the difference between the errors in the two norms is
negligible, then it is recommended to use the orthogonal
projection.

This thesis gave a good overview of what Chebyshev
approximation is. However, approximation theory is much broader
than Chebyshev approximation. For example, the Weierstrass
theorem, Bernstein polynomials and spline approximation are also
important in approximation theory.

References
[1] K. G. Steffens. The history of approximation theory.
Birkhäuser, 2006.
[2] S. J. Leon. Linear Algebra with applications. Pearson, 8th
edition, 2010.
[3] T. Marošević. A choice of norm in discrete
approximation. Mathematical Communications, 1(2):147-152, 1996.
[4] E. Celledoni. Best approximation in the 2-norm. Norwegian
University of Science and Technology, 2012.
[5] N. L. Carothers. A short course on approximation theory.
Bowling Green State University.
[6] S. De Marchi. Lectures on Multivariate Polynomial
Interpolation. University of Padua, 2015.
[7] M. Embree. Oscillation Theorem. Virginia Tech, 2016.
[8] E. Süli and D. F. Mayers. An introduction to numerical
analysis. Cambridge University Press, 2003.
[9] R. L. Burden and J. Douglas Faires. Numerical Analysis.
Brooks Cole, 9th international edition, 2011.
[10] S. Ghorai. Best Approximation: Minimax Theory.
University of Calcutta, 2014.
[11] D. Levy. Introduction to numerical analysis 1. University
of Maryland, 2011.
[12] L. Fox and I. B. Parker. Chebyshev Polynomials in
Numerical Analysis. Oxford University Press, 1968.
[13] C. de Boor. Approximation Theory, Proceedings of
symposia in applied mathematics, volume 36. American
Mathematical Society, 1986.
[14] G. G. Lorentz. Approximation of Functions. Holt, Rinehart
and Winston, 1966.
