Fundamental Ideas of Analysis
Digitized by the Internet Archive
in 2019 with funding from
Kahle/Austin Foundation
https://archive.org/details/fundamentalideasOOOOreed
MICHAEL C. REED
Duke University
This book was typeset in Palatino by John Davies with the LaTeX Documentation System and printed
and bound by the Hamilton Printing Company. The cover was printed by Phoenix Color Corporation.
Recognizing the importance of preserving what has been written, it is a policy of John Wiley & Sons,
Inc. to have books of enduring value published in the United States printed on acid-free paper, and
we exert our best efforts to that end.
The paper in this book was manufactured by a mill whose forest management programs include
sustained yield harvesting of its timber lands. Sustained yield harvesting principles ensure that the
number of trees cut each year does not exceed the amount of new growth.
Copyright ©1998 by John Wiley & Sons, Inc. All rights reserved.
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any
form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise,
except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either
the prior written permission of the Publisher, or authorization through payment of the appropriate
per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (508)
750-8400, fax (508) 750-4470. Requests to the Publisher for permission should be addressed to the
Permissions Department, John Wiley & Sons, Inc., 605 Third Avenue, New York, NY 10158-0012,
(212) 850-6011, fax (212) 850-6008, E-Mail: PERMREQ@WILEY.COM.
The ideas and methods of mathematics, long central to the physical sciences,
now play an increasingly important role in a wide variety of disciplines.
This success, in fields as diverse as biology, economics, operations
research, robotics, cryptology, and finance, raises difficult questions
about the undergraduate curriculum. Many mathematics majors now
plan careers in other fields, and non-majors form a significant part of the
student population in advanced undergraduate courses. Because of the
interests of these students, and the (appropriate) use of machine computation,
some courses which used to be highly theoretical (e.g., ordinary
differential equations) now emphasize methods and computational experiments.
There is a bewildering array of undergraduate courses, both
pure and applied. What should we teach these students? Where will
they learn the power and beauty of pure mathematics? How can we
make clear the coherence and continuity of the central ideas of mathematics?
Because of all of these issues, the undergraduate analysis course is
more important than ever. For in this course students learn that mathematics
is more than just methods that work. Analysis provides theorems
that prove that results are true and provides techniques to estimate the
errors in approximate calculations. In this course, students are asked for
the first time to construct long proofs and it is from the proofs that they
obtain deep understanding of the ideas. Finally, the ideas and methods
of analysis play a fundamental role in ordinary differential equations,
probability theory, differential geometry, numerical analysis, complex
analysis, and partial differential equations, as well as in most areas of
applied mathematics. For all these reasons, analysis should be a central,
if not the central, course in the undergraduate curriculum.
This analysis text makes the connections to the rest of the curriculum
visible. The standard topics for a one term undergraduate real analysis
course are covered in the unstarred sections of chapters 1 through 6.
But, in addition, numerous examples are given which show the ways in
which real analysis is used in ordinary and partial differential equations,
probability theory, numerical analysis, number theory, and so forth. In
this way the importance of the technical questions becomes clear to the
students and the entire course is more lively and interesting. These
“connections” are developed in the starred sections of chapters 1 through
6 and in chapters 7-10. Throughout the text the connections are also
emphasized in examples, problems, and projects. The development of
the standard topics does not depend on the material in the starred sections,
so the instructor can choose which starred sections to include and
which to omit. Furthermore, with one or two exceptions, the starred
sections and chapters are independent of each other. My intention is
to provide materials which make it easy for the individual instructor to
construct a lively and interesting course that is appropriate, both in content
and in level of difficulty, for his or her own students. Assistance is
available in a separate Instructor's Manual in which the starred sections
and chapters, the projects, and the harder problems are thoroughly discussed.
Michael Reed
Contents
Preface
Chapter 1 Preliminaries
1.1 The Real Numbers
1.2 Sets and Functions
1.3 Cardinality
1.4 Methods of Proof
Chapter 2 Sequences
2.1 Convergence
2.2 Limit Theorems
2.3 Two-state Markov Chains*
2.4 Cauchy Sequences
2.5 Supremum and Infimum
2.6 The Bolzano-Weierstrass Theorem
2.7 The Quadratic Map*
Projects
* *
Bibliography
Index
CHAPTER 1
Preliminaries
This chapter has two purposes. First, we review the properties of real
numbers and establish our notation for sets and functions. Second, we
provide simple examples of mathematical proofs which can serve as
models for students with no previous experience in proving theorems.
The reader is surely familiar with the basic arithmetic properties of addition,
x + y, and multiplication, xy, of real numbers.
Addition and multiplication are commutative (P1 and P5) and associative
(P2 and P6). Property P9 is called the distributive property. We occasionally
write x · y instead of xy to emphasize that we are multiplying
x and y. A set with operations of addition and multiplication which satisfy
these nine properties is called a field. Thus, the set of real numbers,
which we denote by R, is a field. There are other fields. Any set with two
elements can be made into a field by defining addition and multiplication
appropriately (see problem 7). The complex numbers, C, are a field
that contains R. We define C and study some of its properties in Section
[Figure 1.1.1: the real number line from −1 to 3, with the points √2, e, and π marked.]
The real numbers have an order relation ≤, “less than or equal to”,
which has the following properties:
We can define ≥ and < in terms of ≤. We say that x ≥ y if and only if
y ≤ x, and we say that x < y if and only if x ≤ y and x ≠ y. In terms of
the line in Figure 1.1.1, x < y means that x is to the left of y.
All the rules for the manipulation of elementary algebraic expressions,
for example, removing parentheses or the laws of exponents, can
be proven from (P1) through (P9). The usual rules for the manipulation
of inequalities can be derived from (P1)-(P9) and (O1)-(O5). We illustrate
how this can be done in the following proposition. The reader is
asked to construct other such proofs in problems 1 through 6.
Proposition 1.1.1
(d) For all real numbers x and y, if x ≤ y, then −x ≥ −y.
(x + z) + (-z) = (y + z) + (-z)
v- (by P3)
It follows from (1) and part (b) that x(−y) = −(xy). Since this equality
holds for all real numbers x and y, it holds if we replace x by −x. Thus
(−x)(−y) = −((−x)y). But, by (2) and (P4), (−x)y = −(xy), so we can
use the result of problem 3 to conclude that
Similarly,
Thus −y ≤ −x, which is the same as −x ≥ −y by the definition of ≥. □
Proposition 1.1.2
Therefore, if x + y ≥ 0, we have that |x + y| = x + y ≤ |x| + |y|. If x + y < 0,
we proceed as follows. We always know that x ≥ −|x| since this is clearly
true if x is nonnegative and x = −|x| if x is negative. Similarly, y ≥ −|y|.
Thus, again using (O4) twice,
Using part (d) of Proposition 1.1.1, we see that −(x + y) ≤ |x| + |y|.
Therefore, since |x + y| = −(x + y) if x + y < 0, we conclude that |x + y| ≤
|x| + |y|, which completes the proof of (c). □
There are two other important properties of the real numbers. The
real numbers are complete, which means roughly that every sequence
of real numbers that looks as if it is converging does indeed converge to
a limit. This is discussed in Section 2.4. Second, the real numbers satisfy
the Archimedean property, which says that if a > 0 and b > 0, then there
is an integer n so that na > b. If we think about the real numbers, this
seems obvious. However, there are ordered fields which do not have
the Archimedean property (ordered fields are defined in problem 12).
Throughout the discussion of completeness in Section 2.4 we assume that
the Archimedean property holds for R.
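For concrete numbers the property can be checked directly: dividing b by a and stepping one past the integer part of the quotient gives a suitable n. The following is a minimal Python sketch; the function name `archimedean_n` and the use of exact `Fraction` arithmetic are illustrative assumptions, not part of the text.

```python
import math
from fractions import Fraction


def archimedean_n(a, b):
    """Return a positive integer n with n * a > b, given a > 0 and b > 0."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must both be positive")
    # floor(b / a) * a <= b, so one more copy of a pushes the product past b.
    return math.floor(Fraction(b) / Fraction(a)) + 1


# Hypothetical sample values: a is small, b is comparatively large.
a = Fraction(1, 1000)
n = archimedean_n(a, 7)
print(n, n * a > 7)
```

The sketch only illustrates the statement for numbers a computer can represent exactly; it says nothing about ordered fields in which the property fails.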
Problems
1. Prove that if x and y are real numbers, then |xy| = |x||y|. Hint: check all
the cases.
2. Use the field properties of the real numbers to provide a careful proof of
the elementary algebraic identity (x + y)^2 = x^2 + 2xy + y^2.
3. Prove that −(−x) = x for all real numbers x. Hint: show that (−x) + (x) =
0 and then use part (b) of Proposition 1.1.1.
4. Prove that all real numbers z satisfy z · 0 = 0 = 0 · z. Hint: first prove that
0 + z · 0 = z · 0 + z · 0 and then use part (a) of Proposition 1.1.1.
5. Use part (d) of Proposition 1.1.1 to prove that if x < y and z < 0, then
zy < zx.
(a) Using the properties (O1)-(O5), prove that 0 < x^2 < y^2. Is the
conclusion true if we omit the hypothesis that 0 < x?
(b) Using mathematical induction, prove that 0 < x^n < y^n for all positive
integers n. Induction is discussed in Section 1.4.
0 + 0 = 0; 1 + 1 = 0; 1 + 0 = 1 = 0 + 1
8. Prove that if x and y are real numbers, then 2xy ≤ x^2 + y^2.
12. Let F be a field. Suppose that there is a set P ⊂ F which satisfies the
following properties:
(a) For each x in F, exactly one of the following three statements holds:
x is in P; −x is in P; x = 0.
(b) If x is in P and y is in P, then x + y is in P and xy is in P.
If such a P exists, F is called an ordered field. For any set P ⊂ F, define
x ≤ y if and only if (y − x) is in P or x = y. Prove that P satisfies the
properties (a) and (b) if and only if ≤ satisfies the properties (O1)-(O5).
S = {x | a ≤ x ≤ b}
defines the set S of real numbers that are greater than or equal to a and
less than or equal to b. S is also denoted by [a, b] and called a closed
interval because it contains its endpoints.
T = {x | a < x < b}
defines the set T of real numbers which are greater than a and less than
b. T is also denoted by (a, b) and called an open interval. Occasionally,
we will use half-open intervals [a, b) = {x | a ≤ x < b}. The half-open interval
(a, b] is defined analogously. Sometimes we specify a set by listing
its members. For example, the natural numbers are the set N, where
N = {1, 2, 3, ...}
Z = {..., −2, −1, 0, 1, 2, ...}.
The set of real numbers that can be written in the form m/n, where m and
n are integers and n ≠ 0, is called the rational numbers and denoted by
Q. A real number which is not rational is called irrational. If x belongs
to a set X, we write
x ∈ X,
x ∉ X.
We use the standard notation for the union, S ∪ T, and intersection, S ∩ T,
of sets:
S ∪ T = {x | x ∈ S or x ∈ T}
S ∩ T = {x | x ∈ S and x ∈ T}.
S^c = {x ∈ X | x ∉ S}.
Thus, if S = [2,3] and T = [1,4], then [2,3] × [1,4] is the set of ordered
pairs (x, y) such that 2 ≤ x ≤ 3 and 1 ≤ y ≤ 4. That is, [2,3] × [1,4] is just
the rectangle in the plane with vertices at the points (2,1), (3,1), (2,4),
and (3,4). Since the Euclidean plane R^2 is just the set of all ordered pairs
(x, y) where x ∈ R and y ∈ R, we see that R^2 = R × R. Note that there is a
or
t = f(s).
Note the distinction between F and f. The set F is a subset of the set
of ordered pairs, while f(s) is the name of the second element of the
ordered pair whose first element is s. The symbol f is the name of the
“rule” which assigns f(s) to s.
Definition. The set {s | (s, t) ∈ F} is called the domain of f, and the set
{t | (s, t) ∈ F} is called the range of f. These sets are sometimes denoted
by Dom(f) and Ran(f) for short.
Thus, the set F consists of the points in the plane that we normally refer
to as the graph of f. See Figure 1.2.1(a). The requirement in the definition
of function that each s occur as the first element of at most one ordered
pair ensures that for each s ∈ Dom(f) the function has exactly one value.
That is, the vertical line passing through (s, 0) intersects the graph of f
in exactly one point. The domain of f is R, and the range of f is the
set [−2, ∞). We use the symbol [a, ∞) to denote the set {x ∈ R | x ≥ a}.
Similarly, the symbol (−∞, b] denotes the set {x ∈ R | x ≤ b}, and (a, ∞)
and (−∞, b) are defined analogously.
is the graph of the natural logarithm function; see Figure 1.2.1(b). However,
in this case the only members of S = R which occur as the first
element of ordered pairs in F are positive numbers. Thus Dom(f) is
the interval (0, ∞). On the other hand, any real number is the natural
logarithm of some s > 0; so Ran(f) = R.
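\( \begin{pmatrix} a & b \\ c & d \end{pmatrix} \)
its determinant ad − bc. Then,
\[ F = \left\{ \left( \begin{pmatrix} a & b \\ c & d \end{pmatrix},\; ad - bc \right) \;\middle|\; \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in M(2) \right\} \]
and
\[ f\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc. \]
In this case Dom(f) = M(2) and Ran(f) = R, since, given any real number
q, it is easy to choose a, b, c, and d so that ad − bc = q.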
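The claim that every real number q occurs as a determinant can be made concrete. The Python sketch below uses one convenient choice of entries (a = q, d = 1, b = c = 0); the helper names `det2` and `matrix_with_det` are hypothetical, and other choices of entries work equally well.

```python
def det2(a, b, c, d):
    """Determinant ad - bc of the two-by-two matrix with rows (a, b) and (c, d)."""
    return a * d - b * c


def matrix_with_det(q):
    """Return entries (a, b, c, d) of one matrix whose determinant is q."""
    # Assumed choice: the diagonal matrix with entries q and 1.
    return (q, 0.0, 0.0, 1.0)


print(det2(*matrix_with_det(-3.5)))
# Two different matrices can share a determinant, so f is not one-to-one.
print(det2(1, 2, 3, 4), det2(2, 0, 0, -1))
```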
Example 4 For each point (a, b) in R^2, let f(a, b) be the two-by-two
matrix given by
f(a, b)
F = {(g, I(g)) | g ∈ C[0,1]}.
The domain of I is C[0,1] and the range of I is R since, given any real
number a, we can easily find a function g so that I(g) = a.
f(f^{-1}(t)) = t,
f^{-1}(f(s)) = s.
It is clear that f^{-1} is one-to-one and that the inverse function of f^{-1} itself
is the original function f. We will see later that the inverse function of
the natural logarithm is the exponential function.
Many of the functions which we will consider in this book are functions
from R to R. In this case there are several natural ways to make
new functions from two given functions f and g. We define the sum of
the functions, f + g, by
on the domain
(fg)(x) = f(x)g(x)
on the same domain. Finally we define the composition of the two functions
f and g, denoted by f ∘ g, by
(f ∘ g)(x) = f(g(x))
on the domain
Thus, to compute the value of f ∘ g at x, we first compute the number g(x)
and then compute f(g(x)). The reason for the complicated expression for
the domain of f ∘ g is that x must be in the domain of g and g(x) must be
in the domain of f. Note that f ∘ g and g ∘ f are not normally the same
function.
and
(f ∘ g)(x) = 1/sin x.
On the other hand, sin x is defined everywhere, so
Dom(g ∘ f) = {x ∈ R | x ≠ 0}
and
(g ∘ f)(x) = sin(1/x).
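The order of composition matters, and a short computation shows it. The Python sketch below assumes, consistently with the formulas that survive in this example, that f(x) = 1/x and g(x) = sin x; the helper `compose` is a hypothetical name.

```python
import math


def compose(outer, inner):
    """Return the function x -> outer(inner(x))."""
    return lambda x: outer(inner(x))


def f(x):
    return 1 / x


g = math.sin

f_of_g = compose(f, g)   # x -> 1 / sin(x)
g_of_f = compose(g, f)   # x -> sin(1 / x)

print(f_of_g(1.0), g_of_f(1.0))
```

At x = 0 the second composition is undefined because 0 is not in the domain of f, and in this sketch the first is undefined as well because g(0) = 0 is not in the domain of f. Python reports both as a `ZeroDivisionError`.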
Problems
1. Let S be the open interval (1,2) and let T be the closed interval [-2,2].
Describe the following sets:
(a) S ∪ T.
(b) S ∩ T.
(c) R \ S.
(d) T \ S.
(e) R \ (T \ S).
2. Describe the following sets of real numbers:
(a) f(x) =
(b) f(x) = -±2.
(c) f(x) = sin
(d) f(x) = ln |x|.
10. In each case, determine the domain and compute a formula for f ∘ g and
g ∘ f.
1.3 Cardinality
Two sets, S and T, are said to have the same number of elements, or the
same cardinality, if there is a one-to-one function, f, from S onto T, that
is Dom(f) = S and Ran(f) = T. This definition is reasonable because
if S and T have the same cardinality, then the function f gives a way of
matching up each element of S with a corresponding element of T. We
say that f establishes a one-to-one correspondence between S and T.
\[ f(n) = \begin{cases} 2n & \text{if } n \ge 1 \\ 1 - 2n & \text{if } n \le 0 \end{cases} \]
has these properties. Notice that this means that, according to our definition
of cardinality, Z and N have the same cardinality even though N is
strictly contained in Z.
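A finite check makes the correspondence tangible. The Python sketch below assumes the function is f(n) = 2n for n ≥ 1 and f(n) = 1 − 2n for n ≤ 0, and it introduces a hypothetical inverse `f_inverse` for illustration. Checking a finite window is of course not a proof that f is one-to-one and onto.

```python
def f(n):
    """Map the integers to the natural numbers 1, 2, 3, ..."""
    return 2 * n if n >= 1 else 1 - 2 * n


def f_inverse(m):
    """Undo f: even values come from positive n, odd values from n <= 0."""
    return m // 2 if m % 2 == 0 else (1 - m) // 2


window = range(-50, 51)
values = sorted(f(n) for n in window)
print(values[:6])
```

Positive integers land on the even natural numbers and the remaining integers on the odd ones, so the 101 integers from −50 to 50 fill the natural numbers 1 through 101 with no repeats.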
fc /fafe)
is a one-to-one function from N onto S. Therefore, S is countable. □
N =
[33]. We use it in the following proposition. For other uses, see Proposition
1.4.2 in the next section or Section 6.6.
A 2"3“
So far, all the infinite sets that we have considered have been countable,
and, therefore, the following theorem may come as a surprise.
□ Theorem 1.3.6 The set of real numbers between 0 and 1, that is (0,1), is
not countable.
Proof. In the proof we will use several facts about the decimal expansion
of real numbers. Decimal expansions are discussed in project 4 of
Chapter 6. Every real number x between 0 and 1 has a decimal expansion
x = 0.x_1x_2x_3... where each x_i is an integer between 0 and 9. Each decimal
expansion corresponds to a different number, except for expansions
ending in 0's, which correspond to numbers that can also be represented
by expansions ending in 9's. For example, 1/2 can be written as .5000...
or as .4999...
The proof is by contradiction. Suppose that (0,1) were countable.
Then there would be a one-to-one function f from N to (0,1) with domain
N and range (0,1). Denote by x_j^{(n)} the jth integer in the decimal
expansion of f(n). That is,
\[ f(1) = 0.x_1^{(1)} x_2^{(1)} x_3^{(1)} \ldots \]
\[ f(2) = 0.x_1^{(2)} x_2^{(2)} x_3^{(2)} \ldots \]
\[ y = 0.y_1 y_2 y_3 \ldots y_n \ldots \]
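The diagonal construction can be tried on a finite table of digits. In the Python sketch below the rule for choosing each digit of y (use 5, unless the diagonal digit is already 5, in which case use 6) is an assumption made for illustration, since the rule used in the proof is not part of the surviving passage. It avoids the digits 0 and 9, so the ambiguity between expansions ending in 0's and 9's does not arise.

```python
def diagonal(expansions):
    """Return a digit string that differs from row n in position n.

    `expansions` is a list of digit strings; row n holds the leading
    digits of the decimal expansion of f(n).
    """
    digits = []
    for n, row in enumerate(expansions):
        # Assumed rule: pick 5 unless the diagonal digit is 5, then pick 6.
        digits.append("6" if row[n] == "5" else "5")
    return "".join(digits)


# Hypothetical first four rows of a purported list of all numbers in (0,1).
rows = ["5000", "3333", "1415", "7182"]
y = diagonal(rows)
print("0." + y)
```

Whatever rows are supplied, the result disagrees with row n in its nth digit, which is the heart of the contradiction.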
Problems
1. Prove that the set S = {5,10,15,20,...} is countable by constructing a
one-to-one function from S onto N.
2. Prove that the set of real numbers of the form e^n, n = 0, ±1, ±2, ... is
countable.
3. Prove that:
6. Prove that the set of two-by-two matrices with real entries is uncountable.
8. Suppose that for each natural number n, the set A_n is countable. Denote
the union of the sets A_n by A = ∪_n A_n. Prove that A is countable.
9. For each of the following sets, say whether it is finite, countable, or
uncountable:
and 1.3.4 have exactly this form. Sometimes a theorem statement has
only conclusions, Q. This is the case for Theorems 1.3.5 and 1.3.6. This
is done for brevity when it is assumed that the reader knows what the
assumptions, P, are. For example, for Theorem 1.3.5, the unwritten assumptions
are: (1) “The real numbers satisfy the properties P1 through
P9 of Section 1.1”; (2) “Q is the set of real numbers of the form m/n.” The
first sentence could be listed as an assumption for every theorem in this
book, but that would be annoying, so the assumption is unwritten but
implicit in the theorem statement. The second statement is just the definition
of Q. Another example of unwritten hypotheses is Proposition
1.1.2, which seems to have only conclusions. It is very important to remember
that a theorem which asserts that P implies Q does not assert
that either P or Q is true. It asserts only that if P is true, then Q is also
true.
“Not P” is the statement that not all the sentences in P are true. The
statements “P implies Q” and “not Q implies not P” are logically equivalent.
Here is how one can see this. First, suppose that P implies Q.
Suppose that not Q is true. If not P is false, then P is true, which would
imply that Q is true. Therefore, not P must be true. Thus, not Q implies
not P. On the other hand, suppose that not Q implies not P. Suppose P
is true. Then Q must be true because otherwise not P would be true
since not Q implies not P. The fact that “P implies Q” and “not Q implies
not P” are equivalent is useful because it means that we have two ways
of proving that P implies Q. We can take P as true and prove that the
statements Q are true. This is called a direct proof. Or we can take the
statement not Q as our hypothesis and try to prove that the statement
not P is true. This is called a contrapositive proof. We give both a direct
and a contrapositive proof for the following simple proposition.
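The equivalence of an implication and its contrapositive can also be confirmed by listing all truth values. The Python sketch below models “P implies Q” as the usual truth-functional implication (false only when P is true and Q is false); that modelling choice and the helper name `implies` are assumptions of the sketch.

```python
from itertools import product


def implies(p, q):
    """Truth-functional implication: false only when p is true and q is false."""
    return (not p) or q


cases = list(product([False, True], repeat=2))

# "P implies Q" agrees with "not Q implies not P" in every case.
contrapositive_equivalent = all(
    implies(p, q) == implies(not q, not p) for p, q in cases
)
# The converse "Q implies P" does not always agree with "P implies Q".
converse_equivalent = all(implies(p, q) == implies(q, p) for p, q in cases)

print(contrapositive_equivalent, converse_equivalent)
```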
√2 n = m.
2n^2 = m^2 (4)
2n^2 = (2k)^2 = 4k^2
or
n^2 = 2k^2.
\[ 1 + 2 + \cdots + n = \frac{n(n+1)}{2} \tag{5} \]
Proof. Q(n) is statement (5). Q(1) is certainly true since 1 = (1) · (2)/2.
Now, suppose that Q(k) is true for some k. That is,
\[ 1 + 2 + \cdots + k = \frac{k(k+1)}{2}. \]
Then
\[ \begin{aligned} 1 + 2 + \cdots + k + (k+1) &= \frac{k(k+1)}{2} + (k+1) \\ &= (k+1)\left(\frac{k}{2} + 1\right) \\ &= \frac{(k+1)(k+2)}{2} \\ &= \frac{(k+1)((k+1)+1)}{2}. \end{aligned} \]
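Both halves of the induction argument for the formula 1 + 2 + ... + n = n(n + 1)/2 can be spot-checked numerically. The Python sketch below uses the hypothetical name `closed_form` for the right-hand side; a finite check like this illustrates the statement but does not replace the proof.

```python
def closed_form(n):
    """The right-hand side n(n + 1)/2 of the formula."""
    return n * (n + 1) // 2


# Base case Q(1).
base_case = closed_form(1) == 1

# Inductive step: adding k + 1 to the formula for k gives the formula for k + 1.
inductive_step = all(
    closed_form(k) + (k + 1) == closed_form(k + 1) for k in range(1, 500)
)

# Direct comparison with the sum itself.
direct = all(sum(range(1, n + 1)) == closed_form(n) for n in range(1, 500))

print(base_case, inductive_step, direct)
```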
Problems
1. Give five examples which show that P implies Q does not necessarily
mean that Q implies P.
2. Proposition 1.1.2 seems to consist only of conclusions. What are the unwritten
hypotheses?
3. Suppose that a, b, c, and d are positive numbers such that a/b < c/d. Prove
that
a/b < (a + c)/(b + d) < c/d.
4. Suppose that 0 < a < b. Prove that
10. In order to disprove the implication that P implies Q, one often provides
an example in which P is true but Q is not. Such an example is called a
counterexample to the statement that P implies Q. For each of the following
incorrect statements, identify P, identify Q, and provide a counterexample.
(a) Prove that there is a q ∈ Q so that |q − √2| < d − c. Hint: use problem
11 of Section 1.1.
(b) Prove that q − √2 is irrational.
(c) Prove that there is an irrational number between c and d.
\[ s_k = \sum_{j=0}^{k} \frac{1}{j!}, \]
Sequences
2.1 Convergence
2, 4, 6, 8, 10, 12,...
1, -1, 1, -1, 1, -1,...
1, 1 1/2, 1 3/4, 1 7/8, 1 15/16, ...
More formally, a sequence is a function from the natural numbers, N, to
the real numbers, R. We usually give names to the values of the function,
for example, by letting a_1 denote the value of the function at 1, a_2
denote the value of the function at 2, and so forth. We represent the entire
sequence by {a_n}_{n=1}^∞, where n is the index which runs from 1 to ∞,
indicating that the successive terms of the sequence are a_1, a_2, a_3, ....
Often, a sequence is specified by giving an explicit formula for the
terms that shows how they depend on n. For example, the first sequence
above is given by the formula a_n = 2n, the second sequence by the formula
a_n = (−1)^{n+1}, and the third by a_n = 2 − 2^{−n+1}. However, sometimes
a sequence is specified by giving an algorithm for computing the
terms. For example, if
a_1 = 2
a_{n+1} = 6a_n
for all n, then it is easy to see that a_2 = (6)(2), a_3 = (6)^2(2), and so forth.
Thus, the formula for the nth term of the sequence is a_n = 2(6^{n−1}). When
we specify a sequence, the name of the index doesn't matter, so {a_n}_{n=1}^∞
and {a_k}_{k=1}^∞ specify the same sequence. Sometimes, a sequence is given
where the index n starts at some integer other than 1. In such a case,
one can define a new index for the same sequence that starts at 1. For
example, if the sequence {a_n}_{n=2}^∞ is given, we can set k = n − 1 and
observe that {a_{k+1}}_{k=1}^∞ specifies the same sequence. When the index
runs from 1 to ∞, we will often drop the explicit statement of the range
of the index and write simply {a_n} instead of {a_n}_{n=1}^∞.
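The two ways of specifying a sequence, by formula and by algorithm, can be compared directly. The Python sketch below assumes the recursion behind the example is a_1 = 2 with each later term six times the previous one, which is what the values a_2 = (6)(2) and a_3 = (6)^2(2) indicate; the function names are hypothetical.

```python
def explicit(n):
    """Closed formula a_n = 2 * 6**(n - 1), valid for n = 1, 2, 3, ..."""
    return 2 * 6 ** (n - 1)


def recursive_terms(count):
    """First `count` terms from a_1 = 2 and a_{n+1} = 6 * a_n."""
    terms = [2]
    while len(terms) < count:
        terms.append(6 * terms[-1])
    return terms


print(recursive_terms(5))
print([explicit(n) for n in range(1, 6)])
```

Re-indexing changes nothing about the terms themselves: `explicit(k + 1)` for k = 0, 1, 2, ... lists the same numbers starting from index 0.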
The idea of a “limit” of a sequence is very simple. We say that a
sequence {an} converges to a limit a if the terms in the sequence, an, get
closer and closer to a as n gets larger. So, the first sequence above does
not converge to a limit because each term is two bigger than the previous
term, so the sequence can't get closer and closer to any finite number.
The second sequence doesn't converge because it keeps hopping back
and forth between −1 and 1. The third sequence converges to 2 because,
as n gets larger, the terms get closer and closer to 2.
These ideas seem so simple and clear that it is reasonable to ask why
we need a technical definition of convergence. The answer is that we
will encounter lots of sequences whose convergence and limits are not
obvious. Even when the sequence is given by an explicit formula, it may
be difficult to see immediately whether it converges and, if so, what the
limit is. This is the case with the following two sequences,
\[ a_n = n \sin\frac{1}{n} \]
and
\[ a_n = \left(1 + \frac{1}{n}\right)^n, \]
which are often introduced in calculus. At other times, the sequence may
be specified by giving a_0 and a recursive formula for the rest of the terms,
for example,
\[ a_{n+1} = \frac{1}{2}a_n + 2 \]
or
\[ a_{n+1} = 2a_n(1 - a_n). \]
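Recursively defined sequences are easy to explore numerically before any convergence proof is attempted. The Python sketch below assumes the two recursions read a_{n+1} = (1/2)a_n + 2 and a_{n+1} = 2a_n(1 − a_n); the starting values 0.5 and 0.3 and the helper name `iterate` are illustrative assumptions.

```python
def iterate(step, start, count):
    """Return the first `count` terms of the sequence a, step(a), step(step(a)), ..."""
    terms = [start]
    for _ in range(count - 1):
        terms.append(step(terms[-1]))
    return terms


affine = iterate(lambda a: 0.5 * a + 2, 0.5, 60)
quadratic = iterate(lambda a: 2 * a * (1 - a), 0.3, 60)

print(affine[:5], affine[-1])
print(quadratic[:5], quadratic[-1])
```

With these starting values the computed terms settle near 4 and near 0.5 respectively. Such experiments suggest a candidate limit, but whether and why the sequence converges still has to be proved.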
\[ \lim_{n\to\infty} a_n = a \quad\text{or}\quad a_n \to a. \]
Figure 2.1.1
There must exist an N so that all the terms a_n in the sequence for n > N
lie in the interval. Note that for convergence to hold this must be true
for each ε. That is, for each ε there must exist such an N. The size of N
will normally depend on ε, smaller ε requiring larger N, which is why
we wrote N(ε) in the definition.
\[ \frac{1}{\sqrt{n}} < \varepsilon \]
or
\[ 1 < \varepsilon\sqrt{n}, \]
which is equivalent to
\[ \frac{1}{\varepsilon^2} < n. \]
Choose N to be any integer greater than or equal to 1/ε^2. Then, if n > N,
we have
\[ n > N \ge \frac{1}{\varepsilon^2}. \]
Therefore,
\[ \frac{1}{\sqrt{n}} < \frac{1}{\sqrt{N}} \le \varepsilon. \]
This shows that (1) holds for n > N. Since we have shown how to choose
N(ε) for every ε > 0, we have proven that a_n → 1. Throughout the
proof we used properties of the order relation among real numbers. For
example, we used the fact that if 0 < x < y, then 0 < y^{−1} < x^{−1}.
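The choice of N in this example can be computed and tested for particular values of ε. The Python sketch below takes N to be the smallest integer greater than or equal to 1/ε^2 and uses exact rational arithmetic; the function names are hypothetical. For positive quantities 1/√n ≤ ε is the same as 1/n ≤ ε^2, and it is the second form that the sketch checks so that no rounding is involved.

```python
import math
from fractions import Fraction


def N_of_eps(eps):
    """Smallest integer N with N >= 1 / eps**2, for a positive Fraction eps."""
    return math.ceil(1 / eps ** 2)


def tail_within_eps(eps, how_many=1000):
    """Check 1/sqrt(n) <= eps, in the squared form 1/n <= eps**2, for n = N, ..., N + how_many - 1."""
    N = N_of_eps(eps)
    return all(Fraction(1, n) <= eps ** 2 for n in range(N, N + how_many))


for eps in (Fraction(1, 2), Fraction(1, 10), Fraction(3, 7)):
    print(eps, N_of_eps(eps), tail_within_eps(eps))
```

Smaller values of ε produce larger values of N, in line with the remark that N depends on ε.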
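\[ a_n = \sqrt{2 + \frac{3}{n}}. \]
Since 3/n gets smaller and smaller as n gets larger, the expression under the
square root sign gets closer and closer to 2. So, it is intuitively reasonable
that this sequence has a limit and the limit is √2. In order to prove it, let
ε > 0 be given. We want to show that
\[ \left| \sqrt{2 + \frac{3}{n}} - \sqrt{2} \right| < \varepsilon \tag{2} \]
for n large enough. Rationalizing the numerator gives
\[ \left| \sqrt{2 + \frac{3}{n}} - \sqrt{2} \right| = \frac{\left(\sqrt{2 + \frac{3}{n}} - \sqrt{2}\right)\left(\sqrt{2 + \frac{3}{n}} + \sqrt{2}\right)}{\sqrt{2 + \frac{3}{n}} + \sqrt{2}} = \frac{3/n}{\sqrt{2 + \frac{3}{n}} + \sqrt{2}} \le \frac{3/n}{2\sqrt{2}}, \]
so (2) holds whenever
\[ \frac{3/n}{2\sqrt{2}} < \varepsilon, \]
which is equivalent to
\[ n > \frac{3}{2\varepsilon\sqrt{2}}. \]
Choose N to be any integer > 3/(2ε√2). Then, if n > N, we have n >
3/(2ε√2), which is equivalent to 3/(2n√2) < ε. Therefore n > N implies
that (2) holds. We have verified the criterion (1) in the definition of limit,
so we conclude that √(2 + 3/n) → √2.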
\[ a_n = r a_{n-1} \]
where |r| < 1. Then, a_2 = 5r; a_3 = 5r^2; and, in general, a_n = 5r^{n−1}. Since
we are assuming |r| < 1, it seems reasonable that a_n → 0. Let's prove it.
Let ε > 0 be given. We want to show that for n large enough
\[ |5r^{n-1}| < \varepsilon \tag{3} \]
or equivalently,
\[ (n-1)\ln\frac{1}{|r|} > \ln\frac{5}{\varepsilon} \]
or
\[ n > \frac{\ln(5/\varepsilon)}{\ln(1/|r|)} + 1. \tag{4} \]
Note that ln(1/|r|) > 0 since |r| < 1. Choose N to be any integer such that
\[ N > \frac{\ln(5/\varepsilon)}{\ln(1/|r|)} + 1. \]
Then n > N implies that (3) holds, since (4) and (3) are equivalent. Therefore,
by the definition of limit, we conclude that a_n → 0.
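The logarithmic bound on N can be evaluated for specific values of r and ε. The Python sketch below assumes the sequence is a_n = 5r^{n−1} with 0 < |r| < 1 and takes N to be the smallest admissible integer at or above the bound ln(5/ε)/ln(1/|r|) + 1; the function names and the sample values are hypothetical, and floating-point logarithms make this an illustration rather than an exact computation.

```python
import math


def N_geometric(r, eps):
    """An integer N >= ln(5/eps) / ln(1/|r|) + 1, and at least 1."""
    bound = math.log(5 / eps) / math.log(1 / abs(r)) + 1
    return max(1, math.ceil(bound))


def tail_within_eps(r, eps, how_many=200):
    """Check |5 r**(n-1)| <= eps for n = N, ..., N + how_many - 1."""
    N = N_geometric(r, eps)
    return all(abs(5 * r ** (n - 1)) <= eps for n in range(N, N + how_many))


print(N_geometric(-0.9, 1e-3), tail_within_eps(-0.9, 1e-3))
print(N_geometric(0.5, 1e-6), tail_within_eps(0.5, 1e-6))
```

The closer |r| is to 1, the larger N must be for the same ε, because ln(1/|r|) in the denominator is then close to 0.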
In terms of the number line, saying that a_n → ∞ means that, given any
number M, there is an N so that n > N implies that a_n is to the right of
M.
for n large enough. If M ≤ 0, the inequality holds for all n so there is
nothing more to prove. If M > 0, the inequality (5) is equivalent to
\[ n > \frac{\ln M - \ln 5}{\ln r} + 1. \tag{6} \]
Problems
(a) a_n = n sin(1/n).
(b) a_n = (1 + 1/n)^n.
(c) a_{n+1} = (1/2)a_n + 2, a_1 = .5.
(d) a_{n+1} = 2.5a_n(1 − a_n), a_1 = .3.
(a) an = l + 4t=.
(b) an = 1 +
(c) an — 3 + °~n.
(d) an = I
(a) an = 2n.
(b) an = -n2.
(c) an — Vhrn.
(d) an = g.
(a) c_n → 0.
(b) c_n → ∞.
(c) c_n → 1.
(d) {c_n} does not converge, nor does it diverge to +∞ or −∞.
\[ |a_{n+1} - a| < |a_n - a| \]
for each n, but {a_n} does not converge to a. Thus a sequence can
get “closer and closer” to a without converging to a.
(b) Find a sequence {a_n} and a real number a so that a_n → a but so
that the above inequality is violated for infinitely many n. Thus a
sequence can converge without getting “closer and closer.”
2.2 Limit Theorems
\[ |b_n| = |b_n - b + b| \le |b_n - b| + |b| \le 1 + |b| \]
for n ≥ N. Define M = max{|b_1|, |b_2|, ..., |b_{N−1}|, 1 + |b|}. Then, by the
way that M is defined, we see that |b_n| ≤ M for 1 ≤ n < N. In addition,
|b_n| ≤ 1 + |b| ≤ M for n ≥ N. Thus, the inequality in (7) holds for all n.
□
Proposition 2.2.2 Let {a_n} and {b_n} be sequences such that a_n → L and
b_n → L. Let {c_n} be a sequence such that a_n ≤ c_n ≤ b_n for all n. Then
c_n → L.
Proof. Let ε > 0 be given. Since a_n → L, we can choose an N_1 such that
n > N_1 implies |a_n − L| < ε. Since b_n → L, we can choose an N_2 such
that n > N_2 implies |b_n − L| < ε. If we choose N = max{N_1, N_2}, then
n > N implies that both n > N_1 and n > N_2. Suppose n > N. Then
either c_n ≤ L or c_n > L. In the first case,
a_n ≤ c_n ≤ L
so
|L − c_n| ≤ |L − a_n| < ε.
In the second case,
L < c_n ≤ b_n
so
|c_n − L| ≤ |b_n − L| < ε.
In either case, n > N implies
|L − c_n| < ε,
1 − 1/n ≤ c_n ≤ 1 + 1/n.
□ Theorem 2.2.3 Let {a_n} and {b_n} be sequences and suppose that a_n → a
and b_n → b. Then,
\[ \lim_{n\to\infty} (a_n + b_n) = a + b. \]
= ε.
□ Theorem 2.2.4 Let {a_n} be a sequence and suppose that a_n → a. Then,
\[ \lim_{n\to\infty} k a_n = k a \]
for any constant k.
□ Theorem 2.2.5 Let {a_n} and {b_n} be sequences and suppose that a_n → a
and b_n → b. Then,
\[ \lim_{n\to\infty} (a_n b_n) = ab. \]
We want to show that the sum on the right-hand side is less than ε if we
choose n large enough. We'll work on the second term first. If a = 0,
then the second term is < ε/2 for all n. If a ≠ 0, then, since b_n → b, we can
choose an N_1 so that n > N_1 implies
\[ |b_n - b| < \frac{\varepsilon}{2|a|}. \]
\[ |a_n - a| < \frac{\varepsilon}{2M}. \]
\[ < M\,\frac{\varepsilon}{2M} + |a|\,\frac{\varepsilon}{2|a|} = \varepsilon. \]
□ Theorem 2.2.6 Let {a_n} and {b_n} be sequences and suppose that a_n → a
and b_n → b. Suppose that b ≠ 0 and b_n ≠ 0 for any n. Then,
\[ \lim_{n\to\infty} \frac{a_n}{b_n} = \frac{a}{b}. \]
The proofs of Theorem 2.2.4 and Theorem 2.2.6 are left to the student
(problems 3 and 5).
\[ \lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^2 = 1. \]
\[ \frac{3n+1}{n+2} = \frac{3 + \frac{1}{n}}{1 + \frac{2}{n}}. \]
Since the numerator on the right-hand side has limit 3 and the denominator
has limit 1, Theorem 2.2.6 assures us that {a_n} has a limit and that
the limit is 3.
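A quick numerical look supports both the algebraic rewriting and the value of the limit for a_n = (3n + 1)/(n + 2). In the Python sketch below the function names are hypothetical, and the tolerances simply allow for floating-point rounding.

```python
def a(n):
    """The sequence a_n = (3n + 1) / (n + 2)."""
    return (3 * n + 1) / (n + 2)


def rewritten(n):
    """The same term after dividing numerator and denominator by n."""
    return (3 + 1 / n) / (1 + 2 / n)


for n in (1, 10, 100, 10_000, 1_000_000):
    print(n, a(n), 3 - a(n))
```

The gap 3 − a_n equals 5/(n + 2), so it shrinks roughly like 5/n, which is what the printed differences show.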
Notice how much easier the proof in Example 3 was than the direct
proof in problem 3b of Section 2.1. There is an important point here
about the role of theory in mathematics. Suppose that we'd been given
six problems just like problem 3b except that the coefficients in the numerator
and denominator were different. For each problem the direct
proof of convergence would have been long, and after a while we would
have noticed that the pattern of proof was the same. We would be doing
lots of hard work (basically the same hard work) each time to show that
the limit of the ratio is the ratio of the limits. That's when one sees the
need for proving Theorem 2.2.6. The proof of Theorem 2.2.6 contains all
the hard work which we were doing over and over in the separate problems.
Once Theorem 2.2.6 is proved, it can be used to do each of the six
problems easily.
Problems
1. Prove that each of the following limits exists by using Theorem 2.2.3 -
Theorem 2.2.6:
(c) a_n = (n^2 + 6)/(3n^2 − 2).
5+(A'2
i 3* )
(d) an = 2n + 5
2+ 3n — 2
6. Let p(x) be any polynomial and suppose that a_n → a. Prove that
7. Let {a_n} and {b_n} be sequences and suppose that a_n < b_n for all n and
that a_n → ∞. Prove that b_n → ∞.
8. Let a_n = √(n+1) − √n. Prove that a_n → 0.
9. (a) Let {a_n} be the sequence in problem 1(c) of Section 2.1. Prove that
{a_n} → 4. Hint: subtract 4 from both sides.
(b) Consider the sequence defined by the recursive relation a_{n+1} =
αa_n + 2. Show that if |α| < 1 the sequence has a limit independent
of a_1.
10. The Euclidean plane R^2 = R × R is the set of ordered pairs (x, y), where
x ∈ R and y ∈ R. Note that we can add pairs, (x_1, y_1) + (x_2, y_2) = (x_1 +
x_2, y_1 + y_2), and multiply pairs by real numbers, a(x, y) = (ax, ay). These
operations correspond to vector addition and scalar multiplication. We
define a notion of “size” for points in R^2 by
That is, the size of (x, y) is just the Euclidean distance from the point to
the origin.
(a) Let x_1, x_2, y_1, y_2 be real numbers. Using the inequality in problem 8
of Section 1.1, prove that
(b) Prove that for any two points (x_1, y_1) and (x_2, y_2),
The first formula says simply that the probability that the phone will be
free on the n + 1 check is the probability that it was free on the check
times the probability that if it's free it remains free plus the probability
that it was busy on the check times the probability that if it's busy it
will switch to free. Similar reasoning gives the second formula. If we are
given the probability that the phone starts free, po, and the probability
that it starts busy, qo, then we can use these recursive formulas to com¬
pute pn and qn for each n. Although it would be complicated to write out
the formulas explicitly, we can see what is happening if we reformulate
the recursion using vectors and matrices. Think of the pair (pn, qn) as a
two-dimensional column vector. Then (8) and (9) can be rewritten as
( Pn+l \ = / 1 -Q P \ l Pn \ (10)
V Qn+l ) \ q 1 -P ) \ Qn )
( i -q p ^
A (ID
V q i-p )
Vn+1 — Avn
SO
vn = Anv 0.
This formula shows that we can obtain the probabilities at the $n$th check
by applying the $n$th power of the matrix $A$ to the vector of initial probabilities
$v_0 = (p_0, q_0)$. Note that we must have $p_0 + q_0 = 1$ since the phone
is either free or busy initially. By adding equations (8) and (9) we see that
$p_{n+1} + q_{n+1} = p_n + q_n$. Thus $p_n + q_n = 1$ for all $n$.
We have created two sequences of real numbers, $\{p_n\}$ and $\{q_n\}$. We
would like to know how they behave for large $n$. If we wait a long time,
what are the probabilities that the phone will be free or busy? How do
these probabilities depend on the initial probabilities $p_0$, $q_0$? First we assume
that $\{p_n\}$ and $\{q_n\}$ converge to limits $\bar p$ and $\bar q$, respectively, and ask
what those limits might be. Using equation (8) and Theorems 2.2.3 and
2.2.4 from Section 2.2, we see that
$$\begin{aligned}
\bar p &= \lim_{n\to\infty} p_n \\
&= \lim_{n\to\infty} p_{n+1} \\
&= (1 - q)\lim_{n\to\infty} p_n + p \lim_{n\to\infty} q_n \\
&= (1 - q)\bar p + p\bar q.
\end{aligned}$$
In similar fashion,
This gives us two linear equations that must be satisfied by any limits $\bar p$
and $\bar q$. The equations are independent and thus have a unique solution,
which is easily calculated to be:
$$\bar p = \frac{p}{p+q}, \qquad \bar q = \frac{q}{p+q}.$$
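The limiting values can be checked numerically before reading the proof. The following is a minimal sketch, assuming the recursion (8), (9) is the one read off from the two rows of the matrix in (10); the function names and the sample values p = 0.3, q = 0.5 are illustrative choices, not taken from the text.

```python
def step(p_n, q_n, p, q):
    """One check: a free phone stays free with probability 1 - q,
    a busy phone becomes free with probability p (rows of the matrix in (10))."""
    return (1 - q) * p_n + p * q_n, q * p_n + (1 - p) * q_n


def iterate_chain(p0, q0, p, q, n):
    """Apply the recursion n times, i.e. compute v_n = A^n v_0 one step at a time."""
    p_n, q_n = p0, q0
    for _ in range(n):
        p_n, q_n = step(p_n, q_n, p, q)
    return p_n, q_n


def predicted_limit(p, q):
    """Candidate limits p/(p+q) and q/(p+q) obtained in the text."""
    return p / (p + q), q / (p + q)


if __name__ == "__main__":
    p, q = 0.3, 0.5  # illustrative transition probabilities (assumption)
    for p0 in (0.0, 0.25, 1.0):
        # Different starting probabilities should approach the same pair.
        print(p0, iterate_chain(p0, 1 - p0, p, q, 50), predicted_limit(p, q))
```

For every starting vector tried, the iterates settle near the same pair, which is what the argument that follows establishes whenever 0 < p < 1 and 0 < q < 1.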
We shall show that $p_n \to \bar p$ and $q_n \to \bar q$. Using the equations (8), (9), and
the equations above satisfied by $\bar p$ and $\bar q$, we calculate
= 2a2(l - (p + q))2
From now on we assume that $0 < p < 1$ and $0 < q < 1$. It follows that
$0 < p + q < 2$, which implies that $|1 - (p + q)| < 1$. Therefore,
$$\lim_{n\to\infty} (1 - (p + q))^n = 0$$
2.3 Two-state Markov Chains 43
and so it follows from Theorem 2.2.4 that the right-hand side of (12) converges
to zero. Thus, the left-hand side of (12) converges to zero too.
Since
$$(p_{n+1} - \bar p)^2 \le (p_{n+1} - \bar p)^2 + (q_{n+1} - \bar q)^2,$$
Problems
6. In a certain city, taxis are sometimes on the north side and sometimes on
the south side, but they must be in one place or the other. Let $p_n$ and $q_n$
denote the probabilities that a particular taxi is on the north side and the
south side, respectively, on the $n$th time that we check. If a taxi is on the
north side when we check, it has a probability of .5 of having switched
to the south side when we next check. If it is on the south side, it has a
probability of .3 of having switched north when we check again. If there
are 100 taxis, about how many will be on the north and south sides if we
wait a long time?
7. Suppose that taxis can be on the north side or the south side, and let $p_n$
and $q_n$ be the probabilities of finding them there on the $n$th check. Suppose
that between any two times that we check, each taxi has a finite
probability $r$ of going home to the taxi barn (neither north nor south) from
which it doesn't emerge. Explain why $p_n$ and $q_n$ satisfy a recursion relation
of the form
$$p_{n+1} = b_{11} p_n + b_{12} q_n, \qquad q_{n+1} = b_{21} p_n + b_{22} q_n,$$
where the coefficients $b_{ij}$ all satisfy $0 < b_{ij} < 1$ and in addition $b_{11} + b_{21} < 1$
and $b_{12} + b_{22} < 1$. By summing the two equations, prove that $p_n \to 0$
and $q_n \to 0$ as $n \to \infty$. What does this mean?
8. Suppose that the taxi in problem 7 can reemerge from the taxi barn and go
to the north side or the south side with certain probabilities. Formulate
(but do not solve) a three-state Markov chain for the probabilities $p_n$, $q_n$,
and rn that the taxi will be on the north side, the south side, or in the taxi
barn, respectively.
In the first three sections of this chapter we proved that many different
sequences converge to limits. In all cases, the idea of the proof was the
same. First we guessed the limit, $a$. Then we subtracted the purported
limit from the terms of the sequence, $a_n$, making a new sequence $\{a_n - a\}$.
Then we proved that the sequence of differences converges to zero; that
is, $(a_n - a) \to 0$. Since this is equivalent to $a_n \to a$, we concluded that the
sequence $\{a_n\}$ converges and the limit is, indeed, $a$. But what if we can't
2.4 Cauchy Sequences 45
That is, a sequence is a Cauchy sequence if, given any e > 0, all the
terms of the sequence are within e of each other from some index N on.
We emphasize that for each e there must exist an N. In general, if e is
smaller, one will have to choose N larger. Notice that there is no mention
of a limit a.
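The definition can be explored numerically without ever naming a limit. The sketch below uses the sequence $a_n = (3n+1)/(n+2)$, which appears in the estimate later in this section, and the choice $N > 10/\varepsilon$ made there; the function names are illustrative, and a finite scan is only a sanity check of the inequality, not a proof.

```python
import math


def a(n):
    """The sequence a_n = (3n + 1)/(n + 2) used in the estimate of this section."""
    return (3 * n + 1) / (n + 2)


def choose_N(eps):
    """Smallest integer N with N > 10/eps, the choice made in the text."""
    return math.floor(10 / eps) + 1


def max_gap(N, horizon):
    """Largest |a_n - a_m| over N <= n, m <= horizon.
    Only finitely many terms are inspected, so this illustrates the
    Cauchy condition rather than proving it."""
    terms = [a(n) for n in range(N, horizon + 1)]
    return max(terms) - min(terms)


if __name__ == "__main__":
    for eps in (0.5, 0.1, 0.01):
        N = choose_N(eps)
        print(eps, N, max_gap(N, N + 5000))
```

Note that the check compares terms only with each other; the value 3 that the terms approach never enters the computation.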
$$\left|\frac{3n+1}{n+2} - \frac{3m+1}{m+2}\right| = \frac{5|n - m|}{(n+2)(m+2)}
\le \frac{5n}{(n+2)(m+2)} + \frac{5m}{(n+2)(m+2)}
\le \frac{5}{m+2} + \frac{5}{n+2}.$$
In the last step we used the fact that $\frac{k}{k+2} < 1$ for each positive integer $k$.
Choose $N$ so that $N > 10/\varepsilon$, or equivalently $5/N < \varepsilon/2$; the reason for
this choice will be clear shortly. If $n > N$ and $m > N$, then
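Proof. Let $a = \lim a_n$ and let $\varepsilon > 0$ be given. Since $a_n \to a$, we can
choose $N$ so that $n > N$ implies that $|a_n - a| < \varepsilon/2$. Therefore if $n > N$
and $m > N$, we find
$$|a_n - a_m| \le |a_n - a| + |a - a_m| \le \varepsilon/2 + \varepsilon/2.$$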
$\pi = 3.14159\ldots$
$e = 2.71828\ldots$
$\sqrt{2} = 1.41421\ldots$
$r_1 = 3.0$
$r_2 = 3.1$
$r_3 = 3.14$
$r_4 = 3.141$
$r_5 = 3.1415,$
Figure 2.4.1 (two panels, case 1 and case 2, showing $b_1$, $m_1$, $c_1$ and the second interval $[b_2, c_2]$ containing $a_{N_2}$)
Note that $c_1 - b_1 = M - a_1$. Now, let $m_1$ be the midpoint of the interval
$[b_1, c_1]$. If there is one term of the sequence in the interval $[m_1, c_1]$, then all
the rest of the terms of the sequence must be in $[m_1, c_1]$ since the sequence
is monotone increasing. In this case, let $N_2$ be an integer so that $a_{N_2}$ is in
$[m_1, c_1]$, and define our second interval by $b_2 = m_1$ and $c_2 = c_1$, i.e. the
right half of $[b_1, c_1]$. See case 1 in Figure 2.4.1. If no points of the sequence
are in the interval $[m_1, c_1]$, we define our second interval to be the left half
of $[b_1, c_1]$; that is, $b_2 = b_1$ and $c_2 = m_1$; see case 2 in Figure 2.4.1. In either
case, we know that
and
$$c_2 - b_2 = \frac{1}{2}(M - a_1).$$
and
$$c_j - b_j = \left(\frac{1}{2}\right)^{j-1}(M - a_1).$$
and
$$c_{j+1} - b_{j+1} = \left(\frac{1}{2}\right)^{j}(M - a_1).$$
Now, let $\varepsilon > 0$ be given. Choose $j$ so that $2^{-j+1}(M - a_1) < \varepsilon$. Then, if
$n > N_j$ and $m > N_j$, both $a_n$ and $a_m$ are in the interval $[b_j, c_j]$, so
f(n) = = tfi,
. 1 — In X In x
Problems
2. Prove that the rational numbers are dense in the real numbers without
using decimal expansions. Hint: use problem 11 of Section 1.1.
3. Suppose that $\{a_n\}$ and $\{b_n\}$ are both Cauchy sequences and that $a_n \le b_n$
for each $n$. Prove that $\lim_{n\to\infty} a_n \le \lim_{n\to\infty} b_n$.
4. Suppose that $\{x_n\}$ and $\{y_n\}$ are both Cauchy sequences. Prove that
$\lim_{n\to\infty} (x_n - y_n)$ exists.
5. Suppose that $\{a_n\}$ is a Cauchy sequence. Prove that $\{a_n^2\}$ is a Cauchy
sequence. Is the converse true?
7. Suppose that the terms $\{a_n\}$ satisfy $|a_{n+1} - a_n| < 2^{-n}$ for all $n$. Prove that
$\{a_n\}$ is a Cauchy sequence.
9. Suppose that $a_1 > 0$ and $a_{n+1} = a_n + \frac{1}{a_n}$. Prove that $\{a_n\}$ diverges to $\infty$.
Hint: if {an} converges to a nonzero limit, find an equation that would
be satisfied by the limit.
10. Let $a_1 = \sqrt{2}$, and let $a_n$ for $n \ge 2$ be defined recursively by the formula
^n+1
11. (a) Compute enough terms of the sequence $a_n = \sqrt[n]{n}$ to guess the limit.
(b) Let $a_n = \sqrt[n]{n} - 1$. Use the binomial theorem to prove that
$$n = (1 + a_n)^n > \frac{n(n-1)}{2}\, a_n^2.$$
(a) Prove that $f(x) = \left(1 + \frac{1}{x}\right)^x$ is increasing for $x > 1$. Hint: show that
$f'(1) > 0$ and $f''(x) > 0$ for $x > 1$.
(b) Use the binomial theorem to show that $a_n$ is bounded above by
$1 + \sum_{k=0}^{n} \frac{1}{k!}$.
$$b_n = \frac{a_1 + a_2 + \cdots + a_n}{n}.$$
Prove that $\{b_n\}$ is monotone and bounded and therefore has a limit.
14. Let $\{a_n\}$ be a sequence and suppose that $a_n \to a$. Define the sequence
$\{b_n\}$ as above and prove that $b_n \to a$.
Though not all bounded sets have maximal elements, they all have
least upper bounds, as the following theorem shows.
□ Theorem 2.5.1 Let $S$ be a set of real numbers which is bounded above.
Then $S$ has a unique least upper bound.
$2^{-n+1}(M - b)$. Since the sequence $\{a_n\}$ is decreasing and is bounded below
by $b$, Theorem 2.4.3 guarantees that $\{a_n\}$ converges to a limit $a$.
To complete the proof we use two simple results that we have left as
exercises. First, since $a_n$ is an upper bound for $S$ for each $n$, the limit $a$
is also an upper bound for $S$ (problem 2). Next, let $\varepsilon > 0$ be given, and
choose $n$ so that $2^{-n}(M - b) < \varepsilon/2$ and so that $|a_n - a| < \varepsilon/2$. Then,
i | | . . . e e
%n I ^ | |^n n\ — 2
The least upper bound of the set S is also called the supremum of the
set and denoted by sup S. A similar proof shows that any set which is
bounded below has a greatest lower bound. The greatest lower bound
is called the infimum of the set and is denoted by inf S. If a set S is
not bounded above, we define $\sup S = +\infty$. It is easy to see that if
$\sup S = +\infty$ then there is a sequence of points $s_n \in S$ such that $s_n \to +\infty$
(problem 7). Similarly, if $S$ is not bounded below we define $\inf S = -\infty$.
Our discussion of completeness assumes the Archimedean property
(for example, it was used implicitly in the proof of Theorem 2.4.3). The
Archimedean property follows from Theorem 2.5.1.
□ Theorem 2.5.2 Let $a$ and $b$ be real numbers which satisfy $0 < a < b$.
Then there is a positive integer $n$ such that $b < na$.
The following result will be useful when we define the Riemann integral;
see (10) in Section 3.3.
□ Theorem 2.5.3 Let $A$ and $B$ be two sets of real numbers and suppose
that for each $a \in A$ and $b \in B$ the inequality $a \le b$ holds. Then $\sup A \le \inf B$.
Proof. Choose any $b \in B$. Since $a \le b$ for all $a \in A$, it is clear that $b$ is an
upper bound for $A$. Since $\sup A$ is the least upper bound, we see that
$\sup A \le b$.
Since this is true for all $b \in B$, the set $B$ is bounded below by $\sup A$. Since
$\inf B$ is by definition the greatest lower bound, we conclude that $\sup A \le \inf B$. □
Problems
(a) {4|neN}.
(b) {2 - 4 | ne N}.
(c) $\{e^r \mid r \in \mathbb{Q}\}$.
(d) $\{x^2 \mid 0 < x < 2\}$.
2. Let $S$ be a set of real numbers and suppose that $a_n$ is an upper bound for
$S$ for each $n$. Prove that if $a_n \to a$, then $a$ is an upper bound for $S$.
4. Prove that least upper bounds are unique. That is, if $\mu_1$ and $\mu_2$ are both
least upper bounds for the set $S$, then $\mu_1 = \mu_2$.
5. Suppose that $a$ is an upper bound for the set $S$ and that there is a sequence
of elements $x_n \in S$ such that $x_n \to a$. Prove that $a$ is the least upper bound.
2.6 The Bolzano-Weierstrass Theorem 55
(b) Let $S$ be the set consisting of all numbers of the form $a_1 + a_2$ where
$a_1 \in S_1$ and $a_2 \in S_2$. Prove that $\sup S = \sup S_1 + \sup S_2$.
8. Let $S$ be a set of real numbers and suppose that $a > 0$. Prove that
$\sup\{ax \mid x \in S\} = a \sup S$.
9. Let $\{a_n\}$ and $\{b_n\}$ be bounded sequences and define sets $A$, $B$, and $C$ by
$A = \{a_n\}$, $B = \{b_n\}$, and $C = \{a_n + b_n\}$. Prove that $\sup C \le \sup A + \sup B$.
Give an example to show that strict inequality may hold.
10. (a) Assume that Theorem 2.5.1 is true and prove Theorem 2.4.2.
(b) Assume that Theorem 2.4.3 is true and prove Theorem 2.4.2.
(c) Conclude that Theorem 2.4.2, Theorem 2.4.3, and Theorem 2.5.1 are
equivalent.
0^*21 0-4) a. . .
$$|a_{n_k} - d| < \frac{1}{k}.$$
It follows easily from this inequality that $a_{n_k} \to d$ as $k \to \infty$, so we have
found a subsequence of $\{a_n\}$ that converges to $d$. □
$$a_{2k} = 2 + \frac{1}{2k} \to 2$$
$$a_{2k+1} = -2 + \frac{1}{2k+1} \to -2.$$
Example 2 Let ao = .2 and define the rest of the sequence {an} by the
recursive relation
Proof. Let $\{a_n\}$ be the sequence. Since it is bounded, all of the terms are
contained in an interval of the form $[-M, M]$ for some $M$. We will construct
the subsequence and a nested family $\{I_k\}$ of subintervals whose
lengths shrink to zero so that each $I_k$ contains the tail of the subsequence.
We first divide the interval $[-M, M]$ into two parts at the midpoint.
If $[0, M]$ contains infinitely many terms in the sequence $\{a_n\}$, we choose
$a_{n_1}$ to be one of those terms and choose $I_1 = [0, M]$. If
$[0, M]$ contains only finitely
many terms, then $[-M, 0]$ must contain infinitely many terms of $\{a_n\}$. In
this case we let $a_{n_1}$ be one of those terms and choose $I_1 = [-M, 0]$. In
either case, $I_1$ has length $M$ and contains $a_{n_1}$ and infinitely many other
terms of $\{a_n\}$. We now divide $I_1$ at its midpoint, making two subintervals.
Either the right subinterval or the left subinterval (or both) must
contain infinitely many points of $\{a_n\}$. Choose $I_2$ to be a subinterval
with infinitely many points of $\{a_n\}$ and define $a_{n_2}$ to be any member of
$\{a_n\}$ in this subinterval whose index is greater than $n_1$. Note that the
length of $I_2$ is $2^{-1}M$. See Figure 2.6.1.
Figure 2.6.1
Continuing in this manner we construct a nested family, $\{I_k\}$, of
intervals, each containing all the succeeding ones. The length of $I_k$ is
$2^{-k+1}M$, and a term of the sequence $a_{n_k}$ is in $I_k$. Since the intervals are
nested, each $I_k$ contains $a_{n_j}$ for all $j \ge k$. Given $\varepsilon > 0$, we can choose $K$
so that $2^{-K+1}M < \varepsilon$. If $j \ge K$ and $k \ge K$, both $a_{n_j}$ and $a_{n_k}$ are in $I_K$, so
Problems
1. Prove that if a sequence converges it has exactly one limit point. Is the
converse true? Prove it or give a counterexample.
2. For each of the following sequences, find the limit points. For each limit
point find a subsequence that converges to it:
(b) an = (-l)n +
(c) an = sin ™.
3. Suppose that the sequence {an} converges to a and that d is a limit point
of the sequence {bn}. Prove that ad is a limit point of the sequence {anbn}.
4. Let c be a limit point of {an} and d be a limit point of {bn}. Is c + d
necessarily a limit point of $\{a_n + b_n\}$? Prove it or give a counterexample.
5. Let $\{y_j\}_{j=1}^{N}$ be $N$ given real numbers. Construct a sequence $\{a_n\}$ so that
$\{y_j\}_{j=1}^{N}$ is the set of limit points of $\{a_n\}$ but $a_n \ne y_j$ for any $n$ or $j$.
6. Consider the following sequence: $a_1 = \frac{1}{2}$; the next three terms are
$\frac{1}{4}, \frac{1}{2}, \frac{3}{4}$; the next seven terms are $\frac{1}{8}, \frac{1}{4}, \frac{3}{8}, \ldots$; and so forth.
What are the limit points?
7. Let {dn} be a sequence of limit points of the sequence {an}. Suppose that
dn —> d. Prove that d is a limit point of {an}.
(a) Define what it means for a point $p \in \mathbb{R}^2$ to be a limit point of $\{p_n\}$.
(b) Prove that $p$ is a limit point of $\{p_n\}$ if and only if $\{p_n\}$ has a subsequence
which converges to $p$.
(c) Determine the limit points of the sequence {pn} if for each n, pn =
(d) Determine the limit points of {pn} if the polar coordinates of pn are
rn = 2 - J and dn =
13. Let $\{I_k\}_{k=1}^{\infty}$ be a family of closed finite intervals which has the property
that the intersection of any finite subcollection of the intervals is
nonempty. Prove that there is a point $p$ contained in $\bigcap_{k=1}^{\infty} I_k$. Hint: consider
the sets $J_k = \bigcap_{n=1}^{k} I_n$ and use the idea of the proof of problem 8.
$$x_{n+1} = a x_n (1 - x_n) \tag{14}$$
$$F(x) = ax(1 - x)$$
Figure 2.7.1
Given $x_0$, the horizontal line through the point $(x_0, F(x_0))$ intersects the
line $y = x$ in the point $(F(x_0), F(x_0))$. See Figure 2.7.1(a). Since $x_1 = F(x_0)$,
the first coordinate of this point is just $x_1$. The horizontal line
through $(x_1, F(x_1))$ intersects the line $y = x$ in the point $(F(x_1), F(x_1))$,
whose first coordinate is $x_2$. Continuing in this way, we can construct
approximately as many terms of the orbit $\{x_n\}$ as we like and see that
they get smaller and smaller because the graph of $F(x)$ lies below the
□ Theorem 2.7.1 Suppose that $0 < a < 1$, $0 \le x_0 \le 1$, and the sequence
$\{x_n\}$ satisfies (14). Then, $x_n \to 0$ as $n \to \infty$.
Proof. If $x_0 = 0$ or $x_0 = 1$, all the rest of the terms in the orbit are zero
so there is nothing more to prove. On the other hand, if $0 < x_n < 1$, then
$$0 < x_{n+1} = a x_n (1 - x_n) < x_n$$
since $a < 1$ and $1 - x_n < 1$. Thus, the sequence is strictly monotone decreasing
and bounded below by zero. By Theorem 2.4.3, $\{x_n\}$ converges
to some $\bar x$, and since each $x_n \ge 0$, we know that $\bar x \in [0, 1]$. Taking the
limit of both sides of (14), using Theorems 2.2.4 and 2.2.5, we see that $\bar x$
satisfies
$$\bar x = a \bar x (1 - \bar x). \tag{15}$$
Since $a < 1$ and $\bar x \in [0, 1]$, this can only be true if $\bar x = 0$. Thus, $x_n \to 0$ as
$n \to \infty$. □
Even in this very simple case we see the value of three different
methods of investigation: numerical experimentation, geometric construction,
and analytical proof. Let's try this same approach for $a > 1$.
Set $a = 2.4$. If we try two different values for $x_0$, namely, $x_0 = .1$ and
$x_0 = .85$, and iterate (14), we find
x_0     .1         .85
x_1     .216       .306
x_2     .406426    .509674
x_3     .578985    .599775
x_4     .585027    .576108
x_5     .582649    .586098
x_6     .583606    .582209
x_7     .583224    .58378
x_8     .583377    .583154
x_9     .583316    .583405
x_10    .58334     .583305
Theorem 2.7.1 that if $\{x_n\}$ converges, then the limit $\bar x$ must satisfy (15).
Solving (15) algebraically, we find that there are two possibilities, $\bar x = 0$
and $\bar x = \frac{a-1}{a}$. Since $a = 2.4$, we calculate $\frac{a-1}{a} = .583333$, so now we
see where the .583 came from. But why does the sequence converge to
.583333 and not to zero? Let's look at the graph of $F$ in Figure 2.7.1(b)
and construct the successive points by the same geometrical method as
above. The $x$ coordinate of the point where the graph of $F$ intersects the
line $y = x$ is $\bar x$. If $x_n$ is less than $\bar x$, then $x_{n+1}$ will be larger than $x_n$ because
the graph of $F$ is higher than the line $y = x$ for $x < \bar x$. This explains
why the sequence cannot converge to zero. If $x_n$ is greater than $\bar x$, then
$x_{n+1}$ will be less than $x_n$ because the graph of $F$ is lower than the line
for $x > \bar x$. But that in and of itself doesn't prove that the sequence will
converge to $\bar x$ since it could hop back and forth from one side of $\bar x$ to the
other. In fact, in the case shown in Figure 2.7.1(b), it does hop back and
forth, but, nevertheless, the sequence $x_n$ converges to $\frac{a-1}{a}$. We shall give
an analytical proof in the case $1 < a < 2\sqrt{2}$.
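The numerical experiment for $a = 2.4$ is easy to repeat. The following is a minimal sketch that iterates (14) and compares the iterates with the nonzero solution $(a-1)/a$ of the fixed point equation; the function names are illustrative, and floating-point output may differ from the printed table in the last digit.

```python
def F(x, a):
    """The quadratic map F(x) = a x (1 - x)."""
    return a * x * (1 - x)


def orbit(x0, a, n):
    """Return [x_0, x_1, ..., x_n] where x_{k+1} = F(x_k)."""
    xs = [x0]
    for _ in range(n):
        xs.append(F(xs[-1], a))
    return xs


if __name__ == "__main__":
    a = 2.4
    fixed_point = (a - 1) / a  # the nonzero solution of x = a x (1 - x)
    left, right = orbit(0.1, a, 10), orbit(0.85, a, 10)
    for k, (u, v) in enumerate(zip(left, right)):
        print(f"x_{k:<3d} {u:.6f}  {v:.6f}")
    print("fixed point (a - 1)/a =", round(fixed_point, 6))
```

Both orbits approach the same value while hopping from one side of it to the other, which is the behavior described above.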
□ Theorem 2.7.2 Suppose that $1 < a < 2\sqrt{2}$ and that the sequence $\{x_n\}$
satisfies (14). Then, for all $0 < x_0 < 1$, the sequence $\{x_n\}$ converges to
$\frac{a-1}{a}$ as $n \to \infty$.
Proof. Let $\bar x = \frac{a-1}{a}$. Recall that the function $F$ achieves its maximum
value at $x = \frac{1}{2}$ and that value is $\frac{a}{4}$. Thus, $x_n \le \frac{a}{4}$ for $n \ge 1$. Now, choose
a number $\delta$ small enough so that
We can choose such a $\delta$ satisfying the first inequality since $a > 1$. The
reasons for this choice will become clear shortly. First of all, notice that
the graph of $F$ is concave downward since $F''(x) = -2a$. Thus on the
closed interval $[\delta, \frac{a}{4}]$, the minimum value of $F$ occurs at one of the endpoints.
By our choice of $\delta$ above, $F(\delta) = a\delta(1 - \delta) > \delta$ and $F(\frac{a}{4}) > \delta$.
Thus,
Therefore, if $x_i$ is in the interval $[\delta, \frac{a}{4}]$, then all the succeeding iterates
will be in $[\delta, \frac{a}{4}]$. On the other hand, if any $x_j < \delta$, then
Iterating this inequality shows that if $x_1, \ldots, x_k$ are all $< \delta$, then
Since $a(1 - \delta) > 1$ and $x_1 > 0$, this gives the contradiction $x_k > \delta$ for $k$
large. Therefore, for each $x_0$, some iterate $x_k$ must satisfy $x_k \ge \delta$, and,
by what we proved above, all the succeeding iterates, $x_n$ for $n \ge k$, will
remain in $[\delta, \frac{a}{4}]$.
To prove the convergence, we subtract $\bar x = \frac{a-1}{a}$ from both sides of
(14) and rearrange algebraically to obtain
Since $x_n$ is in $[\delta, \frac{a}{4}]$, $a x_n \ge a\delta$, and the hypothesis that $1 < a < 2\sqrt{2}$
implies that $a x_n \le \frac{a^2}{4} < 2$. Thus,
Therefore, if we define
x_0    .1         .5         .88
x_1    .297       .825       .34848
x_2    .68901     .476438    .749238
x_3    .707108    .823168    .620006
x_4    .683451    .480356    .777475
x_5    .713941    .823727    .570925
x_6    .673957    .479164    .8084
x_7    .725139    .823567    .511135
x_8    .657731    .479504    .824591
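The table above can be regenerated with a few lines of code. This sketch assumes the parameter value $a = 3.3$, which is inferred from the first row of the table ($x_1 = .825$ when $x_0 = .5$) rather than quoted from the text; the helper `F2` for the map applied twice is likewise an illustrative name.

```python
def F(x, a):
    """The quadratic map F(x) = a x (1 - x)."""
    return a * x * (1 - x)


def F2(x, a):
    """F applied twice."""
    return F(F(x, a), a)


def orbit(x0, a, n):
    """Return [x_0, x_1, ..., x_n] where x_{k+1} = F(x_k)."""
    xs = [x0]
    for _ in range(n):
        xs.append(F(xs[-1], a))
    return xs


if __name__ == "__main__":
    a = 3.3  # inferred from the table, not stated in the surviving text
    columns = [orbit(x0, a, 8) for x0 in (0.1, 0.5, 0.88)]
    for k in range(9):
        print(f"x_{k}", *(f"{col[k]:.6f}" for col in columns))
    # Far along the orbit, consecutive terms stay apart while terms two steps
    # apart nearly agree, so a late term is almost unchanged by F2.
    late = orbit(0.5, a, 200)[-1]
    print(late, F2(late, a))
```

The last line prints two nearly equal numbers even though the orbit itself does not settle down, which suggests studying the map applied twice.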
2.7 The Quadratic Map 65
The sequences certainly don't look like they are converging! As $n$ gets
larger, they seem to alternate from close to .479 to close to .823. Amazingly,
any other initial condition between 0 and 1 seems to give a sequence
with the same properties. To understand what is happening, we
look at the composition of $F$ with itself, $F^{(2)}(x)$,
and
$$x_{2n+1} = F^{(2)}(x_{2n-1}),$$
Problems
What happens to the iteration for different choices of $x_0$ in $\mathbb{R}$? Prove your
conclusions.
for $0 < x_0 < \pi$. Prove that $x_n \to 0$. Hint: derive the inequality $\sin x < x$
for $0 < x < \pi$ by using the Fundamental Theorem of Calculus.
Projects 67
$$x_{n+1} = \frac{a x_n}{b + x_n}.$$
Find out all you can about the orbits by numerical experimentation and
analytical proof.
where xo is any real number and 0 < A < 1. Find out all you can about
the orbits by numerical experimentation and analytical proof.
$\alpha x_n^2 + \beta x_n(1 - x_n)$.
Therefore,
$$x_{n+1} = \frac{\alpha x_n^2 + \beta x_n(1 - x_n)}{\alpha x_n^2 + 2\beta x_n(1 - x_n) + \gamma(1 - x_n)^2}.$$
6. (a) Show that if $0 < x_0 < 1$, then $0 < x_n < 1$ for all $n$.
(b) Show that if $\alpha = \beta = \gamma$, then $x_n$ remains constant in all generations.
9. Suppose that $\alpha < \beta$ and $\gamma < \beta$. Find numerical evidence that $\{x_n\}$ converges
to a number $\bar x$ satisfying $0 < \bar x < 1$ and characterize $\bar x$ analytically.
Projects
(a) Suppose that {an} converges to a limit. What must that limit be?
Hint: divide the above equation by $F_n$ to find an equation relating
$a_{n+1}$ and
(e) Use the inequality in (d) to show that {an} is a Cauchy sequence
and therefore converges to a limit. Hint: express
2. The purpose of this project is to derive the result of Section 2.3 more easily
using the tools of linear algebra.
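(a) Show that 1 and $(1 - p - q)$ are the two eigenvalues of the matrix $A$.
(b) Find a nonsingular matrix $Q$ so that $A = QDQ^{-1}$, where $D$ is the
diagonal matrix
$$D = \begin{pmatrix} 1 & 0 \\ 0 & 1 - p - q \end{pmatrix}.$$
(c) Explain why $A^n = QD^nQ^{-1}$.
(d) Prove that
$$\lim_{n\to\infty} A^n \begin{pmatrix} p_0 \\ q_0 \end{pmatrix} = \begin{pmatrix} \frac{p}{p+q} \\ \frac{q}{p+q} \end{pmatrix}.$$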
$$q_k = c + d\,p_{k-1}.$$
4. The gas CO2 is produced by the cellular metabolism of the body and
is removed from the body when one exhales. There is a mechanism in
the brain which senses the CO2 level and sends signals to the breathing
mechanism to take deeper breaths if the CO2 level is too high. Let $V_n$ be
the volume of the $n$th breath and let $C_n$ be the CO2 concentration at the
time of the $n$th breath. Let $M$ be the (assumed) constant CO2 concentration
produced by the metabolism between each breath. Then, a simple
model of this system is:
$$C_{n+1} = C_n - \alpha V_n + M \tag{19}$$
$$V_{n+1} = \beta C_n \tag{20}$$
The second equation says that the volume of the $(n+1)$st breath is proportional
to the CO2 concentration at the time of the $n$th breath. The first
equation says that the CO2 concentration $C_n$ is lowered by an amount proportional
to the volume of the $n$th breath and raised by $M$. Given the
initial concentration $C_0$ and breath size $V_0$, we would like to determine
the behavior of the sequences $\{C_n\}$ and $\{V_n\}$. Our discussion follows
[10], where more information can be found.
(a) Show that the constant sequences $C_n = M/\alpha\beta$ and $V_n = M/\alpha$ solve
(19) and (20).
(b) Define $c_n = C_n - M/\alpha\beta$ and $v_n = V_n - M/\alpha$. Show that the sequences
$\{c_n\}$ and $\{v_n\}$ satisfy
(c) Show that $\{c_n\}$ satisfies the second order difference equation,
$$c_n = d_1 \lambda_1^n + d_2 \lambda_2^n$$
Define
$$\alpha_n = n - 2\pi j_n.$$
The number $\alpha_n$, which is denoted $n \pmod{2\pi}$ in number theory, is just the
remainder when $n$ is divided by $2\pi$.
□ Theorem. The set of limit points of the sequence $\{\alpha_n\}$ is the entire interval
$[0, 2\pi]$.
Proof. Let $\varepsilon > 0$ and $N$ be given. We will show that for any $x \in [0, 2\pi)$,
there is an $n > N$ so that
but this would imply that $\pi$ is a quotient of integers and therefore rational.
Since $\pi$ is irrational, we must have $\alpha_n \ne \alpha_m$. Now, we apply the
Bolzano-Weierstrass theorem to the sequence $\{\alpha_n\}$, which is bounded because
each $\alpha_n$ is in $[0, 2\pi)$. Since a subsequence converges, we can choose
a large integer $m > N$ and a larger integer $n > m$ so that $|\alpha_n - \alpha_m| < \varepsilon$.
Let $k = n - m$. Then
and since
$$2k = 2(\alpha_n - \alpha_m) + 4\pi(j_n - j_m),$$
we have $\alpha_{2k} = 2\delta$. Continuing in this manner, we see that the numbers
$\alpha_k, \alpha_{2k}, \alpha_{3k}, \ldots$ equal $\delta, 2\delta, 3\delta, \ldots$, respectively. Each $x \in [0, 2\pi]$ is therefore
within $\delta$ of one of the numbers $\alpha_k, \alpha_{2k}, \alpha_{3k}, \ldots$. Since $\delta < \varepsilon$, this proves
(24). The other case is handled similarly. If $\delta < 0$, then $\alpha_k = 2\pi + \delta$, $\alpha_{2k} = 2\pi + 2\delta$,
and so forth. As above, this proves that any $x$ is within $|\delta|$ of one
of the points $\alpha_k, \alpha_{2k}, \alpha_{3k}, \ldots$. □
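The statement of the theorem invites a numerical experiment. The sketch below computes $\alpha_n$ as the floating-point remainder of $n$ on division by $2\pi$ and searches for an index $n > N$ with $\alpha_n$ within $\varepsilon$ of a target $x$; the function names and the search limit are illustrative assumptions, and a successful search is evidence for the theorem, not a proof.

```python
import math

TWO_PI = 2 * math.pi


def alpha(n):
    """alpha_n = n - 2*pi*j_n, the remainder when n is divided by 2*pi."""
    return n % TWO_PI


def first_hit(x, eps, N=0, limit=10**6):
    """Smallest n with N < n < limit and |alpha_n - x| < eps, or None if
    no such n is found below the (arbitrary) search limit."""
    for n in range(N + 1, limit):
        if abs(alpha(n) - x) < eps:
            return n
    return None


if __name__ == "__main__":
    for x in (0.5, 3.0, 6.0):
        n = first_hit(x, 0.01, N=10)
        print(x, n, None if n is None else alpha(n))
```

Shrinking `eps` or raising `N` makes the search take longer, but for the targets tried it still returns an index, as the theorem predicts.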
CHAPTER 3
3.1 Continuity
Throughout the first few chapters of this book we study functions which
are defined on subsets of $\mathbb{R}$ and take values in $\mathbb{R}$. Recall that the set
on which a function $f$ is defined is called its domain and is denoted by
Dom($f$). Normally the functions we deal with will be defined on intervals
(open, closed, or half open) or unions of intervals. For example, the
function $f(x) = (2 - x^2)^{-1}$ is naturally defined on the union of the intervals
$(-\infty, -\sqrt{2})$, $(-\sqrt{2}, \sqrt{2})$, and $(\sqrt{2}, \infty)$, but not at the points $\sqrt{2}$ or
$-\sqrt{2}$. Occasionally, functions are defined on more complicated sets.
The definition says that a function is continuous at a point if, as one gets
closer to the point, the function values approach the function value at
the point.
$$= \left(\lim_{n\to\infty} x_n\right)^2$$
$$= f(c),$$
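$$\begin{aligned}
\lim_{n\to\infty} p(x_n) &= \lim_{n\to\infty} \{b_0 + b_1 x_n + b_2 x_n^2 + \cdots + b_N x_n^N\} \\
&= \lim_{n\to\infty} b_0 + \lim_{n\to\infty} b_1 x_n + \lim_{n\to\infty} b_2 x_n^2 + \cdots + \lim_{n\to\infty} b_N x_n^N \\
&= b_0 + b_1 \lim_{n\to\infty} x_n + b_2 \lim_{n\to\infty} x_n^2 + \cdots + b_N \lim_{n\to\infty} x_n^N \\
&= p(c).
\end{aligned}$$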
the pieces match up, that is, where the definition formula of the function
changes.
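$$f(x) = \begin{cases} 2x & \text{if } 0 \le x < 1 \\ 2 - \frac{1}{2}x^2 & \text{if } 1 \le x \le 2. \end{cases}$$
Figure 3.1.1
$$\lim_{n\to\infty} f(x_n) = \lim_{n\to\infty} 2\left(1 - \frac{1}{n}\right) = 2 \ne f(1).$$
Thus, $f$ is continuous at all points of $[0, 2]$ except 1.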
(a) f + g is continuous on D.
(c) fg is continuous on D.
Proof. The proofs of all four parts follow easily from Theorems 2.2.3 -
2.2.6 in Section 2.2. For example, to prove (a), let $c$ be a point of $D$. Suppose
that $\{x_n\}$ is a sequence of points such that $x_n \in D$ and $x_n \to c$. Since
76 Chapter 3. The Riemann Integral
$$= f(c) + g(c).$$
Thus, by definition, $f + g$ is continuous at $c$. Since $c$ was an arbitrary
point of $D$, we have proven (a). The other parts are proved similarly. □
We will see later that $\sin x$, $\cos x$, $e^{ax}$, and many other elementary
functions are continuous on $\mathbb{R}$. Because of Theorem 3.1.1, any algebraic
combination of these functions and polynomials is a continuous function
on $\mathbb{R}$, except possibly where denominators vanish. In particular, parts (a)
and (b) show that the set of continuous functions on a set $D$, denoted by
$C(D)$, is a vector space; that is, linear combinations of continuous functions
are again continuous functions. Vector spaces are defined formally
in Section 5.8. The composition of continuous functions is also continuous.
Proof. First suppose that the $\varepsilon$-$\delta$ condition holds. We will show that
$f$ is continuous at $c$. Suppose that $x_n \in$ Dom($f$) and $x_n \to c$. Let $\varepsilon > 0$ be
given and choose $\delta$ so that (2) holds. Since $x_n \to c$, we can choose an $N$ so
that $|x_n - c| < \delta$ for $n > N$. Thus, by (2), we know that $|f(x_n) - f(c)| < \varepsilon$
for $n > N$. Thus, by the definition of convergence for sequences, the
sequence $\{f(x_n)\}$ converges to $f(c)$. Therefore, $f$ is continuous at $c$.
To show that continuity implies the $\varepsilon$-$\delta$ condition, we give a contrapositive
proof. That is, assume that there exists an $\varepsilon > 0$ so that there is
no $\delta$ so that (2) holds. We'll show that $f$ is not continuous at $c$. Let $\delta_n =$
Since (2) does not hold, there is an $x_n \in$ Dom($f$) satisfying
$$|x_n - c| \le \delta_n \tag{3}$$
such that
From (3), it follows that $x_n \to c$ since $\delta_n \to 0$. But by (4), the sequence
$\{f(x_n)\}$ cannot converge to $f(c)$ since each $f(x_n)$ is always a distance of
more than $\varepsilon$ away from $f(c)$. Therefore $f$ is not continuous at $c$. □
if and only if
$$\lim_{n\to\infty} f(x_n) = L$$
since
lim- = 1.
x
Problems
1. Students are often told in calculus that continuous functions are those
functions “whose graphs you can draw without picking your pencil up
from the paper.” Write a paragraph explaining the relation between this
informal idea and the formal definition in this section.
2. Prove part (c) of Theorem 3.1.1.
3. Let $f(x)$ be a continuous function. Prove that $|f(x)|$ is a continuous function.
4. Where is the function $\ln(\sin x)$ defined and continuous?
5. Suppose that $f$ is a continuous function on $\mathbb{R}$ such that $f(q) = 0$ for all
$q \in \mathbb{Q}$. Prove that $f(x) = 0$ for all $x \in \mathbb{R}$.
6. Suppose that the function / is defined only on the integers. Explain why
it is continuous.
7. Let $f(x) = 3x - 1$ and let $\varepsilon > 0$ be given. How small must $\delta$ be chosen so
that $|x - 1| < \delta$ implies $|f(x) - 2| < \varepsilon$?
8. Let $f(x) = x^2$ and let $\varepsilon > 0$ be given.
9. Let $f(x) = 3x^3 - 2$ and let $\varepsilon > 0$ be given. Find a $\delta$ so that $|x - 1| < \delta$
implies $|f(x) - 1| < \varepsilon$.
10. Suppose that a function $f$ is continuous at a point $c$ and $f(c) > 0$. Prove
that there is a $\delta > 0$ so that for all $x \in$ Dom($f$),
$$|x - c| \le \delta \quad \text{implies} \quad f(x) > \frac{f(c)}{2}.$$
$$\sqrt{x} - \sqrt{c} = \frac{x - c}{\sqrt{x} + \sqrt{c}}.$$
12. (a) Sketch a rough graph of the function $f(x) = e^{-1/x^2}$ defined on the
domain $\{x \mid x \ne 0\}$.
13. (a) Generate the graph of the function $f(x) = \sin\frac{1}{x}$ on the interval $(0, 1]$.
(b) Find sequences $\{\alpha_n\}$ and $\{\beta_n\}$ of numbers in $(0, 1]$, so that $\alpha_n \to 0$,
$\beta_n \to 0$, and
(c) Conclude from part (b) that $\lim_{x\to 0} \sin\frac{1}{x}$ does not exist.
(d) Can $f$ be defined at 0 so that $f$ is continuous on $[0, 1]$?
(e) Can $g(x) = x \sin\frac{1}{x}$ be defined at 0 so that $g$ is continuous on $[0, 1]$?
the numbers $|f(x_{n_k})|$ diverge to $+\infty$. Therefore a $B$ must exist so that (6)
holds. □
That is, $\sup_S f$ is just the supremum of the set of values of $f$ on $S$. Similarly,
we define the infimum of $f$ on $S$ by
Note that $\sup_S f$ could be $+\infty$ and $\inf_S f$ could be $-\infty$, depending on
the function $f$ and the set $S$. For example, if $f(x) = x^2$ and $S = \mathbb{R}$, then
$\sup_S f = +\infty$ and $\inf_S f = 0$. Even in the case where $\sup_S f$ is finite, we
do not necessarily know that there is a point $c$ in $S$ so that $f(c) = \sup_S f$,
as the following example shows.
$$\lim_{k\to\infty} f(x_{n_k}) = M.$$
Proof. We shall consider the case $f(a) < y < f(b)$; the proof of the other
case is similar. Let
$$S = \{x \in [a, b] \mid f(x) < y\}$$
Figure 3.2.2
For each $n$ we can choose an $x_n \in S$ that is in the interval $[c - \frac{1}{n}, c]$; see
problem 3 of Section 2.5. Since $x_n \to c$ and $f$ is continuous, $f(x_n) \to f(c)$.
3.2 Continuous Functions on Closed Intervals 83
See Figure 3.2.2. Thus, $f(c) \le y$ since $f(x_n) < y$ for all $n$. If $f(c) < y$,
then if $\beta > 0$ is small enough, $f(c + \beta) < y$ because $f$ is continuous. Since
this contradicts the definition of $c$, we conclude that $f(c) = y$. □
Figure 3.2.3
Example 2 Consider the function $f(x) = x^2$, with domain equal to the
whole real line, $\mathbb{R}$. Let $\varepsilon > 0$ be given and choose a point $c$. Then,
We want to see how small $\delta$ must be so that the right-hand side is less
than $\varepsilon$. Since $x$ is within $\delta$ of $c$, we know that $|x| < |c| + \delta$. Therefore if
$\delta < 1$, we have
then
$\delta$ smaller. This makes sense because the slope of $f$ gets steeper as $x$ gets
larger. So, to make the difference $|f(x) - f(c)|$ small, we have to choose
$\delta$ smaller as $c$ gets larger. Notice, however, that $\delta = \frac{\varepsilon}{2N+1}$ is a suitable
$\delta$ for all $c$ in $[-N, N]$. Thus $f(x) = x^2$ is uniformly continuous on each
interval $[-N, N]$ but not on the whole real line $\mathbb{R}$.
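The dependence of $\delta$ on $c$ can be seen numerically. The sketch below takes the single value $\delta = \min(1, \varepsilon/(2N+1))$, where capping at 1 is an added assumption matching the restriction on $\delta$ in the example, and checks on a grid that it works for every sampled $c$ in $[-N, N]$; it then shows the same $\delta$ failing far from the origin. The function names and the grid size are illustrative.

```python
def delta_for(eps, N):
    """One delta intended to serve every c in [-N, N] for f(x) = x^2."""
    return min(1.0, eps / (2 * N + 1))


def gap_at(c, d):
    """Largest |x^2 - c^2| over |x - c| <= d; it is attained at x = c + d or x = c - d."""
    return max(abs((c + d) ** 2 - c ** 2), abs((c - d) ** 2 - c ** 2))


def worst_gap(eps, N, samples=2001):
    """Largest gap over an evenly spaced grid of centers c in [-N, N]."""
    d = delta_for(eps, N)
    return max(gap_at(-N + 2 * N * i / (samples - 1), d) for i in range(samples))


if __name__ == "__main__":
    eps, N = 0.1, 5
    d = delta_for(eps, N)
    print("delta:", d)
    print("worst gap on [-N, N]:", worst_gap(eps, N))   # stays below eps
    print("gap at c = 1000:", gap_at(1000.0, d))        # far larger than eps
```

The last line illustrates why no single $\delta$ can serve the whole real line: the gap grows roughly like $2|c|\delta$.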
for all $x$ and $c$ in the interval. Set $\delta_n = 2^{-n}$. Then, for each $n$ there are
points $x_n$ and $c_n$ in $[a, b]$ so that
$$\le 2^{-n_k} + |x_{n_k} - d|$$
so $c_{n_k} \to d$ also as $k \to \infty$. Since $f$ is continuous, $f(x_{n_k}) \to f(d)$ and
$f(c_{n_k}) \to f(d)$ as $k \to \infty$. This is impossible since $|f(x_{n_k}) - f(c_{n_k})| \ge \varepsilon_0$
for each $k$. Thus $f$ is uniformly continuous on $[a, b]$. □
Problems
13. Use the result in Project 5 of Chapter 2 and the fact that $\sin x$ is a continuous
function to prove that the set of limit points of the sequence $\{\sin n\}$
is the entire interval $[-1, 1]$.
3.3 The Riemann Integral 87
$$\sum_{i=1}^{N} m_i (x_i - x_{i-1}) \le \text{area} \le \sum_{i=1}^{N} M_i (x_i - x_{i-1})$$
because the sum on the left is the sum of the areas of the rectangles lying
under the curve in Figure 3.3.1(a), and the sum on the right is the sum
of the areas of the rectangles whose tops are over the curve in Figure
3.3.1(b). The sum on the left is called a lower sum and is denoted by
Lp(f) because it depends on / and on the partition P.
Similarly, the sum on the right is called an upper sum and is denoted by
$U_P(f)$. For any bounded function $f$ and any particular partition $P$ we
always know that $L_P(f) \le U_P(f)$ since $m_i \le M_i$ for each $i$.
Intuitively, upper sums should approximate the area from above and
lower sums from below. And we should get better approximations to the
“area” if we pick partitions with more points and shorter bases $(x_i - x_{i-1})$
for the rectangles. The following lemma shows that if we add points to a
partition, then the upper sum cannot increase and the lower sum cannot
decrease. The second lemma shows that, indeed, all upper sums are
at least as large as all lower sums.
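The effect of adding points to a partition can be observed directly. The following is a minimal sketch that computes lower and upper sums for an increasing function, for which the infimum and supremum on each subinterval are the values at its left and right endpoints; that monotonicity shortcut, the test function $x^2$ on $[0, 1]$ and the function names are assumptions made for illustration.

```python
def lower_upper_sums(f, partition):
    """Lower and upper sums of an INCREASING function f over a partition
    x_0 < x_1 < ... < x_N. For increasing f, m_i = f(x_{i-1}) and M_i = f(x_i)."""
    pairs = list(zip(partition, partition[1:]))
    lower = sum(f(left) * (right - left) for left, right in pairs)
    upper = sum(f(right) * (right - left) for left, right in pairs)
    return lower, upper


def uniform_partition(a, b, n):
    """The partition of [a, b] into n subintervals of equal length."""
    return [a + (b - a) * i / n for i in range(n + 1)]


if __name__ == "__main__":
    def square(x):
        return x * x

    for n in (4, 8, 16, 32):
        # Each partition contains the previous one, so it is a refinement.
        print(n, lower_upper_sums(square, uniform_partition(0.0, 1.0, n)))
```

As the partition is refined by doubling, the printed lower sums rise and the upper sums fall, squeezing toward a common value.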
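Proof. We add the additional points to the partition $P$ one at a time. Let
$Q_1$ denote the partition with one new point $y$ added in the $j$th interval:
$$x_0 < x_1 < \cdots < x_{j-1} < y < x_j < \cdots < x_n.$$
Then,
$$U_P(f) = \sum_{i=1}^{j-1} M_i(x_i - x_{i-1}) + M_j(x_j - x_{j-1}) + \sum_{i=j+1}^{n} M_i(x_i - x_{i-1})$$
$$U_{Q_1}(f) = \sum_{i=1}^{j-1} M_i(x_i - x_{i-1}) + \Big(\sup_{x_{j-1} \le x \le y} f(x)\Big)(y - x_{j-1}) + \Big(\sup_{y \le x \le x_j} f(x)\Big)(x_j - y) + \sum_{i=j+1}^{n} M_i(x_i - x_{i-1}).$$
Since
$$\Big(\sup_{x_{j-1} \le x \le y} f(x)\Big)(y - x_{j-1}) + \Big(\sup_{y \le x \le x_j} f(x)\Big)(x_j - y) \le M_j(y - x_{j-1}) + M_j(x_j - y) = M_j(x_j - x_{j-1}),$$
we see that $U_{Q_1}(f) \le U_P(f)$. Adding a second point to $Q_1$ gives a partition $Q_2$ with $U_{Q_2}(f) \le U_{Q_1}(f)$, and so
forth. Continuing in this way, we find that $U_Q(f) \le U_P(f)$. The proof
that $L_P(f) \le L_Q(f)$ is similar. □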
Thus, each upper sum is larger than each lower sum, so by Theorem
2.5.3,
Lemma 3 Let $f$ be a bounded function on $[a, b]$. Suppose that for each
$\varepsilon > 0$ there is a partition $P$ so that
Proof. Let $f$ be continuous on $[a, b]$. We shall show that the criterion
of Lemma 3 is satisfied. Since $f$ is continuous on $[a, b]$, we know from
Theorem 3.2.5 that it is uniformly continuous. Therefore, given $\varepsilon > 0$,
we can choose $\delta > 0$ so that
Let $P$ be any partition such that the maximum subinterval length is less
than or equal to $\delta$. Since $f$ is continuous, by Theorem 3.2.2 there are
points $c_i$ and $d_i$ in each subinterval $[x_{i-1}, x_i]$ such that $f(c_i) = M_i$ and
$f(d_i) = m_i$. Since $|c_i - d_i| \le \delta$ for each $i$, we therefore know that $M_i - m_i \le \varepsilon/(b - a)$. Thus,
$$\begin{aligned}
\sum_{i=1}^{N} M_i(x_i - x_{i-1}) - \sum_{i=1}^{N} m_i(x_i - x_{i-1}) &= \sum_{i=1}^{N} (M_i - m_i)(x_i - x_{i-1}) \\
&\le \sum_{i=1}^{N} \frac{\varepsilon}{b - a}(x_i - x_{i-1}) \\
&= \varepsilon.
\end{aligned}$$
Note that this theorem proves that the integral of a continuous function
exists, but it doesn't tell us how to compute it. If the function $f$ is
the derivative of a function that we know, then we can evaluate the integral
by using the Fundamental Theorem of Calculus (Section 4.2). If not,
which is often the case, then we must use some method of approximation.
If $P$ is a partition and $x_i^*$ is a point in the $i$th subinterval,
we call
\[ \sum_{i=1}^{N} f(x_i^*)(x_i - x_{i-1}) \]
a Riemann sum.
3.3 The Riemann Integral 91
\[ \int_a^b f(x)\,dx \quad \text{as } k \to \infty. \]
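Proof. Let $\varepsilon > 0$ be given and choose $\delta$ as in the proof of Theorem 3.3.1.
Choose $K$ so that $k \ge K$ implies that the maximum subinterval length is
less than $\delta$. For all $k$,
\[ L_{P_k}(f) \le S_k \le U_{P_k}(f) \]
and
\[ L_{P_k}(f) \le \int_a^b f(x)\,dx \le U_{P_k}(f). \]
Since $U_{P_k}(f) - L_{P_k}(f) \le \varepsilon$ for $k \ge K$, it follows that
\[ \left| S_k - \int_a^b f(x)\,dx \right| \le \varepsilon \]
for $k \ge K$.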
\[ = \alpha \sum_{i=1}^{N} f(x_i^*)(x_i - x_{i-1}) + \beta \sum_{i=1}^{N} g(x_i^*)(x_i - x_{i-1}). \]
\[ \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx. \qquad (14) \]
\[ \sum_{i=1}^{N} f(x_i^*)(x_i - x_{i-1}) \le \sum_{i=1}^{N} g(x_i^*)(x_i - x_{i-1}). \]
Since the Riemann sums on the left converge to the integral on the left of
(14) and the Riemann sums on the right converge to the integral on the
right, we conclude from problem 3 of Section 2.4 that (14) holds. □
N N
The Riemann sums on the left converge to the integral on the left of (15)
and the Riemann sums on the right converge to the integral on the right,
so we conclude that (15) holds. □
Theorems 3.3.4 and 3.3.5 can be combined to give the following fun¬
damental integral estimate.
\[ \left| \int_a^b f(x)\,dx \right| \le (b-a) \sup_{[a,b]} |f(x)| \qquad (16) \]
\[ \int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \qquad (17) \]
The proof of Theorem 3.3.7 is left to the student (see problem 6). If
$f(x) \ge 0$ for all $x \in [a, b]$, we define the area under $f$ and above the $x$-axis
between $a$ and $b$ to be $\int_a^b f(x)\,dx$. Note that by Theorem 3.3.4, the area is
always nonnegative. If $f(x) \le 0$, we define the area above $f$ and below
the $x$-axis to be $\int_a^b |f(x)|\,dx$. In this case also, the area is nonnegative.
In our definition of the Riemann integral $\int_a^b f(x)\,dx$, we assumed that
$a < b$. If $a > b$ we define
Problems
1. Let $f$ be the function on $[0,1]$ given by
\[ f(x) = \begin{cases} 0 & \text{if $x$ is rational} \\ 1 & \text{if $x$ is irrational.} \end{cases} \]
1 if x V |
/(*) 2 if cc =
Prove that / is Riemann integrable and compute f* f(x) dx. Hint: for
each $\varepsilon > 0$, find a partition $P$ so that $U_P(f) - L_P(f) \le \varepsilon$ and use Lemma
3.
3. Suppose that $f$ is Riemann integrable on $[a, b]$ and $f(x) > 0$ for all $x$.
7. Let $f(x) = x^2$ and let $P$ be a partition of the interval $[1,2]$ into subintervals
of length $\delta$. Compute $U_P(f)$ and $L_P(f)$ when $\delta = .5$, $\delta = .2$, and $\delta = .1$.
\[ e - 1 \le \int_0^1 \sqrt{1+x}\,e^x\,dx \le \sqrt{2}\,(e-1). \]
10. Let f(x) — x + .l(sina;)3. Estimate the integral f“ f(x) dx from above and
below.
11. Let / be the function in problem 10. Estimate ef (®) dx from above and
below.
14. Let $f$ be a continuous function on the interval $[a, b]$. Suppose that for
every $a_1 \in [a, b]$ and $b_1 \in [a, b]$ we know that
\[ \int_{a_1}^{b_1} f(x)\,dx = 0. \]
15. Let $f$ be a continuous function on the interval $[a, b]$. Prove that there exists
an $x \in [a, b]$ so that
In the last section we saw that the Riemann integral of a continuous func¬
tion can be approximated by a Riemann sum. Indeed, the integral was
defined as the limit of such approximations. Sometimes one can use the
Fundamental Theorem of Calculus to evaluate an integral exactly, but of¬
ten some numerical method must be used. In this section we show how
to use analytic techniques to derive error estimates for several different
approximation schemes. Our purpose is not to find the “best” methods
or estimates, but to show how analytical tools can be used.
Consider a particular rectangle in the approximation by the lower
sum in Figure 3.3.1(a). See Figure 3.4.1(a) on page 98. The error is the
area above the rectangle and below the curve. How big will this error
be? In each interval we are approximating the function $f$ by a constant
function (whose height is the minimum of $f$ in the interval). The error
will, therefore, depend on how quickly $f$ moves up from this constant
value. That is, the error should depend on the derivative of $f$. If $f$ has
a small derivative, then the graph of $f$ will be almost flat in a small interval
and the approximation will be good. If $f$ has a large derivative,
then the graph of $f$ will rise steeply and the approximation will be poor.
Thus, what we want is an explicit estimate which says how good an approximation
the Riemann sum is in terms of the size of the derivative of
$f$. Throughout this section we use the properties of differentiation introduced
in Chapter 4; in particular, we use the Mean Value Theorem from
Section 4.2 and Taylor's theorem from Section 4.3.
From now on, we will always choose partitions which divide $[a, b]$
into subintervals of equal length. So, if there are $N$ subintervals, then
each has length
\[ h = \frac{b-a}{N}. \]
We write the Riemann sum as
\[ R_N = \sum_{i=1}^{N} f(x_i^*)(x_i - x_{i-1}) = h \sum_{i=1}^{N} f(x_i^*). \]
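Proof. For every $x \in [x_{i-1}, x_i]$, the Mean Value Theorem guarantees that
there is a $\xi$ between $x$ and $x_i^*$ such that
\[ \frac{f(x) - f(x_i^*)}{x - x_i^*} = f'(\xi), \]
so
\[ |f(x) - f(x_i^*)| = |f'(\xi)|\,|x - x_i^*| \le Mh. \]
Therefore,
\[ \left| \int_a^b f(x)\,dx - R_N \right| = \left| \sum_{i=1}^{N} \int_{x_{i-1}}^{x_i} \big( f(x) - f(x_i^*) \big)\,dx \right| \le \sum_{i=1}^{N} \int_{x_{i-1}}^{x_i} Mh\,dx = Mh(b-a). \qquad □ \]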
Notice that the theorem gives an upper bound for the error independent
of how the points $x_i^*$ in the Riemann sum are chosen. We can use the
left endpoints, the right endpoints, the midpoints, or the points where $f$
achieves maxima or minima. The actual error will depend on the $x_i^*$, and
some choices may be better than others.
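The bound $Mh(b-a)$ can be compared with actual errors for several choices of the sample points. This is a minimal sketch under stated assumptions: the function `riemann_sum` and its `rule` parameter are hypothetical names, and the test integrand $e^x$ on $[0,1]$ (with $M = e$ bounding $|f'|$) is an arbitrary choice.

```python
import math

def riemann_sum(f, a, b, n, rule="left"):
    """Riemann sum on n equal subintervals; `rule` selects the sample point."""
    h = (b - a) / n
    offset = {"left": 0.0, "mid": 0.5, "right": 1.0}[rule]
    return h * sum(f(a + (i + offset) * h) for i in range(n))

a, b, n = 0.0, 1.0, 100
h = (b - a) / n
exact = math.e - 1.0          # integral of exp over [0, 1]
M = math.e                    # bound for |f'| = exp on [0, 1]
bound = M * h * (b - a)

errors = {
    rule: abs(riemann_sum(math.exp, a, b, n, rule) - exact)
    for rule in ("left", "mid", "right")
}
for rule, err in errors.items():
    print(f"{rule:>5}: error = {err:.3e}   bound = {bound:.3e}")
```

All three errors stay below the same bound, but the midpoint choice is far more accurate than either endpoint choice, which illustrates that the bound is not sharp for every choice of sample points.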
Rn < he-1/2
Of course, $E(h)$ is a bound for the error, not the error itself, which may
be smaller. Notice that $E(h) \to \infty$ as $h \to 0$. In fact, $E(h)$ has a minimum
at $h_0 = \sqrt{2\varepsilon/M}$ and that minimum is
\[ E(h_0) = 2\sqrt{2M\varepsilon}\,(b-a). \]
Thus there is no point in choosing $h < h_0$ since the error may just get
larger, and there is no way to choose $h$ to guarantee that our approximation
is closer than $2\sqrt{2M\varepsilon}\,(b-a)$ to the correct value of the integral.
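The location and size of the minimum can be worked out in a few lines. The derivation below is a sketch that assumes the bound (19) has the form $E(h) = (Mh + 2\varepsilon/h)(b-a)$; that form is an assumption here, chosen because it reproduces the values of $h_0$ and of the minimum quoted above.

```latex
% Assumed form of the bound (19): truncation error plus round-off error.
E(h) = \Big( M h + \frac{2\varepsilon}{h} \Big)(b-a)

% Setting the derivative to zero gives the critical step size.
E'(h) = \Big( M - \frac{2\varepsilon}{h^2} \Big)(b-a) = 0
\quad\Longrightarrow\quad
h_0 = \sqrt{2\varepsilon/M}

% Value of the bound at h_0: both terms equal \sqrt{2M\varepsilon}.
E(h_0) = \big( \sqrt{2M\varepsilon} + \sqrt{2M\varepsilon} \big)(b-a)
       = 2\sqrt{2M\varepsilon}\,(b-a)
```

Since $E(h) \to \infty$ both as $h \to 0$ and as $h \to \infty$, the single critical point $h_0$ is the minimum.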
To get better methods for computing $\int_a^b f(x)\,dx$ we should approximate
$f$ better on each subinterval. Since we need to be able to integrate
the approximations explicitly, polynomials are natural candidates. In a
Riemann sum, the function is approximated by a function which is constant
on each subinterval, so it is natural to try the next simplest thing,
linear approximations.
which has the same value as $f$ at $x_{i-1}$ and the same derivative as $f$ at
$x_{i-1}$. Let $T_N$ denote the sum of the integrals of the functions $T^i(x)$ on the
intervals $[x_{i-1}, x_i]$. That is,
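\[ T_N = \sum_{i=1}^{N} \int_{x_{i-1}}^{x_i} T^i(x)\,dx = \sum_{i=1}^{N} \int_{x_{i-1}}^{x_i} \big( f(x_{i-1}) + f'(x_{i-1})(x - x_{i-1}) \big)\,dx = h \sum_{i=1}^{N} f(x_{i-1}) + \frac{h^2}{2} \sum_{i=1}^{N} f'(x_{i-1}). \]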
\[ \left| \int_a^b f(x)\,dx - T_N \right| \le \frac{M h^2 (b-a)}{6} \]
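\[ f(x) - T^i(x) = \frac{f''(\xi)}{2!}(x - x_{i-1})^2. \]
Thus,
\[ \left| \int_a^b f(x)\,dx - T_N \right| = \left| \sum_{i=1}^{N} \int_{x_{i-1}}^{x_i} \big( f(x) - T^i(x) \big)\,dx \right| \le \sum_{i=1}^{N} \int_{x_{i-1}}^{x_i} \frac{M}{2!}(x - x_{i-1})^2\,dx = \frac{M}{2!} \sum_{i=1}^{N} \frac{h^3}{3} = \frac{M h^3 N}{6} = \frac{M h^2 (b-a)}{6}. \qquad □ \]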
in each interval $[x_{i-1}, x_i]$. See Figure 3.4.1(c). This approximation scheme
gives the trapezoidal rule (problem 3), which is also order 2 and has the
error bound (20) where $M$ is a bound for the second derivative of $f$. The
proof of the error bound is outlined in project 4.
\[ E(h) = \frac{M h^2 (b-a)}{12} \qquad (20) \]
One can test the order of any scheme by trying it out on a function
whose integral is known. If the scheme is order 1, then halving the size of
h should halve the size of the error. If the scheme is order 2, then halving
the size of $h$ should cause the resulting error to be $\frac{1}{4}$ of the former error.
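The halving test just described is easy to run. The sketch below is illustrative only: the names `riemann_sum` and `error_ratio` are hypothetical, the parameter `t` selects the sample point $x_{i-1} + th$ in each subinterval, and the integrand $e^x$ on $[0,1]$ is an arbitrary function whose integral is known.

```python
import math

def riemann_sum(f, a, b, n, t):
    """Riemann sum on n equal subintervals with sample point x_{i-1} + t*h."""
    h = (b - a) / n
    return h * sum(f(a + (i + t) * h) for i in range(n))

def error_ratio(t, n=50):
    """Error with step h divided by the error with step h/2, for exp on [0, 1]."""
    exact = math.e - 1.0
    err_h = abs(riemann_sum(math.exp, 0.0, 1.0, n, t) - exact)
    err_half = abs(riemann_sum(math.exp, 0.0, 1.0, 2 * n, t) - exact)
    return err_h / err_half

ratio_left = error_ratio(0.0)   # left endpoints
ratio_mid = error_ratio(0.5)    # midpoints
print(ratio_left, ratio_mid)
```

A ratio near 2 indicates a first-order scheme and a ratio near 4 a second-order scheme. A function such as $\sin x$ on $[0, \pi]$ would be a poor test case for the left-endpoint rule, because equal values at the two endpoints make that rule coincide with the trapezoid rule.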
If one tries this out on the special Riemann sum method
\[ \left| \int_a^b f(x)\,dx - R_N \right| \le \frac{M h^2 (b-a)}{24} \qquad (22) \]
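\[ f(x) - H^i(x) = \frac{f''(\xi)}{2}(x - \bar{x}_i)^2 \]
and thus
\[ \int_{x_{i-1}}^{x_i} |f(x) - H^i(x)|\,dx \le \int_{x_{i-1}}^{x_i} \frac{M}{2}(x - \bar{x}_i)^2\,dx \qquad (23) \]
\[ = \frac{M}{3}\Big(\frac{h}{2}\Big)^3. \qquad (24) \]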
Summing over the N intervals gives the estimate on the right side of (22).
But why do the integrals of the $H^i(x)$ give the Midpoint Rule?
\[ \sum_{i=1}^{N} \int_{x_{i-1}}^{x_i} H^i(x)\,dx = \sum_{i=1}^{N} f(\bar{x}_i)(x_i - x_{i-1}). \]
Problems
1. If we approximate the integral in Example 1 by using the method of Theorem
3.4.2, how small should $h$ be chosen so that the error is $\le 10^{-4}$?
2. Show how to get an nth order scheme by generalizing the idea in Theo¬
rem 3.4.2.
3. Show that the linear approximation scheme with the $L^i(x)$ defined as in
this section gives the trapezoid rule
5. In each case below, compute how small $h$ must be to guarantee that the
error in the left-hand endpoint Riemann sum method and the error in the
Midpoint rule are $\le 10^{-4}$:
6. Suppose that the error bound (including round-off) for a first-order nu¬
merical scheme method is given by (19).
(a) Show that if h is small the error bound increases as we make h still
smaller.
(b) For what h is E(h) smallest?
(c) Suppose that the error bound for a second-order method is given by
7. Let $f$ be a continuous function on the interval $[a, b]$, which we divide into
$N$ segments of length $h = \frac{b-a}{N}$. Instead of approximating $f$ on each subinterval
$[x_{i-1}, x_i]$ by a linear function, we approximate $f$ by the quadratic
function that has the same values as $f$ does at $x_{i-1}$, at $x_i$, and at the midpoint
$\bar{x}_i$. Show that the sum of the integrals of these quadratic approximations
is
For each of the integrals in problem 5, how small must $h$ be so that the
error in Simpson's rule is $\le 10^{-4}$?
3.5 Discontinuities
Upper and lower sums were defined in Section 3.3 for any function on a
finite interval $[a, b]$ that is bounded. We proved there that if $f$ is continuous
on $[a, b]$, then the inf of the upper sums equals the sup of the lower
sums, and so, by definition, the Riemann integral exists. In this section
we show that the Riemann integral also exists for special classes of func¬
tions which have points in their domains where they are not continuous.
Such functions are called discontinuous, and the points are called dis¬
continuities. We begin with an example (problem 1 of Section 3.3) which
shows that if a function has too many discontinuities, the Riemann inte¬
gral may not exist.
\[ f(x) = \begin{cases} 0 & \text{if $x$ is rational} \\ 1 & \text{if $x$ is irrational} \end{cases} \]
Let P be any partition. Then, because each interval in the partition con¬
tains both rational and irrational numbers (problem 11 of Section 1.1 and
problem 11 of Section 1.4), the upper sum equals 1 and the lower sum
equals 0. Since this is true for each partition P, the infimum of the upper
sums is 1 and the supremum of the lower sums is 0. Thus the Riemann
integral of $f$ does not exist.
Proof. Suppose that $f$ is monotone increasing; the proof for the decreasing
case is similar. Let $M$ be such that $|f(x)| \le M$ for all $x \in [a, b]$ and let
$\varepsilon > 0$ be given. Let $P$ be a partition so that the subintervals have equal
length $h$ with $h \le \varepsilon/2M$. Since $f$ is monotone increasing, the value of
$f$ at the right-hand endpoint of each subinterval $[x_{i-1}, x_i]$ is the supremum
of $f$ over the interval and the value at the left-hand endpoint is the
infimum of $f$. Thus,
\[ U_P(f) = \sum_{i=1}^{N} f(x_i)(x_i - x_{i-1}), \qquad L_P(f) = \sum_{i=1}^{N} f(x_{i-1})(x_i - x_{i-1}), \]
and so,
\[ U_P(f) - L_P(f) = h \sum_{i=1}^{N} \big( f(x_i) - f(x_{i-1}) \big) = h\big( f(b) - f(a) \big) \le h(2M) \le \varepsilon. \]
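The telescoping identity $U_P(f) - L_P(f) = h(f(b) - f(a))$ for monotone functions holds even when $f$ has jumps, and it can be observed directly. This is a minimal sketch: the function names are hypothetical, and `step_like` is an arbitrary increasing function with jump discontinuities at $x = 1/3$ and $x = 2/3$, chosen only for illustration.

```python
import math

def monotone_upper_lower(f, a, b, n):
    """Upper and lower sums of an increasing f on n equal subintervals.

    For increasing f the sup on each subinterval is the value at the right
    endpoint and the inf is the value at the left endpoint.
    """
    h = (b - a) / n
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    upper = h * sum(f(x) for x in xs[1:])
    lower = h * sum(f(x) for x in xs[:-1])
    return upper, lower

def step_like(x):
    """Increasing on [0, 1], with jumps at x = 1/3 and x = 2/3."""
    return math.floor(3 * x) + x

a, b, n = 0.0, 1.0, 1000
upper, lower = monotone_upper_lower(step_like, a, b, n)
gap = upper - lower
predicted = (b - a) / n * (step_like(b) - step_like(a))
print(gap, predicted)
```

The gap shrinks in proportion to $h$, which is why the criterion of Lemma 3 can be met without any continuity assumption.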
is not because we defined $f(0) = 0$. Since $\lim_{x \to 0} f(x)$ does not exist, no
definition of $f$ at $x = 0$ would make $f$ a continuous function on $[0,1]$. For
each $\delta > 0$, $f$ is continuous on the interval $[\delta, 1]$, so the integral $\int_\delta^1 f(x)\,dx$
certainly exists. If the integral of $f$ on $[0,1]$ does exist, it seems natural to
guess that it should equal the limit of the numbers $\int_\delta^1 f(x)\,dx$ as $\delta \searrow 0$, if
this limit exists.
Proof. Choose M so that \f(x)\ < M for all x in [a, b], and let e > 0 be
given. Choose <5 small enough so that SM < §. The reason for this choice
will become clear later. Let $P_\delta$ be a partition of the interval $[a+\delta, b]$, which
we label with $a + \delta = x_1 < x_2 < \cdots < x_N = b$. Any such partition $P_\delta$ can
be extended to be a partition $P$ of $[a, b]$ by adding the point $x_0 = a$. Let
N N £
Therefore,
N N
N N
(27)
N
< ^ ^ -Mj(Xj X{ — i)
i—1
+ M\(x\ - ®0)
£
<
4'
Thus we have shown that, given $\varepsilon > 0$, we can choose a number $\delta > 0$ so
that $\left| \int_a^b f(x)\,dx - \int_{a+\delta}^b f(x)\,dx \right| \le \varepsilon$. This proves (25). □
\[ \int_0^1 \sin\frac{1}{x}\,dx - \int_\delta^1 \sin\frac{1}{x}\,dx = \int_0^\delta \sin\frac{1}{x}\,dx \qquad (28) \]
\[ \int_{a_{j-1}}^{a_j} f(x)\,dx = \lim_{\delta \searrow 0} \int_{a_{j-1}+\delta}^{a_j-\delta} f(x)\,dx. \qquad (29) \]
We sometimes denote the limit from the left by $\lim_{x \nearrow c} f(x)$ and the limit
from the right by $\lim_{x \searrow c} f(x)$. If $f$ has both right-hand and left-hand
limits at $c$ and those limits are unequal, then $f$ is said to have a jump
discontinuity at $c$. Notice that nothing is said about the value of $f$ at $c$.
It may not even be defined there. However, if $f$ is defined at $c$ and
defined on $[0,3]$. In the graph shown in Figure 3.5.2, we can see that $f$
has jump discontinuities at $x = 1$ and $x = 2$. At $x = 1$, the limits from
the left and right are $2$ and $1/2$, respectively. At $x = 2$, the limits from
the left and right are $1/2$ and $\pi/2$, respectively.
For example, in Example 3, there are three subintervals and three functions,
$f_1$, $f_2$, and $f_3$. It is easy to check (problem 6) that each $f_j$ is continuous
on $[a_{j-1}, a_j]$. Therefore, by problem 12 of Section 3.3,
\[ \int_{a_{j-1}}^{a_j} f_j(x)\,dx = \lim_{\delta \searrow 0} \int_{a_{j-1}+\delta}^{a_j-\delta} f_j(x)\,dx. \]
This means that for piecewise continuous functions, Theorem 3.5.3 can
be written in a simpler form:
\[ \int_0^3 f(x)\,dx = \int_0^1 2x\,dx + \int_1^2 \frac{1}{2}\,dx + \int_2^3 \sqrt{\frac{\pi^2}{4} - (x-2)^2}\,dx \]
problems 9, 11, and 12) or directly from the definition of Riemann inte-
grability (see problem 13).
Problems
\[ f(x) = \begin{cases} x & \text{if $x$ is rational} \\ 0 & \text{if $x$ is irrational} \end{cases} \]
0, 0 < x < 1
/(*) 1, 1 < x < 2
2, 2 < x < 3
,, X / b 0<x<±
f(X) " I * - | 5<*<1
(a) Prove that / is Riemann integrable without appealing to any theo¬
rems in this section.
(b) Which theorems in this section guarantee that / is Riemann inte¬
grable?
4. Each of the following functions is well defined for x > 0. For each, ex¬
plain whether Theorem 3.5.2 can be used to prove that the function is
Riemann integrable on [0,1]. Explain why the answers don't depend on
how the functions are defined at x = 0:
(a) sin1 2 3 4
Cb) |sini.
(c) In*.
(d) Hint: derive the inequality $\sin x \le x$ for $0 \le x \le \pi$ by using
the Fundamental Theorem of Calculus.
(b) Show that if fe has any other values at a or b, then fe would not be
continuous on [a, b\.
9. Use Corollary 3.5.4 to prove that the properties of the Riemann integral
(Theorems 3.3.3 - 3.3.7) hold for the integrals of piecewise continuous
functions on finite intervals.
10. Prove Theorem 3.5.3 by following the ideas of the proof of Theorem 3.5.2.
11. Let $f$ be bounded on $[a, b]$ and continuous except for finitely many points.
Let $P_n$ be a sequence of partitions so that the maximal subinterval length
goes to zero as $n \to \infty$, and let $S_n$ be a Riemann sum corresponding to
$P_n$. Prove that $S_n \to \int_a^b f(x)\,dx$ as $n \to \infty$.
12. Use the result of problem 11 to show that the properties of the Riemann
integral (Theorems 3.3.3 - 3.3.7) hold for functions which are bounded on
[a, b] and are continuous except for finitely many points.
13. Use the definition of Riemann integrability to show directly that if $f$ and
$g$ are Riemann integrable on $[a, b]$, then $f + g$ is Riemann integrable on
$[a, b]$ and
3.6 Improper Integrals
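\[ \int_\delta^1 \frac{1}{\sqrt{x}}\,dx = 2 - 2\sqrt{\delta}. \]
Since the right-hand side has a limit as $\delta \searrow 0$, namely 2, we could define
the integral of $1/\sqrt{x}$ on $[0,1]$ to be that limit:
\[ \int_0^1 \frac{1}{\sqrt{x}}\,dx = \lim_{\delta \searrow 0} \int_\delta^1 \frac{1}{\sqrt{x}}\,dx = 2. \]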
\[ \int_a^b f(x)\,dx = \lim_{\delta \searrow 0} \int_{a+\delta}^b f(x)\,dx \]
and call it the improper Riemann integral of $f$ on $[a, b]$. A similar definition
holds if $f$ is continuous but unbounded on $[a, b)$.
\[ \int_\delta^1 x^\alpha\,dx = \frac{1}{1+\alpha} - \frac{\delta^{1+\alpha}}{1+\alpha}. \]
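In the case $-1 < \alpha < 0$, the right-hand side has a limit as $\delta \searrow 0$, so the
improper Riemann integral exists and
\[ \int_0^1 x^\alpha\,dx = \frac{1}{1+\alpha}. \]
In the case $\alpha < -1$,
\[ \lim_{\delta \searrow 0} \left\{ \frac{1}{1+\alpha} - \frac{\delta^{1+\alpha}}{1+\alpha} \right\} = \infty, \]
so the improper Riemann integral does not exist. The case $\alpha = -1$ is left
as problem 3.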
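The two behaviors can be seen by evaluating the closed form for shrinking $\delta$. The sketch below is illustrative: `integral_from_delta` is a hypothetical helper name, and the exponents $-0.5$ and $-1.5$ are arbitrary representatives of the two cases.

```python
def integral_from_delta(alpha, delta):
    """Closed form of the integral of x**alpha over [delta, 1], for alpha != -1."""
    return (1.0 - delta ** (1.0 + alpha)) / (1.0 + alpha)

deltas = [10.0 ** (-k) for k in range(1, 9)]

# alpha = -0.5 lies in (-1, 0): the values approach 1/(1 + alpha) = 2.
convergent = [integral_from_delta(-0.5, d) for d in deltas]

# alpha = -1.5 is below -1: the values grow without bound.
divergent = [integral_from_delta(-1.5, d) for d in deltas]

for d, c, v in zip(deltas, convergent, divergent):
    print(f"delta = {d:.0e}   alpha=-0.5: {c:.6f}   alpha=-1.5: {v:.1f}")
```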
Example 3 Consider the integral $\int_0^1 \frac{\cos x}{\sqrt{1-x}}\,dx$. The problem here is near
$x = 1$, where the function is unbounded. Define
\[ I_\delta = \int_0^{1-\delta} \frac{\cos x}{\sqrt{1-x}}\,dx. \]
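We want to show that $\lim_{\delta \searrow 0} I_\delta$ exists. This means that for any sequence
$\delta_n \to 0$ with $\delta_n > 0$, the sequence $\{I_{\delta_n}\}$ has a limit and the limit is independent
of the sequence $\{\delta_n\}$ chosen. We did not emphasize this in
Example 2 since the limits there were simple and explicit. Here we will
be more careful. Let $\{\delta_n\}$ be such a sequence and consider the two terms
$I_{\delta_n}$ and $I_{\delta_m}$. Suppose $\delta_n \le \delta_m$. Then,
\[ |I_{\delta_n} - I_{\delta_m}| = \left| \int_0^{1-\delta_n} \frac{\cos x}{\sqrt{1-x}}\,dx - \int_0^{1-\delta_m} \frac{\cos x}{\sqrt{1-x}}\,dx \right| = \left| \int_{1-\delta_m}^{1-\delta_n} \frac{\cos x}{\sqrt{1-x}}\,dx \right| \]
\[ \le \int_{1-\delta_m}^{1-\delta_n} \frac{|\cos x|}{\sqrt{1-x}}\,dx \le \int_{1-\delta_m}^{1-\delta_n} \frac{1}{\sqrt{1-x}}\,dx = 2\sqrt{\delta_m} - 2\sqrt{\delta_n}. \]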
Since $\delta_n \searrow 0$ and the square root function is continuous, $\sqrt{\delta_n} \to 0$ as $n \to \infty$. Thus, given
$\varepsilon > 0$, we can choose $N$ so that $n \ge N$ and $m \ge N$ implies $|\sqrt{\delta_n} - \sqrt{\delta_m}| \le \varepsilon/2$.
By the above estimate, it follows that $|I_{\delta_n} - I_{\delta_m}| \le \varepsilon$ for $n \ge N$ and
$m \ge N$. Thus, $\{I_{\delta_n}\}$ is a Cauchy sequence and therefore converges since
the real numbers are complete.
We have shown that $\{I_{\delta_n}\}$ converges for any sequence $\delta_n \searrow 0$. If
$\gamma_n \searrow 0$ is another such sequence, then the same estimate as above shows
that
for all n. Since the right-hand side converges to zero as n —> oo, the
left-hand side must converge to zero too; that is.
\[ \int_a^\infty f(x)\,dx = \lim_{b \to \infty} \int_a^b f(x)\,dx. \]
1
5
ba~1
1
\[ \lim_{b \to \infty} \int_1^b \frac{1}{x^\alpha}\,dx = \frac{1}{\alpha - 1} \]
Figure 3.6.2
Sometimes one can show that improper Riemann integrals exist even
though one can not evaluate the limit explicitly.
cos b
cos(1) + dx.
~b~
The first two terms on the right clearly have nice limits as $b \to \infty$. To
handle the third term notice that if $c$ and $d$ are both large and $c < d$, then
\[ \left| \int_1^d \frac{\cos x}{x^2}\,dx - \int_1^c \frac{\cos x}{x^2}\,dx \right| \le \frac{1}{c} - \frac{1}{d}. \]
This estimate can be used similarly to the way in which the estimate was
used in Example 3 to show that $\int_1^{c_n} \frac{\cos x}{x^2}\,dx$ converges to a finite limit
as $c_n \to \infty$, and the limit is independent of the sequence $\{c_n\}$ chosen
(problem 7).
1 1 /*|+2mr+M
> - > ^- / dx
2 ^ 2 + 2mr +2n7r
JV
c l
7T E 1 + 4n
n=l
JV ,
M V- _1
>
87r 1 n
n=l
Since the harmonic series diverges (see Section 6.2), we see that the total
amount of area in the positive bumps of the function $x^{-1}\sin x$ is infinite.
Similarly there is an infinite amount of area in the negative bumps.
Yet the improper integral $\int_1^\infty \frac{\sin x}{x}\,dx$ exists because when one takes the
limit of
\[ \int_1^b \frac{\sin x}{x}\,dx \]
Problems
1. Prove that the following functions have improper Riemann integrals on
the interval [0,1]
1+x1 2
(a) \[x
cos 2ar
(b) 13/4 •
1
(c) y/x(x+l) '
3. Use the integral formula for the natural logarithm (see Example 2 of Sec¬
tion 4.3) and properties of the logarithm to prove that the improper Rie¬
mann integral fQ ~ dx does not exist.
6. Prove that the improper Riemann integral $\int_0^\infty e^{-x^2/2}\,dx$ exists. Hint: for
large $x$, estimate $e^{-x^2/2}$ by $e^{-x}$.
8. Suppose that $f$ and $g$ are continuous functions on the interval $(a, b]$ and
assume that the improper Riemann integrals $\int_a^b f(x)\,dx$ and $\int_a^b g(x)\,dx$
both exist. Prove that the improper Riemann integral $\int_a^b \big( f(x) + g(x) \big)\,dx$
exists and equals $\int_a^b f(x)\,dx + \int_a^b g(x)\,dx$.
pb nC nb
Show that if this is true for one c e (a, b) then it is true for all c e (a, b) and
the value of f(x) dx is independent of the choice of c.
\[ \int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{c} f(x)\,dx + \int_{c}^{\infty} f(x)\,dx. \]
Show that if this is true for one $c \in \mathbb{R}$ then it is true for all $c \in \mathbb{R}$ and the
value of $\int_{-\infty}^{\infty} f(x)\,dx$ is independent of the choice of $c$.
(d) J — OO
xdx.
12. Suppose that $f$ is a continuous function on $\mathbb{R}$ such that the improper integral
$\int_{-\infty}^{\infty} |f(x)|\,dx$ exists.
Projects
1. Suppose that / is continuous on the open interval (a, b). The purpose of
this project is to show that / can be extended to be a continuous function
on [a, b] if and only if / is uniformly continuous on (a, b). It is clear from
Theorem 3.2.5 that if / can be extended it must be uniformly continuous.
So, suppose / is uniformly continuous on (a, b).
Suppose that we are interested in E10 but don't want to evaluate the in¬
tegral numerically.
(c) Use the recursion relation to compute Ex, E2,..., E10. Ignore the
fact that your calculator has an “e” key and use the (good) approxi¬
mate value of 2.718 for e. Do you think you've found a good value
for Eio? Do your numbers satisfy the inequality in (b)? Explain
what happened by deriving a recursion relation for the difference
between your approximate En (call it En) and the real En.
(d) Rewrite the recursion relation in (a) to express En_x in terms of En.
Pick any number between -100 and +100 for E20 and use the re¬
cursion relation to compute Ex9, E18,..., Ex0.
(e) Do you think this value for E10 is right? See if it is right by numeri¬
cally estimating the integral for E10. How can this be?
(f) Explain what is going on!
4. The purpose of this project is to prove that the trapezoid rule has error
bound $E(h) = \frac{M h^2 (b-a)}{12}$. We use the linear approximation $L^i(x)$ on
$[x_{i-1}, x_i]$ as defined in Section 3.4.
\[ g_i(t) = f(t) - L^i(t) - \big( f(x) - L^i(x) \big) \frac{(t - x_{i-1})(t - x_i)}{(x - x_{i-1})(x - x_i)} \]
Differentiation
\[ \frac{f(x+h) - f(x)}{h} \]
\[ f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} \qquad (1) \]
for every sequence $\{h_n\}$ such that $h_n \ne 0$ and $x + h_n \in \mathrm{Dom}(f)$ for each
$n$ and $h_n \to 0$. Equivalently, given $\varepsilon > 0$, there is a $\delta > 0$ such that
\[ \left| f'(x) - \frac{f(x+h) - f(x)}{h} \right| \le \varepsilon \quad \text{if } 0 < |h| \le \delta. \]
Note that $f'$ is itself a function since the limit will, in general, depend on
$x$. Since $x$ is required to be in the domain of $f$ we always have $\mathrm{Dom}(f') \subseteq \mathrm{Dom}(f)$.
\[ = 2x + h. \]
Since the limit of $(2x + h)$ exists as $h \to 0$ and equals $2x$, we conclude
from the definition that $f'$ exists at $x$ and $f'(x) = 2x$.
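The computation above says that the difference quotient of $x^2$ is exactly $2x + h$. A short numerical sketch confirms this; the function names and the sample point $x = 3$ are illustrative choices, not part of the text.

```python
def difference_quotient(f, x, h):
    """The quotient (f(x + h) - f(x)) / h from definition (1)."""
    return (f(x + h) - f(x)) / h

def square(x):
    return x * x

x = 3.0
quotients = [(h, difference_quotient(square, x, h)) for h in (0.1, 0.01, 0.001, -0.001)]

for h, q in quotients:
    print(f"h = {h:+.3f}   quotient = {q:.6f}   2x + h = {2 * x + h:.6f}")
```

As $h$ shrinks from either side, the quotient approaches $2x = 6$.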
The difference quotient (1) is the slope of the straight line through the
points $(x, f(x))$ and $(x+h, f(x+h))$ on the graph of $f$; see Figure 4.1.1(a).
Thus, intuitively, $f'(x)$ is the limit of these slopes, that is, the slope of the
line tangent to the graph of $f$ at $(x, f(x))$.
Figure 4.1.1
4.1 Differentiable Functions 123
/(o + M -/(0)
hn
f(0 + hn)-m = _L
hn
/(0 + hn)-/(0) = L
hn
Thus the sequence of quotients does not converge as $n \to \infty$. The difference
quotient does have a limit from the left and a limit from the right at
$x = 0$, but the limits are not the same.
Proof. We write
f{x + h) - f(x) h
f(x + h) - f(x)
\[ \left( \frac{f}{g} \right)' = \frac{g f' - f g'}{g^2}. \]
As above, the difference quotients for $f$ and $g$ on the right side have
limits $f'(x)$ and $g'(x)$ respectively. Furthermore, by Theorem 4.1.1, $g(x+h) \to g(x)$
as $h \to 0$. Thus, by Theorems 2.2.3 - 2.2.5, the right-hand side
has a limit, and the limit is $f'(x)g(x) + f(x)g'(x)$. Therefore, the left-hand
side has a limit, and the limit is $f'(x)g(x) + f(x)g'(x)$.
We omit the proof of (c). □
Example 3 To show how useful this theorem is, we will prove that
polynomials are differentiable at all x and derive the usual formula for
the derivative. First, it is easy to see that if $f(x) = a$ for all $x$, then the
difference quotient is zero for all $h$, so $f'(x) = 0$. It is also easy to use
the definition to see that if $f(x) = x$, then $f'(x) = 1$. To differentiate
$f(x) = x^2$, notice that $x^2 = x \cdot x$. So, applying part (b) of the theorem, we
see that $f(x) = x^2$ is differentiable and
\[ (x \cdot x)' = x \cdot 1 + 1 \cdot x = 2x. \]
\[ (x^{n+1})' = (x \cdot x^n)' = (x)'(x^n) + x(x^n)' = x^n + x \cdot n x^{n-1} = (n+1)x^n. \]
Therefore, by induction, we have proven that $(x^n)' = n x^{n-1}$ for all $n \ge 1$.
Using part (a) of the theorem repeatedly, we obtain
by the formula
\[ H(y) = \frac{g(y) - g(f(x))}{y - f(x)} \qquad (2) \]
Let $y = f(x) + h$. Then
We will often use sin x, cos x, and the exponential function in exam¬
ples even though we will not formally define them until Chapter 6. We
assume that the reader is familiar with them, knows that they are differ¬
entiable, and knows their derivatives. The chain rule allows us to assert
that complicated functions like $\sin(e^{2x})$ are differentiable (and it gives a
formula for the derivative!) without our having to take the limit of the
difference quotient.
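The chain rule formula for this example can be compared with a numerical difference quotient. This is a minimal sketch: the function names are hypothetical, the symmetric quotient with step $h = 10^{-6}$ is just one convenient numerical approximation, and the point $x = 0.3$ is arbitrary.

```python
import math

def composite(x):
    """The function sin(e^{2x})."""
    return math.sin(math.exp(2.0 * x))

def chain_rule_derivative(x):
    """cos(e^{2x}) times the derivative of e^{2x}, as the chain rule gives."""
    return math.cos(math.exp(2.0 * x)) * 2.0 * math.exp(2.0 * x)

def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient, used only as a numerical check."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

x = 0.3
numeric = central_difference(composite, x)
exact = chain_rule_derivative(x)
print(numeric, exact)
```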
again in the sets (by Theorems 3.1.1 and 4.1.2). They are also algebras
because products of functions in these spaces are again in the spaces.
If a function $f$ is differentiable in an open interval about $x$ and its
derivative $f'$ is differentiable at $x$, we say that $f$ is twice differentiable at
$x$ and denote the second derivative by
\[ f''(x) = \frac{d}{dx}\left( \frac{df}{dx} \right). \]
Problems
(a) |sinx|.
(b) sin|x|.
5. Let $p(x)$ be a polynomial and suppose that $x_0$ is a real root; that is, $p(x_0) = 0$.
When will $|p(x)|$ be differentiable at $x_0$?
128 Chapter 4. Differentiation
8. Suppose that $f \in C^{(1)}[a, b]$. Prove that $f$ is Lipschitz continuous on $[a, b]$.
Hint: use the Mean Value Theorem.
\[ \lim_{h \to 0} \frac{f(x+h) - f(x-h)}{2h} = f'(x). \]
11. Suppose that /(x) > 0 for all x e R. Assume that f(x)2 is differentiable.
Is /(x) necessarily differentiable?
12. Let f(x) = x2 sin (1/x) on the set E = ( —oo, 0) U (0, oo).
13. Let g be the function g(x) = |x| on the interval [—1,1). Extend g to the
whole real line by requiring that g(x + 2) = g(x) for all x.
Figure 4.2.1
□ Theorem 4.2.1 Suppose that $f$ is continuous on the finite interval $[a, b]$.
Let $c$ be a point where $f$ attains its maximum. If $a < c < b$ and $f$ is
differentiable at $c$, then $f'(c) = 0$.
Proof. Suppose $f'(c) > 0$. Since $\frac{1}{h}\big( f(c+h) - f(c) \big) \to f'(c)$, we can
choose a $\delta$ so that
A similar result holds for the point where / achieves its minimum
(problem 1).
The next theorem says that if / is zero at two different points then it
must achieve a maximum or a minimum somewhere in between.
Proof. If $f(x) = 0$ for all $x \in (a, b)$, then $f'(x) = 0$ for all $x$, so we can
choose $c$ to be any point in the interval. Otherwise, there must be a point
$x_0$ such that $f(x_0) \ne 0$. If $f(x_0) > 0$, then, by Theorem 3.2.2, $f$ achieves
a positive maximum at some point $c$ in the interval $[a, b]$. The point $c$
cannot equal $a$ or $b$ since $f$ is zero there, so $a < c < b$. Thus, by Theorem
4.2.1, $f'(c) = 0$. See Figure 4.2.1(b). If $f(x_0) < 0$, a similar proof, using
the analogue of Theorem 4.2.1 for minima, gives the result. □
The next theorem states that there must be a point on the graph of
a function between $(a, f(a))$ and $(b, f(b))$ where the slope of the tangent
line equals the slope of the straight line between $(a, f(a))$ and $(b, f(b))$.
See Figure 4.2.1(c).
\[ f'(c) = \frac{f(b) - f(a)}{b - a} \]
Proof. The idea of the proof is simple; we subtract from / the function
whose graph is the straight line and then use Rolle's theorem. Define
\[ \lim_{x \searrow 0} \frac{\sin x}{x} = \lim_{x \searrow 0} \cos c(x) = 1. \]
Since $\sin x$ is an odd function, the limit from the left is the same. Thus,
\[ \lim_{x \to 0} \frac{\sin x}{x} = 1. \]
Thus,
\[ f(b) - f(a) = \sum_{i=1}^{N} \big( f(x_i) - f(x_{i-1}) \big) \qquad (5) \]
\[ = \sum_{i=1}^{N} f'(x_i^*)(x_i - x_{i-1}). \qquad (6) \]
The sum (6) is a Riemann sum for $\int_a^b f'(x)\,dx$. By Corollary 3.3.2, the Riemann
sum converges to $\int_a^b f'(x)\,dx$ as the maximal length of the subintervals
gets smaller since $f'$ is continuous. But each of the Riemann sums
equals $f(b) - f(a)$, so the limit of the Riemann sums equals $f(b) - f(a)$.
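The key point of the argument is that, with sample points supplied by the Mean Value Theorem, every Riemann sum for $f'$ equals $f(b) - f(a)$ exactly, no matter how coarse or uneven the partition. The sketch below illustrates this for the arbitrary choice $f(x) = x^3$ on $[0, 2]$, where the Mean Value Theorem point can be written down explicitly; the function names are hypothetical.

```python
import math

def mvt_point(u, v):
    """Point c in (u, v) with 3*c**2 == (v**3 - u**3) / (v - u), for 0 <= u < v.

    For f(x) = x**3 the right-hand side equals u*u + u*v + v*v.
    """
    return math.sqrt((u * u + u * v + v * v) / 3.0)

def riemann_sum_at_mvt_points(partition):
    """Riemann sum for f'(x) = 3x^2 using the Mean Value Theorem points."""
    return sum(
        3.0 * mvt_point(u, v) ** 2 * (v - u)
        for u, v in zip(partition, partition[1:])
    )

partition = [0.0, 0.1, 0.5, 0.6, 1.3, 2.0]    # deliberately uneven
total = riemann_sum_at_mvt_points(partition)   # should equal f(2) - f(0) = 8
print(total)
```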
If one is able to, then $\int_a^b g(x)\,dx = G(b) - G(a)$, by the Fundamental Theorem.
\[ F(x) = \int_a^x f(t)\,dt \]
Proof. Let $x \in [a, b)$. We shall show that the difference quotient for $F$
has the right-hand limit $f(x)$ at $x$. Suppose $h > 0$. Then
\[ \frac{F(x+h) - F(x)}{h} = \frac{1}{h}\int_a^{x+h} f(t)\,dt - \frac{1}{h}\int_a^{x} f(t)\,dt = \frac{1}{h}\int_x^{x+h} f(t)\,dt. \]
For the moment, fix $h$ and let $m$ and $M$ denote the minimum and maximum
of $f$ on the interval $[x, x+h]$. Then, by Theorem 3.3.4,
\[ m \le \frac{1}{h}\int_x^{x+h} f(t)\,dt \le M. \]
1 px+h
Note that Part II answers a very natural question. Does every contin¬
uous function have an antiderivative? That is, given /, is there an F so
that F' = /?
4.2 The Fundamental Theorem of Calculus 133
All five theorems proven in this section have versions with weaker
hypotheses.
Problems
1. State and prove the analogue of Theorem 4.2.1 for the point where /
achieves its minimum.
2. Let $f(x) = e^{-x^2}$. Find a formula for a function $F$ so that $F'(x) = f(x)$.
3. Let $g$ be twice continuously differentiable on $[a, b]$. Suppose that there are
three distinct points, $x_1$, $x_2$, $x_3$, in $[a, b]$ so that $g(x_1) = g(x_2) = g(x_3) = 0$.
Prove that there is a $c$ satisfying $a < c < b$ such that $g''(c) = 0$. Hint: use
Rolle's theorem twice.
4. Prove that
x-+0 x
Hint: use the Mean Value Theorem.
7. Let $f$ and $g$ be in $C^{(1)}(\mathbb{R})$ and suppose that $f(0) = g(0)$ and $f'(x) \le g'(x)$
for all $x \ge 0$. Prove that $f(x) \le g(x)$ for all $x \ge 0$.
(a) there is a $\delta > 0$ such that $f''(x) > 0$ for all $x \in (c - \delta, c + \delta)$.
(b) $f'(x) < 0$ for $x \in (c - \delta, c)$ and $f'(x) > 0$ for $x \in (c, c + \delta)$.
(c) $f(x) > f(c)$ for $x \in (c - \delta, c + \delta)$ such that $x \ne c$; that is, $c$ is a local
minimum.
(d) the hypothesis /"(c) > 0 is not sufficient to guarantee the conclu¬
sion of part (c).
12. Suppose that $f$ and $g$ are continuously differentiable on $[a, b]$. Prove that
Prove that $f \in C^{(1)}(\mathbb{R})$ and express $f'$ in terms of $f$. Then find $f$. Hint:
see project 1.
if the slope $f'(x)$ is changing, then the graph of $f$ will curve away from
the straight line approximation. See Figure 4.3.1. Since the rate of change
of $f'(x)$ is given by $f''(x)$, our intuition suggests that the size of the second
derivative of $f$ will determine how good the linear approximation
is. We shall see below that this intuition is correct.
If we want to make better approximations, we could use a quadratic
approximation
which has the same value, the same derivative and the same second
derivative as $f$ at $x_0$. Continuing in this way, we define the $n$th Taylor
polynomial of $f$ at $x_0$ by
f{n)(x0)
... + (x
n\
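\[ T^{(0)}(x, 0) = 1 \]
\[ T^{(1)}(x, 0) = 1 \]
\[ T^{(2)}(x, 0) = 1 - \frac{x^2}{2!} \]
\[ T^{(3)}(x, 0) = 1 - \frac{x^2}{2!} \]
\[ T^{(4)}(x, 0) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!}. \]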
The graphs of cos i, r<°>(z,0), T^(x, 0), TM>(x,0): and T";>(x,<i) are
shown in Figure 4.3.2.
Now, since $g(x) = 0$ (by the way we chose $\alpha$ in (8)), Rolle's theorem
implies that there is an $x_1$ between $x$ and $x_0$ so that $g'(x_1) = 0$. Since
$g'(x_0) = 0$ too, Rolle's theorem implies that there is an $x_2$ between $x_1$
and $x_0$ so that $g^{(2)}(x_2) = 0$. Continuing in this manner, we find an $x_{n+1}$
so that $g^{(n+1)}(x_{n+1}) = 0$. Setting $\xi = x_{n+1}$, we conclude that (7) holds.
□
Notice that the Mean Value Theorem is just Taylor's theorem for the
case n = 0.
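\[ \left| \cos x - \Big\{ 1 - \frac{x^2}{2!} + \frac{x^4}{4!} \Big\} \right| \le \left| \frac{\sin \xi}{5!}\,x^5 \right| \le \frac{|x|^5}{5!}. \]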
Thus, the approximation is very good near the origin but gets much
worse as x grows, as we saw in Figure 4.3.2.
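The estimate can be tabulated to see how quickly the approximation degrades away from the origin. The following sketch is illustrative: the function names and the sample points are arbitrary choices, and `taylor_cos_4` is the fourth Taylor polynomial of $\cos x$ about $0$.

```python
import math

def taylor_cos_4(x):
    """Fourth Taylor polynomial of cos about 0: 1 - x^2/2! + x^4/4!."""
    return 1.0 - x ** 2 / 2.0 + x ** 4 / 24.0

def remainder_bound(x):
    """The bound |x|^5 / 5! from Taylor's theorem, using |sin| <= 1."""
    return abs(x) ** 5 / math.factorial(5)

points = [0.1, 0.5, 1.0, 2.0, 3.0]
rows = [(x, abs(math.cos(x) - taylor_cos_4(x)), remainder_bound(x)) for x in points]

for x, err, bound in rows:
    print(f"x = {x:.1f}   actual error = {err:.3e}   bound = {bound:.3e}")
```

The actual error is always below the bound, and both grow rapidly as $x$ moves away from the origin.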
\[ \ln x = \int_1^x \frac{1}{t}\,dt. \]
By the Fundamental Theorem of Calculus, $\ln x$ is continuously differentiable
and $(\ln x)' = 1/x$. Thus $\ln x$ is infinitely often continuously differentiable
on $(0, \infty)$. The first three derivatives evaluated at $x = 1$ are
$f'(1) = 1$, $f''(1) = -1$, and $f^{(3)}(1) = 2$. We shall use the third-order
Taylor polynomial about the point $x_0 = 1$,
\[ \lim_{x \to x_0} \frac{f(x)}{g(x)} = \frac{f'(x_0)}{g'(x_0)}, \]
Proof. Since $g'(x_0) \ne 0$, both $g$ and $g'$ are nonzero in a small interval
about $x_0$ except for the root of $g$ at $x_0$ (problem 12). Using the Mean
Value Theorem, we can therefore compute
\[ \lim_{x \to x_0} \frac{f(x)}{g(x)} = \lim_{x \to x_0} \frac{f'(\xi_1)}{g'(\xi_2)} = \frac{f'(x_0)}{g'(x_0)} \]
since $f'$ and $g'$ are continuous. We used the fact that $\xi_1$ and $\xi_2$ are between
$x_0$ and $x$, so $\xi_1 \to x_0$ and $\xi_2 \to x_0$ as $x \to x_0$. □
Problems
1. Compare the graph of In x to the graphs of the first five Taylor polynomi¬
als of In x about x = 1.
(a) limx^o^.
(b) lim^i
4. Compute
/>sin2x
lim —- / cos 51 dt.
z-m) sin x Jo
6. Use one of the methods in Section 3.4 to estimate the integral for In 1.2
and compare the result with that obtained in Example 2.
(b) Prove that In an = n In a for all a > 0 and all integers n > 0 .
(c) Find the first few Taylor polynomials of f(x) = ln(l + x) about
x = 0. *
r /(X) _ f"(x0)
X g(x) g"{x0)'
Finding the roots of functions, that is, the values of x so that f(x) = 0, is
important in both pure and applied mathematics. The most familiar use
is finding the roots of the derivative of a function in order to determine
possible local maxima and minima. Even if f(x) is a relatively simple
function like a polynomial, this is not an easy question. The quadratic
formula allows one to find the roots of quadratic polynomials, and more
complicated formulas allow one to write down analytic expressions for
the roots of cubic and quartic polynomials. But it can be proven that
there are no general formulas for quintic and higher order polynomials.
Newton's method, discovered by Isaac Newton, is based on a simple
geometric idea. Consider the graph of a function f(x) near one of its
roots.
4.4 Newton's Method 141
Figure 4.4.1
Let $x_n$ denote the $n$th guess for the root. We will explain how to construct
the $(n+1)$st guess. If $f(x_n) = 0$, we have found the root and we can stop.
Otherwise, go to the point $(x_n, f(x_n))$ on the graph of $f$. Construct the
tangent line to the graph at this point; it has slope $f'(x_n)$. Then $x_{n+1}$
is defined to be the point where the tangent line crosses the $x$-axis. See
Figure 4.4.1. To find the formula for $x_{n+1}$, consider the triangle whose
vertices are at $(x_n, 0)$, $(x_{n+1}, 0)$, and $(x_n, f(x_n))$. The height is $f(x_n)$ and
the base is $x_{n+1} - x_n$, so the slope of the hypotenuse (which is a piece of
the tangent line) is minus the ratio of the height to the base. That is,
$$f'(x_n) = \frac{-f(x_n)}{x_{n+1} - x_n}$$
or
$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}. \qquad (9)$$
$x_0 = 1.0$
$x_1 = 1.5$
$x_2 = 1.41667$
$x_3 = 1.41422$
$x_4 = 1.41421$
142 Chapter 4. Differentiation
$x_0 = 1.9$
$x_1 = 1.47632$
$x_2 = 1.41552$
$x_3 = 1.41421$
$x_4 = 1.41421$
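The two tables of iterates are consistent with applying formula (9) to $f(x) = x^2 - 2$, whose positive root is $\sqrt{2}$. The following is a minimal sketch under that assumption; the function names `newton_iterates`, `f`, and `fprime` are illustrative and do not come from the text.

```python
def newton_iterates(f, fprime, x0, steps):
    """Return [x_0, x_1, ..., x_steps] using x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        slope = fprime(x)
        if slope == 0:
            # A horizontal tangent never crosses the x-axis, so there is no next guess.
            raise ZeroDivisionError("tangent line is horizontal; Newton's method breaks down")
        xs.append(x - f(x) / slope)
    return xs


def f(x):
    # Assumed example function: its roots are +sqrt(2) and -sqrt(2).
    return x * x - 2.0


def fprime(x):
    return 2.0 * x


for start in (1.0, 1.9):
    print([round(x, 5) for x in newton_iterates(f, fprime, start, 4)])
```

Starting from a negative guess sends the iterates to $-\sqrt{2}$, and starting at $0$ raises the error, which matches the pitfalls discussed next.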
It is clear, even in a simple case like this, that there are pitfalls in
Newton's method. First, notice that if we start with $x_0 < 0$, then
the sequence of iterates will converge to the root $-\sqrt{2}$ rather than to the
root $\sqrt{2}$. Second, if we had been unlucky enough to start with the initial
guess $x_0 = 0$, then the tangent line has zero slope and so doesn't intersect
the $x$-axis anywhere; hence the method breaks down immediately. Intuition
suggests that we can avoid these pitfalls if we start close enough to
the root that we want to find. The following theorem shows that under
reasonable hypotheses the intuition is correct.
Proof. Note that we are not sure yet that the iterates given by formula
(9) even exist: for how do we know the iterates stay in the interval or
that $f'(x_n) \neq 0$ for each of the $x_n$? For the moment we assume that
everything is all right and make an estimate. That estimate will tell us
how close we need to choose $x_0$ to $\bar{x}$. From (9),
$$x_{n+1} - \bar{x} = x_n - \bar{x} - \frac{f(x_n)}{f'(x_n)} \qquad (10)$$
$$= x_n - \bar{x} - \frac{f(x_n) - f(\bar{x})}{f'(x_n)} \qquad (11)$$
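$$|f'(x)| \geq \frac{|f'(\bar{x})|}{2} \quad \text{for all } x \in [\bar{x} - \delta_1, \bar{x} + \delta_1].$$
Equivalently,
$$\frac{1}{|f'(x)|} \leq \frac{2}{|f'(\bar{x})|} \quad \text{for all } x \in [\bar{x} - \delta_1, \bar{x} + \delta_1].$$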
Next, since $f''$ is continuous, there is an $M$ so that $|f''(x)| \leq M$ for all
$x \in [\bar{x} - \delta_1, \bar{x} + \delta_1]$. Thus,
3M
< \Xr A < 2
{rhj)if"('Tn) + f vn)){Xn x) I /'(*)!
Figure 4.4.2
$$\left|\frac{f(x_n)}{f'(x_n)}\right| = (1 + x_n^2)\,|\arctan x_n| > 2|x_n|,$$
and therefore
$$|x_{n+1}| > |x_n|$$
if \xn\ > b. Thus if we begin with |xo| > b, the successive iterates will get
further and further away from the origin.
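The divergence described here can be observed numerically. The sketch below assumes that the example function is $f(x) = \arctan x$ (inferred from the displayed estimate, since $f(x)/f'(x) = (1 + x^2)\arctan x$); the helper name `newton_arctan` and the starting values are illustrative choices.

```python
import math


def newton_arctan(x0, steps):
    """Newton iterates for f(x) = arctan(x), where f(x)/f'(x) = (1 + x^2) * arctan(x)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - (1.0 + x * x) * math.atan(x))
    return xs


# A start far from the root 0: the iterates alternate in sign and grow in size.
print(newton_arctan(2.0, 4))
# A start close enough to the root: the iterates collapse onto 0.
print(newton_arctan(1.0, 5))
```

The contrast between the two starting points illustrates why a theorem about Newton's method needs a hypothesis that the initial guess is close to the root.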
Problems
5. (a) Which hypothesis of Theorem 4.4.1 is not satisfied by the function
$f(x) = x^2$?
(b) For different choices of $x_0$, provide numerical evidence that Newton's
method converges rapidly to zero anyway.
(c) Prove that for $f(x) = x^2$ and any $x_0$, the iterates in Newton's method
converge to zero.
(a) How many iterates are required in each method to find $\sqrt{2}$ to within
$10^{-6}$?
(b) Explain carefully why Newton's method is so much faster.
8. Explain why the factor $(1/2)^n$ in (17) can be made to be $(1/10)^n$ if we start
even closer to the root.
10. For $\lambda$ larger than 2.6 (approximately), the function $f(x) = x^3 - \lambda x^2 + x + 1$
has three real roots, which we denote $r_1(\lambda)$, $r_2(\lambda)$, and $r_3(\lambda)$. Use Newton's
method to generate good graphs of $r_1$, $r_2$, and $r_3$ on the interval
$[2.6, 4]$.
Suppose $L < R$. By monotonicity $f(x) \leq L$ for $x < c$ and $f(z) \geq R$ for
$z > c$. Thus $\mathrm{Ran}(f)$ would contain at most one point, $f(c)$, in the interval
$(L, R)$. This is impossible since then $\mathrm{Ran}(f)$ would not be an interval,
which would contradict the hypothesis. Therefore $L = f(c) = R$. Now
suppose $c_n \to c$ and let $\varepsilon$ be given. Since $f(c) = \sup_{x < c} f(x)$, we can
choose $x_1 < c$ so that $f(c) - f(x_1) < \varepsilon$. Similarly we can choose $z_1 > c$ so
that $f(z_1) - f(c) < \varepsilon$. That is,
for $n \geq N$. This proves that $|f(c_n) - f(c)| \leq \varepsilon$ for $n \geq N$, so $f$ is continuous.
The proofs at the endpoints $c = a$ and $c = b$ are similar. This proves
(a). Note that we needed only monotonicity, not strict monotonicity.
To prove (b) we recall that if $f$ is continuous, then $\mathrm{Ran}(f)$ is an interval
(Corollary 3.2.4). Since $f$ is strictly monotone, $f^{-1}$ exists, and it is
easy to check that $f^{-1}$ is strictly monotone on the interval $\mathrm{Ran}(f)$. Since
the range of $f^{-1}$ is $[a, b]$, part (a) implies that $f^{-1}$ is continuous. □
= —■ (19)
/ (*)
f~l{yn) - x
f(f-HVn)) ~ f{x)
_
f(x + hn) ~ f[x)
$$\frac{1}{f'(f^{-1}(y))}$$
is continuous by Theorems 3.1.1(d) and 3.1.2. □
Next, we prove two theorems which justify the usual techniques for
changing variables in integrals. Both of these theorems have slick proofs
that use the chain rule and the Fundamental Theorem of Calculus (problems
6 and 7). We give the proofs below because changing variables in
Riemann sums is central to the definition of the integral over more complicated
objects like curves and surfaces.
<21)
i= 1 i=1
Now let the interval length in partition $P$ get small. By Corollary 3.3.2,
the Riemann sum on the left side of (21) converges to the left side of (20).
On the other hand, since $\phi^{-1}$ is continuous, and therefore uniformly continuous,
on $[a, b]$, the interval length of the {C} partition also gets small.
Thus, again by Corollary 3.3.2, the right-hand side of (22) converges to
the right-hand side of (20). Thus (20) holds. □
$$\int_a^b f(\phi(x))\,dx = \int_{\phi(a)}^{\phi(b)} f(t)\,(\phi^{-1}(t))'\,dt. \qquad (23)$$
E/w
i=l
Letting the interval length in the partitions get small, we obtain (23). □
Problems
5. Prove that the function $\tan x$ is strictly monotone increasing on the interval
$(-\frac{\pi}{2}, \frac{\pi}{2})$. Use Theorem 4.5.2 to compute a formula for the derivative
of $\arctan x$.
4.6 Functions of Two Variables 151
6. Give a different proof of Theorem 4.5.4 as follows. Let $F(u) = \int_a^u f(x)\,dx$
and define $G(t) = F(\phi(t))$. Now, apply the Fundamental Theorem of
Calculus to $G'$. Note that this proof does not require $\phi$ to be monotone.
(a) Explain why $\ln x$ defined on $(0, \infty)$ has an inverse function. Call it
$\phi$. What are the domain and range of $\phi$? Is $\phi$ continuously differentiable?
(b) Show by a change of variables that
If $(x_1, y_1)$ and $(x_2, y_2)$ are points in $\mathbb{R}^2$, we define the distance between
them to be
Definition. Let $p_n = (x_n, y_n)$ be a sequence of points in the plane and let
$p = (x, y)$. We say that $\{p_n\}$ converges to $p$, written $p_n \to p$, if $\|p_n - p\| \to 0$.
It is not hard to check that $p_n \to p$ if and only if $x_n \to x$ and $y_n \to y$
(problem 10 of Section 2.2). Now that we have a notion of convergence
we can define continuity.
$$3x^2 y + y^3 + 1 = f(p),$$
where we used the limit theorems from Section 2.2 in the second step.
Thus $f$ is continuous on $\mathbb{R}^2$. The same idea can be used to prove that all
polynomials in two variables are continuous functions on $\mathbb{R}^2$.
The three theorems in Section 3.1 have analogues for functions of two
variables. The sum, product, and quotient of continuous functions are
continuous except possibly where the denominator vanishes. If $f$ is a
In all these cases, the proofs are virtually identical to those in Section
3.1 except that the absolute value, $|\cdot|$, in $\mathbb{R}$ is replaced by the Euclidean
distance, $\|\cdot\|$. The reader is asked to give the proof of the third result in
problem 1.
$\|q - p\| \leq \delta$ implies $|f(q) - f(p)| \leq \varepsilon$ for all $q$ and $p$ in $E$.
Sets of the form $R = \{(x, y) \mid a_1 \leq x \leq b_1,\ a_2 \leq y \leq b_2\}$ are called closed
rectangles. The following theorem generalizes Theorems 3.2.1, 3.2.2, and
3.2.5.
(a) $f$ is bounded on $R$.
Proof. The proofs of all three parts follow closely the proofs in Section
3.2 except that they use the generalization of the Bolzano-Weierstrass
theorem proved in problems 10 and 11 of Section 2.6. We will prove part
(a), leaving parts (b) and (c) to the problems. To prove that $f$ is bounded
on $R$, we need to show that there is an $M$ so that $|f(p)| \leq M$ for all
$p \in R$. Suppose that this is not true. Then for each large integer $n$, there
is a $p_n \in R$ such that $|f(p_n)| \geq n$. By the (generalization of the) Bolzano-Weierstrass
theorem, the sequence $\{p_n\}$ has a subsequence $\{p_{n_k}\}$ that
In the proof of Theorem 4.6.1, the only property of R that was used
was that sequences in R have subsequences that converge to a point of
R. Sets which have this property are called compact sets. Thus, continu¬
ous functions on compact sets have the three properties (a), (b), and (c).
Compact sets are investigated in problems 5 and 6.
if the limit exists. Similarly, suppose that all the points $(x_0, y_0 + h)$ are in
the domain of $f$ for $h$ small enough. We define the partial derivative of
$f$ with respect to $y$ at $(x_0, y_0)$, written $\frac{\partial f}{\partial y}(x_0, y_0)$, by
if the limit exists. We shall often use the simpler notation $f_x(x_0, y_0)$ and
$f_y(x_0, y_0)$ for $\frac{\partial f}{\partial x}(x_0, y_0)$ and $\frac{\partial f}{\partial y}(x_0, y_0)$, respectively. In the language of
calculus, we compute $f_x$ by holding $y$ fixed and differentiating with respect
to $x$, and we compute $f_y$ by holding $x$ fixed and differentiating
with respect to $y$.
The same argument shows that the partial derivatives of all polynomials
in two variables exist.
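The rule "hold one variable fixed and differentiate in the other" can be checked numerically with difference quotients. The sketch below uses a hypothetical polynomial, $f(x, y) = 3x^2 y + y^3 + 1$, chosen only for illustration, and a symmetric difference quotient rather than the one-sided quotient of the definition; the helper names are assumptions, not notation from the text.

```python
def f(x, y):
    # Hypothetical polynomial in two variables, chosen for illustration.
    return 3.0 * x * x * y + y ** 3 + 1.0


def partial_x(g, x, y, h=1e-5):
    """Symmetric difference quotient in x with y held fixed."""
    return (g(x + h, y) - g(x - h, y)) / (2.0 * h)


def partial_y(g, x, y, h=1e-5):
    """Symmetric difference quotient in y with x held fixed."""
    return (g(x, y + h) - g(x, y - h)) / (2.0 * h)


x0, y0 = 1.0, 2.0
# Calculus gives f_x = 6xy and f_y = 3x^2 + 3y^2 for this polynomial.
print(partial_x(f, x0, y0), 6.0 * x0 * y0)
print(partial_y(f, x0, y0), 3.0 * x0 ** 2 + 3.0 * y0 ** 2)
```

The numerical quotients agree with the formulas obtained by ordinary one-variable differentiation, up to the small error of the finite step $h$.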
□ Theorem 4.6.2 Suppose that $f$ is differentiable at $(x_0, y_0)$ and let $a$ and
$b$ be any real numbers. Then,
Proof. Let $D_\delta(x_0, y_0)$ be a disk in which the partials exist and are continuous.
Choose $h$ small enough so that the point $(x_0 + ah, y_0 + bh)$ is in
the disk. See Figure 4.6.2. We now rewrite the difference quotient as
$$\frac{f(x_0 + ah, y_0 + bh) - f(x_0 + ah, y_0)}{h} + \frac{f(x_0 + ah, y_0) - f(x_0, y_0)}{h}$$
$$= \frac{\partial f}{\partial y}(x_0 + ah, \xi)\,\frac{bh}{h} + a\,\frac{f(x_0 + ah, y_0) - f(x_0, y_0)}{ah}.$$
In the first term on the right, we used the Mean Value Theorem in $y$
for the function $f(x_0 + ah, y)$ on the interval between $y_0$ and $y_0 + bh$. See Figure
4.6.2. We can use the Mean Value Theorem because of the hypothesis
that $f$ is continuously differentiable in $y$ for each fixed $x$. Since
$\xi$ lies between $y_0$ and $y_0 + bh$ and the partial $\frac{\partial f}{\partial y}$ is continuous in the disk, the
limit of the first term on the right as $h \to 0$ is $b\,\frac{\partial f}{\partial y}(x_0, y_0)$.
Since the limit of the second term is $a\,\frac{\partial f}{\partial x}(x_0, y_0)$, we have
proved the theorem. □
Figure 4.6.2
In each term we used the Mean Value Theorem, so $\xi_1$ is between $y(t)$ and
$y(t + h)$ and $\xi_2$ is between $x(t)$ and $x(t + h)$. Thus, as $h \to 0$, we know
that $\xi_1 \to y(t)$ and $\xi_2 \to x(t)$. Since $x(t)$ and $y(t)$ are differentiable and
the partials of $f$ are continuous, the limit of the right hand side exists
and equals the right hand side of (25). Thus $g$ is differentiable and (25)
holds. Since the composition and product of continuous functions are
continuous, $g'$ is continuous. □
Problems
1. State and prove the analogue of Theorem 3.1.3 for real-valued functions
of two variables.
(b) Show that if E is not closed and bounded, then E is not compact.
$$f(x, y) = \frac{xy}{\sqrt{x^2 + y^2}}$$
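$$x'(t) = \frac{\partial H}{\partial p}(x(t), p(t)), \qquad x(0) = x_0,$$
$$p'(t) = -\frac{\partial H}{\partial x}(x(t), p(t)), \qquad p(0) = p_0.$$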
(b) Suppose that H(x,p) = x2 + p2. What do the orbits look like?
(c) Suppose that H(x,p) = ax + bp. What do the orbits look like?
h(x,y) = / G(t)dt,
Jo
where $G$ is a continuous function on $\mathbb{R}$. Prove that $h$ is continuously
differentiable and compute its partial derivatives.
12. Imagine an infinitely long elastic string lying along the $x$-axis. Suppose
the string is set in motion at time $t = 0$, and let $u(x, t)$ denote the vertical
displacement of the string at $x$ at time $t$. According to a simple model for
small displacements, $u$ should satisfy the wave equation
$$u_{tt} - c^2 u_{xx} = 0, \qquad u(x, 0) = f(x), \qquad u_t(x, 0) = g(x),$$
where / gives the initial displacement and g gives the initial velocity of
the string at x. Assume that / and g are twice continuously differentiable.
Verify that
solves the wave equation and the initial conditions. This solution was
first written down by J. d'Alembert (1717-1783).
Projects
$$f(y)\,\frac{dy}{dt} = g(t).$$
You then integrate the left side with respect to y, obtaining F(y), and
integrate the right side with respect to t, obtaining G(t). Setting F(y) =
G(t) + C, you determine the constant C by the initial condition and then
solve for y in terms of t. How can one integrate one side of an equality
with respect to one variable and the other side with respect to another
variable and expect the results to be equal?
(a) Using the chain rule and the Fundamental Theorem of Calculus,
explain carefully what is really going on here.
$$\frac{dy}{dt} = \frac{y}{y + 1}, \qquad y(0) = 1,$$
satisfies
(b) Try every trick in the book to find an expression for y in terms of t.
(c) Use the Intermediate Value Theorem to show that for each t there is
a number y(t) that satisfies (26).
(d) Prove that for each t, y(t) is unique.
(e) For a number of different t between 0 and 4, use Newton's method
to determine y(t) and draw a sketch of the graph of y(t).
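Part (e) can be approached along the following lines. The sketch assumes that relation (26), which is not legible in this copy, is the implicit equation $y + \ln y = t + 1$ obtained by separating variables in $dy/dt = y/(y+1)$ with $y(0) = 1$. The function name `y_of_t`, the tolerance, and the starting guess are illustrative choices.

```python
import math


def y_of_t(t, tol=1e-12, max_steps=50):
    """Solve y + ln(y) = t + 1 for y by Newton's method (assumed form of (26)).

    Newton's method is applied to F(y) = y + ln(y) - t - 1, with F'(y) = 1 + 1/y.
    For t >= 0 the starting guess y = 1 lies at or below the root, and because F
    is increasing and concave the iterates increase toward the root.
    """
    y = 1.0
    for _ in range(max_steps):
        residual = y + math.log(y) - t - 1.0
        if abs(residual) < tol:
            break
        y -= residual / (1.0 + 1.0 / y)
    return y


# Tabulate y(t) at a few times between 0 and 4; these points can be plotted by hand.
for k in range(9):
    t = 0.5 * k
    print(t, round(y_of_t(t), 6))
```

The table gives the data for a sketch of the graph of $y(t)$; each value is obtained without ever writing $y$ explicitly in terms of $t$.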
Lp{f) = - xi-i)(yj ~
ij
n
Hint: write
d pb
o
f(x, y) dx dy -
pd / pb
Jc
/ f(x,y)dx
\J a
dy
_^ ^ pb
+ MiAXi ~ xi-l)(Vj ~ Vj-l) ~ ^2(Vj ~ Vj-l) / /(*, Vj) dx
ij j da
and justify carefully why each term can be made less than e/3 by choosing
the partition appropriately.
CHAPTER 5
Sequences of Functions
Sequences of real numbers have played a central role in this text so far.
For example, sequences are used in the definitions of continuous and differentiable
functions. The Riemann integral is the limit of a sequence of
Riemann sums. The invariant probabilities of a two-state Markov process
were found by taking the limits of sequences. Newton's method
shows how to find the roots of a polynomial by constructing sequences
of approximations. On the other hand, many of the most important objects
which one studies in analysis are functions. The solution of a differential
or integral equation is a function. The solution of a partial differential
equation is a function of several variables. Often one determines
these solutions by constructing a sequence of functions, $f_n(x)$, that gets
closer and closer to the solution $f(x)$ as $n \to \infty$. This is the technique we
shall use when we study integral equations (Section 5.4) and ordinary
differential equations (Section 7.1). But what do we mean by “closer and
closer”? What does it mean for a sequence of functions to get closer and
closer to a limiting function? As we shall see, there are many different
notions of convergence for sequences of functions. In this section we introduce
two of the simplest, pointwise and uniform convergence. Other
kinds of convergence are discussed in Section 5.3 and later sections.
That is, given $\varepsilon > 0$ and $x \in E$, there exists an $N$ so that $n \geq N$ implies
$$x^2 - \frac{1}{2n}x + 1 \to x^2 + 1$$
as $n \to \infty$. That is, $f_n(x) \to f(x)$ for each $x$ as $n \to \infty$, so by definition,
the sequence of functions $f_n$ converges pointwise to the function $f$.
Figure 5.1.1
5.1 Pointwise and Uniform Convergence 165
Figure 5.1.2
Figure 5.1.3
For any fixed $x > 0$, the sequence of numbers $\{f_n(x)\}$ converges to zero
as $n \to \infty$. In fact, if we choose $N$ so that $2^{-N} < x$, then $f_n(x) = 0$ for all
$n \geq N$. If $x = 0$, then $f_n(0) = 0$ for all $n$, so $f_n(0) \to 0$. Thus the sequence
of functions $f_n$ converges pointwise to the zero function, $f(x) = 0$, on the
interval $[0, 1]$. Notice, however, that for each $n$, $\int_0^1 f_n(x)\,dx = 2^{-1}$ since
the area under each graph is just one half the base ($2^{-n}$) of the triangle
times the height ($2^n$). Therefore,
is not enough to guarantee that the limit of the integrals of the $f_n$ is the
integral of the limiting function $f$. We therefore want to define a stronger
notion of convergence that gives us some control over the properties of
the limiting function.
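The failure described in this example can be reproduced numerically. The sketch below assumes a particular triangle: $f_n$ is supported on $[0, 2^{-n}]$, has height $2^n$, and peaks at the midpoint $2^{-n-1}$. The text fixes only the base and the height, so the location of the peak, like the names `tent` and `integral_01`, is an assumption.

```python
def tent(n, x):
    """Triangle of base 2**-n and height 2**n on [0, 2**-n], zero elsewhere."""
    base = 2.0 ** (-n)
    height = 2.0 ** n
    if x <= 0.0 or x >= base:
        return 0.0
    half = base / 2.0
    if x <= half:
        return height * x / half
    return height * (base - x) / half


def integral_01(n, cells=1 << 16):
    """Midpoint-rule approximation of the integral of tent(n, .) over [0, 1]."""
    h = 1.0 / cells
    return sum(tent(n, (i + 0.5) * h) for i in range(cells)) * h


x = 0.01
# Pointwise convergence: once 2**-n < x the value at x is exactly zero.
print([tent(n, x) for n in (3, 6, 7, 10)])
# The integrals do not tend to zero: each one equals one half.
print([integral_01(n) for n in (1, 5, 10)])
```

The values at a fixed point are eventually zero while every integral stays at $1/2$, so the limit of the integrals is not the integral of the limit.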
that this is just the condition we need. The name “uniform convergence”
comes from the fact that, given $\varepsilon > 0$, we can choose an $N$ so that the
graphs of all the $f_n$, for $n \geq N$, lie in an $\varepsilon$-band about the graph of the
limiting function $f$; that is, they are uniformly close to $f$. Note that, by
definition, uniform convergence implies pointwise convergence. Let's
examine the sequences of functions in the above examples to see whether
they converge uniformly.
$$|f_n(x) - f(x)| = \frac{|x|}{2n}.$$
In order to make the right-hand side $\leq \varepsilon$, we will have to choose $n$ large,
but how large depends on $x$. The bigger $|x|$ is, the larger $n$ will have to be
chosen so that $|f_n(x) - f(x)| \leq \varepsilon$. In particular, given $\varepsilon > 0$ and any $n$,
there is an $x$ so that $|f_n(x) - f(x)| > \varepsilon$. Thus $f_n$ does not converge uniformly
to $f$ on $\mathbb{R}$. However, the convergence is uniform on each finite interval
$[a, b]$. To see this, let $\varepsilon > 0$ be given. Choose $N > \max\{|a|, |b|\}/2\varepsilon$.
Then, for all $x \in [a, b]$, $n \geq N$ implies
$$|f_n(x) - f(x)| \leq \frac{\max\{|a|, |b|\}}{2n} \leq \frac{\max\{|a|, |b|\}}{2N} < \varepsilon.$$
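The difference between the two kinds of convergence in this example shows up in a short computation. The sketch assumes the sequence is $f_n(x) = x^2 - \frac{1}{2n}x + 1$ with limit $f(x) = x^2 + 1$, as reconstructed from the estimates above; `sup_gap` is an illustrative helper that samples the gap on a grid, so it only approximates the true supremum from below.

```python
def f_n(n, x):
    # Assumed form of the sequence in this example.
    return x * x - x / (2.0 * n) + 1.0


def f(x):
    return x * x + 1.0


def sup_gap(n, a, b, samples=1001):
    """Largest sampled value of |f_n(x) - f(x)| on [a, b] (a lower bound for the sup)."""
    step = (b - a) / (samples - 1)
    return max(abs(f_n(n, a + i * step) - f(a + i * step)) for i in range(samples))


# On a finite interval the largest gap is max(|a|, |b|) / (2n), which tends to zero.
print([sup_gap(n, -3.0, 2.0) for n in (1, 10, 100, 1000)])
# On the whole line no n works: at x = 4n the gap is always 2.
print([abs(f_n(n, 4.0 * n) - f(4.0 * n)) for n in (1, 10, 100, 1000)])
```

On $[-3, 2]$ the sampled gap shrinks like $3/(2n)$, while on $\mathbb{R}$ one can always move far enough out to find a gap of fixed size.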
We will often use the notation $f_n \to f$ to mean that the sequence of
functions converges to the limiting function $f$. The arrow itself doesn't
indicate whether the convergence is pointwise or uniform, so we will
always say $f_n \to f$ pointwise or $f_n \to f$ uniformly. In Section 5.3 we
discuss other notions of convergence.
Problems
4. Suppose that $g$ is a continuous function on $[a, b]$ such that $g(x) > 0$ for
each $x \in [a, b]$. Prove that if $f_n \to f$ uniformly, then $\frac{f_n}{g} \to \frac{f}{g}$ uniformly.
6. Prove that $\sin(x + \frac{1}{n}) \to \sin x$ uniformly on $\mathbb{R}$. Hint: use the Mean Value
Theorem.
7. Let fn(x) = 1+nz 2 • Prove that fn —> 0 pointwise but not uniformly on
[0,1]-
8. Let fn(x) = Prove that fn —>• 0 pointwise but not uniformly on
[0, oo).
11. Let $g(x) = e^{-x^2}$ and define $f_n(x) = g(x - n)$. Prove that $f_n \to 0$ pointwise
but not uniformly on $\mathbb{R}$.
12. Let $\{f_n\}$ be a sequence of continuous functions such that $f_n \to f$ uniformly
on $\mathbb{R}$. Suppose that $x_n \to x_0$. Prove that $\lim_{n \to \infty} f_n(x_n) = f(x_0)$.
14. (a) Prove that $(1 + \frac{x}{n})^n \to e^x$ for all $x$. Hint: see problem 11 of
Section 4.3.
(b) Prove that $(1 + \frac{x}{n})^n \to e^x$ uniformly on any finite interval $[a, b]$.
(c) Prove that the convergence is not uniform on $\mathbb{R}$.
for all $x$ such that $|x - x_0| \leq \delta$. By Theorem 3.1.3, this proves that $f$ is
continuous at $x_0$. Since $x_0$ was arbitrary, $f$ is continuous on $E$. □
Example 3 of Section 5.1 shows that the limiting function may not be
continuous if the convergence is not uniform.
Thus,
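$$\begin{aligned}
\left|\int_a^b f_n(x)\,dx - \int_a^b f(x)\,dx\right| &= \left|\int_a^b (f_n(x) - f(x))\,dx\right| \\
&\leq \int_a^b |f_n(x) - f(x)|\,dx \\
&\leq \frac{\varepsilon}{b - a} \int_a^b dx \\
&= \varepsilon
\end{aligned}$$
if $n \geq N$. Thus $\int_a^b f_n(x)\,dx \to \int_a^b f(x)\,dx$. In the next to last step we used
the estimate in Corollary 3.3.6. □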
5.2 Limit Theorems 171
Thus, $\lim_{n \to \infty} \int_0^\infty f_n(x)\,dx = 1$, while the integral of the limiting function
is 0.
$$\lim_{n \to \infty} \int_{x_0}^{x} f_n'(t)\,dt = \int_{x_0}^{x} g(t)\,dt.$$
Finally, we prove that one can differentiate under the integral sign
with respect to a parameter. This theorem plays an important role in our
derivation of the Euler equation in Section 5.5.
$$F(y) = \int_a^b f(x, y)\,dx.$$
F'(v) (8)
$$\lim_{n \to \infty} \frac{F(y + h_n) - F(y)}{h_n} = \lim_{n \to \infty} \int_a^b \frac{f(x, y + h_n) - f(x, y)}{h_n}\,dx \qquad (9)$$
exists and equals the right hand side of (8). Since $f$ is continuously differentiable
in the second variable,
as $n \to \infty$ for each fixed $x$. What we need to know is that we can bring
the limit on the right-hand side of (9) inside the integral. This is exactly
the question treated in Theorem 5.2.2. According to the hypothesis of
that theorem, we need to know that the convergence in (10) is uniform
in $x$ in order to interchange the limit and the integral. By the Mean Value
Theorem, for each fixed $x$,
Problems
1. Let $f_n(x) = e^{-nx}$ on the interval $[0, 1]$. Explain why the sequence of functions
$\{f_n\}$ converges pointwise on $[0, 1]$. What is the limiting function? Is
it continuous? Is the convergence uniform?
8. Show by example that the hypothesis of Theorem 5.2.3, that $\{f_n(x)\}$ converges
for at least one $x$, is needed to obtain the conclusion.
10. Complete the proof of Theorem 5.2.4 by showing that $F'$ is continuous.
11. Let $f$ be a continuous function on $[0, \infty)$ which equals zero outside the
interval $[a, b]$. For each $\lambda > 0$, define
$$F(\lambda) = \int_0^\infty e^{-\lambda x} f(x)\,dx.$$
By using Theorem 5.2.4, prove that $F$ is infinitely often continuously differentiable
on $(0, \infty)$. Remark: $F$ is called the Laplace transform of $f$.
12. Let $f_n$ be a sequence of continuous functions defined on $\mathbb{R}$, and suppose
that $f_n \to f$ uniformly on every finite interval $[a, b]$. Suppose that there
is a nonnegative continuous function $g$ on $\mathbb{R}$ such that $|f_n(x)| \leq g(x)$
for all $n$ and all $x \in \mathbb{R}$ and suppose that the improper Riemann integral
$\int_{-\infty}^{\infty} g(x)\,dx$ exists.
The number $\|f\|_\infty$ is called the supremum norm or the sup norm of the
function $f$.
Thus, $\|f\|_\infty$ is just the supremum of the values of $|f(x)|$ on the set $E$.
Usually the set $E$ will be an interval, finite or infinite. Notice that $\|f\|_\infty$
depends on the set $E$, but we will not usually indicate that in the symbol
$\|f\|_\infty$. If $f$ is a continuous function on a finite interval $[a, b]$, then $|f|$ is
also continuous on $[a, b]$, so $\|f\|_\infty$ is just the maximum of $|f(x)|$ on $[a, b]$.
(a) $\|f\|_\infty \geq 0$ and $\|f\|_\infty = 0$ if and only if $f$ is the zero function on $E$.
Proof. The proofs follow quite easily from the properties of sup proven
in Section 2.5. Since $|f(x)| \geq 0$ for each $x \in E$, the supremum over all
such $|f(x)|$ must be nonnegative. Further, it is clear that $\|f\|_\infty = 0$ if and
only if $|f(x)| = 0$ for all $x \in E$, in which case $f$ is the zero function on $E$
by definition. This proves (a).
To prove (b), note that
$$\begin{aligned}
\|af\|_\infty &= \sup\{|af(x)| \mid x \in E\} \\
&= \sup\{|a|\,|f(x)| \mid x \in E\} \\
&= |a| \sup\{|f(x)| \mid x \in E\} \\
&= |a|\,\|f\|_\infty.
\end{aligned}$$
Thus, $\|f\|_\infty + \|g\|_\infty$ is an upper bound for $\{|f(x) + g(x)| \mid x \in E\}$, and
therefore,
$$\|f + g\|_\infty \leq \|f\|_\infty + \|g\|_\infty.$$
Note that the properties of $\|f\|_\infty$ that we have just proven are analogous
to the properties of absolute value in measuring the size of real
5.3 The Supremum Norm 177
numbers: (a) $|x| \geq 0$ and $|x| = 0$ if and only if $x = 0$; (b) $|ax| = |a|\,|x|$
for all $x$; (c) for all $x$ and $y$, $|x + y| \leq |x| + |y|$. The sup norm is not the
only way one can measure the size of functions. Other notions of size are
discussed below.
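The three properties of the sup norm can be observed in a small experiment. The sketch below approximates $\|f\|_\infty$ on $E = [0, \pi]$ by sampling on a grid, which gives a lower bound for the true supremum rather than the supremum itself; the function `sup_norm` and the choice of $\sin$ and $\cos$ are illustrative assumptions.

```python
import math


def sup_norm(g, a, b, samples=2001):
    """Approximate sup{|g(x)| : a <= x <= b} by the maximum over a uniform grid."""
    step = (b - a) / (samples - 1)
    return max(abs(g(a + i * step)) for i in range(samples))


a, b = 0.0, math.pi
norm_sin = sup_norm(math.sin, a, b)
norm_cos = sup_norm(math.cos, a, b)
norm_scaled = sup_norm(lambda x: -3.0 * math.sin(x), a, b)
norm_sum = sup_norm(lambda x: math.sin(x) + math.cos(x), a, b)

# Property (b): the norm of -3 sin equals 3 times the norm of sin.
print(norm_scaled, 3.0 * norm_sin)
# Property (c): the norm of sin + cos (about sqrt(2)) is at most norm_sin + norm_cos = 2.
print(norm_sum, norm_sin + norm_cos)
```

The triangle inequality is strict here because $\sin$ and $\cos$ do not attain their largest absolute values at the same point.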
Using the sup norm, we can define convergence.
$$\lim_{n \to \infty} \|f_n - f\|_\infty = 0.$$
Therefore,
If $f_n$ converges to $f$ in the sup norm, then it is not hard to see that the sequence
$f_n$ is a Cauchy sequence (problem 1). The more difficult question
is this. Let $S$ be a set of functions. Does a Cauchy sequence of functions
in $S$ necessarily have a limit in $S$? If $S$ is the set of continuous functions
on a set $E$, the answer is yes.
$$\lim_{n \to \infty} \|f_n - f\|_\infty = 0.$$
Proof. Since $\{f_n\}$ is a Cauchy sequence in the sup norm, given $\varepsilon > 0$,
we can choose $N$ so that (13) holds. For each $x \in \mathbb{R}$, $|f_n(x) - f_m(x)| \leq \|f_n - f_m\|_\infty$, so
Thus, for each $x$, the sequence of real numbers $\{f_n(x)\}$ is a Cauchy sequence
and therefore converges because the real numbers are complete.
Define $f(x) = \lim_{n \to \infty} f_n(x)$. Since the absolute value is a continuous
function,
$$|f(x) - f_m(x)| = \lim_{n \to \infty} |f_n(x) - f_m(x)|,$$
$$\|f\|_1 = \int_a^b |f(x)|\,dx,$$
in which case we are taking the size of $f$ to be the area under the graph
of $|f(x)|$. More generally, we define for $1 \leq p < \infty$,
$$\|f\|_p = \left(\int_a^b |f(x)|^p\,dx\right)^{1/p},$$
Figure 5.3.1
Problems
(b) Suppose that $f_n \to f$ in the sup norm. Prove that $\|f_n\|_\infty \to \|f\|_\infty$.
and define
1 0 < x < \
0 \ < x < 1.
(a) Prove that $f_n \to f$ pointwise on $[0, 1]$. Hint: draw the graph of $f_n$.
(b) Prove that $\|f - f_n\|_\infty = 1$ for each $n$ so that $f_n$ does not converge to
$f$ in the sup norm.
(c) Explain how you could have predicted the result of part (b) simply
by using Theorem 5.2.1.
(d) Prove that $\|f - f_n\|_1 \to 0$ as $n \to \infty$.
(a) Show that $\|f\|$ has the properties (a), (b), and (c) of Proposition 5.3.1.
(b) Prove that $C^1[a, b]$ is complete in the norm $\|f\|$. Hint: follow the
proof of Theorem 5.3.3 and use Theorems 5.2.1 and 5.2.3.
8. Let $C([a, b]; \mathbb{R}^2)$ denote the set of pairs, $(f(x), g(x))$, of continuous functions
on $[a, b]$. Define a norm on $C([a, b]; \mathbb{R}^2)$ by
(a) Prove that this norm satisfies the properties (a), (b), and (c) of Proposition
5.3.1.
(b) Prove that $C([a, b]; \mathbb{R}^2)$ is complete.
9. Let $Q$ be a closed, finite rectangle in the plane, and let $C(Q)$ denote the
set of real-valued continuous functions on $Q$. For $f$ in $C(Q)$, define
(a) Prove that this norm satisfies the properties (a), (b), and (c) of Proposition
5.3.1.
(b) Prove that $C(Q)$ is complete.
10. Let $C_0(\mathbb{R})$ denote the set of continuous functions on $\mathbb{R}$ such that
$\lim_{x \to \infty} f(x) = 0$ and $\lim_{x \to -\infty} f(x) = 0$. Prove that $C_0(\mathbb{R})$ is complete in
the $\|\cdot\|_\infty$ norm. Hint: since $C_0(\mathbb{R}) \subset C_b(\mathbb{R})$ the limit of a Cauchy sequence
certainly exists and, by problem 3, is in $C_b(\mathbb{R})$.
5.4 Integral Equations 183
Proof. Let $\psi_0(x)$ be any continuous function on $[a, b]$. Define functions
$\psi_1(x), \psi_2(x), \ldots$, recursively by the formula
We begin by showing that the functions $\psi_n$ are continuous. Suppose that
$\psi_n$ is continuous for some $n$. For each fixed $x$, $K(x, y)$ is a continuous
function of $y$. By Theorem 3.1.1(c), $K(x, y)\psi_n(y)$ is a continuous function
|^n+l(*l)-V'n+l(®2)|
Let $\varepsilon > 0$ be given. Since $f$ is continuous, we can choose a $\delta_1$ such that
$|f(x_1) - f(x_2)| \leq \frac{\varepsilon}{2}$ if $|x_1 - x_2| \leq \delta_1$. To handle the second term in
(18), note that, because $\psi_n$ is continuous, there is a constant $C_n$ so that
$|\psi_n(y)| \leq C_n$ for all $y \in [a, b]$. Since $K$ is uniformly continuous on the
square, we can choose a $\delta_2$ so that $|x_1 - x_2| \leq \delta_2$ implies
$$|K(x_1, y) - K(x_2, y)| \leq \frac{\varepsilon/2}{C_n (b - a)} \quad \text{for all } y \in [a, b]. \qquad (19)$$
if $|x_1 - x_2| \leq \min\{\delta_1, \delta_2\}$, which shows that $\psi_{n+1}$ is continuous. By assumption,
$\psi_0(x)$ is continuous. So, by induction, all the $\psi_n$ are continuous
functions on $[a, b]$.
Next, we will show that $\{\psi_n\}$ is a Cauchy sequence in the sup norm
if $\lambda$ is sufficiently small. Let $M$ be an upper bound for $K$ on the square.
Then, for $n \geq 1$,
$$\alpha = |\lambda|\, M (b - a)$$
for all $x \in [a, b]$. Thus, taking the sup of the left-hand side, we find
Suppose that $n \geq N$ and $m \geq N$, and let $m > n$ be the larger of the two
integers. Then, by the triangle inequality (Proposition 5.3.1(c)),
$$\sum_{j=n}^{m-1} \alpha^j = \alpha^n \sum_{j=0}^{m-n-1} \alpha^j \leq \frac{\alpha^n}{1 - \alpha} \leq \frac{\alpha^N}{1 - \alpha}$$
since $n \geq N$ and $0 < \alpha < 1$. Because $\|\psi_1 - \psi_0\|_\infty$ is a fixed number and
$\alpha < 1$, it is clear that we can choose $N$ large enough so that
$$\|\psi_m - \psi_n\|_\infty \leq \varepsilon$$
if $n \geq N$ and $m \geq N$. This proves that the sequence $\{\psi_n\}$ is a Cauchy
sequence in the sup norm on $[a, b]$. By Theorem 5.3.3, there exists a continuous
function $\psi$ on $[a, b]$ such that $\|\psi_n - \psi\|_\infty \to 0$. By Proposition
5.3.2, $\psi_n \to \psi$ uniformly.
We are not done since we must still show that $\psi$ satisfies (15). Consider
equation (16). For each $x$, the left-hand side converges to $\psi(x)$. On
the other hand, for each fixed $x$, $K(x, y)$ is a continuous function of $y$;
thus, since $\psi_n \to \psi$ uniformly, we conclude (problem 4 of Section 5.2)
that
$$\int_a^b K(x, y)\,\psi_{n-1}(y)\,dy \to \int_a^b K(x, y)\,\psi(y)\,dy$$
as $n \to \infty$. Therefore, $\psi(x)$ satisfies (15). Suppose that $\phi(x)$ is another
continuous function that satisfies (15). If we subtract the $\phi(x)$ equation
from the $\psi(x)$ equation and estimate as above, we find
Since $\alpha < 1$, this equation can hold only if $\|\psi - \phi\|_\infty = 0$, which implies
that $\phi(x) = \psi(x)$ for all $x$. Thus, the solution $\psi(x)$ is unique. □
Notice that the proof not only guarantees the existence of a $\psi(x)$ satisfying
(15) but also shows that, if we start with any $\psi_0(x)$ and iterate the
relation (16), we can approximate $\psi(x)$ by $\psi_n(x)$ after $n$ steps, and the
estimates show how good our approximation will be.
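The iteration in the proof is also a practical numerical method. The sketch below assumes that (15) has the form $\psi(x) = f(x) + \lambda \int_a^b K(x, y)\psi(y)\,dy$ and that (16) is the corresponding recursion for $\psi_n$, since the displayed equations are not legible here. The integral is replaced by the trapezoid rule, and the name `solve_fredholm`, the grid size, and the test problem are illustrative choices.

```python
def solve_fredholm(f, K, lam, a, b, nodes=101, iterations=60):
    """Successive approximations for psi(x) = f(x) + lam * integral_a^b K(x, y) psi(y) dy.

    The integral is approximated by the trapezoid rule on a uniform grid, and the
    iteration starts from psi_0 = 0, as the proof permits any continuous start.
    """
    h = (b - a) / (nodes - 1)
    xs = [a + i * h for i in range(nodes)]
    weights = [h] * nodes
    weights[0] = weights[-1] = h / 2.0
    fx = [f(x) for x in xs]
    kernel = [[K(x, y) for y in xs] for x in xs]
    psi = [0.0] * nodes
    for _ in range(iterations):
        psi = [
            fx[i] + lam * sum(weights[j] * kernel[i][j] * psi[j] for j in range(nodes))
            for i in range(nodes)
        ]
    return xs, psi


# Test problem: f(x) = x, K(x, y) = x * y, lam = 1/2 on [0, 1].
# Here alpha = |lam| * M * (b - a) = 1/2 < 1, and the exact solution is psi(x) = 1.2 * x.
xs, psi = solve_fredholm(lambda x: x, lambda x, y: x * y, 0.5, 0.0, 1.0)
print(max(abs(p - 1.2 * x) for x, p in zip(xs, psi)))
```

For this kernel the exact solution can be found by hand (try $\psi(x) = cx$), which makes it a convenient check on the iteration; the remaining error comes from the quadrature rule, not from the iteration.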
Problems
1. Let $S_n = \sum_{j=0}^{n} \alpha^j$
1
b(z) sin a? + e x y dy.
4
$$h(x) = \lambda \int_a^b K(x, y)\,h(y)\,dy.$$
Show that if $\psi(x)$ is a solution of (15), then $\psi(x) + h(x)$ is also a solution.
Let $a = 0$, $b = \pi$, and $K(x, y) = \sin x \sin y$. Show that there is a choice
of $\lambda$ so that such an $h(x)$ exists. Why doesn't this nonuniqueness violate
Theorem 5.4.1?
9. Prove that there is at most one continuous function on the interval [0,2]
that satisfies
0(x) = f(x)
s-2
10. (a) Show that there is a unique, bounded, continuous function $\psi$ on the
interval $[0, \infty)$ that solves
poo
-2x
0(x) + / e~2x~2y sin (x — y)0(y) dy.
Jo
Hint: follow the proof of Theorem 5.4.1.
(b) Prove that ||0||oo < 2. Hint: iterate ||0||oo < 1 + ||
11. Let $K$ be a continuous function on the square $S = [0, 1] \times [0, 1]$ and let $f$
be a continuous function on $[0, 1]$. We want to solve the Volterra integral
equation:
$$\psi(x) = f(x) + \int_0^x K(x, y)\,\psi(y)\,dy. \qquad (21)$$
Let $\psi_0(x) = 1$ and define $\psi_n(x)$ recursively by
$$\psi_{n+1}(x) = f(x) + \int_0^x K(x, y)\,\psi_n(y)\,dy.$$
MnCrn
\ipn+1{x) - ipn{x)\ < —- for all X e [0,1].
(d) Prove that tpn converges uniformly to a solution of (21). Hint: the
series ^ converges.
Example 1 Let $(x_1, y_1)$ and $(x_2, y_2)$ be two points in the plane. What
is the curve between them that has the shortest length? We all know
that the answer is the straight line that connects the points, and that certainly
seems reasonable geometrically. But can we prove it analytically?
To keep the analysis simple, we will only consider curves between the
points that are the graphs of functions. Let $y(x)$ be such a function, that
5.5 The Calculus of Variations 189
is, $y(x_1) = y_1$ and $y(x_2) = y_2$, and suppose that $y(x)$ is continuously differentiable.
See Figure 5.5.1(a). Then the arc length of the graph of $y(x)$
between the points $(x_1, y_1)$ and $(x_2, y_2)$ is
$$J(y') = \int_{x_1}^{x_2} \sqrt{1 + (y'(x))^2}\,dx.$$
Figure 5.5.1
$$v(x) = \frac{ds}{dt} = \frac{ds}{dx}\,\frac{dx}{dt}$$
or
$$\frac{dx}{dt} = \frac{v(x)}{ds/dx}.$$
By Theorem 4.5.2,
$$\frac{dt}{dx} = \frac{ds/dx}{v(x)}.$$
From project 2(a) in Chapter 3 we know that the rate of change of arc
length with respect to $x$ is given by $\frac{ds}{dx} = \sqrt{1 + (y'(x))^2}$. To figure out how
$v(x)$ and $y(x)$ are related, we use the fact that the total energy (kinetic
plus potential),
$$E = \tfrac{1}{2} m\,v(x)^2 + m g\,y(x),$$
$$= \int_0^{x_1} \frac{dt}{dx}\,dx = \int_0^{x_1} \frac{\sqrt{1 + (y'(x))^2}}{\sqrt{-2 g\,y(x)}}\,dx.$$
J(y,y') =
r x\
/ a/i + (y'{x))2 .
Jo x
rx2
J{x,y(x),y (x)) = / f(x,y(x),y'(x))dx (22)
J Xl
(a?2,2/2)
(*i>yi)
Figure 5.5.2
Here is the idea. Suppose that $y(x)$ minimizes $J$, and let $\eta(x)$ be a
continuously differentiable function such that $\eta(x_1) = 0 = \eta(x_2)$. For
every $\varepsilon$, the graph of $y(x) + \varepsilon\eta(x)$ passes through the points $(x_1, y_1)$ and
$(x_2, y_2)$. See Figure 5.5.2. Note that $\varepsilon$ is allowed to take negative values.
Define a real-valued function $I$ on $\mathbb{R}$ by
If $y(x)$ minimizes $J$, then the function $I(\varepsilon)$ should reach its minimum
at $\varepsilon = 0$, and this should be true for every choice of $\eta(x)$ that satisfies
$\eta(x_1) = 0 = \eta(x_2)$. If $I(\varepsilon)$ is continuously differentiable, then, according
to Theorem 4.2.1,
$$I'(0) = 0, \qquad (23)$$
and this should be true for all choices of $\eta(x)$. If $y$ is a twice continuously
differentiable function such that (23) holds for all such $\eta(x)$, then
$y$ is called an extremal for the functional $J$. We want to determine what
condition this puts on the function $y(x)$.
(24)
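$$I'(\varepsilon) = \int_{x_1}^{x_2} f_y(x, y(x) + \varepsilon\eta(x), y'(x) + \varepsilon\eta'(x))\,\eta(x)\,dx + \int_{x_1}^{x_2} f_{y'}(x, y(x) + \varepsilon\eta(x), y'(x) + \varepsilon\eta'(x))\,\eta'(x)\,dx.$$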
Thus,
$$I'(0) = \int_{x_1}^{x_2} \big[ f_y(x, y(x), y'(x))\,\eta(x) + f_{y'}(x, y(x), y'(x))\,\eta'(x) \big]\,dx \qquad (25)$$
$$= \int_{x_1}^{x_2} \Big\{ f_y(x, y(x), y'(x)) - \frac{d}{dx} f_{y'}(x, y(x), y'(x)) \Big\}\,\eta(x)\,dx, \qquad (26)$$
where we integrated by parts in the second term and used the assumption
that $\eta$ vanishes at the endpoints. Therefore, if $y(x)$ is an extremal,
$$f_y(x, y(x), y'(x)) - \frac{d}{dx} f_{y'}(x, y(x), y'(x)) = -\frac{d}{dx}\,\frac{y'}{(1 + (y')^2)^{1/2}} = -\frac{y''}{(1 + (y')^2)^{3/2}}.$$
Thus, if $y(x)$ is an extremal, then $y''(x) = 0$ for all $x \in [x_1, x_2]$. Therefore,
$y(x)$ is a straight line, and since it must pass through $(x_1, y_1)$ and $(x_2, y_2)$
the function $y(x)$ is determined. So far so good. The only extremal is
the straight line between the points. But how do we know whether the
$$I''(\varepsilon) = \int_{x_1}^{x_2} \frac{(\eta'(x))^2}{(1 + (y'(x) + \varepsilon\eta'(x))^2)^{3/2}}\,dx,$$
and so $I''(\varepsilon)$ is strictly positive for all $\varepsilon$ if $\eta(x)$ is not the zero function
on $[x_1, x_2]$. It follows from the Fundamental Theorem of Calculus that
$I'(\varepsilon) > 0$ if $\varepsilon > 0$ and $I'(\varepsilon) < 0$ if $\varepsilon < 0$. From this, it follows that
$I(\varepsilon) > I(0)$ if $\varepsilon \neq 0$. Since every function whose graph goes through
$(x_1, y_1)$ and $(x_2, y_2)$ can be written in the form $y(x) + \eta(x)$ for an $\eta(x)$ that
vanishes at $x_1$ and $x_2$, we conclude that the straight line is the absolute
global minimum of the length functional $J$ over the whole class of twice
differentiable functions whose graphs go through $(x_1, y_1)$ and $(x_2, y_2)$.
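The conclusion that $I(\varepsilon) > I(0)$ for $\varepsilon \neq 0$ can be checked numerically for one specific perturbation. The sketch below takes the endpoints $(0, 0)$ and $(1, 1)$, so that the extremal is $y(x) = x$, and uses $\eta(x) = \sin(\pi x)$, which vanishes at both endpoints. These choices, the midpoint rule, and the name `length` are assumptions made for illustration.

```python
import math


def length(eps, cells=2000):
    """Arc length of y(x) + eps * eta(x) on [0, 1], with y(x) = x and eta(x) = sin(pi x).

    The integrand is sqrt(1 + (y'(x) + eps * eta'(x))**2), where y'(x) = 1 and
    eta'(x) = pi * cos(pi x); the integral is approximated by the midpoint rule.
    """
    h = 1.0 / cells
    total = 0.0
    for i in range(cells):
        x = (i + 0.5) * h
        slope = 1.0 + eps * math.pi * math.cos(math.pi * x)
        total += math.sqrt(1.0 + slope * slope) * h
    return total


# I(0) is the length sqrt(2) of the straight segment; every other eps gives more.
for eps in (-0.5, -0.1, 0.0, 0.1, 0.5):
    print(eps, length(eps))
```

The printed lengths are smallest at $\varepsilon = 0$ and grow in both directions, in line with the convexity argument based on $I''(\varepsilon) > 0$.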
= 0.
Thus
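$$f_{y'}\,y' - f = C. \qquad (27)$$
$$f(y, y') = \frac{\sqrt{1 + (y'(x))^2}}{\sqrt{-2 g\,y(x)}}.$$
So, using (27) and carrying out the differentiations, we compute that
$$\frac{(y'(x))^2}{\sqrt{1 + (y'(x))^2}\,\sqrt{-2 g\,y(x)}} - \frac{\sqrt{1 + (y'(x))^2}}{\sqrt{-2 g\,y(x)}} = C.$$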
Rearranging and squaring both sides, we find that any extremal $y(x)$
must satisfy the differential equation
The method for solving this differential equation, which requires a special
change of variables, is outlined in project 4.
Problems
$$\int_{x_1}^{x_2} H(x)\,\eta(x)\,dx = 0 \qquad (28)$$
for all twice continuously differentiable functions $\eta(x)$ that vanish at the
endpoints. Prove that $H(x) = 0$ in the interval as follows:
(a) Let $[a, b]$ be any finite interval. Show how to construct a twice continuously
differentiable function on $\mathbb{R}$ which is strictly positive on
the open interval $(a, b)$ and identically zero everywhere else. Hint:
use pieces of polynomials.
(b) If $x_0$ is a point of $[x_1, x_2]$ such that $H(x_0) \neq 0$, show how to choose
$\eta(x)$ so that hypothesis (28) is violated.
3. Find a curve passing through (1,2) and (2,4) that is an extremal for the
functional
4. Find a curve passing through (1,1) and (2,2) that is an extremal for the
functional
Ji x°
5. Find a curve passing through (0,0) and (1,1) that is an extremal for the
functional
6. Find a curve passing through (0,0) and (1,1) that is an extremal for the
functional
7. Find a curve passing through (0, 0) and (f, 1) that is an extremal for the
functional
(a) Show that the surface area generated when the curve is revolved
around the x-axis is given by (project 2(d) of Chapter 3)
y&) _ „
V1 + (y '(x))2
(c) Show that the functions y(x) = C2 cosh satisfy the differen¬
tial equation for all choices of C\ and C2.
(d) Show that C\ and C2 can be chosen so that the graph of y passes
through (xi,r/i) and (x2,y2).
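(a) for all $x$ and $y$ in $\mathcal{M}$, $\rho(x, y) \ge 0$ and $\rho(x, y) = 0$ if and only if $x = y$.
(b) for all $x$ and $y$ in $\mathcal{M}$, $\rho(x, y) = \rho(y, x)$.
(c) for any three points $x$, $y$, and $z$ in $\mathcal{M}$, $\rho(x, z) \le \rho(x, y) + \rho(y, z)$.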
The three conditions are very intuitive. The first statement says that
the “distance” between distinct points is always positive. The second
says that the distance from x to y is always the same as the distance from
y to x. The third says that the distance from x to z is less than or equal to
5.6 Metric Spaces 197
we see that property (c) holds too. Thus $\rho$ is a metric on $\mathcal{M}$. Note that the
crucial step in proving (c) was the triangle inequality.
so property (c) holds too. Thus $\rho_\infty$ is a metric on $C[a, b]$. The crucial step
in proving (c) was the triangle inequality (Proposition 5.3.1(c)).
Example 3 Let $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ be points in
$\mathbb{R}^n$. We define the Euclidean metric on $\mathbb{R}^n$ by
property (c) follows easily from the triangle inequality proved in prob¬
lem 10 of Section 2.2. The proof in the general case, which is somewhat
harder, is given in Example 3 of Section 5.8. For the moment, we restrict
our attention to $\mathbb{R}^2$, where
The same set can have different metrics. In problem 1, the student is
asked to verify that both
and
$$\rho_{\max}((x_1, y_1), (x_2, y_2)) = \max\{|x_1 - x_2|,\ |y_1 - y_2|\}$$
(a) $\sqrt{x^2 + y^2} < 1$ (b) $|x| + |y| < 1$ (c) $\max\{|x|, |y|\} < 1$
Figure 5.6.2
Notice that metric spaces are not required to have any linear struc¬
ture; that is, they need not be vector spaces. No notion of addition or
scalar multiplication occurs in the definition. Thus for any metric space
$(\mathcal{M}, \rho)$ and any subset $\mathcal{M}_1 \subset \mathcal{M}$, $(\mathcal{M}_1, \rho)$ is also a metric space. For
example, the set of points $(x, y)$ in $\mathbb{R}^2$ such that $x^2 = y$ is a metric space
under any of the three metrics on $\mathbb{R}^2$ discussed in Example 3. If $\mathcal{M}$ is a
vector space and the vector space has a norm $\|\cdot\|$, then $\rho(x, y) = \|x - y\|$
defines a metric on $\mathcal{M}$ because (c) follows from the triangle inequality
for norms. Vector spaces with norms are discussed in Section 5.8.
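The three metrics on the plane can be compared with a short numerical experiment. This is only a sketch: it assumes that the metric labelled rho_1 is the sum of the coordinate differences, and the function names and the random sampling are illustrative choices, not the book's. A random test gives evidence for the triangle inequality but is of course not a proof.

```python
import math
import random

def rho_1(p, q):
    # Assumed form of rho_1: sum of the absolute coordinate differences.
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def rho_2(p, q):
    # Euclidean metric on the plane.
    return math.hypot(p[0] - q[0], p[1] - q[1])

def rho_max(p, q):
    # Maximum of the absolute coordinate differences.
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def check_triangle(rho, trials=1000, seed=0):
    """Return True if rho(x, z) <= rho(x, y) + rho(y, z) on random triples."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y, z = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(3)]
        if rho(x, z) > rho(x, y) + rho(y, z) + 1e-12:
            return False
    return True

p, q = (1.0, 2.0), (4.0, -2.0)
print(rho_1(p, q), rho_2(p, q), rho_max(p, q))
print([check_triangle(r) for r in (rho_1, rho_2, rho_max)])
```

The same functions restrict without change to a subset such as the parabola, since a subset of a metric space needs no extra structure to be a metric space.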
$$S^2 = \{(x, y, z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 = 1\}.$$
N
p(x,y) = (29)
i—1
messages are; see Section 10.2. Here we will briefly describe why such
metrics are useful in molecular biology.
Deoxyribonucleic acid (DNA) is a two-stranded polymer, each strand
consisting of a sequence of four building blocks joined together linearly
along a sugar-phosphate backbone. The four building blocks are the nu¬
cleotides adenine (A), thymine (T), cytosine (C), and guanine (G). Since
A only binds to T and C to G, the linear sequence of A's, T's, C's and G's
along one strand determines the other strand and therefore the whole
DNA molecule. The length of the sequence ranges from 4.6 million for
an E. coli bacterium to about 3 billion in human DNA. Short segments
of the DNA molecule code for the production of specific proteins. Ev¬
ery protein consists of a linear chain of amino acids (typically 50 to 1500)
selected from a fixed list of 20. Experimental techniques allow one to
determine the sequence of relatively small segments of DNA molecules
and the entire amino acid sequence of some proteins.
There are several reasons why one wants to compare two different
linear sequences in order to say how “similar” they are. If the DNAs
of two different animals are similar, then the animals are probably close
to each other on the evolutionary tree. Very similar short segments of
DNA probably code for similar proteins. If the sequences of amino acids
in two proteins are similar, their three-dimensional structures may be
similar and their functions may also be similar. The three-dimensional
structures and the functions of proteins are both difficult to determine.
Thus, astute comparisons of the sequence of a new protein with the se¬
quences of well-known proteins in data bases may suggest reasonable
hypotheses about structure and function.
In the case of DNA, several biological facts make the situation more
complicated. First, the sequences that we wish to compare may have dif¬
ferent lengths. We can handle this by counting as a mismatch a symbol
matched with nothing. Second, DNA mutates by substitutions (a sym¬
bol replaced by another symbol), additions (a symbol placed between
two there already), and deletions (a symbol removed). In the case of a
deletion, we can hold the place of the deleted symbol with a new symbol
So, given two sequences, a natural question is to find the relative
placement so that the distance between them is minimal. Or one can
ask the same question but permit additions or deletions or substitutions.
Even for short sequences, difficult combinatorial questions are involved.
For moderately long sequences, efficient algorithms for machine com¬
putation must be devised. See project 5. The notion of “similar” and all
the rest of our discussion depends of course on the metric that is cho-
Proof. Suppose that $x_n \to x$ in the metric $\rho$. Let $\varepsilon > 0$ be given, and
choose a $\delta > 0$ which satisfies (30) for this $x$ and $\varepsilon$. Since $\rho(x_n, x) \to 0$,
we can choose an $N$ so that $n \ge N$ implies that $\rho(x_n, x) \le \delta$. By (30),
202 Chapter 5. Sequences of Functions
Problems
1. Prove that the functions $\rho_1$ and $\rho_{\max}$ defined in Example 3 are indeed
metrics on $\mathbb{R}^2$.
2. Prove that the function p defined in Example 5 is a metric.
4. Suppose that p is a metric on M. Prove that the following are also metrics:
(a) $\rho_1 = 5\rho$.
(b) $\rho_2 = \min\{1, \rho\}$.
5. Suppose that $\rho_1$ and $\rho_2$ are metrics on $\mathcal{M}$. Prove that the following are
also metrics:
(a) $\rho = \rho_1 + \rho_2$.
(b) $\rho = \max\{\rho_1, \rho_2\}$.
for all $x$ and $y$ in $\mathcal{M}$. Prove that if $\rho$ and $\sigma$ are uniformly equivalent, then
they are equivalent.
12. Prove that the metrics $\rho_1$, $\rho_{\max}$, and $\rho_2$ defined in Example 3 are uniformly
equivalent.
\x - y 1
1
1 + \x — y\
(a) Prove that $\rho_\psi$ is a metric on $C[a, b]$ if $\psi(x) > 0$ for all $x \in [a, b]$.
(b) Explain why the condition $\psi(x) \ge 0$ is not enough to guarantee that
$\rho_\psi$ is a metric on $C[a, b]$.
(c) Suppose that $\psi$ and $\phi$ are continuous functions on $[a, b]$ which satisfy
$\psi(x) > 0$ and $\phi(x) > 0$ for all $x$. Prove that the metrics $\rho_\psi$ and
$\rho_\phi$ are uniformly equivalent.
If a set has a metric, then the notions of Cauchy sequence and completeness
can be defined analogously to their definitions on $\mathbb{R}$.
Example 2 Let $\mathcal{M} = \{f \in C[a, b] \mid f(x) \ge 0 \text{ for } x \in [a, b]\}$ with the metric
$\rho_\infty$, which comes from the sup norm. Since $\rho_\infty$ is a metric on $C[a, b]$, it
is a metric on $\mathcal{M}$. Let $\{f_n\}$ be a Cauchy sequence in $\mathcal{M}$. Since $\{f_n\}$ is a
Cauchy sequence in $C[a, b]$ and $C[a, b]$ is complete (Theorem 5.3.2), there
is a continuous function $f$ on $[a, b]$ such that $f_n \to f$ uniformly. For each
$x$, $f_n(x) \ge 0$ for all $n$, and this implies that $f(x) = \lim_{n\to\infty} f_n(x) \ge 0$.
Therefore $f \in \mathcal{M}$, so $(\mathcal{M}, \rho_\infty)$ is complete.
Proof. Uniqueness is easy, for suppose that there are two fixed points,
$\bar{x}$ and $\tilde{x}$. Then,
i,
iteration shows that $\rho(x_{n+1}, x_n) \le \alpha^n \rho(x_1, x_0)$. Therefore, by the triangle
inequality, if $m > n$,
Since $\alpha < 1$, the geometric series $\sum \alpha^j$ converges. Now, if $\rho(x_1, x_0) = 0$,
then $x_0$ is a fixed point and we are done. Otherwise, $\rho(x_1, x_0) > 0$, and,
given $\varepsilon > 0$, we can choose an $N$ so that $\sum_{j=n}^{m-1} \alpha^j \le \varepsilon\,\rho(x_1, x_0)^{-1}$ if $n \ge N$
and $m \ge N$. For such $n$ and $m$, $\rho(x_m, x_n) \le \varepsilon$, which proves that $\{x_n\}$
is a Cauchy sequence. Since $\mathcal{M}$ is complete, there is an $\bar{x} \in \mathcal{M}$ so that
$x_n \to \bar{x}$. Finally, by the triangle inequality,
Since $x_n \to \bar{x}$, all the terms on the right converge to zero. Therefore,
we conclude that $\rho(T(\bar{x}), \bar{x}) = 0$, and so, by the definition of metric,
$T(\bar{x}) = \bar{x}$. □
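The proof is constructive: iterating the map from any starting point produces the fixed point. As a minimal sketch, assume the map is cos on the interval [0, 1]. This example is not from the text; cos maps [0, 1] into itself and its derivative is bounded there by sin 1 < 1, so it is a contraction. The helper name `iterate_to_fixed_point` and the stopping tolerance are likewise assumptions.

```python
import math

def iterate_to_fixed_point(T, x0, tol=1e-12, max_iter=10_000):
    """Iterate x_{n+1} = T(x_n) until successive iterates differ by at most tol."""
    x = x0
    for n in range(1, max_iter + 1):
        x_next = T(x)
        if abs(x_next - x) <= tol:
            return x_next, n
        x = x_next
    raise RuntimeError("iteration did not converge within max_iter steps")

# cos is a contraction on [0, 1] with constant alpha = sin(1) < 1.
x_bar, steps = iterate_to_fixed_point(math.cos, 0.5)
print(x_bar, steps, abs(math.cos(x_bar) - x_bar))
```

Stopping when successive iterates are close is justified by the estimate in the proof: the distance to the fixed point is controlled by a geometric series in alpha times the size of one step.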
$$x_{n+1} = f(x_n),$$
which starts with a given point $x_0$. A point $\bar{x}$ is called a fixed point of
the iteration (or of $f$) if $f(\bar{x}) = \bar{x}$. A fixed point is called stable if there is
a $\delta > 0$ so that $x_0 \in [\bar{x} - \delta, \bar{x} + \delta]$ implies that $x_n \to \bar{x}$. In Section 2.7, we
investigated an iteration called the quadratic map and proved that some
of its fixed points are stable.
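Stability can be observed directly by running the iteration. The sketch below assumes the quadratic map f(x) = a x (1 - x) with the illustrative parameter a = 2.5, whose nonzero fixed point is (a - 1)/a = 0.6. The parameter value, the starting points, and the helper names are assumptions chosen for the demonstration.

```python
def quadratic_map(a):
    """Return the map f(x) = a * x * (1 - x) for a given parameter a."""
    return lambda x: a * x * (1.0 - x)

def iterate(f, x0, n):
    """Return x_n, where x_{k+1} = f(x_k) and x_0 is given."""
    x = x0
    for _ in range(n):
        x = f(x)
    return x

a = 2.5                      # assumed parameter value
f = quadratic_map(a)
fixed_point = (a - 1.0) / a  # solves f(x) = x with x != 0

for x0 in (0.5, 0.55, 0.65, 0.7):
    print(x0, iterate(f, x0, 60))
```

Each of these starting points is driven to 0.6, while a start very close to 0 moves away from 0, which suggests that 0 is not a stable fixed point for this parameter.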
$$|f(x) - \bar{x}| = |f(x) - f(\bar{x})| = |f'(c)|\,|x - \bar{x}| \le \beta\,|x - \bar{x}|$$
by the Mean Value Theorem. This proves that $f(x)$ is closer to $\bar{x}$ than $x$ is,
so $f$ takes $\mathcal{M}$ into itself. Using the Mean Value Theorem again, we see
that if $x$ and $y$ are in $\mathcal{M}$, then
Example 4 Let's see what information Theorem 5.7.2 gives us about the
quadratic map,
$$f(x) = ax(1 - x),$$
in the case $1 < a < 3$. It is easy to check that the only fixed points in the
interval $[0, 1]$ are the points $0$ and $\bar{x} = \frac{a-1}{a}$. After calculating that $f'(x) =
a(1 - 2x)$, we substitute $\bar{x}$ and find $f'(\bar{x}) = 2 - a$. Since $1 < a < 3$, we see
that $|f'(\bar{x})| < 1$, so the hypothesis of Theorem 5.7.2 is satisfied. Therefore,
□ Theorem 5.7.3 Let $(\mathcal{M}_1, \rho_1)$ be a metric space. Then, there is a complete
metric space $(\mathcal{M}_2, \rho_2)$ and an isometry $T$ from $\mathcal{M}_1$ into $\mathcal{M}_2$ such that the
range of $T$ is dense in $\mathcal{M}_2$.
Theorem 5.7.3 says that in terms of metric space properties $(\mathcal{M}_1, \rho_1)$
can be identified with a dense subset of a complete metric space
$(\mathcal{M}_2, \rho_2)$. Though Theorem 5.7.3 says that every metric space can be
“enlarged” to become complete, it is not very useful since in practice one
wants to be able to characterize the added points. For example, what are
the added points if one completes $C[a, b]$ in the $L^2$ norm? Problem 12
gives an example of an incomplete space for which one can characterize
the completion.
Problems
5. Which of the following subsets of $C[a, b]$ are complete metric spaces with
the metric $\rho_\infty$?
6. Give an example to show that a discrete metric space may not be com¬
plete.
10. (a) Prove that if $\rho$ and $\sigma$ are uniformly equivalent metrics on $\mathcal{M}$, then
$(\mathcal{M}, \rho)$ is complete if and only if $(\mathcal{M}, \sigma)$ is complete.
(b) Suppose that $\rho$ and $\sigma$ are equivalent metrics on $\mathcal{M}$. Show by example
that it is possible that $(\mathcal{M}, \rho)$ is complete but $(\mathcal{M}, \sigma)$ is not
complete. Hint: see problem 9.
11. Let $(\mathcal{M}_i, \rho_i)$, $i = 1, \ldots, N$, be a finite collection of complete metric spaces.
Let $\mathcal{M}$ be the product of the spaces $\mathcal{M}_i$; that is, $\mathcal{M}$ consists of the $N$-tuples
$(x_1, x_2, \ldots, x_N)$ with $x_i \in \mathcal{M}_i$ for each $i$. For two such $N$-tuples,
$x = (x_1, x_2, \ldots, x_N)$ and $y = (y_1, y_2, \ldots, y_N)$, define
i= 1
(c) Show that $C_0(\mathbb{R})$, the continuous functions which go to zero at $\infty$,
is complete in the sup norm (problem 10 of Section 5.3).
(d) Prove that $\mathcal{M}$ is dense in $C_0(\mathbb{R})$.
13. (a) Let $\mathcal{M}$ be the circle of radius 1 with the center at the origin in $\mathbb{R}^2$. Let
$\rho_2$ be the Euclidean metric, and let $\rho$ be the metric which assigns to
two points the arc length along the circle between them (going the
shorter way). Prove that these metrics are uniformly equivalent.
(b) Assume that the geodesic metric, $\rho_g$, of Example 4 of Section 5.6
assigns to the points $\alpha$ and $\beta$ on $S^2$ the shorter of the two great
circle arcs between them. Use part (a) to show that $\rho_g$ is uniformly
equivalent to $\rho_2$ on $S^2$.
(c) Prove that $(S^2, \rho_2)$ is complete.
(d) Conclude that $(S^2, \rho_g)$ is complete.
14. In studying lateral inhibition in the retina, one is led to the following kind
of model. We imagine a line of cells indexed by $j$ for $-\infty < j < \infty$, and
denote by $e_j$ a nonnegative number representing the stimulation of the
$j$th cell. If $r_j$ represents the response of the $j$th cell, we would like to
solve the family of equations
rj = ej - \(rj-1 + rj+1).
(a) Prove that for every bounded sequence $\{e_j\}$, there is a unique bounded
sequence $\{r_j\}$ so that these equations hold. Hint: you will need to
use the fact that $\ell_\infty$ is a Banach space; see Section 5.8.
(b) How can you compute the sequence $\{r_j\}$?
Definition. A vector space over the real numbers is a set V, whose ele¬
ments are called vectors, together with operations of addition and scalar
multiplication that satisfy the following rules:
5.8 Normed Linear Spaces 211
2. For $v$ and $w$ in $V$, $v + w = w + v$.
5. For each $v \in V$ there exists another vector in $V$, denoted $-v$, so that
v + (—v) = 0.
Throughout the text we have used the absolute value $|x|$ to measure
the size of a real number, and in Section 5.3 we introduced the sup norm
$\|f\|_\infty$ to measure the size of bounded functions. Notice that the properties
of the absolute value in Proposition 1.1.2 and the properties of $\|\cdot\|_\infty$
in Proposition 5.3.1 are the same. Other properties of $|\cdot|$ and $\|\cdot\|_\infty$ are
also very similar; compare, for example, problem 10 in Section 1.1 with
problem 2(a) in Section 5.3. This suggests that the absolute value and the
sup norm are special cases of a more general idea.
Definition. Let $V$ be a vector space over the real numbers. A function,
$\|\cdot\|$, from $V$ to $\mathbb{R}$ is called a norm if it satisfies the following conditions:
(a) $\|v\| \ge 0$ and $\|v\| = 0$ if and only if $v$ is the zero vector in $V$.
(b) For all $v$ in $V$ and $\alpha$ in $\mathbb{R}$, $\|\alpha v\| = |\alpha|\,\|v\|$.
(c) For all $v$ and $w$ in $V$, $\|v + w\| \le \|v\| + \|w\|$ (the triangle inequality).
1
2
n
2
x+y 2 = 5> + ^l2 (33)
2=1
+ 5>|2 (36)
2=1
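$$= \left( \Big(\sum_{i=1}^{n} |x_i|^2\Big)^{1/2} + \Big(\sum_{i=1}^{n} |y_i|^2\Big)^{1/2} \right)^2 \tag{37}$$
$$= (\|x\|_2 + \|y\|_2)^2. \tag{38}$$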
Taking the square root of both sides proves the triangle inequality for
the Euclidean norm. The Cauchy-Schwarz inequality was used in going
from (35) to (36).
The sequence which is all zeros is the 0 of the vector space. We define a
norm on $\ell_\infty$ by
$$\|\{a_j\}\|_\infty = \sup_j |a_j|.$$
Since $|a_j| \ge 0$ for each $j$, it is clear that $\|\{a_j\}\|_\infty \ge 0$ and $\|\{a_j\}\|_\infty = 0$ if
and only if $a_j = 0$ for each $j$. Furthermore, if $\alpha \in \mathbb{R}$,
< oo
< £ (39)
$$\|a^{(n)} - a^{(m)}\|_\infty = \big\|\{a_j^{(n)} - a_j^{(m)}\}_{j=1}^{\infty}\big\|_\infty = \sup_j |a_j^{(n)} - a_j^{(m)}|,$$
we see that
if $n \ge N$, $m \ge N$, for each fixed $j$. Thus, each of the sequences of components
is a Cauchy sequence of real numbers and therefore
converges to a real number $a_j$. By problem 4(a),
$$\Big|\, \|a^{(n)}\|_\infty - \|a^{(m)}\|_\infty \Big| \le \|a^{(n)} - a^{(m)}\|_\infty,$$
this proves that $|a_j| \le M$ for all $j$. Therefore, the sequence $a = \{a_j\}$ is
bounded and is thus an element of $\ell_\infty$.
Finally, letting $n \to \infty$ in (40) shows that $|a_j - a_j^{(m)}| \le \varepsilon$ for each $j$ if
$m \ge N$. Thus,
yi — Uil*n + Uz2*®2 + • • • +
= aSty){x) + (3Sty)(x).
These examples show that some of the most important objects that
one wants to study are linear transformations on Banach spaces. If the
Banach spaces are finite dimensional as in Example 4, the study of lin¬
ear transformations is called linear algebra. If the underlying Banach
spaces are infinite dimensional, as in Examples 5 and 6, the study of lin¬
ear transformations is part of a branch of mathematics called functional
analysis.
Problems
(f) The continuous functions on $\mathbb{R}$ that satisfy $|f(x)| \le c\,e^{x^2}$ for some
$c \in \mathbb{R}$ which can depend on $f$.
6. Prove that Rn is complete in the Euclidean norm. Hint: show that a se¬
quence is Cauchy in Rn if and only if each of the sequences of components
is Cauchy in R.
7. Show that
$$\int_{-\infty}^{\infty} |f(x)|\,e^{-x^2}\,dx$$
(a) Prove that if $\|\cdot\|_1$ and $\|\cdot\|_2$ are equivalent, then $V$ is complete in
$\|\cdot\|_1$ if and only if $V$ is complete in $\|\cdot\|_2$.
(b) Prove that the $\|\cdot\|_1$ norm and the Euclidean norm $\|\cdot\|_2$ are equivalent
on $\mathbb{R}^n$ by showing that
(c) Prove that the sup norm and the $L^1$ norm are not equivalent on
$C[a, b]$.
9. Recall from linear algebra that a set of vectors $\{v_i\}_{i=1}^{n}$ is said to be linearly
independent if no linear combination $a_1 v_1 + a_2 v_2 + \ldots + a_n v_n$ is the zero
vector unless $a_i = 0$ for all $i$. A vector space $V$ is said to have dimension
$N$ if every set of $N$ independent vectors $\{v_i\}_{i=1}^{N}$ spans $V$; that is, every
vector in $V$ can be written as a linear combination of the $v_i$. If $V$ has
dimension $N$ for some $N$, $V$ is said to be finite dimensional.
10. Let $\{x_i\}_{i=1}^{N}$ and $\{y_i\}_{i=1}^{N}$ be real numbers not all zero. Define a quadratic,
$p(\lambda)$, by
$$p(\lambda) = \sum_{i=1}^{N} (x_i + \lambda y_i)^2.$$
Explain why p(A) has either two complex roots or a double real root. Use
this fact to prove the Cauchy-Schwarz inequality
JV
< (42)
2=1
(a) Explain why $c_0$ is a normed linear space with the norm $\|\cdot\|_\infty$.
(b) Prove that $c_0$ is complete. Hint: since $c_0 \subset \ell_\infty$, we know that any
Cauchy sequence has a limit in $\ell_\infty$.
(c) Show that the set of sequences which are zero after finitely many
terms is dense in $c_0$.
(d) Show that $c_0$ is not dense in $\ell_\infty$.
12. Let T be a linear transformation from a Banach space to itself and suppose
that T is a contraction. What fixed points can T have?
13. Let $A_n$ be the set of $n \times n$ matrices $A = \{a_{ij}\}$. Define
(d) Prove that $\frac{d^2}{dx^2}$ is a linear transformation from $C^{(2)}[a, b]$ to $C[a, b]$.
(e) Prove that $\{f \in C^{(2)}[a, b] \mid f''(x) = 0\}$ is a vector space. It is called the
kernel of $\frac{d^2}{dx^2}$.
(f) Identify the functions in the kernel.
Projects
(b) Explain why $|f(x)|$ is continuous on $[0, 1]$. Let $x_0$ be the point where
$|f(x)|$ achieves its maximum. Explain why $|f(x_0)| = \|f\|_\infty$.
(c) Assume that $x_0$ is not one of the endpoints and let $\varepsilon > 0$ be given.
Explain why you can choose a $\gamma > 0$ so that $|f(x)| \ge \|f\|_\infty - \varepsilon$ for
all $x \in [x_0 - \gamma, x_0 + \gamma]$.
2. The purpose of this project is to show by example that the method out¬
lined in Section 5.4 can sometimes be used to solve nonlinear integral
equations. Consider the following integral equation on the whole line R:
1
'ip(x) - COS X y){ip(y))2 dy. (43)
2
1 1 fx+*
i/>n+i(x) = - c°sx + - sin (x - y)(ipn(y))2 dy.
2 2Jx-\
(a) Recall that $C_b(\mathbb{R})$ denotes the space of bounded continuous functions
on $\mathbb{R}$ and that $C_b(\mathbb{R})$ is complete (problem 3 in Section 5.3).
Prove that if $\psi_n \in C_b(\mathbb{R})$, then $\psi_{n+1} \in C_b(\mathbb{R})$. Argue inductively that
$\psi_n \in C_b(\mathbb{R})$ for all $n$ if $\psi_0 \in C_b(\mathbb{R})$.
(b) Suppose $\psi_0 \in C_b(\mathbb{R})$ and $\|\psi_0\|_\infty \le 1$. Prove that $\|\psi_n\|_\infty \le 1$.
(c) Use estimates similar to those in the proof of Theorem 5.4.1 to show
that
||0n+l 0n||oo L 2 ll^n — V'n-lll00'
3. The purpose of this project is to show that integral equations can some¬
times be used to solve boundary value problems for differential equa¬
tions. Let / be a continuous function on [0,1] and define
$$K(x, y) = \begin{cases} y(1 - x) & \text{if } y \le x \\ x(1 - y) & \text{if } y \ge x. \end{cases}$$
(a) Explain why $C_1 < 0$ and why $y'(x)$ must blow up as $x \searrow 0$. What
does it mean geometrically that $y'(x)$ blows up?
(e) Show that the constants $C_1$ and $C_2$ can be chosen so that the curve
$(x(\theta), y(\theta))$ passes through the points $(0, 0)$ and $(x_1, y_1)$.
(f) Generate the graph of the curve $(x(\theta), y(\theta))$. Why do you think that
the curve which gives the shortest time of descent is so steep near
the origin?
(a) Consider the sequences AGGCTC and AGCTCG drawn from the
DNA alphabet. We use the discrete metric in which a letter op¬
posite a deletion symbol and a letter opposite nothing are counted
as full mismatches. Show that the minimum distance between the
sequences is 3 if we do not allow deletion symbols to be inserted.
Show that the minimum distance is 2 if we do allow deletion sym¬
bols.
(b) Design and implement a computational algorithm for finding the
minimal distance (with no deletion symbols allowed) between two
sequences of length 10 and length 8 constructed from the DNA al¬
phabet.
(c) Design an algorithm which produces random DNA sequences of
length 8 and 10.
(d) Conduct an experiment in which you determine the minimum dis¬
tance of 1000 randomly chosen pairs of lengths 10 and 8, respec¬
tively. What fraction has distance 2, distance 3, and so forth? How
likely is it that two randomly chosen pairs have a distance < 3?
(e) If the lengths of the sequences are N and M instead of 10 and 8, es¬
timate how the number of computational steps involved in finding
the minimal distance grows as N and M get large.
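The two notions of minimal distance in part (a) can be explored with a small program. This is a hedged sketch rather than the intended solution. It assumes that "no deletion symbols" means only the relative placement of the two sequences may vary, and that allowing deletion symbols amounts to the standard unit-cost edit distance computed by dynamic programming. The function names are illustrative.

```python
def shift_distance(a, b):
    """Minimum number of mismatches over all relative placements of a and b.

    A letter opposite nothing counts as a full mismatch; no deletion
    symbols may be inserted.
    """
    best = len(a) + len(b)  # placement in which the sequences do not overlap
    for d in range(-len(b) + 1, len(a)):  # b starts at position d of a
        lo, hi = max(0, d), min(len(a), d + len(b))
        matches = sum(a[i] == b[i - d] for i in range(lo, hi))
        span = max(len(a), d + len(b)) - min(0, d)
        best = min(best, span - matches)
    return best

def gapped_distance(a, b):
    """Minimum number of mismatches when deletion symbols may be inserted.

    Assumed here to be the unit-cost edit distance, built row by row.
    """
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # ca opposite a deletion symbol
                           cur[j - 1] + 1,            # cb opposite a deletion symbol
                           prev[j - 1] + (ca != cb)))  # ca opposite cb
        prev = cur
    return prev[-1]

print(shift_distance("AGGCTC", "AGCTCG"), gapped_distance("AGGCTC", "AGCTCG"))
```

For the two sequences of part (a) the sketch reports 3 and 2. The first function tries about N + M placements and compares up to min(N, M) letters in each; the second fills a table with about N times M entries.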
CHAPTER 6
Series of Functions
As $N$ gets larger, the sup is taken over a smaller set so the sequence of
numbers $\{\bar{s}_N\}$ is monotone decreasing; that is, $\bar{s}_N \ge \bar{s}_{N+1}$. If $\{\bar{s}_N\}$ is
bounded then, by Theorem 2.4.3, $\{\bar{s}_N\}$ converges to a finite number $\bar{s}$.
We define $\bar{s}$ to be the limit superior of the sequence $\{a_n\}$ and write
$$\limsup_{n\to\infty} a_n = \bar{s}.$$
We shall usually write $\limsup a_n$, omitting the subscript $n \to \infty$. If $\{\bar{s}_N\}$
is not bounded, there are only two possibilities since it is monotone decreasing.
Either $\bar{s}_N = \infty$ for all $N$, in which case we say that $\limsup a_n =
\infty$, or $\bar{s}_N \to -\infty$, in which case we say that $\limsup a_n = -\infty$. Similarly,
we define
$$\underline{s}_N = \inf\{a_n \mid n \ge N\}$$
224 Chapter 6. Series of Functions
A sequence $\{a_n\}$ may or may not have a limit, but it always has a lim sup
and lim inf, though they may equal $\pm\infty$. Notice that $\underline{s}_N \le \bar{s}_N$ for all $N$,
so, by problem 3 in Section 2.4, we always have $\underline{s} \le \bar{s}$.
Example 1 Let $a_n = (-1)^n$. The sequence $\{a_n\}$ certainly does not converge.
However, for each $N$, $\bar{s}_N = 1$ and $\underline{s}_N = -1$. Thus, $\limsup a_n = 1$
and $\liminf a_n = -1$.
Figure 6.1.1
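The tail sups and infs of Example 1 can be tabulated by machine, with one caveat: a program can only inspect finitely many terms, so the sketch below approximates each tail by a finite window. The window length, the second test sequence (-1)^n (1 + 1/n), and the function names are assumptions added for illustration.

```python
def tail_sup_inf(a, N, window=2000):
    """Approximate sup and inf of {a(n) : n >= N} using only N <= n < N + window."""
    terms = [a(n) for n in range(N, N + window)]
    return max(terms), min(terms)

def alternating(n):
    # The sequence of Example 1.
    return (-1) ** n

def damped(n):
    # Hypothetical second example: (-1)^n (1 + 1/n).
    return (-1) ** n * (1 + 1 / n)

for N in (1, 2, 3, 10, 100):
    print(N, tail_sup_inf(alternating, N), tail_sup_inf(damped, N))
```

For the alternating sequence every tail gives (1, -1). For the second sequence the tail sups decrease toward 1 and the tail infs increase toward -1, which matches the monotonicity used to define lim sup and lim inf.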
$$1, 0, 1, -1, 1, -2, 1, -3, 1, -4, 1, -5, \ldots$$
For each $N$, $\bar{s}_N = 1$ and $\underline{s}_N = -\infty$. Therefore, $\limsup a_n = 1$ and
$\liminf a_n = -\infty$.
We will see later that lim sup and lim inf are very useful. For the moment
we prove a theorem that gives a practical and intuitive characterization
of $\bar{s}$ and $\underline{s}$.
6.1 Lim sup and Lim inf 225
(a) If $\bar{s}$ is finite and $\varepsilon > 0$ is given, there exists an $N$ so that $a_n < \bar{s} + \varepsilon$ for
all $n \ge N$, and for each $N$ there exists an $n \ge N$ so that $a_n > \bar{s} - \varepsilon$.
Conversely, if $s$ is a number satisfying these properties, then $s = \bar{s}$.
(b) If $\underline{s}$ is finite and $\varepsilon > 0$ is given, there exists an $N$ so that $a_n > \underline{s} - \varepsilon$ for
all $n \ge N$, and for each $N$ there exists an $n \ge N$ so that $a_n < \underline{s} + \varepsilon$.
Conversely, if $s$ is a number satisfying these properties, then $s = \underline{s}$.
Proof. We will prove (a); the proof of (b) is similar. Let $\varepsilon > 0$ be given.
Since $\bar{s}_N \to \bar{s}$ and $\bar{s}$ is finite, we can choose $N$ so that $\bar{s}_N - \bar{s} < \varepsilon$. That is,
$\sup\{a_n \mid n \ge N\} < \bar{s} + \varepsilon$ so $a_n < \bar{s} + \varepsilon$ for all $n \ge N$. Given $N$, suppose that
there were no $n \ge N$ such that $a_n > \bar{s} - \varepsilon$. Then $\bar{s}_N = \sup\{a_n \mid n \ge N\} \le
\bar{s} - \varepsilon$, which is impossible since $\bar{s}_N$ decreases to $\bar{s}$.
Conversely, suppose that $s$ satisfies the stated properties. Then for
each $\varepsilon > 0$ there is an $N$ so that $\bar{s}_N \le s + \varepsilon$; thus, since $\bar{s}_N$ is monotone
decreasing, $\bar{s} \le s + \varepsilon$. Since $\varepsilon$ is arbitrary, we conclude that $\bar{s} \le s$. On
the other hand, given any $N$, there is an $n \ge N$ such that $a_n > s - \varepsilon$.
This implies that $\bar{s}_N > s - \varepsilon$. Thus $\bar{s} \ge s - \varepsilon$, and since $\varepsilon$ is arbitrary we
conclude that $\bar{s} \ge s$. Therefore, $s = \bar{s}$. □
in which case
$$\limsup a_n = a = \liminf a_n.$$
so
$$a - \varepsilon \le \bar{s} \le a + \varepsilon \quad \text{and} \quad a - \varepsilon \le \underline{s} \le a + \varepsilon.$$
Theorem 6.1.1 gives some intuition about lim sup and lim inf. The
terms of the sequence eventually get below any number that is bigger
than $\bar{s}$ and keep coming back above any number that is less than $\bar{s}$. Similarly,
the terms of the sequence are eventually above any number below $\underline{s}$
and keep coming back below any number above $\underline{s}$. We emphasize that a
sequence $\{a_n\}$ may or may not have a limit, but $\limsup a_n$ and $\liminf a_n$
always exist.
The following technical theorem will be used when we consider infi¬
nite series, and its proof illustrates the concepts that we have defined.
$$\liminf \frac{a_{n+1}}{a_n} \le \liminf \sqrt[n]{a_n} \le \limsup \sqrt[n]{a_n} \le \limsup \frac{a_{n+1}}{a_n}. \tag{2}$$
Proof. The middle inequality holds because the lim inf of any sequence
is less than or equal to the lim sup. We will prove the inequality on the
right. The proof of the one on the left is similar. Define
$$\alpha = \limsup_{n\to\infty} \frac{a_{n+1}}{a_n}$$
for all $n \ge N$. We can rewrite this as $a_{n+1} < a_n(\alpha + \varepsilon)$. Iterating this
inequality, starting at $N$, gives
so
As $n \to \infty$ the right-hand side of (3) converges to $\alpha + \varepsilon$ since the term
with the $n$th root converges to 1. Therefore, by the result in problem 7,
$\limsup \sqrt[n]{a_n} \le \alpha + \varepsilon$. Since $\varepsilon$ was arbitrary, $\limsup \sqrt[n]{a_n} \le \alpha$, which is
what we needed to prove. □
Problems
1. Find the lim sup and lim inf of each of the following sequences:
(a) $a_n = 5 + (-1)^n$.
(b) $a_n = 5 + (-2)^n$.
(c) an = 5 + i sin n.
(d) — (3.2)an(l dn), with no 2*
4. Let {an} be a sequence of real numbers and suppose that lim sup an is
finite. Prove that if c > 0, we have lim sup can = c lim sup an.
5. Let $\{a_n\}$ be a sequence of real numbers and suppose that $\limsup a_n$ is
finite. Let $\{c_n\}$ be another sequence and suppose that $c_n \to c$.
and give an example which shows that strict inequality can hold.
7. Suppose that $\{a_n\}$ and $\{b_n\}$ are sequences such that $a_n \le b_n$ for all $n$ and
$b_n \to b$. Prove that $\limsup a_n \le b$.
8. Let $\{a_n\}$ be a bounded sequence of real numbers. Prove that $\{a_n\}$ has a
subsequence that converges to $\limsup a_n$.
9. Let {an} be a bounded sequence of real numbers and let P be the set of
limit points of {an}. Limit points are defined in Section 2.6. Prove that
$\limsup a_n = \sup P$ and $\liminf a_n = \inf P$.
$$\frac{1}{2},\ \frac{1}{3},\ \frac{1}{2^2},\ \frac{1}{3^2},\ \frac{1}{2^3},\ \frac{1}{3^3},\ \frac{1}{2^4},\ \frac{1}{3^4},\ \ldots$$
If m = oo, then infinitely many terms are being added up and the sum is
called an infinite series. We want to determine conditions under which
we can give a reasonable meaning to $\sum_{j=1}^{\infty} a_j$, and we want to prove theorems
that allow us to manipulate infinite series. For each $n$, we define
the partial sum, $S_n$, of the series $\sum a_j$ to be
$$S_n = \sum_{j=1}^{n} a_j.$$
$$\sum_{j=1}^{\infty} a_j = S.$$
If the sequence of partial sums does not converge we say that the series
diverges.
6.2 Series of Real Constants 229
$$\alpha S_n + 1 = S_n + \alpha^{n+1}.$$
$$\frac{1 - \alpha^{n+1}}{1 - \alpha}$$
aj
a
3=0
$$1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \frac{1}{32} + \cdots$$
gets closer and closer to 2. If $|\alpha| > 1$, then one can see from the explicit
formula for $S_n$ that $S_n$ does not converge. In the case $\alpha = 1$, we are
adding up 1's, so the series certainly does not converge. In the case $\alpha =
-1$, the partial sums alternate between 1 and 0 and so do not converge.
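The behavior of the geometric series in each case is easy to confirm numerically. The sketch below assumes the partial sums start at j = 0, as the relation between alpha S_n and S_n suggests, and compares the direct sum with the closed form (1 - alpha^(n+1)) / (1 - alpha). The function names are illustrative.

```python
def partial_sum(alpha, n):
    """S_n = 1 + alpha + alpha^2 + ... + alpha^n, summed directly."""
    return sum(alpha ** j for j in range(n + 1))

def closed_form(alpha, n):
    """Closed form of S_n, valid for alpha != 1."""
    return (1 - alpha ** (n + 1)) / (1 - alpha)

for n in (5, 10, 50):
    print(n, partial_sum(0.5, n), closed_form(0.5, n))

# alpha = -1: the partial sums alternate between 1 and 0.
print([partial_sum(-1, n) for n in range(6)])
```

With alpha = 1/2 the partial sums approach 2, and with alpha = -1 they alternate between 1 and 0, as described.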
For simplicity, we will often write $\sum_{j=1}^{\infty} a_j$ as $\sum a_j$, where the indices
are understood.
Since a series converges if and only if the sequence of partial sums
converges, we can use the theorems which we have proven about se¬
quences to study series.
□ Theorem 6.2.1
(a) A series $\sum a_j$ converges if and only if for each given $\varepsilon > 0$ there
is an $N$ such that
j—n
which proves (5). Conversely, if (5) holds, then $|S_m - S_{n-1}| \le \varepsilon$ for $n$ and
$m$ large enough, so $\{S_n\}$ is a Cauchy sequence. Thus $\lim_{n\to\infty} S_n$ exists
and by definition the series converges.
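To prove (b), let $\varepsilon > 0$ be given and choose $N$ so that (5) holds. If we
choose $m = n$, then
$$|a_n| = \Big|\sum_{j=n}^{n} a_j\Big| \le \varepsilon$$
for $n \ge N$, which implies that $a_j \to 0$ as $j \to \infty$. Part (c) follows immediately
from (a) and the fact that
$$\Big|\sum_{j=n}^{m} a_j\Big| \le \sum_{j=n}^{m} |a_j|.$$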
To prove (d), let $S_{a,n}$ and $S_{b,n}$ be the partial sums of $\sum a_j$ and $\sum b_j$
respectively. Let $S_n$ be the $n$th partial sum of $\sum (c a_j + d b_j)$. Then,
$S_n = c S_{a,n} + d S_{b,n}$ for each $n$. Since $S_{a,n}$ converges and $S_{b,n}$ converges,
Theorems 2.2.3 and 2.2.4 guarantee that $S_n$ converges and
$$\lim_{n\to\infty} S_n = c \lim_{n\to\infty} S_{a,n} + d \lim_{n\to\infty} S_{b,n}.$$
Figure 6.2.1 (labels: $1/(m-1)^p$, $1/m^p$, $m-1$, $m$)
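$$\sum_{j=n}^{m} \frac{1}{j^p} \le \int_{n-1}^{m} \frac{dx}{x^p} = \frac{-1}{p-1}\left(\frac{1}{m^{p-1}} - \frac{1}{(n-1)^{p-1}}\right) \le \frac{1}{p-1}\,\frac{2}{(N-1)^{p-1}}$$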
if $n \ge N$ and $m \ge N$. Since $p > 1$, the expression on the right can
be made as small as we like by choosing $N$ large. Thus, by part (a) of
Theorem 6.2.1, $\sum \frac{1}{j^p}$ converges.
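The comparison of a block of terms with an integral can be checked numerically. In this sketch the bound used is the exact value of the integral of x^(-p) from n - 1 to m, and the exponent p = 2 is an illustrative choice. The comparison of the partial sum with pi^2 / 6 relies on the known value of the sum of 1/j^2, a fact that is not derived in the text. The function names are assumptions.

```python
import math

def tail_sum(p, n, m):
    """Sum of 1 / j^p for j = n, ..., m."""
    return sum(1.0 / j ** p for j in range(n, m + 1))

def integral_bound(p, n, m):
    """Integral of x^(-p) from n - 1 to m, for p > 1 and n >= 2."""
    return (1.0 / (p - 1)) * (1.0 / (n - 1) ** (p - 1) - 1.0 / m ** (p - 1))

p = 2.0
for n, m in [(2, 10), (10, 1000), (100, 100000)]:
    print(n, m, tail_sum(p, n, m), integral_bound(p, n, m))

partial = tail_sum(p, 1, 100000)
print(partial, math.pi ** 2 / 6)
```

Each block sum stays below the corresponding integral, and the bound shrinks as n grows, which is the Cauchy criterion of Theorem 6.2.1(a) at work.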
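$$1 + \frac{1}{2} + \left\{\frac{1}{3} + \frac{1}{4}\right\} + \left\{\frac{1}{5} + \cdots + \frac{1}{8}\right\} + \left\{\frac{1}{9} + \cdots + \frac{1}{16}\right\} + \cdots$$
The terms in each bracket add up to a number greater than $\frac{1}{2}$. Thus the
sequence of partial sums diverges to $\infty$, and thus the harmonic series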
□ Theorem 6.2.2 (The Comparison Test) Let $\{a_j\}$, $\{b_j\}$, and $\{c_j\}$ be sequences
of nonnegative numbers such that $a_j \le b_j \le c_j$ for each $j$. Then,
m m m
T,bi J2bJ
j=n
^ Es- < £•
j=n
$$\frac{|\sin j|}{j^2 + 1} \le \frac{1}{j^2}$$
and $\sum \frac{1}{j^2}$ converges, by part (a) of Theorem 6.2.2, $\sum \frac{\sin j}{j^2+1}$ is absolutely
convergent. Therefore, by part (c) of Theorem 6.2.1, $\sum \frac{\sin j}{j^2+1}$ is convergent.
when applying the comparison test because it means that the criterion
$a_j \le b_j \le c_j$ need only hold for all $j$ bigger than some finite number $J$.
Though convergence depends only on the tail of the series, the sum of
the series, if it converges, depends on all of the terms.
□ Theorem 6.2.3 (The Root Test) Set $\alpha = \limsup |a_j|^{1/j}$. Then
Proof. Set $\alpha = \limsup |a_j|^{1/j}$. By Theorem 6.1.3,
Thus, if $\limsup \frac{|a_{j+1}|}{|a_j|} < 1$, the series converges by part (a) of Theorem
6.2.3. And, if $\liminf \frac{|a_{j+1}|}{|a_j|} > 1$, the series diverges by part (b) of Theorem
6.2.3. □
Neither the root test nor the ratio test gives information if the respective
limits equal 1. We remark that in many cases the sequence $\frac{|a_{j+1}|}{|a_j|}$ has
a limit, in which case the four limits in Theorem 6.1.3 are all the same.
\aj\ j +1
as $j \to \infty$, the ratio test proves that the series converges.
Proof. Define $S_n = \sum_{j=0}^{n} a_j$, $S = \lim_{n\to\infty} S_n$, and $T_n = \sum_{j=0}^{n} a_{f(j)}$. Let
$\varepsilon > 0$ be given. Since $\sum a_j$ converges absolutely, we can choose $N_1$ so
that
OO
Let $J = \max\{j_0, j_1, \ldots, j_{N_1}\}$, where $j_k$ is the natural number such that
$f(j_k) = k$. Such integers exist because $f$ is onto. Now, choose $N =
\max\{J + 1, N_1\}$. By the triangle inequality,
for each $n$. If $n \ge N$, then $n \ge N_1$, so the second term on the right is
$\le \frac{\varepsilon}{2}$ by (7). Further, since $n \ge J + 1$, the partial sum $\sum_{j=0}^{n} a_{f(j)}$ contains
every term in the sum $\sum_{j=0}^{N_1} a_j$. Thus, the difference $T_n - S_n$ contains
only terms $a_j$ with $j \ge N_1 + 1$, and therefore $|T_n - S_n| \le \frac{\varepsilon}{2}$, again by (7).
Thus,
$$|T_n - S| \le \varepsilon \quad \text{for } n \ge N,$$
so $T_n \to S$; that is, $\sum a_{f(j)} = \sum a_j$. The same proof, using $|a_j|$ and
$|a_{f(j)}|$ in place of $a_j$ and $a_{f(j)}$, shows that $\sum |a_{f(j)}| = \sum |a_j|$, so $\sum a_{f(j)}$
converges absolutely. □
oo \ / oo \
uCM = (8)
j=o / \fc=0 /
How are we to understand the double sum on the right? We could mean
oo/oo \ oo/oo \ oo / n
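$$S_N = \sum_{k=0}^{N} \left( \sum_{j=0}^{N} |a_j|\,|b_k| \right) = \sum_{k=0}^{N} |b_k| \left( \sum_{j=0}^{N} |a_j| \right) = \left( \sum_{j=0}^{N} |a_j| \right) \left( \sum_{k=0}^{N} |b_k| \right) \le AB.$$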
Section 2.6, the sequence of partial sums converges. Thus, in this ordering,
$\sum a_j b_k$ is absolutely convergent. By Theorem 6.2.5, $\sum a_j b_k$ is absolutely
convergent in all orderings and $\sum a_j b_k$ is the same in all orderings.
So, using the usual rules of arithmetic,
One can use this theorem to show that $e^x e^y = e^{x+y}$ by multiplying
out the series for the terms on the left and regrouping (problem 12 of
Section 6.5). We remark that the conclusions of Theorem 6.2.5 are still
true if only one of the two series $\sum a_j$ and $\sum b_k$ is absolutely convergent,
but the proof is harder.
Problems
Show that the ratio test gives no information but that the root test and the
comparison test show convergence.
diverges. What happens if you calculate the first few partial sums on
your hand calculator?
9. Show that the series Y^Li s*n (j)2 converges. Hint: use the Mean Value
Theorem.
10. Establish the convergence or divergence of Y'jLiln (1 + j). Hint: use the
definition of ln x in Example 2 of Section 4.3.
hi l\Sj
11. Establish the convergence or divergence of Y’jLi ~—7T~-
12. Prove that if the sequence $\{b_j\}$ is bounded and $\sum |a_j|$ converges, then
$\sum a_j b_j$ converges.
13. Prove that if aj > 0 for all j and Y aj converges, then Y a) converges.
14. Suppose that $a_j \ge 0$ and that $\sum a_j$ converges.
(a) Show by example that it is not necessarily true that $\sum \sqrt{a_j}$ converges.
(b) Show that $\sum \frac{\sqrt{a_j}}{j}$ converges. Hint: use the Cauchy-Schwarz inequality.
15. Suppose that aj > 0 and that Y aj diverges. Prove that Y diverges.
Hint: first show that if it converges, then aj —> 0.
16. Consider the series Y — • Since the signs alternate and the absolute
values of the terms decrease and converge to zero, this series satisfies the
hypotheses of the Alternating Series Theorem in project 1. Therefore, it
converges. However, it is conditionally convergent since the harmonic
series diverges.
(a) Prove that the sum of the positive terms diverges to infinity. Prove
that the sum of the absolute values of the negative terms diverges
to infinity.
(b) Let a be a given real number. Rearrange the series by choosing only
positive terms, starting at the beginning of the series, until the sum
is greater than a. Choose as many negative terms as needed to bring
the sum below a. Continue in this manner and use the fact that the
terms in the series converge to zero to show that this can be done in
such a way that the sequence of partial sums converges to a.
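The rearrangement procedure of part (b) can be simulated. The sketch below assumes the series is the alternating harmonic series with positive terms 1, 1/3, 1/5, ... and negative terms -1/2, -1/4, ...; the sign convention and the function name are assumptions. It adds the next unused positive term while the running sum is at or below the target and the next unused negative term otherwise.

```python
def rearranged_partial_sums(target, n_terms):
    """Partial sums of a rearrangement of 1 - 1/2 + 1/3 - ... steered toward target."""
    pos = 1   # denominator of the next unused positive term, +1/pos
    neg = 2   # denominator of the next unused negative term, -1/neg
    total, sums = 0.0, []
    for _ in range(n_terms):
        if total <= target:
            total += 1.0 / pos
            pos += 2
        else:
            total -= 1.0 / neg
            neg += 2
        sums.append(total)
    return sums

for a in (1.5, 0.0):
    sums = rearranged_partial_sums(a, 20000)
    print(a, sums[9], sums[99], sums[-1])
```

Because the terms tend to zero, each overshoot is no larger than the term that caused it, so the partial sums settle toward the target. Running the sketch is an illustration of this, not a proof.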
converges pointwise to $f(x)$ on $[a, b]$. We have already seen that pointwise
convergence does not give us much control over the limiting function.
Thus, we want conditions which guarantee that the partial sums
Sn(x) converge uniformly to f(x), in which case we say that series (9)
converges uniformly.
$$|S_m(x) - S_n(x)| = \Big| \sum_{j=n+1}^{m} f_j(x) \Big| \le \sum_{j=n+1}^{m} |f_j(x)| \le \sum_{j=n+1}^{m} M_j \le \varepsilon.$$
6.3 The Weierstrass M-test 239
Thus, for each $x \in E$, $S_n(x)$ is a Cauchy sequence of numbers and therefore
converges to a limit $f(x)$, which is, by definition, the sum of the
series. Letting $m \to \infty$ in the above inequality, we find
< £
for all $x \in [a, b]$. Thus the right-hand side of (11) converges uniformly to
the left-hand side. □
OO
/'(*) = }™lSn(X) =
Example 1 Let $\{a_n\}$ be a sequence such that $|a_n| \le C/n^p$ where $p > 1$.
Define a function $f(x)$ by
$$f(x) = \sum_{n=1}^{\infty} a_n \sin nx.$$
Since $|a_n \sin nx| \le C/n^p$, the series converges uniformly by the Weierstrass
M-test. Thus, $f$ is well defined and continuous on $\mathbb{R}$ since $a_n \sin nx$
is continuous for each $n$. By Theorem 6.3.2, we can integrate $f$ by integrating
the series term by term. So, for example,
$$\int_0^{2\pi} f(x)\,dx = \sum_{n=1}^{\infty} a_n \int_0^{2\pi} \sin nx\,dx = 0.$$
Since |nan cos nx\ < C/np~l, this series converges uniformly by the M-
test because p — 1 > 1. Thus, by Theorem 6.3.3, / is continuously differ¬
entiable and
OO
Series like these are called Fourier series. We study Fourier series in
Chapter 9.
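The uniform bound in Example 1 can be checked numerically. The following is only a sketch: the coefficients $a_n = 1/n^3$, the sample grid, and the function names are illustrative assumptions, not part of the text. It compares the largest observed gap between two partial sums with the tail $\sum_{n=N+1}^{M} 1/n^p$ that the M-test uses as a bound.

```python
import math

def partial_sum(x, N, p=3.0):
    """S_N(x) = sum_{n=1}^{N} a_n sin(n x), with the assumed choice a_n = 1/n^p."""
    return sum(math.sin(n * x) / n**p for n in range(1, N + 1))

def tail_bound(N, M, p=3.0):
    """sum_{n=N+1}^{M} 1/n^p, which bounds |S_M(x) - S_N(x)| for every x."""
    return sum(1.0 / n**p for n in range(N + 1, M + 1))

# Sample points in [0, 2*pi]; the bound does not depend on x.
xs = [2 * math.pi * k / 200 for k in range(201)]
N, M = 20, 400
sup_diff = max(abs(partial_sum(x, M) - partial_sum(x, N)) for x in xs)
bound = tail_bound(N, M)
print(sup_diff, bound)
```

The printed gap should never exceed the printed bound, whatever grid is used, which is the content of the uniform estimate.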
Define $h_n = \pm\tfrac{1}{2}4^{-n}$, where we choose the plus or minus sign for each
n so that there is no integer strictly between $4^n x$ and $4^n x + 4^n h_n$. Fix n.
Then,
0, if j > n
g(4j{x + hn))- g{4Jx)
±4n, if j = n
hr>.
4j, if 0 < j < n — 1.
n—1
> T - (17)
3=0
Problems
1. Show that the series $\sum_{j=0}^{\infty} x^j$ converges uniformly in the interval $[-\beta, \beta]$
if $|\beta| < 1$.
2. (a) Show that the series $\sum_{j=0}^{\infty} e^{-jx} x^j$ converges uniformly on $[0, \infty)$.
Hint: how large can $xe^{-x}$ be for $x \ge 0$?
(b) Compute the sum of the series.
, 00 OO
= + i)V.
is**
3=0 i=o
°° i
f{x) m
8. Let N be a positive integer and suppose that $p > N$. Let $\{a_j\}$ be a sequence of numbers satisfying $|a_j| \le C/j^p$ for some constant C. Prove
that
OO
OO
Compute /o f(x)dx.
OO
defines a continuous function for $x > 1$. The function $\zeta(x)$ is called the
Riemann zeta function.
13. Let Q be a rectangle $[a, b] \times [c, d]$ in the plane. Let $f_j(x, y)$ be a sequence
of continuous functions on Q that satisfy $|f_j(x, y)| \le M_j$ for all $(x, y) \in Q$.
Suppose that $\sum_{j=0}^{\infty} M_j < \infty$. Prove that
OO
f{n)(x0)
...+ n\
3=0
This raises the natural question of whether the infinite series equals the
function itself, that is, whether
$$f(x) = \sum_{j=0}^{\infty} \frac{f^{(j)}(x_0)}{j!}(x - x_0)^j.$$
The sum on the right is called the Taylor series of the function f. In the
special case when $x_0 = 0$ the series is called the Maclaurin series for
f. There are really two separate important questions here. Where does
a series of the form $\sum a_j (x - x_0)^j$ converge? Such a series is called a
power series. And if the Taylor series of a function f converges, does it
converge to f(x)? We begin with the first question, which has a straightforward answer.
Let $\rho = \limsup |a_j|^{1/j}$ and define $R = 1/\rho$ if $\rho$ is finite and nonzero. If
$\rho = 0$, we define $R = \infty$, and if $\rho = \infty$, we define $R = 0$. R is called the
radius of convergence of the series
$$\sum_{j=0}^{\infty} a_j (x - x_0)^j. \qquad (19)$$
$$\begin{aligned}
\limsup \left(|a_j||x - x_0|^j\right)^{1/j} &= \limsup \left(|x - x_0||a_j|^{1/j}\right) \qquad (20)\\
&= |x - x_0| \limsup |a_j|^{1/j} \qquad (21)\\
&= |x - x_0|/R. \qquad (22)
\end{aligned}$$
In the second step we used problem 4 of Section 6.1. Thus, if $|x - x_0| < R$, (19) converges by the root test (Theorem 6.2.3), and if $|x - x_0| > R$,
(19) diverges by the root test. If $R = \infty$, then (21) is zero, so the series
converges for all x by the root test. Finally, if $R = 0$, the series diverges
for $x \ne x_0$ by the root test.
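It remains to show the uniform convergence. Suppose that $r < R$,
and choose $\gamma$ so that $\frac{r}{R} < \gamma < 1$. If $|x - x_0| \le r$, then by (22) we have
$$\limsup \left(|a_j||x - x_0|^j\right)^{1/j} \le \frac{r}{R} < \gamma.$$
Therefore, there is a J, independent of x, so that
$$\left(|a_j||x - x_0|^j\right)^{1/j} < \gamma$$
for all $j \ge J$. Now choose $M_j = \gamma^j$ for $j \ge J$ and define $M_j = |a_j| r^j$ for $j < J$. Then
$$|a_j (x - x_0)^j| \le M_j$$
for all j and all $x \in [x_0 - r, x_0 + r]$. Since $\gamma < 1$, the series $\sum M_j$ converges,
so, by the Weierstrass M-test, (19) converges uniformly on $[x_0 - r, x_0 + r]$.
□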
Example 1 Consider the geometric series $\sum_{j=0}^{\infty} x^j$. Since $a_j = 1$ for all
j, it is clear that $\limsup |a_j|^{1/j} = 1$, so $R = 1$. Thus Theorem 6.4.1 confirms what we already know about the geometric series, namely, that it
converges for $|x| < 1$ and diverges for $|x| > 1$. The theorem gives no
6.4 Power Series 247
information about $x = \pm R$, but we can see in this case that the series diverges for $x = \pm 1$. We have already computed that the sum of the series
is $f(x) = \frac{1}{1-x}$ for $|x| < 1$. This is a perfectly nice infinitely differentiable
function everywhere on R except for $x = 1$, but the series $\sum_{j=0}^{\infty} x^j$ equals
the function only on the interval $(-1, 1)$. To represent f(x) around the
point $x = 3$, we write
$$\frac{1}{1-x} = -\frac{1}{2}\,\frac{1}{1 + \frac{x-3}{2}} = -\frac{1}{2}\sum_{j=0}^{\infty} (-1)^j \left(\frac{x-3}{2}\right)^j,$$
which converges if $|x - 3| < 2$. This is the Taylor series for f around the
point $x_0 = 3$, as one can check by computing the derivatives of f at 3,
and it represents the function in the interval (1,5). What we see here is
a general phenomenon: the radius of convergence of the Taylor series
is the distance from xQ to the nearest “singularity” of the function. The
reason for this will be clarified when we study analytic functions of a
complex variable in Chapter 8.
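The expansion of $1/(1-x)$ about the point 3 can be tested numerically. The sketch below is an illustration only; the function names, the number of terms, and the sample points are assumptions made for the check, all chosen inside the interval (1, 5) where the series is claimed to represent the function.

```python
def f(x):
    """The function 1/(1 - x)."""
    return 1.0 / (1.0 - x)

def taylor_about_3(x, terms=200):
    """Partial sum of -(1/2) * sum_j (-1)^j ((x - 3)/2)^j."""
    u = (x - 3.0) / 2.0
    return -0.5 * sum((-u) ** j for j in range(terms))

# Points with |x - 3| < 2, where the series should agree with f.
for x in (1.5, 2.0, 3.0, 4.0, 4.5):
    print(x, f(x), taylor_about_3(x))
```

At a point such as x = 0.5, which lies outside (1, 5), the partial sums grow without bound even though f(0.5) is perfectly well defined.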
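Proof. We shall give a sketch of the proof. Let $r < R$. We know that
$f(x) = \sum_{j=0}^{\infty} a_j(x - x_0)^j$ and that the powers of $x - x_0$ are continuously
differentiable. Thus, Theorem 6.3.3 guarantees that f is continuously
differentiable and the derivative can be computed term by term if we
can show that the series of term-by-term derivatives
$$\sum_{j=1}^{\infty} j a_j (x - x_0)^{j-1} \qquad (23)$$
converges uniformly on $[x_0 - r, x_0 + r]$. To find the radius of convergence of (23), we compute
$$\limsup_{j\to\infty} \left(j|a_j|\right)^{\frac{1}{j-1}} = \lim_{j\to\infty} j^{\frac{1}{j-1}} \,\limsup_{j\to\infty} \left(|a_j|^{\frac{1}{j}}\right)^{\frac{j}{j-1}} = \limsup_{j\to\infty} |a_j|^{\frac{1}{j}}$$
since $j^{\frac{1}{j-1}} \to 1$. Thus (23) has the same radius of convergence as (19).
Therefore, Theorem 6.4.1 implies that (23) converges uniformly on $[x_0 - r, x_0 + r]$. Thus, by Theorem 6.3.3, f(x) is continuously differentiable on
$(x_0 - r, x_0 + r)$ and
$$f'(x) = \sum_{j=1}^{\infty} j a_j (x - x_0)^{j-1}. \qquad (24)$$
Since r was an arbitrary number less than R, (24) holds on $(x_0 - R, x_0 + R)$.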
We now apply the same idea to show that $f''$ exists and that
$$f''(x) = \sum_{j=2}^{\infty} j(j-1) a_j (x - x_0)^{j-2} \qquad (25)$$
on $(x_0 - R, x_0 + R)$. The crucial step is to show that (25) has the same
radius of convergence as (19). The argument, which is similar to that
above, uses the fact that $\lim_{j\to\infty} (j(j-1))^{\frac{1}{j-2}} = 1$. Continuing in this
manner, we prove that
$$f^{(n)}(x) = \sum_{j=n}^{\infty} j(j-1)\cdots(j-n+1)\, a_j (x - x_0)^{j-n} \qquad (26)$$
by showing that the series on the right-hand side has the same radius of
convergence as (19). We omit the details, which are very similar to those
outlined above. Notice that if we evaluate both sides of (26) at $x_0$,
we find that
$$f^{(n)}(x_0) = n!\, a_n$$
since the other terms on the right vanish when $x = x_0$. Thus $a_j = f^{(j)}(x_0)/j!$,
and so the original series (19) is just the Taylor series for f expanded
about the point $x_0$. □
$$e^x = \sum_{j=0}^{\infty} \frac{x^j}{j!} \qquad (27)$$
Thus, for all real numbers $y = \ln x$, we have $\psi'(y) = \psi(y)$. By the uniqueness of the solutions of ordinary differential equations (Theorem 7.1.1),
$\psi(x) = Ce^x$ for some constant C. Since $\ln 1 = 0$, we know that $\psi(0) = 1$,
so $C = 1$. Finally, problem 8 in Section 4.5 shows that the inverse function to the natural log satisfies $\psi(x + y) = \psi(x)\psi(y)$, which proves that
$e^x e^y = e^{x+y}$.
since (28)
00 r2 j
cos x
5l_1|w (29)
AM - A0) i
Problems
1. Find the radius of convergence of the series $\sum a_j x^j$ for each of the following choices of the coefficients $a_j$:
2. Given the following conditions on the coefficients $\{a_j\}$, what can you say
about the radius of convergence of $\sum a_j x^j$?
(a) $0 < m_1 \le a_j \le m_2$ for some constants $m_1$ and $m_2$.
(b) $2^j \le a_j \le 3^j$.
(c) $j^2 \le a_j \le j^3$.
3. What do you think is the radius of convergence of the Taylor series for
$\ln x$ expanded about $x_0 = 1$? Find the series and prove it. What do you
think is the radius of convergence of the Taylor series for $\ln x$ expanded
about 4? Find it and prove it.
7. (a) Use power series to evaluate directly the limits in problems 3(a) and
3(c) of Section 4.3.
9. Is there a power series which converges to the function f(x) = \x\ for all
x?
11. Find the radius of convergence of the series $\sum_{j=0}^{\infty} (j+1)(j+2)x^j$. Find
the function to which the series converges.
12. Give examples which show that a Maclaurin series can either converge
or diverge at the points -R and R (independently) where R is the radius
of convergence.
13. Suppose that we didn't know how to solve the differential equation
$y'(t) = y(t)$, with initial condition $y(0) = y_0$. Let's try to write a power series $y(t) = \sum_{j=0}^{\infty} c_j t^j$ for the unknown solution. By differentiating the series.
$$c_1 = c_0,$$
$$2c_2 = c_1,$$
$$3c_3 = c_2,$$
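The coefficient relations above can be iterated mechanically. The following sketch assumes the pattern continues as $j c_j = c_{j-1}$ with $c_0 = y_0$; the function names, the initial value 2.0, and the number of terms are illustrative choices. It sums the resulting series and compares it with $y_0 e^t$.

```python
import math

def series_coefficients(y0, n_terms):
    """Coefficients c_0, ..., c_{n_terms-1} from c_0 = y0 and j*c_j = c_{j-1}."""
    c = [float(y0)]
    for j in range(1, n_terms):
        c.append(c[-1] / j)
    return c

def evaluate(c, t):
    """Value of the polynomial sum_j c[j] * t**j."""
    return sum(cj * t**j for j, cj in enumerate(c))

c = series_coefficients(y0=2.0, n_terms=30)
print(evaluate(c, 1.0), 2.0 * math.e)
```

The recursion gives $c_j = y_0/j!$, so the series is $y_0$ times the exponential series.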
We regard two pairs as equal if and only if both components are equal.
It is straightforward to check that addition and multiplication in C are
commutative and associative. That is, if $z_1 = (x_1, y_1)$, $z_2 = (x_2, y_2)$, and
$z_3 = (x_3, y_3)$, then
$$z_1 + z_2 = z_2 + z_1$$
$$(z_1 z_2) z_3 = z_1 (z_2 z_3).$$
holds. The element (0,0) is called the zero of C and (1,0) is called the
identity of C since
$$z + (0,0) = z$$
$$z(1,0) = z$$
6.5 Complex Numbers 253
$$x_1 x - y_1 y = x_2$$
$$y_1 x + x_1 y = y_2.$$
Thus, if we give the special complex number (0,1) the name i, we see
that every complex number z can be written in the form
$$z = x + iy$$
where the real number x is called the real part of z, $x = \mathrm{Re}(z)$, and the
real number y is called the imaginary part of z, $y = \mathrm{Im}(z)$.
$$|z| = \sqrt{x^2 + y^2},$$
so $|z|$ is just the Euclidean distance from the point (x, y) to the origin.
Therefore, if $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2$, then $|z_1 - z_2|$ is the Euclidean
distance between $(x_1, y_1)$ and $(x_2, y_2)$. In particular, the set of z which
satisfy $|z - z_1| = c$ is a circle of radius c about $(x_1, y_1)$. The absolute value
satisfies several simple properties:
The second inequality follows from the triangle inequality, and the third
statement is easy to verify directly (problem 2). To prove the triangle
inequality, we square the left-hand side:
Taking the square root of both sides gives the triangle inequality. In going from (33) to (34) we used the Cauchy-Schwarz inequality (problem
10(a) in Section 2.2). For any complex number $z = x + iy$, we define the
complex conjugate $\bar{z}$ by $\bar{z} = x - iy$ and note that $|z|^2 = z\bar{z}$.
Thus, $z_n \to z$ if and only if, given any circle about z, the sequence
gets inside the circle and stays inside after finitely many terms. If $z_n = x_n + iy_n$ and $z = x + iy$, then
from which it follows that $z_n \to z$ if and only if $x_n \to x$ and $y_n \to y$. If
for every given M there is an N so that $n \ge N$ implies $|z_n| \ge M$, we say
that $\{z_n\}$ converges to $\infty$. Analogously to the real case, we say that the
sequence $\{z_n\}$ is a Cauchy sequence if, given $\varepsilon > 0$, there is an N such
that
$$|z_n - z_m| \le \varepsilon \quad \text{if } n \ge N \text{ and } m \ge N.$$
we know that $|x_n - x_m| \le |z_n - z_m|$ and $|y_n - y_m| \le |z_n - z_m|$. It follows that
$\{x_n\}$ and $\{y_n\}$ are Cauchy sequences in R. By Theorem 2.4.2, $\{x_n\}$ and
$\{y_n\}$ converge to finite limits x and y, respectively. If we define $z = (x, y)$,
then (36) shows that $z_n \to z$. The converse argument is similar. □
$$S_n = \sum_{j=1}^{n} a_j$$
Many of the theorems of Section 6.2 are true for infinite series of complex numbers with no change in proof. The geometric series (Example
1) converges if a is a complex number satisfying $|a| < 1$ and the sum
of the geometric series is $\frac{1}{1-a}$. This follows from (31), which implies that
$|a^n| = |a|^n \to 0$. Theorem 6.2.1 holds unchanged. The comparison test
□ Theorem 6.5.2 Let $\{a_j\}$, $\{b_j\}$, and $\{c_j\}$ be sequences of complex numbers such that $|a_j| \le |b_j| \le |c_j|$ for all j.
The hypotheses, conclusions, and proofs of the ratio and root tests (Theorems 6.2.3 and 6.2.4) refer only to the absolute values $|a_j|$, so they go
over without change to the complex case.
Suppose now that $\{a_j\}$ is a sequence of complex numbers and $z_0$ is a
given complex number. We want to ask for which $z \in C$ the power series
OO
Theorem 6.5.3 shows that the natural sets of convergence for complex
power series are disks. The intersection of a disk with the real subset R of
the complex numbers C is an interval or a point. This is why the natural
domains of convergence of power series of a real variable x are intervals
or points. In Chapter 8 we return to the question of which functions from
C to C can be expressed by convergent power series. Using complex
series, we can define the exponential function.
$$e^z = \sum_{j=0}^{\infty} \frac{z^j}{j!} \qquad (39)$$
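This can be proven by multiplying out the series for $e^{z_1}$ and $e^{z_2}$ and collecting terms (problem 12) or by using the fact that $e^z$ is an entire analytic
function, and (40) holds for real x and y (see problem 9 of Section 8.3). If
$\theta$ is a real number then
$$e^{i\theta} = \sum_{j=0}^{\infty} \frac{(i\theta)^j}{j!} \qquad (41)$$
$$= \sum_{j=0}^{\infty} \frac{(-1)^j\theta^{2j}}{(2j)!} + i\sum_{j=0}^{\infty} \frac{(-1)^j\theta^{2j+1}}{(2j+1)!} \qquad (42)$$
$$= \cos\theta + i\sin\theta.$$
It follows that
$$\cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2}, \qquad \sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i}.$$
These representations of $\sin\theta$ and $\cos\theta$ will be very useful when we consider Fourier series in Chapter 9.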
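The exponential series and the resulting formulas for $\cos\theta$ and $\sin\theta$ can be checked with complex arithmetic. This is a sketch under stated assumptions: the helper `exp_series`, the truncation at 60 terms, and the sample angle 0.7 are illustrative choices.

```python
import math

def exp_series(z, terms=60):
    """Partial sum of sum_j z^j / j!, accumulated term by term."""
    total, term = 0j, 1 + 0j
    for j in range(terms):
        total += term
        term *= z / (j + 1)   # turns z^j/j! into z^(j+1)/(j+1)!
    return total

theta = 0.7
lhs = exp_series(1j * theta)
rhs = complex(math.cos(theta), math.sin(theta))
cos_from_exp = (exp_series(1j * theta) + exp_series(-1j * theta)) / 2
sin_from_exp = (exp_series(1j * theta) - exp_series(-1j * theta)) / (2j)
print(lhs, rhs)
print(cos_from_exp, sin_from_exp)
```

Both `cos_from_exp` and `sin_from_exp` come out as complex numbers whose imaginary parts are zero up to rounding.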
which is called the polar form of z. Given two complex numbers, $z_1$ and
$z_2$, we can compute their product by using (40) and the polar form:
$$z_1 z_2 = |z_1|e^{i\theta_1}|z_2|e^{i\theta_2} = |z_1||z_2|e^{i(\theta_1 + \theta_2)}.$$
2gi2#i
U)
Two complex numbers are equal if their absolute values are equal and
their arguments are equal or differ by an integral multiple of $2\pi$. Therefore $|w| = |z|^{1/2}$ and
$$\theta + m2\pi = 2\theta_1.$$
$$\theta + m2\pi = n\theta_1.$$
Solving for $\theta_1$ and letting m run through the integers, we find exactly n
distinct choices, modulo $2\pi$, for $\theta_1$:
$$\theta_1 = \frac{\theta}{n} + \frac{m2\pi}{n}, \qquad m = 0, 1, 2, \ldots, n-1.$$
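The formula for the n admissible arguments translates directly into a computation of the nth roots. The sketch below is illustrative: the function name `nth_roots` and the sample value $z = -8$ with $n = 3$ are assumptions, and the modulus $|z|^{1/n}$ is taken as the positive real root.

```python
import cmath
import math

def nth_roots(z, n):
    """The n complex numbers w with w**n == z, for nonzero z."""
    r, theta = cmath.polar(z)              # z = r * e^{i*theta}
    modulus = r ** (1.0 / n)
    return [cmath.rect(modulus, theta / n + 2 * math.pi * m / n)
            for m in range(n)]

roots = nth_roots(complex(-8, 0), 3)
for w in roots:
    print(w, w**3)
```

Taking m outside the range 0, ..., n-1 only repeats roots already found, since the arguments then differ by a multiple of $2\pi$.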
Thus, every complex number has exactly n roots. For example, the
■(n-iy.
Problems
1. Verify from definitions (30) and (31) that complex addition is associative
and complex multiplication is commutative.
(a) $z_n + w_n \to z + w$.
(b) $z_n \cdot w_n \to z \cdot w$.
(c) $\beta z_n \to \beta z$.
(d) for each positive integer m, $z_n^m \to z^m$.
9. Let $\{a_j\}_{j=0}^{m}$ be complex numbers and for each z define $p(z) = a_0 + a_1 z + a_2 z^2 + \ldots + a_m z^m$. Suppose that $z_n \to z$. Prove that
Find power series representations for sin z and cos z and verify that each
series has radius of convergence R = oo. Are sinz and cosz bounded
functions on C?
11. Prove that for $|z| > 1$,
$$\frac{1}{1+z} = \sum_{j=1}^{\infty} \frac{(-1)^{j+1}}{z^j}.$$
12. Prove formula (40) by multiplying out the series on the right (using The¬
orem 6.2.6) and show by regrouping (using Theorem 6.2.5) that you get
the series on the left.
$$P = \prod_{j=1}^{\infty} a_j$$
If {Pn} converges to zero (or diverges to oo) we say that the infinite prod¬
uct diverges to zero (or oo). We require the an to be nonzero because if
one an were zero, then all partial products beyond n would be zero au¬
tomatically, no matter how wildly the sequence {an} behaved. Similarly,
if P were allowed to be zero, then “convergence” would not put strong
conditions on the behavior of aj for large j. For example, the sequence in
which aj = for j odd and aj = -■ for j even diverges to zero. We need
a criterion on the partial products {Pn} which guarantees convergence
to a nonzero limit.
$$\left|\prod_{j=m}^{n} a_j - 1\right| \le \varepsilon \quad \text{for all } n \ge m \ge N. \qquad (44)$$
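Proof. Suppose $P_n \to P \ne 0$, and let $\varepsilon > 0$ be given. Since the absolute value is a continuous function (problem 7(a) of Section 6.5), we can
choose $N_1$ so that $|P_n| \ge \frac{|P|}{2}$ for $n \ge N_1$. Thus,
$$|P_n - P_{m-1}| = \left|\prod_{j=1}^{n} a_j - \prod_{j=1}^{m-1} a_j\right| = \left|\prod_{j=1}^{m-1} a_j\right|\left|\prod_{j=m}^{n} a_j - 1\right| \ge \frac{|P|}{2}\left|\prod_{j=m}^{n} a_j - 1\right|.$$
Therefore,
$$\left|\prod_{j=m}^{n} a_j - 1\right| \le \frac{2}{|P|}|P_n - P_{m-1}| \quad \text{for all } n \ge m > N_1.$$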
Since $\{P_n\}$ is a Cauchy sequence, the right-hand side can be made $\le \varepsilon$ by
choosing $N > N_1$ large enough and requiring $n \ge m \ge N$. This proves
(44).
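Conversely, suppose that condition (44) holds. Then we can choose
N so that $\left|\prod_{j=m}^{n} a_j - 1\right| \le \frac{1}{2}$ for $n \ge m \ge N$. This implies, in particular,
that
$$\frac{1}{2} \le \left|\prod_{j=m}^{n} a_j\right| \le \frac{3}{2}.$$
Thus,
$$|P_n - P_m| = \left|\prod_{j=1}^{N} a_j\right|\left|\prod_{j=N+1}^{m} a_j\right|\left|\prod_{j=m+1}^{n} a_j - 1\right| \le \left|\prod_{j=1}^{N} a_j\right|\left(\frac{3}{2}\right)\left|\prod_{j=m+1}^{n} a_j - 1\right|$$
for $n > m > N$; thus, it is clear that $|P_n - P_m|$ can be made small by
choosing n and m sufficiently large. Therefore, $\{P_n\}$ converges. Furthermore,
$$\left|\prod_{j=1}^{n} a_j\right| = \left|\prod_{j=1}^{N} a_j\right|\left|\prod_{j=N+1}^{n} a_j\right| \ge \frac{1}{2}\left|\prod_{j=1}^{N} a_j\right| > 0$$
for $n > N$, so the limit is nonzero.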
$$\prod_{j=1}^{\infty} (1 + b_j) \qquad (45)$$
□ Theorem 6.6.2 Suppose that $b_j \ge 0$ for all j. Then $\prod(1 + b_j)$ converges
if and only if $\sum b_j$ converges.
6.6 Infinite Products and Prime Numbers 263
for all n. Since $\prod(1 + b_j)$ converges the sequence on the right side is
bounded. Thus, the sequence of partial sums on the left is bounded and
therefore converges because it is increasing. □
$$\frac{1}{2}\cdot\frac{2}{3}\cdot\frac{3}{4}\cdot\frac{4}{5}\cdots\frac{n-1}{n} = \frac{1}{n}$$
Therefore, the first partial product approaches zero, and the second partial product converges to $\frac{1}{2}$ as $n \to \infty$.
□ Theorem 6.6.3 If $\prod(1 + b_j)$ converges absolutely, then $\prod(1 + b_j)$ converges.
$$\left|\prod_{j=m}^{n} (1 + b_j) - 1\right| \le \prod_{j=m}^{n} (1 + |b_j|) - 1. \qquad (47)$$
Since $\prod(1 + b_j)$ converges absolutely, Theorem 6.6.1 guarantees that the
right-hand side of (47) can be made smaller than any given $\varepsilon > 0$ by
choosing n and m large enough. Thus, the same is true of the left-hand
side, which proves, by Theorem 6.6.1, that the product converges. □
52 j2 < oo. However, the product IIjLi(l + j^~) does not converge
absolutely since 52 j — 00 •
$$\pi(n) \le \frac{n}{2} + 1,$$
and a similar estimate holds for n odd. That is, less than half (approximately) of the integers $\le n$ can be prime. Of course the multiples of 3 can
not be prime either, and they constitute approximately $\frac{1}{3}$ of the integers
$\le n$. This suggests that less than $\frac{1}{6}$ of the integers $\le n$ can be prime since
$1 - \frac{1}{2} - \frac{1}{3} = \frac{1}{6}$. However, there is a problem with this reasoning since
some multiples of 2 are also multiples of 3 and vice versa, and so we've
subtracted too much. Nevertheless, this suggests that as n gets larger,
$\pi(n)$ becomes smaller relative to n.
1 - -) (48)
PJ
(49)
- ftps)
(50)
since each term is a sum of a geometric series. Let M > 0 be given and
choose N so that
which we may do since the harmonic series diverges to $\infty$. By the fundamental theorem of arithmetic, each $n \le N$ can be written uniquely as
a finite product of powers of primes. Choose K large enough so that the
finite product (49) is taken over all the primes that occur in the product
representations of all $n \le N$. Then, each term $\frac{1}{n}$ is contained in the sum
obtained by multiplying out the series in the finite product (50). Note
that we can multiply out these series because they are absolutely convergent (Theorem 6.2.5). Because the rest of the terms obtained from the
product are positive, we see that
N K
l
m < £n
n=1
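Lemma 2 For any real x, let [x] denote the greatest integer $\le x$. Then,
$$w(m, r) = m - \sum_{i \le r}\left[\frac{m}{p_i}\right] + \sum_{i_1 < i_2 \le r}\left[\frac{m}{p_{i_1}p_{i_2}}\right] - \cdots + (-1)^r\left[\frac{m}{p_1 p_2 \cdots p_r}\right] \qquad (51)$$
for all m and r.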
$\left[\frac{m}{p_{i_1}p_{i_2}\cdots p_{i_k}}\right]$
is just the number of multiples of $p_{i_1}p_{i_2}\cdots p_{i_k}$ which are $\le m$. Thus, a fixed
integer n < m is counted in the term
m
£ -PilPl2 ”•Pile -
two primes, and so forth. So, the net number of times that n is counted
(including minus signs) is
the sum on the right of (51) is equal to the number of integers < m that
are not multiples of the first r primes. That is, the sum equals w(m,r).
□
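Formula (51) can be compared with a direct count. In the sketch below, $w(m, r)$ is taken to be the number of integers n with $1 \le n \le m$ that are not multiples of any of the first r primes, as described in the proof; the function names and the sample values of m and r are illustrative assumptions.

```python
from itertools import combinations
from math import prod

def first_primes(r):
    """The first r primes, found by trial division."""
    primes, n = [], 2
    while len(primes) < r:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes

def w_direct(m, r):
    """Count n in 1..m divisible by none of the first r primes."""
    ps = first_primes(r)
    return sum(1 for n in range(1, m + 1) if all(n % p for p in ps))

def w_formula(m, r):
    """Right-hand side of (51); the integer quotient m // d plays the role of [m/d]."""
    ps = first_primes(r)
    total = 0
    for k in range(r + 1):
        for subset in combinations(ps, k):
            total += (-1) ** k * (m // prod(subset))
    return total

for m in (10, 97, 1000):
    for r in (1, 2, 3, 4):
        print(m, r, w_direct(m, r), w_formula(m, r))
```

The loop over subsets visits $2^r$ products, which is the count of terms used later in the proof of Theorem 6.6.4.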
□ Theorem 6.6.4 Let $\pi(m)$ denote the number of primes $\le m$. Then,
$$\lim_{m \to \infty} \frac{\pi(m)}{m} = 0. \qquad (52)$$
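Proof. The set of primes $\le m$ is contained in the union of the first r
primes and the set of numbers $\le m$ that are not multiples of the first r
primes. Thus, for every m and r
$$\pi(m) \le r + w(m, r).$$
Since there are
$$1 + \binom{r}{1} + \binom{r}{2} + \cdots + \binom{r}{r} = (1 + 1)^r = 2^r$$
terms in expression (51) for $w(m, r)$, and removing the brackets changes each term by less than 1, we have
$$\pi(m) \le r + 2^r + m - \sum_{i \le r}\frac{m}{p_i} + \sum_{i_1 < i_2 \le r}\frac{m}{p_{i_1}p_{i_2}} - \cdots + (-1)^r\frac{m}{p_1 p_2 \cdots p_r}$$
$$= r + 2^r + m\prod_{i=1}^{r}\left(1 - \frac{1}{p_i}\right).$$
This estimate is true for all positive integers m and r. Now, let $\varepsilon > 0$
be given. By Lemma 1 we can choose an r (henceforth fixed) so that
$$\prod_{i=1}^{r}\left(1 - \frac{1}{p_i}\right) \le \frac{\varepsilon}{2}.$$
Then $\pi(m) \le r + 2^r + m\varepsilon/2$, or
$$\frac{\pi(m)}{m} \le \frac{r}{m} + \frac{2^r}{m} + \frac{\varepsilon}{2}$$
for all m. For m large enough,
$$\frac{r}{m} + \frac{2^r}{m} < \frac{\varepsilon}{2}$$
and so
$$\frac{\pi(m)}{m} < \varepsilon,$$
which proves the theorem. □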
$$\lim_{m \to \infty} \frac{\pi(m)}{m/\ln m} = 1,$$
Problems
1. Prove that
1
j(j + !)/ 3'
2. C
"(I < “a
n nz
for some constant C.
(b) Use the estimate in (a) to prove the convergence of the infinite prod¬
uct
OO
z,
IU
=1
3
+ -)e-.
4. Let $\{a_j\}$ be a sequence of real numbers such that $|a_j| < 1$ for each j. Prove
that $\prod_{j=1}^{\infty}(1 + a_j)$ converges if and only if $\sum \ln(1 + a_j)$ converges. Hint:
relate the partial sum and product.
n (-i)j
j
IU-M
s e S
C(*)
8. (a) Show that for all nonintegral x, the following infinite product con¬
verges:
(b) Show that the limiting function is continuous and that by defining it
to be zero at the integers it can be extended to be a continuous func¬
tion on R. Hint: show that the partial products converge uniformly
on appropriate sets.
(c) Generate the graphs of several partial products and use the graphs
to guess what the limiting function is.
9. (a) Write a computer program which computes $\pi(n)$ for any given n.
(b) Use the program to provide numerical evidence for Theorem 6.6.4
and the Prime Number Theorem.
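One possible shape for such a program is sketched here. It is not the only approach: it uses the sieve of Eratosthenes, and the function name `prime_counts` and the upper limit 100000 are assumptions chosen for illustration. The two ratios printed correspond to Theorem 6.6.4 and to the Prime Number Theorem.

```python
import math

def prime_counts(limit):
    """Return a list pi with pi[n] = number of primes <= n, for 0 <= n <= limit (limit >= 1)."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    pi, count = [], 0
    for n in range(limit + 1):
        count += is_prime[n]
        pi.append(count)
    return pi

pi = prime_counts(100000)
for m in (10, 100, 1000, 10000, 100000):
    # pi(m)/m should drift toward 0; pi(m)/(m/ln m) should drift toward 1.
    print(m, pi[m], pi[m] / m, pi[m] / (m / math.log(m)))
```

The second ratio approaches 1 quite slowly, so the numerical evidence for the Prime Number Theorem at these sizes is suggestive rather than sharp.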
Projects
\S ~~ ‘Snl 5: -Sn+i-
Note that this gives us a very easy way to estimate how close a
partial sum is to the sum of an alternating series. How many terms
of the alternating geometric series do we have to take to be within
$10^{-4}$ of the limit?
(f) Prove that for each x we can choose a J so that the Maclaurin series
for $\sin x$ and $\cos x$ satisfy the hypotheses for the Alternating Series
Theorem for $j \ge J$. Use the alternating series remainder to estimate
how close $1 - \frac{x^2}{2} + \frac{x^4}{4!}$ is to $\cos x$ on the interval $[-1, 1]$. Compare
your estimate to the one you get by using Taylor's theorem.
2. Polynomials are easy to integrate and power series are “almost” polyno¬
mials. This gives us a new way to approximate certain integrals.
(a) What is the Maclaurin series for the function $e^{-x^2}$?
(b) Use the first three nonzero terms of the Maclaurin series to approx¬
imate the integral fQ e~x2 dx.
(c) Using the alternating series error estimate from Project 1, estimate
the error in your approximation.
(d) How many terms of the series would you have to take to be sure
that your estimate of the integral is within $10^{-4}$ of the correct answer? Compare the computational effort involved with the numerical methods discussed in Section 3.4.
(e) Suppose that we want to estimate the integral $\int_0^{\infty} e^{-x^2}\,dx$ to within
$10^{-4}$. First, choose N so that $\int_N^{\infty} e^{-x^2}\,dx \le 10^{-4}/2$. Hint: $e^{-x^2} \le e^{-x}$ for $x \ge 1$. Then estimate $\int_0^{N} e^{-x^2}\,dx$. Note: The exact value of
$\int_0^{\infty} e^{-x^2}\,dx$ can be computed analytically. See problem 9 of Section
10.3.
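The idea of this project can be tried out numerically. The sketch below makes several assumptions that are not in the text: the upper limit of integration is taken to be 1 for illustration, the function names are hypothetical, and the reference value is computed from the standard library's error function through $\int_0^b e^{-x^2}dx = \frac{\sqrt{\pi}}{2}\,\mathrm{erf}(b)$. Integrating the Maclaurin series of $e^{-x^2}$ term by term gives $\sum_j (-1)^j b^{2j+1}/((2j+1)\,j!)$, and the first omitted term serves as the alternating series error bound.

```python
import math

def integral_series(b, n_terms):
    """First n_terms of sum_j (-1)^j b^(2j+1) / ((2j+1) j!)."""
    return sum((-1) ** j * b ** (2 * j + 1) / ((2 * j + 1) * math.factorial(j))
               for j in range(n_terms))

def next_term(b, n_terms):
    """Absolute value of the first omitted term (alternating series bound)."""
    j = n_terms
    return b ** (2 * j + 1) / ((2 * j + 1) * math.factorial(j))

b = 1.0  # assumed upper limit, for illustration only
exact = math.sqrt(math.pi) / 2 * math.erf(b)
for n in (3, 5, 7):
    approx = integral_series(b, n)
    print(n, approx, abs(approx - exact), next_term(b, n))
```

For b = 1 the terms decrease in absolute value, so the alternating series estimate applies from the first term on.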
3. The purpose of this project is to show how power series can be used to
find or approximate the solutions of certain differential equations.
(a) Use the idea in problem 13 of Section 6.4 to find a power series so¬
lution of the initial value problem
that satisfies the condition y(0) = 1. Verify that the series that you
found converges for all t and satisfies the differential equation. The
solution, J0(t), is called a Bessel function.
(c) Power series can also be used for nonlinear equations. Use it to find
the first four terms of the solution of
Note that when you square the series for y(t), the lower-order terms
are easy to calculate. Check to make sure that the terms you found
coincide with the series expansion of the solution found in Example
2 of Section 7.1.
Differential Equations
$$y'(t) = 2y(t).$$
For every choice of the constant c, the function $y(t) = ce^{2t}$ satisfies the
differential equation. Thus, the differential equation has a whole family
of solutions. Often one refers to both the function y(t) and to its graph
(which is a curve in the t-y plane) as a “solution.” If the value of y(t)
at a particular time $t = t_0$ is given, then c is determined. To see this,
suppose $y(t_0) = y_0$. Then, $y_0 = y(t_0) = ce^{2t_0}$, so $c = y_0e^{-2t_0}$. Thus, given
any point, $(t_0, y_0)$, in the plane, there is a function, y(t), which solves
the differential equation and whose graph passes through the point. The
condition $y(t_0) = y_0$ is called an initial condition because the value of
y is being specified at the “initial” time $t_0$. Note, however, that the solution y(t) is determined for times before $t_0$, as well as for times after $t_0$.
Conversely, suppose that $y_1(t) = c_1e^{2t}$ and $y_2(t) = c_2e^{2t}$ are two solutions which are equal at some time $t_1$. Then, $c_1e^{2t_1} = c_2e^{2t_1}$, from which it follows that $c_1$ must equal $c_2$. Therefore, the solutions are equal for all times t. Thus, distinct solutions can never have the same value at the same time. In terms of the geometry of solution curves in the plane, we can restate what we have proven as follows: every point in the plane has a solution curve going through it, and distinct solution curves never cross. Several of these solution curves are shown in Figure 7.1.1.
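The way the initial condition fixes the constant c can be made concrete. This is a small sketch with assumed names and sample values (the point $(t_0, y_0) = (1, 3)$ and the step h are illustrative); it builds the member of the family through a given point and checks the differential equation with a centered difference quotient.

```python
import math

def solution_through(t0, y0):
    """The solution y(t) = c e^{2t} with c = y0 e^{-2 t0}, so that y(t0) = y0."""
    c = y0 * math.exp(-2.0 * t0)
    return lambda t: c * math.exp(2.0 * t)

y = solution_through(t0=1.0, y0=3.0)
h = 1e-6
derivative = (y(0.5 + h) - y(0.5 - h)) / (2 * h)   # numerical y'(0.5)
print(y(1.0), derivative, 2 * y(0.5))
```

Two different initial values at the same $t_0$ give two different constants c, which is the statement that distinct solution curves never cross.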
true for quite simple functions f, such as $f(t, y) = \sin(2y)$ or $f(t, y) = t^2 + y\sin(2y)$. Thus, we will have to prove that solutions exist and have
the right properties rather than just checking the properties of a given
family of functions, as we did in Example 1. The second difficulty is
that solutions may exist only for short times as shown by the following
example.
v(t) =^
Vo
y{t) - ya
276 Chapter 7. Differential Equations
We will solve (2) by an iteration method and then show that the solution
satisfies (1). Notice that (2) allows us to reformulate both the differential
equation and the initial condition into a single condition on the function
y(t). Let $T > 0$ be a number such that $T \le \delta$, and let $y_1(t) = y_0$. We
define functions $y_n(t)$ inductively for $n \ge 1$ by
$$y_{n+1}(t) = y_0 + \int_{t_0}^{t} f(s, y_n(s))\,ds.$$
We shall show that if T is small enough, this definition makes sense and
that the resulting sequence of functions, $\{y_n(t)\}$, is a Cauchy sequence in
$C[t_0 - T, t_0 + T]$. For simplicity, we denote the interval $[t_0 - T, t_0 + T]$
by $I_T$. Throughout, we shall denote the sup norm on $C[I_T]$ by $\|\cdot\|_{\infty,T}$ to
emphasize the dependence on T.
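The iteration can be carried out exactly when f is simple. The sketch below assumes the test problem $y' = 2y$, $y(0) = 1$ (so $f(t, y) = 2y$, $t_0 = 0$, $y_0 = 1$) and $T = 0.25$; these choices and the function names are illustrative. Each iterate is a polynomial stored as a list of coefficients, so the integral can be computed term by term, and the sup-norm distance to the known solution $e^{2t}$ is estimated on a grid in $[-T, T]$.

```python
import math

def picard_step(coeffs, y0=1.0):
    """Given y_n(t) = sum_k coeffs[k] t^k, return the coefficients of
    y0 + integral from 0 to t of 2*y_n(s) ds."""
    integrated = [2.0 * c / (k + 1) for k, c in enumerate(coeffs)]
    return [y0] + integrated

def evaluate(coeffs, t):
    """Value of the polynomial with the given coefficients at t."""
    return sum(c * t**k for k, c in enumerate(coeffs))

T = 0.25
ts = [T * k / 100 for k in range(-100, 101)]   # grid on [-T, T]
y = [1.0]                                      # y_1(t) = y_0
errors = []
for n in range(1, 9):
    errors.append(max(abs(evaluate(y, t) - math.exp(2 * t)) for t in ts))
    y = picard_step(y)
print(errors)
```

For this f the iterates are the Taylor polynomials of $e^{2t}$, and the observed errors shrink at least geometrically, which is the behavior the Cauchy sequence argument predicts.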
[Figure 7.1.3: the t-axis marked at $t_0 - \delta$, $t_0 - T$, $t_0$, $t_0 + T$, $t_0 + \delta$]
<
$$K = \max_{S}\left|\frac{\partial f}{\partial y}\right| < \infty.$$
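For all $\beta_1$ and $\beta_2$ in $[y_0 - \delta, y_0 + \delta]$, the Mean Value Theorem implies that
there is a $\xi$ between $\beta_1$ and $\beta_2$ so that
$$|f(t, \beta_1) - f(t, \beta_2)| \le \left|\frac{\partial f}{\partial y}(t, \xi)\right||\beta_1 - \beta_2|. \qquad (4)$$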
Thus, taking the supremum of the left-hand side over all t e IT, we obtain
where $\alpha = KT$. If we choose $T < \min\{\frac{1}{K}, \delta\}$, then $\alpha < 1$. It follows
in exactly the same way as in the proof of Theorem 5.4.1 that $\{y_n\}$ is a
Cauchy sequence in the sup norm in $C[I_T]$. Thus, by Theorem 5.3.3, there
is a continuous function y(t) on $I_T$ such that $y_n \to y$ uniformly. One can
also use the contraction mapping principle to prove the existence of y
(see problem 5). Since estimate (5) holds,
$$y(t) = \lim_{n \to \infty} y_{n+1}(t) = y_0 + \int_{t_0}^{t} f(s, y(s))\,ds.$$
The function y therefore satisfies (2) and $y(t_0) = y_0$. Since $f(s, y(s))$ is
continuous, the Fundamental Theorem of Calculus implies that y(t) is
continuously differentiable and (1) holds.
To see that y(t) is unique, suppose that z(t) is another continuously
differentiable function on $I_T$ that satisfies (1). Then z(t) satisfies the integral equation (2) also. Subtracting the integral equation for y(t) from the
integral equation for z(t), we find
$$z(t) - y(t) = \int_{t_0}^{t} \left(f(s, z(s)) - f(s, y(s))\right)ds.$$
The same estimates as above show that $\|z - y\|_{\infty,T} \le \alpha\|z - y\|_{\infty,T}$. But,
since $\alpha < 1$, this can only be true if $\|z - y\|_{\infty,T} = 0$, which implies that
$z(t) = y(t)$ for all t in the interval. Thus, y(t) is unique. □
Proof. Suppose $y(t_0) = z(t_0)$ at a point $t_0 \in [a, b]$. Then, both y(t) and
z(t) satisfy (1) and the same initial condition at $t_0$. Thus, the uniqueness statement in Theorem 7.1.1 implies that $y(t) = z(t)$ for all t in some
interval $I_T$ containing $t_0$. See Figure 7.1.4. Let $t_1 = \sup\{t \mid y(t) = z(t)\}$ and suppose that $t_1 < b$. We know that $z(t_1) = y(t_1)$ since z and y are continuous functions which are equal in the interval $[t_0, t_1)$. However, the local existence theorem would then guarantee that $y(t) = z(t)$ for all t in some interval containing $t_1$, which would contradict the definition of $t_1$. Thus, we must have $t_1 \ge b$, so $y(t) = z(t)$ on the interval $[t_0, b]$. The proof that $y(t) = z(t)$ on the interval $[a, t_0]$ is
similar. □
Proof. By Theorem 7.1.1, the solution y(t) exists in some small time
interval about $t_0$. Since y(t) is continuous, its graph cannot escape from
S immediately and as long as it remains in S, y(t) satisfies the estimate
if $|t - t_0| \le \delta/2M$. Therefore, the solution y(t) remains in the rectangle
for t e Jjt. Now define
where we used the Mean Value Theorem in step (11). Since a(t) satisfies
this differential inequality, Proposition 7.2.2 (proven in the next section)
guarantees that
$$a(t) \le e^{2Kt}a(0).$$
Problems
Let $\delta = 1$. Determine values for the constants M and K used in the proof
of Theorem 7.1.1. Prove that a solution exists on the interval [-§, §].
Let $\delta = 1$. Determine values for the constants M and K used in the proof
of Theorem 7.1.1. On what interval $I_T$ does Theorem 7.1.1 guarantee that
a solution exists?
Let $\delta = 1$. Determine values for the constants M and K used in the proof
of Theorem 7.1.1. On what interval $I_T$ does Theorem 7.1.1 guarantee that
a solution exists?
6. Let y(t) be the solution of the initial-value problem in problem 2. Let y(t)
be the solution of the same differential equation with the initial condition
y(0) = 1.05. Estimate \y(t) — y(t)\ on the interval [— |].
7. Suppose that the two solutions, y(t) and y(t), described in problem 6 exist
for all times t eR. Use the proof of Theorem 7.1.3 to estimate |y(t) — y(t) |.
10. For each of the following functions, f(t,y), find the set of initial values
y(0) = yQ for which Theorem 7.1.1 or problem 8 guarantees a unique
local solution y(t):
11. Let g be a function of n + 1 variables. Show that the nth order differential
equation
and suppose $t_1 < \infty$. Then, as $t \nearrow t_1$, either $y(t) \to \infty$ or $y(t) \to -\infty$.
Proof. We will show that if y(t) does not converge either to $+\infty$ or $-\infty$,
then y(t) can be extended past $t_1$, contradicting the definition of $t_1$. If
$y(t) \to \infty$ as $t \nearrow t_1$, then for every $N > 0$ there is $\mu$ such that $y(t) \ge N$
for all $t \ge t_1 - \mu$. Similarly, if $y(t) \to -\infty$ as $t \nearrow t_1$ then for every $N > 0$
there is $\mu$ so that $y(t) \le -N$ for all $t \ge t_1 - \mu$. Therefore, if y(t) doesn't
converge either to $+\infty$ or to $-\infty$, there is an $N > 0$ so that every interval
$[t_1 - \mu, t_1)$ contains at least one point $t_\mu$ such that $-N \le y(t_\mu) \le N$.
Thus, the solution z(t) of (13) exists on the interval + T*), which
contains t\. See Figure 7.2.1. By the uniqueness proven in Theorem 7.1.1,
the solution z(t) coincides with y(t) on the interval $[t_\mu, t_1)$ where both
solutions exist. Therefore z(t) extends the solution y(t) past $t_1$ which
contradicts the definition of $t_1$. Thus, either $y(t) \to \infty$ as $t \nearrow t_1$ or
$y(t) \to -\infty$ as $t \nearrow t_1$. □
for all times t. Thus in a time interval of length T, y(t) cannot increase
or decrease by more than T, so y(t) cannot approach +oo or — oo in finite
time. Theorem 7.2.1 implies, therefore, that the solution is global.
Even when $y'(t)$ grows, y(t) can't go to infinity in finite time if $y'(t)$
doesn't grow too fast. For example, the solution of $y'(t) = y(t)$ with $y(0) = 1$, is the function $y(t) = e^t$, so $y'(t)$ does indeed grow exponentially.
Nevertheless, the solution exists for all times. Suppose that we consider
the initial-value problem
(a) If $y(a) \le z(a)$, then $y(t) \le z(t)$ for all $t \in [a, b]$.
(b) If $y(b) \ge z(b)$, then $y(t) \ge z(t)$ for all $t \in [a, b]$.
Proof. We shall prove (a); the proof of (b) is similar. Suppose that $y(a) \le z(a)$ and that there is a $t_2$ in the interval [a, b] such that $y(t_2) > z(t_2)$. We
will show that this leads to a contradiction. Let $t_1$ be the supremum of
the set of $t \in [a, t_2]$ such that $y(t) \le z(t)$. The hypothesis $y(a) \le z(a)$
shows that the set is nonempty. Since y and z are continuous functions,
we know that $y(t_1) = z(t_1)$. In particular, $t_1 < t_2$. On the interval $[t_1, t_2]$,
we define $h(t) = y(t) - z(t)$. Since y and z are continuous on $[t_1, t_2]$, they
are bounded. Thus, there is a constant B so that $|y(t)| \le B$ and $|z(t)| \le B$
for all £ e [£1, £2]. Let K be the supremum of 1| on the rectangle [£1, £2] x
[-B, B]. Then for £ e [£1, £2],
We used the hypothesis (15) in step (18) and the Mean Value Theorem in
step (19). Since $h'(t) \le Mh(t)$ on $[t_1, t_2]$, Proposition 7.2.2 assures us that
$h(t) \le h(t_1)e^{Mt}$. But h(t) is nonnegative and $h(t_1) = 0$, so $h(t) = 0$ for all
$t \in [t_1, t_2]$. Thus, $y(t_2) \le z(t_2)$. Since this violates our assumption about
$t_2$, the proof is complete. □
and
CT
CT
(23)
II
II
0
0
exist on the interval [a, b\. Then, the solution of<r*b
0
exists on the interval [a, b]. Furthermore, $x(t) \le y(t) \le z(t)$ for all $t \in [t_0, b]$
and $z(t) \le y(t) \le x(t)$ for all $t \in [a, t_0]$.
Proof. By Theorem 7.1.1, the equation (24) has a local solution y(t)
which exists near $t = t_0$. By Theorem 7.2.1, either y(t) exists on [a, b] or
it goes to $+\infty$ or to $-\infty$ as t approaches some $t_1 \in [a, b]$. Suppose $t_1 > t_0$.
On the interval $[t_0, t_1)$, part (a) of Theorem 7.2.3 guarantees that y(t) is
bounded above by z(t) and bounded below by x(t), so y(t) cannot go to
$+\infty$ or to $-\infty$ as $t \nearrow t_1$. Therefore, the solution exists for all $t \in [t_0, b]$ and
the estimate $x(t) \le y(t) \le z(t)$ holds. A similar proof, using part (b) of
Theorem 7.2.3, shows that the solution exists for all $t \in [a, t_0]$ and that the
estimate $z(t) \le y(t) \le x(t)$ holds there. □
Example 3 The fact that orbits cannot cross (Theorem 7.1.2) can some¬
times be used to show that solutions exist on infinite time intervals. Con¬
sider the initial-value problem
Since $y'(t) \le 0$, the solution is decreasing and therefore cannot approach
$+\infty$. On the other hand, the t-axis is the orbit of the solution that is
identically zero for all t. Since y(t) starts positive and cannot cross the t-axis, the solution remains positive. Thus it cannot approach $-\infty$. Since
the solution cannot approach either $+\infty$ or $-\infty$ on the interval $[0, \infty)$,
by Theorem 7.2.1 it exists on the entire interval $[0, \infty)$.
Problems
$y'(t) = y(t)^p$, $y(0) = 1$.
2. Solve the initial-value problem in Example 3 explicitly and verify that the
solution exists on $[0, \infty)$ and satisfies $0 < y(t) \le y_0$.
8. Suppose that $0 < \varepsilon < y_0$. Prove that the solution of the initial-value
problem
9. Suppose that all the hypotheses of Theorem 7.2.3 hold except that the
hypothesis that $f$ and $g$ are continuously differentiable is weakened to the
statement that both $f$ and $g$ are continuous and one of them is uniformly
Lipschitz continuous. Prove that the conclusion still holds.
section we show how to estimate the error in Euler's method, the simplest
numerical method for approximating the solutions of differential
equations.
Suppose that we wish to approximate the solution of the differential
equation
\[ y'(t) = f(t, y(t)), \qquad y(a) = y_o \tag{26} \]
on the interval $[a, b]$. We divide the interval into $N$ equal parts of length
$h = (b - a)/N$ by setting $t_0 = a$, $t_1 = a + h$, $t_2 = a + 2h$, ..., $t_N = b$.
Set $y_0 = y_o$. We know the value of $y$ at $t_0$, and the differential equation
tells us the value of $y'$ at $t_0$, namely, $f(t_0, y(t_0))$. Thus it is natural to
approximate $y(t)$ on the interval $[t_0, t_1]$ by $y_0 + (t - t_0)f(t_0, y(t_0))$ since
this straight line has the same value and slope as $y(t)$ at $t_0$. This gives us
the approximation
\[ y_1 = y_0 + hf(t_0, y(t_0)) \]
for the value of $y(t_1)$ at $t_1$. Using $y_1$, we can approximate $y'(t_1) =
f(t_1, y(t_1)) \approx f(t_1, y_1)$, and this enables us to define the straight line
approximation $y_1 + (t - t_1)f(t_1, y_1)$ on the interval $[t_1, t_2]$. This second
straight line gives us the approximation
\[ y_2 = y_1 + hf(t_1, y_1) \]
for the value of $y(t_2)$. Continuing in this way, we define
\[ y_{n+1} = y_n + hf(t_n, y_n) \tag{27} \]
for $n = 0, \ldots, N - 1$. Connecting the points $(t_n, y_n)$ by straight lines gives
the polygonal approximation to $y(t)$ first used by Euler and known as
Euler's method.
n    t_n     y_n     y(t_n)
0    0       5       5
1    .25     2.5     3.17
2    .5      1.56    2.30
3    .75     1.41    2.02
4    1.0     1.64    2.10
5    1.25    2.07    2.39
6    1.5     2.60    2.81
7    1.75    3.17    3.31
8    2.0     3.77    3.86
Figure 7.3.1
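The recursion (27) is short enough to try out directly. The following Python sketch is not part of the text: the function name `euler` is illustrative, and the test problem $y' = 5t - 2y$, $y(0) = 5$ on $[0, 2]$ with $N = 8$ is an assumption, chosen only because it reproduces the values tabulated above. The closed-form solution used for comparison is derived from that assumed equation.

```python
import math


def euler(f, a, b, y0, n_steps):
    """Return the Euler points (t_n, y_n) for y' = f(t, y), y(a) = y0."""
    h = (b - a) / n_steps
    t, y = a, y0
    points = [(t, y)]
    for n in range(n_steps):
        y = y + h * f(t, y)      # y_{n+1} = y_n + h f(t_n, y_n)
        t = a + (n + 1) * h      # t_{n+1}, recomputed from a to avoid drift
        points.append((t, y))
    return points


# Assumed test problem, inferred from the tabulated values (not stated in the text).
def f(t, y):
    return 5.0 * t - 2.0 * y


def exact(t):
    # Solution of the assumed problem y' = 5t - 2y, y(0) = 5.
    return 2.5 * t - 1.25 + 6.25 * math.exp(-2.0 * t)


points = euler(f, 0.0, 2.0, 5.0, 8)
for n, (t, y) in enumerate(points):
    print(f"{n}  {t:4.2f}  {y:5.2f}  {exact(t):5.2f}")
```

Each printed row gives $n$, $t_n$, the Euler value $y_n$, and the value of the assumed exact solution at $t_n$, so the output can be compared column by column with the table.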
If the solution of (26) is a straight line, Euler's method gives the solution
exactly. By the Fundamental Theorem of Calculus (or by Taylor's
theorem), the deviation of $y(t)$ from a straight line can be estimated if we
can bound the second derivative of $y$. Thus, we expect that error estimates
for Euler's method should involve bounds on the second derivative
of $y(t)$, that is, on the first derivatives of $f$.
Proof. The numbers $\{y_n\}$ are defined by the recursion relation (27).
We can also write a recursion relation for the numbers $\{y(t_n)\}$ as follows.
Since $y(t)$ and $f$ are continuously differentiable, the composition
$f(t, y(t))$ is also continuously differentiable. By (26), $y'(t)$ is continuously
differentiable, so $y(t)$ is twice continuously differentiable. Therefore, by
Taylor's theorem,
\[ y(t_{n+1}) = y(t_n) + hy'(t_n) + \frac{h^2}{2!}y''(\xi_n) \]
for some point $\xi_n$ between $t_n$ and $t_{n+1}$. Since $y(t)$ satisfies (26), we can
rewrite this as
\[ y(t_{n+1}) = y(t_n) + hf(t_n, y(t_n)) + \frac{h^2}{2}\left(\frac{\partial f}{\partial t} + f\frac{\partial f}{\partial y}\right). \tag{30} \]
In the third term on the right, all three functions are evaluated at the
point $(\xi_n, y(\xi_n))$. Subtracting (27) from (30) and taking absolute values,
we find
\begin{align*}
|y(t_{n+1}) - y_{n+1}| &\le |y(t_n) - y_n| + h|f(t_n, y(t_n)) - f(t_n, y_n)| + \frac{h^2}{2}L \tag{31}\\
&\le |y(t_n) - y_n| + hK|y(t_n) - y_n| + \frac{h^2}{2}L, \tag{32}
\end{align*}
where we used the Mean Value Theorem in the second step. Throughout,
we used the hypotheses that the points $(t_n, y(t_n))$ and $(t_n, y_n)$ are in the
rectangle. For simplicity, we write the error at the $n$th step as $E_n =
|y(t_n) - y_n|$ and set $A = (1 + hK)$ and $B = \frac{h^2}{2}L$. Then (32) can be written
\[ E_{n+1} \le AE_n + B. \]
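Iterating this inequality, and using the partial sum of the geometric series,
gives
\[ E_n \le A^nE_0 + B\,\frac{A^n - 1}{A - 1}. \]
Since $E_0 = 0$, $A - 1 = hK$, and $A^n = (1 + hK)^n \le e^{nhK} \le e^{(b-a)K}$,
we obtain
\[ E_n \le \frac{hL}{2K}\left(e^{(b-a)K} - 1\right). \]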
the right-hand side of (29) is an upper bound for the error; the actual er¬
ror may be much less. In addition, we used the rather crude estimate
1 + x < ex in order to get a bound independent of n. Nevertheless,
it is true that Euler's method is not a very efficient method in that one
must make h very small (thus the number of intervals, N, very large) in
order to approximate the solution well. As in the case of the numerical
estimation of integrals discussed in Section 3.4, there are serious draw¬
backs to choosing h too small because of the tiny but real round-off error
that may occur with each computational step. This is explored further
in project 2. Thus, the design of higher-order methods and the proof of
error bounds have played (and play) an important role in the study of
ordinary and partial differential equations. The proofs of the error es¬
timates for higher-order algorithms are more complicated than the one
above but use the same analytical ideas.
A natural question has probably come to mind. What is this mysteri¬
ous rectangle R which occurs in the hypotheses of Theorem 7.3.1? Since
we are using numerical techniques precisely because we can't solve the
equation explicitly, how can we know what R is? The answer is that we
must derive estimates on the solution by using the ideas of Section 7.2.
on the interval $[0, 2]$. Then $f(t, y) = \sin y$, so $\partial f/\partial t = 0$ and $\partial f/\partial y = \cos y$. Thus,
whatever the rectangle $R$ is, the constants $K$ and $L$ in Theorem 7.3.1 can
be taken to equal 1. Therefore, estimate (29) is
\[ |y(t_n) - y_n| \le \frac{h}{2}(e^2 - 1). \]
If we want the Euler's method points $(t_n, y_n)$ to be within $10^{-3}$ of the
true solution, we can guarantee that if we choose $h$ small enough so that
\[ \frac{h}{2}(e^2 - 1) \le 10^{-3}. \]
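As a rough numerical check of the bound $\frac{h}{2}(e^2 - 1)$ for $y' = \sin y$ on $[0, 2]$ (a sketch, not part of the text), the Python code below runs Euler's method and compares the largest observed error with the bound. The initial value $y(0) = 1$ and the helper names are assumptions made for illustration. The comparison uses the closed form $y(t) = 2\arctan(\tan(y(0)/2)\,e^t)$, which can be checked by differentiation. The code also computes the largest step size for which the bound is at most $10^{-3}$.

```python
import math


def euler_values(f, a, b, y0, n_steps):
    """Euler iterates y_0, ..., y_N for y' = f(t, y), y(a) = y0."""
    h = (b - a) / n_steps
    ys = [y0]
    for n in range(n_steps):
        t_n = a + n * h
        ys.append(ys[-1] + h * f(t_n, ys[-1]))
    return ys


def f(t, y):
    return math.sin(y)


Y0 = 1.0  # hypothetical initial value; the text's initial condition is not shown


def exact(t):
    # tan(y/2) = tan(Y0/2) * e^t solves y' = sin y when 0 < Y0 < pi.
    return 2.0 * math.atan(math.tan(Y0 / 2.0) * math.exp(t))


def max_error(n_steps):
    """Largest |y(t_n) - y_n| over the grid on [0, 2]."""
    h = 2.0 / n_steps
    ys = euler_values(f, 0.0, 2.0, Y0, n_steps)
    return max(abs(exact(n * h) - y) for n, y in enumerate(ys))


# Step size that makes the bound (h/2)(e^2 - 1) equal to 10^-3,
# and the smallest number of equal steps on [0, 2] that achieves it.
h_needed = 2.0e-3 / (math.e ** 2 - 1.0)
n_needed = math.ceil(2.0 / h_needed)

for n_steps in (8, 80, 800, n_needed):
    h = 2.0 / n_steps
    bound = 0.5 * h * (math.e ** 2 - 1.0)
    print(n_steps, h, max_error(n_steps), bound)
```

In runs of this kind the observed error sits well below the bound, which is what one expects from an estimate that holds for every rectangle and every initial value.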
Notice that we do not know what the solution $y(t)$ is, but we have estimates
on it from above and below. Thus, the solution curve $y(t)$
remains in the rectangle
To see that the same is true for the Euler's method iterates, we estimate
\[ y_{n+1} = y_n + ht_ny_n\sin y_n \le (1 + 2h)y_n \]
so
\[ y_n \le (1 + 2h)^n 3 \le 3e^{2hN} \le 3e^4. \]
Similarly,
\[ y_{n+1} = y_n + ht_ny_n\sin y_n \ge (1 - 2h)y_n \]
so if $h < \frac{1}{2}$ we have $y_n > 0$ for all $n$. Thus, the points $(t_n, y_n)$ also remain
in the rectangle $R$. In the rectangle, we know that $|f(t, y)| \le 6e^4$, so we
can estimate
\[ K = \sup_R \left|\frac{\partial f}{\partial y}\right| \le 2 + 6e^4 \]
and
\[ L = \sup_R \left|\frac{\partial f}{\partial t} + f\frac{\partial f}{\partial y}\right| \le 3e^4 + 6e^4(2 + 6e^4). \]
Problems
1. Use Euler's method to approximate the solution to the following initial-
value problem on the interval [0,2]:
4. Use Theorem 7.3.1 and the methods of Example 3 to determine how small
$h$ must be chosen so that the Euler points $y_n$ are within $10^{-4}$ of the true
values $y(t_n)$ for the differential equation in problem 1 if $y(0) = 4$.
5. Use Theorem 7.3.1 and the methods of Example 3 to determine how small
$h$ must be chosen so that the Euler points $y_n$ are within $10^{-4}$ of the true
values $y(t_n)$ for the differential equation in problem 2.
y'(t) = 2/(*)2, y( 0) = u
\[ y'(t) = f(t, y(t), z(t)), \qquad y(t_0) = y_0 \]
\[ z'(t) = g(t, y(t), z(t)), \qquad z(t_0) = z_0. \]
Use the Euler's method scheme from problem 8 to investigate the behavior
of solution curves $(N_1(t), N_2(t))$ in the $N_1$-$N_2$ plane for different
choices of $(n_1, n_2)$.
Use the Euler's method scheme from problem 8 to investigate the behavior
of solution curves $(N_1(t), N_2(t))$ in the $N_1$-$N_2$ plane for different
choices of $(n_1, n_2)$. In fact, all solutions are periodic in $t$. Is that what you
found? Why not?
Projects
Here $a$, $b$, and $y_0$ are positive and $y(t)$ represents the population size at
time $t$ in some units.
(a) First we suppose that $y_0 < b$. Explain why $y(t)$ is always increasing.
Explain why there can be no time $t$ at which $y(t) = b$. (Hint: recall
Theorem 7.1.2.) Explain why the solution $y(t)$ exists for all positive
times and is unique.
(b) Again suppose that $y_0 < b$. Explain why $c = \lim_{t\to\infty} y(t)$ exists.
Prove that if $c < b$, then $y(t)$ will be eventually higher than $c$, giving
a contradiction. Conclude that $y(t) \to b$ as $t \to \infty$.
(c) Again suppose that $y_0 < b$. Compute $y''(t)$ in terms of $y(t)$ and use
the result to help you draw an accurate sketch of the solution.
(d) Suppose that $y_0 > b$. Use the ideas in (a), (b), and (c) to show that
the solution exists for all positive times, and draw an accurate graph
of it.
2. The purpose of this project is to analyze the trade-off between small step
size and round-off error in Euler's method. Every time the computer
calculates an Euler iterate, it makes an error whose size is bounded above
(a) Using the terminology and hypotheses of Theorem 7.3.1, except that
we replace (27) in Section 7.3 by (34) above and yn by yn, prove that
\[ E_{n+1} \le (1 + hK)E_n + \frac{h^2}{2}L + \varepsilon. \]
\[ E_n \le \left(\frac{Lh}{2K} + \frac{\varepsilon}{hK}\right)\left(e^{(b-a)K} - 1\right). \]
(c) Explain why the error bound cannot be made arbitrarily small no
matter how we choose h. How should one choose h to make the
error bound as small as possible?
(d) Suppose that $\varepsilon = 10^{-8}$. What is the maximum accuracy you can get
by using Euler's method for the differential equations in problems
1 and 2 of Section 7.3?
3. The purpose of this project is to show how a local existence and unique¬
ness result analogous to Theorem 7.1.1 can be proved for systems. We
will consider the initial-value problem for the system
(a) Show that if y(t) and z(t) satisfy (35) and (36), then they also satisfy
a pair of integral equations.
(b) Follow the proof of Theorem 7.1.1 to show that if T is small enough,
the integral equations can be solved by iteration.
(c) Show that the solutions of the integral equations are continuously
differentiable and satisfy the differential equations.
4. Suppose that $y(t)$ and $z(t)$ satisfy (35) and (36) on a time interval $a \le t \le b$.
The curve $\{(y(t), z(t)) \mid t \in [a, b]\}$ in the $y$-$z$ plane is called an orbit.
Prove the analogue of Theorem 7.1.2 by showing that if two orbits over
the time interval $[a, b]$ cross, then they are identical.
5. Suppose that $y(t)$ and $z(t)$ satisfy (35) and (36) on a time interval $a \le t < b$.
The solution pair $(y(t), z(t))$ is said to go to $\infty$ at finite time $b$ if, for
every $M$, there is a $t_M < b$ so that
Complex Analysis
exists.
300 Chapter 8. Complex Analysis
The functions $u(x, y)$ and $v(x, y)$ are called the real and imaginary parts
of the function $f$.
so
According to the definition, if $f$ is analytic, (2) and (3) must be the same,
so
\[ u_x(x, y) + iv_x(x, y) = v_y(x, y) - iu_y(x, y). \]
Since two complex numbers are equal if and only if their real and imaginary
parts are equal,
\[ u_x = v_y \quad\text{and}\quad u_y = -v_x. \]
\begin{align*}
&= \frac{u_x(\tau_1, y_n)x_n + u_y(0, \tau_2)y_n}{x_n + iy_n} + i\,\frac{v_y(x_n, \tau_3)y_n + v_x(\tau_4, 0)x_n}{x_n + iy_n} \tag{9}\\
&= \frac{u_x(\tau_1, y_n)x_n + iu_x(x_n, \tau_3)y_n}{x_n + iy_n} + i\,\frac{v_x(\tau_4, 0)x_n + iv_x(0, \tau_2)y_n}{x_n + iy_n} \tag{10}
\end{align*}
In going from (7) and (8) to (9), we used the Mean Value Theorem four
times. In going from (9) to (10), we used the Cauchy-Riemann equations
and rearranged the terms. From the Mean Value Theorem, we know that
$\tau_1$, $\tau_2$, $\tau_3$, and $\tau_4$ all converge to zero as $x_n \to 0$ and $y_n \to 0$. Using the
continuity of $u_x$ and $v_x$, the convergence in the last step follows from the
result in problem 2. □
\[ u_x = 2x = v_y \quad\text{and}\quad u_y = -2y = -v_x. \]
since $(u_j)_x = (v_j)_y$ for each $j$ because $a_j(z - z_0)^j$ is analytic. Similarly,
$(\sum u_j)_y = -(\sum v_j)_x$. Thus, the real and imaginary parts of $f$ satisfy the
Cauchy-Riemann equations so the function $f$, defined by (12), is analytic
inside its radius of convergence.
For example, since the power series for $e^z$ has radius of convergence
$R = \infty$, $e^z$ is analytic on $\mathbb{C}$. Similarly,
\[ \sin z = \frac{e^{iz} - e^{-iz}}{2i} \quad\text{and}\quad \cos z = \frac{e^{iz} + e^{-iz}}{2} \]
are analytic functions on $\mathbb{C}$ whose restrictions to the real axis are $\sin x$
and $\cos x$.
Problems
0
(c) eC and Im(z) > 0}.
A
cc;
0
(d) eC A
and Im(z) > 0}.
u
2. Suppose that $\{x_n\}$ and $\{y_n\}$ are sequences of real numbers such that
$x_n + iy_n \ne 0$ for any $n$ and $x_n + iy_n \to 0$. Suppose that $\{\alpha_n\}$ and $\{\beta_n\}$ are
sequences of real numbers such that $\alpha_n \to \gamma$ and $\beta_n \to \gamma$. Prove that
3. For each of the following functions, find the real and imaginary parts in
terms of $x$ and $y$ and verify that the Cauchy-Riemann equations hold on
the indicated domain:
(a) $f(z) = z^3$ on $\mathbb{C}$.
(b) $f(z) = \frac{1}{z}$ on $\{z \in \mathbb{C} \mid z \ne 0\}$.
(c) $f(z) = \sum_{j=0}^{\infty} z^j$ on $\{z \in \mathbb{C} \mid |z| < 1\}$.
(a) Prove that $f$ is continuous on $\mathbb{C}$ if and only if its real and imaginary
parts are continuous functions on $\mathbb{R}^2$.
(b) Prove that if $f$ is analytic, then $f$ is continuous.
8. Say where each of the following functions is analytic and compute its
derivative:
(a) $f(z) = (\cos z)^2 + e^{2z}$.
(b) $f(z) = e^{1/z}$.
(c) $f(z) = e^{1/\sin z}$.
9. Let $f(z) =$ Where is $f$ analytic?
(a) Find a power series representation for $f$ in the region $\{z \mid |z| < 1\}$.
(b) Find a representation for $f$ as an infinite series of powers of $\frac{1}{z}$ valid
in the region $\{z \mid |z| > 1\}$.
(c) Find a power series representation for $f$ around the point $z = i$.
What is its radius of convergence? Hint: write
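\[ = \int_a^b \operatorname{Re}\{e^{-i\theta}g(t)\}\,dt \tag{14} \]
\[ \hat f(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-ixt}f(t)\,dt = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} (\cos xt)f(t)\,dt - \frac{i}{\sqrt{2\pi}}\int_{-\infty}^{\infty} (\sin xt)f(t)\,dt. \]
\[ \hat f(z) = \frac{1}{\sqrt{2\pi}}\int_a^b e^{-izt}f(t)\,dt. \]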
Notice that the integral makes sense since the interval is finite and the
integrand is a continuous function of $t$; thus, $\hat f(z)$ is a well-defined function
on $\mathbb{C}$. The real and imaginary parts of $\hat f(z)$ are, respectively,
\[ u(x, y) = \frac{1}{\sqrt{2\pi}}\int_a^b e^{yt}(\cos xt)f(t)\,dt \]
\[ v(x, y) = -\frac{1}{\sqrt{2\pi}}\int_a^b e^{yt}(\sin xt)f(t)\,dt. \]
Using Theorem 5.2.4, one can compute the partial derivatives of $u$ and $v$
by differentiating under the integral sign. When one does so (problem
2), one finds that $u$ and $v$ satisfy the Cauchy-Riemann equations. Thus, $\hat f$
is the restriction to the real axis of a function that is analytic in the entire
complex plane. This property of the Fourier transform plays an important
role in the theory of signal transmission in electrical engineering and
in scattering theory in physics.
\[ \int_C f(z)\,dz \approx \sum_{j=1}^{N} f(z_j)(z_j - z_{j-1}), \]
where the points $z_0 = \omega$, $z_N = \tau$, and $z_1, z_2, \ldots, z_{N-1}$ lie along the curve
in order between $z_0$ and $z_N$. See Figure 8.2.1.
Figure 8.2.1
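\[ \sum_{j=1}^{N} f(z_j)(z_j - z_{j-1}) = \sum_{j=1}^{N} f(z(t_j))\,\frac{z(t_j) - z(t_{j-1})}{t_j - t_{j-1}}\,(t_j - t_{j-1}) \tag{17} \]
\[ \approx \sum_{j=1}^{N} f(z(t_j))z'(t_j)(t_j - t_{j-1}). \tag{18} \]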
Notice that (18) is a Riemann sum for $\int_a^b f(z(t))z'(t)\,dt$, which is an integral
of a continuous function on a finite interval on the real line, since $f$
is continuous and $z(t)$ is continuously differentiable. This gives us the
idea for the following definition.
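We can also write $\int_C f(z)\,dz$ in terms of real line integrals in $\mathbb{R}^2$. If $u(x, y)$
and $v(x, y)$ are the real and imaginary parts of $f$, then
\begin{align*}
\int_a^b f(z(t))z'(t)\,dt &= \int_a^b \big(u(x(t), y(t))x'(t) - v(x(t), y(t))y'(t)\big)\,dt \\
&\quad + i\int_a^b \big(u(x(t), y(t))y'(t) + v(x(t), y(t))x'(t)\big)\,dt.
\end{align*}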
Example 2 Let $C$ be the straight line from $(0, 0)$ to $(1, 2)$ and let $f(z) =
z^2$. Then, $z(t) = t + 2it$, $0 \le t \le 1$, is a parameterization of $C$. Since
$z'(t) = (1 + 2i)$, we have
Example 3 Let C be the arc of the unit circle (\z\ = 1) from 1 to i and
let f(z) = 2. We can parameterize C by z(t) = cost + isint = elt on the
interval [0, f]. Thus, z'(t) = ielt, so
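\begin{align*}
\int_C (z - z_0)^n\,dz &= \int_0^{2\pi} (re^{i\theta})^n\,ire^{i\theta}\,d\theta \\
&= \int_0^{2\pi} i(e^{i\theta})^{n+1}r^{n+1}\,d\theta \\
&= \begin{cases} 0 & \text{if } n \ne -1 \\ 2\pi i & \text{if } n = -1. \end{cases}
\end{align*}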
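The value of this integral can be checked numerically. The sketch below (not part of the text) approximates the parameterized integral $\int_0^{2\pi} f(z(\theta))z'(\theta)\,d\theta$ with $z(\theta) = z_0 + re^{i\theta}$ by a Riemann sum over equally spaced points. The function name `circle_integral`, the center $z_0 = 1 + 2i$, the radius $r = 0.5$, and the number of points are all illustrative assumptions.

```python
import cmath
import math


def circle_integral(f, z0, r, n_points=2000):
    """Riemann-sum approximation of the integral of f over |z - z0| = r,
    traversed once counterclockwise, using z(theta) = z0 + r e^{i theta}."""
    total = 0.0 + 0.0j
    dtheta = 2.0 * math.pi / n_points
    for j in range(n_points):
        theta = j * dtheta
        z = z0 + r * cmath.exp(1j * theta)
        dz = 1j * r * cmath.exp(1j * theta)   # z'(theta)
        total += f(z) * dz * dtheta
    return total


z0, r = 1.0 + 2.0j, 0.5   # hypothetical center and radius

for n in (-3, -2, -1, 0, 1, 2):
    # n=n binds the current exponent inside the lambda.
    value = circle_integral(lambda z, n=n: (z - z0) ** n, z0, r)
    print(n, value)
```

Only the exponent $n = -1$ should give a value near $2\pi i$; the others should be zero up to rounding, whatever center and radius are chosen.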
\begin{align*}
\int_C p(z)\,dz &= \int_C \sum_{j=0}^{N} a_j(z - z_0)^j\,dz \\
&= \sum_{j=0}^{N} a_j\int_C (z - z_0)^j\,dz \\
&= 0.
\end{align*}
= 0.
Since we know that a series in powers of z — z0 is analytic inside its radius
of convergence (Example 3 of Section 8.1), this result is a special case of
Cauchy's theorem, which is proved in the next section.
8.2 Integration on Paths 311
Problems
2. Explain carefully why one can differentiate under the integral sign in
computing the partial derivatives of the real and imaginary parts of the
Fourier transform of a continuous function that vanishes outside of a fi¬
nite interval (see Example 1). Verify that the Cauchy-Riemann equations
hold.
3. Let $f$ be the function which equals 1 on the interval $[a, b]$ and equals 0
elsewhere. Compute the Fourier transform of $f$ and verify explicitly that
it is analytic in the entire complex plane.
4. Let $f$ be an analytic function. Show that the real and imaginary parts, $u$
and $v$, satisfy Laplace's equation:
5. Let C be the path from (0,0) to (2,4) in R2 that follows the graph of the
function y = x2. Compute the following line integrals:
9. Let $C$ be the circle of radius 1 with center at the origin. Evaluate the
integral
\[ \int_C \frac{1}{z(z - 2)}\,dz. \]
Hint: write $\frac{1}{z(z-2)} = \frac{1}{2}\left\{\frac{1}{z-2} - \frac{1}{z}\right\}$ and use the ideas in Example 4.
10. Let C be the circle of radius R with center at the origin. Find an upper
bound for
[ — dz
Jc z
11. Let $C$ be a smooth curve of finite length in $\mathbb{C}$ and $f$ be a continuous function
on $C$. Suppose that $\{f_n\}$ is a sequence of continuous functions that
converges uniformly to $f$ on $C$. Then
Figure 8.3.1
We give a proof of Cauchy's theorem which depends on Green's theorem
from multivariable calculus. Direct proofs are available; see [6] or [23].
8.3 Cauchy's Theorem 313
\[ \int_C f(z)\,dz = 0 \tag{21} \]
for every simple closed contour $C$ in $D$ whose interior contains only
points of $D$.
This shows that if the interior of a simple closed contour contains a single
point at which $f$ is not analytic, then the conclusion of Cauchy's theorem
may not hold.
1 r dz 1 r dz
= —ixi -f 0.
2 Jc2 z 2 Jc2 z — 2
If C3 is the circle of radius \ about the point 1, then by Cauchy's theorem.
0.
Proof. Suppose that $\varepsilon > 0$ is small enough so that $C(z_0, \varepsilon)$, the circle of
radius $\varepsilon$ centered at $z_0$, lies entirely inside of $C$. We will first show that
the integral of $f(z)/(z - z_0)$ on $C$ is equal to the integral of $f(z)/(z - z_0)$
on $C(z_0, \varepsilon)$. Choose a point $\omega$ on $C(z_0, \varepsilon)$ and let $L$ denote the radial line
segment from $\omega$ to the first intersection point, $\tau$, of the line with $C$. These
\[ 0 = \int_{C_1} \frac{f(z)}{z - z_0}\,dz = \int_C \frac{f(z)}{z - z_0}\,dz - \int_{C(z_0,\varepsilon)} \frac{f(z)}{z - z_0}\,dz, \]
which shows that the integral of $f(z)/(z - z_0)$ on $C$ and the integral of
$f(z)/(z - z_0)$ on $C(z_0, \varepsilon)$ are equal. Notice that this equality is true for all
small $\varepsilon$.
We now estimate the difference between the right-hand and left-hand
sides of (23).
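\begin{align*}
\left| f(z_0) - \frac{1}{2\pi i}\int_C \frac{f(z)}{z - z_0}\,dz \right|
&= \left| f(z_0) - \frac{1}{2\pi i}\int_{C(z_0,\varepsilon)} \frac{f(z)}{z - z_0}\,dz \right| \\
&= \left| f(z_0) - \frac{f(z_0)}{2\pi i}\int_{C(z_0,\varepsilon)} \frac{dz}{z - z_0} - \frac{1}{2\pi i}\int_{C(z_0,\varepsilon)} \frac{f(z) - f(z_0)}{z - z_0}\,dz \right| \\
&= \left| \frac{1}{2\pi i}\int_{C(z_0,\varepsilon)} \frac{f(z) - f(z_0)}{z - z_0}\,dz \right| \\
&\le \frac{1}{2\pi}\max_{z \in C(z_0,\varepsilon)} \left| \frac{f(z) - f(z_0)}{z - z_0} \right| (2\pi\varepsilon).
\end{align*}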
In the second step we used the result of Example 2 to cancel the first
two terms on the right. In the last step we used the estimate proved
in problem 8 of Section 8.2. Since $f$ is continuous, the right-hand side
converges to zero as $\varepsilon \to 0$. However, the left-hand side does not depend
on $\varepsilon$, so we conclude that (23) holds. □
Proof. Choose $r_1$ so that it satisfies $r < r_1 < r_0$. Let $z$ be a point in the
disk $\{z \mid |z - z_0| \le r_1\}$ and let $\omega$ be a point on $C(z_0, r_0)$. Since $r_1 < r_0$,
there is an $\alpha$ independent of $z$ and $\omega$ so that
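Thus,
\[ \frac{1}{\omega - z} = \frac{1}{\omega - z_0}\cdot\frac{1}{1 - \frac{z - z_0}{\omega - z_0}} \]
\begin{align*}
f(z) &= \frac{1}{2\pi i}\int_{C(z_0,r_0)} \frac{f(\omega)}{\omega - z}\,d\omega \tag{25}\\
&= \frac{1}{2\pi i}\int_{C(z_0,r_0)} \frac{f(\omega)}{\omega - z_0}\,\lim_{N\to\infty}\sum_{j=0}^{N}\left(\frac{z - z_0}{\omega - z_0}\right)^j d\omega \tag{26}\\
&= \lim_{N\to\infty}\sum_{j=0}^{N}\left(\frac{1}{2\pi i}\int_{C(z_0,r_0)} \frac{f(\omega)}{(\omega - z_0)^{j+1}}\,d\omega\right)(z - z_0)^j \tag{27}
\end{align*}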
Since the series in (27) converges for $z \in \{z \mid |z - z_0| \le r_1\}$, we know, by
Theorem 6.5.3, that the radius of convergence of the series is $\ge r_1$. Thus
the series converges uniformly on $\{z \mid |z - z_0| \le r\}$ if $r < r_1$. □
\[ f^{(n)}(z_0) = \frac{n!}{2\pi i}\int_C \frac{f(\omega)}{(\omega - z_0)^{n+1}}\,d\omega \tag{28} \]
Proof. The same proof as in Theorem 6.4.2 shows that a complex power
series is infinitely differentiable inside its radius of convergence and the
derivatives can be computed by differentiating the series term by term.
Differentiating (27) term by term and setting $z = z_0$ gives formula (28)
in the case where $C = C(z_0, r_0)$. Thus, the coefficient of the $j$th power
in (27) is $f^{(j)}(z_0)/j!$, so (27) is just the Taylor series of $f$. Formula (28)
holds for general $C$ satisfying the hypotheses by an argument similar to
the argument at the beginning of the proof of Theorem 8.3.2. □
(c) The Prime Number Theorem, which we stated at the end of Section 6.6,
describes the asymptotic behavior of the function $\pi(m)$ whose
value is the number of primes $\le m$. It doesn't seem likely that this could
have anything to do with complex numbers or analytic functions. However,
the Riemann zeta function, which we introduced in problem 12 of
Section 6.3, is the restriction to the set $\{x \mid x > 1\}$ of an analytic function
(see problem 10). The properties of $\zeta(z)$ played a central role in the
original proofs of the Prime Number Theorem.
out that the size of the region where $f$ is nonzero can be characterized
in terms of the growth properties of $\hat f$ in the imaginary directions. As
mentioned in the example, this plays an important role in the theory of
signal transmission in electrical engineering and in scattering theory in
high-energy physics.
Problems
1. Let C\, C2, and C3 be circles with center the origin and radii 1, 3, and 5,
respectively, traversed counterclockwise. Use Cauchy's Theorem and the
Cauchy integral formula to evaluate the following integrals:
~ (z — l)(z — 4)'
4. Suppose that $f$ is analytic in the disk $\{z \in \mathbb{C} \mid |z| < R\}$ and that $f'(z) = 0$
for all $z$ in the disk. Prove that $f$ is constant in the disk. Hint: use power
series.
5. Suppose that $f$ is analytic in the entire complex plane and bounded; that
is, $|f(z)| \le M$ for some $M$. Use (28) and (20) to prove that $f^{(n)}(0) = 0$
for all $n \ge 1$. Conclude from this that $f$ is a constant. This is known as
Liouville's Theorem.
6. Let $C_1$ be the straight line path from $i$ to 1 and let $C_2$ be the path that goes
in a straight line from $i$ to 0 and then in a straight line from 0 to 1.
(a) Compute $\int_{C_1} z\,dz$ and $\int_{C_2} z\,dz$ and show that they are the same.
How could you have predicted this by using Cauchy's theorem?
(b) Compute $\int_{C_1} \bar z\,dz$ and $\int_{C_2} \bar z\,dz$ and show that they are not the same.
7. (a) Suppose that $f$ is analytic in the open unit disk $\{z \mid |z| < 1\}$ and
suppose that $f(z) = 0$ when $z$ is real. Prove that $f(z) = 0$ in the
disk. Hint: what can you say about the Taylor coefficients of $f$ at 0?
(b) Suppose that $f$ is analytic in the unit disk $\{z \mid |z| < 1\}$ and that
$f(z_n) = 0$ at a sequence of points $\{z_n\}$ that have a limit point in the
disk. Prove that $f(z) = 0$ in the disk.
8. Suppose that $f$ and $g$ are both analytic functions on $\mathbb{C}$ and that $f(z) =
g(z)$ if $z$ is real. Prove that $f(z) = g(z)$ for all $z$.
9. Use the fact that $e^ze^w = e^{z+w}$ for $z$ and $w$ real to prove that the same
equality must hold for all $z \in \mathbb{C}$ and all $w \in \mathbb{C}$.
10. If $b > 0$, define $b^z = e^{z\ln b}$. Show that the Riemann $\zeta$ function,
\[ \zeta(z) = \sum_{j=1}^{\infty} \frac{1}{j^z}, \]
Projects
1. The purpose of this project is to show how the theory of analytic functions
can be used to evaluate improper Riemann integrals on $\mathbb{R}$. The integral
\[ \int_{-\infty}^{\infty} \frac{1}{1 + x^2}\,dx \]
$(-R, 0)$   $0$   $(R, 0)$
(a) Let $\Gamma_R$ be the contour in the above figure, consisting of a first piece,
$T_R$, traversing the real line from $x = -R$ to $x = R$, and a second
piece, $C_R$, which traverses counterclockwise from $(R, 0)$ to $(-R, 0)$
along the circle of radius $R$ with center at the origin. Suppose $R > 1$.
By writing $f(z) = (z + i)^{-1}(z - i)^{-1}$ and using the Cauchy integral
formula, explain why
(b) Use (20) to prove that $\left|\int_{C_R} f(z)\,dz\right| \to 0$ as $R \to \infty$.
(c) Explain why the improper Riemann integral $\int_{-\infty}^{\infty} \frac{1}{1 + x^2}\,dx$ exists and
equals $\pi$.
(d) Use the same ideas to evaluate
has at least one root if $n \ge 1$. Since we are assuming that $p$ has order $n$,
we know that $a_n \ne 0$. Note that the coefficients $\{a_j\}$ may be complex. We
shall outline a proof by contradiction. Suppose that there is no $z_0$ so that
$p(z_0) = 0$.
Remark: to see the power of analytic function theory, try to prove the
Fundamental Theorem directly.
CHAPTER 9
Fourier Series
Fourier analysis has played an important role in mathematics since the
early part of the 19th century. In Section 9.1 we show Fourier's technique
for solving a partial differential equation describing heat flow. His calculation
posed a question for mathematicians which proved to be both
difficult and exceptionally fruitful. Fourier series are formally defined in
Section 9.2, where several examples are given. The theorems on pointwise
convergence and mean-square convergence are proved in Sections
9.3 and 9.4, respectively.
In this section we investigate the flow of heat in a thin metal bar of length
$L$ centimeters. We assume that the temperature, $u$, is constant in each
cross section, so $u$ depends only on the time $t$ and the distance $x$ along
the bar. Heat energy is measured in calories. The specific heat of a material,
$c$, is the number of calories needed to raise 1 gram 1 degree
centigrade. Thus the heat per unit length at $x$ is $c\rho Au(t, x)$, where $A$
is the area of the cross section and $\rho$ is the density of the metal. For
simplicity, we assume that the bar is homogeneous and a perfect cylinder,
so $c$, $\rho$, and $A$ are constants. It follows that the total heat in the segment
of the bar between $x = a$ and $x = b$ is
\[ \int_a^b c\rho Au(t, x)\,dx. \]
Figure 9.1.1
Both the right-hand side of (1) and the right-hand side of (2) represent
the net rate of change of heat in the segment, so setting them equal and
rearranging, we find
\[ \int_a^b \left( u_t(t, x) - \frac{\alpha}{c\rho}u_{xx}(t, x) \right) dx = 0. \tag{3} \]
The constant $\kappa = \frac{\alpha}{c\rho}$ is called the diffusivity. Since (3) holds for all choices
of $a$ and $b$, the integrand in (3) must be zero if it is continuous (problem
14 of Section 3.3). Thus,
\[ u_t(t, x) = \kappa u_{xx}(t, x). \tag{4} \]
This partial differential equation is called the heat equation. The temperatures
along the bar are given at time $t = 0$,
\[ u(0, x) = f(x), \tag{5} \]
and we suppose that the ends of the bar are held at temperature zero at
all times (for example, by placing them in an ice bath). Thus,
\[ u(t, 0) = 0 = u(t, L). \tag{6} \]
The mathematical problem is to find the function, $u(t, x)$, defined on the
half-strip, $0 \le x \le L$, $t \ge 0$, so that the partial differential equation (4),
the initial condition (5), and the boundary conditions (6) all hold.
9.1 The Heat Equation 325
For the moment we forget about the initial conditions and look for
solutions of (4) and (6) that have the special form $u(t, x) = X(x)T(t)$,
which is a function of $x$ times a function of $t$. If we substitute $X(x)T(t)$
into (4), carry out the differentiations and rearrange algebraically, we
find
\[ \frac{T'(t)}{\kappa T(t)} = \frac{X''(x)}{X(x)}. \tag{7} \]
Since $t$ and $x$ are independent variables, this can only be true for all $t$
and all $x$ if both sides are constant; we call the constant $-\lambda$. Thus, $T(t)$
satisfies the differential equation
\[ T'(t) = -\lambda\kappa T(t) \tag{8} \]
and $X(x)$ satisfies
\[ X''(x) = -\lambda X(x). \tag{9} \]
The general solution of (9) is $X(x) = a_1\cos\sqrt{\lambda}x + a_2\sin\sqrt{\lambda}x$. However,
since the boundary conditions (6) must be satisfied, it is easy to see that
we must have $a_1 = 0$ and $\sin\sqrt{\lambda}L = 0$. Thus $\lambda$ cannot be any constant
but must satisfy $\sqrt{\lambda}L = \pm n\pi$ for some integer $n$. Since the function $\sin x$
is odd, the solutions for $n$ negative are minus the solutions for $n$ positive,
and the solution for $n = 0$ is identically zero. Thus, if $u$ is not identically
zero, the possible $\lambda$'s are $\lambda_n = \left(\frac{n\pi}{L}\right)^2$, $n = 1, 2, 3, \ldots$, and $X(x)$ is one of
the functions
\[ X_n(x) = b_n\sin\frac{n\pi x}{L} \]
\[ u_n(t, x) = b_ne^{-\lambda_n\kappa t}\sin\frac{n\pi x}{L} \]
\[ u(t, x) = \sum_{n=1}^{\infty} b_ne^{-\lambda_n\kappa t}\sin\frac{n\pi x}{L} \tag{10} \]
326 Chapter 9. Fourier Series
Assuming that the series converges, such a $u(t, x)$ satisfies the heat equation
and the boundary conditions. But what about the initial condition
(5)? If we set $t = 0$ in (10), we see that
\[ f(x) = \sum_{n=1}^{\infty} b_n\sin\frac{n\pi x}{L} \tag{11} \]
\[ \int_0^L \sin\frac{n\pi x}{L}\sin\frac{m\pi x}{L}\,dx = \begin{cases} 0 & \text{if } n \ne m \\ \frac{L}{2} & \text{if } n = m. \end{cases} \]
We shall use these special properties to compute formulas for the coefficients
$\{b_n\}$. Suppose that we know that the series on the right of (11)
converges uniformly to $f(x)$ on the interval $[0, L]$. Multiplying both sides
of (11) by $\sin\frac{m\pi x}{L}$ and integrating, we find
where we used Theorem 6.3.2 to exchange the sum and the integral. We
conclude that
\[ b_m = \frac{2}{L}\int_0^L f(x)\sin\frac{m\pi x}{L}\,dx. \tag{12} \]
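Example 1 Suppose that $f(x) = x(L - x)$. To evaluate the integral (12),
we integrate by parts:
\[ b_n = \frac{2}{L}\int_0^L x(L - x)\sin\frac{n\pi x}{L}\,dx = \frac{4L^2}{n^3\pi^3}\left(1 - (-1)^n\right). \]
Therefore, $b_n = 0$ if $n$ is even and $b_n = \frac{8L^2}{n^3\pi^3}$ if $n$ is odd. Thus $u(t, x)$ is
given by the series
\[ u(t, x) = \frac{8L^2}{\pi^3}\left( e^{-(\pi/L)^2\kappa t}\sin\frac{\pi x}{L} + \frac{1}{3^3}e^{-(3\pi/L)^2\kappa t}\sin\frac{3\pi x}{L} + \frac{1}{5^3}e^{-(5\pi/L)^2\kappa t}\sin\frac{5\pi x}{L} + \cdots \right). \]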
Notice that for $t > 0$, the series converges very fast because of the exponential
factors. This is the key fact which is exploited in the theorem
below. When $t = 0$, the series is
\[ \frac{8L^2}{\pi^3}\left( \sin\frac{\pi x}{L} + \frac{1}{3^3}\sin\frac{3\pi x}{L} + \frac{1}{5^3}\sin\frac{5\pi x}{L} + \cdots \right). \]
If $u(t, x)$ satisfies the initial condition (5), then this series must converge
to the simple polynomial $x(L - x)$ for each $x$ in $[0, L]$. Because of the cubic
power, the series certainly converges. But does it converge to $x(L - x)$?
We shall see in Section 9.3 that the answer is yes.
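Before the proof in Section 9.3, the claim can at least be tested numerically. The Python sketch below is not part of the text. It evaluates the coefficient integral $\frac{2}{L}\int_0^L x(L - x)\sin\frac{n\pi x}{L}\,dx$ by the midpoint rule, compares it with the closed form $\frac{4L^2}{n^3\pi^3}(1 - (-1)^n)$, and sums the series for $u(t, x)$. The values $L = 1$ and $\kappa = 1$, the number of terms, and all function names are assumptions chosen for illustration.

```python
import math

L = 1.0        # bar length (hypothetical value)
KAPPA = 1.0    # diffusivity (hypothetical value)


def f(x):
    """Initial temperature x(L - x)."""
    return x * (L - x)


def b_closed(n):
    """Closed form 4 L^2 (1 - (-1)^n) / (n^3 pi^3) from the integration by parts."""
    return 4.0 * L ** 2 / (n ** 3 * math.pi ** 3) * (1 - (-1) ** n)


def b_numeric(n, m=2000):
    """(2/L) * integral_0^L f(x) sin(n pi x / L) dx by the midpoint rule."""
    dx = L / m
    total = 0.0
    for k in range(m):
        x = (k + 0.5) * dx
        total += f(x) * math.sin(n * math.pi * x / L)
    return 2.0 / L * total * dx


def u(t, x, n_terms=50):
    """Partial sum of the series solution with n_terms terms."""
    total = 0.0
    for n in range(1, n_terms + 1):
        lam = (n * math.pi / L) ** 2
        total += b_closed(n) * math.exp(-lam * KAPPA * t) * math.sin(n * math.pi * x / L)
    return total


for n in (1, 2, 3):
    print(n, b_numeric(n), b_closed(n))
for x in (0.1, 0.25, 0.5):
    print(x, u(0.0, x), f(x), u(0.1, x))
```

At $t = 0$ the partial sums track $x(L - x)$ closely, and for $t > 0$ the exponential factors make all but the first few terms negligible.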
□ Theorem 9.1.1 Suppose that $f$ is piecewise continuous on $[0, L]$, and let
$u(t, x)$ be defined by (10) where the $b_n$ are given by (12). Then for $t > 0$,
$u(t, x)$ is infinitely often continuously differentiable in both $t$ and $x$ and
satisfies the heat equation (4) and the boundary conditions (6).
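\[ \sum_{n=1}^{\infty} b_ne^{-\lambda_n\kappa t}\left(\frac{n\pi}{L}\right)\cos\frac{n\pi x}{L} \]
\[ e^{-\lambda_n\kappa t_0} = \frac{1}{1 + \lambda_n\kappa t_0 + \frac{(\lambda_n\kappa t_0)^2}{2!} + \cdots} \le \frac{2!}{(\lambda_n\kappa t_0)^2} \]
\[ M_n \le 2M\left(\frac{n\pi}{L}\right)\frac{2!}{(\lambda_n\kappa t_0)^2} \le \frac{C}{n^3}. \]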
This estimate shows that $\sum M_n < \infty$. Thus, by the Weierstrass M-test,
the differentiated series converges uniformly for all $x$. Notice that the
estimates hold uniformly for all $t \ge t_0$ and that the terms in the differentiated
series are continuous functions. Thus we have proved that $u(t, x)$
is continuously differentiable in $x$ in the region $-\infty < x < \infty$, $t_0 \le t$.
Since $t_0$ was an arbitrary positive number, it follows that $u(t, x)$ is continuously
differentiable in $x$ for $t > 0$ and the derivative can be computed
by differentiating the series for $u(t, x)$ term by term.
Similarly, the term-by-term derivative with respect to $t$,
\[ \sum_{n=1}^{\infty} -\lambda_n\kappa b_ne^{-\lambda_n\kappa t}\sin\frac{n\pi x}{L}, \]
Problems
\[ \int_0^L \sin\frac{n\pi x}{L}\sin\frac{m\pi x}{L}\,dx = \begin{cases} 0 & \text{if } n \ne m \\ \frac{L}{2} & \text{if } n = m \end{cases} \]
(a) Generate the graphs of $f$, $S_1$, $S_3$, and $S_5$ on the interval $[0, \pi]$ when
$t = 0$. Does it look as if the series for $u(0, x)$ is converging to $f$?
(b) Find an upper bound on the error we make if we replace $u$ by $S_5$.
(c) Compare the graphs of $S_5$ for $t = 0$, .5, 1, 2, and 5. Is this the way
you would expect $u(t, x)$ to behave?
(a) Compute the coefficients $b_n$. Hint: use formula (12) and integrate
by parts.
(b) Show by explicit differentiation that the function $u$ defined by (10)
satisfies the heat equation for $t > 0$ and explain why the boundary
conditions (6) hold for all $t > 0$.
(c) Explain why the series for $u$ at $t = 0$ cannot converge to $f$ for all
$x \in [0, \pi]$.
6. Let $S_n$ be the $n$th partial sum of the series for the solution $u$ of problem 5.
(a) Generate enough graphs of $S_1, S_2, S_3, \ldots$ to convince yourself that,
for $t = 0$, the series converges to $f$ for all $x \in [0, \pi]$ except $x = \pi$.
(b) Compare the graphs of $S_5$ for $t = 0$, .5, 1, 2, and 5. Is this the way
you would expect $u(t, x)$ to behave?
\[ u_x(t, 0) = 0 = u_x(t, L) \tag{13} \]
(b) Use the methods of the section to derive the formal solution
\[ a_0 = \frac{1}{L}\int_0^L f(x)\,dx. \]
(c) Explain why the same arguments as in Theorem 9.1.1 show that $u$ is
infinitely differentiable in $x$ and $t$ for $t > 0$ and $u$ satisfies the heat
equation and the boundary conditions (13).
(d) By using the partial differential equation, prove that $\int_0^L u(t, x)\,dx$ is
independent of $t$. Why is that reasonable?
(e) What happens to the solution as $t \to \infty$?
8. Suppose that the ends of the bar are kept at temperature zero and that
$\beta cu(t, x)$ units of heat are added to the bar per gram per unit time by
some internal chemical reaction where $\beta$ is a constant. Show that $u$ should
satisfy the partial differential equation
Using the methods of the section, write down a formal solution to (14)
which satisfies the boundary condition (6) and the initial condition (5).
How does the behavior of the solution as $t \to \infty$ depend on $\beta$?
9.2 Definitions and Examples 331
9. Suppose that the ends of the bar are kept at temperature zero and that
$cg(x)$ units of heat are added to the bar per gram per unit time at time $t$.
Show that $u$ should satisfy the partial differential equation
Suppose that the initial temperatures are given by $f(x)$ and assume that
$g$ can be written in the form $g(x) = \sum_{n=1}^{\infty} c_n\sin\frac{n\pi x}{L}$. Suppose that the
solution $u(t, x)$ has the form
\[ u(t, x) = \sum_{n=1}^{\infty} T_n(t)\sin\frac{n\pi x}{L} \]
for some unknown functions $\{T_n(t)\}$. For each $n$, find and solve the ordinary
differential equation that must be satisfied by $T_n(t)$.
10. Consider a rectangular plate with coordinates $(0, 0)$, $(a, 0)$, $(a, b)$, and
$(0, b)$ for the vertices. Assume that the temperature $u(t, x, y)$ satisfies the
two-dimensional heat equation
Using the methods of the section, find functions $T_{n,m}(t)$ so that this
initial-boundary value problem has a solution of the form
\[ u(t, x, y) = \sum_{n=1}^{\infty}\sum_{m=1}^{\infty} T_{n,m}(t)\sin\frac{n\pi x}{a}\sin\frac{m\pi y}{b}. \]
Note that if £ |am| < oo and £|6m| < oo then the series on the right
certainly converges for each x. If we define
71
then (15) means that the partial sums Sn(x) converge to f{x). Each par¬
tial sum is a periodic function of period 27t; that is, Sn(x + 2ir) = Sn(x).
It is easy to prove from this (problem 1) that if the series converges, then
/ must be periodic of period 2tt. From now on we assume that / has this
property. For calculations, it is often convenient to rewrite the series and
partial sums in terms of complex exponentials. Using the formulas for
sin x and cos x in terms of complex exponentials (Example 1 of Section
6.5), we find
Thus, if we define
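\[ c_m = \begin{cases} \tfrac{1}{2}(a_m - i b_m) & \text{if } m > 0 \\ a_0 & \text{if } m = 0 \\ \tfrac{1}{2}(a_{-m} + i b_{-m}) & \text{if } m < 0 \end{cases} \]

then

\[ S_n(x) = \sum_{m=-n}^{n} c_m e^{imx} \]

and

\[ f(x) = \sum_{m=-\infty}^{\infty} c_m e^{imx}, \tag{16} \]

where the infinite sum on the right means the limit of the partial sums
Σ_{m=−n}^{n} c_m e^{imx} as n → ∞. Some of the Fourier series which we shall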
study are not absolutely convergent, so it is important to specify which
sequence of partial sums we mean.
If (16) holds and if the series converges uniformly, then the coefficients
{c_m} are determined. To see this, multiply both sides of (16) by
e^{−inx} and integrate term by term. Then,

\[ \int_{-\pi}^{\pi} f(x) e^{-inx}\,dx = \sum_{m=-\infty}^{\infty} c_m \int_{-\pi}^{\pi} e^{i(m-n)x}\,dx = 2\pi c_n \]
since the integral is zero unless n = m. This shows that the coefficient cn
is determined for each integer n. If we define

\[ c_m = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x) e^{-imx}\,dx, \tag{17} \]
then the series on the right side of (16) is called the Fourier series of f and
the c_m are called the Fourier coefficients. Since (15) is just a rewriting of
(16), it is also called the Fourier series of f. The coefficients {a_m}_{m=0}^∞
and {b_m}_{m=1}^∞ are also called Fourier coefficients and can be written in
terms of {c_m}_{m=−∞}^∞ by the formulas: a_0 = c_0, a_m = c_{−m} + c_m, and
b_m = i^{−1}(c_{−m} − c_m), for m > 0. From these formulas, it follows easily
(problem 2) that

\[ a_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,dt \tag{18} \]

\[ a_m = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos mt\,dt, \quad m > 0 \tag{19} \]

\[ b_m = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin mt\,dt, \quad m > 0. \tag{20} \]
Example 1 Suppose that f(x) = (π − |x|)² on [−π, π]. Since f(π) =
0 = f(−π), the function f can be extended to be a continuous function
of period 2π on ℝ. Since f(t) is even, all the coefficients b_m are equal to
zero (problem 4). We calculate

\[ a_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} (\pi - |t|)^2\,dt = \frac{1}{\pi}\int_0^{\pi} (\pi - t)^2\,dt = \frac{\pi^2}{3} \]

and by integration by parts,

\[ a_m = \frac{2}{\pi}\int_0^{\pi} (\pi - t)^2 \cos mt\,dt = \frac{4}{\pi m}\int_0^{\pi} (\pi - t)\sin mt\,dt = -\frac{4}{\pi m^2}\Big\{(\pi - t)\cos mt\Big\}_{t=0}^{t=\pi} = \frac{4}{m^2}. \]

Thus, the Fourier series of f is

\[ f(x) \sim \frac{\pi^2}{3} + \sum_{m=1}^{\infty} \frac{4}{m^2}\cos mx. \tag{21} \]
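The coefficients found in Example 1 can be checked numerically. The following is only a sketch: the function names and the sampling grid are our own choices, and it simply compares f(x) = (π − |x|)² with partial sums of π²/3 + Σ (4/m²) cos mx on [−π, π]. Since the tail Σ_{m>n} 4/m² is less than 4/n, the maximum error should shrink roughly like 1/n.

```python
import math

def f(x):
    """The function of Example 1 on [-pi, pi]."""
    return (math.pi - abs(x)) ** 2

def partial_sum(x, n):
    """n-th partial sum of pi^2/3 + sum_{m=1}^n (4/m^2) cos(mx)."""
    return math.pi ** 2 / 3 + sum(4.0 / m ** 2 * math.cos(m * x)
                                  for m in range(1, n + 1))

def max_error(n, samples=201):
    """Largest |f(x) - S_n(x)| over an evenly spaced grid on [-pi, pi]."""
    xs = [-math.pi + 2 * math.pi * k / (samples - 1) for k in range(samples)]
    return max(abs(f(x) - partial_sum(x, n)) for x in xs)

if __name__ == "__main__":
    for n in (5, 50, 500):
        print(n, max_error(n))
```

The grid check is not a proof of convergence, but it makes the 1/m² decay of the coefficients visible.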
334 Chapter 9. Fourier Series
Figure 9.2.1
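\[ b_m = -\frac{1}{\pi}\int_{-\pi}^{0} \sin mx\,dx + \frac{1}{\pi}\int_{0}^{\pi} \sin mx\,dx = \begin{cases} \dfrac{4}{\pi m} & m = 1, 3, 5, \ldots \\ 0 & m = 2, 4, 6, \ldots \end{cases} \]

Thus, the Fourier series of f is

\[ f(x) \sim \sum_{n=0}^{\infty} \frac{4}{\pi(2n+1)} \sin(2n+1)x. \tag{22} \]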
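A short numerical experiment with the sine series (22) is sketched below. We assume, as the integrals for b_m suggest, that f equals −1 on (−π, 0) and +1 on (0, π); the function name is ours. At x = π/2 the series is alternating, so the error after N terms is at most the size of the next term.

```python
import math

def square_wave_partial_sum(x, terms):
    """Sum of the first `terms` terms of (22):
    4/(pi(2n+1)) sin((2n+1)x) for n = 0, 1, ..., terms - 1."""
    return sum(4.0 / (math.pi * (2 * n + 1)) * math.sin((2 * n + 1) * x)
               for n in range(terms))

if __name__ == "__main__":
    for terms in (10, 100, 1000):
        print(terms,
              square_wave_partial_sum(math.pi / 2, terms),
              square_wave_partial_sum(-math.pi / 2, terms))
```

Every partial sum vanishes at x = 0, which is the average of the assumed left and right values −1 and +1.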
Notice that it is not at all evident that this series converges for any x ≠
0 since the coefficients decay only like m^{−1}. In fact, how could a sum
Figure 9.2.2
\[ f(x) = \sum_{m=1}^{\infty} b_m \sin\frac{m\pi x}{L}, \]
Problems
6. Assume that the Fourier series for f(x) = (π − |x|)², which we computed
in Example 1, converges to f(x) for each x in [−π, π]. Prove that

\[ \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}. \]
8. Show that the Fourier series of the function f(x) = |x| on the interval
[−π, π] is

\[ f(x) \sim \frac{\pi}{2} - \frac{4}{\pi}\sum_{n=1}^{\infty} \frac{\cos(2n-1)x}{(2n-1)^2}. \]
(b) Let f_a be the function f_a(x) = f(x − a). What is the relationship
between the Fourier coefficients of f and the Fourier coefficients of
f_a?
\[ \lim_{n\to\infty} \int_a^b f(t) e^{int}\,dt = 0. \tag{23} \]
If each term in the finite sum on the right of (24) converges to zero as
n → ∞, then the left-hand side converges to zero. Thus, it is sufficient to
prove the result in the case where f is continuous, which we henceforth
assume. If f is constant, then the explicit calculation that we made above
shows that (23) holds. So the idea of the proof is to approximate f by a
piecewise constant function. Let ε > 0 be given. By Theorem 3.2.5, f is
uniformly continuous on [a, b], so we can choose a δ > 0 so that
Therefore,
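\[ \left|\int_a^b f(t) e^{\pm int}\,dt\right| = \left|\int_a^b (f(t) - g(t)) e^{\pm int}\,dt + \int_a^b g(t) e^{\pm int}\,dt\right| \tag{26} \]

\[ \le \varepsilon + \sum_{j=1}^{N} |f(p_j)| \left|\int_{p_{j-1}}^{p_j} e^{\pm int}\,dt\right| \tag{28} \]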
9.3 Pointwise Convergence 339
Each of the integrals in the finite sum goes to zero as n → ∞, so
exist and are not infinite. Note that the value of f at x may not itself be
involved in these difference quotients since it need not equal f(x−) or
f(x+). Of course, if f is continuous at x, then f(x−) = f(x) = f(x+).
The following theorem gives sufficient conditions for the pointwise
convergence of Fourier series.
\[ \lim_{n\to\infty} S_n(x) = \frac{f(x+) + f(x-)}{2}. \tag{29} \]
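\[ S_n(x) = \sum_{m=-n}^{n} \left( \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t) e^{-imt}\,dt \right) e^{imx} \tag{30} \]

\[ = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t) \sum_{m=-n}^{n} e^{im(x-t)}\,dt. \tag{31} \]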
Solving (32) for D_n(x) and multiplying the numerator and denominator
by e^{−ix/2}, we find that

\[ D_n(x) = \frac{\sin(n + \tfrac{1}{2})x}{\sin\tfrac{1}{2}x}. \]
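The closed form for D_n can be compared with the exponential sum directly. This sketch assumes that D_n(x) is defined as the sum of e^{imx} for m from −n to n, which is what the computation of S_n(x) suggests; the function names are ours.

```python
import cmath
import math

def dirichlet_sum(n, x):
    """D_n(x) as the sum of e^{imx}, m = -n, ..., n (the imaginary parts cancel)."""
    return sum(cmath.exp(1j * m * x) for m in range(-n, n + 1)).real

def dirichlet_closed_form(n, x):
    """sin((n + 1/2)x) / sin(x/2), valid when sin(x/2) is not zero."""
    return math.sin((n + 0.5) * x) / math.sin(0.5 * x)

if __name__ == "__main__":
    for n in (1, 5, 20):
        for x in (0.3, 1.0, 2.5):
            print(n, x, dirichlet_sum(n, x), dirichlet_closed_form(n, x))
```

At x = 0 the closed form is a 0/0 expression, while the sum itself equals 2n + 1.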
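\[ S_n(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t) D_n(x - t)\,dt = \frac{1}{2\pi}\int_{-\pi-x}^{\pi-x} f(x + s) D_n(-s)\,ds = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x+s) D_n(s)\,ds. \]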
In the last step we used the fact that D_n is even and that all integrals of
f(x + s)D_n(s) on intervals of length 2π are the same since both f(x + s)
and D_n(s) are periodic of period 2π in s. Because (1/2π) ∫_{−π}^{π} D_n(s) ds = 1
and D_n(s) is even,
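\[ S_n(x) - \frac{f(x+) + f(x-)}{2} = \frac{1}{2\pi}\int_{-\pi}^{0} (f(x+s) - f(x-)) D_n(s)\,ds + \frac{1}{2\pi}\int_{0}^{\pi} (f(x+s) - f(x+)) D_n(s)\,ds \]

\[ = \frac{1}{2\pi}\int_{-\pi}^{0} g(s) \sin(n+\tfrac{1}{2})s\,ds + \frac{1}{2\pi}\int_{0}^{\pi} h(s)\sin(n+\tfrac{1}{2})s\,ds, \]

where

\[ g(s) = \frac{f(x+s) - f(x-)}{\sin\tfrac{1}{2}s}, \qquad h(s) = \frac{f(x+s) - f(x+)}{\sin\tfrac{1}{2}s}. \]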
Since sin(s/2) does not vanish on [−π, 0), and f is piecewise continuous, g
is piecewise continuous on [−π, 0). Furthermore, if we write

\[ \frac{1}{\pi}\int_{-\pi}^{0} g(s)\sin(n+\tfrac{1}{2})s\,ds = \frac{1}{\pi}\int_{-\pi}^{0} \big(g(s)\cos\tfrac{s}{2}\big)\sin ns\,ds + \frac{1}{\pi}\int_{-\pi}^{0} \big(g(s)\sin\tfrac{s}{2}\big)\cos ns\,ds. \]
Now we write sin ns and cos ns in terms of e^{ins} and e^{−ins}, and we use
the Riemann-Lebesgue lemma to conclude that

\[ \frac{1}{\pi}\int_{-\pi}^{0} g(s)\sin(n+\tfrac{1}{2})s\,ds \to 0 \quad \text{as } n \to \infty. \]

Similarly,

\[ \frac{1}{\pi}\int_{0}^{\pi} h(s)\sin(n+\tfrac{1}{2})s\,ds \to 0 \quad \text{as } n \to \infty, \]

which completes the proof. □
and
\[ \lim_{s \nearrow 0} \frac{f(0+s) - f(0-)}{s} = \lim_{s\nearrow 0} \frac{(\pi+s)^2 - \pi^2}{s} = 2\pi, \]
\[ f(x) = \frac{\pi^2}{3} + \sum_{m=1}^{\infty} \frac{4}{m^2}\cos mx. \]
One reason that Theorem 9.3.2 is hard to prove is that the hypotheses
and conclusion are “local.” If the difference quotient for / has right-hand
Problems
±int dt 0.
(a) For each x ∈ ℝ, find the right-hand and left-hand limits of the difference
quotient of f.
(b) For each x ∈ ℝ, to what number does the Fourier series of f
converge?
(c) /(*) = ^
(d) f(x) = sin K
(e) f(x) = a; sin
(f) f(x) = x2 sin K
\[ f(x) = \begin{cases} 0 & -\pi < x < 0 \\ 5x & 0 < x < \pi \\ 5\pi/2 & x = \pi. \end{cases} \]

\[ f(x) = \begin{cases} \pi + x & -\pi < x < 0 \\ \pi - x & 0 < x < \pi. \end{cases} \]
8. Explain carefully why the coefficients {a_n} of the solution u(t, x) in problem
7 of Section 9.1 can be chosen so that the initial condition u(0, x) =
f(x) holds for all x if f′(0) = 0 = f′(L).
9.4 Mean-square Convergence 345
10. A subset E ⊂ ℝ is said to have measure zero if, given any ε > 0,
there is a countable family of intervals {I_n}_{n=1}^∞ such that E ⊂ ∪I_n and
Σ length(I_n) < ε.
(a) Show that any finite set of points has measure zero.
(b) Show that the set { -
knJ
} has measure zero.
(c) Show that any countable set has measure zero. Remark: There are
uncountable sets of measure zero.
\[ (f, g) = \int_a^b f(x)\overline{g(x)}\,dx, \]
From (b) and (c), it follows that the inner product is conjugate linear in
the second factor; that is, $(f, \alpha_1 g_1 + \alpha_2 g_2) = \bar\alpha_1 (f, g_1) + \bar\alpha_2 (f, g_2)$. The
\[ \|f\|_2 = (f, f)^{1/2}. \]

If ‖f − f_n‖₂ → 0, then the sequence of functions {f_n} is said to converge
to f in the L² norm, or in the mean-square sense. The reader is asked
to show in problem 3 that pointwise convergence for all x does not imply
mean-square convergence, nor does mean-square convergence imply
pointwise convergence.
Proof. The proof uses only the three simple properties of the inner
product. First suppose that ‖g‖₂ = 1.

\[ 0 \le \|f - (f,g)g\|_2^2 = (f - (f,g)g,\; f - (f,g)g) \]

\[ = (f,f) - \overline{(f,g)}(f,g) - (f,g)\overline{(f,g)} + (f,g)\overline{(f,g)}(g,g) \]

\[ = \|f\|_2^2 - |(f,g)|^2, \]

from which (33) follows in the case ‖g‖₂ = 1. If g is the zero function,
then (33) certainly holds, and if g is not, then ‖g‖₂ > 0. Applying what
we have just proven to the functions f and h = g/‖g‖₂, we find |(f, h)| ≤
‖f‖₂ since ‖h‖₂ = 1. Thus,
Proof. Properties (a) and (b) follow immediately from similar properties
of the inner product. By the Cauchy-Schwarz inequality,
Thus || • ||2 has the three properties of a norm (see Section 5.8), which is
why we have been referring to it as the L2 norm. Since this norm comes
from an inner product, we can introduce a notion of orthogonality.
Notice that the dot product is a function from ℝⁿ × ℝⁿ to ℝ that satisfies
on the interval [−π, π]. The fact that the functions e^{inx} are orthogonal to
each other enabled us to derive formula (17) for the Fourier coefficients.
Dividing by √(2π) ensures that each has L² norm equal to 1. Note that it is
only for convenience in the definition that we have indexed the sequence
of functions {φ_n(x)} from n = 1 to n = N, where possibly N = ∞. The
index can run over some other set (as in the above example), and the
order of the functions in the sequence plays no role in the definition.
Suppose that f is a piecewise continuous function and {φ_n}_{n=1}^N is a
finite orthonormal family on the interval [a, b]. We want to find the linear
combination of the functions φ_n that gives the best approximation to f
in the mean-square sense. That is, we want to choose coefficients {c_n} so
that the norm of the difference ‖f − Σ_{n=1}^N c_n φ_n(x)‖₂ is as small as possible.
First, we calculate
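\[ \Big\|f - \sum_{n=1}^{N} c_n\varphi_n\Big\|_2^2 = \|f\|_2^2 - \sum_{n=1}^{N} \bar c_n (f,\varphi_n) - \sum_{n=1}^{N} c_n\overline{(f,\varphi_n)} + \sum_{n=1}^{N} |c_n|^2 \]

\[ = \|f\|_2^2 - \sum_{n=1}^{N} |(f,\varphi_n)|^2 + \sum_{n=1}^{N} |c_n - (f,\varphi_n)|^2, \]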
where we completed the square in the last step. Since the first two terms
on the right do not depend on the sequence {c_n} and the third term is
nonnegative, we see that ‖f − Σ_{n=1}^N c_n φ_n‖₂ is smallest if we choose c_n =
(f, φ_n). In this case
The numbers c_n = (f, φ_n) are called the generalized Fourier coefficients
of f with respect to the orthonormal family {φ_n}_{n=1}^N. Notice that since
N N
If {φ_n(x)} is an infinite orthonormal family, then (36) holds for each finite
N. Since the right-hand side doesn't depend on N, this shows that
Σ|c_n|² is finite and

\[ \sum_{n=1}^{\infty} |c_n|^2 \le \|f\|_2^2, \]
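\[ \int_{-\pi}^{\pi} \frac{e^{inx}}{\sqrt{2\pi}} \frac{e^{-imx}}{\sqrt{2\pi}}\,dx = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{i(n-m)x}\,dx = \begin{cases} 1 & \text{if } n = m \\ 0 & \text{otherwise,} \end{cases} \]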
Note that the Fourier series is the same as (16), but the definition of c_m
differs by a factor of √(2π) from (17) since we have put the factor √(2π)
under e^{inx} so that e^{inx}/√(2π) has L² norm equal to 1. If we define the
partial sum, S_n(f), of the Fourier series of f by

\[ S_n(f)(x) = \sum_{m=-n}^{n} c_m \frac{e^{imx}}{\sqrt{2\pi}}, \]
\[ \sum_{m=-\infty}^{\infty} |c_m|^2 \le \|f\|_2^2. \tag{39} \]
We can immediately use Bessel's inequality to improve Corollary 9.3.3.
Proof. We already know from Corollary 9.3.3 that the Fourier series of f
converges to f pointwise. By the Weierstrass M-test, we need only show
that Σ_{m=−∞}^{∞} |c_m| < ∞ to conclude that the series converges uniformly. If
we denote by {c′_m} the Fourier coefficients of f′, then integration by parts
in (17) shows that i m c_m = c′_m for all m ≠ 0. For each positive integer n,
let Q_n denote the set of integers Q_n = {m | −n ≤ m ≤ n; m ≠ 0}.
Then, the discrete Cauchy-Schwarz inequality (problem 10 in Section 5.8)
implies that
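\[ \sum_{m \in Q_n} |c_m| = \sum_{m\in Q_n} \frac{|c'_m|}{|m|} \le \Big(\sum_{m\in Q_n} \frac{1}{m^2}\Big)^{1/2} \Big(\sum_{m \in Q_n} |c'_m|^2\Big)^{1/2} \]

\[ \sum_{m=-\infty}^{\infty} |c_m| < \infty, \]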
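\[ \int_{-\infty}^{\infty} j_\delta(x-y) f(y)\,dy. \]

\[ g_\delta(x) = \int_{a-\delta}^{b+\delta} j_\delta(x-y) f(y)\,dy. \]

\[ \int_{-\infty}^{\infty} j_\delta(r) f(x-r)\,dr, \]

\[ |g_\delta(x) - f(x)| = \left| \int_{-\infty}^{\infty} j_\delta(r)\,(f(x-r) - f(x))\,dr \right| \]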
= J - f{x))dr
/OO
js(x)di
- _ _ ~°°
and
(Parseval's relation).
Since the formula for the Fourier coefficients, (17), depends linearly on
the function under the integral, S_n(g) − S_n(f) = S_n(g − f). Furthermore,
by (38), ‖S_n(g − f)‖₂ ≤ ‖g − f‖₂, so,
which proves that ‖f − S_n(f)‖₂ → 0 as n → ∞. Furthermore, formula
(35) gives
Problems
1. Verify the three properties, (a), (b), and (c), of the inner product.
2. Let {f_n} be a sequence of continuous functions on [a, b] that converges to
a function f uniformly. Prove that f_n → f in the mean-square sense.
3. Let [a, b] be a finite interval.
(a) Construct a sequence of continuous functions on [a, b], {f_n}, so that
f_n → 0 pointwise but ‖f_n‖₂ → ∞. Hint: choose f_n to be a function
which is tall on a small set and zero elsewhere.
(b) Construct a sequence of continuous functions on [a, b], {f_n}, so that
‖f_n‖₂ → 0 but {f_n(x)} does not converge to 0 for any x ∈ [a, b]. Hint:
find f_n with narrow graphs that march back and forth across [a, b].
(b) Let f(x) = 0 for x ≥ 0 and f(x) = e^{−1/x²} for x < 0. Prove that f is a
C^∞ function on ℝ. Hint: see Example 4 in Section 6.4.
(c) Use the function f and its translates to construct a C^∞ function j
that has the properties in (a).
6. (a) Let f be a piecewise continuous function on [−π, π]. Prove that there
is a sequence of continuous functions f_n on [−π, π] so that f_n → f in
the mean-square sense. Hint: connect the pieces.
(b) Use the idea of the proof of Theorem 9.4.6 to show that the Fourier
series of a piecewise continuous function f converges to f in the
mean-square sense.
(42)
Prove that strict equality holds in (42) if and only if f(x) = a cos x + b sin x
for some constants a and b. Hint: use Parseval's relation.
k *
J
r \f\x)\ux
—TV
<
Projects
(a) Compute the coefficients in the expansion f(x) — fl» cos ns.
(b) We will approximate the solution, u(t, x), of the heat equation by
the first eight terms of its expansion v(t, x) = ane~Xnt cos nx.
Graph v at times t = 0, |, and 1.
(c) What properties of the solution discussed in Section 9.1 (and prob¬
lem 7 of that section) can you observe in the graphs?
(d) How close is v to the true solution at t = 1?
(e) Compare v and 5 — e_t cos nx at t = 1. Why are they so close?
(f) Investigate the influence of n by graphing the approximate solu¬
tions at the times t = 0, |, and 1 in the cases k = 10 and
« = 1/10.
2. The technique used to solve the heat equation in Section 9.1 can also be
used to solve boundary value problems for other partial differential equa¬
tions. In the simplest model for a vibrating string of length L that has
fixed ends, the unknown function, u(t, x), which represents the vertical
displacement from equilibrium of the string at position x at time t, satisfies
the wave equation
The constant c = T/ρ where T is the tension in the string and ρ is the
density. We specify the initial displacements and the initial velocity of
the string at each x.
(b) Using Theorem 9.3.2 and the idea in Example 3 of Section 9.2, show
that the constants {a_n} and {b_n} can be chosen so that the series
OO
(47)
(a) Explain why the improper Riemann integrals (f, e^{−inx}) and ‖f‖₂²
exist.
(b) If g is continuous, explain why Bessel's inequality will hold for f − g.
(c) Show that f can be approximated as closely as we like in the mean-square
sense by a continuous function.
(d) Prove that the Fourier series of f converges to f in the mean-square
sense.
(e) Prove that the Fourier series of h(x) — sin ^ converges to h in the
mean-square sense on [−π, π].
(f) Formulate theorems more general than Theorem 9.3.6 that guarantee
mean-square convergence of Fourier series.
(a) Prove that |(v, w)| ≤ ‖v‖‖w‖ for all v ∈ V and w ∈ V.
(b) Prove that ‖·‖ is a norm on V.
(c) Suppose that v_n → v; that is, ‖v_n − v‖ → 0. Prove that for every
w ∈ V, (v_n, w) → (v, w).
(d) Vectors v and w are said to be orthogonal if (v, w) = 0. Let {φ_n} be
an orthonormal family of vectors in V; that is, ‖φ_n‖ = 1 for all n and
(φ_n, φ_m) = 0 if n ≠ m. Prove that for all v ∈ V,
(a) Let ℂⁿ denote the set of n-tuples of complex numbers with the usual
vector addition and scalar multiplication. For any two such vectors,
z = (z₁, z₂, …, z_n) and w = (w₁, w₂, …, w_n), we define the inner
product by
n
OO
Probability Theory
In this chapter we show how many of the analytical tools which we have
developed, such as sequences, series, limit theorems, and metric spaces,
are used in probability theory. In Section 10.1 we introduce discrete random
variables, using the Bernoulli, binomial, and Poisson random variables
as examples. In Section 10.2 we show how simple probabilistic
ideas and the concept of metric are used in coding theory. Continuous
random variables are discussed in Section 10.3. Finally, in Section
10.4 we develop more advanced applications of metric space concepts
to probability theory. Chebyshev's inequality and the weak law of large
numbers are covered in the projects.
Given an outcome (m, n), the value of X is the sum of the dice. The
values of Y and Z are the numbers on the green die and the red die,
respectively. Because X, Y and Z are ℝ-valued functions defined on the
set of outcomes of the experiment, they are random variables. Y and Z
360 Chapter 10. Probability Theory
take values between 1 and 6 and X takes values 2 through 12; that is, the
range of X is the set of integers between 2 and 12.
which says simply that X must take one of these values. If A is any
subset of ℝ, then we define the probability that the value of X is in A,
denoted P{X ∈ A}, by
where the sum is over all n such that a_n ∈ A. This makes sense because
we are saying that the probability that the value of X lies in A is the sum
of the probabilities that X takes on each of the different numbers in A.
The function whose value at an is P{X = an} is called the mass density
of the discrete random variable X.
If we denote the complement of A in ℝ by A^c, we note that
Thus, we always know that P{X ∈ A^c} = 1 − P{X ∈ A}. We remark that
we often write P{a ≤ X ≤ b} instead of P{X ∈ [a, b]}.
Let X be the random variable in the dice experiment and let A be the
closed interval A = [−1, 3]. Then P{X ∈ A} = 1/36 + 2/36 = 1/12 since 2 and
3 are the only possible values of X in the interval [−1, 3]. Now, suppose
that A = [3, ∞). Then,
for these two particular sets, A and B. With a little more work, one can
show that (2) is true for all choices of A and B, so the random variables
Y and Z are independent. On the other hand, consider the random variables
X and Y with the same two sets, A = {2, 3} and B = {4}. Then,
P{X ∈ A} = 1/36 + 2/36 = 1/12 and P{Y ∈ B} = 1/6. However, there are no
outcomes so that X has the value 2 or 3 and Y has the value 4. Thus,
= (3)
\[ \sum_{k=0}^{n} P\{X = k\} = \sum_{k=0}^{n} \frac{n!}{(n-k)!\,k!}\, p^k (1-p)^{n-k} = (p + (1-p))^n = 1. \]
Similarly, the probability that the first flip is a head and the second flip
is a tail is p(1 − p). Now suppose we flip the coin n times and each flip
is independent of the other flips. Let X_k be the random variable that is
1 if the k-th flip is heads and 0 if the k-th flip is tails. Define the random
variable X = X₁ + X₂ + ⋯ + X_n. Then, the value of X is just the number
of heads in n flips of the coin, so the possible values for X are the integers
0 through n. To compute P{X = k}, notice that the probability that any
particular configuration of k heads will occur is p^k(1 − p)^{n−k} because the
flips are independent. Since there are n!/((n − k)! k!) different choices for the
positions of the k heads in n flips, we see that (3) holds; that is, X is a
binomial random variable.
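The counting argument above can be checked by brute force for small n. The sketch below is ours (the function names are not from the text): it lists every sequence of n independent flips, adds up the probabilities of the sequences with k heads, and compares the totals with n!/((n − k)! k!) p^k (1 − p)^{n−k}.

```python
from itertools import product
from math import comb

def binomial_pmf(n, p, k):
    """P{X = k} = n!/((n-k)! k!) p^k (1-p)^(n-k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def pmf_by_enumeration(n, p):
    """Add up the probability of every sequence of n independent flips,
    grouped by the number of heads (1 = heads, 0 = tails)."""
    totals = [0.0] * (n + 1)
    for flips in product((0, 1), repeat=n):
        heads = sum(flips)
        totals[heads] += p ** heads * (1 - p) ** (n - heads)
    return totals

if __name__ == "__main__":
    n, p = 6, 0.3
    for k, value in enumerate(pmf_by_enumeration(n, p)):
        print(k, value, binomial_pmf(n, p, k))
```

The enumeration visits 2ⁿ sequences, so it is only practical for small n; the closed formula has no such limit.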
10.1 Discrete Random Variables 363
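\[ P\{X = k\} = e^{-\lambda}\frac{\lambda^k}{k!} \]

\[ \sum_{k=0}^{\infty} P\{X = k\} = e^{-\lambda}\sum_{k=0}^{\infty}\frac{\lambda^k}{k!} = e^{-\lambda}e^{\lambda} = 1 \]

\[ P\{X_1 = k\} = e^{-\lambda}\frac{\lambda^k}{k!} = P\{X_2 = k\}, \]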
The expected value, which is also called the mean, is the weighted average
of the possible values of X, with the weight of each value given by
its probability of occurrence. Note that once we know the mass density
of a discrete random variable, we can compute its mean without knowing
what the underlying experiment is or the meaning of X. A simple
Proposition 10.1.1
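\[ E(X) = \sum_{k=0}^{n} k\,\frac{n!}{(n-k)!\,k!}\, p^k (1-p)^{n-k} = np\sum_{k=1}^{n} \frac{(n-1)!}{(n-k)!\,(k-1)!}\, p^{k-1}(1-p)^{n-k} \]

\[ = np \sum_{k=0}^{n-1} \frac{(n-1)!}{((n-1)-k)!\,k!}\, p^k (1-p)^{(n-1)-k} = np. \]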
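\[ E(X) = \sum_{k=0}^{\infty} k\, e^{-\lambda}\frac{\lambda^k}{k!} = \lambda e^{-\lambda}\sum_{k=1}^{\infty}\frac{\lambda^{k-1}}{(k-1)!} = \lambda e^{-\lambda}\sum_{k=0}^{\infty}\frac{\lambda^k}{k!} = \lambda, \]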
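As a numerical illustration of the Poisson computations, the sketch below truncates the two series after a fixed number of terms. The truncation point and the function names are our own assumptions; for moderate λ the neglected tail is far below floating-point precision.

```python
import math

def poisson_pmf(lam, k):
    """P{X = k} = e^{-lam} lam^k / k! for a Poisson random variable."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def truncated_total(lam, terms=100):
    """Sum of P{X = k} for k = 0, ..., terms - 1 (should be close to 1)."""
    return sum(poisson_pmf(lam, k) for k in range(terms))

def truncated_mean(lam, terms=100):
    """Sum of k P{X = k} for k = 0, ..., terms - 1 (should be close to lam)."""
    return sum(k * poisson_pmf(lam, k) for k in range(terms))

if __name__ == "__main__":
    for lam in (0.5, 3.5, 10.0):
        print(lam, truncated_total(lam), truncated_mean(lam))
```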
\[ \sum_{k=0}^{\infty} (k+1)x^k = \frac{d}{dx}\sum_{k=0}^{\infty} x^k = \frac{d}{dx}\left(\frac{1}{1-x}\right) = \frac{1}{(1-x)^2} \]

for |x| < 1. In the first step we used Theorem 6.3.3 and the fact that the
radius of convergence of the geometric series is 1. Substituting x = ½,
we see that 2E(X) = (1 − ½)^{−2}, so E(X) = 2.
□
Theorem 10.1.2 Let X be a random variable with range {a_n} and let
ψ be a real-valued function on ℝ. If the series Σψ(a_n)P{X = a_n} converges
absolutely, then ψ ∘ X has finite expectation and
\[ = \sum_j b_j \sum_{\{n \,\mid\, \psi(a_n) = b_j\}} P\{X = a_n\} \]
\[ = \sum_j b_j P\{\psi \circ X = b_j\} \]

\[ = E(\psi \circ X). \]

If we follow exactly the same steps but replace ψ(a_n) by |ψ(a_n)| and b_j
by |b_j|, we see that Σ_j b_j P{ψ ∘ X = b_j} converges absolutely. □
Proof. Let the values of X be {a_i} and the values of Y be {b_j}. We begin
by considering the double sum
All the terms are positive in this double sum. Thus, by Theorem 6.2.6,
if we show that it converges in any rearrangement, then it converges in
all rearrangements and the sum is always the same. Since X and Y are
independent,
\[ = \sum_i |a_i| P\{X = a_i\} \sum_j |b_j| P\{Y = b_j\} < \infty. \]
In the last step we used the hypothesis that X and Y have finite expectations.
This proves that XY has finite expectation. Equation (5) is proved
by following the same steps with |a_i b_j| replaced by a_i b_j. □
Problems
1. Consider the experiment where two dice are rolled, and let the random
variable X be the sum of the faces.
(a) no misprints?
(b) less than or equal to three misprints?
(a) Prove that Σ_{k=1}^∞ P{X = k} = 1. Hint: see problem 6 in Section 9.2.
(b) Prove that X does not have finite expectation.
7. Let 0 < p < 1 and consider an experiment in which we flip a coin which
comes up heads with probability p until we get heads. Let X be the number
of flips which we make. Compute E(X).
8. Let X be the random variable which is the sum of the faces in the experiment
of rolling two dice. Compute E(X) and E(X²).
9. Let X and Y be discrete random variables on the same sample space and
suppose that both X and Y have finite expectation. Prove that for any c
and d, the random variable cX + dY has finite expectation and
10. Consider the experiment of rolling two dice, one red and one green. Compute
the expectations of the following random variables:
11. Suppose that Y(λ) is a Poisson random variable with parameter λ. Let A
be any subset of ℝ and define P_A(λ) = P{Y(λ) ∈ A}. Prove that P_A is a
continuous function of λ on (0, ∞).
each message three times. Instead of sending 00, we send 000000 and
instead of sending 01 we send 010101, and so forth. These four binary
strings
000000 010101 101010 111111
will be our code words. Let S be the set of all strings of 0's and 1's of
length 6, and let ρ be the discrete metric on S. That is, if {x_i}_{i=1}^6 and
{y_i}_{i=1}^6 are binary strings of length 6, then
6
i {Hi}) = ^ ^ Vi)
i—1
Since ρ(C_s, C_r) = 1 by assumption, we see that ρ(C_r, C_Q) ≥ 2; that is, the
received string is a distance ≥ 2 from all other code words. Thus, if we
receive a string which has one error in transmission, we can correct the
error by replacing the received string by the unique code word which has
distance 1 from it. Using this scheme, we can correctly decode a received
string if it is in fact correct or if it has exactly one error. The probability
of correct transmission is p⁶, and the probability for transmission with
exactly one error is 6p⁵(1 − p). Thus, we will correctly decode the string
with probability p⁶ + 6p⁵(1 − p). If, for example, p = .9, then
so the reliability has been decreased. This raises several natural questions.
For which p does a repetition of three times improve the reliability?
Suppose that we repeat each two-digit message five times. Then the
four code words would be a distance ≥ 5 apart and we would be able
to correctly decode received strings with no errors, one error, or two errors.
For what p will this code improve reliability? Can we achieve any
desired level of reliability less than 1? For which p? These questions are
investigated further in problems 1, 2, and 9. For obvious reasons, these
codes are called repetition codes.
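The questions about which p make a repetition code worthwhile can be explored numerically. The sketch below is ours and rests on the counting described in the text: a two-digit message sent directly arrives intact with probability p², the three-fold repetition (length 6) is decoded correctly when there is at most one error, and the five-fold repetition (length 10) when there are at most two. The function names are hypothetical.

```python
from math import comb

def direct_reliability(p, digits=2):
    """Probability that all `digits` digits arrive correctly with no coding."""
    return p ** digits

def decoded_correctly(p, length, max_errors):
    """Probability that a string of `length` digits suffers at most
    `max_errors` independent digit errors, each digit being correct
    with probability p."""
    return sum(comb(length, e) * p ** (length - e) * (1 - p) ** e
               for e in range(max_errors + 1))

if __name__ == "__main__":
    for p in (0.6, 0.7, 0.8, 0.9, 0.95):
        print(p,
              direct_reliability(p),
              decoded_correctly(p, 6, 1),    # three repetitions
              decoded_correctly(p, 10, 2))   # five repetitions
```

Tabulating the three columns for a range of p is one way to guess the threshold values asked for in the problems before proving them.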
Figure 10.2.1
\[ R = \frac{\log_2 N}{n} \]

\[ x_1 = x_3 + x_5 + x_7 \tag{6} \]

\[ x_2 = x_3 + x_6 + x_7 \tag{7} \]

\[ x_4 = x_5 + x_6 + x_7 \tag{8} \]

\[ x_1 + x_3 + x_5 + x_7 = 0 \]

\[ x_2 + x_3 + x_6 + x_7 = 0 \]

\[ x_4 + x_5 + x_6 + x_7 = 0 \]
Thus, among the 128 binary vectors of length 7, the 16 code words are
the binary vectors v = (x₁, x₂, x₃, x₄, x₅, x₆, x₇) such that Hv = 0 where
H is the matrix

\[ H = \begin{pmatrix} 1 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 \end{pmatrix} \]
Then, ρ(C₁, C₂) = ρ(C₁ − C₂, 0), and ρ(C₁ − C₂, 0) is simply the number
of 1's in the code word C₁ − C₂. Suppose that C₁ ≠ C₂, and let
(x₁, x₂, x₃, x₄, x₅, x₆, x₇) = C₁ − C₂. We shall show that C₁ − C₂ has at
least three l's. If there are three or more l's in the 3r^, 6*- , and Im¬
If C is a code word we let B₁(C) denote the set of words within a distance
1 of C. Now, let e_i denote the string which is all 0's except for a 1 in the
i-th place.

\[ H(C + e_i) = HC + He_i = He_i \neq 0 \]
Thus, B₁(C) contains one code word and seven distinct binary strings
which are not code words. Let C_i and C_j be distinct code words and
suppose x ∈ B₁(C_i) and y ∈ B₁(C_j). Then, by the triangle inequality for
metrics,
Since ρ(C_i, x) ≤ 1 and ρ(C_j, y) ≤ 1, we must have ρ(x, y) ≥ 1 which
implies that x ≠ y. Thus, the sets B₁(C_i) are disjoint. Therefore, their
union contains 16 × 8 = 128 distinct binary vectors, that is, all the binary
vectors of length 7.
We choose the decoding algorithm which assigns to each received
signal the unique code word within a distance 1. With this encoding
and decoding scheme, a code word transmitted with zero errors or with
one error is decoded correctly. Thus, the probability of correct transmission
is p⁷ + 7p⁶(1 − p) as compared to p⁴ for the direct transmission
of the four original digits. If p = .9, for example, then p⁴ = .66 and
p⁷ + 7p⁶(1 − p) = .85. The information rate is 4/7, so we have achieved a
dramatic improvement in reliability with only a modest reduction in the
information rate.
10.2 Coding Theory 373
The Hamming matrix has the nice property that the i-th column is
just the binary representation of i reading from bottom to top. Suppose
that the four digits that we wish to transmit are encoded in a code word
C and that a single error occurs in transmission. Then the transmitted
word can be written as C + e_i for some i. If we apply H to the received
word, we obtain H(C + e_i) = He_i, which is the i-th column of H. Since the
i-th column is just the binary representation of i, we can see immediately,
by applying H to the received word, in which digit the error occurs.
For example, suppose that the sender wishes to transmit the signal 1101.
Thus, x₃ = 1, x₅ = 1, x₆ = 0, and x₇ = 1. Using (6) - (8), the sender
determines that x₁ = 1, x₂ = 0, and x₄ = 0 and transmits the signal
1010101. (9)
1000101 (10)
\[ \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}. \]
Since this is not the zero vector we know that there is an error in the
signal, and if there is only one error, it is in the third position since 011
is the binary representation of 3. Therefore, we can correct (10) to obtain
the code word (9), thus recovering the signal that the sender wished to
transmit.
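The encoding rules (6) - (8) and the syndrome computation can be written out as a short program. This is a sketch with function names of our own choosing; all arithmetic is binary (mod 2), and the correction step assumes that at most one digit was changed in transmission.

```python
# Rows of the Hamming matrix H; column i is i in binary, read bottom to top.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def encode(x3, x5, x6, x7):
    """Build the code word (x1, ..., x7) from the four message digits
    using equations (6) - (8) with binary addition."""
    x1 = (x3 + x5 + x7) % 2
    x2 = (x3 + x6 + x7) % 2
    x4 = (x5 + x6 + x7) % 2
    return [x1, x2, x3, x4, x5, x6, x7]

def syndrome(word):
    """Compute H applied to the word, with binary arithmetic."""
    return [sum(h * w for h, w in zip(row, word)) % 2 for row in H]

def correct_single_error(word):
    """Flip the digit whose position is given by the syndrome, if any.
    Assumes at most one digit is in error."""
    s = syndrome(word)
    position = s[0] + 2 * s[1] + 4 * s[2]  # bottom-to-top binary reading
    fixed = list(word)
    if position != 0:
        fixed[position - 1] ^= 1
    return fixed

if __name__ == "__main__":
    sent = encode(1, 1, 0, 1)              # the signal 1101
    received = [1, 0, 0, 0, 1, 0, 1]       # third digit corrupted
    print(sent, syndrome(received), correct_single_error(received))
```

Running the example reproduces the computation in the text: the syndrome of the received word is (1, 1, 0), which points to position 3.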
For more information about the history and mathematical development
of coding theory, see [42] or [32].
Problems
2. Suppose that we wish to transmit a binary message which has two digits,
00, 01, 10, or 11. We use a repetition code which repeats each message
five times.
(a) How many code words are there? How many possible transmitted
signals are there?
(b) Explain why all the code words are a distance ≥ 5 apart. Explain
why it follows that we can correctly decode any transmitted signal
with ≤ 2 errors.
(c) Suppose that the transmission channel sends an individual digit
correctly with probability p. What is the probability that the transmitted
message will have ≤ 2 errors?
(d) If p = .95, compare the reliability of this coding scheme with the
reliability of sending the two digits directly.
(e) If p = .7, compare the reliability of this coding scheme with the
reliability of sending the two digits directly.
(f) Find a p₀ so that this code improves reliability if p > p₀ and hurts
reliability if p < p₀.
(g) What is the information rate of this channel?
3. Suppose that we encode two binary digits as the first two digits of a three-
digit binary string and choose the third digit so that the sum of all the
digits is zero. Explain why this code can detect single errors but cannot
correct them. What is the information rate of this code?
4. Prove that the Hamming (7, 4) code has information rate R = 4/7.
5. Suppose that you are receiving signals which employ the Hamming (7, 4)
code. Decode the following received signals:
(a) 0010111.
(b) 0110111.
(c) 0111101.
(d) 0111100.
6. Let S be the set of all possible words of length n that can be transmitted
by an information channel, and let 𝒞 ⊂ S be the set of code words. Let
ρ be the discrete metric on S. The set 𝒞 is called a perfect code if there
is an integer m so that the union of the sets of radius m centered at the
code words C ∈ 𝒞 equals S and, furthermore, each pair of code words is
a distance at least 2m + 1 apart. Which of the following codes is perfect?
where the plus signs on the right mean binary addition. For z ∈ ℤ₂ we
define scalar multiplication by
where on the right z · x_i means multiplication in ℤ₂. Use the fact that ℤ₂
is a field (problem 7 of Section 1.1) to show that ℤ₂ⁿ satisfies the definition
of vector space (with ℝ replaced by ℤ₂) given in Section 5.8.
(a) What is the probability that the decoded signal is the signal that was
encoded?
(b) Suppose that p > ½. Prove that if we choose n large enough, the
probability of correct transmission can be made larger than any β <
1. Hint: use the weak law of large numbers; see project 3.
(c) Explain why a message with ≤ (n − 1)/2 errors can be correctly
decoded.
(d) What is the probability that the decoded signal is the signal that was
encoded?
(e) Suppose that p > ½. Prove that if we choose n large enough, the
probability of correct transmission can be made larger than any β <
1. Hint: use the weak law of large numbers; see project 3.
\[ \int_{-\infty}^{\infty} f(x)\,dx = 1. \tag{11} \]

for all a and b, including a = −∞ and b = ∞. Condition (11) simply
guarantees that the total probability is 1.
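\[ f(x) = \begin{cases} 0, & \text{if } x < 1 \\ \dfrac{2}{x^3}, & \text{if } x \ge 1. \end{cases} \]

Since,

\[ \lim_{c \searrow 1}\lim_{d\to\infty}\int_c^d \frac{2}{x^3}\,dx = \lim_{c\searrow 1}\lim_{d\to\infty}\left(\frac{1}{c^2} - \frac{1}{d^2}\right) = 1, \]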
and
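\[ P\{-2 \le X \le 1\} = \int_{-2}^{1} 0\,dx = 0. \]

\[ P\{X = a\} = \int_a^a f(x)\,dx = 0, \]

\[ \int_{-\infty}^{\infty} x f(x)\,dx. \tag{13} \]

\[ \sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx \]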
/(*) = dh
Figure 10.3.1
\[ f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0. \end{cases} \]
That is, F(x) is just the probability that the value of X lies in the interval
(−∞, x]. F is called the cumulative distribution function of X. As x gets
larger P{−∞ < X ≤ x} cannot decrease so F is a monotone increasing
function. Since the value of X is some real number, we must have

\[ \lim_{x\to\infty} F(x) = P\{-\infty < X < \infty\} = 1 \tag{14} \]

and

\[ \lim_{x\to-\infty} F(x) = 0. \tag{15} \]
Furthermore, since F is monotone increasing, the limits from the left and
right, F(x−) and F(x+), exist for each x. Thus, the only possible discontinuities
of F are jump discontinuities (problem 7 in Section 3.5). Any
real-valued function on ℝ that is monotone increasing and satisfies (14)
and (15) is called a cumulative distribution function even if no random
variable is specified.
Suppose that X is a discrete random variable which takes on only
finitely many values, for example, a Bernoulli or binomial random variable.
Let a₁, a₂, …, a_N denote the possible values listed in increasing
order. Then
Thus F is zero on the interval (−∞, a₁), equal to P{X = a₁} on the interval
[a₁, a₂), equal to P{X = a₁} + P{X = a₂} on the interval [a₂, a₃),
and so forth. See Figure 10.3.2. F is constant except for jump discontinuities
at each a_n, and the size of the jump, F(a_n+) − F(a_n−), at a_n is equal to
P{X = a_n}.
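The step-function picture can be made concrete with the dice experiment. In this sketch (the function names are ours) the mass density of the sum of two dice is computed exactly with fractions, and F(x) is taken to be the sum of P{X = a_n} over all values a_n ≤ x, which is how the description above builds F interval by interval.

```python
from fractions import Fraction
from itertools import product

def dice_sum_mass():
    """Mass density of X = sum of two fair dice, as exact fractions."""
    mass = {}
    for green, red in product(range(1, 7), repeat=2):
        total = green + red
        mass[total] = mass.get(total, Fraction(0)) + Fraction(1, 36)
    return mass

def cdf(mass, x):
    """F(x): the sum of P{X = a} over all possible values a <= x."""
    return sum((p for a, p in mass.items() if a <= x), Fraction(0))

if __name__ == "__main__":
    mass = dice_sum_mass()
    for x in (1.5, 2, 2.5, 3, 7, 11.9, 12):
        print(x, cdf(mass, x))
```

The jump of F at 7, for instance, is F(7) − F(6.5) = 1/6, which is P{X = 7}.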
In general, a discrete random variable X takes on countably many
values {a_n}. In this case the right-hand side of (16) may be an infinite
series if infinitely many of the numbers a_n are less than or equal to a
particular x. The series always converges by the comparison test since it
consists of a subset of the terms of the series Σ P{X = a_n}, which converges
and sums to 1. If {a_n} has no limit points (e.g., a Poisson random
variable), then by the Bolzano-Weierstrass theorem, there can be at most
finitely many values an in any given finite interval. These points can be
ordered, and again we get the simple picture in Figure 10.3.2.
Figure 10.3.2
If {a_n} has lots of limit points, there may not be a natural ordering of
{a_n}. For example, suppose that the values {a_n} are the rational numbers.
Formula (16) is true but it is much harder to visualize the graph of
F since, in any interval about a limit point of {a_n}, F will have infinitely
many steps. As before, the possible values of X are just the points of
discontinuity of F, and the probabilities are just the sizes of the jumps at
these points. Thus, the mass density of a discrete random variable can be
recovered from its cumulative distribution function. A cumulative distribution
function which is constant except for finite or countably many
jumps is called discrete.
If the random variable X has a density /, then
Thus, if F is continuous,
Since $0 \le P\{X = b\} \le P\{a < X \le b\}$ for all a < b, we conclude that
P{X = b} = 0. Therefore, continuous random variables take on specific
values with probability zero. We saw this before in the special case when
X has a density, for example, when X is uniform, exponential, or normal.
This raises the natural question of what kinds of sets can have positive
probability. The following example shows that this question is deeper
than it looks.
Example 5 (the Cantor set and function) We will describe the Cantor
set, C, which is a subset of [0,1], by saying which points are not in C.
First, we exclude the middle third $(\frac{1}{3}, \frac{2}{3})$ of the interval [0,1]. We then
exclude the middle thirds, namely, $(\frac{1}{9}, \frac{2}{9})$ and $(\frac{7}{9}, \frac{8}{9})$, of the two intervals
that remain. Now there are four remaining intervals, and we exclude
their middle thirds, and so forth. The Cantor set is the collection of points
in [0,1] that remain after we have carried out this procedure infinitely
often. A straightforward calculation with geometric series shows that
the sum of the lengths of the excluded intervals equals 1, so in that sense
C is a very small set. On the other hand, there is another characterization
of C which allows one to show that C is uncountable (see below), so in
that sense C is a very large set.
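The geometric series calculation can be checked with exact arithmetic. The sketch below assumes only what the construction states: stage k removes $2^{k-1}$ open intervals, each of length $3^{-k}$. The function name `excluded_length` is hypothetical.

```python
from fractions import Fraction


def excluded_length(stages):
    """Total length removed from [0, 1] after the given number of stages."""
    # Stage k removes 2**(k-1) open middle thirds, each of length 3**(-k).
    return sum(Fraction(2 ** (k - 1), 3 ** k) for k in range(1, stages + 1))


for stages in (1, 2, 5, 20):
    removed = excluded_length(stages)
    # What remains has total length (2/3)**stages, which tends to 0.
    print(stages, removed, float(1 - removed))
```

The remaining length after n stages is $(2/3)^n$, so the removed lengths sum to 1 in the limit.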
Decimal expansions were discussed in project 4 of Chapter 6. The
word “decimal” is used, of course, because one is writing a given number
as a sum of powers of 10. In similar fashion, one can show that every
x in [0,1] can be written in the form
$$x = \sum_{n=1}^{\infty} \frac{a_n}{3^n},$$
where, for each n, $a_n = 0$, 1, or 2. This is called the ternary expansion
of x. Given x, the sequence $\{a_n\}$ is uniquely determined except when
x is of the form $q/3^n$ for some integers n and q, in which case there are
exactly two expansions, one ending in a string of 0's and the other ending
in a string of 2's. Conversely, if $\{a_n\}$ is any sequence of 0's, 1's, or 2's,
the series converges to a real number x that is in the interval [0,1]. The
proofs of these statements are similar to those outlined in project 4 of
Chapter 6.
It is not too difficult to see that the Cantor set is just the set of x in
[0,1] whose ternary expansions have no 1's. For example, the first middle
third we eliminated, $(\frac{1}{3}, \frac{2}{3})$, consists of numbers such that $a_1 = 1$ in their
ternary expansions. Similarly, if $a_1 \ne 1$ but $a_2 = 1$, then x is in the
interval $(\frac{1}{9}, \frac{2}{9})$ or the interval $(\frac{7}{9}, \frac{8}{9})$, depending on whether $a_1 = 0$ or
$a_1 = 2$. If a number x has two ternary expansions, then it is in the Cantor
set if one of the expansions has no 1's. This shows that there is a one-to-one
correspondence between C and the set of all sequences of 0's and
2's. Since a straightforward modification of the proof of Theorem 1.3.6
shows that this set of sequences is uncountable, C must be uncountable
too.
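For rational x the membership test can be carried out exactly. The sketch below is one possible approach, not a method given in the text: it repeatedly checks whether x falls in an open middle third and otherwise rescales the surviving third back to [0,1], which amounts to reading off ternary digits 0 or 2. Because the rescaled values of a rational number eventually repeat, the loop terminates. The function name `in_cantor_set` is hypothetical.

```python
from fractions import Fraction


def in_cantor_set(x):
    """Decide whether a rational number x belongs to the Cantor set."""
    x = Fraction(x)
    if not 0 <= x <= 1:
        return False
    seen = set()
    while x not in seen:
        seen.add(x)
        if Fraction(1, 3) < x < Fraction(2, 3):
            return False  # x lies in an excluded open middle third
        # Rescale the surviving third containing x back to [0, 1];
        # the left third corresponds to digit 0, the right third to digit 2.
        x = 3 * x if x <= Fraction(1, 3) else 3 * x - 2
    return True  # the rescaled values repeat, so x is never excluded
```

Endpoints such as 1/3, which have two ternary expansions, are handled correctly because only the open middle thirds are rejected.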
We shall now define a function g on [0,1]. If the ternary expansion of
x has no 1's, we set $N = \infty$ and otherwise let N be the index of the first
place in the ternary expansion of x where a 1 occurs. Set $b_n = \frac{1}{2} a_n$ for
$n < N$ and $b_N = 1$, and define
$$g(x) = \sum_{n=1}^{N} \frac{b_n}{2^n}.$$
This function, g, is called the Cantor function. Notice that the value of g
is $\frac{1}{2}$ for all x in $(\frac{1}{3}, \frac{2}{3})$ since in that case $a_1 = 1$. Similarly, the value of g is $\frac{1}{4}$
on $(\frac{1}{9}, \frac{2}{9})$ and the value of g is $\frac{3}{4}$ on $(\frac{7}{9}, \frac{8}{9})$. Continuing in this fashion, one
can see that g is constant on each of the intervals in the complement of
the Cantor set. Furthermore, g is monotone increasing and continuous.
The monotonicity can be proved by checking cases, and the continuity
holds because two numbers that are very close have ternary expansions
which are identical for a large number of terms.
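The definition of g translates almost directly into code. The following sketch works with exact fractions and truncates the series after `depth` ternary digits, so for points whose expansion has no 1 it returns an approximation with error at most $2^{-\text{depth}}$. The function name `cantor_function` and the parameter `depth` are assumptions made for illustration.

```python
from fractions import Fraction


def cantor_function(x, depth=40):
    """Approximate g(x) for x in [0, 1] using at most `depth` ternary digits."""
    x = Fraction(x)
    total = Fraction(0)
    for n in range(1, depth + 1):
        digit = min(int(3 * x), 2)  # n-th ternary digit; x = 1 uses 0.222...
        x = 3 * x - digit
        if digit == 1:
            # First 1 occurs at index N = n: add b_N / 2**N with b_N = 1, stop.
            return total + Fraction(1, 2 ** n)
        total += Fraction(digit // 2, 2 ** n)  # b_n = a_n / 2 for n < N
    return total  # truncated series; error is at most 2**(-depth)
```

On the excluded intervals the loop stops at the first digit 1, so the values 1/2, 1/4, and 3/4 quoted above are reproduced exactly.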
Define a function F by
$$F(x) = \begin{cases} 0, & x < 0 \\ g(x), & 0 \le x \le 1 \\ 1, & x > 1. \end{cases}$$
Problems
1. Suppose that after a new car is purchased, the number of years until the
first major repair is a random variable X with density
if x > 0
if x < 0.
(a) Show that $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
(b) Compute P{0 < X < 2}.
(c) Compute P{X > 2}.
(d) Compute P{-5 < X < 2}.
4. Find the mean of the random variable in Example 1 and show that the
variance is not finite.
5. (a) Suppose that the departure time of a bus is uniformly distributed
between 1 p.m. and 1:10 p.m. If you arrive at 1:03 p.m., what is the
probability that you will have missed the bus?
(b) A point is chosen at random on the interval [0,4]. What is the probability
that it will be within | of π? What is the probability that it
will be within | of an integer?
6. Suppose that X is uniformly distributed on the interval [c,d]. Find the
mean and standard deviation of X. Draw the graph of the cumulative
distribution function of X.
7. Prove that the mean and standard deviation of a random variable that is
exponentially distributed with parameter λ are both equal to 1/λ. Compute
explicitly the cumulative distribution function and verify that it has
the right properties.
8. Let $f_n(x)$ be the density of an exponentially distributed random variable
$X_n$ with parameter $\lambda_n$. Suppose that $\lambda_n \to \lambda > 0$.
Now use polar coordinates in the plane to do the integral on the right explicitly.
Note that this uses the fact that one can compute double integrals
by iterating the integrals (see project 4 of Chapter 4), as well as a change
of variables formula for multiple integrals.
10. Let Y be normally distributed with parameters μ and σ, and suppose that
X is a standard normal random variable. Prove that
11. Generate the graphs of the cumulative distribution function of the fol¬
lowing random variables:
f(x)dx.
(b) Write formulas for the mean, $\mu_N$, and standard deviation, $\sigma_N$, of $Y_N$.
Use Corollary 3.3.2 to prove that $\mu_N \to \mu$ and $\sigma_N \to \sigma$ as $N \to \infty$,
where μ and σ are the mean and standard deviation of X.
(c) Prove that for all c < d,
for all a < b and c < d. The function f is called the joint density of X and
Y, respectively.
$$f_X(x) = \int_{-\infty}^{\infty} f(x,y)\,dy, \qquad f_Y(y) = \int_{-\infty}^{\infty} f(x,y)\,dx,$$
$$\frac{\partial^2}{\partial x\,\partial y} F(x,y) = f(x,y).$$
$$P\{a < X < b \text{ and } c < Y < d\} = P\{a < X < b\}\,P\{c < Y < d\}$$
for all a < b and c < d. Prove that X and Y are independent if and
only if $f(x,y) = f_X(x) f_Y(y)$.
$$P\{Y(\lambda) = k\} = e^{-\lambda} \frac{\lambda^k}{k!}.$$
10.4 The Variation Metric 387
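$$\begin{aligned} P\{X = k\} &= \frac{n!}{(n-k)!\,k!}\, p^k (1-p)^{n-k} && (18) \\ &= \frac{n!}{(n-k)!\,k!} \left(\frac{\lambda}{n}\right)^{k} \left(1 - \frac{\lambda}{n}\right)^{n-k} && (19) \end{aligned}$$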
Now, suppose that p is small and n is large. Then λ/n is small. Suppose
that k is small compared to n. Then (see problem 1),
(21)
$$\frac{n(n-1)\cdots(n-k+1)}{n^k} \approx 1, \tag{22}$$
(23)
$$P\{X = k\} \approx e^{-\lambda} \frac{\lambda^k}{k!} = P\{Y(\lambda) = k\}. \tag{24}$$
We shall see later that this is true and derive a bound for the difference.
The approximation (24) is the reason that the number of occurrences
of rare events is often assumed to have a Poisson distribution. Here is an
example.
variable which has the value 1 if there is a quake on the i-th day of the
year and equals zero otherwise. Then the value of $X = \sum_{i=1}^{365} X_i$ is the
number of quakes during the year. If we assume that the $X_i$ are independent
and set $p = P\{X_i = 1\}$, then X is a binomial random variable with
n = 365 and p = 2.5/365. Since n is large, p is small, and λ = np = 2.5
has moderate size, the mass density of X should be well approximated
by the mass density of a Poisson random variable with the same mean,
that is, by Y(2.5). The probability that there will be no earthquakes in
the year is given by X as
$$P\{X = 0\} = \left(\frac{362.5}{365}\right)^{365} = .081$$
and by Y as
$$P\{Y = 0\} = e^{-2.5} = .082.$$
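The two predictions can be reproduced numerically. The sketch below uses the values n = 365 and λ = 2.5 from the example; the function names `binomial_pmf` and `poisson_pmf` are hypothetical helpers, not notation from the text.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    """P{X = k} for a binomial random variable with parameters n and p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P{Y = k} for a Poisson random variable with mean lam."""
    return exp(-lam) * lam ** k / factorial(k)


n, lam = 365, 2.5
p = lam / n

# Compare the two mass densities for the first few values of k.
for k in range(4):
    print(k, round(binomial_pmf(k, n, p), 4), round(poisson_pmf(k, lam), 4))

binomial_at_most_2 = sum(binomial_pmf(k, n, p) for k in range(3))
poisson_at_most_2 = sum(poisson_pmf(k, lam) for k in range(3))
```

The same helpers give the probabilities of at most two quakes, `binomial_at_most_2` and `poisson_at_most_2`, which can be compared in the same way.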
= .5434
and by Y as
random from the deck and reinsert it at a randomly chosen place. Let's
assume that we have chosen a specific method of shuffling. Since there
are 52! possible orderings of the cards, we can label the orderings by the
numbers 1,2,..., 52!. Let the value of the random variable $X_n$ be the label
after n shuffles. Let U be the random variable which takes on each
of the label values 1,2,..., 52! with probability 1/52!, that is, U takes on
each label value with equal probability. A reasonable mathematical interpretation
of the question “How random is the deck after n shuffles?”
is the question “How close is the mass density of $X_n$ to the mass density
of U?” since U assigns equal probabilities to each ordering. If one uses a
“riffle shuffle”, one can show that the densities are quite close if n > 7.
For an excellent introduction to the mathematics of card shuffling, see
[38], where the riffle and other shuffles are formally defined.
$$\rho_v(X,Y) = \frac{1}{2} \sum_{n=0}^{\infty} |p_n - q_n|$$
(a) $\rho_v(X,Y) \ge 0$.
Proof. We shall prove (26), from which the properties (a), (b), and (c)
follow quite easily (problem 2). Let $S^+ = \{n \in \mathbb{N} \cup \{0\} \mid p_n > q_n\}$ and
$S^- = \{n \in \mathbb{N} \cup \{0\} \mid p_n < q_n\}$. For any $A \subseteq \mathbb{N} \cup \{0\}$,
Since the first term in (29) is positive and the second is negative, it follows
that
which implies
It follows that
so
Note that (a), (b), and (c) are just the properties of a metric except that
$\rho_v(X, Y) = 0$ does not imply that X = Y. In fact, $\rho_v(\cdot, \cdot)$ is a metric on the
set of sequences $\{p_n\}_{n=0}^{\infty}$ of nonnegative numbers such that $\sum p_n = 1$.
That is, $\rho_v$ is a metric on the set of mass densities. We follow common
practice and write $\rho_v(X,Y)$ even though $\rho_v$ depends only on the mass
densities of X and Y, not on the random variables themselves. We now
prove a theorem that relates the properties of $\rho_v(X,Y)$ to probabilistic
properties of X and Y.
$$\rho_v\Big(\sum_{i=1}^{n} X_i, \sum_{i=1}^{n} Y_i\Big) \le \sum_{i=1}^{n} \rho_v(X_i, Y_i).$$
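$$\begin{aligned} &= \sum_{n=0}^{\infty} P\{X \in A - \{n\}\}\,P\{Z = n\} && (32) \\ &\le \sum_{n=0}^{\infty} \big[P\{Y \in A - \{n\}\} + \rho_v(X,Y)\big]\,P\{Z = n\} && (33) \\ &= \sum_{n=0}^{\infty} P\{Y \in A - \{n\}\}\,P\{Z = n\} + \rho_v(X,Y) && (34) \\ &= \sum_{n=0}^{\infty} P\{Y \in A - \{n\} \text{ and } Z = n\} + \rho_v(X,Y) && (35) \end{aligned}$$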
$$\begin{aligned} \rho_v(X_1 + X_2, Y_1 + Y_2) &\le \rho_v(X_1 + X_2, X_2 + Y_1) + \rho_v(X_2 + Y_1, Y_1 + Y_2) \\ &\le \rho_v(X_1, Y_1) + \rho_v(X_2, Y_2). \end{aligned}$$
The proofs of the next three results all depend on the following two
ideas. First, since $\rho_v(X,Y)$ depends only on the mass densities of X
and Y, its value does not change if Y is replaced by another random
variable $\tilde{Y}$ with the same density. Second, suppose that Y is a Poisson
random variable with mean λ. Then, for every choice of $\mu_i > 0$ so that
$\mu_1 + \mu_2 + \cdots + \mu_n = \lambda$, there is a sample space S and mutually independent
Poisson random variables $Y(\mu_i)$ so that $\tilde{Y} = \sum Y(\mu_i)$ is a Poisson
random variable with mean λ. Thus $\rho_v(X, Y) = \rho_v(X, \tilde{Y})$. See problems
12,13, and 14.
Corollary 10.4.3 Let $Y(\mu_1)$ and $Y(\mu_2)$ be Poisson random variables with
means $\mu_1 < \mu_2$. Then,
$$\rho_v(Y(\mu_1), Y(\mu_2)) \le \mu_2 - \mu_1.$$
Proof. Let $Y(\mu_1)$ and $Y(\mu_2 - \mu_1)$ be independent Poisson random variables
with means $\mu_1$ and $\mu_2 - \mu_1$ and define $\tilde{Y}(\mu_2) = Y(\mu_1) + Y(\mu_2 - \mu_1)$.
Then $\tilde{Y}(\mu_2)$ is Poisson with mean $\mu_2$. Therefore,
$$\begin{aligned} \rho_v(Y(\mu_1), Y(\mu_2)) &= \rho_v(Y(\mu_1), \tilde{Y}(\mu_2)) \\ &\le P\{Y(\mu_2 - \mu_1) \ne 0\} \\ &= 1 - e^{-(\mu_2 - \mu_1)} \\ &\le \mu_2 - \mu_1. \end{aligned}$$
In the third step we used part (a) of Theorem 10.4.2. The last step follows
from the Mean Value Theorem. □
Thus we have an upper bound for how close the mass densities of two
Poisson random variables are if the means are close.
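This bound can be checked numerically. The sketch below assumes that $\rho_v$ is half the sum of the absolute differences of the two mass densities, as the half-sum formula used later in this section suggests, and it truncates each density after a fixed number of terms. The means 2.5 and 2.6 and the helper names are assumptions chosen for illustration.

```python
from math import exp


def poisson_pmf_list(lam, terms=200):
    """First `terms` values of the Poisson mass density with mean lam."""
    pmf = [exp(-lam)]
    for k in range(1, terms):
        pmf.append(pmf[-1] * lam / k)  # P{Y = k} = P{Y = k - 1} * lam / k
    return pmf


def variation_distance(p, q):
    """Half the sum of |p_n - q_n| over the common index range."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


mu1, mu2 = 2.5, 2.6  # hypothetical means that differ by 0.1
distance = variation_distance(poisson_pmf_list(mu1), poisson_pmf_list(mu2))
print(distance, 1 - exp(-(mu2 - mu1)), mu2 - mu1)
```

The printed values show the computed distance next to the two upper bounds that appear in the proof.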
and similarly,
Thus we see that the two predictions indeed differ by less than 0.1.
$$\begin{aligned} 2\rho_v(X, Y(p)) &= \sum_{n=0}^{\infty} |P\{X = n\} - P\{Y(p) = n\}| \\ &= 2p(1 - e^{-p}). \end{aligned}$$
For the general case, we replace $Y(\sum_{i=1}^{n} p_i)$ by the sum $\sum_{i=1}^{n} Y(p_i)$ of independent
Poisson random variables $Y(p_i)$ with corresponding means
$p_i$. And we replace $X_1, X_2, \ldots, X_n$ by independent Bernoulli random
variables with probabilities $p_1, p_2, \ldots, p_n$, so that the families $\{X_i\}$ and
$\{Y_i\}$ are mutually independent. This can be done by a product construction
similar to that described in problem 13. Thus, by part (c) of Theorem
10.4.2 and (38),
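$$\begin{aligned} \rho_v\Big(\sum_{i=1}^{n} X_i, Y\Big(\sum_{i=1}^{n} p_i\Big)\Big) &= \rho_v\Big(\sum_{i=1}^{n} X_i, \sum_{i=1}^{n} Y(p_i)\Big) \\ &\le \sum_{i=1}^{n} \rho_v(X_i, Y(p_i)) \\ &\le \sum_{i=1}^{n} p_i^2. \qquad \square \end{aligned}$$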
Proof. This is the special case of (37) when $p_i = p$ for all i. □
396 Chapter 10. Probability Theory
$$np^2$$
This was indeed true for the two cases, A\ = {0} and A2 = {0,1, 2}, that
we compared explicitly in Example 1.
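For the earthquake example the bound can be compared with the actual distance between the two mass densities. The sketch below assumes the bound $np^2$ from the special case $p_i = p$ and the half-sum form of $\rho_v$; the Poisson density is truncated at n + 1 terms, and all function names are hypothetical.

```python
from math import comb, exp


def binomial_pmf_list(n, p):
    """Mass density of a binomial random variable on 0, 1, ..., n."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]


def poisson_pmf_list(lam, terms):
    """First `terms` values of the Poisson mass density with mean lam."""
    pmf = [exp(-lam)]
    for k in range(1, terms):
        pmf.append(pmf[-1] * lam / k)
    return pmf


def variation_distance(p, q):
    """Half the sum of |p_n - q_n|, padding the shorter list with zeros."""
    size = max(len(p), len(q))
    p = p + [0.0] * (size - len(p))
    q = q + [0.0] * (size - len(q))
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


n, p = 365, 2.5 / 365
distance = variation_distance(binomial_pmf_list(n, p),
                              poisson_pmf_list(n * p, n + 1))
bound = n * p ** 2
print(distance, bound)
```

Since $\rho_v$ controls the difference of the probabilities of every set A, the bound applies in particular to the sets {0} and {0, 1, 2}.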
Problems
1. For fixed k, show that the expressions on the left sides of (21), (22), and
(23) converge to the respective right sides as $n \to \infty$.
2. Complete the proof of Theorem 10.4.1 by verifying that $\rho_v$ satisfies the
properties (a), (b), and (c).
3. Suppose that on the average 1 out of every 75 items coming off an assembly
line is defective. Assume that each item coming off has probability 1/75
of being defective. Use a binomial random variable and a Poisson random
variable to calculate the probability that the next batch of 75 items
will have
4. Every day during January you buy a lottery ticket. Each ticket has a
chance of winning equal to Use a binomial random variable and
a Poisson random variable to calculate the probability that you will win
5. Assume that each of the four letters in the DNA alphabet occurs independently
in each position with probability 1/4. Suppose that we have a
DNA strand of length 8. Use a binomial random variable and a Poisson
random variable to calculate the probability that the strand contains
(a) no C's.
(b) exactly two C's.
(c) less than or equal to two C's.
6. Let X be a random variable which takes the values 0,1, and 2 with prob¬
abilities po = Pi — \, and p2 = and let Y be a random variable
which takes the values 0, 1, 2, and 3, each with probability 1/4. Compute
$\rho_v(X,Y)$.
7. For each integer n > 0, let $X_n$ be a random variable which takes the
values 0, 1, 2, ..., n, each with probability 1/(n+1).
8. Provide the details of the induction argument for part (c) of Theorem
10.4.2.
9. Compute the error bound for the difference between the binomial and the
Poisson random variables in the situation described in problem 3.
10. Compute the error bound for the difference between the binomial and the
Poisson random variables in the situation described in problem 4.
11. Compute the error bound for the difference between the binomial and the
Poisson random variables in the situation described in problem 5. What
is the point of this problem?
12. Let $Y(\mu_1)$ and $Y(\mu_2)$ be independent Poisson random variables with
means $\mu_1$ and $\mu_2$. Prove that $Y(\mu_1) + Y(\mu_2)$ is a Poisson random variable
with mean $\mu_1 + \mu_2$. Hint: since $Y(\mu_1)$ and $Y(\mu_2)$ are independent,
$$P\{Y(\mu_1) + Y(\mu_2) = n\} = \sum_{k=0}^{n} P\{Y(\mu_1) = k\}\,P\{Y(\mu_2) = n - k\}.$$
$$P\{X_1 = m \text{ and } X_2 = n\} = \frac{\mu_1^m}{m!} \frac{\mu_2^n}{n!}\, e^{-(\mu_1 + \mu_2)}$$
(a) Prove that $X_1$ is a Poisson random variable with mean $\mu_1$ and $X_2$ is
a Poisson random variable with mean $\mu_2$.
(b) Prove that $X_1$ and $X_2$ are independent.
(c) Prove that $X_1 + X_2$ is a Poisson random variable with mean
$\lambda = \mu_1 + \mu_2$.
14. Let Y(λ) be a Poisson random variable with mean λ. Let $\mu_i > 0$ for
i = 1,2,..., n and suppose that $\lambda = \mu_1 + \mu_2 + \cdots + \mu_n$. Show how to
create a sample space S and random variables $X_1, X_2, \ldots, X_n$ so that
Projects
$$\sigma^2 = E((X - \mu)^2).$$
(a) Suppose that X takes the values 1.5, 2, and 2.5 with probabilities
\, and |, respectively. Compute μ and $\sigma^2$.
(b) Suppose that X takes the values 0, 2, and 4 with probabilities |,
\, and |, respectively. Compute μ and $\sigma^2$. What do you conclude
about the meaning of $\sigma^2$?
2. The purpose of this project is to sketch the proofs and uses of Markov's
inequality and Chebyshev's inequality. Throughout we assume that X is
a discrete random variable which takes values {an}.
(a) Suppose that X has finite expectation. Prove that, for each t > 0,
P{\X-n\>S} <
3. The purpose of this project is to introduce the weak law of large numbers.
Let Xi be a sequence of independent discrete random variables all having
the same density and finite variance. Let $\mu = E(X_i)$ and for each n define
$S_n = X_1 + X_2 + \cdots + X_n$. We want to prove that for all δ > 0,
0. (39)
(d) Suppose that we flip a fair coin repeatedly and that $X_i = 1$ if the
flip is a head and $X_i = 0$ if the flip is a tail. Explain carefully
what the weak law of large numbers means in this case. Why does
this make sense?
(e) Suppose that we flip a coin, which comes up heads with probability
p > \, repeatedly. Let X{ = 1 if the flip is a head and X{ = 0 if
the flip is a tail. Let Sn = Xx + X2 + ... + Xn. Prove that
The k-th cell senses the k-th binary digit of p and fires if and only if that digit equals 1.
Thus, the output of our group of 100 neurons that is read at a higher level
is a string of 0's and 1's giving the first 100 binary digits of p.
(a) Compare and contrast the accuracy of the two schemes. Hint: for
the stochastic scheme, use Chebyshev's inequality.
(b) Compare and contrast the simplicity of the two schemes.
(c) Compare and contrast the stability of the two schemes. Hint: sup¬
pose that a particular neuron dies.
(d) In each scheme how difficult would it be to improve accuracy and
stability if there were selective pressure to do so?
Bibliography
[1] Bartle, R., and D. Sherbert, Introduction to Real Analysis, 2nd ed., John
Wiley & Sons, Inc., New York, 1992.
[2] Birkhoff, G., and G.-C. Rota, Ordinary Differential Equations, 4th ed.,
John Wiley & Sons, Inc., New York, 1989.
[3] Bottazzini, U., The Higher Calculus: A History of Real and Complex
Analysis from Euler to Weierstrass, Springer-Verlag, New York, 1986.
[4] Braun, M., Differential Equations and Their Applications, 4th ed.,
Springer-Verlag, New York, 1993.
[5] Burden, R., and J. Faires, Numerical Analysis, 5th ed., PWS-Kent,
Boston, 1993.
[6] Churchill, R., and J. Brown, Complex Variables and Applications, 4th
ed., McGraw-Hill, Inc., New York, 1984.
[18] Gelfand, I., and S. Fomin, Calculus of Variations, Prentice Hall,
Englewood Cliffs, NJ, 1963.
[19] Goldberg, R., Methods of Real Analysis, 2nd ed., John Wiley & Sons,
Inc., New York, 1976.
[22] Hochstadt, H., Integral Equations, John Wiley & Sons, Inc., New
York, 1973.
[25] Hoppensteadt, F., and C. Peskin, Mathematics in Medicine and the Life
Sciences, Springer-Verlag, New York, 1991.
[26] John, F., Partial Differential Equations, 4th ed., John Wiley & Sons, Inc.,
New York, 1982.
[30] Lander, E., and M. S. Waterman, Calculating the Secrets of Life,
National Academy Press, Washington, D.C., 1995.
[33] Rosen, K., Elementary Number Theory and Its Applications, Addison-
Wesley, Reading, Mass., 1993.
[35] Ross, S., A First Course in Probability Theory, 4th ed., MacMillan, New
York, 1994.
[39] Steele, J., "Le Cam's Inequality and Poisson Approximation,"
Mathematical Monthly, 91(1994), pp. 116 - 123.
R real numbers 1
R2 Euclidean plane 7, 40
Rn n dimensional Euclidean space 157, 211
R radius of convergence 245
Ran(f) range of f 8
Sc complement of S 7
\s complement of S 7
sup supremum 53, 81
tW(x,o nth Taylor polynomial 135
Up(S) upper sum 88
Z integers 7
Z2 integers modulo 2 5
$\in$ is contained in (a set) 6
$\notin$ is not contained in (a set) 7
$\pi(n)$ number of primes $\le n$ 265
$\rho$ metric 196
$\rho_2$ Euclidean metric on $\mathbb{R}^n$ 196
$\rho_v$ variational metric 389
$\emptyset$ empty set 7
$\int_a^b f(x)\,dx$ Riemann integral 89
$\int_C f(z)\,dz$ integral on a contour C in $\mathbb{C}$ 308
$<, \le, >, \ge$ order relations 2
$\ne$ not equal to
$\cup$ union 7
$\cap$ intersection 7
= is equivalent to 3
-A function f 8
$\to$ converges to (numbers) 29
$\to$ converges to (functions) 168
$\times$ Cartesian product 7
$|\cdot|$ absolute value 3, 254
$\|\cdot\|$ norm 212
$\|\cdot\|_p$ $L^p$ norm 179
$\|\cdot\|_\infty$ sup norm 175
Fourier series of a function 333
$\approx$ approximately equal to 134
Index
subset, 7
sum
  of functions, 12
  lower, 87
  upper, 88
supremum
  function, 81
  norm, 175
  set, 53
surjective, 11
uncountable, 19
uniform convergence
  sequences, 167
  series, 238, 256
uniformly continuous, 84, 153
uniformly equivalent metrics, 203
uniform random variable, 377
upper bound, 52
upper sum, 88
Cover:
MATISSE, Henri.
Interior with a Violin Case.
Nice, (winter 1918 -19)
Oil on canvas, 28 3/4 x 23 5/8”(73 x 60 cm).
The Museum of Modern Art, New York.
Lillie P. Bliss Collection.
Photograph © 1998
The Museum of Modern Art, New York
http://www.wiley.com/college
ISBN 0-471-15996-4