MULTIVARIABLE MATHEMATICS
FOURTH EDITION
Richard E. Williamson
Dartmouth College
Hale F. Trotter
Princeton University
Williamson, Richard E.
Multivariable mathematics / Richard E. Williamson, Hale F. Trotter. 4th ed.
p. cm.
Includes index.
ISBN 0-13-067276-9
1. Algebras, Linear. 2. Differential Equations. 3. Calculus. I. Trotter, Hale F. II. Title.
QA184.W54 2004
512'.5-dc21 2003049839
Cover Image: Provided by Richard Williamson. It is a trajectory of the three-species system described
in Exercise 34 in Chapter 12, Section 4.
CHAPTER 1 Vectors 1
1 Coordinate Vectors 1
2 Geometric Vectors 8
A Points and Vectors 8
B Distance and Length 10
C Scalar Multiplication 11
D Vector Addition 13
E Points, Arrows, and Vectors 14
3 Lines and Planes 17
A Lines 18
B Planes 21
4 Dot Products 24
A Lengths and Angles 25
B Properties of x · y and |x| 27
C Unit Vectors and Projections 28
5 Euclidean Geometry 33
A Equations for Lines and Planes 33
B Distance to a Line in ℝ² or a Plane in ℝ³ 35
6 The Cross Product 37
C Homogeneous Systems 65
D Geometry of Solution Sets 70
3 Matrix Algebra 74
A Sum and Scalar Multiple 74
B Matrix Multiplication 75
C Identity Matrices 78
D Matrix Polynomials 79
4 Inverse Matrices 81
A Invertibility 81
B Computing Inverses 83
C Special Matrices 85
5 Determinants 88
A Definition 88
B Row and Column Expansions 91
C Basic Properties 92
D Computing Determinants 94
E Invertible Matrices 96
This book covers material that is often studied after a first course in one-variable
calculus, namely the algebra and geometry of vectors and matrices, multivariable
and vector calculus, and differential equations, including systems. The branches
of these three areas are strongly intertwined and we've designed our treatment
to display the connections effectively. Our aim has been to teach basic problem
solving, both pure and applied, in a framework that is mathematically coherent,
while allowing for selective emphasis on traditional rigor.
While the sequence of topics follows rather traditional lines of mathematical
classification, the actual route taken may vary widely from course to course. An
underlying theme is the encouragement of geometric thinking in two and three
dimensions, extended to arbitrary dimension when it's useful to do so. Thus,
most of Chapters 1 and 2 on vectors and matrices is prerequisite for the rest of
the book, but otherwise there is considerable flexibility for course scheduling.
Chapter 3 on linear algebra, with an introduction to general vector spaces and
linear transformations, is included for those who want to cover this material at
some point, but none of it is prerequisite for later chapters. In particular the
material on differentiability in Chapter 5 is organized so that the motivation for
the definition depends on gradient vectors rather than linear transformations.
For this edition the exposition has been completely rewritten in many places
and, in addition to Chapter 3, a number of topics that are optional additions to a
basic course have been added, as follows:
Additional emphasis on scientific applications in Section 1B of Chapter 2.
Subsection on vector integrals in Chapter 4, Section 1.
Subsections on quadric surfaces in Chapter 4.
Subsection on flow lines in Chapter 6, Section 1.
Subsection on use of the chain rule in coordinate changes.
Expanded treatment of the second-derivative criterion for extrema.
Section 5 on centroids and moments in Chapter 7.
Section 6 on application of improper integrals in Chapter 7.
Section 4, Chapter 8 relating flow lines, divergence, and curl.
Subsection on finding potentials in Chapter 9, Section 2.
Additional subsection on flows in Chapter 12, Section 1.
More efficient computation of exponential matrices in Chapter 13, Section 2.
Richard E. Williamson
Hale F. Trotter
SYLLABUS SUGGESTIONS
The vertical listings by chapter and section are by no means exhaustive but display
variety in emphasis to give some feeling for the book's flexibility. Unlisted
sections or subsections selected from the table of contents can, of course, be
included at an instructor's discretion.
VECTORS
Originally vectors were conceived of as geometric objects with magnitude and direc-
tion, suitable for representing physical quantities such as displacements, velocities,
or forces. Later on, introduction of a more general algebraic concept of vector uni-
fied and simplified various topics in pure and applied mathematics. This first chapter
introduces vectors in algebraic terms but is chiefly concerned with their geomet-
ric interpretation. The ideas are fundamental for the rest of the book, because the
possibility of visualizing multivariable problems geometrically is one of the major
advantages of using vectors.
EXAMPLE If we take x = (1, 2) in ℝ² and r = 3, then
rx = 3(1, 2) = (3, 6).
Similarly, with x = (1, 2, -3) in ℝ³ and r = -2,
rx = -2(1, 2, -3) = (-2, -4, 6).
For two vectors x = (x1, x2, ..., xn) and y = (y1, y2, ..., yn) in ℝⁿ, we define
the sum to be the vector
x + y = (x1 + y1, x2 + y2, ..., xn + yn),
in which each entry xk + yk is the sum of the corresponding entries xk and yk.
Note that the sum is defined only for vectors with the same number of entries. For
example, the sum of (1, 2) and (3, 4, 5) is undefined because corresponding entries
can't be matched up.
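As an informal illustration, the two operations can be sketched in Python with tuples standing in for coordinate vectors. The function names scale and add are our own choices for this sketch, not notation from the text.

```python
def scale(r, x):
    """Scalar multiple rx: multiply every entry of x by the number r."""
    return tuple(r * xk for xk in x)


def add(x, y):
    """Entrywise sum x + y, defined only for vectors with equally many entries."""
    if len(x) != len(y):
        raise ValueError("sum is undefined: the vectors have different numbers of entries")
    return tuple(xk + yk for xk, yk in zip(x, y))


print(scale(3, (1, 2)))        # (3, 6)
print(scale(-2, (1, 2, -3)))   # (-2, -4, 6)
print(add((1, 2), (3, 4)))     # (4, 6)
```

Calling add((1, 2), (3, 4, 5)) raises an error in this sketch, mirroring the fact that the sum is undefined.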
If x = (2, -1, 0) and y = (0, -1, -2), then 2x + y expresses the result of three
multiplications and three additions:
2x + y = 2(2, -1, 0) + (0, -1, -2) = (4, -2, 0) + (0, -1, -2) = (4, -3, -2).
We customarily write -x for the scalar multiple (-1)x, and x - y as an abbreviation
for x + (-y), and use 0 to denote an n-tuple whose entries are all zero. Then
for an arbitrary vector x, x - x = 0, as in
(2, -1, 0) - (2, -1, 0) = (0, 0, 0).
The notation 0 is ambiguous since 0 may stand for (0, 0) in one formula and for
(0, 0, 0) in another. The ambiguity disappears in context since only one interpretation
will make sense. For instance, if z = (-2, 0, 3), then in the formula z + 0, the 0 must
stand for (0, 0, 0), since addition is defined only for n-tuples with the same number
of entries.
Formulas 1 to 9 below are valid for arbitrary x, y, and z in ℝⁿ and arbitrary
real numbers r, s. They state rules for our new operations of addition and scalar
multiplication very closely analogous to the familiar distributive, commutative, and
associative laws for ordinary addition and multiplication of numbers.
1. rx + sx = (r + s)x
2. rx + ry = r(x + y)
3. r(sx) = (rs)x
4. x + y = y + x
5. (x + y) + z = x + (y + z)
6. x + 0 = x
7. x + (-x) = 0
8. 1x = x
9. 0x = 0
Note that the 0 on the left side of Formula 9 is the real number zero, while the 0 on
the right side is the zero vector in ℝⁿ for some n.
These formulas are straightforward consequences of the definitions of the vector
operations and of the laws of arithmetic. For illustration, we give a formal proof of
the second one.
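Proof of Formula 2. Let x = (x1, x2, ..., xn) and y = (y1, y2, ..., yn), and let r
be a real number. Then
rx = (rx1, rx2, ..., rxn) and ry = (ry1, ry2, ..., ryn),
so
rx + ry = (rx1 + ry1, rx2 + ry2, ..., rxn + ryn).
Also x + y = (x1 + y1, x2 + y2, ..., xn + yn),
so
r(x + y) = (r(x1 + y1), r(x2 + y2), ..., r(xn + yn)).
By the distributive law for numbers, r(x1 + y1) = rx1 + ry1, r(x2 + y2) = rx2 + ry2,
etc., so the n-tuples rx + ry and r(x + y) are the same. ∎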
A set with operations of addition and multiplication by scalars defined in such a
way that Formulas 1 through 9 hold is called a vector space, and its elements are
called vectors. We use this more general point of view in Chapter 3, but elsewhere
in the book the term vector may be taken to refer to an element of ℝⁿ.
As with numbers and other algebraic expressions such as polynomials, the commutative
and associative properties of vector addition stated in Formulas 4 and 5 imply
that we can reorder and regroup the terms in a sum without changing the value of
the sum. Consequently we can simply write a sum such as x1 + ... + xn without
putting in parentheses to show how the terms are grouped, because the grouping
doesn't affect the value. Also, the distributive laws stated for two-term sums in
Formulas 1 and 2 hold for sums of more than two terms.
A sum of scalar multiples a1x1 + ... + akxk is called a linear combination of the
vectors x1, ..., xk. Formulas 1 through 9 justify manipulating and simplifying linear
combinations in much the same way as other algebraic expressions, as illustrated in
the following example.
3(2u + v + w) - (u + 2v + 3w) = 6u + 3v + 3w - u - 2v - 3w = 5u + v.
and, in general,
x = (x1, ..., xn) = x1e1 + ... + xnen.
Note that these are the only coefficients that express x as a linear combination of
e1, ..., en, for if
x = y1e1 + ... + ynen = (y1, ..., yn),
then xk = yk for k = 1, ..., n. In other words, every vector in ℝⁿ appears in just
one way as a linear combination of the vectors e1, ..., en, and the coefficients in the
linear combination are simply the entries in the vector. Because of these properties,
the set {e1, ..., en} is called the standard basis for ℝⁿ. The numbers x1, ..., xn
are called the coordinates of x with respect to the standard basis, and the vectors
x1e1, ..., xnen are called the components of x with respect to this basis.
In ℝ³ we often write the standard basis vectors as i, j, and k instead of e1, e2, and
e3, and in ℝ² we may use i and j instead of e1 and e2. The ijk-notation appears most
often in geometric and physical applications. The notation ek by itself is ambiguous
because it could in principle refer to a vector in ℝⁿ for any n ≥ k. It will always be
clear from the context how many entries are meant for a vector ek.
(x1, x2) = x1(1, 0) + x2(0, 1)
= x1e1 + x2e2 = x1i + x2j.
(x1, x2, x3) = x1(1, 0, 0) + x2(0, 1, 0) + x3(0, 0, 1)
= x1e1 + x2e2 + x3e3
= x1i + x2j + x3k.
The equation
(2, 3, 4) = 4(1, 1, 1) - (1, 1, 0) - (1, 0, 0)
shows the vector (2, 3, 4) represented as a linear combination of the vectors (1, 1, 1),
(1, 1, 0), and (1, 0, 0).
To express (1, 3) as a linear combination of (1, 1) and (3, 4), we look for numbers
x and y such that x(1, 1) + y(3, 4) = (1, 3). Comparing entries, we solve
x + 3y = 1
x + 4y = 3
for x and y. Subtracting the first equation from the second gives y = 2. Then setting
y = 2 in the first equation gives x = -5. So (1, 3) equals the linear combination
-5(1, 1) + 2(3, 4).
To decide whether (1, 3, 8) is a linear combination of (1, 1, 1) and (3, 4, 5), we look
for numbers x and y such that x(1, 1, 1) + y(3, 4, 5) = (1, 3, 8). Now solve
x + 3y = 1
x + 4y = 3
x + 5y = 8
for x and y. The first two equations are the same as in the previous example, and the
calculations there showed that x = -5 and y = 2 are the only values that satisfy both
equations. Substituting these values in the third equation we find -5 + 5(2) = 5 ≠ 8,
so there are no values for x and y that satisfy all three equations. We conclude that
(1, 3, 8) is not a linear combination of (1, 1, 1) and (3, 4, 5).
The last two examples show that answering questions about linear combinations
may require solving systems of first-degree equations. Chapter 2 describes routines
for solving such equations with many variables. Equations that come up in examples
and exercises in this chapter will be simple enough to be solved by common-sense
methods as in these examples.
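The two linear-combination examples can be checked mechanically. The sketch below is only an illustration: it solves the first two equations by Cramer's rule rather than by the subtraction used in the text, and the names solve_2x2 and is_combination are hypothetical helpers introduced here. It assumes integer entries so that exact fractions can be used.

```python
from fractions import Fraction


def solve_2x2(a, b, c):
    """Solve x*a + y*b = c for 2-entry integer vectors a, b, c (Cramer's rule).

    Returns (x, y), or None when the determinant is zero.
    """
    det = a[0] * b[1] - a[1] * b[0]
    if det == 0:
        return None
    x = Fraction(c[0] * b[1] - c[1] * b[0], det)
    y = Fraction(a[0] * c[1] - a[1] * c[0], det)
    return x, y


def is_combination(a, b, c):
    """Test whether c = x*a + y*b for some numbers x and y.

    Solves with the first two entries, then checks every entry.
    Assumes the first two entries of a and b give a nonzero determinant.
    """
    solution = solve_2x2(a[:2], b[:2], c[:2])
    if solution is None:
        raise ValueError("first two entries do not determine x and y")
    x, y = solution
    return all(x * ak + y * bk == ck for ak, bk, ck in zip(a, b, c))


x, y = solve_2x2((1, 1), (3, 4), (1, 3))
print(x, y)                                             # -5 2
print(is_combination((1, 1, 1), (3, 4, 5), (1, 3, 8)))  # False
```

The second call returns False for the same reason given in the text: the values forced by the first two equations fail the third.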
In this book we emphasize applications to geometry and physics in two and three
dimensions, and most of our examples in this chapter involve vectors in ℝ² or ℝ³.
However, we'll be applying the concepts and methods illustrated here to vectors in
ℝⁿ for arbitrary values of n later on. Here is a nongeometric example.
A model for an economy might use a vector p to represent annual production, with
entries pi giving the year's production for each of n commodities considered in the
model. Thus p would be a vector in ℝⁿ, where n might be as large as several hundred
in an elaborate model. Similarly, vectors c and b might represent annual consumption
and the amount in inventory at the beginning of the year for each commodity.
Then the amounts in inventory at the end of the year would be given by the vector
b + p - c.
EXERCISES
1. Let x = (-3, 4) and y = (2, 2). Compute
(a) x + y
(b) 2x + 3y
(c) -x + y - (1, 4)
2. Let u = (2, -1) and v = (-3, 1). Compute
(a) u - 2v
(b) 3u + 2v
(c) 4u + v - (-1, 3)
13. Let x = i + j, y = 2i + j + k, and z = -2i + j + 2k. Calculate
(a) -x + 2y - z
(b) 6x - 2y + z
(c) -4x + 3y + z
14. Let u = i - j + 3k, v = 2j + k, and w = -2i + 2j - k. Calculate
(a) -gu+tv--low
(b) Express i as a linear combination of u, v, and w.
[First use the answer to part (a) to express u in terms of i, v, and w.]
27. Let x = (5, 500, 10) represent the amount of ink, paper, and binding material
needed to produce a single copy of some book and let y = (4, 800, 90) be the same
vector for some other book. What does 100x + 50y represent?
28. A small factory produces products of four different kinds. The vectors
w = (50, 75, 100, 190) and r = (100, 150, 200, 300) give the wholesale and retail prices
in dollars for a single unit of each kind. The vector p = (25, 25, 15, 10) gives the
number of units of each product produced in a day.
(a) What vector gives the retailer's profit per unit for the four products?
(b) If the wholesale price vector is doubled, what happens to the retailer's profit
vector?
(c) What happens to the retailer's profit vector if the retail prices each increase
by 10% and the wholesale prices stay unchanged?
(d) What vector gives the total production for a five-day week?
29. A computer monitors the temperatures recorded by sensors at 50 sites in a building.
Suppose that xk(t) is the temperature at the kth site at time t as measured on a
24-hour clock. Then the vector x(t) = (x1(t), ..., x50(t)) represents the profile of
temperatures in the entire building at time t. Write an expression in terms of x(t)
for the vector that represents the average temperatures for the day at the 50 sites,
using the readings at t = 2, 8, 14, and 20.
*30. Give a formal proof that the extension of Formula 2 on page 3 to sums of m terms
follows from the given Formulas 1 to 9, that is, prove that for given vectors
x1, ..., xm and scalar r,
r(x1 + ... + xm) = rx1 + ... + rxm.
Use mathematical induction: starting with m = 2, assume the formula true with m terms
and prove that it's true with m + 1 terms.
FIGURE 1.3 (a) Position. (b) Directed arrow.
FIGURE 1.4 (a) d in ℝ². (b) d in ℝ³.
introduce are valid in ℝⁿ for all values of n, allowing us to extend the geometric
concepts to higher dimensions in ways that are essential for applying geometric
reasoning to problems with many variables. In particular, Sections 3 and 4 will
extend and justify the intuitive geometric ideas of line and perpendicularity that we
used to introduce axes and projections onto axes.
2B Distance and Length
We define the distance between the points x = (x1, ..., xn) and y = (y1, ..., yn)
in ℝⁿ to be
√((x1 - y1)² + ... + (xn - yn)²).
This definition agrees with the usual notion of distance in ℝ¹, ℝ², and ℝ³. When
n = 1, √((x1 - y1)²) = |x1 - y1|, the absolute value of x1 - y1, which is the natural
distance between points x and y in ℝ. Application of the theorem of Pythagoras to
the right triangles in Figures 1.4(a) and (b) shows that the formula defines in ℝ² and
ℝ³ the usual geometric distance. This agreement motivates the definition. We'll see
in Section 4 that distance defined here has other basic properties of distance that we
would want in ℝⁿ.
EXAMPLE 1 In ℝ, the distance between 2 and -5 is |2 - (-5)| = 7. In ℝ², the distance between
(2, 4) and (-3, 1) is
√((2 - (-3))² + (4 - 1)²) = √(25 + 9) = √34.
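The length of a vector x = (x1, ..., xn) is denoted by |x| and defined to be
|x| = √(x1² + ... + xn²).
The distance between x and y can then be written as
|x - y| or |y - x|,
since these two numbers are both equal to
√((x1 - y1)² + ... + (xn - yn)²).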
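A short Python sketch of these definitions may help fix the ideas. The function names length and distance are our own, and the results are floating-point approximations rather than exact square roots.

```python
import math


def length(x):
    """Length |x|: the square root of the sum of the squared entries."""
    return math.sqrt(sum(xk * xk for xk in x))


def distance(x, y):
    """Distance between x and y, computed as the length of x - y."""
    if len(x) != len(y):
        raise ValueError("distance needs vectors with the same number of entries")
    return length(tuple(xk - yk for xk, yk in zip(x, y)))


print(distance((2,), (-5,)))        # 7.0
print(distance((2, 4), (-3, 1)))    # square root of 34, roughly 5.83
print(distance((2, 4), (-3, 1)) == distance((-3, 1), (2, 4)))  # True
```

The last line illustrates that |x - y| and |y - x| agree, since each difference is squared before summing.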
FIGURE 1.6 Scalar multiples.
Multiplying the position vector of a point by ½ moves the point halfway directly
toward the origin. For instance, let x = (1, 2). Then ½x = (½, 1), 2x = (2, 4), and
-x = (-1, -2), as shown in Figure 1.6(a). Figure 1.6(b) shows y = (1, 2, 1), 3y =
(3, 6, 3), and -2y = (-2, -4, -2).
The point midway between two points will be useful in Section 2D for arriving
at a geometric interpretation for the sum x + y. We define the midpoint m between
x and y by m = ½(x + y) = ½x + ½y, motivated by observing that the coordinates of
m will then be the averages of the coordinates of x and y. Furthermore, the distances
from x to m and y to m are
|x - m| = |½(x - y)| = ½|x - y| and |y - m| = |½(y - x)| = ½|x - y|,
each of which is half the distance between x and y. Thus the sum of the distances
from x to m and from m to y is equal to the distance from x to y, so x, m, and y
are collinear instead of being the vertices of a triangle. This justifies calling m the
midpoint. See Figure 1.7.
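The midpoint argument can be spot-checked numerically. In the sketch below the sample points (1, 2) and (3, 4) and the helper names midpoint and distance are our own choices for illustration.

```python
import math


def midpoint(x, y):
    """Midpoint m = (x + y)/2: each entry is the average of matching entries."""
    return tuple((xk + yk) / 2 for xk, yk in zip(x, y))


def distance(x, y):
    """Distance between x and y in any number of dimensions."""
    return math.sqrt(sum((xk - yk) ** 2 for xk, yk in zip(x, y)))


x, y = (1, 2), (3, 4)
m = midpoint(x, y)
print(m)                                                    # (2.0, 3.0)
# Each half is half the whole distance, so the two halves add up to it.
print(math.isclose(distance(x, m), distance(x, y) / 2))     # True
print(math.isclose(distance(x, m) + distance(m, y), distance(x, y)))  # True
```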
Let x = (-1, 2) and y = (3, 1). Figure 1.10(a) shows x + y, x - y, and x + 2y.
Figure 1.10(b) shows x + y and x - y when x = (1, 2, 4) and y = (1, -1, 1).
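Choose coordinate axes so that i points east, j points north, and k points up. Find a
vector of length 1 that points in the direction of the sun (a) when the sun is due south
and 60° above the horizon, and (b) when it is in the southwest and 30° above the horizon.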
FIGURE 1.10
In each case consider the required vector as an arrow starting at the origin that's
the hypotenuse of a right triangle whose other sides are vertical and horizontal, as
in Figure 1.10(c).
In case (a), the vertical side has length sin 60° = √3/2 and the horizontal side
has length cos 60° = 1/2. The required vector is the sum of the horizontal side
considered as an arrow pointing south and the vertical side pointing up. Thus the
required vector is -½j + (√3/2)k.
In case (b), the vertical component has length sin 30° = 1/2 and is (1/2)k.
The horizontal component has length cos 30° = √3/2 and points southwest, which
is the direction of -i - j. Since |-i - j| = √2, we have to multiply -i - j by
(√3/2)/√2 = √6/4 to get a vector of length √3/2, so the required vector is the
sum -(√6/4)i - (√6/4)j + ½k of the two components.
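As a sanity check on the two answers, the following sketch confirms that each vector has length 1 and makes the stated angle with the horizontal plane. Vectors are stored as (east, north, up) entries, and the helper names length and elevation_degrees are our own.

```python
import math

# Entries are the coefficients of i (east), j (north), and k (up).
case_a = (0.0, -1 / 2, math.sqrt(3) / 2)
case_b = (-math.sqrt(6) / 4, -math.sqrt(6) / 4, 1 / 2)


def length(x):
    """Length of a vector given as a tuple of entries."""
    return math.sqrt(sum(xk * xk for xk in x))


def elevation_degrees(x):
    """Angle in degrees between x and the horizontal plane."""
    return math.degrees(math.asin(x[2] / length(x)))


for v in (case_a, case_b):
    print(round(length(v), 12), round(elevation_degrees(v), 9))
# prints 1.0 60.0 for case (a) and 1.0 30.0 for case (b)
```

In case (b) the east and north entries are equal and negative, which is what pointing southwest means in these coordinates.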
To illustrate how vector calculations can prove geometric theorems, we show that
the diagonals of a parallelogram intersect at their midpoints. We choose one vertex
of the parallelogram to be the origin 0 and let the adjacent vertices have position
vectors a and b. The fourth vertex then has position vector a + b. The midpoint
between a and b has position vector ½a + ½b, while the midpoint between 0 and
a + b has position vector ½0 + ½(a + b). Since these vectors are equal, the two
midpoints are the same.
the force is applied. The choice should always be made to create the picture, mental
or visible, that conveys the underlying idea in the most useful way. Experience is
the best guide, and examples point the way.
Arrows. We've discussed arrows, their tails, and their tips in an intuitive way
without a formal definition. We'll clarify the mathematical connections among these
geometric ideas as follows. For vectors x and y in ℝⁿ, the arrow or directed line
segment from x to y consists of all points of ℝⁿ of the form z(t) = (1 - t)x + ty,
where 0 ≤ t ≤ 1. We call x the tail of the arrow and y the tip. To see that the
definition of z(t) is appropriate, we need to show that z(t) covers all the points from
tail to tip as t increases from 0 to 1. To prove this consider the distance relations
|((1 - t)x + ty) - x| = t|y - x| and |((1 - t)x + ty) - y| = (1 - t)|x - y|,
generalizing the case t = ½ that justifies the midpoint formula in Section 2C. The
equations tell us that a typical point (1 - t)x + ty of the segment is at distance t|y - x|
along the segment from x to y and is distance (1 - t)|x - y| along the segment from
y to x. All of these points are collinear with x and y, because the two distances add
up to the distance between x and y. As t runs from 0 to 1, z(t) covers exactly the
points from x to y in that order as shown in Figure 1.11, so the segment inherits
the order of the points in the interval 0 ≤ t ≤ 1 in the real number line. To reverse
direction we just interchange x and y in the formula for z(t).
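The two distance relations for z(t) can be tested on sample data. This is only a numerical spot check under our own choice of points and parameter values, not a proof; the names segment_point and distance are hypothetical.

```python
import math


def segment_point(x, y, t):
    """Point z(t) = (1 - t)x + ty on the arrow from x (tail) to y (tip)."""
    return tuple((1 - t) * xk + t * yk for xk, yk in zip(x, y))


def distance(x, y):
    """Distance between x and y."""
    return math.sqrt(sum((xk - yk) ** 2 for xk, yk in zip(x, y)))


x, y = (-1.0, 2.0), (3.0, 1.0)
d = distance(x, y)
checks = []
for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    z = segment_point(x, y, t)
    # Distance from the tail is t|y - x|; distance from the tip is (1 - t)|x - y|.
    checks.append(math.isclose(distance(z, x), t * d, abs_tol=1e-12))
    checks.append(math.isclose(distance(z, y), (1 - t) * d, abs_tol=1e-12))

print(all(checks))              # True
print(segment_point(x, y, 0))   # (-1.0, 2.0), the tail
print(segment_point(x, y, 1))   # (3.0, 1.0), the tip
```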
Equivalence. Two arrows, one from x to y and a second from z to w, are called
equivalent if the second is a translation of the other, which means that there is a
vector v such that z = x + v and w = y + v. If this is so, then (1 - t)z + tw =
(1 - t )x + ty + v so corresponding points on the two arrows are all translated by the
same vector v. Each arrow is equivalent to infinitely many translations of itself, one
for each vector v. An arrow and a few of its translates appear in Figure 1.12.
Every arrow from x to y is equivalent to one with its tail at 0; just take v = -x
to get the arrow from x - x = 0 to y - x; in other words, subtract the tail from the
tip. Among all the arrows equivalent to a given one, we have singled out the one
with its tail at 0 as the equivalent position vector, because it's natural to identify
this special one with the point at its tip, and hence with its coordinate vector in ℝⁿ.
The identification of arrows with elements of ℝⁿ via position vectors allows us to
do numerical computations involving arrows, even when it's better to think of their
tails as somewhere other than at 0; just translate each arrow to the location of the
equivalent position vector. Then identify each position vector with the unique element
of ℝⁿ that corresponds to it, and perform the desired operations in ℝⁿ. Finally, use
the reverse identification to find a position vector or some other equivalent arrow, if
that conveys more in a picture.
EXAMPLE 8 The arrows in a plane shown in Figure 1.12(a) are mutually equivalent and the unique
position vector associated with each one of them is p = (3, 1). The arrow from x to
y is related to p by the translation vector v = (-4, 1).
Drawing arrows in 3-dimensional perspective often requires more care. One strat-
egy is to locate the projections of the tail and tip of an arrow in the horizontal plane
and then mark off the corresponding vertical coordinates along a lightly traced line.
Then join the tail to the tip.
EXAMPLE 9 Figure 1.12(b) shows the position vector (1, -1, 1) translated once so that its tail is
at (2, 3, 2) and again so that its tail is at (-3, 0, 1).
EXERCISES
For each pair of elements in ℝ² or ℝ³ given in Exercises 1 to 4, sketch coordinate
axes and mark the points for which they are the coordinates. Draw the line segment
connecting the points and calculate its length. Also calculate the coordinates of the
midpoint of the segment, and mark it in the drawing.
1. (1, 1), (-2, 2)
2. (-1, 4), (2, -1)
3. (1, 1, 1), (1, -1, 1)
4. (1, 0, 0), (1, 2, 1)
For each vector in ℝ² or ℝ³ given in Exercises 5 to 8, draw arrows starting at the
origin representing x, ½x, and -2x. Find |x|, |½x|, and |-2x|.
5. (1, 1)
6. (-1, 2)
7. (1, 2, 2)
8. (-1, 1, 1)
For each pair of vectors in ℝ² or ℝ³ given in Exercises 9 to 12, find the vectors
x + y, x - y, and x + 2y. Sketch coordinate axes and draw arrows representing the
vectors you have found.
11. x = (1, 1, 1), y = (1, 1, -1)
12. x = (0, 0, -1), y = (0, 1, 1)
For each pair of vectors given in Exercises 13 to 16, draw coordinate axes and arrows
representing x and y. Then use geometric constructions to draw arrows for 2x + y
and x - y.
13. x = (-2, 1), y = (1, 2)
14. x = (1, 0), y = (-2, -1)
15. x = (1, 0, 1), y = (2, 1, -1)
16. x = (0, 1, -1), y = (1, 1, 2)
In Exercises 17 to 20, draw the arrows with the given tails and tips. Then draw the
equivalent position vector.
17. Tip (1, 2), tail (2, 1)
18. Tip (-1, 1), tail (2, 2)
3A Lines
In both the plane and ℝ³ there are two natural ways of specifying a line, either as
passing through a particular point in a particular direction or as passing through two
particular points. We begin with the first of these.
A nonzero vector specifies a direction. Motivated by the earlier discussion of scalar
multiplication, we define two nonzero vectors u and v to have the same direction
if u = tv for some positive number t and the opposite direction if u = tv for some
negative number t. We say that u and v are parallel if they have either the same or
the opposite direction.
Let u be a nonzero vector. We define the line through the origin in the direction
of u to consist of all points whose position vectors are multiples of u, as illustrated
in Figure 1.13(a). Translating all the points of such a line by a fixed vector v gives a
parallel line through the point whose position vector is v as in Figure 1.13(b). Hence
we make the following definition.
A line is a set consisting of all the points tu + v, where u and v are fixed position
vectors, while u ≠ 0 and t ranges over all real numbers. Each value of t corresponds
to a point of the line. A variable t used in this way is called a parameter, and
the expression tu + v is called the parametric representation of the line in vector
form.
FIGURE 1.13 Arrows on a line.
Although motivated by geometric intuition in ℝ² and ℝ³, this definition of a line
applies in ℝⁿ, as do the definitions of same and opposite direction and parallelism.
Given the vectors u = (2, 1) and v = (-1, 1) in ℝ², suppose that we want to sketch
the line of points x = tu + v. We think of the direction of the line as determined
by the vector u and so sketch the line tu through the origin. Then draw the line
parallel to tu that passes through the tip of the arrow representing v; this gives the
line of points x = tu + v, shown in Figure 1.14(a). Alternatively, plot two points on
the line, for example, those obtained by setting t = 0, which gives x = v = (-1, 1),
and t = 1, which gives x = u + v = (2, 1) + (-1, 1) = (1, 2). Then draw the line
through these two points as shown in Figure 1.14(b).
If we let (x, y) be the coordinates of x, the vector equation x = tu + v, or
(x, y) = t(2, 1) + (-1, 1), is equivalent to the pair of numerical equations
x = 2t - 1
y = t + 1,
which are scalar parametric equations for the line. We'll usually use the more concise
vector representation, but both forms are equally valid.
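The parametric representation translates directly into a small function. The sketch below uses the vectors u = (2, 1) and v = (-1, 1) from this example; the function name line_point is our own.

```python
def line_point(u, v, t):
    """Point tu + v on the line through v with nonzero direction vector u."""
    if all(uk == 0 for uk in u):
        raise ValueError("the direction vector u must be nonzero")
    return tuple(t * uk + vk for uk, vk in zip(u, v))


u, v = (2, 1), (-1, 1)
print(line_point(u, v, 0))   # (-1, 1)
print(line_point(u, v, 1))   # (1, 2)

# The scalar parametric equations x = 2t - 1, y = t + 1 give the same points.
for t in (-2, 0, 1, 3):
    assert line_point(u, v, t) == (2 * t - 1, t + 1)
```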
To find a parametric representation for the line through two distinct points with
position vectors a and b, recall that the vector from a to b is b - a. Thus b - a
gives the direction of the line, and a is a point on it, so
x = t(b - a) + a
is a parametric representation of the line. A more symmetrical rewriting of this
expression is
3.1 x = (1 - t)a + tb.
When t = 0, x = a and when t = 1, x = b. For t between 0 and 1 we saw in Section
2E that x is on the line segment between a and b. Thus if t = ½, x is the midpoint.
FIGURE 1.14 Points on a line.
To find a parametric representation for the line in ℝ³ through (-1, 1, 0) and (2, 2, 1),
we find (2, 2, 1) - (-1, 1, 0) = (3, 1, 1) as the vector from the first point to the second
and obtain
x = t(3, 1, 1) + (-1, 1, 0).
To find out whether a point such as (1, -2, 2) lies on this line, we check to see
whether there is a value of t such that
t(3, 1, 1) + (-1, 1, 0) = (1, -2, 2).
The second coordinates match only if t = -3, and the third coordinates match only
if t = 2, so the equation has no solution and the point is not on the line.
If we ask instead about the point (-10, -2, -3), we check the equation
t(3, 1, 1) + (-1, 1, 0) = (-10, -2, -3).
Here t = -3 makes all three coordinates match, so the point is on the line.
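The coordinate-by-coordinate test used in this example can be sketched as follows. The function name parameter_on_line is hypothetical, and the sketch assumes integer entries so that exact fractions can be compared.

```python
from fractions import Fraction


def parameter_on_line(u, v, p):
    """Return the t with t*u + v == p, or None if p is not on the line.

    Assumes u is nonzero and that u, v, p have the same number of
    integer entries.
    """
    t = None
    for uk, vk, pk in zip(u, v, p):
        if uk == 0:
            # This coordinate does not depend on t, so it must already match.
            if pk != vk:
                return None
            continue
        tk = Fraction(pk - vk, uk)   # the only t that matches this coordinate
        if t is None:
            t = tk
        elif tk != t:
            return None              # two coordinates demand different t
    return t


u, v = (3, 1, 1), (-1, 1, 0)
print(parameter_on_line(u, v, (1, -2, 2)))      # None
print(parameter_on_line(u, v, (-10, -2, -3)))   # -3
```

Each coordinate with a nonzero entry of u forces one value of t, and the point lies on the line exactly when all of those values agree.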
Let a = (1, 1, 0) and b = (2, -1, 0), and let L be the line through them. The direction
of L is b - a = (2, -1, 0) - (1, 1, 0) = (1, -2, 0). So L has the representation
t(1, -2, 0) + (1, 1, 0). The line through the origin parallel to L has the representation
t(1, -2, 0) and the line through (1, 2, 3) parallel to L has the representation
t(1, -2, 0) + (1, 2, 3).
The line through (-1, 2, 3) and (1, -2, 3) is parallel to L, because it has
(-1, 2, 3) - (1, -2, 3) = (-2, 4, 0) for a direction vector, and the vector (-2, 4, 0) =
-2(1, -2, 0) is a scalar multiple of (1, -2, 0). The line through (-1, 2, 3) and the
origin is not parallel to L, because it has (-1, 2, 3) for a direction vector and (-1, 2, 3)
is not a scalar multiple of (1, -2, 0).
A given line has many different representations. In the expression tu+v, the vector
u giving the direction is replaceable by a nonzero multiple ru, and v is replaceable
by the position vector of another point on the line without changing the line being
represented. A way to check whether two representations give the same line is to
take two distinct points in one representation and see whether they are also given by
the other, as in the following example.
Setting t = 0 and t = 1 in the second representation gives the points (0, -1) and
(2, 3). These points are both given by the first representation because
when t = 1, and
A motorboat starts out from a dock on a lake at 10 miles per hour in the direction
given by 4i + 3j. (We take coordinates with origin at the dock and i pointing east and
j pointing north, with the unit of distance 1 mile.) At the same time a rowboat starts
north at 4 miles per hour from a point 2 miles east of the dock. The boats move along
the lines with parametric representations t1(4i + 3j) and 2i + t2j. The lines intersect
where t1(4i + 3j) = 2i + t2j, giving 4t1 = 2 and 3t1 = t2. Then t1 = 1/2, t2 = 3/2, and
the point of intersection is ½(4i + 3j) = 2i + (3/2)j.
The representations we have used tell us the paths followed by the boats but take
no account of how fast they go. The vector 4i + 3j has length 5; if we double it, and
let m(t) = t(8i + 6j), we have a function describing the motion of a boat that goes 10
miles in 1 hour and is at the origin (the dock) when t = 0. Similarly r(t) = 2i + 4tj
describes the motion of the rowboat. The motorboat reaches the point 2i + (3/2)j
when t = 1/4, and the rowboat reaches it when t = 3/8, which is 1/8 hour = 7½ minutes
later, so the boats do not run into each other.
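The arithmetic in the boat example can be retraced with exact fractions. The sketch below simply follows the steps of the example; the variable names are our own, and positions are stored as (east, north) pairs in miles.

```python
from fractions import Fraction

# Paths: motorboat t1*(4i + 3j), rowboat 2i + t2*j.
# Matching the east entries gives 4*t1 = 2; the north entries give 3*t1 = t2.
t1 = Fraction(2, 4)
t2 = 3 * t1
crossing = (4 * t1, 3 * t1)          # the point 2i + (3/2)j

# Motions, with t in hours: m(t) = t*(8i + 6j) and r(t) = 2i + 4t*j.
t_motor = crossing[0] / 8            # solve 8t = 2
t_row = crossing[1] / 4              # solve 4t = 3/2
gap_minutes = (t_row - t_motor) * 60

print(t1, t2)                        # 1/2 3/2
print(t_motor, t_row, gap_minutes)   # 1/4 3/8 15/2
```

The paths cross, but the boats pass the crossing point at different times, which is the distinction the example draws between a path and a motion along it.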
Recall from Section 2E that the points on the segment directed from x to y were
represented in the form z(t) = (1 - t)x + ty with the parameter t increasing over
the interval 0 ≤ t ≤ 1. Rewriting z(t) shows that z(t) = t(y - x) + x, so letting t
run over all real numbers gives the points on the line through x in the direction of
y - x, assuming y - x ≠ 0. When t > 1 the equations
3B Planes
If we pick two nonzero vectors and picture them as arrows from the origin, then
unless they lie along the same line, there is a unique plane that contains both arrows.
We define a plane through the origin to be the set of linear combinations
x = t1u1 + t2u2,
where neither u1 nor u2 is a scalar multiple of the other. We say that u1 and u2 are
linearly independent if neither one is a scalar multiple of the other. Geometrically
this means that the vectors aren't parallel as defined on page 18 and that neither one
is the zero vector.
A part of such a plane appears in Figure 1.15(a). In Figure 1.15(b) the plane
through the origin has been translated by adding a fixed vector v to every point
of the plane through the origin. Formally, we define a plane to be a set of points
whose position vectors have the form t1u1 + t2u2 + v, where u1, u2, and v are fixed
vectors, u1 and u2 are linearly independent, and t1 and t2 range over all real numbers.
The variables t1 and t2 are parameters and t1u1 + t2u2 + v is called a parametric
representation of a plane in vector form.
This definition of plane is motivated by the way we picture planes in ℝ³, but
like our definition of line, it applies in ℝⁿ. A characteristic property of geometric
planes is that they are flat, that is, if p and q are two distinct points in a plane,
then the entire line through p and q lies in the plane. To see that planes as we have
defined them have this property, let p = p1u1 + p2u2 + v and q = q1u1 + q2u2 + v
be two points in the plane that has parametric representation t1u1 + t2u2 + v. By
Formula 3.1, every point x on the line through p and q has a position vector of the
form (1 - t)p + tq for some number t. Substituting for p and q in this formula and
rearranging the terms gives
x = ((1 - t)p1 + tq1)u1 + ((1 - t)p2 + tq2)u2 + v.
This shows that x is in the plane because it has the form t1u1 + t2u2 + v with
t1 = (1 - t)p1 + tq1 and t2 = (1 - t)p2 + tq2.
FIGURE 1.16 Parallel planes.
EXAMPLE 1 Let u1 = (1, 0, 0), u2 = (0, 1, 0), and v = (0, 0, 2). The vectors t1u1 + t2u2 are just
the vectors (t1, t2, 0) making up the xy-coordinate plane. The vectors
t1u1 + t2u2 + v = (t1, t2, 2) make up the parallel plane obtained by translating it by v.
Here's how to find a representation for the plane through three given points. For
three points x1, x2, and x3 to determine a unique plane passing through them, the
three points must not lie on a line. If the three points do not lie on a line, the vectors
x3 - x1 and x2 - x1 are linearly independent; otherwise x3 - x1 = t(x2 - x1), for
some value of t. Then x3 = t(x2 - x1) + x1, and x3 would lie on the line through
x1 and x2.
Let x1, x2, and x3 be three points that don't lie on a line. We just observed that
the vectors u1 = x3 - x1 and u2 = x2 - x1 are linearly independent. Then
x = t1(x3 - x1) + t2(x2 - x1) + x1
is the parametric representation of a plane; this is the plane containing the three
given points because x = x1 when t1 = t2 = 0, x = x2 when t1 = 0 and t2 = 1, and
x = x3 when t1 = 1 and t2 = 0.
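The construction just described can be sketched in Python. This is only an illustration under stated assumptions: the function names (subtract, linearly_independent, plane_through, plane_point) and the three sample points are hypothetical choices, not taken from the text.

```python
def subtract(a, b):
    """Componentwise difference a - b of two vectors given as tuples."""
    return tuple(ai - bi for ai, bi in zip(a, b))

def linearly_independent(u1, u2):
    """True if neither vector is a scalar multiple of the other.

    Two vectors are dependent exactly when every 2-by-2 minor
    u1[i]*u2[j] - u1[j]*u2[i] is zero (exact for integer entries).
    """
    n = len(u1)
    return any(u1[i] * u2[j] - u1[j] * u2[i] != 0
               for i in range(n) for j in range(i + 1, n))

def plane_through(x1, x2, x3):
    """Return (u1, u2, v) with u1 = x3 - x1, u2 = x2 - x1, v = x1."""
    u1 = subtract(x3, x1)
    u2 = subtract(x2, x1)
    if not linearly_independent(u1, u2):
        raise ValueError("the three points lie on a line")
    return u1, u2, tuple(x1)

def plane_point(u1, u2, v, t1, t2):
    """Evaluate the parametric representation t1*u1 + t2*u2 + v."""
    return tuple(t1 * a + t2 * b + c for a, b, c in zip(u1, u2, v))

# Hypothetical sample points, chosen only for illustration.
x1, x2, x3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)
u1, u2, v = plane_through(x1, x2, x3)
print(plane_point(u1, u2, v, 0, 0))  # recovers x1
print(plane_point(u1, u2, v, 0, 1))  # recovers x2
print(plane_point(u1, u2, v, 1, 0))  # recovers x3
```

The independence test is exact only for integer or rational entries; with floating-point data a tolerance would be needed.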
Then the parametric representation for the plane through the three points is
To picture the plane relative to rectangular coordinate axes in R^3, we can either draw
the translated vectors u1 + x1 and u2 + x1 as in Figure 1.17(a) or plot three or four
noncollinear points on the plane as in Figure 1.17(b).
EXERCISES
In Exercises 1 to 4, represent the described line in the parametric form x = tu + v and sketch the line.
1. The line in R^2 parallel to the vector (2, 1) and passing through the point (-1, 2)
2. The line in R^2 through the points (1, 0) and (0, 1)
3. The line in R^3 passing through the points (2, 2, 3) and (1, 2, 2)
4. The line in R^3 parallel to the vector (1, 1, 2) and passing through the point (2, 0, 1)
5. Let a = (-1, 1), b = (0, 1), c = (2, 1), and d = (-3, 2).
(a) Sketch the lines ta + b and sc + d.
(b) Find the point p where the lines in part (a) intersect by finding values of s and t for which
p = ta + b = sc + d.
(c) Change c to (2, -2) and show that then the lines ta + b and sc + d do not intersect.
6. Let a = (-3, 0, 1), b = (0, 1, 2), c = (2, -1, 1), and d = (1, 2, 0).
(a) Sketch the lines ta + b and sc + d.
(b) Find the point of intersection of the lines in part (a), or show that they do not intersect.
(c) What is the answer to part (b) if d is changed to (-1, 0, 1)?
For the pairs of lines given in Exercises 7 to 10, find out whether the two lines are the same, and if they aren't, whether they are parallel.
7. t(1, 2) + (2, 1) and t(-2, -4) + (3, 3)
8. t(2, -1) + (2, 1) and t(-1, 2) + (2, -1)
9. t(2, 3, -1) + (-1, -1, 1) and t(-4, -6, 2) + (1, 1, -1)
10. t(4, 2, 2) + (2, 0, 1) and t(2, 1, 1) + (2, 2, 1)
Which of the pairs of vectors given in Exercises 11 to 14 are linearly independent?
11. (1, 2), (2, 1)
12. (2, -1), (-2, 1)
13. (3, 1, 3), (1, 3, 1)
14. (-1, 3, 1), (2, -6, -2)
In Exercises 15 to 18, find a representation for the given plane in the parametric form t1u1 + t2u2 + v, and sketch the plane.
15. The plane parallel to the vectors (1, 1, 0) and (0, 1, 1) and passing through the origin
16. The plane parallel to the vectors e1 and e2 in R^3 and passing through the point (0, 0, 2)
17. The plane passing through the three points (1, 0, 0), (0, 1, 0), and (0, 0, 1)
18. The plane passing through the three points (1, 1, 0), (-3, 0, 2), and (2, 4, 7)
19. A plane P in R^3 contains the line t(1, -1, 2) + (1, 2, 1) and the point (3, 0, 1).
(a) Find a vector between two points of P that is linearly independent of (1, -1, 2).
(b) Find a parametric representation for P.
20. Let the vertices of a triangle be the points a, b, and c and let p, q, and r be the midpoints of the sides opposite a, b, and c, respectively. A line joining a vertex to the midpoint of the opposite side is called a median of the triangle. Show that the point (1/3)a + (2/3)p is on the median that joins a and p. Express this point in terms of a, b, and c. Show also that it is on the other two medians of the triangle.
21. Show that if a, b, c, and d are the position vectors of the vertices of a quadrilateral, not necessarily lying in a plane, then the midpoints of the four sides are vertices of a parallelogram and do lie in a plane.
22. Show that two lines are parallel if and only if for two distinct points v1 and v2 on the first line, and two distinct points w1 and w2 on the second line, the difference w1 - w2 is a multiple of v1 - v2.
23. (a) Show that if tu = 0, for u a vector in R^n, then either t = 0 or u = 0. Can you derive the same result using only the laws 1 through 9 on page 3?
(b) Show that if u1 is a scalar multiple of u2, and is not zero, then u2 is a scalar multiple of u1.
(c) Show that tu1 + v1 and tu2 + v2 represent the same line if and only if both u2 and v2 - v1 are scalar multiples of u1.
24. Let u = (u1, u2), v = (v1, v2) be nonzero vectors in R^2.
(a) Show that u and v are parallel if and only if u1v2 = u2v1.
(b) Let a and b be vectors in R^2, so that tu + a and sv + b are two lines in the plane. Show that if u and v are not parallel, then there are values of s and t such that tu + a = sv + b, so the lines intersect. (Write out the vector equation tu + a = sv + b as a pair of scalar equations and show that they can always be solved if u1v2 ≠ u2v1.)
25. A subset S of R^n is convex if whenever it contains a and b it also contains the line segment joining them, that is, all points ta + (1 - t)b with 0 ≤ t ≤ 1. Show that if S and T are convex subsets of R^n, then the set S + T of all sums x + y, with x in S and y in T, is also convex.
x • x = x1^2 + · · · + xn^2.
Referring to our earlier definition of the length of x in Section 2B as
|x| = √(x1^2 + · · · + xn^2),
we see that the length of x equals the square root of the dot product of x with itself:
|x| = √(x • x).
FIGURE 1.18 (a) |x - y|^2 < |x|^2 + |y|^2 (b) |x - y|^2 = |x|^2 + |y|^2 (c) |x - y|^2 > |x|^2 + |y|^2
We get a relation between dot products and angles by recalling from Section 2
that if two sides of a triangle represent vectors x and y, then the third side represents
the vector x - y from the tip of y to the tip of x as illustrated in Figure 1.18. The
key to the relation is the familiar trigonometry formula called the law of cosines that
relates the length of one side of a triangle to the lengths of the other sides and the
angle between them. Applied to the triangles in the figure it gives the following:
4.4 Law of Cosines.
|x - y|^2 = |x|^2 + |y|^2 - 2|x||y| cos θ,
where θ is the angle between x and y.
We now use the law of cosines to derive the following fundamental geometric
property of the dot product.
4.5 Theorem. The dot product of two nonzero vectors is equal to the product of
their lengths times the cosine of the angle between them. In other words, if θ is the
angle between vectors x and y, then
x • y = |x||y| cos θ.
Proof. Using the additivity and homogeneity of the dot product, we have
|x - y|^2 = (x - y) • (x - y)
= x • (x - y) - y • (x - y)
= x • x - x • y - y • x + y • y
= |x|^2 + |y|^2 - 2(x • y),
where the last step uses the symmetry property y • x = x • y. Comparing this with
the value for |x - y|^2 given by the law of cosines shows that x • y = |x||y| cos θ. ∎
As illustrated in Figure 1.18, the formulas for |x - y|^2 in the proof of Theorem 4.5
show that when x • y > 0 and cos θ > 0 then |x - y|^2 < |x|^2 + |y|^2; similarly, when
x • y < 0 and cos θ < 0, then |x - y|^2 > |x|^2 + |y|^2. The special case of perpendicular
vectors x and y shown in Figure 1.18(b), when cos θ and x • y are both 0, is particularly
simple and important and we emphasize it as follows:
4.7 θ = arccos( (x • y) / (|x||y|) ).
If either x or y is 0, the right side of Equation 4.7 is undefined, which is appropriate
because the zero vector has no direction and so can't form an angle with another
vector.
x • x = 1^2 + 3^2 = 10, y • y = (-1)^2 + 1^2 = 2, and x • y = -1 + 3 = 2.
Then
|x| = √10, |y| = √2, and cos θ = 2/√20 = 1/√5.
Using a calculator we find θ ≈ arccos .447 ≈ 1.1 radians, or about 63°.
To find the angle θ between x = (1, 2, 0, -2) and y = (0, -6, 3, 2) in R^4 we
compute
x • x = 1^2 + 2^2 + 0^2 + (-2)^2 = 1 + 4 + 0 + 4 = 9,
y • y = 0^2 + (-6)^2 + 3^2 + 2^2 = 0 + 36 + 9 + 4 = 49, and
x • y = 0 - 12 + 0 - 4 = -16.
Then |x| = 3, |y| = 7, and cos θ = -16/21, giving θ ≈ arccos(-.762) ≈ 2.44
radians, or about 139.6°.
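The angle computations in these examples follow Equation 4.7 directly, and a short Python sketch can reproduce them. The helper names dot, length, and angle are hypothetical; the clamp before acos is an added safeguard against floating-point rounding, not something the text requires.

```python
import math

def dot(x, y):
    """Dot product of two vectors of the same dimension."""
    return sum(a * b for a, b in zip(x, y))

def length(x):
    """Length |x|, the square root of x . x."""
    return math.sqrt(dot(x, x))

def angle(x, y):
    """Angle in radians between nonzero vectors x and y (Equation 4.7)."""
    lx, ly = length(x), length(y)
    if lx == 0 or ly == 0:
        raise ValueError("the zero vector forms no angle with another vector")
    c = dot(x, y) / (lx * ly)
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    return math.acos(max(-1.0, min(1.0, c)))

x = (1, 2, 0, -2)
y = (0, -6, 3, 2)
theta = angle(x, y)
print(dot(x, y), length(x), length(y))   # -16 3.0 7.0
print(round(math.degrees(theta), 1))     # about 139.6 degrees
```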
FIGURE 1.19 Perpendicular vectors.
x1 + 2x2 + 3x3 = 0,
x1 - x3 = 0.
The second equation gives x1 = x3, and substituting in the first equation gives
4x1 + 2x2 = 0 or x2 = -2x1. Taking x1 = 1 gives x = (1, -2, 1) as one solution,
and any scalar multiple of x is also perpendicular to a and b. The length of x is
√6, so to get a vector of length 2 we must multiply by ±2/√6 = ±√(2/3), giving
x = ±(√(2/3), -2√(2/3), √(2/3)).
To illustrate how to use the dot product to prove geometric theorems, we'll show that
in a parallelogram with all four sides the same length the diagonals are perpendicular.
Figure 1.20 shows such a parallelogram with diagonals parallel to the vectors x + y
and x - y. To show that the diagonals are perpendicular we need to show that
(x + y) • (x - y) = 0. Using the additivity property of the dot product twice allows
us to multiply out, getting
(x + y) • (x - y) = x • (x - y) + y • (x - y)
= x • x + x • (-y) + y • x + y • (-y).
By homogeneity, x • (-y) = -(x • y) and y • (-y) = -(y • y). Since the lengths of x
and y are equal, x • x = |x|^2 = |y|^2 = y • y, so the right side is zero, as we wanted
to show.
FIGURE 1.20
|x • y| ≤ |x||y|
Proof. Theorem 4.5 gives x • y = |x||y| cos θ for some angle θ. Taking absolute
values, and noting that the absolute value of a product of real numbers equals the
product of their absolute values, we have
|x • y| = |x||y||cos θ| ≤ |x||y|,
since |cos θ| ≤ 1.
4.9 Theorem. The length function defined on R^n by |x| = √(x • x) has the
properties:
Positivity: |x| > 0 except that |0| = 0
Homogeneity: |rx| = |r||x|
Triangle Inequality: |x + y| ≤ |x| + |y|
Proof. Positivity is an immediate consequence of the positivity property of dot
products, for since x • x > 0 unless x = 0, the same is true of |x| = √(x • x). We leave
the proof of homogeneity to the reader in Exercise 12.
Geometrically, the triangle inequality is equivalent to the theorem that a side of
a triangle cannot be longer than the sum of the lengths of the other two sides, which
is why it's called the triangle inequality. See Figure 1.21.
For an algebraic proof, we start with the equation
|x + y|^2 = (x + y) • (x + y) = |x|^2 + |y|^2 + 2x • y.
From the Cauchy-Schwarz inequality, x • y ≤ |x||y|, so
|x + y|^2 ≤ |x|^2 + |y|^2 + 2|x||y| = (|x| + |y|)^2,
and taking square roots gives the triangle inequality.
FIGURE 1.21 Triangle inequality.
4.10 Theorem. Let x and n be vectors in R^n, with |n| = 1. There are unique
vectors p and q such that x = p + q, with p parallel to n and q perpendicular
to n. The vector p equals (x • n)n, with length |x • n|, and q = x - p, with length
|q| = √(|x|^2 - |p|^2).
Proof. Since p is parallel to n there is a scalar r such that p = rn. For q = x - p
to be perpendicular to n, we need (x - p) • n = 0. But since n • n = 1,
(x - p) • n = (x - rn) • n = x • n - r(n • n) = x • n - r,
so (x - p) • n = 0 if and only if r = n • x. By Pythagoras |q|^2 = |x|^2 - |p|^2. ∎
The standard basis vectors e1, e2, ..., en for R^n are unit vectors and are orthogonal
to each other. In particular, in R^3 we have i • i = j • j = k • k = 1 and i • j = j • k =
k • i = 0. If x = (x1, x2, ..., xn) = x1e1 + · · · + xnen is an arbitrary vector in R^n
then its coordinate in the direction of e_i is x • e_i = x_i. In R^3, if v = xi + yj + zk
then x = v • i, y = v • j, and z = v • k.
A mechanical force has magnitude and direction and so is a vector with its arrow's
tail usually drawn at the point of application of the force. It's often useful to express
a force as a sum of perpendicular components because they act independently of
each other, as in the next example.
Here we analyze the effect of gravity on a 1-pound brick held in place by friction
on a roof that slopes so that n = (2/7)i + (3/7)j + (6/7)k is the unit vector perpendicular to the
roof, as in Figure 1.24. Gravity exerts a downward vertical force of 1 pound, which
as a vector is -k. The brick doesn't move because the roof exerts an opposing force
F = k on it. The component of F in the direction of n is
p = (F • n)n = (6/7)n = (12/49)i + (18/49)j + (36/49)k
and is the force with which the roof presses directly against the brick. The other
force component F - p = -(12/49)i - (18/49)j + (13/49)k is perpendicular to n and thus parallel
to the roof, and is the frictional force that keeps the brick from sliding.
The vector v = 2i + 3j + 6k has the same direction as n and could have been
given instead of n to specify the way the roof slopes. In that case we would have to
start by calculating |v| = 7 and finding the unit vector n = v/|v|.
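The decomposition of Theorem 4.10, applied to the brick example, can be checked with exact rational arithmetic. The sketch below is illustrative only: the names dot and decompose are hypothetical, and the unit normal is taken to be v/|v| with v = 2i + 3j + 6k, as the example states.

```python
from fractions import Fraction

def dot(x, y):
    """Dot product of two vectors of the same dimension."""
    return sum(a * b for a, b in zip(x, y))

def decompose(x, n):
    """Split x as p + q with p parallel and q perpendicular to the unit vector n."""
    r = dot(x, n)                              # coordinate of x in the direction of n
    p = tuple(r * ni for ni in n)              # component (x . n) n
    q = tuple(xi - pi for xi, pi in zip(x, p))
    return p, q

# Unit normal n = v/|v| for v = (2, 3, 6), |v| = 7; F = k is the roof's force.
n = (Fraction(2, 7), Fraction(3, 7), Fraction(6, 7))
F = (0, 0, 1)
p, q = decompose(F, n)
print(p)          # components 12/49, 18/49, 36/49
print(q)          # components -12/49, -18/49, 13/49
print(dot(q, n))  # 0, so q is perpendicular to n
```

Using Fraction keeps every value exact, so the perpendicularity check is an equality rather than a floating-point approximation.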
FIGURE 1.24 Brick on a roof.
The work done by a force in moving an object a certain distance in a straight line
is the product of the distance and the magnitude of the force, provided the motion
is in the direction of the force. More generally, the work done is the distance times
the coordinate of the force in the direction of the motion. Suppose a force F moves
an object through a displacement d so the distance moved is d = |d|. If n is the unit
vector in the direction of d, then d = dn and the coordinate of F in the direction of
d is F • n. The work done is (F • n)d = F • (dn) = F • d.
EXAMPLE 6 The bottom of a snow-covered slope is at the origin, and the top at (100, 20), with
units measured in feet. Pulling a child's sled up the slope takes a force given by the
vector (8, 3), in units of pounds. The work done is (100, 20) • (8, 3) = 800 + 60 = 860
foot-pounds.
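As a quick check of the sled example, the following Python sketch computes the work both as F • d and as the coordinate of F along the motion times the distance. The function names are hypothetical and the code is only a sketch of the two formulas in the paragraph above.

```python
import math

def dot(x, y):
    """Dot product of two vectors of the same dimension."""
    return sum(a * b for a, b in zip(x, y))

def work(force, displacement):
    """Work done by a constant force along a straight displacement: F . d."""
    return dot(force, displacement)

d = (100, 20)   # displacement in feet
F = (8, 3)      # force in pounds

distance = math.sqrt(dot(d, d))           # length of the displacement
n = tuple(c / distance for c in d)        # unit vector in the direction of motion
print(work(F, d))                         # 860 foot-pounds
print(dot(F, n) * distance)               # same value, up to rounding
```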
We use the dot product to analyze the flow of fluid, or heat, or radiant energy
described by a vector-valued function v(x,y,z) that gives the direction and magnitude
of the flow, that is, the flow velocity, at each point (x, y, z). We consider here only
the simplest kind of flow, in which v is a constant vector so that the flow is uniform
along parallel lines.
For a given flow there will be a rate of flow per time unit through a surface in its
path, which is called the flux through the surface. This is illustrated in Figure 1.25
showing a horizontally placed parallelogram P, and a flow vector v down and from
the right. The vector n is a unit vector perpendicular to P. The shaded region indicates
the volume flowing through P in one time unit, and is a solid B bounded by six
parallelograms with horizontal top and bottom, with its other four edges of length
|v| and parallel to v. The volume V(B) is the area A(P) of its top times the vertical
height h of B. Since h = n • v, the coordinate of v in the direction of n, the flux
through the top parallelogram is A(P)n • v. Note that the sign of the flux through P
would change if we reversed the direction of n.
Similar considerations apply to other planar figures R perpendicular to n and of
area A(R). If a flow velocity in R^3 is the constant vector v at every point, the flow
FIGURE 1.25 Flow's flux equals V(B).
We'll compute the flow of air through a window of area 10 square feet that is
perpendicular to the vector w = 3i - 4j when the wind velocity (in feet per second)
is 20i. Since |w| = 5, we find n = (1/5)w as the unit vector perpendicular to the plane
of the window, so An = 10n = 2w = 6i - 8j. The flux through the window is then
20i • (6i - 8j) = 120 cubic feet per second.
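The flux formula A n • v for a constant flow can be sketched as follows. The function name flux is hypothetical; the code normalizes whatever perpendicular vector it is given, which is the step the window example performs by hand.

```python
import math

def dot(x, y):
    """Dot product of two vectors of the same dimension."""
    return sum(a * b for a, b in zip(x, y))

def flux(area, normal, velocity):
    """Flux A (n . v) of a constant flow v through a flat region of the given area.

    `normal` is any nonzero vector perpendicular to the region; it is
    scaled to unit length before use.
    """
    length = math.sqrt(dot(normal, normal))
    unit = tuple(c / length for c in normal)
    return area * dot(unit, velocity)

# Window of area 10 perpendicular to w = 3i - 4j, wind velocity 20i.
window_flux = flux(10, (3, -4, 0), (20, 0, 0))
print(window_flux)  # approximately 120 cubic feet per second
```

Reversing the normal, for example passing (-3, 4, 0), changes the sign of the result, as the text notes.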
This example uses the dot product in a nongeometric context. Suppose a manufacturer
produces four different models of widgets. We can write information about the models
as vectors in R^4, with each entry of a vector corresponding to one of the four models.
Suppose unit production costs are given by a vector c = (2, 4, 5, 7), meaning that
it costs 2 dollars to produce each model 1 widget, 4 dollars for each model 2
widget, and so on. Similarly let the unit wholesale prices be w = (3, 6, 7, 10) and
retail prices be r = (5, 9, 11, 18). If p = (100, 30, 10, 5) gives the number of each
model produced in a day and s = (80, 40, 8, 3) the number sold at retail, then the
day's total manufacturing cost is p • c = (100)(2) + (30)(4) + (10)(5) + (5)(7) =
200 + 120 + 50 + 35 = 405 dollars, and the retailer's gross income (before expenses)
is s • (r - w) = (80, 40, 8, 3) • (2, 3, 4, 8) = 160 + 120 + 32 + 24 = 336 dollars for
the day.
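The bookkeeping in this example is a pair of dot products, which a few lines of Python can evaluate. The variable names mirror the vectors in the example; the helper names dot and subtract are hypothetical.

```python
def dot(x, y):
    """Dot product of two vectors of the same dimension."""
    return sum(a * b for a, b in zip(x, y))

def subtract(x, y):
    """Componentwise difference x - y."""
    return tuple(a - b for a, b in zip(x, y))

c = (2, 4, 5, 7)        # unit production costs
w = (3, 6, 7, 10)       # unit wholesale prices
r = (5, 9, 11, 18)      # unit retail prices
p = (100, 30, 10, 5)    # number of each model produced in a day
s = (80, 40, 8, 3)      # number of each model sold at retail

manufacturing_cost = dot(p, c)
retail_margin = dot(s, subtract(r, w))
print(manufacturing_cost)  # 405
print(retail_margin)       # 336
```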
EXERCISES
In Exercises 1 to 4, compute x • y for the given vectors.
1. x = (1, 3), y = (-2, 4)
2. x = (√2, √3), y = (√3, √2)
3. x = (-1, -1, 2), y = (1, 6, 3)
4. x = (1, 2, 1, 3), y = (0, 1, 2, 1)
In Exercises 5 to 8, for the given vectors u and v find (a) u • v, (b) |u| and |v|, and (c) the angle between u and v.
5. u = (1, 1), v = (1, 0)
6. u = (√3, 1), v = (1, √3)
7. u = (2, 1, 2), v = (1, 2, 2)
8. u = (3, 1, 1), v = (4, 1, 0)
In Exercises 9 and 10, use the information and approximate coordinate system of Exercises 34 to 38 on page 17.
9. Find the angle between the position vectors of New York and Los Angeles, and find the approximate airline distance between the two cities.
10. Do the same as in Exercise 9, for New York and Paris.
11. Prove that the dot product has the additivity and homogeneity properties in Theorem 4.2.
12. Prove the homogeneity property of length listed in Theorem 4.9.
In Exercises 13 to 16, find (a) the coordinate and (b) the component of the vector x in the direction of the vector v, and also the component of x perpendicular to v.
13. x = (1, -1, 2), v = (1/√3, 1/√3, 1/√3)
14. x = (1, -1, 2), v = (2/7, -3/7, 6/7)
15. x = (2, -3, 1), v = (1, 3, -2)
16. x = (-4, 0, -1), v = (0, -3, 2)
In Exercises 17 and 18, for the triangle with the given vertices A, B, C, find the lengths of its sides, and determine which of its angles are acute, obtuse, or right angles.
17. A = (2, -3, 6), B = (1, 3, -2), C = (1, 7, 1)
18. A = (1, 2, 4), B = (-2, -1, 2), C = (4, 2, -3)
19. (a) Show that for a vector x in R^3,
(b) If the vector x in part (a) is a unit vector, that is, a vector u of length 1, show that u • e_i = cos α_i, where α_i is the angle between u and e_i. The coordinates cos α_i are called the direction cosines of u relative to the standard basis vectors e_i. If x is a nonzero vector, the direction cosines of x are defined to be the direction cosines of the unit vector x/|x|.
(c) Find the direction cosines of (1, 2, 1).
20. Find the direction cosines of (6, -3, -2).
21. Show that the standard basis vectors satisfy e_i • e_j = 0 if i ≠ j, and e_i • e_j = 1 if i = j.
22. Show that if x ≠ 0, then the vector (1/|x|)x has length 1.
23. A solar energy collector with area 15 square meters is mounted so that its panels are perpendicular to the vector 4i + 3k. At what rate does solar energy fall on the collector if the vector i + j + 3k gives the direction of the sun and the rate of energy falling on a surface perpendicular to the sun's rays is 80 watts per square meter? (Use the method of Example 7, treating radiation as a flow of energy.)
24. At what rate does solar energy fall on the collector in Exercise 23 later in the day when the direction of the sun is -3j + k?
25. A wind blowing from the northwest exerts a force of 15 pounds on a bicycle rider who follows a road that goes 400 feet west and then 500 feet in a direction 30° north of west. In a coordinate system in which i points east and j points north, find a vector representing the force of the wind and vectors representing the two parts of the road. Calculate the work that the rider does against the wind in cycling along each part. If there were a road running straight from the starting point to the finish, would taking it make a difference in the total work done?
26. Suppose that a factory produces each day four different items in amounts represented by the production vector p = (25, 25, 15, 10) and that these items are sold according to the wholesale dollar price vector w = (100, 150, 200, 300). What is the total revenue for the factory from selling all of each day's production?
27. Derive the inequality
|x| - |y| ≤ |x - y|
from the triangle inequality, and then show that
| |x| - |y| | ≤ |x - y|.
This inequality is sometimes called the reversed triangle inequality.
28. Show that the sum of the squares of the lengths of the four sides of a parallelogram is equal to the sum of the squares of the lengths of the diagonals. [Hint: Write |x ± y|^2 = (x ± y) • (x ± y) and multiply out.]
29. Here is one way of proving the Cauchy-Schwarz inequality in R^n without appealing to geometry or trigonometry. Recall that if b^2 - 4ac > 0 and a ≠ 0, then the quadratic equation at^2 + bt + c = 0 has two distinct real roots, and there are some values of t that make the expression at^2 + bt + c negative. We suppose two vectors x and y are given in R^n, and we want to show that |x • y| ≤ |x||y|.
(a) Show that if either x or y is 0, then the inequality is true because both sides of it are 0. From now on we may assume that neither x nor y is 0.
(b) Using the properties 4.2 of the dot product, show that
|tx + y|^2 = (tx + y) • (tx + y)
= |x|^2 t^2 + 2(x • y)t + |y|^2
≥ 0 for all values of t.
(c) Use the remark at the beginning of the problem to conclude that
4(x • y)^2 - 4|x|^2 |y|^2 ≤ 0.
(d) Derive the Cauchy-Schwarz inequality from (c).
Once the inequality is established, we know that for nonzero vectors in R^n the ratio (x • y)/(|x||y|) is always between -1 and 1 and therefore equal to cos θ for a unique angle θ, with 0 ≤ θ ≤ π. Since the angle between two vectors is measured in the plane containing the vectors, we use Equation 4.7 to define angle in R^n.
30. If equality holds in the Cauchy-Schwarz inequality, that is, if |x • y| = |x||y|, then one of the vectors is a scalar multiple of the other. Prove this
(a) for vectors in R^2 and R^3 using Equation 4.5
(b) in general, using ideas from the proof outlined in Exercise 29
SECTION 5 EUCLIDEAN GEOMETRY
A basic fact of analytic geometry is that the points in the plane whose coordinates
(x, y) satisfy an equation of the form ax + by = c lie on a straight line and that
every straight line has such an equation. There is a similar correspondence between
planes in R^3 and equations of the form ax + by + cz = d. With the help of the dot
product we'll find a geometric interpretation for the coefficients in these equations
and extend the idea to higher dimensions.
5A Equations for Lines and Planes
Suppose that x0 is a fixed point on a line in R^2 or a plane in R^3, and p is a vector
perpendicular to the line or plane. Then a point x is on the line or plane if and only
if the vectors p and x - x0 are perpendicular. In other words, for every point x on
the line or plane, we must have
5.1 p • (x - x0) = 0.
Figure 1.26 shows the relation between these vectors in R^2 and R^3. In R^2 we can
write p = (a, b), x = (x, y), and x0 = (x0, y0). Then Equation 5.1 becomes
(a, b) • (x - x0, y - y0) = 0, or a(x - x0) + b(y - y0) = 0.
Letting ax0 + by0 = d gives the standard equation for a line in R^2, in which (a, b)
is still a vector perpendicular to the line:
(a, b) • (x, y) = (a, b) • (x0, y0) or ax + by = d.
FIGURE 1.26 Line and plane. (a) (b)
Find an equation of the line in R^2 that is perpendicular to p = (1, 2) and passes
through (3, 4). The answer is
(1, 2) • (x - 3, y - 4) = 0 or (x - 3) + 2(y - 4) = 0,
which simplifies to x + 2y = 11.
ax + by + cz = d
for an equation of a plane perpendicular to (a, b, c).
FIGURE 1.27 Plane perpendicular to (1, 1, 1).
EXAMPLE 2 In R^3 let p = (1, 1, 1) and x0 = (1, 0, 0). The plane perpendicular to p and passing
through x0 has equation
(1, 1, 1) • (x - 1, y, z) = 0
or
x + y + z = 1.
To get an idea of how the plane lies in R^3, we can pick three points on the plane, for
example (1, 0, 0), (0, 1, 0), and (0, 0, 1), and sketch the triangle formed by them as
in Figure 1.27. We can find points on a plane by picking values for two coordinates
and solving the plane's equation for the third. In this example, we found the plane's
intercepts, the points where the plane intersects the axes. Intercepts are easy to spot
since they have two coordinates equal to zero.
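Equation 5.1 and the search for intercepts translate into a short Python sketch. The names plane_equation and intercepts are hypothetical, and integer inputs are assumed so that exact fractions can be used.

```python
from fractions import Fraction

def dot(x, y):
    """Dot product of two vectors of the same dimension."""
    return sum(a * b for a, b in zip(x, y))

def plane_equation(p, x0):
    """Return (p, d) describing p . x = d, the plane through x0 perpendicular to p."""
    return tuple(p), dot(p, x0)

def intercepts(p, d):
    """Intercept of p . x = d with each coordinate axis.

    An entry is None when the matching coefficient is zero, in which case
    the plane either misses that axis or contains all of it.
    """
    points = []
    for k, pk in enumerate(p):
        if pk == 0:
            points.append(None)
        else:
            point = [Fraction(0)] * len(p)
            point[k] = Fraction(d, pk)   # solve pk * t = d on the k-th axis
            points.append(tuple(point))
    return points

p, d = plane_equation((1, 1, 1), (1, 0, 0))
print(p, d)               # coefficients (1, 1, 1) and right side 1
print(intercepts(p, d))   # the points (1,0,0), (0,1,0), (0,0,1) as fractions
```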
The angle between two planes is the angle between vectors perpendicular to the
planes, but as the next example shows, some care is needed in specifying the angle
and choosing which way the vectors point.
FIGURE 1.28 Trough. (a) (b)
|p| = |q| = √26, cos θ = 24/26 ≈ 0.923 and (from tables or a calculator) θ ≈ 22.6°.
This looks reasonable for the angle between p and q, as shown in Figure 1.28(b),
but not for the angle between the sides of the trough, labeled φ in Figure 1.28(a).
Instead, θ is the angle between one side of the trough and the extension of the plane
containing the other, shown as a dotted line in the figure, and φ = 180° - θ ≈ 157.4°.
Note that this is the angle between p and -q, which could have been chosen as
normals instead of p and q. In the abstract, any one of ±p and ±q could be chosen
as normals to the planes, and θ or φ could be taken as the angle between the planes;
the appropriate choice in a concrete problem is best made with the help of a sketch.
FIGURE 1.29 Distance to line and plane. (a) (b) δ = n • (x1 - x0)
δ = n • (x1 - x0) = n • x1 - c
(1/√6)x + (1/√6)y - (2/√6)z + 1/√6 = 0.
The distance from the origin x1 = (0, 0, 0) to the plane is then 1/√6. Note that if
we put x1 = (1, 1, 1) instead, we get the same result, so this point is on the same
side of the plane as the origin and lies at the same distance from the plane.
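The signed distance δ = n • x1 - c can be computed without normalizing the equation by hand. The sketch below assumes the plane in this example is x + y - 2z = -1, which is the normalized equation above multiplied through by √6; the function name signed_distance is hypothetical.

```python
import math

def dot(x, y):
    """Dot product of two vectors of the same dimension."""
    return sum(a * b for a, b in zip(x, y))

def signed_distance(p, d, x1):
    """Signed distance from the point x1 to the plane p . x = d.

    Dividing by |p| is the same as first normalizing the equation so that
    its coefficient vector is a unit vector n. The sign tells which side
    of the plane x1 lies on, relative to the direction of p.
    """
    return (dot(p, x1) - d) / math.sqrt(dot(p, p))

p, d = (1, 1, -2), -1   # assumed unnormalized form: x + y - 2z = -1
from_origin = signed_distance(p, d, (0, 0, 0))
from_ones = signed_distance(p, d, (1, 1, 1))
print(from_origin, from_ones)   # both approximately 0.408, that is, 1/sqrt(6)
```

Equal signs for the two results confirm that the two points lie on the same side of the plane.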
EXERCISES
In Exercises 1 and 2, find an equation for the line described in R^2, and sketch it.
1. Perpendicular to e2 in R^2 and passing through (2, 3)
2. Perpendicular to (2, -3) and passing through (1, 1)
In Exercises 3 and 4, find an equation for the plane described, and sketch it.
3. Perpendicular to (1, 2, 4) and passing through (-1, 0, 0)
4. Perpendicular to e2 - e3 in R^3 and passing through (0, 1, 0)
In Exercises 5 to 8, describe the point or set of points that the plane in R^3 with equation 2x + 3y - z = 2 has in common with the line having the given parameterization.
5. (x, y, z) = t(1, -1, 4) + (1, 0, -2)
6. (x, y, z) = t(-1, 0, 1) + (-1, 1, -1)
7. (x, y, z) = t(-1, 1, 1) + (-1, 1, -1)
11. The lines with equations x - 2y = 1 and 2x + y = 3
12. The line through (0, 0) and (1, 2) and the line through (1, 2) and (2, 3)
13. Find the cosine of the angle between the planes 2x + y + z = 1 and x - y - z = -1 in R^3. [Hint: Look at vectors perpendicular to the planes.]
14. Find the cosine of the angle between the plane 2x + y + z = 1 and a line parallel to (1, 2, 1).
In Exercises 15 to 18, sketch the plane in R^3 with the given equation.
15. x + y - z = 1
16. 2x - y + 3z = 0
17. y + 2z = 1
18. x - z = -1
19. Find an equation for the plane parallel to the plane 3x - 2y + 5z = 2 and passing through (2, 1, 1).
20. Find a unit vector n perpendicular to the plane passing through a = (1, 0, 1), b = (2, 1, 0), and c = (1, 1, 1). [Hint: n • (b - a) = 0 and n • (c - a) = 0.]
the positive y-axis in R^2 and the direction of the positive z-axis in R^3)?
23. point (2, -1); line 2x + y = 2
24. point (-1, 2); line 2x + y = 2
25. point (1, 0, -1); plane (1, 1, 1) • x = 1
26. point (1, 1, 1); plane (1, 1, 1) • x = 3
27. point (1, 0, -1); plane x + 2y + 3z = 1
28. point (-2, 1, 0); plane x - y + z = 2
29. Let ax + by + cz = d be the normalized equation of a plane in R^3, so a^2 + b^2 + c^2 = 1; what is the distance from the plane to the origin?
As a help in remembering the formula, note that the pattern of subscripts in each
component comes from the one before it by the substitutions 1 → 2 → 3 → 1.
Formal computation with 2-by-2 and 3-by-3 determinants, treated more generally in
Chapter 2, Section 5, makes the cross product easier to remember in the form
        | i   j   k  |
u × v = | u1  u2  u3 | = | u2  u3 | i - | u1  u3 | j + | u1  u2 | k.
        | v1  v2  v3 |   | v2  v3 |     | v1  v3 |     | v1  v2 |
(1, -3, 2) × (2, 4, -5) = ((-3)(-5) - (2)(4), (2)(2) - (1)(-5), (1)(4) - (-3)(2)) = (7, 9, 10).
6.2 u • (u × v) = 0, and v • (u × v) = 0.
FIGURE 1.30 Axis orientation. (a) u × v and area |u × v| (b) Right-handed (c) Left-handed
Now observe that each term matches another with the opposite sign so that the sum
is zero. A similar calculation shows that v • (u × v) = 0.
Note that interchanging u and v exchanges the two terms in each coordinate entry
of u × v and so has the effect of changing the sign of the entry. Hence the cross
product is not commutative. Instead we have
6.3 v × u = -u × v.
i × j = (1, 0, 0) × (0, 1, 0)
= ((0)(0) - (0)(1), (0)(0) - (1)(0), (1)(1) - (0)(0))
= 0i + 0j + 1k = k
and noting that k is perpendicular to both i and j. Similar calculations give j × k = i
and k × i = j. From Equation 6.3 we then have j × i = -k, k × j = -i, and
i × k = -j. Recall that we've already adopted the right-handed orientation for
labelling axes shown in Figure 1.30(b). The algebra is the same regardless of what
orientation we choose, but the picture would look different if we had chosen the
left-handed orientation.
EXAMPLE 3 Let us find an equation for the plane that has parametric representation x = su +
tv + w with u = (3, -1, 2), v = (2, 5, -2) and w = (0, 0, 4). We need a vector
p perpendicular to the plane. Since u and v are parallel to the plane, and u × v is
perpendicular to both of them, we can take p = u × v = ((-1)(-2) - (2)(5), (2)(2) -
(3)(-2), (3)(5) - (-1)(2)) = (-8, 10, 17) to obtain a vector perpendicular to the
plane. Then p • x = p • w, or -8x + 10y + 17z = 68, is the required equation.
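The passage from a parametric representation su + tv + w to an equation p • x = d can be sketched in Python as follows. The function names cross and plane_from_parametric are hypothetical; the cross product is written out from the component formula used in the example.

```python
def dot(x, y):
    """Dot product of two vectors of the same dimension."""
    return sum(a * b for a, b in zip(x, y))

def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def plane_from_parametric(u, v, w):
    """Return (p, d) with p . x = d for the plane su + tv + w."""
    p = cross(u, v)
    if p == (0, 0, 0):
        raise ValueError("u and v are not linearly independent")
    return p, dot(p, w)

u, v, w = (3, -1, 2), (2, 5, -2), (0, 0, 4)
p, d = plane_from_parametric(u, v, w)
print(p, d)                    # (-8, 10, 17) 68
print(dot(p, u), dot(p, v))    # 0 0, as Equation 6.2 predicts
```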
EXAMPLE 4 Unless they are parallel or coincide, two planes in R^3 intersect in a line. Here is
how to find a parametric representation for the line of intersection. As an example,
we take the planes with equations 2x - y + z = 10 and -3x + 2y - z = -7,
rewriting these equations as p • x = 10 and q • x = -7, where p = (2, -1, 1) and
q = (-3, 2, -1) are perpendicular to the first and second plane, respectively. The
line of intersection lies in both planes and is therefore perpendicular to both p and
q, so we calculate
v = p × q
= ((-1)(-1) - (1)(2), (1)(-3) - (2)(-1), (2)(2) - (-1)(-3))
= (-1, -1, 1)
to get a vector with the direction of the line. We still need to find a point on the
line. Usually it is simplest to find the point where a line meets one of the coordinate
planes x = 0, y = 0, or z = 0. For instance, taking x = 0 in the equations for
the planes gives -y + z = 10 and 2y - z = -7, which we solve to give y = 3
and z = 13. The point (0, 3, 13) is therefore on both planes and so on the line of
intersection. We now know the direction of the line of intersection and a point on it,
and have t(-1, -1, 1) + (0, 3, 13) as a parametric representation for the line.
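The two steps of this example, finding a direction with the cross product and then a point on a coordinate plane, can be sketched in Python. The function name intersection_line is hypothetical, integer coefficients are assumed, and the 2-by-2 system is solved by Cramer's rule, which is a choice made here for brevity rather than the method used in the text.

```python
from fractions import Fraction

def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def intersection_line(p, c1, q, c2):
    """Line of intersection of the planes p . x = c1 and q . x = c2.

    Returns (direction, point), representing the line t*direction + point.
    """
    v = cross(p, q)
    if v == (0, 0, 0):
        raise ValueError("the planes are parallel or coincide")
    # Set to zero a coordinate k for which v[k] != 0; the remaining
    # 2-by-2 system then has determinant +v[k] or -v[k], so it is solvable.
    k = next(m for m in range(3) if v[m] != 0)
    i, j = [m for m in range(3) if m != k]
    det = p[i] * q[j] - p[j] * q[i]
    point = [Fraction(0)] * 3
    point[i] = Fraction(c1 * q[j] - p[j] * c2, det)   # Cramer's rule
    point[j] = Fraction(p[i] * c2 - c1 * q[i], det)
    return v, tuple(point)

direction, point = intersection_line((2, -1, 1), 10, (-3, 2, -1), -7)
print(direction)   # (-1, -1, 1)
print(point)       # the point (0, 3, 13), stored as fractions
```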
To find the point of intersection of the line t(-1, 2, 4) + (0, -2, 3) with the plane
through (4, 2, 3), (-2, 0, 1), and (1, 3, -1), we first find an equation for the plane.
The vectors p = (4, 2, 3) - (-2, 0, 1) = (6, 2, 2) and q = (4, 2, 3) - (1, 3, -1) =
(3, -1, 4) are parallel to the plane, so v = p × q = (10, -18, -12) is perpendicular
to the plane. An equation for the plane is v • x = v • (4, 2, 3) = (10, -18, -12) •
(4, 2, 3) = -32. For x on the line we have
v • x = t(10, -18, -12) • (-1, 2, 4) + (10, -18, -12) • (0, -2, 3) = -94t,
so the line meets the plane when -94t = -32, that is, when t = 16/47. The point of
intersection is (16/47)(-1, 2, 4) + (0, -2, 3) = (-16/47, -62/47, 205/47).
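Suppose we want to find an equation for the plane parallel to the two vectors x1 =
(1, 2, -3) and x2 = (2, 0, 1), and containing the point x0 = (1, 1, 1). A vector p
perpendicular to x1 and x2 is x1 × x2:
p = (1, 2, -3) × (2, 0, 1) = (2, -7, -4),
so an equation for the plane is p • x = p • x0, or 2x - 7y - 4z = -9.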
Anticommutativity: v × u = -u × v
Additivity: u × (v + w) = u × v + u × w
(u + v) × w = u × w + v × w
Proof. Anticommutativity has already been justified as Equation 6.3. Proofs of the
other properties also follow directly from definition 6.1 and are left as exercises.
Note also that (u × v) × w is usually not equal to u × (v × w), so the cross product
is not associative. See Exercises 25 to 28. ∎
We have already seen that u × v is perpendicular to both u and v. Two more
properties are needed to fully characterize the cross product geometrically. The first
of these gives the geometric meaning of the length of the cross product, expressed
in the formula
6.5 |u × v| = A(P),
where A(P) is the area of the parallelogram P that has arrows u and v as adjacent
edges as shown in Figure 1.30(a). In other words, the length of the cross product
of two vectors is equal to the area of the parallelogram having the vectors as adjacent
edges. This property makes the cross product useful in computing areas and
volumes in R^3, and plays a role in defining the area A(S) of a smooth surface in
Chapter 9, Section 3B. A proof of Equation 6.5 in straightforward steps is outlined
in Exercise 15. The area of the triangle with u and v as adjacent edges is half that
of the parallelogram and is therefore equal to ½|u × v|.
EXAMPLE 8 We saw in the previous example that the cross product of the vectors x1 = (1, 2, -3)
and x2 = (2, 0, 1) is the vector (2, -7, -4). Hence the area of the parallelogram P
with edges x1 and x2 is |x1 x x2| = sqrt(2^2 + (-7)^2 + (-4)^2) = sqrt(69), or about 8.31.
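The computation in this example can be checked numerically. The following is a minimal Python sketch, not part of the text; the helper names cross, dot, and norm are illustrative assumptions. It forms x1 x x2 from the coordinate formula and takes its length to get the area of P.

```python
import math

def cross(u, v):
    """Cross product of two vectors in R^3, from the coordinate formula."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Euclidean length of a vector."""
    return math.sqrt(dot(u, u))

x1 = (1, 2, -3)
x2 = (2, 0, 1)
p = cross(x1, x2)   # vector perpendicular to both x1 and x2
area = norm(p)      # area of the parallelogram with edges x1 and x2
print(p, area)
```

Halving the value of area gives the area of the triangle with the same two edges.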
Perpendicularity to u and v and having length given by Equation 6.5 isn't enough
to characterize u x v completely, because -u x v has the same properties. The choice
between the two possible directions of u x v relative to the pair (u, v) is settled by
the following rule:
6.6 Right-Hand Rule for the Cross Product. The vector u x v points in the
direction of the thumb when the fingers of the right hand curl from u to v.
This rule is illustrated in Figure 1.31(b). If you look at the plane containing u and
v from the side toward which u x v points, then it takes a counterclockwise
rotation of less than 180° to rotate u to point in the same direction as v.
Section 6 The Cross Product 41
FIGURE 1.31 Orientation of u, v, w. (a) (b)
We have already computed that for the unit coordinate vectors, i x j = k. If you hold
your right hand with the thumb pointing up (in the direction of k), then its fingers
naturally curl in the counterclockwise direction taking i to j as in Figure 1.30(b).
Equivalently, if you look down at the xy-plane from above, it takes a positive
(counterclockwise) rotation to carry i to j.
The choice of the right-hand rule for the cross product instead of a left-hand rule
is an arbitrary convention, but we have already made that choice in drawing the
coordinate axes as we did in Figure 1.4 in Section 1. If we had used the left-hand
rule for orienting the coordinate axes, then the cross product would obey the left-hand
rule shown in Figure 1.30(c).
The cross product is also linked to volumes. Three vectors u, v, w in R^3 that don't
lie in a plane determine a solid region B with u, v, w for adjacent edges; B is called
a parallelepiped because it's bounded by three pairs of congruent parallelograms,
illustrated in Figure 1.31(a). Each edge of B is parallel to one of the vectors u, v, w,
and B looks like a lop-sided box. The scalar triple product of u, v, and w, in that
order, is defined to be u • (v x w). We'll show that this number equals either plus
or minus the volume V(B) of B, where B is the parallelepiped determined by u, v,
and w. The precise statement is:
6.7 Theorem. Let B be the parallelepiped with the vectors u, v, w as adjacent
edges. Then u • (v x w) = V(B) if the three vectors obey the right-hand rule and
otherwise u • (v x w) = -V(B).
EXAMPLE 9 Let u = (1, 1, 1), v = (1, 2, -3), w = (2, 1, 1). The triple product of these three
vectors is (1, 1, 1) • ((1, 2, -3) x (2, 1, 1)) = (1, 1, 1) • (5, -7, -3) = -5, so the
volume of the parallelepiped B determined by the vectors is V(B) = 5. Since the
sign of the triple product is negative, u, v, and w form a left-handed system.
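As a hedged illustration of Theorem 6.7, the sketch below computes the scalar triple product for the three vectors of this example and reads off both the volume and the orientation. The function names triple_product and describe are hypothetical, chosen only for this sketch.

```python
def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def triple_product(u, v, w):
    """Scalar triple product u . (v x w), in that order."""
    return dot(u, cross(v, w))

def describe(u, v, w):
    """Return (volume, orientation) for the parallelepiped with edges u, v, w."""
    t = triple_product(u, v, w)
    if t > 0:
        orientation = "right-handed"
    elif t < 0:
        orientation = "left-handed"
    else:
        orientation = "coplanar"   # the three vectors lie in a plane
    return abs(t), orientation

u, v, w = (1, 1, 1), (1, 2, -3), (2, 1, 1)
volume, orientation = describe(u, v, w)
print(triple_product(u, v, w), volume, orientation)
```

The sign test is the only place where orientation enters; the volume itself is the absolute value of the triple product.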
42 Chapter 1 Vectors
EXAMPLE 10 In the discussion before Example 7 of Section 4 we defined the flux of a flow with
constant velocity vector v. There we found the flux across a parallelogram P with
unit normal vector n to be Φ = A(P) n • v, where A(P) is the area of P. We now
see that this flux is a scalar triple product: Φ = v • (p x q), where the vectors
p and q represent adjacent edges of P. The reason is that A(P) = |p x q| and
n = (p x q)/|p x q|, so Φ is the scalar triple product of v, p, and q:
Φ = A(P) n • v = |p x q| ((p x q)/|p x q|) • v = v • (p x q).
We use this formula to generalize flux to nonconstant flows across curved surfaces
in Chapter 9, Section 3C.
If you choose three vectors u, v, w in R^3 that form a right-handed system and test
v, w, u and w, u, v by the right-hand rule, you'll find that they are also right-handed
systems. Consequently,
6.8 u • (v x w) = v • (w x u) = w • (u x v).
EXERCISES
In Exercises 5 and 6, find the area of the parallelogram
with u and v as adjacent edges.
5. u = i, v = j + k
6. u = (-1, 0, 0), v = (0, -1, 0)
In Exercises 7 and 8, find the area of the triangle with u
and v as adjacent edges.
7. u = (3, -1, 2), v = (-1, 0, 1)
8. u = (1, 1, -1), v = (1, 2, 1)
In Exercises 9 and 10, use the cross product to find
an equation of the form ax + by + cz = d for the
plane parallel to both u and v that contains the point
(-1, -1, 1).
13. The triangle with vertices (-2, 1, 0), (2, 3, 0), (2, -1, 0)
forms the base of an irregular pyramid with apex at
(0, 0, 2).
(a) Make a sketch of the pyramid.
(b) Find a vector perpendicular to each of the three
sides.
(c) Find the cosine of the angle between each of the
three sides and the base.
14. Verify using coordinates the second of Equations 6.2 of
the text: v • (u x v) = 0.
*15. (a) Verify by direct coordinate computation that
(b) Use the result of part (a) to show that
|u x v| = |u||v| sin θ,
where θ is the angle between u and v that satisfies
0 ≤ θ ≤ π.
(c) Show that |u||v| sin θ is the area A(P) of the parallelogram
with edges u and v, and hence by part (b)
that the length of u x v is equal to A(P), as stated
in Equation 6.5 of the text.
16. Compute the volume of the parallelepiped with adjacent
edges (2, 1, 3), (-1, -2, 4), (3, 3, 2).
17. Verify Equation 6.8 algebraically by writing out the product
in terms of coordinates for three vectors u, v, w.
18. Explain geometrically why Equation 6.8 is consistent with
Theorem 6.7.
In Exercises 19 and 20, find the area of the parallelogram
in R^3 with the given edges and make a sketch of it.
19. (1, 1, 0), (0, 1, 2)
20. (0, 1, 2), (-3, 5, -1)
21. Find the volume of the parallelepiped B in R^3 with edges
(1, 1, 0), (0, 1, 2), and (-3, 5, -1). Make a sketch of B.
22. If u = (2, 1, 3), v = (0, 2, 1), and w = (1, 1, 1), compute
(a) u x v (b) u • (v x w)
(c) (u x v) • w (d) (u x v) x w
(e) u x (v x w) (f) (u • v)w
23. This exercise shows how the formula for the cross product
comes up naturally if you try directly to find a vector
perpendicular to two given vectors. Let u = (u1, u2, u3)
and v = (v1, v2, v3) be nonparallel vectors in R^3. To find
a vector x = (x, y, z) perpendicular to u and v, we want
u • x = v • x = 0, that is,
u1x + u2y + u3z = 0
v1x + v2y + v3z = 0.
(a) Multiply the first equation by v1 and the second by
u1 and subtract to get
(b) Do a calculation similar to the one in part (a) to get
(c) Use (a) and (b) to express x and y in terms of z,
and choose a value for z that avoids fractions in the
result. Compare the triple (x, y, z) that you obtain
with u x v.
Sir William Rowan Hamilton introduced the terms scalar
and vector in 1846 in connection with his invention of
quaternions, which he defined to be expressions of the
form q = a + bi + cj + dk, where a, b, c, and d are real
numbers. He called a the scalar part of q, representing
a single magnitude on the scale of real numbers, and
called bi + cj + dk the vector part, representing a line
segment with both magnitude and direction in R^3. The
word vector came from astronomy; in the eighteenth
century the line from the sun to a planet was called the
planet's radius vector. Hamilton defined the algebraic
operations on quaternions so they became an extension
of those for complex numbers, adding them as linear
combinations of the basis {1, i, j, k} and defining the
product of two quaternions by multiplying out assuming
the distributive law and then simplifying using the rules
i^2 = j^2 = k^2 = -1,
ij = k = -ji, jk = i = -kj, ki = j = -ik
for multiplying i, j, and k. Note that quaternion multiplication
is not commutative.
24. Show that if q1 = s1 + v1 and q2 = s2 + v2 are two
quaternions with scalar parts s1, s2 and vector parts v1, v2,
then their product is the quaternion
q1q2 = (s1s2 - v1 • v2) + (s1v2 + s2v1 + v1 x v2).
Thus the product of two quaternions v1, v2 with zero
scalar parts yields scalar part -v1 • v2 and vector part
v1 x v2. Quaternions fell from favor among physical
scientists after Josiah Willard Gibbs later introduced the
more convenient dot and cross products.
The remaining exercises for this section deal with associativity
of multiplication, which fails in general for the
cross product but holds for quaternion products.
25. The associative law for the cross product holds only for
some choices of vectors; verify this by comparing the
following products.
(a) (i x i) x j and i x (i x j)
(b) (i x j) x i and i x (j x i)
(c) (i x j) x k and i x (j x k)
(d) (i x i) x j and j x (i x i)
*26. Let u, v, and w be arbitrary vectors in R^3.
(a) Using geometric properties of the cross product,
show that there are scalars a and b such that u x
(v x w) = av + bw.
(b) By taking the dot product of both sides of the
equation in part (a) with u, show that a(u • v) =
-b(u • w).
(c) Verify using coordinates that
u x (v x w) = (u • w)v - (u • v)w.
*27. Show that the cross product is associative for three vectors
u, v, w in that order, that is, u x (v x w) = (u x v) x w, if
and only if v and u x w are linearly dependent. [Hint: Use
part (c) of Exercise 26 and other properties of the cross
product to find a relation between u x (v x w) - (u x v) x w
and v x (u x w).]
*28. Use the results of Exercise 24, part (c) of Exercise 26,
and Equation 6.8 to show that quaternion multiplication is
associative, so q1(q2q3) = (q1q2)q3 for three quaternions
q1, q2, q3.
Chapter 1 REVIEW
Exercises 1 to 4 refer to vectors a = i + j + k, b = i + 2j + 2k,
and c = 2i + 3j + 6k.
1. Which is longer, 4a or c?
2. In the triangle with vertices a, b, and c, which side is
longest?
3. Express b - a and c - b - a in terms of i, j, and k.
4. Express k, j, and i in terms of a, b, and c.
In Exercises 5 to 8, express the first vector given as a
linear combination of the other vectors, or show that it's
impossible to do so.
5. (2, -1, 3, 2); e1, e2, e3, e4
6. (1, 2); (2, 3), (4, 6), (-6, -9)
7. (4, 1, -2); (1, 2, 3), (6, 5, 4)
8. (3, 1, -2); (1, 2, 3), (6, 5, 4), (5, 3, 1)
FIGURE 1.32
9. Copy Figure 1.32 and then draw arrows representing (a)
x + 2y, (b) x - y, and (c) 2x - 3y. Label the arrows with
(a), (b), and (c) to show which is which.
10. Let points a and b be position vectors of diagonally
opposite vertices of a parallelogram and let c be the
position vector of a third vertex. Express the position
vector of the fourth vertex in terms of a, b, and c.
11. Find representations su + a and tv + b for the line through
(1, 2) and (2, 1), and the line through (4, 5) and (-1, -2),
and find where the lines intersect.
12. Show that for a pair of points a and b, the points
p = (1/3)a + (2/3)b and q = (2/3)a + (1/3)b are on the line segment
joining a and b, and divide it into three equal parts.
13. Find a vector function of t that describes a particle moving
along the line through (1, 2, 3) and (-5, 3, 4) at unit
speed.
14. Find a vector function of t that describes a particle moving
in a straight line at constant speed that passes through
(2, -3) when t = 1 and through (5, 4) when t = 3. What
is the speed of the motion?
15. Suppose two boats start out at positions u1 and u2 when
t = 0 and maintain constant velocities v1 and v2. What
functions p1(t) and p2(t) give their positions at time t?
Let d(t) be the vector displacement from the first boat to
the second as a function of t.
(a) Show that if the boats are on a collision course, then
the direction of d(t) doesn't change with time.
(b) Suppose the direction of d(t) doesn't change with
time. When will collision occur if it does? Under
what circumstances will it not occur?
16. Consider airplanes moving in three dimensions instead of
boats moving on a two-dimensional water surface as in
Exercise 15. Does this make a difference in the answers
to (a) or (b)?
17. Here are descriptions of four lines K, L, M, and N in the
plane:
K: the line through the points (3, 4) and (-2, 3)
L: the line with parametric representation t(i + 2j) + i - 3j
M: the line through the point (8, 5) parallel to the vector
5i + j
N: the line through the origin and the point (2, 4)
Which of the lines are the same? Which of the lines are
parallel?
18. Here are descriptions of four lines K, L, M, and N in R^3:
K: the line through the points (1, 2, 3) and (4, -5, 6).
L: the line with parametric representation
t(3i - 7j + 3k) - 2i + 9j.
M: the line through the point (0, 1, 2) parallel to the vector
3i + k.
N: the line through the origin and the point (1, 2, 3).
Which of the lines are the same? Which of the lines are
parallel?
In Exercises 19 to 22, find a parametric representation
for the given plane.
19. the plane containing (3, 0, 0), (0, 2, 0), and (0, 0, 5).
20. the plane parallel to the one in Exercise 19 and containing
(0, -1, 3).
21. the plane parallel to the x- and y-axes and containing
(1, 2, 3).
22. the plane containing the origin, (1, 2, 3), and (-2, -3, 1).
For each set of three points given in Exercises 23 to
26, determine whether the points lie on a line. If they
do, find a parametric representation of the line. If they
do not, find a parametric representation of the plane in
which they lie.
23. (3, 1, 2), (0, 0, 0), (-6, -2, -4)
24. (1, 1, 0), (0, 1, 1), (1, 0, 1)
25. (1, 2, 3), (3, 2, 1), (4, 2, 2)
26. (1, 2, -3), (10, 5, -9), (-5, 0, 1)
27. Let a = i + j + k, b = i + 2j + 2k, c = 2i + 3j + 6k, as
in Exercises 1 to 4.
(a) Which is larger, the angle between a and b or the
angle between a and c?
(b) Find a nonzero vector perpendicular to both a and b.
(c) What is the area of the triangle whose vertices are
the origin and the points a and b?
(d) What is the area of the triangle whose vertices are
the points a, b, and c?
28. Express i + 3j - 2k as the sum of
(a) a vector parallel to 2i - 6j - 3k and a vector
perpendicular to it
(b) a vector in the plane x - y + 4z = 0 and a vector
perpendicular to it
29. A force F = i + 3j - 2k drags an object from (1, 2, 3) to
(4, 5, 0). Find the work done.
30. If the wind velocity is 10i + 20j (in feet per second),
(a) What is the speed of the wind?
(b) How much air blows through a triangular opening
with vertices at (-2, 2, 0), (3, -4, 0), and (0, 0, 5)
in one second? (Coordinates are in feet.)
31. Let L be the line (2, 3) • x = 6.
(a) Find the distance of the origin from L.
(b) Find a parametric representation for L.
(c) Find all vectors of length 4 that are perpendicular
to L.
32. Let P be the plane (1, 1, 1) • x = 3.
(a) Find the distance of the origin from P.
(b) Find a parametric representation for P.
(c) Find all vectors of length 4 that are perpendicular
to P.
33. Let L be the line t(2, 1) + (-2, 0).
(a) Find a parametric representation for the line K
that is perpendicular to L and passes through the
origin.
(b) Find a unit vector n and number c such that L has
the equation n • x = c.
(c) What is the distance from L to (3, 5)? Is (3, 5) on
the same side of L as the origin?
34. Let P be the plane s(3, 2, 1) + t(2, 1, 1) + (-2, 0, 1).
(a) Find a parametric representation for the line L
that is perpendicular to P and passes through the
origin.
(b) Find a unit vector n and number c such that P has
the equation n • x = c.
(c) What is the distance from P to (3, 5, -2)? Is
(3, 5, -2) on the same side of P as the origin?
35. Find a parametric representation for the line of intersection
of the planes with equations x + y + z = 3 and
2x - y - z = 5, and find the point where the line intersects
the plane with equation x - y = 2.
CHAPTER 2
Many specific problems about vectors and their applications require solving systems
of first-degree equations. This chapter is mainly about how to solve such systems, but
also features applications and an optional look in Section 2D at geometric interpretation
of the solutions using the vector geometry of Chapter 1. In practice, the matrix
methods introduced in Section 2 of this chapter often provide the most efficient way
to do the necessary computations.
Section 1A Systems of Linear Equations 47
x + 4y + 3z = 1
2x + 5y + 4z = 4
-x + 3y + 2z = -5.
Add (-2) times the first equation to the second, and replace the second equation by
the result; this eliminates x from the second equation by making its coefficient equal
to 0:
x + 4y + 3z = 1
-3y - 2z = 2
-x + 3y + 2z = -5.
Add the first equation to the third, and replace the third equation by the result:
x + 4y + 3z = 1
-3y - 2z = 2
7y + 5z = -4.
Multiply the second equation by -1/3 so that the coefficient of y becomes 1. Then add
-4 times the second equation to the first, and -7 times the second equation to
the third to get
x + (1/3)z = 11/3
y + (2/3)z = -2/3
(1/3)z = 2/3.
Multiply the third equation by 3 to get
x + (1/3)z = 11/3
y + (2/3)z = -2/3
z = 2.
Add (-1/3) times the third equation to the first and (-2/3) times the third equation to
the second to get
x = 3
y = -2
z = 2.
Hence the system has one solution, the point (x, y, z) = (3, -2, 2) in R^3.
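The elimination steps above can be mechanized. The following Python sketch is one possible implementation, not the text's own procedure: it works on the augmented matrix of the system x + 4y + 3z = 1, 2x + 5y + 4z = 4, -x + 3y + 2z = -5, uses exact fractions to avoid rounding, and also allows row interchanges, which this example does not need. The name gauss_jordan is an assumption made for the illustration.

```python
from fractions import Fraction

def gauss_jordan(augmented):
    """Reduce an augmented matrix [A | b] to reduced row echelon form exactly."""
    m = [[Fraction(x) for x in row] for row in augmented]
    rows, cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(cols - 1):          # last column holds the right-hand sides
        if pivot_row == rows:
            break
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        candidates = [r for r in range(pivot_row, rows) if m[r][col] != 0]
        if not candidates:
            continue
        r = candidates[0]
        m[pivot_row], m[r] = m[r], m[pivot_row]
        # Multiply the pivot row by a nonzero number so the pivot becomes 1.
        pivot = m[pivot_row][col]
        m[pivot_row] = [x / pivot for x in m[pivot_row]]
        # Add multiples of the pivot row to every other row to clear the column.
        for other in range(rows):
            if other != pivot_row and m[other][col] != 0:
                factor = m[other][col]
                m[other] = [a - factor * b for a, b in zip(m[other], m[pivot_row])]
        pivot_row += 1
    return m

system = [[1, 4, 3, 1],
          [2, 5, 4, 4],
          [-1, 3, 2, -5]]
reduced = gauss_jordan(system)
solution = [row[-1] for row in reduced]
print(solution)
```

For this system every column of unknowns gets a pivot, so the last column of the reduced matrix can be read directly as (x, y, z).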
48 Chapter 2 Equations and Matrices
We can verify by substitution into the initial system of equations above that we've
found a solution, but this verification doesn't rule out the possibility that the original
equations might still have other solutions. We'll dispose of this idea once and for all
by singling out the two operations we use to solve linear systems and then proving
that the solution set of a system remains unchanged after applying them to the
system.
In the previous example we used only elementary operations, and the following
theorem guarantees that the one solution we found is the only solution.
EXAMPLE 2 Consider just the first two equations in the previous example:
3x + 12y + 9z = 3
2x + 5y + 4z = 4.
These equations represent two planes, so they may intersect in a line. When we solved
these two equations along with the third equation in Example 1, it so happened that
we avoided until the end adding a multiple of the third equation to either of the
others, so we can go through the first steps of that example simply ignoring the third
equation. Three steps from the end we arrive at
x + (1/3)z = 11/3
y + (2/3)z = -2/3.
There are infinitely many triples (x, y, z) that satisfy the equations, because for an
arbitrary value of z the last two equations determine corresponding x and y values
z=t
to denote an arbitrary value of z. Then we can write the equations in final form as
x = -(1/3)t + 11/3
y = -(2/3)t - 2/3 (a)
z = t.
(x, y, z) = t(-1/3, -2/3, 1) + (11/3, -2/3, 0). (b)
This is a parametric representation for a line, just what we thought we might get
from the intersection of two planes.
X + y - 2z = 1
-3x + 2y + z = 0
-x + 4y - 3z = 1.
To try to solve this system, we add 3 times the first equation to the second and
1 times the first to the third. The result is
x + y - 2z = 1
5y - 5z = 3
5y - 5z = 2.
There are no values of y and z that can satisfy the last two equations simultane-
ously, so we conclude that the system has no solutions, that is, that the system is
inconsistent. Geometrically, we see that the given system turns out to be equivalent
to one in which the last two equations represent distinct parallel planes, so there is
no point of intersection.
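The inconsistency can also be seen by carrying the elimination one step further. In the sketch below, each equation of the system x + y - 2z = 1, -3x + 2y + z = 0, -x + 4y - 3z = 1 is stored as a list of coefficients followed by the right-hand side; the helper name add_multiple is a hypothetical choice for this illustration. Subtracting the two reduced equations leaves 0 = -1, which no values of x, y, z can satisfy.

```python
def add_multiple(target, source, c):
    """Return target + c * source, treating equations as coefficient lists."""
    return [t + c * s for t, s in zip(target, source)]

# Each row is [coefficient of x, of y, of z, right-hand side].
e1 = [1, 1, -2, 1]
e2 = [-3, 2, 1, 0]
e3 = [-1, 4, -3, 1]

e2_new = add_multiple(e2, e1, 3)   # eliminates x from the second equation
e3_new = add_multiple(e3, e1, 1)   # eliminates x from the third equation

# Subtract the new second equation from the new third one.
difference = add_multiple(e3_new, e2_new, -1)

# A row of zero coefficients with a nonzero right-hand side means no solutions.
inconsistent = all(c == 0 for c in difference[:-1]) and difference[-1] != 0
print(e2_new, e3_new, difference, inconsistent)
```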
Example 2 illustrates a general principle that we'll prove in Section 2B: When a
system of linear equations has more than one solution it always has infinitely many
solutions, solutions that we can find by assigning arbitrary values to some of the
unknowns, and then determining values for the other unknowns in terms of these
arbitrary values. As in Example 2, this always happens for a consistent system with
more unknowns than equations. The simplest case is that of one equation in several
unknowns, for example,
2x - 3y + z = 1.
z = -2s + 3t + 1,
so all solutions are of the form
x + y - z = 0
x - y + z = 0,
Setting z = t and moving the terms in t to the right side gives
x + y = t
x - y = -t.
Subtracting the first equation from the second gives
x + y = t
-2y = -2t.
Hence y = t from the second equation, so x = 0 from the first. The solutions are
(x, y, z) = (0, t, t)
= t(0, 1, 1),
which exhibits the infinitely many solutions as a parametric representation of the
points on a line in R^3.
Many questions about lines and planes come down to solving systems of linear
equations. For example, two lines in R^3 with parametric representations su1 + v1
and tu2 + v2 intersect if and only if there are values of s and t such that the
vector equation su1 + v1 = tu2 + v2 holds, and this amounts to a system of linear
equations for s and t. If we take u1 = (3, 2, 1), v1 = (-1, 0, 1), u2 = (0, 2, 1), and
v2 = (-4, 2, 2), then the vector equation is equivalent to
3s - 1 = -4
2s = 2t + 2
s + 1 = t + 2.
The first equation gives s = -1, and then the second gives t = -2. These values
also satisfy the third equation, so the lines do intersect. The point of intersection is
obtained by putting s = -1 in su1 + v1 (or t = -2 in tu2 + v2) and is -(3, 2, 1) +
(-1, 0, 1) = (-4, -2, 0). If we change v2 to (-4, 2, 0), the third equation is changed
to s + 1 = t and is not satisfied by the values of s and t that satisfy the first two
equations, so the lines don't intersect.
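The same test can be written as a short routine. This is a sketch under stated assumptions rather than a general-purpose method: it assumes the two direction vectors are not parallel, solves two of the coordinate equations for s and t by Cramer's rule, and then checks the remaining coordinates. The function name intersect_lines is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def intersect_lines(u1, v1, u2, v2):
    """Find s, t with s*u1 + v1 == t*u2 + v2.

    Returns (s, t, point), or None when no such pair is found.
    Parallel direction vectors are not handled and also give None.
    """
    n = len(u1)
    rhs = [Fraction(b - a) for a, b in zip(v1, v2)]   # s*u1 - t*u2 = v2 - v1
    for i, j in combinations(range(n), 2):
        # 2-by-2 system in s and t taken from coordinates i and j.
        det = Fraction(u1[i]) * (-u2[j]) - Fraction(-u2[i]) * u1[j]
        if det == 0:
            continue
        s = (rhs[i] * (-u2[j]) - (-u2[i]) * rhs[j]) / det
        t = (u1[i] * rhs[j] - u1[j] * rhs[i]) / det
        # The values must also satisfy every other coordinate equation.
        if all(s * u1[k] - t * u2[k] == rhs[k] for k in range(n)):
            return s, t, tuple(s * a + b for a, b in zip(u1, v1))
        return None
    return None

hit = intersect_lines((3, 2, 1), (-1, 0, 1), (0, 2, 1), (-4, 2, 2))
miss = intersect_lines((3, 2, 1), (-1, 0, 1), (0, 2, 1), (-4, 2, 0))
print(hit, miss)
```

The first call uses the vectors of the text and the second uses the modified v2 = (-4, 2, 0), for which the third coordinate equation fails.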
Sometimes we know the degree of a polynomial f(x) but don't know the coefficients.
If we know values of f(x) for enough values of x, it may be possible to find the
coefficients by solving a system of equations. For example, when a particle moves
in the plane under the influence of a constant force that is parallel to the y-axis, its
path is a parabola with an equation of the form y = f(x) = ax^2 + bx + c. If the
points (x, y) = (0, 1), (2, 4), and (3, 3) are known to be on the path then we can
find a, b, and c by solving the equations
0a + 0b + c = 1
4a + 2b + c = 4
9a + 3b + c = 3.
Exercise 11 asks you to finish the calculation to find the coefficients of f(x).
EXERCISES
Some of the systems of equations in Exercises 1 to 6
have one solution, some have more, and some have no
solutions. If there are solutions, find all of them and
interpret them as an intersection of lines or planes. In
the cases where there is no solution, give a geometric
explanation.
1. x + y = 1
x - y = 2
2. 2x - y = 2
-2x + y = 2
3. x + y + z = 0
x - y = 0
y + z = 0
4. x + y + z = 0
x - y = 0
2x + z = 0
5. x + y + z = 0
x + y - z = 1
x + y + 2z = 2
6. x - 2y = 1
2x + y = -1
x - 7y = 4
In Exercises 7 to 10, find a point of intersection of the
two given lines, or else show that they do not intersect.
7. x = t(1, -1, 2) + (1, 1, 1), x = s(3, 2, 1) + (-2, -6, 5)
8. x = t(1, 1, 2) + (0, 1, 1), x = s(-2, 1, 1) + (2, 1, 2)
9. x = t(1, 2) + (2, 1), x = s(1, 3) + (3, -1)
10. x = t(-1, 1, 1) + (1, 0, 2), x = s(2, 0, 2) + (1, 2, 2)
11. (a) Finish the calculation in Example 6 in the text, by
finding the values of a, b, c, and f(x).
(b) Is there a value y such that if the path of the object in
Example 6 passed through (0, 1), (2, 4), and (3, y),
then the value of a would be 0? Is the path a parabola
in this case?
12. Suppose f(x) = c1e^x + c2e^(-x) + c3, where c1, c2, and c3
are constants. How should these constants be chosen so
that f(0) = 1, f'(0) = 1, and f''(0) = 2?
S1, S2, and S3 having densities 2, 3, and 1 respectively,
measured in grams per cubic centimeter. Suppose also that
the price of each substance in cents per cubic centimeter is
4, 3, and 1, respectively. Is it possible to make a mixture
weighing 10 grams, with a volume of 20 cubic centimeters
and costing 1 dollar?
Recall from Chapter 1, Section 1 that a vector b is
a linear combination of a1, ..., an if there are scalars
x1, ..., xn such that b = x1a1 + ... + xnan. In Exercises
15 to 18, find coefficient x's to express b as a linear
EXAMPLE 7 An assembly of electrical conductors is called a network if each pair J_i, J_j of
junctions in the assembly is contained in a closed loop, or circuit, where each segment
represents a connecting wire with a given electrical resistance. When such a network
is connected to a power source, currents flow in the segments, and a voltage is
measurable at each junction. Ohm's law states that the current flow in a wire is
proportional to the difference in voltage at its two ends, the constant of proportionality
being the reciprocal of the wire's resistance:
c_ij = (v_i - v_j)/r_ij, (1)
where c_ij is the current flowing from junction J_i to junction J_j, r_ij is the resistance
of the connection between junctions J_i and J_j, and v_i and v_j are the values of the
voltages at junctions J_i and J_j. The standard units of measurement are amperes for
current, ohms for resistance, and volts for voltage. A negative value for the current
from J_i to J_j indicates a current flowing from J_j to J_i.
Figure 2.2(a) shows a network with four junctions and five segments, with the
resistance of each segment indicated beside it. Suppose external power source
terminals connected at junctions J1 and J4 maintain values v1 = 12 and v4 = 0. Since
junction J2 has no external connection, the current flowing in must balance the
current flowing out, so that if signs are taken into account, the sum of the currents out
of junction J2 must be zero. Using Equation (1), we get the equation
we see that v2 is a weighted average of v1, v3, and v4, with coefficients that are the
reciprocals of the resistances in the lines joining J2 to the others. A similar equation
holds at each junction that doesn't have an external connection. Thus at J3
(5/3)v2 - v3 = 6
-v2 + (3/2)v3 = 2.
Solving this system gives v2 = 22/3 ≈ 7.33 and v3 = 56/9 ≈ 6.22. Once we know the
voltages, we can find the currents from (1). Thus the current from J1 to J2 is
The total current from junction J1 into the rest of the network is then about 2.3 +
0.96 = 3.3 amperes, which is the current flowing into junction J1 from the outside
source.
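The pair of junction equations can be solved exactly with a few lines of code. The sketch below reads the system as (5/3)v2 - v3 = 6 and -v2 + (3/2)v3 = 2; that reading of the coefficients is an assumption, chosen because it reproduces the stated values v2 ≈ 7.33 and v3 ≈ 6.22. The helper solve_2x2 is a hypothetical name and uses Cramer's rule rather than elimination.

```python
from fractions import Fraction as F

def solve_2x2(a, b, c, d, e, f):
    """Solve a*x + b*y = e and c*x + d*y = f by Cramer's rule."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("the system does not have a unique solution")
    return (e * d - b * f) / det, (a * f - e * c) / det

# Assumed reading of the system for the unknown voltages v2 and v3.
v2, v3 = solve_2x2(F(5, 3), F(-1), F(-1), F(3, 2), F(6), F(2))
print(v2, float(v2), v3, float(v3))
```

With the voltages known, each current follows from Equation (1) once the resistance of the corresponding segment is read from the figure.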
We may regard vectors in R^2 or R^3 as representing forces acting at some point which
for convenience we take to be the origin. The direction of the arrow is the direction
in which the force acts, and the length of the arrow is the magnitude of the force.
Our fundamental physical assumption here is that if more than one force acts at a
point then the resulting force acting at the point is represented by the sum R of the
separate force vectors acting there. In Figure 2.3 we have two different pictures, and
the resultant arrow R appears only in Figure 2.3(a). For example, suppose that the
force vectors in Figure 2.3(a) lie in a plane, which we take to be R^2 with the origin
at the point of action. If we have
then by definition
FIGURE 2.3 Force vectors. (a)
Suppose we are given only the directions of the three force vectors and are asked
to find corresponding forces that will produce a given resultant, say, R = (-1, -1).
In other words, suppose we want to find nonnegative numbers c1, c2, c3 such that
(3)
(Having some c_i < 0 would reverse the direction of the corresponding force.) The
vector equation is equivalent to the system of equations we get by substituting for
the F_i the given vectors (1) that determine the force directions. We want
c1(-1, 3) + c2(4, 3) + c3(-2, -4) = (-1, -1), or
-c1 + 4c2 - 2c3 = -1
3c1 + 3c2 - 4c3 = -1.
Since we have two equations and three unknowns, we would expect in general to be
able to specify one of the c_i and then solve for the others. However, recall that the
c_i are to be nonnegative. In particular, a glance at Figure 2.3(a) shows that we could
not get a resultant equal to (-1, -1) unless c3 is positive. Hence we try c3 = 1.
This choice leads to the pair of equations
-c1 + 4c2 = 1
c1 + c2 = 1.
These equations have the unique solution c1 = 3/5, c2 = 2/5. Thus the triple
(c1, c2, c3) = (3/5, 2/5, 1) is one possible solution, and the three force vectors are
Pl=½+ ½pz.
Similarly, because going to a4 does not occur in the events we are watching,
p2 = (1/2)p1 + (1/2)p3
p3 = (1/2)p2.
We rewrite the previous three equations as
I I
P1-3P2=3
-(1/2)p1 + p2 - (1/2)p3 = 0
-(1/2)p2 + p3 = 0
and solve them by routine methods. We get PI = ~, p2 = ~, p3 = 4.
It appears that the closer we start to a5 the more likely we are to get to a5 without going to
a4, but the exact probabilities depend on the entire maze.
FIGURE 2.4 Rat mazes. (a) (b)
In a network of interconnected water pipes the junctions are pipe joints. It's usual
in a pipe network to assign each pipe a positive flow direction with an arrow as in
Figure 2.5. With this understanding a positive number rk will be a flow rate in the
direction assigned to the kth pipe, while a negative number -rk will be a flow of
equal rate in the opposite direction. We'll separate the flow rates into internal rates
rk and rates tk from or to external sources or drains. Specifying an external rate
tk = 0 at a joint closes off the external pipe there. We also assume that the inflow
at a joint equals the outflow. Thus at the upper left corner in Figure 2.5(a) we find
t1 = r1 + r2, while at the lower left we find r3 = r2 + t3, or -r2 + r3 = t3. Checking
each external joint of the network of Figure 2.5(a), we find the entire set of equations
relating the rates tk to the rates rk:
r1 + r2 = t1
-r1 - r4 + r5 = t2
-r2 + r3 = t3 (4)
-r3 + r4 + r6 = t4
r5 + r6 = t5.
From these equations we conclude that specifying the flows rk in the internal pipes
completely determines the flows tk at the external joints.
Turning the problem around, we can ask to what extent specifying the flows tk
at the external joints will specify the flows rk in the pipes. In particular, we can try
specifying that the exterior flow tk at each joint should be zero. This leads to the
system of five equations in six unknowns:
r1 + r2 = 0
-r1 - r4 + r5 = 0
-r2 + r3 = 0 (5)
-r3 + r4 + r6 = 0
r5 + r6 = 0.
We can let r6 = a be an arbitrary number, so we get r5 = -a from the last
equation. Similarly, let r4 = b be an arbitrary number. Noting from the first and
third equations that r3 = r2 = -r1, the remaining two equations for r1 and r4 both
reduce to r1 = -a - b, so the solution vector is
r = (-a - b, a + b, a + b, b, -a, a).
FIGURE 2.5 (a) (b)
Section 1B Systems of Linear Equations 57
It follows that there are infinitely many pipe flows, depending on the parameters
a and b, that will produce external flows tk = 0 for k = 1, 2, 3, 4, 5. We'll see
in Section 2C that every solution r of a system such as Equation (4) is the sum
r = r_p + r_0 of one particular solution r_p and one of the solutions in the 2-parameter
family r_0.
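A quick numerical check of these claims is sketched below. It takes the coefficient rows from the five equations of system (5), which have the same left sides as Equations (4), and uses the 2-parameter family r = (-a - b, a + b, a + b, b, -a, a). The names external_flows and homogeneous_solution, and the particular internal flow r_p, are assumptions made for the illustration.

```python
# Rows give the coefficients of r1..r6 in the five joint equations.
COEFFICIENTS = [
    [1, 1, 0, 0, 0, 0],
    [-1, 0, 0, -1, 1, 0],
    [0, -1, 1, 0, 0, 0],
    [0, 0, -1, 1, 0, 1],
    [0, 0, 0, 0, 1, 1],
]

def external_flows(r):
    """Compute the external rates (t1, ..., t5) produced by internal rates r."""
    return [sum(c * x for c, x in zip(row, r)) for row in COEFFICIENTS]

def homogeneous_solution(a, b):
    """Member of the 2-parameter family of flows with all external rates zero."""
    return [-a - b, a + b, a + b, b, -a, a]

# Every member of the family should give zero external flow at each joint.
all_zero = all(
    external_flows(homogeneous_solution(a, b)) == [0] * 5
    for a in range(-3, 4) for b in range(-3, 4)
)

# Adding a member of the family to any flow leaves the external rates unchanged.
r_p = [1, 0, 0, 0, 0, 0]                    # an arbitrary internal flow
r_0 = homogeneous_solution(2, -5)
shifted = [p + q for p, q in zip(r_p, r_0)]
same_externals = external_flows(shifted) == external_flows(r_p)
print(all_zero, same_externals)
```

The second check is a small instance of the decomposition r = r_p + r_0: the external rates depend only on the particular part.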
EXAMPLE 11 The derivation of Simpson's rule for approximate integration is based on the
requirement that it should give exact results when applied to quadratic polynomials. The rule
gives an approximation to the integral of a function over an interval a - h ≤ x ≤ a + h
in terms of the values of the function at the points a - h, a, and a + h. The general
form of the approximation is
∫_{a-h}^{a+h} f(x) dx ≈ A f(a - h) + B f(a) + C f(a + h),
where A, B, and C are constants. If the formula is to be correct for all polynomials
of degree less than or equal to 2, it must in particular be correct for the polynomials
f0(x) = 1, f1(x) = x, and f2(x) = x^2. Each of these requirements leads to an
equation for A, B, and C. For instance, with f0(x) = 1 we have
∫_{a-h}^{a+h} f0(x) dx = 2h and f0(a - h) = f0(a) = f0(a + h) = 1,
so the three requirements are
A + B + C = 2h
(a - h)A + aB + (a + h)C = 2ah,
(a^2 - 2ah + h^2)A + a^2 B + (a^2 + 2ah + h^2)C = 2a^2 h + (2/3)h^3.
Solving these three equations gives A = C = (1/3)h and B = (4/3)h, so the error in the
approximation is
E(f) = ∫_{a-h}^{a+h} f(x) dx - (1/3)h f(a - h) - (4/3)h f(a) - (1/3)h f(a + h).
Note that E(f0) = E(f1) = E(f2) = 0 holds by the way we chose A, B and C.
But then elementary properties of the integral and the form of the approximation as
a linear combination of values of f(x) show that
We can use the same method to derive a variety of formulas in the field of numerical
analysis, as in Exercise 18.
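The exactness claims can be tested with exact arithmetic. The sketch below assumes the Simpson weights A = C = h/3 and B = 4h/3 and compares the rule with the exact integral of x^n over the interval from a - h to a + h; the function names and the sample values a = 2, h = 1/2 are illustrative choices. The error also vanishes for n = 3, a well-known bonus of the rule, and is nonzero for n = 4.

```python
from fractions import Fraction as F

def simpson(f, a, h):
    """Three-point Simpson approximation on [a - h, a + h]."""
    return F(1, 3) * h * f(a - h) + F(4, 3) * h * f(a) + F(1, 3) * h * f(a + h)

def exact_monomial_integral(n, a, h):
    """Exact integral of x**n over [a - h, a + h], from the antiderivative."""
    return ((a + h) ** (n + 1) - (a - h) ** (n + 1)) / F(n + 1)

a, h = F(2), F(1, 2)
# Error E(f) = exact integral minus Simpson approximation, for f(x) = x**n.
errors = [
    exact_monomial_integral(n, a, h) - simpson(lambda x, n=n: x ** n, a, h)
    for n in range(5)
]
print(errors)
```

Because the rule and the integral are both linear in f, zero error on 1, x, and x^2 carries over to every polynomial of degree at most 2.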
EXERCISES
1. Figure 2.2(b) shows an electrical network with the resistance in ohms of each edge
marked on it. Suppose an external power supply maintains junction A at 10 volts
and junction B at 4 volts. Following the procedure of Example 7 in the text, set up
equations for the voltages at the other junctions and solve them. From the results,
calculate the current flowing into the network at junction A.

The edges and vertices of a 3-dimensional cube form a network with 8 junctions and
12 edges. Suppose that each edge is a wire of resistance 1 ohm and that there are
just two external connections, which maintain a voltage of 1 at one of the vertices
and 0 at another. In Exercises 2 to 4, find the values of the voltages at the other
vertices and the current flowing in the external connections under the stated conditions.

2. The two vertices with external connections are at opposite corners of the cube.
3. The vertices with external connections are at the two ends of an edge of the cube.
4. The vertices with external connections are at opposite corners of a face of the cube.
5. If forces in $\mathbb{R}^2$ act at the origin parallel to (2, 1), (2, 2), and (-3, -1), find
magnitudes we can assign to the forces so their sum will be zero.

In Exercises 6 and 7, suppose that three forces acting at the origin in $\mathbb{R}^3$ have
directions parallel to (1, 0, 0), (1, 1, 0), and (1, 1, 1).

6. Find examples of magnitudes for forces acting parallel to these directions so the
resultant force vector will be (-1, 2, 4).
7. Can an arbitrary force vector F = (a, b, c) be the resultant of forces acting in the
actual directions specified in the preamble? Explain.

In Exercises 8 to 10, suppose that a random walk traverses the paths shown in
Figure 2.4(a).

8. What is the probability $p_1$ that a walk starting at $a_1$ goes to $a_4$ without passing
through $a_5$?
9. What is the probability $p_2$ that a walk starting at $a_2$ goes to $a_4$ without passing
through $a_5$?
10. What is the probability $p_3$ that a walk starting at $a_3$ goes to $a_4$ without passing
through $a_5$?

In Exercises 11 to 13, assume a rat traces a random walk on the paths shown in
Figure 2.4(b). Let $p_k$ be the probability of going from $b_k$ to $b_6$ without going
through $b_5$.

11. Find $p_k$ for k = 1, 2, 3, 4.
12. Modify Figure 2.4(b) in the text so $b_4$ and the path from it to $b_3$ are eliminated.
Then compute the resulting new values for $p_k$ for k = 1, 2, 3.
13. Modify Figure 2.4(b) in the text by introducing a new path from $b_4$ to $b_6$. Then
compute the resulting new values for $p_k$ for k = 1, 2, 3, 4.
14. (a) Suppose the vector $t = (t_1, t_2, t_3, t_4, t_5)$ in Equations 4 of the text is specified
to be t = (-1, 0, 1, 2, 1). Find a vector r that determines consistent internal flow rates.
(b) Solve Equations 5 to verify that the vector r = (-a - b, a + b, a + b, b, -a, a),
with arbitrary a and b, describes all solutions that are consistent with external
flow t = 0 in Figure 2.5(a).
15. Let the external flow vector in Figure 2.5(b) be s = (1, 1, 2, 4). Show that there is
more than one consistent internal flow vector r, and find all of them in terms of an
arbitrarily assigned value.
16. If the external flow vector in Figure 2.5(b) is s = (1, 0, 1, 1), show that there is no
consistent internal flow vector.
17. Carry out the solution of the equations for A, B, C given in Example 11 of the text.
[Suggestion: Start by subtracting a times the first equation from the second and $a^2$
times the first from the third.]
18. Use the method of Example 11 to find constants A, B, C, D such that
$$\int_a^{a+3h} f(x)\,dx = Af(a) + Bf(a+h) + Cf(a+2h) + Df(a+3h)$$
is exact whenever f(x) is 1, x, $x^2$, and $x^3$, and so as in Example 5 is also exact for a
polynomial of degree at most 3.
Section 2A Matrix Methods 59
SECTION 2 MATRIX METHODS
In Section 1 we used elementary operations to solve systems of linear equations in
an ad hoc way that's hard to adapt to large systems. In Sections 2A and 2B we
introduce matrix equations and an effective solution routine. In Sections 2C and 2D
the emphasis is on geometric ideas. The matrix operations appear again in Sections
4 and 5 for computing inverse matrices and determinants, as well as later on in
Chapters 6 and 13.
2A Matrix Equations and Elementary Operations
A matrix is simply a rectangular array of numbers. Here are some examples:
( -~"'! )· (
O S 1 0.7 3 I O ../2) .
1 4
0.9 0 28 ). ( 0 -1 ) . (Z,S, O) ,
(
The horizontal lines of numbers in a matrix are called its rows, and the vertical lines
are called its columns.
The numbers of rows and columns in a matrix determine its dimensions, and for
consistency the number of rows is always designated before the number of columns.
The five examples just given have dimensions 3-by-2, 2-by-3, 2-by-2, 1-by-3, and
4-by-1. A matrix is square if it has the same number of rows as columns, so it has
dimensions n-by-n for some n. The 1-by-n matrices are called n-dimensional row
vectors, and n-by-1 matrices are called n-dimensional column vectors, so we may
regard the rows or columns of an m-by-n matrix as vectors in $\mathbb{R}^n$ or $\mathbb{R}^m$, respectively,
as in Definition 2.1.
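EXAMPLE 1 Matrices occur naturally for representing systems of linear equations. In the system
$$\begin{aligned} 2x + 3y - 4z &= 1\\ x - y + 2z &= -1, \end{aligned}$$
the coefficients and the right-side constants form the matrices
$$\begin{pmatrix} 2 & 3 & -4\\ 1 & -1 & 2 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 1\\ -1 \end{pmatrix}.$$
The 2-by-3 matrix is the coefficient matrix of the system, and the 2-by-1 matrix is
the right side.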
When writing the coefficient matrix of a system it's important to have the variables
lined up in the same order in all the equations and to use the coefficient 0 in the
matrix to indicate the absence of a variable. For example, to put the system
2x +y =4
z - y =2
60 Chapter 2 Equations and Matrices
in matrix form, it is a good idea to make clear the place that each coefficient has in
the system by first rewriting it as
2x+y =4
-y+z=2.
We can use dot products to relate a system's matrix and variables algebraically.
@AMPLE 2 I If A = (- i ;) and x = ( ~~ ).
then Ax = ( - i ;) (~~ ) = ( ~;: ! ;~~ ).
If B = ( : ! r) and y = 0}
then . By =( : : / ) m t;:! /) .
= (:
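EXAMPLE 3 Each system on the left is equivalent to the matrix equation on the right. Each
variable corresponds to a column of the matrix.
$$\begin{aligned} 4x + 3y &= 1\\ -x + 2y &= 2 \end{aligned}
\qquad
\begin{pmatrix} 4 & 3\\ -1 & 2 \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix} = \begin{pmatrix} 1\\ 2 \end{pmatrix}$$
$$\begin{aligned} 2x + y + 2z &= -1\\ x + 2y + z &= 0 \end{aligned}
\qquad
\begin{pmatrix} 2 & 1 & 2\\ 1 & 2 & 1 \end{pmatrix}\begin{pmatrix} x\\ y\\ z \end{pmatrix} = \begin{pmatrix} -1\\ 0 \end{pmatrix}$$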
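$$\begin{aligned} x_1 + x_2 &= 1\\ x_1 - x_2 &= 1\\ x_1 + 2x_2 &= 0 \end{aligned}
\qquad
\begin{pmatrix} 1 & 1\\ 1 & -1\\ 1 & 2 \end{pmatrix}\begin{pmatrix} x_1\\ x_2 \end{pmatrix} = \begin{pmatrix} 1\\ 1\\ 0 \end{pmatrix}$$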
The operations we used in Section 1 to solve systems of linear equations were
elementary multiplication of an equation by a nonzero scalar and elementary mod-
ification that adds a scalar multiple of another equation. The resulting system was
equivalent to the system we started with in that both systems had precisely the same
solutions. Here we apply these same operations to systems Ax = b described in
terms of matrices A and vectors b, noting their effect on the corresponding scalar
equations in a system. In particular, the operations have no effect on the vector x,
consistent with the invariance of the solution set of the system. Thus we can if we
like omit x and just operate simultaneously on A and b.
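EXAMPLE 4 The system
$$\begin{aligned} x + \tfrac{3}{4}y &= \tfrac{1}{4}\\ \tfrac{11}{4}y &= \tfrac{9}{4} \end{aligned}
\quad\text{with matrix form}\quad
\begin{pmatrix} 1 & \tfrac{3}{4}\\ 0 & \tfrac{11}{4} \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix} = \begin{pmatrix} \tfrac{1}{4}\\ \tfrac{9}{4} \end{pmatrix}.$$
Multiplying the second rows of the constant matrices by $\tfrac{4}{11}$ has the effect of
multiplying the second equation by $\tfrac{4}{11}$:
$$\begin{aligned} x + \tfrac{3}{4}y &= \tfrac{1}{4}\\ y &= \tfrac{9}{11} \end{aligned}
\quad\text{with matrix form}\quad
\begin{pmatrix} 1 & \tfrac{3}{4}\\ 0 & 1 \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix} = \begin{pmatrix} \tfrac{1}{4}\\ \tfrac{9}{11} \end{pmatrix}.$$
Adding $-\tfrac{3}{4}$ times the second rows of the constant matrices to the first has the effect
of replacing the first equation by $x = -\tfrac{4}{11}$:
$$\begin{aligned} x &= -\tfrac{4}{11}\\ y &= \tfrac{9}{11} \end{aligned}
\quad\text{with matrix form}\quad
\begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix} = \begin{pmatrix} -\tfrac{4}{11}\\ \tfrac{9}{11} \end{pmatrix}.$$
The unique solution is evident in either scalar equation or matrix form.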
You can imagine trying to solve a large system this way, faced with a large number
of possible steps to take. We'll describe a fail-safe routine, one that Theorem 2.2 in
Section 2B proves will always work. It's helpful to refer to the first nonzero entry
in a row of a matrix as the leading entry in that row. Here's the routine:
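EXAMPLE 5 We repeat the calculations of Example 2 of Section 1 to see the solution process in
matrix form for a system with infinitely many solutions.
$$\begin{aligned} 3x + 12y + 9z &= 3\\ 2x + 5y + 4z &= 4\\ -x + 2y + z &= -5 \end{aligned};
\qquad
\begin{pmatrix} 3 & 12 & 9\\ 2 & 5 & 4\\ -1 & 2 & 1 \end{pmatrix}\begin{pmatrix} x\\ y\\ z \end{pmatrix} = \begin{pmatrix} 3\\ 4\\ -5 \end{pmatrix}.$$
To change the system so that x appears only in the first equation, we multiply the
first equation by $\tfrac{1}{3}$, then add (-2) times the new first equation to the second and
add the new first equation to the third. In matrix terms, we apply elementary
multiplication by $\tfrac{1}{3}$ to the first rows of the coefficient matrix and the right side,
then apply elementary modifications adding multiples of the first row to the second and third:
$$\begin{aligned} x + 4y + 3z &= 1\\ -3y - 2z &= 2\\ 6y + 4z &= -4 \end{aligned};
\qquad
\begin{pmatrix} 1 & 4 & 3\\ 0 & -3 & -2\\ 0 & 6 & 4 \end{pmatrix}\begin{pmatrix} x\\ y\\ z \end{pmatrix} = \begin{pmatrix} 1\\ 2\\ -4 \end{pmatrix}.$$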
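The first variable x appears only in the first equation with coefficient 1; correspondingly
the first column of the matrix has 1 in the first row and 0 elsewhere.
Similarly, to isolate y in the second equation, multiply the second row of the
matrix and of the right side by $-\tfrac{1}{3}$,
$$\begin{aligned} x + 4y + 3z &= 1\\ y + \tfrac{2}{3}z &= -\tfrac{2}{3}\\ 6y + 4z &= -4 \end{aligned};
\qquad
\begin{pmatrix} 1 & 4 & 3\\ 0 & 1 & \tfrac{2}{3}\\ 0 & 6 & 4 \end{pmatrix}\begin{pmatrix} x\\ y\\ z \end{pmatrix} = \begin{pmatrix} 1\\ -\tfrac{2}{3}\\ -4 \end{pmatrix},$$
and then perform elementary modifications,
adding (-4) times the new second row to the first row and adding (-6) times the
new second row to the third row, treating the right-side entries similarly:
$$\begin{aligned} x + \tfrac{1}{3}z &= \tfrac{11}{3}\\ y + \tfrac{2}{3}z &= -\tfrac{2}{3}\\ 0 &= 0 \end{aligned};
\qquad
\begin{pmatrix} 1 & 0 & \tfrac{1}{3}\\ 0 & 1 & \tfrac{2}{3}\\ 0 & 0 & 0 \end{pmatrix}\begin{pmatrix} x\\ y\\ z \end{pmatrix} = \begin{pmatrix} \tfrac{11}{3}\\ -\tfrac{2}{3}\\ 0 \end{pmatrix}.$$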
The second column of the coefficient matrix now has 1 in the second row and 0
in the other rows, corresponding to y appearing only in the second equation.
All possible values of the variables satisfy the third equation, so we can ignore it.
Because x appears only in the first equation and y appears only in the second, we get
a solution in which z = t has an arbitrary value and $x = -\tfrac{1}{3}t + \tfrac{11}{3}$
and $y = -\tfrac{2}{3}t - \tfrac{2}{3}$.
As in Example 2, the set of solutions has the parametric representation of a line in
$\mathbb{R}^3$: $(x, y, z) = t(-\tfrac{1}{3}, -\tfrac{2}{3}, 1) + (\tfrac{11}{3}, -\tfrac{2}{3}, 0)$.
In a reduced matrix R a leading entry is the only nonzero entry in its column, and
the variable associated with that column in the system Rx = c is called a leading
variable; all other variables in the system are called nonleading.
In the matrix equations with x = (x, y, z) all the real information is in the 3-by-3
matrices and the constant column vectors; the vector x and the equal sign just remind
us of the context, so in principle could be dropped.
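To reduce the second column we multiply the second row by (-1) to get
$$\begin{pmatrix} 1 & -2 & -3\\ 0 & 1 & 5\\ 0 & -1 & -5 \end{pmatrix}\begin{pmatrix} x\\ y\\ z \end{pmatrix} = \begin{pmatrix} 2\\ -6\\ 6 \end{pmatrix},$$
and then add 2 times the second row to the first and 1 times the second row to the
third to get
$$\begin{pmatrix} 1 & 0 & 7\\ 0 & 1 & 5\\ 0 & 0 & 0 \end{pmatrix}\begin{pmatrix} x\\ y\\ z \end{pmatrix} = \begin{pmatrix} -10\\ -6\\ 0 \end{pmatrix}.$$
The leading variables are x and y, while the nonleading variable is z, and we have
the two nonzero equations
$$\begin{aligned} x + 7z &= -10\\ y + 5z &= -6. \end{aligned}$$
Giving z an arbitrary value, we find the unique solution with z = t is x = -10 - 7t,
y = -6 - 5t, z = t. The solutions in vector form are the points
$$(x, y, z) = t(-7, -5, 1) + (-10, -6, 0).$$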
2B Reduced Matrices
The examples suggest that row operations on the equations in a linear system produce
solutions or else tell us that there are none. The choices we made may have looked
a bit ad hoc so it may not be clear that the process outlined in Steps 1, 2, and 3 in
the previous subsection always works for systems of arbitrary size. We'll now show
that they provide a guaranteed routine for displaying a system in a form that makes
it easy to read off the solutions. We repeat the definition of the key terms we used
to describe the process, that a leading entry in a matrix is the first nonzero entry in
a row and that a matrix is reduced if the following two conditions hold:
(i) Every column containing a leading entry is zero except for the leading entry.
(ii) Every leading entry is 1.
IEXAMPLE 1 j If
0 0 0)
and B = I I I ,
( 0 2 0
the matrix A is reduced because the top two rows have leading entry 1 with only
zeros elsewhere in the columns containing the leading entries. Note that the zero row
has no leading entry. The matrix B is not in reduced form because the conditions
(i) and (ii) are both violated; a reduced form for B would have the 1 and 2 in the
middle column replaced by 0 and 1, respectively. The reduced form of A gives us
the solutions to a matrix equation Ax = b such as
Steps 1, 2, and 3 listed earlier for applying elementary row operations are the
main ideas we need to prove the following theorem.
Proof. Suppose the matrix A is not yet reduced. Then there must be some column
containing a leading entry such that either (i) or (ii) or both fail to hold. If that column
contains the leading entry r for the ith row $r_i$, multiplying $r_i$ by $r^{-1}$ will make the
leading entry 1. (Since r was a leading entry, it couldn't be zero, though it might be
1 to begin with.) If other entries in the column are nonzero, we can replace them by
zero by adding suitable multiples of the ith row to the other rows. Another column that
already satisfied (i) and (ii) before these operations must have a zero for its ith entry
and therefore is unaltered by the operations. Applying this process to an unreduced
matrix A increases the number of columns that satisfy the conditions (i) and (ii). If the
resulting matrix is still not reduced, we repeat the process, and we obtain a reduced
matrix after at most n steps, where n is the number of columns in A. •
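The procedure in the proof can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the book's routine: the names `reduce_matrix` and `ncoef` are hypothetical, rows are handled in order from top to bottom, and only the two elementary operations (multiplying a row by a nonzero scalar, adding a multiple of one row to another) are used. When the right side is appended as an extra column, `ncoef` restricts the search for leading entries to the coefficient columns.

```python
from fractions import Fraction

def reduce_matrix(rows, ncoef=None):
    """Return a reduced form: each leading entry is 1 and is the only
    nonzero entry in its column (among the first ncoef columns)."""
    m = [[Fraction(x) for x in row] for row in rows]
    width = len(m[0]) if ncoef is None else ncoef
    for i in range(len(m)):
        # Leading entry: first nonzero entry of row i among coefficient columns.
        j = next((c for c in range(width) if m[i][c] != 0), None)
        if j is None:
            continue  # zero row, no leading entry
        lead = m[i][j]
        m[i] = [x / lead for x in m[i]]  # elementary multiplication
        for r in range(len(m)):
            if r != i and m[r][j] != 0:
                f = m[r][j]
                # elementary modification: subtract f times row i from row r
                m[r] = [x - f * y for x, y in zip(m[r], m[i])]
    return m

# Example 5 of Section 2A, with the right side as a fourth column.
augmented = [[3, 12, 9, 3], [2, 5, 4, 4], [-1, 2, 1, -5]]
for row in reduce_matrix(augmented, ncoef=3):
    print([str(x) for x in row])
```

Applied to the system 3x + 12y + 9z = 3, 2x + 5y + 4z = 4, -x + 2y + z = -5, the sketch ends with rows corresponding to x + (1/3)z = 11/3, y + (2/3)z = -2/3, and 0 = 0.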
Theorem 2.2 shows that we can always use row reduction to convert a system
of linear equations to a system with a reduced coefficient matrix that has the same
solutions. If the reduced system has no zero rows, or if any zero rows correspond to
zero entries on the right side, then the system is consistent and the solutions are all
given by assigning arbitrary values to the nonleading variables.
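EXAMPLE 8 The system
$$\begin{aligned} x - 2y + z + 2w &= -1\\ z - w &= 1 \end{aligned},
\qquad
\begin{pmatrix} 1 & -2 & 1 & 2\\ 0 & 0 & 1 & -1 \end{pmatrix}\begin{pmatrix} x\\ y\\ z\\ w \end{pmatrix} = \begin{pmatrix} -1\\ 1 \end{pmatrix}$$
is not reduced, but we can reduce it by subtracting the second row from the first:
$$\begin{pmatrix} 1 & -2 & 0 & 3\\ 0 & 0 & 1 & -1 \end{pmatrix}\begin{pmatrix} x\\ y\\ z\\ w \end{pmatrix} = \begin{pmatrix} -2\\ 1 \end{pmatrix};
\qquad
\begin{aligned} x - 2y + 3w &= -2\\ z - w &= 1. \end{aligned}$$
We can assign arbitrary values to the variables y and w, so for each of the values
s and t there is just one solution with y = s and w = t, obtained by then putting
x = 2s - 3t - 2 and z = t + 1. The solutions are given as vectors by
$$\begin{pmatrix} x\\ y\\ z\\ w \end{pmatrix} = \begin{pmatrix} 2s - 3t - 2\\ s\\ t + 1\\ t \end{pmatrix}
= s\begin{pmatrix} 2\\ 1\\ 0\\ 0 \end{pmatrix} + t\begin{pmatrix} -3\\ 0\\ 1\\ 1 \end{pmatrix} + \begin{pmatrix} -2\\ 0\\ 1\\ 0 \end{pmatrix}.$$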
The general form $x = su_1 + tu_2 + v$ for solutions of our original system Ax = b is
significant in a fundamental way discussed in Section 2C. Note that if the constant
vectors in this linear combination were in $\mathbb{R}^3$ instead of $\mathbb{R}^4$, we could assert that
the solutions form a plane containing the point v in $\mathbb{R}^3$; Section 2D shows how to
extend the possibility of this geometric interpretation to solutions of all systems.
2C Homogeneous Systems
The planar solutions $x = su_1 + tu_2 + v$ to the system Ax = b of Example 8 illustrate
an important decomposition for solutions of linear systems. Setting s = t = 0 shows
that x = v is a solution of the original system. But if we let $x = u_1$ or $x = u_2$ for
that example, we find that instead of $Au_1 = b$ or $Au_2 = b$ we get $Au_1 = Au_2 = 0$.
Thus $x = u_1$ and $x = u_2$ are solutions of the homogeneous equation Ax = 0 that
we get when we set b = 0 in Ax = b. To explain what's going on here we start
with the following property of matrix-vector products.
Proof. Let $r_i$ be the ith row of A, so the ith entries in A(su + tv), Au, and Av are
$r_i \cdot (su + tv)$, $r_i \cdot u$, and $r_i \cdot v$. By additivity and homogeneity of the dot product,
$$r_i \cdot (su + tv) = s(r_i \cdot u) + t(r_i \cdot v).$$
The expression on the right is the ith entry in sAu + tAv. To get the more general
equation, apply the two-term version to $t_1u_1 + (t_2u_2 + \cdots + t_ku_k)$, and then successively
split off one more term at a time. •
Remark. The term linearity applied to the property of matrix-vector multiplication
in Theorem 2.3 stems from the observation that multiplication of the points on a line
tu + v in $\mathbb{R}^2$ or $\mathbb{R}^3$ by A carries the line into another line, or possibly just a point.
The reason is that, by Theorem 2.3, A(tu + v) = tAu + Av. Thus if $Au \ne 0$ the
result of applying A is a line through the point Av and parallel to the vector Au. If
Au = 0 we get only the point Av.
Here is the basic theorem about the structure of solutions of Ax = b.
2.4 Theorem. Every solution of the matrix equation Ax = b has the form $x_h + x_p$,
where $x_p$ is some particular solution and $x_h$ is a solution of the homogeneous equation
Ax = 0.
Proof Let A have n columns and m rows, with m < n . When we convert A
to a reduced form R, the zero rows in R will be consistent because the right side
is zero also. There can be at most m leading entries, so at most m columns with
leading entries. We may specify arbitrary values for each variable that corresponds
to a column having no leading entry, then solve in terms of these for the variables
that correspond to leading entries. Given the infinitely many arbitrary values for at
least one variable, we get infinitely many solutions. The same argument applies to
Rx = 0 if we don't count trivial equations 0 = 0. Finally, a reduced system with at
least as many nontrivial equations as variables has a leading entry in every column,
so has only the zero solution. •
l:~M~P~~ ~~P I
u-: -! ) (n n
1 The system
X - 2y + Z = 0
=( °' 2x +y - 3z = 0
( ~ ~ =:)( ~ ) = ( ~ ) or
X
y
+ -z = Q
+ -z = 0.
or
$$\begin{aligned} x + 0z &= 0\\ y + 0z &= 0, \end{aligned}$$
in which z appears only with zero coefficients, we still have to remember it's there
and set z = t to get all solutions (x, y, z) = (0, 0, t).
tell when the latter happens. Theorem 2.5 gives one way of answering the question,
and Sections 4 and 5 present special criteria that apply when A is a square matrix.
In the following discussion we assume nothing about the dimensions of A.
If A has columns $u_j$ with entries $u_{ij}$, then
2.7 Definition. Vectors $u_1, \ldots, u_n$ are linearly independent if the equation
$x_1u_1 + \cdots + x_nu_n = 0$ is satisfied only by choosing all $x_k = 0$, or equivalently
by Equation 2.6, if the only solution to Ax = 0 is x = 0, where A is the
matrix with the $u_k$ for columns. Otherwise the vectors are linearly dependent.
Definition 2.7 becomes more intuitive and is often easier to apply, in a form that
explicitly contains our original definition for two vectors:
The following theorem allows us to use whichever of the two definitions is more
convenient at a given point.
2.8 Theorem. Definitions 2.7 and 2.7' of linear independence are equivalent.
If $u_k$ is a linear combination of the other u's, then both equations hold with
$x_k = 1 \ne 0$. But if the equations hold with some $x_k \ne 0$, then dividing the second
equation by $x_k$ shows that $u_k$ is a linear combination of the other u's. •
IEXAMPLE 12 I Let A =
( 13)
3 l
2 2 and B = (134)
2 2 4
3 1 4
. The columns of A are linearly inde-
EXERCISES
Note. A linear system may have just one solution, or 15. Express the vectors i, j in R 2 as linear combinations
infinitely many solutions, or else no solutions if it's inconsis- of (I, 2) and (2, 3) by solving an appropriate system of
tent. Solving a system requires finding all solutions or showing equations for the coefficients of combination.
that there are none if that's the case.
16. Express the vectors i, j, k in R 3 as linear combinations
In Exercises 1 to 4, (a) write the system of equations in of (I, 1, 1), (1, 1, 0), and (1, 0, 0).
matrix fonn; that is, find a matrix A and a vector b such
that the system is equivalent to the equation Ax = b, 17. Express the vector (5, 0, 1, 2) as a linear combination of
and (b) solve the system. (1, 2, 1, 0) and (2, -1, 0, 1).
!D·~(D
3. X 4. X I 2 3
=1
y - z X - y = 0 0 1 2
X +Z =0 -x + y = 1 19. Solve the system 0 0 1
( 0 0 0
In Exercises 5 to 8, (a) write a system of equations 0 0 0
equivalent to the given matrix equation and (b) solve
the system. 20. Solve the system
5• ( ! i )( ; ) = ( b)
(i l j
0 0 0
_i
1 -D·~( J)
6. ( -b i ) = ( ~ )
X
2 3 4 6
0: O·~ (n
In Exercises 21 to 24, determine whether or not the vec-
tor vis a linear combination of the other vectors given.
1
• 21. V = 2i + 3j; 3 = 2i-j, b = 2i + j
s. 0D·~U) 22.
23. v
V = 2i + 3j + 4k; a = 2i - j, b = i + j + k,
C =j-2k
=
(0,1,2)
(-1,0,-l);a = (2,-1,2),b = (1,1,-3),c =
In Exercises 9 and 10, solve the given system. =
24. v (3, -1, 0, -l);a = (2, -1, 3, 2), b = (-1, 1, L-3) ,
•. (-!H)(O~O) C=(},J,9,-5)
In Exercises 25 and 26, parametric representations arc
given for two lines, L1 and L2. Show that there is just
one line L3 that intersects both LI and L2 at right angles,
10. ( _; !~) (' ~) = (
3 -1 2 w
~)
-1
and find a parametric representation for it. [Hint: If the
parameters s and t have values corresponding to the
' points where L3 intersects L1 and L2, then the vector
In Exercises 11 to 14, (a) find a reduced matrix equiv- from one point to the other must be perpendicular to hoth
alent to A, (b) solve the system Ax = 0, (c) solve the L 1 and L2. Show that th.is leads to two linear equations
system Ax = e1 + ei. that s and t must satisfy.]
11. A= ( ; -i -! ) 12. A= ( ~ ~
25. L1: s(3, 2, 1) + (-1, 0, 1), L2: t(O, 2, 1) + (-4, 2, 0)
26. L1:s(-1,3,0), L2:t(4,0,1)+(-2,1,1)
A~( H
*27. Show that if L 1 and L 2 are two lines in JR3 that do not
13. A= ( ~ 1
1 0 0
) ~ 14. intersect and are not parallel, then there is a unique third
line L 3 that intersects both of them at right angles, as in
Exercise 25. What if the lines are parallel? What if they intersect?
28. Show that if $v_1, \ldots, v_k$ are solutions of a homogeneous system Ax = 0, then every
linear combination of them is also a solution.
29. Let x = v be one solution of the system Ax = b. Show that w is also a solution if
and only if w - v is a solution of the homogeneous system Ax = 0. This version of
linearity is sometimes called the superposition principle.
30. Show that if x = v and x = u + v both satisfy Ax = b, then Au = 0.
31. Show that if $x_1$ satisfies $Ax = b_1$ and $x_2$ satisfies $Ax = b_2$, then $t_1x_1 + t_2x_2$
satisfies $Ax = t_1b_1 + t_2b_2$.
Comparing Definition 2.9 with the special cases in Chapter 1, we see that an
ordinary plane is a 2-plane and a line is a 1-plane. We can even take k = 0 and
regard a single point as a 0-plane. Every example of a system of linear equations
that we have treated has had for its solution set a k-plane for k = 0, 1, or 2, or else
has had no solutions at all.
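$$\begin{pmatrix} 1 & -2 & 0 & 3\\ 0 & 0 & 1 & -1 \end{pmatrix} x = \begin{pmatrix} -2\\ 1 \end{pmatrix};
\qquad
\begin{aligned} x_1 - 2x_2 + 3x_4 &= -2\\ x_3 - x_4 &= 1. \end{aligned}$$
For arbitrary scalars s and t there is a unique solution in which the two nonleading
variables, $x_2$ and $x_4$, have the values $x_2 = s$ and $x_4 = t$, namely,
$$x = \begin{pmatrix} 2s - 3t - 2\\ s\\ t + 1\\ t \end{pmatrix}
= s\begin{pmatrix} 2\\ 1\\ 0\\ 0 \end{pmatrix} + t\begin{pmatrix} -3\\ 0\\ 1\\ 1 \end{pmatrix} + \begin{pmatrix} -2\\ 0\\ 1\\ 0 \end{pmatrix}.$$
This represents a 2-plane $x = su_1 + tu_2 + v$ if we take
$$u_1 = \begin{pmatrix} 2\\ 1\\ 0\\ 0 \end{pmatrix}, \quad u_2 = \begin{pmatrix} -3\\ 0\\ 1\\ 1 \end{pmatrix}, \quad\text{and}\quad v = \begin{pmatrix} -2\\ 0\\ 1\\ 0 \end{pmatrix}.$$
To check that $u_1$ and $u_2$ are linearly independent, note that the second entries in
$u_1$ and $u_2$ are respectively 1 and 0, while the fourth entries are respectively 0
and 1. Thus neither $u_1$ nor $u_2$ can be a scalar multiple of the other, so the set
of solutions is a 2-plane in $\mathbb{R}^4$ parallel to $u_1$ and $u_2$, and containing the point
v. Note that $x_h = su_1 + tu_2$ solves Ax = 0 while $x_p = v$ solves Ax = b,
illustrating the decomposition of solutions into homogeneous plus particular in
Theorem 2.4.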
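The decomposition can be checked directly. The sketch below is an illustration only: the helper names `matvec` and `solution` are hypothetical, and the vectors are read off from the parametric solution x = (2s - 3t - 2, s, t + 1, t). It tests that the two direction vectors solve Ax = 0, that the remaining vector solves Ax = b, and that every sampled combination solves Ax = b.

```python
def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x, one dot product per row."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, -2, 0, 3],
     [0, 0, 1, -1]]
b = [-2, 1]
u1 = [2, 1, 0, 0]    # coefficient vector of s
u2 = [-3, 0, 1, 1]   # coefficient vector of t
v = [-2, 0, 1, 0]    # particular solution (s = t = 0)

def solution(s, t):
    """Return s*u1 + t*u2 + v."""
    return [s * p + t * q + r for p, q, r in zip(u1, u2, v)]

print(matvec(A, u1), matvec(A, u2))   # homogeneous part
print(matvec(A, v))                   # particular part
print(all(matvec(A, solution(s, t)) == b
          for s in range(-3, 4) for t in range(-3, 4)))
```

A finite sample of s and t is of course not a proof; the linearity property A(su + tv) = sAu + tAv is what guarantees the result for all scalars.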
In a reduced matrix with n columns and r nonzero rows, the number of nonleading
variables is k = n - r, because every nonzero row of a reduced matrix contains just
one leading entry and corresponds to just one variable. Thus a system of m linear
equations in n variables has a solution set that is an (n-m)-plane unless the result
of applying row reduction to the coefficient matrix produces one or more zero rows.
In particular, the solution set of a single linear equation in $\mathbb{R}^n$ is an (n-1)-plane,
sometimes called a hyperplane in $\mathbb{R}^n$.
X=
(-lr/lu-1 )
3
t
u
3 =S (b)
O
O
+t (-f) (f) (-~)
3
O
1
+u 3
0
1
+
. 0
0
.
The first three terms in the sum are linearly independent since each one has a 1 in the
entry where the other two have only zero, so their linear combination is the general
solution $x_h$ of $a \cdot x = 0$. The fourth vector is a solution $x_p$ of the nonhomogeneous
equation, so the solutions form a hyperplane containing $x_p$ in $\mathbb{R}^n$.
Checking for Independence. For arbitrary sets of vectors the simple check for
independence we used at the end of the previous example is often impossible, but
there is a routine check. The method depends on knowing that applying elementary
row operations to a matrix A with columns $a_1, \ldots, a_m$ preserves a dependence relation
among the columns. For instance if we get B from A by applying row operations,
then $a_1 = a_2 - 2a_3$ if and only if $b_1 = b_2 - 2b_3$. The reason is that the row operations
(i) multiplication by $r \ne 0$ and (ii) adding a multiple of one row to another, preserve
a dependence relation in each affected row. More formally, we have the following.
Proof. We can express a linear relation $x_1a_1 + \cdots + x_na_n = 0$ among the columns
$a_j$ of A as Ax = 0, where $x_j = 0$ if $a_j$ isn't involved. Since solution vectors x of
Ax = 0 are unchanged by row operations on A, a linear relation among the columns
of A carries over to the same relation among the corresponding columns of B. •
To find out whether a = (1, 2, 3, -1), b = (0, 1, -1, 1), and c = (-1, 0, 2, -1) are
linearly independent, we form the matrix A that has them as columns, with reduced
form R for which we omit the details:
$$A = \begin{pmatrix} 1 & 0 & -1\\ 2 & 1 & 0\\ 3 & -1 & 2\\ -1 & 1 & -1 \end{pmatrix};
\qquad
R = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 0\\ 0 & 0 & 1 \end{pmatrix}.$$
Theorem 2.11 tells us that if some column of a reduced matrix we're checking for
dependent columns has no leading entry, that column will be a linear combination
of the columns that do have only leading entries, as in the following example.
To check the vectors a = (1, 0, 1, 0), b = (0, 1, 1, 0), c = (1, 1, 2, 1), and d =
(-3, -4, -7, -3) for independence, we form the matrix A that has them as columns,
showing also the system Ax = 0 to aid understanding the dependence relations:
$$A = \begin{pmatrix} 1 & 0 & 1 & -3\\ 0 & 1 & 1 & -4\\ 1 & 1 & 2 & -7\\ 0 & 0 & 1 & -3 \end{pmatrix};
\qquad
\begin{aligned} x_1 + x_3 - 3x_4 &= 0\\ x_2 + x_3 - 4x_4 &= 0\\ x_1 + x_2 + 2x_3 - 7x_4 &= 0\\ x_3 - 3x_4 &= 0. \end{aligned}$$
A row reduction left as Exercise 5(b) gives an R and a reduced system Rx = 0:
$$R = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & -1\\ 0 & 0 & 0 & 0\\ 0 & 0 & 1 & -3 \end{pmatrix};
\qquad
\begin{aligned} x_1 &= 0\\ x_2 - x_4 &= 0\\ 0 &= 0\\ x_3 - 3x_4 &= 0. \end{aligned}$$
Solving the corresponding system Rx = 0, we can set the sole nonleading variable
x4 equal to 1, so x1 = 0, x2 = 1, and x3 = 3. Hence b + 3c + d = 0. Evidently
d = -b - 3c, so {a, b, c, d} is a linearly dependent set of vectors, and even the
smaller set {b, c, d} is dependent. But {a, b, c} is an independent set. What about
{a, b, d}? By looking carefully at R, you can answer this question without referring
to the system Rx= 0.
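The routine check can be sketched in Python. The code is an illustration under assumptions: `reduce_matrix` is a hypothetical helper that applies the two elementary row operations row by row, and the variable names are my own. It reduces the matrix whose columns are a, b, c, d, lists the columns that contain leading entries, and confirms the relation b + 3c + d = 0.

```python
from fractions import Fraction

def reduce_matrix(rows):
    """Reduce so each leading entry is 1 and is alone in its column."""
    m = [[Fraction(x) for x in row] for row in rows]
    for i in range(len(m)):
        j = next((col for col, x in enumerate(m[i]) if x != 0), None)
        if j is None:
            continue  # zero row
        lead = m[i][j]
        m[i] = [x / lead for x in m[i]]
        for r in range(len(m)):
            if r != i and m[r][j] != 0:
                f = m[r][j]
                m[r] = [x - f * y for x, y in zip(m[r], m[i])]
    return m

a, b, c, d = (1, 0, 1, 0), (0, 1, 1, 0), (1, 1, 2, 1), (-3, -4, -7, -3)
A = [list(row) for row in zip(a, b, c, d)]  # the vectors become columns of A
R = reduce_matrix(A)

# Columns holding a leading entry; the remaining column depends on these.
leading_cols = sorted(next(col for col, x in enumerate(row) if x != 0)
                      for row in R if any(row))
relation = [bi + 3 * ci + di for bi, ci, di in zip(b, c, d)]

print(leading_cols)
print(relation)
```

Columns are numbered from 0 in the code, so `leading_cols` refers to a, b, and c; the fourth column, with no leading entry, is the one expressed through the others.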
EXERCISES
1. Solve the equation w + 3x - 2y + z = 3 by expressing the solutions parametrically
as a 3-plane in $\mathbb{R}^4$.
2. Solve the equation u + v + w - x - y = 1 by expressing the solutions parametrically
as a 4-plane in $\mathbb{R}^5$.
determines a plane that doesn't contain the third vector.
9. Let A =( t - ).
~ ~ Show that points x such that
1 0 -1 3 )
2 1 0 -I
( 3 -I 2 0 '
-1 1 -I 2
$$A = \begin{pmatrix} a_{11} & a_{12}\\ a_{21} & a_{22}\\ a_{31} & a_{32} \end{pmatrix}
\quad\text{or}\quad
B = \begin{pmatrix} b_{11} & b_{12} & b_{13}\\ b_{21} & b_{22} & b_{23} \end{pmatrix}.$$
In general, $a_{ij}$ is called the ijth entry of A and stands for the entry in the ith row
and the jth column of the matrix. For a row vector or a column vector we usually
use only one subscript and write, for example, $a = (a_1, a_2, \ldots, a_n)$.
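$$P = \begin{pmatrix} 7 & -1\\ -3 & 2 \end{pmatrix} \quad\text{and}\quad Q = \begin{pmatrix} 11 & 12 & 13\\ 21 & 22 & 23 \end{pmatrix}.$$
Then in P the entries are $p_{11} = 7$, $p_{12} = -1$, $p_{21} = -3$, $p_{22} = 2$. We can write a
formula for the entries in Q, namely $q_{ij} = 10i + j$ for i = 1, 2 and j = 1, 2, 3.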
(~ ;)+( -! ;)=(~ ;)
2
I
I )
0 +( - I
I =; -~) = ( ~ ~ ~)
There's no reasonable way to define addition for matrices of different dimensions,
so we can't add
Section 3B Matrix Algebra 75
For a matrix A and number r, the scalar multiple rA is defined to be the matrix
C with entries $c_{ij} = ra_{ij}$.
61)= ( -~ =~ )
- 2(
3
(-! i 6)=(-~ ~ ~)
Using both addition and scalar multiplication, we can write linear combinations of
matrices that have the same dimensions. For example, using 2-by-2 matrices,
)
- 3( 0
2
l )
l
=( 2
4
-2 )
6
+( 0
-6
-3 )
-3
= ( -22 -5)
3 .
As with vectors in $\mathbb{R}^n$, we write -A for (-1)A and A - B for A + (-1)B. Also,
for every m and n there is an m-by-n zero matrix, denoted by 0, with all entries
equal to zero, such that A + 0 = A.
Notational warning. When 0 is used to denote a zero matrix, we depend on the
context to make clear what dimensions are intended for the matrix. For example,
Our general definition of the matrix product AB depends on the special case Ax.
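EXAMPLE 4 If
$$A = \begin{pmatrix} 4 & 3\\ -1 & 2\\ 1 & -1 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} b_{11} & b_{12}\\ b_{21} & b_{22} \end{pmatrix},$$
then
$$AB = \begin{pmatrix} 4 & 3\\ -1 & 2\\ 1 & -1 \end{pmatrix}\begin{pmatrix} b_{11} & b_{12}\\ b_{21} & b_{22} \end{pmatrix}
= \begin{pmatrix} 4b_{11} + 3b_{21} & 4b_{12} + 3b_{22}\\ -b_{11} + 2b_{21} & -b_{12} + 2b_{22}\\ b_{11} - b_{21} & b_{12} - b_{22} \end{pmatrix}.$$
In particular, when $B = \begin{pmatrix} 1 & 2\\ 4 & 5 \end{pmatrix}$ the product is
$$AB = \begin{pmatrix} 16 & 23\\ 7 & 8\\ -3 & -3 \end{pmatrix}.$$
The entry in the first row and second column of AB is the dot product given by
(4, 3) · (2, 5) = (4)(2) + (3)(5) = 23; you should check that the other entries are
correct as shown.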
Matrix multiplication is sometimes called row-by-column multiplication, and
schematically the process looks like this, putting the dot product of the second row
and the fourth column in the corresponding row and column of the product:
$$\sum_{k=1}^{n} a_{ik}b_{kj},$$
an expression that is read as "the sum for k = 1 to n of $a_{ik}b_{kj}$" and means the
same as
$$a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}.$$
Note that the summation index k runs over the column index of A (that is, across a
row) and over the row index of B (that is, down a column).
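The summation formula translates almost word for word into code. In the sketch below, which is an illustration and not part of the text, `matmul` is a hypothetical helper; A is the 3-by-2 matrix of Example 4, while B is an assumed 2-by-2 matrix whose second column (2, 5) matches the dot product computed in that example. The matrices P and Q are also assumptions, chosen only to show that the order of the factors matters.

```python
def matmul(A, B):
    """Row-by-column product: entry (i, j) is the sum over k of A[i][k] * B[k][j]."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    p = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]

A = [[4, 3], [-1, 2], [1, -1]]
B = [[1, 2], [4, 5]]          # assumed entries; second column as in Example 4
print(matmul(A, B))

# Two 2-by-2 matrices whose products in the two orders differ.
P = [[0, 1], [0, 0]]
Q = [[0, 0], [1, 0]]
print(matmul(P, Q), matmul(Q, P))
```

The product BA is not even defined here, since B has 2 columns and A has 3 rows; calling `matmul(B, A)` raises the `ValueError`.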
It is important to note that matrix multiplication is not in general commutative.
Thus even if the products AB and BA are defined and have the same dimensions
they may not be equal. This is why we need the first two laws, with factors in
opposite orders, in Theorem 3.2.
3.2 Theorem. Let A, B, and C be matrices having the proper dimensions for the
sums and products to be defined, and let t be a scalar. The basic properties of matrix
products are
Proof. We prove only property 4 since it is the most complicated; proofs of the
others are left as exercises. Suppose A is m-by-n. For the product AB to be defined,
B must have n rows and so must be n-by-p for some p. Similarly, C must be p-by-q
for some q in order for BC to be defined. To prove two matrices equal we have
to show that corresponding entries are the same in both. The rjth entry of BC is
$\sum_{s=1}^{p} b_{rs}c_{sj}$, so the ijth entry of A(BC) is
$$\sum_{r=1}^{n} a_{ir}\Big(\sum_{s=1}^{p} b_{rs}c_{sj}\Big) = \sum_{r=1}^{n}\sum_{s=1}^{p} a_{ir}b_{rs}c_{sj}.$$
Similarly, the isth entry of AB is $\sum_{r=1}^{n} a_{ir}b_{rs}$, so the ijth entry of (AB)C is
$$\sum_{s=1}^{p}\Big(\sum_{r=1}^{n} a_{ir}b_{rs}\Big)c_{sj} = \sum_{s=1}^{p}\sum_{r=1}^{n} a_{ir}b_{rs}c_{sj}.$$
The sums on the right in these two equations consist of the same terms added in
different orders, so corresponding entries in A(BC) and (AB)C are equal. •
Formulas 1 through 4 in Theorem 3.2 state laws that have the same form as some
of the laws of ordinary arithmetic, with matrices replaced by scalars. Because of
the associative law for matrices it makes sense to write the product ABC instead of
(AB)C or A(BC) since the result is independent of the order in which the products
are formed, though as we saw in Example 5 not necessarily independent of the order
of the factors. In the next section we'll define an inverse operator to multiplication
by A, denoted $A^{-1}$, that is defined only for some square matrices.
j-~><AMPLE 6 I To illustrate the associative and distributive laws in the next two examples, let
A = ( _ : ~) , B = ( ~ i) , C = ( -~ ~) , and D i) ·
= ( - ~~
Thus (AB)C = A(BC), as the associative law states for A, B, and C in that order.
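$$I = \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix}
\quad\text{or}\quad
I = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix}
\quad\text{or}\quad
I = \begin{pmatrix} 1 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & 1 \end{pmatrix}$$
that has 1s on its main diagonal and 0s elsewhere is called an identity matrix. An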
identity matrix I has the property that both
$$IA = A \quad\text{and}\quad BI = B$$
for matrices A and B such that the products are defined. Thus it is an identity element
for matrix multiplication somewhat as the number 1 is an identity for multiplication
of numbers. There is an n-by-n identity matrix for every value of n, and as with
zero matrices, we depend on the context to determine the dimensions of the identity
matrix denoted by an occurrence of I in a formula.
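$$\begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix} = \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix}.$$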
3D Matrix Polynomials
We may multiply together any number of matrices of the same dimensions n-by-n to
get a square matrix with the same dimensions. In particular, if A is a square matrix
we may multiply it by itself repeatedly, and we define
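$$A^2 = \begin{pmatrix} 2 & 1\\ 0 & 3 \end{pmatrix}\begin{pmatrix} 2 & 1\\ 0 & 3 \end{pmatrix} = \begin{pmatrix} 4 & 5\\ 0 & 9 \end{pmatrix},
\qquad
A^3 = \begin{pmatrix} 4 & 5\\ 0 & 9 \end{pmatrix}\begin{pmatrix} 2 & 1\\ 0 & 3 \end{pmatrix} = \begin{pmatrix} 8 & 19\\ 0 & 27 \end{pmatrix}, \quad\text{etc.}$$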
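$$= \begin{pmatrix} 4 & 5\\ 0 & 9 \end{pmatrix} + \begin{pmatrix} 8 & 4\\ 0 & 12 \end{pmatrix} + \begin{pmatrix} 2 & 0\\ 0 & 2 \end{pmatrix} = \begin{pmatrix} 14 & 9\\ 0 & 23 \end{pmatrix}.$$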
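A matrix polynomial can be evaluated by accumulating powers of A, with the constant term multiplied by the identity matrix. The sketch below is illustrative only: `matmul` and `matrix_poly` are hypothetical helper names, and both the matrix A = (2 1; 0 3) and the polynomial $p(x) = x^2 + 4x + 2$ are assumptions chosen to be consistent with the three matrices added in the sum displayed above.

```python
def matmul(A, B):
    """Row-by-column product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matrix_poly(coeffs, A):
    """Evaluate c0*I + c1*A + c2*A^2 + ... for a square matrix A."""
    n = len(A)
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # A^0 = I
    total = [[0] * n for _ in range(n)]
    for c in coeffs:
        total = [[total[i][j] + c * power[i][j] for j in range(n)]
                 for i in range(n)]
        power = matmul(power, A)  # next power of A
    return total

A = [[2, 1], [0, 3]]                  # assumed matrix
print(matmul(A, A))                   # A^2
print(matmul(matmul(A, A), A))        # A^3
print(matrix_poly([2, 4, 1], A))      # 2I + 4A + A^2
```

Coefficients are listed from the constant term upward, so `[2, 4, 1]` stands for $2 + 4x + x^2$; the constant 2 contributes 2I, not the number 2 added to every entry.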
EXERCISES
In Exercises J to 10, compute the given expressions 28. Show that if A is a m-by-11 matrix, then Aej is the jth
using the following matrices: column of A, where e1, ... , e11 are the standard basis
vectors in JR", considered as column vectors. What arc
the products e; A, considering e; as a row vector?
A = ( -b i ), B = ( In Exercises 29 to 34, compute the given matrix
products.
c-(1 i )·
(i 00) (: DO)
2 1
- 1
29. 5 30. 0
.)( n
8 1
1. 3A - B 2. A +2B 3. 2A +B +C
4. A
7. ABC
+sc 5. AB
8. AB-2B
6. BC
9. C+AC
31. ( 2 32.
0)1 2 4 )
10. A+ B +C 2
In Exercises 11 to 20, compute the given expression if
it is defined, or else give a reason why it is not defined,
33.
(H) (_: - 1
1 -1
1
)
using the following matrices.
34. (~ ~ ~ )(-:)
0 0 5 -1
A=( 35. Show that for a matrix A and zero matrices of appropriate
dimensions,
G=( -I 2)
matrices are AO and O A defined, and what arc the
dimensions of the products?
0 3
36. Prove the Icft distributive law for matrices, C(A + B) =
CA +CB .
11. 2B - 3G 12. AB 13. BA
37. Let r be a 1-by-n row vector and e an n-by-1 column
14. BD 15. DB 16. CD +3DB vector.
17. 2AB - SG 18. 2GC -4AB 19. CDC (a) Show that the matrix product re is the same as the
20. DCD dot product r • e.
(b) Describe in general the product er.
In Exercises 21 to 26, with A, B, C, D the same as in
the preceding group of exercises, determine what the 38. A company makes m grades of a product in n different
dimensions of X and Y would have to be for each of the factories.
following equations to be possible. (In some cases there (a) Let £lij be the number of tons per day of grade i
may be no possible dimensions; in other cases there may produced in plant j .
he more than one possibility.) (b) Let dj be the number of days per month that plant
21. AX=B+Y 22. (D+2X)YC=0 j operates.
(e) Let p; be the wholesale price per ton of grade i .
23. AX= YD 24. ex +DY= o
25. AX= YC 26. AX= CY
27. Using the matrix C of the preceding group of exercises, ~I A - (a,1), d - ( ;: ) , ~d p - ( :~ } ood
compute Ci, Cj, and Ck, taking the basis vectors i, ,i, and
k as column vectors in JR 3 • let u be the column vector of all Is in JR" and v be the
Section 4A Inverse Matrices 81
column vector of all ls in ]Rm. Give interpretations for factoring the left side we know that the scalar equation
the following expressions. x2 - I = 0 has x = l and x =-
l as its only solutions.
(a)
(d)
Au
(Ad)• v
(b)
(e)
(Au)• v
(Ad)• p
(c)
= ( ~ ~) ifa 2 +bc=
I. Thus the equation A 2 - I = 0 has infinitely many
different solutions in the set of 2-by-2 matrices.
39. A= ( -~ ~) 40. A=
( 0
0
0
1
0
1)
I (c) Show that every 2-by-2 matrix A for which A 2 = I
is either /, - l, or one of the matrices described in
0 0
part (b).
(A - B)(A + B) # A 2 - B 2. eA = N-+oo
.
hm LI-A k = Elk
-A.
k! k!
45. (a) Show that for A and / as in Exercise 44, (A+ /) 2 = k=O k=O
A 2 +2A + I.
(b) Find examples of 2-by-2 matrices A and B such that Note that the finite sum from O to N is a polynomial
(A+ B) 2 # A 2 +2AB + B 2 • PN(A) and so is itself an n by n matrix in which we
compute the limit separately in each matrix entry.
\[ AA^{-1} = A^{-1}A = I. \]
and
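\[
A^{-1}A = \begin{pmatrix} 7 & -2 \\ -3 & 1 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 3 & 7 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
\]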
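In Example 1 we used

4.1
\[
\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}, \qquad ad - bc \neq 0.
\]
Formula 4.1 is worth remembering and is the special 2-by-2 case of Theorem 5.8 in
Section 5E. Exercise 32 asks you to verify that the formula is correct.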
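Formula 4.1 translates directly into a few lines of code. The sketch below is a minimal illustration using exact rational arithmetic; the function name `inverse_2x2` and the choice to return `None` when $ad - bc = 0$ are assumptions made here, not conventions from the text.

```python
from fractions import Fraction

def inverse_2x2(a, b, c, d):
    """Inverse of the matrix with rows (a, b) and (c, d) by Formula 4.1.

    Returns None when ad - bc = 0, in which case the formula does not apply.
    """
    det = a * d - b * c
    if det == 0:
        return None
    f = Fraction(1) / det  # exact value of 1/(ad - bc)
    return [[f * d, f * (-b)], [f * (-c), f * a]]

print(inverse_2x2(1, 2, 3, 7))  # entries 7, -2, -3, 1 as Fractions
print(inverse_2x2(1, 2, 2, 4))  # None, since ad - bc = 0
```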
If A is an n-by-n matrix, then the matrix equation Ax = b is equivalent to a
system of n linear equations in n variables. If A happens to be an invertible matrix
with inverse $A^{-1}$, then we can solve the system in matrix form by multiplying
both sides on the left by $A^{-1}$ to get $A^{-1}Ax = A^{-1}b$. Since $A^{-1}A = I$, we have
$A^{-1}Ax = Ix = x$, so
\[ x = A^{-1}b. \]
Theorem 4.4 will show that if A is a square matrix and x = 0 is the only solution
to Ax = 0, then A is invertible.
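The system
\[
\begin{aligned} x + 2y &= 3 \\ 3x + 7y &= -4 \end{aligned}
\qquad\text{is equivalent to}\qquad
\begin{pmatrix} 1 & 2 \\ 3 & 7 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 3 \\ -4 \end{pmatrix}.
\]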
Section 4B Inverse Matrices 83
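By Formula 4.1,
\[
\begin{pmatrix} 1 & 2 \\ 3 & 7 \end{pmatrix}^{-1} = \begin{pmatrix} 7 & -2 \\ -3 & 1 \end{pmatrix}.
\]
Multiplying both sides of the matrix equation on the left by this inverse gives
\[
\begin{pmatrix} 7 & -2 \\ -3 & 1 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 3 & 7 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ y \end{pmatrix}
\]
on the left and
\[
\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 7 & -2 \\ -3 & 1 \end{pmatrix}\begin{pmatrix} 3 \\ -4 \end{pmatrix} = \begin{pmatrix} 29 \\ -13 \end{pmatrix}
\]
on the right. Thus (x, y) = (29, -13) is the unique solution.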
4B Computing Inverses
Formula 4.1 gives an easy way to find inverses of 2-by-2 matrices, and Section 5E
has formulas for $A^{-1}$ using determinants when A has larger dimensions, but it's
often more efficient, particularly for large matrices, to use row operations as in the
process described below. To use the method we need to introduce another type of
row operation on a matrix A: row rearrangement, which changes the order of the
rows in the matrix. As with elementary multiplication and elementary modification,
applying the same row rearrangement to A and b leaves the solution set of Ax = b
unchanged; this is so because the solutions of a system are unchanged by writing
the scalar equations of the system in different orders. We'll first describe the process
and give an example, then prove some theorems that show it always works, either
finding the inverse of a matrix A or showing that A has no inverse.
4.3 Matrix Inversion Process. Given a square matrix A, apply row operations
to obtain a reduced matrix R, applying the same operations in the same order to I
to obtain a matrix B. Then
If a row of R is a zero row, A is not invertible.
If R has no zero rows, then A is invertible. Apply row rearrangement
to convert R to I, and apply the same rearrangement to B. Then B is $A^{-1}$.
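We illustrate how the process works on the matrix A on the left below, row reducing
A and performing the same row operations on I as we go along. We start with
\[
A = \begin{pmatrix} 2 & 4 & 8 \\ 1 & 0 & 0 \\ 1 & -3 & -7 \end{pmatrix}, \qquad
I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]
Add -2 times the second row to the first and -1 times the second to the third:
\[
\begin{pmatrix} 0 & 4 & 8 \\ 1 & 0 & 0 \\ 0 & -3 & -7 \end{pmatrix}, \qquad
\begin{pmatrix} 1 & -2 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \end{pmatrix}.
\]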
84 Chapter 2 Equations and Matrices
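Multiply the first row by $\tfrac14$ and then add 3 times the first row to the third:
\[
\begin{pmatrix} 0 & 1 & 2 \\ 1 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \qquad
\begin{pmatrix} \tfrac14 & -\tfrac12 & 0 \\ 0 & 1 & 0 \\ \tfrac34 & -\tfrac52 & 1 \end{pmatrix}.
\]
Multiply the third row by -1 and then add -2 times the third row to the first:
\[
\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
\begin{pmatrix} \tfrac74 & -\tfrac{11}{2} & 2 \\ 0 & 1 & 0 \\ -\tfrac34 & \tfrac52 & -1 \end{pmatrix}.
\]
The matrix on the left is reduced. Interchanging the first and second rows gives
\[
I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
A^{-1} = \begin{pmatrix} 0 & 1 & 0 \\ \tfrac74 & -\tfrac{11}{2} & 2 \\ -\tfrac34 & \tfrac52 & -1 \end{pmatrix}.
\]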
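The inversion process can be sketched in code. The version below is a common variant rather than a literal transcription of Process 4.3: it carries A and I side by side in one augmented matrix and interchanges rows as it goes, instead of rearranging once at the end. The name `invert` and the sample 3-by-3 matrix are illustrative assumptions; exact fractions avoid rounding error.

```python
from fractions import Fraction

def invert(A):
    """Return the inverse of the square matrix A, or None if A is not invertible."""
    n = len(A)
    # Augment A with the identity so each row operation acts on both at once.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Find a row at or below position col with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None  # no leading entry available: A is not invertible
        M[col], M[pivot] = M[pivot], M[col]   # row rearrangement
        p = M[col][col]
        M[col] = [x / p for x in M[col]]      # elementary multiplication
        for r in range(n):
            if r != col and M[r][col] != 0:   # elementary modification
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]             # right half now holds the inverse

A = [[2, 4, 8], [1, 0, 0], [1, -3, -7]]       # sample matrix
for row in invert(A):
    print(row)
```

A singular input such as `[[1, 2], [2, 4]]` returns `None`, which corresponds to the reduced matrix R having a zero row.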
Proof. Let R be the result of applying row reduction operations to A, and B the
result of applying the same operations to I, as in the inversion process 4.3.
First suppose that x = 0 is the only solution of Ax = 0. Then the same is true of
the system Rx = 0, and by Theorem 2.5 the system has at least as many nontrivial
equations as it has variables. Since R is square, this implies that R has no zero
row. Thus every row and column contains a leading entry 1 with 0s in the rest of
the column, and the rows of R can be rearranged to produce the identity matrix I.
Rearrange the rows of B in the same way, and write $b_k$ for its kth column. Row
operations on a matrix apply simultaneously to all its columns, so we've converted
the equations $Ax_k = e_k$ to an equivalent system $Ix_k = b_k$. Thus $x_k = b_k$ is the kth
column of X, and B is a solution of AX = I. In other words, AB = I.
Section 4C Inverse Matrices 85
Otherwise, suppose that the system Ax = 0 has nonzero solutions. Then the same
is true of the system Rx = 0 and, again by Theorem 2.5, the reduced system has
more variables than it has nontrivial equations, which implies that R has at least one
zero row. •
Theorem 4.4 implies that if A is invertible, then the inversion process 4.3 produces
a matrix B such that AB = I. To prove that $B = A^{-1}$ we need to show that BA = I
as well. Also, many different sequences of row operations will reduce A to I, so it
isn't obvious that the process used in Theorem 4.4 always gives the same matrix B
as a result. The next theorem settles both these questions.
4.5 Theorem. If A is an invertible matrix, and B is the matrix produced by the
inversion process 4.3 such that AB = I, then BA = I as well, so B is inverse to A.
Finally, $B = A^{-1}$ is the only inverse of A.
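\[
\begin{pmatrix} a_1 & 0 \\ 0 & a_2 \end{pmatrix}, \qquad
\begin{pmatrix} a_1 & 0 & 0 \\ 0 & a_2 & 0 \\ 0 & 0 & a_3 \end{pmatrix}, \qquad \text{etc.}
\]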
In particular, identity matrices are diagonal matrices in which all the diagonal entries
are 1. The notation $\operatorname{diag}(t_1, t_2, \dots, t_n)$ is convenient for the n-by-n diagonal matrix
with entries $t_1, \dots, t_n$ on the main diagonal. Diagonal matrices are easy to multiply.
We have for dimensions n-by-n,
\[
\operatorname{diag}(a_1, a_2, \dots, a_n)\,\operatorname{diag}(b_1, \dots, b_n) = \operatorname{diag}(a_1 b_1, \dots, a_n b_n).
\]
It follows that a diagonal matrix is invertible if and only if the main diagonal entries
are all nonzero, and in that case
4.6
\[
\operatorname{diag}(a_1, a_2, \dots, a_n)^{-1} = \operatorname{diag}(a_1^{-1}, a_2^{-1}, \dots, a_n^{-1}).
\]
EXAMPLE 4
( ~ ~ )-1 = ( t ½ )
with all entries below the main diagonal equal to zero. Just as with diagonal matrices,
an upper triangular matrix is invertible if and only if the main diagonal entries are
all nonzero. The reason is that if there is no 0 on the diagonal of an upper triangular
matrix, we can transform it into the identity matrix by elementary operations as in
the following example.
EXAMPLE 5 With upper triangular matrices, it is simpler to reduce the columns working from
right to left instead of from left to right.
I 3 )
l 2 ,
0 4
0i D·
0l O·
EXERCISES
In Exercises l to 4, either find the inverse of the given In Exercises 5 and 6, solve the matrix equation Ax =b
matrix or show that it does not have one. by multiplying by A- 1 .
I. ( ~) 5. A = ( ; -! ); b= ( !)
3. ( l :) 4 ( -7
· 12
-5 )
9 6. A = ( ~ ~ ) ; b = ( -~ )
In Exercises 7 to 10, use row operations to find the A square matrix Q is orthogonal if its column vectors,
inverse of the given matrix, or to show that its inverse or equivalently its row vectors, are mutually perpendic-
doesn't exist. ular and all have length 1. In Exercises 26 to 29, show
that the given matrix is orthogonal.
1/../2 -1/../2 )
26. ( 1../2 1/./2
cos0 - sin0 )
1~ ) 27
• ( sin0 cos0
-7
0n
ad - be, is not zero.)
0 ;)
0 -1 (b) Try to find the inverses of the following 2-by-2
17. 2 18. -1 matrices using Formula 4.1.
0 0
19. u D w.o 2
2
0
0
- 1
0
I
0
1
The transpose A of a matrix A == (aij) is defined
0 0 0)
2
0
0
0
3
0
0
0
4
What goes wrong in the third one?
33. A square matrix A is symmetric if At = A and skew
symmetric if At = -A, where A' is the transpose of A,
by A' = (aj;) . In Exercises 21 to 24, prove the stated defined just before Exercise 21.
equations.' (a) Show that if A is a square matrix, then A + At is
symmetric and that A - A 1 is skew symmetric.
21. (A 1 )1 =A 22. (AB)'= B' A' (b) Using part (a), show that every square matrix is the
23. (A + B) = A' + B'
1
24. (A') - 1 = (A-1)1 sum of a symmetric matrix and a skew-symmetric
25. Theorem 4.4 shows in particular that if A and B arc matrix.
square matrices mch that AB = I, then BA = I a1so, (c) Show that if A is invertible and symmetric, then
so A is invertiblc and A- 1 = B . Use this result to show A - I is symmetric. What if A is invertible and skew
that if we know only that BA = I then A- 1 = B by symmetric?
showing first that A' B 1 = I . (This is an alternative to the 34. This exercise shows that we can use matrix products to
proof given for Theorem 4.5.) carry out the elementary row operations on a matrix M
that we used in Section 2 for solving systems and in this the ilh and jth rows, a row rearrangement operation
section for inverting square matrices. on M, yields Tij M as a result.
(a) Let D; (r) be the matrix that equals the identity (d) A matrix D;(r) or (/ + rEij) or Tij is called an
matrix I of some given dimensions except that the elementary matrix. Show that each of the three
I in the ith diagonal entry is replaced by r. Show types of elementary operation is reversible in the
that the matrix product D; (r )M equals the result of sense of Definition 1.1 in Section I by verifying
multiplying the ith row of M by r. that, assuming i ¥- j, D 11 (r) = D;(l/r), (I+
(b) Let Eij be a matrix with 1 for its ijth entry and O's rEij)- 1 = (I - rEij), T;-;1 = Tj;.
elsewhere. For example, we have the 3-by-3 matrices
*35. Let p(x) = ao+a1x+- · +anx~ be a polynomial of degree
Eu = 00
( 0
00 1)
0 , and £21 = (01 00 0)
0 .
at most n with real coefficients. An algebra theorem says
that if there are more than n distinct values of x such
that p(x) = 0, then all its coefficients ak = 0. Use
0 0 0 0 0
this theorem to prove that if bo, .. . , b,. are scalars and
Show that if M is m-by-n while Eij and / are xo, . . . , Xn are distinct scalars, then there is exactly one
both m-by-m, then (/ + rEij)M is the elementary polynomial of degree at most n such that p(xk) = bk for
modification of M that we get by adding r times the k = 0, .. . , n. [Hint: Consider a system of linear equations
jth row to the ith row. with ao, ... , an as variables, rhen apply Theorem 4.4 to
(c) Let Tij be a matrix resulting from the interchange of the associated homogeneous system.]
the ith and jth rows of/. Show that interchanging
SECTION 5 DETERMINANTS
Determinants were originally invented as a device for solving systems of linear
equations, but they turned out to have both geometric and algebraic significance
which make them important in many fields of pure and applied mathematics. Apart
from this section, we use determinants in Chapters 3, 7, 11, and 13 in a variety of
contexts that will arise naturally there. A determinant is a scalar det A defined for
each square n-by-n matrix A. A common notation for the determinant of a matrix is
to replace the parentheses enclosing the matrix by vertical bars. Thus
5A Definition
In our definition of determinant we'll define det A first for 1-by-1 matrices, and then
for each n define the determinant of an n-by-n matrix in terms of determinants of
(n - 1)-by-(n - 1) submatrices called minors. For a matrix A, the matrix obtained
by deleting the ith row and jth column of A is called the ijth minor of A and is
denoted by $A_{ij}$. Recall that we use the small letter $a_{ij}$ to denote the ijth entry of
a matrix A. Thus the ijth minor $A_{ij}$ corresponds to the entry $a_{ij}$ in a natural way,
because the minor is obtained by deleting the row and column containing $a_{ij}$.
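EXAMPLE 1 Let
\[
A = \begin{pmatrix} -5 & -6 & 7 \\ 8 & -9 & 0 \\ -3 & 4 & 2 \end{pmatrix}
\qquad\text{and}\qquad
B = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}.
\]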
Section SA Determinants 89
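Some examples of entries and corresponding minors of A and B are
\[
\begin{aligned}
a_{11} &= -5, & A_{11} &= \begin{pmatrix} -9 & 0 \\ 4 & 2 \end{pmatrix}, \\
a_{23} &= 0, & A_{23} &= \begin{pmatrix} -5 & -6 \\ -3 & 4 \end{pmatrix}, \\
b_{11} &= 1, & B_{11} &= (4), \\
b_{12} &= 2, & B_{12} &= (3).
\end{aligned}
\]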
5.1
\[
\det A = a_{11}\det A_{11} - a_{12}\det A_{12} + \cdots + (-1)^{n+1} a_{1n}\det A_{1n}.
\]
The definition is inductive in the sense that the determinant of an n-by-n matrix A is
defined in terms of the determinants of the (n - 1)-by-(n - 1) minors $A_{ij}$. Starting
with the simple definition for the 1-by-1 case allows us to go on to 2-by-2, then
3-by-3, and so on. In words, the formula says that det A is the sum, with alternating
signs, of the elements of the first row of A, each multiplied by the determinant of
its corresponding minor. For this reason the numbers
\[
\det A_{11},\ -\det A_{12},\ \dots,\ (-1)^{n+1}\det A_{1n}
\]
are called the cofactors of the corresponding elements of the first row of A. In
general, the cofactor of the entry $a_{ij}$ in A is defined to be $(-1)^{i+j}\det A_{ij}$. Thus in
Example 1 the entry $a_{23} = 0$ in the matrix A has cofactor
\[
(-1)^{2+3}\det\begin{pmatrix} -5 & -6 \\ -3 & 4 \end{pmatrix} = 38.
\]
The factor $(-1)^{i+j}$ associates plus and minus signs with $\det A_{ij}$ according to the
pattern
\[
\begin{pmatrix}
+ & - & + & - & \cdots \\
- & + & - & + & \cdots \\
+ & - & + & - & \cdots \\
- & + & - & + & \cdots \\
\vdots & \vdots & \vdots & \vdots &
\end{pmatrix}
\]
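\[
\begin{aligned}
\det\begin{pmatrix} -5 & -6 & 7 \\ 8 & -9 & 0 \\ -3 & 4 & 2 \end{pmatrix}
&= -5\det\begin{pmatrix} -9 & 0 \\ 4 & 2 \end{pmatrix} - (-6)\det\begin{pmatrix} 8 & 0 \\ -3 & 2 \end{pmatrix} + 7\det\begin{pmatrix} 8 & -9 \\ -3 & 4 \end{pmatrix} \\
&= (-5)(-18 - 0) + (6)(16 - 0) + 7(32 - 27) \\
&= 90 + 96 + 35 = 221
\end{aligned}
\]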
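The inductive definition 5.1 maps naturally onto a recursive function. The following Python sketch expands along the first row exactly as the definition does; the helper names `minor` and `det` are illustrative, and rows and columns are indexed from 0 in the code, so the sign $(-1)^{1+j}$ of the text becomes `(-1) ** j`.

```python
def minor(A, i, j):
    """The matrix obtained from A by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by expansion along the first row, as in Definition 5.1."""
    if len(A) == 1:
        return A[0][0]  # the 1-by-1 case starts the induction
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j))
               for j in range(len(A)))

A = [[-5, -6, 7], [8, -9, 0], [-3, 4, 2]]
print(det(A))                                 # determinant of A
print((-1) ** (1 + 2) * det(minor(A, 1, 2)))  # cofactor of the entry in row 2, column 3
```

This approach is fine for small matrices, but the number of terms grows like n!, which is one reason row operations are usually preferred for large n.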
(c) $\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc$, in agreement with the definition given earlier in
connection with Formula 4.1 for inverting a 2-by-2 matrix. Stated in words,
$\det\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is the product of the entries on the main diagonal minus the
product of the other two entries. We can often compute 2-by-2 determinants
mentally, and consequently find 3-by-3 determinants in one or two lines.
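\[
(a, b, 0) \times (c, d, 0) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a & b & 0 \\ c & d & 0 \end{vmatrix} = \det\begin{pmatrix} a & b \\ c & d \end{pmatrix}\mathbf{k}.
\]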
The length of this vector is the absolute value of the 2-by-2 determinant. On the
other hand, from Chapter 1 we know that the length of the cross product equals the
area A(P) of the parallelogram P with the vectors (a, b) and (c, d) in the xy-plane
for adjacent sides. Thus
\[
\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \pm A(P).
\]
Since we can interchange rows and columns in the 2-by-2 matrix without changing
the determinant, A(P) is also equal to the area of the parallelogram with the vectors
(a, c) and (b, d) for adjacent edges. In either case the sign is "+" if the angle from
the first vector to the second is positive, and "-" otherwise.
Instead of the signed area we get in the 2-by-2 case, the determinant of a 3-by-3
matrix is the signed volume V(P) of a solid region P called a parallelepiped,
having congruent parallelograms for opposite faces. To see this, recall from Chapter
1, Section 6 that the signed volume of a parallelepiped P with u, v, and w for adjacent
edges is equal to a scalar triple product of the three vectors:
\[
\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) = (u_1\mathbf{i} + u_2\mathbf{j} + u_3\mathbf{k}) \cdot \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} = \det\begin{pmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{pmatrix}.
\]
Hence $\det\begin{pmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{pmatrix} = \pm V(P)$, where the sign is "+" if the three
vectors u, v, w in the given order form a right-hand system and is "-" otherwise.
Section 58 Determinants 91
5B Row and Column Expansions
It is an important fact, the proof of which we omit, that if in Equation 5.1 the elements
and cofactors of the first row are replaced by the elements and cofactors of any other
row, or of any column, then the expansion is still valid. Here is the formal statement.
5.2 Theorem. If A is a square matrix, then
\[
\det A = \sum_{j=1}^{n} (-1)^{i+j} a_{ij}\det A_{ij} \qquad \text{(expansion by ith row)}
\]
and
\[
\det A = \sum_{i=1}^{n} (-1)^{i+j} a_{ij}\det A_{ij} \qquad \text{(expansion by jth column).}
\]
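\[
\begin{aligned}
\det\begin{pmatrix} -5 & -6 & 7 \\ 8 & -9 & 0 \\ -3 & 4 & 2 \end{pmatrix}
&= -(8)\det\begin{pmatrix} -6 & 7 \\ 4 & 2 \end{pmatrix} + (-9)\det\begin{pmatrix} -5 & 7 \\ -3 & 2 \end{pmatrix} - (0)\det\begin{pmatrix} -5 & -6 \\ -3 & 4 \end{pmatrix} \\
&= -8(-12 - 28) - 9(-10 + 21) = 221.
\end{aligned}
\]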
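Computing this determinant using the elements and cofactors of the third column,
which equals the expansion of the transposed matrix by the third row, we get
\[
\begin{aligned}
\det\begin{pmatrix} -5 & -6 & 7 \\ 8 & -9 & 0 \\ -3 & 4 & 2 \end{pmatrix}
&= \det\begin{pmatrix} -5 & 8 & -3 \\ -6 & -9 & 4 \\ 7 & 0 & 2 \end{pmatrix} \\
&= 7\det\begin{pmatrix} 8 & -9 \\ -3 & 4 \end{pmatrix} - (0)\det\begin{pmatrix} -5 & -6 \\ -3 & 4 \end{pmatrix} + 2\det\begin{pmatrix} -5 & -6 \\ 8 & -9 \end{pmatrix} \\
&= 7(32 - 27) + 2(45 + 48) = 221.
\end{aligned}
\]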
5C Basic Properties
We can always evaluate a determinant using the definition, but this involves a lot of
arithmetic if the dimensions of the matrix are at all large. Some of the theorems we
prove will justify other methods of calculation that usually work better than row or
column expansions if n is greater than 2 or 3.
5.3 Theorem. If B is obtained from A by multiplying some row or column
by a number r, then det B = r det A. If A has a zero row or column, then
det A = 0.
Proof. If the ith row of A is multiplied by r, the expansion of det B by the ith row is
\[
\det B = \sum_{j=1}^{n} (-1)^{i+j} (r a_{ij}) \det A_{ij} = r \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \det A_{ij} = r \det A.
\]
A similar argument using a column expansion proves the column version. The inductive
expansion by a zero row or column gives zero, so det A = 0 in that case. •
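EXAMPLE 4 Let
\[
A = \begin{pmatrix} 1 & 2 & 3 \\ -1 & 2 & 4 \\ 0 & 1 & 2 \end{pmatrix}, \qquad
B = \begin{pmatrix} 1 & 2r & 3 \\ -1 & 2r & 4 \\ 0 & r & 2 \end{pmatrix}.
\]
Notice that B is obtained by multiplying the second column of A by r. The theorem
says that
\[
\det B = \det\begin{pmatrix} 1 & 2r & 3 \\ -1 & 2r & 4 \\ 0 & r & 2 \end{pmatrix} = r\det\begin{pmatrix} 1 & 2 & 3 \\ -1 & 2 & 4 \\ 0 & 1 & 2 \end{pmatrix} = r \det A.
\]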
5.4 Theorem. Let A, B, and C be matrices that are identical except in one row
or column; and suppose that in the exceptional row or column the entries in C
are the sums of the corresponding entries in A and B. Then det C = det A +
det B.
Proof. Suppose the special row or column is the jth column. Then expansion using
that column gives
n
1
-1 2 4 2 3) , B = ( -1I
0 1 2 0
-1
5 3)
3 4 .
0 -1 2
The matrices A, B, and C are identical except in the second column, and that column
of C is the sum of the corresponding columns of A and B. We have
Sign Changes. Another basic property of determinants is that if two rows (or
columns) of a matrix A are interchanged then det A changes sign. This property is
sometimes expressed by saying that the determinant is alternating as a function
of its rows (or of its columns). For 2-by-2 matrices, the property amounts to the
observation that
\[
\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc = -\det\begin{pmatrix} c & d \\ a & b \end{pmatrix} = -\det\begin{pmatrix} b & a \\ d & c \end{pmatrix}.
\]
Proof. We proceed inductively, assuming that the theorem has been proved for
(n - 1)-by-(n - 1) matrices, and proving it for n-by-n matrices. We have already
observed that the theorem is true for 2-by-2 matrices. Assuming $n \geq 3$, we expand
det A using some row different from the two that are to be interchanged, say the kth
row. Then
\[
\det A = \sum_{j=1}^{n} (-1)^{k+j} a_{kj} \det A_{kj}.
\]
But interchanging two rows different from the kth in A interchanges two rows different
from the kth in each (n - 1)-by-(n - 1) minor $A_{kj}$. Since all determinants
$\det A_{kj}$ then change sign by the induction assumption, and nothing else changes in
the expansion, it follows that det A changes sign. A similar argument using a column
expansion proves the column version. Finally, if A has two rows or columns
proportional, we can factor out a proportionality constant r, and get det A = r det B,
where B has two rows or columns the same. Then det B = -det B so det B = 0.
Hence det A = r det B = 0. •
-1
2
0
-6 -3
1
j)
has its first and third rows proportional, with the third being (-3) times the first. We
verify, expanding by the second row, that
= (-9 + 9) - 5(-6 + 6) = 0.
5D Computing Determinants
The following fact is useful in computing the determinants of large matrices.
5.6 Theorem. Adding a scalar multiple of one row (or column) of A to another
leaves det A unchanged.
Proof. Let the ith row or column of A be the one affected by adding r times the
kth row or column, and denote by C the modified matrix. Then we look at det C as
a function of its rows (or columns) $c_1, \dots, c_i, \dots, c_n$, so that $c_i = a_i + r a_k$.
By Theorem 5.4,
\[
\det C = \det(a_1, \dots, a_i + r a_k, \dots, a_n) = \det(a_1, \dots, a_i, \dots, a_n) + \det(a_1, \dots, r a_k, \dots, a_n).
\]
The last matrix has two rows or columns proportional, so its determinant is zero.
Thus det C = det A, as was to be shown. •
Section 5D Determinants 95
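Let
\[
A = \begin{pmatrix} 1 & 3 & -2 \\ 2 & -4 & 1 \\ 3 & 5 & -2 \end{pmatrix}, \qquad
C = \begin{pmatrix} 1 & 3 & 0 \\ 2 & -4 & 5 \\ 3 & 5 & 4 \end{pmatrix}.
\]
The third column of C is equal to the third column of A plus 2 times the first column.
We compute det A = -20 and det C = -20, in agreement with Theorem 5.6.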
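The three basic effects of row and column operations on a determinant (Theorems 5.3 and 5.6 and the sign-change property) are easy to spot-check numerically. The sketch below uses a first-row expansion for `det`; the sample matrix and the variable names `C`, `S`, and `M` are illustrative assumptions, and a check on one matrix is of course not a proof.

```python
def det(A):
    """Determinant by expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    total = 0
    for j in range(len(A)):
        sub = [row[:j] + row[j + 1:] for row in A[1:]]  # minor of entry (1, j+1)
        total += (-1) ** j * A[0][j] * det(sub)
    return total

A = [[1, 3, -2], [2, -4, 1], [3, 5, -2]]

# Add 2 times the first column to the third column: determinant unchanged.
C = [[r[0], r[1], r[2] + 2 * r[0]] for r in A]
# Interchange the first two rows: determinant changes sign.
S = [A[1], A[0], A[2]]
# Multiply the second row by 5: determinant is multiplied by 5.
M = [A[0], [5 * x for x in A[1]], A[2]]

print(det(A), det(C), det(S), det(M))
```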
\[
A = \begin{pmatrix} 2 & 4 & -1 & 0 \\ 3 & 0 & 2 & 3 \\ -1 & 2 & 3 & 1 \\ 0 & 1 & -2 & -1 \end{pmatrix}.
\]
; ) t
-1
and by Theorem 5.6, det A = det B. The expansion of det B has only one nonzero
term and we get
det B = (- I) det ( ;
-4
1
-7
! i) -I
=-det ;
-4 -3
! i) -1
[subtracting column 1 from column 2)
Theorems 5.3, 5.5, and 5.6 describe the effect on det A of the elementary row
operations of Section 2A and of the interchange of two rows or columns:
Proof. Suppose that A has been put in reduced form R by elementary row operations,
so that det A = k det R for some constant $k \neq 0$. If $\det A \neq 0$, then R cannot
have a zero row. A reduced square matrix R with no zero row is equivalent to I,
so A is invertible. If det A = 0, then R contains a zero row, and A is not invertible,
because the equivalent systems Ax = 0 and Rx = 0 have nonzero solutions. •
5.8 Theorem. If A is invertible, then
\[
A^{-1} = \frac{1}{\det A}\,\tilde{A}^{t}.
\]
if k = i,
(I)
if k =I- i.
If k = i, the left side of Equation (1) is just the expansion of det A by the elements of
the ith row, so we get det A. If $k \neq i$, we can still regard the sum as an expansion
by the ith row of the determinant of a matrix in which the ith row is the same as
the kth row, thus giving determinant 0 by the last part of Theorem 5.5.
To finish the proof we look at the kith entry in the matrix product $A\tilde{A}^{t}$:
if k = i,
if k =I- i,
Section SE Determinants 97
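the last two equalities following from the definition of $\tilde{a}_{ij}$ and Equation (1). Hence
$A\tilde{A}^{t} = (\det A)I$, so dividing by det A gives $A\bigl((\det A)^{-1}\tilde{A}^{t}\bigr) = I$. Now apply $A^{-1}$
to both sides on the left to get $(\det A)^{-1}\tilde{A}^{t} = A^{-1}$. •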
EXAMPLE 10 To invert the matrix
\[
A = \begin{pmatrix} 2 & 3 & 4 \\ 5 & 6 & 7 \\ 8 & 9 & 0 \end{pmatrix}
\]
we first form the matrix
\[
\begin{pmatrix} -63 & -56 & -3 \\ -36 & -32 & -6 \\ -3 & -6 & -3 \end{pmatrix}
\]
with entries $\det A_{ij}$; thus $\det A_{11} = \begin{vmatrix} 6 & 7 \\ 9 & 0 \end{vmatrix} = -63$, and $\det A_{12} = \begin{vmatrix} 5 & 7 \\ 8 & 0 \end{vmatrix} = -56$.
To get the matrix of cofactors insert the factors $(-1)^{i+j}$, changing the sign of
every second entry and giving
\[
\begin{pmatrix} -63 & 56 & -3 \\ 36 & -32 & 6 \\ -3 & 6 & -3 \end{pmatrix}.
\]
Finally, transpose this matrix by reflecting across its main diagonal, then divide by
det A = 30, found for example by expanding det A by the last row. The result is
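\[
A^{-1} = \frac{1}{30}\begin{pmatrix} -63 & 36 & -3 \\ 56 & -32 & 6 \\ -3 & 6 & -3 \end{pmatrix}
= \begin{pmatrix} -\tfrac{21}{10} & \tfrac{6}{5} & -\tfrac{1}{10} \\ \tfrac{28}{15} & -\tfrac{16}{15} & \tfrac{1}{5} \\ -\tfrac{1}{10} & \tfrac{1}{5} & -\tfrac{1}{10} \end{pmatrix}.
\]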
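The cofactor method of Theorem 5.8 can be sketched as follows for matrices with integer entries of size at least 2-by-2. The function names `det` and `cofactor_inverse` are illustrative assumptions, and the sample matrix is one whose determinant is 30; exact fractions are used so that the entries can be compared with a hand computation.

```python
from fractions import Fraction

def det(A):
    """Determinant by expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def cofactor_inverse(A):
    """Inverse of an integer matrix (n >= 2) as (1/det A) times the transposed
    cofactor matrix; returns None when det A = 0."""
    n = len(A)
    d = det(A)
    if d == 0:
        return None

    def minor(i, j):
        return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

    # Matrix of cofactors: signed determinants of the minors.
    cof = [[(-1) ** (i + j) * det(minor(i, j)) for j in range(n)] for i in range(n)]
    # Transpose the cofactor matrix and divide every entry by det A.
    return [[Fraction(cof[j][i], d) for j in range(n)] for i in range(n)]

A = [[2, 3, 4], [5, 6, 7], [8, 9, 0]]
print(det(A))
for row in cofactor_inverse(A):
    print(row)
```

For large matrices this method needs far more arithmetic than the row-operation process of Section 4B, so it is mainly useful for small matrices or for theoretical purposes.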
If a square matrix A is invertible, the system of linear equations that the vector
equation Ax = b represents has the unique solution $x = A^{-1}b$. We can combine
this solution formula with the formula for $A^{-1}$ in the previous Theorem 5.8 to get
$x = (\det A)^{-1}\tilde{A}^{t} b$. This formula leads to the following rule.
5.9 Cramer's Rule. If $\det A \neq 0$ and $x = (x_1, \dots, x_n)$, then the jth coordinate
of the solution of Ax = b is
\[
x_j = \frac{\det B^{(j)}}{\det A},
\]
where $B^{(j)}$ is A with its jth column replaced by the entries $b_1, \dots, b_n$ in b.
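Proof. As we noted before, Cramer's rule follows from the formula $x = (\det A)^{-1}\tilde{A}^{t} b$. The
jith entry in the matrix $\tilde{A}^{t}$ is $\tilde{a}_{ij} = (-1)^{i+j}\det A_{ij}$. Thus
\[
x_j = \frac{1}{\det A}\sum_{i=1}^{n} \tilde{a}_{ij} b_i = \frac{1}{\det A}\sum_{i=1}^{n} (-1)^{i+j} b_i \det A_{ij}.
\]
But the sum on the right is just the expansion of $\det B^{(j)}$ by the jth column. •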
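\[
\begin{aligned}
x_1 - 2x_2 + 4x_3 &= 1 \\
-x_1 + x_2 - x_3 &= 2 \\
2x_1 + 3x_2 - x_3 &= 3.
\end{aligned}
\]
The relevant matrices are
\[
A = \begin{pmatrix} 1 & -2 & 4 \\ -1 & 1 & -1 \\ 2 & 3 & -1 \end{pmatrix}, \qquad
B^{(1)} = \begin{pmatrix} 1 & -2 & 4 \\ 2 & 1 & -1 \\ 3 & 3 & -1 \end{pmatrix},
\]
\[
B^{(2)} = \begin{pmatrix} 1 & 1 & 4 \\ -1 & 2 & -1 \\ 2 & 3 & -1 \end{pmatrix}, \qquad
B^{(3)} = \begin{pmatrix} 1 & -2 & 1 \\ -1 & 1 & 2 \\ 2 & 3 & 3 \end{pmatrix}.
\]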
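Cramer's rule can be sketched in a few lines once a determinant function is available. The names `det` and `cramer` below are illustrative assumptions, integer entries are assumed so that exact fractions can be formed, and the sample system is a 3-by-3 one with nonzero determinant.

```python
from fractions import Fraction

def det(A):
    """Determinant by expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def cramer(A, b):
    """Solve Ax = b by Cramer's rule; returns None when det A = 0."""
    d = det(A)
    if d == 0:
        return None
    solution = []
    for j in range(len(A)):
        # B^(j): A with its jth column replaced by the entries of b.
        Bj = [row[:j] + [b[i]] + row[j + 1:] for i, row in enumerate(A)]
        solution.append(Fraction(det(Bj), d))
    return solution

A = [[1, -2, 4], [-1, 1, -1], [2, 3, -1]]
b = [1, 2, 3]
print(cramer(A, b))
```

Substituting the returned values back into each equation is a quick way to confirm the result by hand.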
6· ( i i ~ 2i)
In Exercises 1 and 2, evaluate det A and also det(2A).
I -2 3 )
1. A= 3 1 4 1 4 16 64
( 5 6 7
1 0 1 0)
0 3 1 4
The product rule for determinants states that if A and
B are square matrices with the same dimensions, then
2· .4 = (- 1 1 1
-1
4 0
2 3
det(A B) = det(A) det(B). In Exercises 7 and 8, find
A B , BA, and the determinants of A, B, AB, and BA,
and verify that the product rule holds for these examples.
3. What is the relation between dct A and dct(2A)? (Sec
Exercises 1 and 2.)
B=(O2 -3l)
4. What is the relation between detA and det(-A)?
s. ( - ~
2 -1
! _i -!l )
0
9. Apply the product rule of the preceding exercises to show
=
that if A is invertible, then det A -::f. 0 and det(A- 1)
(detA)- 1•
('''°"
10. Show that if D is the diagonal matrix diag(d1, .. . , dn), -e 1 sint
)
0
in the notation used in Theorem 4.6 in the previous 22. 1
e sin t e1 cost 0
section, then det D is the product of the diagonal elements, 0 0 e3r
d1d2 · · · dn. In particular, det I = 1.
11. Compute the determinant of the matrix
(i -r i -~).
0 0 0 4
23.
(f
4
0
2 n_;)
4
12. Show that for an upper triangular matrix like the one in
Exercise 11, in which every element below the diagonal is
24.
(2:' -t
2
0, the determinant is equal to the product of the diagonal 25. Show that the cross product of vectors u = (u 1, u 2 , u 3 )
elements. and v = (v1, v2, v3) has the form of a 3-by-3 determinant:
uH)
of vectors u =
(u1, u2, u3), v = (v1, v2, v3), w =
(w1, w2, w3) is expressible as a 3-by-3 detenninant:
IS. ( : j j) 16.
u1
= det
HD O=! :)
u • (v x w) v1
(
Wt
17. ( 18.
n
27. Let A be an m-by-m matrix and B an n-by-n matrix.
2
2
-1
0
02 0I O)
0 Consider the (m + n)-by-(m + n) matrix ( ~ ~ ).
0 I 0 3 0 which has A in the upper left corner, B in the lower
0 0 0 0 4 right comer, and zeros elsewhere; show that its deter-
minant is equal to (det A)(det B). [Hint: Consider the
In Exercises 21 to 24, use Theorem 5.7 to determine for
cases A = I and B = I. Then use the product rule of
which values of the real variable t the given matrix fails
Exercise 7.]
to have an inverse.
21.
1-t
0
2 0)
2- t 5
28. Explain how Formula 4.1 for inverting 2-by-2 mat1ices
is a special case of Theorem 5.8 for inverting n-by-n
( 0 0 3-t matrices.
Chapter 2 REVIEW
In Exercises l to 8 let
E =
( -1
2
0
-1 )
-4 , F =
( 0
l
-3)2 ,
4 1 3 -2 0
B=( -2 -1
c-(1 -3 -1 ) 2 0 -2) and evaluate each of the following expressions, or else explain
- 2 0 , D=
( I O
0 0
5
3
, why it's not defined.
1. A+ B 2. AB 3. B+2C
b~ ~)
nomial q (t) and diagonal matrix A?
22. Convert the matrix ( to reduced form by 37. (a) Show that if E and F are diagonal matrices of the
1 1 2
same dimensions, then E F = FE.
elementary row operations. (The operations allowed at
(b) Let D = diag(a, b, c) and write down DA and AD,
each stage may be different for a few special values of s.
where A is the matrix in part (c) of Exercise 8.
Consider different cases if necessary.) For which values
Describe in words the effect of multiplying a matrix
of s is the matrix invertible?
by a diagonal matrix, when the diagonal matrix is
23. Write the system of equations on the left, and when it is on the right.
(c) Assuming a, b, and c are all different, and D =
2x+ y-z=O diag(a, b, c), what can you say about matrices B
3y - z = h such that DB= BD? What if a = b = c? What if
X - y = I a= h I- c?
38. Let p(x) be the polynomial a+bx+cx 2 • By setting up and
in matrix form, and apply row operations to get an solving a system of linear equations, find values for the
equivalent system with a reduced coefficient matrix. Give coefficients a, b, c so that p(-1) = 1, p(0) = 0, p(l) =
one solution for each value of h for which a solution I. Given numbers r, s, t is it always possible to find values
exists. for a, b, c to make p(-1) = r, p(0) = s, p(l) = t?
39. Let f(x) = a sinx + h cosx + c. Find a, b, and c so that 41. (1, 2, 1, 2), (1, 2, 3, 4), and (1, 0, 0, 1)
/(0) = 1, J'(O) = 2, and f"(0) = 3.
42. (1, 2, 1, 2), (1, 2, 3, 4), and (0, 0, 1, 1)
40. Let K be the plane through (1, 2, 3), (-1, 5, 2), and
(2, -6, 10), and let L be the plane with equation 2x - Evaluate the detenninants of the matrices given in Exer-
3y +z = 2. cises 43 to 46.
u n 0 -n
(a) Find an equation for K by solving a system of linear
2 2
equations, without using the cross product.
43. 1 44. 1
(b) Find the intersection of K and L with the plane
0 0
that has equation x + 2y + az = 0. (The result will
n (J
depend on a.) Are there values of a for which the
2 0 0 2
three planes do not intersect?
In Exercises 41 and 42, find all vectors in JR. 4 that are
perpendicular to the given vectors.
45.
(-i 1
-3
0
2
0
-1
46.
-3
-2
0
0
1
-2 D
CHAPTER 3
VECTOR SPACES
AND LINEARITY
(OPTIONAL CHAPTER)
This chapter extends and generalizes the linear algebra developed in Chapters 1 and
2, but the additional material isn't used in later chapters.
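\[
\begin{aligned} 2x + 3y &= 5 \\ x - y &= 3 \end{aligned}
\]
is equivalent to the matrix equation Ax = b with
\[
A = \begin{pmatrix} 2 & 3 \\ 1 & -1 \end{pmatrix}, \qquad x = \begin{pmatrix} x \\ y \end{pmatrix}, \qquad b = \begin{pmatrix} 5 \\ 3 \end{pmatrix}.
\]
\[
\begin{aligned} z &= 2x + 3y \\ w &= x - y \end{aligned}
\qquad\text{or}\qquad
\begin{pmatrix} z \\ w \end{pmatrix} = \begin{pmatrix} 2 & 3 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}.
\]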
1A Matrix Representation
There's a close connection between systems of linear equations, either in scalar or
matrix form as illustrated above, and the natural generalization of the function y = ax
from $\mathbb{R}$ to $\mathbb{R}$ to functions from $\mathbb{R}^n$ to $\mathbb{R}^m$. If A is an m-by-n matrix, setting f(x) = Ax
defines a function y = f(x) from $\mathbb{R}^n$ to $\mathbb{R}^m$. We proved in Theorem 2.3 of Chapter 2,
Section 2C that a matrix-vector product Ax has the property we called linearity,
meaning that for a given linear combination sx + ty we have A(sx + ty) = sAx + tAy.
In terms of the function f(x) = Ax, this says
1.1
\[ f(sx + ty) = s f(x) + t f(y), \]
and a function f from $\mathbb{R}^n$ to $\mathbb{R}^m$ is called linear if it satisfies Equation 1.1. Repeated
application of Equation 1.1 shows that a linear function f(x) always satisfies the
more general condition
Section 1A Linear Functions on JR" 103
1.2
\[ f(t_1 x_1 + t_2 x_2 + \cdots + t_k x_k) = t_1 f(x_1) + t_2 f(x_2) + \cdots + t_k f(x_k). \]
The rest of this section, and indeed this chapter, is about understanding linearity.
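EXAMPLE 1 If A is the m-by-n matrix $(a_{ij})$, the following systems define the same function f(x) = Ax:
\[
\begin{aligned}
y_1 &= a_{11}x_1 + \cdots + a_{1n}x_n \\
&\ \ \vdots \\
y_m &= a_{m1}x_1 + \cdots + a_{mn}x_n
\end{aligned}
\qquad\text{and}\qquad
\begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix} =
\begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}
\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.
\]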
The next theorem shows that as a consequence of Equation 1.1, all linear functions
y = f (x) from ]Rn to ]Rm are expressible in the two forms of the previous example,
in other words in the form f (x) = Ax.
1.3 Theorem. If f is a linear function from $\mathbb{R}^n$ to $\mathbb{R}^m$, and A is the m-by-n
matrix whose columns are the vectors $f(e_1), \dots, f(e_n)$, then f(x) = Ax for every
x in $\mathbb{R}^n$.
Proof. The product of a matrix and the standard basis vector $e_j$ always gives the jth
column of the matrix, so the definition of A implies $Ae_j = f(e_j)$ for $j = 1, \dots, n$.
Now consider an arbitrary vector $x = (x_1, \dots, x_n) = x_1 e_1 + \cdots + x_n e_n$ in $\mathbb{R}^n$.
To check that f(x) = Ax, we use the linearity of f and the linearity of matrix
multiplication (Theorem 2.3 in Chapter 2). Since we can write $x = x_1 e_1 + \cdots + x_n e_n$,
\[
f(x) = x_1 f(e_1) + \cdots + x_n f(e_n) = x_1 Ae_1 + \cdots + x_n Ae_n = A(x_1 e_1 + \cdots + x_n e_n) = Ax,
\]
so f(x) = Ax. •
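For an example of how the theorem works in a particular case, suppose f is a linear
function from $\mathbb{R}^3$ to $\mathbb{R}^2$ with
\[
f(e_1) = \begin{pmatrix} 2 \\ 1 \end{pmatrix}, \qquad f(e_2) = \begin{pmatrix} 3 \\ -1 \end{pmatrix}, \qquad f(e_3) = \begin{pmatrix} -4 \\ 2 \end{pmatrix}.
\]
Then
\[
f(x) = \begin{pmatrix} 2x_1 \\ x_1 \end{pmatrix} + \begin{pmatrix} 3x_2 \\ -x_2 \end{pmatrix} + \begin{pmatrix} -4x_3 \\ 2x_3 \end{pmatrix}
= \begin{pmatrix} 2x_1 + 3x_2 - 4x_3 \\ x_1 - x_2 + 2x_3 \end{pmatrix}
= \begin{pmatrix} 2 & 3 & -4 \\ 1 & -1 & 2 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}.
\]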
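Theorem 1.3 suggests a simple recipe for finding the matrix of a linear function: evaluate the function on the standard basis vectors and use the results as columns. The following sketch illustrates this; the names `matrix_of` and `apply`, and the particular function `f`, are assumptions chosen for the illustration.

```python
def matrix_of(f, n):
    """Matrix whose jth column is f(e_j), for a function f taking lists of length n."""
    columns = [f([1 if i == j else 0 for i in range(n)]) for j in range(n)]
    m = len(columns[0])
    # Reassemble the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def apply(A, x):
    """Matrix-vector product Ax, with A a list of rows and x a list."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def f(x):
    """A sample linear function from R^3 to R^2."""
    x1, x2, x3 = x
    return [2 * x1 + 3 * x2 - 4 * x3, x1 - x2 + 2 * x3]

A = matrix_of(f, 3)
print(A)
print(apply(A, [1, 2, 3]), f([1, 2, 3]))  # the two results agree
```

The recipe reproduces f only when f really is linear; for a nonlinear function the matrix built this way would agree with f on the basis vectors but not in general.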
Note. In the context of real-valued functions of one variable the term "linear function"
often means a function of the form f(x) = ax + b, because such functions
have straight line graphs in the xy-plane. Such functions are linear in the present
context of functions with vector variables only if b = 0. In this book the term "linear
function" will always mean a function satisfying Equation 1.1.
Before looking at more examples of linear functions, we introduce some termi-
nology that's useful for talking about functions in general, whether or not they are
linear. We revisit these terms at the beginning of Chapter I applied to nonlinear
functions.
A function $f$ is defined on a set $D$ called the domain of $f$ and takes its values within some set $R$ called the range of $f$. Thus for every $x$ in $D$, $f(x)$ is some $y$ in $R$. We say that $f$ is a function from $D$ to $R$, and use the notation
$$f: D \to R.$$
Defining $f(x)$ to be the scalar multiple $x(1, 2, 1)$ gives a function $f: \mathbb{R}^1 \to \mathbb{R}^3$ that is linear because the coordinates of the image of $f$ in $\mathbb{R}^3$ are $y_1 = x$, $y_2 = 2x$, $y_3 = x$, and these have the simplest possible form of the first display in Example 1. The image of $f$ consists of all scalar multiples of $(1, 2, 1)$ and is a line through the origin in $\mathbb{R}^3$ in the direction of the vector $(1, 2, 1)$. If $E$ is the interval $[0, 1]$, then $f(E)$, the image of $E$ under $f$, is the line segment joining $(0, 0, 0)$ and $(1, 2, 1)$.
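EXAMPLE. Let $f: \mathbb{R}^2 \to \mathbb{R}^3$ be the linear function such that $f\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 4 \end{pmatrix}$ and $f\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}$. The matrix of $f$ is $\begin{pmatrix} 1 & -1 \\ 2 & 0 \\ 4 & 1 \end{pmatrix}$, and for arbitrary $(s, t)$ in $\mathbb{R}^2$, we have
$$f\begin{pmatrix} s \\ t \end{pmatrix} = \begin{pmatrix} 1 & -1 \\ 2 & 0 \\ 4 & 1 \end{pmatrix} \begin{pmatrix} s \\ t \end{pmatrix} = \begin{pmatrix} s - t \\ 2s \\ 4s + t \end{pmatrix}.$$
The image $F$ of $f$ is the plane through the origin in $\mathbb{R}^3$ containing the vectors $(1, 2, 4)$ and $(-1, 0, 1)$.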
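For a linear function with a different kind of geometric interpretation, consider the function from $\mathbb{R}$ to $\mathbb{R}$ that sends $x$ to $2x$ and has the geometric effect of stretching the real number line by the factor 2. In two dimensions we can define a linear function $f: \mathbb{R}^2 \to \mathbb{R}^2$ that stretches horizontal distances by a factor of 3 and vertical distances by a factor of 2. To do this we need $f(\mathbf{e}_1) = 3\mathbf{e}_1$ and $f(\mathbf{e}_2) = 2\mathbf{e}_2$, so $f$ is represented by a diagonal matrix as
$$f(\mathbf{x}) = \begin{pmatrix} 3 & 0 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 3x_1 \\ 2x_2 \end{pmatrix}.$$
Figure 3.1 shows the geometric effect of $f$, where $C$ is the unit circle $x_1^2 + x_2^2 = 1$, and $f(C)$, the image of $C$ under $f$, is an ellipse. If $\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = \begin{pmatrix} 3x_1 \\ 2x_2 \end{pmatrix}$ is the image of $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ under $f$, then $x_1 = \tfrac{1}{3}u_1$ and $x_2 = \tfrac{1}{2}u_2$. Hence if $\begin{pmatrix} u_1 \\ u_2 \end{pmatrix}$ is in $f(C)$, then
$$\frac{u_1^2}{9} + \frac{u_2^2}{4} = 1;$$
this is the equation of the ellipse with semimajor axis 3 and semiminor axis 2 shown in Figure 3.1.

FIGURE 3.1 Unequal expansions.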
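As a numerical sanity check of the ellipse equation, the following sketch maps a few points of the unit circle through the stretching function and tests the equation $u_1^2/9 + u_2^2/4 = 1$. The names `stretch` and `on_ellipse`, the choice of 12 sample points, and the tolerance are assumptions made for illustration.

```python
import math


def stretch(x):
    """f(x1, x2) = (3*x1, 2*x2): horizontal factor 3, vertical factor 2."""
    return (3 * x[0], 2 * x[1])


def on_ellipse(u, tol=1e-12):
    """True if u satisfies u1^2/9 + u2^2/4 = 1 up to rounding error."""
    return abs(u[0] ** 2 / 9 + u[1] ** 2 / 4 - 1) < tol


# Twelve sample points on the unit circle C.
circle = [(math.cos(2 * math.pi * k / 12), math.sin(2 * math.pi * k / 12))
          for k in range(12)]
images = [stretch(p) for p in circle]
all_on = all(on_ellipse(u) for u in images)
print(all_on)
```

A finite sample cannot prove the statement, of course; the algebraic substitution $x_1 = u_1/3$, $x_2 = u_2/2$ in the text is what does that.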
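The projections of one vector on another defined in Chapter 1, Section 4C are geometric examples of linear functions $f: \mathbb{R}^n \to \mathbb{R}^n$. Let $\mathbf{n}$ be a unit vector in $\mathbb{R}^n$. Define the function $P_\mathbf{n}: \mathbb{R}^n \to \mathbb{R}^n$ by $P_\mathbf{n}(\mathbf{x}) = (\mathbf{x} \cdot \mathbf{n})\mathbf{n}$. Then $P_\mathbf{n}$ is linear because using properties of the dot product shows it satisfies Equation 1.1:
$$P_\mathbf{n}(s\mathbf{x} + t\mathbf{y}) = ((s\mathbf{x} + t\mathbf{y}) \cdot \mathbf{n})\mathbf{n} = s(\mathbf{x} \cdot \mathbf{n})\mathbf{n} + t(\mathbf{y} \cdot \mathbf{n})\mathbf{n} = sP_\mathbf{n}(\mathbf{x}) + tP_\mathbf{n}(\mathbf{y}).$$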
Rotations are another class of linear functions from a space to itself. We view a rotation of the plane around the origin as a function $f: \mathbb{R}^2 \to \mathbb{R}^2$ with $f(\mathbf{x})$ defined as the result of rotating $\mathbf{x}$ through an angle $\theta$, where both $f(\mathbf{x})$ and $\mathbf{x}$ are pictured as arrows with tails at the origin. To see that such a rotation is a linear function, recall the geometric interpretation of vector addition and scalar multiplication in Section 2 of Chapter 1. Figure 3.2(a) shows vectors $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w} = \mathbf{u} + \mathbf{v}$ and their images under $f$. By the parallelogram law for addition, the arrow representing $\mathbf{w}$ is the diagonal of the parallelogram with sides $\mathbf{u}$ and $\mathbf{v}$. The rotation carries the entire parallelogram to a congruent one with sides $f(\mathbf{u})$ and $f(\mathbf{v})$ and diagonal $f(\mathbf{w})$, so $f(\mathbf{w}) = f(\mathbf{u}) + f(\mathbf{v})$.
Similarly, Figure 3.2(b), in which $\mathbf{q}$ is a scalar multiple $s\mathbf{p}$ of $\mathbf{p}$, shows that $f(\mathbf{q}) = sf(\mathbf{p})$ because both $\mathbf{q}$ and $\mathbf{p}$ are rotated through the same angle, and their lengths are not changed.
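EXAMPLE 7. For a simple example of a rotation, consider turning the plane 90° counterclockwise around the origin. This takes $\mathbf{e}_1$ to $\mathbf{e}_2$ and $\mathbf{e}_2$ to $-\mathbf{e}_1$ as shown in Figure 3.3(a). The rotation is then a linear function $f: \mathbb{R}^2 \to \mathbb{R}^2$ with $f(\mathbf{e}_1) = \mathbf{e}_2$ and $f(\mathbf{e}_2) = -\mathbf{e}_1$, so its matrix has $\mathbf{e}_2$ and $-\mathbf{e}_1$ as columns. For a vector $\mathbf{u} = \begin{pmatrix} x \\ y \end{pmatrix}$,
$$f(\mathbf{u}) = f\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} -y \\ x \end{pmatrix}.$$
The image of a vector $\mathbf{u} = (x, y)$ under a 90° rotation should have the same length as $\mathbf{u}$ and be at right angles to it. We can check this algebraically by computing some dot products. We have $|f(\mathbf{u})|^2 = f(\mathbf{u}) \cdot f(\mathbf{u}) = (-y)^2 + x^2 = x^2 + y^2 = |\mathbf{u}|^2$, so $f(\mathbf{u})$ and $\mathbf{u}$ have the same length, and $f(\mathbf{u}) \cdot \mathbf{u} = -yx + xy = 0$ so $f(\mathbf{u})$ and $\mathbf{u}$ are perpendicular to each other.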
EXAMPLE 8. Consider rotating the plane counterclockwise around the origin through an arbitrary angle $\theta$. As shown in Figure 3.3(b) this rotation carries $\mathbf{e}_1$ and $\mathbf{e}_2$ to the vectors that form the columns of the matrix
$$R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$
Computing dot products as in the previous example shows that $R_\theta\mathbf{x}$ has the same length as $\mathbf{x}$, and the cosine of the angle between $R_\theta\mathbf{x}$ and $\mathbf{x}$ is equal to $\cos\theta$. (See Exercises 9 and 10.)
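The two claims about $R_\theta$ can be spot-checked numerically. The sketch below is illustrative only: the function names, the angle 0.7, and the test vector are arbitrary choices, and a numerical check at one angle is not a substitute for the dot product computation asked for in Exercises 9 and 10.

```python
import math


def rotate(theta, x):
    """Apply the rotation matrix R_theta to the vector x = (x1, x2)."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def length(a):
    return math.sqrt(dot(a, a))


theta = 0.7                 # arbitrary angle in radians
x = (3.0, -4.0)             # arbitrary vector of length 5
rx = rotate(theta, x)

# Length is preserved, and the angle between x and R_theta x is theta.
cos_angle = dot(rx, x) / (length(rx) * length(x))
print(length(x), length(rx))
print(cos_angle, math.cos(theta))
```

Setting `theta = math.pi / 2` reproduces the 90° rotation of Example 7 up to rounding error.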
Here is a general statement about the image of a linear function f (x) = Ax.
1.4 Theorem. Let $f: \mathbb{R}^n \to \mathbb{R}^m$ be a linear function with matrix $A$. Then the image of $f$ consists of all linear combinations of the columns of $A$.
EXAMPLE 9. To illustrate Theorem 1.4, consider a plane $P$ through the origin in $\mathbb{R}^3$. $P$ consists of all linear combinations of two vectors $\mathbf{y}_1$ and $\mathbf{y}_2$ in $\mathbb{R}^3$. You can check that the function $f: \mathbb{R}^2 \to \mathbb{R}^3$ defined by
$$f(s, t) = s\mathbf{y}_1 + t\mathbf{y}_2$$
is linear. Theorem 1.3 implies that the matrix of $f$ has as columns the two vectors $\mathbf{y}_1$ and $\mathbf{y}_2$.
For example, if
then
FIGURE 3.2 (a), (b)
Thus the image of $f$ in $\mathbb{R}^3$ is the plane $P$ determined by the two column vectors of the 3-by-2 matrix.
More generally, the definition of a k-plane containing the origin in $\mathbb{R}^n$ given in Section 2D of Chapter 2 amounts to saying that a k-plane through the origin is the image of a linear function $f: \mathbb{R}^k \to \mathbb{R}^n$ given by $f(\mathbf{x}) = A\mathbf{x}$, where $A$ is an n-by-k matrix with k linearly independent columns.
1B Composition
If $f$ and $g$ are two functions, not necessarily linear, such that the image of $f$ overlaps the domain of $g$, then the composition $g \circ f$ of $f$ and $g$ is defined to be the function obtained by first applying $f$ and then applying $g$:
$$(g \circ f)(\mathbf{x}) = g(f(\mathbf{x})).$$
The domain of $g \circ f$ consists of all $\mathbf{x}$ such that both $f(\mathbf{x})$ and $g(f(\mathbf{x}))$ are defined, and is the same as the domain of $f$ when the image of $f$ is contained in the domain of $g$.
The following theorem states an important connection between composition of linear functions and matrix multiplication that motivates the definition of matrix multiplication.

FIGURE 3.3 (a), (b)
108 Chapter 3 Vector Spaces and Linearity
1.5 Theorem. Let $f: \mathbb{R}^n \to \mathbb{R}^m$ and $g: \mathbb{R}^m \to \mathbb{R}^p$ be linear functions with matrices $A$ and $B$, respectively. Then the composition $g \circ f$ is defined, and
$$(g \circ f)(\mathbf{x}) = (BA)\mathbf{x},$$
for all $\mathbf{x}$ in $\mathbb{R}^n$, so $g \circ f$ has matrix $BA$ and is a function from $\mathbb{R}^n$ to $\mathbb{R}^p$. It follows from Theorem 2.3 in Chapter 2, Section 2C that $g \circ f$ is linear.
In Theorem 1.5 the image of $f$ is contained in the domain of $g$, which is $\mathbb{R}^m$, so the domain of $g \circ f$ is all of $\mathbb{R}^n$. Note also that $A$ is an m-by-n matrix and $B$ is a p-by-m matrix, so the product $BA$ is defined.
Proof. Suppose that
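EXAMPLE 10. Let $f: \mathbb{R}^2 \to \mathbb{R}^2$ and $g: \mathbb{R}^2 \to \mathbb{R}^2$ be defined by
$$= \begin{pmatrix} \cos 2\theta & -\sin 2\theta \\ \sin 2\theta & \cos 2\theta \end{pmatrix}.$$
In the last step we used the trigonometric identities $\sin 2\theta = 2\sin\theta\cos\theta$ and $\cos 2\theta = \cos^2\theta - \sin^2\theta$.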
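The statement of Theorem 1.5, that applying $f$ and then $g$ gives the same result as multiplying by $BA$, can be tried out on a small case. In this sketch the matrices `A` and `B` and the vector `x` are hypothetical values chosen only for illustration, and `mat_mul` and `mat_vec` are plain-Python helpers rather than functions from any library.

```python
def mat_vec(A, x):
    """Matrix-vector product Ax, with A given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def mat_mul(B, A):
    """Matrix product BA; requires len(B[0]) == len(A)."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))]
            for i in range(len(B))]


A = [[1, 2], [0, 1], [3, -1]]    # 3-by-2: f maps R^2 to R^3
B = [[2, 0, 1], [1, -1, 0]]      # 2-by-3: g maps R^3 to R^2
x = [4, -3]

g_of_f = mat_vec(B, mat_vec(A, x))   # apply f first, then g
BA = mat_mul(B, A)                   # 2-by-2 matrix of the composition
print(g_of_f, mat_vec(BA, x))        # the two results agree
```

Note the order: the matrix of $g \circ f$ is $BA$, not $AB$, because $f$ is applied first.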
1C Inverse Functions
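A function $f: \mathbb{R}^n \to \mathbb{R}^m$ is one-to-one if for every $\mathbf{y}$ in the image $F$ of $f$ in $\mathbb{R}^m$ there is a unique $\mathbf{x}$ in $\mathbb{R}^n$ such that $f(\mathbf{x}) = \mathbf{y}$. If $f$ is one-to-one, its inverse function $f^{-1}: F \to \mathbb{R}^n$ is defined by setting $f^{-1}(\mathbf{y}) = \mathbf{x}$, where $\mathbf{x}$ is the unique $\mathbf{x}$ in $\mathbb{R}^n$ such that $f(\mathbf{x}) = \mathbf{y}$. Thus for all $\mathbf{x}$ in $\mathbb{R}^n$ and all $\mathbf{y}$ in $F$,
$$f^{-1}(f(\mathbf{x})) = \mathbf{x} \quad\text{and}\quad f(f^{-1}(\mathbf{y})) = \mathbf{y}.$$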
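Proof. If $f$ is one-to-one, then since $f(\mathbf{0}) = A\mathbf{0} = \mathbf{0}$, the only solution of $A\mathbf{x} = \mathbf{0}$ is $\mathbf{x} = \mathbf{0}$. Conversely, suppose $\mathbf{x} = \mathbf{0}$ is the only solution of $A\mathbf{x} = \mathbf{0}$. Since $A\mathbf{x}_1 = A\mathbf{x}_2$ if and only if $A(\mathbf{x}_1 - \mathbf{x}_2) = \mathbf{0}$, it follows that if $A\mathbf{x}_1 = A\mathbf{x}_2$ then $\mathbf{x}_1 - \mathbf{x}_2 = \mathbf{0}$ and $\mathbf{x}_1 = \mathbf{x}_2$, so $f$ is one-to-one. By Definition 2.7 of linear independence in Chapter 2, the columns of $A$ are independent if and only if the system $A\mathbf{x} = \mathbf{0}$ has only $\mathbf{x} = \mathbf{0}$ for its solution. Following Inversion Process 4.3 in Chapter 2, a square matrix $A$ is invertible if and only if $A\mathbf{x} = \mathbf{0}$ has only the zero solution. In that case $A\mathbf{x} = \mathbf{y}$ if and only if $\mathbf{x} = A^{-1}\mathbf{y}$, so $f^{-1}(\mathbf{y}) = A^{-1}\mathbf{y}$. •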
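For an example of a function given by an invertible matrix $A$, let $f(\mathbf{x}) = A\mathbf{x} = \begin{pmatrix} 1 & 3 \\ 1 & 4 \end{pmatrix}\mathbf{x}$. Since the columns of $A$ are independent, $A^{-1}$ exists, so
$$f^{-1}(\mathbf{x}) = \begin{pmatrix} 1 & 3 \\ 1 & 4 \end{pmatrix}^{-1}\mathbf{x} = \begin{pmatrix} 4 & -3 \\ -1 & 1 \end{pmatrix}\mathbf{x}.$$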
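The inverse in this example can be checked with exact arithmetic. The sketch below uses the familiar 2-by-2 inverse formula rather than the row-reduction process referenced in the text, so it is a swapped-in technique for illustration; the helper names are hypothetical.

```python
from fractions import Fraction


def inverse_2x2(A):
    """Inverse of a 2-by-2 matrix via the determinant formula (exact fractions)."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


def mat_mul(B, A):
    """Matrix product BA for matrices given as lists of rows."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))]
            for i in range(len(B))]


A = [[1, 3], [1, 4]]
A_inv = inverse_2x2(A)
identity = [[1, 0], [0, 1]]

# Multiplying in either order gives the identity matrix.
print(mat_mul(A, A_inv) == identity, mat_mul(A_inv, A) == identity)
```

Using `Fraction` avoids rounding error, so the comparison with the identity matrix is exact.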
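We can see geometrically that rotating a vector in $\mathbb{R}^2$ through an angle $\theta$ and then through the angle $-\theta$ puts it back in its original position, so the functions given by the rotation matrices
$$R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \quad\text{and}\quad R_{-\theta} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$$
are examples of functions that are inverses of each other. As it should, multiplying the matrices in either order gives the identity matrix:
$$R_\theta R_{-\theta} = \begin{pmatrix} \cos^2\theta + \sin^2\theta & 0 \\ 0 & \sin^2\theta + \cos^2\theta \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$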
The function $f(u, v)$ is one-to-one by Theorem 1.6 because the columns of the 3-by-2 matrix are independent, neither one being a scalar multiple of the other. The
$\mathbb{R}^3$, but we get a one-to-one correspondence that has an inverse only by restricting the domain of $f(x, y, z)$ to $P$, since $u$ and $v$ are independent of $y$ and $(x, y, z)$ and $(x, 0, z)$ give the same values for $u$ and $v$ for all values of $y$.
The form of the inverse correspondence in the previous example suggests that $f^{-1}$ is a linear function from $P$ to $\mathbb{R}^2$. That this is so follows from
1.7 Theorem. If $f: \mathbb{R}^n \to \mathbb{R}^m$ is linear and one-to-one then $f^{-1}$ is also linear.
2. 1(6)=(;). 1(~)=C)
7- I (:)- G) , I (l)- u).
3
-/m-G)- 1m-O) 10)-m
•. I m-(-:) . m-(_:).I
•. 10)-G)- 1(0-(D .
fm-m 1(:)-(g)
Exercises 5 to 8 give information about linear functions $f$. In each case find $f(\mathbf{e}_k)$ for the standard basis vectors $\mathbf{e}_k$ in the domain of $f$ by first expressing each $\mathbf{e}_k$ as a linear combination of the domain vectors $\mathbf{x}$ for which $f(\mathbf{x})$ is given.

Exercises 9 and 10 refer to the rotation matrix $R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$ of Example 8 in this section.

9. Show that $|R_\theta\mathbf{x}| = |\mathbf{x}|$ for every $\mathbf{x}$ in $\mathbb{R}^2$, so $R_\theta$ preserves the lengths of vectors.
10. Show that $\mathbf{x} \cdot R_\theta\mathbf{x} = |\mathbf{x}|^2\cos\theta$. Assuming the result of Exercise 9, what does this say about the cosine of the angle between $\mathbf{x}$ and $R_\theta\mathbf{x}$?

For each of the pairs of linear functions in Exercises 11 to 14, find the matrix that represents the composition $g \circ f$. Also, say what the domain and range of $g \circ f$ are.

= ( -1l O 2) = ( -II -1
13. f(x) l 3 x,
2 l O
g(y)
I -1
!)y

14. $f(\mathbf{x}) = (2 \;\; 1 \;\; 3)\,\mathbf{x}$, $g(y) = 2y$

15. (a) Show that the matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ gives a linear function from $\mathbb{R}^2$ to $\mathbb{R}^2$ that corresponds geometrically to reflection in the line through the origin 45° counterclockwise from the horizontal.
(b) What matrix corresponds to reflection in the line through the origin 135° counterclockwise from the horizontal?
(c) Compute the product of the matrices in parts (a) and (b) and interpret the result geometrically.

16. A counterclockwise rotation in $\mathbb{R}^2$ through an angle $\alpha$ is described by the matrix $R_\alpha = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}$. Let $\beta$ be another angle, and compute the product $R_\alpha R_\beta$. The composition of a rotation through angle $\alpha$ with one through angle $\beta$ is a rotation through the angle $\alpha + \beta$. What is the relation between $R_\alpha R_\beta$ and $R_{\alpha+\beta}$?

17. (a) Show that
$$U = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix} \quad\text{and}\quad V = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ -1 & 0 & 0 \end{pmatrix}$$
represent 90° rotations of $\mathbb{R}^3$ about the $x_1$-axis and $x_2$-axis, respectively. Find the matrix $W$ that represents a 90° rotation about the $x_3$-axis. Also find $U^{-1}$ and $V^{-1}$, which represent rotations in the opposite direction.
(b) Compute $UVU^{-1}$ and $VUV^{-1}$ and interpret the results geometrically by checking out what they do to basis vectors.

*18. Let $f$ and $g$ be defined by $f(\mathbf{x}) = A\mathbf{x} + \mathbf{b}$ and $g(\mathbf{x}) = C\mathbf{x} + \mathbf{d}$ for given matrices $A$ and $C$ and vectors $\mathbf{b}$ and $\mathbf{d}$. Find a matrix $P$ and vector $\mathbf{q}$, expressed in terms of $A$, $\mathbf{b}$, $C$, and $\mathbf{d}$, such that $(f \circ g)(\mathbf{x}) = P\mathbf{x} + \mathbf{q}$. When is $f \circ g$ the same function as $g \circ f$?

19. Let $\mathbf{n}$ be the unit vector (~.~,~), and let $P_\mathbf{n}: \mathbb{R}^3 \to \mathbb{R}^3$ be the associated projection function as in Example 6. Find the matrix of $P_\mathbf{n}$ by finding the image of each of the standard basis vectors under it.

is just the dot product $\mathbf{x} \cdot \mathbf{y}$, while $\mathbf{x}\mathbf{y}^t$ is an n-by-n matrix. Show that if $\mathbf{n}$ is a unit vector then the matrix of the projection function $P_\mathbf{n}$ is $\mathbf{n}\mathbf{n}^t$.

In Exercises 21 to 26, find out whether the image of the given function is a line, a plane, or some other subset of its range.

21. $f(x, y) = (2x - y, 6x - 3y)$
22. $f(x, y) = (x - y, x - 3y, x + y)$
23. $f(x, y, z) = (x + y - z, -x - y + z)$
24. $f(x, y, z) = (x + y - z, x - y + z)$
25. $f(x, y, z) = (x - 2y + z, -2x - y - z, -5y + z)$
26. $f(x, y, z) = (y, z, x)$

Given functions $f: \mathbb{R}^n \to \mathbb{R}^m$ and $g: \mathbb{R}^p \to \mathbb{R}^q$, the composition $g \circ f$ doesn't make sense unless the image $F$ of $f$ lies in the domain $\mathbb{R}^p$ of $g$, in particular, unless $m = p$. For $f$ and $g$ of the types given in Exercises 27 to 30, decide whether $g \circ f$, $f \circ g$, both, or neither, makes sense.

27. $f: \mathbb{R}^1 \to \mathbb{R}^2$, $g: \mathbb{R}^2 \to \mathbb{R}^1$
28. $f: \mathbb{R}^2 \to \mathbb{R}^3$, $g: \mathbb{R}^1 \to \mathbb{R}^3$
29. $f: \mathbb{R}^2 \to \mathbb{R}^2$, $g: \mathbb{R}^2 \to \mathbb{R}^3$
30. $f: \mathbb{R}^3 \to \mathbb{R}^1$, $g: \mathbb{R}^1 \to \mathbb{R}^2$

31. Show that if $f: \mathbb{R}^n \to \mathbb{R}$ is a linear function, then there is a vector $\mathbf{a}$ in $\mathbb{R}^n$ such that $f(\mathbf{x}) = \mathbf{a} \cdot \mathbf{x}$ for all $\mathbf{x}$ in $\mathbb{R}^n$.

32. Show that if $A$ is an m-by-n matrix such that $A\mathbf{x} = \mathbf{0}$ for every $\mathbf{x}$ in $\mathbb{R}^n$, then all entries in $A$ are zero.

33. The linear function $f_a: \mathbb{R}^2 \to \mathbb{R}^2$ defined by $f_a(x, y) = (x + ay, y)$, for fixed $a \neq 0$, is an example of a shear transformation.
(a) What is the matrix of $f_a$?
(b) Find the points $f_a(1, 0)$, $f_a(0, 1)$, $f_a(-1, 0)$, and $f_a(0, -1)$, and sketch their relation to the corresponding domain points when $a = 1$.
(c) Show that if $a > 0$, then $f_a$ moves points above the x-axis to the right and points below the x-axis to the left. What happens if $a < 0$?
(d) What points are always left fixed by $f_a$?
(e) For which lines $L$ in the plane (not necessarily through the origin) is the image $f(L)$ equal to $L$?
(f) What is the composition of two shear transformations $f_a$ and $f_b$?
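1. $r\mathbf{x} + s\mathbf{x} = (r + s)\mathbf{x}$
2. $r\mathbf{x} + r\mathbf{y} = r(\mathbf{x} + \mathbf{y})$
3. $r(s\mathbf{x}) = (rs)\mathbf{x}$
4. $\mathbf{x} + \mathbf{y} = \mathbf{y} + \mathbf{x}$
5. $(\mathbf{x} + \mathbf{y}) + \mathbf{z} = \mathbf{x} + (\mathbf{y} + \mathbf{z})$
6. $\mathbf{x} + \mathbf{0} = \mathbf{x}$
7. $\mathbf{x} + (-\mathbf{x}) = \mathbf{0}$
8. $1\mathbf{x} = \mathbf{x}$
9. $0\mathbf{x} = \mathbf{0}$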
We now take a more general point of view and define a vector space $V$ over the real numbers to be a set with operations of addition and scalar multiplication defined so that they behave like the familiar operations on $\mathbb{R}^n$. Specifically, for $\mathbf{x}$ and $\mathbf{y}$ in $V$ and $r$ in $\mathbb{R}$, the sum $\mathbf{x} + \mathbf{y}$ and the scalar multiple $r\mathbf{x}$ must also be elements of $V$. In addition, $V$ has to contain a zero vector, and formulas 1 through 9 above have to hold for all real numbers $r$, $s$ and all $\mathbf{x}$, $\mathbf{y}$, and $\mathbf{z}$ in $V$. As we have done all along with vectors in $\mathbb{R}^n$, we write $-\mathbf{x}$ for the scalar multiple $(-1)\mathbf{x}$, $\mathbf{x} - \mathbf{y}$ for $\mathbf{x} + (-\mathbf{y})$, and $\mathbf{0}$ for the zero vector.
We form linear combinations of vectors in a vector space by adding scalar mul-
tiples of vectors, and routine calculations such as
EXAMPLE 1. The set of all n-tuples of real numbers, with addition and multiplication by a scalar defined as in Chapter 1, forms the vector space $\mathbb{R}^n$. From the point of view of this chapter, when we showed that the operations defined on $\mathbb{R}^n$ have the properties 1 through 9, we were showing that $\mathbb{R}^n$ is a vector space.
EXAMPLE 2. For fixed m and n the set of all m-by-n matrices forms a vector space $M_{m,n}$. The vector operations are matrix addition and scalar multiplication of matrices as defined
Section 2A Vector Spaces 113
in Chapter 2. For example, in M2,3 we have
(; ~-i)+(;;;)=(~i~)
and = ( ~ ~ i).
2 (; ; ; )
Let $\mathcal{P}_n$ be the set of all polynomials of degree at most n, which is the set of all functions $p$ having the form
$$p(x) = a_0 + a_1 x + \cdots + a_n x^n,$$
where $a_0, \ldots, a_n$ are constants. Define addition and scalar multiplication as usual for polynomials, by collecting terms with like powers of $x$. For example, when $n = 2$ we have
and
3(1 + 2x + 3x 2) = 3 + 6x + 9x 2 .
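$\mathcal{P}_n$ and $\mathbb{R}^{n+1}$ are very much alike as vector spaces, under the correspondence
$$a_0 + a_1 x + \cdots + a_n x^n \;\longleftrightarrow\; (a_0, a_1, \ldots, a_n).$$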
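One way to see the similarity between polynomials of degree at most n and $\mathbb{R}^{n+1}$ is to store a polynomial as its list of coefficients, so that the vector operations act coordinate by coordinate. The sketch below assumes that representation; the function names and the second polynomial `q` are hypothetical choices for illustration.

```python
def poly_add(p, q):
    """Add two polynomials of the same maximum degree, coefficient by coefficient."""
    return [a + b for a, b in zip(p, q)]


def poly_scale(r, p):
    """Multiply every coefficient of p by the scalar r."""
    return [r * a for a in p]


def poly_eval(p, x):
    """Evaluate a_0 + a_1 x + ... + a_n x^n at the number x."""
    return sum(a * x ** k for k, a in enumerate(p))


p = [1, 2, 3]     # 1 + 2x + 3x^2
q = [4, 0, -1]    # 4 - x^2, a hypothetical second polynomial

print(poly_scale(3, p))   # coefficients of 3(1 + 2x + 3x^2)
print(poly_add(p, q))     # coefficients of p + q

# The coefficient-wise operations agree with the operations on functions.
x0 = 2
print(poly_eval(poly_add(p, q), x0) == poly_eval(p, x0) + poly_eval(q, x0))
```

The lists here are exactly vectors in $\mathbb{R}^3$, and `poly_add` and `poly_scale` are the usual vector operations on them.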
Let $V$ be the set of all continuous real-valued functions on $[0, 1]$. For $f$ and $g$ in $V$ and $r$ in $\mathbb{R}$ define $f + g$ and $rf$ as the functions whose values at a point $x$ in $[0, 1]$ are $f(x) + g(x)$ and $rf(x)$, respectively. We learn in Chapter 5 that sums and constant multiples of continuous functions are continuous, a theorem often assumed in calculus. Thus the defined operations do produce elements of $V$ as they should.
FIGURE 3.4 (a) Sum. (b) Scalar multiples.
Formulas 1 through 9 follow from the same kind of argument used in Chapter 1 to show that they hold for $\mathbb{R}^n$. The 0-element of $V$ required by formula 6 is the zero function $z$ defined by $z(x) = 0$ for all $x$ in $[0, 1]$.
The vector space described in Example 5 is commonly denoted by $C[0, 1]$, the space of continuous functions defined on the interval $[0, 1]$. More generally, the continuous functions on an interval $[a, b]$ form a vector space called $C[a, b]$. The notation $C(-\infty, \infty)$ denotes the space of continuous functions on the entire real line. Figure 3.4 illustrates the two vector space operations in $C[0, 1]$.
2B Subspaces
The vector space $\mathbb{R}^2$ fits in a natural way inside $\mathbb{R}^3$ if we identify $(x, y)$ in $\mathbb{R}^2$ with $(x, y, 0)$ in $\mathbb{R}^3$. The following example shows that all planes through the origin are subspaces of $\mathbb{R}^3$.
EXAMPLE 6. Let $V$ consist of all the vectors in $\mathbb{R}^3$ that lie in a plane $ax + by + cz = 0$. The sum of two vectors in $V$ is in $V$ and the same is true for scalar multiples of vectors in $V$. We see this geometrically from the parallelogram law of addition and the geometric interpretation of scalar multiplication. For this example we can also check algebraically that if $ax_1 + by_1 + cz_1 = 0$ and $ax_2 + by_2 + cz_2 = 0$, then
$$a(x_1 + x_2) + b(y_1 + y_2) + c(z_1 + z_2) = 0 \quad\text{and}\quad a(sx_1) + b(sy_1) + c(sz_1) = 0$$
for all scalars $s$. Thus we can think of addition and scalar multiplication as restricted just to $V$, so $V$ is a vector space. If $a = b = 0$ and $c = 1$ we get the xy-plane of vectors $(x, y, 0)$ as a special case.
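The closure check in this example is easy to try on specific numbers. In the sketch below the coefficients $(a, b, c) = (1, 2, -1)$ and the sample vectors are hypothetical; the last lines show, for contrast, how closure fails for a plane that does not pass through the origin.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def add(u, v):
    return tuple(a + b for a, b in zip(u, v))


def scale(s, u):
    return tuple(s * a for a in u)


normal = (1, 2, -1)          # hypothetical (a, b, c): the plane x + 2y - z = 0


def in_plane(v, rhs=0):
    """True if v satisfies a*x + b*y + c*z = rhs."""
    return dot(normal, v) == rhs


u, v = (1, 0, 1), (0, 1, 2)  # two vectors in the plane through the origin
closed_sum = in_plane(add(u, v))
closed_scale = in_plane(scale(5, u))

# Contrast: the parallel plane x + 2y - z = 1 is not closed under addition.
p, q = (1, 0, 0), (0, 0, -1)
shifted_sum_stays = in_plane(add(p, q), rhs=1)
print(closed_sum, closed_scale, shifted_sum_stays)
```

The failure in the shifted case comes from the right-hand sides adding to 2 rather than 1, which is why only planes through the origin are subspaces.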
The previous example generalizes as follows. Let W be a vector space and let V
be a subset of it. We say that V is closed under addition if x + y is in V whenever
x and y are, and closed under scalar multiplication if every scalar multiple sx is
in $V$ whenever $\mathbf{x}$ is. If $V$ is closed under both operations, $V$ is called a subspace of $W$.
2.1 Theorem. If V is a nonempty subset of a vector space W, and V is closed
under addition and scalar multiplication as defined on W, then V is a vector space.
Proof. To prove that $V$ is a vector space we have to show that the formulas 1 through 9 hold. First, closure under scalar multiplication implies that if $\mathbf{x}$ is in $V$ so are $-\mathbf{x} = (-1)\mathbf{x}$ and $\mathbf{0} = 0\mathbf{x}$. Then all the formulas hold for vectors in $V$ because the vectors are also in $W$ and the formulas hold because $W$ is a vector space. •
The next theorem gives an alternative condition for a subset of a vector space to
be a subspace.
Proof. If $V$ is a subspace and vectors $\mathbf{x}_1, \ldots, \mathbf{x}_n$ belong to it, then repeated application of the closure conditions shows that every linear combination
$$\mathbf{x} = t\mathbf{x}_1.$$
It may be that V contains no other vectors, in which case our subspace V is
identical with the line, shown in Figure 3.5.
3. Otherwise, $V$ contains a vector $\mathbf{x}_2$ different from all the vectors $t\mathbf{x}_1$. Since $V$ contains all scalar multiples and sums of vectors in it, $V$ contains all linear combinations $t\mathbf{x}_1 + u\mathbf{x}_2$, where $(t, u)$ ranges over $\mathbb{R}^2$. In other words, $V$ contains the plane through the origin with parametric representation
$$\mathbf{x} = t\mathbf{x}_1 + u\mathbf{x}_2.$$
$$t\mathbf{x}_1 + u\mathbf{x}_2 + v\mathbf{x}_3,$$
where $(t, u, v)$ ranges over $\mathbb{R}^3$. Because $\mathbf{x}_1$, $\mathbf{x}_2$, and $\mathbf{x}_3$ don't all lie in a plane through the origin, every vector in $\mathbb{R}^3$ is a linear combination of them, something that's apparent geometrically from Figure 3.5(c), where the shaded box shows the vectors with all three coefficient values between 0 and 1. Example 10 in Section 5C shows that we really get all of $\mathbb{R}^3$ this way.
In a vector space $W$ the zero vector by itself is always a subspace, for the reasons given in item 1 of the preceding example. It is called the zero subspace, or sometimes the trivial subspace, of $W$. $W$ is itself closed under the vector operations, and so is technically a subspace of itself. Subspaces of a vector space $W$ other than $W$ itself are called proper subspaces of $W$. We summarize the results of Example 8 as follows, where the proper subspaces of $\mathbb{R}^3$ are the ones listed in (1) to (3).
2.3 Theorem. The subspaces of $\mathbb{R}^3$ are (1) the zero subspace, (2) the lines through the origin, (3) the planes through the origin, and (4) the space $\mathbb{R}^3$ itself.
More generally, as we'll see in Example 10 of Section 6C, the subspaces of $\mathbb{R}^n$ are the zero subspace, the k-planes through the origin for $k = 1, \ldots, n - 1$, and the space $\mathbb{R}^n$ itself.
Here is the formal statement and proof that the span of a subset of a vector space
is always a subspace.
2.4 Theorem. Let $S$ be an arbitrary set of vectors in a vector space $W$ and let $V$ be the span of $S$. Then $V$ is a subspace of $W$.
Proof. We need to show that $V$ is closed under addition and scalar multiplication. Let $\mathbf{x}$ and $\mathbf{y}$ be vectors in the span of $S$, so each of them is a linear combination of a finite number of vectors in $S$. Suppose that $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ contains all the vectors in $S$ needed in the linear combinations for both $\mathbf{x}$ and $\mathbf{y}$. Then there are scalars $a_i$ and $b_i$ such that
$$\mathbf{x} = a_1\mathbf{v}_1 + \cdots + a_k\mathbf{v}_k \quad\text{and}\quad \mathbf{y} = b_1\mathbf{v}_1 + \cdots + b_k\mathbf{v}_k.$$
(Some of the a's and b's may be zero if not all of the v's are needed for both $\mathbf{x}$ and $\mathbf{y}$.) Then $\mathbf{x} + \mathbf{y} = (a_1 + b_1)\mathbf{v}_1 + \cdots + (a_k + b_k)\mathbf{v}_k$ and $r\mathbf{x} = ra_1\mathbf{v}_1 + \cdots + ra_k\mathbf{v}_k$, so $\mathbf{x} + \mathbf{y}$ and $r\mathbf{x}$ are also linear combinations of vectors in $S$. Hence $V$ is closed under addition and scalar multiplication and is therefore a subspace of $W$. •
In the space $\mathcal{P}$ of polynomials discussed in Example 4, the span of the set $S = \{1, x, x^2, \ldots, x^n\}$ is the subspace $\mathcal{P}_n$ of polynomials of degree at most n. If $m \le n$, then $\mathcal{P}_m$ is a subspace of $\mathcal{P}_n$, and if $m < n$, then $\mathcal{P}_m$ is a proper subspace of $\mathcal{P}_n$. The whole space $\mathcal{P}$ is the span of the infinite set $\{1, x, x^2, \ldots\}$.
Here are some examples of subspaces of C[a, b], the space of continuous functions
on [a , b] discussed in Example 5.
We can get a subspace of $C[a, b]$ by taking the span of a set of functions in it. For instance, the span of $\{1, x, x^2\}$ is a subspace of $C[a, b]$ consisting of all polynomials of degree at most 2. Another example is the set of all functions of the form
which is the span of the set $\{1, \cos x, \sin x, \cos 2x, \sin 2x, \cos 3x, \sin 3x\}$.
In the next two examples we have subspaces of $C[0, 1]$ that are not described as the span of a set.
Thus $V$ is closed under addition and scalar multiplication and is a subspace of $C[a, b]$.
$$\frac{d}{dx}(f + g) = \frac{df}{dx} + \frac{dg}{dx} \quad\text{and}\quad \frac{d(rf)}{dx} = r\frac{df}{dx},$$
sums and scalar multiples of functions with continuous derivatives also have continuous derivatives.
We denote the set of functions whose first k derivatives are continuous by $C^{(k)}[a, b]$. Repeated application of the argument used for $C^{(1)}[a, b]$ shows that it is a subspace of $C[a, b]$. For $l \ge k$, $C^{(l)}[a, b]$ is a subspace of $C^{(k)}[a, b]$. For $l > k$, $C^{(l)}[a, b]$ is a proper subspace of $C^{(k)}[a, b]$. A proof is outlined in Exercise 33.
EXERCISES
In each of Exercises 1 to 6, let $S$ be the set of all vectors $(x, y, z)$ in $\mathbb{R}^3$ whose entries satisfy the given conditions. In each case, either show that the subset is a subspace of $\mathbb{R}^3$ by verifying the closure conditions, or show that it is not a subspace by finding some linear combination of elements of $S$ that is not in $S$.

1. $x + 2y = 0$
2. $x + z = 2$
3. $x + y = 0$ and $z = 0$
4. $x + y = 0$ or $z = 0$
5. $x = y^3$
6. $x + y = 0$ and $x = y^3$

In Exercises 7 to 10, let $S$ be the subset of the vector space of 2-by-2 matrices, $M_{2,2}$, consisting of the matrices $A = \begin{pmatrix} x & y \\ z & w \end{pmatrix}$ whose entries satisfy the given conditions. Show either that $S$ is a subspace of $M_{2,2}$ or that it is not.

7. $x = w$
8. $x = -w$
9. $y = z = 1$
10. $\det(A) = xw - yz = 0$

11. (a) Show that the set of vectors $(x, y, z)$ in $\mathbb{R}^3$ such that $x + 2y - z = 0$ is a subspace of $\mathbb{R}^3$.
(b) By finding a parametric representation for the solutions of $x + 2y - z = 0$, find two vectors that span the subspace in part (a).

12. Let $\mathbf{a}$ be a fixed nonzero vector in $\mathbb{R}^n$.
(a) Show that the set $S$ of all vectors $\mathbf{x}$ such that $\mathbf{a} \cdot \mathbf{x} = 0$ is a subspace of $\mathbb{R}^n$.
(b) Show that if $k$ is a nonzero real number, then the set $A$ of all vectors $\mathbf{x}$ such that $\mathbf{a} \cdot \mathbf{x} = k$ is not a subspace.

13. Let $S$ be a subset of $\mathbb{R}^n$, and let $S^\perp$ (pronounced "S perpendicular", or "S perp" for short) be the set of all vectors $\mathbf{p}$ in $\mathbb{R}^n$ such that $\mathbf{p} \cdot \mathbf{s} = 0$ for all $\mathbf{s}$ in $S$. Show that $S^\perp$ is always a subspace of $\mathbb{R}^n$.

14. Show that for a subset $S$ of $\mathbb{R}^n$, the span of $S$ is contained in $(S^\perp)^\perp$. [Hint: First show that $S$ is contained in $(S^\perp)^\perp$.]

*15. Let $\mathbf{e}_i$ be the sequence in $\mathbb{R}^\infty$ (Example 3) having 1 in the ith place and 0 elsewhere, so $\mathbf{e}_1 = (1, 0, 0, \ldots)$, $\mathbf{e}_2 = (0, 1, 0, \ldots)$, etc. Which vectors in $\mathbb{R}^\infty$ are in the span of $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$? Which are in the span of the set of all the $\mathbf{e}_i$? Give an example of a vector in $\mathbb{R}^\infty$ that is not in the span of all the $\mathbf{e}_i$.

In Exercises 16 to 19, determine whether the set of all polynomials $p$ in $\mathcal{P}_3$ that satisfy the given conditions is a subspace of $\mathcal{P}_3$.

16. $p(0) = 1$
17. $p(1) = 0$
18. $p(0) = p(1)$
19. $p(1) = p'(2)$, where $p'$ is the derivative of $p$

20. In the space $\mathcal{P}$ of polynomials, let $\mathcal{A}$ be the set of all $p$ such that $p(x) = -p(-x)$, and let $\mathcal{B}$ be the set of $p$ such that $p(x) = p(-x)$. Show that $\mathcal{A}$ is the span of $\{x, x^3, x^5, \ldots\}$, and find a spanning set for $\mathcal{B}$.

In Exercises 21 to 24, determine whether the given subset of $C^{(1)}(-\infty, \infty)$ is also a subspace.

21. All $f$ such that $f'(0)$ exists
22. All $f$ such that $f'(0) = 2$
23. All $f$ such that $f'(0) = f(2)$
24. All $f$ such that $f(x) = f(-x)$ for every value of $x$

25. Let $C[a, b]$ be the vector space of continuous real-valued functions defined on the interval $[a, b]$. Let $C_0[a, b]$ be the set of functions $f$ in $C[a, b]$ such that $f(a) = f(b) = 0$. Show that $C_0[a, b]$ is a subspace of $C[a, b]$.

In Exercises 26 and 27, show that $S$ and $T$ have the same span in $\mathbb{R}^3$ by showing that the vectors in $S$ are in the span of $T$ and vice versa. [Hint: You can do this by solving systems of linear equations.]

26. $S = \{(1, 0, 0), (0, 1, 0)\}$, $T = \{(1, 2, 0), (2, 1, 0)\}$
27. $S = \{(2, 3, 1), (1, 2, 3)\}$, $T = \{(3, 5, 4), (1, 1, -2)\}$

28. (a) Show that the plane $P$ of points $(1, 1, 1) + s(1, 2, 0) + t(-2, 1, 1)$ is not a subspace by finding two vectors in $P$ whose sum is not in $P$.
(b) Show that the plane $P$ of points $(-1, 3, 1) + s(1, 2, 0) + t(-2, 1, 1)$ is a subspace.
(c) What is different about cases (a) and (b)? For which vectors $\mathbf{b}$ do the points $\mathbf{b} + s(1, 2, 0) + t(-2, 1, 1)$ form a subspace?
Section 3 Linear Functions 119
29. Show that if $S$ is a subset of a vector space $W$ and $V$ is a subspace of $W$ that contains $S$, then the span of $S$ is a subset of $V$. (Another way of stating this is to say that the span of $S$ is the smallest subspace of $W$ that contains $S$.)

30. Show that the intersection of two subspaces of a vector space $V$ is always a subspace of $V$.

*31. Exercise 4 shows that the union of two subspaces is not always a subspace. Show that the union of two subspaces is a subspace if and only if one of them is contained in the other.

32. Given two subsets $\mathcal{A}$ and $\mathcal{B}$ of a vector space, let $\mathcal{A} + \mathcal{B}$ stand for the set of all vectors that are equal to sums $\mathbf{a} + \mathbf{b}$ with $\mathbf{a}$ in $\mathcal{A}$ and $\mathbf{b}$ in $\mathcal{B}$. Show that if $\mathcal{A}$ and $\mathcal{B}$ are subspaces, then so is $\mathcal{A} + \mathcal{B}$.

*33. This exercise outlines a proof that, for the spaces of functions $C^{(k)}[a, b]$ defined in Example 13, $C^{(l)}[a, b]$ is a proper subspace of $C^{(k)}[a, b]$ when $l > k$.
(a) Show that $C^{(1)}[a, b]$ is a proper subspace of $C[a, b]$ by giving an example of a function that is continuous on the interval $[a, b]$ but doesn't have a derivative that is continuous on the interval.
(b) One version of the fundamental theorem of calculus states that if $f$ is continuous on an interval $[a, b]$ then $F(x) = \int_a^x f(t)\,dt$ is a function with derivative $F'(x) = f(x)$. Use this and your example from part (a) to find a function that is in $C^{(1)}[a, b]$ but not in $C^{(2)}[a, b]$.
(c) Show by induction that for $k = 1, 2, \ldots$, there is a function in $C^{(k)}[a, b]$ that is not in $C^{(k+1)}[a, b]$, so $C^{(k+1)}[a, b]$ is a proper subspace of $C^{(k)}[a, b]$.

*34. Let $C^{(\infty)}$ be the vector space of infinitely often differentiable functions of a real variable. Show that $C^{(\infty)}$ is a proper subspace of $C^{(k)}$ for $k = 1, 2, \ldots$.

35. Suppose a linear function $f: \mathbb{R}^3 \to \mathbb{R}$ has $f(\mathbf{e}_1) = 1$, $f(\mathbf{e}_2) = 2$, and $f(\mathbf{e}_3) = 1$. Show that the equation $f(\mathbf{x}) = 1$ has solutions consisting precisely of the points in the plane perpendicular to $(1, 2, 1)$ and passing through $(1, 0, 0)$.

In each of Exercises 36 to 41, say whether the given statement is always true or sometimes false. If the statement is always true, give a reason why; otherwise give an example for which it is false.

36. If $S$ is a subspace of a vector space and $\mathbf{x}$ is in $S$, then $-\mathbf{x}$ is in $S$.

37. If $S$ is a subspace of a vector space $W$ and $\mathbf{x}$ is in $W$ but not in $S$, then the set of all sums $\mathbf{x} + \mathbf{y}$ with $\mathbf{y}$ in $S$ is not a subspace of $W$.

38. If $S$ is a subspace of $\mathbb{R}^n$ and $S$ contains more than one vector, then $S$ contains a line through the origin.

39. If $S_1$ and $S_2$ are two different proper subspaces of a vector space $W$, then $W$ has a proper subspace that contains both $S_1$ and $S_2$.

40. There is no subspace of $\mathbb{R}^n$ such that $|\mathbf{x}| \le 1$ for all $\mathbf{x}$ in the subspace.

41. No subspace $S$ of $\mathbb{R}^3$ has the property that $\mathbf{x} \cdot (1, 2, 1) = 1$ for all $\mathbf{x}$ in $S$.
independent. Matrix representations for linear functions in general are not readily available, but the definition of linear independence in Definition 2.7 in Chapter 2 provides a useful substitute that doesn't depend on properties of $\mathbb{R}^n$.
Proof. By linearity, $f(\mathbf{0}) = f(0\mathbf{x}) = 0f(\mathbf{x}) = \mathbf{0}$, and $f(\mathbf{x}_1) = f(\mathbf{x}_2)$ if and only if $f(\mathbf{x}_1 - \mathbf{x}_2) = \mathbf{0}$. If $f$ is one-to-one, then $\mathbf{x} = \mathbf{0}$ is the only vector such that $f(\mathbf{x}) = \mathbf{0}$. If $f$ is not one-to-one, there are vectors such that $f(\mathbf{x}_1) = f(\mathbf{x}_2)$ but $\mathbf{x}_1 \neq \mathbf{x}_2$, and then $\mathbf{x}_1 - \mathbf{x}_2$ is a nonzero solution of $f(\mathbf{x}) = \mathbf{0}$. •
3A Examples of Linear Functions
We now give some examples of linear functions just to illustrate various possibilities
that come under the definition. After giving some specific examples, we consider
general ways of combining linear functions to get others.
For our first example, we simply recall Theorem 2.3 in Chapter 2, Section 2C and Theorem 1.3 of Section 1 of this chapter. If $f: \mathbb{R}^n \to \mathbb{R}^m$ is a linear function, then $f(\mathbf{x}) = A\mathbf{x}$, where $A$ is the m-by-n matrix whose jth column is the vector $f(\mathbf{e}_j)$. This theorem is significant in that it gives us a concrete computational description of all possible linear functions from $\mathbb{R}^n$ to $\mathbb{R}^m$.
This direct description of linear functions by matrices only works for functions whose domain and range are standard coordinate spaces $\mathbb{R}^n$, $\mathbb{R}^m$. As we'll see later in Section 5B, many vector spaces are very much like the spaces $\mathbb{R}^n$, and there is a way to associate linear functions on them with matrices, but in other cases such as the next couple of examples this is not possible.
We sometimes use the term transformation to refer to a function from one vector space to another, and use the term operator to refer to a function from a vector space to itself. These terms help to avoid confusion when we deal with vector spaces such as $C(-\infty, \infty)$ whose elements are themselves functions. For example, differentiation is a differential operator from infinitely often differentiable functions such as $f(x) = \sin x$ that operates on $f(x)$ to produce $f'(x) = \cos x$. We use this terminology in several of the examples that follow.
EXAMPLE 2. Let $\mathcal{P}$ be the vector space of all polynomials, as in Example 4 in the previous section. Because the derivative of a polynomial is a polynomial, we can define the differential operator $D: \mathcal{P} \to \mathcal{P}$ by setting $Dp(x) = p'(x)$ for every polynomial $p(x)$ in $\mathcal{P}$. For example, $D(2 + x - x^3) = 1 - 3x^2$. Checking that $D$ is linear is a matter of observing that if $p(x)$ and $q(x)$ are polynomials, and $r$ and $s$ are numbers, then
$$D(rp(x) + sq(x)) = rDp(x) + sDq(x).$$
FIGURE 3.6 Panels (a), (b), (c), (d), with horizontal axes labeled $x$ and vertical axes labeled $u$; one panel carries the label $Du(x) = -2x/(1 + x^2)^2$.

EXAMPLE 3.
The discussion of the differential operator $D$ in Example 2 applies somewhat differently if we consider $D$ as a transformation from $C^{(1)}(-\infty, \infty)$, the space of continuously differentiable functions $f(x)$, to $C(-\infty, \infty)$, the continuous functions. The linearity of $D$ follows just as in Example 2. But note that while here $f(x)$ is assumed to have a continuous derivative, $Df(x) = f'(x)$ may only be continuous but not differentiable. As with differentiation of polynomials $D$ is still not one-to-one, for the same reason as before, namely, that the derivative of every constant function is the identically zero function. Figures 3.6(a) and (b) show the graphs of a function $u(x)$ and its derivative $Du(x)$. The linearity of $D$ as an operator on $u(x)$ isn't at all obvious from looking at the pictures.
EXAMPLE 4.
Let $C(-\infty, \infty)$ denote the space of continuous real-valued functions $u(x)$ and let
$q(x)$ be a fixed function in $C(-\infty, \infty)$. We define an operator $Q : C(-\infty, \infty) \to
C(-\infty, \infty)$ by
\[ Qu(x) = q(x)u(x). \]
Figure 3.6(c) shows the effect of multiplying by $q(x) = x$ when $u(x) = (1 + x^2)^{-1}$;
Figure 3.6(d) shows the effect when $q(x) = x^2(1 + x^2)^{-1}$ instead. Checking that $Q$
is a linear transformation amounts to observing that
\[ Q(ru + sv)(x) = q(x)\bigl(ru(x) + sv(x)\bigr) = rq(x)u(x) + sq(x)v(x) = rQu(x) + sQv(x). \]
Putting the operator $Q$ and the differential operator $D$ in the single equation
$Du = Qu$ gives an example of a differential equation, $u'(x) = q(x)u(x)$. When $q(x) = x$,
for instance, each function
\[ u(x) = ke^{x^2/2}, \quad k \text{ constant}, \]
satisfies the equation. The preceding formula gives all solutions, as we show
in Chapter 10, Section 3. In this example the domain of $D$ is the subspace of
$C(-\infty, \infty)$ consisting of the continuously differentiable functions.
EXAMPLE 5. In this example we again use an m-by-n matrix to describe a linear function, but it
is not the same as Example 1. For one thing, the domain is not $\mathbb{R}^n$ and the range is
not $\mathbb{R}^m$. Let $M_{n,p}$ be the vector space of n-by-p matrices discussed in Example 2
of the previous section. If $A$ is a fixed m-by-n matrix, then for each n-by-p matrix
$M$, the product $AM$ is defined and is an m-by-p matrix. Thus we obtain a function
$f_A : M_{n,p} \to M_{m,p}$ by defining
\[ f_A(M) = AM. \]
\[ f_A(M + N) = A(M + N) = AM + AN = f_A(M) + f_A(N), \]
and by the scalar commutativity law of the same theorem,
\[ f_A(rM) = A(rM) = r(AM) = rf_A(M). \]
In the preceding examples, formal verification that the transformations were linear
was straightforward, and in the future we'll often leave such routine checks to the
reader.
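For readers who like to see such a routine check carried out, here is a small sketch for the function $f_A(M) = AM$ of Example 5. The particular matrices `A`, `M`, `N` and the scalar `r` are arbitrary choices of ours, and the helper names are hypothetical. A numerical check on one pair of matrices illustrates the two linearity conditions but does not replace the proof.

```python
def matmul(A, B):
    """Product of an m-by-n matrix A and an n-by-p matrix B (lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def add(M, N):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(row_m, row_n)] for row_m, row_n in zip(M, N)]


def scale(r, M):
    """Scalar multiple rM."""
    return [[r * a for a in row] for row in M]


A = [[1, 2, 0], [0, -1, 3]]        # fixed 2-by-3 matrix (arbitrary choice)


def f_A(M):
    return matmul(A, M)            # maps 3-by-p matrices to 2-by-p matrices


M = [[1, 0], [2, 1], [0, -1]]      # 3-by-2
N = [[0, 4], [1, 1], [3, 2]]       # 3-by-2
r = 5

additive = f_A(add(M, N)) == add(f_A(M), f_A(N))
homogeneous = f_A(scale(r, M)) == scale(r, f_A(M))
print(additive, homogeneous)
```

Integer entries keep the arithmetic exact, so the two comparisons test the identities themselves rather than floating-point approximations of them.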
A formal proof that a function $f$ is not linear involves finding some $x$ and $y$ in
its domain such that $f(x + y) \ne f(x) + f(y)$, or a scalar $r$ and an $x$ in the domain
of $f$ such that $f(rx) \ne rf(x)$. For example, $f : \mathbb{R}^1 \to \mathbb{R}^1$ defined by $f(x) = x^2$
certainly looks nonlinear. To prove that $f(x) = x^2$ is not linear, it's enough to note,
for instance, that $f(1 + 1) = f(2) = 4$ while $f(1) + f(1) = 1 + 1 = 2 \ne 4$, or
$f((-1)(3)) = f(-3) = 9$ while $-f(3) = -9 \ne 9$. Usually a function that looks
nonlinear is nonlinear, but care is sometimes required, as in the next example.
EXAMPLE 6. Define $f : \mathbb{R}^2 \to \mathbb{R}^2$ by $f(x, y) = (3x - y, (x + y + 2)^2 - (x + y)^2 - 4)$. The function $f$
appears nonlinear at first glance, but a second look shows that $(x + y + 2)^2 - (x + y)^2 - 4$
simplifies to $4x + 4y$, so $f$ is linear after all.
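The two situations just described, a function that fails the linearity conditions and one that only looks as if it should, can be explored with a brute-force check over a small grid of integers. This is a sketch under our own assumptions: the name `is_linear_on_grid` and the choice of grid are hypothetical, and passing the check on a finite grid is evidence of linearity, not a proof, whereas a single failure is a genuine counterexample.

```python
from itertools import product


def is_linear_on_grid(f, dim, values=range(-2, 3)):
    """Test f(rx + sy) == r f(x) + s f(y) for all grid vectors x, y and grid scalars r, s."""
    vectors = list(product(values, repeat=dim))
    for x, y in product(vectors, repeat=2):
        for r, s in product(values, repeat=2):
            combo = tuple(r * a + s * b for a, b in zip(x, y))
            expected = tuple(r * a + s * b for a, b in zip(f(x), f(y)))
            if tuple(f(combo)) != expected:
                return False  # found a counterexample to linearity
    return True


def square(v):
    """f(x) = x^2 on R^1, written on 1-tuples."""
    return (v[0] ** 2,)


def disguised(v):
    """f(x, y) = (3x - y, (x + y + 2)^2 - (x + y)^2 - 4)."""
    x, y = v
    return (3 * x - y, (x + y + 2) ** 2 - (x + y) ** 2 - 4)


print(is_linear_on_grid(square, 1))     # a counterexample exists, e.g. x = y = r = s = 1
print(is_linear_on_grid(disguised, 2))  # no counterexample on the grid
```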
The previous example is rather artificial. A more natural situation is to have a
family of functions that are nonlinear in general but linear in exceptional cases that
may be overlooked. For instance, $ax^2 + bx$ is nonlinear only if $a \ne 0$. The exact
domain of a function may also make a difference, as in the following example.
EXAMPLE 7. Let $f : \mathbb{R}^2 \to \mathbb{R}^2$ be the function defined by the second-degree formula $f(x, y) =
((x + 1)^2 - (y + 1)^2, 3x^2 + 5xy + 2y^2)$. It certainly looks nonlinear, and we leave
it to the reader to check that it is.
Now let $V$ be the subspace of $\mathbb{R}^2$ consisting of all scalar multiples of $(1, -1)$,
and define another function $g : V \to \mathbb{R}^2$ given by the same formula as $f$, but with
domain restricted to $V$. For $(x, y)$ in $V$, we have $y = -x$, so for $(x, y)$ in $V$,
\[ g(x, y) = ((x + 1)^2 - (1 - x)^2, 3x^2 - 5x^2 + 2x^2) = (4x, 0), \]
which is linear on $V$.
\[ (g \circ f)(x) = g(f(x)) \]
whenever the right side is defined. According to Theorem 1.5 in Section 1 of this
chapter, the composition of linear functions $g : \mathbb{R}^m \to \mathbb{R}^p$ and $f : \mathbb{R}^n \to \mathbb{R}^m$ that have
standard coordinate spaces for domain and range corresponds to matrix multiplication,
so that if $f(x) = Ax$ and $g(y) = By$, then $(g \circ f)(x) = BAx$. It then follows
from Theorem 2.3 in Chapter 2, Section 2C that $g \circ f$ is linear. The same conclusion
holds for linear functions on vector spaces in general.
3.4 Theorem. If $f : U \to V$ and $g : V \to W$ are linear functions, then their composition
$g \circ f : U \to W$, defined by $(g \circ f)(x) = g(f(x))$, is also linear.
Proof. We check the linearity of $g \circ f$ by the following calculation, using first the
linearity of $f$ and then the linearity of $g$.
\[ (g \circ f)(rx + sy) = g(rf(x) + sf(y)) = rg(f(x)) + sg(f(y)) = r(g \circ f)(x) + s(g \circ f)(y). \]
If $f : V \to V$ is a function whose domain and range are the same space, then
$f \circ f$ is defined. We often write $f^2$ for $f \circ f$. Since $f^2$ is again a function from $V$ to
$V$, we can also define $f^3 = f \circ f^2$, and so on. For instance, we write $D^2$ instead of
$D \circ D$ for the second derivative operator, so $D^2 f$ means the same as $f''$.
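To make the notation for powers of an operator concrete, here is a sketch in which a polynomial is stored as its list of coefficients and $D$ acts on that list. The representation and the helper names `D` and `power` are our own assumptions for illustration.

```python
def D(coeffs):
    """Differentiate a0 + a1 x + a2 x^2 + ..., given as [a0, a1, a2, ...]."""
    derivative = [k * a for k, a in enumerate(coeffs)][1:]
    return derivative or [0]  # the derivative of a constant is the zero polynomial


def power(f, n):
    """Return the n-fold composition f o f o ... o f."""
    def f_n(x):
        for _ in range(n):
            x = f(x)
        return x
    return f_n


p = [2, 1, 0, -1]      # 2 + x - x^3
print(D(p))            # coefficients of 1 - 3x^2
print(power(D, 2)(p))  # coefficients of the second derivative, -6x
print(power(D, 4)(p))  # a cubic is annihilated by D^4
```

Writing `power(D, 2)` rather than a separate second-derivative routine mirrors the definition $D^2 = D \circ D$ directly.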
For functions with the same domain $S$ and the same vector space for their range,
sums and scalar multiples are naturally defined by
\[ (f + g)(x) = f(x) + g(x) \quad\text{and}\quad (rf)(x) = rf(x), \]
whether or not $S$ is a vector space. When the domain of the functions is a vector
space, and the functions are linear, we have the following:
3.5 Theorem. If $f : V \to W$ and $g : V \to W$ are linear functions, then the sum
$f + g : V \to W$ and scalar multiple $rf : V \to W$ are linear also.
The proof amounts to checking linearity by using the definition in much the same
way as in the proof of Theorem 3.4, and we leave it as an exercise.
EXAMPLE 8. We define linear differential operators using both composition and linear combination
of transformations. For example, suppose that p(x), q(x), and r(x) are continuous
functions, and D is the differentiation operator. Then
3C Inverse Functions
[FIGURE 3.7(b): Image line.]
Recall from Section 1 that a function $f : \mathbb{R}^n \to \mathbb{R}^m$ has an inverse function if there
is a function $f^{-1}$ whose domain is the image $F$ of $f$ in $\mathbb{R}^m$, such that
\[ f^{-1}(f(x)) = x \quad\text{and}\quad f(f^{-1}(y)) = y \]
for every $x$ in the domain of $f$ and for every $y$ in the image set $F$ of $f$. Thus $f$ has
an inverse precisely when $f$ is one-to-one; hence we have, by Theorem 3.3:
Proof. If $x$ and $y$ are in the image $F$ of $f$, which is the domain of $f^{-1}$, then there
exist $u$ and $v$ in $V$ such that $f(u) = x$ and $f(v) = y$, so $u = f^{-1}(x)$ and $v = f^{-1}(y)$.
Because $f$ is linear, if $s$ and $t$ are scalars, $sx + ty = f(su + tv)$, so $sx + ty$ is also
in $F$. This shows that the domain of $f^{-1}$ is a vector subspace of $W$. The range of
$f^{-1}$ is the vector space $V$. Apply $f^{-1}$ to both sides of $sx + ty = f(su + tv)$ to get
\[ f^{-1}(sx + ty) = su + tv = sf^{-1}(x) + tf^{-1}(y), \]
so $f^{-1}$ is linear. •
Here are some examples of linear functions that have inverses.
EXAMPLE 9. Let $f : \mathbb{R}^2 \to \mathbb{R}^2$ be the linear function defined for vectors $x$ in $\mathbb{R}^2$ by
EXAMPLE 10. Let $f : \mathbb{R} \to \mathbb{R}^2$ be defined by $f(t) = (t, t)$. Then $f$ is linear and its image is the line
in $\mathbb{R}^2$ with equation $x = y$, as shown in Figure 3.7(b). Thus in this case the image of
$f$ is a proper subspace $V$ of $\mathbb{R}^2$, that is, one that is not all of $\mathbb{R}^2$. Since the equation
$f(t) = (0, 0)$ is satisfied only by $t = 0$, the function $f$ is one-to-one, so it has an inverse
$f^{-1}$, given by $f^{-1}(t, t) = t$ for all vectors $(t, t)$ in $V$.
EXAMPLE 11. The function $S$ from $C[0, 1]$ to $C[0, 1]$ defined by $Su(x) = \int_0^x u(t)\,dt$ has an
image consisting of the continuously differentiable functions $u$ in $C[0, 1]$ for which
$u(0) = 0$. $S$ is one-to-one because $Su$ is identically zero only if $u$ is also. According
to the fundamental theorem of calculus, the inverse of $S$ is the differentiation operator
$D$ restricted to the functions $u$ such that $u(0) = 0$ and $du/dx$ is continuous.
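The relation between $S$ and $D$ can be tried out on polynomials, which form a convenient subspace of $C[0, 1]$. In the sketch below a polynomial is a list of coefficients; that representation and the helper names are assumptions made for the illustration, not something taken from the text.

```python
from fractions import Fraction


def S(coeffs):
    """Integrate from 0 to x: a0 + a1 x + ... becomes a0 x + a1 x^2/2 + ..."""
    return [Fraction(0)] + [Fraction(a) / (k + 1) for k, a in enumerate(coeffs)]


def D(coeffs):
    """Differentiate a polynomial given by its coefficient list."""
    derivative = [k * a for k, a in enumerate(coeffs)][1:]
    return derivative or [Fraction(0)]


u = [3, 0, 6]      # 3 + 6x^2
Su = S(u)          # 3x + 2x^3, which vanishes at x = 0
print(Su)
print(D(Su) == u)  # D undoes S

v = [1, 2]         # 1 + 2x, which does not vanish at x = 0
print(S(D(v)) == v)  # S does not undo D here
```

The last line shows why the inverse of $S$ is $D$ restricted to functions with $u(0) = 0$: applying $D$ first discards the constant term, and $S$ cannot restore it.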
EXERCISES
In Exercises 1 to 4, a value of $n$ and some information about a linear function $f : \mathbb{R}^n \to \mathbb{R}^n$ are given. In each case find the matrix $A$ such that $f(x) = Ax$ for all $x$ in $\mathbb{R}^n$.
1. $n = 2$, $f(e_1) = (1, 2)$, $f(e_2) = (2, 1)$
2. $n = 3$, $f(e_1) = (1, 2, 0)$, $f(e_2) = (-1, 2, 0)$, $f(e_3) = (0, 0, 1)$
3. $n = 2$, $f(1, 1) = (1, 2)$, $f(2, 1) = (2, 1)$
4. $n = 3$, $f(e_1) = e_2$, $f(e_2) = 2e_3$, $f(e_3) = 3e_1$
Each of Exercises 5 to 8 defines a function from $\mathbb{R}^\infty$ to $\mathbb{R}^\infty$, where $\mathbb{R}^\infty$ is the vector space of sequences $(x_k)$, $k = 1, 2, 3, \ldots$ of Example 3 in Section 2. In each case, show that the function is linear and state whether the function is one-to-one or not. If it is one-to-one then describe its inverse and the domain of the inverse.
5. $f(x_1, x_2, x_3, \ldots) = 2(x_1, x_2, x_3, \ldots)$
6. $g(x_1, x_2, x_3, \ldots) = (x_1, 2x_2, 3x_3, \ldots)$
7. $h(x_1, x_2, x_3, \ldots) = (x_2, x_3, x_4, \ldots)$
8. $p(x_1, x_2, x_3, \ldots) = (0, x_1, x_2, x_3, \ldots)$
In Exercises 9 to 12, determine the effect on a sequence $(x_1, x_2, x_3, \ldots)$ of the given combinations of the functions defined in Exercises 5 to 8.
9. $f \circ g$ and $g \circ f$
10. $g \circ h$ and $h \circ g$
11. $g \circ p$ and $p \circ g$
12. $h \circ p$ and $p \circ h$
13. In analogy with Example 5 of the text, define for each fixed m-by-n matrix $B$, the function $g_B : M_{q,m} \to M_{q,n}$ by $g_B(M) = MB$. Show that $g_B$ is linear.
14. If $A$ is m-by-p and $B$ is q-by-n, what are the domain and range of $h_{A,B}$ as defined by $h_{A,B}(M) = AMB$ for all $M$ in $M_{p,q}$? Is $h_{A,B}$ linear?
In Exercises 15 and 16, let $D$ be the differentiation operator $d/dx$. For each given function $u(x)$, find $Du(x)$, $xu(x)$, $D(xu(x))$, and $xDu(x)$.
15. $u(x) = 2x^3 - 4x$
16. $u(x) = e^{3x}$
17. Let $D = d/dx$ act as a transformation from $C^{(1)}(-\infty, \infty)$ to $C(-\infty, \infty)$.
(a) If $u(x) = 2x^3$, find $(Dx - xD)u(x)$, where the operator $Dx$ first multiplies by $x$ and then applies $D$.
(b) Show that $Dx - xD = I$, where $I$ is the identity operator defined by $Iu = u$ for all $u$.
(c) Is $D^2 - x^2$ equal to $(D + x)(D - x)$? To find out, apply both operators to a general function $u(x)$ in $C^{(2)}(-\infty, \infty)$ and see if you get the same result.
18. (a) Show that the equation
\[ e^x(D + 1)u(x) = De^xu(x) \]
is satisfied by all functions $u$ in $C^{(1)}(-\infty, \infty)$.
(b) Show that the equation
\[ (D + 1)u(x) = 0 \]
is satisfied by all functions of the form $u(x) = ce^{-x}$, where $c$ is a constant.
(c) Show that the equation $(D + 1)u(x) = 0$ has only the solutions given in part (b).
In Exercises 19 and 20, show that the given function $S : C[0, 1] \to C[0, 1]$ is linear.
19. $Su(x) = \int_0^x u(t)\,dt$
20. $Su(x) = \int_0^x e^{-t}u(t)\,dt$
In Exercises 21 to 25, $L : V \to W$ is a linear function from some specified vector space to another. Use the given information to answer the questions.
21. $L : \mathbb{R}^2 \to \mathbb{R}$, $L(u_0) = 1$, $L(u_1) = -2$. What is $L(3u_0 - 4u_1)$?
22. $L : \mathbb{R}^2 \to \mathbb{R}^2$, $L(1, 2) = (2, 3)$, $L(-1, 1) = (1, -1)$. Find a vector $u$ in $\mathbb{R}^2$ such that $Lu = (3, 7)$.
23. $L : \mathbb{R}^3 \to \mathbb{R}^2$, $L(e_1) = (1, 2)$, $L(e_2) = (-1, 0)$, $L(e_3) = (2, 2)$. What is $L(-1, 3, 2)$?
24. $L : C[0, 1] \to C[0, 1]$, $L(1) = 1$, $L(x) = x$, $L(x^2) = x^2 + 2$. What is $L(2x^2 + x - 1)$?
25. $L : C[0, 1] \to C[0, 1]$, $L(1) = x$, $L(x) = x^2$. What is $L(2x + 3)$?
In Exercises 26 to 29, $L : V \to V$ is a linear function from a specified vector space $V$ to itself. Use the given information to answer the questions.
26. $L : \mathbb{R}^2 \to \mathbb{R}^2$, $L(x, y) = (x + 2y, 2x + 4y)$. Is there an $(x, y)$ such that $L(x, y) = (-1, 2)$?
27. $L : C^{(1)}[0, 1] \to C[0, 1]$, $Lf = f' - 2f$. Is $f(x) = |x|$ in the domain of $L$?
28. $L : C^{(2)}[0, 1] \to C[0, 1]$, $Lf(x) = 2f''(x)$. Is there an $f$ in the domain of $L$ such that $Lf(x) = x^2$?
29. $L : C[0, 1] \to C[0, 1]$, $Lf(x) = xf(x)$. Is there an $f$ in the domain of $L$ such that $Lf(x) = x^2 + 1$?
In Exercises 30 to 35, for the given linear function, determine whether it has an inverse function; if it has, describe the inverse by using a matrix, or in some other way. Specify the domain of the inverse function.
30. $f : \mathbb{R}^2 \to \mathbb{R}^2$, $f(x, y) =$ [2-by-2 matrix, entries illegible in source] applied to $(x, y)$
31. $f : \mathbb{R}^2 \to \mathbb{R}^2$, $f(x, y) =$ [2-by-2 matrix, entries illegible in source] applied to $(x, y)$
32. $f : \mathbb{R} \to \mathbb{R}^2$, $f(x) = (x, -2x)$
*33. $f : V \to \mathbb{R}^2$, $f(x, y, z) =$ [2-by-3 matrix, entries illegible in source] applied to $(x, y, z)$, where $V$ is the subspace of $\mathbb{R}^3$ consisting of all linear combinations of $(1, 1, 1)$ and $(1, 2, 3)$.
*34. $D : V \to C(-\infty, \infty)$, where $Du = u'$, and $V$ is the subspace of $C^{(1)}(-\infty, \infty)$ consisting of the continuously differentiable functions with $u(0) = 0$
*35. $D : V \to C(-\infty, \infty)$, where $Du = u'$, and $V$ is the subspace of $C^{(1)}(-\infty, \infty)$ consisting of the continuously differentiable functions with $u(0) = u(1)$
36. Prove Theorem 3.5: If $f$ and $g$ are linear, so are the sum $f + g$ and scalar multiple $rf$, for a scalar $r$.
Proof. If x and y are in f (U), then there are vectors u and v in U such that
f(u) = x and f (v) = y. For scalars r and s, ru + sv is also in U because U is a
[FIGURE: Image of a square, panels (a) and (b).]
We already know a whole class of linear functions that illustrate Theorem 4.1. If
$f : \mathbb{R}^n \to \mathbb{R}^m$ is linear, then by Theorem 1.4, its image is the span of the columns
of the matrix that represents $f$ and is therefore a subspace, by Theorem 2.4. For
example, if
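In this example we have to use other reasoning to find the image. If $D = d/dx$ acts in
$C^{(1)}(-\infty, \infty)$, the space of continuously differentiable functions, then the image of
a single function $u$ in $C^{(1)}(-\infty, \infty)$ is a continuous function $v$ in $C(-\infty, \infty)$. The
image of $D$ is all of $C(-\infty, \infty)$, because if $v$ is an arbitrary element of $C(-\infty, \infty)$,
then
\[ u(x) = \int_0^x v(t)\,dt \]
is in $C^{(1)}(-\infty, \infty)$ and satisfies $Du = v$, by the fundamental theorem of calculus.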
EXAMPLE 4. For each fixed vector $a$ in $\mathbb{R}^3$, the formula $f(x) = a \cdot x$ defines a linear function
$f : \mathbb{R}^3 \to \mathbb{R}$. If $a$ is the zero vector, then the null-space of $f$ is the entire domain $\mathbb{R}^3$.
If $a \ne 0$, then the null-space of $f$ is a plane through the origin in $\mathbb{R}^3$ consisting of
all vectors perpendicular to $a$.
EXAMPLE 5. Finding the null-space of a linear function given by a matrix amounts to solving
a homogeneous system of equations. For example, suppose $f : \mathbb{R}^2 \to \mathbb{R}^2$ is
$f(x) = Ax$ with 2-by-2 matrix
\[ A = \begin{pmatrix} 1 & 4 \\ 3 & 12 \end{pmatrix}. \]
To find the null-space $N$ of $f$, we solve the system $Ax = 0$ in the form
\[ \begin{aligned} x + 4y &= 0 \\ 3x + 12y &= 0, \end{aligned} \]
by the row-reduction method of Chapter 2 (or by inspection if you notice that the
second equation is just 3 times the first). The solutions are of the form $(x, y) =
t(-4, 1)$, where $t$ ranges over all real numbers. In other words, $N$ is a line through
the origin in $\mathbb{R}^2$ with slope $-\tfrac{1}{4}$.
EXAMPLE 6. In this example, finding the null-space requires knowledge of calculus. If $D =
d/dx$ acts on $C^{(1)}(-\infty, \infty)$, the null-space of $D$ consists of all functions with
derivative identically zero. It is a theorem of calculus that a function has derivative
0 on an interval if and only if the function is constant on the interval. Hence the
null-space of $D$ is the subspace consisting of constant functions. Since sums and
multiples of constant functions are constant, they do indeed form a vector subspace
of $C^{(1)}(-\infty, \infty)$.
For a second example with the same domain and range, if multiplication by $x$
produces a continuous function $xu(x)$ that is identically zero, then $u$ must have
been identically zero. It follows that the operation of multiplication by $x$, acting on
$C(-\infty, \infty)$, has its null-space consisting of the zero function in $C(-\infty, \infty)$.
4.2 Theorem. Let $f : V \to W$ be linear and let $N$ be the set of all $v$ in $V$ such
that $f(v) = 0$. Then $N$ is a subspace of $V$.
4.3 Theorem. A linear function is one-to-one if and only if its null-space is the
zero subspace consisting of the vector $0$ alone.
4C Nonhomogeneous Equations
The null-space of a linear function $f$ plays a central role in the description of all
solutions of the equation
\[ f(x) = b, \]
where $b$ is a given vector in the range of $f$. The equation
\[ f(x) = 0 \]
is called the associated homogeneous equation of $f(x) = b$, and the null-space of
$f$ is therefore the set of all solutions of the homogeneous equation.
We used the following theorem for solving linear systems of numerical equations
in Chapter 2; this more generally applicable version has formally the same
proof.
Proof. Suppose that $f(x_0) = b$ and also that $f(u) = b$. Since $f$ is linear,
\[ \begin{aligned} x + 4y &= 5 \\ 3x + 12y &= 15 \end{aligned} \]
has one solution that we can guess by inspection: $(x, y) = (1, 1)$. In Example 5 we
found all the solutions of the associated homogeneous system
\[ \begin{aligned} x + 4y &= 0 \\ 3x + 12y &= 0 \end{aligned} \]
to be of the form $(x, y) = t(-4, 1)$. Hence all solutions of the given system are of
the form $(x, y) = t(-4, 1) + (1, 1)$.
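The structure of this solution set, one particular solution plus an arbitrary element of the null-space, is easy to confirm by substitution. The sketch below does so for a handful of integer values of $t$; the helper names are our own, and checking finitely many values of $t$ only illustrates the general statement.

```python
def apply(A, x):
    """Compute the matrix-vector product Ax."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


A = [[1, 4], [3, 12]]
b = [5, 15]
particular = (1, 1)        # the solution found by inspection
null_direction = (-4, 1)   # spans the null-space found in Example 5


def solution(t):
    """The vector t(-4, 1) + (1, 1)."""
    return tuple(t * d + p for d, p in zip(null_direction, particular))


homogeneous_ok = all(
    apply(A, [t * d for d in null_direction]) == [0, 0] for t in range(-5, 6)
)
all_solve = all(apply(A, solution(t)) == b for t in range(-5, 6))
print(homogeneous_ok, all_solve)
```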
EXAMPLE 9. One application of Theorem 4.4 is very familiar in elementary calculus. Suppose we
want to solve the linear equation
\[ Dy = g, \]
EXERCISES
In Exercises 1 to 6, a function $f : \mathbb{R}^n \to \mathbb{R}^m$ is specified by a formula. In each case state which $\mathbb{R}^m$ is the range of $f$, and describe the image of $f$ in $\mathbb{R}^m$. Is it a subspace of $\mathbb{R}^m$? Also state whether or not the function is linear, and if it is linear, find its null-space.
1. $f(x, y) = (x, y, x + y)$
2. $f(t) = (t, 2t, 3t)$
3. $f(u, v) = (u, v, 2u + v + 1)$
4. $f(x, y) = (x + y, x - y)$
5. $f(x, y, z) = (x + 4y + 3z, 2x + 5y + 4z)$
6. $f(t) = (t, 0, 1)$
In each of Exercises 7 to 10, describe carefully the image of the given transformation $F$, state whether the function is linear, and if it is linear, describe its null-space.
7. $F : C(-\infty, \infty) \to C(-\infty, \infty)$, where $F(u)(x) = u(x) + x$.
8. $F : C(-\infty, \infty) \to C(-\infty, \infty)$, where $F(u)(x) = e^{u(x)}$.
9. $F : C(-\infty, \infty) \to C^{(1)}(-\infty, \infty)$, where $F(u)(x) = \int_0^x e^{-t}u(t)\,dt$.
10. $F : C^{(1)}(-\infty, \infty) \to C(-\infty, \infty)$, where $F(u)(x) = u'(x) + u(x)$.
In Exercises 11 to 14, describe the image and the null-space of the function defined by $f(x) = Ax$ for the given matrix $A$.
11. $A =$ [matrix illegible in source]
12. $A =$ [matrix illegible in source]
13. $A =$ [matrix illegible in source]
14. $A =$ [matrix illegible in source]
15. (a) Find all solutions of the homogeneous equation
\[ 2x - 5y = 0. \]
(b) Verify that a linear combination of two solutions is also a solution.
(c) Find a single solution of the nonhomogeneous equation
\[ 2x - 5y = 7; \]
then use Theorem 4.4 to represent all solutions of the nonhomogeneous equation.
16. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be linear.
(a) If $f$ is not identically zero, show that the image of $f$ contains a line through the origin.
(b) If $n > m$, show that the null-space of $f$ contains a line through the origin.
17. Show that the null-space of a linear function $f : \mathbb{R}^n \to \mathbb{R}$ is the set of all vectors orthogonal to some fixed vector $x_0$ in $\mathbb{R}^n$.
18. (a) Find all solutions of the pair of homogeneous equations
\[ \begin{aligned} x + y + z &= 0 \\ x - y + z &= 0. \end{aligned} \]
(b) Verify that a linear combination of two solutions is also a solution.
(c) Verify that $(x, y, z) = (1, 1, -1)$ is a solution of the pair of nonhomogeneous equations
\[ \begin{aligned} x + y + z &= 1 \\ x - y + z &= -1. \end{aligned} \]
Then use Theorem 4.4 to represent all solutions of the pair of nonhomogeneous equations.
19. Consider the homogeneous differential equation $(D - 1)y = 0$.
(a) Verify that the operator $(D - 1) : C^{(1)}(-\infty, \infty) \to C(-\infty, \infty)$ is linear.
(b) Chapter 10, Section 3 shows that all solutions of $(D - 1)y = 0$ have the form $y(x) = ce^x$ for some constant $c$. Verify that $y(x) = 1 + x$ is a solution of the nonhomogeneous equation $(D - 1)y = -x$, and use Theorem 4.4 to represent all solutions of the nonhomogeneous equation.
20. (a) Verify that for each constant $c$ the function $y(x) = \tfrac{1}{27}(x + c)^3$ is a solution of the differential equation $Dy - y^{2/3} = 0$.
(b) Verify that a linear combination of two solutions may not be a solution.
(c) The conclusion of Theorem 4.4 doesn't hold for $f(y) = Dy - y^{2/3}$. Why doesn't the theorem apply to $f(y)$?
21. Define a function $G$ from $C(-\infty, \infty)$ to $C(-\infty, \infty)$ by
\[ Gu(x) = \int_0^x tu(t)\,dt. \]
(a) Show that $G$ is linear.
(b) Show that $G$ is one-to-one.
(c) Describe the image under $G$ of the subspace of $\mathcal{P}$ consisting of all polynomials, $p(x) = a_0 + a_1x + \cdots + a_nx^n$ of degree $\le n$.
(d) Describe the image under $G$ of all of $\mathcal{P}$.
(e) Describe the inverse of $G$.
(f) Find an element of $\mathcal{P}$ that is not in the image of $G$.
22. If $F$ is a function from a set $\mathcal{A}$ to a set $\mathcal{B}$, and $B$ is a subset of $\mathcal{B}$, then the inverse image of $B$ under $F$, denoted by $F^{-1}(B)$, is defined to be the set of all $a$ in $\mathcal{A}$ for which $F(a)$ is in $B$. Show that if $F$ is a linear function from one vector space to another, and $U$ is a subspace of its range, then $F^{-1}(U)$ is a subspace of the domain of $F$. What is the connection with Theorem 4.2?
23. A function $u(x)$ in $C(-\infty, \infty)$, the space of continuous functions on $(-\infty, \infty)$, is called even if $u(x) = u(-x)$ for all $x$. It is called odd if $u(x) = -u(-x)$ for all $x$. (For example, $\cos x$ is even and $\sin x$ is odd.) Let $R$ be the operator defined on $C(-\infty, \infty)$ by $(Ru)(x) = u(-x)$. Let $I$ be the identity operator: $(Iu)(x) = u(x)$.
(a) Show that the graph of $Ru$ is the reflection of the graph of $u$ in the y-axis.
(b) Show that $R$ and $I$ are linear operators and that $R^2 = I$.
(c) Let $F_e = \tfrac{1}{2}(I + R)$. Show that the image of $F_e$ consists of the even functions and that its null-space consists of the odd functions.
(d) Find the image and null-space of $F_o = I - F_e = \tfrac{1}{2}(I - R)$.
(e) Find $F_e^2$ and $F_o^2$ in terms of $F_e$ and $F_o$.
use them to do computations with vectors and linear functions in vector spaces other
than $\mathbb{R}^n$. The other is to show how to define dimension for a vector space for which
we lack immediate geometric intuition. The goals are related because both depend
on first generalizing the standard basis $\{e_1, \ldots, e_n\}$ in $\mathbb{R}^n$.
5A Bases and Coordinates
Recall that if S is a subset of a vector space V, the span of S is defined to consist of
all vectors w that are linear combinations of vectors in S. We showed in Theorem 2.4
of Section 2 that the span of S is always a subspace of V. If the span of S is all of
V we say that S spans V or is a spanning set for V.
EXAMPLE 1. Figure 3.9 shows the span of each of three different sets of vectors in $\mathbb{R}^3$. In
Figure 3.9(a), the span of $x_1$ and $x_2$ is the line containing those two vectors. In
Figure 3.9(b), the span of $y_1$ and $y_2$ is the plane through the origin containing $y_1$, $y_2$.
In Figure 3.9(c), the vector $y_3 = y_1 + y_2$ adds nothing to the span of $y_1$ and $y_2$ because
$y_1 + y_2$ is already in the span. Thus the span of $x_1$ and $x_2$ looks one dimensional.
The span of $y_1$ and $y_2$ looks two dimensional, and since $y_3$ is a linear combination
of $y_1$ and $y_2$, the span of $y_1$, $y_2$, and $y_3$ looks the same. The span of $e_1$, $e_2$, and $e_3$
looks three dimensional.
Recall from Definitions 2.7 and 2.7' of Chapter 2 that a set of vectors in $\mathbb{R}^n$
is linearly independent if no single vector in the set is a linear combination of
other vectors in the set, or equivalently, if the only way to express $0$ as a linear
combination of vectors in the set is by taking all the coefficients equal to zero.
The definitions of spanning set and linearly independent set make sense for general
vector spaces, so we can make the following definitions.
5.1 Definition. A basis for a vector space V is a set of vectors in V that is linearly
independent and spans V.
5.2 Definition. If $V$ has a finite basis $\{b_1, \ldots, b_n\}$ consisting of $n$ vectors, then
$V$ has dimension $n$, written $\dim(V) = n$. If $V$ consists of the zero vector alone we
define $\dim(V) = 0$. If $V$ isn't spanned by a finite set then $V$ is infinite dimensional.
Note. It's conceivable that V might have two bases with unequal numbers of ele-
ments. We prove in Section 5C that this can't happen, so if V has a finite basis,
dim(V) is obtained by counting the vectors in an arbitrary finite basis for V, and it
doesn't matter which basis is used.
[FIGURE 3.9: Spanning sets, panels (a) to (c).]
In the vector space $\mathcal{P}_n$ of polynomials of degree at most $n$, the set $S_n$ consisting of
the $n + 1$ polynomials $1, x, x^2, \ldots, x^n$ is a basis.
The next theorem shows that a basis $\{\mathbf{b}_1, \ldots, \mathbf{b}_n\}$ for $V$ generates a one-to-one
correspondence $\mathbf{v} \leftrightarrow (v_1, \ldots, v_n)$ between vectors $\mathbf{v}$ in $V$ and vectors $(v_1, \ldots, v_n)$
in $\mathbb{R}^n$ that is linear, that is, it preserves addition and scalar multiplication. Using a
different basis will produce a different correspondence of the same kind.
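5.3 Theorem. Let $B = \{\mathbf{b}_1, \ldots, \mathbf{b}_n\}$ be a basis for the vector space $V$. Then for
every $\mathbf{v}$ in $V$, there are unique scalars $v_1, \ldots, v_n$ such that $\mathbf{v} = v_1\mathbf{b}_1 + \cdots + v_n\mathbf{b}_n$.
The correspondence is linear, that is, if $\mathbf{u} \leftrightarrow (u_1, \ldots, u_n)$ and $\mathbf{v} \leftrightarrow (v_1, \ldots, v_n)$,
then
\[ \mathbf{u} + \mathbf{v} \leftrightarrow (u_1 + v_1, \ldots, u_n + v_n) \quad\text{and}\quad r\mathbf{u} \leftrightarrow (ru_1, \ldots, ru_n). \]
Proof. Since $B$ spans $V$, every vector $\mathbf{v}$ is some linear combination $v_1\mathbf{b}_1 + \cdots + v_n\mathbf{b}_n$.
To prove uniqueness we have to show that if $\mathbf{v}$ is also equal to $w_1\mathbf{b}_1 + \cdots + w_n\mathbf{b}_n$
then the $v$'s are the same as the $w$'s. If both linear combinations are equal to $\mathbf{v}$, then
their difference is $0$, so
\[ (v_1 - w_1)\mathbf{b}_1 + \cdots + (v_n - w_n)\mathbf{b}_n = 0. \]
Because $B$ is linearly independent, every coefficient $v_k - w_k$ is zero.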
If $\{b_1, \ldots, b_n\}$ is a basis, then the unique n-tuple $(x_1, \ldots, x_n)$ such that
EXAMPLE 4. Some vector spaces have natural bases relative to which coordinates are particularly
simple. For example, in $\mathbb{R}^n$ the coordinate vector of the n-tuple $(x_1, \ldots, x_n)$
relative to the standard basis $\{e_1, \ldots, e_n\}$ of $\mathbb{R}^n$ is just the n-tuple itself, because
\[ (x_1, \ldots, x_n) = x_1e_1 + \cdots + x_ne_n. \]
Similarly, in the space $\mathcal{P}_n$ of polynomials of degree at most $n$, the coordinate vector
of a polynomial $p(x) = a_0 + a_1x + \cdots + a_nx^n$ relative to the basis $\{1, x, \ldots, x^n\}$
of Example 3 is simply the $(n + 1)$-tuple of coefficients, $(a_0, a_1, \ldots, a_n)$.
EXAMPLE 5. Giving a basis for a subspace is often a good way to describe the subspace. For
example, the functions $e^x$ and $e^{-x}$ span the subspace $S$ of $C(-\infty, \infty)$ consisting of
all functions expressible in the form
\[ c_1e^x + c_2e^{-x}, \]
$c_1$ and $c_2$ constant. Neither function is a constant multiple of the other, so they are
linearly independent, and $\{e^x, e^{-x}\}$ is a basis for the 2-dimensional space $S$.
The functions $\sinh x = \tfrac{1}{2}e^x - \tfrac{1}{2}e^{-x}$ and $\cosh x = \tfrac{1}{2}e^x + \tfrac{1}{2}e^{-x}$ are in $S$, so relative
to the basis $\{e^x, e^{-x}\}$ their respective coordinate vectors are $(\tfrac{1}{2}, -\tfrac{1}{2})$
and $(\tfrac{1}{2}, \tfrac{1}{2})$.
Exercise 13 asks you to show that the pair {coshx, sinhx} is also a basis for S.
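A coordinate vector relative to $\{e^x, e^{-x}\}$ is just a recipe for rebuilding the function from the basis, and that recipe can be checked numerically. In the sketch below the helper `from_coordinates` and the sample points are our own illustrative choices; agreement at a few points is a sanity check, not a proof of the identities.

```python
import math


def from_coordinates(c1, c2):
    """Return the function c1*e^x + c2*e^(-x) with coordinates (c1, c2)."""
    return lambda x: c1 * math.exp(x) + c2 * math.exp(-x)


sinh_from_basis = from_coordinates(0.5, -0.5)
cosh_from_basis = from_coordinates(0.5, 0.5)

sample_points = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
sinh_matches = all(
    math.isclose(sinh_from_basis(x), math.sinh(x), rel_tol=1e-12, abs_tol=1e-12)
    for x in sample_points
)
cosh_matches = all(
    math.isclose(cosh_from_basis(x), math.cosh(x), rel_tol=1e-12, abs_tol=1e-12)
    for x in sample_points
)
print(sinh_matches, cosh_matches)
```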
5B Linear Functions
Theorem 5.3 shows that by fixing bases in finite-dimensional vector spaces $V$ and $W$
we can calculate the results of vector operations in them by working with coordinate
vectors in $\mathbb{R}^n$ and $\mathbb{R}^m$. The next theorem is a generalization of Theorem 1.3 in Section
1, and it shows how to use coordinates to represent a linear function $f : V \to W$ by
a matrix. In Sections 6 and 7 we take up some ways of finding bases that lead to
particularly simple matrix representations for linear functions we are interested in.
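5.4 Theorem. Let $\mathcal{V} = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ be a basis for $V$ and $\mathcal{W} = \{\mathbf{w}_1, \ldots, \mathbf{w}_m\}$ a
basis for $W$. Let $f$ be a linear function with domain $V$ and range $W$ such that
\[ f(v_1\mathbf{v}_1 + \cdots + v_n\mathbf{v}_n) = w_1\mathbf{w}_1 + \cdots + w_m\mathbf{w}_m. \]
Then the coordinates $v_k$ and $w_k$ are related by
\[ \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix} = \begin{pmatrix} w_1 \\ \vdots \\ w_m \end{pmatrix}, \]
where the jth column of the m-by-n matrix consists of the coordinates of $f(\mathbf{v}_j)$
relative to the basis $\mathcal{W}$ as given by $f(\mathbf{v}_j) = a_{1j}\mathbf{w}_1 + \cdots + a_{mj}\mathbf{w}_m$.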
Note. Using standard bases in $V = \mathbb{R}^n$ and $W = \mathbb{R}^m$ we get Theorem 1.3.
Proof. We can combine $f$ with the linear correspondences between vectors and
coordinates given by Theorem 5.3 to obtain a function $F : \mathbb{R}^n \to \mathbb{R}^m$ by defining
$F(v_1, \ldots, v_n)$ to be the coordinate vector $(w_1, \ldots, w_m)$ such that
EXAMPLE 6. In applications of Theorem 5.4 the spaces $V$ and $W$ are very often the same and we
use the same basis to represent both domain and image vectors. For example, suppose
$V = W = \mathbb{R}^2$ and the single basis is $\{v_1, v_2\} = \{(1, 1), (1, 2)\}$. If it's given that
A-'-U -D·
so $f^{-1}$ exists and has matrix $A^{-1}$ relative to the basis $\{v_1, v_2\}$.
We know from Theorem 1.4 that the image of a linear function f(x) = Ax is
spanned by the columns of the matrix A. To get a basis for the image, we need to
check for linear dependencies among the column vectors in A. On the other hand,
to find a basis for the null-space of $f$ we need to represent the solutions of $Ax = 0$
as a linear combination of independent vectors. We'll do both in the next example.
A routine procedure, proved to work in all cases in Theorem 5.10 in the following
Section 5C, is to apply elementary row operations to $A$ to get a reduced matrix $R$
for which dependence relations are easier to see. Since solution vectors $x$ of $Ax = 0$
are unchanged by row operations on $A$, a linear relation among the columns of $A$
carries over to the same relation among the corresponding columns of $R$.
EXAMPLE 7. We'll find bases for the null-space and image of the linear function $f : \mathbb{R}^4 \to \mathbb{R}^3$
defined by the matrix equation $f(x) = Ax$, where
\[ Ax = \begin{pmatrix} 1 & 3 & 4 & 2 \\ -1 & 2 & 1 & 3 \\ 1 & 0 & 1 & -1 \end{pmatrix} x. \]
Row reduction of $A$ takes only a few operations, and the resulting equation $Rx = 0$,
equivalent to $Ax = 0$, is easier to analyze:
\[ Rx = \begin{pmatrix} 1 & 0 & 1 & -1 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix} x = 0, \quad\text{or}\quad \begin{aligned} x_1 + x_3 - x_4 &= 0 \\ x_2 + x_3 + x_4 &= 0. \end{aligned} \]
The null-space of $f(x) = Ax$ consists of the vectors $x = (x_1, x_2, x_3, x_4)$ that satisfy
$Ax = 0$, or $Rx = 0$. Setting the nonleading variables $x_3 = s$ and $x_4 = t$, we find
leading variables $x_1 = -s + t$ and $x_2 = -s - t$. Thus all solutions are
\[ x = s(-1, -1, 1, 0) + t(1, -1, 0, 1). \]
The two vectors span the null-space of $f$, and they're independent because, based
on the first two entries alone, neither is a scalar multiple of the other. Hence the
null-space of $f$ is 2-dimensional with these two vectors as a basis.
For the image of $f$, we know that it's the span of the columns. The first two
columns of $R$ are evidently independent, and the third is their sum while the fourth
is the second minus the first. Hence the first two columns of $A$ suffice to span the image
of $f$. Since these vectors are independent, the image of $f$ is 2-dimensional, with
basis $\{(1, -1, 1), (3, 2, 0)\}$ in $\mathbb{R}^3$.
The dimensions of the null-space and image add up, 2 + 2, to the dimension 4 of
the domain of $f$. We prove in Section 5C that this relation generalizes to all linear
functions $f : V \to W$ if $V$ has a finite basis.
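The row-reduction procedure used in this example can be written out as a short program. What follows is a sketch, not a definitive implementation: the function names `rref` and `null_space_basis` are ours, exact rational arithmetic from the standard `fractions` module is an implementation choice, and the matrix entries are those of the 3-by-4 matrix $A$ as we read them from the example.

```python
from fractions import Fraction


def rref(rows):
    """Return (R, pivots): the reduced row echelon form and its leading columns."""
    R = [[Fraction(v) for v in row] for row in rows]
    pivots = []
    r = 0  # index of the next row that needs a leading entry
    for c in range(len(R[0])):
        pivot_row = next((i for i in range(r, len(R)) if R[i][c] != 0), None)
        if pivot_row is None:
            continue  # column c has no leading entry
        R[r], R[pivot_row] = R[pivot_row], R[r]
        pivot = R[r][c]
        R[r] = [v / pivot for v in R[r]]
        for i in range(len(R)):
            if i != r and R[i][c] != 0:
                factor = R[i][c]
                R[i] = [a - factor * b for a, b in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
        if r == len(R):
            break
    return R, pivots


def null_space_basis(rows):
    """One basis vector of the null-space per nonleading (free) variable."""
    R, pivots = rref(rows)
    ncols = len(R[0])
    basis = []
    for free in (c for c in range(ncols) if c not in pivots):
        v = [Fraction(0)] * ncols
        v[free] = Fraction(1)
        for row_index, pivot_col in enumerate(pivots):
            v[pivot_col] = -R[row_index][free]
        basis.append(v)
    return basis


A = [[1, 3, 4, 2], [-1, 2, 1, 3], [1, 0, 1, -1]]
R, pivots = rref(A)
null_basis = null_space_basis(A)
image_basis = [[row[c] for row in A] for c in pivots]  # leading columns of A

print(R)
print(null_basis)
print(image_basis)
print(len(null_basis) + len(image_basis))  # dimension of the domain
```

Taking the image basis from the columns of `A` at the leading positions of `R`, rather than from the columns of `R` itself, follows the reasoning of the example: row operations preserve linear relations among columns but change the columns.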
EXAMPLE 8. The differential operator $D : \mathcal{P}_3 \to \mathcal{P}_3$ acts linearly. Using the basis $\{1, x, x^2, x^3\}$ for
both the domain and range, the natural coordinates to use are the respective coefficient
vectors, $(a_0, a_1, a_2, a_3)$ and $(b_0, b_1, b_2, b_3)$ for polynomials $a_0 + a_1x + a_2x^2 + a_3x^3$
and $b_0 + b_1x + b_2x^2 + b_3x^3$ in $\mathcal{P}_3$. The 4-by-4 matrix operation that carries out
differentiation in terms of these coordinates is
\[ \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{pmatrix} = \begin{pmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{pmatrix}. \]
Reducing the matrix just replaces the 2 and 3 by 1's, so $a_0$ is the single nonleading
variable, while $a_1$, $a_2$, and $a_3$ are leading variables. To find the null-space we set
the $b_k = 0$ and solve, setting $a_0 = t$ and finding $a_1 = 0$, $2a_2 = 0$, and $3a_3 = 0$.
Thus the null-space consists of the constant polynomials, with one-element basis $\{1\}$.
Since the matrix has just three independent columns, the image is 3-dimensional and
consists of the polynomials with coefficients $b_0 = a_1$, $b_1 = 2a_2$, $b_2 = 3a_3$, and
$b_3 = 0$. Thus $\{1, x, x^2\}$ is a basis for the image. As in the previous example, we see
here that again the sum, 1 + 3, of the dimensions of the null-space and image is 4,
the dimension of the domain.
We have gone through this extensive description of the simple relation $D(a_0 +
a_1x + a_2x^2 + a_3x^3) = a_1 + 2a_2x + 3a_3x^2$ to illustrate some general principles in an
abstract setting that is familiar enough to be thoroughly understood.
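The matrix of $D$ relative to the basis $\{1, x, \ldots, x^n\}$ can be generated column by column, since column $j$ holds the coordinates of $D(x^j) = jx^{j-1}$. The sketch below builds it for $n = 3$ and applies it to one coefficient vector; the function names and the sample polynomial are hypothetical choices made for illustration.

```python
def derivative_matrix(n):
    """Matrix of D on polynomials of degree at most n, relative to {1, x, ..., x^n}."""
    size = n + 1
    M = [[0] * size for _ in range(size)]
    for j in range(1, size):
        M[j - 1][j] = j  # D(x^j) = j x^(j-1) has the single coordinate j in row j - 1
    return M


def apply(M, a):
    """Compute the matrix-vector product Ma."""
    return [sum(m * ai for m, ai in zip(row, a)) for row in M]


D3 = derivative_matrix(3)
a = [5, -1, 4, 2]  # coordinates of 5 - x + 4x^2 + 2x^3
b = apply(D3, a)   # coordinates of the derivative, -1 + 8x + 6x^2

print(D3)
print(b)
print(apply(D3, [7, 0, 0, 0]))  # a constant polynomial lies in the null-space
```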
EXERCISES
In Exercises 1 to 4, show that the given set of vectors forms a basis for $\mathbb{R}^n$ of the appropriate dimension by showing (a) spanning, and (b) independence.
1. $\{(-1, 1), (1, 1)\}$
2. $\{(1, 2), (1, -2)\}$
3. $\{(1, 0, 0), (1, 1, 0), (1, 1, 1)\}$
4. $\{(1, 2, 3), (0, 0, 1), (2, 2, 4)\}$
In Exercises 5 to 8, find the dimension of the subspaces of $\mathbb{R}^2$ or $\mathbb{R}^3$ spanned by the given vectors. [Hint: If a set of vectors is already independent, it forms a basis for the subspace it spans.]
5. $(-1, 1), (1, -1)$
6. $(1, 2), (1, 3)$
7. $(1, 0, 1), (0, 0, 1), (1, 0, 2)$
8. $(1, 2, 3), (2, 3, 4), (3, 4, 5)$
In Exercises 9 to 12, show that the given subset of $C(-\infty, \infty)$ is linearly independent.
9. $\{e^x, e^{2x}, e^{3x}\}$
10. $\{x, e^x, e^{-x}\}$
11. $\{\cos x, \sin x\}$
12. $\{\cos x, x\cos x, x^2\cos x\}$
13. Let $S$ be the subspace of $C(-\infty, \infty)$ with basis $\{e^x, e^{-x}\}$.
(a) Show that the pair $\{\cosh x, \sinh x\}$ is another basis for $S$.
(b) Find the coordinates of $e^x$ and $e^{-x}$ relative to the basis in part (a).
14. Let $S$ be the subspace of $C(-\infty, \infty)$ with basis $\{e^x, e^{-x}\}$. What are the coordinates of the function $3e^x - 4e^{-x}$ relative to the basis $\{\cosh x, \sinh x\}$?
15. (a) Show that the set $\{1, x + 1, (x + 1)^2\}$ is a basis for the space $\mathcal{P}_2$ of polynomials of degree at most 2.
(b) What are the coordinates of the polynomial $x^2 + x + 1$ relative to the basis given in part (a)?
In Exercises 16 to 19, let $B_n = \{1, \cos x, \sin x, \cos 2x, \sin 2x, \ldots, \cos nx, \sin nx\}$. Functions in the subspace $\mathcal{T}_n$ of $C(-\infty, \infty)$ spanned by $B_n$ are called trigonometric polynomials of degree $\le n$.
16. We'll show in Theorem 7.4 of Section 7B that $B_n$ is a linearly independent set, so it is actually a basis for $\mathcal{T}_n$. What is the dimension of $\mathcal{T}_n$?
17. Show that $\cos^2 x$ and $\sin^2 x$ are in $\mathcal{T}_2$, and find their coordinates relative to the basis $B_2$. (Use trigonometric identities.)
*18. Show that for given integers $p$ and $q$, the product $(\cos px)(\sin qx)$ is in $\mathcal{T}_{p+q}$ and find its coordinates relative to the basis $B_{p+q}$.
*19. Show that if $f(x)$ is in $\mathcal{T}_p$ and $g(x)$ is in $\mathcal{T}_q$, then $f(x)g(x)$ is in $\mathcal{T}_{p+q}$.
In Exercises 20 to 23, find a basis for the vector space consisting of all linear combinations of each of the following sets of functions.
20. $\{1, x, x - 1, x^2 + 1\}$
21. $\{e^x, e^{-x}, \sinh x, \cosh x\}$
22. $\{\sin^2 x, \cos^2 x, 1\}$
23. $\{\cos x, \sin x, \sin 2x\}$
In Exercises 24 to 27, let the linear function $f : \mathbb{R}^n \to \mathbb{R}^m$ be $f(x) = Ax$. Find a basis, if there is one, for (a) the image of $f$ and (b) the null-space of $f$.
24. $A =$ [matrix illegible in source]
25. $A =$ [matrix illegible in source]
In Exercises 32 and 33, let $\mathcal{P}_5$ be the space of polynomials of degree at most 5 and let $\mathcal{O}$ be the subspace of $\mathcal{P}_5$ consisting of odd polynomials, that is, polynomials $p(x)$ such that $p(-x) = -p(x)$.
32. Find a basis for $\mathcal{O}$. What is the dimension of $\mathcal{O}$?
33. Is there a polynomial $p(x)$ such that $\{x - x^3, x^3 + x^5, p(x)\}$ is a basis for $\mathcal{O}$?
In Exercises 34 to 37, determine whether $x$ is in the span of $S$.
34. $x = (17, -6, 13)$ and $S = \{(1, -6, 2), (4, 8, 1)\}$
35. $x = (3, -4, 5, 2)$ and $S = \{(1, -2, 1, 1), (2, 1, -2, 1), (3, 1, 1, 1)\}$
36. $x = \sin(x + \pi/7)$ and $S = \{\cos x, \sin x\}$
37. $x = \cos 2x$ and $S = \{1, \cos x, \cos^2 x\}$
38. Let $f$ be the linear function from $\mathbb{R}^2$ to $\mathbb{R}^2$ whose matrix relative to the natural basis is $\begin{pmatrix} 1 & 2 \\ 2 & -1 \end{pmatrix}$. Thus $f(1, 0) = (1, 2)$ and $f(0, 1) = (2, -1)$. Find the matrix of $f$ relative to the basis $\{v_1, v_2\} = \{(1, 1), (1, 2)\}$ for $\mathbb{R}^2$.
39. Let $g$ be the linear function from $\mathbb{R}^2$ to $\mathbb{R}^3$ whose matrix relative to the natural bases for $\mathbb{R}^2$ and $\mathbb{R}^3$ is $\begin{pmatrix} 2 & -1 \\ 1 & 2 \\ -2 & 2 \end{pmatrix}$. [Thus $g(1, 0) = (2, 1, -2)$ and $g(0, 1) = (-1, 2, 2)$.] Find the matrix of $g$ relative to the bases $\{v_1, v_2\} = \{(1, 1), (1, 2)\}$ for $\mathbb{R}^2$ and $\{w_1, w_2, w_3\} = \{(1, 0, 0), (1, 1, 0), (1, 1, 1)\}$ for $\mathbb{R}^3$.
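Proof. Let $\mathcal{V} = \{v_1, \dots, v_n\}$ be a basis for $V$, and let $\{x_1, \dots, x_m\}$ be a subset of $V$ with $m > n$. We need to show that there are numbers $r_1, \dots, r_m$, not all zero, such that
$$\sum_{j=1}^{m} r_j x_j = 0. \tag{1}$$
Since $\mathcal{V}$ spans $V$, each $x_j$ can be written as
$$x_j = \sum_{k=1}^{n} a_{kj} v_k, \quad j = 1, \dots, m, \tag{2}$$
with some numbers $a_{kj}$ as coefficients. Substitution of (2) into (1) and interchanging the order of summation gives
$$\sum_{k=1}^{n} \Big( \sum_{j=1}^{m} a_{kj} r_j \Big) v_k = 0.$$
Because the $v$'s form a basis they are independent, so the coefficients of the $v$'s are all zero, and the $m$ variables $r_1, \dots, r_m$ satisfy the equations
$$\sum_{j=1}^{m} a_{kj} r_j = 0, \quad k = 1, \dots, n.$$
Since $m > n$ these equations form a homogeneous system with more variables than equations. By Theorem 2.5 of Chapter 2, there are infinitely many nonzero solutions for the $r$'s, which is what we wanted to show. •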
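The argument in the proof can be sketched in code for vectors in $\mathbb{R}^n$ with rational entries. The function name `dependence_relation` and the sample vectors are illustrative choices, not from the text; the sketch row-reduces the homogeneous system $\sum_j a_{kj} r_j = 0$ and sets one nonleading variable to 1.

```python
from fractions import Fraction

def dependence_relation(vectors):
    """Return r_1..r_m, not all zero, with sum r_j * x_j = 0.

    Assumes the number m of vectors exceeds their common length n.
    """
    m, n = len(vectors), len(vectors[0])
    # Row k holds the k-th coordinates: sum_j a_kj r_j = 0.
    rows = [[Fraction(vectors[j][k]) for j in range(m)] for k in range(n)]
    pivots, r = [], 0
    for col in range(m):
        pivot = next((i for i in range(r, n) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        rows[r] = [v / rows[r][col] for v in rows[r]]
        for i in range(n):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    # With m > n at least one column has no leading entry; set it to 1.
    free = next(c for c in range(m) if c not in pivots)
    coeffs = [Fraction(0)] * m
    coeffs[free] = Fraction(1)
    for row_index, col in enumerate(pivots):
        coeffs[col] = -rows[row_index][free]
    return coeffs

xs = [(1, 2), (3, 4), (5, 6)]      # three vectors in R^2, so m = 3 > n = 2
r = dependence_relation(xs)
combo = tuple(sum(rj * x[k] for rj, x in zip(r, xs)) for k in range(2))
```

Here `combo` is the linear combination $r_1x_1 + r_2x_2 + r_3x_3$, which should be the zero vector even though the coefficients in `r` are not all zero.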
We now have a short proof that the dimension of V is independent of whatever
finite set of basis vectors we count to compute dim(V).
5.6 Theorem. Let V be a vector space having a basis with n elements. Then
every basis for V has n elements.
Proof. Let $\{v_1, \dots, v_n\}$ and $\{u_1, \dots, u_k\}$ be two bases for $V$; in particular, each set is independent. We can't have $k > n$ because then the $u$'s would be dependent by the previous theorem, and we can't have $n > k$ because then the $v$'s would be dependent. Hence $n = k$. •
It follows that to find the dimension of a vector space all we have to do is pick
a basis and count the number of vectors in it; we always get the same number, no
matter what basis we count. Thus we've proved that the definition of dimension in
Section 5A assigns a clearly defined number called the dimension of V to every
vector space V that has a finite basis. A vector space with a finite basis is called
finite-dimensional, and all others are called infinite-dimensional.
The vector space $\mathcal{P}$ of all polynomials doesn't have a finite basis. If it did have one with $n$ elements, then the linear independence of the $n + 1$ functions $1, x, x^2, \dots, x^n$ would contradict Theorem 5.5. However, $\mathcal{P}$ has a basis consisting of the infinite sequence of independent functions $\{1, x, x^2, \dots\}$, because every polynomial is a unique linear combination of some finite subset of them.
Proof. If $x_1, \dots, x_n$ is an independent set, then that set itself is a basis for $V$. Otherwise, some relation $r_1x_1 + \cdots + r_nx_n = 0$ holds with at least one $r$, say $r_k$, different from 0. Dividing by $r_k$, we get $x_k = -(r_1/r_k)x_1 - \cdots - (r_n/r_k)x_n$, where the $k$th term is omitted from the right side. Substituting the right side for $x_k$ in a linear combination of all the $x$'s, we get a linear combination from which $x_k$ has been eliminated. It follows that we can delete $x_k$, and the span of the remaining vectors will still be all of $V$. If the resulting subset of $x$'s is not independent, we repeat the process until we do arrive at an independent set, which is then a basis for $V$. •
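The deletion process in this proof can be sketched for vectors in $\mathbb{R}^n$ with rational entries. The helper names `rank` and `prune_to_basis` are hypothetical, and testing redundancy by comparing ranks is one convenient way to detect that a vector is a combination of the others; it is an assumption of this sketch, not the text's procedure.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def prune_to_basis(spanning):
    """Delete vectors that depend on the others until the set is independent."""
    basis = list(spanning)
    k = 0
    while k < len(basis):
        rest = basis[:k] + basis[k + 1:]
        if rank(rest) == rank(basis):
            basis = rest          # x_k was redundant: the span is unchanged
        else:
            k += 1
    return basis

spanning = [(1, 0, 1), (0, 0, 1), (1, 0, 2)]
basis = prune_to_basis(spanning)
```

Which vectors survive depends on the order in which they are tested, but the number that survive is always the dimension of the span.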
The previous theorem says we can get a basis from a finite spanning set by deleting
some vectors. For a finite-dimensional space the next theorem shows that we can get
a basis from a linearly independent set by putting more vectors in the set.
5.8 Theorem. Let $S = \{x_1, \dots, x_k\}$ be a linearly independent set in a vector space $V$. If $S$ is not a basis for $V$, we can include more vectors in $S$, possibly arriving at a finite basis for $V$, in which case $V$ is finite-dimensional. Otherwise we can extend $S$ with an infinite sequence of independent vectors so $V$ is infinite-dimensional.
Proof. Suppose $x_1, \dots, x_k$ are linearly independent but don't span all of $V$. Then there is some vector $y$ that is not a linear combination of $x_1, \dots, x_k$. Take $x_{k+1} = y$. We claim that the set $x_1, \dots, x_k, x_{k+1}$ is linearly independent. Suppose that $r_1x_1 + \cdots + r_{k+1}x_{k+1} = 0$. We must show that all the $r$'s are 0. If $r_{k+1}$ were not 0, we could write $x_{k+1} = -(r_1/r_{k+1})x_1 - \cdots - (r_k/r_{k+1})x_k$, which is impossible because $x_{k+1}$ is not a linear combination of the other $x$'s. Therefore we have $r_{k+1} = 0$ and $r_1x_1 + \cdots + r_kx_k = 0$. Since $x_1, \dots, x_k$ are independent, the last equation implies $r_1 = \cdots = r_k = 0$. Thus if a linearly independent set $S$ doesn't span $V$, we can add a vector to $S$ so that the resulting set is also independent.
Repeating the process, we may reach a spanning set $S$ in a finite number of steps, so $S$ becomes a basis and $V$ is finite-dimensional. Otherwise we can find an arbitrarily large independent set so $V$ is infinite-dimensional. •
We extend the definition of $k$-plane in Chapter 2, Section 2D from the spaces $\mathbb{R}^n$ to other vector spaces $V$ by saying that a $k$-plane is either a $k$-dimensional proper subspace $S$ of $V$, as in the next example, or else a translation of $S$ by a fixed vector $v$ in $V$.
We summarize much of what we have proved about finite bases and dimension
as follows. It is proved by a straightforward application of the previous theorems.
Theorem 5.4 in Section 2B shows that introducing bases and coordinates in finite-dimensional vector spaces allows us to study linear functions $f: V \to W$ by looking at linear functions $f: \mathbb{R}^n \to \mathbb{R}^m$ of the form $f(x) = Ax$. The following theorem gives us an effective description of the null-space and image of such a linear function $f$ as $k$-planes containing $0$ in $V$ and $W$; in particular, the dimension of the null-space of $f$ is the number $s$ of columns with nonleading entries in a reduced form $R$ of $A$, and the dimension of the image is the number $r$ of columns with leading entries. Since every column in $R$ is of one kind or the other, it follows that $s + r = n$, where $n$ is the dimension of the domain of $f$.
Proof. We'll assume that reducing the matrix of $f$ gives a reduced matrix $R$ with $k$ nonleading variables. The null-space of $f$ consists of the vectors whose coordinates satisfy the system $Rx = 0$. Since the system $Rx = 0$ always has $x = 0$ as one solution, if there are no nonleading variables then the unique solution to $Rx = 0$ is $x = 0$, so the null-space is the single point $0$, a 0-plane. If $k > 0$, there is for each $i = 1, \dots, k$ a unique solution $u_i$ of $Rx = 0$ in which we choose the $i$th nonleading variable to be 1 and the other nonleading variables to be 0. These $k$ vectors $u_i$ are linearly independent, because $u_i$ has 1 in the position of the $i$th nonleading variable where the remaining $k - 1$ vectors have 0. Hence $u_i$ can't be a linear combination of the other $u_j$. The vectors $x = t_1u_1 + \cdots + t_ku_k$ are the solutions of $Rx = 0$. To see this note first that every such $x$ is a solution, because by the linearity of matrix-vector multiplication
$$Rx = t_1Ru_1 + \cdots + t_kRu_k = 0,$$
because $Ru_1 = \cdots = Ru_k = 0$. Second, the nonleading variables have some values $v_i$ in every solution $v_0$, and using $t_i = v_i$ in the formula for $x$ shows that the solution $v_0$ has the form $x$.
Each of the $r = n - k$ leading variables, necessarily the same as the number of nonzero rows in $R$, corresponds to a column containing a single entry 1, each in a different row and with the other entries 0. These columns are standard basis vectors, spanning the other columns, hence spanning the image of $f$. Thus the image of $f$ is an $r$-plane containing $0$ in $W$. •
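The construction of the vectors $u_i$ in the proof can be sketched as follows. The function names `rref` and `null_space_basis` and the sample matrix `A` are illustrative assumptions; the code reduces the matrix, then builds one solution per nonleading variable by setting that variable to 1 and the other nonleading variables to 0.

```python
from fractions import Fraction

def rref(matrix):
    """Return (R, pivot_columns) for the reduced row echelon form of matrix."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    pivots, r = [], 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        rows[r] = [v / rows[r][col] for v in rows[r]]
        for i in range(len(rows)):
            if i != r:
                factor = rows[i][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    return rows, pivots

def null_space_basis(matrix):
    """One basis vector u_i per nonleading variable, as in the proof."""
    R, pivots = rref(matrix)
    n = len(matrix[0])
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for f in free:
        u = [Fraction(0)] * n
        u[f] = Fraction(1)                 # this nonleading variable is 1
        for row_index, col in enumerate(pivots):
            u[col] = -R[row_index][f]      # solve for each leading variable
        basis.append(u)
    return basis

A = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]
R, pivots = rref(A)
basis = null_space_basis(A)
```

For this sample matrix there are two leading columns and one nonleading column, so the counts $r = 2$ and $s = 1$ add up to $n = 3$.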
EXERCISES
1. (a) Let $E_{ij}$ be the $m$-by-$n$ matrix with 1 in the $ij$th position and zeros elsewhere. Show that the $E_{ij}$ form a basis for the space of all $m$-by-$n$ matrices. What is the dimension of the space?
(b) What is the dimension of the space of diagonal $n$-by-$n$ matrices?
2. Which of the following statements is true for every linear function $f$? Prove your answer.
(a) If $x_1$ and $x_2$ are linearly independent, then so are $f(x_1)$ and $f(x_2)$.
(b) If $f(x_1)$ and $f(x_2)$ are linearly independent, then so are $x_1$ and $x_2$.
3. Show that if $f$ is a one-to-one linear function, then the set $\{f(x_1), \dots, f(x_k)\}$ is linearly independent if and only if $\{x_1, \dots, x_k\}$ is linearly independent. What does this imply about the dimensions of the image and domain of $f$?
4. Let $x_1 = (1, 2, 3)$, $x_2 = (-1, 2, 1)$, $x_3 = (1, 1, 1)$, and $x_4 = (1, 1, 0)$.
(a) Without doing any computation, give a reason why $x_1$, $x_2$, $x_3$, and $x_4$ form a linearly dependent set.
(b) Express $x_1$ as a linear combination of $x_2$, $x_3$, and $x_4$.
5. Prove that two planes that contain $0$ in $\mathbb{R}^3$ intersect in a line or coincide.
6. Prove Theorem 5.9, that is, prove that if $V$ has dimension $n$, then (a) and (b) imply (c), (a) and (c) imply (b), and (b) and (c) imply (a).
7. Prove that if $\dim(W) = n$ and $V$ is a subspace of $W$, then $\dim(V) \le n$.
8. Let a vector space $V$ have dimension $n$ and have an $(n - 1)$-dimensional subspace $W$ with basis $\{x_1, \dots, x_{n-1}\}$. Let $x_n$ be in $V$ but not in $W$. Define $f(a_1x_1 + \cdots + a_nx_n) = a_n$.
(a) Show that $f: V \to \mathbb{R}$ is a linear function with null-space precisely $W$.
(b) Show that if $g: V \to \mathbb{R}$ is another linear function with null-space $W$, then $g = cf$ for some constant $c$. [Hint: Show that $g(x) - g(x_n)f(x) = 0$ for all $x$ in $V$.]
9. (a) Show that if $S$ is a $k$-dimensional subspace of $\mathbb{R}^n$, then $S$ is the null-space of some linear function $f: \mathbb{R}^n \to \mathbb{R}^{n-k}$. [Hint: Begin by picking a basis for $S$ and extending it to a basis for $\mathbb{R}^n$.]
(b) Use part (a) to show that every $k$-dimensional subspace $S$ of $\mathbb{R}^n$ is the intersection of $n - k$ hyperplanes through the origin, that is, of $(n - 1)$-dimensional subspaces of $\mathbb{R}^n$.
10. Assume $V$ and $W$ are finite-dimensional and that $f: V \to W$ is linear. Prove that if $\mathcal{N}$ is the null-space of $f$, then there is a subspace $S$ of $V$ such that $S \cap \mathcal{N} = \{0\}$, and $f$ restricted to $S$ is one-to-one. [Hint: Pick a basis for $\mathcal{N}$ and extend it to be a basis for $V$.]
11. Assume $V$ and $W$ are finite-dimensional, and $f: V \to W$ is linear. Let $\mathcal{N}$ and $f(V)$ be the null-space and image of $f$. Use Theorem 5.10 to prove the equation
$$\dim(\mathcal{N}) + \dim(f(V)) = \dim(V).$$
12. Prove that if $V$ is finite-dimensional and $f: V \to W$ is linear, then the inverse image of a vector $w$ in the image of $f$ is a $k$-plane in $V$, where $k$ is the dimension of the null-space of $f$. (For the definition of inverse image, see Exercise 22 in Section 4.)
13. (a) Prove that if $f: V \to W$ is linear with $\dim(V) > \dim(W)$, then $\dim(\mathcal{N}) > 0$, where $\mathcal{N}$ is the null-space of $f$.
(b) Use part (a) to explain why there can't be an $m$-by-$n$ matrix $A$ and an $n$-by-$m$ matrix $B$ such that $BA = I$ if $m < n$.
14. Find a 3-by-2 matrix $A$ and a 2-by-3 matrix $B$ such that $BA = I$.
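$$\begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}$$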
The main advantage is that diagonal matrices display properties of an operator that are easy to read from the matrix. We'll mostly be concerned with linear operators on $\mathbb{R}^n$, represented by square matrices, but some of the ideas apply as well to other linear operators, such as the differential operators $D = d/dx$ and $D^2 = d^2/dx^2$.
6A Definitions and Examples
For a given linear operator $L: V \to V$ and a vector $x$, there is usually no simple relation between $x$ and $L(x)$. But for special vectors $u$ it may happen that $L(u)$ is a scalar multiple of $u$, so that
EXAMPLE 1
Let $L$ be the linear operator on $\mathbb{R}^2$ defined by the matrix
$$\begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix}.$$
Thus
and that
Before discussing how to find eigenvectors, we'll show how they're useful. Suppose that $L$ is a linear operator on a vector space $V$ and that $u_1, \dots, u_k$ are eigenvectors of $L$ with associated eigenvalues $\lambda_1, \dots, \lambda_k$. In other words,
These formulas are most useful when there are enough linearly independent eigen-
vectors to form a basis for the space, since then every vector is a linear combination
of the eigenvectors and Equation 6.2 applies to the basis representation of every x.
In the general case of an n-dimensional vector space with a basis of n eigenvectors
the second equation in terms of matrices and coordinates becomes
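$$L(x) \longleftrightarrow \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$
[Figure 3.10: a vector $x = ux_1 + vx_2$ expressed in terms of the eigenvectors $u$ and $v$.]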
Using u and v as coordinates gives the diagonal matrix representation
Figure 3.10 shows the effect of L on each of the two eigenvectors u and v, so
the image L(x) of a vector x depends geometrically on the parallelogram law. We
express the effect of L by saying that L is a composition of two operators:
1. A stretch by a factor of 3 away from the line through $v$ along the lines parallel to $x_1$ and with eigenvector coordinate matrix $\begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}$.
2. A reversal of direction on lines parallel to $x_2$, leaving points on the line through $u$ fixed, with eigenvector coordinate matrix $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$.
Operators (1) and (2) together produce the same end result in either order:
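To compute the eigenvalues and associated eigenvectors for the function $L$ of the previous examples, we proceed as follows. We need to find vectors $u \ne 0$, and numbers $\lambda$ such that
$$L(u) - \lambda u = 0.$$
In matrix form, this equation is
$$\begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} - \lambda\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$
or
$$\left[\begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix} - \lambda\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\right]\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$
or
$$\begin{pmatrix} 1 - \lambda & 1 \\ 4 & 1 - \lambda \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \tag{1}$$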
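If this 2-by-2 matrix has an inverse, then the only solution of the equation (1) is $x = 0$ and $y = 0$. Hence we must try to find values of $\lambda$ for which the matrix isn't invertible. By Theorem 5.7 of Chapter 2, Section 5E, this will occur precisely when
$$\det\begin{pmatrix} 1 - \lambda & 1 \\ 4 & 1 - \lambda \end{pmatrix} = (1 - \lambda)^2 - 4 = 0.$$
This quadratic equation in $\lambda$ has roots $\lambda = 3$ and $\lambda = -1$, as we see by inspection, by factoring, or by using the quadratic formula. To find eigenvectors associated with $\lambda = 3$ and $\lambda = -1$, we must find $x$ and $y$, not both zero, satisfying Equation (1). Thus we consider
$$\lambda = 3: \begin{pmatrix} -2 & 1 \\ 4 & -2 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$
and
$$\lambda = -1: \begin{pmatrix} 2 & 1 \\ 4 & 2 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$
Each of these systems reduces to a single equation:
$$\lambda = 3: \quad -2x + y = 0$$
$$\lambda = -1: \quad 2x + y = 0.$$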
It follows that there are many solutions, but all we need is one nonzero solution for
each eigenvalue. We choose for simplicity
6.4
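The computation above can be checked numerically with a short sketch. The matrix entries are read off from the determinant $(1 - \lambda)^2 - 4$ in the text; the helper names and the particular eigenvectors $u = (1, 2)$ and $v = (1, -2)$ (obtained by taking $x = 1$ in $-2x + y = 0$ and $2x + y = 0$) are assumptions of this sketch.

```python
import math

def eigenvalues_2x2(a, b, c, d):
    """Real roots of t^2 - (a + d) t + (ad - bc), the characteristic polynomial."""
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("no real eigenvalues")
    root = math.sqrt(disc)
    return (trace + root) / 2, (trace - root) / 2

def apply(matrix, vec):
    """Matrix-vector product for a matrix given as a tuple of rows."""
    return tuple(sum(m * v for m, v in zip(row, vec)) for row in matrix)

A = ((1, 1), (4, 1))
lam1, lam2 = eigenvalues_2x2(1, 1, 4, 1)

u, v = (1, 2), (1, -2)          # assumed choices, one per eigenvalue
Au, Av = apply(A, u), apply(A, v)
```

Comparing `Au` with `3 * u` and `Av` with `-1 * v` confirms that each chosen vector is mapped to a scalar multiple of itself.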
Because Equations 6.3 and 6.4 are expressed in terms of the matrix $A$ of $L$, we sometimes refer to eigenvectors and eigenvalues of the matrix rather than the function $L$. However, the distinction between the matrix $A$ and the operator $L$ is particularly important here, because even if $A$ is a real matrix, some of the roots $\lambda_k$ may be complex numbers. In that case, the matrix $A - \lambda_k I$ can't be interpreted as an operator on $\mathbb{R}^n$, because it will have complex entries. In Section 6B we'll see that we can still get interesting information by operating instead on the space $\mathbb{C}^n$ of $n$-tuples of complex numbers.
For differential operators the general definitions and principles of Equations 6.1
and 6.2 remain the same, but the determinants and matrices of Equations 6.3 and 6.4
are replaced by calculus computations as in the next example.
If $r$ is a constant, then $(d/dx)e^{rx} = re^{rx}$. If we consider $u(x) = e^{rx}$ as a vector in the space $C^{(\infty)}$, and let $D$ be the differentiation operator, then
$$Du = ru,$$
so $e^{rx}$ is an eigenvector for $D$ associated with the eigenvalue $r$. In particular, $De^x = e^x$, $De^{2x} = 2e^{2x}$, and $De^{-3x} = -3e^{-3x}$, so the functions $c_1e^x$, $c_2e^{2x}$, $c_3e^{-3x}$ are eigenvectors for the differentiation operator $D$ if the $c_i$ are nonzero constants. The associated eigenvalues are 1, 2, and $-3$, so
For this analysis to work, the eigenvectors of $L$ had to be a basis for $\mathbb{R}^2$, but that follows from their linear independence.
EXERCISES
1. The linear operator $L$ from $\mathbb{R}^2$ to $\mathbb{R}^2$ with matrix
has eigenvalues 7 and $-5$. Which of the following vectors is an eigenvector of $L$? For those that are, what is the associated eigenvalue?
In Exercises 2 to 7, find all the eigenvalues of each of the linear operators defined by the following matrices, and for each eigenvalue find an associated eigenvector.
2. ( ! 1) 3. ( ~ ~)
4. ( g ~) 5. ( i i)
6. 7.
n 0n
(!
0 0
l I
I 0
8. Show that 0 is an eigenvalue of a linear operator $L$ if and only if $L$ is not one-to-one.
9. Show that if $f$ is a one-to-one linear operator having $\lambda$ for an eigenvalue, then $f^{-1}$, the inverse of $f$, has $1/\lambda$ for an eigenvalue.
10. Let $f$ be a linear operator having $\lambda$ for an eigenvalue.
(a) Show that $\lambda^2$ is an eigenvalue associated with $f \circ f$.
(b) Show that $\lambda^n$ is an eigenvalue associated with the function we get by composing $f$ with itself $n$ times.
11. Let $C^{(\infty)}(\mathbb{R})$ be the vector space of infinitely often differentiable functions $f(x)$ for $x$ in $\mathbb{R}$. Then the differential operator $D^2$ acts linearly from $C^{(\infty)}(\mathbb{R})$ to $C^{(\infty)}(\mathbb{R})$. This exercise is about the eigenvectors of the operator $D^2$.
(a) For $\lambda > 0$, let $k = \sqrt{\lambda}$ and show that any lin-
(b) For $\lambda < 0$, let $k = \sqrt{-\lambda}$ and show that any linear combination of $\cos kx$ and $\sin kx$ is an eigenvector for $\lambda$.
(c) Find two linearly independent functions such that any linear combination of them is an eigenvector for $\lambda = 0$.
(d) We'll show in Chapter 11 that the functions listed in parts (a), (b), and (c) are the only eigenvectors for the operator $D^2$. Show that the only functions $f(x)$ that satisfy the condition $f(0) = f(\pi) = 0$ and are eigenvectors of $D^2$ are multiples of the functions $\sin kx$, for $k$ a positive integer. What are the associated eigenvalues? (This question comes up in studying the small vibrations of a string anchored at the points $x = 0$ and $x = \pi$.)
12. (a) Find the eigenvalues and an associated pair of eigenvectors for the linear operator $L$ on $\mathbb{R}^2$ having matrix
( ~ ~) ·
(b) Show that the eigenvectors of the function $L$ in part (a) form a basis for $\mathbb{R}^2$, and use this to give a geometric description of the action of $L$ on $\mathbb{R}^2$, as in Example 2.
(c) Generalize the results you found for (a) and (b) to a linear operator $L$ from $\mathbb{R}^n$ to $\mathbb{R}^n$ having a diagonal matrix $\operatorname{diag}(a_1, a_2, \dots, a_n)$.
13. Find the eigenvalues of the operator $G$ on $\mathbb{R}^2$ with matrix ( ! i ), show that the associated eigenvectors span $\mathbb{R}^2$, and describe the action of $G$, as in Example 2.
In Exercises 14 to 17, solve the system of differential equations using eigenvalues and eigenvectors as in Example 5 in the text. The matrices are the same as the ones in Exercises 2 to 5, and you may use the results of those exercises if you have already worked them out.
14. $dx/dt =$ ( ! 1) $x$ 15. $dx/dt =$ ( ~ ~) $x$
6B Bases of Eigenvectors
In Section 6A we saw that the effect of a linear operator on a linear combination of eigenvectors is particularly simple. Here we'll look at conditions under which a finite-dimensional space $V$ has a basis of eigenvectors for a given operator $L(x)$ on $V$ so we can take advantage of Equation 6.2 and, if $V$ is finite-dimensional, express $L$ by a diagonal matrix acting on coordinates $u_1, \dots, u_n$ in $x = u_1\mathbf{u}_1 + \cdots + u_n\mathbf{u}_n$.
6.5 Theorem. The matrix of an operator $L$ with respect to a basis $\{u_1, \dots, u_n\}$ is diagonal if and only if each basis vector $u_i$ is an eigenvector of $L$. If the matrix is diagonal, then the $i$th entry on the diagonal is the eigenvalue associated with $u_i$.
Proof. Recall that the matrix of a linear function $L$ with respect to given bases in its domain and range is defined so that its $j$th column gives the coordinates with respect to the basis in the range of $L(u_j)$, where $u_j$ is the $j$th basis vector of the domain. Here we have an operator $L$, whose range is the same as its domain, and we can use the basis $\{u_1, \dots, u_n\}$ for both domain and range.
If $u_j$ is an eigenvector, then $L(u_j) = \lambda_ju_j$, where $\lambda_j$ is the associated eigenvalue. To express $\lambda_ju_j$ as a linear combination $c_1u_1 + \cdots + c_nu_n$, we take $c_j = \lambda_j$ and all the other $c$'s equal to 0. Thus the coordinate vector of $L(u_j)$, namely the $j$th column of the matrix of $L$, is zero, except for having $\lambda_j$ in the $j$th place. The matrix is diagonal if and only if this condition holds for every column. On the other hand, if the entries in the $j$th column of the matrix are zero except for a value $\lambda_j$ in the $j$th place, we have $L(u_j) = \lambda_ju_j$ and $u_j$ is an eigenvector. Thus if the matrix is diagonal, every basis vector $u_j$ is an eigenvector. •
EXAMPLE 6 In Example 1 of Section 6A we saw that the operator $L$ on $\mathbb{R}^2$ that has the matrix
$$A = \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix}$$
with respect to the standard basis has eigenvectors
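$$= 3\begin{pmatrix} 1 \\ 2 \end{pmatrix} - 2\begin{pmatrix} 1 \\ -2 \end{pmatrix} = \begin{pmatrix} 1 \\ 10 \end{pmatrix}.$$
$$L(x) = \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix}\begin{pmatrix} 3 \\ -2 \end{pmatrix} = \begin{pmatrix} 1 \\ 10 \end{pmatrix}$$
by direct calculation.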
The next theorem shows that eigenvectors associated with different eigenvalues
are linearly independent. We apply it to obtain a condition under which an operator
is guaranteed to have a basis of eigenvectors.
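$$c_1u_1 + \cdots + c_{k+1}u_{k+1} = 0.$$
Applying $L$ to both sides gives
$$c_1\lambda_1u_1 + \cdots + c_{k+1}\lambda_{k+1}u_{k+1} = 0,$$
where we have used $L(u_j) = \lambda_ju_j$. Now multiply the previous equation by $\lambda_1$, and subtract from this equation to get
$$c_2(\lambda_2 - \lambda_1)u_2 + \cdots + c_{k+1}(\lambda_{k+1} - \lambda_1)u_{k+1} = 0.$$
The $k$ vectors $u_2, \dots, u_{k+1}$ are independent by assumption, so $c_j(\lambda_j - \lambda_1) = 0$ for $j = 2, \dots, k + 1$. Since $\lambda_j - \lambda_1 \ne 0$, we conclude that $c_j = 0$ for $j = 2, \dots, k + 1$. Then the first equation implies that $c_1u_1 = 0$, so $c_1 = 0$ also. Hence $u_1, \dots, u_{k+1}$ is an independent set. •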
Theorem 6.6 is not restricted to operators on finite-dimensional spaces. In particular, it applies to the differentiation operator $D$ on the infinite-dimensional space $C^{(1)}$ of continuously differentiable functions. The function $e^{rx}$ is an eigenvector for $D$ associated with the eigenvalue $r$, because $De^{rx} = re^{rx}$; this then implies that the functions $e^{r_1x}, \dots, e^{r_kx}$ are linearly independent, provided that the numbers $r_1, \dots, r_k$ are all different.
Proof. Let $v_1, \dots, v_n$ be eigenvectors associated with the distinct roots $\lambda_1, \dots, \lambda_n$. By Theorem 6.6, they are linearly independent. But $n$ linearly independent vectors in an $n$-dimensional space form a basis, by Theorem 5.9. •
An operator may have a basis of eigenvectors without having a full set of distinct eigenvalues. A simple example is the function from $\mathbb{R}^3$ to $\mathbb{R}^3$ given by the matrix
The characteristic equation is $(2 - \lambda)^2(3 - \lambda) = 0$, and the only roots are 2 and 3. We still have a basis of eigenvectors because in this case there are two linearly independent eigenvectors, $e_1$ and $e_2$, associated with the eigenvalue 2.
This example shows one way that a matrix can fail to have enough eigenvectors to form a basis. Consider the linear function from $\mathbb{R}^2$ to $\mathbb{R}^2$ with the matrix
$$A = \begin{pmatrix} 0 & -5 \\ 2 & 2 \end{pmatrix}.$$
$$\det(A - \lambda I) = \det\begin{pmatrix} -\lambda & -5 \\ 2 & 2 - \lambda \end{pmatrix} = (-\lambda)(2 - \lambda) + 10 = \lambda^2 - 2\lambda + 10 = 0.$$
The formula for solving quadratic equations gives the complex roots $1 + 3i$ and $1 - 3i$. Therefore, the linear function from $\mathbb{R}^2$ to $\mathbb{R}^2$ has no real eigenvalues and consequently no eigenvectors in $\mathbb{R}^2$ if we use only real numbers for scalars.
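If we use complex scalars and consider the matrix $A$ of the previous example as defining a linear function from the complex 2-dimensional coordinate space $\mathbb{C}^2$ into itself, we can use the eigenvalues $1 \pm 3i$. To find the eigenvectors, we proceed just as we have done before with real eigenvalues. For $\lambda = 1 + 3i$, the equation $(A - \lambda I)x = 0$ becomes
$$\begin{pmatrix} -1 - 3i & -5 \\ 2 & 1 - 3i \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = 0.$$
Dividing the first row by $-1 - 3i$ gives
$$\begin{pmatrix} 1 & \frac{1 - 3i}{2} \\ 2 & 1 - 3i \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = 0.$$
Subtracting twice the first row from the second leaves
$$\begin{pmatrix} 1 & \frac{1 - 3i}{2} \\ 0 & 0 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = 0,$$
so $x = -\frac{1 - 3i}{2}y$, and taking $y = -2$ gives the eigenvector $\begin{pmatrix} 1 - 3i \\ -2 \end{pmatrix}$. For $\lambda = 1 - 3i$ a similar calculation leads to $\begin{pmatrix} 1 + 3i \\ -2 \end{pmatrix}$ as an associated eigenvector. Thus viewed as an operator on a complex vector space, $A$ does have a basis of eigenvectors, namely,
$$\begin{pmatrix} 1 - 3i \\ -2 \end{pmatrix}, \quad \begin{pmatrix} 1 + 3i \\ -2 \end{pmatrix},$$
associated with the eigenvalues $1 + 3i$, $1 - 3i$.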
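Python's built-in complex numbers make it easy to check this calculation. The sketch below solves the characteristic equation $\lambda^2 - 2\lambda + 10 = 0$ with the quadratic formula and tests one candidate eigenvector; the vector `w = (1 - 3j, -2)` is a choice made for this sketch (any nonzero complex multiple would serve equally well), and the helper `apply` is a hypothetical name.

```python
import cmath

A = ((0, -5), (2, 2))

# Characteristic polynomial t^2 - (trace) t + det, here t^2 - 2t + 10.
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
root = cmath.sqrt(trace * trace - 4 * det)
lam_plus, lam_minus = (trace + root) / 2, (trace - root) / 2

def apply(matrix, vec):
    """Matrix-vector product; works for complex entries as well."""
    return tuple(sum(m * v for m, v in zip(row, vec)) for row in matrix)

w = (1 - 3j, -2)                      # assumed eigenvector for 1 + 3i
Aw = apply(A, w)
scaled = tuple(lam_plus * t for t in w)
```

If `w` is an eigenvector for $1 + 3i$, then `Aw` and `scaled` agree entry by entry, up to floating-point rounding.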
EXAMPLE 12 For the operator on $\mathbb{R}^2$ with matrix
$$A = \begin{pmatrix} -1 & 2 \\ -2 & 3 \end{pmatrix}$$
the only eigenvalue is $\lambda = 1$, and the eigenvector equation is
$$(A - I)x = \begin{pmatrix} -2 & 2 \\ -2 & 2 \end{pmatrix}x = 0.$$
The solution set is the 1-dimensional subspace of $\mathbb{R}^2$ consisting of all multiples of the vector $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$. Two eigenvectors associated with the eigenvalue 1 are linearly dependent, so can't form a basis. The result is exactly the same if we consider $A$ as the matrix of an operator on the complex space $\mathbb{C}^2$; we still fail to have a basis of eigenvectors.
It's possible to make up the deficiency in the previous example by generalizing the definition of eigenvector, but in our application to differential equations in Chapter 13 we avoid the need for this by using exponential matrices $e^{tA}$ defined for arbitrary square matrices $A$.
6C Changing Coordinates
Theorem 5.4 of Section 6 showed us how to find the matrix representation for a
linear function f: V ~ W relative to bases {v1, ... , v,1 } in V and {w1, ... , Wm} in
W. Here we assume we have a basis different from the standard basis in !Rn and
show how the matrix of a linear operator F: !Rn~ IR.11 relative to the standard basis
is related to the matrix of the same operator relative to the nonstandard basis. We'll
then use this result to show how using a basis of eigenvectors may simplify the
matrix of an operator.
$$A = UBU^{-1}, \quad \text{or} \quad B = U^{-1}AU.$$
Proof. First observe that $x = y_1u_1 + \cdots + y_nu_n = Uy$. Since the columns of $U$ are basis vectors, they're independent, so $U^{-1}$ exists and $y = U^{-1}x$. Then $Ax = UBy = UBU^{-1}x$ for all $x$, in particular when $x = e_k$. Hence $A = UBU^{-1}$. •
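The relation $B = U^{-1}AU$ can be tried out on a small case with exact rational arithmetic. In this sketch the matrix $A$ is the one whose characteristic equation $(1 - \lambda)^2 - 4 = 0$ appears in Section 6A, and the columns of `U` are the assumed eigenvectors $(1, 2)$ and $(1, -2)$; the helper names `matmul` and `inverse_2x2` are hypothetical.

```python
from fractions import Fraction

def matmul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def inverse_2x2(M):
    """Inverse of a 2-by-2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("columns are not a basis")
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1, 1], [4, 1]]
U = [[1, 1], [2, -2]]           # columns: assumed eigenvectors (1, 2), (1, -2)
U_inv = inverse_2x2(U)

B = matmul(U_inv, matmul(A, U))          # matrix relative to the new basis
A_back = matmul(U, matmul(B, U_inv))     # recovers A = U B U^{-1}
```

Because the new basis consists of eigenvectors, `B` comes out diagonal with the eigenvalues 3 and $-1$ on the diagonal, in line with Theorem 6.5.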
B = u- 1AU= ( _j : ) (! : )(
EXAMPLE 14 If $A = \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix}$, then Example 1 shows that $\mathbb{R}^2$ has a basis of eigenvectors
there's no apparent advantage to using the nonstandard basis, but using the eigen-
vector basis for the same operator allows us to simplify by operating with a diagonal
matrix. This change will be useful in Chapter 13.
EXERCISES
3.
(-~ ~)
( 3-2 -2)
4.
n n I
0
0
matrix that represents L relative to that basis.
10. ( =~ ! )
5. -2 -2
2 2 -2
I 6. CO I) 0 I 0
0 0 I
,2. ( -! =i -n 13.
(
-1 00)
-1
- 1 -1
0 0
I
The dot product on $\mathbb{R}^n$ has Properties 7.1 and so is an example of an inner product. We need a notation different from $x \cdot y$ for a general inner product both to make it clear which product we're talking about, and because we sometimes use both together, as in Section 7C.
In Definition 7.1 we assumed homogeneity only in the first entry, but using symmetry twice we get $\langle x, ry \rangle = \langle ry, x \rangle = r\langle y, x \rangle = r\langle x, y \rangle$. Hence $\langle x, ry \rangle$ also equals $r\langle x, y \rangle$. Similarly, we assumed additivity only in the first entry, but we have it in the second entry also: $\langle x, y + z \rangle = \langle x, y \rangle + \langle x, z \rangle$ as a consequence of additivity in the first entry and symmetry.
We define the length, or norm, of a vector by
$$\|x\| = \sqrt{\langle x, x \rangle}.$$
The dot product and length in $\mathbb{R}^n$ satisfy the Cauchy-Schwarz inequality
$$|x \cdot y| \le |x||y|.$$
or $0 \le 2 - 2\langle x, y \rangle$, so $\langle x, y \rangle \le 1$. For nonzero $x$ and $y$, $x/\|x\|$ and $y/\|y\|$ are unit
EXAMPLE 1 We've already seen that the dot product $x \cdot y$ in $\mathbb{R}^n$ is an example of an inner product. In particular, if $x = (x_1, x_2)$ and $y = (y_1, y_2)$,
$$x \cdot y = x_1y_1 + x_2y_2$$
is an inner product in $\mathbb{R}^2$. If we define
$$\langle x, y \rangle = x_1y_1 + 2x_2y_2$$
instead, we get another inner product for $x$ and $y$ in $\mathbb{R}^2$. To see this, all we have to do is check that the relations given under 7.1 are satisfied, which we leave as an exercise. Relative to this inner product, length is defined by
$$\|x\| = \sqrt{x_1^2 + 2x_2^2}.$$
[Figure 3.11, panels (a) and (b), with the label $|x| = 1$.]
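A quick numerical spot check of the alternative inner product $\langle x, y \rangle = x_1y_1 + 2x_2y_2$ is sketched below. The sample vectors and the scalar are arbitrary choices for illustration, and checking a few cases is of course not a proof of the properties.

```python
import math

def inner(x, y):
    """The alternative inner product <x, y> = x1*y1 + 2*x2*y2 on R^2."""
    return x[0] * y[0] + 2 * x[1] * y[1]

def norm(x):
    """Length relative to this inner product: sqrt(<x, x>)."""
    return math.sqrt(inner(x, x))

x, y, z, r = (1, 2), (3, -1), (-2, 5), 4
x_plus_z = (x[0] + z[0], x[1] + z[1])
rx = (r * x[0], r * x[1])

symmetric = inner(x, y) == inner(y, x)
additive = inner(x_plus_z, y) == inner(x, y) + inner(z, y)
homogeneous = inner(rx, y) == r * inner(x, y)
positive = inner(x, x) > 0
```

Note that the standard basis vector $(0, 1)$ has length $\sqrt{2}$ rather than 1 relative to this inner product, which is why the corresponding "unit circle" is an ellipse.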
We have $\langle f, g \rangle = \langle g, f \rangle$ simply because $f(x)g(x) = g(x)f(x)$ for $-\pi \le x \le \pi$. The other properties of the inner product depend on properties of definite integrals; the verification is left as Exercise 11. The importance of this example depends partly on the formulas
$$\langle \cos kx, \cos lx \rangle = \int_{-\pi}^{\pi} \cos kx \cos lx \, dx = 0, \quad k \ne l,$$
where $k$ and $l$ are integers. These formulas follow in a straightforward way using trigonometric identities; their significance here is that in terms of the inner product $\langle f, g \rangle$, and the orthogonality relation $\langle f, g \rangle = 0$, they assert that certain trigonometric functions are orthogonal. We're not claiming that the graphs of $\cos kx$ and $\sin lx$ intersect at right angles, but rather that their ordinary product has average value zero over the interval $-\pi \le x \le \pi$. If $k = l$ direct computation shows that $\langle \cos kx, \cos kx \rangle$ and $\langle \sin kx, \sin kx \rangle$ both equal $\pi$ for $k \ge 1$. (If $k = 0$, $\cos kx = 1$ and $\sin kx = 0$ so the integrals are $2\pi$ and 0 instead.) We'll return to this example in the next section.
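The orthogonality relations can be checked numerically. The sketch below approximates the integral over $[-\pi, \pi]$ with the midpoint rule; the function name `inner`, the step count, and the particular frequencies 2 and 3 are assumptions made for illustration.

```python
import math

def inner(f, g, steps=20000):
    """Midpoint-rule approximation of the integral of f*g over [-pi, pi]."""
    h = 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        x = -math.pi + (i + 0.5) * h
        total += f(x) * g(x)
    return h * total

def cos2(x):
    return math.cos(2 * x)

def cos3(x):
    return math.cos(3 * x)

def sin3(x):
    return math.sin(3 * x)

def one(x):
    return 1.0

orthogonal_cos = inner(cos2, cos3)     # expected to be close to 0
orthogonal_mixed = inner(cos2, sin3)   # expected to be close to 0
length_sq = inner(cos2, cos2)          # expected to be close to pi
constant_sq = inner(one, one)          # expected to be close to 2*pi
```

The results are approximations, so they should be compared with $0$, $\pi$, and $2\pi$ only up to a small tolerance.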
EXERCISES
In Exercises I to 4, detem1ine whether the given formula 9. Show that there is no inner product on JR 3 such that
defines an inner product on ~ 2 • Verify your answer (e1, e1) = (e2, e2) = I, (e3, e3) = 5, (e1, e2) = 0, and
by showing either that the Properties 7 .1 on page 155 (e1, e3) = (ei, e3) = 2. [Hint: Show that if homogeneity
are satisfied or that at least one of them fails. Here and additivity hold, then ( (2, 2, - I), (2, 2, -1)) is nega-
=
x (x1, x2) and y = (y1, Y2). tive, so positivity fails.]
1. (x, y) = XJ YI + 2x2Y2 10. Let V be a 2-dimensional vector space with an inner
product and a basis {u, v}, and let (u, u) = a, (u, v) = b,
2. (x, y) = XJYI -X2Y2
and (v, v) = c.
3. (x, y) = XJYI + x1y2 + X2J1 + 2Qy2 (a) Let x = pu + qv and y = ru + sv be vectors in V.
4. (x, Y) = XJYI Use additivity and homogeneity of the inner product
5. Sketch the "unit circle" determined by llxll = I, if to show that
the nonn is determined by the inner product (x, y) =
3x1 YI + 2X2Y2, where x = (x1, x2), y = (y1, Y2). (b)
(x. y) = (p q ) (: : ) C).
Show that a > 0 and c > 0, and that the Cauchy-
6. (a) Let V be a finite-dimensional vector space with basis Schwarz inequality implies that b 2 < ac.
{v1, ... , v,,}. Let (c) Show that if a, b, and c satisfy the conditions of
part(b) and (x, y) is defined by the formula in part(a)
x=x1v1+···+x11 v,, then (x, y) satisfies the conditions for being an inner
product. [Hint: To show positivity, write out (x, x)
Y=y1v1+···+Y11V11
in terms of a, b, c, p, and q and use the technique
be representations of x and y in the given basis. of completing the square.]
Show that 11. (a) Verify that the formula f::,rc f (x)g(x) dx defines
an inner product on the space C[-rr, rr] of real-
(x, y) = (x1, ... , x,,) • (Y1, ... , y,,) valued continuous functions on [-rr, rr ]. To show
that (f, f) > 0 unless J = 0 you may assume that
defines an inner product on V. if J (x) ~ 0 is continuous but not identically zero
(b) Show that with the inner product as defined in part
(a), the basis elements v 1, ... , v,, satisfy (v;, Vj) =
on an interval a ~ X ~ b, then r:
f(x) dx > O.]
(b) Write out explicitly the meaning of the Cauchy-
0 if i ,f. j and (v;, Vj) = I.
Schwarz inequality for the inner product in
7. Verify the orthogonality relations in text Example 2 by part (a).
using trigonometric identities.
12. Prove the law of cosines for general inner products by
8. Suppose an inner product defined on JR.2 has the val- expanding llx - yll and using the definition of the cosine
ues ((-1, 2), (-1, 2)) = 4, ((2, -5), (2, -5)) = 9, and of the angle 0 between two vectors.
((-1, 2), (2, -5)) = 5. Calculate the lengths lie, II and
lle2 II of the standard basis vectors for the length function
13. Prove that the Pythagorean relation llx-yf = llxf+IIYll 2
holds if and only if x and y are orthogonal.
associated with the given inner product. [Hint: Express e1
and e2 in terms of the vectors (-1, 2) and (2, -5), and 14. Prove that if I/xii= ..j (x, x} is the norm defined by an
use homogeneity and additivity of the inner product.] inner product (x, y), then (x. y) = ¼<llx+yll 2 - llx-yf).
7B Orthogonal Bases
The standard basis vectors $E = \{\mathbf{e}_1, \ldots, \mathbf{e}_n\}$ in $\mathbb{R}^n$ form an orthogonal set since
$\mathbf{e}_j \cdot \mathbf{e}_k = 0$ if $j \neq k$. If in addition all the vectors in an orthogonal set have length 1, the
set is said to be orthonormal. Thus the set $E$ is orthonormal since $|\mathbf{e}_k|^2 = \mathbf{e}_k \cdot \mathbf{e}_k = 1$
for $k = 1, \ldots, n$. The more restrictive orthonormality is a useful property for a basis
$\{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ to have, because we'll see we can then compute the coordinates in a basis
representation $\mathbf{x} = u_1\mathbf{u}_1 + \cdots + u_n\mathbf{u}_n$ directly as inner products $\langle \mathbf{x}, \mathbf{u}_k \rangle$ without solving
systems of linear equations and also compute inner products $\langle \mathbf{x}, \mathbf{y} \rangle$ as dot products of
coordinate vectors. We'll also see how to construct an orthonormal basis starting from
an arbitrary given basis in a finite-dimensional space with an inner product.
The standard basis $\{\mathbf{e}_1, \ldots, \mathbf{e}_n\}$ is an orthonormal basis for $\mathbb{R}^n$ relative to the usual
dot product in $\mathbb{R}^n$ because $\mathbf{e}_i \cdot \mathbf{e}_j = 0$ if $i \neq j$, and $\mathbf{e}_i \cdot \mathbf{e}_i = 1$.
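FIGURE 3.12 (a), (b)
Relative to the basis $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ in $\mathbb{R}^3$, the coordinates of $(1, 2, -1)$ are
\[ v_1 = (1, 2, -1) \cdot \left(\tfrac{2}{7}, \tfrac{3}{7}, \tfrac{6}{7}\right) = \tfrac{2}{7}, \]
\[ v_2 = (1, 2, -1) \cdot \left(\tfrac{3}{7}, -\tfrac{6}{7}, \tfrac{2}{7}\right) = -\tfrac{11}{7}, \]
\[ v_3 = (1, 2, -1) \cdot \left(\tfrac{6}{7}, \tfrac{2}{7}, -\tfrac{3}{7}\right) = \tfrac{13}{7}. \]
Hence $(1, 2, -1) = \tfrac{2}{7}\mathbf{v}_1 - \tfrac{11}{7}\mathbf{v}_2 + \tfrac{13}{7}\mathbf{v}_3$.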
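The coordinate computation can be checked with a short sketch. It assumes the orthonormal basis $(2/7, 3/7, 6/7)$, $(3/7, -6/7, 2/7)$, $(6/7, 2/7, -3/7)$; the helper `dot` and the variable names are illustrative choices, not notation from the text. Exact fractions are used so that the reconstruction of the vector can be compared without rounding.

```python
from fractions import Fraction as F

def dot(a, b):
    """Euclidean dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))

# An orthonormal basis of R^3, assumed here for illustration.
v1 = (F(2, 7), F(3, 7), F(6, 7))
v2 = (F(3, 7), F(-6, 7), F(2, 7))
v3 = (F(6, 7), F(2, 7), F(-3, 7))
basis = [v1, v2, v3]

x = (1, 2, -1)

# Each coordinate is a single dot product; no linear system is solved.
coords = [dot(x, v) for v in basis]

# Rebuild x from its coordinates to confirm the representation.
rebuilt = tuple(sum(c * v[i] for c, v in zip(coords, basis)) for i in range(3))
```

If the basis were only orthogonal rather than orthonormal, each dot product would also have to be divided by the squared length of the corresponding basis vector.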
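EXAMPLE 6
Consider the infinite-dimensional subspace $S$ of $C[-\pi, \pi]$ spanned by the set of
functions
\[ 1, \ \cos x, \ \sin x, \ \cos 2x, \ \sin 2x, \ \ldots, \ \cos kx, \ \sin kx, \ \ldots. \]
In Example 2 of the previous section, we observed that relative to the inner product
\[ \langle f, g \rangle = \int_{-\pi}^{\pi} f(x)g(x)\,dx \]
the orthogonality relations among these functions
all hold. It follows from Theorem 7.4 that these functions are linearly independent.
We could divide each of the functions in the orthogonal set by its length to get
an orthonormal set, but we are mainly interested in computing the coefficients in a
linear combination of $\cos kx$ and $\sin kx$, so what's usually done is to alter the inner
product by a constant positive factor that absorbs the normalization constants. We
can do this because with one exception all these factors are the same:
\[ \langle 1, 1 \rangle = 2\pi, \qquad \langle \cos kx, \cos kx \rangle = \langle \sin kx, \sin kx \rangle = \pi, \quad k = 1, 2, \ldots. \]
Because the factor $\pi$ occurs in each of these numbers, it's customary to alter the
definition of the inner product by dividing by $\pi$ and put
\[ \langle f, g \rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)g(x)\,dx. \]
This doesn't change the orthogonality of the set, but now we have
\[ \langle 1, 1 \rangle = 2, \qquad \langle \cos kx, \cos kx \rangle = \langle \sin kx, \sin kx \rangle = 1, \quad k = 1, 2, \ldots. \]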
Now $|x| \sin kx$ has integral zero over $[-\pi, \pi]$, because it's an odd function. Hence
$b_k = 0$ for $k = 1, 2, \ldots$. On the other hand, the graph of $|x| \cos kx$ is symmetric
about the $y$-axis, so we can just double the integral over $[0, \pi]$. For $k \neq 0$ we
integrate by parts, getting
\[ a_k = \frac{2}{\pi}\int_0^{\pi} x \cos kx\,dx
      = \frac{2}{\pi}\left[\frac{x \sin kx}{k}\right]_0^{\pi} - \frac{2}{k\pi}\int_0^{\pi} \sin kx\,dx
      = \frac{2}{k^2\pi}\Big[\cos kx\Big]_0^{\pi} = \frac{2}{k^2\pi}(\cos k\pi - 1) \]
\[ \phantom{a_k} = \frac{2}{k^2\pi}\left((-1)^k - 1\right)
      = \begin{cases} 0, & k = 2, 4, 6, \ldots, \\[2pt] -\dfrac{4}{k^2\pi}, & k = 1, 3, 5, \ldots. \end{cases} \]
When $k = 0$, we have $a_0 = \frac{2}{\pi}\int_0^{\pi} x\,dx = \pi$. To summarize,
\[ a_0 = \pi, \qquad a_k = \begin{cases} 0, & k = 2, 4, 6, \ldots, \\[2pt] -\dfrac{4}{k^2\pi}, & k = 1, 3, 5, \ldots, \end{cases} \qquad b_k = 0, \quad k = 1, 2, 3, \ldots. \]
The constant term is $a_0/2$ so the $n$th Fourier approximation $T_n(x)$ for odd $n$ is
\[ T_n(x) = \frac{\pi}{2} - \frac{4}{\pi}\cos x - \frac{4}{\pi}\frac{\cos 3x}{3^2} - \cdots - \frac{4}{\pi}\frac{\cos nx}{n^2}. \]
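The closed form for the coefficients can be compared with a direct numerical integration. The following is a minimal sketch, assuming the scaled inner product $\frac{1}{\pi}\int_{-\pi}^{\pi} f g\,dx$ and a simple midpoint rule; the function names `fourier_a` and `exact_a` and the step count are illustrative choices, not part of the text.

```python
import math

def fourier_a(k, n_steps=20000):
    """Midpoint-rule estimate of a_k = (1/pi) * integral of |x| cos(kx) over [-pi, pi]."""
    h = 2 * math.pi / n_steps
    total = 0.0
    for i in range(n_steps):
        x = -math.pi + (i + 0.5) * h
        total += abs(x) * math.cos(k * x)
    return total * h / math.pi

def exact_a(k):
    """Closed form: pi for k = 0, zero for even k > 0, -4/(k^2 pi) for odd k."""
    if k == 0:
        return math.pi
    return 0.0 if k % 2 == 0 else -4.0 / (k * k * math.pi)

for k in range(6):
    print(k, fourier_a(k), exact_a(k))
```

The two columns printed by the loop should agree to several decimal places; the agreement is limited only by the accuracy of the midpoint rule.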
Approximation of the values $f(x)$ by $T_n(x)$ in the previous example is taken up
in Chapter 14, Section 8, but in the present context we consider instead another kind
of approximation that's measured by the norm of a vector, be it a function $f(x)$ in
$C[-\pi, \pi]$ or a vector $\mathbf{x}$ in $\mathbb{R}^n$. The main theorem is as follows, and it's one of the
principal reasons that orthonormal bases are important.
\[ d_n^2 = \|\mathbf{x}\|^2 - \sum_{k=1}^{n} \langle \mathbf{x}, \mathbf{u}_k \rangle^2. \]
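Proof. We'll work with $d_n^2$, using additivity and homogeneity of the inner product:
\[ \|\mathbf{x} - (u_1\mathbf{u}_1 + \cdots + u_n\mathbf{u}_n)\|^2 = \langle \mathbf{x} - (u_1\mathbf{u}_1 + \cdots + u_n\mathbf{u}_n), \ \mathbf{x} - (u_1\mathbf{u}_1 + \cdots + u_n\mathbf{u}_n) \rangle \]
\[ \phantom{\|\mathbf{x}\|} = \|\mathbf{x}\|^2 - 2\sum_{k=1}^{n} u_k \langle \mathbf{x}, \mathbf{u}_k \rangle + \sum_{k=1}^{n} u_k^2. \]
Adding and subtracting $\sum_{k=1}^{n} \langle \mathbf{x}, \mathbf{u}_k \rangle^2$ to the last expression gives a sum of squares:
\[ d_n^2 = \|\mathbf{x}\|^2 - \sum_{k=1}^{n} \langle \mathbf{x}, \mathbf{u}_k \rangle^2 + \sum_{k=1}^{n} \langle \mathbf{x}, \mathbf{u}_k \rangle^2 - 2\sum_{k=1}^{n} u_k \langle \mathbf{x}, \mathbf{u}_k \rangle + \sum_{k=1}^{n} u_k^2 \]
\[ \phantom{d_n^2} = \|\mathbf{x}\|^2 - \sum_{k=1}^{n} \langle \mathbf{x}, \mathbf{u}_k \rangle^2 + \sum_{k=1}^{n} \left(\langle \mathbf{x}, \mathbf{u}_k \rangle - u_k\right)^2. \]
The $u_k$'s occur only in the last sum on the right, which is always nonnegative and
takes on its minimum value of 0 just when $u_k = \langle \mathbf{x}, \mathbf{u}_k \rangle$ for every value of $k$ from
1 to $n$, so these are the values that minimize $d_n^2$. ∎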
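In the previous example the inner product and norm on the vector space $C[-\pi, \pi]$
were respectively
\[ \langle f, g \rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)g(x)\,dx \quad \text{and} \quad \|f\| = \left(\frac{1}{\pi}\int_{-\pi}^{\pi} f(x)^2\,dx\right)^{1/2}. \]
For $f(x) = |x|$ we found the $n$th-degree trigonometric approximation for odd $n$ to be
\[ T_n(x) = \frac{\pi}{2} - \frac{4}{\pi}\cos x - \frac{4}{\pi}\frac{\cos 3x}{3^2} - \cdots - \frac{4}{\pi}\frac{\cos nx}{n^2}. \]
Since the function $f(x) = |x|$ isn't differentiable it certainly isn't in the span of the
trigonometric system so $\|f - T_n\|$ is always positive. Since $\|f\|^2 = \frac{1}{\pi}\int_{-\pi}^{\pi} x^2\,dx = \frac{2\pi^2}{3}$,
\[ d_n^2 = \|f - T_n\|^2 = \frac{2\pi^2}{3} - \left(\frac{\pi^2}{2} + \frac{16}{\pi^2}\sum_{k=1}^{(n+1)/2} \frac{1}{(2k-1)^4}\right). \]
We're not in a position to prove it here, but the trigonometric system is complete so
$d_n^2$ tends to zero as $n$ tends to infinity.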
In Chapter 1, Section 5B we saw how to find the distance between a point $\mathbf{x}$ in $\mathbb{R}^2$
or $\mathbb{R}^3$ and a line or plane when these were given in the respective forms $ax + by = c$
or $ax + by + cz = d$. Using the previous theorem, we can find the distance from a
point to a general $k$-plane if we first represent the $k$-plane parametrically using an
orthonormal set $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$.
To find the distance from the point $(2, 3, 4)$ in $\mathbb{R}^3$ to the line $\mathbf{x} = t(1, 1, 1)$, first
rewrite the line as $\mathbf{x} = s\left(\tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}\right)$.
According to Theorem 7.5, the point on
the line that produces the minimum distance to $(2, 3, 4)$ is the one for which the
single scalar coordinate is $s = (2, 3, 4) \cdot \left(\tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}\right) = \tfrac{9}{\sqrt{3}} = 3\sqrt{3}$.
Hence the minimizing point is $(3, 3, 3)$ and the minimum distance is $|(2, 3, 4) - (3, 3, 3)| =
|(-1, 0, 1)| = \sqrt{2}$. If the line had been shifted so it didn't contain the origin, for
example, $\mathbf{x} = t(1, 1, 1) + (1, 2, 2)$, we would have instead minimized the distance
between the point $(2, 3, 4) - (1, 2, 2) = (1, 1, 2)$ and the line $\mathbf{x} = s\left(\tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}\right)$.
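The distance computation, including the shifted line, can be sketched as follows. This is an illustration under stated assumptions rather than a routine from the text: the function name `distance_to_line` and its `base` parameter are hypothetical, and the formula used is $d^2 = \|\mathbf{x}\|^2 - \langle \mathbf{x}, \mathbf{u} \rangle^2$ for a single unit vector $\mathbf{u}$.

```python
import math

def dot(a, b):
    """Euclidean dot product."""
    return sum(p * q for p, q in zip(a, b))

def distance_to_line(point, direction, base=(0.0, 0.0, 0.0)):
    """Distance from `point` to the line x = t*direction + base."""
    # Shift so that the line passes through the origin.
    x = [p - b for p, b in zip(point, base)]
    # Normalize the direction to get a one-element orthonormal set {u}.
    length = math.sqrt(dot(direction, direction))
    u = [d / length for d in direction]
    s = dot(x, u)  # scalar coordinate of the nearest point on the line
    # d^2 = |x|^2 - <x, u>^2, clamped at zero to guard against rounding.
    return math.sqrt(max(dot(x, x) - s * s, 0.0))

d_origin = distance_to_line((2, 3, 4), (1, 1, 1))
d_shifted = distance_to_line((2, 3, 4), (1, 1, 1), base=(1, 2, 2))
```

For a $k$-plane the same idea applies with one subtracted square per orthonormal spanning vector.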
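Theorem 7.5 shows that if the vectors in an $n$-dimensional space are represented
using an orthonormal basis, then norms are all computable using Euclidean lengths
of coordinate vectors. This is true for inner products also, from which follows the
result for norms. See also Exercise 14.
7.6 Theorem. Let $\{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ be an orthonormal set in a vector space with
corresponding inner product $\langle \mathbf{x}, \mathbf{y} \rangle$. If $\mathbf{x} = x_1\mathbf{u}_1 + \cdots + x_n\mathbf{u}_n$ and $\mathbf{y} = y_1\mathbf{u}_1 + \cdots + y_n\mathbf{u}_n$, then
\[ \langle \mathbf{x}, \mathbf{y} \rangle = (x_1, \ldots, x_n) \cdot (y_1, \ldots, y_n) \quad \text{and} \quad \|\mathbf{x}\| = |(x_1, \ldots, x_n)|. \]
Proof. Using additivity of the inner product and orthonormality of the $\mathbf{u}_k$, we have
\[ \langle \mathbf{x}, \mathbf{y} \rangle = \sum_{j=1}^{n}\sum_{k=1}^{n} x_j y_k \langle \mathbf{u}_j, \mathbf{u}_k \rangle = \sum_{k=1}^{n} x_k y_k. \]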
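The vector $\mathbf{y}_2$ can't be zero, because by its definition that would imply $\mathbf{x}_2$ and $\mathbf{u}_1$
to be linearly dependent, which they are not. Thus the vector $\mathbf{u}_2 = \mathbf{y}_2/\|\mathbf{y}_2\|$ has
length 1.
Next we take $\mathbf{x}_3$ from our independent set and form its projection $\mathbf{p}$ on the
subspace spanned by $\mathbf{u}_1$ and $\mathbf{u}_2$, defined by
\[ \mathbf{p} = \langle \mathbf{x}_3, \mathbf{u}_1 \rangle \mathbf{u}_1 + \langle \mathbf{x}_3, \mathbf{u}_2 \rangle \mathbf{u}_2, \]
and set $\mathbf{y}_3 = \mathbf{x}_3 - \mathbf{p}$.
If $\mathbf{y}_3 = 0$, then $\mathbf{x}_3$, $\mathbf{u}_1$, and $\mathbf{u}_2$ would be dependent; but this is impossible, because the
subspace spanned by $\mathbf{u}_1$ and $\mathbf{u}_2$ is the same as that spanned by $\mathbf{x}_1$ and $\mathbf{x}_2$; therefore,
$\mathbf{x}_3$, $\mathbf{x}_2$, and $\mathbf{x}_1$ would be dependent. Hence $\mathbf{u}_3 = \mathbf{y}_3/\|\mathbf{y}_3\|$ has length 1.
In this way we successively compute $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k$. To get $\mathbf{y}_{k+1}$ we set
7.7
\[ \mathbf{y}_{k+1} = \mathbf{x}_{k+1} - \sum_{j=1}^{k} \langle \mathbf{x}_{k+1}, \mathbf{u}_j \rangle \mathbf{u}_j. \]
We can verify as before that $\mathbf{y}_{k+1}$ is orthogonal to $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k$. To obtain a unit
vector, we can normalize $\mathbf{y}_{k+1}$ to $\mathbf{u}_{k+1} = \mathbf{y}_{k+1}/\|\mathbf{y}_{k+1}\|$. In practice, it may be more
convenient to compute the $\mathbf{y}$'s from the equivalent formula
\[ \mathbf{y}_{k+1} = \mathbf{x}_{k+1} - \sum_{j=1}^{k} \frac{\langle \mathbf{x}_{k+1}, \mathbf{y}_j \rangle}{\langle \mathbf{y}_j, \mathbf{y}_j \rangle}\,\mathbf{y}_j. \]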
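The vectors $\mathbf{x}_1 = (1, -1, 2)$ and $\mathbf{x}_2 = (1, 0, -1)$ span a plane $\mathcal{P}$ in $\mathbb{R}^3$ because they
are linearly independent. To find an orthogonal basis for $\mathcal{P}$, we apply the Gram-Schmidt
process to the basis $\{\mathbf{x}_1, \mathbf{x}_2\}$ for $\mathcal{P}$. We set $\mathbf{y}_1 = \mathbf{x}_1$ and
\[ \mathbf{y}_2 = \mathbf{x}_2 - \frac{\mathbf{x}_2 \cdot \mathbf{y}_1}{|\mathbf{y}_1|^2}\,\mathbf{y}_1
   = (1, 0, -1) - \frac{(-1)}{6}(1, -1, 2)
   = \left(\tfrac{7}{6}, -\tfrac{1}{6}, -\tfrac{2}{3}\right). \]
Thus the plane $\mathcal{P}$ was defined as the set of all linear combinations
\[ s\mathbf{x}_1 + t\mathbf{x}_2, \]
but can also be represented as the set of all linear combinations of $\mathbf{y}_1 = \mathbf{x}_1$ and $\mathbf{y}_2$,
namely,
\[ u\mathbf{y}_1 + v\mathbf{y}_2, \]
where $\{\mathbf{y}_1, \mathbf{y}_2\}$ is the orthogonal pair $\left\{(1, -1, 2), \left(\tfrac{7}{6}, -\tfrac{1}{6}, -\tfrac{2}{3}\right)\right\}$, shown in
Figure 3.14(a).
FIGURE 3.14 (a), (b)
If we add a third vector $\mathbf{x}_3$ that is linearly independent of $\mathbf{x}_1$ and $\mathbf{x}_2$, we can
go on to find $\mathbf{y}_3$ so that $\{\mathbf{y}_1, \mathbf{y}_2, \mathbf{y}_3\}$ is an orthogonal basis for $\mathbb{R}^3$, and $\{\mathbf{y}_1, \mathbf{y}_2\}$ is
an orthogonal basis for $\mathcal{P}$. For instance, if we take $\mathbf{x}_3 = \mathbf{e}_1$, $\mathbf{y}_3$ works out to be
$\left(\tfrac{1}{11}, \tfrac{3}{11}, \tfrac{1}{11}\right)$.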
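The Gram-Schmidt steps in this example can be reproduced with a small sketch. It assumes the unnormalized form of the process, in which each new vector has its components along the earlier $\mathbf{y}_j$ subtracted, and it uses exact fractions; the function name `gram_schmidt` is an illustrative choice.

```python
from fractions import Fraction as F

def dot(a, b):
    """Euclidean dot product."""
    return sum(p * q for p, q in zip(a, b))

def gram_schmidt(vectors):
    """Return orthogonal (not normalized) vectors y_1, y_2, ... built from
    linearly independent input vectors x_1, x_2, ... in the given order."""
    ys = []
    for x in vectors:
        y = [F(c) for c in x]
        for prev in ys:
            # Subtract the component of x along each earlier y_j.
            coeff = dot(x, prev) / dot(prev, prev)
            y = [yi - coeff * pi for yi, pi in zip(y, prev)]
        ys.append(y)
    return ys

y1, y2, y3 = gram_schmidt([(1, -1, 2), (1, 0, -1), (1, 0, 0)])
```

Dividing each resulting vector by its length would give the corresponding orthonormal basis; that step is left out here so the arithmetic stays rational.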
EXAMPLE 11
Let $\mathcal{P}_n[-1, 1]$ be the vector space of polynomials $f(x) = a_0 + a_1x + \cdots + a_nx^n$
restricted to $-1 \leq x \leq 1$. We define an inner product
\[ \langle f, g \rangle = \int_{-1}^{1} f(x)g(x)\,dx. \]
The argument used in Example 3 of Section 5A to show that the functions $1, x, \ldots, x^n$
form a basis for $\mathcal{P}_n$, works also to show that they form a basis for $\mathcal{P}_n[-1, 1]$ when
restricted to $[-1, 1]$. To find an orthogonal basis, let $y_0(x) = 1$. Then let
\[ y_1(x) = x - \frac{\langle x, 1 \rangle}{\langle 1, 1 \rangle}\,1 = x, \]
\[ y_2(x) = x^2 - \frac{\langle x^2, 1 \rangle}{\langle 1, 1 \rangle}\,1 - \frac{\langle x^2, x \rangle}{\langle x, x \rangle}\,x = x^2 - \frac{(2/3)}{2} = x^2 - \frac{1}{3}; \]
this is so because $\langle x^2, 1 \rangle = \tfrac{2}{3}$, $\langle x^2, x \rangle = 0$, and $\langle 1, 1 \rangle = 2$. The graphs of the
three polynomials $y_0(x) = 1$, $y_1(x) = x$, $y_2(x) = x^2 - \tfrac{1}{3}$ are illustrated in
Figure 3.14(b). We get the corresponding orthonormal set by dividing successively
by $\|1\| = \sqrt{2}$, $\|x\| = \sqrt{2/3}$ and $\|x^2 - \tfrac{1}{3}\| = \sqrt{8/45}$ to get
\[ \left\{ \sqrt{\tfrac{1}{2}}, \ \sqrt{\tfrac{3}{2}}\,x, \ \sqrt{\tfrac{45}{8}}\left(x^2 - \tfrac{1}{3}\right) \right\}. \]
Thus we have an orthonormal basis for $\mathcal{P}_2[-1, 1]$. The resulting polynomials are
called normalized Legendre polynomials.
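The same orthogonalization can be carried out mechanically on coefficient lists. The sketch below is an illustration under its own conventions: a polynomial $a_0 + a_1x + \cdots$ is stored as the list `[a0, a1, ...]`, the names `inner` and `orthogonalize` are hypothetical, and the inputs are assumed to be given in order of increasing degree.

```python
from fractions import Fraction as F

def inner(p, q):
    """<p, q> = integral over [-1, 1] of p(x) q(x) dx for coefficient lists."""
    total = F(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            n = i + j
            if n % 2 == 0:  # odd powers of x integrate to zero on [-1, 1]
                total += F(a) * F(b) * F(2, n + 1)
    return total

def orthogonalize(polys):
    """Unnormalized Gram-Schmidt on polynomials listed by increasing degree."""
    result = []
    for p in polys:
        y = [F(c) for c in p]
        for prev in result:
            coeff = inner(p, prev) / inner(prev, prev)
            padded = prev + [F(0)] * (len(y) - len(prev))
            y = [yi - coeff * pi for yi, pi in zip(y, padded)]
        result.append(y)
    return result

# Start from the basis 1, x, x^2.
y0, y1, y2 = orthogonalize([[1], [0, 1], [0, 0, 1]])
norms_squared = [inner(y, y) for y in (y0, y1, y2)]
```

The squared norms are what one divides by (after taking square roots) to pass from the orthogonal polynomials to the normalized ones.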
EXERCISES
1. Find a vector $(x, y, z)$ in $\mathbb{R}^3$ such that the triple of vectors $(1, 1, 1)$, $(-1, \tfrac{1}{2}, \tfrac{1}{2})$, $(x, y, z)$ forms an orthogonal basis for $\mathbb{R}^3$. Then normalize this basis by dividing each vector by its length.
2. The vectors $(1, 1, 1)$ and $(1, 2, 1)$ span a plane $\mathcal{P}$ in $\mathbb{R}^3$. Use the Gram-Schmidt process to find an orthogonal basis for $\mathbb{R}^3$ in which the first two vectors form an orthogonal basis for $\mathcal{P}$.
3. Find an orthogonal basis for $\mathbb{R}^4$ in which the first three vectors form a basis for the subspace $S$ spanned by $(1, 2, 1, 1)$, $(-1, 0, 1, 0)$, and $(0, 1, 0, 2)$.
4. Let $\mathcal{P}_2$ be the three-dimensional space of quadratic polynomials $p(x) = a + bx + cx^2$, restricted so that $0 \leq x \leq 1$. If $\mathcal{P}_2$ is given the inner product
\[ \langle p, q \rangle = \int_0^1 p(x)q(x)\,dx, \]
find an orthonormal basis for $\mathcal{P}_2$. [Hint: One basis for $\mathcal{P}_2$ is $\{1, x, x^2\}$.]
5. Show that applying the Gram-Schmidt process to the three vectors $(3, 0, 0)$, $(1, 1, 0)$, and $(1, 1, 1)$ in order produces an orthogonal basis that normalizes to the standard basis $\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$.
6. (a) Show that applying the Gram-Schmidt process to an orthonormal set in order gives the orthonormal set back again.
(b) Let $\{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ and $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ be two orthonormal sets in a vector space, such that the subspaces spanned by $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$ and $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ are the same for $k = 1, 2, \ldots, n$. Show that $\mathbf{u}_k = \pm\mathbf{v}_k$ for $k = 1, 2, \ldots, n$.
7. Show that if $\{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ and $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ are two orthonormal bases for a vector space, then the matrix $M$ used to change from one set of coordinates to another has columns that form an orthonormal set in $\mathbb{R}^n$. [Hint: Express $\mathbf{u}_k$ as a linear combination of $\mathbf{v}_1, \ldots, \mathbf{v}_n$, and use Theorem 7.6.]
8. Find the distance between the point $(2, 3, 4)$ and the plane parametrized by $\mathbf{x} = u(1, 2, 1) + v(1, -1, 1) + (1, 1, 2)$. Note that conveniently $(1, 2, 1) \cdot (1, -1, 1) = 0$.
9. Find the distance between the point $(2, 3, 4)$ and the plane parametrized by $\mathbf{x} = u(1, 1, 1) + v(1, -1, 1)$. Since $(1, 1, 1) \cdot (1, -1, 1) \neq 0$, use the Gram-Schmidt process first.
10. Let $S = \{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ be an orthonormal set. Prove that the vector $\sum_{k=1}^{n} u_k\mathbf{u}_k$ is orthogonal to $\mathbf{x} - \sum_{k=1}^{n} u_k\mathbf{u}_k$ if and only if $u_k = \langle \mathbf{x}, \mathbf{u}_k \rangle$. This is another way to characterize the choice of coefficients in Theorem 7.5, showing that the nearest point to $\mathbf{x}$ in the span of $S$ is the perpendicular projection of $\mathbf{x}$ onto the span of $S$.
11. For a given inner product $\langle \mathbf{x}, \mathbf{y} \rangle$ on $\mathbb{R}^n$, let $A$ be the $n$-by-$n$ matrix defined to have entries $a_{ij} = \langle \mathbf{e}_i, \mathbf{e}_j \rangle$ for $i, j = 1, \ldots, n$. Show that for arbitrary $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$, $\langle \mathbf{x}, \mathbf{y} \rangle = \mathbf{x} \cdot A\mathbf{y}$.
12. An $n$-by-$n$ matrix $A = (a_{ij})$ is symmetric if $a_{ij} = a_{ji}$ for $i, j = 1, \ldots, n$. Prove that if $A$ is symmetric then $A\mathbf{x} \cdot \mathbf{y} = \mathbf{x} \cdot A\mathbf{y}$ for all $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$.
13. Show that if $A$ is symmetric and $\langle \mathbf{x}, \mathbf{y} \rangle$ is defined to be $\mathbf{x} \cdot A\mathbf{y}$ for $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$, then $\langle \mathbf{x}, \mathbf{y} \rangle$ has all the properties of an inner product except possibly the positivity property: $\langle \mathbf{x}, \mathbf{x} \rangle > 0$ unless $\mathbf{x} = 0$.
14. $A$ is called a positive definite matrix if it is symmetric and $\langle \mathbf{x}, \mathbf{y} \rangle = \mathbf{x} \cdot A\mathbf{y}$ has the positivity property of an inner product. Show that a diagonal matrix is positive definite if and only if all its diagonal entries are positive.
*15. Show that a symmetric 2-by-2 matrix $A = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$ is positive definite if and only if $a$ and $\det A = ac - b^2$ are both positive.
*16. Even if $\langle \mathbf{x}, \mathbf{y} \rangle$ doesn't have the positivity property, but has the other properties of an inner product, Formula 7.8 in the Gram-Schmidt process still makes sense unless $\langle \mathbf{y}_k, \mathbf{y}_k \rangle = 0$ at some stage. This observation leads to an efficient method for determining whether a symmetric matrix is positive definite or not.
(a) Suppose that $\langle \mathbf{x}, \mathbf{y} \rangle$ is defined to be $\mathbf{x} \cdot A\mathbf{y}$ for a symmetric matrix $A$, as in Exercise 13. Show that if applying the formulas of the Gram-Schmidt process to the standard basis vectors $\{\mathbf{e}_1, \ldots, \mathbf{e}_n\}$ leads at some stage to a $\mathbf{y}_k$ with $\langle \mathbf{y}_k, \mathbf{y}_k \rangle \leq 0$ then $A$ is not positive definite.
(b) Show that if $\langle \mathbf{y}_k, \mathbf{y}_k \rangle > 0$ at every stage then $A$ is positive definite.
*17. Show that the result of Exercise 12 remains valid if $\mathbf{x}$ and $\mathbf{y}$ are allowed to have complex entries, and use this to prove that all the eigenvalues of a real symmetric matrix are real numbers. [Hint: If $A\mathbf{y} = \lambda\mathbf{y}$ with $\lambda$ and $\mathbf{y}$ complex, show that $A\bar{\mathbf{y}} = \bar{\lambda}\bar{\mathbf{y}}$ and put $\mathbf{x} = \bar{\mathbf{y}}$ in the equation of Exercise 12.]
Given what we know about rotations in $\mathbb{R}^2$ it's easy to find the matrix of a rotation
in $\mathbb{R}^3$ about one of the coordinate axes. If $R$ is a rotation through angle $\theta$ about
the $x_1$-axis, then $R(\mathbf{e}_1) = \mathbf{e}_1$ and $R$ rotates $\mathbf{e}_2$ and $\mathbf{e}_3$ through the angle $\theta$ in the
$x_2x_3$-plane, so its matrix is
\[ M = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}, \]
in which the submatrix in the lower right corner is the same as the matrix $R_\theta$ of Example 8 in Section 1.
For a numerical example, we'll find the matrix of a rotation $R$ through 60° about an
axis in the direction of $\mathbf{a} = (1, 1, 1)$. Figure 3.15(a) shows the standard basis vectors
and their images under $R$, with the axis of rotation shown as a dotted line.
FIGURE 3.15 (a), (b)
The first step is to find an orthonormal basis $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$ with $\mathbf{u}_1$ in the same
direction as $\mathbf{a}$. We could use the Gram-Schmidt process here, but the following
method is less work. We'll start by finding an orthogonal basis containing $\mathbf{a}$. For a
vector orthogonal to $\mathbf{a}$ we can take an arbitrary $\mathbf{b} = (b_1, b_2, b_3)$ with $\mathbf{a} \cdot \mathbf{b} = b_1 +
b_2 + b_3 = 0$, for instance, $\mathbf{b} = (0, 1, -1)$. Since we are working in $\mathbb{R}^3$, we can get
a third vector perpendicular to $\mathbf{a}$ and $\mathbf{b}$ by finding the cross product $\mathbf{c} = \mathbf{a} \times \mathbf{b}$, which
works out to be $(-2, 1, 1)$. Now we normalize by dividing each of $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ by its
length to get unit vectors $\mathbf{u}_1 = (1/\sqrt{3}, 1/\sqrt{3}, 1/\sqrt{3})$, $\mathbf{u}_2 = (0, 1/\sqrt{2}, -1/\sqrt{2})$, and
$\mathbf{u}_3 = (-2/\sqrt{6}, 1/\sqrt{6}, 1/\sqrt{6})$. Using these vectors as the columns of a matrix gives
\[ U = \begin{pmatrix} 1/\sqrt{3} & 0 & -2/\sqrt{6} \\ 1/\sqrt{3} & 1/\sqrt{2} & 1/\sqrt{6} \\ 1/\sqrt{3} & -1/\sqrt{2} & 1/\sqrt{6} \end{pmatrix}. \]
Since $\cos 60° = 1/2$ and $\sin 60° = \sqrt{3}/2$,
\[ M = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1/2 & -\sqrt{3}/2 \\ 0 & \sqrt{3}/2 & 1/2 \end{pmatrix} \]
is the matrix of $R$ relative to the basis $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$.
Then the matrix of $R$ relative to the standard basis is $A = UMU^{-1}$ according
to the formula given above. We can use orthonormality to find $U^{-1}$ without any
computation. It is simply $U^t$, the transpose of $U$, obtained by flipping $U$ about its
main diagonal, so that the rows of $U$ become the columns of $U^t$ and vice versa. In
this case,
\[ U^t = \begin{pmatrix} 1/\sqrt{3} & 1/\sqrt{3} & 1/\sqrt{3} \\ 0 & 1/\sqrt{2} & -1/\sqrt{2} \\ -2/\sqrt{6} & 1/\sqrt{6} & 1/\sqrt{6} \end{pmatrix}. \]
To see that $U^{-1} = U^t$ when $U$ is
a square matrix with orthonormal columns, note that row $i$ of $U^t$ (which is column
$i$ of $U$) is just $\mathbf{u}_i$, and column $j$ of $U$ is $\mathbf{u}_j$. The $ij$th entry in the product $U^tU$ is
therefore $\mathbf{u}_i \cdot \mathbf{u}_j$, which is 1 when $i = j$ and 0 when $i \neq j$ because $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$
is an orthonormal basis. Thus $U^tU$ is the identity matrix and therefore $U^{-1} = U^t$.
Finally,
\[ A = UMU^{-1} = UMU^t = \frac{1}{3}\begin{pmatrix} 2 & -1 & 2 \\ 2 & 2 & -1 \\ -1 & 2 & 2 \end{pmatrix} \]
is the matrix of $R$ relative to the standard basis. As a partial check on the calculation,
you can verify that $A\mathbf{u}_1$ is equal to $\mathbf{u}_1$ as it should be.
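The construction of a rotation matrix about a given axis can be sketched in a few lines. This is a floating-point illustration under stated assumptions: the function name `rotation_about` and its `helper` parameter are hypothetical, and the caller is assumed to supply a nonzero helper vector orthogonal to the axis, as $\mathbf{b}$ is to $\mathbf{a}$ above.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def rotation_about(axis, helper, theta):
    """Standard-basis matrix of the rotation through `theta` about `axis`.
    `helper` must be a nonzero vector orthogonal to `axis`."""
    c = cross(axis, helper)
    # Columns of U are the orthonormal vectors u1, u2, u3.
    U = transpose([unit(axis), unit(helper), unit(c)])
    M = [[1, 0, 0],
         [0, math.cos(theta), -math.sin(theta)],
         [0, math.sin(theta), math.cos(theta)]]
    return matmul(matmul(U, M), transpose(U))  # U M U^t, since U^{-1} = U^t

A = rotation_about((1, 1, 1), (0, 1, -1), math.radians(60))
fixed = [sum(A[i][j] * 1.0 for j in range(3)) for i in range(3)]  # A applied to (1, 1, 1)
```

Applying `A` to the axis vector should return the axis vector, which is the partial check suggested in the text.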
EXAMPLE 14
We'll now see how to find the axis of a rotation $R$ if we're given its matrix $A$. For
a numerical example we'll take the matrix
\[ A = \begin{pmatrix} 2/7 & 3/7 & 6/7 \\ 3/7 & -6/7 & 2/7 \\ 6/7 & 2/7 & -3/7 \end{pmatrix}, \]
whose columns
are the orthonormal vectors $\mathbf{v}_1$, $\mathbf{v}_2$, $\mathbf{v}_3$ of Example 5, and which you can check has
determinant 1.
If $\mathbf{a}$ has the same direction as the axis of rotation, then $R(\mathbf{a}) = A\mathbf{a} = \mathbf{a}$, so $\mathbf{a}$ is
an eigenvector of $A$ associated with the eigenvalue 1, and
\[ (A - I)\mathbf{a} = \begin{pmatrix} -5/7 & 3/7 & 6/7 \\ 3/7 & -13/7 & 2/7 \\ 6/7 & 2/7 & -10/7 \end{pmatrix}\begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} = 0. \]
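One way to solve $(A - I)\mathbf{a} = 0$ for a rotation in $\mathbb{R}^3$ is sketched below. It assumes the matrix with rows $(2/7, 3/7, 6/7)$, $(3/7, -6/7, 2/7)$, $(6/7, 2/7, -3/7)$, and it uses a cross product of two rows of $A - I$ in place of row reduction; that substitution is a choice made for this sketch, not necessarily the method the text goes on to use.

```python
from fractions import Fraction as F

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

# Rotation matrix assumed for illustration (orthonormal columns, determinant 1).
A = [[F(2, 7), F(3, 7), F(6, 7)],
     [F(3, 7), F(-6, 7), F(2, 7)],
     [F(6, 7), F(2, 7), F(-3, 7)]]

# Rows of A - I; an axis vector must be orthogonal to every one of them.
rows = [[A[i][j] - (1 if i == j else 0) for j in range(3)] for i in range(3)]

# For a rotation other than the identity, A - I has rank 2, so the cross
# product of two independent rows spans the solutions of (A - I)a = 0.
axis = cross(rows[0], rows[1])

# Rescale so the middle entry is 1, which gives small whole numbers here.
scaled = tuple(c / axis[1] for c in axis)

# Check that A leaves the axis fixed.
image = tuple(sum(A[i][j] * scaled[j] for j in range(3)) for i in range(3))
```

If the two chosen rows happened to be proportional, the cross product would be zero and a different pair of rows would be needed.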
EXAMPLE 15
Orthonormal bases make it easy to describe the geometric operation of reflection in
a subspace. Let $V$ be a vector space with an inner product and an orthonormal basis
$\{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$, and let $U_k$ be the subspace spanned by the first $k$ basis vectors. Also
let $U_0$ be the zero subspace. The linear function $r: V \to V$ defined by
6. Let $r: V \to V$ be reflection in a subspace $U$ of a space $V$ with an inner product as in Example 15.
(a) Show that $r$ has the property that $r(\mathbf{x}) = \mathbf{x}$ for $\mathbf{x}$ in $U$ and $r(\mathbf{x}) = -\mathbf{x}$ if $\mathbf{x}$ is perpendicular to every vector in $U$.
(b) Show that every linear function from $V$ to $V$ with the property in part (a) must be identical with $r$.
The following exercises outline the steps in a proof that every function from $\mathbb{R}^2$ to $\mathbb{R}^2$ or from $\mathbb{R}^3$ to $\mathbb{R}^3$ that preserves lengths and takes the origin to itself is either a rotation, a reflection, or the composition of a rotation and a reflection. We pointed out just before Examples 7 and
(a) Show that $M$ has the form $\begin{pmatrix} \lambda & 0 & 0 \\ 0 & a & b \\ 0 & c & d \end{pmatrix}$, with the submatrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ having columns forming an orthonormal set.
(b) Show that if $\lambda = +1$, then $f$ is either a rotation with axis $\mathbf{u}_1$ or a reflection in a plane containing $\mathbf{u}_1$.
(c) Show that if $\lambda = -1$, then $f$ is either reflection in a line perpendicular to $\mathbf{u}_1$ or the composition of a rotation with axis $\mathbf{u}_1$ and reflection in the plane perpendicular to $\mathbf{u}_1$.
Chapter 3 REVIEW
In Exercises 1 to 4 determine whether the given set of vectors is independent.
1. $\mathbf{x} = (2, 3)$, $\mathbf{y} = (-1, 2)$, $\mathbf{z} = (1, 0)$
2. $\mathbf{x} = (0, 1, 2)$, $\mathbf{y} = (0, 0, 1)$, $\mathbf{z} = (1, 0, 0)$
3. $\mathbf{x} = (-1, 2, 0, 3)$, $\mathbf{y} = (0, 1, -1, 4)$, $\mathbf{z} = (-1, 3, -1, 1)$
4. $\mathbf{x} = (0, 0, 0)$, $\mathbf{y} = (1, 0, 0)$, $\mathbf{z} = (0, 1, 0)$
In Exercises 5 to 10, find the matrix of a linear function $f$ with the given properties.
5. The null-space of $f: \mathbb{R}^3 \to \mathbb{R}^3$ is the plane spanned by $(1, 1, 1)$ and $(1, -1, 1)$, and $f(0, 0, 1) = (0, 0, 1)$.
6. The image of $f: \mathbb{R}^2 \to \mathbb{R}^3$ is the plane spanned by $(1, 1, 1)$ and $(1, -1, 1)$.
7. The null-space of $f: \mathbb{R}^3 \to \mathbb{R}^2$ is the line $\mathbf{x} = t(1, 1, 1)$, and $f(1, 0, 0) = (1, 1)$ and $f(0, 1, 0) = (1, 2)$.
8. $f: \mathbb{R}^3 \to \mathbb{R}^3$ has $f(1, 0, 0) = (0, 0, 1)$, $f(0, 0, 1) = (-1, 0, 1)$, and $f(\mathbf{x}) = \mathbf{x}$ for all points $\mathbf{x} = (0, t, 0)$.
9. $f: \mathbb{R}^3 \to \mathbb{R}^3$ has $f(2, 2, 2) = (1, 0, 1)$, and $f(\mathbf{x}) = \mathbf{x}$ for $\mathbf{x}$ in the plane spanned by $(0, 1, 1)$ and $(1, 1, 0)$.
there is a basis of eigenvectors for the matrix operator.
17. ( _; -3) -4
18. ( -4-3 ~)
19. 2
20. 2
DERIVATIVES
A real-valued function $f(x)$ defined on some interval $a < x < b$ has a derivative
at a point $x$ in its domain interval, denoted by $f'(x)$, if
\[ f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}. \]
Fundamental interpretations such as velocity and slope give the derivative a primary
place in applied mathematics and in geometry, and the formulas of one-variable
calculus provide techniques for dealing with the functions such as the trigonometric
and exponential functions that arise in these areas.
We'll assume that the reader knows the rules and elementary examples of calculus,
and that pictures are familiar that show the graph of a function $f(x)$ along with its tangent
line at a point $x_0$ as in Figure 4.1. The purpose of this chapter is to begin extending
the definition and interpretations of the derivative from real-valued functions of a
real variable, associated with formulas such as $y = f(x)$, to vector-valued functions
of a vector variable. The resulting notational change required in this last formula
is fairly slight, since we'll now write $\mathbf{y} = f(\mathbf{x})$, but the supply of applications and
interpretations will increase considerably. The elementary techniques and examples
of one-variable calculus will continue to play an important role throughout the rest
of the book. Thus what we're about to do will provide a review of that material.
FIGURE 4.1 Graph with tangent line.
to describe a function with domain a subset of $\mathbb{R}^n$ and image a subset of $\mathbb{R}^m$.
then the real-valued function $f_k$ is called the $k$th coordinate function of $f$. For
example, the function given by
\[ y_1 = x_1 + x_2, \qquad y_2 = x_1 - x_2 \]
has matrix form
\[ \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}. \]
SECTION 1 FUNCTIONS OF ONE VARIABLE
Here we take up the most straightforward generalization of calculus for real-valued
functions of one real variable: vector-valued functions $\mathbf{x} = f(t)$ of a real variable $t$.
An important difference between vector-valued functions and real-valued functions
is one of geometric interpretation: for vector functions we usually study the image of
$f$, namely the vector values actually taken by $f$, rather than the graph of an equation
$\mathbf{x} = f(t)$. For notation we'll sometimes write vectors as columns instead of rows
with comma separations; this practice sometimes results in a more readable display
and is often required in the context of matrix multiplication.
1A Derivatives
If a point moves in space so as to occupy various positions at a progression of times,
then its position at time $t$ generates a vector-valued position function $f$ with values
$f(t)$. In particular if the position of a point in $\mathbb{R}^3$ at time $t$ is given by
not a purely theoretical concept; for instance it's crucial for describing the dynamics
of planetary motion in Chapter 12, Section 3.
If $\mathbf{x}_1 = (x_1, y_1, z_1)$ and $\mathbf{x}_0 = (x_0, y_0, z_0)$ are points in $\mathbb{R}^3$, then the function $\mathbb{R} \to \mathbb{R}^3$
with image the points $\mathbf{x}(t)$ given by
EXAMPLE 2
The function $g$ from $\mathbb{R}$ to $\mathbb{R}^2$ for which
\[ g(t) = (t, t^2) \]
describes a curve in $\mathbb{R}^2$. Because the coordinates $x = t$ and $y = t^2$ satisfy the relation
$y = x^2$, the point $(t, t^2)$ always lies on the parabola with equation $y = x^2$, shown
in Figure 4.2(b).
\[ \lim_{t \to t_0} f(t) = \left(\lim_{t \to t_0} f_1(t), \ldots, \lim_{t \to t_0} f_n(t)\right). \]
Similarly a function with values in $\mathbb{R}^n$ is said to be continuous if its real-valued
coordinate functions are all continuous on their common domain interval. These
definitions are treated more generally in Chapter 5, Section 1.
The function defined by $g(t) = (t, t^2)$ has limit vector $(2, 4)$ at $t = 2$ because
\[ \lim_{t \to 2} (t, t^2) = \left(\lim_{t \to 2} t, \ \lim_{t \to 2} t^2\right) = (2, 4). \]
FIGURE 4.3
The function $g$ is continuous for all real $t$ because the coordinate functions $t$ and $t^2$
are continuous.
\[ g'(t) = \lim_{h \to 0} \frac{g(t + h) - g(t)}{h}, \]
assuming the limit exists. If the limit exists for each $t$ in $(a, b)$, then $g'(t)$ determines
a new function $\mathbb{R} \xrightarrow{g'} \mathbb{R}^n$, just as in the case $n = 1$. The derivative is often
written $dg/dt$.
\[ {} = \lim_{h \to 0} \left(\frac{(t + h)^2 - t^2}{h}, \ \frac{(t + h)^3 - t^3}{h}\right). \]
The two entries in this vector have as limits the derivatives of $t^2$ and $t^3$, respectively.
Hence by the definition of the derivatives,
\[ \lim_{h \to 0} \frac{(t + h)^2 - t^2}{h} = 2t \quad \text{and} \quad \lim_{h \to 0} \frac{(t + h)^3 - t^3}{h} = 3t^2. \]
By the definition of vector limit the vector limit $g'(t)$ exists, and $g'(t) = (2t, 3t^2)$.
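The coordinate-by-coordinate limit can be watched numerically. The sketch below assumes the function $g(t) = (t^2, t^3)$ and evaluates the difference quotient at $t = 2$ for a few shrinking values of $h$; the helper name `difference_quotient` and the chosen step sizes are illustrative.

```python
def g(t):
    """The curve g(t) = (t^2, t^3)."""
    return (t ** 2, t ** 3)

def difference_quotient(f, t, h):
    """(f(t + h) - f(t)) / h, computed coordinate by coordinate."""
    return tuple((a - b) / h for a, b in zip(f(t + h), f(t)))

t = 2.0
for h in (1e-1, 1e-3, 1e-5):
    # The printed pairs should approach g'(2) = (2*2, 3*2**2) = (4, 12).
    print(h, difference_quotient(g, t, h))
```

Each coordinate of the quotient converges on its own, which is exactly the statement that the vector derivative is computed one coordinate function at a time.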
Example 4 suggests that a function $\mathbb{R} \xrightarrow{g} \mathbb{R}^n$ has a derivative at a point $t$ if and
only if each coordinate function of $g$ has a derivative there. This is true, and we
have
1.1
\[ \text{If } g(t) = \begin{pmatrix} g_1(t) \\ \vdots \\ g_n(t) \end{pmatrix}, \text{ then } g'(t) = \begin{pmatrix} g_1'(t) \\ \vdots \\ g_n'(t) \end{pmatrix}. \]
If $g(t) = \begin{pmatrix} \cos t \\ \sin t \end{pmatrix}$, then $g'(t) = \begin{pmatrix} -\sin t \\ \cos t \end{pmatrix}$. Note that as $t$ varies the curve
traced in $\mathbb{R}^2$ by $g(t)$ is a circle of radius 1 centered at the origin. Indeed $|g(t)| =
\sqrt{\cos^2 t + \sin^2 t} = 1$, and the geometric definition of $\cos t$ and $\sin t$ is based on the
interpretation of $t$ as the counterclockwise angle that the radius at $g(t)$ makes with
the positive horizontal axis, as shown in Figure 4.4(a). By a similar argument, $g(-t)$
also traces the same circle, but in the clockwise direction.
If $h(t) = \begin{pmatrix} t \\ t^2 \\ t^3 \end{pmatrix}$, then $h'(t) = \begin{pmatrix} 1 \\ 2t \\ 3t^2 \end{pmatrix}$. For $0 \leq t \leq 1$, the points $h(t)$ trace the
curve in $\mathbb{R}^3$ shown in Figure 4.4(b). As a guide for sketching this curve, observe
that its perpendicular projection into the $xy$-plane is the parabola $y = x^2$, and its
projection into the $xz$-plane is the cubic $z = x^3$. The projection $y = z^{2/3}$ into the
$yz$-plane is less familiar; for that interesting curve see Example 9.
Figure 4.4(c) shows that, as $h$ tends to 0, the vector $g(t + h) - g(t)$ has a direction
that should tend to what we would like to call the tangent direction to the curve $\gamma$
at $g(t)$. However, since $g$ is assumed continuous,
\[ \lim_{h \to 0} \left(g(t + h) - g(t)\right) = 0, \]
and the zero vector that we get as a limit has no direction. The standard way to
overcome this difficulty is to divide by $h$ before letting $h$ tend to zero. Observe that
FIGURE 4.4 (a), (b), (c)
\[ \ell(t) = t\,\dot{\mathbf{x}}(t_0) + \mathbf{x}(t_0), \]
EXAMPLE 7
The circle of Example 5 has points $\mathbf{x}(t) = (\cos t, \sin t)$ and tangent vector $\dot{\mathbf{x}}(t) =
(-\sin t, \cos t)$. A typical tangent vector appears in Figure 4.4(a).
The condition that $g'(t)$ be nonzero requires the curve to have a well-defined
tangent line at every point.
The condition that $g'(t)$ be continuous means that the direction and length of the
tangent vector $g'(t)$ change continuously as the point $g(t)$ moves along the curve.
Here's an example of a smooth curve that we'll encounter often.
EXAMPLE 8
The image curve defined parametrically by $\mathbf{x}(t) = (\cos t, \sin t, t)$ lies on the cylinder
of radius 1 shown in Figure 4.5(a). The image curve is called a helix. If we temporarily
set the third coordinate function of $\mathbf{x}(t)$ equal to 0, the image is a circle of
radius 1 centered at $(0, 0, 0)$, because $\cos^2 t + \sin^2 t = 1$. The tangent vector at $\mathbf{x}(t)$ is
$\dot{\mathbf{x}}(t) = (-\sin t, \cos t, 1)$,
and the tangent line to the helix at $\mathbf{x}(0) = (1, 0, 0)$ has the parametric representation
\[ \ell(t) = t\,\dot{\mathbf{x}}(0) + \mathbf{x}(0) = t(0, 1, 1) + (1, 0, 0). \]
Note that $\dot{\mathbf{x}}(t)$ is a continuous function, and is never zero, so the helix is a smooth
curve.
If a point moves in the plane so that at time $t$ its position is $\mathbf{x}(t) = (t^2, t^3)$, then the
tangent vector is $\dot{\mathbf{x}}(t) = (2t, 3t^2)$, with length $|\dot{\mathbf{x}}(t)| = (4t^2 + 9t^4)^{1/2}$. In particular,
$\dot{\mathbf{x}}(0) = 0$. The sketch of the path traced by $\mathbf{x}(t)$ is in Figure 4.5(b) for $-1 \leq t \leq 1$.
In making the picture it's helpful to observe that the coordinates of a point on the
path satisfy the equation $x = y^{2/3}$; since $x = t^2 \geq 0$ here we also have $y = \pm x^{3/2}$.
The tangent vector shrinks to zero in this example as $\mathbf{x}(t)$ approaches the origin
because, with continuously varying $\dot{\mathbf{x}}(t)$, its length becomes instantaneously zero at
the abrupt change in the direction of motion shown in Figure 4.5(b). In this way the
parametrization describes the geometric situation well. The curve certainly doesn't
deserve to be called smooth at $(0, 0)$, and is said to have a cusp there.
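We list here some useful formulas that hold if two vector-valued functions $\mathbf{x}(t) =
f(t)$ and $\mathbf{y}(t) = g(t)$ have vector derivatives on an interval $a < t < b$; we assume
$\varphi(t)$ and $u(t)$ are real valued and differentiable on the same interval.
1.2 $\dfrac{d}{dt}(\mathbf{x} + \mathbf{y}) = \dot{\mathbf{x}} + \dot{\mathbf{y}}, \qquad \dfrac{d}{dt}(c\mathbf{x}) = c\dot{\mathbf{x}}$, $c$ constant
1.3 $\dfrac{d}{dt}(\varphi\mathbf{x}) = \varphi\dot{\mathbf{x}} + \varphi'\mathbf{x}$
1.4 $\dfrac{d}{dt}(\mathbf{x} \cdot \mathbf{y}) = \mathbf{x} \cdot \dot{\mathbf{y}} + \dot{\mathbf{x}} \cdot \mathbf{y}$
1.5 $\dfrac{d}{dt}\mathbf{x}(u) = u'\,\dot{\mathbf{x}}(u)$
The preceding formulas all follow from writing $\mathbf{x} = f(t)$ and $\mathbf{y} = g(t)$ in terms
of their coordinate functions and then applying the corresponding differentiation
formulas for real-valued functions along with Formula 1.1. For example, the proof
of 1.5, a version of the chain rule for differentiation, goes like this:
\[ \frac{d}{dt}\mathbf{x}(u) = \left(f_1(u), \ldots, f_n(u)\right)' = \left([f_1(u)]', \ldots, [f_n(u)]'\right) \]
FIGURE 4.5 (a) Helix, (b) Cusp.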
speed of motion along the path $\gamma$ described by $g(t)$ as $t$ varies. To justify the use
of the term speed, we observe that, for small $h$, the number $|g(t + h) - g(t)|/|h|$ is
close to the average rate of traversal of $\gamma$ over a sufficiently short interval from $t$ to
$t + h$. In addition, if $g'(t)$ exists, we'll now show that
\[ \lim_{h \to 0} \frac{|g(t + h) - g(t)|}{|h|} = |g'(t)|. \]
By the triangle inequality in the reversed form $\big||\mathbf{x}| - |\mathbf{y}|\big| \leq |\mathbf{x} - \mathbf{y}|$ (see p. 32),
\[ \left|\frac{|g(t + h) - g(t)|}{|h|} - |g'(t)|\right| \leq \left|\frac{g(t + h) - g(t)}{h} - g'(t)\right|. \]
The right side tends to zero as $h$ tends to zero by the definition of $g'(t)$. Hence the
left side tends to zero also. Thus $|g'(t)|$ is a limit of average rates over arbitrarily
small time intervals. It's for this reason that the real-valued function $v$ defined by
$v(t) = |g'(t)|$ is called the speed of $g$. It follows that it's natural to call the vector
$\mathbf{v}(t) = g'(t)$ the velocity vector of the motion at the point $g(t)$. Note that the vector
$\mathbf{v}(t)$ is identical to what we called the standard tangent vector to $\gamma$ at $g(t)$ if $\mathbf{v}(t) \neq 0$.
Velocity $\mathbf{v}(t) = 0$ indicates speed zero and no direction at time $t$.
EXAMPLE 10
Let $\mathbf{x}(t) = (a \cos t, a \sin t, bt)$ with $a$ and $b$ nonzero constants. This is a more
general helix than the one in Example 8, where we took $a = b = 1$. Figure 4.6(a)
shows the choice $a = 1$, $b = \tfrac{1}{2}$ along with $a = -1$, $b = \tfrac{1}{2}$ as a dotted curve.
The two together outline the general configuration of the double helix portion of
the DNA molecule. The velocity at time $t$ is $\dot{\mathbf{x}}(t) = \mathbf{v}(t) = (-a \sin t, a \cos t, b)$.
It follows that the velocity vector is always perpendicular to the vector $\mathbf{r}(t) =
(a \cos t, a \sin t, 0)$, which points horizontally from the axis of the spiral to $\mathbf{x}(t)$. To
see this just check that
\[ \mathbf{v}(t) \cdot \mathbf{r}(t) = -a^2 \sin t \cos t + a^2 \cos t \sin t + 0 = 0. \]
FIGURE 4.6 (a), (b)
Suppose that $g(t) = (r \cos \omega t, r \sin \omega t, ct)$, where $r$, $c$, and $\omega$ are positive constants.
Then computing first and second derivatives one coordinate at a time gives
\[ g'(t) = (-r\omega \sin \omega t, \ r\omega \cos \omega t, \ c) \quad \text{and} \quad g''(t) = (-r\omega^2 \cos \omega t, \ -r\omega^2 \sin \omega t, \ 0). \]
Suppose $\mathbb{R} \xrightarrow{g} \mathbb{R}^3$ describes a path in $\mathbb{R}^3$ with velocity vector $\mathbf{v}(t) = g'(t)$. If
we assume that $g'$ itself has a derivative, we define the acceleration vector at $g(t)$
by $\mathbf{a}(t) = g''(t)$. If $\mathbf{x}(t)$ is used to denote the image points of some curve, then along
with $\dot{\mathbf{x}}(t)$ for velocity vectors we may denote acceleration vectors by $\ddot{\mathbf{x}}(t)$.
The physical significance of acceleration $\mathbf{a}(t)$ is that if $\mathbf{x}(t)$ describes the motion
of a particle of constant mass $m$, then $\mathbf{F}(t) = m\mathbf{a}(t)$ is by definition the force vector
acting on the particle. If we denote by $a(t)$ the length of $\mathbf{a}(t)$, then $a(t)$ is called
the magnitude of the acceleration, and $ma(t)$ is called the magnitude of the force
acting on the particle. We detect the presence of acceleration that isn't parallel to
the velocity at $\mathbf{x}(t)$ by observing a bending of the path of motion away from the
straight line through the tangent vector $\dot{\mathbf{x}}(t)$ and toward the direction of $\ddot{\mathbf{x}}(t)$. Look
at Figure 4.6(b) and think of the sideways pull that you feel when going around a
tight curve at high speed. The previous example illustrates the basic idea; there the
acceleration points in a direction perpendicular to the vertical axis of the helix since
the third coordinate of $g''(t)$ is always zero. Section 1F provides another illustration,
and there is a more general treatment in Chapter 8, Section 3.
If the vector $\mathbf{x}(t) = (r \cos \omega t, r \sin \omega t, ct)$ gives the position at time $t$ of a particle
of mass $m$ in $\mathbb{R}^3$, then the velocity and acceleration vectors, $\mathbf{v}(t) = g'(t)$ and $\mathbf{a}(t) =
g''(t)$, are as computed in Example 11, namely
\[ \dot{\mathbf{x}}(t) = (-r\omega \sin \omega t, \ r\omega \cos \omega t, \ c) \quad \text{and} \quad \ddot{\mathbf{x}}(t) = (-r\omega^2 \cos \omega t, \ -r\omega^2 \sin \omega t, \ 0). \]
Typical velocity and acceleration vectors are shown in Figure 4.6(b), located appropriately
with tails at $\mathbf{x}(t)$. The lengths of these vectors just happen to be constant as
functions of time $t$, depending only on the constants $r$, $\omega$ and $c$ as follows:
\[ |\dot{\mathbf{x}}(t)| = \sqrt{r^2\omega^2 + c^2}, \qquad |\ddot{\mathbf{x}}(t)| = r\omega^2. \]
Note that while the speed and velocity depend on $c$, the acceleration doesn't.
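A quick numerical check of the constant lengths is sketched below. It assumes the helix $(r \cos \omega t, r \sin \omega t, ct)$, with `w` standing for $\omega$; the function names and the sample constants are illustrative choices.

```python
import math

def helix_velocity(t, r, w, c):
    """First derivative of (r cos wt, r sin wt, ct)."""
    return (-r * w * math.sin(w * t), r * w * math.cos(w * t), c)

def helix_acceleration(t, r, w, c):
    """Second derivative of (r cos wt, r sin wt, ct); c does not appear."""
    return (-r * w ** 2 * math.cos(w * t), -r * w ** 2 * math.sin(w * t), 0.0)

def norm(v):
    return math.sqrt(sum(x * x for x in v))

r, w, c = 2.0, 3.0, 0.5  # arbitrary positive constants for the check
speeds = [norm(helix_velocity(t, r, w, c)) for t in (0.0, 0.7, 2.5)]
accels = [norm(helix_acceleration(t, r, w, c)) for t in (0.0, 0.7, 2.5)]
```

Every entry of `speeds` should equal $\sqrt{r^2\omega^2 + c^2}$ and every entry of `accels` should equal $r\omega^2$, up to rounding.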
1D Arc Length
For a point moving with constant speed v along a curve, the distance covered between
time to and time !1 should turn out to be V times t, -tn Mn.- 0
------"
182 Chapter 4 Derivatives
EXAMPLE 13
If a circle of radius a is parametrized in $\mathbb{R}^2$ by
$$g(t) = (a\cos\omega t,\ a\sin\omega t),$$
then
$$v(t) = |g'(t)| = |(-a\omega\sin\omega t,\ a\omega\cos\omega t)| = a\omega\sqrt{\sin^2\omega t + \cos^2\omega t} = a\omega.$$
Thus the distance covered between times $t_0$ and $t_1$ is
$$l = \int_{t_0}^{t_1} a\omega\,dt = a\omega(t_1 - t_0).$$
The constant $\omega$ is called the angular speed of the motion on the circle.
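The same distance can be approximated by integrating the speed numerically. The sketch below uses a midpoint rule, a numerical technique introduced here rather than taken from the text, with hypothetical sample values a = 2, ω = 3, $t_0$ = 0.5, $t_1$ = 2.

```python
import math

def speed(g_prime, t):
    """Length of the planar velocity vector g'(t)."""
    return math.hypot(*g_prime(t))

def arc_length(g_prime, t0, t1, n=1000):
    """Midpoint-rule approximation of the integral of |g'(t)| from t0 to t1."""
    h = (t1 - t0) / n
    return sum(speed(g_prime, t0 + (k + 0.5) * h) for k in range(n)) * h

# Sample radius and angular speed (assumed values, for illustration only).
a, omega = 2.0, 3.0

def circle_prime(t):
    """Derivative of g(t) = (a cos(omega t), a sin(omega t))."""
    return (-a * omega * math.sin(omega * t), a * omega * math.cos(omega * t))

approx = arc_length(circle_prime, 0.5, 2.0)
exact = a * omega * (2.0 - 0.5)   # the closed form a*omega*(t1 - t0)
print(approx, exact)
```

Because the speed is constant on the circle, the approximation agrees with the closed form up to rounding error; for a curve with varying speed the two would differ by the usual quadrature error.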
Image curves that appear to be the same may have different parametrizations that
yield different arc lengths. The following simple example shows that a little care is
needed to avoid producing inconsistent results from different parametrizations.
EXERCISES
Find the derivatives f'(t) and f''(t) for each of the following functions 1 to 6 at the
indicated point. Then find a parametric representation for the tangent line at each
indicated point.
1. $f(t) = (1 + t^2,\ 1 + t^3)$, when t = 2
2. $f(t) = (t\cos t,\ t\sin t)$, when $t = \pi/2$
3. f(t) = { :: 1 ), when t = -1
4. $f(t) = (t + t^2,\ t^2 + t^3,\ t^3 + t^4)$, when t = -1
5. $f(t) = (\cos t,\ \cos 2t,\ \cos 3t,\ \cos 4t)$, when $t = \pi/2$
6. $f(t) = t\,i + t^2 j + t^3 k$, when t = 1
Sketch the curves defined parametrically by the following functions 7 to 12.
7. $f(t) = t(1, 2, 0) + (1, 1, 1)$, $-\infty < t < \infty$
8. $f(t) = (t, t^2, t^3)$, $0 \le t \le 1$
9. $f(t) = (2t, t)$, $-1 \le t \le 1$
37. Show that if a, b and $\omega$ are positive constants, then the parametrization
$x(t) = (a\cos\omega t,\ b\sin\omega t)$ traces the same ellipse $x^2/a^2 + y^2/b^2 = 1$ in $\mathbb{R}^2$
regardless of the size of $\omega$.
38. Find the velocity and acceleration vectors for x(t) in Exercise 37. Show that the
velocity and acceleration are never zero.
Sketch the following four curves for the indicated time intervals. Then add to your
sketch the velocity and acceleration vectors at the designated times.
39. $x(t) = (t, t, t^2)$, $0 \le t \le 1$; $t = 0, \tfrac12, 1$
40. $x(t) = (2\cos t)\,i + (\sin t)\,j$, $0 \le t \le 2\pi$; $t = 0, \pi/2, \pi$
41. $x(t) = (t, t^2, t^3)$, $0 \le t \le 1$; $t = 0, \tfrac12, 1$
42. $x(t) = (\cos t)\,i + (\sin t)\,j + t\,k$, $0 \le t \le 2\pi$; $t = 0, \pi, 2\pi$
43. The normal lapse rate for temperature above the surface of the earth assumes a
steady drop in air temperature of 3°F per 1000 feet of increase in elevation. Under this
assumption, with ground temperature 32°F, and assuming negligible air resistance,
estimate the temperature at time t at the height of a projectile fired straight up with an
initial speed of 300 feet per second. What is the minimum temperature attained?
*44. Parametrizations x = g(t), $a \le t \le b$ and x = h(u), $\alpha \le u \le \beta$ are called
equivalent if there is a continuously differentiable $\phi$ with $\phi' > 0$ from $[\alpha, \beta]$ onto
[a, b] such that $g(\phi(u)) = h(u)$. (Note that $\phi' > 0$ implies $\phi$ strictly increasing.)
(a) Use Equation 1.5 to show that if g and h are equivalent then
$|h'(u)| = |g'(\phi(u))|\,\phi'(u)$.
(b) Use part (a) to change variable in the arc-length integral for h and show that
equivalent parametrizations yield equal arc lengths.
(c) Show that g(t) = (t, t) for $-1 \le t \le 1$ and $h(u) = (-\cos u, -\cos u)$ for
$0 \le u \le 5\pi/2$ are not equivalent parametrizations of the line segment from
(-1, -1) to (1, 1) in $\mathbb{R}^2$ by showing that they yield different arc lengths. This
example shows that the condition that the function $\phi(u)$ be increasing can't be
omitted from the definition of equivalence if equal arc length is to be a consequence.
FIGURE 4.7
(b) This algorithm plots points on an elliptical helix shown in Figure 4.7, and defined
by three equations of the form $x = g_1(t)$, $y = g_2(t)$, $z = g_3(t)$ with $a \le t \le b$.
In our particular example we get the picture shown, for which $g_1(t) = 2\sin t$,
$g_2(t) = 3\cos t$, $g_3(t) = 0.4t$, and a = 0, $b = 4\pi$. The viewing direction here
is along a line joining the point (1, 1, 1) to the origin. The order of the sine and
cosine in the first two coordinate functions makes the helix turn clockwise instead of
counterclockwise as it winds up around the vertical axis. Note that the curve winds
around an elliptical cylinder rather than a circular one.
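A minimal Python sketch of the point-generating step for this helix follows. It only computes the points; the projection along the viewing direction and the actual drawing are left out, and the function name and the choice of 200 steps are assumptions made for illustration.

```python
import math

def helix_points(n=200, a=0.0, b=4 * math.pi):
    """Sample n + 1 points of x = 2 sin t, y = 3 cos t, z = 0.4 t for a <= t <= b."""
    pts = []
    for k in range(n + 1):
        t = a + (b - a) * k / n          # equally spaced parameter values
        pts.append((2 * math.sin(t), 3 * math.cos(t), 0.4 * t))
    return pts

points = helix_points()

# Every sampled point lies on the elliptical cylinder (x/2)^2 + (y/3)^2 = 1.
on_cylinder = all(abs((x / 2) ** 2 + (y / 3) ** 2 - 1) < 1e-12 for x, y, z in points)
print(len(points), on_cylinder)
```

The resulting list can be handed to any plotting tool that accepts three-dimensional point sequences.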
The decision to plot a picture by hand or by computer will usually favor the
computer if a fairly high degree of accuracy is needed for some reason or if the picture
is just too complicated to draw by hand. Otherwise, a quick pencil drawing may
convey the necessary information with less fuss. Some of the information you might
want to convey is that you understand the basic ideas of graphical representation, and
this may best be done, for example on an examination, with a careful pencil drawing.
For this reason it's a good idea not to become overly dependent on having computer
software do your thinking for you until you've become reasonably adept at doing it
for yourself. The assigned exercises will require a mixture of both approaches.
EXERCISES
Plot the following parametrically defined curves 1 to 4.
1. $g(t) = (t\cos t,\ t\sin t)$, $0 \le t \le 2\pi$
2. $f(t) = (t,\ \tfrac12 t^3,\ \tfrac12 t^4)$, $0 \le t \le 1$
3. $g(t) = (\sin 2t,\ 2\sin^2 t,\ 2\cos t)$, $0 \le t \le 2\pi$
4. $g(t) = (|t|,\ 2|t - 1|,\ 3|t + 1|)$, $-2 \le t \le 2$
5. Prove that the curve in Exercise 3 lies on a sphere centered at the origin.
Plot the image curves 6 to 9 subject to the given conditions.
6. $x = (\cos t,\ \sin t,\ t^2)$, $0 \le z \le 3$
7. $x = (2\cos t,\ 3\sin t,\ e^t)$, $1 \le z \le 2$
8. $x = (t,\ t\cos t,\ t\sin t)$, $x^2 + y^2 + z^2 \le 1$
9. $x = (t^2, t^3, t^4)$, $|y| \le 5$
10. Make computer plots of lines x = ta + b in $\mathbb{R}^3$ for $c \le t \le d$ for a variety of
choices of the vector and scalar parameters.
1F Vector Integration
We've seen in Section 1D that if you know the speed $|\dot{x}(t)|$ of some point moving in
space, you integrate speed with respect to t to find distance measured along the path
of motion from some chosen point. Finding the actual path of motion requires prior
knowledge of more than just the speed; for that we need to know the velocity vector
$v(t) = \dot{x}(t)$. Since we get from position x(t) to velocity $\dot{x}(t)$ by vector differentiation
it follows that recovering position from velocity is done by vector integration. Given a
vector valued function f(t) with n real-valued coordinate functions $f_1(t), \ldots, f_n(t)$,
each integrable over some common interval, the indefinite vector integral of f is
defined by
$$\int f(t)\,dt = \left(\int f_1(t)\,dt,\ \ldots,\ \int f_n(t)\,dt\right).$$
We interpret the relationship between f(t) and $F(t) = \int f(t)\,dt + c$ as follows. For
whatever choice of c, the tangent vector to the image curve of F at F(t) is the vector
f(t), usually pictured with its tail at F(t). Figure 4.8(a) shows two choices for c.
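Suppose a and b are constant vectors in $\mathbb{R}^n$ and we want to find the position function
x(t) consistent with velocity $\dot{x}(t) = ta + b$, as well as with the initially specified
position $x(t_0)$. Integrating the velocity gives
$$x(t) = \int \dot{x}(t)\,dt = \int (ta + b)\,dt = \tfrac12 t^2 a + tb + c.$$
To determine c, we note that $x(t_0) = \tfrac12 t_0^2 a + t_0 b + c$, so $c = x(t_0) - \tfrac12 t_0^2 a - t_0 b$. Thus
$$x(t) = \tfrac12 (t^2 - t_0^2)\,a + (t - t_0)\,b + x(t_0).$$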
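The determination of the constant of integration can be sketched in Python as follows. Vectors are represented as tuples, and the function name and the sample values of a, b, $t_0$ and $x(t_0)$ are hypothetical, introduced only to illustrate the computation.

```python
def position(t, a, b, t0, x0):
    """Position x(t) with velocity t*a + b and initial condition x(t0) = x0."""
    # Constant of integration: c = x0 - (1/2) t0^2 a - t0 b
    c = tuple(x0i - 0.5 * t0**2 * ai - t0 * bi for x0i, ai, bi in zip(x0, a, b))
    # x(t) = (1/2) t^2 a + t b + c, computed coordinate by coordinate
    return tuple(0.5 * t**2 * ai + t * bi + ci for ai, bi, ci in zip(a, b, c))

# Sample data (assumed values, for illustration only).
a = (1.0, 0.0, 2.0)
b = (0.0, 1.0, -1.0)
t0 = 1.0
x0 = (3.0, 3.0, 3.0)

start = position(t0, a, b, t0, x0)      # should reproduce x0
later = position(2.0, a, b, t0, x0)
print(start, later)
```

Evaluating at $t = t_0$ returns the specified initial position, which is the condition that fixed c.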
FIGURE 4.8
Suppose we fire a projectile from ground level and are willing to ignore the effects
of air resistance on the flight of the projectile. (The retarding effect of air resistance
is taken into account in Chapter 12, Section 3.) Thus the only acceleration we need
to take into account after the initial release of the projectile is that of the vector
$-\mathbf{g} = (0, 0, -g)$, where g is the magnitude of gravitational acceleration near our
location on earth. Denote by x = x(t) the position of the projectile at time t after
firing, so that the velocity vector is $v = \dot{x}(t)$ and the acceleration vector is $a = \ddot{x}(t)$.
Equating our two expressions for acceleration gives $\ddot{x} = -\mathbf{g}$. Writing this equation is
the critical step in predicting the projectile's path. To solve the equation we integrate
both sides twice with respect to t getting successively
$$\dot{x}(t) = -t\,\mathbf{g} + v_0, \qquad x(t) = -\tfrac12 t^2\,\mathbf{g} + t\,v_0 + x_0,$$
where the constant vectors of integration $v_0$ and $x_0$ are the velocity and position at time t = 0.
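Integrating $\ddot{x} = (0, 0, -g)$ twice gives a velocity that is linear in t and a position that is quadratic in t. The Python sketch below evaluates both; the value g = 32 feet per second squared, the names `v0` and `x0` for the initial velocity and position, and the sample launch are assumptions made for illustration.

```python
G = 32.0  # assumed magnitude of gravitational acceleration, in ft/s^2

def velocity(t, v0):
    """First integration: v(t) = v0 - t*(0, 0, G)."""
    return (v0[0], v0[1], v0[2] - G * t)

def position(t, v0, x0):
    """Second integration: x(t) = x0 + t*v0 - (1/2) t^2 (0, 0, G)."""
    return (x0[0] + t * v0[0],
            x0[1] + t * v0[1],
            x0[2] + t * v0[2] - 0.5 * G * t * t)

def flight_time(v0):
    """Time at which a projectile fired from height 0 returns to height 0."""
    return 2.0 * v0[2] / G

# Hypothetical launch from the origin.
v0 = (30.0, 0.0, 40.0)
x0 = (0.0, 0.0, 0.0)
T = flight_time(v0)
landing = position(T, v0, x0)
print(T, landing, velocity(T, v0))
```

The horizontal coordinates of the velocity never change, which reflects the fact that the only acceleration is vertical.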
EXERCISES
In Exercises 1 to 6, compute the indefinite integrals $F(t) = \int f(t)\,dt + c$; then
determine the constant of integration c so that the associated condition is satisfied.
1. $f(t) = (t^2 + 1,\ t^3 - 1)$; F(1) = (2, 2)
2. $f(t) = (t, t^2, t^3)$; F(0) = (1, 2, 1)
3. $f(t) = (t\cos t,\ t\sin t)$; F(0) = (1, 1)
4. $f(t) = (1/(t^2 + 1),\ t/(t^2 + 1))$; F(0) = (0, 1)
5. $f(t) = (1, t^2, -1, t^2)$; F(1) = (2, 2, 2, 2)
6. $f(t) = ta - t^2 b$; $F(t_0) = x_0$
In Exercises 7 to 14, given $\dot{x}(t)$ or $\ddot{x}(t)$, find the x(t) that satisfies the initial conditions.
7. $\dot{x}(t) = (t, -t^2)$; x(0) = (2, 1)
8. $\dot{x}(t) = t(1, -1)$; x(1) = (1, 1)
9. $\dot{x}(t) = (\cos t,\ \sin 2t)$; $x(\pi/2) = (-1, 1)$
10. $\dot{x}(t) = (e^t, t)$; x(0) = (e, 1)
11. $\dot{x}(t) = (t, t, t^2)$; x(1) = (1, -1, 1)
12. $\dot{x}(t) = t(1, 1, t)$; x(0) = (2, 1, 2)
13. $\ddot{x}(t) = (t, -t^2)$; x(0) = (2, 1), $\dot{x}(0) = (1, 1)$
14. $\ddot{x}(t) = (t, t^2, e^{-t})$; x(1) = (1, 0, 0), $\dot{x}(1) = (0, 1, 0)$
15. Suppose you want to kick a ball over an h-foot vertical fence a feet away from you
in such a way that it just barely gets over the top of the fence and lands on the ground
b feet from the fence on the other side. Assuming air resistance neglected, what should
be the initial angle of elevation $\theta$ of your kick, and what should its initial speed be?
16. A target is suspended over level ground at height $h_0$, to be released to fall
earthward under constant vertical acceleration -g. Simultaneously with the release of
the target, a gun aimed directly at the suspended target is fired from ground level at a
horizontal distance l from the point directly below the target. Assume that the speed of
bullet and target are not reduced by air resistance.
(a) Show that the bullet's trajectory will intersect the vertical path of the target only
if $2v_0^2 h_0 \ge g(l^2 + h_0^2)$.
(b) Show that the bullet will hit the target if the condition in part (a) is met.
(c) One feature of the conclusion in part (b) is that it happens independently of the
size of $v_0$ as long as it satisfies the condition in part (a). However the distance $d_1$
that the target has fallen when it is hit does depend on $v_0$. Find $d_1$ assuming $d_1 < h_0$.
17. Superman, while standing atop a 200-foot-high building, sees a scoundrel drop a
victim out a window 50 feet across the street from his building and 100 feet above the
pavement below. Reacting instantly, Superman gives himself a mighty push in just the
right direction to plunge under the influence of gravity and effect a dramatic rescue
just before the victim hits the pavement. Neglecting air resistance, estimate Superman's
initial velocity vector and speed. Also estimate Superman's and the victim's speeds at
the time of rescue.
18. Someone wants to kick a ball on level ground so it falls back to earth a feet away.
(a) Show that we can do this with infinitely many different initial angles of elevation
as long as the initial speed $v_0$ at which the ball is kicked is at least $\sqrt{ag}$.
(b) Suppose in addition that the ball is to be lobbed over a vertical fence of height h
halfway between the initial and terminal points on the ground. Show that barely
clearing the fence requires initial angle of elevation $\theta = \arctan(4h/a)$ and initial
speed $v_0 = \sqrt{g(a^2 + 16h^2)/(8h)}$.
*19. Suppose you want to stand at distance a from the base of a vertical building wall
of height h and then kick a ball in such a way that it lands at distance b back from the
edge on the building's flat roof, having just grazed the edge of the roof as it went by.
Show that the initial angle of elevation of your kick is
$\theta = \arctan\bigl(h/a + h/(a + b)\bigr)$ and its initial speed is
$v_0 = \sqrt{ga(a + b)/(2h\cos^2\theta)}$. [Hint: Find a parabola containing three crucial points.]
*20. A projectile is fired up from the surface of the earth with initial velocity
$(u_0, v_0)$. Under the influence of constant vertical acceleration -g the projectile
reaches height $h_{\max}$ and then falls back to earth. Neglecting air resistance, show
that the fraction of time during its trajectory that the projectile spends above height
$h_1$ is $|v_1|/v_0$, where $(u_1, v_1)$ is the projectile's velocity vector at height $h_1$.
21. Big Bertha. In World War I, Paris was bombarded by guns from the unprecedented
distance of 75 miles away, shells taking 186 seconds to complete their trajectories.
Estimate the angle of elevation at which the gun was fired and the maximum height of
the trajectory, assuming negligible air resistance. During a substantial part of the
trajectory the altitude was high enough that air resistance was negligible there.
*22. Fox and Rabbit. Suppose that a rabbit runs with constant speed v > 0 on a circular
path of radius a, and that a fox, also running with constant speed v, pursues the rabbit
by starting at the center of the circle, always maintaining a position on the radius from
the center to the rabbit. Show that it takes the fox time $t = \pi a/(2v)$ to catch the
rabbit and that the fox's path is a semicircle.
$$p = \frac{d}{dt}(m_1 x_1 + \cdots + m_n x_n) = m_1\frac{dx_1}{dt} + \cdots + m_n\frac{dx_n}{dt}.$$
Thus the momentum of the system is the velocity vector of the center of mass multiplied
by the sum of the masses. Show that if the momentum of such a system is a constant
$p_0$, then the center of mass either remains fixed or moves with constant speed along a
fixed line parallel to $p_0$.
24. Consider the vector differential equation $\ddot{x} + a\dot{x} + bx = 0$ to be solved for
vector functions x = g(t). We assume a and b are scalar constants.
(a) Suppose the scalar equation $r^2 + ar + b = 0$ has roots $r_1$ and $r_2$. Show by
substitution that $x(t) = e^{r_1 t}c_1 + e^{r_2 t}c_2$ satisfies the vector differential
equation for fixed arbitrary choices for the vectors $c_1$ and $c_2$ and for all t.
(b) If the roots $r_1$ and $r_2$ of part (a) happen to be equal, the two terms in x(t)
collapse into a single term with arbitrary coefficient $c_1 + c_2$. Show that in that case
additional solutions are given by $x(t) = e^{r_1 t}c_1 + te^{r_1 t}c_2$.
25. Let $f: \mathbb{R} \to \mathbb{R}^n$ be a function defined for $a \le t \le b$. If the coordinate
functions $f_1, \ldots, f_n$ of f are integrable, we define the integral of f over the
interval [a, b] by
$$\int_a^b f(t)\,dt = \left(\int_a^b f_1(t)\,dt,\ \ldots,\ \int_a^b f_n(t)\,dt\right).$$
(a) If $f(t) = (\cos t, \sin t)$ for $0 \le t \le \pi/2$, compute $\int_0^{\pi/2} f(t)\,dt$.
(b) If $g(t) = (t, t^2, t^3)$ for $0 \le t \le 1$, compute $\int_0^1 g(t)\,dt$.
26. If $f: \mathbb{R} \to \mathbb{R}^n$ and $g: \mathbb{R} \to \mathbb{R}^n$ are both integrable over [a, b], show by
using the corresponding properties of integrals of real-valued functions that
$$\int_a^b kf(t)\,dt = k\int_a^b f(t)\,dt, \quad k \text{ a real number},$$
$$\int_a^b f'(t)\,dt = f(b) - f(a).$$
28. Suppose x = x(t) has two continuous derivatives on an interval, and that
$\ddot{x}(t) = r\dot{x}(t)$ for some scalar constant $r \ne 0$, so that the acceleration vector is
parallel to the velocity vector. The purpose of this exercise is to show that the motion
of x(t) is confined to a line.
(a) Verify that the equation $\ddot{x}(t) = r\dot{x}(t)$ is equivalent to
$$\frac{d}{dt}\bigl(e^{-rt}\dot{x}(t)\bigr) = 0.$$
(b) Show that part (a) implies $\dot{x}(t) = e^{rt}c$ for some constant vector c.
(c) Show that part (b) implies that $x(t) = (1/r)e^{rt}c + d$ for constant vectors c and d,
and hence that x(t) stays on a line.
*29. This exercise generalizes the previous one. Suppose x = x(t) has two continuous
derivatives on an interval $a \le t \le b$ and that $\ddot{x}(t) = g(t)\dot{x}(t)$ for some continuous
real-valued function g(t). Thus the acceleration vector, if not zero, is parallel to the
velocity vector. The purpose of this exercise is to show that the motion of x(t) is
confined to a line.
(a) Verify that the equation $\ddot{x}(t) = g(t)\dot{x}(t)$ is equivalent to
$$\frac{d}{dt}\bigl(e^{-h(t)}\dot{x}(t)\bigr) = 0, \quad\text{where } h(t) = \int_a^t g(u)\,du.$$
(b) Show that part (a) implies $\dot{x}(t) = e^{h(t)}c$ for some constant vector c.
(c) Show that part (b) implies that $x(t) = H(t)c + d$ for constant vectors c and d,
where $H'(t) = e^{h(t)}$. Hence show that x(t) stays on a line.
Section 2A Several Independent Variables 189
Apart from real-valued functions of a real variable, the functions we picture most
effectively by their graphs are the functions $f: \mathbb{R}^2 \to \mathbb{R}$ with graphs in $\mathbb{R}^3$ consisting
of the points
$$(x, y, z) = (x, y, f(x, y)),$$
FIGURE 4.9 (a) $z = f(x, y)$ (b) $y = x^2 - 1$
FIGURE 4.10 (a) (b)
$$f(x, y) = 1 - x - y^2$$
for which $x \ge 0$, $y \ge 0$, and $1 - x - y^2 = z \ge 0$. First observe that the domain D of
the function that we are interested in has been restricted to the part of the xy-plane
in the first quadrant for which $1 - x - y^2 \ge 0$, or $x \le 1 - y^2$. This domain appears in
Figure 4.10(a), and again in Figure 4.10(b) under the graph of f. To sketch the graph
of f itself, it helps to notice that cross sections of the graph obtained by holding
$y = y_0$ fixed and letting x vary are lines whose projections onto the xz-plane satisfy
$z = 1 - x - y_0^2$. Each of these lines joins a point in the yz-plane, where x = 0 and
$z = 1 - y^2$, to one in the xy-plane, where z = 0 and $1 - x - y^2 = 0$. Such lines
are in Figure 4.10(b). We could also include cross sections of the graph of f taken
parallel to the yz-plane; such curves are parabolic in shape, with projections onto
the yz-plane satisfying $z = 1 - x_0 - y^2$ for values of $x_0$ between 0 and 1.
EXAMPLE 3
The graph of $f(x, y) = x^2 + y^2$ has the property that f is constant on each circle
of a given radius in the xy-plane and centered at the origin. In other words, cross
sections of the graph taken with planes parallel to the xy-plane are circles, shown
in Figure 4.11(a). All these circles pass through the parabola in the yz-plane with
equation $z = y^2$, because $z = f(0, y) = y^2$.
$$f(x, y) = -2x - y + 2.$$
Setting z = f(x, y), we get
$$z = -2x - y + 2 \quad\text{or}\quad 2x + y + z = 2;$$
we see that the graph of f is a plane in $\mathbb{R}^3$. To sketch it, we take cross sections parallel
to the yz-plane, which project into that plane as lines with equations $y + z = 2 - 2x_0$.
Or we may also take cross sections parallel to the xz-plane. Both are shown in
Figure 4.11(b) for $x \ge 0$, $y \ge 0$, $z \ge 0$.
A more direct way to sketch the plane is to locate three points on it by setting, for
example, (x, y) = (0, 0), (1, 0), and (0, 1). The corresponding points on the graph
are (x, y, z) = (0, 0, 2), (1, 0, 0), and (0, 1, 1), shown as dots in Figure 4.11(b).
Joining these dots by lines in this plane gives some idea of the position of the plane.
Alternatively, we find the points where the plane intersects the axes by setting two
of the coordinates equal to zero and solving for the third; doing this we find (1, 0, 0),
(0, 2, 0) and (0, 0, 2).
Note that we may also write the plane's equation as
$$2x + y + (z - 2) = (2, 1, 1) \cdot (x, y, z - 2) = 0.$$
This equation shows that our plane is realized as all points (x, y, z) such that the line
joining (x, y, z) to (0, 0, 2) is perpendicular to the vector (2, 1, 1), a normal vector
to the plane.
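The three sample points, the axis intercepts, and the perpendicularity condition for the plane 2x + y + z = 2 can be checked with a few lines of Python. This is only a sketch; the helper names are hypothetical and integer arithmetic is used so the comparisons are exact.

```python
def f(x, y):
    """The function whose graph is the plane z = -2x - y + 2."""
    return -2 * x - y + 2

def dot(u, v):
    """Dot product of two coordinate vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

normal = (2, 1, 1)

def on_plane(p):
    """True when (2, 1, 1) . (x, y, z - 2) = 0."""
    return dot(normal, (p[0], p[1], p[2] - 2)) == 0

# Three points of the graph obtained from (x, y) = (0, 0), (1, 0), (0, 1).
points = [(x, y, f(x, y)) for (x, y) in [(0, 0), (1, 0), (0, 1)]]

# Axis intercepts: set two coordinates to zero in 2x + y + z = 2 and solve.
intercepts = [(2 / 2, 0, 0), (0, 2 / 1, 0), (0, 0, 2 / 1)]

print(points, intercepts, all(on_plane(p) for p in points + intercepts))
```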
implicitly defined level set associated with f(x) = k is sometimes called the graph
of the equation f(x) = k, whereas the graph of the function f(x) always refers to
the equation y = f(x).
Topographical maps display terrain elevations by showing level curves at equally
spaced levels as in Figure 4.12. Such displays have the advantage over perspective
drawings that foreground features don't obscure what lies in back of them. See
Figure 4.12(a), which shows the terrain levels, and Figure 4.12(b), which shows the
corresponding level curves.
The function $f(x, y) = x^2 + y^2$ of Example 3 has concentric circles for level sets.
At level k = 1 we get $f(x, y) = x^2 + y^2 = 1$, which represents a circle of radius
1 about (0, 0) in the xy-plane. In general, at a level k > 0 we get a level curve
$x^2 + y^2 = k$, which is a circle of radius $\sqrt{k}$. See Figure 4.13(a), where the values of
$\sqrt{k}$ are nearly equally spaced. As the surface rises more steeply the level lines will
get closer together. If we don't label the level curves with numerical level value k,
we can't tell from level curves alone whether the surface is rising or falling as we
go out from the center.
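To see the crowding of level curves numerically, the short Python sketch below lists the radii $\sqrt{k}$ of the level circles of $x^2 + y^2$ for equally spaced levels k. The particular levels 1 through 5 are an assumption chosen for illustration.

```python
import math

# Equally spaced levels k (sample values, for illustration only).
levels = [1, 2, 3, 4, 5]

# The level curve x^2 + y^2 = k is a circle of radius sqrt(k).
radii = [math.sqrt(k) for k in levels]

# Gaps between consecutive circles shrink as the surface gets steeper.
gaps = [outer - inner for inner, outer in zip(radii, radii[1:])]

for k, r in zip(levels, radii):
    print(k, r)
print(gaps)
```

The gaps decrease steadily, matching the observation that level lines bunch together where the surface rises more steeply.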
The function $f: \mathbb{R}^3 \to \mathbb{R}$ defined by $f(x, y, z) = x^2 + y^2 + z^2$ has level sets in $\mathbb{R}^3$
consisting of points (x, y, z) that satisfy an equation of the form
$$x^2 + y^2 + z^2 = k$$
for some fixed real number k. If k > 0, we get a sphere of radius $\sqrt{k}$ centered at (0,
0, 0), because $x^2 + y^2 + z^2$ is the square of the distance from (x, y, z) to (0, 0, 0). If
k = 0, the equation is satisfied only by (0, 0, 0). If k < 0, the corresponding level
set is empty. Some level sets are shown in Figure 4.13(b) as concentric spheres. The
graph of f is a subset of $\mathbb{R}^4$ and can't be pictured.
FIGURE 4.13
(a) (b)
EXAMPLE 7
The linear function $g: \mathbb{R}^3 \to \mathbb{R}$ defined by
$$g(x, y, z) = x + y + z$$
has a graph in $\mathbb{R}^4$, so we can't draw it. The level sets of g are the parallel planes
with equations
$$x + y + z = k,$$
one for each real number k. Three of the planes are shown in Figure 4.14. Note that
each plane is perpendicular to the vector (1, 1, 1), because the equation also takes a
form showing (1, 1, 1) and (x, y, z) - (0, 0, k) perpendicular:
$$(1, 1, 1) \cdot (x, y, z - k) = 0.$$
FIGURE 4.14
Note that the graph of $f: \mathbb{R}^2 \to \mathbb{R}$ is the set of points (x, y, z) in $\mathbb{R}^3$ such that
z = f(x, y), and that this set is the same as the level set at level k = 0 of the
function $g: \mathbb{R}^3 \to \mathbb{R}$ given by g(x, y, z) = z - f(x, y). Whichever point of view
we take, we get the same picture.
EXERCISES
23. Suppose that the density per unit area of a thin film, referred to in (x, y)-coordinates,
is given by the formula $d(x, y) = x^2 + 2y^2 - x + 1$ for $-1 \le x \le 1$ and
$-1 \le y \le 1$. Sketch the set of points at which the film has density $\tfrac34$.
24. Let the density per unit of volume in a cubical box of side length 2 vary directly as
the distance from the center and
25. Suppose the region D in $\mathbb{R}^3$ consists of all points (x, y, z) satisfying both
$x^2 + y^2 \le 4$ and $0 \le z \le 5$. Suppose the temperature at a point (x, y, z) in D is
$T(x, y, z) = x^2 + y^2 - z$.
(a) Sketch the region D.
(b) Sketch the set of points in D for which the temperature is -1 degree.
2C Computer-Generated Graphs
Some graphs of functions f (x, y) are fairly easy to draw by hand. For example,
the graph of $z = \sqrt{1 - x^2 - y^2}$ is a hemisphere of radius 1 over the domain
$x^2 + y^2 \le 1$. A few examples that have been done by a computer are shown in
Figure 4.15. But whether sketching is done by hand or computer, the technique
illustrated here is fundamentally the same in that it consists of drawing curves on
the function's graph that are traced by holding one variable fixed and varying the
other.
To describe this technique another way, sketching the graph of z = f (x, y) is
possible by plotting some carefully chosen curves that lie on the surface, as in
Figure 4.15. The simplest curves to draw are often the ones that are images under
f of line segments in the domain of f that are parallel to the x and y axes, as
in Figure 4.15(b); this approach allows us to use either function values $f(x, y_0)$,
with $y_0$ fixed and x varying, or else $f(x_0, y)$ with $x_0$ fixed and y varying. Thus a
rectangular domain for f such as $0 \le x \le 2$, $\pi/4 \le y \le 3\pi/2$ for $f(x, y) = x\cos y$
might be treated using the following routine. This "program" is not intended to run
in a particular language, but is presented only as a compact way of indicating the
rough structure of such a program.
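A minimal Python sketch of a routine with this structure is given here. It only generates the curve points for $f(x, y) = x\cos y$ on $0 \le x \le 2$, $\pi/4 \le y \le 3\pi/2$, holding one variable fixed at a time; the function names, the number of curves, and the number of points per curve are assumptions, and the drawing step itself is omitted.

```python
import math

def f(x, y):
    return x * math.cos(y)

def linspace(a, b, n):
    """Return n + 1 equally spaced values from a to b inclusive."""
    return [a + (b - a) * k / n for k in range(n + 1)]

def grid_curves(f, x_range, y_range, n_curves=8, n_points=40):
    """Curves on the graph of f traced with one variable fixed and the other varying."""
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    curves = []
    # Hold y = y0 fixed and let x vary.
    for y0 in linspace(y_min, y_max, n_curves):
        curves.append([(x, y0, f(x, y0)) for x in linspace(x_min, x_max, n_points)])
    # Hold x = x0 fixed and let y vary.
    for x0 in linspace(x_min, x_max, n_curves):
        curves.append([(x0, y, f(x0, y)) for y in linspace(y_min, y_max, n_points)])
    return curves

curves = grid_curves(f, (0.0, 2.0), (math.pi / 4, 3 * math.pi / 2))
print(len(curves), len(curves[0]))
```

Each entry of `curves` is a list of points (x, y, z) that a plotting program would join by line segments after projecting to the viewing plane.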
Following this routine produces the picture shown in Figure 4.16. These drawings
are in a style that can in principle be drawn by hand, drawing one curve on the
graph at a time, with only one of the variables actually varying. Applications such as
Maple, Matlab, and Mathematica make drawings such as this with additional
sophistication. The Web site http://math.dartmouth.edu/rvrewn/ also provides some Java
programs in the style of the graphical techniques we use here.
FIGURE 4.15 Computer-drawn graphs, including (a) $z = x^2 - y^2$ and (d) $z = \sin(x^2 + y^2)/(x^2 + y^2)$
The Java programs are designed to ignore error-producing values such as square
roots of negative numbers or undefined function values that may arise from trying
to plot the graph of a function like $f(x, y) = \sqrt{1 - x^2 - y^2}$ over a rectangle that
contains the circular disk $x^2 + y^2 \le 1$. [Here for example, $f(1, 1) = \sqrt{-1}$.] The
natural plotting domain of the programs we use is a rectangle with edges parallel
to rectangular axes, but we may want to plot only over a domain with some other
shape such as a circular or triangular one. We do this easily using the Heaviside
unit step function defined by
$$H(x) = \begin{cases} 1, & \text{if } x \ge 0, \\ 0, & \text{if } x < 0. \end{cases}$$
FIGURE 4.16 $z = x\cos y$
EXAMPLE 8
Suppose we want a picture of the graph of the function $f(x, y) = \sin(x^2 + y^2)/(x^2 + y^2)$
with its domain restricted to the part of the first quadrant inside the circle
$x^2 + y^2 \le 9$ of radius 3. Using the Heaviside function we define a new function of
two variables h(x, y) by writing $h(x, y) = H(9 - x^2 - y^2)$. It follows that
$$h(x, y) = \begin{cases} 1, & \text{if } x^2 + y^2 \le 9, \\ 0, & \text{if } x^2 + y^2 > 9. \end{cases}$$
Thus h(x, y) takes the value 1 inside and on the circle of radius 3 centered at the
origin and the value 0 outside the circle. Then the product h(x, y)f(x, y) will take
on the value 0 outside the circle $x^2 + y^2 = 9$ and will be equal to f(x, y) inside
and on the circle. If we sketch the graph of z = h(x, y)f(x, y) over the square
$0 \le x \le 3$, $0 \le y \le 3$ we get a picture like Figure 4.17(a). By suppressing the zero
values we get Figure 4.17(b).
FIGURE 4.17 $z = \sin(x^2 + y^2)/(x^2 + y^2)$, $0 \le x \le 3$, $0 \le y \le 3$. (a) (b)
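The masking idea can be sketched in Python as follows. This is an illustration under stated assumptions, not the Java program mentioned in the text: the function names are hypothetical, and the sketch simply avoids evaluating f at the origin, where the quotient is undefined.

```python
import math

def H(x):
    """Heaviside unit step: 1 for x >= 0, 0 for x < 0."""
    return 1.0 if x >= 0 else 0.0

def f(x, y):
    """sin(x^2 + y^2)/(x^2 + y^2); undefined at (0, 0), so callers must skip that point."""
    r2 = x * x + y * y
    return math.sin(r2) / r2

def h(x, y):
    """Equals 1 inside and on the circle x^2 + y^2 = 9 and 0 outside it."""
    return H(9 - x * x - y * y)

def masked(x, y):
    """The product h*f: equal to f on the disk of radius 3 and 0 elsewhere."""
    return h(x, y) * f(x, y)

# Sample the square 0 <= x <= 3, 0 <= y <= 3 on a coarse grid, skipping the origin.
samples = [(i * 0.5, j * 0.5) for i in range(7) for j in range(7) if (i, j) != (0, 0)]
values = [(x, y, masked(x, y)) for x, y in samples]
print(len(values), masked(1.0, 2.0), masked(3.0, 3.0))
```

A plotting routine that suppresses zero values would then draw only the part of the graph lying over the quarter disk.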
In the previous example f(x, y) has not been defined at (x, y) = (0, 0) but
we've successfully avoided the issue. One way to do this is to define f(0, 0) = 1,
which incidentally will make f continuous at (0, 0). Another way is to incorporate
a feature in the plotting program that allows it to ignore points in the domain that
would normally produce an error message. This is what has been done with the Java
program GPLOT available at the Web site referred to previously.
EXERCISES
1. Sketch the graph of $f(x, y) = x^2 - y^2$, for $|x| \le 2$, $|y| \le 2$.
2. Sketch the graph of $f(x, y) = x^2 - y^2$ for $0 \le x \le 2$, $0 \le y \le 2$.
3. Sketch the plane z = 1 - x - y for $0 \le x \le 2$, $0 \le y \le 2$.
4. Sketch the plane x + 2y + z = 2 for $1 \le x \le 2$, $1 \le y \le 2$.
5. Sketch the graph of $f(x, y) = x^2 + y^3$ for $|x| \le 3$, $|y| \le 3$.
6. Sketch the graph of $f(x, y) = x^2 + y^2$ for $|x| \le 1$, $|y| \le 1$.
7. Sketch the graph of f(x, y) = x + y for $0 \le x \le 1$, $0 \le y \le 2$.
8. Sketch the graph of $f(x, y) = y^2 - x^3$, $0 \le x \le 2$, $0 \le y \le 1$.
9. Sketch the graph of $f(x, y) = \cos x \sin y$, $0 \le x \le 2\pi$, $0 \le y \le 2\pi$.
10. Sketch the graph of $f(x, y) = \exp(-x - 2y)$, $0 \le x \le 2$, $0 \le y \le 2$.
11. Let $f(x, y) = xy(x^3 + y^3)/(x^2 + y^2)$. Sketch its graph for $-1 \le x \le 1$,
$-1 \le y \le 1$. What is the difficulty at (x, y) = (0, 0)?
Using modifications of the Heaviside function H(x), form a product of functions that
we'll refer to as P(x, y), assuming the value 1 on each region as described in 12 to 17
and assuming the value zero elsewhere in $\mathbb{R}^2$.
12. P(x, y) takes the value 1 where $x \ge 0$, $y \ge 0$, and $y \le x$.
13. P(x, y) takes the value 1 where $1 \ge x \ge 0$, $1 \ge y \ge 0$, and $y \ge x$.
14. P(x, y) takes the value 1 where $x^2 + y^2 \le 1$.
15. P(x, y) takes the value 1 where $x^2 + y^2 \le 1$ and $x \ge y$.
16. P(x, y) takes the value 1 on the triangular region in $\mathbb{R}^2$ with vertices (0, 0),
(0, 1) and (1, 0).
17. P(x, y) takes the value 1 on the square in $\mathbb{R}^2$ with vertices (0, 0), (0, 1), (1, 0)
and (1, 1).
18. Sketch the graph of $f(x, y) = (x - y)^3$ for values of (x, y) simultaneously
satisfying $0 \le x \le 2$, $0 \le y \le 2$, and $y \le x$.
19. Sketch the graph of $f(x, y) = y^2 - x^2$ for values of (x, y) simultaneously
satisfying $0 \le x \le 2$, $0 \le y \le 2$, and $x \le y$.
20. Sketch the graph of $f(x, y) = \cos(x^2 + y^2)/(1 + x^2 + y^2)$, for $|x| \le 2$, $|y| \le 2$.
21. Sketch the graph of $f(x, y) = \cos(x^2 + y^2)/(1 + x^2 + y^2)$, for $x^2 + y^2 \le 2$.
22. Sketch the graph of $z = \sqrt{5 - x^2 - y^2}$ when $1 \le z \le 2$.
*24. Sketch the part of the sphere of radius 1 centered at the origin that lies above the
part of the first quadrant in the xy-plane that lies between the y-axis and the line
y = x.
2D Quadric Surfaces
Quadric surfaces are level sets in $\mathbb{R}^3$ of second-degree polynomials in three variables
x, y, z; they fall into six distinct types illustrated in Figure 4.18, plus some degenerate
cases in which the polynomial depends on only two variables. We'll be returning to
all of these surfaces later in Section 4, where we represent them parametrically in a
way similar to what we used to represent curves in space.
The elliptic cone in Figure 4.18(2) is a limit of hyperboloids in two ways: (i) As
the waist of the hyperboloid of one sheet pinches in while k decreases to 0 through
positive values it tends to the cone. (ii) As the two separate pieces of the hyperboloid
of two sheets get closer while k increases to 0 through negative values, the
two pieces of the hyperboloid become more pointed and come together to form
the cone.
FIGURE 4.18
(1) Hyperboloid of one sheet: $x^2/a^2 + y^2/b^2 - z^2 = k > 0$
(2) Elliptic cone: $x^2/a^2 + y^2/b^2 - z^2 = 0$
(3) Hyperboloid of two sheets: $x^2/a^2 + y^2/b^2 - z^2 = k < 0$
(4) Ellipsoid: $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$
(5) Elliptic paraboloid: $x^2/a^2 + y^2/b^2 - z = 0$
(6) Hyperbolic paraboloid: $x^2/a^2 - y^2/b^2 - z = 0$
As well as being level sets at level 0 of functions of three variables, the two
paraboloids are also graphs in $\mathbb{R}^3$ of the respective functions $(x/a)^2 \pm (y/b)^2$ defined
on $\mathbb{R}^2$. The surface of Example 5 is an elliptic paraboloid in which a = b = 1. The
spheres of Example 6 are a special case of the ellipsoid in which a = b = c.
The degenerate cases mentioned previously are cylinders, which may be level
sets of functions on $\mathbb{R}^3$ that really depend on only two variables, say x and y. For
example, the equation $x^2 + y^2 = 1$ that determines a circle in $\mathbb{R}^2$ also determines
a circular cylinder in $\mathbb{R}^3$. Since the equation places no restriction on z, a level set
satisfying $x^2 + y^2 = k > 0$ in $\mathbb{R}^3$ contains all lines perpendicular to the xy-plane
and passing through the circle of radius $\sqrt{k}$.
To make highly accurate pictures of surfaces, in particular quadric surfaces, we
use computer graphics. On the other hand, rough sketches are often based on the
observation that well-known curves such as lines, parabolas, ellipses, and hyperbolas
lie on these surfaces and are useful guides in making a drawing.
EXERCISES
1. Use the picture of the generic elliptic cone in Figure 4.18 as a guide in making
sketches of the circular cones (a) $x^2 + y^2 - z^2 = 0$. (b) $x^2 + z^2 - y^2 = 0$.
(c) $y^2 + z^2 - x^2 = 0$.
2. Make sketches in $\mathbb{R}^3$ of (a) circular cylinder $x^2 + y^2 = 1$. (b) parabolic cylinder
$x^2 - y = 0$. (c) hyperbolic cylinder $x^2 - y^2 = 1$.
3. The ellipsoid, the elliptic paraboloid, and hyperbolic paraboloid are shown as (4),
(5) and (6) in Figure 4.18 as generic level surfaces of quadratic polynomials at
respective levels 1, 0, and 0. What, if anything, would be altered in the pictures if we
had chosen levels 2 in (4), 1 in (5), and 2 in (6)?
4. (a) Show that the intersection of the hyperboloid $H_1$ of one sheet
$$(x/a)^2 + (y/b)^2 - (z/c)^2 = 1$$
7. Consider two distances in $\mathbb{R}^3$: (i) from (x, y, z) to the plane z = -1, (ii) from
(x, y, z) to the point (0, 0, 1). The points (x, y, z) for which these distances are equal
constitute a quadric surface Q. Identify Q and make a sketch of it.
8. The paraboloids $x^2 + y^2 = z$ and $x^2 + y^2 = 8 - z$ intersect in a curve in $\mathbb{R}^3$.
Identify the curve and make a sketch of it.
Each of the following quadratic equations describes an example of one of the quadric
surface types illustrated in the text. In each case identify the type by name and make a
sketch of the surface.
9. $4x^2 - y^2 + 4z^2 = 16$
10. $x^2/4 + y^2/4 - z^2/9 = 1$
11. $x^2/4 - y^2/4 + z^2/9 = 1$
12. $4x^2 + y^2 + 4z^2 = 16$
13. $x^2/4 + y^2/4 + z^2/9 = 1$
14. $x^2/4 - y^2/4 - z^2/9 = 0$
$$\frac{\partial f}{\partial x}(x, y) = \lim_{t \to 0} \frac{f(x + t, y) - f(x, y)}{t}.$$
Thus a partial derivative is the result of differentiating with respect to just one
variable at a time with the others held fixed. If the derivatives $\partial f/\partial x$ and $\partial f/\partial y$
exist they are also functions from $\mathbb{R}^2$ to $\mathbb{R}$.
A similar definition works for functions defined on $\mathbb{R}^n$. For each i = 1, ..., n,
we define a new real-valued function called the partial derivative of f with respect
to the ith variable, denoted by $\partial f/\partial x_i$. For each $x = (x_1, \ldots, x_n)$ in the domain of
f, the number $(\partial f/\partial x_i)(x)$ is by definition
$$\textbf{3.1}\qquad \frac{\partial f}{\partial x_i}(x) = \lim_{t \to 0} \frac{f(x_1, \ldots, x_i + t, \ldots, x_n) - f(x_1, \ldots, x_i, \ldots, x_n)}{t}.$$
The domain space of $\partial f/\partial x_i$ is $\mathbb{R}^n$, and the domain of $\partial f/\partial x_i$ is the subset of
the domain of f consisting of all x for which the preceding limit exists. Thus the
domain of $\partial f/\partial x_i$ could conceivably be the empty set. The number $(\partial f/\partial x_i)(x)$
is simply the derivative at $x_i$ of the function of one variable obtained by holding
$x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n$ fixed and by considering f to be a function of the ith
variable only. As a result, the differentiation formulas of one-variable calculus apply
directly.
It's important to realize that we do not call a function "differentiable" just because
it has partial derivatives. For functions of more than one variable, the concept of
differentiability is a little more complicated than that; the matter is taken up in
Chapter 5.
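Let $f(x, y, z) = x^2 y + y^2 z + z^2 x$. Then
\[ \frac{\partial f}{\partial x}(x, y, z) = 2xy + z^2, \qquad
\frac{\partial f}{\partial y}(x, y, z) = x^2 + 2yz, \qquad
\frac{\partial f}{\partial z}(x, y, z) = y^2 + 2zx. \]
The partial derivatives at $\mathbf{x} = (1, 2, 3)$ are
\[ \frac{\partial f}{\partial x}(1, 2, 3) = 4 + 9 = 13, \qquad
\frac{\partial f}{\partial y}(1, 2, 3) = 1 + 12 = 13, \qquad
\frac{\partial f}{\partial z}(1, 2, 3) = 4 + 6 = 10. \]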
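The values 13, 13, 10 can be checked numerically against the limit in Definition 3.1. The following is a minimal sketch, not part of the text: the helper name `partial` and the step size `t` are assumptions made for illustration, and a symmetric difference quotient is used in place of the one-sided quotient because it is more accurate for a fixed small `t`.

```python
# Symmetric difference-quotient check of the partial derivatives of
# f(x, y, z) = x^2 y + y^2 z + z^2 x at the point (1, 2, 3).

def f(x, y, z):
    return x**2 * y + y**2 * z + z**2 * x

def partial(func, point, i, t=1e-6):
    """Approximate the i-th partial derivative of func at point.

    Hypothetical helper: only the i-th coordinate is varied, the
    others are held fixed, as in the definition of a partial derivative.
    """
    forward = list(point)
    backward = list(point)
    forward[i] += t
    backward[i] -= t
    return (func(*forward) - func(*backward)) / (2 * t)

point = (1.0, 2.0, 3.0)
approx = [partial(f, point, i) for i in range(3)]
print(approx)  # three values close to 13, 13, 10
```

Only one coordinate is changed in each call, which is exactly the sense in which the other variables are "held fixed."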
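We can repeat the operation of taking partial derivatives. The partial derivative of $\partial f/\partial x_i$ with respect to the $j$th variable is $\partial/\partial x_j(\partial f/\partial x_i)$ and is denoted by $\partial^2 f/\partial x_j\,\partial x_i$. We may repeat this indefinitely, provided the derivatives exist. An alternative notation for higher-order partial derivatives is illustrated as follows, in which each variable of differentiation is denoted by a subscript:
\[ \begin{aligned}
\frac{\partial f}{\partial x_i} &= f_{x_i}, \\
\frac{\partial}{\partial x_j}\left(\frac{\partial f}{\partial x_i}\right) &= \frac{\partial^2 f}{\partial x_j\,\partial x_i} = f_{x_i x_j}, \\
\frac{\partial}{\partial x_i}\left(\frac{\partial f}{\partial x_i}\right) &= \frac{\partial^2 f}{\partial x_i^2} = f_{x_i x_i}, \\
\frac{\partial}{\partial x_k}\left(\frac{\partial^2 f}{\partial x_j\,\partial x_i}\right) &= \frac{\partial^3 f}{\partial x_k\,\partial x_j\,\partial x_i} = f_{x_i x_j x_k}.
\end{aligned} \]
Note that the order of variables in the subscript notation is the opposite of that in the $\partial$-notation, since for example $f_{xy}$ means $(f_x)_y$.
\[ f_{yxx} = \frac{\partial^3 f}{\partial x^2\,\partial y} = 0 \]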
3B Geometric Interpretation
To interpret partial derivatives geometrically, we rely on something we know about real-valued functions of a single variable, namely that the value of the derivative at a point is the slope of the tangent line to the graph of the function at that point. For illustrative purposes it will be enough to consider the graph of a function $\mathbb{R}^2 \xrightarrow{f} \mathbb{R}$, namely, the set of points $(x, y, f(x, y))$ in $\mathbb{R}^3$ where $(x, y)$ is in the domain of $f$. Such a graph is in Figure 4.19 as a surface lying over a rectangle in the $xy$-plane. The intersection of the surface with the vertical plane determined by the condition $y = b$ is a curve satisfying the conditions
\[ z = f(x, y), \qquad y = b. \]
Consider the curve defined by the function $g(x) = f(x, b)$ as a subset of 2-dimensional space. Its slope at $x = a$ is
\[ g'(a) = \frac{\partial f}{\partial x}(a, b). \]
FIGURE 4.19
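Similarly, the curve defined by the function $h(y) = f(a, y)$ has slope at $y = b$ equal to
\[ h'(b) = \frac{\partial f}{\partial y}(a, b). \]
The angles $\alpha$ and $\beta$ shown in Figure 4.19 therefore satisfy
\[ \tan\alpha = \frac{\partial f}{\partial x}(a, b), \qquad \tan\beta = \frac{\partial f}{\partial y}(a, b). \]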
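The numbers $\tan\alpha$ and $\tan\beta$ are slopes of tangent lines to two curves contained in the graph of the function $f$. For this reason it's natural to try to define a tangent plane to the graph of $f$ just to be the plane containing these two lines. If $f$ satisfies the condition of differentiability defined in Chapter 5, then that turns out to be consistent with our ultimate definition. We see that the set of points $(x, y, z)$ satisfying
\[ z = f(a, b) + \frac{\partial f}{\partial x}(a, b)(x - a) + \frac{\partial f}{\partial y}(a, b)(y - b) \tag{3.2} \]
is the plane through $(a, b, f(a, b))$ that contains these two tangent lines.
The graph of the function
\[ f(x, y) = 1 - 2x^2 - y^2 \]
corresponding to $x \ge 0$, $y \ge 0$ is in Figure 4.20. The function $f$ has partial derivatives at $(\tfrac12, \tfrac12)$ given by
\[ \frac{\partial f}{\partial x}\left(\tfrac12, \tfrac12\right) = -2, \qquad \frac{\partial f}{\partial y}\left(\tfrac12, \tfrac12\right) = -1. \]
Since $f(\tfrac12, \tfrac12) = \tfrac14$, the tangent plane to the graph of $f$ at $(\tfrac12, \tfrac12)$ is, by Equation 3.2,
\[ z = \tfrac14 - 2\left(x - \tfrac12\right) - \left(y - \tfrac12\right) = \tfrac74 - 2x - y. \]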
We can sketch the tangent plane by drawing the two tangent lines in it determined at $(x, y) = (\tfrac12, \tfrac12)$. It's somewhat easier to locate three points on the plane, for simplicity
\[ \left(\tfrac78, 0, 0\right), \quad \left(0, \tfrac74, 0\right), \quad \left(0, 0, \tfrac74\right), \]
and then sketch the plane containing these points. The point of tangency on the graph of $f$ is $(\tfrac12, \tfrac12, \tfrac14)$. See Figure 4.20.
FIGURE 4.20
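The tangent-plane computation for $f(x, y) = 1 - 2x^2 - y^2$ at $(\tfrac12, \tfrac12)$ can be reproduced with exact rational arithmetic. This is a small sketch under the assumption that the tangent plane is $z = f(a, b) + f_x(a, b)(x - a) + f_y(a, b)(y - b)$; the function names are chosen here for illustration only.

```python
from fractions import Fraction

def f(x, y):
    return 1 - 2 * x**2 - y**2

def f_x(x, y):
    # Partial derivative with respect to x, computed by hand.
    return -4 * x

def f_y(x, y):
    # Partial derivative with respect to y, computed by hand.
    return -2 * y

a = b = Fraction(1, 2)

def tangent_plane(x, y):
    """Height z of the tangent plane at (a, b) above the point (x, y)."""
    return f(a, b) + f_x(a, b) * (x - a) + f_y(a, b) * (y - b)

# Intercepts of the plane with the coordinate axes.
z_intercept = tangent_plane(0, 0)
on_x_axis = tangent_plane(Fraction(7, 8), 0)
on_y_axis = tangent_plane(0, Fraction(7, 4))
print(z_intercept, on_x_axis, on_y_axis)
```

The first value is the height at which the plane meets the $z$-axis; the other two vanish exactly when the chosen points lie on the plane in the $xy$-plane.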
3C Continuity
We discuss continuity for functions of more than one variable extensively in Chapter 5. At this point, we'll consider briefly the case $\mathbb{R}^2 \xrightarrow{f} \mathbb{R}$. To allow $\mathbf{z}$ to approach $\mathbf{x}$ from an arbitrary direction we assume, for each $\mathbf{x} = (x, y)$ in the domain of $f$, that $f(\mathbf{z})$ is defined for all vectors $\mathbf{z} = (z, w)$ satisfying $|\mathbf{x} - \mathbf{z}| < \delta$, where $\delta$ is some positive number. We then say that $f$ is continuous if for each point $\mathbf{x}$ in the domain of $f$
\[ \lim_{\mathbf{z} \to \mathbf{x}} f(\mathbf{z}) = f(\mathbf{x}). \]
The limit relation means that we can make $f(\mathbf{z})$ arbitrarily close to $f(\mathbf{x})$ if the distance $|\mathbf{x} - \mathbf{z}|$ from $\mathbf{x}$ to $\mathbf{z}$ is small enough. As usual, the intuitive idea of continuity is that the values of the function $f$ should not change abruptly, resulting, for example, in breaks in the graph of $f$. The graphs shown in Figures 4.19 and 4.20 are those of continuous functions, whereas Figure 4.21 shows a simple example of the graph of a discontinuous function.
FIGURE 4.21
If we assume certain continuity conditions on $f$ and its partial derivatives then the higher-order partial derivatives of $\mathbb{R}^2 \xrightarrow{f} \mathbb{R}$ are independent of the order of differentiation. The precise statement follows, though we remark that a slightly stronger theorem is true. (See Exercise 13 of Chapter 7, Section 3.)
3.3 Clairaut's Theorem. Let $\mathbb{R}^2 \xrightarrow{f} \mathbb{R}$ be continuous and such that $f_x$, $f_y$, $f_{xy}$, and $f_{yx}$ are also continuous on the same domain as $f$. Then $f_{xy} = f_{yx}$.
Proof. Choose $x$, $y$, $h \ne 0$, $k \ne 0$ and $\delta > 0$ so the difference
\[ F(h, k) = [f(x + h, y + k) - f(x + h, y)] - [f(x, y + k) - f(x, y)] \]
is defined if $\sqrt{h^2 + k^2} < \delta$. We now apply the mean-value theorem in the variable $x$ to the function
\[ G(x) = f(x, y + k) - f(x, y) \]
on the interval with endpoints $x$ and $x + h$. We find
\[ G(x + h) - G(x) = hG'(x_1), \]
where $x_1$ is between $x$ and $x + h$. Since $G'(x_1) = f_x(x_1, y + k) - f_x(x_1, y)$, a second application of the mean-value theorem, this time in the variable $y$, shows that in terms of $F$ and $f$ this last equation is
\[ F(h, k) = hk\,f_{xy}(x_1, y_1), \]
where $y_1$ is between $y$ and $y + k$. Rewriting $F$ in the form
\[ F(h, k) = [f(x + h, y + k) - f(x, y + k)] - [f(x + h, y) - f(x, y)] \]
allows us to follow the same general procedure, this time differentiating with respect to $y$, then $x$. We find
\[ F(h, k) = hk\,f_{yx}(x_2, y_2), \]
where $x_2$ and $y_2$ lie between $x$, $x + h$ and $y$, $y + k$ respectively. Equating the two expressions found for $F(h, k)$, and canceling the factor $hk$, gives
\[ f_{xy}(x_1, y_1) = f_{yx}(x_2, y_2). \]
Now let both $h$ and $k$ tend to zero. It follows from the positions of the $x_i$ and $y_i$ that the distances
\[ |(x_1, y_1) - (x, y)| \quad \text{and} \quad |(x_2, y_2) - (x, y)| \]
both tend to zero. Therefore, by the continuity of $f_{xy}$ and $f_{yx}$, we get $f_{xy}(x, y) = f_{yx}(x, y)$. The point $(x, y)$ was arbitrary, so $f_{xy} = f_{yx}$ on the domain of $f$. ∎
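We may apply Theorem 3.3 successively to still higher-order partial derivatives, provided the analogous differentiability and continuity requirements are satisfied. Moreover, by considering only two variables at a time, we can apply the theorem to functions $\mathbb{R}^n \xrightarrow{f} \mathbb{R}$ where $n > 2$. Thus for the commonly encountered functions that have continuous partial derivatives of arbitrarily high order, we have typically
\[ \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x}, \qquad
\frac{\partial^3 g}{\partial x\,\partial y\,\partial x} = \frac{\partial^3 g}{\partial x^2\,\partial y}, \qquad
\frac{\partial^4 h}{\partial z\,\partial x\,\partial y\,\partial z} = \frac{\partial^4 h}{\partial x\,\partial y\,\partial z^2}, \quad \text{etc.} \]
The last two formulas follow from repeated application of the two-variable formula by interchanging two differentiations at a time.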
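The equality of mixed partial derivatives can be illustrated numerically with the second difference quotient $F(h, k)/(hk)$ that appears in the proof of Theorem 3.3. The sketch below is an illustration only: the sample function, the point $(1, 2)$, and the step sizes are assumptions chosen here, not taken from the text. Because $F(h, k)$ is symmetric in the roles of $x$ and $y$, the same quotient approximates both $f_{xy}$ and $f_{yx}$.

```python
import math

def f(x, y):
    # Sample smooth function chosen for illustration (not from the text).
    return x**3 * y**2 + math.sin(x * y)

def f_xy_exact(x, y):
    # Mixed partial computed by hand: f_x = 3x^2 y^2 + y cos(xy),
    # then differentiate f_x with respect to y.
    return 6 * x**2 * y + math.cos(x * y) - x * y * math.sin(x * y)

def second_difference(x, y, h, k):
    # F(h, k) / (h k), with F as in the proof of Theorem 3.3.
    F = f(x + h, y + k) - f(x + h, y) - f(x, y + k) + f(x, y)
    return F / (h * k)

x0, y0 = 1.0, 2.0
estimate = second_difference(x0, y0, 1e-4, 1e-4)
exact = f_xy_exact(x0, y0)
print(estimate, exact)  # the two numbers should be close
```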
EXERCISES
In Exercises 1 to 6, find $\partial f/\partial x$ and $\partial f/\partial y$, where $f(x, y)$ is the given function.
1. $x^2 + x\sin(x + y)$
2. $\sin x \cos(x + y)$
3. $e^{x+y+1}$
9. $f(x, y) = \dfrac{1}{x^2 + y^2}$, $(a, b) = (1, 1)$
10. $f(x, y) = x(y^2 + 1)$, $(a, b) = (0, 2)$
In Exercises 11 to 14, find $\dfrac{\partial^2 f}{\partial y\,\partial x}$ and $\dfrac{\partial^2 f}{\partial x\,\partial y}$, where $f$ is as given.
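\[ \text{If } f(\mathbf{x}) = \begin{pmatrix} f_1(\mathbf{x}) \\ \vdots \\ f_m(\mathbf{x}) \end{pmatrix}, \text{ then } \frac{\partial f}{\partial x_i}(\mathbf{x}) = \begin{pmatrix} \dfrac{\partial f_1}{\partial x_i}(\mathbf{x}) \\ \vdots \\ \dfrac{\partial f_m}{\partial x_i}(\mathbf{x}) \end{pmatrix}. \tag{4.1} \]
Writing $g(x, y)$ as a column vector, suppose that $\mathbb{R}^2 \xrightarrow{g} \mathbb{R}^2$ is
\[ g(x, y) = \begin{pmatrix} x^2 y \\ x y^2 \end{pmatrix}. \]
Then
\[ \frac{\partial g}{\partial x}(x, y) = \begin{pmatrix} 2xy \\ y^2 \end{pmatrix} \quad \text{and} \quad \frac{\partial g}{\partial y}(x, y) = \begin{pmatrix} x^2 \\ 2xy \end{pmatrix}. \]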
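If
\[ f(u, v) = \begin{pmatrix} u\cos v \\ u\sin v \\ v \end{pmatrix}, \]
then
\[ \frac{\partial f}{\partial u}(u, v) = \begin{pmatrix} \cos v \\ \sin v \\ 0 \end{pmatrix} \quad \text{and} \quad \frac{\partial f}{\partial v}(u, v) = \begin{pmatrix} -u\sin v \\ u\cos v \\ 1 \end{pmatrix}. \]
If $\mathbf{x}$ and $\mathbf{y}$ are constant vectors, and $h(u, v) = u\mathbf{x} + v\mathbf{y}$, then $\partial h/\partial u(u, v) = \mathbf{x}$ and $\partial h/\partial v(u, v) = \mathbf{y}$.
The geometric significance of the vector partial derivative is as follows. If all coordinates but one are held fixed, and the remaining one, say $x_i$, is allowed to vary, then $f(\mathbf{x}) = f(x_1, \ldots, x_i, \ldots, x_n)$ traces an image curve in $\mathbb{R}^m$, sometimes called a coordinate curve if $m \ge n$. Hence by the interpretation of Section 1, the vector $\partial f/\partial x_i(\mathbf{x})$ is a tangent vector to this coordinate curve at the image point $f(\mathbf{x})$.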
Similarly,
\[ \frac{\partial f}{\partial v}(u_0, v) = \frac{\partial(u_0\mathbf{x}_1 + v\mathbf{x}_2)}{\partial v} = \mathbf{x}_2. \]
Thus $\mathbf{x}_1$ and $\mathbf{x}_2$ each plays the role of tangent vector to a line in the plane parametrized by $f$.
where in the picture we have restricted the parameters $u$ and $v$ so that $0 \le u \le 4$ and $0 \le v \le 3\pi$. If $u = u_0$ is held fixed and $v$ varies, we get a helical curve (see Example 8 in Section 1) winding one and one-half times around the $z$-axis on a cylinder of radius $u_0$. With $v = v_0$ and $u$ varying, we get a line segment $v_0$ units above the $xy$-plane. The vector partial derivatives of $f$ were computed in Example 2.
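At $(u_0, v_0) = (1, \pi/4)$, we get the tangent vectors
\[ \frac{\partial f}{\partial u}(1, \pi/4) = \begin{pmatrix} 1/\sqrt2 \\ 1/\sqrt2 \\ 0 \end{pmatrix}, \qquad \frac{\partial f}{\partial v}(1, \pi/4) = \begin{pmatrix} -1/\sqrt2 \\ 1/\sqrt2 \\ 1 \end{pmatrix}, \]
to the two parameter curves through the point $f(1, \pi/4) = (1/\sqrt2, 1/\sqrt2, \pi/4)$ on the surface. The tangent plane at this point is represented parametrically by
\[ \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1/\sqrt2 \\ 1/\sqrt2 \\ \pi/4 \end{pmatrix} + s\begin{pmatrix} 1/\sqrt2 \\ 1/\sqrt2 \\ 0 \end{pmatrix} + t\begin{pmatrix} -1/\sqrt2 \\ 1/\sqrt2 \\ 1 \end{pmatrix}. \]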
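The tangent vectors at $(u_0, v_0) = (1, \pi/4)$ can be evaluated directly from the formulas for the vector partial derivatives of $f(u, v) = (u\cos v, u\sin v, v)$. The following is a brief sketch; the function names and the parameters `s`, `t` for the tangent plane are assumptions made here for illustration.

```python
import math

def f(u, v):
    return (u * math.cos(v), u * math.sin(v), v)

def f_u(u, v):
    # Vector partial derivative with respect to u.
    return (math.cos(v), math.sin(v), 0.0)

def f_v(u, v):
    # Vector partial derivative with respect to v.
    return (-u * math.sin(v), u * math.cos(v), 1.0)

def tangent_plane_point(s, t, u0=1.0, v0=math.pi / 4):
    """Point f(u0, v0) + s f_u(u0, v0) + t f_v(u0, v0) on the tangent plane."""
    base, du, dv = f(u0, v0), f_u(u0, v0), f_v(u0, v0)
    return tuple(b + s * p + t * q for b, p, q in zip(base, du, dv))

tu = f_u(1.0, math.pi / 4)
tv = f_v(1.0, math.pi / 4)
dot = sum(p * q for p, q in zip(tu, tv))
print(tu, tv, dot)
```

The dot product of the two tangent vectors is zero here, so the two parameter curves cross at a right angle at this point.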
The function $f(u, v) = (u\cos v, u\sin v, v)$ with domain altered from the previous example to the rectangle $-1 \le u \le 1$, $-3\pi \le v \le 3\pi$ has for its image a helicoid $H$ that winds around the central axis. Figure 4.23(a) shows a sketch of a portion of the surface. The tangent plane to $H$ at a point $f(0, v_0) = (0, 0, v_0)$ on the vertical axis is generated by two tangent vectors, $f_u(0, v_0) = (\cos v_0, \sin v_0, 0)$ and $f_v(0, v_0) = (0, 0, 1)$. Since the second of these two vectors is parallel to the central axis, the tangent plane at a point on the axis always contains the axis. Since the dot product of $f_u(0, v_0)$ and $f_v(0, v_0)$ equals zero the tangent plane also contains segments perpendicular to the central axis and lying in the surface. The surface is pictured as a kind of "skeleton," reminiscent of models of the DNA molecule and their linking of points on pairs of helices shown in Figure 4.23(a).
FIGURE 4.23 (a) (b)
function from that of the helicoid.) The parameter values $(u, v) = (0, v_0)$ all correspond to the single point $(0, 0, 0)$ on $C$. Furthermore, the attempt to find a pair of independent vectors $\mathbf{x}_u(0, v_0) = (\cos v_0, \sin v_0, 0)$ and $\mathbf{x}_v(0, v_0) = (0, 0, 0)$ at $(0, 0, 0)$ fails to produce a tangent plane, since the second vector fails to provide a tangent direction. At every other point $C$ has a well-defined tangent plane, as you are asked to show in Exercise 19.
The sharp point at the ends of the two symmetric halves of the cone in the previous example is called "singular" because the surface lacks a certain smoothness that we like to associate with a typical point of a surface. The official definition is as follows. Recall the related definition of smooth curve in Section 1 of this chapter.
restricted by $0 \le u \le 2$ and $0 \le v \le \pi$. This example comes from the previous one by elimination of the third coordinate in the range, so the image surface is restricted to the image plane. Figure 4.24(a) shows the domain of $h$ and Figure 4.24(b) shows the image. To get an idea of how the transformation behaves, we look at the images under $h$ of the lines parallel to the axes in the domain. The resulting curves in $\mathbb{R}^2$, given by
\[ x = u\cos v, \qquad y = u\sin v, \]
are semicircles if $u$ is held fixed and line segments if $v$ is held fixed. The image of $f$ restricted to the rectangle in Figure 4.24(a) is the half-disk in Figure 4.24(b). In this example the image "surface" lies in $\mathbb{R}^2$.
solution to finding a function JR2 --1+ JR3 whose graph is a part of the helicoid
parametrized by
x = ucosv, y = usinv, z = v.
The following example illustrates the general proposition that when both representations apply, the tangent planes turn out to be the same.
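The function $f(x, y) = e^{x-y}$ has partial derivatives $f_x(1, 1) = 1$ and $f_y(1, 1) = -1$. Hence the tangent plane to the graph $S$ of $f$ at the point of tangency $(1, 1, 1)$ is
\[ z = 1 + (1)(x - 1) + (-1)(y - 1) \quad \text{or} \quad z = x - y + 1. \]
\[ g(u, v) = \begin{pmatrix} u \\ v \\ e^{u-v} \end{pmatrix}, \quad \text{so} \quad g_u(u, v) = \begin{pmatrix} 1 \\ 0 \\ e^{u-v} \end{pmatrix}, \quad g_v(u, v) = \begin{pmatrix} 0 \\ 1 \\ -e^{u-v} \end{pmatrix} \]
as parametric representations for $S$ and its tangent vectors. Parametrically, the tangent at $g(1, 1) = (1, 1, 1)$ is $\mathbf{x} = g(1, 1) + u\,g_u(1, 1) + v\,g_v(1, 1)$, or
\[ \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + u\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} + v\begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix}. \]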
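That the explicit and parametric descriptions give the same tangent plane can be checked by substituting points of the parametric plane into the equation $z = x - y + 1$. The sketch below assumes the tangent vectors $g_u(1, 1) = (1, 0, 1)$ and $g_v(1, 1) = (0, 1, -1)$ obtained from $g(u, v) = (u, v, e^{u-v})$; the helper names are illustrative.

```python
def parametric_tangent_point(u, v):
    """Point g(1,1) + u g_u(1,1) + v g_v(1,1) for g(u, v) = (u, v, e^(u-v))."""
    base = (1, 1, 1)
    g_u = (1, 0, 1)    # (1, 0, e^(u-v)) evaluated at (1, 1)
    g_v = (0, 1, -1)   # (0, 1, -e^(u-v)) evaluated at (1, 1)
    return tuple(b + u * p + v * q for b, p, q in zip(base, g_u, g_v))

def explicit_plane(x, y):
    """Tangent plane z = x - y + 1 to the graph of f(x, y) = e^(x-y) at (1, 1, 1)."""
    return x - y + 1

# Every point of the parametric plane should satisfy the explicit equation.
parameter_values = [(-2, 3), (0, 0), (1, 1), (5, -4)]
agree = all(
    parametric_tangent_point(u, v)[2]
    == explicit_plane(*parametric_tangent_point(u, v)[:2])
    for u, v in parameter_values
)
print(agree)
```

Integer parameter values are used so that the comparison is exact rather than subject to rounding.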
4B Quadric Surfaces
The definition of quadric surface given in Section 2D is fundamental for many pur-
poses, but parametric representations provide insight into the structure of some of
them, and will also be useful for certain multiple integration problems later on. The
elliptic and hyperbolic paraboloids are graphs of real-valued functions, so these two
types are only a notational change away from parametrization, as we've just seen. Of
the remaining types, we'll treat the elliptic cone and the ellipsoid in detail, leaving
details of the hyperboloids as exercises.
Geometrically an elliptic cone is generated by all lines that pass through the points
of a fixed ellipse in space and that also pass through a fixed point not in the plane of
the ellipse. (In particular, an elliptic cone could be one of the familiar right circular
cones.) Using coordinates in R3, let the fixed point be the origin, and let the fixed
ellipse be the one that projects parallel to the z-axis from the plane z = 1 onto
the ellipse (x/a) 2 + (y/b) 2 = 1 in the xy-plane. A typical point on the ellipse has
coordinates (a cos v, b sin v, 1), so a line joining this point to the origin consists of
all points of the form
\[ \mathbf{x}(u, v) = (au\cos v, bu\sin v, u). \]
FIGURE 4.25 (a) (b)
As $u$ varies for fixed $v$, $\mathbf{x}(u, v)$ traces one of the generating lines of the cone. As $v$ varies for fixed $u \ne 0$, $\mathbf{x}(u, v)$ traces an ellipse with semiaxes $|ua|$ and $|ub|$ in the plane $z = u$. See Figure 4.25(a).
To identify the cone as the quadric surface defined in Section 2, just extract $x$, $y$ and $z$ from the parametrization, and observe that
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = u^2\cos^2 v + u^2\sin^2 v = z^2. \]
EXAMPLE 11 Geometrically an ellipsoid is a closed surface with three perpendicular axes of symmetry such that the plane cross sections perpendicular to these axes are ellipses, possibly circles. Taking the symmetry axes to be the coordinate axes in $\mathbb{R}^3$, we let $a$, $b$ and $c$ be positive numbers and
\[ \mathbf{x}(u, v) = \begin{pmatrix} a\cos u\sin v \\ b\sin u\sin v \\ c\cos v \end{pmatrix}. \]
Identification with the quadric surface of Section 2 comes from checking that
\[ (x/a)^2 + (y/b)^2 + (z/c)^2 = \cos^2 u\sin^2 v + \sin^2 u\sin^2 v + \cos^2 v = 1. \]
For a fixed $v$ between $0$ and $\pi$ and varying $u$, $\mathbf{x}(u, v)$ traces an elliptic latitude curve $(x/(a\sin v))^2 + (y/(b\sin v))^2 = 1$ in the plane $z = c\cos v$. As $v$ varies for fixed $u$, $\mathbf{x}(u, v)$ traces longitude curves extending between the "north pole" and the "south pole." Typical curves are shown in Figure 4.25(b).
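The identity used to identify the parametrized ellipsoid with the quadric surface can be spot-checked numerically. This is only a sketch: the semiaxes `a`, `b`, `c` and the grid of parameter values are arbitrary choices made here, and the helper names are not from the text.

```python
import math

def ellipsoid_point(u, v, a, b, c):
    """Point x(u, v) = (a cos u sin v, b sin u sin v, c cos v)."""
    return (a * math.cos(u) * math.sin(v),
            b * math.sin(u) * math.sin(v),
            c * math.cos(v))

def quadric_value(point, a, b, c):
    """Left side of (x/a)^2 + (y/b)^2 + (z/c)^2 = 1."""
    x, y, z = point
    return (x / a) ** 2 + (y / b) ** 2 + (z / c) ** 2

a, b, c = 3.0, 2.0, 1.0  # arbitrary positive semiaxes (assumption)
residuals = [
    abs(quadric_value(ellipsoid_point(u, v, a, b, c), a, b, c) - 1.0)
    for u in (0.0, 0.7, 2.0, 4.5)
    for v in (0.1, 1.0, math.pi / 2, 3.0)
]
max_residual = max(residuals)
print(max_residual)  # should be at the level of rounding error
```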
EXERCISES
1. j(X, y) =
x+y
X - )'
( x2 + y2
)
4. f(x, y) - u; )
5. $f(x, y) = (e^x, e^y, e^{x+y})$
6. f(x,y)=( ~ )+( ~)
For each of the functions in Exercises 7 to 10, find all first-order vector partial derivatives at the indicated point. Then for each one, sketch the curves on the image surface passing through the image point $g(u_0, v_0)$ and having the property that only $u$ varies on one curve, and only $v$ varies on the other; that is, sketch the coordinate curves through $g(u_0, v_0)$. Finally, sketch the two tangent vectors given by the partial derivatives.
(b) Find the image of the segment of the line $y = x$ between $(0, 0)$ and $(1, 1)$.
(c) Find the image of the region defined for positive $x$ and $y$, and $x^2 + y^2 < 1$.
(d) Find the angle between the images of the lines $y = 0$ and $y = (1/\sqrt3)x$.
17. A vector function $f$ from the $xy$-plane to the $uv$-plane is defined by
(x: 4x
y)2 ) ' X =;,i= 0.
f
Ix)- ( -y2)
~y -
x
2
2xy ·
Consider the domain space to be the $xy$-plane and the range space to be the $uv$-plane.
(a) What are the coordinate functions of $f$?
(a) Show that $T$ is also parametrized by $g(u, v) = (u, v, \sqrt{u^2 + v^2})$ by showing that the images of both parametrizations coincide with the level set $x^2 + y^2 - z^2 = 0$ for $z \ge 0$.
(b) Show that the pointed end of $T$ is singular with respect to the parametrization $g(u, v)$, but for a different reason than for the parametrization given in Example 10.
(c) Show that, with respect to $g(u, v)$, $T$ is smooth at every point other than the tip.
22. (a) Sketch the surface parametrized by
\[ \mathbf{x}(u, v) = ((1 - u^2)\cos v, (1 - u^2)\sin v, u), \qquad -1 \le u \le 1, \quad -\infty < v < \infty. \]
(b) What are the singular points of the surface in part (a)?
23. Let $a > b > 0$, and consider the set $T$ parametrically represented by
\[ \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} (a + b\cos v)\cos u \\ (a + b\cos v)\sin u \\ b\sin v \end{pmatrix}, \qquad 0 \le u \le 2\pi, \quad 0 \le v \le 2\pi. \]
are different parametrizations of the ellipsoid $(x/a)^2 + (y/b)^2 + (z/c)^2 = 1$.
(b) Use the parametrizations of part (a) to show that the ellipsoid has elliptic cross sections perpendicular to the $x$-axis and $y$-axis.
26. Parametrizations for the hyperboloid of one sheet, $(x/a)^2 + (y/b)^2 - (z/c)^2 = 1$, and of two sheets, $(x/a)^2 + (y/b)^2 - (z/c)^2 = -1$, may involve the hyperbolic functions $\cosh v = \tfrac12(e^v + e^{-v})$ and $\sinh v = \tfrac12(e^v - e^{-v})$.
(a) Verify that $\cosh^2 v - \sinh^2 v = 1$.
(b) Verify that the hyperboloid of one sheet has parametrization
\[ \mathbf{x}(u, v) = (a\cos u\cosh v, b\sin u\cosh v, c\sinh v). \]
(c) Verify that each piece of the hyperboloid of two sheets has parametrization
EXERCISES
Chapter 4 REVIEW
1. Let $\mathbf{x}(t) = (e^t\cos t)\mathbf{i} + (e^t\sin t)\mathbf{j} + e^t\mathbf{k}$ and find the velocity vector $\dot{\mathbf{x}}(t)$ and the vector $\mathbf{t}(t)$ of length 1 pointing in the same direction. Also find the acceleration vector $\ddot{\mathbf{x}}(t)$.
2. The motion of a particle is given by the vector function $\mathbf{x}(t) = (\cos 2t)\mathbf{i} + (\sin 2t)\mathbf{j} + t^2\mathbf{k}$.
(a) Sketch the trajectory of the particle when $0 \le t \le \pi$.
(b) What is the velocity vector when $t = \pi/4$? Add this vector to your sketch.
(c) Suppose the particle leaves its prescribed path when $t = \pi/4$ and continues at the constant velocity it has acquired at that time; where will the particle then be at time $t = \pi/2$?
3. The position of a particle at time $t$ is given by
(a) Find the velocity and acceleration vectors when $t = 0$.
(b) What distance does the particle travel between $t = 0$ and $t = 2\pi$?
(c) Show that the particle always lies on a sphere centered at the origin.
(d) Find a plane that contains the path of the particle, and sketch the path from an appropriate point of view.
4. (a) Sketch the image curve of the function given by $f(t) = (t, \cos t, \sin t)$ for $0 \le t \le 2\pi$.
(b) Find a parametric representation for the line tangent to the curve in part (a) at the point $f(\pi/2) = (\pi/2, 0, 1)$, and add this line to the sketch for part (a).
(c) Find the acceleration vector of the curve at $f(\pi/2)$ and add this to the sketch also.
5. Find the maximum and minimum values of the speed of the motion along the curve traced by $\mathbf{x}(t) = (a\cos t, b\sin t, ct)$, where $a > b > 0$ and $c$ are constant.
6. Suppose $\mathbf{x}(t)$ traces a curve in $\mathbb{R}^3$ that lies in a plane $ax + by + cz = d$. Show that if the curve has nonzero tangent and acceleration vectors at each point $\mathbf{x}(t)$, then $\dot{\mathbf{x}}(t)$ and $\ddot{\mathbf{x}}(t)$ are parallel to the plane of the curve.
7. (a) Sketch the curve parametrized by $\mathbf{x}(t) = (e^t\cos t, e^t\sin t)$ for $0 \le t \le \pi/2$.
(b) Find the length of the curve in part (a).
(c) Repeat parts (a) and (b) with the parameter interval replaced by $-\pi/2 \le t \le 0$.
8. A particle moves so that at time $t$ its position is $\mathbf{x}(t) = (-\sin t, \cos t, t^2)$.
(a) Find the velocity $\mathbf{v}(t) = \dot{\mathbf{x}}(t)$ and the acceleration $\mathbf{a}(t) = \ddot{\mathbf{x}}(t)$.
(b) What is the distance between the positions of the particle at $t = 0$ and at $t = \pi$?
(c) Find a parametric expression for the tangent line to the curve traced out by the particle's motion at the point $\mathbf{x}(\pi)$.
9. Find an equation for the tangent plane to the graph of the equation $z = x^2 + y^3$ at $(1, 2, 9)$.
10. Find a parametric representation for the plane tangent to the image surface of the function $f(u, v) = (u + v, u^2, v^2)$ at the point $f(1, 1) = (2, 1, 1)$.
In Exercises 11 to 16, verify that the functions satisfy the Laplace equation $u_{xx} + u_{yy} = 0$ in two dimensions.
11. $u(x, y) = x^2 - y^2$
12. $u(x, y) = 2xy$
13. $u(x, y) = x^3 y - xy^3$
14. $u(x, y) = \ln(x^2 + y^2)$
15. $u(x, y) = \arctan(y/x)$, $x \ne 0$
16. $u(x, y) = e^{x^2 - y^2}\cos 2xy$
In Exercises 17 and 18, verify that the functions satisfy the 3-dimensional Laplace equation $u_{xx} + u_{yy} + u_{zz} = 0$.
17. $u(x, y, z) = xyz + 2x^2 - y^2 - z^2$
18. $u(x, y, z) = (x^2 + y^2 + z^2)^{-1/2}$, $(x, y, z) \ne (0, 0, 0)$
19. Let $u(x, y, z) = (x^2 + y^2 + z^2)^{\alpha}$, $\alpha$ constant. Show that $u_{xx} + u_{yy} + u_{zz} = 0$ just for $\alpha = 0$ and $\alpha = -1/2$.
20. Let $f(u, v) = (u, v, u^2 + v^2)$.
(a) Sketch the image of $f$.
(b) Compute the partial derivatives $f_u(u, v)$ and $f_v(u, v)$. Compute the equation of the tangent plane to the image surface of $f$ when $u = 1$, $v = 1$.
(c) Sketch the $u$ and $v$ coordinate curves of $f$ through $(1, 1, 2)$; that is, the curve where $u$ varies and $v = 1$, and the curve where $v$ varies and $u = 1$. Sketch the tangent vectors to these curves at $(1, 1, 2)$. What is the relationship between these tangent vectors and the partial derivatives computed in (b)?
21. (a) Show that the intersection of the elliptic cone $(x/a)^2 + (y/b)^2 - z^2 = 0$ with a plane perpendicular to the $z$-axis is an ellipse.
(b) Show that the intersection of the cone with a plane perpendicular to the $x$-axis or to the $y$-axis is a hyperbola.
22. Identify the curve of intersection of the ellipsoid $(x/a)^2 + (y/b)^2 + (z/c)^2 = 1$ with a plane perpendicular to one of the coordinate axes.
23. Identify the curves of intersection of the elliptic paraboloid $(x/a)^2 + (y/b)^2 = z$ with (a) planes perpendicular to the $z$-axis. (b) planes perpendicular to the $x$-axis or the $y$-axis.
24. Consider the helicoid $H$ parametrized by
\[ \mathbf{x}(u, v) = (u\cos v, u\sin v, v). \]
(a) Find a parametrization for the tangent plane to $H$ at the point $(1/\sqrt2, 1/\sqrt2, \pi/4)$.
(b) Find a normal vector $\mathbf{n}$ of length 1 to $H$ at the same point.
A function $\mathbb{R}^n \xrightarrow{f} \mathbb{R}^m$ represents a set $S$
(a) explicitly if $S$ is the graph of $f$ in $\mathbb{R}^{n+m}$,
(b) implicitly if $S$ is a level set of $f$ in $\mathbb{R}^n$,
(c) parametrically if $S$ is the image of $f$ in $\mathbb{R}^m$.
For example, the graph of $\mathbb{R}^1 \xrightarrow{f} \mathbb{R}^1$ where $f(x) = x^2$ is an explicit representation of a parabola $P$ in $\mathbb{R}^{1+1} = \mathbb{R}^2$.
25. Find an implicit representation for the parabola $P$.
26. Find a parametric representation for the parabola $P$.
27. The function $\mathbb{R}^1 \xrightarrow{g} \mathbb{R}^3$ where $g(t) = (t, t, t^2)$ has as its image a parabola $P$ in $\mathbb{R}^3$, so $P$ is represented parametrically by $g$. Make a sketch of $P$, and find an explicit representation of $P$ as the graph of some $\mathbb{R}^1 \to \mathbb{R}^2$.
CHAPTER 5
DIFFERENTIABILITY
We'll also extend the definitions of limit and continuity from real-valued functions
of one variable to real-valued and vector-valued functions of several variables and
explain how to construct continuous functions of several variables using the familiar
functions of single-variable calculus.
The definition of limit is based on the idea of nearness. The limit relation
\[ \lim_{x \to 0} \frac{\sin x}{x} = 1 \]
underlies the usual introduction to the calculus of the trigonometric functions and is most often proved geometrically in that context. The equation says "$(\sin x)/x$ is arbitrarily close to 1 provided $x$ is sufficiently close to 0." We express nearness on the real-number line by inequalities such as $|x - 3| < 0.4$, which says that the distance between the number $x$ and the number 3 is less than 0.4, or equivalently that $x$ lies in the open interval with center 3 and length 0.8. See Figure 5.1. And we translate statements such as "$(\sin x)/x$ is arbitrarily close to 1 provided $x$ is sufficiently close to 0" into statements about inequalities that we can operate on algebraically. Thus the previous displayed formula says the following: For a given positive number $\epsilon$, there is a positive number $\delta$ such that if
\[ 0 < |x - 0| < \delta, \quad \text{then} \quad \left|\frac{\sin x}{x} - 1\right| < \epsilon. \]
FIGURE 5.1
The condition $0 < |x - 0|$ signifies that the precise value, if any, assigned to the function at $x = 0$ is irrelevant to the existence of the limit; this condition can only make it easier to find the required number $\delta > 0$, because $x = 0$ doesn't have to satisfy the $\epsilon$-inequality.
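The $\epsilon$-$\delta$ statement for $(\sin x)/x$ can be made concrete by exhibiting one workable $\delta$ for each $\epsilon$. The sketch below relies on the standard bound $|(\sin x)/x - 1| \le x^2/6$ for $x \ne 0$, which is an outside fact not derived in the text, so that $\delta = \sqrt{6\epsilon}$ suffices; the function names and the sampling scheme are illustrative assumptions, and sampling only spot-checks the inequality rather than proving it.

```python
import math

def delta_for(eps):
    # One workable choice, based on |sin(x)/x - 1| <= x^2 / 6 for x != 0.
    return math.sqrt(6 * eps)

def check(eps, samples=1000):
    """Spot-check |sin(x)/x - 1| < eps at sample points with 0 < |x| < delta."""
    delta = delta_for(eps)
    for k in range(1, samples + 1):
        x = delta * k / (samples + 1)   # strictly between 0 and delta
        for s in (x, -x):               # x = 0 itself is never tested
            if abs(math.sin(s) / s - 1) >= eps:
                return False
    return True

results = {eps: check(eps) for eps in (0.5, 1e-2, 1e-6)}
print(results)
```

Note that the point $x = 0$ is excluded from the test, matching the condition $0 < |x - 0|$ in the definition.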
In the previous chapter we assumed known the properties of elementary functions such as $(\sin x)/x$, and will continue to do so here, largely avoiding $\epsilon$ and $\delta$ arguments, and concentrating more on the geometric properties of the natural domain sets in $\mathbb{R}^n$ that play the role of intervals $a < x < b$ and $a \le x \le b$ in $\mathbb{R}$.
1A Neighborhoods
In $\mathbb{R}^n$ a definition of limit also requires the means of asserting that one point is close to another. For a given $\delta > 0$ and point $\mathbf{x}_0$ in $\mathbb{R}^n$, the set of all points $\mathbf{x}$ in $\mathbb{R}^n$ that satisfy the inequality
\[ |\mathbf{x} - \mathbf{x}_0| < \delta \]
is called a $\delta$-ball centered at $\mathbf{x}_0$.
FIGURE 5.2
Suppose $S$ is a set of points in $\mathbb{R}^n$ and $\mathbf{x}$ a point in $\mathbb{R}^n$. Then $\mathbf{x}$ is a limit point of $S$ if, for a given $\delta > 0$, there exists a point $\mathbf{y}$ in $S$ such that $0 < |\mathbf{x} - \mathbf{y}| < \delta$. Translated into English, the definition says that $\mathbf{x}$ is a limit point of $S$ if there are points in $S$ other than $\mathbf{x}$ that are contained in a ball of arbitrarily small positive radius with center at $\mathbf{x}$. A $\delta$-ball is sometimes called a neighborhood of the point at its center. Thus $\mathbf{x}$ is a limit point of $S$ if every neighborhood of $\mathbf{x}$ contains a point of $S$ other than $\mathbf{x}$. Note that a limit point of $S$ need not itself be in $S$.
EXAMPLE 1 The set $S$ in $\mathbb{R}^2$ consisting of all points $(x, y)$ such that
\[ x^2 + y^2 < 1, \]
together with the single point $(2, 0)$ appears in Figure 5.3. The set of limit points of $S$ consists of the circular disk together with the circle
\[ x^2 + y^2 = 1. \]
Note, however, that the limit points precisely on the circle of radius 1 are not in $S$. The point $(2, 0)$ is not a limit point of $S$, even though it is in $S$, because there is no other point of $S$ within 1 unit of it.
FIGURE 5.3
One way for a point $\mathbf{x}$ to be a limit point of a set $S$ is for $\mathbf{x}$ to be an interior point of $S$, that is, a point $\mathbf{x}$ in $S$ such that all points within some neighborhood of $\mathbf{x}$ are also in $S$. A set $S$ all of whose points are interior points is called open. For example $\mathbb{R}^n$ is an open set, and so is the circular disk in Example 1. If something occurs at all points in a neighborhood of a point $\mathbf{x}$ then $\mathbf{x}$ is an interior point of the set where it occurs.
Consider again the set $S$ shown in Figure 5.3 and described in Example 1. The interior points of $S$, which are also limit points, are those in the open disk represented by the shaded part of the drawing. A point $\mathbf{x}$ in the disk but not on the circle $x^2 + y^2 = 1$ is an interior point of $S$, because a disk of small enough radius centered at $\mathbf{x}$ would be contained in $S$. The point $(2, 0)$ is not an interior point of $S$, even though it is in $S$. Even if the circle $x^2 + y^2 = 1$ were included in $S$, the points of the circle would not be interior points. Figure 5.4(a) shows the interior points of $S$ and Figure 5.4(b) shows the limit points of $S$.
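For the particular set $S$ of Example 1, the limit-point condition can be tested for a few shrinking radii. The sketch below is a heuristic illustration, not a proof: it checks only finitely many values of $\delta$, and it uses the geometric fact that a $\delta$-ball about $\mathbf{p}$ meets the open unit disk exactly when $|\mathbf{p}| - \delta < 1$. All function names and the list of radii are assumptions made here.

```python
import math

ISOLATED = (2.0, 0.0)  # the extra point of S outside the disk

def in_S(p):
    x, y = p
    return x * x + y * y < 1 or p == ISOLATED

def ball_meets_S_away_from_center(p, delta):
    """True if some point of S other than p lies within delta of p."""
    # The delta-ball overlaps the open unit disk in an open set, which
    # always contains points different from p.
    meets_disk = math.hypot(*p) - delta < 1
    # The isolated point counts only if it is not p itself.
    d = math.dist(p, ISOLATED)
    meets_isolated = 0 < d < delta
    return meets_disk or meets_isolated

def is_limit_point(p, radii=(1e-1, 1e-3, 1e-6, 1e-9)):
    # Heuristic: test the definition for a few small radii only.
    return all(ball_meets_S_away_from_center(p, delta) for delta in radii)

for p in [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (1.05, 0.0)]:
    print(p, in_S(p), is_limit_point(p))
```

The point $(1, 0)$ is reported as a limit point that is not in $S$, while $(2, 0)$ is in $S$ but is not a limit point, in agreement with Example 1.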
In the most common examples of functions $f\colon D \to \mathbb{R}^m$, the domain $D$ is either an open set, or else an open set together with some points called boundary points of $D$. A boundary point of a set $D$ is a point $\mathbf{x}$ such that every neighborhood of $\mathbf{x}$ contains both a point in $D$ and a point not in $D$. Thus $\mathbf{x}$ may be a boundary point of $D$ without being itself in $D$. But an interior point of $D$ is never a boundary point of $D$. The boundary of a set $D$ is just the set of all boundary points of $D$, and a closed set is by definition a set that contains all of its boundary points.
EXAMPLE 3 Figure 5.5 shows examples of closed sets $D$ in $\mathbb{R}$ and in $\mathbb{R}^2$ along with the interior of $D$, which is always open, and the boundary of $D$. The boundary of the set shown in Figure 5.3 consists of the points on the circle $x^2 + y^2 = 1$ together with the point $(2, 0)$.
1B Limits
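Here is the definition of limit for a function $\mathbb{R}^n \xrightarrow{f} \mathbb{R}^m$. Let $\mathbf{y}_0$ be a point in $\mathbb{R}^m$ and $\mathbf{x}_0$ a limit point of the domain of $f$. Then $\mathbf{y}_0$ is the limit of $f$ at $\mathbf{x}_0$ if, for a given $\epsilon > 0$, there is a $\delta > 0$ such that $|f(\mathbf{x}) - \mathbf{y}_0| < \epsilon$ whenever $\mathbf{x}$ is in the domain of $f$ and satisfies $0 < |\mathbf{x} - \mathbf{x}_0| < \delta$. The relation is written
\[ \lim_{\mathbf{x} \to \mathbf{x}_0} f(\mathbf{x}) = \mathbf{y}_0. \]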
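To put it less formally, the definition says that $f(\mathbf{x})$ is arbitrarily close to $\mathbf{y}_0$ when $\mathbf{x}$ is sufficiently close to $\mathbf{x}_0$ and $\mathbf{x} \ne \mathbf{x}_0$. Geometrically, the idea is this: Given an $\epsilon$-ball $B_\epsilon$ centered at $\mathbf{y}_0$, there exists a $\delta$-ball $B_\delta$ centered at $\mathbf{x}_0$ whose intersection with the domain of $f$, except possibly for $\mathbf{x}_0$ itself, is sent by $f$ into $B_\epsilon$. A 2-dimensional example is pictured in Figure 5.6. A limit, when it exists, is unique. For if
\[ \lim_{\mathbf{x} \to \mathbf{x}_0} f(\mathbf{x}) = \mathbf{y}_1 \quad \text{and} \quad \lim_{\mathbf{x} \to \mathbf{x}_0} f(\mathbf{x}) = \mathbf{y}_2, \]
then by the triangle inequality
\[ |\mathbf{y}_1 - \mathbf{y}_2| \le |\mathbf{y}_1 - f(\mathbf{x})| + |f(\mathbf{x}) - \mathbf{y}_2| < \epsilon + \epsilon = 2\epsilon \]
for all $\mathbf{x}$ in a small enough neighborhood of $\mathbf{x}_0$. But we can make $2\epsilon$ as small as we like, so $|\mathbf{y}_1 - \mathbf{y}_2| = 0$ and $\mathbf{y}_1 = \mathbf{y}_2$.
FIGURE 5.4 (a) (b)
FIGURE 5.5 (a) Closed interval $a \le x \le b$. (b) Interior of $D$: open interval $a < x < b$. (c) Boundary of $D$: the endpoints $a$, $b$. (d) Closed rectangle $a \le x \le b$, $c \le y \le d$. (e) Open interior $a < x < b$, $c < y < d$. (f) Boundary rectangle.
FIGURE 5.6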
The last inequality holds because $\sqrt{a^2 + b^2} \le |a| + |b|$. (Square both sides.) Using continuity of $\sin t$ and $\cos t$,
\[ \lim_{t \to t_0} \cos t = \cos t_0 \quad \text{and} \quad \lim_{t \to t_0} \sin t = \sin t_0. \]
Then we can make both
\[ |\cos t - \cos t_0| \quad \text{and} \quad |\sin t - \sin t_0| \]
as small as we like by making $|t - t_0|$ small enough. Hence the inequality (*) shows that $|f(t) - f(t_0)|$ is as small as we like whenever $|t - t_0|$ is small enough.
EXAMPLE 5 Consider the real-valued function defined in all of $\mathbb{R}^2$ except for $(x, y) = (0, 0)$ by
\[ f(x, y) = \frac{1}{x^2 + y^2}. \]
In this example we can write
\[ \lim_{\mathbf{x} \to \mathbf{0}} f(\mathbf{x}) = +\infty \]
to describe what happens, because as $|(x, y)|$ tends to 0, its square $x^2 + y^2 = |(x, y)|^2$ tends to zero also, so the fraction tends to $+\infty$. By the convention established in our definition of limit, only elements of $\mathbb{R}^n$ are acceptable limits, so we say for this example that the limit fails to exist.
EXAMPLE 6 Let $f$ be real-valued with the same domain as in the preceding example and defined by
\[ f(x, y) = \frac{x^2 - y^2}{x^2 + y^2}. \]
There is no limit as $(x, y) \to (0, 0)$. If $(x, y)$ approaches $(0, 0)$ along the line $y = ax$, we obtain
\[ \lim_{x \to 0} \frac{x^2 - y^2}{x^2 + y^2} = \lim_{x \to 0} \frac{x^2(1 - a^2)}{x^2(1 + a^2)} = \frac{1 - a^2}{1 + a^2}. \]
This limit is not independent of $a$, because, for example, the limit equals 0 if $a = 1$, and 1 if $a = 0$. But for a unique limit to exist we have to be able to approach $(0, 0)$ from all possible directions, so the overall limit fails to exist.
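The dependence of the limiting value on the slope $a$ is easy to see numerically. The following sketch evaluates $f$ at points of the lines $y = ax$ close to the origin; the function name `value_along_line` and the sample slopes are chosen here for illustration.

```python
def f(x, y):
    return (x * x - y * y) / (x * x + y * y)

def value_along_line(a, x):
    """Value of f at the point (x, a x) on the line y = a x, for x != 0."""
    return f(x, a * x)

def predicted(a):
    # The limit along y = a x computed in the example.
    return (1 - a * a) / (1 + a * a)

for a in (0.0, 1.0, 2.0):
    near_origin = [value_along_line(a, x) for x in (0.1, 0.001, 1e-6)]
    print(a, near_origin, predicted(a))
```

Along each line the values do not change as $x$ shrinks, but different lines give different constants, so no single number can serve as the limit at $(0, 0)$.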
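The functions in Examples 5 and 6 are both real-valued. The following theorem shows that the problem of the existence and evaluation of a limit for a function $\mathbb{R}^n \xrightarrow{f} \mathbb{R}^m$ reduces to the same problem for the real-valued coordinate functions.
1.1 Theorem. Given $\mathbb{R}^n \xrightarrow{f} \mathbb{R}^m$, with coordinate functions $f_1, \ldots, f_m$, and a point $\mathbf{y}_0 = (y_1, \ldots, y_m)$ in $\mathbb{R}^m$, then
\[ \lim_{\mathbf{x} \to \mathbf{x}_0} f(\mathbf{x}) = \mathbf{y}_0 \tag{A} \]
if and only if
\[ \lim_{\mathbf{x} \to \mathbf{x}_0} f_i(\mathbf{x}) = y_i, \quad i = 1, \ldots, m. \tag{B} \]
Proof. To say that Equations (A) and (B) are equivalent is to say that the distance $|f(\mathbf{x}) - \mathbf{y}_0|$ becomes arbitrarily small if and only if the differences $|f_i(\mathbf{x}) - y_i|$, $i = 1, \ldots, m$, also become arbitrarily small. But the equivalence of these last two statements follows at once from the inequalities
\[ |a_i| \le \sqrt{a_1^2 + \cdots + a_m^2} \le \sqrt{m}\,\max_{1 \le i \le m}|a_i|, \]
applied with $a_i = f_i(\mathbf{x}) - y_i$. After squaring both sides the first inequality follows from $a_1^2 + \cdots + a_m^2 \ge a_i^2$, $i = 1, \ldots, m$, and the second from $a_1^2 + \cdots + a_m^2 \le m(\max_{1 \le i \le m}|a_i|)^2$. ∎
we have
\[ \lim_{t \to 0} f_1(t) = (0, 0, 0). \]
But $\lim_{t \to 0} f_2(t)$ doesn't exist because the function $\sin(1/t)$ has no limit at $t = 0$.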
1C Continuity
Roughly speaking, a continuous function $f$ is one whose values do not change abruptly. That is, if $\mathbf{x}$ is close to $\mathbf{x}_0$, then $f(\mathbf{x})$ must be close to $f(\mathbf{x}_0)$. This idea is related to the idea of limit, and the definition of continuity is as follows: A function $f$ is continuous at $\mathbf{x}_0$ if $\mathbf{x}_0$ is in the domain of $f$ and
\[ \lim_{\mathbf{x} \to \mathbf{x}_0} f(\mathbf{x}) = f(\mathbf{x}_0). \]
At a nonlimit, or isolated, point of the domain of $f$, we can't ask for a limit; instead we extend the definition of continuity simply by defining $f$ to be continuous at such a point.
1.2 Theorem. A vector function is continuous at a point if and only if its coor-
dinate functions are continuous there.
$f_1, \dots, f_n$ are continuous real-valued functions of a real variable. The latter include most of the functions of ordinary calculus, such as $x^2$, $\sin x$, and, for $x > 0$, $\ln x$. We use these same functions to construct examples of the continuous coordinate functions that constitute the vector-valued functions $\mathbb{R}^n \to \mathbb{R}^m$ of a vector variable. For example, the coordinate functions of
\[ f(x, y) = \left( \frac{\sin xy}{e^{x+y}},\ \frac{\cos xy}{e^{x+y}} \right) \]
turn out to be continuous. The continuity of these and other examples follows from repeated application of the following three theorems, together with Theorem 1.2 on coordinate functions. If you think these theorems are obviously true you're right, but we'll prove them anyway.
1.3 Theorem. The functions $P_k\colon \mathbb{R}^n \to \mathbb{R}$, where $P_k(x_1, \dots, x_n) = x_k$, are continuous for $k = 1, 2, \dots, n$. $P_k$ is called the $k$th coordinate projection.
Proof of 1.3. We have $|P_k(x_1, \dots, x_n) - P_k(a_1, \dots, a_n)| = |x_k - a_k| \le |\mathbf{x} - \mathbf{a}|$, so $|P_k(x_1, \dots, x_n) - P_k(a_1, \dots, a_n)|$ is arbitrarily small if $|\mathbf{x} - \mathbf{a}|$ is small enough. •
Proof of 1.4. For $S(x, y) = x + y$, write $|S(x, y) - S(a, b)| = |x - a + y - b| \le |x - a| + |y - b|$, by the triangle inequality. Hence $|S(x, y) - S(a, b)|$ is small if $|x - a|$ and $|y - b|$ are small enough. Since $|x - a|$ and $|y - b|$ are both at most the distance $\sqrt{(x - a)^2 + (y - b)^2}$ from $(x, y)$ to $(a, b)$, making this distance small enough makes $|S(x, y) - S(a, b)|$ as small as you like.
For $M(x, y) = xy$ use the triangle inequality and factor out $|x|$ and $|b|$ to get $|M(x, y) - M(a, b)| = |xy - xb + xb - ab| \le |x||y - b| + |x - a||b|$. Keeping $x$ within distance 1 of $a$ makes $|x| \le |a| + 1$, so
\[ |M(x, y) - M(a, b)| \le (|a| + 1)|y - b| + |b||x - a|. \]
Hence $|M(x, y) - M(a, b)|$ is as small as you like for a given $(a, b)$ if $|x - a|$ and $|y - b|$ are small enough. From here the argument is the same as for $S(x, y)$. •
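Proof of 1.5. We let $\mathbf{a}$ be a limit point of the domain of $g(f(\mathbf{x}))$ and show that
\[ \lim_{\mathbf{x} \to \mathbf{a}} g(f(\mathbf{x})) = g(f(\mathbf{a})). \]
Since $g$ is continuous at $f(\mathbf{a})$ there is a neighborhood $B_s$, $s > 0$, of $f(\mathbf{a})$ such that $|g(\mathbf{y}) - g(f(\mathbf{a}))|$ is as small as we like, say less than $\epsilon > 0$, when $\mathbf{y}$ is in $B_s$. Similarly, since $f(\mathbf{x})$ is continuous at $\mathbf{x} = \mathbf{a}$ there is a neighborhood $B_r$, $r > 0$, of $\mathbf{a}$ such that $f(\mathbf{x})$ is in $B_s$ when $\mathbf{x}$ is in $B_r$. Hence $|g(f(\mathbf{x})) - g(f(\mathbf{a}))| < \epsilon$ whenever $\mathbf{x}$ is in $B_r$. •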
\[ h(x, y) = \sqrt{1 - x^2 - y^2}\, \ln(x + y), \]
is defined on the half-disk that is the intersection of the domains of $f$ and $g$, as shown in Figure 5.7. The product is a continuous function because it is the composition of the continuous vector function
\[ F(x, y) = (f(x, y), g(x, y)) \]
with the function $M$ of Theorem 1.4.
I. Assuming xo = (l, 2), draw the set of all vectors x in IR 2 17. f(t) = (t, r2 , r3, r4 )
such thal
18. f (u, v) = (u + v, u - v, u 2 + v2 )
(a) Ix - xol ::: 3
(b) Ix - xol = 3 19. f(x, y, z) = (2x, 2y, 2z)
(c) Ix-· xol < 3 In Exercises 20 to 25, detennine at which points the
In Exercises 2 to 11, identify (a) the interior and (b) the function fails to have a limit. Use Theorem 1. I. Take the
boundary of the set of points x = (x, y) in JR 2. ( c) Which domain of each coordinate function as large as possible.
sets are open? (d) Which sets are closed? The domain of f is then the part common to the domains
of all the coordinate functions.
2. Ix - 0, 2) I :::: o.5
3. Ix - 0, 2)1 < o.5 20 _ f ( x ) =( y + tan x )
y ln(x + y)
4. Ix- (l,2)1 < -0.5
5. 0 < X < 3 and O < y < 2
6. 2 ::: x < 3 and O < y < 2
21. f ( ; ) =( x
2
1 )
y2 - I
f
7. x 2 + 2y 2 < I
X
8. x =I- (0, 2) or (1, 2) 22. f (x, y) = -.
smx
- +y
9. x 2
IO. x > 0
II. X
+ y2 >
> y
0
23. f(x, y) = { ,
- .-+y,
smx
2+ y,
if X =/:- 0,
if x = 0
~
sin t
0 < x 2 + y2 < 1, together with the interval I ::: x < 2 of
the x axis.
(a) Describe the boundary of S.
(b) What are the interior points of S?
24. /(r) (
cost
sint 2
)
(c) Is S open? Closed? 25. f(u,v)= UV
,
I )
(
1-u 2 -v 2 2-u 2 -v 2
13. Let L be a line and P a plane in JR 3 • Is either P or L an
open subset of JR 3 ? In Exercises 26 to 31, determine at which points the
function fails to be continuous. Take the domain of each
In Exercises 14 to 19, the formula defines a function f
coordinate function as large as possible. The domain of
from !Rn to !Rm for some n and m. In each example, state
what n and m are, and list the real-valued coordinate
f is then the part common to the domains of all the
coordinate functions.
functions of f .
14 . .f(x,y)=(x-y,x 2 -y 2 )
26. f ( ; ) = ( :2 + ;2 )
15. f(x, y) =( 6 ~ ) ( ;, ) x2 + y2
$f$ and $g$ that are discontinuous at a point $\mathbf{x}_0$ such that $f$ has a removable discontinuity at $\mathbf{x}_0$ and $g$ does not.
33. A function $T\colon \mathbb{R}^n \to \mathbb{R}^n$ is called a translation acting on $\mathbb{R}^n$ by $\mathbf{y}_0$ if there is a vector $\mathbf{y}_0$ in $\mathbb{R}^n$ such that $T(\mathbf{x}) = \mathbf{x} + \mathbf{y}_0$ for all $\mathbf{x}$ in $\mathbb{R}^n$.
(a) Describe in words the effect on $\mathbb{R}^2$ of translation by $\mathbf{y}_0 = (1, 1)$.
(b) Prove that every translation is a continuous function.
34. Prove that the union of an arbitrary collection of open subsets of $\mathbb{R}^n$ is open.
35. Prove that the intersection of a finite collection of open subsets of $\mathbb{R}^n$ is open.
36. Give an example to show that an intersection of infinitely many open subsets of $\mathbb{R}^n$ may fail to be open.
40. Let $S$ be a closed subset of $\mathbb{R}^n$. Prove that the complement of $S$ in $\mathbb{R}^n$ is open.
41. If $S$ is an open subset of $\mathbb{R}^n$, show that the complement of $S$ in $\mathbb{R}^n$ is closed.
42. A function of more than one variable can have a limit along every line through a point without having a limit at that point. For example, define $f(x, y) = x^2 y/(x^4 + y^2)$ for $(x, y) \ne (0, 0)$.
(a) Show that $\lim_{y \to 0} f(0, y) = 0$ and that, for each fixed number $a$, $\lim_{x \to 0} f(x, ax) = 0$.
(b) Show that approaching $(0, 0)$ along the parabola $y = ax^2$ you get limit $a/(1 + a^2)$.
(c) What is the set of possible limits achieved by using the approaches of part (b)?
in which case $f$ has derivative $f'(x_0) = a$. The equivalent form on the right calls attention to an important property of the tangent line equation $y = f(x_0) + a(x - x_0)$, namely that its graph approaches the graph of $f(x)$ as $x$ approaches $x_0$ more rapidly than $x - x_0$ approaches 0.
Recall that in Chapter 4, Section 3 we used a modification of this same definition to motivate the definition of the partial derivatives of a function of more than one variable. At that point we emphasized that the mere existence of partial derivatives
To be consistent with the custom in the case of a single real variable, the function $f$ is called simply differentiable if it's differentiable at every point of its domain. We'll prove in Theorem 2.2 that there can be only one vector $\mathbf{a}$ for which (ii) is true, and it's called the gradient of the differentiable function $f$ at $\mathbf{x}_0$; it's customary to use the notation $\nabla f(\mathbf{x}_0)$ for the vector $\mathbf{a}$. The symbol $\nabla$ is pronounced "grad" here, so $\nabla f(\mathbf{x})$ becomes "grad $f$ at $\mathbf{x}$." If $f$ is a function of a single real variable $x$ we continue to write the customary $f'(x)$ instead of $\nabla f(x)$.
Remark 1. Since we can't divide by a vector $\mathbf{x} - \mathbf{x}_0$ we instead multiply by the reciprocal of its length in (ii), the crucial point being to ensure that the numerator tends to zero faster than $|\mathbf{x} - \mathbf{x}_0|$ does.
Remark 2. Condition (i) of the definition requires the approach of $\mathbf{x}$ to $\mathbf{x}_0$ to be unrestricted, as compared to the restricted 1-dimensional limits used to define partial derivatives.
Remark 3. According to the definition of differentiability, the domain of a differentiable function is an open set. It's convenient however to extend the definition sufficiently to speak of a differentiable function $f$ defined on an arbitrary subset $S$ of the domain space. By such an $f$ we'll mean the restriction to $S$ of a differentiable function whose domain is an open set containing $S$.
FIGURE 5.8 $z = \sqrt{1 - x^2 - y^2}$.
The function $f$ defined by $f(x, y) = \sqrt{1 - x^2 - y^2}$ has for its domain the disk $x^2 + y^2 \le 1$. Its graph appears in Figure 5.8. The interior points of the domain are those $(x, y)$ such that $x^2 + y^2 < 1$. We'll see that $f(x, y)$ is differentiable at these interior points. But it doesn't follow by Remark 3 that $f$ is differentiable at the points of the circle $x^2 + y^2 = 1$; indeed we'll see that $f$ can't be extended to be differentiable on an open set containing the circle. See Examples 5 and 6 and Exercise 26.
The next theorem allows us to compute the vector $\nabla f(\mathbf{x})$ in terms of partial derivatives of a differentiable function. The gradient is the key to putting a firm foundation under the notion of tangent plane introduced in Chapter 4. We'll see in Chapter 6 that the gradient of a real-valued function has several natural interpretations.
2.2 Theorem. If a function $f\colon \mathbb{R}^n \to \mathbb{R}$ is differentiable at $\mathbf{x}_0$, then the $k$th coordinate of the gradient $\nabla f(\mathbf{x}_0)$ of $f$ at $\mathbf{x}_0$ is the $k$th partial derivative of $f$ at $\mathbf{x}_0$,
Section 2A Real-valued functions 227
But $\mathbf{x}_j = \mathbf{x}_0 + t\mathbf{e}_j$ differs from $\mathbf{x}_0$ only in the $j$th coordinate, and in that coordinate the difference is just $t$. Hence the limit on the left side of the last equation is just $\partial f/\partial x_j$ at $\mathbf{x}_0$. Since the dot product $\mathbf{a} \cdot \mathbf{e}_j$ is just the $j$th entry in $\mathbf{a}$, we're done. •
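Theorem 2.2 says that the gradient can be assembled one coordinate at a time from partial derivatives. The following is a minimal numerical sketch of that idea. The sample function $f(x, y) = x^2 - y^2$ and the central-difference step `h` are assumptions made for illustration; a difference quotient only approximates each partial derivative.

```python
def partial(func, point, k, h=1e-6):
    """Central-difference estimate of the k-th partial derivative of func at point."""
    forward = list(point)
    backward = list(point)
    forward[k] += h
    backward[k] -= h
    return (func(forward) - func(backward)) / (2.0 * h)


def gradient(func, point, h=1e-6):
    """Approximate gradient: the k-th entry is the k-th partial derivative."""
    return [partial(func, point, k, h) for k in range(len(point))]


def f(p):
    """Illustrative sample function f(x, y) = x^2 - y^2."""
    x, y = p
    return x * x - y * y


if __name__ == "__main__":
    # The exact gradient is (2x, -2y), so at (1, 2) we expect about (2, -4).
    print(gradient(f, [1.0, 2.0]))
```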
How can we tell whether or not a vector function is differentiable? Theorem 2.2 only allows us to conclude that a function is not differentiable at a point if one or more of its first-order partials fails to exist at that point, because differentiability of $f$ at $\mathbf{x}$ implies that these partials exist at $\mathbf{x}$. Thus Example 2 is inconclusive to the extent that we have simply assumed that the functions $f$ and $g$ appearing there are differentiable. The converse implication isn't valid, since it's possible for the partials to exist without $f$ being differentiable, as Exercise 24 shows. However, by adding an assumption, namely that the partials are themselves continuous on an open set $S$, we'll deduce in Theorem 2.3 the differentiability of $f$ on the entire set $S$. The theorem guarantees differentiability for most examples met in practice.
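so that in particular $\mathbf{y}_0 = \mathbf{x}_0$ and $\mathbf{y}_n = \mathbf{x}$. We show these points with line segments joining them for three dimensions in Figure 5.9. Then we have
\[ f(\mathbf{x}) - f(\mathbf{x}_0) = \sum_{k=1}^{n} \bigl( f(\mathbf{y}_k) - f(\mathbf{y}_{k-1}) \bigr). \]
Applying the single-variable mean-value theorem to each term, with $\mathbf{z}_k$ a point on the segment joining $\mathbf{y}_{k-1}$ to $\mathbf{y}_k$, gives
\[ f(\mathbf{x}) - f(\mathbf{x}_0) = \sum_{k=1}^{n} (x_k - a_k) \frac{\partial f}{\partial x_k}(\mathbf{z}_k). \]
On the other hand,
\[ \nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0) = \left( \frac{\partial f}{\partial x_1}(\mathbf{x}_0), \dots, \frac{\partial f}{\partial x_n}(\mathbf{x}_0) \right) \cdot (x_1 - a_1, \dots, x_n - a_n) = \sum_{k=1}^{n} (x_k - a_k) \frac{\partial f}{\partial x_k}(\mathbf{x}_0). \]
Hence
\[ |f(\mathbf{x}) - f(\mathbf{x}_0) - \nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0)| \le \sum_{k=1}^{n} \left| \frac{\partial f}{\partial x_k}(\mathbf{z}_k) - \frac{\partial f}{\partial x_k}(\mathbf{x}_0) \right| |\mathbf{x} - \mathbf{x}_0|. \]
Now divide by $|\mathbf{x} - \mathbf{x}_0|$. Since the partial derivatives are assumed continuous at $\mathbf{x}_0$, and the $\mathbf{z}_k$ tend to $\mathbf{x}_0$ as $\mathbf{x}$ does, Equation (*) follows from making $|\mathbf{x} - \mathbf{x}_0|$ tend to zero. •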
!EXAMPLE a I ToTheconclude
function f = sin
(x, y) ex of Example 2 is differentiable at all points
y- y
this from Theorem 2.3 we calculate that
(x, y).
According to Theorems 1.3 and 1.4 of Section 1, these partial derivatives are continuous for all $(x, y)$, so by Theorem 2.3 the function $f(x, y)$ is differentiable for all $(x, y)$.
The function $g(x, y, z) = xy + yz + zx$ of Example 2 is differentiable at all points $(x, y, z)$. To apply Theorem 2.3 we compute the vector
\[ \nabla g(x, y, z) = (y + z,\ x + z,\ x + y). \]
According to our Theorems 1.3, 1.4 and 1.5 of Section 1, these partial derivatives are continuous for all $(x, y)$, so by Theorem 2.3 the function $f(x, y)$ is differentiable for all $(x, y)$ in the open unit disk.
Continuing with the previous example, the points on the circle $x^2 + y^2 = 1$ are more problematic. According to Remark 3 following Definition 2.1 of differentiability, if it were possible to extend the definition of $h(x, y)$ to an open set containing the circle in such a way that the resulting extension $h_e(x, y)$ became differentiable, then we could claim that the original function $h(x, y)$ is differentiable on the circle. However, such an extension of $h(x, y)$ is impossible for this example. For depending on the signs of $x$ and $y$,
tend to $\infty$ or $-\infty$ as $(x, y)$ tends to a point on the circle from inside the circle. Thus there is no point on the circle at which both partial derivatives can exist, as would have to be the case if $h(x, y)$ were differentiable there. See Exercise 26.
It's only for functions of two variables that we can draw a graph of a tangent plane in $\mathbb{R}^3$, but the approximation given by Equation 2.5 is valid quite generally as in the next example, where we consider a function of three variables with graph and tangent in $\mathbb{R}^4$.
EXAMPLE The first-degree approximation to $f(x, y, z) = xy^2z^3$ near $(1, 1, 1)$ is
\[ T(x, y, z) = 1 + (1, 2, 3) \cdot (x - 1, y - 1, z - 1) = -5 + x + 2y + 3z. \]
In the equation that has the tangent plane at $(1, 1, 1, 1)$ as its graph in $\mathbb{R}^4$, we introduce an additional variable $w$. The desired equation is then
\[ w = -5 + x + 2y + 3z. \]
In determining the vector Vf (xo), we required that the real-valued function
EXERCISES
For each of the functions I to 8 find V/(x) at a general 20. f(x, y) = (x2 + y2)-I
point x in the domain of f.
21. Consider the function f: R" -+ R defined by f (X) =
1. f(x,y)=x2-y2 lxJ 2 = x • x. Prove that 'vf (x) = 2x for all x in R 11 •
2. f(x, y) = x 2 - y2 - sinxy 22. Is the function g: R11 -+ R defined by g(x) = lxl
differentiable at every point of its domain? Explain your
3. f(x,y)=x+2y answer. [Hint: lxl = ,Jx-x".]
4. f(x, y, z) = (x - y)z *23. Prove that if the real-valued function f is differentiable
5. f(x, y, z) = x + y - z2 at Xo, then
At which points do the functions 17 to 20 fail to be is differentiable for all x but is not continuously differen-
differentiable? Give a reason for your answer. tiable at x = 0.
17. f(x, y) = x-2 + y- 2 *26. Referring to Examples 5 and 6 in the text, prove that
there is no point on the circle x 2 + y 2 = 1 for which the
18. f(x, y) = Jx 2 - y 2
one-sided partial derivatives both exist when the defining
19. J(x, y) =Ix+ YI limits are taken from within the circle.
This is a significant extension of our earlier use of the partial derivative notation, but the derivative is still "partial" since it's computed in a single direction. The domain of $\partial f/\partial \mathbf{v}$ is the subset of the domain of $f$ for which the preceding limit exists. In practice we always assume $\mathbf{v} \ne \mathbf{0}$, since the case $\mathbf{v} = \mathbf{0}$ gives no information. (Why?)
The connection between the derivative with respect to a vector and the gradient is provided in the following theorem.
3.1 Theorem. If $f$ is differentiable at $\mathbf{x} = (x_1, \dots, x_n)$ and $\mathbf{v} = (v_1, \dots, v_n)$, then
\[ \frac{\partial f}{\partial \mathbf{v}}(\mathbf{x}) = \nabla f(\mathbf{x}) \cdot \mathbf{v}. \]
We can write this formula in terms of coordinates as
\[ \frac{\partial f}{\partial \mathbf{v}}(\mathbf{x}) = v_1 \frac{\partial f}{\partial x_1}(\mathbf{x}) + \cdots + v_n \frac{\partial f}{\partial x_n}(\mathbf{x}). \]
and the proof is finished for nonzero $\mathbf{v}$. When $\mathbf{v} = \mathbf{0}$, both sides of the previous equation are zero. •
Observe that when $\mathbf{v} = \mathbf{e}_j$, a standard basis vector of length 1, the equation in Theorem 3.1 shows that the derivative with respect to that vector is just the partial derivative with respect to $x_j$, that is,
\[ \frac{\partial f}{\partial \mathbf{e}_j}(\mathbf{x}) = \frac{\partial f}{\partial x_j}(\mathbf{x}). \]
As in the previous equation we'll most often want to choose the vectors $\mathbf{v}$ in $\partial f/\partial \mathbf{v}$ to have length 1 so these derivatives serve as standardized rates of change in a variety of directions. Nevertheless the more general definition has its uses, as for example in Exercises 22 and 23.
For each vector $\mathbf{u}$ in $\mathbb{R}^n$ of length $|\mathbf{u}| = 1$, we define the directional derivative of $f$ in the direction of $\mathbf{u}$ to be the function $\partial f/\partial \mathbf{u}$. The reason for the name "directional" derivative is that in $\mathbb{R}^n$ there is a natural way to associate a vector to each direction, namely, take the unit vector in that direction. The number $(\partial f/\partial \mathbf{u})(\mathbf{x})$ is then regarded as a standard measure of the rate of change of the value $f(\mathbf{x})$ in the direction of $\mathbf{u}$.
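As a rough numerical illustration of this definition, the sketch below compares the formula $\nabla f(\mathbf{x}) \cdot \mathbf{u}$ of Theorem 3.1 with a one-sided difference quotient in the direction $\mathbf{u}$. It uses the function $f(x, y, z) = xyz$ and the unit vector $\mathbf{u} = (1/2, 1/2, 1/\sqrt{2})$ of the example that follows; the sample point $(1, 2, 3)$ and the step `t` are assumptions made here for illustration.

```python
import math


def f(p):
    """f(x, y, z) = xyz."""
    x, y, z = p
    return x * y * z


def grad_f(p):
    """Gradient of xyz, computed by hand: (yz, xz, xy)."""
    x, y, z = p
    return (y * z, x * z, x * y)


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def directional_derivative(p, u):
    """Directional derivative via Theorem 3.1: grad f(p) . u."""
    return dot(grad_f(p), u)


def difference_quotient(p, u, t=1e-6):
    """(f(p + t*u) - f(p)) / t, which approximates the derivative for small t."""
    shifted = [pi + t * ui for pi, ui in zip(p, u)]
    return (f(shifted) - f(p)) / t


if __name__ == "__main__":
    u = (0.5, 0.5, 1.0 / math.sqrt(2.0))  # unit vector: 1/4 + 1/4 + 1/2 = 1
    p = (1.0, 2.0, 3.0)                   # illustrative sample point
    print(directional_derivative(p, u), difference_quotient(p, u))
```

The two printed numbers should agree to several decimal places, with the difference quotient carrying a small error of the order of the step size.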
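EXAMPLE 1 Suppose a function $f\colon \mathbb{R}^3 \to \mathbb{R}$ is $f(x, y, z) = xyz$. We find the directional derivative of $f$ in the direction of the unit vector $\mathbf{u} = (1/2, 1/2, 1/\sqrt{2})$ by letting $\mathbf{x} = (x, y, z)$ and using Theorem 3.1 to get
\[ \frac{\partial f}{\partial \mathbf{u}}(\mathbf{x}) = \nabla f(\mathbf{x}) \cdot \mathbf{u} = (yz, xz, xy) \cdot \left( \tfrac{1}{2}, \tfrac{1}{2}, \tfrac{1}{\sqrt{2}} \right) = \tfrac{1}{2} yz + \tfrac{1}{2} xz + \tfrac{1}{\sqrt{2}} xy. \]
\[ \tan \gamma = \frac{\partial f}{\partial \mathbf{u}}(\mathbf{x}). \]
The situation here is a generalization of the one shown in Figure 4.19 in Chapter 4, Section 3. If $\mathbf{u} = \mathbf{e}_1$, the angle $\gamma$ becomes the angle $\alpha$ in the earlier figure and
\[ \frac{\partial f}{\partial \mathbf{u}} = \frac{\partial f}{\partial x}, \quad \text{with } \mathbf{u} = (1, 0). \]
FIGURE 5.12 $\tan \gamma$ is a slope.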
Section 3B Directional Derivatives 235
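If $\mathbf{u} = \mathbf{e}_2$, we get $\gamma = \beta$ in Figure 4.19 of Chapter 4, Section 3, and
\[ \frac{\partial f}{\partial \mathbf{u}} = \frac{\partial f}{\partial y}, \quad \text{with } \mathbf{u} = (0, 1). \]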
3B Mean-Value Theorem
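We assume here some acquaintance with the following fundamental theorem from single-variable calculus, and we state it without proof.
3.2 Mean-Value Theorem. Let $f\colon \mathbb{R} \to \mathbb{R}$ be continuous on the closed interval $[x, y]$ and differentiable on the open interval $(x, y)$. Then there is a number $x_0$ strictly between $x$ and $y$ such that $f(x) - f(y) = f'(x_0)(x - y)$.
3.3 Theorem. Let $f\colon \mathbb{R}^n \to \mathbb{R}$ be differentiable on an open set containing the line segment $S$ joining two vectors $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$. Then there is a point $\mathbf{x}_0$ on $S$ such that
\[ f(\mathbf{y}) - f(\mathbf{x}) = \nabla f(\mathbf{x}_0) \cdot (\mathbf{y} - \mathbf{x}). \]
Proof. Consider the function $g(t) = f(t(\mathbf{y} - \mathbf{x}) + \mathbf{x})$, defined for $0 \le t \le 1$ and set $\mathbf{m}(t) = t(\mathbf{y} - \mathbf{x}) + \mathbf{x}$. Then if $h$ is a real number,
\[ \frac{g(t + h) - g(t)}{h} = \frac{f(\mathbf{m}(t) + h(\mathbf{y} - \mathbf{x})) - f(\mathbf{m}(t))}{h} \to \frac{\partial f}{\partial (\mathbf{y} - \mathbf{x})}(\mathbf{m}(t)) = \nabla f(\mathbf{m}(t)) \cdot (\mathbf{y} - \mathbf{x}) \quad \text{as } h \to 0, \]
by Theorem 3.1. Now let $t = 1$ and $t = 0$ in the definition of $g(t)$ to get $g(1) - g(0) = f(\mathbf{y}) - f(\mathbf{x})$. But by the mean-value theorem for functions of one variable, applied to $g$ at $t = 1$ and $t = 0$,
\[ g(1) - g(0) = g'(t_0) = \nabla f(\mathbf{m}(t_0)) \cdot (\mathbf{y} - \mathbf{x}) \]
for some $t_0$ strictly between 0 and 1. Taking $\mathbf{x}_0 = \mathbf{m}(t_0)$ completes the proof. •
FIGURE 5.13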
One of the most important conclusions we draw from the mean-value theorem for functions of one variable is that a function with zero derivative on an interval is constant. For a function $f$ of a vector variable, we replace the domain interval by an open set $D$ in $\mathbb{R}^n$ that we assume to be polygonally connected; a polygonally connected set $S$ is one such that a given pair of points in it can be joined by a finite sequence of line segments lying in $S$, that is, by a polygonal path. Figure 5.13 shows a set in $\mathbb{R}^2$ that is connected in this way and also one that is not.
In each Exercise I to 4, with functions defined on JR.2 or 1. f (x, y) = ex+Y at x = (I, 1) and in the direction of the
JR. 3 , find the directional derivative of J in the direction curve defined by g(t) = (t 2 , t 3 ) at g(2) for t increasing
of the unit vector u at the point x. 8. f(x, y) = (x 2 + _v2)- 1 at the poim (I, 3) and in the
1. f(x,y,z) =x 2 + y 2 +z 2 , u = (l/.J3, I/.J3, 1/.J3), x = direction of the vector (1, 2)
(I, 0, 1) 9. Find the directional derivative at (I, 0, 0) of the function
2. f(x, y) =x 2
- _v2, u = (1/./2, 1/./2), x = (2, 1) J(x, y, z) = x 2 +yez in the direction of the tangent vector
at g(0) l.o the curve R3 defined parametrically by
3. J(x,y)=x+y, u=(1 ,0), x= (2,3)
g(t) = (3t 2 + t + 1, 2t , t 2 ). .
4. f(x,y,z) = xysinz, u = (1/./2,0, - 1/./2),x =
(1, I, I) 10. Find the directional derivative at (I, 0, 0) of the function
f (x, y, z) = x 2 +yez in the direction of increasing t along
In Exercises 5 to 8, for the real-valued function defined the curve in R3 defined by g(t) = (t 2 - t + 2, t, t + 2) at
in JR. 2 , find the directional derivative at x in the direction g(0).
indicated.
11. Find the directional derivative at (I, 0, I) of the function
5, f(x, y) = x 2 - y2 at x = (1, 1) and in the direction f(x, y, z) = 4x 2 y + _v2z in the direction of the vector
(I/ ../5, 2/ ../5) (I, 1, I).
6. J (x, y) = ex sin y at x = (1, 0) and in the direction 12. Find the directional derivative at (0, 0) of the function
(cos ex, sincx) J(x, y) = sin(x + y) in the direction of the vector (a, b) .
Section 4A Vector-valued Functions 237
13. Use the cross product to find the direction of a perpen- 21. The mean-value theorem doesn't generalize to vector-
dicular p at (I, 2, I) to the surface defined parametrically valued functions, because for a vector-valued function
by (x, y, z) = (u 2 v, u + v, u). Then find the directional f (x) of even just one variable there may not be a single
derivative off (x, y, z) = x 3 + y 2 + z in the direction of point xo between x and y at which (f (y) - f (x)) / (y -
pat (I, 2, I). x) = j'(xo). Verify this assertion for the example f(x)=
14. Show that for an arbitrary angle a, the vector u = (sin x, sin 2x) on the interval O .:'.':: x .:'.S 1r.
(cos a, sina) is a unit vector in IR 2 inclined at angle a In Exercises 22 and 23, the fundamental derivative
to the positive x-axis.
approximation f(x + h) ~ f (X-O) + VJ(xo)h for dif-
ferentiable functions JR" ~ JR becomes the firs.t-degree
15. For the unit vector u in Exercise 14, show that
a (aN-1 I)
where the unit vector is u = (y - x)/IY - xJ.
20. Show that the function f defined by
aN f
ahN (x) = rJh ahN-1 (x).
xlyl
f(x,y)=
I ..jx2+y2'
0,
(x, y)
(x, y)
i= (0, 0)
= (0, 0)
has a directional derivative in every direction at (0, 0),
22. Let f(x, _y) = sin(x 2 + y) , x = (0, 0), and h =
(h, k). Compute the second-degree Taylor approximation
to f(h, k).
2 1
but that f is not differentiable at (0, 0). [Hint: If f were 23. Let f (x , y) = eX +»2-z , x = (0, 0, 0), and h =
differentiable at (0, 0), then we would have Vf (0, 0) = (h, k, l). Compute the second-degree Taylor approxima-
(0, 0).] tion to f(h, k, [).
af .)
Vfi (x) = ( -(x),
a1i a1i (x), -(x)
ax1 ax2 axil .
Vfi(x) = ( -(x),
a1i ~h (x), ... ' ah (x))
ax1 i:1x2 axil ,
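\[ f(x, y) = \begin{pmatrix} x^2 - y^2 \\ 2xy \end{pmatrix} \]
\[ \frac{\partial f}{\partial x}(x, y) = \begin{pmatrix} 2x \\ 2y \end{pmatrix}, \qquad \frac{\partial f}{\partial y}(x, y) = \begin{pmatrix} -2y \\ 2x \end{pmatrix}. \]
We draw the same conclusion as in Example 1 from continuity of the four partial derivatives, namely that $f$ is continuously differentiable on all of $\mathbb{R}^2$.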
Section 4B Vector-valued Functions 239
4B The Derivative Matrix
Each of the Examples 1 and 2 in Section 4A displays the first-order partial derivatives of a vector-valued function $\mathbb{R}^2 \xrightarrow{f} \mathbb{R}^2$ as the result of computations that are in principle somewhat different. In Example 1 we looked at the gradient vectors of the two coordinate functions, while in Example 2 we looked at the vector partial derivatives of $f$ itself. Both of these interpretations are important to keep in mind, but there is a third way of organizing the results of the computations that is specially important, namely the arrangement of each example's real-valued partial derivatives in the following 2-by-2 matrix:
\[ f'(x, y) = \begin{pmatrix} \dfrac{\partial f_1}{\partial x}(x, y) & \dfrac{\partial f_1}{\partial y}(x, y) \\[2ex] \dfrac{\partial f_2}{\partial x}(x, y) & \dfrac{\partial f_2}{\partial y}(x, y) \end{pmatrix} = \begin{pmatrix} 2x & -2y \\ 2y & 2x \end{pmatrix}. \]
We have denoted the resulting matrix at a point $(x, y)$ by $f'(x, y)$, a notation that will be particularly useful when we take up the general chain rule in Chapter 5. Notice that the rows of this matrix are the successive coordinates of the gradient vectors of the coordinate functions $f_1(x, y) = x^2 - y^2$ and $f_2(x, y) = 2xy$ of $f(x, y)$. The columns of the matrix consist of the entries in the vector partial derivatives of $f(x, y)$.
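In general we define the derivative matrix of a differentiable function at $\mathbf{x}$ by
\[ f'(\mathbf{x}) = \begin{pmatrix} \dfrac{\partial f_1}{\partial x_1}(\mathbf{x}) & \dfrac{\partial f_1}{\partial x_2}(\mathbf{x}) & \cdots & \dfrac{\partial f_1}{\partial x_n}(\mathbf{x}) \\[2ex] \dfrac{\partial f_2}{\partial x_1}(\mathbf{x}) & \dfrac{\partial f_2}{\partial x_2}(\mathbf{x}) & \cdots & \dfrac{\partial f_2}{\partial x_n}(\mathbf{x}) \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial f_m}{\partial x_1}(\mathbf{x}) & \dfrac{\partial f_m}{\partial x_2}(\mathbf{x}) & \cdots & \dfrac{\partial f_m}{\partial x_n}(\mathbf{x}) \end{pmatrix}, \]
where $f_1(\mathbf{x}), f_2(\mathbf{x}), \dots, f_m(\mathbf{x})$ are the differentiable coordinate functions of $f(\mathbf{x})$. Note that the differentiation variable remains the same in each column, producing a vector partial derivative $\partial f/\partial x_j(\mathbf{x})$, which is a tangent vector at $\mathbf{x}$ to a curve in the image space generated by varying only $x_j$. Across each row a coordinate function $f_i$ remains the same, producing the coordinate functions of a gradient vector $\nabla f_i(\mathbf{x})$, which we'll see in Section 1 of the next chapter gives the magnitude and direction of maximum increase of $f_i$ at $\mathbf{x}$.
If $f$ is a real-valued function there is only one coordinate function so the matrix has only a single row, whose entries are identical with the coordinates of the gradient vector $\nabla f(\mathbf{x})$. Notationally the only difference between $\nabla f(\mathbf{x})$ and the single-rowed matrix is that $\nabla f(\mathbf{x})$ has its entries separated by commas. So in every case a derivative matrix $f'(\mathbf{x})$ can be regarded as having tangent vectors for columns and gradient vectors for rows.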
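The column-by-column description can be turned into a small numerical sketch. The code below builds an approximate derivative matrix one column at a time, each column being a central-difference estimate of a vector partial derivative, and applies it to the map $f(x, y) = (x^2 - y^2, 2xy)$ used earlier in this section. The function name `derivative_matrix` and the step `h` are assumptions made for illustration, not notation from the text.

```python
def f(p):
    """The map f(x, y) = (x^2 - y^2, 2xy)."""
    x, y = p
    return [x * x - y * y, 2.0 * x * y]


def derivative_matrix(func, point, h=1e-6):
    """Approximate m-by-n derivative matrix of func at point.

    Column j is a central-difference estimate of the vector partial
    derivative with respect to x_j; the result is returned as a list of rows.
    """
    n = len(point)
    columns = []
    for j in range(n):
        forward = list(point)
        backward = list(point)
        forward[j] += h
        backward[j] -= h
        f_forward = func(forward)
        f_backward = func(backward)
        columns.append([(a - b) / (2.0 * h) for a, b in zip(f_forward, f_backward)])
    m = len(columns[0])
    # Transpose the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]


if __name__ == "__main__":
    # The exact matrix is [[2x, -2y], [2y, 2x]]; at (1, 2) that is [[2, -4], [4, 2]].
    for row in derivative_matrix(f, [1.0, 2.0]):
        print(row)
```

Each printed row approximates the gradient of one coordinate function, matching the rows-as-gradients reading of the matrix.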
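EXAMPLE 3 The function $\mathbb{R}^3 \xrightarrow{f} \mathbb{R}^3$ defined by
\[ f(x, y, z) = \begin{pmatrix} x^2 + e^y \\ x + y \sin z \\ x + y \end{pmatrix} \]
has coordinate functions
\[ f_1(x, y, z) = x^2 + e^y, \quad f_2(x, y, z) = x + y \sin z, \quad f_3(x, y, z) = x + y. \]
The derivative matrix at $(x, y, z)$ is the matrix whose columns are the three possible vector partial derivatives:
\[ f'(x, y, z) = \begin{pmatrix} \dfrac{\partial f}{\partial x}(x, y, z) & \dfrac{\partial f}{\partial y}(x, y, z) & \dfrac{\partial f}{\partial z}(x, y, z) \end{pmatrix} = \begin{pmatrix} 2x & e^y & 0 \\ 1 & \sin z & y \cos z \\ 1 & 1 & 0 \end{pmatrix}. \]
In particular,
\[ f'(1, 1, \pi) = \begin{pmatrix} 2 & e & 0 \\ 1 & 0 & -1 \\ 1 & 1 & 0 \end{pmatrix}. \]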
\[ \nabla f_1(x, y) = (2x + 2y,\ 2x + 2y) \quad \text{and} \quad \nabla f_2(x, y) = (y^2 + 2xy,\ 2xy + x^2). \]
Hence the derivative of $f$ at $(x, y)$ is given by the matrix
\[ f'(x, y) = \begin{pmatrix} 2x + 2y & 2x + 2y \\ y^2 + 2xy & 2xy + x^2 \end{pmatrix}. \]
\[ \begin{pmatrix} -\sin t_0 \\ \cos t_0 \end{pmatrix}. \]
It's instructive to consider the matrix as a vector in the range space of $f$ and to draw it with its tail at the image point $f(t_0)$. For $t_0 = 0$, $\pi/4$, $\pi/3$, $\pi/2$, and $\pi$, the
Section 4C Vector-valued Functions 241
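derivative matrices are, respectively,
\[ \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad \begin{pmatrix} -\sqrt{2}/2 \\ \sqrt{2}/2 \end{pmatrix}, \quad \begin{pmatrix} -\sqrt{3}/2 \\ 1/2 \end{pmatrix}, \quad \begin{pmatrix} -1 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 0 \\ -1 \end{pmatrix}. \]
Viewed as vectors, drawn with their tails at their corresponding image points under $f$, these are shown in Figure 5.14. Evidently, for functions of one real variable, the idea of derivative, introduced here as a matrix, coincides with the vector derivative developed in Chapter 4, Section 4. The first-degree Taylor expansion approximates $f(t)$ near $t_0$ and is the vector function of $t$ given by
\[ T(t) = f(t_0) + (t - t_0) f'(t_0). \]
This is the parametric representation of the line tangent to the image of $f$ at $f(t_0)$.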
4C Tangent Approximations
We've so far encountered two settings of the tangent approximation, the first being the one appropriate for the case $\mathbb{R} \xrightarrow{f} \mathbb{R}$ of a differentiable real-valued function of a single real variable, namely
\[ T(x) = f(x_0) + f'(x_0)(x - x_0). \]
In this familiar setting in single-variable calculus, we use $T(x)$ to write the equation of the tangent line to the graph of $y = f(x)$ at $x_0$ in the form
\[ y = f(x_0) + f'(x_0)(x - x_0). \]
We've seen that for general $n$ and $m$ a differentiable function $\mathbb{R}^n \xrightarrow{f} \mathbb{R}^m$ has an $m$-by-$n$ derivative matrix $f'(\mathbf{x})$ with $m$ rows, one for each coordinate function, and $n$ columns, one for each coordinate variable in $\mathbf{x}$. It is this matrix that plays the role that the gradient plays in the case of a single real-valued coordinate function. Thus the first-degree Taylor approximation to $f\colon \mathbb{R}^n \to \mathbb{R}^m$ at $\mathbf{x}_0$ is
4.1 Definition
\[ T(\mathbf{x}) = f(\mathbf{x}_0) + f'(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0), \]
where the vector $\mathbf{x} - \mathbf{x}_0$ must be interpreted as a column so that we can multiply on the left by the matrix $f'(\mathbf{x}_0)$ using the row-by-column rule for matrix multiplication.
has as image one complete turn of a circular helix in $\mathbb{R}^3$. This function $f$ has
derivative matrix
cos v -u sin v )
J'(u, v) = s\n v u c~s v .
(
Since the entries in this matrix are continuous real-valued functions on all of $\mathbb{R}^2$, we can conclude that $f(u, v)$ is continuously differentiable on its rectangular domain $0 \le u \le 1$, $0 \le v \le 2\pi$. Note that the columns of the 3-by-2 matrix $f'(u_0, v_0)$ in a single-variable context are tangent vectors $\mathbf{u}_0$ and $\mathbf{v}_0$ that generate a tangent plane at a point $f(u_0, v_0)$ consisting of all points
\[ T(u, v) = f(u_0, v_0) + (u - u_0)\mathbf{u}_0 + (v - v_0)\mathbf{v}_0. \]
This plane is the image of the first-degree tangent approximation at $(u_0, v_0)$. Note also that the inclusion of the terms $-u_0$ and $-v_0$ just has the effect of shifting the parameter values so the point of tangency on the plane corresponds to $(u_0, v_0)$. Deleting those two terms would still produce the same plane as image; with $(u, v) = (0, 0)$ the parameter values indicate the point of tangency on the plane. See Figure 4.22(b) in Chapter 4, Section 4, where the columns of the matrix $f'(u, v)$ were written as the vector partials $\partial f/\partial u$ and $\partial f/\partial v$.
\[ f'(x, y, z) = \begin{pmatrix} -1 & 1 & 1 \\ 2 & -2 & 2 \\ 3 & 3 & -3 \end{pmatrix}. \]
This constant 3-by-3 matrix $A = f'(x, y, z)$ has positive determinant 24 and so is invertible with inverse $A^{-1}$ using Theorem 5.7 in Chapter 2, Section 5. Note that $f(\mathbf{x}) = A\mathbf{x}$, where $\mathbf{x}$ is a 3-dimensional column vector. Hence there is an inverse function $f^{-1}(\mathbf{y}) = A^{-1}\mathbf{y}$ that takes each point $\mathbf{y}$ in the image of $f$ back to its corresponding point $\mathbf{x}$ in the domain of $f$.
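The first-degree approximation defined by Equation 4.1 has the same essential character as the two special cases we considered previously, namely that $f(\mathbf{x}) - T(\mathbf{x})$ tends to zero faster than $|\mathbf{x} - \mathbf{x}_0|$ as $\mathbf{x}$ tends to $\mathbf{x}_0$. Here is the complete formal statement and proof.
4.2 Theorem. If $\mathbb{R}^n \xrightarrow{f} \mathbb{R}^m$ is differentiable at $\mathbf{x}_0$, then
\[ \lim_{\mathbf{x} \to \mathbf{x}_0} \frac{|f(\mathbf{x}) - f(\mathbf{x}_0) - f'(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0)|}{|\mathbf{x} - \mathbf{x}_0|} = 0. \]
4.3 Corollary. If $A$ is a constant $m$-by-$n$ matrix, then the function $\mathbb{R}^n \xrightarrow{f} \mathbb{R}^m$ defined by $f(\mathbf{x}) = A\mathbf{x}$ has $A$ for its derivative matrix, that is, $f'(\mathbf{x}) = A$.
Proof. The derivative matrix $f'(\mathbf{x}_0)$ is the unique matrix that satisfies the equation of Theorem 4.2. Observe that since $A(\mathbf{x} - \mathbf{x}_0) = A\mathbf{x} - A\mathbf{x}_0$, then
\[ \frac{A\mathbf{x} - A\mathbf{x}_0 - A(\mathbf{x} - \mathbf{x}_0)}{|\mathbf{x} - \mathbf{x}_0|} = \frac{A\mathbf{x} - A\mathbf{x}_0 - A\mathbf{x} + A\mathbf{x}_0}{|\mathbf{x} - \mathbf{x}_0|} = \mathbf{0}. \]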
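The statement that $f(\mathbf{x}) - T(\mathbf{x})$ tends to zero faster than $|\mathbf{x} - \mathbf{x}_0|$ can be observed numerically. The following sketch is an illustration under stated assumptions rather than a proof: it uses the sample map $f(x, y) = (x^2 - y^2, 2xy)$ with its hand-computed derivative matrix, and the helper name `error_ratio` is introduced here only for the demonstration.

```python
import math


def f(p):
    """Sample map f(x, y) = (x^2 - y^2, 2xy)."""
    x, y = p
    return (x * x - y * y, 2.0 * x * y)


def derivative_matrix(p):
    """Exact derivative matrix of f, rows being the gradients of f_1 and f_2."""
    x, y = p
    return ((2.0 * x, -2.0 * y), (2.0 * y, 2.0 * x))


def tangent_approximation(p0, p):
    """T(p) = f(p0) + f'(p0)(p - p0), with p - p0 treated as a column."""
    base = f(p0)
    a = derivative_matrix(p0)
    d = (p[0] - p0[0], p[1] - p0[1])
    return (base[0] + a[0][0] * d[0] + a[0][1] * d[1],
            base[1] + a[1][0] * d[0] + a[1][1] * d[1])


def error_ratio(p0, p):
    """|f(p) - T(p)| / |p - p0|, which should shrink as p approaches p0."""
    fp = f(p)
    tp = tangent_approximation(p0, p)
    numerator = math.hypot(fp[0] - tp[0], fp[1] - tp[1])
    denominator = math.hypot(p[0] - p0[0], p[1] - p0[1])
    return numerator / denominator


if __name__ == "__main__":
    p0 = (1.0, 2.0)
    for s in (1e-1, 1e-2, 1e-3):
        print(s, error_ratio(p0, (p0[0] + s, p0[1] + s)))
```

For this particular map the ratio works out to $|\mathbf{x} - \mathbf{x}_0|$ itself, so the printed values shrink in proportion to the step.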
20. A(:)-(:~:) at ( : ) - ( b)
5. f(x) = x 2ex
6. f(x,y) = (exY,xy)
21. r(: )-( :;~;~) " (: )-(; )
22. f (x, y, z) = (x + y + z, xy + yz + zx, xyz) at (x, y, z)
UV )
1. f(u. v. w) - ( VU! 23. Let P be the function from R3 to R2 defined by
WU P(x, y, z) = (x, y).
(a) What is the geometric interpretation of this transfor-
8. f (t J = ( ~: ) mation?
(b) Show that P is differentiable at all points and find
the derivative matrix of P at (l, 1, l).
10. f(u, v) = (u, v, 11 2 + v 2) g(t) = (t -1,t 2 - 3t +2), -00 < 1< 00.
L1 Exercises 11 to 14, let / be the vector function (b) Find formulas describing the tangent approximations
to g near t = 0 and near t = 2.
defined by
f ( X)
y
=( x2
2xy
-y2) (c) Draw the lines defined parametrically by the tangent
approximation.
25. Let f be the function given in Exercise I, and let
Find the derivative matrix of f at the following points:
11. ( ; )
xo =( 6) , YI =( OOI ) ,
12. ( ~ )
Y2 =( O~I ) , Y3 = ( ~: ! )·
(a) Compute f(xo + y;) for i = I, 2, 3.
13. ( 6) (b) Find the tangent approximation to J(xo + y) for an
arbitrary vector y.
(c) Use part (b) to find approximations to the vectors
1/./2 ) J(xo + y;), i = I, 2, 3.
14. ( 1/./2
26. (a) Sketch the graph in R3 defined explicitly by the
In Exercises 15 to 22, find the derivative matrix of the function
function at the indicated point. f(x,y)=4-x 2 -y2.
1 (b) Find the tangent approximation to f (i) near (x, y) =
15. f ( ; ) = x2 + y2 at ( ; ) =( )
(0, 0) and (ii) near (x, y) = (2, 0).
16. g(x, y, z) = xyz at (x, y, z) = (l, 0, 0) (c) Draw· the graphs of the approximations in (b).
27. What is the derivative matrix f'(x, y, z) of the function
17. J(t) =( sint ) at t = !L
cos 1 4
a1
f (x, y, z) = b1 a2
b2 b3 ) ( xy )
a3 +( ao
ho ) ?
18.1(,)-(:,),11-1 (
CJ c2 c3 z co
)!~
19. g(x.y)= ( 2 ) at (x,y)=(l,2)
for a fixed vector b. What is the derivative matrix of a
translation from R" to R 11 ?
Section 5 Newton's Method 245
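*29. Show that if $f\colon \mathbb{R}^m \to \mathbb{R}^n$ and $g\colon \mathbb{R}^m \to \mathbb{R}^n$ are differentiable at $\mathbf{x}_0$ and $a$ is a real number, then $f + g$ and $af$ are differentiable at $\mathbf{x}_0$ and
(a) $(f + g)'(\mathbf{x}_0) = f'(\mathbf{x}_0) + g'(\mathbf{x}_0)$
(b) $(af)'(\mathbf{x}_0) = af'(\mathbf{x}_0)$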
\[ |\mathbf{x}_k - \mathbf{x}| < \epsilon \]
whenever $k \ge N$. Then we say that the given sequence converges to the limit $\mathbf{x}$, and we write
\[ \lim_{k \to \infty} \mathbf{x}_k = \mathbf{x}. \]
FIGURE 5.15
We can summarize by saying that the sequence $\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3, \dots$ converges to $\mathbf{x}$ if $|\mathbf{x}_k - \mathbf{x}|$ is arbitrarily small for all sufficiently large $k$. Figure 5.15 shows a sequence with entries lying within $\epsilon$ of $\mathbf{x}$ whenever $k \ge 6$.
Consider the vector $(\sqrt{2}, \pi)$ in $\mathbb{R}^2$. Suppose that $\sqrt{2}$ is approximated by the decimal expansion sequence 1, 1.4, 1.41, 1.414, ... and that $\pi$ is approximated by 3, 3.1, 3.14, 3.141, .... Then we can form the sequence of vectors (1, 3), (1.4, 3.1), (1.41, 3.14), (1.414, 3.141), ... to approximate the vector $(\sqrt{2}, \pi)$. We leave as an exercise showing that if $x_1, x_2, x_3, \dots$ and $y_1, y_2, y_3, \dots$ are the sequences approximating $\sqrt{2}$ and $\pi$ respectively, then $\lim_{k \to \infty} x_k = \sqrt{2}$ and $\lim_{k \to \infty} y_k = \pi$ implies that $\lim_{k \to \infty} (x_k, y_k) = (\sqrt{2}, \pi)$.
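The convergence of the vector sequence can be watched numerically. This is a small sketch, with the helper `truncate` introduced here as an assumption to generate the truncated decimal expansions; it prints the distance from each vector in the sequence to $(\sqrt{2}, \pi)$.

```python
import math


def truncate(value, digits):
    """Truncate a positive number to the given number of decimal places."""
    scale = 10 ** digits
    return math.floor(value * scale) / scale


def distance(p, q):
    """Euclidean distance between two points of the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


target = (math.sqrt(2.0), math.pi)

# Entry k pairs the k-place truncations of sqrt(2) and pi.
sequence = [(truncate(target[0], k), truncate(target[1], k)) for k in range(8)]
errors = [distance(v, target) for v in sequence]

if __name__ == "__main__":
    for k, (v, e) in enumerate(zip(sequence, errors)):
        print(k, v, e)
```

Since each coordinate of entry $k$ is within $10^{-k}$ of its target, the distance is at most $\sqrt{2} \cdot 10^{-k}$, which is the kind of estimate the exercise asks for.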
We look first at Newton's method for approximating a solution of an equation $f(x) = 0$ where $f$ is real-valued and $x$ is a real variable. We assume that $f$ is continuously differentiable. If the graph of $f$ should happen to be convex as shown in Figure 5.16, then it's geometrically apparent that the tangent line to the graph at $(x_0, f(x_0))$ crosses the $x$-axis at a point $x_1$ that is a better approximation to the solution $x$ than $x_0$ is. Having chosen $x_0$ somewhat arbitrarily, and having found $x_1$, we can repeat the process. This time we use the tangent line at $(x_1, f(x_1))$ and call its intersection with the $x$-axis $x_2$. Thus we can generate a sequence of numbers $x_0, x_1, x_2, \dots$ approximating $x$.
FIGURE 5.16
In practice, we need a formula for computing the sequence $x_1, x_2, \dots$. We observe first that the tangent line at $(x_0, f(x_0))$ has the equation
5.1 \[ y = f'(x_0)(x - x_0) + f(x_0). \]
Since the approximation $x_1$ is found by intersecting the tangent with the $x$-axis, we set $y = 0$ in the above equation and solve for $x_1$. The result is
\[ x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}. \]
Repeating the process with $x_1$ in place of $x_0$ gives
\[ x_2 = x_1 - \frac{f(x_1)}{f'(x_1)}. \]
EXAMPLE 2 The equation $x^2 - 3 = 0$ has two solutions, $\sqrt{3}$ and $-\sqrt{3}$. To approximate $\sqrt{3}$ we choose $x_0 = 2$ and compute $x_{k+1}$ from $x_k$ by Formula (1), which in this case is
\[ x_{k+1} = x_k - \frac{x_k^2 - 3}{2x_k} = \frac{x_k^2 + 3}{2x_k}. \]
Thus we get $x_1 = \frac{7}{4} = 1.75$. Substituting this value in the preceding formula for $k = 1$ gives $x_2 = \frac{97}{56} \approx 1.732142857$. This approximation to $\sqrt{3}$ is correct to three decimal places. Calculating one more step gives $x_3 \approx 1.7320508$, which is correct to the number of displayed digits.
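The iteration of Example 2 can be sketched in a few lines of Python. This is only an illustration: the helper name `newton` and its arguments are assumptions made for the sketch, not notation from the text.

```python
def newton(f, fprime, x0, steps):
    """Return [x0, x1, ..., x_steps] using x_{k+1} = x_k - f(x_k)/f'(x_k)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        slope = fprime(x)
        if slope == 0:
            # A horizontal tangent never meets the x-axis, so the step is undefined.
            raise ZeroDivisionError("f'(x_k) = 0: Newton step undefined")
        xs.append(x - f(x) / slope)
    return xs

# Example 2: solve x^2 - 3 = 0 starting from x0 = 2.
iterates = newton(lambda x: x * x - 3, lambda x: 2 * x, 2.0, 3)
for k, x in enumerate(iterates):
    print(k, x)
```

The printed values can be compared with $x_1$, $x_2$, and $x_3$ in the example.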
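If $f'(\mathbf{x}_0)$ has an inverse matrix $[f'(\mathbf{x}_0)]^{-1}$, we apply the inverse to both sides to get
\[ \mathbf{x}_1 = \mathbf{x}_0 - [f'(\mathbf{x}_0)]^{-1} f(\mathbf{x}_0). \]
In this equation $[f'(\mathbf{x}_0)]^{-1} f(\mathbf{x}_0)$ is the vector we get by applying the inverse of the matrix $f'(\mathbf{x}_0)$ to the vector $f(\mathbf{x}_0)$. The vector $\mathbf{x}_1$ is the first improvement on the initial approximation $\mathbf{x}_0$ to the solution $\mathbf{x}$.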
\[ x^2 + y^2 = 2, \qquad x^2 - y^2 = 1, \]
the intersecting graphs of which appear in Figure 5.17. There are four solutions to
the pair of equations. To find approximate solutions by Newton's method we define
\[ f(x, y) = \begin{pmatrix} x^2 + y^2 - 2 \\ x^2 - y^2 - 1 \end{pmatrix} \]
and solve the equation $f(x, y) = (0, 0)$. Since $f$ is a function from $\mathbb{R}^2$ to $\mathbb{R}^2$, we require both $x^2 + y^2 - 2 = 0$ and $x^2 - y^2 - 1 = 0$, and it's helpful to sketch the curves defined by these two equations. The exact solutions are represented by the four points of intersection of the circle $x^2 + y^2 - 2 = 0$ and the hyperbola $x^2 - y^2 - 1 = 0$ shown in Figure 5.17. The choice of an initial approximation depends on which solution we want to approximate. To look for the solution in the first quadrant, we try $\mathbf{x}_0 = (1, 1)$.
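Since
\[ f(x, y) = \begin{pmatrix} x^2 + y^2 - 2 \\ x^2 - y^2 - 1 \end{pmatrix}, \quad \text{we have} \quad f'(x, y) = \begin{pmatrix} 2x & 2y \\ 2x & -2y \end{pmatrix} \]
and
\[ [f'(x, y)]^{-1} = \begin{pmatrix} \frac{1}{4}x^{-1} & \frac{1}{4}x^{-1} \\ \frac{1}{4}y^{-1} & -\frac{1}{4}y^{-1} \end{pmatrix}. \]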
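Then the right side of Equation 5.4 becomes
\[
\begin{pmatrix} x \\ y \end{pmatrix} - [f'(x, y)]^{-1} f(x, y)
= \begin{pmatrix} x \\ y \end{pmatrix} - \begin{pmatrix} \frac{1}{4}x^{-1} & \frac{1}{4}x^{-1} \\ \frac{1}{4}y^{-1} & -\frac{1}{4}y^{-1} \end{pmatrix} \begin{pmatrix} x^2 + y^2 - 2 \\ x^2 - y^2 - 1 \end{pmatrix}
= \begin{pmatrix} x \\ y \end{pmatrix} - \begin{pmatrix} \dfrac{2x^2 - 3}{4x} \\ \dfrac{2y^2 - 1}{4y} \end{pmatrix}
= \begin{pmatrix} \dfrac{2x^2 + 3}{4x} \\ \dfrac{2y^2 + 1}{4y} \end{pmatrix}.
\]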
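This vector is the analog of the expression $(x^2 + 3)/2x$ in the previous example and is the formula by which the sequence of approximations is actually computed. Setting $\mathbf{x}_0 = (x_0, y_0) = (1, 1)$, we get
\[ \mathbf{x}_1 = \begin{pmatrix} \dfrac{2x_0^2 + 3}{4x_0} \\ \dfrac{2y_0^2 + 1}{4y_0} \end{pmatrix} = \begin{pmatrix} \frac{5}{4} \\ \frac{3}{4} \end{pmatrix} = \begin{pmatrix} 1.25 \\ 0.75 \end{pmatrix}. \]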
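Substituting $\mathbf{x}_1$ into Equation 5.3 gives
\[ \mathbf{x}_2 = \begin{pmatrix} \dfrac{2(1.25)^2 + 3}{4(1.25)} \\ \dfrac{2(0.75)^2 + 1}{4(0.75)} \end{pmatrix} \approx \begin{pmatrix} 1.225 \\ 0.70833 \end{pmatrix}. \]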
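Substituting our approximate value for $\mathbf{x}_2$ gives
\[ \mathbf{x}_3 = \begin{pmatrix} \dfrac{2(1.225)^2 + 3}{4(1.225)} \\ \dfrac{2(0.70833)^2 + 1}{4(0.70833)} \end{pmatrix} \approx \begin{pmatrix} 1.22474 \\ 0.707108 \end{pmatrix}. \]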
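Similarly, we get
\[ \mathbf{x}_4 \approx \begin{pmatrix} 1.22474 \\ 0.707107 \end{pmatrix} \quad \text{and} \quad \mathbf{x}_5 \approx \begin{pmatrix} 1.22474 \\ 0.707107 \end{pmatrix}. \]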
As in the previous example, you can check that further iteration using only five places after the decimal point doesn't produce a change.
In this example, the two simultaneous equations can actually be solved by elimination to yield $\mathbf{x} = (\sqrt{1.5}, \sqrt{0.5})$. The approximation $\mathbf{x}_4 = (1.22474, 0.707107)$ happens to be correct to that many decimal places. We get the other three vector solutions by symmetry. Referring to Figure 5.17, we get these vectors by changing one or both signs of the coordinates to minus. The numerical procedure could have been applied by taking as initial estimate $\mathbf{x}_0$ one of the vectors $(-1, -1)$, $(-1, 1)$, or $(1, -1)$.
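The vector iteration of this example can be sketched as follows. The function names are assumptions chosen for illustration, and the inverse of the 2-by-2 derivative matrix is written out by hand rather than taken from a library.

```python
def f(x, y):
    # The pair of equations x^2 + y^2 - 2 = 0 and x^2 - y^2 - 1 = 0.
    return (x * x + y * y - 2.0, x * x - y * y - 1.0)

def inverse_jacobian(x, y):
    # Inverse of f'(x, y) = [[2x, 2y], [2x, -2y]]; requires x != 0 and y != 0.
    return ((0.25 / x, 0.25 / x), (0.25 / y, -0.25 / y))

def newton_step(x, y):
    # One step of x_{k+1} = x_k - [f'(x_k)]^{-1} f(x_k).
    (a, b), (c, d) = inverse_jacobian(x, y)
    u, v = f(x, y)
    return (x - (a * u + b * v), y - (c * u + d * v))

def newton(x, y, steps):
    points = [(x, y)]
    for _ in range(steps):
        x, y = newton_step(x, y)
        points.append((x, y))
    return points

points = newton(1.0, 1.0, 5)
for k, p in enumerate(points):
    print(k, p)
```

Starting instead from $(-1, -1)$, $(-1, 1)$, or $(1, -1)$ should lead to the other three solutions, as the symmetry remark suggests.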
For $k \ge 1$, $\mathbf{x}_{k+1}$ as defined by Equation 5.5 will in general be different from the corresponding value determined by the Newton formula in Equation 5.4, because the matrix $[f'(\mathbf{x}_0)]^{-1}$ remains the same at each step in Equation 5.5.
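\[ \begin{pmatrix} x^2 + y^2 - 2 \\ x^2 - y^2 - 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \]
we apply Equation 5.5 with $\mathbf{x}_0 = (1, 1)$. Then
\[ [f'(\mathbf{x}_0)]^{-1} = \begin{pmatrix} \frac{1}{4} & \frac{1}{4} \\ \frac{1}{4} & -\frac{1}{4} \end{pmatrix}, \]
so
\[
\mathbf{x} - [f'(\mathbf{x}_0)]^{-1} f(\mathbf{x})
= \begin{pmatrix} x \\ y \end{pmatrix} - \begin{pmatrix} \frac{1}{4} & \frac{1}{4} \\ \frac{1}{4} & -\frac{1}{4} \end{pmatrix} \begin{pmatrix} x^2 + y^2 - 2 \\ x^2 - y^2 - 1 \end{pmatrix}
= \begin{pmatrix} x \\ y \end{pmatrix} - \begin{pmatrix} \dfrac{2x^2 - 3}{4} \\ \dfrac{2y^2 - 1}{4} \end{pmatrix}
= \begin{pmatrix} \dfrac{-2x^2 + 4x + 3}{4} \\ \dfrac{-2y^2 + 4y + 1}{4} \end{pmatrix}.
\]
Hence
\[ \mathbf{x}_1 = \begin{pmatrix} \dfrac{-2x_0^2 + 4x_0 + 3}{4} \\ \dfrac{-2y_0^2 + 4y_0 + 1}{4} \end{pmatrix} = \begin{pmatrix} 1.25 \\ 0.75 \end{pmatrix}. \]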
In the next step
\[ \mathbf{x}_2 = \begin{pmatrix} \dfrac{-2(1.25)^2 + 4(1.25) + 3}{4} \\ \dfrac{-2(0.75)^2 + 4(0.75) + 1}{4} \end{pmatrix} = \begin{pmatrix} 1.21875 \\ 0.71875 \end{pmatrix}. \]
Continuing, we arrive at
\[ \mathbf{x}_{10} \approx \begin{pmatrix} 1.22474 \\ 0.707107 \end{pmatrix}, \]
which agrees with the result obtained in Example 3, though it takes more steps.
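A minimal sketch of the modified iteration, in which the inverse matrix is computed once at $\mathbf{x}_0 = (1, 1)$ and reused at every step, might look like this. The names `FIXED_INVERSE` and `modified_newton` are assumptions made for the illustration.

```python
def f(x, y):
    return (x * x + y * y - 2.0, x * x - y * y - 1.0)

# Inverse of f'(1, 1) = [[2, 2], [2, -2]], computed once and reused at every step.
FIXED_INVERSE = ((0.25, 0.25), (0.25, -0.25))

def modified_newton(x, y, steps):
    """Iterate x_{k+1} = x_k - [f'(x_0)]^{-1} f(x_k) with the inverse held fixed."""
    points = [(x, y)]
    (a, b), (c, d) = FIXED_INVERSE
    for _ in range(steps):
        u, v = f(x, y)
        x, y = x - (a * u + b * v), y - (c * u + d * v)
        points.append((x, y))
    return points

points = modified_newton(1.0, 1.0, 10)
for k, p in enumerate(points):
    print(k, p)
```

Comparing the printed iterates with those of the full Newton iteration shows the slower approach to the solution that the text describes.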
In deciding whether to use the Newton Formula 5.4 or its modification 5.5, note
that Formula 5.4 produces faster convergence than 5.5, that is, it achieves a smaller
error in a given number of steps; on the other hand Formula 5.5 has the advantage
that it requires calculation of the derivative matrix $f'$ and its inverse at only one point. Thus if computing the inverse matrices $[f'(\mathbf{x}_k)]^{-1}$ is going to be particularly time-consuming, it may be worth taking the extra iteration steps that Equation 5.5 may require to achieve the desired accuracy.
Note. Java applets NEWTON, NEWT2, and NEWT3 are available at http://math.dartmouth.edu/~rewn/ for implementing Newton's method.
EXERCISES
1. (a) Sketch the graph of $f(x) = \sqrt[3]{x} - x$ for $-2 < x < 2$.
(b) Sketch the tangent lines to the graph of $f$ at $x_0 = \frac{3}{4}$, $x_0 = -\frac{3}{4}$, and $x_0 = -\frac{1}{4}$.
(c) For each of the three choices for $x_0$ in part (b), what solution of $f(x) = 0$ can the Newton iteration be expected to converge to?
(d) Discuss the choice $x_0 = \sqrt{3}/9$ for an initial approximation to a solution of $f(x) = 0$.
2. To get some idea of how Newton's method can fail, observe that the equation $\sqrt[3]{x} = 0$ has the unique solution $x = 0$. Show that applying Newton's method to this equation with initial guess $x_0$ for the solution produces the subsequent approximations $x_n = (-2)^n x_0$. What happens if $x_0 \neq 0$ as $n$ increases? [Strictly speaking, the method doesn't apply if we start with $x_0 = 0$, since $f'(0)$ fails to exist.]
(a) Sketch the curves satisfying each of the two equations.
(b) Defining $f$ by
\[ f(x, y) = \begin{pmatrix} x^2 + y - 1 \\ x + y^2 - 2 \end{pmatrix}, \]
find $f'(x, y)$, $[f'(x, y)]^{-1}$, and $(x, y) - [f'(x, y)]^{-1} f(x, y)$.
(c) Using the sketch in part (a), choose an initial approximation $\mathbf{x}_0 = (x_0, y_0)$ to the solution of $f(x, y) = 0$ that lies in the fourth quadrant of the xy-plane.
(d) Compute $\mathbf{x}_1 = (x_1, y_1)$ by Formula 5.1.
(e) Compute $\mathbf{x}_5 = (x_5, y_5)$.
5. Let
Chapter 5 REVIEW
State whether each of the following sets 1 to 8 is (a) open, (b) closed, or (c) neither. Also describe each set's (d) interior, and (e) boundary.
1. The positive quadrant in $\mathbb{R}^2$, that is, the set of points $\{(x, y) \mid x > 0, y > 0\}$
2. All $(x, y)$ in $\mathbb{R}^2$ such that $|x| < 1$ and $|y| \le 1$
3. All $(x, y, z)$ in $\mathbb{R}^3$ such that $x^2 + y^2 + z^2 \le 1$ and $x < 0$
4. All $(x, y, z)$ in $\mathbb{R}^3$ such that $x > 1$, $y > 0$, and $z > -1$
5. The xy-plane in $\mathbb{R}^3$
6. The set in $\mathbb{R}^2$ consisting of all $(x, y)$ with $|(x, y)| < 1$, along with the point $(2, 2)$
7. The set of all $(x, y, z)$ in $\mathbb{R}^3$ such that $x > 0$, $y > 1$, and $z > 3$
8. The set of vectors $\mathbf{x}$ in $\mathbb{R}^4$ such that $|\mathbf{x} - (1, 1, 1, 1)| \le 2$
9. Consider the set $S$ in $\mathbb{R}^3$ consisting of the points $(x, y, z)$ such that $x^2 + y^2 + z^2 \le 1$ when $y \le 0$, and such that $x^2 + y^2 + z^2 < 1$ when $y > 0$. Describe (a) the interior of $S$ and (b) the boundary of $S$. What is the smallest closed set containing $S$?
In Exercises 10 to 15, suppose the function $\mathbb{R} \xrightarrow{H} \mathbb{R}$ is defined by
\[ H(x) = \begin{cases} 1, & \text{if } x \ge 0, \\ 0, & \text{if } x < 0. \end{cases} \]
What points of discontinuity, if any, does the composition $H(f(x, y))$ have in its domain of definition, given each of the following definitions for $f(x, y)$?
10. $f(x, y) = x^2 + y^2$
11. $f(x, y, z) = x^2 + y^2 + z^2$
12. $f(x, y) = \ln(x^2 + y^2)$
13. $f(x, y) = (x - y)/(1 + (x - y)^2)$
14. $f(x, y) = y$
15. $f(x, y) = 1/(1 + x^2 + y^2)$
In Exercises 16 to 21, compute $\nabla f(\mathbf{x})$ at a general point $\mathbf{x}$ in the domain of $f$.
16. $f(x, y) = e^x \sin y$
17. $f(x, y) = (x^2 + y^2)^{-1}$
18. $f(x, y, z) = xy + xyz^2$
19. $f(x, y, z) = x - y$
20. $f(x, y, z) = x/(y^2 + z^2)$
21. $f(x, y, z, w) = xy + yz + zw + wx$
In Exercises 22 to 27, compute all second-order partial derivatives of the given function, including mixed partial derivatives such as $\partial^2 f/\partial x\,\partial y$.
22. $f(x, y) = x^3 - y^3$
23. $f(x, y) = e^x \sin y$
24. $f(x, y) = x/(x^2 + y^2)$
25. $f(x, y, z) = yze^x$
26. $f(x, y, z) = \cos(x + y + z)$
27. $f(x, y, z) = x^4 + y^3 + z^2$
In Exercises 28 to 33, find the derivative matrix for the given function.
28. $f(x, y) = (x + y, x^2 + y^2)$
29. $f(u, v) = (u, v, u + v, u - 1)$
30. $f(t) = (t, t^2, t^3)$
31. $f(x, y) = (\frac{1}{2}x^2 - \frac{1}{2}y^2, xy)$
32. $f(x, y, z) = (x + y, y + z, z + x)$
33. $f(u, v, w) = (uv, vw, wu, uvw)$
34. Assuming $|\mathbf{u}| = 1$, find the rate of increase $\partial f/\partial \mathbf{u}(1, 2, 3)$ of $f(x, y, z) = xyz$ in the direction of the vector $(100, 200, 500)$.
35. Let $f(x, y) = xy^2$ and let $\mathbf{u} = (\cos\theta, \sin\theta)$.
(a) Compute $(\partial f/\partial \mathbf{u})(-1, 2)$ in terms of $\theta$.
(b) In what direction does this function increase most rapidly from the value $f(-1, 2)$?
*36. Let two fixed unit vectors in $\mathbb{R}^2$ satisfy $\mathbf{u}_1 \neq \pm\mathbf{u}_2$. Suppose the directional derivatives $\frac{\partial f}{\partial \mathbf{u}_1}(\mathbf{x})$ and $\frac{\partial f}{\partial \mathbf{u}_2}(\mathbf{x})$ are continuous functions of $\mathbf{x}$ on all of $\mathbb{R}^2$. Is $f$ continuously differentiable? Does your answer change if the unit vectors $\mathbf{u}_1 \neq \pm\mathbf{u}_2$ vary continuously from one point $\mathbf{x}$ to another?
CHAPTER 6
VECTOR DIFFERENTIAL
CALCULUS
\[ \nabla f(\mathbf{x}) = \left( \frac{\partial f}{\partial x_1}(\mathbf{x}), \ldots, \frac{\partial f}{\partial x_n}(\mathbf{x}) \right). \]
In Chapter 5 we concentrated on isolated values of the gradient of a real-valued function $f(\mathbf{x})$ and paid relatively little attention to it in a context in which it was important to regard it as a function from $\mathbb{R}^n$ to $\mathbb{R}^n$. As a function of $\mathbf{x}$, the image of $\nabla f(\mathbf{x})$ is most often pictured as a vector field, which we'll define here. We'll then refer to a vector field generated by $\nabla f$ for some differentiable function $f$ as a gradient field. (In Chapter 12 we'll study vector fields that aren't necessarily gradient fields.) Plotting a vector field $F(x, y) = (f(x, y), g(x, y))$ in $\mathbb{R}^2$ is in principle a very simple matter. Associated with each point $(x, y)$ in the domain of $F$ is an arrow from $(0, 0)$ to the point $F(x, y)$. We translate this arrow parallel to itself so that it starts at $(x, y)$ and ends at $(x, y) + F(x, y)$. Carrying out this routine for a suitably chosen selection of points in the plane will produce a sketch of the vector field. Carrying out this procedure by hand for a large number of points is extremely tedious for all but the simplest vector fields, and computer plotting is the preferred alternative, as described in Subsection 1D. Figure 6.1 shows some sketches that include a few such arrows in $\mathbb{R}^2$ and $\mathbb{R}^3$. To make sense of these pictures physically, you can imagine that the direction and length of an arrow $\nabla f(\mathbf{x})$ are respectively the direction and speed of a fluid flow at the point $\mathbf{x}$ to which the arrow is attached. (See Subsection 1D for a discussion of flow lines.)
[Figure 6.1, panels (a) and (b)]
Notice that the length $|\nabla f(x, y)| = \sqrt{x^2 + 1}$ is independent of $y$, but increases as $|x|$ does. Also, the arrows starting from a single vertical line, with the same x-coordinate, are parallel, with the same length because, as we remarked, $\nabla f(x, y)$ is independent of $y$ in this example.
EXAMPLE 2 The function $g(x, y, z) = \frac{1}{4}(x^2 + y^2 + z^2)$ has a gradient in $\mathbb{R}^3$,
\[ \nabla g(x, y, z) = (\tfrac{1}{2}x, \tfrac{1}{2}y, \tfrac{1}{2}z). \]
The direction of the field is directly away from the origin at each point, as shown in Figure 6.1(b).
The gradient of a function is important for several reasons, one of which is that it appears often in applications, for example to the concept of energy in Chapter 9, Section 2. Also, we proved in Chapter 5, Section 3, Theorem 3.1 that we can write the directional derivative of a real-valued function $f$ with respect to a unit vector $\mathbf{u}$ in terms of the gradient of $f$. Thus if $f$ is differentiable, then
\[ \frac{\partial f}{\partial \mathbf{u}}(\mathbf{x}) = \nabla f(\mathbf{x}) \cdot \mathbf{u}. \tag{1.1} \]
The following theorem is the origin of the mathematical use of the term gradient. The term appears in several other areas, for example road construction, where it refers to the slope of a road.
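\[ \frac{\partial f}{\partial \mathbf{u}}(\mathbf{x}) = \nabla f(\mathbf{x}) \cdot \mathbf{u} = |\nabla f(\mathbf{x})||\mathbf{u}| \cos\theta = |\nabla f(\mathbf{x})| \cos\theta, \]
where $\theta$ is the angle between $\mathbf{u}$ and $\nabla f(\mathbf{x})$. Hence the directional derivative assumes its maximum value when $\cos\theta = 1$ and $\theta = 0$, that is to say when $\mathbf{u}$ has the same direction as $\nabla f(\mathbf{x})$. Thus $\nabla f(\mathbf{x})$ points in the direction of maximum increase of $f$ at the point $\mathbf{x}$, and $|\nabla f(\mathbf{x})|$ is the maximum rate of increase there. •
[Figure, panel (a): steepest path]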
EXAMPLE 3 Let $f(x, y) = e^{xy}$. Then $\nabla f(x, y) = (ye^{xy}, xe^{xy})$; thus at $(1, 2)$ the function $f$ increases most rapidly in the direction $\nabla f(1, 2) = (2e^2, e^2)$, which has the same direction as the unit vector $(2/\sqrt{5}, 1/\sqrt{5})$. The rate of increase in this direction is $|\nabla f(1, 2)| = \sqrt{5}e^2$. Similarly,
\[ \nabla f(-1, 2) = (2e^{-2}, -e^{-2}) \]
and has direction $(2/\sqrt{5}, -1/\sqrt{5})$, with maximum rate of increase at $(-1, 2)$ equal to $\sqrt{5}e^{-2}$. The maximum rate of decrease occurs in the opposite direction.
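As a numerical illustration of the claim that the gradient gives the direction and rate of maximum increase, the following sketch samples directional derivatives of $f(x, y) = e^{xy}$ at $(1, 2)$ in 360 directions. The central-difference helper and the step size `h` are assumptions of this sketch, not part of the text.

```python
import math

def f(x, y):
    return math.exp(x * y)

def grad_f(x, y):
    # Gradient of e^{xy}: (y e^{xy}, x e^{xy}).
    e = math.exp(x * y)
    return (y * e, x * e)

def directional_derivative(x, y, theta, h=1e-6):
    # Central-difference estimate of the derivative along u = (cos theta, sin theta).
    ux, uy = math.cos(theta), math.sin(theta)
    return (f(x + h * ux, y + h * uy) - f(x - h * ux, y - h * uy)) / (2 * h)

gx, gy = grad_f(1.0, 2.0)
max_rate = math.hypot(gx, gy)      # |grad f(1, 2)|
best_theta = math.atan2(gy, gx)    # angle of the gradient direction

# Sample the directional derivative one degree at a time.
rates = [directional_derivative(1.0, 2.0, math.radians(k)) for k in range(360)]
print(max_rate, max(rates), math.cos(best_theta), math.sin(best_theta))
```

None of the sampled rates should exceed `max_rate`, and the direction printed at the end can be compared with the unit vector in the example.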
1B Chain Rule
Next we'll prove a chain rule for differentiating the composition $g(f(t))$ of a function $\mathbb{R} \xrightarrow{f} \mathbb{R}^n$ and a function $\mathbb{R}^n \xrightarrow{g} \mathbb{R}$. For example, if $f$ and $g$ are
theorem gives a formula for doing this in terms of the gradient of $g$ and the vector derivative of $f$.
1.3 Theorem. Let $g$ be real-valued and continuously differentiable on an open set $D$ in $\mathbb{R}^n$ and let $f(t)$ be defined and differentiable for $a < t < b$, taking its values in $D$. Then the composite function $F(t) = g(f(t))$ is differentiable for $a < t < b$ and
\[ F'(t) = \nabla g(f(t)) \cdot f'(t). \]
where $\mathbf{x}_0$ is some point on the segment joining $\mathbf{y}$ and $\mathbf{x}$. Letting $\mathbf{x} = f(t)$ and $\mathbf{y} = f(t + h)$, with $|h| < \delta$, we have
\[ \frac{F(t + h) - F(t)}{h} = \nabla g(\mathbf{x}_0) \cdot \frac{f(t + h) - f(t)}{h}. \]
The vector $\mathbf{x}_0$ is now some point on the segment joining $f(t)$ and $f(t + h)$. (Note that $\mathbf{x}_0$ is in the domain $D$ of $g$ because it lies on a radius of a ball contained in $D$.) Since $g$ was assumed continuously differentiable, $\nabla g(\mathbf{x})$ is continuous, and so $\nabla g(\mathbf{x}_0)$ tends to $\nabla g(f(t))$ as $h$ tends to zero. The dot product is continuous, so $F'(t)$ exists, with
\[ F'(t) = \nabla g(f(t)) \cdot f'(t). \]
EXAMPLE 5 The function $f(x, y) = x^3 + y^2$ has a level curve passing through the point $(1, 2)$ and having the equation $x^3 + y^2 = 5$. Then
\[ \nabla f(1, 2) = (3(1)^2, 2(2)) = (3, 4). \]
According to Theorem 1.4 the tangent line to this curve at $(x, y) = (1, 2)$ has equation
\[ (3, 4) \cdot (x - 1, y - 2) = 0, \quad \text{or} \quad 3(x - 1) + 4(y - 2) = 0, \quad \text{or} \quad 3x + 4y = 11. \]
EXAMPLE 6 The function $f(x, y, z) = x^2 + y^2 - z^2$ has for one of its level surfaces a cone $C$ consisting of all points satisfying $x^2 + y^2 - z^2 = 0$. The point $\mathbf{x}_0 = (1, 1, \sqrt{2})$ lies
FIGURE 6.4
Putting together Theorem 1.2 with 1.4, we get the following theorem.
the unit vector $\mathbf{u}$ that points in the direction of maximum increase of the function at $\mathbf{x}_0$, and also find the rate of maximum increase at $\mathbf{x}$.
7. $f(x, y) = x^2 - y^3$ at $(x_0, y_0) = (1, 1)$
8. $g(x, y) = xy^2$ at $(x_0, y_0) = (-1, 2)$
9. $h(x, y, z) = xy \sin z$ at $(x_0, y_0, z_0) = (1, 2, \pi)$
10. $p(x, y, z, w) = (x^2 + y^2 + z^2 + w^2)^{1/2}$ at $(x_0, y_0, z_0, w_0) = (1, 1, 1, 2)$
In Exercises 11 to 14, sketch the vector fields described by the functions from $\mathbb{R}^2$ to $\mathbb{R}^2$ or $\mathbb{R}^3$ to $\mathbb{R}^3$. To do this pick a few points $\mathbf{x}$ in the indicated domain and draw the arrow for $F(\mathbf{x})$ with its tail located at the point $\mathbf{x}$.
11. $F(x, y) = (1, x)$ for $-1 \le x \le 2$, $0 \le y \le 2$
12. $F(x, y) = (-y, x)$ for $x^2 + y^2 \le 4$
13. $F(x, y) = (y, x)$ for $x^2 + y^2 \le 4$
14. $F(x, y, z) = -(x, y, z)$ for $x^2 + y^2 + z^2 \le 4$
In Exercises 15 to 18, first compute $\nabla f$, and then sketch the vector field $F = \nabla f$.
15. $f(x, y) = xy + y^2$
16. $f(x, y, z) = x^2 + y^2 + z^2$
17. $f(x, y, z) = x^2 + y^2$
18. $f(x, y, z) = x - y + z$
In Exercises 19 to 24, find, if possible, a normal vector and the tangent line or plane to each of the following level curves or surfaces at the indicated points.
19. $x^2 + y^2 - z^2 = 2$ at $(x, y, z) = (1, 1, 0)$
20. $x \sin y = 0$ at $(x, y) = (0, \pi/2)$ and at $(x, y) = (0, 0)$
21. $|\mathbf{x}| = 1$ at $\mathbf{x} = \mathbf{e}_1$, the first natural basis vector in $\mathbb{R}^n$
22. $x^2 y + yz + w = 3$ at $(x, y, z, w) = (1, 1, 1, 1)$
23. $xyz = 1$ at $(x, y, z) = (1, 1, 1)$
24. $xyz = 0$ at $(x, y, z) = (1, 2, 0)$
25. If $\mathbb{R}^2 \xrightarrow{f} \mathbb{R}$ is continuously differentiable, its graph is defined implicitly in $\mathbb{R}^3$ as the level surface $S$ of the function $F(x, y, z) = z - f(x, y)$ given by $F(x, y, z) = 0$.
(a) Show that $\nabla F = (-\partial f/\partial x, -\partial f/\partial y, 1)$, which is never the zero vector.
(b) Find a normal vector and the tangent plane to the graph of $f(x, y) = xy + ye^x$ at $(x, y) = (1, 1)$.
26. (a) The function $f(x, y) = x^2 + y^2$ has $\nabla f(0, 0) = (0, 0)$, which fails to indicate that there is a direction of maximum increase for $f$ at $(x, y) = (0, 0)$. Is this reasonable? What happens at $(0, 0)$?
(b) Are there directions of maximum increase for $f(x, y) = xy$ and $g(x, y) = x^2 - y^2$ at $(x, y) = (0, 0)$? Does Theorem 1.2 apply?
27. If $g(x, y) = e^{x+y}$ and $f'(0) = (1, 2)$, use the chain rule to find $F'(0)$, where $F(t) = g(f(t))$ and $f(0) = (1, -1)$.
28. Let $\gamma$ be a curve in $\mathbb{R}^3$ being traversed at time $t = 1$ with speed 2 and in the direction of $(1, -1, 2)$. If $t = 1$ corresponds to the point $(1, 1, 1)$ on $\gamma$, find the rate of change of the function $x + y + z + xyz$ along $\gamma$ at $t = 1$.
29. If $f(x, y, z) = \sin x$ and $F(t) = (\cos t, \sin t, t)$, find $g'(\pi)$, where $g(t) = f(F(t))$.
30. Let $\mathbb{R} \xrightarrow{F} \mathbb{R}^n$ be differentiable. Let $\mathbb{R}^n \xrightarrow{f} \mathbb{R}$ be continuously differentiable, and such that the composition $g(t) = f(F(t))$ exists. If $F'(t_0)$ is tangent to the level surface of $f$ at $F(t_0)$, show that $g'(t_0) = 0$.
31. A spaceship is traveling in $\mathbb{R}^2$ along a path such that at time $t \ge 0$ the ship is at $g(t) = (3t^2, t^3)$. The intensity of gamma radiation at the point $(x, y)$ in $\mathbb{R}^2$ is $I(x, y) = x^2 - y^2$ wherever $I(x, y) \ge 0$. Describe fully, using a labeled sketch where appropriate, the following:
(a) The level curve of $I$ that the ship is on at $t = 1$.
(b) The path of the ship for $t \ge 0$.
(c) The gradient vector of $I$ at the ship's position when $t = 1$.
(d) The ship's velocity vector at $t = 1$.
(e) The time, if there is one, when the ship stops increasing its radiation risk and begins its race to safety. Does its course become more dangerous later on?
32. If $T(x, y, z)$ represents the temperature at a point $(x, y, z)$ of a region $R$ in $\mathbb{R}^3$, the vector field $\nabla T$ is called the temperature gradient. Under certain physical assumptions $\nabla T(x, y, z)$ is negatively proportional to the vector that represents the direction and rate per unit of area of heat flow at $(x, y, z)$. The sets on which $T$ is constant are called isotherms. If the isotherms of a temperature function are concentric spheres, prove that the temperature gradient points either toward or away from the center of the spheres.
33. Show that the vector field defined on $\mathbb{R}^2$ by $F(x, y) = (-y, x)$ is not of the form $\nabla f(x, y)$ for a function $f$. [Hint: Suppose that $\partial f/\partial x\,(x, y) = -y$ and $\partial f/\partial y\,(x, y) = x$. Then differentiate the first equation with respect to $y$ and the second with respect to $x$.]
For each of the following fields 34 to 37, find an $f$ such that $\nabla f = F$. The previous exercise established, by specific example, that some vector fields are not gradient fields. The distinction between gradient fields and nongradient fields is investigated in some detail in Section 2 of Chapter 9. But when a vector field $F$ is a gradient field, a little guesswork based on experience with indefinite integrals will sometimes yield a real-valued function such that $\nabla f = F$. For example, if $F(x, y) = (x, y)$, a little thought leads to the guess that $\nabla(x^2/2 + y^2/2) = F(x, y)$. It follows from Theorem 3.4 of the previous chapter that every two solutions to the problem differ by at most an additive constant.
34. $F(x, y) = (y, x)$
35. $F(x, y, z) = (x, y, z)$
36. $F(x, y) = (e^{x+y}, e^{x+y})$
37. $F(x, y) = (x^2, y^2)$
38. The level surfaces of a function $\mathbb{R}^n \xrightarrow{f} \mathbb{R}$ are called the equipotential surfaces of the vector field $\nabla f$, and $f$ is called the potential function of the field.
(a) Show that the equipotential surfaces are perpendicular to the field.
(b) Find the equipotential surfaces of the field $\nabla f(x, y, z) = (x, y, z)$.
(c) Find the field of which $f(x, y, z) = (x^2 + y^2 + z^2)^{-1/2}$, the Newtonian potential, is the potential function.
(d) Find the field of which $f(x, y) = -\frac{1}{2}\log(x^2 + y^2)$, the logarithmic potential, is the potential function.
(e) Show that the generalized Newtonian potential $f(\mathbf{x}) = |\mathbf{x}|^{2-n}$ in $\mathbb{R}^n$, $n \ge 3$, satisfies $\nabla f(\mathbf{x}) = (2 - n)|\mathbf{x}|^{-n}\mathbf{x}$.
39. The vector equation of motion for the position $\mathbf{x}(t)$ at time $t$ of a single planet relative to a star fixed at the origin has the form $\ddot{\mathbf{x}} = -k|\mathbf{x}|^{-3}\mathbf{x}$, where $k$ is a positive constant depending on the gravitational constant and the masses of the two bodies. See Equations 3.2 in Chapter 12, Section 3.
(a) Show that the magnitude of the acceleration vector obeys an inverse-square law: $|\ddot{\mathbf{x}}| = k/|\mathbf{x}|^2$.
(b) Show that the vector equation is equivalent to a pair of equations, where $\mathbf{x} = (x, y)$:
\[ \ddot{x} = -kx(x^2 + y^2)^{-3/2}, \qquad \ddot{y} = -ky(x^2 + y^2)^{-3/2}. \]
(c) Show that the vector field $F(x, y) = (-kx(x^2 + y^2)^{-3/2}, -ky(x^2 + y^2)^{-3/2})$ is equal to $\nabla f(x, y)$, where $f(x, y) = k(x^2 + y^2)^{-1/2}$.
DEFINE f(x, y) = fo(x + y)
DEFINE g(x, y) = fo(x - y)
FOR x = x1 TO x2 STEP s
  FOR y = y1 TO y2 STEP s
    PLOT ARROW: (x, y) TO (x + f(x, y), y + g(x, y))
  NEXT y
NEXT x
has taken place. What we're often interested in anyway is the relative strength of a
field as it varies from point to point, and that information comes across very well in
a properly scaled sketch.
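The plotting loop above can be sketched in Python as a routine that only computes the arrow endpoints and leaves the drawing to whatever plotting tool is at hand. The function name `arrow_field` and the `scale` parameter are assumptions introduced for this illustration, with `scale` standing in for the scaling of arrow lengths discussed in the text.

```python
def arrow_field(f, g, x1, x2, y1, y2, step, scale=1.0):
    """Return a list of (tail, head) pairs for the field F = (f, g) on a grid."""
    arrows = []
    nx = int(round((x2 - x1) / step))   # number of grid intervals in x
    ny = int(round((y2 - y1) / step))   # number of grid intervals in y
    for i in range(nx + 1):
        x = x1 + i * step
        for j in range(ny + 1):
            y = y1 + j * step
            # The arrow starts at (x, y) and ends at (x, y) + scale * F(x, y).
            head = (x + scale * f(x, y), y + scale * g(x, y))
            arrows.append(((x, y), head))
    return arrows

# The gradient field of xy, namely F(x, y) = (y, x), with arrows shortened by 1/4.
arrows = arrow_field(lambda x, y: y, lambda x, y: x, -2.0, 2.0, -2.0, 2.0, 1.0, scale=0.25)
for tail, head in arrows[:5]:
    print(tail, "->", head)
```

Using an integer index for the grid, rather than repeatedly adding the step, keeps rounding error from dropping the last row or column of arrows.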
Recall that a gradient field is a vector field having the form $F(x, y) = \nabla f(x, y)$, where the real-valued function $f$ is called a potential function of $F$. In $\mathbb{R}^2$, for example,
Considerable insight is needed to distinguish a gradient field from one that isn't a gradient just by looking at sketches such as Figure 6.6. For example, the one in Figure 6.6(a) is a scaled version of the gradient field $\nabla(xy) = (y, x)$ while the one in Figure 6.6(b) is a scaled version of $F(x, y) = (y, -x)$, which is not a gradient field. We'll see later in Chapters 8 and 9 that the "rotational" character of the field on the right is a visual clue that it isn't a gradient field. Theorem 2.5 in Chapter 9, Section 2 provides a computational criterion.
Often a potential function $f$ has an infinite singularity, that is, a point of its domain where $f$ tends to infinity. At such a point $\nabla f$ will not only fail to be defined but at nearby points will have gradient vectors that become arbitrarily long. If such a singularity occurs in the course of making a sketch, we need to make allowances for it in our algorithm. The simplest thing to do is arrange it so that the plotting steps give the troublesome point a wide berth.
In principle, we can make a perspective sketch of a vector field in $\mathbb{R}^3$. The trouble is that the tendency of the arrows to overlap each other often makes the picture hard to interpret.
Flow Lines. Suppose a continuously differentiable vector field $F$ is defined on some open subset $S$ of $\mathbb{R}^n$. The image of a parametrized curve $\mathbf{x} = g(t)$, with image in $S$, is called a flow line of $F$ if the velocity vector of the curve at a point $\mathbf{x} = g(t)$
satisfies
\[ g'(t) = F(g(t)). \]
[Figure 6.6, panels (a) and (b)]
If F is a velocity field, its flow lines represent the paths of particles moving with
velocities given by F. There is a detailed discussion of this relationship between
vector fields and curves in Chapter 12 on systems of differential equations. For now
we simply observe that given a reasonably accurate sketch of a vector field, we can
sketch in some typical flow lines, as in Figure 6.6. The idea is to draw curves that
appear to be tangent to the arrows of the field. The speed and direction of traversal
of the curve are determined by the length and direction of the arrows of the field,
though if the field arrows are scaled as previously described, then only direction and
relative speed are apparent directly from the sketch.
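A flow line can also be traced numerically by stepping along the field. The sketch below does this for the rotational field $F(x, y) = (y, -x)$ using the midpoint method; the choice of method, the step size, and the names `field` and `flow_line` are all assumptions made for illustration.

```python
import math

def field(x, y):
    # Rotational field F(x, y) = (y, -x).
    return (y, -x)

def flow_line(x, y, dt, steps):
    """Approximate a solution of g'(t) = F(g(t)) starting at (x, y)."""
    path = [(x, y)]
    for _ in range(steps):
        fx, fy = field(x, y)
        mx, my = x + 0.5 * dt * fx, y + 0.5 * dt * fy   # half step to the midpoint
        fx, fy = field(mx, my)
        x, y = x + dt * fx, y + dt * fy                 # full step using the midpoint slope
        path.append((x, y))
    return path

path = flow_line(1.0, 0.0, 0.001, 1000)   # follow the flow line for 0 <= t <= 1
end_x, end_y = path[-1]
print(end_x, end_y, math.hypot(end_x, end_y))
```

The computed points should stay very close to the unit circle, consistent with a flow line of this field through $(1, 0)$ being a circle centered at the origin.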
EXERCISES
Plot the vector fields 1 to 6 in $\mathbb{R}^2$. Use scaling of the vector lengths as it seems appropriate. Add some sketches of typical flow lines to each of the vector field sketches.
1. $F(x, y) = (x, y)$, $-2 \le x \le 2$, $-2 \le y \le 2$
2. $F(x, y) = (x, 0)$, $-4 \le x \le 4$, $-4 \le y \le 4$
3. $F(x, y) = (2x, x)$, $-4 \le x \le 4$, $-4 \le y \le 4$
4. $F(x, y) = \nabla(x^2 y^2)$, $-2 \le x \le 2$, $-2 \le y \le 2$
5. $F(x, y) = \nabla e^{x+y}$, $-4 \le x \le 4$, $-4 \le y \le 4$
6. $F(x, y) = \nabla \frac{1}{2}\log(x^2 + y^2)$, $-4 \le x \le 4$, $-4 \le y \le 4$, $(x, y) \neq (0, 0)$
7. (a) Verify that if $a$ and $b$ are constants, not both zero, then the image of the curve parametrized by
\[ (x, y) = (a \cos t + b \sin t, \; b \cos t - a \sin t) \]
is a flow line of the vector field $F(x, y) = (y, -x)$.
(b) Show that the flow lines of part (a) are circles.
8. (a) Verify that if $a$ and $b$ are constants, not both zero, then the image of the curve parametrized by
\[ (x, y) = (a \cosh t + b \sinh t, \; b \cosh t + a \sinh t) \]
is a flow line of the vector field $F(x, y) = (y, x)$.
(b) Show that the flow lines of part (a) are either hyperbolas or lines.
\[ g \circ f(\mathbf{x}) = g(f(\mathbf{x})) \]
for every vector $\mathbf{x}$ such that $\mathbf{x}$ is in the domain of $f$ and $f(\mathbf{x})$ is in the domain of $g$. The domain of $g \circ f$ consists of those vectors $\mathbf{x}$ that are carried by $f$ into the domain of $g$. An abstract picture of the composition of two functions is shown in Figure 6.7.
EXAMPLE 1 Suppose that we are given a two-dimensional region in which the points move about according to some specified law. It may be known that, for a given position with coordinates $(u, v)$, a point is always to be found at some definite later time in a position $(x, y)$. Then $(x, y)$ and $(u, v)$ are related by equations of the form
\[ x = g_1(u, v), \qquad y = g_2(u, v). \]
In vector notation these equations might be written
\[ \mathbf{x} = g(\mathbf{u}), \]
where $\mathbf{x} = (x, y)$, $\mathbf{u} = (u, v)$, and $g$ has coordinate functions $g_1$, $g_2$. Now suppose that the position $\mathbf{u} = (u, v)$ of a point is itself determined as a function of other variables $(s, t)$ by equations
\[ u = f_1(s, t), \qquad v = f_2(s, t). \]
These may be written in vector form as
\[ \mathbf{u} = f(\mathbf{s}), \]
where $\mathbf{s} = (s, t)$ and $f$ has coordinate functions $f_1$, $f_2$. Then $(x, y)$ and $(s, t)$ are related by
\[ \mathbf{x} = g(f(\mathbf{s})). \]
It's a remarkable fact that for differentiable f and g the previous formula is the
correct extension of the chain rule if we take care to evaluate the derivatives at the
proper points, as in the next example.
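Consider the special case in which $f$ is a function of a single real variable and $g$ is real-valued. Then $g \circ f$ is a real function of a real variable. Theorem 1.3 shows that
\[ (g \circ f)'(t) = \nabla g(f(t)) \cdot f'(t). \]
The right side of this last equation equals a matrix product in terms of derivative matrices
\[ g'(f(t)) = \left( \frac{\partial g}{\partial y_1}(f(t)), \ldots, \frac{\partial g}{\partial y_m}(f(t)) \right) \]
and
\[ f'(t) = \begin{pmatrix} f_1'(t) \\ \vdots \\ f_m'(t) \end{pmatrix}. \]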
The chain rule is valid under the assumption that f and g are differentiable, but
for a proof requiring less detailed analysis, we make the stronger assumption of
continuous differentiability.
2.2 Chain Rule. Let $f$ be continuously differentiable near $\mathbf{x}$, and let $g$ be continuously differentiable near $f(\mathbf{x})$, with
\[ (g \circ f)'(\mathbf{x}) = g'(f(\mathbf{x})) f'(\mathbf{x}). \]
Proof. We need only show that the derivative matrix of $g \circ f$ at $\mathbf{x}$ has continuous entries given by the entries in the product of $g'(f(\mathbf{x}))$ and $f'(\mathbf{x})$. These matrices have the respective forms
But this expression is just the dot product of two vectors $\nabla g_i(f(\mathbf{x}))$ and $(\partial f/\partial x_j)(\mathbf{x})$. It follows from Theorem 1.3 that
because we are differentiating with respect to the single variable $x_j$. This establishes the matrix relation, because the entries in $(g \circ f)'(\mathbf{x})$ are by definition given by the right side of Equation 2. Since $g$ and $f$ are continuously differentiable, Formula (1) represents a continuous function of $\mathbf{x}$ for each $i$ and $j$. Hence $g \circ f$ is continuously differentiable. •
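Let
\[ g'(u, v) = \begin{pmatrix} v & u \\ 1 & 1 \end{pmatrix} \quad \text{and} \quad f'(x, y) = \begin{pmatrix} 2x & 2y \\ 2x & -2y \end{pmatrix}. \]
To find $(g \circ f)'(2, 1)$, we note that $f(2, 1) = (5, 3)$ and compute
\[ g'(5, 3) = \begin{pmatrix} 3 & 5 \\ 1 & 1 \end{pmatrix} \quad \text{and} \quad f'(2, 1) = \begin{pmatrix} 4 & 2 \\ 4 & -2 \end{pmatrix}. \]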
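The matrix product in the chain rule can be checked numerically. The sketch below assumes the specific functions $f(x, y) = (x^2 + y^2, x^2 - y^2)$ and $g(u, v) = (uv, u + v)$; these are hypothetical choices that are consistent with the value $f(2, 1) = (5, 3)$ and with derivative matrices of the form shown, not definitions confirmed by the text. The helpers `matmul` and `numeric_jacobian` are likewise illustrative.

```python
def f(x, y):
    # Assumed inner function: f(2, 1) = (5, 3).
    return (x * x + y * y, x * x - y * y)

def g(u, v):
    # Assumed outer function with derivative matrix [[v, u], [1, 1]].
    return (u * v, u + v)

def f_prime(x, y):
    return [[2 * x, 2 * y], [2 * x, -2 * y]]

def g_prime(u, v):
    return [[v, u], [1, 1]]

def matmul(a, b):
    # Plain matrix product of two lists of rows.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def numeric_jacobian(func, x, y, h=1e-6):
    # Central differences; column j holds the partials with respect to variable j.
    fx1, fx0 = func(x + h, y), func(x - h, y)
    fy1, fy0 = func(x, y + h), func(x, y - h)
    return [[(fx1[i] - fx0[i]) / (2 * h), (fy1[i] - fy0[i]) / (2 * h)] for i in range(2)]

def composite(x, y):
    return g(*f(x, y))

chain = matmul(g_prime(*f(2, 1)), f_prime(2, 1))   # g'(f(2, 1)) f'(2, 1)
approx = numeric_jacobian(composite, 2.0, 1.0)     # direct estimate of (g o f)'(2, 1)
print(chain)
print(approx)
```

The two printed matrices should agree to within the accuracy of the finite-difference estimate.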
It's common practice in calculus to denote a function by the same symbol as a typical element of its range. Thus the derivative of a function $\mathbb{R} \xrightarrow{f} \mathbb{R}$ is often denoted, in conjunction with the equation $y = f(x)$, by $dy/dx$. Similarly, the partial derivatives of a function $\mathbb{R}^3 \xrightarrow{f} \mathbb{R}$ are commonly written as
\[ \frac{\partial w}{\partial x}, \quad \frac{\partial w}{\partial y}, \quad \text{and} \quad \frac{\partial w}{\partial z} \]
along with the explanatory equation $w = f(x, y, z)$. For example, if $w = f(x, y, z) = xy^2 e^{x + 3z}$, then
This notation has the disadvantage that it doesn't contain specific reference to the function being differentiated, but it's convenient and is the traditional language of calculus. To illustrate its convenience, suppose that the functions $g$ and $f$ are given by real-valued coordinate functions
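\[
\begin{pmatrix} \dfrac{\partial w}{\partial s} & \dfrac{\partial w}{\partial t} \end{pmatrix}
= \begin{pmatrix} \dfrac{\partial g}{\partial x} & \dfrac{\partial g}{\partial y} & \dfrac{\partial g}{\partial z} \end{pmatrix}
\begin{pmatrix}
\dfrac{\partial x}{\partial s} & \dfrac{\partial x}{\partial t} \\
\dfrac{\partial y}{\partial s} & \dfrac{\partial y}{\partial t} \\
\dfrac{\partial z}{\partial s} & \dfrac{\partial z}{\partial t}
\end{pmatrix}.
\]
Matrix multiplication yields
\[
\begin{aligned}
\frac{\partial w}{\partial s} &= \frac{\partial g}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial g}{\partial y}\frac{\partial y}{\partial s} + \frac{\partial g}{\partial z}\frac{\partial z}{\partial s} \\
\frac{\partial w}{\partial t} &= \frac{\partial g}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial g}{\partial y}\frac{\partial y}{\partial t} + \frac{\partial g}{\partial z}\frac{\partial z}{\partial t}
\end{aligned}
\tag{A}
\]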
We get a slightly different-looking application of the chain rule if the domain space of $f$ is one-dimensional, that is, if $f$ is a function of one variable. Consider, for example,
\[ \frac{d(g \circ f)}{dt} = \frac{dw}{dt}. \]
The derivatives of $g$ and $f$ are defined, respectively, by the derivative matrices
c: ~:) and ( ]~ )
\[ \frac{dw}{dt} = \nabla g \cdot f'. \]
Let us suppose that both $f$ and $g$ are real-valued functions of one variable, the situation we meet in one-variable calculus. The derivatives of $f$ at $t$, of $g$ at $s = f(t)$, and of $g \circ f$ at $t$ are represented by the three 1-by-1 derivative matrices $f'(t)$, $g'(s)$, and $(g \circ f)'(t)$, respectively. The chain rule implies that
\[ (g \circ f)'(t) = g'(s) f'(t). \]
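Given that
\[ \begin{cases} x = u^2 + v^3 \\ y = e^{uv} \end{cases} \quad \text{and} \quad \begin{cases} u = t + 1 \\ v = e^t, \end{cases} \]
\[ g\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} u^2 + v^3 \\ e^{uv} \end{pmatrix} = \begin{pmatrix} x \\ y \end{pmatrix}, \qquad -\infty < u < \infty, \; -\infty < v < \infty. \]
\[ \begin{pmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\ \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{pmatrix} = \begin{pmatrix} 2u & 3v^2 \\ ve^{uv} & ue^{uv} \end{pmatrix}. \]
The dependence of $x$ and $y$ on $t$ is given by
Hence the two derivatives $dx/dt$ and $dy/dt$ are the entries in the derivative matrix of the composite function $g \circ f$. The chain rule therefore implies that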
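That is,
\[
\begin{aligned}
\frac{dx}{dt} &= \frac{\partial x}{\partial u}\frac{du}{dt} + \frac{\partial x}{\partial v}\frac{dv}{dt} \\
\frac{dy}{dt} &= \frac{\partial y}{\partial u}\frac{du}{dt} + \frac{\partial y}{\partial v}\frac{dv}{dt}.
\end{aligned}
\tag{D}
\]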
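Multiplying out with $f'(t)$ and $g'(u, v)$ gives
\[
\begin{aligned}
\frac{dx}{dt} &= 2u + 3v^2 e^t \\
\frac{dy}{dt} &= ve^{uv} + ue^{uv + t}.
\end{aligned}
\]
At $t = 0$ we have $u = 1$ and $v = 1$, so
\[ \frac{dx}{dt}(0) = 2 + 3 = 5, \qquad \frac{dy}{dt}(0) = e + e = 2e. \]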
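The values $dx/dt(0) = 5$ and $dy/dt(0) = 2e$ can be checked against a finite-difference estimate of the composite function. The helper names below are assumptions made for this sketch.

```python
import math

def f(t):
    # Inner function: (u, v) = (t + 1, e^t).
    return (t + 1.0, math.exp(t))

def g(u, v):
    # Outer function: (x, y) = (u^2 + v^3, e^{uv}).
    return (u * u + v ** 3, math.exp(u * v))

def chain_rule_derivative(t):
    # Formulas (D): each derivative is a sum over the intermediate variables u and v.
    u, v = f(t)
    du, dv = 1.0, math.exp(t)
    dx = 2 * u * du + 3 * v * v * dv
    dy = v * math.exp(u * v) * du + u * math.exp(u * v) * dv
    return (dx, dy)

def numeric_derivative(t, h=1e-6):
    # Central-difference estimate of d/dt of g(f(t)).
    x1, y1 = g(*f(t + h))
    x0, y0 = g(*f(t - h))
    return ((x1 - x0) / (2 * h), (y1 - y0) / (2 * h))

exact = chain_rule_derivative(0.0)
approx = numeric_derivative(0.0)
print(exact, approx)
```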
The definition of matrix multiplication gives the derivative formulas resulting from applications of the chain rule a formal pattern that helps the memory. The pattern is particularly evident when the coordinate functions are denoted by scalar variables, as in Formulas (A), (B), (C), and (D). All formulas of the general form
\[ \cdots + \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t} + \cdots \]
have the disadvantage of not containing explicit reference to the points at which the various derivatives are evaluated. It's essential to know this information, and we can find it by going to the formula

\[ \frac{\partial x}{\partial u} = -1, \quad \frac{\partial x}{\partial v} = 3, \quad \frac{\partial y}{\partial u} = 5, \quad \frac{\partial y}{\partial v} = 0. \]
Suppose also that $f(1, 2) = 2$ and $g(1, 2) = -2$. What is $\partial z/\partial u\,(1, 2)$? The chain rule implies that
\[ \frac{\partial z}{\partial u} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial u}. \tag{E} \]
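When $u = 1$ and $v = 2$, we are given that
\[ x = f(1, 2) = 2 \quad \text{and} \quad y = g(1, 2) = -2. \]
Hence
\[ \frac{\partial z}{\partial x}(2, -2) = y \Big|_{x=2,\,y=-2} = -2, \qquad \frac{\partial z}{\partial y}(2, -2) = x \Big|_{x=2,\,y=-2} = 2. \]
To obtain $\partial z/\partial u$ at $(u, v) = (1, 2)$, it's necessary to know at what points to evaluate the partial derivatives that appear in Equation (E). In greater detail, the chain rule implies that
\[ \frac{\partial z}{\partial u}(1, 2) = \frac{\partial z}{\partial x}(2, -2)\frac{\partial x}{\partial u}(1, 2) + \frac{\partial z}{\partial y}(2, -2)\frac{\partial y}{\partial u}(1, 2). \]
Hence
\[ \frac{\partial z}{\partial u}(1, 2) = (-2)(-1) + (2)(5) = 12. \]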
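EXAMPLE 9 If $w = f(ax^2 + bxy + cy^2)$ and $y = x^2 + x + 1$, we may want $dw/dx$ at $x = -1$. The solution relies on formulas that follow from the chain rule such as (A), (B), (C), (D), and (E). Let $z$ be defined by
\[ z = ax^2 + bxy + cy^2. \]
Then $w = f(z)$, and since $\partial x/\partial x = 1$,
\[ \frac{dz}{dx} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial x} + \frac{\partial z}{\partial y}\frac{dy}{dx} = \frac{\partial z}{\partial x} + \frac{\partial z}{\partial y}\frac{dy}{dx}. \]
Hence
\[ \frac{dw}{dx} = \frac{df}{dz}\frac{dz}{dx} = \frac{df}{dz}\left( \frac{\partial z}{\partial x} + \frac{\partial z}{\partial y}\frac{dy}{dx} \right) = f'(z)\bigl(2ax + by + (bx + 2cy)(2x + 1)\bigr). \]
If $x = -1$, then $y = 1$, and so $z = a - b + c$. Thus
\[ \frac{dw}{dx}(-1) = f'(a - b + c)(-2a + 2b - 2c). \]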
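The formula of Example 9 can be tried out on a concrete case. The sketch below assumes the outer function $f = \sin$ and the coefficients $a = 1$, $b = 2$, $c = 3$; these are hypothetical choices made only so that the result can be compared with a finite-difference estimate.

```python
import math

A, B, C = 1.0, 2.0, 3.0   # hypothetical coefficients a, b, c

def y_of(x):
    return x * x + x + 1.0

def z_of(x):
    y = y_of(x)
    return A * x * x + B * x * y + C * y * y

def w_of(x):
    return math.sin(z_of(x))            # hypothetical outer function f = sin

def dw_dx_chain(x):
    # dw/dx = f'(z) (2ax + by + (bx + 2cy)(2x + 1)), with f' = cos here.
    y = y_of(x)
    dz_dx = 2 * A * x + B * y + (B * x + 2 * C * y) * (2 * x + 1)
    return math.cos(z_of(x)) * dz_dx

def dw_dx_numeric(x, h=1e-6):
    # Central-difference estimate of dw/dx.
    return (w_of(x + h) - w_of(x - h)) / (2 * h)

exact = dw_dx_chain(-1.0)
approx = dw_dx_numeric(-1.0)
print(exact, approx)
```

With these choices $z = a - b + c = 2$ at $x = -1$, so the chain-rule value can also be compared with $f'(2)(-2a + 2b - 2c)$.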
EXERCISES
1. Assume (a) Find the matrices g'(f(x, y)) and f'(x, y).
2
f ( x ) =( x + xy + 1 ) (b) Use part (a) to find the matrices (g 0 f)'(l, 1) and
y y2 +2 ' (g of)' (0, 0).
,(:)-(T)
= Jx 2 + y 2 + z2 and
+)-(n
2. Assume 7. If g(x, y, ZJ
r cos(}
f (t) - ( f(r,0) = (
rs~n0
)·
and
find g' (f (r, 0)) and f'(r, 0); then multiply these together
and find (g o f)' (2, n).
8. Vector functions f and g are defined by
~
f(I) =( 12 ~4 )
er-2
, -oo < I < oo.
(a) Find the derivative matrix of go f at ( )-
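Let $g$ be a real-valued differentiable function with domain $\mathbb{R}^3$. If $x_0 = (2, 0, 1)$, and
\[ \frac{\partial g}{\partial x}(x_0) = 4, \quad \frac{\partial g}{\partial y}(x_0) = 2, \quad \frac{\partial g}{\partial z}(x_0) = 2, \]
find $d(g \circ f)/dt$ at $t = 2$.
4. Let $z = xy^2$ and suppose that $x = 2u + 3v$. Assume also that $y$ is a function of $u$ and $v$ with the properties that when $(u, v) = (2, 1)$ then $y = -1$, $\partial y/\partial u = 5$ and $\partial y/\partial v = -2$. Find $\partial z/\partial u$ and $\partial z/\partial v$ when $(u, v) = (2, 1)$.
5. Consider the functions
\[ f(u, v) = \begin{pmatrix} u + v \\ u - v \\ u^2 - v^2 \end{pmatrix} \]
and
\[ F(x, y, z) = x^2 + y^2 + z^2 = w. \]
(a) Find the derivative matrix of $F \circ f$ at $(u, v)$.
(b) Find $\partial w/\partial u$ and $\partial w/\partial v$.
6. Let $u = f(x, y)$. Make the change of variables $x = r\cos\theta$, $y = r\sin\theta$. Given that
\[ \frac{\partial f}{\partial x} = x^2 + 2xy - y^2 \quad\text{and}\quad \frac{\partial f}{\partial y} = x^2 - 2xy + 2, \]
find $\partial f/\partial\theta$, when $r = 2$ and $\theta = \pi/2$.
(c) Are the following statements true or false?
(i) Domain of $f$ = domain of $g \circ f$.
(ii) Domain of $g$ = domain of $f \circ g$.
9. Let $v$ be a tangent vector at $x_0$ to a curve defined parametrically by a differentiable vector function $g$. If $x_0$ is in the domain of a differentiable vector function $F$, prove that $F'(x_0)v$, if not zero, is a tangent vector at $F(x_0)$ to the curve defined parametrically by $F \circ g$.
10. The convention of denoting coordinate functions by real variables has its pitfalls. Resolve the following paradox: Let $w = f(x, y, z)$ and $z = g(x, y)$. By the chain rule
\[ \frac{\partial w}{\partial x} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial x} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial x} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial x}. \]
The quantities $x$ and $y$ are unrelated, so that $\partial y/\partial x = 0$. However $\partial x/\partial x = 1$. Hence
\[ \frac{\partial w}{\partial x} = \frac{\partial w}{\partial x} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial x}, \]
and so, subtracting $\partial w/\partial x$ from both sides,
\[ 0 = \frac{\partial w}{\partial z}\frac{\partial z}{\partial x}. \]
In particular, take $w = 2x + y + 3z$ and $z = 5x + 18$. Then
\[ \frac{\partial w}{\partial z} = 3 \quad\text{and}\quad \frac{\partial z}{\partial x} = 5. \]
It follows that $0 = 15$.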
11. If $y = f(x - at) + g(x + at)$, where $a$ is constant and $f$ and $g$ are twice differentiable, show that
\[ a^2\,\frac{\partial^2 y}{\partial x^2} = \frac{\partial^2 y}{\partial t^2} \quad\text{(wave equation)}. \]
12. Let $u(x, y) = f(x + iy) + f(x - iy)$, where $i^2 = -1$. Show that $u_{xx} + u_{yy} = 0$.
13. If $f(tx, ty) = t^n f(x, y)$ for some integer $n$, and for all $x$, $y$, and $t$, show that
\[ x\,\frac{\partial f}{\partial x} + y\,\frac{\partial f}{\partial y} = n f(x, y). \]
14. (a) If
\[ w = f(x, y, z, t), \quad x = g(u, z, t), \quad\text{and}\quad z = h(u, t), \]
write a formula for $dw/dt$, where by this symbol is meant the rate of change of $w$ with respect to $t$, and where all the interrelations of $w$, $x$, $z$, $t$ are taken into account.
(b) If
\[ w = f(x, y, z, t) = 2xy + 3z + t^2, \quad g(u, z, t) = ut\sin z, \quad h(u, t) = 2u + t, \]
evaluate $dw/dt$ at the point $u = 1$, $t = 2$, $y = 3$, by using the formula you derived in part (a) and also by substituting in the functions for $x$ and $z$ and then differentiating.
15. Consider a real-valued function $f(x, y)$ such that
\[ f_x(2, 1) = 3, \quad f_y(2, 1) = -2, \quad f_{xx}(2, 1) = 0, \quad f_{xy}(2, 1) = f_{yx}(2, 1) = 1, \quad f_{yy}(2, 1) = 2. \]
Let $\mathbb{R}^2 \xrightarrow{g} \mathbb{R}^2$ be defined by
\[ g(u, v) = (u + v,\ uv). \]
Find $\partial^2 (f \circ g)/\partial v\,\partial u$ at $(1, 1)$.
In Exercises 16 to 19, let $f$ be real-valued and differentiable.
16. If $u(x, y) = f(ax + by)$, show that $b\,\partial u/\partial x = a\,\partial u/\partial y$.
17. If $u(x, y) = f(xy)$, show that $x\,\partial u/\partial x = y\,\partial u/\partial y$.
18. If $u(x, y) = f(x/y)$, show that $x\,\partial u/\partial x = -y\,\partial u/\partial y$, $y \neq 0$.
19. If $u(x, y) = f(x^2 + y^2)$, show that $y\,\partial u/\partial x = x\,\partial u/\partial y$.
If in Exercises 20 to 25 $f$ and $g$ are of the following types, decide whether $g \circ f$, or $f \circ g$, or neither one, can possibly be defined.
20. $f: \mathbb{R}^2 \to \mathbb{R}^2$, $g: \mathbb{R}^2 \to \mathbb{R}^3$
21. $f: \mathbb{R}^3 \to \mathbb{R}^2$, $g: \mathbb{R}^2 \to \mathbb{R}$
22. $f: \mathbb{R} \to \mathbb{R}^2$, $g: \mathbb{R} \to \mathbb{R}^2$
23. $f: \mathbb{R}^3 \to \mathbb{R}^2$, $g: \mathbb{R}^3 \to \mathbb{R}^3$
24. $f: \mathbb{R} \to \mathbb{R}^2$, $g: \mathbb{R}^3 \to \mathbb{R}^2$
25. $f: \mathbb{R} \to \mathbb{R}^3$, $g: \mathbb{R}^3 \to \mathbb{R}^3$
*26. A 2-dimensional Hamiltonian system is a pair of equations of the form
\[ dx/dt = H_y(x, y, t), \qquad dy/dt = -H_x(x, y, t). \]
The function $H$ of three variables that determines the system is called its Hamiltonian. Suppose that the pair $(x(t), y(t))$ satisfies the system, and consider two functions of $t$:
\[ \text{(i)}\ \frac{d}{dt}\bigl[H(x(t), y(t), t)\bigr], \qquad \text{(ii)}\ H_t(x(t), y(t), t), \]
where the partial derivative of the Hamiltonian in (ii) is computed before substituting $x(t)$ and $y(t)$ for $x$ and $y$.
(a) Show that (i) and (ii) are equal as functions of $t$.
(b) Show that if $H$ is independent of $t$, then the curve parametrized by $(x(t), y(t))$ lies on a level curve of $H$.
2B Changing Variables
One of the most important uses of the chain rule is computing the effect of a change of
variable on the form of important expressions such as $|\nabla f|$, the length of a gradient.
In doing computations of this kind it's often simpler and clearer to use the subscript
notation for partial derivatives, as in the next example.
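Hence $\nabla u = (\bar u_z + \bar u_w,\ \bar u_z - \bar u_w)$. Squaring and adding the coordinates of
$\nabla u$ gives
\[ |\nabla u|^2 = (\bar u_z + \bar u_w)^2 + (\bar u_z - \bar u_w)^2 = 2\,(\bar u_z^2 + \bar u_w^2). \]
This equation tells us that in this case we can compute $|\nabla u|$ directly from the length
of $(\bar u_z, \bar u_w)$ if we just multiply this length by $\sqrt{2}$.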
EXAMPLE 11  For functions of a point $(x, y)$ in $\mathbb{R}^2$, the Laplace operator $\Delta$ acting on twice continuously
differentiable functions $u = u(x, y)$ produces a continuous function $\Delta u$:
\[ \Delta u = u_{xx} + u_{yy}. \]
Then
\[ u_{xx} = (u_z)_x + (u_w)_x = (u_{zz} z_x + u_{zw} w_x) + (u_{wz} z_x + u_{ww} w_x) \]
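\[ \begin{pmatrix} z \\ w \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \tfrac12 & \tfrac12 \\ \tfrac12 & -\tfrac12 \end{pmatrix} \begin{pmatrix} z \\ w \end{pmatrix}. \]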
These matrix equations establish a one-to-one correspondence between pairs $(x, y)$
in one copy of $\mathbb{R}^2$ and pairs $(z, w)$ in another copy. We've seen in Section 5 on
determinants in Chapter 2 that a linear coordinate change $z = Ax$ with square matrix
$A$ is one-to-one precisely when $\det A \neq 0$. Establishing an analogous criterion for
nonlinear transformations is possible only locally, meaning in some neighborhood
of a point. The general result, too technical to prove here, is as follows.
Briefly the theorem says that $F$ has a continuously differentiable local inverse in
some neighborhood of a point $x_0$ where $F'(x_0)$ is invertible, or equivalently where
$\det F'(x_0) \neq 0$. The scalar $\det F'(x)$ is called the Jacobian determinant of $F(x)$,
and it's crucial for changing variables in multiple integrals in Chapter 7.
The pair of equations $x = u + v$, $y = u^2 - v^2$ determines a transformation $F$ from
points $(u, v)$ in $\mathbb{R}^2$ to points $(x, y)$ in $\mathbb{R}^2$. But note that $F$ sends every point $(u, v)$
for which $u = -v$ onto the single point $(x, y) = (0, 0)$. To put it another way, the
entire line $u + v = 0$ gets sent by $F$ into the single point $(0, 0)$, so without some
restriction on its domain the transformation $F$ can't be one-to-one. On the other
hand, the derivative matrix of $F$ at $(u, v)$ is
\[ F'(u, v) = \begin{pmatrix} 1 & 1 \\ 2u & -2v \end{pmatrix}, \quad\text{so}\quad \det F'(u, v) = -2(u + v). \]
The inverse function theorem implies that $F$ has a continuously differentiable inverse
defined in a neighborhood of every point $F(u, v)$ for which $\det F'(u, v) = -2(u + v) \neq 0$.
Note that these are exactly the points not on the line $u + v = 0$, which is
collapsed by $F$ into a single point. For this particular transformation we can actually
compute the coordinate functions for the inverse where it exists:
\[ u = \frac12\left(x + \frac{y}{x}\right), \qquad v = \frac12\left(x - \frac{y}{x}\right), \qquad x = u + v \neq 0. \]
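The inverse formulas and the Jacobian determinant above can be spot-checked numerically. The following is a minimal sketch; the function names and the sample point are illustrative choices, not part of the text.

```python
def F(u, v):
    # The transformation x = u + v, y = u^2 - v^2.
    return (u + v, u * u - v * v)

def jacobian_det(u, v):
    # Determinant of F'(u, v) = [[1, 1], [2u, -2v]].
    return 1.0 * (-2.0 * v) - 1.0 * (2.0 * u)

def F_inverse(x, y):
    # Valid only where x = u + v is nonzero.
    return (0.5 * (x + y / x), 0.5 * (x - y / x))

u, v = 1.5, -0.25                 # an arbitrary point off the line u + v = 0
x, y = F(u, v)
print(jacobian_det(u, v), F_inverse(x, y))
print(F(2.0, -2.0), F(-3.0, 3.0))  # two points on u + v = 0
```

The round trip should recover (u, v), and the last line shows two different points on the line u + v = 0 landing on the same image point, which is why no inverse can exist there.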
Figure 6.8(b) shows some image curves of vertical and horizontal line segments in
the (u, v)-plane. The points on the diagonal in Figure 6.8(a) all have (x, y) = (0, 0)
as their image.
We treat more examples in detail in Section 5 where both sets of variables have
standard geometric interpretations.
EXERCISES
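1. Let $u(x, y)$ be differentiable for all $(x, y)$ in $\mathbb{R}^2$. Let $x = s + t$, $y = s - t$ and $\bar u(s, t) = u(s + t, s - t)$. Use the chain rule to show that
\[ \left(\frac{\partial \bar u}{\partial s}\right)^2 + \left(\frac{\partial \bar u}{\partial t}\right)^2 = 2\left(\frac{\partial u}{\partial x}\right)^2 + 2\left(\frac{\partial u}{\partial y}\right)^2. \]
7. As in Example 10 of the text, show that if $(x, y)$ and $(z, w)$ are related by
\[ z = (x + y)/\sqrt{2}, \qquad w = (x - y)/\sqrt{2}, \]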
*11. For a continuously differentiable function $\mathbb{R} \xrightarrow{f} \mathbb{R}$ the inverse function theorem sharpens slightly to assert that if $f'(x_0) \neq 0$, then $f$ is either strictly increasing or else strictly decreasing in some neighborhood of $x_0$. Prove this using the mean-value theorem for derivatives, and without appealing to the statement of the inverse function theorem. Make sure to make use of the continuity of $f'$; the conclusion is false without that assumption.
12. The conditions of the inverse function theorem guarantee the existence of a continuously differentiable inverse function. If one of the main hypotheses, $F'(x_0) \neq 0$, fails there may still be a merely continuous inverse. Verify this last assertion using the function $\mathbb{R} \xrightarrow{f} \mathbb{R}$ defined by $f(x) = x^3$.
13. Use the chain rule to show that under the assumptions of the inverse function theorem 2.3, $(F^{-1})'(F(x_0)) = (F'(x_0))^{-1}$, that is, the derivative matrix at $F(x_0)$ of the inverse mapping is equal to the inverse of the matrix $F'(x_0)$.
\[ \frac{pv}{t} = k \]
expresses the relationship between pressure p, volume v, and temperature t, of the
gas in some container. Or the equations
\[ x^2 + y^2 + z^2 = 1, \qquad x + y + z = 0 \]
may be interpreted as a relation between the three coordinates of a point on both the
sphere of radius 1 centered at $(0, 0, 0)$ in $\mathbb{R}^3$ and a plane through the origin. In neither
example do the equations give an explicit formula for any of the coordinates in terms
of others. In this section we study the application of calculus to such relations.
For two functions $\mathbb{R}^2 \xrightarrow{F} \mathbb{R}$ and $\mathbb{R} \xrightarrow{f} \mathbb{R}$, the equation
\[ F(x, y) = 0 \]
defines $f$ implicitly if $F(x, f(x)) = 0$ for every $x$ in the domain of $f$. The zero on
the right side of the equation could in practice be an arbitrary constant $c$. But since
$F(x, y) = c$ is equivalent to $G(x, y) = F(x, y) - c = 0$, it's customary to absorb
the constant into the function $F$ in a generic context.
FIGURE 6.8
Let $F(x, y) = x^2 + y^2 - 1$. Then the condition that $F(x, f(x)) = x^2 + (f(x))^2 - 1 = 0$,
for every $x$ in the domain of $f$, is satisfied by each of the following choices for $f$:
\[ f_1(x) = \sqrt{1 - x^2}, \quad -1 \le x \le 1, \]
\[ f_2(x) = -\sqrt{1 - x^2}, \quad -1 \le x \le 1. \]
FIGURE 6.9
Consider a function $\mathbb{R}^{n+m} \xrightarrow{F} \mathbb{R}^m$. We can write an arbitrary element in $\mathbb{R}^{n+m}$
as $(x_1, \ldots, x_n, y_1, \ldots, y_m)$, or as a pair $(x, y)$, where $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_m)$.
In this way $F$ looks like either a function of the two vector variables,
$x$ in $\mathbb{R}^n$ and $y$ in $\mathbb{R}^m$, or a function of the single vector variable $(x, y)$ in $\mathbb{R}^{n+m}$. The
function $\mathbb{R}^n \xrightarrow{G} \mathbb{R}^m$ is defined implicitly by the equation
\[ F(x, y) = 0 \]
if $F(x, G(x)) = 0$ for every $x$ in the domain of $G$.
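\[ y = x + 3, \qquad z = -2x - 2. \]
Notice that the number of equations is the same as the number of variables that we
solve for, two in this example.
In terms of a function $\mathbb{R}^3 \xrightarrow{F} \mathbb{R}^2$, Equations (*) are
\[ F\left(x, \begin{pmatrix} y \\ z \end{pmatrix}\right) = \begin{pmatrix} x + y + z - 1 \\ 2x + z + 2 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \end{pmatrix} x + \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} y \\ z \end{pmatrix} + \begin{pmatrix} -1 \\ 2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \]
The implicitly defined function $\mathbb{R} \xrightarrow{G} \mathbb{R}^2$ is
\[ G(x) = \begin{pmatrix} y \\ z \end{pmatrix} = \begin{pmatrix} x + 3 \\ -2x - 2 \end{pmatrix}. \]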
Although Example I shows that an implicitly defined function need not be con-
tinuous, we'll be primarily concerned in this section with functions that are not only
continuous but also differentiable. The implicit function theorem described in Theo-
rem 3.4 gives conditions for the existence of a differentiable G defined by an equation
$F(x, G(x)) = 0$. However, we consider here the problem of finding the derivative of
$G$ only when $G$ and $G'$ are both assumed to exist. Suppose the functions $\mathbb{R}^2 \xrightarrow{F} \mathbb{R}$
and $\mathbb{R} \xrightarrow{G} \mathbb{R}$ are differentiable and that
\[ F(x, G(x)) = 0 \]
for every $x$ in the domain of $G$. Then the chain rule applied to $F(x, G(x))$ yields,
in terms of the partial derivatives $F_x$ and $F_y$, and with $y = G(x)$,
\[ F_x(x, y) + F_y(x, y)\,\frac{dy}{dx} = 0. \]
Solving the last equation for $dy/dx$ gives
3.1  \[ \frac{dy}{dx} = -\frac{F_x(x, y)}{F_y(x, y)}, \quad\text{if } F_y(x, y) \neq 0. \]
\[ 2x + 2y\,\frac{dy}{dx} = 0. \]
Solving for $dy/dx$ gives
\[ \frac{dy}{dx} = -\frac{x}{y} \quad\text{if } y \neq 0. \]
For example, at the point $(x_0, y_0) = (1/\sqrt{2}, 1/\sqrt{2})$, we have $F(x_0, y_0) = 0$, and
\[ \frac{dy}{dx}(x_0, y_0) = -\frac{2x_0}{2y_0} = -1. \]
Thus the graph of the implicitly defined function has slope $-1$ at $(x_0, y_0)$. Figure 6.10
shows the tangent line there.
FIGURE 6.10  The curve $x^2 + y^2 - 1 = 0$.
The process just described is called implicit differentiation, and it extends to
vector-valued functions of several variables too.
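For the circle, the slope given by Formula 3.1 can be compared with a direct numerical derivative of an explicit branch. This is a small sketch under the assumption that the implicitly defined function near the chosen point is the upper branch y = sqrt(1 - x^2); the helper names are illustrative.

```python
import math

def F_x(x, y):
    # Partial derivative of F(x, y) = x^2 + y^2 - 1 with respect to x.
    return 2.0 * x

def F_y(x, y):
    # Partial derivative of F with respect to y.
    return 2.0 * y

def implicit_slope(x, y):
    # Formula 3.1; requires F_y(x, y) != 0.
    return -F_x(x, y) / F_y(x, y)

def upper_branch(x):
    # Explicit solution of F(x, y) = 0 near points with y > 0.
    return math.sqrt(1.0 - x * x)

x0 = 1.0 / math.sqrt(2.0)
y0 = upper_branch(x0)
h = 1e-6
numeric_slope = (upper_branch(x0 + h) - upper_branch(x0 - h)) / (2.0 * h)
print(implicit_slope(x0, y0), numeric_slope)
```

Both values should be close to -1. At a point such as (1, 0) the call `implicit_slope(1.0, 0.0)` would divide by zero, which mirrors the condition that F_y must be nonzero.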
\[ x^2 + y^2 + z^2 - 6 = 0, \qquad xyz + 2 = 0, \]
suppose that $x$ and $y$ are differentiable functions of $z$, that is, the function defined
implicitly by the equations is of the form $(x, y) = G(z)$. To compute $dx/dz$ and
$dy/dz$, we apply the chain rule to the given equations to get
\[ 2x\,\frac{dx}{dz} + 2y\,\frac{dy}{dz} + 2z = 0, \]
\[ yz\,\frac{dx}{dz} + xz\,\frac{dy}{dz} + xy = 0. \]
We can solve these new equations for $dx/dz$ and $dy/dz$. The solution is
\[ \begin{pmatrix} dx/dz \\ dy/dz \end{pmatrix} = \begin{pmatrix} \dfrac{x(y^2 - z^2)}{z(x^2 - y^2)} \\[2ex] \dfrac{y(z^2 - x^2)}{z(x^2 - y^2)} \end{pmatrix}, \]
which is the matrix $G'(z)$. Notice that the corresponding values for $x$ and $y$ have to be
known to make the formula completely explicit. That is, from the information given
so far, there is no possible way of evaluating $dx/dz$ at $z = 1$. On the other hand, given
the point $(x, y, z) = (1, -2, 1)$ satisfying both equations, we have $(dx/dz)(1) = -1$.
The reason is that, just as in Example 1, there is more than one function $f$ defined
implicitly by the given equations. By specifying a particular point on its graph, we
determine $f$ uniquely in the vicinity of the point.
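The two differentiated equations form a linear system in dx/dz and dy/dz, so the value at a given point can be computed mechanically. The sketch below solves that 2-by-2 system at the point (1, -2, 1) and compares the result with a closed form obtained by solving the same system symbolically by hand; the closed form and the helper names are assumptions of this sketch rather than quotations from the text.

```python
def solve_2x2(a11, a12, a21, a22, b1, b2):
    # Cramer's rule for a 2-by-2 linear system.
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("coefficient matrix is singular")
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - b1 * a21) / det)

def implicit_derivatives(x, y, z):
    # 2x dx/dz + 2y dy/dz = -2z   and   yz dx/dz + xz dy/dz = -xy
    return solve_2x2(2 * x, 2 * y, y * z, x * z, -2 * z, -x * y)

def closed_form(x, y, z):
    # Hand-derived solution of the same system (valid where z(x^2 - y^2) != 0).
    denom = z * (x * x - y * y)
    return (x * (y * y - z * z) / denom, y * (z * z - x * x) / denom)

point = (1.0, -2.0, 1.0)          # satisfies both original equations
print(implicit_derivatives(*point), closed_form(*point))
```

At this point the system gives dx/dz = -1, in agreement with the value quoted above. The singular case, where x^2 = y^2 or z = 0, is exactly where the linear system cannot be solved uniquely.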
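Consider
\[ xu + yv + zw = 1, \]
\[ x + y + z + u + v + w = 0, \]
\[ xy + zuv + w = 1. \]
Suppose that each of $x$, $y$, and $z$ is a function of $u$, $v$, and $w$. To find the partial
derivatives of $x$, $y$, and $z$ with respect to $w$, we differentiate the three equations using
the chain rule.
\[ u\,\frac{\partial x}{\partial w} + v\,\frac{\partial y}{\partial w} + w\,\frac{\partial z}{\partial w} + z = 0, \]
\[ \frac{\partial x}{\partial w} + \frac{\partial y}{\partial w} + \frac{\partial z}{\partial w} + 1 = 0, \]
\[ y\,\frac{\partial x}{\partial w} + x\,\frac{\partial y}{\partial w} + uv\,\frac{\partial z}{\partial w} + 1 = 0. \]
Solving this linear system is simplest using Cramer's rule, giving, for example,
$x_w$ as
\[ \frac{\partial x}{\partial w} = \frac{uv^2 + xz + w - zuv - xw - v}{u^2 v + vy + wx - yw - ux - uv^2}. \]
Similarly, we could solve for $\partial y/\partial w$ and $\partial z/\partial w$. To find partials with respect to $u$,
differentiate the original equations with respect to $u$ and solve for $\partial x/\partial u$, $\partial y/\partial u$,
and $\partial z/\partial u$. Partials with respect to $v$ are found by the same method.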
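The Cramer's rule step can be checked by computing the two 3-by-3 determinants numerically and comparing their quotient with the displayed formula. In the sketch below the sample values are arbitrary and need not satisfy the original three equations, because the comparison is a purely algebraic identity between two expressions for the solution of the linear system; the function names are illustrative.

```python
def det3(m):
    # Determinant of a 3-by-3 matrix given as a list of rows.
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def x_w_by_cramer(x, y, z, u, v, w):
    # Coefficient matrix and right side of the differentiated system.
    A = [[u, v, w], [1.0, 1.0, 1.0], [y, x, u * v]]
    rhs = [-z, -1.0, -1.0]
    # Replace the first column of A by the right side.
    A_x = [[rhs[r]] + A[r][1:] for r in range(3)]
    return det3(A_x) / det3(A)

def x_w_formula(x, y, z, u, v, w):
    num = u * v**2 + x * z + w - z * u * v - x * w - v
    den = u**2 * v + v * y + w * x - y * w - u * x - u * v**2
    return num / den

sample = (0.3, -1.2, 0.7, 1.1, 2.0, -0.4)   # arbitrary values of (x, y, z, u, v, w)
print(x_w_by_cramer(*sample), x_w_formula(*sample))
```

Replacing the second or third column instead of the first gives the corresponding quotients for the partials of y and z with respect to w.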
The computation indicated in Example 5 leads to the nine entries in the derivative
matrix of an implicitly defined vector function. For the computation to work it's
necessary to have the number of given equations equal the number of implicitly
defined coordinate functions, just as in Example 2. To get more insight into the
reason for this requirement, suppose we are given a differentiable vector function
\[ F(u, v, x, y) = \begin{pmatrix} F_1(u, v, x, y) \\ F_2(u, v, x, y) \end{pmatrix} \]
\[ F_1(u, v, x, y) = 0, \qquad F_2(u, v, x, y) = 0 \]
The last matrix on the right is the derivative matrix G'(u, v). Solving for it, we get
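3.2  \[ G'(u, v) = \begin{pmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{pmatrix} = -\begin{pmatrix} \dfrac{\partial F_1}{\partial x} & \dfrac{\partial F_1}{\partial y} \\[2ex] \dfrac{\partial F_2}{\partial x} & \dfrac{\partial F_2}{\partial y} \end{pmatrix}^{-1} \begin{pmatrix} \dfrac{\partial F_1}{\partial u} & \dfrac{\partial F_1}{\partial v} \\[2ex] \dfrac{\partial F_2}{\partial u} & \dfrac{\partial F_2}{\partial v} \end{pmatrix} \]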
To be able to solve uniquely for the matrix G'(u, v) it's essential that the inverse
matrix appearing in Equation 3.2 should exist. In particular, this requires that the
matrix to be inverted be square, in other words that the number of equations originally
given must equal the number of implicitly determined variables; equivalently, the
number of variables you solve for must be the same as the number of equations
that determine them, just as for linear systems for which you may expect a unique
solution.
The analog of Equation 3.2 holds for an arbitrary number of equations under suit-
able hypotheses and the proof follows similar lines. We summarize the generalization
of Equations 3.1 and 3.2 as follows.
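3.3 Theorem. Suppose $\mathbb{R}^{n+m} \xrightarrow{F} \mathbb{R}^m$ and $\mathbb{R}^n \xrightarrow{G} \mathbb{R}^m$ are differentiable and
that $y = G(x)$ satisfies $F(x, y) = 0$ for all $x$ in some open subset of $\mathbb{R}^n$. Then
\[ G'(x) = -F_y^{-1}(x, G(x))\,F_x(x, G(x)), \]
provided $F_y(x, G(x))$ is invertible.
The subscript notation used in the theorem is illustrated in the next example.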
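\[ F_x(x, y, z) = \begin{pmatrix} 2xy + z \\ z \end{pmatrix} \quad\text{and}\quad F_{(y,z)}(x, y, z) = \begin{pmatrix} x^2 & x \\ z & x + y \end{pmatrix}. \]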
Note that the vector $y$ must be chosen so that $F_y$ is a square matrix and that the
implicit differentiation formula works only when that matrix is invertible. Thus we
must choose $(x, y)$ so that
\[ \det F_y(x, y) \neq 0. \]
For the choice made in Example 6, we have
\[ \det \begin{pmatrix} x^2 & x \\ z & x + y \end{pmatrix} = x^3 + x^2 y - xz, \]
3.4 Implicit Function Theorem. Let $\mathbb{R}^{n+m} \xrightarrow{F} \mathbb{R}^m$ be a continuously differentiable
function. Suppose for some $x_0$ in $\mathbb{R}^n$ and some $y_0$ in $\mathbb{R}^m$ that
(i) $F(x_0, y_0) = 0$ and (ii) $F_y(x_0, y_0)$ is an invertible $m$-by-$m$ matrix.
Note that condition (ii) of the statement is equivalent to $\det F_y(x_0, y_0) \neq 0$
and is just what is needed to make sense of the formula for computing $G'(x)$ in
Theorem 3.3. Theorem 3.4 is useful for identifying points at which the level sets
$S$ of a function are smooth. This identification typically has to be done piecemeal,
treating one "patch" of a curve or surface at a time.
EXERCISES
1. The equation $x^2 + y^2 - 1 = 0$ is satisfied by many values of $(x, y)$, including $(1, 0)$, $(0, 1)$, and $(1/\sqrt{2}, 1/\sqrt{2})$. Use implicit differentiation in both parts (a) and (b).
(a) Express $dy/dx$ in terms of $x$ and $y$, and evaluate at $(x, y) = (1/\sqrt{2}, 1/\sqrt{2})$. Does it make sense to evaluate at $(x, y) = (0, 1)$ or $(x, y) = (-1, 1)$?
(b) Express $dx/dy$ in terms of $x$ and $y$, and evaluate at $(x, y) = (1/\sqrt{2}, -1/\sqrt{2})$. Does it make sense to evaluate at $(x, y) = (1, 0)$ or $(x, y) = (0, 1)$?
(c) Solve the given equation explicitly for $y$ in terms of $x$ and interpret the results of part (a) graphically.
(d) Solve the given equation explicitly for $x$ in terms of $y$ and interpret the results of part (b) graphically.
2. The equation $x^2 - y^2 - 1 = 0$ is satisfied by many points including $(x, y) = (\sqrt{3}, \sqrt{2})$, $(x, y) = (1, 0)$ and $(x, y) = (-1, 0)$. Use implicit differentiation in both parts (a) and (b).
(a) Express $dy/dx$ in terms of $x$ and $y$, and evaluate at $(x, y) = (\sqrt{3}, \sqrt{2})$. Does it make sense to evaluate at $(x, y) = (1, 0)$ or $(x, y) = (1, 1)$?
(b) Express $dx/dy$ in terms of $x$ and $y$, and evaluate at $(x, y) = (\sqrt{2}, 1)$. Does it make sense to evaluate at $(x, y) = (1, 0)$ or $(x, y) = (0, 1)$?
(c) Solve the given equation explicitly for $y$ in terms of $x$ and interpret the results of part (a) graphically.
(d) Solve the given equation explicitly for $x$ in terms of $y$ and interpret the results of part (b) graphically.
In Exercises 3 to 6, use implicit differentiation to find $dy/dx$, and, if possible, $dx/dy$ at the indicated point.
3. $xy + 1 = 0$ at $(x, y) = (-1, 1)$
4. $xe^y + ye^x = 0$ at $(x, y) = (0, 0)$
5. $x + y(x^2 + 1) + \tfrac12 = 0$ at $(x, y) = (-1, \tfrac14)$
6. $x^2 + y^2 = 1$ at $(x, y) = (1/\sqrt{2}, 1/\sqrt{2})$
7. Suppose that $x^2 y + yz = 0$ and $xyz + 1 = 0$.
(a) Find $dx/dz$ and $dy/dz$ at $(x, y, z) = (1, 1, -1)$.
(b) Find $dy/dx$ and $dz/dx$ at $(x, y, z) = (1, 1, -1)$.
(c) Find $dx/dy$ and $dz/dy$ at $(x, y, z) = (1, 1, -1)$.
8. If $x + y - u - v = 0$ and $x - y + 2u + v = 0$, find $\partial x/\partial u$, $\partial y/\partial u$, $\partial x/\partial v$, and $\partial y/\partial v$ by
(a) first solving for $x$ and $y$ in terms of $u$ and $v$
(b) implicit differentiation
9. If Exercise 7 is expressed in the general vector notation of Theorem 3.3, what are $F$, $x$, $y$, $F_x$, and $F_y$ for part (a)? Part (b)? Part (c)?
10. If Exercise 8 is expressed in the vector notation of Theorem 3.3, what is the matrix $G'(x)$?
11. If $x^2 + yu + xv + w = 0$, $x + y + uvw + 1 = 0$, then, regarding $x$ and $y$ as functions of $u$, $v$, and $w$, find
\[ \frac{\partial x}{\partial u} \quad\text{and}\quad \frac{\partial y}{\partial u} \quad\text{at } (x, y, u, v, w) = (1, -1, 1, 1, -1). \]
12. The equations $2x^3 y + yx^2 + t^2 = 0$, $x + y + t - 1 = 0$ implicitly define a curve
\[ f(t) = \begin{pmatrix} x(t) \\ y(t) \end{pmatrix} \quad\text{that satisfies}\quad f(1) = \begin{pmatrix} -1 \\ 1 \end{pmatrix}. \]
Find the tangent line to the curve when $t = 1$.
13. Let the equation $x^2/4 + y^2 + z^2/9 - 1 = 0$ define $z$ implicitly as a function $z = f(x, y)$ near the point $x = 1$, $y = \sqrt{11}/6$, $z = 2$. The graph of the function $f$ is a surface. Find its tangent plane at $(1, \sqrt{11}/6, 2)$.
14. Suppose the equation $F(x, y, z) = 0$ implicitly defines $z = f(x, y)$ and that $z_0 = f(x_0, y_0)$. Suppose further that the surface that is the graph of $z = f(x, y)$ has a tangent plane at $(x_0, y_0)$, as defined in Chapter 4, Section 3B. Show that
\[ (x - x_0)\frac{\partial F}{\partial x}(x_0, y_0, z_0) + (y - y_0)\frac{\partial F}{\partial y}(x_0, y_0, z_0) + \]
\[ \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x(u, v) \\ y(u, v) \\ z(u, v) \end{pmatrix} = f(u, v) \quad\text{at } (u, v) = (1, 1). \]
(b) The function $f$ parametrically defines a surface in the $(x, y, z)$ space. Find the tangent plane to it at the point $(1, 1, -1)$.
16. Show that the hyperboloid of two sheets $x^2 + y^2 - z^2 + 1 = 0$ has two pieces, not intersecting, that are graphs of smooth functions of the form $z = z(x, y)$.
(a) Do this by finding explicit representations for the two graphs.
(b) Apply the implicit function theorem to show this for a neighborhood $N$ of every point $x_0$ in $\mathbb{R}^2$.
17. It's intuitively evident that a sphere of radius $a$ centered at the origin is a smooth surface $S$ near all of its points.
(a) Explain how this follows from the implicit function theorem, applied with each of $x$, $y$ and $z$ as dependent variable, to the function $F(x, y, z) = x^2 + y^2 + z^2 - a^2$.
(b) Find representations for parts of $S$ in six pieces, so showing explicitly the smoothness of $S$.
18. The sphere $x^2 + y^2 + z^2 - 4 = 0$ and the plane $x + y + z - 2 = 0$ have points in common, for example $(2, 0, 0)$, and the two surfaces appear to intersect in a circle.
(a) Find the center $c$ and radius $a$ of the circle.
(b) The implicit function theorem doesn't actually find an explicit parametrization of the circle for you, but it does show that this can in principle be done in overlapping pieces. Explain how.
is satisfied by the points on a level set $S$ in $\mathbb{R}^3$ of $F(x, y, z) = xyz - yz^2 + x^2 y$.
(a) Find all points $(x, y, z)$ such that $F_z(x, y, z) = 0$, as required by the implicit function theorem for the existence of $z = z(x, y)$.
(b) Some of the points found in part (a) may not lie on $S$. Show that all points on $S$ near which it's impossible to apply the implicit function theorem to guarantee existence of $z = z(x, y)$ are the solutions of $x - 2z = 0$ and $5x^2 y - 4 = 0$.
(c) Solve the $z$-quadratic equation $xyz - yz^2 + x^2 y = 1$ explicitly for $z = z(x, y)$. The points where this solution fails to define a continuously differentiable function should be the same as the points found in part (b).
(d) The results of parts (b) of Exercises 19 and 20 together imply that $S$ is a smooth surface at all of its points. Explain why.
*21. Show that the inverse function theorem (Theorem 2.3) follows from the implicit function theorem (Theorem 3.4) by setting $F(x, y) = x - f(y)$.
/(xo) ~ /(x).
The number $f(x_0)$ is called a local maximum value or a local minimum value if
there is a neighborhood $N$ of $x_0$ such that, respectively,
Consider the function defined by $f(x, y) = x^2 + y^2$ for points $(x, y)$ in the set $S$
of points that lie inside or on the ellipse $x^2 + 2y^2 = 1$. Note that $S$ is closed and
FIGURE 6.11
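\[ f_1'(x_0) = f_2'(y_0) = 0. \]
Since
\[ f_1'(x_0) = \frac{\partial f}{\partial x}(x_0, y_0) \quad\text{and}\quad f_2'(y_0) = \frac{\partial f}{\partial y}(x_0, y_0), \]
we have
\[ \frac{\partial f}{\partial x}(x_0, y_0) = \frac{\partial f}{\partial y}(x_0, y_0) = 0. \]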
and so the only extreme value of $f$ in the interior of the ellipse occurs at $(x_0, y_0) = (0, 0)$.
From the graph of $f$, shown in Figure 6.11, we see that the value 0 there is a
local minimum. We next consider the values of $f$ on the boundary curve itself. The
ellipse is defined parametrically by the function
\[ g(t) = \left(\cos t,\ \tfrac{1}{\sqrt{2}}\sin t\right). \]
Thus the values of $f$ on the ellipse are given as the values of the composition $f \circ g$.
Any extreme values of $f$ on the ellipse will be extreme for $f \circ g$. The latter is a
real-valued function of one variable, and we treat it in the usual way, that is, by
setting its derivative equal to zero. By the chain rule, we obtain
\[ \begin{aligned} \frac{d}{dt}(f \circ g) &= \nabla f(g(t)) \cdot g'(t) \\ &= \left(2\cos t,\ \tfrac{2}{\sqrt{2}}\sin t\right) \cdot \left(-\sin t,\ \tfrac{1}{\sqrt{2}}\cos t\right) \\ &= -2\cos t \sin t + \sin t \cos t \\ &= -\tfrac12 \sin 2t. \end{aligned} \]
Extreme values therefore may occur at $t = 0$, $\pi/2$, $\pi$, and $3\pi/2$. The corresponding
values of $(x, y)$ are $(1, 0)$, $(0, 1/\sqrt{2})$, $(-1, 0)$, and $(0, -1/\sqrt{2})$, and those of $f$ are
$1$, $\tfrac12$, $1$, and $\tfrac12$, respectively. We see that the absolute minimum of $f$ is 0 at $(0, 0)$
and that the absolute maximum of $f$ occurs at the two points $(1, 0)$ and $(-1, 0)$.
Notice that the two extreme values of $f \circ g$ that occur at $t = \pi/2$ and $3\pi/2$ are not
extreme for $f$, as we see by looking at Figure 6.11.
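The boundary analysis can be confirmed by sampling. The sketch below assumes the parametrization g(t) = (cos t, sin t / sqrt(2)) of the ellipse x^2 + 2y^2 = 1, which is the one consistent with the derivative computed above, and simply scans the boundary on a fine grid of parameter values; the grid size is an arbitrary choice.

```python
import math

def f(x, y):
    return x * x + y * y

def g(t):
    # Parametrization of the ellipse x^2 + 2y^2 = 1.
    return (math.cos(t), math.sin(t) / math.sqrt(2.0))

def boundary_derivative(t):
    # d/dt f(g(t)) = -(1/2) sin 2t, as computed with the chain rule.
    return -0.5 * math.sin(2.0 * t)

samples = [2.0 * math.pi * k / 3600 for k in range(3600)]
boundary_values = [f(*g(t)) for t in samples]
boundary_max = max(boundary_values)
boundary_min = min(boundary_values)
print(boundary_max, boundary_min, f(0.0, 0.0))
```

The scan should report a boundary maximum near 1 and a boundary minimum near 1/2, while the interior critical point gives the value 0, so the boundary minimum is not a minimum for f on the whole region.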
The methods used in the preceding example are valid in any number of dimensions.
The next theorem is the principal criterion used in this extension, and although we can
prove it by reducing it to the single-variable method, we give a proof that contains
the single-variable situation as a special case.
Proof. Suppose $f$ has a local minimum at $x_0$. For any unit vector $u$ in $\mathbb{R}^n$, there is
an $\epsilon > 0$ such that if $-\epsilon < t < \epsilon$, then $f(x_0) \le f(x_0 + tu)$. Hence, for $0 < t < \epsilon$,
\[ \frac{\partial f}{\partial u}(x_0) = f'(x_0)u. \]
Therefore
\[ \frac{\partial f}{\partial u}(x_0) = f'(x_0)u, \]
and that the derivative with respect to $u$ measures the rate of change of $f$ in the
direction of $u$. At an extreme point in the interior of the domain of $f$, this rate should
be zero in every direction. The importance of the theorem is that of all the interior
points $x$ of the domain of $f$ we need to look for extreme points only among those
for which $f'(x) = 0$. Points $x$ for which $f'(x) = 0$ are called critical points of $f$.
4B Constraints
As we did in Example 1 we'll consider in more detail real-valued functions $f$ on
open sets $D$, trying to find the extreme points of $f$ when $f$ has its domain restricted
to some subset $S$ of $D$. Two possibilities that we must look out for are
EXAMPLE 2  Let $f(x, y, z) = xyz$ in the set defined by $|x| \le 1$, $|y| \le 1$, $|z| \le 1$. Thus the domain
of $f$ is the cube $C$ with edges of length 2 illustrated in Figure 6.12(a). The condition
$\nabla f(x) = 0$ for critical points amounts to $(yz, xz, xy) = (0, 0, 0)$. The solutions of
this equation are the points satisfying $x = y = 0$, or $x = z = 0$, or $y = z = 0$;
in other words, the coordinate axes. Since $f$ has the value zero at all of its critical
points, and since $f$ has both positive and negative values in the neighborhood of
each of these points, no critical point can be an extreme point. Furthermore, a little
thought shows that $f$ has maximum value 1 and minimum value $-1$ on $C$. These
values occur at the eight corners of the cube, none of which is a critical point for $f$.
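A brute-force check makes the claims of this example concrete. The sketch below evaluates f at the eight corners and on a coarse grid over the cube; the grid spacing is an arbitrary choice, and a grid scan is only a sanity check, not a proof that the extremes occur at the corners.

```python
from itertools import product

def f(x, y, z):
    return x * y * z

def gradient(x, y, z):
    # Gradient of xyz.
    return (y * z, x * z, x * y)

corners = list(product((-1.0, 1.0), repeat=3))
corner_values = [f(*c) for c in corners]

# Coarse grid over the cube, used only as a sanity check of the claimed extremes.
grid = [k / 10.0 for k in range(-10, 11)]
grid_values = [f(x, y, z) for x, y, z in product(grid, repeat=3)]

print(max(grid_values), min(grid_values))
print(sorted(set(corner_values)), gradient(1.0, 1.0, 1.0))
```

The gradient at a corner such as (1, 1, 1) is nonzero, which illustrates why searching for critical points alone misses extreme values on the boundary.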
The boundary set $S$ of a region $R$ in $\mathbb{R}^n$ is itself never an open subset of $\mathbb{R}^n$,
so examining critical points of $f$ is of no use in finding whatever extreme points
of $f$ may lie on $S$, as on the boundary of the cube in the previous example. More
generally, we may be interested in maximizing or minimizing a function $f$ whose
domain is restricted to a lower-dimensional set, say a curve or a surface, that we
may not necessarily regard as the boundary of some region.
Then $F'(t)$ is zero only at $t = 1$. Furthermore, since $F''(t) = 12t^2 - 6t$, we have
$F''(1) > 0$. It follows that $f$ has a relative minimum at the point $(1, 1, 1)$ while
restricted to the curve $\gamma$. The minimum value of $f$ on $\gamma$ is $f(1, 1, 1) = -1$, and
there are no other extreme values.
The function $f$ on $C$ takes the value $F(t) = \cos t + \sin t + 2$. We have $F'(t) = -\sin t + \cos t$,
so $F'(t) = 0$ at $t = \pi/4$ and $t = 5\pi/4$. Since $F''(\pi/4) < 0$ and
$F''(5\pi/4) > 0$,
4C Lagrange Multipliers
The solution of the previous problem depended on our being able to find a concrete
parametric representation for the curve of intersection of the plane z - 2 = 0 and the
cylinder x 2 + y 2 - 1 = 0. When a specific parametrization is not readily available,
we can still sometimes apply the method of Lagrange multipliers, to be described
next. The method consists in verifying the pure existence of a parametric represen-
tation and then deriving necessary conditions for there to be an extreme point for a
function f when restricted to the curve or surface.
FIGURE 6.13  (a) $n = 2$, $m = 1$.  (b) $n = 3$, $m = 2$.
Why does the method work? What the Lagrange method does for us in
practice is allow us to restrict attention to solutions $x_0$ of the previous equation that
also lie on $S$. A complete proof of the method's correctness is fairly complicated,
but the geometric idea behind it is quite plausible. Since $x_0$ is a local extreme point
for $f$ on $S$, derivatives of $f$ in directions parallel to $S$ at $x_0$ should be zero; in
other words,
\[ \frac{\partial f}{\partial u}(x_0) = \nabla f(x_0) \cdot u = 0 \]
for every unit vector $u$ tangent to $S$ at $x_0$. But vectors $u$ tangent to $S$ are perpendicular
to the $m$ normal vectors $\nabla G_1(x_0), \nabla G_2(x_0), \ldots, \nabla G_m(x_0)$, as illustrated in
Figure 6.13 for the cases $n = 2$, $m = 1$ and $n = 3$, $m = 2$. The previous displayed
equation shows that $u$ is also perpendicular to $\nabla f(x_0)$. The pictures suggest, and it
can be proved, that $\nabla f(x_0)$ is then a linear combination of the vectors $\nabla G_k(x_0)$, a
combination that we choose to write here in the form
\[ \nabla f(x_0) = -\lambda_1 \nabla G_1(x_0) - \lambda_2 \nabla G_2(x_0) - \cdots - \lambda_m \nabla G_m(x_0). \]
If $m = 1$, then $\nabla f(x_0)$ will typically be parallel to $\nabla G_1(x_0)$, as shown in Figure 6.13(a),
but in any case there are constants $\lambda_k$ such that the Lagrange condition is satisfied at
$x_0$. Figure 6.14 is a picture that includes the graph of a function $\mathbb{R}^2 \xrightarrow{f} \mathbb{R}$ and shows
some critical vectors $\nabla f$ perpendicular to the set $S$ determined by $G(x, y) = 0$.
FIGURE 6.14
Remark 1. It's important to understand that the Lagrange condition is only a
necessary condition that must hold at a local extreme point, which is why we can
use it to exclude many other points from consideration. The Lagrange condition may
hold at some points that are not extreme points, just as the gradient of a function f
may be zero at points that are not extreme points of f.
Remark 2. In all max-min problems it's important to have some grounds for
believing that the desired extreme points do indeed exist. In addition we need to be
able to distinguish among the critical points for those that provide relative maxima
and minima. Section 4E gives a second derivative test that's sometimes helpful
for doing this. For the Lagrange method to be effective, we also need to assure
ourselves that the set S to which f is restricted is not only closed and bounded but
also sufficiently smooth, so sharp corners on curves and sharp edges in surfaces have
to be examined separately. We deal with these issues in the Exercises.
EXAMPLE 5  The problem of Example 4 is that of finding the extreme points of $f(x, y, z) = x + y + z$ subject to the conditions
\[ G_1(x, y, z) = x^2 + y^2 - 1 = 0, \qquad G_2(x, y, z) = z - 2 = 0. \]
\[ (x + y + z) + \lambda_1 (x^2 + y^2 - 1) + \lambda_2 (z - 2). \]
\[ x - y + z + \lambda(x^2 + y^2 + z^2 - 1) \]
\[ x^2 + y^2 + z^2 = 1. \]
The solutions of these four equations are found as in the previous example:
\[ \lambda = \pm\frac{\sqrt{3}}{2}, \qquad x = -y = z = \mp\frac{1}{\sqrt{3}}. \]
The maximum of $f$ occurs at $(1/\sqrt{3}, -1/\sqrt{3}, 1/\sqrt{3})$. The maximum value is $\sqrt{3}$.
What is the minimum value?
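The stated solution can be verified directly. The sketch below assumes, as the fragments above indicate, that the function being maximized is f(x, y, z) = x - y + z on the unit sphere; it checks that the gradient of the Lagrange expression vanishes at the candidate point and compares the candidate value with values of f at randomly sampled points of the sphere. The random comparison is only a plausibility check.

```python
import math
import random

def f(x, y, z):
    return x - y + z

def lagrange_residual(x, y, z, lam):
    # Gradient of x - y + z + lam * (x^2 + y^2 + z^2 - 1) with respect to (x, y, z).
    return (1 + 2 * lam * x, -1 + 2 * lam * y, 1 + 2 * lam * z)

s = 1.0 / math.sqrt(3.0)
candidate = (s, -s, s)
lam = -math.sqrt(3.0) / 2.0          # the multiplier paired with this candidate

random.seed(0)
best_random = -float("inf")
for _ in range(20000):
    p = [random.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(c * c for c in p))
    best_random = max(best_random, f(*(c / norm for c in p)))   # point on the unit sphere

print(lagrange_residual(*candidate, lam), f(*candidate), best_random)
```

No sampled point should exceed the candidate value, which is close to sqrt(3). Running the same check with the opposite signs for the point and the multiplier addresses the question about the minimum.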
Let $g(x_1, x_2, \ldots, x_n) = 0$ implicitly define a surface $S$ in $\mathbb{R}^n$ and let $a = (a_1, a_2, \ldots, a_n)$
be a fixed point not on $S$. Suppose we want to minimize locally the
distance from $a$ to $S$. Minimizing the distance from $a$ to $S$ is the same as minimizing
the square of the distance, which is easier to differentiate. Applying the Lagrange
method, we look for points $p$ on $S$ that are critical points of
\[ \sum_{k=1}^{n} (x_k - a_k)^2 + \lambda\, g(x_1, \ldots, x_n) \]
for some $\lambda$. The critical points satisfy, in addition to $g(x_1, \ldots, x_n) = 0$, the equations
\[ 2(x_1 - a_1) + \lambda\,\frac{\partial g}{\partial x_1}(x_1, \ldots, x_n) = 0 \]
\[ \vdots \]
\[ 2(x_n - a_n) + \lambda\,\frac{\partial g}{\partial x_n}(x_1, \ldots, x_n) = 0. \]
In vector form these equations say that
\[ (p_1 - a_1, \ldots, p_n - a_n) = -\frac{\lambda}{2}\left(\frac{\partial g}{\partial x_1}(p), \ldots, \frac{\partial g}{\partial x_n}(p)\right), \]
where $p = (p_1, \ldots, p_n)$. The vector $p - a$ on the left is then either zero or parallel
to the normal vector to $S$ at $p$, which appears on the right side of the equation. In
other words, we have shown that $p - a$ is perpendicular to $S$, or else $p = a$. A
two-dimensional example is illustrated in Figure 6.15, where $p$ provides a local, but
not a global, minimum.
FIGURE 6.15
EXAMPLE 8  Suppose that a cylindrical can is to contain a fixed volume $V$ and that its surface
area, with top and bottom, is to be as small as possible. If the radius of the can is $x$,
and its height is $y$, then $V = \pi x^2 y$. We want to minimize the total area $2\pi x^2 + 2\pi xy$
of the top, bottom, and sides. We write
must equal diameter $2x$. The value of $x$ for a given volume $V$ can then be determined
from the equation $2\pi x^3 = \pi x^2 y = V$.
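The conclusion that the height equals the diameter can be checked without multipliers by eliminating y through the volume constraint and minimizing the resulting one-variable area function. The sketch below is an independent numerical cross-check, not the Lagrange computation itself; the volume, the search interval, and the ternary-search helper are all illustrative assumptions.

```python
import math

def surface_area(x, volume):
    y = volume / (math.pi * x * x)       # height forced by the volume constraint
    return 2 * math.pi * x * x + 2 * math.pi * x * y

def minimize_unimodal(func, lo, hi, iterations=200):
    # Ternary search; assumes func has a single minimum on [lo, hi].
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

V = 10.0   # arbitrary illustrative volume
x_best = minimize_unimodal(lambda x: surface_area(x, V), 0.1, 10.0)
y_best = V / (math.pi * x_best ** 2)
x_lagrange = (V / (2 * math.pi)) ** (1.0 / 3.0)   # from 2*pi*x^3 = V
print(x_best, y_best, x_lagrange)
```

The numerically found radius should agree with the cube-root expression obtained from $2\pi x^3 = V$, and the corresponding height should be close to twice the radius.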
intersect in a line $S$ as shown in Figure 6.16(a). Let $f(x, y, z) = xy$, and restrict $f$
to the line $S$. Using the Lagrange method to maximize $f$ on $S$, we consider
4D Saddle Points
A critical point xo of a function f such that f (xo) is neither a local maximum nor
a local minimum value for f is called a saddle point for f.
FIGURE 6.16  (a)  (b)
In case (b), or in the case where R has no interior points, we can do either of the
following:
EXERCISES
4E Second-Derivative Criterion
In this section we'll identify strict local extreme points, that is, points $x_0$ for which
a strict inequality $f(x_0) > f(x)$ or $f(x_0) < f(x)$ holds for $x \neq x_0$ in some neighborhood
of $x_0$. For functions of one real variable with two continuous derivatives
the second-derivative test says that at a critical point $x_0$ interior to its domain
$f$ has (i) a local minimum if $f''(x_0) > 0$, (ii) a local maximum if $f''(x_0) < 0$,
or (iii) neither of these if $f''$ changes sign at $x_0$. The intuitive geometric content of these alternatives is as follows: near $x_0$ the graph of $f$ (i) is concave up and so stays above the horizontal tangent through $(x_0, f(x_0))$ if $f''(x_0) > 0$, (ii) is concave down and stays below the tangent if $f''(x_0) < 0$, or (iii) crosses the tangent in case $f''$ changes sign at $x_0$. We'll see that the alternatives for functions $\mathbb{R}^n \to \mathbb{R}$ are very similar, the main technical difference being the criteria that we use to decide about concavity. The essence of the extension to higher dimensions consists of examining the second-order directional derivative, defined naturally by
$$\frac{\partial^2 f}{\partial \mathbf{u}^2}(\mathbf{x}) = \frac{\partial}{\partial \mathbf{u}}\left(\frac{\partial f}{\partial \mathbf{u}}\right)(\mathbf{x}).$$
The basic analysis is in the following theorem, which contains the 1-dimensional case and so bears out the intuitive review we started with.
(i) If $\dfrac{\partial^2 f}{\partial \mathbf{u}^2}(\mathbf{x}_0) > 0$ for all unit vectors $\mathbf{u}$, then $f(\mathbf{x}_0)$ is a strict local minimum value.
(ii) If $\dfrac{\partial^2 f}{\partial \mathbf{u}^2}(\mathbf{x}_0) < 0$ for all unit vectors $\mathbf{u}$, then $f(\mathbf{x}_0)$ is a strict local maximum value.
(iii) If $\dfrac{\partial^2 f}{\partial \mathbf{u}^2}(\mathbf{x}_0)$ is positive for some $\mathbf{u}$ and negative for others, then $\mathbf{x}_0$ is a saddle point.
Proof. We first observe that if g(x) is real-valued and twice continuously differen-
tiable on an interval containing 0 in its interior, then
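To see this just compute the integral by parts and apply the fundamental theorem of calculus to the remaining integral. Now let $g(x) = f(\mathbf{x}_0 + x\mathbf{u})$, where $\mathbf{u}$ is a unit vector in $\mathbb{R}^n$. By the chain rule $g'(x) = \nabla f(\mathbf{x}_0 + x\mathbf{u}) \cdot \mathbf{u}$; applying the chain rule again we get
$$g''(x) = \nabla\bigl(\nabla f(\mathbf{x}_0 + x\mathbf{u}) \cdot \mathbf{u}\bigr) \cdot \mathbf{u} = \frac{\partial}{\partial \mathbf{u}}\left(\frac{\partial f}{\partial \mathbf{u}}\right)(\mathbf{x}_0 + x\mathbf{u}).$$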
Since $\mathbf{x}_0$ is a critical point for $f$, it follows that $g'(0) = \nabla f(\mathbf{x}_0) \cdot \mathbf{u} = 0$. Replacing $g$ by the corresponding expressions in $f$ everywhere in Equation (*), we get
But $\partial^2 f/\partial \mathbf{u}^2$ is continuous, so for small enough $t$ the sign of $\partial^2 f/\partial \mathbf{u}^2(\mathbf{x}_0 + t\mathbf{u})$ is the same as the sign of $\partial^2 f/\partial \mathbf{u}^2(\mathbf{x}_0)$. Case (i) now follows by checking that the inequality $f(\mathbf{x}_0 + x\mathbf{u}) - f(\mathbf{x}_0) > 0$ holds at all points $x$ for which $|x\mathbf{u}| = |x| < \delta$, for some $\delta > 0$; cases (ii) and (iii) are similar. •
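The integral identity behind this proof can be sketched as follows. The exact form of Equation (*) written here is an assumption, chosen to agree with the integration-by-parts hint in the proof; the second line is what it becomes for $g(x) = f(\mathbf{x}_0 + x\mathbf{u})$ with $g'(0) = 0$.

```latex
% Assumed form of Equation (*): Taylor's formula with integral remainder
g(x) = g(0) + x\,g'(0) + \int_0^x (x - t)\, g''(t)\, dt

% Applied to g(x) = f(x_0 + x u) at a critical point x_0
f(\mathbf{x}_0 + x\mathbf{u}) - f(\mathbf{x}_0)
  = \int_0^x (x - t)\,
    \frac{\partial^2 f}{\partial \mathbf{u}^2}(\mathbf{x}_0 + t\mathbf{u})\, dt
```

Integrating by parts confirms the first line, since $\int_0^x (x - t)\,g''(t)\,dt = -x\,g'(0) + g(x) - g(0)$. If the second derivative is positive along the segment, the integral is positive for $x$ of either sign, which is the inequality needed for case (i).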
Remark. Just as in the 1-dimensional case, the three cases of Theorem 4.4 don't cover all possibilities. (See Exercise 1.) For example, $\partial^2 f/\partial \mathbf{u}^2(\mathbf{x}_0)$ could be zero for all unit vectors $\mathbf{u}$, in which case the statement yields no information.
Theorem 4.4 is in a way a straightforward generalization of the second-derivative criterion of single-variable calculus. But in practice the transition from dimension 1 to dimension 2 is a distinctive one, because in $\mathbb{R}^2$ there are infinitely many ways to approach a critical point, while in $\mathbb{R}$ there are at most two approaches, from the right or the left. For dimension 3 or more the additional distinctions are mainly technical, so we'll concentrate on functions $\mathbb{R}^2 \to \mathbb{R}$.
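4.5 Theorem. Suppose that $f(x, y)$ has continuous second-order partials $f_{xx}$, $f_{xy} = f_{yx}$ and $f_{yy}$ defined on an open set containing $\mathbf{x}_0$. Let $\mathbf{u} = (u, v)$. Then
$$\frac{\partial^2 f}{\partial \mathbf{u}^2}(\mathbf{x}_0) = f_{xx}(\mathbf{x}_0)u^2 + 2f_{xy}(\mathbf{x}_0)uv + f_{yy}(\mathbf{x}_0)v^2.$$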
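Proof. We know that $\partial f/\partial \mathbf{u} = \nabla f \cdot \mathbf{u} = f_x u + f_y v$, so the second-order derivative $\partial^2 f/\partial \mathbf{u}^2$ is
$$\nabla(f_x u + f_y v) \cdot \mathbf{u} = (f_{xx}u + f_{yx}v)u + (f_{xy}u + f_{yy}v)v = f_{xx}u^2 + 2f_{xy}uv + f_{yy}v^2.$$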
Suppose $f_{xx}(x_0, y_0) = p > 0$, $f_{yy}(x_0, y_0) = q > 0$ and $f_{xy}(x_0, y_0) = 0$. Then $(\partial^2 f/\partial \mathbf{u}^2)(x_0, y_0) = pu^2 + qv^2 > 0$, since $u$ and $v$ can't both be zero if $|(u, v)| = 1$. Thus we're in case (i) of Theorem 4.4, and $f(x_0, y_0)$ would be a strict minimum at a critical point. On the other hand, if $f_{xx}(x_0, y_0) = f_{yy}(x_0, y_0) = 0$ and $f_{xy}(x_0, y_0) = r \neq 0$, then $(\partial^2 f/\partial \mathbf{u}^2)(x_0, y_0) = 2ruv$. Since $2ruv$ can be both positive and negative, a critical point at $(x_0, y_0)$ would be a saddle point.
It follows that $(0, 1/\sqrt{3})$ is a strict minimum point and $(0, -1/\sqrt{3})$ is a strict maximum. At $(1, 0)$ we find $f_{xx}(1, 0) = 0$, $f_{xy}(1, 0) = 2$, $f_{yy}(1, 0) = 0$, so by Theorem 4.5 the second derivative is $4uv$, which exhibits different signs depending on whether $u$ and $v$ have the same or opposite sign. Hence $(1, 0)$ is a saddle point. Similarly, $(-1, 0)$ is a saddle point.
If the quadratic polynomial of Theorem 4.5 doesn't yield to the quick sign analysis
that applied in the two previous examples, we have a simple test based on the
discriminant of a quadratic equation.
(i) If $D > 0$ and $f_{xx}(x_0, y_0) > 0$ or $f_{yy}(x_0, y_0) > 0$, then $f(x_0, y_0)$ is a strict local minimum.
(ii) If $D > 0$ and $f_{xx}(x_0, y_0) < 0$ or $f_{yy}(x_0, y_0) < 0$, then $f(x_0, y_0)$ is a strict local maximum.
(iii) If $D < 0$, then $f(x, y)$ has a saddle point at $(x_0, y_0)$.
Proof. Since $\mathbf{u} = (u, v)$ is a unit vector, $u$ and $v$ can't both be zero. Thus for any choice of $u$ and $v$, the quadratic polynomial $\partial^2 f/\partial \mathbf{u}^2$ of Theorem 4.5 can be written in either of the two forms
$$u^2\left[f_{xx} + 2f_{xy}(v/u) + f_{yy}(v/u)^2\right] \quad\text{or}\quad v^2\left[f_{xx}(u/v)^2 + 2f_{xy}(u/v) + f_{yy}\right].$$
Deciding whether either of these two functions changes sign or not comes down to deciding whether either of the quadratic polynomials in $v/u$ or $u/v$ has a real root or not; if not, there is no sign change. But "no real root" is equivalent to
$$4f_{xy}^2 - 4f_{xx}f_{yy} < 0,$$
in other words to $D > 0$. To decide in that event whether the quadratic polynomial is negative or positive, all we have to do is check one of the coefficients $f_{xx}(x_0, y_0)$ or $f_{yy}(x_0, y_0)$. This covers (i) and (ii). Case (iii) is now quite simple. If $D < 0$, there are two distinct real root values for $v/u$ or $u/v$, so there are two definite sign changes as $(u, v)$ varies. Hence $(x_0, y_0)$ is a saddle point. •
If $D = 0$ no conclusions follow from Theorem 4.6. Keeping the next example in mind along with Figure 6.17 makes it easier to recall the alternatives of Theorem 4.6.
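The alternatives of Theorem 4.6 can be sketched as a small routine. This is only an illustration: the function name `classify_critical_point` is hypothetical, and the discriminant is assumed to be $D = f_{xx}f_{yy} - f_{xy}^2$, which agrees with the value $D = 4ac - 4b^2$ computed for the general quadratic polynomial in this section.

```python
def classify_critical_point(fxx, fxy, fyy):
    """Classify a critical point from second partials evaluated there.

    Assumes D = fxx * fyy - fxy**2 (hypothetical helper, not from the text).
    """
    d = fxx * fyy - fxy ** 2
    if d > 0:
        # When D > 0, fxx and fyy are nonzero and share the same sign,
        # so checking fxx alone is enough.
        return "strict local minimum" if fxx > 0 else "strict local maximum"
    if d < 0:
        return "saddle point"
    return "no conclusion"  # D = 0: the test gives no information


# f(x, y) = a x^2 + 2 b x y + c y^2 has fxx = 2a, fxy = 2b, fyy = 2c.
print(classify_critical_point(2, 0, 2))    # a = c = 1, b = 0
print(classify_critical_point(-2, 0, -4))  # a = -1, c = -2, b = 0
print(classify_critical_point(0, 2, 0))    # a = c = 0, b = 1
```

The three calls correspond to the three shapes pictured in Figure 6.17.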
EXAMPLE The general quadratic polynomial $f(x, y) = ax^2 + 2bxy + cy^2$ has critical points satisfying
$$\begin{cases} 2ax + 2by = 0\\ 2bx + 2cy = 0 \end{cases} \quad\text{or}\quad \begin{pmatrix} 2a & 2b\\ 2b & 2c \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix} = \begin{pmatrix} 0\\ 0 \end{pmatrix}.$$
Hence there is a single critical point at $(x_0, y_0) = (0, 0)$ unless the determinant $D = 4ac - 4b^2 = 0$. Note that $f_{xx} = 2a$, $f_{yy} = 2c$ and $f_{xy} = f_{yx} = 2b$. If
FIGURE 6.17 (i) $D = f_{xx}f_{yy} > 0$, $f_{xx} > 0$: concave up in $x$ and $y$. (ii) $D = f_{xx}f_{yy} > 0$, $f_{xx} < 0$: concave down in $x$ and $y$. (iii) $D = f_{xx}f_{yy} < 0$: up in $x$, down in $y$; the surface shown is labeled $z = y^2 - x^2$.
EXERCISES
1. (a) Show that the functions $f(x) = x^3$ and $g(x) = x^4$ behave differently at their critical points, but that the second-derivative criterion of Theorem 4.4 fails to distinguish between their behaviors.
(b) Find the critical points of the functions $f(x, y) = (x + y)^3$ and $g(x, y) = (x + y)^4$, and describe the behavior of $f$ and $g$ near these points. Show also that Theorem 4.4 fails to distinguish between the two behaviors.
2. (a) For twice-differentiable functions $\mathbb{R} \to \mathbb{R}$ there are in principle two values for the second-order derivative $\dfrac{\partial^2 f}{\partial u^2}(x)$, because we can let $u = \pm 1$. Show that both second partials are equal to $f''(x)$.
(b) Show more generally that for twice-differentiable functions $\mathbb{R}^n \to \mathbb{R}$
$$\frac{\partial^2 f}{\partial(-\mathbf{u})^2}(\mathbf{x}) = \frac{\partial^2 f}{\partial \mathbf{u}^2}(\mathbf{x}).$$
Note that it's not correct to show this by simply replacing $(-\mathbf{u})^2$ by $\mathbf{u}^2$ in the previous equation.
Find the critical points of each of the functions 3 to 13, and try to apply the second-derivative test to determine whether each critical point is a maximum, a minimum or a saddle point. Note, however, that if the conditions
$$\frac{1}{2}\frac{\partial^2 q}{\partial \mathbf{u}^2}(x, y, z) = q(u, v, w),$$
independent of $(x, y, z)$.
alternate in sign so that $(-1)^m \det H_f^{(m)}(\mathbf{x}_0)$ are positive, then $f$ has a strict local maximum at $\mathbf{x}_0$. Show that this formulation has the criteria (i) and (ii) of Theorem 4.6 as special cases.
extreme point is in an open subset of the domain of $f$. Our classic strategy so far is to restrict attention to the critical points of $f$, that is, the solutions of the $n$ simultaneous equations embodied in the single vector equation $f'(\mathbf{x}) = 0$, but finding those solutions can be problematic, even with Newton's method.
Steepest Ascent Method. Though the method described here applies in any number of dimensions, it's easiest to visualize in the special case $\mathbb{R}^2 \to \mathbb{R}$. Imagine that the graph of $f$ represents the topography of some mountainous terrain, and that an ambitious climber is determined always to head up the steepest way from a given point. Figures 6.4(b) and (c) in Section 1 illustrate the setting. This strategy determines the climber's path, and it's natural to call that path a path of steepest ascent. To make mathematics out of these remarks, all we have to do is to recall that at a point $\mathbf{x} = (x, y)$ in the plane, the direction of maximum increase of $f$ is the same as the direction of the gradient vector $\nabla f(\mathbf{x})$, provided this vector is not zero. (This effectively tells the climber what horizontal compass heading gives the steepest way up at each point of the path.) It seems reasonable that a path of steepest ascent will in general lead to the "top," at which point we would have reached a local maximum of $f$ where $\nabla f = 0$. (It might only be the summit of one of the foothills.) To reach a local minimum we would always head in the direction $-\nabla f(\mathbf{x})$ opposite to that of the gradient. We'd then call what we're doing the steepest descent method.
The numerical implementation of steepest ascent amounts to taking a succession of small steps along the direction of the gradient, at each point $\mathbf{x}$ along the way observing the value $f(\mathbf{x})$. At each step we make a decision about whether to continue or not and record the value of $f$ at the end of the last step as our estimate for a local maximum value for $f$.
Each step in the process has the same general form as the very first step. Most of the computation takes place in the domain of $f$, which in the case of $\mathbb{R}^2$ we can think of as a topographic map of the graph of $f$. Having decided on $\mathbf{x}_0$, a starting point, we move in the direction of the gradient vector at $\mathbf{x}_0$ by a certain distance to a new point $\mathbf{x}_1$:
$$\mathbf{x}_1 = \mathbf{x}_0 + h_0 \nabla f(\mathbf{x}_0), \quad\text{where } h_0 > 0.$$
To search for a local minimum value, we think of pursuing the path of steepest descent by making $h_n < 0$. This choice will tend to move us downhill in the direction opposite to the gradient direction.
The remaining question is how the numerical factors $h_n$ are to be chosen. The simplest choice is to make all $h_n$ the same; this choice has the consequence that as
we approach the location of a maximum value for $f$, where the gradient is zero, continuity of $\nabla f$ will cause the vectors $h\nabla f(\mathbf{x}_n)$ to get shorter and shorter. This is desirable, for otherwise we face the danger of taking a big step right past the extreme point. The following routine summarizes the method. From a different point of view what we're doing here is finding approximate solutions to a system of two differential equations, namely
$$\frac{dx}{dt} = f_x(x, y), \qquad \frac{dy}{dt} = f_y(x, y).$$
At the end of the section we describe a method for improving the accuracy of the process by varying $h$.
DEFINE f(x, y) = 1 - x^2 - 2y^2
DEFINE fx(x, y) = -2x          (x-coordinate of ∇f(x, y).)
DEFINE fy(x, y) = -4y          (y-coordinate of ∇f(x, y).)
SET h = 0.4
INPUT (x, y)                   (Starting point.)
DO
    SET (x1, y1) = (x, y)      (Keep for later use.)
                               (Next compute new position.)
    SET (x, y) = (x + h fx(x, y), y + h fy(x, y))
    PRINT x, y, f(x, y)
LOOP UNTIL |f(x, y) - f(x1, y1)| < ε
                               (Stop if change in f is < ε.)
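The routine translates almost line for line into a runnable sketch. The function name `steepest_ascent` and the safety cap `max_steps` are additions made here for illustration and are not part of the routine in the text; the function $f$, the step size $h = 0.4$, and the stopping rule are the ones given above.

```python
def f(x, y):
    return 1 - x**2 - 2 * y**2


def grad_f(x, y):
    # (fx, fy), the two coordinates of the gradient of f
    return (-2 * x, -4 * y)


def steepest_ascent(f, grad, start, h=0.4, eps=0.001, max_steps=1000):
    """Fixed-step steepest ascent; returns the list of (x, y, f(x, y)) visited.

    max_steps is a hypothetical safeguard against non-convergence.
    """
    x, y = start
    path = []
    for _ in range(max_steps):
        x_old, y_old = x, y                       # keep for later use
        gx, gy = grad(x_old, y_old)
        x, y = x_old + h * gx, y_old + h * gy     # new position
        path.append((x, y, f(x, y)))
        if abs(f(x, y) - f(x_old, y_old)) < eps:  # change in f is < eps
            break
    return path


path = steepest_ascent(f, grad_f, (0.5, 0.5))
for x, y, value in path:
    print(f"{x: .6f} {y: .6f} {value:.6f}")
```

With this $f$ and $h = 0.4$, each step multiplies $x$ by $0.2$ and $y$ by $-0.6$, so the printed $y$-coordinates alternate in sign while shrinking toward the maximum at the origin.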
Figure 6.18 shows some curves on the graph of $f$ along with their projections into the $xy$-plane. Starting with $x_0 = y_0 = 0.5$ and $\epsilon = 0.001$ the output of numbers $x_n$, $y_n$ and $f(x_n, y_n)$ from the next to last line of the program would be as follows:
x    y    f(x, y)
Note that the step size $h = 0.4$ is fairly large; this choice results in repeatedly overshooting the origin, as evidenced by the alternating sign in the $y$-coordinates. A smaller choice, say $h = 0.1$, avoids this effect but requires more steps to achieve the same accuracy. With a very small, and very inefficient, step size the points $(x, y)$ would approximate a flow line of the gradient field that cuts perpendicularly across level curves of $f$ as shown in the picture that accompanies the routine.
FIGURE 6.18
The function chosen for this example, $f(x, y) = \sin^2 x + \sin^2 y$, has its critical points at the solutions of the equations
$$f_x(x, y) = 2\sin x\cos x = \sin 2x = 0, \qquad f_y(x, y) = 2\sin y\cos y = \sin 2y = 0.$$
The solutions are all of the form $(x, y) = (j\pi/2, k\pi/2)$, where $j$ and $k$ are integers. If $j = 2l$, $k = 2m$ are both even we get $f(l\pi, m\pi) = 0$ for a local minimum, and if $j = 2l + 1$, $k = 2m + 1$ we get $f((2l + 1)\pi/2, (2m + 1)\pi/2) = 2$ for a local maximum. Having this information before doing the computing allows us to experiment intelligently with different starting points and different sizes for $h$ to see what the results are. You're asked to do this in the exercises. The trick is to coordinate the choice for $h$ with the starting point; too small or too large a value for $h$ can require many steps to reach an acceptable approximation, or may lead to no convergence at all. In our example, starting at $(x, y) = (1.5, 1.5)$, computation with step size $h = 0.4$ and $\epsilon = 0.0001$ ends after three steps; note that $\pi/2 \approx 1.57079$:
x    y    f(x, y)
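The step count quoted for this example can be checked with a short sketch. The helper name `count_steps` and the cap `max_steps` are hypothetical; the function, starting point, step size, and tolerance are those stated in the text.

```python
import math


def f(x, y):
    return math.sin(x) ** 2 + math.sin(y) ** 2


def grad_f(x, y):
    # fx = 2 sin x cos x = sin 2x, and likewise for fy
    return (math.sin(2 * x), math.sin(2 * y))


def count_steps(start, h, eps, max_steps=10_000):
    """Return (number of steps taken, final point), or (None, point) if the
    stopping rule is never met within max_steps (a hypothetical safeguard)."""
    x, y = start
    for n in range(1, max_steps + 1):
        old_value = f(x, y)
        gx, gy = grad_f(x, y)
        x, y = x + h * gx, y + h * gy
        if abs(f(x, y) - old_value) < eps:
            return n, (x, y)
    return None, (x, y)


steps, point = count_steps((1.5, 1.5), h=0.4, eps=0.0001)
print(steps, point)
```

Changing `h` or the starting point in the last call is one way to carry out the experiments suggested above.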
In the preceding example we were able to start reasonably close to the extreme point at $(x, y) = (\pi/2, \pi/2)$ because we already knew the exact location of the point. If we always had that kind of information there would be no need for numerical methods at all. In practice, we need to make educated guesses about the location of extreme points. If the domain of the real-valued function $f$ under consideration is two-dimensional, making a computer-aided graph of $f$ may be helpful. Failing that, Newton's method for approximate root location may be helpful, as described in Chapter 5, Section 5.
The preceding program outline is crude in that it forces you to stick rigidly with
a single step size h throughout the computation. This rigidity is clearly a defect
when the approximations $\mathbf{x}_n$ are getting close to a critical point, causing repeated overshooting from side to side. What we need is a method for automatically choosing the step size and adjusting it as the process proceeds. One way to do this is to think of replacing $f(\mathbf{x} + \mathbf{u})$, where $\mathbf{u} = h\nabla f(\mathbf{x})$, by its second-degree Taylor expansion
$$f(\mathbf{x}) + \frac{\partial f}{\partial \mathbf{u}}(\mathbf{x}) + \frac{1}{2}\frac{\partial^2 f}{\partial \mathbf{u}^2}(\mathbf{x}).$$
In dimension 2, with u = (u , v), the middle term is
We now maximize this function, which is possible since it's parabolic with respect
to the variable h, and has a maximum if the coefficient of h 2 is negative. Setting the
derivative with respect to h equal to zero, we find the critical value of h to be
$$h = -\frac{f_x^2(x, y) + f_y^2(x, y)}{f_{xx}(x, y)f_x^2(x, y) + 2f_{xy}(x, y)f_x(x, y)f_y(x, y) + f_{yy}(x, y)f_y^2(x, y)}.$$
Replacing the line that sets $h = 0.4$ in the earlier program by this more complicated expression requires earlier insertion into the program of definitions for the second derivative functions $f_{xx}(x, y)$, $f_{xy}(x, y)$, and $f_{yy}(x, y)$.
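As a sketch of that replacement, the step size can be computed by a small helper that takes the first and second partials already evaluated at the current point. The name `optimal_step` is hypothetical. The quotient is the negative of $f_x^2 + f_y^2$ divided by $f_{xx}f_x^2 + 2f_{xy}f_xf_y + f_{yy}f_y^2$.

```python
def optimal_step(fx, fy, fxx, fxy, fyy):
    """Step size h that makes the second-degree Taylor approximation
    stationary along the gradient direction (hypothetical helper)."""
    numerator = fx**2 + fy**2
    denominator = fxx * fx**2 + 2 * fxy * fx * fy + fyy * fy**2
    if denominator == 0:
        raise ZeroDivisionError("second directional derivative vanishes")
    return -numerator / denominator


# Example: f(x, y) = 1 - x^2 - 2y^2 at the point (0.5, 0.5),
# where fx = -2x, fy = -4y, fxx = -2, fxy = 0, fyy = -4.
x, y = 0.5, 0.5
h = optimal_step(-2 * x, -4 * y, -2.0, 0.0, -4.0)
print(h)
```

For this function the denominator is negative everywhere except at the origin, so the computed $h$ is positive, as expected when searching for a maximum.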
Note that the second derivative that appears in the denominator of the expression for $h$ will typically be negative if we're looking for a maximum, so $h$ will be positive. Correspondingly, in looking for a minimum $h$ will be negative. The applet ASCENT/DESCENT implements the method for fixed $h$, and ASCENT/DESCENT+ does the same for the variable $h$ method. Both applets are available at http://math.dartmouth.edu/~rewn/. The former program is simpler to apply because it requires the user to calculate only first partial derivatives, so you'd resort to the automatic step-size version only if the simpler version fails.
EXERCISES
has a simple geometric description. The image under $P$ of a point $(r, \theta)$ is the point $\mathbf{x} = (x, y)$ whose distance from the origin is $r$ and such that the angle from the positive $x$ axis to $\mathbf{x}$ in the counterclockwise direction is $\theta$. See Figure 6.19.

FIGURE 6.19

The image of $P$ consists of all of $\mathbb{R}^2$ except for the origin, so for any point $(x, y)$ in $\mathbb{R}^2$ there are numbers $r$ and $\theta$, called polar coordinates of $\mathbf{x}$, such that

5.1    $x = r\cos\theta$ and $y = r\sin\theta$.
For two points $(r_1, \theta_1)$ and $(r_2, \theta_2)$ in the domain of $P$, the equations
$$r_1\cos\theta_1 = r_2\cos\theta_2, \qquad r_1\sin\theta_1 = r_2\sin\theta_2$$
hold whenever $r_1 = r_2$ and $\theta_1 = \theta_2 + 2\pi m$ for some integer $m$. Hence the polar coordinates of a point $(x, y)$ in $\mathbb{R}^2$ are not uniquely specified without some restrictions on $r$ and $\theta$. However if $(x, y) \neq (0, 0)$ the polar coordinates of $(x, y)$ are uniquely specified up to an integer multiple of $2\pi$ in the $\theta$-coordinate. To see this, square both sides of the two displayed equations and add to get $r_1^2 = r_2^2$. Assuming $r > 0$ we conclude that $r_1 = r_2$. But then $\cos\theta_1 = \cos\theta_2$ and $\sin\theta_1 = \sin\theta_2$, so $(\cos\theta_1, \sin\theta_1)$ and $(\cos\theta_2, \sin\theta_2)$ represent the same point on a circle of radius 1 centered at the origin in $\mathbb{R}^2$. Hence $\theta_1 = \theta_2 + 2\pi m$ for some integer $m$.
The preceding paragraph says that $P$ is not one-to-one, but that it becomes so if its domain is restricted to be a subset of a rectangular half-strip in the $r\theta$-plane defined by inequalities
$$0 < r < \infty, \qquad \theta_0 \le \theta < \theta_0 + 2\pi.$$
So restricted, $P$ does have an inverse function, and we can find some partial formulas for the inverse by solving the equations $x = r\cos\theta$, $y = r\sin\theta$ for $r$ and $\theta$. We obtain, for $x \neq 0$,
$$r = \sqrt{x^2 + y^2}, \qquad \theta = \arctan\frac{y}{x} + k\pi.$$
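How the integer $k$ gets chosen can be sketched in code. The function names are hypothetical, and the range $-\pi < \theta \le \pi$ is one possible choice of half-strip, made here only for illustration.

```python
import math


def polar_to_xy(r, theta):
    return (r * math.cos(theta), r * math.sin(theta))


def xy_to_polar(x, y):
    """Polar coordinates with r > 0 and -pi < theta <= pi, for x != 0."""
    if x == 0:
        raise ValueError("arctan(y/x) is undefined for x = 0")
    r = math.hypot(x, y)
    theta = math.atan(y / x)   # principal branch, in (-pi/2, pi/2)
    if x < 0:
        # k = +1 in the second quadrant, k = -1 in the third quadrant
        theta += math.pi if y >= 0 else -math.pi
    return (r, theta)


for point in [(1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]:
    r, theta = xy_to_polar(*point)
    print(point, (r, theta), polar_to_xy(r, theta))
```

For $x > 0$ no correction is needed ($k = 0$), which is the case covered by the principal branch of the arctangent alone.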
We have used the common convention of restricting an inverse trigonometric function to the principal branch of the corresponding multiple-valued function. Hence the image of arctan is the interval $-\pi/2 < \theta < \pi/2$. It follows that the function defined by
$$\begin{pmatrix} r\\ \theta \end{pmatrix} = \begin{pmatrix} \sqrt{x^2 + y^2}\\ \arctan(y/x) \end{pmatrix}, \qquad x > 0,$$
is the inverse of the restriction of $P$ by $0 < r < \infty$ and $-\pi/2 < \theta < \pi/2$. Similarly
EXAMPLE 1 We have not defined polar coordinates for the origin of the $xy$-plane simply because
so the one-to-one requirement fails at the origin. This failure causes no real difficulty; for example, the equation in rectangular coordinates of the lemniscate,
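5B Spherical Coordinates
Consider the function $\mathcal{U}^3 \xrightarrow{S} \mathbb{R}^3$, defined by

5.2
$$S\begin{pmatrix} r\\ \phi\\ \theta \end{pmatrix} = \begin{pmatrix} r\sin\phi\cos\theta\\ r\sin\phi\sin\theta\\ r\cos\phi \end{pmatrix}, \qquad \begin{aligned} &0 < r < \infty\\ &0 < \phi < \pi\\ &0 \le \theta < 2\pi. \end{aligned}$$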
Here, for simplicity, we have restricted the domain of $S$ from the outset so that $S$ is one-to-one. Its range is all of $\mathbb{R}^3$ with the exception of the $z$-axis. Hence it assigns spherical coordinates $(r, \phi, \theta)$ to every point of $\mathbb{R}^3$ except those on the $z$-axis. As with polar coordinates in the plane, the spherical coordinates $(r, \phi, \theta)$ of a point $\mathbf{x} = (x, y, z)$ have a simple geometric interpretation. See Figure 6.21(b). The number $r$ is the distance from $\mathbf{x}$ to the origin. The coordinate $\phi$ is the angle in radians between the vector $\mathbf{x}$ and the positive $z$-axis. Finally, $\theta$ is the angle in radians from the positive $x$-axis to the projected image $(x, y, 0)$ of $\mathbf{x}$ on the $xy$-plane. The symbols $\phi$ and $\theta$ are sometimes interchanged, particularly in physical applications.
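We can compute an explicit expression for the inverse function, which we denote by $S^{-1}$, by solving the equations
$$x = r\sin\phi\cos\theta, \qquad y = r\sin\phi\sin\theta, \qquad z = r\cos\phi,$$
for $r$, $\theta$, and $\phi$. We get, for $y \ge 0$,
$$\begin{pmatrix} r\\ \phi\\ \theta \end{pmatrix} = S^{-1}\begin{pmatrix} x\\ y\\ z \end{pmatrix} = \begin{pmatrix} \sqrt{x^2 + y^2 + z^2}\\[4pt] \arccos\dfrac{z}{\sqrt{x^2 + y^2 + z^2}}\\[8pt] \arccos\dfrac{x}{\sqrt{x^2 + y^2}} \end{pmatrix}, \qquad x^2 + y^2 > 0.$$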
Since the image of the principal branch of the arccosine function is the interval $0 \le \theta \le \pi$, this function is actually the inverse of the function obtained by restricting the domain of $S$ by the further condition $0 \le \theta \le \pi$. To get values of $\theta$ in the interval $\pi < \theta < 2\pi$, corresponding to $y < 0$, we subtract the third coordinate in the preceding formula from $2\pi$. Note that when $\phi = \pi/2$ the spherical coordinate transformation reduces to $x = r\cos\theta$, $y = r\sin\theta$, $z = 0$; this amounts to changing to polar coordinates in the plane $z = 0$, that is in the $xy$-plane in $\mathbb{R}^3$.

FIGURE 6.21 (a) (b)
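The transformation $S$ and its inverse can be sketched as a pair of functions and checked by a round trip. The function names are hypothetical. For $y < 0$ this sketch takes $\theta = 2\pi - \arccos\bigl(x/\sqrt{x^2 + y^2}\bigr)$, a choice made here so that $\theta$ lands in the interval $\pi < \theta < 2\pi$ and the round trip reproduces the original coordinates.

```python
import math


def spherical_to_xyz(r, phi, theta):
    return (
        r * math.sin(phi) * math.cos(theta),
        r * math.sin(phi) * math.sin(theta),
        r * math.cos(phi),
    )


def xyz_to_spherical(x, y, z):
    """Inverse of spherical_to_xyz for points off the z-axis."""
    rho = math.hypot(x, y)              # distance from the z-axis
    if rho == 0:
        raise ValueError("spherical coordinates are not defined on the z-axis")
    r = math.sqrt(x * x + y * y + z * z)
    phi = math.acos(z / r)
    # Clamp to [-1, 1] to guard against rounding just outside the domain.
    theta = math.acos(max(-1.0, min(1.0, x / rho)))
    if y < 0:
        theta = 2 * math.pi - theta     # move theta into (pi, 2*pi)
    return (r, phi, theta)


original = (2.0, 1.0, 4.0)              # theta = 4.0 corresponds to y < 0
recovered = xyz_to_spherical(*spherical_to_xyz(*original))
print(original, recovered)
```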
EXAMPLE 2 Three surfaces in $\mathbb{R}^3$ defined by spherical coordinate equations $r = 1$, $\phi = \pi/4$, and $\theta = \pi/3$, respectively, are shown in Figure 6.22. The corresponding rectangular coordinate equations derived from the preceding expressions for $S^{-1}$ are respectively
$$x^2 + y^2 + z^2 = 1, \text{ with } x^2 + y^2 > 0$$

FIGURE 6.22 Coordinate curves labeled $\phi = \pi/4$, $\theta = \pi/3$ ($r$ varies); ($\theta$ varies); $\theta = \pi/3$, $r = 1$ ($\phi$ varies).
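5C Cylindrical Coordinates
The coordinate transformation is defined by

5.3
$$\begin{pmatrix} x\\ y\\ z \end{pmatrix} = \begin{pmatrix} r\cos\theta\\ r\sin\theta\\ z \end{pmatrix}, \qquad \begin{aligned} &0 < r < \infty\\ &-\pi < \theta \le \pi\\ &-\infty < z < \infty. \end{aligned}$$
The coordinates $(r, \theta, z)$ are obtained by a straightforward extension to $\mathbb{R}^3$ of polar coordinates in $\mathbb{R}^2$. Figure 6.23 shows the effect of varying each of the three coordinates.

FIGURE 6.23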
5D Jacobian Matrices
The name "curvilinear" is applied to coordinates for the reason that if all but one of the nonrectangular coordinates are held fixed and the remaining one is varied, the coordinate transformation defines a curve in $\mathbb{R}^n$. Thus in plane polar coordinates the coordinate curves are circles and straight lines, as shown in Figure 6.24(b). For spherical coordinates, typical coordinate curves are the circle, semi-circle, and half-line obtained as intersections of the pairs of surfaces shown in Figure 6.22. The curves and surfaces obtained by varying one or more curvilinear coordinate variables play the same role that the natural coordinate lines and planes of $\mathbb{R}^n$ do. For example, to say that a point in $\mathbb{R}^3$ has rectangular coordinates $(x, y, z) = (1, 2, 1)$ is to say that it lies at the intersection of the coordinate planes $x = 1$, $y = 2$, and $z = 1$. Similarly, saying that a point in $\mathbb{R}^3$ has spherical coordinates $(r, \phi, \theta) = (1, \pi/4, \pi/3)$ is to say that the point lies at the intersection of the surfaces shown in Figure 6.22.

FIGURE 6.24 (a) (b)
Generalizing from the preceding examples, we see that a system of curvilinear coordinates in $\mathbb{R}^n$ is determined by a function $\mathcal{U}^n \xrightarrow{T} \mathbb{R}^n$. It's assumed that for some open subset $N$ in the domain of $T$, the restriction of $T$ to $N$ is one-to-one and therefore has an inverse $T^{-1}$. The curvilinear coordinates of a point $\mathbf{x}$ lying in the image set $T(N)$ are
respectively. These matrices, and, more generally, the derivative matrices of differentiable coordinate transformations, are called Jacobian matrices. We've seen in Chapter 5, Section 4 that the columns of these matrices have simple geometric interpretations; each column of a Jacobian matrix is obtained by differentiation of the coordinate functions with respect to a single variable, while holding the other variables fixed. This means that the $j$th column of the matrix represents a tangent vector to the curvilinear coordinate curve for which the $j$th coordinate is allowed to vary. That is, let the coordinate transformation be given by $\mathcal{U}^n \xrightarrow{T} \mathbb{R}^n$. Then the $j$th column of the matrix of the derivative $T'(\mathbf{u}_0)$ is a tangent vector, which we'll denote by $\mathbf{c}_j$, at $\mathbf{x}_0 = T(\mathbf{u}_0)$, to the curvilinear coordinate curve formed by allowing only the $j$th coordinate of $\mathbf{u}_0$ to vary. Tangent vectors are shown (with their initial points translated to the point $\mathbf{x}_0$) in Figure 6.25 for some polar, spherical, and cylindrical coordinate curves. The coordinates of the tangent vectors $\mathbf{c}_1, \dots, \mathbf{c}_n$ are rectangular coordinates, not curvilinear coordinates.
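The column interpretation can be illustrated numerically for polar coordinates. In this sketch the helper `jacobian_columns` is hypothetical; it approximates each column of $T'(\mathbf{u}_0)$ by varying one coordinate at a time with a central difference, and the result is compared with the tangent vectors $(\cos\theta, \sin\theta)$ and $(-r\sin\theta, r\cos\theta)$ obtained by differentiating $x = r\cos\theta$, $y = r\sin\theta$.

```python
import math


def polar(r, theta):
    return (r * math.cos(theta), r * math.sin(theta))


def jacobian_columns(T, u, step=1e-6):
    """Approximate the columns of T'(u): column j is the derivative of T
    with respect to coordinate j, the other coordinates held fixed."""
    columns = []
    for j in range(len(u)):
        up, down = list(u), list(u)
        up[j] += step
        down[j] -= step
        forward, backward = T(*up), T(*down)
        columns.append(
            tuple((a - b) / (2 * step) for a, b in zip(forward, backward))
        )
    return columns


r, theta = 2.0, math.pi / 4
c1, c2 = jacobian_columns(polar, (r, theta))
det = c1[0] * c2[1] - c2[0] * c1[1]   # determinant of the matrix [c1 c2]
print(c1, c2, det)
```

The vector `c1` is tangent to the half-line on which only $r$ varies, `c2` is tangent to the circle on which only $\theta$ varies, and the determinant comes out close to $r$.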
Remark. We can now see that the Jacobian matrix itself of a coordinate transformation is the matrix of a certain first-degree change of coordinates at each point. To see this, consider curvilinear coordinates in $\mathbb{R}^n$ given by $\mathbf{x} = T(\mathbf{u})$, where $\mathbf{u}$
FIGURE 6.25 (a) Polar (b) Spherical (c) Cylindrical
EXERCISES
In Exercises 1 to 4, make a sketch using $xy$ coordinates in $\mathbb{R}^2$ of the curves given in polar coordinates.
1. $r = 1$, $\pi \le \theta \le 3\pi/2$
2. $r = \theta$, $0 \le \theta \le \pi/2$
3. $r(\sin\theta - \cos\theta) = \pi/2$, $r > 0$, $\pi/2 \le \theta \le \pi$
4. $r = -\pi/2\cos\theta$, $\pi \le \theta \le 3\pi/2$
In Exercises 5 to 8, make a sketch using $xyz$ coordinates in $\mathbb{R}^3$ of the curves and surfaces given in spherical coordinates.
5. $r = 2$, $0 \le \theta \le \pi/4$, $\pi/4 \le \phi \le \pi/2$
6. $1 \le r \le 2$, $\theta = \pi/2$, $\phi = \pi/4$
7. $0 \le r \le 1$, $0 \le \theta \le \pi/2$, $\phi = \pi/4$
8. $0 \le r \le 1$, $\theta = \pi/4$, $0 \le \phi \le \pi/4$
9. Use cylindrical coordinates in $\mathbb{R}^3$ to describe the region defined in rectangular coordinates by $0 \le x$, $x^2 + y^2 \le 1$.
10. Let $(r, \theta)$ be polar coordinates in $\mathbb{R}^2$. The equation

$0 \le t \le \dfrac{\pi}{2}$,

describes a curve in $\mathcal{U}^2$. Sketch this curve, and sketch its image in $\mathbb{R}^2$ under the polar coordinate transformation.
11. Let $(r, \phi, \theta)$ be spherical coordinates in $\mathbb{R}^3$. The equation

$0 \le t \le \dfrac{\pi}{2}$,

determines a curve in $\mathbb{R}^3$ (as well as in the $r\phi\theta$ space $\mathcal{U}^3$). Sketch the curve in $\mathbb{R}^3$. [Suggestion: The curve lies on a sphere.]
12. Compute the determinants of the matrices 5.4, 5.5, and 5.6, and show that they are
(a) $\dfrac{\partial(x, y)}{\partial(r, \theta)} = r$
(b) $\dfrac{\partial(x, y, z)}{\partial(r, \phi, \theta)} = r^2\sin\phi$
(c) $\dfrac{\partial(x, y, z)}{\partial(r, \theta, z)} = r$
For which points $(x, y)$ in $\mathbb{R}^2$, or $(x, y, z)$ in $\mathbb{R}^3$, do the matrices 5.4, 5.5, and 5.6 fail to be invertible?
13. With $a$, $b$, $c$ positive, the equations
$$x = ar\sin\phi\cos\theta, \qquad y = br\sin\phi\sin\theta, \qquad z = cr\cos\phi, \qquad 0 < \phi < \pi,\ 0 \le \theta < 2\pi,$$
define ellipsoidal coordinates in $\mathbb{R}^3$. For $a = 1$, $b = c = 2$, sketch a typical example of each of the three kinds of coordinate surface.
14. Compute the coordinates of tangent vectors to the coordinate curves for the general ellipsoidal coordinates given in Exercise 13, when $a = b = 1$, $c = 2$, and $r = \tfrac{1}{2}$, $\phi = \theta = \pi/2$.
15. Let $r$, $\phi$, and $\theta$ be spherical coordinates in $\mathbb{R}^3$. The equation
determines a curve in $\mathbb{R}^3$. Compute the coordinates of a tangent vector to the curve.
16. Prove that in 3-dimensional spherical coordinates, the sphere $x_1^2 + x_2^2 + x_3^2 = 1$ has the equation $r = 1$.
Chapter 6 REVIEW
In Exercises 1 to 6, let $f(x, y, z) = x^2 + y^2 - z^2$, $g(u, v) = (\cos u, \sin u, v)$, $h(u, v) = (u + v, uv)$, and $K(x, y) = xy$. Find
1. $f'(x, y, z)$
2. $h'(u, v)$
3. $g'(\pi/3, \pi^2/36)$
4. $(g \circ h)'(\pi/6, \pi/6)$
5. $\partial K/\partial \mathbf{u}$ at $(1, \sqrt{3})$, where $\mathbf{u}$ is the unit vector in the direction of the gradient of $K$ at $(1, \sqrt{3})$
6. $\dfrac{\partial}{\partial x} f(K(x, y), K(x + y, x - y), x^2 + y^2)$
7. If the temperature at a point $(x, y, z)$ of a solid ball of radius 3 centered at $(0, 0, 0)$ is given by $T(x, y, z) = yz + zx + xy$, find the direction in which $T$ is increasing most rapidly at $(1, 1, 2)$.
8. Suppose $T(x, y, z) = Ke^{-c(x^2 + y^2 + z^2)}$ is the temperature at a point $(x, y, z)$ inside a solid ball $x^2 + y^2 + z^2 \le a^2$, where $K$ and $c$ are positive constants. Show that the surfaces of equal temperature (i.e., the isotherms) are spheres, and that the temperature gradients point toward the center of the ball. What is the magnitude of the temperature gradient on a sphere of radius $b < a$?
9. Define $F(x, y) = \left(x/\sqrt{x^2 + y^2},\ y/\sqrt{x^2 + y^2}\right)$ at points $(x, y) \neq (0, 0)$.
(a) Sketch the vector field defined by $F(x, y)$.
(b) Find the maximum rate of increase of $\sqrt{x^2 + y^2}$ at $(x, y) \neq (0, 0)$.
10. Let $x = u^2 - v^2$ and $y = u^2 + v^2$. Assume $z = f(x, y)$ has partial derivatives $f_x(0, 2) = 3$ and $f_y(0, 2) = 4$. Find $\partial z/\partial u$ and $\partial z/\partial v$ at $(u, v) = (1, 1)$.
11. A function $\mathbb{R}^n \xrightarrow{f} \mathbb{R}$ is said to be homogeneous of degree $m$ if $f(t\mathbf{x}) = t^m f(\mathbf{x})$ for all $t > 0$ and all $\mathbf{x}$ in the
domain of $f$. For example $f(x, y) = (x^3 + y^3)\cos(y/x)$ is homogeneous of degree 3. Show that if $f$ is differentiable and homogeneous of degree $m$, then $f(\mathbf{x}) = \frac{1}{m}\mathbf{x} \cdot \nabla f(\mathbf{x})$.
12. Let $f(z)$ be differentiable and $w = f(ax + by)$, $a$, $b$ constant. Show that $bw_x = aw_y$.
13. Let $h(z)$ be a real-valued function, differentiable for all real $z$. Define $u(x, t) = h(x - at)$, where $a$ is a constant.
15. Suppose that $f(u, v) = u^2v + uv^3$ and that $u$ and $v$ are differentiable functions of $s$ and $t$ with $u(2, 1) = -2$, $u_s(2, 1) = 3$, $v(2, 1) = 2$ and $v_s(2, 1) = -4$. Find $\dfrac{\partial f}{\partial s}(2, 1)$.
16. Define $F$ from the $uv$ plane to the $xy$ plane by
(a) Show that an image point $(x, y)$ necessarily satisfies $x > |y|$. Sketch the region $R$ in the $xy$ plane determined by $x > |y|$.
(b) Show that $F$ satisfies the hypotheses of the Inverse Function Theorem 2.3 in Section 2, and so has a continuously differentiable inverse defined in some neighborhood of each image point $(x, y)$ in $R$.
(c) Show that $F$ has a global inverse defined for all $(x, y)$ with $x > |y|$ by
$$u = \tfrac{1}{2}\ln\left(\tfrac{1}{2}(x + y)\right) + \tfrac{1}{2}\ln\left(\tfrac{1}{2}(x - y)\right), \qquad v = \tfrac{1}{2}\ln\left(\tfrac{1}{2}(x + y)\right) - \tfrac{1}{2}\ln\left(\tfrac{1}{2}(x - y)\right).$$
17. The pressure $P$, volume $V$ and temperature $T$ of a gas are related by $PV = 6T$.
(a) Show that it's impossible to have $P$, $V$, and $T$ all increase simultaneously with each one increasing at its own constant rate.
(b) Assume $T$ increases steadily at 2° per minute and $V$ at 3 cubic centimeters per minute. At $t = 0$, $T = 10°$ and $V = 8$ cubic centimeters. Find $dP/dt$ at $t = 0$.
(c) Using the same data as in part (b), find $dP/dt$ at $t = 30$.
18. The graphs of $x + yz = 0$ and $y + xz = 0$ intersect in a curve containing the point $(x, y, z) = (1, -1, 1)$ and parametrized by functions of the form $y = y(x)$, $z = z(x)$.
(a) Find $y'(1)$ and $z'(1)$ by implicit differentiation.
(b) Find a tangent vector to the curve at $(1, -1, 1)$.
19. A function $z = f(x, y)$ is defined implicitly by $x + y^2 + z^2 = 3z$. Find $z_x$ and $z_y$ at $(1, 1, 1)$.
$x^2 + y^2 - z^2 = 1$.
You are asked to find a tangent vector to $\gamma$ at $(1, 1, 1)$ in each of two different ways:
(a) Solve explicitly for $x = x(z)$ and $y = y(z)$ to find a parametric representation $(x, y, z) = (x(z), y(z), z)$. Then find a tangent vector.
(b) Use implicit differentiation to find $dx/dz$ and $dy/dz$ near $z = 1$. Then find a tangent vector.
22. The equations $x^2 + y^2 + z^2 = 6$, $x + y + z = 4$ define $z = z(x)$ and $y = y(x)$ as differentiable functions near $(x, y, z) = (1, 2, 1)$.
(a) Sketch the curve defined by the intersection of the two surfaces.
(b) Find $dz/dx$ and $dy/dx$ at the point $(x, y, z) = (1, 2, 1)$.
(c) Find a vector parallel to the line tangent to the curve in part (a) at the point $(1, 2, 1)$. [Hint: A parametric representation for the curve near $(1, 2, 1)$ is given by $(x, y(x), z(x))$.]
23. Let $f(x, y)$ be differentiable, and let $\mathbf{u} = (\cos\theta, \sin\theta)$.
(a) Show that the directional derivative $\partial f/\partial \mathbf{u}$ has a critical point as a function of $\theta$ whenever $\mathbf{u}$ and $\nabla f$ are parallel.
(b) How does the result of part (a) relate to Theorem 1.2 in Section 1?
24. Find a point on the sphere $x^2 + y^2 + z^2 = 28$ at which the tangent plane is parallel to the tangent plane to $xy + \ln z = 6$ at the point $(2, 3, 1)$.
25. The equations
$$\sin(x + u) - \cos(y + v) + xy - u + uv = 1$$
$$u + v + x + y - ux - vy = 5$$
implicitly define $x$ and $y$ as functions of $u$ and $v$. Find $x_u$ and $y_u$ at the point $(u, v, x, y) = (2, 1, -2, -1)$.
26. Let $f(x, y, z) = 3x^2z + 3xy - 6y^2 - 3z + 3z^3$.
(a) Find $\nabla f(x, y, z)$.
(b) Let $S$ be the level set of $f$ at 0. Find a unit vector that is perpendicular to $S$ at the point $(1, 1, 1)$.
(c) Find the tangent plane to $S$ at $(1, 1, 1)$.
27. The equations $xy + zt = -4$, $x^2 + y + z + t^2 = 8$ define $x$ and $z$ as differentiable functions of $y$ and $t$ near $(x, y, z, t) = (-2, 1, -1, 2)$. Find $\partial x/\partial y$ and $\partial z/\partial y$ at this point.
28. A point moves on a differentiable curve in the $xy$-plane in $\mathbb{R}^3$ so that its position at time $t$ is $(x(t), y(t), 0)$. A corresponding point on the graph of a differentiable function $z = f(x, y)$ is then at $(x(t), y(t), f(x(t), y(t)))$. Suppose that the velocity vector of the plane curve is given by the gradient field of $f(x, y)$, so that $(dx/dt, dy/dt) = (f_x(x, y), f_y(x, y))$ for a point $(x, y)$ on the curve.
(a) Show that, if $z(t) = f(x(t), y(t))$, then $dz/dt = |\nabla f(x, y)|^2$.
(b) Show that the point on the graph of $f(x, y)$ corresponding to $(x(t), y(t))$ has speed
$$v = |\nabla f(x, y)|\sqrt{1 + |\nabla f(x, y)|^2}.$$
29. Find the points where $f(x, y) = x^2 + y$ attains its maximum and minimum values on the subset of $\mathbb{R}^2$ where $x^2 + 2y^2 \le 1$.
30. Find the maximum value of $x + 2y$ subject to $x^2 + y^2 = 1$.
31. A rectangular box is to be constructed so that the length of one of its internal diagonals is 3 units. What is the maximum possible volume?
32. (a) It's clear that the minimum value of $f(x, y, z) = xyz$ subject to the conditions $0 \le x$, $0 \le y$, $0 \le z$ and $x + y + z = 1$ is zero. Find the maximum value.
(b) What if $xyz$ is replaced by $xy^2z^3$?
33. Let $u = f(r)$ and $r = \sqrt{x^2 + y^2}$. Show that
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = \frac{d^2 f}{dr^2} + \frac{1}{r}\frac{df}{dr} \quad\text{if } r \neq 0.$$
34. Let $f(x, y) = (x^2 - y^2, 2xy) = (u, v)$ and $g(u, v) = (e^u\cos v, e^u\sin v) = (s, t)$. Using the chain rule, show that $\partial s/\partial x = \partial t/\partial y$ and $\partial s/\partial y = -\partial t/\partial x$.
36. Find the maximum and minimum values of the function $f(x, y) = x^2 + 4xy + y^2$ on the region in $\mathbb{R}^2$ defined by $x^2 + y^2 \le 1$.
37. Check all critical points of $2xy + y^2 + 4y + 2x$ for local maximality and minimality.
38. Use Lagrange multipliers to find the point or points on the parabola $y = x^2$ closest to the point $(0, b)$, where $(0, b)$ is a fixed point on the $y$-axis. Note that the answer will depend on $b$.
39. Let $\mathbf{x}_0$ be a nonzero vector in $\mathbb{R}^n$. Define a real-valued function on $\mathbb{R}^n$ by $f(\mathbf{x}) = \mathbf{x}_0 \cdot \mathbf{x}$. Without calculating any derivatives, show that the maximum value of $f(\mathbf{x})$ for $\mathbf{x}$ restricted by $|\mathbf{x}| = 1$ is $|\mathbf{x}_0|$. What is the minimum value? Do these answers change if the restriction is replaced by $|\mathbf{x}| \le 1$?
40. Find the minimum value of $f(x, y) = 3x^2 + 2y^2 - 6x - 4y$ subject to the condition $x + 2y = 4$. What can you say about the maximum? Answer both questions if the condition is replaced by $x + 2y \le 4$.
41. Let $f(x, y) = x^2y - 2x - y$.
(a) Find all critical points of $f(x, y)$ in $\mathbb{R}^2$.
(b) Find the maximum and minimum values of $f(x, y)$ when $(x, y)$ is restricted to lie on the line segment joining the points $(0, 1)$ and $(1, 0)$.
42. Find the points on $x^2 + 2xy + 3y^2 = 14$ which are closest to and farthest from the origin.
43. (a) Find all critical points of the function $x^3 + 3xy + y^2$.
(b) Find the maximum value of the function in part (a) given that $(x, y)$ is restricted to the closed square $-1 \le x \le 1$, $-1 \le y \le 1$.
44. Find the maximum and minimum values of the function $g(x, y) = x^2 + x + y + y^2$ over the closed region $x^2 + y^2 \le 1$.
45. Suppose $(x_0, y_0)$ is a point in $\mathbb{R}^2$ with polar coordinates $(r_0, \theta_0)$. Show that the polar coordinate equation $r^2 - 2rr_0\cos(\theta - \theta_0) + r_0^2 = a^2$ represents the circle with radius $a$ and center $(x_0, y_0)$.
46. A cone with vertex at the origin in $\mathbb{R}^3$, and that intersects planes through the $z$-axis in two perpendicular lines, has a simple representation in terms of spherical $(r, \phi, \theta)$ coordinates. Find such a representation. Then find an
35. Find all critical points in IR.3 of the function f (x, y, z) = equation in terms of rectangular (x, y, z) coordinates for
xy + yz +xz . the same cone.
CHAPTER 7
MULTIPLE INTEGRATION
$$\int_a^b f(x)\,dx$$
as the signed area between the x-axis and the graph of f. We want to extend this
idea to the integral of a function $\mathbb{R}^2 \to \mathbb{R}$. Suppose that $f(x, y)$ is a function
$$\int_c^d f(x, y)\,dy$$
is meant simply the definite integral of the function of one variable obtained by
holding x fixed; for example,
$$\int_0^2 x^3y^2\,dy = \left[\frac{x^3y^3}{3}\right]_{y=0}^{y=2} = \frac{8}{3}x^3.$$
As this example shows, if the integral exists, it depends on x. Thus we may set
$$F(x) = \int_c^d f(x, y)\,dy$$
and form the iterated integral in the order first y, then x:
$$\int_a^b F(x)\,dx = \int_a^b\left[\int_c^d f(x, y)\,dy\right]dx.$$
To interpret the iterated integral, look at Figure 7.1(a). For a fixed value of x, the
integral with respect to y is the area of the shaded region, which we have called
F(x). Then we can interpret the iterated integral, which is the integral of the area
function F(x) with respect to x, as the volume of the region between the graph of
f and the rectangle $a \le x \le b$, $c \le y \le d$. If f assumes negative values there, we
interpret the integral as a signed volume.
FIGURE 7.1 (a) Cross section of area F(x) under the graph of f. (b) Cross section of area G(y) under the graph of f.
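We can also define an iterated integral in the opposite order. We set
$$G(y) = \int_a^b f(x, y)\,dx$$
and form
$$\int_c^d G(y)\,dy = \int_c^d\left[\int_a^b f(x, y)\,dx\right]dy.$$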
We may also interpret this last integral as the signed volume under the graph of f,
as suggested by Figure 7.l(b). Intuitively we expect the two iterated integrals to be
equal, and this will follow from Theorem 2.2 in the next section.
A common notational convention, which we'll sometimes use, is to omit the
brackets and write the previous iterated integral as
$$\int_c^d dy\int_a^b f(x, y)\,dx.$$
This alternative notation has the advantage of emphasizing which variable goes with
which integral sign, namely, x with $\int_a^b$ and y with $\int_c^d$.
EXAMPLE 1 Consider $f(x, y) = x^2 + y$, defined on the rectangular region
$0 \le x \le 1$, $1 \le y \le 2$.
$$\int_0^1 dx\int_1^2 (x^2 + y)\,dy = \int_0^1\left[x^2y + \frac{y^2}{2}\right]_{y=1}^{y=2}dx = \int_0^1\left[(2x^2 + 2) - \left(x^2 + \tfrac{1}{2}\right)\right]dx = \int_0^1\left(x^2 + \tfrac{3}{2}\right)dx = \frac{1}{3} + \frac{3}{2} = \frac{11}{6}.$$
$$\int_1^2 (x^2 + y)\,dy = x^2 + \frac{3}{2}$$
is the area of the shaded cross section. It is customary to interpret the definite integral
of an area-valued function as volume. Thus we can regard the iterated integral
$$\int_0^1 dx\int_1^2 (x^2 + y)\,dy = \frac{11}{6}$$
as the volume of the 3-dimensional region lying below the surface and above the
rectangle $0 \le x \le 1$, $1 \le y \le 2$.
$$\int_1^2 dy\int_0^1 (x^2 + y)\,dx = \int_1^2\left[\frac{x^3}{3} + yx\right]_{x=0}^{x=1}dy = \int_1^2\left(\tfrac{1}{3} + y\right)dy = \left[\frac{y}{3} + \frac{y^2}{2}\right]_1^2 = \left(\tfrac{2}{3} + 2\right) - \left(\tfrac{1}{3} + \tfrac{1}{2}\right) = \frac{11}{6}.$$
This time
$$\int_0^1 (x^2 + y)\,dx = \frac{1}{3} + y$$
is the area of a cross section parallel to the xz-plane. See Figure 7.2(b). The second
integral again gives the volume of the 3-dimensional region lying below the surface
$z = x^2 + y$ and above the rectangle $0 \le x \le 1$, $1 \le y \le 2$. So it isn't surprising that
the two iterated integrals of Examples 1 and 2 are equal.
FIGURE 7.2 (a) (b)
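The equality of the two orders of integration can also be checked numerically. The following is only a sketch, not part of the text's development: the helper name `midpoint_rule` and the choice of 200 sample points are assumptions made for illustration. It approximates each one-variable integral by a midpoint sum and nests the sums in both orders for $f(x, y) = x^2 + y$ on $0 \le x \le 1$, $1 \le y \le 2$.

```python
def midpoint_rule(g, lo, hi, n=200):
    """Approximate the integral of g over [lo, hi] using n midpoint samples."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def f(x, y):
    return x * x + y


def integrate_y_then_x():
    # F(x) is the cross-section area for fixed x; integrate it over 0 <= x <= 1.
    return midpoint_rule(lambda x: midpoint_rule(lambda y: f(x, y), 1.0, 2.0), 0.0, 1.0)


def integrate_x_then_y():
    # G(y) is the cross-section area for fixed y; integrate it over 1 <= y <= 2.
    return midpoint_rule(lambda y: midpoint_rule(lambda x: f(x, y), 0.0, 1.0), 1.0, 2.0)


if __name__ == "__main__":
    print(integrate_y_then_x(), integrate_x_then_y(), 11 / 6)
```

Both nested sums use the same sample points, so they agree up to rounding, and both lie close to the exact value 11/6.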
1B Nonrectangular Regions
It is important to be able to integrate over subsets of the plane that are more general
than rectangles. In such problems, the limits in the first integration will depend on
the remaining variable.
$$= \int_0^1\left[x - x^3 + \frac{1 - 2x^2 + x^4}{2}\right]dx = \frac{31}{60}.$$
For each x between 0 and 1, the number y is between 0 and $1 - x^2$. In other
words, the point (x, y) runs along the line segment joining (x, 0) and $(x, 1 - x^2)$.
As x varies between 0 and 1, this line segment sweeps out the shaded region B
as shown in Figure 7.3(a). The integrand $f(x, y) = x + y$ has the graph shown in
Figure 7.3(b), and the iterated integral is the volume under the graph and above the
region B, shown in Figure 7.3(b).
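A short numerical sketch may help make the variable inner limit concrete. It assumes the region described above, with y running from 0 to $1 - x^2$ for each x between 0 and 1, and the integrand $x + y$; the function names are hypothetical, chosen only for this illustration.

```python
def midpoint_rule(g, lo, hi, n=200):
    """Approximate the integral of g over [lo, hi] using n midpoint samples."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def inner_integral(x):
    # For fixed x, y runs from 0 up to the curve y = 1 - x^2,
    # so the upper limit of the first integration depends on x.
    return midpoint_rule(lambda y: x + y, 0.0, 1.0 - x * x)


def integral_over_region():
    # The outer integration sweeps the vertical segment across 0 <= x <= 1.
    return midpoint_rule(inner_integral, 0.0, 1.0)


if __name__ == "__main__":
    print(integral_over_region(), 31 / 60)
```

The approximation should agree with the value 31/60 to several decimal places.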
Suppose we are given an iterated integral over a plane region B in which the integrand
is the constant function f defined by $f(x, y) = 1$, for all (x, y) in B. The integral
may then be interpreted either as the volume of the slab of unit thickness and with
base B or simply as the area of B. For example,
$$\int_0^1 dx\int_0^{1-x^2} dy = \frac{2}{3}$$
is the area of the region B shown in Figure 7.3(a). The integral also represents the
volume of the solid region of height 1 based on the plane region B and shown in
Figure 7.3(c).
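FIGURE 7.4 (a) Region between the graphs $y = u(x)$ and $y = v(x)$ for $a \le x \le b$, with iterated integral $\int_a^b dx\int_{v(x)}^{u(x)} f(x, y)\,dy$. (b) Region between the graphs $x = s(y)$ and $x = r(y)$ for $c \le y \le d$, with iterated integral $\int_c^d dy\int_{s(y)}^{r(y)} f(x, y)\,dx$.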
For example, to integrate over a region A between the graphs of $y = u(x)$ and
$y = v(x)$ for x between a and b, shown in Figure 7.4(a), we would choose the order
of integration to be first with respect to y, then x. On the other hand, the region B
in Figure 7.4(b) would naturally lead us to the opposite order: first x, then y.
EXAMPLE 5 Let f be defined by $f(x, y) = xy$ over the region B bounded by the vertical lines
$x = -1$ and $x = 2$ and by the graphs $y = 1 + x^2$ and $y = -x^2$, shown in
Figure 7.5(a). To find the iterated integral of f over B in the order first y, then
x, we think of holding x fixed somewhere between $-1$ and 2 and letting y vary
between $y = -x^2$ and $y = 1 + x^2$. Thus we get the single integral
$$\int_{-x^2}^{1+x^2} xy\,dy.$$
$$\int_{-1}^{2}\left[\int_{-x^2}^{1+x^2} xy\,dy\right]dx.$$
$$\int_{-1}^{2}\left[\int_{-x^2}^{1+x^2} xy\,dy\right]dx = \int_{-1}^{2}\left[\frac{xy^2}{2}\right]_{y=-x^2}^{y=1+x^2}dx$$
FIGURE 7.5 (a) (b)
The choice of the order of integration was dictated by noticing that integrating the
other way, first with respect to x, then y, leads to two separate integrals when y is
between 1 and 5 or between $-4$ and 0. See Figure 7.5(b).
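$$\int_{-1}^{1} dy\int_0^{\sqrt{1-y^2}} x\,dx = \int_{-1}^{1}\left[\tfrac{1}{2}x^2\right]_0^{\sqrt{1-y^2}}dy = \frac{1}{2}\int_{-1}^{1}(1 - y^2)\,dy = \frac{1}{2}\left[y - \tfrac{1}{3}y^3\right]_{-1}^{1} = \frac{1}{2}\left[\tfrac{2}{3} - \left(-\tfrac{2}{3}\right)\right] = \frac{2}{3}.$$
We can also integrate in the other order. We think of D as bounded above by the
graph of $y = \sqrt{1 - x^2}$ and below by the graph of $y = -\sqrt{1 - x^2}$. We first integrate
with respect to y between $-\sqrt{1 - x^2}$ and $\sqrt{1 - x^2}$, then with respect to x between
0 and 1, as indicated in Figure 7.6(b). The result is
$$\int_0^1 dx\int_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}} x\,dy = \int_0^1 x\,[y]_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}}\,dx = \int_0^1 2x\sqrt{1 - x^2}\,dx = \left[-\tfrac{2}{3}(1 - x^2)^{3/2}\right]_0^1 = \frac{2}{3}.$$
FIGURE 7.6 (a) (b)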
In this example the two orders of integration lead to computations of about equal
complexity. In practice it may happen that you get stuck using one order, so you try
the other.
EXAMPLE 7 Let f be defined by $f(x, y) = x^2y + xy^2$ over the region bounded by $y = |x|$, $y = 0$,
$x = -1$ and $x = 1$. See Figure 7.7. The two iterated integrals over the region are
$$\int_{-1}^{1} dx\int_0^{|x|}(x^2y + xy^2)\,dy$$
and
$$\int_0^1 dy\int_{-1}^{-y}(x^2y + xy^2)\,dx + \int_0^1 dy\int_{y}^{1}(x^2y + xy^2)\,dx.$$
The second integral breaks into two pieces because, for fixed y between 0 and 1, the
integration with respect to x is carried out over two separate intervals. Computation
of the integral is straightforward. We get
$$\int_0^1\left(\left[\frac{x^3y}{3} + \frac{x^2y^2}{2}\right]_{x=-1}^{x=-y} + \left[\frac{x^3y}{3} + \frac{x^2y^2}{2}\right]_{x=y}^{x=1}\right)dy = \int_0^1\tfrac{2}{3}(y - y^4)\,dy = \frac{1}{3} - \frac{2}{15} = \frac{1}{5}.$$
$$\int_{-1}^{1}\left[\frac{x^2y^2}{2} + \frac{xy^3}{3}\right]_{y=0}^{y=|x|}dx = \int_{-1}^{1}\left(\frac{x^4}{2} + \frac{x|x|^3}{3}\right)dx = \int_{-1}^{1}\frac{x^4}{2}\,dx + \int_{-1}^{1}\frac{x|x|^3}{3}\,dx.$$
The functions $x^4/2$ and $x|x|^3/3$ are even and odd, respectively. It follows that the
sum of the two integrals is
$$\frac{1}{2}\int_{-1}^{0}x^4\,dx - \frac{1}{3}\int_{-1}^{0}x^4\,dx + \frac{1}{2}\int_0^1 x^4\,dx + \frac{1}{3}\int_0^1 x^4\,dx = \int_0^1 x^4\,dx = \frac{1}{5}.$$
FIGURE 7.7
1C Higher Dimensions
Iterated integrals for functions defined on sets of dimension greater than 2 can also
be computed by repeated 1-dimensional integration.
Since x is held fixed during the integration with respect to z and y, the integral is
$$\int_0^1 x\,dx\int_0^{1-x}dy\int_0^{1-x-y}dz = \int_0^1 x\,dx\int_0^{1-x}[z]_0^{1-x-y}\,dy = \int_0^1 x\,dx\int_0^{1-x}(1 - x - y)\,dy = \int_0^1\left(\tfrac{1}{2}x - x^2 + \tfrac{1}{2}x^3\right)dx = \frac{1}{24}.$$
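The same repeated 1-dimensional integration can be imitated numerically. The sketch below is illustrative only: it assumes the region $0 \le z \le 1 - x - y$, $0 \le y \le 1 - x$, $0 \le x \le 1$ read off from the limits in the display above, and the names `midpoint_rule` and `tetrahedron_integral` are hypothetical.

```python
def midpoint_rule(g, lo, hi, n=40):
    """Approximate the integral of g over [lo, hi] using n midpoint samples."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def tetrahedron_integral():
    # Innermost: integrate 1 with respect to z from 0 to 1 - x - y.
    # Middle: integrate the result with respect to y from 0 to 1 - x.
    # Outermost: multiply by x (held fixed so far) and integrate over 0 <= x <= 1.
    def over_z(x, y):
        return midpoint_rule(lambda z: 1.0, 0.0, 1.0 - x - y)

    def over_y(x):
        return midpoint_rule(lambda y: over_z(x, y), 0.0, 1.0 - x)

    return midpoint_rule(lambda x: x * over_y(x), 0.0, 1.0)


if __name__ == "__main__":
    print(tetrahedron_integral(), 1 / 24)
```

With 40 sample points in each variable the result should be within about $10^{-4}$ of 1/24.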
r1 fl -x
= lo dx lo (1-x -y) dy
$$= \left[\frac{3}{8}x^4 - \frac{1}{5}x^5 - \frac{1}{12}x^6\right]_0^1 = \frac{3}{8} - \frac{1}{5} - \frac{1}{12} = \frac{11}{120}$$
FIGURE 7.8
We can interpret the first integration with respect to z as taking place along a vertical
segment joining the points (x, y, 0) and (x, y, x). The integration with respect to
y takes place for each fixed x between 0 and 1, and sweeps out a vertical rectangle
between the xy-plane and the plane $z = x$. [The graph of the integrand
$f(x, y, z) = x + y$ cannot be drawn in Figure 7.8(b); the sketch shown there is
the domain of f for the purposes of integration, but the graph of f would be
in $\mathbb{R}^4$.]
In the integration with respect to x, the vertical rectangle sweeps out the 3-dimensional
region between the planar top and bottom. We can interpret the value of the
integral as the total mass of a solid region with variable density equal to $f(x, y, z) = x + y$
at the point (x, y, z).
EXAMPLE 10 A solid column C of height h has circular cross-sections with radii that decrease
linearly from b at the base to a at the top. Our problem is to find its volume V(C).
We choose an x-interval from $x = 0$ to $x = h$ to coincide with the column's axis of
symmetry as shown in Figure 7.9(a). The radius and area of a cross-sectional disk
at distance x from the base are given, consistently with $r(0) = b$ and $r(h) = a$, by
$$r(x) = \left(\frac{a - b}{h}\right)x + b \quad\text{and}\quad A(x) = \pi r^2(x).$$
This formula for A(x) could have been the result of an additional, but unnecessary,
iterated integration with respect to y and z. In any case the volume of the column is
$$V(C) = \int_0^h A(x)\,dx = \pi\int_0^h\left[\left(\frac{a - b}{h}\right)x + b\right]^2dx.$$
We can compute this integral either by making the substitution $u = r(x)$ or by
squaring the bracketed expression for r(x). The result is
$$V(C) = \frac{\pi h}{3}\left(a^2 + ab + b^2\right).$$
Note that when $a = 0$ we get a formula for the volume of a right circular cone with
base radius b and height h: $V = \frac{1}{3}\pi b^2h$. If $a = b$ we get the volume of a cylinder.
FIGURE 7.9 (a) (b)
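As a hedged check on the slicing computation, the sketch below integrates the cross-sectional area $A(x) = \pi r^2(x)$ numerically and compares it with the standard closed form for a conical frustum, $\pi h(a^2 + ab + b^2)/3$. The function names and the sample dimensions are assumptions chosen for illustration.

```python
import math


def column_volume(a, b, h, n=1000):
    """Midpoint-rule approximation of the integral of A(x) = pi * r(x)^2 over 0 <= x <= h."""
    dx = h / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * dx
        r = (a - b) / h * x + b  # radius decreases linearly from b at x = 0 to a at x = h
        total += math.pi * r * r * dx
    return total


def frustum_formula(a, b, h):
    """Standard closed form for the volume of a conical frustum."""
    return math.pi * h * (a * a + a * b + b * b) / 3


if __name__ == "__main__":
    print(column_volume(1.0, 2.0, 3.0), frustum_formula(1.0, 2.0, 3.0))
```

Setting `a = 0` reproduces the cone volume $\frac{1}{3}\pi b^2h$, and setting `a = b` reproduces the cylinder volume $\pi b^2h$, matching the two special cases noted in the example.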
EXAMPLE 11 A solid torus B is generated by rotating a circular disk of radius a about a line l in
the same plane and at distance $c > a$ from the center of the disk. See Figure 7.9(b).
If B is sliced by a plane perpendicular to l the resulting intersection is a flat annular
region with inner radius of the form $r_0 = c - \rho$ and outer radius $r_1 = c + \rho$. Thus
the annulus will have area
$$\pi r_1^2 - \pi r_0^2 = \pi\left[(c + \rho)^2 - (c - \rho)^2\right] = 4\pi c\rho.$$
The increments $\pm\rho$ depend on the level at which the slice is made. At level z
measured along the horizontal axis having label l, we see that $\rho = \sqrt{a^2 - z^2}$. Hence
$A(z) = 4\pi c\sqrt{a^2 - z^2}$. To find V(B) we integrate A(z) from $-a$ to a:
$$V(B) = \int_{-a}^{a} A(z)\,dz = 4\pi c\int_{-a}^{a}\sqrt{a^2 - z^2}\,dz.$$
This last integral is usually computed using the substitution $z = a\sin u$. However
a moment's thought shows that the integral represents the area of a semicircle of
radius a, namely $\frac{1}{2}\pi a^2$. Hence $V(B) = (4\pi c)\left(\frac{1}{2}\pi a^2\right) = 2\pi^2ca^2$, $0 < a < c$.
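The torus volume can be checked in the same spirit. This sketch, with the hypothetical name `torus_volume` and arbitrarily chosen sample values of a and c, sums the annular areas $A(z) = 4\pi c\sqrt{a^2 - z^2}$ over thin slices between $z = -a$ and $z = a$.

```python
import math


def torus_volume(a, c, n=20000):
    """Midpoint-rule approximation of the integral of A(z) = 4*pi*c*sqrt(a^2 - z^2) over -a <= z <= a."""
    dz = 2 * a / n
    total = 0.0
    for k in range(n):
        z = -a + (k + 0.5) * dz  # midpoints stay strictly inside (-a, a)
        total += 4 * math.pi * c * math.sqrt(a * a - z * z) * dz
    return total


if __name__ == "__main__":
    a, c = 1.0, 3.0
    print(torus_volume(a, c), 2 * math.pi ** 2 * c * a ** 2)
```

Because the integrand has vertical tangents at $z = \pm a$, the midpoint sum converges more slowly than for a smooth integrand, which is why a fairly large number of slices is used here.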
EXERCISES
1. 1.~ 2
[l\x y2+xy3)dy]dx 14.1 1
-1
dx [fxldy r1(x+y+z)dz
lo lo
2 3 1
2. fo [/ Ix - 21 sinydx] dy 15. Evaluate the integral lorr sinxdx fo dy fo\x+y+z)dz.
I
f
3. /
0
[1° (x + y2)dy] dx
16. Evaluate the integral
Jo
dx ix dy lx+y dz ix dw.
-x -x-y -z
22. Let f be defined by $f(x, y, z) = 1$ on the hemisphere bounded by the plane $z = 0$ and the surface $z = \sqrt{1 - x^2 - y^2}$. Evaluate an iterated integral of f in some order over the region.
23. A solid region B has a circular base of radius a. Cross-sections by planes perpendicular to a fixed diameter of the circle are squares. Sketch B and find its volume.
26. Let f be defined by $f(x_1, \ldots, x_n) = x_1x_2\cdots x_n$ on the cube $0 \le x_1 \le 1$, $0 \le x_2 \le 1$, ..., $0 \le x_n \le 1$. Evaluate
$$\int_0^1 dx_1\int_0^1 dx_2\cdots\int_0^1 x_1x_2\cdots x_n\,dx_n.$$
27. Evaluate
$$\lim_{\max(\Delta x_i)\to 0}\sum_{k=0}^{K} f(x_k)\,\Delta x_k = \int_a^b f(x)\,dx$$
where $\Delta x_k$ is the distance between points of subdivision shown in Figure 7.10(a) and
$x_k$ is some point in the kth interval. The sum on the left is most simply interpreted
as the total area of a collection of rectangles. The purpose of this section is to extend
the definition of integral to functions with values f(x) where x is in some subset B
of $\mathbb{R}^n$ for $n \ge 2$.
The extension proceeds very naturally if we keep in mind the analog of Figure
7.10(a) shown in Figure 7.10(b) for the case $n = 2$. A segment with length $\Delta x$
is replaced by a rectangle in the xy-plane with dimensions $\Delta x$ and $\Delta y$, and the
rectangle area $f(x)\Delta x$ is replaced by a volume $f(x)\Delta x\Delta y$. The integral will then
be defined as a limit of sums of volumes.
Although integration over intervals is adequate for practically all purposes in
dimension 1, we need more general sets in $\mathbb{R}^n$. We first consider some simple sets
in $\mathbb{R}^n$. A closed coordinate rectangle is a subset of $\mathbb{R}^n$ consisting of all points
$x = (x_1, \ldots, x_n)$ that satisfy a set of inequalities
$$a_i \le x_i \le b_i, \quad i = 1, \ldots, n. \tag{1}$$
If, for some i in Formula (1), $a_i = b_i$, then R is called degenerate and $V(R) = 0$.
FIGURE 7.10 (a) (b)
For rectangles in $\mathbb{R}^2$, content is the same thing as area, and we often write A(R)
instead of V(R) to have the notation remind us of area rather than volume.
FIGURE 7.11
A subset B of $\mathbb{R}^n$ is called bounded if there is a real number k such that $|x| < k$
for all x in B. A finite set of $(n - 1)$-dimensional planes in $\mathbb{R}^n$ (lines in $\mathbb{R}^2$) parallel
to the coordinate planes will be called a grid. A grid separates $\mathbb{R}^n$ into a finite
number of closed, bounded rectangles $R_1, \ldots, R_r$ and a finite number of unbounded
regions. A grid covers a subset B of $\mathbb{R}^n$ if B is contained in the union of the
bounded rectangles $R_1, \ldots, R_r$, so a set B can be covered by a grid if and only if
B is bounded. As a measure of the fineness of a grid, we take the maximum of the
lengths of the edges of the rectangles $R_1, \ldots, R_r$. This number is called the mesh
of the grid. In Figure 7.12 the shadings are parts of planes that cut B3 .
Remark. When $n \ge 4$ a note is in order about the planes that form a grid. In the
space $\mathbb{R}^n$ each of the planes of dimension $n - 1$ that we use to form a grid will be
perpendicular to one of the coordinate axes. For example, planes with equations $x_1 = c$
consist of all points $(c, x_2, x_3, \ldots, x_n)$. Thus vectors of the form $(0, x_2, x_3, \ldots, x_n)$
joining two points in this plane will be perpendicular to a vector $(d, 0, 0, \ldots, 0)$ with
$d \ne 0$ that is parallel to the $x_1$ axis. Similar remarks apply to each of the other $n - 1$
types of plane with respective equation types $x_2 = c$, $x_3 = c$, ..., $x_n = c$. Each of
these planes has content zero in $\mathbb{R}^n$, and the volume of a box bounded by n pairs of
parallel planes is the product of the distances between the two planes in each pair.
We now give a definition of the multiple integral, called the Riemann integral
after Bernhard Riemann. Consider a function $f: \mathbb{R}^n \to \mathbb{R}$ and a set B such that
(a) B is a bounded subset of the domain of f.
(b) f is bounded on B.
Assertion (b) means that there exists a real number K such that $|f(x)| \le K$, for
all x in B. The multiple integral of f over B will be defined in terms of the function
$f_B$, which is f altered to be zero outside B, that is,
$$f_B(x) = \begin{cases} f(x), & \text{if } x \text{ is in } B \\ 0, & \text{if } x \text{ is not in } B. \end{cases}$$
Figure 7.13 shows the shaded graph of a function $f_B$ cut from a graph over the
first quadrant in $\mathbb{R}^2$. Let G be a grid that covers B and has mesh equal to m(G).
In each of the bounded rectangles $R_i$ formed by G, with $i = 1, \ldots, r$, choose an
arbitrary point $x_i$. The sum
$$\sum_{i=1}^{r} f_B(x_i)V(R_i)$$
is called a Riemann sum for f over B. Its value, for given f and B, depends on
G and $x_1, \ldots, x_r$. If, no matter how we choose grids G with mesh m(G) tending to
zero, it happens that
$$\lim_{m(G)\to 0}\sum_{i=1}^{r} f_B(x_i)V(R_i)$$
exists and is always the same number, then this limit is the integral of f over B
and is denoted by $\int_B f\,dV$. If the integral exists, f is said to be integrable over B.
FIGURE 7.13
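To see how a Riemann sum for $f_B$ behaves in practice, here is a minimal sketch for $n = 2$. Everything specific in it is an assumption made for illustration: the grid is uniform, the chosen point in each rectangle is its center, and the test case is $f(x, y) = x^2 + y^2$ on the unit disk, whose integral is known to be $\pi/2$.

```python
import math


def riemann_sum(f, in_B, x_lo, x_hi, y_lo, y_hi, n):
    """Riemann sum for f_B on an n-by-n uniform grid covering B.

    f_B equals f on B and 0 outside B, so rectangles whose chosen point
    falls outside B contribute nothing to the sum.
    """
    dx = (x_hi - x_lo) / n
    dy = (y_hi - y_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * dx  # chosen point: center of the rectangle
        for j in range(n):
            y = y_lo + (j + 0.5) * dy
            if in_B(x, y):
                total += f(x, y) * dx * dy
    return total


def disk_example(n):
    """Riemann sum for f(x, y) = x^2 + y^2 over the unit disk."""
    return riemann_sum(lambda x, y: x * x + y * y,
                       lambda x, y: x * x + y * y <= 1.0,
                       -1.0, 1.0, -1.0, 1.0, n)


if __name__ == "__main__":
    for n in (50, 100, 400):
        print(n, disk_example(n), math.pi / 2)
```

As the mesh $2/n$ shrinks, the sums should settle toward $\pi/2$; the remaining error comes mostly from rectangles that straddle the boundary of the disk.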
The limit that defines the multiple integral is somewhat different from the limit
of a vector function defined in Chapter 5, Section 1, although the idea behind it is
similar. The defining equation
$$\lim_{m(G)\to 0}\sum_{i=1}^{r} f_B(x_i)V(R_i) = \int_B f\,dV$$
means that, for any $\varepsilon > 0$, there exists $\delta > 0$ such that if G is any grid that covers
B and has mesh less than $\delta$, and S is an arbitrary Riemann sum for $f_B$ formed from
G, then
$$\left|S - \int_B f\,dV\right| < \varepsilon.$$
It should be emphasized that the integral is not defined for functions f and sets
B unless the boundedness conditions on f and on B are satisfied. Without these
conditions, even the Riemann sums may not be defined.
If f is a real-valued function of one real variable, that is, if $n = 1$, and if B is an
interval $a \le x \le b$, the Riemann integral of f over B is the familiar definite integral
$$\int_a^b f(x)\,dx.$$
$$\int_B f(x, y, z)\,dx\,dy\,dz, \quad\text{if } n = 3,$$
2B Existence
Multiple integrals are often computed by first rewriting them as iterated integrals,
which are then evaluated by repeated application of 1-dimensional integration techniques.
Even though they are too technical to prove here, it is nevertheless important
to have criteria for the existence of an integral $\int_B f(x)\,dV$. The criteria provided
below in Theorem 2.1 impose conditions (i) on the set of boundary points of B and
(ii) on the set of discontinuity points of f. Both (i) and (ii) require that the respective
sets be negligible in the following sense: A set S has zero content if $\int_S 1\,dV = 0$. For
example, finite sets of points, finite collections of smooth curves in $\mathbb{R}^2$ and $\mathbb{R}^3$, and
finite collections of smooth curves and surfaces in $\mathbb{R}^3$ all have zero content, though
we won't prove this.
2.1 Theorem. Let $f: \mathbb{R}^n \to \mathbb{R}$ be defined and bounded on a bounded set B such
that (i) the boundary of B has zero content and (ii) f is continuous except possibly
on a set of zero content. Then f is Riemann integrable over B.
EXAMPLE 1 We'll evaluate $\int_B (2x + y)\,dx\,dy$ directly from its definition, where B is the rectangle
$0 \le x \le 1$, $0 \le y \le 2$. Note that (i) the boundary of B consists of four line segments
having total content zero, and (ii) $f(x, y) = 2x + y$ is continuous everywhere. Thus
we know that f is integrable on B, so we can use an arbitrary sequence of Riemann
sums with mesh tending to 0 to evaluate the integral. For each $n = 1, 2, \ldots$, consider
the grid $G_n$ consisting of the lines
$$x = \frac{i}{n},\ i = 0, \ldots, n \quad\text{and}\quad y = \frac{j}{n},\ j = 0, \ldots, 2n.$$
See Figure 7.14(a). The mesh of $G_n$ is $1/n$, and the area of the rectangles $R_{ij}$ is
$1/n^2$. Setting $x_{ij} = (i/n, j/n)$, we form the Riemann sum
$$\sum_{i=1}^{n}\sum_{j=1}^{2n}\left(\frac{2i}{n} + \frac{j}{n}\right)\frac{1}{n^2} = \frac{1}{n^3}\left(4n\sum_{i=1}^{n} i + n\sum_{j=1}^{2n} j\right) = \frac{4n^2 + 3n}{n^2} = 4 + \frac{3}{n}.$$
Hence
$$\int_B (2x + y)\,dx\,dy = \lim_{n\to\infty}\left(4 + \frac{3}{n}\right) = 4.$$
FIGURE 7.14 (a) (b)
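The Riemann sums in this example can be computed exactly with rational arithmetic. The sketch below assumes, as an illustration, that the point chosen in each rectangle $R_{ij}$ is its upper right corner $(i/n, j/n)$, a choice that is consistent with the value $4 + 3/n$ obtained above; the function name `riemann_sum` is hypothetical.

```python
from fractions import Fraction


def riemann_sum(n):
    """Exact Riemann sum for f(x, y) = 2x + y on 0 <= x <= 1, 0 <= y <= 2.

    The grid lines are x = i/n and y = j/n, each rectangle has area 1/n^2,
    and f is sampled at the upper right corner (i/n, j/n) of each rectangle.
    """
    area = Fraction(1, n * n)
    total = Fraction(0)
    for i in range(1, n + 1):
        for j in range(1, 2 * n + 1):
            x, y = Fraction(i, n), Fraction(j, n)
            total += (2 * x + y) * area
    return total


if __name__ == "__main__":
    for n in (1, 2, 10, 100):
        print(n, riemann_sum(n), 4 + Fraction(3, n))
```

Each printed pair should agree exactly, and the sums approach 4 as n grows.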
Direct evaluation of a multiple integral would be very difficult for most functions
we want to integrate. Fortunately in many instances we can evaluate the multiple
integral by repeated application of ordinary 1-dimensional integration instead of by
finding the limits of Riemann sums. The pertinent theorem, which we don't prove,
is the following.
2.2 Theorem. Let B be a subset of $\mathbb{R}^n$ such that the iterated integral
exists over B. If, in addition, the multiple integral
$$\int_B f\,dV$$
exists, then the two integrals are equal.
Since the argument that proves Theorem 2.2 applies equally well to any order of
iterated integration, we have an immediate corollary:
2.3 Theorem. If $\int_B f\,dV$ exists and iterated integrals exist for some orders of
integration, then all these integrals are equal.
where $\Delta x_i$ and $\Delta y_j$ are the dimensions of the rectangle $R_{ij}$. Thus we can expect
these sums to tend to the respective integrals as the mesh of the grid tends to zero:
2C Double Integrals
Computing multiple integrals by iterated integration often requires us to describe
the region over which we are to integrate so we can make reasonable choices for
the order of integration and the limits of integration. We start with 2-dimensional
examples. A double integral is usually written
$$\iint_B f(x, y)\,dx\,dy \quad\text{or}\quad \int_B f(x, y)\,dx\,dy,$$
$$0 \le x \le 1, \quad 0 \le y \le 2.$$
This is the same integral that occurs in Example 1, but we evaluate it here by iterated
integration as follows.
$$\int_B (2x + y)\,dx\,dy = \int_0^1 dx\int_0^2 (2x + y)\,dy = \int_0^1\left[2xy + \tfrac{1}{2}y^2\right]_{y=0}^{y=2}dx = \int_0^1 (4x + 2)\,dx = \left[2x^2 + 2x\right]_0^1 = 4.$$
Integration in the other order, first with respect to x and then with respect to y,
would produce the same final result.
$$\int_B f(x, y)\,dx\,dy = \int_0^2 dx\int_0^{x^2} f(x, y)\,dy,$$
where the inner integral depends on x and represents the area of the shaded vertical slice shown in Figure
7.15(b). If we want to integrate first with respect to x, perhaps to make the indefinite
integrals easier to find, we hold y fixed with $0 \le y \le 4$ and note that then x satisfies
$\sqrt{y} \le x \le 2$. The double integral is then given by [see Figure 7.15(c)]
$$\int_B f(x, y)\,dx\,dy = \int_0^4 dy\int_{\sqrt{y}}^{2} f(x, y)\,dx.$$
For example, if $f(x, y) = xy$, we would have
$$\int_B xy\,dx\,dy = \int_0^4 dy\int_{\sqrt{y}}^{2} xy\,dx = \int_0^4\left[\tfrac{1}{2}x^2y\right]_{x=\sqrt{y}}^{x=2}dy = \int_0^4\left(2y - \tfrac{1}{2}y^2\right)dy = \left[y^2 - \tfrac{1}{6}y^3\right]_0^4 = 16 - \frac{32}{3} = \frac{16}{3}.$$
FIGURE 7.15 (a) (b) (c)
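Both ways of describing the region can be tried numerically. The following sketch assumes the region is $0 \le y \le x^2$, $0 \le x \le 2$, equivalently $\sqrt{y} \le x \le 2$, $0 \le y \le 4$, and uses hypothetical helper names; it evaluates the integral of $f(x, y) = xy$ in each order.

```python
import math


def midpoint_rule(g, lo, hi, n=200):
    """Approximate the integral of g over [lo, hi] using n midpoint samples."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def f(x, y):
    return x * y


def y_then_x():
    # Hold x fixed in [0, 2]; y runs from 0 up to the parabola y = x^2.
    return midpoint_rule(lambda x: midpoint_rule(lambda y: f(x, y), 0.0, x * x), 0.0, 2.0)


def x_then_y():
    # Hold y fixed in [0, 4]; x runs from sqrt(y) to 2.
    return midpoint_rule(lambda y: midpoint_rule(lambda x: f(x, y), math.sqrt(y), 2.0), 0.0, 4.0)


if __name__ == "__main__":
    print(y_then_x(), x_then_y(), 16 / 3)
```

The two approximations need not coincide digit for digit, since the sample points differ between the two orders, but both should be close to 16/3.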
The inequality $x^2 + y^2 \le 1$ defines a disk D in the xy-plane shown in Figure 7.16(a).
The volume above D and under the graph of $f(x, y) = x^2 + y^2$ is shown in
Figure 7.16(b). For each fixed x satisfying $-1 \le x \le 1$, we have y restricted
so that $-\sqrt{1 - x^2} \le y \le \sqrt{1 - x^2}$. Hence
$$\int_D (x^2 + y^2)\,dx\,dy = \int_{-1}^{1} dx\int_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}}(x^2 + y^2)\,dy = \frac{4}{3}\int_0^1\left(\sqrt{1 - x^2} + 2x^2\sqrt{1 - x^2}\right)dx = \frac{4}{3}\left(\frac{\pi}{4} + \frac{\pi}{8}\right) = \frac{\pi}{2}.$$
The indefinite integrals needed in the last step are in the Appendix and most standard
tables; they're evaluated by making the substitution $x = \sin\theta$.
FIGURE 7.16 (a) (b)
More complicated regions like those shown in Figure 7.17 are handled by cutting
them up into disjoint regions $C_k$ over each of which it is possible to compute an
iterated integral. Then use as often as necessary the equation
$$\int_{C_1\cup C_2} f(x, y)\,dx\,dy = \int_{C_1} f(x, y)\,dx\,dy + \int_{C_2} f(x, y)\,dx\,dy;$$
FIGURE 7.17 (a) (b)
2D Triple Integrals
A triple integral is usually written
$$\iiint_B f(x, y, z)\,dx\,dy\,dz \quad\text{or}\quad \int_B f(x, y, z)\,dx\,dy\,dz,$$
$$0 \le x \le 2, \quad 0 \le y \le 1, \quad 0 \le z \le 2,$$
we can sketch it as in Figure 7.18. The integral of $f(x, y, z) = xyz$ over R is then
computed as an iterated integral in any of the possible orders.
$$\int_R xyz\,dx\,dy\,dz = \int_0^2 dx\int_0^1 dy\int_0^2 xyz\,dz = \int_0^2 x\,dx\int_0^1 y\,dy\int_0^2 z\,dz$$
ball of radius 2 with center at the origin, shown in Figure 7.19. The integral $\int_B f\,dV$
equals the triple iterated integral of the function $f(x, y, z) = xyz$ over B. For fixed
x and y, the variable z runs from 0 to $\sqrt{4 - x^2 - y^2}$, which are the limits of the first
integration with respect to z. The result of this integration is a function of x and y
that must be integrated over the 2-dimensional subset obtained by projecting B on
the xy-plane, that is, over the region
$$x^2 + y^2 \le 4, \quad x \ge 0, \quad y \ge 0.$$
For fixed x, the variable y runs from 0 to $\sqrt{4 - x^2}$; hence these are the limits on
the integration with respect to y. Finally, x runs from 0 to 2, so we conclude that
$$\int_B f\,dV = \int_0^2 dx\int_0^{\sqrt{4-x^2}} dy\int_0^{\sqrt{4-x^2-y^2}} xyz\,dz.$$
FIGURE 7.19
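Then
$$\int_B f\,dV = \frac{1}{2}\int_0^2 x\,dx\int_0^{\sqrt{4-x^2}} y(4 - x^2 - y^2)\,dy = \frac{1}{2}\int_0^2 x\left(2(4 - x^2) - \frac{x^2}{2}(4 - x^2) - \frac{(4 - x^2)^2}{4}\right)dx$$
$$= \int_0^2\left(2x - x^3 + \tfrac{1}{8}x^5\right)dx = \frac{4}{3}.$$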
2E Content and Mass
If an integrand $f(x, y)$ is constantly 1 on a region R in $\mathbb{R}^2$, then the integral of f
over R can be interpreted as the volume of the solid B with base R and height 1.
An alternative interpretation is as the area of R. More generally, and more precisely,
if a set B in $\mathbb{R}^n$ satisfies the conditions of Theorem 2.1, we define the content of B
to be
$$V(B) = \int_B dV.$$
In case $n = 2$ we call this number the area of B and denote it by A(B). When $n \ge 3$
we speak of n-dimensional volume and retain the notation V(B), or else use $V_n(B)$
if the dimension isn't clear from the context. One virtue of this definition is that it
allows us to remove the ambiguity inherent in the multiple ways of computing via
iterated integrals. However, the definition is dependent on the choice of coordinates
used to describe B; this point is addressed in Section 4.
Referring back to the previous Example 4, the volume of the region B that lies
vertically between the disk D and the graph of $f(x, y) = x^2 + y^2$ is equal, by
Theorem 2.2 and our definition of V(B), to
$$V(B) = \int_B dV = \int_D\left[\int_0^{f(x,y)} dz\right]dx\,dy.$$
The integral with respect to z works out immediately to $f(x, y)$, so the value $\pi/2$
of the integral computed in Example 4 is assigned to V(B) without concern about
order of integration.
If $\mu(x)$ is nonnegative and integrable over B, then $\mu(x)$ can be interpreted as
the density at x of a mass distribution $\mu$. Then, assuming $V(B) > 0$, the integral
$$M(B) = \int_B \mu(x)\,dV$$
is called the total mass of the distribution $\mu$ on B. The type of density considered
here describes mass per volume unit, and in case $V(B) = 0$, the integral of an
integrable density $\mu$ over B will always be zero. Some alternative densities for
curves and surfaces with zero volume are described in Chapter 8, Section 2 and
Chapter 9, Section 3.
The importance of the multiple integral is due partly to the variety of interpretations
that stem from it. Content and total mass are conceptually two of the simplest;
others are discussed in subsequent sections.
EXERCISES
In Exercises 1 to 4, make a drawing of the set B and compute $\int_B f\,dx\,dy$.
1. $f(x, y) = x^2 + 3y^2$ and B is the disk $x^2 + y^2 \le 1$.
2. $f(x, y) = 1/(x + y)$ and B is the region bounded by the lines $y = x$, $x = 1$, $x = 2$, $y = 0$.
3. $f(x, y) = x\sin xy$ and B is the rectangle $0 \le x \le \pi$, $0 \le y \le 1$.
4. $f(x, y) = x^2 - y^2$ and B consists of all (x, y) such that $0 \le x \le 1$ and $x^2 - y^2 \ge 0$.
In Exercises 5 and 6, use the definition of the double integral as a limit of Riemann sums to compute the
$$\sum_{i=1}^{n} i^3 = \left(\sum_{i=1}^{n} i\right)^2$$
5. $f(x, y) = x + 4y$ and B is the rectangle $0 \le x \le 2$, $0 \le y \le 1$.
6. $f(x, y) = 3x^3 + 2y$ and B is the rectangle $0 \le x \le 2$, $0 \le y \le 1$.
In Exercises 7 to 10, find the volume under the graph of f and above the set B, where
7. $f(x, y) = x + y^2$ and B is the rectangle with corners (1, 1), (1, 3), (2, 3), and (2, 1).
8. $f(x, y) = x + y + 2$ and B is the region bounded by the curves $y^2 = x$, and $x = 2$.
9. $f(x, y) = |x + y|$ and B is the disk $x^2 + y^2 \le 1$.
10. $f(x, y) = x^2 + y^2$ and B is the square with corners at $(x, y) = (\pm 1, \pm 1)$.
11. Find by integration the area of the subset of $\mathbb{R}^2$ bounded by the curve
$$x^2 - 2x + 4y^2 - 8y + 1 = 0.$$
12. Given that $f(x, y, z) = xyz$ and that
14. Write an expression for the volume of the ball $x^2 + y^2 + z^2 \le a^2$
(a) as a triple integral.
(b) as a double integral.
15. Sketch in $\mathbb{R}^3$ the two cylindrical solids defined by $x^2 + z^2 \le 1$ and $y^2 + z^2 \le 1$, respectively. Find the volume of their intersection.
16. The 4-dimensional ball B of radius 1 and with center at the origin is the subset of $\mathbb{R}^4$ defined by $x_1^2 + x_2^2 + x_3^2 + x_4^2 \le 1$. Set up an expression for the volume V(B) as a fourfold iterated integral.
17. A hemispherical bowl of radius a contains liquid with maximum depth h. Find the volume of the liquid.
18. Cavalieri's principle as originally formulated in the 17th century states that two solids that have equal cross-sectional areas at the same height will have equal volumes.
(a) Assuming that the hypotheses of the principle hold for the two solids with square cross sections that follow, find the volume of the one below.
[Figure: two solids with square cross sections, (a) and (b)]
(b) Explain how Cavalieri's principle follows from Theorem 2.2 and the definition of area and volume.
19. A semicircular steel plate with a two foot radius has a concentric semicircle of 6-inch radius removed from its straight edge. If the steel has uniform density $\mu = 12$ pounds per square foot, find the total mass of the plate.
20. A rectangular 2-by-3 foot steel plate has been machined so that its density varies linearly from 10 pounds per square foot to 12 pounds per square foot as measured in the long direction. Find the total mass of the plate.
21. A column with circular cross section varying from diameter 12 inches to diameter 8 inches is 10 feet long. The density $\mu$ of the material in the column varies linearly along the length of the column from 50 pounds per cubic foot at the thick end to 40 pounds per cubic foot at the thin end. Find the total mass. See Figure 7.9(a).
3.1 Theorem. Linearity: If f and g are integrable over B and a and b are any
two real numbers, then $af + bg$ is integrable over B and
$$\int_B (af + bg)\,dV = a\int_B f\,dV + b\int_B g\,dV.$$
$$\int_B f\,dV \ge 0.$$
3.3 Theorem. If R is a rectangle, then $\int_R dV = V(R)$, where the content V(R)
is defined as the product of the lengths of the edges of R.
In the next theorem recall that $f_B(x)$ is defined to equal $f(x)$ for x in B and to
equal 0 for x not in B.
only if $\int_C f_B\,dV$ exists. Whenever both integrals exist, they are equal.
Proof of 3.1. Let $\varepsilon > 0$ be given, and choose $\delta > 0$ so that if $S_1$ and $S_2$ are
Riemann sums for $f_B$ and $g_B$ respectively whose grids have mesh less than $\delta$, then
Let S be any Riemann sum for $(af + bg)_B$ whose grid has mesh less than $\delta$. Then
$$= a\sum_i f_B(x_i)V(R_i) + b\sum_i g_B(x_i)V(R_i)$$
Hence
Thus
depends only on the function $f_B$. Similarly, $\int_C f_B\,dV$ is defined by using $(f_B)_C$,
which is equal to $f_B$. •
We can now prove the next two theorems directly.
3.5 Theorem. If f and g are integrable over B, and $f \le g$ on B, then
$$\int_B f\,dV \le \int_B g\,dV.$$
$$\left|\int_B f\,dV\right| \le \int_B |f|\,dV.$$
Proof. The function $g - f$ is nonnegative and, by Theorem 3.1, is integrable over
B. Hence, by Theorems 3.1 and 3.2,
from which the conclusion follows. The second part is left as Exercise 2. •
The next theorem establishes an analog for the equation
$$\int_{B_1\cup B_2} f\,dV = \int_{B_1} f\,dV + \int_{B_2} f\,dV.$$
Since $B_1$ and $B_2$ are disjoint, $f_{B_1\cup B_2} = f_{B_1} + f_{B_2}$. Hence, by Theorem 3.1, the
function $f_{B_1\cup B_2}$ is integrable over $B_1 \cup B_2$, and
3.7 Leibniz Rule. If $(\partial g/\partial y)(x, y)$ is continuous for $a \le x \le b$ and $c \le y \le d$,
then
$$\frac{d}{dy}\int_a^b g(x, y)\,dx = \int_a^b \frac{\partial g}{\partial y}(x, y)\,dx.$$
336 Chapter 7 Multiple Integration
Proof The trick is to start with the following change in order of integration:
For each t the integral in square brackets on the right evaluates to g(t, y) - g(t, c).
(Use the version of the Fundamental Theorem of Calculus that tells you how to
integrate a derivative.) Thus the previous equation becomes
Note that the subtracted term is a constant. Now apply the other version of the
Fundamental Theorem of Calculus to both sides. On the left we undo the y-integration,
so we get the desired result
$\int_a^b g_y(t,y)\,dt = \frac{d}{dy}\int_a^b g(t,y)\,dt.$ •
Let $G(y) = \int_0^1 \sin(ye^x)\,dx$. There seems to be no way to evaluate the integral in
terms of elementary functions, but we can find G'(y) using the Leibniz rule. We find
$G'(y) = \int_0^1 \cos(ye^x)\,e^x\,dx = \left[\frac{1}{y}\sin(ye^x)\right]_0^1 = \frac{1}{y}\bigl(\sin(ey) - \sin y\bigr), \quad y \ne 0.$
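A numerical check can make this use of the Leibniz rule more concrete. The following Python sketch is not from the text: the helper names `midpoint`, `G`, `G_prime_leibniz` and `G_prime_numeric`, the sample point y = 1.3, and the step sizes are all illustrative assumptions. It approximates G by a midpoint sum, differentiates that approximation by a central difference, and compares the result with the closed form (sin(ey) - sin y)/y.

```python
import math

def midpoint(func, a, b, n=2000):
    """Midpoint-rule approximation to the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def G(y):
    # G(y) = integral from 0 to 1 of sin(y * e^x) dx, approximated numerically
    return midpoint(lambda x: math.sin(y * math.exp(x)), 0.0, 1.0)

def G_prime_leibniz(y):
    # Closed form obtained from the Leibniz rule, valid for y != 0
    return (math.sin(math.e * y) - math.sin(y)) / y

def G_prime_numeric(y, h=1e-4):
    # Central difference quotient of G, used only as an independent check
    return (G(y + h) - G(y - h)) / (2 * h)

y0 = 1.3  # arbitrary sample point chosen for illustration
print(G_prime_leibniz(y0), G_prime_numeric(y0))
```

The two printed numbers should agree to several decimal places; the agreement is a plausibility check, not a proof.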
EXAMPLE 2 If u and v are fixed numbers, both positive or both negative, the formula
$F(y, u, v) = \int_u^v \frac{1}{x}\,e^{yx}\,dx$
$\frac{d}{dy}F(y, u, v) = \int_u^v \frac{\partial}{\partial y}\left[\frac{1}{x}\,e^{yx}\right]dx = \int_u^v e^{yx}\,dx$
EXERCISES
lo dy fo'\e-xy -
l 2e-2xY) dx
indicated derivative of the given function.
9. $f(y) = \int_0^1 (y^2 + t^2)\,dt$. Find $f'(y)$.
~ JR"'
2
7. Let ]Rn be defined on a set B in Rn. We define 11. h(x) = fox (x - u)eu du. Find h'(x) and h"(x).
$= \left[\tfrac{2}{5}u^{5/2} + \tfrac{2}{3}u^{3/2}\right]_0^1 = \tfrac{16}{15}.$
The aim of this change of variable was to simplify the integrand, and the change of
interval of integration from $1 \le x \le 2$ to $0 \le u \le 1$ makes very little difference in
the computation. In computing multiple integrals it's more often the corresponding
change in the region of integration that we're concerned with, the point being that we
can use a change of variable to simplify the region. We first consider some simple
examples of multivariable coordinate changes.
4A Polar Coordinates
In a double integral $\int_D f(x, y)\,dx\,dy$ over a circular region, it is often helpful to
introduce polar coordinates by the transformation
$x = r\cos\theta, \quad y = r\sin\theta.$
Corresponding regions in the xy-plane and $r\theta$-plane are shown in Figure 7.20. As
r and $\theta$ vary within the limits $r_0 \le r \le r_1$ and $\theta_0 \le \theta \le \theta_1$, the values of x
and y give the coordinates of the points in the shaded part D of the disk shown in
Figure 7.20(c). Rather than approximate the value of an integral by decomposing D
using a rectangular grid, we can think of using a polar coordinate grid. A typical
subdivision S of such a grid is shown in Figure 7.20(b). Using elementary geometry,
we can compute the exact area of S as a certain fraction, namely, $\Delta\theta/2\pi$, of the
region between circles of radius r and $r + \Delta r$. We find
$A(S) = \frac{\Delta\theta}{2\pi}\left[\pi(r + \Delta r)^2 - \pi r^2\right] = \frac{\Delta\theta}{2}\left[2r\,\Delta r + (\Delta r)^2\right] = r\,\Delta r\,\Delta\theta + \frac{1}{2}(\Delta r)^2\,\Delta\theta.$
If $\Delta r$ and $\Delta\theta$ are small, the second term is relatively small compared with the first
term, and we make the approximation $A(S) \approx r\,\Delta r\,\Delta\theta$. Because $x = r\cos\theta$ and
$y = r\sin\theta$, we can approximate the integral of f over D as follows:
4.1 $\int_D f(x, y)\,dA = \int_{\theta_0}^{\theta_1} d\theta \int_{r_0}^{r_1} f(r\cos\theta, r\sin\theta)\,r\,dr.$
The computations in the next two examples are considerably simpler than direct
use of iterated integrals with respect to x and y .
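The role of the extra factor r in the polar coordinate formula can be seen in a short numerical experiment. The Python sketch below is an illustration added here, not part of the text; the function name `polar_midpoint`, the grid sizes, and the two test integrands are assumptions. It sums f(r cos θ, r sin θ) r Δr Δθ over the cells of a polar grid, using cell midpoints.

```python
import math

def polar_midpoint(f, r0, r1, t0, t1, nr=200, nt=200):
    """Approximate the integral of f(x, y) over the polar region
    r0 <= r <= r1, t0 <= theta <= t1 by a midpoint sum in r and theta."""
    dr = (r1 - r0) / nr
    dt = (t1 - t0) / nt
    total = 0.0
    for i in range(nr):
        r = r0 + (i + 0.5) * dr
        for j in range(nt):
            t = t0 + (j + 0.5) * dt
            # the factor r comes from the area element r dr dtheta
            total += f(r * math.cos(t), r * math.sin(t)) * r
    return total * dr * dt

# Unit disk: the integral of 1 should be close to the area pi,
# and the integral of x^2 + y^2 should be close to pi/2.
area = polar_midpoint(lambda x, y: 1.0, 0.0, 1.0, 0.0, 2 * math.pi)
second_moment = polar_midpoint(lambda x, y: x * x + y * y, 0.0, 1.0, 0.0, 2 * math.pi)
print(area, second_moment)
```

Dropping the factor r from the sum would give the area of the rectangle in the rθ-plane instead of the area of the disk, which is one way to see why the factor is needed.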
4B Spherical Coordinates
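We introduce spherical coordinates in $\mathbb{R}^3$ by the transformation
$x = r\sin\phi\cos\theta, \quad y = r\sin\phi\sin\theta, \quad z = r\cos\phi.$
Corresponding regions are shown in Figure 7.21. The spherical coordinate "cube" C
has by a direct calculation a volume approximated by
$V(C) \approx r^2\sin\phi\,\Delta r\,\Delta\phi\,\Delta\theta.$
is valid, where $\bar f(r, \phi, \theta) = f(r\sin\phi\cos\theta, r\sin\phi\sin\theta, r\cos\phi)$.
FIGURE 7.21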
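EXAMPLE A solid ball B of radius a is described by the spherical coordinate inequalities
$0 \le r \le a, \quad 0 \le \phi \le \pi, \quad 0 \le \theta \le 2\pi.$
$\int_B dV = \int_0^{2\pi} d\theta \int_0^{\pi} d\phi \int_0^a r^2\sin\phi\,dr$
$= \int_0^{2\pi} d\theta \int_0^{\pi} \sin\phi\,d\phi \int_0^a r^2\,dr$
$= [\theta]_0^{2\pi}\,[-\cos\phi]_0^{\pi}\,\left[\tfrac{1}{3}r^3\right]_0^a = (2\pi)(2)\left(\tfrac{1}{3}a^3\right) = \tfrac{4}{3}\pi a^3,$
the formula for the volume of a sphere of radius a.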
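The same volume can be approximated directly from the spherical volume element. The following Python sketch is an added illustration, not the authors' code; the name `ball_volume` and the grid size n are assumptions. It adds up r² sin φ Δr Δφ Δθ over a midpoint grid in (r, φ, θ) and compares the total with (4/3)πa³.

```python
import math

def ball_volume(a, n=60):
    """Midpoint sum of the volume element r^2 sin(phi) dr dphi dtheta over
    0 <= r <= a, 0 <= phi <= pi, 0 <= theta <= 2*pi."""
    dr = a / n
    dphi = math.pi / n
    dtheta = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            phi = (j + 0.5) * dphi
            for k in range(n):
                # the integrand happens not to depend on theta
                total += r * r * math.sin(phi)
    return total * dr * dphi * dtheta

a = 2.0  # sample radius chosen for illustration
print(ball_volume(a), 4.0 / 3.0 * math.pi * a ** 3)
```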
4C Cylindrical Coordinates
The transformation
$(x, y, z) = (r\cos\theta,\ r\sin\theta,\ z)$
is used to introduce cylindrical coordinates in $\mathbb{R}^3$. Corresponding regions are shown
in Figure 7.22. Note the close connection with plane polar coordinates. Using the
result of the similar calculation for polar coordinates, we can see that the volume of
a cylindrical coordinate "cube" C is given approximately by
$V(C) \approx r\,\Delta r\,\Delta\theta\,\Delta z.$
FIGURE 7.22
zo
dz
0o ro
The formulas for integration in polar, spherical, and cylindrical coordinates can all
be derived in a uniform way. The computation involves the determinants of the
Jacobian matrices of the coordinate transformations. For polar coordinates, we have
$\det\begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} = r.$
Thus we see that the extra factor in the integrand on the right side of Equation 4.1
and Equation 4.2 is in each case supplied by the Jacobian determinant of the coordinate
transformation. The expression $r\,\Delta r\,\Delta\theta$ is called the area element in polar
coordinates. Similarly $r^2\sin\phi\,\Delta r\,\Delta\phi\,\Delta\theta$ is called the volume element in spherical
coordinates, whereas in cylindrical coordinates the volume element is $r\,\Delta r\,\Delta\theta\,\Delta z$.
These formulas will be generalized in Section 4D.
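The Jacobian determinants for the three coordinate systems can be checked numerically. The Python sketch below is an added illustration under stated assumptions: `jacobian_det` is a hypothetical helper that estimates partial derivatives by central differences, and the sample points are arbitrary. With the variables ordered (r, θ) for polar, (r, φ, θ) for spherical, and (r, θ, z) for cylindrical coordinates, the printed values should be close to r, r² sin φ, and r respectively.

```python
import math

def jacobian_det(T, point, h=1e-6):
    """Determinant of the Jacobian matrix of T at point, with partial
    derivatives estimated by central differences (2 or 3 variables)."""
    n = len(point)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        plus = list(point)
        minus = list(point)
        plus[j] += h
        minus[j] -= h
        f_plus, f_minus = T(*plus), T(*minus)
        for i in range(n):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    if n == 2:
        return J[0][0] * J[1][1] - J[0][1] * J[1][0]
    if n == 3:
        return (J[0][0] * (J[1][1] * J[2][2] - J[1][2] * J[2][1])
                - J[0][1] * (J[1][0] * J[2][2] - J[1][2] * J[2][0])
                + J[0][2] * (J[1][0] * J[2][1] - J[1][1] * J[2][0]))
    raise ValueError("only 2 or 3 variables are handled in this sketch")

def polar(r, theta):
    return (r * math.cos(theta), r * math.sin(theta))

def spherical(r, phi, theta):
    return (r * math.sin(phi) * math.cos(theta),
            r * math.sin(phi) * math.sin(theta),
            r * math.cos(phi))

def cylindrical(r, theta, z):
    return (r * math.cos(theta), r * math.sin(theta), z)

print(jacobian_det(polar, (2.0, 0.7)))             # compare with r
print(jacobian_det(spherical, (2.0, 0.9, 0.4)))    # compare with r^2 sin(phi)
print(jacobian_det(cylindrical, (2.0, 0.7, 5.0)))  # compare with r
```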
Cylindrical Shells. Suppose a solid figure B is generated by rotating a plane
region R about a line l in the same plane. If R doesn't intersect l and is bounded
above and below by graphs of u(r) and v(r), we can imagine that B is composed
of coaxial cylindrical shells, each one of height h(r) = u(r) - v(r) depending on
the distance r of a point on the shell from the line. See Figure 7.23(a). When such a
shell is slit vertically and rolled out flat, it has surface area $S(r) = 2\pi r\,h(r)$, which
is the circumference $2\pi r$ of the shell multiplied by its height h(r). It seems plausible
that we can find the volume of B by computing the integral of S(r) over the relevant
interval $r_0 \le r \le r_1$:
4.3 $V(B) = \int_{r_0}^{r_1} S(r)\,dr = \int_{r_0}^{r_1} 2\pi r\,h(r)\,dr.$
FIGURE 7.23
$V(B) = \int_0^{2\pi} d\theta \int_{r_0}^{r_1}\left[\int_{v(r)}^{u(r)} dz\right] r\,dr = 2\pi\int_{r_0}^{r_1} \bigl(u(r) - v(r)\bigr)\,r\,dr.$
Rotate a square with side length a about a line l in the same plane, where l is parallel
to an edge of the square and lies at distance c > a/2 from the center of the square.
The resulting solid B is a ring with right-angled corners. See Figure 7.23(b).
To use Equation 4.3 to compute V(B), we note that r should run from $r_0 = c - \tfrac{1}{2}a$
to $r_1 = c + \tfrac{1}{2}a$ and that h(r) = a for all r in the interval. Then
FIGURE 7.24
(i) T is one-to-one on the interior of R.
(ii) det T' , the Jacobian determinant of T , is not zero in the interior of R.
~~
a(u, v) ( aG .
au
Then Jacobi's formula becomes
$\int_{T(R)} f(x, y)\,dx\,dy = \int_R f(F(u, v), G(u, v))\left|\frac{\partial(x, y)}{\partial(u, v)}\right| du\,dv.$
In three dimensions, we have, with
$\int_{T(R)} f(x, y, z)\,dx\,dy\,dz = \int_R f(F(u, v, w), G(u, v, w), H(u, v, w))\left|\frac{\partial(x, y, z)}{\partial(u, v, w)}\right| du\,dv\,dw.$
Aside from the computation of det T', the application of the transformation formula
is a matter of finding the geometric relationship between the subset R and its
image T(R) for a transformation T.
FIGURE 7.25
<let T' = I~ ! I= I.
$(x, y) = (u\cos v,\ u\sin v)$
$\det T' = \begin{vmatrix} \cos v & -u\sin v \\ \sin v & u\cos v \end{vmatrix} = u.$
The transformation is one-to-one between R and T(R). We can see this geometrically,
because of the interpretation of v and u as angle and radius, respectively, or directly
from the relations
$\int_{T(R)} (x^2 + y^2)\,dA = \int_R u^2\,u\,dA = \int_1^2 u^3\,du \int_0^{\pi/2} dv = \frac{15\pi}{8}.$
FIGURE 7.26
EXAMPLE Let B be the positive octant in 3-dimensional space $\mathbb{R}^3$ defined by the inequalities
$x^2 + y^2 + z^2 \le 1, \quad x \ge 0,\ y \ge 0,\ z \ge 0.$
Section 4D Change of Variable 345
FIGURE 7.27
To transform the integral $\int_B (x^2 + y^2)\,dx\,dy\,dz$, we can define T by
$(x, y, z) = T(u, v, w) = (u\sin v\cos w,\ u\sin v\sin w,\ u\cos v).$
$\det T' = u^2\sin v.$
EXERCISES
In the definite integrals I and 2, make the indicated IO. Let B be the region in IR 3 described by the inequalities
change of variable together with the appropriate change 0 .:5 x, 0.::: y, 0 .:5 z, and x 2 + y2 + z2 .:5 4.
in the limits of integration. Then compute the resulting (a) Sketch the region B, and describe it by using spher-
integral. ical coordinates.
{2 , (b) Use spherical coordinates and Equation 4.2 to eval-
1. Let X = ,Ju in lo xe··· dx. uate the triple integral
2. Let x = sin0 in fo
1
~dx. L Jx 2 + _v 2 + z 2 dx dydz .
3. Let B be the region in IR 2 described by the inequalities (c) Use spherical coordinates to evaluate the triple
integral
0 .:5 x, and x2 + y2,::: 4.
(a) Sketch the region B and describe it by using polar L zdxdydz.
coordinates.
(b) Use the polar coordinates and F.quation 4.1 to eval- Use spherical coordinates to compute the triple inte-
uate the double integral grals 11 and 12.
l .:5 x2 + y2 + ,2 .:5 4.
3 described by
l (x
2
+ y2)dxdy,
Use cylindrical coordinates to compute the integrals 14
and 15.
where R is the region in part (a).
5. Let A be the annular region in !R consisting of points 2
14. l z dx dy dz, where B satisfies 1 .:5 z .:5 2, x 2 + y2 .:5 I.
t
6. Compute lo dx lo{~ (x
2
+ .r2) 3 dy.
i ip(b)
ip(a)
f(x)dx =
"
f(</J(u))<f/(u)du,
radius ../ir72 centered at (0. 0). For Exercises 17 to 20 use multiple integration to prove
the geometric volume formulas.
8. Compute the area bounded by the polar coordinate curves
0 = 0, 0 = rr/4, and r = 0 2 . 17. Sis a sphere of radius a. Show V(S) = ;.na3.
9. Find the area bounded by the Iemniscate (x 2 + y2) 2 = 18. C is a cone of height h and base radius a. Show V(C) =
2a 2 (x 2 - y2) by changing to polar coordinates. ja 2h.
19. L is a right circular cylinder of height k and radius a. proportion k < 1 of the volume of the whole sphere to
Show V(L) = na 2k. remain as a ring, how should you choose b?
20. R is a slice of thickness k perpendicular to the axis *28. A solid ball Ba of radius a is spherically homogeneous if
of a right circular cone having maximum radius b and its density is constant on every spherical shell with center
minimum radius a. Show that its volume is V(R) = at the center of Ba. The purpose of this exercise is to
f (a 2 + ab + b2 )k. Explain how Exercises 18 and 19 are establish Newton's result that according to the inverse-
essentially special cases of this. square law the gravitational attraction of a spherically
21. Consider the transformation T defined by homogeneous ball acting at a point p is the same as it
would be if all the mass of the ball were concentrated at
its center. If p is inside the ball, the part of the ball at
distance from the center greater than IPI is irrelevant. By
definition, if Ba is centered at the origin and has density
Let Ruv be the region 1 ::: u 2 + v2 ::: 4, u ;:: 0, v ;:: 0. µ(jxl) at x, the attracting force vector on a particle of
(a) Sketch the image region Rxy = T(Ruv). mass 1 at p is given by G times the 3-dimensional vector
dxdy integral
(b) Compute
1 ~-
Rq yx2 + y2 x- p Mp
22. Define a transformation from the uv-plane to the xy-plane
by x = u + v, y = 11 2 -v. Let Ruv be the region bounded
1Ba
µ(!xi) I
X-p
13 dVx = - -p13P,
I
by (1) u-axis, (2) v-axis, and (3) the line u + v = 2. where Mp is the mass of the part of Ba that lies within
(a) Find and sketch the image region Rxy· distance IP! of its center, and G is the gravitational
dxdy constant. Newton showed without using our techniques
(b) Compute the integral
1 Jl + -;;;;=:::;:::=::::;=-
4x + 4y
R,y
23. Let a transformation of the uv-plane to the xy-plane be
that the integral equals the expression on the right.
(a) Choose perpendicular (x, y, z)-axes with origin at
the center of Ba and positive z-axis passing through
given by
p = (0, 0, p}. Show, without computing any inte-
grals, that the x and y coordinates of the vector
X = U, y = v(I + u2), integral are zero and that the z coordinate is given
and let Ruv be the rectangular region given by O ::: u ::: 3 in spherical coordinates by
and O::: v::: 2.
(a) Find and sketch the image region Rxy·
(b) Find a(x, y).
2n la ,2µ(r)
[ ] f"
a(u, v)
rcos</J- p sin1Pd1P]dr.
(c) Transform f x dx dy to an integral over Ruv and 0 (r 2 - 2pr cos IP + p2)3/2
]R,y
compute either one of them. (b) Let cos IP = u and integrate by parts to show that
the inner integral in part (a) is
24. Rotate a circular d1sk of radius a about a line in the same
plane at distance c > a from the center of the circle. This 1
generates a solid torus B. Find the volume of B using
the cylindrical she11 approach to setting up an integral for f -1
(ru - p)(r 2 + p 2 - 2pru)- 3/ 2 du
(e) Specialize the results of parts (a) and (b) to tion of the ball on a unit point-mass inside the
the case of a homogeneous ball with constant ball and r units from the center has magnitude
density µ to show that the gravitational attrac- 11rµGr.
Thus the center of mass $\bar x$ is a weighted average of the position vectors $x_k$ in the
system. We'll see that $\bar x$ is the unique point at which a physical system consisting
of masses $m_k$ at points $x_k$ would "balance" under the influence of constant gravity
if the mutual distances $|x_k - x_j|$ are held fixed. The meaning of the term "balance"
is expressed by saying that the "moment" of the system about an arbitrary plane P
through $\bar x$ is zero. To make these ideas precise, we define the moment
$M_P$ of the mass system about a plane P to be the weighted algebraic sum of the
distances from the points to P. See Figure 7.28. If $n \cdot (x - x_0) = 0$ is the equation
of a plane through $x_0$, normalized so that $|n| = 1$, then the distance from $x_k$ to the
plane is $n \cdot (x_k - x_0)$ if $x_k$ is on the side of the plane toward which n points and is
minus that number if $x_k$ is on the other side. (See Section 5 of Chapter 1.) A formula
suitable for computation of $M_P$ is then
$M_P = \sum_{k=1}^{N} m_k\,n \cdot (x_k - x_0).$
5.2 Theorem. Let P be an arbitrary plane containing the center of mass $\bar x$ of the
system of masses $m_1, m_2, \ldots, m_N$ at the respective points $x_1, x_2, \ldots, x_N$. Then the
moment $M_P$ of the system about P is zero.
FIGURE 7.28
Section 5 Centroids and Moments 349
Proof. To verify that the moment about a plane $P_0$ containing the center of mass
$\bar x$ is 0, we replace the generic point $x_0$ by $\bar x$ and use the distributive law for the dot
product. We get
$M_{P_0} = \sum_{k=1}^{N} m_k\,n \cdot (x_k - \bar x) = n \cdot \left[\sum_{k=1}^{N} m_k x_k - \left(\sum_{k=1}^{N} m_k\right)\bar x\right].$
The vector in square brackets is the zero vector by the definition of $\bar x$, so $M_{P_0}$ is 0. •
For mass that is distributed according to a continuous, nonnegative density
$\mu(x) \ge 0$ over a body B in space, by analogy with Equation 5.1, we define the
center of mass of the distribution $\mu$ over B to be the point
5.3 $\bar x = \frac{1}{M(B)}\int_B \mu(x)\,x\,dV$, if the total mass $M(B) = \int_B \mu(x)\,dV$ is positive.
The term centroid is used for $\bar x$ if the distribution $\mu(x)$ is uniformly equal to
1, in which case we're talking about a purely geometric property of B rather than
a physical property associated with mass. From this perspective it's appropriate to
write V(B) for volume instead of M(B).
Just as with Equation 5.1 for discrete masses, the center of mass of a continuous
density $\mu(x)$ should be a vector that we picture as a point in space. This is just what
we get from Equation 5.3. The reason is that the vector-valued integral of a vector-valued
function is defined to be the vector whose coordinates are the integrals of the
individual coordinate functions x, y and z of the vector x = (x, y, z). In applying
Equation 5.3 in $\mathbb{R}^3$, we have
5.4 $\bar x = \frac{1}{M}\int_B x\,\mu(x, y, z)\,dx\,dy\,dz, \quad \bar y = \frac{1}{M}\int_B y\,\mu(x, y, z)\,dx\,dy\,dz, \quad \bar z = \frac{1}{M}\int_B z\,\mu(x, y, z)\,dx\,dy\,dz.$
EXAMPLE To find the center of mass of a solid hemisphere H with radius a and constant density
$\mu$, we'll take as given the formula $\tfrac{4}{3}\pi a^3$ for the volume of a sphere. The total mass
of H is then $M = \tfrac{1}{2}\left(\tfrac{4}{3}\pi a^3\right)\mu = 2\pi\mu a^3/3$. To compute $\bar x$, we introduce coordinates
as shown in Figure 7.29(a). We note that $\bar x = \bar y = 0$, because H is symmetric about
the yz-plane and the xz-plane. To find $\bar z$ we change to spherical coordinates, getting
$\int_H \mu z\,dx\,dy\,dz = \mu\int_0^{2\pi} d\theta \int_0^{\pi/2} d\phi \int_0^a (r\cos\phi)\,r^2\sin\phi\,dr$
$= \mu\int_0^{2\pi} d\theta \int_0^{\pi/2} \sin\phi\cos\phi\,d\phi \int_0^a r^3\,dr$
$= \mu\,(2\pi)\left(\tfrac{1}{2}\right)\left(\tfrac{a^4}{4}\right) = \frac{\pi\mu a^4}{4}.$
Hence $\bar z = \frac{\pi\mu a^4/4}{M} = \frac{\pi\mu a^4}{8\pi\mu a^3/3} = \frac{3a}{8}$. In other words, the center
of mass is $\tfrac{3}{8}$ of the way along the axis of symmetry of H, measured from the flat
surface of H.
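The value 3a/8 can be cross-checked with a direct midpoint sum in spherical coordinates. The Python sketch below is an added illustration, not the authors' method; the name `hemisphere_zbar`, the density value, and the grid size are assumptions. It approximates the integral of μ z over the hemisphere, divides by the mass 2πμa³/3, and prints the result next to 3a/8.

```python
import math

def hemisphere_zbar(a, mu=1.0, n=200):
    """Approximate z-coordinate of the center of mass of a solid hemisphere
    of radius a and constant density mu, using spherical coordinates."""
    dr = a / n
    dphi = (math.pi / 2) / n
    s = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            phi = (j + 0.5) * dphi
            # z = r cos(phi), volume element r^2 sin(phi) dr dphi dtheta
            s += (r * math.cos(phi)) * r * r * math.sin(phi)
    # the integrand does not depend on theta, so theta contributes 2*pi
    moment = mu * 2 * math.pi * s * dr * dphi
    mass = mu * 2 * math.pi * a ** 3 / 3
    return moment / mass

a = 2.0  # sample radius chosen for illustration
print(hemisphere_zbar(a), 3 * a / 8)
```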
$M = \int_Q \mu(x, y)\,dx\,dy = \int_0^{\pi/2} d\theta \int_0^a r^2\,r\,dr = \left(\frac{\pi}{2}\right)\left(\frac{a^4}{4}\right) = \frac{\pi a^4}{8}.$
Similarly,
$M_P = \int_B \mu(x)\,n \cdot (x - x_0)\,dV,$
where $n \cdot (x - x_0) = 0$ is a normalized equation for the plane P. As in the case
of a discrete distribution, $M_P$ changes sign if n changes direction, and in many
applications there is a natural choice for this direction.
5.6 Theorem. The mass and moment of the union of disjoint regions $B_1$ and $B_2$
about a plane P are the sums of their respective masses and moments about P:
$M(B_1 \cup B_2) = M(B_1) + M(B_2)$ and $M_P(B_1 \cup B_2) = M_P(B_1) + M_P(B_2)$.
Proof. Both equations are immediate consequences of Theorem 3.6 of Section 3,
which expresses additivity of the integral as a function of the domain of integration. •
Denoting the moments of B about the yz-plane, the xz-plane and the xy-plane in
$\mathbb{R}^3$ by $M_{yz}$, $M_{xz}$ and $M_{xy}$ respectively, we summarize Equations 5.4 as
$\bar x = \frac{M_{yz}}{M}, \quad \bar y = \frac{M_{xz}}{M}, \quad \bar z = \frac{M_{xy}}{M}.$
By Theorem 5.6, to compute the center of mass of the union of two disjoint bodies
$B_1$, $B_2$, we can write
$\bar x = \frac{M_{yz}(B_1) + M_{yz}(B_2)}{M(B_1) + M(B_2)}, \quad \bar y = \frac{M_{xz}(B_1) + M_{xz}(B_2)}{M(B_1) + M(B_2)}, \quad \bar z = \frac{M_{xy}(B_1) + M_{xy}(B_2)}{M(B_1) + M(B_2)}.$
$V(B) = 2\pi\int_a^b r\,h(r)\,dr, \quad 0 < a \le r \le b,$
where z = h(r) defines the height of the region R measured parallel to L and at
distance r from the line. Dividing both sides of this equation by $2\pi A(R)$ shows that
$V(B)/(2\pi A(R)) = \bar r$, the distance of the centroid of R from L. Now just multiply
this last equation by $2\pi A(R)$ to get Pappus's formula. •
Pappus's theorem is particularly simple to apply when we know on geometric
grounds where the centroid of R is located relative to the axis of rotation, as in the next
example.
EXAMPLE 4 Rotate an a-by-b rectangle R about a line l in the same plane and parallel to an
edge of length a and lying at distance d from the nearest edge. Figure 7.23(b) in the
previous Section 4C shows the solid for the case a = b. The resulting solid B is a
ring with sharp corners. Since the centroid is $d + b/2$ units from l, the circle it traces
has circumference $2\pi(d + b/2) = \pi(2d + b)$. Since R is a rectangle, A(R) = ab, so
$V(B) = \pi(2d + b)ab$.
EXERCISES
In Exercises I~. find the center of mass x of each of the defined on the set described. Sketch the set and show
following discrete mass distributions. Sketch the given the location of the center of mass x.
points and the center of mass x. 5. Let / be the interval 0 :5 x :5 2 in JR, and let µ,(x) =
1. In R, mI = 1 at x1 = 1, m2 = 3 at x2 = 2, m3 = l - ½x.
2 at X3 = -4. 6. Let D be the disk of radius I centered at the origin in R 2,
2. In R 2, m1 = 1 at XJ = (I, 1), m2 = 2 at x2 = (1,0), and let µ,(x , y) =Ix+ yl.
1113 = 3 at X3 = (0, 1). 7. Let Q be the quarter-disk of radius 1 in the first quadrant
3. In R 2, mI = 1 at xi = (1, - 1), m2 = 2 at x2 = (I, 2), with edges on the axes in R 2, and let µ,(x, y) = x + y.
m3 = 3 at X3 = (- 1, 1).
8. Let C be the unit cube in R 3 defined by 0 :::: x :S 1,
4. In R 3, m1 = ½ atx1 = (1,-1,2), m2 = ¼ atx2 = 0 :5 y :s I, 0:::: z:::: I, and let Jt(x, y, z) = xyz.
(0. 1, 2), m3 = i at X3 = (I, 1, 1).
9. Find the centroid of the region R in the first quadrant of
In Exercises 5 to 8, find the center of mass x of each JR 2 bounded by the graph of y = x 2, the line y = 4 and
of the following continuous distributions with density J.l the y axis.
Section 6 Improper Integrals 353
IO. Find the centroid of a solid right circular cone of base where zo is a fixed point in JR 2 ; the number I (zo) is
radius a, and height b from base to tip.
I I. Show that the center of mass of a homogeneous solid ball
is at the center of the ball. /(zo) = l Ix - zoi2 µ,(x) dA.
12. Use the result of Example 3 of the text to find the center of
mass of a hemispherical surface of radius a and constant
18. Find the moment of inertia of a disk Ra of radius a about
surface density µ,. (The result will be corroborated later
its center if (i) Ra has constant density µ,(x) = 1 and
with an approach tailored to more general surfaces.)
(ii) Ra has density µ,(x) =
jxjP, p > 0.
13. (a) Find the centroid of the part of the annulus with radii
19. Find the moment of inertia of a square S of side b about
a > b and center (0, 0) that lies in the first quadrant.
one corner if S has uniform density µ,.
(b) Use part (a) to locate the centroid of a quarter circle
of radius a. 20. Find the moment of inertia of a square S of side b about
its center if S has uniform density µ,.
14. A square region R of side length a is rotated about a line
parallel to a diagonal and containing a vertex not on that 21. Show that the moment of inertia I (zo) of R as defined
diagonal. Find the volume of the solid generated this way. previously satisfies
15. Use Pappus's theorem and the volume ;na
3 of a ball
of radius a to find the centroid of a plane semicircular
I(zo) = M(R)lzo - x-i2 + /(x),
region.
where x is the center of mass of the weighted region R.
16. A right circular cone of height h and base area A has [Hint: 1n the definition of/ (zo), jx - zoi 2 can be replaced
volume V = ½hA. Use Pappus's theorem to find the by the dot product of (x - i) + (i - zo) with itself.)
centroid of a right triangle with perpendicular side lengths
22. Use the previous exercise to show that / (zo) is minimized
a and b.
by taking zo to be the center of mass i of R.
17. Let B be a set in ]Rn and µ,(x) the density at x in B .
23. Let B1 , ... , BN be nonoverlapping regions with union
If xo is a fixed point in IRn and n • (x - xo) = 0 is
B, having respective masses M(Bk), M(B) and centers
the normalized equation of a plane P through XO, we've
of mass Xt, i.
defined the moment of B about P by
(a) Prove that x is given by a weighted sum of the Xk as
Mp(B)= l µ,(X)D•(x-xo)dV. 1 N
x = -(- L M(Bk)Xk-
M B) k=I
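Let $f(x) = x^{-1/3}$ for $0 < x \le 1$. See Figure 7.30(a). The integral of f over this
interval I is not an ordinary Riemann integral because f(x) tends to infinity as
$x \to 0$. To assign a value to $\int_0^1 f(x)\,dx$, we let $I_\delta$ be the interval $[\delta, 1]$ and first
compute
$\int_\delta^1 x^{-1/3}\,dx = \left[\tfrac{3}{2}x^{2/3}\right]_\delta^1 = \tfrac{3}{2}\left(1 - \delta^{2/3}\right).$
$\int_I f(x)\,dx = \lim_{\delta \to 0+}\int_\delta^1 f(x)\,dx = \lim_{\delta \to 0+}\tfrac{3}{2}\left(1 - \delta^{2/3}\right) = \tfrac{3}{2}.$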
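The limiting process can be watched numerically. The Python sketch below is an added illustration; the helper names `midpoint`, `f`, and `exact_from`, and the particular values of δ, are assumptions. For each δ it integrates x^(-1/3) over [δ, 1] by a midpoint sum, where the integrand is bounded, and compares with the antiderivative value (3/2)(1 - δ^(2/3)), which approaches 3/2 as δ shrinks.

```python
def midpoint(func, a, b, n=20000):
    """Midpoint-rule approximation to the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def f(x):
    # the integrand x^(-1/3), unbounded near x = 0
    return x ** (-1.0 / 3.0)

def exact_from(delta):
    # value of the integral of f over [delta, 1] from the antiderivative (3/2) x^(2/3)
    return 1.5 * (1.0 - delta ** (2.0 / 3.0))

for delta in (1e-1, 1e-2, 1e-3):
    print(delta, midpoint(f, delta, 1.0), exact_from(delta))
```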
Let $g(x) = e^{-x}$ for $0 \le x$. See Figure 7.30(b). The integral of g over this interval
J is not an ordinary Riemann integral because the domain of the integrand g(x) is
an unbounded interval. To assign a value to $\int_0^\infty g(x)\,dx$, we let $J_\delta$ be the interval
$[0, \delta]$ and first compute
We say that $\int_B f(x)\,dV$ is defined as an improper integral if a limit
$\int_B f(x)\,dV = \lim \int_{B_\delta} f(x)\,dV$ is finite and independent of the family $\{B_\delta\}$ used
to define it. It's possible to show that if either $f(x) \ge 0$ on B, or $f(x) \le 0$ on
B, then the choice of expanding sets $B_\delta$ that cover all of B doesn't affect the final
outcome; the limit value assigned to the improper integral will either be finite, in
which case we speak of convergence of the integral to that value, or else we have
divergence of the integral, perhaps to $+\infty$ or to $-\infty$.
FIGURE 7.30
Since $\ln z \le 0$ for $0 < z \le 1$, the function $f(x, y) = -\ln(xy)$ is non-negative on
the square S: $0 < x \le 1$, $0 < y \le 1$. See Figure 7.31(a). Noting that f is bounded
on the square $S_\delta$ determined by $\delta \le x \le 1$, $\delta \le y \le 1$, we integrate f over $S_\delta$ and
compute the limit as $\delta$ tends to 0.
$\int_{S_\delta} -\ln(xy)\,dx\,dy = \int_{S_\delta} -(\ln x + \ln y)\,dx\,dy$
$= -\int_\delta^1 dy \int_\delta^1 \ln x\,dx - \int_\delta^1 dx \int_\delta^1 \ln y\,dy$
$= -2\int_\delta^1 dy \int_\delta^1 \ln x\,dx$
Now let $\delta \to 0+$. If $0 < p < 1$ the limit of the integrals over $D_\delta$ is $\pi/(1 - p)$, so
the improper integral converges to that value. If $p > 1$ the integrals tend to $+\infty$, so
the improper integral diverges for $p > 1$. The case p = 1 is left as an exercise.
FIGURE 7.31
The function $f(x, y) = 1/(x^2y^2)$, defined for $x \ge 1$ and $y \ge 1$, has the graph shown
in Figure 7.32. If B is the set of points (x, y) for which $x \ge 1$ and $y \ge 1$, it is natural
to define $\int_B f\,dA$ in such a way that it stands for the volume under the graph of
f. We can approximate this volume by computing the volume lying above bounded
subrectangles of B. To be specific, let $B_N$ be the rectangle with corners at (1, 1) and
(N, N) and with edges parallel to the edges of B. For N > 1 we have
FIGURE 7.32
$\int_{B_N} f\,dA = \int_1^N dx \int_1^N \frac{1}{x^2y^2}\,dy = \left(1 - \frac{1}{N}\right)^2.$
As N tends to infinity, the rectangles $B_N$ eventually cover every point of B, and the
regions above the $B_N$ fill out the region under the graph of f. Then we define
$\int_B f\,dA = \lim_{N \to \infty}\int_{B_N} f\,dA = 1.$
6.1 $\int_S p(x)\,dV = 1,$
then p(x) can be interpreted as the density of a statistical outcome. To be more
specific, suppose E is an experiment with possible outcomes in S. Then the proba-
bility that the outcome of the experiment lies in a subset B of S can sometimes be
expressed in the form
for some density function p(x) . For example, the coordinates of a vector outcome x
might be the results of measuring simultaneously several distinct properties of some
physical object. In analogy with the center of mass of a mass distribution, we define
the mean of a probability distribution as the vector
$\int_{\mathbb{R}^2} x\,e^{-(x^2 + y^2)/2\sigma^2}\,dx\,dy = \int_{-\infty}^{\infty} x\,e^{-x^2/2\sigma^2}\,dx \int_{-\infty}^{\infty} e^{-y^2/2\sigma^2}\,dy.$
The first integral on the right is zero because the integrand is an odd function.
The function
$p(x, y) = e^{-x-y}$
defines a probability density in the first quadrant Q of $\mathbb{R}^2$. All we have to check is
that the integral of p over Q is equal to 1. We compute
$\int_Q e^{-x-y}\,dx\,dy = \int_0^\infty e^{-x}\,dx \int_0^\infty e^{-y}\,dy = \left(\lim_{N \to \infty}\left[-e^{-x}\right]_0^N\right)^2 = \left(\lim_{N \to \infty}\left(-e^{-N} + 1\right)\right)^2 = 1.$
The probability that the outcome of the experiment E is in some rectangle R: $a \le x \le b$,
$c \le y \le d$, where a and c are nonnegative, is
$\Pr[E \text{ in } R] = \int_R e^{-x-y}\,dx\,dy = \left[-e^{-x}\right]_a^b\left[-e^{-y}\right]_c^d = (e^{-a} - e^{-b})(e^{-c} - e^{-d}).$
EXAMPLE 5 The squaring trick used in the previous example, in combination with a switch
to polar coordinates, allows us to show that the integral of $e^{-x^2}$ between $-\infty$ and
$\infty$ equals $\sqrt{\pi}$ as follows:
$\left(\int_{-\infty}^{\infty} e^{-x^2}\,dx\right)^2 = \int_{\mathbb{R}^2} e^{-x^2 - y^2}\,dx\,dy = \int_0^{2\pi} d\theta \int_0^\infty e^{-r^2}\,r\,dr = 2\pi\left[-\tfrac{1}{2}e^{-r^2}\right]_0^\infty = (2\pi)\left(\tfrac{1}{2}\right) = \pi.$
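The value of the integral of exp(-x²) can be confirmed numerically. The Python sketch below is an added illustration; the name `gaussian_integral`, the cutoff L = 8, and the number of subintervals are assumptions. It truncates the infinite interval to [-L, L], where the neglected tails are far smaller than rounding error, applies a midpoint sum, and prints the result beside the square root of π.

```python
import math

def gaussian_integral(L=8.0, n=4000):
    """Midpoint approximation to the integral of exp(-x^2) over [-L, L]."""
    h = 2 * L / n
    return h * sum(math.exp(-(-L + (i + 0.5) * h) ** 2) for i in range(n))

value = gaussian_integral()
# value should be close to sqrt(pi), and value squared close to pi
print(value, math.sqrt(math.pi), value ** 2, math.pi)
```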
EXERCISES
1. Loo (l + x)-4dx . 12. (a) Show that p(x) = e-x is a probability density on
the interval 0 ~ x < o:, in JR.
(b) If E is an experiment with probability density
2. lo"° x - 3dx . p, as given in part (a), find the probability that
the outcome of E lies between a and b, where
3. f x dx d y, where R is the infinite rectangle O ~ x ~ 1, 0 ~a< b.
}R y 2 13. (a) For what constant k is the function
2 ~ y in JR 2 .
5. [ e-x-y-z dtdydz, where C is the infinite rectangle a probability density on a disk of radius 1?
(b) If the outcomes of an experiment E are dis-
0 ~ X ~ 1, 0 ~ y ~ I, 0 ~Zin JR • 3
tributed according to the density of part (a), find
1
6. f -,-- d x dy dz, where C is the same as in the probability that E has an x-coordinate bigger
le z-fe than½ -
Exercise 5. (c) What is the mean of p?
7. l ln(x 2 + y2) dx dy, where D is the disk x 2 + y 2 ~ 1 in 14. If an experiment E has as its probability density the
symmetric normal density in !R 2 with constant o- 2 , find
IR2 • [Hint: Let D6 be the annulus 82 ~ x 2 + y 2 ~ 1, and the probability that the coordinates of the outcome are
change to polar coordinates.] both positive.
8. l + (x 2 y2)- 1dx dy, where D is the disk x 2 + y2 ~ 1 15. (a) Show that
in !R2 .
dxdy -1- 1°" e --~n
·• ,_a 2 dx = I
9.
1+Jx2 +
Q
-;;::;:==:;:,
y2
where Q is the quarter-disk x
7A Midpoint Approximations
For a 1-dimensional integral $\int_a^b f(x)\,dx$ we partition the interval $a \le x \le b$ into
p equal subintervals with endpoints $x_j = a + j(b - a)/p$, $j = 0, \ldots, p$. The
midpoint approximation is
$\int_a^b f(x)\,dx \approx \frac{b - a}{p}\sum_{j=0}^{p-1} f\left(a + (j + \tfrac{1}{2})(b - a)/p\right),$
here p and q are the respective numbers of lines in the grid in the x and y directions,
while j runs from 0 to p and k runs from 0 to q. Since the dimensions of each
rectangle are (b - a)/p by (d - c)/q, the midpoint of the grid rectangle with lower
left corner $(x_j, y_k)$ is at
$(\bar x_j, \bar y_k) = \left(x_j + \tfrac{1}{2}(b - a)/p,\ y_k + \tfrac{1}{2}(d - c)/q\right).$
$\int_R f(x, y)\,dx\,dy \approx \frac{(b - a)(d - c)}{pq}\sum_{j=0}^{p-1}\sum_{k=0}^{q-1} f(\bar x_j, \bar y_k).$
The approximate value we get this way is just the value of a Riemann sum of the
type used to define the integral. In particular, if f is a positive function, the approx-
imate value is a sum of volumes of vertical boxes as illustrated in Figure 7.14(b) of
Section 2. The routine to implement the midpoint approximation method is a double
loop of the form
SET s = 0
FOR j = 0 TO p - 1
FOR k = 0 TO q - 1
LET s = s + f(a + (j + .5)*(b - a)/p, c + (k + .5)*(d - c)/q)
NEXT k
NEXT j
PRINT s*(b - a)*(d - c)/(p*q)
The rest of the routine consists of a definition for f, and an assignment of values to
the limits a, b, c, d.
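For readers working in a modern language, the double loop above might be written in Python roughly as follows. This is a sketch under stated assumptions rather than the authors' routine: the function name `midpoint_2d`, the test integrand x² + y⁴ on the unit square (whose exact integral is 1/3 + 1/5), and the choice p = q = 100 are illustrative.

```python
def midpoint_2d(f, a, b, c, d, p, q):
    """Midpoint approximation to the integral of f over the rectangle
    a <= x <= b, c <= y <= d, using a p-by-q grid of subrectangles."""
    s = 0.0
    for j in range(p):
        x = a + (j + 0.5) * (b - a) / p      # midpoint in the x direction
        for k in range(q):
            y = c + (k + 0.5) * (d - c) / q  # midpoint in the y direction
            s += f(x, y)
    return s * (b - a) * (d - c) / (p * q)

def f(x, y):
    return x ** 2 + y ** 4

approx = midpoint_2d(f, 0.0, 1.0, 0.0, 1.0, 100, 100)
exact = 1.0 / 3.0 + 1.0 / 5.0
print(approx, exact, abs(approx - exact))
```

To integrate over a more complicated region, one option is to pass a function that returns 0 outside the region, at the cost of slower convergence near the boundary.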
To integrate over a region D in $\mathbb{R}^2$ that's more complicated than a rectangle,
simply enclose D in a rectangle R. Then define f by its given values for (x, y)
Section 78 Numerical Integration 361
in D and define f(x, y ) = 0 for (x , y ) outside D. The Heaviside function H of
Chapter 4, Section 2C is helpful here. Alternatively, it may be simpler first to make
a change of variable in the integral that results in integration over a rectangle.
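The analogous formula for the midpoint approximation for a function g(x, y, z)
integrated over a rectangular region R with extreme corners at (a, c, e) and (b, d, f) is
$\int_R g(x, y, z)\,dx\,dy\,dz \approx \frac{(b - a)(d - c)(f - e)}{pqr}\sum_{j=0}^{p-1}\sum_{k=0}^{q-1}\sum_{l=0}^{r-1} g(\bar x_j, \bar y_k, \bar z_l),$
where
$(\bar x_j, \bar y_k, \bar z_l) = \left(x_j + \tfrac{1}{2}(b - a)/p,\ y_k + \tfrac{1}{2}(d - c)/q,\ z_l + \tfrac{1}{2}(f - e)/r\right)$
$= \left(a + (j + \tfrac{1}{2})(b - a)/p,\ c + (k + \tfrac{1}{2})(d - c)/q,\ e + (l + \tfrac{1}{2})(f - e)/r\right).$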
7B Simpson Approximations
If the integrand f in a multiple integral is a fairly smooth function we can take advantage
of its smoothness by repeated use of the 1-dimensional Simpson approximation
over an even number p of intervals:
$\int_a^b f(x)\,dx \approx \frac{b - a}{3p}\bigl(f(x_0) + 4f(x_1) + 2f(x_2) + \cdots + 4f(x_{p-1}) + f(x_p)\bigr)$
where $x_j = a + j(b - a)/p$. The pattern for the coefficients $s_j^{(p)}$ is such that the
first and the last coefficients are $s_0^{(p)} = s_p^{(p)} = 1$, while the intermediate values are
given by the formula $s_j^{(p)} = 3 - (-1)^j$, in other words, alternating 4's and 2's,
beginning and ending with 4. The requirement for the even number of subdivision
intervals comes from the geometry underlying the method; with just two intervals,
the Simpson approximation is precisely the integral over $a \le x \le b$ of the unique
quadratic polynomial that interpolates the values of f at a, $\tfrac{1}{2}(a + b)$ and b.
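The coefficient pattern translates directly into code. The Python sketch below is an added illustration; the names `simpson_coeff` and `simpson_1d` are hypothetical, and the check on a cubic relies on the standard fact that Simpson's rule integrates polynomials of degree at most three exactly.

```python
def simpson_coeff(j, p):
    """Simpson coefficient: 1 at both ends, otherwise 3 - (-1)^j,
    which alternates 4, 2, 4, ..., 4 for j = 1, ..., p - 1."""
    if j == 0 or j == p:
        return 1
    return 3 - (-1) ** j

def simpson_1d(f, a, b, p):
    """Simpson approximation over an even number p of equal subintervals."""
    if p <= 0 or p % 2 != 0:
        raise ValueError("p must be a positive even integer")
    h = (b - a) / p
    total = sum(simpson_coeff(j, p) * f(a + j * h) for j in range(p + 1))
    return (b - a) / (3 * p) * total

print([simpson_coeff(j, 6) for j in range(7)])      # coefficient pattern for p = 6
print(simpson_1d(lambda x: x ** 3, 0.0, 2.0, 2))    # compare with the exact value 4
```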
To apply the Simpson formula to a double integral we use a two-stage Simpson
approximation to an iterated integral
$F(y) = \int_a^b f(x, y)\,dx \approx \frac{b - a}{3p}\sum_{j=0}^{p} s_j^{(p)} f(x_j, y)$, where $x_j = a + j(b - a)/p$.
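$\int_c^d F(y)\,dy \approx \frac{d - c}{3q}\sum_{k=0}^{q} s_k^{(q)} F(y_k).$
$\int_c^d\left[\int_a^b f(x, y)\,dx\right]dy \approx \frac{(b - a)(d - c)}{9pq}\sum_{k=0}^{q} s_k^{(q)}\left(\sum_{j=0}^{p} s_j^{(p)} f(x_j, y_k)\right)$
$= \frac{(b - a)(d - c)}{9pq}\sum_{j=0}^{p}\sum_{k=0}^{q} s_j^{(p)} s_k^{(q)} f(x_j, y_k).$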
Note that in this formula we use the values of f at all grid points in the rectangle R,
including those on its boundary. The double loop in the implementing routine has
the form
SET s = 0
FOR j = 0 TO p (Odd number of values, with p even.)
FOR k = 0 TO q (Odd number of values, with q even.)
LET s = s + S(j,p)*S(k,q)*f(a + j*(b - a)/p, c + k*(d - c)/q)
NEXT k
NEXT j
PRINT s*(b - a)*(d - c)/(9*p*q)
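A Python rendering of the Simpson double loop might look like the following. It is a sketch under stated assumptions, not the authors' routine: the names `S` and `simpson_2d` mirror the pseudocode but are otherwise hypothetical, and the test integrand x² + y⁴ on the unit square with p = q = 20 is chosen only for illustration.

```python
def S(j, p):
    """Simpson coefficient: 1 at the ends, otherwise alternating 4 and 2."""
    if j == 0 or j == p:
        return 1
    return 3 - (-1) ** j

def simpson_2d(f, a, b, c, d, p, q):
    """Two-stage Simpson approximation to the integral of f over the
    rectangle a <= x <= b, c <= y <= d; p and q must be even."""
    if p <= 0 or q <= 0 or p % 2 != 0 or q % 2 != 0:
        raise ValueError("p and q must be positive even integers")
    s = 0.0
    for j in range(p + 1):          # p + 1 grid values in x, boundary included
        x = a + j * (b - a) / p
        for k in range(q + 1):      # q + 1 grid values in y, boundary included
            y = c + k * (d - c) / q
            s += S(j, p) * S(k, q) * f(x, y)
    return s * (b - a) * (d - c) / (9 * p * q)

approx = simpson_2d(lambda x, y: x ** 2 + y ** 4, 0.0, 1.0, 0.0, 1.0, 20, 20)
exact = 1.0 / 3.0 + 1.0 / 5.0
print(approx, exact, abs(approx - exact))
```

For a smooth integrand like this one, far fewer grid points are typically needed than with the midpoint approximation to reach the same accuracy.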
EXERCISES
1. Use a program to implement the midpoint approximation for an f(x, y) and test it on the example $f(x, y) = x^2 + y^4$ over the rectangle R: $0 \le x \le 1$, $0 \le y \le 1$. Having computed the correct value via indefinite integrals, you can find out how small p and q can be while still producing four-place accuracy.
2. Use a program to implement Simpson's approximation for f(x, y) and test it on the same example as in the previous problem to find minimal values for p and q for four-place accuracy, such that reducing either p or q fails to yield that degree of accuracy; in particular, increasing p or q should not change the first four digits.
In Exercises 3 to 8, use the midpoint or Simpson approximations to find an approximate value to four-place accuracy for $\int_R f(x, y)\,dx\,dy$, where f, R are
3. $f(x, y) = 1 - x^2 - y^2$, R: $0 \le x \le 2$, $0 \le y \le 3$.
4. $f(x, y) = 1 - x - y$, R: $0 \le x$, $0 \le y$, $x + y \le 1$.
5. $f(x, y) = 1 - x^2 - y^2$, R: $x^2 + y^2 \le 1$.
6. $f(x, y) = 1 - x^2 - y^2$, R: $x^2 + y^2 \le 1$, $y \ge 0$.
7. $f(x, y) = 1 - x^2 - y^2$, R: $0 \le x$, $0 \le y$, $x^2 + y^2 \le 1$.
8. $f(x, y) = y$, R: $0 \le x \le 1$, $0 \le y \le x$.
9. The double integral
\[ G(a, b, c, d) = \int_a^b \int_c^d e^{-x^2 - y^2}\,dx\,dy \]
(a) Compute the value of the integral over $\mathbb{R}^2$ by using polar coordinates over a disk of radius R as R tends to $\infty$.
(b) Compute approximations to $\pi$ by finding Simpson approximations to 4G(0, a, 0, a) for suitable values of a.
(c) Estimate how large you need to make the positive number a in part (b) to get four-place accuracy.
In Exercises 10 to 13, use Simpson's approximation to find approximate values for the following integrals. The Simpson formula for a triple integral over a 3-dimensional rectangle R is
\[ \iiint_R g(x, y, z)\,dx\,dy\,dz = \int_e^f \Big[ \int_c^d \Big[ \int_a^b g(x, y, z)\,dx \Big] dy \Big] dz \approx \frac{(b-a)(d-c)(f-e)}{27pqr} \sum_{j=0}^{p} \sum_{k=0}^{q} \sum_{l=0}^{r} S_j^{(p)} S_k^{(q)} S_l^{(r)} g(x_j, y_k, z_l), \]
$0 \le z \le 3$.
13. $\int_R (x + y + z)\,dx\,dy\,dz$, R: $0 \le x \le 1$, $0 \le y \le 1$, $0 \le z \le 1$.
14. (a) Sketch the region in $\mathbb{R}^2$ bounded by the four lines $x + y = 1$, $x + 2y = 4$, $x - 2y = -1$ and $x - 3y = 1$.
(b) Find an approximate value for the area of the region in part (a). Can you find the exact value, 187/120, by elementary geometry?
(c) Find an approximate value for the integral of $f(x, y) = x^3 + y^3$ over the region described in part (a).
15. (a) Sketch the region in the positive octant of $\mathbb{R}^3$ bounded by the planes $x + y + z = 3$, $x + y + 2z = 6$, $z = 1$ and $z = 2$.
(b) Find an approximate value for the volume of the region in part (a). Can you find the exact value by elementary geometry?
(c) Find an approximate value for the integral of $f(x, y, z) = x^4 + y^4 + z^4$ over the region described in part (a).
Chapter 7 REVIEW
1 2
S. forr [fo rr sin(x - y)dy] dx.
2. ill [12 ex-ydy] dx. Sketch the region of integration stated for each of the
following double integrals. Then evaluate the integral.
.£ rlox c-Ydy] dx. You may want to pick your order carefully, and you
2
3• may even want to change coordinates.
then evaluate one of them.
Make a sketch of each of the following plane or solid regions numbered 26 through 33, and find the area or volume as the case may be. Use your own reasonable choices for the constants in making the sketches.
26. Elliptic region E in $\mathbb{R}^2$: $x^2/a^2 + y^2/b^2 \le 1$.
27. Solid B based on the region E of the previous exercise and with square cross sections perpendicular to the x-axis.
29. Ellipsoid B: $x^2/a^2 + y^2/b^2 + z^2/c^2 \le 1$. Note that B has elliptical cross sections.
30. Region H in $\mathbb{R}^2$: $x^2 \le y^2 + 1$, $|y| \le 1$.
31. Solid B generated by rotating the region H of the previous exercise about the y-axis.
10. $\int_D (x^2 + y^2)^5\,dx\,dy$; D is the disk of radius 4 centered at (0, 0).
11. $\int_D (x^2 - y^2)^2\,dx\,dy$; D is the disk of radius 1 centered at (0, 0).
12. $\int_D (1 + x^2 + y^2)^{-3/2}\,dV$; D is the disk of radius 2 centered at (0, 0).
14. $\int_S x^2 y^2\,dx\,dy$; S is the square $|x| + |y| \le 1$.
Sketch the 3-dimensional region of integration of each of the following integrals. Then evaluate the integral, possibly after a change of coordinates.
15. fo
1
[i 2
[fi\yzdz] dy]dx.
16. $\int_0^1 \Big[ \int_0^x \Big[ \int_0^y xyz\,dz \Big] dy \Big] dx$.
17. $\int_C z(x^2 + y^2)\,dV$; C is the solid cylinder $x^2 + y^2 \le 1$, $0 \le z \le 2$.
18. $\int_C (x^2 + y^2 + z^2)\,dV$; C is the solid cylinder $x^2 + y^2 \le 4$, $0 \le z \le 1$.
19. $\int_B (x^2 + y^2 + z^2)\,dV$; B is the solid ball $x^2 + y^2 + z^2 \le 1$.
32. Solid B generated by rotating the region H of the previous exercises about the x-axis.
33. Solid B bounded by the three coordinate planes in $\mathbb{R}^3$ and the plane $ax + by + cz = 1$, where a, b, c are positive.
34. For the integral
\[ \int_0^1 dy \int_y^{\sqrt{y}} 2xy\,dx, \]
(a) Sketch the region of integration in $\mathbb{R}^2$.
(b) Evaluate the integral.
35. Given
coordinates to compute $\int yz\,dV$.
Find the volume of C.
41. Let C be a solid cylinder of radius 1 symmetric about the z-axis. Let W be the wedge-shaped subset of C where $0 \le z \le x$. Write an iterated integral for $\int_W z\,dV$
(a) in rectangular coordinates.
(b) in cylindrical coordinates.
(c) Evaluate the multiple integral in whichever way you prefer.
42. (a) Set up an iterated integral whose evaluation will yield the volume of the spherical ball $B_a$ of radius a centered at the origin in $\mathbb{R}^3$.
(b) Use Jacobi's theorem to show that every spherical ball of radius a in $\mathbb{R}^3$ has the same volume.
43. Let R be the region in $\mathbb{R}^3$ determined by the inequalities $0 \le x$, $0 \le y$, $x^2 + y^2 \le 1$ and $0 \le z \le x^2 + y^2$. Evaluate $\int_R xyz\,dV$.
50. Let R be the region in the first quadrant of the xy-plane where $4 \le xy \le 9$ and $1 \le y/x \le 4$.
(a) Solve the equations $u = xy$ and $v = y/x$ uniquely for x and y in R in terms of u and v.
(b) Express $\int_R x^{-2}\,dx\,dy$ as an integral in u and v, and find its value.
In Exercises 51 to 56, evaluate those improper integrals that have finite values; for those that don't, explain why they fail to converge.
51. $\int_{\mathbb{R}^2} e^{-(x^2 + y^2)}\,dx\,dy$.
52. $\int_{\mathbb{R}^2} (x^2 + y^2)^{-1}\,dx\,dy$.
53. $\int (x^2 + y^2)^{-1/3}\,dA$.
54. r
.r2+y2~,
2
e-../xZ+y dx dy.
44. Compute the value of the integral off (x, y) = x + y 2 2 }R2
over the triangle in JR 2 with corners at (0, 0), (l, 0), and 55. r e-(x +i+z ) dx dy dz.
23
2
f.!
(I, I) by computing iterated integrals in both possible }JR.3
orders.
45. Find limits of integration, some of which may be noncon- 56.1 x2+y2+z2iJ
(x
2
+ y2 + z2 )- 2dV.
stant, for this integral if the region of integration is the
circular disk of radius 2 centered at (3, 0): In Exercises 57 to 60, decide for what values of a the
integrals have finite values.
j~b [id f(x, y) dx] dy. 57.
1.r2+y2~J (x 2
1
+ y 2)a
dA
60.1 x2+y2+z2~1 (x 2
;
+y + z2 )a
dV
(a)
(b)
Sketch the region of integration.
Write the integral in terms of rectangular coordi-
nates.
{21r fl {~2 (c) Write the integral in terms of spherical coordinates.
61. lo d0 lo dr lo rdz is expressed m cylindrical (d) Compute the value of the integral.
coordinates.
CHAPTER 8
The present chapter features just one of several ways to extend the Fundamental
Theorem, and we postpone further extensions to Chapter 9. Here we first introduce a
simple generalization of the integral itself, called the line integral over a parametrized
curve. Combining the line integral with the gradient operator we'll arrive at an
analogue of the Fundamental Theorem,
which we'll use to express a key relationship between the physical concepts work
and energy. Recall that curves in $\mathbb{R}^2$ and $\mathbb{R}^3$ are in a sense negligible in the multiple
integrals of Chapter 7. However, all the integrals in this chapter will reduce to
ordinary one-variable integrals for computational purposes, even though the concepts
that give rise to them have a distinctly higher-dimensional flavor.
Differentiation along a curve is a concept intrinsic to the geometry of the curve,
typically distinct from differentiation of the curve's parametrization. The idea is
particularly important in describing motion along curves. In some applications to
motion on curves we use the term particle to describe a body moving on a curved
path, but we often understand particle to stand for the position of the center of mass
of what is really a large and complex body such as the earth, a simplification that
turns out to be adequate for many purposes.
368 Chapter 8 Integrals and Derivatives on Curves
FIGURE 8.1
(a) (b)
We'll be particularly interested in the arrows of the field F that stem from points
on γ, as shown in Figure 8.1(b). These arrows will depend on t in a specific way
if we introduce the composition F(g(t)). At each point g(t) on γ there is also a
tangent vector g'(t), and the dot product
\[ \mathbf{F}(\mathbf{g}(t)) \cdot \mathbf{g}'(t) \]
is a continuous real-valued function for $a \le t \le b$. The line integral of F over γ is,
by definition,
\[ \int_\gamma \mathbf{F} \cdot d\mathbf{x} = \int_a^b \mathbf{F}(\mathbf{g}(t)) \cdot \mathbf{g}'(t)\,dt. \tag{1.1} \]
\[ \int_0^1 (t^2, t^4, t^6) \cdot (1, 2t, 3t^2)\,dt = \int_0^1 (t^2 + 2t^5 + 3t^8)\,dt \]
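The definition translates directly into a one-variable numerical integral. The sketch below is an illustration only: the helper `line_integral` is a hypothetical name, and the field F(x, y, z) = (x², y², z²) with curve g(t) = (t, t², t³) is an assumption chosen because it reproduces the integrand t² + 2t⁵ + 3t⁸ displayed above; the surviving text does not state the field explicitly.

```python
def line_integral(F, g, dg, a, b, n=2000):
    """Midpoint-rule approximation of the integral of F(g(t)) . g'(t) for a <= t <= b."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        field = F(*g(t))          # the arrow of F at the point g(t)
        tangent = dg(t)           # the tangent vector g'(t)
        total += sum(fi * di for fi, di in zip(field, tangent))
    return total * h


# Assumed field and curve (see the note above).
F = lambda x, y, z: (x**2, y**2, z**2)
g = lambda t: (t, t**2, t**3)
dg = lambda t: (1.0, 2 * t, 3 * t**2)

value = line_integral(F, g, dg, 0.0, 1.0)
print(value)  # close to 1/3 + 2/6 + 3/9 = 1
```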
We interpret the line integral in qualitative terms as follows. The dot product
\[ \mathbf{F}(\mathbf{g}(t)) \cdot \frac{\mathbf{g}'(t)}{|\mathbf{g}'(t)|} \]
is the coordinate of F(g(t)) in the direction of the unit tangent vector to γ at g(t).
Then F(g(t)) · g'(t), the integrand in Formula 1.1, is the tangential coordinate of
F(g(t)) times |g'(t)|, the speed of traversal of γ at g(t). In particular, if F(g(t)) is
always perpendicular to γ at g(t), the integrand, and hence the integral, will be zero.
At the other extreme, for a given field F, if the speed |g'(t)| is prescribed at each
point of the curve, then the integrand will be maximized by choosing a curve γ that
at each point has the same direction as the field there. Thus the integrand in the line
integral is a local measure of the circulation of the vector field along γ. The term
circulation is justified by the frequent interpretation of F as the velocity field of a
fluid flow.
Section 1A Line Integrals 369
\[ = \int_0^{\pi/2} (-\cos t \sin t + \sin t \cos t)\,dt = \int_0^{\pi/2} 0\,dt = 0. \]
FIGURE 8.2
where s is the distance traversed and $F_t$ is the force in the unit direction t of motion.
In Figure 8.2(b) a particle moves along a line having direction vector t with |t| = 1,
and it is subject at each point x to the constant force vector F. The coordinate of F
in the direction of motion is $F_t = \mathbf{F} \cdot \mathbf{t}$, so work is
\[ W = (\mathbf{F} \cdot \mathbf{t})\,s. \]
will approximate γ as shown in Figure 8.3, since the number $|\mathbf{g}'(t_{k-1})|\,|t_k - t_{k-1}|$
approximates the distance from $\mathbf{g}(t_{k-1})$ to $\mathbf{g}(t_k)$. We fix a point $\mathbf{x}_k = \mathbf{g}(t_k)$ on γ, and
near $\mathbf{x}_k$ approximate F by the constant field $\mathbf{F}(\mathbf{x}_k)$. That is, near $\mathbf{x}_k$ we approximate
F(x) by the vector field that assigns the constant vector $\mathbf{F}(\mathbf{x}_k)$ to every point. The
tangential coordinate of $\mathbf{F}(\mathbf{x}_k)$ is $\mathbf{F}(\mathbf{x}_k) \cdot \mathbf{t}(t_k)$, where $\mathbf{t}(t) = \mathbf{g}'(t)/|\mathbf{g}'(t)|$. Thus the
work done in moving a particle along γ from $\mathbf{x}_k$ to $\mathbf{x}_{k+1}$ is approximately
\[ w_k = \big(\mathbf{F}(\mathbf{x}_k) \cdot \mathbf{t}(t_k)\big)\,|\mathbf{g}'(t_k)|\,(t_{k+1} - t_k) \]
FIGURE 8.3
an integral formula that we define to be the work done by the field F in moving the
particle through the domain of F along y.
A suggestive shorthand notation for the general line integral uses the unit tangent
vector t(t) = g'(t)/|g'(t)| to a smooth path of integration on which g'(t) ≠ 0. Given
that arc length along such a path γ is defined by $s(t) = \int_a^t |\mathbf{g}'(u)|\,du$, it's natural to
write ds = |g'(t)| dt for the so-called arc length differential. The line integral for
work is then the natural extension of the special case W = (F · t)s:
\[ W = \int_\gamma \mathbf{F} \cdot \mathbf{t}\,ds. \]
This way of writing the integral captures the essence of our interpretation of the line
integral, since F · t is the coordinate of F along the tangent direction to γ.
The assumptions that F be continuous and that g' be continuous assured that the
integrand F(g(t)) · g'(t) would be continuous and hence that the line integral would
exist. However, these conditions are stronger than necessary. It's enough to assume
that the path of integration is piecewise smooth and then that the vector field F is
sufficiently regular so the integral in Formula 1.1 exists. Thus the derivative g' may
be discontinuous at finitely many points, allowing γ to have sharp corners at some
points.
EXAMPLE 3 Let a vector field be defined in $\mathbb{R}^3$ by F(x, y, z) = (x, y, z). Let the curve γ in $\mathbb{R}^3$
be given by $\mathbf{g}(t) = (\cos t, \sin t, |t - \pi/2|)$ for $0 \le t \le \pi$. Then γ has a corner at
(0, 1, 0), where $t = \pi/2$. Indeed, g is not differentiable there, and $\lim_{t \to \pi/2^-} \mathbf{g}'(t) = (-1, 0, -1)$
and $\lim_{t \to \pi/2^+} \mathbf{g}'(t) = (-1, 0, 1)$, showing that the direction of the tangent
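\[ \int_a^b \mathbf{F}(\mathbf{g}(t)) \cdot \mathbf{g}'(t)\,dt = \int_a^b \Big[ F_1(x, y, z)\frac{dx}{dt} + F_2(x, y, z)\frac{dy}{dt} + F_3(x, y, z)\frac{dz}{dt} \Big] dt. \]
\[ \int_\gamma F_1\,dx + F_2\,dy + F_3\,dz. \]
This is still shorter if we write $d\mathbf{x} = (dx, dy, dz)$, giving a convenient shorthand for
the line integral of F over γ:
\[ \int_\gamma \mathbf{F} \cdot d\mathbf{x}. \]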
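\[ \mathbf{F}(x, y, z) = (x - y,\; y - z,\; z - x). \]
A curve γ, given by $\mathbf{g}(t) = (t, -t, t^2)$ for $0 \le t \le 1$, passes through the field. We
compute the integral of F over γ as follows. First, the values of F on γ are given by
\[ \mathbf{F}(\mathbf{g}(t)) = (2t,\; -t - t^2,\; t^2 - t). \]
We write
\[ \int_\gamma \mathbf{F} \cdot d\mathbf{x} = \int_0^1 \big[ (2t)(1) + (-t - t^2)(-1) + (t^2 - t)(2t) \big]\,dt = \int_0^1 (2t^3 - t^2 + 3t)\,dt = \frac{5}{3}. \]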
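The arithmetic in this example can be checked exactly with rational numbers. The following sketch is only an illustration; the helper names `poly_mul`, `poly_add`, and `poly_integral_01` are hypothetical, and polynomials in t are stored as coefficient lists in increasing powers.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest power first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out


def poly_add(p, q):
    """Sum of two polynomials, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [Fraction(a) + Fraction(b) for a, b in zip(p, q)]


def poly_integral_01(p):
    """Exact integral of sum c_k t^k over 0 <= t <= 1."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))


F_on_curve = [[0, 2], [0, -1, -1], [0, -1, 1]]   # (2t, -t - t^2, t^2 - t)
g_prime = [[1], [-1], [0, 2]]                    # (1, -1, 2t)

integrand = [Fraction(0)]
for Fi, gi in zip(F_on_curve, g_prime):
    integrand = poly_add(integrand, poly_mul(Fi, gi))

result = poly_integral_01(integrand)
print(integrand, result)  # coefficients of 3t - t^2 + 2t^3, and the value 5/3
```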
If we can choose coordinates so that one or more of the sections of a line integral
path are parallel to an axis, computing the value of an integral may be substantially
simplified, as in the following example.
FIGURE 8.4 (a) (b)
EXAMPLE 5 Let F(x, y) = (x, y) define a 2-dimensional velocity field along the path δ from
(0, 0) to (a, 0) along the x-axis and then along the line segment from (a, 0) to
(a, b). Thus δ consists of a path $\delta_1$ along the x-axis followed by a path $\delta_2$ parallel
to the y-axis. See Figure 8.4(a), where it's assumed that a > 0 and b > 0. No
matter how the path is parametrized by a function $\mathbf{g}(t) = (g_1(t), g_2(t))$, we see
that $g_2'(t) = 0$ along $\delta_1$. In other words, dy = 0 along $\delta_1$. Similarly, dx = 0
along $\delta_2$. Since $\mathbf{F} \cdot d\mathbf{x} = x\,dx + y\,dy$, we can compute the circulation of F along
δ using x and y as parameters on $\delta_1$ and $\delta_2$ respectively. For arbitrary a and b
we have
\[ \int_\delta x\,dx + y\,dy = \int_{\delta_1} x\,dx + \int_{\delta_2} y\,dy = \int_0^a x\,dx + \int_0^b y\,dy = \tfrac12 a^2 + \tfrac12 b^2. \]
Let F(x, y) = (x, y) as in the previous example. This time we integrate F over the
closed path, shown in Figure 8.4(a), consisting of two circular arcs with their ends
joined by radial segments. The entire path γ is to be traced counterclockwise. Over
the circular arcs the tangent vectors t are perpendicular to the vector field arrows,
so F · t = 0 there. Thus the integral of F is zero over $\gamma_2$ and $\gamma_4$. Let t be the unit
tangent at a typical point of the segment $\gamma_1$. Since F and t point in the same direction
along $\gamma_1$, $\mathbf{F} \cdot \mathbf{t} = |\mathbf{F}| = \sqrt{x^2 + y^2}$ at a point (x, y) of $\gamma_1$. In contrast, the tangent
vectors to $\gamma_3$ all point toward the origin. Since F and t point in opposite directions
along $\gamma_3$, $\mathbf{F} \cdot \mathbf{t} = -|\mathbf{F}| = -\sqrt{x^2 + y^2}$ at a point (x, y) of $\gamma_3$. The integrals of F are
thus negatives of each other:
\[ \int_{\gamma_3} \mathbf{F} \cdot \mathbf{t}\,ds = -\int_{\gamma_1} \mathbf{F} \cdot \mathbf{t}\,ds. \]
Thus the two remaining integrals cancel in the computation of the integral over γ,
giving zero net circulation over the complete circuit.
Figure 8.4(b) shows the 3-dimensional vector field F(x, y, z) = -yi + xj + k along
with the helix x(t) = (2 cos t)i + (2 sin t)j + (t)k. To integrate F over the helix we
compute F · dx along the curve. Note that
\[ \mathbf{F}(\mathbf{x}(t)) = (-2\sin t)\mathbf{i} + (2\cos t)\mathbf{j} + \mathbf{k}, \quad \text{and} \quad \dot{\mathbf{x}}(t) = (-2\sin t)\mathbf{i} + (2\cos t)\mathbf{j} + \mathbf{k}. \]
Hence the velocity vector to the helix at a point x coincides with the vector F(x),
so $\mathbf{F}(\mathbf{x}) \cdot \dot{\mathbf{x}} = |\dot{\mathbf{x}}|^2 = 5$. It follows that the circulation of F along the helix from x(a)
to x(b) is
\[ \int_a^b \mathbf{F}(\mathbf{x}(t)) \cdot \dot{\mathbf{x}}(t)\,dt = \int_a^b 5\,dt = 5(b - a). \]
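A short numerical check of the helix computation may help fix the idea that the integrand is the constant 5. The function names below are hypothetical and the midpoint rule is just one convenient choice of quadrature.

```python
import math


def F(x, y, z):
    """The field F(x, y, z) = -y i + x j + k."""
    return (-y, x, 1.0)


def helix(t):
    return (2 * math.cos(t), 2 * math.sin(t), t)


def helix_velocity(t):
    return (-2 * math.sin(t), 2 * math.cos(t), 1.0)


def integrand(t):
    """F(x(t)) . x'(t), which should equal 5 for every t."""
    return sum(f * v for f, v in zip(F(*helix(t)), helix_velocity(t)))


def circulation(a, b, n=1000):
    """Midpoint-rule approximation of the circulation from x(a) to x(b)."""
    h = (b - a) / n
    return h * sum(integrand(a + (i + 0.5) * h) for i in range(n))


print(integrand(0.7), circulation(0.0, 2 * math.pi), 5 * 2 * math.pi)
```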
L F•dx.
EXAMPLE 8 Consider three parametrizations for the line segment joining a and b:
\[ \nabla f(\mathbf{x}) = \Big( \frac{\partial f}{\partial x_1}(\mathbf{x}), \ldots, \frac{\partial f}{\partial x_n}(\mathbf{x}) \Big). \]
If ∇f is continuous, it generalizes the derivative in the formula for the fundamental
theorem of calculus for one variable:
\[ \int_\gamma \nabla f \cdot d\mathbf{x} = f(\mathbf{b}) - f(\mathbf{a}). \]
In particular, the value of the line integral of a gradient field over a curve depends
only on the endpoints of the curve; thus in this case, the notation
\[ \int_{\mathbf{a}}^{\mathbf{b}} \nabla f \cdot d\mathbf{x} = f(\mathbf{b}) - f(\mathbf{a}) \]
\[ = \int_a^b \frac{d}{dt} f(\mathbf{g}(t))\,dt, \]
Section 1B Line Integrals 375
where in the second step we used the chain rule, Theorem 1.3 of Chapter 6. But by
Equation (*), the fundamental theorem for one variable, the last integral is equal to
f(g(b)) - f(g(a)) = f(b) - f(a). ■
Consider the vector field ∇f(x, y) in $\mathbb{R}^2$, where $f(x, y) = \tfrac12(x^2 + y^2)$. Then
∇f(x, y) = (x, y). If γ is some continuously differentiable curve with respective
initial and final endpoints $\mathbf{x}_1 = (x_1, y_1)$ and $\mathbf{x}_2 = (x_2, y_2)$, then
\[ \int_\gamma \nabla f(\mathbf{x}) \cdot d\mathbf{x} = \int_{(x_1, y_1)}^{(x_2, y_2)} x\,dx + y\,dy = f(x_2, y_2) - f(x_1, y_1) \]
\[ \int_\gamma y\,dx + x\,dy = x_2 y_2 - x_1 y_1. \]
In particular, if the path starts at $(x_0, y_0) = (1, 2)$, and ends at (x, y), we find that
\[ \int_{(1,2)}^{(x,y)} y\,dx + x\,dy = f(x, y) - f(1, 2) = xy - 2. \]
Examples 9 and 10 show in detail how line integrals solve a vector equation
\[ \nabla f(x, y) = (F_1(x, y), F_2(x, y)), \qquad f(x_0, y_0) = 0, \]
provided that $(F_1(x, y), F_2(x, y))$ is a given gradient field. The solution is then
\[ f(x, y) = \int_{(x_0, y_0)}^{(x, y)} F_1(x, y)\,dx + F_2(x, y)\,dy. \]
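As a hedged illustration of this solution formula, the sketch below integrates the gradient field (y, x) from (1, 2) to (x, y) along two different paths and compares both results with xy - 2. The function names and the choice of a straight path and a two-leg path are assumptions made for the demonstration.

```python
def integrate(func, a, b, n=2000):
    """Midpoint-rule approximation of a one-variable integral."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


def F(x, y):
    """The gradient field of f(x, y) = xy."""
    return (y, x)


def along_segment(x0, y0, x1, y1):
    """Line integral of F over the straight segment from (x0, y0) to (x1, y1)."""
    dx, dy = x1 - x0, y1 - y0

    def integrand(t):
        f1, f2 = F(x0 + t * dx, y0 + t * dy)
        return f1 * dx + f2 * dy

    return integrate(integrand, 0.0, 1.0)


def potential_straight(x, y, x0=1.0, y0=2.0):
    return along_segment(x0, y0, x, y)


def potential_two_legs(x, y, x0=1.0, y0=2.0):
    # First parallel to the x-axis, then parallel to the y-axis.
    return along_segment(x0, y0, x, y0) + along_segment(x, y0, x, y)


print(potential_straight(3.0, 5.0), potential_two_legs(3.0, 5.0), 3.0 * 5.0 - 2.0)
```

Agreement of the two paths is what one expects for a gradient field; for a field such as (-y, x) the two values would differ.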
EXERCISES
1. $\int_L x\,dx + x^2\,dy + y\,dz$, where L is given by g(t) = (t, t, t), for $0 \le t \le 1$.
2. $\int_P (x + y)\,dx + dy$, where P is given by $g(t) = (t, t^2)$, $0 \le t \le 1$.
3. $\int_{\gamma_1} x\,dy$ and $\int_{\gamma_2} x\,dy$, where $\gamma_1$ is given by g(t) = (cos t, sin t) for $0 \le t \le 2\pi$, and where $\gamma_2$ is given by h(t) = (cos t, sin t) for $0 \le t \le 4\pi$.
4. $\int_{\gamma_1} (dx + dy)$, where $\gamma_1$ is given parametrically by (x, y) = (cos t, sin t), $0 \le t \le 2\pi$.
5. $\int_{\gamma_1} \dfrac{dx + dy}{x^2 + y^2}$, where $\gamma_1$ is the curve in Exercise 4.
8. $\int_\gamma \mathbf{F} \cdot d\mathbf{x}$, where F(x, y, z, w) = (x, x, y, xw) and γ is given by (x, y, z, w) = (t, 1, t, t), $0 \le t \le 2$.
In Exercises 9 to 12, let $\gamma_1$ be given by (x, y) = (cos t, sin t), $0 \le t \le \pi/2$ and $\gamma_2$ by (x, y) = (1 - u, u), $0 \le u \le 1$. Compute $\int_{\gamma_1} (f\,dx + g\,dy)$ and $\int_{\gamma_2} (f\,dx + g\,dy)$ for the given choices of f and g.
9. f(x, y) = x, g(x, y) = x + 1.
10. f(x, y) = x + y, g(x, y) = 1.
11. $f(x, y) = \dfrac{1}{x^2 + y^2}$, $g(x, y) = \dfrac{1}{x^2 + y^2}$.
$\int_\gamma (F_1\,dx + F_2\,dy)$ for the given choices of $F_1$ and $F_2$.
13. $F_1(x, y) = x^2$, $F_2(x, y) = y^2$.
14. $F_1(x, y) = xy^2$, $F_2(x, y) = x^2 y$.
15. $F_1(x, y) = \sin y$, $F_2(x, y) = x\cos y$.
16. $F_1(x, y) = e^{x-y}$, $F_2(x, y) = -e^{x-y}$.
17. Find the work done in moving a particle along the curve $(x, y, z) = (t, t, t^2)$, $0 \le t \le 2$, under the influence of the field F(x, y, z) = (x + y, y, y).
18. (a) Find the work done by the force field F(x, y) = yi - xj in moving a particle clockwise once around the circle of radius 1 centered at the origin in $\mathbb{R}^2$.
(b) How does the answer to part (a) change if the circle is moved so that its center is at an arbitrary point
f(t) = (cos t, sin t), $0 \le t \le \pi/2$, and $g(u) = \Big( \dfrac{1 - u^2}{1 + u^2}, \dfrac{2u}{1 + u^2} \Big)$, $0 \le u \le 1$, are equivalent parametrizations of a quarter-circle. (The relevant definition of equivalence is given in the preamble to Theorem 1.2.)
21. Show that $f(t) = (t^{1/2}, t^{3/2})$, $1 \le t \le 2$, and $g(u) = (u, u^3)$, $1 \le u \le \sqrt{2}$, are equivalent parametrizations of a cubic curve. (The relevant definition of equivalence is given in the preamble to Theorem 1.2.)
22. Show that if $\int_\gamma \mathbf{F} \cdot d\mathbf{x}$ and $\int_\gamma \mathbf{G} \cdot d\mathbf{x}$ exist, then
clockwise.
(c) Compute $\oint_r \mathbf{F} \cdot d\mathbf{x}$, where r is a rectangle with sides parallel to the axes, traced counterclockwise. [Hint: Only the vertical sides of the rectangle make a nonzero contribution.]
(d) Do a computation analogous to the ones in parts (b) and (c) for a triangle with vertices at (0, 0), (a, 0) and (0, b) where a and b are positive. What is the pattern in the answers?
27. The purpose of this exercise is to display a pattern in the results of integrating the vector field $\mathbf{G}(x, y) = -\tfrac12 y\,\mathbf{i} + \tfrac12 x\,\mathbf{j} = (-\tfrac12 y, \tfrac12 x)$ over some closed paths in $\mathbb{R}^2$.
(a) Make a sketch of the vector field G.
(b) Compute $\oint_c \mathbf{G} \cdot d\mathbf{x}$, where c is a circular path of radius a, centered at $(\alpha, \beta)$ and traced counterclockwise.
33. (a) Sketch the vector field F(x, y) = (-y, x).
(b) Show that the vector field of part (a) can't be the gradient of a real-valued function f by finding distinct paths from some point a to another point b ≠ a such that the integrals of F over these paths have different values.
(c) Show that the vector field of part (a) can't be the gradient of a real-valued function f by finding a closed path γ starting and ending at the same point such that $\oint_\gamma \mathbf{F} \cdot d\mathbf{x} \ne 0$.
34. Assume a continuous vector field F(x) satisfies (i) |F(x)| = k for a constant k > 0, and (ii) F(x) is tangent at each point x to a continuously differentiable curve γ of finite length. Prove that $\int_\gamma \mathbf{F} \cdot d\mathbf{x}$ equals ±k times the length of γ.
\[ s(t) = \int_{t_0}^{t} |\dot{\mathbf{x}}(u)|\,du. \]
The definition is a natural one, because the length $|\dot{\mathbf{x}}(t)|$ is the speed at which the
curve is being traced at time t. But a given path in space can be traced at many
different varying speeds, including possible multiple back-and-forth tracings. Hence
it's useful to have a standard parametrization for a curve that depends only on the
intrinsic geometry of the image set of the curve. If points in the image correspond
one-to-one with values of arc length measured from a specific point $\mathbf{x}_0$ on the image
curve, we can use arc length s as the parameter in a representation of the curve
by a function g(s), where $\mathbf{g}(s_0) = \mathbf{x}_0$. It would then follow that $s = \int_{s_0}^{s} |\dot{\mathbf{g}}(u)|\,du$.
Differentiating both sides of the equation with respect to s gives $1 = |\dot{\mathbf{g}}(s)|$. For
this reason we say that a curve g(t), $t_0 \le t \le t_1$, is parametrized by arc length if
$|\dot{\mathbf{g}}(t)| = 1$ for all t in the parameter interval; in other words the curve is traced with
constant speed 1. The expression $|\dot{\mathbf{x}}(t)|\,dt = |\mathbf{g}'(t)|\,dt$ is traditionally called the arc
length element of the curve.
Let a > 0 be the radius of a circle centered at the origin in $\mathbb{R}^2$. The simplest
parametrization for this circle is g(t) = (a cos t, a sin t), $0 \le t \le 2\pi$. Since $|\dot{\mathbf{g}}(t)| =
|(-a\sin t, a\cos t)| = a$, the circle has been parametrized by arc length just when
a = 1. For other values of a, note that the arc length of the part of the circle
corresponding to the parameter interval $0 \le u \le t$ is
\[ s = \int_0^t |\dot{\mathbf{g}}(u)|\,du = \int_0^t a\,du = at. \]
\[ \Big| \frac{d\mathbf{g}(s/a)}{ds} \Big| = |(-\sin(s/a), \cos(s/a))| = 1. \]
The plane curve $\mathbf{g}(t) = (t, \tfrac23 t^{3/2})$ has velocity vector $\dot{\mathbf{g}}(t) = (1, t^{1/2})$, so the speed
is $|\dot{\mathbf{g}}(t)| = \sqrt{1 + t}$. Arc length in terms of the parameter t measured from 0 is then
\[ s(t) = \int_0^t \sqrt{1 + u}\,du = \Big[ \tfrac23 (1 + u)^{3/2} \Big]_0^t = \tfrac23 (1 + t)^{3/2} - \tfrac23. \]
Solving for t in terms of s gives $t = (1 + \tfrac32 s)^{2/3} - 1$. Exercise 14 shows that if we
use arc length as parameter, letting $\mathbf{h}(s) = \mathbf{g}((1 + \tfrac32 s)^{2/3} - 1)$, then $|\mathbf{h}'(s)| = 1$, so
the curve is now traced by h(s) with uniform speed 1 for s > 0.
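The unit-speed claim can be checked numerically. The sketch below is an illustration under the assumption that a central difference with a small step is an adequate stand-in for the derivative; the function names are hypothetical.

```python
import math


def g(t):
    """The curve g(t) = (t, (2/3) t^(3/2)) for t >= 0."""
    return (t, (2.0 / 3.0) * t ** 1.5)


def t_of_s(s):
    """Parameter value t at arc length s measured from t = 0."""
    return (1.0 + 1.5 * s) ** (2.0 / 3.0) - 1.0


def h(s):
    """The same curve reparametrized by arc length."""
    return g(t_of_s(s))


def speed(curve, u, eps=1e-6):
    """Central-difference estimate of the speed of a plane curve at parameter u."""
    (x0, y0), (x1, y1) = curve(u - eps), curve(u + eps)
    return math.hypot(x1 - x0, y1 - y0) / (2 * eps)


for s in (0.5, 1.0, 3.0):
    print(s, speed(h, s))        # each value should be close to 1
print(speed(g, 3.0))             # close to sqrt(1 + 3) = 2
```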
Section 2 Weighted Curves and Surfaces of Revolution 379
If the curve happens to be parametrized by something other than arc length, using
instead a function g(t) on $t_0 \le t \le t_1$, the arc length function s is given by
\[ s(t) = \int_{t_0}^{t} |\dot{\mathbf{g}}(u)|\,du. \]
\[ M = \int_{s_0}^{s_1} \mu(\mathbf{h}(s))\,ds = \int_{t_0}^{t_1} \mu(\mathbf{h}(s(t)))\,\frac{ds(t)}{dt}\,dt = \int_{t_0}^{t_1} \mu(\mathbf{g}(t))\,|\dot{\mathbf{g}}(t)|\,dt. \]
The differential $ds = |\dot{\mathbf{g}}(t)|\,dt$ is called the arc length differential for a curve
parametrized by g(t).
Suppose that the density of the helix at a point x is equal to the square of the distance
from x to the midpoint q = (0, 0, π) of the helix's axis. Thus the density at g(t) is
\[ |\mathbf{g}(t) - \mathbf{q}|^2 = a^2\cos^2 t + a^2\sin^2 t + (t - \pi)^2 = a^2 + (t - \pi)^2. \]
FIGURE 8.6
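The total mass for this density can be sketched numerically. The parametrization g(t) = (a cos t, a sin t, t), 0 ≤ t ≤ 2π, is an assumption here: it is consistent with the axis midpoint (0, 0, π) and with the density formula above, but the surviving text does not display it. The function names are hypothetical, and the closed form is obtained by integrating $(a^2 + (t - \pi)^2)\sqrt{a^2 + 1}$ over the assumed interval.

```python
import math


def helix_mass(a, n=2000):
    """Midpoint-rule mass of the assumed helix g(t) = (a cos t, a sin t, t), 0 <= t <= 2*pi."""
    speed = math.sqrt(a * a + 1.0)            # |g'(t)| for the assumed helix
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        density = a * a + (t - math.pi) ** 2  # squared distance to q = (0, 0, pi)
        total += density * speed
    return total * h


def helix_mass_closed_form(a):
    """Exact value of the same integral under the same assumption."""
    return math.sqrt(a * a + 1.0) * (2 * math.pi * a * a + 2 * math.pi ** 3 / 3)


print(helix_mass(1.0), helix_mass_closed_form(1.0))
print(helix_mass(0.0), 2 * math.pi ** 3 / 3)   # limiting value as the radius shrinks to 0
```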
This definition gives the expected result for commonly met surfaces such as cylinders,
spheres, and cones, and is a special case of a more comprehensive definition given in
the next chapter that includes surfaces that aren't necessarily surfaces of revolution.
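In a typical application we're given a curve parametrized in $\mathbb{R}^2$ by g(t) =
(x(t), y(t)), $t_0 \le t \le t_1$. The arc length element is $ds = \sqrt{\dot{x}(t)^2 + \dot{y}(t)^2}\,dt$. If
$y(t) \ge 0$ and we rotate the curve about the x-axis then r(s(t)) = y(t). Hence
\[ \sigma(S) = \int_{t_0}^{t_1} 2\pi\,y(t)\sqrt{\dot{x}(t)^2 + \dot{y}(t)^2}\,dt. \]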
The circle parametrized by (y(t), z(t)) = (a cos t, b + a sin t), $0 \le t \le 2\pi$, has radius
a and center at (0, b) in the yz-plane. If 0 < a < b, rotating about the y-axis generates
a torus or "donut" surface T, shown in Figure 8.7(a). The arc length element is
$|\dot{\mathbf{g}}(t)|\,dt = \sqrt{(-a\sin t)^2 + (a\cos t)^2}\,dt = a\,dt$, and r(s(t)) = (b + a sin t). Then
\[ \sigma(T) = \int_0^{2\pi} 2\pi(b + a\sin t)\,a\,dt = 2\pi ab \int_0^{2\pi} dt + 2\pi a^2 \int_0^{2\pi} \sin t\,dt = 4\pi^2 ab. \]
FIGURE 8.7 (a) (b)
If O < b < a, the surface of revolution is more complicated than a torus; to make
the surface area computation valid we would have to replace the integrand by its
absolute value.
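Both cases can be explored with a small numerical sketch. The function name `torus_area` is hypothetical, and taking the absolute value of b + a sin t inside the integrand is the adjustment described in the remark about 0 < b < a; for 0 < a < b it changes nothing.

```python
import math


def torus_area(a, b, n=2000):
    """Midpoint-rule area of the surface swept by the circle of radius a centered at distance b."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        radius = b + a * math.sin(t)          # signed distance from the axis of rotation
        total += 2 * math.pi * abs(radius) * a
    return total * h


print(torus_area(1.0, 3.0), 4 * math.pi ** 2 * 1.0 * 3.0)   # agree when a < b
print(torus_area(2.0, 1.0), 4 * math.pi ** 2 * 2.0 * 1.0)   # differ when b < a
```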
A line segment of length l extends from the origin in $\mathbb{R}^2$ to a point h units above
the positive x-axis and is then rotated about the x-axis to produce a cone C. Slitting
the cone along the segment allows it to be rolled out flat as a sector of a circle, as
shown in Figure 8.7(b). The circular arc of the sector has length 2πh, which is the
circumference of the cone's base. The area of the sector is thus 2πh/(2πl) = h/l
times the area $\pi l^2$ of a full circle of radius l. Hence the cone should have area
\[ \sigma(C) = (h/l)\,\pi l^2 = \pi h l. \]
Using instead the previous displayed formula, we'll represent the segment as
the graph of $f(x) = (h/\sqrt{l^2 - h^2})\,x$ for $0 \le x \le \sqrt{l^2 - h^2}$. We find $ds =
(l/\sqrt{l^2 - h^2})\,dx$. In agreement with our purely geometric computation we get
\[ \sigma(C) = \int_0^{\sqrt{l^2 - h^2}} 2\pi \frac{h}{\sqrt{l^2 - h^2}}\,x\,\frac{l}{\sqrt{l^2 - h^2}}\,dx = \frac{2\pi h l}{l^2 - h^2} \int_0^{\sqrt{l^2 - h^2}} x\,dx = \pi h l. \]
This example provides some supporting evidence for the correctness of the definition
of σ(S).
EXERCISES
In Exercises 1 to 4, find the length l(γ) of the indicated curves.
1. (x, y) = (t, ln cos t), $0 \le t \le 1$.
2. $(x, y) = (t^2, \tfrac23 t^3 - \tfrac12 t)$, $0 \le t \le 2$.
3. $y = x^{3/2}$, $0 \le x \le 5$.
4. $g(t) = (6t^2, 4\sqrt{2}\,t^3, 3t^4)$, $-1 \le t \le 2$.
5. If a curve is described in plane polar coordinates by a function r = f(θ) for $a \le \theta \le b$, then in rectangular coordinates the curve may be parametrized by
\[ (x, y) = (r\cos\theta, r\sin\theta) = (f(\theta)\cos\theta, f(\theta)\sin\theta), \quad a \le \theta \le b. \]
(a) Show that the arc length formula for a curve r = f(θ) in polar coordinates is
\[ l(\gamma) = \int_a^b \sqrt{(f(\theta))^2 + (f'(\theta))^2}\,d\theta. \]
(b) Sketch the curve given by r = (1 + cos θ) for $0 \le \theta \le \pi$ and find its length.
6. A 5-foot piece of wire is coiled in a uniform spiral 3 inches in diameter. Find the height of the coil if it contains six complete turns.
7. Find the total mass of the helix g(t) = (a cos t, a sin t, bt), $0 \le t \le 2\pi$, if its density per unit length at (x, y, z) is equal to $x^2 + y^2 + z^2$.
8. Let γ be a continuously differentiable curve with endpoints $\mathbf{p}_1$ and $\mathbf{p}_2$. Let λ be the line segment $\mathbf{p}_1 + t(\mathbf{p}_2 - \mathbf{p}_1)$, $0 \le t \le 1$. Prove that $l(\lambda) \le l(\gamma)$. Thus the shortest distance between two points is a straight line. [Hint: Use the result of Exercise 7(c) of Chapter 7, Section 3.]
9. Find the total mass of the wire with shape $(x, y, z) = (6t^2, 4\sqrt{2}\,t^3, 3t^4)$, $0 \le t \le 1$,
(a) if the density at the point corresponding to t is $t^2$.
(b) if the density at a point is equal to the square of its distance from the yz-plane.
10. Suppose γ is given by g(t) for $a \le t \le b$ and γ is then reparametrized by arc length s so that t = t(s). Show that the line integral equation
\[ \int_a^b \mathbf{F}(\mathbf{g}(t)) \cdot \mathbf{g}'(t)\,dt = \int_0^{l(\gamma)} \mathbf{F}(\mathbf{h}(s)) \cdot \mathbf{t}(s)\,ds \]
holds, where h(s) = g(t(s)) and t(s) = (dh/ds)(s). [Hint: Use the change of variable theorem for integrals.]
11. Compute the surface area of a sphere $S_a$ of radius a in two ways using Equation 2.2.
(a) Parametrize a semicircle in $\mathbb{R}^2$ by g(t) = (a cos t, a sin t), $0 \le t \le \pi$. Show that ds = a dt and r(s(t)) = a sin t. Then rotate the semicircle about the horizontal axis to get $\sigma(S_a) = 4\pi a^2$.
(b) Parametrize a semicircle by $g(t) = (t, \sqrt{a^2 - t^2})$, $-a \le t \le a$. Show that $ds = a(a^2 - t^2)^{-1/2}\,dt$ and $r(s(t)) = (a^2 - t^2)^{1/2}$. Then rotate the semicircle about the horizontal axis to get $\sigma(S_a) = 4\pi a^2$.
12. The graph of y = a cosh(x/a), $0 \le x \le b$, is rotated about the x-axis in $\mathbb{R}^2$ to generate a surface S. Find σ(S).
13. (a) Set up an integral for the arc length of the ellipse in $\mathbb{R}^2$ parametrized by (x, y) = (a cos t, b sin t), $0 \le t \le 2\pi$.
(b) Assume a > b and show that the arc length integral found in part (a) is equal to
\[ 4a \int_0^{\pi/2} \sqrt{1 - k^2\sin^2 t}\,dt, \quad k^2 = (1 - b^2/a^2). \]
This integral is a standard form of an elliptic integral; it can't be evaluated using elementary functions if $0 < k^2 < 1$.
(c) Approximate the length of the ellipse if a = 2 and b = 1, either by direct numerical approximation of the integral using Simpson's rule or by finding the value of the elliptic integral in a table.
14. If γ is given by $g(t) = (t, \tfrac23 t^{3/2})$, and $h(s) = g((1 + \tfrac32 s)^{2/3} - 1)$, show that |h'(s)| = 1 for s > 0, and that the parametrization h(s) is an arc length parametrization for γ.
15. Compute $|\dot{\mathbf{g}}(t)|$ for the helix g(t) = (cos t, sin t, t) and use the result to find an arc length parametrization for this helix.
16. In Example 3 of the text, if the radius a of the helix tends to zero the total mass $M_a$ tends to $2\pi^3/3$. What geometric interpretation does this number have in the present context?
17. The centroid of a curve γ of finite length l(γ) is the average position $\mathbf{p}_0$ of the points on the curve. Thus the vector $\mathbf{p}_0$ is given in terms of an arc length parametrization h(s) or more general parametrizations g(t) in a vector-valued integral by
\[ \mathbf{p}_0 = \frac{1}{l(\gamma)} \int_{s_0}^{s_1} \mathbf{h}(s)\,ds = \frac{1}{l(\gamma)} \int_{t_0}^{t_1} \mathbf{g}(t)\,|\dot{\mathbf{g}}(t)|\,dt. \]
(a) Let γ be an arc of a circle of radius a such that the ends of the arc subtend angle θ at the center of the circle. Show that the centroid of γ lies at distance (2a/θ) sin(θ/2) from the center of the circle, measured along the line from the center of the circle to the midpoint of the arc.
(b) Show that the half-turn of a helix parametrized by g(t) = (a cos t, a sin t, bt), $0 \le t \le \pi$, has its centroid at $\mathbf{p}_0 = (0, 2a/\pi, b\pi/2)$.
18. Using the definition of centroid of a curve γ in the previous exercise, prove Pappus's theorem: Rotating a plane curve about a line in the same plane generates a surface S of area σ(S) equal to l(γ) times the circumference of the circle traced by rotating the centroid of γ about the line. [Hint: The distance in $\mathbb{R}^2$ from a point y to a line is $|\mathbf{n} \cdot (\mathbf{y} - \mathbf{x}_0)|$, where $\mathbf{n} \cdot (\mathbf{y} - \mathbf{x}_0) = 0$ is a normalized equation for the line.]
Section 3 Normal Vectors and Curvature 383
The purpose of this section is to analyze the connection between the shape of a
smooth curve in space and the variety of possible motions that can occur along the
path of the curve. We'll denote position on a curve as a function of time by x = x(t)
and assume x(t) is twice continuously differentiable. Since arc length s(t) along a
curve parametrized by x(t) is an integral of speed $|\dot{\mathbf{x}}(t)|$, it follows that the speed is
v = ds/dt, or sometimes more conveniently $v = \dot{s}$. Thus
\[ \frac{ds}{dt}(t) = \dot{s}(t) = |\dot{\mathbf{x}}(t)|. \]
It's customary to denote the vector of length 1 having the same direction as the
velocity or tangent vector $\mathbf{v}(t) = \dot{\mathbf{x}}(t)$ by t(t). Recall that, by definition, $\dot{s} \ne 0$ on a
smooth curve. Thus we can write $\dot{\mathbf{x}}(t) = \dot{s}(t)\,\mathbf{t}(t)$ or $\mathbf{t} = (1/\dot{s})\,\dot{\mathbf{x}}$, with |t| = 1.
Turning to the acceleration vector along the curve, we have by definition $\ddot{\mathbf{x}} =
\dot{\mathbf{v}} = d(\dot{s}\,\mathbf{t})/dt$. Now apply the product rule for a scalar times a vector, Formula 1.3
in Chapter 4, Section 1, to get
\[ \ddot{\mathbf{x}} = \ddot{s}\,\mathbf{t} + \dot{s}\,\dot{\mathbf{t}}. \tag{3.1} \]
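As a first step in interpreting Equation 3.1, we'll verify in the following proof that
the vector $\dot{\mathbf{t}}(t)$ is orthogonal to t(t), that is $\mathbf{t}(t) \cdot \dot{\mathbf{t}}(t) = 0$. Since t ≠ 0, this means that
either (i) $\dot{\mathbf{t}} = 0$ or else (ii) $\dot{\mathbf{t}}$ is perpendicular to t. In case $\dot{\mathbf{t}}(t) \ne 0$, we define a unit
vector n(t) called the principal normal to the curve at a point by $\mathbf{n}(t) = (|\dot{\mathbf{t}}|^{-1})\,\dot{\mathbf{t}}$.
Thus $\dot{\mathbf{t}} = |\dot{\mathbf{t}}|\,\mathbf{n}$. This observation allows us to refine Equation 3.1 as follows.
3.2 Theorem. For a twice-differentiable smooth curve x(t), the acceleration vector
is expressible as a sum of orthogonal components as
\[ \ddot{\mathbf{x}} = \ddot{s}\,\mathbf{t} + \dot{s}\,|\dot{\mathbf{t}}|\,\mathbf{n}. \]
Proof. The orthogonality of t and n follows from having $|\mathbf{t}|^2 = \mathbf{t} \cdot \mathbf{t}$ equal to a
constant, namely 1 in this case; just differentiate $\mathbf{t} \cdot \mathbf{t} = 1$ with respect to time t using
the product rule. On the left we get $(d/dt)\,\mathbf{t} \cdot \mathbf{t} = \dot{\mathbf{t}} \cdot \mathbf{t} + \mathbf{t} \cdot \dot{\mathbf{t}} = 2\,\mathbf{t} \cdot \dot{\mathbf{t}}$. On the right side
the derivative of 1 is 0, so $\mathbf{t} \cdot \dot{\mathbf{t}} = 0$. Hence t and $\dot{\mathbf{t}}$ are orthogonal. Since $\dot{\mathbf{t}} = |\dot{\mathbf{t}}|\,\mathbf{n}$,
we can rewrite Equation 3.1 as claimed. ■
FIGURE 8.8
Figure 8.8 is a typical picture of how t and n relate to the path followed by a curve.
The two orthogonal components, $\mathbf{a}_t = \ddot{s}\,\mathbf{t}$ and $\mathbf{a}_n = \dot{s}\,|\dot{\mathbf{t}}|\,\mathbf{n}$, are called respectively the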
tangential acceleration and the centripetal acceleration of motion along the curve.
The tangential acceleration measures the rate of change of speed along the curve.
The centripetal acceleration measures the rate at which the motion bends away from
a straight-line path. The force that bends the path of a particle of mass m generates
the centripetal acceleration, so the centripetal component of that force is $m\mathbf{a}_n$, and
the total force is $m\mathbf{a}_t + m\mathbf{a}_n$.
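The decomposition can be computed from velocity and acceleration alone, since Theorem 3.2 implies that the tangential coefficient is the dot product of the acceleration with the unit tangent. The sketch below is illustrative; the function name and the parabola x(t) = (t, t²) used as a test case are choices made here, not taken from the text.

```python
import math


def decompose_acceleration(velocity, acceleration):
    """Split acceleration into components along and perpendicular to velocity."""
    speed = math.sqrt(sum(v * v for v in velocity))
    if speed == 0.0:
        raise ValueError("velocity must be nonzero on a smooth curve")
    unit_tangent = [v / speed for v in velocity]
    # Coefficient of t in Theorem 3.2: the rate of change of speed.
    s_ddot = sum(a * u for a, u in zip(acceleration, unit_tangent))
    tangential = [s_ddot * u for u in unit_tangent]
    centripetal = [a - c for a, c in zip(acceleration, tangential)]
    return tangential, centripetal


# Parabola x(t) = (t, t^2) at t = 1: velocity (1, 2), acceleration (0, 2).
a_t, a_n = decompose_acceleration([1.0, 2.0], [0.0, 2.0])
print(a_t, a_n)
```

For this test case the two components come out orthogonal, and their sum recovers the original acceleration vector.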
EXAMPLE 2 The helix x(t) = (a cos t, a sin t, bt) has radius a > 0 and vertical climb rate b > 0.
To find its curvature κ we first compute
\[ \dot{\mathbf{x}}(t) = (-a\sin t, a\cos t, b). \]
Hence $\dot{s} = \sqrt{a^2\sin^2 t + a^2\cos^2 t + b^2} = \sqrt{a^2 + b^2}$. Hence
\[ \kappa(x) = \frac{|y''|}{(1 + (y')^2)^{3/2}}. \tag{3.6} \]
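Formula 3.6 is easy to evaluate once y' and y'' are known. The sketch below is a hedged illustration: the function name is hypothetical, the absolute value on y'' is assumed so that the curvature is nonnegative, and the parabola and unit semicircle are test cases chosen here.

```python
def graph_curvature(dy, d2y, x):
    """Curvature of the graph y = y(x) from Formula 3.6, given callables for y' and y''."""
    return abs(d2y(x)) / (1.0 + dy(x) ** 2) ** 1.5


def kappa_parabola(x):
    """Curvature of y = x^2, where y' = 2x and y'' = 2."""
    return graph_curvature(lambda u: 2 * u, lambda u: 2.0, x)


def kappa_unit_semicircle(x):
    """Curvature of y = sqrt(1 - x^2) for -1 < x < 1; the expected value is 1."""
    dy = lambda u: -u / (1.0 - u * u) ** 0.5
    d2y = lambda u: -1.0 / (1.0 - u * u) ** 1.5
    return graph_curvature(dy, d2y, x)


print(kappa_parabola(0.0), kappa_parabola(1.0), kappa_unit_semicircle(0.6))
```

The semicircle check matches the expected curvature 1/a of a circle of radius a = 1.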
EXERCISES
1. Show that the curvature of a plane circular path of radius $a > 0$ is $1/a$.
2. Find the curvature $\kappa(t)$ of the parabola $\mathbf{x}(t) = (t, t^2)$ for $-\infty < t < \infty$.
3. Centripetal acceleration $\mathbf{a}_n$ increases in magnitude if either speed $\dot{s}$ or curvature $\kappa$ is increased by a factor $p > 1$. Which does more to increase $|\mathbf{a}_n|$?
4. For the circular helix motion $\mathbf{x}(t) = (a\cos t, a\sin t, bt)$, show that the tangential component of the acceleration is always zero and that the centripetal component at a point of the path is directed toward the axis of the helix.
5. Equation 3.5 expresses curvature $\kappa$ in terms of the square root of an expression of the form $|\mathbf{a}|^2|\mathbf{b}|^2 - (\mathbf{a}\cdot\mathbf{b})^2$; is this expression always nonnegative? Explain.
6. Motion along a linear path can be described by $\mathbf{x}(t) = \phi(t)\mathbf{c} + \mathbf{d}$, $\mathbf{c} \neq 0$, where we assume that the real-valued function $\phi(t)$ is strictly increasing and has two continuous derivatives. Show that the acceleration vector has centripetal component identically zero.
7. Here is the converse to the statement in the previous exercise: Suppose $\mathbf{x}(t)$ has the centripetal component of its acceleration identically equal to zero, but that $\dot{s}(t) \neq 0$. Then the path of motion is a straight line. Prove this as follows.
(a) Show that $\dot{\mathbf{t}} = 0$ and hence that $\dot{\mathbf{x}} = \dot{s}\,\mathbf{c}$ for some constant vector $\mathbf{c} \neq 0$.
(b) Integrate the result of part (a) with respect to time $t$ to show that $\mathbf{x}(t) = s(t)\mathbf{c} + \mathbf{d}$ for some constant vector $\mathbf{d}$, so that the path of motion is a line.
8. Let $a$, $b$ and $\omega$ be positive constants. Let $g(t) = (a\cos\omega t, a\sin\omega t, bt)$, $t \geq 0$.
(a) Find explicitly the arc length parametrization $h(s)$ of the curve.
(b) Find the unit tangent and principal normal vectors at an arbitrary point $h(s)$.
(c) Find the curvature $\kappa(s)$.
9. Show that the curve $(x, y) = (\cos s, \sin s)$, $0 \leq s \leq 2\pi$, is parametrized by arc length. Sketch the curve together with its velocity and acceleration vectors at $s = \pi/2$.
10. (a) Show that for a line given by $g(t) = t\mathbf{x}_1 + \mathbf{x}_0$, the curvature is identically zero.
(b) Show that if a curve $\gamma$, parametrized by arc length and given by a function $f(s)$, has a tangent at every point and has curvature identically zero, then $\gamma$ is a straight line.
11. Use Theorem 3.2 to show that if a particle of constant mass $m$ moves so that at time $t$ it is at $\mathbf{x}(t)$, then the work done in traversing a part of the path having length $s_0$ is equal to an integral with respect to arc length $s$ along $\mathbf{x}(t)$ in the form
$$W = \int_0^{s_0} m\,\frac{d^2 s}{dt^2}\,ds.$$
12. Here are two different ways to prove Equation 3.5 for curvature.
(a) Verify the equation using the substitutions $\dot{\mathbf{x}} = \dot{s}\,\mathbf{t}$ and $\ddot{\mathbf{x}} = \ddot{s}\,\mathbf{t} + \dot{s}^2\kappa\,\mathbf{n}$. Then expand the dot products
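FIGURE 8.9 (a) $\mathbf{F}(x, y) = \frac{1}{6}(x - y)\,\mathbf{i} + \frac{1}{6}(x + y)\,\mathbf{j}$; (b) $\mathbf{F}(x, y) = \frac{1}{4}x\,\mathbf{i} + \frac{1}{2}y\,\mathbf{j}$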
of the nearby field arrows give an indication of the speed with which a flow line is
traversed at a given point.
A sketch of the 2-dimensional vector field F defined by the vector equation $\mathbf{F}(x, y) = \frac{1}{6}(x - y)\,\mathbf{i} + \frac{1}{6}(x + y)\,\mathbf{j}$ appears in Figure 8.9(a). For example, $\mathbf{F}(1, 1)$ is represented by a vertical arrow of length $\frac{1}{3}$ with its tail at the point (1, 1). Four flow lines have
been sketched in with attention paid to their tangency relation to the arrows in the
field sketch.
If in the previous example we had wanted instead a sketch of the field 6F(x, y) =
(x - y)i + (x + y)j we might have preferred to settle for the picture shown in
Figure 8.9(a) anyway. The point is that making the arrows 6 times as long for 6F
makes for a more cluttered picture, particularly if the domain is enlarged beyond the
one shown in the figure. In accepting the temporary convention that the arrow lengths
be scaled down by the factor $\frac{1}{6}$, we make a clearer picture at the small expense of
asking our minds to interpret the picture as if the arrows were 6 times as long. The
flow lines will appear to be the same in either case, though they will be traced with
6 times the velocity in 6F as in F. There is nothing about the image of the flow lines
themselves that shows their velocities, so we rely on our interpretation of the field
arrows for that information.
Figure 8.9(b) shows a sketch of the field together with some level curves of the
function f. Some flow lines have also been sketched in. These flow lines have the
property that, as well as being tangent to vectors of the field, they are perpendicular
to the level curves that they cross. To sketch the flow lines we can use either tangency
to the field arrows or perpendicularity to the level curves as a guide, whichever seems
simpler. In this example the level curves are the family of ellipses $\frac{1}{8}x^2 + \frac{1}{4}y^2 = k$ or equivalently, with $c = 8k$, $x^2 + 2y^2 = c$; the ellipses are fairly easy to draw, so we
might prefer using these as an aid in sketching the perpendicular flow lines instead
of first sketching the vector field.
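4.1 $$\operatorname{div}\mathbf{F}(x, y) = \frac{\partial F_1}{\partial x}(x, y) + \frac{\partial F_2}{\partial y}(x, y), \quad \text{for } \mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^2.$$
4.2 $$\operatorname{div}\mathbf{F}(x, y, z) = \frac{\partial F_1}{\partial x}(x, y, z) + \frac{\partial F_2}{\partial y}(x, y, z) + \frac{\partial F_3}{\partial z}(x, y, z),$$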
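$$\operatorname{div}\mathbf{F}(x, y) = \frac{\partial (xy)}{\partial x} + \frac{\partial (x - y^2)}{\partial y} = y + (-2y) = -y.$$
$$\operatorname{div}\mathbf{F}(x, y, z) = \frac{\partial x}{\partial x} + \frac{\partial y}{\partial y} + \frac{\partial z}{\partial z} = 1 + 1 + 1 = 3.$$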
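The two divergence computations above can be cross-checked with central differences. This is a minimal sketch; the helper `divergence`, the field names, and the step size `h` are assumptions chosen for illustration.

```python
def divergence(field, point, h=1e-5):
    """Central-difference estimate of div F at a point (works in any dimension)."""
    total = 0.0
    for i in range(len(point)):
        fwd = list(point)
        bwd = list(point)
        fwd[i] += h
        bwd[i] -= h
        # i-th coordinate function differentiated with respect to the i-th variable
        total += (field(*fwd)[i] - field(*bwd)[i]) / (2 * h)
    return total

def planar_field(x, y):
    """F(x, y) = (xy, x - y^2), whose divergence should be -y."""
    return (x * y, x - y * y)

def radial_field(x, y, z):
    """F(x, y, z) = (x, y, z), whose divergence should be 3."""
    return (x, y, z)

print(divergence(planar_field, (1.5, 2.0)))
print(divergence(radial_field, (0.3, -1.0, 2.0)))
```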
If we were to reverse the direction of the field in the previous example and consider
instead the field - F, the arrows in the sketch would all reverse direction and the
flow lines would spiral outward from the origin as in Figure 8.9(a). We would have
$\operatorname{div}(-\mathbf{F})(x, y) = \frac{1}{4}$, and conclude that we have expansion of areas along flow lines of $-\mathbf{F}$.
FIGURE 8.10
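$$\operatorname{div}\mathbf{G}(x, y) = \frac{\partial}{\partial x}\left(\frac{-y}{4\sqrt{x^2 + y^2}}\right) + \frac{\partial}{\partial y}\left(\frac{x}{4\sqrt{x^2 + y^2}}\right) = \frac{xy}{4(x^2 + y^2)^{3/2}} - \frac{xy}{4(x^2 + y^2)^{3/2}} = 0.$$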
Since div G is identically 0 the points in the shaded region in the first quadrant move
along their flow lines during a fixed time interval into a region of the same area.
The Curl of a Vector Field. The term curl is meant to suggest that we're trying
to measure the local tendency of a vector field and its flow lines to circulate around
some axis. In dimension 3 the curl of a differentiable vector field F = F1 i+ F2j+ F3k
is the 3-dimensional field
$$\operatorname{curl}\mathbf{F} = \left(\frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z}\right)\mathbf{i} + \left(\frac{\partial F_1}{\partial z} - \frac{\partial F_3}{\partial x}\right)\mathbf{j} + \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right)\mathbf{k}.$$
We can conclude that along the y-axis, where $z = x = 0$, the vectors of the field curl F all have length 1 and point in the direction of the negative z-axis.
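EXAMPLE 11. Sometimes we can choose coordinates so that the third coordinate function of the given vector field is identically zero and the other two coordinate functions are independent of $z$, that is, $\mathbf{F}(x, y, z) = F_1(x, y)\,\mathbf{i} + F_2(x, y)\,\mathbf{j}$. Then
$$\operatorname{curl}\mathbf{F} = \det\begin{pmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \partial/\partial x & \partial/\partial y & \partial/\partial z \\ F_1(x, y) & F_2(x, y) & 0 \end{pmatrix} = 0\,\mathbf{i} + 0\,\mathbf{j} + \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right)\mathbf{k}.$$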
We conclude that the arrows representing the curl of a field of this special kind either
have length zero or else are parallel to the z-axis.
so we define the curl, sometimes called the scalar curl, of a 2-dimensional vector field to be the real-valued function $\operatorname{curl}\mathbf{F} = \partial F_2/\partial x - \partial F_1/\partial y$. The scalar curl plays an important part in Green's theorem taken up in Section 1 of the next chapter, and it's particularly helpful in conveying an intuitive feeling for the significance of curl F in general. It will follow from Green's theorem that if the scalar
curl of a 2-dimensional field is continuous and positive at a point (x, y), then the
line integral of F over a small enough counterclockwise oriented circle centered
at (x, y) will be positive; thus a field with positive scalar curl will have positive
counterclockwise circulation near (x, y). If curl F(x, y) < 0 the circulation will be
clockwise near (x, y ). If curl F = 0 identically, the circulation will be zero near
every point.
The statements in the previous paragraph can't be interpreted as predictions about
how a fluid particle would move at a given time and place; that information is given
by the vector values of F. Indeed, without some external constraint, a fluid particle
will simply follow a flow line with its velocity at each point x determined by F(x).
Circulation, as defined by a line integral in Section 1, is just a cumulative measure
of the effect of the field along a particular path.
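The 2-dimensional field $\mathbf{F}(x, y) = \frac{1}{8}(-x + y)\,\mathbf{i} + \frac{1}{8}(-x - y)\,\mathbf{j}$ of Example 6, shown in Figure 8.10(a), has
$$\operatorname{curl}\mathbf{F}(x, y) = \frac{1}{8}\frac{\partial(-x - y)}{\partial x} - \frac{1}{8}\frac{\partial(-x + y)}{\partial y} = -\frac{1}{8} - \frac{1}{8} = -\frac{1}{4}.$$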
This tells us that near each point (x, y) the circulation around a counterclockwise
oriented circle is negative, or alternatively that the circulation around a clockwise
circle is positive.
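EXAMPLE 13. The vector field $\mathbf{G}(x, y) = -\frac{1}{4}\left(y/\sqrt{x^2 + y^2}\right)\mathbf{i} + \frac{1}{4}\left(x/\sqrt{x^2 + y^2}\right)\mathbf{j}$ of Example 8 shown in Figure 8.10(b) has a scalar curl given for $(x, y) \neq (0, 0)$ by
$$\operatorname{curl}\mathbf{G}(x, y) = \frac{1}{4\sqrt{x^2 + y^2}} > 0.$$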
Since curl G is positive everywhere on the domain of G we can conclude that the
circulation of G around small enough counterclockwise circles in the domain of
G will be positive. It's tempting to think that by some reasoning we can con-
clude that the circulation around the closed flow lines shown in Figure 8. lO(b)
will also be positive; the conclusion is true, but for a somewhat different reason
explained in Section 1 on line integrals, namely that the tangent vectors to the
curve coincide with the vectors of the field. (See also Exercise 8 of the present
section.)
The vectors of the field G in the previous example all have length $\frac{1}{4}$. By varying the arrow-lengths but leaving the directions alone we find a field having the same flow lines as G but traced with different speeds. For example, consider the vector field
$$\mathbf{H}(x, y) = \frac{4}{\sqrt{x^2 + y^2}}\,\mathbf{G}(x, y) = \frac{-y}{x^2 + y^2}\,\mathbf{i} + \frac{x}{x^2 + y^2}\,\mathbf{j}.$$
Only on the circle x 2 + y 2 = 16 do the two vector fields coincide; outside this circle
the arrows of H are shorter, while inside the circle they are longer. The scalar curl
of H is
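$$\operatorname{curl}\mathbf{H}(x, y) = \frac{\partial}{\partial x}\left(\frac{x}{x^2 + y^2}\right) - \frac{\partial}{\partial y}\left(\frac{-y}{x^2 + y^2}\right) = \frac{(x^2 + y^2) - 2x^2}{(x^2 + y^2)^2} + \frac{(x^2 + y^2) - 2y^2}{(x^2 + y^2)^2} = 0, \quad (x, y) \neq (0, 0).$$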
We conclude that the circulation of H is zero near every point. Nevertheless, it's
intuitively evident that the circulation along the flow lines will be nonzero. (See
Exercise 8.)
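The three scalar curls computed in this section (for F of Example 6, for G, and for H) can be compared side by side with a finite-difference estimate. The sketch below is illustrative only; the helper `scalar_curl`, the step size, and the sample point (3, 4) are assumptions.

```python
import math

def scalar_curl(field, x, y, h=1e-5):
    """Central-difference estimate of dF2/dx - dF1/dy at (x, y)."""
    dF2_dx = (field(x + h, y)[1] - field(x - h, y)[1]) / (2 * h)
    dF1_dy = (field(x, y + h)[0] - field(x, y - h)[0]) / (2 * h)
    return dF2_dx - dF1_dy

def F(x, y):
    """Field of Example 6: (1/8)(-x + y) i + (1/8)(-x - y) j."""
    return ((-x + y) / 8.0, (-x - y) / 8.0)

def G(x, y):
    """Field of Example 8: every arrow has length 1/4."""
    r = math.hypot(x, y)
    return (-y / (4.0 * r), x / (4.0 * r))

def H(x, y):
    """Rescaled field H = (4 / sqrt(x^2 + y^2)) G."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)

# At (3, 4) the distance to the origin is 5, so curl G should be 1/20.
for name, field in (("F", F), ("G", G), ("H", H)):
    print(name, scalar_curl(field, 3.0, 4.0))
```

The fields G and H share the same circular flow lines, yet only G has nonzero scalar curl, which matches the point of the example.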
We'll see in Chapter 9 that the scalar curl measures the local tendency of a 2-
dimensional vector field to have a nonzero circulation about a point, as defined by
line integrals of the field over small closed paths around the point. However, such
a tendency by no means implies that a particle acted upon by a velocity field with
Section 4 Flow Lines, Divergence, and Curl 393
FIGURE 8.11 curl F(x)
nonzero scalar curl will exhibit vortex motion locally. Indeed Figure 8.lO(b) shows
that the circular flow lines would cut right across a small circular path centered at a
point other than the origin. Similar remarks apply to the 3-dimensional vector field curl F of a 3-dimensional field F.
In $\mathbb{R}^3$ we can ask if there is an interpretation not only for the magnitude of curl F but also for its direction, assuming $\operatorname{curl}\mathbf{F}(\mathbf{x}) \neq 0$. The answer is yes, and we interpret the vector curl as follows. Let P be a plane through x with unit normal vector n. For each point of P that is also in the domain of F, project the vector F(x) perpendicularly onto P as shown in Figure 8.11 to get a 2-dimensional vector field $\mathbf{F}_{\mathbf{n}}$ in P having a scalar curl, namely $\operatorname{curl}\mathbf{F}_{\mathbf{n}}$.
The following three observations show that understanding the special case of the
scalar curl is a help in understanding the 3-dimensional vector field curl F.
Statement (i) is just the Theorem of Pythagoras for the large triangle in Figure 8.11. Statement (ii) follows from the first statement, since $\operatorname{curl}\mathbf{F}_{\mathbf{n}}(\mathbf{x})$ is maximized by making $\mathbf{n} \times \operatorname{curl}\mathbf{F}(\mathbf{x}) = 0$. Statement (iii) will follow from Example 5 in Section 1 of Chapter 9.
EXERCISES
10. $\mathbf{F}(x, y, z) = x^3\,\mathbf{i} - y^3\,\mathbf{j} + z^3\,\mathbf{k}$.
Note. General methods for deriving the parametrizations of flow lines in the next three exercises are taken up in Chapter 12.
11. (a) Verify that the gradient field $\mathbf{F}(x, y) = \frac{1}{4}x\,\mathbf{i} + \frac{1}{2}y\,\mathbf{j}$ in Example 3 of the text has flow lines parametrized by $\mathbf{x}(t) = c_1 e^{t/4}\,\mathbf{i} + c_2 e^{t/2}\,\mathbf{j}$, where $c_1$ and $c_2$ are arbitrary real constants.
(b) Show that the flow lines in part (a) usually follow parabolic paths, degenerating in some cases into straight lines heading away from the origin.
12. (a) Verify that the vector field $6\mathbf{F}(x, y) = (x - y)\,\mathbf{i} + (x + y)\,\mathbf{j}$ in Example 2 of the text has flow lines given by $\mathbf{x}(t) = Ae^t\cos(t + \alpha)\,\mathbf{i} + Ae^t\sin(t + \alpha)\,\mathbf{j}$, where $A$ and $\alpha$ are arbitrary real constants.
(b) Show that the flow lines described in part (a) follow generally spiral paths, in one case degenerating into a point.
13. (a) Verify that the vector field $4\mathbf{G}(x, y) = -\left(y/\sqrt{x^2 + y^2}\right)\mathbf{i} + \left(x/\sqrt{x^2 + y^2}\right)\mathbf{j}$ in Example 8 of the text has flow lines parametrized by

$$(g\mathbf{F}) \cdot \operatorname{curl}(g\mathbf{F}) = g^2\,\mathbf{F} \cdot \operatorname{curl}\mathbf{F}$$
holds identically on the domain of F. (This is easy to show if $g$ is constant, but otherwise is a little more work.)
*17. (a) Show that if $\mathbf{F} = \nabla f$ is a gradient field with $f : \mathbb{R}^3 \to \mathbb{R}$ twice continuously differentiable, then curl F is identically zero.
(b) Use part (a) and the result of the previous exercise to find a 3-dimensional vector field G that isn't a gradient field, but such that $\mathbf{G} \cdot \operatorname{curl}\mathbf{G}$ is identically zero.
(c) Find a 3-dimensional differentiable vector field F such that $\mathbf{F} \cdot \operatorname{curl}\mathbf{F}$ is not always zero.
18. The scalar curl of a continuously differentiable 2-dimensional gradient field $\mathbf{F} = \nabla f$ is always zero.
(a) Show this by computing the second-order derivatives in $\operatorname{curl}(\nabla f)$.
(b) Show this by calculating the circulation integral $\oint_c \nabla f \cdot d\mathbf{x}$ over a closed curve $c$.
Chapter 8 REVIEW
Compute the following line integrals by whatever correct 9. Explain in general terms why/(}"}.)= /(Y3), while /(yt)
method seems simplest. has a different value.
1. [ xy dx + (x + y2) dy,
2
where s is the closed counter- 10. (a) Which, if any, of the parametrizations in the previous
exercise are equivalent, so that the line integrals of
clockwise oriented square with comers at (0, 0), (I, 0),
an arbitrary continuous vector field F over them will
(I, I) and (0, I).
l
5. y tangent vector to the curve, pointing in the direction of
x2 + y2 = I. traversal. Show that F(x) • dx = l(y), the length of y.
In Exercises 6 to 9, consider a line integral / (y) = 14. Define a vector field F(x) =
x and parametrize the
fr x2 y dy, where y starts at (0, 0) and ends at (1, l). · segment joining point a and point b by x(1) = 1b+(l-1)a,
with O 5 1::: I.
6. Compute /(yi) if Yl is parametrized by g(t) = (t 2 , t 3 ) (a) Show by direct computation of the line integral that
for05t5I.
7. Compute / (!"2) if l"2 is parametrized by g(t)
0515I.
= (t, 12) for £ F • dx = }(lhl 2 - laJ 2 ).
8. Compute /(YJ) if }'3 is parametrized by g(1) = (1 2, 14) (b) Do the computation in part (a) by finding a real-
for O 5 1 5 1. valued function J such that VJ =
F and applying
Theorem 1.3, the Fundamental Theorem of Calculus 20. Show that the curvature of the graph of y = ex tends
for line integrals. to zero as x - ±oo. Where is the point of maximum
l 5. Let t stand for time and consider the time-dependent plane curvature?
vector field 21. Let x(t) = (a cost, a sin t, bt) where a and b are nonneg-
ative constants.
F(t, x, y) = ((I - t)x - ty, tx + (I - t)y). (a) For a fixed positive value of b, what values of a
yield the maximum and minimum values for the
curvature K?
(a) Find the work done by this field on a particle at (b) For a fixed value of a, what values of b yield the
g(t) = (cost, sin t) in the time interval O ~ t ~ 2JT. maximum and minimum values for the curvature K?
(b) How does the answer to part (a) change if, instead
22. Let x(t) trace a smooth curve with speed s(t) along the
of varying with time, the vectors of the field are
curve.
constantly equal to their values at some fixed time
to? Explain why the answer is geometrically evident (a) Show that lx{t)1 2 ::: ls (t)1 2 .
for t = 0 and for t == I.
s
(b) Show that the magnitude 2 K of the centripetal
component of acceleration along the curve is equal
16. Let y be a smooth curve parametrized by x == g{s), to Jlx(t) 12 - ls(t) 12 at x(t), so that the discrepancy
0 ~ s ~ [, where s stands for arc length measured along between lxl and Isl is zero just when K == 0.
the curve starting at g{O) and ending at g(l). Show that
if F is a continuous vector field defined along y, then 23. Using the cross-product, show that curvature of a smooth
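The fundamental theorem of calculus for one variable says that if $f'$ is integrable for $a \leq t \leq b$, then
$$\int_a^b f'(t)\,dt = f(b) - f(a).$$
In Section 1 of the previous chapter, the theorem was extended to line integrals of a gradient $\nabla f$ by the equation
$$\int_{\mathbf{a}}^{\mathbf{b}} \nabla f \cdot d\mathbf{x} = f(\mathbf{b}) - f(\mathbf{a}).$$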
The main theorems of the present chapter are also variations on the idea that an
integral of some kind of derivative of a function can be evaluated by using only the
values of the function itself on a boundary set, for example the endpoints a and b in
the fundamental theorem stated above. We begin with the version known as Green's theorem.
$$\int_D \left(\frac{\partial G}{\partial x} - \frac{\partial F}{\partial y}\right) dx\,dy = \oint_\gamma F\,dx + G\,dy, \qquad (3)$$
398 Chapter 9 Vector Field Theory
Suppose that $D$ is the square defined by $-1 \leq x \leq 1$, $-1 \leq y \leq 1$, and let $F$ and $G$ be defined on $D$ by $F(x, y) = -ye^x$ and $G(x, y) = xe^y$. Then
$$\frac{\partial G}{\partial x}(x, y) - \frac{\partial F}{\partial y}(x, y) = e^y + e^x$$
so
$$\int_D \left(\frac{\partial G}{\partial x} - \frac{\partial F}{\partial y}\right) dx\,dy = \int_{-1}^{1} dx \int_{-1}^{1} (e^y + e^x)\,dy = \int_{-1}^{1} (e + 2e^x - e^{-1})\,dx = 4(e - e^{-1}).$$
We parametrize the boundary curve $\gamma$ in four pieces $\gamma_i$, $i = 1, 2, 3, 4$, each with $-1 \leq t \leq 1$.
FIGURE 9.1
$$\int_{\gamma_1} F\,dx + G\,dy = \int_{\gamma_1} -ye^x\,dx + xe^y\,dy = \int_{-1}^{1} \left[(-te)\frac{dx}{dt} + e^t\frac{dy}{dt}\right] dt = \int_{-1}^{1} e^t\,dt = e - \frac{1}{e}.$$
Similarly, the integrals over the other three sides are also equal to $(e - 1/e)$, so
$$\oint_\gamma F\,dx + G\,dy = 4\left(e - \frac{1}{e}\right).$$
Equation (3) is thus verified for this particular example.
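The same verification can be repeated numerically. The sketch below approximates both sides of Equation (3) for $F(x, y) = -ye^x$ and $G(x, y) = xe^y$ on the square with the midpoint rule; the function names and grid sizes are assumptions chosen for illustration, and the boundary is traced counterclockwise through the four corners.

```python
import math

def F(x, y):
    return -y * math.exp(x)

def G(x, y):
    return x * math.exp(y)

def area_side(n=400):
    """Midpoint-rule estimate of the double integral of dG/dx - dF/dy = e^y + e^x."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h
        for j in range(n):
            y = -1.0 + (j + 0.5) * h
            total += (math.exp(y) + math.exp(x)) * h * h
    return total

def boundary_side(n=4000):
    """Midpoint-rule estimate of the line integral of F dx + G dy around the square."""
    corners = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0)]
    total = 0.0
    for (x0, y0), (x1, y1) in zip(corners, corners[1:]):
        dx = (x1 - x0) / n
        dy = (y1 - y0) / n
        for k in range(n):
            xm = x0 + (k + 0.5) * dx
            ym = y0 + (k + 0.5) * dy
            total += F(xm, ym) * dx + G(xm, ym) * dy
    return total

exact = 4.0 * (math.e - 1.0 / math.e)
print(area_side(), boundary_side(), exact)
```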
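1.1 Theorem. Let $\gamma$ be a smooth curve and let F be a continuous vector field defined on $\gamma$. Denote by $\gamma^-$ the curve $\gamma$ traced in the opposite direction. Then
$$\int_{\gamma^-} \mathbf{F} \cdot d\mathbf{x} = -\int_\gamma \mathbf{F} \cdot d\mathbf{x}.$$
$$\int_{\gamma^-} \mathbf{F} \cdot d\mathbf{x} = -\int_a^b \mathbf{F}(g(a + b - t)) \cdot g'(a + b - t)\,dt.$$
$$\int_{\gamma^-} \mathbf{F} \cdot d\mathbf{x} = \int_b^a \mathbf{F}(g(u)) \cdot g'(u)\,du$$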
We can prove Green's Theorem most easily for regions D such that y, the bound-
ary of D, is crossed at most twice by a line parallel to a coordinate axis. Such a
region is called simple. Thus a coordinate line intersects the boundary of a simple
region either in a line segment or else in at most two points. Using Theorem 1.1 we can extend the theorem to finite unions of simple regions. A few such are shown
in Figure 9.2, where only D1 is simple. As shown for D2, when the boundary of
the region is not a single curve, only the outer boundary is traced counterclockwise,
while the inner boundary is traced clockwise. A rule that covers all cases is to trace
each piece of the boundary so that the region is always to the left as a point traces
the boundary. Line integrals around a path that begins and ends at the same point,
called a closed path or circuit, are important enough that they are often distinguished
from other integrals by means of an integral sign like $\oint$, perhaps with an added arrow to indicate a direction of traversal.
FIGURE 9.2 (a) (b) (c)
1.2 Green's Theorem. Let $D$ be a bounded plane region that is a finite union of simple regions, each with a boundary consisting of a piecewise smooth curve. Let $F$ and $G$ be continuously differentiable real-valued functions defined on $D$ together with $\gamma$, the boundary of $D$. Then
$$\int_D \left(\frac{\partial G}{\partial x} - \frac{\partial F}{\partial y}\right) dx\,dy = \oint_\gamma F\,dx + G\,dy,$$
where $\gamma$ is parametrized so that it's traced once, with $D$ on the left.
Proof Consider first the case in which D is a simple region, with boundary y
parametrized by
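Since
$$\oint_\gamma F\,dx + G\,dy = \oint_\gamma F\,dx + \oint_\gamma G\,dy,$$
we can work with each of the terms on the right separately. We have
$$\oint_\gamma F\,dx = \int_a^b F(g_1(t), g_2(t))\,g_1'(t)\,dt.$$
FIGURE 9.3
The curve $\gamma$ consists of the graphs of two functions $u(x)$ and $v(x)$, perhaps together with one or two vertical segments, as shown in Figure 9.3. On a vertical segment, $g_1$ is constant, so $g_1' = 0$ there. On the remaining parts of $\gamma$ we apply the change of variable $x = g_1(t)$ so that, on the top curve, $g_2(t) = y = u(x)$, whereas on the bottom, $g_2(t) = y = v(x)$. It follows that
$$\oint_\gamma F\,dx = \int_\alpha^\beta F(x, v(x))\,dx - \int_\alpha^\beta F(x, u(x))\,dx = \int_\alpha^\beta \left[-\int_{v(x)}^{u(x)} \frac{\partial F}{\partial y}(x, y)\,dy\right] dx = \int_D -\frac{\partial F}{\partial y}\,dx\,dy.$$
$$\oint_\gamma G(x, y)\,dy = \int_D \frac{\partial G}{\partial x}\,dx\,dy.$$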
Combining this equation with the previous one gives Green's Theorem for the special
X
class of simple regions.
We now extend the theorem to a finite union, $D = D_1 \cup \cdots \cup D_K$, of simple regions each with a piecewise smooth boundary curve $\gamma_k$, $k = 1, \ldots, K$. Applying Green's Theorem to each simple region $D_k$ we get
$$\int_D \left(\frac{\partial G}{\partial x} - \frac{\partial F}{\partial y}\right) dx\,dy = \oint_{\gamma_1} F\,dx + G\,dy + \cdots + \oint_{\gamma_K} F\,dx + G\,dy.$$
FIGURE 9.4
Now the boundary of $D$ consists of pieces taken from several of the curves $\gamma_k$. In addition, there may be parts of curves $\gamma_k$ that are not a part of $\gamma$ but that act as a common boundary to two simple regions. The effect is illustrated in Figure 9.5. A piece $\delta$ of common boundary will be traced in one direction or the opposite depending on which simple region it's associated with. But for a line integral we always have, by Theorem 1.1,
$$\int_\delta F\,dx + G\,dy + \int_{\delta^-} F\,dx + G\,dy = 0,$$
where $\delta^-$ is $\delta$ traced in reverse order. Thus although the parts of the curves $\gamma_k$ that make up $\gamma$ contribute to $\oint_\gamma F\,dx + G\,dy$, the other parts cancel, leaving
$$\int_D \left(\frac{\partial G}{\partial x} - \frac{\partial F}{\partial y}\right) dx\,dy = \oint_\gamma F\,dx + G\,dy.$$
FIGURE 9.5
FIGURE 9.6 (a) (b)
1B Changing Paths
The last part of the proof just given extends Green's Theorem from simple regions to those such as are shown in Figure 9.6. The extension has an important consequence for line integrals $\int F\,dx + G\,dy$ over two closed curves $\gamma$ and $\delta$, when the functions $F$ and $G$ are defined in the region $D$ between $\gamma$ and $\delta$. In Figure 9.6(a), the curves are traced in the same direction (counterclockwise in the figure), and in Figure 9.6(b), the curves go from one point to another in the same direction. Given this relative orientation of the two curves, if the equation
$$\frac{\partial G}{\partial x} - \frac{\partial F}{\partial y} = 0 \qquad (4)$$
holds throughout $D$, then we can conclude that
$$\int_\gamma F\,dx + G\,dy = \int_\delta F\,dx + G\,dy.$$
We'll show the validity of this principle in the next two examples.
$$F(x, y) = \frac{-y}{x^2 + y^2}, \qquad G(x, y) = \frac{x}{x^2 + y^2},$$
for $(x, y) \neq (0, 0)$. Direct computations show that these functions satisfy Equation (4). If $\gamma$ is the ellipse shown in Figure 9.7 and defined by

$$\int_{\gamma \cup c^-} F\,dx + G\,dy = 0,$$
$$\oint_\gamma F\,dx + G\,dy = \oint_c F\,dx + G\,dy.$$
But on $c$ we have $x^2 + y^2 = 1$, so
$$\oint_\gamma F\,dx + G\,dy = \oint_c -y\,dx + x\,dy = \int_0^{2\pi} (\sin^2 t + \cos^2 t)\,dt = 2\pi.$$
It's important to observe that Green's Theorem could not have been applied directly to the entire interior of the ellipse because $\partial G/\partial x$ and $\partial F/\partial y$ fail to exist at the origin.
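The path-changing principle in this example can be tested numerically by integrating $F\,dx + G\,dy$ over the unit circle and over an ellipse around the origin. In the sketch below the ellipse $x = 3\cos t$, $y = 2\sin t$ is a hypothetical choice (the text's ellipse is the one in Figure 9.7), and the helper names are assumptions.

```python
import math

def line_integral(param, dparam, n=2000):
    """Midpoint-rule estimate of the integral of (-y dx + x dy) / (x^2 + y^2) over 0 <= t <= 2*pi."""
    dt = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = param(t)
        dx, dy = dparam(t)
        r2 = x * x + y * y
        total += (-y / r2 * dx + x / r2 * dy) * dt
    return total

def circle(t):
    return (math.cos(t), math.sin(t))

def circle_velocity(t):
    return (-math.sin(t), math.cos(t))

def ellipse(t):
    # Hypothetical ellipse enclosing the unit circle; not the one from Figure 9.7.
    return (3.0 * math.cos(t), 2.0 * math.sin(t))

def ellipse_velocity(t):
    return (-3.0 * math.sin(t), 2.0 * math.cos(t))

over_circle = line_integral(circle, circle_velocity)
over_ellipse = line_integral(ellipse, ellipse_velocity)
print(over_circle, over_ellipse, 2.0 * math.pi)
```

Both estimates should agree with $2\pi$, even though the direct integral over the ellipse is harder to evaluate by hand than the one over the circle.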
The curve $\gamma_1$ given by $g(t) = (t, t^2)$, $0 \leq t \leq 1$, is shown in Figure 9.8. Suppose that $\mathbf{F}(x, y) = (F(x, y), G(x, y))$ is a continuously differentiable vector field for $x^2 + y^2 < 4$ and satisfies Equation (4), namely $G_x(x, y) - F_y(x, y) = 0$ in the disk of radius 2. The line integral of F over $\gamma_1$ could perhaps be computed directly in the form
$$\int_{\gamma_1} \mathbf{F} \cdot d\mathbf{x} = \int_0^1 \left[F(t, t^2) + G(t, t^2)(2t)\right] dt.$$
But there are other possibilities. For example, the curve $\gamma_2$ can be parametrized by $g_2(t) = (t, t)$, $0 \leq t \leq 1$. Since we can apply Green's Theorem to the region between $\gamma_1$ and $\gamma_2$, Equation (4) implies that
$$\int_D \left(\frac{\partial G}{\partial x} - \frac{\partial F}{\partial y}\right) dx\,dy = 0,$$
and hence
$$\int_{\gamma_1} \mathbf{F} \cdot d\mathbf{x} + \int_{\gamma_2^-} \mathbf{F} \cdot d\mathbf{x} = 0.$$
Here $\gamma_2^-$ is given by $g_2^-(t) = (1 - t, 1 - t)$ for $0 \leq t \leq 1$. Then the line integrals over $\gamma_1$ and $\gamma_2$ are equal by Equation 1, and the latter integral is then
$$\int_{\gamma_2} \mathbf{F} \cdot d\mathbf{x} = \int_0^1 \left[F(t, t) + G(t, t)\right] dt.$$
FIGURE 9.8
l
(I, 0), 0:::t:::1,
g3(f) =
( I , t),
Thus
1 YI
F • dx = 1 Y2
F • dx.
The paths may even intersect at points other than their common endpoints, as indi-
cated in Figure 9.6(b); the only requirement is that we be able to apply Green's
Theorem to the region or regions bounded by the curves.
1C Physical Interpretations
Green's Theorem has two distinct but closely related physical interpretations. We assume $D$ to be a region in $\mathbb{R}^2$ whose boundary is a single counterclockwise-oriented curve $\gamma$. If $\gamma$ has a smooth parametrization $g(t) = (g_1(t), g_2(t))$, $a \leq t \leq b$, and has a nonzero tangent at each point, we can form the unit tangent and normal vectors
$$\mathbf{t}(t) = \frac{g'(t)}{|g'(t)|} = \left(\frac{g_1'(t)}{|g'(t)|}, \frac{g_2'(t)}{|g'(t)|}\right)$$
and
$$\mathbf{n}(t) = \left(\frac{g_2'(t)}{|g'(t)|}, \frac{-g_1'(t)}{|g'(t)|}\right).$$
Section 1C Green's Theorem 405
An example is shown in Figure 9.9. Note that this normal vector isn't related to
the curvature of y and doesn't necessarily have the same direction as the principal
normal to the curve, as defined in Chapter 8, Section 3.
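$$\oint_\gamma F\,dx + G\,dy = \int_a^b \mathbf{F}(g(t)) \cdot \mathbf{t}(t)\,|g'(t)|\,dt = \oint_\gamma \mathbf{F} \cdot \mathbf{t}\,ds.$$
$$\operatorname{curl}\mathbf{F}(\mathbf{x}) = \frac{\partial G}{\partial x}(\mathbf{x}) - \frac{\partial F}{\partial y}(\mathbf{x}).$$
$$\int_D \operatorname{curl}\mathbf{F}\,dA = \oint_\gamma \mathbf{F} \cdot \mathbf{t}\,ds,$$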
sometimes called Stokes's Theorem for the plane. Now interpret F as the velocity
field of a fluid flow in the plane, which means that at each point x the arrow rep-
resenting F(x) has the direction of the flow at x, with the speed of the flow there
equal to the length of the arrow. The line integral represents the circulation of the
flow around y in the counterclockwise direction. (Recall that circulation of a vec-
tor field over a smooth curve, closed or not, was defined in Chapter 8, Section 1.)
Stokes's Theorem says that this circulation is equal to the integral of curl F over D.
In particular, if curl F is identically zero in $D$, then the circulation is zero for every smooth circuit $\gamma$ contained in $D$, whether $\gamma$ is oriented counterclockwise or not. For this conclusion to hold, it's necessary that curl F be defined throughout the inside of every circuit in $D$ to which Stokes's Theorem is applied. Conversely, we can show that if the circulation is zero over every smooth circuit, then the function curl F must be identically zero. See Exercise 16 and Section 2 for an alternative approach. In Section 5, we treat the scalar curl generalized to a vector field in $\mathbb{R}^3$.
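$$\operatorname{curl}\mathbf{F}(x, y, z) = \left(\frac{\partial H}{\partial y} - \frac{\partial G}{\partial z}\right)\mathbf{i} + \left(\frac{\partial F}{\partial z} - \frac{\partial H}{\partial x}\right)\mathbf{j} + \left(\frac{\partial G}{\partial x} - \frac{\partial F}{\partial y}\right)\mathbf{k}.$$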
In the previous example we interpreted the field F as the velocity field of a fluid
flow in D. That is, the vector field F at each point of D represents the speed and
direction of the flow at that point. In this case the line integral in Stokes's Theorem
is called the circulation of F around y, and Stokes's Theorem says that circulation
of F along y is the integral of curl F over D. Thus if curl F is identically zero in
D, then the circulation is zero around every smooth closed curve with its interior
contained in D. A field F for which curl F is zero is called irrotational for this
reason.
Gauss's Theorem in the Plane. Using the divergence of a vector field introduced
in Section 4 of Chapter 8, we can rewrite Green's Theorem in another way. Instead
of applying the fundamental Equation (3) to the field F = (F, G), here we instead
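apply it to the related vector field $\mathbf{H} = (-G, F)$. If $\mathbf{t} = (a, b)$ is a unit tangent vector pointing so that the region is on the left, then the perpendicular vector $\mathbf{n} = (b, -a)$ is a unit vector that points away from the region as shown in Figure 9.9. Since $\mathbf{F} = (F, G)$, we have
$$\oint_\gamma \mathbf{H} \cdot d\mathbf{x} = \oint_\gamma \mathbf{H} \cdot \mathbf{t}\,ds = \oint_\gamma \mathbf{F} \cdot \mathbf{n}\,ds.$$
On the other hand, the area integral for Green's Theorem applied to H is
$$\int_D \left(\frac{\partial F}{\partial x} + \frac{\partial G}{\partial y}\right) dx\,dy.$$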
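We define a real-valued function div F called the divergence of F by
$$\operatorname{div}\mathbf{F}(x, y) = \frac{\partial F}{\partial x}(x, y) + \frac{\partial G}{\partial y}(x, y).$$
In terms of the divergence, Green's Theorem is
$$\int_D \operatorname{div}\mathbf{F}\,dA = \oint_\gamma \mathbf{F} \cdot \mathbf{n}\,ds.$$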
Using the fluid flow interpretation, in which F represents the velocity field of a
fluid flow, the line integral in the divergence theorem is the integral of the outward
normal coordinate F • n of F over y and gives the rate at which fluid is flowing out
of the region D bounded by y. The value of this line integral is called the total flow
rate, called flux, of F across $\gamma$ in the outward direction. Gauss's Theorem shows that the flux across $\gamma$, denoted $\Phi(\gamma)$, is equal to the integral of the divergence of F over the region bounded by $\gamma$. Thus div F(x) measures the rate of change of the density of the fluid at the point x. If div F(x) is predominantly positive in $D$, then $\Phi(\gamma)$, the outward flow, will be positive, while a negative $\Phi(\gamma)$ indicates that more
fluid is going into D than is going out. If div F is identically zero, then F is said
to represent an incompressible flow, since the flow into and out of arbitrarily small
neighborhoods of every point will be exactly balanced.
A flow in the plane determined by the vector field F(x, y) = yi+xj is incompressible
since div F(x, y) = 0. Hence
0= l div F dA = f/ dx + x dy
for every circular disk D with counterclockwise oriented boundary circle y. Indeed,
if we were to parametrize $\gamma$ by $x = x_0 + r\cos t$, $y = y_0 + r\sin t$ for $0 \leq t \leq 2\pi$ we would discover that the total flux of F across $\gamma$ is 0 for all choices of the point $(x_0, y_0)$ and radius $r$.
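The claim that the flux of $\mathbf{F}(x, y) = y\,\mathbf{i} + x\,\mathbf{j}$ across every circle vanishes can be tried out directly. The sketch below uses the parametrization just described together with the outward unit normal $\mathbf{n} = (\cos t, \sin t)$ and $ds = r\,dt$; the function name and the sample centers and radii are assumptions.

```python
import math

def flux_through_circle(x0, y0, r, n=2000):
    """Midpoint-rule estimate of the outward flux of F = (y, x) across a circle."""
    dt = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x = x0 + r * math.cos(t)
        y = y0 + r * math.sin(t)
        nx, ny = math.cos(t), math.sin(t)      # outward unit normal on the circle
        total += (y * nx + x * ny) * r * dt    # F . n ds with ds = r dt
    return total

# Arbitrary sample circles (hypothetical choices of center and radius).
for center_and_radius in ((0.0, 0.0, 1.0), (2.0, -1.0, 0.5), (-3.0, 4.0, 2.5)):
    print(center_and_radius, flux_through_circle(*center_and_radius))
```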
EXERCISES
l
value of the line integral
indicated closed path.
i
In Exercises l to 4, use Green's theorem to compute the
2
y dx +x dy, where y is the
7. (x - y) dx + (x + y) dy, where y is a triangle traced
counterclockwise and having for its three vertices (0, 0),
(1, 0), and (1 , 1)
8. Use the same integrand a•; in Exercise 7, but change the
1. The circle given by g(t) = (cost, sint), 0 :St ::S 2n path to the square with comers at (0, 0), (1 , 0), (1 , 1), and
(0, I), traced counterclockwise.
2. The square with comers at (±1 , ±1), traced counterclock-
wise 9. 1 (x 2 - y2) dx + (x 2 + /) d y, where c is the circle of
3. The square with comers at (0, 0), (1, 0), (1, 1), and radius 1 centered at the origin and traced clockwise
(0, 1), traced counterclockwise
In Exercises 7 to 10, evaluate the following line integrals 12. Let f be a real-valued function with continuous second-
by whatever method seems simplest. order derivatives in an open set D in IR2 • Let F be the
Section 2A Conservative Vector Fields 409
vector field defined in D by F(x) = VJ (x), the gradient Show that if J(xo) f:. 0 for some xo in B, then there
of J. Show that if F(x) = (F(x), G(x) ), then the equation is a disk D centered at xo such that IJ(x)I ~ 8 for
(oG/ox) - (oF/oy) = 0 is satisfied in D. some 8 > 0, and all x in D.]
13. (a) If J(x, y) = arctan(y/x) for x > 0, compute (b) Use part (a) and Stokes's Theorem to show lhat
VJ(x, y). if curl F is continuous in an open set D, and the
(b) Show that the formulas for the coordinate functions circulation of F is zero around every smooth circuit
of VJ found in part (a) define a continuous vector in D, then F is irrotational in D; that is, curl F is
field $\mathbf{F}(x, y) = (F(x, y), G(x, y))$ for all $(x, y) \neq (0, 0)$.
(c) Show that there is no function $g$ such that $\nabla g(x, y) = \mathbf{F}(x, y)$ for all $(x, y) \neq (0, 0)$. [Hint: If $g$ existed, then the line integral of $\nabla g$ would be independent of the path as long as the path avoided $(0, 0)$.]
14. (a) Consider a particle moving in a plane vertical to the surface of the earth and subject to the gravitational field $\mathbf{N}(x, y) = (0, mg)$, where $m$ is the mass of the particle and $g$ is the acceleration of gravity. Show that as the particle moves in the plane, the amount of work done is independent of the path between two points and depends only on the initial and final points. In particular, the work done in moving along a closed path is zero.
(b) Replace the field $\mathbf{N}$ by a field $\mathbf{F} = (F, G)$ satisfying $\partial G/\partial x = \partial F/\partial y$ throughout the plane. Show that the same conclusions hold.
15. Assume that the vector field $\mathbf{F} = (F, G)$ is a gradient field, that is, $\mathbf{F} = \nabla f$ for some real-valued $f$. Show that Green's Formula can be written in the form
$$\int_D \Delta f \, dA = \oint_\gamma \nabla f \cdot \mathbf{n} \, ds,$$
where $\Delta f = (\partial^2 f/\partial x^2) + (\partial^2 f/\partial y^2)$, the Laplacian of $f$.
16. (a) Show that if $f(x, y)$ is a continuous real-valued function defined in an open set $B$ of $\mathbb{R}^2$, and $\int_D f(x, y)\,dx\,dy = 0$ for every circular disk $D$ in $B$, then $f(x, y)$ is identically zero in $B$. [Hint: ...
... identically zero in $D$.
(c) Use part (a) and Gauss's Theorem to show that if $\mathbf{F}$ is continuous in $D$ and the flux $\Phi(\gamma) = 0$ for every smooth circuit $\gamma$ in $D$, then $\mathbf{F}$ is incompressible.
17. Define
$$\mathbf{F}(x, y) = \frac{-y}{x^2 + y^2}\,\mathbf{i} + \frac{x}{x^2 + y^2}\,\mathbf{j}$$
for $(x, y) \neq (0, 0)$.
(a) Show that div $\mathbf{F}$ is identically zero. What implication does this have for areas of regions under the influence of the flow generated by $\mathbf{F}$?
(b) Show that curl $\mathbf{F}$ is identically zero. What implication does this fact have for the circulation of $\mathbf{F}$ around circular paths that don't go around the origin?
(c) What is the circulation of $\mathbf{F}$ along a counterclockwise-oriented circle of radius $a$ centered at the origin? Does this result contradict part (b)? Explain your answer.
The equations curl $\mathbf{F} = 0$ and div $\mathbf{F} = 0$ occur in complex variable theory in a slightly different form as the Cauchy-Riemann equations. In Exercises 18 to 21 show that if $u(x, y)$ and $v(x, y)$ are the real and imaginary parts, respectively, of the following complex-valued functions, then the vector field given by $\mathbf{F}(x, y) = (u(x, y), -v(x, y))$ is irrotational and incompressible.
18. $(x + iy)^2$
19. $(x + iy)^3$
20. $e^{x+iy}$
21. $\tfrac{1}{2}\ln(x^2 + y^2) + i \arctan(y/x)$, $x > 0$
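$$\int_\gamma \mathbf{F} \cdot d\mathbf{x}$$
is independent of the piecewise smooth path $\gamma$ from $\mathbf{x}_0$ to $\mathbf{x}$ in $D$, then the real-valued function defined by
$$f(\mathbf{x}) = \int_{\mathbf{x}_0}^{\mathbf{x}} \mathbf{F} \cdot d\mathbf{x}$$
[...]
FIGURE 9.10
$$f(\mathbf{x} + t\mathbf{u}) - f(\mathbf{x}) = \int_{\mathbf{x}}^{\mathbf{x} + t\mathbf{u}} \mathbf{F} \cdot d\mathbf{x} = \int_0^t \mathbf{F}(\mathbf{x} + v\mathbf{u}) \cdot \mathbf{u}\, dv.$$
In the result of this computation we let $\mathbf{u} = \mathbf{e}_j$, the $j$th standard basis vector in $\mathbb{R}^n$. Then
$$\frac{\partial f}{\partial x_j}(\mathbf{x}) = \lim_{t \to 0} \frac{1}{t} \int_0^t \mathbf{F}(\mathbf{x} + v\mathbf{e}_j) \cdot \mathbf{e}_j \, dv.$$
Since the integral in this last limit is zero when $t = 0$, the limit is the derivative with respect to $t$ of the integral, evaluated at $t = 0$. By the fundamental theorem of calculus, this is just the integrand evaluated at $v = 0$, so
$$\frac{\partial f}{\partial x_j}(\mathbf{x}) = \mathbf{F}(\mathbf{x}) \cdot \mathbf{e}_j = F_j(\mathbf{x}),$$
where $F_j$ is the $j$th coordinate function of $\mathbf{F}$. Since $\mathbf{F}$ was assumed continuous, so are the partial derivatives $\partial f/\partial x_j$; therefore $f$ is continuously differentiable on $D$. Finally, the equations $(\partial f/\partial x_j)(\mathbf{x}) = F_j(\mathbf{x})$, $j = 1, \ldots, n$, taken all together mean that $\nabla f = \mathbf{F}$ in $D$.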
Section 2A Conservative Vector Fields 411
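If the particle follows a particular path given by $\mathbf{x}(t) = g(t)$, then the velocity and acceleration vectors are $\mathbf{v}(t) = g'(t)$ and $\mathbf{a}(t) = g''(t)$, and we have $\mathbf{F}(g(t)) = m\mathbf{a}(t)$, where $m$ is the mass of the particle. We write $v = |\mathbf{v}|$, so that $v^2 = \mathbf{v} \cdot \mathbf{v}$. Hence if $\mathbf{x}_1 = g(t_1)$ and $\mathbf{x}_2 = g(t_2)$, then
$$W(\mathbf{x}_1, \mathbf{x}_2) = \int_{t_1}^{t_2} m\,\mathbf{a}(t) \cdot \mathbf{v}(t)\, dt.$$
But since $\mathbf{a}(t) = \dot{\mathbf{v}}(t)$, and $(d/dt)v^2(t) = 2\mathbf{v}(t) \cdot \dot{\mathbf{v}}(t)$, we have
$$W(\mathbf{x}_1, \mathbf{x}_2) = \frac{m}{2} \int_{t_1}^{t_2} \frac{d}{dt}\left[v^2(t)\right] dt = \frac{m}{2}\left(v^2(t_2) - v^2(t_1)\right). \qquad (1)$$
The function $T(t) = (m/2)v^2(t)$ is called the kinetic energy of the particle at time $t$.
On the other hand, if we fix a point $\mathbf{x}_0$ in $D$, then by Theorem 2.1, the equation
$$U(\mathbf{x}) = -\int_{\mathbf{x}_0}^{\mathbf{x}} \mathbf{F} \cdot d\mathbf{x}$$
defines a real-valued function $U$ on $D$ with $\nabla U = -\mathbf{F}$, and
$$W(\mathbf{x}_1, \mathbf{x}_2) = \int_{\mathbf{x}_1}^{\mathbf{x}_2} \mathbf{F} \cdot d\mathbf{x} = \int_{\mathbf{x}_0}^{\mathbf{x}_2} \mathbf{F} \cdot d\mathbf{x} - \int_{\mathbf{x}_0}^{\mathbf{x}_1} \mathbf{F} \cdot d\mathbf{x} = -U(\mathbf{x}_2) + U(\mathbf{x}_1). \qquad (2)$$
Comparison of Equations (1) and (2) shows that
$$U(\mathbf{x}_1) + T(t_1) = U(\mathbf{x}_2) + T(t_2).$$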
412 Chapter 9 Vector Field Theory
In other words, along the path traced by g(t), the sum U(g(t)) + T(t) is a constant,
independent of $t$, called the total energy of the particle. For this reason, the function $U(\mathbf{x})$, which is a function of position in $D$, is called the potential energy of the field $\mathbf{F}$. Thus the potential energy is minus the field potential. Notice that there is an arbitrary choice made in defining the potential in that the point $\mathbf{x}_0$ was picked to have zero potential. The choice of some other point $\mathbf{x}_1$ would change the function $U$ by at most an additive constant equal to $W(\mathbf{x}_0, \mathbf{x}_1)$. It is the constant total energy that's "conserved" and that gives rise to the term conservative field.
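The conservation statement can be checked numerically. The following is only an illustrative sketch, not taken from the text: it assumes the planar field F(x) = -x acting on a particle of mass 1, for which the path g(t) = (cos t, 2 sin t) satisfies F(g(t)) = g''(t), and it takes x0 to be the origin, so that U(x) = |x|^2 / 2. All function names are hypothetical.

```python
import math

def position(t):
    # Assumed path g(t) = (cos t, 2 sin t); it satisfies g''(t) = -g(t) = F(g(t)).
    return (math.cos(t), 2.0 * math.sin(t))

def velocity(t):
    # v(t) = g'(t)
    return (-math.sin(t), 2.0 * math.cos(t))

def potential_energy(x):
    # U(x) = -(line integral of F from the origin to x) = |x|^2 / 2 for F(x) = -x.
    return 0.5 * (x[0] ** 2 + x[1] ** 2)

def kinetic_energy(v, m=1.0):
    # T = (m/2) v^2
    return 0.5 * m * (v[0] ** 2 + v[1] ** 2)

def total_energy(t):
    return potential_energy(position(t)) + kinetic_energy(velocity(t))

# Sample the total energy along the path; it should not vary with t.
energies = [total_energy(0.25 * k) for k in range(40)]
spread = max(energies) - min(energies)
```

Here `spread` measures how far U + T strays from a constant along the sampled path; for this assumed field it vanishes up to rounding error.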
2B Path Independence
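For a vector field $\mathbf{F}$ defined in a region $D$ of $\mathbb{R}^n$, independence of path in the line integral $\int \mathbf{F} \cdot d\mathbf{x}$ means that
2.2
$$\int_{\gamma[\mathbf{x}_1, \mathbf{x}_2]} \mathbf{F} \cdot d\mathbf{x} = \int_{\delta[\mathbf{x}_1, \mathbf{x}_2]} \mathbf{F} \cdot d\mathbf{x},$$
where $\gamma[\mathbf{x}_1, \mathbf{x}_2]$ and $\delta[\mathbf{x}_1, \mathbf{x}_2]$ are any two piecewise smooth curves in $D$ having initial point $\mathbf{x}_1$ and terminal point $\mathbf{x}_2$. An alternative formulation of the independence property is that
2.3
$$\oint_\gamma \mathbf{F} \cdot d\mathbf{x} = 0$$
for every piecewise smooth closed curve $\gamma$ lying in $D$. The equivalence of the two properties follows from the observations that $\gamma[\mathbf{x}_1, \mathbf{x}_2]$ followed by $\delta[\mathbf{x}_1, \mathbf{x}_2]$ in reverse direction is a closed path, and that a closed path may be regarded as two paths joining $\mathbf{x}_1$ and $\mathbf{x}_2$, but traced in opposite directions.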
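Path independence is easy to observe numerically for a field known to be a gradient. The sketch below is an illustration under stated assumptions: the field F = (2xy, x^2), the gradient of f(x, y) = x^2 y, and the two paths from (0, 0) to (1, 1) are choices made here, not examples from the text, and the midpoint rule stands in for exact integration.

```python
def line_integral(F, path, dpath, n=2000):
    # Midpoint-rule approximation of the integral of F . dx over a path
    # parametrized on 0 <= t <= 1, with dpath giving the derivative of path.
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y = path(t)
        dx, dy = dpath(t)
        Fx, Fy = F(x, y)
        total += (Fx * dx + Fy * dy) * h
    return total

def F(x, y):
    # Assumed gradient field: F = grad(x^2 * y).
    return (2.0 * x * y, x * x)

# Two different paths from (0, 0) to (1, 1): a straight segment and a parabola.
straight = line_integral(F, lambda t: (t, t), lambda t: (1.0, 1.0))
parabola = line_integral(F, lambda t: (t, t * t), lambda t: (1.0, 2.0 * t))
```

Both values approximate f(1, 1) - f(0, 0) = 1, as Equation 2.2 predicts for a gradient field.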
The following theorem is a formal summary of three equivalent characteristics of
gradient fields, the first of which is just our original definition.
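(a), we have
$$\int_p \mathbf{F} \cdot d\mathbf{x} = \int_{q^-} \mathbf{F} \cdot d\mathbf{x}. \quad \text{Hence} \quad \int_p \mathbf{F} \cdot d\mathbf{x} - \int_{q^-} \mathbf{F} \cdot d\mathbf{x} = 0.$$
Thus we obtain (b) from
$$\oint_\gamma \mathbf{F} \cdot d\mathbf{x} = \int_p \mathbf{F} \cdot d\mathbf{x} + \int_q \mathbf{F} \cdot d\mathbf{x} = \int_p \mathbf{F} \cdot d\mathbf{x} - \int_{q^-} \mathbf{F} \cdot d\mathbf{x} = 0.$$
To see that (b) implies (a), we reverse the previous argument, letting $r$ and $s$ be two given piecewise smooth paths from $\mathbf{x}_1$ to $\mathbf{x}_2$. Then $r$ together with the reversed curve $s^-$ make up a closed path $\gamma$ over which $\mathbf{F}$ has integral zero. It follows that
$$\int_r \mathbf{F} \cdot d\mathbf{x} + \int_{s^-} \mathbf{F} \cdot d\mathbf{x} = 0, \quad \text{so} \quad \int_r \mathbf{F} \cdot d\mathbf{x} - \int_s \mathbf{F} \cdot d\mathbf{x} = 0.$$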
Finally, Theorem 1.3 of Chapter 8, Section 1 states that (c) implies (a), while
Theorem 2.1 of the present section states that (a) implies (c). It then follows from
the previous implications that (b) and (c) imply each other. •
2C Derivative Criterion
A more intrinsic criterion for deciding whether a continuous vector field is a gradient
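field arises as follows. Suppose first that $\mathbf{F}: \mathbb{R}^2 \to \mathbb{R}^2$ is continuous on an open set $D$, and that $\mathbf{F}$ is a gradient field, that is, there is a real-valued function $f$ defined on $D$ such that $\nabla f = \mathbf{F}$. In terms of coordinate functions $F_1$ and $F_2$ of $\mathbf{F}$, this means that
$$\frac{\partial f}{\partial x_1} = F_1 \quad \text{and} \quad \frac{\partial f}{\partial x_2} = F_2.$$
If $\mathbf{F}$ is continuously differentiable, then
$$\frac{\partial^2 f}{\partial x_2 \partial x_1} = \frac{\partial F_1}{\partial x_2} \quad \text{and} \quad \frac{\partial^2 f}{\partial x_1 \partial x_2} = \frac{\partial F_2}{\partial x_1},$$
so equality of the mixed partials gives
$$\frac{\partial F_1}{\partial x_2} = \frac{\partial F_2}{\partial x_1} \qquad (3)$$
throughout $D$. By the definition of curl $\mathbf{F}$, Equation (3) says curl $\mathbf{F} = 0$. This equation has an extension to higher dimensions: We consider a more general vector field $\mathbf{F}: \mathbb{R}^n \to \mathbb{R}^n$, which we assume continuously differentiable in an open subset $D$ of $\mathbb{R}^n$. If $\mathbf{F}$ is a gradient field, there is an $f$ such that $\nabla f = \mathbf{F}$, or, in terms of coordinate functions
$$\frac{\partial f}{\partial x_j} = F_j, \quad j = 1, \ldots, n.$$
Equality of the mixed partials of $f$ then gives
$$\frac{\partial F_i}{\partial x_j} = \frac{\partial F_j}{\partial x_i}, \quad i, j = 1, \ldots, n. \qquad (4)$$
The functions $\partial F_i/\partial x_j$ are the entries in the $n$-by-$n$ Jacobian matrix $\mathbf{F}'$ of $\mathbf{F}: \mathbb{R}^n \to \mathbb{R}^n$, and Equation (4) expresses its symmetry, which means that $\mathbf{F}'$ equals its transpose, that is, its reflection across its main diagonal.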
$$\mathbf{F}(x, y) = \left( \frac{-y}{x^2 + y^2}, \frac{x}{x^2 + y^2} \right)$$
is defined for all $(x, y) \neq (0, 0)$. You can check that $\partial F_1/\partial y = \partial F_2/\partial x$, but there is no continuously differentiable $f$ such that $\nabla f(x, y) = \mathbf{F}(x, y)$ for all $(x, y) \neq (0, 0)$. The underlying reason is that for $x > 0$ the function $f(x, y) = \arctan(y/x)$ satisfies $\nabla f = \mathbf{F}$, but this $f$ cannot be extended to be a single-valued solution of the equation in the entire plane with the origin deleted. (See Exercise 13 of the previous section.) If $f(x, y)$ could be so extended to a function $g(x, y)$, we would have
$$\oint_c \mathbf{F} \cdot d\mathbf{x} = \oint_c \nabla g \cdot d\mathbf{x} = 0,$$
where $c$ lies on $x^2 + y^2 = 1$, starting and ending at $(1, 0)$. This is impossible, since explicit calculation along $c$, traced once counterclockwise with $(x, y) = (\cos t, \sin t)$, shows
$$\oint_c \mathbf{F} \cdot d\mathbf{x} = \int_0^{2\pi} (\sin^2 t + \cos^2 t)\, dt = 2\pi.$$
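Both claims about this field can be checked numerically. The sketch below is illustrative only: the helper names, the sample point, and the finite-difference step are assumptions made here. It compares the two mixed partial derivatives at one point and approximates the circulation around the unit circle with the midpoint rule.

```python
import math

def F(x, y):
    # The field (-y, x) / (x^2 + y^2), defined away from the origin.
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def dF1_dy(x, y, h=1e-5):
    # Central-difference approximation of the partial of F1 with respect to y.
    return (F(x, y + h)[0] - F(x, y - h)[0]) / (2.0 * h)

def dF2_dx(x, y, h=1e-5):
    # Central-difference approximation of the partial of F2 with respect to x.
    return (F(x + h, y)[1] - F(x - h, y)[1]) / (2.0 * h)

def circulation(radius=1.0, n=1000):
    # Midpoint-rule line integral of F around a counterclockwise circle
    # centered at the origin, parametrized by (r cos t, r sin t).
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y = radius * math.cos(t), radius * math.sin(t)
        dx, dy = -radius * math.sin(t), radius * math.cos(t)
        Fx, Fy = F(x, y)
        total += (Fx * dx + Fy * dy) * h
    return total

symmetry_gap = abs(dF1_dy(0.7, -0.4) - dF2_dx(0.7, -0.4))
unit_circle_value = circulation()
```

The symmetry condition holds at the sample point, yet the circulation is 2π rather than 0, which is exactly the obstruction described above.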
Proof. Pick a fixed point $\mathbf{x}_0$ in $R$ and let $\mathbf{x}$ be any other point of $R$. We consider paths from $\mathbf{x}_0$ to $\mathbf{x}$, each consisting of a sequence of line segments parallel to the axes and such that each coordinate variable varies on at most one such segment. Three-dimensional examples are shown in Figure 9.11. The reason for looking at such paths is to be able to approach $\mathbf{x}$ from any coordinate direction for the purpose of taking partial derivatives at $\mathbf{x}$. Choosing one of these paths, call it $\gamma_{\mathbf{x}}$, define a real-valued function $f$ by
$$f(\mathbf{x}) = \int_{\gamma_{\mathbf{x}}} \mathbf{F} \cdot d\mathbf{x}.$$
FIGURE 9.11
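We apply Green's Theorem to the 2-dimensional rectangle $R_{ij}$ bounded by $\delta$ and get
$$\oint_\delta \mathbf{F} \cdot d\mathbf{x} = \int_{R_{ij}} \left( \frac{\partial F_j}{\partial x_i} - \frac{\partial F_i}{\partial x_j} \right) dx_i\, dx_j = 0,$$
[...]
$$\oint_\delta \mathbf{F} \cdot d\mathbf{x} = \int_{(\delta_i, \delta_j)} \mathbf{F} \cdot d\mathbf{x} - \int_{(\delta_j, \delta_i)} \mathbf{F} \cdot d\mathbf{x} = 0,$$
and so the change of path leaves the value of the integral invariant. •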
Once it has been established that $\mathbf{x}$ can be approached along a path of integration that varies only in an arbitrary coordinate, say the $k$th, we have, as in the proof of Theorem 2.1, the equation $(\partial f/\partial x_k)(\mathbf{x}) = F_k(\mathbf{x})$, for all $k$. Thus $\nabla f(\mathbf{x}) = \mathbf{F}(\mathbf{x})$ for all $\mathbf{x}$ in $R$.
$$\mathbf{F}(x, y) = \left( \frac{-y}{x^2 + y^2}, \frac{x}{x^2 + y^2} \right), \quad (x, y) \neq (0, 0),$$
where the path of integration is any piecewise smooth curve from $(1, 0)$ to $(x, y)$. A polygonal path from $(1, 0)$ to $(x, 0)$ and from $(x, 0)$ to $(x, y)$ is shown in Figure 9.12. On the first segment, the entire integral is zero because $y$ is identically zero, and on the second segment, with $x$ constant, the integral reduces to
$$\int_0^y \frac{x}{x^2 + y^2}\, dy = \arctan\left(\frac{y}{x}\right).$$
The most general potential of $\mathbf{F}$ in the right half-plane differs from this one by at most a constant. (Why?) The general solution of $\nabla f = \mathbf{F}$ in the half-plane is therefore
$$f(x, y) = \arctan\frac{y}{x} + C.$$
FIGURE 9.12 (polygonal path from $(1, 0)$ to $(x, 0)$ to $(x, y)$)
Like the field in Example 3, this one satisfies the hypotheses of Theorem 2.6, so we conclude from the theorem that there is a potential function $f$ such that $\nabla f = \mathbf{F}$ on any half-plane bounded by a coordinate axis. But in contrast to Example 3, for this field there is a continuous potential function defined everywhere in $\mathbb{R}^2$ except at the origin. This information can't be obtained simply by verifying the symmetry condition, but requires checking one of the conditions of Theorem 2.4. To actually find a potential we calculate the line integral of $\mathbf{F}(x, y)$ from $(1, 0)$ to $(x, y)$ along the polygonal path from $(1, 0)$ to $(x, 0)$ and then from $(x, 0)$ to $(x, y)$ as in Example 3. After some cancellation the result is
$$f(x, y) = \int_1^x \frac{dx}{x} + \int_0^y \frac{y\, dy}{x^2 + y^2} = \tfrac{1}{2} \ln(x^2 + y^2), \quad (x, y) \neq (0, 0).$$
Thus we have a potential function $f(x, y)$ for the vector field $\mathbf{F}(x, y)$, usually called the logarithmic potential, valid everywhere in the plane except at the origin.
2D Indefinite Integration
Given a vector field $\mathbf{F}: \mathbb{R}^n \to \mathbb{R}^n$, finding a real-valued function $f$ such that $\nabla f = \mathbf{F}$ amounts to solving for the function $f$ in the system of partial differential equations
$$\frac{\partial f}{\partial x_1} = F_1, \quad \frac{\partial f}{\partial x_2} = F_2, \quad \ldots, \quad \frac{\partial f}{\partial x_n} = F_n,$$
where $F_1, F_2, \ldots, F_n$ are the given coordinate functions of $\mathbf{F}$. It's sometimes simpler to avoid working with definite integrals along explicit paths of integration as in the previous example and instead use indefinite integrals. Assuming equality of mixed partials, the consistency conditions $\partial F_i/\partial x_j = \partial F_j/\partial x_i$ must be satisfied, so it usually makes sense to verify them before proceeding further; failure of even one of these equations means that there is no solution $f$ to the system of equations.
Suppose $\mathbf{F}(x, y) = (y^2, 2xy + 1)$, and we're looking for a function $f: \mathbb{R}^2 \to \mathbb{R}$ such that $f_x(x, y) = y^2$ and $f_y(x, y) = 2xy + 1$. Since $\partial(y^2)/\partial y = \partial(2xy + 1)/\partial x = 2y$ for all $(x, y)$, Theorem 2.6 guarantees that the desired function is defined for all $(x, y)$. To find $f$, start for example with $f_x(x, y) = y^2$ and integrate with respect to $x$ while holding $y$ fixed. We get
$$\int f_x(x, y)\, dx = f(x, y) = \int y^2\, dx = xy^2 + C(y).$$
It's important in principle, and often in practice, to allow the "constant" of integration $C(y)$ to depend on the temporarily fixed, but arbitrary, value $y$. Now apply $\partial/\partial y$ to this partly determined expression for $f(x, y)$ and compare the result with the given expression $f_y(x, y)$. We find
$$f_y(x, y) = 2xy + C'(y) = 2xy + 1,$$
so $C'(y) = 1$ and $C(y) = y + c$ for some constant $c$. Thus $f(x, y) = xy^2 + y + c$.
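As a cross-check on the indefinite-integration method, a potential for the same field can also be computed by a definite line integral along a polygonal path, as in the earlier examples. The sketch below is illustrative: the function name, the choice of base point (0, 0), and the midpoint rule are assumptions made here. With that base point the result should agree with xy^2 + y, the potential that vanishes at the origin.

```python
def potential_along_polygon(F, x, y, n=2000):
    # Midpoint-rule line integral of F from (0, 0) to (x, 0) and then
    # from (x, 0) to (x, y); only F1 matters on the first leg and F2 on the second.
    total = 0.0
    step_x = x / n
    for k in range(n):
        s = (k + 0.5) * step_x
        total += F(s, 0.0)[0] * step_x
    step_y = y / n
    for k in range(n):
        s = (k + 0.5) * step_y
        total += F(x, s)[1] * step_y
    return total

def F(x, y):
    # The field of the example: F = (y^2, 2xy + 1).
    return (y * y, 2.0 * x * y + 1.0)

value = potential_along_polygon(F, 2.0, 3.0)
closed_form = 2.0 * 3.0 ** 2 + 3.0  # x*y^2 + y at (2, 3)
```

Any other base point would shift the computed potential by a constant, which corresponds to the arbitrary constant of integration.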
Let
$$\mathbf{F}(x, y, z) = (F_1(x, y, z), F_2(x, y, z), F_3(x, y, z)) = e^{y+z}(y, x(y + 1), xy).$$
We want to solve
$$f_x(x, y, z) = y e^{y+z}, \quad f_y(x, y, z) = x(y + 1)e^{y+z}, \quad f_z(x, y, z) = xy e^{y+z},$$
that is, find $f$ such that $\nabla f = (F_1, F_2, F_3)$. The three consistency conditions, for example, $\partial F_1/\partial y = (y + 1)e^{y+z} = \partial F_2/\partial x$, all hold, so we go ahead and integrate, choosing to start with the first equation $f_x(x, y, z) = y e^{y+z}$. We find
$$f(x, y, z) = xy e^{y+z} + C(y, z).$$
The constant of integration may depend on the two variables not involved in the integration. Now apply $\partial/\partial z$ to this last expression for $f$ to get $f_z(x, y, z) = xy e^{y+z} + C_z(y, z)$. The third equation, $f_z(x, y, z) = xy e^{y+z}$, of our given system shows by comparison that $C_z(y, z) = 0$. This says $C(y, z) = C(y)$ is independent of $z$, so $f(x, y, z) = xy e^{y+z} + C(y)$. Now compute $f_y(x, y, z) = x(y + 1)e^{y+z} + C'(y)$; comparison with the second equation of the system shows that $C'(y) = 0$, so $C(y)$ is constant. Thus $f(x, y, z) = xy e^{y+z} + c$.
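A quick numerical check of this result compares a finite-difference gradient of the computed potential with the given field. This is a sketch only: the helper names, the sample point, and the step size are assumptions made here, and the constant c is taken to be 0.

```python
import math

def f(x, y, z):
    # Candidate potential found by indefinite integration (with c = 0).
    return x * y * math.exp(y + z)

def F(x, y, z):
    # The given field e^(y+z) * (y, x(y + 1), xy).
    e = math.exp(y + z)
    return (y * e, x * (y + 1.0) * e, x * y * e)

def numerical_gradient(func, p, h=1e-6):
    # Central-difference approximation of the gradient of func at the point p.
    grad = []
    for i in range(3):
        hi = list(p)
        lo = list(p)
        hi[i] += h
        lo[i] -= h
        grad.append((func(*hi) - func(*lo)) / (2.0 * h))
    return tuple(grad)

point = (0.5, -0.3, 0.2)
approx = numerical_gradient(f, point)
exact = F(*point)
max_error = max(abs(a - b) for a, b in zip(approx, exact))
```

A small `max_error` at several sample points is good evidence, though not a proof, that the integration steps were carried out correctly.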
EXERCISES
1. Consider the approximation to the earth's gravitational field acting on a particle of mass 1 represented by the vector field $\mathbf{F}(x, y, z) = (0, 0, -g)$.
(a) Find for $\mathbf{F}$ the potential energy function $U(x, y, z)$ that is zero when $(x, y, z) = (0, 0, 0)$.
(b) If a particle of mass 1 has at $(0, 0, 0)$ a velocity vector $(v_1, v_2, v_3)$ with $v_3 > 0$, and no force but $\mathbf{F}$ acts on the particle, find the path of the particle.
(c) Verify that the sum of potential energy and kinetic energy remains constant for the path of part (b).
2. Show that if $\mathbf{F}$ and $\mathbf{G}$ are gradient fields defined on the same domain $D$, then $\mathbf{F} + \mathbf{G}$ and $c\mathbf{F}$ are gradient fields, where $c$ is a constant.
[...]
9. Consider the vector field defined in $\mathbb{R}^3$, with the $z$-axis deleted, by
$$\mathbf{F}(x, y, z) = \left( \frac{-y}{x^2 + y^2}, \frac{x}{x^2 + y^2}, 0 \right).$$
Is $\mathbf{F}$ a gradient field?
In Exercises 10 to 13, find a field potential for the given field.
10. $\mathbf{F}(x, y, z) = (2xy, x^2 + z^2, 2yz)$
11. $\mathbf{G}(x, y) = (y \cos xy, x \cos xy)$
12. $\mathbf{H}(x, y) = \left( \dfrac{-y}{x^2 + y^2}, \dfrac{x}{x^2 + y^2} \right)$, $(x, y) \neq (0, 0)$
$$g(u, v) = \begin{pmatrix} g_1(u, v) \\ g_2(u, v) \\ g_3(u, v) \end{pmatrix}, \qquad (1)$$
with $\mathbf{u} = (u, v)$ in the interior of some set $D$ in $\mathbb{R}^2$, which we assume bounded by finitely many smooth curves. We further assume that, at each point $g(u, v)$ of $S$, the tangent vectors defined by the vector partial derivatives
$$\frac{\partial g}{\partial u}(u, v), \quad \frac{\partial g}{\partial v}(u, v)$$
determine a 2-dimensional tangent plane to $S$; in other words, that the two tangents are linearly independent. If $S$ satisfies all these conditions, we'll refer to it as a piece of smooth surface.
FIGURE 9.13, parts (a) and (b)
$$\left| \frac{\partial g}{\partial u}(u, v) \times \frac{\partial g}{\partial v}(u, v) \right|$$
represents the area of the outlined tangent parallelogram shown in Figure 9.13(b). If we think of scaling down such parallelograms by factors $du$ and $dv$ at the points $g(u_k, v_k)$ corresponding to the corner points $(u_k, v_k)$ of a grid over $D$, then it's natural to define the area of $S$ by integrating the parallelogram area over $D$:
$$\sigma(S) = \int_D \left| \frac{\partial g}{\partial u}(u, v) \times \frac{\partial g}{\partial v}(u, v) \right| du\, dv.$$
[...]
3.2
$$\int_S \mu\, d\sigma = \int_D \mu(g(u, v)) \left| \frac{\partial g}{\partial u}(u, v) \times \frac{\partial g}{\partial v}(u, v) \right| du\, dv$$
exists. If $\mu(\mathbf{x}) \geq 0$, then Equation 3.2 defines the total mass due to the density $\mu$.
Section 3B Surface Integrals 421
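FIGURE 9.14, parts (a) and (b)
$$g(u, v) = \begin{pmatrix} u \\ v \\ u^2 + v^2 \end{pmatrix}, \quad u^2 + v^2 \leq a^2; \qquad \frac{\partial g}{\partial u}(u, v) = \begin{pmatrix} 1 \\ 0 \\ 2u \end{pmatrix}, \quad \frac{\partial g}{\partial v}(u, v) = \begin{pmatrix} 0 \\ 1 \\ 2v \end{pmatrix}.$$
We have
$$\frac{\partial g}{\partial u}(u, v) \times \frac{\partial g}{\partial v}(u, v) = (-2u, -2v, 1).$$
[...]
The surface in the previous example can be thought of as a piece of the graph of an equation $z = f(x, y)$, where in the example $f(x, y) = x^2 + y^2$, with the domain of integration the disk $x^2 + y^2 \leq a^2$. In general, such a graph can be parametrized by a function $g: \mathbb{R}^2 \to \mathbb{R}^3$ of the form
$$g(x, y) = \begin{pmatrix} x \\ y \\ f(x, y) \end{pmatrix}, \quad (x, y) \text{ in } D.$$
Then $g_x(x, y) \times g_y(x, y) = (-f_x(x, y), -f_y(x, y), 1)$ so the area differential becomes
3.3
$$d\sigma = \sqrt{1 + f_x^2(x, y) + f_y^2(x, y)}\; dx\, dy.$$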
EXAMPLE 2. To compute the area of a sphere $S_a$ of radius $a$ using Equation 3.3, we start with the hemispherical graph $H_a$ of $z = \sqrt{a^2 - x^2 - y^2}$ over the disk $x^2 + y^2 < a^2$. See Figure 9.14(b). Then we have $f_x(x, y) = -x(a^2 - x^2 - y^2)^{-1/2}$ and $f_y(x, y) = -y(a^2 - x^2 - y^2)^{-1/2}$, so Equation 3.3 becomes
$$d\sigma = \sqrt{1 + \frac{x^2 + y^2}{a^2 - x^2 - y^2}}\; dx\, dy = \frac{a}{\sqrt{a^2 - x^2 - y^2}}\; dx\, dy, \qquad \sigma(H_a) = \int_{x^2 + y^2 < a^2} \frac{a}{\sqrt{a^2 - x^2 - y^2}}\; dx\, dy.$$
This is an improper integral, because the integrand tends to infinity as $(x, y)$ tends from within the disk to an arbitrary point on the boundary. To compute the integral, integrate first over a smaller disk of radius $b$, change to polar coordinates, and then let $b \to a$. We get
$$\sigma(H_a) = \lim_{b \to a} \int_{x^2 + y^2 < b^2} \frac{a}{\sqrt{a^2 - x^2 - y^2}}\; dx\, dy = \lim_{b \to a} a \int_0^{2\pi} d\theta \int_0^b \frac{r}{\sqrt{a^2 - r^2}}\; dr$$
$$= 2\pi a \lim_{b \to a} \left[ -(a^2 - r^2)^{1/2} \right]_0^b = 2\pi a \lim_{b \to a} \left[ -(a^2 - b^2)^{1/2} + a \right] = 2\pi a^2.$$
Hence $\sigma(S_a) = 2\sigma(H_a) = 4\pi a^2$, the formula for the area of a sphere of radius $a$.
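The limiting process in this example can be watched numerically. The sketch below is illustrative only: the function names and the sample values of b are assumptions made here. It approximates the polar-coordinate integral over the smaller disk with the midpoint rule and compares it with the antiderivative value 2πa(a - sqrt(a^2 - b^2)), which approaches 2πa^2 as b approaches a.

```python
import math

def cap_area_numeric(a, b, n=20000):
    # Midpoint rule for 2*pi*a * (integral of r / sqrt(a^2 - r^2) for 0 <= r <= b),
    # the area of the part of the hemisphere lying over the disk of radius b < a.
    h = b / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        total += r / math.sqrt(a * a - r * r) * h
    return 2.0 * math.pi * a * total

def cap_area_exact(a, b):
    # Value obtained from the antiderivative -(a^2 - r^2)^(1/2).
    return 2.0 * math.pi * a * (a - math.sqrt(a * a - b * b))

a = 1.0
numeric = cap_area_numeric(a, 0.9)
exact = cap_area_exact(a, 0.9)
# The exact cap areas increase toward the hemisphere area 2*pi*a^2 as b -> a.
approach = [cap_area_exact(a, b) for b in (0.9, 0.99, 0.999, 0.9999)]
```

The list `approach` increases toward 2π for a = 1, in line with the limit computed above.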
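EXAMPLE 3. We parametrize a complete turn of a helicoid surface of width $a$ by
$$g(u, v) = \begin{pmatrix} u \cos v \\ u \sin v \\ v \end{pmatrix}, \quad 0 \leq u \leq a, \quad 0 \leq v \leq 2\pi.$$
[...]
$$= 2\pi \left[ \tfrac{1}{3}(1 + u^2)^{3/2} \right]_0^a = (2\pi/3)\left[ (1 + a^2)^{3/2} - 1 \right].$$
[...]
$$\mathbf{n} = \frac{\dfrac{\partial g}{\partial u} \times \dfrac{\partial g}{\partial v}}{\left| \dfrac{\partial g}{\partial u} \times \dfrac{\partial g}{\partial v} \right|},$$
it follows that
$$\mathbf{F}(g(u, v)) \cdot \left( \frac{\partial g}{\partial u}(u, v) \times \frac{\partial g}{\partial v}(u, v) \right)$$
is equal to the coordinate of $\mathbf{F}(g(u, v))$ in the direction of $\mathbf{n}$, multiplied by the area of the tangent parallelogram spanned by $\partial g/\partial u$ and $\partial g/\partial v$ at $g(u, v)$. We define the surface integral of $\mathbf{F}$ over $S$ by
3.4
$$\int_D \mathbf{F}(g(u, v)) \cdot \left( \frac{\partial g}{\partial u}(u, v) \times \frac{\partial g}{\partial v}(u, v) \right) du\, dv,$$
and denote it by $\int_S \mathbf{F} \cdot d\mathbf{S}$ or $\int_S \mathbf{F} \cdot \mathbf{n}\, d\sigma$.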
Suppose that a continuous vector field $\mathbf{F}: \mathbb{R}^3 \to \mathbb{R}^3$ describes the speed and direction of a fluid flow at each point of a region $R$ in which it's defined. We'll define, using a surface integral, the flux, or rate of flow across a piece of smooth surface $S$, lying in the region $R$. If $S$ is perfectly flat and $\mathbf{F}$ is a constant field, then the flux is equal to $F_n \sigma(S)$, where $F_n$ is the coordinate $\mathbf{F} \cdot \mathbf{n}$ of $\mathbf{F}$ in the direction of a unit normal $\mathbf{n}$ to $S$. Thus, for a flat $S$, the flux is equal to the volume of the tube of fluid illustrated in Figure 9.16, which shows the amount of fluid passing through its base in one time unit. Because $F_n = \mathbf{F} \cdot \mathbf{n}$, we define, for a flat $S$ with area $\sigma(S)$, the flux to be the rate of flow of $\mathbf{F}$ across $S$ given by the formula
$$\Phi(\mathbf{F}, S) = \mathbf{F} \cdot \mathbf{n}\, \sigma(S).$$
[...]
becomes a better approximation to what we would like to call the flux of $\mathbf{F}$ across $S$ as the subdivision of $S$ is refined by making the corresponding grid $G$ finer in the parameter domain $D$. On the other hand, if $\mathbf{F}$ is continuous on $S$ and $g$ is continuously differentiable on $D$, then
$$\lim_{m(G) \to 0} \sum_k \Phi_k = \int_D \mathbf{F}(g(\mathbf{u})) \cdot \left( \frac{\partial g}{\partial u}(\mathbf{u}) \times \frac{\partial g}{\partial v}(\mathbf{u}) \right) du\, dv = \int_S \mathbf{F} \cdot d\mathbf{S},$$
which is the previously defined integral of $\mathbf{F}$ over $S$. Consequently, we define the flux of $\mathbf{F}$ across $S$ to be the rate of flow given by
$$\Phi(\mathbf{F}, S) = \int_S \mathbf{F} \cdot d\mathbf{S}.$$
Suppose the vector field $\mathbf{F}(\mathbf{x}) = (F_1(\mathbf{x}), F_2(\mathbf{x}), F_3(\mathbf{x}))$ is tangent to the surface $S$ at every point $\mathbf{x}$ of $S$. Then $\mathbf{F} \cdot \mathbf{n} = 0$ at every point of $S$, so the surface integral $\int_S \mathbf{F} \cdot \mathbf{n}\, d\sigma$ is zero. At the other extreme, if the continuous vector field $\mathbf{F}(\mathbf{x})$ is perpendicular to $S$ at every point $\mathbf{x}$ of $S$ we expect that the integral of $\mathbf{F}$ over $S$ will be different from zero. For example, if $\mathbf{F}$ coincides with the standard unit normal vector $\mathbf{n}$ at each point of $S$, then $\int_S \mathbf{F} \cdot \mathbf{n}\, d\sigma = \int_S \mathbf{n} \cdot \mathbf{n}\, d\sigma = \int_S d\sigma$, which is just the area of $S$.
The motivation for the definition of flux given previously is stated in terms of the velocity field of a fluid flow, because that is the physical setting for flux measurements across surfaces that we most easily visualize. However, some of the most important applications of surface integrals concern the flux of the more abstract fields: gravitational, electric, and magnetic. The next example is fundamental for these areas.
[...]
where $G$ is the universal gravitational constant. Note that the magnitude of the field at $\mathbf{x}$ is $|\mathbf{F}(\mathbf{x})| = GMm|\mathbf{x}|^{-2}$; in other words, this is an inverse-square law of attraction. To compute the flux of this field across a sphere $S_a$ of radius $a$ centered at the origin, we make the simplifying observation that if $\mathbf{x}$ is on $S_a$, then we have $\mathbf{F}(\mathbf{x}) = -GMma^{-2}\mathbf{n}$, where $\mathbf{n}$ is an outward-pointing unit vector directed from $\mathbf{x}$ away from the origin. Since $\mathbf{n} \cdot \mathbf{n} = 1$, the flux of the field is
$$\int_{S_a} \mathbf{F} \cdot \mathbf{n}\, d\sigma = -GMma^{-2} \int_{S_a} d\sigma.$$
This last integral is the area of the sphere $S_a$, namely $4\pi a^2$, so flux $\Phi = -4\pi GMm$. The most significant feature of this result is that $\Phi$ is independent of $a$, the radius of the sphere. We'll use Gauss's Theorem in the next section to show that the same phenomenon holds for closed surfaces other than spheres.
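The independence of the flux from the radius can be observed numerically. The following sketch rests on stated assumptions: it takes the field to be F(x) = -GMm x / |x|^3, which has the magnitude and direction described above, sets GMm = 1, and uses a spherical-coordinate parametrization with outward unit normal x / a and area element a^2 sin v du dv. The function name and grid sizes are choices made here.

```python
import math

def flux_through_sphere(a, n_u=200, n_v=200, GMm=1.0):
    # Midpoint-rule approximation of the outward flux of the assumed
    # inverse-square field across the sphere of radius a centered at the origin.
    du = 2.0 * math.pi / n_u
    dv = math.pi / n_v
    total = 0.0
    for i in range(n_u):
        u = (i + 0.5) * du
        for j in range(n_v):
            v = (j + 0.5) * dv
            x = (a * math.cos(u) * math.sin(v),
                 a * math.sin(u) * math.sin(v),
                 a * math.cos(v))
            r = math.sqrt(sum(c * c for c in x))
            field = tuple(-GMm * c / r ** 3 for c in x)
            normal = tuple(c / a for c in x)  # outward unit normal
            f_dot_n = sum(fc * nc for fc, nc in zip(field, normal))
            total += f_dot_n * a * a * math.sin(v) * du * dv
    return total

flux_small = flux_through_sphere(1.0)
flux_large = flux_through_sphere(3.0)
```

Both values approximate -4π; the small remaining discrepancy from -4π comes from the quadrature in v, not from the radius.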
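Let $\mathbf{x} = (x, y, z)$. Then the coordinates of the cross-product $\partial g/\partial u \times \partial g/\partial v$ have the form
$$\frac{\partial(y, z)}{\partial(u, v)} = \begin{vmatrix} y_u & y_v \\ z_u & z_v \end{vmatrix}, \quad \frac{\partial(z, x)}{\partial(u, v)} = \begin{vmatrix} z_u & z_v \\ x_u & x_v \end{vmatrix}, \quad \frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} x_u & x_v \\ y_u & y_v \end{vmatrix}.$$
Thus we can write general surface integrals of a vector field $\mathbf{F} = (F_1, F_2, F_3)$ in either of the successively more abbreviated forms
$$\int_S \mathbf{F} \cdot d\mathbf{S} = \int_D \left( F_1 \frac{\partial(y, z)}{\partial(u, v)} + F_2 \frac{\partial(z, x)}{\partial(u, v)} + F_3 \frac{\partial(x, y)}{\partial(u, v)} \right) du\, dv = \int_S F_1\, dy\, dz + F_2\, dz\, dx + F_3\, dx\, dy.$$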
3D Orientation
In computing a line integral over a piecewise smooth curve, it's customary to orient the smooth pieces of the curve coherently so that the terminal point of one piece is the same as the initial point of the one that follows it. To integrate a vector field over a piecewise smooth surface, we need a notion of orientation for pieces of smooth surfaces $S$. If $g: \mathbb{R}^2 \to \mathbb{R}^3$ represents $S$ parametrically with $g$ defined on $D$, then Figure 9.13 shows how $D$ and $S$ might possibly be related. The edge of $S$, corresponding under $g$ to the boundary of $D$, we'll call the border of $S$. As a point $\mathbf{u}$ moves around the piecewise smooth boundary of $D$ in the counterclockwise direction, its image $g(\mathbf{u})$ traces the border of $S$ with what we'll call its positive orientation. It will be convenient later to use the notation $\partial S$ to denote the positively oriented border of $S$.
An alternative way to describe the positive orientation is as follows. Define the positive side of $S$ by saying that "positive" is the side of $S$ out from which the normal vector $\partial g/\partial u \times \partial g/\partial v$ points. If you then walk on the positive side of $S$ keeping $S$ on your left as you follow the border around, you are going in its positive direction. See Figure 9.18(b) for a picture. The equivalence of these two notions of positivity can be stated and proved as a formal theorem, but we won't attempt that here.
FIGURE 9.18, parts (a), (b), and (c)
A piecewise smooth surface is defined to be a finite union of pieces of smooth sur-
face that are joined along common border curves. Figure 9.18 shows some examples.
The border curve of each piece of surface has a positive orientation that comes from
some parametrization of that piece. The parametrizations of two adjacent pieces are
coherent if they give opposite orientations to common border curves, as in parts (a)
and (b) of Figure 9.18. A piecewise smooth surface is said to be orientable if its
adjacent pieces can be parametrized coherently. The border orientation of a single
piece can always be reversed to accommodate a neighbor by interchanging the roles
of its two parameters, for example replacing (u, v) by (v, u) throughout. Parts (a)
and (b) of Figure 9.18 show orientable surfaces. However part (c) of Figure 9.18
shows two rectangular strips joined together, one of them with a twist. This surface
is not orientable, because no matter how the orientations of the pieces are changed
there will be some part of the common border traced in the same direction. The
resulting surface is called a Möbius strip.
We define the integral of a continuous vector field over a piecewise smooth surface
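to be the sum of the integrals over each of its smooth pieces. Thus if $S = S_1 \cup S_2$,
$$\int_S \mathbf{F} \cdot d\mathbf{S} = \int_{S_1} \mathbf{F} \cdot d\mathbf{S} + \int_{S_2} \mathbf{F} \cdot d\mathbf{S}.$$
[...]
$$d\sigma = \left| \frac{\partial g}{\partial u} \times \frac{\partial g}{\partial v} \right| du\, dv,$$
doesn't change when the orientation is reversed by interchanging the roles of $u$ and $v$. But in Formula 3.3, the vector surface differential,
$$d\mathbf{S} = \mathbf{n}\, d\sigma,$$
where $\mathbf{n}$ is a unit normal to the surface. Possible choices for this vector are considered in the following examples.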
For a flat surface $S$ parallel to the $xy$-plane, there are two possible choices for the unit normal; $\mathbf{n}$ must be either $(0, 0, 1)$ or $(0, 0, -1)$. In Figure 9.19(a) either choice would in principle be appropriate for the rectangle $S_1$ in the $xy$-plane, but having chosen one, say $\mathbf{n}_1 = (0, 0, 1)$, there is only one possible choice of $\mathbf{n}$ for the rectangle $S_2$ in the $xz$-plane that will lead to a coherent orientation. Following the conventions illustrated in Figure 9.19(a), we choose $\mathbf{n}_2 = (0, 1, 0)$. (Note that with these choices the two normals point out from the same side of the two-piece surface.) To compute the total flux of the vector field $\mathbf{F}(x, y, z) = (y, z, x)$ over $S_1 \cup S_2$ with this orientation, we first note that $\mathbf{F}(x, y, z) \cdot \mathbf{n}_1 = (y, z, x) \cdot (0, 0, 1) = x$ on $S_1$. Also on $S_1$ we have $d\sigma = dx\, dy$. Hence
$$\int_{S_1} \mathbf{F} \cdot \mathbf{n}_1\, d\sigma = \int_{S_1} x\, dx\, dy = \int_0^1 dy \int_0^1 x\, dx = \frac{1}{2}.$$
Similarly, $\mathbf{F} \cdot \mathbf{n}_2 = z$ on $S_2$, where $d\sigma = dx\, dz$, so
$$\int_{S_2} \mathbf{F} \cdot \mathbf{n}_2\, d\sigma = \int_{S_2} z\, dx\, dz = \int_0^1 dx \int_0^1 z\, dz = \frac{1}{2}.$$
FIGURE 9.19, parts (a) and (b)
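The two flat-surface flux integrals can be reproduced with a simple midpoint rule. This is a sketch under an assumption read off from the limits of integration above: S1 is taken to be the unit square 0 <= x, y <= 1 in the plane z = 0 and S2 the unit square 0 <= x, z <= 1 in the plane y = 0. The helper names are hypothetical.

```python
def integrate_unit_square(func, n=100):
    # Midpoint rule for a double integral over the unit square [0, 1] x [0, 1].
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += func((i + 0.5) * h, (j + 0.5) * h) * h * h
    return total

def F(x, y, z):
    # The field of the example.
    return (y, z, x)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

n1 = (0.0, 0.0, 1.0)  # chosen unit normal on S1 (in the xy-plane)
n2 = (0.0, 1.0, 0.0)  # coherent unit normal on S2 (in the xz-plane)

flux_s1 = integrate_unit_square(lambda x, y: dot(F(x, y, 0.0), n1))
flux_s2 = integrate_unit_square(lambda x, z: dot(F(x, 0.0, z), n2))
total_flux = flux_s1 + flux_s2
```

Reversing either normal would change the sign of the corresponding term, which is why a coherent choice of orientation has to be fixed before the pieces are added.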
Suppose the vector field in the previous example is replaced by the field $\mathbf{G}(x, y, z) = (x, y, 0)$, but we retain the closed surface $C \cup D$ shown in Figure 9.19(b). This time
[...]
The area element on $C$ is $d\sigma = |g_x \times g_y|\, dx\, dy = \sqrt{2}\, dx\, dy$. Since the third coordinate of $g_x \times g_y$ is positive, this vector points in, so we must change sign to get an outward-pointing normal. However, we'll compute $\mathbf{G} \cdot \mathbf{n}$ just by observing that $\mathbf{G}$ is parallel to the $xy$-plane and $\mathbf{n}$ is perpendicular to the lines on $C$. Thus the angle between $\mathbf{G}$ and $\mathbf{n}$ is $\pi/4$ at every point of $C$, and
EXERCISES
1. (a) Sketch the plane triangle $T$ in $\mathbb{R}^3$ parametrized by $g(u, v) = (2u + v, v, 3u + v)$ for $0 \leq u$, $0 \leq v$, $u + v \leq 1$.
(b) Find the area of $T$.
2. (a) Sketch the plane elliptical region $E$ in the part of the plane $z = 4 - x - 2y$ that lies above the disk $x^2 + y^2 \leq 1$ in the $xy$-plane.
(b) Find the area of $E$.
3. (a) Sketch the part of the graph of the hyperbolic paraboloid $z = y^2 - x^2$ that lies above the disk $x^2 + y^2 \leq 1$ in the $xy$-plane.
(b) Find the area of the part of the graph described in part (a).
[...]
(b) Find the flux across $P$ of the vector field $\mathbf{F}(x, y, z) = -x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$.
6. Use the parametrization
$$g(u, v) = \begin{pmatrix} a \cos u \sin v \\ a \sin u \sin v \\ a \cos v \end{pmatrix}, \quad 0 \leq u \leq 2\pi, \quad 0 \leq v \leq \pi,$$
for a sphere of radius $a$ to show that the area of the sphere is $4\pi a^2$.
7. A repelling electric field $\mathbf{E}(\mathbf{x}) = |\mathbf{x}|^{-3}\mathbf{x}$ has flux $\Phi$ across the sphere of radius $a$ centered at the origin. Find $\Phi$.
8. (a) Find the area of the spiral ramp represented parametrically by
[...]
(a) $\mathbf{F}(x, y, z) = (x, y, z)$ and $S$ is given by
$$g(u, v) = \begin{pmatrix} u - v \\ u + v \\ uv \end{pmatrix}, \quad 0 \leq u \leq 1, \quad 0 \leq v \leq 2.$$
(b) $\mathbf{F}(x, y, z) = (x^2, 0, 0)$ and $S$ is given by
$$g(u, v) = \begin{pmatrix} u \cos v \\ u \sin v \\ \ldots \end{pmatrix}, \quad 0 \leq u \leq 1, \quad 0 \leq v \leq 2\pi.$$
10. Find the total mass of a spherical film having density at each point equal to the linear distance of the point from a single fixed point on the sphere.
*11. Let $\mathbf{x} = g(u, v)$, for $(u, v)$ in $D$, and $\mathbf{x} = h(s, t)$, for $(s, t)$ in $B$, be parametrizations for the same piece of smooth surface $S$ in $\mathbb{R}^3$. If there is a one-to-one transformation $T$, continuously differentiable both ways between $D$ and $B$, such that the Jacobian determinant of $T$ is positive, and such that $g(u, v) = h(T(u, v))$ for $(u, v)$ in $D$, then $g$ and $h$ are called equivalent parametrizations of $S$.
(a) Show that equivalent parametrizations assign the same surface area to $S$. (Hint: Use the change-of-variable theorem.)
(b) Show that the equivalent parametrizations assign the same value to the surface integral of a vector field over $S$.
12. Let the temperature at a point $(x, y, z)$ of a region $R$ be given by a continuously differentiable function $T(x, y, z)$. Then the vector field $\nabla T$ is called the temperature gradient, and under some reasonable assumptions about the region, $\nabla T(x, y, z)$ is proportional to the direction and rate of flow of heat per unit of area at $(x, y, z)$.
(a) If $T(x, y, z) = x^2 + y^2$ for $x^2 + y^2 \leq 4$, find the total rate of flow of heat across the cylindrical surface $x^2 + y^2 = 1$, $0 \leq z \leq 1$.
(b) Give an example of a continuously differentiable vector field that cannot be a temperature gradient.
13. The Newtonian potential function $(x^2 + y^2 + z^2)^{-1/2}$ has as its gradient the attractive force field $\mathbf{F}$ of a charged particle at the origin acting on an oppositely charged particle at $(x, y, z)$. The flux of the field across a piece of smooth surface is defined to be the surface integral of $\mathbf{F}$ over $S$. Show that the flux of $\mathbf{F}$ across a sphere of radius $a$ centered at the origin is independent of $a$.
[...]
(b) Find the area of the graph of $f(x, y) = x^2 + y$ for $0 \leq x \leq 1$, $0 \leq y \leq 1$.
15. (a) Show that if $G: \mathbb{R}^3 \to \mathbb{R}$ is continuously differentiable and implicitly determines a piece of smooth surface $S$ on which $\partial G/\partial z \neq 0$, and which lies over a region $D$ of the $xy$-plane, then
$$\sigma(S) = \int_D \sqrt{\left(\frac{\partial G}{\partial x}\right)^2 + \left(\frac{\partial G}{\partial y}\right)^2 + \left(\frac{\partial G}{\partial z}\right)^2}\; \left| \frac{\partial G}{\partial z} \right|^{-1} dx\, dy.$$
Assume that just one point of $S$ lies over each point of $D$.
(b) Compute the surface area of the hemisphere [...] using part (a).
16. (a) Show that if a surface $S$ is the graph of $z = f(x, y)$ for $(x, y)$ in $D$, then the surface integral of $\mathbf{F} = (F_1, F_2, F_3)$ over $S$ is
$$\int_D \left( -F_1 \frac{\partial f}{\partial x} - F_2 \frac{\partial f}{\partial y} + F_3 \right) dx\, dy.$$
(b) Use part (a) to compute the integral of $\mathbf{F}(x, y, z) = (x, y, z)$ over the graph of $z = x^2 + y$ for $0 \leq x \leq 1$, $0 \leq y \leq 1$.
In Exercises 17 to 20, find a parametrization as a piecewise smooth orientable surface with outward-pointing normal for the given surface.
17. The cylindrical can with bottom and no top given by $x^2 + y^2 = 1$, $0 \leq z \leq 1$ and $x^2 + y^2 \leq 1$, $z = 0$
18. The funnel given by $x^2 + y^2 - z^2 = 0$, $1 \leq z \leq 4$ and $x^2 + y^2 = 1$, $0 \leq z \leq 1$
19. The trough given by $y - z = 0$, $0 \leq x \leq 1$, $0 \leq z \leq 1$, and $y + z = 0$, $0 \leq x \leq 1$, $0 \leq z \leq 1$
[...]
[Hint: Write $\int \mathbf{F} \cdot d\mathbf{S}$ in the form $\int \mathbf{F} \cdot \mathbf{n}\, d\sigma$ and use Theorem 3.5 in Chapter 7, Section 3.]
(b) Show that if the piece of $S$ shrinks to a point $\mathbf{x}_0$ in such a way that $\sigma(S)$ tends to zero, then $\{1/\sigma(S)\}\int_S \mathbf{F} \cdot d\mathbf{S}$ tends to $\mathbf{F}(\mathbf{x}_0) \cdot \mathbf{n}_0$, where $\mathbf{n}_0$ is
[...]
30. Let $f(x, y, z)$ be a continuously differentiable function defined on a smooth surface $S$ in $\mathbb{R}^3$. Suppose that every level surface of $f$ is perpendicular to $S$ wherever the two surfaces intersect. (This just means that their normal vectors are perpendicular at each point of intersection.)
[...]
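the fluid per volume unit. In particular, if div $\mathbf{F}(\mathbf{x}) > 0$ the fluid is expanding at $\mathbf{x}$ and if div $\mathbf{F}(\mathbf{x}) < 0$ the fluid is contracting at $\mathbf{x}$. This interpretation is justified in Section 4B. (See also Chapter 8, Section 4 and Chapter 12, Section 1D.)
$$\int_R \operatorname{div} \mathbf{F}\, dV = \int_{\partial R} \mathbf{F} \cdot d\mathbf{S}.$$
Gauss's Theorem is like Green's Theorem and the formula
$$\int_a^b \nabla f \cdot d\mathbf{x} = f(b) - f(a),$$
[...]
$$\int_R \operatorname{div} \mathbf{F}\, dV = \int_{\partial R} \mathbf{F} \cdot d\mathbf{S}.$$
FIGURE 9.20
Proof. In terms of coordinate functions of $\mathbf{F}$, Gauss's formula reads
$$\int_R \left( \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z} \right) dx\, dy\, dz = \int_{\partial R} F_1\, dy\, dz + F_2\, dz\, dx + F_3\, dx\, dy.$$
We assume first that $R$ is a simple region and prove only the equation
$$\int_R \frac{\partial F_2}{\partial y}\, dx\, dy\, dz = \int_{\partial R} F_2\, dz\, dx,$$
the proofs for the terms containing $F_1$ and $F_3$ being similar. Addition of the resulting equations will then prove the theorem for simple regions. Because $R$ is simple, $\partial R$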
Section 4A Gauss's Theorem 433
consists of the graphs of two functions, $s(x, z)$ and $r(x, z)$, perhaps together with pieces consisting of lines parallel to the $y$-axis as shown in Figure 9.21.
FIGURE 9.21 (the graph $y = s(x, z)$ forms part of $\partial R$)
Let
$$g(u, v) = \begin{pmatrix} g_1(u, v) \\ g_2(u, v) \\ g_3(u, v) \end{pmatrix}, \quad (u, v) \text{ in } D,$$
be a parametrization of $\partial R$. Then
$$\int_{\partial R} F_2\, dz\, dx = \int_D F_2(g_1, g_2, g_3)\, \frac{\partial(g_3, g_1)}{\partial(u, v)}\, du\, dv, \qquad (1)$$
and, on the sections of $\partial R$ that are parallel to the $y$-axis, the normal vector to $\partial R$ is perpendicular to the $y$-axis. Hence $\partial(g_3, g_1)/\partial(u, v)$, the second coordinate of the normal, is equal to zero, thus eliminating the part of the integral that is not on the graph of $r$ or $s$. We now apply the change-of-variable theorem to the two remaining parts of the integral in Equation (1). The appropriate transformations are
$$\begin{pmatrix} z \\ x \end{pmatrix} = \begin{pmatrix} g_3(u, v) \\ g_1(u, v) \end{pmatrix},$$
with $(u, v)$ in either $D_r$ or $D_s$, where $D_r$ and $D_s$ are the parts of $D$ corresponding to the graphs of $r$ and $s$. The Jacobian determinant $\partial(g_3, g_1)/\partial(u, v)$ is positive on the graph of $r$ and negative on the graph of $s$, because it represents the second coordinate of the outward normal. On $D_r$ we have $g_2(u, v) = r(x, z)$, whereas on $D_s$ we have $g_2(u, v) = s(x, z)$. Using these facts, we get from the change-of-variable theorem and Equation (1),
$$\int_{\partial R} F_2\, dz\, dx = \int_{R_2} F_2(x, s(x, z), z)(-1)\, dx\, dz + \int_{R_2} F_2(x, r(x, z), z)\, dx\, dz,$$
where $R_2$ is the plane region we get by projecting $R$ onto the $xz$-plane. These last two integrals are not surface integrals, but rather 2-dimensional multiple integrals.
[...]
$$\int_{\partial R} F_2\, dz\, dx = \int_{R_2} \left[ \int_{s(x, z)}^{r(x, z)} \frac{\partial F_2}{\partial y}(x, y, z)\, dy \right] dx\, dz = \int_R \frac{\partial F_2}{\partial y}\, dx\, dy\, dz.$$
Similar arguments involving $F_1$ and $F_3$ complete the proof for simple regions, since the addition of the three resulting equations gives
$$\int_R \operatorname{div} \mathbf{F}\, dV = \int_{\partial R} \mathbf{F} \cdot d\mathbf{S}.$$
Example 5 of Section 3 consists of showing that the flux of the gradient field F of
the potential function
and then that the divergence of this field is zero (i.e., div F = 0 everywhere except
at the origin). In particular, div F = 0 throughout R. Applying Gauss's Theorem to
R gives
{ J< dS = { divFdV = 0.
0
laR jR
But aR consists of S1 with inward pointing normal and S2 with outward pointing
normal; so, with the understanding that S1 stands for the inner surface with reversed
normal, we get
{ F • dS = { F • dS +{ F • dS = 0.
laR ls, ls2
Section 4B Gauss's Theorem 435
Thus the integrals over the outward-oriented surfaces are equal. To find the actual
value, it's enough to compute it for one surface, say a sphere. The result is -41r,
as shown in Example 5 of Section 3 with GM m = 1. This result is a special case
of one version of Gauss's Law, which says that the gravitational flux out across a
surface S containing a mass distribution of total mass M on R is -41r M.
4.2 { F • dS = { F • dS.
lsi ls2
In other words, F' has the same flux across the two surfaces S1 and S2.
It's easy to check that the field F(x,y,z) = (xz,yz,-z 2 ) satisfies divF(x) = 0
everywhere. It follows from the Surface Independence Principle that if two sur-
faces S1 and S2 bound a region R , one with inward-oriented normal, the other with
outward-oriented normal, then F has the same flux across the two surfaces. If one
of the two surfaces, say S1, is contained in the xy-plane, as in Figure 9.22(a), then
the flux across S1 is zero, because F(x, y, z) = 0 when z = 0. Hence the flux across
S2 is also zero.
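The two facts used in this example can be checked numerically. The sketch below is only an illustration: the sample points and the finite-difference step h are arbitrary choices, not taken from the text. It estimates div F for F(x, y, z) = (xz, yz, -z²) by central differences and evaluates F at a point of the xy-plane.

```python
def F(x, y, z):
    # Field from the example above: F(x, y, z) = (xz, yz, -z^2).
    return (x * z, y * z, -z * z)

def divergence(field, p, h=1e-5):
    # Central-difference estimate of dF1/dx + dF2/dy + dF3/dz at the point p.
    x, y, z = p
    d1 = (field(x + h, y, z)[0] - field(x - h, y, z)[0]) / (2 * h)
    d2 = (field(x, y + h, z)[1] - field(x, y - h, z)[1]) / (2 * h)
    d3 = (field(x, y, z + h)[2] - field(x, y, z - h)[2]) / (2 * h)
    return d1 + d2 + d3

# Arbitrary sample points (an assumption made for illustration).
sample_points = [(0.3, -1.2, 0.7), (2.0, 1.0, -3.0), (-0.5, 0.25, 4.0)]
divergences = [divergence(F, p) for p in sample_points]

# On the plane z = 0 the field vanishes, so its flux across S1 is zero.
value_on_plane = F(1.5, -2.0, 0.0)

print(divergences)
print(value_on_plane)
```

A check at a few points is evidence rather than proof; the exact computation is z + z - 2z = 0.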
4B Interpretation of Divergence
The divergence of a vector field F at a point x is a measure of the tendency of the
field to radiate away from x, hence the term divergence. To justify this interpretation,
consider a solid ball Ba of radius a centered at a point xo in the interior of the set
FIGURE 9.22 (a) (b)
\[
\frac{1}{V(B_a)}\int_{B_a} \operatorname{div} F\,dV
= \frac{1}{V(B_a)}\int_{\partial B_a} F\cdot n\,d\sigma,
\]
where n is the outward-pointing unit normal vector to the spherical boundary surface
S_a = ∂B_a. The ratio on the left is the average value of div F in a neighborhood of
x0, and so tends to div F(x0) as a tends to zero. (See Exercise 3 of Chapter 7,
Section 3.) The integral on the right is the average flux, per unit of volume, of F
directed out across S_a. Hence this average flux tends to the limit of the left side,
namely div F(x0), as a tends to zero. The number div F(x0) is the expansion rate of F
at x0. Gauss's Theorem itself is often called the divergence theorem because the
theorem is a statement about div F. In particular, if div F(x) > 0 the flow generated
by F is expanding near x and if div F(x) < 0 the flow generated by F is contracting
near x. (See also Chapter 8, Section 4 and Chapter 12, Section 1D.)
FIGURE 9.23
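The limit described above can be observed numerically. The following sketch assumes a sample field F(x, y, z) = (x², y, sin z) and a center x0 that are not from the text. It approximates the outward flux across a sphere of radius a by a midpoint rule in the two spherical angles, divides by the volume of the ball, and compares the ratio with div F(x0) as a shrinks.

```python
import math

def F(x, y, z):
    # Hypothetical sample field (not from the text); div F = 2x + 1 + cos z.
    return (x * x, y, math.sin(z))

def div_F(x, y, z):
    return 2 * x + 1 + math.cos(z)

def average_outward_flux(field, center, a, n_phi=80, n_theta=160):
    # Flux of the field out of the sphere of radius a about center,
    # divided by the volume of the enclosed ball (midpoint rule in both angles).
    d_phi = math.pi / n_phi
    d_theta = 2 * math.pi / n_theta
    flux = 0.0
    for i in range(n_phi):
        phi = (i + 0.5) * d_phi
        for j in range(n_theta):
            theta = (j + 0.5) * d_theta
            n = (math.sin(phi) * math.cos(theta),
                 math.sin(phi) * math.sin(theta),
                 math.cos(phi))                      # outward unit normal
            p = [center[k] + a * n[k] for k in range(3)]
            Fp = field(*p)
            f_dot_n = sum(Fp[k] * n[k] for k in range(3))
            flux += f_dot_n * a * a * math.sin(phi) * d_phi * d_theta
    volume = 4.0 / 3.0 * math.pi * a ** 3
    return flux / volume

x0 = (0.5, -0.2, 0.3)
ratios = {a: average_outward_flux(F, x0, a) for a in (0.4, 0.2, 0.1, 0.05)}
exact = div_F(*x0)

for a, ratio in ratios.items():
    print(a, ratio, exact)
```

The printed ratios should approach the exact divergence as a decreases, up to the quadrature error of the chosen grid.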
EXAMPLE 5 Let F(x) be the continuously differentiable velocity field of a fluid flow in
3-dimensional space, with continuously differentiable fluid density ρ(x), and let B
be an arbitrary region of finite volume in the domain of F(x). We assume that no
fluid is created or destroyed in B so any change in density is due to compression or
expansion of the fluid. Then
\[
\frac{d}{dt}\int_{B} \rho\,dV = -\int_{\partial B} F\cdot n\,dS,
\]
where n is the outward-pointing unit normal to the surface ∂B. The left side is the
rate of change of mass with respect to t, which because of the absence of creation
or destruction of fluid must be due to flux across ∂B, as measured by the right side.
We need the minus sign because the integral by itself measures total outward flux,
which would be positive if the left side were negative, and vice versa. Now apply
the Leibniz rule on the left and Gauss's theorem on the right to get
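\[
\int_{B} \frac{\partial \rho}{\partial t}\,dV = -\int_{B} \operatorname{div} F\,dV,
\quad\text{or}\quad
\int_{B} \left(\frac{\partial \rho}{\partial t} + \operatorname{div} F\right)dV = 0.
\]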
Since B is arbitrary we can choose B to be a ball B_r(x0) of radius r centered at
an arbitrary point x0 in the domain of F(x). Since the integrand in the right-hand
integral is continuous it must be identically zero, otherwise a nonzero value for it
would, for small enough positive r, give a nonzero value for the integral over B_r.
This establishes Equation 4.3.
EXERCISES
In Exercises 1 to 4, compute the divergence of the vector field F.
1. F(x, y, z) = (x², y², z²)
2. F(x, y, z) = (sin xy, 0, 0)
3. F(x, y, z) = (y, z, x)
4. F(x, y, z) = (xy, yz, zx)
In Exercises 5 to 8, verify Gauss's Theorem for the vector field F and regions R in ℝ³.
Sketch R, together with a few outward-pointing normal vectors.
5. F(x, y, z) = (x², y², z²); R: x² + y² ≤ 1, 0 ≤ z ≤ 1
6. F(x, y, z) = (y, -x, 0); R: x² + y² + z² ≤ 4
7. F(x, y, z) = (0, 0, z); R: x² + y² ≤ 1, 0 ≤ z ≤ 1
8. F(x, y, z) = (x, y, z); R: 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, 0 ≤ z ≤ 1
In Exercises 9 to 12, sketch the closed surface S, and compute $\int_S F\cdot dS$ over S by
using Gauss's Theorem. Assume the normal vectors to S point out.
12. F(x, y, z) = (x, y, z); S: bottom x² + y² ≤ 1, z = 0; top z = 1 - x² - y²
In Exercises 13 and 14, prove the identity for a twice continuously differentiable
vector field F or real-valued function f.
13. div(curl F)(x) = 0
14. curl(∇f)(x) = 0
15. (a) Show that for f(x, y, z) = (x² + y² + z²)^(-1/2) the equation div(∇f)(x) = 0
holds for all x ≠ 0.
(b) Show by example that div(∇f)(x) ≠ 0 may hold for some twice continuously
differentiable function f.
(c) If the operator Δ is defined by Δf = div(∇f), find a formula for Δf in terms of
partial derivatives of f. A function such that Δf(x) = 0 for all x in the domain of f
is called a harmonic function, and Δ is called the Laplace operator.
16. The trace of a square matrix is defined as the sum of the elements on its main
diagonal. If $\mathbb{R}^n \xrightarrow{F} \mathbb{R}^n$ is a differentiable vector field, we define div F to be the
real-valued function given by
in ℝ³ and with outward-pointing normal.
17. F(x, y, z) = (x², y², z²)
20. (a) Use Gauss's theorem to prove that if F is a continuously differentiable vector
field with zero divergence in a region R, then the integral of F over ∂R is zero.
(b) Write an intuitive argument, based on the interpretation of the divergence, for
the assertion in part (a).
21. Let S be the ellipsoid x²/a² + y²/b² + z²/c² = 1, and let D(x, y, z) be the distance
from the origin to the tangent plane to S at (x, y, z).
(a) Let
Show that F · n = D⁻¹, where n is the outward unit normal to S at (x, y, z).
(b) Show that
\[
\int_S D^{-1}\,d\sigma = \frac{4\pi}{3}\left(\frac{bc}{a} + \frac{ca}{b} + \frac{ab}{c}\right).
\]
22. A vector field $\mathbb{R}^3 \xrightarrow{F} \mathbb{R}^3$ defined in a region R is called incompressible in R if
div F(x) = 0 for all x in R. If F is continuously differentiable and incompressible in R,
show that the flux of F is zero across every sufficiently small sphere with its interior
in R.
23. Suppose that u(x, y, z) is twice continuously differentiable in a region R and that
u is a harmonic function
25. A field F for which div F(x) = 0 everywhere is called divergence free. Show that the
flux of a divergence-free field across a smooth closed surface is zero.
26. Define the vector field F(x, y, z) = (ax, by, cz), where a, b, and c are constants.
(a) Find the flux of F across a sphere of radius p > 0, oriented so that its normal
vector points out from the sphere.
(b) Answer the question of part (a) with F defined instead by F(x, y, z) = (yz, zx, xy).
27. Gauss's Law. The gravitational field generated by an integrable mass density μ
defined on a region R is
\[
F(x) = G\int_{R} \frac{\mu(y)\,(y - x)}{|y - x|^{3}}\,dV_y.
\]
(a) Show that the flux of this field across a smooth closed surface S with no points of
R inside or on S is zero. Do this by interchanging the order of surface and volume
integrals.
(b) Show that the flux of this field across a smooth closed surface S containing all
points of R in its interior is $-4\pi G\int_R \mu\,dV$.
(The term border instead of boundary is used to avoid confusion with what we
earlier called the boundary of S in Chapter 5, Section 1.)
We can now relate the line integral of a vector field F around ∂S to the surface
integral of an associated vector field over S. We assume that $\mathbb{R}^3 \xrightarrow{F} \mathbb{R}^3$ is a
continuously differentiable vector field whose domain contains S. In Chapter 8, Section 4
we defined the vector field curl F by
The vector field curl F is a kind of derivative of the field F, and it plays a central
role in another extension of the Fundamental Theorem of Calculus called Stokes's
Theorem, to be proved later as Theorem 4.2:
\[
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
= \begin{pmatrix} u\cos v \\ u\sin v \\ v \end{pmatrix},
\quad\text{for } 0 \le u \le 1,\ 0 \le v \le \frac{\pi}{2}.
\]
Then the border of S consists of three line segments and a spiral curve shown
in Figure 9.25 together with the domain D of the parametrization. Restricting the
parametrization of S to the boundary of D gives the following parametrizations of
the smooth pieces of the border of S:
Now let F be the vector field F(x, y, z) = (z, x, y). The line integrals of F over
the γ_i are all of the form
\[
\int_{\gamma_i} z\,dx + x\,dy + y\,dz.
\]
FIGURE 9.25 (a) (b)
It's easy to see that the integrals over γ1, γ2, and γ3 are all zero, whereas over γ4
we get
\[
\int_{\gamma_4} F\cdot dx = \int_0^{\pi/2} (\cos^2 t + \sin t - t\sin t)\,dt = \frac{\pi}{4}.
\]
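On the other hand curl F(x1, x2, x3) = (1, 1, 1); so the integral of curl F over S is
\[
\int_S \operatorname{curl} F\cdot dS
= \int_D \left(\frac{\partial(y,z)}{\partial(u,v)} + \frac{\partial(z,x)}{\partial(u,v)} + \frac{\partial(x,y)}{\partial(u,v)}\right)du\,dv
= \int_0^1 du \int_0^{\pi/2} (\sin v - \cos v + u)\,dv = \frac{\pi}{4}.
\]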
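Both sides of this example can be checked with a simple midpoint rule. The sketch below uses the parametrization g(u, v) = (u cos v, u sin v, v) and the field F = (z, x, y) from the example; the helper name midpoint and the grid sizes are assumptions made for illustration.

```python
import math

def midpoint(f, a, b, n):
    # Composite midpoint rule for the integral of f over [a, b] with n panels.
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def line_integrand(t):
    # gamma_4 is the image of u = 1: (x, y, z) = (cos t, sin t, t), 0 <= t <= pi/2.
    x, y, z = math.cos(t), math.sin(t), t
    dx, dy, dz = -math.sin(t), math.cos(t), 1.0
    return z * dx + x * dy + y * dz          # F . x'(t) with F = (z, x, y)

def surface_integrand(u, v):
    # g_u x g_v = (sin v, -cos v, u) for g(u, v) = (u cos v, u sin v, v).
    normal = (math.sin(v), -math.cos(v), u)
    curl_F = (1.0, 1.0, 1.0)
    return sum(c * n for c, n in zip(curl_F, normal))

line_integral = midpoint(line_integrand, 0.0, math.pi / 2, 2000)
surface_integral = midpoint(
    lambda u: midpoint(lambda v: surface_integrand(u, v), 0.0, math.pi / 2, 400),
    0.0, 1.0, 400)

print(line_integral, surface_integral, math.pi / 4)
```

The integrals over the other three border pieces are zero, so the single line integral over γ4 is the whole circulation.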
\[
\int_S \operatorname{curl} F\cdot dS = \oint_{\partial S} F\cdot dx,
\]
where ∂S is the positively oriented border of S.
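Proof. Let F1, F2, F3 be coordinate functions of F. We'll prove that
\[
\oint_{\partial S} F_1\,dx = \int_S -\frac{\partial F_1}{\partial y}\,dx\,dy + \frac{\partial F_1}{\partial z}\,dz\,dx. \tag{2}
\]
The proofs of
\[
\oint_{\partial S} F_2\,dy = \int_S -\frac{\partial F_2}{\partial z}\,dy\,dz + \frac{\partial F_2}{\partial x}\,dx\,dy
\]
and
\[
\oint_{\partial S} F_3\,dz = \int_S -\frac{\partial F_3}{\partial x}\,dz\,dx + \frac{\partial F_3}{\partial y}\,dy\,dz
\]
are similar, and addition of the three equations gives Stokes's formula. To prove
Equation (2), suppose that h(t) = (u(t), v(t)) is a counterclockwise-oriented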
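\[
\begin{aligned}
\oint_{\partial S} F_1\,dx
&= \int F_1(g(u,v))\,\frac{d}{dt}\,g_1(u,v)\,dt \\
&= \int F_1(g(u,v))\left[\frac{\partial g_1}{\partial u}(u,v)\frac{du}{dt} + \frac{\partial g_1}{\partial v}(u,v)\frac{dv}{dt}\right]dt \\
&= \oint_{\partial D} F_1(g)\frac{\partial g_1}{\partial u}\,du + F_1(g)\frac{\partial g_1}{\partial v}\,dv.
\end{aligned}
\]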
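This last integral is a line integral around the region D in ℝ², and we can apply
Green's Theorem to it, getting
\[
\oint_{\partial S} F_1\,dx
= \int_D \left[\frac{\partial}{\partial u}\left(F_1(g)\frac{\partial g_1}{\partial v}\right)
- \frac{\partial}{\partial v}\left(F_1(g)\frac{\partial g_1}{\partial u}\right)\right]du\,dv. \tag{3}
\]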
The assumption that g is twice continuously differentiable ensures that the integral
over D will exist. The same assumption allows us to interchange the order of partial
differentiation in a computation which shows that
Substitution of this identity into Equation (3) gives Equation (2), thus completing
the proof. A suggestion for deriving Equation (4) is given in Exercise 17. •
5B Interpretation of Curl
Using Stokes's Theorem we can derive an interpretation for the vector field curl F
that gives some information about F itself. Let x0 be a point of an open set on which
F is continuously differentiable. Let n0 be an arbitrary unit vector pointing away
from x0, and construct a disk S_r of radius r centered at x0 and perpendicular to n0.
This is shown in Figure 9.26. Applying Stokes's Theorem to F on the surface S_r
and its border γ_r gives
\[
\oint_{\gamma_r} F\cdot dx = \int_{S_r} \operatorname{curl} F\cdot dS.
\]
The value of the line integral was defined more generally in Chapter 8, Section 1A
to be the circulation of F around γ_r. For small r, the circulation around γ_r is a
measure of the tendency of the field near x0 to rotate around the axis determined by
n0. On the other hand, the surface integral is, for small enough r, nearly equal to the
dot product curl F(x0) · n0 multiplied by the area of S_r. See Exercise 22 of Section 3.
FIGURE 9.26
It follows that the circulation around γ_r tends to be larger if n0 points in the same
direction as curl F(x0). Thus we can think of curl F(x0) as determining the axis
about which the circulation of F is greatest near x0. Similarly, |curl F(x0)| measures
the magnitude of the circulation around this axis near x0. Mechanically speaking,
if the vanes of a paddle-wheel were attached to an arrow n0 (see Figure 9.26) and
inserted in a velocity field F at x0, the wheel would be expected to rotate most rapidly
with n0 held parallel to the vector curl F(x0) and not at all with n0 perpendicular
to curl F(x0). In summary, we can think intuitively about the curl of a field as
follows:
(i) The direction of curl F(x) is the axis about which F rotates most rapidly at x.
(ii) The length of curl F(x) determines the maximum rate of rotation at x.
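The paddle-wheel picture can be tested numerically. The sketch below reuses the field F(x, y, z) = (z, x, y) from the earlier example, for which curl F = (1, 1, 1). The base point x0, the radius r, and the helper names are assumptions made for illustration. It computes the circulation around a small circle perpendicular to n0, divided by the area of the disk, for three choices of n0.

```python
import math

def F(x, y, z):
    # Field from the earlier example: F = (z, x, y), so curl F = (1, 1, 1).
    return (z, x, y)

def dot(a, b):
    return sum(a[k] * b[k] for k in range(3))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    length = math.sqrt(dot(a, a))
    return tuple(c / length for c in a)

def circulation_per_area(field, x0, n0, r=1e-2, steps=400):
    # Circulation of the field around the circle of radius r about x0 in the
    # plane perpendicular to n0, divided by the area of the disk.
    n0 = unit(n0)
    helper = (1.0, 0.0, 0.0) if abs(n0[0]) < 0.9 else (0.0, 1.0, 0.0)
    e1 = unit(cross(n0, helper))
    e2 = cross(n0, e1)            # e1 x e2 = n0, so the circle is positively oriented
    dt = 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        c, s = math.cos(t), math.sin(t)
        p = tuple(x0[k] + r * (c * e1[k] + s * e2[k]) for k in range(3))
        velocity = tuple(r * (-s * e1[k] + c * e2[k]) for k in range(3))
        total += dot(field(*p), velocity) * dt
    return total / (math.pi * r * r)

x0 = (0.3, -0.7, 1.1)
along_curl = circulation_per_area(F, x0, (1, 1, 1))       # n0 parallel to curl F
along_z = circulation_per_area(F, x0, (0, 0, 1))          # n0 = k
perpendicular = circulation_per_area(F, x0, (1, -1, 0))   # n0 perpendicular to curl F

print(along_curl, along_z, perpendicular)
```

The largest value occurs when n0 is parallel to curl F, where it equals |curl F| = √3, and the value drops to zero when n0 is perpendicular to curl F, in line with (i) and (ii).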
\[
\int_S \operatorname{curl} F\cdot dS = \oint_{\partial S} F\cdot dx,
\]
for a piecewise smooth surface S.
We can regard a sphere as a piecewise smooth surface on which all of the border
curves cancel one another. Indeed if we parametrize a sphere S_a in ℝ³ by
then the positively oriented "border" of the sphere consists of the half-circle shown
in Figure 9.28 traced once in each direction. Thus the half-circle corresponds to
the segments u = 0 and u = 2π in the parameter domain. (What happens to the
FIGURE 9.28 (a) (b)
\[
\int_{S_a} \operatorname{curl} F\cdot dS = 0.
\]
A surface like that in Example 3, in which the border is effectively nonexistent
for the purpose of line integration over ∂S, is called a closed surface.
\[
\int_S J\cdot dS = \int_S \operatorname{curl} B\cdot dS = \oint_{\partial S} B\cdot dx.
\]
The first integral is the total current flux across S, and the last one is the circulation
of the magnetic field around the border curve ∂S that encircles the conductor. The
equality of these two quantities is called Ampère's law.
A vector field for which div F = 0 is called a divergence-free field; for a quick
way to find one, start with an arbitrary twice continuously differentiable vector field
G = (G1, G2, G3) and set F = curl G. Then div F = 0, since
To simplify the pictures, we'll consider here some vector fields in which the z-
coordinate is zero. Our pictures will show a horizontal slice of the field. Figure 9.29
shows three snapshots of a time-dependent field
\[
F_t(x, y, z) = \bigl((1 - t)x - ty,\; tx + (1 - t)y,\; 0\bigr), \quad 0 \le t \le 1.
\]
FIGURE 9.29
5C Simple Connectedness
Stokes's Theorem has many applications and in particular gives information about
gradient fields, that is, fields F such that F = ∇f, or, using an alternative notation,
F = grad f. If we assume that F is the continuously differentiable gradient field of
f, we can form the vector field curl F. But F = (∂f/∂x, ∂f/∂y, ∂f/∂z), so we get
immediately from the definition of curl F and the equality of mixed partials that
FIGURE 9.30
(a) (b)
FIGURE 9.31
(a) (b)
a closed curve, whereas the outside of such a curve is not simply connected. In
Figure 9.31(a), the curve γ is the border of the surface consisting of the part of the
plane lying inside γ. However, the presence of the hole in Figure 9.31(b) prevents a
similar construction. More precisely, we'll say that an open set B is simply connected
if every piecewise smooth closed curve γ lying in B is the border of some piecewise
smooth orientable surface S lying in B, and with parameter domain a disk in ℝ². We
assume for applications that S is parametrized by twice continuously differentiable
functions.
Now we can prove the following.
then F is a gradient field in B, that is, there is a real-valued function f such that
F = ∇f.
Proof. By Theorem 2.4 it is enough to show that $\oint_\gamma F\cdot dx = 0$ for every piecewise
smooth closed curve γ lying in B. Because B is simply connected, there is a piecewise
smooth surface S of which γ is the border and to which we can apply Stokes's
theorem in either two or three dimensions. Thus
\[
\oint_{\partial S} F\cdot dx = \int_S \operatorname{curl} F\cdot dS = 0,
\]
as we wanted to show. •
EXERCISES
In Exercises 1 to 4, compute curl F.
1. F(x, y, z) = (y - z², z - x², x - y²)
2. F(x, y, z) = (z, 2y, 3z)
3. F(x, y, z) = (x - y, z - x, y - z)
that Stokes's Theorem holds for the vector field F and surface S. Sketch S and its
border, showing orientation.
5. F(x, y, z) = (x, y, z); S: g(u, v) = (u, v, √(1 - u² - v²)), u² + v² ≤ 1
In Exercises 9 and 10, compute $\int_S \operatorname{curl} F\cdot dS$ by using Stokes's Theorem. In other
words, choose a properly oriented parametrization for the border curve γ of S, and
compute $\oint_\gamma F\cdot dx$.
If S is the image in ℝ³ of g, give a precise description of the oriented border of S.
(c) Use Stokes's Theorem to compute the integral of F(x, y, z) = (x, x, 0) over the
border of S as oriented by the parametrization in part (b).
12. Show that Stokes's formula can be written in the form
\[
\int_S \operatorname{curl} F\cdot n\,d\sigma = \oint_{\partial S} F\cdot t\,ds,
\]
where n is a unit normal to S and t is a unit tangent to ∂S.
*13. Use the result of Exercise 25 of Section 3 and Stokes's Theorem to prove that if F
is a continuously differentiable vector field at x0, then
\[
\lim_{r\to 0}\frac{1}{A(D_r)}\oint_{c} F\cdot t\,ds = \operatorname{curl} F(x_0)\cdot n_0,
\]
F'(x)y = S(x)y + ½ curl F(x) × y,
where S(x) is a symmetric matrix.
16. Let F(x, y, z) be the gradient field of the real-valued Newtonian potential
The second term works out similarly. Then subtract, using equality of mixed partials.
18. A vector field $\mathbb{R}^3 \xrightarrow{F} \mathbb{R}^3$ defined in a region R is called irrotational in R if
curl F(x) = 0 for all x in R. If F is continuously differentiable and irrotational in R,
show that the circulation of F is zero around every sufficiently small circular path in R.
19. Consider a cylindrical can C of radius 1 having a closed flat bottom and open top
with an unspecified smooth border ∂C, oriented as shown in Figure 9.32. Let
F(x, y, z) = (x, x, 0). What is the value of the line integral of F over the border of C?
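6.1
\[
\nabla f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k}.
\]
This equation defines ∇ as an operator from real-valued differentiable functions
$\mathbb{R}^3 \xrightarrow{f} \mathbb{R}$ to vector fields $\mathbb{R}^3 \xrightarrow{\nabla f} \mathbb{R}^3$. If we write
then the operator ∇× is defined by taking the formal cross product of ∇ and F to get
6.3
\[
\begin{aligned}
\nabla\times F
&= \begin{vmatrix} \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ F_2 & F_3 \end{vmatrix}\mathbf{i}
 + \begin{vmatrix} \dfrac{\partial}{\partial z} & \dfrac{\partial}{\partial x} \\ F_3 & F_1 \end{vmatrix}\mathbf{j}
 + \begin{vmatrix} \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} \\ F_1 & F_2 \end{vmatrix}\mathbf{k} \\
&= \left(\frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z}\right)\mathbf{i}
 + \left(\frac{\partial F_1}{\partial z} - \frac{\partial F_3}{\partial x}\right)\mathbf{j}
 + \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right)\mathbf{k}.
\end{aligned}
\]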
Thus ∇ × F is the vector field that we have called the curl of F and written curl F.
Similarly, for a differentiable vector field F, we define the operator ∇· by taking the
formal dot product of ∇ and F to get
6.4
This real-valued function we have called the divergence of F and have written div F.
The meaning of the notation just introduced is easy to remember if Equation 6.2 is
kept in mind.
Using the ∇ notation, Stokes's formula becomes
where ∇²f is just shorthand for ∇ · (∇f) and so denotes the Laplace operator
\[
\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}.
\]
Equations (8) and (9) are the same as those in Exercises 13 and 14 of Section 4,
where we used the notations div and curl.
6B Green's Identities
The preceding formulas imply many special cases of the Gauss and Stokes theorems.
A particularly important kind arises if the vector field F is assumed to be a gradient
∇f, or a multiple f∇g of a gradient. If we set F = ∇f in Equation 6, the result is
\[
\int_R \nabla^2 f\,dV = \int_S \frac{\partial f}{\partial n}\,d\sigma. \tag{12}
\]
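If we replace F in Formula 6.6 by f∇g, instead of by ∇f, we have, from Equation (6),
6.8
\[
\int_R \left(f\,\nabla^2 g - g\,\nabla^2 f\right)dV
= \int_S \left(f\,\frac{\partial g}{\partial n} - g\,\frac{\partial f}{\partial n}\right)d\sigma.
\]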
EXAMPLE 2 Green's second identity, Equation 8, illuminates many features of the Laplace
operator ∇². In particular, suppose u and v are harmonic functions on a region R of ℝ³
and its smooth boundary surface ∂R, that is, suppose ∇²u = 0 and ∇²v = 0 on
R ∪ ∂R. Setting f = u and g = v in Equation 8 makes the left side zero, so
\[
\int_{\partial R}\left(u\,\frac{\partial v}{\partial n} - v\,\frac{\partial u}{\partial n}\right)d\sigma = 0
\]
or
\[
\int_{\partial R} u\,\frac{\partial v}{\partial n}\,d\sigma = \int_{\partial R} v\,\frac{\partial u}{\partial n}\,d\sigma.
\]
This last equation expresses a symmetry that holds for an arbitrary pair of harmonic
functions on R ∪ ∂R. In particular, with v = 1 as constant harmonic function,
∂v/∂n = 0 so we conclude that
\[
\int_{\partial R} \frac{\partial u}{\partial n}\,d\sigma = 0.
\]
In words this last equation says that the average value over ∂R of the normal
derivative of a harmonic function must be zero. This last result can also be obtained
directly from Green's first identity. (See Exercise 20.)
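The conclusion of this example can be illustrated on the unit sphere, where the outward normal at (x, y, z) is (x, y, z) itself. The sketch below assumes two sample functions that are not from the text: the harmonic function u = x² - y² + z, and the non-harmonic function w = x², for which Equation (12) predicts the value ∫ ∇²w dV = 2 · (4π/3).

```python
import math

def grad_u(x, y, z):
    # Hypothetical harmonic function u = x^2 - y^2 + z (Laplacian 2 - 2 + 0 = 0).
    return (2 * x, -2 * y, 1.0)

def grad_w(x, y, z):
    # Hypothetical non-harmonic function w = x^2 (Laplacian 2).
    return (2 * x, 0.0, 0.0)

def normal_derivative_integral(grad, n_phi=200, n_theta=400):
    # Integral over the unit sphere of grad . n, where n = (x, y, z) is the
    # outward unit normal; midpoint rule in the two spherical angles.
    d_phi = math.pi / n_phi
    d_theta = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_phi):
        phi = (i + 0.5) * d_phi
        for j in range(n_theta):
            theta = (j + 0.5) * d_theta
            n = (math.sin(phi) * math.cos(theta),
                 math.sin(phi) * math.sin(theta),
                 math.cos(phi))
            g = grad(*n)
            total += sum(g[k] * n[k] for k in range(3)) * math.sin(phi) * d_phi * d_theta
    return total

harmonic_case = normal_derivative_integral(grad_u)
nonharmonic_case = normal_derivative_integral(grad_w)

print(harmonic_case)                       # close to 0
print(nonharmonic_case, 8 * math.pi / 3)   # close to each other
```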
From Equation 6 we can derive some equations for vector-valued integrals. Let
v be an arbitrary constant vector and let F(x) = f(x)v, where f is real-valued and
continuously differentiable on a region R. Then because ∇ · (fv) = ∇f · v (verify!),
Formula 6.6 becomes
\[
\int_R \nabla f\cdot v\,dV = \int_{\partial R} f\,v\cdot n\,d\sigma.
\]
Since v is constant,
\[
v\cdot\int_R \nabla f\,dV = v\cdot\int_{\partial R} f\,n\,d\sigma.
\]
Because v is arbitrary, we can successively set v equal to e1, e2, e3, and conclude
that the two vector integrals have the same coordinates in ℝ³. Hence
\[
\int_R \nabla f\,dV = \int_{\partial R} f\,n\,d\sigma. \tag{13}
\]
Similarly,
\[
\int_R \nabla\times F\,dV = \int_{\partial R} n\times F\,d\sigma. \tag{14}
\]
6C Changing Coordinates
We've already dealt with this issue in Chapter 6, Section 2B and Section 5, and in
Chapter 7, Section 4. In the first instance we looked mainly at fairly simple linear
examples, in the second mainly at geometry, and in the third mainly just at Jacobian
determinants in multiple integrals. Here we take up nonlinear changes of variable
as they affect first and second order differential operators. The calculations can be
fairly messy, so to understand them it's important to follow a general principle.
Changing from rectangular coordinates x in ℝⁿ to curvilinear coordinates u we
use a coordinate transformation x = T(u) that's both continuously differentiable
6.9 Theorem. If v(x) is continuously differentiable then $v'(x) = \bar v'(u)\,(T')^{-1}(u)$,
where x = T(u) is a coordinate transformation and $\bar v(u) = v(T(u))$.
Proof. Since x = T(u), and we assume T⁻¹ exists and is differentiable, the chain
rule gives us
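EXAMPLE 3 Divergence and gradient in polar coordinates. We first compute the inverse of the
derivative matrix for the polar coordinate transformation
\[
\begin{cases} x = r\cos\theta \\ y = r\sin\theta \end{cases}:
\qquad
\begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}^{-1}
= \begin{pmatrix} \cos\theta & \sin\theta \\ -(1/r)\sin\theta & (1/r)\cos\theta \end{pmatrix}.
\]
To find ∇u(x, y) in polar coordinates we use Theorem 6.9 and multiply this inverse
matrix by the derivative matrix of $\bar u(r,\theta)$:
\[
\begin{pmatrix} \dfrac{\partial \bar u}{\partial r} & \dfrac{\partial \bar u}{\partial \theta} \end{pmatrix}
\begin{pmatrix} \cos\theta & \sin\theta \\ -(1/r)\sin\theta & (1/r)\cos\theta \end{pmatrix}.
\]
Thus
\[
\nabla u = \left(\cos\theta\,\frac{\partial \bar u}{\partial r} - \frac{1}{r}\sin\theta\,\frac{\partial \bar u}{\partial \theta}\right)\mathbf{i}
+ \left(\sin\theta\,\frac{\partial \bar u}{\partial r} + \frac{1}{r}\cos\theta\,\frac{\partial \bar u}{\partial \theta}\right)\mathbf{j},
\quad\text{for } r > 0. \tag{15}
\]
In particular we see that we can replace partial differentiation of u with respect to x
and y by action on $\bar u$ according to
\[
\frac{\partial u}{\partial x} = \left(\cos\theta\,\frac{\partial \bar u}{\partial r} - \frac{1}{r}\sin\theta\,\frac{\partial \bar u}{\partial \theta}\right)
\quad\text{and}\quad
\frac{\partial u}{\partial y} = \left(\sin\theta\,\frac{\partial \bar u}{\partial r} + \frac{1}{r}\cos\theta\,\frac{\partial \bar u}{\partial \theta}\right). \tag{16}
\]
Note that the coefficients of the partials with respect to r and θ are just the columns
of the inverse matrix. Thus the divergence of a vector field F(x, y) = F1(x, y)i +
F2(x, y)j becomes in polar coordinates
\[
\nabla\cdot F = \left(\cos\theta\,\frac{\partial \bar F_1}{\partial r} - \frac{1}{r}\sin\theta\,\frac{\partial \bar F_1}{\partial \theta}\right)
+ \left(\sin\theta\,\frac{\partial \bar F_2}{\partial r} + \frac{1}{r}\cos\theta\,\frac{\partial \bar F_2}{\partial \theta}\right). \tag{17}
\]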
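If $\bar u(r)$ and $\bar F(r)$ are functions of r alone then Equations (15) and (17) simplify to
\[
\nabla u = \cos\theta\,\frac{\partial \bar u}{\partial r}\,\mathbf{i} + \sin\theta\,\frac{\partial \bar u}{\partial r}\,\mathbf{j}
\quad\text{and}\quad
\nabla\cdot F = \cos\theta\,\frac{\partial \bar F_1}{\partial r} + \sin\theta\,\frac{\partial \bar F_2}{\partial r}.
\]
Note that while the vector coordinates are computed using polar coordinates, the
vectors ∇u are represented using standard basis vectors i and j in ℝ².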
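Equation (15) can be spot-checked numerically. In the sketch below the test function u(x, y) = x²y + sin x and the evaluation point are assumptions chosen for illustration. The partial derivatives of ū(r, θ) = u(r cos θ, r sin θ) are estimated by central differences, combined as in Equation (15), and compared with the Cartesian gradient computed by hand.

```python
import math

def u(x, y):
    # Hypothetical test function (not from the text).
    return x * x * y + math.sin(x)

def grad_cartesian(x, y):
    # Gradient of u computed by hand: (2xy + cos x, x^2).
    return (2 * x * y + math.cos(x), x * x)

def ubar(r, theta):
    # The same function expressed in polar coordinates.
    return u(r * math.cos(theta), r * math.sin(theta))

def grad_polar(r, theta, h=1e-5):
    # i and j components of grad u from Equation (15), with the r and theta
    # partials of ubar estimated by central differences.
    u_r = (ubar(r + h, theta) - ubar(r - h, theta)) / (2 * h)
    u_t = (ubar(r, theta + h) - ubar(r, theta - h)) / (2 * h)
    c, s = math.cos(theta), math.sin(theta)
    return (c * u_r - s * u_t / r, s * u_r + c * u_t / r)

r, theta = 1.3, 0.7
from_polar = grad_polar(r, theta)
from_cartesian = grad_cartesian(r * math.cos(theta), r * math.sin(theta))

print(from_polar)
print(from_cartesian)
```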
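\[
\frac{\partial}{\partial x}\ \text{by}\ \left(\cos\theta\,\frac{\partial}{\partial r} - \frac{1}{r}\sin\theta\,\frac{\partial}{\partial \theta}\right)
\quad\text{and}\quad
\frac{\partial}{\partial y}\ \text{by}\ \left(\sin\theta\,\frac{\partial}{\partial r} + \frac{1}{r}\cos\theta\,\frac{\partial}{\partial \theta}\right). \tag{18}
\]
Applying the first of these twice to the first equation in (16) gives
\[
\begin{aligned}
\frac{\partial^2 u}{\partial x^2}
&= \left(\cos\theta\,\frac{\partial}{\partial r} - \frac{1}{r}\sin\theta\,\frac{\partial}{\partial \theta}\right)
   \left(\cos\theta\,\frac{\partial \bar u}{\partial r} - \frac{1}{r}\sin\theta\,\frac{\partial \bar u}{\partial \theta}\right) \\
&= \cos\theta\,\frac{\partial}{\partial r}\left(\cos\theta\,\frac{\partial \bar u}{\partial r}\right)
 - \cos\theta\,\frac{\partial}{\partial r}\left(\frac{1}{r}\sin\theta\,\frac{\partial \bar u}{\partial \theta}\right) \\
&\quad - \frac{1}{r}\sin\theta\,\frac{\partial}{\partial \theta}\left(\cos\theta\,\frac{\partial \bar u}{\partial r}\right)
 + \frac{1}{r^2}\sin\theta\,\frac{\partial}{\partial \theta}\left(\sin\theta\,\frac{\partial \bar u}{\partial \theta}\right).
\end{aligned}
\]
We get the polar expression for ∂/∂y from ∂/∂x by interchanging cos θ and sin θ and
replacing - by +, so
\[
\begin{aligned}
\frac{\partial^2 u}{\partial y^2}
&= \left(\sin\theta\,\frac{\partial}{\partial r} + \frac{1}{r}\cos\theta\,\frac{\partial}{\partial \theta}\right)
   \left(\sin\theta\,\frac{\partial \bar u}{\partial r} + \frac{1}{r}\cos\theta\,\frac{\partial \bar u}{\partial \theta}\right) \\
&= \sin\theta\,\frac{\partial}{\partial r}\left(\sin\theta\,\frac{\partial \bar u}{\partial r}\right)
 + \sin\theta\,\frac{\partial}{\partial r}\left(\frac{1}{r}\cos\theta\,\frac{\partial \bar u}{\partial \theta}\right) \\
&\quad + \frac{1}{r}\cos\theta\,\frac{\partial}{\partial \theta}\left(\sin\theta\,\frac{\partial \bar u}{\partial r}\right)
 + \frac{1}{r^2}\cos\theta\,\frac{\partial}{\partial \theta}\left(\cos\theta\,\frac{\partial \bar u}{\partial \theta}\right).
\end{aligned}
\]
With the help of sin²θ + cos²θ = 1 we can extract from the first and last terms of
the sum u_xx + u_yy the terms ū_rr and r⁻²ū_θθ. Everything else but r⁻¹ū_r cancels, so
in polar coordinates
\[
\nabla^2 u = \frac{\partial^2 \bar u}{\partial r^2} + \frac{1}{r}\frac{\partial \bar u}{\partial r} + \frac{1}{r^2}\frac{\partial^2 \bar u}{\partial \theta^2}, \quad 0 < r. \tag{19}
\]
Exercise 20 shows it's easier to verify this equation than derive it from u_xx + u_yy.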
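A numerical spot check of Equation (19) is also easy to set up. The sketch below assumes the test function u(x, y) = x²y + sin x, which is not from the text and whose Cartesian Laplacian is 2y - sin x. It approximates ū_rr, ū_r and ū_θθ by central differences and compares the polar expression with the Cartesian value at one point.

```python
import math

def u(x, y):
    # Hypothetical test function; u_xx + u_yy = 2y - sin x.
    return x * x * y + math.sin(x)

def laplacian_cartesian(x, y):
    return 2 * y - math.sin(x)

def ubar(r, theta):
    # The same function expressed in polar coordinates.
    return u(r * math.cos(theta), r * math.sin(theta))

def laplacian_polar(r, theta, h=1e-3):
    # Right side of Equation (19), with derivatives of ubar estimated by
    # central differences of step h.
    u_r = (ubar(r + h, theta) - ubar(r - h, theta)) / (2 * h)
    u_rr = (ubar(r + h, theta) - 2 * ubar(r, theta) + ubar(r - h, theta)) / (h * h)
    u_tt = (ubar(r, theta + h) - 2 * ubar(r, theta) + ubar(r, theta - h)) / (h * h)
    return u_rr + u_r / r + u_tt / (r * r)

r, theta = 1.3, 0.7
polar_value = laplacian_polar(r, theta)
cartesian_value = laplacian_cartesian(r * math.cos(theta), r * math.sin(theta))

print(polar_value, cartesian_value)
```

Agreement at one point is only a sanity check; it does not replace the derivation above or the verification asked for in Exercise 20.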
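of Chapter 6, Section 5, namely
\[
\nabla^2 u = \frac{\partial^2 \bar u}{\partial r^2} + \frac{1}{r}\frac{\partial \bar u}{\partial r} + \frac{1}{r^2}\frac{\partial^2 \bar u}{\partial \theta^2} + \frac{\partial^2 \bar u}{\partial z^2}, \quad r > 0. \tag{20}
\]
\[
\frac{\partial}{\partial z}\ \text{by}\ \left(\cos\phi\,\frac{\partial}{\partial r} - \frac{\sin\phi}{r}\,\frac{\partial}{\partial \phi}\right).
\]
Applying these twice in succession to a function ū(r, φ, θ), and then adding the
results, gives for 0 < r, 0 < φ < π,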
EXERCISES
For Exercises 1 to 7, verify the corresponding identity (1)-(7) in Section 6A. For
Exercises 8 to 10, verify the corresponding identity (8)-(10) in Section 6A.
11. Prove that if v is a constant vector and x is not zero, then
\[
\nabla\times\frac{v\times x}{|x|} = \frac{v}{|x|} + \frac{v\cdot x}{|x|^3}\,x.
\]
In Exercises 12 and 13, assume x in ℝ³, and prove the equation for x ≠ 0.
12. $\nabla\left(\dfrac{1}{|x|}\right) = -\dfrac{x}{|x|^3}$
13. $\nabla^2\left(\dfrac{1}{|x|}\right) = 0$
14. Replace F in Equation 6 of the text by v × F where v is a constant vector in ℝ³.
Use this to prove Formula (14) at the end of Section 6B. [Hint: Use Formula (7) in
Section 6A, and Equation 8 in Section 4C of Chapter 1 to show that the dot product of
the two sides of Formula (14) with v are equal as in the proof of Formula (13).]
15. If T(x) is the steady-state temperature at a point x of an open set R in ℝ³, then
the flux of the temperature gradient across any smooth surface in R is zero. Use this
fact and Equation (12) to prove that a steady-state temperature function that is twice
continuously differentiable is harmonic, i.e., ∇²T = 0. [Hint: Suppose that
∇²T(x0) > 0. Prove that ∇²T(x) > 0 in some ball centered at x0.]
16. Use Green's first identity to prove that if u is a harmonic function on a region R
together with its smooth boundary ∂R in ℝ³, then the average value over ∂R of the
normal derivative of u must be zero.
17. Consider the Newtonian potential function N(x) = |x|⁻¹ and its associated gradient
field ∇N(x). (See Exercise 16.) Prove that N(x) can be interpreted as the work done in
moving a particle from ∞ to x along some smooth path through the field ∇N.
18. Show that if f(x, y) equals a function $\bar f(\sqrt{x^2 + y^2}) = \bar f(r)$, then
\[
\nabla^2 f(x, y) = \frac{\partial^2 \bar f(r)}{\partial r^2} + \frac{1}{r}\frac{\partial \bar f(r)}{\partial r}.
\]
19. Show that if f(x, y, z) equals a function $\bar f(\sqrt{x^2 + y^2 + z^2}) = \bar f(r)$, then
\[
\nabla^2 f(x, y, z) = \frac{\partial^2 \bar f(r)}{\partial r^2} + \frac{2}{r}\frac{\partial \bar f(r)}{\partial r}.
\]
20. Verify the formula for ∇²ū(r, θ) in text Example 4 by computing ū_r, ū_rr, and ū_θθ
from ū(r, θ) = u(r cos θ, r sin θ). This computation doesn't qualify as a derivation of
the polar form of ∇²u(x, y).
21. Verify that the cancellations claimed at the end of text Example 4 do occur.
Chapter 9 REVIEW
1. (a) Find a function f such that ∇f(x, y) = (3x²y, x³ + 3y²).
(b) Use your answer to part (a) to prove that
$\int_\gamma 3x^2 y\,dx + (x^3 + 3y^2)\,dy = 8$ for any path γ from (1, 1) to (1, 2).
2. Let γ be the closed curve consisting of line segments from (0, 0) to (1, 0), from
(1, 0) to (1, 2) and from (1, 2) back to (0, 0). Show that
(b) Prove that I(γ; β0) = (x0 + y0)/(1 + x0y0) for all piecewise smooth paths in the
first quadrant from (0, 0) to (x0, y0), using whatever method seems most convenient.
(c) Explain to what extent the results of parts (a) and (b) extend to other quadrants.
5. Let S be the closed surface that is the boundary of the solid region inside the
cylinder x² + y² = 4 and between the planes z = 0 and z = 2. Suppose S is positively
oriented, with normal vector pointing out at
\[
I(\gamma, \beta) = \int_\gamma \frac{1 + \beta y^2}{(1 + xy)^2}\,dx + \frac{1 + \beta x^2}{(1 + xy)^2}\,dy.
\]
(e) Prove that the flux of F across the sphere of radius a centered at the origin is
$4\pi a^{3-p}$; is this consistent with your answer to part (d)?
7. The circulation and flux of the vector field F(x, y) = (-x, -y) relative to the circle
x² + y² = 1 can be computed in several ways, some easier than others.
(a) Find the total circulation of F(x, y) around the circle x² + y² = 1 (relative to the
counterclockwise direction).
(b) Find the total flux of F across the circle x² + y² = 1, relative to outward-pointing
unit normal vectors.
8. Let F be given by F(x, y, z) = (x + y, y + z, z + x), and let
g(u, v) = (u cos v, u sin v, v) parametrize a helicoidal surface H.
(a) Compute curl F(x, y, z).
(b) Compute the normal vector g_u(u, v) × g_v(u, v).
(c) Compute the integral $\int_{H_1} \operatorname{curl} F\cdot dS$, where H1 is the part of the helicoid
corresponding to 0 ≤ u ≤ 1, 0 ≤ v ≤ 4π, representing two complete turns of a helicoid
of width 1.
9. Consider the family of 2-dimensional vector fields defined by
\[
F_\alpha(x, y) = \frac{-y}{(x^2 + y^2)^\alpha}\,\mathbf{i} + \frac{x}{(x^2 + y^2)^\alpha}\,\mathbf{j}, \quad (x, y) \ne (0, 0),
\]
where α is a positive constant. Note that curl F0(x, y) = 2k.
(a) Compute the scalar curl of F_α(x, y) and prove that it's zero if and only if α = 1,
in which case it's identically zero.
(b) What can you say, depending on α > 0, about the circulation of F_α around a smooth
closed curve that doesn't contain the origin?
(c) What can you say, depending on α, about the circulation of F_α around a circle
centered at the origin that encircles the origin once, counterclockwise?
10. Find a function f(x, y) such that ∇f(x, y) = (2xy + y³ + 1, x² + 3xy²). Explain why
you cannot find an f(x, y) such that ∇f(x, y) = (x² + 3xy², 2xy + y³ + 1).
11. Let R be a plane region with piecewise smooth boundary curve ∂R, oriented
counterclockwise. Prove that
\[
\int_{\partial R} x\,dy = -\int_{\partial R} y\,dx = A(R),
\quad\text{and}\quad
\int_{\partial R} x\,dx = \int_{\partial R} y\,dy = 0.
\]
13. The conical graph of z = 2√(x² + y²), x² + y² ≤ 1 has the hemispherical graph of
z = 2 + √(1 - x² - y²), x² + y² ≤ 1 placed over its top to form a surface S.
(a) Find parametrizations that orient each of the two parts of S such that the normals
point outward over the whole surface.
(b) Compute the total surface area of S and its enclosed volume.
(c) Find the flux of F(x, y, z) = (x, y, z) across S.
14. Let S be the portion of the sphere x² + y² + z² = 1 above the xy plane. Calculate
$\iint_S z\,d\sigma$ in two ways:
(a) Directly, as the integral of a function over a surface.
(b) By noting that
\[
\iint_S z\,d\sigma = \iint_S (0, 0, 1)\cdot(x, y, z)\,d\sigma = \iint_S \operatorname{curl} F\cdot n\,d\sigma,
\]
where the field F is F(x, y, z) = (-y/2, x/2, 0) and S is given an upward orientation,
and then applying Stokes's Theorem, either to change the surface of integration or to
convert to a line integral.
15. Let f and g be scalar functions of three variables whose second partial derivatives
are all continuous.
(a) Prove (∇f) × (∇g) = curl(f∇g).
(b) Let f(x, y, z) = x + y + z and g(x, y, z) = x² + y² - z². Compute
\[
\iint_S \bigl((\nabla f)\times(\nabla g)\bigr)\cdot n\,d\sigma,
\]
where S is the hemisphere x² + y² + z² = 1, z ≥ 0 and n is the upward directed normal.
[Hint: Think about the change of surface or line integral using Stokes's Theorem.]
16. (a) Find a formula for the function f: ℝ² → ℝ such that ∇f(x, y) = (sin y, x cos y).
(b) Use your answer to part (a) to compute $\int_\gamma \sin y\,dx + x\cos y\,dy$, where γ is any
path from (1, 1) to (1, 2).
17. Let γ be the counterclockwise path consisting of the part of the circle x² + y² = 1
lying in the first quadrant together with the segments 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1 on the
x- and y-axes. Use Green's Theorem to compute
\[
\int_\gamma xy\,dx + y\,dy.
\]
18. Let F : ℝ³ →
\[
\int_{\partial B} x\,dy\,dz, \qquad \int_{\partial B} y\,dz\,dx, \qquad \int_{\partial B} z\,dx\,dy.
\]
prove that
\[
\int_B v\cdot\nabla f(x)\,dV = \int_{\partial B} f(x)\,v\cdot dS.
\]
(b) Use part (a) to establish an equation for vector-valued integrals:
\[
\int_B \nabla f(x)\,dV = \int_{\partial B} f(x)\,dS.
\]
(c) What conclusion can you draw from part (b) if f is identically zero on ∂B?
26. Let f and g be twice continuously differentiable real-valued functions defined on
ℝ³, and let Δ denote the Laplace operator Δu = u_xx + u_yy + u_zz.
(a) Prove that Δ(fg) = fΔg + gΔf + 2∇f · ∇g.
(b) Prove that Δ(fg) = fΔg + gΔf if the level surfaces of f and g intersect only at
right angles.
CHAPTER 10
FIRST-ORDER DIFFERENTIAL EQUATIONS
y' = F(x, y)
as assigning a slope y' to a point (x, y). The assignment is usually represented
geometrically by drawing through the point with coordinates (x, y) a line segment
with slope y' = F(x, y). Just such an array of points and segments is shown in
Figure 10.1(b). A collection of points with directions attached is called a direction
field or slope field, and geometrically speaking the assignment of slopes to points is
the essence of the equation y' = F(x, y).
It's important to be clear about the distinction between direction fields and the
vector fields introduced in Chapter 6, Section 1. A picture of a direction field always
contains a 1-dimensional domain axis whose positive direction determines a direction
for the solution curves, while speed is equal to the slope of segments relative to this
axis. In a vector field we need some device such as an arrow point to indicate
direction, and we indicate speed by the length of a segment. A 1-dimensional vector
field has its arrows all on the same line, not a helpful picture, which is one reason
for resorting to direction fields here instead.
1A Plotting Direction Fields
When a thin film of fluid flows steadily over a plane surface, the particles of fluid trace
in the plane paths called flowlines. Figure 10.1(a) illustrates such a flow by showing
some of its flowlines. In practice, we might try to describe the flow by giving even
less information, namely, just some short line segments tangent to the flowlines at a
selection of points. Figure 10.1(b) shows some tangent segments, chosen from among
the tangents to the paths in Figure 10.1(a). Visually it's fairly easy to reconstruct the
significant features of Figure 10.1(a) from Figure 10.1(b); to do this graphically, we
can sketch curves through the selected points, making them appear to be tangent to
the segment through each point. A study of such a reconstruction is the geometric
theme of this chapter.
There are two natural ways to produce a sketch of a direction field. One way
is to draw tangents to flowlines. The other is to make the sketch associated with the
first-order differential equation y' = F(x, y) by drawing a short segment with slope
F(x, y) through the point (x, y). These two ways of looking at a direction field blend
together when we solve the differential equation. The reason is that a solution y(x)
satisfies
y' = F(x, y)
at the point (x, y) = (x, y(x)). In particular, the curves in Figure 10.1(a) are the
graphs of solutions coming from the direction field in Figure 10.1(b).
FIGURE 10.1 (a) (b)
Consider the differential equation
$$y' = -\frac{y}{x}, \quad \text{for } x \neq 0.$$
At each point in the xy-plane, except for points on the y-axis, where x = 0, the
equation specifies a numerical slope y'. We can make a table of some sample points
and slopes:
462 Chapter 10 First-Order Differential Equations
(x, y)      y' = -y/x
(1, 1)      -1
(1, 2)      -2
(2, 1)      -1/2
(-1, 2)     2
(-2, 2)     1
By plotting some points and at each drawing a short segment with the specified slope,
we get the picture of the direction field shown in Figure 10.2(a). The shape of the
curves tangent to the segments in Figure 10.2(a) is fairly easy to sketch, and some are
in the figure. Note in particular that the positive and negative x-axes, where y = 0,
are such curves, but that the vertical y-axis is excluded because of the restriction
that x ≠ 0.
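As an illustration, the table of sample slopes can be produced mechanically. The following Python sketch is not from the text; the names `slope`, `sample_points`, and `slope_table` are chosen here only for illustration, and the points are the ones tabulated above.

```python
def slope(x, y):
    """Slope F(x, y) = -y/x of the direction field; undefined on the y-axis."""
    if x == 0:
        raise ValueError("the slope is undefined for x = 0")
    return -y / x

# Sample points taken from the table in the text.
sample_points = [(1, 1), (1, 2), (2, 1), (-1, 2), (-2, 2)]
slope_table = [(point, slope(*point)) for point in sample_points]

for (x, y), m in slope_table:
    print(f"({x}, {y})  slope {m:g}")
```

Drawing a short segment with slope `m` through each point would give a rough version of the direction field; a plotting library could be used for that step.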
We can use calculus in this example to find formulas for the solution curves. We
multiply the given equation by x and rearrange to give
xy' + y = 0.
Treating y as a function of x, the product rule for differentiation shows that our
equation is the same as
(xy)' = 0.
But this means that the product xy must be a constant: xy = c. In other words,
$$y = \frac{c}{x}.$$
The graphs of y = c/x, for various choices of c, are just the curves tangent to the
segments of the direction field in Figure 10.2(a). In particular, c = 0 corresponds to
the x-axis except for x = 0, and c = 1 corresponds to the curves sketched in the
first and third quadrants. Finally, we can verify directly that given a constant c,
$$y = \frac{c}{x} \quad \text{satisfies} \quad y' = -\frac{y}{x}, \quad \text{for } x \neq 0.$$
FIGURE 10.2 (a) y' = -y/x. (b) y' = cos x.
The reason is that y' = -c/x^2, and on the other hand, -y/x = -c/x^2 also. Hence
$$y' = -\frac{c}{x^2} = -\frac{y}{x}.$$
The special case of the equation y' = F (x, y) in which F is independent of y takes
the form
y' = G(x).
We assume that G is a continuous function on some interval. To solve the equation,
we integrate both sides with respect to x, getting formally,
$$y = \int G(x)\,dx + C.$$
Here the indefinite integral stands for any function whose derivative is G. We know
that any two such integrals differ by at most an additive constant C. Each different
constant C gives a graph parallel to the others because the function
F(x, y) = G(x)
is independent of y: that is, direction segments lying on the same vertical line are
all parallel.
For example, the differential equation
$$y' = \cos x$$
has solutions
$$y = \int \cos x\,dx + c = \sin x + c.$$
The direction field generated by
y' = G(x) = cosx
is sketched in Figure 10.2(b) together with the particular solutions y = sin x + 1 and
y = sin x - 1.
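A quick numerical check, offered only as a sketch, confirms that every curve y = sin x + c has slope cos x, whatever the constant c. The helper names `solution` and `central_difference`, the step size, and the sample points are assumptions made here for illustration.

```python
import math

def solution(x, c):
    """Member y = sin x + c of the solution family of y' = cos x."""
    return math.sin(x) + c

def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Compare the approximate slope of each solution with cos x at a few points.
max_error = max(
    abs(central_difference(lambda s: solution(s, c), x) - math.cos(x))
    for c in (-1.0, 0.0, 1.0)
    for x in (-3.0, -1.0, 0.0, 0.5, 2.0)
)
print(max_error)
```

The error does not depend on c in any essential way, which reflects the fact that the solution graphs are vertical translates of one another.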
We find the particular solution whose graph passes through the point (x0, y0) = (1/2, 2)
by substituting these values into the solution formula y = c/x. The constant c is determined
by the equation
$$2 = \frac{c}{1/2},$$
so that c = 1. Thus the solution to the initial-value problem
$$y' = -\frac{y}{x}, \quad y\left(\tfrac{1}{2}\right) = 2$$
is
$$y = \frac{1}{x}, \quad x > 0.$$
The graph of this solution is shown in Figure 10.2(a). To find the solution curve
through an arbitrary preassigned point (x0, y0), with x0 ≠ 0, we make the substitution
$$y_0 = \frac{c}{x_0}$$
and solve for c, getting c = x0 y0.
The preceding examples all suggest that, with minor exceptions, through each
point of a direction field there is a unique solution curve for y' = F(x, y) and that
the solution extends without hindrance wherever the field is defined. The following
examples show that neither of these statements is true in general.
FIGURE 10.3 (a) y' = 0 for y < 0. (b) y' = 1 + y^2.
has two distinct solutions passing through (x0, y0) = (0, 0):
$$y(x) = 0, \quad \text{for } -\infty < x < \infty$$
and
$$y(x) = \begin{cases} 0, & -\infty < x < 0, \\ \dfrac{x^2}{4}, & 0 \le x < \infty. \end{cases}$$
The first solution has its graph coinciding with the x-axis for all x, whereas the graph
of the second coincides with the x-axis for x ≤ 0 and then assumes a parabolic shape,
shown in Figure 10.3(a). Thus a solution to the differential equation is not uniquely
determined by the requirement that its graph pass through (0, 0). For still more
solutions of this differential equation see Exercise 6.
The differential equation
$$y' = 1 + y^2$$
has the solution y = tan x with its graph passing through (x0, y0) = (0, 0). But the
solution tends to infinity discontinuously at x = ±π/2, despite having F(x, y) =
1 + y^2 well-behaved throughout the entire xy-plane. Figure 10.3(b) makes it clear
graphically why the solution can't be carried on continuously outside the interval
-π/2 < x < π/2.
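The blow-up can also be seen numerically. The sketch below, with names and sample points chosen here for illustration, checks that y = tan x satisfies y' = 1 + y^2 (using the known derivative 1/cos^2 x) and shows the values growing without bound as x approaches π/2.

```python
import math

def F(x, y):
    """Right-hand side of y' = 1 + y^2; note that it does not depend on x."""
    return 1 + y * y

def derivative_of_tan(x):
    """Exact derivative of tan x, namely 1 / cos^2 x."""
    return 1 / math.cos(x) ** 2

sample_xs = [0.0, 0.5, 1.0, 1.5, 1.57]   # approaching pi/2 = 1.5707...
values = [math.tan(x) for x in sample_xs]

# Relative mismatch between y' and F(x, y) along the solution y = tan x.
relative_residuals = [
    abs(derivative_of_tan(x) - F(x, math.tan(x))) / F(x, math.tan(x))
    for x in sample_xs
]

for x, y in zip(sample_xs, values):
    print(f"x = {x:<5}  tan x = {y:.4f}")
```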
The question of existence and uniqueness of solutions is taken care of for large
classes of differential equations by using the methods of the next three chapters.
We'll state without proof a theorem for the initial-value problem y' = F(x, y),
y(xo) = yo.
1.1 Existence and Uniqueness Theorem. Suppose the function F(x, y) and its
partial derivative F_y(x, y) are both continuous for a < x < b and α < y < β,
and that a < x0 < b and α < y0 < β. Then the initial-value problem y' =
F(x, y), y(x0) = y0 has a unique solution defined on some subinterval of a < x <
b; if in addition there is a constant B such that |F_y(x, y)| < B for all x in the
interval and for all real numbers y, then this solution will exist on the entire interval
a < x < b.
EXERCISES
to get an approximate value y_{k+1} at x_{k+1}. Setting x = x_{k+1} in the tangent line
equation and noting that (x_{k+1} - x_k) = h, gives
for x values between 0 and 1, with step size h = 0.01, might look like this:
At each stage the value p_k is the prediction and y_k is the correction. If h < 0
the approximations move from larger to smaller x values. For graphic output use
some form of PLOT instead of PRINT. Matlab, Maple, and Mathematica software is
available for doing the following exercises. Also Java applets DFDEM, IORDPLOT,
and IORD are at the Web site http://math.dartmouth.edu/~rewn/.
EXERCISES
SECTION 2 APPLICATIONS
One of Newton's many contributions to science was that it's useful to formulate
a differential equation to be solved for a physically interesting unknown and then
solve the equation. This apparently simple observation has had a profound influence
on science.
2A Direct Integration
We'll consider first some problems that are reducible to solving differential equations
of the form dy/dx = F(x). If F(x) is continuous on some interval, then all solutions
are
$$y = G(x) + C,$$
where C is an arbitrary constant and G'(x) = F(x) on the interval in question.
To satisfy an initial condition y(xo) = yo, solve for the constant C in the equation
y(xo) = G(xo) + C, getting C = y(xo) - G(xo). This routine for solving the initial-
value problem proves the special case of Theorem 1.1 of the previous section in
which F(x, y) = F(x) is a function of x alone. In geometric terms, we see that the
slopes of a continuously varying direction field generated by an equation y' = F(x)
determine a function y(x) on the interval of definition of F(x) whose graph satisfies
two conditions: (i) It passes through a given point (xo, yo) if xo is in the interval.
(ii) It is tangent to a direction segment at each of its points.
EXAMPLE 1 If F(x) = (1 + x^2)^{-1} for all real x, then all solutions of y' = F(x) have the form
$$y = \int F(x)\,dx + C = \int \frac{dx}{1 + x^2} + C = \arctan x + C.$$
l+x
We'll assume that the arctangent function is the principal branch, the branch for
which arctan 0 = 0. To satisfy the initial condition y(1) = π/2 we need
$$\frac{\pi}{2} = \arctan 1 + C = \frac{\pi}{4} + C,$$
so we take C = π/4, making the unique solution to the initial-value problem y =
arctan x + π/4.
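The routine C = y(x0) - G(x0) is easy to carry out numerically. In this sketch the function name `solve_constant` is hypothetical, and `math.atan` plays the role of the principal-branch antiderivative G.

```python
import math

def solve_constant(G, x0, y0):
    """Constant C for which y = G(x) + C satisfies y(x0) = y0."""
    return y0 - G(x0)

# Initial condition y(1) = pi/2 with G(x) = arctan x.
C = solve_constant(math.atan, 1.0, math.pi / 2)

def y(x):
    """Particular solution y = arctan x + C."""
    return math.atan(x) + C

print(C, math.pi / 4)
```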
FIGURE 10.5 (a) and (b): regions of the diagram labeled by the signs of v and y.
EXAMPLE 2 Let g be a constant approximation to the acceleration of gravity near the surface of
the earth. A projectile is fired straight up from the top of a building (y0 = 0) with
velocity v(0) = 1000 feet per second. If we choose to measure distance up from the
top of the building, as in Figure 10.5(b), then at time t ≥ 0 we have dv/dt = -g
since gravity acts to decrease velocity. Integrating dv/dt = -g with respect to t
using initial condition v(0) = 1000 and the estimate g = 32.2, gives
$$v = \frac{dy}{dt} = -gt + C_1 \approx -32.2t + 1000.$$
Integrating the velocity with initial condition y(0) = 0 to find y(t) gives
$$y(t) = -\tfrac{1}{2}gt^2 + v(0)t + C_2 \approx -16.1t^2 + 1000t.$$
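The two formulas are easy to evaluate. As an illustration that goes slightly beyond the example, the sketch below also locates the instant when v = 0, which is when the projectile is highest; the names `velocity`, `height`, `t_peak`, and `peak_height` are chosen here and do not come from the text.

```python
G_ACCEL = 32.2   # feet per second squared, the estimate used in the example
V0 = 1000.0      # initial velocity in feet per second

def velocity(t):
    """v(t) = -g t + v(0)."""
    return -G_ACCEL * t + V0

def height(t):
    """y(t) = -(1/2) g t^2 + v(0) t, measured up from the top of the building."""
    return -0.5 * G_ACCEL * t ** 2 + V0 * t

t_peak = V0 / G_ACCEL          # time at which v(t) = 0
peak_height = height(t_peak)   # equals v(0)^2 / (2 g)

print(f"peak at t = {t_peak:.2f} s, height = {peak_height:.1f} ft")
```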
2B Separation of Variables
We'll now consider more examples where first-order differential equations arise from
geometric or scientific assumptions. The equations are chosen so that we can solve
them by a method called separation of variables, illustrated in the next example.
EXAMPLE 3 It's often observed in biological studies that the rate of change dP/dt
of the size P(t) of a bacteria population at time t is very nearly proportional to
P(t). Expressing this proportionality in the form
$$\frac{dP}{dt} = kP, \qquad (1)$$
where k is a constant, gives a first-order differential equation for P. Because the
derivation of the differential equation depends on assumptions that may not be pre-
cisely true, we can expect a solution P(t) to be at best an approximation to the
true situation. It's our purpose here to study this approximation. Experience with the
exponential function allows us to guess one solution. If we let
$$P(t) = Ke^{kt},$$
we see that P'(t) = kP(t) for all real numbers t. In other words, P = Ke^{kt} is
a solution. If we had not been able to guess a solution, or if we wanted to try to
find still other solutions, we would have proceeded as follows. Assuming that the
population size is always positive, we can divide Equation (1) by P, getting
$$\frac{1}{P}\frac{dP}{dt} = k.$$
Next we integrate both sides of the equation with respect to t:
$$\int \frac{1}{P}\frac{dP}{dt}\,dt = \int k\,dt.$$
The integral on the left is ln P; that on the right is kt. Both integrals are determined
only to within an additive constant. Hence we can lump the constants together and
write
$$\ln P = kt + c,$$
so that P = e^c e^{kt} = P0 e^{kt}, where P0 = e^c.
Figure 10.6 shows the graph of P for, rather arbitrarily, k = 2 and various choices
of P0. The constant P0 is usually determined by observing that, for t = 0, we have
P0 = P(0), which is the size of the population at t = 0. If instead of P(0) we
happen to know P(t1) for some t1 > 0, then the equation
$$P(t_1) = P_0 e^{kt_1}$$
leads to
$$P_0 = P(t_1)e^{-kt_1}.$$
Hence
$$P(t) = P(t_1)e^{k(t - t_1)}$$
for all t > 0.
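The formula in terms of a later observation P(t1) can be sketched in a few lines. The function name `population` and the numerical values below are illustrative assumptions, apart from k = 2, which is the value used for the figure.

```python
import math

def population(t, k, t1, p_t1):
    """P(t) = P(t1) e^{k (t - t1)}, given one observation P(t1) = p_t1."""
    return p_t1 * math.exp(k * (t - t1))

# With k = 2 and the observation P(1) = e^2, the initial size P(0) should be 1.
k = 2.0
p0 = population(0.0, k, t1=1.0, p_t1=math.exp(2.0))
print(p0)
```

Evaluating at t = 0 recovers the constant P0, so the same function covers both ways of fixing the solution.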
A condition that requires a solution P(t) to satisfy an equation of the form P(to) =
Po is called an initial condition. The term comes from the interpretation of to in
applications as a starting time for an evolving process.
We could solve the differential equation in the previous example because we could
rewrite it as an equation between two functions, for each of which we could find an
indefinite integral. The typical equation of this type looks like
$$g(y)\frac{dy}{dx} = f(x), \qquad (2)$$
though this form isn't possible for all first-order differential equations. (See
Exercise 9.)

FIGURE 10.6 Curves labeled P0 = 2, P0 = 1, P0 = 0.5, and P0 = 0.1.

By assuming that y is some differentiable function of x, we can try
to find an indefinite integral with respect to x for each side, and so write
$$\int g(y)\frac{dy}{dx}\,dx = \int f(x)\,dx. \qquad (3)$$
relating y and x. There still remains the problem of solving this last equation for y
in terms of x.
The process outlined is usually called separation of variables because it involves
getting the x's on one side of the equation and the y's on the other. The whole matter
becomes simpler notationally if we cancel the dx's on the left side of Equation (3).
The resulting formal equation
$$\int g(y)\,dy = \int f(x)\,dx \qquad (5)$$
still leads to Equation (4) for the solution. The original Equation (2) is sometimes
written in the symmetric form
$$g(y)\,dy = f(x)\,dx,$$
which can be interpreted as either
$$g(y)\frac{dy}{dx} = f(x) \quad \text{or} \quad g(y) = f(x)\frac{dx}{dy}.$$
The differential equation
$$\frac{dy}{dx} = \frac{y}{x} \qquad (6)$$
has associated with it the direction field with slope y/x at the point (x, y). Since the
slope y/x is just the same as the slope of the line from (0, 0) to (x, y), the direction
field looks like the sketch shown in Figure 10.7. It appears that the solution curves
are radial lines from the origin. To prove this, we write the equation in either of the
two forms
$$\frac{1}{y}\frac{dy}{dx} = \frac{1}{x} \quad \text{or} \quad \frac{dy}{y} = \frac{dx}{x},$$
assuming both y ≠ 0 and x ≠ 0. Integration with respect to x on the left of the first
equation gives
$$\int \frac{1}{y}\frac{dy}{dx}\,dx = \int \frac{1}{x}\,dx,$$
or, formally,
$$\int \frac{dy}{y} = \int \frac{dx}{x}.$$
In either formulation, we find
$$\ln|y| = \ln|x| + c.$$
Taking the exponential of both sides gives
$$|y| = e^c|x|$$
or
$$y = \pm e^c x.$$

FIGURE 10.7
Since ec is always positive, and since y = 0 is a solution of Equation (6), a solution
formula for Equation (6) is
y = kx,
where k is any real number. In other words, the graphs of the solutions are lines
through the origin.
Suppose that a tank containing a chemical in solution is divided into two compart-
ments by a porous membrane. Suppose that the chemical in one compartment is
maintained at a fixed concentration C (e.g., in grams per liter) and let u(t) be the
concentration of the chemical in the other compartment at time t. It may sometimes
be determined experimentally that diffusion takes place across the dividing mem-
brane in such a way that the rate of change of the concentration u (t) is proportional
to the difference in concentrations.
That is,
$$\frac{du}{dt} = k(C - u),$$
where k > 0 is a constant of proportionality. Assuming that u ≠ C, we separate variables and integrate:
$$\int \frac{du}{C - u} = \int k\,dt.$$
Then
$$-\ln|C - u| = kt + c.$$
Taking exponentials gives
$$|C - u| = e^{-c}e^{-kt},$$
or
$$C - u = Ke^{-kt},$$
where K is now any nonzero constant. Finally,
$$u(t) = C - Ke^{-kt}.$$
To determine the constant K (remember that C is given at the start), we could, for
example, measure u(0). Setting t = 0 in the preceding solution formula then gives
$$u(0) = C - K.$$
Hence
$$u(t) = C - (C - u(0))e^{-kt}.$$
The constant k also could be determined experimentally by measuring u(t1) for some
t1 > 0 (see Exercise 19). The shape of the graph of u(t) is shown in Figure 10.8
for a single arbitrary choice of C and k and for values of u(0) that are relatively
larger and smaller than C. Indeed, the original differential equation u' = k(C - u),
with k > 0, shows that whenever C > u(t), then u'(t) > 0, so that u is increasing.
Similarly whenever C < u(t), we must have u'(t) < 0, so that u is decreasing. We
assumed above that u ≠ C; if u(0) = C then u(t) = C is a solution, so u is constant.

FIGURE 10.8
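The qualitative behavior just described can be checked with a short sketch. The function name `concentration` and the numbers C = 5, k = 0.3 are arbitrary choices made here; the formula is u(t) = C - (C - u(0)) e^{-kt}.

```python
import math

def concentration(t, C, k, u0):
    """u(t) = C - (C - u(0)) e^{-k t} for the equation u' = k (C - u)."""
    return C - (C - u0) * math.exp(-k * t)

C, k = 5.0, 0.3                      # illustrative values, not from the text
times = [0.0, 1.0, 2.0, 5.0, 10.0]

below = [concentration(t, C, k, u0=1.0) for t in times]   # starts below C
above = [concentration(t, C, k, u0=8.0) for t in times]   # starts above C
steady = [concentration(t, C, k, u0=C) for t in times]    # starts at C

print(below)
print(above)
```

The first list increases toward C, the second decreases toward C, and the third stays at C, in agreement with the sign analysis of u'.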
EXAMPLE 6 Chemical solutions in a tank, for example solutions of salt in water, are often subject
to inflow and outflow of a particular chemical at different rates. If S = S(t) is the
amount of chemical in the tank at time t, then
$$\frac{dS}{dt} = (\text{rate of inflow}) - (\text{rate of outflow}). \qquad (7)$$
As an example, suppose that a full 100-gallon tank contains 150 pounds of salt in
solution at time t = 0, that a salt solution with a concentration of 2 pounds per
gallon is being added at a rate of 2 gallons per minute, and that thoroughly mixed
salt solution is flowing out of the tank at a rate of 2 gallons per minute. Thus salt is
flowing in at a constant rate of 4 pounds per minute and is overflowing at a rate of
2S(t)/100 pounds per minute at time t. The differential equation
$$\frac{dS}{dt} = 4 - \frac{2S}{100}$$
then expresses the general relation of Equation (7). To solve the equation for S(t),
we write it as
$$\frac{dS}{S - 200} = -\frac{2\,dt}{100}.$$
Integration gives
$$\ln|S - 200| = -\frac{t}{50} + c$$
or
$$S(t) = 200 \pm e^c e^{-t/50}.$$
But S(0) = 150 by assumption, so ±e^c must be equal to -50. The amount of salt
at time t is then
$$S(t) = 200 - 50e^{-t/50}.$$

FIGURE 10.9

$$\frac{dS}{dt} = 2\left(2 - \frac{t}{2}\right) - \left(2 - \frac{t}{2}\right)\frac{S}{100}$$
$$\frac{dS}{S - 200} = -\frac{(2 - (t/2))\,dt}{100}$$

FIGURE 10.10

Removing the absolute value gives
$$S(t) = 200 \pm e^c e^{-(t/50) + (t^2/400)},$$
and, as before, the assumption that S(0) = 150 shows that ±e^c = -50 is correct.
Thus
$$S(t) = 200 - 50e^{-(t/50) + (t^2/400)}$$
is the solution to the problem for 0 ≤ t ≤ 4. For t > 4, the simple differential
equation
$$\frac{dS}{dt} = 0$$
means that there is no further change in the amount of salt. The correct value of the
solution
$$S(t) = C, \quad 4 < t,$$
comes from setting t = 4 in the formula that holds for 0 ≤ t ≤ 4. We find that
$$S(10) = 200 - 50e^{-0.04} \approx 152.$$
The graph of S, for both t ≤ 4 and t > 4, is shown in Figure 10.10(b). There is no
single elementary formula to represent the function, so we write
$$S(t) = \begin{cases} 200 - 50e^{-(t/50) + (t^2/400)}, & 0 \le t \le 4, \\ 200 - 50e^{-0.04}, & 4 < t. \end{cases}$$
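The piecewise formula translates directly into code. In this sketch the function names are chosen here, and the right-hand side (2 - t/2)(2 - S/100) used in the check is the differential equation as inferred from the solution formula, so it should be read as an assumption rather than a quotation.

```python
import math

def salt(t):
    """Pounds of salt S(t): the exponential formula up to t = 4, constant afterwards."""
    if t < 0:
        raise ValueError("the formula is stated for t >= 0")
    t_eff = min(t, 4.0)   # no further change once t exceeds 4
    return 200.0 - 50.0 * math.exp(-t_eff / 50.0 + t_eff ** 2 / 400.0)

def rhs(t, S):
    """Assumed right-hand side for 0 <= t <= 4: flow rate (2 - t/2) in and out."""
    return (2.0 - t / 2.0) * (2.0 - S / 100.0)

def residual(t, h=1e-5):
    """Difference between a numerical derivative of salt and rhs at time t."""
    slope = (salt(t + h) - salt(t - h)) / (2 * h)
    return slope - rhs(t, salt(t))

print(salt(0.0), salt(4.0), salt(10.0))
```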
$$\frac{d^2r}{dt^2} = -\frac{GM}{r^2}.$$
If we let v = dr/dt denote the radial speed, then by the chain rule,
$$\frac{d^2r}{dt^2} = \frac{dv}{dt} = \frac{dv}{dr}\frac{dr}{dt} = v\frac{dv}{dr}.$$
The differential equation then becomes
$$v\frac{dv}{dr} = -\frac{GM}{r^2}.$$
Integration of both sides with respect to r gives
$$\int_{v_0}^{v} v\,dv = -GM\int_{r_0}^{r} \frac{dr}{r^2}$$
or
$$\frac{v^2}{2} - \frac{v_0^2}{2} = \frac{GM}{r} - \frac{GM}{r_0},$$
where v0 is the radial speed at distance r0 from the planet. This relation between
speed and distance enables us to determine the escape speed of the satellite, namely
the speed v0 that must be attained at distance r0 so that the speed v always remains
positive thereafter. We must have
$$v^2 = v_0^2 + \frac{2GM}{r} - \frac{2GM}{r_0} > 0.$$
Since GM/r → 0 as r → ∞, the only way the inequality can hold is to have
$$v_0^2 - \frac{2GM}{r_0} > 0.$$
The critical escape speed that must be exceeded at distance r0 is thus
$$v_0 = \sqrt{\frac{2GM}{r_0}}.$$

FIGURE 10.11
The analysis presented here ignores the possibility that the planet moves under the
influence of the satellite; practically speaking this is fine if the satellite has negligible
mass compared with the planet's mass. Also, considerations involving kinetic and
potential energy show that a formula of this type for escape speed is correct even
if the satellite is on a nonradial path. We treat both of these issues in Chapter 12,
Section 3, where we allow for two bodies having commensurate masses as would
be the case in a double star system.
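To get a feeling for the size of the escape speed, here is a sketch that evaluates the formula with rough values for the earth. The constants `GM_EARTH` and `R_EARTH` are approximate figures supplied here as assumptions, in metric units; they are not taken from the text.

```python
import math

GM_EARTH = 3.986e14   # assumed value of G*M for the earth, in m^3/s^2
R_EARTH = 6.371e6     # assumed mean radius of the earth, in meters

def escape_speed(GM, r0):
    """Critical speed sqrt(2 G M / r0) at distance r0 from the planet's center."""
    return math.sqrt(2.0 * GM / r0)

def speed_squared(r, v0, GM, r0):
    """v^2 = v0^2 + 2GM/r - 2GM/r0 at distance r, from the integrated equation."""
    return v0 ** 2 + 2.0 * GM / r - 2.0 * GM / r0

v_escape = escape_speed(GM_EARTH, R_EARTH)
print(f"escape speed at the surface: about {v_escape:.0f} m/s")

# Launching at 90 percent of the critical speed: v^2 turns negative far out,
# which means the satellite stops and falls back before reaching that distance.
far_away = 1.0e12
print(speed_squared(far_away, 0.9 * v_escape, GM_EARTH, R_EARTH) < 0)
```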
EXERCISES
In Exercises 1 to 10, solve each differential equation by direct integration, and find the particular solution that satisfies the associated initial condition by determining one or more constants of integration.
1. y' = x(1 - x), y(0) = 1
2. ds/dt = (t + 1)^2, s(1) = 2
3. y' = x/(1 - x^2), y(0) = 1
4. du/dv = v^2 + 1, u(-1) = 1
5. y'' = sin x, y(0) = 1, y'(0) = 1
6. y''' = 1, y(0) = y'(0) = y''(0) = 0
7. dz/dt = te^t, z(0) = 1
8. y' = arctan x, y(0) = 0
9. d^2x/dt^2 = e^t, x(0) = 1, dx/dt(0) = 0
10. y'''' = x, y(0) = y''(0) = 0, y'(1) = y'''(1) = 1
11. A projectile is fired up from ground level with an initial velocity of 5000 feet per second. What is the maximum altitude attained, and how long does it take to get there, assuming g = 32 feet per second per second?
12. A weight is dropped from 5000 feet above ground. How long does it take to reach the ground, and with what final velocity does it hit? Assume g = 32 ft./sec^2.
13. Suppose the two objects described in Exercises 11 and 12 are released at the same time and are aimed directly at each other.
(a) How long after release do they meet, and at what height above ground?
(b) What initial velocity should the projectile be given so that the two objects meet 2500 feet above ground?
14. A projectile is fired up from ground level so that its maximum height will be 5000 feet. What is its initial velocity?
15. A weight is thrown down from 5000 feet above ground so as to reach the ground in 10 seconds. What is the velocity of the throw?
16. Suppose the objects described in the previous two exercises are sent on their way at the same time and are aimed directly at each other.
(a) About how long after release do they meet, and at what height above ground?
(b) What initial velocity should the projectile be given so that the two objects meet 2500 feet above ground?
In Exercises 17 to 20, solve the differential equations, and then find a particular solution that satisfies the given additional condition. Verify by substitution that your solution does satisfy the differential equation.
17. dy/dt = 2y, y(0) = 2
18. dy/dt = 2ty, y(0) = 2
19. y' = x/y^2, y(1) = 0
20. dy/dx = -x/y, y(1) = 1
21. (a) Suppose that a spherical ball of dry ice evaporates in such a way that the rate of evaporation dV/dt is always proportional to the radius r of the ball. Use V = (4/3)πr^3 to show that the first-order differential equation satisfied by r as a function of t is of the form
dr/dt = k/r,
where k is a negative constant.
(b) Solve the differential equation in part (a), and use the observed measurements that at time t = 0 the radius of the ball is 1 inch, whereas 1 hour later the radius is 1/2 inch, to determine a particular solution as well as the constant k.
(c) How long does it take the ball to evaporate completely starting with a radius of 1 inch?
22. Psychological studies of stimulus and response often attempt to treat these as numerical variables s and r related by an equation of the form r = f(s). It's sometimes hypothesized that f satisfies a differential equation of the form
dr/ds = k r^n / s, with k > 0.
Which of the two hypotheses on the exponent n, n = 0 or n = 1, is consistent with the following table of experimental values?
r      s
0.5    1
1      2
3      6
where v = v(t) is the velocity of the object at time t. Use integration to derive the following relations.
(a) v(t) = gt + v0, where v0 is the velocity at time 0.
(b) s(t) = (1/2)gt^2 + v0 t + s0, where s = s(t) is the distance at time t of the object from the reference point s = 0.
31. An object dropped near the earth's surface falls distance s(t) = (1/2)gt^2 in time t. In particular, s0 = s(0) = 0 and v0 = s'(0) = 0.
(a) Show that s = s(t) satisfies the first-order differential equation
$$\frac{ds}{dt} = \sqrt{2gs}.$$
(b) Show that the differential equation in part (a) has each member of the one-parameter family
$$s = \left(\sqrt{\tfrac{g}{2}}\,t + c\right)^2 = \tfrac{1}{2}gt^2 + \sqrt{2g}\,ct + c^2$$
as a solution.
(c) Show that the solution in part (b) satisfies v0^2 = 2g s0.
32. The general solution to the falling object problem treated in the two previous problems is
$$s = \tfrac{1}{2}gt^2 + v_0 t + s_0,$$
where v0 is initial velocity and s0 is initial displacement.
(a) Show that s = s(t) satisfies the first-order differential equation
$$\frac{ds}{dt} = \sqrt{v_0^2 + 2g(s - s_0)}.$$
[Hint: Solve for t in terms of both s and ds/dt.]
(b) Show that the expression v0^2 + 2g(s - s0) under the radical is always nonnegative, given our assumptions on s. (Hint: When does that expression reach its minimum as a function of t?)
33. Flow of liquid from a tank. A cylindrical tank with cross-sectional area A has an outlet hole in its side near the bottom. If h = h(t) is the height of an ideal fluid above the outlet at time t, and a is the area of the outlet hole, then V(t), the remaining fluid volume at time t, satisfies Torricelli's equation
$$\frac{dV}{dt} = -a\sqrt{2gh}.$$
An intuitive justification for the equation is to note that it depends on having the outlet velocity equal to the free-fall velocity of a drop of fluid from height h, as derived in the previous exercise; thus -dV/dt equals area a times outlet velocity √(2gh). (A thoroughly scientific justification depends on principles of fluid mechanics.) Thus for an ideal fluid, the equation takes the form
$$\frac{dh}{dt} = -\frac{a}{A}\sqrt{2gh}.$$
(a) Show that the Torricelli equation has a solution of the form h(t) = (bt + c)^2. Then determine what the constants b and c must be.
(b) Use your answer to part (a) to find out how long it would take for the fluid height above the outlet to drop from h0 to 0. In particular, estimate how long it would take to empty a full cylindrical tank with diameter 10 feet, height 20 feet, and circular outlet at the bottom with diameter 6 inches.
34. The first-order nonlinear equation dy/dx = e^{-x^2 - y} can in principle be solved by using separation of variables.
(a) Try to find an effective solution formula for the initial-value problem with initial condition y(0) = -1, and explain the difficulty you encounter.
(b) Make a computer graphics plot of the solution to the initial-value problem in part (a).
y' = F(x, y)
has a particularly important special case, namely the one in which
$$F(x, y) = -g(x)y + f(x)$$
for some functions g and f. The resulting differential equation is usually written in
the normalized form
$$y' + g(x)y = f(x). \qquad (1)$$
For reasons explained at the end of the section the equation is called a first-order
linear differential equation.
$$\ln|y| = -G(x) + c,$$
where G is an indefinite integral of g and c is a constant. Taking the exponential of
both sides gives
$$|y| = e^c e^{-G(x)},$$
and removing the absolute value allows us to replace the positive constant e^c by an
arbitrary nonzero constant K:
$$y = Ke^{-G(x)} = Ke^{-\int g(x)\,dx}.$$
$$M(x) = e^{\int g(x)\,dx},$$
The whole point is that the left side can now be written as the derivative of e^{∫g(x)dx} y,
because, by the product rule, applied to the factors e^{∫g(x)dx} and y,
$$\frac{d}{dx}\left(e^{\int g(x)\,dx}\,y\right) = e^{\int g(x)\,dx}\,y' + g(x)e^{\int g(x)\,dx}\,y.$$
Thus we have rewritten the standard linear differential equation in the form
$$\frac{d}{dx}\left(e^{\int g(x)\,dx}\,y\right) = e^{\int g(x)\,dx} f(x);$$
it remains only to integrate both sides with respect to x and then solve for y. The
integrating factor M(x) is sometimes called an exponential multiplier.
$$y' = xy + x,$$
$$y' - xy = x.$$
$$e^{-(1/2)x^2}\,y = \int x e^{-(1/2)x^2}\,dx = -e^{-(1/2)x^2} + C.$$
Then multiplying by e^{(1/2)x^2} gives
$$y = -1 + Ce^{(1/2)x^2}$$
for the solution. Figure 10.12 shows the graph of the particular solution satisfying
y(0) = 0.
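A solution formula obtained with an exponential multiplier can always be checked by substitution. The sketch below does this numerically for y = -1 + C e^{x^2/2} and the equation y' - xy = x; the function names, the step size, and the sample values are choices made here for illustration.

```python
import math

def y(x, C=1.0):
    """Candidate solution y = -1 + C e^{x^2 / 2}; C = 1 gives y(0) = 0."""
    return -1.0 + C * math.exp(0.5 * x * x)

def residual(x, C=1.0, h=1e-6):
    """y' - x y - x, with y' approximated by a symmetric difference quotient."""
    dy = (y(x + h, C) - y(x - h, C)) / (2 * h)
    return dy - x * y(x, C) - x

worst = max(
    abs(residual(x, C))
    for C in (-3.0, 1.0, 2.5)
    for x in (-2.0, -0.5, 0.0, 1.0, 2.0)
)
print(y(0.0), worst)
```

A residual near zero for several values of C is consistent with the formula describing a whole one-parameter family of solutions.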
3B Applications
$$C(t) = 1 - e^{-t/100}.$$
The solution is kept thoroughly mixed and the excess is drawn off, also at a rate of
3 gallons per minute. Let S(t) stand for the amount of chemical in the tank at time
t ≥ 0. We have rate of inflow minus rate of outflow equal to
$$\frac{dS}{dt}(t) = 3C(t) - 3\,\frac{S(t)}{100}$$
$$10 = -50 + K, \quad \text{or} \quad K = 60.$$
Thus the desired particular solution is
EXAMPLE 4 Let f(t) be the concentration of a chemical solution on one side of a porous membrane, and let u(t) be the concentration on the other side. Suppose that diffusion
takes place through the membrane in such a way that
$$\frac{du}{dt} = 2(f(t) - u),$$
that is, so that the rate of change of u is proportional to the difference in concentrations. If u(0) = 3, and f is maintained so that
$$f(t) = \begin{cases} 4, & 0 \le t < 10, \\ 1, & 10 \le t, \end{cases}$$
then the equation in normalized form is
$$\frac{du}{dt} + 2u = 2f(t).$$
An exponential multiplier M is given by
$$M(t) = e^{2t},$$
so that
$$\frac{d}{dt}(e^{2t}u) = 2e^{2t}f(t).$$
Integrating from 0 to s gives
$$\int_0^s \frac{d}{dt}\left(e^{2t}u(t)\right)dt = \int_0^s 2e^{2t}f(t)\,dt$$
or
$$e^{2s}u(s) - u(0) = 2\int_0^s e^{2t}f(t)\,dt.$$
Then
$$u(s) = 3e^{-2s} + 2e^{-2s}\int_0^s e^{2t}f(t)\,dt.$$
Using the integral with limits is convenient here because we can write, according to
the definition of f,
$$\int_0^s e^{2t}f(t)\,dt = \begin{cases} 2e^{2s} - 2, & 0 \le s \le 10, \\ \tfrac{3}{2}e^{20} - 2 + \tfrac{1}{2}e^{2s}, & 10 \le s. \end{cases}$$
Hence
$$u(s) = 3e^{-2s} + \begin{cases} 4 - 4e^{-2s}, & 0 \le s \le 10, \\ (3e^{20} - 4)e^{-2s} + 1, & 10 \le s, \end{cases}$$
$$= \begin{cases} 4 - e^{-2s}, & 0 \le s \le 10, \\ (3e^{20} - 1)e^{-2s} + 1, & 10 \le s. \end{cases}$$
EXAMPLE 5 Newton's law of cooling asserts that the rate of change of the surface temperature
u(t) of an object is proportional to the difference between u(t) and the
temperature f(t) of the surrounding medium. Thus
$$\frac{du}{dt} = k(f - u), \quad k > 0.$$
The constant k must be positive to be consistent with knowing that if f(t) > u(t)
then du/dt > 0. We can't solve this differential equation by separation of variables
unless f is constant, but in any case the equation is linear, with the form
$$\frac{du}{dt} + ku = kf(t).$$
One important problem is to figure out how to control the temperature u(t) in some
desired way by choosing f(t) properly. Our solution method leads to
$$\frac{d}{dt}(e^{kt}u) = ke^{kt}f(t).$$
Hence
$$e^{kt}u(t) = k\int e^{kt}f(t)\,dt + C.$$
It's convenient to choose the indefinite integral to have the value 0 when t = 0 so
we can write it as a definite integral. We get
$$u(t) = u(0)e^{-kt} + ke^{-kt}\int_0^t e^{ks}f(s)\,ds,$$
where we have replaced C by u(0). If we now take f(t) to be a constant and call it
f0, the solution becomes
$$u(t) = f_0 + (u(0) - f_0)e^{-kt}.$$
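When f(t) is not constant the integral usually has to be approximated. The sketch below evaluates u(t) = e^{-kt}(u(0) + k ∫ e^{ks} f(s) ds), which follows from d/dt(e^{kt} u) = k e^{kt} f(t), using the trapezoid rule, and compares it with the closed form for a constant medium temperature. The function names, the step count, and the numbers (k = 0.1, u(0) = 70, f0 = 30) are illustrative assumptions.

```python
import math

def temperature(t, k, u0, f, steps=2000):
    """Approximate u(t) for u' = k (f(t) - u), u(0) = u0, by the trapezoid rule."""
    if t == 0:
        return u0
    h = t / steps
    # Trapezoid rule for the integral of e^{k s} f(s) over 0 <= s <= t.
    total = 0.5 * (f(0.0) + math.exp(k * t) * f(t))
    for i in range(1, steps):
        s = i * h
        total += math.exp(k * s) * f(s)
    integral = h * total
    return math.exp(-k * t) * (u0 + k * integral)

def constant_medium(t, k, u0, f0):
    """Closed form u(t) = f0 + (u0 - f0) e^{-k t} when f(t) = f0 is constant."""
    return f0 + (u0 - f0) * math.exp(-k * t)

approx = temperature(15.0, 0.1, 70.0, lambda s: 30.0)
exact = constant_medium(15.0, 0.1, 70.0, 30.0)
print(approx, exact)
```

Passing a different function for `f`, for example a step function, gives the temperature under a controlled surrounding medium without any further derivation.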
EXERCISES
7. 2 dy/dx = xy, y(1) = 0
8. t dP/dt + P = t^3, P(1) = 0
9. Salt solution enters a 100-gallon tank of initially pure water from two different sources. One source provides water containing 1 pound of salt per gallon at a rate of 2 gallons per minute. A second source provides 3 gallons of salt solution per minute at a varying concentration C(t) = 2e^{-2t}, measured in pounds of salt per gallon. Assume that the contents of the tank are kept thoroughly mixed at all times and that solution is drawn off at a rate of 5 gallons per minute. Find the amount of salt in the tank at an arbitrary time t > 0.
10. The current i(t) in an electric circuit satisfies the differential equation
L di/dt + Ri = E(t),
where L and R are positive constants, called inductance and resistance, respectively, and E(t) is an applied voltage. Show that
13. f(t) = e^{-2t}, for t ≥ 0, with u(0) = 10
14. f(t) = f0 (constant) for 0 ≤ t ≤ 1, f(t) = 0 for 1 < t, with u(0) = 5
15. A container of milk at 70° F is placed in a mixture of ice and brine constantly at 30° F. Assume the validity of Newton's law described in Example 5 and that the milk has reached 40° after 15 minutes.
(a) Find an approximate value for the constant k in Newton's law.
(b) When will the milk reach 35°?
16. Suppose that a metal bar initially at 300° F is immersed in a water bath at 100° F for 30 minutes and then is transferred to another water bath at 50° F. Assume the validity of Newton's law described in Example 5 of the text.
(a) What will the temperature of the bar be after an additional 30 minutes, assuming the cooling coefficient for the iron in water is k = 0.1?
(b) Suppose that initially the bar is cooled for 30 minutes in air at 100°, for which the cooling coefficient is only k = 0.07 and is then immersed in water for 30 minutes. What will the temperature of the bar be at the end of the hour?
17. Verify directly that if y1(x) is a solution of
18. Verify directly that if y1(x) and y2(x) are solutions of the respective equations
y' + gy = f1 and y' + gy = f2,
then c1 y1 + c2 y2 is, for every pair of constants c1, c2, a solution of
y' + gy = c1 f1 + c2 f2.
19. An initially full 100 cubic-foot tank starts with 10 pounds of salt dissolved in water. At a certain time additional salt solution begins to enter the tank at a rate of 1 cubic foot per hour, while thoroughly mixed solution runs out a drain at the same rate. However, the amount of salt in the added solution decreases at a constant rate from 1 pound per cubic foot initially all the way down to zero pounds per cubic foot at the end of one hour.
(a) Find the amount of salt in the tank at a given time during the first hour. In particular, about how much salt will be in the tank at the end of one hour?
(b) If pure water continues to run into the tank after the first hour at the rate of 1 cubic foot per hour, how much more time will it take for the total amount of salt in the tank to reach 5 pounds?
20. Two 100-gallon mixing tanks are initially full of pure water. A solution containing one pound of salt per gallon of water pours into the first tank at the rate of one gallon per minute. Thoroughly mixed solution runs from the first tank to the second at the rate of one gallon per minute, where it too is thoroughly mixed in before draining away at 1 gallon per minute. There will always be at least as much salt in the first tank as in the second; find the maximum amount of this excess.
21. A 100-gallon tank is initially full of pure water. Salt solution is added for 10 minutes at the rate of 1 gallon per minute with salt content of the added solution increasing linearly over the 10 minutes from 1 pound per gallon to 2 pounds per gallon. Thoroughly mixed salt solution is drawn off at the rate of one gallon per minute. Estimate the amount of salt in the tank at the end of the 10 minutes.
22. A 100-gallon mixing vat is initially half-full of pure water. Two gallons of salt solution per minute at a concentration of one pound of salt per gallon begin to flow in, while one gallon per minute of mixed solution flows out. Estimate the amount of salt in the vat at the moment it begins to overflow.
23. A 100-gallon tank initially contains 50 gallons of water with a total of 10 pounds of salt dissolved in it. A drain is opened in the bottom that is regulated so as to let out 1 gallon of solution per minute. Simultaneously, salt solution begins to be added at 2 gallons per minute with a concentration of 2 pounds per gallon.
(a) How much salt is in the tank when it first becomes full and starts to overflow?
(b) If the process is allowed to continue with overflow at an additional outflow of 1 gallon per minute, what is the upper limit for the total amount of salt in the tank? Estimate the additional time after the start of overflow for the amount of salt in the tank to reach 175 pounds.
24. The first-order linear equation dy/dx - (sin x)y = e^{-x^2} can in principle be solved by using an exponential integrating factor.
(a) Try to find an effective solution formula for the initial-value problem with initial condition y(0) = -1, and explain the difficulty you encounter.
(b) Make a computer graphics plot of the solution to the initial-value problem of part (a).
Chapter 10 REVIEW
In Exercises 1 to 14, find all functions that satisfy the differential equation.
1. x(dy/dx) + y - x = 0
2. dy/dx = 1/(y(1 - x)^2)
3. dx/dt = tx + e^t
4. (1 + x)y' + y = cos x
5. y^3 y' = (y^4 + 1)e^x
6. dy/dx = 4x^3 y - y, y(1) = 1
7. xy' + (2x - 3)y = x^4
8. y' = xy + y
9. t(dx/dt) = -2x + t^3, x(2) = 1
10. t(dx/dt) = 1
11. dx/dt = -3x^2
12. dy/dt + ty = 1
13. dx/dt = (x + t)^2 [Hint: Let x + t = y.]
14. dy/dt = cos^2 y
15. Consider the differential equation dy/dx = e^{x-y}.
(a) In what region of the (x, y)-plane are all solutions strictly increasing?
(b) In what region of the (x, y)-plane are all solutions concave up?
(c) Is the line y = x a solution graph?
(d) Is the line y = x an isocline?
(e) Solve the differential equation by separation of variables. Can you get the information asked for above directly from your solution formula? Which approach seems simpler?
16. Consider the differential equation dy/dx = e^{x-y}.
(a) What conclusions can you draw from Theorem 1.1 on existence and uniqueness about solutions of this equation?
(b) Can a solution graph passing through the point (x, y) = (0, 1) cross the line y = x? Explain your reasoning.
17. Consider the family of linear equations y' + ay = c, with a, c constant, a ≠ 0.
(a) Show (i) that the isoclines of the direction field of this equation are horizontal lines and (ii) that every such line is an isocline.
(b) Sketch the direction field associated with the differential equation y' + 2y = 1.
18. Early experiments with objects dropped from rest above the earth led to the conjecture that after an object had fallen distance s its velocity would be proportional to s. Under the contemporary assumption that the acceleration of gravity is constant, the velocity is proportional to √s.
(a) Is the early conjecture consistent with initial velocity zero? Explain your reasoning.
(b) Is the early conjecture consistent with positive initial velocity? How would acceleration be related to s under this assumption?
19. A 100-gallon mixing vat is initially full of pure water, whereupon two gallons of salt solution per minute is added, each gallon containing 1 pound of salt. Water evaporates from the tank at the rate of one gallon per minute, and the excess solution overflows into a drain. Find the amount of salt in the tank at time t under the given assumptions and also under the altered assumption that the tank initially contains 50 pounds of salt in solution.
20. Coffee cooling. We are presented two choices for cooling one cup of coffee over a period of 10 minutes: (i) let the coffee cool by itself for 10 minutes and then add cream, or (ii) add the same amount of cream right away and then allow the mixture to cool for 10 minutes. Assume that mixing quantity p of liquid at temperature T_0 and quantity q at temperature T_1 instantly results in quantity p + q with average temperature given by (pT_0 + qT_1)/(p + q). Which method will end up with cooler coffee?
CHAPTER 11
SECOND-ORDER EQUATIONS
y'' - 2y = x, (1)
y'' - 3y = e^x, (2)
y'' - 3y' + 2y = f(x), (3)
EXAMPLE 1 We can write the differential equation y' - ry = 0, where r is a constant, as
y' = ry, (4)
which specifies that the rate of change of y is proportional to the value of y for
every value of the variable x. This type of equation appears in Chapter 10, Section 2
for describing population growth. To find solutions we use repeatedly the formula
y' = re^{rx} for the derivative of y = e^{rx}. It follows that Equation (4) is satisfied if
we take y = e^{rx}. More generally, if c is an arbitrary constant, then Equation (4) is
satisfied if we take
y = ce^{rx}, (5)
because the c will cancel on both sides. Equation (5) gives the most general solution
to (4); observe that we can write (5) in the form
e^{-rx} y = c.
Differentiating both sides gives e^{-rx} y' - re^{-rx} y = 0.
Dividing by e^{-rx} leaves y' - ry = 0, which is the given Equation (4) rewritten. But
now we can reverse these steps, supposing that y is some solution. We start with
y' - ry = 0
and then multiply by e^{-rx} to get
e^{-rx} y' - re^{-rx} y = 0, or (e^{-rx} y)' = 0.
Integrating gives
e^{-rx} y = c,
where c is a constant of integration. Multiplying both sides by e^{rx} shows that y must
be of the form
y = ce^{rx}.
Thus we have shown that ce^{rx} is the most general solution of y' = ry in the sense
that all particular solutions arise from specifying the value of c.
The method used in the preceding example consists of multiplying the expression
y' + ay by e^{ax} and then recognizing the result as the derivative (e^{ax} y)' = e^{ax} y' +
ae^{ax} y. We'll use this exponential multiplier e^{ax} repeatedly in what follows.
y = -½e^x + ce^{3x}
for the most general solution. We can verify directly that we have indeed found
some solutions, one for each value of c. What we have shown additionally is that
all solutions must be of the form -½e^x + ce^{3x}.
(D + 2)y = Dy + 2y
= y' + 2y,
(D^2 - 1)y = D^2 y - y
= D(Dy) - y = y'' - y,
(D^2 + a)y = y'' + ay,
(D^2 + aD + b)y = y'' + ay' + by,
(D + s)(D + t)y = (D + s)(y' + ty)
= y'' + (s + t)y' + sty
= (D^2 + (s + t)D + st)y.
The last computation shows that for constants s and t
y'' + ay' + by = 0;
Equation (3) at the beginning of this chapter is similar to this, with a = -3, b = 2.
Writing the equation using differential operators gives
(D^2 + aD + b)y = 0.
If we try to find a solution of the form y = e^{rx}, then Dy = re^{rx} and D^2 y = r^2 e^{rx},
so e^{rx} is a solution if and only if r^2 e^{rx} + are^{rx} + be^{rx} = 0. Then dividing by e^{rx}
gives the condition on r
r^2 + ar + b = 0,
called the characteristic equation of the given differential equation.
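The characteristic-equation test can be tried numerically. The following is a minimal Python sketch, assuming the sample values a = -3, b = 2 mentioned above; the function names and the finite-difference check are illustrative choices, not part of the text.

```python
import cmath
import math

def characteristic_roots(a, b):
    """Return the two roots of r^2 + a*r + b = 0 (complex in general)."""
    disc = cmath.sqrt(a * a - 4 * b)
    return (-a + disc) / 2, (-a - disc) / 2

def residual(y, a, b, x, h=1e-4):
    """Finite-difference estimate of y'' + a*y' + b*y at the point x."""
    d1 = (y(x + h) - y(x - h)) / (2 * h)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / (h * h)
    return d2 + a * d1 + b * y(x)

a, b = -3, 2  # sample coefficients (assumed for illustration)
for r in characteristic_roots(a, b):
    # Both roots are real for this sample, so e^{rx} is real-valued.
    y = lambda x, r=r: math.exp(r.real * x)
    print(r, abs(residual(y, a, b, 0.3)))  # residual is close to zero
```

The residual is only approximately zero because the derivatives are estimated by central differences.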
r^2 - 3r + 2 = 0 or (r - 1)(r - 2) = 0.
The roots are r_1 = 1 and r_2 = 2, so there are solutions
y_1(x) = e^x, y_2(x) = e^{2x}.
The operator L = D^2 - 3D + 2 is linear, so if both L(y_1) = 0 and L(y_2) = 0 then
we also have L(c_1 y_1 + c_2 y_2) = 0, and additional solutions are given by
y(x) = c_1 e^x + c_2 e^{2x}.
1B Factoring Operators
Our general method of solution will be to factor an operator into factors of the form
(D+s) and (D+t), and then apply the exponential multiplier method of Examples 1
and 2 repeatedly.
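(D^2 + 5D + 6)y = 0.
Next we try to factor the operator. We see that
D^2 + 5D + 6 = (D + 3)(D + 2),
so the equation can be written (D + 3)(D + 2)y = 0. Setting
(D + 2)y = u
for the moment, we substitute u into the previous equation and arrive at
(D + 3)u = 0.
But we can solve this first-order linear equation for u if we multiply through by e^{3x}.
We get
e^{3x} Du + 3e^{3x} u = 0
or
D(e^{3x} u) = 0.
Therefore
e^{3x} u = c_1,
for some constant c_1, and so
u = (D + 2)y = c_1 e^{-3x}.
Multiplying by e^{2x} gives D(e^{2x} y) = c_1 e^{-x}, so e^{2x} y = -c_1 e^{-x} + c_2 and
y = -c_1 e^{-3x} + c_2 e^{-2x}.
Since the constants c_1 and c_2 are arbitrary anyway, we can change the sign on the
first one to get
y = c_1 e^{-3x} + c_2 e^{-2x}.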
y" + ay' + by = 0,
with unequal characteristic roots r_1, r_2 has its most general solution of the form
y = c_1 e^{r_1 x} + c_2 e^{r_2 x}.
If r_1 = r_2, then e^{r_2 x} is replaced in the general solution formula by xe^{r_1 x} to get
y = c_1 xe^{r_1 x} + c_2 e^{r_1 x}.
The constants c_1, c_2 are uniquely determined by prescribing initial conditions y(x_0) =
y_0, y'(x_0) = z_0.
Proof. In operator form, the differential equation is
(D - r_1)(D - r_2)y = 0.
We assume y(x) is a solution and show that it has the form claimed in the theorem.
Set z(x) = (D - r_2)y(x) and substitute z for (D - r_2)y in the previous equation.
Now solve the resulting equation (D - r_1)z = 0 to get
z = c_1 e^{r_1 x}, that is, (D - r_2)y = c_1 e^{r_1 x}.
Multiplying by e^{-r_2 x} gives D(e^{-r_2 x} y) = c_1 e^{(r_1 - r_2)x}. If r_1 ≠ r_2, integrating gives
e^{-r_2 x} y = (c_1/(r_1 - r_2)) e^{(r_1 - r_2)x} + c_2.
For neatness, we can rename the constant c_1/(r_1 - r_2) and call it c_1 to get
(*) y = c_1 e^{r_1 x} + c_2 e^{r_2 x}.
If r_1 = r_2, then D(e^{-r_2 x} y) = c_1. In that case,
(**) y = c_1 xe^{r_1 x} + c_2 e^{r_1 x}.
Finally, note that once c_1 is determined from y(x_0) and y'(x_0) as noted previously,
the constant c_2 can in any case be determined from the value y(x_0) alone; just solve
the appropriate equation (*) or (**) for c_2 with x = x_0. •
EXAMPLE 5 Suppose we want a solution graph of y'' + 5y' + 6y = 0 that passes through (0, 1)
with slope 2. In other words we want the solution for which y(0) = 1 and y'(0) = 2.
Since the characteristic equation of the differential equation is r^2 + 5r + 6 = 0, the
characteristic roots are r = -2 and r = -3. All solutions thus have the form
y(x) = c_1 e^{-2x} + c_2 e^{-3x}.
The initial conditions require c_1 + c_2 = 1 and -2c_1 - 3c_2 = 2, so c_1 = 5 and c_2 = -4. Thus
y(x) = 5e^{-2x} - 4e^{-3x}, y'(x) = -10e^{-2x} + 12e^{-3x}.
[FIGURE 11.1: solution graph, with axes x and y.]
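The constants in an initial-value problem of this kind come from a 2-by-2 linear system. Here is a small Python sketch, assuming distinct real roots and initial conditions posed at x = 0; the function name `constants_at_zero` is hypothetical and used only for illustration.

```python
from fractions import Fraction

def constants_at_zero(r1, r2, y0, z0):
    """Solve c1 + c2 = y0 and r1*c1 + r2*c2 = z0 for distinct roots r1, r2.

    These are the conditions y(0) = y0, y'(0) = z0 applied to
    y(x) = c1*e^{r1 x} + c2*e^{r2 x}.
    """
    r1, r2, y0, z0 = (Fraction(v) for v in (r1, r2, y0, z0))
    if r1 == r2:
        raise ValueError("roots must be distinct for this form of solution")
    c1 = (z0 - r2 * y0) / (r1 - r2)
    c2 = (r1 * y0 - z0) / (r1 - r2)
    return c1, c2

# Roots -2 and -3 with y(0) = 1, y'(0) = 2.
c1, c2 = constants_at_zero(-2, -3, 1, 2)
print(c1, c2)  # 5 -4
```

Exact rational arithmetic is used so that the result can be compared directly with a hand computation.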
Suppose that in the previous example we didn't want to satisfy initial conditions at
a single point, but wanted instead a solution graph passing through two given points
in the xy-plane. Such conditions applied to a single solution at more than one point
are called boundary conditions. The problem of finding a solution of a differential
equation that also satisfies boundary conditions is called a boundary-value problem.
Boundary-value problems are theoretically more complicated than initial-value
problems and don't always have solutions. (See Exercises 10 and 11 in the next
section.) Nevertheless some boundary-value problems are quite simple computation-
ally. The next example is of this kind.
Our aim is to find the solution to y'' - 4y = 0 that satisfies boundary conditions
y(0) = 1 and y(1) = 2. Thus in this example we need work only with the expression
y(x) = c_1 e^{2x} + c_2 e^{-2x}
for the general solution itself, since values of y'(x) aren't involved. The resulting
equations for the coefficients are
c_1 + c_2 = 1,
c_1 e^2 + c_2 e^{-2} = 2.
Though it's not guaranteed by Theorem 1.1, these equations turn out here to have a
unique solution, namely
y(x) = ((2 - e^{-2})/(e^2 - e^{-2})) e^{2x} + ((e^2 - 2)/(e^2 - e^{-2})) e^{-2x}.
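A boundary-value computation like this one can be checked numerically. The sketch below assumes a general solution of the form c1*e^{r1 x} + c2*e^{r2 x} and solves the two boundary equations by Cramer's rule; the function name and argument order are illustrative assumptions.

```python
import math

def boundary_constants(r1, r2, x1, y1, x2, y2):
    """Find c1, c2 with c1*e^{r1 x} + c2*e^{r2 x} equal to y1 at x1 and y2 at x2."""
    a11, a12 = math.exp(r1 * x1), math.exp(r2 * x1)
    a21, a22 = math.exp(r1 * x2), math.exp(r2 * x2)
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("boundary equations do not determine c1, c2 uniquely")
    c1 = (y1 * a22 - a12 * y2) / det
    c2 = (a11 * y2 - a21 * y1) / det
    return c1, c2

# y'' - 4y = 0 has roots 2 and -2; boundary conditions y(0) = 1, y(1) = 2.
c1, c2 = boundary_constants(2, -2, 0, 1, 1, 2)
print(c1, c2)
```

A zero determinant signals the situation mentioned above, in which a boundary-value problem fails to have a unique solution.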
EXERCISES
In Exercises 1 to 6, with D = d/dx, compute
1. (D + 1)e^{-2x}
2. (D^2 + 1)e^x
4. (D^2 + D - 1) sin x
5. (D^2 + 1)x cos x
6. (D^2 - 1)xe^{-ix}
[Hint: What characteristic roots go with each solution?]
In Exercises 21 to 26, put each of the linear differential equations in the operator form (aD^2 + bD + c)y = 0. Then factor the operator, e.g. D^2 - 1 = (D - 1)(D + 1).
ax^2 + bx + c = 0.
But to exploit fully the analogy between these two kinds of equation we need the
complex exponential function, defined for purely imaginary numbers ix, with x
real, by
e^{ix} = cos x + i sin x.
Our motivation for this definition comes from Equations 2.1 and 2.2 below. See also
Exercise 40.
The absolute value of a complex number α + iβ is |α + iβ| = √(α^2 + β^2), and
equals its distance from the complex number 0. Figure 11.2 shows that in e^{ix} we
can interpret x as an angle. The absolute value |e^{ix}| equals 1 for all x because
|e^{ix}| = √(cos^2 x + sin^2 x) = 1.
[FIGURE 11.2: the point e^{ix} = cos x + i sin x on the unit circle, with horizontal coordinate cos x and vertical coordinate sin x.]
These equations justify using the exponential notation: The function e^{ix} behaves very
much like the real-valued exponential e^x, for which e^x e^{x'} = e^{x+x'} and 1/e^x = e^{-x}.
In addition to its algebraic simplicity, another reason for using the complex expo-
nential function is the simplicity of the formulas for its derivative and integral. To
differentiate or integrate a complex-valued function u(x) + iv(x) with respect to the
real variable x, we simply differentiate or integrate the real and imaginary parts. By
definition,
d/dx (u(x) + iv(x)) = du/dx (x) + i dv/dx (x),
and
∫ (u(x) + iv(x)) dx = ∫ u(x) dx + i ∫ v(x) dx.
Applying the first formula to e^{ix} = cos x + i sin x gives d/dx e^{ix} = -sin x + i cos x = ie^{ix}.
Similarly,
∫ e^{ix} dx = (1/i) e^{ix} + c,
where c may be a real or complex constant. These are analogous to the formulas for
the derivative and integral of e^{ax} when a is real. More generally, we can define
e^{(α+iβ)x} = e^{αx}(cos βx + i sin βx)
and compute
2.1 d/dx e^{(α+iβ)x} = (α + iβ) e^{(α+iβ)x}
and
2.2 ∫ e^{(α+iβ)x} dx = (1/(α + iβ)) e^{(α+iβ)x} + c, α + iβ ≠ 0.
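Formula 2.1 can be checked numerically before it is proved. The sketch below uses Python's `cmath` module and a central-difference quotient; the sample values of α, β, and x are arbitrary choices made for illustration.

```python
import cmath

def derivative_gap(alpha, beta, x, h=1e-6):
    """Size of the difference between a numerical derivative of e^{(alpha+i*beta)x}
    and the value (alpha + i*beta) * e^{(alpha+i*beta)x} predicted by Formula 2.1."""
    r = complex(alpha, beta)
    f = lambda t: cmath.exp(r * t)
    numeric = (f(x + h) - f(x - h)) / (2 * h)
    return abs(numeric - r * f(x))

print(derivative_gap(-1.0, 2.0, 0.5))  # small, limited by finite-difference error
print(abs(cmath.exp(1j * 0.7)))        # absolute value of e^{ix}; equals 1 up to rounding
```

A small gap is evidence for the formula at the sampled point, not a proof of it.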
These computations are left as exercises. We are now in a position to discuss the
differential equation
(D^2 + aD + b)y = 0
when its factored form (D - r_1)(D - r_2)y = 0
contains complex numbers r_1 and r_2. We'll see that the usual techniques, as discussed in Section 1, still apply. The exponential multiplier method goes over formally
unchanged because of Equation 2.1; we have
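EXAMPLE 1 Consider the differential equation y'' + y = 0. We write the equation in operator form,
(D^2 + 1)y = 0,
and factor the operator to get (D - i)(D + i)y = 0. Setting
(D + i)y = u, (1)
we have (D - i)u = 0, and multiplying by e^{-ix} gives
D(e^{-ix} u) = 0.
We integrate both sides with respect to x to get
e^{-ix} u = c_1 or u = c_1 e^{ix}.
Substituting this result for u into Equation (1) gives
(D + i)y = c_1 e^{ix}
or
D(e^{ix} y) = c_1 e^{2ix}.
Integrating, solving for y, and renaming the constant c_1/(2i) as c_1 gives
y = c_1 e^{ix} + c_2 e^{-ix} = (c_1 + c_2) cos x + i(c_1 - c_2) sin x.
To simplify the solution we can set d_1 = (c_1 + c_2) and d_2 = i(c_1 - c_2). This involves
no change in generality in the constants because for given d_1 and d_2 we can solve
for c_1 and c_2. Solving for c_1 and c_2, we find
c_1 = (d_1 - id_2)/2, c_2 = (d_1 + id_2)/2.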
y = e^{αx}(c_1 e^{iβx} + c_2 e^{-iβx})
= e^{αx}[c_1(cos βx + i sin βx) + c_2(cos βx - i sin βx)]
= e^{αx}[(c_1 + c_2) cos βx + i(c_1 - c_2) sin βx]
= d_1 e^{αx} cos βx + d_2 e^{αx} sin βx.
This is a form of the solution that is often used in practice, so we include it in the
statement of the following Theorem 2.3. The proof is formally the same as that of
Theorem 1.1 in the previous section, so we omit it. The only difference here is that
we can interpret the solutions as being complex-valued, though the case of real roots
is automatically included also.
y = c_1 e^{r_1 x} + c_2 e^{r_2 x}, r_1 ≠ r_2,
y = c_1 xe^{r_1 x} + c_2 e^{r_1 x}, r_1 = r_2,
where r_1, r_2 are the roots of r^2 + ar + b = 0. If a and b are real numbers with
a^2 - 4b < 0, we can write r_1 = α + iβ, r_2 = α - iβ. The general solution can then
be written
y = c_1 e^{αx} cos βx + c_2 e^{αx} sin βx.
Initial conditions y(x_0) = y_0, y'(x_0) = z_0 can always be satisfied by a unique choice
of c_1 and c_2.
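The three cases in the theorem can be organized by the sign of the discriminant. The following Python sketch, written for an equation Ay'' + By' + Cy = 0 with real coefficients, returns a text description of the general solution; the function name and the output format are illustrative assumptions.

```python
import math

def solution_form(A, B, C):
    """Describe the general solution of A*y'' + B*y' + C*y = 0 (A nonzero, real coefficients)."""
    if A == 0:
        raise ValueError("A must be nonzero for a second-order equation")
    disc = B * B - 4 * A * C
    if disc > 0:  # real, unequal roots
        s = math.sqrt(disc)
        r1, r2 = (-B + s) / (2 * A), (-B - s) / (2 * A)
        return f"c1*exp({r1:g}x) + c2*exp({r2:g}x)"
    if disc == 0:  # equal real roots
        r = -B / (2 * A)
        return f"c1*x*exp({r:g}x) + c2*exp({r:g}x)"
    # complex conjugate roots alpha +/- i*beta
    alpha = -B / (2 * A)
    beta = math.sqrt(-disc) / (2 * abs(A))
    return f"exp({alpha:g}x)*(c1*cos({beta:g}x) + c2*sin({beta:g}x))"

print(solution_form(1, 5, 6))  # c1*exp(-2x) + c2*exp(-3x)
print(solution_form(1, 2, 1))  # c1*x*exp(-1x) + c2*exp(-1x)
print(solution_form(1, 2, 5))  # exp(-1x)*(c1*cos(2x) + c2*sin(2x))
```

With floating-point coefficients the test `disc == 0` is fragile, so a tolerance would be needed in practice.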
The equation y'' + ω^2 y = 0, with ω^2 > 0, has characteristic equation r^2 + ω^2 = 0 with
characteristic roots r = ±iω. The associated solutions to the differential equation are
y = c_1 cos ωx + c_2 sin ωx.
Functions of this form are called harmonic oscillations because of the role they play
in the analysis of sound waves, and the differential equation is called a harmonic
oscillator equation.
Ay'' + By' + Cy = 0
according to relations among the constants A, B, and C. The characteristic equation is
Ar^2 + Br + C = 0
with roots
r_1, r_2 = (-B ± √(B^2 - 4AC))/(2A).
If B^2 - 4AC > 0, the roots are real and unequal, so the solutions are
y = c_1 e^{r_1 x} + c_2 e^{r_2 x}.
If B^2 - 4AC = 0, the roots are equal and real, so solutions are all of the form
y = c_1 xe^{r_1 x} + c_2 e^{r_1 x}.
But since e^{rx} ≠ 0 for all complex numbers rx, we can divide by e^{rx} to get
r^2 + 2r + 5 = 0.
The polynomial on the left is just the characteristic polynomial of the differential
equation, and its roots are
r_1 = -1 + 2i, r_2 = -1 - 2i.
It follows that
y_1(x) = e^{(-1+2i)x} = e^{-x}(cos 2x + i sin 2x)
and
y_2(x) = e^{(-1-2i)x} = e^{-x}(cos 2x - i sin 2x)
are complex-valued solutions of the differential equation. Hence the linear combination
y(x) = c_1 y_1(x) + c_2 y_2(x) = e^{-x}[(c_1 + c_2) cos 2x + i(c_1 - c_2) sin 2x]
is also a solution. As c_1 and c_2 range over the complex numbers,
the combinations c_1 + c_2 and i(c_1 - c_2) assume arbitrary values, in particular, real
values. Thus the solutions we have found have the form
y(x) = d_1 e^{-x} cos 2x + d_2 e^{-x} sin 2x.
EXERCISES
In Exercises 1 to 4, show that each of the given complex numbers has absolute value 1. Then find a real number x such that the complex number has the form e^{ix} = cos x + i sin x; for example,
(√3 + i)/2 = cos(π/6) + i sin(π/6) = e^{iπ/6}.
1. i
2. (1 + i)/√2
3. (1 - i)/√2
4. (-√3 - i)/2
In Exercises 5 to 8, for each pair d_1, d_2 of real numbers given, find complex numbers c_1, c_2 such that
c_1 e^{ix} + c_2 e^{-ix} = d_1 cos x + d_2 sin x.
5. d_1 = 1, d_2 = 0
6. d_1 = 4, d_2 = -2
7. d_1 = 0, d_2 = π
8. d_1 = 1, d_2 = 1
9. Recall that a function is periodic, with period p, if f(x + p) = f(x) for all x in the domain of f.
(a) Show that e^{ix} has period 2kπ if k is an integer.
(b) Show that e^{iβx} is periodic for β real, and find the smallest positive period if β ≠ 0.
10. Show that e^{-iβx} = cos βx - i sin βx. What properties of cos and sin are used here?
Solve each of the differential equations 11 to 14 by factoring the differential operator associated with it and then successively solving a pair of first-order linear equations.
11. y'' + y = 1
12. y'' + 2y' + 2y = 0
13. y'' + 2y = 0
14. y'' + y' = x
Find the roots of the characteristic equation of each of the differential equations 15 to 22. Then write the general solution of the differential equation, replacing complex exponentials by e^{αx} cos βx and e^{αx} sin βx where it's appropriate. Finally determine the constants of integration so the given initial conditions will be satisfied.
15. y'' + 2y = 0, y(0) = 0, y'(0) = 1
16. 2y'' + 3y' = 0, y(0) = 1, y'(0) = 0
17. y'' - 2y' + 2y = 0, y(π) = 0, y'(π) = 0
18. y'' - y' + y = 0, y(0) = 2, y'(0) = -1
19. 2y'' + y' - y = 0, y(0) = 0, y'(0) = 2
20. y'' + y' = 2y, y(0) = 0, y(1) = 1
21. 2y'' + y' + y = 0, y(0) = 0, y'(0) = 0
22. 3y'' - y' + y = 0, y(0) = 0, y(1) = 0
In Exercises 23 to 30, find a second-order differential equation y'' + ay' + by = 0 that has the given solution. [Hint: What are the characteristic roots associated with each solution?]
23. sin 2x
24. e^{2x} sin 2x
25. e^x cos 2x
26. cos 2x
27. cos(x/2)
28. xe^{2x}
29. sin 3x - cos 3x
30. x - 7
In Exercises 31 to 33, we deal with the issue that a constant-coefficient differential equation with complex characteristic roots may fail to have a unique solution if we impose certain critically chosen boundary conditions.
31. Show that the boundary-value problem y'' + y = 0, y(0) = 1, y(π) = 1 has no solution. [Hint: You know all solutions of y'' + y = 0.]
32. Show that the boundary-value problem y'' + y = 0, y(0) = 1, y(2π) = 1 has infinitely many solutions.
33. Show that if the constant-coefficient equation y'' + ay' + by = 0 has real characteristic roots, then the associated boundary-value problem y(x_1) = y_1, y(x_2) = y_2 always has a unique solution if x_1 ≠ x_2.
34. (a) Show that c_1 cos βx + c_2 sin βx also has the form A cos(βx - φ), where A = √(c_1^2 + c_2^2) and φ satisfies cos φ = c_1/A and sin φ = c_2/A.
(b) The result of part (a) is useful because it shows that c_1 cos βx + c_2 sin βx has a graph that is the same as that of cos βx shifted by a phase angle φ and multiplied by an amplitude A. Sketch the graph of cos 2x + √3 sin 2x by first finding φ and A.
35. (a) Let c_1 and c_2 be real numbers. Show that y(x) = c_1 cos βx + c_2 sin βx can also be written as A sin(βx + θ), where A = √(c_1^2 + c_2^2) is the amplitude of y(x) and θ satisfies sin θ = c_1/A and cos θ = c_2/A.
(b) The result of part (a) says that
Complex-valued differentiable functions f(x) = u(x) + iv(x) and g(x) = s(x) + it(x) obey the same basic rules relative to differentiation that real-valued functions do. In Exercises 41 to 44, use the corresponding relations for real-valued functions to show that the following formulas hold on an interval a < x < b on which both f, g and f/g are differentiable complex-valued functions.
41. (f + g)' = f' + g'
42. (cf)' = cf', c constant
43. (fg)' = fg' + f'g
2B Higher-Order Equations
While most of the differential equations that arise directly in applications have order 1 or
2, understanding higher-order equations is technically useful for solving certain second-
order equations in Section 3 and for solving systems of equations in Chapter 12. There-
fore it's useful to record an extension of Theorem 1.1 as follows. The proof continues
step-by-step as in the proof of Theorem 1.1 and uses no new ideas so we omit it.
2.4 Theorem. The differential equation
(D - r_1)(D - r_2) ··· (D - r_n)y = 0,
with characteristic roots r_k all different has its most general solution of the form
y = c_1 e^{r_1 x} + c_2 e^{r_2 x} + ··· + c_n e^{r_n x}.
If some r_k are equal, say r_1 = r_2 = ··· = r_m, then e^{r_2 x}, e^{r_3 x}, ..., e^{r_m x} are
replaced in the general solution formula by xe^{r_1 x}, x^2 e^{r_1 x}, ..., x^{m-1} e^{r_1 x} respectively. The constants c_1, c_2, ..., c_n are uniquely determined by prescribing values
y(x_0), y'(x_0), ..., y^{(n-1)}(x_0) of the solution and its derivatives at a single point x_0.
The key to applying Theorem 2.4 is finding the characteristic roots of a differential equation
y^{(n)} + a_{n-1} y^{(n-1)} + ··· + a_1 y' + a_0 y = 0,
that is, finding the roots of the purely algebraic characteristic equation
r^n + a_{n-1} r^{n-1} + ··· + a_1 r + a_0 = 0.
Although there are general formulas for the roots if n is 3 or 4, these are awkward
to use and we'll rely in our examples on equations that reduce to solving quadratic
equations.
r(r^2 - 4r + 4) = r(r - 2)^2 = 0.
The roots are 0 and 2, where 2 is a double root. According to Theorem 2.4 the
general solution to the differential equation is a linear combination of the solutions
e^{0x} = 1, e^{2x} and xe^{2x}. Thus the solution is y = c_1 + c_2 e^{2x} + c_3 xe^{2x}.
It will be useful for us to apply the idea behind Theorem 2.4 in reverse order,
starting with solutions and arriving at a differential equation that has those solutions.
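Going from prescribed characteristic roots back to an equation amounts to expanding a product of factors (D - r). Here is a minimal Python sketch of that expansion; the function name `poly_from_roots` and the coefficient ordering (highest power first) are illustrative choices.

```python
def poly_from_roots(roots):
    """Expand (D - r1)(D - r2)...(D - rn) into coefficients, highest power of D first."""
    coeffs = [1]
    for r in roots:
        # Multiply the current polynomial by (D - r).
        new = coeffs + [0]
        for i, c in enumerate(coeffs):
            new[i + 1] -= r * c
        coeffs = new
    return coeffs

# Roots 0, 2, 2 correspond to the solutions 1, e^{2x}, x e^{2x}.
print(poly_from_roots([0, 2, 2]))  # [1, -4, 4, 0], i.e. D^3 - 4D^2 + 4D
```

Repeated roots are simply listed as many times as their multiplicity.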
2C Independent Solutions
The substitution method used in the preceding example doesn't by itself show that the
solution formula obtained gives the most general solution; this is true and it follows
from Theorem 2.3. Knowing about the solutions to the nth-order constant-coefficient
equation is useful in Section 3 for understanding some special types of second-order
nonhomogeneous equations y" + ay' +by= f(x). Otherwise it's mainly 4th-order
equations that arise directly in applications. We restate Theorem 2.4 to take account
of the trigonometric form that solutions may take if the constant coefficients in the
differential equation are all real.
2.5 Theorem. The nth-order constant-coefficient equation L(y) = 0 has for its
general solution a sum of constant multiples
y(x) = c_1 y_1(x) + c_2 y_2(x) + ··· + c_n y_n(x).
If r_1, ..., r_n are the roots of the characteristic equation r^n + a_{n-1} r^{n-1} + ··· + a_1 r +
a_0 = 0, the terms y_k(x) in the solution can each be written in the form x^l e^{r_k x},
l = 0, 1, ..., m - 1, where m is the multiplicity of the root r_k. If roots α + iβ and
α - iβ occur in complex conjugate pairs, then the corresponding pairs of exponential
solutions are equivalent to
x^l e^{αx} cos βx, x^l e^{αx} sin βx.
The constants of integration in the solution formulas we've just been dealing with
appear frequently in the next section. An expression of the form
c_1 y_1(x) + c_2 y_2(x) + ··· + c_n y_n(x)
is called a linear combination of y_1, y_2, ..., y_n with coefficients c_1, c_2, ..., c_n.
Alternatively, the numbers c_k may be regarded as parameters, and the linear combination displayed above is an example of an n-parameter family of functions.
Initial conditions y(0) = 0, y'(0) = 1, y''(0) = 2, y'''(0) = 1 impose conditions on
the constants c_k. For example,
c_1 + c_2 + c_3 = 0
c_1 - c_2 + c_4 = 1
c_1 + c_2 - c_3 = 2
c_1 - c_2 - c_4 = 1.
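A system like the one above can be solved mechanically by elimination. The sketch below is a small Gauss-Jordan solver in exact rational arithmetic; the function name is hypothetical. The four equations are consistent with a general solution of the form c1*e^x + c2*e^{-x} + c3*cos x + c4*sin x, but that form is an assumption here, since the lines that would confirm it are missing from this excerpt.

```python
from fractions import Fraction

def solve_linear(A, b):
    """Solve the square system A c = b exactly by Gauss-Jordan elimination."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a nonzero entry in this column (StopIteration if singular).
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[n] for row in M]

A = [[1, 1, 1, 0],
     [1, -1, 0, 1],
     [1, 1, -1, 0],
     [1, -1, 0, -1]]
b = [0, 1, 2, 1]
print([int(c) for c in solve_linear(A, b)])  # [1, 0, -1, 0]
```

Under the assumed solution form, these values would give y = e^x - cos x.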
column under a vertical compressive force. (The constant λ = P/p depends on the
structure of the column and on the vertical load P applied to it.) The characteristic
equation is r^4 + λr^2 = 0, or r^2(r^2 + λ) = 0. With λ > 0, the roots are r_1 = r_2 = 0
and r_3 = i√λ, r_4 = -i√λ. The general solution is
y = c_1 + c_2 x + c_3 cos √λ x + c_4 sin √λ x.
EXAMPLE The fourth degree, or quartic, equation r^4 - 13r^2 + 36 = 0 is also a quadratic equation
in r^2, with solutions r^2 = (13 ± 5)/2. Since r^2 = 4 or r^2 = 9, the four distinct roots
are r = ±2, ±3. Hence the solutions to
It will be useful in Section 3 to apply the ideas behind Theorem 2.5 in reverse
order, starting with solutions and arriving at a differential equation that has those
solutions.
(D^2 + 1)(D^2 + 4)y = 0.
The function y = sin 2x is a solution, because (D^2 + 4) sin 2x = 0. Since the operator
factors may appear in any order, and (D^2 + 1) cos x = 0, both functions are solutions
of the differential equation. (Linear combinations y = c_1 cos x + c_2 sin x + c_3 cos 2x +
c_4 sin 2x constitute the general solution.)
2.6 Corollary. The solutions x^l e^{r_k x} listed in Theorem 2.5, including those with
real form x^l e^{αx} cos βx and x^l e^{αx} sin βx, are linearly independent.
Proof. Suppose a linear combination y(x) of these n solutions is identically zero:
EXERCISES
In Exercises 1 to 10, find the general solution to the differential equation.
of initial conditions. Find the correct values for the constants c_k so the conditions will be satisfied.
0, then we'll use boundary conditions y(0) = y(L) = 0, y'(0) = 0, and y''(L) = 0. Imagine the beam extended beyond x = L and bending with an inflection at x = L.
(a) Solve the differential equation by four successive integrations, and use the boundary conditions to show that the beam's shape is described by the graph of
y(x) = (P/48)(2x^4 - 5Lx^3 + 3L^2 x^2).
(b) Show that the graph of y(x) has an inflection on 0 < x < L. What is the maximum downward vertical deflection from level 0 on 0 < x < L?
(c) Make a sketch that shows the qualitative features of the graph of y(x) for 0 ≤ x ≤ L.
52. Rotating shaft. A differential equation for the lateral displacement y = y(x) at distance x from one end of a uniform rotating shaft is
y^{(4)} - λy = 0,
where the constant λ > 0 is proportional to the speed of rotation.
(a) Show that the most general solution to the differential equation may be written
y = c_1 cosh λ^{1/4}x + c_2 sinh λ^{1/4}x + c_3 cos λ^{1/4}x + c_4 sin λ^{1/4}x,
where cosh u = (e^u + e^{-u})/2 and sinh u = (e^u - e^{-u})/2.
(b) Let λ = n^4 π^4 where n is a positive integer, and find solutions of the differential equation subject to boundary conditions y(0) = y''(0) = 0, y(1) = y''(1) = 0.
(c) Sketch the graphs of the solutions found in part (b) for n = 1, 2, 3.
(d) Show that if λ is not of the form prescribed in part (b) then the only solution to the problem posed there is the identically zero solution.
53. Prove that a set {y_1(x), y_2(x), ..., y_n(x)} of functions defined on an interval is linearly independent if and only if no one of them is a linear combination, using constant coefficients, of any remaining functions in the set.
L(y) = 0,
and for a given function f we can also consider the nonhomogeneous equation
L(y) = f.
The associated homogeneous equation is the special case of the nonhomogeneous
equation obtained by letting f be the identically zero function, and this special
case is fundamental to understanding the more general case. For constant-coefficient
linear differential operators, the theorems of the two preceding sections give a complete description of the set of all solutions to the homogeneous equation. Furthermore, the exponential multiplier method developed there provides a practical
method for solving many nonhomogeneous equations. The next example illustrates
the method.
EXAMPLE 1 Given
y'' + 2y' + y = e^{3x},
we write the characteristic polynomial in the form D^2 + 2D + 1 and factor it, putting
the equation in the form
(D + 1)^2 y = e^{3x}.
Letting (D + 1)y = u, we try to solve
(D + 1)u = e^{3x}.
Multiplication by e^x gives
e^x(D + 1)u = e^{4x}
or
D(e^x u) = e^{4x}.
Integrating gives e^x u = ¼e^{4x} + c_1, so (D + 1)y = u = ¼e^{3x} + c_1 e^{-x}. Multiplying by e^x again gives
e^x(D + 1)y = ¼e^{4x} + c_1
or
D(e^x y) = ¼e^{4x} + c_1.
Then
e^x y = (1/16)e^{4x} + c_1 x + c_2
or
y = (1/16)e^{3x} + c_1 xe^{-x} + c_2 e^{-x}.
In the preceding example, the solution breaks naturally into a sum of two parts
y_h and y_p:
y_h = c_1 xe^{-x} + c_2 e^{-x},
y_p = (1/16)e^{3x}.
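The coefficient 1/16 in y_p can be cross-checked with a shortcut: for a constant-coefficient operator L with characteristic polynomial p, we have L(e^{kx}) = p(k)e^{kx}, so Ae^{kx} solves L(y) = e^{kx} with A = 1/p(k) whenever p(k) ≠ 0. The Python sketch below applies this; the function name is hypothetical, and the shortcut is a standard fact used here as an alternative to the exponential multiplier steps, not the method of the example.

```python
from fractions import Fraction

def exponential_coefficient(coeffs, k):
    """Return A such that A*e^{kx} solves L(y) = e^{kx}.

    coeffs lists the characteristic polynomial of L, highest power first.
    """
    p = Fraction(0)
    for c in coeffs:
        p = p * k + c  # Horner evaluation of p(k)
    if p == 0:
        raise ValueError("k is a characteristic root; A*e^{kx} cannot work")
    return 1 / p

# L = D^2 + 2D + 1 and right-hand side e^{3x}: p(3) = 16.
print(exponential_coefficient([1, 2, 1], 3))  # 1/16
```

The error raised when p(k) = 0 corresponds to the case where the right-hand side is itself a homogeneous solution.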
The function y_h is called the homogeneous part of the solution because it's a solution
of the homogeneous equation
L(y) = 0
associated with the given equation. The function y_p is called a particular solution of the nonhomogeneous equation
L(y) = f
because it's just that: a particular solution, though not the most general one. We
sometimes refer to the homogeneous part of a general solution as the homogeneous
solution. We get y_p by setting c_1 = c_2 = 0 in the general solution. The breakup
of the solution into two parts is an example of a general property of linear operators
discussed in Chapter 2, Section 2C on systems of linear algebraic equations. The
principle is important enough, and at the same time simple enough, that we state it
here also.
3.1 Theorem. Let L be a linear operator. Let f be a function, and let y_p be a
function in the domain of L such that L(y_p) = f. Then every solution y of
L(y) = f
has the form y = y_h + y_p, where y_h is a solution of the homogeneous equation L(y) = 0.
Proof. Suppose that L(y) = f and that also L(y_p) = f. Then since L is linear,
L(y - y_p) = L(y) - L(y_p)
= f - f = 0.
It follows that y - y_p = y_h for some homogeneous solution y_h. But then y = y_h + y_p
as we wanted to show. •
The method of Example 1 can always be used to find the most general solution
to an equation L(y) = f of the form
(D - r_1) ··· (D - r_n)y = f.
In second-order examples we can use Theorems 1.1 or 2.3, since y_h follows immediately from the roots of the characteristic polynomial. Theorem 3.1 then says that if
we find the general homogeneous part of the solution y_h using Theorems 1.1 or 2.3,
and somehow find a particular solution y_p, then the general solution of the given
equation is y_h + y_p.
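To find y_p it's often convenient to take advantage of the linearity of L in case the
right-hand side f is a sum of two or more terms. If we want to solve
L(y) = a_1 f_1 + a_2 f_2,
and y_1 and y_2 are solutions of L(y) = f_1 and L(y) = f_2 respectively, then
y = a_1 y_1 + a_2 y_2
is a solution, since L(a_1 y_1 + a_2 y_2) = a_1 L(y_1) + a_2 L(y_2). For example, to solve
(D + 1)^2 y = e^{3x} + 1, (2)
we would not have to start all over again, but would only have to find a particular
solution for
(D + 1)^2 y = 1.
This could be solved by using exponential multipliers, but in this case the differential
equation is so simple that we can guess a solution, namely, y_2 = 1. Then a particular
solution of Equation (2) is y_p = (1/16)e^{3x} + 1, and the general solution is
y = c_1 xe^{-x} + c_2 e^{-x} + (1/16)e^{3x} + 1.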
Solving
(D + l)y = u.
Then we solve
(D+l)u=e-x
ex(D + l)u = 1
or
Hence
is
The exponential multiplier method shown previously for finding particular solu-
tions of L(y) = f (x) will provide a solution if we can perform the integrations
involving f (x). The method of undetermined coefficients explained next has more
restricted applicability, but is often more efficient when it does apply.
3B Undetermined Coefficients
The method depends on the observation that if we want to solve
L(y) = f (x),
where f (x) is itself a solution of a homogeneous equation My = 0, then
M(L(y)) = M(f (x)) = 0.
Then the desired solution y(x) must be among the solutions of
M(L(y)) = 0.
If M and L are linear constant-coefficient operators then the solutions of the preceding equation are linear combinations of functions of the form x^k e^{rx}, where r may
be real or complex. Thus the only computational problem is to determine the so far
"undetermined coefficients" of combination that will actually give a solution of the
original equation L(y) = f(x). Guessing the operator M is based on experience
solving homogeneous equations.
EXAMPLE 3 The differential equation
y'' - y = e^x
in operator form is
(D^2 - 1)y = e^x.
Since (D - 1)e^x = 0, we have for any solution y(x),
(D - 1)(D^2 - 1)y = (D - 1)e^x = 0.
Since (D - 1)(D^2 - 1) = (D - 1)^2(D + 1), Theorem 2.4 shows that y must have the form
y = c_1 e^x + c_2 e^{-x} + c_3 xe^x.
Since the first two terms are solutions of the associated homogeneous equation
y'' - y = 0, we can concentrate on the remaining term c_3 xe^x. To find c_3, we compute
y_p(x) = c_3 xe^x
y_p'(x) = c_3(xe^x + e^x)
y_p''(x) = c_3(xe^x + 2e^x).
Thus for y_p'' - y_p = e^x to hold, we must have
2c_3 e^x = e^x, that is, c_3 = 1/2.
Then to satisfy the given differential equation we substitute y_p and its derivatives
to get
-2c_1 sin x + 2c_2 cos x = 2 sin x.
Because cos x and sin x are linearly independent on any interval we must have
c_1 = -1 and c_2 = 0. Thus y_p(x) = -x cos x is a particular solution, and
y(x) = c_3 cos x + c_4 sin x - x cos x
is the general solution.
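A general solution like this one can be spot-checked numerically. The sketch below assumes the underlying equation is y'' + y = 2 sin x, which is inferred from the substitution step shown above rather than stated in this excerpt; the function names and sample constants are illustrative.

```python
import math

def residual(y, x, h=1e-4):
    """Finite-difference estimate of y'' + y - 2 sin x at the point x."""
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / (h * h)
    return d2 + y(x) - 2 * math.sin(x)

def general_solution(c3, c4):
    """Candidate general solution c3*cos x + c4*sin x - x*cos x."""
    return lambda x: c3 * math.cos(x) + c4 * math.sin(x) - x * math.cos(x)

y = general_solution(1.0, -2.0)  # arbitrary sample constants
worst = max(abs(residual(y, x)) for x in (0.0, 0.5, 1.0, 2.0))
print(worst)  # close to zero, up to finite-difference error
```

Changing the sample constants should leave the residual near zero, since the terms with c3 and c4 are homogeneous solutions.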
Here is an outline of the routine for finding the terms in a linear combination for
a trial particular solution Yp to L(y) = f(x), where f is a constant multiple of
(i) Include in the linear combination Yp the function f itself and all terms in
its derivative set, consisting of the linearly independent sets of functions of
which f and its successive derivatives are linear combinations. For example,
the derivative set of x^2 + x sin x consists of the two sets {x^2, x, 1} and
{x sinx, x cosx, sinx, cosx}.
(ii) If a term included in step (i) happens to be a solution of the homogeneous
equation, multiply that term and all terms in its derivative set by the single
lowest power xk such that the resulting terms are no longer homogeneous
solutions.
(iii) Form a linear combination with undetermined constant coefficients of the
terms from (ii), and determine the values of the coefficients by substitution
into L(y) = f .
Here are some examples of functions f (x) and corresponding trial solutions Yp (x ),
assuming no term in Yp(x) satisfies the homogeneous equation.
Since
y'(x) = A(2x - x^2)e^{-x},
y''(x) = A(2 - 4x + x^2)e^{-x},
The terms with x and x^2 as factors all cancel, and we are left with
To find the form of a trial solution Yp(x) for a nonhomogeneous equation it's
often simpler, and just as effective, to make an educated guess at Yp(x) rather than
methodically following the three rules listed above. For example, a little experience
shows that y_p = Axe^{-x} is a good choice for the equation y'' - y = e^{-x}.
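Once a trial form is chosen, linearity determines the coefficient: if L(A·trial) = A·L(trial) must equal the right-hand side, then A is the ratio of the two at any point where L(trial) is nonzero. The Python sketch below estimates A for the trial solution Axe^{-x} of y'' - y = e^{-x}; the helper names and the sample point are illustrative assumptions, and the second derivative is only approximated.

```python
import math

def apply_L(y, x, h=1e-4):
    """Finite-difference estimate of y'' - y at the point x."""
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / (h * h)
    return d2 - y(x)

def fit_coefficient(trial, rhs, x0=0.5):
    """Estimate A with L(A*trial) = rhs, using linearity of L at one sample point."""
    return rhs(x0) / apply_L(trial, x0)

A = fit_coefficient(lambda x: x * math.exp(-x), lambda x: math.exp(-x))
print(round(A, 6))  # approximately -0.5
```

This works only when the trial form is correct; with a wrong guess the ratio would vary from one sample point to another.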
EXERCISES
In Exercises 1 to 10, find the general solution of the differential equations and
then find the particular solution satisfying y(0) = 0 and y'(0) = 1.
1. y'' - y = e^{2x}
2. y'' - y = 3e^x
3. y'' + 2y' + y = e
4. y'' - y = x
5. y'' - y = e^x + x
6. y'' - 2y = cos 2x
7. y'' + y = cos x
8. y'' = cos x + sin x
9. y'' + y = x cos x
10. y'' - y = xe^{-x}
In Exercises 11 to 14, use factored operators and the exponential multiplier method
to find the general solution for the differential equation.
11. y'' + y' - 2y = e^x
12. y'' - y = e^{2x}
13. y'' + y = e^{ix}
14. y'' = x
In Exercises 15 to 22, find a homogeneous differential equation of least possible
order for which the given function is a solution.
15. e + 2e^{2x}
16. e^x cos x - e^x sin x
17. x + 1
18. xe^x - 2e^x
19. x sin 3x
20. x^2 cos 4x
In Exercises 23 to 28, find the appropriate form for a trial solution for the
equation. For example you would use y_p = A cos 2x + B sin 2x for y'' - y = sin 2x.
23. y'' - y = cos x
24. y'' + y = cos x
25. y'' - y = e^x
26. y'' - y = xe^x
27. y'' - 2y' + y = xe^x
28. y'' = x^5
In Exercises 29 to 32, find the general solution of the equation by first finding
the general solution of the associated homogeneous equation and then adding to it a
particular solution found by the undetermined coefficient method. Then find the
particular solution satisfying y(0) = 0, y'(0) = 1 and sketch the graph of that
solution.
29. y'' + 4y' + 4y = 3x
30. y'' - y' - 12y = 2e^{4x}
31. y'' + 2y' + 2y = e^x
32. y'' - y' = x
In Exercises 33 to 42, find the general form for a trial solution y_p for each of
the following. (For example, if y'' - y = e^x, choose y_p = Axe^x.) You need not
determine the coefficient values.
33. y'' - 4y = xe^{2x} + e^{2x}
34. y'' + y = x^2 cos x
35. y'' - 5y' + 6y = xe^{2x} + e^{3x}
36. y'' + 4y = x^2 cos 2x - 2 sin 2x
37. y'' - 4y = e^{2x} + 5 cos x
38. y'' + y = 3x sin(x - 3)
39. y'' - y' = x^2 + 2e^x
41. y''' = 1 + x + x^3
42. y'' + y = x^99 cos x
Falling body in a resisting medium. The distance y(t) covered in time t by a
falling body of mass m under the sole influence of a constant gravitational field
satisfies a differential equation of the form d^2y/dt^2 = g. (If distance is
measured up from the surface of the earth, then instead we would use
d^2y/dt^2 = -g.) To take atmosphere resistance into account in a simple way, we
write the differential equation in terms of force rather than resistance in the form
m d^2y/dt^2 = gm - k dy/dt    or
Here k is a positive constant that is used to express a retarding force proportional
to velocity dy/dt. This equation applies to Exercises 43 to 46.
43. (a) Show that the general solution to the retarded falling body equation is
y(t) = (mg/k)t - (m^2 g/k^2)(1 - e^{-kt/m}).
(b) Show that the formula for y(t) in part (a) satisfies y(t) ≤ ½gt^2 for t ≥ 0.
[Hint: y'' = g - (k/m)y' ≤ g. Now integrate from 0 to t.]
(c) Find the analogue of the formula given in part (a) for the case of initial
velocity y'(0) = v_0.
45. (a) For a body of mass m subject to friction constant k, show that initial
velocity v_0 leads to velocity at time t given by
y'(t) = mg/k + (v_0 - mg/k)e^{-kt/m}.
(b) Find the limit as k tends to 0 of the formula for y'(t) in part (a). Does this
agree with the free-fall formula v_0 + gt?
satisfying y(x_0) = y'(x_0) = 0.
3C Variation of Parameters
The undetermined coefficient method is inadequate if the nonhomogeneity involves
a term that is not itself a solution of some homogeneous equation. If we know a
nontrivial solution y_1 of a homogeneous equation, we can try to find a function u(x)
such that y(x) = y_1(x)u(x) will be a solution of the associated nonhomogeneous
equation. We can find the complete solution this way, and the procedure is called
variation of parameters. The substitution y(x) = y_1(x)u(x) leaves us with a
linear differential equation for u(x), which we then solve to
find the "variable parameter" u(x). This method doesn't require that the coefficients
be constant.
Substitution into the given differential equation followed by division by e^{-x} yields
3.2
hold identically for all x on the interval in question. We solve this system of equations
for u_1' and u_2', with the result that
u_1'(x) = -y_2(x)f(x) / (y_1(x)y_2'(x) - y_2(x)y_1'(x)),
u_2'(x) = y_1(x)f(x) / (y_1(x)y_2'(x) - y_2(x)y_1'(x)).
The expression in the denominators is the same in both formulas and we can write
it as the 2-by-2 determinant
w(x) = det [ y_1(x)  y_2(x) ; y_1'(x)  y_2'(x) ] = y_1(x)y_2'(x) - y_2(x)y_1'(x),
called the Wronskian determinant of the pair y_1, y_2. It's possible to prove that
w(x) is never zero if y_1, y_2 are linearly independent solutions, so the examples we
consider here will have that property. To complete the solution, integrate the formulas
for u_1', u_2' to find u_1 and u_2, and then combine with y_1, y_2 to get a particular solution
3.3    y_p(x) = y_1(x)u_1(x) + y_2(x)u_2(x)
              = y_1(x) ∫ [-y_2(x)f(x)/w(x)] dx + y_2(x) ∫ [y_1(x)f(x)/w(x)] dx.
computation to arrive at Equation 3.3. The next example will be done that way.
Because of the way that f enters Equations 3.2 and 3.3, the equation to be solved
must be in normalized form to make these formulas valid.
EXAMPLE 8
We normalize x^2 y'' - 2xy' + 2y = x^3, say for positive x, to get
y'' - (2/x)y' + (2/x^2)y = x, x > 0.
It's routine to check that the equation has homogeneous solutions y_1(x) = x, y_2(x) =
x^2. For this example Equations 3.2 are
x u_1' + x^2 u_2' = 0,
u_1' + 2x u_2' = x.
Multiplying the second equation by x and then subtracting the first equation from it
gives x^2 u_2' = x^2, or u_2' = 1. It then follows from the first equation that u_1' = -x.
Integrating to find u_1 and u_2 gives
u_1(x) = -½x^2, u_2(x) = x.
A particular solution is
y_p = y_1 u_1 + y_2 u_2
= x(-½x^2) + x^2(x) = ½x^3.
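The computation above can be checked with a short Python sketch that solves the two-by-two system of Equations 3.2 by Cramer's rule and then tests the resulting particular solution. The function names and the sample points are illustrative assumptions; the derivatives of y_1, y_2, and y_p are entered by hand.

```python
def u_primes(y1, dy1, y2, dy2, f, x):
    """Solve Equations 3.2 for u1'(x), u2'(x) by Cramer's rule."""
    w = y1(x) * dy2(x) - y2(x) * dy1(x)   # Wronskian of y1, y2 at x
    return -y2(x) * f(x) / w, y1(x) * f(x) / w

# Homogeneous solutions y1 = x, y2 = x^2 and their derivatives.
y1, dy1 = (lambda x: x), (lambda x: 1.0)
y2, dy2 = (lambda x: x * x), (lambda x: 2.0 * x)
f = lambda x: x   # right side of the normalized equation

samples = [0.5, 1.0, 3.0]
derivs = [u_primes(y1, dy1, y2, dy2, f, x) for x in samples]

def residual(x):
    """x^2 y'' - 2x y' + 2y - x^3 for y_p = x^3 / 2."""
    yp, dyp, ddyp = 0.5 * x**3, 1.5 * x**2, 3.0 * x
    return x * x * ddyp - 2.0 * x * dyp + 2.0 * yp - x**3
```

At each sample point the pair returned by u_primes should match u_1' = -x and u_2' = 1, and the residual of y_p = ½x^3 in the original equation should be zero.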
Adding constants of integration to u_1 and u_2 would only add a linear combination of
homogeneous solutions to y_p. In any case, we have the solution
We'll now solve a generalization of Example 1, but using Equation 3.3 instead of
Equation 3.2.
w(x) = det [ x  x^2 ; 1  2x ] = 2x^2 - x^2 = x^2.
y_p(x) = x ∫ [-x^2 f(x)/x^2] dx + x^2 ∫ [x f(x)/x^2] dx
       = -x ∫ f(x) dx + x^2 ∫ [f(x)/x] dx.
To make the integration fairly easy, we can use the example f(x) = x cos x. For
this choice, we get
y_p = -x ∫ x cos x dx + x^2 ∫ cos x dx
3.5    G(x, t) = [y_1(t)y_2(x) - y_2(t)y_1(x)] / w(t),
(i) G(x, t) = [1/(r_1 - r_2)](e^{r_1(x-t)} - e^{r_2(x-t)}), r_1 ≠ r_2
(ii) G(x, t) = (x - t)e^{r_1(x-t)}, r_1 = r_2
(iii) G(x, t) = (1/β)e^{α(x-t)} sin β(x - t), r_1 = α + iβ, r_2 = α - iβ
EXAMPLE 10 The normalized constant-coefficient equation y'' - 3y' + 2y = f(x) has characteristic
equation r^2 - 3r + 2 = 0 with roots r_1 = 2 and r_2 = 1 and independent solutions
y_1 = e^{2x}, y_2 = e^x. Then w(t) = -e^{3t} and the Green's function solution is
y_p(x) = ∫_{x_0}^{x} [(e^{2t}e^x - e^t e^{2x})/(-e^{3t})] f(t) dt
       = ∫_{x_0}^{x} (e^{2(x-t)} - e^{(x-t)}) f(t) dt.
Suppose f(x) = -1 if x < 0 and f(x) = 2 if 0 ≤ x. In a purely formal sense the two
cases differ only by a constant factor. Assume first that x < 0, where f(x) = -1.
In that case, with x_0 = 0,
= e^x - ½e^{2x} - ½ for x < 0.
For the case x ≥ 0, just replace the factor -1 by 2 in the integral to get altogether
y_p(x) = e^x - ½e^{2x} - ½,   x < 0,
y_p(x) = 1 + e^{2x} - 2e^x,   x ≥ 0.
Note that the behavior of the two parts is quite different: lim_{x→-∞} y_p(x) = -½, but
lim_{x→∞} y_p(x) = ∞.
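The two closed forms in this example can be compared with a direct numerical evaluation of the Green's function integral. The following Python sketch uses a composite Simpson rule; the helper names, the number of subintervals, and the sample points are assumptions chosen for illustration.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson rule from a to b with n (even) subintervals; b < a is allowed."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def green(x, t):
    """G(x, t) = e^{2(x-t)} - e^{(x-t)} for y'' - 3y' + 2y = f."""
    return math.exp(2.0 * (x - t)) - math.exp(x - t)

def yp_numeric(x):
    # f is constant between 0 and x: -1 to the left of 0, 2 to the right.
    c = -1.0 if x < 0 else 2.0
    return simpson(lambda t: green(x, t) * c, 0.0, x)

def yp_closed(x):
    """Piecewise closed form from the example."""
    if x < 0:
        return math.exp(x) - 0.5 * math.exp(2.0 * x) - 0.5
    return 1.0 + math.exp(2.0 * x) - 2.0 * math.exp(x)

errors = [abs(yp_numeric(x) - yp_closed(x)) for x in (-3.0, -0.5, 0.0, 1.0, 2.0)]
```

Evaluating yp_closed at a large negative argument also illustrates the limit -½ noted above, while the values for positive x grow without bound.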
EXAMPLE 11 Recall from earlier examples that x^2 y'' - 2xy' + 2y = 0 has independent solutions
y_1(x) = x, y_2(x) = x^2. The Wronskian is w(x) = x^2 ≠ 0 except at x = 0. The
Green's function is G(x, t) = (x^2/t) - x. We have the solution to the normalized
nonhomogeneous equation y'' - (2/x)y' + (2/x^2)y = f(x) given by
= x^2 ∫_{x_0}^{x} t^2 dt - x ∫_{x_0}^{x} t^3 dt = (1/3)x^2(x^3 - x_0^3) - (1/4)x(x^4 - x_0^4)
= (1/12)x^5 - (1/3)x_0^3 x^2 + (1/4)x_0^4 x.
This is the solution that satisfies y(x_0) = y'(x_0) = 0. The terms containing x_0
combine with the homogeneous solution to give the solution y = c_1 x + c_2 x^2 + (1/12)x^5.
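A numerical version of this Green's function computation can be sketched as follows. The forcing term f(t) = t^3 is inferred from the integrals displayed in the example, and the choice x_0 = 1, the helper names, and the sample points are assumptions for illustration only.

```python
def simpson(g, a, b, n=100):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def green(x, t):
    """G(x, t) = x^2/t - x, built from y1 = x, y2 = x^2, w(t) = t^2."""
    return x * x / t - x

def yp_numeric(x, x0, f):
    """y_p(x) = integral from x0 to x of G(x, t) f(t) dt."""
    return simpson(lambda t: green(x, t) * f(t), x0, x)

def yp_closed(x, x0):
    """Closed form from the example for f(t) = t^3."""
    return x**5 / 12.0 - x0**3 * x**2 / 3.0 + x0**4 * x / 4.0

f = lambda t: t**3
x0 = 1.0
checks = [(x, yp_numeric(x, x0, f), yp_closed(x, x0)) for x in (1.0, 1.5, 2.0, 3.0)]
```

Since G(x, t)f(t) is a cubic polynomial in t here, Simpson's rule reproduces the integral up to rounding, and the value at x = x_0 is zero as the initial condition requires.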
w(x) = det [ x  x^2 + 1 ; 1  2x ] = x^2 - 1, x ≠ ±1.
G(x, t) = [t(x^2 + 1) - (t^2 + 1)x] / (t^2 - 1)
f(x) = 0,   -1 < x < 0,
f(x) = 1,   0 ≤ x ≤ ½,
f(x) = 0,   ½ < x < 1,
= (x^2 + 1)[½ ln(1 - x^2)] - x[x + ln((1 - x)/(1 + x))].
The complete solution is then
0,   -1 < x < 0,
The third line comes from evaluating the bracketed expressions in the previous integral
evaluation at x = ½. The first and third lines are solutions of the homogeneous
equation on their respective intervals, because f(x) = 0 there. Finding the solution
that satisfies more general initial conditions, y(0) = y_0, y'(0) = z_0, is just a matter
We have
y'(x) = c_1 + 2c_2 x + y_p'(x).
But y_p(0) = y_p'(0) = 0, so we find c_1, c_2 from equations c_2 = y_0, c_1 = z_0.
Summary. What we have seen in this section is a collection of methods for
finding explicit solution formulas of the form
EXAMPLE 13 The equation (x - 1)y'' - xy' + y = 1 has a particular solution y_p(x) = 1, and the
associated homogeneous equation has solutions y_1(x) = x and y_2(x) = e^x. Initial
conditions y(x_0) = y_0, y'(x_0) = z_0 are satisfied by solving for c_1, c_2 in
EXERCISES
18. x^2 y'' - 2xy' + 2y = f(x); y(3) = 0, y'(3) = 0, where
f(x) = 0,    0 < x < 2,
f(x) = 3,    2 ≤ x ≤ 4,
f(x) = -1,   4 < x.
19. Show that the derivative of the Green's function Formula 3.4 is
y_p'(x) = ∫_{x_0}^{x} [(y_1(t)y_2'(x) - y_2(t)y_1'(x))/w(t)] f(t) dt.
Show then that y_p'(x_0) = 0. [Hint: Use Equation 3.4 separated into two integrals,
and then apply the product rule for differentiation.]
20. The Leibniz rule for differentiating an integral states that
d/dx ∫_{a(x)}^{b(x)} F(x, t) dt = ∫_{a(x)}^{b(x)} (∂F/∂x)(x, t) dt
    + b'(x)F(x, b(x)) - a'(x)F(x, a(x)),
y' = μx^{μ-1}, and y'' = μ(μ - 1)x^{μ-2}. Show that for y to solve the differential
equation, μ must satisfy the indicial equation
μ^2 + (a - 1)μ + b = 0.
(b) Show that if the indicial equation has real roots μ_1 ≠ μ_2 then y_1 = x^{μ_1},
y_2 = x^{μ_2} are solutions.
(c) Show that if the indicial equation has complex conjugate roots μ_1 = α + iβ,
μ_2 = α - iβ then y_1 = x^α cos(β ln x), y_2 = x^α sin(β ln x) are solutions. Note
that, by definition, x^{α+iβ} = x^α e^{iβ ln x} for x > 0.
(d) Show that if μ_1 is a double root of the indicial equation then y_1 = x^{μ_1},
y_2 = x^{μ_1} ln x are solutions. Use Theorem 2.4 for this.
In Exercises 29 to 32, use the results of the previous exercise to solve the
following Euler Equations.
29. x^2 y'' + xy' - y = 0
30. x^2 y'' + 4xy' + y = 0
31. x^2 y'' + 3xy' + y = 0
32. x^2 y'' + xy' + y = 0
33. Let f(x) and g(x) be twice differentiable functions on an interval a < x < b on
which f(x)g'(x) - f'(x)g(x) ≠ 0.
(a) Show that the 3-by-3 determinant equation
| y   f   g  |
| y'  f'  g' | = 0
(c) Find a second-order homogeneous linear equation having f(x) = x and g(x) = e^x
as solutions for all x ≠ 1.
34. It's sometimes erroneously inferred from insufficient evidence that the
Wronskian determinant
W[f, g](x) = det [ f(x)  g(x) ; f'(x)  g'(x) ]
SECTION 4 OSCILLATIONS
A second-order constant-coefficient linear differential equation
a d^2x/dt^2 + b dx/dt + cx = f(t)
often has a physical interpretation that allows a neat classification of the solutions
and equations into distinct types, depending on the relations between the constants
a, b, c and the function f. Equations of this kind are important not only because
of their direct physical applicability, but also because of the insight they yield about
related nonlinear phenomena. A typical mechanism that we can analyze using a
constant-coefficient equation is shown in Figure 11.3, in cross section. Automobile
shock absorbers and artillery recoil mechanisms are designed using the principles
illustrated here. The working parts consist of a piston that travels in a cylinder
containing fluid, and a spring that can expand and compress. A spring usually exerts
a force roughly proportional to its extension or compression from its equilibrium
position, denoted by 0 on the fixed scale. Thus if x is the amount of displacement
from 0, then the force f_S exerted by the spring is representable, according to Hooke's
law, by
f_S = -hx, h > 0,
where for small enough displacements h is constant. We also assume that the
frictional force f_F in the mechanism due to the viscosity of the fluid is proportional to
the velocity:
f_F = -k dx/dt, k > 0.
The time-dependent external force f_E = f(t) acts independently of f_S and f_F,
which are, in turn, assumed to act independently of one another, so that the total
force acting parallel to the scale in the figure is
f_S + f_F + f_E = -hx - k dx/dt + f(t).
FIGURE 11.3 (fixed scale marked -1, 0, +1)
On the other hand, general physical principles assert that this force must also be
equal to the mass m of the moving parts times the acceleration d^2x/dt^2. Thus
4.1    m d^2x/dt^2 + k dx/dt + hx = f(t).
We'll investigate various assumptions about k, h, and f(t). The mass m will be
a fixed positive constant. We consider first harmonic oscillation, also called free
oscillation, with external force f identically zero.
4A Harmonic Oscillation
We assume that k = 0 and f = 0. These assumptions represent an ideal situation
that can only be approximated by the mechanism shown in Figure 11.3. Under these
assumptions the differential equation becomes
d^2x/dt^2 + (h/m)x = 0.
A = 2, and h = m = 1. Changing the number α, called the phase angle, shifts the
graph to the right or left. The frequency of the oscillation depends only on the ratio
h/m; increasing h or decreasing m increases the frequency of the oscillation. That
this ideal oscillation is periodic is a direct consequence of the assumption that there
is no friction.
require that
A cos(-α) = x_0, -A sin(-α) = v_0.
We can solve these equations for A and α in terms of x_0 and v_0 to get
A = √(x_0^2 + v_0^2), α = arctan(v_0/x_0).
In the special case when x_0 = 0, the initial displacement from equilibrium is zero,
and we find
A = |v_0|, α = ±π/2.
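The pair of conditions A cos α = x_0, A sin α = v_0 can be solved numerically as in the following Python sketch. It assumes unit circular frequency (h = m), since the displayed conditions carry no factor √(h/m); the function names are illustrative. The two-argument arctangent is used so that the quadrant of α is correct even when x_0 is zero or negative.

```python
import math

def amplitude_phase(x0, v0):
    """Solve A cos(alpha) = x0, A sin(alpha) = v0 with A >= 0."""
    A = math.hypot(x0, v0)
    alpha = math.atan2(v0, x0)   # agrees with arctan(v0/x0) when x0 > 0
    return A, alpha

def x(t, A, alpha):
    """Displacement x(t) = A cos(t - alpha), unit circular frequency assumed."""
    return A * math.cos(t - alpha)

def velocity(t, A, alpha):
    """dx/dt = -A sin(t - alpha)."""
    return -A * math.sin(t - alpha)

A, alpha = amplitude_phase(0.0, -3.0)   # zero initial displacement
```

For x_0 = 0 and v_0 = -3 this returns A = |v_0| = 3 and α = -π/2, which is the special case described above.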
4B Damped Oscillation
The piston in the mechanism shown in Figure 11.3 exerts a damping force that
depends on the viscosity of the medium where the piston moves. If we continue
to assume that f = 0 in Equation 4.1, then we have to deal with the differential
equation
d^2x/dt^2 + (k/m) dx/dt + (h/m)x = 0.
r_1 = (1/2m)(-k + √(k^2 - 4mh)), r_2 = (1/2m)(-k - √(k^2 - 4mh)).
r1 = - ½, r2 = -1,
so that the displacement from equilibrium at time t is
A typical graph is shown in Figure 11.5. The maximum displacement occurs at just
one point, after which the displacement tends steadily to 0 as t increases.
Underdamping, k^2 - 4mh < 0. This case occurs when k < 2√(mh), so that,
relative to √(mh), the friction constant k is small. The characteristic roots are now
complex conjugates of one another:
r_1 = (1/2m)(-k + i√(4mh - k^2)), r_2 = (1/2m)(-k - i√(4mh - k^2)).
FIGURE 11.5
The general form of the displacement function is then
x(t) = e^{-kt/2m}(c_1 cos(√(4mh - k^2) t/2m) + c_2 sin(√(4mh - k^2) t/2m))
     = Ae^{-kt/2m} cos(√(4mh - k^2) t/2m - α),
FIGURE 11.6 x(t) = e^{-t} cos t
EXAMPLE 3 Take h = 2, m = 1, and k = 2. Then k < 2√(mh), and the displacement at time t is
x(t) = Ae^{-t} cos(t - α).
Figure 11.6 shows the graph of such a function with A = 1 and α = 0. It's easy
to check that this choice for the constants A and α gives a solution satisfying the
initial conditions
x(0) = 1, dx/dt(0) = -1.
Critical Damping, k = 2√(mh). This case lies between overdamping and underdamping,
and it is critical in the sense that an arbitrarily small change in one of
the parameters k, m, or h will disturb the equality k = 2√(mh) and produce one
of the other two cases. Numerically, the case of critical damping is distinguished
by the equality of the characteristic roots: r_1 = r_2 = -k/2m. It follows that the
displacement function is given by
so that
x(t) = [(v_0 + x_0)t + x_0]e^{-t}.
Figure 11.7 shows four possibilities, depending on the size of v_0, the initial velocity.
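The three damping cases are distinguished by the sign of k^2 - 4mh, which suggests a small classification routine. The following Python sketch is one way to write it; the function name is an assumption, and the overdamped values m = 2, k = 3, h = 1 are hypothetical numbers chosen only because they reproduce the roots -½ and -1 quoted earlier.

```python
import cmath

def classify(m, k, h):
    """Classify m x'' + k x' + h x = 0 (m, h > 0, k >= 0) and return its roots."""
    disc = k * k - 4.0 * m * h
    if k == 0:
        kind = "harmonic"
    elif disc > 0:
        kind = "overdamped"
    elif disc < 0:
        kind = "underdamped"
    else:
        # Exact equality is fragile in floating point; fine for simple inputs.
        kind = "critically damped"
    root = cmath.sqrt(disc)
    return kind, (-k + root) / (2.0 * m), (-k - root) / (2.0 * m)

example3 = classify(1.0, 2.0, 2.0)      # Example 3: h = 2, m = 1, k = 2
overdamped = classify(2.0, 3.0, 1.0)    # hypothetical values, roots -1/2 and -1
critical = classify(1.0, 2.0, 1.0)      # double root -1
```

For Example 3 the roots come out as -1 ± i, matching the factor e^{-t} and the unit circular frequency in x(t) = Ae^{-t} cos(t - α), and the critically damped case has the double root -1 that appears in x(t) = [(v_0 + x_0)t + x_0]e^{-t}.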
FIGURE 11.7 Panels (a) to (d), each starting at x_0; (a) v_0 > 0, (b) v_0 = 0
4C Forced Oscillation
In the specific instances considered so far, the differential equation
m d^2x/dt^2 + k dx/dt + hx = f(t)
has been subject to initial conditions x(0) = x_0, dx/dt(0) = v_0, but the external
force function f has been assumed to be identically zero. The resulting free oscil-
lation is described by a solution of a homogeneous differential equation. (Note that
free doesn't imply undamped.) When f is not identically zero, we speak of forced
oscillation. From a purely mathematical point of view, there is no reason why the
function f on the right-hand side of the preceding differential equation cannot be
chosen to be an arbitrary continuous function for t ≥ 0. However, a force function
f that assumes large values could easily drive the oscillations outside the range
in which we can maintain the original assumptions used to derive the differential
equation. (For example stretching a spring too far might change its characteristics
to the point of destroying its elasticity altogether.) For this reason, the function f is
chosen in the examples and exercises to have a rather restricted range of values. In
every example we can use the decomposition of a solution x(t) into homogeneous
and particular parts,
x(t) = x_h(t) + x_p(t).
The solution x_h(t) has already been discussed earlier in this section for various
choices of m, k, and h in the homogeneous differential equation. What remains to be
done is to discuss the effect of adding a solution of the nonhomogeneous equation.
If k > 0, the analysis given in the earlier examples shows that every homogeneous
solution tends to zero like an exponential of the form e^{-kt/2m}. Thus for values of
t that make (kt/2m) moderately large, the addition of the homogeneous solution
has a negligible effect. Such an effect is called transient, and the complementary
particular solution is called the steady-state solution.
EXAMPLE 5 If f(t) = a_0 cos ωt, then the differential equation
m d^2x/dt^2 + k dx/dt + hx = a_0 cos ωt, k > 0,
The choice ω^2 = h/m makes the maximum amplitude of x_p equal to a_0√m/(k√h).
This choice of the frequency ω is called resonant because it produces a response
of large amplitude for values of the system parameter k that are small relative to
√(m/h). Notice that in this example of resonance we have α = arctan(∞) = π/2,
so that
x_p(t) = (a_0/(ωk)) cos(ωt - π/2).
Thus in a system with √m large relative to k√h, a small external force may produce
vibrations of large amplitude if the external force oscillates at the resonant frequency.
For this reason, resonance can completely upset an operating system, even though
the external force remains small in magnitude.
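The size of the resonant response can be illustrated with a short Python sketch. It uses the steady-state amplitude a_0/√((h - mω^2)^2 + k^2 ω^2), the same amplitude factor that appears in the exercises for this section; the function name and the numerical values of m, k, h, and a_0 are hypothetical.

```python
import math

def steady_state_amplitude(m, k, h, a0, w):
    """Amplitude a0 / sqrt((h - m w^2)^2 + k^2 w^2) of the steady-state response."""
    return a0 / math.sqrt((h - m * w * w) ** 2 + (k * w) ** 2)

# Hypothetical system: light damping relative to sqrt(m/h).
m, k, h, a0 = 1.0, 0.1, 4.0, 1.0
w_res = math.sqrt(h / m)                       # choice w^2 = h/m

at_resonance = steady_state_amplitude(m, k, h, a0, w_res)
predicted = a0 * math.sqrt(m) / (k * math.sqrt(h))
off_resonance = steady_state_amplitude(m, k, h, a0, 2.0 * w_res)
```

With these values the amplitude at ω^2 = h/m agrees with a_0√m/(k√h) and is many times larger than the response at twice that frequency, even though the forcing amplitude a_0 is the same in both cases.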
EXERCISES
Note. Some of the following exercises use the overdot notation ẋ = dx/dt and
ẍ = d^2x/dt^2 for first and second time derivatives, used also in Chapter 4, Section 1.
Each of the differential equations 1 to 6 generates a free (i.e., undriven)
oscillation. (a) Without solving it first, classify each equation according to type:
harmonic, overdamped, underdamped, or critically damped. (b) Find the general
solution formula for each equation.
1. d^2x/dt^2 + 2dx/dt + x = 0
2. d^2x/dt^2 + 2dx/dt + 2x = 0
3. ẍ + 9x = 0
4. ẍ + 3ẋ + x = 0
5. ẍ + aẋ + a^2 x = 0, a > 0
6. d^2x/dt^2 + ¼dx/dt + ½x = 0
For each of the following general solutions of a second-order, constant-coefficient
equation, find the choice of the arbitrary constants that satisfies the corresponding
initial conditions. Sketch the solution that you find. Also find the differential
equation of lowest order that the solution satisfies.
7. x(t) = c_1 cos 2t + c_2 sin 2t; x(0) = 0, dx/dt(0) = 1
8. x(t) = A cos(3t - φ); x(0) = 1, dx/dt(0) = -1
9. x(t) = c_1 te^{-2t} + c_2 e^{-2t}; x(0) = 0, dx/dt(0) = -1
10. x(t) = Ae^{-t} cos(2t - φ); x(0) = 2, dx/dt(0) = 2
11. x(t) = c_1 e^{-2t} + c_2 e^{-4t}; x(0) = -1, dx/dt(0) = -1
Find the steady-state solution to each of the differential equations 12 and 13. Also
estimate the earliest time beyond which the transient solution remains less than
0.01, assuming initial conditions x(0) = 1, ẋ(0) = 0.
12. d^2x/dt^2 + 2dx/dt + 2x = 2 cos 3t
[Hint: Show |x(t)| ≤ √2 e^{-t}.]
13. d^2x/dt^2 + 3dx/dt + 2x = cos t
14. There is a well-known analogy between the behavior of a damped mass-spring
system and of an RLC electrical circuit. Here L is the inductance (analogue of mass)
of a coil, R is the resistance (analogue of friction constant) in the circuit, and C
is the capacitance (analogue of reciprocal of spring stiffness) or ability of a
capacitor to store a charge. The differential equation satisfied by the charge Q(t)
on the capacitor at time t is
where E(t) is the voltage impressed on the circuit from an external source. The
charge Q(t) is related to the current flow I(t) by I = dQ/dt.
(a) Derive the relations that must hold between R, L, and C in order that the
response of Q(t) should be respectively underdamped, critically damped, and
overdamped.
(b) Show that if C = ∞ (capacitor is absent) the equation for I(t) is
L dI/dt + RI = E(t).
(c) Solve the equation in part (b) when E(t) = E sin ωt and I(0) = 0, and show
that, if t is large enough, the current response differs negligibly from
(E/Z) sin(ωt - θ),
where Z = √(R^2 + ω^2 L^2) and cos θ = R/Z. The function Z(ω) is called the
impedance of the circuit in response to the sinusoidal input of frequency ω.
*(d) Show that, if 0 < C < ∞, a long-term response to the input voltage
E(t) = E sin ωt is E(t + α)/Z(ω), where Z(ω) = √(R^2 + [ωL - 1/(ωC)]^2),
tan α = (C^{-1} - ω^2 L)/(ωR).
15. We can determine the range of validity of Hooke's law for a given spring, and
the corresponding value of a Hooke constant h, as follows. Hang the spring with known
weights w_j = m_j g, j = 1, ..., n, attached to the free end. If the additional
extension is always proportional to the additional weight, then Hooke's law is valid
for this range of extensions, and h is the constant of proportionality. A similar
procedure applies to compression.
(a) Assume distance units are in feet. A spring with a 5-pound weight appended has
length 6 inches, but with an 8-pound weight appended has length 1 foot. If the
spring satisfies Hooke's law with constant h, find h.
(b) What if distance is measured in meters and force in kilograms in part (a)?
(There are about 3.28 feet in a meter and 2.2 pounds in a kilogram.)
(c) Suppose we know Hooke's constant to be h = 120 for a certain spring. We observe
that between hanging a 20-pound weight from it and then a larger weight we get an
additional extension of 6 inches. How big is the larger weight?
(d) A spring is compressed to length 20 cm by a force of 5 kg, to length 10 cm by a
force of 6 kg, and to 5 cm by a force of 7 kg. Discuss the possible validity of
Hooke's law given this information.
16. Answer the following questions about solutions x(t) of the forced harmonic
oscillator ẍ + hx = cos ωt, if x(0) = (h - ω^2)^{-1} and ẋ(0) = 0.
(a) If h = 2, what value should ω have to make the response amplitude equal 4?
(b) If ω = 2, what value should h have to make the response amplitude equal to 5?
(c) If h = 2, what is the unique positive value of ω for which the response x(t)
becomes unbounded as t → ∞?
(d) If ω = 10, for what range of h-values will the response amplitude remain
between 3 and 4?
17. Answer the following questions about solutions x(t) to the damped and unforced
equation mẍ + kẋ + hx = 0.
(a) If m = 2, how should h and k be related so that the solutions will be oscillatory?
(b) If h = k = 1, how should m be chosen so that all nontrivial solutions will
oscillate?
(c) If m = h = 1, how should k be chosen so that x(t) has circular frequency ω = ½?
(d) If m = h = 1, how should k be chosen so that x(t) is oscillatory?
18. Answer the following questions about solutions x(t) of the damped and forced
equation mẍ + kẋ + hx = cos ωt.
(a) If m = k = h = 1, how should ω be chosen so that the amplitude of the
steady-state solution will be 1?
(b) If k = ω = 1, what relation must hold between m and h so that the steady-state
amplitude will be 1?
(c) If m = h = ω = 1, how should k be chosen so that the amplitude of the
steady-state solution will be 2?
(d) What polynomial relation must hold among m, h, k, and ω if the frequency of the
transient solution is to be the same as the steady-state frequency? As a special
case, show that if the homogeneous solutions have frequency ω, and also h = mω^2,
then there is no transient solution. [Hint: Show k = 0.]
Find the amplitude A, the frequency ω/2π and a phase angle φ for each of the
following periodic functions.
19. 2 cos t + 3 sin t
20. -2 cos 2t + 3 sin 2t
21. sin πt + 2 cos πt
22. sin(3t)
By how much are each of the following pairs of oscillations out of phase? You can
decide this by expressing each pair in the general form A cos(ωt - φ), cos(ωt - ψ).
23. (√3/2) cos t + (1/2) sin t, cos t
24. (1/2) cos t + (√3/2) sin t, cos t
25. (1/√2) cos t + (1/√2) sin t, sin t
26. cos t - sin t, sin t
27. A weight of mass m = 1 is attached by springs with Hooke constants h_1, h_2 to
two fixed vertical supports. The weight oscillates along a horizontal line with
negligible friction. By analyzing the force due to each spring, show that the
displacement x = x(t) of the weight from equilibrium satisfies ẍ = -(h_1 + h_2)x.
28. A weight of mass m is attached by springs with Hooke constants h_1, h_2 to two
fixed vertical supports. The weight oscillates along a horizontal line with
negligible friction. Let the respective unstressed lengths of the two springs be
l_1, l_2, and let b denote the distance between the supports.
(a) By analyzing the force due to each spring, show that the displacement x = x(t)
of the weight from the support attached to the first spring satisfies
(b) Use the result of part (a) to show that the equilibrium value for x(t) is
(c) Show that the constant value x_e of part (b) is the solution to the differential
equation that satisfies the initial conditions x(0) = x_e, ẋ(0) = 0.
29. The recoil mechanism of an artillery piece is designed containing a linearly
damped spring mechanism. The spring stiffness h and the damping factor k should be
chosen so that after firing the gun barrel will tend to its original position before
firing without additional oscillation. We'll assume a given initial velocity V_0 and
mass m for the gun barrel during recoil, and also a fixed maximum recoil distance E,
always attained by the barrel.
(a) How should h and k be chosen so that under these conditions the gun barrel
undergoes critical damping after firing?
(b) Write the differential equation for displacement x(t) so as to display
dependence on the parameters V_0 and E instead of h and k.
30. During construction of a suspension bridge, two towers have been erected, and a
10-ton weight is suspended between the towers by a cable anchored to both towers.
Because of the elastic properties of the cable and the towers, it takes a ½-ton
force to move the weight sideways by 0.1 feet. An earthquake moves the base of each
tower sideways with identical displacements of the form 0.25 cos 6t feet in t
seconds. Assume a linear model for the lateral force on the weight and that damping
is negligible. Find Hooke's constant h and the natural unforced frequency of
oscillation for the weight.
31. The differential equation
mẍ + kẋ + hx = a_0 sin ωt
determines the displacement x(t) of a damped spring with external forcing
f(t) = a_0 sin ωt as a function of time t. A frequency ω that maximizes the
amplitude of x(t) is called a maximum resonance frequency. This exercise asks you
to investigate the maximum resonance frequency for fixed ω, and under various
assumptions about the mechanism.
(a) Consider the mechanically ideal case where k = 0. Show that choosing ω to equal
the natural circular frequency ω_0 = √(h/m) produces a response x(t) that contains
the factor t and hence has deviations from the equilibrium position x = 0 that
become arbitrarily large as t increases. (Thus there is no theoretical maximum
resonance frequency in this case, though in practice the maximum response will be
limited by the structural capacity of the mechanism to accommodate wide deviations
from equilibrium.)
(b) k is a fixed positive number in Example 5 of the text. Show that the
steady-state displacement x_p(t) has maximum amplitude when h and m are chosen so
that √(h/m) = ω and that the maximum amplitude is a_0(ωk)^{-1}. (Making such choices
for h and m constitutes tuning of the mechanism for maximum response.)
(c) Continuing with the ideas of part (b), show that we get a small response
amplitude in the steady-state solution to a given forcing frequency ω by making
|h - ω^2 m| large.
32. In the previous exercise we assumed the input frequency in Example 5 of the text
to be fixed and considered the effect of varying h and m in the differential
equation. Suppose now that h, m and k are fixed positive numbers and that we want to
choose ω so as to maximize the amplitude of the response x_p(t).
(a) Show that the amplitude factor
p(ω) = ((h - ω^2 m)^2 + k^2 ω^2)^{-1/2}
of x_p(t) is maximized when the function F(ω) = (h - ω^2 m)^2 + k^2 ω^2 is minimized.
(b) Show that if k^2 < 2hm, then F'(ω) = 0 when ω^2 = (2hm - k^2)/(2m^2). Conclude
from this that in this case the maximum response amplitude
2m|a_0| / (k√(4hm - k^2)) occurs for ω = ω_0 √(1 - k^2/(2hm)),
where ω_0 = √(h/m) is the natural circular frequency of the undamped (k = 0),
unforced (a_0 = 0) mechanism.
(c) Show that 4mh - k^2 > 0 is equivalent to assuming that the transient response
x_h(t) is oscillatory.
(d) Show that if k^2 ≥ 2hm then F'(ω) = 0 only when ω = 0 and that F(ω) is strictly
increasing for ω ≥ 0. Hence conclude that in this case the maximum response is
|a_0|/h, and occurs only for the constant f(t) = a_0.
33. The purpose of this exercise is to show that if m, k and h are positive
constants, then for large enough t each particular solution of
mẍ + kẋ + hx = b_0 sin ωt is bounded by a number proportional to |b_0|.
(a) Show that the solutions of the associated homogeneous equation all tend to zero
as t → ∞. (These are the transient solutions.)
(b) Show that
x_p(t) = [-b_0 / (k^2 ω^2 + (h - mω^2)^2)] (kω cos ωt - (h - mω^2) sin ωt)
is a particular solution and that it satisfies
|x_p(t)| ≤ |b_0| / √(k^2 ω^2 + (h - mω^2)^2).
[Hint: See Example 5 of the text.]
(c) Show how to conclude from the results of (a) and (b) that every solution is
bounded for t ≥ 0.
34. The purpose of this exercise is to observe the effect on the individual
solutions of the initial-value problem
ẍ + hx = sin ωt, x(0) = ẋ(0) = 0
of letting the parameter ω approach the positive constant √h. (The differential
equation represents a highly idealized situation from a physical point of view,
because there is no damping term.)
(a) Show that the unique solution to the initial-value problem with ω ≠ √h is
x(t) = [1/(h - ω^2)] (sin ωt - (ω/√h) sin √h t),
and that the solution satisfies
|x(t)| ≤ (1 + ω/√h) / |h - ω^2| for all values of t.
(b) Show that as ω approaches √h, the solution values found in part (a) approach
(1/(2h)) (-√h t cos √h t + sin √h t).
Show also that in contrast to the inequality in part (a), this function oscillates
with arbitrarily large amplitude as t tends to infinity.
(c) Find an initial-value problem that has the function obtained in part (b) as a
solution. [Hint: What happens to the original differential equation as ω → √h?]
35. Suppose that an undamped, but forced, oscillator has the form
ẍ + 2x = Σ_{k=0}^{n} a_k cos kt.
(a) Use the linearity of the differential equation to show that it has the
particular solution
x_p(t) = Σ_{k=0}^{n} [a_k / (2 - k^2)] cos kt.
The trigonometric sum on the right in the differential equation is an example of a
Fourier series, discussed in general in Chapter 14. An extension of such a sum to an
infinite series can represent a very general class of functions.
(b) How does the solution in part (a) change if the left side of the differential
equation is replaced by ẍ + 4x and the right side remains the same?
36. Consider the differential equation
ẍ + 25x = 16 cos 3t.
(a) Show that the equation has general solution
x(t) = c_1 cos 5t + c_2 sin 5t + cos 3t.
(b) Show that the particular solution satisfying x(0) = 0, ẋ(0) = 0 is
x_p(t) = cos 3t - cos 5t.
(c) Show that cos 3t - cos 5t = 2 sin 4t sin t.
(d) Use the result of part (c) to sketch the graph of the particular solution found
in part (b) for 0 ≤ t ≤ 2π.
(b) What is the phase difference between cos(ωt - α) and sin(ωt - β)? [Hint:
Express the second one in terms of cosine.]
38. Let f(t) = sin αt + sin βt where α and β are positive numbers.
(a) Show that if β = rα for some rational number r then f(t) is periodic for some
period p > 0, i.e., f(t + p) = f(t) for all t. Show also that p can be expressed as
(possibly different) integer multiples of both π/α and π/β.
*(b) Prove that if an f(t) of the form given above is periodic with period p > 0
then β = rα for some rational number r. Thus for example sin t + sin √2 t can't be
periodic. [Hint: Check that f(p) = 0 and f''(p) = 0. Then conclude that
(α^2 - β^2) sin αp = (α^2 - β^2) sin βp = 0, so that either α = ±β or else αp and
βp are integer multiples of π.]
39. Suppose we want to construct a damped harmonic oscillator with Hooke constant
h = 2 and damping constant k = 3. What is the lower limit m_0 for the mass m such
that oscillatory solutions are possible? Does oscillation occur for m = m_0?
In Exercises 40 to 43, suppose a physical process is accurately modeled by a
differential equation of the form
m d^2x/dt^2 + k dx/dt + hx = 0,
with m, k and h positive constants. It may be possible by observation to draw
conclusions about the parameters in the underlying process. Given each of the
following sets of information about the constants and a solution, find the
implications for the other constants.
40. m = 1, x(t) = e^{-3t} cos 6t
41. h = 1, x(t) = e^{-t} sin 5t
Recall that the routine so far has been to solve the homogeneous equation by finding
the roots of the characteristic equation, then solve the general nonhomogeneous
equation by an integration that involves not only $f(t)$ but two independent
homogeneous solutions. If we are willing to assume that $f(t)$ is defined for all $t \ge 0$,
and that $f(t)$ and the solution $y(t)$ don't grow too rapidly as $t \to \infty$, we can use
an alternative method that incorporates all of these steps and has for historical
reasons achieved considerable popularity in electrical engineering and control theory.
Experience with exponential integrating factors shows that it's natural to multiply a
solution $y(t)$ by a factor $e^{-st}$, where $s$ is some real or complex number, to get a
product $e^{-st}y(t)$. What seems less natural, but nevertheless turns out to be effective,
is to integrate with respect to $t$ between $0$ and $\infty$. This gives us an improper integral,
leading for example to the calculation
5.1
$$\int_0^\infty e^{-st}e^{at}\,dt = \frac{1}{s-a},$$
which holds when $(s - a) > 0$. We'll use this result repeatedly, and to verify it we
first compute the partial integral
$$\int_0^T e^{-(s-a)t}\,dt = \Bigl[-\frac{1}{s-a}e^{-(s-a)t}\Bigr]_0^T = -\frac{1}{s-a}e^{-(s-a)T} + \frac{1}{s-a}.$$
When $T \to \infty$, the exponential factor tends to zero, so letting $T \to \infty$ on both
sides proves the formula correct. If for some real or complex numbers $s$, an integral
of the form
which we prove under the assumptions (i) the improper integrals are convergent, and
(ii) $\lim_{t\to\infty} e^{-st}y(t) = 0$. Indeed, integration by parts of the partial integral on the left
gives
$$\int_0^T e^{-st}y'(t)\,dt = \bigl[e^{-st}y(t)\bigr]_0^T + s\int_0^T e^{-st}y(t)\,dt
= e^{-sT}y(T) - y(0) + s\int_0^T e^{-st}y(t)\,dt.$$
Because of assumption (ii), the first term on the right tends to 0 as $T \to \infty$.
Equation 5.2 follows by letting $T \to \infty$ in the preceding equation.
The example
$$y' + 2y = 0, \quad y(0) = 3,$$
is too simple to show the real advantages of using Laplace transforms, but it does
illustrate the general principles involved. We form the Laplace transform of both sides
of the differential equation by multiplying both sides by $e^{-st}$ and then integrating
from $0$ to $\infty$ with respect to $t$. The result is
$$-3 + s\int_0^\infty e^{-st}y\,dt + 2\int_0^\infty e^{-st}y\,dt = 0.$$
Here we have used the assumption that $y(0) = 3$. We also rely on our knowledge of
the exponential nature of the solution to justify the assumptions (i) and (ii) needed
for the application of Equation 5.2. The previous equation can now be solved for the
Laplace transform of the solution $y(t)$ in the form
$$\int_0^\infty e^{-st}y\,dt = \frac{3}{s+2}. \qquad (1)$$
Thus we have found not the solution $y(t)$, but its Laplace transform. However if we
set $a = -2$ in Equation 5.1, and then multiply by 3, we get
$$\int_0^\infty e^{-st}\,3e^{-2t}\,dt = \frac{3}{s+2}. \qquad (2)$$
Since we already know from the general theory of this chapter that $y(t)$ must be an
exponential solution, there remains only the one question of the constants involved,
and we see by comparing Equations (1) and (2) that the solution
$$y(t) = 3e^{-2t}$$
satisfies our requirements.
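As a numerical sanity check on this example, the short sketch below approximates the transform integral of $3e^{-2t}$ and compares it with $3/(s+2)$. It is only an illustration: the function name `laplace_numeric`, the truncation point `upper`, and the use of Simpson's rule are choices made here, not part of the text's method.

```python
import math

def laplace_numeric(f, s, upper=60.0, n=60000):
    """Approximate the integral of e^{-st} f(t) over [0, upper] by Simpson's rule.

    The infinite upper limit is truncated at `upper` (an assumption that is
    harmless only when the integrand decays quickly); n must be even.
    """
    h = upper / n
    total = f(0.0) + math.exp(-s * upper) * f(upper)
    for i in range(1, n):
        t = i * h
        weight = 4 if i % 2 else 2
        total += weight * math.exp(-s * t) * f(t)
    return total * h / 3

def y(t):
    # Candidate solution of y' + 2y = 0, y(0) = 3.
    return 3 * math.exp(-2 * t)

for s in (0.5, 1.0, 3.0):
    # The two printed values should agree closely for each s.
    print(s, laplace_numeric(y, s), 3 / (s + 2))
```

Agreement for several values of $s$ is consistent with, though of course not a proof of, the identification of $y(t) = 3e^{-2t}$ as the function whose transform is $3/(s+2)$.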
To apply the Laplace transform to differential equations with order higher than
one, we need a simple extension of Equation 5.2. To simplify the notation, we write
5.2 in the form
$$\mathcal{L}[y'](s) = -y(0) + s\mathcal{L}[y](s).$$
Applying this equation to $\mathcal{L}[y''](s)$, the Laplace transform of $y''$, gives
$$\begin{aligned}
\mathcal{L}[y] &= Y(s),\\
\mathcal{L}[y'] &= -y(0) + sY(s) = -1 + sY(s),\\
\mathcal{L}[y''] &= -y'(0) - sy(0) + s^2Y(s) = -s + s^2Y(s).
\end{aligned}$$
Because integration from $0$ to $\infty$ and multiplication by $e^{-st}$ are both linear operations,
the equation
$$\mathcal{L}[y'' - y' - 2y] = \mathcal{L}[3e^t]$$
simplifies to
$$\mathcal{L}[y''] - \mathcal{L}[y'] - 2\mathcal{L}[y] = 3\mathcal{L}[e^t].$$
The expressions found for $\mathcal{L}[y]$, $\mathcal{L}[y']$, and $\mathcal{L}[y'']$, together with 5.1, allow us to
write the equation as
$$[-s + s^2Y(s)] - [-1 + sY(s)] - 2[Y(s)] = 3\,\frac{1}{s-1}.$$
Rearrangement gives
$$(s^2 - s - 2)Y(s) = \frac{3}{s-1} + s - 1 = \frac{s^2 - 2s + 4}{s-1}$$
or
$$Y(s) = \frac{s^2 - 2s + 4}{(s-1)(s^2 - s - 2)}.$$
Having found an expression for $Y(s)$, our problem is now to identify precisely the
solution $y(t)$ that satisfies $\mathcal{L}[y](s) = Y(s)$. Because $Y(s)$ is a rational function, it can
theoretically always be broken down according to the partial fraction decomposition
usually associated with the computation of indefinite integrals. In our example the
decomposition works because the denominator of $Y$ factors. We need to determine
the coefficients $A$, $B$, and $C$ in
$$\frac{s^2 - 2s + 4}{(s-1)(s+1)(s-2)} = \frac{A}{s-1} + \frac{B}{s+1} + \frac{C}{s-2}.$$
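One way to find $A$, $B$, and $C$ when the denominator has distinct linear factors is the "cover-up" shortcut: the coefficient of $1/(s - r)$ is the numerator evaluated at $r$ divided by the product of the remaining factors evaluated at $r$. The sketch below applies that shortcut with exact rational arithmetic; the helper name `cover_up` is introduced here for illustration and the shortcut itself is a standard technique rather than something the text prescribes.

```python
from fractions import Fraction

def numerator(s):
    # Numerator of Y(s) in the example: s^2 - 2s + 4.
    return s * s - 2 * s + 4

# Zeros of the denominator (s - 1)(s + 1)(s - 2).
roots = [Fraction(1), Fraction(-1), Fraction(2)]

def cover_up(root):
    """Coefficient of 1/(s - root) for a denominator with distinct linear factors."""
    denom = Fraction(1)
    for other in roots:
        if other != root:
            denom *= root - other
    return numerator(root) / denom

A, B, C = (cover_up(r) for r in roots)
print(A, B, C)  # -3/2 7/6 4/3

# With y(t) = A e^t + B e^{-t} + C e^{2t}, the initial values implied by the
# transforms above are y(0) = 1 and y'(0) = 0.
print(A + B + C, A - B + 2 * C)  # 1 0
```

Reading off inverse transforms term by term from $1/(s-a)$ then suggests $y(t) = -\tfrac32 e^{t} + \tfrac76 e^{-t} + \tfrac43 e^{2t}$, which can be checked by substitution.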
$$\mathcal{L}[t^n] = \frac{n!}{s^{n+1}}, \quad n = 0, 1, 2, \ldots$$
provides the transform of $f(t) = t^n$, it also provides, after division by $n!$, the inverse
transform
$$\mathcal{L}^{-1}\Bigl[\frac{1}{s^{n+1}}\Bigr] = \frac{t^n}{n!}, \quad n = 0, 1, 2, \ldots$$
For the proof that for every Laplace transform $Y$ there is a unique function $y$ such
that $\mathcal{L}[y] = Y$, we can refer to more theoretical accounts of the subject. All the
entries in Table 11.1 are computed using elementary integration techniques.
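TABLE 11.1 Table of Laplace transforms.

      $f(t)$                                              $\mathcal{L}[f](s)$
  1.  $1$                                                 $\dfrac{1}{s}$
  2.  $t$                                                 $\dfrac{1}{s^2}$
  3.  $t^n$, $n = 0, 1, 2, \ldots$                        $\dfrac{n!}{s^{n+1}}$
  4.  $e^{at}$                                            $\dfrac{1}{s-a}$
  5.  $te^{at}$                                           $\dfrac{1}{(s-a)^2}$
  6.  $t^ne^{at}$, $n = 0, 1, 2, \ldots$                  $\dfrac{n!}{(s-a)^{n+1}}$
  7.  $\sin bt$                                           $\dfrac{b}{s^2+b^2}$
  8.  $\cos bt$                                           $\dfrac{s}{s^2+b^2}$
  9.  $t\sin bt$                                          $\dfrac{2bs}{(s^2+b^2)^2}$
 10.  $t\cos bt$                                          $\dfrac{s^2-b^2}{(s^2+b^2)^2}$
 11.  $e^{at}\sin bt$                                     $\dfrac{b}{(s-a)^2+b^2}$
 12.  $e^{at}\cos bt$                                     $\dfrac{s-a}{(s-a)^2+b^2}$
 13.  $e^{at} - e^{bt}$                                   $\dfrac{a-b}{(s-a)(s-b)}$
 14.  $ae^{at} - be^{bt}$                                 $\dfrac{(a-b)s}{(s-a)(s-b)}$
 15.  $(b-c)e^{at} + (c-a)e^{bt} + (a-b)e^{ct}$           $\dfrac{(a-b)(b-c)(a-c)}{(s-a)(s-b)(s-c)}$
 16.  $\sin bt - bt\cos bt$                               $\dfrac{2b^3}{(s^2+b^2)^2}$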
$$\frac{A_j}{(s-a)^j}, \quad j = 1, 2, \ldots, m.$$
2. If the denominator $Q(s)$ has the factor $(s^2 + ps + q)^n$ as the highest power of
$s^2 + ps + q$ that divides $Q(s)$, then include in the decomposition of $P(s)/Q(s)$
the fractions of the form
$$\frac{B_ks + C_k}{(s^2 + ps + q)^k}, \quad k = 1, 2, \ldots, n.$$
$$F(s) = \frac{s+1}{(s-1)^2(s^2+1)},$$
we decompose the function into a sum of fractions as follows:
$$\frac{s+1}{(s-1)^2(s^2+1)} = \frac{A}{s-1} + \frac{B}{(s-1)^2} + \frac{Cs+D}{s^2+1}.$$
To compute $B$, we can multiply through by $(s-1)^2$ and then set $s = 1$. We get
$B = 1$. The same kind of trick doesn't apply directly to the other coefficients, but
if we subtract $1/(s-1)^2$ from both sides we find we can cancel $(s-1)$ on the left
to get
$$\frac{-s^2+s}{(s-1)^2(s^2+1)} = \frac{-s}{(s-1)(s^2+1)} = \frac{A}{s-1} + \frac{Cs+D}{s^2+1}.$$
Multiplying through by $(s-1)$ and setting $s = 1$ now gives $A = -\tfrac12$. With this
value of $A$, clearing denominators leaves
$$\tfrac12 s^2 - s + \tfrac12 = Cs^2 + (D - C)s - D.$$
We equate coefficients of like powers on both sides and find that $C = \tfrac12$ whereas
$D = -\tfrac12$. The result is that
$$\frac{s+1}{(s-1)^2(s^2+1)} = \frac{-\tfrac12}{s-1} + \frac{1}{(s-1)^2} + \frac{\tfrac12 s}{s^2+1} - \frac{\tfrac12}{s^2+1}.$$
on both sides of the equation. The resulting linear equations can then be solved for
the coefficients.
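A decomposition such as the one found for $(s+1)/\bigl((s-1)^2(s^2+1)\bigr)$ can be checked by evaluating both sides at a few rational points. The sketch below does this with exact arithmetic; the sample points are arbitrary choices made here. Because the difference of the two sides, once denominators are cleared, is a polynomial of degree at most 3, agreement at more than 3 points confirms the identity.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def lhs(s):
    # Original rational function F(s).
    return (s + 1) / ((s - 1) ** 2 * (s * s + 1))

def rhs(s):
    # Proposed decomposition with A = -1/2, B = 1, C = 1/2, D = -1/2.
    return (-HALF / (s - 1)
            + 1 / (s - 1) ** 2
            + HALF * s / (s * s + 1)
            - HALF / (s * s + 1))

# Any points other than the pole s = 1 will do.
samples = [Fraction(k) for k in (-3, -2, 0, 2, 3, 5)] + [Fraction(1, 3)]
print(all(lhs(s) == rhs(s) for s in samples))  # True
```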
$$(s^2 + 4)Y(s) = \frac{6}{s^2+4} + s - 1$$
or
$$Y(s) = \frac{6}{(s^2+4)^2} + \frac{s}{s^2+4} - \frac{1}{s^2+4}.$$
$$Y(s) = \frac{3}{8}\,\frac{16}{(s^2+4)^2} + \frac{s}{s^2+4} - \frac{1}{2}\,\frac{2}{s^2+4}.$$
EXERCISES
In Exercises 1 to 4, compute directly, assuming that $y(t)$ has a transform $Y(s)$
and is such that all required integrals and limits exist and are finite.
1. Integrate to verify that
for $s > a$.
2. Use integration by parts to verify that
$$\mathcal{L}[t](s) = \int_0^\infty te^{-st}\,dt = \frac{1}{s^2}, \quad \text{for } s > 0.$$
3. Integrate once by parts to verify that if $y(0) = 1$ then
$$\int_0^\infty e^{-st}y'(t)\,dt = sY(s) - 1.$$
4. Integrate twice by parts to verify that if $y(0) = 2$ and $y'(0) = 3$, then
$$\int_0^\infty e^{-st}y''(t)\,dt = s^2Y(s) - 2s - 3.$$
In Exercises 5 to 8, by computing the appropriate integral, or by using
Table 11.1 of Laplace transforms, compute $\mathcal{L}[f](s)$ where $f(t)$ is as follows:
5. $t\sin 2t$
6. $\cos t + 2\sin t$
7. $t^2 + 2t - 1$
8. $\cos(t + a)$
9. $(2t + 1)e^{3t}$
10. $e^{t} + e^{-t}$
$$\mathcal{L}[g](s) = e^{-as}\mathcal{L}[f(t + a)](s).$$
(c) Sketch the graph of $H(t) - H(t - 1)$.
(d) Solve the differential equation $y'' = H(t - a)$, $0 < a$, with initial
conditions $y(0) = 1$, $y'(0) = 0$.
This formula makes it possible to determine something about the long-run
behavior of $f$ from the behavior of $\mathcal{L}[f](s)$ near $s = 0$, without finding $f$.
SECTION 6 CONVOLUTION
Let us review the solution of the second-order differential equation
$$(s^2 + ps + q)Y(s) = F(s) + y'(0) + sy(0) + py(0).$$
The polynomial factor $P(s) = s^2 + ps + q$ on the left is the characteristic polynomial
of the operator $D^2 + pD + q$. The reciprocal $Q(s) = 1/P(s)$ is called the transfer
function of the operator, and if we multiply by $Q(s)$, or divide by $P(s)$, we get the
formula
$$Y(s) = \frac{F(s) + y'(0) + sy(0) + py(0)}{P(s)}$$
for the Laplace transform $Y = \mathcal{L}[y]$. The remaining step is to find the inverse
transform $y(t) = \mathcal{L}^{-1}[Y](t)$. The essence of the method is to use the Laplace transform
to reduce the solution of the problem to some routine algebraic manipulations.
In addition to the table of specific Laplace transforms in the previous section
(Table 11.1), there are a number of general formulas, such as Formula 5.3, that
are useful in solving problems. The most important of these answers the following
question: If $F(s)$ and $G(s)$ are the Laplace transforms of $f(t)$ and $g(t)$, respectively,
what function has Laplace transform equal to the product $F(s)G(s)$? It turns out that
under rather general hypotheses there is an answer, given by the convolution integral
The function $f * g(t)$ is called the convolution of the functions $f(t)$ and $g(t)$ and is
defined for $t \ge 0$, provided that $f$ and $g$ are integrable on every finite interval. The
convolution $f * g$ is to be thought of as a kind of product of $f$ and $g$ and it turns
out that $f * g = g * f$, although this is not obvious from the definition. The basic
information about convolutions is summarized as follows.
6.1 Theorem. Let $f(u)$ and $g(u)$ be integrable on $0 \le u \le t$ for every positive
$t$; then $f * g$ and $g * f$ both exist and are equal, that is, convolution is commutative:
$$f * g = g * f.$$
If $|f(t)|$ and $|g(t)|$ are such that $\mathcal{L}[|f|](s)$ and $\mathcal{L}[|g|](s)$ are both finite, then
$$\mathcal{L}[f * g](s) = \bigl(\mathcal{L}[f](s)\bigr)\bigl(\mathcal{L}[g](s)\bigr).$$
Proof. The first statement follows from changing variable in the definition of $f * g$.
We have, on replacing $u$ by $t - v$,
Under our assumptions we can interchange the order of integration using a theorem
called Fubini's theorem. Then we have
Because we have assumed $f(t)$ and $g(t)$ are zero for $t < 0$, the inner integral is zero
for $t < 0$. It follows that we need the $t$ integration only for $0 \le t < \infty$. Similarly
we need the $u$ integration only for $0 \le u \le t$. Hence
$$= \mathcal{L}[f * g](s),$$
which is what we wanted to prove. •
EXAMPLE 1 From Table 11.1, we see that $\mathcal{L}[t](s) = 1/s^2$ and $\mathcal{L}[\sin t](s) = 1/(s^2+1)$. It follows
from Theorem 6.1 that
$$\frac{1}{s^2}\,\frac{1}{s^2+1} = \mathcal{L}\Bigl[\int_0^t (t-u)\sin u\,du\Bigr](s).$$
Holding $t$ fixed, we can use integration by parts to show that
$$\int_0^t (t-u)\sin u\,du = t - \sin t.$$
We could have obtained the same result by computing a partial fraction decomposition
of the form
$$\frac{1}{s^2}\,\frac{1}{s^2+1} = \frac{A}{s} + \frac{B}{s^2} + \frac{Cs+D}{s^2+1},$$
and then finding the inverse transform of each term.
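The convolution integral in this example, $\int_0^t (t-u)\sin u\,du$, is easy to approximate numerically, which gives an independent check on the closed form $t - \sin t$. The sketch below uses Simpson's rule; the function name `convolve` and the step count are illustrative choices, not notation from the text.

```python
import math

def convolve(f, g, t, n=2000):
    """Approximate the integral of f(t - u) g(u) for u in [0, t] (Simpson's rule, n even)."""
    if t == 0:
        return 0.0
    h = t / n
    # Endpoint values at u = 0 and u = t.
    total = f(t) * g(0.0) + f(0.0) * g(t)
    for i in range(1, n):
        u = i * h
        total += (4 if i % 2 else 2) * f(t - u) * g(u)
    return total * h / 3

for t in (0.5, 1.0, 2.0, 5.0):
    # Numerical convolution of t with sin t, next to the closed form t - sin t.
    print(t, convolve(lambda x: x, math.sin, t), t - math.sin(t))
```

Swapping the roles of the two functions in the call should give the same values, in line with the commutativity stated in Theorem 6.1.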
Table 11.2 lists the most frequently used general properties of the Laplace transform.
The entries that haven't already been discussed follow from elementary calculus
techniques. The table omits the precise conditions under which each formula
holds. The distinction between Formulas 2 and 3 and Equations 5.2 and 5.3 of the
previous section occurs because in 5.2 and 5.3 we assumed that we were dealing
with solutions of differential equations and that these solutions had continuous
derivatives at $t = 0$. The corresponding formulas in Table 11.2 are valid under the
weaker assumption that $\lim_{t\to 0+} f^{(k)}(t)$ exists, but is not necessarily equal to $f^{(k)}(0)$.
(See Exercise 8 of Section 5.)
TABLE 11.2 General properties of the Laplace transform.
7. $\mathcal{L}[tf(t)](s) = -\dfrac{d}{ds}\mathcal{L}[f](s)$
8. $\mathcal{L}[f * g](s) = \mathcal{L}[f](s)\,\mathcal{L}[g](s)$
s(s2 + 1) = :;- [sm t](s)
Hence
.(, -1 [ 1 ] = - cos t + l.
s(s 2 + 1)
Repeating the application of Fonnula 4 gives
1
= .C [ [' (- cos u + l) du]
s 2 (s 2 + l) lo
= .(,[- sin t + t].
r-1[ s (s l + l) ] =-smt+t,
....,
2
·
2
d s
,(,[tcost](s) = -- - -
ds s 2 + 1
s2 - 1
2
= (s2 + 1)2 ·
Another application of the same formula gives
FIGURE 11.8
2 d s2 - 1
,(,[t cost](s) =- ds (s 2 02+
2s 3 - 6s
= (s2 + 1)3 ·
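The two transforms just obtained by differentiating with respect to $s$ can be compared against direct numerical evaluation of the defining integral. The sketch below is only a spot check at two values of $s$; the helper `laplace_numeric`, the truncation of the upper limit, and the use of Simpson's rule are assumptions made here for illustration.

```python
import math

def laplace_numeric(f, s, upper=80.0, n=80000):
    """Approximate the integral of e^{-st} f(t) over [0, upper] by Simpson's rule (n even)."""
    h = upper / n
    total = f(0.0) + math.exp(-s * upper) * f(upper)
    for i in range(1, n):
        t = i * h
        total += (4 if i % 2 else 2) * math.exp(-s * t) * f(t)
    return total * h / 3

def t_cos(t):
    return t * math.cos(t)

def t2_cos(t):
    return t * t * math.cos(t)

for s in (1.0, 2.0):
    # Each line prints s, the numerical integral, and the closed-form transform.
    print(s, laplace_numeric(t_cos, s), (s * s - 1) / (s * s + 1) ** 2)
    print(s, laplace_numeric(t2_cos, s), (2 * s ** 3 - 6 * s) / (s * s + 1) ** 3)
```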
EXAMPLE 4 To apply Formula 6 of Table 11.2 to the function $f(t) = t$, we define $f(t) = 0$ for
$t < 0$. The graphs of $f(t)$ and $f(t - a)$ are in Figure 11.8 for $a = 1$ and $a = 2$.
Each function is zero where it's not positive. From Formula 6 we find
Similarly,
$$\mathcal{L}[f(t - 2)](s) = e^{-2s}\mathcal{L}[f(t)](s) = e^{-2s}\mathcal{L}[t](s) = e^{-2s}\,\frac{1}{s^2}.$$
EXERCISES
In Exercises 1 to 4, find the convolution f * g of the In Exercises 9 to 14, use the formulas in Tables l 1.1
given pair of functions. and 11.2, find the inverse Laplace transform of the given
function.
1. f(t) = t, = e-t, t ~ 0
g(t)
e-2s
1
2. f(t) = t , g(t) = (1 2 + 1), t :=: 0
2 9
· s(s + 3) 2 IO. s-(-s2_+_4_)
3. f(t) = 1, g(t) = I, t :=: 0 1 1
11. -s2_+_2s_+_2 12. -s2_+_1
4. f(t) = t, g(t) = cost, t ~ 0
13. (e-l· + 1)/s 14. _ s _
In Exercises 5 to 8, use the convolution of two functions s2 + 10
to find the inverse Laplace transform of each of the given
products of Laplace transforms. In Exercises 15 to 18, solve the given initial-value prob-
lem and check by substitution.
I 1
5 6 = sin2t + 1, y(O) = 1, y'(O) = -1
· s2(s + 1) · (s - l)(s - 2) 15. y" - y
1 O
00
e- st f (t) dt = 1-
00
e•
sinu
- - du,
1
(nu)5
22. One possible definition of the gamma function, denoted
by f(z), is and that this is finite for all s > 0. [Hint:
f(z) = loo ,z-le-1 dt, z > 0.
Express the second integral as an alternating infinite
series.]
(b) Show that J;c'
e- st lf(t)ldt =
+oo for all s.
(a) Use integration by parts to show that [Hint: Compare the analogue of the second inte-
gral above with a smaller, but divergent, infinite
series.]
f(z + 1) = zf(;:). (c) Prove that if .C[lf (t)l](s) is finite then so is F(s) =
(b) Deduce from part (a) that f(n + 1) = n!, for .C[f(t)](s). [Hint: Compare I Jff e- st f(t)dtl and
n =0, 1,2, .... J;' e- st lf(t)ldt.]
7.1 Existence and Uniqueness Theorem. Assume the function of three variables
$f(t, y, z)$ and its two first-order partial derivatives $f_y(t, y, z)$ and $f_z(t, y, z)$ are
continuous for $t$ in an interval $I$ and for all $(y, z)$ in an open rectangle $R$ containing
$(y_0, z_0)$ shown in the figure. Then the initial-value problem
EXAMPLE 1 Of the three equations preceding Theorem 7.1, $\ddot y = -\sin y$ is the only one that
satisfies the boundedness condition. In this example $|f_y(t, y, z)| = |\cos y| \le 1$, and
$f_z(t, y, z) = 0$, so the initial-value problem has a solution for all real $t$, starting with
arbitrary $t_0$, $y_0$, and $z_0$.
to solve first for z and then for y. In the next example we solve a linear equation with
a variable coefficient, one that we can't solve using the constant-coefficient methods
of the previous sections in this chapter.
$$\ddot y = \frac{1}{t}\dot y, \quad y(t_0) = y_0, \quad \dot y(t_0) = z_0,$$
with $t_0 \ne 0$. Letting $z = \dot y$, this problem is equivalent to
$$\dot z = \frac{1}{t}z, \quad z(t_0) = z_0,$$
$$\dot y = z, \quad y(t_0) = y_0.$$
The top equation doesn't contain $y$ and is first-order linear with integrating factor
$\exp\bigl(\int(-1/t)\,dt\bigr) = 1/t$, so the integrable form of the equation is
$$(1/t)\dot z - (1/t^2)z = 0 \quad\text{or}\quad \frac{d}{dt}(z/t) = 0, \quad\text{with solution } z = c_1t.$$
The initial condition on $z$ requires $c_1 = z_0/t_0$, so $z(t) = z_0t/t_0$. To find $y$ we
integrate $\dot y = z$ to get $y = \tfrac12 z_0t^2/t_0 + c_2$. Finally, the initial condition $y(t_0) = y_0$
requires $c_2 = y_0 - \tfrac12 z_0t_0$ so $y = \tfrac12 z_0t^2/t_0 + y_0 - \tfrac12 z_0t_0$. A routine check shows this
to be the solution to the initial-value problem for $t > 0$ when $t_0 > 0$, or for $t < 0$
when $t_0 < 0$.
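The "routine check" can also be carried out numerically. The sketch below builds the formula $y = \tfrac12 z_0t^2/t_0 + y_0 - \tfrac12 z_0t_0$ for one arbitrary choice of $t_0$, $y_0$, $z_0$ (sample values chosen here, not taken from the text) and uses central differences to estimate $\ddot y - \dot y/t$, which should vanish.

```python
def make_solution(t0, y0, z0):
    """Return y(t) = (1/2) z0 t^2 / t0 + y0 - (1/2) z0 t0."""
    def y(t):
        return 0.5 * z0 * t * t / t0 + y0 - 0.5 * z0 * t0
    return y

def residual(y, t, h=1e-3):
    """Central-difference estimate of y'' - y'/t at time t (t must be nonzero)."""
    first = (y(t + h) - y(t - h)) / (2 * h)
    second = (y(t + h) - 2 * y(t) + y(t - h)) / (h * h)
    return second - first / t

t0, y0, z0 = 2.0, 1.0, 3.0  # illustrative initial data with t0 > 0
y = make_solution(t0, y0, z0)

print(y(t0))                                     # initial position, equals y0
print((y(t0 + 1e-3) - y(t0 - 1e-3)) / 2e-3)      # estimate of the initial velocity z0
print([residual(y, t) for t in (0.5, 1.0, 4.0)]) # each entry should be near zero
```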
$$\frac{dz}{dt} = \frac{dz}{dy}\,\frac{dy}{dt},$$
or since $\dfrac{dy}{dt} = z$, $\dfrac{dz}{dt} = z\dfrac{dz}{dy}$.
Thus the shortcut to remember is the replacement of $\dfrac{dz}{dt}$ by $z\dfrac{dz}{dy}$ in the top equation
of our first-order system, yielding
$$z\frac{dz}{dy} = f(y, z),$$
$$\frac{dy}{dt} = z.$$
If we can solve the top equation for $z$ as a function of $y$ then we can put the result
in the bottom equation with some hope of solving for $y = y(t)$. Note that we would
then have $\dot y = z(y(t))$ as a check on the accuracy of our computations.
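$$\frac{dz}{dy} = y, \quad\text{with solutions } z = \tfrac12 y^2 + c_1.$$
Using the two initial conditions $z(0) = \tfrac12$ and $y(0) = 0$ we see that $c_1 = \tfrac12$, so
$z = \tfrac12 y^2 + \tfrac12$. (Note that $z$ is never zero.) Putting this expression for $z$ into the
bottom equation of the system gives the equation
$$\frac{dy}{dt} = \tfrac12 y^2 + \tfrac12.$$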
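The resulting first-order equation, $dy/dt = \tfrac12 y^2 + \tfrac12$ with $y(0) = 0$, is separable, and differentiation confirms that $y = \tan(t/2)$ satisfies it for $-\pi < t < \pi$. As a cross-check, the sketch below integrates the equation with the classical fourth-order Runge-Kutta method (a technique chosen here for illustration, not one used at this point in the text) and compares with $\tan(t/2)$.

```python
import math

def rk4(f, y0, t_end, steps):
    """Classical fourth-order Runge-Kutta for the autonomous equation y' = f(y), from t = 0."""
    h = t_end / steps
    y = y0
    for _ in range(steps):
        k1 = f(y)
        k2 = f(y + 0.5 * h * k1)
        k3 = f(y + 0.5 * h * k2)
        k4 = f(y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return y

def slope(y):
    # Right-hand side of dy/dt = (1/2) y^2 + 1/2.
    return 0.5 * y * y + 0.5

for t in (0.5, 1.0, 2.0):
    # Numerical solution next to the closed form tan(t/2).
    print(t, rk4(slope, 0.0, t, 1000), math.tan(t / 2))
```

The comparison is restricted to $t < \pi$ because $\tan(t/2)$, and with it the solution, becomes unbounded as $t$ approaches $\pi$.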
EXAMPLE 4 Initially we best describe the motion of a pendulum with no force but gravity acting
on it in terms of the angle $\theta = \theta(t)$ that the pendulum makes with a vertical line as
shown in Figure 11.9. Here we'll assume an ideal pendulum with a rod of negligible
mass attaching a weight of mass $m$ to the pivot, and with all mass concentrated at the
weight's center of gravity at distance $l$ from the pivot. Figure 11.9 shows the set-up,
where the possible positions for the center of mass describe a circle of radius $l$. The
typical motion is a back and forth oscillation, such as we considered in Section 4.
The downward-directed gravitational force $F$ of magnitude $mg$ has a radial component
$F_R$ of magnitude $|gm\cos\theta|$ and a component $F_T$ of magnitude $|gm\sin\theta|$
tangential to the circle. At any given position the gravitational force component in
the direction of motion is perpendicular to the component directed along the length
of the pendulum. The coordinates relative to these directions are $F_T = -gm\sin\theta$
and $F_R = -gm\cos\theta$. It follows by the Pythagorean relation that the sum of the
squares of the perpendicular component magnitudes must be $g^2m^2$. The force $F_R$
acting along the length of the pendulum toward its end must be exactly balanced by
an opposite force at the pivot, and these forces play no other role in our description
of the motion. If $\theta$ is measured in radians, distance along the circular path of motion
is $y = l\theta$, so we can express the force coordinate $F_T$ in the direction of motion as
mass times acceleration: $F_T = m\,d^2(l\theta)/dt^2$. Equating our two expressions for $F_T$
gives the differential equation satisfied by $\theta = \theta(t)$:
$$m\frac{d^2(l\theta)}{dt^2} = -gm\sin\theta.$$
FIGURE 11.9 Pendulum analysis.
The minus sign signifies that the signed velocity $d(l\theta)/dt$ is decreasing if $0 < \theta < \pi$
and increasing if $-\pi < \theta < 0$. If $y(t)$ stands for distance measured along the circle
then $\theta(t) = y(t)/l$, leading to the alternative form
$$\frac{d^2y}{dt^2} = -g\sin\frac{y}{l}.$$
Theorem 7.1 guarantees the existence of a unique solution $y(t)$ to an initial-value
problem with $y(t_0) = y_0$, $\dot y(t_0) = z_0$, but this nonlinear equation has no solutions in
terms of elementary functions.
If $y$ remains small, say $|y| < 0.1$, we may find approximate solutions that are
acceptable for some purposes by using the tangent-line replacement for the graph of
$\sin y$ near $y = 0$, namely $\sin y \approx y$. This approximation leads to the linear equation
$$\frac{d^2y}{dt^2} = -\frac{g}{l}y.$$
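How good the small-angle replacement is can be seen by integrating both the nonlinear and the linear equation from the same starting point and comparing positions at a later time. The sketch below does this with a fourth-order Runge-Kutta step; the numerical values `G = 9.8` and `L = 1.0`, the release from rest, and the comparison time are all illustrative assumptions made here.

```python
import math

G = 9.8   # gravitational acceleration (illustrative value)
L = 1.0   # pendulum length (illustrative value)

def nonlinear(y, z):
    # y' = z, z' = -g sin(y / l)
    return z, -G * math.sin(y / L)

def linear(y, z):
    # y' = z, z' = -(g / l) y
    return z, -(G / L) * y

def rk4(field, y, z, t_end, steps=2000):
    """Advance the system (y, z) from t = 0 to t_end with classical RK4."""
    h = t_end / steps
    for _ in range(steps):
        a1, b1 = field(y, z)
        a2, b2 = field(y + 0.5 * h * a1, z + 0.5 * h * b1)
        a3, b3 = field(y + 0.5 * h * a2, z + 0.5 * h * b2)
        a4, b4 = field(y + h * a3, z + h * b3)
        y += h * (a1 + 2 * a2 + 2 * a3 + a4) / 6
        z += h * (b1 + 2 * b2 + 2 * b3 + b4) / 6
    return y, z

def gap(y0, t_end=2.0):
    """Difference in position between the two models after release from rest at y0."""
    y_nl, _ = rk4(nonlinear, y0, 0.0, t_end)
    y_lin, _ = rk4(linear, y0, 0.0, t_end)
    return abs(y_nl - y_lin)

# Small release displacement versus a large one.
print(gap(0.1), gap(1.0))
```

For the small displacement the two models stay close over this time span, while for the larger one they drift visibly apart, which is the qualitative point of the restriction to small $|y|$.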
It's routine to check that this differential equation has among its solutions
EXERCISES
In Exercises l to 6, use the method of Section 7A to solve (c) Use the complex exponential ei 1 to find a relation
the initial value problem, stating explicitly for what values between the solutions of part (a).
of the independent variable your solution is defined.
16. (a) Solve the initial-value problem ji = y y, y(O) = 0,
1. tji + y = O; y(l) = 0, j>(l) = 1 j>(O) = -½-
(b) Sketch the graph of the solution to part (a).
2. t 2 j.i+_j,2=0; y(l)=j>{l)=-1
(c) Use the complex exponential eir/2 to find a relation
3. j.i + j, 2 = O; y(O) = 0, j>(O) = 1 between the solution to part (a) and the solution to
4. ty + y = O; y(l) = j>(l) = l the problem in Example 3 of the text.
5. tj.i + y = t 3 ; y(I) = j>{l) = 1 17. (a) Apply the method of Section 7B to the initial-value
problem
6. j.i + y= O; y(O) = y(0) = 1
In Exercises 7 to 13, use the method of Section 7B ji = - sin y, y(O) = Yo, j>(O) = zo
to solve the initial value problem, stating explicitly for
for a nonlinear pendulum equation to establish the
what values of the independent variable your solution is
equation
defined.
7. yji - j, 2 = O; y(O) = j>(O) = 1 I ·2 I 2
2 y =cosy-cosyo+ 2 z0 •
8. y2ji + j, 3 = 0; y(O) = j>(O) = l
(b) Use the result of part (a) to show that if - cos Yo+
9. ji - j, 3 = O; y(O) = j>(O) = 1 z5
½ > l the pendulum will rotate "over the top"
10. ji + j>2 = 0 y(O) = 0, j>(O) = 1 repeatedly.
(c) Use the result of part (a) lo show that to have
11. ji + j>2 = I; y(O) = j>(O) = 0
oscillatory motion, with angle y strictly between -7r
12. ji - y 3 = 0; y(O) = j>(O) = ../2 and 7r, we must have - cos YO + ½ < l. z5
13. Show that the differential equation t j.i + y2 = 0 has
2
18. (a) Apply the method of Section 7B to the initial-value
more than one solution satisfying y(O) = j,(0) = 0. problem
Explain why this doesn't contradict the uniqueness part
of Theorem 7. I • ji = -siny, y(O) = T/, j>(O) =0
3
14. Show that the differential equation t ji + j> = t has
for a nonlinear pendulum equation to establish the
more than one solution satisfying y(O) = j>(O) = 0.
equation
Explain why this doesn't contradict the uniqueness part
of Theorem 7. l.
½i =cosy - cos 1/,
15. (a) Solve the initial value problems for j.i = - y and
ji = y, using the same initial conditions y(O) = 0, where -7r < 1/ < 1r.
j,(0) = l for both equations. (b) Prove that the time T(TJ) it takes for the pendulum
(b) Sketch the graphs of both solutions of part (a). in part (a) to fall from angle y = 1/ lo the vertical
T (11) = 1
o
'1 dy
----=====;:-
J2(cos y - cos 77)
K(k) =
-~
1 Ji - d¢ 2
-;=======, k < I,
2
o k sin¢
(c) Make a change of variable in the integral in part (b) which is an elliptic integral and isn't computable
chosen to show that T(77) = K(k), where k = using elementary functi,1ns.
7C Phase Space
If we can't find an explicit formula for the solution of a second-order equation and
we want to study a particular solution satisfying given initial conditions, one option
is to apply numerical methods as discussed in Section 8 and another is to display
solutions in what is called phase space, described here. To do either of these we first
convert second-order equations $\ddot y = f(t, y, \dot y)$, with initial conditions $y(t_0) = y_0$,
$\dot y(t_0) = z_0$, into first-order systems as follows. Let $\dot y = z$ so that $\dot z = \ddot y$. Then
$\dot z = f(t, y, \dot y)$, and we can write the original second order initial-value problem as
$$\dot y = z, \quad y(t_0) = y_0,$$
$$\dot z = f(t, y, z), \quad z(t_0) = z_0.$$
There are two advantages to this reformulation. A purely technical advantage is that
it allows us to apply the first-order numerical methods of Chapter 10, Section 2 to
second-order problems. The other advantage is conceptual, in that the first-order
system gives equal weight to displaying the two fundamental quantities, position $y$
and velocity $z = \dot y$. The 2-dimensional $(y, z)$-space is the phase space of the second-order
equation. For the purpose of plotting curves in phase space we restrict attention
to equations $\ddot y = f(y, \dot y)$ in which the function $f$ is not explicitly dependent on time
$t$; such equations are called autonomous.
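The reformulation as a first-order system translates directly into code. The sketch below applies Euler's method to the pair $\dot y = z$, $\dot z = f(t, y, z)$ and records the points $(t, y, z)$, so that the $(y, z)$ pairs trace an approximate phase curve. Euler's method is used only as the simplest first-order scheme; whether it is the method intended by the reference to Chapter 10 is an assumption, and the function name `euler_system` is introduced here.

```python
import math

def euler_system(f, t0, y0, z0, t_end, steps):
    """Euler's method for y' = z, z' = f(t, y, z); returns a list of (t, y, z) points."""
    h = (t_end - t0) / steps
    t, y, z = t0, y0, z0
    path = [(t, y, z)]
    for _ in range(steps):
        # Both updates use the values of y and z from the start of the step.
        y, z = y + h * z, z + h * f(t, y, z)
        t += h
        path.append((t, y, z))
    return path

# Illustration with the autonomous equation y'' = -4y, y(0) = 0, y'(0) = 2,
# whose exact solution is sin(2t).
path = euler_system(lambda t, y, z: -4 * y, 0.0, 0.0, 2.0, 1.0, 10000)
t_last, y_last, z_last = path[-1]
print(y_last, math.sin(2 * t_last))      # approximate and exact position
print(z_last, 2 * math.cos(2 * t_last))  # approximate and exact velocity
```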
the vector functions $(y(t), z(t))$ trace out ellipses in the phase space. Figure 11.10(a)
shows some curves in phase space and Figure 11.10(b) shows the corresponding
solution graphs, which relate time and position. The phase curves relate the fundamental
quantities position and velocity. In Figure 11.10 half the width of an ellipse
corresponds to the amplitude of a graph.
FIGURE 11.10 (a) $y^2 + z^2/4 = A^2$. (b) $y = A\sin(2t)$.
The previous example illustrates an important point about phase curves and periodic
solutions $y = y(t)$ of second-order differential equations $\ddot y = f(y, \dot y)$.
7.2 Periodicity Theorem. Assume $f_y(y, \dot y)$ and $f_z(y, \dot y)$ are bounded continuous
functions of $y$ and that $y(t)$ is a solution of $\ddot y = f(y, \dot y)$. Then $y(t)$ is periodic if
and only if the corresponding phase space curve traced by $(y(t), \dot y(t))$ is a closed
loop.
Proof. If $y(t)$ is periodic with period $P > 0$, that is $y(t + P) = y(t)$ for all
$t$, then differentiation shows that also $\dot y(t + P) = \dot y(t)$. Thus the vector function
defined by $(y, z) = (y(t), \dot y(t))$ traces a closed loop in the $yz$ phase space whenever
$t$ traverses an interval of length $P$. Conversely, if this same vector function traces
a closed loop when $t$ traverses an arbitrary interval $t_0 \le t \le t_0 + P$ of length
$P$, then applying the Existence and Uniqueness Theorem 7.1 with the initial values
$(y(t_0), \dot y(t_0)) = (y(t_0 + P), \dot y(t_0 + P))$ implies that the solution $y(t)$ repeats over
intervals $t_0 + kP \le t \le t_0 + (k + 1)P$ for integer $k$ and so is periodic. •
Example 5 is about a linear equation that has explicit solutions in terms of the
periodic functions sine and cosine. Since the corresponding phase curves are ellipses,
Theorem 7.2 implies that a solution must be periodic with period $P$ equal to the time
it takes to traverse the corresponding ellipse.
The next example is the nonlinear pendulum equation, for which the solutions are
well understood but for which there are no simple formulas.
$$\dot y = z$$
$$\dot z = -\sin y$$
for the phase space variables $y$ and $z$. We interpret $y$ as a displacement angle in
radians and $z$ as its angular velocity. Applying the method of Section 7B, we solve
$$z\frac{dz}{dy} = -\sin y$$
to get $\tfrac12 z^2 = \cos y + c$, or $z = \pm\sqrt{2\cos y - 2\cos y_0 + z_0^2}$,
where $y_0$ and $z_0$ are initial values for $y$ and $z$. To get a real value for $\dot y$ we must
have $-2 \le -2\cos y_0 + z_0^2$. Furthermore, if $-2\cos y_0 + z_0^2 > 2$, then $z$ is either a
positive or negative periodic function of $y$ whose graph can't cross the $y$ axis to
form the closed loop that goes with a solution that's periodic as a function of $t$. Thus
the solutions $y(t)$ that are periodic are generated by initial conditions satisfying
$$-2 \le -2\cos y_0 + z_0^2 < 2;$$
we can think of forming the corresponding phase curves by joining at the $y$ axis the
two graphs we get by choosing opposite sign for the square root. See Figure 11.11(a).
The periodic rotational curves above and below the $y$ axis go with "over-the-top"
motions of the pendulum, motions with increasing angle for $z = \dot y > 0$ and decreasing
angle for $z = \dot y < 0$. On the closed loops the motions are "back-and-forth" periodic,
with top part of the loop representing increasing angle $y$ and the bottom part
decreasing angle $y$. The periodicity of the periodic solutions $y = y(t)$ of $\ddot y = -\sin y$
depends in no way on the periodicity of $\sin y$. For comparison we include a phase
portrait for a somewhat more realistic linearly damped pendulum; the top two curves
in Figure 11.11(b) represent motions that start out rotating over the top but after a
while are damped down to swinging back and forth with decreasing amplitude. We
take up the plotting of these curves in Section 8.
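The inequality on $-2\cos y_0 + z_0^2$ gives a simple test for which kind of motion a given pair of initial values produces. The sketch below encodes that test and also evaluates the two branches $z = \pm\sqrt{2\cos y - 2\cos y_0 + z_0^2}$ of a phase curve; the function names, the labels, and the tolerance used to detect the boundary value 2 are choices made here for illustration.

```python
import math

def energy_level(y0, z0):
    """The conserved quantity -2 cos y + z^2 evaluated at the initial point."""
    return -2 * math.cos(y0) + z0 * z0

def classify(y0, z0, tol=1e-12):
    """Label the motion of y'' = -sin y started at (y0, z0)."""
    e = energy_level(y0, z0)
    if e > 2 + tol:
        return "rotational"       # over-the-top motion
    if e < 2 - tol:
        return "oscillatory"      # closed loop (or stable equilibrium when e = -2)
    return "boundary"             # separating curve or unstable equilibrium

def phase_branches(y, y0, z0):
    """Upper and lower z values on the phase curve at angle y, or None if y is not reached."""
    radicand = 2 * math.cos(y) + energy_level(y0, z0)
    if radicand < 0:
        return None
    root = math.sqrt(radicand)
    return root, -root

print(classify(0.5, 0.0))   # oscillatory
print(classify(0.0, 2.5))   # rotational
print(phase_branches(0.0, 0.5, 0.0))
```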
Note. Other than closed loops and rotational traces, we list three types of special
points and curves in Figure 11.11(a). The unstable equilibrium points and separating
curves listed as types 2 and 3 are highly theoretical, and are impossible to realize
mechanically.
1. Single points $(y_1, 0)$ on the $y$ axis at the center of closed loops represent
vertical stable equilibrium positions of the pendulum, hanging down, and
with velocity $z = 0$.
2. Single points $(y_2, 0)$ midway between the points of type 1 represent vertical
unstable equilibrium positions of the pendulum, balanced up, with velocity
$z = 0$.
3. Curves extending between two points of type 2 separate the rotational motions
from the back-and-forth motions and are the phase-space traces of motions
that tend away from and toward unstable equilibrium without ever attaining it.
FIGURE 11.11 Phase portraits for (a) $\ddot y = -\sin y$ and (b) a linearly damped pendulum equation.
The single points referred to as type 2 are traces in phase space of constant solutions
$y(t) = C$ of the differential equation $\ddot y = -\sin y$. Note that $\dot y(t) = 0$ so the
corresponding phase space plot of a constant solution is a point $(y, z) = (C, 0)$
on the $y$-axis. In general a constant solution of a differential equation must satisfy
$\dot y(t) = 0$, and is called an equilibrium solution because the position $y(t)$ doesn't
change over time. The pendulum example shows that an equilibrium may be very
unstable, as in the upward vertical position $(y, z) = (\pi, 0)$ of a pendulum. In general
an equilibrium point $(y, z) = (C, 0)$ is called stable if all phase curves starting
sufficiently close to it remain close to it. Otherwise the point $(y, z) = (C, 0)$ is
called unstable.
The pendulum equation $\ddot y = -\sin y$, with phase portrait shown in Figure 11.11(a), has
equilibrium solutions satisfying $\dot y(t) = 0$, so these solutions also satisfy $\ddot y(t) = 0$. For
the pendulum equation this amounts to asking for the solutions of $-\sin y = 0$. The
solutions of this equation are $y = k\pi$ for integer values of $k$. Thus the equilibrium
points are at $(y, z) = (k\pi, 0)$. When $k$ is an even integer, these points lie at the
centers of the closed loops in Figure 11.11(a), and these points are stable equilibrium
points, because a closed loop starting close enough to such a point remains close to
the point. When $k$ is an odd integer, a loop starting near the point loops nearly $2\pi$
units away while going around a stable point that is nearly $\pi$ units away.
EXERCISES
In Exercises 1 to 6, solve the initial-value problem and sketch the graph of the solution
y = y(t). Then sketch the trace of the solution in yz phase space, indicating the direction
of traversal as t increases.
1. $\ddot{y} + y = 1$, $y(0) = 2$, $\dot{y}(0) = 0$
2. $\ddot{y} - y = 0$, $y(0) = 1$, $\dot{y}(0) = 0$
3. $\ddot{y} = 1$, $y(0) = \dot{y}(0) = 0$
4. $\ddot{y} + y = 0$, $y(0) = 2$, $\dot{y}(0) = -1$
5. $\ddot{y} + y = 0$, $y(0) = 2$, $\dot{y}(0) = 0$
6. $\ddot{y} - y = 0$, $y(0) = 1$, $\dot{y}(0) = 0$
In Exercises 7 to 14, use the method of Section 7B to find a relation depending on a
constant C between y and $z = \dot{y}$; then use this to sketch a phase portrait of the
differential equation containing at least three curves.
7. $\ddot{y} + y = 1$
8. $\ddot{y} - y = 0$
9. $\ddot{y} = 1$
10. $\ddot{y} = 0$
11. $\ddot{y} + y = 0$
12. $\ddot{y} + 2y^3 = 0$
13. $\ddot{y} - 2y^3 = 0$
14. $\ddot{y} = 1 + y$
15. The differential equation $\ddot{y} - \dot{y} = 0$ has phase plots that decompose into three
distinct parts, some of which correspond to constant equilibrium solutions.
(a) Make a phase portrait of the differential equation in $(y, z) = (y, \dot{y})$ space, making
clear where the equilibrium points are and what the directions of traversal are for the
other phase curves.
(b) On the basis of your sketch for part (a) do the equilibrium points appear to be stable
or unstable? Explain your answer.
16. The differential equation $\ddot{y} + \dot{y} = 0$ has phase plots that decompose into three
distinct parts, some of which correspond to constant equilibrium solutions.
(a) Make a phase portrait of the differential equation in $(y, z) = (y, \dot{y})$ space, making
clear where the equilibrium points are and what the directions of traversal are for the
other phase curves.
(b) On the basis of your sketch for part (a) do the equilibrium points appear to be stable
or unstable? Explain your answer.
562 Chapter 11 Second-Order Equations
is satisfied by the displacement function y(t) of a vibrating spring. There, the factors
m (mass), k (frictional constant), and h (spring stiffness) were constants, whereas
f (t), the externally applied force, was allowed to be a nonconstant function. Using
the method of characteristic equations, we were able to make a fairly complete
analysis of the solutions of the differential equation. But suppose that some or all
of m = m(t), k = k(t), and h = h(t) vary with time. For example, the spring
may weaken, or the friction may increase because of heating, or the weight of the
mechanism might increase or decrease for some reason. Assuming that m(t) > 0,
we can divide by it to get an equation of the form
$$\frac{d^2y}{dt^2} + a(t)\frac{dy}{dt} + b(t)y = f(t),$$
or nonlinear, for example, the damped pendulum equation
$$\frac{d^2y}{dt^2} + k\frac{dy}{dt} + \frac{g}{l}\sin y = 0.$$
Equations of both types appear in the very general form
$$\frac{d^2y}{dt^2} = F\left(t, y, \frac{dy}{dt}\right).$$
It is this form that we'll treat here along with initial conditions of the form
$$y(t_0) = y_0, \quad \frac{dy}{dt}(t_0) = z_0,$$
where $t_0$, $y_0$, and $z_0$ are given constants. The Existence and Uniqueness Theorem 1.1
described in the introduction to this chapter guarantees a unique solution to this
problem if $F(t, y, z)$ and its partial derivatives with respect to y and z are continuous
on some interval containing $t_0$ and in some rectangle containing the point $(y_0, z_0)$.
It's important to realize that the only type of problem for which we so far have
a universally effective method of actually displaying such solutions is the linear
equation with a(t) and b(t) both constant. Even then, if f(t) is not a function we can
integrate explicitly, we may have trouble finding a formula for a particular solution.
What we may then settle for is a numerical approximation $y_k$ to the value $y(t_k)$
of the true solution at a discrete set of points $t_k$. Such approximations were treated
Section SA Numerical Methods 563
in Chapter 2 for first-order equations, and the methods used here for second-order
equations are simple modifications of the first-order methods.
If the purpose in solving a differential equation is just to obtain numerical values
or a graph for some particular solution, then a purely numerical approach may be
more efficient than first finding a solution formula and then finding the desired
graphical or numerical results from the formula. On the other hand, if what you
want is to display the nature of a solution's dependence on certain parameters in the
differential equation, or on initial conditions, then solution by formula is preferable
if at all possible. Beyond that, detailed properties of a solution, such as whether it's
periodic or only approximately so, can be hard to get from a numerical approximation
and easy to get from a formula such as y(t) = 2 cos 3t.
8A Euler's Method
Numerical methods for first-order equations are motivated by using the interpreta-
tion of the first derivative as a slope. Rather than trying to make something of the
interpretation of the second derivative in a second-order equation, what we'll do is
find a pair of first-order equations equivalent to a given second-order equation and
then apply first-order methods to the simultaneous solution of the pair of equations.
The principle is easiest to understand in the general case
$$y'' = F(t, y, y'), \quad y(t_0) = y_0, \quad y'(t_0) = z_0.$$
There are many ways to find an equivalent pair of first-order equations, but the most
natural is usually to introduce the first derivative $y'$ as a new unknown function $z$.
We write $z = y'$, $z' = y''$, so we can replace $y'' = F(t, y, y')$ by $z' = F(t, y, z)$.
The pair to be solved numerically is then
$$y' = z, \quad y(t_0) = y_0,$$
$$z' = F(t, y, z), \quad z(t_0) = z_0.$$
Since $y'(t_0) = z(t_0) = z_0$, there is an initial condition that goes naturally with each
equation. To find an approximate solution, we can do what we would do with a
single first-order equation, except that at each step we find new approximate values
for both unknown functions y and z, and then use these values to compute new
approximations in the next step. The iterative formulas are as follows for the simple
Euler method with step size $h$:
$$y_{k+1} = y_k + h z_k,$$
$$z_{k+1} = z_k + h F(t_k, y_k, z_k),$$
where $t_k = t_0 + kh$. The starting values $y_0$, $z_0$ come from the initial conditions.
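The iteration just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the text's own routine: the function name `euler_second_order` and the test equation y'' = -y with y(0) = 0, y'(0) = 1 (exact solution sin t) are assumptions chosen for the demonstration.

```python
import math

def euler_second_order(F, t0, y0, z0, h, n_steps):
    """Simple Euler method for y'' = F(t, y, y'), written as y' = z, z' = F(t, y, z)."""
    t, y, z = t0, y0, z0
    rows = [(t, y, z)]
    for _ in range(n_steps):
        # The right side is evaluated first, so both updates use the old y_k and z_k.
        y, z = y + h * z, z + h * F(t, y, z)
        t += h
        rows.append((t, y, z))
    return rows

# Hypothetical test case: y'' = -y, y(0) = 0, y'(0) = 1, exact solution sin t.
rows = euler_second_order(lambda t, y, z: -y, 0.0, 0.0, 1.0, 0.001, 1000)
t_end, y_end, z_end = rows[-1]
print(t_end, y_end, math.sin(t_end))
```

The simultaneous assignment matters: updating y first and then using the new y in the formula for z would not be the Euler step written above.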
The initial-value problem
$$\ddot{y} = -\sin y, \quad y(0) = \dot{y}(0) = 1, \quad \text{for } 0 \le t < 20$$
describes the motion of a pendulum with fairly large amplitude, so large that the linear
approximation $\ddot{y} = -y$ would be inadequate. We go ahead to solve numerically the
system
$$\dot{y} = z, \quad y(0) = 1,$$
$$\dot{z} = -\sin y, \quad z(0) = 1,$$
Note the command SET S=Y, saving the current value of y for use two lines
later; without this precaution, the advanced value y + hz would be used, which is
not correct. The printout results in 2000 values of t from 0.01 to 20 by steps of
size 0.01 along with the corresponding y-values. We could also print the z-values,
which are approximations to the values of the derivative $\dot{y}$. This is useful in making a
phase-space plot of the solution, plotting approximations to the points $(y(t), \dot{y}(t))$. In
the formal routine listed above, the line PRINT T,Y would be replaced by something
like PLOT Y,Z for a phase plot.
Rather than displaying a table with 2000 entries, Figure 11.12 shows the graph of
y as an unbroken curve for the initial conditions $y(0) = \dot{y}(0) = 1$. The replacement
of sin y by y in the differential equation is inappropriate in this instance, because
the values of y that occur are too large to make the approximation a good one. The
linearized initial-value problem $\ddot{y} = -y$, $y(0) = \dot{y}(0) = 1$ has solution
$\cos t + \sin t = \sqrt{2}\,\sin(t + \pi/4)$, whose graph is shown in Figure 11.12 as a dotted curve. The
graphs show that the solution to the nonlinear pendulum equation differs substantially
in amplitude and period from the solution to the linear equation. At very small
amplitudes the discrepancy is much less, because the approximation of sin y by y is
better the closer y is to zero.
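The amplitude difference can be checked numerically. The sketch below is an illustration under stated assumptions rather than the text's routine: it applies the simple Euler step to both equations with y(0) = y'(0) = 1, the helper name `euler_max_y` is hypothetical, and the step size h = 0.0001 is deliberately smaller than the 0.01 used for the printout so that the slow amplitude growth of the simple Euler method stays negligible.

```python
import math

def euler_max_y(F, y0, z0, h, n_steps):
    """Largest y-value met by simple Euler steps for y'' = F(y), with z = y'."""
    y, z = y0, z0
    y_max = y
    for _ in range(n_steps):
        y, z = y + h * z, z + h * F(y)   # both updates use the old y and z
        y_max = max(y_max, y)
    return y_max

h, n = 0.0001, 200_000                   # covers 0 <= t <= 20
max_nonlinear = euler_max_y(lambda y: -math.sin(y), 1.0, 1.0, h, n)
max_linear = euler_max_y(lambda y: -y, 1.0, 1.0, h, n)
print(max_nonlinear, max_linear, math.sqrt(2))
```

The linear amplitude should come out near the square root of 2, while the pendulum swings noticeably farther, which is the discrepancy described above.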
FIGURE 11.12 Solutions y(t) to $\ddot{y} = -\sin y$ and $\ddot{y} = -y$ (dotted), both with
$y(0) = \dot{y}(0) = 1$.
$$p_{k+1} = y_k + h z_k$$
$$q_{k+1} = z_k + h F(t_k, y_k, z_k)$$
$$y_{k+1} = y_k + \frac{h}{2}(z_k + q_{k+1})$$
$$z_{k+1} = z_k + \frac{h}{2}\left[F(t_k, y_k, z_k) + F(t_{k+1}, p_{k+1}, q_{k+1})\right].$$
Here $p_k$ and $q_k$ provide the simple Euler estimates that are then used to compute
the final estimates for $y_k$ and $z_k = \dot{y}_k$; $t_k = t_0 + kh$ as before. (As with the simple
Euler method, the value $y_k$ has to be kept for use in computing $z_{k+1}$ and cannot be
replaced by $y_{k+1}$ without significant error.)
The advantage of the modification is that the error in the final estimates is sub-
stantially reduced without adding much complexity to the computation.
An algorithm to produce the first, third, and fourth columns in the previous table
might look like this. The routine produces only every hundredth row of the computed
values.
DEFINE F(T,Y,Z) = -Y
SET T=0
SET Y=0
SET Z=1
SET H=.001
FOR J=1 TO 30
FOR K=1 TO 100
SET P=Y+H*Z
SET Q=Z+H*F(T,Y,Z)
SET S=Y
SET Y=Y+.5*H*(Z+Q)
SET Z=Z+.5*H*(F(T,S,Z)+F(T+H,P,Q))
SET T=T+H
NEXT K
PRINT T, Y, SIN(T)
NEXT J
Recall that in this algorithm we are dealing with a pair of equations of the form
$$\dot{y} = z, \quad \dot{z} = F(t, y, z).$$
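For readers working in a general-purpose language, the same improved Euler step might be written in Python roughly as follows. This is a sketch under assumptions: the function name `improved_euler_second_order` is hypothetical, and the test case y'' = -y, y(0) = 0, y'(0) = 1 mirrors the routine above, whose exact solution is sin t.

```python
import math

def improved_euler_second_order(F, t0, y0, z0, h, n_steps):
    """Improved Euler method for y' = z, z' = F(t, y, z)."""
    t, y, z = t0, y0, z0
    for _ in range(n_steps):
        p = y + h * z                    # simple Euler estimate for y
        q = z + h * F(t, y, z)           # simple Euler estimate for z
        y_new = y + 0.5 * h * (z + q)
        # The old y is still needed here, which is why y_new is stored separately.
        z_new = z + 0.5 * h * (F(t, y, z) + F(t + h, p, q))
        t, y, z = t + h, y_new, z_new
    return t, y, z

# 3000 steps of size 0.001 reach t = 3, as in the routine with J = 1 TO 30.
t3, y3, z3 = improved_euler_second_order(lambda t, y, z: -y, 0.0, 0.0, 1.0, 0.001, 3000)
print(t3, y3, math.sin(t3))
```

Storing `y_new` and `z_new` before overwriting plays the role of the command SET S=Y in the routine.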
The Web site http://math.dartmouth.edu/~rewn/ has Java applets 20RD and 20RDPLOT
that use this routine. Decreasing the step size h often improves accuracy in the
Euler methods. This requires more steps to reach a given value of t and may produce
more approximate solution values than is convenient. Thus we'd print results only
after every m steps of calculation. For example, h = 0.001 and m = 10 would produce
approximate values with argument differences of 0.01. The applets just referred to
allow for this feature.
TABLE 11.3
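A minimal Python sketch of this bookkeeping follows, assuming the simple Euler step and the test equation y'' = -y with y(0) = 0, y'(0) = 1; the function name `euler_every_m` and the two step sizes compared are illustrative choices, not taken from the text.

```python
import math

def euler_every_m(F, t0, y0, z0, h, m, n_rows):
    """Simple Euler steps of size h; keep (t, y) only after every m-th step."""
    y, z = y0, z0
    k = 0                                   # number of steps taken so far
    rows = []
    for _row in range(n_rows):
        for _step in range(m):
            t = t0 + k * h                  # t_k = t_0 + k h
            y, z = y + h * z, z + h * F(t, y, z)
            k += 1
        rows.append((t0 + k * h, y))
    return rows

F = lambda t, y, z: -y                      # test equation y'' = -y, solution sin t
fine = euler_every_m(F, 0.0, 0.0, 1.0, 0.001, 10, 100)    # h = 0.001, m = 10
coarse = euler_every_m(F, 0.0, 0.0, 1.0, 0.01, 1, 100)    # h = 0.01, every step kept
err_fine = abs(fine[-1][1] - math.sin(1.0))
err_coarse = abs(coarse[-1][1] - math.sin(1.0))
print(err_fine, err_coarse)
```

Both runs produce 100 rows spaced 0.01 apart in t, but the run with the smaller step size should show the smaller error at t = 1.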
EXERCISES
MATLAB, Maple, and Mathematica are widely available for doing these exercises. In
addition there are Java applets 20RD, 20RDPLOT, and PHASEPLOT at the Web site
http://math.dartmouth.edu/~rewn/ and the Heaviside function H(t) is available for use in
these applets.
1. The Airy equation $\ddot{y} + ty = 0$ has solutions for t > 0 somewhat similar to solutions
of $\ddot{y} + y = 0$.
(a) Estimate the location of positive t-values for which y(t) = 0 for the solution
satisfying $y(0) = 0$ and $\dot{y}(0) = 2$. What changes if you replace the condition
$\dot{y}(0) = 0$ by $\dot{y}(0) = a$ for various choices of a? [Hint: Look for successive
approximate values for y with opposite sign.]
(b) What can you say about the questions in part (a) on an interval $t_1 \le t \le 0$ with
$t_1 < 0$?
2. The Bessel equation of order zero is
$$x^2 y'' + x y' + x^2 y = 0.$$
For $1 \le x \le 40$, estimate the location of the zero values of a solution satisfying
y(1) = 1, y'(1) = 0.
3. Make a numerical comparison of the solution of $\ddot{y} = -\sin y$, $y(0) = 0$, $\dot{y}(0) = 1$,
with the solution of $\ddot{y} = -y$ using the same initial conditions. In particular, estimate
the discrepancies between the location of successive zero values for the two solutions,
one of which is $y(t) = \sin t$.
4. The nonlinear equation
$$\frac{d^2y}{dx^2} + k\frac{dy}{dx} + hy^2 = 0, \quad h, k \text{ constant},$$
has solutions defined near x = 0. Compare the behavior of numerical solutions of the
initial value problem y(0) = 0, y'(0) = 1 with the corresponding behavior when the
nonlinear term $hy^2$ is replaced by a linear term $hy$. To do this you should investigate
the result of choosing several different values of k > 0 and h > 0.
5. The linear equation
$$\frac{d^2y}{dx^2} + a(x)\frac{dy}{dx} + b(x)y = 0$$
with continuous coefficients a(x) and b(x) occurs often with nonconstant coefficients.
For other choices of the coefficients, we use numerical methods. Study the behavior of
numerical solutions of the initial value problem y(0) = 0, y'(0) = 1 as follows:
(a) Let $a(x) = \sin x$ and $b(x) = \cos x$ for $0 \le x \le 2\pi$.
(b) Let $a(x) = e^{-x/2}$ and $b(x) = e^{-x/3}$ for $0 \le x \le 1$.
6. Nonuniqueness. The initial-value problem $\ddot{y} = 3\sqrt{y}$, $y(0) = 0$, $\dot{y}(0) = 0$ has the
identically zero solution.
(a) Verify that $y(t) = \frac{1}{16}t^4$ is also a solution.
(b) Investigate the application of the Euler methods to the problem.
FALLING OBJECTS
7. Suppose that the displacement y(t) of a falling object is subject to a nonlinear
friction force
$$m\ddot{y} = -k\dot{y}^{\alpha} + mg.$$
(a) Find numerical approximations to y(t) in the range $0 \le t \le 20$, with g = 32.2,
$y(0) = 0$, $\dot{y}(0) = 0$, m = 1 and $\alpha = 1.5$. Use the values k = 0, 0.1, 0.5, 1, and sketch
the graphs of y = y(t) using an appropriate scale.
(b) Estimate the values of k that, along with $\dot{y}(0) = 0$ and the other parameter values
in part (a), produce approximately the values y(0) = 0, y(5) = 66, y(10) = 137 and
y(15) = 208.
8. A nonlinear model for an object of mass 1 dropped from rest has frictional force
$\ddot{y} = -k\dot{y}^{\alpha} + g$, $y(0) = \dot{y}(0) = 0$; the model is linear if $\alpha = 1$. In numerical
work assume g = 32 ft/sec².
(a) If the terminal velocity for the linear model is 36 feet per second, what is k?
(b) Estimate the time it takes for the linear model to reach velocity 35.99 ft/sec.
(c) If the terminal velocity for the nonlinear model with $\alpha = 1.1$ is 36 feet per second,
estimate k. This requires trial and error.
(d) Estimate the time it takes for the nonlinear model of part (c) to reach velocity
35.99 ft/sec.
9. Bead on a wire. Consider a bead sliding without friction under constant vertical
gravity along a wire bent into the shape of the twice continuously differentiable graph
of y = f(x). It turns out that x = x(t) satisfies the equation
$\ddot{x} = -(g + f''(x)\dot{x}^2)f'(x)/(1 + f'(x)^2)$. With $f(x) = -x^3 + 4x^2 - 3x$, g = 32 and
x(0) = 0, estimate how large $\dot{x}(0) > 0$ should be for the bead to overcome periodic
oscillation and go over the hump in the wire.
PENDULUM
10. (a) Use the Euler method for
$$y'' = F(t, y, y'), \quad y(t_0) = y_0, \quad y'(t_0) = z_0,$$
and apply it to the pendulum equation with $F(t, y, y') = -16\sin y$, y(0) = 0, y'(0) = 0.5.
(b) Do part (a) using the improved Euler method.
For small oscillations of y, the approximation $\sin y \approx y$ is fairly good, leading to the
replacement of the pendulum equation $y'' + (g/l)\sin y = 0$ by the linearized equation
$y'' + (g/l)y = 0$, with solutions of the form
$$y(t) = c_1\cos\sqrt{g/l}\,t + c_2\sin\sqrt{g/l}\,t.$$
Assuming g/l = 16, compare y(t) with the improved Euler approximation to the solution
of the nonlinear equation under initial conditions.
(c) y(0) = 0, y'(0) = 0.1
(d) y(0) = 0, y'(0) = 4
11. Consider the damped pendulum equation $\ddot\theta = -(g/l)\sin\theta - (k/m)\dot\theta$, with g = 32.2,
l = 20, k = 0.03, and m = 5. For the solution with $\theta(0) = 0$, $\dot\theta(0) = 0.2$:
(a) Estimate the maximum angles $\theta$ for $0 \le t \le 15$.
(b) Estimate the successive times between occurrences of the value $\theta = 0$ for
$0 \le t \le 15$.
(c) Repeat part (b), but with initial conditions $\theta(0) = 0$, $\dot\theta(0) = 2.0$.
12. Consider the following modification of the pendulum equation $ml\ddot\theta = -gm\sin\theta$,
written here in terms of forces. If the pendulum pivot is moved vertically from its
usual fixed position at level 0, so that at time t it is at f(t), with f(0) = 0, the
additional vertical force component is $m\ddot{f}(t)$. Thus the vertical force due to gravity alone is
replaced by $(-gm + m\ddot{f}(t))\sin\theta$. It follows that the equation for displacement angle
$\theta = \theta(t)$ becomes $ml\ddot\theta = -gm\sin\theta + m\ddot{f}(t)\sin\theta$ or
$$\ddot\theta = \frac{1}{l}\left(-g + \ddot{f}(t)\right)\sin\theta.$$
(a) Show that if f(t) = at, with a constant, then there is no change in acceleration as
compared with the fixed-pivot case and hence that the equation for the displacement
angle $\theta$ remains the same: $\ddot\theta = -(g/l)\sin\theta$.
(b) Show that in the case of a general twice-differentiable f(t), the position of the
pendulum weight at time t is $\mathbf{x}(t) = (l\sin\theta(t), f(t) - l\cos\theta(t))$, where $\theta$ satisfies
either of the differential equations displayed above.
(c) Let g = 32, l = 5, m = 1 and $f(t) = 2\sin 4t$. Plot $(t, \theta(t))$ for t-values 0.01 apart
between 0 and 100, assuming $\theta(0) = 0$ and $\dot\theta(0) = 0.01, 0.001, 0.0001$. Note the long
term deviation in behavior as compared with the identically zero solution corresponding
to $\dot\theta(0) = 0$.
(d) Plot the path of the pendulum weight under the assumptions in part (c).
OSCILLATORS AND PHASE SPACE
13. An unforced oscillator displacement x = x(t) satisfies $m\ddot{x} + k\dot{x} + hx = 0$, where k
and h may depend on time t.
(a) Suppose that $k(t) = 0.2(1 - e^{-0.1t})$, h = 5, m = 1 and that x(0) = 0, $\dot{x}(0) = 5$.
Compute a numerical approximation to x(t) on the range $0 \le t \le 20$. Then sketch the
graph of x = x(t).
(b) Do part (a) using instead k = 0 and $h(t) = 5(1 - e^{-0.2t})$.
14. Make phase-plots of the soft spring oscillator equation $\ddot{y} = -\gamma y^3 + \delta y$ under each
of the following assumptions.
(a) $\gamma = \delta = 1$
(b) $\gamma = 1$, $\delta = 2$
(c) $\gamma = 2$, $\delta = 1$
15. Make solution graphs and $(y, \dot{y})$ phase plots for the periodically driven hard spring
oscillator equation $\ddot{y} = -y - y^3 + k\dot{y} + \tfrac{1}{10}\cos t$ and use the results to detect
long-term approach to periodic behavior for some k < 0.
16. Plot closed periodic phase paths for the Morse model of displacement y from
equilibrium of the distance between the two atoms of a diatomic molecule:
$$\ddot{y} = K(e^{-2ay} - e^{-ay}), \quad K = a = 1.$$
17. Consider the nonlinear oscillator equation $m\ddot{x} + k\dot{x}|\dot{x}|^{\beta} + hx = 0$,
$0 \le \beta = \text{const}$. Let m = 1, k = 0.2, h = 5 and suppose that x(0) = 0, $\dot{x}(0) = 5$.
Compute numerical approximations to x(t) on the range $0 \le t \le 20$ for
$\beta = 0, 0.5, 1$. Sketch the resulting graphs using computer graphics.
18. Chaos. The nonlinear Duffing oscillator models a periodically-driven, damped
initial-value process:
$$\ddot{y} + k\dot{y} - y + y^3 = A\cos\omega t, \quad y(0) = \tfrac12, \quad \dot{y}(0) = 0.$$
(a) Make a computer plot of the solution y(t) for $0 \le t \le 300$ for the parameter
choices k = 0.2, A = 0.3, $\omega = 1$ and initial conditions $y(0) = \dot{y}(0) = 0$. The behavior
of the damped, periodically driven Duffing equation is often described as "chaotic,"
which in practical terms means unpredictable. In particular, the specific output that
you get will depend significantly not only on the parameter and initial values, but on
the choice of numerical method and step size, and even on the internal arithmetic of
the machine used to generate the output. For this reason it seems impossible to describe
accurately the global shape of the output from the damped, periodically driven Duffing
oscillator.
(b) Change just the damping constant in part (a) to k = 0, and make a plot of the
solution. Comment on the qualitative changes that you see as compared with the output
in part (a).
(c) Make several phase plane plots, starting at different points in the $(y, \dot{y})$-plane, of
solutions of the Duffing equation. Use the parameter values k = 0.2, A = 0.3, and
$\omega = 1$.
(d) Experiment with part (c) by trying your own choices for the three parameter values.
Chapter 11 REVIEW
7. $y''' = x$
8. $y'''' = 81y$
9. $y'' + 9y = \sin 3x$, y(0) = 1, y'(0) = 0
10. $(D^2 + 4)y = \cos 3x$, using Section 3C
11. $y'' + y = 0$, y(0) = -1, $y(\pi) = 1$
12. $y'' + y = 0$, y(0) = 0, $y(\pi/2) = 2$
The basic real solution forms for $y'' + ay' + by = 0$ are $\{e^{r_1 x}, e^{r_2 x}\}$,
$\{e^{r_1 x}, xe^{r_1 x}\}$ and $\{e^{\alpha x}\cos\beta x, e^{\alpha x}\sin\beta x\}$ and are prototypes for Exercises 13
and 14.
13. Make a corresponding list of triples for the equation $y''' + ay'' + by' + cy = 0$.
14. Make a corresponding list of quadruples for $y'''' + ay''' + by'' + cy' + dy = 0$.
15. Derive from scratch the fundamental sinusoidal solutions to the harmonic oscillator
problem $\ddot{y} + y = 0$, $y(0) = 0$, $\dot{y}(0) = 1$. Do this in the following steps: (Our earlier
derivations were made using complex exponentials.)
(a) Multiply the equation by $\dot{y}$, and then integrate with respect to t to get
$\tfrac12\dot{y}^2 + \tfrac12 y^2 = C$.
(b) Find C, solve for $\dot{y}$, and solve the resulting first order equation to get
$y(t) = \sin(t + c)$.
16. Suppose that $y_1(x)$ and $y_2(x)$ are real-valued functions defined for all real x and
you know that $y_1(x)$ is not a constant multiple of $y_2(x)$. Are the two functions
necessarily linearly independent? Explain your answer, using an example if necessary.
17. Let the functions $y_1$, $y_2$ be defined for all real x by $y_1(x) = e^{rx}$ and
$y_2(x) = e^{sx}$, where r and s are unequal complex numbers. Show that $y_1$ and $y_2$ are
linearly independent even if complex constants $c_1$, $c_2$ are allowed in
$c_1 y_1(x) + c_2 y_2(x)$.
18. For what values of the constant b do the nonidentically zero solutions of
$y'' + y' + by = 0$ oscillate as functions of x?
19. The current I(t) flowing through a certain electric circuit at time t satisfies
$$I'' + RI' + I = \sin t,$$
where R > 0 is a constant resistance. The equation has a solution of the form
$I(t) = A\sin(t - \alpha)$ for certain constants A > 0 and $\alpha$. Find A and $\alpha$.
20. Show that the functions $e^{x}$ and $e^{-x}$ are linearly independent on an interval
a < x < b by showing directly that
21. (a) Show that the functions $e^{x}$ and $e^{-x}$ are linearly independent on an interval
a < x < b by showing directly that the equation
$$c_1 e^{x} + c_2 e^{-x} = 0$$
can't be satisfied for all real x unless the constants $c_1$ and $c_2$ are both zero.
(b) Show that the equation $2e^{x} - 3e^{-x} = 0$ is satisfied for exactly one real x and that
$2e^{x} + 3e^{-x} = 0$ is satisfied for no real x.
(c) For what complex values of x is $2e^{x} + 3e^{-x} = 0$ satisfied?
22. Suppose that
(a) Find the general solution y(t) of this equation.
(b) Let
$$z(t) = \frac{dy}{dt}(t).$$
Show that the parametrized curve (y(t), z(t)) traces clockwise a circular path or else
reduces to a single point.
23. Theorem 2.4 implies that there is a one-to-one correspondence between the set of all
n-tuples of initial values $(z_0, z_1, \dots, z_{n-1})$ and all solutions to the nth-order
homogeneous equation L(y) = 0. Explain how this conclusion follows.
24. Theorem 2.4 implies that there is a one-to-one correspondence between the set of all
n-tuples of initial values $(z_0, z_1, \dots, z_{n-1})$ and all n-tuples $(c_1, c_2, \dots, c_n)$ of
coefficients of linear combination. Explain how this conclusion follows.
25. (a) Show that the family of differential equations $y'' - (2r + h)y' + r(r + h)y = 0$,
depending on the parameter h, can also be written $(D - r)(D - (r + h))y = 0$.
(b) Show that the equations of part (a) have solutions $y = c_1 e^{(r+h)x} + c_2 e^{rx}$ if
$h \ne 0$.
(c) Let $c_1 = 1/h$, $c_2 = -1/h$, and show that for each fixed x, the resulting solution
$y_h(x)$ tends as $h \to 0$ to
$$y = \frac{d}{dr}e^{rx} = xe^{rx}.$$
In Exercises 26 to 29, assume that L is a linear operator such that $L(y_1) = w_1$,
$L(y_2) = w_2$, and $L(y_3) = w_3$.
INTRODUCTION TO SYSTEMS
In the two previous chapters we have considered differential equations whose solu-
tions are real-valued functions of a real variable. It turns out that a natural and useful
generalization is to consider vector differential equations (or, equivalently, systems
of real differential equations) whose solutions are vector-valued functions of a real
variable. There are two main reasons for making this generalization: one is that
many phenomena in applied mathematics can most naturally be expressed in vector
form; another is that real differential equations of order higher than one can often be
reduced advantageously to vector equations of order one. Both of these statements
will be explained in this chapter.
In dealing with vector equations it will be convenient to use the letter t to denote
the variable with respect to which derivatives are taken. This choice has the advantage
that applications most frequently involve time-dependent phenomena, and also that
the letters x, y, z, etc., are left free to denote space coordinates, as usual. To write
general systems of differential equations more compactly we'll use the notation
$\mathbf{x} = (x, y)$ for 2-dimensional systems, $\mathbf{x} = (x, y, z)$ for 3-dimensional systems, and
$\mathbf{x} = (x_1, \dots, x_n)$ for n-dimensional systems.
$$x = x(t)$$
$$y = y(t),$$
572 Chapter 12 Introduction to Systems
defined on some interval a < t < b and satisfying the differential equations. As t
increases from a to b, the point in the xy-plane with coordinates (x(t), y(t)) will
trace a path, perhaps like the one in Figure 12.1(b). Such a path is called a trajectory
of the system. It's important to remember that a solution of a system is a function
of t and that its trajectory is the image of this function, containing only a part of the
information contained in the solution; by itself, the trajectory fails to show explicitly
the correspondence between values of t and values (x(t), y(t)) of the solution, nor
does the trajectory display the speed of traversal. Nevertheless a sketch of several
judiciously chosen trajectories of a system, called for historical reasons a phase
portrait of the system, is often enough to convey important information about the
system, particularly if the directions of traversal are shown by inserting appropriate
arrow points. Note however that a "trajectory" may be a single point $\mathbf{x}_0$ arising from
a constant solution $\mathbf{x}(t) = \mathbf{x}_0$; in that case we refer respectively to an equilibrium
point and an equilibrium solution.
FIGURE 12.1
In the description of scientific problems, the variable t often represents time. Thus
the derivatives x' and y' may stand for the rates of change of x and y with respect
to time; if t is to be interpreted as time, the derivatives are often written with dots
instead of primes: $\dot{x}$, $\dot{y}$.
EXAMPLE 1 Consider the system
$$\dot{x} = x$$
$$\dot{y} = 2y.$$
This system is particularly simple because each unknown function occurs in just one
equation. Such a system is called uncoupled. We can therefore solve each equation
separately to get the general solution
$$x(t) = c_1 e^{t}$$
$$y(t) = c_2 e^{2t}.$$
We see that $x(0) = c_1$ and $y(0) = c_2$, so that imposing initial conditions such as
$$x(0) = 1$$
$$y(0) = 2$$
will determine the values of $c_1$ and $c_2$ and will single out the particular solution
$x(t) = e^{t}$, $y(t) = 2e^{2t}$. Since $e^{2t} = (e^{t})^2$, the points of its trajectory satisfy
$$y = 2x^2,$$
with the restriction that x(t) > 0 and y(t) > 0. Figure 12.2 shows a part of the
trajectory for $-\infty < t < \infty$. Because $x(t) = e^{t}$ can't be negative, the trajectory
consists of only half of the parabola $y = 2x^2$.
FIGURE 12.2
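As a quick numerical check of this example, the Python sketch below samples the particular solution with c1 = 1 and c2 = 2, that is x(t) = e^t and y(t) = 2e^(2t), and confirms that the sampled points lie on the right half of the parabola y = 2x^2. The function name `solution` and the sample times are illustrative assumptions.

```python
import math

def solution(t, c1=1.0, c2=2.0):
    """General solution of x' = x, y' = 2y, evaluated at time t."""
    return c1 * math.exp(t), c2 * math.exp(2 * t)

points = [solution(t) for t in (-2.0, -1.0, 0.0, 0.5, 1.0)]

# Distance of each sampled point from the parabola y = 2 x^2 (up to rounding).
residuals = [abs(y - 2 * x ** 2) for x, y in points]
print(points)
print(residuals)
```

The points trace the curve but say nothing about when each point is reached, which is the information a trajectory leaves out.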
Section 1A Vector Fields 573
We can write the system of differential equations in Example 1 in vector notation
by letting $\mathbf{x} = (x, y)$, $d\mathbf{x}/dt = (dx/dt, dy/dt)$, and $F(x, y) = (x, 2y)$. Then the
system becomes
$$\frac{d\mathbf{x}}{dt} = F(\mathbf{x}).$$
Similarly, the system
$$\frac{dx}{dt} = x + y + t$$
$$\frac{dy}{dt} = x - y - t$$
would be written $d\mathbf{x}/dt = F(t, \mathbf{x})$, where
$$F(t, x, y) = (x + y + t,\; x - y - t).$$
$$\frac{dx_1}{dt} = F_1(t, x_1, x_2, \dots, x_n)$$
$$\frac{dx_2}{dt} = F_2(t, x_1, x_2, \dots, x_n)$$
$$\vdots$$
$$\frac{d\mathbf{x}}{dt} = F(t, \mathbf{x}),$$
where $F(t, \mathbf{x}) = (F_1(t, \mathbf{x}), \dots, F_n(t, \mathbf{x}))$. A solution $\mathbf{x} = \mathbf{x}(t)$ is a vector-valued
function of a real variable t on an interval a < t < b, such that substitution into the
differential equation satisfies the system of equations on the interval, as
$(x(t), y(t)) = (e^{t}, 3e^{2t})$ satisfies $(\dot{x}, \dot{y}) = (x, 2y)$ in Example 1. The image of the
interval under this function is a trajectory curve in $\mathbb{R}^n$. Such a trajectory in
$\mathbb{R}^3$ is shown in Figure 12.3.
One advantage of the vector interpretation of a system is that the derivative dx/dt
has a geometric meaning that the derivatives $dx_i/dt$ do not have when taken
separately: dx/dt is a tangent vector to a trajectory, and if t is time, then dx/dt is a
velocity vector. Formally, tangent and velocity are matters of definition. The follow-
ing discussion shows how the formal definitions are suggested by the intuitive ideas
behind tangent and velocity.
FIGURE 12.3
Reviewing the vector derivative, Figure 12.3 shows the points $\mathbf{x}(t)$, $\mathbf{x}(t + h)$, and
the chord $\mathbf{x}(t + h) - \mathbf{x}(t)$ joining them. If we multiply the vector by 1/h, then
$$\frac{\mathbf{x}(t + h) - \mathbf{x}(t)}{h}$$
will be parallel to the chord, and if it has a limit as $h \to 0$, this limit vector can
reasonably be defined to be a tangent vector at $\mathbf{x}(t)$. The limit is defined as in Chapter
4 so that
$$\frac{d\mathbf{x}}{dt}(t) = \lim_{h \to 0}\frac{\mathbf{x}(t + h) - \mathbf{x}(t)}{h}
= \left(\lim_{h \to 0}\frac{x_1(t + h) - x_1(t)}{h}, \dots, \lim_{h \to 0}\frac{x_n(t + h) - x_n(t)}{h}\right)
= \left(\frac{dx_1}{dt}(t), \frac{dx_2}{dt}(t), \dots, \frac{dx_n}{dt}(t)\right).$$
Thus $\frac{d\mathbf{x}}{dt}(t)$ is a tangent vector at $\mathbf{x}(t)$. The velocity interpretation is valid when t is
time because the Euclidean length $|(d\mathbf{x}/dt)(t)|$ is defined to be the speed of traversal
of the trajectory at $\mathbf{x}(t)$. The reason is that for small values of h, the Euclidean length
$$\left|\frac{\mathbf{x}(t + h) - \mathbf{x}(t)}{h}\right|$$
is nearly the average rate of traversal of the trajectory over the interval from t to
t + h. This approximation is really good only if $\mathbf{x}(t)$ is differentiable, in which case,
because length is continuous,
$$\lim_{h \to 0}\left|\frac{\mathbf{x}(t + h) - \mathbf{x}(t)}{h}\right| = \left|\frac{d\mathbf{x}}{dt}(t)\right|.$$
FIGURE 12.4
If in the system
$$\frac{d\mathbf{x}}{dt} = F(t, \mathbf{x}),$$
the function $F(t, \mathbf{x})$ is independent of t, then the system is called autonomous and
so has the form
$$\frac{d\mathbf{x}}{dt} = F(\mathbf{x}).$$
Example 2 describes an autonomous system. For an autonomous system, the tangent
vector F(x) located at the point x is always the same regardless of what time it is
when the trajectory passes through x. Such an assignment of vectors F(x) to points
x is called a vector field and Figure 12.4(b) shows a sketch of one. For the general
nonautonomous system, the function F(t, x) specifies a tangent vector (to a trajectory
through x) that may be different for each t, in this way producing a time-dependent
vector field. The pictorial analogue of the static vector field shown in Figure 12.4 for
a time-dependent vector field would be a sequence of "snapshots" taken at different
times. Each snapshot would have the same general form as Figure 12.4, but would
show changes in the individual arrows as time t varies.
EXAMPLE 3 The 2-dimensional system
$$\frac{dx}{dt} = (1 - t)x - ty$$
$$\frac{dy}{dt} = tx + (1 - t)y$$
determined by the time-dependent vector field
$$F(t, x, y) = \begin{pmatrix} (1 - t)x - ty \\ tx + (1 - t)y \end{pmatrix}
= (1 - t)\begin{pmatrix} x \\ y \end{pmatrix} + t\begin{pmatrix} -y \\ x \end{pmatrix}$$
is not autonomous. Figure 12.5 shows some sketches of the vector fields for
$t = 0, \tfrac12, 1$.
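To see how the arrows of this time-dependent field change, one can tabulate F(t, x, y) at a few points for each of the three times. The Python sketch below does that; the sample points are arbitrary choices made for illustration, and only the formula for F comes from the example.

```python
def F(t, x, y):
    """Time-dependent vector field of the example: (1 - t)(x, y) + t(-y, x)."""
    return ((1 - t) * x - t * y, t * x + (1 - t) * y)

sample_points = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]   # illustrative tail points

for t in (0.0, 0.5, 1.0):
    arrows = [F(t, x, y) for x, y in sample_points]
    print(t, arrows)
```

At t = 0 each arrow points directly away from the origin, at t = 1 it is perpendicular to the position vector, and at t = 1/2 it is an equal blend of the two.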
Suppose the 2-dimensional system dx/dt = F(t, x, y), dy/dt = G(t, x, y) is such
that the ratio G(t, x, y)/F(t, x, y) = R(x, y) happens to be independent of t. This
would occur in particular if neither F nor G depended explicitly on t. Since the
chain rule allows us to write
$$\frac{dy}{dx} = \frac{dy/dt}{dx/dt}$$
under fairly general conditions, we can conclude that there are trajectory curves of
the system satisfying the differential equation
$$\frac{dy}{dx} = R(x, y).$$
If we can solve this equation, we have a way to plot trajectories without finding
solutions x(t), y(t). For example, dx/dt = ty, dy/dt = -tx leads us to consider
$$\frac{dy}{dx} = -\frac{x}{y},$$
X to 27. Looking at the vector field will tell you roughly how a trajectory is traced.
The trajectory of a constant solution, for which $\dot{x} = \dot{y} = 0$ for all t, is just a single
point.
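For the example dx/dt = ty, dy/dt = -tx, separating variables in dy/dx = -x/y gives y dy = -x dx, so trajectories lie on circles x^2 + y^2 = C. The Python sketch below checks this numerically. It is an illustration under assumptions: it uses an improved Euler step applied to the pair of equations, the starting point (1, 0) at t = 0 is an arbitrary choice, and the function names are hypothetical.

```python
def field(t, x, y):
    """Right-hand side of dx/dt = t*y, dy/dt = -t*x."""
    return t * y, -t * x

def improved_euler_path(x0, y0, t0, h, n_steps):
    """Improved Euler approximation to the trajectory starting at (x0, y0)."""
    t, x, y = t0, x0, y0
    path = [(x, y)]
    for _ in range(n_steps):
        fx, fy = field(t, x, y)
        px, py = x + h * fx, y + h * fy          # simple Euler predictor
        gx, gy = field(t + h, px, py)
        x, y = x + 0.5 * h * (fx + gx), y + 0.5 * h * (fy + gy)
        t += h
        path.append((x, y))
    return path

path = improved_euler_path(1.0, 0.0, 0.0, 0.001, 2000)   # 0 <= t <= 2
drift = max(abs(x * x + y * y - 1.0) for x, y in path)
print(drift)
```

The computed points stay on the unit circle to within a very small drift, even though the speed at which the circle is traced grows with t.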
1C Second-Order Equations
Systems of first-order differential equations arise very naturally in the study of higher-
order equations. We restrict ourselves here to the most important case, namely second
order, a context we discussed also in Chapter 11, Section 7. In a second-order
1-dimensional equation $\ddot{y} = f(t, y, \dot{y})$ we reduce the order to 1 and simultaneously
raise the dimension to 2 by introducing a new dependent variable z. We let $\dot{y} = z$.
Then $\ddot{y} = \dot{z}$. Thus an initial-value problem
$$\ddot{y} = f(t, y, \dot{y}), \quad y(t_0) = y_0, \quad \dot{y}(t_0) = z_0$$
is equivalent to a 2-dimensional system of the form
$$\dot{y} = z, \quad y(t_0) = y_0,$$
$$\dot{z} = f(t, y, z), \quad z(t_0) = z_0,$$
a first-order system $\dot{\mathbf{x}} = F(t, \mathbf{x})$, with $\mathbf{x} = (y, z)$ and time-dependent vector field
$$F(t, y, z) = (z, f(t, y, z)).$$
FIGURE 12.5
$$y(0) = c_1 = 1,$$
$$z(0) = c_2 = 0.$$
The vector solution determined by these values of $c_1$ and $c_2$ is
$$\begin{pmatrix} y(t) \\ z(t) \end{pmatrix} = \begin{pmatrix} \cos t \\ -\sin t \end{pmatrix}.$$
The trajectory of this solution is a circle of radius 1, shown in Figure 12.6(a), which
taken altogether is a phase portrait for the original equation $\ddot{y} + y = 0$, consisting of
the circles
$$y^2 + z^2 = c_1^2 + c_2^2.$$
FIGURE 12.6 (a) Circles; $\ddot{y} + y = 0$. (b) Spirals; $\ddot{y} + \tfrac25\dot{y} + \tfrac15 y = 0$.
The curves in Figure 12.6 are trajectories of vector solutions y = y(t), z = z(t)
and as such have directions as indicated by arrow points in the figures. If a phase
curve has parts on both sides of the y axis, these directions in a phase portrait always
have a clockwise orientation relative to the usual orientation of the y-axis and z-axis.
The general principle is that when $z = \dot{y}$ is positive then y is increasing, and when
$z = \dot{y}$ is negative then y is decreasing. Since $z = \dot{y} = 0$ on the y axis, it follows
that $dy/dz = \dot{y}/\dot{z} = 0$, so a trajectory that crosses the y axis crosses vertically, as
displayed in both parts of Figure 12.6.
The differential equation $\ddot{y} + \tfrac25\dot{y} + \tfrac15 y = 0$ has general solution
$y = e^{-t/5}(c_1\cos\tfrac25 t + c_2\sin\tfrac25 t)$ and is equivalent to the first-order system
$$\dot{y} = z$$
$$\dot{z} = -\tfrac15 y - \tfrac25 z.$$
We solve the second-order equation for y and then find $z = \dot{y}$ to solve the system:
$$y = e^{-t/5}\left(c_1\cos\tfrac25 t + c_2\sin\tfrac25 t\right)$$
$$z = \tfrac15 e^{-t/5}\left((2c_2 - c_1)\cos\tfrac25 t - (c_2 + 2c_1)\sin\tfrac25 t\right).$$
Figure 12.6(b) shows some phase space plots starting at eight different points; the
trajectories all spiral in toward the origin as t increases.
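The pair of formulas for y and z can be checked against the system numerically. The Python sketch below assumes the equation y'' + (2/5)y' + (1/5)y = 0, so that the system is y' = z, z' = -(1/5)y - (2/5)z, and compares central-difference derivatives of the formulas with the right-hand sides. The constants c1 = 1, c2 = -2 and the sample times are arbitrary illustrative choices.

```python
import math

def y(t, c1, c2):
    return math.exp(-t / 5) * (c1 * math.cos(2 * t / 5) + c2 * math.sin(2 * t / 5))

def z(t, c1, c2):
    return 0.2 * math.exp(-t / 5) * ((2 * c2 - c1) * math.cos(2 * t / 5)
                                     - (c2 + 2 * c1) * math.sin(2 * t / 5))

def residuals(t, c1, c2, h=1e-5):
    """Differences between numerical derivatives and the system's right-hand sides."""
    dy = (y(t + h, c1, c2) - y(t - h, c1, c2)) / (2 * h)
    dz = (z(t + h, c1, c2) - z(t - h, c1, c2)) / (2 * h)
    r1 = dy - z(t, c1, c2)                                   # y' = z
    r2 = dz - (-0.2 * y(t, c1, c2) - 0.4 * z(t, c1, c2))     # z' = -y/5 - 2z/5
    return r1, r2

worst = max(abs(r) for t in (0.0, 1.0, 2.5, 7.0)
            for r in residuals(t, c1=1.0, c2=-2.0))
print(worst)
```

Residuals at the level of rounding error indicate that the vector function (y(t), z(t)) does satisfy the first-order system.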
EXERCISES
The uncoupled systems I to 4 are solvable by treating 7. F(x ,y) = (x+ l,y}
each equation separately. Find the general solution, and 8. F(t,x,y) = (t,y)
then find the particular solution that satisfies the given
initial conditions. 9. F(x, y, z) = (x, ½J, jZ)
10. F(t , x, y, z) = (x + t , y - t, z)
1. dx/dt = x + I, x(0) = I,
dy/dt = y, =
y(0) 2 11. Sketch the vector field F(x, y) = (-y,x). Then sketch
the trajectory curve tangent to arrows in the field sketch,
2. dx/dt = t, x(l) = 0,
starting at (x. y) = (1, 0).
dy/dt = y, y(l) = 0
12. Show that the system i = -ty, y = tx has circular solu-
3. dx/dt = x, x(0) = 0, tion trajectories of radius r > 0, traced with increasing
dy/dt = !J,
y(0) = l, speed rt as time increases. [Hint: Show that xi+ yy = O.]
dz/dt = ½z, z(O) = -1 In Exercises 13 and 14, by letting y = z, express
4. dx/dt = x + t, x(0) = 0, each second-order differential equation as a first-order
dy/dt=y-t, y(0)=0, system of dimension 2. Also find the corresponding
dz/dt = z, z(O) = l initial conditions for y(O) and z(O), and solve the initial-
5. For the system in Exercise 2, there is a vector-valued value problem for y and z.
function F(t, x), with x = (x, y) such that the system 13. y + y + y = 0, y(0) = l, j,(0) = l
has the form dx/dt = F(t, x).
14. y + tj- = t, y(0) = 0, y(0} = l
(a) Find F.
(b) Find the speed of a trajectory through x at time t. Find first-order systems equivalent to the differential
equations 15 to 18 by setting dy/dt = z and, if appro-
6. For the system in Exercise 4, there is a vector-valued
priate, dz/dt = w.
function F(t, x), with x = (x, y, z) such that the system
has the form dx/dt F(t, x).= 15. d 2 y/dt 2 + (dy/dt) 2 + y2 = e1
(a) Find F. 16. d 2 y/dt 2 = y (dy/dt)
(b) Find the speed of a trajectory through x at time t.
17. d 3 y/dt 3 = (d 2 y/dt 2 }2 - y (dy/dt) - t
Sketch the vector fields 7 to IO by drawing a few arrows
18. d 3 y/dt 3 = 12xdy/dt
for F(x) or F(t. x) with their tails at selected points x of
the fom1 (x, y) or (x, y, z). In Exercises 8 and 10, make In Exercises 19 to 22, reduce the system to normal form,
separate sketches for t = - 1, t = 0, and t 1. = with each first derivative by itself on the left side.
Section 1C Vector Fields 579
19. dx/dt + dy/dt = t,
    dx/dt - dy/dt = y
20. dx/dt + dy/dt = y,
    dx/dt + 2dy/dt = x
21. 2dx/dt + dy/dt + x + 5y = t,
    dx/dt + dy/dt + 2x + 2y = 0
22. dx/dt - dy/dt = $e^{-t}$,
    dx/dt + dy/dt = $e^{t}$
23. Sketch a phase portrait for the second-order equation $\ddot y = \dot y$, indicating the directions of traversal where appropriate. Note however that since one of the equations in the system will be $\dot y = z$ these directions are always left to right when z > 0 and right to left when z < 0. [Hint: One set of trajectories consists of the individual points on the y axis.]
24. Sketch a phase portrait for the second-order equation $\ddot y = y$, indicating the directions of traversal. [Hint: Two families of hyperbolas make up the trajectories.]
25. Consider the 2-dimensional coupled system
    $\dot x = x + y$, $\dot y = 4x + y$.
    (a) Change the coordinates (x, y) to (z, w) with the relations
        x = z + w, y = 2z - 2w,

still no forces acting horizontally, the single equation is replaced by the 2-dimensional uncoupled system
    $\ddot x = 0$,
    $\ddot y = -g$.
    (a) Solve the 2-dimensional system, subject to the four initial conditions
        x(0) = 0, y(0) = 0,
        $\dot x(0) = z_0 > 0$, $\dot y(0) = w_0 > 0$.
    (b) Show that the trajectory of the solution found in part (a) follows a parabolic path.
    (c) Show that the maximum height is attained when the horizontal displacement is $z_0w_0/g$ and that the maximum height is $w_0^2/(2g)$.
    (d) Show that the horizontal distance traversed before returning to height y(0) = 0 is $2z_0w_0/g$. Show also that for a given initial speed $v_0 = \sqrt{z_0^2 + w_0^2}$, this horizontal distance is maximized by having $z_0 = w_0$.
31. A projectile fired against air resistance proportional to velocity satisfies the uncoupled system
    $\ddot x = -k\dot x$,
    $\ddot y = -k\dot y - g$, k > 0.
counterclockwise from the downward vertical direction, then $x = l\sin\theta$ and $y = -l\cos\theta$. [Hint: These equations are a slight modification of the usual polar coordinate relations.]
    (b) Show that $\ddot x = -l\sin\theta\,\dot\theta^2 + l\cos\theta\,\ddot\theta$ and $\ddot y = l\cos\theta\,\dot\theta^2 + l\sin\theta\,\ddot\theta$.
    (c) Use the representation $\ddot x = 0$, $\ddot y = -g$ for the coordinates of the acceleration of gravity together with the result of part (b), to derive the pendulum equation. (This derivation safely ignores the lengthwise, or radial, force on the pendulum, since that force is always perpendicular to the path of motion.) [Hint: Eliminate terms containing $\dot\theta^2$.]
33. Heat exchange. The temperatures $u(t) \ge v(t)$ of two bodies in thermal contact with each other may be governed for the warmer body by Newton's law of cooling and for the cooler body by the analogous heating law:
    \[
    \frac{du}{dt} = -p(u - v), \qquad \frac{dv}{dt} = q(u - v),
    \]
    where p and q are positive constants and the equations are subject to initial conditions $u(0) = u_0$, $v(0) = v_0$. (Note that p may be different from q if the two bodies have different capacities to absorb heat.) This is a coupled system, but because of its simple form we can solve it as follows.
    (a) Show that $q\,du/dt + p\,dv/dt = 0$. Then integrate with respect to t to show that $qu(t) + pv(t) = c_0$, where $c_0 = qu_0 + pv_0$.
    (b) Use the relation between u and v derived in part (a) together with the given system to derive a single differential equation satisfied by u(t). Then solve this equation using the initial conditions.
    (c) Find a formula for v(t).
    (d) Find out how long it takes for the initial temperature difference between the two bodies to be cut in half.
34. Bugs in mutual pursuit. Four identical bugs are on a flat table, each moving at the same constant speed v. Use (x, y)-coordinates on the table, and locate bugs 1 through 4 initially in respective quadrants 1 through 4, each at one of the points (±1, ±1). Bug 1 always heads directly toward bug 2, bug 2 toward bug 3, bug 3 toward bug 4, and bug 4 toward bug 1, so their paths are mutually congruent.
    (a) Use the symmetry of the paths to show that the bugs are at all times at the corners of a square, and in particular, if bug 1 is at (x, y), then 2, 3 and 4 are respectively at (-y, x), (-x, -y) and (y, -x).
    (b) Show for a bug at (x, y) that $\dot y/\dot x = (y - x)/(x + y)$ and that $\dot x^2 + \dot y^2 = v^2$.
    (c) Use part (b) to show that the path (x, y) = (x(t), y(t)) followed by bug 1 satisfies the nonlinear autonomous system
    \[
    \frac{dx}{dt} = \frac{-v}{\sqrt2}\,\frac{x + y}{\sqrt{x^2 + y^2}}, \qquad
    \frac{dy}{dt} = \frac{v}{\sqrt2}\,\frac{x - y}{\sqrt{x^2 + y^2}}.
    \]
Proof. If two trajectories did agree at $\mathbf x_0$, this common value taken as initial value would dictate that the trajectories are the same from that time on until one of them terminates. Similarly the reverse trajectory that satisfies $\dot{\mathbf x} = -F(\mathbf x)$, and that coincides as a curve with a trajectory approaching $\mathbf x_0$ from the other side, would also be uniquely determined until termination of one of them. •
Figure 12.7(b) shows some computer plots of trajectories for the nonautonomous system $\dot x = (1 - t)x - ty$, $\dot y = tx + (1 - t)y$. Each one of the four trajectories is shown crossing one of the others. One trajectory of a nonautonomous system may very well cross another one, or even intersect itself at a nonzero angle, because on arrival at the same point in phase space at a different time there may have been a change of direction in the vector field $F(t, \mathbf x)$. Snapshots of the vector field of this system are in Figure 12.5. The graphs of different solutions in (t, x, y)-space will have no points in common, because t varies from point to point.
FIGURE 12.7 (a) Autonomous system trajectories. (b) Nonautonomous system trajectories.
Flows. The trajectories of an autonomous system $\dot{\mathbf x} = F(\mathbf x)$ are called the flow lines of the vector field F, and we can picture them, as shown for example in Figure 12.7(a), as the possible paths followed by fluid particles in a steady fluid flow with velocity vector $F(\mathbf x)$ at $\mathbf x$. These ideas are also discussed in Chapter 8, Section 4. In what follows we'll assume that the autonomous vector field F satisfies the conditions of Theorem 1.1 in some region B in $\mathbb R^n$, thus guaranteeing (i) that there is a unique flow line through each $\mathbf x$ in B and (ii) that distinct flow lines have no points in common. We associate with each such n-dimensional vector field F a family of flow transformations $T_t$ from B to B defined by
1.3   $T_t(\mathbf x) = \mathbf y(t)$, where $\mathbf y(t)$ solves $\dot{\mathbf y} = F(\mathbf y)$ with initial value $\mathbf y(0) = \mathbf x$.
In words, $T_t(\mathbf x)$ is the point on the flow line of F starting at $\mathbf x$ that the flow reaches after time t.
The system $\dot x = -y$, $\dot y = x$ has circular trajectories, as in Figure 12.7(a). Thus a flow line of radius A for the vector field F(x, y) = (-y, x) is parametrized by $x(t) = A\cos(t + \alpha)$, $y(t) = A\sin(t + \alpha)$. To start one of these flow lines at a fixed point (u, v) when t = 0, we note that $A = \sqrt{u^2 + v^2}$, $A\cos\alpha = u$, and $A\sin\alpha = v$, and write
\[
T_t(u, v) = (u\cos t - v\sin t,\; u\sin t + v\cos t).
\]
Since sine and cosine are periodic functions, the vector-valued function $\phi : \mathbb R^3 \to \mathbb R^2$ defined by $\phi(t, u, v) = T_t(u, v)$ in the previous example is not one-to-one as a function of t and (u, v) unless t is somehow restricted. However for fixed $t = t_0$, the function $T_{t_0} : \mathbb R^2 \to \mathbb R^2$ turns out not only to be one-to-one but to have a nice inverse, namely $T_{-t_0}$. It's a straightforward exercise to show that $T_{t_0}$ is just a rotation about the origin through angle $t_0$. Hence the inverse of $T_{t_0}$ is $T_{-t_0}$. We'll see that this simple relationship between $T_t$ and its inverse holds very generally.
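The rotation example lends itself to a quick numerical check. The sketch below assumes the explicit formula $T_t(u, v) = (u\cos t - v\sin t,\ u\sin t + v\cos t)$, obtained by solving $\dot x = -y$, $\dot y = x$ with x(0) = u, y(0) = v; the function name `flow` and the sample values are illustrative choices.

```python
import math

def flow(t, u, v):
    # T_t(u, v) for x' = -y, y' = x: rotation about the origin through angle t
    return (u * math.cos(t) - v * math.sin(t),
            u * math.sin(t) + v * math.cos(t))

p = (2.0, -1.0)          # arbitrary starting point
t, s = 0.7, 1.9          # arbitrary times

composed = flow(t, *flow(s, *p))       # T_t(T_s(p))
direct = flow(t + s, *p)               # T_{t+s}(p)
undone = flow(-t, *flow(t, *p))        # T_{-t}(T_t(p)), should return p
full_turn = flow(2 * math.pi, *p)      # periodicity in t: same point as t = 0

print(composed, direct)
print(undone, full_turn)
```

The last value illustrates why $\phi(t, u, v)$ fails to be one-to-one in t: the times 0 and $2\pi$ send (u, v) to the same point.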
The flow transformations $T_t$ defined above have the composition property
1.4   $T_tT_s = T_{t+s}$; in other words, $T_t(T_s(\mathbf x)) = T_{t+s}(\mathbf x)$,
whenever all three transformations are defined. Equation 1.4 holds because the system
\[
\frac{d\mathbf x}{dt} = F(\mathbf x),
\]
of which $\mathbf y(t) = T_t(\mathbf x)$ is a solution, has a unique solution starting at $T_t(\mathbf x_0)$ whose value at s time units later must coincide with the unique solution value achieved by starting at $\mathbf x_0$ and running for time t + s. Furthermore, the reversed system,
\[
\frac{d\mathbf x}{dt} = -F(\mathbf x),
\]
has solution trajectories traced in the direction $-F(\mathbf x)$ exactly opposite to that of the solutions of $d\mathbf x/dt = F(\mathbf x)$. We can use solutions of the reversed system to define $T_t$ for t < 0 by $T_t(\mathbf x_0) = \mathbf z(-t)$, where $\mathbf z(t)$ satisfies
\[
\frac{d\mathbf z}{dt} = -F(\mathbf z), \qquad \mathbf z(0) = \mathbf x_0.
\]
It follows that each of $T_{-t}$ and $T_t$ is an inverse operator to the other, so
1.5   $T_t^{-1} = T_{-t}$.
Proof. The existence of a unique solution curve passing through each $\mathbf x_0$ ensures that $T_t$ is a well-defined transformation. Furthermore $T_t$ is one-to-one because if
Section 1D Vector Fields 583
$T_t(\mathbf x_0) = T_t(\mathbf x_1)$ for some t > 0, and for $\mathbf x_0 \ne \mathbf x_1$, then there would be two distinct solution curves to the system
\[
\frac{d\mathbf x}{dt} = -F(\mathbf x)
\]
starting at $\mathbf y_0 = T_t(\mathbf x_0) = T_t(\mathbf x_1)$ and passing back through $\mathbf x_0$ and $\mathbf x_1$, respectively. But this is impossible, again by the uniqueness theorem.
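To see what the transformation $T_t$ does to volumes, we use the following.
1.7 Lemma. The Jacobian determinant $J_t(\mathbf x)$ satisfies
\[
\frac{dJ_t(\mathbf x)}{dt} = \operatorname{div}F(\mathbf y(t))\,J_t(\mathbf x),
\]
where $\dot{\mathbf y}(t) = F(\mathbf y(t))$ and $\mathbf y(0) = \mathbf x$. By Theorem 1.1 $T_t(\mathbf x)$ is a continuously differentiable function of $\mathbf x$. We use the Leibniz notation of Chapter 7, Section 4D to write
\[
J_t = \frac{\partial(y_1(t), \ldots, y_n(t))}{\partial(x_1, \ldots, x_n)}.
\]
The derivative of a determinant is the sum of the determinants obtained by differentiating one row at a time, as shown in Exercise 9. With rows indexed by i, we then have
\[
\frac{dJ_t}{dt} = \sum_{i=1}^{n} \frac{\partial(y_1, \ldots, \dot y_i, \ldots, y_n)}{\partial(x_1, \ldots, x_n)}.
\]
By the chain rule, the ikth determinant entry in the ith term above is
\[
\frac{\partial \dot y_i}{\partial x_k} = \sum_{j=1}^{n} \frac{\partial \dot y_i}{\partial y_j}\,\frac{\partial y_j}{\partial x_k}.
\]
By row-linearity of the determinant, the ith term in the sum for $dJ_t/dt$ is then
\[
\sum_{j=1}^{n} \frac{\partial \dot y_i}{\partial y_j}\,\frac{\partial(y_1, \ldots, y_j, \ldots, y_n)}{\partial(x_1, \ldots, x_k, \ldots, x_n)},
\]
where $y_j$ occupies the ith row. But the determinants in this last sum are 0 (two rows equal) unless j = i, in which case the determinant is $J_t$. Thus the remaining multiplier of $\partial\dot y_i/\partial y_i$ is just $J_t$, and to finish proving the lemma we have
\[
\frac{dJ_t}{dt} = J_t\sum_{i=1}^{n}\frac{\partial \dot y_i(t)}{\partial y_i}
= J_t\sum_{i=1}^{n}\frac{\partial F_i(\mathbf y(t))}{\partial y_i}
= J_t\operatorname{div}F(\mathbf y(t)).
\]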
We'll now finish proving the theorem. If div F = 0 then by the lemma, $J_t$ is constant as a function of t. But $T_0$ is an identity transformation, so $J_0 = 1$ and $J_t = 1$ for t > 0 also. Hence the transformation $T_t$ is volume-preserving by the Jacobi change-of-variable theorem for multiple integrals. Conversely, volume-preservation implies $J_t = 1$, so div F = 0. Finally, the first-order linear differential equation for $J_t$ with initial condition $J_0 = 1$ has the solution
\[
J_t(\mathbf x) = \exp\left(\int_0^t \operatorname{div}F(T_u(\mathbf x))\,du\right).
\]
(Note that if div F is constant, then the exponent is just t div F.) The statements about volume-decreasing and volume-increasing follow as previously by Jacobi's theorem. •
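The formula $J_t = \exp(t\,\operatorname{div}F)$ for constant divergence can be tested numerically. The sketch below is only an illustration under stated assumptions: the field F(x, y) = (x - y, x + y), with div F = 2, is a hypothetical example not taken from the text, the flow is approximated by Runge-Kutta steps, and the Jacobian determinant is estimated by central differences.

```python
import math

def F(x, y):
    # Hypothetical example field with constant divergence: div F = 1 + 1 = 2
    return x - y, x + y

def flow(x, y, t, n=2000):
    # Approximate T_t(x, y) with n classical Runge-Kutta steps
    h = t / n
    for _ in range(n):
        k1 = F(x, y)
        k2 = F(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
        k3 = F(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
        k4 = F(x + h * k3[0], y + h * k3[1])
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return x, y

def jacobian_det(x, y, t, eps=1e-5):
    # Central-difference estimate of the Jacobian determinant of T_t at (x, y)
    xp, xm = flow(x + eps, y, t), flow(x - eps, y, t)
    yp, ym = flow(x, y + eps, t), flow(x, y - eps, t)
    a = (xp[0] - xm[0]) / (2 * eps)
    b = (yp[0] - ym[0]) / (2 * eps)
    c = (xp[1] - xm[1]) / (2 * eps)
    d = (yp[1] - ym[1]) / (2 * eps)
    return a * d - b * c

t = 0.5
J = jacobian_det(0.3, -0.7, t)
print(J, math.exp(2 * t))   # the two values should agree closely
```

Because div F > 0 here, the estimated determinant exceeds 1, consistent with a volume-increasing flow.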
EXAMPLE  You can see directly that the uncoupled system $\dot x = x$, $\dot y = 2y$, $\dot z = 3z$ has the solution $x(t) = ue^{t}$, $y(t) = ve^{2t}$, $z(t) = we^{3t}$ with initial values x(0) = u, y(0) = v, z(0) = w. The flow generated by the vector field F(x, y, z) = (x, 2y, 3z) is therefore
\[
T_t(u, v, w) = (ue^{t}, ve^{2t}, we^{3t}).
\]
EXERCISES
1. The domain of t-values for which the solution of even an autonomous system exists may be quite restricted. Illustrate this point by deriving the explicit solution to the 1-dimensional initial-value problem $\dot x = ax^2$, x(0) = 1, where a > 0 is constant.
2. If x(0) = 0, Theorem 1.1 on existence and uniqueness of solutions fails to apply to the 1-dimensional equation
\[
\dot x = \begin{cases} \sqrt{x}, & x \ge 0,\\ 0, & x < 0. \end{cases}
\]
    (a) Explain why the theorem doesn't apply if x(0) = 0.
    (b) Find two distinct solutions to the equation, both satisfying x(0) = 0.
3. Can the flow of a continuously differentiable 2-dimensional vector field send a region of positive area into a region of area zero in finite time? Explain your answer.
4. What is the flow of the identically zero vector field on $\mathbb R^3$?
5. A 2-dimensional Hamiltonian system has the form
\[
\dot x = \frac{\partial H}{\partial y}, \qquad \dot y = -\frac{\partial H}{\partial x},
\]
where the real-valued Hamiltonian function H(x, y) is assumed to be twice continuously differentiable.
    (a) Show that the flow of a 2-dimensional Hamiltonian system preserves areas.
    (b) Show that the system $\dot x = -y$, $\dot y = x$ is Hamiltonian, and find a Hamiltonian function H(x, y) for this system.
    (c) Show that the flow lines of a 2-dimensional Hamiltonian system follow level curves of the associated Hamiltonian function.
6. Show that the second-order equation $\ddot x = -f(x)$ is equivalent to a first-order system if we set $y = \dot x$. Then show that the first-order system is a Hamiltonian system, as defined in the previous exercise, with Hamiltonian $H(x, y) = \tfrac12 y^2 + U(x)$, where $U'(x) = f(x)$.
Section 2A Linear Systems 585
The function U(x) is the potential energy of the system, and H(x, y) is the total energy.
7. A 2-dimensional gradient system has the form
\[
\dot x = \frac{\partial U}{\partial x}, \qquad \dot y = \frac{\partial U}{\partial y},
\]
where the real-valued potential function U(x, y) is assumed to be twice continuously differentiable. Show that the flow of such a system preserves areas if and only if $U_{xx} + U_{yy}$ is identically zero, that is, if and only if U(x, y) is a harmonic function.
8. Consider the 2-dimensional uncoupled system $\dot x = x^3$, $\dot y = y^3$.
    (a) Sketch the vector field of the system near the origin.
    (b) Compute the Jacobian determinant $J_t$ of the flow transformation $T_t$ of the system.
    (c) Let B be a region of positive area in $\mathbb R^2$. Use the result of part (b) to show that the area of the image of B under $T_t$ is bigger than the area of B if t > 0 and less than the area of B if t < 0.
    (d) Can you draw the same conclusion as in part (c) if the original system is replaced by $\dot x = x^2$, $\dot y = y^2$? Explain your reasoning.
9. Show that if the entries in an n-by-n matrix $A(t) = (a_{ij}(t))$ are differentiable functions of a real variable t, then the derivative of det A(t) is computed by differentiating the entries in one row of A(t) at a time and then adding the resulting n determinants.
    (a) Carry out the proof for 2-by-2 matrices by first expanding the determinant and then applying the product rule for differentiation.
    (b) Carry out the proof for n-by-n matrices by induction, first expanding the determinant by one row, for example the first row, and then applying the induction hypothesis to the cofactors.
10. If the proof of the lemma for Theorem 1.6 is restricted to dimension 2, the computation is in principle no simpler but we avoid using so many subscripts. In particular, we deal with the flow transformation $T_t(u, v) = (x(t), y(t))$ generated by the system $\dot x = F(x, y)$, $\dot y = G(x, y)$ with initial conditions x(0) = u, y(0) = v.
    (a) For fixed t let $J_t(u, v)$ be the Jacobian determinant of $T_t(u, v)$ with respect to u and v. Use the system to show that
    \[
    \frac{d}{dt}J_t = F_u y_v + x_u G_v - F_v y_u - x_v G_u.
    \]
    (b) Apply the chain rule, for example $F_u = F_x x_u + F_y y_u$, to the partials of F and G in part (a) to show that $(d/dt)J_t = (F_x + G_y)J_t = \operatorname{div}(F, G)J_t$. Here F and G are evaluated at (x(t), y(t)).
    (c) Noting that $T_0$ is an identity transformation, so that $J_0 = \det T_0 = 1$, solve the first order linear differential equation for $J_t$ in part (b) to show that $J_t = \exp\left(\int_0^t \operatorname{div}(F, G)\,dt\right)$, where F and G are evaluated at (x(t), y(t)).
11. The last part of the proof of Theorem 1.6 shows that the Jacobian determinant of a flow transformation $T_t$ is $J_t(\mathbf x) = \exp\left(\int_0^t \operatorname{div}F(T_u(\mathbf x))\,du\right)$.
    (a) Show that if div F is constant, then the exponent is t div F.
    (b) Show generally that the exponent is t times the time-average of div F over the part of the flow line starting at $\mathbf x$ traced between time 0 and time t.
\[
\frac{dx}{dt} = a(t)x + b(t),
\]
where a and b are real-valued functions defined on some interval. Similarly an n-
dimensional first-order system of differential equations is called a linear system if
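\[
A(t) = \begin{pmatrix}
a_{11}(t) & \cdots & a_{1n}(t)\\
a_{21}(t) & \cdots & a_{2n}(t)\\
\vdots & & \vdots\\
a_{n1}(t) & \cdots & a_{nn}(t)
\end{pmatrix},
\qquad
\mathbf b(t) = \begin{pmatrix} b_1(t)\\ b_2(t)\\ \vdots\\ b_n(t) \end{pmatrix}.
\]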
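EXAMPLE 1  The first term on the right side of the vector differential equation
\[
\begin{pmatrix} dx/dt\\ dy/dt \end{pmatrix}
= \begin{pmatrix} 2 & 4\\ 1 & -1 \end{pmatrix}
\begin{pmatrix} x\\ y \end{pmatrix}
+ \begin{pmatrix} 2\\ 4 \end{pmatrix}
\]
is a matrix product. Written out in coordinates, the system is
\[
\begin{aligned}
\frac{dx}{dt} &= 2x + 4y + 2,\\
\frac{dy}{dt} &= x - y + 4.
\end{aligned}
\]
With D = d/dt, the system becomes
\[
\begin{aligned}
(D - 2)x - 4y &= 2,\\
-x + (D + 1)y &= 4.
\end{aligned}
\]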
At this point, we follow a routine similar to the row reduction method for solving linear algebraic equations. For example, we can operate on the second equation with the differential operator (D - 2) to eliminate x when we add the first equation to the second:
\[
\begin{aligned}
(D - 2)x - 4y &= 2,\\
-(D - 2)x + (D - 2)(D + 1)y &= (D - 2)4.
\end{aligned}
\]
Addition gives
\[
(D - 2)(D + 1)y - 4y = (D - 2)4 + 2
\]
or
\[
D^2y - Dy - 6y = -6.
\]
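We can solve this equation by the methods of the previous chapter, because it contains only one unknown function, namely y(t). The characteristic equation is
\[
r^2 - r - 6 = (r + 2)(r - 3) = 0,
\]
which has roots $r_1 = -2$ and $r_2 = 3$. Hence the general solution of the associated homogeneous equation is
\[
y_h(t) = c_1e^{-2t} + c_2e^{3t}.
\]
Since the constant $y_p = 1$ is a particular solution, $y(t) = c_1e^{-2t} + c_2e^{3t} + 1$. We use the second equation of the system to express x(t) directly in terms of y(t):
\[
\begin{aligned}
x(t) &= (D + 1)y - 4\\
&= -c_1e^{-2t} + 4c_2e^{3t} - 3.
\end{aligned}
\]
In vector form,
\[
\begin{pmatrix} x(t)\\ y(t) \end{pmatrix}
= c_1e^{-2t}\begin{pmatrix} -1\\ 1 \end{pmatrix}
+ c_2e^{3t}\begin{pmatrix} 4\\ 1 \end{pmatrix}
+ \begin{pmatrix} -3\\ 1 \end{pmatrix}.
\]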
Our method of solution guaranteed only that every solution of the original system
must be a special case of the general formula just obtained. Therefore we should
substitute the formula into the system to see if it really provides a solution for every
choice of $c_1$ and $c_2$. The general theory to be developed in Chapter 13, Section 3
applies, showing that for systems of the form dx/dt = Ax+b(t), the general solution
of an n-dimensional system contains n arbitrary constants. Therefore substitution is
necessary in such an example only if the number of arbitrary constants present
is greater than the dimension of the system, in which case the substitution leads to
relations between the constants. Substitution is always a useful check on the accuracy
of a computation.
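The substitution check can be carried out mechanically. The sketch below assumes the general solution $x = -c_1e^{-2t} + 4c_2e^{3t} - 3$, $y = c_1e^{-2t} + c_2e^{3t} + 1$ for the system $dx/dt = 2x + 4y + 2$, $dy/dt = x - y + 4$; the function names and sample constants are illustrative.

```python
import math

def solution(t, c1, c2):
    # Candidate general solution of the system
    x = -c1 * math.exp(-2 * t) + 4 * c2 * math.exp(3 * t) - 3
    y = c1 * math.exp(-2 * t) + c2 * math.exp(3 * t) + 1
    return x, y

def derivative(t, c1, c2):
    # Term-by-term derivatives of the candidate solution
    dx = 2 * c1 * math.exp(-2 * t) + 12 * c2 * math.exp(3 * t)
    dy = -2 * c1 * math.exp(-2 * t) + 3 * c2 * math.exp(3 * t)
    return dx, dy

def residual(t, c1, c2):
    # Left side minus right side of each equation; both should vanish
    x, y = solution(t, c1, c2)
    dx, dy = derivative(t, c1, c2)
    return dx - (2 * x + 4 * y + 2), dy - (x - y + 4)

for c1, c2 in [(1.0, 0.5), (-2.0, 3.0), (0.0, 0.0)]:
    for t in [0.0, 0.3, 1.0]:
        print(c1, c2, t, residual(t, c1, c2))
```

Residuals at roundoff level for several arbitrary choices of the constants support, but of course do not prove, that the formula solves the system for every $c_1$ and $c_2$.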
The previous example was misleadingly simple, because after solving for y(t)
it was not necessary to solve another differential equation to find x(t); as a result,
no extra arbitrary constants were introduced, so there was no need to find relations
among the constants so the number of constants would equal the dimension of the
system. We'll content ourselves here with such simple examples, leaving the more
complicated ones for Chapter 13 where we use more efficient methods.
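\[
\mathbf x_p(t) = \begin{pmatrix} -3\\ 1 \end{pmatrix}
\]
is a particular solution that happens to be constant, and
\[
\mathbf x_h(t) = c_1\begin{pmatrix} -e^{-2t}\\ e^{-2t} \end{pmatrix}
+ c_2\begin{pmatrix} 4e^{3t}\\ e^{3t} \end{pmatrix}
\]
\[
u = \frac{dx}{dt}, \qquad v = \frac{dy}{dt}.
\]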
Section 2B Linear Systems 589
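The system then becomes
\[
\begin{aligned}
\frac{du}{dt} &= x + 2y + t,\\
\frac{dv}{dt} &= 3x + 2y,\\
\frac{dx}{dt} &= u,\\
\frac{dy}{dt} &= v.
\end{aligned}
\]
This system is of the standard form $d\mathbf x/dt = A\mathbf x + \mathbf b(t)$, where
\[
A = \begin{pmatrix}
0 & 0 & 1 & 2\\
0 & 0 & 3 & 2\\
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0
\end{pmatrix}
\quad\text{and}\quad
\mathbf b(t) = \begin{pmatrix} t\\ 0\\ 0\\ 0 \end{pmatrix}.
\]
The order u, v, x, y has been used in forming the matrix. Alternatively, the system takes the form
\[
\begin{aligned}
Du - x - 2y &= t,\\
Dv - 3x - 2y &= 0,\\
-u + Dx &= 0,\\
-v + Dy &= 0.
\end{aligned}
\]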
The numerical methods in Section 4 will apply directly to the system in standard
form. We take up matrix methods for solving the first-order system in Chapter 13,
Sections 1 to 3, but here we use just the techniques of Chapter 11. A moment's
thought shows that the elimination method applied to the first-order system will
simply take us back to the original second-order system, or one equivalent to it.
Therefore we might as well try to solve the second-order system directly by elimi-
nation. We first solve the associated homogeneous system, and, as usual, we write
D = d/dt to get
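\[
\begin{aligned}
(D^2 - 1)x - 2y &= 0,\\
-3x + (D^2 - 2)y &= 0.
\end{aligned}
\]
If we multiply the second equation by 2 and operate on the first with $(D^2 - 2)$, then addition of the resulting equations eliminates y:
\[
(D^2 - 2)(D^2 - 1)x - 6x = 0.
\]
Multiplying out the operators gives
\[
(D^4 - 3D^2 - 4)x = 0.
\]
We solve the equation by finding its characteristic roots from the equation
\[
r^4 - 3r^2 - 4 = (r^2 + 1)(r^2 - 4) = 0,
\]
which has roots $\pm i$ and $\pm 2$, so $x_h(t) = c_1\cos t + c_2\sin t + c_3e^{2t} + c_4e^{-2t}$. The first of the two homogeneous equations allows us to solve for y directly as $y = \tfrac12(D^2 - 1)x$, giving
\[
\begin{pmatrix} x_h(t)\\ y_h(t) \end{pmatrix}
= c_1\begin{pmatrix} \cos t\\ -\cos t \end{pmatrix}
+ c_2\begin{pmatrix} \sin t\\ -\sin t \end{pmatrix}
+ c_3\begin{pmatrix} e^{2t}\\ \tfrac32e^{2t} \end{pmatrix}
+ c_4\begin{pmatrix} e^{-2t}\\ \tfrac32e^{-2t} \end{pmatrix}.
\]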
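\[
\begin{pmatrix} x_p(t)\\ y_p(t) \end{pmatrix}
= \begin{pmatrix} at + b\\ ct + d \end{pmatrix},
\]
Since the second derivatives of these trial functions are zero, substitution into the nonhomogeneous system gives
\[
\begin{aligned}
0 &= (at + b) + 2(ct + d) + t,\\
0 &= 3(at + b) + 2(ct + d),
\end{aligned}
\]
or
\[
\begin{aligned}
0 &= (a + 2c + 1)t + (b + 2d),\\
0 &= (3a + 2c)t + (3b + 2d).
\end{aligned}
\]
It follows that
\[
b + 2d = 0, \quad a + 2c = -1, \quad 3b + 2d = 0, \quad 3a + 2c = 0.
\]
Hence $b = d = 0$, $a = \tfrac12$, $c = -\tfrac34$, and
\[
\begin{pmatrix} x_p(t)\\ y_p(t) \end{pmatrix}
= \begin{pmatrix} \tfrac12 t\\ -\tfrac34 t \end{pmatrix}.
\]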
We could have computed the particular solutions along with the homogeneous solu-
tion, by applying elimination to the nonhomogeneous system.
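A typical set of initial conditions for the system might take the form
\[
\begin{pmatrix} x(0)\\ y(0) \end{pmatrix} = \begin{pmatrix} 0\\ 0 \end{pmatrix},
\qquad
\begin{pmatrix} dx/dt(0)\\ dy/dt(0) \end{pmatrix} = \begin{pmatrix} 0\\ 1 \end{pmatrix}.
\]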
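Imposing four initial conditions of this kind on a general solution with four arbitrary constants leads to a 4-by-4 linear algebraic system. The following sketch is one way to solve it exactly, under stated assumptions: the general solution is taken to be $x = c_1\cos t + c_2\sin t + c_3e^{2t} + c_4e^{-2t} + \tfrac12t$, $y = -c_1\cos t - c_2\sin t + \tfrac32c_3e^{2t} + \tfrac32c_4e^{-2t} - \tfrac34t$, the rows below come from evaluating x, y, dx/dt, dy/dt at t = 0, and the helper `solve` is a hypothetical Gauss-Jordan routine written for this illustration.

```python
from fractions import Fraction as Fr

def solve(A, b):
    # Gauss-Jordan elimination with exact rational arithmetic
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

# Rows: x(0) = 0, y(0) = 0, x'(0) = 0, y'(0) = 1, with the contribution
# of the particular solution (1/2 and -3/4 in the derivatives) moved right.
A = [[Fr(1), Fr(0), Fr(1), Fr(1)],
     [Fr(-1), Fr(0), Fr(3, 2), Fr(3, 2)],
     [Fr(0), Fr(1), Fr(2), Fr(-2)],
     [Fr(0), Fr(-1), Fr(3), Fr(-3)]]
b = [Fr(0), Fr(0), Fr(-1, 2), Fr(7, 4)]

c = solve(A, b)
print(c)   # constants c1, c2, c3, c4 as exact fractions
```

Using `Fraction` avoids roundoff, so the result can be compared directly with a hand computation.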
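\[
c_1\begin{pmatrix} 1\\ -1 \end{pmatrix}
+ c_2\begin{pmatrix} 0\\ 0 \end{pmatrix}
+ c_3\begin{pmatrix} 1\\ \tfrac32 \end{pmatrix}
+ c_4\begin{pmatrix} 1\\ \tfrac32 \end{pmatrix}
= \begin{pmatrix} 0\\ 0 \end{pmatrix},
\]
\[
c_1\begin{pmatrix} 0\\ 0 \end{pmatrix}
+ c_2\begin{pmatrix} 1\\ -1 \end{pmatrix}
+ c_3\begin{pmatrix} 2\\ 3 \end{pmatrix}
+ c_4\begin{pmatrix} -2\\ -3 \end{pmatrix}
+ \begin{pmatrix} \tfrac12\\ -\tfrac34 \end{pmatrix}
= \begin{pmatrix} 0\\ 1 \end{pmatrix}.
\]
These two vector equations are equivalent to the equations
\[
\begin{aligned}
c_1 + c_3 + c_4 &= 0,\\
-c_1 + \tfrac32c_3 + \tfrac32c_4 &= 0,\\
c_2 + 2c_3 - 2c_4 &= -\tfrac12,\\
-c_2 + 3c_3 - 3c_4 &= \tfrac74.
\end{aligned}
\]
These equations have the unique solution $c_1 = 0$, $c_2 = -1$, $c_3 = \tfrac18$, $c_4 = -\tfrac18$, so the particular solution we are looking for is
\[
\begin{aligned}
x(t) &= -\sin t + \tfrac18e^{2t} - \tfrac18e^{-2t} + \tfrac12t,\\
y(t) &= \sin t + \tfrac{3}{16}e^{2t} - \tfrac{3}{16}e^{-2t} - \tfrac34t.
\end{aligned}
\]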
+ d2e' ( - 2
3. .J3 cos Vr,:,)
sm V r,,3 I - 2 3I .
(x(t)) =de' (
y(t) I
-~cos-v'3t
2
+ .J3 sin-/31)
2
cos .J3 t
EXERCISES
In Exercises I to 4, classify the first-order system as In Exercises 5 and 6, solve by first eliminating one
linear or nonlinear: of the unknown functions. Then determine the arbitrary
constants so that the initial conditions are satisfied.
dx dv
1. - = I +x 2 + y 2. ___:,_ = 12 +z dx
di di 5. dt = 6x + 8y, x(O) = I
dy dz
-dt = +x + y
1
2 - = 13 + y
di dv
d~ = -4x - 6y, y(O) =0
dx dx
3. - =t x + y +
2
e1 4. - =tx
di
+y dx
di
dy dy
6. --
dt
= x + 2y
'
x(O) =0
-=l -=x+y
di dt dy
dt =-2x+y, y(0)=-1
In Exercises 7 and 8, find a particular solution for the dx dy dx dy
system. Then noting that the homogeneous equations are 11. -+-
dt dt
=x +ty. 12. - +e1 - =x+e1 •
dt dt
the same as the ones in Exercises 5 and 6, write the most dx dy 2 dx dy
general solution. - -2-=x+t • -+-=y+e-1 •
dt dt dt dt
7. ( ~;~~:) = ( _: _:) (;) + C) 13. (a) Verify that if x1 (t) and x2(t) are solutions of the
system dx/dt = A(t)x, where A(t) is an n-by-n
[Hint: Try x =at+ b; y =ct+ d.J matrix then cix1 (t) + c2x2(t) is also a solution for
arbitrary constants c1 and c2 .
8. ( ~;~~~) = ( _; ~) (;) + (~I) (b) If dx/dt = Ax, where A is an m-by-n matrix, can
you haven 'Im? Give an example or explain why
[Hint: Use undetermined coefficients as in part (a).) not.
9. Solve by elimination (c) Show that for a system of the form dx/dt = A(t)x+
b(t), the conclusion of part (a) follows only if b(t)
is identically zero.
dx
- = x+ z.
dt In Exercises 14 to 17, classify the system as linear or
dy nonlinear.
dt =x+2y.
14. dx/dt + dz/dt = 1, 15. d 2 x/dt 2 + dy/dt = 0,
dz
- = -z. dx/dt - t(dz/dt) =x y2 + t(d y/dt 2 )
2
=t
dt
16. dx/dt + d z/dt = 1, 17. d x/dt - dy/dt = x 2 ,
2 2
Then satisfy the initi&I condition x(0) = l, y(0) = - 1, dx/dt - t 2 (dz/dt) =O y +t 2 (d 2 y/dt 2 ) = o
z(O) = 2.
In Exercises 18 and 19, use elimination by operator mul-
10. (a) Find a first-order system of dimension 4 equivalent
tiplication to get rid of one of the dependent variables.
to the second-order system
Solve the resulting equation for the remaining variable,
and then determine the general solution (x(t), y(t)).
i - 3x - 2y = 0.
18. dx/dt = X + 2y, 19. dx/dt =- y - t,
i - y+2x = 0.
dy/dt = X + y + t dy/dt = X + t
[Hint: Let i = u, y = v.J In Exercises 20 to 23, reduce the system to the standard
(b) By solving for i, y, u, and ii in the first-order fonn with just one first derivative on the left side of each
system obtaine<i in part (a), write the equivalent equation.
4-dimensional system in the form dx/dt = Ax,
where A is a 4-by-4 matrix. 20. dx/dt.+ dy/dt = t,
(c) By collecting terms properly, write the system in dx/dt - dy/dt =X
part (a) in the form 21. dx/dt +dy/dt = y,
dx/dt +2dy/dt =X
L1 (D)x + L2(D) y = 0 22. 2dx/dt + dy/dt + x +Sy= t ,
L 3(D)x + L4(D)y = 0, dx/dt + dy/dt + 2x + 2y = 0
23. dx/dt +dy/dt = sint,
where each Lk.(D) is a second-order constant- dx/dt - dy/dt = cost
coefficient operator.
In Exercises 24 to 27, use elimination by operator mul-
(d) Use the method of elimination to solve the system
ti plication to get rid of one of the dependent variables.
found in part (c).
Solve the resulting equation for the remaining variable
In Exercises 11 and 12, apply row operations to the tenns and then determine the general solution of the system.
containing first derivatives to reduce the system to the Substitution may be necessary to find relations among
standard fonn dx/dt = A(t)x + b(t), where A(t) is a constants. Then determine the constants so that the initial
square matrix. conditions are satisfied.
SECTION 3 APPLICATIONS
The examples in this section are all of a type that arise frequently in applied math-
ematics. For some we'll be able to give complete solutions, whereas the others are
examples for which we need the numerical methods described in the next section.
Figure 12.8 shows two 50-gallon tanks connected by flow pipes and with inlets and
outlets all having the rates of flow as marked in gallons per minute (g/m). The flow
rates are arranged so that each tank is maintained at its capacity at all times. We
suppose that each tank initially contains salt solution at a concentration in pounds per
gallon that we leave unspecified for the moment, that the left-hand tank is receiving
salt solution at a concentration of 1 pound per gallon, and that the right-hand tank is
receiving pure water. The problem is to find out what happens to the amount of salt,
Section 3 Applications 595
FIGURE 12.8 Fluid exchange. (Labels in the figure: two 50-gal. tanks; inlet to the left tank, 1 g/m at 1 lb/g; inlet to the right tank, 1 g/m pure water; remaining pipes marked 3 g/m, 2 g/m, and 2 g/m.)
in pounds, as time goes on. We assume that each tank is kept thoroughly mixed at
all times, so that the concentration of salt is always the same throughout the whole
tank. In the left-hand tank, with salt content x(t), the rate of change of the amount
of salt is dx/dt. On the other hand, because of the various flow rates, we can break
this rate of change into three parts:
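\[
\frac{dx}{dt} = -4\left(\frac{x}{50}\right) + 3\left(\frac{y}{50}\right) + 1,
\]
where x/50 is the concentration of salt in the left tank and y/50 the concentration in the right tank, both in pounds per gallon. The term -4(x/50) is the rate of outflow of salt, and the other two terms represent the rate of inflow. Similarly,
\[
\frac{dy}{dt} = 2\left(\frac{x}{50}\right) - 3\left(\frac{y}{50}\right).
\]
Thus we have a system of differential equations that we can write as
\[
\begin{aligned}
\frac{dx}{dt} &= -\frac{4}{50}x + \frac{3}{50}y + 1,\\
\frac{dy}{dt} &= \frac{2}{50}x - \frac{3}{50}y.
\end{aligned}
\]
To solve it, we can use the elimination method, first writing the system in the form
\[
\begin{aligned}
\left(D + \frac{4}{50}\right)x - \frac{3}{50}y &= 1,\\
-\frac{2}{50}x + \left(D + \frac{3}{50}\right)y &= 0.
\end{aligned}
\]
We multiply the first equation by $\tfrac{2}{50}$ and operate on the second by $\left(D + \tfrac{4}{50}\right)$. Addition of the two equations then gives
\[
\left(D + \frac{4}{50}\right)\left(D + \frac{3}{50}\right)y - \frac{6}{(50)^2}y = \frac{2}{50},
\]
or
\[
\left(D^2 + \frac{7}{50}D + \frac{6}{(50)^2}\right)y = \frac{2}{50}.
\]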
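The characteristic polynomial factors as
\[
r^2 + \frac{7}{50}r + \frac{6}{(50)^2} = \left(r + \frac{1}{50}\right)\left(r + \frac{6}{50}\right),
\]
so the roots are $-\tfrac{1}{50}$ and $-\tfrac{6}{50}$. With the constant particular solution $y_p = \tfrac{50}{3}$, and with x found from the second equation of the system, we get
\[
\begin{aligned}
x(t) &= c_1e^{-t/50} - \tfrac32c_2e^{-6t/50} + 25,\\
y(t) &= c_1e^{-t/50} + c_2e^{-6t/50} + \tfrac{50}{3}.
\end{aligned}
\]
Hence
\[
\lim_{t\to\infty} x(t) = 25, \qquad \lim_{t\to\infty} y(t) = \tfrac{50}{3}.
\]
In other words, the concentration, in pounds per gallon, in the left tank approaches $\tfrac12$, and in the right tank approaches $\tfrac13$.
The constants $c_1$ and $c_2$ depend on the initial values x(0) and y(0). Thus the equations
\[
\begin{aligned}
x(0) &= c_1 - \tfrac32c_2 + 25,\\
y(0) &= c_1 + c_2 + \tfrac{50}{3}
\end{aligned}
\]
determine $c_1$ and $c_2$ when x(0) and y(0) are known. The values $x(t_1)$ and $y(t_1)$ at a time $t_1$ also determine the constants. We leave these details as an exercise.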
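The limiting salt contents can be confirmed by simulating the tank system directly. This is a minimal sketch, assuming the system $dx/dt = -\tfrac{4}{50}x + \tfrac{3}{50}y + 1$, $dy/dt = \tfrac{2}{50}x - \tfrac{3}{50}y$; the initial amounts, step size, and function names are arbitrary illustrative choices.

```python
def rhs(x, y):
    # x, y: pounds of salt in the left and right tanks
    return -4 * x / 50 + 3 * y / 50 + 1, 2 * x / 50 - 3 * y / 50

def simulate(x0, y0, t_end, h=0.1):
    # Classical Runge-Kutta integration from t = 0 to t = t_end
    x, y = x0, y0
    for _ in range(int(round(t_end / h))):
        k1 = rhs(x, y)
        k2 = rhs(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
        k3 = rhs(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
        k4 = rhs(x + h * k3[0], y + h * k3[1])
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return x, y

# Hypothetical initial salt contents, in pounds
x_end, y_end = simulate(0.0, 10.0, 2000.0)
print(x_end, y_end)             # compare with 25 and 50/3
print(x_end / 50, y_end / 50)   # limiting concentrations in lb/gal
```

Trying other starting amounts should give the same limits, since the initial values enter only through the decaying exponential terms.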
Consider two weights of mass $m_1$ and $m_2$ separated by springs from each other and from fixed walls. Suppose the springs have stiffness constants $k_1$, $k_2$, $k_3$ as shown in Figure 12.9; thus the restoring force toward the motionless equilibrium position for the ith spring is proportional to $k_i$. Let x and y be the displacements from equilibrium of the first and second weights. The force acting on the first weight is equal to $m_1(d^2x/dt^2)$, but we also have
\[
m_1\frac{d^2x}{dt^2} = -k_1x + k_2(y - x).
\]
Similarly,
\[
m_2\frac{d^2y}{dt^2} = -k_2(y - x) - k_3y.
\]
FIGURE 12.9 Mass-spring system.
In deriving both equations, we have neglected frictional forces. We can rewrite the system in the form
\[
\begin{aligned}
\frac{d^2x}{dt^2} &= -\frac{k_1 + k_2}{m_1}x + \frac{k_2}{m_1}y,\\
\frac{d^2y}{dt^2} &= \frac{k_2}{m_2}x - \frac{k_2 + k_3}{m_2}y.
\end{aligned}
\]
\[
\begin{aligned}
x(t) &= (D^2 + 2)y(t)\\
&= c_1\cos t + c_2\sin t - c_3\cos\sqrt3\,t - c_4\sin\sqrt3\,t.
\end{aligned}
\]
The constants $c_1$, $c_2$, $c_3$, and $c_4$ would be determined by initial displacements and velocities, namely x(0), y(0), $\dot x(0)$, $\dot y(0)$.
The oscillation $\cos\sqrt3\,t$ in the previous example is called a normal mode of the oscillation, and it is determined by its circular frequency $\sqrt3$. The other normal mode, $\cos t$, with circular frequency $\mu = 1$ appearing in the same example arises from different initial conditions. In identifying a normal mode, it's customary to focus attention on the circular frequency itself. Thus the typical normal mode looks like $\cos\mu t$ with circular frequency $\mu$. The normal modes are important characteristics of an oscillatory system, and particularly efficient routes to their computation are in Chapter 13 using eigenvalue methods and exponential matrices.
Suppose the third spring is removed altogether from our previous example, so that $k_3 = 0$. We're left with
\[
\frac{d^2x}{dt^2} = -2x + y, \qquad \frac{d^2y}{dt^2} = x - y.
\]
Elimination of x proceeds as in the previous example, but this time the differential equation for y is $(D^4 + 3D^2 + 1)y = 0$, with characteristic equation
\[
r^4 + 3r^2 + 1 = 0.
\]
Regarding the left side as a quadratic function of $r^2$, we find $r^2 = (-3 \pm \sqrt5)/2$. Both values are negative, so the characteristic roots are $\pm i\sqrt{(3 + \sqrt5)/2} \approx \pm1.62i$ and $\pm i\sqrt{(3 - \sqrt5)/2} \approx \pm0.62i$. The normal modes from which solutions are constructed have circular frequencies $\mu_1 = \sqrt{(3 + \sqrt5)/2}$ and $\mu_2 = \sqrt{(3 - \sqrt5)/2}$.
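The circular frequencies follow from a quadratic in $r^2$, which is easy to evaluate numerically. The sketch below assumes a characteristic equation of the form $r^4 + br^2 + c = 0$ whose two roots in $r^2$ are real and negative; the function name `mode_frequencies` is an illustrative choice, and the second call assumes that the equal-spring example with $x = (D^2 + 2)y$ has characteristic equation $r^4 + 4r^2 + 3 = 0$.

```python
import math

def mode_frequencies(b, c):
    # Solve s^2 + b s + c = 0 for s = r^2, then take mu = sqrt(-s).
    # Assumes both values of s are real and negative.
    disc = math.sqrt(b * b - 4 * c)
    s_roots = [(-b + disc) / 2, (-b - disc) / 2]
    return sorted(math.sqrt(-s) for s in s_roots)

print(mode_frequencies(3, 1))   # third spring removed: r^4 + 3r^2 + 1 = 0
print(mode_frequencies(4, 3))   # assumed equal-spring case: r^4 + 4r^2 + 3 = 0
```

For the case with the third spring removed, the two frequencies differ by exactly 1, since $\sqrt{(3 \pm \sqrt5)/2} = (\sqrt5 \pm 1)/2$.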
EXAMPLE 4  A typical autonomous second-order system has the form
\[
\begin{aligned}
\ddot x &= f(x, y, \dot x, \dot y),\\
\ddot y &= g(x, y, \dot x, \dot y),
\end{aligned}
\]
with initial conditions $x(t_0) = x_0$, $y(t_0) = y_0$, $\dot x(t_0) = u_0$, $\dot y(t_0) = v_0$. A particularly important special case is that of Newton's equations of planetary motion,
\[
\ddot x = \frac{-kx}{(x^2 + y^2)^{3/2}}, \qquad \ddot y = \frac{-ky}{(x^2 + y^2)^{3/2}},
\]
in which k is a positive constant. In these equations, x and y stand for the rectangular coordinates of a planet in a planar orbit relative to a fixed sun at the origin.
We'll derive the planetary motion equations in a more general vector form that allows for the motion of both bodies, which could be applied to a double star interaction for example. Let $\mathbf x_1 = \mathbf x_1(t)$ and $\mathbf x_2 = \mathbf x_2(t)$ represent the positions at time t of two bodies in space such that each acts on the other by the inverse square law of gravitational attraction, with no other forces considered. If $m_1$ and $m_2$ are the respective masses of the two bodies, the magnitude of the mutually attractive force is then
\[
F = \frac{Gm_1m_2}{r^2},
\]
where $r = |\mathbf x_1 - \mathbf x_2|$ is the distance between $\mathbf x_1$ and $\mathbf x_2$ (i.e., the length of the vector between them). The gravitational constant G is about $6.673 \times 10^{-11}$ if the relevant units are meters, kilograms, and seconds. The normalized vectors
\[
\mathbf u_2 = \frac{\mathbf x_1 - \mathbf x_2}{|\mathbf x_1 - \mathbf x_2|}, \qquad
\mathbf u_1 = -\frac{\mathbf x_1 - \mathbf x_2}{|\mathbf x_1 - \mathbf x_2|}
\]
have length 1 and point respectively from the second body to the first, and vice versa. Thus the vectors that describe the force acting on each body are the product of magnitude F and a normalized direction unit vector $\mathbf u$; the vector $F\mathbf u_1$ acts on the first body and $F\mathbf u_2$ acts on the second body. Since these forces can also be described by Newton's second law as mass times acceleration, we have
\[
m_1\ddot{\mathbf x}_1 = F\mathbf u_1, \qquad m_2\ddot{\mathbf x}_2 = F\mathbf u_2.
\]
FIGURE 12.10 Equal but opposite forces: $F/m_1 < F/m_2$, $m_1 > m_2$.
Figure 12.10 shows the positions and force vectors. The acceleration vectors, which actually govern the motion, are depicted as if $m_1$ is much larger than $m_2$. Written out in more detail these Newton equations are
\[
\text{3.1} \qquad
\ddot{\mathbf x}_1 = -\frac{Gm_2}{|\mathbf x_1 - \mathbf x_2|^3}(\mathbf x_1 - \mathbf x_2), \qquad
\ddot{\mathbf x}_2 = -\frac{Gm_1}{|\mathbf x_1 - \mathbf x_2|^3}(\mathbf x_2 - \mathbf x_1),
\]
where $m_1$ has been canceled from the first equation and $m_2$ from the second. Subtracting the second equation from the first gives
\[
\ddot{\mathbf x}_1 - \ddot{\mathbf x}_2 = -\frac{G(m_1 + m_2)(\mathbf x_1 - \mathbf x_2)}{|\mathbf x_1 - \mathbf x_2|^3}.
\]
Equations 3.1 form a system of vector equations for the motions of the two bodies relative to some coordinate system. If a moving coordinate system has its origin maintained at the center of mass of one of the bodies, say the second, we can let $\mathbf x = \mathbf x_1 - \mathbf x_2$ and consider only the equation of relative motion for the first body:
\[
\text{3.2} \qquad \ddot{\mathbf x} = -\frac{G(m_1 + m_2)}{|\mathbf x|^3}\,\mathbf x.
\]
There are no simple formulas for the solutions x = x(t), y = y(t) of these
equations. The classical approach to the problem is to derive certain significant prop-
erties of the solutions without actually finding the solutions explicitly. For example,
the trajectories have reasonably simple equations. These properties are usually stated
as Kepler's laws of planetary motion, laws that were discovered empirically from
astronomical observation before the work of Newton. Kepler's laws hold for solutions
to Newton's equations that have closed paths for trajectories.
1. The path described by a solution $(x(t), y(t))$ is an ellipse with one focus at
the sun.
2. The radius from the sun to the planet sweeps out equal areas in equal periods
of time.
3. If $T$ is the time required to complete one orbit and $a$ is half the major axis
of the orbit, then
\[ T^2 = \frac{4\pi^2}{G(m_1 + m_2)}\,a^3. \]
The derivation of these beautiful laws from Newton's differential equations is
given in many physics texts and in some calculus texts. (An outline of the derivation
is given in a series of exercises at the end of this section.)
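To make the third law concrete, the following sketch solves $T^2 = 4\pi^2 a^3/(G(m_1+m_2))$ for $T$ and applies it to the earth's orbit. The function name `orbital_period` and the rough values for the sun's mass and the earth's semi-major axis are assumptions supplied for illustration; they are not taken from the text.

```python
import math

def orbital_period(a, m1, m2, G=6.673e-11):
    """Period T from Kepler's third law, T^2 = 4 pi^2 a^3 / (G (m1 + m2))."""
    return 2.0 * math.pi * math.sqrt(a**3 / (G * (m1 + m2)))

# Rough physical values (assumptions, not from the text).
m_sun = 1.989e30     # kilograms
m_earth = 5.976e24   # kilograms
a_earth = 1.496e11   # meters, half the major axis of the earth's orbit

T_days = orbital_period(a_earth, m_sun, m_earth) / 86400.0
print(T_days)  # should come out close to the length of a year in days
```

Because $T$ grows like $a^{3/2}$, quadrupling $a$ multiplies the period by 8.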
Although the results just described tell us a great deal about planetary motion, if
what we want to know is the position or velocity of a planet at a given time, then
we may resort to numerical methods of the kind described in the next section. These
methods apply directly to a first-order system of arbitrary dimension, and to apply
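them to Newton's equations we consider an equivalent system of four first-order
equations. Let $\dot{x} = u$ and $\dot{y} = v$. Because $\ddot{x} = \dot{u}$ and $\ddot{y} = \dot{v}$, the system takes the
first-order form
\[
\begin{aligned}
\dot{x} &= u, \\
\dot{y} &= v, \\
\dot{u} &= -\frac{G(m_1 + m_2)\,x}{(x^2 + y^2)^{3/2}}, \\
\dot{v} &= -\frac{G(m_1 + m_2)\,y}{(x^2 + y^2)^{3/2}}.
\end{aligned}
\]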
The prescription for initial position $(x(t_0), y(t_0))$ and velocity $(\dot{x}(t_0), \dot{y}(t_0))$ takes
the form
\[ x(t_0) = x_0, \quad u(t_0) = u_0, \qquad y(t_0) = y_0, \quad v(t_0) = v_0. \]
Thus $(x_0, y_0)$ represents the position of the planet at time $t = t_0$, whereas $u_0$ and $v_0$
are the rates of change of $x$ and $y$ at the same time.
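The first-order form is exactly what a numerical routine needs: a function that maps the current state to its four rates of change. The sketch below is one way to write it; the name `newton_rhs` and the use of a single parameter `k` for $G(m_1+m_2)$ are assumptions made for illustration.

```python
def newton_rhs(state, k):
    """Right-hand side of the first-order form of Newton's equations.

    state = (x, y, u, v), where u = dx/dt and v = dy/dt;
    k stands for G*(m1 + m2).
    Returns (dx/dt, dy/dt, du/dt, dv/dt).
    """
    x, y, u, v = state
    r3 = (x * x + y * y) ** 1.5        # (x^2 + y^2)^(3/2)
    return (u, v, -k * x / r3, -k * y / r3)

# Example call: position (1, 0), velocity (0, 1), with k = 1.
print(newton_rhs((1.0, 0.0, 0.0, 1.0), 1.0))
```

The acceleration components always point back toward the origin, where the attracting body sits.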
In other words, at distance $r$ from the sun, a speed greater than the escape speed
$v_e = \sqrt{2G(m_1 + m_2)/r}$ implies a hyperbolic trajectory, and a speed less than $v_e$ implies
an elliptic trajectory. For a derivation of the formula for $v_e$ under the assumption that
the less massive body has a negligible effect on the other one and that the motion is
radial, see Example 6 of Chapter 10, Section 2. Exercise 28 in the present section shows
how to eliminate the radial-motion assumption.
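The speed criterion can be sketched as a small classifier. The function names are hypothetical, the earth mass and radius are the rough values that appear in the exercises of this section, and labeling the borderline case "parabolic" is an assumption based on the conic classification in the exercises rather than on the paragraph above.

```python
import math

def escape_speed(r, m1, m2, G=6.673e-11):
    """Escape speed v_e = sqrt(2 G (m1 + m2) / r) at separation r."""
    return math.sqrt(2.0 * G * (m1 + m2) / r)

def trajectory_type(speed, r, m1, m2, G=6.673e-11):
    """Classify a trajectory by comparing the speed with the escape speed."""
    ve = escape_speed(r, m1, m2, G)
    if speed < ve:
        return "elliptic"
    if speed > ve:
        return "hyperbolic"
    return "parabolic"   # borderline case (assumed label)

# Earth mass 5.976e24 kg and radius 6368 km; projectile mass 100 kg (illustrative).
ve_surface = escape_speed(6.368e6, 5.976e24, 100.0)
print(ve_surface)                                         # meters per second
print(trajectory_type(8000.0, 6.368e6, 5.976e24, 100.0))
```

The projectile's own mass enters only through the sum $m_1 + m_2$, so it has a negligible effect when the other body is a planet.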
EXERCISES
1. Suppose that two 100-gallon tanks of salt solution contain amounts of salt $y(t)$ and $z(t)$
at time $t$. Suppose that the solution in the $y$ tank is flowing to the $z$ tank at a rate of
1 gallon per minute, and that the solution in the $z$ tank is flowing to the $y$ tank at the
rate of 4 gallons per minute. Suppose also that the overflow from the $y$ tank goes down
the drain, whereas the $z$ tank is kept full by the addition of fresh water. Assume that
each tank is kept thoroughly mixed at all times.
(a) Find a linear system satisfied by $y$ and $z$.
(b) Find the general solution of the system in part (a) and then determine the constants
in it so that the initial values will be $y(0) = 10$ and $z(0) = 20$.
(c) Draw the graphs of the particular solutions found in part (b) and interpret the results.

2. In Example 1 of the text, the general solution to a system of differential equations is
found to be
x(t) = c1 e^{-(1/50)t} - ~c2 e^{-(6/50)t} + ~,
y(t) = c1 e^{-(1/50)t} + c2 e^{-(6/50)t} + ~.
(a) Find values for the constants c1 and c2 so that the initial conditions x(0) = 25,
y(0) = j are satisfied.
(b) Show that it's possible to choose $c_1$ and $c_2$ so that an arbitrary initial condition
$(x(0), y(0)) = (x_0, y_0)$ is satisfied. Is this a reasonable state of affairs from a
physical standpoint?

3. In Example 2 of the text, the system
\[ (D^2 + 2)x - y = 0, \qquad -x + (D^2 + 2)y = 0 \]
is shown to have the general solution
\[ x(t) = c_1 \cos t + c_2 \sin t - c_3 \cos\sqrt{3}\,t - c_4 \sin\sqrt{3}\,t, \]
\[ y(t) = c_1 \cos t + c_2 \sin t + c_3 \cos\sqrt{3}\,t + c_4 \sin\sqrt{3}\,t, \]
where $x(t)$ and $y(t)$ are interpreted as the displacements at time $t$ of two masses in a
mass-spring physical system.
(a) Show the initial conditions
\[ x(0) = 0, \quad \dot{x}(0) = 1, \qquad y(0) = 1, \quad \dot{y}(0) = 0 \]
are satisfied by choosing the constants properly in the general solution.
(b) Show that general initial conditions of the form $x(0) = x_0$, $y(0) = y_0$, $\dot{x}(0) = u_0$,
$\dot{y}(0) = v_0$ can always be satisfied.

4. Two points start from $x_1 = 0$ and $x_2 = 1$ on a line and move with positions $x_1(t)$ and
$x_2(t)$ at time $t \ge 0$. Suppose that the $x_1$-point always maintains its velocity at exactly
10 units per second greater than that of the $x_2$-point. Suppose also that the sum of the
two velocities is $e^{-t}$ for $t \ge 0$.
(a) Express the relation between the velocities as a first-order system.
(b) Describe the motion of the two points. Are they ever at the same position at the
same time?

5. (a) Show that under initial conditions of the special form
\[ x(0) = x_0 > 0, \quad \dot{x}(0) = u_0, \qquad y(0) = 0, \quad \dot{y}(0) = 0, \]
Newton's equations of planetary motion reduce to
\[ \frac{d^2x}{dt^2} = -\frac{GM}{x^2}, \]
together with the condition that $y(t)$ is identically zero.
(b) Taking the physical situation into account, what can you say about the behavior of a
solution $x(t)$ of the reduced system in part (a) if $u_0 = 0$?

6. The Lotka-Volterra equations
\[ \frac{dH}{dt} = (a - bP)H, \qquad \frac{dP}{dt} = (cH - d)P, \]
with $a, b, c, d > 0$, model the size relationship of parasite $P(t)$ and host $H(t)$
populations at time $t$.
(a) Show that if $P(t) > a/b$, then $H(t)$ decreases, and that if $H(t) < d/c$, then $P(t)$
decreases. Show also that the equilibrium points $(H_e, P_e)$ are $(0, 0)$ and $(d/c, a/b)$.
(b) Show that the parameterized solution curves $(H, P) = (H(t), P(t))$ satisfy
\[ \frac{dH}{dP} = \frac{(a - bP)H}{(cH - d)P} \]
and solve this equation by separation of variables to get
\[ \frac{P^a}{e^{bP}} \cdot \frac{H^d}{e^{cH}} = k, \]
where $k$ is constant.
(c) Show that for $k > 0$ the curves found in part (b) are closed circuits in the $HP$-plane.
Thus the Lotka-Volterra theory models the cyclic variation in the sizes of certain
populations. [Hint: There are at most two positive $x$-values for which $f(x) = x^a/e^{bx}$
has a given value.]

7. Two 100-gallon tanks X and Y contain initially 50 and 100 gallons, respectively, of pure
water. From an external source, salt solution is added to Y at 1 gallon per minute (gpm),
each gallon containing 1 pound of salt. Mixed solution flows from Y to X at 2 gpm and from
X to Y at 1 gpm. Let $x = x(t)$ and $y = y(t)$ be the respective amounts of salt in X and Y
at time $t \ge 0$. Note. You're not asked to solve any differential equations for this question.
(a) At what time $t_1$ will X begin to overflow? Express the total amount of salt in the two
tanks as a function of $t$ while $0 \le t \le t_1$.
(b) Find a system of differential equations satisfied by $x(t)$ and $y(t)$ for $0 \le t \le t_1$.
(c) Find a system of differential equations satisfied by $x(t)$ and $y(t)$ for $t_1 \le t$, while
X is overflowing.

8. Two 100-gallon tanks X and Y are initially full of salt solution, with $x_0$ pounds of salt
in X and $y_0$ pounds of salt in Y. Mixed solution is pumped from X to Y at 2 gallons per
minute and from Y to X at 3 gallons per minute. Pure water evaporates from X at 2 gallons
per minute. Let $x = x(t)$ and $y = y(t)$ be the respective amounts of salt in X and Y at
time $t \ge 0$.
(a) At what time $t_1$ will one of the tanks first overflow or become empty?
(b) Find, but don't solve, a system of differential equations satisfied by $x(t)$ and $y(t)$
for $0 \le t \le t_1$.
(c) Show that $x(t) + y(t)$ remains constant and that the amount of salt in each tank
separately is constant whenever $x_0 = \tfrac{3}{2} y_0$.
(d) Assume that $x(0) = 10$ and $y(0) = 20$. Use the equation $x(t) + y(t) = 30$ to solve the
system you found in part (b).

9. Two tanks, one of capacity 100 gallons, the other of capacity 200 gallons, are each
initially half-full of liquid. The 100-gallon tank starts with nothing but pure water, but
the other tank starts out with 10 pounds of salt dissolved in the water. Solution flows
from the 100-gallon tank to the other tank at 2 gallons per minute. Solution flows in the
opposite direction at 1 gallon per minute. Pure water is added to the 100-gallon tank at
1 gallon per minute. The entire process is stopped if either tank becomes empty or either
tank overflows.
(a) How long does it take for the process to stop?
(b) Write down the system of differential equations and initial conditions whose solution
describes the
process as a function of time. As a check, notice that the total amount of salt present in
the system remains unchanged.
(c) Use the check in part (b) to find a first-order linear initial-value problem for $x(t)$
alone.
(d) Solve the initial-value problem in part (c) for $x(t)$. Then find $y(t)$, and estimate the
amount of salt in each tank when the process stops.

Normal modes. In Exercises 10 to 13, calculate the circular frequencies of the various
constituent oscillations associated with the system of Example 2 of the text under the
following assumptions.

10. $m_1 = m_2 = 1$, $k_2 = 2$, $k_1 = k_3 = 1$
11. $m_1 = m_2 = 1$, $k_1 = k_2 = 1$, $k_3 = 2$
12. $m_1 = 1$, $m_2 = 2$, $k_1 = 1$, $k_2 = k_3 = 4$
13. $m_1 = 1$, $m_2 = 2$, $k_1 = 2$, $k_2 = k_3 = 3$

14. Suppose that the middle spring is removed from the system governed by the equations of
Example 2 of the text.
(a) Show that the system becomes uncoupled.
(b) What are the normal modes?

15. A 2-dimensional mechanical system $m_1\ddot{x} = f(x, y)$, $m_2\ddot{y} = g(x, y)$ is called
conservative if there is a potential function $U(x, y)$ such that
\[ \frac{\partial U(x, y)}{\partial x} = -f(x, y) \quad\text{and}\quad \frac{\partial U(x, y)}{\partial y} = -g(x, y). \]
(a) Show that this two-body system is conservative by computing a potential:
\[ m_1\ddot{x} = -(k_1 + k_2)x + k_2 y, \qquad m_2\ddot{y} = k_2 x - (k_2 + k_3)y. \]
(b) The kinetic energy of the system is $T = \tfrac{1}{2}(m_1\dot{x}^2 + m_2\dot{y}^2)$. Show that for a general
conservative system of the type considered here, the total energy $T + U$ is constant.
[Hint: Multiply the first equation by $\dot{x}$, the second by $\dot{y}$, add the two equations and
integrate.]

*16. In deriving the equations of Example 2 of the text to establish the precise location of
each mass relative to the other, and to the spring supports, we needed to know in advance
the equilibrium positions of the two masses. This is a problem of finding an equilibrium
solution to the appropriate equations of motion, that is, finding a constant solution, for
which all time derivatives are zero. For the equations derived in Example 2, it's routine
to check that the unique equilibrium solution is $x(t) = 0$, $y(t) = 0$. Indeed we chose our
coordinates so that these would be the equilibrium solutions, so we get no new information.
This exercise asks you to derive a form of the equations of motion that predicts
mathematically the relative equilibrium positions of the two masses.
Instead of measuring the locations of the two masses shown in Figure 12.9 from their
equilibrium positions we can measure both displacements from the same point at the
left-hand support. If we know the unstressed (i.e., relaxed) lengths $l_1, l_2, l_3$ of the three
springs, and the distance $b$ between the supports, this approach allows us to determine the
precise location of the equilibrium positions. (This information was assumed known in our
earlier analysis.) Let $z$ and $w$ be the respective distances of masses $m_1$ and $m_2$ from the
left end, as shown in Figure 12.9.
(a) Show that
\[ m_1\,d^2z/dt^2 = -k_1(z - l_1) + k_2((w - z) - l_2), \]
\[ m_2\,d^2w/dt^2 = -k_2((w - z) - l_2) + k_3((b - w) - l_3). \]
(b) Show that the equations are equivalent to
\[ m_1\,d^2z/dt^2 = -(k_1 + k_2)z + k_2 w + k_1 l_1 - k_2 l_2, \]
\[ m_2\,d^2w/dt^2 = k_2 z - (k_2 + k_3)w + k_2 l_2 - k_3 l_3 + k_3 b. \]
(c) The equations derived in part (b) are similar to the ones derived in Example 2 of the
text except for the presence of additional constant terms on the right side. Thus they
constitute a nonhomogeneous system rather than a homogeneous one. Since equilibrium
solutions are constant, the second derivatives $\ddot{z}$ and $\ddot{w}$ are identically zero. Consequently,
to find the equilibrium positions, all we have to do is set the right sides of the
differential equations equal to zero and solve for $z$ and $w$. Find the equilibrium solutions
in terms of the $l$s and $k$s.

17. Let $g$ be the acceleration of gravity at the surface of a homogeneous solid spherical body
of mass $M$ and radius $R$. Use the inverse-square law to show that $g = GM/R^2$, where $G$ is the
gravitational constant in appropriate units of measurement. Assume the mass of the body is
concentrated at its center.

18. (a) Use the equation established in the previous exercise to estimate the gravitational
constant using a measured value of 9.8 meters per second per second for the acceleration of
gravity near the surface of the earth. (Use the values for the mass and radius of the earth
$m = 6 \cdot 10^{24}$ kg, $R = 6368$ km.)
(b) Estimate the acceleration of gravity near the surface of the earth using the value
$6.67 \cdot 10^{-11}$ for the gravitational constant $G$.
19. The outer radius (not the thickness!) of the earth's atmospheric shell is about
$5600 \cdot 10^3$ meters, and the earth's mass is about $5.976 \cdot 10^{24}$ kilograms. With
$G = 6.673 \cdot 10^{-11}$, estimate the escape speed required at the outer limit of the atmosphere
for a projectile of mass 100 kilograms. How is your answer affected if the projectile mass
is instead 1000 kilograms? How about $10^{22}$ kilograms?

20. Suppose at some time that two bodies subject only to their mutual gravitational
attraction are at distance $r_0$ apart and are receding from each other along a fixed line at
a certain fraction $q$ of escape velocity, where $0 < q < 1$. Show that their separation
velocity reaches zero, and the bodies start to "fall" toward each other, when their distance
apart becomes $r_0/(1 - q^2)$.

21. This exercise is a reminder that there would be no such thing as escape velocity for a
body of constant mass if the acceleration of gravity were really constant. For linear motion
away from the attracting body, we would have $\ddot{x} = -g$, for some positive constant $g$. Show
that no matter how large $x_0 = x(0) > 0$ and $v_0 = \dot{x}(0) > 0$ are, $x(t)$ has a finite maximum.

22. (a) Use Kepler's second law, equal areas swept out in equal times, to show that a planet
moving in circular orbit must have constant speed.
(b) Use Kepler's third law, $T^2 = 4\pi^2 a^3/(G(m_1 + m_2))$, together with the result of part (a),
to show that a circular orbit of radius $a$ has constant orbital speed $v = \sqrt{G(m_1 + m_2)/a}$.
[Hint: Express $v$ in terms of the period $T$.]

23. The uniform orbital speed of a satellite of mass $m_1$ at distance $x_0$ from an attracting
body of mass $m_2$ is the speed $v_1$ that the satellite must attain to keep it in a uniform
circular orbit.
(a) Show that the orbit
\[ \mathbf{x} = x_0\,(\cos (v/x_0)t,\ \sin (v/x_0)t) \]
represents circular motion of radius $x_0$ with uniform speed $v$ and acceleration $\ddot{\mathbf{x}}$ toward
the origin of magnitude $v^2/x_0$. This acceleration vector is called centripetal acceleration.
[Hint: Compute $|\mathbf{x}|$, $|\dot{\mathbf{x}}|$ and $|\ddot{\mathbf{x}}|$.]
(b) Show that if gravitational acceleration $G(m_1 + m_2)/x_0^2$ is to provide precisely the
centripetal acceleration of the circular orbit found in part (a), then the uniform orbital
speed will be $v_1 = \sqrt{G(m_1 + m_2)/x_0}$.
(c) How is uniform orbital speed related to escape speed?
(d) How many days would there be in a month if the earth's moon had a uniform circular
orbit of radius equal to 384,404 kilometers, the mean distance of the moon from the earth,
and the earth's motion around the sun is ignored? (Because of the earth's motion around the
sun, the number of days from full moon on earth to full moon is about two days more than
the answer to this exercise.)

24. The synchronous orbit of a body of mass $m$ about a uniformly rotating body of mass
$M > m$ is the one that maintains the orbiting body directly over one point on the rotating
one. Assume the mass of each body is concentrated at its center.
(a) Use the first two Kepler laws to show that a synchronous orbit is necessarily circular,
and that it must lie in the plane of the equator of the rotating body.
(b) Use the third Kepler law to show that if $T$ is the period of rotation of the larger
body, then the radius of the synchronous orbit is $R = K T^{2/3}$, where
$K = \sqrt[3]{G(M + m)/4\pi^2}$.
(c) Show that the synchronous orbit about the earth for a small satellite has radius
approximately 6.22846 times the radius of the earth, or 26,246 miles. (Continuing orbital
correction of communication satellites is required because of uneven mass concentrations on
earth and the influence of other bodies such as the sun and the moon.)

25. The Newton equations for orbits of a single planet of mass $m_2$ relative to a fixed sun of
mass $m_1$ have the form
\[ \ddot{x} = \frac{-kx}{(x^2 + y^2)^{3/2}}, \qquad \ddot{y} = \frac{-ky}{(x^2 + y^2)^{3/2}}, \]
where $k = G(m_1 + m_2)$.
(a) Find the relationship that must hold between the positive constants $a$ and $\omega$ so that
these differential equations will have solutions with circular orbits described by
$x(t) = a\cos\omega t$, $y(t) = a\sin\omega t$.
(b) Show that the relationship described in part (a) expresses the third Kepler law.
(c) Show that the orbit
\[ \mathbf{x} = (a\cos\omega t,\ a\sin\omega t), \qquad \omega = \text{const.} > 0 \]
obeys the second Kepler law.

26. A vector system $\ddot{\mathbf{x}} = -\mathbf{F}(\mathbf{x})$ is called conservative if there is a real-valued potential
energy function $U(\mathbf{x})$ such that $\mathbf{F}(\mathbf{x}) = \nabla U(\mathbf{x})$. For a 1-dimensional vector field the
relation is just $F(x) = U'(x)$; a potential function is determined only up to an additive
constant.
(a) Verify that the Newtonian vector field
\[ \mathbf{F}(x, y) = \left( \frac{kx}{(x^2 + y^2)^{3/2}},\ \frac{ky}{(x^2 + y^2)^{3/2}} \right) \]
has $U(x, y) = -k(x^2 + y^2)^{-1/2}$ as potential.
(b) The kinetic energy of a body of mass 1 following a path $(x, y) = (x(t), y(t))$ is
$T = \tfrac{1}{2}(\dot{x}^2 + \dot{y}^2)$, and the total energy of motion in a conservative field is
\[ E = T + U = \tfrac{1}{2}(\dot{x}^2 + \dot{y}^2) - \frac{k}{\sqrt{x^2 + y^2}} \]
for the Newtonian field. Verify that the total energy $E$ is constant for the motion in the
vector field of part (a). [Hint: Show that $dE/dt = 0$, and use the Newton equations of
motion.]
(c) Verify that for motion governed by the equation $\ddot{\mathbf{x}} = -\mathbf{F}(\mathbf{x})$ the total energy $E$ is
constant if the vector field is conservative: $\mathbf{F}(\mathbf{x}) = \nabla U(\mathbf{x})$.

The results of the next four exercises establish the validity of Kepler's laws.

27. We've seen that the orbit of one body relative to a second always lies in a fixed plane
containing both bodies. This is often shown as follows.
(a) Show that if a body of mass $m$ has a path of motion that obeys the inverse-square law
$m\ddot{\mathbf{x}} = -(k/|\mathbf{x}|^3)\mathbf{x}$, then the motion is confined to a plane through the center of attraction
determined by the initial position and the velocity vectors. [Hint: Establish the relation
$\frac{d}{dt}(\mathbf{x} \times \dot{\mathbf{x}}) = \mathbf{x} \times \ddot{\mathbf{x}}$ to show that the plane containing $\mathbf{x}$ and $\dot{\mathbf{x}}$ is perpendicular to a
fixed vector.]
(b) A central force law is one such that motion is governed by an equation of the form
$\ddot{\mathbf{x}} = G(\mathbf{x})\mathbf{x}$, where $G(\mathbf{x})$ is a real-valued function. Show that motion subject to a central
force law is confined to a plane.

28. The angular momentum of a planet at position $\mathbf{x}$ in its plane orbit about the sun is the
vector $\mathbf{L} = \mathbf{x} \times m\dot{\mathbf{x}}$, that is, $\mathbf{L}$ is the cross-product of the position vector $\mathbf{x}$ with the
linear momentum vector $m\dot{\mathbf{x}}$.
(a) Introduce rectangular coordinates $x, y$ in the plane of motion so that $\mathbf{x} = (x, y, 0)$ to
show that the length $L = |\mathbf{L}|$ of angular momentum equals $L = m|x\dot{y} - y\dot{x}|$.
(b) Show that in terms of polar coordinates $x = r\cos\theta$, $y = r\sin\theta$, the angular momentum
is $mr^2\dot{\theta}$, if $\dot{\theta} > 0$.
(c) Kepler's second law of planetary motion decrees that the radius joining a planet to the
sun sweeps out equal areas in equal times. Use the Kepler law together with the formula
\[ A = \frac{1}{2}\int_{\theta_1}^{\theta_2} r^2\,d\theta \]
for area in polar coordinates to show that the angular momentum $mr^2\dot{\theta}$ is constant on an
orbit. [Hint: Express area swept out along an orbit as an integral with respect to time $t$
between $t$ and $t + \tau$.]

29. Kepler's second law (radius vector from sun to planet sweeps out equal areas in equal
times) holds for all central force laws, that is, force laws expressible in the form
$\ddot{\mathbf{x}} = G(\mathbf{x})\mathbf{x}$, where $G(\mathbf{x})$ is some real-valued function. This includes as special cases the
inverse-square law of attraction, where $G(\mathbf{x}) = -k|\mathbf{x}|^{-3}$, $k > 0$, and the Coulomb repulsion
law, where $G(\mathbf{x}) = k|\mathbf{x}|^{-3}$, $k > 0$; the latter governs interaction of particles bearing
electric charges of the same sign.
(a) Assuming planar motion and using rectangular coordinates $(x, y)$ for $\mathbf{x}$, show that a
central force law has the form
\[ \ddot{x} = G(x, y)x, \qquad \ddot{y} = G(x, y)y, \]
and conclude that $x\ddot{y} - y\ddot{x} = 0$ for a motion governed by a central force law.
(b) Use the conclusion of part (a) to show that $x\dot{y} - y\dot{x} = h$ for some constant $h$.
(c) Change to polar coordinates by $x = r\cos\theta$, $y = r\sin\theta$ to show that $x\dot{y} - y\dot{x} = r^2\dot{\theta}$
and hence, using part (b), show that $r^2\dot{\theta} = h$ for some constant $h$. The result says that for
motion in a central force field, the angular velocity $\dot{\theta}$ is inversely proportional to the
square of the distance from the center of the field.
(d) Use the result of part (c) and a computation of area in polar coordinates to prove
Kepler's second law for a central force field by showing that, as a function of time $t$,
area swept out has the form $A = \tfrac{1}{2}ht + c$. Explain why this proves Kepler's second law
under the given assumptions.
*(e) Apply Green's theorem to the equation $x\dot{y} - y\dot{x} = h$ derived in part (b) to show
directly, without using polar coordinates, that Kepler's second law holds.

*30. A single planet with position $\mathbf{x} = \mathbf{x}(t)$ obeying $\ddot{\mathbf{x}} = -(k/|\mathbf{x}|^3)\mathbf{x}$ follows an elliptic,
parabolic, or hyperbolic path. Here is an outline of a way to show this by deriving a linear
differential equation from the vector equation.
(a) Use $\mathbf{x} = (r\cos\theta, r\sin\theta)$ to express the vector equation of motion in the two polar
coordinate equations $\ddot{r} - r\dot{\theta}^2 = -k/r^2$, $r\ddot{\theta} + 2\dot{r}\dot{\theta} = 0$.
(b) Show that the second equation derived in part (a) implies $r^2\dot{\theta} = h$ for some constant
$h$, and use this to write the other equation in the form $\ddot{r} = h^2 r^{-3} - kr^{-2}$. In particular,
show that if $h = 0$ the motion is confined to a line and results either in collision or
escape.
(c) Use the results of part (b) to show that if $h \ne 0$,
\[ \frac{1}{r^2}\frac{d^2r}{d\theta^2} - \frac{2}{r^3}\left(\frac{dr}{d\theta}\right)^2 = \frac{1}{r} - \frac{k}{h^2}. \]
[Hint: Use the chain rule to express $\dot{r}$ and $\ddot{r}$ in terms of derivatives with respect to $\theta$.]
(d) Make the change of variable $r = 1/u$ to show that the equation in part (c) becomes the
second-order linear equation $d^2u/d\theta^2 + u = k/h^2$.
(e) Show that the solution $u = 1/r = A\cos(\theta + \alpha) + k/h^2$ to the previous equation
represents an ellipse, parabola or hyperbola in polar coordinates according as
$|A| < k/h^2$, $|A| = k/h^2$, or $|A| > k/h^2$. [Hint: Let $x = r\cos(\theta + \alpha)$, $y = r\sin(\theta + \alpha)$, a
rotation by $\alpha$ of the original $xy$-axes.]
(f) Each focus of an ellipse lies on the major axis at distance $c$ from the center where
$c^2 = a^2 - b^2$ and $a$ and $b$ are the semi-axes. The eccentricity is $e = c/a$. Show that for an
elliptic orbit, the center of attraction is at one focus and the eccentricity is $|A|h^2/k$.
Then show that the polar equation for an orbit is
\[ r = \frac{h^2/k}{1 + e\cos\theta}. \]
[Hint: For the first part, convert the polar equation, with $\alpha = 0$, to rectangular
coordinates.]
(g) Assume that the orbit in part (f) is elliptic, with $0 \le e < 1$. Show that the time for
one complete revolution is $T = 2\pi ab/h$. Then show that $h^2/k = b^2/a$ to derive the third
Kepler law $T^2 = 4\pi^2 a^3/k$. [Hint: The sum of the maximum and minimum values for $r$ is
equal to $2a$.]

*31. We established the formula for escape speed in the text under the assumption that the
relative distance separating two bodies, subject only to the forces of mutual gravitational
attraction, would always be measured radially along the same fixed line. The purpose of this
problem is to show that the fixed-line assumption isn't necessary, and that the relative
speed $v$ and distance $r$ are always related by
\[ \frac{v^2}{2} - \frac{v_0^2}{2} = \frac{k}{r} - \frac{k}{r_0}. \]
Here the constant is $k = Gm$, where $m$ is the sum of the two masses, $G$ is the gravitational
constant, and $v_0$ and $r_0$ are initial speed and distance. Starting with the Newton vector
equation $\ddot{\mathbf{x}} = -k\mathbf{x}/|\mathbf{x}|^3$, we form the dot product of both sides by $\dot{\mathbf{x}}$ to get the scalar
equation
\[ \ddot{\mathbf{x}} \cdot \dot{\mathbf{x}} = -k(\mathbf{x} \cdot \dot{\mathbf{x}})/|\mathbf{x}|^3. \]
(a) Show that the left side of the previous equation is equal to $\frac{d}{dt}\frac{v^2}{2}$, where
$v = \sqrt{\dot{\mathbf{x}} \cdot \dot{\mathbf{x}}} = |\dot{\mathbf{x}}|$.
(b) Show that the right side of that same equation is equal to $k(\nabla|\mathbf{x}|^{-1}) \cdot \dot{\mathbf{x}}$, where $\nabla$
is the gradient operator: $\nabla f = (\partial f/\partial x, \partial f/\partial y, \partial f/\partial z)$.
(c) Show that the result of part (b) is also $k(d/dt)|\mathbf{x}|^{-1}$, and conclude that
\[ \frac{d}{dt}\frac{v^2}{2} = k\,\frac{d}{dt}\frac{1}{r}. \]
(d) Integrate the previous equation between 0 and an arbitrary positive time $t$ to get the
equation relating $v$, $v_0$, $r$ and $r_0$.

32. Suppose a projectile is fired directly away from and at distance $x_0$ from the center of
mass of a planet with initial speed $z_0$. If $z_0$ is less than the escape speed, show that the
maximum additional distance attained from the center of mass of the planet is
\[ \frac{x_0^2 z_0^2}{2GM - x_0 z_0^2}, \]
where $M$ is the sum of the masses of the two bodies.

33. A 2-dimensional Hamiltonian system is a pair of differential equations of the form
\[ dx/dt = H_y(x, y, t), \qquad dy/dt = -H_x(x, y, t). \]
The function $H$ that determines the system is called its Hamiltonian. Suppose that
$(x(t), y(t))$ satisfies the system, and consider two functions of $t$:
(a) $\frac{d}{dt}[H(x(t), y(t), t)]$,
(b) $H_t(x(t), y(t), t)$,
where the partial derivative of the Hamiltonian in (b) is computed before substituting
$x(t)$ and $y(t)$ for $x$ and $y$. Show that these two functions of $t$ are equal.
\[ \frac{d\mathbf{x}}{dt} = \mathbf{F}(t, \mathbf{x}), \qquad \mathbf{x}(t_0) = \mathbf{x}_0. \]
Section 4A Numerical Methods 607
If the system is linear and has an explicit solution formula, that formula may well be
preferred to a numerical approximation, because the approximation may be unable
to give a convincing description of a solution's long-term behavior. However, even
for a solvable system the numerical approach may be the quickest way to get some
short-term qualitative information about solution trajectories.
4A Euler's Method
We choose a step of size $h$ to find successive approximations $\mathbf{x}_k$ to the true values
$\mathbf{x}(t_0 + kh)$ of the solution $\mathbf{x}(t)$. The idea is to use the derivative approximation
\[ \frac{\mathbf{x}(t + h) - \mathbf{x}(t)}{h} \approx \mathbf{F}(t, \mathbf{x}), \]
in the form
\[ \mathbf{x}(t + h) \approx \mathbf{x}(t) + h\mathbf{F}(t, \mathbf{x}). \]
Thus having found $\mathbf{x}_k$ corresponding to $t_k = t_0 + kh$, we define the approximation
$\mathbf{x}_{k+1}$ at $t_{k+1}$ by
\[ \mathbf{x}_{k+1} = \mathbf{x}_k + h\mathbf{F}(t_k, \mathbf{x}_k). \]
Starting at a point $\mathbf{x}_0$, this equation generates a sequence $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_m$ of arrow
tips $\mathbf{x}_{k+1}$ designed to lie close to a trajectory containing $\mathbf{x}_k$. Figure 12.11(a) shows
an example of an autonomous vector field $\mathbf{F}(\mathbf{x})$ along with some arrows tangent at
points $\mathbf{x}_k$ on a solution trajectory, the latter shown as a dotted curve. If we scale these
tangent arrows down in length by a small enough factor $h > 0$, we can expect the tip
of each arrow to land at points $\mathbf{x}_k + h\mathbf{F}(t_k, \mathbf{x}_k)$ that are good approximations to points
on the trajectory. Having accepted one of these approximations as $\mathbf{x}_{k+1}$, we may then
go on similarly to the next approximation by starting at $\mathbf{x}_{k+1}$. Figure 12.11(b) shows
how using a small scale factor $h$ can improve the approximation.
For a 2-dimensional system,
\[ \dot{x} = F(t, x, y), \quad x(t_0) = x_0, \]
\[ \dot{y} = G(t, x, y), \quad y(t_0) = y_0, \]
the 0th step starts with $x_0$ and $y_0$. Then
\[ x_1 = x_0 + hF(t_0, x_0, y_0), \]
\[ y_1 = y_0 + hG(t_0, x_0, y_0). \]

FIGURE 12.11
(a) Vector field, trajectory, and tangents
(b) Effect of scaling on a tangent
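The Euler recursion $\mathbf{x}_{k+1} = \mathbf{x}_k + h\mathbf{F}(t_k, \mathbf{x}_k)$ can be sketched for a system of any dimension by storing the state as a tuple. The function name `euler` and the test system $\dot{x} = -y$, $\dot{y} = x$, whose exact solution from $(1, 0)$ is $(\cos t, \sin t)$, are choices made here for illustration.

```python
import math

def euler(F, t0, x0, h, steps):
    """Euler's method for dx/dt = F(t, x), x(t0) = x0, with x given as a tuple."""
    t, x = t0, tuple(x0)
    path = [(t, x)]
    for _ in range(steps):
        slope = F(t, x)                                     # F(t_k, x_k)
        x = tuple(xi + h * si for xi, si in zip(x, slope))  # x_{k+1} = x_k + h F(t_k, x_k)
        t = t + h
        path.append((t, x))
    return path

def rotation(t, state):
    """Illustrative field: dx/dt = -y, dy/dt = x."""
    x, y = state
    return (-y, x)

path = euler(rotation, 0.0, (1.0, 0.0), 0.1, 10)
for t, (x, y) in path[::5]:
    print(t, x, y, math.cos(t), math.sin(t))
```

For this particular field each Euler step multiplies the distance from the origin by $\sqrt{1 + h^2}$, so the computed points spiral slowly outward from the true circular trajectory; a smaller $h$ reduces the drift.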
where the letters on the left represent the new values and the letters on the right
represent the values computed in the previous step of the loop.
    S = X
    X = X + H * (T * Y + 1)
    Y = Y + H * S
    T = T + H
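The temporary variable S in the loop matters: it keeps the old value of X available for the Y update. A Python sketch of the same loop can make that explicit with tuple assignment. The function name `euler_loop`, the starting values $x(0) = 1$, $y(0) = -1$, and the step $h = 0.1$ are assumptions for illustration.

```python
import math

def euler_loop(x, y, t, h, steps):
    """Euler steps for dx/dt = t*y + 1, dy/dt = x, mirroring the loop above."""
    for _ in range(steps):
        # The right side of a tuple assignment is evaluated completely before
        # either name is rebound, so the y update uses the old x.
        # This plays the role of S = X in the loop.
        x, y = x + h * (t * y + 1), y + h * x
        t = t + h
    return x, y, t

x1, y1, t1 = euler_loop(1.0, -1.0, 0.0, 0.1, 1)
print(x1, y1, t1)
```

Updating `x` first and then using the new `x` in the `y` line would give a different recursion, not the Euler method as defined.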
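\[ \mathbf{x}_{k+1} = \mathbf{x}_k + \frac{h}{2}\left[\mathbf{F}(t_k, \mathbf{x}_k) + \mathbf{F}(t_k + h, \mathbf{p}_{k+1})\right], \]
where $\mathbf{p}_{k+1}$ is the value that the Euler method would have predicted, namely
\[ \mathbf{p}_{k+1} = \mathbf{x}_k + h\mathbf{F}(t_k, \mathbf{x}_k). \]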
Section 4B Numerical Methods 609
TABLE 12.1
  t      x       y      x' = ty + 1    y' = x
  0      1      -1        1             1
  0.1    1.1    -0.89     1             1.1
  0.2    1.19   -0.77     0.92          1.19
  0.3    1.28   -0.64     0.87          1.28
  0.4    1.35   -0.51     0.85          1.35
  0.5    1.44   -0.36     0.85          1.44
  0.6    1.52   -0.21     0.89          1.52
  0.7    1.61   -0.05     0.97          1.61
  0.8    1.70    0.12     1.08          1.70
  0.9    1.81    0.30     1.24          1.81
  1      1.94    0.49     1.44          1.94
  1.1    2.09    0.70     1.70          2.09
  1.2    2.26    0.93     2.02          2.26
  1.3    2.48    1.18     2.41          2.48
  1.4    2.73    1.45     2.88          2.73
  1.5    3.03    1.75     3.45          3.03
  1.6    3.39    2.09     4.14          3.39
  1.7    3.83    2.47     4.96          3.83
  1.8    4.35    2.91     5.95          4.35
  1.9    4.97    3.41     7.13          4.97
  2      5.72    3.98     8.56          5.72
  2.1    6.62    4.64    10.28          6.62
  2.2    7.69    5.41    12.36          7.69
  2.3    8.98    6.31    14.88          8.98
  2.4   10.53    7.36    17.93         10.53
  2.5   12.40    8.60    21.64         12.40
Figure 12.12 shows a geometric rationale for using this particular modification to
get an improved estimate. The tip of the arrow $\mathbf{x}_{k+1}$ lies at the midpoint between
the tip of the Euler arrow computed at $\mathbf{x}_k$ and time $t_k$ and the tip of the Euler arrow
computed at $\mathbf{p}_{k+1}$ and time $t_k + h$, then translated back to $\mathbf{x}_k$. Thus the improved
estimate uses information not just from the pair $(t_k, \mathbf{x}_k)$ but also estimated future
information from $(t_k + h, \mathbf{p}_{k+1})$.

FIGURE 12.12
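The midpoint description of the improved Euler step can be checked numerically on a scalar equation. In the sketch below the function name `improved_euler_step` and the test equation $\dot{x} = x$, $x(0) = 1$ (exact solution $e^t$) are assumptions chosen for illustration.

```python
import math

def improved_euler_step(F, t, x, h):
    """One improved Euler step for a scalar equation dx/dt = F(t, x)."""
    p = x + h * F(t, x)                            # Euler prediction p_{k+1}
    return x + 0.5 * h * (F(t, x) + F(t + h, p))   # average the two slopes

def F(t, x):
    """Illustrative right-hand side: dx/dt = x."""
    return x

h = 0.1
euler_tip = 1.0 + h * F(0.0, 1.0)         # tip of the Euler arrow drawn at (t_k, x_k)
second_tip = 1.0 + h * F(h, euler_tip)    # Euler arrow from (t_k + h, p_{k+1}), moved back to x_k
midpoint = 0.5 * (euler_tip + second_tip)
x1 = improved_euler_step(F, 0.0, 1.0, h)
print(euler_tip, x1, math.exp(h))
```

The improved value coincides with the midpoint of the two arrow tips and, for this equation, lands closer to $e^{0.1}$ than the plain Euler value does.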
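we start with $(x_0, y_0)$. Then letting $\mathbf{p} = (p, q)$, we compute the Euler approximation
\[ p_1 = x_0 + hF(t_0, x_0, y_0), \]
\[ q_1 = y_0 + hG(t_0, x_0, y_0), \]
followed by
\[ x_1 = x_0 + \frac{h}{2}\left[F(t_0, x_0, y_0) + F(t_0 + h, p_1, q_1)\right], \]
\[ y_1 = y_0 + \frac{h}{2}\left[G(t_0, x_0, y_0) + G(t_0 + h, p_1, q_1)\right]. \]
At the $(k + 1)$th step, we compute $t_k = t_{k-1} + h = t_0 + kh$ and
\[ p_{k+1} = x_k + hF(t_k, x_k, y_k), \]
\[ q_{k+1} = y_k + hG(t_k, x_k, y_k), \]
\[ x_{k+1} = x_k + \frac{h}{2}\left[F(t_k, x_k, y_k) + F(t_k + h, p_{k+1}, q_{k+1})\right], \]
\[ y_{k+1} = y_k + \frac{h}{2}\left[G(t_k, x_k, y_k) + G(t_k + h, p_{k+1}, q_{k+1})\right]. \]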
The recursive formulas for the basic loop have the form
    P = X - H * Y
    Q = Y + H * X
    S = X
    X = X + (H / 2) * (-Y - Q)
    Y = Y + (H / 2) * (S + P)
The results in Table 12.2 use step size $h = 0.01$, but record only every tenth step,
including also the values of $\cos t$ and $\sin t$ for comparison, because $(x(t), y(t)) =
(\cos t, \sin t)$ is the correct elementary formula for the solution.

TABLE 12.2
  T      X          Y         cos T      sin T
  0      1          0          1.0        0.0
  0.1    0.99505    0.09983    0.99500    0.09983
  0.2    0.98011    0.19868    0.98006    0.19867
  0.3    0.95538    0.29553    0.95533    0.29552
  0.4    0.92110    0.38944    0.92105    0.38942
  0.5    0.87762    0.47945    0.87757    0.47943
  0.6    0.82537    0.56467    0.82533    0.56465
  0.7    0.76487    0.64425    0.76483    0.64422
  0.8    0.69673    0.71740    0.69669    0.71736
  0.9    0.62163    0.78337    0.62159    0.78333
  1      0.54031    0.84152    0.54028    0.84148
  1.1    0.45360    0.89126    0.45358    0.89121
  1.2    0.36235    0.93209    0.36233    0.93204
  1.3    0.26749    0.96361    0.26747    0.96356
  1.4    0.16995    0.98550    0.16994    0.98545
  1.5    0.07071    0.99754    0.07071    0.99749
  1.6   -0.02922    0.99962   -0.02922    0.99957
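A comparison of this kind can be sketched in a few lines. The code below applies the improved Euler method to $\dot{x} = -y$, $\dot{y} = x$ with $x(0) = 1$, $y(0) = 0$ and step $h = 0.01$, printing every tenth step next to $\cos t$ and $\sin t$. The function name `improved_euler_circle` is hypothetical, and the printed digits are not claimed to match the table entry for entry.

```python
import math

def improved_euler_circle(h, steps):
    """Improved Euler method for dx/dt = -y, dy/dt = x, x(0) = 1, y(0) = 0."""
    x, y = 1.0, 0.0
    for _ in range(steps):
        p = x - h * y      # Euler prediction for x
        q = y + h * x      # Euler prediction for y
        # Tuple assignment: both updates use the old x and y.
        x, y = x + 0.5 * h * (-y - q), y + 0.5 * h * (x + p)
    return x, y

h = 0.01
for n in range(0, 170, 10):
    t = n * h
    x, y = improved_euler_circle(h, n)
    print(f"{t:4.1f} {x:9.5f} {y:9.5f} {math.cos(t):9.5f} {math.sin(t):9.5f}")
```

Recomputing from $t = 0$ for every printed row is wasteful but keeps the sketch short; a single loop that prints whenever the step count is a multiple of 10 would do the same job.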
Recall from Section 3 that Newton's equations of planetary motion for a planet of
mass $m_1$ orbiting a star of mass $m_2$ are
\[ \ddot{x} = -\frac{G(m_1 + m_2)\,x}{(x^2 + y^2)^{3/2}}, \qquad x(0) = x_0, \quad \dot{x}(0) = u_0, \]
\[ \ddot{y} = -\frac{G(m_1 + m_2)\,y}{(x^2 + y^2)^{3/2}}, \qquad y(0) = y_0, \quad \dot{y}(0) = v_0. \]
These equations are derived in the previous section, where we remarked that the
second-order system is equivalent to the first-order system
\[
\begin{aligned}
\dot{x} &= u, & x(0) &= x_0, \\
\dot{y} &= v, & y(0) &= y_0, \\
\dot{u} &= -\frac{G(m_1 + m_2)\,x}{(x^2 + y^2)^{3/2}}, & u(0) &= u_0, \\
\dot{v} &= -\frac{G(m_1 + m_2)\,y}{(x^2 + y^2)^{3/2}}, & v(0) &= v_0.
\end{aligned}
\]
We often choose units of measurement so that $G(m_1 + m_2) = 4\pi^2$, although for some
purposes we could just as well choose them so that $G(m_1 + m_2) = 1$. In choosing
initial values for position and velocity, recall that we get an elliptic orbit only if the
orbital speed is less than the escape speed, $v_e = \sqrt{2G(m_1 + m_2)/r}$. In other words,
we want
\[ \sqrt{u_0^2 + v_0^2} < \sqrt{\frac{2G(m_1 + m_2)}{\sqrt{x_0^2 + y_0^2}}}. \]
Thus if $G(m_1 + m_2) = 1$ and $(x_0, y_0) = (1, 0)$, we should choose $(u_0, v_0)$ so that
\[ \sqrt{u_0^2 + v_0^2} < \sqrt{2} \]
to get a closed trajectory. Table 12.3 for $(t, x, y)$ came from using the improved
Euler method, having chosen $G(m_1 + m_2) = 1$, $(x_0, y_0) = (1, 0)$, and $(u_0, v_0) =
(0, 1)$. The step size was $h = 0.01$, but the result is printed only for every 50 steps.

TABLE 12.3
  t       x            y
  0       1            0
  0.5     0.877588     0.479435
  1.0     0.540327     0.841496
  1.5     0.070792     0.997561
  2.0    -0.416073     0.909418
  2.5    -0.801107     0.598749
  3.0    -0.990088     0.141517
  3.5    -0.936777    -0.350347
  4.0    -0.654219    -0.756473
  4.5    -0.211552    -0.977463
  5.0     0.282893    -0.959211
  5.5     0.708087    -0.706161
  6.0     0.959937    -0.280329
The initial conditions we have chosen in these examples are satisfied by the
solution $(x(t), y(t)) = (\cos t, \sin t)$, which has a circular orbit for its trajectory.
Hence we can use this solution as a check on the accuracy of our method of numerical
approximation.
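The accuracy check just described can be sketched as follows: the improved Euler method is applied to the four-dimensional first-order system with $G(m_1 + m_2) = 1$, starting from $(x_0, y_0, u_0, v_0) = (1, 0, 0, 1)$ with $h = 0.01$, and every 50th step is compared with $(\cos t, \sin t)$. The function names `rhs` and `improved_euler_orbit` are hypothetical, and the output is not claimed to reproduce the table digit for digit.

```python
import math

def rhs(s, k=1.0):
    """First-order form of Newton's equations; s = (x, y, u, v), k = G*(m1 + m2)."""
    x, y, u, v = s
    r3 = (x * x + y * y) ** 1.5
    return (u, v, -k * x / r3, -k * y / r3)

def improved_euler_orbit(s, h, steps, k=1.0):
    """Advance the state s by the given number of improved Euler steps."""
    for _ in range(steps):
        f0 = rhs(s, k)
        p = tuple(si + h * fi for si, fi in zip(s, f0))     # Euler prediction
        f1 = rhs(p, k)                                      # slope at the prediction
        s = tuple(si + 0.5 * h * (a + b) for si, a, b in zip(s, f0, f1))
    return s

state = (1.0, 0.0, 0.0, 1.0)
h = 0.01
assert math.hypot(state[2], state[3]) < math.sqrt(2.0)      # initial speed below escape speed
for n in range(13):
    t = 50 * n * h
    print(f"{t:4.1f} {state[0]:10.6f} {state[1]:10.6f} {math.cos(t):10.6f} {math.sin(t):10.6f}")
    state = improved_euler_orbit(state, h, 50)
```

The system is autonomous, so the time variable is needed only for labeling the output rows.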
EXERCISES
Software for doing these exercises is widely available; of size h = 0.1, both by computing from the
in particular the Web site http://math.dartmouth.edu/ explicit exponential solution formula and by a direct
~,rewn contains applicable Java applets, along with numerical solution of the system using either the
some graphical demonstration applets for specific appli- Euler method or its improved modification.
cations. 2. Find a table of approximations to the solution x(t), y(t)
of the system
1. The first order autonomous system
x=x+i
x=y y=x 2 +v+t,
y=x with initial condition x(O) = I, y(O) = 2. Use a step of
size h = 0.1 on the interval 0 .::: t =s ½, and make the
is equivalent to the single equation _v = y, via the relation
approximation with
_y = x.
(a) The Euler method
(a) Show that the system has solutions
(b) The improved Euler method.
x(t) = qe + c2e-
1 1
, 3. (a) Show that the second-order equation
y(t) = cie
1
cze-
-
1
•
.Y-2.v + y = r
(b) Find the panicular solution satisfying ,\"(0) = with initial conditions y(O) = I, j,(0) 2, IS
l , y(O) = 2. equivalent to the first order system
(c) Compute a table of numerical approximations to
the particular solution found in part (b ). Do this .i =
2x - y + t, x(O) = 2,
computation on the interval 0 ::: t ::: ! in steps y=x, y(O)=I.
Section 4B Numerical Methods 613
(b) Find a numerical approximation to the solution of MIXING
the system in part (a) for the interval 0 :::: t ::::: J.
(c) Solve the given second-order equation by using its 12. Tanks I at capacity I 00 gallons and 2 at 200 gallons are
characteristic equation, and compare the solution initially full of salt solution. Tank 1 has 5 gallons per
with the numerical results of part (b). minute of salt solution at I pound per gallon running in
while mixed solution is drawn off, also at 5 gallons per
4- (a) Find a first-order system equivalent to
minute, with an additional 3 gallons per minute flowing
y"-ty'-y=t, y(O)=l, y'(0)=2. out to tank 2. Tank 2 has 2 gallons per minute of pure
water running in and 3 gallons per minute being drawn
(b) Find a numerical approximation to the solution of off, while 2 gallons per minute more flow to tank I.
the system in part (a) for the interval O :::: t ::::: I. (a) Find a system of differential equations satisfied by
the salt contents of the tanks up to the time when
Apply the improved Euler method to the I -dimensional one is empty.
systems in Exercises 5 to 8. (b) Make a computer plot that compares the graphs of
the components of the solution to part (a), assuming
5. dx = 1-x 113 ,x(O) = ½ tank I has initially IO pounds of salt and tank 2 has
dt
20 pounds. Estimate the maximum amount of salt in
6. dy = t 2 + y2, y(O) = 1 each tank and the time when these are attained.
dt
dx
7. -
dt
= vl~
+r,x(O) = 0
13. Tanks I at capacity 100 gallons and 2 at 200 gallons
are initially half-full of salt solution. Tank I has 5 gal-
dy . lons per minute of pure water running in while mixed
8. - =smy, y(O)
dt
=I solution is drawn off at 4 gallons per minute. with
an additional 3 gallons per minute pumped to tank 2.
9. Make a computer plot of the solution to the bug-pursuit
Tank 2 has 2 gallons per minute of salt solution at 1
problem of Exercise 34 in Section IC.
pound per gallon running in and I gallon per minute
10. The general Lorenz system is .i =a(y - x), y = being drained off, while 3 gallons per minute pour into
px - y - xz, i. = +
-/3z xy, where /3, p, a are pos- tank I.
itive constants. For certain values of the parameters, in (a) Find a system of differential equations satisfied by
particular f3 =8/3, p =
28, a = 10, solution trajec- the salt contents of the tanks up to the time when
tories exhibit an often-studied type of unpredictable or one is empty.
"chaotic" oscillation. Plot the orbits with these parameter (b) Make a computer plot that compares the graphs
choices and initial value (x, y, z) = (2, 2, 21) . A partic- of the components of the solution to part (a),
ularly good view is obtained by projecting on the plane assuming tank I has initially IO pounds of salt
through the origin perpendicular to the vector (- 2, 3, I). and tank 2 has 20 pounds. Estimate the minimum
Note the effect of small changes in the initial vector on amount of salt in tank 1 and the time when this is
the successive numbers of circuits in each spiral configu- attained.
ration.
11. A basic result of multivariable calculus, proved in PLANETARY ORBITS
Chapter 5, Section I, says that the gradient vector
Vf(x, y) = (f,(x, y), /y(x, y)) is perpendicular to the 14. The system of Newton equations
level curve of f that contains (x, y); consequently, x = -x(x 2 + y2)- 312 , y = -y(x 2 + y2)- 312 with initial
( - /y(x, y), fx(x, y)) is tangent to a level set, which conditions x(O) = 1, y(O) = 0, i(O) = 0, y = vo,
is then a trajectory of the system x = -fy(x, y), has a solution with a closed trajectory if vo < ..fi.
=
y fx(x, y). Assume that f(x, y) x 2 = + ½y2. Use the improved Euler method to make an approximate
(a) Use these ideas to make a computer-graphics plot of computation of the trajectory of a single orbit if
the elliptic level curves of f. (a) vu= 0.35 (b) vo = 0.7 (c) vo = 1.4
(b) Make a computer-graphics plot of some orthogonal 15. When the inverse-square law F = Gm1m2r- 2 is replaced
trajectories, that is, curves perpendicular to the level by F = Gm1m2r-P, where 0 < p, the Newton equations
set of the function /. for the orbit of a single planet about a fixed sun take
(c) Identify the well-known family of orthogonal trajec- the form
tory curves by solving the relevant uncoupled system
analytically.
Assume k = and make a pictorial comparison of the 19. Make phase plots of the hard spring oscillator equation
orbits with initial conditions x(O) = I, y(0) = 0, x(O) = y = -y y 3 + 8y under each of the following assumptions.
0, y(O) = 0.5 for the choices p = 1.9, p = 2 and p = 2.1. (a) y = S = 1
Discuss the differences among the three cases. (b) y=l,8=2
(c) y = 2. 8 = l
OSCILLATORY SYSTEMS 20. Make solution graphs and (y, y) phase plots for the
periodic driven hard spring oscillator equation ji =
16. The equations ii = -(g/l)sin0 + <i, 2 sin0cos0,
¢,
-204>cot0, 0 # kn, k integer, govern the spherical pen-
fo
- y - y 3 + k y + cost and use the results to detect
Jong-term approach to periodic behavior.
dulum, where </J and O < 0 < TC are spherical coordinate
angles where 0 is measured from the downward-pointing Time-dependent linear spring mechanism. The
vertical axis in JR 3 and </J is the longitudinal angle. Use equation ji+k(t)y+h(t)y = sint represents an oscillator
g/ l = I. externally forced by f (t) = sin t, damped by factor k(t)
(a) Assuming g / l = 1, plot the trajectory in 0¢,-space and with stiffness h(t). Use initial conditions y(O) = 0,
for a solution if 0(0) = 0(0) = 1, ¢,(0) = 0, 4>(0) = y(O) = 1 to plot solutions with the following choices
]. for k(t) and h(t).
(b) Assuming g I l = I, make a 3-dimensional perspec-
tive plot of the (x, y, z) path of the bob if the initial 21. k(t) = 0, h(t) = e 110 1
Chapter 12 REVIEW
set off, that is $f(x_0) = f(x_1)$. Show that $\int_{t_0}^{t_1} |\dot{x}(t)|^2\,dt = 0$ and hence that $x(t)$ must be a constant solution.
Let $H(\mathbf{x}) = H(x, y)$ be a continuously differentiable function of two real variables. The vector field $\mathbf{H}(x, y) = (H_y(x, y), -H_x(x, y))$ is called a Hamiltonian field, and the autonomous vector differential equation $\dot{\mathbf{x}} = \mathbf{H}(\mathbf{x})$ is called a Hamiltonian system.
18. Show that the solution trajectories of a Hamiltonian system are level curves of $H(x, y)$. [Hint: See Exercise 15.]
19. Illustrate Exercise 18 using the example $H(x, y) = x^2 - y^2$.
20. Illustrate Exercise 18 using the example $H(x, y) = x^2 + y^2$.
CHAPTER 13
MATRIX METHODS
This chapter is about some special techniques for solving linear systems of differential
equations, principally those with constant coefficients. The methods all depend on
the notion of eigenvalue and eigenvector, introduced in Section 1. The results are
analogous to those for a single linear constant-coefficient equation, and the methods
developed for them are special cases of the eigenvector analysis in this chapter.
Section 4 deals with equilibrium and stability for linear and nonlinear systems.
\[
\frac{dx}{dt} = Ax
\]
in Chapter 12 for a few examples. The examples show that if $A$ is a constant $n$-by-$n$
matrix, then we can expect to find exponential solutions. For example, in the case
$n = 1$, we would have $\dot{x} = ax$, with solutions of the form $x(t) = ce^{at}$. Consequently,
we try solutions
\[
x(t) = e^{\lambda t}u,
\]
where $u$ is a constant vector in $\mathbb{R}^n$. Differentiation of $x(t)$ gives
\[
\frac{dx}{dt} = \lambda e^{\lambda t}u.
\]
Since the matrix $A$ acts linearly, we also have
\[
Ax = e^{\lambda t}Au.
\]
Thus to solve the differential equation, we must have, after division by $e^{\lambda t}$,
\[
Au = \lambda u.
\]
The case $u = 0$ is too trivial to be interesting, so with that possibility ruled out we
define the nonzero vector $u$ to be an eigenvector of the matrix $A$, and the number
$\lambda$ to be the corresponding eigenvalue. Going the other way, if $u$ is an eigenvector
with eigenvalue $\lambda$, then $x(t) = e^{\lambda t}u$ is a solution to the differential equation, and so
is an arbitrary scalar multiple $ce^{\lambda t}u$.
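The two-way statement above can be checked numerically. The sketch below is only an illustration: it uses the matrix with rows (1, 1) and (4, 1) together with the eigenpair $\lambda = 3$, $u = (1, 2)$, and estimates $dx/dt$ by a central difference; the variable names are assumptions made for this example.

```python
import numpy as np

A = np.array([[1.0, 1.0], [4.0, 1.0]])
lam = 3.0
u = np.array([1.0, 2.0])

# Residual of the eigenvector equation A u = lambda u.
eig_residual = np.linalg.norm(A @ u - lam * u)

def x(t):
    """Candidate solution x(t) = e^{lambda t} u."""
    return np.exp(lam * t) * u

# Compare a central-difference estimate of dx/dt with A x(t).
t, h = 0.4, 1e-6
deriv = (x(t + h) - x(t - h)) / (2 * h)
ode_residual = np.linalg.norm(deriv - A @ x(t))

print(eig_residual, ode_residual)
```

Both residuals should be near zero, up to rounding and the finite-difference error.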
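EXAMPLE 1 To solve the system
\[
\begin{pmatrix} dx/dt \\ dy/dt \end{pmatrix}
= \begin{pmatrix} x + y \\ 4x + y \end{pmatrix}
= \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix},
\]
try to find nonzero vectors $\mathbf{u} = (u, v)$ that satisfy the eigenvector equation
\[
\begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix}
= \lambda\begin{pmatrix} u \\ v \end{pmatrix}
\]
for some number $\lambda$. In other words, try to find numbers $\lambda$ such that the equation
\[
\begin{pmatrix} 1-\lambda & 1 \\ 4 & 1-\lambda \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\]
has nonzero solutions. If the 2-by-2 matrix is invertible, then only the solution
$(u, v) = (0, 0)$ exists, so we assume it isn't invertible. Theorem 5.7 of Chapter 2,
Section 5 tells us that a square matrix $A$ is invertible if and only if $\det A \neq 0$. Hence
we require
\[
\det\begin{pmatrix} 1-\lambda & 1 \\ 4 & 1-\lambda \end{pmatrix} = 0.
\]
In other words,
\[
(1-\lambda)^2 - 4 = \lambda^2 - 2\lambda - 3 = (\lambda - 3)(\lambda + 1) = 0.
\]
The only solutions are $\lambda = 3$ and $\lambda = -1$.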
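For $\lambda = 3$ the eigenvector equation becomes
\[
-2u + v = 0, \qquad 4u - 2v = 0.
\]
Note that $u = 1$, $v = 2$ will do, so
\[
x_1(t) = e^{3t}\begin{pmatrix} 1 \\ 2 \end{pmatrix}.
\]
For $\lambda = -1$ the eigenvector equation becomes
\[
\begin{pmatrix} 2 & 1 \\ 4 & 2 \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
\]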
Section 1A Eigenvalues and Eigenvectors 619
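Note that $u = 1$, $v = -2$ will do, so
\[
x_2(t) = e^{-t}\begin{pmatrix} 1 \\ -2 \end{pmatrix}.
\]
With $c_1 > 0$ and $c_2 = 0$, the solution $c_1x_1(t) + c_2x_2(t)$ traces a half line that as
$t$ increases extends away from the origin in the direction of the eigenvector $\begin{pmatrix} 1 \\ 2 \end{pmatrix}$.
Similarly with $c_1 = 0$ and $c_2 > 0$, we get a half line traced by $c_2e^{-t}\begin{pmatrix} 1 \\ -2 \end{pmatrix}$ toward
the origin and parallel to the eigenvector $\begin{pmatrix} 1 \\ -2 \end{pmatrix}$. More generally each solution of
the system traces a curve in the $xy$ plane that results from forming a particular linear
combination of points that correspond to the same $t$-value. We'll now pursue this
observation further. To see the geometric significance of the computation in Example
1, observe that neither of the vectors
\[
\mathbf{u} = \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \qquad \mathbf{v} = \begin{pmatrix} 1 \\ -2 \end{pmatrix}
\]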
is a multiple of the other, as shown in Figure 13.1. Using the parallelogram law, a
vector $\mathbf{x}$ in $\mathbb{R}^2$ is expressible using coordinates $z$ and $w$ relative to these vectors as
\[
\mathbf{x} = z\mathbf{u} + w\mathbf{v}.
\]
FIGURE 13.1 Eigenvectors u, v. (Axes x and y; coordinate axes z and w along u and v; the vector x = zu + wv.)
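\[
A\mathbf{x} = A\left[z\begin{pmatrix} 1 \\ 2 \end{pmatrix} + w\begin{pmatrix} 1 \\ -2 \end{pmatrix}\right]
= zA\begin{pmatrix} 1 \\ 2 \end{pmatrix} + wA\begin{pmatrix} 1 \\ -2 \end{pmatrix}.
\]
Since the two vectors are eigenvectors with eigenvalues 3 and $-1$,
\[
A\mathbf{x} = 3z\begin{pmatrix} 1 \\ 2 \end{pmatrix} - w\begin{pmatrix} 1 \\ -2 \end{pmatrix}.
\]
In other words, $A$ has the effect of multiplying the first vector $\begin{pmatrix} 1 \\ 2 \end{pmatrix}$ by 3 and the
second vector $\begin{pmatrix} 1 \\ -2 \end{pmatrix}$ by $-1$, so relative to these vectors $A$ acts like the diagonal matrix
\[
B = \begin{pmatrix} 3 & 0 \\ 0 & -1 \end{pmatrix}.
\]
It follows that relative to the $(z, w)$ coordinates, the vector differential equation is
\[
\frac{d}{dt}\begin{pmatrix} z \\ w \end{pmatrix} = \begin{pmatrix} 3 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} z \\ w \end{pmatrix}.
\]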
1B Eigenvector Matrices
The procedure in the previous example generalizes to arbitrary dimensions. We pro-
ceed as follows to solve the n-dimensional constant-coefficient equation
\[
\frac{dx}{dt} = Ax:
\]
1. Find the eigenvalues of $A$ by finding the roots of the polynomial equation
\[
\det(A - \lambda I) = 0.
\]
2. For each eigenvalue $\lambda_k$, find an eigenvector $u_k$ by solving
\[
(A - \lambda_k I)u = 0.
\]
Theorem 4.3 of Chapter 2 guarantees the existence of an eigenvalue.
3. If the solutions $e^{\lambda_1 t}u_1, \ldots, e^{\lambda_n t}u_n$ are linearly independent, so that none is a
linear combination of the others, write the general solution
\[
x(t) = c_1e^{\lambda_1 t}u_1 + \cdots + c_ne^{\lambda_n t}u_n.
\]
We'll see shortly that if the matrix $U$ with the $u_k$ as columns is invertible then we
do get the most general solution. In particular we prove in Chapter 3, Section 7B
that if the eigenvalues of $A$ are all different, then the corresponding solutions will
always be linearly independent. If the solutions $e^{\lambda_k t}u_k$ are linearly dependent, the
procedure outlined produces solutions but not the most general one. In this case, we
can use the elimination method explained in Chapter 12 or the exponential matrix
method described in the next section.
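The three steps can be sketched in NumPy as follows. This is a minimal illustration, not a method prescribed by the text: the function name `eigen_solution` is hypothetical, `np.linalg.eig` supplies the eigenvalues and eigenvectors, and the coefficients $c_k$ are fixed by an initial vector at $t = 0$. The sketch assumes the eigenvector matrix is invertible.

```python
import numpy as np

def eigen_solution(A, x0, t):
    """Return x(t) = sum_k c_k e^{lambda_k t} u_k with x(0) = x0.

    Assumes the eigenvectors of A are linearly independent.
    """
    lam, U = np.linalg.eig(np.asarray(A, dtype=float))  # steps 1 and 2: columns of U are eigenvectors
    c = np.linalg.solve(U, np.asarray(x0, dtype=float)) # step 3: coefficients from U c = x0
    return (U * np.exp(lam * t)) @ c                    # scale column k by e^{lambda_k t}, then combine

# Real eigenvalues 3 and -1.
x_real = eigen_solution([[1, 1], [4, 1]], [2, 3], 0.5)

# Complex eigenvalues 1 + i and 1 - i: the result is complex with negligible imaginary part.
x_cplx = eigen_solution([[1, -1], [1, 1]], [1, 0], 0.5)

print(x_real, x_cplx.real)
```

For a matrix with complex eigenvalues the same code runs with complex exponentials, and for real initial data the imaginary part of the result is only rounding error.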
If A has some complex numbers for eigenvalues, then the same method still works,
with the complex exponential replacing the real exponential.
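\[
\frac{dx}{dt} = x - y, \qquad \frac{dy}{dt} = x + y
\]
has matrix
\[
A = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}.
\]
The characteristic equation is
\[
\det\begin{pmatrix} 1-\lambda & -1 \\ 1 & 1-\lambda \end{pmatrix} = 0,
\]
with roots $\lambda_1 = 1 + i$ and $\lambda_2 = 1 - i$. The eigenvectors for $\lambda_1$ are the nonzero solutions of
\[
\begin{pmatrix} -i & -1 \\ 1 & -i \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \quad\text{that is, of}\quad
-iu - v = 0, \quad u - iv = 0.
\]
One solution is $u = 1$, $v = -i$. The eigenvectors for $\lambda_2$ are the nonzero solutions of
\[
\begin{pmatrix} i & -1 \\ 1 & i \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \quad\text{that is, of}\quad
iu - v = 0, \quad u + iv = 0.
\]
One solution is $u = 1$, $v = i$.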
Because $c_1 = (d_1 - id_2)/2$ and $c_2 = (d_1 + id_2)/2$, the constants $c_1$ and $c_2$ can always
be chosen so that $d_1$ and $d_2$ have arbitrary preassigned values; in particular, we can
choose them to make $d_1$ and $d_2$ real numbers.
\[
\Lambda_t = \begin{pmatrix} e^{\lambda_1 t} & & 0 \\ & \ddots & \\ 0 & & e^{\lambda_n t} \end{pmatrix}
\]
with corresponding eigenvalues $\lambda_k$ in the same order as the eigenvectors. If $c$ is an
arbitrary constant column vector with entries $c_1, \ldots, c_n$, we can form the vector-valued
function
\[
x(t) = U\Lambda_t c.
\]
The vector $x(t)$ has the form
\[
x(t) = c_1e^{\lambda_1 t}u_1 + \cdots + c_ne^{\lambda_n t}u_n
\]
and so is a solution of $dx/dt = Ax$, though not the most general solution unless the
vectors $u_k$ are linearly independent. But if the $u_k$ are independent, thus making $U$
invertible, we can let $\Lambda_{t_0}c = U^{-1}x_0$ for some $x_0$ in $\mathbb{R}^n$. Hence
\[
c = \Lambda_{-t_0}U^{-1}x_0.
\]
Since $\Lambda_t$ is a diagonal matrix with entries $e^{\lambda_k t}$, $\Lambda_t\Lambda_{-t_0} = \Lambda_{t-t_0}$ and
\[
x(t) = U\Lambda_{t-t_0}U^{-1}x_0.
\]
This formula gives all solutions, because every solution has some value at $t_0$, and
the formula assigns the arbitrary value $x_0$ there. By Theorem 1.1 in Chapter 12,
Section 1D, the solution is uniquely determined by the initial condition, so there is
a 1-to-1 correspondence between solutions and initial conditions.
If the eigenvector matrix $U$ isn't invertible we don't get all possible solutions
this way, but we'll see in the next section that the role of $U\Lambda_tU^{-1}$ is assumed
by a routinely computed exponential matrix that provides all solutions in the form
$x(t) = e^{tA}x_0$.
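\[
A = \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix}.
\]
We have
\[
\Lambda_t = \begin{pmatrix} e^{3t} & 0 \\ 0 & e^{-t} \end{pmatrix}
\]
\[
x(t) = \begin{pmatrix} 1 & 1 \\ 2 & -2 \end{pmatrix}
\begin{pmatrix} e^{3t} & 0 \\ 0 & e^{-t} \end{pmatrix}
\begin{pmatrix} \tfrac12 & \tfrac14 \\ \tfrac12 & -\tfrac14 \end{pmatrix}
\begin{pmatrix} 2 \\ 3 \end{pmatrix}
= \begin{pmatrix} \tfrac74e^{3t} + \tfrac14e^{-t} \\ \tfrac72e^{3t} - \tfrac12e^{-t} \end{pmatrix}
\]
satisfies the differential equation $dx/dt = Ax$ and the initial condition $x(0) = (2, 3)$.
\[
= \begin{pmatrix} \tfrac52e^{3(t-1)} + \tfrac12e^{-(t-1)} \\ 5e^{3(t-1)} - e^{-(t-1)} \end{pmatrix}.
\]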
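The formula $x(t) = U\Lambda_{t-t_0}U^{-1}x_0$ is easy to evaluate directly. The sketch below is an illustration under stated assumptions: the helper name `solution` is hypothetical, and the shifted initial value $(3, 4)$ at $t_0 = 1$ is an arbitrary choice made only to show that the formula returns $x_0$ at $t = t_0$.

```python
import numpy as np

U = np.array([[1.0, 1.0], [2.0, -2.0]])   # eigenvectors (1, 2) and (1, -2) as columns
lam = np.array([3.0, -1.0])               # eigenvalues in the same order

def solution(t, t0, x0):
    """Evaluate x(t) = U Lambda_{t - t0} U^{-1} x0."""
    Lam = np.diag(np.exp(lam * (t - t0)))
    return U @ Lam @ np.linalg.solve(U, np.asarray(x0, dtype=float))

t = 0.5
x_formula = solution(t, 0.0, [2.0, 3.0])
# Closed form for x(0) = (2, 3).
x_closed = np.array([1.75 * np.exp(3 * t) + 0.25 * np.exp(-t),
                     3.5 * np.exp(3 * t) - 0.5 * np.exp(-t)])

# Hypothetical initial condition at t0 = 1; evaluating at t = t0 should give it back.
x_shifted_start = solution(1.0, 1.0, [3.0, 4.0])

print(x_formula, x_closed, x_shifted_start)
```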
EXERCISES
Find the eigenvalues and eigenvectors of the matrices in of the system in the form x(t) = qe.1,. 1' u 1 +c2e.1,. 2'u2. In
Exercises I to 6. the case of complex eigenvalues, convert the solution to
real form. Then use the initial conditions to determine
CJ and c2.
7. ( ~ ) - ( =! ; ) (; ).( ;;g; )- ( ~ )
dx
8. -
dt
= 3x, X (0) = l,
dy
dt -- 2_y, _11 (0) =0
dx
9. - =x +4v, x(0)
dt -
=l
The 2-dimensional systems of differential equations in dy
....:.. =5v, _v(0)= l
Exercises 7 to 10 can all be written dx = Ax in which dt -
dt
the matrix A has constant entries. In each example,
dx )
( :;cit =( ~
find the eigenvalues of A, and for each eigenvalue find
a corresponding eigenvector. Use the eigenvalues and 10.
-1 ) (
2
X )
y '
( X (Q)
y(O)
) =( l )
0
eigenvectors of the system to write the general solution
Section 2A Matrix Exponentials 625
11. (a) Find the general homogeneous solution of the 15. The second-order, constant-coefficient, differential equa-
system tion
d2 y dy
-+a-+by =0
dt 2 dt
\[
e^x = 1 + x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \cdots.
\]
If $A$ has dimensions $n$-by-$n$ and $I$ is the $n$-by-$n$ identity matrix, we consider the
finite sum
\[
I + A + \frac{1}{2!}A^2 + \cdots + \frac{1}{k!}A^k = \sum_{j=0}^{k}\frac{1}{j!}A^j.
\]
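This sum of $n$-by-$n$ matrices is also an $n$-by-$n$ matrix. We define the exponential of
$A$ by
\[
e^A = \lim_{k\to\infty}\sum_{j=0}^{k}\frac{1}{j!}A^j = \sum_{j=0}^{\infty}\frac{1}{j!}A^j,
\]
where the existence of the matrix limit is understood to mean that the limit exists in
each of the $n^2$ entries in the matrix. It's sometimes convenient to use the notation
$\exp A$ for $e^A$. For example, if
\[
A = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}, \quad
A^2 = \begin{pmatrix} 2^2 & 0 \\ 0 & 3^2 \end{pmatrix}, \quad \ldots, \quad
A^j = \begin{pmatrix} 2^j & 0 \\ 0 & 3^j \end{pmatrix},
\]
then
\[
\exp\begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}
= \lim_{k\to\infty}\sum_{j=0}^{k}\frac{1}{j!}\begin{pmatrix} 2^j & 0 \\ 0 & 3^j \end{pmatrix}
= \begin{pmatrix} \displaystyle\sum_{j=0}^{\infty}\frac{2^j}{j!} & 0 \\ 0 & \displaystyle\sum_{j=0}^{\infty}\frac{3^j}{j!} \end{pmatrix}
= \begin{pmatrix} e^2 & 0 \\ 0 & e^3 \end{pmatrix}.
\]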
It's remarkable that the exponential of a square matrix always exists and has many
of the properties of the ordinary real or complex exponential function. In the matrix
exponential's most useful form, $A$ is multiplied by a scalar $t$ that we can pull out of
the powers $(tA)^j$ to give $t^jA^j$. The most important properties of $e^{tA}$ are as follows.
2.1 Theorem. If $A$ is an $n$-by-$n$ real or complex matrix, the matrix series
\[
\sum_{j=0}^{\infty}\frac{t^j}{j!}A^j
\]
converges to an $n$-by-$n$ matrix $e^{tA}$ satisfying
(a) $e^{(t+s)A} = e^{tA}e^{sA} = e^{sA}e^{tA}$ for scalars $t$ and $s$
(b) $e^{tA}$ is invertible, and $e^{-tA}e^{tA} = e^{tA}e^{-tA} = I$
(c) $\dfrac{d}{dt}e^{tA} = Ae^{tA} = e^{tA}A$
Proof. If $A = (a_{ij})$, choose a positive number $b$ such that $|a_{ij}| \le b$ for $i, j =
1, \ldots, n$. Since the entries in $A^2$ are of the form
\[
\sum_{k=1}^{n}a_{ik}a_{kj},
\]
it follows that they are all at most $nb^2$ in absolute value. Proceeding inductively, the
entries in $A^k$ are at most $n^{k-1}b^k$ in absolute value. It follows that each entry in $e^A$
is defined by an absolutely convergent infinite series dominated by the convergent
series
\[
1 + b + \frac{nb^2}{2!} + \cdots + \frac{n^{j-1}b^j}{j!} + \cdots,
\]
as discussed in Chapter 14, Section 3D. Hence all the entries exist and $e^A$ is defined.
These estimates show that if the entries $a_{ij}$ in $A$ are replaced by the entries $ta_{ij}$ in
$tA$, then the convergence is uniform on every bounded interval $c \le t \le d$. (See
Chapter 14, Section 4.)
To prove property (a), we apply the binomial theorem to $(t + s)^j$ to get
\[
e^{(t+s)A} = \sum_{j=0}^{\infty}\frac{(t+s)^j}{j!}A^j
= \sum_{j=0}^{\infty}\sum_{k=0}^{j}\frac{t^k}{k!}A^k\,\frac{s^{j-k}}{(j-k)!}A^{j-k}.
\]
This last sum is just the product of the two absolutely convergent series that represent
$e^{tA}$ and $e^{sA}$ respectively. Since $s$ and $t$ commute in the original series, we also get
the product in the other order.
Property (b) follows from (a) on taking $t = 1$ and $s = -1$. Since $e^{0A} = I$, property
(a) implies $e^Ae^{-A} = e^{-A}e^A = I$.
Formally we can compute the derivative of $e^{tA}$ from the definition by
\[
\frac{d}{dt}e^{tA} = \frac{d}{dt}\sum_{j=0}^{\infty}\frac{t^jA^j}{j!}
= \sum_{j=1}^{\infty}\frac{jt^{j-1}A^j}{j!}
= A\sum_{j=1}^{\infty}\frac{t^{j-1}A^{j-1}}{(j-1)!}
= Ae^{tA}.
\]
Note that the factor $A$ could be taken out on the right just as well as on the left.
This computation using term-by-term differentiation of series is justified because the
differentiated series in each entry is uniformly convergent by the estimates made in
the first part of the proof. See Chapter 14, Section 4.
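Properties (a), (b), and (c) of Theorem 2.1 can be spot-checked numerically. The sketch below is an illustration, not part of the proof: the sample matrix, the values of $t$ and $s$, the helper names, and the use of a truncated series with a central difference for the derivative are all assumptions made for this example.

```python
import numpy as np

def exp_series(M, terms=60):
    """Truncated series for e^M."""
    total = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for j in range(1, terms):
        term = term @ M / j
        total = total + term
    return total

A = np.array([[1.0, 1.0], [4.0, 1.0]])

def E(r):
    """Approximation to e^{rA}."""
    return exp_series(r * A)

t, s, h = 0.3, 0.5, 1e-5
err_a = np.abs(E(t + s) - E(t) @ E(s)).max()                      # property (a)
err_b = np.abs(E(-t) @ E(t) - np.eye(2)).max()                    # property (b)
err_c = np.abs((E(t + h) - E(t - h)) / (2 * h) - A @ E(t)).max()  # property (c), central difference

print(err_a, err_b, err_c)
```

All three errors should be small; the last one also includes the finite-difference error of order $h^2$.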
2B Solving Systems
The simplest justification for introducing $e^{tA}$ is to show that
\[
x(t) = e^{tA}x_0
\]
solves the initial-value problem $dx/dt = Ax$, $x(0) = x_0$, which follows from property (c) of Theorem 2.1. Similarly,
\[
x(t) = e^{(t-t_0)A}x_0
\]
satisfies the initial condition $x(t_0) = x_0$, since $e^{0A} = I$. It's one of the nice features
of $e^{tA}$ that it always exists and that the equation $x(t) = e^{tA}c$ provides all solutions
of $dx/dt = Ax$ as the constant vector $c$ ranges over all constant vectors with the same
dimension as $x$; this will follow from Theorem 3.2 in Section 3A.
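We compute
\[
A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \quad
A^2 = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}, \quad \ldots, \quad
A^k = \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix}, \quad \ldots.
\]
Then
\[
e^{tA} = \sum_{k=0}^{\infty}\frac{t^k}{k!}\begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix}
= \begin{pmatrix} \displaystyle\sum_{k=0}^{\infty}\frac{t^k}{k!} & \displaystyle\sum_{k=1}^{\infty}\frac{t^k}{(k-1)!} \\ 0 & \displaystyle\sum_{k=0}^{\infty}\frac{t^k}{k!} \end{pmatrix}
= \begin{pmatrix} e^t & te^t \\ 0 & e^t \end{pmatrix}.
\]
Hence the solution with initial conditions $x(0) = x_0$, $y(0) = y_0$ is
\[
\begin{pmatrix} x(t) \\ y(t) \end{pmatrix}
= \begin{pmatrix} e^t & te^t \\ 0 & e^t \end{pmatrix}\begin{pmatrix} x_0 \\ y_0 \end{pmatrix}
= \begin{pmatrix} x_0e^t + y_0te^t \\ y_0e^t \end{pmatrix}.
\]
To find the solution with initial conditions $x(t_0) = x_0$, $y(t_0) = y_0$, just replace
$t$ by $(t - t_0)$ everywhere in the vector solution for $t_0 = 0$; this is simpler than
recomputing undetermined coefficients and it shows one of the many advantages of
using the exponential matrix $e^{tA}$.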
2C Relationship to Eigenvectors
The connection between exponential solutions and the eigenvector method of the
previous section is as follows. If the eigenvectors $u_1, \ldots, u_n$ of the square matrix $A$ are linearly
independent, we form the eigenvector matrix $U$ with the $u_k$ as columns, and the diagonal matrix
\[
\Lambda_t = \begin{pmatrix} e^{\lambda_1 t} & & 0 \\ & \ddots & \\ 0 & & e^{\lambda_n t} \end{pmatrix},
\]
where $\lambda_k$ is the eigenvalue of $u_k$. Then we have seen that
\[
x(t) = U\Lambda_tU^{-1}x_0
\]
solves the initial value problem for the equation $dx/dt = Ax$. Since
\[
x(t) = e^{tA}x_0
\]
solves the same problem, we are faced with the question of whether the two solutions
are the same. By Theorem 1.1 in Chapter 12, Section 1D, there is only one solution
satisfying $x(0) = x_0$. (Exercise 18 in Section 3 indicates a direct proof.) Hence
$e^{tA}x_0 = U\Lambda_tU^{-1}x_0$ for all $t$. Since $x_0$ is arbitrary, it follows that
\[
e^{tA} = U\Lambda_tU^{-1}. \tag{2.2}
\]
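EXAMPLE 2 In Example 1 of the previous section, we solved the system
\[
\begin{pmatrix} dx/dt \\ dy/dt \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}
\]
by finding the eigenvalues $\lambda_1 = 3$, $\lambda_2 = -1$ and corresponding eigenvectors $(1, 2)$,
$(1, -2)$ of the 2-by-2 matrix $A$ of the system. Thus
\[
U = \begin{pmatrix} 1 & 1 \\ 2 & -2 \end{pmatrix}, \qquad
U^{-1} = \begin{pmatrix} \tfrac12 & \tfrac14 \\ \tfrac12 & -\tfrac14 \end{pmatrix},
\]
and
\[
e^{tA} = U\Lambda_tU^{-1}
= \begin{pmatrix} \tfrac12e^{3t} + \tfrac12e^{-t} & \tfrac14e^{3t} - \tfrac14e^{-t} \\ e^{3t} - e^{-t} & \tfrac12e^{3t} + \tfrac12e^{-t} \end{pmatrix}.
\]
As a check on the computation, notice that $e^{tA} = I$ when $t = 0$. This example shows
that if the eigenvectors of $A$ are linearly independent, it may well be easier to use
them to compute $e^{tA}$ than to use the matrix power series definition.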
Equation 2.2 is ineffective for computing $e^{tA}$ if the eigenvector matrix $U$ fails
to have an inverse. For example if $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ there is only a single repeated
eigenvalue $\lambda = 1$, and all corresponding eigenvectors have the form $\begin{pmatrix} u \\ 0 \end{pmatrix}$ with
$u \neq 0$. Hence the only possibility for $U$ is a matrix of the form $\begin{pmatrix} u_1 & u_2 \\ 0 & 0 \end{pmatrix}$, which
is not invertible.
EXERCISES
3. A= ( : 1)
1
(i 0)
4. A=
0 -,
.
(ID-(~ )C)
5. A = u:~ ) 6. A = (
13. A-(: -: ) 14. h u~) Define cos t A to be the real part of the series and sin t A
to be the imaginary pan, so that
eitA =costA+isintA.
IS. A-(: :) 16. A-c~ -~ =n Show that the matrices cost A and sin t A satisfy
(a) cos(-tA)
d
dt
=
cost A , sin(-tA) = -sintA
. d .
(b) -costA=-AsmtA, -smtA=Acos tA
dt
4 4 [Hint: Express cost A and sintA in terms of ei 1A.]
17. A= ( O I )
-6 5
18. A= ( -
-6 6
) (c) (costA) 2 + (sintA) 2 =
J, where I is the n-by-n
identity matrix
19. A= 0
(
0 -½
20. (a) Use the method of elimination to find the general Define cost A and sin t A as in Exercise 21, and verify the
solution of the system formulas given in (a), (b), and (c).
23. Show that if A is an n-by-n matrix, then a system of
the form
where the coefficient functions $b_k(t)$ contain the eigenvalues $\lambda_1, \ldots, \lambda_n$ of $A$ explicitly.
The next theorem shows how to compute the coefficients by solving a system
of linear equations. The complete proof of the theorem is complicated, so we'll
just sketch it. There is a complete proof in Chapter 7, Section 2 of Introduction to
Differential Equations, 2nd ed., by Richard Williamson, McGraw-Hill (2001).
2.4 Theorem. The coefficient functions $b_k(t)$ in the matrix Equation 2.3 satisfy
the linear scalar equations
\[
\text{(i)}\quad e^{t\lambda_k} = \sum_{j=0}^{n-1}b_j(t)\lambda_k^j, \qquad k = 1, \ldots, n.
\]
If some $m$ of the eigenvalues $\lambda_k$ are equal, say $\lambda_1 = \cdots = \lambda_m$, then the following
$m - 1$ additional relations hold:
\[
\text{(ii)}\quad \frac{d^k}{d\lambda^k}e^{t\lambda} = \frac{d^k}{d\lambda^k}\sum_{j=0}^{n-1}b_j(t)\lambda^j \quad\text{at } \lambda = \lambda_1, \text{ for } k = 1, \ldots, m-1.
\]
Sketch of Proof. We'll assume Equation 2.3 holds for some choice of the coefficients
$b_j(t)$. Let $v_k$ be an eigenvector of $A$ corresponding to eigenvalue $\lambda_k$: $Av_k =
\lambda_kv_k$, $v_k \neq 0$. Apply the matrix sum on the right side of Equation 2.3 to $v_k$, noting
that $A^jv_k = \lambda_k^jv_k$:
\[
e^{tA}v_k = \sum_{j=0}^{n-1}b_j(t)A^jv_k = \left(\sum_{j=0}^{n-1}b_j(t)\lambda_k^j\right)v_k.
\]
Similarly, apply the matrix $e^{tA}$ to $v_k$ to get another expression for the same thing:
\[
e^{tA}v_k = \lim_{N\to\infty}\sum_{j=0}^{N}\frac{t^j}{j!}A^jv_k = \sum_{j=0}^{\infty}\frac{t^j}{j!}\lambda_k^jv_k = e^{t\lambda_k}v_k.
\]
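Solve for $b_0(t)$ and $b_1(t)$ to get $b_1(t) = -\tfrac14e^{-t} + \tfrac14e^{3t}$, $b_0(t) = \tfrac34e^{-t} + \tfrac14e^{3t}$. Plugging
these coefficient functions into Equation 2.3 gives
\[
e^{tA} = \left(\tfrac34e^{-t} + \tfrac14e^{3t}\right)\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
+ \left(-\tfrac14e^{-t} + \tfrac14e^{3t}\right)\begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix}.
\]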
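When the eigenvalues are distinct, equations (i) of Theorem 2.4 form a linear system with a Vandermonde coefficient matrix, which can be solved numerically. The sketch below is an illustration for that case only; the function name `exp_via_coefficients` is hypothetical, and repeated eigenvalues (which need the extra relations (ii)) are not handled.

```python
import numpy as np

def exp_via_coefficients(A, t):
    """Solve e^{t lambda_k} = sum_j b_j lambda_k^j for b, then form sum_j b_j A^j.

    Assumes the eigenvalues of A are distinct.
    """
    A = np.asarray(A, dtype=complex)
    n = A.shape[0]
    lam = np.linalg.eigvals(A)
    V = np.vander(lam, n, increasing=True)      # row k is 1, lambda_k, ..., lambda_k^(n-1)
    b = np.linalg.solve(V, np.exp(t * lam))     # coefficients b_0(t), ..., b_{n-1}(t)
    E = sum(b[j] * np.linalg.matrix_power(A, j) for j in range(n))
    return b, E

t = 0.7
b, E = exp_via_coefficients([[1, 1], [4, 1]], t)
print(b.real)   # compare with 3/4 e^{-t} + 1/4 e^{3t} and -1/4 e^{-t} + 1/4 e^{3t}
print(E.real)
```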
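EXAMPLE 4 In Example 2 of Section 1 we saw that the matrix $\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$ has eigenvalues
$\lambda_1 = 1 + i$ and $\lambda_2 = 1 - i$. Equations (i) of Theorem 2.4 are then
\[
e^{(1+i)t} = b_0(t) + (1+i)b_1(t), \qquad e^{(1-i)t} = b_0(t) + (1-i)b_1(t).
\]
Subtracting the second equation from the first gives $b_1(t) = e^t\sin t$, and then
\[
b_0(t) = -(1+i)e^t\sin t + e^t(\cos t + i\sin t) = e^t(\cos t - \sin t).
\]
Then
\[
\exp\left(t\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}\right)
= e^t(\cos t - \sin t)\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + e^t\sin t\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}
= \begin{pmatrix} e^t\cos t & -e^t\sin t \\ e^t\sin t & e^t\cos t \end{pmatrix}.
\]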
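We see right away that $b_0(t) = e^{-2t} + 2te^{-2t}$. Equation 2.3 then becomes
\[
e^{tA} = b_0(t)\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + b_1(t)\begin{pmatrix} 0 & 1 \\ -4 & -4 \end{pmatrix}
= (1 + 2t)e^{-2t}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + te^{-2t}\begin{pmatrix} 0 & 1 \\ -4 & -4 \end{pmatrix}
= e^{-2t}\begin{pmatrix} 1 + 2t & t \\ -4t & 1 - 2t \end{pmatrix}.
\]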
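EXAMPLE 6 Let
\[
A = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix}.
\]
The characteristic equation is $(2 - \lambda)^3 = 0$, so there is a
triple root $\lambda_1 = 2$. Equations (i) of Theorem 2.4 reduce to
\[
e^{t\lambda_1} = b_0(t) + b_1(t)\lambda_1 + b_2(t)\lambda_1^2, \quad\text{that is,}\quad e^{2t} = b_0(t) + 2b_1(t) + 4b_2(t).
\]
We get two additional relations among the $b_k(t)$ by differentiating the first equation
above twice with respect to $\lambda_1$ and then setting $\lambda_1 = 2$ after each differentiation:
\[
te^{2t} = b_1(t) + 4b_2(t), \qquad t^2e^{2t} = 2b_2(t).
\]
Solving the last three displayed equations for the $b_k(t)$ gives
\[
b_0(t) = (1 - 2t + 2t^2)e^{2t}, \qquad b_1(t) = (t - 2t^2)e^{2t}, \qquad b_2(t) = \tfrac12t^2e^{2t}.
\]
Equation 2.3 is then
\[
e^{tA} = (1 - 2t + 2t^2)e^{2t}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
+ (t - 2t^2)e^{2t}\begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix}
+ \tfrac12t^2e^{2t}\begin{pmatrix} 4 & 4 & 1 \\ 0 & 4 & 4 \\ 0 & 0 & 4 \end{pmatrix}
= e^{2t}\begin{pmatrix} 1 & t & \tfrac12t^2 \\ 0 & 1 & t \\ 0 & 0 & 1 \end{pmatrix}.
\]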
The multiple-eigenvalue case in the proof we gave for Theorem 2.4 is incomplete.
However the following remarkable algebraic result makes the theorem plausible and
leads to a complete proof. This theorem allows in principle for the possibility of
collapsing the infinite series for e1 A into a finite sum by replacing powers of A
higher than n - 1 by lower powers as in Equation 2.3. The next example works out
the 2-by-2 case.
then the matrix polynomial obtained by substituting $A^k$ for $\lambda^k$ in $P(\lambda)$ satisfies
$P(A) = 0$, with the understanding that $A^0 = I$ replaces $\lambda^0 = 1$ in the substitution.
The theorem, which we won't prove, is often stated briefly as "a square matrix
satisfies its own characteristic equation."
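\[
\det\begin{pmatrix} a - \lambda & b \\ c & d - \lambda \end{pmatrix} = \lambda^2 - (a + d)\lambda + (ad - bc).
\]
The matrix
\[
A = \begin{pmatrix} 2 & 3 & 1 \\ 0 & 2 & 2 \\ 0 & 0 & 2 \end{pmatrix}
\]
has characteristic equation $(2 - \lambda)^3 = 0$, that is, $\lambda^3 - 6\lambda^2 + 12\lambda - 8 = 0$. Hence $A^3 - 6A^2 + 12A - 8I = 0$, and multiplying by $A^{-1}$ gives
\[
8A^{-1} = 12\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
- 6\begin{pmatrix} 2 & 3 & 1 \\ 0 & 2 & 2 \\ 0 & 0 & 2 \end{pmatrix}
+ \begin{pmatrix} 4 & 12 & 10 \\ 0 & 4 & 8 \\ 0 & 0 & 4 \end{pmatrix}
= \begin{pmatrix} 4 & -6 & 4 \\ 0 & 4 & -4 \\ 0 & 0 & 4 \end{pmatrix}.
\]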
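The Cayley-Hamilton computation for a 3-by-3 matrix of this kind can be verified numerically. The sketch below is an illustration only: it assumes the matrix with rows (2, 3, 1), (0, 2, 2), (0, 0, 2), whose characteristic polynomial is $\lambda^3 - 6\lambda^2 + 12\lambda - 8$, and the variable names are chosen for this example.

```python
import numpy as np

A = np.array([[2.0, 3.0, 1.0],
              [0.0, 2.0, 2.0],
              [0.0, 0.0, 2.0]])
I = np.eye(3)
A2 = A @ A
A3 = A2 @ A

# P(A) = A^3 - 6 A^2 + 12 A - 8 I should be the zero matrix.
P_of_A = A3 - 6 * A2 + 12 * A - 8 * I

# Rearranging P(A) = 0 gives 8 A^{-1} = 12 I - 6 A + A^2.
A_inv = (12 * I - 6 * A + A2) / 8

print(P_of_A)
print(A_inv)
```

The product `A @ A_inv` should then reproduce the identity matrix, which confirms the inverse without calling a general-purpose inversion routine.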
EXERCISES
condition by first finding e1 A. (a) Show that if ex i= fl, then A has two linearly
independent eigenvectors.
1. ( I~ =; ); x(0) = ( _; ) (b) Show that if ex = fl, then the only eigenvectors of A
are of the form u = ( ~ ). c # 0.
2. ( -1 _: ); x(l) =( ~ ) Compute in each of the two cases.
,. o1-n n
(c) e'A
4.(~l 01 I) = (2)
12. Theorem 2.4 shows that the coefficients bk(t) in an expan-
~ ; x(O) ~ sion
11 - l
In Exercises 5 and 6, find e1 A for the given matrix A. e1A = L)dt)Ak
k=O
S. A iI i0 -l-1
=( ~I )
are completely determined by the eigenvalues of the n-
hy-n matrix A. For example, the matrices ( ~ ~ ).
6. A-(l -05)
0
0
0.5
-1 (i f ) both have characteristic polynomial with I
2 2.5 0~5 and 2 as roots. Hence the bk(t) are the same for all these
-1 0.5 1.5 matrices regardless of the value of the entry fl .
7. Find the appropriate exponential matrix and use it to (a) Compute bo(t) and b1 (t) for the two matrices above,
solve and use them to find the corresponding exponential
matrices, each depending on the parameter fl .
dx
-=2t
dt
+ Z, (b) Compute the exponential matrix for ( ~ ! ).
dy
- =-x+3y
dt
+ Z, 13. Use Theorem 2.5 to compute A2 and A 3 if A =
dz
- =-x
dt
+ 4z. ( ! ~ ).
14. Verify the Cayley-Hamilton Theorem for the matrix
(Note that A = 3 is a triple eigenvalue.)
8. Solve by whatever method seems simplest:
( -~ -! ).
In Exercise 15 to 23, use the Cayley-Hamilton theorem
to find the inverse matrix if the given matrix is invertible.
dx
-=2x+ z,
~
dt
dy
dt = y + w, 15. (_: : : ) 16. ( - : : )
dz
~ ~ ~~
-dt =2z+w '
dw 17. ( : ) 18. ( ~I ) , t real
-=-y+w.
dt I -3 -7 0 0
C :) J n.c
2 -1 0
19. 0
0
20.
C -1
0
e'
0
,; )-"'"'
e'
2 -1 3 0 0
0 2 0 0 2 0 0
21. 22.
0 0 0 0 3 0
0 0 0 4 0 0 0 4
2E Independent Solutions
The discussion of this section shows that there is an exponential solution formula
$x(t) = e^{tA}c$ for every equation $dx/dt = Ax$ in which $A$ is a constant square matrix.
In the example we had
\[
e^{tA}c = \begin{pmatrix} e^t & te^t \\ 0 & e^t \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}
= c_1\begin{pmatrix} e^t \\ 0 \end{pmatrix} + c_2\begin{pmatrix} te^t \\ e^t \end{pmatrix}.
\]
Since we want different solutions for every different choice of $c_1$, $c_2$, it's important
to avoid the redundancy that would occur in the formula in case one of the two
columns in the matrix is a constant multiple of the other. We'll see that this cannot
happen in general, but to state the general result we need to look more closely at what
is meant by linear independence of vector functions. Let $x_1(t), x_2(t), \ldots, x_m(t)$ be
$n$-dimensional column vectors whose entries are functions on some common interval
$a < t < b$. (It's not ruled out that some or all of the entries may happen to be
constant.) Vector functions $x_k(t)$, $k = 1, \ldots, m$ defined on a $t$-interval are said to
be linearly independent if whenever
\[
c_1x_1(t) + \cdots + c_mx_m(t) = 0
\]
for all $t$, then the constant coefficients $c_k$ are all zero. When we have only two
functions ($m = 2$), asserting linear independence is the same as saying that neither
function is a constant multiple of the other. The reason is that if either $c_1$ or $c_2$ is not
zero we could divide by it and express one vector as a multiple of the other. Similarly,
if we have $m > 2$ vector functions, their linear independence means that none of
them is equal to a sum of scalar multiples, or linear combination, of the others. The
negation of linear independence of a set of vectors is called linear dependence, and
it means simply that at least one of the vectors is a linear combination of the others.
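The columns
\[
\begin{pmatrix} e^t \\ 0 \end{pmatrix}, \qquad \begin{pmatrix} te^t \\ e^t \end{pmatrix}
\]
that form the exponential matrix in Example 1 are linearly independent. For
\[
c_1\begin{pmatrix} e^t \\ 0 \end{pmatrix} + c_2\begin{pmatrix} te^t \\ e^t \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\]
is the same as
\[
c_1e^t + c_2te^t = 0, \qquad c_2e^t = 0.
\]
It follows that $c_2 = 0$. Hence $c_1 = 0$. This conclusion holds for a given value of $t$,
so in particular the constant vectors
\[
\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 \\ 1 \end{pmatrix}
\]
are linearly independent. Just set $t = 0$.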
Consider the vector functions
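Proof. Apply the matrix $e^{-tA}$ to both sides of the vector equation
\[
c_1x_1(t) + \cdots + c_nx_n(t) = 0.
\]
Using the distributivity of matrix multiplication, we get
\[
c_1e^{-tA}x_1(t) + \cdots + c_ne^{-tA}x_n(t) = 0.
\]
But $e^{-tA}$ is the inverse of the matrix whose $k$th column is $x_k(t)$. Hence $e^{-tA}x_k(t)$
is the $k$th column of the identity matrix $I$. Thus our equation becomes
\[
c_1e_1 + \cdots + c_ne_n = 0,
\]
where $e_k$ is the column vector with 1 in the $k$th entry and 0 elsewhere. Adding up
the linear combination gives
\[
\begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}.
\]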
Each of the given matrices in Exercises 1 to 4 is the exponential matrix of some constant matrix $A$. For each matrix, find $A$ by computing the derivative of $e^{tA}$ at $t = 0$. Then express the vector function $x(t) = e^{tA}c$ as a linear combination of the columns of $e^{tA}$ and verify that each column of $e^{tA}$ is a solution of $\dot{x} = Ax$.
Not every square matrix with linearly independent columns is an exponential matrix. For example, an exponential matrix $e^{tA}$ must equal $I$ when $t = 0$ and properties (a) and (b) of Theorem 2.1 must hold. In Exercises 5 to 8 show that the matrix has linearly independent columns, but is not an exponential matrix.
3.,"-0 -: J)
4. lA = ( eo~ r:,' ~ ) 9. Theorem 2.6 is a simple consequence of the following
more general theorem: If A(t) is an invertible square
0 e21
matrix for each t in some interval a < t < b, then the
640 Chapter 13 Matrix Methods
columns of A(t) are linearly independent vector functions 11. Prove that a set (x1 (t), x2(t) , ... , x111 (t)) of vector-valued
on that interval. Show how to prove this theorem using functions of the same dimension n is linearly independent
the ideas in the proof of Theorem 2.6. on an interval a .'.': t .'.': b if and only if no one of them is
10. Let D be the n-by-n diagonal matrix with entries a linear combination of the others.
d 1, ••• , d,, on the main diagonal and zeros elsewhere.
Show that e1 D is the diagonal matrix with entries
edit, ... , ed.,1_
\[
\frac{d}{dt}\sum_{k=1}^{n}a_{ik}(t)b_{kj}(t) = \sum_{k=1}^{n}\frac{da_{ik}}{dt}(t)b_{kj}(t) + \sum_{k=1}^{n}a_{ik}(t)\frac{db_{kj}}{dt}(t).
\]
But this is just the $ij$th entry in the sum of products of matrices on the right, so the
formula is proved. In our first application $B(t)$ will be a column vector $x(t)$.
The proof of the next theorem is formally just an application of the exponential
multiplier method of Chapter 10, Section 3A.
3.2 Theorem. The vector differential equation
\[
\frac{dx}{dt} = Ax + b(t),
\]
where $A$ is a constant matrix and $b(t)$ is a continuous function on some interval has
for its general solution
\[
x(t) = e^{tA}\int e^{-tA}b(t)\,dt + e^{tA}c,
\]
\[
e^{-tA}x = \int e^{-tA}b(t)\,dt + c.
\]
Since $e^{tA}$ is the inverse of $e^{-tA}$, we can multiply through by $e^{tA}$ to get
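In the first example of the previous section, we saw that the system
\[
\begin{pmatrix} dx/dt \\ dy/dt \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}
\]
had associated with it the matrix
\[
e^{tA} = \begin{pmatrix} e^t & te^t \\ 0 & e^t \end{pmatrix}.
\]
Hence to find a particular solution of
\[
\begin{pmatrix} dx/dt \\ dy/dt \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} e^t \\ e^{-t} \end{pmatrix}
\]
we compute
\[
x_p(t) = \begin{pmatrix} e^t & te^t \\ 0 & e^t \end{pmatrix}\int\begin{pmatrix} 1 - te^{-2t} \\ e^{-2t} \end{pmatrix}dt
= \begin{pmatrix} e^t & te^t \\ 0 & e^t \end{pmatrix}\begin{pmatrix} t + \tfrac12te^{-2t} + \tfrac14e^{-2t} \\ -\tfrac12e^{-2t} \end{pmatrix}
= \begin{pmatrix} te^t + \tfrac14e^{-t} \\ -\tfrac12e^{-t} \end{pmatrix}.
\]
Adding the particular solution just found to the general homogeneous solution we
already had gives
\[
x(t) = \begin{pmatrix} c_1e^t + c_2te^t + te^t + \tfrac14e^{-t} \\ c_2e^t - \tfrac12e^{-t} \end{pmatrix}.
\]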
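The general solution just obtained can be checked by substituting it back into the system. The sketch below is an illustration under an assumption: the forcing vector $b(t) = (e^t, e^{-t})$ is inferred from the integrand $(1 - te^{-2t}, e^{-2t})$ in the worked example, and the function names and sample constants are hypothetical.

```python
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0]])

def b(t):
    """Forcing term, inferred from the integrand in the worked example."""
    return np.array([np.exp(t), np.exp(-t)])

def x_general(t, c1=0.0, c2=0.0):
    """General solution: homogeneous part plus the particular solution."""
    return np.array([c1 * np.exp(t) + c2 * t * np.exp(t) + t * np.exp(t) + 0.25 * np.exp(-t),
                     c2 * np.exp(t) - 0.5 * np.exp(-t)])

def residual(t, c1, c2, h=1e-5):
    """Largest entry of |dx/dt - (A x + b)|, with dx/dt from a central difference."""
    deriv = (x_general(t + h, c1, c2) - x_general(t - h, c1, c2)) / (2 * h)
    return np.abs(deriv - (A @ x_general(t, c1, c2) + b(t))).max()

print(residual(0.8, 0.0, 0.0), residual(0.8, 2.0, -3.0))
```

A small residual for several choices of $c_1$ and $c_2$ supports both the particular solution and the homogeneous part.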
3B Variation of Parameters
Even in the case of an n-by-n matrix A(t) with nonconstant continuous entries there
is a formula for a particular solution of
dx
dt = A(t)x + b(t)
642 Chapter 13 Matrix Methods
in terms of solutions of the related homogeneous equation. Suppose x1 (t), ... , Xn (t)
is a set of n linearly independent solutions of
dx
- = A(t)x.
dt
We now form the n-by-n fundamental matrix
is a solution of the nonhomogeneous equation. It turns out that this can always be
done as follows. Using the product rule for differentiation, we substitute $X(t)\mathbf{v}(t)$
into the nonhomogeneous equation to get
\[ \frac{dX(t)}{dt}\mathbf{v}(t) + X(t)\frac{d\mathbf{v}(t)}{dt} = A(t)X(t)\mathbf{v}(t) + \mathbf{b}(t). \]
Since each column of $X(t)$ is a solution of the homogeneous equation, we have
\[ \frac{dX(t)}{dt} = A(t)X(t). \]
Therefore, the first term cancels on each side, leaving
\[ X(t)\frac{d\mathbf{v}(t)}{dt} = \mathbf{b}(t). \]
Since the columns of X (t) are independent as vector functions, it follows (see
Exercise l 6.) that these columns are independent vectors for each fixed t. Hence
the inverse matrix $X^{-1}(t)$ exists, and multiplying by it gives
\[ \frac{d\mathbf{v}(t)}{dt} = X^{-1}(t)\mathbf{b}(t). \]
Section 3C Nonhomogeneous Systems 643
Integration gives the formula for v(t):
Finally,
3.3 \[ \mathbf{x}_p(t) = X(t)\mathbf{v}(t) = X(t)\int X^{-1}(t)\mathbf{b}(t)\,dt. \]
Notice that this formula is the same as the one previously derived in the constant-coefficient
case, with $e^{tA}$ now replaced by the more general $X(t)$. This process for
finding $\mathbf{x}_p$ is sometimes called variation of parameters, because to find it we replace
the constant vector $\mathbf{v}_0$ in the homogeneous vector solution $X(t)\mathbf{v}_0$ by a function that
varies with t.
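Formula 3.3 can also be evaluated numerically when the integral is awkward by hand. The following is a minimal sketch, not a general solver: it assumes the test case $X(t) = \begin{pmatrix} e^{t} & e^{2t} \\ 0 & e^{t} \end{pmatrix}$ with forcing term $\mathbf{b}(t) = (e^{t}, e^{2t})$, uses Simpson's rule for the definite integral from 0 to t, and compares against the closed form. All function names are hypothetical.

```python
import math

def X(t):
    """Assumed fundamental matrix [[e^t, e^{2t}], [0, e^t]]."""
    return [[math.exp(t), math.exp(2 * t)], [0.0, math.exp(t)]]

def X_inv(t):
    """Inverse of X(t); det X(t) = e^{2t}."""
    return [[math.exp(-t), -1.0], [0.0, math.exp(-t)]]

def b(t):
    """Assumed forcing term (e^t, e^{2t})."""
    return [math.exp(t), math.exp(2 * t)]

def matvec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def v_numeric(t, n=200):
    """Simpson's rule for the integral from 0 to t of X^{-1}(s) b(s) ds (n even)."""
    h = t / n
    total = [0.0, 0.0]
    for i in range(n + 1):
        w = 1 if i in (0, n) else (4 if i % 2 else 2)
        f = matvec(X_inv(i * h), b(i * h))
        total = [total[j] + w * f[j] for j in range(2)]
    return [h / 3 * c for c in total]

def xp_numeric(t):
    """Formula 3.3 with the integral taken from 0 to t."""
    return matvec(X(t), v_numeric(t))

def xp_closed(t):
    """Same quantity from the antiderivative (s - e^{2s}/2, e^s) evaluated between 0 and t."""
    v = [t - 0.5 * math.exp(2 * t) + 0.5, math.exp(t) - 1.0]
    return matvec(X(t), v)

print(xp_numeric(1.0), xp_closed(1.0))
```

Choosing the lower limit 0 fixes one particular antiderivative; a different choice changes $\mathbf{x}_p$ only by a solution $X(t)\mathbf{c}$ of the homogeneous equation.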
EXAMPLE
It's routine to verify that the homogeneous system associated with
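\[ X(t) = \begin{pmatrix} e^{t} & e^{2t} \\ 0 & e^{t} \end{pmatrix}, \]
\[ \mathbf{x}_p(t) = \begin{pmatrix} e^{t} & e^{2t} \\ 0 & e^{t} \end{pmatrix} \int \begin{pmatrix} e^{-t} & -1 \\ 0 & e^{-t} \end{pmatrix} \begin{pmatrix} e^{t} \\ e^{2t} \end{pmatrix} dt \]
\[ = \begin{pmatrix} e^{t} & e^{2t} \\ 0 & e^{t} \end{pmatrix} \int \begin{pmatrix} 1 - e^{2t} \\ e^{t} \end{pmatrix} dt \]
\[ = \begin{pmatrix} e^{t} & e^{2t} \\ 0 & e^{t} \end{pmatrix} \begin{pmatrix} t - \tfrac{1}{2}e^{2t} \\ e^{t} \end{pmatrix} = \begin{pmatrix} te^{t} + \tfrac{1}{2}e^{3t} \\ e^{2t} \end{pmatrix}. \]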
The general solution is then
3C Summary of Methods
For linear systems in the standard form dx/ dt = Ax+ b, and hence for systems and
equations reducible to this form, we usually proceed as follows:
1. Find the general solution of the homogeneous equation $d\mathbf{x}/dt = A\mathbf{x}$, either
by elimination, by the eigenvector method, if applicable, or by finding $e^{tA}$
directly. In the constant-coefficient case the homogeneous solution is always
of the form $\mathbf{x}_h(t) = e^{tA}\mathbf{c}$, where $\mathbf{c}$ is a constant vector.
2. Find a particular solution to the nonhomogeneous equation, either as a by-product
of the elimination method, by undetermined coefficients, if applicable,
or by Formula 3.2 or 3.3.
3. Write the general solution as $\mathbf{x}(t) = \mathbf{x}_h(t) + \mathbf{x}_p(t)$.
If A isn't constant there is no general method for finding $\mathbf{x}_h(t)$, and we'll very likely
have to use numerical methods.
EXERCISES ·
In Exercises 1 to 4, use Equation 3.2 to solve the initial-value problem of the form
$d\mathbf{x}/dt = A\mathbf{x} + \mathbf{b}(t)$, $\mathbf{x}(t_0) = \mathbf{x}_0$. The associated homogeneous equations $d\mathbf{x}/dt = A\mathbf{x}$
were found in Exercises 11, 12, and 13 for Section 2A to 2C to have the exponential
matrices $e^{tA}$ needed here in Exercises 2, 3, and 4.
(b) Show that if $X(t)$ is an n-by-n matrix with linearly independent columns, in
particular if $X(t)$ is a fundamental matrix, then in order for
\[ \mathbf{x}(t) = X(t)\mathbf{c}, \qquad \mathbf{c} \text{ constant}, \]
to satisfy $\mathbf{x}(t_0) = \mathbf{x}_0$, we must have $\mathbf{c} = X^{-1}(t_0)\mathbf{x}_0$.
Exercise 19 shows that $X(t_0)$ is invertible.
2
• ( ~ )-( =i nc )+( n, dx
-dt = A(t)x + b(t),
then the general solution of the nonhomogeneous
(;rn ) = ( ~)
system is
x(t) = Xp(t) + X(t)c,
~f)-(;
21
-I2 ) ( xy ) + ( e
2e2r )
;
4. (
( ~~~~ ) =( =1 )
5. (a) Show that for a solution of the form $\mathbf{x}(t) = e^{tA}\mathbf{c}$, where $\mathbf{c}$ is a constant
vector, to satisfy the condition $\mathbf{x}(t_0) = \mathbf{x}_0$, we must have $\mathbf{c} = e^{-t_0 A}\mathbf{x}_0$.
For Exercises 7 to 10, consider the systems in Exercises 1 to 4 respectively, which are
solvable by the method of undetermined coefficients: Form linear combinations of
the terms, and their derivatives, that occur in each entry of the nonhomogeneous part
of the differential equation, taking care to include appropriate multiples by powers
of t for terms that are also homogeneous solutions. Then substitute into the equation
to determine the coefficients of combination. In Exercises 7 to 10, use this method on
the corresponding system in Exercises 1 to 4.
Section 4 Equilibrium and Stability 645
In Exercises 11 and 12, the system has the homogeneous solutions shown. Verify that
these are linearly independent solutions. Find a particular solution of the
nonhomogeneous equation, using Equation 3.3.
has $X(t) = e^{tA}$ for its fundamental matrix of independent column solutions with $X(0) = I$.
In Exercises 16 and 17, $A(t)$ is a square matrix with entries differentiable on $a \le t < b$.
16. (a) Show that if $A(t)$ and $dA(t)/dt$ commute, then
\[ \frac{dA^{2}(t)}{dt} = 2A(t)\frac{dA(t)}{dt}. \]
(b) Generalize part (a) to $\dfrac{dA^{k}(t)}{dt} = kA^{k-1}(t)\dfrac{dA(t)}{dt}$.
time. The reason for using the term equilibrium solution rather than constant solution
is that we're mainly interested in the stability of solutions that result from small
perturbations of an equilibrium solution. For the purposes of graphing, an equilibrium
solution appears as a single point in the space of the dependent variables, so we often
find it natural to refer to an equilibrium solution as an equilibrium point.
Example: $x = 2e^{t}$, $y = e^{3t}$; $8y = x^{3}$.
Example: $x = e^{-t}$, $y = 3e^{t}$; $xy = 3$.
III. Asymptotically Stable Node: $\lambda_1 < \lambda_2 < 0$
Example: $x = 2\cos t$, $y = \sin t$; $x^{2} + 4y^{2} = 4$.
VI. Asymptotically Stable Spiral: $\lambda_1 = -p + iq$, $\lambda_2 = -p - iq$, $p > 0$, $q \ne 0$
Example: $x = e^{-t}\cos t$, $y = e^{-t}\sin t$; $x^{2} + y^{2} = e^{-2\arctan(y/x)}$.
VII. Unstable Star: $\lambda_1 = \lambda_2 > 0$
Example: $x = 2e^{t}$, $y = 3e^{t}$; $3x = 2y$.
Example: $x = e^{t}$, $y = te^{t}$; $y = x\ln x$.
x=x+y+l x+y+l=O
satisfies
y = 4x + y - 1 4x + y- I= 0,
x=x+y
y = 4x + y. or x. = ( 4I
i=y 0 I
y = -x - y, in matrix form X = ( -I -1 ~ ) X,
z=x-z 0 - 1
Section 4A Equilibrium and Stability 651
has a unique equilibrium point at the origin (0, 0, 0). The characteristic equation is
-A 1 0 )
det -1 -1 - A O = 0,
( 1 0 -1-A
Since all roots have negative real part, either $-1$ or $-\tfrac{1}{2}$, the equilibrium is asymptotically
stable, with all solutions tending to the origin as t increases. We could
compute the general solution without much trouble, but we don't need that if we're
only checking stability near equilibrium. Note that showing the existence of just one
eigenvalue with positive real part would have been enough to guarantee instability.
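The eigenvalue check for a small system like this one can be scripted. The sketch below is illustrative only: it takes the coefficient matrix of the three-dimensional system $\dot{x} = y$, $\dot{y} = -x - y$, $\dot{z} = x - z$, evaluates $\det(A - \lambda I)$ directly, and confirms that the candidate roots $-1$ and $(-1 \pm \sqrt{3}\,i)/2$ (assumed here from the factorization $-(1+\lambda)(\lambda^2+\lambda+1)$) are zeros with negative real part. The helper names are hypothetical.

```python
import cmath

# Coefficient matrix of x' = y, y' = -x - y, z' = x - z.
A = [[0, 1, 0],
     [-1, -1, 0],
     [1, 0, -1]]

def det3(m):
    """Determinant of a 3-by-3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def char_poly(lam):
    """det(A - lam I); lam may be real or complex."""
    shifted = [[A[i][j] - (lam if i == j else 0) for j in range(3)] for i in range(3)]
    return det3(shifted)

# Candidate roots of -(1 + lam)(lam^2 + lam + 1).
roots = [-1, (-1 + cmath.sqrt(-3)) / 2, (-1 - cmath.sqrt(-3)) / 2]

for r in roots:
    print(r, abs(char_poly(r)), complex(r).real < 0)
```

Only the signs of the real parts matter for the stability conclusion, so the eigenvectors are never needed here.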
x= (
-1
~
-1
-1
0 -1
~)x
have respective characteristic equations
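\[ \lambda = -1 \pm i,\ \lambda = -1 \qquad\text{and}\qquad \lambda = 1 \pm i,\ \lambda = 1. \]
and
\[ \mathbf{x}(t) = \begin{pmatrix} c_1 e^{t}\cos t \\ c_2 e^{t}\sin t \\ c_3 e^{t} \end{pmatrix}. \]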
The origin is respectively stable and unstable for these solutions as shown in Figure
13.3. Note that the trajectories appear to be very similar, but the ones in (b) nec-
essarily start at a positive distance from the origin, while in theory the ones in (a)
approach arbitrarily close to the origin.
4.1 Linear Stability. Let A be a real n-by-n constant matrix. The equilibrium
solution $\mathbf{x}_0 = \mathbf{0}$ for the homogeneous system $\dot{\mathbf{x}} = A\mathbf{x}$ is
(a) asymptotically stable if every eigenvalue of A has negative real part, and is
(b) unstable if A has at least one eigenvalue with positive real part.
FIGURE 13.3 (a) Stable. (b) Unstable.
The same conclusions hold for stability of an equilibrium solution for $\dot{\mathbf{x}} = A\mathbf{x} + \mathbf{b}$,
where $\mathbf{b}$ is an n-dimensional constant vector. If all eigenvalues of A have real
part zero, we can draw no immediate conclusion in dimension $n \ge 4$, for then an
equilibrium solution $\mathbf{x}_0$ may be stable or may be unstable, but if $n \le 3$ and the
eigenvalues are all distinct then $\mathbf{x}_0$ will be stable.
Proof. The method for computing the exponential matrix described in Section 2D
shows that every entry in the exponential matrix $e^{tA}$ has the form $e^{\lambda t}Q(t)$, where
$Q(t)$ is a polynomial and $\lambda$ is an eigenvalue of A. If every such $\lambda$ has negative real
part, then the entries, and hence all solutions, tend to zero as t tends to $+\infty$. On
the other hand, if a single eigenvalue $\lambda$, with eigenvector $\mathbf{v}$, has positive real part,
then the solution $\mathbf{x}(t) = \delta e^{\lambda t}\mathbf{v}$ is unbounded in every nonzero coordinate as t tends
to $+\infty$, regardless of how small the positive number $\delta$ is chosen.
If the nonhomogeneous equation has $\mathbf{x}_0$ for an equilibrium solution, the
homogeneous-plus-particular form $\mathbf{x}(t) = e^{tA}\mathbf{c} + \mathbf{x}_0$ for the general solution shows that the
same conclusions hold for the nonhomogeneous equation. The last statement of the
theorem is settled by checking out the two examples in Exercise 23 and noting that,
with real parts all zero, when n = 2 the eigenvalues are $\pm iq$ with q
real and nonzero, and when n = 3 the eigenvalues are $\pm iq$ and 0. •
EXERCISES
stable node.
4B Nonlinear Systems
To extend eigenvalue analysis of equilibrium solutions to autonomous nonlinear
systems we start by finding the equilibrium points $\mathbf{x}_0 = (a_1, \ldots, a_n)$ of a nonlinear
system $\dot{\mathbf{x}} = F(\mathbf{x})$, that is, points such that $F(\mathbf{x}_0) = \mathbf{0}$. To do this for linear systems we
had the routine of Chapter 2, Section 2, but if $F(\mathbf{x})$ is nonlinear we have to resort
to ad hoc methods or perhaps numerical approximation using Newton's method
as described in Chapter 5, Section 5. After locating an equilibrium point $\mathbf{x}_0$, the
next step is to linearize the system at $\mathbf{x}_0$, replacing each real-valued equation
$\dot{x}_k = F_k(x_1, \ldots, x_n)$ in the system by its linearization
\[ \dot{x}_k = \sum_{j=1}^{n} \frac{\partial F_k}{\partial x_j}(\mathbf{x}_0)(x_j - a_j) + F_k(\mathbf{x}_0), \qquad k = 1, \ldots, n. \]
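\[ F'(\mathbf{x}_0) = \begin{pmatrix} \dfrac{\partial F_1}{\partial x_1}(\mathbf{x}_0) & \cdots & \dfrac{\partial F_1}{\partial x_n}(\mathbf{x}_0) \\ \vdots & & \vdots \\ \dfrac{\partial F_n}{\partial x_1}(\mathbf{x}_0) & \cdots & \dfrac{\partial F_n}{\partial x_n}(\mathbf{x}_0) \end{pmatrix} \]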
For simplicity we work with the homogeneous equation $\dot{\mathbf{x}} = F'(\mathbf{x}_0)\mathbf{x}$ at $\mathbf{x}_0$, as we
did in Section 4A, and we'll see that the eigenvalues of the constant matrix $F'(\mathbf{x}_0)$
are the key to our criteria.
\[ \frac{\dot{y}}{\dot{x}} = \frac{dy}{dx} = \frac{-y + x^{2}}{x}, \qquad x \ne 0. \]
FIGURE 13.4 (a) Nonlinear saddle. (b) Lorenz trajectory; $\beta = 8/3$, $\rho = 28$, $\sigma = 10$.
\[ \begin{aligned} \dot{x} &= -\sigma x + \sigma y \\ \dot{y} &= \rho x - y - xz \\ \dot{z} &= xy - \beta z, \end{aligned} \]
and it has been studied extensively in recent years because of the apparently chaotic
behavior of its solution trajectories near equilibrium. The equilibrium solutions are
just the solutions to the algebraic system we get by setting the right-hand sides equal
to zero. Looking at the special case $\beta = \sigma = 1$, $\rho = 2$, we solve
\[ \begin{aligned} -x + y &= 0 \\ 2x - y - xz &= 0 \\ xy - z &= 0. \end{aligned} \]
Noting from the first equation that x = y, we then see that there are just three
solutions: (1, 1, 1), (0, 0, 0) and (-1, -1, 1). The derivative matrix $F'(x, y, z)$ for
the linearization at (x, y, z) is
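\[ F'(x, y, z) = \begin{pmatrix} -1 & 1 & 0 \\ 2 - z & -1 & -x \\ y & x & -1 \end{pmatrix}, \]
\[ F'(1, 1, 1) = \begin{pmatrix} -1 & 1 & 0 \\ 2 - z & -1 & -x \\ y & x & -1 \end{pmatrix}_{(1,1,1)} = \begin{pmatrix} -1 & 1 & 0 \\ 1 & -1 & -1 \\ 1 & 1 & -1 \end{pmatrix}. \]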
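Similarly, we get the linearization matrices at (0, 0, 0) and (-1, -1, 1) by evaluating
the same derivative matrix at these two additional points:
\[ F'(0, 0, 0) = \begin{pmatrix} -1 & 1 & 0 \\ 2 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}; \qquad F'(-1, -1, 1) = \begin{pmatrix} -1 & 1 & 0 \\ 1 & -1 & 1 \\ -1 & -1 & -1 \end{pmatrix}. \]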
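The linearization step is mechanical enough to check by machine. The sketch below assumes the special Lorenz system with $\beta = \sigma = 1$, $\rho = 2$; it codes the derivative matrix by hand and cross-checks it against central differences. The function names are hypothetical, and the finite-difference routine is only a sanity check, not a recommended way to compute derivative matrices in general.

```python
def F(x, y, z):
    """Right-hand side of the special Lorenz system (beta = sigma = 1, rho = 2)."""
    return (-x + y, 2 * x - y - x * z, x * y - z)

def jacobian(x, y, z):
    """Derivative matrix F'(x, y, z), computed by hand from F."""
    return [[-1, 1, 0],
            [2 - z, -1, -x],
            [y, x, -1]]

def jacobian_fd(x, y, z, h=1e-6):
    """Central-difference approximation, used only as a cross-check."""
    p = (x, y, z)
    cols = []
    for j in range(3):
        up, dn = list(p), list(p)
        up[j] += h
        dn[j] -= h
        fu, fd = F(*up), F(*dn)
        cols.append([(fu[i] - fd[i]) / (2 * h) for i in range(3)])
    # cols[j][i] holds dF_i/dx_j, so transpose to index rows by i.
    return [[cols[j][i] for j in range(3)] for i in range(3)]

equilibria = [(1, 1, 1), (0, 0, 0), (-1, -1, 1)]
for p in equilibria:
    print(p, F(*p), jacobian(*p))
```

Because F is quadratic, the central difference agrees with the exact derivative up to rounding error.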
The next theorem draws conclusions about stability from the eigenvalues of deriva-
tive matrices. For a general n-dimensional system the number of possibilities is very
large, so we list only a few general categories. We omit the detailed proof.
\[ P(\lambda) = \det\begin{pmatrix} -1-\lambda & 1 & 0 \\ 1 & -1-\lambda & -1 \\ 1 & 1 & -1-\lambda \end{pmatrix} \]
Computing the determinant, we get $P(\lambda) = -((\lambda+1)^{3} + 1)$, and we see by inspection
that $\lambda_1 = -2$ is a root. Division by $\lambda + 2$ gives $P(\lambda) = -(\lambda+2)(\lambda^{2}+\lambda+1)$. The
roots of the quadratic factor are $\lambda_2 = (-1+\sqrt{3}\,i)/2$ and $\lambda_3 = (-1-\sqrt{3}\,i)/2$. Thus
the real parts of all three eigenvalues are negative, so we conclude from Theorem 4.2
that (1, 1, 1) is an asymptotically stable equilibrium solution.
EXAMPLE
Continuing with the special Lorenz system, we examine the equilibrium at (0, 0, 0). The
relevant characteristic polynomial is evaluated at (0, 0, 0) and is $P(\lambda) = \det(F'(0, 0, 0) - \lambda I)$, or
\[ P(\lambda) = \det\begin{pmatrix} -1-\lambda & 1 & 0 \\ 2 & -1-\lambda & 0 \\ 0 & 0 & -1-\lambda \end{pmatrix}. \]
The determinant is $-(\lambda+1)^{3} + 2\lambda + 2 = -(\lambda+1)(\lambda^{2} + 2\lambda - 1)$. The roots are
$\lambda_1 = -1$, $\lambda_2 = -1 - \sqrt{2}$ and $\lambda_3 = -1 + \sqrt{2}$. Since $\lambda_3 > 0$ we conclude from
Theorem 4.2 that (0, 0, 0) is an unstable equilibrium. This point is a saddle point,
since there are two negative eigenvalues that contribute to making the other basic
solutions tend to zero. Checking out the equilibrium at (-1, -1, 1) is left as an
exercise.
has been studied extensively with the aim of understanding trajectories such as the
one shown in Figure 13.4(b). With the choice of parameters shown there, the equilibrium
points, aside from the one at the origin, are at $(\pm 6\sqrt{2}, \pm 6\sqrt{2}, 27)$. The
trajectory shown in the figure has initial point (2, 2, 21). It winds around in the area
of one equilibrium an apparently random number of times, then switches toward the
other equilibrium with similar behavior, continuing back and forth unpredictably.
The eigenvalues of the linearizations are the same at the two equilibrium points;
they are approximately as follows: $\lambda_1 \approx -13.85$, $\lambda_2, \lambda_3 \approx 0.09 \pm 10.19i$. Thus
these two points are saddle points, and each one has a surface containing it on which
all trajectories gradually spiral away from the point, as well as a trajectory at a positive
angle to the surface that converges to the point. The typical trajectory behavior
lies somewhere between these extremes, winding away from one equilibrium until
it is attracted by the other, then reversing. The number of circuits about each point,
and the path taken, is very sensitive to minute changes in the initial conditions.
Edward Lorenz began the study of the Lorenz system by using it to approximate
more complicated differential equations in the study of weather patterns; hence the
interest in the system's sensitivity to initial conditions, sometimes called the butterfly
effect. For more details about the system see Colin Sparrow, The Lorenz Equations:
Bifurcations, Chaos, and Strange Attractors, Springer-Verlag (1982).
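The sensitivity to initial conditions can be seen in a short numerical experiment. The sketch below is an illustration under stated assumptions: the parameters are $\sigma = 10$, $\rho = 28$, $\beta = 8/3$, the integrator is a fixed-step classical Runge-Kutta method written out by hand, and the step size, run length, and perturbation size are arbitrary choices rather than values from the text.

```python
import math

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0

def lorenz(p):
    """Right-hand side of the Lorenz system at the point p = (x, y, z)."""
    x, y, z = p
    return (-SIGMA * x + SIGMA * y, RHO * x - y - x * z, x * y - BETA * z)

def rk4_step(p, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shift(q, k, s):
        return tuple(q[i] + s * k[i] for i in range(3))
    k1 = lorenz(p)
    k2 = lorenz(shift(p, k1, h / 2))
    k3 = lorenz(shift(p, k2, h / 2))
    k4 = lorenz(shift(p, k3, h))
    return tuple(p[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) for i in range(3))

def max_separation(p, q, steps=4000, h=0.01):
    """Largest distance between two trajectories over the whole run."""
    worst = 0.0
    for _ in range(steps):
        p, q = rk4_step(p, h), rk4_step(q, h)
        worst = max(worst, math.dist(p, q))
    return worst

# A nonzero equilibrium: the right-hand side should vanish up to rounding.
eq = (6 * math.sqrt(2), 6 * math.sqrt(2), 27.0)
print(lorenz(eq))

# Two starts differing by 1e-8 in z drift far apart over the run.
print(max_separation((2.0, 2.0, 21.0), (2.0, 2.0, 21.0 + 1e-8)))
```

The printed separation depends on the step size and run length, so it should be read qualitatively, not as a reproducible constant.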
EXERCISES
1. Assume A is not the 0 matrix and let $\dot{\mathbf{x}} = A\mathbf{x}$ be a 2-dimensional autonomous system
for which det A = 0. Show that the system has zero for an eigenvalue and that
the equilibrium solutions make up an entire line in $\mathbb{R}^{2}$.
2. The nonautonomous system
\[ \dot{x} = (1 - t)x - ty, \qquad \dot{y} = tx + (1 - t)y \]
exhibits a change of character at its lone equilibrium point.
This system appears also in Example 3 of Chapter 12, Section 1.
(a) Show that for each real number t the system has a single equilibrium point at (x, y) = (0, 0).
(b) Show that while t > 1 solutions behave in a stable manner near the equilibrium
point, and that behavior is unstable when t < 1.
3. Show that the equilibrium points of the Lorenz system
(c) Show that in polar coordinates the given system takes the form $\dot{r} = ar^{3}$, $\dot{\theta} = -1$.
[Hint: Apply d/dt to the equations $x = r\cos\theta$, $y = r\sin\theta$.]
(d) Solve the polar-form system in part (c), and show that if a > 0 the equilibrium
point is unstable, and that if a < 0 it is stable. Thus the parameter value a = 0 is
called a bifurcation point for the system, because the stability of the system at the
equilibrium point changes in a fundamental way as a increases through zero.
6. A nonlinear pendulum with frictional damping has a displacement angle $\theta = \theta(t)$
that satisfies $\ddot{\theta} + (k/(lm))\dot{\theta} + (g/l)\sin\theta = 0$, where k > 0 is constant.
(a) Show that the equation for $\theta$ is equivalent, with $x = \theta$, $y = \dot{\theta}$, to the first-order system
\[ \dot{x} = y, \qquad \dot{y} = -\frac{g}{l}\sin x - \frac{k}{lm}\,y. \]
Chapter 13 REVIEW
=( ~ )
sin t ) , x(O)
14. (a) Find an equivalent first-order system of dimension 2n for the n-dimensional
second-order initial-value problem $\ddot{\mathbf{x}} = \mathbf{x}$, $\mathbf{x}(0) = \mathbf{x}_0$, $\dot{\mathbf{x}}(0) = \mathbf{z}_0$, and write
the resulting 2n-dimensional system as an uncoupled sequence of 2-dimensional coupled systems.
(b) Solve the given second-order system for the case n = 1 and deduce the solution
to the general case from part (a) and this special case.
(c) Deduce from the result of part (b) the form of the exponential matrix for the
2n-dimensional system of part (a).
-1 -1
I -1
n
-1 1
1
-2 ] ) x, x(O) - (
I
D•0)• D
12. i=x+e1,x(O)=en,n:::::2
+ x(O) - (
15. (a) Find an equivalent first-order system of dimension 2n for the n-dimensional
second-order initial-value problem $\ddot{\mathbf{x}} = -\mathbf{x}$, $\mathbf{x}(0) = \mathbf{x}_0$, $\dot{\mathbf{x}}(0) = \mathbf{z}_0$, and write
the resulting 2n-dimensional system as an uncoupled sequence of 2-dimensional coupled systems.
(b) Solve the given second-order system for the case n = 1 and deduce the solution
to the general case from part (a) and this special case.
(c) Deduce from the result of part (b) the form of the exponential matrix for the
2n-dimensional system of part (a).
13. Find the general solution of the second-order system $\ddot{x} = x + y$, $\ddot{y} = x - y$ by first
writing it as a first-order system of dimension 4 in matrix form and finding
an exponential matrix in complex form.
16. If all eigenvalues of the n-by-n matrix A are real, show that the system $t\dot{\mathbf{x}} = A\mathbf{x}$
has solutions $\mathbf{x}(t) = t^{\lambda}\mathbf{u}$ for t > 0, where $\mathbf{u}$ is an eigenvector of A with eigenvalue $\lambda$.
CHAPTER 14
INFINITE SERIES
The study of numerical infinite series is an important branch of the study of numerical
approximation. In the first section we'll treat limits of sequences somewhat informally.
Later on we'll use calculus techniques to deal with sequential limits. Section 1
introduces the idea of convergence of a series, and Section 2 complements it using
Taylor expansions. Additional sections deal with the more technical aspects of convergence.
The chapter closes with power series solutions of ordinary differential
equations and an introduction to Fourier series and the 1-dimensional heat and wave
equations.
\[ \sum_{k=m}^{n} a_k = a_m + a_{m+1} + \cdots + a_n, \qquad m \le n. \]
For example,
II
a 1 + a2 + a3 + ··· = L ak.
k= I
Section 1 Examples and Definitions 661
Writing formulas like this requires us to have a formula for the general term $a_k$.
Thus, for example,
\[ \frac{3}{10} + \frac{3}{10^{2}} + \frac{3}{10^{3}} + \frac{3}{10^{4}} + \cdots = \sum_{k=1}^{\infty} \frac{3}{10^{k}}. \]
\[ \sum_{k=1}^{\infty} a_k = \lim_{n\to\infty} \sum_{k=1}^{n} a_k = s, \]
if the limit exists. If the series has a sum in this sense, then the series is said to
converge to s, and if the limit fails to exist, the series is said to diverge.
\[ \sum_{k=0}^{\infty} x^{k} = 1 + x + x^{2} + \cdots. \]
There is a simple formula for the partial sums if $x \ne 1$, given by
\[ s_n = 1 + x + x^{2} + \cdots + x^{n-1} = \frac{1 - x^{n}}{1 - x}. \]
To verify the formula, multiply both sides by 1 - x and note that all but two terms
cancel on the left. If 0 < |x| < 1, then $x^{n}$ tends to 0 as $n \to \infty$; to see this take
$n > \ln\delta / \ln|x|$ to make $|x|^{n} < \delta < 1$. Thus for all x for which |x| < 1 the sum is
\[ \sum_{k=0}^{\infty} x^{k} = \lim_{n\to\infty} \frac{1 - x^{n}}{1 - x} = \frac{1}{1 - x}. \]
If |x| > 1, then $x^{n}$ is unbounded as $n \to \infty$, so the sum fails to exist. If x = -1,
the formula for $s_n$ gives 1 if n is odd and 0 if n is even, so there is no limit then
either. Finally, if x = 1 the formula for $s_n$ is invalid, but in that case we see directly
that $s_n = n$, which tends to $\infty$ as $n \to \infty$.
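The partial-sum formula for the geometric series is easy to test with exact arithmetic. The following sketch uses Python's `fractions.Fraction` so that the term-by-term sum and the closed form can be compared for equality rather than approximately; the function names are hypothetical.

```python
from fractions import Fraction

def partial_sum(x, n):
    """s_n = 1 + x + ... + x^(n-1), summed term by term."""
    return sum(x**k for k in range(n))

def closed_form(x, n):
    """(1 - x^n) / (1 - x), valid for x != 1."""
    return (1 - x**n) / (1 - x)

x = Fraction(1, 3)
for n in (1, 2, 5, 10):
    # The two columns agree exactly; the values approach 1 / (1 - x) = 3/2.
    print(n, partial_sum(x, n), closed_form(x, n))

# For x = -1 the partial sums alternate and have no limit.
print([partial_sum(-1, n) for n in range(1, 7)])
```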
We can get some feeling for convergence of a sum by looking at specific numerical
examples. In each of the next three examples we can find a simple formula for the
partial sums, something that is not possible for many important infinite series.
EXAMPLE 2 Consider the geometric series with ratio $x = \tfrac{1}{3}$. Using Example 1, we have
\[ \sum_{k=0}^{\infty} \frac{1}{3^{k}} = 1 + \frac{1}{3} + \frac{1}{3^{2}} + \frac{1}{3^{3}} + \cdots = \lim_{n\to\infty} \sum_{k=0}^{n} \frac{1}{3^{k}} = \lim_{n\to\infty} \frac{1 - (1/3)^{n+1}}{1 - (1/3)} = \frac{1 - 0}{1 - (1/3)} = \frac{3}{2}. \]
EXAMPLE 3 Here is a particularly simple series.
\[ = \lim_{n\to\infty} \left[ \left(1 - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \cdots + \left(\frac{1}{n} - \frac{1}{n+1}\right) \right] \]
\[ = \lim_{n\to\infty} \left(1 - \frac{1}{n+1}\right) = 1 - \lim_{n\to\infty} \frac{1}{n+1} = 1 - 0 = 1. \]
This is an example of what is called a "telescoping" series, because the interior terms
cancel in the partial sum. Since $(1/k) - (1/(k+1)) = 1/(k(k+1))$, the series can also
be written
\[ \sum_{k=1}^{\infty} \frac{1}{k(k+1)} = 1. \]
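The telescoping cancellation can be confirmed exactly for any finite n. This short sketch sums the terms $1/(k(k+1))$ with exact rational arithmetic and compares the result with $1 - 1/(n+1)$; the function name is a hypothetical helper.

```python
from fractions import Fraction

def telescoping_partial_sum(n):
    """Exact value of 1/(1*2) + 1/(2*3) + ... + 1/(n(n+1))."""
    return sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))

for n in (1, 2, 3, 10, 100):
    s = telescoping_partial_sum(n)
    # s equals 1 - 1/(n+1), so the gap 1 - s shrinks like 1/(n+1).
    print(n, s, 1 - s)
```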
The infinite series $\sum_{k=1}^{\infty} k = 1 + 2 + 3 + \cdots$ is divergent. To see this, note that the
nth partial sum is the sum of the first n integers:
\[ s_n = 1 + 2 + \cdots + n = \frac{n(n+1)}{2}. \]
Hence $\lim_{n\to\infty} s_n = \infty$, so we agree to say that the series "diverges to $+\infty$."
EXAMPLE 5 The geometric series with x = -2 is
\[ \sum_{k=0}^{\infty} (-2)^{k} = 1 - 2 + 2^{2} - 2^{3} + \cdots, \]
and the nth partial sum is, according to the formula in Example 1,
\[ \frac{1 + (-1)^{n+1} 2^{n}}{3}. \]
As $n \to \infty$, the numerator oscillates between being large and positive and large
and negative. Since the partial sums do not tend to a fixed finite value, the series
diverges.
EXERCiSES
5 6
'£ (~--
II
1
In Exercises l to 4, use the definition L ak = 1
• k=I + )
k k 1
2. I: k - L (k - o
k=2 k=o3
k=m 4 II
am + am+ 1 + · · · + a11 for the I:-notation to write out and 3. I: 2-k 4. 6 I: (- 1/
simplify the sums. k=O k=I
i Exercises 5 to 10, find a formula for the kth tenn, The symbol kl, called k-factorial is defined by
= 1, 2, 3, ... , of the infinite series
that is consistent
vith the given part of the series.
I I 1
kl-{
.-
l,
k(k-1)· · ·3·2·1,
k=O,
k~l.
5· 1 + 2 + 3 + 4 + ...
1
6.1--+ - --+···
l I Thus 0! = 1, l! = I, 2! = 2-1 = 2, 3! = 3-2·1 = 6, and
2 4 8 so on. Rewrite each of the infinite series 26 to 31 using
I I 1 factorial notation. Then write out the first three terms.
7 · 1-3 + 2·4 + 3.5 + ... 00 3-k
26. E - -
I 1 I k=i1·2· · ·k
8' 3 + 2.3 2 + 3.33 + ...
4 5 6
27. I; l
9' 2.32 k=I 2-4 · · · (2k)
+ 3.3 3 + 4.3 4 + ...
10. -
I
- 2-3
1
- + -3.4 - ···
I 28. I: 2k
k=I 2·4-6 · · · (2k)
1-2
In Exercises 11 to 14, verify that each of the partial 29. I; l
sum formulas is correct. Then find the sum of the k=l k(k + l)(k + 2) · · · (2k - 1)(2k)
corresponding infinite series as n ~ oo if it exists. 00 1.3.5 .. · (2k - I)
30. E-----
n 1 n k=I 2-4·6 · · · (2k)
ll. k~I 2k(2k + 2) = 4(n + I) 00
(-Ii
31. E-- ---
n 3 1-3 · · · (2k + I)
k=o
12. Ek =4-4-n
k==O 4 In Exercises 32 to 37, write out Sn for n = 1, 2, and 3.
n Also find limn-H)O Sn
13. }: 2k = n (n + 1)
k=I
32. Sn = ( 1 + ~)
14. E -- -
n (2k + 1 1) =2- 2-n
k=O 2k 3n 2 -1 2"
34. Sn= -2-- 35. Sn= n+I
4n +n 3
In Exercise 15 to 20, what is the sum of the geometric 3n +4 n!O + 1
series? 36. S n = - - 37. Sn= -n-(n_9_+_1_)
E (l)k
n
15. 00
- 16. E
oo ( --
3)k
Verify the equations in Exercises 38 to 41.
k=O 6 k=o 4
00 I 1 00
I 1
11. E -
oo ( I )k 18. E
oo (
-1-2 )k , x ¥ o 38. k~l 10k =9 39. k~2 3k =6
k=I l + lT k=O 1 +x
40
00
( 1 )k 1 4
00
l I
19.
00
E e-2k
00
20. }:(0.0li
· k'f-1 -4 = -5 1. k~3 k(k + 1) =3
k=O k=O
In Exercises 21 to 24, write the repeating decimal expansions as an infinite geometric
series and find the sum of each one.
21. 0.888...    22. 0.10101010...
23. 0.123123123...    24. 1.23452345...
25. Prove that an infinite decimal expansion that repeats periodically from some point
on must represent a rational number.
42. Find an infinite series with positive terms that converges to 3.
43. Find an infinite series with alternating positive and negative terms that converges to 3.
44. Assume that the series $\sum_{k=1}^{\infty} a_k = s$ is convergent. Show that $\sum_{k=m}^{\infty} a_k$ is also
convergent if m > 1. What is the sum of the second series in terms of s and the terms $a_k$?
45. Cantor set. From the interval [0, 1], the open middle third $(\tfrac{1}{3}, \tfrac{2}{3})$ is deleted.
Then the open middle third is deleted from each of the two remaining closed intervals, then
the open middle third from each of the four remaining closed intervals, and so on. The
set C remaining after the entire infinite sequence of deletions is called the Cantor
middle-third set.
(a) Find a formula for $C_n$, the sum of the lengths of the intervals remaining after n steps.
(b) Show that $\lim_{n\to\infty} C_n = 0$, showing that C has length zero.
It will follow from the next theorem that every polynomial has the form in Theorem 2.1
using powers of x - a for arbitrary a. Let $f(x) = 1 - x^{2} + 2x^{3}$. To write
f(x) in terms of powers of x - a with a = 1, we compute
\[ f(1) = 2, \quad f'(1) = 4, \quad f''(1) = 10, \quad f'''(1) = 12. \]
Hence
\[ f(x) = 2 + \frac{4}{1!}(x - 1) + \frac{10}{2!}(x - 1)^{2} + \frac{12}{3!}(x - 1)^{3} = 2 + 4(x - 1) + 5(x - 1)^{2} + 2(x - 1)^{3}. \]
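The same re-expansion can be automated for any polynomial by differentiating its coefficient list repeatedly. The sketch below is one simple way to do it, assuming a polynomial is stored as a list whose kth entry multiplies $x^{k}$; the helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def derivative(coeffs):
    """coeffs[k] multiplies x^k; return the coefficient list of the derivative."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    return sum(c * x**k for k, c in enumerate(coeffs))

def taylor_coefficients(coeffs, a):
    """Coefficients f^(k)(a) / k! of the expansion in powers of (x - a)."""
    out, k = [], 0
    while coeffs:
        out.append(Fraction(evaluate(coeffs, a), factorial(k)))
        coeffs = derivative(coeffs)
        k += 1
    return out

# f(x) = 1 - x^2 + 2x^3 expanded about a = 1.
print(taylor_coefficients([1, 0, -1, 2], 1))
```

Expanding about a = 0 simply returns the original coefficients, which is a quick consistency check on the routine.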
called the nth degree Taylor polynomial of f(x) at x = a. Theorem 2.1 shows that
if f(x) is itself a polynomial of degree at most n, then the coefficients $a_k$ are designed
so that $T_n(x) = f(x)$ for all real x. When f(x) is not a polynomial, $T_n(x)$ will not
usually be equal to f(x) except at x = a. The difference $f(x) - T_n(x) = R_n(x)$
is the Taylor remainder. If $R_n(x)$ is small, $T_n(x)$ will be a good approximation to
f(x). Here is a simple estimate for the size of the remainder, under the assumption
that $f^{(n+1)}(x)$ is continuous on an interval containing a.
\[ f(x) = \sum_{k=0}^{n} \frac{1}{k!} f^{(k)}(a)(x - a)^{k} + R_n(x) \]
where
\[ R_n(x) = \frac{1}{(n+1)!} f^{(n+1)}(c)(x - a)^{n+1}, \]
Proof. With x held fixed and $x \ne a$, define the unique number K by
\[ g(t) = -f(x) + \sum_{k=0}^{n} \frac{1}{k!} f^{(k)}(t)(x - t)^{k} + K(x - t)^{n+1}. \]
Note that g(a) = 0 because of the way K is defined, and that g(x) = 0 no matter
what value K has. Applying the product rule to the terms in the summation over k,
we find that differentiation with respect to t gives
\[ g'(t) = \sum_{k=0}^{n} \frac{1}{k!} f^{(k+1)}(t)(x - t)^{k} - \sum_{k=1}^{n} \frac{1}{(k-1)!} f^{(k)}(t)(x - t)^{k-1} - (n+1)K(x - t)^{n}. \]
\[ g'(t) = \frac{1}{n!} f^{(n+1)}(t)(x - t)^{n} - (n+1)K(x - t)^{n}. \]
By the Mean-Value Theorem for derivatives, there is a number c between x and a
such that g'(c) = 0. (Recall that g(x) = g(a) = 0.) But the equation g'(c) = 0 gives
\[ K = \frac{1}{(n+1)!} f^{(n+1)}(c), \]
which is what we wanted to show.
The nth degree Taylor polynomial $T_n(x)$ is also called the nth degree Taylor
approximation to f(x) about x = a. Note that to increase the degree of approximation,
all we do is add another term without altering the previous terms. Without
worrying about convergence, we can write the infinite Taylor series
2.3 \[ \sum_{k=0}^{\infty} \frac{1}{k!} f^{(k)}(a)(x - a)^{k} = f(a) + f'(a)(x - a) + \frac{1}{2!} f''(a)(x - a)^{2} + \frac{1}{3!} f'''(a)(x - a)^{3} + \cdots \]
and arrive at the nth degree Taylor approximation by stopping after (n + 1) terms.
The infinite series is defined only when f(x) has derivatives of all orders at x = a,
and even then may not converge except at x = a.
EXAMPLE 2 Let $f(x) = x^{1/2}$ for $x \ge 0$. Then $f'(x) = (\tfrac{1}{2})x^{-1/2}$, $f''(x) = -(\tfrac{1}{4})x^{-3/2}$,
$f'''(x) = (\tfrac{3}{8})x^{-5/2}$, and so on. We find, with a = 1 in Taylor's formula,
\[ \sqrt{x} = 1 + \frac{1}{1!}\left(\frac{1}{2}\right)(x - 1) + \frac{1}{2!}\left(-\frac{1}{4}\right)(x - 1)^{2} + R_2(x) = 1 + \frac{1}{2}(x - 1) - \frac{1}{8}(x - 1)^{2} + R_2(x), \]
where $R_2(x) = (1/3!)(\tfrac{3}{8})c^{-5/2}(x - 1)^{3}$, and c is somewhere between x and 1. Note
that the first two terms of the approximation give the function $T_1$ whose graph is
the tangent line to the graph of $y = \sqrt{x}$ at x = 1. The first three terms describe
a quadratic function $T_2$ that approximates $\sqrt{x}$ near x = 1. The graphs of both
approximations are shown in Figure 14.1(a). Note that the approximations get better
the closer we get to x = 1.
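The quality of these two approximations, and the remainder estimate, can be tabulated directly. In the sketch below the bound replaces the unknown c by the smaller of x and 1, where $c^{-5/2}$ is largest; that choice, and the function names, are assumptions made for the illustration.

```python
import math

def T1(x):
    """Tangent-line approximation to sqrt(x) at x = 1."""
    return 1 + (x - 1) / 2

def T2(x):
    """Second-degree Taylor approximation to sqrt(x) at x = 1."""
    return T1(x) - (x - 1) ** 2 / 8

def remainder_bound(x):
    """|R_2(x)| <= (1/3!)(3/8) m^(-5/2) |x - 1|^3 with m = min(x, 1)."""
    m = min(x, 1.0)
    return (3 / 8) / 6 * m ** -2.5 * abs(x - 1) ** 3

for x in (0.5, 0.9, 1.1, 1.5, 2.0):
    exact = math.sqrt(x)
    # Columns: x, error of T1, error of T2, bound on the T2 error.
    print(x, abs(exact - T1(x)), abs(exact - T2(x)), remainder_bound(x))
```

The bound is not sharp, but it shrinks like $|x - 1|^{3}$, which matches the way the T2 error column behaves near x = 1.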
FIGURE 14.1 (a) Graphs of $T_1(x)$, $\sqrt{x}$, and $T_2(x)$. (b)
We can then estimate that $e^{x}$ differs from $\sum_{k=0}^{n} \frac{1}{k!} x^{k}$ by at most
\[ \frac{e}{(n+1)!}, \qquad \text{if } 0 \le x \le 1. \]
\[ f(x) = \sum_{k=0}^{\infty} \frac{1}{k!} f^{(k)}(a)(x - a)^{k}, \]
with the series on the right converging for x in at least some subinterval of the
domain of f. All we have to do is show that
\[ \lim_{n\to\infty} R_n(x) = 0 \]
for the values of x in question. This at once proves convergence of the series and
shows that the sum at x is f(x).
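2.4 Here is a list of important Taylor expansions that many people find useful to
remember.
(a) $e^{x} = \sum_{k=0}^{\infty} \dfrac{x^{k}}{k!} = 1 + \dfrac{x}{1!} + \dfrac{x^{2}}{2!} + \dfrac{x^{3}}{3!} + \cdots, \quad -\infty < x < \infty$
(b) $\cos x = \sum_{k=0}^{\infty} \dfrac{(-1)^{k}}{(2k)!} x^{2k} = 1 - \dfrac{x^{2}}{2!} + \dfrac{x^{4}}{4!} - \dfrac{x^{6}}{6!} + \cdots, \quad -\infty < x < \infty$
(c) $\sin x = \sum_{k=0}^{\infty} \dfrac{(-1)^{k}}{(2k+1)!} x^{2k+1} = x - \dfrac{x^{3}}{3!} + \dfrac{x^{5}}{5!} - \dfrac{x^{7}}{7!} + \cdots, \quad -\infty < x < \infty$
(d) $\ln(1 + x) = \sum_{k=1}^{\infty} \dfrac{(-1)^{k+1}}{k} x^{k} = x - \dfrac{x^{2}}{2} + \dfrac{x^{3}}{3} - \dfrac{x^{4}}{4} + \cdots, \quad -1 < x < 1$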
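A quick way to build confidence in the four expansions is to compare partial sums with library values. The sketch below does this with plain floating-point sums; the function names are hypothetical, and the number of terms is an arbitrary choice that happens to be ample for the sample points used.

```python
import math

def exp_series(x, n):
    return sum(x**k / math.factorial(k) for k in range(n + 1))

def cos_series(x, n):
    return sum((-1)**k * x**(2 * k) / math.factorial(2 * k) for k in range(n + 1))

def sin_series(x, n):
    return sum((-1)**k * x**(2 * k + 1) / math.factorial(2 * k + 1) for k in range(n + 1))

def log1p_series(x, n):
    """Partial sum for ln(1 + x); only meaningful for -1 < x < 1."""
    return sum((-1)**(k + 1) * x**k / k for k in range(1, n + 1))

print(exp_series(1.0, 20) - math.e)
print(cos_series(2.0, 20) - math.cos(2.0))
print(sin_series(2.0, 20) - math.sin(2.0))
print(log1p_series(0.5, 60) - math.log(1.5))
```

The logarithm series converges much more slowly than the other three as |x| approaches 1, which is why it is given more terms here.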
\[ \lim_{n\to\infty} \frac{1}{(n+1)!} e^{c} x^{n+1} = 0 \]
for all real numbers x. Pick a fixed value for x. Since $c \le |c| \le |x|$, we see that
$e^{c} \le e^{|x|}$ for all relevant values of c. Hence, with x fixed, all we need to show is that
\[ \lim_{n\to\infty} \frac{1}{(n+1)!} x^{n+1} = 0. \]
We prove this as follows. Choose $k > l \ge 2x > 0$, hold l fixed, and write
\[ \frac{x^{k}}{k!} = \frac{x}{1} \cdot \frac{x}{2} \cdots \frac{x}{l} \cdot \frac{x}{l+1} \cdots \frac{x}{k}. \]
By the assumption on k, l and x, we have $x/l \le \tfrac{1}{2}$. Thus for 0 < x < l we have
\[ \frac{x^{k}}{k!} \le \frac{x^{l}}{l!} \cdot \frac{1}{2^{k-l}}. \]
Letting $k \to \infty$ shows that $x^{k}/k! \to 0$. Allowing $-l \le x < 0$ only makes the sign
alternate. Similar arguments apply to the other series listed previously, and these are
left as exercises.
To find a series expansion for $e^{-2x}$, there is no need to start from scratch. Simply
replace x by -2x everywhere in 2.4(a):
\[ e^{-2x} = \sum_{k=0}^{\infty} \frac{1}{k!} (-2)^{k} x^{k}. \]
To find an expansion for ln(1 - x) in powers of x, replace x by -x in 2.4(d):
\[ \ln(1 - x) = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} (-1)^{k} x^{k}, \qquad -1 < -x < 1, \]
\[ = -\sum_{k=1}^{\infty} \frac{1}{k} x^{k}, \qquad -1 < x < 1. \]
Section 2B Taylor Series 669
FIGURE 14.2
Figure 14.2(a) shows the graph of cos x along with the Taylor expansion partial sum
graphs of $T_0(x) = 1$, $T_2(x) = 1 - \tfrac{1}{2}x^{2}$ and $T_4(x) = 1 - \tfrac{1}{2}x^{2} + \tfrac{1}{24}x^{4}$.
Figure 14.2(b) shows the graph of sin x along with the Taylor expansion partial sum
graphs of $T_1(x) = x$, $T_3(x) = x - \tfrac{1}{6}x^{3}$ and $T_5(x) = x - \tfrac{1}{6}x^{3} + \tfrac{1}{120}x^{5}$.
EXERCISES
(°"oo (°"oo
I +x k=O
27. Show that L...k=O (-Ii)
--,;i-- Lk=O k!I) = 1.
23. (a) If f(x) = cos x, show that the Taylor coefficients of f about x = 0 are
\[ \frac{f^{(n)}(0)}{n!} = \begin{cases} 0, & n \text{ odd}, \\ (-1)^{n/2}/n!, & n \text{ even}. \end{cases} \]
(b) Using the method of Example 4 of the text, show that
\[ \cos x = \sum_{k=0}^{\infty} \frac{(-1)^{k}}{(2k)!} x^{2k}, \qquad -\infty < x < \infty. \]
24. (a) If f(x) = sin x, show that the Taylor coefficients of f about x = 0 are
\[ \frac{f^{(n)}(0)}{n!} = \begin{cases} (-1)^{(n-1)/2}/n!, & n \text{ odd}, \\ 0, & n \text{ even}. \end{cases} \]
28. Show that
29. An even function f is a function such that f(-x) = f(x) for all real x, and an odd
function is a function such that f(-x) = -f(x) for all real x.
(a) Show that the odd-order derivatives $f^{(2k+1)}(0)$ of an even function are all zero.
[For example, f(x) = cos x.]
(b) Show that the even-order derivatives $f^{(2k)}(0)$ of an odd function are all zero.
[For example, f(x) = sin x.]
(c) What conclusions can you make about the Taylor expansions about a = 0 of even
functions and odd functions?
This principle is a fundamental property of real numbers and indeed is built into their
very definition. Our attitude here is to accept the principle as a plausible assertion
about the real number line. Figure 14.3 illustrates both cases of the principle.
EXAMPLE 1 The infinite series with kth term $a_k = k^{-1}2^{-k}$ has nth partial sum
\[ s_n = \sum_{k=1}^{n} \frac{1}{k2^{k}} \le \sum_{k=0}^{n} \frac{1}{2^{k}}. \]
Since $s_{n+1} = s_n + 2^{-n-1}/(n+1) > s_n$, and $s_n \le b = 2$, the nondecreasing sequence
principle shows that the series has a sum
\[ s = \sum_{k=1}^{\infty} \frac{1}{k2^{k}} \le 2. \]
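The two hypotheses of the nondecreasing sequence principle, monotonicity and boundedness, are easy to observe numerically for this series. The sketch below also compares the partial sums with ln 2; that value is not claimed in the example itself, but it follows from the standard expansion $-\ln(1 - x) = \sum_{k \ge 1} x^{k}/k$ at $x = \tfrac{1}{2}$. The function name is hypothetical.

```python
import math

def partial_sums(n):
    """List of s_1, ..., s_n for the series with terms 1 / (k 2^k)."""
    sums, s = [], 0.0
    for k in range(1, n + 1):
        s += 1 / (k * 2**k)
        sums.append(s)
    return sums

sums = partial_sums(40)
print(sums[:5])
# Every partial sum stays below the bound 2, and the last one is close to ln 2.
print(max(sums), math.log(2))
```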
The equations
\[ \lim_{n\to\infty} s_n = s \qquad\text{and}\qquad \lim_{n\to\infty} (s_n - s) = 0 \]
are completely equivalent, and very often it's simpler to show that a sequence of
numbers $s_n - s$ tends to 0 than to show that $s_n$ tends to s. Here we record two
important kinds of sequence that have zero for a limit.
3.2 (a) $\lim_{n\to\infty} \dfrac{1}{n^{\alpha}} = 0$, if $\alpha > 0$
(b) $\lim_{n\to\infty} \dfrac{1}{b^{n}} = 0$, if $|b| > 1$
The reason in each case is that the denominator, along with its absolute value in (b),
tends to infinity as $n \to \infty$. See Exercises 21 and 22 for details.
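are convergent series of real numbers. Then the series with kth terms $ca_k$ and $a_k + b_k$,
respectively, are convergent also, with
\[ \sum_{k=1}^{\infty} ca_k = c\sum_{k=1}^{\infty} a_k \qquad\text{and}\qquad \sum_{k=1}^{\infty} (a_k + b_k) = \sum_{k=1}^{\infty} a_k + \sum_{k=1}^{\infty} b_k. \]
Proof. Use of the distributive and associative laws for finite sums shows that the
proof reduces to showing that if
\[ s_n = \sum_{k=1}^{n} a_k \qquad\text{and}\qquad t_n = \sum_{k=1}^{n} b_k, \]
then
\[ c\lim_{n\to\infty} s_n = \lim_{n\to\infty} cs_n, \qquad\text{and}\qquad \lim_{n\to\infty} s_n + \lim_{n\to\infty} t_n = \lim_{n\to\infty} (s_n + t_n). \]
These limit relations are proved in the same way as the more familiar analogues for
a continuous variable x, which we assume:
\[ c\lim_{x\to\infty} f(x) = \lim_{x\to\infty} cf(x) \]
and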
3.4 Theorem. Altering a finite number of terms in an infinite series has no effect
on whether the series converges or diverges.
Proof. Suppose all changes occur among the first M terms, replacing $a_k$ by $a_k'$ for
k = 1, 2, ..., M. Then for n > M, the new partial sums $s_n'$ differ from $s_n$ by a fixed
amount, $s_n' - s_n = d$, independent of n. Hence
\[ \lim_{n\to\infty} s_n' = d + \lim_{n\to\infty} s_n, \]
\[ (1 + 2) + \sum_{k=1}^{\infty} \frac{1}{3^{k}} = 3 + \frac{1}{3} + \frac{1}{3^{2}} + \frac{1}{3^{3}} + \cdots = \frac{3}{2} + 2 = \frac{7}{2}. \]
EXERCISES
In Exercise 9 to 12, find a formula for the nth entry in an (b) Find a simple formula for the kth term ak of an
infinite sequence with the given four values. Then find infinite series that has s 11 = I + (l/11) for its nth
limn->oo Sn,
partial sum. What is the sum of 1 ak? Lk=
I 2 3 4 3 4 5 6 14. Let L~Iak be a series of nonnegative terms. Show that
9 10 the series either converges or else diverges to infinity, in
• 2' 3' 4' 5 · .. · 2' 3' 4' 5' ...
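16. Show that $\lim_{k\to\infty} x^{k}/k! = 0$ for x > 0 by choosing $k > l \ge 2x$ and writing
\[ \frac{x^{k}}{k!} = \frac{x}{1} \cdot \frac{x}{2} \cdots \frac{x}{l} \cdot \frac{x}{l+1} \cdots \frac{x}{k}. \]
Then use the assumption $x/l \le \tfrac{1}{2}$. What happens if x < 0?
19. $\sum_{k=0}^{\infty} (2/3)^{k} - \sum_{k=0}^{\infty} (2/3)^{k+1}$
20. $\sum_{k=1}^{\infty} \dfrac{k}{k^{3}+1} - \sum_{k=1}^{\infty} \dfrac{1}{k^{3}+1}$
21. Prove that $\lim_{n\to\infty} n^{-\alpha} = 0$ if $\alpha > 0$. [Hint: To make
$n^{-\alpha} < \epsilon$, make $n > \epsilon^{-1/\alpha}$.]
22. Prove that $\lim_{n\to\infty} b^{-n} = 0$ if |b| > 1. [Hint: Show
$|b|^{n} = (1 + \delta)^{n} \ge 1 + n\delta$ by the binomial theorem.]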
Proof. Let $s_n = \sum_{k=1}^{n} a_k$. By assumption $\lim_{n\to\infty} s_n = s$, for some finite number
$s$. Hence $\lim_{n\to\infty} s_{n-1} = s$ also. It follows from $s_n - s_{n-1} = a_n$ that
$$\lim_{n\to\infty} a_n = \lim_{n\to\infty}(s_n - s_{n-1}) = \lim_{n\to\infty} s_n - \lim_{n\to\infty} s_{n-1} = s - s = 0.$$ •
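The series
$$\sum_{k=1}^{\infty}\frac{k}{k+1} = \frac12 + \frac23 + \frac34 + \cdots$$
diverges. Since
$$\lim_{k\to\infty}\frac{k}{k+1} = \lim_{k\to\infty}\frac{1}{1 + (1/k)} = 1,$$
the series fails to converge because $a_n$ doesn't tend to 0.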
Warning. It is not true, just because $\lim_{n\to\infty} a_n = 0$, that the series $\sum_{k=1}^{\infty} a_k$
converges. The harmonic series $\sum_{k=1}^{\infty}(1/k)$, shown to diverge in Example 6, is a
counterexample, because $\lim_{k\to\infty} 1/k = 0$, but the series diverges.
Section 3C Convergence Criteria 675
There are close analogies between integrals over an infinite interval and infinite
series. Thus the improper integral of $f(x)$ from $a$ to $\infty$ is convergent if it has a
finite value determined by
$$\int_a^{\infty} f(x)\,dx = \lim_{b\to\infty}\int_a^b f(x)\,dx;$$
otherwise, the integral is divergent. For example,
$$\int_1^{\infty} e^{-px}\,dx = \lim_{b\to\infty}\int_1^b e^{-px}\,dx = \lim_{b\to\infty}\frac{e^{-p} - e^{-pb}}{p} = \frac{e^{-p}}{p}$$
if $p > 0$.
If $p < 0$, the computation shows that the improper integral diverges to $\infty$. (What
about $p = 0$?) For another example, consider
$$\int_0^{\infty}\frac{dx}{1 + x^2} = \lim_{b\to\infty}\int_0^b\frac{dx}{1 + x^2} = \lim_{b\to\infty}[\arctan b - \arctan 0] = \frac{\pi}{2}.$$
The analogy with series is that to compute an improper integral you first compute
"proper" integrals over finite intervals and then find their limit over intervals with
length tending to oo. The next theorem shows that there is a very useful connection
between the convergence and divergence of particular infinite series and improper
integrals.
3.6 Integral Test. Let $\sum_{k=1}^{\infty} a_k$ be a series of positive terms, and suppose f is
a decreasing function such that $f(k) = a_k$ for k = 1, 2, 3, .... Then the series and
improper integral,
$$\sum_{k=1}^{\infty} a_k \quad\text{and}\quad \int_1^{\infty} f(x)\,dx,$$
either both converge or both diverge.
FIGURE 14.4 (panels (a) and (b))
Proof. Suppose first that the integral converges. Looking at Figure 14.4(a) shows
that, by comparing areas, we have
$$\sum_{k=2}^{N} a_k \le \int_1^{N} f(x)\,dx \le \int_1^{\infty} f(x)\,dx < \infty.$$
The series converges because the partial sums are increasing (because $a_k > 0$)
and bounded above by the value of the integral. Then the nondecreasing sequence
principle (3.1) applies.
$$\int_1^{N} f(x)\,dx \le \sum_{k=1}^{N} a_k \le \sum_{k=1}^{\infty} a_k < \infty,$$
so again Principle 3.1 applies to show that the integral converges to a finite number
as $N \to \infty$ if the series converges. •
EXAMPLE 6 The harmonic series $\sum_{k=1}^{\infty}(1/k)$ diverges, because with $f(x) = 1/x$, we have
$f(k) = 1/k$. But
$$\int_1^{\infty}\frac{1}{x}\,dx = \lim_{N\to\infty}\int_1^{N}\frac{1}{x}\,dx = \lim_{N\to\infty}\ln N = \infty.$$
The series diverges even though $\lim_{k\to\infty}(1/k) = 0$, as we pointed out in the warning
about misapplication of the term test.
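The slow divergence in Example 6 can be seen numerically. The following is a minimal sketch in Python, not part of the original text; the function name `harmonic_partial_sum` is an illustrative choice. It relies on the area comparison used in the integral test with $f(x) = 1/x$, which gives $\ln(N + 1) \le s_N \le 1 + \ln N$.

```python
import math


def harmonic_partial_sum(n: int) -> float:
    """Return s_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))


for n in (10, 1_000, 100_000):
    s_n = harmonic_partial_sum(n)
    lower = math.log(n + 1)   # integral of 1/x from 1 to n + 1
    upper = 1 + math.log(n)   # first term plus integral from 1 to n
    # The partial sums are trapped between two logarithms, so they
    # grow without bound even though the terms 1/k tend to 0.
    print(n, round(lower, 4), round(s_n, 4), round(upper, 4))
```

The printed middle column keeps growing with n, which is the behavior the term test alone cannot detect.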
EXAMPLE 7 To decide about the p-series $\sum_{k=1}^{\infty} k^{-p}$ for $p > 0$, let $f(x) = 1/x^p$. We have, for
$p \ne 1$,
$$\int_1^{\infty}\frac{1}{x^p}\,dx = \lim_{N\to\infty}\int_1^{N}\frac{1}{x^p}\,dx = \lim_{N\to\infty}\left[\frac{1}{(1 - p)x^{p-1}}\right]_1^N = \lim_{N\to\infty}\frac{1}{1 - p}\left[\frac{1}{N^{p-1}} - 1\right] = \begin{cases}\dfrac{1}{p - 1}, & p > 1,\\[4pt] +\infty, & p < 1.\end{cases}$$
Hence we have convergence for $p > 1$ and divergence for $p < 1$. The case $p = 1$ is
the harmonic series, and when $p \le 0$ the terms of the series fail to tend to zero, so
the series diverges by the term test. For $p > 1$, the p-series defines a function $\zeta(p)$
called the Riemann zeta-function:
$$\zeta(p) = \sum_{k=1}^{\infty}\frac{1}{k^p}.$$
$$\sum_{k=1}^{\infty}\frac{1}{\sqrt{k}} \quad\text{and}\quad \sum_{k=1}^{\infty}\frac{1}{k^{0.9}}$$
both diverge.
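The integral computed in Example 7 also bounds the tail of a convergent p-series: for $p > 1$ the terms with $k > N$ add up to at most $\int_N^{\infty} x^{-p}\,dx = N^{1-p}/(p - 1)$. The sketch below, in Python, is an illustration added here rather than part of the text; the function names are hypothetical, and the value $\zeta(2) = \pi^2/6$ is a standard fact not derived in this section.

```python
import math


def p_series_partial_sum(p: float, n: int) -> float:
    """Sum of k**(-p) for k = 1..n."""
    return sum(k ** -p for k in range(1, n + 1))


def tail_bound(p: float, n: int) -> float:
    """Integral bound for the sum over k > n; only meaningful for p > 1."""
    return n ** (1 - p) / (p - 1)


# p = 2: partial sums approach zeta(2) = pi**2 / 6 within the integral bound.
zeta2 = math.pi ** 2 / 6
for n in (10, 100, 1000):
    gap = zeta2 - p_series_partial_sum(2, n)
    print(n, gap, tail_bound(2, n))

# p = 1/2: partial sums exceed 2*sqrt(n + 1) - 2, the integral of x**(-1/2)
# from 1 to n + 1, so they grow without bound.
print(p_series_partial_sum(0.5, 10_000), 2 * math.sqrt(10_001) - 2)
```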
Application of the integral test usually depends on being able to compute some
indefinite integral, but examples in which the computation is awkward can sometimes
be handled by comparison with a related series.
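$$s_n = \sum_{k=1}^{n} a_k \le \sum_{k=1}^{n} b_k \le \sum_{k=1}^{\infty} b_k = b.$$
$$s_n = \sum_{k=1}^{n} a_k \le \sum_{k=1}^{n} b_k$$
and that by Principle 3.1(ii), $\lim_{n\to\infty} s_n = \infty$. Hence $\sum_{k=1}^{\infty} b_k$ diverges also. •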
Consider the series $\sum_{k=2}^{\infty}\dfrac{1}{k^2\ln k}$. Since $\ln k \ge 1$ for $k \ge 3$, we have
$$0 < \frac{1}{k^2\ln k} \le \frac{1}{k^2}, \quad\text{for } k \ge 3.$$
Since $\sum_{k=3}^{\infty} 1/k^2$ is a convergent p-series by the integral test, we know that $\sum_{k=3}^{\infty}
1/(k^2\ln k)$ converges by the comparison test. Hence the given series, with the additional
term $1/(4\ln 2)$, converges also.
$$\frac{1}{(k + 1)^{5/6}} = \frac{1}{(k + 1)^{1/2}(k + 1)^{1/3}} \le \frac{1}{k^{1/2}(k + 1)^{1/3}},$$
the given series diverges by comparison with the divergent p-series for $p = \frac56$.
3D Absolute Convergence
If all terms of a series are nonpositive from some point on, we can simply multiply
the series by ( -1) and apply the tests of the previous subsection. For a series that
has infinitely many terms $a_k$ of both signs, we try if possible to show that the series
with kth term $|a_k|$ converges. If
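$$\sum_{k=1}^{\infty}\left|\frac{(-1)^k}{k^2}\right| = \sum_{k=1}^{\infty}\frac{1}{k^2}$$
$$\left|\frac{(-1)^k}{k\sqrt{k + 1}}\right| = \frac{1}{k\sqrt{k + 1}} \le \frac{1}{k^{3/2}},$$
and $\sum_{k=1}^{\infty}(1/k^{3/2})$ is a convergent p-series with $p = \frac32$.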
For series with nonnegative terms, convergence is the same as absolute conver-
gence, so a test that proves one proves the other also.
$$\sum_{k=0}^{\infty}\frac{1}{2^k} = 2, \quad\text{with } r = \frac12,$$
and
$$\sum_{k=0}^{\infty}\frac{(-1)^k}{2^k} = \frac23, \quad\text{with } r = -\frac12.$$
The following test applies to series with kth term $a_k \ne 0$ from some point on and
deals directly with absolute convergence.
3.9 Ratio Test. Let $\sum_{k=1}^{\infty} a_k$ be a series for which $\lim_{k\to\infty} |a_{k+1}|/|a_k|$ exists.
(i) If $\lim_{k\to\infty}\left|\dfrac{a_{k+1}}{a_k}\right| < 1$, the series converges absolutely.
(ii) If $\lim_{k\to\infty}\left|\dfrac{a_{k+1}}{a_k}\right| > 1$, or is infinite, the series fails to converge.
If the limit of the ratio $|a_{k+1}/a_k|$ of successive terms fails to exist or if the limit is
1, no assertion is being made about convergence of the series. Note also that if a
series of positive terms fails to converge absolutely then it fails to converge at all.
Proof. We'll assume $a_k > 0$ since we are concerned only with terms of the form
$|a_k|$ in the proof.
Case (i). Since the limit of $a_{k+1}/a_k$ is less than 1, there is a number $r < 1$ such
that $a_{k+1}/a_k \le r$ for all sufficiently large values of k, say $k \ge N$. Thus
$a_{k+1} \le ra_k$ for $k = N, N + 1, \ldots$. Hence
$$a_{N+k} \le ra_{N+k-1} \le r^2a_{N+k-2} \le \cdots \le r^ka_N,$$
$$\sum_{k=0}^{\infty} r^ka_N = a_N\sum_{k=0}^{\infty} r^k$$
EXAMPLE 13 The series $\sum_{k=1}^{\infty} k^2/2^k$ converges because, with $a_k = k^2/2^k$ and
$a_{k+1} = (k + 1)^2/2^{k+1}$, the ratio test gives
$$\lim_{k\to\infty}\frac{a_{k+1}}{a_k} = \lim_{k\to\infty}\left(1 + \frac{1}{k}\right)^2\cdot\frac12 = \frac12 < 1.$$
EXAMPLE 14 Consider the series $\sum_{k=0}^{\infty} x^k/k!$, where $k!$, k factorial, is $1\cdot2\cdot3\cdots k$ if $k \ge 1$, and
$0! = 1$. We have $a_k = x^k/k!$ and $a_{k+1} = x^{k+1}/(k + 1)!$, so
$$\lim_{k\to\infty}\left|\frac{a_{k+1}}{a_k}\right| = \lim_{k\to\infty}\left|\frac{x^{k+1}/(k + 1)!}{x^k/k!}\right| = \lim_{k\to\infty}\frac{k!\,|x|}{(k + 1)!} = \lim_{k\to\infty}\frac{|x|}{k + 1} = 0 < 1.$$
Hence the series of terms depending on x converges by case (i) of Test 3.9 for all
x. We have already seen that the series converges to $e^x$.
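The ratio $a_{k+1}/a_k = x/(k + 1)$ found in Example 14 also gives a convenient way to build the partial sums term by term. The following Python sketch is an added illustration, not the book's method; `exp_partial_sum` is a hypothetical name, and `math.exp` is used only as an independent reference value.

```python
import math


def exp_partial_sum(x: float, n: int) -> float:
    """Sum of x**k / k! for k = 0..n, built from the term ratio x / (k + 1)."""
    term = 1.0    # a_0 = 1
    total = 1.0
    for k in range(n):
        term *= x / (k + 1)   # a_{k+1} = a_k * x / (k + 1)
        total += term
    return total


for x in (3.0, -5.0):
    print(x, exp_partial_sum(x, 80), math.exp(x))
```

Because the ratio tends to 0 for every fixed x, the terms eventually shrink faster than any geometric sequence, which is why a modest number of terms suffices here.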
3E Alternating Series
An alternating series is one in which the terms are alternately positive and negative.
The alternating harmonic series is an example:
$$\sum_{k=1}^{\infty}(-1)^{k+1}\frac{1}{k} = 1 - \frac12 + \frac13 - \frac14 + \cdots.$$
Proof. Suppose $a_1 = p_1$, $a_2 = -p_2$, $a_3 = p_3$, and so on, with $p_k \ge 0$. Then the
partial sum $s_{2n} = \sum_{k=1}^{2n} a_k$ is
$$s_{2n} = (p_1 - p_2) + (p_3 - p_4) + \cdots + (p_{2n-1} - p_{2n}),$$
where the terms $p_{2k-1} - p_{2k}$ are all nonnegative, because $p_k = |a_k| \ge |a_{k+1}| = p_{k+1}$.
Hence $s_{2n}$ is nondecreasing as n increases. For the same reason, grouping the terms
differently shows that
$$s_{2n} = p_1 - (p_2 - p_3) - \cdots - (p_{2n-2} - p_{2n-1}) - p_{2n} \le p_1.$$
Hence the partial sums $s_{2n}$ are bounded above and nondecreasing, while the partial
sums $s_{2n+1} = p_1 - (p_2 - p_3) - \cdots - (p_{2n-2} - p_{2n-1}) - (p_{2n} - p_{2n+1})$ are nonincreasing.
By Principle 3.1 for bounded sequences
$$\lim_{n\to\infty} s_{2n} = \lim_{n\to\infty}\sum_{k=1}^{2n} a_k = s,$$
for some number $s$. But $s_{2n+1} = s_{2n} + a_{2n+1}$, so since $\lim_{n\to\infty} a_{2n+1} = 0$ by (ii),
$$\lim_{n\to\infty} s_{2n+1} = \lim_{n\to\infty}\sum_{k=1}^{2n+1} a_k = s$$
also. Hence all $s_n$ converge to $s$. Since $s_{2n} \le s \le s_{2n+1}$ it follows that if m is even
$s_m \le s \le s_{m+1}$ and if m is odd $s_{m+1} \le s \le s_m$. Thus $|s - s_m| \le |a_{m+1}|$. •
EXAMPLE The alternating harmonic series converges, because with $a_k = (-1)^{k+1}/k$, we have
(i) $\left|\dfrac{(-1)^{k+1}}{k}\right| > \left|\dfrac{(-1)^{k+2}}{k + 1}\right|$,
and
(ii) $\lim_{k\to\infty}\dfrac{(-1)^{k+1}}{k} = 0$.
Note that the alternating harmonic series fails to converge absolutely because $\sum_{k=1}^{\infty}
1/k$ is divergent.
The series
$$\sum_{k=2}^{\infty}\frac{(-1)^k}{\ln k} = \frac{1}{\ln 2} - \frac{1}{\ln 3} + \frac{1}{\ln 4} - \cdots$$
converges because (i) $1/\ln k \ge 1/\ln(k + 1)$ and (ii) $\lim_{k\to\infty} 1/\ln k = 0$. The Leibniz
test implies convergence.
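The error estimate $|s - s_m| \le |a_{m+1}|$ from the proof above is easy to check for the alternating harmonic series. The Python sketch below is an added illustration; the function name is hypothetical, and it assumes the standard fact, not established in this passage, that the series sums to $\ln 2$.

```python
import math


def alternating_harmonic(m: int) -> float:
    """Partial sum s_m = 1 - 1/2 + 1/3 - ... + (-1)**(m + 1) / m."""
    return sum((-1) ** (k + 1) / k for k in range(1, m + 1))


for m in (1, 10, 100, 1000):
    error = abs(math.log(2) - alternating_harmonic(m))
    bound = 1 / (m + 1)   # |a_{m+1}|, the first omitted term
    print(m, error, bound, error <= bound)
```

The error stays below the first omitted term, but it shrinks only like 1/m, so this conditionally convergent series is a poor way to compute $\ln 2$ in practice.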
EXERCISES
oo I
In Exercises I to 6, determine the convergence or diver- 5. I:-2-
gence of the infinite series by using the term test (for k=l k + l
divergence) or the integral test (for convergence or
In Exercises 7 to 12, detennine the convergence or
divergence). Show carefully how the test you use applies
divergence of the infinite series by using the comparison
in each case.
test or the ratio test. Show carefully how the test you
oo k2 00 k use applies in each case.
t.1:-kI 2. I:-2-
k=l k + l
2
k=l + 00 2k 00 1
+ kl)k + k)3/2
00
7. k"fl 3k+l 8. k"fl (k2
3. I: ke-k 4. I:
00 (
i
1
k=I k=I
10. ~ _l_
00
9. I:-2- /;;;1 n2"
n=2 n Inn
11 f- 1-·-
j3 + 1
• j=l
12.
j=l
I; j:
./T+T
2 33. (a)
(b)
Prove that
Prove that
I:t.. 1 kxk =
L~I kZ-k
x/(1 - x) 2, for lxl < I.
= 2.
In Exercises 13 to 18, determine whether the series In Exercises 34 to 41, determine the real values of x for
converges absolutely or not. For those alternating series which the series converges.
that fail to converge absolutely, try to apply the Leibniz oo I oo 1
test for convergence. 34. L -kxk 35. L 2 xk
k=l k=I k
n.
oo (-I)*
I: - -
oo (-J)*+l
14. I: - - 36 _ I: sin;x 00 1
37. I: ---r<x - 1)*
k=z k 2 Ink k=I k2 k=I k k=I 2
In Exercises 19 to 24, determine the convergence, abso- 42. (a) Show that t(p), which is defined by the p-series as
lute convergence, or divergence of the series. t(p) = L~I k-P, is decreasing asp increases, for
p > 1.
oo (-Il oo k (b) Show that (1-Z-P)s(p) = l+l/3P+I/5P+l/7P+
19. I: - - 20. I: - 3-
k=2 1n(l/ k) k=l k +I
00 (-2)k 00 kl (c) Show that (1 - 21-P)t(p) = 1 - I/2P + l/3P -
21. I: - 2 - 22. I: _ · l/4P + ...
k=I k + 1 k=l (2k)!
43. The function f(x) = rt is defined for > 0 by
E(1 + ~)
x
23.
k=I k,
z-k 24. E(-ll-k_!
k=I
-
(k+l)!
f(x) = exlnx _
(a) Use l'Hopital's rule to prove that limx-+O+ xx = I,
In Exercises 25 to 30, determine the convergence or and so conclude that limk-+oo (1/k/ 1/k) = I.
divergence of the series. (b) Prove that I: (!)
k=I k
l/k diverges.
00 k2 oo kk
25. I: ---r 26. I: - (c) Prove that I:~ 1 k(l/k) diverges.
k=l 3 k=l k!
44. Prove that if a is a real number and lbl > 1 then
27. f k! 28. I: (-l)k lim
na
-b = 0, by applying the ratio test to ""~ nab-".
k= I k ..;1nk
k=2 n-..oc- n ~n- 1
Section 4 Uniform Convergence 683
$$f(x) = \sum_{k=1}^{\infty} f_k(x) = \lim_{N\to\infty}\sum_{k=1}^{N} f_k(x).$$
This means that for each x in S there is a number $f(x)$ such that, given $\epsilon > 0$, there
is an integer K sufficiently large that
$$\left|\sum_{k=1}^{N} f_k(x) - f(x)\right| < \epsilon,$$
whenever $N \ge K$.
EXAMPLE The series $\sum_{k=0}^{\infty} x^k$ has for its (N + 1)st partial sum the finite sum
$$\sum_{k=0}^{N} x^k = \begin{cases}\dfrac{1 - x^{N+1}}{1 - x}, & x \ne 1,\\[4pt] N + 1, & x = 1.\end{cases}$$
Then
$$\sum_{k=0}^{\infty} x^k = \lim_{N\to\infty}\sum_{k=0}^{N} x^k = \frac{1}{1 - x}, \quad\text{for } -1 < x < 1.$$
For real values of x outside the interval (-1, 1), the series fails to converge.
The trigonometric series $\sum_{k=1}^{\infty}(\sin kx)/k^2$ converges pointwise for all real x.
The reason is that we can compare its terms with those of the convergent series
$\sum_{k=1}^{\infty} 1/k^2$, by observing that
$$\left|\frac{\sin kx}{k^2}\right| \le \frac{1}{k^2}, \quad k = 1, 2, \ldots.$$
The result is that the given series even converges absolutely.
An infinite series $\sum_{k=1}^{\infty} f_k(x)$ that converges for each x in a set S to a number
$f(x)$ defines a function f on S. However, in general we can conclude very little
about the properties of f from pointwise convergence alone. For this reason it's
sometimes helpful to consider a stronger form of convergence on S. We say that
$\sum_{k=1}^{\infty} f_k$ converges uniformly to a function f on a set S, if, given $\epsilon > 0$, there is
an integer K such that for all x in S and for all $N \ge K$,
$$\left|\sum_{k=1}^{N} f_k(x) - f(x)\right| < \epsilon.$$
The definition just given should be compared carefully with that of pointwise con-
vergence. Notice that uniform convergence implies pointwise convergence, but not
conversely. Roughly speaking, uniform convergence of a series of functions defined
on a set S means that the series converges with at least a certain minimum rate for
all points in S. A pointwise convergent series may have points at which the con-
vergence is increasingly slow. Figure 14.5 is a picture of uniform and nonuniform
convergence to the same function $f$; $s_N(x)$ and $t_N(x)$ are Nth partial sums of two
series.
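To determine that a series converges uniformly, we have the following.
4.1 Weierstrass Test. Let $\sum_{k=1}^{\infty} f_k$ be a series of real-valued functions defined
on a set S. If there is a constant series $\sum_{k=1}^{\infty} p_k$, such that
1. $|f_k(x)| \le p_k$ for all x in S and for k = 1, 2, ...,
2. $\sum_{k=1}^{\infty} p_k$ converges,
then $\sum_{k=1}^{\infty} f_k$ converges uniformly on S.
FIGURE 14.5 (a) uniform, (b) nonuniform
Proof. By the comparison test the series converges for each x in S to a number $f(x)$, and
$$f(x) - \sum_{k=1}^{N} f_k(x) = \sum_{k=1}^{\infty} f_k(x) - \sum_{k=1}^{N} f_k(x) = \sum_{k=N+1}^{\infty} f_k(x).$$
It follows that
$$\left|f(x) - \sum_{k=1}^{N} f_k(x)\right| \le \sum_{k=N+1}^{\infty} |f_k(x)| \le \sum_{k=N+1}^{\infty} p_k.$$
Since $\sum_{k=1}^{\infty} p_k$ converges, we can, given $\epsilon > 0$, find a K such that $\sum_{k=N}^{\infty} p_k < \epsilon$
if $N > K$. This completes the proof, because the number K depends only on $\epsilon$ and
not on x. •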
The trigonometric series $\sum_{k=1}^{\infty}(\sin kx)/k^2$ converges uniformly for all real x, because
$|(\sin kx)/k^2| \le 1/k^2$ and $\sum_{k=1}^{\infty} 1/k^2$ converges. However, the power series $\sum_{k=0}^{\infty} x^k$,
while it converges pointwise for $-1 < x < 1$, fails to converge uniformly on (-1, 1). See Exercise 6.
The Weierstrass test applies on symmetric closed subintervals $[-r, r]$ with $0 < r < 1$
by observing that $|x^k| \le r^k$ for x on $[-r, r]$ and that $\sum_{k=0}^{\infty} r^k$ converges if $0 \le r < 1$.
Hence the power series converges pointwise on (-1, 1) and uniformly on $[-r, r]$ for
any $r < 1$.
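The contrast between $[-r, r]$ and $(-1, 1)$ can be made concrete with the partial-sum formula $(1 - x^{N+1})/(1 - x)$: the error after the terms through $x^N$ is $|x|^{N+1}/(1 - x)$, which is at most $r^{N+1}/(1 - r)$ on $[-r, r]$ but is unbounded as x approaches 1. The Python sketch below is an added illustration with hypothetical function names; it samples the error on a finite grid, so it only suggests the supremum rather than proving it.

```python
def geometric_error(x: float, n: int) -> float:
    """|1/(1 - x) - sum_{k=0}^{n} x**k| for -1 < x < 1."""
    partial = sum(x ** k for k in range(n + 1))
    return abs(1.0 / (1.0 - x) - partial)


def worst_error_on_grid(r: float, n: int, points: int = 201) -> float:
    """Largest sampled error on an evenly spaced grid covering [-r, r]."""
    xs = [-r + 2 * r * i / (points - 1) for i in range(points)]
    return max(geometric_error(x, n) for x in xs)


n = 20
# On [-1/2, 1/2] the worst case is at x = 1/2 and equals (1/2)**20.
print(worst_error_on_grid(0.5, n), 0.5 ** 20)
# Near x = 1 the same number of terms gives a very large error.
print(geometric_error(0.999, n))
```

One value of N therefore works for every x in $[-r, r]$ at once, while no single N can control the error on all of (-1, 1).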
The next four theorems are about uniformly convergent series of functions. They
all assert that certain limit operations are interchangeable with the summing of a
series, provided that certain series converges uniformly. If uniform convergence is
replaced by pointwise convergence, then the resulting statements fail to hold in
general. See Exercise 9.
4.2 Theorem. Let $f_1, f_2, f_3, \ldots$ be a sequence of functions defined on a set S
in $\mathbb{R}^n$. Suppose $x_0$ is a limit point of S, and suppose that the limit
$$\lim_{x\to x_0} f_k(x)$$
exists for k = 1, 2, .... Then
$$\lim_{x\to x_0}\sum_{k=1}^{\infty} f_k(x) = \sum_{k=1}^{\infty}\lim_{x\to x_0} f_k(x),$$
provided the series of numbers on the right converges and the series on the left
converges uniformly on S.
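Proof. Let $\lim_{x\to x_0} f_k(x) = a_k$. Then adding and subtracting $\sum_{k=1}^{N} f_k(x)$ and
$\sum_{k=1}^{N} a_k$, we get
$$\left|\sum_{k=1}^{\infty} f_k(x) - \sum_{k=1}^{\infty} a_k\right| \le \left|\sum_{k=1}^{\infty} f_k(x) - \sum_{k=1}^{N} f_k(x)\right| + \left|\sum_{k=1}^{N} f_k(x) - \sum_{k=1}^{N} a_k\right| + \left|\sum_{k=1}^{N} a_k - \sum_{k=1}^{\infty} a_k\right|. \quad (1)$$
Now let $\epsilon > 0$. Since $\sum_{k=1}^{\infty} f_k$ converges uniformly, we can choose K such that
$N > K$ implies
$$\left|\sum_{k=1}^{\infty} f_k(x) - \sum_{k=1}^{N} f_k(x)\right| < \frac{\epsilon}{3}, \quad\text{for all x in S.}$$
Since $\sum_{k=1}^{\infty} a_k$ converges, we can also choose N large enough that
$$\left|\sum_{k=1}^{N} a_k - \sum_{k=1}^{\infty} a_k\right| < \frac{\epsilon}{3}.$$
Finally, pick $\delta > 0$ so that $|x - x_0| < \delta$ implies, via the relation
$$\lim_{x\to x_0}\sum_{k=1}^{N} f_k(x) = \sum_{k=1}^{N} a_k, \quad\text{that}\quad \left|\sum_{k=1}^{N} f_k(x) - \sum_{k=1}^{N} a_k\right| < \frac{\epsilon}{3}.$$
Then for x satisfying $|x - x_0| < \delta$, the left side of equation (1) is less than $\epsilon$. •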
$$\sum_{k=1}^{\infty}\int_a^b f_k(x)\,dx = \int_a^b\left[\sum_{k=1}^{\infty} f_k(x)\right]dx.$$
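Proof. By Theorem 4.3 the function $\sum_{k=1}^{\infty} f_k(x)$ is continuous on $[a, b]$ and so is
integrable there. We have
$$\int_a^b\left[\sum_{k=1}^{\infty} f_k(x)\right]dx - \sum_{k=1}^{N}\int_a^b f_k(x)\,dx = \int_a^b\sum_{k=N+1}^{\infty} f_k(x)\,dx. \quad (2)$$
Given $\epsilon > 0$, uniform convergence lets us choose K such that $N > K$ implies
$$\left|\sum_{k=N+1}^{\infty} f_k(x)\right| < \epsilon(b - a)^{-1}, \quad\text{for all x in } [a, b].$$
Since
$$\left|\int_a^b g(x)\,dx\right| \le (b - a)\max_{a\le x\le b}|g(x)|,$$
the right side of equation (2) is less than $\epsilon$ in absolute value.
Thus the left side of equation (2) is less than $\epsilon$ in absolute value for $N > K$. •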
The interchange of differentiation with the summing of a series requires somewhat
more in the way of hypotheses than did the previous theorem on integration.
$$\sum_{k=1}^{N}[f_k(x) - f_k(a)] = \sum_{k=1}^{N}\int_a^x f_k'(t)\,dt = \int_a^x\left[\sum_{k=1}^{N} f_k'(t)\right]dt.$$
Using pointwise convergence on the left and uniform convergence on the right to
justify letting N tend to infinity in Theorem 4.4, we get, with $\sum_{k=1}^{\infty} f_k(x) = f(x)$,
$$f(x) - f(a) = \int_a^x\left[\sum_{k=1}^{\infty} f_k'(t)\right]dt.$$
Differentiation of both sides of the last equation gives $f'(x) = \sum_{k=1}^{\infty} f_k'(x)$, which is
the conclusion of the theorem. •
EXAMPLE Consider the trigonometric series
$$\sum_{k=1}^{\infty}\frac{\sin kx}{k^4}.$$
The series converges absolutely for all real x, because the terms are dominated by
$k^{-4}$. Furthermore, the series of derivatives of the terms of the given series is
$$\sum_{k=1}^{\infty}\frac{\cos kx}{k^3}.$$
Similarly, this series converges uniformly for all x by the Weierstrass test, because
its terms are dominated by $k^{-3}$. The same reasoning applies to the series of second derivatives, so
$$\frac{d^2}{dx^2}\sum_{k=1}^{\infty}\frac{\sin kx}{k^4} = -\sum_{k=1}^{\infty}\frac{\sin kx}{k^2}.$$
EXERCISES
1. Show that the series $\sum_{k=0}^{\infty} x^k$ converges uniformly for $-d \le x \le d$ if $0 < d < 1$.
2. (a) Show that the trigonometric series $\sum_{k=1}^{\infty}(\cos kx/k^2)$ converges uniformly for all real x.
(b) Prove that the series of part (a) defines a continuous function for all real x.
3. Show that if a trigonometric series $a_0 + \sum_{k=1}^{\infty}(a_k\cos kx + b_k\sin kx)$ converges
uniformly on $[-\pi, \pi]$, then it converges uniformly for all real x.
4. (a) Show that if $|c_k| < B$ for some fixed number B, then the series
$$u(x, t) = \sum_{k=1}^{\infty} c_ke^{-k^2t}\sin kx$$
is a solution of $u_{xx} = u_t$ satisfying $u(0, t) = u(\pi, t) = 0$ when $t > 0$ and x is in $[0, \pi]$.
[Hint: For arbitrary $\delta > 0$, apply Theorem 4.5 with $t \ge \delta$.]
(b) Show that, if $u(x, t)$ in part (a) is defined for $t = 0$ by a series convergent for each x,
then $u(x, t)$ is continuous on the set S in $\mathbb{R}^2$ defined by $0 \le t$, $0 \le x \le \pi$.
(c) Show that the function $u(x, t)$ is infinitely often differentiable with respect to both
x and t, for $t > 0$.
5. Show that if a trigonometric series as displayed in Exercise 3 satisfies the conditions
$|a_k| \le A/k^2$, $|b_k| \le B/k^2$, for k = 1, 2, 3, ... and fixed constants A and B, then the
series converges uniformly for all real x.
6. By considering the partial sums of the power series $\sum_{k=0}^{\infty} x^k$ for $-1 < x < 1$, show
that the series fails to converge uniformly on (-1, 1). Show uniform convergence for
$-\frac12 \le x \le \frac12$.
*7. Show that $\sum_{k=1}^{\infty}(-1)^k(1 - x)x^k$ converges uniformly on (0, 1], but that
$\sum_{k=1}^{\infty}(1 - x)x^k$ converges only pointwise on (0, 1]. [Hint: For the first part use the
error estimate in Theorem 3.10.]
8. (a) Assume that the series $\sum_{k=1}^{\infty} k^2a_k$ and $\sum_{k=1}^{\infty} k^2b_k$ both converge absolutely.
Show that
$$w(x, t) = \sum_{k=1}^{\infty}\sin kx\,(a_k\cos kat + b_k\sin kat)$$
is a solution of the 1-dimensional wave equation $a^2w_{xx} = w_{tt}$. [Hint: Use the
Weierstrass test and Theorem 4.5.]
(b) Show that the solution $w(x, t)$ of part (a) satisfies the boundary conditions
$w(0, t) = w(\pi, t) = 0$ for $t \ge 0$ and an initial condition $w(x, 0) = h(x)$, where
h is twice continuously differentiable.
[Figure: panels (a) and (b), each showing an interval of convergence with divergence on either side; the second panel is marked R = 2 with ticks at -1, 0, 3.]
Section 5A Power Series 689
$$\sum_{k=1}^{\infty}\frac{1}{k2^k}x^k$$
for absolute convergence by using the ratio test. The kth term is $a_k = 2^{-k}x^k/k$, so
$$\lim_{k\to\infty}\left|\frac{a_{k+1}}{a_k}\right| = \lim_{k\to\infty}\left|\frac{2^{-k-1}x^{k+1}/(k + 1)}{2^{-k}x^k/k}\right| = \lim_{k\to\infty}\frac{k}{k + 1}\cdot\frac{|x|}{2} = \frac{|x|}{2}.$$
By the ratio test, the series converges absolutely when $\frac12|x| < 1$, that is, when
$|x| < 2$, and diverges when $\frac12|x| > 1$, that is, when $|x| > 2$. Thus the interval of
convergence has radius $R = 2$ and is centered at $x = 0$. Because the ratio test gives
no information when the limit, in this case $|x|/2$, is equal to 1, we have to check
that case separately. The points satisfying $|x|/2 = 1$ are just the points $x = 2$ and
$x = -2$. These points are the endpoints of the interval of convergence, and direct
substitution into the series shows that at $x = 2$ we have $\sum_{k=1}^{\infty} 1/k$, which diverges;
at $x = -2$ we have $\sum_{k=1}^{\infty}(-1)^k/k$, which is a convergent alternating series. Thus
the precise interval of convergence is $-2 \le x < 2$.
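The three kinds of behavior in this example (an interior point, the divergent endpoint, the convergent endpoint) can be observed numerically. The Python sketch below is an added illustration with a hypothetical function name. It assumes the closed form $-\ln(1 - x/2)$ for the sum at interior points, which follows from the expansion of $-\ln(1 - x)$ with x replaced by x/2, and the value $-\ln 2$ at $x = -2$, which is a standard fact not proved here.

```python
import math


def partial_sum(x: float, n: int) -> float:
    """Sum of x**k / (k * 2**k) for k = 1..n."""
    return sum((x / 2) ** k / k for k in range(1, n + 1))


# Interior point x = 1: rapid convergence to -ln(1 - 1/2) = ln 2.
print(partial_sum(1.0, 60), math.log(2))

# Endpoint x = 2: the terms are 1/k, so the partial sums keep growing.
print(partial_sum(2.0, 100), partial_sum(2.0, 10_000))

# Endpoint x = -2: alternating series, slowly approaching -ln 2.
print(partial_sum(-2.0, 10_000), -math.log(2))
```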
The series
$$\sum_{k=0}^{\infty}\frac{x^k}{k!}$$
is the Taylor expansion of the function $e^x$, and we showed in Section 2 that the
series converges, even absolutely, to ex for all real values of x. Hence we see that
the interval of convergence is -oo < x < oo, and it's customary to say that the
radius of convergence is R = oo.
The series with kth term $a_k = x^{2k}/3^k(k + 2)$ satisfies
$$\lim_{k\to\infty}\left|\frac{a_{k+1}}{a_k}\right| = \lim_{k\to\infty}\left|\frac{x^{2k+2}/3^{k+1}(k + 3)}{x^{2k}/3^k(k + 2)}\right| = \lim_{k\to\infty}|x|^2\frac{k + 2}{3(k + 3)} = \frac{x^2}{3}.$$
By the ratio test, we have convergence for $|x| < \sqrt3$ and divergence for $|x| > \sqrt3$.
A separate check shows divergence for $x = \pm\sqrt3$.
is valid for $-1 < x < 1$. In the interior of the interval of convergence, that is for
$-1 < x < 1$, we integrate both sides and include a constant of integration to get
$$-\ln(1 - x) = c + \sum_{k=0}^{\infty}\frac{x^{k+1}}{k + 1} = c + x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots.$$
Setting $x = 0$ gives
$$0 = -\ln 1 = 0 + c,$$
so
$$-\ln(1 - x) = \sum_{k=1}^{\infty}\frac{x^k}{k} = x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots, \quad -1 < x < 1.$$
$$\cos x = \sum_{k=0}^{\infty}\frac{(-1)^k}{(2k + 1)!}(2k + 1)x^{2k} = \sum_{k=0}^{\infty}\frac{(-1)^k}{(2k)!}x^{2k} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots.$$
This is just the Taylor expansion of cos x. The theorem that justifies the preceding
computations is as follows.
Proof. The case of an expansion about a point other than $a = 0$ follows from a
simple change of variable, as in Exercises 28 and 29. We prove first the part about
integration, under the assumption that the series
$$f(x) = \sum_{k=0}^{\infty} a_kx^k$$
represents the function $f(x)$ in the interval $|x| < R$. Assuming $-R < x_1 < R$ we'll
prove that
$$\int_0^{x_1} f(x)\,dx = \sum_{k=0}^{\infty} a_k\int_0^{x_1} x^k\,dx = \sum_{k=0}^{\infty} a_k\frac{x_1^{k+1}}{k + 1}.$$
To use Theorem 4.4, we need to verify that the given power series converges
uniformly on the interval between 0 and $x_1$. But if we choose s and r so that
$R > s > r > |x_1|$, then the series converges at $x = s$. Hence its terms tend to zero
and so are bounded in absolute value by some number m: $|a_ks^k| \le m$. Then for
$|x| \le r < s$, we have
$$|a_kx^k| \le |a_k|r^k = |a_ks^k|\left(\frac{r}{s}\right)^k \le m\left(\frac{r}{s}\right)^k.$$
The series with kth term $m(r/s)^k$ is a convergent geometric series, because $0 < r/s < 1$.
This shows by Theorems 4.1 and 4.3 that the given series converges uniformly on
$(-r, r)$ to a continuous function, which is necessarily $f(x)$ because the series is assumed
to converge to $f(x)$ at each point x. Because r is a number such that $0 < r < R$, we
can include an arbitrary x in $(-R, R)$ in an interval of uniform convergence, so we can
integrate term by term on all intervals contained in $[0, x_1]$.
For the differentiation part of the theorem, we start with the same series for $f(x)$
and show that
$$f'(x) = \sum_{k=1}^{\infty} ka_kx^{k-1}.$$
To do this, we apply Theorem 4.5 by showing that the differentiated series converges
uniformly on every interval $-r \le x \le r$, where $0 < r < R$. Choose a number c
such that $r < c < R$. Since $\sum_{k=0}^{\infty} a_kc^k$ converges, there is a number b such that
$|a_k|c^k \le b$. Then for x in $[-r, r]$ we have
$$|ka_kx^{k-1}| \le k|a_k|r^{k-1} \le k\frac{b}{c}\left(\frac{r}{c}\right)^{k-1}.$$
The series with kth term $k(b/c)(r/c)^{k-1}$ converges, by the ratio test, because $0 <
r/c < 1$. Hence the series with kth term $ka_kx^{k-1}$ converges uniformly on $[-r, r]$
by the Weierstrass test. This allows us to differentiate term by term on all intervals
$[-r, r]$ with $0 < r < R$. Thus we can differentiate at each x such that $-R < x < R$,
simply by choosing r so that $|x| < r < R$. •
Theorem 5.1 allows us to differentiate and integrate a power series repeatedly,
because the result of performing one such operation on a power series convergent
when $|x - a| < R$ is just another power series that is also convergent when
$|x - a| < R$.
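EXAMPLE 7 Starting with
$$\frac{1}{1 - x} = \sum_{k=0}^{\infty} x^k = 1 + x + x^2 + \cdots, \quad |x| < 1,$$
we differentiate term by term to get
$$\frac{1}{(1 - x)^2} = \sum_{k=1}^{\infty} kx^{k-1} = 1 + 2x + 3x^2 + \cdots, \quad |x| < 1.$$
Differentiating once more gives
$$\frac{2}{(1 - x)^3} = \sum_{k=2}^{\infty}(k - 1)kx^{k-2} = 1\cdot2 + 2\cdot3x + 3\cdot4x^2 + \cdots, \quad |x| < 1.$$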
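Term-by-term differentiation of the geometric series can be sanity-checked numerically against the closed forms $1/(1 - x)^2$ and $2/(1 - x)^3$. The Python sketch below is an added illustration; the function names are hypothetical and the truncation at 200 terms is an arbitrary choice that is ample for the sample points used.

```python
def first_derivative_series(x: float, n: int) -> float:
    """Partial sum of k * x**(k - 1) for k = 1..n."""
    return sum(k * x ** (k - 1) for k in range(1, n + 1))


def second_derivative_series(x: float, n: int) -> float:
    """Partial sum of (k - 1) * k * x**(k - 2) for k = 2..n."""
    return sum((k - 1) * k * x ** (k - 2) for k in range(2, n + 1))


for x in (0.3, -0.5):
    print(x, first_derivative_series(x, 200), 1 / (1 - x) ** 2)
    print(x, second_derivative_series(x, 200), 2 / (1 - x) ** 3)
```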
5.2 Theorem. If
$$f(x) = \sum_{k=0}^{\infty} a_k(x - a)^k$$
on some interval $|x - a| < R$, then the series is the Taylor series of f about $x = a$.
That is,
$$a_k = \frac{f^{(k)}(a)}{k!}, \quad k = 0, 1, 2, \ldots.$$
Now set $x = a$ on both sides. All terms on the right become zero except the first, so
$$f^{(n)}(a) = n!\,a_n,$$
which is what we wanted to show. •
5C Finding Limits by Using Series
A convergent Taylor expansion
$$f(x) = \sum_{k=0}^{\infty} a_k(x - a)^k$$
represents a continuous function near $x = a$, so $\lim_{x\to a} f(x) = a_0$.
This idea applies to calculation of fairly complicated limits, as the following example
shows.
To show that
$$\lim_{x\to0}\frac{x - \sin x}{x^3} = \frac16$$
we use the Taylor expansion of $\sin x$ to write
$$(x - \sin x)/x^3 = \frac16 - \frac{1}{120}x^2 + \cdots.$$
Taking the limit as $x \to 0$ amounts to setting $x = 0$ in the continuous function
represented by the series. The resulting limit is $\frac16$.
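The series form of the quotient is also the numerically safer way to evaluate it near 0, since the direct formula subtracts two nearly equal numbers. The Python sketch below is an added illustration with hypothetical function names; it uses the expansion $(x - \sin x)/x^3 = \sum_{j \ge 0} (-1)^j x^{2j}/(2j + 3)!$, obtained by dividing the Taylor series of $x - \sin x$ by $x^3$.

```python
import math


def quotient_direct(x: float) -> float:
    """(x - sin x) / x**3 evaluated literally; undefined at x = 0."""
    return (x - math.sin(x)) / x ** 3


def quotient_series(x: float, terms: int = 4) -> float:
    """First `terms` terms of 1/3! - x**2/5! + x**4/7! - ..."""
    return sum((-1) ** j * x ** (2 * j) / math.factorial(2 * j + 3)
               for j in range(terms))


for x in (0.5, 0.1, 0.01):
    print(x, quotient_direct(x), quotient_series(x))

# Setting x = 0 in the series gives the limit 1/6 directly.
print(quotient_series(0.0), 1 / 6)
```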
5D Products and Quotients
We can multiply and divide power series very much like polynomials. The product
of two power series about the same point gives a third series called their Cauchy
product, which is simply the series formed by collecting equal powers of x:
$$\left(\sum_{k=0}^{\infty} a_kx^k\right)\left(\sum_{k=0}^{\infty} b_kx^k\right) = \sum_{k=0}^{\infty}(a_0b_k + a_1b_{k-1} + \cdots + a_kb_0)x^k.$$
A theorem, which we do not prove here, states that the Cauchy product converges to the
correct value in the interior of the common interval of convergence of the two factors.
(a)
$$(1 + x)(1 + x + x^2 + x^3 + \cdots) = 1 + 2x + 2x^2 + 2x^3 + \cdots,$$
(b)
$$(1 + x + x^2 + x^3 + \cdots)\left(x + \tfrac12x^2 + \tfrac13x^3 + \cdots\right) = x + \left(1 + \tfrac12\right)x^2 + \left(1 + \tfrac12 + \tfrac13\right)x^3 + \cdots = x + \tfrac32x^2 + \tfrac{11}{6}x^3 + \cdots,$$
which is valid for $-1 < x < 1$.
(c) Multiplying together the power series for $1/x$ and $\ln x$ about $x = 1$, we get
$$\left(1 - (x - 1) + (x - 1)^2 - (x - 1)^3 + \cdots\right)\times\left((x - 1) - \tfrac12(x - 1)^2 + \tfrac13(x - 1)^3 - \cdots\right) = (x - 1) - \tfrac32(x - 1)^2 + \tfrac{11}{6}(x - 1)^3 - \cdots.$$
Examples (b) and (c) show that it may be impossible to find a simple formula for
the kth coefficient in a power series.
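$$\frac{c_0 + c_1x + c_2x^2 + \cdots}{b_0 + b_1x + b_2x^2 + \cdots} = a_0 + a_1x + a_2x^2 + \cdots.$$
Multiplying both sides by the denominator and equating coefficients of like powers of x
gives the equations
$$c_0 = a_0b_0, \quad c_1 = a_0b_1 + a_1b_0, \quad c_2 = a_0b_2 + a_1b_1 + a_2b_0, \quad\ldots,$$
which we solve successively
for the desired values of the $a_k$. You can carry this process out as far as you like,
because the solution of each equation depends only on the solution of the previous
equations in the list. For this reason you can also truncate the series in the numerator
and denominator and use polynomial long division to compute a preassigned number
of terms.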
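To compute the first few terms in the expansion of $(1 + e^x)^{-1}$, note that $c_0 = 1$, $c_1 =
c_2 = \cdots = 0$. Also $b_0 = 2$, $b_k = 1/k!$, $k = 1, 2, 3, \ldots$. We then solve
$$2a_0 = 1, \quad a_0 + 2a_1 = 0, \quad \tfrac12a_0 + a_1 + 2a_2 = 0, \quad \tfrac16a_0 + \tfrac12a_1 + a_2 + 2a_3 = 0,$$
to get $a_0 = \frac12$, $a_1 = -\frac14$, $a_2 = 0$, $a_3 = \frac{1}{48}$.
To then find the first three terms in the expansion of $\sin x/(1 + e^x)$, we compute
$$\left(x - \tfrac16x^3 + \cdots\right)\left(\tfrac12 - \tfrac14x + \tfrac{1}{48}x^3 + \cdots\right) = \tfrac12x - \tfrac14x^2 - \tfrac{1}{12}x^3 + \cdots.$$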
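The successive solution of the coefficient equations is a short loop when written out. The following Python sketch is an added illustration, not the book's notation; the function names `divide_series` and `multiply_series` are hypothetical, and exact rational arithmetic from the standard `fractions` module is used so the coefficients can be compared with the hand computation for $(1 + e^x)^{-1}$ and $\sin x/(1 + e^x)$.

```python
from fractions import Fraction
from math import factorial


def divide_series(c, b, n_terms):
    """Coefficients a_0..a_{n_terms-1} of (sum c_k x^k) / (sum b_k x^k), b[0] != 0."""
    a = []
    for k in range(n_terms):
        # c_k = a_0 b_k + a_1 b_{k-1} + ... + a_k b_0, solved for a_k.
        known = sum(a[i] * b[k - i] for i in range(k))
        a.append((c[k] - known) / b[0])
    return a


def multiply_series(p, q, n_terms):
    """First n_terms coefficients of the Cauchy product of p and q."""
    return [sum(p[i] * q[k - i] for i in range(k + 1)) for k in range(n_terms)]


N = 4
numerator = [Fraction(1)] + [Fraction(0)] * (N - 1)            # the constant 1
denominator = [Fraction(2)] + [Fraction(1, factorial(k)) for k in range(1, N)]  # 1 + e^x
reciprocal = divide_series(numerator, denominator, N)
sine = [Fraction(0), Fraction(1), Fraction(0), Fraction(-1, 6)]  # x - x^3/6
product = multiply_series(sine, reciprocal, N)

print(reciprocal)
print(product)
```

Each new coefficient uses only those already found, which is the reason the process can be stopped after any preassigned number of terms.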
EXERCISES
Using the ratio test, or by other means, in Exercises In Exercises 15 to 18, use the Taylor expansion ln(l -
1 to 8 find the interval of convergence of the power
series. In case the interval has finite endpoints, determine x) =- E!xk
k
k=I
to derive the Taylor expansion, valid
whether the series converges when x is equal to each of for Ix I < 1, for the function.
the endpoints, and sketch the interval.
+ x) !2 ln ( 11 +-xx)
1. f: !xk
k
k=I
2. E
00
k=I
1
- (x
k
-2/
15. ln(l 16.
k=O !(2k)
-x2k 4. E k 2 <x + 1/
k=I
In Exercises 19 to 22, use the Taylor expansion
k 1
(l-x)- 1 = L~oxk to derive the Taylor expansion for
00 00
5. E --<x +2/ 6. E ~---,,--xk
k=I k + 1 k=O.Jk2+1 the function about the point a.
00 l 00
,. E -<x + 3)2k+1 s. E zkx2k 19.
1
+ x 2 , about a = 0
1
20. - -2 , about a =0
k=I y'k k=O 1 1-x
In Exercises 9 to 14, use the Taylor expansion ex = 1
21. -, about a =1
X
22. - - . about a= 0
d E_!_k! xk to derive the Taylor expansions about the
k=O
X 1-x
point a = 0 for the function. 23. Use the relation d(arctanx)/dx = 1/(1 + x 2 ) and the
result of Exercise 19 to derive the Taylor expansion of
10. (e-' + e-x)/2 = cosh x arctan x about a = 0.
11. (ex - e-x)/2 = sinhx 12. xex
2
24. Use the relation d(l + x 2 )- 1 /dx == -2x/(l + x 2 ) 2 and
n. c 14. e5x the result of Exercise 19 to derive the Taylor expansion
of (1 + x 2)-2 about a = 0.
25. Use the Taylor expansion (1 - x)- 1 = I:~ 0 xk to from the formula
prove that
)xii< R.
27. Let a be a real number. In Exercises 31 to 34, use Taylor expansions to find the
(a) Prove that, if j(x) = (1 + x)a, then j(k)(0) limit.
a(a -1) ··· (a -k + I), so that the Taylor expansion 2 5
1. Jim ln(l + x ) ln(l - x )
of (1 + x)a about O is 3 x--->0 x2 32. J~ xS
·m x + In( I - x) cos x - I + x 2/2
~ a(a - 1) .. · (a - k + I) k
33. ll 34. Jim
x->0 x2 x--->0 x4
(I +xt =I+~ k! X •
k=I 35. Find the Taylor expansion of f (x) = 1/(x + c) about
X = 0 for C-/= 0.
(b) Write out the first four terms of the expansion in part 36. Find the Taylor expansion of g(x) = 1/[(x + c)(x + d)]
(a) for a = 3, a = -3, and a = ½- about x = 0, for c -/= 0, d -/= 0, and c -/= d, by expressing
28. Derive the formula g as a sum of two fractions.
37. Prove that, if f (x) = L~o CkXk converges in some
Ix _ al < R, interval, then
00
d 00 k) 00 k-1
dx (
Lakx = Lkakx , lxl < R. in the interior of the same interval.
k=O k=l 38. Use Theorem 5.2 to prove that, if two power series
29. Derive the formula 00
Lak(X - al and
k=O
y' =y2,
We can treat higher-order equations, for example, y" = J(x, y, y'), in a way
similar to what we used in the previous example.
y" = yy'
given that y (0) = 1 and y' (0) = -1. First compute from the given equation some
formulas for higher derivatives. Then simplify by substituting the given values y(0) =
1, y' (0) = -1. We find
=
y" yy'; y"(0) -1,=
y"' = yy" + (y')2 = y2y' + (y')2; y"'(0) o, =
yC4l = 2y(y')2 + y2y'' + 2y'y" = 4y(y')2 + y3y'; yC4l(0) = 3.
The first five terms of the Taylor expansion of y(x) about x = 0 then add up to
y' (0) y" (0) 2 y"' (0) 3 yC4l (0) 4 1 2 I 4
y(O)+--x+--x + - - x + - - x = 1-x - - x + -x .
1! 2! 3! 4! 2 8
In this example there doesn't seem to be a simple coefficient pattern, but we could
compute as many terms as we had time and space for.
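Computing further terms by hand quickly becomes tedious, but the same coefficients can be generated mechanically. The Python sketch below is an added illustration and uses a different route from the repeated differentiation above: it substitutes $y = \sum c_kx^k$ into $y'' = yy'$ and matches the coefficient of $x^n$ on both sides, using the Cauchy product for $yy'$. The function name is hypothetical.

```python
from fractions import Fraction


def taylor_coeffs_yy_prime(y0, y1, n_terms):
    """Taylor coefficients c_0..c_{n_terms-1} about x = 0 for y'' = y*y'."""
    c = [Fraction(y0), Fraction(y1)]
    for n in range(n_terms - 2):
        # Coefficient of x^n in y*y': sum over i of c_i * (n - i + 1) * c_{n-i+1}.
        rhs = sum(c[i] * (n - i + 1) * c[n - i + 1] for i in range(n + 1))
        # Coefficient of x^n in y'' is (n + 2)(n + 1) c_{n+2}.
        c.append(rhs / ((n + 2) * (n + 1)))
    return c


# Initial values y(0) = 1, y'(0) = -1 as in the example.
print(taylor_coeffs_yy_prime(1, -1, 8))
```

The first five coefficients reproduce $1 - x - \frac12x^2 + 0\cdot x^3 + \frac18x^4$; later ones come from the same loop at no extra effort.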
$$y'' = xy' - y$$
$$y = c_0\left(1 - \tfrac12x^2 - \tfrac{1}{24}x^4 - \cdots\right) + c_1x,$$
For comparison, note that if the original differential equation were replaced by
$$y'' = -y$$
then solutions would have the form
$$y = c_0\cos x + c_1\sin x.$$
4. $y'' = xy$, $y(0) = 1$, $y'(0) = 0$
5. Find the first four nonzero terms in the Taylor expansion of $y = y(x)$ about $x = 0$ if
$y'' = yy'$ and $y(0) = y'(0) = 1$.
8. Show that if $y''' = y^2y'$ and $y(0) = y'(0) = y''(0) = 1$, then
$$y = 1 + x + \frac12x^2 + \frac16x^3 + \frac18x^4 + \frac{3}{40}x^5 + \cdots.$$
SECTION 7 POWER SERIES SOLUTIONS
The solutions of many of the differential equations we have studied may be represented
in terms of their Taylor expansions. For example, polynomials, the elementary
transcendental functions $\cos x$, $\sin x$, $e^x$, and linear combinations of all these have
Taylor expansions that are valid for all x. Beyond these examples there is a large and
important class of differential equations that has solutions representable by power
series, and even if the solution so represented is not a combination of elementary
functions at all, the infinite series expansion may serve to define a new function
nearly as important as some of the more familiar ones. Furthermore, the partial sums
of a series expansion often give useful approximations to the true solution.
Recall that the Taylor expansion of a function f has the form
$$\sum_{k=0}^{\infty}\frac{f^{(k)}(x_0)}{k!}(x - x_0)^k = f(x_0) + \frac{1}{1!}f'(x_0)(x - x_0) + \frac{1}{2!}f''(x_0)(x - x_0)^2 + \cdots.$$
If such a series converges anywhere but at x = xo then it's absolutely convergent
in an interval xo - R < x < xo + R that is symmetric about xo, and within such
an interval we can treat such series very much like the polynomials in x that arise
as special cases. In particular we can add and multiply Taylor expansions about the
same point x 0 , and a Taylor expansion is differentiable or integrable term by term to
produce the Taylor expansion of the derivative or integral of the expanded function
within the interval of convergence. A function f is called analytic if its Taylor series
converges to f (x) for all x in such an interval.
$$\frac{d}{dx}\cos 2x = \frac{d}{dx}\sum_{k=0}^{\infty}\frac{(-1)^k(2x)^{2k}}{(2k)!}
= \sum_{k=0}^{\infty}\frac{d}{dx}\frac{(-1)^k(2x)^{2k}}{(2k)!}
= \sum_{k=1}^{\infty}\frac{(-1)^k(2k)(2x)^{2k-1}(2)}{(2k)!}$$
$$= 2\sum_{k=1}^{\infty}\frac{(-1)^k(2x)^{2k-1}}{(2k - 1)!}
= -2\sum_{k=0}^{\infty}\frac{(-1)^k(2x)^{2k+1}}{(2k + 1)!} = -2\sin 2x.$$
In the last step we simply replaced k by k + 1 throughout to make the expansion
look more like the expansion in the preceding examples.
$$y(x) = \sum_{k=0}^{\infty} c_kx^k.$$
This form of the expansion is particularly appropriate if we want to solve an initial-value
problem with $y(0)$ and $y'(0)$ specified, because then $c_0 = y(0)$ and $c_1 = y'(0)$,
if there is a Taylor expansion for the solution about $x = 0$. Proceeding under that
assumption for the moment, we compute
$$y'(x) = \sum_{k=1}^{\infty} kc_kx^{k-1}, \qquad y''(x) = \sum_{k=2}^{\infty} k(k - 1)c_kx^{k-2}.$$
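$$c_2 = -\frac{c_0}{2} = -\frac{c_0}{2!}, \qquad c_3 = -\frac{c_1}{2\cdot3} = -\frac{c_1}{3!},$$
$$c_4 = -\frac{c_2}{3\cdot4} = \frac{c_0}{4!}, \qquad c_5 = -\frac{c_3}{4\cdot5} = \frac{c_1}{5!},$$
$$c_6 = -\frac{c_4}{5\cdot6} = -\frac{c_0}{6!}, \qquad c_7 = -\frac{c_5}{6\cdot7} = -\frac{c_1}{7!},$$
$$c_{2k} = -\frac{c_{2k-2}}{(2k - 1)(2k)} = (-1)^k\frac{c_0}{(2k)!}, \qquad c_{2k+1} = -\frac{c_{2k-1}}{2k(2k + 1)} = (-1)^k\frac{c_1}{(2k + 1)!}.$$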
If we take $y(0) = c_0$ and $y'(0) = c_1 = 0$, then only the first column contains nonzero
entries, and we get the solution
$$y_0(x) = c_0 - \frac{c_0}{2!}x^2 + \frac{c_0}{4!}x^4 - \cdots + (-1)^k\frac{c_0}{(2k)!}x^{2k} + \cdots = c_0\cos x.$$
On the other hand, the choice $y(0) = c_0 = 0$ and $y'(0) = c_1$ makes the entries in
the first column all zero, so we get another solution from the second column:
$$y_1(x) = c_1x - \frac{c_1}{3!}x^3 + \frac{c_1}{5!}x^5 - \cdots + (-1)^k\frac{c_1}{(2k + 1)!}x^{2k+1} + \cdots = c_1\sin x.$$
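The two columns of coefficients both come from the single recurrence $c_{k+2} = -c_k/((k + 1)(k + 2))$, so one loop produces the general solution. The Python sketch below is an added illustration with a hypothetical function name; `math.cos` and `math.sin` serve only as reference values for the comparison.

```python
import math


def series_solution(c0: float, c1: float, x: float, n_terms: int = 40) -> float:
    """Evaluate sum c_k x**k with c_{k+2} = -c_k / ((k + 1)(k + 2))."""
    coeffs = [c0, c1]
    for k in range(n_terms - 2):
        coeffs.append(-coeffs[k] / ((k + 1) * (k + 2)))
    return sum(c * x ** k for k, c in enumerate(coeffs))


x = 1.3
print(series_solution(1.0, 0.0, x), math.cos(x))   # first column only
print(series_solution(0.0, 1.0, x), math.sin(x))   # second column only
print(series_solution(2.0, 3.0, x), 2 * math.cos(x) + 3 * math.sin(x))
```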
$$y(x) = \sum_{k=0}^{\infty} c_k x^k,$$
where this time we shifted the index in the first summation up by 2 and in the second
summation down by 1 to make the exponents of $x$ agree. Thus we get a single term,
the constant $2c_2$, in the first summation that does not correspond to a term in the
second summation. We can write then
$$c_2 = 0, \qquad c_{k+2} = -\frac{c_{k-1}}{(k+1)(k+2)}, \quad k = 1, 2, 3, \ldots.$$
We find as a result that the terms are determined in sequences with indexes differing
by 3, and that
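$$0 = c_2 = c_5 = c_8 = \cdots = c_{3k+2} = \cdots.$$
$$c_{3k} = -\frac{c_{3k-3}}{(3k-1)3k} = \frac{(-1)^k c_0}{2\cdot 3\cdot 5\cdot 6 \cdots (3k-1)3k},$$
$$c_4 = -\frac{c_1}{3\cdot 4},$$
$$c_7 = -\frac{c_4}{6\cdot 7} = \frac{c_1}{3\cdot 4\cdot 6\cdot 7},$$
$$c_{10} = -\frac{c_7}{9\cdot 10} = -\frac{c_1}{3\cdot 4\cdot 6\cdot 7\cdot 9\cdot 10},$$
$$c_{3k+1} = -\frac{c_{3k-2}}{3k(3k+1)} = \frac{(-1)^k c_1}{3\cdot 4\cdot 6\cdot 7 \cdots 3k(3k+1)}.$$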
$$y_0(x) = 1 + \sum_{k=1}^{\infty} \frac{(-1)^k x^{3k}}{2\cdot 3\cdot 5\cdot 6 \cdots (3k-1)3k},$$
$$y_1(x) = x + \sum_{k=1}^{\infty} \frac{(-1)^k x^{3k+1}}{3\cdot 4\cdot 6\cdot 7 \cdots 3k(3k+1)}.$$
The power series for $y_0$ and $y_1$ both converge for all $x$ because the denominators of
the $k$th terms each contain increasing integer factors, $2k$ in number. Hence each series
has terms dominated by those of an everywhere convergent series, for example,
$$\left|\frac{(-1)^k x^{3k}}{2\cdot 3\cdot 5\cdot 6 \cdots (3k-1)3k}\right| \le \frac{|x|^{3k}}{(2k)!}.$$
In practice estimates of this kind are useful for testing the accuracy we get by
stopping with a specified number of terms in a Taylor expansion. In this example,
to get an estimate when $|x| \le 1$, we estimate the tail of the factorial series by a
geometric series with ratio $1/4n^2$:
$$\sum_{k=n}^{\infty}\frac{1}{(2k)!} = \frac{1}{(2n)!}\left[1 + \frac{1}{(2n+1)(2n+2)} + \frac{1}{(2n+1)\cdots(2n+4)} + \cdots\right]$$
$$< \frac{1}{(2n)!}\left[1 + \frac{1}{4n^2} + \frac{1}{(4n^2)^2} + \cdots\right] = \frac{1}{(2n)!}\cdot\frac{4n^2}{4n^2-1}.$$
Thus the error in stopping after $n - 1$ terms on the interval $-1 \le x \le 1$ is at most
$$\frac{4n^2}{4n^2-1}\cdot\frac{1}{(2n)!}.$$
For $n = 5$, that is, keeping terms of degree 4, the error is at most $3\cdot 10^{-7}$.
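The bound can be compared with the actual truncation error. This is a rough sketch under stated assumptions: `y0_partial` sums the series for $y_0$ given above, a 40-term sum stands in for the exact value, and the helper names are illustrative only.

```python
import math

def y0_partial(x, n_terms):
    """1 + sum_{k=1}^{n_terms} (-1)^k x^(3k) / (2*3*5*6*...*(3k-1)*3k)."""
    total, denom = 1.0, 1.0
    for k in range(1, n_terms + 1):
        denom *= (3 * k - 1) * (3 * k)
        total += (-1) ** k * x ** (3 * k) / denom
    return total

def tail_bound(n):
    """The bound 4n^2 / ((4n^2 - 1) (2n)!) quoted in the text."""
    return 4 * n * n / ((4 * n * n - 1) * math.factorial(2 * n))

n = 5
grid = [i / 10 for i in range(-10, 11)]             # sample points in [-1, 1]
reference = [y0_partial(x, 40) for x in grid]        # long sum used as "exact"
truncated = [y0_partial(x, n - 1) for x in grid]     # stop before the k = n term
worst = max(abs(r - t) for r, t in zip(reference, truncated))
print(worst, tail_bound(n))
```

The observed error should come out well below the bound, since the estimate replaces the true denominators by the smaller factorials $(2k)!$.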
Since the two solutions $y_0$ and $y_1$ are linearly independent, the general solution
of $y'' + xy = 0$ has the form
$$y(x) = c_0 y_0(x) + c_1 y_1(x).$$
$$c_2 = -\frac{m(m+1)}{2}c_0, \qquad c_3 = -\frac{(m-1)(m+2)}{2\cdot 3}c_1,$$
$$c_{k+2} = \frac{(k+1+m)(k-m)}{(k+1)(k+2)}c_k, \quad k \ge 2.$$
Since the recurrence relation contains a shift by 2, it's natural to split the coefficients
into those of even and those of odd index:
$$c_4 = \frac{(3+m)(2-m)}{3\cdot 4}c_2 = \frac{(m-2)m(m+3)(m+1)}{4!}c_0,$$
$$c_6 = \frac{(5+m)(4-m)}{5\cdot 6}c_4 = -\frac{(m-4)(m-2)m(m+5)(m+3)(m+1)}{6!}c_0,$$
and
$$c_7 = \frac{(6+m)(5-m)}{6\cdot 7}c_5 = -\frac{(m-5)(m-3)(m-1)(m+6)(m+4)(m+2)}{7!}c_1.$$
If $m = 2l$ is a positive even integer, then all even coefficients are zero beyond $c_{2l}$.
Thus the series expansion with even powers reduces to an even polynomial in that
case. Similarly, if $m = 2l + 1$ is a positive odd integer, the series expansion with odd
powers reduces to an odd polynomial. For example, when $m = 4$ and $c_0 = c_1 = 1$,
we get the solutions
$$P_4(x) = 1 - \frac{4\cdot 5}{2!}x^2 + \frac{2\cdot 4\cdot 7\cdot 5}{4!}x^4,$$
$$Q_4(x) = x - \frac{3\cdot 6}{3!}x^3 + \frac{1\cdot 3\cdot 8\cdot 6}{5!}x^5 - \frac{-1\cdot 1\cdot 3\cdot 10\cdot 8\cdot 6}{7!}x^7 + \cdots.$$
Using the ratio test for convergence (see Exercise 7) shows that the infinite series
solution converges for $-1 < x < 1$. These two solutions form the basis for the
collection of all solutions of the homogeneous Legendre equation of index 4 on the
interval $-1 < x < 1$. In general, the Legendre equation of integer index $m$ has two
independent solutions of which one is a polynomial and the other is not.
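The termination of the even series for $m = 4$ can be seen by running the recurrence in exact arithmetic. A small sketch, assuming the recurrence $c_{k+2} = (k+1+m)(k-m)c_k/((k+1)(k+2))$ is applied from $k = 0$ (which reproduces the separate formulas for $c_2$ and $c_3$); the function name is illustrative.

```python
from fractions import Fraction

def legendre_coefficients(m, c0, c1, n_terms):
    """c_{k+2} = (k + 1 + m)(k - m) / ((k + 1)(k + 2)) * c_k, starting from c0, c1."""
    c = [Fraction(c0), Fraction(c1)]
    for k in range(n_terms - 2):
        c.append(Fraction((k + 1 + m) * (k - m), (k + 1) * (k + 2)) * c[k])
    return c

coeffs = legendre_coefficients(4, 1, 1, 10)
even = coeffs[0::2]   # c0, c2, c4, c6, c8: coefficients of the polynomial solution
odd = coeffs[1::2]    # c1, c3, c5, c7, c9: coefficients of the infinite series
print(even)
print(odd)
```

For $m = 4$ the even list ends in zeros after $c_4$, while the odd coefficients never vanish, matching the polynomial and non-polynomial solutions described above.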
EXERCISES
1. Use the method of power series to derive the general solution
then $y = f(-x)$ is a solution of
$$y'' - xy = 0.$$
(b) Use the result of part (a) together with the result of Example 3 of the text to find a power series expansion for the general solution of $y'' - xy = 0$.
3. (a) Apply the method of power series to solve the first-order differential equation
$$y' + 2xy = 0.$$
(b) Solve the differential equation in part (a) by finding an exponential multiplier and then integrating.
(c) Do the results of parts (a) and (b) agree for all $x$?
4. (a) Apply the power series method to find the general solution of the differential equation
$$y'' - xy' = 0$$
in the form $y(x) = c_0y_0(x) + c_1y_1(x)$.
(b) Solve the differential equation in part (a) by solving the equivalent system
$$y' = u, \qquad u' = xu.$$
(c) Do the results of parts (a) and (b) agree for all $x$?
5. (a) Apply the power series method to find the general solution of the differential equation
(c) Apply the power series method to find a solution of the differential equation
$$y'' + xy' + y = x.$$
(d) Combine the result of part (a) with that of Exercise 5 to write the general solution of $y'' + xy' + y = x$.
6. Apply the ratio test for series convergence to the series solution found for the Legendre equation in Example 4 of the text. Show that the series converges for $-1 < x < 1$. [Hint: Split into even and odd parts; show that each converges separately.]
7. The Bessel equation of index $n$ is
$$y'' + \frac{1}{x}y' + \left(1 - \frac{n^2}{x^2}\right)y = 0.$$
(a) Show that when $n = 0$ the coefficients of a solution of the form $\sum_{k=0}^{\infty} c_kx^k$ satisfy $(k+2)^2c_{k+2} = -c_k$.
(b) Show that, if we choose $c_0 = 1$, $c_1 = 0$ in part (a), we get the solution
$$J_0(x) = \sum_{k=0}^{\infty}(-1)^k 2^{-2k}(k!)^{-2}x^{2k},$$
called a Bessel function of order 0.
(c) Show that $J_1(x) = -J_0'(x)$ defines a solution of the Bessel equation of index 1.
8. A Bessel function of integer order $n$ is defined by
$$J_n(x) = \left(\frac{x}{2}\right)^n\sum_{k=0}^{\infty}\frac{(-1)^k x^{2k}}{2^{2k}k!(n+k)!}.$$
(b) Show that, if $y_n$ is a solution of the Bessel equation of index $n$, and $u_n(x) = \sqrt{x}\,y_n(x)$, then $u_n$ satisfies
In the special circumstances that the coefficients $a_k$, $b_k$ arise from an integrable
function $f(x)$ using the Euler formulas
8.2  $$a_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos kx\,dx, \qquad b_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin kx\,dx,$$
then the trigonometric series is called the Fourier series of $f$. The coefficients $a_k$,
$b_k$ as given by Equations 8.2 are called the Fourier coefficients of $f$. This choice
is justified by Theorem 8.5. The most fundamental question about a Fourier series
is the extent to which the series represents the function. The importance of such a
representation stems partly from the possibility of incorporating the individual terms
of the series into a solution of certain differential equations.
Our first examples illustrate the beautiful way in which the partial sums of a
Fourier series attempt to mimic the function $f$ that generates them. A partial sum
8.3  $$S_N(x) = \frac{a_0}{2} + \sum_{k=1}^{N}(a_k\cos kx + b_k\sin kx)$$
Note also that the Fourier coefficients $a_k$, $b_k$ are determined by integral formulas that
use the values of $f(x)$ only for $-\pi \le x \le \pi$. For these reasons, we'll sometimes
restrict attention to values of $x$ in the interval of length $2\pi$ between $-\pi$ and $\pi$.
8B Orthogonality
The functions $\cos kx$, $\sin kx$ that occur in a Fourier series are the most important
examples of orthogonal functions on the interval $-\pi \le x \le \pi$. For integers $k$ and
$l$, orthogonality of $\cos kx$ and $\sin kx$ means that
8.4  $$\frac{1}{\pi}\int_{-\pi}^{\pi}\cos kx\,\sin lx\,dx = 0,$$
These formulas are usually proved using trigonometric identities, but they can also
be proved by first writing the sines and cosines in terms of $e^{ikx}$ and $e^{ilx}$. (See
Exercise 29.) As a sample application of the orthogonality relations 8.4, suppose
that a trigonometric series satisfies some condition that allows us to integrate it term
by term on the interval $-\pi \le x \le \pi$. For example, the series might converge
uniformly on the interval, or it might be only a finite sum, with all $a_k$ and $b_k$ equal
to zero from some point on.
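The orthogonality relations lend themselves to a quick numerical check. The sketch below assumes the standard relations (mixed products and products of distinct cosines integrate to 0, and $\frac{1}{\pi}\int_{-\pi}^{\pi}\cos^2 kx\,dx = 1$ for $k \ge 1$); the midpoint-rule helper and the names `inner`, `cos_k`, `sin_k` are illustrative choices, not the text's.

```python
import math

def integrate(g, a, b, n=2000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def inner(u, v):
    """(1/pi) times the integral of u(x) v(x) over [-pi, pi]."""
    return integrate(lambda x: u(x) * v(x), -math.pi, math.pi) / math.pi

def cos_k(k):
    return lambda x: math.cos(k * x)

def sin_k(k):
    return lambda x: math.sin(k * x)

print(inner(cos_k(2), sin_k(3)))   # cosine against sine
print(inner(cos_k(2), cos_k(3)))   # two different cosines
print(inner(cos_k(3), cos_k(3)))   # a cosine against itself
```

The first two values should be zero up to rounding and the third should be 1, which is what makes the term-by-term integration argument pick out a single coefficient.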
8.5 Theorem. Suppose the trigonometric series 8.1 converges to a function $f(x)$
and is integrable term-by-term over the interval $[-\pi, \pi]$ to give the integral of $f(x)$.
Then the coefficients $a_k$, $b_k$ of $f(x)$ are given by the Euler formulas 8.2.
$$\frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos lx\,dx = \frac{1}{\pi}\int_{-\pi}^{\pi}\left[\frac{a_0}{2} + \sum_{k=1}^{\infty}(a_k\cos kx + b_k\sin kx)\right]\cos lx\,dx$$
The first two of Equations 8.4 show that all but one of these last terms is zero, the only
survivor being the term with the factor $a_l$. We find for $l \ne 0$,
$$\frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos lx\,dx = \frac{a_l}{\pi}\int_{-\pi}^{\pi}\cos^2 lx\,dx = a_l.$$
When $l = 0$ the only nonzero term that survives in the sum is the first one. Since the
integral of 1 over the interval is $2\pi$, we get $a_0$. A similar computation in Exercise 30
shows that $\frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin lx\,dx = b_l$. ■
Theorem 8.5 shows that no choices other than Formulas 8.2 for determining the
$a_k$ and $b_k$ are possible if we want to represent a reasonably large class of functions
$f(x)$ by the trigonometric series of Equation 8.1.
$$= \left[\frac{2}{k^2\pi}\cos kx\right]_0^{\pi} = \frac{2}{k^2\pi}(\cos k\pi - 1) = \frac{2}{k^2\pi}\left((-1)^k - 1\right) = \begin{cases} 0, & k = 2, 4, 6, \ldots, \\ -\dfrac{4}{k^2\pi}, & k = 1, 3, 5, \ldots. \end{cases}$$
$$a_0 = \pi, \qquad a_k = \begin{cases} 0, & k = 2, 4, 6, \ldots, \\ -\dfrac{4}{k^2\pi}, & k = 1, 3, 5, \ldots, \end{cases} \qquad b_k = 0, \quad k = 1, 2, 3, \ldots.$$
Hence the $N$th Fourier approximation is given for $N = 1, 3, 5, \ldots$ by the trigonometric
polynomial
$$S_N(x) = \frac{\pi}{2} - \frac{4}{\pi}\cos x - \frac{4}{\pi}\frac{\cos 3x}{3^2} - \cdots - \frac{4}{\pi}\frac{\cos Nx}{N^2}.$$
FIGURE 14.7 Fourier approximations to $f(x) = |x|$ on $[-\pi, \pi]$. Panels: (b) $s_1(x) = s_0(x) - (4/\pi)\cos x$; (c) $s_3(x) = s_1(x) - (4/9\pi)\cos 3x$.
If $N$ is even, we have $S_N(x) = S_{N-1}(x)$. Figure 14.7 shows how the graphs of
$S_0$, $S_1$, and $S_3$ approximate that of $|x|$ on $[-\pi, \pi]$; additional terms improve the
approximation.
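How quickly the approximation improves can be estimated numerically. A minimal sketch, assuming the coefficients $a_0 = \pi$ and $a_k = -4/(k^2\pi)$ for odd $k$ as stated above; the function names and the sampling grid are illustrative.

```python
import math

def abs_partial_sum(x, n_max):
    """S_N(x) = pi/2 - (4/pi) * sum over odd k <= N of cos(kx) / k^2."""
    total = math.pi / 2
    for k in range(1, n_max + 1, 2):
        total -= 4 / math.pi * math.cos(k * x) / k ** 2
    return total

def max_error(n_max, samples=400):
    """Largest |S_N(x) - |x|| over an evenly spaced grid on [-pi, pi]."""
    grid = [-math.pi + 2 * math.pi * i / samples for i in range(samples + 1)]
    return max(abs(abs_partial_sum(x, n_max) - abs(x)) for x in grid)

for n_max in (1, 3, 5, 25):
    print(n_max, max_error(n_max))
```

Because the coefficients decay like $1/k^2$, the worst-case error shrinks roughly like $1/N$, with the largest discrepancy near the corner at $x = 0$.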
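$$a_k = -\frac{1}{\pi}\int_{-\pi}^{0}\cos kx\,dx + \frac{1}{\pi}\int_{0}^{\pi}\cos kx\,dx.$$
Since cosine is an even function, $\cos(-kx) = \cos kx$, the two integrals are equal, so
we get $a_k = 0$. Similarly,
$$b_k = -\frac{1}{\pi}\int_{-\pi}^{0}\sin kx\,dx + \frac{1}{\pi}\int_{0}^{\pi}\sin kx\,dx.$$
Since sine is an odd function, $\sin(-x) = -\sin x$, the integrals themselves are negatives
of each other, and we get for $k \ne 0$,
$$b_k = \frac{2}{\pi}\int_{0}^{\pi}\sin kx\,dx = \frac{2}{k\pi}\left(-(-1)^k + 1\right) = \begin{cases} 0, & k \text{ even}, \\ \dfrac{4}{k\pi}, & k \text{ odd}. \end{cases}$$
In summary,
$$a_k = 0, \quad k = 0, 1, 2, \ldots, \qquad b_k = \begin{cases} 0, & k = 2, 4, 6, \ldots, \\ \dfrac{4}{k\pi}, & k = 1, 3, 5, \ldots. \end{cases}$$
(Figure 14.8, panel (a): $s_1(x) = (4/\pi)\sin x$.)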
where $a_k$ and $b_k$ are given by the Euler Formulas 8.2. Theorem 8.6 gives some conditions
on $f$ under which we can use the Fourier series to represent $f$. Suppose
that the graph of $f$ is not only bounded on $[-\pi, \pi]$ but piecewise monotone, which
means that the interval $[-\pi, \pi]$ breaks into finitely many subintervals, with endpoints
$-\pi = x_1 < x_2 < \cdots < x_n = \pi$, such that $f(x)$ is either nondecreasing or nonincreasing
on each open subinterval $(x_k, x_{k+1})$. It's possible to prove that the Fourier
series of $f$ will then converge to the $2\pi$-periodic extension of $f(x)$ illustrated in
Figure 14.9 wherever $f$ is continuous, and at a discontinuity at $x_0$ will converge to
the "average" value
$$\tfrac{1}{2}[f(x_0-) + f(x_0+)].$$
FIGURE 14.8 Step-function approximations $S_N(x)$ for $N = 0, 3, 5$.
FIGURE 14.9 Typical periodic extension.
Here $f(x_0-)$ stands for the left-hand limit of $f$ at $x_0$, and $f(x_0+)$ stands for the
right-hand limit. The graph of a typical piecewise monotone function appears in
Figure 14.9 with average value at jumps indicated by dots.
8.6 Theorem. Let $f$ be bounded and piecewise monotone on $[-\pi, \pi]$. Then the
Fourier series of $f$ converges at every point $x$ of the interval to $\tfrac{1}{2}[f(x-) + f(x+)]$. In
particular, if $f$ is continuous at $x$, then the series converges to $f(x)$. At $x = \pm\pi$, the
series converges to $\tfrac{1}{2}[f(\pi-) + f(-\pi+)]$. (A somewhat stronger version is called
Dirichlet's theorem; the conclusion is the same, with somewhat weaker assumptions
about $f$.)
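Theorem 8.6 implies that the Fourier series of $g$ converges as follows:
$$\sum_{k=0}^{\infty}\frac{4}{\pi}\,\frac{\sin(2k+1)x}{2k+1} = \begin{cases} 1, & 0 < x < \pi, \\ 0, & x = 0, \\ -1, & -\pi < x < 0. \end{cases}$$
To be very specific, we can set $x = \pi/2$ and arrive at the alternating series expansion
$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots.$$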
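The alternating series obtained at $x = \pi/2$ converges slowly. A minimal sketch, assuming it is the series $1 - \frac{1}{3} + \frac{1}{5} - \cdots$ with sum $\pi/4$ (which is what the step-function expansion gives at $x = \pi/2$); the function name is illustrative.

```python
import math

def leibniz_partial(n_terms):
    """Sum of the first n_terms of 1 - 1/3 + 1/5 - 1/7 + ..."""
    return sum((-1) ** k / (2 * k + 1) for k in range(n_terms))

for n_terms in (10, 1000, 100000):
    print(n_terms, 4 * leibniz_partial(n_terms), math.pi)
```

The error after $n$ terms is on the order of $1/n$, so this expansion illustrates convergence rather than offering a practical way to compute $\pi$.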
Theorem 8.6 gives a reason beyond the one in Theorem 8.5 for choosing the
coefficients in a trigonometric series according to Euler Formulas 8.2. Assuming
piecewise monotonicity, the resulting sequence of trigonometric polynomials will
converge to the function $f$ and its average value at jumps. Since the partial sums
of a Fourier series are themselves periodic functions, the function to which they
converge is also periodic, a function we called a periodic extension of $f$. A periodic
extension may differ from the precise definition of $f(x)$ at some points in the interval
$-\pi \le x \le \pi$, but changing a value $f(x_0)$ at a point $x_0$ has no effect on the
integral formulas for the Fourier coefficients, so it's customary to make such changes
whenever it's convenient in defining a periodic extension of a function from an
interval to the entire real number line.
Figure 14.10 shows a function $f$ extended periodically, with period $2\pi$, from the
interval $[-\pi, \pi]$ to other values of $x$. Since the partial sums of the Fourier series are
also periodic with period $2\pi$, whatever convergence takes place on $[-\pi, \pi]$ extends
periodically to all values of $x$.
FIGURE 14.10 Periodic extension of $f(x) = x + \pi$ from $(-\pi, \pi)$, normalized at jumps, with the 4th Fourier partial sum $S_4(x)$ superimposed.
The coefficient values $a_k$ and $b_k$ that we get from the Euler formulas are independent
of the finite values assigned to $f(x)$ at isolated points; this is because the
definite integrals in the Euler formulas don't distinguish between two functions that
differ at finitely many points. For example, the two functions
$$f(x) = \begin{cases} 0, & -\pi \le x \le 0, \\ 1, & 0 < x \le \pi \end{cases} \qquad\text{and}\qquad g(x) = \begin{cases} 0, & -\pi \le x < 0, \\ 1, & 0 \le x \le \pi \end{cases}$$
differ only at $x = 0$, where $f(0) = 0$ and $g(0) = 1$. Intuitively speaking the areas
under the two graphs should be the same, namely $\pi$, and this is an important property
of the integral. Since the Fourier series of a piecewise monotone function converges
to the average of the right and left limits at each point $x$, it makes sense simply to
redefine such a periodic function to have the average value at each jump discontinuity
and refer to this as the normalized function.
We restate Theorem 8.6 as follows.
Strictly speaking, Theorem 8.7 has implications here only for functions of period
$2\pi$, as does Theorem 8.6, but we'll see in the next section that a modified statement
is valid for functions with positive period $2p$.
EXAMPLE 4 If the function $f(x) = x + \pi$ is extended periodically from $-\pi < x < \pi$ to
other values of $x$, its graph consists of parallel line segments of slope 1; it remains
undefined at odd integer multiples of $\pi$ since it's initially undefined at $\pm\pi$. To
produce a normalized version of the function defined for all $x$, all we have to do
is define the function to have the value $\pi$ at odd multiples of $\pi$. We compute the
Fourier series for the function as originally defined, and the series will converge to
the normalized function for all real $x$. The periodically extended function is shown in
Figure 14.10 together with the 4th partial sum of the Fourier expansion. The Fourier
coefficients are computed as follows:
$$a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi}(x + \pi)\,dx = 0 + 2\pi = 2\pi.$$
(Note that the integral of $x$ over an interval symmetric about 0 is always 0.) When
$k > 0$,
$$a_k = \frac{1}{\pi}\int_{-\pi}^{\pi}(x + \pi)\cos kx\,dx = \frac{1}{\pi}\int_{-\pi}^{\pi}x\cos kx\,dx + \int_{-\pi}^{\pi}\cos kx\,dx = 0.$$
The last integral above is 0 because the indefinite integral is 0 at $\pm\pi$. The previous
one is most easily seen to be zero by observing that the integrand $x\cos kx$ is an odd
function, so that the integral over $[-\pi, 0]$ is the negative of the integral over $[0, \pi]$.
Now for the $b_k$'s,
$$b_k = \frac{1}{\pi}\int_{-\pi}^{\pi}(x + \pi)\sin kx\,dx = \frac{1}{\pi}\int_{-\pi}^{\pi}x\sin kx\,dx + \int_{-\pi}^{\pi}\sin kx\,dx = \frac{2(-1)^{k+1}}{k} + 0 = \frac{2(-1)^{k+1}}{k}.$$
The last integral is zero because the integrand is an odd function. The previous one
is computed using integration by parts, with
$$\frac{1}{\pi}\int_{-\pi}^{\pi}x\sin kx\,dx = \left[-\frac{1}{k\pi}x\cos kx\right]_{-\pi}^{\pi} + \frac{1}{k\pi}\int_{-\pi}^{\pi}\cos kx\,dx = -\frac{1}{k}\cos k\pi - \frac{1}{k}\cos(-k\pi) + 0 = \frac{2(-1)^{k+1}}{k}.$$
The full expansion, including the constant $a_0/2$, is then
$$f(x) = \pi + \sum_{k=1}^{\infty}\frac{2(-1)^{k+1}}{k}\sin kx = \pi + 2\sin x - \sin 2x + \tfrac{2}{3}\sin 3x - \tfrac{1}{2}\sin 4x + - \cdots.$$
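The behavior of this expansion, including at the jump, can be explored numerically. A minimal sketch, assuming the coefficients $a_0/2 = \pi$ and $b_k = 2(-1)^{k+1}/k$ from the computation above; the function name is illustrative.

```python
import math

def sawtooth_partial_sum(x, n_max):
    """pi + sum_{k=1}^{N} 2 (-1)^(k+1) sin(kx) / k."""
    return math.pi + sum(2 * (-1) ** (k + 1) * math.sin(k * x) / k
                         for k in range(1, n_max + 1))

print(sawtooth_partial_sum(1.0, 4), 1.0 + math.pi)        # 4th partial sum vs f(1)
print(sawtooth_partial_sum(1.0, 5000), 1.0 + math.pi)     # many terms
print(sawtooth_partial_sum(math.pi, 5000), math.pi)       # value at the jump
```

At $x = \pi$ every sine term vanishes, so the partial sums sit at the average value $\pi$ used to normalize the function at odd multiples of $\pi$.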
EXERCISES
6. $f(x) = x + 1$, $-\pi < x \le \pi$
7. $f(x) = \begin{cases} -\pi, & -\pi < x < 0, \\ \pi, & 0 \le x \le \pi \end{cases}$
8. $f(x) = 2x + 1$, $-\pi < x < \pi$
9. $f(x) = -|x|$, $-\pi \le x \le \pi$
*29. Establish the orthogonality relations in Equations 8.4 of the text, by using the identities $\cos nx = \tfrac{1}{2}(e^{inx} + e^{-inx})$, $\sin nx = \tfrac{1}{2i}(e^{inx} - e^{-inx})$ together with the identity $e^{i(\alpha+\beta)} = e^{i\alpha}e^{i\beta}$ to compute the relevant integrals.
9A General Intervals
While the interval $[-\pi, \pi]$ is a natural one for Fourier expansions because it is a
period interval for the trigonometric functions, it may be that a function encountered
in an application needs to be approximated on some other interval. If the function $f$
to be approximated is defined not on the interval $[-\pi, \pi]$ but on $[-p, p]$, a suitable
change in the computation of the approximation is as follows. With $f$ defined on
$[-p, p]$, we define
$$f_p(x) = f\left(\frac{px}{\pi}\right), \quad -\pi \le x \le \pi.$$
Then we can compute the Fourier coefficients of $f_p$ by Formula 9.2. The resulting
trigonometric polynomials $S_N$ will converge to $f_p$ on $[-\pi, \pi]$ as in Theorems 8.6
and 8.7. To approximate $f$ on $[-p, p]$, we consider
$$S_N\left(\frac{\pi x}{p}\right) = \frac{a_0}{2} + \sum_{k=1}^{N}\left(a_k\cos\frac{k\pi x}{p} + b_k\sin\frac{k\pi x}{p}\right), \quad -p \le x \le p.$$
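$$= \frac{1}{p}\int_{-p}^{p} f(x)\cos\left(\frac{k\pi x}{p}\right)dx.$$
$$\frac{a_0}{2} + \sum_{k=1}^{N}\left(a_k\cos\frac{k\pi x}{p} + b_k\sin\frac{k\pi x}{p}\right)$$
If
$$h(x) = \begin{cases} 1, & 0 \le x \le p, \\ -1, & -p \le x < 0, \end{cases}$$
then
$$a_k = 0, \quad k = 0, 1, 2, \ldots, \qquad b_k = \frac{2}{\pi}\int_0^{\pi}\sin kx\,dx = \begin{cases} 0, & k = 2, 4, 6, \ldots, \\ \dfrac{4}{k\pi}, & k = 1, 3, 5, \ldots. \end{cases}$$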
FIGURE 14.11 Intervals of length $b - a = 2p$, with $p = (b - a)/2$.
For a function $f$ defined on an arbitrary interval $a \le x \le b$, it's helpful to think of
a periodic extension $F$ of $f$ having period $b - a$ and defined for all real numbers $x$.
Such an extension appears in Figure 14.10. We set $2p = b - a$ so that $p = (b - a)/2$
and $-p = -(b - a)/2$. We then compute the Fourier coefficients of $F$ over the
interval $[-p, p]$ according to Formula 9.1. Also, because the integrands in Formula
9.1 have period $2p$, we can use the geometric observation that we can perform
the integration over any interval of length $2p = b - a$, in particular, over $[a, b]$ as
in Figure 14.11. (See Exercise 7 for a nongeometric proof.) The reinterpretation of
Formulas 9.1 is
9.2  $$a_k = \frac{2}{b-a}\int_a^b f(x)\cos\frac{2k\pi x}{b-a}\,dx, \qquad b_k = \frac{2}{b-a}\int_a^b f(x)\sin\frac{2k\pi x}{b-a}\,dx.$$
The associated trigonometric polynomials are
$$S_N(x) = \frac{a_0}{2} + \sum_{k=1}^{N}\left(a_k\cos\frac{2k\pi x}{b-a} + b_k\sin\frac{2k\pi x}{b-a}\right).$$
Equations 9.2 are useful computationally in part because the way $f(x)$ is defined
may make it easier to compute its integral over the interval $[a, b]$ rather than $[-p, p]$.
EXAMPLE 2 Let $f(x) = x$, for $0 < x < 1$. We find, integrating by parts for $k \ne 0$,
$$a_k = 2\int_0^1 x\cos 2k\pi x\,dx = 2\left[\frac{x\sin 2k\pi x}{2k\pi}\right]_0^1 - \frac{2}{2k\pi}\int_0^1\sin 2k\pi x\,dx = 0,$$
$$b_k = 2\int_0^1 x\sin 2k\pi x\,dx = 2\left[-\frac{x\cos 2k\pi x}{2k\pi}\right]_0^1 + \frac{2}{2k\pi}\int_0^1\cos 2k\pi x\,dx = -\frac{\cos 2k\pi}{k\pi} = -\frac{1}{k\pi}.$$
Since $a_0 = 2\int_0^1 x\,dx = 1$, then $a_0/2 = \tfrac{1}{2}$, and the Fourier series is
$$\frac{1}{2} - \frac{1}{\pi}\left(\sin 2\pi x + \frac{\sin 4\pi x}{2} + \frac{\sin 6\pi x}{3} + \cdots\right).$$
Hence aside from a possible constant term, an even function has only cosine terms
in its Fourier expansion. Similarly, if $f$ is an odd periodic function, the product
$f(x)\cos(k\pi x/p)$ is also odd; so for the Fourier cosine coefficient we have
$$a_k = \frac{1}{p}\int_{-p}^{p} f(x)\cos\frac{k\pi x}{p}\,dx = 0.$$
Thus an odd function has only sine terms in its Fourier expansion.
Suppose that, given a function $f(x)$ defined just on the interval $0 \le x \le p$, we want
to find a trigonometric series expansion for $f$ consisting only of sine terms, or
sometimes only of cosine terms. The trick is to extend the definition of $f$ from the
interval $0 \le x \le p$ to all real $x$ in such a way that the extension is periodic of
period $2p$ and either is odd or else is even. We then compute the Fourier series
of the extension. If $f_e$ is an even periodic extension of $f$, then $f_e$ will have only
cosine terms in its Fourier series in an expansion designed to represent $f$ just on
$0 \le x \le p$. Similarly if $f_o$ is an odd periodic extension of $f$, then $f_o$ has only sine
terms in an expansion designed to represent $f$ just for $0 \le x \le p$.
We'll compute the cosine expansion for the function defined by $f(x) = 1 - x$ for
$0 \le x \le 2$. We consider the even periodic extension shown in Figure 14.12. To find
the extension we define $f_e$ by $f_e(x) = f(-x)$ for $-2 \le x < 0$, and then extend
periodically, with period 4, to the whole $x$-axis. We use Formula 9.4 to compute the
Fourier-cosine expansion of $f_e$. (Since $f_e$ is even, we know that $b_k = 0$ for all $k$.)
The coefficient formula in 9.4 allows us to write
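$$a_k = \frac{1}{2}\int_{-2}^{2} f_e(x)\cos\frac{k\pi x}{2}\,dx = \int_0^2 f_e(x)\cos\frac{k\pi x}{2}\,dx.$$
Since $f_e(x) = 1 - x$ for $0 \le x \le 2$, we get for $k \ne 0$
$$a_k = \int_0^2 (1 - x)\cos\frac{k\pi x}{2}\,dx = \left[\frac{2}{k\pi}(1 - x)\sin\frac{k\pi x}{2}\right]_0^2 + \frac{2}{k\pi}\int_0^2\sin\frac{k\pi x}{2}\,dx$$
$$= \frac{4}{\pi^2k^2}[1 - \cos k\pi] = \begin{cases} 0, & k \text{ even}, \\ 8/(\pi^2k^2), & k \text{ odd}. \end{cases}$$
Finally, $a_0 = \int_0^2 (1 - x)\,dx = 0$. Thus the cosine expansion of $f$ on $0 \le x \le 2$ has
for its general nonzero term
$$\frac{8}{\pi^2k^2}\cos\left(\frac{k\pi x}{2}\right), \quad k \text{ odd}.$$
Written out, the expansion of the given function looks like
$$f(x) = \frac{8}{\pi^2}\left(\frac{\cos \pi x/2}{1} + \frac{\cos 3\pi x/2}{9} + \frac{\cos 5\pi x/2}{25} + \cdots\right), \quad 0 \le x \le 2.$$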
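The cosine expansion can be compared with $1 - x$ at a few sample points. A minimal sketch, assuming the nonzero terms $\frac{8}{\pi^2k^2}\cos(k\pi x/2)$ for odd $k$ derived above; the function name is illustrative.

```python
import math

def cosine_expansion(x, n_max):
    """(8/pi^2) * sum over odd k <= N of cos(k pi x / 2) / k^2."""
    return 8 / math.pi ** 2 * sum(math.cos(k * math.pi * x / 2) / k ** 2
                                  for k in range(1, n_max + 1, 2))

for x in (0.0, 0.5, 1.0, 2.0):
    print(x, cosine_expansion(x, 199), 1 - x)
```

Since the even extension is continuous, the coefficients decay like $1/k^2$ and the partial sums converge uniformly on $0 \le x \le 2$, endpoints included.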
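$$b_k = \frac{1}{2}\int_{-2}^{2} f_o(x)\sin\frac{k\pi x}{2}\,dx = \int_0^2 f_o(x)\sin\frac{k\pi x}{2}\,dx.$$
FIGURE 14.13 Odd extension of $f(x) = 1 - x$ from $0 \le x \le 2$.
$$b_k = \int_0^2 (1 - x)\sin\frac{k\pi x}{2}\,dx = \left[-\frac{2}{k\pi}(1 - x)\cos\frac{k\pi x}{2}\right]_0^2 - \frac{2}{k\pi}\int_0^2\cos\frac{k\pi x}{2}\,dx$$
$$= \frac{2}{k\pi}(-1)^k + \frac{2}{k\pi} - \frac{2}{k\pi}\left[\frac{2}{k\pi}\sin\frac{k\pi x}{2}\right]_0^2 = \begin{cases} 0, & k \text{ odd}, \\ \dfrac{4}{k\pi}, & k \text{ even}. \end{cases}$$
Thus the general nonzero term in the sine expansion is $(4/(k\pi))\sin(k\pi x/2)$, for even
$k > 0$. A careful interpretation of this formula shows that the sine expansion of $f(x)$
is then
$$f(x) = \frac{2}{\pi}\sum_{n=1}^{\infty}\frac{\sin n\pi x}{n}, \quad 0 < x < 2.$$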
Note that Examples 1 and 2 of the previous section are cosine and sine expansions
respectively of the given functions restricted to $0 \le x \le \pi$.
The Java Applet FOURIER at Web site http://math.dartmouth.edu/~rcwn/ approximates
Fourier coefficients using Simpson's rule and then plots graphs of partial
sums. An alternative is to use computer algebra software such as Maple, MATLAB or
Mathematica to compute Fourier coefficients for elementary functions.
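The idea of approximating Fourier coefficients with Simpson's rule can be sketched in a few lines. This is not the applet's code; it is a hypothetical illustration that applies a composite Simpson rule to the Euler formulas on $[-\pi, \pi]$, with function names chosen here for convenience.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def fourier_coefficients(f, k):
    """Approximate (a_k, b_k) on [-pi, pi] from the Euler formulas."""
    a_k = simpson(lambda x: f(x) * math.cos(k * x), -math.pi, math.pi) / math.pi
    b_k = simpson(lambda x: f(x) * math.sin(k * x), -math.pi, math.pi) / math.pi
    return a_k, b_k

for k in range(4):
    print(k, fourier_coefficients(abs, k))
```

For $f(x) = |x|$ the output can be compared with the exact values $a_0 = \pi$, $a_k = -4/(k^2\pi)$ for odd $k$, and $b_k = 0$. Higher-index coefficients need a finer grid because the integrands oscillate faster.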
9C Differential Equations
Given a linear differential operator $L$, for example the harmonic oscillator operator
$L = D^2 + \omega^2$, we can use Fourier series to solve the nonhomogeneous equation
$Ly = f(t)$ if the forcing function $f$ is periodic with period $2p$ and representable by
a Fourier series with coefficients $a_k$, $b_k$. For example, to solve
$$Ly = \sum_{k=1}^{\infty} b_k\sin\frac{k\pi t}{p},$$
we first find a simple particular solution $y_k(t)$ of the equation
$$Ly = \sin\frac{k\pi t}{p}, \quad k = 1, 2, 3, \ldots.$$
The linearity of $L$ leads us to a formal particular solution of the linear equation as
$$y = \sum_{k=1}^{\infty} b_k y_k(t).$$
Suppose we want to solve the forced harmonic oscillator equation $\ddot{y} + \omega^2 y = f(t)$,
where $f(t)$ has period $2p = 2$ and is defined on the interval $0 \le t < 2$ by
$$f(t) = \begin{cases} 1, & 0 \le t < 1, \\ -1, & 1 \le t < 2. \end{cases}$$
Figure 14.14(a) shows the graph of $f$ for $0 \le t < 8$, called a square wave.
The square-wave input $f$ has Fourier expansion
$$f(t) = \frac{2}{\pi}\sum_{k=1}^{\infty}\frac{1 - (-1)^k}{k}\sin(k\pi t) = \frac{4}{\pi}\sum_{n=0}^{\infty}\frac{1}{2n+1}\sin((2n+1)\pi t),$$
with the understanding that the series converges to 0 at the jump discontinuities. The
computation appears in Example 1. The differential equation $\ddot{y} + \omega^2 y = \sin(k\pi t)$
has the particular solution $y_k(t) = (\omega^2 - k^2\pi^2)^{-1}\sin(k\pi t)$. It follows that a solution
to $\ddot{y} + \omega^2 y = f(t)$ is formally
$$y(t) = \frac{4}{\pi}\sum_{n=0}^{\infty}\frac{\sin((2n+1)\pi t)}{(2n+1)\left(\omega^2 - (2n+1)^2\pi^2\right)}.$$
Note that if $\omega$ is an odd multiple of $\pi$, one of the terms in the series is undefined
and would have to be corrected. Indeed, for $\omega$ close to $(2n+1)\pi$ the corresponding
term in the series will have a large amplitude. Partial sums to 100 terms are graphed
in Figure 14.14 for various values of $\omega$. A formula for the general solution would
have to contain additional terms $c_1\cos\omega t + c_2\sin\omega t$. The numerical values of the
coefficients in the series for $y(t)$ are larger when $n = 0$ than when $n > 0$, particularly
for the choices of $\omega^2$ in Figure 14.14(c) and (d); this explains the dominance of
the first term in the graphs. Note also that Theorem 4.5 applies to the displayed
solution, so we can compute $y'(t)$ from term-by-term differentiation, and the output
is decidedly smoother than the square-wave input $f(t)$. On intervals $k \le t < k + 1$
the solution must satisfy $\ddot{y} + \omega^2 y = \pm 1$, so on such intervals $y(t) = c_k\cos\omega t +
d_k\sin\omega t \pm \omega^{-2}$, with the pieces fitting together smoothly at $t = k$ to produce a
periodic solution.
FIGURE 14.14 (a) Square-wave input; (b) output for $\omega^2 = 5$.
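The formal series solution can be tested by substituting a partial sum back into the differential equation. This is a rough sketch under stated assumptions: the particular solution is built from $y_k(t) = (\omega^2 - k^2\pi^2)^{-1}\sin(k\pi t)$ and the square-wave coefficients $4/((2n+1)\pi)$, the second derivative is estimated by central differences, and $\omega^2 = 5$ is just an illustrative value away from resonance.

```python
import math

def particular_solution(t, omega, n_terms):
    """(4/pi) * sum_{n<n_terms} sin((2n+1) pi t) / ((2n+1)(omega^2 - (2n+1)^2 pi^2))."""
    total = 0.0
    for n in range(n_terms):
        m = 2 * n + 1
        total += math.sin(m * math.pi * t) / (m * (omega ** 2 - (m * math.pi) ** 2))
    return 4 / math.pi * total

def residual(t, omega, n_terms, h=1e-4):
    """Central-difference estimate of y'' + omega^2 y at time t."""
    y = lambda s: particular_solution(s, omega, n_terms)
    second = (y(t + h) - 2 * y(t) + y(t - h)) / h ** 2
    return second + omega ** 2 * y(t)

omega = math.sqrt(5.0)                 # illustrative, not an odd multiple of pi
print(residual(0.5, omega, 100))       # the square-wave series equals +1 at t = 0.5
print(residual(1.5, omega, 100))       # and -1 at t = 1.5
```

The residual reproduces the partial sum of the square-wave series itself, so it approaches $\pm 1$ only as fast as that slowly converging series does, even though the solution series converges much faster.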
EXERCISES
1. Find the Fourier series for the function
$$f(x) = -x, \quad -2 < x < 2.$$
To what values will the series converge at $x = 2$ and $x = -2$?
2. Find the Fourier series for the function
$$f(x) = 1 + x, \quad 1 < x < 2.$$
To what values will the series converge at $x = 1$ and $x = 2$?
3. Let $f$ be an odd function on $[-p, p]$, that is, $f(-x) = -f(x)$, and let $g$ be an even function, that is, $g(-x) = g(x)$. Let $a_k$, $b_k$ and $a_k'$, $b_k'$ be the Fourier coefficients of $f$ and $g$, respectively. Show that
$$a_k = 0, \quad b_k = \frac{2}{p}\int_0^p f(x)\sin(k\pi x/p)\,dx, \qquad a_k' = \frac{2}{p}\int_0^p g(x)\cos(k\pi x/p)\,dx, \quad b_k' = 0.$$
In Exercises 10 to 15, extend the function to an interval to the left of $x = 0$ so that the extended function is even. In each case sketch the graph of the even periodic extension of $f$ after normalizing $f$ to have the average value at jump discontinuities. In the same picture sketch also the graph of the sum of the first two nonzero terms of the Fourier expansion, which should contain only cosine terms plus perhaps a constant.
10. $f(x) = 1$, $0 < x < \pi$
11. $f(x) = 1 - x$, $0 < x < 1$
12. $f(x) = x^2$, $0 < x < \pi$
13. $f(x) = \sin x$, $0 < x < \pi/2$
14. $f(x) = \begin{cases} 0, & 0 < x < 1, \\ 1, & 1 \le x < 2 \end{cases}$
15. $f(x) = \begin{cases} 0, & 0 \le x < 1, \\ x, & 1 \le x < 2 \end{cases}$
16. Find
(a) the Fourier cosine expansion and
(b) the Fourier sine expansion of the function
FIGURE 14.15 Temperatures at equally spaced times (axes $x$ and $u$).
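so if $u_x(x, t) > 0$ heat flows to the left at $x$, while if $u_x(x, t) < 0$ heat flows to the
right at $x$. Thus the rate of change of heat in a segment $[x_1, x_2]$ is
$$k\left[-\frac{\partial u}{\partial x}(x_1, t) + \frac{\partial u}{\partial x}(x_2, t)\right], \qquad (1)$$
where the number $k$ is the heat conductivity of the wire, assumed to be constant
over the length of this wire.
By a version of the Fundamental Theorem of Calculus, the rate of change of heat
in the segment in Equation (1) is equal to
$$k\int_{x_1}^{x_2}\frac{\partial^2 u}{\partial x^2}(x, t)\,dx. \qquad (2)$$
The same rate of change is also
$$\frac{d}{dt}\int_{x_1}^{x_2} c\rho\,u(x, t)\,dx = c\rho\int_{x_1}^{x_2}\frac{\partial u}{\partial t}(x, t)\,dx, \qquad (3)$$
where the constants $c$ and $\rho$ are the heat capacity and density of the wire per unit
of length. Equating the expressions for rate of heat change in the wire in Equations
2 and 3 gives
$$a^2\int_{x_1}^{x_2}\frac{\partial^2 u}{\partial x^2}(x, t)\,dx = \int_{x_1}^{x_2}\frac{\partial u}{\partial t}(x, t)\,dx,$$
where $a^2 = k/(c\rho)$. Since this holds for every segment $[x_1, x_2]$, the integrands must agree:
10.1 One-dimensional heat equation  $$a^2\frac{\partial^2 u}{\partial x^2}(x, t) = \frac{\partial u}{\partial t}(x, t).$$
Equation 10.1 is linear in the sense that if $u_1$ and $u_2$ are solutions, then so are
linear combinations $c_1u_1 + c_2u_2$, the reason being that both $\partial^2/\partial x^2$ and $\partial/\partial t$ act linearly.
To single out particular solutions, we start by imposing two boundary conditions that
specify temperature zero at the ends of the wire of length $p$,
$$u(0, t) = u(p, t) = 0,$$
and one initial condition that specifies the initial temperature at all points $x$,
$$u(x, 0) = h(x).$$
We look for product solutions of the form
$$u(x, t) = X(x)T(t).$$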
The boundary conditions translate into $X(0) = X(p) = 0$. If such product solutions
exist, substitution into $a^2u_{xx} = u_t$ gives
$$a^2\frac{X''(x)}{X(x)} = \frac{T'(t)}{T(t)}.$$
Note that if $x$ varies nothing changes on the right. Similarly varying $t$ changes
nothing on the left. Hence both sides must be equal to some constant $C$. We now
set both sides of the equation equal to a constant, letting $C = -\lambda^2$ for convenience:
$$a^2X'' + \lambda^2X = 0, \qquad T' + \lambda^2T = 0.$$
The first of these equations has solutions
$$X(x) = c_1\cos(\lambda x/a) + c_2\sin(\lambda x/a).$$
The condition $X(0) = 0$ forces $c_1 = 0$, and then $X(p) = 0$ with $c_2 \ne 0$ forces $\lambda p/a = k\pi$ for a positive integer $k$.
With $\lambda = ka\pi/p$, the differential equation for $T(t)$ is now $T' + (ka\pi/p)^2T = 0$, and
its solutions are
$$T(t) = ce^{-(k^2a^2\pi^2/p^2)t}.$$
Except for a constant factor, the product solutions $u_k(x, t) = X_k(x)T_k(t)$ are
$$u_k(x, t) = e^{-(k^2a^2\pi^2/p^2)t}\sin(k\pi/p)x, \quad k = 1, 2, \ldots.$$
Since the heat equation is linear, linear combinations of the functions $u_k(x, t)$,
$$u_N(x, t) = \sum_{k=1}^{N} b_ke^{-(k^2a^2\pi^2/p^2)t}\sin(k\pi/p)x,$$
are also solutions. But recall that we still have to satisfy an initial condition $u(x, 0) =
h(x)$. This amounts to setting $t = 0$ in the previous equation and requiring the
coefficients $b_k$ to be chosen so that
$$h(x) = \sum_{k=1}^{N} b_k\sin(k\pi/p)x.$$
It's important to understand the separation technique, because it has many applications,
some of which are in Section 10C and in the review problems at the end of
the chapter.
If the function $h(x)$ satisfies the conditions of Theorem 8.6 of Section 8, we can
let $N$ tend to infinity in $u_N(x, t)$ and get a Fourier series representation that we
incorporate using the Fourier sine expansion Formula 9.3 into a
10.2 Solution formula for $a^2\dfrac{\partial^2 u}{\partial x^2} = \dfrac{\partial u}{\partial t}$, $u(0, t) = u(p, t) = 0$, $u(x, 0) = h(x)$.
$$u(x, t) = \sum_{k=1}^{\infty} b_ke^{-(k^2a^2\pi^2/p^2)t}\sin(k\pi/p)x, \quad\text{where } b_k = \frac{2}{p}\int_0^p h(x)\sin\frac{k\pi x}{p}\,dx.$$
Note that the decreasing exponential factors make the series converge very rapidly,
so we'll be able to differentiate the series term-by-term often enough with respect to
x and t to verify that we do have a solution. (See Theorems 4.1 and 4.5.)
To be more specific about solving the heat equation, we assume for simplicity that
$p = \pi$. Recall that to solve $a^2u_{xx} = u_t$ with boundary condition $u(0, t) = u(\pi, t) =
0$ and initial condition $u(x, 0) = h(x)$, we want in general to be able to represent
$h(x)$ by an infinite series of the form
$$h(x) = \sum_{k=1}^{\infty} b_k\sin kx.$$
$$h(x) = \begin{cases} x, & 0 \le x \le \pi/2, \\ \pi - x, & \pi/2 \le x \le \pi. \end{cases}$$
To verify that $u(x, t)$ satisfies $a^2u_{xx} = u_t$ for $t > 0$ we use Theorem 4.5, noting
that the exponential factors provide the required uniform convergence. Theorem 4.2
shows that $\lim_{t \to 0}u(x, t) = h(x)$ for $0 \le x \le \pi$. The graphs of $h(x)$ and $u(x, t)$
appear in Figure 14.16.
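The solution formula can be evaluated numerically for the tent-shaped initial temperature above. A minimal sketch, assuming $a = 1$ and $p = \pi$, with the sine coefficients approximated by a midpoint rule rather than computed in closed form; all function names are illustrative.

```python
import math

def sine_coefficient(h, k, p, n=2000):
    """b_k = (2/p) * integral_0^p h(x) sin(k pi x / p) dx, by the midpoint rule."""
    dx = p / n
    total = sum(h((i + 0.5) * dx) * math.sin(k * math.pi * (i + 0.5) * dx / p)
                for i in range(n))
    return 2 / p * total * dx

def heat_solution(h, x, t, a=1.0, p=math.pi, n_terms=50):
    """Partial sum of sum_k b_k exp(-(k a pi / p)^2 t) sin(k pi x / p)."""
    return sum(sine_coefficient(h, k, p)
               * math.exp(-(k * a * math.pi / p) ** 2 * t)
               * math.sin(k * math.pi * x / p)
               for k in range(1, n_terms + 1))

def tent(x):
    """Initial temperature: x on [0, pi/2], pi - x on [pi/2, pi]."""
    return x if x <= math.pi / 2 else math.pi - x

for t in (0.0, 0.1, 1.0):
    print(t, heat_solution(tent, math.pi / 2, t))
```

At $t = 0$ the value at the midpoint should be close to $h(\pi/2) = \pi/2$, and it decreases as $t$ grows because every exponential factor decays.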
will satisfy $u(0, t) = u_0$, $u(p, t) = u_1$ if $w(x, t)$ is a solution of the heat equation
satisfying homogeneous conditions
$$w(0, t) = w(p, t) = 0.$$
Note that the function $w + v$ is indeed a solution of the heat equation, because both
$w$ and $v$ are solutions and because the heat operator $a^2D_{xx} - D_t$ acts linearly.
To solve problems of the form
$$b_k = \frac{2}{p}\int_0^p (h(x) - v(x))\sin\frac{k\pi x}{p}\,dx.$$
The reason $u(x, 0) = h(x)$ is that when $t = 0$ the series represents $h(x) - v(x)$.
All terms of the series are zero when $x$ is 0 or $p$, so $u(0, t) = v(0) = u_0$ and
$u(p, t) = v(p) = u_1$.
EXAMPLE 2 To solve the problem
$$a^2u_{xx} = u_t, \quad u(0, t) = 10, \quad u(5, t) = 30, \quad u(x, 0) = h(x) = 10 - 2x, \quad 0 < x < 5,$$
we first find the steady-state solution $v(x) = 10 + 4x$. The desired solution
is $u(x, t) = w(x, t) + v(x)$, where $w$ satisfies the boundary conditions
$$w(0, t) = w(5, t) = 0, \quad t \ge 0,$$
and an initial condition
$$u(x, 0) = w(x, 0) + v(x) = h(x);$$
this last condition is just
$$w(x, 0) = h(x) - v(x) = -6x.$$
Thus our solution $u(x, t)$ has the form of Equation 10.3, where the $b_k$ are Fourier
sine coefficients
$$b_k = \frac{2}{5}\int_0^5 (-6x)\sin\frac{k\pi x}{5}\,dx = \frac{60(-1)^k}{k\pi}.$$
The solution to the problem is thus
$$u(x, t) = (10 + 4x) - \frac{60}{\pi}\sum_{k=1}^{\infty}\frac{(-1)^{k+1}}{k}e^{-k^2a^2\pi^2t/25}\sin\frac{k\pi x}{5}.$$
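The solution of this example can be sanity-checked against its boundary values, its initial condition, and its steady state. A minimal sketch, assuming $a = 1$ and using the series with coefficients $(-1)^{k+1}/k$ and steady state $10 + 4x$ from the example; the function name is illustrative.

```python
import math

def example_solution(x, t, a=1.0, n_terms=200):
    """(10 + 4x) - (60/pi) sum_k ((-1)^(k+1)/k) exp(-k^2 a^2 pi^2 t/25) sin(k pi x/5)."""
    series = sum((-1) ** (k + 1) / k
                 * math.exp(-(k * a * math.pi) ** 2 * t / 25)
                 * math.sin(k * math.pi * x / 5)
                 for k in range(1, n_terms + 1))
    return 10 + 4 * x - 60 / math.pi * series

print(example_solution(0.0, 1.0), example_solution(5.0, 1.0))   # boundary values
print(example_solution(2.5, 0.0), 10 - 2 * 2.5)                  # initial value at x = 2.5
print(example_solution(2.5, 50.0), 10 + 4 * 2.5)                 # close to steady state
```

At $t = 0$ the sine series converges slowly because $h(x) - v(x) = -6x$ does not vanish at $x = 5$, while for $t > 0$ the exponential factors make a handful of terms sufficient.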
EXERCISES
Solve the heat equation $a^2u_{xx} = u_t$ with boundary and initial conditions 1 through 6.
1. $u(0,t) = u(p,t) = 0$; $u(x,0) = \sin(\pi x/p)$, $0 < x < p$
2. $u(0,t) = u(1,t) = 0$; $u(x,0) = x$, $0 < x < 1$
3. $u(0,t) = u(1,t) = 0$; $u(x,0) = 1 - x$, $0 < x < 1$
4. $u(0,t) = u(\pi,t) = 0$; $u(x,0) = x(\pi - x)$, $0 < x < \pi$
5. $u(0,t) = u(\pi,t) = 0$; $u(x,0) = \sin x + \tfrac12\sin 2x$, $0 < x < \pi$
6. $u(0,t) = u(2,t) = 0$; $u(x,0) = \begin{cases} 0, & 0 < x < 1,\\ 1, & 1 < x < 2 \end{cases}$
Find steady-state solutions $u(x,t) = v(x)$ of the heat equation $a^2u_{xx} = u_t$ that
satisfy each of the conditions 7 through 10.
7. $u(0,t) = -1$, $u(2,t) = 1$
8. $u(0,t) = 0$, $u(100,t) = 100$
9. $u_x(0,t) = 1$, $u(1,t) = 2$
10. $u_x(0,t) = -1$, $u(1,t) = 3$
Find the steady-state solution to each of the problems 19 through 22.
19. $u_{xx} = u_t + 2$; $u(0,t) = 1$, $u(1,t) = 2$
20. $u_{xx} = u_t + u$; $u(0,t) = 0$, $u(2,t) = 0$
21. $u_{xx} = u_t + x$; $u(0,t) = 1$, $u(1,t) = 2$
22. $u_{xx} = u_t - 2x + 1$; $u(0,t) = 1$, $u(1,t) = 0$
Insulated endpoints. We can interpret the 1-dimensional heat flow problem
\[ u_{xx} = u_t, \quad u_x(0,t) = u_x(p,t) = 0, \quad u(x,0) = f(x), \quad 0 \le x \le p, \]
as requiring the conducting medium to have insulated ends at $x = 0$ and $x = p$.
Because the temperature gradient $u_x$ is always 0 at the endpoints, there is no heat
flow past those points. The solution involves Fourier cosine series. The next four
exercises are about problems of this kind.
23. (a) Show that product solutions $u(x,t) = T(t)X(x)$ of the insulated endpoint
problem have the form $u_k(x,t) = a_k e^{-k^2\pi^2t/p^2}\cos k(\pi/p)x$.
(b) Use the Fourier cosine expansion for $f(x)$ on $0 \le x \le p$ to solve the boundary
value problem for a general initial temperature $f(x)$.
(c) What is the steady-state temperature function $u(x,\infty)$ for $0 \le x \le p$?
24. Solve the heat equation $a^2u_{xx} = u_t$ with insulated end conditions
$u_x(0,t) = u_x(1,t) = 0$ and initial condition $u(x,0) = x$ for $0 < x < 1$.
25. Solve the heat equation $a^2u_{xx} = u_t$ with insulated end conditions
$u_x(0,t) = u_x(1,t) = 0$ and initial condition $u(x,0) = 1$ for $0 < x < 1$.
28. The partial differential equation $t\,x\,u_x = u_t$ has product solutions of the form
$u(x,t) = X(x)T(t)$ for $x > 0$ and $t > 0$. Find the general form of all such solutions.
29. Suppose that $u(x,t)$ satisfies $a^2u_{xx} = u_t$ and that $u(x,t_0)$ is concave up as a
function of $x$ for $x_0 < x < x_1$. Show that for each $x$ with $x_0 < x < x_1$ there is a time
$t(x)$ such that $u(x,t)$ increases as $t$ increases from $t_0$ to $t(x)$. What if $u(x,t_0)$ is
concave down?
30. Verify that the partial differential operator $L = a^2D_{xx} - D_t$, defined by
$L(u) = a^2u_{xx} - u_t$, is linear in its action on twice differentiable functions $u$, $v$,
that is, show that $L(cu + dv) = cL(u) + dL(v)$ for constants $c$ and $d$.
728 Chapter 14 Infinite Series
31. The assumption that the separation constant $C$ for the problem $a^2u_{xx} = u_t$,
$u(0,t) = u(p,t) = 0$ has the special form $-\lambda^2$ is convenient but not essential.
Derive the product solutions $u_k(x,t) = e^{-k^2(a^2\pi^2/p^2)t}\sin(k\pi/p)x$ using $C$
instead of $-\lambda^2$.
[Figure 14.17: String and tangent.]
two different expressions for the total force vector acting on a typical segment of
the subdivision. To find the tension force $\mathbf{T}$ acting on the small piece, note that the
opposing forces at $\mathbf{x}(s)$ and $\mathbf{x}(s + \Delta s)$ are
$F\mathbf{t}(s + \Delta s)$ and $-F\mathbf{t}(s)$, so $\mathbf{T} = F[\mathbf{t}(s + \Delta s) - \mathbf{t}(s)]$.
But by Newton's law, the force $\mathbf{T}$ acting on the small piece equals mass $\rho\,\Delta s$ times
acceleration $\mathbf{a}$, so $\mathbf{T} = (\rho\,\Delta s)\mathbf{a}$. Hence
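\[ \rho\,\mathbf{a} = F\,\frac{\mathbf{t}(s + \Delta s) - \mathbf{t}(s)}{\Delta s}. \]
Letting $\Delta s \to 0$ gives $\rho\,\mathbf{a} = F\,\dfrac{\partial\mathbf{t}}{\partial s}(s,t)$. By definition
$\mathbf{t} = \dfrac{\partial\mathbf{x}}{\partial s}$ and $\mathbf{a} = \dfrac{\partial^2\mathbf{x}}{\partial t^2}$, so

10.4 Vector wave equation
\[ (F/\rho)\,\frac{\partial^2\mathbf{x}(s,t)}{\partial s^2} = \frac{\partial^2\mathbf{x}(s,t)}{\partial t^2}. \]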
Before proceeding we'll pause to interpret this equation. From Chapter 8, Section 3
recall that $|\partial^2\mathbf{x}/\partial s^2| = |\partial\mathbf{t}/\partial s| = \kappa(s)$ is the curvature of the string at $\mathbf{x}(s)$.
Thus the magnitude of the acceleration on the right side increases with increasing
force $F$ and curvature $\kappa$, and decreases with increasing density $\rho$. Equation 10.4
is linear, but it requires us to use arc length $s$ along the string as an independent
variable, something that's very hard to measure in practice.
Setting $F/\rho = a^2$ we write the vector differential equation as a system of three
scalar equations
\[ a^2x_{ss} = x_{tt}, \qquad a^2y_{ss} = y_{tt}, \qquad a^2z_{ss} = z_{tt}. \]
The motion's x-axis component is usually slight, indeed zero where we assumed
the ends are fixed, so the equation for $x(s,t)$ is usually set aside. Between the
other two equations there is little difference in physical significance unless we make
some other special assumption. To be specific we assume the string has been set in
motion so that movement is entirely in the xy-plane, so $z(s,t) = 0$, and we're left
with only the middle equation for $y$. Equation 3.6 in Chapter 8, Section 3 shows
that $y_{ss} = y_{xx}/(1 + y_x^2)^{3/2}$. If the slopes $y_x$ are small enough, we can replace the
nonlinear expression $y_{ss}$ by the slightly larger $y_{xx}$, so with a slight loss of precision
we replace the system of three equations by one linear equation for $y(x,t)$:
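10.5 One-dimensional linear wave equation
\[ a^2\frac{\partial^2y}{\partial x^2}(x,t) = \frac{\partial^2y}{\partial t^2}(x,t). \]
We impose the boundary and initial conditions
\[ y(0,t) = 0 \quad\text{and}\quad y(p,t) = 0 \]
and
\[ y(x,0) = f(x) \quad\text{and}\quad \frac{\partial y}{\partial t}(x,0) = g(x). \]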
The first pair of equations holds the string on the x-axis at $x = 0$ and $x = p$. The
second pair specifies for $0 \le x \le p$ the initial shape of the string, perhaps from
plucking as on a harp string, and its initial velocity, perhaps from hammering as on
a piano string.
Separation of variables. As in solving the heat equation we use separation of
variables and rely now on the linearity of Equation 10.5 for constructing solutions
that satisfy the boundary and initial conditions. Start by setting
$y(x,t) = X(x)T(t)$. Substituting into Equation 10.5 gives
\[ a^2X''(x)T(t) = X(x)T''(t), \quad\text{or}\quad a^2\frac{X''(x)}{X(x)} = \frac{T''(t)}{T(t)}. \]
The right side of the second equation is independent of $x$ and the left side is
independent of $t$, so both sides are constant. For convenience in treating the heat
equation we chose a special form for the constant; this wasn't really necessary, as
we'll show here by calling the separation constant simply $\lambda$. We write
\[ X''(x) - \lambda X(x) = 0, \qquad T''(t) - a^2\lambda T(t) = 0. \]
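The general solution $X(x) = c_1e^{\sqrt{\lambda}x} + c_2e^{-\sqrt{\lambda}x}$ must satisfy $X(0) = X(p) = 0$.
Solving for $c_1$ and $c_2$ shows that to get nonzero solutions we must have
\[ e^{2\sqrt{\lambda}p} = 1. \]
Allowing for complex exponents and complex values for $c_1$ and $c_2$, we see that
for some integer $k$ we must have $2\sqrt{\lambda}\,p = 2k\pi i$. Thus $\sqrt{\lambda} = k\pi i/p$, and the
corresponding solutions $X(x)$ have the form
\[ X_k(x) = c_1\left(e^{k\pi ix/p} - e^{-k\pi ix/p}\right) = 2c_1i\sin\frac{k\pi}{p}x = b_k\sin\frac{k\pi}{p}x. \]
Since we now know that $\lambda = (k\pi i/p)^2 = -(k\pi/p)^2$, the equation for $T$ is
\[ T'' + (k\pi a/p)^2T = 0 \quad\text{with solutions}\quad T_k(t) = C_k\cos\frac{k\pi a}{p}t + D_k\sin\frac{k\pi a}{p}t. \]
The product solutions $X_k(x)T_k(t)$ then have the form
\[ y(x,t) = \left[A_k\cos\frac{k\pi a}{p}t + B_k\sin\frac{k\pi a}{p}t\right]\sin\frac{k\pi}{p}x. \]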
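A quick finite-difference check can make the product solutions more concrete. The sketch below picks arbitrary illustrative values for k, a, p, A_k and B_k (none of them come from the text) and confirms numerically that one product term nearly satisfies a^2 y_xx = y_tt and vanishes at x = 0 and x = p. The helper names `second_diff` and `residual` are hypothetical.

```python
import math

# Illustrative values only; none of these numbers come from the text.
K, A, P = 3, 2.0, 1.5      # mode number k, constant a, string length p
AK, BK = 0.7, -0.4         # coefficients A_k and B_k

def y(x, t):
    """One product solution [A_k cos(w t) + B_k sin(w t)] sin(k pi x / p)."""
    w = K * math.pi * A / P
    return (AK * math.cos(w * t) + BK * math.sin(w * t)) * math.sin(K * math.pi * x / P)

def second_diff(f, s, h=1e-4):
    """Central second-difference approximation to f''(s)."""
    return (f(s + h) - 2.0 * f(s) + f(s - h)) / (h * h)

def residual(x, t):
    """a^2 * y_xx - y_tt, which should be close to zero."""
    y_xx = second_diff(lambda s: y(s, t), x)
    y_tt = second_diff(lambda s: y(x, s), t)
    return A * A * y_xx - y_tt

if __name__ == "__main__":
    print(residual(0.4, 0.3), y(0.0, 0.3), y(P, 0.3))
```

The residual is not exactly zero because the derivatives are approximated, but it is small compared with the individual second derivatives.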
We now form finite or infinite sums of these terms and try to satisfy the initial
conditions by choosing $A_k$ and $B_k$ to be the appropriate Fourier sine coefficients.

10.6 Solution formula for $a^2\dfrac{\partial^2y}{\partial x^2} = \dfrac{\partial^2y}{\partial t^2}$, satisfying $y(0,t) = y(p,t) = 0$ and
$y(x,0) = f(x)$, $\dfrac{\partial y}{\partial t}(x,0) = g(x)$:
\[ y(x,t) = \sum_{k=1}^{\infty}\left[A_k\cos\frac{k\pi a}{p}t + B_k\sin\frac{k\pi a}{p}t\right]\sin\frac{k\pi}{p}x, \quad\text{where} \]
\[ A_k = \frac{2}{p}\int_0^p f(x)\sin\frac{k\pi}{p}x\,dx \quad\text{and}\quad B_k = \frac{2}{k\pi a}\int_0^p g(x)\sin\frac{k\pi}{p}x\,dx. \]
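Formula 10.6 can be tried out numerically. The following is a minimal sketch, assuming a midpoint rule for the coefficient integrals and a fixed number of terms; the function names `sine_coeff` and `string_position` and all parameter values are illustrative choices, not part of the text.

```python
import math

def sine_coeff(func, k, p, n=2000):
    """Midpoint-rule estimate of (2/p) * integral_0^p func(x) sin(k pi x / p) dx."""
    h = p / n
    total = 0.0
    for j in range(n):
        x = (j + 0.5) * h
        total += func(x) * math.sin(k * math.pi * x / p)
    return 2.0 / p * total * h

def string_position(f, g, x, t, a, p, terms=20):
    """Partial sum of Formula 10.6 with numerically computed A_k and B_k."""
    y = 0.0
    for k in range(1, terms + 1):
        w = k * math.pi * a / p
        a_k = sine_coeff(f, k, p)
        # B_k = (2/(k pi a)) * integral = [(2/p) * integral] / w
        b_k = sine_coeff(g, k, p) / w
        y += (a_k * math.cos(w * t) + b_k * math.sin(w * t)) * math.sin(k * math.pi * x / p)
    return y

if __name__ == "__main__":
    shape = lambda x: math.sin(math.pi * x)   # initial shape on 0 <= x <= 1
    rest = lambda x: 0.0                      # zero initial velocity
    for t in (0.0, 0.125, 0.25, 0.5):
        print(t, round(string_position(shape, rest, 0.5, t, a=2.0, p=1.0), 6))
```

With the single-mode initial shape sin(pi x), zero initial velocity, a = 2 and p = 1, the midpoint of the string should follow cos(2 pi t), which gives a simple way to sanity-check the coefficients.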
Figure 14.18 shows equally time-spaced string positions for a plucked string.
\[ y(0,t) = y(p,t) = 0, \quad\text{and}\quad y(x,0) = A\sin\frac{\pi x}{p}, \quad \frac{\partial y}{\partial t}(x,0) = 0. \]
[Figure 14.18: Plucked string.]
If we want to relax our assumption that the string's motion is confined to a plane,
all we have to do is reintroduce the equation $a^2z_{xx} = z_{tt}$ along with its own initial
conditions and solve that problem in the same way. Thus we'd have a vector solution
$(y(x,t), z(x,t))$ with the same independent variables $x$ and $t$.
EXERCISES
Solve the wave equation $a^2u_{xx} = u_{tt}$ with boundary and initial conditions 1 to 4.
1. $u(0,t) = u(\pi,t) = 0$, $u(x,0) = \sin x$, $u_t(x,0) = 0$
2. $u(0,t) = u(\pi,t) = 0$, $u(x,0) = \begin{cases} x, & 0 < x \le \pi/2,\\ \pi - x, & \pi/2 < x < \pi, \end{cases}$ $u_t(x,0) = 0$
3. $u(0,t) = u(\pi,t) = 0$, $u(x,0) = 0$, $u_t(x,0) = \begin{cases} 0, & 0 < x \le \pi/2,\\ 1, & \pi/2 < x < \pi \end{cases}$
4. $u(0,t) = u(1,t) = 0$, $u(x,0) = x(1 - x)$, $u_t(x,0) = \sin\pi x$
5. The nonhomogeneous wave equation
(a) Find the equilibrium solutions $u(x,t) = v(x)$ of the nonhomogeneous equation.
(b) Among the solutions found in part (a), select the solution that satisfies the
boundary conditions $u(0,t) = u(p,t) = 0$.
(c) Explain how to modify the Fourier solution method for $a^2u_{xx} = u_{tt}$ to cover the
solution of the nonhomogeneous problem.
6. The d'Alembert solution to the wave equation. This method predates the Fourier
series method, but is not so popular because it's not so widely applicable. Let
$U(x)$ and $V(x)$ be twice differentiable functions for all real $x$.
(b) Assuming $f(x)$ twice differentiable, show that
$u(x,t) = \tfrac12[f(x + at) + f(x - at)]$
is a solution to the wave equation of the form described in part (a) that also
satisfies the initial conditions $u(x,0) = f(x)$, $u_t(x,0) = 0$.
(c) Assuming $g'(x)$ continuous, show that
(b) Sketch the odd periodic extension of $f(x) = x(1 - x)$, as described in part (a),
from $0 \le x \le 1$ to $-\infty < x < \infty$.
(c) Show that if $f(x)$ is odd and has period $2p$ for $-\infty < x < \infty$, then
$u(x,t) = \tfrac12[f(x + at) + f(x - at)]$
Chapter 14 REVIEW
In Exercises 1 to 10, use what you know of specific infinite series to identify a
sum in closed form for the given series, determining also its domain of convergence.
1. $\sum_{k=0}^{\infty}(-1)^k(x - 5)^k$
2. $\sum_{k=0}^{\infty}(x + 1)^{2k}$
3. $\sum_{k=0}^{\infty}(x^2 + 1)^{-k}$
4. $\sum_{k=0}^{\infty}(-1)^k(x - 1)^k/k!$
5. $\sum_{k=0}^{\infty}(x + 1)^{2k}/k!$
6. $\sum_{k=0}^{\infty}2^k(x - 1)^k/k!$
7. $\sum_{k=0}^{\infty}(-1)^k(x - 5)^{2k}/(2k)!$
8. $\sum_{k=0}^{\infty}(-1)^k(x + 1)^{2k+1}/(2k + 1)!$
9. $\sum_{k=0}^{\infty}(x^2 - 1)^{-2k}/(2k)!$
11. $(1 - x^3)^{-1}$; $a = 0$    12. $(2x - x^2)^{-1}$; $a = 1$
13. $\ln(1 - 2x)$; $a = 0$    14. $e^{-x^3}$; $a = 0$
15. $e^{x-1}$; $a = 1$    16. $e^{x-1}$; $a = 0$
17. $\cos(2x)$; $a = 0$    18. $\sin x^2$; $a = 0$
19. $\sin(x + \pi)$; $a = 0$    20. $\sin x + \sin 2x$; $a = 0$
21. $e^x - e^{2x}$; $a = 0$    22. $(1 + x)^{-1}(1 - x)^{-1}$; $a = 0$
State all real values of x for which the series in Exercise 23 to 28 converges.
35. (a) Let $L$ be defined as an operator by $L(u) = u_{xx} - u_t$. Show that $L$ is a linear
operator and conclude that linear combinations of solutions of the heat equation
are also solutions.
(b) Show that boundary conditions of the form $u(a,t) = u(b,t) = 0$ are linear in the
sense that if two functions satisfy the conditions, then so does a linear
combination of the two functions.
(c) Show that a boundary condition of the form $u(a,t) = 1$ is not linear in the sense
of part (b).
(d) Show that the initial condition $u(x,0) = f(x)$ is not linear in the sense of
part (a) unless $f(x)$ is identically zero.
36. Example 4 in Chapter 9, Section 6 shows that the 2-dimensional Laplace equation
$\nabla^2u = 0$ in polar coordinates is
(b) Show that $\Theta'' + \lambda^2\Theta = 0$ has solutions satisfying $\Theta(0) = \Theta(2\pi)$ if $\lambda = k$ for
integer $k$, with solutions $a_k\cos k\theta + b_k\sin k\theta$. Thus $\Theta$ takes the same value at
polar angles $\theta = 0$ and $\theta = 2\pi$.
(c) Show that the Euler equation has solutions $r^k$ and $r^{-k}$ for integer $k$, but that
negative exponents are ruled out by the boundary condition that $u(r,\theta)$ should be
finite at the origin.
(d) Show that if $f(\theta)$ has Fourier series representation
$f(\theta) = \dfrac{a_0}{2} + \sum$
The table at the end of this appendix lists some frequently occurring integrals. As
a supplement to the table, you may find it useful to use a symbolic calculator or
software that provides some indefinite integrals. If you don't see how to compute an
indefinite integral directly and don't find it elsewhere, you may find that one of the
following techniques works. Integration constants are omitted, since they're not the
main issue here.
I IDENTITY SUBSTITUTIONS
Rewriting the integrand using an algebraic, trigonometric, exponential, or logarith-
mic identity will sometimes convert an apparently intractable integrand into an
amenable one.
EXAMPLE 1 The integral $\int(e^x + e^{3x})^2\,dx$ can be rewritten by squaring out the binomial to get
\[ \int(e^x + e^{3x})^2\,dx = \int(e^{2x} + 2e^{4x} + e^{6x})\,dx = \tfrac12e^{2x} + \tfrac12e^{4x} + \tfrac16e^{6x}. \]
To integrate $\cos^2x$, recall the trigonometric identity $\cos 2x = 2\cos^2x - 1$, which
is equivalent to $\cos^2x = \tfrac12(1 + \cos 2x)$. Thus Formula 30 in the table follows for
$a = 1$ from
\[ \int\cos^2x\,dx = \tfrac12\int(1 + \cos 2x)\,dx = \tfrac12x + \tfrac14\sin 2x. \]
\[ \int\frac{dx}{(x - a)(x - b)} = \int\left[\frac{1}{(a - b)(x - a)} - \frac{1}{(a - b)(x - b)}\right]dx
= \frac{\ln|x - a|}{a - b} - \frac{\ln|x - b|}{a - b} = \frac{1}{a - b}\ln\left|\frac{x - a}{x - b}\right|. \]
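The antiderivatives obtained by identity substitutions can be spot-checked by differentiating them numerically and comparing with the integrand. The sketch below does this for the three integrals above; the helper names, the sample points, and the constants a = 3 and b = 1 in the partial-fraction case are arbitrary illustrative choices.

```python
import math

def derivative(F, x, h=1e-5):
    """Central-difference approximation to F'(x)."""
    return (F(x + h) - F(x - h)) / (2.0 * h)

# Arbitrary distinct constants for the partial-fraction example.
a, b = 3.0, 1.0

# (integrand, claimed antiderivative) pairs.
checks = [
    (lambda x: (math.exp(x) + math.exp(3 * x)) ** 2,
     lambda x: math.exp(2 * x) / 2 + math.exp(4 * x) / 2 + math.exp(6 * x) / 6),
    (lambda x: math.cos(x) ** 2,
     lambda x: x / 2 + math.sin(2 * x) / 4),
    (lambda x: 1.0 / ((x - a) * (x - b)),
     lambda x: math.log(abs((x - a) / (x - b))) / (a - b)),
]

def max_error(points=(0.2, 0.5, 1.7)):
    """Largest gap between F'(x) and the integrand over the sample points."""
    return max(abs(derivative(F, x) - f(x)) for f, F in checks for x in points)

if __name__ == "__main__":
    print(max_error())
```

A small maximum error does not prove the formulas, but a sign slip or a wrong constant factor would show up immediately as a large one.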
736 Appendix Finding Indefinite Integrals
\[ \int\frac{dx}{\sqrt{x} + 1} = \int\frac{2u\,du}{u + 1} = 2\int\left(1 - \frac{1}{u + 1}\right)du. \]
The last step comes from division of $u$ by $u + 1$. Now integrate with respect to $u$
and reintroduce $x$ using the inverse relation $u = \sqrt{x}$ to get
\[ \int\frac{dx}{\sqrt{x} + 1} = 2(u - \ln(u + 1)) = 2(\sqrt{x} - \ln(\sqrt{x} + 1)). \]
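One way to gain confidence in a substitution result is to compare it with a direct numerical integral. The following sketch, which assumes a simple midpoint rule and the arbitrary interval 0 <= x <= 4, compares the quadrature value of the integral of 1/(sqrt(x) + 1) with the difference F(4) - F(0) of the antiderivative found above. The names `midpoint_integral` and `F` are illustrative.

```python
import math

def midpoint_integral(f, lo, hi, n=100000):
    """Midpoint-rule approximation to the integral of f from lo to hi."""
    h = (hi - lo) / n
    return h * sum(f(lo + (j + 0.5) * h) for j in range(n))

def F(x):
    """Antiderivative found by the substitution u = sqrt(x)."""
    return 2.0 * (math.sqrt(x) - math.log(math.sqrt(x) + 1.0))

numeric = midpoint_integral(lambda x: 1.0 / (math.sqrt(x) + 1.0), 0.0, 4.0)
exact = F(4.0) - F(0.0)   # equals 4 - 2 ln 3

if __name__ == "__main__":
    print(numeric, exact)
```

The two numbers should agree to several decimal places; the agreement is limited only by the quadrature error near x = 0, where the integrand has an unbounded derivative.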
EXAMPLE 2 To integrate $\sqrt{1 - x^2}$, set $x = \sin u$, $dx = \cos u\,du$. Then
\[ \int\sqrt{1 - x^2}\,dx = \int\sqrt{1 - \sin^2u}\,\cos u\,du = \int\cos^2u\,du = \tfrac12u + \tfrac14\sin 2u, \quad\text{by Example 2.} \]
In the integral $\int\cos^2x\sin x\,dx$ we note the square of a function, namely $g(x) = \cos x$,
multiplied by a function, $\sin x$, which is easily modified to be the derivative
$g'(x) = -\sin x$. By including the constant factor $-1$ in the integrand and
compensating with a "$-$" before the integral, we rewrite the integral as
\[ \int\cos^2x\sin x\,dx = -\int(\cos x)^2(-\sin x)\,dx. \]
It's now natural to think of substituting $u$ for $g(x) = \cos x$ and $du$ for
$g'(x)\,dx = (-\sin x)\,dx$ to get
\[ \int\cos^2x\sin x\,dx = -\int u^2\,du = -\tfrac13u^3 = -\tfrac13\cos^3x. \]
IV INTEGRATION BY PARTS
This technique is one of the most important, because of its frequent use in deriving
other general formulas; it is embodied in the formula
\[ \int f(x)g'(x)\,dx = f(x)g(x) - \int f'(x)g(x)\,dx, \]
which follows from the product rule for differentiation. To apply the method, you
need to recognize the integrand of a given integral as a product of two functions;
one of them, $f(x)$, you differentiate and the other one, $g'(x)$, you try to identify as a
function you can integrate easily. If one choice for $f(x)$ and $g'(x)$ fails to work you
may want to try another. Formulas 17, 19, and 23 in the table can be computed by
a single application of integration by parts. Formulas 21, 24, and 29 are computed
by repeated integration by parts.
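As an illustration of a single application, here is a short worked computation using the rule that the integral of f g' equals f g minus the integral of f' g. The choice of integrand, with f(x) = x and g'(x) = cos ax, is made here only as an example; the result agrees with Formula 31 in the table.

```latex
% Take f(x) = x and g'(x) = \cos ax, so f'(x) = 1 and g(x) = \tfrac{1}{a}\sin ax.
\begin{aligned}
\int x\cos ax\,dx
  &= x\cdot\frac{1}{a}\sin ax - \int 1\cdot\frac{1}{a}\sin ax\,dx \\
  &= \frac{1}{a}\,x\sin ax + \frac{1}{a^{2}}\cos ax .
\end{aligned}
```

Choosing the roles the other way around, with f(x) = cos ax and g'(x) = x, would raise the power of x instead of lowering it, which is the kind of failed choice the paragraph above suggests abandoning.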
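2. $\int\dfrac{dx}{ax + b} = \dfrac{1}{a}\ln|ax + b|$
3. $\int x(ax + b)^n\,dx = \dfrac{(ax + b)^{n+2}}{a^2(n + 2)} - \dfrac{b(ax + b)^{n+1}}{a^2(n + 1)}$, $n \ne -1, -2$
4. $\int\dfrac{x\,dx}{ax + b} = \dfrac{x}{a} - \dfrac{b}{a^2}\ln|ax + b|$
5. $\int\dfrac{x\,dx}{(ax + b)^2} = \dfrac{b}{a^2(ax + b)} + \dfrac{1}{a^2}\ln|ax + b|$
6. $\int\dfrac{dx}{(x - a)(x - b)} = \dfrac{1}{a - b}\ln\left|\dfrac{x - a}{x - b}\right|$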
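7. $\int\dfrac{dx}{(ax + b)(cx + d)} = \dfrac{1}{ad - bc}\ln\left|\dfrac{ax + b}{cx + d}\right|$, $ad - bc \ne 0$
8. $\int\dfrac{x\,dx}{(ax + b)^3} = \dfrac{b}{2a^2(ax + b)^2} - \dfrac{1}{a^2(ax + b)}$
9. $\int x\sqrt{ax + b}\,dx = \dfrac{2(3ax - 2b)}{15a^2}(ax + b)^{3/2}$
10. $\int\dfrac{x\,dx}{\sqrt{ax + b}} = \dfrac{2(ax - 2b)}{3a^2}\sqrt{ax + b}$
11. $\int x^2\sqrt{ax + b}\,dx = \dfrac{2(15a^2x^2 - 12abx + 8b^2)}{105a^3}(ax + b)^{3/2}$
12. $\int\dfrac{dx}{a^2 + x^2} = \dfrac{1}{a}\arctan\dfrac{x}{a}$
13. $\int\dfrac{dx}{a^2 - x^2} = \dfrac{1}{2a}\ln\left|\dfrac{a + x}{a - x}\right|$
14. $\int\dfrac{dx}{x(ax^2 + b)} = \dfrac{1}{2b}\ln\left|\dfrac{x^2}{ax^2 + b}\right|$
15. $\int\sqrt{p^2 - x^2}\,dx = \tfrac12x\sqrt{p^2 - x^2} + \tfrac12p^2\arcsin(x/p)$
16. $\int\sqrt{x^2 \pm p^2}\,dx = \tfrac12x\sqrt{x^2 \pm p^2} \pm \tfrac12p^2\ln\left(x + \sqrt{x^2 \pm p^2}\right)$
17. $\int\dfrac{dx}{\sqrt{p^2 - x^2}} = \arcsin(x/p)$
18. $\int\dfrac{dx}{\sqrt{x^2 \pm p^2}} = \ln\left(x + \sqrt{x^2 \pm p^2}\right)$
22. $\int\ln ax\,dx = x\ln ax - x$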
27. $\int x^2\sin ax\,dx = \dfrac{2x}{a^2}\sin ax + \dfrac{2}{a^3}\cos ax - \dfrac{x^2}{a}\cos ax$
28. $\int\sin^2ax\,dx = \dfrac{x}{2} - \dfrac{\sin 2ax}{4a}$
29. $\int\sin^3ax\,dx = -\dfrac{1}{a}\cos ax + \dfrac{\cos^3ax}{3a}$
31. $\int x\cos ax\,dx = \dfrac{1}{a^2}\cos ax + \dfrac{1}{a}x\sin ax$
32. $\int x^2\cos ax\,dx = \dfrac{2x}{a^2}\cos ax - \dfrac{2}{a^3}\sin ax + \dfrac{x^2}{a}\sin ax$
34. $\int\cos^3ax\,dx = \dfrac{1}{a}\sin ax - \dfrac{\sin^3ax}{3a}$
35. $\int\sin ax\sin bx\,dx = \dfrac{\sin(a - b)x}{2(a - b)} - \dfrac{\sin(a + b)x}{2(a + b)}$, $|a| \ne |b|$
36. $\int\cos ax\cos bx\,dx = \dfrac{\sin(a - b)x}{2(a - b)} + \dfrac{\sin(a + b)x}{2(a + b)}$, $|a| \ne |b|$
37. $\int\sin ax\cos bx\,dx = -\dfrac{\cos(a - b)x}{2(a - b)} - \dfrac{\cos(a + b)x}{2(a + b)}$, $|a| \ne |b|$
42. $\int\sec^2ax\,dx = \dfrac{1}{a}\tan ax$
48. $\int e^{ax}\cos bx\,dx = \dfrac{e^{ax}}{a^2 + b^2}(a\cos bx + b\sin bx)$
ANSWERS TO ODD-NUMBERED EXERCISES
CHAPTER 1: VECTORS
Section 1: Coordinate Vectors
Exercise Set 1 (pgs. 7-8)
1. (a) (-1, 6). (b) (0, 14). (c) (4, -6).
3. (a) (9, -3, 0). (b) (18, 9, -13). (c) (1, 1, -4).
5. 5i - 8j. 7. (1 - 2c)i + (4 - d)j.
9. (a, b) = (3, 2) is the only solution.
11. No a and b satisfy ax + by = (3, 0, 0). The only possibility is c = 5, a = b = 1.
13. (a) -x + 2y - z = 5i. (b) 6x - 2y + z = 5j. (c) -4x + 3y + z = 5k.
15. Let x = (x1, ..., xn) be a vector in ℝ^n and let r and s be real numbers. Then apply the definitions.
17. Apply the definitions. 19. Apply the definitions.
21. By inspection, (-2, 3) = -2e1 + 3e2. 23. (2, -7) = -(5/2)(1, 1) + (9/2)(1, -1).
25. Let x = 2i, y = i - 3j and z = 3i + 2j - 2k.
(a) i = (1/2)x. (b) j = (1/6)x - (1/3)y. (c) k = (11/12)x - (1/3)y - (1/2)z.
27. 700 ink, 90000 paper, 5500 binding. 29. ¼(x(2) + x(8) + x(14) + x(20)).
[Figures: points plotted in the xy-plane, including (-2, 2) and (1, 1).]
5. 2√2. 7. 6.
[Figures for Exercises 7 through 23: sketches of vectors and of combinations such as x - y, x + y, x + 2y, and 2x + y in the plane and in space.]
3. x = t(1, 0, 1) + (1, 2, 2).
[Figures for Exercises 1, 3, 5(a), 15, and 17: sketches of lines and planes.]
1. 10 3. -5
5. (a) 1. (b) 1. (c) π/4.
7. (a) 8. (b) 3. (c) 0.4759 rad ≈ 27.3°.
9. Angle: 0.6147 radians (≈ 35.2°). Distance ≈ l = rθ = 4000(0.6147) ≈ 2459 miles.
11. (a) Positivity. Hint: Sum of squares is nonnegative. (b) Symmetry. Routine check. (c) Additivity. Routine check.
(d) Homogeneity. Routine check.
13. (a) .JJ· (b) (2/3, 2/3, 2/3); (1/3, -5/3, 4/3).
15. (a) - ~- (b) (-9/14, -27/14, 18/14); (37/14, -15/14, -4/14).
17. IA - Bl= .JTIIT, IA - Cl= .Jill, IA - Cl= 5, AB acute, AC right, BC acute.
19. (a) Routine calculation. (b) Routine calculation. (c) 1/√6.
21. Routine calculation. 23. 3120/√11 ≈ 941 watts.
25. The total work done is the same either way: 3750(√3 + 1)/√2 ≈ 7244 foot-pounds.
27. Hint: Consider two cases. 29. Follow the outlined steps.
[Figures for Exercises 1 and 3, and a sketch of the planes y + 2z = 1 and x + y - z = 1.]
19. 3x - 2y + 5z = 9.
21. (a) x - 2y + z = 0. (b) x = 1 or z = 1 and many others. The three points lie on a line parallel to the y-axis so don't
determine a unique plane.
23. 1/√5. 25. 1/√3. (1, 0, -1) is below P.
27. 3/√14. (1, 0, -1) is below P.
To locate the origin relative to P, set x = y = 0 in the equation for P to find out where the plane crosses the z-axis.
This gives 3z = 1, or z = 1/3. Thus, (0, 0, 1/3) is on the plane and (0, 0, 0) is one-third unit below it; i.e., the origin is
below P. It follows that (1, 0, -1) and the origin are on the same side of P.
29. |d|.
5. ./i. 7. 3./3/2.
9. 2x - 5y + 2z = 5. 11. Routine check.
13. (a) a = (-2, l, -2), b = (2, -1, -2), c = (2, 3, -2) are shown on the right with their tails at the apex.
(b} u = a x b = (-4, -8, 0), v = b x c = (8, 0, 8), w = c x a= (-4, 8, 8).
(c) cosa = 0, cos/3 = 1/./i, cosy= 2/3.
15. (a) Routine computation. (b) Hint: Use a little trigonometry. (c) Hint: More trigonometry.
17. Routine computation, 19. 3.
21. V(B) = 17.
[Figures for Exercises 19 and 21.]
23. Follow the steps, 25. (a) Unequal. In (b), (c), (d) pairs are equal.
27. Use the hint in the exercise.
1. C. 3. 3k.
5. 2e1 - ei + 3e3 + 2e.i . 7. (4, l, -2) = -2(1, 2, 3) + (6, 5, 4) .
9.
11. s(-1, 1) + (1, 2), t(5, 7) + (4, 5), intersecting at (3/2, 3/2).
13. f(t) = t(6/√38, -1/√38, -1/√38) + (-5, 3, 4).
15. (a) Hint: Show d(t) is always a scalar multiple of a fixed vector. (b) If p1(t) = tv1 + o1 and p2(t) = tv2 + o2 the
collision is at the time when these are equal if that's positive, otherwise no collision.
17. K and M are parallel and are the same line. L and M are parallel but are not the same line.
19. s(3, -2, 0) + t(0, 2, -5) + (3, 0, 0). 21. s(1, 0, 0) + t(0, 1, 0) + (1, 2, 3).
23. x = t(3, 1, 2). (The values t = 0, 1, -2 give the three points.)
25. x = s(-2, 0, 2) + t(-3, 0, 1) + (1, 2, 3).
27. (a) Angle between a and bis less than the angle between a and c. (b) (0, -1, I). (c) ,fi./2. (d) ./IT/2.
29. 18 units.
31. (a) 6/√13 units. (b) x = t(3, -2) + (3, 0). (c) ±(8/√13, 12/√13).
33. (a) x = t(l, -2). (b) n = (1/0, -2/0), c = -2/0. (c) 0; the origin and (3, 5) are not on the same side
of L.
35. x = t (0, 3, -3) + (8/3, 0, 1/3); (8/3, 2/3, - I /3).
are JO,
44 2
~l 1 , ~1~ , 4, :fi°, 2;}/, current of ¥i ~ 2.32 amps flows in at A and out at B.
1
3. With junctions labelled 11, ... , ls as shown,
let Vi be the voltage at l; . With fixed voltages v1 = 1 and v4 = 0, other voltages are = ti, = f4,
u2 v3
v5 = f.i, L'6 = ;, v1 = ~. vs= f4· Current of¥~ 1.71 amps flows in at 11 and out at ]4.
5. Magnitudes proportional to 2,/5, ../2, ./IO.
7. No. Any resultant (a, b, c) of forces acting in the given directions is a linear combination with positive coefficients, and
must have a > b > c > 0.
9. P2 = ~- - - - 11. Pl = P2 = P3 = P4 = 1/2.
16 15 17 _ 23
13• Pl= 29, P2 = 29, P3 = 29, P4 - 29· 15. r =(-a+ 3, a - 2, -a+ 4, a), for any a.
17. A= !h, B = th, C = ½h,
J. oi -n m-m
(x, y, .r.) = (-t, t + 1, t) = t(- 1, 1, l) + (0, 1, 0).
x+z=0
5• X
3x
+2y
+y=0
= }' (X) = (-1/5)
y 3/5
· 7, y =1 '
x+y=0
9. X = 24/15, y = 13/15, Z = -7/15.
11. (a) (6 ~ =n• (b) X = t(l , 1, 1). (C) X = t(l , 1, 1) + (3/5, -1/5, 0).
13. (a) 01
(0
01 6.), (b) X = 0, (C) X = (0, 1, 0).
0 0
15. i = -3(1, 2) + 2(2, 3), j = 2(1, 2) - (2, 3). 17. (1, 2, 1, 0) + 2(2, -1, 0, 1).
19. (0, 0, 0, -1, 2). 21. V = -a+ 2b.
23. v = -~a - ~b - ~c. 25. t(O, 4, 8) + (-4, -2, 0).
27. Show that if the lines are non-parallel, equations have a unique solution (whether lines intersect or not). Show that if
the lines are parallel, they lie in a plane, and any line that is in the plane and perpendicular to both given lines will do.
29. Hint: A(w - v) =Aw-Av. 31. Hint: A(t1x1 +t2x2) = t1Ax1 +t2Ax2 = t1b1 +t2b2.
I. =; _;). 3. ~ ;) .
5. : :~· . 7. ; I!)·
9. 2 4 .
2 4
11. -3
-5
-1
6
-4)·
-9
:::(
8A r!r!); 8 h,s 3 colomas, A h,s 2 rows
17
, (- I I
9
~: - - :) .
2
-3 3 3 -
19. DC is not defined, D has 2 columns, C has 21. X and Y arc 2-by-3.
3 rows.
23. X is 2-by-2, Y is 2-by-3. 25. X and Y are 2-by-3.
31. (38). -I
-]
33.
-]
-I
_:).
35. AO is defined when O is n-by-p for some p and is then m-by-p . 0 A is defined when O i_s p-by-m for some p and is
' then p-by-n.
37. (a) definition of matrix product.
(b) If c has entries CJ, ••. , c,, and r has entries r1, ... , r11 then M = er has entries mij = c;rj , for i , j = I, . .. , n.
39. A2 =( =! ;). A
3
= (-~; g). p(A) = (-~~ n-
41. A2 =(6 ~ ~).A =(-6
0 0 9
3
0
~
0
~).
27
p(A) = (~
0
~
0
~) -
12
49. (a) For the given numbers, p(x) = x 2 + 2 and A= G =n- A 2 = (-~ -~) so p(A) = 0.
.
(b) Start with A2 = (aca++ c
2
dbe abb + bd)
c + 2
d and go on from there.
3. ( 16/3 -20/3)·
1. (_: -~) - -20/3 40/3
5 -! _
· A -
( 4/11
-3/11
1/11) ( 5/ll)
2/11 ,X= -1/11 . 7. (-1! ~ -~).
2 0 l
9. No inverse; row reduction gives a row of zeros.
25. Use the results of Exercises 21 and 22. 27. Use a trigonometric identity.
Section 5: Determinants
Exercise Set 5A-E (pgs. 98-99)
1. det A = 24, det(2A) = 192. 3. det(2A) = 2n det A.
5. 32. 7. det A= 7, det B = 2, detAB = dct BA = 14.
5. /(ei) = (\
2
), f (ei) = (3~ 2
)-
17. ::: :::-~:w(e~ M~ltl~)at:o:~:~ 1:av(e~te f1)s ~xed and rotates e2 and CJ 90° in the yz-plane.
1
0 0 1 0 0 1
The two products represent rotations of 90° in opposite directions about the z-axis.
9/49 18/49 6/49)
19. 18/49 36/49 12/49 .
( 6/49 12/49 4/49
21. Line r(l, 3). 23. Line t (1, -1).
25. Plane s(I, -2, 0) + t (-2, -1, -5). 27. Both make sense.
29. Only g o f makes sense. 31. a = (/(e1), ... , /(en)).
I - ~... ,
.
y (1.1)
yl(0,1)
, ' I ,,,. , ' ,
, ' , I- I
,, ' ' , (1,0) I
(-,:of,
,',
''
--~ ,
-
(-1,0) , , ,
....
I
I ,
t ,
,,'{1 ,0) X
I
,,
I ,
~,,
(0,·1)
{·1 ,• 1 ) - 1
I
domain' points image points (a=1)
1. G~)- 3. (-~ ~) -
5. One-to-one. f⁻¹(x1, x2, x3, ...) = ½(x1, x2, x3, ...). Domain of f⁻¹ is all of ℝ^∞.
7. Not one-to-one.
9. (g ∘ f)(x1, x2, ...) = (f ∘ g)(x1, x2, ...) = (2x1, 4x2, 6x3, ...).
11. (g ∘ p)(x1, x2, x3, ...) = (0, 2x1, 3x2, 4x3, ...), (p ∘ g)(x1, x2, x3, ...) = (0, x1, 2x2, 3x3, ...).
13. Use the right distributive law and the scalar commutativity laws for matrix multiplication (Theorem 3.2 in Chapter 2).
15. Du(x) = 6x² - 4, xu(x) = 2x⁴ - 4x², D(xu(x)) = 8x³ - 8x, xDu(x) = x(6x² - 4) = 6x³ - 4x.
17. (a) (Dx - xD)u = (xu' + u) - xu' = u.
(c) No. (D² - x²)u = D²u - x²u = u'' - x²u, but (D + x)(D - x)u = (D + x)(u' - xu) =
D(u' - xu) + x(u' - xu) = (u'' - xu' - u) + xu' - x²u = u'' - (1 + x²)u.
19. Hint: Use some basic integration formulas.
21. 1l. 23. (0, 2).
25. 2x 2 + 3x. 21. No. j'(O) is not defined.
29. No. Every function in the image of L has value O at 0.
31. No inverse.
2/3
-7/9)
33. /- 1(y) has matrix 1/~ -2/9 . Domain of 1- 1 is R2 •
( 1/3
35. No inverse.
Section 4: Image and Null-Space
Exercise Set 4A-C (pgs. 130-131)
1. Range: ℝ³. Image: plane through 0 spanned by (1, 0, 1) and (0, 1, 1). Linear, null-space {0}.
3. Range: ℝ³. Image: plane u(1, 0, 2) + v(0, 1, 1) + (0, 0, 1). Not linear.
5. Range and image: ℝ². Linear, null-space t(1, 2, -3).
7. Image: C(-∞, ∞). Not linear.
9. Range: C⁽¹⁾(-∞, ∞). Image: subspace of f with f(0) = 0. Linear, null-space {0}.
11. Image ℝ², null-space: {0}.
13. Image: the plane spanned by (1, 0, 1) and (4, 1, 1). Null-space: the line t(1, 0, -2).
15. (a) t(5, 2). (c) (1, -1); t(5, 2) + (1, -1).
17. Hint: Show that f(x) = (f(e1), ..., f(en)) · x.
19. (a) Use definition of linearity and properties of D.
(b) (D - 1)(x + 1) = (x + 1)' - (x + 1) = 1 - (x + 1) = -x. All solutions: y(x) = ce^x + x + 1.
21. (b) If Gu is the zero function, taking derivatives gives tu(t) = 0 for all t, so u(t) = 0 for t ≠ 0. Since u is continuous,
it is the zero function, so G is one-to-one.
(c) Polynomials of the form x²p(x) with p(x) in P_n.
(d) Polynomials of the form x²p(x) with p(x) in P.
(e) The domain of G⁻¹ is the same as the image of G, and consists of continuously differentiable functions g with
g(0) = g'(0) = 0. For such a g, taking f(t) = g'(t)/t for t ≠ 0 and f(0) = lim_{t→0} g'(t)/t gives a continuous
f = G⁻¹(g).
(f) The constant function 1.
(b) (R 2 )(u(x)) = u(-(-x)) = u(x).
(c) Hint: If u(x) = u(-x) then Ru= u.
(d) Image the odd functions, null-space the even functions.
(e) F} = F,,, F; =
F0 •
(~!
3 9
!) (:)
27 C
= 0. Row reduction shows that the only solution is a = b = c = 0, so the given functions are
linearly independent. Another proof is to multiply the equation by e-x and take the limit as x goes to -oo to show
a = 0, and then similarly show b and then c are 0.
=
11. If a cosx + bsinx = 0 for all x, putting x = 0 and x n/2 gives b = 0 and a = 0, so sinx and cosx are linearly
independent.
13. (b) for e'C, (I, I); for e-x, (l, -1). 15. (b) (1, - I, I).
17. For cos 2 x, (1/2, 0, 0, 1/2, 0); for sin2 x, (1/2, 0, 0, -1/2, 0).
19. Partial answer: A product f (x)g(x) is a linear comhination of terms of the form cos ax cos bx, cos ax sin bx, and
sin ax sin bx, with a ~ p and b ~ q. The trigonometric identity cos ax cos bx =
½cos(a - b )x + ½cos(a + b )x shows
directly that if a> b then cosaxcosbx is in Tp+q· What about other terms? What if a~ b?
21. {e-', e-x }. 23. {cos x, sin x, sin 2x }.
25. Image: {(2, I), (I, 2)}. Null-space: {0}.
1. (n and ( =i) arc associated with;,. = 7 and (-n is associated with;,. = - 5. The others arc not eigenvectors.
and by l - ../2 in the direction of (-./i., 1), with reversal of direction because l - ,Ji < 0.
15. x(t) = 2cie2r - 2c2e-21, y(t) = qe21 + c2e-21.
17. x(t) = -2c1 + 2c2e41 , y(t) = c1 + c2e41 .
Exercise Set 6BC (pgs. 154-155)
1. AJ = ½(l + ./5), A2 = ½O - ./5). Theorem 6.7 guarantees a basis of eigenvectors IR2.
3. A= i, A2 = -i. There is a basis eigenvalues C2 but not in IR2.
5. )q = .Jio, Az = -.Jio and A3 = -1. There is a basis of eigenvectors in IR3.
7. (a) The eigenvalues of Ro are cos 0 ± i sin 0 and are real only when sin 0 = 0, so 0 = 0 or 0 = ,r. For 0 = 0 every
non-zero vector is an eigenvector associated with the eigenvalue l, and for 0 = ,r every non-zero vector is an
eigenvector associated with the eigenvalue -1.
(b) For Ron to be a real multiple of u, it must have either the same or the opposite direction.
17 I 1 ) A = (I+ i.J3 0 )·
· U= ( l+i.J3 l-i.J3' 0 1-i.J3
·1
X
.,
~ u,. M ~ G-1
0
1. To ,erify the ,xis of mtation. ,hock that Au, ~) . Angle is n.
0 -1
3. (a) R= (~
0
~
1
-~).s= ( ~
0 -1
O ~) -
0 0
(b) SR= (
-1
~~ -I)-A,is(l.1.-1 ).
(b) Since >.. = I, f leaves points on the line through u1 fixed. Apply Exercise 7 to the submatrix (: !). In case (b)
of Exercise 7, f is a rotation about the axis u1 ; in case (c) it is a reflection in the plane spanned by u1 and the line
of reflection in the u2u3-plane.
(c) In case (b) of Exercise 7, f is the composition of a rotation with axis u 1 with reflection in the u2u3-plane. In case
(c), f is a reflection in the line of reflection in the u2u3-plane.
1. Dependent. 3. Independent.
5. ( ~~
-1 0
~)-
1
7. C ~ =!)·
-1 3 0 0
0 1 0 0
l~~
1/2
-1/2) 2 2 0 0
9. ( 2 - 1 . 11.
0 0 -1 3
-1/2 1/2 1/2
0 0 0 1
0 0 2 2
1 - 1
-1
1
-1 -2
1
2 -2 .
2
-!)
15. (a) R = (b ~ -~),
0 1 0
S = (cot
sin 0
~
0
si~O).
cos 0
(b) Rotation of angle -0 about the z-axis.
CHAPTER 4: DERIVATIVES
Exercise Set lA-D (pgs. 182-184)
1. f'(t) = (2t, 3t 2), t(s) = s(4, 12) + (5, 9)
3. f'(t) = (2 e~:).
t(s) = 2 + s( :=~) (:=~)
5. f'(t) = (-sint, -2sin2t, -3sin3t, -4sin4t); t(s) = sf'(rr/2) + /(rr/2) = s(-1,0, 3,0) + (0, -1,0, l)
[Figures for Exercises 7, 9, and 11: sketches of curves and tangent vectors, with marked points such as f(-1), f(0), and f(2).]
27. (acoswt+bcosot,asinwt+bsinSt)
[Figures for Exercises 39 and 41, and for Exercises 7 and 9 of the following set: space curves with marked points X(0), X(1/2), X(1) and their projections onto the xy-plane.]
17. v(0) = (20, -40); vo ~ 45 ft/sec; Clark's speed at time of rescue: 122 ft/sec; victim's speed at time of rescue: 80 ft/sec
[Figures for these exercise sets: graphs, level sets, and sections of surfaces, with labels such as image(f) = (0, 2], the level set x² + y² = 4, lines of intersection, and projections onto the xy-plane.]
3. The length of the axes of the ellipsoid of level 2 increases by a factor of ./2; the vertex (saddle point) of the elliptic
paraboloid (hyperbolic paraboloid) of level 1 is (0, 0, - 1/c).
1. Q is an elliptic paraboloid.
[Figures for Exercises 9 through 27: sketches of quadric surfaces, including hyperbolic paraboloids, an elliptic paraboloid, parabolic cylinders, a circular cylinder, and an elliptic cylinder.]
17. fx = 2x/(z 2 + w 2), Jy = -2y/(z2 + w 2), fz = 2z(y2 - x 2 )/(z 2 + w 2)2; f w = 2w(y2 - x2 )/(z 2 + w 2 ) 2
19. ft = 1, Jy = 2z, fz = 2y 21. fvxx = 2/(x + y) 3
29. r(x,y) =-1-x- {l-y+./2
31. r(x, y) =1
f(x,y)=(1-x2..y2)1/2
z
l(x.~
]'
,/
'-
..'···,.' _.,.·
. . . . 1-----.._
. ·--- y ,,__,__ ,. t .·.
---------
~/,/
f(x,y)=exp(-x2-y2) r .,\f__
I
33. (1, l, ./2.)t + (1/2, 1/2, ./2/2) 41. D is the xy-plane; u is not harmonic on D
43. D is the xy-plane; u is harmonic on D 45. D is the xy-plane; u is not harmonic on D
47. Dis the xy-plane with the y-axis deleted; u is harmonic on D
49. D is the xy-plane; u is not harmonic on D
4
f(1,0,t) ,
, 1(0,j
/
X
2
X
-2 fk 1,0,0) 2
}~~,\
4 X '- (e2,2,0) ,-.
5.
7. u is the angle made by the positive x-axis and the projection of the vector f onto the xy-plane; v is the angle made by
the positive z-axis and the vector f .
9. 11.
z
I '·
, I - ' I
,,,.-1 tl ~-~-~'·,~
,Y,(-'1 -2-513).. 1- ' (· 1,2 ,'31.J----..__
'I ' , I , X
1•' I ,•'
'' , I ,{,
-, I
13. (x, y, z) = (cos u cosh v, sin u cosh .v, (1 /..ti.) sinh v) 15. (x, y, z) = (cos u sinh v, sinu sinh v, -cosh v)
z z
x2+y 2-2i2=1, lzl < 1
z2-x2-y 2"1 , zs1
y
. ..,...
(d)
z
x(t) -~
y
t=1t/2
'\
9. 2x + 12y - z = 17
23. (a) ellipses if k > 0, single point (0, 0, 0) if k =0 (b) parabolas
25. F(x,y)=y-x 2
27. h(x) = (x, x 2 )
\ p z g(t)=(t,t,t2)
proje:~~~
onto ~,,"-- .,, t=1
_1/
-~-- --3·
-~;;~-
CHAPTER 5: DIFFERENTIABILITY
Section 1: Limits And Continuity
Exercise Set lA-C (pgs. 224-225)
1.
3. The interior of S is S and, therefore, S is open; the boundary of S is the circle of radius 3 centered at (1, 2).
5. Let S = {(x, y) | 0 < x < 3, 0 < y < 2}. The interior of S is S and, therefore, S is open. The boundary of S consists of
the four line segments l1, l2, l3, l4, where l1 has endpoints (0, 0) and (3, 0); l2 has endpoints (3, 0) and (3, 2); l3 has
endpoints (3, 2) and (0, 2); and l4 has endpoints (0, 2) and (0, 0).
7. The set S = {(x, y) | x² + 2y² < 1} contains all points inside the ellipse E: x² + 2y² = 1. The interior of S is S and,
therefore, S is open. The boundary of S is E.
9. The set S = {(x, y) | x² + y² > 0} is the xy-plane with the origin deleted. The interior of S is S and, therefore, S is
open. The boundary of S consists of the single point (0, 0).
11. The given set S = {(x, y) | x > y} is the region below the line y = x in the xy-plane. The interior of S is S and,
therefore, S is open. The boundary of S consists of all points on the line y = x.
13. Lines and planes in ℝ³ are not open subsets of ℝ³ because no point on a line or a plane is an interior point. For
example, in the case of a line, if x0 is a point on the line then every neighborhood of x0 contains points not on the line.
= (X +;~).the given function can be written as f(x, y) = (X +;;).Thus, the domain space
and the range space are both of dimension 2 (i.e., n = m = 2). The real-valued coordinate functions off are
f1(x, y) =x + 3y and f1(x, y) = 2y.
17. The domain space of f(t) = (t, t 2, t3, t 4 ) has dimension 1, and the range space has dimension 4 (i.e., n = I, m = 4).
The real-valued coordinate functions off a.re f1(t) = t, h(t) = t 2 , h(t) = t 3 and f4(t) = t 4.
19. The domain space and range space of f(x, y, z) = (2x, 2y, 2z) are both of dimension 3 (i.e., n = m = 3). The real
valued coordinate functions of f are f 1 (x, y, z) = 2x, h (x, y, z) = 2 y and h (x, y, z) = 2z.
21. The coordinate functions of the given function f are
f1(x, y) = y/(x^2 + 1) and f2(x, y) = x/(y^2 - 1).
f1 is continuous everywhere on the xy-plane. f2 is continuous on the xy-plane except on the horizontal lines y = ±1,
where lim_{x→x0} f2(x) fails to exist. It follows that lim_{x→x0} f(x) fails to exist for x0 on the horizontal lines y = ±1.
(Note: For points of the form x0 = (x0, ±1), where x0 ≠ 0, lim_{x→x0} f2(x) fails to exist because f2(x) becomes infinitely
large. But for points of the form x0 = (0, ±1), lim_{x→x0} f2(x) fails to exist because its value depends on the direction
from which we approach x0.)
23. There is only one coordinate function of the given function f, namely f itself.
For points x0 on the vertical lines l_n: x = nπ (n a nonzero integer), lim_{x→x0} f(x) fails to exist, but this limit does
exist at all other points in the xy-plane. (Note: f is not continuous at points of the form (0, y0), but lim_{x→(0, y0)} f(x)
does exist and equals 1 + y0.)
25. The coordinate functions of the given function f are f1(u, v) = uv/(1 - u^2 - v^2) and f2(u, v) = 1/(2 - u^2 - v^2). f1
is continuous everywhere on the uv-plane except on the circle C1: u^2 + v^2 = 1. For points x0 on C1, lim_{x→x0} f(x) fails
to exist. f2 is continuous everywhere on the uv-plane except on the circle C2: u^2 + v^2 = 2. For points x0 on C2,
lim_{x→x0} f(x) fails to exist. It follows that f is continuous everywhere on the uv-plane except on the two circles C1 and
C2, and lim_{x→x0} f(x) fails to exist for points x0 on these two circles.
27. The coordinate functions of the given function f are f1(u, v) = 3u - 4v and f2(u, v) = u + 8, both of which are
continuous on the uv-plane. Thus f is continuous on the uv-plane.
29. The given function f is continuous on the xy-plane except possibly at (0, 0). However, it was shown in Example 6 that
lim_{x→(0,0)} f(x, y) fails to exist, so that f can't be continuous at (0, 0) (regardless of how f(0, 0) is defined).
31. For x in R^n, the function f(x) = |x|/(1 - |x|^2) isn't continuous at points x in R^n that are 1 unit from the origin. That
is, f isn't continuous on the n-dimensional unit sphere.
33. (a) The translation T(x, y) = (x, y) + (1, 1) takes each point in the xy-plane and moves it a distance of
|(1, 1)| = √2 units along a line parallel to the line y = x.
(b) Hint: Use Theorem 1.4
35. Hint: Each time you include another open set you may need a smaller ball inside.
37. Hint: Use the definition of length.
39. Hint: Use the triangle inequality.
41. Hint: Assume the contrary and reach a contradiction.
~ )-
1 cosy O )
1. (~ 3. ~ ~ - s;nz .
5. (x 2 + 2x)ex. 7.
w
~ ;
O u
~)-
9. 2; -2: .
(
2x 2y) 2.x
ll. ( 2y
-2y).
2x
13.(~~)- 15. (2 2) .
-1 0)
17 ( h/2 ) .
. -h/2 19. ( ~ -~ .
,. n-n
23. (a) P is the projection of the vector (x, y, z) onto the xy-plane. (b) G~ ~) .
25. (a) f(xo + Y1) = ( tl), f(Xo + Y2) = (~:~). f(Xo + y3) = e:11).
(b) T (x, y) = ( x +; ) .
(c) f(xo+Y1) =:::: (i~i). f(Xo+Y2) =~ (~:D, f(Xo+Y3) ~ (~:n-
27. The 3-by-3 matrix in the definition of f.
29. Hint: Use the definition of the derivative matrix in both parts.
1. (a) The graph of f(x) = x^(1/3) - x, -2 < x < 2, is shown on the right.
(b) The tangent lines l1, l2 and l3 to the graph of f for x0 = 3/4,
x0 = -3/4 and x0 = -1/4 (resp.) are shown on the graph given
in part (a). Their respective points of tangency p1, p2 and p3 are
also shown.
(c) The equation x^(1/3) - x = 0 can be factored as x^(1/3)(1 - x^(1/3))(1 +
x^(1/3)) = 0, from which we obtain the three solutions 1, -1 and 0.
(c) 1, -1, -1 respectively.
(d) This choice gives a solution remote from the starting value.
3. (a) Routine calculation. (b) x3 = x4 = 0.739085133.
5. x9 = x 10 = (0.980222741, 1.993801602, -0.874024343) is one solution.
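The value 0.739085133 quoted in answer 3(b) is the familiar root of cos x = x. The exercise statement is not reproduced here, so the following is only a sketch of Newton's method under the assumption that the equation being solved is cos x - x = 0; the starting value 1.0 and the helper name `newton` are hypothetical choices, not taken from the text.

```python
import math

def newton(f, df, x0, steps):
    """Return the Newton iterates x0, x1, ..., x_steps for f(x) = 0."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / df(x))
    return xs

# Assumed equation: cos x - x = 0; the starting value 1.0 is hypothetical.
iterates = newton(lambda x: math.cos(x) - x,
                  lambda x: -math.sin(x) - 1.0,
                  x0=1.0, steps=6)
for k, x in enumerate(iterates):
    print(k, f"{x:.9f}")
```

With a different starting value the iterates settle at a different index, so the index at which two successive iterates agree depends on the exercise's own choice of x0.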
Chapter 5 Review (pgs. 250-251)
1. (a) Open. (b) Not closed. (d) Equals the set. (e) Nonnegative x and y axes.
3. (c) Neither. (d) The set with the semicircle deleted. (e) The semicircle together with the segment -1 ≤ y ≤ 1 on
the y-axis.
5. (a) Open. (b) Closed. (d) Equals the set. (e) Empty.
7. (a) Open. (b) Not closed. (d) Equals the set. (e) Three parts of planes: z = 3 where x ~ 0 and y ~ 1, y = 1
where x ~ 0 and z ~ 3, x = 0 where y ~ 1 and z ~ 3.
9. (a) The interior of S is the solid unit sphere in R^3 without its "skin," and (b) the boundary of S is its "skin." It follows
that the smallest closed set containing S is the solid unit sphere in R^3 together with its "skin."
11. H ∘ f has no points of discontinuity.
13. The points of discontinuity of H ∘ f are the points on the line y = x.
15. No points of discontinuity.
17. (-2x(x^2 + y^2)^(-2), -2y(x^2 + y^2)^(-2)).
19. (1, -1, 0). 21. (y + w, x + z, y + w, z + x).
23. f_xx = e^x sin y, f_yy = -e^x sin y, f_xy = f_yx = e^x cos y.
25. f_xx = yz e^x, f_yy = f_zz = 0, f_xy = f_yx = z e^x, f_xz = f_zx = y e^x, f_yz = f_zy = e^x.
27. f_xx = 12x^2, f_yy = 6y, f_zz = 2, f_xy = f_yx = f_xz = f_zx = f_yz = f_zy = 0.
33. ( ~ !~ :).
VW tlW UV
|∇f(1, 1)| = √13, u = (2/√13, -3/√13)
9. ∇h(x) = (y sin z, x sin z, xy cos z),
|∇h(x)| = √((x^2 + y^2) sin^2 z + x^2 y^2 cos^2 z),
∇h(1, 2, π) = (0, 0, -2), u = (0, 0, -1)
11. 13. [Figures omitted.]
15. ∇f(x, y) = (y, x + 2y) 17. ∇f(x, y, z) = (2x, 2y, 0).
[Figures omitted.]
19. normal vector (2, 2, 0); tangent plane x + y = 2
21. normal vector e1; tangent hyperplane x 1 = 1 (x 1 is the first-coordinate variable for points in Rn)
23. normal vector (1, 1, 1); tangent plane x + y + z = 3
25. (b) normal vector (-1 - e, -1 - e, 1); tangent plane x + y - z/(1 + e) = 1.
27. F'(0) = 3 29. g'(π) = -1
31. (a) x 2 2
- y2 = 8 (b) y = (x/3) 31 , t ≥ 0 (c) (6, -2) (d) (6, 3) (e) maximum radiation at t = √6; radiation
decreases for t > √6 and is zero for t ≥ 3.
z) ½x
35. If J (x, y, = 2 + + }z 2 !l 37. / (x, y) = 3 + ½x ½y3
1. 3. 5. [Figures omitted: graphs of the implicitly defined functions, including x = -(1 - y^2)^(1/2) and y = -(1 - x^2)^(1/2).]
3. dy/dx(-1, 1) = 1, dx/dy(-1, 1) = 1
5. dy/dx(-1, 1/4) = -1/4, dx/dy(-1, 1/4) = -4
7. (a) dx/dz(1, 1, -1) = -1/2, dy/dz(1, 1, -1) = 3/2 (b) dy/dx(1, 1, -1) = -3, dz/dx(1, 1, -1) = -2
(c) dx/dy(1, 1, -1) = -1/3, dz/dy(1, 1, -1) = 2/3
9. F(x, y, z) = (~ :z
2
:yt) for all three parts.
15. (-1, 0, 1) and (-1, 0, -1) 17. (0, t, t), |t| ≤ √2
19. (1/√3, 1/√3, 1/√3) 21. -1/√2
23. 6 by 6 by 3 25. (5V/6)^(1/3) by (5V/6)^(1/3) by (36V/25)^(1/3)
27. (a) 1 + 1/√2 (b) 1 29. (27/19, -7/19, 7/19, -3/19)
31. Hint: Minimize the function f(x) = (a1 + ... + aN + x)/(N + 1) - (a1 ··· aN x)^(1/(N+1)) for x > 0 and use induction.
33. (a) !hi < I (b) maximum and minimum off on Ch a r e ~ a n d - ~ , respectively. (c) 0 is both the
maximum and minimum value off on C1,
35. The given plane is tangent to the given sphere at the point (1, 1, 1). So, f(1, 1, 1) = 1 is both the maximum and
minimum of f subject to the given conditions.
37. (a) length = 3, width = height = 6 (b) length = height = 3(2)^(1/3), width = 6(2)^(1/3)
[Figure omitted: the arc x^2 + y^2 = 1, x ≤ 0, y < 0, ending at (0, -1).]
5. [Figure omitted.]
13. θ0 = 2π/3, r0 = 1, y = -2√3 x. [Figure omitted: the surface x^2 + (y/2)^2 + (z/2)^2 = 1.]
15. (-2t sin t sin t^2 + cos t cos t^2, 2t sin t cos t^2 + cos t sin t^2, -sin t)
[Figure omitted.]
15. ∂f/∂s(2, 1) = 80
17. (b) dP/dt(0) = -21/16 (c) dP/dt(30) = -3/73
19. z_x(1, 1, 1) = 1, z_y(1, 1, 1) = 2
21. (a), (b) tangent vector: (5, -4, 1)
23. (b) the result of part (a) implies Theorem 1.2.
1. -~- 3. ¥-
67
5 • 28 ' 7. j.
11. ¼-
19. (a) A(B) = ∫_0^2 [∫_{sin πx}^{4x - 2x^2} 1 dy] dx.
(b) ∫_0^2 [∫_{sin πx}^{4x - 2x^2} f(x, y) dy] dx.
21. tJo
25. (b) πa^3. [Figures omitted: cross-sections of the solid.]
27. Let F_n be the value of the expression for n ≥ 2; F_2 = 1/2 and F_n = 7/12 for n ≥ 3.
29. (a) (] - 8112 ) 2 . (b) 4.
1. π. 3. π.
rr X
5. 6 7. ¥-
9. 2. 11. 2π.
13. V(B) = ∫_{-1}^{1} dx ∫_{-2√(1-x^2)}^{2√(1-x^2)} dy ∫_0^{4-4x^2-y^2} dz = 4π.
V(B) = ∫_{-1}^{1} dx ∫_{-2√(1-x^2)}^{2√(1-x^2)} (4 - 4x^2 - y^2) dy.
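The value 4π in answer 13 can be checked numerically. The sketch below evaluates the iterated integral of 4 - 4x^2 - y^2 over -1 ≤ x ≤ 1, -2√(1 - x^2) ≤ y ≤ 2√(1 - x^2) with a midpoint rule in each variable; the function name `volume` and the grid sizes are illustrative choices, not part of the text.

```python
import math

def volume(n=400, m=400):
    """Midpoint-rule value of the iterated integral
    int_{-1}^{1} dx int_{-b(x)}^{b(x)} (4 - 4x^2 - y^2) dy, b(x) = 2 sqrt(1 - x^2)."""
    hx = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * hx
        b = 2.0 * math.sqrt(1.0 - x * x)   # inner limits are -b and b
        hy = 2.0 * b / m
        inner = 0.0
        for j in range(m):
            y = -b + (j + 0.5) * hy
            inner += 4.0 - 4.0 * x * x - y * y
        total += inner * hy * hx
    return total

approx = volume()
print(approx, 4.0 * math.pi)
```

Integrating between the curved limits, rather than over a bounding rectangle with a cutoff, keeps the inner integrand a polynomial in y, which the midpoint rule handles well.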
15. 16π. [Figure omitted: the region B with a typical cylindrical shell.]
7. Approximately (0.48, 0.48).
9. (3/4, 12/5).
11. Hint: Center the ball at the origin.
13. (a) x̄ = (4(a^3 - b^3)/(3π(a^2 - b^2)), 4(a^3 - b^3)/(3π(a^2 - b^2))).
(b) x̄ = (4a/(3π), 4a/(3π)).
15. The centroid of R is on the line of symmetry at distance 4a/(3π)
[Figure omitted: the region R with the points (1, 0, 2), (0, 1, 2), (2, 0, 1), (0, 2, 1) marked, and the projection of R onto the xy-plane.]
(b) If H(x) is the Heaviside function, then H(3 - x - y - z) has the value 1 for points below the plane x + y + z = 3
and is zero elsewhere. The smallest rectangle, call it R0, containing R is 0 ≤ x ≤ 2, 0 ≤ y ≤ 2, 1 ≤ z ≤ 2 (see
figure). We integrate the Heaviside unit step function H(3 - x - y - z) over R0. When Simpson's rule in three
dimensions was applied to this function over R0, the approximation for p = q = r = 50 was 1.176576, which is
accurate to only one decimal place. When the midpoint approximation in three dimensions was applied with
p = q = r = 50, the result was 1.166400, which is accurate to three decimal places. Simpson's rule is usually
better for smooth functions, but not here, since H is not continuous.
(c) The apparent superiority of the midpoint approximation suggested by the result of part (b) compelled us to forgo
the use of the Simpson approximation in favor of the midpoint approximation, which was applied to the function
(x^4 + y^4 + z^4)H(3 - x - y - z) over the rectangle R0 described in part (b). Using p = q = r = 50, the result was
6.679043. In order to check this answer, one can directly compute
∫_R (x^4 + y^4 + z^4) dx dy dz = ∫_1^2 dz ∫_0^{3-z} dy ∫_0^{3-y-z} (x^4 + y^4 + z^4) dx = 1403/210 ≈ 6.680952381.
We see that our midpoint approximation is accurate to only one decimal place (although rounding to two places
produces two-place accuracy).
(Note: In previous exercises, much larger values of p and q were tried than were tried here. The reason
is that the number of operations required to carry out both the Simpson approximation and the midpoint
approximation in dimension 3 is roughly proportional to the cube of the number of subintervals used. Thus, all
things being equal, computation time is much longer in three dimensions than it is in one or two dimensions.)
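The midpoint computations described in parts (b) and (c) can be sketched as follows. This is an illustration under stated assumptions, not the program used for the printed answers: the function name `midpoint_3d` is hypothetical, and the Heaviside factor is implemented as a simple test x + y + z < 3 at each cell midpoint.

```python
from fractions import Fraction

def midpoint_3d(g, p=50, q=50, r=50):
    """Midpoint rule for g(x, y, z) * H(3 - x - y - z) over
    R0: 0 <= x <= 2, 0 <= y <= 2, 1 <= z <= 2."""
    hx, hy, hz = 2.0 / p, 2.0 / q, 1.0 / r
    total = 0.0
    for i in range(p):
        x = (i + 0.5) * hx
        for j in range(q):
            y = (j + 0.5) * hy
            for k in range(r):
                z = 1.0 + (k + 0.5) * hz
                if x + y + z < 3.0:          # Heaviside factor equals 1 here
                    total += g(x, y, z)
    return total * hx * hy * hz

volume = midpoint_3d(lambda x, y, z: 1.0)
moment = midpoint_3d(lambda x, y, z: x**4 + y**4 + z**4)
exact = Fraction(1403, 210)
print(volume, moment, float(exact))
```

For the constant integrand the rule simply counts the cell midpoints lying below the plane; with p = q = r = 50 that count is 36450 cells of volume 0.000032 each, which gives the value 1.166400 quoted in part (b). The loop visits 125,000 cells, which illustrates the cubic growth in work mentioned in the note.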
1. 2/3. 3. e^2 - 3.
[Figures omitted.]
5. 0. 7. 1/2.
13. 20/3. The square S in R^2 of side length 2 centered at (1, 0). [Figures omitted.]
41. (a)
1 [! ~ [1x ] ]
0
1
-~ 0
z dz dy dx. (b) 11 [farccosz[11
0 -arc,:o,: lSCCIJ
]]
z r dr dθ dz. (c) π/16.
43. 1/32.
17. 478/15
[Figures omitted: the vector fields F(x, y) = (y, x) and F(x, y) = (-y, x).]
27. (a) See graph for Exer. 25 in this section (b) πa^2 (c) ab (d) ½ab; the answers are the areas of the enclosed
regions.
29. -18 31. 24
33. (a) [Figure omitted.] (b) Hint: Try the line segment with endpoints (0, 1), (1, 0),
and the half circle joining these points.
[Figures omitted: a vector field plot, and the curve r = 1 + cos θ, 0 ≤ θ ≤ π, with acceleration vectors.]
13. (b) κ_max = 1, κ_min = 0 (c) minimum curvature at x = 0, maximum curvature at x = 56 116 and x = -56 116.
15. (b) 6/(|t|(4 + 9t^2)^(3/2)); ∞
[Figures omitted.]
9. 32π 11. 0
5., 7. [Figures omitted; one shows the surface S: x^2 + 2y^2 + 3z^2 = 1.]
9. Parametrize the border of S by f(t) = (cos t, sin t, 0); ∫_S curl F · dS = -π.
11. (b) Border of S consists of all points on either of the two circles x^2 + y^2 = 1 and x^2 + y^2 = 4. (c) 3π
13. Hint: Use the results of Exercise 25 in Section 3 and Exercise 12 in this section.
15. Hint: Using the coordinate functions of F, compute F'(x) - [F'(x)]^t and curl F(x) × y, for x, y in R^3. Then use the
results of Exercise 33 in Section 4 of Chapter 2.
19. 0 21. F(x, y) = (-y/(x^2 + y^2), x/(x^2 + y^2))
23. (a) Hint: Using coordinate functions, find the equations that must hold in order for curl G = F.
25. G(x, y, z) = (z^2/2 - xy, -yz, c), c constant
27. G(x, y, z) = (-yz - 3xy, -xz, c), c constant
29. Hint: Show that curl G(x) - curl H(x) = curl(G - H)(x) . Then use Theorem 5.4.
31. G(x) = ½tz 2 - xy, x 2 - yz, y2- xz)
33. G(x) = ~(-yz-3xy,3x 2 -xz,2xy)
23. (a) Hint: Use the Fundamental Theorem of Calculus. (b) Use the hint
25. Hint: If y(x) is a solution so is y(x + a). Theorem 1.1 doesn't apply because the derivative of √(1 - y^2) isn't bounded
near y = ±1.
21. (a) Hint: Find dV/dt. (b) r(t) = √(1 - ½t^2). (c) 1.414 hours, about 1 hr, 25 min.
23. (a) S(t) = 150e^(-t/50), t ≥ 0, S is decreasing, and lim_{t→∞} S(t) = 0.
(b) S(t) = 100 + 50e^(-t/100), t ≥ 0, S(t) is decreasing, lim_{t→∞} S(t) = 100.
25. k = 0.
21. Hint: Consider the product of the two slopes. (b) y(x) = C1 x, x ≠ 0; x^2 + y^2 = C2, y ≠ 0, where
C2 = 2K2 > 0.
Direction field for dy/dx = y/x Direction field for dy/dx = -x/y.
29. (a) GM = 96000 mi^3/sec^2. (b) About 6.93 mi/sec ≈ 24900 mi/hr. (c) About 6.20 mi/sec ≈ 22300 mi/hr.
31. (a) Use the hint. (b) Routine calculation. (c) Routine calculation.
33. (a) For bt + c > 0, b = -(a/A)√(g/2). The constants c = -√h(0), b = (a/A)√(g/2) also work. (b) Letting
h(0) = h0, te = (A/a)√(2h0/g). If A = 25π sq ft (tank is 10 feet in diameter), a = π/16 sq ft (hole diameter is 6 in),
h0 = 20 feet, and g = 32.2 ft/sec^2 then te ≈ 446 sec ≈ 7.4 minutes.
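The arithmetic in 33(b) can be reproduced directly from the emptying-time formula te = (A/a)√(2h0/g). The variable names below are illustrative; the numerical data are the ones quoted in the answer.

```python
import math

A = 25.0 * math.pi      # tank cross-section in sq ft (diameter 10 ft)
a = math.pi / 16.0      # hole area in sq ft (diameter 6 in = 0.5 ft)
h0 = 20.0               # initial depth in ft
g = 32.2                # ft/sec^2

t_e = (A / a) * math.sqrt(2.0 * h0 / g)
print(round(t_e), "sec =", round(t_e / 60.0, 1), "min")
```

Since A/a = 400 exactly, the emptying time is 400 times the free-fall time √(2h0/g) from height h0.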
5. -2sinx
7. Char. eq.: r^2 + r - 6 = 0; roots: r1 = 2, r2 = -3; general solution: y(x) = c1 e^(2x) + c2 e^(-3x); particular solution:
11. Char. eq.: r^2 - r = 0; roots: r1 = 0, r2 = 1; general solution: y(x) = c1 + c2 e^x; particular solution:
y(x) = (e - e^x)/(e - 1)
13. Char. eq.: 2r^2 - 3r + 1 = 0; roots: r1 = 1, r2 = 1/2; general solution: y(x) = c1 e^x + c2 e^(x/2); particular solution:
y(x) = 0
15. y'' + 2y' + y = 0, y(x) = c1 e^(-x) + c2 x e^(-x) 17. y'' = 0, y(x) = c1 + c2 x
[Figures omitted: graphs of y(x) = x e^(-x), y(x) = x e^(-x) - e^(-x), and y(x) = 1 + x.]
21. (D^2 + 2D + 1)y = 0; D^2 + 2D + 1 = (D + 1)(D + 1)
31. (b) y(x) = c1 x + c2/x (c) Hint: Apply each operator to the nonzero constant function y(x) = c.
(d) y(x) = c1 ln|x| + c2
33. (b) Hint: Write cosh βx and sinh βx in terms of exponential functions.
37. (a) Hint: Write Newton's equation F = ma in two ways and equate the results. (b) g
(c) y(t) = y0 cosh(√(g/l) t) + v0 √(l/g) sinh(√(g/l) t) (d) Hint: Set v0 = 0 in the solution in part (c) and use an inverse
hyperbolic function to solve y(t) = l for t.
39. (b) Hint: Find the roots of the characteristic equation. (c) Hint: Use l'Hôpital's rule.
33. Hint: Treat the cases of unequal roots and of equal roots separately.
35. (a) Hint: Use the identity sin(a + b) = sin a cos b + cos a sin b
(b) A= 2,0 = rr/6 (c) Hint: Use the cofunction identity sin(rr/2 - a)= cos a.
37. Hint: Break the integral into its real and imaginary parts.
39. y(x) = c1 e^((1+i)x/√2) + c2 e^(-(1+i)x/√2), c1 and c2 are real or complex constants.
41. Hint: routine computation
43. Hint: routine computation
45. (a) Hint: Use the chain rule to find d^2 y_c/dx^2 and dy_c/dx. (b) Hint: Use the chain rule to find d^k y_c/dx^k for
k = 1, ..., n. (c) Hint: Use the fact that y(x + c) and y'(x + c) are differentiable on a - c < x < b - c.
47. Hint: For β ≠ 0, show that if the given functions are not linearly independent on some open interval I then tan βx is
constant on some open subinterval of I. If β = 0 then the given functions are not linearly independent.
Exercise Set 2BC (pgs. 512-513)
[Figure omitted; its labels read: x = 0.58L, inflection point, maximum downward vertical deflection.]
5. gen. sol.: y(X} = qex + c2e-x + xc - X, part. SO}.: y(x) = e - e- + x - X
2 4 4 2
I 1
7. gen. sol.: y(x) = c1 cosx + c2 sinx + x sinx, part. sol.: y(x) = sinx + x sinx
2 2
. 1 12.
9. gen. sol.: y(x) = c1cosx + c2 smx + -xcosx +
4
-x smx, part. sol.: y(x)
4
= -43.smx + 4-xcosx
1 12 .
+ -x smx
4
11. y(x) = ~xex + ctC + c2e-2x 13. y(x) =- ~ieix + c1ix + c2e-ix
y(x)=(3.14)e·2x +(7/4)xe-2x-3,/4+(3,14)x
-2
31. gen. sol.: y(x) = c1 e^(-x) cos x + c2 e^(-x) sin x + (1/5)e^x, part. sol.: y(x) = -(1/5)e^(-x) cos x + (3/5)e^(-x) sin x + (1/5)e^x
1. y1(x) = e^(2x), u(x) = e^(-x) + c1 x + c2 3. y1(x) = x, u(x) = (1/3)x^3 + c1 x^2 + c2
5. y1(x) = e^x, y2(x) = e^(-2x), yp(x) = (1/4)e^(2x), y(x) = c1 e^x + c2 e^(-2x) + (1/4)e^(2x)
7. y1 = cos x, y2 = sin x, yp(x) = cos x ln(cos x) + x sin x, y(x) = c1 cos x + c2 sin x + x sin x + cos x ln(cos x)
9. y1 = 1, y2 = x, yp(x) = x^2 e^x - 4x e^x + 6e^x, y(x) = c1 + c2 x + x^2 e^x - 4x e^x + 6e^x
ll. yp(x)=- 1x 2e·r "
13. )'p(x)=e-~<(a+e-<)(-l+ ·
lnJa+e·'J)
2
15. homo. sols.: y1(x) = e^(-x), y2(x) = e^(-2x); G(x, t) = e^(-(x-t)) - e^(-2(x-t)); part. sol.:
y(x) = 4e^(-x) - 3e^(-2x) + { 0 for x < 1; 1/2 - e^(1-x) + (1/2)e^(2(1-x)) for 1 ≤ x }.
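The piecewise term in answer 15 is what the Green's function G(x, t) = e^(-(x-t)) - e^(-2(x-t)) produces when it is integrated against a forcing term. The sketch below assumes, since the exercise statement is not shown, that the forcing is a unit step switched on at t = 1; under that assumption the forced part is ∫_1^x G(x, t) dt for x ≥ 1, and the code compares a midpoint-rule value of that integral with the closed form 1/2 - e^(1-x) + (1/2)e^(2(1-x)). The function names are hypothetical.

```python
import math

def G(x, t):
    """Green's function quoted in answer 15."""
    return math.exp(-(x - t)) - math.exp(-2.0 * (x - t))

def forced_part(x, n=2000):
    """Midpoint-rule value of int_1^x G(x, t) dt for x >= 1, else 0.
    Assumes a unit-step forcing that switches on at t = 1."""
    if x < 1.0:
        return 0.0
    h = (x - 1.0) / n
    return h * sum(G(x, 1.0 + (k + 0.5) * h) for k in range(n))

def closed_form(x):
    return 0.5 - math.exp(1.0 - x) + 0.5 * math.exp(2.0 * (1.0 - x))

for x in (1.0, 1.5, 2.0, 3.0):
    print(x, forced_part(x), closed_form(x))
```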
17. homo. sols.: y1(x) = 1 and y2(x) = ln x; G(x, t) = t ln x - t ln t; part. sol.: y(x) = ln x + x - 1
19. Hint: Use the hint in the text.
21. (i) G(x, t) = (1/(r1 - r2))(e^(r1(x-t)) - e^(r2(x-t))), (ii) G(x, t) = (x - t)e^(r(x-t)), (iii) G(x, t) = (1/β)e^(α(x-t)) sin β(x - t)
l 2 2 l 3
23. Yp(X) = -2XoX - XQX + 2x ' x/xo > 0
25. Yp(x)= (xo - I)~ + (e-x<i)e2x - xex
27. Hint: Use the hint in the text.
29. y(x) = c1 x + c2 x^(-1), x > 0
31. y(x) = c1 x^(-1) + c2 x^(-1) ln x, x > 0
33. (a) Hint: Expand the given determinant about the first column.
(b) y'' - (2 cot x)y' + (2 cot^2 x + 1)y = 0, 0 < x < π (c) (x - 1)y'' - xy' + y = 0
Exercise Set 4A-C (pgs. 531-540)
1. (a) critically damped (b) x(t) = c1 e^(-t) + c2 t e^(-t)
3. (a) harmonic (b) x(t) = c1 cos 3t + c2 sin 3t
5. (a) underdamped (b) x(t) = e^(-at/2)(c1 cos((a√3/2)t) + c2 sin((a√3/2)t))
[Figures omitted: graphs of x(t) = -(5/2)e^(-2t) + (3/2)e^(-4t) and x(t) = -t e^(-2t).]
l 3 . O
13. xp(t)= cm.t+
10 10 smt, t:=:::5. l
15. (a) h = 6 lbs/ft (b) h ≈ 8.9 kg/m (c) 80 lbs (d) Hooke's Law is not valid over this range of compression.
17. (a) k^2 < 8h (b) m > 1/4 (c) k = √3 (d) 0 < k < 2
19. A = √13; frequency = 1/(2π); φ = arctan(3/2) ≈ 0.9828 radians
21. A = √5; frequency = 1/2; φ = arctan(1/2) ≈ 0.4636 radians
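The amplitude and phase values in answers 19 and 21 follow the usual conversion of a·cos(ωt) + b·sin(ωt) into A·sin(ωt + φ). The sketch below assumes that convention and uses the coefficient pairs (3, 2) and (1, 2), which are hypothetical values chosen only because they reproduce the quoted A and φ; the actual coefficients in the exercises are not shown here.

```python
import math

def amplitude_phase(a, b):
    """Write a*cos(wt) + b*sin(wt) as A*sin(wt + phi); return (A, phi).
    Expanding A*sin(wt + phi) gives a = A*sin(phi) and b = A*cos(phi)."""
    return math.hypot(a, b), math.atan2(a, b)

A19, phi19 = amplitude_phase(3.0, 2.0)   # hypothetical coefficients
A21, phi21 = amplitude_phase(1.0, 2.0)   # hypothetical coefficients
print(A19, phi19)
print(A21, phi21)
```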
23. π/6 radians 25. π/4 radians
29. (a) k = 2mV0/(eE), h = mV0^2/(eE)^2 (b) Hint: By part (a), k/m and h/m depend only on the parameters V0 and E.
31. (a) Hint: Show that the steady-state response is of the form x_p(t) = At cos ω0 t + Bt sin ω0 t (b), (c) Hint: Show that
the amplitude of the steady-state response is |a0|/√((h - ω^2 m)^2 + ω^2 k^2).
33. (a) Hint: Show that the real parts of the roots of the characteristic equation are negative. (b) Hint: See Example 5
in the text. (c) Hint: Write the general solution as the sum of the transient solution and the steady-state solution and
then use the triangle inequality.
x_p(t) = a0/4 + (a1/3) cos t (n = 1)
37. (a) (α - β)/ω (b) α - β - π/2 39. m0 = 9/8 (no oscillation for m = 9/8)
41. m = 1/26 and k = 1/13 43. No such m exists.
H(t)-H(t-1)
25. (b) Hint: Use induction and the result of part (a).
9. y = 2 - √(1 - 2t), t < 1/2 11. y(t) = ln(cosh t), -∞ < t < ∞
13. Both y1(t) = -½t^2 and y(t) = 0 are solutions. This does not contradict Theorem 7.1 because, in this case, the function
f in the theorem is f(t, y, ẏ) = -t^(-2) ẏ^2, which is not continuous at t = 0.
15. (a) y1(t) = sin t (for ÿ = -y), y2(t) = sinh t (for ÿ = y)
(b) [Figure omitted.] (c) y2(it) = i y1(t)
l
17. (b) Hint: Show that (c) Hint: Show that oscillatory motion cannot occur if
2y2(t) is always positive.
1 2
-cos yo+
2z0 = 1
Exercise Set 7C (pg. 561)
1. y(t) = 1 + cos t; the phase curve is (y - 1)^2 + z^2 = 1. [Figures omitted.]
3. y(t) = (1/2)t^2; the phase curve is y = (1/2)z^2. [Figures omitted.]
5. y(t) = 2; the phase curve is the single point (2, 0). [Figures omitted.]
7. (y - 1)^2 + z^2 = C, C ≥ 0 9. y = (1/2)z^2 + C
[Figures omitted.]
9. i(O) ~ 3.67
11. (a) 0(1.24) = 0.1572, 0(6.2) = 0.154875, 0(11.16) = 0.152584 (b) 0, 2.48, 4.96, 7.44, 9.92, 12.40, 14.88
(c) 0, 3.1, 6.18, 9.25, 12.3
13. (a) k(t) = 0.2(1 - e^(-0.1t)), h = 5, m = 1 (b) k = 0, h(t) = 5(1 - e^(-0.2t)), m = 1
17. [Figures omitted: three panels, for parameter values 0, 0.5, and 1.]
21. (b) Hint: e^(±x) > 0 for all real x. (c) x = (1/2) ln(3/2) + i(2n + 1)π/2, n an integer.
25. (a) Hint: Expand (D - r)(D - (r + h))y = 0.
(b) Hint: Find the roots of the characteristic equation.
(c) Hint: Use the definition of d(e^(rx))/dr as the limit of a quotient.
27. z = y1 + 2y2 - 4y3 29. z = 0
31. yes 33. no
35. yes 37. no
7. The vector field F(x, y) = (x + 1, y) 9. The vector field F(x, y, z) = (x, y/2, z/3).
[Figures omitted.]
11. [Figure omitted.]
29. x = (1/2)y^2 + C.
31. (a) x(t) = (z0/k)(1 - e^(-kt)), y(t) = ((kw0 + g)/k^2)(1 - e^(-kt)) - (g/k)t.
(b) Hint: Solve y(t) = 0. (c) Routine calculation. (d) Routine calculation.
33. (a) Hint: Use the differential equations. (b) Hint: p ≠ 0. (c) Hint: Use part (b). (d) t0 = ln 2/(p + q).
Exercise Set 1D (pgs. 584-585)
1. Domain is t < 1/a. 3. No.
5. (a) Use Theorem 1.6. (b) H(x, y) = -(1/2)(x^2 + y^2) (c) Show H is constant on flow lines.
7. Use Theorem 1.6. 9. (a) Routine. (b) Fairly complicated.
11. (a) Routine. (b) (1/t) ∫_0^t div F(T_u(x)) du.
(I t//3) ( /3) 2
2t t
11. A(t) = O 3 , b(t) = -t2/ 3 .
13. (a) Routine calculation. (b) Matrix must be square. (c) Suppose b(t) ≠ 0.
15. Nonlinear. 17. Nonlinear.
19. (x(t), y(t)) = (-c1 sin t + c2 cos t - 1 - t, c1 cos t + c2 sin t + 1 - t).
21. dx/dt = -x + 2y, dy/dt = x - y.
y(t) = -½e' + ¼e4 1 (cos .J,}:-t + (l + ./2) sin .J,/:-t) + ¼e-4 1 (cos {l-t + (-1 + ./2) sin /,/-1).
29. ẋ = u, ẏ = v, u̇ = v, v̇ = -u.
31. ẋ = u, ẏ = v, u̇ = y + e^t, v̇ = -x.
33. x(t) = 0, y(t) = c.
35. x(t) = c e^(t/2), y(t) = c e^(t/2).
37. (a) x(t) = cie..fil + c2e-..fi.1, y(t) = c3 cos ..fit+ c4 sin ..fit.
z(l)= ..fi. (CJ e..fi., - c2e-..fil - c3 sin ..fit + C4 cos ..fit).
w(t) = {1 (c1e..fil - c2e-..fi.1 + c3 sin ..fit - qcos ..fit).
(b) z= 2w, w = 2z.
Section 3: Applications
Exercise Set 3 (pgs. 601-606)
1. (a) dy/dt = (4/100)z - (4/100)y, dz/dt = (1/100)y - (4/100)z.
(b) y(t) = 25e^(-t/50) - 15e^(-3t/50), z(t) = 12.5e^(-t/50) + 7.5e^(-3t/50).
(c) y_max ≈ y(15) ≈ 12. From t = 15 both decrease to zero.
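The system in 1(a) is also a convenient test case for the improved Euler method used in the numerical answers of this chapter. The sketch below integrates it and compares the computed y with the closed form 25e^(-t/50) - 15e^(-3t/50) from part (b). The initial values y(0) = 10, z(0) = 20 are not quoted in the answer; they are inferred from the formula for y and the first differential equation, and the function names and step size are illustrative.

```python
import math

def improved_euler(f, t0, u0, h, steps):
    """Improved Euler (Heun) method for the system u' = f(t, u)."""
    t, u = t0, list(u0)
    path = [(t, tuple(u))]
    for _ in range(steps):
        k1 = f(t, u)
        pred = [ui + h * ki for ui, ki in zip(u, k1)]      # Euler predictor
        k2 = f(t + h, pred)
        u = [ui + 0.5 * h * (s1 + s2) for ui, s1, s2 in zip(u, k1, k2)]
        t += h
        path.append((t, tuple(u)))
    return path

def rhs(t, u):
    y, z = u
    return (0.04 * z - 0.04 * y, 0.01 * y - 0.04 * z)

def y_formula(t):
    return 25.0 * math.exp(-t / 50.0) - 15.0 * math.exp(-3.0 * t / 50.0)

# Initial values y(0) = 10, z(0) = 20 are inferred, not quoted in the answer.
path = improved_euler(rhs, 0.0, (10.0, 20.0), h=0.01, steps=3000)
t_peak, (y_peak, _) = max(path, key=lambda p: p[1][0])
print(t_peak, y_peak, y_formula(t_peak))
```

The computed maximum of y occurs near t = 15 with a value a little above 12, in line with part (c).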
9. (a) t = 100. (b) ẋ = y/(100 + t) - 2x/50, x(0) = 0, ẏ = 2x/50 - y/(100 + t), y(0) = 10.
(c) ẋ + (1/25 + 1/(100 + t))x = 10/(100 + t), x(0) = 0. (d) x(t) = (250/(100 + t))(1 - e^(-t/25)),
y(t) = 10 - (250/(100 + t))(1 - e^(-t/25)).
11. μ1 = √((5 - √5)/2) ≈ 1.1756 and μ2 = √((5 + √5)/2) ≈ 1.9021.
13. μ1 = √((8 - √2)/2) ≈ 1.8146 and μ2 = √((8 + √2)/2) ≈ 2.1696.
15. (a) U(x, y) = (1/2)(k1 + k2)x^2 - k2 xy + (1/2)(k2 + k3)y^2.
(b) Hint: Multiply m1 ẍ = -U_x by ẋ, and m2 ÿ = -U_y by ẏ, then add.
17. Hint: Equate two expressions for the acceleration at the surface.
19. v_e ≈ 1.1086 × 10^4 m/s. 21. Hint: Integrate ẍ = -g.
23. (a) Use the hint. (b) Hint: G(m1r 2>
x0
= !:.
XO
(c) Ve= ..fiv1. (d) About 27.28 days.
25. (a) ω^2 = k/a^3. (b) Hint: aω is the orbital speed. (c) Hint: Let the orbit be r = f(θ) in polar coordinates.
27. (a) Use the hint. (b) Use what worked in part (a).
29. (a) Routine arithmetic. (b) Hint: Differentiate xẏ - yẋ. (c) Hint: Use the chain rule. (d) Hint: For the last part
consider A(t + r) - A(t). (e) Apply Green's Theorem to a region swept out by a radius in time t.
31. Follow the steps. 33. Use the chain rule.
3. (a) Routine calculation. (b) & (c) The table shows the comparison.
  t     IE x      IE y      formula x   formula y
 0.0   2.00000   1.00000   2.00000     1.00000
 0.2   2.70993   1.46714   2.70996     1.46715
 0.4   3.68521   2.10158   3.68528     2.10163
 0.6   5.00851   2.96431   5.00866     2.96442
 0.8   6.78615   4.13513   6.78640     4.13532
 1.0   9.15444   5.71797   9.15484     5.71828
5. The improved Euler method was used with step size h = 0.001. Of the 5,000 values generated, the left-hand table
below records every 500th value.
7. The improved Euler method was used with step size h = 0.001 to compute a numerical approximation of the solution.
Of the 1500 values generated, the right-hand table below records every 150th value.
  t     x(t)          t      x(t)
 0.0   0.500000      0.00   0.000000
 0.5   0.591083      0.15   0.150008
 1.0   0.662880      0.30   0.300243
 1.5   0.720503      0.45   0.451852
 2.0   0.767312      0.60   0.607861
 2.5   0.805669      0.75   0.774373
 3.0   0.837303      0.90   0.962467
 3.5   0.863521      1.05   1.19204
 4.0   0.885335      1.20   1.50098
 4.5   0.903540      1.35   1.97104
 5.0   0.918770      1.50   2.81982
Table for Exercise 5     Table for Exercise 7
9. ẋ = f_x(x, y) = 2x,
ẏ = f_y(x, y) = y.
[Figure omitted: the square with corners (-1, 1), (1, 1), (-1, -1), (1, -1).]
13. (a) The process stops when tank 1 first becomes full at time t = 50 minutes.
ẋ = 3y/(100 + t) - ?x/(50 + t), x(0) = x0,
ẏ = 2 + 3x/(50 + t) - 4y/(100 + t), y(0) = y0, 0 ≤ t ≤ 50.
(b) The improved Euler method with step size h = 0.01 was used to plot the
solutions.
[Figure omitted: tank contents plotted against time t in minutes.]
15. (c) [Figures omitted: panels for a = 0.1, a = 1.0, a = 2.0.]
19. The improved Euler method with step size h = 0.005 was used to plot five phase curves of the hard spring oscillator
=
equation y -yy 3 + 8y for various pairs (y, 8). There are three equilibrium solutions; namely, y = 0, y = ...fy7K and
y = -../ylK, whose phase curves are the three points (0, 0), (../yJK, 0) and (-,./y[h, 0). The results are shown below.
y y y
=
In Exercises 21, 23, and 25, the improved Euler method with step size h 0.001 was used to plot the solution
y = y(t) of the oscillator ji + k(t)j, + h(t)y = sint, y(0) = 0, j,(0) = 1, where the damping factor k(r) and the spring
stiffness h(t) are time dependent. The results are shown below.
21. y 23. y
2
·2
10 20 30
(c) Here, the "PLOT H, P" command was used to plot the trajectory in the HP-plane of the solution found in part (a).
The result is shown below in the figure on the right. The arrows indicate the direction of the trajectory.
The plot of the trajectory of the solution is shown on the right. The arrows indicate the direction of the trajectory.
33. First observe that $H(0) = 3$ for each of the initial-value problems suggested by Exercises 29, 30, 31, 32 in this section. Using the improved Euler method with step size h = 0.001, various values of $t_1$ for the interval $0 \le t \le t_1$ were used to sketch a phase curve of the four solutions corresponding to the initial conditions $H(0) = 3$, $P(0) = P_0$, where $P_0$ = ½, 2, ~. Visually comparing the curves for various values of $t_1$ allowed us to home in on an estimate $t_0$ of the actual orbit time $t^*$ such that $t_0 - 0.01 < t^* < t_0$. Once $t_0$ was determined for a given $P_0$, the "PLOT H, P" command was suppressed and the "PRINT T, H, P" command was used on the time interval $0 \le t \le t_0$. Toward the end of this printout the columns for H(t) and P(t) contained values close to $H(t) = 3$ and $P(t) = P_0$. The corresponding values of t were therefore close to $t^*$. Using this method, the values of $t^*$ for the four orbits were estimated to be
$y(t) = \frac{3}{2}\left(1 + \frac{\sqrt{17}}{15}\right)\cos\omega_1 t + \frac{3}{2}\left(1 - \frac{\sqrt{17}}{15}\right)\cos\omega_2 t$.
11. $(x(t), y(t), z(t)) = (-\sin t, \cos t, -e^{t})$.
13. If $\mathbf{x} = \mathbf{x}_*(t)$ solves $\dot{\mathbf{x}} = \mathbf{F}(\mathbf{x})$, then $\mathbf{x}(t) = \mathbf{x}_*(-t)$ solves $\dot{\mathbf{x}} = -\mathbf{F}(\mathbf{x})$.
15. Hint: From Chapter 5, Section 1, $\nabla f(x, y)$ is orthogonal to a level curve through $(x, y)$.
17. Hint: Show that $\int \nabla f \cdot d\mathbf{x} = \int |\dot{\mathbf{x}}(t)|^2\,dt$, and use the Fundamental Theorem of Calculus for line integrals.
19. Hint: Solve the specific Hamiltonian system explicitly.
-te-1 ) ( et (t + s)e1+s )
e-t = I. (b) 0 e'+• .
~ e2, ) .
2 -1
11. $e^{tA} = \begin{pmatrix} 3e^{2t} - 2e^{3t} & -e^{2t} + e^{3t} \\ 6e^{2t} - 6e^{3t} & -2e^{2t} + 3e^{3t} \end{pmatrix}$.
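A closed form such as the one in Exercise 11 can be checked against the defining power series of the matrix exponential. The sketch below assumes the entries $3e^{2t} - 2e^{3t}$, $-e^{2t} + e^{3t}$, $6e^{2t} - 6e^{3t}$, $-2e^{2t} + 3e^{3t}$, and it assumes the matrix A = [[0, 1], [-6, 5]], which is not quoted from the exercise but obtained by differentiating those entries at t = 0. The helper names are hypothetical.

```python
import math

def mat_mul(X, Y):
    """Product of two 2-by-2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm_series(A, t, terms=60):
    """Approximate e^{tA} by a partial sum of the series sum_k (tA)^k / k!."""
    tA = [[t * a for a in row] for row in A]
    result = [[1.0, 0.0], [0.0, 1.0]]   # k = 0 term (identity)
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        term = mat_mul(term, tA)
        term = [[v / k for v in row] for row in term]   # now equals (tA)^k / k!
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def closed_form(t):
    """Closed-form entries assumed for the Exercise 11 answer."""
    e2, e3 = math.exp(2 * t), math.exp(3 * t)
    return [[3 * e2 - 2 * e3, -e2 + e3],
            [6 * e2 - 6 * e3, -2 * e2 + 3 * e3]]

# Assumed matrix: the derivative of the closed form at t = 0.
A = [[0.0, 1.0], [-6.0, 5.0]]

t = 0.5
S, C = expm_series(A, t), closed_form(t)
max_diff = max(abs(S[i][j] - C[i][j]) for i in range(2) for j in range(2))
print(max_diff)
```

The same comparison works for any 2-by-2 answer in this exercise set once the matrix and the claimed entries are substituted.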
19. $e^{tA} = \frac{1}{2}\begin{pmatrix} e^{t} + e^{2t} & e^{t} - e^{2t} \\ e^{t} - e^{2t} & e^{t} + e^{2t} \end{pmatrix}$
21. (a) Hint: In the series for $e^{itA}$, the even terms are real and the odd terms imaginary.
(b), (c) Hint: $\cos tA = \frac{1}{2}(e^{itA} + e^{-itA})$ and $\sin tA = -\frac{i}{2}(e^{itA} - e^{-itA})$.
23. Hint: Use part (b) of Exercise 21. To show that this is the most general solution, show that $c_1$ and $c_2$ can be chosen to match any given values for $\mathbf{x}(0)$ and $\dot{\mathbf{x}}(0)$.
3. e'A =( ~
0
';'
0
~,e~ :t~i::: );
e 21
x(t) =(
1
;' ) .
0
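5. $e^{tA} = \begin{pmatrix}
\frac{1}{2}(e^{2t} + 1) & \frac{1}{2}(e^{2t} - 1) & \frac{1}{2}(e^{2t} - 1) & 0 \\
e^{t} - \frac{1}{2}(e^{2t} + 1) & e^{t} - \frac{1}{2}(e^{2t} - 1) & -\frac{1}{2}(e^{2t} - 1) & 0 \\
-e^{t} + e^{2t} & -e^{t} + e^{2t} & e^{2t} & 0 \\
e^{t} + te^{t} - \frac{1}{2}(e^{2t} + 1) & te^{t} - \frac{1}{2}(e^{2t} - 1) & -\frac{1}{2}(e^{2t} - 1) & e^{t} + te^{t} - \frac{1}{2}(e^{2t} - 1)
\end{pmatrix}$,
which can also be expressed as
$e^{tA} = e^{t}\begin{pmatrix}
\cosh t & \sinh t & \sinh t & 0 \\
1 - \cosh t & 1 - \sinh t & -\sinh t & 0 \\
-1 + e^{t} & -1 + e^{t} & e^{t} & 0 \\
1 + t - \cosh t & t - \sinh t & -\sinh t & 1 + t - \sinh t
\end{pmatrix}$.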
~t ~ ~
1
7. Exponential matrix is e 31 ( ) .
O I+ t -f
9. (a) Eigenvectors (1, 0) and (0, α − β) are linearly independent if α ≠ β.
(b) Hint: The only solutions of ( ~ ~ ) ( : ) =( ~ ) are multiples of (1, 0).
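(c) $\begin{pmatrix} e^{\alpha t} & \dfrac{e^{\alpha t} - e^{\beta t}}{\alpha - \beta} \\ 0 & e^{\beta t} \end{pmatrix}$; $\begin{pmatrix} e^{\alpha t} & te^{\alpha t} \\ 0 & e^{\alpha t} \end{pmatrix}$.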
21. A-
1
= j(-A 3 +8A 2 - 21A + 221) =( ~
-1
2
0
1
0 -: )
-8
-4
I •
I
0 0 4
23. A- 1
= ,-•(A2 - (I+ 2e')A + (2e' + e")I) =( i 0
e-1
0 e-t
0
-te- 1
)
Exercise Set 2E (pgs. 639-640)
1. A = (~~).  3. $A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}$.
5. Hint: Set t = 0.
7. Hint: Compare the square of the matrix with its value when t is replaced by 2t.
9. Hint: Multiply the linear combination $c_1\mathbf{x}_1 + \cdots + c_n\mathbf{x}_n = 0$, where $\mathbf{x}_k$ is the kth column of A, by $A^{-1}$.
11. Hint: $\mathbf{x}_1(t) = c_2\mathbf{x}_2(t) + \cdots + c_m\mathbf{x}_m(t)$ if and only if $(-1)\mathbf{x}_1(t) + c_2\mathbf{x}_2(t) + \cdots + c_m\mathbf{x}_m(t) = 0$.
13. k = 0, stable center, type V; 0 < k < 2, asymptotically stable spiral, type VI; k = 2, asymptotically stable star, type VIII; k > 2, asymptotically stable node, type III.
15. k = 0, stable center, type V; 0 < k < √8, asymptotically stable spiral, type VI; k = √8, asymptotically stable star, type VIII; k > √8, asymptotically stable node, type III.
17. At a constant equilibrium solution, $\dot{x} = \dot{y} = 0$, so there are infinitely many of them when $ax + by = cx + dy = 0$ has infinitely many solutions, which happens if and only if $\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc = 0$.
19. The trajectories are the same, but traversed in the opposite direction (time reversal).
21. (a) $x(t) = e^{-t/2}\left[c_1\cos\frac{\sqrt{3}}{2}t + c_2\sin\frac{\sqrt{3}}{2}t\right]$,
$y(t) = \frac{1}{2}e^{-t/2}\left[(\sqrt{3}c_2 - c_1)\cos\frac{\sqrt{3}}{2}t - (\sqrt{3}c_1 + c_2)\sin\frac{\sqrt{3}}{2}t\right]$,
$z(t) = c_3e^{-t} + \frac{1}{2}e^{-t/2}\left[(c_1 - \sqrt{3}c_2)\cos\frac{\sqrt{3}}{2}t + (\sqrt{3}c_1 + c_2)\sin\frac{\sqrt{3}}{2}t\right]$.
(b) All terms in the solution have exponential factors that go to zero as t goes to +∞.
(c) The solution for z(t) contains a term $c_3e^{t}$ which is unbounded as t goes to +∞. The equilibrium point (0, 0, 0) is
unstable.
Exercise Set 4B (pgs. 657-658)
1. The solutions of the system of linear equations Ax = 0 form a line of equilibrium solutions of the autonomous system.
3. Hint: At an equilibrium point, $\sigma(y - x) = 0$, $\rho x - y - xz = 0$, and $-\beta z + xy = 0$. Then $x = y$, so $x(\rho - 1 - z) = 0$.
5. (a) Hint: At an equilibrium point, $y + ax(x^2 + y^2) = 0$ and $-x + ay(x^2 + y^2) = 0$. For $a \ne 0$, multiply the first equation by x, the second by y, and add.
(b) The linearized system at (0, 0) is $\dot{x} = y$, $\dot{y} = -x$, with eigenvalues ±i, so it has a stable center at (0, 0).
(c) Hint: With $x = r\cos\theta$ and $y = r\sin\theta$, the chain rule gives $\dot{x} = \dot{r}\cos\theta - \dot{\theta}r\sin\theta$, $\dot{y} = \dot{r}\sin\theta + \dot{\theta}r\cos\theta$. Substitute in the given differential equations, multiply one by $\cos\theta$, the other by $\sin\theta$, and add.
(d) $\theta(t) = -t + c_1$, $r(t) = (c_2 - 2at)^{-1/2}$.
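The polar form in Exercise 5(d) can be sanity-checked numerically. The sketch below assumes the system is $\dot{x} = y + ax(x^2 + y^2)$, $\dot{y} = -x + ay(x^2 + y^2)$, as read off from the equilibrium equations in part (a), and uses the form $r(t) = (c_2 - 2at)^{-1/2}$ obtained by integrating $\dot{r} = ar^3$. The value a = -0.5, the starting point (1, 0), and the use of a classical Runge-Kutta step are all choices made for this illustration.

```python
import math

def rhs(x, y, a):
    """Right-hand side of the assumed system from part (a)."""
    r2 = x * x + y * y
    return y + a * x * r2, -x + a * y * r2

def rk4(x, y, a, h, n):
    """Advance (x, y) by n classical fourth-order Runge-Kutta steps of size h."""
    for _ in range(n):
        k1 = rhs(x, y, a)
        k2 = rhs(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1], a)
        k3 = rhs(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1], a)
        k4 = rhs(x + h * k3[0], y + h * k3[1], a)
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return x, y

a, h, n = -0.5, 0.001, 2000          # illustrative values; final time t = 2
x, y = rk4(1.0, 0.0, a, h, n)        # start at r = 1, theta = 0

t = n * h
r_num = math.hypot(x, y)
theta_num = math.atan2(y, x)
r_exact = (1.0 - 2 * a * t) ** -0.5  # c_2 = 1 / r(0)^2 = 1
theta_exact = -t                     # c_1 = theta(0) = 0

print(r_num, r_exact)
print(theta_num, theta_exact)
```

For a < 0 the radius decays like $t^{-1/2}$ while the angle decreases at unit rate, which is the nonlinear spiral that the linearization in part (b) cannot detect.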
7. (a) $(A, B/A)$. (b) The figure shows the case A = 1, B = 3 with an equilibrium point at (1, 3).
[Figure: phase portrait in the xy-plane.]
9. Equilibrium points are at (0, 0) and on the circle $1 - x^2 - y^2 = 0$. Hint: $\frac{d}{dt}(x^2 + y^2) = 2(x\dot{x} + y\dot{y}) = 0$.
11. Equilibrium points: (-1, 0), (0, 0), (1, 0). (-1, 0) and (1, 0) are stable, with derivative matrix ( - ~ _ ~ ) and eigenvalues -2, -2. (0, 0) is unstable, with derivative matrix ( b _~ ) and eigenvalues 1, -1.
13. (a) $x(t) = \cos t$, $y(t) = \sin t$ is a solution. (b) Evaluate $(d/dt)(y/x)$ in terms of x and y.
(c) Hint: $\tan\theta = y/x$. (e) Hint: $(d/dt)r^2 = (d/dt)(x^2 + y^2) = 2(x\dot{x} + y\dot{y})$.
15. (a) $\dot{x} = y$, $\dot{y} = -x + ay$.
(b) The type of equilibrium at (0, 0) for the linearized system is: for 0 < a < 2, unstable spiral; for a = 2, unstable star; for a > 2, unstable node; for a = 0, stable center; for -2 < a < 0, stable spiral; for a = -2, stable star; for a < -2, stable node.
Chapter 13 Review (pgs. 658-659)
Ji ~ }
2e->..r iei>..r
Theorem 2.4,
ho + b2 b2 bI+ b3 b3 )
tA 2 3 b2 ho - b2 b3 b1 - b3 ,
e =ho+ b1A + b2A + b3A = bi+ b
2 3 bi ho+ b
2
b
2
, where ho, b1, b2, b3 are functions
(
b1 -b1 + 2b3 b2 ho - b2
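satisfying $e^{\lambda t} = b_0 + \lambda b_1 + \lambda^2 b_2 + \lambda^3 b_3$, $e^{i\lambda t} = b_0 + i\lambda b_1 - \lambda^2 b_2 - i\lambda^3 b_3$, $e^{-\lambda t} = b_0 - \lambda b_1 + \lambda^2 b_2 - \lambda^3 b_3$, and $e^{-i\lambda t} = b_0 - i\lambda b_1 - \lambda^2 b_2 + i\lambda^3 b_3$. This gives $b_0, b_1, b_2, b_3$ as linear combinations of $e^{\lambda t}$, $e^{i\lambda t}$, $e^{-\lambda t}$, $e^{-i\lambda t}$ that can be written more compactly as $b_0 = \frac{1}{2}(\cos\lambda t + \cosh\lambda t)$, $b_1 = \frac{1}{4}\lambda^3(\sin\lambda t + \sinh\lambda t)$, $b_2 = -\frac{1}{4}\lambda^2(\cos\lambda t - \cosh\lambda t)$, $b_3 = -\frac{1}{4}\lambda(\sin\lambda t - \sinh\lambda t)$. In terms of initial conditions at t = 0,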
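$x(t) = \{\frac{1}{2}(\cos\lambda t + \cosh\lambda t) - \frac{1}{4}\sqrt{2}(\cos\lambda t - \cosh\lambda t)\}x(0) - \frac{1}{4}\sqrt{2}(\cos\lambda t - \cosh\lambda t)y(0) + \{\frac{1}{4}2^{3/4}(\sin\lambda t + \sinh\lambda t) - \frac{1}{4}2^{1/4}(\sin\lambda t - \sinh\lambda t)\}\dot{x}(0) - \frac{1}{4}2^{1/4}(\sin\lambda t + \sinh\lambda t)\dot{y}(0)$,
$y(t) = -\frac{1}{4}\sqrt{2}(\cos\lambda t + \cosh\lambda t)x(0) + \{\frac{1}{2}(\cos\lambda t + \cosh\lambda t) + \frac{1}{4}\sqrt{2}(\cos\lambda t - \cosh\lambda t)\}y(0) - \frac{1}{4}2^{1/4}(\sin\lambda t - \sinh\lambda t)\dot{x}(0) + \{\frac{1}{4}2^{3/4}(\sin\lambda t + \sinh\lambda t) + \frac{1}{4}2^{1/4}(\sin\lambda t - \sinh\lambda t)\}\dot{y}(0)$, where λ stands for $2^{1/4}$.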
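The four interpolation conditions that determine the coefficient functions can be checked directly. The sketch below assumes the compact forms $b_0 = \frac{1}{2}(\cos\lambda t + \cosh\lambda t)$, $b_1 = \frac{1}{4}\lambda^3(\sin\lambda t + \sinh\lambda t)$, $b_2 = -\frac{1}{4}\lambda^2(\cos\lambda t - \cosh\lambda t)$, $b_3 = -\frac{1}{4}\lambda(\sin\lambda t - \sinh\lambda t)$ with $\lambda = 2^{1/4}$, and verifies that $b_0 + \mu b_1 + \mu^2 b_2 + \mu^3 b_3 = e^{\mu t}$ at each of $\mu = \lambda, i\lambda, -\lambda, -i\lambda$. The function names are hypothetical.

```python
import cmath
import math

lam = 2 ** 0.25   # lambda = 2^(1/4), so lambda^4 = 2

def coeffs(t):
    """Coefficient functions b0..b3 in the compact form assumed above."""
    c, s = math.cos(lam * t), math.sin(lam * t)
    ch, sh = math.cosh(lam * t), math.sinh(lam * t)
    b0 = 0.5 * (c + ch)
    b1 = 0.25 * lam ** 3 * (s + sh)
    b2 = -0.25 * lam ** 2 * (c - ch)
    b3 = -0.25 * lam * (s - sh)
    return b0, b1, b2, b3

def interpolant(mu, t):
    """Evaluate b0 + mu*b1 + mu^2*b2 + mu^3*b3 at a (possibly complex) mu."""
    b0, b1, b2, b3 = coeffs(t)
    return b0 + mu * b1 + mu ** 2 * b2 + mu ** 3 * b3

t = 0.7   # arbitrary sample time chosen for the check
roots = [lam, 1j * lam, -lam, -1j * lam]
errors = [abs(interpolant(mu, t) - cmath.exp(mu * t)) for mu in roots]
print(errors)
```

Because the polynomial matches $e^{\mu t}$ at all four values of μ, the same coefficients give $e^{tA} = b_0 I + b_1 A + b_2 A^2 + b_3 A^3$ when those values are the eigenvalues of A.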
15. (a) Letting $\mathbf{x}_0 = (a_1, \ldots, a_n)$ and $\mathbf{y}_0 = (b_1, \ldots, b_n)$, the system is equivalent to the sequence of systems
$\dot{x}_k = y_k$, $\dot{y}_k = -x_k$, $x_k(0) = a_k$, $y_k(0) = b_k$ for $k = 1, \ldots, n$.
(b) $x_k(t) = a_k\cos t + b_k\sin t$ for $k = 1, \ldots, n$.
(c) The exponential matrix has n 2-by-2 blocks $\begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix}$ on its diagonal and is zero elsewhere.
9. $a_k = \dfrac{k + 3}{(k + 1)3^{k+1}}$
11. Hint: Follow Example 3 in the text. Sum of the infinite series is 1/4.
13. Hint: Use induction. The infinite series diverges to ∞.
15. 6/5 17. I /n
43. $\sum_{k=0}^{\infty} \frac{3}{r}\left(1 - \frac{1}{r}\right)^k = 3$, for $1/2 < r < 1$.
45. (a) Hint: Show that all subintervals remaining after the nth step are of the same length and that the number of such subintervals is twice the number remaining after the (n - 1)st step.
(b) Hint: For a given ε > 0 and any positive integer n satisfying $n > \ln\varepsilon/\ln(2/3)$, show that $0 < (2/3)^n < \varepsilon$.
11. $s_n = \dfrac{n^2 + n - 1}{n^2 + 3n + 1}$, $n = 1, 2, \ldots$, $\lim_{n\to\infty} s_n = 1$
13. (a) Hint: $\sum_{k=1}^{n} a_k$ is a telescoping sum. (b) $a_1 = 2$, $a_k = -\dfrac{1}{k(k - 1)}$, $k \ge 2$, $\sum_{k=1}^{\infty} a_k = 1$
15. e
43. (a) Hint: Follow the given steps. (b) Hint: Use the term test and part (a). (c) Hint: Use part (b) and the fact that $k^{1/k} \ge (1/k)^{1/k}$.
5. Hint: Show that the absolute value of the kth term is dominated by $(A + B)/k^2$ and apply the Weierstrass test.
1. Hint: For the first series, use the first derivative test to find the maximum value of $(1 - x)x^{n+1}$. For the second series, use Exercise 32 in the last section to show pointwise convergence on [0, 1] to a function f(x), and then show that if N is any given positive integer then $|s_N(x) - f(x)|$ can be made arbitrarily close to 1.
• k=O
x2k+I
k"'O
00
n. I: 1!x2k
k=1
11. $\sum_{k=1}^{\infty} \dfrac{(-1)^{k+1}}{k}x^{2k}$
oc
21. 1:<-lh:r - 1l
k=O
25. Hint: Find the Taylor expansions for $1/(1 - x)^2$ and $1/(1 - x)^3$, and then use the partial fraction decomposition of $x(x + 1)/(1 - x)^3$.
27. (a) Hint: Follow the given steps. (b) $1 + 3x + 3x^2 + x^3$ (α = 3), $1 - 3x + 6x^2 - 10x^3$ (α = -3), $1 + \frac{1}{2}x - \frac{1}{8}x^2 + \frac{1}{16}x^3$ (α = 1/2)
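The three expansions in Exercise 27(b) are the first four terms of the binomial series for $(1 + x)^{\alpha}$, whose coefficients satisfy the recurrence $\binom{\alpha}{k} = \binom{\alpha}{k-1}\frac{\alpha - k + 1}{k}$. A short sketch with exact rational arithmetic reproduces them; the function name `binomial_series_coeffs` is a hypothetical helper for this illustration.

```python
from fractions import Fraction

def binomial_series_coeffs(alpha, n_terms):
    """First n_terms coefficients of the binomial series for (1 + x)^alpha."""
    coeffs = [Fraction(1)]
    for k in range(1, n_terms):
        # C(alpha, k) = C(alpha, k - 1) * (alpha - k + 1) / k
        coeffs.append(coeffs[-1] * (alpha - (k - 1)) / k)
    return coeffs

for alpha in (Fraction(3), Fraction(-3), Fraction(1, 2)):
    print(alpha, [str(c) for c in binomial_series_coeffs(alpha, 4)])
```

For α = 3 the recurrence produces a zero coefficient from the x⁴ term onward, which is why that expansion is a finite polynomial, while the other two continue indefinitely.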
29. Hint: Use the change of variable t = x + a.
31. 1 33. 1/2
35. $\sum_{k=0}^{\infty} (-1)^k c_{k+1}x^k$
37. Hint: Shift the index of summation in the series for $f''(x)$ and then add the series for $f''(x)$ and $f(x)$ by combining like powers of x.
7. Hint: Find the first four nonzero terms of y(x) and separate the result into two series.
Exercise Set 7 (pgs. 705-706)
1. Hint: Follow the steps in Example 2 of the text.
3. (a) $y(x) = c_0\sum_{k=0}^{\infty} \dfrac{(-1)^k}{k!}x^{2k}$ (b) $y(x) = ce^{-x^2}$ (c) Yes
~ (-1/ 21:
00
(c) Hint: Show that $xJ_0 = -xJ_0'' - J_0'$ (there is no need to use the series representation of $J_0(x)$ found in part (b))
I oo "> 00 4
2+L
5. (Zk - I) sin(2k + l)x 7. """' - - sin(2k + l)x
· k=O + Ir
62k+l
00 4
9. -~ +L Zk cos(2k + l)x 11.
- k=O ( + I) 2Ir
[Figures for Exercises 11, 13, and 15: graphs plotted for x between -2π and 2π.]
[Figures for Exercises 17 and 19: graphs plotted for x between -2π and 2π.]
21. Hint: Use Equations 8.2 in the text and the linearity of the integral.
23. $\frac{3}{4}\cos x + \frac{1}{4}\cos 3x$   25. $\frac{3}{4}\sin x - \frac{1}{4}\sin 3x$
27. $\cos 3x + \sin 5x$
29. Hint: Straightforward but tedious integration   31. Yes.
x=4n+I,
=~
~
(-2. hr
cos hr - -
2
4
k 2n 2
- sin hr) sin hrx
2 2
k=I
11. $f_e(x) = \begin{cases} 1 - 2n + x, & 2n - 1 \le x \le 2n; \\ 1 + 2n - x, & 2n < x < 2n + 1, \end{cases} = \dfrac{1}{2} + \sum_{k=0}^{\infty} \dfrac{4}{(2k + 1)^2\pi^2}\cos(2k + 1)\pi x$
[Figure: graph of $f_e$.]
13. $f_e(x) = \begin{cases} \sin x, & 2n\pi < x < (2n + 1)\pi; \\ -\sin x, & (2n - 1)\pi \le x \le 2n\pi, \end{cases} = \dfrac{2}{\pi} + \sum_{k=1}^{\infty} \dfrac{-4}{(4k^2 - 1)\pi}\cos 2kx$
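The even extension in Exercise 13 is the function |sin x|, so the cosine series can be checked by comparing partial sums with |sin x| at sample points. The sketch below assumes the series $\frac{2}{\pi} - \sum_{k\ge1} \frac{4}{(4k^2 - 1)\pi}\cos 2kx$; the function name and the sample points are illustrative choices.

```python
import math

def abs_sin_partial_sum(x, n_terms):
    """Partial sum of the assumed cosine series for |sin x|."""
    total = 2 / math.pi
    for k in range(1, n_terms + 1):
        total -= 4 / ((4 * k * k - 1) * math.pi) * math.cos(2 * k * x)
    return total

for x in (-2.0, -1.0, 0.5, 1.0, 3.0):
    approx = abs_sin_partial_sum(x, 2000)
    print(f"{x:5.1f}  {approx:.6f}  {abs(math.sin(x)):.6f}")
```

Since the coefficients decay like $1/k^2$, the series converges uniformly, and the tail after N terms is bounded by $2/((2N + 1)\pi)$, so 2000 terms give agreement to roughly three or four decimal places.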
[Figure: graph of $f_e$.]
x = 2n + I,
00
2 1
~ -e-4k'la
J2
13. u(x, t) = x+ -
L.,k
1r
1
sin2knx,O < x < l,t > 0
- - -
1T k=I
S. (a) v(x) =
2
!2
2
x + c 1x + c2 (b) v(x) = ~ 2 x(x - p) (c) First solve the homogeneous equation a 2u.u = u11 ,
u(0, t) = u(p, t) = 0, and let w(x, t) denote the solution. The solution of the nonhomogeneous problem is then
u(x, t) = w(x, t) + v(x), where v(x) is given in part (b).
7. (a) Hint: Let $t_0$ be a fixed time and for $t > t_0$ let $b = a(t - t_0)$. Show that $U(x + at_0) = U((x - b) + at)$ and $V(x - at_0) = V((x + b) - at)$. (b) Hint: Use the identity $\cos(\alpha - \beta) = \cos\alpha\cos\beta + \sin\alpha\sin\beta$. (c) Hint: Use the identity $2\sin\alpha\cos\beta = \sin(\alpha + \beta) + \sin(\alpha - \beta)$.
9. (a) Hint: routine computation
(c) Hint: Show that G''(0) and G''(p) exist as two-sided derivatives. The condition G'(0) = G'(p) = 0 ensures that G is continuously differentiable on all of ℝ.
11. X" - AX = 0, T" - MT = 0
Chapter 14 Review (pgs. 733-734)
1. $1/(x - 4)$, $4 < x < 6$   3. $1 + 1/x^2$, $x \ne 0$
5. $e^{(x+1)^2}$, $-\infty < x < \infty$   7. $\cos(x - 5)$, $-\infty < x < \infty$
834 Index