
MULTIVARIABLE

MATHEMATICS

FOURTH EDITION

Richard E. Williamson
Dartmouth College

Hale F. Trotter
Princeton University


PEARSON EDUCATION, INC.


Upper Saddle River, New Jersey 07458
Library of Congress Cataloging-in-Publication Data

Williamson, Richard E.
Multivariable mathematics / Richard E. Williamson, Hale F. Trotter. 4th ed.
p. cm.
Includes index.
ISBN 0-13-067276-9
1. Algebras, Linear. 2. Differential Equations. 3. Calculus. I. Trotter, Hale F. II. Title.

QA184.W54 2004
512'.5-dc21 2003049839

Acquisitions Editor: George Lobell


Editor in Chief: Sally Yagan
Vice-President/Director of Production and Manufacturing: David W. Riccardi
Executive Managing Editor: Kathleen Schiaparelli
Senior Managing Editor: Linda Mihatov Behrens
Assistant Managing Editor: Bayani Mendoza de Leon
Production Editor/Interior Designer: Jeanne Audino
Manufacturing Buyer: Michael Bell
Marketing Manager: Halee Dinsey
Marketing Assistant: Rachel Beckman
Director of Creative Services: Paul Belfanti
Art Editor: Tom Benfatti
Art Director: Jayne Conte
Cover Designer: Bruce Kenselaar
Editorial Assistant: Jennifer Brady
Art Studio & Composition: Laserwords Private Limited

Cover Image: Provided by Richard Williamson. It is a trajectory of the three-species system described
in Exercise 34 in Chapter 12, Section 4.

© 2004, 1996, 1979, 1974 by Pearson Education, Inc.


Pearson Education, Inc.
Upper Saddle River, New Jersey 07458

All rights reserved. No part of this book may be
reproduced, in any form or by any means,
without permission in writing from the publisher.

Printed in the United States of America

ISBN 0-13-067276-9

Pearson Education LTD., London


Pearson Education Australia PTY, Limited, Sydney
Pearson Education Singapore, Pte. Ltd
Pearson Education North Asia Ltd., Hong Kong
Pearson Education Canada, Ltd., Toronto
Pearson Educación de México, S.A. de C.V.
Pearson Education-Japan, Tokyo
Pearson Education Malaysia, Pte. Ltd.
CONTENTS

Preface xiii

CHAPTER 1 Vectors 1
1 Coordinate Vectors 1
2 Geometric Vectors 8
A Points and Vectors 8
B Distance and Length 10
C Scalar Multiplication 11
D Vector Addition 13
E Points, Arrows, and Vectors 14
3 Lines and Planes 17
A Lines 18
B Planes 21
4 Dot Products 24
A Lengths and Angles 25
B Properties of x · y and |x| 27
C Unit Vectors and Projections 28
5 Euclidean Geometry 33
A Equations for Lines and Planes 33
B Distance to a Line in ℝ² or a Plane in ℝ³ 35
6 The Cross Product 37

CHAPTER 2 Equations and Matrices 46


1 Systems of Linear Equations 46
A Elimination Method 46
B Applications 52
2 Matrix Methods 59
A Matrix Equations and Elementary Operations 59
B Reduced Matrices 64


C Homogeneous Systems 65
D Geometry of Solution Sets 70
3 Matrix Algebra 74
A Sum and Scalar Multiple 74
B Matrix Multiplication 75
C Identity Matrices 78
D Matrix Polynomials 79
4 Inverse Matrices 81
A Invertibility 81
B Computing Inverses 83
C Special Matrices 85
5 Determinants 88
A Definition 88
B Row and Column Expansions 91
C Basic Properties 92
D Computing Determinants 94
E Invertible Matrices 96

CHAPTER 3 Vector Spaces and Linearity (Optional Chapter) 102
1 Linear Functions on ℝⁿ 102
A Matrix Representation 102
B Composition 107
C Inverse Functions 108
2 Vector Spaces 112
A Examples of Vector Spaces 112
B Subspaces 114
3 Linear Functions 119
A Examples of Linear Functions 120
B Composition and Linear Combination 123
C Inverse Functions 124
4 Image and Null-Space 126
A Image 126
B Null-Space 127
C Nonhomogeneous Equations 129
5 Coordinates and Dimension 131
A Bases and Coordinates 132
B Linear Functions 134
C Dimension Theorems 139
6 Eigenvalues and Eigenvectors 143
A Definitions and Examples 143
B Bases of Eigenvectors 148
C Changing Coordinates 153

7 Inner Products 155


A General Properties 155
B Orthogonal Bases 159
C Rotation and Reflection 168

CHAPTER 4 Derivatives 173


1 Functions of One Variable 174
A Derivatives 174
B Velocity and Speed 179
C Higher-Order Derivatives; Acceleration 181
D Arc Length 181
E Computer Plotting of Space Curves 184
F Vector Integration 185
2 Several Independent Variables 189
A Graph of a Function 189
B Level Sets 190
C Computer-Generated Graphs 193
D Quadric Surfaces 196
3 Partial Derivatives 198
A Definition 198
B Geometric Interpretation 200
C Continuity 201
4 Parametrized Surfaces 205
A Vector Partial Derivatives 205
B Quadric Surfaces 209
C Computer Plotting of Image Surfaces 212

CHAPTER 5 Differentiability 216


1 Limits and Continuity 216
A Neighborhoods 217
B Limits 218
C Continuity 221
2 Real-Valued Functions 225
A Differentiability and Continuity 225
B Tangent Approximations 230
3 Directional Derivatives 232
A Definition 232
B Mean-Value Theorem 235
4 Vector-Valued Functions 237
A Differentiability 237
B The Derivative Matrix 239
C Tangent Approximations 241
5 Newton's Method 245
CHAPTER 6 Vector Differential Calculus 252

1 Gradient Fields 252
A Basic Properties 252
B Chain Rule 254
C Normal Vectors 256
D Plotting Vector Fields; Flow Lines 259
2 The Chain Rule 261
A General Formula and Examples 263
B Changing Variables 271
3 Implicit Differentiation 275
4 Extreme Values 283
A Critical Points 283
B Constraints 286
C Lagrange Multipliers 287
D Saddle Points 291
E Second-Derivative Criterion 293
F Steepest Ascent Method 298
5 Curvilinear Coordinates 303
A Polar Coordinates 303
B Spherical Coordinates 305
C Cylindrical Coordinates 306
D Jacobian Matrices 306

CHAPTER 7 Multiple Integration 312

1 Iterated Integrals 312
A Integration over a Rectangle 312
B Nonrectangular Regions 315
C Higher Dimensions 318
D Solids with Known Sectional Areas 320
2 Multiple Integrals 322
A Definition 322
B Existence 325
C Double Integrals 327
D Triple Integrals 330
E Content and Mass 331
3 Integration Theorems 333
4 Change of Variable 338
A Polar Coordinates 338
B Spherical Coordinates 339
C Cylindrical Coordinates 340
D Jacobi's Theorem 342

5 Centroids and Moments 348


6 Improper Integrals 353
7 Numerical Integration 359
A Midpoint Approximations 360
B Simpson Approximations 361

CHAPTER 8 Integrals and Derivatives on Curves 367

1 Line Integrals 367
A Definition and Examples 367
B Fundamental Theorem of Calculus 374
2 Weighted Curves and Surfaces of Revolution 377
3 Normal Vectors and Curvature 383
4 Flow Lines, Divergence, and Curl 386

CHAPTER 9 Vector Field Theory 397


1 Green's Theorem 397
A Statement and Examples 397
B Changing Paths 402
C Physical Interpretations 404
2 Conservative Vector Fields 409
A Potentials 409
B Path Independence 412
C Derivative Criterion 413
D Indefinite Integration 417
3 Surface Integrals 419
A Normal Vectors 419
B Area and Mass 420
C Integrating Vector Fields 423
D Orientation 426
4 Gauss's Theorem 431
A Statement and Examples 431
B Interpretation of Divergence 435
5 Stokes's Theorem 438
A Statement and Examples 438
B Interpretation of Curl 442
C Simple Connectedness 446
6 The Operators ∇, ∇× and ∇· 449
A Derivative Formulas 449
B Green's Identities 451
C Changing Coordinates 453
CHAPTER 10 First-Order Differential Equations 460
1 Direction Fields 460
A Plotting Direction Fields 461
B Numerical Methods 467
2 Applications 469
A Direct Integration 469
B Separation of Variables 470
3 Linear Equations 480
A Exponential Integrating Factors 481
B Applications 483

CHAPTER 11 Second-Order Equations 490


1 Differential Operators 490
A Examples 490
B Factoring Operators 494
2 Complex Solutions 500
A Complex Exponentials 500
B Higher-Order Equations 507
C Independent Solutions 508
3 Nonhomogeneous Equations 513
A Superposition 513
B Undetermined Coefficients 517
C Variation of Parameters 521
D Green's Functions 525
4 Oscillations 530
A Harmonic Oscillation 531
B Damped Oscillation 532
C Forced Oscillation 535
5 Laplace Transforms 540
6 Convolution 548
7 Nonlinear Equations 553
A Dependent Variable Absent: ÿ = f(t, ẏ) 554
B Independent Variable Absent: ÿ = f(y, ẏ) 555
C Phase Space 558
8 Numerical Methods 562
A Euler's Method 563
B Improved Euler Method 564

CHAPTER 12 Introduction to Systems 571

1 Vector Fields 571
A Geometric Interpretation 571
B Autonomous Systems 575

C Second-Order Equations 576


D Existence, Uniqueness, and Flows (Optional) 580
2 Linear Systems 585
A Elimination Method 585
B Nonstandard Forms 588
3 Applications 594
4 Numerical Methods 606
A Euler's Method 607
B Improved Euler Method 608

CHAPTER 13 Matrix Methods 617


1 Eigenvalues and Eigenvectors 617
A Exponential Solutions 617
B Eigenvector Matrices 620
2 Matrix Exponentials 625
A Definition 625
B Solving Systems 627
C Relationship to Eigenvectors 629
D Computing e^(tA) in Practice 631
E Independent Solutions 637
3 Nonhomogeneous Systems 640
A Solution Formula 640
B Variation of Parameters 641
C Summary of Methods 643
4 Equilibrium and Stability 645
A Linear Systems 647
B Nonlinear Systems 653

CHAPTER 14 Infinite Series 660


1 Examples and Definitions 660
2 Taylor Series 664
A Taylor Polynomials 664
B Convergence of Taylor Series 667
3 Convergence Criteria 670
A Convergence of Sequences 670
B Sums and Multiples of Series 672
C Series with Nonnegative Terms 674
D Absolute Convergence 677
E Alternating Series 680
4 Uniform Convergence 682
5 Power Series 688
A Interval of Convergence 688
B Differentiation and Integration 690

C Finding Limits by Using Series 693


D Products and Quotients 693
6 Differential Equations 696
7 Power Series Solutions 699
8 Fourier Series 706
A Introduction 706
B Orthogonality 707
C Convergence of Fourier Series 709
9 Applied Fourier Expansions 713
A General Intervals 714
B Sine and Cosine Expansions 716
C Differential Equations 718
10 Heat and Wave Equations 721
A One-Dimensional Heat Equation 721
B Steady-State Solutions 725
C One-Dimensional Wave Equation 728

Appendix: Finding Indefinite Integrals 735


I Identity Substitutions 735
II Substitution for the Integration Variable 736
III Substitution for a Part of the Integrand 736
IV Integration by Parts 737
V Integral Table 737

Answers to Odd-Numbered Exercises 741


Index 833
PREFACE

This book covers material that is often studied after a first course in one-variable
calculus, namely the algebra and geometry of vectors and matrices, multivariable
and vector calculus, and differential equations, including systems. The branches
of these three areas are strongly intertwined and we've designed our treatment
to display the connections effectively. Our aim has been to teach basic problem
solving, both pure and applied, in a framework that is mathematically coherent,
while allowing for selective emphasis on traditional rigor.
While the sequence of topics follows rather traditional lines of mathematical
classification, the actual route taken may vary widely from course to course. An
underlying theme is the encouragement of geometric thinking in two and three
dimensions, extended to arbitrary dimension when it's useful to do so. Thus,
most of Chapters 1 and 2 on vectors and matrices is prerequisite for the rest of
the book, but otherwise there is considerable flexibility for course scheduling.
Chapter 3 on linear algebra, with an introduction to general vector spaces and
linear transformations, is included for those who want to cover this material at
some point, but none of it is prerequisite for later chapters. In particular the
material on differentiability in Chapter 5 is organized so that the motivation for
the definition depends on gradient vectors rather than linear transformations.
For this edition the exposition has been completely rewritten in many places
and, in addition to Chapter 3, a number of topics that are optional additions to a
basic course have been added, as follows:
Additional emphasis on scientific applications in Section 1B of Chapter 2.
Subsection on vector integrals in Chapter 4, Section 1.
Subsections on quadric surfaces in Chapter 4.
Subsection on flow lines in Chapter 6, Section 1.
Subsection on use of the chain rule in coordinate changes.
Expanded treatment of the second-derivative criterion for extrema.
Section 5 on centroids and moments in Chapter 7.
Section 6 on application of improper integrals in Chapter 7.
Section 4, Chapter 8 relating flow lines, divergence, and curl.
Subsection on finding potentials in Chapter 9, Section 2.
Additional subsection on flows in Chapter 12, Section 1.
More efficient computation of exponential matrices in Chapter 13, Section 2.


Chapter on infinite series, with applications to differential equations.


Sections in Chapter 4 on computer plotting of curves and surfaces.
Subsection on the steepest ascent method.
Subsection on the midpoint and Simpson rules for multiple integrals.
Subsection on Newton's method for vector functions.
Java applets for the graphical and numerical work.
The figures are a salient feature of the text, including those in the answer section
and the one on the cover, which represents a trajectory of a Lotka-Volterra system
modified for three species and discussed in Exercise 34 in Chapter 12, Section 4.
The impetus for this edition came from George Lobell, whose knowledge
of the field and continued support has helped us a great deal. Allan Gunter
read the entire text, making insightful suggestions and working all the exer-
cises; his collaboration was invaluable. Jeanne Audino's experienced and good
humored oversight of the production has made working out final details a plea-
sure rather than a chore. Corrections can be sent to rew11@dartmouth.edu or
hft@math.princeton.edu.

Richard E. Williamson
Hale F. Trotter

SYLLABUS SUGGESTIONS
The vertical listings by chapter and section are by no means exhaustive but display
variety in emphasis to give some feeling for the book's flexibility. Unlisted
sections or subsections selected from the table of contents can, of course, be
included at an instructor's discretion.

Chapter Number   Basic Calculus   Vector Calculus   Differential Equations   Emphasis on Linear Algebra

1    1-6    1-6    1-6    1-6
2    1-5    1-5    1-5    1-5
3    1-8
4    1 A-D, 2-4    1 A-D, 2-4    1 A-F, 2-4    1 A, B, 2
5    1-3    1-3
6    1, 2, 4, 5    1, 2, 4, 5    1, 2
7    1 A, B, 2 A-D, 4    1 A, B, 2 A-D, 4
8    1
9    1, 3, 4
10    1-3    1-3    1-3
11    1-3    1-3
12    1 A-C, 2, 3    1 A-D    1 A-C, 2, 3    1, 2
13    1    1-3    1-3
14    1, 2    1, 2, 5-7, 9
CHAPTER 1

VECTORS

Originally vectors were conceived of as geometric objects with magnitude and direc-
tion, suitable for representing physical quantities such as displacements, velocities,
or forces. Later on, introduction of a more general algebraic concept of vector uni-
fied and simplified various topics in pure and applied mathematics. This first chapter
introduces vectors in algebraic terms but is chiefly concerned with their geomet-
ric interpretation. The ideas are fundamental for the rest of the book, because the
possibility of visualizing multivariable problems geometrically is one of the major
advantages of using vectors.

SECTION 1 COORDINATE VECTORS


We use the symbol ℝ to denote the set of all real numbers, ℝ² for the set of ordered
pairs (x1, x2) of real numbers, ℝ³ for the set of ordered triples (x1, x2, x3), and in
general ℝⁿ for the set of n-tuples (x1, x2, ..., xn). To save writing subscripts, we
often write general pairs and triples simply as (x, y) and (x, y, z). We'll refer to
pairs, triples, and n-tuples as vectors and denote them by boldface letters x, y, z,
etc., while ordinary lightface letters stand for the single numbers that we sometimes
refer to as scalars when we want to distinguish them from vectors. The term scalar
is common in physics, to distinguish numerical quantities like mass or temperature
that we measure on "scales" of numbers from vector quantities such as velocity or
force that have both magnitude and direction. Though complex numbers don't have
such a direct physical interpretation, we can also use them as scalars, and we have
good reason to do that in Chapters 3, 11, 12, and 13. Exercise 24 on page 43 tells a
little more about the origin of these terms.
For a scalar r and a vector x = (x1, x2, ..., xn), we define the scalar multiple
rx to be the vector that we get by multiplying each entry xk by r, so

rx = (rx1, rx2, ..., rxn).

EXAMPLE 1 If we take x = (1, 2) in ℝ² and r = 3, then

rx = 3(1, 2) = (3, 6).

Similarly, with x = (1, 2, -3) in ℝ³ and r = -2,

rx = -2(1, 2, -3) = (-2, -4, 6).
For two vectors x = (x1, x2, ..., xn) and y = (y1, y2, ..., yn) in ℝⁿ, we define
the sum to be the vector

x + y = (x1 + y1, x2 + y2, ..., xn + yn),

in which each entry xk + yk is the sum of the corresponding entries xk and yk.
Note that the sum is defined only for vectors with the same number of entries. For
example, the sum of (1, 2) and (3, 4, 5) is undefined because corresponding entries
can't be matched up.

EXAMPLE 2 With x = (1, 2) and y = (2, -3) both in ℝ²,

x + y = (1, 2) + (2, -3) = (3, -1).

In ℝ³, with x = (0, 2, 4) and y = (-1, -2, 2), we have

x + y = (0, 2, 4) + (-1, -2, 2) = (-1, 0, 6).


The two basic operations on vectors, scalar multiplication and vector addition, are
extensions of the operations of addition and multiplication of single real numbers,
and in combination they help us express multiple operations concisely.

EXAMPLE 3 If x = (2, -1, 0) and y = (0, -1, -2), then 2x + y expresses the result of three
multiplications and three additions:

2x + y = 2(2, -1, 0) + (0, -1, -2)
       = (4, -2, 0) + (0, -1, -2) = (4, -3, -2).

Similarly, 3x - 2y involves six multiplications and three additions:

3x - 2y = 3(2, -1, 0) - 2(0, -1, -2)
        = (6, -3, 0) + (0, 2, 4) = (6, -1, 4).
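
The two operations translate directly into code. The following Python sketch is only an illustration: the function names scalar_multiple and vector_sum are our own labels rather than notation from the text, and vectors are modeled as plain tuples. It repeats the two computations above and rejects a sum of vectors with different numbers of entries.

```python
def scalar_multiple(r, x):
    """Return rx: every entry of x multiplied by the scalar r."""
    return tuple(r * xk for xk in x)


def vector_sum(x, y):
    """Return x + y, defined only when x and y have the same number of entries."""
    if len(x) != len(y):
        raise ValueError("sum is undefined for vectors of different lengths")
    return tuple(xk + yk for xk, yk in zip(x, y))


x = (2, -1, 0)
y = (0, -1, -2)

# 2x + y and 3x - 2y, with x - y treated as x + (-1)y
two_x_plus_y = vector_sum(scalar_multiple(2, x), y)
three_x_minus_two_y = vector_sum(scalar_multiple(3, x), scalar_multiple(-2, y))

print(two_x_plus_y)         # (4, -3, -2)
print(three_x_minus_two_y)  # (6, -1, 4)
```

Calling vector_sum((1, 2), (3, 4, 5)) raises a ValueError, which mirrors the rule that such a sum is undefined.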

We customarily write -x for the scalar multiple (-1)x, and x - y as an abbreviation
for x + (-y), and use 0 to denote an n-tuple whose entries are all zero. Then
for an arbitrary vector x, x - x = 0, as in

(1, 2, 3) - (1, 2, 3) = (1, 2, 3) + (-1, -2, -3) = (0, 0, 0) = 0.

The notation 0 is ambiguous since 0 may stand for (0, 0) in one formula and for
(0, 0, 0) in another. The ambiguity disappears in context since only one interpretation
will make sense. For instance, if z = (-2, 0, 3), then in the formula z + 0, the 0 must
stand for (0, 0, 0), since addition is defined only for n-tuples with the same number
of entries.
Formulas 1 to 9 below are valid for arbitrary x, y, and z in ℝⁿ and arbitrary
real numbers r, s. They state rules for our new operations of addition and scalar
multiplication very closely analogous to the familiar distributive, commutative, and
associative laws for ordinary addition and multiplication of numbers.

1. rx + sx = (r + s)x
2. rx + ry = r(x + y)
3. r(sx) = (rs)x
4. x + y = y + x
5. (x + y) + z = x + (y + z)
6. x + 0 = x
7. x + (-x) = 0
8. 1x = x
9. 0x = 0

Note that the 0 on the left side of Formula 9 is the real number zero, while the 0 on
the right side is the zero vector in ℝⁿ for some n.
These formulas are straightforward consequences of the definitions of the vector
operations and of the laws of arithmetic. For illustration, we give a formal proof of
the second one.
Proof of Formula 2. Let x = (x1, x2, ..., xn) and y = (y1, y2, ..., yn), and let r
be a real number. Then

rx = (rx1, rx2, ..., rxn),   [definition of scalar multiplication]
ry = (ry1, ry2, ..., ryn),   [definition of scalar multiplication]

so

rx + ry = (rx1 + ry1, rx2 + ry2, ..., rxn + ryn).   [definition of addition]

On the other hand,

x + y = (x1 + y1, x2 + y2, ..., xn + yn),   [definition of addition]

so

r(x + y) = (r(x1 + y1), r(x2 + y2), ..., r(xn + yn)).   [definition of scalar multiplication]

By the distributive law for numbers, r(x1 + y1) = rx1 + ry1, r(x2 + y2) = rx2 + ry2,
etc., so the n-tuples rx + ry and r(x + y) are the same. ∎
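
A numerical spot check is no substitute for a proof, but it can catch a misstated rule. The Python sketch below tests Formulas 1 through 9 on one arbitrary choice of integer vectors in ℝ³ and scalars; the helper names scale and add, and the sample values, are assumptions made for this illustration only.

```python
def scale(r, x):
    """Scalar multiple rx of a tuple x."""
    return tuple(r * xk for xk in x)


def add(x, y):
    """Sum x + y of two tuples with the same number of entries."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of entries")
    return tuple(xk + yk for xk, yk in zip(x, y))


# Arbitrary sample data; integers keep the arithmetic exact.
x, y, z = (1, -2, 3), (4, 0, -5), (-7, 6, 2)
r, s = 3, -2
zero = (0, 0, 0)

checks = {
    1: add(scale(r, x), scale(s, x)) == scale(r + s, x),
    2: add(scale(r, x), scale(r, y)) == scale(r, add(x, y)),
    3: scale(r, scale(s, x)) == scale(r * s, x),
    4: add(x, y) == add(y, x),
    5: add(add(x, y), z) == add(x, add(y, z)),
    6: add(x, zero) == x,
    7: add(x, scale(-1, x)) == zero,
    8: scale(1, x) == x,
    9: scale(0, x) == zero,
}

print(all(checks.values()))  # True
```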
A set with operations of addition and multiplication by scalars defined in such a
way that Formulas 1 through 9 hold is called a vector space, and its elements are
called vectors. We use this more general point of view in Chapter 3, but elsewhere
in the book the term vector may be taken to refer to an element of ℝⁿ.
As with numbers and other algebraic expressions such as polynomials, the commutative
and associative properties of vector addition stated in Formulas 4 and 5 imply
that we can reorder and regroup the terms in a sum without changing the value of
the sum. Consequently we can simply write a sum such as x1 + ··· + xn without
putting in parentheses to show how the terms are grouped, because the grouping
doesn't affect the value. Also, the distributive laws stated for two-term sums in
Formulas 1 and 2 hold for sums of more than two terms.
A sum of scalar multiples a1x1 + ··· + akxk is called a linear combination of the
vectors x1, ..., xk. Formulas 1 through 9 justify manipulating and simplifying linear
combinations in much the same way as other algebraic expressions, as illustrated in
the following example.

EXAMPLE 4 For vectors u, v, and w, the expression

3(2u + v + w) - (u + 2v + 3w)

simplifies to (6 - 1)u + (3 - 2)v + (3 - 3)w = 5u + v.


We take this kind of manipulation for granted from now on, but here is an outline
of a formal justification on the basis of Formulas 1 through 9. By Formula 2 (extended
to 3-term sums),

3(2u + v + w) + (-1)(u + 2v + 3w)
    = 3(2u) + 3v + 3w + (-1)u + (-1)2v + (-1)3w.

By Formula 3, this is equal to 6u + 3v + 3w + (-1)u + (-2)v + (-3)w. Rearranging
and regrouping using Formulas 4 and 5 gives

(6u + (-1)u) + (3v + (-2)v) + (3w + (-3)w).

Applying Formula 1 gives (6 - 1)u + (3 - 2)v + (3 - 3)w = 5u + 1v + 0w, which
simplifies by Formulas 6, 8, and 9 to 5u + v.
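
The simplification can also be checked on concrete vectors. In this Python sketch the function linear_combination and the sample values of u, v, and w are assumptions chosen for illustration; any three vectors with the same number of entries would do.

```python
def linear_combination(coefficients, vectors):
    """Return a1*x1 + ... + ak*xk for scalars a1..ak and equal-length tuples x1..xk."""
    if len(coefficients) != len(vectors):
        raise ValueError("need one coefficient per vector")
    n = len(vectors[0])
    if any(len(v) != n for v in vectors):
        raise ValueError("vectors must have the same number of entries")
    return tuple(
        sum(a * v[k] for a, v in zip(coefficients, vectors)) for k in range(n)
    )


# Arbitrary sample vectors in R^3 (hypothetical values).
u, v, w = (1, 0, 2), (-1, 3, 1), (4, -2, 5)

first = linear_combination((2, 1, 1), (u, v, w))    # 2u + v + w
second = linear_combination((1, 2, 3), (u, v, w))   # u + 2v + 3w
left = linear_combination((3, -1), (first, second)) # 3(2u + v + w) - (u + 2v + 3w)
right = linear_combination((5, 1), (u, v))          # 5u + v

print(left)   # (4, 3, 11)
print(right)  # (4, 3, 11)
```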

The special vectors

e1 = (1, 0), e2 = (0, 1) in ℝ²,

e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1) in ℝ³,

and, in general,

e1 = (1, 0, ..., 0), e2 = (0, 1, ..., 0), ..., en = (0, 0, ..., 1) in ℝⁿ

have the property that if x = (x1, ..., xn) is an arbitrary vector, then

x = x1e1 + x2e2 + ··· + xnen.

Note that these are the only coefficients that express x as a linear combination of
e1, ..., en, for if

x = y1e1 + y2e2 + ··· + ynen,

then xk = yk for k = 1, ..., n. In other words, every vector in ℝⁿ appears in just
one way as a linear combination of the vectors e1, ..., en, and the coefficients in the
linear combination are simply the entries in the vector. Because of these properties,
the set {e1, ..., en} is called the standard basis for ℝⁿ. The numbers x1, ..., xn
are called the coordinates of x with respect to the standard basis, and the vectors
x1e1, ..., xnen are called the components of x with respect to this basis.
In ℝ³ we often write the standard basis vectors as i, j, and k instead of e1, e2, and
e3, and in ℝ² we may use i and j instead of e1 and e2. The ijk-notation appears most
often in geometric and physical applications. The notation ek by itself is ambiguous
because it could in principle refer to a vector in ℝⁿ for any n ≥ k. It will always be
clear from the context how many entries are meant for a vector ek.

EXAMPLE 5 Given a vector x = (x1, x2) in ℝ²,

(x1, x2) = x1(1, 0) + x2(0, 1)
         = x1e1 + x2e2 = x1i + x2j.

In particular, (2, -3) = 2e1 - 3e2 = 2i - 3j.

In ℝ³, we have

(x1, x2, x3) = x1(1, 0, 0) + x2(0, 1, 0) + x3(0, 0, 1)
             = x1e1 + x2e2 + x3e3
             = x1i + x2j + x3k.

In particular, (1, 2, -7) = e1 + 2e2 - 7e3 = i + 2j - 7k.
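
As a small illustration of coordinates and components, the Python sketch below builds the standard basis of ℝⁿ and rebuilds a vector from its components. The function names are our own, and the basis list is indexed from 0, so e[0] plays the role of e1.

```python
def standard_basis(n):
    """Return [e1, ..., en] as tuples; the k-th vector has a 1 in entry k and 0 elsewhere."""
    return [tuple(1 if j == k else 0 for j in range(n)) for k in range(n)]


def scale(r, x):
    """Scalar multiple rx."""
    return tuple(r * xk for xk in x)


def add(x, y):
    """Entry-by-entry sum of two tuples of equal length."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of entries")
    return tuple(xk + yk for xk, yk in zip(x, y))


x = (1, 2, -7)
e = standard_basis(len(x))

# Components of x with respect to the standard basis: x1*e1, x2*e2, x3*e3.
components = [scale(xk, ek) for xk, ek in zip(x, e)]

# Summing the components recovers x.
rebuilt = (0,) * len(x)
for component in components:
    rebuilt = add(rebuilt, component)

print(e)           # [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
print(components)  # [(1, 0, 0), (0, 2, 0), (0, 0, -7)]
print(rebuilt)     # (1, 2, -7)
```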

EXAMPLE 6 The equation

(2, 3) = 2e1 + 3e2

shows (2, 3) written as a linear combination of e1 and e2 in ℝ². The vector (2, 3) is
also a linear combination of (1, 1) and (1, -1) as follows:

(2, 3) = (5/2)(1, 1) - (1/2)(1, -1).

In ℝ³, the equation

(2, 3, 4) = 2e1 + 3e2 + 4e3
          = 4(1, 1, 1) - 1(1, 1, 0) - 1(1, 0, 0)

shows the vector (2, 3, 4) represented as a linear combination of the vectors (1, 1, 1),
(1, 1, 0), and (1, 0, 0).

EXAMPLE 7 To express (1, 3) as a linear combination of (1, 1) and (3, 4), we look for numbers
x and y such that

x(1, 1) + y(3, 4) = (1, 3), or (x + 3y, x + 4y) = (1, 3).

We need to solve the pair of equations

x + 3y = 1
x + 4y = 3

for x and y. Subtracting the first equation from the second gives y = 2. Then setting
y = 2 in the first equation gives x = -5. So (1, 3) equals the linear combination

(1, 3) = -5(1, 1) + 2(3, 4).

EXAMPLE 8 To express (1, 3, 8) as a linear combination of (1, 1, 1) and (3, 4, 5) we look for
numbers x and y such that

(1, 3, 8) = x(1, 1, 1) + y(3, 4, 5)
          = (x + 3y, x + 4y, x + 5y).

Now solve

x + 3y = 1
x + 4y = 3
x + 5y = 8

for x and y. The first two equations are the same as in the previous example, and the
calculations there showed that x = -5 and y = 2 are the only values that satisfy both
equations. Substituting these values in the third equation we find -5 + 5(2) = 5 ≠ 8,
so there are no values for x and y that satisfy all three equations. We conclude that
(1, 3, 8) is not a linear combination of (1, 1, 1) and (3, 4, 5).
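
The reasoning in the last two examples can be sketched in Python. This is an illustration under stated assumptions, not the method of the text: the function name express is hypothetical, and the first two equations are solved by the determinant formula (Cramer's rule) instead of by subtracting equations, with exact arithmetic from the standard fractions module. Any remaining equations are then tested by substitution, as in the example above.

```python
from fractions import Fraction


def express(target, v1, v2):
    """Try to write target = x*v1 + y*v2.

    Solves the first two entry equations for x and y, then checks the
    remaining entries. Returns (x, y), or None if some later equation fails.
    Assumes the first two equations have exactly one solution.
    """
    n = len(target)
    if n < 2 or len(v1) != n or len(v2) != n:
        raise ValueError("need vectors with the same number (at least 2) of entries")
    det = v1[0] * v2[1] - v2[0] * v1[1]
    if det == 0:
        raise ValueError("first two equations do not determine x and y uniquely")
    x = Fraction(target[0] * v2[1] - v2[0] * target[1], det)
    y = Fraction(v1[0] * target[1] - v1[1] * target[0], det)
    for k in range(2, n):
        if x * v1[k] + y * v2[k] != target[k]:
            return None  # inconsistent: target is not a linear combination
    return x, y


print(express((1, 3), (1, 1), (3, 4)))           # (Fraction(-5, 1), Fraction(2, 1))
print(express((1, 3, 8), (1, 1, 1), (3, 4, 5)))  # None
```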

The last two examples show that answering questions about linear combinations
may require solving systems of first-degree equations. Chapter 2 describes routines
for solving such equations with many variables. Equations that come up in examples
and exercises in this chapter will be simple enough to be solved by common-sense
methods as in these examples.
In this book we emphasize applications to geometry and physics in two and three
dimensions, and most of our examples in this chapter involve vectors in ℝ² or ℝ³.
However, we'll be applying the concepts and methods illustrated here to vectors in
ℝⁿ for arbitrary values of n later on. Here is a nongeometric example.

EXAMPLE 9 A model for an economy might use a vector p to represent annual production, with
entries pi giving the year's production for each of n commodities considered in the
model. Thus p would be a vector in ℝⁿ, where n might be as large as several hundred
in an elaborate model. Similarly, vectors c and b might represent annual consumption
and the amount in inventory at the beginning of the year for each commodity.
Then the amounts in inventory at the end of the year would be given by the vector
b + p - c.
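
A minimal Python sketch of this bookkeeping follows. The three-commodity figures are invented purely for illustration and do not come from the text; the function name is likewise our own.

```python
def end_of_year_inventory(b, p, c):
    """Return b + p - c entry by entry: opening inventory plus production minus consumption."""
    if not (len(b) == len(p) == len(c)):
        raise ValueError("b, p, and c must list the same commodities")
    return tuple(bk + pk - ck for bk, pk, ck in zip(b, p, c))


# Hypothetical data for n = 3 commodities.
b = (120, 40, 75)     # inventory at the beginning of the year
p = (500, 210, 330)   # annual production
c = (480, 225, 300)   # annual consumption

print(end_of_year_inventory(b, p, c))  # (140, 25, 105)
```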
Calculations with vectors in ℝⁿ are impractical to do by hand when n is large,
but computers do them efficiently. Interactive programs such as MATLAB, Maple,
and Mathematica, and programming languages such as Basic, C, Pascal, and others,
provide for computations with vectors, often called arrays in the programming
context. The entries x1, x2, ... in a vector x usually appear as x(1), x(2), ... or
x[1], x[2], ..., depending on the programming language. Many languages allow direct
specification of scalar multiplication and addition using mathematical notation; in
others, such as C and Pascal, subroutines are used to carry out these operations.

EXERCISES

1. Let x = (-3, 4) and y = (2, 2). Compute
(a) x + y
(b) 2x + 3y
(c) -x + y - (1, 4)

2. Let u = (2, -1) and v = (-3, 1). Compute
(a) u - 2v
(b) 3u + 2v
(c) 4u + v - (-1, 3)

3. Let x = (3, -1, 0), y = (0, 1, 5), and z = (2, 5, -1). Compute
(a) 3x
(b) 4x - 2y + 3z
(c) -y + (1, 2, 1)

4. Let u = (1, 2, 3), v = (0, -1, 0), w = (-3, 0, 2). Compute
(a) 2u + (0, -3, 1) + w
(b) 2u + 2v - 3w
(c) 3u + 2v + w

In Exercises 5 to 8, write the given vector as a linear combination of i and j in ℝ², or of i, j, and k in ℝ³.

5. 2(1, 2) - 3(-1, 4)
6. (1, 0, 1) + 3(2, 3, -1)
7. (1, 4) - (2c, d)
8. (x, y, z) + (z, y, x)

9. Find numbers a and b such that ax + by = (9, -1, 10), where x = (3, -1, 0) and y = (0, 1, 5). Is there more than one solution?

10. Find numbers a and b such that au + bv = (9, -3, 6), where u = (3, -1, 2) and v = (-6, 2, -4). Is there more than one solution?

11. Show that no choice of numbers a and b can make ax + by = (3, 0, 0), where x and y are as in Exercise 9. For what values of c is ax + by = (3, 0, c) possible?

12. Show that no choice of numbers a and b can make au + bv = (6, 2, 4), where u and v are as in Exercise 10. For what values of c, if any, is au + bv = (6, c, 4) possible? What about au + bv = (6, 2, c)?

13. Let x = i + j, y = 2i + j + k, and z = -2i + j + 2k. Calculate
(a) -x + 2y - z
(b) 6x - 2y + z
(c) -4x + 3y + z

14. Let u = i - j + 3k, v = 2j + k, and w = -2i + 2j - k. Calculate
(a) -gu+tv--low
(b) -!u + 2 v- r\Jw
(c) ju+ !w

15. Write out a proof for Formula 3 on page 3, giving precise justification for each step.

16. Write out a proof for Formula 4 on page 3, giving precise justification for each step.

In Exercises 17 to 20, simplify the given expression, showing which of the Formulas 1 through 9 for vector algebra justify each step.

17. 2(3x - 2y + z) - 4x
18. x + (x + (x + y))
19. ½(x + y) - y
20. (2x + 3y) + (3x - z)

In Exercises 21 to 24, represent the first vector given as a linear combination of the remaining vectors, either by inspection or by solving a system of equations.

21. (-2, 3); e1, e2
22. (2, 0, -5); e1, e2, e3
23. (2, -7); (1, 1), (1, -1)
24. (2, 3, 4); (1, 1, 1), (1, 2, 1), (-1, 1, 2)

25. Let x = 2i, y = i - 3j, and z = 3i + 2j - 2k.
(a) Express i in terms of x.
(b) Express j as a linear combination of x and y.
(c) Express k as a linear combination of x, y, and z.
[The answer to part (a) helps in (b); the answers to (a) and (b) help in (c).]

26. Let u = 2i - 3j + k, v = j + 2k, and w = j - k.
(a) Express j and k as linear combinations of v and w.
(b) Express i as a linear combination of u, v, and w. [First use the answer to part (a) to express u in terms of i, v, and w.]

27. Let x = (5, 500, 10) represent the amount of ink, paper, and binding material needed to produce a single copy of some book and let y = (4, 800, 90) be the same vector for some other book. What does 100x + 50y represent?

28. A small factory produces products of four different kinds. The vectors w = (50, 75, 100, 190) and r = (100, 150, 200, 300) give the wholesale and retail prices in dollars for a single unit of each kind. The vector p = (25, 25, 15, 10) gives the number of units of each product produced in a day.
(a) What vector gives the retailer's profit per unit for the four products?
(b) If the wholesale price vector is doubled, what happens to the retailer's profit vector?
(c) What happens to the retailer's profit vector if the retail prices each increase by 10% and the wholesale prices stay unchanged?
(d) What vector gives the total production for a five-day week?

29. A computer monitors the temperatures recorded by sensors at 50 sites in a building. Suppose that xk(t) is the temperature at the kth site at time t as measured on a 24-hour clock. Then the vector x(t) = (x1(t), ..., x50(t)) represents the profile of temperatures in the entire building at time t. Write an expression in terms of x(t) for the vector that represents the average temperatures for the day at the 50 sites, using the readings at t = 2, 8, 14, and 20.

*30. Give a formal proof that the extension of Formula 2 on page 3 to sums of m terms follows from the given Formulas 1 to 9, that is, prove that for given vectors x1, ..., xm and scalar r,

r(x1 + ··· + xm) = rx1 + ··· + rxm.

Use mathematical induction: starting with m = 2, assume the formula true with m terms and prove that it's true with m + 1 terms.

SECTION 2 GEOMETRIC VECTORS


2A Points and Vectors
We can find geometric representations of ℝ = ℝ¹ as a line, of ℝ² as a plane,
and of ℝ³ as three-dimensional space by using coordinates. To represent elements
of ℝ as points of a line, we specify a point on the line to be identified with the
number 0 and called the origin, a unit of distance, and a direction on the line to
be called positive, with the opposite direction called negative. A positive number
x corresponds to the point that is x units of distance in the positive direction
from the origin, and a negative number x corresponds to the point that is |x|
units from the origin in the negative direction. The number line is usually pictured
as horizontal with the positive direction to the right. With this standard convention,
we obtain the familiar Figure 1.1 in which the arrow indicates the positive
direction.
In the plane, we pick an origin, a unit of distance, and a pair of perpendicular
lines called axes through the origin, with a positive direction on each axis. Given a
vector in $\mathbb{R}^2$, that is, a pair $\mathbf{x} = (x_1, x_2)$ of numbers, the procedure
described in the preceding paragraph determines a point $(x_1, 0)$ on the first axis
corresponding to the number $x_1$ and a point $(0, x_2)$ on the second axis corresponding to
the number $x_2$. The projection of a point $\mathbf{p}$ on a line $L$ is defined as the foot of
the perpendicular from $\mathbf{p}$ to $L$, unless $\mathbf{p}$ happens to be a point of $L$, in
which case the projection of $\mathbf{p}$ on $L$ is $\mathbf{p}$ itself. Thus the vector
$\mathbf{x} = (x_1, x_2)$ corresponds to the point in the plane whose projection on the first
axis is $(x_1, 0)$ and whose projection on the second axis is $(0, x_2)$.

FIGURE 1.1 Oriented line.

The conventional choice is to take the first axis horizontal with the positive direction
to the right, and the second axis vertical with the positive direction upward.
This leads to the usual picture shown in Figure 1.2(a), where the pairs $(-3, 2)$ and
$(2, 1)$ appear along with their projections on the axes.
FIGURE 1.2 (a) Positions. (b) Directed arrows.

Here is an alternative geometric interpretation for elements of $\mathbb{R}$, $\mathbb{R}^2$,
and $\mathbb{R}^3$. What we do is draw an arrow from the origin to a point $x$ in $\mathbb{R}$,
or to $(x_1, x_2)$ in $\mathbb{R}^2$, or to $(x_1, x_2, x_3)$ in $\mathbb{R}^3$. In $\mathbb{R}$ a
positive number $x$ corresponds to an arrow pointing to the right as in Figure 1.1, while if
$x$ is negative it corresponds to an arrow pointing to the left.
Similarly, a pair $\mathbf{x} = (x, y)$ in $\mathbb{R}^2$ corresponds to an arrow from the origin
to the point with coordinates $(x, y)$, as in Figure 1.2(b), which shows arrows corresponding
to $(-3, 2)$ and $(2, 1)$. You can get from the tail at $(0, 0)$ to the tip of the arrow for
$(x, y)$ by going to $x$ on the horizontal axis and then to $(x, y)$ vertically.
When referring to the point or vector that represents an element $\mathbf{x} = (x, y)$ of
$\mathbb{R}^2$, we usually say simply "the point $\mathbf{x}$" or "the vector $\mathbf{x}$" instead
of the more cumbersome "the point with coordinates $\mathbf{x}$" or "the vector with
coordinates $\mathbf{x}$."
Everything we have said about points and vectors in the plane extends to three
dimensions. Choose an origin, three perpendicular axes through it, and a positive
direction on each. If $\mathbf{x} = (x_1, x_2, x_3)$ is a triple in $\mathbb{R}^3$, the point
$\mathbf{x}$ with coordinates $(x_1, x_2, x_3)$ is then the point whose projections on the three
axes correspond to the numbers $x_1$, $x_2$, and $x_3$, and the geometric vector with coordinates
$(x_1, x_2, x_3)$ is the arrow from the origin to this point.
It's customary to take the first two axes in a horizontal plane and the third axis
vertical, labeled as in Figure 1.3, with the positive directions as shown. Some people
interchange the $x_1$ and $x_2$ axes, but the convention illustrated in Figure 1.3 is more
commonly used and is the one we'll follow. Points and geometric vectors are closely
related; every point has associated with it a position vector, defined as the vector
represented by the arrow from the origin to the point.
Direct geometric visualization is possible only in dimensions up to three, so many
of our examples will be set in $\mathbb{R}^2$ or $\mathbb{R}^3$. With the exception of the cross
product described in Section 6, the algebraic equivalents of geometric concepts that we
introduce are valid in $\mathbb{R}^n$ for all values of $n$, allowing us to extend the geometric
concepts to higher dimensions in ways that are essential for applying geometric
reasoning to problems with many variables. In particular, Sections 3 and 4 will
extend and justify the intuitive geometric ideas of line and perpendicularity that we
used to introduce axes and projections onto axes.

FIGURE 1.3 (a) Position. (b) Directed arrow.

FIGURE 1.4 (a) $d$ in $\mathbb{R}^2$. (b) $d$ in $\mathbb{R}^3$.
2B Distance and Length
We define the distance between the points $\mathbf{x} = (x_1, \ldots, x_n)$ and
$\mathbf{y} = (y_1, \ldots, y_n)$ in $\mathbb{R}^n$ to be

$$\sqrt{(x_1 - y_1)^2 + \cdots + (x_n - y_n)^2}.$$

This definition agrees with the usual notion of distance in $\mathbb{R}^1$, $\mathbb{R}^2$, and
$\mathbb{R}^3$. When $n = 1$, $\sqrt{(x_1 - y_1)^2} = |x_1 - y_1|$, the absolute value of
$x_1 - y_1$, which is the natural distance between points $x$ and $y$ in $\mathbb{R}$. Application
of the theorem of Pythagoras to the right triangles in Figures 1.4(a) and (b) shows that the
formula defines in $\mathbb{R}^2$ and $\mathbb{R}^3$ the usual geometric distance. This agreement
motivates the definition. We'll see in Section 4 that distance defined here has other basic
properties of distance that we would want in $\mathbb{R}^n$.

EXAMPLE 1 In $\mathbb{R}$, the distance between 2 and $-5$ is $|2 - (-5)| = 7$. In
$\mathbb{R}^2$, the distance between $(2, 4)$ and $(-3, 1)$ is

$$\sqrt{(2 - (-3))^2 + (4 - 1)^2} = \sqrt{5^2 + 3^2} = \sqrt{34}.$$

In $\mathbb{R}^3$, the distance between $(1, -1, 2)$ and $(-1, 1, 1)$ is

$$\sqrt{(1 - (-1))^2 + (-1 - 1)^2 + (2 - 1)^2} = \sqrt{2^2 + 2^2 + 1^2} = 3.$$
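The distance formula translates directly into a few lines of code. The following is a minimal sketch, not part of the text; the function name `distance` is our own choice, and points are assumed to be given as equal-length tuples of numbers.

```python
import math

def distance(x, y):
    """Distance between two points of R^n given as equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("points must have the same number of coordinates")
    # Square the coordinate differences, add them up, and take the square root.
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

print(distance((2,), (-5,)))              # the case n = 1
print(distance((2, 4), (-3, 1)))          # square root of 34
print(distance((1, -1, 2), (-1, 1, 1)))   # the case n = 3
```

The three calls reproduce the three computations of Example 1, and the same function works unchanged for any number of coordinates.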


The length of a vector $\mathbf{x} = (x_1, \ldots, x_n)$ is denoted by $|\mathbf{x}|$ and defined
to be

$$|\mathbf{x}| = \sqrt{x_1^2 + \cdots + x_n^2}.$$

The length is equal to the distance from the origin to the point $\mathbf{x}$ and is the
geometric length of an arrow representing the vector $\mathbf{x}$. The distance between points
$\mathbf{x}$ and $\mathbf{y}$ is

$$|\mathbf{x} - \mathbf{y}| \quad\text{or}\quad |\mathbf{y} - \mathbf{x}|,$$

since these two numbers are both equal to

$$\sqrt{(x_1 - y_1)^2 + \cdots + (x_n - y_n)^2}.$$

EXAMPLE 2 (a) The length of $(1, -2, 1)$ in $\mathbb{R}^3$ is

$$|(1, -2, 1)| = \sqrt{1^2 + (-2)^2 + 1^2} = \sqrt{6}.$$

(b) In $\mathbb{R}^4$, the length of $(1, 2, 5, -1)$ is

$$|(1, 2, 5, -1)| = \sqrt{1^2 + 2^2 + 5^2 + (-1)^2} = \sqrt{31}.$$

(c) The distance in $\mathbb{R}^2$ between $(1, 2)$ and $(3, 4)$ is

$$|(1, 2) - (3, 4)| = |(-2, -2)| = \sqrt{(-2)^2 + (-2)^2} = \sqrt{8}.$$
2C Scalar Multiplication
We can picture scalar multiplication in terms of the arrows representing vectors.
Figure 1.5 illustrates scalar multiplication by both a positive number $a$ and a negative
number $b$. If $a > 0$, the arrow representing $a\mathbf{x}$ has the same direction as the arrow
for $\mathbf{x}$ and is $a$ times as long. If $b < 0$, then $b\mathbf{x}$ points in the parallel but
opposite direction to $\mathbf{x}$ and is $|b|$ times as long. One reason for drawing the arrows
$a\mathbf{x}$ and $b\mathbf{x}$ this way is that for a vector $\mathbf{x}$ in $\mathbb{R}^n$,
$|\mathbf{x}|$, the length of $\mathbf{x}$, and $|r\mathbf{x}|$, the length of $r\mathbf{x}$, are
related by

$$|r\mathbf{x}| = |r|\,|\mathbf{x}|; \tag{2.1}$$

that is, the length of $r\mathbf{x}$ is $|r|$ times the length of $\mathbf{x}$. To prove
Equation 2.1, we only have to observe that

$$|r\mathbf{x}| = \sqrt{(rx_1)^2 + \cdots + (rx_n)^2}
= \sqrt{r^2}\,\sqrt{x_1^2 + \cdots + x_n^2} = |r|\,|\mathbf{x}|.$$

FIGURE 1.5 $a\mathbf{x}$, $a > 0$; $b\mathbf{x}$, $b < 0$.

Multiplying the position vector of a point by 2 moves the point directly away from the
origin in the direction of the position vector to double its distance from the origin.
FIGURE 1.6 Scalar multiples.

Multiplying the position vector of a point by $\tfrac12$ moves the point halfway directly
toward the origin. For instance, let $\mathbf{x} = (1, 2)$. Then
$\tfrac12\mathbf{x} = (\tfrac12, 1)$, $2\mathbf{x} = (2, 4)$, and $-\mathbf{x} = (-1, -2)$, as shown
in Figure 1.6(a). Figure 1.6(b) shows $\mathbf{y} = (1, 2, 1)$, $3\mathbf{y} = (3, 6, 3)$, and
$-2\mathbf{y} = (-2, -4, -2)$.
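Equation 2.1 can be spot-checked numerically. The sketch below is only an illustration under our own naming (`length` and `scale` are hypothetical helper names, not notation from the text), and a numerical check of a few cases is of course not a proof.

```python
import math

def length(x):
    """Length |x| of a vector given as a sequence of numbers."""
    return math.sqrt(sum(xi * xi for xi in x))

def scale(r, x):
    """Scalar multiple r*x, computed coordinate by coordinate."""
    return tuple(r * xi for xi in x)

y = (1, 2, 1)
for r in (3, -2, 0.5):
    lhs = length(scale(r, y))      # |r y|
    rhs = abs(r) * length(y)       # |r| |y|
    print(r, scale(r, y), math.isclose(lhs, rhs))
```

The multiples with $r = 3$ and $r = -2$ are the vectors $3\mathbf{y}$ and $-2\mathbf{y}$ pictured in Figure 1.6(b).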

FIGURE 1.7 Midpoints.

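The point midway between two points will be useful in Section 2D for arriving
at a geometric interpretation for the sum $\mathbf{x} + \mathbf{y}$. We define the midpoint
$\mathbf{m}$ between $\mathbf{x}$ and $\mathbf{y}$ by
$\mathbf{m} = \tfrac12(\mathbf{x} + \mathbf{y}) = \tfrac12\mathbf{x} + \tfrac12\mathbf{y}$, motivated
by observing that the coordinates of $\mathbf{m}$ will then be the averages of the coordinates
of $\mathbf{x}$ and $\mathbf{y}$. Furthermore, the distances from $\mathbf{x}$ to $\mathbf{m}$ and
$\mathbf{y}$ to $\mathbf{m}$ are

$$|\mathbf{m} - \mathbf{x}| = |\tfrac12(\mathbf{x} + \mathbf{y}) - \mathbf{x}|
= |\tfrac12\mathbf{y} - \tfrac12\mathbf{x}| = \tfrac12|\mathbf{y} - \mathbf{x}|,$$

$$|\mathbf{m} - \mathbf{y}| = |\tfrac12(\mathbf{x} + \mathbf{y}) - \mathbf{y}|
= |\tfrac12\mathbf{x} - \tfrac12\mathbf{y}| = \tfrac12|\mathbf{x} - \mathbf{y}|,$$

each of which is half the distance between $\mathbf{x}$ and $\mathbf{y}$. Thus the sum of the
distances from $\mathbf{x}$ to $\mathbf{m}$ and from $\mathbf{m}$ to $\mathbf{y}$ is equal to the
distance from $\mathbf{x}$ to $\mathbf{y}$, so $\mathbf{x}$, $\mathbf{m}$, and $\mathbf{y}$ are
collinear instead of being the vertices of a triangle. This justifies calling $\mathbf{m}$ the
midpoint. See Figure 1.7.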

EXAMPLE 4 The midpoint $\mathbf{m}$ between $(1, 2)$ and $(3, 4)$ in $\mathbb{R}^2$ is

$$\mathbf{m} = \tfrac12\bigl((1, 2) + (3, 4)\bigr) = \tfrac12(4, 6) = (2, 3).$$


FIGURE 1.8 Parallelogram law.

The midpoint $\mathbf{m}$ between $(1, 2, 1)$ and $(-1, 2, 5)$ in $\mathbb{R}^3$ is

$$\mathbf{m} = \tfrac12\bigl((1, 2, 1) + (-1, 2, 5)\bigr) = \tfrac12(0, 4, 6) = (0, 2, 3).$$
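The midpoint formula averages coordinates, which makes it easy to express in code. This is a small sketch under our own naming (`midpoint` is a hypothetical helper); because Python's `/` is true division, the results come back as floating-point numbers.

```python
def midpoint(x, y):
    """Midpoint (x + y)/2 of two points given as equal-length sequences."""
    return tuple((xi + yi) / 2 for xi, yi in zip(x, y))

print(midpoint((1, 2), (3, 4)))          # (2.0, 3.0)
print(midpoint((1, 2, 1), (-1, 2, 5)))   # (0.0, 2.0, 3.0)
```

Both results agree with the midpoints computed by hand in the example above.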


2D Vector Addition
Figure 1.8(a) shows $\mathbf{x} + \mathbf{y}$ as the arrow from the origin to the opposite corner
of the parallelogram having as two adjacent sides the arrows representing $\mathbf{x}$ and
$\mathbf{y}$; this geometric rule for adding vectors is called the parallelogram rule of
addition. To see why the parallelogram rule is valid, consider the arrow from $\mathbf{0}$ to
the midpoint $\mathbf{m} = \tfrac12(\mathbf{x} + \mathbf{y})$ of $\mathbf{x}$ and $\mathbf{y}$. The
three arrows with tails at the origin and tips at $\mathbf{y}$, $\mathbf{x}$, and $\mathbf{m}$ all
lie in the plane determined by the three points $\mathbf{0}$, $\mathbf{x}$, and $\mathbf{y}$. But
$\mathbf{x} + \mathbf{y}$ is the arrow twice as long as
$\mathbf{m} = \tfrac12(\mathbf{x} + \mathbf{y})$ and pointing in the same direction, so the arrow
for $\mathbf{x} + \mathbf{y}$ also lies in the same plane as $\mathbf{x}$ and $\mathbf{y}$.
Furthermore, opposite sides of the figure with corners at $\mathbf{0}$, $\mathbf{x}$,
$\mathbf{y}$, and $\mathbf{x} + \mathbf{y}$ have equal lengths because

$$|(\mathbf{x} + \mathbf{y}) - \mathbf{y}| = |\mathbf{x}| \quad\text{and}\quad
|(\mathbf{x} + \mathbf{y}) - \mathbf{x}| = |\mathbf{y}|.$$

Hence the four-sided figure is a parallelogram.
Another way to look at the parallelogram rule appears in Figure 1.8(b). Starting
at the tip of the arrow for $\mathbf{x}$, draw an arrow parallel to $\mathbf{y}$ and of the same
length as $\mathbf{y}$; in other words, we translate $\mathbf{y}$, keeping it parallel to itself,
to the tip of $\mathbf{x}$. Then $\mathbf{x} + \mathbf{y}$ is the arrow from the origin to the tip
of this translation of $\mathbf{y}$. This tip-to-tail form of the rule applies literally to any
pair of vectors, whereas the parallelogram rule as previously stated doesn't, strictly
speaking, apply if the arrows representing $\mathbf{x}$ and $\mathbf{y}$ lie on the same line.
The difference $\mathbf{x} - \mathbf{y}$ of vectors $\mathbf{x}$ and $\mathbf{y}$ has an interesting
geometric interpretation. By definition, $\mathbf{x} - \mathbf{y}$ is
$\mathbf{x} + (-\mathbf{y})$, and the parallelogram law provides an arrow for it, as shown in
Figure 1.9(a). But it's simpler to think of $\mathbf{x} - \mathbf{y}$ as the vector $\mathbf{v}$
such that $\mathbf{y} + \mathbf{v} = \mathbf{x}$. The tip-to-tail rule implies that if arrows
for $\mathbf{x}$ and $\mathbf{y}$ start at the same point, forming two sides of a triangle, then
the third side of the triangle gives an arrow for $\mathbf{x} - \mathbf{y}$ as in Figure 1.9(b).
Note the direction carefully; the arrow for $\mathbf{x} - \mathbf{y}$ goes from the tip of
$\mathbf{y}$ to the tip of $\mathbf{x}$.

FIGURE 1.9 Differences.

EXAMPLE 5 Let $\mathbf{x} = (-1, 2)$ and $\mathbf{y} = (3, 1)$. Figure 1.10(a) shows
$\mathbf{x} + \mathbf{y}$, $\mathbf{x} - \mathbf{y}$, and $\mathbf{x} + 2\mathbf{y}$.
Figure 1.10(b) shows $\mathbf{x} + \mathbf{y}$ and $\mathbf{x} - \mathbf{y}$ when
$\mathbf{x} = (1, 2, 4)$ and $\mathbf{y} = (1, -1, 1)$.

EXAMPLE 6 Choose coordinate axes so that $\mathbf{i}$ points east, $\mathbf{j}$ points north, and
$\mathbf{k}$ points up. Find a vector of length 1 that points in the direction of the sun
(a) when the sun is due south and at angle $\alpha = 60°$ above the horizon.
(b) when the sun is southwest and at angle $\beta = 30°$ above the horizon.
FIGURE 1.10

In each case consider the required vector as an arrow starting at the origin that's
the hypotenuse of a right triangle whose other sides are vertical and horizontal, as
in Figure 1.10(c).
In case (a), the vertical side has length $\sin 60° = \sqrt{3}/2$ and the horizontal side
has length $\cos 60° = 1/2$. The required vector is the sum of the horizontal side
considered as an arrow pointing south and the vertical side pointing up. Thus the
required vector is $-\tfrac12\mathbf{j} + \tfrac{\sqrt{3}}{2}\mathbf{k}$.
In case (b), the vertical component has length $\sin 30° = 1/2$ and is $(1/2)\mathbf{k}$.
The horizontal component has length $\cos 30° = \sqrt{3}/2$ and points southwest, which
is the direction of $-\mathbf{i} - \mathbf{j}$. Since $|-\mathbf{i} - \mathbf{j}| = \sqrt{2}$, we
have to multiply $-\mathbf{i} - \mathbf{j}$ by $(\sqrt{3}/2)/\sqrt{2} = \sqrt{6}/4$ to get a
vector of length $\sqrt{3}/2$, so the required vector is the sum
$-\tfrac{\sqrt{6}}{4}\mathbf{i} - \tfrac{\sqrt{6}}{4}\mathbf{j} + \tfrac12\mathbf{k}$ of the two
components.
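The same decomposition into horizontal and vertical components can be sketched in code. The function below is our own illustration, not the text's method: it assumes the horizontal direction is given as a compass bearing in degrees measured clockwise from north (so due south is 180 and southwest is 225), a convention the text does not use, and it returns coordinates in the order (east, north, up).

```python
import math

def sun_direction(bearing_deg, elevation_deg):
    """Unit vector (east, north, up) toward a point in the sky.

    bearing_deg   -- compass bearing, clockwise from north (our assumption)
    elevation_deg -- angle above the horizon
    """
    b = math.radians(bearing_deg)
    e = math.radians(elevation_deg)
    horizontal = math.cos(e)          # length of the horizontal component
    return (horizontal * math.sin(b),  # i component (east)
            horizontal * math.cos(b),  # j component (north)
            math.sin(e))               # k component (up)

print(sun_direction(180, 60))   # case (a): due south, 60 degrees up
print(sun_direction(225, 30))   # case (b): southwest, 30 degrees up
```

Up to floating-point rounding, the two results match $-\tfrac12\mathbf{j} + \tfrac{\sqrt{3}}{2}\mathbf{k}$ and $-\tfrac{\sqrt{6}}{4}\mathbf{i} - \tfrac{\sqrt{6}}{4}\mathbf{j} + \tfrac12\mathbf{k}$.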

EXAMPLE 7 To illustrate how vector calculations can prove geometric theorems, we show that
the diagonals of a parallelogram intersect at their midpoints. We choose one vertex
of the parallelogram to be the origin $\mathbf{0}$ and let the adjacent vertices have position
vectors $\mathbf{a}$ and $\mathbf{b}$. The fourth vertex then has position vector
$\mathbf{a} + \mathbf{b}$. The midpoint between $\mathbf{a}$ and $\mathbf{b}$ has position vector
$\tfrac12\mathbf{a} + \tfrac12\mathbf{b}$, while the midpoint between $\mathbf{0}$ and
$\mathbf{a} + \mathbf{b}$ has position vector
$\tfrac12\mathbf{0} + \tfrac12(\mathbf{a} + \mathbf{b})$. Since these vectors are equal, the two
midpoints are the same.

2E Points, Arrows, and Vectors


Numerical calculations involving vectors typically leave us with an $n$-tuple in
$\mathbb{R}^n$, where $n$ is 2, 3, or some unspecified value. Geometric interpretation of the
result is not always pertinent, but when it is we have to decide what interpretation is
appropriate. If the result is meant to describe the location of a particle, then we use
the point interpretation. If the result is meant to describe a force with magnitude and
direction, then we represent it by an arrow with the tail translated to the point where
the force is applied. The choice should always be made to create the picture, mental
or visible, that conveys the underlying idea in the most useful way. Experience is
the best guide, and examples point the way.

FIGURE 1.11 $\mathbf{z}(r)$ and $\mathbf{z}(s)$.
Arrows. We've discussed arrows, their tails, and their tips in an intuitive way
without a formal definition. We'll clarify the mathematical connections among these
geometric ideas as follows. For vectors $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$, the arrow
or directed line segment from $\mathbf{x}$ to $\mathbf{y}$ consists of all points of
$\mathbb{R}^n$ of the form $\mathbf{z}(t) = (1 - t)\mathbf{x} + t\mathbf{y}$, where
$0 \le t \le 1$. We call $\mathbf{x}$ the tail of the arrow and $\mathbf{y}$ the tip. To see that
the definition of $\mathbf{z}(t)$ is appropriate, we need to show that $\mathbf{z}(t)$ covers all
the points from tail to tip as $t$ increases from 0 to 1. To prove this consider the distance
relations

$$|((1 - t)\mathbf{x} + t\mathbf{y}) - \mathbf{x}| = t\,|\mathbf{y} - \mathbf{x}| \quad\text{and}\quad
|((1 - t)\mathbf{x} + t\mathbf{y}) - \mathbf{y}| = (1 - t)\,|\mathbf{x} - \mathbf{y}|,$$

generalizing the case $t = \tfrac12$ that justifies the midpoint formula in Section 2C. The
equations tell us that a typical point $(1 - t)\mathbf{x} + t\mathbf{y}$ of the segment is at
distance $t\,|\mathbf{y} - \mathbf{x}|$ along the segment from $\mathbf{x}$ to $\mathbf{y}$ and is
distance $(1 - t)\,|\mathbf{x} - \mathbf{y}|$ along the segment from $\mathbf{y}$ to $\mathbf{x}$.
All of these points are collinear with $\mathbf{x}$ and $\mathbf{y}$, because the two distances
add up to the distance between $\mathbf{x}$ and $\mathbf{y}$. As $t$ runs from 0 to 1,
$\mathbf{z}(t)$ covers exactly the points from $\mathbf{x}$ to $\mathbf{y}$ in that order as shown
in Figure 1.11, so the segment inherits the order of the points in the interval
$0 \le t \le 1$ in the real number line. To reverse direction we just interchange $\mathbf{x}$
and $\mathbf{y}$ in the formula for $\mathbf{z}(t)$.
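The two distance relations are easy to check numerically for particular endpoints. The sketch below uses illustrative endpoints of our own choosing, $(1, 2)$ and $(4, 6)$, which are 5 units apart; the helper names `distance` and `z` are ours.

```python
import math

def distance(x, y):
    """Distance between two points of R^n."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def z(x, y, t):
    """Point (1 - t)x + ty on the arrow with tail x and tip y."""
    return tuple((1 - t) * a + t * b for a, b in zip(x, y))

x, y = (1.0, 2.0), (4.0, 6.0)   # illustrative endpoints with |y - x| = 5
for t in (0.0, 0.25, 0.5, 1.0):
    p = z(x, y, t)
    # Distance from the tail should be t*5, from the tip (1 - t)*5.
    print(t, p, distance(p, x), distance(p, y))
```

For each $t$ the two printed distances add up to 5, which is the collinearity argument of the text in numerical form.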
Equivalence. Two arrows, one from $\mathbf{x}$ to $\mathbf{y}$ and a second from $\mathbf{z}$ to
$\mathbf{w}$, are called equivalent if the second is a translation of the other, which means
that there is a vector $\mathbf{v}$ such that $\mathbf{z} = \mathbf{x} + \mathbf{v}$ and
$\mathbf{w} = \mathbf{y} + \mathbf{v}$. If this is so, then
$(1 - t)\mathbf{z} + t\mathbf{w} = (1 - t)\mathbf{x} + t\mathbf{y} + \mathbf{v}$, so corresponding
points on the two arrows are all translated by the same vector $\mathbf{v}$. Each arrow is
equivalent to infinitely many translations of itself, one for each vector $\mathbf{v}$. An
arrow and a few of its translates appear in Figure 1.12.
Every arrow from $\mathbf{x}$ to $\mathbf{y}$ is equivalent to one with its tail at $\mathbf{0}$;
just take $\mathbf{v} = -\mathbf{x}$ to get the arrow from $\mathbf{x} - \mathbf{x} = \mathbf{0}$
to $\mathbf{y} - \mathbf{x}$; in other words, subtract the tail from the tip. Among all the
arrows equivalent to a given one, we have singled out the one with its tail at $\mathbf{0}$ as
the equivalent position vector, because it's natural to identify this special one with the
point at its tip, and hence with its coordinate vector in $\mathbb{R}^n$.
The identification of arrows with elements of $\mathbb{R}^n$ via position vectors allows us to
do numerical computations involving arrows, even when it's better to think of their
tails as somewhere other than at $\mathbf{0}$; just translate each arrow to the location of the
equivalent position vector. Then identify each position vector with the unique element
of $\mathbb{R}^n$ that corresponds to it, and perform the desired operations in $\mathbb{R}^n$.
Finally, use the reverse identification to find a position vector or some other equivalent
arrow, if that conveys more in a picture.
FIGURE 1.12 Translates of (a) $(3, 1)$ and (b) $(1, -1, 1)$.

EXAMPLE 8 The arrows in a plane shown in Figure 1.12(a) are mutually equivalent and the
unique position vector associated with each one of them is $\mathbf{p} = (3, 1)$. The arrow
from $\mathbf{x}$ to $\mathbf{y}$ is related to $\mathbf{p}$ by the translation vector
$\mathbf{v} = (-4, 1)$.

Drawing arrows in 3-dimensional perspective often requires more care. One strat-
egy is to locate the projections of the tail and tip of an arrow in the horizontal plane
and then mark off the corresponding vertical coordinates along a lightly traced line.
Then join the tail to the tip.

EXAMPLE 9 Figure 1.12(b) shows the position vector $(1, -1, 1)$ translated once so that its
tail is at $(2, 3, 2)$ and again so that its tail is at $(-3, 0, 1)$.
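The rule "subtract the tail from the tip" gives a direct test for equivalence of arrows. The following sketch uses our own helper names; the tips $(3, 2, 3)$ and $(-2, -1, 2)$ are obtained by adding $(1, -1, 1)$ to the two tails named in Example 9.

```python
def position_vector(tail, tip):
    """Equivalent position vector of the arrow from tail to tip: tip - tail."""
    return tuple(b - a for a, b in zip(tail, tip))

def equivalent(arrow1, arrow2):
    """Two (tail, tip) arrows are equivalent when tip - tail agrees."""
    return position_vector(*arrow1) == position_vector(*arrow2)

first = ((2, 3, 2), (3, 2, 3))       # tail, tip
second = ((-3, 0, 1), (-2, -1, 2))   # tail, tip
print(position_vector(*first))       # (1, -1, 1)
print(equivalent(first, second))     # True
```

Comparing position vectors avoids having to find the translation vector $\mathbf{v}$ explicitly, although here it is simply the difference of the two tails.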

EXERCISES

For each pair of elements in $\mathbb{R}^2$ or $\mathbb{R}^3$ given in Exercises 1 to 4, sketch
coordinate axes and mark the points for which they are the coordinates. Draw the line
segment connecting the points and calculate its length. Also calculate the coordinates of
the midpoint of the segment, and mark it in the drawing.
1. $(1, 1)$, $(-2, 2)$
2. $(-1, 4)$, $(2, -1)$
3. $(1, 1, 1)$, $(1, -1, 1)$
4. $(1, 0, 0)$, $(1, 2, 1)$

For each vector in $\mathbb{R}^2$ or $\mathbb{R}^3$ given in Exercises 5 to 8, draw arrows starting
at the origin representing $\mathbf{x}$, $\tfrac12\mathbf{x}$, and $-2\mathbf{x}$. Find
$|\mathbf{x}|$, $|\tfrac12\mathbf{x}|$, and $|-2\mathbf{x}|$.
5. $(1, 1)$
6. $(-1, 2)$
7. $(1, 2, 2)$
8. $(-1, 1, 1)$

For each pair of vectors in $\mathbb{R}^2$ or $\mathbb{R}^3$ given in Exercises 9 to 12, find the
vectors $\mathbf{x} + \mathbf{y}$, $\mathbf{x} - \mathbf{y}$, and $\mathbf{x} + 2\mathbf{y}$. Sketch
coordinate axes and draw arrows representing the vectors you have found.
9. $\mathbf{x} = (1, 1)$, $\mathbf{y} = (1, -1)$
10. $\mathbf{x} = (2, 4)$, $\mathbf{y} = (-1, -2)$
11. $\mathbf{x} = (1, 1, 1)$, $\mathbf{y} = (1, 1, -1)$
12. $\mathbf{x} = (0, 0, -1)$, $\mathbf{y} = (0, 1, 1)$

For each pair of vectors given in Exercises 13 to 16, draw coordinate axes and arrows
representing $\mathbf{x}$ and $\mathbf{y}$. Then use geometric constructions to draw arrows for
$2\mathbf{x} + \mathbf{y}$ and $\mathbf{x} - \mathbf{y}$.
13. $\mathbf{x} = (-2, 1)$, $\mathbf{y} = (1, 2)$
14. $\mathbf{x} = (1, 0)$, $\mathbf{y} = (-2, -1)$
15. $\mathbf{x} = (1, 0, 1)$, $\mathbf{y} = (2, 1, -1)$
16. $\mathbf{x} = (0, 1, -1)$, $\mathbf{y} = (1, 1, 2)$

In Exercises 17 to 20, draw the arrows with the given tails and tips. Then draw the
equivalent position vector.
17. Tip $(1, 2)$, tail $(2, 1)$
18. Tip $(-1, 1)$, tail $(2, 2)$
19. Tip $(1, 0, 0)$, tail $(0, 1, 1)$
20. Tip $(1, 2, 0)$, tail $(-1, 0, 1)$


In Exercises 21 to 24, draw the arrow that represents the position vector identified with
the given element $\mathbf{u}$ of $\mathbb{R}^2$ or $\mathbb{R}^3$. Then draw the arrow equivalent
to $\mathbf{u}$ with tail at $\mathbf{v}$.
21. $\mathbf{u} = (1, 2)$; $\mathbf{v} = (2, 1)$
22. $\mathbf{u} = (-1, 1)$; $\mathbf{v} = (-2, 0)$
23. $\mathbf{u} = (1, 1, 1)$; $\mathbf{v} = (2, 1, 0)$
24. $\mathbf{u} = (1, 0, 1)$; $\mathbf{v} = (1, 0, -1)$

25. Verify that $|((1 - t)\mathbf{x} + t\mathbf{y}) - \mathbf{x}| = t\,|\mathbf{y} - \mathbf{x}|$ and
$|((1 - t)\mathbf{x} + t\mathbf{y}) - \mathbf{y}| = (1 - t)\,|\mathbf{y} - \mathbf{x}|$ for
$\mathbf{x}$, $\mathbf{y}$ in $\mathbb{R}^n$ and $0 \le t \le 1$. Why is the condition
$0 \le t \le 1$ needed?

26. Suppose that the tail $\mathbf{z}$ and tip $\mathbf{w}$ of one arrow are related to the tail
and tip $\mathbf{x}$ and $\mathbf{y}$ of another arrow by $\mathbf{z} = \mathbf{x} + \mathbf{v}$ and
$\mathbf{w} = \mathbf{y} + \mathbf{v}$. Check that the same relation holds between elements
$(1 - t)\mathbf{z} + t\mathbf{w}$ and $(1 - t)\mathbf{x} + t\mathbf{y}$ of the two arrows.

27. Draw the square that has the origin and the point $(2, 2)$ as one pair of diagonally
opposite vertices. Mark arrowheads on the four sides so that following the arrows takes you
around the square in the counterclockwise direction. Write down the vectors represented by
the four arrows and calculate their sum.

28. Do the same as in Exercise 27 for the three sides of the triangle with vertices at
$(-2, 0)$, $(3, 1)$, and $(0, 4)$.

29. Can you state a general condition on a set of arrows that implies that the vectors they
represent add up to $\mathbf{0}$? (The condition should apply to the previous two problems.)
Does the condition work in $\mathbb{R}^3$ as well as in the plane?

30. (a) Draw a triangle with one vertex at the origin and the other two at
$\mathbf{a} = 5\mathbf{i}$ and $\mathbf{b} = 2\mathbf{i} + 4\mathbf{j}$. Find the midpoints of the
sides joining the origin to $\mathbf{a}$ and $\mathbf{b}$, and the vector represented by the arrow
from one midpoint to the other. Show that it is $\tfrac12$ times the vector represented by an
arrow along the third side of the triangle.
(b) Do a vector calculation to show that the conclusion of part (a) holds for an arbitrary
triangle, that is, for arbitrary choices of $\mathbf{a}$ and $\mathbf{b}$.

31. It is a theorem of geometry that a point $\mathbf{c}$ is between the points $\mathbf{a}$ and
$\mathbf{b}$ if and only if the distance from $\mathbf{a}$ to $\mathbf{b}$ is the sum of the
distances from $\mathbf{a}$ to $\mathbf{c}$ and from $\mathbf{c}$ to $\mathbf{b}$. In Section 2E we
saw that $\mathbf{z}(t) = t\mathbf{x} + (1 - t)\mathbf{y}$ covers the points between $\mathbf{x}$
and $\mathbf{y}$ as $t$ ranges over the interval $0 < t < 1$.
(a) Show that $\mathbf{c}$ is between $\mathbf{a}$ and $\mathbf{b}$ only if $0 \le t \le 1$.
(b) For what values of $t$ is $\mathbf{a}$ between $\mathbf{b}$ and $\mathbf{c}$?
(c) For what values of $t$ is $\mathbf{b}$ between $\mathbf{a}$ and $\mathbf{c}$?

In Exercises 32 and 33, a surveyor is standing at the origin of a coordinate system that has
its axes oriented so that $\mathbf{i}$ points east, $\mathbf{j}$ points north, and $\mathbf{k}$
points up. (You'll need a calculator or trigonometric tables, and answers should be given to
three significant figures.)

32. A corner of the base of a building is 200 feet away from the surveyor, in a direction
40° north of east, and the building is 150 feet high. What are the coordinates of the point
at the top of the building directly above the corner?

33. The base of a flagpole is in a direction 20° west of north from the surveyor and the top
is 20° up from the horizon. Find the unit vector pointing at the top of the flagpole.

Exercises 34 to 38 involve locating places on the surface of the earth in terms of degrees
of latitude north or south of the equator and degrees of longitude east or west of a
reference meridian that passes near London, England. Take a coordinate system with origin at
the center of the earth, with $\mathbf{i}$ pointing toward the location with latitude 0°,
longitude 0° (which is in the Atlantic Ocean), and with $\mathbf{k}$ pointing toward the north
pole. In doing these problems, assume that the surface of the earth is a sphere of radius
4000 miles (which is a fairly good approximation).

34. What are the latitude and longitude of the location pointed toward by $\mathbf{j}$?

35. Los Angeles has latitude 34° north and longitude 118° west (approximately). Find its
position vector.

36. Do the same for New York, which has latitude 41° north and longitude 74° west
(approximately).

37. Find the distance between New York and Los Angeles, measured in a straight line (which
would pass below the earth's surface).

38. Find the straight-line distance between New York and Paris, France (which has
approximate latitude and longitude 48° north and 2° east).

SECTION 3 LINES AND PLANES


In the previous section, we took the ideas of line and plane for granted and used
them informally to justify the geometric interpretation of vector operations. Now we
take a more formal point of view and define lines and planes in terms of vectors.
The informal geometry we've been using will motivate the definitions.

3A Lines
In both the plane and $\mathbb{R}^3$ there are two natural ways of specifying a line, either as
passing through a particular point in a particular direction or as passing through two
particular points. We begin with the first of these.
A nonzero vector specifies a direction. Motivated by the earlier discussion of scalar
multiplication, we define two nonzero vectors $\mathbf{u}$ and $\mathbf{v}$ to have the same
direction if $\mathbf{u} = t\mathbf{v}$ for some positive number $t$ and the opposite direction
if $\mathbf{u} = t\mathbf{v}$ for some negative number $t$. We say that $\mathbf{u}$ and
$\mathbf{v}$ are parallel if they have either the same or the opposite direction.
Let $\mathbf{u}$ be a nonzero vector. We define the line through the origin in the direction
of $\mathbf{u}$ to consist of all points whose position vectors are multiples of $\mathbf{u}$,
as illustrated in Figure 1.13(a). Translating all the points of such a line by a fixed vector
$\mathbf{v}$ gives a parallel line through the point whose position vector is $\mathbf{v}$ as in
Figure 1.13(b). Hence we make the following definition.
A line is a set consisting of all the points $t\mathbf{u} + \mathbf{v}$, where $\mathbf{u}$ and
$\mathbf{v}$ are fixed position vectors, while $\mathbf{u} \ne \mathbf{0}$ and $t$ ranges over all
real numbers. Each value of $t$ corresponds to a point of the line. A variable $t$ used in
this way is called a parameter, and the expression $t\mathbf{u} + \mathbf{v}$ is called the
parametric representation of the line in vector form.

FIGURE 1.13 Arrows on a line.

Although motivated by geometric intuition in $\mathbb{R}^2$ and $\mathbb{R}^3$, this definition
of a line applies in $\mathbb{R}^n$, as do the definitions of same and opposite direction and
parallelism.

EXAMPLE 1 Given the vectors $\mathbf{u} = (2, 1)$ and $\mathbf{v} = (-1, 1)$ in $\mathbb{R}^2$,
suppose that we want to sketch the line of points $\mathbf{x} = t\mathbf{u} + \mathbf{v}$. We
think of the direction of the line as determined by the vector $\mathbf{u}$ and so sketch the
line $t\mathbf{u}$ through the origin. Then draw the line parallel to $t\mathbf{u}$ that passes
through the tip of the arrow representing $\mathbf{v}$; this gives the line of points
$\mathbf{x} = t\mathbf{u} + \mathbf{v}$, shown in Figure 1.14(a). Alternatively, plot two points
on the line, for example, those obtained by setting $t = 0$, which gives
$\mathbf{x} = \mathbf{v} = (-1, 1)$, and $t = 1$, which gives
$\mathbf{x} = \mathbf{u} + \mathbf{v} = (2, 1) + (-1, 1) = (1, 2)$. Then draw the line through
these two points as shown in Figure 1.14(b).
If we let $(x, y)$ be the coordinates of $\mathbf{x}$, the vector equation
$\mathbf{x} = t\mathbf{u} + \mathbf{v}$, or $(x, y) = t(2, 1) + (-1, 1)$, is equivalent to the pair
of numerical equations

$$x = 2t - 1, \qquad y = t + 1,$$

which are scalar parametric equations for the line. We'll usually use the more concise
vector representation, but both forms are equally valid.

To find a parametric representation for the line through two distinct points with
position vectors $\mathbf{a}$ and $\mathbf{b}$, recall that the vector from $\mathbf{a}$ to
$\mathbf{b}$ is $\mathbf{b} - \mathbf{a}$. Thus $\mathbf{b} - \mathbf{a}$ gives the direction of
the line, and $\mathbf{a}$ is a point on it, so

$$\mathbf{x} = t(\mathbf{b} - \mathbf{a}) + \mathbf{a}$$

is a parametric representation of the line. A more symmetrical rewriting of this
expression is

$$\mathbf{x} = (1 - t)\mathbf{a} + t\mathbf{b}. \tag{3.1}$$

When $t = 0$, $\mathbf{x} = \mathbf{a}$ and when $t = 1$, $\mathbf{x} = \mathbf{b}$. For $t$ between
0 and 1 we saw in Section 2E that $\mathbf{x}$ is on the line segment between $\mathbf{a}$ and
$\mathbf{b}$. Thus if $t = \tfrac12$, $\mathbf{x}$ is the midpoint.

FIGURE 1.14 Points on a line.

EXAMPLE 2 To find a parametric representation for the line in $\mathbb{R}^3$ through
$(-1, 1, 0)$ and $(2, 2, 1)$, we find $(2, 2, 1) - (-1, 1, 0) = (3, 1, 1)$ as the vector from the
first point to the second and obtain

$$\mathbf{x} = t(3, 1, 1) + (-1, 1, 0).$$

To find out whether a point such as $(1, -2, 2)$ lies on this line, we check to see
whether there is a value of $t$ such that

$$(1, -2, 2) = t(3, 1, 1) + (-1, 1, 0)$$

or, subtracting $(-1, 1, 0)$ from each side,

$$t(3, 1, 1) = (2, -3, 2).$$

The second coordinates match only if $t = -3$, and the third coordinates match only
if $t = 2$, so the equation has no solution and the point is not on the line.
If we ask instead about the point $(-10, -2, -3)$, we check the equation

$$(-10, -2, -3) = t(3, 1, 1) + (-1, 1, 0)$$

or

$$t(3, 1, 1) = (-9, -3, -3).$$

This equation has the solution $t = -3$, so the point is on the line.
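The membership test used in this example, solving coordinate by coordinate for $t$ and checking that the answers agree, can be sketched as follows. The function name `parameter_on_line` is our own, and exact rational arithmetic from Python's standard `fractions` module is used so that the comparison of $t$ values is not affected by rounding.

```python
from fractions import Fraction

def parameter_on_line(p, u, v):
    """Return t with p = t*u + v, or None if p is not on the line.

    Assumes the direction vector u is nonzero.
    """
    t = None
    for pi, ui, vi in zip(p, u, v):
        d = Fraction(pi) - Fraction(vi)   # coordinate of p - v
        if ui == 0:
            if d != 0:
                return None               # t*0 can never equal a nonzero d
        else:
            ti = d / Fraction(ui)         # value of t forced by this coordinate
            if t is None:
                t = ti
            elif ti != t:
                return None               # coordinates demand different t
    return t

u, v = (3, 1, 1), (-1, 1, 0)
print(parameter_on_line((1, -2, 2), u, v))      # None
print(parameter_on_line((-10, -2, -3), u, v))   # -3
```

The first point is rejected because its coordinates force incompatible values of $t$; the second is accepted with $t = -3$, as in the text.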
Two lines are defined to be parallel if they have representations
$t\mathbf{u}_1 + \mathbf{v}_1$, $t\mathbf{u}_2 + \mathbf{v}_2$ in which the direction vectors
$\mathbf{u}_1$ and $\mathbf{u}_2$ are parallel, that is, if $\mathbf{u}_2$ is a scalar multiple
of $\mathbf{u}_1$. In plane geometry, it's usual to define lines to be parallel if they do not
intersect, but that definition doesn't work in higher dimensions. For an example in
$\mathbb{R}^3$, consider the $x$-axis and a line parallel to the $y$-axis, but one unit above it.
The lines don't meet, but they aren't parallel either. (See Exercise 24.)

EXAMPLE 3 Let $\mathbf{a} = (1, 1, 0)$ and $\mathbf{b} = (2, -1, 0)$, and let $L$ be the line
through them. The direction of $L$ is
$\mathbf{b} - \mathbf{a} = (2, -1, 0) - (1, 1, 0) = (1, -2, 0)$. So $L$ has the representation
$t(1, -2, 0) + (1, 1, 0)$. The line through the origin parallel to $L$ has the representation
$t(1, -2, 0)$ and the line through $(1, 2, 3)$ parallel to $L$ has the representation
$t(1, -2, 0) + (1, 2, 3)$.
The line through $(-1, 2, 3)$ and $(1, -2, 3)$ is parallel to $L$, because it has
$(-1, 2, 3) - (1, -2, 3) = (-2, 4, 0)$ for a direction vector, and the vector
$(-2, 4, 0) = -2(1, -2, 0)$ is a scalar multiple of $(1, -2, 0)$. The line through $(-1, 2, 3)$
and the origin is not parallel to $L$, because it has $(-1, 2, 3)$ for a direction vector and
$(-1, 2, 3)$ is not a scalar multiple of $(1, -2, 0)$.
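Deciding whether one direction vector is a scalar multiple of another is a mechanical check. The sketch below, with the hypothetical helper name `scalar_multiple`, reads off the only possible multiplier from the first nonzero coordinate of $\mathbf{v}$ and then tests it on every coordinate, using exact fractions to avoid rounding issues.

```python
from fractions import Fraction

def scalar_multiple(u, v):
    """Return t with u = t*v, or None if there is none. Assumes v is nonzero."""
    # The first nonzero coordinate of v determines the only candidate for t.
    i = next(k for k, vk in enumerate(v) if vk != 0)
    t = Fraction(u[i]) / Fraction(v[i])
    if all(Fraction(uk) == t * Fraction(vk) for uk, vk in zip(u, v)):
        return t
    return None

print(scalar_multiple((-2, 4, 0), (1, -2, 0)))   # -2
print(scalar_multiple((-1, 2, 3), (1, -2, 0)))   # None
```

For two nonzero vectors, a result other than `None` means they are parallel; a positive value indicates the same direction and a negative value the opposite direction.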

A given line has many different representations. In the expression tu+v, the vector
u giving the direction is replaceable by a nonzero multiple ru, and v is replaceable
by the position vector of another point on the line without changing the line being
represented. A way to check whether two representations give the same line is to
take two distinct points in one representation and see whether they are also given by
the other, as in the following example.

EXAMPLE 4 Consider the parametric representations

$$\mathbf{x} = t(1, 2) + (1, 1),$$

$$\mathbf{x} = t(2, 4) + (0, -1).$$

Setting $t = 0$ and $t = 1$ in the second representation gives the points $(0, -1)$ and
$(2, 3)$. These points are both given by the first representation because

$$t(1, 2) + (1, 1) = (2, 3)$$

when $t = 1$, and

$$t(1, 2) + (1, 1) = (0, -1)$$

when $t = -1$. Hence the two representations give the same line.


Instead of viewing tu+ v just as a way to describe a line as a set of points, we can
think of it as a function x(t) = tu+ v associating particular points with particular
values of t. If we let x(t) be the position of an object at time t, the function describes
motion of the object along the straight line parameterized by the function x(t) with
parameter t. Different representations of the same line correspond to motions along
the same line, but with different velocities u and different starting positions v.

EXAMPLE 5 A motorboat starts out from a dock on a lake at 10 miles per hour in the direction
given by $4\mathbf{i} + 3\mathbf{j}$. (We take coordinates with origin at the dock and
$\mathbf{i}$ pointing east and $\mathbf{j}$ pointing north, with the unit of distance 1 mile.) At
the same time a rowboat starts north at 4 miles per hour from a point 2 miles east of the
dock. The boats move along the lines with parametric representations
$t_1(4\mathbf{i} + 3\mathbf{j})$ and $2\mathbf{i} + t_2\mathbf{j}$. The lines intersect where
$t_1(4\mathbf{i} + 3\mathbf{j}) = 2\mathbf{i} + t_2\mathbf{j}$, giving $4t_1 = 2$ and
$3t_1 = t_2$. Then $t_1 = \tfrac12$, $t_2 = \tfrac32$, and the point of intersection is
$\tfrac12(4\mathbf{i} + 3\mathbf{j}) = 2\mathbf{i} + \tfrac32\mathbf{j}$.
The representations we have used tell us the paths followed by the boats but take
no account of how fast they go. The vector $4\mathbf{i} + 3\mathbf{j}$ has length 5; if we double
it, and let $\mathbf{m}(t) = t(8\mathbf{i} + 6\mathbf{j})$, we have a function describing the
motion of a boat that goes 10 miles in 1 hour and is at the origin (the dock) when $t = 0$.
Similarly $\mathbf{r}(t) = 2\mathbf{i} + 4t\mathbf{j}$ describes the motion of the rowboat. The
motorboat reaches the point $2\mathbf{i} + \tfrac32\mathbf{j}$ when $t = \tfrac14$, and the
rowboat reaches it when $t = \tfrac38$, which is $\tfrac18$ hour $= 7\tfrac12$ minutes later,
so the boats do not run into each other.
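The arithmetic of this example can be replayed step by step. The sketch below simply follows the text's equations using exact fractions; the variable names are our own, and coordinates are written as (east, north) pairs in miles.

```python
from fractions import Fraction

# Paths: t1*(4, 3) for the motorboat and (2, 0) + t2*(0, 1) for the rowboat.
# Equating coordinates gives 4*t1 = 2 and 3*t1 = t2.
t1 = Fraction(2, 4)
t2 = 3 * t1
crossing = (4 * t1, 3 * t1)          # point where the two paths cross

# Timed motions: m(t) = t*(8, 6) and r(t) = (2, 0) + t*(0, 4), with t in hours.
t_motor = crossing[0] / 8            # solve 8t = east coordinate
t_row = crossing[1] / 4              # solve 4t = north coordinate
gap_minutes = (t_row - t_motor) * 60

print(crossing, t_motor, t_row, gap_minutes)
```

The boats cross the same point at different times, so the paths intersect but the boats themselves do not collide.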

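Recall from Section 2E that the points on the segment directed from $\mathbf{x}$ to
$\mathbf{y}$ were represented in the form $\mathbf{z}(t) = (1 - t)\mathbf{x} + t\mathbf{y}$ with
the parameter $t$ increasing over the interval $0 \le t \le 1$. Rewriting $\mathbf{z}(t)$ shows
that $\mathbf{z}(t) = t(\mathbf{y} - \mathbf{x}) + \mathbf{x}$, so letting $t$ run over all real
numbers gives the points on the line through $\mathbf{x}$ in the direction of
$\mathbf{y} - \mathbf{x}$, assuming $\mathbf{y} - \mathbf{x} \ne \mathbf{0}$. When $t > 1$ the
equations

$$|((1 - t)\mathbf{x} + t\mathbf{y}) - \mathbf{x}| = t\,|\mathbf{y} - \mathbf{x}| \quad\text{and}\quad
|((1 - t)\mathbf{x} + t\mathbf{y}) - \mathbf{y}| = (t - 1)\,|\mathbf{x} - \mathbf{y}|$$

show that $|\mathbf{z}(t) - \mathbf{x}| = |\mathbf{x} - \mathbf{y}| + |\mathbf{y} - \mathbf{z}(t)|$,
so $\mathbf{y}$ is between $\mathbf{x}$ and $\mathbf{z}(t)$. Thus the values $t > 1$ give the
points on the line that lie beyond $\mathbf{y}$ in the direction of $\mathbf{y} - \mathbf{x}$. A
similar argument with $t < 0$ shows that $\mathbf{x}$ is between $\mathbf{y}$ and $\mathbf{z}(t)$,
so $\mathbf{z}(t)$ also gives the points on the line that lie beyond $\mathbf{x}$ in the opposite
direction.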

3B Planes
If we pick two nonzero vectors and picture them as arrows from the origin, then
unless they lie along the same line, there is a unique plane that contains both arrows.
We define a plane through the origin to be the set of linear combinations

$$\mathbf{x} = t_1\mathbf{u}_1 + t_2\mathbf{u}_2,$$

where neither $\mathbf{u}_1$ nor $\mathbf{u}_2$ is a scalar multiple of the other. We say that
$\mathbf{u}_1$ and $\mathbf{u}_2$ are linearly independent if neither one is a scalar multiple of
the other. Geometrically this means that the vectors aren't parallel as defined on page 18
and that neither one is the zero vector.
A part of such a plane appears in Figure 1.15(a). In Figure 1.15(b) the plane
through the origin has been translated by adding a fixed vector $\mathbf{v}$ to every point
of the plane through the origin. Formally, we define a plane to be a set of points
whose position vectors have the form $t_1\mathbf{u}_1 + t_2\mathbf{u}_2 + \mathbf{v}$, where
$\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{v}$ are fixed vectors, $\mathbf{u}_1$ and $\mathbf{u}_2$
are linearly independent, and $t_1$ and $t_2$ range over all real numbers. The variables $t_1$
and $t_2$ are parameters and $t_1\mathbf{u}_1 + t_2\mathbf{u}_2 + \mathbf{v}$ is called a
parametric representation of a plane in vector form.
This definition of plane is motivated by the way we picture planes in $\mathbb{R}^3$, but
like our definition of line, it applies in $\mathbb{R}^n$. A characteristic property of geometric
planes is that they are flat, that is, if $\mathbf{p}$ and $\mathbf{q}$ are two distinct points in a plane,
then the entire line through $\mathbf{p}$ and $\mathbf{q}$ lies in the plane. To see that planes as we have
defined them have this property, let $\mathbf{p} = p_1\mathbf{u}_1 + p_2\mathbf{u}_2 + \mathbf{v}$ and $\mathbf{q} = q_1\mathbf{u}_1 + q_2\mathbf{u}_2 + \mathbf{v}$
be two points in the plane that has parametric representation $t_1\mathbf{u}_1 + t_2\mathbf{u}_2 + \mathbf{v}$. By
Formula 3.1, every point $\mathbf{x}$ on the line through $\mathbf{p}$ and $\mathbf{q}$ has a position vector of the
form $(1 - t)\mathbf{p} + t\mathbf{q}$ for some number $t$. Substituting for $\mathbf{p}$ and $\mathbf{q}$ in this formula and
rearranging the terms gives

$$\mathbf{x} = (1 - t)(p_1\mathbf{u}_1 + p_2\mathbf{u}_2 + \mathbf{v}) + t(q_1\mathbf{u}_1 + q_2\mathbf{u}_2 + \mathbf{v})$$

$$= \big((1 - t)p_1 + tq_1\big)\mathbf{u}_1 + \big((1 - t)p_2 + tq_2\big)\mathbf{u}_2 + \mathbf{v}.$$

This shows that $\mathbf{x}$ is in the plane because it has the form $t_1\mathbf{u}_1 + t_2\mathbf{u}_2 + \mathbf{v}$ with
$t_1 = (1 - t)p_1 + tq_1$ and $t_2 = (1 - t)p_2 + tq_2$.

FIGURE 1.15 Generating planes. (a) (b)

FIGURE 1.16 Parallel planes.

EXAMPLE 4 Let $\mathbf{u}_1 = (1, 0, 0)$, $\mathbf{u}_2 = (0, 1, 0)$, and $\mathbf{v} = (0, 0, 2)$. The vectors $t_1\mathbf{u}_1 + t_2\mathbf{u}_2$ are just
the vectors $(t_1, t_2, 0)$ making up the $xy$-coordinate plane. The vectors

$$t_1\mathbf{u}_1 + t_2\mathbf{u}_2 + \mathbf{v} = (t_1, t_2, 2)$$

give a parallel plane two units above the coordinate plane. These planes appear in
Figure 1.16.

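Here's how to find a representation for the plane through three given points. For
three points $\mathbf{x}_1$, $\mathbf{x}_2$, and $\mathbf{x}_3$ to determine a unique plane passing through them, the
three points must not lie on a line. If the three points do not lie on a line, the vectors
$\mathbf{x}_3 - \mathbf{x}_1$ and $\mathbf{x}_2 - \mathbf{x}_1$ are linearly independent; otherwise $\mathbf{x}_3 - \mathbf{x}_1 = t(\mathbf{x}_2 - \mathbf{x}_1)$ for
some value of $t$. Then $\mathbf{x}_3 = t(\mathbf{x}_2 - \mathbf{x}_1) + \mathbf{x}_1$, and $\mathbf{x}_3$ would lie on the line through
$\mathbf{x}_1$ and $\mathbf{x}_2$.
Let $\mathbf{x}_1$, $\mathbf{x}_2$, and $\mathbf{x}_3$ be three points that don't lie on a line. We just observed that
the vectors $\mathbf{u}_1 = \mathbf{x}_3 - \mathbf{x}_1$ and $\mathbf{u}_2 = \mathbf{x}_2 - \mathbf{x}_1$ are linearly independent. Then

$$\mathbf{x} = t_1\mathbf{u}_1 + t_2\mathbf{u}_2 + \mathbf{x}_1$$

is the parametric representation of a plane; this is the plane containing the three
given points because $\mathbf{x} = \mathbf{x}_1$ when $t_1 = t_2 = 0$, $\mathbf{x} = \mathbf{x}_2$ when $t_1 = 0$ and $t_2 = 1$, and
$\mathbf{x} = \mathbf{x}_3$ when $t_1 = 1$ and $t_2 = 0$.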

EXAMPLE 5 Let $\mathbf{x}_1 = (1, 2, 1)$, $\mathbf{x}_2 = (-1, 2, 0)$, and $\mathbf{x}_3 = (1, 1, 2)$. Let

$$\mathbf{u}_1 = \mathbf{x}_3 - \mathbf{x}_1 = (0, -1, 1) \quad\text{and}\quad \mathbf{u}_2 = \mathbf{x}_2 - \mathbf{x}_1 = (-2, 0, -1).$$

Then the parametric representation for the plane through the three points is

$$\mathbf{x} = t_1(0, -1, 1) + t_2(-2, 0, -1) + (1, 2, 1).$$

To picture the plane relative to rectangular coordinate axes in $\mathbb{R}^3$, we can either draw
the translated vectors $\mathbf{u}_1 + \mathbf{x}_1$ and $\mathbf{u}_2 + \mathbf{x}_1$ as in Figure 1.17(a) or plot three or four
noncollinear points on the plane as in Figure 1.17(b).

FIGURE 1.17 (a) Arrows. (b) Points.
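
The three-point construction is easy to check by machine. The sketch below uses the points of Example 5; the helper names sub and plane_point are illustrative choices of ours, not notation from the text.

```python
def sub(a, b):
    """Componentwise difference a - b."""
    return tuple(ai - bi for ai, bi in zip(a, b))

def plane_point(x1, x2, x3, t1, t2):
    """Point t1*u1 + t2*u2 + x1, where u1 = x3 - x1 and u2 = x2 - x1."""
    u1, u2 = sub(x3, x1), sub(x2, x1)
    return tuple(t1 * a + t2 * b + c for a, b, c in zip(u1, u2, x1))

x1, x2, x3 = (1, 2, 1), (-1, 2, 0), (1, 1, 2)
print(plane_point(x1, x2, x3, 0, 0))  # (1, 2, 1), the point x1
print(plane_point(x1, x2, x3, 0, 1))  # (-1, 2, 0), the point x2
print(plane_point(x1, x2, x3, 1, 0))  # (1, 1, 2), the point x3
```

Other parameter values give further points of the same plane, which can be useful when plotting it.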

EXERCISES

In Exercises 1 to 4, represent the described line in the parametric form $\mathbf{x} = t\mathbf{u} + \mathbf{v}$ and sketch the line.

1. The line in $\mathbb{R}^2$ parallel to the vector $(2, 1)$ and passing through the point $(-1, 2)$
2. The line in $\mathbb{R}^2$ through the points $(1, 0)$ and $(0, 1)$
3. The line in $\mathbb{R}^3$ passing through the points $(2, 2, 3)$ and $(1, 2, 2)$
4. The line in $\mathbb{R}^3$ parallel to the vector $(1, 1, 2)$ and passing through the point $(2, 0, 1)$

5. Let $\mathbf{a} = (-1, 1)$, $\mathbf{b} = (0, 1)$, $\mathbf{c} = (2, 1)$, and $\mathbf{d} = (-3, 2)$.
(a) Sketch the lines $t\mathbf{a} + \mathbf{b}$ and $s\mathbf{c} + \mathbf{d}$.
(b) Find the point $\mathbf{p}$ where the lines in part (a) intersect by finding values of $s$ and $t$ for which $\mathbf{p} = t\mathbf{a} + \mathbf{b} = s\mathbf{c} + \mathbf{d}$.
(c) Change $\mathbf{c}$ to $(2, -2)$ and show that then the lines $t\mathbf{a} + \mathbf{b}$ and $s\mathbf{c} + \mathbf{d}$ do not intersect.

6. Let $\mathbf{a} = (-3, 0, 1)$, $\mathbf{b} = (0, 1, 2)$, $\mathbf{c} = (2, -1, 1)$, and $\mathbf{d} = (1, 2, 0)$.
(a) Sketch the lines $t\mathbf{a} + \mathbf{b}$ and $s\mathbf{c} + \mathbf{d}$.
(b) Find the point of intersection of the lines in part (a), or show that they do not intersect.
(c) What is the answer to part (b) if $\mathbf{d}$ is changed to ,_10,l\?

For the pairs of lines given in Exercises 7 to 10, find out whether the two lines are the same, and if they aren't, whether they are parallel.

7. $t(1, 2) + (2, 1)$ and $t(-2, -4) + (3, 3)$
8. $t(2, -1) + (2, 1)$ and $t(-1, 2) + (2, -1)$
9. $t(2, 3, -1) + (-1, -1, 1)$ and $t(-4, -6, 2) + (1, 1, -1)$
10. $t(4, 2, 2) + (2, 0, 1)$ and $t(2, 1, 1) + (2, 2, 1)$

Which of the pairs of vectors given in Exercises 11 to 14 are linearly independent?

11. $(1, 2)$, $(2, 1)$
12. $(2, -1)$, $(-2, 1)$
13. $(3, 1, 3)$, $(1, 3, 1)$
14. $(-1, 3, 1)$, $(2, -6, -2)$

In Exercises 15 to 18, find a representation for the given plane in the parametric form $t_1\mathbf{u}_1 + t_2\mathbf{u}_2 + \mathbf{v}$, and sketch the plane.

15. The plane parallel to the vectors $(1, 1, 0)$ and $(0, 1, 1)$ and passing through the origin
16. The plane parallel to the vectors $\mathbf{e}_1$ and $\mathbf{e}_2$ in $\mathbb{R}^3$ and passing through the point $(0, 0, 2)$
17. The plane passing through the three points $(1, 0, 0)$, $(0, 1, 0)$, and $(0, 0, 1)$
18. The plane passing through the three points $(1, 1, 0)$, $(-3, 0, 2)$, and $(2, 4, 7)$

19. A plane $P$ in $\mathbb{R}^3$ contains the line $t(1, -1, 2) + (1, 2, 1)$ and the point $(3, 0, 1)$.
(a) Find a vector between two points of $P$ that is linearly independent of $(1, -1, 2)$.
(b) Find a parametric representation for $P$.

20. Let the vertices of a triangle be the points $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ and let $\mathbf{p}$, $\mathbf{q}$, and $\mathbf{r}$ be the midpoints of the sides opposite $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$, respectively. A line joining a vertex to the midpoint of the opposite side is called a median of the triangle. Show that the point $\tfrac13\mathbf{a} + \tfrac23\mathbf{p}$ is on the median that joins $\mathbf{a}$ and $\mathbf{p}$. Express this point in terms of $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$. Show also that it is on the other two medians of the triangle.

21. Show that if $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$, and $\mathbf{d}$ are the position vectors of the vertices of a quadrilateral, not necessarily lying in a plane, then the midpoints of the four sides are vertices of a parallelogram and do lie in a plane.

22. Show that two lines are parallel if and only if for two distinct points $\mathbf{v}_1$ and $\mathbf{v}_2$ on the first line, and two distinct points $\mathbf{w}_1$ and $\mathbf{w}_2$ on the second line, the difference $\mathbf{w}_1 - \mathbf{w}_2$ is a multiple of $\mathbf{v}_1 - \mathbf{v}_2$.

23. (a) Show that if $t\mathbf{u} = \mathbf{0}$, for $\mathbf{u}$ a vector in $\mathbb{R}^n$, then either $t = 0$ or $\mathbf{u} = \mathbf{0}$. Can you derive the same result using only the laws 1 through 9 on page 3?
(b) Show that if $\mathbf{u}_1$ is a scalar multiple of $\mathbf{u}_2$, and is not zero, then $\mathbf{u}_2$ is a scalar multiple of $\mathbf{u}_1$.
(c) Show that $t\mathbf{u}_1 + \mathbf{v}_1$ and $t\mathbf{u}_2 + \mathbf{v}_2$ represent the same line if and only if both $\mathbf{u}_2$ and $\mathbf{v}_2 - \mathbf{v}_1$ are scalar multiples of $\mathbf{u}_1$.

24. Let $\mathbf{u} = (u_1, u_2)$, $\mathbf{v} = (v_1, v_2)$ be nonzero vectors in $\mathbb{R}^2$.
(a) Show that $\mathbf{u}$ and $\mathbf{v}$ are parallel if and only if $u_1v_2 = u_2v_1$.
(b) Let $\mathbf{a}$ and $\mathbf{b}$ be vectors in $\mathbb{R}^2$, so that $t\mathbf{u} + \mathbf{a}$ and $s\mathbf{v} + \mathbf{b}$ are two lines in the plane. Show that if $\mathbf{u}$ and $\mathbf{v}$ are not parallel, then there are values of $s$ and $t$ such that $t\mathbf{u} + \mathbf{a} = s\mathbf{v} + \mathbf{b}$, so the lines intersect. (Write out the vector equation $t\mathbf{u} + \mathbf{a} = s\mathbf{v} + \mathbf{b}$ as a pair of scalar equations and show that they can always be solved if $u_1v_2 \ne u_2v_1$.)

25. A subset $S$ of $\mathbb{R}^n$ is convex if whenever it contains $\mathbf{a}$ and $\mathbf{b}$ it also contains the line segment joining them, that is, all points $t\mathbf{a} + (1 - t)\mathbf{b}$ with $0 \le t \le 1$. Show that if $S$ and $T$ are convex subsets of $\mathbb{R}^n$, then the set $S + T$ of all sums $\mathbf{x} + \mathbf{y}$, with $\mathbf{x}$ in $S$ and $\mathbf{y}$ in $T$, is also convex.

SECTION 4 DOT PRODUCTS


The dot product of two vectors $\mathbf{x} = (x_1, \ldots, x_n)$ and $\mathbf{y} = (y_1, \ldots, y_n)$ in $\mathbb{R}^n$ is
defined to be the number given by the formula

4.1 $$\mathbf{x} \cdot \mathbf{y} = x_1y_1 + \cdots + x_ny_n.$$


This simple formula has a geometric interpretation that allows us to solve problems
involving lengths and angles. We discuss other applications in this section, and still
others appear in later chapters.
We often use the following properties of dot products in both theoretical calcula-
tions and applications.

4.2 Theorem.
Positivity: $\mathbf{x} \cdot \mathbf{x} > 0$, except that $\mathbf{0} \cdot \mathbf{0} = 0$
Symmetry: $\mathbf{x} \cdot \mathbf{y} = \mathbf{y} \cdot \mathbf{x}$
Additivity: $(\mathbf{x} + \mathbf{y}) \cdot \mathbf{z} = \mathbf{x} \cdot \mathbf{z} + \mathbf{y} \cdot \mathbf{z}$
Homogeneity: $(r\mathbf{x}) \cdot \mathbf{y} = r(\mathbf{x} \cdot \mathbf{y})$

Proof. To prove positivity, note that in the sum $\mathbf{x} \cdot \mathbf{x} = x_1^2 + \cdots + x_n^2$, each term
$x_i^2 \ge 0$. If $\mathbf{x} = \mathbf{0}$ then all the terms $x_i^2$ are 0 so the sum is 0. Otherwise, at least one
term is greater than 0 and the sum is greater than 0.
Symmetry holds because in the sums for $\mathbf{x} \cdot \mathbf{y} = x_1y_1 + \cdots + x_ny_n$ and $\mathbf{y} \cdot \mathbf{x} =
y_1x_1 + \cdots + y_nx_n$, corresponding terms $x_iy_i$ and $y_ix_i$ are equal by the commutative
law for ordinary multiplication, and therefore the sums are equal.
Proofs for additivity and homogeneity also follow directly from the definition of
dot product and the laws of ordinary arithmetic and are left as Exercise 11. ∎
Because of the symmetry of the dot product, it follows immediately that additivity
and homogeneity hold for the second vector also; that is,

$$\mathbf{x} \cdot (\mathbf{y} + \mathbf{z}) = \mathbf{x} \cdot \mathbf{y} + \mathbf{x} \cdot \mathbf{z} \quad\text{and}\quad \mathbf{x} \cdot (r\mathbf{y}) = r(\mathbf{x} \cdot \mathbf{y}).$$


4A Lengths and Angles
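The dot product of a vector $\mathbf{x}$ with itself is

$$\mathbf{x} \cdot \mathbf{x} = x_1^2 + \cdots + x_n^2.$$

Referring to our earlier definition of the length of $\mathbf{x}$ in Section 2B as

$$|\mathbf{x}| = \sqrt{x_1^2 + \cdots + x_n^2},$$

we see that the length of $\mathbf{x}$ equals the square root of the dot product of $\mathbf{x}$ with itself:

4.3 $$|\mathbf{x}| = \sqrt{\mathbf{x} \cdot \mathbf{x}}.$$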


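FIGURE 1.18 (a) $|\mathbf{x} - \mathbf{y}|^2 < |\mathbf{x}|^2 + |\mathbf{y}|^2$ (b) $|\mathbf{x} - \mathbf{y}|^2 = |\mathbf{x}|^2 + |\mathbf{y}|^2$ (c) $|\mathbf{x} - \mathbf{y}|^2 > |\mathbf{x}|^2 + |\mathbf{y}|^2$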

We get a relation between dot products and angles by recalling from Section 2
that if two sides of a triangle represent vectors x and y, then the third side represents
the vector x - y from the tip of y to the tip of x as illustrated in Figure 1.18. The
key to the relation is the familiar trigonometry formula called the law of cosines that
relates the length of one side of a triangle to the lengths of the other sides and the
angle between them. Applied to the triangles in the figure it gives the following:
4.4 Law of Cosines.

$$|\mathbf{x} - \mathbf{y}|^2 = |\mathbf{x}|^2 + |\mathbf{y}|^2 - 2|\mathbf{x}||\mathbf{y}| \cos\theta,$$

where $\theta$ is the angle opposite the side $\mathbf{x} - \mathbf{y}$.

We now use the law of cosines to derive the following fundamental geometric
property of the dot product.

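4.5 Theorem. The dot product of two nonzero vectors is equal to the product of
their lengths times the cosine of the angle between them. In other words, if $\theta$ is the
angle between vectors $\mathbf{x}$ and $\mathbf{y}$, then

$$\mathbf{x} \cdot \mathbf{y} = |\mathbf{x}||\mathbf{y}| \cos\theta.$$

Proof. Using the additivity and homogeneity of the dot product, we have

$$|\mathbf{x} - \mathbf{y}|^2 = (\mathbf{x} - \mathbf{y}) \cdot (\mathbf{x} - \mathbf{y})$$
$$= \mathbf{x} \cdot (\mathbf{x} - \mathbf{y}) - \mathbf{y} \cdot (\mathbf{x} - \mathbf{y})$$
$$= \mathbf{x} \cdot \mathbf{x} - \mathbf{x} \cdot \mathbf{y} - \mathbf{y} \cdot \mathbf{x} + \mathbf{y} \cdot \mathbf{y}$$
$$= |\mathbf{x}|^2 + |\mathbf{y}|^2 - 2(\mathbf{x} \cdot \mathbf{y}),$$

where the last step uses the symmetry property $\mathbf{y} \cdot \mathbf{x} = \mathbf{x} \cdot \mathbf{y}$. Comparing this with
the value for $|\mathbf{x} - \mathbf{y}|^2$ given by the law of cosines shows that $\mathbf{x} \cdot \mathbf{y} = |\mathbf{x}||\mathbf{y}| \cos\theta$. ∎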
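As illustrated in Figure 1.18, the formulas for $|\mathbf{x} - \mathbf{y}|^2$ in the proof of Theorem 4.5
show that when $\mathbf{x} \cdot \mathbf{y} > 0$ and $\cos\theta > 0$ then $|\mathbf{x} - \mathbf{y}|^2 < |\mathbf{x}|^2 + |\mathbf{y}|^2$; similarly, when
$\mathbf{x} \cdot \mathbf{y} < 0$ and $\cos\theta < 0$, then $|\mathbf{x} - \mathbf{y}|^2 > |\mathbf{x}|^2 + |\mathbf{y}|^2$. The special case of perpendicular
vectors $\mathbf{x}$ and $\mathbf{y}$ shown in Figure 1.18(b), when $\cos\theta$ and $\mathbf{x} \cdot \mathbf{y}$ are both 0, is particularly
simple and important and we emphasize it as follows:

4.6 Theorem of Pythagoras. If $\mathbf{x} \cdot \mathbf{y} = 0$, then $|\mathbf{x} - \mathbf{y}|^2 = |\mathbf{x}|^2 + |\mathbf{y}|^2$.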

Note that if we replace $\mathbf{y}$ by $-\mathbf{y}$ throughout, the conclusion of Theorem 4.6 becomes
the more symmetrical statement $|\mathbf{x} + \mathbf{y}|^2 = |\mathbf{x}|^2 + |\mathbf{y}|^2$. We say that two vectors are
orthogonal to each other if their dot product is 0. Thus two vectors are orthogonal
if they have perpendicular directions or if one of them is $\mathbf{0}$.
It follows from Theorem 4.5 that $\cos\theta = (\mathbf{x} \cdot \mathbf{y})/(|\mathbf{x}||\mathbf{y}|)$ and hence that the angle
$\theta$ between $\mathbf{x}$ and $\mathbf{y}$ is

4.7 $$\theta = \arccos \frac{\mathbf{x} \cdot \mathbf{y}}{|\mathbf{x}||\mathbf{y}|}.$$

If either $\mathbf{x}$ or $\mathbf{y}$ is $\mathbf{0}$, the right side of Equation 4.7 is undefined, which is appropriate
because the zero vector has no direction and so can't form an angle with another
vector.

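EXAMPLE 1 To find the angle $\theta$ between $\mathbf{x} = (1, 3)$ and $\mathbf{y} = (-1, 1)$ in $\mathbb{R}^2$ we compute

$$\mathbf{x} \cdot \mathbf{x} = 1^2 + 3^2 = 10, \quad \mathbf{y} \cdot \mathbf{y} = (-1)^2 + 1^2 = 2, \quad\text{and}\quad \mathbf{x} \cdot \mathbf{y} = -1 + 3 = 2.$$

Then

$$|\mathbf{x}| = \sqrt{10}, \quad |\mathbf{y}| = \sqrt{2}, \quad\text{and}\quad \cos\theta = \frac{2}{\sqrt{20}} = \frac{1}{\sqrt{5}}.$$

Using a calculator we find $\theta \approx \arccos 0.447 \approx 1.1$ radians, or about 63°.
To find the angle $\theta$ between $\mathbf{x} = (1, 2, 0, -2)$ and $\mathbf{y} = (0, -6, 3, 2)$ in $\mathbb{R}^4$ we
compute

$$\mathbf{x} \cdot \mathbf{x} = 1^2 + 2^2 + 0^2 + (-2)^2 = 1 + 4 + 0 + 4 = 9,$$
$$\mathbf{y} \cdot \mathbf{y} = 0^2 + (-6)^2 + 3^2 + 2^2 = 0 + 36 + 9 + 4 = 49, \quad\text{and}$$
$$\mathbf{x} \cdot \mathbf{y} = 0 - 12 + 0 - 4 = -16.$$

Then $|\mathbf{x}| = 3$, $|\mathbf{y}| = 7$, and $\cos\theta = -16/21$, giving $\theta \approx \arccos(-0.762) \approx 2.44$
radians, or about 139.6°.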
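
Equation 4.7 translates directly into a short computation. The sketch below reproduces both angles from the example; the function names dot and angle are our own, and the clamping step is a numerical safeguard that is not part of the formula itself.

```python
import math

def dot(x, y):
    """Dot product of two vectors of equal length (Formula 4.1)."""
    return sum(a * b for a, b in zip(x, y))

def angle(x, y):
    """Angle in radians between nonzero vectors x and y (Equation 4.7)."""
    c = dot(x, y) / math.sqrt(dot(x, x) * dot(y, y))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding error

print(round(math.degrees(angle((1, 3), (-1, 1))), 1))              # about 63.4
print(round(math.degrees(angle((1, 2, 0, -2), (0, -6, 3, 2))), 1)) # about 139.6
```

The same function works in any dimension, since only dot products enter the formula.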

FIGURE 1.19 Perpendicular vectors.

Problem: Find a vector $\mathbf{x}$ of length 2 perpendicular to $\mathbf{a} = (1, 2, 3)$ and $\mathbf{b} =
(1, 0, -1)$. Note that if $\mathbf{x}$ is a solution, so is $-\mathbf{x}$, as in Figure 1.19. The perpendicularity
conditions $\mathbf{x} \cdot \mathbf{a} = \mathbf{x} \cdot \mathbf{b} = 0$ give the equations

$$x_1 + 2x_2 + 3x_3 = 0,$$
$$x_1 - x_3 = 0.$$

The second equation gives $x_1 = x_3$, and substituting in the first equation gives
$4x_1 + 2x_2 = 0$ or $x_2 = -2x_1$. Taking $x_1 = 1$ gives $\mathbf{x} = (1, -2, 1)$ as one solution,
and any scalar multiple of $\mathbf{x}$ is also perpendicular to $\mathbf{a}$ and $\mathbf{b}$. The length of $\mathbf{x}$ is
$\sqrt{6}$, so to get a vector of length 2 we must multiply by $\pm 2/\sqrt{6} = \pm\sqrt{2/3}$, giving
$\mathbf{x} = \pm\left(\sqrt{2/3},\, -2\sqrt{2/3},\, \sqrt{2/3}\right)$.
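
A quick numerical check of this solution, offered only as a sketch (the variable names are ours), confirms both perpendicularity conditions and the required length.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

a, b = (1, 2, 3), (1, 0, -1)
x0 = (1, -2, 1)                        # the solution obtained by taking x1 = 1
scale = 2 / math.sqrt(dot(x0, x0))     # 2/sqrt(6), which equals sqrt(2/3)
x = tuple(scale * c for c in x0)       # rescaled to length 2

print(dot(x0, a), dot(x0, b))          # both dot products vanish
print(math.sqrt(dot(x, x)))            # length of the rescaled vector
```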
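To illustrate how to use the dot product to prove geometric theorems, we'll show that
in a parallelogram with all four sides the same length the diagonals are perpendicular.
Figure 1.20 shows such a parallelogram with diagonals parallel to the vectors $\mathbf{x} + \mathbf{y}$
and $\mathbf{x} - \mathbf{y}$. To show that the diagonals are perpendicular we need to show that
$(\mathbf{x} + \mathbf{y}) \cdot (\mathbf{x} - \mathbf{y}) = 0$. Using the additivity property of the dot product twice allows
us to multiply out, getting

$$(\mathbf{x} + \mathbf{y}) \cdot (\mathbf{x} - \mathbf{y}) = \mathbf{x} \cdot (\mathbf{x} - \mathbf{y}) + \mathbf{y} \cdot (\mathbf{x} - \mathbf{y})$$
$$= \mathbf{x} \cdot \mathbf{x} + \mathbf{x} \cdot (-\mathbf{y}) + \mathbf{y} \cdot \mathbf{x} + \mathbf{y} \cdot (-\mathbf{y}).$$

By homogeneity, $\mathbf{x} \cdot (-\mathbf{y}) = -(\mathbf{x} \cdot \mathbf{y})$ and $\mathbf{y} \cdot (-\mathbf{y}) = -(\mathbf{y} \cdot \mathbf{y})$. The two middle terms
cancel by symmetry, and since the lengths of $\mathbf{x}$ and $\mathbf{y}$ are equal,
$\mathbf{x} \cdot \mathbf{x} = |\mathbf{x}|^2 = |\mathbf{y}|^2 = \mathbf{y} \cdot \mathbf{y}$, so the right side is zero, as we wanted to show.

FIGURE 1.20 Perpendicular diagonals.

4B Properties of $\mathbf{x} \cdot \mathbf{y}$ and $|\mathbf{x}|$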


The following inequality expresses a fundamental property of the dot product that
comes up in many areas of mathematics.

4.8 Cauchy-Schwarz Inequality.

$$|\mathbf{x} \cdot \mathbf{y}| \le |\mathbf{x}||\mathbf{y}|$$

Proof. Theorem 4.5 gives $\mathbf{x} \cdot \mathbf{y} = |\mathbf{x}||\mathbf{y}| \cos\theta$ for some angle $\theta$. Taking absolute
values, and noting that the absolute value of a product of real numbers equals the
product of their absolute values, we have

$$|\mathbf{x} \cdot \mathbf{y}| = |\mathbf{x}||\mathbf{y}||\cos\theta| \le |\mathbf{x}||\mathbf{y}|$$

because $|\cos\theta| \le 1$ for all $\theta$.
For a purely algebraic proof that treats vectors only as $n$-tuples of numbers and
makes no use of their geometric interpretation, see Exercise 29. ∎

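4.9 Theorem. The length function defined on $\mathbb{R}^n$ by $|\mathbf{x}| = \sqrt{\mathbf{x} \cdot \mathbf{x}}$ has the
properties:
Positivity: $|\mathbf{x}| > 0$ except that $|\mathbf{0}| = 0$
Homogeneity: $|r\mathbf{x}| = |r||\mathbf{x}|$
Triangle Inequality: $|\mathbf{x} + \mathbf{y}| \le |\mathbf{x}| + |\mathbf{y}|$

Proof. Positivity is an immediate consequence of the positivity property of dot
products, for since $\mathbf{x} \cdot \mathbf{x} > 0$ unless $\mathbf{x} = \mathbf{0}$, the same is true of $|\mathbf{x}| = \sqrt{\mathbf{x} \cdot \mathbf{x}}$. We leave
the proof of homogeneity to the reader in Exercise 12.
Geometrically, the triangle inequality is equivalent to the theorem that a side of
a triangle cannot be longer than the sum of the lengths of the other two sides, which
is why it's called the triangle inequality. See Figure 1.21.
For an algebraic proof, we start with the equation

$$|\mathbf{x} + \mathbf{y}|^2 = (\mathbf{x} + \mathbf{y}) \cdot (\mathbf{x} + \mathbf{y}) = |\mathbf{x}|^2 + |\mathbf{y}|^2 + 2\,\mathbf{x} \cdot \mathbf{y}.$$

From the Cauchy-Schwarz inequality, $\mathbf{x} \cdot \mathbf{y} \le |\mathbf{x}||\mathbf{y}|$, so

$$|\mathbf{x} + \mathbf{y}|^2 \le |\mathbf{x}|^2 + |\mathbf{y}|^2 + 2|\mathbf{x}||\mathbf{y}| = (|\mathbf{x}| + |\mathbf{y}|)^2.$$

Taking square roots, we get

$$|\mathbf{x} + \mathbf{y}| \le |\mathbf{x}| + |\mathbf{y}|. \ ∎$$

FIGURE 1.21 Triangle inequality.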


4C Unit Vectors and Projections
A vector is called a unit vector if it has length 1. Any nonzero vector may be used
to specify a direction, but it's often convenient to use a unit vector for the purpose.
If $\mathbf{v} \ne \mathbf{0}$ then $\mathbf{n} = \mathbf{v}/|\mathbf{v}|$ is the unique unit vector with the same direction as $\mathbf{v}$ and
is called the normalization of $\mathbf{v}$.
We often need to decompose a force vector into the sum of two vectors, one
having a given direction and the other perpendicular to it. For example, in $\mathbb{R}^3$ every
vector $\mathbf{x}$ is the sum of a vertical component $\mathbf{v}$ parallel to the $z$-axis and a horizontal
component $\mathbf{h}$ in the $xy$-plane perpendicular to the $z$-axis, as in Figure 1.22. Other
examples and applications are given in this section and in the next section. The
following theorem tells how to find such a decomposition.

FIGURE 1.22 $\mathbf{x} = \mathbf{h} + \mathbf{v}$.

FIGURE 1.23 $\mathbf{x} = \mathbf{p} + \mathbf{q}$.
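4.10 Theorem. Let $\mathbf{x}$ and $\mathbf{n}$ be vectors in $\mathbb{R}^n$, with $|\mathbf{n}| = 1$. There are unique
vectors $\mathbf{p}$ and $\mathbf{q}$ such that $\mathbf{x} = \mathbf{p} + \mathbf{q}$, with $\mathbf{p}$ parallel to $\mathbf{n}$ and $\mathbf{q}$ perpendicular
to $\mathbf{n}$. The vector $\mathbf{p}$ equals $(\mathbf{x} \cdot \mathbf{n})\mathbf{n}$, with length $|\mathbf{x} \cdot \mathbf{n}|$, and $\mathbf{q} = \mathbf{x} - \mathbf{p}$, with length
$|\mathbf{q}| = \sqrt{|\mathbf{x}|^2 - |\mathbf{p}|^2}$.

Proof. Since $\mathbf{p}$ is parallel to $\mathbf{n}$ there is a scalar $r$ such that $\mathbf{p} = r\mathbf{n}$. For $\mathbf{q} = \mathbf{x} - \mathbf{p}$
to be perpendicular to $\mathbf{n}$, we need $(\mathbf{x} - \mathbf{p}) \cdot \mathbf{n} = 0$. But since $\mathbf{n} \cdot \mathbf{n} = 1$,

$$(\mathbf{x} - \mathbf{p}) \cdot \mathbf{n} = (\mathbf{x} - r\mathbf{n}) \cdot \mathbf{n} = \mathbf{x} \cdot \mathbf{n} - r(\mathbf{n} \cdot \mathbf{n}) = \mathbf{x} \cdot \mathbf{n} - r,$$

so $(\mathbf{x} - \mathbf{p}) \cdot \mathbf{n} = 0$ if and only if $r = \mathbf{n} \cdot \mathbf{x}$. By Pythagoras $|\mathbf{q}|^2 = |\mathbf{x}|^2 - |\mathbf{p}|^2$. ∎

The vector $\mathbf{p}$ in the statement of Theorem 4.10 is the component of $\mathbf{x}$ in the
direction of $\mathbf{n}$, or the projection of $\mathbf{x}$ on $\mathbf{n}$. The vector $\mathbf{q} = \mathbf{x} - (\mathbf{n} \cdot \mathbf{x})\mathbf{n}$ is the
component of $\mathbf{x}$ perpendicular to $\mathbf{n}$. The scalar $\mathbf{x} \cdot \mathbf{n}$ is the coordinate of $\mathbf{x}$ in the
direction of $\mathbf{n}$.
Figure 1.23 illustrates this decomposition for a vector $\mathbf{x}$ making an angle $\theta$ with
$\mathbf{n}$. The vector $\mathbf{p}$ is the component of $\mathbf{x}$ in the direction of $\mathbf{n}$. We see geometrically
that its length is $|\mathbf{x}||\cos\theta|$ and that it has the same direction as $\mathbf{n}$ when $\mathbf{n} \cdot \mathbf{x} > 0$
and $0 \le \theta < \pi/2$ and the opposite direction to $\mathbf{n}$ when $\mathbf{n} \cdot \mathbf{x} < 0$ and $\pi/2 < \theta \le \pi$.
Since $\mathbf{n}$ is a unit vector, this agrees with the formula $\mathbf{p} = (\mathbf{n} \cdot \mathbf{x})\mathbf{n}$ and the geometric
interpretation of the dot product in Equation 4.5.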
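
The decomposition of Theorem 4.10 can be sketched in a few lines. The function name decompose and the sample vectors below are hypothetical illustrations, not data from the text; the code assumes that n already has length 1.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def decompose(x, n):
    """Return (p, q) with x = p + q, p parallel to the unit vector n,
    and q perpendicular to n, following Theorem 4.10."""
    r = dot(x, n)                              # coordinate of x in the direction of n
    p = tuple(r * c for c in n)                # component of x along n
    q = tuple(a - b for a, b in zip(x, p))     # component of x perpendicular to n
    return p, q

n = (0.6, 0.0, 0.8)    # hypothetical unit vector
x = (1.0, 2.0, 3.0)    # hypothetical vector to decompose
p, q = decompose(x, n)
print(p, q, dot(q, n))
```

If the direction is given by a vector that is not a unit vector, it would have to be normalized first, as described at the start of this subsection.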

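The standard basis vectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$ for $\mathbb{R}^n$ are unit vectors and are orthogonal
to each other. In particular, in $\mathbb{R}^3$ we have $\mathbf{i} \cdot \mathbf{i} = \mathbf{j} \cdot \mathbf{j} = \mathbf{k} \cdot \mathbf{k} = 1$ and $\mathbf{i} \cdot \mathbf{j} = \mathbf{j} \cdot \mathbf{k} =
\mathbf{k} \cdot \mathbf{i} = 0$. If $\mathbf{x} = (x_1, x_2, \ldots, x_n) = x_1\mathbf{e}_1 + \cdots + x_n\mathbf{e}_n$ is an arbitrary vector in $\mathbb{R}^n$
then its coordinate in the direction of $\mathbf{e}_i$ is $\mathbf{x} \cdot \mathbf{e}_i = x_i$. In $\mathbb{R}^3$, if $\mathbf{v} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$
then $x = \mathbf{v} \cdot \mathbf{i}$, $y = \mathbf{v} \cdot \mathbf{j}$, and $z = \mathbf{v} \cdot \mathbf{k}$.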

A mechanical force has magnitude and direction and so is a vector with its arrow's
tail usually drawn at the point of application of the force. It's often useful to express
a force as a sum of perpendicular components because they act independently of
each other, as in the next example.

Here we analyze the effect of gravity on a 1-pound brick held in place by friction
on a roof that slopes so that $\mathbf{n} = \tfrac27\mathbf{i} + \tfrac37\mathbf{j} + \tfrac67\mathbf{k}$ is the unit vector perpendicular to the
roof, as in Figure 1.24. Gravity exerts a downward vertical force of 1 pound, which
as a vector is $-\mathbf{k}$. The brick doesn't move because the roof exerts an opposing force
$\mathbf{F} = \mathbf{k}$ on it. The component of $\mathbf{F}$ in the direction of $\mathbf{n}$ is

$$\mathbf{p} = (\mathbf{F} \cdot \mathbf{n})\mathbf{n} = \tfrac67\mathbf{n} = \tfrac{12}{49}\mathbf{i} + \tfrac{18}{49}\mathbf{j} + \tfrac{36}{49}\mathbf{k}$$

and is the force with which the roof presses directly against the brick. The other
force component $\mathbf{F} - \mathbf{p} = -\tfrac{12}{49}\mathbf{i} - \tfrac{18}{49}\mathbf{j} + \tfrac{13}{49}\mathbf{k}$ is perpendicular to $\mathbf{n}$ and thus parallel
to the roof, and is the frictional force that keeps the brick from sliding.
The vector $\mathbf{v} = 2\mathbf{i} + 3\mathbf{j} + 6\mathbf{k}$ has the same direction as $\mathbf{n}$ and could have been
given instead of $\mathbf{n}$ to specify the way the roof slopes. In that case we would have to
start by calculating $|\mathbf{v}| = 7$ and finding the unit vector $\mathbf{n} = \mathbf{v}/|\mathbf{v}|$.
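
The brick computation can be reproduced with exact fractions. This is a sketch only; the name friction for the component parallel to the roof is our label.

```python
from fractions import Fraction as Fr

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

n = (Fr(2, 7), Fr(3, 7), Fr(6, 7))     # unit vector perpendicular to the roof
F = (0, 0, 1)                          # force exerted by the roof, F = k
r = dot(F, n)                          # coordinate of F in the direction of n
p = tuple(r * c for c in n)            # component pressing against the brick
friction = tuple(f - c for f, c in zip(F, p))   # component parallel to the roof

print(p)
print(friction, dot(friction, n))
```

Exact arithmetic makes it easy to see that the second component has zero dot product with n, so it lies along the roof.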

The work done by a force in moving an object a certain distance in a straight line
is the product of the distance and the magnitude of the force, provided the motion
is in the direction of the force. More generally, the work done is the distance times
the coordinate of the force in the direction of the motion. Suppose a force $\mathbf{F}$ moves
an object through a displacement $\mathbf{d}$ so the distance moved is $d = |\mathbf{d}|$. If $\mathbf{n}$ is the unit
vector in the direction of $\mathbf{d}$, then $\mathbf{d} = d\mathbf{n}$ and the coordinate of $\mathbf{F}$ in the direction of
$\mathbf{d}$ is $\mathbf{F} \cdot \mathbf{n}$. The work done is $(\mathbf{F} \cdot \mathbf{n})d = \mathbf{F} \cdot (d\mathbf{n}) = \mathbf{F} \cdot \mathbf{d}$.

FIGURE 1.24 Brick on a roof.

EXAMPLE 6 The bottom of a snow-covered slope is at the origin, and the top at $(100, 20)$, with
units measured in feet. Pulling a child's sled up the slope takes a force given by the
vector $(8, 3)$, in units of pounds. The work done is $(100, 20) \cdot (8, 3) = 800 + 60 = 860$
foot-pounds.

We use the dot product to analyze the flow of fluid, or heat, or radiant energy
described by a vector-valued function v(x,y,z) that gives the direction and magnitude
of the flow, that is, the flow velocity, at each point (x, y, z). We consider here only
the simplest kind of flow, in which v is a constant vector so that the flow is uniform
along parallel lines.
For a given flow there will be a rate of flow per time unit through a surface in its
path, which is called the flux through the surface. This is illustrated in Figure 1.25
showing a horizontally placed parallelogram $P$, and a flow vector $\mathbf{v}$ down and from
the right. The vector $\mathbf{n}$ is a unit vector perpendicular to $P$. The shaded region indicates
the volume flowing through $P$ in one time unit, and is a solid $B$ bounded by six
parallelograms with horizontal top and bottom, with its other four edges of length
$|\mathbf{v}|$ and parallel to $\mathbf{v}$. The volume $V(B)$ is the area $A(P)$ of its top times the vertical
height $h$ of $B$. Since $h = \mathbf{n} \cdot \mathbf{v}$, the coordinate of $\mathbf{v}$ in the direction of $\mathbf{n}$, the flux
through the top parallelogram is $A(P)\,\mathbf{n} \cdot \mathbf{v}$. Note that the sign of the flux through $P$
would change if we reversed the direction of $\mathbf{n}$.
Similar considerations apply to other planar figures $R$ perpendicular to $\mathbf{n}$ and of
area $A(R)$. If a flow velocity in $\mathbb{R}^3$ is the constant vector $\mathbf{v}$ at every point, the flow
through $R$ is $A(R)\,\mathbf{n} \cdot \mathbf{v}$. In Chapter 9, Section 3C we'll define the flux of a variable
flow through a curved surface $S$ that we approximate near each point of $S$ by a
tangent parallelogram.

FIGURE 1.25 Flow's flux equals $V(B)$.

We'll compute the flow of air through a window of area 10 square feet that is
perpendicular to the vector $\mathbf{w} = 3\mathbf{i} - 4\mathbf{j}$ when the wind velocity (in feet per second)
is $20\mathbf{i}$. Since $|\mathbf{w}| = 5$, we find $\mathbf{n} = \tfrac15\mathbf{w}$ as the unit vector perpendicular to the plane
of the window, so $A\mathbf{n} = 10\mathbf{n} = 2\mathbf{w} = 6\mathbf{i} - 8\mathbf{j}$. The flux through the window is then
$20\mathbf{i} \cdot (6\mathbf{i} - 8\mathbf{j}) = 120$ cubic feet per second.
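
The flux formula $A\,\mathbf{n} \cdot \mathbf{v}$ for a constant flow through a flat region can be sketched as follows. The function name flux is ours, and the normalization step inside it is an added convenience so that a normal vector of any nonzero length may be passed in.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def flux(area, normal, velocity):
    """Flux A (n . v) of a constant flow through a flat region of the given area;
    normal is any nonzero vector perpendicular to the region."""
    length = math.sqrt(dot(normal, normal))
    n = tuple(c / length for c in normal)      # unit normal
    return area * dot(n, velocity)

# Window of area 10 perpendicular to w = 3i - 4j, wind velocity 20i.
print(flux(10, (3, -4), (20, 0)))   # approximately 120 cubic feet per second
```

Reversing the normal, for instance passing (-3, 4), changes the sign of the result, in line with the remark about the direction of n.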

This example uses the dot product in a nongeometric context. Suppose a manufacturer
produces four different models of widgets. We can write information about the models
as vectors in $\mathbb{R}^4$, with each entry of a vector corresponding to one of the four models.
Suppose unit production costs are given by a vector $\mathbf{c} = (2, 4, 5, 7)$, meaning that
it costs 2 dollars to produce each model 1 widget, 4 dollars for each model 2
widget, and so on. Similarly let the unit wholesale prices be $\mathbf{w} = (3, 6, 7, 10)$ and
retail prices be $\mathbf{r} = (5, 9, 11, 18)$. If $\mathbf{p} = (100, 30, 10, 5)$ gives the number of each
model produced in a day and $\mathbf{s} = (80, 40, 8, 3)$ the number sold at retail, then the
day's total manufacturing cost is $\mathbf{p} \cdot \mathbf{c} = (100)(2) + (30)(4) + (10)(5) + (5)(7) =
200 + 120 + 50 + 35 = 405$ dollars, and the retailer's gross income (before expenses)
is $\mathbf{s} \cdot (\mathbf{r} - \mathbf{w}) = (80, 40, 8, 3) \cdot (2, 3, 4, 8) = 160 + 120 + 32 + 24 = 336$ dollars for
the day.
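
The bookkeeping in this example is a pair of dot products, which the following sketch carries out; the names cost, margin, and gross are our own labels for the quantities discussed.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

c = (2, 4, 5, 7)         # unit production costs
w = (3, 6, 7, 10)        # unit wholesale prices
r = (5, 9, 11, 18)       # unit retail prices
p = (100, 30, 10, 5)     # units produced in a day
s = (80, 40, 8, 3)       # units sold at retail in a day

cost = dot(p, c)                                   # total manufacturing cost
margin = tuple(ri - wi for ri, wi in zip(r, w))    # r - w, markup per unit
gross = dot(s, margin)                             # retailer's gross income
print(cost, gross)  # 405 336
```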

EXERCISES

In Exercises 1 to 4, compute $\mathbf{x} \cdot \mathbf{y}$ for the given vectors.

1. $\mathbf{x} = (1, 3)$, $\mathbf{y} = (-2, 4)$
2. $\mathbf{x} = (\sqrt2, \sqrt3)$, $\mathbf{y} = (\sqrt3, \sqrt2)$
3. $\mathbf{x} = (-1, -1, 2)$, $\mathbf{y} = (1, 6, 3)$
4. $\mathbf{x} = (1, 2, 1, 3)$, $\mathbf{y} = (0, 1, 2, 1)$

In Exercises 5 to 8, for the given vectors $\mathbf{u}$ and $\mathbf{v}$ find (a) $\mathbf{u} \cdot \mathbf{v}$, (b) $|\mathbf{u}|$ and $|\mathbf{v}|$, and (c) the angle between $\mathbf{u}$ and $\mathbf{v}$.

5. $\mathbf{u} = (1, 1)$, $\mathbf{v} = (1, 0)$
6. $\mathbf{u} = (\sqrt3, 1)$, $\mathbf{v} = (1, \sqrt3)$
7. $\mathbf{u} = (2, 1, 2)$, $\mathbf{v} = (1, 2, 2)$
8. $\mathbf{u} = (3, 1, 1)$, $\mathbf{v} = (4, 1, 0)$

In Exercises 9 and 10, use the information and approximate coordinate system of Exercises 34 to 38 on page 17.

9. Find the angle between the position vectors of New York and Los Angeles, and find the approximate airline distance between the two cities.
10. Do the same as in Exercise 9, for New York and Paris.

11. Prove that the dot product has the additivity and homogeneity properties in Theorem 4.2.
12. Prove the homogeneity property of length listed in Theorem 4.9.

In Exercises 13 to 16, find (a) the coordinate and (b) the component of the vector $\mathbf{x}$ in the direction of the vector $\mathbf{v}$, and also the component of $\mathbf{x}$ perpendicular to $\mathbf{v}$.

13. $\mathbf{x} = (1, -1, 2)$, $\mathbf{v} = (1/\sqrt3, 1/\sqrt3, 1/\sqrt3)$
14. $\mathbf{x} = (1, -1, 2)$, $\mathbf{v} = (2/7, -3/7, 6/7)$
15. $\mathbf{x} = (2, -3, 1)$, $\mathbf{v} = (1, 3, -2)$
16. $\mathbf{x} = (-4, 0, -1)$, $\mathbf{v} = (0, -3, 2)$

In Exercises 17 and 18, for the triangle with the given vertices $A$, $B$, $C$, find the lengths of its sides, and determine which of its angles are acute, obtuse, or right angles.

17. $A = (2, -3, 6)$, $B = (1, 3, -2)$, $C = (1, 7, 1)$
18. $A = (1, 2, 4)$, $B = (-2, -1, 2)$, $C = (4, 2, -3)$

19. (a) Show that for a vector $\mathbf{x}$ in $\mathbb{R}^3$,
(b) If the vector $\mathbf{x}$ in part (a) is a unit vector, that is, a vector $\mathbf{u}$ of length 1, show that $\mathbf{u} \cdot \mathbf{e}_i = \cos\alpha_i$, where $\alpha_i$ is the angle between $\mathbf{u}$ and $\mathbf{e}_i$. The coordinates $\cos\alpha_i$ are called the direction cosines of $\mathbf{u}$ relative to the standard basis vectors $\mathbf{e}_i$. If $\mathbf{x}$ is a nonzero vector, the direction cosines of $\mathbf{x}$ are defined to be the direction cosines of the unit vector $\mathbf{x}/|\mathbf{x}|$.
(c) Find the direction cosines of $(1, 2, 1)$.

20. Find the direction cosines of $(6, -3, -2)$.
21. Show that the standard basis vectors satisfy $\mathbf{e}_i \cdot \mathbf{e}_j = 0$ if $i \ne j$, and $\mathbf{e}_i \cdot \mathbf{e}_j = 1$ if $i = j$.
22. Show that if $\mathbf{x} \ne \mathbf{0}$, then the vector $(1/|\mathbf{x}|)\mathbf{x}$ has length 1.

23. A solar energy collector with area 15 square meters is mounted so that its panels are perpendicular to the vector $4\mathbf{i} + 3\mathbf{k}$. At what rate does solar energy fall on the collector if the vector $\mathbf{i} + \mathbf{j} + 3\mathbf{k}$ gives the direction of the sun and the rate of energy falling on a surface perpendicular to the sun's rays is 80 watts per square meter? (Use the method of Example 7, treating radiation as a flow of energy.)

24. At what rate does solar energy fall on the collector in Exercise 23 later in the day when the direction of the sun is $-3\mathbf{j} + \mathbf{k}$?

25. A wind blowing from the northwest exerts a force of 15 pounds on a bicycle rider who follows a road that goes 400 feet west and then 500 feet in a direction 30° north of west. In a coordinate system in which $\mathbf{i}$ points east and $\mathbf{j}$ points north, find a vector representing the force of the wind and vectors representing the two parts of the road. Calculate the work that the rider does against the wind in cycling along each part. If there were a road running straight from the starting point to the finish, would taking it make a difference in the total work done?

26. Suppose that a factory produces each day four different items in amounts represented by the production vector $\mathbf{p} = (25, 25, 15, 10)$ and that these items are sold according to the wholesale dollar price vector $\mathbf{w} = (100, 150, 200, 300)$. What is the total revenue for the factory from selling all of each day's production?

27. Derive the inequality
$$|\mathbf{x}| - |\mathbf{y}| \le |\mathbf{x} - \mathbf{y}|$$
from the triangle inequality, and then show that
$$\big||\mathbf{x}| - |\mathbf{y}|\big| \le |\mathbf{x} - \mathbf{y}|.$$
This inequality is sometimes called the reversed triangle inequality.

28. Show that the sum of the squares of the lengths of the four sides of a parallelogram is equal to the sum of the squares of the lengths of the diagonals. [Hint: Write $|\mathbf{x} \pm \mathbf{y}|^2 = (\mathbf{x} \pm \mathbf{y}) \cdot (\mathbf{x} \pm \mathbf{y})$ and multiply out.]

29. Here is one way of proving the Cauchy-Schwarz inequality in $\mathbb{R}^n$ without appealing to geometry or trigonometry. Recall that if $b^2 - 4ac > 0$ and $a \ne 0$, then the quadratic equation $at^2 + bt + c = 0$ has two distinct real roots, and there are some values of $t$ that make the expression $at^2 + bt + c$ negative. We suppose two vectors $\mathbf{x}$ and $\mathbf{y}$ are given in $\mathbb{R}^n$, and we want to show that $|\mathbf{x} \cdot \mathbf{y}| \le |\mathbf{x}||\mathbf{y}|$.
(a) Show that if either $\mathbf{x}$ or $\mathbf{y}$ is $\mathbf{0}$, then the inequality is true because both sides of it are 0. From now on we may assume that neither $\mathbf{x}$ nor $\mathbf{y}$ is $\mathbf{0}$.
(b) Using the properties 4.2 of the dot product, show that
$$|t\mathbf{x} + \mathbf{y}|^2 = (t\mathbf{x} + \mathbf{y}) \cdot (t\mathbf{x} + \mathbf{y}) = |\mathbf{x}|^2t^2 + 2(\mathbf{x} \cdot \mathbf{y})t + |\mathbf{y}|^2 \ge 0 \quad\text{for all values of } t.$$
(c) Use the remark at the beginning of the problem to conclude that
$$4(\mathbf{x} \cdot \mathbf{y})^2 - 4|\mathbf{x}|^2|\mathbf{y}|^2 \le 0.$$
(d) Derive the Cauchy-Schwarz inequality from (c).
Once the inequality is established, we know that for nonzero vectors in $\mathbb{R}^n$ the ratio $(\mathbf{x} \cdot \mathbf{y})/(|\mathbf{x}||\mathbf{y}|)$ is always between $-1$ and 1 and therefore equal to $\cos\theta$ for a unique angle $\theta$, with $0 \le \theta \le \pi$. Since the angle between two vectors is measured in the plane containing the vectors, we use Equation 4.7 to define angle in $\mathbb{R}^n$.

30. If equality holds in the Cauchy-Schwarz inequality, that is, if $|\mathbf{x} \cdot \mathbf{y}| = |\mathbf{x}||\mathbf{y}|$, then one of the vectors is a scalar multiple of the other. Prove this
(a) for vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$ using Equation 4.5
(b) in general, using ideas from the proof outlined in Exercise 29
SECTION 5 EUCLIDEAN GEOMETRY
A basic fact of analytic geometry is that the points in the plane whose coordinates
$(x, y)$ satisfy an equation of the form $ax + by = c$ lie on a straight line and that
every straight line has such an equation. There is a similar correspondence between
planes in $\mathbb{R}^3$ and equations of the form $ax + by + cz = d$. With the help of the dot
product we'll find a geometric interpretation for the coefficients in these equations
and extend the idea to higher dimensions.

5A Equations for Lines and Planes
Suppose that $\mathbf{x}_0$ is a fixed point on a line in $\mathbb{R}^2$ or a plane in $\mathbb{R}^3$, and $\mathbf{p}$ is a vector
perpendicular to the line or plane. Then a point $\mathbf{x}$ is on the line or plane if and only
if the vectors $\mathbf{p}$ and $\mathbf{x} - \mathbf{x}_0$ are perpendicular. In other words, for every point $\mathbf{x}$ on
the line or plane, we must have

5.1 $$\mathbf{p} \cdot (\mathbf{x} - \mathbf{x}_0) = 0.$$

Figure 1.26 shows the relation between these vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$. In $\mathbb{R}^2$ we can
write $\mathbf{p} = (a, b)$, $\mathbf{x} = (x, y)$, and $\mathbf{x}_0 = (x_0, y_0)$. Then Equation 5.1 becomes

$$(a, b) \cdot (x - x_0, y - y_0) = 0 \quad\text{or}\quad a(x - x_0) + b(y - y_0) = 0.$$

Letting $ax_0 + by_0 = d$ gives the standard equation for a line in $\mathbb{R}^2$, in which $(a, b)$
is still a vector perpendicular to the line:

$$(a, b) \cdot (x, y) = (a, b) \cdot (x_0, y_0) \quad\text{or}\quad ax + by = d.$$

FIGURE 1.26 Line and plane. (a) (b)

Find an equation of the line in $\mathbb{R}^2$ that is perpendicular to $\mathbf{p} = (1, 2)$ and passes
through $(3, 4)$. The answer is

$$(1, 2) \cdot (x - 3, y - 4) = 0 \quad\text{or}\quad (x - 3) + 2(y - 4) = 0,$$

which simplifies to $x + 2y = 11$.
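
The passage from Equation 5.1 to the standard form $ax + by = d$ amounts to one dot product. The sketch below, with the illustrative function name line_through, recovers the equation just found.

```python
def line_through(p, x0):
    """Coefficients (a, b, d) of ax + by = d for the line in the plane
    through the point x0 and perpendicular to the nonzero vector p."""
    a, b = p
    d = a * x0[0] + b * x0[1]     # d = p . x0
    return a, b, d

print(line_through((1, 2), (3, 4)))  # (1, 2, 11), that is, x + 2y = 11
```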


For a plane in $\mathbb{R}^3$, we let $\mathbf{p} = (a, b, c)$, $\mathbf{x} = (x, y, z)$, and $\mathbf{x}_0 = (x_0, y_0, z_0)$ in
Equation 5.1 to get

$$(a, b, c) \cdot (x - x_0, y - y_0, z - z_0) = 0$$

or

$$a(x - x_0) + b(y - y_0) + c(z - z_0) = 0.$$

Letting $ax_0 + by_0 + cz_0 = d$ gives

$$ax + by + cz = d$$

for an equation of a plane perpendicular to $(a, b, c)$.

FIGURE 1.27 Plane perpendicular to $(1, 1, 1)$.

EXAMPLE 2 In $\mathbb{R}^3$ let $\mathbf{p} = (1, 1, 1)$ and $\mathbf{x}_0 = (1, 0, 0)$. The plane perpendicular to $\mathbf{p}$ and passing
through $\mathbf{x}_0$ has equation

$$(1, 1, 1) \cdot (x - 1, y, z) = 0 \quad\text{or}\quad (x - 1) + y + z = 0$$

or

$$x + y + z = 1.$$

To get an idea of how the plane lies in $\mathbb{R}^3$, we can pick three points on the plane, for
example $(1, 0, 0)$, $(0, 1, 0)$, and $(0, 0, 1)$, and sketch the triangle formed by them as
in Figure 1.27. We can find points on a plane by picking values for two coordinates
and solving the plane's equation for the third. In this example, we found the plane's
intercepts, the points where the plane intersects the axes. Intercepts are easy to spot
since they have two coordinates equal to zero.

The angle between two planes is the angle between vectors perpendicular to the
planes, but as the next example shows, some care is needed in specifying the angle
and choosing which way the vectors point.

IEXAMPLE 3 I The+ 5zsides= 0,of meeting


y
a shallow trough lie in the planes with equations -y + 5z = 0 and
on the x-axis as in Figure l .28(a), which is a cross-sectional
view showing the y- and z-axes, with the x-axis coming straight out from the page.
What is the angle between the sides?
The vectors p = - j + 5k and q = j + 5k are normal to the sides. For the
angle e between p and q we have cose = (p•q)/IPllql. Since p•q = 24 and

FIGURE 1.28
Trough.

p'iq
--e--- ------
---- ----
(a) (b)
Section 5B Euclidean Geometry 35

|p| = |q| = √26, cos θ = 24/26 ≈ 0.923 and (from tables or a calculator) θ ≈ 22.6°.
This looks reasonable for the angle between p and q, as shown in Figure 1.28(b),
but not for the angle between the sides of the trough, labeled φ in Figure 1.28(a).
Instead, θ is the angle between one side of the trough and the extension of the plane
containing the other, shown as a dotted line in the figure, and φ = 180° - θ ≈ 157.4°.
Note that this is the angle between p and -q, which could have been chosen as
normals instead of p and q. In the abstract, any one of ±p and ±q could be chosen
as normals to the planes, and θ or φ could be taken as the angle between the planes;
the appropriate choice in a concrete problem is best made with the help of a sketch.
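
As a rough numerical check of Example 3, the following Python sketch computes both candidate angles from the normal vectors. The function name angle_between_deg is our own illustrative choice and is not notation from the text.

```python
import math

def angle_between_deg(p, q):
    """Angle in degrees between vectors p and q, from cos(theta) = p.q / (|p||q|)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return math.degrees(math.acos(dot / (norm_p * norm_q)))

p = (0, -1, 5)   # normal to the plane -y + 5z = 0
q = (0, 1, 5)    # normal to the plane  y + 5z = 0

theta = angle_between_deg(p, q)                     # angle between p and q
phi = angle_between_deg(p, tuple(-c for c in q))    # angle between p and -q

print(round(theta, 1), round(phi, 1))
```

The two values add up to 180°, which reflects the freedom to replace a normal vector by its negative; the code cannot decide which angle suits the trough, so a sketch is still needed.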

FIGURE 1.29 Distance to line and plane, δ = n • (x1 - x0). Panels (a) and (b).

5B Distance to a Line in ℝ² or a Plane in ℝ³


The representation of a line or plane by an equation p • (x - x0) = 0 is not unique,
because another point on the line or plane can replace x0, and a nonzero scalar
multiple of p can replace p without changing the set of points x that satisfy the
equation. We can use some of this freedom to advantage by requiring the vector p
to have length 1. We call the result a normalized equation, which then becomes
n • (x - x0) = 0, where n = p/|p|. Alternatively we can write n • x - c = 0 or
n • x = c where c = n • x0. Using a normalized equation we can find the distance,
measured perpendicularly, from a point to a line or plane. Figure 1.29 shows some
examples of how this works for both lines and planes.
5.2 Theorem. Let n • (x - x0) = 0, or n • x - c = 0, be a normalized equation
of a line or plane P and let x1 be a point in the same space. Then the distance from
x1 to P is

δ = n • (x1 - x0) = n • x1 - c

if x1 is on the side of P where the tip of n is and is -δ if x1 is on the other side
of P.

Proof. Since |n| = 1, we have

δ = n • (x1 - x0) = |n||x1 - x0| cos θ = |x1 - x0| cos θ,
where θ is the angle between n and the vector x1 - x0. If 0 ≤ θ < π/2, then δ is the
length of the projection of x1 - x0 on a line parallel to n; thus δ is the perpendicular
distance from x1 to P. If π/2 < θ ≤ π, we have cos θ < 0, so δ is negative and the
actual length is -δ. If θ = ±π/2 then x1 lies on P and δ = -δ = 0.


EXAMPLE 4 The line (3, 4) • (x, y) = 1, or 3x + 4y = 1, in ℝ² is perpendicular to p = (3, 4)
of length 5, so the normalized equation is (3/5)x + (4/5)y - 1/5 = 0. The distance from
x1 = (2, 3) to the line is then δ = 6/5 + 12/5 - 1/5 = 17/5. On the other hand, if
x1 = (1, -1) we find δ = 3/5 - 4/5 - 1/5 = -2/5; this tells us that (1, -1) is on the other
side of the line from (2, 3) at a distance of 2/5.

EXAMPLE 5 The plane x + y - 2z = -1 is perpendicular to p = (1, 1, -2) of length √6, so the
normalized equation is

(1/√6)x + (1/√6)y - (2/√6)z + 1/√6 = 0.

The distance from the origin x1 = (0, 0, 0) to the plane is then 1/√6. Note that if
we put x1 = (1, 1, 1) instead, we get the same result, so this point is on the same
side of the plane as the origin and lies at the same distance from the plane.
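
The signed distance δ of Theorem 5.2 is easy to compute numerically. The sketch below is a minimal illustration that normalizes p inside the function, so the equation p • x = c need not be normalized beforehand; the name signed_distance is hypothetical and chosen only for this illustration.

```python
import math

def signed_distance(p, c, x1):
    """Signed distance from x1 to the line or plane p . x = c.

    Positive when x1 lies on the side toward which p points,
    negative on the other side, zero when x1 is on the line or plane.
    """
    length = math.sqrt(sum(a * a for a in p))
    return (sum(a * b for a, b in zip(p, x1)) - c) / length

# Example 4: the line 3x + 4y = 1 in the plane
d1 = signed_distance((3, 4), 1, (2, 3))     # expected 17/5
d2 = signed_distance((3, 4), 1, (1, -1))    # expected -2/5

# Example 5: the plane x + y - 2z = -1
d3 = signed_distance((1, 1, -2), -1, (0, 0, 0))
d4 = signed_distance((1, 1, -2), -1, (1, 1, 1))

print(d1, d2, d3, d4)
```

Opposite signs for d1 and d2 indicate that the two points lie on opposite sides of the line, while equal values for d3 and d4 match the observation made in Example 5.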

EXERCISES

In Exercises 1 and 2, find an equation for the line described in ℝ², and sketch it.
1. Perpendicular to e2 in ℝ² and passing through (2, 3)
2. Perpendicular to (2, -3) and passing through (1, 1)

In Exercises 3 and 4, find an equation for the plane described, and sketch it.
3. Perpendicular to (1, 2, 4) and passing through (-1, 0, 0)
4. Perpendicular to e2 - e3 in ℝ³ and passing through (0, 1, 0)

In Exercises 5 to 8, describe the point or set of points that the plane in ℝ³ with
equation 2x + 3y - z = 2 has in common with the line having the given parameterization.
5. (x, y, z) = t(1, -1, 4) + (1, 0, -2)
6. (x, y, z) = t(-1, 0, 1) + (-1, 1, -1)
7. (x, y, z) = t(-1, 1, 1) + (-1, 1, -1)
8. (x, y, z) = t(1, 1, 5) + (0, 0, -2)

9. Find the point of intersection in ℝ³ of the plane perpendicular to (1, -1, 2) and
passing through (0, -1, 0), and the line parallel to (1, 0, 1) and passing through (1, 1, 1).
10. Find the point of intersection in ℝ³ of the plane perpendicular to (0, 2, -1) and
passing through (0, -1, 0), and the line passing through the points (1, 2, 1) and (-1, 3, 1).

In Exercises 11 and 12, find the cosine of the angle between the given lines in ℝ².
11. The lines with equations x - 2y = 1 and 2x + y = 3
12. The line through (0, 0) and (1, 2) and the line through (1, 2) and (2, 3)

13. Find the cosine of the angle between the planes 2x + y + z = 1 and x - y - z = -1
in ℝ³. [Hint: Look at vectors perpendicular to the planes.]
14. Find the cosine of the angle between the plane 2x + y + z = 1 and a line parallel
to (1, 2, 1).

In Exercises 15 to 18, sketch the plane in ℝ³ with the given equation.
15. x + y - z = 1
16. 2x - y + 3z = 0
17. y + 2z = 1
18. x - z = -1

19. Find an equation for the plane parallel to the plane 3x - 2y + 5z = 2 and passing
through (2, 1, 1).
20. Find a unit vector n perpendicular to the plane passing through a = (1, 0, 1),
b = (2, 1, 0), and c = (1, 1, 1). [Hint: n • (b - a) = 0 and n • (c - a) = 0.]
21. Find an equation for the plane through
(a) (0, 0, 0), (0, 1, 2), (-1, 0, 1)
(b) (1, -1, 1), (1, 1, 1), (1, 2, 1)
22. Find an equation for the plane through the point (2, 3, 5) and perpendicular
to (-1, -4, 1).

For each of the points and planes listed in Exercises 23 to 28, find the distance from
the point to the plane or line. Is the point on the same side of the plane or line as
the origin, or on the opposite side? Is the point above the plane or line, or below it
(where "up" is the direction of the positive y-axis in ℝ² and the direction of the
positive z-axis in ℝ³)?
23. point (2, -1); line 2x + y = 2
24. point (-1, 2); line 2x + y = 2
25. point (1, 0, -1); plane (1, 1, 1) • x = 1
26. point (1, 1, 1); plane (1, 1, 1) • x = 3
27. point (1, 0, -1); plane x + 2y + 3z = 1
28. point (-2, 1, 0); plane x - y + z = 2

29. Let ax + by + cz = d be the normalized equation of a plane in ℝ³, so
a² + b² + c² = 1; what is the distance from the plane to the origin?

SECTION 6 THE CROSS PRODUCT


The cross product is a construction defined only for vectors in ℝ³ that's useful in
a variety of geometric problems and also has applications in physics, especially in
the study of electromagnetism. Our first use of it is as a convenient way to find a
vector perpendicular to two given vectors, and Exercise 23 shows how the following
formula arises naturally in trying to solve this problem.
We begin with the formal definition. The cross product of two vectors
u = (u1, u2, u3) and v = (v1, v2, v3) in ℝ³ is defined to be

6.1 u x v = (u2v3 - u3v2, u3v1 - u1v3, u1v2 - u2v1).

As a help in remembering the formula, note that the pattern of subscripts in each
component comes from the one before it by the substitutions 1 → 2 → 3 → 1.
Formal computation with 2-by-2 and 3-by-3 determinants, treated more generally in
Chapter 2, Section 5, makes the cross product easier to remember in the form

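\[
u \times v =
\begin{vmatrix} i & j & k \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix}
= \begin{vmatrix} u_2 & u_3 \\ v_2 & v_3 \end{vmatrix} i
- \begin{vmatrix} u_1 & u_3 \\ v_1 & v_3 \end{vmatrix} j
+ \begin{vmatrix} u_1 & u_2 \\ v_1 & v_2 \end{vmatrix} k.
\]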

EXAMPLE 1 The cross product of u = (1, -3, 2) and v = (2, 4, -5) is

\[
\begin{vmatrix} i & j & k \\ 1 & -3 & 2 \\ 2 & 4 & -5 \end{vmatrix}
= \begin{vmatrix} -3 & 2 \\ 4 & -5 \end{vmatrix} i
- \begin{vmatrix} 1 & 2 \\ 2 & -5 \end{vmatrix} j
+ \begin{vmatrix} 1 & -3 \\ 2 & 4 \end{vmatrix} k
\]

= (15 - 8)i - (-5 - 4)j + (4 + 6)k = 7i + 9j + 10k.

Working directly from the definition of cross product, the computation is less
transparent:

(1, -3, 2) x (2, 4, -5) = ((-3)(-5) - (2)(4), (2)(2) - (1)(-5), (1)(4) - (-3)(2))

= (15 - 8, 4 + 5, 4 + 6) = (7, 9, 10).


An important property of the cross product of u and v is that it's perpendicular
to both u and v as shown in Figure 1.30(a). In other words,

6.2 u • (u x v) = 0, and v • (u x v) = 0.
FIGURE 1.30 Axis orientation. (a) u x v, with parallelogram area |u x v|; (b) right-handed axes; (c) left-handed axes.

To check the first formula we write

u • (u x v) = (u1, u2, u3) • (u2v3 - u3v2, u3v1 - u1v3, u1v2 - u2v1)

= u1u2v3 - u1u3v2 + u2u3v1 - u2u1v3 + u3u1v2 - u3u2v1.

Now observe that each term matches another with the opposite sign so that the sum
is zero. A similar calculation shows that v • (u x v) = 0.
Note that interchanging u and v exchanges the two terms in each coordinate entry
of u x v and so has the effect of changing the sign of the entry. Hence the cross
product is not commutative. Instead we have

6.3 v x u = -u x v.

EXAMPLE 2 We can illustrate Equation 6.2 by calculating

i x j = (1, 0, 0) x (0, 1, 0)
= ((0)(0) - (0)(1), (0)(0) - (1)(0), (1)(1) - (0)(0))
= 0i + 0j + 1k = k
and noting that k is perpendicular to both i and j. Similar calculations give j x k = i
and k x i = j. From Equation 6.3 we then have j x i = -k, k x j = -i, and
i x k = -j. Recall that we've already adopted the right-handed orientation for
labelling axes shown in Figure 1.30(b). The algebra is the same regardless of what
orientation we choose, but the picture would look different if we had chosen the
left-handed orientation.
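
Formula 6.1 and the properties 6.2 and 6.3 can be checked on specific vectors with a few lines of Python. This is only a sketch; the helper names cross and dot are our own and the vectors are those of Examples 1 and 2.

```python
def cross(u, v):
    """Cross product of two 3-vectors, written out from Formula 6.1."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, u3 * v1 - u1 * v3, u1 * v2 - u2 * v1)

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

u, v = (1, -3, 2), (2, 4, -5)
w = cross(u, v)                      # the vector computed in Example 1

perp_u = dot(u, w)                   # Equation 6.2: should be 0
perp_v = dot(v, w)                   # Equation 6.2: should be 0
swapped = cross(v, u)                # Equation 6.3: should be -w

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
print(w, perp_u, perp_v, swapped, cross(i, j))
```

Integer inputs keep the arithmetic exact, so the perpendicularity checks give exactly zero rather than a small rounding error.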

EXAMPLE 3 Let us find an equation for the plane that has parametric representation x = su +
tv + w with u = (3, -1, 2), v = (2, 5, -2) and w = (0, 0, 4). We need a vector
p perpendicular to the plane. Since u and v are parallel to the plane, and u x v is
perpendicular to both of them, we can take p = u x v = ((-1)(-2) - (2)(5), (2)(2) -
(3)(-2), (3)(5) - (-1)(2)) = (-8, 10, 17) to obtain a vector perpendicular to the
plane. Then p • x = p • w, or -8x + 10y + 17z = 68, is the required equation.

EXAMPLE 4 Unless they are parallel or coincide, two planes in ℝ³ intersect in a line. Here is
how to find a parametric representation for the line of intersection. As an example,
we take the planes with equations 2x - y + z = 10 and -3x + 2y - z = -7,
rewriting these equations as p • x = 10 and q • x = -7, where p = (2, -1, 1) and
q = (-3, 2, -1) are perpendicular to the first and second plane, respectively. The
line of intersection lies in both planes and is therefore perpendicular to both p and
q, so we calculate

v=pxq
= ((-1)(-1) - (1)(2), (1)(-3) - (2)(-1), (2)(2) - (-1)(-3))

=(-1,-1,1)

to get a vector with the direction of the line. We still need to find a point on the
line. Usually it is simplest to find the point where a line meets one of the coordinate
planes x = 0, y = 0, or z = 0. For instance, taking x = 0 in the equations for
the planes gives -y + z = 10 and 2 y - z = - 7, which we solve to give y = 3
and z = 13. The point (0, 3, 13) is therefore on both planes and so on the line of
intersection. We now know the direction of the line of intersection and a point on it,
and have t (-1, -1, 1) + (0, 3, 13) as a parametric representation for the line.
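
The procedure of Example 4 can be sketched in Python as follows. The function intersection_line is a hypothetical helper, and it assumes, as in the example, that the line of intersection crosses the coordinate plane x = 0; otherwise one of the planes y = 0 or z = 0 would have to be used instead.

```python
from fractions import Fraction

def cross(u, v):
    """Cross product of two 3-vectors (Formula 6.1)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def intersection_line(p, c1, q, c2):
    """Return (point, direction) for the line where p.x = c1 meets q.x = c2.

    Assumption: the line meets the coordinate plane x = 0 in one point.
    """
    direction = cross(p, q)
    # Setting x = 0 leaves two equations in y and z:
    #   p[1]*y + p[2]*z = c1,   q[1]*y + q[2]*z = c2,
    # solved here by Cramer's rule with exact fractions.
    det = p[1] * q[2] - p[2] * q[1]
    if det == 0:
        raise ValueError("no single point with x = 0; try y = 0 or z = 0")
    y = Fraction(c1 * q[2] - p[2] * c2, det)
    z = Fraction(p[1] * c2 - c1 * q[1], det)
    return (Fraction(0), y, z), direction

point, direction = intersection_line((2, -1, 1), 10, (-3, 2, -1), -7)
print(point, direction)
```

The determinant det is the first coordinate of p x q, so the test det == 0 says exactly that the direction vector has no x-component and the line never leaves its starting value of x.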

EXAMPLE 5 To find the point of intersection of the line t(-1, 2, 4) + (0, -2, 3) with the plane
through (4, 2, 3), (-2, 0, 1), and (1, 3, -1), we first find an equation for the plane.
The vectors p = (4, 2, 3) - (-2, 0, 1) = (6, 2, 2) and q = (4, 2, 3) - (1, 3, -1) =
(3, -1, 4) are parallel to the plane, so v = p x q = (10, -18, -12) is perpendicular
to the plane. An equation for the plane is v • x = v • (4, 2, 3) = (10, -18, -12) •
(4, 2, 3) = -32. For x on the line we have

v • x = (10, -18, -12) • (t(-1, 2, 4) + (0, -2, 3)) = -94t + 0,

and the point is also on the plane when -94t = -32, that is, when t = 16/47. The
point of intersection is then (16/47)(-1, 2, 4) + (0, -2, 3) = (-16/47, -62/47, 205/47).
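
A short Python sketch of this computation, using exact fractions so that the result can be compared with the hand calculation. The helper names (sub, line_plane_intersection) are our own illustrative choices.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def line_plane_intersection(normal, c, d, a):
    """Parameter t and point where the line t*d + a meets the plane normal . x = c."""
    nd = dot(normal, d)
    if nd == 0:
        raise ValueError("line is parallel to the plane")
    t = Fraction(c - dot(normal, a), nd)
    return t, tuple(t * di + ai for di, ai in zip(d, a))

A, B, C = (4, 2, 3), (-2, 0, 1), (1, 3, -1)
normal = cross(sub(A, B), sub(A, C))     # vector perpendicular to the plane
c = dot(normal, A)                       # right side of normal . x = c
t, point = line_plane_intersection(normal, c, (-1, 2, 4), (0, -2, 3))
print(normal, c, t, point)
```

Substituting the computed point back into normal • x gives c again, which is a quick way to catch arithmetic slips in the hand computation.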

EXAMPLE 6 Suppose we want to find an equation for the plane parallel to the two vectors x1 =
(1, 2, -3) and x2 = (2, 0, 1), and containing the point x0 = (1, 1, 1). A vector p
perpendicular to x1 and x2 is x1 x x2:

p = ((2)(1) - (-3)(0), (-3)(2) - (1)(1), (1)(0) - (2)(2))

= (2, -7, -4).

Writing x = (x, y, z), we require that

p • (x - x0) = 0, that is, (2, -7, -4) • (x - 1, y - 1, z - 1) = 0.

According to the definition of the dot product, this last equation is

2(x - 1) - 7(y - 1) - 4(z - 1) = 0, or 2x - 7y - 4z + 9 = 0.


The following formulas are sometimes useful in calculating with expressions
involving cross products.

6.4 Let u, v, and w be vectors in ℝ³ and let r be a scalar. Then

Anticommutativity: v x u = -u x v
Additivity: u x (v + w) = u x v + u x w
(u + v) x w = u x w + v x w

Homogeneity: r(u x v) = (ru) x v = u x (rv)

Proof. Anticommutativity has already been justified as Equation 6.3. Proofs of the
other properties also follow directly from definition 6.1 and are left as exercises.
Note also that (u x v) x w is usually not equal to u x (v x w), so the cross product
is not associative. See Exercises 25 to 28. •
We have already seen that u x v is perpendicular to both u and v. Two more
properties are needed to fully characterize the cross product geometrically. The first
of these gives the geometric meaning of the length of the cross product, expressed
in the formula
6.5 |u x v| = A(P),

where A(P) is the area of the parallelogram P that has arrows u and v as adjacent
edges as shown in Figure 1.30(a). In other words, the length of the cross product
of two vectors is equal to the area of the parallelogram having the vectors as
adjacent edges. This property makes the cross product useful in computing areas and
volumes in ℝ³, and plays a role in defining the area A(S) of a smooth surface in
Chapter 9, Section 3B. A proof of Equation 6.5 in straightforward steps is outlined
in Exercise 15. The area of the triangle with u and v as adjacent edges is half that
of the parallelogram and is therefore equal to ½|u x v|.

EXAMPLE 7 We saw in the previous example that the cross product of the vectors x1 = (1, 2, -3)
and x2 = (2, 0, 1) is the vector (2, -7, -4). Hence the area of the parallelogram P
with edges x1 and x2 is

A(P) = |(2, -7, -4)| = √(2² + (-7)² + (-4)²) = √69 ≈ 8.3,

while the triangle with these edges, that is, the triangle with vertices at the origin,
(1, 2, -3), and (2, 0, 1), has area ½√69.

Perpendicularity to u and v and having length given by Equation 6.5 isn't enough
to characterize u xv completely, because -u x v has the same properties. The choice
between the two possible directions of u x v relative to the pair (u, v) is settled by
the following rule:

6.6 Right-Hand Rule for the Cross Product. The vector u x v points in the
direction of the thumb when the fingers of the right hand curl from u to v.

This rule is illustrated in Figure 1.31(b). If you look at the plane containing u and
v from the side toward which u x v points, then it takes a counterclockwise
rotation of less than 180° to rotate u to point in the same direction as v.
FIGURE 1.31 Orientation of u, v, w. Panels (a) and (b).

EXAMPLE 8 We have already computed that for the unit coordinate vectors, i x j = k. If you hold
your right hand with the thumb pointing up (in the direction of k), then its fingers
naturally curl in the counterclockwise direction taking i to j as in Figure 1.30(b).
Equivalently, if you look down at the xy-plane from above, it takes a positive
(counterclockwise) rotation to carry i to j.

The choice of the right-hand rule for the cross product instead of a left-hand rule
is an arbitrary convention, but we have already made that choice in drawing the
coordinate axes as we did in Figure 1.4 in Section 1. If we had used the left-hand
rule for orienting the coordinate axes, then the cross product would obey the left-hand
rule shown in Figure 1.30(c).
The cross product is also linked to volumes. Three vectors u, v, w in ℝ³ that don't
lie in a plane determine a solid region B with u, v, w for adjacent edges; B is called
a parallelepiped because it's bounded by three pairs of congruent parallelograms,
illustrated in Figure 1.31(a). Each edge of B is parallel to one of the vectors u, v, w,
and B looks like a lop-sided box. The scalar triple product of u, v, and w, in that
order, is defined to be u • (v x w). We'll show that this number equals either plus
or minus the volume V(B) of B, where B is the parallelepiped determined by u, v,
and w. The precise statement is:
6.7 Theorem. Let B be the parallelepiped with the vectors u, v, w as adjacent
edges. Then u • (v x w) = V(B) if the three vectors obey the right-hand rule and
otherwise u • (v x w) = -V(B).

Proof. To verify Theorem 6.7, note that u • (v x w) = |v x w||u| cos θ, where θ is
the angle between u and v x w as in Figure 1.31(a). We take the base of B to be
the parallelogram P with area A(P) = |v x w| determined by v and w. The figure
shows that the vertical height of B is |u||cos θ|. Hence the volume of B equals the
triple product if 0 ≤ θ ≤ π/2 and equals minus the triple product if π/2 ≤ θ ≤ π.
The sign of the scalar triple product is positive when 0 ≤ θ < π/2, that is, when u
and v x w are on the same side of the plane containing v and w. Since v, w, v x w is
a right-handed system, this happens just when u, v, w is also a right-handed system.
When u, v, w is a left-handed system, u and v x w are on opposite sides of the plane
and π/2 < θ ≤ π, so cos θ < 0 and the triple product is negative. •

EXAMPLE 9 Let u = (1, 1, 1), v = (1, 2, -3), w = (2, 1, 1). The triple product of these three
vectors is (1, 1, 1) • ((1, 2, -3) x (2, 1, 1)) = (1, 1, 1) • (5, -7, -3) = -5, so the
volume of the parallelepiped B determined by the vectors is V(B) = 5. Since the
sign of the triple product is negative, u, v, and w form a left-handed system.
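
Theorem 6.7 translates directly into a small computation: the absolute value of the scalar triple product gives the volume, and its sign gives the orientation. The following sketch uses the vectors of Example 9; the function name triple and the orientation labels are our own.

```python
def cross(u, v):
    """Cross product of two 3-vectors (Formula 6.1)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def triple(u, v, w):
    """Scalar triple product u . (v x w)."""
    return sum(a * b for a, b in zip(u, cross(v, w)))

u, v, w = (1, 1, 1), (1, 2, -3), (2, 1, 1)
t = triple(u, v, w)
volume = abs(t)                      # V(B) in Theorem 6.7

if t > 0:
    orientation = "right-handed"
elif t < 0:
    orientation = "left-handed"
else:
    orientation = "coplanar"         # the three vectors lie in a plane

print(t, volume, orientation)
```

A triple product of zero corresponds to the degenerate case excluded in the text, where the three vectors lie in a plane and the parallelepiped has no volume.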

EXAMPLE 10 In the discussion before Example 7 of Section 4 we defined the flux of a flow with
constant velocity vector v. There we found the flux across a parallelogram P with
unit normal vector n to be Φ = A(P) n • v, where A(P) is the area of P. We now
see that this flux is a scalar triple product: Φ = v • (p x q), where the vectors
p and q represent adjacent edges of P. The reason is that A(P) = |p x q| and
n = (p x q)/|p x q|, so Φ is the scalar triple product of v, p, and q:

Φ = A(P) n • v = |p x q| ((p x q)/|p x q|) • v = v • (p x q).

We use this formula to generalize flux to nonconstant flows across curved surfaces
in Chapter 9, Section 3C.

If you choose three vectors u, v, w in ℝ³ that form a right-handed system and test
v, w, u and w, u, v by the right-hand rule, you'll find that they are also right-handed
systems. Consequently,

6.8 u • (v x w) = v • (w x u) = w • (u x v).

In other words, permuting u, v, w cyclically so u → v → w → u doesn't change
their scalar triple product. But interchanging any two of the vectors changes the
sign of the triple product because that changes the sign of the cross product in
the triple product. Exercise 17 asks you to check Equation 6.8 algebraically, though
we'll see also in Chapter 2, Section 5 that this follows from a basic property of
determinants.

EXERCISES

In Exercises 1 to 4, find the cross product of u and v, and sketch the three vectors
u, v, and u x v.
1. u = e2, v = e1
2. u = j, v = -k
3. u = (0, 1, 2), v = (-1, 0, 1)
4. u = (√2, 1, √3), v = (√2, √2, 1)

In Exercises 5 and 6, find the area of the parallelogram with u and v as adjacent edges.
5. u = i, v = j + k
6. u = (-1, 0, 0), v = (0, -1, 0)

In Exercises 7 and 8, find the area of the triangle with u and v as adjacent edges.
7. u = (3, -1, 2), v = (-1, 0, 1)
8. u = (1, 1, _1), v = (1, 2, 1)

In Exercises 9 and 10, use the cross product to find an equation of the form
ax + by + cz = d for the plane parallel to both u and v that contains the point
(-1, -1, 1).
9. u = (1, 2, 4), v = (4, 2, 1)
10. u = (-3, 0, 1), v = (0, 1, -4)

11. Verify that (a) i x j = k, (b) j x k = i, (c) k x i = j, and (d) u x u = 0 for all
u in ℝ³.
12. Verify properties 6.4(b) and 6.4(c) for arbitrary vectors u, v, w and scalar r.
13. The triangle with vertices (-2, 1, 0), (2, 3, 0), (2, -1, 0) forms the base of an
irregular pyramid with apex at (0, 0, 2).
(a) Make a sketch of the pyramid.
(b) Find a vector perpendicular to each of the three sides.
(c) Find the cosine of the angle between each of the three sides and the base.
14. Verify using coordinates the second of Equations 6.2 of the text: v • (u x v) = 0.
*15. (a) Verify by direct coordinate computation that

|u x v|² = |u|²|v|² - (u • v)².
(b) Use the result of part (a) to show that

|u x v| = |u||v| sin θ,

where θ is the angle between u and v that satisfies 0 ≤ θ ≤ π.
(c) Show that |u||v| sin θ is the area A(P) of the parallelogram with edges u and v,
and hence by part (b) that the length of u x v is equal to A(P), as stated
in Equation 6.5 of the text.
16. Compute the volume of the parallelepiped with adjacent edges (2, 1, 3),
(-1, -2, 4), (3, 3, 2).
17. Verify Equation 6.8 algebraically by writing out the product in terms of
coordinates for three vectors u, v, w.
18. Explain geometrically why Equation 6.8 is consistent with Theorem 6.7.

In Exercises 19 and 20, find the area of the parallelogram in ℝ³ with the given edges
and make a sketch of it.
19. (1, 1, 0), (0, 1, 2)
20. (0, 1, 2), (-3, 5, -1)

21. Find the volume of the parallelepiped B in ℝ³ with edges (1, 1, 0), (0, 1, 2), and
(-3, 5, -1). Make a sketch of B.
22. If u = (2, 1, 3), v = (0, 2, 1), and w = (1, 1, 1), compute
(a) u x v
(b) u • (v x w)
(c) (u x v) • w
(d) (u x v) x w
(e) u x (v x w)
(f) (u • v)w
23. This exercise shows how the formula for the cross product comes up naturally if
you try directly to find a vector perpendicular to two given vectors. Let
u = (u1, u2, u3) and v = (v1, v2, v3) be nonparallel vectors in ℝ³. To find a vector
x = (x, y, z) perpendicular to u and v, we want u • x = v • x = 0, that is,

u1x + u2y + u3z = 0
v1x + v2y + v3z = 0.

(a) Multiply the first equation by v1 and the second by u1 and subtract to get

(u2v1 - u1v2)y + (u3v1 - u1v3)z = 0.

(b) Do a calculation similar to the one in part (a) to get

(u1v2 - u2v1)x + (u3v2 - u2v3)z = 0.

(c) Use (a) and (b) to express x and y in terms of z, and choose a value for z that
avoids fractions in the result. Compare the triple (x, y, z) that you obtain
with u x v.

Sir William Rowan Hamilton introduced the terms scalar and vector in 1846 in
connection with his invention of quaternions, which he defined to be expressions of
the form q = a + bi + cj + dk, where a, b, c, and d are real numbers. He called a the
scalar part of q, representing a single magnitude on the scale of real numbers, and
called bi + cj + dk the vector part, representing a line segment with both magnitude
and direction in ℝ³. The word vector came from astronomy; in the eighteenth
century the line from the sun to a planet was called the planet's radius vector.
Hamilton defined the algebraic operations on quaternions so they became an extension
of those for complex numbers, adding them as linear combinations of the basis
{1, i, j, k} and defining the product of two quaternions by multiplying out assuming
the distributive law and then simplifying using the rules

i² = j² = k² = -1, ij = k = -ji, jk = i = -kj, ki = j = -ik

for multiplying i, j, and k. Note that quaternion multiplication is not commutative.

24. Show that if q1 = s1 + v1 and q2 = s2 + v2 are two quaternions with scalar parts
s1, s2 and vector parts v1, v2, then their product is the quaternion

q1q2 = (s1s2 - v1 • v2) + (s1v2 + s2v1 + v1 x v2).

Thus the product of two quaternions v1, v2 with zero scalar parts yields scalar part
-v1 • v2 and vector part v1 x v2. Quaternions fell from favor among physical
scientists after Josiah Willard Gibbs later introduced the more convenient dot and
cross products.

The remaining exercises for this section deal with associativity of multiplication,
which fails in general for the cross product but holds for quaternion products.
25. The associative law for the cross product holds only for some choices of vectors;
verify this by comparing the following products.
(a) (i x i) x j and i x (i x j)
(b) (i x j) x i and i x (j x i)
(c) (i x j) x k and i x (j x k)
(d) (i x i) x j and j x (i x i)
*26. Let u, v, and w be arbitrary vectors in ℝ³.
(a) Using geometric properties of the cross product, show that there are scalars a
and b such that u x (v x w) = av + bw.
(b) By taking the dot product of both sides of the equation in part (a) with u, show
that a(u • v) = -b(u • w).
(c) Verify using coordinates that
u x (v x w) = (u • w)v - (u • v)w.
*27. Show that the cross product is associative for three vectors u, v, w in that order,
that is, u x (v x w) = (u x v) x w, if and only if v and u x w are linearly dependent.
[Hint: Use part (c) of Exercise 26 and other properties of the cross product to find a
relation between u x (v x w) - (u x v) x w and v x (u x w).]
*28. Use the results of Exercise 24, part (c) of Exercise 26, and Equation 6.8 to show
that quaternion multiplication is associative, so q1(q2q3) = (q1q2)q3 for three
quaternions q1, q2, q3.

Chapter 1 REVIEW

Exercises 1 to 4 refer to vectors a = i + j + k, b = i + 2j + 2k, and c = 2i + 3j + 6k.
1. Which is longer, 4a or c?
2. In the triangle with vertices a, b, and c, which side is longest?
3. Express b - a and c - b - a in terms of i, j, and k.
4. Express k, j, and i in terms of a, b, and c.

In Exercises 5 to 8, express the first vector given as a linear combination of the
other vectors, or show that it's impossible to do so.
5. (2, -1, 3, 2); e1, e2, e3, e4
6. (1, 2); (2, 3), (4, 6), (-6, -9)
7. (4, 1, -2); (1, 2, 3), (6, 5, 4)
8. (3, 1, -2); (1, 2, 3), (6, 5, 4), (5, 3, 1)

FIGURE 1.32

9. Copy Figure 1.32 and then draw arrows representing (a) x + 2y, (b) x - y, and
(c) 2x - 3y. Label the arrows with (a), (b), and (c) to show which is which.
10. Let points a and b be position vectors of diagonally opposite vertices of a
parallelogram and let c be the position vector of a third vertex. Express the position
vector of the fourth vertex in terms of a, b, and c.
11. Find representations su + a and tv + b for the line through (1, 2) and (2, 1), and
the line through (4, 5) and (-1, -2), and find where the lines intersect.
12. Show that for a pair of points a and b, the points p = (1/3)a + (2/3)b and
q = (2/3)a + (1/3)b are on the line segment joining a and b, and divide it into three
equal parts.
13. Find a vector function of t that describes a particle moving along the line through
(1, 2, 3) and (-5, 3, 4) at unit speed.
14. Find a vector function of t that describes a particle moving in a straight line at
constant speed that passes through (2, -3) when t = 1 and through (5, 4) when t = 3.
What is the speed of the motion?
15. Suppose two boats start out at positions u1 and u2 when t = 0 and maintain
constant velocities v1 and v2. What functions p1(t) and p2(t) give their positions at
time t? Let d(t) be the vector displacement from the first boat to the second as a
function of t.
(a) Show that if the boats are on a collision course, then the direction of d(t)
doesn't change with time.
(b) Suppose the direction of d(t) doesn't change with time. When will collision
occur if it does? Under what circumstances will it not occur?
16. Consider airplanes moving in three dimensions instead of boats moving on a
two-dimensional water surface as in Exercise 15. Does this make a difference in the
answers to (a) or (b)?
17. Here are descriptions of four lines K, L, M, and N in the plane:
K: the line through the points (3, 4) and (-2, 3)
L: the line with parametric representation t(i + 2j) + i - 3j
M: the line through the point (8, 5) parallel to the vector 5i + j
N: the line through the origin and the point (2, 4)
Which of the lines are the same? Which of the lines are parallel?
18. Here are descriptions of four lines K, L, M, and N in ℝ³:
K: the line through the points (1, 2, 3) and (4, -5, 6).
L: the line with parametric representation t(3i - 7j + 3k) - 2i + 9j.
M: the line through the point (0, 1, 2) parallel to the vector 3i + k.
N: the line through the origin and the point (1, 2, 3).
Which of the lines are the same? Which of the lines are parallel?

In Exercises 19 to 22, find a parametric representation for the given plane.
19. the plane containing (3, 0, 0), (0, 2, 0), and (0, 0, 5).
20. the plane parallel to the one in Exercise 19 and containing (0, -1, 3).
21. the plane parallel to the x- and y-axes and containing (1, 2, 3).
22. the plane containing the origin, (1, 2, 3), and (-2, -3, 1).

For each set of three points given in Exercises 23 to 26, determine whether the points
lie on a line. If they do, find a parametric representation of the line. If they do not,
find a parametric representation of the plane in which they lie.
23. (3, 1, 2), (0, 0, 0), (-6, -2, -4)
24. (1, 1, 0), (0, 1, 1), (1, 0, 1)
25. (1, 2, 3), (3, 2, 1), (4, 2, 2)
26. (1, 2, -3), (10, 5, -9), (-5, 0, 1)

27. Let a = i + j + k, b = i + 2j + 2k, c = 2i + 3j + 6k, as in Exercises 1 to 4.
(a) Which is larger, the angle between a and b or the angle between a and c?
(b) Find a nonzero vector perpendicular to both a and b.
(c) What is the area of the triangle whose vertices are the origin and the points
a and b?
(d) What is the area of the triangle whose vertices are the points a, b, and c?
28. Express i + 3j - 2k as the sum of
(a) a vector parallel to 2i - 6j - 3k and a vector perpendicular to it
(b) a vector in the plane x - y + 4z = 0 and a vector perpendicular to it
29. A force F = i + 3j - 2k drags an object from (1, 2, 3) to (4, 5, 0). Find the work
done.
30. If the wind velocity is 10i + 20j (in feet per second),
(a) What is the speed of the wind?
(b) How much air blows through a triangular opening with vertices at (-2, 2, 0),
(3, -4, 0), and (0, 0, 5) in one second? (Coordinates are in feet.)
31. Let L be the line (2, 3) • x = 6.
(a) Find the distance of the origin from L.
(b) Find a parametric representation for L.
(c) Find all vectors of length 4 that are perpendicular to L.
32. Let P be the plane (1, 1, 1) • x = 3.
(a) Find the distance of the origin from P.
(b) Find a parametric representation for P.
(c) Find all vectors of length 4 that are perpendicular to P.
33. Let L be the line t(2, 1) + (-2, 0).
(a) Find a parametric representation for the line K that is perpendicular to L and
passes through the origin.
(b) Find a unit vector n and number c such that L has the equation n • x = c.
(c) What is the distance from L to (3, 5)? Is (3, 5) on the same side of L as the
origin?
34. Let P be the plane s(3, 2, 1) + t(2, 1, 1) + (-2, 0, 1).
(a) Find a parametric representation for the line L that is perpendicular to P and
passes through the origin.
(b) Find a unit vector n and number c such that P has the equation n • x = c.
(c) What is the distance from P to (3, 5, -2)? Is (3, 5, -2) on the same side of P
as the origin?
35. Find a parametric representation for the line of intersection of the planes with
equations x + y + z = 3 and 2x - y - z = 5, and find the point where the line
intersects the plane with equation x - y = 2.
CHAPTER 2

EQUATIONS AND MATRICES

Many specific problems about vectors and their applications require solving systems
of first-degree equations. This chapter is mainly about how to solve such systems, but
also features applications and an optional look in Section 2D at geometric interpretation
of the solutions using the vector geometry of Chapter 1. In practice, the matrix
methods introduced in Section 2 of this chapter often provide the most efficient way
to do the necessary computations.

SECTION 1 SYSTEMS OF LINEAR EQUATIONS


In this section we look at examples of systems of linear equations and find their
solutions in a systematic way that foreshadows the matrix method to be introduced
in Section 2. We start with a linear system of a type that occurs repeatedly in the
rest of the book:

a_11 x + a_12 y + a_13 z = b_1
a_21 x + a_22 y + a_23 z = b_2
a_31 x + a_32 y + a_33 z = b_3,

where the a_ij and b_j are given numbers. All linear systems of equations have this
form, though the number of equations and the number of variables may differ. We
want to find all the triples of numbers (x, y, z) that satisfy all three equations. Stated
geometrically, we want to find all points or vectors in R^3 that satisfy the system.
Recall that each of these equations represents a plane in R^3, so we're looking for
the points (x, y, z) that are on all three planes. Geometrically we know that three
distinct planes intersect either in a single point, or in a line, or else have no points
in common. We'll see how all three cases occur algebraically, and show how to
represent all solutions parametrically.
1A Elimination Method
Two systems are called equivalent if they have the same set of solutions. For
example, a system representing three planes intersecting in a single line will be
equivalent to the system we get by discarding one of the three planes. Our procedure
will be to alter a system in a sequence of steps to arrive at an equivalent system for
which the solutions are obvious. The following example illustrates the process.

EXAMPLE 1 To find all x, y, z simultaneously satisfying the equations

3x + 12y + 9z = 3
2x + 5y + 4z = 4
-x + 3y + 2z = -5,

multiply the first equation by 1/3, which makes the coefficient of x equal to 1:

x + 4y + 3z = 1
2x + 5y + 4z = 4
-x + 3y + 2z = -5.

Add (-2) times the first equation to the second, and replace the second equation by
the result; this eliminates x from the second equation by making its coefficient equal
to 0:

x + 4y + 3z = 1
- 3y - 2z = 2
-x + 3y + 2z = -5.

Add the first equation to the third, and replace the third equation by the result:

x + 4y + 3z = 1
- 3y - 2z = 2
7y + 5z = -4.

Next multiply the second equation by -1/3:

x + 4y + 3z = 1
y + (2/3)z = -2/3
7y + 5z = -4.

Add -4 times the second equation to the first, and -7 times the second equation to
the third to get

x + (1/3)z = 11/3
y + (2/3)z = -2/3
(1/3)z = 2/3.

Multiply the third equation by 3 to get

x + (1/3)z = 11/3
y + (2/3)z = -2/3
z = 2.

Add (-1/3) times the third equation to the first and (-2/3) times the third equation to
the second to get

x = 3
y = -2
z = 2.

Hence the system has one solution, the point (x, y, z) = (3, -2, 2) in R^3.
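
The steps of Example 1 can be replayed mechanically. The following Python sketch is only an illustration and not part of the text: the helper names `scale` and `add_multiple` are hypothetical, and exact fractions are assumed so that no rounding occurs. Each row stores the coefficients of x, y, z followed by the right side.

```python
from fractions import Fraction as F

def scale(rows, i, r):
    """Multiply equation i by the nonzero scalar r."""
    rows[i] = [r * entry for entry in rows[i]]

def add_multiple(rows, i, j, r):
    """Add r times equation j to equation i."""
    rows[i] = [a + r * b for a, b in zip(rows[i], rows[j])]

# Coefficients of x, y, z and then the right side of each equation.
rows = [[F(3), F(12), F(9), F(3)],
        [F(2), F(5), F(4), F(4)],
        [F(-1), F(3), F(2), F(-5)]]

scale(rows, 0, F(1, 3))             # x + 4y + 3z = 1
add_multiple(rows, 1, 0, F(-2))     # -3y - 2z = 2
add_multiple(rows, 2, 0, F(1))      # 7y + 5z = -4
scale(rows, 1, F(-1, 3))            # y + (2/3)z = -2/3
add_multiple(rows, 0, 1, F(-4))     # x + (1/3)z = 11/3
add_multiple(rows, 2, 1, F(-7))     # (1/3)z = 2/3
scale(rows, 2, F(3))                # z = 2
add_multiple(rows, 0, 2, F(-1, 3))  # x = 3
add_multiple(rows, 1, 2, F(-2, 3))  # y = -2

solution = [row[3] for row in rows]
print(solution)
```

Each call corresponds to one sentence of the example, so the comments can be checked line by line against the displayed systems.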

We can verify by substitution into the initial system of equations above that we've
found a solution, but this verification doesn't rule out the possibility that the original
equations might still have other solutions. We'll dispose of this idea once and for all
by singling out the two operations we use to solve linear systems and then proving
that the solution set of a system remains unchanged after applying them to the
system.

1.1 Definition An elementary multiplication multiplies both sides of a single
equation by a nonzero scalar r. Since r ≠ 0 the operation is reversed by the
corresponding inverse elementary multiplication by r^(-1). An elementary modification
of a single equation adds a scalar multiple by r of another equation to the equation
to be modified. The operation is reversed by the corresponding inverse elementary
modification that uses the scalar -r.

In the previous example we used only elementary operations, and the following
theorem guarantees that the one solution we found is the only solution.

1.2 Theorem. Applying elementary operations to a system of linear equations
does not change the solution set of the system.
Proof. If a set of numbers satisfies an equation then the equation is still satisfied by
the same set of numbers after multiplication by a scalar. Similarly, if the same set of
numbers satisfies two equations it also satisfies the sum of the two equations. Thus
every solution of a system also satisfies the equations resulting from a sequence of
the two elementary operations.
Conversely, starting from a system that has been modified by elementary opera-
tions, we can apply the inverse of each of the operations in the reverse order and get
back the original system. Thus every solution of the modified system also satisfies
the original system. •

EXAMPLE 2 Consider just the first two equations in the previous example:

3x + 12y + 9z = 3
2x + 5y + 4z = 4.

These equations represent two planes, so they may intersect in a line. When we solved
these two equations along with the third equation in Example 1, it so happened that
we avoided until the end adding a multiple of the third equation to either of the
others, so we can go through the first steps of that example simply ignoring the third
equation. Three steps from the end we arrive at

x + (1/3)z = 11/3
y + (2/3)z = -2/3.

There are infinitely many triples (x, y, z) that satisfy the equations, because for an
arbitrary value of z the last two equations determine corresponding x and y values
such that (x, y, z) is a solution. We introduce a parameter t and write

z = t

to denote an arbitrary value of z. Then we can write the equations in final form as

x = -(1/3)t + 11/3
y = -(2/3)t - 2/3
z = t.

In vector form the equations become

(x, y, z) = t(-1/3, -2/3, 1) + (11/3, -2/3, 0).

This is a parametric representation for a line, just what we thought we might get
from the intersection of two planes.
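
A quick way to gain confidence in a parametric solution is to substitute it back into the equations. The sketch below is an illustration only, with hypothetical function names; it checks the line of Example 2 against both plane equations for a range of parameter values, using exact fractions.

```python
from fractions import Fraction as F

def point_on_line(t):
    """The point t(-1/3, -2/3, 1) + (11/3, -2/3, 0)."""
    return (F(-1, 3) * t + F(11, 3), F(-2, 3) * t - F(2, 3), F(1) * t)

def satisfies_both(p):
    """True if p lies on both planes of Example 2."""
    x, y, z = p
    return 3 * x + 12 * y + 9 * z == 3 and 2 * x + 5 * y + 4 * z == 4

checks = [satisfies_both(point_on_line(F(t))) for t in range(-5, 6)]
print(all(checks))
```

Taking t = 2 gives the point (3, -2, 2), which is the single solution found in Example 1 when the third plane is included.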

We remarked that the intersection determined by three scalar equations in three
unknowns could consist of a line; just start with the equations of three different planes
all of which contain the line. The various possibilities for three distinct equations are
in Figure 2.1: three planes intersecting in a point, three planes intersecting in a line,
and three planes with no point in common. The third case, in which there are no
solutions, comes from what is called an inconsistent system. In addition to distinct
planes that intersect in three parallel lines as in Figure 2.1(c), planes such that some
pair, or all three, are parallel also illustrate inconsistent systems.

FIGURE 2.1

EXAMPLE 3 For an example of an inconsistent system, consider the equations

x + y - 2z = 1
-3x + 2y + z = 0
-x + 4y - 3z = 1.

To try to solve this system, we add 3 times the first equation to the second and
1 times the first to the third. The result is

x + y - 2z = 1
5y - 5z = 3
5y - 5z = 2.

There are no values of y and z that can satisfy the last two equations simultaneously,
so we conclude that the system has no solutions, that is, that the system is
inconsistent. Geometrically, we see that the given system turns out to be equivalent
to one in which the last two equations represent distinct parallel planes, so there is
no point of intersection.
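
In a computation, inconsistency shows up as an equation whose left side has become zero while its right side has not. The following sketch is illustrative, with a hypothetical helper `add_multiple`. It repeats the two elimination steps above and then takes one further step that is not in the text, subtracting the second equation from the third, to make the contradiction explicit.

```python
from fractions import Fraction as F

def add_multiple(rows, i, j, r):
    """Add r times equation j to equation i."""
    rows[i] = [a + r * b for a, b in zip(rows[i], rows[j])]

# Coefficients of x, y, z and then the right side of each equation.
rows = [[F(1), F(1), F(-2), F(1)],
        [F(-3), F(2), F(1), F(0)],
        [F(-1), F(4), F(-3), F(1)]]

add_multiple(rows, 1, 0, F(3))   # 5y - 5z = 3
add_multiple(rows, 2, 0, F(1))   # 5y - 5z = 2
add_multiple(rows, 2, 1, F(-1))  # extra step: 0 = -1, impossible

contradiction = all(c == 0 for c in rows[2][:3]) and rows[2][3] != 0
print(rows[2], contradiction)
```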

Example 2 illustrates a general principle that we'll prove in Section 2B: When a
system of linear equations has more than one solution it always has infinitely many
solutions, solutions that we can find by assigning arbitrary values to some of the
unknowns, and then determining values for the other unknowns in terms of these
arbitrary values. As in Example 2, this always happens for a consistent system with
more unknowns than equations. The simplest case is that of one equation in several
unknowns, for example,

2x - 3y + z = 1.

If we let x = s and y = t, then

z = -2s + 3t + 1,

so all solutions are of the form

(x, y, z) = (s, t, -2s + 3t + 1)
= s(1, 0, -2) + t(0, 1, 3) + (0, 0, 1),

a parametric representation for a plane.
We'll see in Section 2C that homogeneous systems, in which all the constant
terms are zero, are the key to describing multiple solutions. Such systems always
have a zero solution, in which all the unknowns equal zero. The question then is
whether there are other solutions. If a system is not only homogeneous but has more
unknowns than equations, we'll see in Section 2B that there are always infinitely
many nonzero solutions.

EXAMPLE 4 In the homogeneous system

x + y - z = 0
x - y + z = 0,

set z = t. Then the system becomes

x + y = t
x - y = -t.

Subtracting the first equation from the second gives

x + y = t
- 2y = -2t.

Hence y = t from the second equation, so x = 0 from the first. The solutions are

(x, y, z) = (0, t, t)
= t(0, 1, 1),

which exhibits the infinitely many solutions as a parametric representation of the
points on a line in R^3.

EXAMPLE 5 Many questions about lines and planes come down to solving systems of
linear equations. For example, two lines in R^3 with parametric representations
su_1 + v_1 and tu_2 + v_2 intersect if and only if there are values of s and t such that
the vector equation su_1 + v_1 = tu_2 + v_2 holds, and this amounts to a system of linear
equations for s and t. If we take u_1 = (3, 2, 1), v_1 = (-1, 0, 1), u_2 = (0, 2, 1), and
v_2 = (-4, 2, 2), then the vector equation is equivalent to

3s - 1 = -4
2s = 2t + 2
s + 1 = t + 2.

The first equation gives s = -1, and then the second gives t = -2. These values
also satisfy the third equation, so the lines do intersect. The point of intersection is
obtained by putting s = -1 in su_1 + v_1 (or t = -2 in tu_2 + v_2) and is
-(3, 2, 1) + (-1, 0, 1) = (-4, -2, 0). If we change v_2 to (-4, 2, 0), the third equation
is changed to s + 1 = t and is not satisfied by the values of s and t that satisfy the
first two equations, so the lines don't intersect.
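
The same test can be written as a short routine. This is a minimal sketch rather than a general method: the function name `intersect` is hypothetical, and it assumes that the first two coordinate equations already determine s and t uniquely, which holds for the vectors above. It solves those two equations by Cramer's rule and then checks the third.

```python
from fractions import Fraction as F

def intersect(u1, v1, u2, v2):
    """Common point of the lines s*u1 + v1 and t*u2 + v2 in R^3, or None."""
    # Coordinate k of the vector equation: s*u1[k] - t*u2[k] = v2[k] - v1[k].
    a, b, e = u1[0], -u2[0], v2[0] - v1[0]
    c, d, f = u1[1], -u2[1], v2[1] - v1[1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("first two equations do not determine s and t")
    s = F(e * d - b * f, det)
    t = F(a * f - e * c, det)
    if s * u1[2] + v1[2] != t * u2[2] + v2[2]:
        return None  # the third equation fails, so the lines do not meet
    return tuple(s * u + v for u, v in zip(u1, v1))

meet = intersect((3, 2, 1), (-1, 0, 1), (0, 2, 1), (-4, 2, 2))
miss = intersect((3, 2, 1), (-1, 0, 1), (0, 2, 1), (-4, 2, 0))
print(meet, miss)
```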

EXAMPLE 6 Sometimes we know the degree of a polynomial f(x) but don't know the
coefficients. If we know values of f(x) for enough values of x, it may be possible to
find the coefficients by solving a system of equations. For example, when a particle
moves in the plane under the influence of a constant force that is parallel to the y-axis,
its path is a parabola with an equation of the form y = f(x) = ax^2 + bx + c. If the
points (x, y) = (0, 1), (2, 4), and (3, 3) are known to be on the path then we can
find a, b, and c by solving the equations

0a + 0b + c = 1
4a + 2b + c = 4
9a + 3b + c = 3.

Exercise 11 asks you to finish the calculation to find the coefficients of f(x).

EXERCISES

Some of the systems of equations in Exercises 1 to 6 have one solution, some have
more, and some have no solutions. If there are solutions, find all of them and
interpret them as an intersection of lines or planes. In the cases where there is no
solution, give a geometric explanation.

1. x + y = 1
   x - y = 2
2. 2x - y = 2
   -2x + y = 2
3. x + y + z = 0
   x - y = 0
   y + z = 0
4. x + y + z = 0
   x - y = 0
   2x + z = 0
5. x + y + z = 0
   x + y - z = 1
   x + y + 2z = 2
6. x - 2y = 1
   2x + y = -1
   x - 7y = 4

In Exercises 7 to 10, find a point of intersection of the two given lines, or else show
that they do not intersect.

7. x = t(1, -1, 2) + (1, 1, 1), x = s(3, 2, 1) + (-2, -6, 5)
8. x = t(1, 1, 2) + (0, 1, 1), x = s(-2, 1, 1) + (2, 1, 2)
9. x = t(1, 2) + (2, 1), x = s(1, 3) + (3, -1)
10. x = t(-1, 1, 1) + (1, 0, 2), x = s(2, 0, 2) + (1, 2, 2)
11. (a) Finish the calculation in Example 6 in the text, by finding the values of
a, b, c, and f(x).
(b) Is there a value y such that if the path of the object in Example 6 passed through
(0, 1), (2, 4), and (3, y), then the value of a would be 0? Is the path a parabola
in this case?
12. Suppose f(x) = c_1 e^x + c_2 e^(-x) + c_3, where c_1, c_2, and c_3 are constants. How
should these constants be chosen so that f(0) = 1, f'(0) = 1, and f''(0) = 2?
13. Suppose that a mixture of sand and cinders contains 10 cubic yards and weighs
34 tons. If the sand weighs 4 tons per cubic yard and the cinders weigh 1 ton per
cubic yard, how much of each does the mixture contain?
14. Suppose that various mixtures are made of substances S_1, S_2, and S_3 having
densities 2, 3, and 1 respectively, measured in grams per cubic centimeter. Suppose
also that the price of each substance in cents per cubic centimeter is 4, 3, and 1,
respectively. Is it possible to make a mixture weighing 10 grams, with a volume of
20 cubic centimeters and costing 1 dollar?

Recall from Chapter 1, Section 1 that a vector b is a linear combination of
a_1, ..., a_n if there are scalars x_1, ..., x_n such that b = x_1 a_1 + ... + x_n a_n. In
Exercises 15 to 18, find coefficient x's to express b as a linear combination of the
a's, or show that no such coefficients exist. First convert the vector equation into a
system of linear equations for the x_i.

15. to 18. [The entries of the column vectors a_1, a_2, ... and b for these four
exercises are not recoverable from the source.]
1B Applications
Here are some problems in various areas of applied mathematics that require solving
systems of linear equations.

EXAMPLE 7 An assembly of electrical conductors is called a network if each pair of
junctions J_i, J_j in the assembly is contained in a closed loop, or circuit, where each
segment represents a connecting wire with a given electrical resistance. When such a
network is connected to a power source, currents flow in the segments, and a voltage is
measurable at each junction. Ohm's law states that the current flow in a wire is
proportional to the difference in voltage at its two ends, the constant of proportionality
being the reciprocal of the wire's resistance:

c_ij = (v_i - v_j) / r_ij,   (1)

where c_ij is the current flowing from junction J_i to junction J_j, r_ij is the resistance
of the connection between junctions J_i and J_j, and v_i and v_j are the values of the
voltages at junctions J_i and J_j. The standard units of measurement are amperes for
current, ohms for resistance, and volts for voltage. A negative value for the current
from J_i to J_j indicates a current flowing from J_j to J_i.
Figure 2.2(a) shows a network with four junctions and five segments, with the
resistance of each segment indicated beside it. Suppose external power source
terminals connected at junctions J_1 and J_4 maintain values v_1 = 12 and v_4 = 0. Since
junction J_2 has no external connection, the current flowing in must balance the
current flowing out, so that if signs are taken into account, the sum of the currents out
of junction J_2 must be zero. Using Equation (1), we get the equation

(1/2)(v_2 - v_1) + (v_2 - v_3) + (1/6)(v_2 - v_4) = 0.

Rewriting in the form

v_2 = [(1/2)v_1 + v_3 + (1/6)v_4] / (1/2 + 1 + 1/6),

we see that v_2 is a weighted average of v_1, v_3, and v_4, with coefficients that are the
reciprocals of the resistances in the lines joining J_2 to the others. A similar equation
holds at each junction that doesn't have an external connection. Thus at J_3

(1/6)(v_3 - v_1) + (v_3 - v_2) + (1/3)(v_3 - v_4) = 0.

FIGURE 2.2 Electric networks. (a) (b)

Since we assumed external voltages v_1 = 12 and v_4 = 0, we reorder terms in the
previous two equations, and get

(5/3)v_2 - v_3 = 6
-v_2 + (3/2)v_3 = 2.

Solving this system gives v_2 = 22/3 ≈ 7.33 and v_3 = 56/9 ≈ 6.22. Once we know the
voltages, we can find the currents from (1). Thus the current from J_1 to J_2 is

c_12 = (v_1 - v_2)/r_12 ≈ (1/2)(12 - 7.33) ≈ 2.34.

Similarly the current from junction J_1 to junction J_3 is

c_13 = (v_1 - v_3)/r_13 ≈ (1/6)(12 - 6.22) ≈ 0.96.

The total current from junction J_1 into the rest of the network is then about
2.34 + 0.96 ≈ 3.3 amperes, which is the current flowing into junction J_1 from the
outside source.
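
The network computation can be checked numerically. In the sketch below the resistances are assumptions: the values 2 and 6 ohms for the segments out of junction 1 appear in the current computations above, while the remaining three values are inferred so that the junction equations reproduce the stated voltages; they are not read from Figure 2.2. The names `res` and `g` are hypothetical.

```python
from fractions import Fraction as F

# Assumed resistances in ohms between junctions (inferred, see lead-in).
res = {(1, 2): F(2), (1, 3): F(6), (2, 3): F(1), (2, 4): F(6), (3, 4): F(3)}
v1, v4 = F(12), F(0)

def g(i, j):
    """Reciprocal of the resistance of the segment joining junctions i and j."""
    return 1 / res[(min(i, j), max(i, j))]

# Current balance at junctions 2 and 3, with the known v1 and v4 moved right:
#   a11*v2 + a12*v3 = b1,   a21*v2 + a22*v3 = b2
a11, a12 = g(2, 1) + g(2, 3) + g(2, 4), -g(2, 3)
a21, a22 = -g(3, 2), g(3, 1) + g(3, 2) + g(3, 4)
b1 = g(2, 1) * v1 + g(2, 4) * v4
b2 = g(3, 1) * v1 + g(3, 4) * v4

# Solve the 2-by-2 system by Cramer's rule.
det = a11 * a22 - a12 * a21
v2 = (b1 * a22 - a12 * b2) / det
v3 = (a11 * b2 - b1 * a21) / det

# Ohm's law for the two segments leaving junction 1.
c12 = (v1 - v2) / res[(1, 2)]
c13 = (v1 - v3) / res[(1, 3)]
print(v2, v3, float(c12 + c13))
```

Working with exact fractions gives v_2 = 22/3 and v_3 = 56/9, so the rounded figures quoted in the example can be compared against exact values.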

EXAMPLE 8 We may regard vectors in R^2 or R^3 as representing forces acting at some
point, which for convenience we take to be the origin. The direction of the arrow is the
direction in which the force acts, and the length of the arrow is the magnitude of the
force. Our fundamental physical assumption here is that if more than one force acts at a
point then the resulting force acting at the point is represented by the sum R of the
separate force vectors acting there. In Figure 2.3 we have two different pictures, and
the resultant arrow R appears only in Figure 2.3(a). For example, suppose that the
force vectors in Figure 2.3(a) lie in a plane, which we take to be R^2 with the origin
at the point of action. If we have

F_1 = (-1, 3), F_2 = (4, 3), F_3 = (-2, -4),   (2)

then by definition

R = F_1 + F_2 + F_3 = (-1, 3) + (4, 3) + (-2, -4) = (1, 2).


FIGURE 2.3 Force vectors.

Suppose we are given only the directions of the three force vectors and are asked
to find corresponding forces that will produce a given resultant, say, R = (-1, -1).
In other words, suppose we want to find nonnegative numbers c_1, c_2, c_3 such that

c_1 F_1 + c_2 F_2 + c_3 F_3 = R.   (3)

(Having some c_i < 0 would reverse the direction of the corresponding force.) The
vector equation is equivalent to the system of equations we get by substituting for
the F_i the given vectors (2) that determine the force directions. We want

c_1(-1, 3) + c_2(4, 3) + c_3(-2, -4) = (-1, -1), or

-c_1 + 4c_2 - 2c_3 = -1
3c_1 + 3c_2 - 4c_3 = -1.

Since we have two equations and three unknowns, we would expect in general to be
able to specify one of the c_i and then solve for the others. However, recall that the
c_i are to be nonnegative. In particular, a glance at Figure 2.3(a) shows that we could
not get a resultant equal to (-1, -1) unless c_3 is positive. Hence we try c_3 = 1.
This choice leads to the pair of equations

-c_1 + 4c_2 = 1
c_1 + c_2 = 1.

These equations have the unique solution c_1 = 3/5, c_2 = 2/5. Thus the triple
(c_1, c_2, c_3) = (3/5, 2/5, 1) is one possible solution, and the three force vectors are

c_1 F_1 = (-3/5, 9/5), c_2 F_2 = (8/5, 6/5), c_3 F_3 = (-2, -4),

with magnitudes |c_1 F_1| = (3/5)√10, |c_2 F_2| = 2, |c_3 F_3| = 2√5.


We could equally well have asked for an assignment of force magnitudes that
would put the system in equilibrium, so that the resultant R is the zero vector. We
would then have replaced the vector ( -1, -1) on the right side of Equation (3) by
(0, 0) and solved the new system in a similar way.
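
The force computation for the resultant (-1, -1) can be sketched in a few lines. This is an illustration only: the variable names are hypothetical, the trial value c_3 = 1 is the one chosen in the text, and the remaining 2-by-2 system is solved by Cramer's rule, which is a convenience here and not the method of the text.

```python
from fractions import Fraction
from math import sqrt, isclose

F1, F2, F3 = (-1, 3), (4, 3), (-2, -4)
target = (-1, -1)
c3 = Fraction(1)  # trial value chosen in the text

# With c3 fixed, c1*F1 + c2*F2 = target - c3*F3 is a 2-by-2 system.
rhs = tuple(t - c3 * f for t, f in zip(target, F3))
det = Fraction(F1[0] * F2[1] - F2[0] * F1[1])
c1 = (rhs[0] * F2[1] - F2[0] * rhs[1]) / det
c2 = (F1[0] * rhs[1] - rhs[0] * F1[1]) / det

# Check the resultant and compute the magnitudes of the scaled forces.
resultant = tuple(c1 * a + c2 * b + c3 * c for a, b, c in zip(F1, F2, F3))
magnitudes = [float(c) * sqrt(x * x + y * y)
              for c, (x, y) in zip((c1, c2, c3), (F1, F2, F3))]
nonnegative = min(c1, c2, c3) >= 0
print(c1, c2, resultant, magnitudes, nonnegative)
```

Replacing `target` by (0, 0) gives the equilibrium version of the problem; the nonnegativity check then tells whether the trial value of c_3 leads to an admissible solution.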
EXAMPLE 9 The analysis in this example is useful for distinguishing completely random
behavior of a rat in a maze from purposeful or conditioned behavior. We define for this
example a random walk to be a process such that a rat proceeds between fixed
positions along sequences of paths, each path having a given probability of being
used. Specifically we assume about random behavior that the probability of leaving
some position along a particular path is the same for all paths heading away from
that position. Some sample mazes appear in Figure 2.4.
A probability is a number p in the interval 0 ≤ p ≤ 1, and the probability or
likelihood of a particular event is equal to the sum of the probabilities of the various
distinct ways that event can occur. Thus in Figure 2.4(a), since we assume that all
paths heading away from a_5 are equally likely, it follows that the probability of
leaving a_5 along each of the two possible paths is 1/2. Similarly, each of the three
paths from a_1 has probability 1/3, as do the three paths from a_4. Note also that the
probability of going from a_1 to a_2 is less than the probability of going from a_2 to
a_1. We also assume for this example that the probability of two successive events is
equal to the product of their respective probabilities. Thus going from a_2 to a_1 to a_4
has the probability (1/2)(1/3) = 1/6.
We can now ask a question such as the following: What is the probability p_k of
starting at a_k and arriving at the specified position a_5 without first going to a_4? We
see that starting at a_1 we can go to a_5 directly with probability 1/3, or we can go to
a_2 with probability 1/3 and then go to a_5 with probability p_2, the probability of going
from a_2 to a_5 without going to a_4. Thus

p_1 = 1/3 + (1/3)p_2.

Similarly, because going to a_4 does not occur in the events we are watching,

p_2 = (1/2)p_1 + (1/2)p_3
p_3 = (1/2)p_2.

We rewrite the previous three equations as

p_1 - (1/3)p_2 = 1/3
-(1/2)p_1 + p_2 - (1/2)p_3 = 0
-(1/2)p_2 + p_3 = 0

and solve them by routine methods. We get p_1 = 3/7, p_2 = 2/7, p_3 = 1/7. It appears
that the closer we start to a_5 the more likely we are to get to a_5 without going to
a_4, but the exact probabilities depend on the entire maze.
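
The "routine methods" can be sketched as a small exact solver. The function `solve` below is a hypothetical helper, not taken from the text; it assumes a square system with exactly one solution. It uses the two elementary operations of Section 1A together with row interchanges, which are added here only as a convenience.

```python
from fractions import Fraction as F

def solve(aug):
    """Solve a square system given as rows of coefficients plus right side."""
    rows = [[F(x) for x in row] for row in aug]
    n = len(rows)
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(i for i in range(col, n) if rows[i][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Scale so the pivot is 1, then clear the rest of the column.
        rows[col] = [x / rows[col][col] for x in rows[col]]
        for i in range(n):
            if i != col:
                m = rows[i][col]
                rows[i] = [a - m * b for a, b in zip(rows[i], rows[col])]
    return [row[-1] for row in rows]

# Unknowns p1, p2, p3 for the maze of Figure 2.4(a).
p = solve([[1, F(-1, 3), 0, F(1, 3)],
           [F(-1, 2), 1, F(-1, 2), 0],
           [0, F(-1, 2), 1, 0]])
print(p)
```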

FIGURE 2.4 Rat mazes. (a) (b)

EXAMPLE 10 In a network of interconnected water pipes the junctions are pipe joints.
It's usual in a pipe network to assign each pipe a positive flow direction with an arrow
as in Figure 2.5. With this understanding a positive number r_k will be a flow rate in the
direction assigned to the kth pipe, while a negative number -r_k will be a flow of
equal rate in the opposite direction. We'll separate the flow rates into internal rates
r_k and rates t_k from or to external sources or drains. Specifying an external rate
t_k = 0 at a joint closes off the external pipe there. We also assume that the inflow
at a joint equals the outflow. Thus at the upper left corner in Figure 2.5(a) we find
t_1 = r_1 + r_2, while at the lower left we find r_3 = r_2 + t_3, or -r_2 + r_3 = t_3. Checking
each external joint of the network of Figure 2.5(a), we find the entire set of equations
relating the rates t_k to the rates r_k:

r_1 + r_2 = t_1
-r_1 - r_4 + r_5 = t_2
-r_2 + r_3 = t_3   (4)
-r_3 + r_4 + r_6 = t_4
r_5 + r_6 = t_5.

From these equations we conclude that specifying the flows r_k in the internal pipes
completely determines the flows t_k at the external joints.
Turning the problem around, we can ask to what extent specifying the flows t_k
at the external joints will specify the flows r_k in the pipes. In particular, we can try
specifying that the exterior flow t_k at each joint should be zero. This leads to the
system of five equations in six unknowns:

r_1 + r_2 = 0
-r_1 - r_4 + r_5 = 0
-r_2 + r_3 = 0   (5)
-r_3 + r_4 + r_6 = 0
r_5 + r_6 = 0.

We can let r_6 = a be an arbitrary number, so we get r_5 = -a from the last
equation. Similarly, let r_4 = b be an arbitrary number. Noting from the first and
third equations that r_3 = r_2 = -r_1, the remaining two equations for r_1 and r_4 both
reduce to r_1 = -a - b, so the solution vector is

r_0 = (-a - b, a + b, a + b, b, -a, a).

FIGURE 2.5 Water pipes. (a) (b)

It follows that there are infinitely many pipe flows, depending on the parameters
a and b, that will produce external flows t_k = 0 for k = 1, 2, 3, 4, 5. We'll see
in Section 2C that every solution r of a system such as Equation (4) is the sum
r = r_p + r_0 of one particular solution r_p and one of the solutions in the 2-parameter
family r_0.
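
The two-parameter family can be checked directly against the system (5). In this sketch the names `A`, `external_flows`, and `r0` are hypothetical; each row of `A` holds the coefficients of r_1, ..., r_6 in one of the five equations of (5).

```python
from itertools import product

# Coefficients of r1, ..., r6 in the five equations of system (5).
A = [[1, 1, 0, 0, 0, 0],
     [-1, 0, 0, -1, 1, 0],
     [0, -1, 1, 0, 0, 0],
     [0, 0, -1, 1, 0, 1],
     [0, 0, 0, 0, 1, 1]]

def external_flows(r):
    """Left sides of the five equations for the internal flow vector r."""
    return [sum(a * x for a, x in zip(row, r)) for row in A]

def r0(a, b):
    """The 2-parameter family of solutions found in the example."""
    return [-a - b, a + b, a + b, b, -a, a]

all_zero = all(external_flows(r0(a, b)) == [0] * 5
               for a, b in product(range(-3, 4), repeat=2))
print(all_zero)
```

A check over finitely many values of a and b is not a proof, but because every entry of r_0 is linear in a and b, agreement at a = b = 0 and at the two choices (a, b) = (1, 0) and (0, 1) already forces agreement for all a and b.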

EXAMPLE 11 The derivation of Simpson's rule for approximate integration is based on
the requirement that it should give exact results when applied to quadratic polynomials.
The rule gives an approximation to the integral of a function over an interval
a - h ≤ x ≤ a + h in terms of the values of the function at the points a - h, a, and
a + h. The general form of the approximation is

\[ \int_{a-h}^{a+h} f(x)\,dx \approx A f(a-h) + B f(a) + C f(a+h), \]

where A, B, and C are constants. If the formula is to be correct for all polynomials
of degree less than or equal to 2, it must in particular be correct for the polynomials
f_0(x) = 1, f_1(x) = x, and f_2(x) = x^2. Each of these requirements leads to an
equation for A, B, and C. For instance, with f_0(x) = 1 we have

\[ \int_{a-h}^{a+h} f_0(x)\,dx = 2h \quad\text{and}\quad f_0(a-h) = f_0(a) = f_0(a+h) = 1, \]

so we require A + B + C = 2h. Similarly, we obtain two more equations using
f_1(x) = x and f_2(x) = x^2, and have

A + B + C = 2h
(a - h)A + aB + (a + h)C = 2ah
(a^2 - 2ah + h^2)A + a^2 B + (a^2 + 2ah + h^2)C = 2a^2 h + (2/3)h^3.

It's straightforward to check that A = C = (1/3)h, B = (4/3)h satisfy these three
equations, and Exercise 17 asks you to verify that these are the only solutions.
Thus we have a rule that is correct for the particular polynomials f_0, f_1, and f_2.
Its correctness for an arbitrary quadratic polynomial f(x) = px^2 + qx + r follows
without additional computation from the following general observation. Let E(f)
be the error committed when the rule is applied to a general continuous function
f(x), so

\[ E(f) = \int_{a-h}^{a+h} f(x)\,dx - \tfrac{1}{3}h f(a-h) - \tfrac{4}{3}h f(a) - \tfrac{1}{3}h f(a+h). \]

Note that E(f_0) = E(f_1) = E(f_2) = 0 holds by the way we chose A, B, and C.
But then elementary properties of the integral and the form of the approximation as
a linear combination of values of f(x) show that

E(p f_2 + q f_1 + r f_0) = pE(f_2) + qE(f_1) + rE(f_0) = 0.
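
The exactness claim can be tested on a sample quadratic. The sketch below is an illustration with hypothetical names; it uses the weights (1/3)h, (4/3)h, (1/3)h from the example, computes the exact integral from an antiderivative, and works in exact fractions so that an error of zero really means zero. The particular values of p, q, r, a, and h are arbitrary choices.

```python
from fractions import Fraction as F

def simpson(f, a, h):
    """Simpson's rule on the interval from a - h to a + h."""
    return (F(1, 3) * h * f(a - h) + F(4, 3) * h * f(a)
            + F(1, 3) * h * f(a + h))

def exact_quadratic_integral(p, q, r, a, h):
    """Integral of p x^2 + q x + r over [a - h, a + h]."""
    def antiderivative(x):
        return F(p, 3) * x**3 + F(q, 2) * x**2 + r * x
    return antiderivative(a + h) - antiderivative(a - h)

p, q, r, a, h = 5, -7, 2, F(3, 2), F(1, 4)

def f(x):
    return p * x**2 + q * x + r

error = exact_quadratic_integral(p, q, r, a, h) - simpson(f, a, h)

# For contrast, the rule applied to x^4 on [-1, 1], whose integral is 2/5.
quartic_error = F(2, 5) - simpson(lambda x: x**4, F(0), F(1))
print(error, quartic_error)
```

The nonzero quartic error shows that the rule is not exact for every polynomial, only for those covered by the linearity argument above.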



We can use the same method to derive a variety of formulas in the field of numerical
analysis, as in Exercise 18.

EXERCISES

1. Figure 2.2(b) shows an electrical network with the resistance in ohms of each edge
marked on it. Suppose an external power supply maintains junction A at 10 volts
and junction B at 4 volts. Following the procedure of Example 7 in the text, set up
equations for the voltages at the other junctions and solve them. From the results,
calculate the current flowing into the network at junction A.

The edges and vertices of a 3-dimensional cube form a network with 8 junctions and
12 edges. Suppose that each edge is a wire of resistance 1 ohm and that there are
just two external connections, which maintain a voltage of 1 at one of the vertices
and 0 at another. In Exercises 2 to 4, find the values of the voltages at the other
vertices and the current flowing in the external connections under the stated
conditions.

2. The two vertices with external connections are at opposite corners of the cube.
3. The vertices with external connections are at the two ends of an edge of the cube.
4. The vertices with external connections are at opposite corners of a face of the cube.
5. If forces in R^2 act at the origin parallel to (2, 1), (2, 2), and (-3, -1), find
magnitudes we can assign to the forces so their sum will be zero.

In Exercises 6 and 7, suppose that three forces acting at the origin in R^3 have
directions parallel to (1, 0, 0), (1, 1, 0), and (1, 1, 1).

6. Find examples of magnitudes for forces acting parallel to these directions so the
resultant force vector will be (-1, 2, 4).
7. Can an arbitrary force vector F = (a, b, c) be the resultant of forces acting in the
actual directions specified in the preamble? Explain.

In Exercises 8 to 10, suppose that a random walk traverses the paths shown in
Figure 2.4(a).

8. What is the probability p_1 that a walk starting at a_1 goes to a_4 without passing
through a_5?
9. What is the probability p_2 that a walk starting at a_2 goes to a_4 without passing
through a_5?
10. What is the probability p_3 that a walk starting at a_3 goes to a_4 without passing
through a_5?

In Exercises 11 to 13, assume a rat traces a random walk on the paths shown in
Figure 2.4(b). Let p_k be the probability of going from b_k to b_6 without going
through b_5.

11. Find p_k for k = 1, 2, 3, 4.
12. Modify Figure 2.4(b) in the text so b_4 and the path from it to b_3 are eliminated.
Then compute the resulting new values for p_k for k = 1, 2, 3.
13. Modify Figure 2.4(b) in the text by introducing a new path from b_4 to b_6. Then
compute the resulting new values for p_k for k = 1, 2, 3, 4.
14. (a) Suppose the vector t = (t_1, t_2, t_3, t_4, t_5) in Equations 4 of the text is
specified to be t = (-1, 0, 1, 2, 1). Find a vector r that determines consistent internal
flow rates.
(b) Solve Equations 5 to verify that the vector r = (-a - b, a + b, a + b, b, -a, a),
with arbitrary a and b, describes all solutions that are consistent with external
flow t = 0 in Figure 2.5(a).
15. Let the external flow vector in Figure 2.5(b) be s = (1, 1, 2, 4). Show that there
is more than one consistent internal flow vector r, and find all of them in terms of an
arbitrarily assigned value.
16. If the external flow vector in Figure 2.5(b) is s = (1, 0, 1, 1), show that there is
no consistent internal flow vector.
17. Carry out the solution of the equations for A, B, C given in Example 11 of the
text. [Suggestion: Start by subtracting a times the first equation from the second and
a^2 times the first from the third.]
18. Use the method of Example 11 to find constants A, B, C, D such that

\[ \int_{a}^{a+3h} f(x)\,dx = A f(a) + B f(a+h) + C f(a+2h) + D f(a+3h) \]

is exact whenever f(x) is 1, x, x^2, and x^3, and so as in Example 11 is also exact for
a polynomial of degree at most 3.
SECTION 2 MATRIX METHODS
In Section 1 we used elementary operations to solve systems of linear equations in
an ad hoc way that's hard to adapt to large systems. In Sections 2A and 2B we
introduce matrix equations and an effective solution routine. In Sections 2C and 2D
the emphasis is on geometric ideas. The matrix operations appear again in Sections
4 and 5 for computing inverse matrices and determinants, as well as later on in
Chapters 6 and 13.
2A Matrix Equations and Elementary Operations
A matrix is simply a rectangular array of numbers. Here are some examples:

[Five example matrices appear here in the original; their entries are not recoverable
from the source.]
The horizontal lines of numbers in a matrix are called its rows, and the vertical lines
are called its columns.
The numbers of rows and columns in a matrix determine its dimensions, and for
consistency the number of rows is always designated before the number of columns.
The five examples just given have dimensions 3-by-2, 2-by-3, 2-by-2, l-by-3, and
4-by- l. A matrix is square if it has the same number of rows as columns, so it has
dimensions n-by-n for some n. The 1-by-n matrices are called n-dimensional row
vectors, and n-by-1 matrices are called n-dimensional column vectors, so we may
regard the rows or columns of an m-by-n matrix as vectors in Rn or Rm, respectively,
as in Definition 2.1.

EXAMPLE 1 Matrices occur naturally for representing systems of linear equations. In
the system

2x + 3y - 4z = 1
x - y + 2z = -1,

we temporarily regard the letters x, y, and z just as placemarkers, so we can describe
the two sides of the equations using the two matrices

\[ \begin{pmatrix} 2 & 3 & -4 \\ 1 & -1 & 2 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 1 \\ -1 \end{pmatrix}. \]

The 2-by-3 matrix is the coefficient matrix of the system, and the 2-by-1 matrix is
the right side.
When writing the coefficient matrix of a system it's important to have the variables
lined up in the same order in all the equations and to use the coefficient 0 in the
matrix to indicate the absence of a variable. For example, to put the system

2x + y = 4
z - y = 2

in matrix form, it is a good idea to make clear the place that each coefficient has in
the system by first rewriting it as

2x + y = 4
-y + z = 2.

The coefficient matrix and right side are then

\[ \begin{pmatrix} 2 & 1 & 0 \\ 0 & -1 & 1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 4 \\ 2 \end{pmatrix}. \]

We can use dot products to relate a system's matrix and variables algebraically.

2.1 Definition The product Ax of an m-by-n matrix A and an n-dimensional
column vector x is the m-dimensional column vector whose entries are the successive
dot products of the rows of A with the vector x. The product Ax is not defined
if the number of columns in A doesn't equal the number of entries in x.
EXAMPLE 2 [Three worked products Ax, By, and Cz of a matrix with a column vector
appear here in the original; the matrix and vector entries are not recoverable from
the source.]
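
The product of Definition 2.1 translates directly into a few lines of code. The sketch below is illustrative: the function name `matvec` is hypothetical, matrices are assumed to be stored as lists of rows, and the matrix used is the coefficient matrix of the system 2x + 3y - 4z = 1, x - y + 2z = -1 from Example 1 of this section.

```python
def matvec(A, x):
    """Product Ax: the dot product of each row of A with the vector x."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("columns of A must match the number of entries of x")
    return [sum(a * b for a, b in zip(row, x)) for row in A]

A = [[2, 3, -4],
     [1, -1, 2]]

product_1 = matvec(A, [1, 1, 1])   # entries 2 + 3 - 4 and 1 - 1 + 2
product_2 = matvec(A, [1, 2, 3])   # entries 2 + 6 - 12 and 1 - 2 + 6
print(product_1, product_2)
```

Calling `matvec(A, [1, 2])` raises an error, which mirrors the statement that the product is not defined when the number of columns of A differs from the number of entries of x.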
Using products of matrices and column vectors, we write systems of linear
equations in the vector form Ax = b, where A is an m-by-n matrix, x is an
n-dimensional column vector with unknown entries, and b is an m-dimensional column
vector.

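EXAMPLE 3 Each system on the left is equivalent to the matrix equation on the right.
Each variable corresponds to a column of the matrix.

4x + 3y = 1
-x + 2y = 2
\[ \begin{pmatrix} 4 & 3 \\ -1 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \end{pmatrix} \]

2x + y + 2z = -1
x + 2y + z = 0
\[ \begin{pmatrix} 2 & 1 & 2 \\ 1 & 2 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} -1 \\ 0 \end{pmatrix} \]

[A third system, three equations in the two unknowns x_1 and x_2 with its 3-by-2
matrix equation, appears here in the original; its entries are not recoverable from
the source.]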
The operations we used in Section 1 to solve systems of linear equations were
elementary multiplication of an equation by a nonzero scalar and elementary mod-
ification that adds a scalar multiple of another equation. The resulting system was
equivalent to the system we started with in that both systems had precisely the same
solutions. Here we apply these same operations to systems Ax = b described in
terms of matrices A and vectors b, noting their effect on the corresponding scalar
equations in a system. In particular, the operations have no effect on the vector x,
consistent with the invariance of the solution set of the system. Thus we can if we
like omit x and just operate simultaneously on A and b.

To illustrate how operations on matrices correspond to operations on equations,


consider the first system in Example 3. In either form the operations must be applied
to both sides of the equations. We start with

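\[
\begin{aligned} 4x + 3y &= 1 \\ -x + 2y &= 2 \end{aligned}
\qquad\text{with matrix form}\qquad
\begin{pmatrix} 4 & 3 \\ -1 & 2 \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \end{pmatrix}.
\]

Multiplying the first rows of the matrices by 1/4 has the same effect as multiplying
the first equation by 1/4:

\[
\begin{aligned} x + \tfrac34 y &= \tfrac14 \\ -x + 2y &= 2 \end{aligned}
\qquad\text{with matrix form}\qquad
\begin{pmatrix} 1 & \tfrac34 \\ -1 & 2 \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \tfrac14 \\ 2 \end{pmatrix}.
\]

Adding the first rows of these matrices to their second rows has the same effect as
adding the first equation to the second:

\[
\begin{aligned} x + \tfrac34 y &= \tfrac14 \\ \tfrac{11}{4} y &= \tfrac94 \end{aligned}
\qquad\text{with matrix form}\qquad
\begin{pmatrix} 1 & \tfrac34 \\ 0 & \tfrac{11}{4} \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \tfrac14 \\ \tfrac94 \end{pmatrix}.
\]

Multiplying the second rows of the matrices by 4/11 has the effect of
multiplying the second equation by 4/11:

\[
\begin{aligned} x + \tfrac34 y &= \tfrac14 \\ y &= \tfrac{9}{11} \end{aligned}
\qquad\text{with matrix form}\qquad
\begin{pmatrix} 1 & \tfrac34 \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \tfrac14 \\ \tfrac{9}{11} \end{pmatrix}.
\]

Adding -3/4 times the second rows of the matrices to the first has the effect
of replacing the first equation by x = -4/11:

\[
\begin{aligned} x &= -\tfrac{4}{11} \\ y &= \tfrac{9}{11} \end{aligned}
\qquad\text{with matrix form}\qquad
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} -\tfrac{4}{11} \\ \tfrac{9}{11} \end{pmatrix}.
\]

The unique solution is evident in either scalar equation or matrix form.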
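
The four row operations just described can be replayed mechanically. The following is a minimal sketch, assuming the system 4x + 3y = 1, -x + 2y = 2 is stored as rows of an augmented matrix (coefficients followed by the right side); the helper names `scale_row` and `add_multiple` are illustrative choices, not notation from the text.

```python
from fractions import Fraction

def scale_row(M, i, r):
    """Elementary multiplication: multiply row i by the nonzero scalar r."""
    M[i] = [r * entry for entry in M[i]]

def add_multiple(M, src, dst, c):
    """Elementary modification: add c times row src to row dst."""
    M[dst] = [d + c * s for s, d in zip(M[src], M[dst])]

# Each row holds the coefficients of x and y followed by the right side.
M = [[Fraction(4), Fraction(3), Fraction(1)],
     [Fraction(-1), Fraction(2), Fraction(2)]]

scale_row(M, 0, Fraction(1, 4))         # first row times 1/4
add_multiple(M, 0, 1, Fraction(1))      # add first row to second
scale_row(M, 1, Fraction(4, 11))        # second row times 4/11
add_multiple(M, 1, 0, Fraction(-3, 4))  # add -3/4 times second row to first

x, y = M[0][2], M[1][2]
print(x, y)
```

Exact fractions are used so that the entries stay in the same form as the hand computation rather than turning into rounded decimals.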

You can imagine trying to solve a large system this way, faced with a large number
of possible steps to take. We'll describe a fail-safe routine, one that Theorem 2.2 in
Section 2B proves will always work. It's helpful to refer to the first nonzero entry
in a row of a matrix as the leading entry in that row. Here's the routine:

Step 1. Pick a column of A that has leading entry r, by definition nonzero, in
        some chosen row r. Multiply that row by 1/r to make the leading entry
        in r equal 1.
Step 2. If a row other than r in Step 1 has an entry c ≠ 0 in the same column as
        the leading entry 1 in r, subtract c times r from the other row. Continue
        this process until all other entries in the column of the leading entry are
        0. This column is then said to be reduced.
Step 3. Repeat Steps 1 and 2 until every leading entry is 1 and every column
        that has a leading entry is reduced. The entire matrix is then said to be
        reduced. Note that a row without a leading entry has 0 for every entry.
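
Steps 1 to 3 can be sketched in code as follows. This is one possible way to organize the routine, not the only one: it visits the rows from top to bottom, and the function name `reduce_matrix` and the sample augmented matrix are assumptions made for illustration.

```python
from fractions import Fraction

def reduce_matrix(rows):
    """Return a reduced form of a matrix given as a list of rows."""
    M = [[Fraction(e) for e in row] for row in rows]
    for i in range(len(M)):
        # Step 1: find the leading (first nonzero) entry of row i, if any.
        lead = next((j for j, e in enumerate(M[i]) if e != 0), None)
        if lead is None:
            continue  # a zero row has no leading entry
        r = M[i][lead]
        M[i] = [e / r for e in M[i]]  # make the leading entry equal 1
        # Step 2: clear every other entry in the leading entry's column.
        for k in range(len(M)):
            c = M[k][lead]
            if k != i and c != 0:
                M[k] = [a - c * b for a, b in zip(M[k], M[i])]
    # Step 3 is the loop itself: each row is handled once.
    return M

# Sample system written as coefficients followed by the right side:
# 3x + 12y + 9z = 3, 2x + 5y + 4z = 4, -x + 2y + z = -5.
R = reduce_matrix([[3, 12, 9, 3],
                   [2, 5, 4, 4],
                   [-1, 2, 1, -5]])
for row in R:
    print([str(e) for e in row])
```

A single pass over the rows is enough here because clearing a column never disturbs a leading 1 produced earlier: the row being used has a zero in every previously reduced column.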

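EXAMPLE 5 We repeat the calculations of Example 2 of Section 1 to see the solution process in
matrix form for a system with infinitely many solutions.

\[
\begin{aligned} 3x + 12y + 9z &= 3 \\ 2x + 5y + 4z &= 4 \\ -x + 2y + z &= -5 \end{aligned}
\qquad
\begin{pmatrix} 3 & 12 & 9 \\ 2 & 5 & 4 \\ -1 & 2 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 3 \\ 4 \\ -5 \end{pmatrix}
\]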

To change the system so that x appears only in the first equation, we multiply the
first equation by 1/3, then add (-2) times the new first equation to the second, and
also add the new first equation to the third. In matrix terms, we apply elementary
multiplication by 1/3 to the first rows of the coefficient matrix and the right side,
then apply elementary modifications adding multiples of the first row to the second
and third:

\[
\begin{aligned} x + 4y + 3z &= 1 \\ -3y - 2z &= 2 \\ 6y + 4z &= -4 \end{aligned}
\qquad
\begin{pmatrix} 1 & 4 & 3 \\ 0 & -3 & -2 \\ 0 & 6 & 4 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ -4 \end{pmatrix}
\]

The first variable x appears only in the first equation with coefficient 1; correspondingly
the first column of the matrix has 1 in the first row and 0 elsewhere.
Similarly, to isolate y in the second equation, multiply the second row of the
matrix and of the right side by -1/3, and then perform elementary modifications,
adding (-4) times the new second row to the first row and adding (-6) times the
new second row to the third row, treating the right-side entries similarly:

\[
\begin{aligned} x + \tfrac13 z &= \tfrac{11}{3} \\ y + \tfrac23 z &= -\tfrac23 \\ 0 &= 0 \end{aligned}
\qquad
\begin{pmatrix} 1 & 0 & \tfrac13 \\ 0 & 1 & \tfrac23 \\ 0 & 0 & 0 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} \tfrac{11}{3} \\ -\tfrac23 \\ 0 \end{pmatrix}
\]

The second column of the coefficient matrix now has 1 in the second row and 0
in the other rows, corresponding to y appearing only in the second equation.
All possible values of the variables satisfy the third equation, so we can ignore it.
Because x appears only in the first equation and y appears only in the second, we get
a solution in which z = t has an arbitrary value and x = -(1/3)t + 11/3
and y = -(2/3)t - 2/3.
As in Example 2, the set of solutions has the parametric representation of a line in
R^3: (x, y, z) = t(-1/3, -2/3, 1) + (11/3, -2/3, 0).
In a reduced matrix R a leading entry is the only nonzero entry in its column, and
the variable associated with that column in the system Rx = c is called a leading
variable; all other variables in the system are called nonleading.

In the matrix equations with x = (x, y, z) all the real information is in the 3-by-3
matrices and the constant column vectors; the vector x and the equal sign just remind
us of the context, so in principle could be dropped.

We reduce the first column by adding (-½) times the first row to the second row,
and 3 times the first row to the third row to get

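To reduce the second column we multiply the second row by (-1) to get

\[
\begin{pmatrix} 1 & -2 & -3 \\ 0 & 1 & 5 \\ 0 & -1 & -5 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 2 \\ -6 \\ 6 \end{pmatrix},
\]

and then add 2 times the second row to the first and 1 times the second row to the
third to get

\[
\begin{pmatrix} 1 & 0 & 7 \\ 0 & 1 & 5 \\ 0 & 0 & 0 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} -10 \\ -6 \\ 0 \end{pmatrix}.
\]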

The leading variables are x and y, while the nonleading variable is z, and we have
the two nonzero equations

x + 7z = -10
y + 5z = -6.

Giving z an arbitrary value, we find the unique solution with z = t is x = -10 - 7t,
y = -6 - 5t, z = t. The solutions in vector form are the points

(x, y, z) = (-10, -6, 0) + t(-7, -5, 1)

on a line in R^3 containing (-10, -6, 0) and parallel to (-7, -5, 1). If the 0 on the
right side of the last matrix equation had turned out to be 2, the third equation would
have been 0 = 2, so the system would have been inconsistent, with no solutions.

2B Reduced Matrices
The examples suggest that row operations on the equations in a linear system produce
solutions or else tell us that there are none. The choices we made may have looked
a bit ad hoc, so it may not be clear that the process outlined in Steps 1, 2, and 3 in
the previous subsection always works for systems of arbitrary size. We'll now show
that they provide a guaranteed routine for displaying a system in a form that makes
it easy to read off the solutions. We repeat the definition of the key terms we used
to describe the process, that a leading entry in a matrix is the first nonzero entry in
a row and that a matrix is reduced if the following two conditions hold:

(i) Every column containing a leading entry is zero except for the leading entry.
(ii) Every leading entry is 1.

EXAMPLE 7 If

\[
A = \begin{pmatrix} 1 & 2 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}
\quad\text{and}\quad
B = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 1 & 1 \\ 0 & 2 & 0 \end{pmatrix},
\]

the matrix A is reduced because the top two rows have leading entry 1 with only
zeros elsewhere in the columns containing the leading entries. Note that the zero row
has no leading entry. The matrix B is not in reduced form because the conditions
(i) and (ii) are both violated; a reduced form for B would have the 1 and 2 in the
middle column replaced by 0 and 1, respectively. The reduced form of A gives us
the solutions to a matrix equation Ax = b such as

\[
\begin{pmatrix} 1 & 2 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 0 \end{pmatrix}
\qquad\text{or in scalar form}\qquad
\begin{aligned} x + 2y &= 1 \\ z &= 2 \\ 0 &= 0. \end{aligned}
\]

Letting y = t in the top scalar equation we find x = 1 - 2t, y = t, z = 2, representing
the line (x, y, z) = t(-2, 1, 0) + (1, 0, 2) in R^3. On the other hand, if we replace
the 0-entry on the right side of the last scalar equation by 1, the last row is 0 = 1
so the system becomes inconsistent, with no solutions.

Steps 1, 2, and 3 listed earlier for applying elementary row operations are the
main ideas we need to prove the following theorem.

2.2 Theorem. Given a matrix A, there is a sequence of elementary operations
that converts A to a reduced matrix R, namely a matrix that satisfies conditions (i)
and (ii).

Proof. Suppose the matrix A is not yet reduced. Then there must be some column
containing a leading entry such that either (i) or (ii) or both fail to hold. If that column
contains the leading entry r for the ith row r_i, multiplying r_i by r^{-1} will make the
leading entry 1. (Since r was a leading entry, it couldn't be zero, though it might be
1 to begin with.) If other entries in the column are nonzero, we can replace them by
zero by adding suitable multiples of the ith row to the other rows. Another column that
already satisfied (i) and (ii) before these operations must have a zero for its ith entry
and therefore is unaltered by the operations. Applying this process to an unreduced
matrix A increases the number of columns that satisfy the conditions (i) and (ii). If the
resulting matrix is still not reduced, we repeat the process, and we obtain a reduced
matrix after at most n steps, where n is the number of columns in A. •
Theorem 2.2 shows that we can always use row reduction to convert a system
of linear equations to a system with a reduced coefficient matrix that has the same
solutions. If the reduced system has no zero rows, or if any zero rows correspond to
zero entries on the right side, then the system is consistent and the solutions are all
given by assigning arbitrary values to the nonleading variables.

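EXAMPLE 8 The system

\[
\begin{pmatrix} 1 & -2 & 1 & 2 \\ 0 & 0 & 1 & -1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \\ w \end{pmatrix} = \begin{pmatrix} -1 \\ 1 \end{pmatrix};
\qquad
\begin{aligned} x - 2y + z + 2w &= -1 \\ z - w &= 1 \end{aligned}
\]

is not reduced, but we can reduce it by subtracting the second row from the first:

\[
\begin{pmatrix} 1 & -2 & 0 & 3 \\ 0 & 0 & 1 & -1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \\ w \end{pmatrix} = \begin{pmatrix} -2 \\ 1 \end{pmatrix};
\qquad
\begin{aligned} x - 2y + 3w &= -2 \\ z - w &= 1. \end{aligned}
\]

We can assign arbitrary values to the variables y and w, so for each of the values
s and t there is just one solution with y = s and w = t, obtained by then putting
x = 2s - 3t - 2 and z = t + 1. The solutions are given as vectors by

\[
\begin{pmatrix} x \\ y \\ z \\ w \end{pmatrix}
= \begin{pmatrix} 2s - 3t - 2 \\ s \\ t + 1 \\ t \end{pmatrix}
= s \begin{pmatrix} 2 \\ 1 \\ 0 \\ 0 \end{pmatrix}
+ t \begin{pmatrix} -3 \\ 0 \\ 1 \\ 1 \end{pmatrix}
+ \begin{pmatrix} -2 \\ 0 \\ 1 \\ 0 \end{pmatrix}.
\]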
The general form x = su_1 + tu_2 + v for solutions of our original system Ax = b is
significant in a fundamental way discussed in Section 2C. Note that if the constant
vectors in this linear combination were in R^3 instead of R^4, we could assert that
the solutions form a plane containing the point v in R^3; Section 2D shows how to
extend the possibility of this geometric interpretation to solutions of all systems.
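
A quick numerical check of the parametric solution in Example 8 can be sketched as follows, assuming the unreduced system x - 2y + z + 2w = -1, z - w = 1 and the formulas x = 2s - 3t - 2, y = s, z = t + 1, w = t. The helper names `matvec` and `solution` are illustrative only.

```python
import random

# Coefficient matrix and right side of the unreduced system in Example 8.
A = [[1, -2, 1, 2],
     [0, 0, 1, -1]]
b = [-1, 1]

def matvec(A, x):
    """Product Ax: the dot product of each row of A with x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def solution(s, t):
    """The solution (x, y, z, w) with y = s and w = t."""
    return [2 * s - 3 * t - 2, s, t + 1, t]

random.seed(0)
checks = []
for _ in range(5):
    s, t = random.randint(-9, 9), random.randint(-9, 9)
    checks.append(matvec(A, solution(s, t)) == b)
print(all(checks))
```

Substituting a few sample values of s and t does not prove the formula, but it catches sign slips in the reduction quickly.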

2C Homogeneous Systems
The planar solutions x = su_1 + tu_2 + v to the system Ax = b of Example 8 illustrate
an important decomposition for solutions of linear systems. Setting s = t = 0 shows
that x = v is a solution of the original system. But if we let x = u_1 or x = u_2 for
that example, we find that instead of Au_1 = b or Au_2 = b we get Au_1 = Au_2 = 0.
Thus x = u_1 and x = u_2 are solutions of the homogeneous equation Ax = 0 that
we get when we set b = 0 in Ax = b. To explain what's going on here we start
with the following property of matrix-vector products.

2.3 Linearity of Matrix Multiplication. If the products are defined, then

A(su + tv) = sAu + tAv,

which by repeated application implies

A(t_1 u_1 + ··· + t_k u_k) = t_1 Au_1 + ··· + t_k Au_k.

Proof. Let r_i be the ith row of A, so the ith entries in A(su + tv), Au, and Av are
r_i · (su + tv), r_i · u, and r_i · v. By additivity and homogeneity of the dot product,

r_i · (su + tv) = r_i · (su) + r_i · (tv) = s(r_i · u) + t(r_i · v).

The expression on the right is the ith entry in sAu + tAv. To get the more general
equation, apply the two-term version to t_1 u_1 + (t_2 u_2 + ··· + t_k u_k), and then successively
split off one more term at a time. •
Remark. The term linearity applied to the property of matrix-vector multiplication
in Theorem 2.3 stems from the observation that multiplication of the points on a line
tu + v in R^2 or R^3 by A carries the line into another line, or possibly just a point.
The reason is that, by Theorem 2.3, A(tu + v) = tAu + Av. Thus if Au ≠ 0 the
result of applying A is a line through the point Av and parallel to the vector Au. If
Au = 0 we get only the point Av.
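
The linearity property 2.3 is easy to test numerically. The sketch below uses an arbitrary 2-by-4 matrix and arbitrary vectors and scalars chosen only for illustration; `matvec` and `combo` are hypothetical helper names.

```python
def matvec(A, x):
    """Product Ax: dot each row of A with the column vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def combo(s, u, t, v):
    """Linear combination su + tv of two vectors of equal length."""
    return [s * ui + t * vi for ui, vi in zip(u, v)]

A = [[1, -2, 1, 2],
     [0, 0, 1, -1]]
u = [3, 1, 0, 2]    # arbitrary sample vectors
v = [-1, 4, 2, 5]
s, t = 2, -3        # arbitrary sample scalars

left = matvec(A, combo(s, u, t, v))               # A(su + tv)
right = combo(s, matvec(A, u), t, matvec(A, v))   # sAu + tAv
print(left, right)
```

Both sides are computed independently, so agreement reflects the property itself rather than a shared intermediate result.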
Here is the basic theorem about the structure of solutions of Ax = b.
2.4 Theorem. Every solution of the matrix equation Ax = b has the form x_h + x_p,
where x_p is some particular solution and x_h is a solution of the homogeneous equation
Ax = 0.

Proof. Let x_a be an arbitrary solution of Ax = b. Then by Theorem 2.3 applied to
just two terms we have

A(x_a - x_p) = Ax_a - Ax_p = b - b = 0.

Thus x_a - x_p = x_h for some solution x_h of Ax = 0, so x_a = x_h + x_p. •
EXAMPLE 9 In Example 8 we exhibited the solutions of a system Ax = b in the form
x = su_1 + tu_2 + v. This illustrates Theorem 2.4 because we get a particular solution
x_p = v by setting s = t = 0, and the other part of the solution, su_1 + tu_2, consists
of solutions of the homogeneous system. A vector w such that x = w + v is also a
solution must necessarily be a solution of Ax = 0 because, by Theorem 2.3,

A(w + v) = Aw + Av = Aw + b = b.

Subtracting b from both sides of the last equality shows that Aw = 0.


The decomposition x = x_h + x_p of solutions in Theorem 2.4 reduces to just x = x_p
precisely when x = 0 is the only solution of Ax = 0. We now examine in general
how to tell whether Ax = 0 has multiple solutions or only the zero solution.
A zero row in a matrix R represents the scalar equation 0 = 0 in the system
Rx = 0, and has no effect on the solutions of the system. We call such equations
trivial and don't count them as belonging to the system of scalar equations.
2.5 Theorem. A homogeneous system Ax = 0 has infinitely many nonzero solu-
tions if it has more variables than equations. It also has infinitely many nonzero
solutions if an equivalent reduced system Rx = 0 has more variables than it has
nontrivial equations. It has only the zero solution if Rx = 0 has at least as many
nontrivial equations as variables.

Proof. Let A have n columns and m rows, with m < n. When we convert A
to a reduced form R, the zero rows in R will be consistent because the right side
is zero also. There can be at most m leading entries, so at most m columns with
leading entries. We may specify arbitrary values for each variable that corresponds
to a column having no leading entry, then solve in terms of these for the variables
that correspond to leading entries. Given the infinitely many arbitrary values for at
least one variable, we get infinitely many solutions. The same argument applies to
Rx = 0 if we don't count trivial equations 0 = 0. Finally, a reduced system with at
least as many nontrivial equations as variables has a leading entry in every column,
so has only the zero solution. •

EXAMPLE 10 The system

\[
\begin{pmatrix} 1 & -2 & 1 \\ 2 & 1 & -3 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\qquad\text{or}\qquad
\begin{aligned} x - 2y + z &= 0 \\ 2x + y - 3z &= 0 \end{aligned}
\]

is an example of Theorem 2.5, because it is homogeneous and has more unknowns
than equations. We can regard the solutions as the intersection of two planes, and
they include the trivial zero solution (x, y, z) = (0, 0, 0). Hence the planes intersect
in at least an entire line through the origin. It's straightforward to check that the
solutions are of the form (x, y, z) = t(1, 1, 1), where t ranges over all real numbers;
to see that these represent all solutions just set z = t in the reduced form

\[
\begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\qquad\text{or}\qquad
\begin{aligned} x - z &= 0 \\ y - z &= 0. \end{aligned}
\]

Thus with z = t the line of solutions is (x, y, z) = t(1, 1, 1).
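
The recipe in the proof of Theorem 2.5 (give each nonleading variable an arbitrary value, then solve for the leading variables) can be sketched in code. This is an illustrative implementation under the assumption that the matrix is first reduced by the three-step routine of Section 2A; `reduce_matrix` and `homogeneous_solutions` are hypothetical names.

```python
from fractions import Fraction

def leading_column(row):
    """Index of the first nonzero entry of a row, or None for a zero row."""
    return next((j for j, e in enumerate(row) if e != 0), None)

def reduce_matrix(rows):
    """Reduced form: each leading entry is 1 and alone in its column."""
    M = [[Fraction(e) for e in row] for row in rows]
    for i in range(len(M)):
        lead = leading_column(M[i])
        if lead is None:
            continue
        r = M[i][lead]
        M[i] = [e / r for e in M[i]]
        for k in range(len(M)):
            c = M[k][lead]
            if k != i and c != 0:
                M[k] = [a - c * b for a, b in zip(M[k], M[i])]
    return M

def homogeneous_solutions(rows):
    """One solution of Ax = 0 per nonleading variable, that variable set to 1."""
    R = reduce_matrix(rows)
    n = len(R[0])
    leads = {leading_column(row): row for row in R if leading_column(row) is not None}
    vectors = []
    for free in range(n):
        if free in leads:
            continue
        x = [Fraction(0)] * n
        x[free] = Fraction(1)
        for col, row in leads.items():
            x[col] = -row[free]  # solve the reduced equation for its leading variable
        vectors.append(x)
    return vectors

# Coefficient matrix of Example 10.
basis = homogeneous_solutions([[1, -2, 1],
                               [2, 1, -3]])
print([[str(e) for e in v] for v in basis])
```

An empty list returned by `homogeneous_solutions` corresponds to the case in which every column has a leading entry, so that x = 0 is the only solution.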


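In an example like this one

\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\qquad\text{or}\qquad
\begin{aligned} x + 0z &= 0 \\ y + 0z &= 0, \end{aligned}
\]

in which z appears only with zero coefficients, we still have to remember it's there
and set z = t to get all solutions (x, y, z) = (0, 0, t).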

Theorem 2.4 links multiple solutions of a system Ax = b with multiple solutions
of the associated homogeneous system Ax = 0. Thus Ax = b has a unique solution
precisely when x = 0 is the only solution to Ax = 0, so it's important to be able to
tell when the latter happens. Theorem 2.5 gives one way of answering the question,
and Sections 4 and 5 present special criteria that apply when A is a square matrix.
In the following discussion we assume nothing about the dimensions of A.
If A has columns u_j with entries u_ij, then

\[
Ax = \begin{pmatrix} u_{11}x_1 + \cdots + u_{1n}x_n \\ \vdots \\ u_{m1}x_1 + \cdots + u_{mn}x_n \end{pmatrix}
= x_1 u_1 + \cdots + x_n u_n. \tag{2.6}
\]

From this equation we see that Ax = 0 is equivalent to x_1 u_1 + ··· + x_n u_n = 0. If the
scalars x_k aren't all zero, this relation among the columns of A gives rise to multiple
solutions to Ax = 0, while the lack of such a relation gives only the solution x = 0.
Here is the standard generalization of our earlier definition of linear independence,
stated in Chapter l for just two vectors u1 and Uz.

2.7 Definition. Vectors u_1, ..., u_n are linearly independent if the equation
x_1 u_1 + ··· + x_n u_n = 0 is satisfied only by choosing all x_k = 0, or equivalently
by Equation 2.6, if the only solution to Ax = 0 is x = 0, where A is the
matrix with the u_k for columns. Otherwise the vectors are linearly dependent.

Definition 2.7 becomes more intuitive and is often easier to apply, in a form that
explicitly contains our original definition for two vectors:

2.7' Definition. Vectors u_1, ..., u_n are linearly independent if no one of them
is a linear combination of the other n - 1 vectors. Otherwise the vectors are linearly
dependent.

The following theorem allows us to use whichever of the two definitions is more
convenient at a given point.

2.8 Theorem. Definitions 2.7 and 2.7' of linear independence are equivalent.

Proof. The equation x_1 u_1 + ··· + x_n u_n = 0 is equivalent to

x_k u_k = -x_1 u_1 - ··· - x_{k-1} u_{k-1} - x_{k+1} u_{k+1} - ··· - x_n u_n.

If u_k is a linear combination of the other u's, then both equations hold with x_k =
1 ≠ 0. But if the equations hold with some x_k ≠ 0, then dividing the second equation
by x_k shows that u_k is a linear combination of the other u's. •

EXAMPLE 12 Let

\[
A = \begin{pmatrix} 1 & 3 \\ 2 & 2 \\ 3 & 1 \end{pmatrix}
\quad\text{and}\quad
B = \begin{pmatrix} 1 & 3 & 4 \\ 2 & 2 & 4 \\ 3 & 1 & 4 \end{pmatrix}.
\]

The columns of A are linearly independent, because neither is a scalar multiple of the
other, so the equation Ax = 0 has only the solution x = 0. The columns of B are
linearly dependent, because the
third is the sum of the first two, so the equation Bx = 0 has multiple solutions.
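
The two claims in Example 12 can be checked directly. This sketch assumes the entries of A and B as read from the example; `matvec` and `is_scalar_multiple` are illustrative helper names, and the test for scalar multiples uses the fact that two vectors are proportional exactly when all their 2-by-2 cross products vanish.

```python
def matvec(M, x):
    """Product Mx: dot each row of M with x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def is_scalar_multiple(u, v):
    """True if one of u, v is a scalar multiple of the other."""
    n = len(u)
    return all(u[i] * v[j] - u[j] * v[i] == 0
               for i in range(n) for j in range(i + 1, n))

B = [[1, 3, 4],
     [2, 2, 4],
     [3, 1, 4]]

# Third column = first + second, so x = (1, 1, -1) is a nonzero solution of Bx = 0.
x = [1, 1, -1]
print(matvec(B, x))

# The two columns of A are the first two columns of B.
col1 = [row[0] for row in B]
col2 = [row[1] for row in B]
print(is_scalar_multiple(col1, col2))
```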

EXERCISES

Note. A linear system may have just one solution, or infinitely many solutions, or
else no solutions if it's inconsistent. Solving a system requires finding all solutions
or showing that there are none if that's the case.

In Exercises 1 to 4, (a) write the system of equations in matrix form; that is, find a
matrix A and a vector b such that the system is equivalent to the equation Ax = b,
and (b) solve the system.

1. 3x - 2y = 1
   x - 3y = 2

2. 3x + y + z = 1
   x - y - z = 0

15. Express the vectors i, j in R^2 as linear combinations of (1, 2) and (2, 3) by
solving an appropriate system of equations for the coefficients of combination.

16. Express the vectors i, j, k in R^3 as linear combinations of (1, 1, 1), (1, 1, 0),
and (1, 0, 0).

17. Express the vector (5, 0, 1, 2) as a linear combination of (1, 2, 1, 0) and
(2, -1, 0, 1).

18. Can an arbitrary vector in R^4 be expressed as a linear combination of the last
two vectors in Exercise 17? Explain.
+y =1 + 2y = 0

!D·~(D
3. X 4. X I 2 3
=1
y - z X - y = 0 0 1 2
X +Z =0 -x + y = 1 19. Solve the system 0 0 1
( 0 0 0
In Exercises 5 to 8, (a) write a system of equations 0 0 0
equivalent to the given matrix equation and (b) solve
the system. 20. Solve the system

5• ( ! i )( ; ) = ( b)
(i l j
0 0 0
_i
1 -D·~( J)
6. ( -b i ) = ( ~ )
X
2 3 4 6

0: O·~ (n
In Exercises 21 to 24, determine whether or not the vec-
tor vis a linear combination of the other vectors given.
1
• 21. V = 2i + 3j; 3 = 2i-j, b = 2i + j

s. 0D·~U) 22.

23. v
V = 2i + 3j + 4k; a = 2i - j, b = i + j + k,
C =j-2k
=
(0,1,2)
(-1,0,-l);a = (2,-1,2),b = (1,1,-3),c =
In Exercises 9 and 10, solve the given system. =
24. v (3, -1, 0, -l);a = (2, -1, 3, 2), b = (-1, 1, L-3) ,

•. (-!H)(O~O) C=(},J,9,-5)
In Exercises 25 and 26, parametric representations arc
given for two lines, L1 and L2. Show that there is just
one line L3 that intersects both LI and L2 at right angles,
10. ( _; !~) (' ~) = (
3 -1 2 w
~)
-1
and find a parametric representation for it. [Hint: If the
parameters s and t have values corresponding to the
' points where L3 intersects L1 and L2, then the vector
In Exercises 11 to 14, (a) find a reduced matrix equiv- from one point to the other must be perpendicular to hoth
alent to A, (b) solve the system Ax = 0, (c) solve the L 1 and L2. Show that th.is leads to two linear equations
system Ax = e1 + ei. that s and t must satisfy.]

11. A= ( ; -i -! ) 12. A= ( ~ ~
25. L1: s(3, 2, 1) + (-1, 0, 1), L2: t(O, 2, 1) + (-4, 2, 0)
26. L1:s(-1,3,0), L2:t(4,0,1)+(-2,1,1)

A~( H
*27. Show that if L 1 and L 2 are two lines in JR3 that do not
13. A= ( ~ 1
1 0 0
) ~ 14. intersect and are not parallel, then there is a unique third
line L 3 that intersects both of them at right angles, as in
Exercise 25. What if the lines are parallel? What if they intersect?

28. Show that if v_1, ..., v_k are solutions of a homogeneous system Ax = 0, then
every linear combination of them is also a solution.

29. Let x = v be one solution of the system Ax = b. Show that w is also a solution
if and only if w - v is a solution of the homogeneous system Ax = 0. This version of
linearity is sometimes called the superposition principle.

30. Show that if x = v and x = u + v both satisfy Ax = b, then Au = 0.

31. Show that if x_1 satisfies Ax = b_1 and x_2 satisfies Ax = b_2, then t_1 x_1 + t_2 x_2
satisfies Ax = t_1 b_1 + t_2 b_2.

2D Geometry of Solution Sets


In the systems we've solved so far, solution sets have been either empty, a single
point, a line, or a plane. We'll show here that the solution set of every system of
linear equations has a similar form. Crucial to calling a set of points x = su + tv
a plane is that neither u nor v is a scalar multiple of the other, which we
called linear independence of u and v, now seen as a special case of Definition 2.7'
in Section 2C. Linear independence of the set S = {u_1, u_2, ..., u_k} implies that
the set of linear combinations t_1 u_1 + t_2 u_2 + ··· + t_k u_k can't be collapsed into the
set of linear combinations of a smaller subset of S by replacing a vector u_i by a
linear combination of the others. This collapse is what happens when a parametric
representation x = t_1 u_1 + t_2 u_2 + v of what appears superficially to be a plane collapses
to a line x = (t_1 + ct_2)u_1 + v if u_2 = cu_1, or to a single point x = v if u_1 = u_2 = 0.

2.9 Definition. A subset of R^n is called a k-plane if it consists of all points
x = t_1 u_1 + ··· + t_k u_k + v, where v is a fixed vector, and the u_i are fixed linearly
independent vectors, with the t_i varying over all real numbers.

Comparing Definition 2.9 with the special cases in Chapter 1, we see that an
ordinary plane is a 2-plane and a line is a 1-plane. We can even take k = 0 and
regard a single point as a 0-plane. Every example of a system of linear equations
that we have treated has had for its solution set a k-plane for k = 0, 1, or 2, or else
has had no solutions at all.

EXAMPLE 13 In Example 8, Ax = b stood for two equations in four variables:

\[
\begin{pmatrix} 1 & -2 & 0 & 3 \\ 0 & 0 & 1 & -1 \end{pmatrix} x = \begin{pmatrix} -2 \\ 1 \end{pmatrix};
\qquad
\begin{aligned} x_1 - 2x_2 + 3x_4 &= -2 \\ x_3 - x_4 &= 1. \end{aligned}
\]

For arbitrary scalars s and t there is a unique solution in which the two nonleading
variables, x_2 and x_4, have the values x_2 = s and x_4 = t, namely,

\[
x = \begin{pmatrix} 2s - 3t - 2 \\ s \\ t + 1 \\ t \end{pmatrix}
= s \begin{pmatrix} 2 \\ 1 \\ 0 \\ 0 \end{pmatrix}
+ t \begin{pmatrix} -3 \\ 0 \\ 1 \\ 1 \end{pmatrix}
+ \begin{pmatrix} -2 \\ 0 \\ 1 \\ 0 \end{pmatrix}.
\]

This represents a 2-plane x = su_1 + tu_2 + v if we take

\[
u_1 = \begin{pmatrix} 2 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \quad
u_2 = \begin{pmatrix} -3 \\ 0 \\ 1 \\ 1 \end{pmatrix}, \quad\text{and}\quad
v = \begin{pmatrix} -2 \\ 0 \\ 1 \\ 0 \end{pmatrix}.
\]

To check that u_1 and u_2 are linearly independent, note that the second entries in
u_1 and u_2 are respectively 1 and 0, while the fourth entries are respectively 0
and 1. Thus neither u_1 nor u_2 can be a scalar multiple of the other, so the set
of solutions is a 2-plane in R^4 parallel to u_1 and u_2, and containing the point
v. Note that x_h = su_1 + tu_2 solves Ax = 0 while x_p = v solves Ax = b,
illustrating the decomposition of solutions into homogeneous plus particular in
Theorem 2.4.

In a reduced matrix with n columns and r nonzero rows, the number of nonleading
variables is k = n - r, because every nonzero row of a reduced matrix contains just
one leading entry and corresponds to just one variable. Thus a system of m linear
equations in n variables has a solution set that is an (n-m)-plane unless the result
of applying row reduction to the coefficient matrix produces one or more zero rows.
In particular, the solution set of a single linear equation in R^n is an (n-1)-plane,
sometimes called a hyperplane in R^n.

We may write a single linear equation in the form a · x = b. For an example in R^4,
consider (0, -3, -2, 1) · x = 3, with x = (x_1, x_2, x_3, x_4). Reduction of the single
matrix row is just division by -3, giving the equivalent equation

(0, 1, 2/3, -1/3) · x = -1 or x_2 + (2/3)x_3 - (1/3)x_4 = -1.

The only leading variable is x_2, and the nonleading variables are x_1, x_3, and x_4.
Proceeding as in the previous example, we set x_1 = s, x_3 = t, and x_4 = u to find as
solution set the hyperplane in R^4 consisting of the points

\[
x = \begin{pmatrix} s \\ -\tfrac23 t + \tfrac13 u - 1 \\ t \\ u \end{pmatrix}
= s \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}
+ t \begin{pmatrix} 0 \\ -\tfrac23 \\ 1 \\ 0 \end{pmatrix}
+ u \begin{pmatrix} 0 \\ \tfrac13 \\ 0 \\ 1 \end{pmatrix}
+ \begin{pmatrix} 0 \\ -1 \\ 0 \\ 0 \end{pmatrix}.
\]

The first three terms in the sum are linearly independent since each one has a 1 in the
entry where the other two have only zero, so their linear combination is the general
solution x_h of a · x = 0. The fourth vector is a solution x_p of the nonhomogeneous
equation, so the solutions form a hyperplane containing x_p in R^n.

Checking for Independence. For arbitrary sets of vectors the simple check for
independence we used at the end of the previous example is often impossible, but
there is a routine check. The method depends on knowing that applying elementary
row operations to a matrix A with columns a_1, ..., a_m preserves a dependence relation
among the columns. For instance if we get B from A by applying row operations,
then a_1 = a_2 - 2a_3 if and only if b_1 = b_2 - 2b_3. The reason is that the row operations
(i) multiplication by r ≠ 0 and (ii) adding a multiple of one row to another, preserve
a dependence relation in each affected row. More formally, we have the following.

2.10 Theorem. If a matrix B is obtained from a matrix A by application of row
operations, then a dependence relation among columns of one matrix holds for the
corresponding columns of the other.

Proof. We can express a linear relation x_1 a_1 + ··· + x_n a_n = 0 among the columns
a_j of A as Ax = 0, where x_j = 0 if a_j isn't involved. Since solution vectors x of
Ax = 0 are unchanged by row operations on A, a linear relation among the columns
of A carries over to the same relation among the corresponding columns of B. •

2.11 Theorem. Let u_1, ..., u_m be m vectors in R^n, and let A be the n-by-m
matrix with the u_j as columns. Then u_1, ..., u_m are linearly independent if
and only if every column in a reduced form R of A contains a leading entry. A
column of A with no leading entry in the corresponding column of R is a linear
combination of the columns of A corresponding to columns that do have leading
entries in R.

Proof. A column r_k of R with a leading entry can't be a linear combination of other
columns containing leading entries, because r_k has a 1 where the others have zeros.
Thus the columns are independent if every column of R contains a leading entry. To
write a column r_k of R with only nonleading entries as a linear combination of the
columns with leading entries 1, multiply each such column r_j by the corresponding
nonleading entry in r_k and add the results to get r_k. The same dependence relation
then holds among the corresponding columns of A by Theorem 2.10. •

EXAMPLE 15 To find out whether a = (1, 2, 3, -1), b = (0, 1, -1, 1), and c = (-1, 0, 2, -1) are
linearly independent, we form the matrix A that has them as columns, with reduced
form R for which we omit the details:

\[
A = \begin{pmatrix} 1 & 0 & -1 \\ 2 & 1 & 0 \\ 3 & -1 & 2 \\ -1 & 1 & -1 \end{pmatrix};
\qquad
R = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]

Every column of R has a leading entry, so a, b, and c are linearly independent.
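
The test in Theorem 2.11 can be automated along the following lines. This is a sketch that assumes the reduction routine of Section 2A applied row by row; it may produce a reduced form whose rows appear in a different order from the R shown in the text, which does not affect which columns contain leading entries. The names `reduce_matrix` and `columns_independent` are illustrative.

```python
from fractions import Fraction

def leading_column(row):
    """Index of the first nonzero entry of a row, or None for a zero row."""
    return next((j for j, e in enumerate(row) if e != 0), None)

def reduce_matrix(rows):
    """Reduced form: each leading entry is 1 and alone in its column."""
    M = [[Fraction(e) for e in row] for row in rows]
    for i in range(len(M)):
        lead = leading_column(M[i])
        if lead is None:
            continue
        r = M[i][lead]
        M[i] = [e / r for e in M[i]]
        for k in range(len(M)):
            c = M[k][lead]
            if k != i and c != 0:
                M[k] = [a - c * b for a, b in zip(M[k], M[i])]
    return M

def columns_independent(vectors):
    """True if every column of a reduced form contains a leading entry."""
    rows = [list(r) for r in zip(*vectors)]  # the vectors become columns
    R = reduce_matrix(rows)
    leading_cols = {leading_column(row) for row in R} - {None}
    return len(leading_cols) == len(vectors)

a = (1, 2, 3, -1)
b = (0, 1, -1, 1)
c = (-1, 0, 2, -1)
print(columns_independent([a, b, c]))
print(columns_independent([(1, 0), (0, 1), (1, 1)]))
```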

Theorem 2.11 tells us that if some column of a reduced matrix we're checking for
dependent columns has no leading entry, that column will be a linear combination
of the columns that do have leading entries, as in the following example.

EXAMPLE 16 To check the vectors a = (1, 0, 1, 0), b = (0, 1, 1, 0), c = (1, 1, 2, 1), and d =
(-3, -4, -7, -3) for independence, we form the matrix A that has them as columns,
showing also the system Ax = 0 to aid understanding the dependence relations:

\[
A = \begin{pmatrix} 1 & 0 & 1 & -3 \\ 0 & 1 & 1 & -4 \\ 1 & 1 & 2 & -7 \\ 0 & 0 & 1 & -3 \end{pmatrix};
\qquad
\begin{aligned}
x_1 + x_3 - 3x_4 &= 0 \\
x_2 + x_3 - 4x_4 &= 0 \\
x_1 + x_2 + 2x_3 - 7x_4 &= 0 \\
x_3 - 3x_4 &= 0.
\end{aligned}
\]

A row reduction left as Exercise 5(b) gives an R and a reduced system Rx = 0:

\[
R = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & -3 \end{pmatrix};
\qquad
\begin{aligned}
x_1 &= 0 \\
x_2 - x_4 &= 0 \\
0 &= 0 \\
x_3 - 3x_4 &= 0.
\end{aligned}
\]

Solving the corresponding system Rx = 0, we can set the sole nonleading variable
x_4 equal to 1, so x_1 = 0, x_2 = 1, and x_3 = 3. Hence b + 3c + d = 0. Evidently
d = -b - 3c, so {a, b, c, d} is a linearly dependent set of vectors, and even the
smaller set {b, c, d} is dependent. But {a, b, c} is an independent set. What about
{a, b, d}? By looking carefully at R, you can answer this question without referring
to the system Rx = 0.
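
The dependence relation found in Example 16 can be confirmed by direct arithmetic. The short sketch below uses the solution x_1 = 0, x_2 = 1, x_3 = 3, x_4 = 1 as coefficients of a, b, c, d; the helper name `linear_combination` is an illustrative choice.

```python
def linear_combination(coeffs, vectors):
    """Return the sum of coeffs[k] * vectors[k], computed entry by entry."""
    return [sum(t * v[i] for t, v in zip(coeffs, vectors))
            for i in range(len(vectors[0]))]

a = (1, 0, 1, 0)
b = (0, 1, 1, 0)
c = (1, 1, 2, 1)
d = (-3, -4, -7, -3)

# Coefficients read off from the reduced system with x_4 = 1.
relation = linear_combination([0, 1, 3, 1], [a, b, c, d])
print(relation)

# Equivalent form of the same relation: d = -b - 3c.
print(linear_combination([-1, -3], [b, c]) == list(d))
```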

EXERCISES

1. Solve the equation w + 3x - 2y + z = 3 by expressing the solutions parametrically
as a 3-plane in R^4.

2. Solve the equation u + v + w - x - y = 1 by expressing the solutions parametrically
as a 4-plane in R^5.

3. Carry out the row reduction indicated in text Example 15.

4. Find a reduced form R for the matrix

\[
\begin{pmatrix} 1 & 0 & -1 & 3 \\ 2 & 1 & 0 & -1 \\ 3 & -1 & 2 & 0 \\ -1 & 1 & -1 & 2 \end{pmatrix},
\]

and use R to show that the columns of the matrix are linearly independent.

5. In Example 16 in the text, the four vectors a = (1, 0, 1, 0), b = (0, 1, 1, 0),
c = (1, 1, 2, 1), and d = (-3, -4, -7, -3) were found to be linearly dependent,
while the three vectors a, b, c were found to be linearly independent.
(a) Test the sets of three vectors a, b, d; a, c, d; and b, c, d for independence.
(b) Carry out the row reduction we used in Example 16.

6. Let x = (1, 3, 9), y = (0, -4, 6), z = (1, 7, 3), and w = (1, 2, 4). Find which of
the sets {x, y, z, w}, {x, y, w}, and {x, y, z} are linearly independent.

7. Show that two nonzero vectors u, v in R^3 are linearly independent if and only if
they don't both lie on a line through the origin.

8. Show that three nonzero vectors u, v, w in R^3 are linearly independent if and only
if each pair determines a plane that doesn't contain the third vector.

9. Let A =( t - ). ~ ~ Show that points x such that Ax = 0 form a line.

10. Show that the system's solutions form a line, or 1-plane, containing 0 in R^3:

11. Show that the system's solutions form a 2-plane containing 0 in R^4:

A matrix R is in echelon form if it's not only reduced but the leading entries shift to
the right as you go down the columns and all zero rows are at the bottom. It can be
shown that a given matrix A is reducible to a unique matrix in echelon form.

12. Show that given a reduced matrix R there is a sequence of row interchanges that
will put R in echelon form.

13. Show that if an n-by-n matrix is in echelon form then either it has one or more
zero rows, or else its columns are the standard basis vectors e_1, ..., e_n.

SECTION 3 MATRIX ALGEBRA


Operations of addition and multiplication by scalars work for matrices very much as
they do for vectors. Similarly, the product of a matrix and a column vector defined
in the previous section has a natural extension to a more general product of matri-
ces. This section is about the properties of these extended operations, properties that
subsume and organize the arithmetic of multivariable systems of equations, not only
linear ones but the calculus of nonlinear systems also. To be specific, matrix alge-
bra plays an important role in establishing the properties of inverse matrices in the
next section, in extending the meaning of differentiability for real-valued functions
R^n → R to vector-valued functions R^n → R^m in Chapter 5, Section 4, in extending
Newton's method for root-finding to vector-valued functions in Chapter 5, Section
5, and in solving linear systems of differential equations throughout Chapter 13.
If a matrix is named by a capital letter, we denote its entries by the corresponding
lowercase letter with a pair of subscripts. The first subscript labels the row where the
entry occurs and the second labels the column. Thus the first index increases across
the rows and the second increases down the columns in

\[
A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{pmatrix}
\quad\text{or}\quad
B = \begin{pmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{pmatrix}.
\]

In general, a_ij is called the ijth entry of A and stands for the entry in the ith row
and the jth column of the matrix. For a row vector or a column vector we usually
use only one subscript and write, for example, a = (a_1, a_2, ..., a_n).

EXAMPLE 1 To illustrate the notation more concretely, let

\[
P = \begin{pmatrix} 7 & -1 \\ -3 & 2 \end{pmatrix}
\quad\text{and}\quad
Q = \begin{pmatrix} 11 & 12 & 13 \\ 21 & 22 & 23 \end{pmatrix}.
\]

Then in P the entries are p_11 = 7, p_12 = -1, p_21 = -3, p_22 = 2. We can write a
formula for the entries in Q, namely q_ij = 10i + j for i = 1, 2 and j = 1, 2, 3.
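
The entry formula for Q translates directly into code. In this sketch a matrix is stored as a list of rows, which is an assumption of the illustration; since Python lists are indexed from 0, the hypothetical helper `entry` shifts the text's 1-based subscripts.

```python
def entry(M, i, j):
    """Return the ijth entry of M using 1-based row and column subscripts."""
    return M[i - 1][j - 1]

P = [[7, -1],
     [-3, 2]]

# Build Q from the formula q_ij = 10i + j for i = 1, 2 and j = 1, 2, 3.
Q = [[10 * i + j for j in range(1, 4)] for i in range(1, 3)]

print(Q)
print(entry(P, 2, 1), entry(Q, 2, 3))
```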

3A Sum and Scalar Multiple


The addition and scalar multiplication that were defined in Chapter 1 for vectors in
R^n extend to matrices: If A and B have the same dimensions, then the sum A + B
is defined to be the matrix C with c_ij = a_ij + b_ij.

EXAMPLE 2 Here are two examples of matrix addition.

(~ ;)+( -! ;)=(~ ;)
2
I
I )
0 +( - I
I =; -~) = ( ~ ~ ~)
There's no reasonable way to define addition for matrices of different dimensions,
so we can't add
For a matrix $A$ and number $r$, the scalar multiple $rA$ is defined to be the matrix
$C$ with entries $c_{ij} = r a_{ij}$.

EXAMPLE 3  Here are two examples of scalar multiplication.

61)= ( -~ =~ )
- 2(

3
(-! i 6)=(-~ ~ ~)
Using both addition and scalar multiplication, we can write linear combinations of
matrices that have the same dimensions. For example, using 2-by-2 matrices,

)
- 3( 0
2
l )
l
=( 2
4
-2 )
6
+( 0
-6
-3 )
-3

= ( -22 -5)
3 .

As with vectors in $\mathbb{R}^n$, we write $-A$ for $(-1)A$ and $A - B$ for $A + (-1)B$. Also,
for every $m$ and $n$ there is an $m$-by-$n$ zero matrix, denoted by 0, with all entries
equal to zero, such that $A + 0 = A$.
Notational warning. When 0 is used to denote a zero matrix, we depend on the
context to make clear what dimensions are intended for the matrix. For example,

_if A =( ! ; ), then the O in A + 0 must stand for ( ~ g),because that is


the only zero matrix we can add to a 2-by-2 matrix.
The formulas 1 to 9 concerning linear combinations of vectors in $\mathbb{R}^n$ stated in
Section 1 of Chapter 1 (page 2) are equally valid for linear combinations of matrices.
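
The entrywise definitions of sum and scalar multiple translate directly into code. The following is a minimal sketch in Python, with matrices stored as lists of rows; the function names `mat_add` and `scalar_mul` and the matrix `M` are illustrative choices, not notation from the text.

```python
def mat_add(a, b):
    """Entrywise sum c_ij = a_ij + b_ij; defined only for equal dimensions."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same dimensions")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scalar_mul(r, a):
    """Scalar multiple c_ij = r * a_ij."""
    return [[r * x for x in row] for row in a]


# A linear combination 2M - 3N of two 2-by-2 matrices (M is an assumed example).
M = [[1, -1], [2, 3]]
N = [[0, 1], [2, 1]]
combo = mat_add(scalar_mul(2, M), scalar_mul(-3, N))
print(combo)  # [[2, -5], [-2, 3]]
```

Raising an error for mismatched dimensions mirrors the remark that addition is not defined for matrices of different dimensions.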
3B Matrix Multiplication
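In Section 2A we defined the product $Ax$ of an $m$-by-$n$ matrix $A$ and $n$-by-1 column
vector $x$ to be the $m$-by-1 column vector whose entries are the dot products of the rows
of $A$ with $x$. Thus in writing systems of linear equations we saw that if

$$A = \begin{pmatrix} -1 & 2 \\ 1 & -3 \\ 5 & 1 \end{pmatrix} \text{ and } x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \text{ then } Ax = \begin{pmatrix} -1 & 2 \\ 1 & -3 \\ 5 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} -x_1 + 2x_2 \\ x_1 - 3x_2 \\ 5x_1 + x_2 \end{pmatrix}.$$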

Our general definition of the matrix product AB depends on the special case Ax.

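EXAMPLE 4  If

$$A = \begin{pmatrix} 4 & 3 \\ -1 & 2 \\ 1 & -1 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix},$$

then

$$AB = \begin{pmatrix} 4 & 3 \\ -1 & 2 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} 4b_{11} + 3b_{21} & 4b_{12} + 3b_{22} \\ -b_{11} + 2b_{21} & -b_{12} + 2b_{22} \\ b_{11} - b_{21} & b_{12} - b_{22} \end{pmatrix}.$$

For a numerical example, let $D = \begin{pmatrix} 1 & 2 \\ 4 & 5 \end{pmatrix}$. Then

$$AD = \begin{pmatrix} 4 & 3 \\ -1 & 2 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 4 & 5 \end{pmatrix} = \begin{pmatrix} 16 & 23 \\ 7 & 8 \\ -3 & -3 \end{pmatrix}.$$
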

The entry in the first row and second column of AD is the dot product given by
(4, 3) . (2, 5) = (4)(2) + (3)(5) = 23; you should check that the other entries are
correct as shown.
Matrix multiplication is sometimes called row-by-column multiplication, and
schematically the process looks like this, putting the dot product of the second row
and the fourth column in the corresponding row and column of the product:

[Schematic: the second row of the left factor and the fourth column of the right factor are marked, together with the entry in the second row and fourth column of the product.]

For basic questions about matrix products it's often essential to write the entries
in the product of $A = (a_{ij})$ and $B = (b_{ij})$ by using the summation notation. In this
notation the $ij$th entry in $AB$ is

$$\sum_{k=1}^{n} a_{ik} b_{kj},$$

an expression that is read as "the sum for $k = 1$ to $n$ of $a_{ik} b_{kj}$" and means the
same as

$$a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{in} b_{nj}.$$

Note that the summation index k runs over the column index of A (that is, across a
row) and over the row index of B (that is, down a column).
It is important to note that matrix multiplication is not in general commutative.
Thus even if the products AB and BA are defined and have the same dimensions
they may not be equal. This is why we need the first two laws, with factors in
opposite orders, in Theorem 3.2.

IE~Me~e sJ Let A =( ~ ~) and B = ( ~ ~). Then AB = ( ~ ~) and BA =


( ! ~ ), so $AB \neq BA$. Exercises 43 to 48 show other ways that matrix operations
differ from the analogous scalar operations.
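
The summation formula for the $ij$th entry of a product can be turned into a short program. The sketch below is one possible rendering in Python, not a method from the text; `mat_mul` is an illustrative name, the numerical factors are chosen to reproduce the product quoted in Example 4, and the pair `U`, `V` is an assumed example of a non-commuting pair.

```python
def mat_mul(a, b):
    """Row-by-column product: entry ij is the sum over k of a[i][k] * b[k][j]."""
    n = len(b)
    if any(len(row) != n for row in a):
        raise ValueError("columns of the left factor must match rows of the right")
    p = len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(a))]


# A 3-by-2 matrix times a 2-by-2 matrix gives a 3-by-2 matrix.
A = [[4, 3], [-1, 2], [1, -1]]
D = [[1, 2], [4, 5]]
print(mat_mul(A, D))  # [[16, 23], [7, 8], [-3, -3]]

# Two 2-by-2 matrices whose products in the two orders differ.
U = [[0, 1], [0, 0]]
V = [[0, 0], [1, 0]]
print(mat_mul(U, V))  # [[1, 0], [0, 0]]
print(mat_mul(V, U))  # [[0, 0], [0, 1]]
```

Note that `mat_mul(D, A)` raises an error, since a 2-by-2 matrix cannot be multiplied on the right by a 3-by-2 matrix.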

3.2 Theorem. Let A, B, and C be matrices having the proper dimensions for the
sums and products to be defined, and let t be a scalar. The basic properties of matrix
products are

1. $(A + B)C = AC + BC$ (Right distributive law)
2. $C(A + B) = CA + CB$ (Left distributive law)
3. $(tA)B = t(AB) = A(tB)$ (Scalar commutativity law)
4. $A(BC) = (AB)C$ (Associative law)

Proof. We prove only property 4 since it is the most complicated; proofs of the
others are left as exercises. Suppose $A$ is $m$-by-$n$. For the product $AB$ to be defined,
$B$ must have $n$ rows and so must be $n$-by-$p$ for some $p$. Similarly, $C$ must be $p$-by-$q$
for some $q$ in order for $BC$ to be defined. To prove two matrices equal we have
to show that corresponding entries are the same in both. The $rj$th entry of $BC$ is
$\sum_{s=1}^{p} b_{rs} c_{sj}$, so the $ij$th entry of $A(BC)$ is

$$\sum_{r=1}^{n} a_{ir} \left( \sum_{s=1}^{p} b_{rs} c_{sj} \right) = \sum_{r=1}^{n} \sum_{s=1}^{p} a_{ir} b_{rs} c_{sj}.$$

Similarly, the $ij$th entry of $(AB)C$ is

$$\sum_{s=1}^{p} \left( \sum_{r=1}^{n} a_{ir} b_{rs} \right) c_{sj} = \sum_{s=1}^{p} \sum_{r=1}^{n} a_{ir} b_{rs} c_{sj}.$$

The sums on the right in these two equations consist of the same terms added in
different orders, so corresponding entries in A(BC) and (AB)C are equal. •
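
A numerical spot check is no substitute for the proof, but it can make the associative law concrete. The following sketch compares $A(BC)$ with $(AB)C$ for randomly chosen integer matrices of compatible dimensions $m$-by-$n$, $n$-by-$p$, and $p$-by-$q$; the helper names and the particular dimensions are assumptions made for illustration.

```python
import random


def mat_mul(a, b):
    """Row-by-column product of matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def rand_matrix(rows, cols, rng):
    """A rows-by-cols matrix of small random integers."""
    return [[rng.randint(-5, 5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)          # fixed seed so the run is repeatable
m, n, p, q = 2, 3, 4, 2         # illustrative dimensions
A = rand_matrix(m, n, rng)
B = rand_matrix(n, p, rng)
C = rand_matrix(p, q, rng)

left = mat_mul(A, mat_mul(B, C))    # A(BC)
right = mat_mul(mat_mul(A, B), C)   # (AB)C
print(left == right)  # True
```

Integer entries are used so that the comparison is exact; with floating-point entries the two sides could differ by rounding error.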
Formulas 1 through 4 in Theorem 3.2 state laws that have the same form as some
of the laws of ordinary arithmetic, with matrices replaced by scalars. Because of
the associative law for matrices it makes sense to write the product $ABC$ instead of
$(AB)C$ or $A(BC)$, since the result is independent of the order in which the products
are formed, though as we saw in Example 5 not necessarily independent of the order
of the factors. In the next section we'll define an inverse operation to multiplication
by $A$, denoted $A^{-1}$, that is defined only for some square matrices.

EXAMPLE 6  To illustrate the associative and distributive laws in the next two examples, let
A = ( _ : ~) , B = ( ~ i) , C = ( -~ ~) , and D i) ·
= ( - ~~

Then AB = (-112)(0 2) = (444)


2 2 1 0 and (AB)C = (444)(-1 0) (-488)
0 3 2 = 0 '

while BC = ( ~ i)(-~ ~) = ( ~ i) and A(BC) = ( - ! ;)( ~ i) (-! ~).


=

Thus $(AB)C = A(BC)$, as the associative law states for $A$, $B$, and $C$ in that order.

EXAMPLE 7  To illustrate the first distributive law we calculate $(A + B)D$ as

This equals AD+ BD , since

(-! ;)(-; 1 f)+(~ f)(-~ ~ i)


= ( ~ ; 6) + ( -i 1 ; ) = ( ; ~ ~)·
Note that the product D(A + B) is not defined.
3C Identity Matrices
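A square matrix of the form

$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad\text{or}\quad I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad\text{or}\quad I = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}$$

that has 1s on its main diagonal and 0s elsewhere is called an identity matrix. An
identity matrix $I$ has the property that both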

IA = A, and BI =B
for matrices $A$ and $B$ such that the products are defined. Thus it is an identity element
for matrix multiplication somewhat as the number 1 is an identity for multiplication
of numbers. There is an $n$-by-$n$ identity matrix for every value of $n$, and as with
zero matrices, we depend on the context to determine the dimensions of the identity
matrix denoted by an occurrence of $I$ in a formula.

EXAMPLE 8  You should check the following matrix products:

0 n(!; D=(!; n(g ! n=o; n


Notice that while we use the letter $I$ to denote the identity matrix when it occurs on
either side of a given matrix $A$, these identity matrices will have different dimensions
if $A$ is not a square matrix.
If $x$ is an $n$-dimensional column vector, then $Ix = x$ looks like this when $n = 3$:

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}.$$

3D Matrix Polynomials
We may multiply together any number of matrices of the same dimensions $n$-by-$n$ to
get a square matrix with the same dimensions. In particular, if $A$ is a square matrix
we may multiply it by itself repeatedly, and we define

$$A^2 = AA, \quad A^3 = AAA = AA^2, \quad \ldots, \quad A^k = AA^{k-1}.$$

Note that if $B$ is not square, for example, 2-by-3, then not even $B^2$ makes sense.
When $A$ has dimensions $n$-by-$n$ it is natural to define $A^0$ to be the $n$-by-$n$ identity
matrix $I$, since then the rule $A^j A^k = A^{j+k}$ holds for all nonnegative integers $j$ and
$k$. Since the powers of $A$ all have the same dimensions, we can add scalar multiples
of them to form polynomials in $A$ such as $A^2 + 3A + 5I$.

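For example, if $A = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}$, then

$$A^2 = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix} = \begin{pmatrix} 4 & 5 \\ 0 & 9 \end{pmatrix}, \quad A^3 = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 4 & 5 \\ 0 & 9 \end{pmatrix} = \begin{pmatrix} 8 & 19 \\ 0 & 27 \end{pmatrix}, \quad\text{etc.}$$

If $p(x)$ is the polynomial $x^2 + 4x + 2$, then

$$p(A) = A^2 + 4A + 2I = \begin{pmatrix} 4 & 5 \\ 0 & 9 \end{pmatrix} + 4 \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix} + 2 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

$$= \begin{pmatrix} 4 & 5 \\ 0 & 9 \end{pmatrix} + \begin{pmatrix} 8 & 4 \\ 0 & 12 \end{pmatrix} + \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} = \begin{pmatrix} 14 & 9 \\ 0 & 23 \end{pmatrix}.$$
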

The possibility of replacing the variable $x$ in a polynomial $p(x)$ by a square matrix
$X$ raises some interesting questions. In Exercise 47 we look at the question of solving
for $X$ in a quadratic matrix equation $aX^2 + bX + cI = 0$, and in Exercise 50 we look
at a way of defining $f(X)$ for some functions other than polynomials, $f(X) = e^X$
in particular, which we'll find useful in Chapter 13.
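
Evaluating a polynomial at a square matrix can be organized by Horner's rule, which needs only matrix products, sums, and scalar multiples of the identity. The sketch below is one possible arrangement, not a procedure from the text; the function names are illustrative, and the matrix with rows $(2, 1)$ and $(0, 3)$ is taken as the example matrix, consistent with the sums $p(A) = A^2 + 4A + 2I$ displayed above.

```python
def identity(n):
    """The n-by-n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scalar_mul(r, a):
    return [[r * x for x in row] for row in a]


def poly_at_matrix(coeffs, a):
    """Evaluate a polynomial at the square matrix a by Horner's rule.

    coeffs lists the coefficients from the highest power down to the
    constant term; the constant term multiplies the identity matrix.
    """
    n = len(a)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = mat_add(mat_mul(result, a), scalar_mul(c, identity(n)))
    return result


A = [[2, 1], [0, 3]]
print(poly_at_matrix([1, 4, 2], A))  # [[14, 9], [0, 23]]
```

Horner's rule is valid here because powers of a single matrix commute with one another, even though matrix multiplication in general does not commute.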

EXERCISES

In Exercises J to 10, compute the given expressions 28. Show that if A is a m-by-11 matrix, then Aej is the jth
using the following matrices: column of A, where e1, ... , e11 are the standard basis
vectors in JR", considered as column vectors. What arc
the products e; A, considering e; as a row vector?
A = ( -b i ), B = ( In Exercises 29 to 34, compute the given matrix
products.
c-(1 i )·
(i 00) (: DO)
2 1
- 1
29. 5 30. 0

.)( n
8 1
1. 3A - B 2. A +2B 3. 2A +B +C
4. A
7. ABC
+sc 5. AB
8. AB-2B
6. BC
9. C+AC
31. ( 2 32.
0)1 2 4 )

10. A+ B +C 2
In Exercises 11 to 20, compute the given expression if
it is defined, or else give a reason why it is not defined,
33.
(H) (_: - 1
1 -1
1
)
using the following matrices.
34. (~ ~ ~ )(-:)
0 0 5 -1

A=( 35. Show that for a matrix A and zero matrices of appropriate
dimensions,

c-( AO= 0 and OA = 0.

If A is m-by-n, for what possible dimensions of zero

G=( -I 2)
matrices are AO and O A defined, and what arc the
dimensions of the products?
0 3
36. Prove the Icft distributive law for matrices, C(A + B) =
CA +CB .
11. 2B - 3G 12. AB 13. BA
37. Let r be a 1-by-n row vector and e an n-by-1 column
14. BD 15. DB 16. CD +3DB vector.
17. 2AB - SG 18. 2GC -4AB 19. CDC (a) Show that the matrix product re is the same as the
20. DCD dot product r • e.
(b) Describe in general the product er.
In Exercises 21 to 26, with A, B, C, D the same as in
the preceding group of exercises, determine what the 38. A company makes m grades of a product in n different
dimensions of X and Y would have to be for each of the factories.
following equations to be possible. (In some cases there (a) Let £lij be the number of tons per day of grade i
may be no possible dimensions; in other cases there may produced in plant j .
he more than one possibility.) (b) Let dj be the number of days per month that plant
21. AX=B+Y 22. (D+2X)YC=0 j operates.
(e) Let p; be the wholesale price per ton of grade i .
23. AX= YD 24. ex +DY= o
25. AX= YC 26. AX= CY
27. Using the matrix C of the preceding group of exercises, ~I A - (a,1), d - ( ;: ) , ~d p - ( :~ } ood
compute Ci, Cj, and Ck, taking the basis vectors i, ,i, and
k as column vectors in JR 3 • let u be the column vector of all Is in JR" and v be the
column vector of all ls in ]Rm. Give interpretations for factoring the left side we know that the scalar equation
the following expressions. x2 - I = 0 has x = l and x =-
l as its only solutions.
(a)
(d)
Au
(Ad)• v
(b)
(e)
(Au)• v
(Ad)• p
(c)

In Exercises 39 to 42, compute A 2 , A3, and p(A), where


p(x) = 2x 2 - 3x + 3.
Ad

(b) Show that ( : _: r


(a) Show that if A= I or A=-/, then A 2 - I= 0.

= ( ~ ~) ifa 2 +bc=
I. Thus the equation A 2 - I = 0 has infinitely many
different solutions in the set of 2-by-2 matrices.
39. A= ( -~ ~) 40. A=
( 0
0
0
1
0
1)
I (c) Show that every 2-by-2 matrix A for which A 2 = I
is either /, - l, or one of the matrices described in
0 0
part (b).

~ ~) 0 0 3) 49. Given numbers a, b, c, d, let p(x) = x2 - (a+ d)x +


41. A= ( -~
0 0 3
42. A=
( -I
0
O 0
5 0 (ad - be), and let A be the matrix ( ~ ! )-
Take a= 1, b = -3, c = 1, d = -1, and verify that
V = ( i ~ ).
(a)
43. Let U = ( - ~ -~ ) and Compute p(A) = 0 in this case.
UV and VU. Are they the same? Is it possible for the (b) Show that, for arbitrary values of a, b, c, d, we have
product of two matrices to be zero without either factor p(A) = 0.
being zero? *50. Let A be an n by n matrix. We define an n-by-n matrix
44. (a) Show that if A is an n-by-n matrix and / is the n- eA by
by-n identity matrix, then (A - l)(A + I) = A 2 - /.
(b) Give examples of2-by-2 matrices A and B such that N oc

(A - B)(A + B) # A 2 - B 2. eA = N-+oo
.
hm LI-A k = Elk
-A.
k! k!
45. (a) Show that for A and / as in Exercise 44, (A+ /) 2 = k=O k=O

A 2 +2A + I.
(b) Find examples of 2-by-2 matrices A and B such that Note that the finite sum from O to N is a polynomial
(A+ B) 2 # A 2 +2AB + B 2 • PN(A) and so is itself an n by n matrix in which we
compute the limit separately in each matrix entry.

46. Find Aand Afor A=(-!2 ~I


2 3
-~)·Zero is the
-1
(a) Compute the matrix PN(A) for A =( b ! ),
only number whose cube is 0. Thus this exercise illustrates Then compute the limit matrix eA by letting N --+
00.
another difference between the arithmetic of numbers and
of matrices. (b) Using the commutativity of scalar-matrix multiplica-
tion, redo the computation in part (a) with the 2-by-2
47. The factorization X 2 -3X + 2l = (X - /)(X -2/) shows matrix A replaced by t A to find the 2-by-2 matrix
that the m·atrix equation X 2 - 3 X + 2l = 0 has solutions e'A.
X = I and X = 2!. Show that if X and l are 2-by-2, (c) Since we can compute limits of matrices one entry
then X =( ~ ~ ) is also a solution for every scalar a. at a time, we can differentiate one entry at a time.
d .
Show that dt e1A = Ae 1A for the 2-by-2 matnx of
48. We know that if the scalar equation xy = 0 holds then at
least one of the numbers x and y must equal 0. Thus by part (a).

SECTION 4 INVERSE MATRICES


This section completes our discussion of the algebra of matrix operations by describ-
ing the limited extent to which we can perform an analogue of division by a square
matrix.
4A Invertibility
If $A$ and $B$ are square matrices with the same dimensions such that

$$AB = BA = I,$$

then we say that $B$ is an inverse of $A$, and that $A$ is an invertible matrix. As we
show in Theorem 4.5, a matrix $A$ can have at most one inverse, so we can speak of
the inverse of $A$ and denote it by $A^{-1}$. Thus, if $A$ is invertible,

$$AA^{-1} = A^{-1}A = I.$$

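EXAMPLE 1  If $A = \begin{pmatrix} 1 & 2 \\ 3 & 7 \end{pmatrix}$, we can check that $A^{-1} = \begin{pmatrix} 7 & -2 \\ -3 & 1 \end{pmatrix}$:

$$AA^{-1} = \begin{pmatrix} 1 & 2 \\ 3 & 7 \end{pmatrix} \begin{pmatrix} 7 & -2 \\ -3 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

and

$$A^{-1}A = \begin{pmatrix} 7 & -2 \\ -3 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 7 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$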
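A 2-by-2 matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is invertible if its determinant, $ad - bc$, is not zero.
In that case

4.1 $$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.$$

In Example 1 we used

$$\begin{pmatrix} 1 & 2 \\ 3 & 7 \end{pmatrix}^{-1} = \frac{1}{(1)(7) - (2)(3)} \begin{pmatrix} 7 & -2 \\ -3 & 1 \end{pmatrix} = \begin{pmatrix} 7 & -2 \\ -3 & 1 \end{pmatrix}.$$

Formula 4.1 is worth remembering and is the special 2-by-2 case of Theorem 5.8 in
Section 5E. Exercise 32 asks you to verify that the formula is correct.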
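
The 2-by-2 inversion formula is short enough to code directly. The following sketch uses exact rational arithmetic from Python's standard `fractions` module so that the division by $ad - bc$ introduces no rounding; the function name `inverse_2x2` and the convention of returning `None` when the determinant is zero are assumptions made for this illustration.

```python
from fractions import Fraction


def inverse_2x2(m):
    """Inverse of [[a, b], [c, d]] by the 2-by-2 formula, or None if ad - bc = 0."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        return None
    f = Fraction(1) / det          # exact reciprocal of the determinant
    return [[d * f, -b * f], [-c * f, a * f]]


A = [[1, 2], [3, 7]]
print(inverse_2x2(A) == [[7, -2], [-3, 1]])  # True
print(inverse_2x2([[1, 2], [2, 4]]))         # None
```

The second call illustrates a matrix with determinant zero, for which the formula cannot be applied.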
If $A$ is an $n$-by-$n$ matrix, then the matrix equation $Ax = b$ is equivalent to a
system of $n$ linear equations in $n$ variables. If $A$ happens to be an invertible matrix
with inverse $A^{-1}$, then we can solve the system in matrix form by multiplying
both sides on the left by $A^{-1}$ to get $A^{-1}Ax = A^{-1}b$. Since $A^{-1}A = I$, we have
$A^{-1}Ax = Ix = x$, so

4.2 Theorem. If $A$ is invertible, then $Ax = b$ has a unique solution $x = A^{-1}b$.
In particular, $x = 0$ is the only solution to $Ax = 0$.

Theorem 4.4 will show that if $A$ is a square matrix and $x = 0$ is the only solution
to $Ax = 0$, then $A$ is invertible.

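EXAMPLE 2  The system

$$\begin{aligned} x + 2y &= 3 \\ 3x + 7y &= -4 \end{aligned} \quad\text{is equivalent to}\quad \begin{pmatrix} 1 & 2 \\ 3 & 7 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 3 \\ -4 \end{pmatrix}.$$

By Formula 4.1,

$$\begin{pmatrix} 1 & 2 \\ 3 & 7 \end{pmatrix}^{-1} = \begin{pmatrix} 7 & -2 \\ -3 & 1 \end{pmatrix}.$$

Multiplying the two sides of the equation by the inverse gives

$$\begin{pmatrix} 7 & -2 \\ -3 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 7 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ y \end{pmatrix}$$

on the left and

$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 7 & -2 \\ -3 & 1 \end{pmatrix} \begin{pmatrix} 3 \\ -4 \end{pmatrix} = \begin{pmatrix} 29 \\ -13 \end{pmatrix}$$

on the right. Thus $(x, y) = (29, -13)$ is the unique solution.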
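
The computation in this example can be checked mechanically: multiply the inverse by the right-hand side, then substitute the result back into the system. The sketch below does this for the system $x + 2y = 3$, $3x + 7y = -4$; the helper name `mat_vec` is an illustrative choice.

```python
def mat_vec(a, x):
    """Product of a matrix (list of rows) with a column vector (list of entries)."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


A = [[1, 2], [3, 7]]
A_inv = [[7, -2], [-3, 1]]      # inverse of A from the 2-by-2 formula
b = [3, -4]

x = mat_vec(A_inv, b)           # x = A^{-1} b
print(x)                        # [29, -13]
print(mat_vec(A, x) == b)       # True
```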

4B Computing Inverses
Formula 4.1 gives an easy way to find inverses of 2-by-2 matrices, and Section 5E
has formulas for $A^{-1}$ using determinants when $A$ has larger dimensions, but it's
often more efficient, particularly for large matrices, to use row operations as in the
process described below. To use the method we need to introduce another type of
row operation on a matrix A: row rearrangement, which changes the order of the
rows in the matrix. As with elementary multiplication and elementary modification,
applying the same row rearrangement to A and b leaves the solution set of Ax = b
unchanged; this is so because the solutions of a system are unchanged by writing
the scalar equations of the system in different orders. We'll first describe the process
and give an example, then prove some theorems that show it always works, either
finding the inverse of a matrix A or showing that A has no inverse.
4.3 Matrix Inversion Process. Given a square matrix $A$, apply row operations
to obtain a reduced matrix $R$, applying the same operations in the same order to $I$
to obtain a matrix $B$. Then
If a row of $R$ is a zero row, $A$ is not invertible.
If $R$ has no zero rows, then $A$ is invertible. Apply row rearrangement
to convert $R$ to $I$, and apply the same rearrangement to $B$. Then $B$ is $A^{-1}$.
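
The inversion process lends itself to a short program. The sketch below is a variant rather than a literal transcription: it carries $A$ and $I$ side by side in one augmented array and interchanges rows during the reduction instead of rearranging them at the end, which leads to the same inverse when one exists. The function name `invert`, the use of exact fractions, and the convention of returning `None` for a non-invertible matrix are all assumptions made for illustration.

```python
from fractions import Fraction


def invert(a):
    """Row-reduce [A | I]; return the inverse of A, or None if A is not invertible."""
    n = len(a)
    # Build the augmented matrix [A | I] with exact rational entries.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        # Row rearrangement: find a row at or below the diagonal with a nonzero entry.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None  # the reduced matrix would contain a zero row
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Elementary multiplication: scale the pivot row to get a leading 1.
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Elementary modification: clear the rest of the column.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


print(invert([[1, 2], [3, 7]]) == [[7, -2], [-3, 1]])  # True
print(invert([[1, 2], [2, 4]]))                        # None
```

Exact fractions are used so that the test for a zero pivot is reliable; with floating-point entries a tolerance would be needed instead.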

We illustrate how the process works on the matrix $A$ on the left below, row reducing
$A$ and performing the same row operations on $I$ as we go along. We start with

-3
~ -7
~),/=(~ ~ ~)- 0 0 1

Add -2 times the second row to the first and -1 times the second to the third:

40 8)
0
-2
, ( 01 1
0)
0 .
-3 -7 0 -1 1

Multiply the first row by ¼ and then add 3 times the first row to the third:

Multiply the third row by -1 and then add -2 times the third row to the first:

(
0l O1 O0) , (
0 0 I
-4
i0 - 1I
3
1

s
2 -1
~).
The matrix on the left is reduced. Interchanging the first and second rows gives

R=(b ~ ~). B=( _li -s


0 0 l
11
2 ~)-
- I
4 2
You can multiply A on the right and left by B to verify that B is an inverse for A.
If R had a zero row, the system Rx= 0 would have been one with two equations
in three variables having nonzero solutions. Thus A could not have been invertible,
by Theorem 4.2.

Since $AA^{-1} = I$, $A^{-1}$ is a solution of the matrix equation $AX = I$, and one way
to find $A^{-1}$ is to solve that equation. Let $x_k$ be the $k$th column of $X$, so the $k$th
column of $AX$ is $Ax_k$. The $k$th column of $I$ is the standard basis vector $e_k$ in $\mathbb{R}^n$,
so solving $AX = I$ amounts to solving all the systems $Ax_1 = e_1, \ldots, Ax_n = e_n$ at
once. We use this idea in the proof of the following theorem.

4.4 Theorem. Let A be a square matrix. If x = 0 is the only solution of Ax = 0


then the inversion process 4.3 yields a square matrix B such that AB = I . If Ax= 0
has nonzero solutions, A is not invertible and the inversion process produces a
reduced matrix R containing at least one zero row.

Proof. Let $R$ be the result of applying row reduction operations to $A$, and $B$ the
result of applying the same operations to $I$, as in the inversion process 4.3.
First suppose that x = 0 is the only solution of Ax= 0. Then the same is true of
the system Rx = 0, and by Theorem 2.5 the system has at least as many nontrivial
equations as it has variables. Since R is square, this implies that R has no zero
row. Thus every row and column contains a leading entry I with Os in the rest of
the column, and the rows of R can be rearranged to produce the identity matrix I.
Rearrange the rows of B in the same way, and write bk for its kth column. Row
operations on a matrix apply simultaneously to all its columns, so we've converted
the equations $Ax_k = e_k$ to an equivalent system $Ix_k = b_k$. Thus $x_k = b_k$ is the $k$th
column of $X$, and $B$ is a solution of $AX = I$. In other words, $AB = I$.
Otherwise, suppose that the system $Ax = 0$ has nonzero solutions. Then the same
is true of the system $Rx = 0$ and, again by Theorem 2.5, the reduced system has
more variables than it has nontrivial equations, which implies that $R$ has at least one
zero row. •
Theorem 4.4 implies that if A is invertible, then the inversion process 4.3 produces
a matrix B such that AB= I. To prove that B = A- 1 we need to show that BA= I
as well. Also, many different sequences of row operations will reduce A to I, so it
isn't obvious that the process used in Theorem 4.4 always gives the same matrix B
as a result. The next theorem settles both these questions.
4.5 Theorem. If $A$ is an invertible matrix, and $B$ is the matrix produced by the
inversion process 4.3 such that $AB = I$, then $BA = I$ as well, so $B$ is an inverse of $A$.
Moreover, $B = A^{-1}$ is the only inverse of $A$.

Proof. To show that $BA = I$, note that we got $B$ by row operations on $I$, so we
can get $I$ back again from $B$ by applying the inverses of these operations in the
opposite order. Thus if we apply the inversion process to $B$, we get a matrix $C$ such
that $BC = I$. Then $A = AI = A(BC) = (AB)C = IC = C$, so $A = C$ and
$BA = BC = I$.
To show the uniqueness of $A^{-1}$, suppose $A$ is invertible and both $B$ and $C$ satisfy
the conditions for being an inverse of $A$, namely $AB = BA = I$ and $AC = CA = I$.
Then $AB = AC$ because they are both equal to $I$. To show $B = C$, multiply
$AB = AC$ by $B$ on the left to get $BAB = BAC$. Since $BA = I$, this gives
$IB = IC$ and $B = C$. •
4C Special Matrices
Invertibility is very easy to determine for some matrices. One such type is the class
of diagonal matrices, with all entries zero except on the main diagonal:

(
a1
0
0 ) ,(
a2
~ etc.
0

In particular, identity matrices are diagonal matrices in which all the diagonal entries
are 1. The notation $\operatorname{diag}(t_1, t_2, \ldots, t_n)$ is convenient for the $n$-by-$n$ diagonal matrix
with entries $t_1, \ldots, t_n$ on the main diagonal. Diagonal matrices are easy to multiply.
We have for dimensions $n$-by-$n$,

$$\operatorname{diag}(a_1, a_2, \ldots, a_n) \operatorname{diag}(b_1, \ldots, b_n) = \operatorname{diag}(a_1 b_1, \ldots, a_n b_n).$$

It follows that a diagonal matrix is invertible if and only if the main diagonal entries
are all nonzero, and in that case

4.6 $$\operatorname{diag}(a_1, a_2, \ldots, a_n)^{-1} = \operatorname{diag}(1/a_1, 1/a_2, \ldots, 1/a_n).$$

EXAMPLE 4
( ~ ~ )-1 = ( t ½ )
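
The rules for multiplying and inverting diagonal matrices reduce to arithmetic on the diagonal entries alone. The following sketch illustrates this; the helper names `diag` and `diag_inverse_entries` are hypothetical, and exact fractions are used for the reciprocals.

```python
from fractions import Fraction


def diag(*entries):
    """The square matrix with the given entries on the main diagonal, 0 elsewhere."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def diag_inverse_entries(entries):
    """Diagonal entries of the inverse, or None if some diagonal entry is zero."""
    if any(t == 0 for t in entries):
        return None
    return [Fraction(1) / t for t in entries]


# The product of diagonal matrices is the diagonal matrix of entrywise products.
print(mat_mul(diag(2, 3), diag(5, 7)) == diag(10, 21))  # True

# Multiplying by the diagonal matrix of reciprocals gives the identity.
inv = diag(*diag_inverse_entries([2, 4]))
print(mat_mul(diag(2, 4), inv) == diag(1, 1))            # True
```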

The upper triangular matrices are those of the form

$$\begin{pmatrix} a_{11} & a_{12} \\ 0 & a_{22} \end{pmatrix}, \quad \begin{pmatrix} b_{11} & b_{12} & b_{13} \\ 0 & b_{22} & b_{23} \\ 0 & 0 & b_{33} \end{pmatrix}, \quad\text{etc.,}$$

with all entries below the main diagonal equal to zero. Just as with diagonal matrices,
an upper triangular matrix is invertible if and only if the main diagonal entries are
all nonzero. The reason is that if there is no 0 on the diagonal of an upper triangular
matrix, we can transform it into the identity matrix by elementary operations as in
the following example.

EXAMPLE 5  With upper triangular matrices, it is simpler to reduce the columns working from
right to left instead of from left to right.

I 3 )
l 2 ,
0 4

0i D·
0l O·

EXERCISES

In Exercises l to 4, either find the inverse of the given In Exercises 5 and 6, solve the matrix equation Ax =b
matrix or show that it does not have one. by multiplying by A- 1 .

I. ( ~) 5. A = ( ; -! ); b= ( !)
3. ( l :) 4 ( -7
· 12
-5 )
9 6. A = ( ~ ~ ) ; b = ( -~ )
In Exercises 7 to 10, use row operations to find the A square matrix Q is orthogonal if its column vectors,
inverse of the given matrix, or to show that its inverse or equivalently its row vectors, are mutually perpendic-
doesn't exist. ular and all have length 1. In Exercises 26 to 29, show
that the given matrix is orthogonal.
1/../2 -1/../2 )
26. ( 1../2 1/./2
cos0 - sin0 )
1~ ) 27
• ( sin0 cos0
-7

11. Solve the matrix equation (


for X .
! ~) X =( ~ -; ~) '··OH) 1;-/3 - 1/./2
12. Show that if B = A- 1, then B- 1 = A. 1/./6 )
29. 1/../3 1/./2 1/./6
(
13. Show that if A and B are invertible matrices with the 1/../3
0 -2/./6
same dimensions, then AB is invertible and (AB)- 1 = 30. Verify that the inverse of each of the orthogonal matrices
B-1A - 1.
Q in Exercise 26 and Exercise 28 is the transposed matrix
14. Show that if A 1, • •• , An are invertible matrices with Qt, as defined just before Exercise 21.
the same dimensions, then the n-fold matrix product 31. (a) Show that every orthogonal matrix Q is invertible,
A I A2 · · · An is invertible and and that Q- 1 = Qt .
(b) Use part (a) and Theorem 4.4 to show that the
columns of a square matrix are perpendicular and
have length I if and only if the same is true of the
rows.
15. Show that if A 3 = 0 , then J + A + A 2 is the inverse of
I - A. 32. (a) Let A =( ; !) be a 2-by-2 matrix with ad :/-
16. Prove that an upper triangular matrix is invertible if and be. Prove Formula 4.1 by verifying that if A- 1 is
only if it has no zeros on the main diagonal. given by the formula then AA- 1 = A- 1A = I .
Find the inverses for the following special matrices. (This proves that A is invertible if its determinant,

0n
ad - be, is not zero.)

0 ;)
0 -1 (b) Try to find the inverses of the following 2-by-2
17. 2 18. -1 matrices using Formula 4.1.
0 0

19. u D w.o 2
2
0
0
- 1
0
I
0
1
The transpose A of a matrix A == (aij) is defined
0 0 0)
2
0
0
0
3
0
0
0
4
What goes wrong in the third one?
33. A square matrix A is symmetric if At = A and skew
symmetric if At = -A, where A' is the transpose of A,
by A' = (aj;) . In Exercises 21 to 24, prove the stated defined just before Exercise 21.
equations.' (a) Show that if A is a square matrix, then A + At is
symmetric and that A - A 1 is skew symmetric.
21. (A 1 )1 =A 22. (AB)'= B' A' (b) Using part (a), show that every square matrix is the
23. (A + B) = A' + B'
1
24. (A') - 1 = (A-1)1 sum of a symmetric matrix and a skew-symmetric
25. Theorem 4.4 shows in particular that if A and B arc matrix.
square matrices mch that AB = I, then BA = I a1so, (c) Show that if A is invertible and symmetric, then
so A is invertiblc and A- 1 = B . Use this result to show A - I is symmetric. What if A is invertible and skew
that if we know only that BA = I then A- 1 = B by symmetric?
showing first that A' B 1 = I . (This is an alternative to the 34. This exercise shows that we can use matrix products to
proof given for Theorem 4.5.) carry out the elementary row operations on a matrix M

that we used in Section 2 for solving systems and in this the ilh and jth rows, a row rearrangement operation
section for inverting square matrices. on M, yields Tij M as a result.
(a) Let D; (r) be the matrix that equals the identity (d) A matrix D;(r) or (/ + rEij) or Tij is called an
matrix I of some given dimensions except that the elementary matrix. Show that each of the three
I in the ith diagonal entry is replaced by r. Show types of elementary operation is reversible in the
that the matrix product D; (r )M equals the result of sense of Definition 1.1 in Section I by verifying
multiplying the ith row of M by r. that, assuming i ¥- j, D 11 (r) = D;(l/r), (I+
(b) Let Eij be a matrix with 1 for its ijth entry and O's rEij)- 1 = (I - rEij), T;-;1 = Tj;.
elsewhere. For example, we have the 3-by-3 matrices
*35. Let p(x) = ao+a1x+- · +anx~ be a polynomial of degree

Eu = 00
( 0
00 1)
0 , and £21 = (01 00 0)
0 .
at most n with real coefficients. An algebra theorem says
that if there are more than n distinct values of x such
that p(x) = 0, then all its coefficients ak = 0. Use
0 0 0 0 0
this theorem to prove that if bo, .. . , b,. are scalars and
Show that if M is m-by-n while Eij and / are xo, . . . , Xn are distinct scalars, then there is exactly one
both m-by-m, then (/ + rEij)M is the elementary polynomial of degree at most n such that p(xk) = bk for
modification of M that we get by adding r times the k = 0, .. . , n. [Hint: Consider a system of linear equations
jth row to the ith row. with ao, ... , an as variables, rhen apply Theorem 4.4 to
(c) Let Tij be a matrix resulting from the interchange of the associated homogeneous system.]
the ith and jth rows of/. Show that interchanging

SECTION 5 DETERMINANTS
Determinants were originally invented as a device for solving systems of linear
equations, but they turned out to have both geometric and algebraic significance
which make them important in many fields of pure and applied mathematics. Apart
from this section, we use determinants in Chapters 3, 7, 11, and 13 in a variety of
contexts that will arise naturally there. A determinant is a scalar det A defined for
each square n-by-n matrix A. A common notation for the determinant of a matrix is
to replace the parentheses enclosing the matrix by vertical bars. Thus

5A Definition
In our definition of determinant we'll define $\det A$ first for 1-by-1 matrices, and then
for each $n$ define the determinant of an $n$-by-$n$ matrix in terms of determinants of
$(n-1)$-by-$(n-1)$ submatrices called minors. For a matrix $A$, the matrix obtained
by deleting the $i$th row and $j$th column of $A$ is called the $ij$th minor of $A$ and is
denoted by $A_{ij}$. Recall that we use the small letter $a_{ij}$ to denote the $ij$th entry of
a matrix $A$. Thus the $ij$th minor $A_{ij}$ corresponds to the entry $a_{ij}$ in a natural way,
because the minor is obtained by deleting the row and column containing $a_{ij}$.

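EXAMPLE 1  Let

$$A = \begin{pmatrix} -5 & -6 & 7 \\ 8 & -9 & 0 \\ -3 & 4 & 2 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}.$$

Some examples of entries and corresponding minors of $A$ and $B$ are

$$a_{11} = -5, \quad A_{11} = \begin{pmatrix} -9 & 0 \\ 4 & 2 \end{pmatrix}$$
$$a_{23} = 0, \quad A_{23} = \begin{pmatrix} -5 & -6 \\ -3 & 4 \end{pmatrix}$$
$$b_{11} = 1, \quad B_{11} = (4)$$
$$b_{12} = 2, \quad B_{12} = (3).$$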
For a 1-by-1 matrix, $A = (a)$, we define $\det A = a$. For an $n$-by-$n$ matrix,
$A = (a_{ij})$, $i, j = 1, \ldots, n$, we define

5.1 $$\det A = a_{11} \det A_{11} - a_{12} \det A_{12} + \cdots + (-1)^{n+1} a_{1n} \det A_{1n}.$$

The definition is inductive in the sense that the determinant of an $n$-by-$n$ matrix $A$ is
defined in terms of the determinants of the $(n-1)$-by-$(n-1)$ minors $A_{ij}$. Starting
with the simple definition for the 1-by-1 case allows us to go on to 2-by-2, then
3-by-3, and so on. In words, the formula says that $\det A$ is the sum, with alternating
signs, of the elements of the first row of $A$, each multiplied by the determinant of
its corresponding minor. For this reason the numbers

$$\det A_{11}, \quad -\det A_{12}, \quad \ldots, \quad (-1)^{n+1} \det A_{1n}$$

are called the cofactors of the corresponding elements of the first row of $A$. In
general, the cofactor of the entry $a_{ij}$ in $A$ is defined to be $(-1)^{i+j} \det A_{ij}$. Thus in
Example 1 the entry $a_{23} = 0$ in the matrix $A$ has cofactor

$$(-1)^{2+3} \det \begin{pmatrix} -5 & -6 \\ -3 & 4 \end{pmatrix} = 38.$$

The factor $(-1)^{i+j}$ associates plus and minus signs with $\det A_{ij}$ according to the
pattern

$$\begin{matrix} + & - & + & - & \cdots \\ - & + & - & + & \cdots \\ + & - & + & - & \cdots \\ - & + & - & + & \cdots \\ \vdots & \vdots & \vdots & \vdots & \end{matrix}$$

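EXAMPLE 2

(a) $\det \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = (1)(4) - (2)(3) = 4 - 6 = -2$

(b) $$\det \begin{pmatrix} -5 & -6 & 7 \\ 8 & -9 & 0 \\ -3 & 4 & 2 \end{pmatrix} = -5 \det \begin{pmatrix} -9 & 0 \\ 4 & 2 \end{pmatrix} - (-6) \det \begin{pmatrix} 8 & 0 \\ -3 & 2 \end{pmatrix} + 7 \det \begin{pmatrix} 8 & -9 \\ -3 & 4 \end{pmatrix}$$

$$= (-5)(-18 - 0) + (6)(16 - 0) + 7(32 - 27) = 90 + 96 + 35 = 221$$

(c) $\det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc$, in agreement with the definition given earlier in
connection with Formula 4.1 for inverting a 2-by-2 matrix. Stated in words,
$\det \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is the product of the entries on the main diagonal minus the
product of the other two entries. We can often compute 2-by-2 determinants
mentally, and consequently find 3-by-3 determinants in one or two lines.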
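
The inductive definition of the determinant corresponds naturally to a recursive function. The sketch below follows the first-row expansion, with matrices stored as lists of rows and indices counted from 0, so the sign attached to column index `j` is `(-1) ** j`; the names `minor` and `det` are illustrative choices.

```python
def minor(a, i, j):
    """The matrix obtained from a by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det(a):
    """Determinant by expansion along the first row."""
    if len(a) == 1:
        return a[0][0]          # 1-by-1 case: det (a) = a
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j)) for j in range(len(a)))


A = [[-5, -6, 7], [8, -9, 0], [-3, 4, 2]]
print(det([[1, 2], [3, 4]]))    # -2
print(det(A))                   # 221
print(-det(minor(A, 1, 2)))     # 38, the cofactor of the entry in row 2, column 3
```

This direct recursion is fine for small matrices, but the number of arithmetic operations grows very quickly with the dimension.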

Geometric Interpretation. Interpreting either the rows, or else the columns,
of a square matrix $A$ as vectors leads to a geometric interpretation of $\det A$. Here
we'll consider just the 2-by-2 and 3-by-3 cases. Recall from Chapter 1, Section 6
that the cross product of two vectors in the $xy$-plane of $\mathbb{R}^3$ is

$$(a, b, 0) \times (c, d, 0) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a & b & 0 \\ c & d & 0 \end{vmatrix} = \det \begin{pmatrix} a & b \\ c & d \end{pmatrix} \mathbf{k}.$$

The length of this vector is the absolute value of the 2-by-2 determinant. On the
other hand, from Chapter 1 we know that the length of the cross product equals the
area $A(P)$ of the parallelogram $P$ with the vectors $(a, b)$ and $(c, d)$ in the $xy$-plane
for adjacent sides. Thus

$$\det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \pm A(P).$$

Since we can interchange rows and columns in the 2-by-2 matrix without changing
the determinant, $A(P)$ is also equal to the area of the parallelogram with the vectors
$(a, c)$ and $(b, d)$ for adjacent edges. In either case the sign is "+" if the angle from
the first vector to the second is positive, and "-" otherwise.
Instead of the signed area we get in the 2-by-2 case, the determinant of a 3-by-3
matrix is the signed volume $V(P)$ of a solid region $P$ called a parallelepiped,
having congruent parallelograms for opposite faces. To see this, recall from Chapter
1, Section 6 that the signed volume of a parallelepiped $P$ with $u$, $v$, and $w$ for adjacent
edges is equal to a scalar triple product of the three vectors:

$$u \cdot (v \times w) = (u_1 \mathbf{i} + u_2 \mathbf{j} + u_3 \mathbf{k}) \cdot \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} = \det \begin{pmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{pmatrix}.$$

Hence $\det \begin{pmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{pmatrix} = \pm V(P)$, where the sign is "+" if the three column
vectors in the given order form a right-hand system and is "-" otherwise.
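
The relations between determinants, cross products, and the scalar triple product can be checked numerically. In the sketch below the vectors are arbitrary illustrative choices, and the helper names `det2`, `det3`, `cross`, and `dot` are assumptions made for this example.

```python
def det2(a, b, c, d):
    """Determinant of the 2-by-2 matrix with rows (a, b) and (c, d)."""
    return a * d - b * c


def cross(v, w):
    """Cross product of two vectors in 3-space."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def det3(u, v, w):
    """Determinant of the 3-by-3 matrix with rows u, v, w, by first-row expansion."""
    return (u[0] * det2(v[1], v[2], w[1], w[2])
            - u[1] * det2(v[0], v[2], w[0], w[2])
            + u[2] * det2(v[0], v[1], w[0], w[1]))


# Signed area: the parallelogram with sides (2, 0) and (1, 3) has area 6.
print(det2(2, 0, 1, 3))              # 6
print(cross((2, 0, 0), (1, 3, 0)))   # (0, 0, 6)
print(det2(1, 3, 2, 0))              # -6, the sign flips when the sides are swapped

# Signed volume: the scalar triple product agrees with the 3-by-3 determinant.
u, v, w = (1, 0, 0), (1, 2, 0), (1, 1, 3)
print(dot(u, cross(v, w)), det3(u, v, w))  # 6 6
```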
5B Row and Column Expansions
It is an important fact, the proof of which we omit, that if in Equation 5.1 the elements and cofactors of the first row are replaced by the elements and cofactors of any other row, or of any column, then the expansion is still valid. Here is the formal statement.
5.2 Theorem. If A is a square matrix, then

$$\det A = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \det A_{ij} \qquad \text{(expansion by $i$th row)}$$

and

$$\det A = \sum_{i=1}^{n} (-1)^{i+j} a_{ij} \det A_{ij} \qquad \text{(expansion by $j$th column).}$$

Equation 5.1, which we used to define determinants, appears as a special case of the first equation in the theorem when we let i = 1. The alternating pattern of cofactor signs applies to all expansions by row or column, and one way to think of the connection between the two expansion formulas is that the second one, by columns, is a row expansion of $\det A^t$, where $A^t$ is the matrix you get from A by interchanging rows and columns. Thus if $a_{ij}$ is the entry in the ith row and jth column of A, then the number $a_{ji}$ is the entry in the ith row and jth column of the transpose $A^t$.
The expansion of $\det A^t$ by the first row looks just like the expansion of det A by the first column, except that the minors of $A^t$ are the transposes of the minors of A. Hence if $\det A^t = \det A$ holds for all (n-1)-by-(n-1) matrices, it will hold for all n-by-n matrices. It obviously holds for 1-by-1 matrices, and so holds by induction for square matrices of all dimensions.
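EXAMPLE 3. An alternative to the determinant computation in part (b) of Example 2 is to use the elements and cofactors of the second row:

$$\det\begin{pmatrix} -5 & -6 & 7 \\ 8 & -9 & 0 \\ -3 & 4 & 2 \end{pmatrix} = -(8)\det\begin{pmatrix} -6 & 7 \\ 4 & 2 \end{pmatrix} + (-9)\det\begin{pmatrix} -5 & 7 \\ -3 & 2 \end{pmatrix} - (0)\det\begin{pmatrix} -5 & -6 \\ -3 & 4 \end{pmatrix}$$
$$= -8(-12 - 28) - 9(-10 + 21) = 221.$$

Computing this determinant using the elements and cofactors of the third column, which equals the expansion of the transposed matrix by the third row, we get

$$\det\begin{pmatrix} -5 & -6 & 7 \\ 8 & -9 & 0 \\ -3 & 4 & 2 \end{pmatrix} = \det\begin{pmatrix} -5 & 8 & -3 \\ -6 & -9 & 4 \\ 7 & 0 & 2 \end{pmatrix} = 7\det\begin{pmatrix} 8 & -9 \\ -3 & 4 \end{pmatrix} - (0)\det\begin{pmatrix} -5 & -6 \\ -3 & 4 \end{pmatrix} + 2\det\begin{pmatrix} -5 & -6 \\ 8 & -9 \end{pmatrix}$$
$$= 7(32 - 27) + 2(45 + 48) = 221.$$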
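Theorem 5.2 says that every row and every column gives the same expansion. A short sketch of the cofactor expansion, under the assumption that matrices are stored as lists of rows with 0-based indices (the names `minor`, `det_by_row`, and `det_by_col` are illustrative, not from the text):

```python
def minor(m, i, j):
    # delete row i and column j (0-based indices)
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det_by_row(m, i=0):
    # cofactor expansion along row i; the sign (-1)**(i + j) has the
    # same parity whether indices start at 0 or at 1
    n = len(m)
    if n == 1:
        return m[0][0]
    return sum((-1) ** (i + j) * m[i][j] * det_by_row(minor(m, i, j))
               for j in range(n))

def det_by_col(m, j=0):
    # cofactor expansion along column j
    n = len(m)
    if n == 1:
        return m[0][0]
    return sum((-1) ** (i + j) * m[i][j] * det_by_row(minor(m, i, j))
               for i in range(n))

M = [[-5, -6, 7], [8, -9, 0], [-3, 4, 2]]
all_expansions = ([det_by_row(M, i) for i in range(3)]
                  + [det_by_col(M, j) for j in range(3)])
```

For the matrix of this example all six expansions agree with the value 221 computed by hand.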

5C Basic Properties
We can always evaluate a determinant using the definition, but this involves a lot of arithmetic if the dimensions of the matrix are at all large. Some of the theorems we prove will justify other methods of calculation that usually work better than row or column expansions if n is greater than 2 or 3.
5.3 Theorem. If B is obtained from A by multiplying some row or column
by a number r, then det B = r det A. If A has a zero row or column, then
detA = 0.

Proof. If the ith row of A is multiplied by r, the expansion of det B by the ith row is

$$\det B = \sum_{j=1}^{n} (-1)^{i+j}\, r a_{ij} \det A_{ij} = r \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \det A_{ij} = r \det A.$$

A similar argument using a column expansion proves the column version. The inductive expansion by a zero row or column gives zero, so det A = 0 in that case. •

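EXAMPLE 4. Let

$$A = \begin{pmatrix} 1 & 2 & 3 \\ -1 & 2 & 4 \\ 0 & 1 & 2 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 2r & 3 \\ -1 & 2r & 4 \\ 0 & r & 2 \end{pmatrix}.$$

Notice that B is obtained by multiplying the second column of A by r. The theorem says that

$$\det B = \det\begin{pmatrix} 1 & 2r & 3 \\ -1 & 2r & 4 \\ 0 & r & 2 \end{pmatrix} = r \det\begin{pmatrix} 1 & 2 & 3 \\ -1 & 2 & 4 \\ 0 & 1 & 2 \end{pmatrix} = r \det A.$$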

5.4 Theorem. Let A, B, and C be matrices that are identical except in one row
or column; and suppose that in the exceptional row or column the entries in C
are the sums of the corresponding entries in A and B. Then det C = det A +
det B.

Proof. Suppose the special row or column is the jth column. Then expansion using that column gives

$$\det C = \sum_{i=1}^{n} (-1)^{i+j} (a_{ij} + b_{ij}) \det C_{ij} = \sum_{i=1}^{n} (-1)^{i+j} a_{ij} \det C_{ij} + \sum_{i=1}^{n} (-1)^{i+j} b_{ij} \det C_{ij}.$$

But the minor $C_{ij}$ contains only entries that are the same as those in both A and B, so $\det C_{ij} = \det A_{ij} = \det B_{ij}$. Hence

det C = det A + det B.

A similar argument using a row expansion proves the row version. •


EXAMPLE 5. Let

$$A = \begin{pmatrix} 1 & 2 & 3 \\ -1 & 2 & 4 \\ 0 & 1 & 2 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 3 & 3 \\ -1 & 1 & 4 \\ 0 & -2 & 2 \end{pmatrix}, \qquad C = \begin{pmatrix} 1 & 5 & 3 \\ -1 & 3 & 4 \\ 0 & -1 & 2 \end{pmatrix}.$$

The matrices A, B, and C are identical except in the second column, and that column of C is the sum of the corresponding columns of A and B. We have

det A = (1)(4 - 4) - 2(-2 - 0) + 3(-1 - 0) = 0 + 4 - 3 = 1,
det B = (1)(2 + 8) - 3(-2 - 0) + 3(2 - 0) = 10 + 6 + 6 = 22,
det C = (1)(6 + 4) - 5(-2 - 0) + 3(1 - 0) = 10 + 10 + 3 = 23 = det A + det B.

Sign Changes. Another basic property of determinants is that if two rows (or columns) of a matrix A are interchanged then det A changes sign. This property is sometimes expressed by saying that the determinant is alternating as a function of its rows (or of its columns). For 2-by-2 matrices, the property amounts to the observation that

$$\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc = -\det\begin{pmatrix} c & d \\ a & b \end{pmatrix} = -\det\begin{pmatrix} b & a \\ d & c \end{pmatrix}.$$

Here is the general statement.


5.5 Theorem. If B is obtained from A by exchanging two rows or two columns,
then det B = - det A. If A has two rows or columns proportional, then det A = 0.

Proof. We proceed inductively, assuming that the theorem has been proved for (n - 1)-by-(n - 1) matrices, and proving it for n-by-n matrices. We have already observed that the theorem is true for 2-by-2 matrices. Assuming $n \ge 3$, we expand det A using some row different from the two that are to be interchanged, say the kth row. Then

$$\det A = \sum_{j=1}^{n} (-1)^{k+j} a_{kj} \det A_{kj}.$$

But interchanging two rows different from the kth in A interchanges two rows different from the kth in each (n - 1)-by-(n - 1) minor $A_{kj}$. Since all determinants $\det A_{kj}$ then change sign by the induction assumption, and nothing else changes in the expansion, it follows that det A changes sign. A similar argument using a column expansion proves the column version. Finally, if A has two rows or columns proportional, we can factor out a proportionality constant r and get det A = r det B, where B has two rows or columns the same. Interchanging those two equal rows or columns leaves B unchanged but changes the sign of its determinant, so det B = -det B and hence det B = 0. Hence det A = r det B = 0. •

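EXAMPLE 6. The matrix

$$\begin{pmatrix} 2 & 1 & 3 \\ -1 & 0 & 5 \\ -6 & -3 & -9 \end{pmatrix}$$

has its first and third rows proportional, with the third being (-3) times the first. We verify, expanding by the second row, that

$$\det\begin{pmatrix} 2 & 1 & 3 \\ -1 & 0 & 5 \\ -6 & -3 & -9 \end{pmatrix} = -(-1)\det\begin{pmatrix} 1 & 3 \\ -3 & -9 \end{pmatrix} - 5\det\begin{pmatrix} 2 & 1 \\ -6 & -3 \end{pmatrix} = (-9 + 9) - 5(-6 + 6) = 0.$$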
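Theorems 5.3, 5.4, and 5.5 can be spot-checked on the matrices of Examples 4 to 6. This is only a sketch: the entries are those read from the examples, while the multiplier `r = 7` and the helper name `det3` are assumptions made for illustration.

```python
def det3(m):
    # 3-by-3 determinant by expansion along the first row
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

A = [[1, 2, 3], [-1, 2, 4], [0, 1, 2]]
B = [[1, 3, 3], [-1, 1, 4], [0, -2, 2]]
C = [[1, 5, 3], [-1, 3, 4], [0, -1, 2]]   # second column is the sum of those of A and B

r = 7                                      # arbitrary multiplier (assumption)
scaled = [[row[0], r * row[1], row[2]] for row in A]   # second column times r
swapped = [A[2], A[1], A[0]]               # first and third rows interchanged
proportional = [[2, 1, 3], [-1, 0, 5], [-6, -3, -9]]   # third row = -3 * first row

checks = {
    "scaling (5.3)": det3(scaled) == r * det3(A),
    "additivity (5.4)": det3(C) == det3(A) + det3(B),
    "interchange (5.5)": det3(swapped) == -det3(A),
    "proportional rows (5.5)": det3(proportional) == 0,
}
```

Every entry of `checks` should come out true; a single numerical check is of course no substitute for the proofs.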

5D Computing Determinants
The following fact is useful in computing the determinants of large matrices.
5.6 Theorem. Adding a scalar multiple of one row (or column) of A to another
leaves det A unchanged.

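Proof. Let the ith row or column of A be the one affected by adding r times the kth row or column, and denote by C the modified matrix. Then we look at det C as a function of its rows (or columns) $\mathbf{c}_1, \ldots, \mathbf{c}_n$, so that $\mathbf{c}_i = \mathbf{a}_i + r\mathbf{a}_k$ and $\mathbf{c}_j = \mathbf{a}_j$ for $j \ne i$. By Theorem 5.4,

$$\begin{aligned}
\det C &= \det(\mathbf{c}_1, \ldots, \mathbf{c}_i, \ldots, \mathbf{c}_n) \\
&= \det(\mathbf{a}_1, \ldots, \mathbf{a}_i + r\mathbf{a}_k, \ldots, \mathbf{a}_k, \ldots, \mathbf{a}_n) \\
&= \det(\mathbf{a}_1, \ldots, \mathbf{a}_i, \ldots, \mathbf{a}_k, \ldots, \mathbf{a}_n) + \det(\mathbf{a}_1, \ldots, r\mathbf{a}_k, \ldots, \mathbf{a}_k, \ldots, \mathbf{a}_n) \\
&= \det A + \det(\mathbf{a}_1, \ldots, r\mathbf{a}_k, \ldots, \mathbf{a}_k, \ldots, \mathbf{a}_n).
\end{aligned}$$

The last matrix has two rows or columns proportional, so its determinant is zero by Theorem 5.5. Thus det C = det A, as was to be shown. •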
EXAMPLE 7. Let

$$A = \begin{pmatrix} 1 & 3 & -2 \\ 2 & -4 & 1 \\ 3 & 5 & -2 \end{pmatrix}, \qquad C = \begin{pmatrix} 1 & 3 & 0 \\ 2 & -4 & 5 \\ 3 & 5 & 4 \end{pmatrix}.$$

The third column of C is equal to the third column of A plus 2 times the first column. We compute

det C = (1)(-16 - 25) - (3)(8 - 15) + (0)(10 + 12) = -20.

It follows that det A = -20 also.
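EXAMPLE 8. Let

$$A = \begin{pmatrix} 2 & 4 & -1 & 0 \\ 3 & 0 & 2 & 3 \\ -1 & 2 & 3 & 1 \\ 0 & 1 & -2 & -1 \end{pmatrix}.$$

By adding 2 times column 3 to column 1, and 4 times column 3 to column 2, we obtain

$$B = \begin{pmatrix} 0 & 0 & -1 & 0 \\ 7 & 8 & 2 & 3 \\ 5 & 14 & 3 & 1 \\ -4 & -7 & -2 & -1 \end{pmatrix},$$

and by Theorem 5.6, det A = det B. The expansion of det B by the first row has only one nonzero term and we get

$$\begin{aligned}
\det B &= (-1)\det\begin{pmatrix} 7 & 8 & 3 \\ 5 & 14 & 1 \\ -4 & -7 & -1 \end{pmatrix} \\
&= -\det\begin{pmatrix} 7 & 1 & 3 \\ 5 & 9 & 1 \\ -4 & -3 & -1 \end{pmatrix} && \text{[subtracting column 1 from column 2]} \\
&= -\det\begin{pmatrix} 0 & 1 & 0 \\ -58 & 9 & -26 \\ 17 & -3 & 8 \end{pmatrix} && \text{[subtracting 7 times column 2 from column 1 and 3 times column 2 from column 3].}
\end{aligned}$$

Then det B = -[(-1)((-58)(8) - (17)(-26))] = -22.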

Theorems 5.3, 5.5, and 5.6 describe the effect on det A of the elementary row operations of Section 2A and of the interchange of two rows or columns:

1. An elementary multiplication of a row of A by r multiplies det A by r (Theorem 5.3).
2. An elementary modification of A leaves det A unchanged (Theorem 5.6).
3. Interchanging two rows or two columns of A changes the sign of det A (Theorem 5.5).

Thus in putting a matrix A in reduced form R, all we have to do is to keep track of the interchanges and multiplications. Since each multiplication of a row by r multiplies the determinant by r, we get det A = k det R, where k is plus or minus the reciprocal of the product of the elementary multipliers, depending on whether the number of interchanges used was even or odd.
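The bookkeeping just described can be sketched in code. The version below is an illustration under stated assumptions rather than the text's own procedure: it reduces only to upper triangular form with unit pivots (whose determinant is 1), uses exact rational arithmetic, and the function name `det_by_elimination` is hypothetical.

```python
from fractions import Fraction

def det_by_elimination(rows):
    a = [[Fraction(x) for x in row] for row in rows]
    n = len(a)
    k = Fraction(1)          # invariant: det(original) == k * det(current a)
    for col in range(n):
        # find a row at or below the diagonal with a nonzero entry in this column
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)            # no pivot: the matrix is not invertible
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            k = -k                        # an interchange changes the sign
        p = a[col][col]
        a[col] = [x / p for x in a[col]]  # multiply the row by 1/p ...
        k *= p                            # ... so k picks up the reciprocal of 1/p
        for r in range(col + 1, n):
            factor = a[r][col]
            # adding a multiple of one row to another leaves det unchanged
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return k                 # the reduced matrix is unit upper triangular, det = 1

A7 = [[1, 3, -2], [2, -4, 1], [3, 5, -2]]
A8 = [[2, 4, -1, 0], [3, 0, 2, 3], [-1, 2, 3, 1], [0, 1, -2, -1]]
```

Applied to the matrices of Examples 7 and 8 as read from the text, this reproduces the determinants -20 and -22 found there by column operations.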
5E Invertible Matrices
To find out whether a large square matrix is invertible, it is usually simplest to try to reduce it to the identity by row operations: By Theorems 4.3 and 4.4, if the reduction is possible the matrix is invertible, and otherwise it is not. But for 2-by-2, 3-by-3, and some special matrices of larger dimensions it may be easier to use the following criterion. This theorem is critical for computing eigenvalues in Chapter 13.

5.7 Theorem. A square matrix A is invertible if and only if $\det A \ne 0$.

Proof. Suppose that A has been put in reduced form R by elementary row operations, so that det A = k det R for some constant $k \ne 0$. If $\det A \ne 0$, then $\det R \ne 0$, so R cannot have a zero row. A reduced square matrix R with no zero row is equivalent to I, so A is invertible. If det A = 0, then R contains a zero row, and A is not invertible, because the equivalent systems Ax = 0 and Rx = 0 have nonzero solutions. •

EXAMPLE 9. The matrix A in Example 7 is invertible because $\det A = -20 \ne 0$.


Here we generalize to n-by-n matrices the Formula 4.1 for inverting 2-by-2 matrices. Let $\tilde a_{ij} = (-1)^{i+j} \det A_{ij}$, the cofactor of $a_{ij}$ that we use in row or column expansions of det A, where A has entry $a_{ij}$ in its ith row and jth column. Let $\tilde A$ be the square matrix with ijth entry $\tilde a_{ij}$ and form the transpose $\tilde A^t$, which is the matrix we get by interchanging rows and columns in the matrix $\tilde A$. Thus if the ijth entry in $\tilde A$ is $\tilde a_{ij}$, the ijth entry in $\tilde A^t$ is $\tilde a_{ji}$.

5.8 Theorem. If A is invertible, then $A^{-1} = \dfrac{1}{\det A}\,\tilde A^t$.

Proof. The key to the proof is the equation

$$\sum_{j=1}^{n} (-1)^{i+j} a_{kj} \det A_{ij} = \begin{cases} \det A & \text{if } k = i, \\ 0 & \text{if } k \ne i. \end{cases} \tag{1}$$

If k = i, the left side of Equation 1 is just the expansion of det A by the elements of the ith row, so we get det A. If $k \ne i$, we can still regard the sum as an expansion by the ith row of the determinant of a matrix in which the ith row is the same as the kth row, thus giving determinant 0 by the last part of Theorem 5.5.
To finish the proof we look at the kith entry in the matrix product $A\tilde A^t$:

$$\sum_{j=1}^{n} a_{kj}\,\tilde a_{ij} = \sum_{j=1}^{n} (-1)^{i+j} a_{kj} \det A_{ij} = \begin{cases} \det A & \text{if } k = i, \\ 0 & \text{if } k \ne i, \end{cases}$$

the last two equalities following from the definition of $\tilde a_{ij}$ and Equation (1). Hence $A\tilde A^t = (\det A)I$, so dividing by det A gives $A\bigl((\det A)^{-1}\tilde A^t\bigr) = I$. Now apply $A^{-1}$ to both sides to get $(\det A)^{-1}\tilde A^t = A^{-1}$. •

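EXAMPLE 10. To invert the matrix

$$A = \begin{pmatrix} 2 & 3 & 4 \\ 5 & 6 & 7 \\ 8 & 9 & 0 \end{pmatrix}$$

first compute the matrix

$$\begin{pmatrix} -63 & -56 & -3 \\ -36 & -32 & -6 \\ -3 & -6 & -3 \end{pmatrix}$$

with entries $\det A_{ij}$; thus $\det A_{11} = \begin{vmatrix} 6 & 7 \\ 9 & 0 \end{vmatrix} = -63$, and $\det A_{12} = \begin{vmatrix} 5 & 7 \\ 8 & 0 \end{vmatrix} = -56$. To get the matrix of cofactors insert the factors $(-1)^{i+j}$, changing the sign of every second entry and giving

$$\begin{pmatrix} -63 & 56 & -3 \\ 36 & -32 & 6 \\ -3 & 6 & -3 \end{pmatrix}.$$

Finally, transpose this matrix by reflecting across its main diagonal, then divide by det A = 30, found for example by expanding det A by the last row. The result is

$$A^{-1} = \frac{1}{30}\begin{pmatrix} -63 & 36 & -3 \\ 56 & -32 & 6 \\ -3 & 6 & -3 \end{pmatrix} = \begin{pmatrix} -\frac{21}{10} & \frac{6}{5} & -\frac{1}{10} \\ \frac{28}{15} & -\frac{16}{15} & \frac{1}{5} \\ -\frac{1}{10} & \frac{1}{5} & -\frac{1}{10} \end{pmatrix}.$$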
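The recipe of Theorem 5.8 (minors, then cofactor signs, then transpose, then division by det A) can be sketched as follows. The sketch assumes integer entries and exact arithmetic; the names `minor`, `det`, and `inverse_by_cofactors` are illustrative and not from the text.

```python
from fractions import Fraction

def minor(m, i, j):
    # delete row i and column j (0-based indices)
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    # expansion along the first row; the empty matrix has determinant 1
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def inverse_by_cofactors(m):
    n = len(m)
    d = det(m)
    if d == 0:
        raise ValueError("matrix is not invertible (det = 0)")
    # matrix of cofactors: (-1)^(i+j) * det of the ij-th minor
    cof = [[(-1) ** (i + j) * det(minor(m, i, j)) for j in range(n)]
           for i in range(n)]
    # transpose the cofactor matrix and divide every entry by det
    return [[Fraction(cof[j][i], d) for j in range(n)] for i in range(n)]

A = [[2, 3, 4], [5, 6, 7], [8, 9, 0]]
A_inv = inverse_by_cofactors(A)
```

For the matrix of this example the result agrees with the inverse computed by hand. For large matrices row reduction is usually far cheaper, as the text notes at the start of this subsection.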

If a square matrix A is invertible, the system of linear equations that the vector equation Ax = b represents has the unique solution $\mathbf{x} = A^{-1}\mathbf{b}$. We can combine this solution formula with the formula for $A^{-1}$ in Theorem 5.8 to get $\mathbf{x} = (\det A)^{-1}\tilde A^t\mathbf{b}$. This formula leads to the following rule.
5.9 Cramer's Rule. If $\det A \ne 0$ and $\mathbf{x} = (x_1, \ldots, x_n)$, then the jth coordinate of the solution of Ax = b is

$$x_j = \frac{\det B^{(j)}}{\det A},$$

where $B^{(j)}$ is A with its jth column replaced by the entries $b_1, \ldots, b_n$ in b.

Proof. As we noted just before the statement of Cramer's rule, $\mathbf{x} = (\det A)^{-1}\tilde A^t\mathbf{b}$. The jith entry in the matrix $\tilde A^t$ is $\tilde a_{ij} = (-1)^{i+j}\det A_{ij}$. Thus

$$x_j = (\det A)^{-1}\sum_{i=1}^{n} \tilde a_{ij}\, b_i = (\det A)^{-1}\sum_{i=1}^{n} (-1)^{i+j} \det A_{ij}\, b_i, \qquad j = 1, \ldots, n.$$

But the sum on the right is just the expansion of $\det B^{(j)}$ by the jth column. •

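EXAMPLE 11. We'll solve this system using Cramer's rule:

$$\begin{aligned} x_1 - 2x_2 + 4x_3 &= 1 \\ -x_1 + x_2 - x_3 &= 2 \\ 2x_1 + 3x_2 - x_3 &= 3. \end{aligned}$$

The relevant matrices are

$$A = \begin{pmatrix} 1 & -2 & 4 \\ -1 & 1 & -1 \\ 2 & 3 & -1 \end{pmatrix}, \quad B^{(1)} = \begin{pmatrix} 1 & -2 & 4 \\ 2 & 1 & -1 \\ 3 & 3 & -1 \end{pmatrix}, \quad B^{(2)} = \begin{pmatrix} 1 & 1 & 4 \\ -1 & 2 & -1 \\ 2 & 3 & -1 \end{pmatrix}, \quad B^{(3)} = \begin{pmatrix} 1 & -2 & 1 \\ -1 & 1 & 2 \\ 2 & 3 & 3 \end{pmatrix}.$$

Expanding the determinants by their first rows gives

$$\begin{aligned}
\det A &= (1)(2) - (-2)(3) + (4)(-5) = 2 + 6 - 20 = -12, \\
\det B^{(1)} &= (1)(2) - (-2)(1) + (4)(3) = 2 + 2 + 12 = 16, \\
\det B^{(2)} &= (1)(1) - (1)(3) + (4)(-7) = 1 - 3 - 28 = -30, \\
\det B^{(3)} &= (1)(-3) - (-2)(-7) + (1)(-5) = -3 - 14 - 5 = -22.
\end{aligned}$$

Then $x_1 = \dfrac{16}{-12} = -\dfrac{4}{3}$, $x_2 = \dfrac{-30}{-12} = \dfrac{5}{2}$, $x_3 = \dfrac{-22}{-12} = \dfrac{11}{6}$.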
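Cramer's rule translates directly into a few lines of code. The sketch below assumes integer coefficients and uses exact fractions; the names `det` and `cramer` are illustrative. It is meant to mirror the rule, not to be an efficient solver, since it computes n + 1 determinants by cofactor expansion.

```python
from fractions import Fraction

def det(m):
    # cofactor expansion along the first row; the empty matrix has determinant 1
    if not m:
        return 1
    return sum((-1) ** j * m[0][j]
               * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def cramer(a, b):
    d = det(a)
    if d == 0:
        raise ValueError("Cramer's rule needs det A != 0")
    solution = []
    for j in range(len(a)):
        # B^(j): A with its j-th column replaced by the entries of b
        bj = [row[:j] + [b[i]] + row[j + 1:] for i, row in enumerate(a)]
        solution.append(Fraction(det(bj), d))
    return solution

A = [[1, -2, 4], [-1, 1, -1], [2, 3, -1]]
b = [1, 2, 3]
x = cramer(A, b)
```

Substituting the computed `x` back into the three equations of this example is a quick way to confirm the hand calculation.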
EXERCISES

6· ( i i ~ 2i)
In Exercises 1 and 2, evaluate det A and also det(2A).

I -2 3 )
1. A= 3 1 4 1 4 16 64
( 5 6 7

1 0 1 0)
0 3 1 4
The product rule for determinants states that if A and
B are square matrices with the same dimensions, then
2· .4 = (- 1 1 1
-1
4 0
2 3
det(A B) = det(A) det(B). In Exercises 7 and 8, find
A B , BA, and the determinants of A, B, AB, and BA,
and verify that the product rule holds for these examples.
3. What is the relation between det A and det(2A)? (See
Exercises 1 and 2.)
B=(O2 -3l)
4. What is the relation between detA and det(-A)?

III Exercises 5 and 6, use the method of Example 8 of the -1 0 1 )


text to evaluate the determinants of the given matrices. B = 2 -I -3
( 0 3 5

s. ( - ~
2 -1
! _i -!l )
0
9. Apply the product rule of the preceding exercises to show
that if A is invertible, then $\det A \ne 0$ and $\det(A^{-1}) = (\det A)^{-1}$.

('''°"
10. Show that if D is the diagonal matrix diag(d1, .. . , dn), -e 1 sint
)
0
in the notation used in Theorem 4.6 in the previous 22. 1
e sin t e1 cost 0
section, then det D is the product of the diagonal elements, 0 0 e3r
d1d2 · · · dn. In particular, det I = 1.
11. Compute the determinant of the matrix

(i -r i -~).
0 0 0 4
23.
(f
4
0
2 n_;)
4

12. Show that for an upper triangular matrix like the one in
Exercise 11, in which every element below the diagonal is
24.
(2:' -t
2

0, the determinant is equal to the product of the diagonal 25. Show that the cross product of vectors u = (u 1, u 2 , u 3 )
elements. and v = (v1, v2, v3) has the form of a 3-by-3 determinant:

In Exercises 13 to 20, use Theorem 5.7 to determine


which of the following matrices have inverses. For those
that are invertible use Theorem 5.8 to find the inverse. u x v= det ( u\v,
13. (
-2
! ~ ~) 0 l
14. ( - ~ i ~)
O 3 3 26. Show that the scalar triple product of the ordered triple

uH)
of vectors u =
(u1, u2, u3), v = (v1, v2, v3), w =
(w1, w2, w3) is expressible as a 3-by-3 detenninant:
IS. ( : j j) 16.
u1
= det
HD O=! :)
u • (v x w) v1
(
Wt
17. ( 18.

n
27. Let A be an m-by-m matrix and B an n-by-n matrix.
2
2
-1
0
02 0I O)
0 Consider the (m + n)-by-(m + n) matrix ( ~ ~ ).
0 I 0 3 0 which has A in the upper left corner, B in the lower
0 0 0 0 4 right comer, and zeros elsewhere; show that its deter-
minant is equal to (det A)(det B). [Hint: Consider the
In Exercises 21 to 24, use Theorem 5.7 to determine for
cases A = I and B = I. Then use the product rule of
which values of the real variable t the given matrix fails
Exercise 7.]
to have an inverse.

21.
1-t
0
2 0)
2- t 5
28. Explain how Formula 4.1 for inverting 2-by-2 mat1ices
is a special case of Theorem 5.8 for inverting n-by-n
( 0 0 3-t matrices.

Chapter 2 REVIEW

In Exercises l to 8 let
E =
( -1
2
0
-1 )
-4 , F =
( 0
l
-3)2 ,
4 1 3 -2 0
B=( -2 -1

c-(1 -3 -1 ) 2 0 -2) and evaluate each of the following expressions, or else explain
- 2 0 , D=
( I O
0 0
5
3
, why it's not defined.
1. A+ B 2. AB 3. B+2C

24. Do the same as in Exercise 23 after changing the third


4. A +BE 5. A+ EB 6. CD +EA
equation to x + y = I.
7. cn- 1 8. BF+ FB
In Exercises 25 to 28, express the vector v as a linear
In Exercises 9 to 12, let A, B, C, and X be square combination of the other vectors given, or show that it
matrices of the same shape. Solve the given equation is impossible to do so.
for X, stating what conditions you have to assume to
produce a unique solution. [For example, AX+ BX = C 25. V = 2i + 3j; 3 = 2i - j, b = 2i + j
has the solution (A+ B)- 1c, provided that A+ B is 26. V = 2i + 3j + 4k; a = 2i - j, b = i + j + k, C = j - 2k
invertible.]
27. V = (3, -}, 0, -l); a = (2, -}, 3, 2), b = (-}, 1, }, -3),
9. AX+ 2B = ex 10. XA + 2X = B
c=(l ,1,9,-5)
11. AX= 3A 12. XA +XB =C 28. V = (3,0,0,-2); a = (0,-1,1,2),b = (-1,1,1,0),
Trne or false? In Exercises 13 to 20, if the statement c=(l,1,0,-2)
is always true or always false, give a reason. If it is
sometimes true and sometimes false, give an example In Exercises 29 to 35, find the inverse of the given matrix
of each possibility. Assume that A and B are square or show that the inverse doesn't exist.
matrices of the same dimensions in all cases.
13. If AB= 0, then BA= 0.
29. ( -~ i) 30. ( ~ 1
- ~ )

14. If AB= I, then BA= I.


15. If AB= A, then BA= A.
16. If AB= BA, then B = A- 1•
,1. ( J =LO 32. ( _! -H )
-1 0 2 0 0
17. If A= 0, then A 2 = 0.
18. If A 2
19. If A 2
20. If A 2
= 0, then A = 0.
= /, then A = I.
= 0, then(/+ A) - 1 =/- A.
33.
(
-2
0
3 -2
1
1
1 2
0
0
0
0
0
1
0 D
21. Let A =( b ~ ~ ). Describe the set of solutions
0 0 0
for the system of equations Ax = b when (a) b = 0,
(b) b = (1, 2, 0), (c) b = (0, 1, 2). In general, for what 36. Evaluate p(A) when p(I) = t 2 - 2 and A = diag(a, b, c).
vectors b does the system have a solution? What is the general rule for evaluating q(A) for a poly-

b~ ~)
nomial q (t) and diagonal matrix A?
22. Convert the matrix ( to reduced form by 37. (a) Show that if E and F are diagonal matrices of the
1 1 2
same dimensions, then E F = FE.
elementary row operations. (The operations allowed at
(b) Let D = diag(a, b, c) and write down DA and AD,
each stage may be different for a few special values of s.
where A is the matrix in part (c) of Exercise 8.
Consider different cases if necessary.) For which values
Describe in words the effect of multiplying a matrix
of s is the matrix invertible?
by a diagonal matrix, when the diagonal matrix is
23. Write the system of equations on the left, and when it is on the right.
(c) Assuming a, b, and c are all different, and D =
2x+ y-z=O diag(a, b, c), what can you say about matrices B
3y - z = h such that DB= BD? What if a = b = c? What if
X - y = I a= h I- c?
38. Let p(x) be the polynomial a+bx+cx 2 • By setting up and
in matrix form, and apply row operations to get an solving a system of linear equations, find values for the
equivalent system with a reduced coefficient matrix. Give coefficients a, b, c so that p(-1) = 1, p(0) = 0, p(l) =
one solution for each value of h for which a solution I. Given numbers r, s, t is it always possible to find values
exists. for a, b, c to make p(-1) = r, p(0) = s, p(l) = t?

39. Let f(x) = a sinx + h cosx + c. Find a, b, and c so that 41. (1, 2, 1, 2), (1, 2, 3, 4), and (1, 0, 0, 1)
/(0) = 1, J'(O) = 2, and f"(0) = 3.
42. (1, 2, 1, 2), (1, 2, 3, 4), and (0, 0, 1, 1)
40. Let K be the plane through (1, 2, 3), (-1, 5, 2), and
(2, -6, 10), and let L be the plane with equation 2x - Evaluate the detenninants of the matrices given in Exer-
3y +z = 2. cises 43 to 46.

u n 0 -n
(a) Find an equation for K by solving a system of linear
2 2
equations, without using the cross product.
43. 1 44. 1
(b) Find the intersection of K and L with the plane
0 0
that has equation x + 2y + az = 0. (The result will

n (J
depend on a.) Are there values of a for which the
2 0 0 2
three planes do not intersect?
In Exercises 41 and 42, find all vectors in JR. 4 that are
perpendicular to the given vectors.
45.
(-i 1
-3
0
2
0
-1
46.
-3
-2
0
0
1
-2 D
CHAPTER 3

VECTOR SPACES
AND LINEARITY
(OPTIONAL CHAPTER)

This chapter extends and generalizes the linear algebra developed in Chapters 1 and 2, but the additional material isn't used in later chapters.

SECTION 1 LINEAR FUNCTIONS ON $\mathbb{R}^n$


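In Chapter 2 we used matrix multiplication as a way to describe systems of linear equations, so for instance, the system of equations

$$\begin{aligned} 2x + 3y &= 5 \\ x - y &= 3 \end{aligned}$$

is equivalent to the matrix equation Ax = b with

$$A = \begin{pmatrix} 2 & 3 \\ 1 & -1 \end{pmatrix}, \qquad \mathbf{x} = \begin{pmatrix} x \\ y \end{pmatrix}, \qquad \mathbf{b} = \begin{pmatrix} 5 \\ 3 \end{pmatrix}.$$

Another way of looking at the matrix product Ax in this example is as a function y = Ax assigning a value y = (z, w) in $\mathbb{R}^2$ to a given x = (x, y) in $\mathbb{R}^2$. For this particular matrix A, the function would be given by

$$\begin{aligned} z &= 2x + 3y \\ w &= x - y \end{aligned} \qquad \text{or} \qquad \begin{pmatrix} z \\ w \end{pmatrix} = \begin{pmatrix} 2 & 3 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}.$$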
1A Matrix Representation
There's a close connection between systems of linear equations, either in scalar or matrix form as illustrated above, and the natural generalization of the function y = ax from $\mathbb{R}$ to $\mathbb{R}$ to functions from $\mathbb{R}^n$ to $\mathbb{R}^m$. If A is an m-by-n matrix, setting f(x) = Ax defines a function y = f(x) from $\mathbb{R}^n$ to $\mathbb{R}^m$. We proved in Theorem 2.3 of Chapter 2, Section 2C that a matrix-vector product Ax has the property we called linearity, meaning that for a given linear combination sx + ty we have A(sx + ty) = sAx + tAy. In terms of the function f(x) = Ax, this says

1.1 $\quad f(s\mathbf{x} + t\mathbf{y}) = s f(\mathbf{x}) + t f(\mathbf{y}),$

and a function f from $\mathbb{R}^n$ to $\mathbb{R}^m$ is called linear if it satisfies Equation 1.1. Repeated application of Equation 1.1 shows that a linear function f(x) always satisfies the more general condition

1.2 $\quad f(t_1\mathbf{x}_1 + t_2\mathbf{x}_2 + \cdots + t_k\mathbf{x}_k) = t_1 f(\mathbf{x}_1) + t_2 f(\mathbf{x}_2) + \cdots + t_k f(\mathbf{x}_k).$
The rest of this section, and indeed this chapter, is about understanding linearity.

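EXAMPLE 1. For an m-by-n matrix $A = (a_{ij})$ the following systems define the same function f(x) = Ax from $\mathbb{R}^n$ to $\mathbb{R}^m$:

$$\begin{aligned} y_1 &= a_{11}x_1 + \cdots + a_{1n}x_n \\ &\;\;\vdots \\ y_m &= a_{m1}x_1 + \cdots + a_{mn}x_n \end{aligned} \qquad \text{and} \qquad \begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix} = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$

The next theorem shows that as a consequence of Equation 1.1, all linear functions y = f(x) from $\mathbb{R}^n$ to $\mathbb{R}^m$ are expressible in the two forms of the previous example, in other words in the form f(x) = Ax.
1.3 Theorem. If f is a linear function from $\mathbb{R}^n$ to $\mathbb{R}^m$, and A is the m-by-n matrix whose columns are the vectors $f(\mathbf{e}_1), \ldots, f(\mathbf{e}_n)$, then f(x) = Ax for every x in $\mathbb{R}^n$.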

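Proof. The product of a matrix and the standard basis vector $\mathbf{e}_j$ always gives the jth column of the matrix, so the definition of A implies $A\mathbf{e}_j = f(\mathbf{e}_j)$ for j = 1, ..., n. Now consider an arbitrary vector $\mathbf{x} = (x_1, \ldots, x_n) = x_1\mathbf{e}_1 + \cdots + x_n\mathbf{e}_n$ in $\mathbb{R}^n$. To check that f(x) = Ax, we use the linearity of f and the linearity of matrix multiplication (Theorem 2.3 in Chapter 2). Since we can write $\mathbf{x} = x_1\mathbf{e}_1 + \cdots + x_n\mathbf{e}_n$,

$$\begin{aligned}
f(\mathbf{x}) &= f(x_1\mathbf{e}_1 + \cdots + x_n\mathbf{e}_n) \\
&= x_1 f(\mathbf{e}_1) + \cdots + x_n f(\mathbf{e}_n) && \text{[by linearity of $f$]} \\
&= x_1 A\mathbf{e}_1 + \cdots + x_n A\mathbf{e}_n && \text{[by definition of $A$]} \\
&= A(x_1\mathbf{e}_1 + \cdots + x_n\mathbf{e}_n) = A\mathbf{x}, && \text{[by linearity of matrix multiplication]}
\end{aligned}$$

so f(x) = Ax. •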

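EXAMPLE 2. For an example of how the theorem works in a particular case, suppose f is a linear function from $\mathbb{R}^3$ to $\mathbb{R}^2$ with

$$f(\mathbf{e}_1) = \begin{pmatrix} 2 \\ 1 \end{pmatrix}, \qquad f(\mathbf{e}_2) = \begin{pmatrix} 3 \\ -1 \end{pmatrix}, \qquad f(\mathbf{e}_3) = \begin{pmatrix} -4 \\ 2 \end{pmatrix}.$$

We form the matrix $A = \begin{pmatrix} 2 & 3 & -4 \\ 1 & -1 & 2 \end{pmatrix}$ from these column vectors. Then

$$A\mathbf{e}_1 = \begin{pmatrix} 2 & 3 & -4 \\ 1 & -1 & 2 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}, \qquad A\mathbf{e}_2 = \begin{pmatrix} 2 & 3 & -4 \\ 1 & -1 & 2 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 3 \\ -1 \end{pmatrix},$$

and similarly $A\mathbf{e}_3 = f(\mathbf{e}_3)$. For an arbitrary $\mathbf{x} = (x_1, x_2, x_3)$, linearity gives

$$f(\mathbf{x}) = x_1 f(\mathbf{e}_1) + x_2 f(\mathbf{e}_2) + x_3 f(\mathbf{e}_3) = \begin{pmatrix} 2x_1 \\ x_1 \end{pmatrix} + \begin{pmatrix} 3x_2 \\ -x_2 \end{pmatrix} + \begin{pmatrix} -4x_3 \\ 2x_3 \end{pmatrix} = \begin{pmatrix} 2x_1 + 3x_2 - 4x_3 \\ x_1 - x_2 + 2x_3 \end{pmatrix} = \begin{pmatrix} 2 & 3 & -4 \\ 1 & -1 & 2 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}.$$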
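Theorem 1.3 is also a recipe: evaluate f on the standard basis vectors and use the results as columns. A minimal sketch, assuming vectors are Python lists and taking for `f` the coordinate formula of this example (the helper names `matrix_of` and `apply` are illustrative):

```python
def matrix_of(f, n):
    # build the matrix whose j-th column is f(e_j), stored as a list of rows
    columns = []
    for j in range(n):
        e = [0] * n
        e[j] = 1
        columns.append(f(e))
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def apply(a, x):
    # matrix-vector product Ax
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def f(x):
    # the linear function of this example, written in coordinates
    x1, x2, x3 = x
    return [2 * x1 + 3 * x2 - 4 * x3, x1 - x2 + 2 * x3]

A = matrix_of(f, 3)
sample = [5, -2, 7]          # arbitrary test vector (assumption)
```

The recovered `A` has the three given vectors as columns, and `apply(A, sample)` agrees with `f(sample)`. The recipe is valid only when f really is linear; for a nonlinear f the matrix built this way would not reproduce f.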

Note. In the context of real-valued functions of one variable the term "linear function" often means a function of the form f(x) = ax + b, because such functions
have straight line graphs in the xy-plane. Such functions are linear in the present
context of functions with vector variables only if b = 0. In this book the term "linear
function" will always mean a function satisfying Equation 1.1.
Before looking at more examples of linear functions, we introduce some termi-
nology that's useful for talking about functions in general, whether or not they are
linear. We revisit these terms at the beginning of Chapter I applied to nonlinear
functions.
A function f is defined on a set D called the domain of f and takes its values within some set R called the range of f. Thus for every x in D, f(x) is some y in R. We say that f is a function from D to R, and use the notation

$$f: D \to R$$

to indicate that f is a function with domain D and range R.

If E is a subset of the domain of f we write f(E) for the subset of the range of f consisting of the values f(x) taken on by f as x varies over the set E, and call it the image of E under f. The image f(D) of the entire domain of f is called simply the image of f.
Variant terminology. Some people use the term "range of f" to mean what we call the "image of f", but for clarity we distinguish between the two terms as we just indicated. For us, the range of a function is the space, such as $\mathbb{R}^n$, within which its values are defined to lie, while the image consists of the values the function actually takes on. The image is contained in the range but need not be all of it.
The following examples illustrate Theorem 1.3 on matrix representation for some typical linear functions $f: \mathbb{R}^n \to \mathbb{R}^m$, including some that we've already seen in Chapter 1.

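EXAMPLE 3. Defining f(x) to be the scalar multiple x(1, 2, 1) gives a function $f: \mathbb{R}^1 \to \mathbb{R}^3$ that is linear because the coordinates of the image of f in $\mathbb{R}^3$ are $y_1 = x$, $y_2 = 2x$, $y_3 = x$, and these have the simplest possible form of the first display in Example 1. The image of f consists of all scalar multiples of (1, 2, 1) and is a line through the origin in $\mathbb{R}^3$ in the direction of the vector (1, 2, 1). If E is the interval [0, 1], then f(E), the image of E under f, is the line segment joining (0, 0, 0) and (1, 2, 1).

EXAMPLE 4. Let $f: \mathbb{R}^2 \to \mathbb{R}^3$ be the linear function such that $f\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 4 \end{pmatrix}$ and $f\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}$. The matrix of f is $\begin{pmatrix} 1 & -1 \\ 2 & 0 \\ 4 & 1 \end{pmatrix}$, and for arbitrary (s, t) in $\mathbb{R}^2$, we have

$$f(s, t) = \begin{pmatrix} 1 & -1 \\ 2 & 0 \\ 4 & 1 \end{pmatrix}\begin{pmatrix} s \\ t \end{pmatrix} = \begin{pmatrix} s - t \\ 2s \\ 4s + t \end{pmatrix} = s\begin{pmatrix} 1 \\ 2 \\ 4 \end{pmatrix} + t\begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}.$$

The image F of f is the plane through the origin in $\mathbb{R}^3$ containing the vectors (1, 2, 4) and (-1, 0, 1).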

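EXAMPLE 5. For a linear function with a different kind of geometric interpretation, consider the function from $\mathbb{R}$ to $\mathbb{R}$ that sends x to 2x and has the geometric effect of stretching the real number line by the factor 2. In two dimensions we can define a linear function $f: \mathbb{R}^2 \to \mathbb{R}^2$ that stretches horizontal distances by a factor of 3 and vertical distances by a factor of 2. To do this we need $f(\mathbf{e}_1) = 3\mathbf{e}_1$ and $f(\mathbf{e}_2) = 2\mathbf{e}_2$, so f is represented by a diagonal matrix as

$$f\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 3 & 0 \\ 0 & 2 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 3x_1 \\ 2x_2 \end{pmatrix}.$$

Figure 3.1 shows the geometric effect of f, where C is the unit circle $x_1^2 + x_2^2 = 1$, and f(C), the image of C under f, is an ellipse. If $\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = \begin{pmatrix} 3x_1 \\ 2x_2 \end{pmatrix}$ is the image of $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ under f, then $x_1 = \tfrac{1}{3}u_1$ and $x_2 = \tfrac{1}{2}u_2$. Hence if $\begin{pmatrix} u_1 \\ u_2 \end{pmatrix}$ is in f(C), then $\dfrac{u_1^2}{9} + \dfrac{u_2^2}{4} = 1$; this is the equation of the ellipse with semimajor axis 3 and semiminor axis 2 shown in Figure 3.1.

FIGURE 3.1 Unequal expansions.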

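EXAMPLE 6. The projections of one vector on another defined in Chapter 1, Section 4C are geometric examples of linear functions $f: \mathbb{R}^n \to \mathbb{R}^n$. Let $\mathbf{n}$ be a unit vector in $\mathbb{R}^n$. Define the function $P_{\mathbf{n}}: \mathbb{R}^n \to \mathbb{R}^n$ by $P_{\mathbf{n}}(\mathbf{x}) = (\mathbf{x} \cdot \mathbf{n})\mathbf{n}$. Then $P_{\mathbf{n}}$ is linear because using properties of the dot product shows it satisfies Equation 1.1:

$$P_{\mathbf{n}}(s\mathbf{x} + t\mathbf{y}) = ((s\mathbf{x} + t\mathbf{y}) \cdot \mathbf{n})\mathbf{n} = (s\mathbf{x} \cdot \mathbf{n} + t\mathbf{y} \cdot \mathbf{n})\mathbf{n} = s(\mathbf{x} \cdot \mathbf{n})\mathbf{n} + t(\mathbf{y} \cdot \mathbf{n})\mathbf{n} = sP_{\mathbf{n}}(\mathbf{x}) + tP_{\mathbf{n}}(\mathbf{y}).$$

As we saw in Theorem 4.10 of Chapter 1, $P_{\mathbf{n}}(\mathbf{x})$ is the projection of x on the line through the origin in the direction of $\mathbf{n}$, and the image of $P_{\mathbf{n}}$ is the same line.
If $\mathbf{e}_i$ is one of the standard basis vectors in $\mathbb{R}^n$ then $\mathbf{e}_i \cdot \mathbf{e}_i = 1$ and $\mathbf{e}_i \cdot \mathbf{e}_j = 0$ if $i \ne j$. Thus if $\mathbf{n} = \mathbf{e}_i$, then $P_{\mathbf{n}}(\mathbf{e}_i) = \mathbf{e}_i$ and $P_{\mathbf{n}}(\mathbf{e}_j) = 0$ if $i \ne j$. Consequently, the matrix of $P_{\mathbf{e}_i}$ is 0 except for having $\mathbf{e}_i$ in the ith column. For the matrix of $P_{\mathbf{n}}$ for an arbitrary unit vector $\mathbf{n}$ see Exercises 19 and 20.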

Rotations are another class of linear functions from a space to itself. We view a rotation of the plane around the origin as a function $f: \mathbb{R}^2 \to \mathbb{R}^2$ with f(x) defined as the result of rotating x through an angle $\theta$, where both f(x) and x are pictured as arrows with tails at the origin. To see that such a rotation is a linear function, recall the geometric interpretation of vector addition and scalar multiplication in Section 2 of Chapter 1. Figure 3.2(a) shows vectors u, v, and w = u + v and their images under f. By the parallelogram law for addition, the arrow representing w is the diagonal of the parallelogram with sides u and v. The rotation carries the entire parallelogram to a congruent one with sides f(u) and f(v) and diagonal f(w), so f(w) = f(u) + f(v).
Similarly, Figure 3.2(b), in which q is a scalar multiple sp of p, shows that f(q) = sf(p) because both q and p are rotated through the same angle, and their lengths are not changed.

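EXAMPLE 7. For a simple example of a rotation, consider turning the plane 90° counterclockwise around the origin. This takes $\mathbf{e}_1$ to $\mathbf{e}_2$ and $\mathbf{e}_2$ to $-\mathbf{e}_1$ as shown in Figure 3.3(a). The rotation is then a linear function $f: \mathbb{R}^2 \to \mathbb{R}^2$ with $f(\mathbf{e}_1) = \mathbf{e}_2$ and $f(\mathbf{e}_2) = -\mathbf{e}_1$, so its matrix has $\mathbf{e}_2$ and $-\mathbf{e}_1$ as columns. For a vector $\mathbf{u} = \begin{pmatrix} x \\ y \end{pmatrix}$,

$$f(\mathbf{u}) = f\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} -y \\ x \end{pmatrix}.$$

The image of a vector u = (x, y) under a 90° rotation should have the same length as u and be at right angles to it. We can check this algebraically by computing some dot products. We have $|f(\mathbf{u})|^2 = f(\mathbf{u}) \cdot f(\mathbf{u}) = (-y)^2 + x^2 = x^2 + y^2 = |\mathbf{u}|^2$, so f(u) and u have the same length, and $f(\mathbf{u}) \cdot \mathbf{u} = -yx + xy = 0$ so f(u) and u are perpendicular to each other.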

EXAMPLE 8. Consider rotating the plane counterclockwise around the origin through an arbitrary angle $\theta$. As shown in Figure 3.3(b) this rotation carries $\mathbf{e}_1$ and $\mathbf{e}_2$ to the vectors that form the columns of the matrix

$$R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$

Computing dot products as in the previous example shows that $R_\theta\mathbf{x}$ has the same length as x, and the cosine of the angle between $R_\theta\mathbf{x}$ and x is equal to $\cos\theta$. (See Exercises 9 and 10.)
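The two claims about the rotation matrix can be checked numerically for a particular angle and vector. This is only a floating-point sanity check under assumed sample values (`theta = 0.7`, `x = [3.0, -4.0]`); it does not replace the algebra asked for in the exercises. The helper names are illustrative.

```python
import math

def rotation(theta):
    # matrix of the counterclockwise rotation through angle theta
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def apply(a, x):
    # matrix-vector product Ax
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

theta = 0.7                  # sample angle in radians (assumption)
x = [3.0, -4.0]              # sample vector (assumption)
rx = apply(rotation(theta), x)

same_length = math.isclose(dot(rx, rx), dot(x, x))
cos_angle = dot(rx, x) / math.sqrt(dot(rx, rx) * dot(x, x))
```

Here `same_length` is true and `cos_angle` matches `math.cos(theta)` up to rounding error.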

Here is a general statement about the image of a linear function f(x) = Ax.
1.4 Theorem. Let $f: \mathbb{R}^n \to \mathbb{R}^m$ be a linear function with matrix A. Then the image of f consists of all linear combinations of the columns of A.

Proof. We can write a vector x in $\mathbb{R}^n$ as a linear combination

$$\mathbf{x} = x_1\mathbf{e}_1 + \cdots + x_n\mathbf{e}_n$$

and then use the linearity of f to write

$$f(\mathbf{x}) = x_1 f(\mathbf{e}_1) + \cdots + x_n f(\mathbf{e}_n).$$

Since x is an arbitrary vector in the domain of f, the last equation expresses an arbitrary vector in the image of f as a linear combination of the vectors $f(\mathbf{e}_1), \ldots, f(\mathbf{e}_n)$. By Theorem 1.3, these vectors are just the columns of the matrix of f, so the image of f consists of all linear combinations of the columns of A. •

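EXAMPLE 9. To illustrate Theorem 1.4, consider a plane P through the origin in $\mathbb{R}^3$. P consists of all linear combinations of two vectors $\mathbf{y}_1$ and $\mathbf{y}_2$ in $\mathbb{R}^3$. You can check that the function $f: \mathbb{R}^2 \to \mathbb{R}^3$ defined by

$$f(x_1, x_2) = x_1\mathbf{y}_1 + x_2\mathbf{y}_2$$

is linear. Theorem 1.3 implies that the matrix of f has as columns the two vectors

$$f(\mathbf{e}_1) = \mathbf{y}_1 \quad \text{and} \quad f(\mathbf{e}_2) = \mathbf{y}_2.$$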

For example, if

then
FIGURE 3.2 (panels (a) and (b))

Thus the image of f in $\mathbb{R}^3$ is the plane P determined by the two column vectors of the 3-by-2 matrix.
More generally, the definition of a k-plane containing the origin in $\mathbb{R}^n$ given in Section 2D of Chapter 2 amounts to saying that a k-plane through the origin is the image of a linear function $f: \mathbb{R}^k \to \mathbb{R}^n$ given by f(x) = Ax, where A is an n-by-k matrix with k linearly independent columns.

1B Composition
If f and g are two functions, not necessarily linear, such that the image of f overlaps the domain of g, then the composition $g \circ f$ of f and g is defined to be the function obtained by first applying f and then applying g:

$$(g \circ f)(\mathbf{x}) = g(f(\mathbf{x})).$$

The domain of $g \circ f$ consists of all x such that both f(x) and g(f(x)) are defined, and is the same as the domain of f when the image of f is contained in the domain of g.
The following theorem states an important connection between composition of linear functions and matrix multiplication that motivates the definition of matrix multiplication.

FIGURE 3.3 (panels (a) and (b); labeled point $(-\sin\theta, \cos\theta)$)

1.5 Theorem. Let f: JR11 -+ ]Rm and g: ]Rill -+ ]RP be linear functions with
matrices A and B, respectively. Then the compositio~ g f is defined, and 0

(gaf)(x) = (BA)x,
for all x in JR'Z, so g a f has matrix BA and is a function from JR11 to ]RP. It follows
from Theorem 2.3 in Chapter 2, Section 2C that go f is linear.
In Theorem l.5 the image off is contained in the domain of g, which is ]Rm, so
the domain of go f is all of JR 11 • Note also that A is an m-by-n matrix and B is a
p-by-m matrix, so the product BA is defined.
Proof. Suppose that

$$f(x) = Ax \quad\text{and}\quad g(y) = By$$

are linear functions with matrices $A$ and $B$. Then

$$(g \circ f)(x) = g(f(x)) = g(Ax) = B(Ax) = (BA)x,$$

where at the last step we used the associative law for matrix multiplication. •
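The identity $B(Ax) = (BA)x$ in Theorem 1.5 can be spot-checked numerically. The following is a minimal sketch in plain Python; the matrices `A` (2-by-3) and `B` (3-by-2), the test vector `x`, and the helper names `mat_vec` and `mat_mul` are illustrative assumptions, not taken from the text.

```python
def mat_vec(M, x):
    """Multiply matrix M (a list of rows) by the vector x."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

def mat_mul(B, A):
    """Matrix product BA, where B is p-by-m and A is m-by-n."""
    cols = list(zip(*A))  # columns of A
    return [[sum(b * a for b, a in zip(row, col)) for col in cols] for row in B]

# Hypothetical data: f(x) = Ax maps R^3 to R^2, g(y) = By maps R^2 to R^3.
A = [[1, 0, 2],
     [-1, 3, 1]]
B = [[2, 1],
     [0, -1],
     [1, 1]]
x = [1, 2, 3]

g_of_f = mat_vec(B, mat_vec(A, x))  # apply f first, then g
BA = mat_mul(B, A)                  # 3-by-3 matrix of the composition
via_product = mat_vec(BA, x)        # apply the single matrix BA

print(g_of_f, via_product)
```

Both computations give the same vector, as the theorem predicts. The check covers only one input, so it illustrates the statement rather than proving it.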

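EXAMPLE 10 Let $f:\mathbb{R}^2 \to \mathbb{R}^2$ and $g:\mathbb{R}^2 \to \mathbb{R}^2$ be defined by

$$f(x) = \begin{pmatrix} 1 & 2 \\ -1 & -4 \end{pmatrix} x \quad\text{and}\quad g(y) = \begin{pmatrix} 0 & 1 \\ -2 & -3 \end{pmatrix} y.$$

Then

$$(g \circ f)(x) = \begin{pmatrix} 0 & 1 \\ -2 & -3 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ -1 & -4 \end{pmatrix} x = \begin{pmatrix} -1 & -4 \\ 1 & 8 \end{pmatrix} x.$$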

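EXAMPLE 11 Consider the rotation matrix $R_\theta$ of Example 8. On geometric grounds we would expect that $(R_\theta)^2 = R_{2\theta}$, because a rotation through angle $\theta$ followed by another one is a rotation through angle $2\theta$. Algebraically, all we have to check is that multiplying the matrix $R_\theta$ by itself gives the matrix of $R_{2\theta}$:

$$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} = \begin{pmatrix} \cos^2\theta - \sin^2\theta & -2\sin\theta\cos\theta \\ 2\sin\theta\cos\theta & \cos^2\theta - \sin^2\theta \end{pmatrix} = \begin{pmatrix} \cos 2\theta & -\sin 2\theta \\ \sin 2\theta & \cos 2\theta \end{pmatrix}.$$

In the last step we used the trigonometric identities $\sin 2\theta = 2\sin\theta\cos\theta$ and $\cos 2\theta = \cos^2\theta - \sin^2\theta$.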
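The double-angle computation can also be checked in floating point. This is a small sketch under stated assumptions: the angle `theta = 0.7` is an arbitrary choice, and `rotation` and `mat_mul` are hypothetical helper names.

```python
import math

def rotation(theta):
    """Matrix of a counterclockwise rotation of R^2 through angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def mat_mul(B, A):
    """Matrix product BA for matrices given as lists of rows."""
    cols = list(zip(*A))
    return [[sum(b * a for b, a in zip(row, col)) for col in cols] for row in B]

theta = 0.7  # arbitrary test angle in radians
squared = mat_mul(rotation(theta), rotation(theta))
doubled = rotation(2 * theta)

# Largest entrywise difference between (R_theta)^2 and R_{2 theta}.
max_diff = max(abs(squared[i][j] - doubled[i][j])
               for i in range(2) for j in range(2))
print(max_diff)
```

The difference is zero up to rounding error, consistent with $(R_\theta)^2 = R_{2\theta}$.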

1C Inverse Functions
A function $f:\mathbb{R}^n \to \mathbb{R}^m$ is one-to-one if for every $y$ in the image $F$ of $f$ in $\mathbb{R}^m$ there is a unique $x$ in $\mathbb{R}^n$ such that $f(x) = y$. If $f$ is one-to-one, its inverse function $f^{-1}: F \to \mathbb{R}^n$ is defined by setting $f^{-1}(y) = x$, where $x$ is the unique vector in $\mathbb{R}^n$ such that $f(x) = y$. Thus for all $x$ in $\mathbb{R}^n$ and all $y$ in $F$,

$$f^{-1}(f(x)) = x \quad\text{and}\quad f(f^{-1}(y)) = y.$$

If $f$ is one-to-one, so is $f^{-1}$, since if $f^{-1}(y_1) = f^{-1}(y_2)$, applying $f$ to both sides shows that $y_1 = y_2$. Thus there is a unique $y$ such that $f^{-1}$ has the value $f^{-1}(y)$.
The following theorem describes linear functions that are one-to-one.
1.6 Theorem. A linear function $f(x) = Ax$ is one-to-one if and only if the columns of $A$ are linearly independent vectors. If in addition $A$ is a square matrix then $A$ is invertible, and $f^{-1}(y) = A^{-1}y$ for all $y$ in $\mathbb{R}^m$.

Proof. If $f$ is one-to-one, then since $f(0) = A0 = 0$, the only solution of $Ax = 0$ is $x = 0$. Conversely, suppose $x = 0$ is the only solution of $Ax = 0$. Since $Ax_1 = Ax_2$ if and only if $A(x_1 - x_2) = 0$, it follows that if $Ax_1 = Ax_2$ then $x_1 - x_2 = 0$ and $x_1 = x_2$, so $f$ is one-to-one. By Definition 2.7 of linear independence in Chapter 2, the columns of $A$ are independent if and only if the system $Ax = 0$ has only $x = 0$ for its solution. Following Inversion Process 4.3 in Chapter 2, a square matrix $A$ is invertible if and only if $Ax = 0$ has only the zero solution. In that case $Ax = y$ if and only if $x = A^{-1}y$, so $f^{-1}(y) = A^{-1}y$. •
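For an example of a function given by an invertible matrix $A$, let $f(x) = Ax = \begin{pmatrix} 1 & 3 \\ 1 & 4 \end{pmatrix} x$. Since the columns of $A$ are independent, $A^{-1}$ exists, so

$$f^{-1}(x) = \begin{pmatrix} 1 & 3 \\ 1 & 4 \end{pmatrix}^{-1} x = \begin{pmatrix} 4 & -3 \\ -1 & 1 \end{pmatrix} x.$$

In this example the image $F$ of $f$ is all of $\mathbb{R}^2$, since if $y_0$ is in $\mathbb{R}^2$, then $f(x_0) = y_0$, where $x_0 = A^{-1}y_0$.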
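As a check on the inverse computed above, here is a short sketch that inverts a 2-by-2 matrix exactly with rational arithmetic. It assumes the matrix in the example has rows $(1, 3)$ and $(1, 4)$; the helper names `inverse_2x2` and `mat_vec` and the test vector `y0` are hypothetical.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Inverse of a 2-by-2 matrix via the adjugate formula, in exact arithmetic."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def mat_vec(M, x):
    """Multiply matrix M (a list of rows) by the vector x."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

A = [[1, 3],
     [1, 4]]
A_inv = inverse_2x2(A)

y0 = [5, 7]              # arbitrary target vector in R^2
x0 = mat_vec(A_inv, y0)  # x0 = A^{-1} y0
back = mat_vec(A, x0)    # should reproduce y0

print(A_inv, x0, back)
```

Applying `A` to `x0` returns `y0`, which illustrates why the image of $f$ is all of $\mathbb{R}^2$ when $A$ is invertible.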

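We can see geometrically that rotating a vector in $\mathbb{R}^2$ through an angle $\theta$ and then through the angle $-\theta$ puts it back in its original position, so the functions given by the rotation matrices

$$R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \quad\text{and}\quad R_\theta^{-1} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$$

are examples of functions that are inverses of each other. As it should, multiplying the matrices in either order gives the identity matrix:

$$\begin{pmatrix} \cos^2\theta + \sin^2\theta & 0 \\ 0 & \sin^2\theta + \cos^2\theta \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$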

In Example 9 we considered the parametric representation of a plane $P$ containing the origin in $\mathbb{R}^3$ given by

$$f(u, v) = \begin{pmatrix} 1 & -1 \\ 2 & 0 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} u - v \\ 2u \\ u + v \end{pmatrix}.$$

The function $f(u, v)$ is one-to-one by Theorem 1.6 because the columns of the 3-by-2 matrix are independent, neither one being a scalar multiple of the other. The inverse function is defined only on $P$, a plane whose equation in $\mathbb{R}^3$ is $x - y + z = 0$. Given a point on $P$ satisfying $x - y + z = 0$ you can find the coordinates of the corresponding point $f^{-1}(x, y, z) = (u, v)$ by solving $u - v = x$, $u + v = z$ to get $u = \tfrac{1}{2}(x + z)$, $v = \tfrac{1}{2}(z - x)$. These formulas for $u$ and $v$ define a function on all of $\mathbb{R}^3$, but we get a one-to-one correspondence that has an inverse only by restricting the domain of $f^{-1}(x, y, z)$ to $P$, since $u$ and $v$ are independent of $y$, and $(x, y, z)$ and $(x, 0, z)$ give the same values for $u$ and $v$ for all values of $y$.
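The round trip between parameters and points of the plane can be sketched in code. The formulas for `u` and `v` come from the text; the middle coordinate `2 * u` in `f` is an assumption, inferred from the plane equation $x - y + z = 0$ together with $x = u - v$ and $z = u + v$. The function names are hypothetical.

```python
from fractions import Fraction

def f(u, v):
    """Parametrization of the plane x - y + z = 0 (middle coordinate inferred)."""
    return (u - v, 2 * u, u + v)

def f_inv(x, y, z):
    """Recover (u, v) from a point of the plane; rejects points off the plane."""
    if x - y + z != 0:
        raise ValueError("point is not on the plane x - y + z = 0")
    return (Fraction(x + z, 2), Fraction(z - x, 2))

point = f(3, -2)          # a point on the plane
params = f_inv(*point)    # should give back (3, -2)

print(point, params)
```

Restricting `f_inv` to the plane is what makes it a genuine inverse; without the check, points differing only in $y$ would map to the same pair $(u, v)$.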

The form of the inverse correspondence in the previous example suggests that $f^{-1}$ is a linear function from $P$ to $\mathbb{R}^2$. That this is so follows from

1.7 Theorem. If $f:\mathbb{R}^n \to \mathbb{R}^m$ is linear and one-to-one then $f^{-1}$ is also linear.

Proof. Suppose $f(u) = x$ and $f(v) = y$. Since $f$ is linear, $f(su + tv) = sx + ty$ for all scalars $s$ and $t$. Thus linear combinations of elements $x$, $y$ in the image $F$ of $f$ are also in $F$, and

$$f^{-1}(sx + ty) = su + tv = s f^{-1}(x) + t f^{-1}(y).$$

This says that $f^{-1}: F \to \mathbb{R}^n$ is linear. •


EXERCISES

Exercises 1 to 4 give information about linear functions


f. In each case find the matrix A that represents f in
the form f (x) = Ax and determine whether the function
5. 1 c) = ( ; ) . 1 ( - ~ ) = ( _~ )

is one-to-one. 6. 1(;)=C)· 1(_!)=(-!)


1(6)=(;), 1(~)=(;)
1.

2. 1(6)=(;). 1(~)=C)
7- I (:)- G) , I (l)- u).
3
-/m-G)- 1m-O) 10)-m
•. I m-(-:) . m-(_:).I
•. 10)-G)- 1(0-(D .
fm-m 1(:)-(g)
Exercises 9 and 10 refer to the rotation matrix Re =
Exercises 5 to 8 give information about linear functions cos 0 - sin 0 ) f E . th.1s secuon.
.
f. In each case find f (ek) for the standard basis vectors ·
( sm o xamp1e 8 m
0 cos 0
ek in the domain of f by first expressing each ek as a
linear combination of the domain vectors x for which 9. Show that IR0xl = lxl for every x in i 2, so R0 preserves
f (x) is given. the lengths of vectors.
10. Show that x • Rox = lxi 2 cos 0. Assuming the result of b, C, and d, such that (fog)(x) = Px+q. When is fog
Exercise 9, what does this say about the cosine of the the same function as go f?
angle between x and Rox? 19. Let n be the unit vector(~.~,~), and let Pn:R 3--.R 3
For each of the pairs of linear functions in Exercises 11 be the associated projection function as in Example 6.
to 14, find the matrix that represents the composition Find the matrix of Pn by finding the image of each of the
g O f. Also, say what the domain and range of g of are. standard basis vectors under it.

11. f (x) = (2-1)


3 I x, g(y) = (l0)
2 I y
*20. In this exercise, we consider a vector x in JR" to be
a column vector, that is, an n-by-1 matrix, and we
write x1 for its transpose, that is, the 1-by-n matrix
12. f (x) = (120)
2 23 X, g(y) = (2 0)
3 -1 y obtained by writing x as a row vector. With this under-
standing, if x and y are in JR", the matrix product x1y

= ( -1l O 2) = ( -II -1
is just the dot product x • y, while xy' is an n-by-n
13. f(x) l 3 x,
2 l O
g(y)
I -1
!)y matrix.
Show that if n is a unit vector then the matrix of the
projection function P0 is nn 1 •
14. f (x) = (2 l 3 )x, g(y)= 2y
the matrix ( ~ b)
In Exercises 21 to 26, find out whether the image of the
15. (a) Show that gives a linear func- given function is a line, a plane, or some other subset of
tion from R2 to JR 2 that corresponds geometrically its range.
to reflection in the line through the origin 45 ° coun- 21. f(x, y) = (2x - y, 6x - 3y)
terclockwise from the horizontal.
(b) What matrix corresponds to reflection in the line 22. f(x, y) = (x - y, x - 3y, x + y)
through the origin 135° counterclockwise from the 23. f(x, y, z) = (x + y - z, -x - y + z)
horizontal?
24. f(x,y,z)=(x+y-z,x-y+z)
(c) Compute the product of the matrices in parts (a) and
(b) and interpret the result geometrically. 25. f(x, y, z) = (x - 2y + z, -2x - y - z, -Sy+ z)
16. A counterclockwise rotation in R 2 through an angle ct 26. f(x, y, z) = (y, z, x)
. d .b d b th . R ( cos ct - sin ct )
1s escn e y e matrix a = sin ct cos ct . Given functions f: IR" - ]Rm and g: ]RP - iRq, the
composition go f doesn't make sense unless the image
Let f3 be another angle, and compute the product RaRp.
F off lies in the domain JR11 of g, in particular, unless
The composition of a rotation through angle ct with one
m = p. For f and g of the types given in Exercises 27
through angle f3 is a rotation through the angle ct + f3 . to 30, decide whether g 0 f, f 0 g, both, or neither, makes
What is the relation between RaRp and Ra+p? sense.
17. (a) Show that 27. f:JRI--.JR2, g:JR2--.]Rl
28. f:R2--.JR3, g:JR1--.JR3
~1 -~0) and V =(
-1~ 0~ 0b) 29. f: R2--. JR2, g: JR.2--. R3
30. f:R3--.JRI, g:JRI--.JR2
31. Show that if f : JR" -+ JR is a linear function, then there
represent 90° rotations of" JP.3 about the x 1-axis
is a vector a in JR" such that J(x) =a• x for all x in JR" .
and x 2 -axis, respectively. Find the matrix W that
represents a 90° rotation about the x3-axis. Also 32. Show that if A is an m-by-n matrix such that Ax = 0 for
find u- 1 and v- 1 , which represent rotations in the every x in 1R11 , then all entries in A are zero.
opposite direction. 33. The linear function fa : IR2 -+ !R2 defined by fa (x, y) =
(b) Compute uvu- 1 and vuv- 1 and interpret the (x + ay, y), for fixed a =I= 0, is an example of a shear
results geometrically by checking out what they do transformation.
to basis vectors. (a) What is the matrix of fa?
*18. Let f and g be defined by f (x) = Ax+ b and g(x) = (b) Find the points fa(l,0),f0 (0, l) , f0 (-l,0), and
Cx + d for given matrices A and C and vectors b and d. f 0 (0, -1), and sketch their relation to the corre-
~;nn " m<>tri,c P and vector Q. expressed in terms of A, sponding domain points when a = l.
(c) Show that if $a > 0$, then $f_a$ moves points above the $x$-axis to the right and points below the $x$-axis to the left. What happens if $a < 0$?
(d) What points are always left fixed by $f_a$?
(e) For which lines $L$ in the plane (not necessarily through the origin) is the image $f(L)$ equal to $L$?
(f) What is the composition of two shear transformations $f_a$ and $f_b$?

SECTION 2 VECTOR SPACES


In Chapter 1 we defined operations of addition and scalar multiplication on $\mathbb{R}^n$ and noted that they obey the following analogues of familiar laws of arithmetic, where $r$, $s$, and $0$ are scalars, $x$ and $y$ are vectors in $\mathbb{R}^n$, and $0$ is the zero vector in $\mathbb{R}^n$.

1. $rx + sx = (r + s)x$
2. $rx + ry = r(x + y)$
3. $r(sx) = (rs)x$
4. $x + y = y + x$
5. $(x + y) + z = x + (y + z)$
6. $x + 0 = x$
7. $x + (-x) = 0$
8. $1x = x$
9. $0x = 0$

We now take a more general point of view and define a vector space $V$ over the real numbers to be a set with operations of addition and scalar multiplication defined so that they behave like the familiar operations on $\mathbb{R}^n$. Specifically, for $x$ and $y$ in $V$ and $r$ in $\mathbb{R}$, the sum $x + y$ and the scalar multiple $rx$ must also be elements of $V$. In addition, $V$ has to contain a zero vector, and formulas 1 through 9 above have to hold for all real numbers $r$, $s$ and all $x$, $y$, and $z$ in $V$. As we have done all along with vectors in $\mathbb{R}^n$, we write $-x$ for the scalar multiple $(-1)x$, $x - y$ for $x + (-y)$, and $0$ for the zero vector.
We form linear combinations of vectors in a vector space by adding scalar multiples of vectors, and routine calculations such as

$$(2u - 3v + w) + 3(u + 2v - w) = (2 + 3)u + (-3 + 6)v + (1 - 3)w = 5u + 3v - 2w$$

follow from formulas 1 through 9, just as was illustrated for vectors in $\mathbb{R}^n$ in Example 4 at the beginning of Chapter 1.
2A Examples of Vector Spaces

EXAMPLE 1 The set of all $n$-tuples of real numbers, with addition and multiplication by a scalar defined as in Chapter 1, forms the vector space $\mathbb{R}^n$. From the point of view of this chapter, when we showed that the operations defined on $\mathbb{R}^n$ have the properties 1 through 9, we were showing that $\mathbb{R}^n$ is a vector space.

EXAMPLE 2 For fixed $m$ and $n$ the set of all $m$-by-$n$ matrices forms a vector space $M_{m,n}$. The vector operations are matrix addition and scalar multiplication of matrices as defined in Chapter 2. For example, in $M_{2,3}$ we have

(; ~-i)+(;;;)=(~i~)
and = ( ~ ~ i).
2 (; ; ; )

Each $m$-by-$n$ matrix has $mn$ entries and so corresponds to the element of $\mathbb{R}^{mn}$ we get by lining up the successive rows end to end, so sums and scalar multiples of elements of $M_{m,n}$ correspond to sums and scalar multiples of elements of $\mathbb{R}^{mn}$. The zero vector is the zero matrix $O$, and it follows that rules 1 through 9 hold for $M_{m,n}$ simply because they hold for $\mathbb{R}^{mn}$. We remark that the operation of multiplying one matrix by another plays no part in making $M_{m,n}$ a vector space.
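The correspondence between $M_{m,n}$ and $\mathbb{R}^{mn}$ can be made concrete with a short sketch. The matrices `A` and `B` below are hypothetical values chosen for illustration, and `flatten`, `mat_add`, and `mat_scale` are assumed helper names.

```python
def flatten(M):
    """Line up the rows of M end to end, giving a vector with mn entries."""
    return [entry for row in M for entry in row]

def mat_add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_scale(r, A):
    """Scalar multiple rA."""
    return [[r * a for a in row] for row in A]

# Hypothetical elements of M_{2,3}.
A = [[1, 2, 3],
     [4, 5, 6]]
B = [[0, -1, 2],
     [3, 0, -3]]

# Adding then flattening agrees with flattening then adding in R^6.
sum_then_flatten = flatten(mat_add(A, B))
flatten_then_sum = [a + b for a, b in zip(flatten(A), flatten(B))]

# The same holds for scalar multiples.
scale_then_flatten = flatten(mat_scale(2, A))
flatten_then_scale = [2 * a for a in flatten(A)]

print(sum_then_flatten, scale_then_flatten)
```

Because both operations commute with flattening, any identity among formulas 1 through 9 that holds in $\mathbb{R}^{mn}$ carries over to $M_{m,n}$.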

EXAMPLE 3 Let $V$ be the set of all infinite sequences of real numbers $\{x_1, x_2, \ldots\}$. If $a = \{a_1, a_2, \ldots\}$ and $b = \{b_1, b_2, \ldots\}$ are two elements of $V$, define $a + b$ as the sequence $\{(a_1 + b_1), (a_2 + b_2), \ldots\}$ and $ra$ as the sequence $\{ra_1, ra_2, \ldots\}$. By a natural analogy with the spaces $\mathbb{R}^n$, we call this space $\mathbb{R}^\infty$, the space of sequences.

EXAMPLE 4 Let $\mathcal{P}_n$ be the set of all polynomials of degree at most $n$, which is the set of all functions $p$ having the form

$$p(x) = a_0 + a_1 x + \cdots + a_n x^n,$$

where $a_0, \ldots, a_n$ are constants. Define addition and scalar multiplication as usual for polynomials, by collecting terms with like powers of $x$. For example, when $n = 2$ we have

$$(1 + 2x + 3x^2) + (2 - 3x) = 3 - x + 3x^2$$

and

$$3(1 + 2x + 3x^2) = 3 + 6x + 9x^2.$$

$\mathcal{P}_n$ and $\mathbb{R}^{n+1}$ are very much alike as vector spaces, under the correspondence

$$a_0 + a_1 x + \cdots + a_n x^n \;\longleftrightarrow\; (a_0, a_1, \ldots, a_n).$$

Sums and scalar multiples are preserved by the correspondence, so formulas 1 through 9 hold for $\mathcal{P}_n$ because they hold for $\mathbb{R}^{n+1}$. The zero element of $\mathcal{P}_n$ is the identically zero polynomial. We'll use $\mathcal{P}$ to denote the space of all polynomials, and $\mathcal{P}$ is also a vector space with addition and scalar multiplication defined as in $\mathcal{P}_n$.
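The identification of a polynomial with its vector of coefficients translates directly into code. In this sketch a polynomial $a_0 + a_1 x + \cdots + a_n x^n$ is stored as the list `[a0, a1, ..., an]`; the function names `poly_add` and `poly_scale` are assumptions made for illustration.

```python
def poly_add(p, q):
    """Add two polynomials given as coefficient lists [a0, a1, ...]."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))  # pad the shorter list with zeros
    q = list(q) + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_scale(r, p):
    """Multiply every coefficient of p by the scalar r."""
    return [r * a for a in p]

p = [1, 2, 3]   # 1 + 2x + 3x^2
q = [2, -3, 0]  # 2 - 3x, viewed as an element of P_2

total = poly_add(p, q)    # coefficients of 3 - x + 3x^2
triple = poly_scale(3, p) # coefficients of 3 + 6x + 9x^2

print(total, triple)
```

The two results reproduce the sums computed in the example for $n = 2$, now carried out as ordinary vector operations in $\mathbb{R}^3$.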

EXAMPLE 5 Let $V$ be the set of all continuous real-valued functions on $[0, 1]$. For $f$ and $g$ in $V$ and $r$ in $\mathbb{R}$ define $f + g$ and $rf$ as the functions whose values at a point $x$ in $[0, 1]$ are $f(x) + g(x)$ and $rf(x)$, respectively. We learn in Chapter 5 that sums and constant multiples of continuous functions are continuous, a theorem often assumed in calculus. Thus the defined operations do produce elements of $V$ as they should.

FIGURE 3.4 (a) Sum. (b) Scalar multiples.

Formulas 1 through 9 follow from the same kind of argument used in Chapter 1 to show that they hold for $\mathbb{R}^n$. The 0-element of $V$ required by formula 6 is the zero function $z$ defined by $z(x) = 0$ for all $x$ in $[0, 1]$.

The vector space described in Example 5 is commonly denoted by $C[0, 1]$, the space of continuous functions defined on the interval $[0, 1]$. More generally, the continuous functions on an interval $[a, b]$ form a vector space called $C[a, b]$. The notation $C(-\infty, \infty)$ denotes the space of continuous functions on the entire real line. Figure 3.4 illustrates the two vector space operations in $C[0, 1]$.
2B Subspaces
The vector space $\mathbb{R}^2$ fits in a natural way inside $\mathbb{R}^3$ if we identify $(x, y)$ in $\mathbb{R}^2$ with $(x, y, 0)$ in $\mathbb{R}^3$. The following example shows that all planes through the origin are subspaces of $\mathbb{R}^3$.

EXAMPLE 6 Let $V$ consist of all the vectors in $\mathbb{R}^3$ that lie in a plane $ax + by + cz = 0$. The sum of two vectors in $V$ is in $V$ and the same is true for scalar multiples of vectors in $V$. We see this geometrically from the parallelogram law of addition and the geometric interpretation of scalar multiplication. For this example we can also check algebraically that if

$$ax_1 + by_1 + cz_1 = 0 \quad\text{and}\quad ax_2 + by_2 + cz_2 = 0$$

then

$$a(x_1 + x_2) + b(y_1 + y_2) + c(z_1 + z_2) = 0 \quad\text{and}\quad a(sx_1) + b(sy_1) + c(sz_1) = s(ax_1 + by_1 + cz_1) = 0$$

for all scalars $s$. Thus we can think of addition and scalar multiplication as restricted just to $V$, so $V$ is a vector space. If $a = b = 0$ and $c = 1$ we get the $xy$-plane of vectors $(x, y, 0)$ as a special case.
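The closure check in Example 6 can be tried on specific numbers. The sketch below assumes a hypothetical plane $2x - y + 3z = 0$ and hypothetical vectors on it; for contrast it also tests the shifted plane $2x - y + 3z = 1$, which does not pass through the origin. All names are illustrative.

```python
def on_plane(normal, point, rhs=0):
    """True if the point satisfies a*x + b*y + c*z = rhs, with normal = (a, b, c)."""
    return sum(n * p for n, p in zip(normal, point)) == rhs

def add(x, y):
    """Componentwise sum of two vectors."""
    return tuple(a + b for a, b in zip(x, y))

def scale(s, x):
    """Scalar multiple s*x."""
    return tuple(s * a for a in x)

normal = (2, -1, 3)  # hypothetical coefficients (a, b, c)
v1 = (1, 2, 0)       # 2 - 2 + 0 = 0, so v1 is on the plane
v2 = (0, 3, 1)       # 0 - 3 + 3 = 0, so v2 is on the plane

sum_stays = on_plane(normal, add(v1, v2))
multiple_stays = on_plane(normal, scale(-4, v1))

# A point on the shifted plane 2x - y + 3z = 1; its double leaves that plane.
w = (0, -1, 0)
shifted_sum_stays = on_plane(normal, add(w, w), rhs=1)

print(sum_stays, multiple_stays, shifted_sum_stays)
```

The plane through the origin passes both closure tests for these vectors, while the shifted plane fails closure under addition, which is why only planes through the origin can be subspaces.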

The previous example generalizes as follows. Let $W$ be a vector space and let $V$ be a subset of it. We say that $V$ is closed under addition if $x + y$ is in $V$ whenever $x$ and $y$ are, and closed under scalar multiplication if every scalar multiple $sx$ is in $V$ whenever $x$ is. If $V$ is closed under both operations, $V$ is called a subspace of $W$.
2.1 Theorem. If $V$ is a nonempty subset of a vector space $W$, and $V$ is closed under addition and scalar multiplication as defined on $W$, then $V$ is a vector space.

Proof. To prove that $V$ is a vector space we have to show that the formulas 1 through 9 hold. First, closure under scalar multiplication implies that if $x$ is in $V$ so are $-x = (-1)x$ and $0 = 0x$. Then all the formulas hold for vectors in $V$ because the vectors are also in $W$ and the formulas hold because $W$ is a vector space. •
The next theorem gives an alternative condition for a subset of a vector space to
be a subspace.

2.2 Theorem. If $V$ is a nonempty subset of a vector space $W$, then $V$ is a subspace if and only if every linear combination of elements of $V$ is also in $V$.

Proof. If $V$ is a subspace and vectors $x_1, \ldots, x_n$ belong to it, then repeated application of the closure conditions shows that every linear combination

$$a_1 x_1 + \cdots + a_n x_n$$

also belongs to $V$. On the other hand, if linear combinations of elements of $V$ belong to $V$, so do sums and scalar multiples of elements of $V$, because $x_1 + x_2$ and $rx_1$ are special cases of the linear combination $a_1 x_1 + a_2 x_2$ obtained by setting $a_1 = a_2 = 1$ and $a_1 = r$, $a_2 = 0$, respectively. •

EXAMPLE 7 For a simple example of a subspace, let $V$ be the subset of $\mathbb{R}^3$ consisting of all vectors $(x, y, 0)$ that have third coordinate zero. Sums and scalar multiples of such vectors also have this property, so $V$ is a subspace of $\mathbb{R}^3$. We can visualize $V$ as the horizontal $xy$-plane in $\mathbb{R}^3$.

EXAMPLE 8 In this example, we'll describe all possible subspaces of $\mathbb{R}^3$.

1. Since $0 + 0 = 0$ and every scalar multiple $s0 = 0$, the subset consisting of the zero vector alone satisfies the closure conditions and is a subspace.
2. If $V$ is a subspace containing some vector $x_1 \neq 0$ then, since $V$ is closed under scalar multiplication, it contains all multiples $tx_1$, where $t$ ranges over the real numbers. In other words, $V$ contains the line through the origin with parametric representation

$$x = t x_1.$$

It may be that $V$ contains no other vectors, in which case our subspace $V$ is identical with the line, shown in Figure 3.5.
3. Otherwise, $V$ contains a vector $x_2$ different from all the vectors $tx_1$. Since $V$ contains all scalar multiples and sums of vectors in it, $V$ contains all linear combinations $tx_1 + ux_2$, where $(t, u)$ ranges over $\mathbb{R}^2$. In other words, $V$ contains the plane through the origin with parametric representation

$$x = t x_1 + u x_2.$$

It may be that $V$ contains no other vectors, in which case $V$ is identical with the plane, partly shown in Figure 3.5.
4. Suppose finally that $V$ in addition contains a vector $x_3$ different from all the vectors $tx_1 + ux_2$. Then because $V$ is closed under addition and scalar multiplication, $V$ contains all linear combinations

$$t x_1 + u x_2 + v x_3,$$

where $(t, u, v)$ ranges over $\mathbb{R}^3$. Because $x_1$, $x_2$, and $x_3$ don't all lie in a plane through the origin, every vector in $\mathbb{R}^3$ is a linear combination of them, something that's apparent geometrically from Figure 3.5(c), where the shaded box shows the vectors with all three coefficient values between 0 and 1. Example 10 in Section 5C shows that we really get all of $\mathbb{R}^3$ this way.

In a vector space $W$ the zero vector by itself is always a subspace, for the reasons given in item 1 of the preceding example. It is called the zero subspace, or sometimes the trivial subspace, of $W$. $W$ is itself closed under the vector operations, and so is technically a subspace of itself. Subspaces of a vector space $W$ other than $W$ itself are called proper subspaces of $W$. We summarize the results of Example 8 as follows, where the proper subspaces of $\mathbb{R}^3$ are the ones listed in (1) to (3).

2.3 Theorem. The subspaces of $\mathbb{R}^3$ are (1) the zero subspace, (2) the lines through the origin, (3) the planes through the origin, and (4) the space $\mathbb{R}^3$ itself.
More generally, as we'll see in Example 10 of Section 6C, the subspaces of $\mathbb{R}^n$ are the zero subspace, the $k$-planes through the origin for $k = 1, \ldots, n - 1$, and the space $\mathbb{R}^n$ itself.

The subspaces of $\mathbb{R}^3$ that we found in Example 8 were constructed by forming all linear combinations of certain sets of vectors in $\mathbb{R}^3$. This is a general method of producing subspaces. If $S$ is a subset, not necessarily a subspace, of a vector space $W$, we define the span of $S$ to be the set $V$ of all linear combinations of vectors from $S$. When the set $S$ has only finitely many vectors $x_1, \ldots, x_n$ in it, the span of $S$ is just all linear combinations

$$a_1 x_1 + \cdots + a_n x_n.$$

If $V$ is the span of $S$, we say that $S$ spans $V$.

FIGURE 3.5 Subspaces.

EXAMPLE 9 For examples of spans, consider the following:

(a) The span of $\{e_1, e_2\}$ in $\mathbb{R}^2$ is all of $\mathbb{R}^2$.
(b) The span of $\{e_1, e_2\}$ in $\mathbb{R}^3$ is the $xy$-plane in $\mathbb{R}^3$.
(c) Let $S = \{x_1, x_2\}$ consist of two vectors (arrows) in $\mathbb{R}^3$ that don't lie on a line. The span of $S$ is all linear combinations $tx_1 + ux_2$ and is a subspace that is a plane $P$ through the origin. Alternatively, we can say that $\{x_1, x_2\}$ spans $P$.

Here is the formal statement and proof that the span of a subset of a vector space
is always a subspace.
2.4 Theorem. Let $S$ be an arbitrary set of vectors in a vector space $W$ and let $V$ be the span of $S$. Then $V$ is a subspace of $W$.

Proof. We need to show that $V$ is closed under addition and scalar multiplication. Let $x$ and $y$ be vectors in the span of $S$, so each of them is a linear combination of a finite number of vectors in $S$. Suppose that $\{v_1, \ldots, v_k\}$ contains all the vectors in $S$ needed in the linear combinations for both $x$ and $y$. Then there are scalars $a_i$ and $b_i$ such that

$$x = a_1 v_1 + \cdots + a_k v_k$$

$$y = b_1 v_1 + \cdots + b_k v_k.$$

(Some of the $a$'s and $b$'s may be zero if not all of the $v$'s are needed for both $x$ and $y$.) Then $x + y = (a_1 + b_1)v_1 + \cdots + (a_k + b_k)v_k$ and $rx = ra_1 v_1 + \cdots + ra_k v_k$, so $x + y$ and $rx$ are also linear combinations of vectors in $S$. Hence $V$ is closed under addition and scalar multiplication and is therefore a subspace of $W$. •

EXAMPLE 10 In the space $\mathcal{P}$ of polynomials discussed in Example 4, the span of the set $S = \{1, x, x^2, \ldots, x^n\}$ is the subspace $\mathcal{P}_n$ of polynomials of degree at most $n$. If $m \le n$, then $\mathcal{P}_m$ is a subspace of $\mathcal{P}_n$, and if $m < n$, then $\mathcal{P}_m$ is a proper subspace of $\mathcal{P}_n$. The whole space $\mathcal{P}$ is the span of the infinite set $\{1, x, x^2, \ldots\}$.

Here are some examples of subspaces of $C[a, b]$, the space of continuous functions on $[a, b]$ discussed in Example 5.

EXAMPLE 11 We can get a subspace of $C[a, b]$ by taking the span of a set of functions in it. For instance, the span of $\{1, x, x^2\}$ is a subspace of $C[a, b]$ consisting of all polynomials of degree at most 2. Another example is the set of all functions of the form

$$a_0 + a_1 \cos x + b_1 \sin x + a_2 \cos 2x + b_2 \sin 2x + a_3 \cos 3x + b_3 \sin 3x,$$

which is the span of the set $\{1, \cos x, \sin x, \cos 2x, \sin 2x, \cos 3x, \sin 3x\}$.

In the next two examples we have subspaces of $C[0, 1]$ that are not described as the span of a set.

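EXAMPLE 12 Let $V$ be the set of all functions $f$ in $C[a, b]$ such that $\int_a^b f(x)\,dx = 0$. Using familiar properties of the integral, we see that if $f$ and $g$ are in $V$ and $s$ is a scalar, then

$$\int_a^b \bigl(f(x) + g(x)\bigr)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx = 0 + 0 = 0$$

and

$$\int_a^b s f(x)\,dx = s\int_a^b f(x)\,dx = s0 = 0.$$

Thus $V$ is closed under addition and scalar multiplication and is a subspace of $C[a, b]$.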

EXAMPLE 13 Consider the subset of functions in $C[a, b]$ that have continuous first derivatives; we denote this set by $C^{(1)}[a, b]$. To verify that $C^{(1)}[a, b]$ is a subspace of $C[a, b]$ all we have to do is observe that since

$$\frac{d}{dx}(f + g) = \frac{df}{dx} + \frac{dg}{dx} \quad\text{and}\quad \frac{d(rf)}{dx} = r\frac{df}{dx},$$

sums and scalar multiples of functions with continuous derivatives also have continuous derivatives.
We denote the set of functions whose first $k$ derivatives are continuous by $C^{(k)}[a, b]$. Repeated application of the argument used for $C^{(1)}[a, b]$ shows that it is a subspace of $C[a, b]$. For $l \ge k$, $C^{(l)}[a, b]$ is a subspace of $C^{(k)}[a, b]$. For $l > k$, $C^{(l)}[a, b]$ is a proper subspace of $C^{(k)}[a, b]$. A proof is outlined in Exercise 33.

EXERCISES

In each of Exercises 1 to 6, let $S$ be the set of all vectors $(x, y, z)$ in $\mathbb{R}^3$ whose entries satisfy the given conditions. In each case, either show that the subset is a subspace of $\mathbb{R}^3$ by verifying the closure conditions, or show that it is not a subspace by finding some linear combination of elements of $S$ that is not in $S$.

1. $x + 2y = 0$
2. $x + z = 2$
3. $x + y = 0$ and $z = 0$
4. $x + y = 0$ or $z = 0$
5. $x = y^3$
6. $x + y = 0$ and $x = y^3$

In Exercises 7 to 10, let $S$ be the subset of the vector space of 2-by-2 matrices, $M_{2,2}$, consisting of the matrices $A = \begin{pmatrix} x & y \\ z & w \end{pmatrix}$ whose entries satisfy the given conditions. Show either that $S$ is a subspace of $M_{2,2}$ or that it is not.

7. $x = w$
8. $x = -w$
9. $y = z = 1$
10. $\det(A) = xw - yz = 0$

11. (a) Show that the set of vectors $(x, y, z)$ in $\mathbb{R}^3$ such that $x + 2y - z = 0$ is a subspace of $\mathbb{R}^3$.
(b) By finding a parametric representation for the solutions of $x + 2y - z = 0$, find two vectors that span the subspace in part (a).

12. Let $a$ be a fixed nonzero vector in $\mathbb{R}^n$.
(a) Show that the set $S$ of all vectors $x$ such that $a \cdot x = 0$ is a subspace of $\mathbb{R}^n$.
(b) Show that if $k$ is a nonzero real number, then the set $A$ of all vectors $x$ such that $a \cdot x = k$ is not a subspace.

13. Let $S$ be a subset of $\mathbb{R}^n$, and let $S^\perp$ (pronounced "S perpendicular", or "S perp" for short) be the set of all vectors $p$ in $\mathbb{R}^n$ such that $p \cdot s = 0$ for all $s$ in $S$. Show that $S^\perp$ is always a subspace of $\mathbb{R}^n$.

14. Show that for a subset $S$ of $\mathbb{R}^n$, the span of $S$ is contained in $(S^\perp)^\perp$. [Hint: First show that $S$ is contained in $(S^\perp)^\perp$.]

*15. Let $e_i$ be the sequence in $\mathbb{R}^\infty$ (Example 3) having 1 in the $i$th place and 0 elsewhere, so $e_1 = (1, 0, 0, \ldots)$, $e_2 = (0, 1, 0, \ldots)$, etc. Which vectors in $\mathbb{R}^\infty$ are in the span of $\{e_1, e_2, \ldots, e_n\}$? Which are in the span of the set of all the $e_i$? Give an example of a vector in $\mathbb{R}^\infty$ that is not in the span of all the $e_i$.

In Exercises 16 to 19, determine whether the set of all polynomials $p$ in $\mathcal{P}_3$ that satisfy the given conditions is a subspace of $\mathcal{P}_3$.

16. $p(0) = 1$
17. $p(1) = 0$
18. $p(0) = p(1)$
19. $p(1) = p'(2)$, where $p'$ is the derivative of $p$

20. In the space $\mathcal{P}$ of polynomials, let $A$ be the set of all $p$ such that $p(x) = -p(-x)$, and let $B$ be the set of $p$ such that $p(x) = p(-x)$. Show that $A$ is the span of $\{x, x^3, x^5, \ldots\}$, and find a spanning set for $B$.

In Exercises 21 to 24, determine whether the given subset of $C^{(1)}(-\infty, \infty)$ is also a subspace.

21. All $f$ such that $f'(0)$ exists
22. All $f$ such that $f'(0) = 2$
23. All $f$ such that $f'(0) = f(2)$
24. All $f$ such that $f(x) = f(-x)$ for every value of $x$

25. Let $C[a, b]$ be the vector space of continuous real-valued functions defined on the interval $[a, b]$. Let $C_0[a, b]$ be the set of functions $f$ in $C[a, b]$ such that $f(a) = f(b) = 0$. Show that $C_0[a, b]$ is a subspace of $C[a, b]$.

In Exercises 26 and 27, show that $S$ and $T$ have the same span in $\mathbb{R}^3$ by showing that the vectors in $S$ are in the span of $T$ and vice versa. [Hint: You can do this by solving systems of linear equations.]

26. $S = \{(1, 0, 0), (0, 1, 0)\}$, $T = \{(1, 2, 0), (2, 1, 0)\}$
27. $S = \{(2, 3, 1), (1, 2, 3)\}$, $T = \{(3, 5, 4), (1, 1, -2)\}$

28. (a) Show that the plane $P$ of points $(1, 1, 1) + s(1, 2, 0) + t(-2, 1, 1)$ is not a subspace by finding two vectors in $P$ whose sum is not in $P$.
(b) Show that the plane $P$ of points $(-1, 3, 1) + s(1, 2, 0) + t(-2, 1, 1)$ is a subspace.
(c) What is different about cases (a) and (b)? For which vectors $b$ do the points $b + s(1, 2, 0) + t(-2, 1, 1)$ form a subspace?
29. Show that if $S$ is a subset of a vector space $W$ and $V$ is a subspace of $W$ that contains $S$, then the span of $S$ is a subset of $V$. (Another way of stating this is to say that the span of $S$ is the smallest subspace of $W$ that contains $S$.)

30. Show that the intersection of two subspaces of a vector space $V$ is always a subspace of $V$.

*31. Exercise 4 shows that the union of two subspaces is not always a subspace. Show that the union of two subspaces is a subspace if and only if one of them is contained in the other.

32. Given two subsets $\mathcal{A}$ and $\mathcal{B}$ of a vector space, let $\mathcal{A} + \mathcal{B}$ stand for the set of all vectors that are equal to sums $a + b$ with $a$ in $\mathcal{A}$ and $b$ in $\mathcal{B}$. Show that if $\mathcal{A}$ and $\mathcal{B}$ are subspaces, then so is $\mathcal{A} + \mathcal{B}$.

*33. This exercise outlines a proof that, for the spaces of functions $C^{(k)}[a, b]$ defined in Example 13, $C^{(l)}[a, b]$ is a proper subspace of $C^{(k)}[a, b]$ when $l > k$.
(a) Show that $C^{(1)}[a, b]$ is a proper subspace of $C[a, b]$ by giving an example of a function that is continuous on the interval $[a, b]$ but doesn't have a derivative that is continuous on the interval.
(b) One version of the fundamental theorem of calculus states that if $f$ is continuous on an interval $[a, b]$ then $F(x) = \int_a^x f(t)\,dt$ is a function with derivative $F'(x) = f(x)$. Use this and your example from part (a) to find a function that is in $C^{(1)}[a, b]$ but not in $C^{(2)}[a, b]$.
(c) Show by induction that for $k = 1, 2, \ldots$, there is a function in $C^{(k)}[a, b]$ that is not in $C^{(k+1)}[a, b]$, so $C^{(k+1)}[a, b]$ is a proper subspace of $C^{(k)}[a, b]$.

*34. Let $C^{(\infty)}$ be the vector space of infinitely often differentiable functions of a real variable. Show that $C^{(\infty)}$ is a proper subspace of $C^{(k)}$ for $k = 1, 2, \ldots$.

35. Suppose a linear function $f:\mathbb{R}^3 \to \mathbb{R}$ has $f(e_1) = 1$, $f(e_2) = 2$, and $f(e_3) = 1$. Show that the equation $f(x) = 1$ has solutions consisting precisely of the points in the plane perpendicular to $(1, 2, 1)$ and passing through $(1, 0, 0)$.

In each of Exercises 36 to 41, say whether the given statement is always true or sometimes false. If the statement is always true, give a reason why; otherwise give an example for which it is false.

36. If $S$ is a subspace of a vector space and $x$ is in $S$, then $-x$ is in $S$.
37. If $S$ is a subspace of a vector space $W$ and $x$ is in $W$ but not in $S$, then the set of all sums $x + y$ with $y$ in $S$ is not a subspace of $W$.
38. If $S$ is a subspace of $\mathbb{R}^n$ and $S$ contains more than one vector, then $S$ contains a line through the origin.
39. If $S_1$ and $S_2$ are two different proper subspaces of a vector space $W$, then $W$ has a proper subspace that contains both $S_1$ and $S_2$.
40. There is no subspace of $\mathbb{R}^n$ such that $|x| \le 1$ for all $x$ in the subspace.
41. No subspace $S$ of $\mathbb{R}^3$ has the property that $x \cdot (1, 2, 1) = 1$ for all $x$ in $S$.

SECTION 3 LINEAR FUNCTIONS


In Section 1 of this chapter we studied functions $f$ from $\mathbb{R}^n$ to $\mathbb{R}^m$ that satisfied Equation 1.1 stating that for a linear combination $sx + ty$,

3.1 $\quad f(sx + ty) = s f(x) + t f(y).$

We'll now be looking at linear functions from one vector space to another without assuming that these spaces are the standard coordinate spaces $\mathbb{R}^n$, so we can't always assume a matrix representation $f(x) = Ax$ for a linear function.
In general, we define a function to be linear if its domain and range are vector spaces, and it satisfies Equation 3.1. When checking whether a function $f$ is linear it's sometimes more convenient to check separately that

3.2 $\quad$ (a) $f(x + y) = f(x) + f(y)$ for all vectors $x$, $y$
$\qquad$ (b) $f(sx) = s f(x)$ for all vectors $x$ and scalars $s$.

In Section 1 we saw that a linear function $f(x) = Ax$ with $m$-by-$n$ matrix $A$ is one-to-one, and so has an inverse, if and only if the columns of $A$ are linearly independent. Matrix representations for linear functions in general are not readily available, but the definition of linear independence in Definition 2.7 in Chapter 2 provides a useful substitute that doesn't depend on properties of $\mathbb{R}^n$.

3.3 Theorem. A linear function $f$ is one-to-one if and only if $x = 0$ is the only vector such that $f(x) = 0$.

Proof. By linearity, $f(0) = f(0x) = 0f(x) = 0$, and $f(x_1) = f(x_2)$ if and only if $f(x_1 - x_2) = 0$. If $f$ is one-to-one, then $x = 0$ is the only vector such that $f(x) = 0$. If $f$ is not one-to-one, there are vectors such that $f(x_1) = f(x_2)$ but $x_1 \neq x_2$, and then $x = x_1 - x_2$ is a nonzero solution of $f(x) = 0$. •
3A Examples of Linear Functions
We now give some examples of linear functions just to illustrate various possibilities
that come under the definition. After giving some specific examples, we consider
general ways of combining linear functions to get others.

EXAMPLE 1 For our first example, we simply recall Theorem 2.3 in Chapter 2, Section 2C and Theorem 1.3 of Section 1 of this chapter. If $f:\mathbb{R}^n \to \mathbb{R}^m$ is a linear function, then $f(x) = Ax$, where $A$ is the $m$-by-$n$ matrix whose $j$th column is the vector $f(e_j)$. This theorem is significant in that it gives us a concrete computational description of all possible linear functions from $\mathbb{R}^n$ to $\mathbb{R}^m$.

This direct description of linear functions by matrices only works for functions
whose domain and range are standard coordinate spaces $\mathbb{R}^n$, $\mathbb{R}^m$. As we'll see later
in Section 5B, many vector spaces are very much like the spaces $\mathbb{R}^n$, and there is a
way to associate linear functions on them with matrices, but in other cases such as
the next couple of examples this is not possible.
We sometimes use the term transformation to refer to a function from one vector
space to another, and use the term operator to refer to a function from a vector
space to itself. These terms help to avoid confusion when we deal with vector
spaces such as $C(-\infty, \infty)$ whose elements are themselves functions. For example,
differentiation is a differential operator on infinitely often differentiable functions
such as $f(x) = \sin x$; it operates on $f(x)$ to produce $f'(x) = \cos x$. We use this
terminology in several of the examples that follow.

EXAMPLE 2 Let $\mathcal{P}$ be the vector space of all polynomials, as in Example 4 in the previous section.
Because the derivative of a polynomial is a polynomial, we can define the differential
operator $D: \mathcal{P} \to \mathcal{P}$ by setting $Dp(x) = p'(x)$ for every polynomial $p(x)$ in $\mathcal{P}$.
For example, $D(2 + x - x^3) = 1 - 3x^2$. Checking that $D$ is linear is a matter of
observing that if $p(x)$ and $q(x)$ are polynomials, and $r$ and $s$ are numbers, then

D(rp(x)+ sq(x)) = (rp(x) + sq(x))' and


rD(p(x)) + sD(q(x)) = rp'(x) + sq'(x),

and these are equal by familiar rules of differentiation.


The linear operator $D$ is not one-to-one, because the derivative of every constant
polynomial is the identically zero polynomial. The image of $D$ is all of $\mathcal{P}$, because
every polynomial is the derivative of some other polynomial.
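
The properties of the operator $D$ on polynomials can be spot-checked numerically. The following is a minimal sketch, not part of the original text; it assumes that a polynomial is stored as a coefficient list `[a0, a1, a2, ...]`, and the helper names `D` and `combo` are chosen only for this illustration.

```python
# Assumption of this sketch: the list [a0, a1, a2, ...] stands for
# the polynomial a0 + a1*x + a2*x**2 + ...

def D(p):
    """Differentiate a polynomial given by its coefficient list."""
    # The derivative of a constant is the zero polynomial, stored as [0].
    return [k * p[k] for k in range(1, len(p))] or [0]

def combo(r, p, s, q):
    """Return the coefficient list of the linear combination r*p + s*q."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [r * a + s * b for a, b in zip(p, q)]

p = [2, 1, 0, -1]   # 2 + x - x**3, the polynomial used in the text
q = [5, 0, 4]       # 5 + 4*x**2, an arbitrary second polynomial

dp = D(p)                          # coefficients of 1 - 3*x**2
lhs = D(combo(3, p, -2, q))        # D(3p - 2q)
rhs = combo(3, D(p), -2, D(q))     # 3*Dp - 2*Dq

# Two different constant polynomials have the same derivative,
# so D is not one-to-one.
same_image = D([7]) == D([-4])
```

Here `lhs == rhs` is one instance of the linearity identity, and `same_image` illustrates why $D$ fails to be one-to-one. A single check like this illustrates the claim but does not prove it.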
Section 3A Linear Functions 121
FIGURE 3.6 Linear actions on u(x). Panels (horizontal axis x, vertical axis u): (a) $u(x) = 1/(1 + x^2)$; (b) $Du(x) = -2x/(1 + x^2)^2$; (c) $xu(x) = x/(1 + x^2)$; (d) $q(x)u(x)$ with $q(x) = x^2(1 + x^2)^{-1}$.

EXAMPLE 3 The discussion of the differential operator $D$ in Example 2 applies somewhat differently
if we consider $D$ as a transformation from $C^{(1)}(-\infty, \infty)$, the space of
continuously differentiable functions $f(x)$, to $C(-\infty, \infty)$, the continuous functions.
The linearity of D follows just as in Example 2. But note that while here f (x) is
assumed to have a continuous derivative, Df(x) = f'(x) may only be continuous
but not differentiable. As with differentiation of polynomials D is still not one-to-
one, for the same reason as before, namely, that the derivative of every constant
function is the identically zero function. Figures 3.6(a) and (b) show the graphs of
a function u(x) and its derivative Du(x). The linearity of D as an operator on u(x)
isn't at all obvious from looking at the pictures.

EXAMPLE 4 Let $C(-\infty, \infty)$ denote the space of continuous real-valued functions $u(x)$ and let
$q(x)$ be a fixed function in $C(-\infty, \infty)$. We define an operator $Q: C(-\infty, \infty) \to C(-\infty, \infty)$ by

Qu(x) = q(x)u(x).
Figure 3.6(c) shows the effect of multiplying by $q(x) = x$ when $u(x) = (1 + x^2)^{-1}$;
Figure 3.6(d) shows the effect when $q(x) = x^2(1 + x^2)^{-1}$ instead. Checking that $Q$
is a linear transformation amounts to observing that

Q(ru(x) + sv(x)) = rQu(x) + sQv(x),


or in other words, that

$q(x)(ru(x) + sv(x)) = rq(x)u(x) + sq(x)v(x),$


which follows from the ordinary arithmetic used in combining functions.

Putting the operator Q and the differential operator D in the single equation
Du= Qu gives an example of a differential equation:

Du= Qu or u' = xu,


where we have chosen q(x) = x for concreteness. It's straightforward to verify that
for a constant k the function

$u(x) = ke^{x^2/2}$

satisfies the equation. The preceding formula gives all solutions, as we show
in Chapter 10, Section 3. In this example the domain of $D$ is the subspace of
$C(-\infty, \infty)$ consisting of the continuously differentiable functions.

EXAMPLE 5 In this example we again use an m-by-n matrix to describe a linear function, but it
is not the same as Example 1. For one thing, the domain is not $\mathbb{R}^n$ and the range is
not $\mathbb{R}^m$. Let $M_{n,p}$ be the vector space of n-by-p matrices discussed in Example 2
of the previous section. If $A$ is a fixed m-by-n matrix, then for each n-by-p matrix
$M$, the product $AM$ is defined and is an m-by-p matrix. Thus we obtain a function
$f_A: M_{n,p} \to M_{m,p}$ by defining

fA(M) = AM.

The function $f_A$ is linear because of the properties of matrix multiplication given in
Theorem 3.2 of Chapter 2. By the right distributive law in that theorem,

$f_A(M + N) = A(M + N) = AM + AN = f_A(M) + f_A(N),$

and by the scalar commutativity law of the same theorem,

$f_A(rM) = A(rM) = r(AM) = rf_A(M).$

Hence $f_A$ is a linear function.
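
The two identities for $f_A$ can be checked on a small instance. This is a hedged sketch in plain Python, assuming that matrices are stored as lists of rows; the particular matrices `A`, `M`, `N` and the helper names are illustrative choices, not taken from the text.

```python
def matmul(A, M):
    """Product of an m-by-n matrix A and an n-by-p matrix M (lists of rows)."""
    return [[sum(A[i][k] * M[k][j] for k in range(len(M)))
             for j in range(len(M[0]))]
            for i in range(len(A))]

def add(M, N):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(rm, rn)] for rm, rn in zip(M, N)]

def scale(r, M):
    """Scalar multiple r*M."""
    return [[r * a for a in row] for row in M]

A = [[1, 2, 0], [0, -1, 3]]        # fixed 2-by-3 matrix (m = 2, n = 3)
M = [[1, 0], [2, 1], [0, 4]]       # 3-by-2 matrices (n = 3, p = 2)
N = [[0, 5], [-1, 1], [3, 0]]

def f_A(X):
    """The function f_A(X) = AX from 3-by-2 matrices to 2-by-2 matrices."""
    return matmul(A, X)

additive = f_A(add(M, N)) == add(f_A(M), f_A(N))
homogeneous = f_A(scale(7, M)) == scale(7, f_A(M))
```

Integer entries are used so that the comparisons are exact; both flags come out true, as the distributive and scalar laws predict.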

In the preceding examples, formal verification that the transformations were linear
was straightforward, and in the future we'll often leave such routine checks to the
reader.
A formal proof that a function $f$ is not linear involves finding some $x$ and $y$ in
its domain such that $f(x + y) \neq f(x) + f(y)$, or a scalar $r$ and an $x$ in the domain
of $f$ such that $f(rx) \neq rf(x)$. For example, $f: \mathbb{R}^1 \to \mathbb{R}^1$ defined by $f(x) = x^2$
certainly looks nonlinear. To prove that $f(x) = x^2$ is not linear, it's enough to note,
for instance, that $f(1 + 1) = f(2) = 4$ while $f(1) + f(1) = 1 + 1 = 2 \neq 4$, or
$f((-1)(3)) = f(-3) = 9$ while $-f(3) = -9 \neq 9$. Usually a function that looks
nonlinear is nonlinear, but care is sometimes required, as in the next example.

EXAMPLE 6 Define $f: \mathbb{R}^2 \to \mathbb{R}^2$ by $f(x, y) = (3x - y, (x + y + 2)^2 - (x + y)^2 - 4)$. The function $f$
appears nonlinear at first glance, but a second look shows that $(x + y + 2)^2 - (x + y)^2 - 4$
simplifies to $4x + 4y$, so $f$ is linear after all.
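
Both observations, that $x^2$ fails additivity and that the formula in this example collapses to a linear one, can be spot-checked. The sketch below is illustrative only; the function names and the grid of test points are assumptions made for the check.

```python
def square(x):
    """The function f(x) = x**2 from the preceding paragraph."""
    return x ** 2

# Additivity fails for x**2: f(1 + 1) = 4 but f(1) + f(1) = 2.
additivity_fails = square(1 + 1) != square(1) + square(1)

def f(x, y):
    """The formula as written in the example."""
    return (3 * x - y, (x + y + 2) ** 2 - (x + y) ** 2 - 4)

def f_simplified(x, y):
    """The same function after expanding the squares."""
    return (3 * x - y, 4 * x + 4 * y)

# Compare the two formulas on a grid of integer points.
points = [(a, b) for a in range(-3, 4) for b in range(-3, 4)]
agree = all(f(a, b) == f_simplified(a, b) for a, b in points)
```

Agreement on a grid is evidence rather than a proof; the proof is the algebraic expansion $(s + 2)^2 - s^2 - 4 = 4s$ with $s = x + y$.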
The previous example is rather artificial. A more natural situation is to have a
family of functions that are nonlinear in general but linear in exceptional cases that
may be overlooked. For instance, $ax^2 + bx$ is nonlinear only if $a \neq 0$. The exact
domain of a function may also make a difference, as in the following example.

EXAMPLE 7 Let $f: \mathbb{R}^2 \to \mathbb{R}^2$ be the function defined by the second-degree formula
$f(x, y) = ((x + 1)^2 - (y + 1)^2, 3x^2 + 5xy + 2y^2)$. It certainly looks nonlinear, and we leave
it to the reader to check that it is.
Now let $V$ be the subspace of $\mathbb{R}^2$ consisting of all scalar multiples of $(1, -1)$,
and define another function $g: V \to \mathbb{R}^2$ given by the same formula as $f$, but with
domain restricted to $V$. For $(x, y)$ in $V$, we have $y = -x$, so for $(x, y)$ in $V$,

$g(x, y) = f(x, -x) = ((x + 1)^2 - (-x + 1)^2, 3x^2 - 5x^2 + 2x^2) = (4x, 0),$


and g is linear on its domain.

3B Composition and Linear Combination


Composing linear functions by applying first one and then another is a way of
producing many other examples of linear functions. The composition g o f of g and
f has been defined by

$(g \circ f)(x) = g(f(x))$

whenever the right side is defined. According to Theorem 1.5 in Section 1 of this
chapter, the composition of linear functions $g: \mathbb{R}^m \to \mathbb{R}^p$ and $f: \mathbb{R}^n \to \mathbb{R}^m$ that have
standard coordinate spaces for domain and range corresponds to matrix multiplication,
so that if $f(x) = Ax$ and $g(y) = By$, then $(g \circ f)(x) = BAx$. It then follows
from Theorem 2.3 in Chapter 2, Section 2C that g o f is linear. The same conclusion
holds for linear functions on vector spaces in general.
3.4 Theorem. If $f: U \to V$ and $g: V \to W$ are linear functions, then their composition
$g \circ f: U \to W$, defined by $(g \circ f)(x) = g(f(x))$, is also linear.

Proof. We check the linearity of $g \circ f$ by the following calculation, using first the
linearity of $f$ and then the linearity of $g$.

$(g \circ f)(rx + sy) = g(f(rx + sy))$
$= g(rf(x) + sf(y))$
$= rg(f(x)) + sg(f(y)) = r(g \circ f)(x) + s(g \circ f)(y)$ •
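
For functions given by matrices, the statement that composition corresponds to the product $BA$ can be illustrated on a small case. This is a sketch under stated assumptions: the matrices `A` and `B` and the test vector are arbitrary illustrative choices, and matrices are stored as lists of rows.

```python
def matvec(A, x):
    """Apply the matrix A (list of rows) to the vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def matmul(B, A):
    """Matrix product BA."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))]
            for i in range(len(B))]

A = [[1, 2], [0, 1], [3, -1]]     # f(x) = Ax maps R^2 to R^3
B = [[2, 0, 1], [1, -1, 0]]       # g(y) = By maps R^3 to R^2

def f(x):
    return matvec(A, x)

def g(y):
    return matvec(B, y)

BA = matmul(B, A)                 # matrix of the composition g o f
x = [4, -5]
composed = g(f(x))                # apply f first, then g
via_product = matvec(BA, x)       # apply the single matrix BA
```

The two results coincide, which is the matrix form of $(g \circ f)(x) = BAx$; note that the order of the factors is $BA$, not $AB$.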

If $f: V \to V$ is a function whose domain and range are the same space, then
$f \circ f$ is defined. We often write $f^2$ for $f \circ f$. Since $f^2$ is again a function from $V$ to
$V$, we can also define $f^3 = f \circ f^2$, and so on. For instance, we write $D^2$ instead of
$D \circ D$ for the second derivative operator, so $D^2 f$ means the same as $f''$.
For functions with the same domain $S$ and the same vector space for their range,
sums and scalar multiples are naturally defined by

$(f + g)(x) = f(x) + g(x)$ and $(rf)(x) = r(f(x))$,



whether or not S is a vector space. When the domain of the functions is a vector
space, and the functions are linear, we have the following:

3.5 Theorem. If $f: V \to W$ and $g: V \to W$ are linear functions, then the sum
$f + g: V \to W$ and scalar multiple $rf: V \to W$ are linear also.
The proof amounts to checking linearity by using the definition in much the same
way as in the proof of Theorem 3.4, and we leave it as an exercise.

EXAMPLE 8 We define linear differential operators using both composition and linear combination
of transformations. For example, suppose that $p(x)$, $q(x)$, and $r(x)$ are continuous
functions, and $D$ is the differentiation operator. Then

$p(x)D^2 + q(x)D + r(x)$

acts on twice continuously differentiable functions $u(x)$ as a linear transformation $L$
from $C^{(2)}(-\infty, \infty)$ to $C(-\infty, \infty)$, by

$L(u) = pu'' + qu' + ru.$

Typically, we specify a continuous function $f$ and ask for a solution $u$ in
$C^{(2)}(-\infty, \infty)$ that satisfies $L(u) = f$.

3C Inverse Functions
Recall from Section 1 that a function $f: \mathbb{R}^n \to \mathbb{R}^m$ has an inverse function if there
is a function $f^{-1}$ whose domain is the image $F$ of $f$ in $\mathbb{R}^m$, such that

$f^{-1}(f(x)) = x$ and $f(f^{-1}(y)) = y$

for every $x$ in the domain of $f$ and for every $y$ in the image set $F$ of $f$. Thus $f$ has
an inverse precisely when $f$ is one-to-one; hence we have, by Theorem 3.3:

FIGURE 3.7 Image line. Panels (a) and (b).

3.6 Theorem. If $f$ is linear then $f^{-1}$ exists if and only if $f(x) = 0$ is satisfied
only by $x = 0$.

The following theorem generalizes Theorem 1.7.

3.7 Theorem. If $f: V \to W$ is linear then its image $F$ is a subspace of $W$, and
if $f$ is one-to-one then $f^{-1}$ is also linear.

Proof. If $x$ and $y$ are in the image $F$ of $f$, which is the domain of $f^{-1}$, then there
exist $u$ and $v$ in $V$ such that $f(u) = x$ and $f(v) = y$, so $u = f^{-1}(x)$ and $v = f^{-1}(y)$.
Because $f$ is linear, if $s$ and $t$ are scalars, $sx + ty = f(su + tv)$, so $sx + ty$ is also
in $F$. This shows that the domain of $f^{-1}$ is a vector subspace of $W$. The range of
$f^{-1}$ is the vector space $V$. Apply $f^{-1}$ to both sides of $sx + ty = f(su + tv)$ to get

$f^{-1}(sx + ty) = su + tv = sf^{-1}(x) + tf^{-1}(y),$

so $f^{-1}$ is linear. •
Here are some examples of linear functions that have inverses.
EXAMPLE 9 Let $f: \mathbb{R}^2 \to \mathbb{R}^2$ be the linear function defined for vectors $x$ in $\mathbb{R}^2$ by

f (x) = Ax, where A =( f ~).


Theorem 4.1 of Chapter 2 tells us that $A$ has an inverse matrix $A^{-1}$
with the property that $AA^{-1} = A^{-1}A = I$. It follows immediately that $f^{-1}$ exists
and $f^{-1}(x) = A^{-1}x$.

EXAMPLE 10 Let $f: \mathbb{R} \to \mathbb{R}^2$ be defined by $f(t) = (t, t)$. Then $f$ is linear and its image is the line
in $\mathbb{R}^2$ with equation $x = y$, as shown in Figure 3.7(b). Thus in this case the image of
$f$ is a proper subspace $V$ of $\mathbb{R}^2$, that is, one that is not all of $\mathbb{R}^2$. Since the equation
$f(t) = (0, 0)$ is satisfied only by $t = 0$, the function $f$ is one-to-one, so has inverse
$f^{-1}$, given by $f^{-1}(t, t) = t$ for all vectors $(t, t)$ in $V$.

EXAMPLE 11 The function $S$ from $C[0, 1]$ to $C[0, 1]$ defined by $Su(x) = \int_0^x u(t)\,dt$ has an
image consisting of the continuously differentiable functions $u$ in $C[0, 1]$ for which
$u(0) = 0$. $S$ is one-to-one because $Su$ is identically zero only if $u$ is also. According
to the fundamental theorem of calculus, the inverse of $S$ is the differentiation operator
$D$ restricted to the functions $u$ such that $u(0) = 0$ and $du/dx$ is continuous.

EXERCISES

In Exercises 1 to 4, a value of $n$ and some information about a linear function
$f: \mathbb{R}^n \to \mathbb{R}^n$ are given. In each case find the matrix $A$ such that $f(x) = Ax$ for all $x$
in $\mathbb{R}^n$.
1. $n = 2$, $f(e_1) = (1, 2)$, $f(e_2) = (2, 1)$
2. $n = 3$, $f(e_1) = (1, 2, 0)$, $f(e_2) = (-1, 2, 0)$, $f(e_3) = (0, 0, 1)$
3. $n = 2$, $f(1, 1) = (1, 2)$, $f(2, 1) = (2, 1)$
4. $n = 3$, $f(e_1) = e_2$, $f(e_2) = 2e_3$, $f(e_3) = 3e_1$

Each of Exercises 5 to 8 defines a function from $\mathbb{R}^\infty$ to $\mathbb{R}^\infty$, where $\mathbb{R}^\infty$ is the
vector space of sequences $(x_k)$, $k = 1, 2, 3, \ldots$ of Example 3 in Section 2. In each
case, show that the function is linear and state whether the function is one-to-one or not.
If it is one-to-one then describe its inverse and the domain of the inverse.
5. $f(x_1, x_2, x_3, \ldots) = 2(x_1, x_2, x_3, \ldots)$
6. $g(x_1, x_2, x_3, \ldots) = (x_1, 2x_2, 3x_3, \ldots)$
7. $h(x_1, x_2, x_3, \ldots) = (x_2, x_3, x_4, \ldots)$
8. $p(x_1, x_2, x_3, \ldots) = (0, x_1, x_2, x_3, \ldots)$

In Exercises 9 to 12, determine the effect on a sequence $(x_1, x_2, x_3, \ldots)$ of the given
combinations of the functions defined in Exercises 5 to 8.
9. $f \circ g$ and $g \circ f$
10. $g \circ h$ and $h \circ g$
11. $g \circ p$ and $p \circ g$
12. $h \circ p$ and $p \circ h$

13. In analogy with Example 5 of the text, define for each fixed m-by-n matrix $B$, the
function $g_B: M_{q,m} \to M_{q,n}$ by $g_B(M) = MB$. Show that $g_B$ is linear.
14. If $A$ is m-by-p and $B$ is q-by-n, what are the domain and range of $h_{A,B}$ as defined
by $h_{A,B}(M) = AMB$ for all $M$ in $M_{p,q}$? Is $h_{A,B}$ linear?

In Exercises 15 and 16, let $D$ be the differentiation operator $d/dx$. For each given
function $u(x)$, find $Du(x)$, $xu(x)$, $D(xu(x))$, and $xDu(x)$.
15. $u(x) = 2x^3 - 4x$
16. $u(x) = e^{3x}$

17. Let $D = d/dx$ act as a transformation from $C^{(1)}(-\infty, \infty)$ to $C(-\infty, \infty)$.
(a) If $u(x) = 2x^3$, find $(Dx - xD)u(x)$, where the operator $Dx$ first multiplies by $x$
and then applies $D$.
(b) Show that $Dx - xD = I$, where $I$ is the identity operator defined by $Iu = u$ for all $u$.
(c) Is $D^2 - x^2$ equal to $(D + x)(D - x)$? To find out, apply both operators to a general
function $u(x)$ in $C^{(2)}(-\infty, \infty)$ and see if you get the same result.
18. (a) Show that the equation
$e^x(D + 1)u(x) = De^xu(x)$
is satisfied by all functions $u$ in $C^{(1)}(-\infty, \infty)$.
(b) Show that the equation
$(D + 1)u(x) = 0$
is satisfied by all functions of the form $u(x) = ce^{-x}$, where $c$ is a constant.
(c) Show that the equation $(D + 1)u(x) = 0$ has only the solutions given in part (b).

In Exercises 19 and 20, show that the given function $S: C[0, 1] \to C[0, 1]$ is linear.
19. $Su(x) = \int_0^x u(t)\,dt$
20. $Su(x) = \int_0^x e^{-t}u(t)\,dt$

In Exercises 21 to 25, $L: V \to W$ is a linear function from some specified vector space
to another. Use the given information to answer the questions.
21. $L: \mathbb{R}^2 \to \mathbb{R}$, $L(u_0) = 1$, $L(u_1) = -2$. What is $L(3u_0 - 4u_1)$?
22. $L: \mathbb{R}^2 \to \mathbb{R}^2$, $L(1, 2) = (2, 3)$, $L(-1, 1) = (1, -1)$. Find a vector $u$ in $\mathbb{R}^2$
such that $Lu = (3, 7)$.
23. $L: \mathbb{R}^3 \to \mathbb{R}^2$, $L(e_1) = (1, 2)$, $L(e_2) = (-1, 0)$, $L(e_3) = (2, 2)$. What is $L(-1, 3, 2)$?
24. $L: C[0, 1] \to C[0, 1]$, $L(1) = 1$, $L(x) = x$, $L(x^2) = x^2 + 2$. What is $L(2x^2 + x - 1)$?
25. $L: C[0, 1] \to C[0, 1]$, $L(1) = x$, $L(x) = x^2$. What is $L(2x + 3)$?

In Exercises 26 to 29, $L: V \to V$ is a linear function from a specified vector space $V$
to itself. Use the given information to answer the questions.
26. $L: \mathbb{R}^2 \to \mathbb{R}^2$, $L(x, y) = (x + 2y, 2x + 4y)$. Is there an $(x, y)$ such that
$L(x, y) = (-1, 2)$?
27. $L: C^{(1)}[0, 1] \to C[0, 1]$, $Lf = f' - 2f$. Is $f(x) = |x|$ in the domain of $L$?
28. $L: C^{(2)}[0, 1] \to C[0, 1]$, $Lf(x) = 2f''(x)$. Is there an $f$ in the domain of $L$ such
that $Lf(x) = x^2$?
29. $L: C[0, 1] \to C[0, 1]$, $Lf(x) = xf(x)$. Is there an $f$ in the domain of $L$ such that
$Lf(x) = x^2 + 1$?

In Exercises 30 to 35, for the given linear function, determine whether it has an inverse
function; if it has, describe the inverse by using a matrix, or in some other way. Specify
the domain of the inverse function.
30. $f: \mathbb{R}^2 \to \mathbb{R}^2$, $f(x, y) = ($ … $)(x, y)$
31. $f: \mathbb{R}^2 \to \mathbb{R}^2$, $f(x, y) = ($ … $)(x, y)$
32. $f: \mathbb{R} \to \mathbb{R}^2$, $f(x) = (x, -2x)$
*33. $f: V \to \mathbb{R}^2$, $f(x, y, z) = ($ … $)(x, y, z)$, where $V$ is the subspace of $\mathbb{R}^3$
consisting of all linear combinations of $(1, 1, 1)$ and $(1, 2, 3)$.
*34. $D: V \to C(-\infty, \infty)$, where $Du = u'$, and $V$ is the subspace of $C^{(1)}(-\infty, \infty)$
consisting of the continuously differentiable functions with $u(0) = 0$
*35. $D: V \to C(-\infty, \infty)$, where $Du = u'$, and $V$ is the subspace of $C^{(1)}(-\infty, \infty)$
consisting of the continuously differentiable functions with $u(0) = u(1)$
36. Prove Theorem 3.5: If $f$ and $g$ are linear, so are the sum $f + g$ and scalar multiple
$rf$, for a scalar $r$.

SECTION 4 IMAGE AND NULL-SPACE


4A Image
Recall from Section 1 that for a function f and a subset E of the domain of f, the
set of values taken on by f as u ranges over E is called the image of E under f
and denoted by $f(E)$. The image of the entire domain of $f$ is the image of $f$.
The images of linear functions are themselves vector spaces, as the following
theorem shows.
4.1 Theorem. Let $f: V \to W$ be linear and let $U$ be a subspace of $V$. Then
$f(U)$ is a subspace of $W$. In particular, $f(V)$, the image of $f$, is a subspace of $W$.

Proof. If $x$ and $y$ are in $f(U)$, then there are vectors $u$ and $v$ in $U$ such that
$f(u) = x$ and $f(v) = y$. For scalars $r$ and $s$, $ru + sv$ is also in $U$ because $U$ is a
subspace, so $f(ru + sv)$ is in $f(U)$. Because $f$ is linear,

$f(ru + sv) = rf(u) + sf(v) = rx + sy,$

so $rx + sy$ is in $f(U)$. We have shown that $rx + sy$ is in $f(U)$ whenever $x$ and $y$
are, so $f(U)$ is a subspace of $W$. •
FIGURE 3.8 Image of a square. Panels (a) and (b).

EXAMPLE 1 We already know a whole class of linear functions that illustrate Theorem 4.1. If
$f: \mathbb{R}^n \to \mathbb{R}^m$ is linear then by Theorem 1.4, its image is the span of the columns
of the matrix that represents $f$ and is therefore a subspace, by Theorem 2.4. For
example, if

$f\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & -1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix} = u\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + v\begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix},$

then the image $F$ of $f$ is the plane spanned by $(1, 1, 1)$ and $(1, -1, 1)$. Figure 3.8(b)
shows the images in $\mathbb{R}^3$ of the standard basis vectors $e_1$ and $e_2$ in $\mathbb{R}^2$. The shaded
parallelogram is the image in $\mathbb{R}^3$ of the square in $\mathbb{R}^2$ with opposite corners at $(0, 0)$
and $(1, 1)$ shown in part (a) of the figure.
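
The image of the unit square can be computed directly from the formula $f(u, v) = u(1, 1, 1) + v(1, -1, 1)$. The following is a small illustrative sketch; the variable names and the choice to test the plane equation $x = z$ are assumptions of the sketch, not statements from the text.

```python
def f(u, v):
    """f(u, v) = u*(1, 1, 1) + v*(1, -1, 1), written out by coordinates."""
    return (u + v, u - v, u + v)

# Corners of the square with opposite corners at (0, 0) and (1, 1).
corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
images = [f(u, v) for u, v in corners]

# The plane spanned by (1, 1, 1) and (1, -1, 1) is the set of points with
# first coordinate equal to third coordinate; every image point lies on it.
in_plane = all(p[0] == p[2] for p in images)

# The image of the corner (1, 1) is the sum of the images of e1 and e2,
# which is why the image of the square is a parallelogram.
e1_image, e2_image = f(1, 0), f(0, 1)
corner_sum = tuple(a + b for a, b in zip(e1_image, e2_image))
```

The four image points are the vertices of the shaded parallelogram described in the text.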

EXAMPLE 2 In this example we have to use other reasoning to find the image. If $D = d/dx$ acts in
$C^{(1)}(-\infty, \infty)$, the space of continuously differentiable functions, then the image of
a single function $u$ in $C^{(1)}(-\infty, \infty)$ is a continuous function $v$ in $C(-\infty, \infty)$. The
image of $D$ is all of $C(-\infty, \infty)$, because if $v$ is an arbitrary element of $C(-\infty, \infty)$,
then

$u(x) = \int_0^x v(t)\,dt$

defines an element of $C^{(1)}(-\infty, \infty)$ such that $Du = v$.
4B Null-Space
Besides the image, another important subspace associated with a linear function
$f: U \to V$ is the subset of its domain consisting of the vectors $u$ in $U$ such that
$f(u) = 0$. As will be shown in Theorem 4.2, it is a subspace of the domain of $f$. It
is called the null-space of $f$.
We look at some examples of null-spaces before giving the formal theorem.

EXAMPLE 3 The formula $f(x, y) = x + 2y$ defines a linear function from $\mathbb{R}^2$ to $\mathbb{R}$. The null-space
of $f$ consists of all vectors $(x, y)$ in $\mathbb{R}^2$ such that $x + 2y = 0$. These vectors form a
line through the origin of slope $-\tfrac{1}{2}$, spanned by, for example, the vector $(2, -1)$.

EXAMPLE 4 For each fixed vector $a$ in $\mathbb{R}^3$, the formula $f(x) = a \cdot x$ defines a linear function
$f: \mathbb{R}^3 \to \mathbb{R}$. If $a$ is the zero vector, then the null-space of $f$ is the entire domain $\mathbb{R}^3$.
If $a \neq 0$, then the null-space of $f$ is a plane through the origin in $\mathbb{R}^3$ consisting of
all vectors perpendicular to $a$.

EXAMPLE 5 Finding the null-space of a linear function given by a matrix amounts to solving
a homogeneous system of equations. For example, suppose $f: \mathbb{R}^2 \to \mathbb{R}^2$ is
$f(x) = Ax$ with 2-by-2 matrix

$A = \begin{pmatrix} 1 & 4 \\ 3 & 12 \end{pmatrix}.$

To find the null-space $N$ of $f$, we solve the system $Ax = 0$ in the form

$x + 4y = 0$
$3x + 12y = 0,$

by the row-reduction method of Chapter 2 (or by inspection if you notice that the
second equation is just 3 times the first). The solutions are of the form $(x, y) =
t(-4, 1)$, where $t$ ranges over all real numbers. In other words, $N$ is a line through
the origin in $\mathbb{R}^2$ with slope $-\tfrac{1}{4}$.
EXAMPLE 6 In this example, finding the null-space requires knowledge of calculus. If $D =
d/dx$ acts on $C^{(1)}(-\infty, \infty)$, the null-space of $D$ consists of all functions with
derivative identically zero. It is a theorem of calculus that a function has derivative
0 on an interval if and only if the function is constant on the interval. Hence the
null-space of $D$ is the subspace consisting of constant functions. Since sums and
multiples of constant functions are constant, they do indeed form a vector subspace
of $C^{(1)}(-\infty, \infty)$.
For a second example with the same domain and range, if multiplication by $x$
produces a continuous function $xu(x)$ that is identically zero, then $u$ must have
been identically zero. It follows that the operation of multiplication by $x$, acting on
$C(-\infty, \infty)$, has its null-space consisting of the zero function in $C(-\infty, \infty)$.

Here is the formal statement of the theorem illustrated in Examples 3 to 6.

4.2 Theorem. Let $f: V \to W$ be linear and let $N$ be the set of all $v$ in $V$ such
that $f(v) = 0$. Then $N$ is a subspace of $V$.

Proof. Let $u$ and $v$ be vectors in $N$. Then for a given linear combination $ru + sv$
we have

$f(ru + sv) = rf(u) + sf(v) = r0 + s0 = 0.$

Thus $ru + sv$ is also in $N$, so $N$ is a subspace of $V$. •

The following theorem is a criterion for a linear function $f$ to be one-to-one in
terms of the null-space of $f$. It is simply a restatement of Theorem 3.3.

4.3 Theorem. A linear function is one-to-one if and only if its null-space is the
zero subspace consisting of the vector $0$ alone.

EXAMPLE 7 The integration operator defined on $C(-\infty, \infty)$ by

$v(x) = \int_0^x u(t)\,dt$

produces a function $v$ in $C(-\infty, \infty)$. Furthermore, if $v$ is identically zero, it follows
that its derivative $u$ is also identically zero. That is, the null-space of the integration
operator consists of zero alone, so the operator is one-to-one.

4C Nonhomogeneous Equations
The null-space of a linear function $f$ plays a central role in the description of all
solutions of the equation

$f(x) = b,$

where $b$ is a fixed vector in the image of $f$. The equation

$f(x) = 0$

is called the associated homogeneous equation of $f(x) = b$, and the null-space of
$f$ is therefore the set of all solutions of the homogeneous equation.
We used the following theorem for solving linear systems of numerical equations
in Chapter 2; this more generally applicable version has formally the same
proof.

4.4 Theorem. If $x_0$ is an arbitrary solution of the linear equation $f(x) = b$, then
the set $S$ of all solutions consists of all vectors $x_0 + v$, where $v$ ranges over the
solutions of the associated homogeneous equation.

Proof. Suppose that $f(x_0) = b$ and also that $f(u) = b$. Since $f$ is linear,

$f(u - x_0) = f(u) - f(x_0) = b - b = 0.$

It follows that $u - x_0 = v$ for some vector $v$ in the null-space of $f$. But then
$u = x_0 + v$ as we wanted. Conversely, if $f(v) = 0$, then
$f(x_0 + v) = f(x_0) + f(v) = b + 0 = b$, so every vector of the form $x_0 + v$ is a solution. •

EXAMPLE 8 The system

$x + 4y = 5$
$3x + 12y = 15$

has one solution that we can guess by inspection: $(x, y) = (1, 1)$. In Example 5 we
found all the solutions of the associated homogeneous system

$x + 4y = 0$
$3x + 12y = 0$

to be of the form $(x, y) = t(-4, 1)$. Hence all solutions of the given system are of
the form $(x, y) = t(-4, 1) + (1, 1)$.
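
The structure described by Theorem 4.4 can be checked on this system. The sketch below is illustrative and uses only the numbers given in the example; the function names and the range of the parameter $t$ are assumptions made for the check.

```python
A = [[1, 4], [3, 12]]     # coefficient matrix of the system
b = [5, 15]               # right-hand side

def f(x):
    """The linear function f(x) = Ax on R^2."""
    return [row[0] * x[0] + row[1] * x[1] for row in A]

x0 = [1, 1]               # the particular solution found by inspection

def solution(t):
    """The vector x0 + t*(-4, 1)."""
    return [x0[0] - 4 * t, x0[1] + t]

# Every vector t*(-4, 1) solves the homogeneous system ...
homogeneous_ok = all(f([-4 * t, t]) == [0, 0] for t in range(-10, 11))
# ... and every vector x0 + t*(-4, 1) solves the given system.
all_solve = all(f(solution(t)) == b for t in range(-10, 11))

# The difference of two solutions lies in the null-space.
s1, s2 = solution(3), solution(-2)
diff = [s1[0] - s2[0], s1[1] - s2[1]]
diff_in_null_space = f(diff) == [0, 0]
```

Only integer values of $t$ are tested, so this is a consistency check rather than a proof that no other solutions exist; that part is supplied by the theorem.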

EXAMPLE 9 One application of Theorem 4.4 is very familiar in elementary calculus. Suppose we
want to solve the linear equation

$Dy = g,$

where $D = d/dx$ stands for differentiation and $g$ is a continuous function on some
interval. If $G$ satisfies $G'(x) = g(x)$, then $y_0(x) = G(x)$ defines one solution. Since
the null-space $N$ of $D$ consists of all functions $v$ such that $v'(x)$ is identically zero,
$N$ consists of the constant functions $v(x) = C$. Thus every solution of the differential
equation has the form

$y(x) = y_0(x) + v(x) = G(x) + C.$

EXERCISES

In Exercises 1 to 6, a function $f: \mathbb{R}^n \to \mathbb{R}^m$ is specified by a formula. In each case
state which $\mathbb{R}^m$ is the range of $f$, and describe the image of $f$ in $\mathbb{R}^m$. Is it a subspace
of $\mathbb{R}^m$? Also state whether or not the function is linear, and if it is linear, find its
null-space.
1. $f(x, y) = (x, y, x + y)$
2. $f(t) = (t, 2t, 3t)$
3. $f(u, v) = (u, v, 2u + v + 1)$
4. $f(x, y) = (x + y, x - y)$
5. $f(x, y, z) = (x + 4y + 3z, 2x + 5y + 4z)$
6. $f(t) = (t, 0, 1)$

In each of Exercises 7 to 10, describe carefully the image of the given transformation $F$,
state whether the function is linear, and if it is linear, describe its null-space.
7. $F: C(-\infty, \infty) \to C(-\infty, \infty)$, where $F(u)(x) = u(x) + x$.
8. $F: C(-\infty, \infty) \to C(-\infty, \infty)$, where $F(u)(x) = e^{u(x)}$.
9. $F: C(-\infty, \infty) \to C^{(1)}(-\infty, \infty)$, where $F(u)(x) = \int_0^x e^{-t}u(t)\,dt$.
10. $F: C^{(1)}(-\infty, \infty) \to C(-\infty, \infty)$, where $F(u)(x) = u'(x) + u(x)$.

In Exercises 11 to 14, describe the image and the null-space of the function defined by
$f(x) = Ax$ for the given matrix $A$.
11. $A = ($ … $)$
12. $A = ($ … $)$
13. $A = ($ … $)$
14. $A = ($ … $)$

15. (a) Find all solutions of the homogeneous equation $2x - 5y = 0$.
(b) Verify that a linear combination of two solutions is also a solution.
(c) Find a single solution of the nonhomogeneous equation $2x - 5y = 7$; then use
Theorem 4.4 to represent all solutions of the nonhomogeneous equation.
16. Let $f: \mathbb{R}^n \to \mathbb{R}^m$ be linear.
(a) If $f$ is not identically zero, show that the image of $f$ contains a line through the origin.
(b) If $n > m$, show that the null-space of $f$ contains a line through the origin.
17. Show that the null-space of a linear function $f: \mathbb{R}^n \to \mathbb{R}$ is the set of all vectors
orthogonal to some fixed vector $x_0$ in $\mathbb{R}^n$.
18. (a) Find all solutions of the pair of homogeneous equations
$x + y + z = 0$
$x - y + z = 0.$
(b) Verify that a linear combination of two solutions is also a solution.
(c) Verify that $(x, y, z) = (1, 1, -1)$ is a solution of the pair of nonhomogeneous equations
$x + y + z = 1$
$x - y + z = -1.$
Then use Theorem 4.4 to represent all solutions of the pair of nonhomogeneous equations.
19. Consider the homogeneous differential equation $(D - 1)y = 0$.
(a) Verify that the operator $(D - 1): C^{(1)}(-\infty, \infty) \to C(-\infty, \infty)$ is linear.
(b) Chapter 10, Section 3 shows that all solutions of $(D - 1)y = 0$ have the form
$y(x) = ce^x$ for some constant $c$. Verify that $y(x) = 1 + x$ is a solution of the
nonhomogeneous equation $(D - 1)y = -x$, and use Theorem 4.4 to represent all solutions
of the nonhomogeneous equation.
20. (a) Verify that for each constant $c$ the function $y(x) = \tfrac{1}{27}(x + c)^3$ is a solution of
the differential equation $Dy - y^{2/3} = 0$.
(b) Verify that a linear combination of two solutions may not be a solution.
(c) The conclusion of Theorem 4.4 doesn't hold for $f(y) = Dy - y^{2/3}$. Why doesn't the
theorem apply to $f(y)$?
21. Define a function $G$ from $C(-\infty, \infty)$ to $C(-\infty, \infty)$ by

$Gu(x) = \int_0^x tu(t)\,dt.$

(a) Show that $G$ is linear.
(b) Show that $G$ is one-to-one.
(c) Describe the image under $G$ of the subspace of $\mathcal{P}$ consisting of all polynomials
$p(x) = a_0 + a_1x + \cdots + a_nx^n$ of degree $\leq n$.
(d) Describe the image under $G$ of all of $\mathcal{P}$.
(e) Describe the inverse of $G$.
(f) Find an element of $\mathcal{P}$ that is not in the image of $G$.
22. If $F$ is a function from a set $\mathcal{A}$ to a set $\mathcal{B}$, and $B$ is a subset of $\mathcal{B}$, then the
inverse image of $B$ under $F$, denoted by $F^{-1}(B)$, is defined to be the set of all $a$ in $\mathcal{A}$
for which $F(a)$ is in $B$. Show that if $F$ is a linear function from one vector space to
another, and $U$ is a subspace of its range, then $F^{-1}(U)$ is a subspace of the domain of
$F$. What is the connection with Theorem 4.2?
23. A function $u(x)$ in $C(-\infty, \infty)$, the space of continuous functions on $(-\infty, \infty)$, is
called even if $u(x) = u(-x)$ for all $x$. It is called odd if $u(x) = -u(-x)$ for all $x$.
(For example, $\cos x$ is even and $\sin x$ is odd.) Let $R$ be the operator defined on
$C(-\infty, \infty)$ by $(Ru)(x) = u(-x)$. Let $I$ be the identity operator: $(Iu)(x) = u(x)$.
(a) Show that the graph of $Ru$ is the reflection of the graph of $u$ in the y-axis.
(b) Show that $R$ and $I$ are linear operators and that $R^2 = I$.
(c) Let $F_e = \tfrac{1}{2}(I + R)$. Show that the image of $F_e$ consists of the even functions and
that its null-space consists of the odd functions.
(d) Find the image and null-space of $F_o = I - F_e = \tfrac{1}{2}(I - R)$.
(e) Find $F_e^2$ and $F_o^2$ in terms of $F_e$ and $F_o$.

SECTION 5 COORDINATES AND DIMENSION


We say a line has dimension 1 because it takes one coordinate to specify the position
of a point on it. It takes two coordinates to specify a point in a plane, so a plane has
two dimensions. Similarly, solid space is 3-dimensional. Although our direct spatial
experience doesn't go beyond three dimensions, it seems appropriate to say that $\mathbb{R}^n$
has dimension $n$ because it takes an n-tuple of coordinates to specify a point in it.
This section has two main goals. One is to show how to introduce coordinates and
use them to do computations with vectors and linear functions in vector spaces other
than $\mathbb{R}^n$. The other is to show how to define dimension for a vector space for which
we lack immediate geometric intuition. The goals are related because both depend
on first generalizing the standard basis $\{e_1, \ldots, e_n\}$ in $\mathbb{R}^n$.
5A Bases and Coordinates
Recall that if S is a subset of a vector space V, the span of S is defined to consist of
all vectors w that are linear combinations of vectors in S. We showed in Theorem 2.4
of Section 2 that the span of S is always a subspace of V. If the span of S is all of
V we say that S spans V or is a spanning set for V.

EXAMPLE 1 Figure 3.9 shows the span of each of three different sets of vectors in $\mathbb{R}^3$. In
Figure 3.9(a), the span of $x_1$ and $x_2$ is the line containing those two vectors. In
Figure 3.9(b), the span of $y_1$ and $y_2$ is the plane through the origin containing $y_1$, $y_2$.
In Figure 3.9(c), the vector $y_3 = y_1 + y_2$ adds nothing to the span of $y_1$ and $y_2$ because
$y_1 + y_2$ is already in the span. Thus the span of $x_1$ and $x_2$ looks one dimensional.
The span of $y_1$ and $y_2$ looks two dimensional, and since $y_3$ is a linear combination
of $y_1$ and $y_2$, the span of $y_1$, $y_2$, and $y_3$ looks the same. The span of $e_1$, $e_2$, and $e_3$
looks three dimensional.

Recall from Definitions 2.7 and 2.7' of Chapter 2 that a set of vectors in $\mathbb{R}^n$
is linearly independent if no single vector in the set is a linear combination of
other vectors in the set, or equivalently, if the only way to express $0$ as a linear
combination of vectors in the set is by taking all the coefficients equal to zero.
The definitions of spanning set and linearly independent set make sense for general
vector spaces, so we can make the following definitions.

5.1 Definition. A basis for a vector space V is a set of vectors in V that is linearly
independent and spans V.

5.2 Definition. If $V$ has a finite basis $\{b_1, \ldots, b_n\}$ consisting of $n$ vectors, then
$V$ has dimension $n$, written $\dim(V) = n$. If $V$ consists of the zero vector alone we
define $\dim(V) = 0$. If $V$ isn't spanned by a finite set then $V$ is infinite dimensional.

Note. It's conceivable that V might have two bases with unequal numbers of ele-
ments. We prove in Section 5C that this can't happen, so if V has a finite basis,
dim(V) is obtained by counting the vectors in an arbitrary finite basis for V, and it
doesn't matter which basis is used.
FIGURE 3.9 Spanning sets. Panels (a), (b), and (c).


EXAMPLE 2 Starting in Chapter 1 we've referred to the vectors $e_1, \ldots, e_n$, where $e_k$ has 1 in
entry $k$ and 0 elsewhere, as the standard basis for $\mathbb{R}^n$. They span $\mathbb{R}^n$ because a
vector $(x_1, \ldots, x_n)$ is equal to the linear combination $x_1e_1 + \cdots + x_ne_n$. They are a
linearly independent set because each $e_k$ has 1 in position $k$ and so can't be a linear
combination of the other $e_i$ because they all have 0 in that position. Consequently
$\{e_1, \ldots, e_n\}$ is a basis for $\mathbb{R}^n$ according to Definition 5.1, and $\mathbb{R}^n$ officially has
dimension $n$ according to Definition 5.2.

EXAMPLE 3 In the vector space $\mathcal{P}_n$ of polynomials of degree at most $n$, the set $S_n$ consisting of
the $n + 1$ polynomials

$p_0(x) = 1, \quad p_1(x) = x, \quad \ldots, \quad p_n(x) = x^n$

is a basis for $\mathcal{P}_n$. If $p(x) = a_0 + a_1x + \cdots + a_nx^n$, then $p(x)$ is a linear combination
of $p_0(x), p_1(x), \ldots, p_n(x)$, so $S_n$ spans $\mathcal{P}_n$. Also, if for all $x$ a linear combination

$q(x) = a_0p_0(x) + \cdots + a_kp_k(x) + \cdots + a_np_n(x) = 0,$

then all coefficients must be zero, because otherwise $q(x)$ would be a polynomial of
degree at most $n$ with more than $n$ roots. Hence $\mathcal{P}_n$ has dimension $n + 1$. The vector
space $\mathcal{P}$ of all polynomials is spanned by the infinite set $\{1, x, x^2, \ldots\}$, but no finite
subset is adequate to span $\mathcal{P}$, so $\mathcal{P}$ is infinite dimensional. The set $\{1, x, x^2, \ldots\}$
is linearly independent as required by Definition 5.1, because every finite subset is
linearly independent.

The next theorem shows that a basis $\{\mathbf{b}_1, \ldots, \mathbf{b}_n\}$ for $V$ generates a one-to-one
correspondence $\mathbf{v} \leftrightarrow (v_1, \ldots, v_n)$ between vectors $\mathbf{v}$ in $V$ and vectors $(v_1, \ldots, v_n)$
in $\mathbb{R}^n$ that is linear, that is, it preserves addition and scalar multiplication. Using a
different basis will produce a different correspondence of the same kind.

5.3 Theorem. Let $B = \{\mathbf{b}_1, \ldots, \mathbf{b}_n\}$ be a basis for the vector space $V$. Then for
every $\mathbf{v}$ in $V$, there are unique scalars $v_1, \ldots, v_n$ such that $\mathbf{v} = v_1\mathbf{b}_1 + \cdots + v_n\mathbf{b}_n$.
The correspondence is linear, that is, if $\mathbf{u} \leftrightarrow (u_1, \ldots, u_n)$ and $\mathbf{v} \leftrightarrow (v_1, \ldots, v_n)$,
then

\[ \mathbf{u} + \mathbf{v} \leftrightarrow (u_1 + v_1, \ldots, u_n + v_n) \quad\text{and}\quad r\mathbf{u} \leftrightarrow (ru_1, \ldots, ru_n). \]

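Proof. Since $B$ spans $V$, every vector $\mathbf{v}$ is some linear combination $v_1\mathbf{b}_1 + \cdots + v_n\mathbf{b}_n$.
To prove uniqueness we have to show that if $\mathbf{v}$ is also equal to $w_1\mathbf{b}_1 + \cdots + w_n\mathbf{b}_n$
then the $v$'s are the same as the $w$'s. If both linear combinations are equal to $\mathbf{v}$, then
their difference is $\mathbf{0}$, so

\[ (v_1\mathbf{b}_1 + \cdots + v_n\mathbf{b}_n) - (w_1\mathbf{b}_1 + \cdots + w_n\mathbf{b}_n) = (v_1 - w_1)\mathbf{b}_1 + \cdots + (v_n - w_n)\mathbf{b}_n = \mathbf{0}. \]

Since $B$ is an independent set, the coefficients $v_k - w_k$ must all be 0, which makes
$w_k = v_k$ for $k = 1, \ldots, n$. To prove linearity of the correspondence we verify two
correspondences:

\[ \mathbf{u} + \mathbf{v} = (u_1\mathbf{b}_1 + \cdots + u_n\mathbf{b}_n) + (v_1\mathbf{b}_1 + \cdots + v_n\mathbf{b}_n) = (u_1 + v_1)\mathbf{b}_1 + \cdots + (u_n + v_n)\mathbf{b}_n \leftrightarrow (u_1 + v_1, \ldots, u_n + v_n), \]

\[ r\mathbf{u} = r(u_1\mathbf{b}_1 + \cdots + u_n\mathbf{b}_n) = ru_1\mathbf{b}_1 + \cdots + ru_n\mathbf{b}_n \leftrightarrow (ru_1, \ldots, ru_n). \] •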

If $\{\mathbf{b}_1, \ldots, \mathbf{b}_n\}$ is a basis, then the unique n-tuple $(x_1, \ldots, x_n)$ such that

\[ \mathbf{x} = x_1\mathbf{b}_1 + \cdots + x_n\mathbf{b}_n \]

is called the n-tuple of coordinates of $\mathbf{x}$ relative to the basis. The coordinate
n-tuple of $\mathbf{x}$ depends on the order of the basis vectors. The n-tuple $(x_1, \ldots, x_n)$ is a
vector in $\mathbb{R}^n$, and is called the coordinate vector of $\mathbf{x}$ relative to the ordered basis
$\{\mathbf{b}_1, \ldots, \mathbf{b}_n\}$. The previous theorem shows that if $V$ has a basis with $n$ elements,
then we can apply what we know about algebra in $\mathbb{R}^n$ to vectors in $V$ by using
the corresponding n-tuples in $\mathbb{R}^n$. In Section 7 we consider products that serve in a
vector space $V$ the way the dot product does in $\mathbb{R}^n$.
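
As a small illustration of coordinates relative to an ordered basis, the following sketch computes the coordinate pair of a vector in the plane relative to a basis of two vectors, using Cramer's rule with exact rational arithmetic. The function name `coordinates_2d` and the sample vectors are chosen here for illustration only; they are not taken from the text.

```python
from fractions import Fraction

def coordinates_2d(b1, b2, x):
    """Return (c1, c2) with x = c1*b1 + c2*b2, found by Cramer's rule.

    b1, b2 and x are pairs of integers; (b1, b2) must be a basis of the plane.
    """
    det = b1[0] * b2[1] - b2[0] * b1[1]
    if det == 0:
        raise ValueError("b1 and b2 are linearly dependent, so they are not a basis")
    c1 = Fraction(x[0] * b2[1] - b2[0] * x[1], det)
    c2 = Fraction(b1[0] * x[1] - x[0] * b1[1], det)
    return c1, c2

# Coordinates of (0, -1) relative to the ordered basis {(1, 1), (1, 2)}.
first_order = coordinates_2d((1, 1), (1, 2), (0, -1))

# Listing the same basis vectors in the opposite order swaps the coordinates.
second_order = coordinates_2d((1, 2), (1, 1), (0, -1))

print(first_order, second_order)
```

The two calls use the same basis vectors in different orders, which is why the coordinate vector is defined relative to an ordered basis.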

EXAMPLE 4 Some vector spaces have natural bases relative to which coordinates are particularly
simple. For example, in $\mathbb{R}^n$ the coordinate vector of the n-tuple $(x_1, \ldots, x_n)$
relative to the standard basis $\{\mathbf{e}_1, \ldots, \mathbf{e}_n\}$ of $\mathbb{R}^n$ is just the n-tuple itself, because
$(x_1, \ldots, x_n) = x_1\mathbf{e}_1 + \cdots + x_n\mathbf{e}_n$.
Similarly, in the space $\mathcal{P}_n$ of polynomials of degree at most $n$, the coordinate vector
of a polynomial $p(x) = a_0 + a_1x + \cdots + a_nx^n$ relative to the basis $\{1, x, \ldots, x^n\}$
of Example 3 is simply the $(n + 1)$-tuple of coefficients, $(a_0, a_1, \ldots, a_n)$.

EXAMPLE 5 Giving a basis for a subspace is often a good way to describe the subspace. For
example, the functions $e^x$ and $e^{-x}$ span the subspace $S$ of $C(-\infty, \infty)$ consisting of
all functions expressible in the form

\[ c_1e^x + c_2e^{-x}, \]

$c_1$ and $c_2$ constant. Neither function is a constant multiple of the other, so they are
linearly independent, and $\{e^x, e^{-x}\}$ is a basis for the 2-dimensional space $S$.
The functions $\sinh x = \tfrac{1}{2}e^x - \tfrac{1}{2}e^{-x}$ and $\cosh x = \tfrac{1}{2}e^x + \tfrac{1}{2}e^{-x}$ are in $S$, so relative
to the basis $\{e^x, e^{-x}\}$ their respective coordinate vectors are $(\tfrac{1}{2}, -\tfrac{1}{2})$
and $(\tfrac{1}{2}, \tfrac{1}{2})$.
Exercise 13 asks you to show that the pair $\{\cosh x, \sinh x\}$ is also a basis for $S$.

5B Linear Functions
Theorem 5.3 shows that by fixing bases in finite-dimensional vector spaces $V$ and $W$
we can calculate the results of vector operations in them by working with coordinate
vectors in $\mathbb{R}^n$ and $\mathbb{R}^m$. The next theorem is a generalization of Theorem 1.3 in Section
1, and it shows how to use coordinates to represent a linear function $f: V \to W$ by
a matrix. In Sections 6 and 7 we take up some ways of finding bases that lead to
particularly simple matrix representations for linear functions we are interested in.

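5.4 Theorem. Let $\mathcal{V} = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ be a basis for $V$ and $\mathcal{W} = \{\mathbf{w}_1, \ldots, \mathbf{w}_m\}$ a
basis for $W$. Let $f$ be a linear function with domain $V$ and range $W$ such that

\[ f(v_1\mathbf{v}_1 + \cdots + v_n\mathbf{v}_n) = w_1\mathbf{w}_1 + \cdots + w_m\mathbf{w}_m. \]

Then the coordinates $v_k$ and $w_k$ are related by

\[ \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix} = \begin{pmatrix} w_1 \\ \vdots \\ w_m \end{pmatrix}, \]

where the $j$th column of the m-by-n matrix consists of the coordinates of $f(\mathbf{v}_j)$
relative to the basis $\mathcal{W}$ as given by $f(\mathbf{v}_j) = a_{1j}\mathbf{w}_1 + \cdots + a_{mj}\mathbf{w}_m$.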

Note. Using standard bases in $V = \mathbb{R}^n$ and $W = \mathbb{R}^m$ we get Theorem 1.3.

Proof. We can combine $f$ with the linear correspondences between vectors and
coordinates given by Theorem 5.3 to obtain a function $F: \mathbb{R}^n \to \mathbb{R}^m$ by defining
$F(v_1, \ldots, v_n)$ to be the coordinate vector $(w_1, \ldots, w_m)$ such that

\[ f(v_1\mathbf{v}_1 + \cdots + v_n\mathbf{v}_n) = w_1\mathbf{w}_1 + \cdots + w_m\mathbf{w}_m. \]

$F$ is then the composition

\[ \mathbb{R}^n \to V \xrightarrow{\;f\;} W \to \mathbb{R}^m \]

and is therefore linear by Theorem 3.4 because it is a composition of linear functions.
All that remains to be done is to check that the matrix $A = (a_{ij})$ of $F$ does the right
thing to the standard basis vectors $\mathbf{e}_1, \ldots, \mathbf{e}_n$. But the $j$th column of $A$ is $A\mathbf{e}_j$, and
by hypothesis this vector is also the coordinate m-tuple $(w_1, \ldots, w_m)$ that represents
$f(\mathbf{v}_j)$ relative to the basis $\mathcal{W}$ and so is equal to $F(\mathbf{e}_j)$. •

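In applications of Theorem 5.4 the spaces $V$ and $W$ are very often the same and we
use the same basis to represent both domain and image vectors. For example, suppose
$V = W = \mathbb{R}^2$ and the single basis is $\{\mathbf{v}_1, \mathbf{v}_2\} = \{(1, 1), (1, 2)\}$. If it's given that

\[ f(1, 1) = (2, 3) \quad\text{and}\quad f(1, 2) = (0, -1), \]

then in terms of $\{\mathbf{v}_1, \mathbf{v}_2\}$ these equations say that

\[ f(\mathbf{v}_1) = \mathbf{v}_1 + \mathbf{v}_2 \quad\text{and}\quad f(\mathbf{v}_2) = \mathbf{v}_1 - \mathbf{v}_2, \]

since $(2, 3) = (1, 1) + (1, 2)$ and $(0, -1) = (1, 1) - (1, 2)$.
Hence relative to $\{\mathbf{v}_1, \mathbf{v}_2\}$-coordinates, $f$ has matrix

\[ A = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \]

For example, $f(2\mathbf{v}_1 - 3\mathbf{v}_2)$ has $\{\mathbf{v}_1, \mathbf{v}_2\}$-coordinates

\[ \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 2 \\ -3 \end{pmatrix} = \begin{pmatrix} -1 \\ 5 \end{pmatrix}. \]

The matrix $A$ has inverse

\[ A^{-1} = \begin{pmatrix} \tfrac{1}{2} & \tfrac{1}{2} \\ \tfrac{1}{2} & -\tfrac{1}{2} \end{pmatrix}, \]

so $f^{-1}$ exists and has matrix $A^{-1}$ relative to the basis $\{\mathbf{v}_1, \mathbf{v}_2\}$.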
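
The computation in this example can be checked mechanically. The sketch below is only an illustration, with helper names (`coords`, `apply`) invented here: it builds the matrix of f relative to the basis {(1, 1), (1, 2)} column by column from the given values f(1, 1) = (2, 3) and f(1, 2) = (0, -1), then applies it to the coordinate vector (2, -3).

```python
from fractions import Fraction

def coords(b1, b2, x):
    """Coordinates (c1, c2) of x relative to the ordered basis (b1, b2) of the plane."""
    det = b1[0] * b2[1] - b2[0] * b1[1]
    return (Fraction(x[0] * b2[1] - b2[0] * x[1], det),
            Fraction(b1[0] * x[1] - x[0] * b1[1], det))

def apply(M, c):
    """Multiply the 2-by-2 matrix M (a list of rows) by the coordinate pair c."""
    return tuple(M[i][0] * c[0] + M[i][1] * c[1] for i in range(2))

v1, v2 = (1, 1), (1, 2)          # the basis used for both domain and image
f_v1, f_v2 = (2, 3), (0, -1)     # the given values f(v1) and f(v2)

# Column j of the matrix holds the coordinates of f(vj) relative to {v1, v2}.
col1 = coords(v1, v2, f_v1)
col2 = coords(v1, v2, f_v2)
A = [[col1[0], col2[0]],
     [col1[1], col2[1]]]

# Coordinates of f(2*v1 - 3*v2) relative to {v1, v2}.
image_coords = apply(A, (2, -3))

# Converting back to standard coordinates should agree with 2*f(v1) - 3*f(v2).
standard = tuple(image_coords[0] * v1[i] + image_coords[1] * v2[i] for i in range(2))
direct = tuple(2 * f_v1[i] - 3 * f_v2[i] for i in range(2))

print(A, image_coords, standard, direct)
```

Comparing `standard` with `direct` confirms that working in coordinates gives the same vector as applying linearity to the given values.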

We know from Theorem 1.4 that the image of a linear function $f(\mathbf{x}) = A\mathbf{x}$ is
spanned by the columns of the matrix $A$. To get a basis for the image, we need to
check for linear dependencies among the column vectors in $A$. On the other hand,
to find a basis for the null-space of $f$ we need to represent the solutions of $A\mathbf{x} = \mathbf{0}$
as a linear combination of independent vectors. We'll do both in the next example.
A routine procedure, proved to work in all cases in Theorem 5.10 in the following
Section 5C, is to apply elementary row operations to $A$ to get a reduced matrix $R$
for which dependence relations are easier to see. Since solution vectors $\mathbf{x}$ of $A\mathbf{x} = \mathbf{0}$
are unchanged by row operations on $A$, a linear relation among the columns of $A$
carries over to the same relation among the corresponding columns of $R$.

EXAMPLE 11 We'll find bases for the null-space and image of the linear function
$f: \mathbb{R}^4 \to \mathbb{R}^3$ defined by the matrix equation $f(\mathbf{x}) = A\mathbf{x}$, where

\[ A\mathbf{x} = \begin{pmatrix} 1 & 3 & 4 & 2 \\ -1 & 2 & 1 & 3 \\ 1 & 0 & 1 & -1 \end{pmatrix} \mathbf{x}. \]

Row reduction of $A$ takes only a few operations, and the resulting equation $R\mathbf{x} = \mathbf{0}$,
equivalent to $A\mathbf{x} = \mathbf{0}$, is easier to analyze:

\[ R\mathbf{x} = \begin{pmatrix} 1 & 0 & 1 & -1 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix} \mathbf{x} = \mathbf{0}, \quad\text{or}\quad \begin{aligned} x_1 + x_3 - x_4 &= 0, \\ x_2 + x_3 + x_4 &= 0. \end{aligned} \]

The null-space of $f(\mathbf{x}) = A\mathbf{x}$ consists of the vectors $\mathbf{x} = (x_1, x_2, x_3, x_4)$ that satisfy
$A\mathbf{x} = \mathbf{0}$, or $R\mathbf{x} = \mathbf{0}$. Setting the nonleading variables $x_3 = s$ and $x_4 = t$, we find
leading variables $x_1 = -s + t$ and $x_2 = -s - t$. Thus all solutions are

\[ \mathbf{x} = s\begin{pmatrix} -1 \\ -1 \\ 1 \\ 0 \end{pmatrix} + t\begin{pmatrix} 1 \\ -1 \\ 0 \\ 1 \end{pmatrix}. \]

The two vectors span the null-space of $f$, and they're independent because, based
on the first two entries alone, neither is a scalar multiple of the other. Hence the
null-space of $f$ is 2-dimensional with these two vectors as a basis.
For the image of $f$, we know that it's the span of the columns. The first two
columns of $R$ are evidently independent, and the third is their sum while the fourth
is the second minus the first. Hence the first two columns of $A$ suffice to span the image
of $f$. Since these vectors are independent, the image of $f$ is 2-dimensional, with
basis $\{(1, -1, 1), (3, 2, 0)\}$ in $\mathbb{R}^3$.
The dimensions of the null-space and image add up, $2 + 2$, to the dimension 4 of
the domain of $f$. We prove in Section 5C that this relation generalizes to all linear
functions $f: V \to W$ if $V$ has a finite basis.
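
The row-reduction procedure of this example can be sketched in code. What follows is a minimal illustration using exact fractions, not a general-purpose routine; the function names `rref` and `null_space_basis` are hypothetical, and the matrix entries are those of the example, with rows (1, 3, 4, 2), (-1, 2, 1, 3), (1, 0, 1, -1).

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the reduced form of the matrix and its leading columns."""
    R = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0  # index of the next row that should receive a leading entry
    for c in range(len(R[0])):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(R)) if R[i][c] != 0), None)
        if p is None:
            continue  # column c corresponds to a nonleading variable
        R[r], R[p] = R[p], R[r]
        pivot = R[r][c]
        R[r] = [x / pivot for x in R[r]]
        for i in range(len(R)):
            if i != r and R[i][c] != 0:
                factor = R[i][c]
                R[i] = [a - factor * b for a, b in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
        if r == len(R):
            break
    return R, pivots

def null_space_basis(R, pivots):
    """One solution of Rx = 0 per nonleading variable, set to 1 with the others 0."""
    ncols = len(R[0])
    free = [c for c in range(ncols) if c not in pivots]
    basis = []
    for f in free:
        x = [Fraction(0)] * ncols
        x[f] = Fraction(1)
        for row_index, p in enumerate(pivots):
            x[p] = -R[row_index][f]
        basis.append(x)
    return basis

A = [[1, 3, 4, 2], [-1, 2, 1, 3], [1, 0, 1, -1]]
R, pivots = rref(A)
kernel_basis = null_space_basis(R, pivots)
# Columns of A in the leading positions form a basis for the image.
image_basis = [[row[c] for row in A] for c in pivots]

print(pivots, kernel_basis, image_basis)
```

The number of leading columns plus the number of null-space basis vectors equals the number of columns of A, in line with the dimension count at the end of the example.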

The differential operator $D: \mathcal{P}_3 \to \mathcal{P}_3$ acts linearly. Using the basis $\{1, x, x^2, x^3\}$ for
both the domain and range, the natural coordinates to use are the respective coefficient
vectors, $(a_0, a_1, a_2, a_3)$ and $(b_0, b_1, b_2, b_3)$ for polynomials $a_0 + a_1x + a_2x^2 + a_3x^3$
and $b_0 + b_1x + b_2x^2 + b_3x^3$ in $\mathcal{P}_3$. The 4-by-4 matrix operation that carries out
differentiation in terms of these coordinates is

\[ \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{pmatrix} = \begin{pmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{pmatrix}. \]

Reducing the matrix just replaces the 2 and 3 by 1's, so $a_0$ is the single nonleading
variable, while $a_1$, $a_2$, and $a_3$ are leading variables. To find the null-space we set
the $b_k = 0$ and solve, setting $a_0 = t$ and finding $a_1 = 0$, $2a_2 = 0$, and $3a_3 = 0$.
Thus the null-space consists of the constant polynomials, with one-element basis $\{1\}$.
Since the matrix has just three independent columns, the image is 3-dimensional and
consists of the polynomials with coefficients $b_0 = a_1$, $b_1 = 2a_2$, $b_2 = 3a_3$, and
$b_3 = 0$. Thus $\{1, x, x^2\}$ is a basis for the image. As in the previous example, we see
here that again the sum, $1 + 3$, of the dimensions of the null-space and image is 4,
the dimension of the domain.
We have gone through this extensive description of the simple relation $D(a_0 +
a_1x + a_2x^2 + a_3x^3) = a_1 + 2a_2x + 3a_3x^2$ to illustrate some general principles in an
abstract setting that is familiar enough to be thoroughly understood.
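
A short sketch makes the coordinate description of differentiation concrete. It builds the matrix of D relative to the basis {1, x, ..., x^n} and applies it to a coefficient vector; the function names and the sample polynomial are chosen here purely for illustration.

```python
def differentiation_matrix(n):
    """Matrix of d/dx on polynomials of degree at most n, basis {1, x, ..., x^n}."""
    size = n + 1
    D = [[0] * size for _ in range(size)]
    for k in range(1, size):
        D[k - 1][k] = k  # d/dx of x^k is k*x^(k-1)
    return D

def mat_vec(M, v):
    """Multiply the matrix M (a list of rows) by the vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

D = differentiation_matrix(3)

# p(x) = 5 + 4x - 2x^2 + 7x^3 has coefficient vector (5, 4, -2, 7),
# and p'(x) = 4 - 4x + 21x^2 has coefficient vector (4, -4, 21, 0).
derivative_coeffs = mat_vec(D, [5, 4, -2, 7])

# A constant polynomial lies in the null-space.
constant_image = mat_vec(D, [9, 0, 0, 0])

print(D, derivative_coeffs, constant_image)
```

The last coordinate of every output is 0, reflecting that the image consists of polynomials of degree at most 2.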

EXERCISES

In Exercises 1 to 4, show that the given set of vectors forms a basis for $\mathbb{R}^n$ of the
appropriate dimension by showing (a) spanning, and (b) independence.

1. $\{(-1, 1), (1, 1)\}$
2. $\{(1, 2), (1, -2)\}$
3. $\{(1, 0, 0), (1, 1, 0), (1, 1, 1)\}$
4. $\{(1, 2, 3), (0, 0, 1), (2, 2, 4)\}$

In Exercises 5 to 8, find the dimension of the subspaces of $\mathbb{R}^2$ or $\mathbb{R}^3$ spanned by the
given vectors. [Hint: If a set of vectors is already independent, it forms a basis for
the subspace it spans.]

5. $(-1, 1), (1, -1)$
6. $(1, 2), (1, 3)$
7. $(1, 0, 1), (0, 0, 1), (1, 0, 2)$
8. $(1, 2, 3), (2, 3, 4), (3, 4, 5)$

In Exercises 9 to 12, show that the given subset of $C(-\infty, \infty)$ is linearly independent.

9. $\{e^x, e^{2x}, e^{3x}\}$
10. $\{x, e^x, e^{-x}\}$
11. $\{\cos x, \sin x\}$
12. $\{\cos x, x\cos x, x^2\cos x\}$

13. Let $S$ be the subspace of $C(-\infty, \infty)$ with basis $\{e^x, e^{-x}\}$.
(a) Show that the pair $\{\cosh x, \sinh x\}$ is another basis for $S$.
(b) Find the coordinates of $e^x$ and $e^{-x}$ relative to the basis in part (a).
14. Let $S$ be the subspace of $C(-\infty, \infty)$ with basis $\{e^x, e^{-x}\}$. What are the coordinates
of the function $3e^x - 4e^{-x}$ relative to the basis $\{\cosh x, \sinh x\}$?
15. (a) Show that the set $\{1, x + 1, (x + 1)^2\}$ is a basis for the space $\mathcal{P}_2$ of polynomials
of degree at most 2.
(b) What are the coordinates of the polynomial $x^2 + x + 1$ relative to the basis given
in part (a)?

In Exercises 16 to 19, let $B_n = \{1, \cos x, \sin x, \cos 2x, \sin 2x, \ldots, \cos nx, \sin nx\}$.
Functions in the subspace $\mathcal{T}_n$ of $C(-\infty, \infty)$ spanned by $B_n$ are called trigonometric
polynomials of degree $\le n$.

16. We'll show in Theorem 7.4 of Section 7B that $B_n$ is a linearly independent set, so it
is actually a basis for $\mathcal{T}_n$. What is the dimension of $\mathcal{T}_n$?
17. Show that $\cos^2 x$ and $\sin^2 x$ are in $\mathcal{T}_2$, and find their coordinates relative to the
basis $B_2$. (Use trigonometric identities.)
*18. Show that for given integers $p$ and $q$, the product $(\cos px)(\sin qx)$ is in $\mathcal{T}_{p+q}$ and
find its coordinates relative to the basis $B_{p+q}$.
*19. Show that if $f(x)$ is in $\mathcal{T}_p$ and $g(x)$ is in $\mathcal{T}_q$, then $f(x)g(x)$ is in $\mathcal{T}_{p+q}$.

In Exercises 20 to 23, find a basis for the vector space consisting of all linear
combinations of each of the following sets of functions.

20. $\{1, x, x - 1, x^2 + 1\}$
21. $\{e^x, e^{-x}, \sinh x, \cosh x\}$
22. $\{\sin^2 x, \cos^2 x, 1\}$
23. $\{\cos x, \sin x, \sin 2x\}$

In Exercises 24 to 27, let the linear function $f: \mathbb{R}^n \to \mathbb{R}^m$ be $f(\mathbf{x}) = A\mathbf{x}$. Find a basis,
if there is one, for (a) the image of $f$ and (b) the null-space of $f$.

24. A =(; I~ ) 25. A =( i ~)
26. A-(n D 27. A-(u n

28. Let $f: V \to W$ be both linear and one to one.
(a) Show that if $S$ is a linearly independent set of vectors in $V$, then $f(S)$, the image
of $S$ in $W$, is a linearly independent set of vectors in $W$.
(b) Show that $V$ and $f(V)$ have the same dimension.

In Exercises 29 and 30, find the dimension of the subspace of $\mathbb{R}^3$ consisting of all
solutions of the given equation or equations.

29. $x + y - z = 0$, $2x + y = 0$
30. $x + y - z = 0$

31. Show that if $p_1(x), \ldots, p_k(x)$ are $k$ nonzero polynomials whose degrees are all
different, then they form a linearly independent set in the vector space $\mathcal{P}$ of all
polynomials.

In Exercises 32 and 33, let $\mathcal{P}_5$ be the space of polynomials of degree at most 5 and let
$\mathcal{O}$ be the subspace of $\mathcal{P}_5$ consisting of odd polynomials, that is, polynomials $p(x)$ such
that $p(-x) = -p(x)$.

32. Find a basis for $\mathcal{O}$. What is the dimension of $\mathcal{O}$?
33. Is there a polynomial $p(x)$ such that $\{x - x^3, x^3 + x^5, p(x)\}$ is a basis for $\mathcal{O}$?

In Exercises 34 to 37, determine whether $\mathbf{x}$ is in the span of $S$.

34. $\mathbf{x} = (17, -6, 13)$ and $S = \{(1, -6, 2), (4, 8, 1)\}$
35. $\mathbf{x} = (3, -4, 5, 2)$ and $S = \{(1, -2, 1, 1), (2, 1, -2, 1), (3, 1, 1, 1)\}$
36. $\mathbf{x} = \sin(x + \pi/7)$ and $S = \{\cos x, \sin x\}$
37. $\mathbf{x} = \cos 2x$ and $S = \{1, \cos x, \cos^2 x\}$

38. Let $f$ be the linear function from $\mathbb{R}^2$ to $\mathbb{R}^2$ whose matrix relative to the natural
basis is $\begin{pmatrix} 1 & 2 \\ 2 & -1 \end{pmatrix}$. Thus $f(1, 0) = (1, 2)$ and $f(0, 1) = (2, -1)$. Find the matrix
of $f$ relative to the basis $\{\mathbf{v}_1, \mathbf{v}_2\} = \{(1, 1), (1, 2)\}$ for $\mathbb{R}^2$.
39. Let $g$ be the linear function from $\mathbb{R}^2$ to $\mathbb{R}^3$ whose matrix relative to the natural basis
$\{\mathbf{e}_1, \mathbf{e}_2\}$ for $\mathbb{R}^2$ and natural basis for $\mathbb{R}^3$ is $\begin{pmatrix} 2 & -1 \\ 1 & 2 \\ -2 & 2 \end{pmatrix}$.
[Thus $g(1, 0) = (2, 1, -2)$ and $g(0, 1) = (-1, 2, 2)$.] Find the matrix of $g$ relative to the
bases $\{\mathbf{v}_1, \mathbf{v}_2\} = \{(1, 1), (1, 2)\}$ for $\mathbb{R}^2$ and $\{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\} = \{(1, 0, 0), (1, 1, 0), (1, 1, 1)\}$
for $\mathbb{R}^3$.
40. Let $D = d/dx$ be the linear operation of differentiation on the vector space $\mathcal{P}_2$ of
polynomials of degree at most 2.
(a) Find the matrix of $D$ relative to the basis $\{1, x, x^2\}$ for $\mathcal{P}_2$.
(b) What are the matrices of $D^2$ and $D^3$ relative to the same basis for $\mathcal{P}_2$?
41. Let $S$ be the shift operator defined on the vector space $\mathcal{P}_2$ of polynomials of degree
at most 2 by $Sp(x) = p(x + 1)$. For example, $S(2x + 1) = 2x + 3$.
(a) Show that $S$ is a linear operator on $\mathcal{P}_2$.
(b) Show that $S = I + D + \tfrac{1}{2}D^2$, where $D = d/dx$, and $I$ stands for the identity operator.
(c) Show that on $\mathcal{P}_n$, the space of polynomials of degree at most $n$, the shift operator
is related to $D$ by $S = I + D + \tfrac{1}{2!}D^2 + \cdots + \tfrac{1}{n!}D^n$. [Hint: Use Taylor's theorem to
expand $p(x + h)$ with $h = 1$.]
5C Dimension Theorems
A nonzero vector space $V$ has infinitely many different bases, even when $V = \mathbb{R}^1$,
where each nonzero number is a one-element basis, so it's conceivable that a vector
space might have two bases with unequal numbers of vectors. We'll prove here in
Theorem 5.6 that if a vector space $V$ has one finite basis $\{\mathbf{b}_1, \ldots, \mathbf{b}_n\}$, then every
basis for $V$ also contains exactly $n$ vectors. This theorem justifies Definition 5.2 that
we made in Section 5A, allowing us to say without ambiguity that a nonzero vector
space $V$ has dimension $n$ if it has a basis consisting of $n$ vectors.
The next theorem justifies the crucial step we need to prove Theorem 5.6.
5.5 Theorem. If V has a basis with n elements, then a subset of V with more
than n elements is linearly dependent.

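Proof. Let $\mathcal{V} = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ be a basis for $V$, and let $\{\mathbf{x}_1, \ldots, \mathbf{x}_m\}$ be a subset
of $V$ with $m > n$. We need to show that there are numbers $r_1, \ldots, r_m$, not all zero,
such that

\[ r_1\mathbf{x}_1 + \cdots + r_m\mathbf{x}_m = \mathbf{0}. \qquad (1) \]

Since $\mathcal{V}$ is a basis for $V$, each $\mathbf{x}_k$ is a linear combination

\[ \mathbf{x}_k = a_{k1}\mathbf{v}_1 + \cdots + a_{kn}\mathbf{v}_n = \sum_{j=1}^{n} a_{kj}\mathbf{v}_j \qquad (2) \]

with some numbers $a_{kj}$ as coefficients. Substitution of (2) into (1) and interchanging
the order of summation gives

\[ \sum_{k=1}^{m} r_k \sum_{j=1}^{n} a_{kj}\mathbf{v}_j = \sum_{j=1}^{n} \Bigl( \sum_{k=1}^{m} a_{kj}r_k \Bigr) \mathbf{v}_j = \mathbf{0}. \]

Because the $\mathbf{v}$'s form a basis they are independent, so the coefficients of the $\mathbf{v}$'s are
all zero, and the $m$ variables $r_1, \ldots, r_m$ satisfy the equations

\[ \sum_{k=1}^{m} a_{kj}r_k = 0, \quad j = 1, \ldots, n. \]

Since $m > n$ these equations form a homogeneous system with more variables than
equations. By Theorem 2.5 of Chapter 2, there are infinitely many nonzero solutions
for the $r$'s, which is what we wanted to show. •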
We now have a short proof that the dimension of V is independent of whatever
finite set of basis vectors we count to compute dim(V).
5.6 Theorem. Let V be a vector space having a basis with n elements. Then
every basis for V has n elements.

Proof. Let $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ and $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$ be two bases for $V$; in particular, each
set is independent. We can't have $k > n$ because then the $\mathbf{u}$'s would be dependent
by the previous theorem, and we can't have $n > k$ because then the $\mathbf{v}$'s would be
dependent. Hence $n = k$. •
It follows that to find the dimension of a vector space all we have to do is pick
a basis and count the number of vectors in it; we always get the same number, no
matter what basis we count. Thus we've proved that the definition of dimension in
Section 5A assigns a clearly defined number called the dimension of V to every
vector space V that has a finite basis. A vector space with a finite basis is called
finite-dimensional, and all others are called infinite-dimensional.

The vector space $\mathcal{P}$ of all polynomials doesn't have a finite basis. If it did have one
with $n$ elements, then the linear independence of the $n + 1$ functions $1, x, x^2, \ldots, x^n$
would contradict Theorem 5.5. However $\mathcal{P}$ has a basis consisting of the infinite
sequence of independent functions $\{1, x, x^2, \ldots\}$, because every polynomial is a
unique linear combination of some finite subset of them.

5.7 Theorem. If $V$ is spanned by nonzero vectors $\mathbf{x}_1, \ldots, \mathbf{x}_n$, then some subset
of the $\mathbf{x}$'s is a basis for $V$.

Proof. If $\mathbf{x}_1, \ldots, \mathbf{x}_n$ is an independent set, then that set itself is a basis for $V$.
Otherwise, some relation $r_1\mathbf{x}_1 + \cdots + r_n\mathbf{x}_n = \mathbf{0}$ holds with at least one $r$, say
$r_k$, different from 0. Dividing by $r_k$, we get $\mathbf{x}_k = -(r_1/r_k)\mathbf{x}_1 - \cdots - (r_n/r_k)\mathbf{x}_n$,
where the term with index $k$ is omitted from the right side.
Substituting the right side for $\mathbf{x}_k$ in a linear combination of all the $\mathbf{x}$'s, we get a
linear combination from which $\mathbf{x}_k$ has been eliminated. It follows that we can delete
$\mathbf{x}_k$, and the span of the remaining vectors will still be all of $V$. If the resulting subset
of $\mathbf{x}$'s is not independent, we repeat the process until we do arrive at an independent
set, which is then a basis for $V$. •
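
The idea of Theorem 5.7 can be tried out on vectors in $\mathbb{R}^n$. The sketch below uses a closely related greedy variant rather than the deletion process of the proof: it scans the spanning vectors in order and keeps each one that is not a linear combination of those already kept, which also produces a subset that is a basis for the span. The helper names `rank` and `extract_basis` are introduced here for illustration only.

```python
from fractions import Fraction

def rank(vectors):
    """Number of leading entries after forward elimination on the given rows."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for c in range(len(rows[0])):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def extract_basis(spanning):
    """Keep each vector that is independent of the vectors kept so far."""
    basis = []
    for v in spanning:
        if rank(basis + [v]) == len(basis) + 1:
            basis.append(v)
    return basis

# (3, 4, 5) = 2*(2, 3, 4) - (1, 2, 3), so it is dropped.
basis = extract_basis([(1, 2, 3), (2, 3, 4), (3, 4, 5)])
print(basis)
```

Which subset is obtained depends on the order in which the vectors are listed, but by Theorem 5.6 the number of vectors kept does not.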
The previous theorem says we can get a basis from a finite spanning set by deleting
some vectors. For a finite-dimensional space the next theorem shows that we can get
a basis from a linearly independent set by putting more vectors in the set.

5.8 Theorem. Let $S = \{\mathbf{x}_1, \ldots, \mathbf{x}_k\}$ be a linearly independent set in a vector
space $V$. If $S$ is not a basis for $V$, we can include more vectors in $S$, possibly arriving
at a finite basis for $V$, in which case $V$ is finite-dimensional. Otherwise we can extend
$S$ with an infinite sequence of independent vectors so $V$ is infinite-dimensional.

Proof. Suppose $\mathbf{x}_1, \ldots, \mathbf{x}_k$ are linearly independent but don't span all of $V$. Then
there is some vector $\mathbf{y}$ that is not a linear combination of $\mathbf{x}_1, \ldots, \mathbf{x}_k$. Take $\mathbf{x}_{k+1} = \mathbf{y}$.
We claim that the set $\mathbf{x}_1, \ldots, \mathbf{x}_k, \mathbf{x}_{k+1}$ is linearly independent. Suppose that $r_1\mathbf{x}_1 +
\cdots + r_{k+1}\mathbf{x}_{k+1} = \mathbf{0}$. We must show that all the $r$'s are 0. If $r_{k+1}$ were not 0, we
could write $\mathbf{x}_{k+1} = -(r_1/r_{k+1})\mathbf{x}_1 - \cdots - (r_k/r_{k+1})\mathbf{x}_k$, which is impossible because
$\mathbf{x}_{k+1}$ is not a linear combination of the other $\mathbf{x}$'s. Therefore we have $r_{k+1} = 0$ and
$r_1\mathbf{x}_1 + \cdots + r_k\mathbf{x}_k = \mathbf{0}$. Since $\mathbf{x}_1, \ldots, \mathbf{x}_k$ are independent, the last equation implies
$r_1 = \cdots = r_k = 0$. Thus if a linearly independent set $S$ doesn't span $V$, we can add
a vector to $S$ so that the resulting set is also independent.
Repeating the process, we may reach a spanning set $S$ in a finite number of steps,
so $S$ becomes a basis and $V$ is finite dimensional. Otherwise we can find an arbitrarily
large independent set so $V$ is infinite dimensional. •

We extend the definition of k-plane in Chapter 2, Section 2D from the spaces $\mathbb{R}^n$
to other vector spaces $V$ by saying that a k-plane is either a k-dimensional proper
subspace $S$ of $V$, as in the next example, or else a translation of $S$ by a fixed vector
$\mathbf{v}$ in $V$.

Suppose $\dim(V) = n$ and that $S$ is a subspace of $V$. If $S$ isn't the 0-subspace, then it
contains a vector $\mathbf{x} \neq \mathbf{0}$ that we can extend to a linearly independent set in $S$ as in the
previous theorem. By Theorem 5.5 applied to $V$, $S$ can't be infinite-dimensional, so
$S$ has a finite basis, and $\dim(S) = k$ for some $k$ with $0 < k \le n$. If $k < n$, then $S$
is a k-plane containing $\mathbf{0}$. Thus every subspace of a finite-dimensional vector space
$V$ is either a k-plane containing $\mathbf{0}$ or else $V$ itself. In particular the proper subspaces
of $\mathbb{R}^n$ are the k-planes containing $\mathbf{0}$.

We summarize much of what we have proved about finite bases and dimension
as follows. It is proved by a straightforward application of the previous theorems.

5.9 Theorem. Let dim(V) = n. If B is a subset of V with two of the following


properties, then B has the third property and so is a basis for V.

(a) B contains exactly n vectors.


(b) B is a linearly independent set.
(c) B spans V.

Theorem 5.4 in Section 5B shows that introducing bases and coordinates in finite-dimensional
vector spaces allows us to study linear functions $f: V \to W$ by looking
at linear functions $f: \mathbb{R}^n \to \mathbb{R}^m$ of the form $f(\mathbf{x}) = A\mathbf{x}$. The following theorem gives
us an effective description of the null-space and image of such a linear function $f$
as k-planes containing $\mathbf{0}$ in $V$ and $W$; in particular, the dimension of the null-space
of $f$ is the number $s$ of columns with nonleading entries in a reduced form $R$ of
$A$, and the dimension of the image is the number $r$ of columns with leading entries.
Since every column in $R$ is of one kind or the other, it follows that $s + r = n$, where
$n$ is the dimension of the domain of $f$.

5.10 Theorem. Let $V$ and $W$ be finite-dimensional. A linear function $f: V \to W$
has null-space of dimension $k$, with $k$ the number of nonleading variables in a reduced
form $R$ of a matrix of $f$ with respect to some bases in $V$ and $W$. The image of $f$
in $W$ is a subspace of dimension $r$, with $r$ the number of leading variables in $R$.

Proof. We'll assume that reducing the matrix of $f$ gives a reduced matrix $R$ with $k$
nonleading variables. The null-space of $f$ consists of the vectors whose coordinates
satisfy the system $R\mathbf{x} = \mathbf{0}$. Since the system $R\mathbf{x} = \mathbf{0}$ always has $\mathbf{x} = \mathbf{0}$ as one
solution, if there are no nonleading variables then the unique solution to $R\mathbf{x} = \mathbf{0}$ is
$\mathbf{x} = \mathbf{0}$, so the null-space is the single point $\mathbf{0}$, a 0-plane. If $k > 0$, there is for each
$i = 1, \ldots, k$, a unique solution $\mathbf{u}_i$ of $R\mathbf{x} = \mathbf{0}$ in which we choose the $i$th nonleading
variable to be 1 and the other nonleading variables to be 0. These $k$ vectors $\mathbf{u}_i$ are
linearly independent, because $\mathbf{u}_i$ has 1 in the position of the $i$th nonleading variable
where the remaining $k - 1$ vectors have 0. Hence $\mathbf{u}_i$ can't be a linear combination of
the other $\mathbf{u}_j$. The vectors $\mathbf{x} = t_1\mathbf{u}_1 + \cdots + t_k\mathbf{u}_k$ are the solutions of $R\mathbf{x} = \mathbf{0}$. To see
this note first that every such $\mathbf{x}$ is a solution, because by the linearity of matrix-vector
multiplication

\[ R\mathbf{x} = t_1R\mathbf{u}_1 + \cdots + t_kR\mathbf{u}_k = \mathbf{0}, \]

because $R\mathbf{u}_1 = \cdots = R\mathbf{u}_k = \mathbf{0}$. Second, the nonleading variables have some values
$v_i$ in every solution $\mathbf{v}_0$, and using $t_i = v_i$ in the formula for $\mathbf{x}$ shows that the solution
$\mathbf{v}_0$ has the form $\mathbf{x}$.
Each of the $r = n - k$ leading variables, necessarily the same as the number of
nonzero rows in $R$, corresponds to a column containing a single entry 1, each in a
different row and with the other entries 0. These columns are standard basis vectors,
spanning the other columns, hence spanning the image of $f$. Thus the image of $f$
is an r-plane containing $\mathbf{0}$ in $W$. •

EXERCISES

1. (a) Let $E_{ij}$ be the m-by-n matrix with 1 in the $ij$th position and zeros elsewhere.
Show that the $E_{ij}$ form a basis for the space of all m-by-n matrices. What is the
dimension of the space?
(b) What is the dimension of the space of diagonal n-by-n matrices?
2. Which of the following statements is true for every linear function $f$? Prove your answer.
(a) If $\mathbf{x}_1$ and $\mathbf{x}_2$ are linearly independent, then so are $f(\mathbf{x}_1)$ and $f(\mathbf{x}_2)$.
(b) If $f(\mathbf{x}_1)$ and $f(\mathbf{x}_2)$ are linearly independent, then so are $\mathbf{x}_1$ and $\mathbf{x}_2$.
3. Show that if $f$ is a one-to-one linear function, then the set $\{f(\mathbf{x}_1), \ldots, f(\mathbf{x}_k)\}$ is
linearly independent if and only if $\{\mathbf{x}_1, \ldots, \mathbf{x}_k\}$ is linearly independent. What does
this imply about the dimensions of the image and domain of $f$?
4. Let $\mathbf{x}_1 = (1, 2, 3)$, $\mathbf{x}_2 = (-1, 2, 1)$, $\mathbf{x}_3 = (1, 1, 1)$, and $\mathbf{x}_4 = (1, 1, 0)$.
(a) Without doing any computation, give a reason why $\mathbf{x}_1$, $\mathbf{x}_2$, $\mathbf{x}_3$, and $\mathbf{x}_4$ form a
linearly dependent set.
(b) Express $\mathbf{x}_1$ as a linear combination of $\mathbf{x}_2$, $\mathbf{x}_3$, and $\mathbf{x}_4$.
5. Prove that two planes that contain $\mathbf{0}$ in $\mathbb{R}^3$ intersect in a line or coincide.
6. Prove Theorem 5.9, that is, prove that if $V$ has dimension $n$, then (a) and (b) imply (c),
(a) and (c) imply (b), and (b) and (c) imply (a).
7. Prove that if $\dim(W) = n$ and $V$ is a subspace of $W$, then $\dim(V) \le n$.
8. Let a vector space $V$ have dimension $n$ and have an $(n - 1)$-dimensional subspace $W$
with basis $\{\mathbf{x}_1, \ldots, \mathbf{x}_{n-1}\}$. Let $\mathbf{x}_n$ be in $V$ but not in $W$. Define
$f(a_1\mathbf{x}_1 + \cdots + a_n\mathbf{x}_n) = a_n$.
(a) Show that $f: V \to \mathbb{R}$ is a linear function with null-space precisely $W$.
(b) Show that if $g: V \to \mathbb{R}$ is another linear function with null-space $W$, then $g = cf$
for some constant $c$. [Hint: Show that $g(\mathbf{x}) - g(\mathbf{x}_n)f(\mathbf{x}) = 0$ for all $\mathbf{x}$ in $V$.]
9. (a) Show that if $S$ is a k-dimensional subspace of $\mathbb{R}^n$, then $S$ is the null-space of some
linear function $f: \mathbb{R}^n \to \mathbb{R}^{n-k}$. [Hint: Begin by picking a basis for $S$ and extending it
to a basis for $\mathbb{R}^n$.]
(b) Use part (a) to show that every k-dimensional subspace $S$ of $\mathbb{R}^n$ is the intersection
of $n - k$ hyperplanes through the origin, that is, of $(n - 1)$-dimensional subspaces of $\mathbb{R}^n$.
10. Assume $V$ and $W$ are finite dimensional and that $f: V \to W$ is linear. Prove that if $N$
is the null-space of $f$, then there is a subspace $S$ of $V$ such that $S \cap N = \{\mathbf{0}\}$, and $f$
restricted to $S$ is one-to-one. [Hint: Pick a basis for $N$ and extend it to be a basis
for $V$.]
11. Assume $V$ and $W$ are finite dimensional, and $f: V \to W$ is linear. Let $N$ and $f(V)$ be
the null-space and image of $f$. Use Theorem 5.10 to prove the equation
$\dim(N) + \dim(f(V)) = \dim(V)$.
12. Prove that if $V$ is finite-dimensional and $f: V \to W$ is linear, then the inverse image
of a vector $\mathbf{w}$ in the image of $f$ is a k-plane in $V$, where $k$ is the dimension of the
null-space of $f$. (For the definition of inverse image, see Exercise 22 in Section 4.)
13. (a) Prove that if $f: V \to W$ is linear with $\dim(V) > \dim(W)$, then $\dim(N) > 0$, where
$N$ is the null-space of $f$.
(b) Use part (a) to explain why there can't be an m-by-n matrix $A$ and an n-by-m
matrix $B$ such that $BA = I$ if $m < n$.
14. Find a 3-by-2 matrix $A$ and a 2-by-3 matrix $B$ such that $BA = I$.

SECTION 6 EIGENVALUES AND EIGENVECTORS


This section is about a way to better understand linear operators $L: V \to V$ on a
finite-dimensional vector space $V$. We'll see that finding a basis for $V$ consisting of
vectors $\mathbf{x}$ such that $L(\mathbf{x}) = \lambda\mathbf{x}$, for some scalar $\lambda$ associated with $\mathbf{x}$, allows us to
write the matrix of $L$ in the diagonal form

\[ \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}. \]

The main advantage is that diagonal matrices display properties of an operator that
are easy to read from the matrix. We'll mostly be concerned with linear operators
on $\mathbb{R}^n$, represented by square matrices, but some of the ideas apply as well to other
linear operators, such as the differential operators $D = d/dx$ and $D^2 = d^2/dx^2$.
6A Definitions and Examples
For a given linear operator $L: V \to V$ and a vector $\mathbf{x}$, there is usually no simple
relation between $\mathbf{x}$ and $L(\mathbf{x})$. But for special vectors $\mathbf{u}$ it may happen that $L(\mathbf{u})$ is a
scalar multiple of $\mathbf{u}$, so that

6.1 $L(\mathbf{u}) = \lambda\mathbf{u}.$

This equation makes no sense unless $\mathbf{u}$ is in both the image and domain of $L$. If
$\mathbf{u} \neq \mathbf{0}$ and satisfies $L(\mathbf{u}) = \lambda\mathbf{u}$, we say that $\mathbf{u}$ is an eigenvector of the linear operator
$L$ and that $\lambda$ is the eigenvalue of $L$ associated with $\mathbf{u}$, or alternatively that $\mathbf{u}$ is an
eigenvector associated with the eigenvalue $\lambda$. The terms characteristic vector $\mathbf{u}$ and
characteristic value $\lambda$ are also common. Because $L$ is linear, a nonzero multiple $c\mathbf{u}$
of $\mathbf{u}$ is an eigenvector associated with the same eigenvalue. The vector $\mathbf{0}$ is never
called an eigenvector, and the scalar 0 is an eigenvalue of $L$ only if $L(\mathbf{u}) = 0\mathbf{u} = \mathbf{0}$
for some vector $\mathbf{u} \neq \mathbf{0}$.

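EXAMPLE 1 Let $L$ be the linear operator on $\mathbb{R}^2$ defined by the matrix

\[ \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix}. \]

Thus

\[ L\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x + y \\ 4x + y \end{pmatrix}. \]

It's routine to verify that

\[ \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} 3 \\ 6 \end{pmatrix} = 3\begin{pmatrix} 1 \\ 2 \end{pmatrix} \]

and that

\[ \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ -2 \end{pmatrix} = \begin{pmatrix} -1 \\ 2 \end{pmatrix} = -1\begin{pmatrix} 1 \\ -2 \end{pmatrix}. \]

That is, the vector $\mathbf{x}_1 = (1, 2)$ in $\mathbb{R}^2$ is an eigenvector associated with the eigenvalue
3, and the vector $\mathbf{x}_2 = (1, -2)$ is an eigenvector associated with the eigenvalue $-1$.
Likewise, nonzero multiples of these two vectors will be eigenvectors with the same
two eigenvalues.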
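
The verification in this example is easy to automate. The sketch below assumes the matrix of L has rows (1, 1) and (4, 1), which is consistent with the eigenvectors (1, 2) and (1, -2) stated above and with the characteristic equation $(1 - \lambda)^2 - 4 = 0$ used later in this section; the function names are introduced here for illustration.

```python
def mat_vec(M, v):
    """Multiply the matrix M (a list of rows) by the vector v."""
    return tuple(sum(m * x for m, x in zip(row, v)) for row in M)

def is_eigenpair(M, lam, v):
    """True if v is nonzero and M v equals lam times v."""
    if all(x == 0 for x in v):
        return False  # the zero vector is never called an eigenvector
    return mat_vec(M, v) == tuple(lam * x for x in v)

A = [[1, 1],
     [4, 1]]  # assumed matrix of L, as described in the lead-in

checks = [
    is_eigenpair(A, 3, (1, 2)),     # eigenvector for eigenvalue 3
    is_eigenpair(A, -1, (1, -2)),   # eigenvector for eigenvalue -1
    is_eigenpair(A, 3, (5, 10)),    # a nonzero multiple is also an eigenvector
    is_eigenpair(A, 3, (1, 1)),     # a vector that is not an eigenvector
]
print(checks)
```

The third check illustrates the remark that nonzero multiples of an eigenvector are eigenvectors with the same eigenvalue.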

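Before discussing how to find eigenvectors, we'll show how they're useful. Suppose
that $L$ is a linear operator on a vector space $V$ and that $\mathbf{u}_1, \ldots, \mathbf{u}_k$ are eigenvectors
of $L$ with associated eigenvalues $\lambda_1, \ldots, \lambda_k$. In other words,

\[ L(\mathbf{u}_1) = \lambda_1\mathbf{u}_1, \quad L(\mathbf{u}_2) = \lambda_2\mathbf{u}_2, \quad \ldots, \quad L(\mathbf{u}_k) = \lambda_k\mathbf{u}_k. \]

If $\mathbf{u} = c_1\mathbf{u}_1 + \cdots + c_k\mathbf{u}_k$ is a linear combination of the eigenvectors, we have

\[ L(\mathbf{u}) = c_1L(\mathbf{u}_1) + \cdots + c_kL(\mathbf{u}_k) = c_1\lambda_1\mathbf{u}_1 + \cdots + c_k\lambda_k\mathbf{u}_k. \]

In terms of eigenvectors $\mathbf{u}_j$ of $L$ the effect of $L$ is to multiply each term in the linear
combination by the associated eigenvalue:

6.2 $L(c_1\mathbf{u}_1 + \cdots + c_k\mathbf{u}_k) = \lambda_1c_1\mathbf{u}_1 + \cdots + \lambda_kc_k\mathbf{u}_k.$

These formulas are most useful when there are enough linearly independent eigenvectors
to form a basis for the space, since then every vector is a linear combination
of the eigenvectors and Equation 6.2 applies to the basis representation of every $\mathbf{x}$.
In the general case of an n-dimensional vector space with a basis of $n$ eigenvectors
the second equation in terms of matrices and coordinates becomes

\[ L(\mathbf{x}) \leftrightarrow \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{pmatrix} = \begin{pmatrix} \lambda_1c_1 \\ \lambda_2c_2 \\ \vdots \\ \lambda_nc_n \end{pmatrix}. \]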

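Returning to the linear operator $L$ of Example 1, we express an arbitrary $\mathbf{x}$ in $\mathbb{R}^2$
as a linear combination of the two eigenvectors $\mathbf{x}_1 = (1, 2)$ and $\mathbf{x}_2 = (1, -2)$ with
respective eigenvalues 3 and $-1$. Suppose

\[ \mathbf{x} = u\mathbf{x}_1 + v\mathbf{x}_2. \]

Since $L(\mathbf{x}_1) = 3\mathbf{x}_1$ and $L(\mathbf{x}_2) = -\mathbf{x}_2$, the linearity of $L$ implies

\[ L(\mathbf{x}) = 3u\mathbf{x}_1 - v\mathbf{x}_2. \]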


FIGURE 3.10
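Using $u$ and $v$ as coordinates gives the diagonal matrix representation

\[ \begin{pmatrix} 3 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 3u \\ -v \end{pmatrix}. \]

Figure 3.10 shows the effect of $L$ on each of the two eigenvector components $u\mathbf{x}_1$ and $v\mathbf{x}_2$, so
the image $L(\mathbf{x})$ of a vector $\mathbf{x}$ depends geometrically on the parallelogram law. We
express the effect of $L$ by saying that $L$ is a composition of two operators:

1. A stretch by a factor of 3 away from the line through $\mathbf{x}_2$ along the lines parallel
to $\mathbf{x}_1$ and with eigenvector coordinate matrix $\begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}$.
2. A reversal of direction on lines parallel to $\mathbf{x}_2$, leaving points on the line
through $\mathbf{x}_1$ fixed, with eigenvector coordinate matrix $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$.

Operators (1) and (2) together produce the same end result in either order:

\[ \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 3 & 0 \\ 0 & -1 \end{pmatrix}. \]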

To compute the eigenvalues and associated eigenvectors for the function L of the
previous ex.amples, we proceed as follows. We need to find vectors u =j:. 0, and
numbers >.. such that
L(u) >..u = 0.
In matrix form, this equation is

or

(! ) (:) - (~ ~)( : ) = (~)


146 Chapter 3 Vector Spaces and Linearity

or
1
(I -A) )(~)=(~). (I)

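If this 2-by-2 matrix has an inverse, then the only solution of Equation (1) is
$x = 0$ and $y = 0$. Hence we must try to find values of $\lambda$ for which the matrix isn't
invertible. By Theorem 5.7 of Chapter 2, Section 5E, this will occur precisely when

$$\det\begin{pmatrix} 1 - \lambda & 1 \\ 4 & 1 - \lambda \end{pmatrix} = (1 - \lambda)^2 - 4 = 0.$$

This quadratic equation in $\lambda$ has roots $\lambda = 3$ and $\lambda = -1$, as we see by inspection,
by factoring, or by using the quadratic formula. To find eigenvectors associated with
$\lambda = 3$ and $\lambda = -1$, we must find $x$ and $y$, not both zero, satisfying Equation (1).
Thus we consider

$$\lambda = 3: \quad \begin{pmatrix} -2 & 1 \\ 4 & -2 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

and

$$\lambda = -1: \quad \begin{pmatrix} 2 & 1 \\ 4 & 2 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$

Each of these systems reduces to a single equation:

$$\lambda = 3: \quad -2x + y = 0$$
$$\lambda = -1: \quad 2x + y = 0.$$

It follows that there are many solutions, but all we need is one nonzero solution for
each eigenvalue. We choose for simplicity

$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \end{pmatrix} \text{ for } \lambda = 3, \qquad \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 \\ -2 \end{pmatrix} \text{ for } \lambda = -1.$$

In principle, a nonzero numerical multiple of either vector would do as well.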
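
The computation in this example can be checked numerically. The following is a minimal Python sketch, not part of the original text; the helper names `eigenvalues_2x2` and `apply` are illustrative assumptions. It solves the characteristic equation of the matrix with rows (1, 1) and (4, 1) by the quadratic formula and then applies the matrix to the vectors (1, 2) and (1, -2).

```python
import math


def eigenvalues_2x2(a, b, c, d):
    """Real roots of det(A - lambda*I) = 0 for A = [[a, b], [c, d]].

    The characteristic polynomial is lambda^2 - (a + d)*lambda + (a*d - b*c).
    Returns the two roots, larger first; raises ValueError if they are complex.
    """
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("characteristic roots are complex")
    root = math.sqrt(disc)
    return ((trace + root) / 2, (trace - root) / 2)


def apply(matrix, vector):
    """Multiply a 2-by-2 matrix, given as a tuple of rows, by a vector (x, y)."""
    (a, b), (c, d) = matrix
    x, y = vector
    return (a * x + b * y, c * x + d * y)


A = ((1, 1), (4, 1))
lam1, lam2 = eigenvalues_2x2(1, 1, 4, 1)  # roots of (1 - lambda)^2 - 4 = 0

# An eigenvector x must satisfy A x = lambda x.
image_of_first = apply(A, (1, 2))    # compare with 3 * (1, 2)
image_of_second = apply(A, (1, -2))  # compare with -1 * (1, -2)
```

Any nonzero multiple of either vector passes the same check, which is the point of the closing remark above.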

We summarize the method of Example 3 as follows. To find eigenvalues of a
linear operator $L$ on $\mathbb{R}^n$ with matrix $A$, solve the characteristic equation

6.3    $\det(A - \lambda I) = 0$,

with characteristic roots $\lambda_1, \ldots, \lambda_n$. Then for each root $\lambda_k$, try to find a
nonzero vector $\mathbf{x}_k$ that satisfies the matrix equation

6.4    $(A - \lambda_k I)\mathbf{x}_k = \mathbf{0}$.
Because Equations 6.3 and 6.4 are expressed in terms of the matrix $A$ of $L$, we
sometimes refer to eigenvectors and eigenvalues of the matrix rather than the function
$L$. However, the distinction between the matrix $A$ and the operator $L$ is particularly
important here, because even if $A$ is a real matrix, some of the roots $\lambda_k$ may be
complex numbers. In that case, the matrix $A - \lambda_k I$ can't be interpreted as an operator
on $\mathbb{R}^n$, because it will have complex entries. In Section 6B we'll see that we can
still get interesting information by operating instead on the space $\mathbb{C}^n$ of $n$-tuples of
complex numbers.
For differential operators the general definitions and principles of Equations 6.1
and 6.2 remain the same, but the determinants and matrices of Equations 6.3 and 6.4
are replaced by calculus computations as in the next example.
If $r$ is a constant, then $(d/dx)e^{rx} = re^{rx}$. If we consider $u(x) = e^{rx}$ as a vector in
the space $C^{(\infty)}$, and let $D$ be the differentiation operator, then

$$Du = ru,$$

so $e^{rx}$ is an eigenvector for $D$ associated with the eigenvalue $r$. In particular, $De^{x} = e^{x}$,
$De^{2x} = 2e^{2x}$, and $De^{-3x} = -3e^{-3x}$, so the functions $c_1e^{x}$, $c_2e^{2x}$, $c_3e^{-3x}$ are
eigenvectors for the differentiation operator $D$ if the $c_i$ are nonzero constants. The
associated eigenvalues are 1, 2, and $-3$, so

$$D(c_1e^{x} + c_2e^{2x} + c_3e^{-3x}) = c_1e^{x} + 2c_2e^{2x} - 3c_3e^{-3x},$$

as we expect from the rules of calculus.


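Here we'll combine the results of Examples 1 to 4 to solve the following system of
differential equations for $x(t)$ and $y(t)$:

$$\frac{dx}{dt} = x + y, \qquad \frac{dy}{dt} = 4x + y,$$

or in vector-matrix form

$$\frac{d\mathbf{x}}{dt} = \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix}\mathbf{x},$$

where $\mathbf{x}(t) = (x(t), y(t))$ and $d\mathbf{x}(t)/dt = (x'(t), y'(t))$.
If we use $u$ and $v$ as coordinates with respect to the two eigenvectors $\mathbf{x}_1 = (1, 2)$ and
$\mathbf{x}_2 = (1, -2)$, then $\mathbf{x} = u\mathbf{x}_1 + v\mathbf{x}_2$. Example 2 shows that in terms of $u$ and $v$,
the right-hand side $L(\mathbf{x}) = 3u\mathbf{x}_1 - v\mathbf{x}_2$ has coordinate vector

$$\begin{pmatrix} 3 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 3u \\ -v \end{pmatrix}.$$

In these same coordinates $\dfrac{d\mathbf{x}}{dt} = \begin{pmatrix} u' \\ v' \end{pmatrix}$, so the original system of differential
equations becomes

$$u' = 3u, \qquad v' = -v.$$

Example 4 shows that we can take $u = c_1e^{3t}$ and $v = c_2e^{-t}$. Section 3
of Chapter 10 shows that these are the only possibilities. Switching back to $xy$-coordinates
shows that our solution is

$$x(t) = c_1e^{3t} + c_2e^{-t},$$
$$y(t) = 2c_1e^{3t} - 2c_2e^{-t}.$$

For this analysis to work, the eigenvectors of $L$ had to be a basis for $\mathbb{R}^2$, but that
follows from their linear independence.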
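
As a sanity check on this solution, the sketch below (an illustration added here, not taken from the text) compares a central-difference estimate of the derivatives of $x(t) = c_1e^{3t} + c_2e^{-t}$, $y(t) = 2c_1e^{3t} - 2c_2e^{-t}$ with the right-hand sides $x + y$ and $4x + y$. The function names, the step size `h`, and the sample constants are arbitrary assumptions.

```python
import math


def solution(t, c1, c2):
    """Return (x(t), y(t)) with x = u + v, y = 2u - 2v, u = c1*e^{3t}, v = c2*e^{-t}."""
    u = c1 * math.exp(3 * t)
    v = c2 * math.exp(-t)
    return (u + v, 2 * u - 2 * v)


def residual(t, c1, c2, h=1e-6):
    """Largest gap between a finite-difference derivative and the system's right side."""
    x, y = solution(t, c1, c2)
    x_plus, y_plus = solution(t + h, c1, c2)
    x_minus, y_minus = solution(t - h, c1, c2)
    dx = (x_plus - x_minus) / (2 * h)  # approximates x'(t)
    dy = (y_plus - y_minus) / (2 * h)  # approximates y'(t)
    return max(abs(dx - (x + y)), abs(dy - (4 * x + y)))


# Arbitrary constants c1 = 1.5, c2 = -2.0 and a few sample times.
worst = max(residual(t, 1.5, -2.0) for t in (0.0, 0.3, 1.0))
```

The residual is not exactly zero because the derivative is only approximated, but it should be tiny compared with the size of the solution itself.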

EXERCISES

1. The linear operator $L$ from $\mathbb{R}^2$ to $\mathbb{R}^2$ with matrix

[matrix illegible]

has eigenvalues 7 and $-5$. Which of the following vectors is an eigenvector of $L$? For those that are, what is the associated eigenvalue?

[list of vectors illegible]

In Exercises 2 to 7, find all the eigenvalues of each of the linear operators defined by the following matrices, and for each eigenvalue find an associated eigenvector.

2. [matrix illegible]
3. [matrix illegible]
4. [matrix illegible]
5. [matrix illegible]
6. [matrix illegible]
7. [matrix illegible]

8. Show that 0 is an eigenvalue of a linear operator $L$ if and only if $L$ is not one to one.

9. Show that if $f$ is a one-to-one linear operator having $\lambda$ for an eigenvalue, then $f^{-1}$, the inverse of $f$, has $1/\lambda$ for an eigenvalue.

10. Let $f$ be a linear operator having $\lambda$ for an eigenvalue.
(a) Show that $\lambda^2$ is an eigenvalue associated with $f \circ f$.
(b) Show that $\lambda^n$ is an eigenvalue associated with the function we get by composing $f$ with itself $n$ times.

11. Let $C^{(\infty)}(\mathbb{R})$ be the vector space of infinitely often differentiable functions $f(x)$ for $x$ in $\mathbb{R}$. Then the differential operator $D^2$ acts linearly from $C^{(\infty)}(\mathbb{R})$ to $C^{(\infty)}(\mathbb{R})$. This exercise is about the eigenvectors of the operator $D^2$.
(a) For $\lambda > 0$, let $k = \sqrt{\lambda}$ and show that any linear combination of $e^{kx}$ and $e^{-kx}$ is an eigenvector for $\lambda$.
(b) For $\lambda < 0$, let $k = \sqrt{-\lambda}$ and show that any linear combination of $\cos kx$ and $\sin kx$ is an eigenvector for $\lambda$.
(c) Find two linearly independent functions such that any linear combination of them is an eigenvector for $\lambda = 0$.
(d) We'll show in Chapter 11 that the functions listed in parts (a), (b), and (c) are the only eigenvectors for the operator $D^2$. Show that the only functions $f(x)$ that satisfy the condition $f(0) = f(\pi) = 0$ and are eigenvectors of $D^2$ are multiples of the functions $\sin kx$, for $k$ a positive integer. What are the associated eigenvalues? (This question comes up in studying the small vibrations of a string anchored at the points $x = 0$ and $x = \pi$.)

12. (a) Find the eigenvalues and an associated pair of eigenvectors for the linear operator $L$ on $\mathbb{R}^2$ having matrix [matrix illegible].
(b) Show that the eigenvectors of the function $L$ in part (a) form a basis for $\mathbb{R}^2$, and use this to give a geometric description of the action of $L$ on $\mathbb{R}^2$, as in Example 2.
(c) Generalize the results you found for (a) and (b) to a linear operator $L$ from $\mathbb{R}^n$ to $\mathbb{R}^n$ having a diagonal matrix $\operatorname{diag}(a_1, a_2, \ldots, a_n)$.

13. Find the eigenvalues of the operator $G$ on $\mathbb{R}^2$ with matrix [matrix illegible], show that the associated eigenvectors span $\mathbb{R}^2$, and describe the action of $G$, as in Example 2.

In Exercises 14 to 17, solve the system of differential equations using eigenvalues and eigenvectors as in Example 5 in the text. The matrices are the same as the ones in Exercises 2 to 5, and you may use the results of those exercises if you have already worked them out.

14. $\dfrac{d\mathbf{x}}{dt} = A\mathbf{x}$, with $A$ the matrix of Exercise 2
15. $\dfrac{d\mathbf{x}}{dt} = A\mathbf{x}$, with $A$ the matrix of Exercise 3
16. $\dfrac{d\mathbf{x}}{dt} = A\mathbf{x}$, with $A$ the matrix of Exercise 4
17. $\dfrac{d\mathbf{x}}{dt} = A\mathbf{x}$, with $A$ the matrix of Exercise 5

6B Bases of Eigenvectors
In Section 6A we saw that the effect of a linear operator on a linear combination
of eigenvectors is particularly simple. Here we'll look at conditions under which a
space $V$ has a basis of eigenvectors for a given operator $L$ on
$V$, so that we can take advantage of Equation 6.2 and, if $V$ is finite-dimensional, express
$L$ by a diagonal matrix acting on coordinates $u_1, \ldots, u_n$ in $\mathbf{x} = u_1\mathbf{u}_1 + \cdots + u_n\mathbf{u}_n$.
6.5 Theorem. The matrix of an operator $L$ with respect to a basis $\{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$
is diagonal if and only if each basis vector $\mathbf{u}_i$ is an eigenvector of $L$. If the matrix
is diagonal, then the $i$th entry on the diagonal is the eigenvalue associated with $\mathbf{u}_i$.

Proof. Recall that the matrix of a linear function $L$ with respect to given bases in
its domain and range is defined so that its $j$th column gives the coordinates with
respect to the basis in the range of $L(\mathbf{u}_j)$, where $\mathbf{u}_j$ is the $j$th basis vector of the
domain. Here we have an operator $L$, whose range is the same as its domain, and
we can use the basis $\{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ for both domain and range.
If $\mathbf{u}_j$ is an eigenvector, then $L(\mathbf{u}_j) = \lambda_j\mathbf{u}_j$, where $\lambda_j$ is the associated eigenvalue.
To express $\lambda_j\mathbf{u}_j$ as a linear combination $c_1\mathbf{u}_1 + \cdots + c_n\mathbf{u}_n$, we take $c_j = \lambda_j$ and
all the other $c$'s equal to 0. Thus the coordinate vector of $L(\mathbf{u}_j)$, namely the $j$th
column of the matrix of $L$, is zero, except for having $\lambda_j$ in the $j$th place. The matrix
is diagonal if and only if this condition holds for every column. On the other hand,
if the entries in the $j$th column of the matrix are zero except for a value $\lambda_j$ in the
$j$th place, we have $L(\mathbf{u}_j) = \lambda_j\mathbf{u}_j$ and $\mathbf{u}_j$ is an eigenvector. Thus if the matrix is
diagonal, every basis vector $\mathbf{u}_j$ is an eigenvector. ■

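EXAMPLE 6 In Example 1 of Section 6A we saw that the operator $L$ on $\mathbb{R}^2$ that has the matrix

$$A = \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix}$$

with respect to the standard basis has eigenvectors

$$\mathbf{u}_1 = \begin{pmatrix} 1 \\ 2 \end{pmatrix} \text{ with eigenvalue } \lambda_1 = 3$$

and

$$\mathbf{u}_2 = \begin{pmatrix} 1 \\ -2 \end{pmatrix} \text{ with eigenvalue } \lambda_2 = -1.$$

According to Theorem 6.5, the matrix of $L$ with respect to the basis $\{\mathbf{u}_1, \mathbf{u}_2\}$ is the
diagonal matrix

$$\begin{pmatrix} 3 & 0 \\ 0 & -1 \end{pmatrix}.$$

Let us check by computing $L(\mathbf{x})$, where $\mathbf{x} = \mathbf{u}_1 + 2\mathbf{u}_2$, by using each matrix.
The coordinate vector of $\mathbf{x}$ in the basis $\{\mathbf{u}_1, \mathbf{u}_2\}$ is $\begin{pmatrix} 1 \\ 2 \end{pmatrix}$ and

$$\begin{pmatrix} 3 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} 3 \\ -2 \end{pmatrix}$$

is the coordinate vector of $L(\mathbf{x})$ in the same basis. That means that

$$L(\mathbf{x}) = 3\mathbf{u}_1 - 2\mathbf{u}_2 = 3\begin{pmatrix} 1 \\ 2 \end{pmatrix} - 2\begin{pmatrix} 1 \\ -2 \end{pmatrix} = \begin{pmatrix} 1 \\ 10 \end{pmatrix}.$$

On the other hand, $\mathbf{x} = \begin{pmatrix} 1 \\ 2 \end{pmatrix} + 2\begin{pmatrix} 1 \\ -2 \end{pmatrix} = \begin{pmatrix} 3 \\ -2 \end{pmatrix}$ in the standard basis, and

$$L(\mathbf{x}) = \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix}\begin{pmatrix} 3 \\ -2 \end{pmatrix} = \begin{pmatrix} 1 \\ 10 \end{pmatrix}$$

by direct calculation.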
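
The two routes to $L(\mathbf{x})$ in this example can be compared mechanically. The following Python sketch is an added illustration with hypothetical helper names (`mat_vec`, `combine`); it computes $L(\mathbf{x})$ once through the diagonal matrix acting on eigenvector coordinates and once through the standard matrix.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (tuple of rows) by a vector (tuple of numbers)."""
    return tuple(sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix)


def combine(coords, basis):
    """Return the standard coordinates of sum_k coords[k] * basis[k]."""
    size = len(basis[0])
    return tuple(sum(c * b[i] for c, b in zip(coords, basis)) for i in range(size))


A = ((1, 1), (4, 1))    # matrix of L in the standard basis
D = ((3, 0), (0, -1))   # matrix of L in the eigenvector basis
u1, u2 = (1, 2), (1, -2)
coords = (1, 2)         # x = 1*u1 + 2*u2

x_standard = combine(coords, (u1, u2))

# Route 1: act on eigenvector coordinates, then convert to standard coordinates.
via_diagonal = combine(mat_vec(D, coords), (u1, u2))

# Route 2: act directly on standard coordinates.
via_standard = mat_vec(A, x_standard)
```

Both routes give the same vector, as Theorem 6.5 predicts; only the bookkeeping differs.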

The next theorem shows that eigenvectors associated with different eigenvalues
are linearly independent. We apply it to obtain a condition under which an operator
is guaranteed to have a basis of eigenvectors.

6.6 Theorem. Let $\mathbf{u}_1, \ldots, \mathbf{u}_k$ be eigenvectors of a linear operator $L$, associated
with the respective eigenvalues $\lambda_1, \ldots, \lambda_k$. If $\lambda_1, \ldots, \lambda_k$ are all different from each
other, then $\mathbf{u}_1, \ldots, \mathbf{u}_k$ are linearly independent.

Proof. We proceed by induction on $k$, the number of eigenvectors in the set. If
$k = 1$, then $\mathbf{u}_1$ forms an independent set, since $\mathbf{u}_1 \neq \mathbf{0}$. Now suppose that every set
of $k$ eigenvectors of $L$ associated with $k$ different eigenvalues is independent. We'll
show that $\mathbf{u}_1, \ldots, \mathbf{u}_{k+1}$ is an independent set. Let constants $c_1, \ldots, c_{k+1}$ be chosen
so that

$$c_1\mathbf{u}_1 + \cdots + c_{k+1}\mathbf{u}_{k+1} = \mathbf{0}.$$

Apply $L$ to both sides of this equation to get

$$c_1\lambda_1\mathbf{u}_1 + \cdots + c_{k+1}\lambda_{k+1}\mathbf{u}_{k+1} = \mathbf{0},$$

where we have used $L(\mathbf{u}_j) = \lambda_j\mathbf{u}_j$. Now multiply the previous equation by $\lambda_1$, and
subtract from this equation to get

$$c_2(\lambda_2 - \lambda_1)\mathbf{u}_2 + \cdots + c_{k+1}(\lambda_{k+1} - \lambda_1)\mathbf{u}_{k+1} = \mathbf{0}.$$

The $k$ vectors $\mathbf{u}_2, \ldots, \mathbf{u}_{k+1}$ are independent by assumption, so $c_j(\lambda_j - \lambda_1) = 0$ for
$j = 2, \ldots, k + 1$. Since $\lambda_j - \lambda_1 \neq 0$, we conclude that $c_j = 0$ for $j = 2, \ldots, k + 1$.
Then the first equation implies that $c_1\mathbf{u}_1 = \mathbf{0}$, so $c_1 = 0$ also. Hence $\mathbf{u}_1, \ldots, \mathbf{u}_{k+1}$
is an independent set. ■
Theorem 6.6 is not restricted to operators on finite-dimensional spaces. In particular,
it applies to the differentiation operator $D$ on the infinite-dimensional space $C^{(1)}$ of
continuously differentiable functions. The function $e^{rx}$ is an eigenvector for $D$ associated
with the eigenvalue $r$, because $De^{rx} = re^{rx}$; this then implies that the functions
$e^{r_1x}, \ldots, e^{r_kx}$ are linearly independent, provided that the numbers $r_1, \ldots, r_k$ are all
different.

If $A$ is an $n$-by-$n$ matrix and $\lambda_k$ is a real root of the characteristic equation
$\det(A - \lambda I) = 0$, then $A - \lambda_k I$ represents an operator on $\mathbb{R}^n$. Since $\det(A - \lambda_k I) = 0$,
the operator $A - \lambda_k I$ is not invertible, and there is some nonzero vector $\mathbf{u}$ such that
$(A - \lambda_k I)\mathbf{u} = \mathbf{0}$. Hence

$$A\mathbf{u} = \lambda_k\mathbf{u},$$

and $\lambda_k$ is an eigenvalue. If we consider a matrix $A$ as representing an operator on
the set $\mathbb{C}^n$ of $n$-tuples of complex numbers, then $A - \lambda_k I$ represents an operator on
$\mathbb{C}^n$ for an arbitrary real or complex value of $\lambda_k$. Thus a characteristic root of $A$ is
an eigenvalue of $A$ considered as an operator on $\mathbb{C}^n$. For the reasons just mentioned,
the next theorem has a different statement for the real and complex cases.
6.7 Theorem. If $L$ is a linear operator on $\mathbb{R}^n$, with matrix $A$, and its characteristic
equation $\det(A - \lambda I) = 0$ has $n$ distinct real roots, then $\mathbb{R}^n$ has a basis consisting
of eigenvectors of $L$. If $L$ is a linear operator on $\mathbb{C}^n$ and its characteristic equation
has $n$ distinct roots, then $\mathbb{C}^n$ has a basis consisting of eigenvectors of $L$.

Proof. Let $\mathbf{v}_1, \ldots, \mathbf{v}_n$ be eigenvectors associated with the distinct roots $\lambda_1, \ldots, \lambda_n$.
By Theorem 6.6, they are linearly independent. But $n$ linearly independent vectors in
an $n$-dimensional space form a basis, by Theorem 5.9. ■

The function $L$ of Example 1 illustrates Theorem 6.7. It is an operator on a 2-dimensional
space, and its characteristic equation has the two roots 3 and $-1$. The
eigenvectors $(1, 2)$ and $(1, -2)$ are independent and so form a basis.

An operator may have a basis of eigenvectors without having a full set of distinct
eigenvalues. A simple example is the function from JR 3 to JR 3 given by the matrix

The characteristic equation is $(2 - \lambda)^2(3 - \lambda) = 0$, and the only roots are 2 and 3. We still
have a basis of eigenvectors because in this case there are two linearly independent
eigenvectors, $\mathbf{e}_1$ and $\mathbf{e}_2$, associated with the eigenvalue 2.

This example shows one way that a matrix can fail to have enough eigenvectors to
form a basis. Consider the linear function from $\mathbb{R}^2$ to $\mathbb{R}^2$ with the matrix

$$A = \begin{pmatrix} 0 & -5 \\ 2 & 2 \end{pmatrix}.$$

To find eigenvalues, we must solve the equation

$$\det(A - \lambda I) = \det\begin{pmatrix} -\lambda & -5 \\ 2 & 2 - \lambda \end{pmatrix} = (-\lambda)(2 - \lambda) + 10 = \lambda^2 - 2\lambda + 10 = 0.$$

The formula for solving quadratic equations gives the complex roots $1 + 3i$ and
$1 - 3i$. Therefore, the linear function from $\mathbb{R}^2$ to $\mathbb{R}^2$ has no real eigenvalues and
consequently no eigenvectors in $\mathbb{R}^2$ if we use only real numbers for scalars.

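If we use complex scalars and consider the matrix $A$ of the previous example
as defining a linear function from the complex 2-dimensional coordinate space $\mathbb{C}^2$
into itself, we can use the eigenvalues $1 \pm 3i$. To find the eigenvectors, we proceed
just as we have done before with real eigenvalues. For $\lambda = 1 + 3i$, the equation
$(A - \lambda I)\mathbf{x} = \mathbf{0}$ becomes

$$\begin{pmatrix} -1 - 3i & -5 \\ 2 & 1 - 3i \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \mathbf{0}.$$

Dividing the first row by $-1 - 3i$ gives

$$\begin{pmatrix} 1 & \frac{1 - 3i}{2} \\ 2 & 1 - 3i \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \mathbf{0}.$$

Subtracting twice the first row from the second leaves

$$\begin{pmatrix} 1 & \frac{1 - 3i}{2} \\ 0 & 0 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \mathbf{0},$$

and we see that $\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 - 3i \\ -2 \end{pmatrix}$ is one solution. For $\lambda = 1 - 3i$, a similar
calculation leads to $\begin{pmatrix} 1 + 3i \\ -2 \end{pmatrix}$ as an associated eigenvector. Thus viewed as an
operator on a complex vector space, $A$ does have a basis of eigenvectors, namely

$$\begin{pmatrix} 1 - 3i \\ -2 \end{pmatrix}, \quad \begin{pmatrix} 1 + 3i \\ -2 \end{pmatrix},$$

associated with the eigenvalues $1 + 3i$, $1 - 3i$.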
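
Python's built-in complex numbers make it easy to check complex eigenvalues and eigenvectors of this kind. The sketch below is an added illustration, not the book's method: it finds the characteristic roots of the matrix with rows (0, -5) and (2, 2) using `cmath.sqrt`, and tests the vectors $(1 - 3i, -2)$ and $(1 + 3i, -2)$, which are one valid choice of eigenvectors (any nonzero complex multiple would also work). The function name `residual` is an assumption.

```python
import cmath

A = ((0, -5), (2, 2))

# Characteristic polynomial: lambda^2 - trace*lambda + det = lambda^2 - 2*lambda + 10.
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
root = cmath.sqrt(trace * trace - 4 * det)  # square root of a negative discriminant
lam_plus = (trace + root) / 2
lam_minus = (trace - root) / 2


def residual(lam, vec):
    """Return the two components of A*vec - lam*vec; both vanish for an eigenvector."""
    x, y = vec
    return (A[0][0] * x + A[0][1] * y - lam * x,
            A[1][0] * x + A[1][1] * y - lam * y)


res_plus = residual(lam_plus, (1 - 3j, -2))
res_minus = residual(lam_minus, (1 + 3j, -2))
```

The same `residual` test with a real vector such as (1, 0) cannot return zero for either root, which reflects the absence of real eigenvectors.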

The elementary arithmetic operations of addition, subtraction, multiplication, and
division are governed by the same rules in both the real and complex numbers, so the
theorems about vector spaces and linear functions in general that we've proved so
far are valid whichever set of scalars we assume. But in the two previous examples
the choice between real and complex scalars made a genuine difference, the reason
being that to find eigenvalues we solve polynomial equations, so the results depend
on the number system we allow.
The characteristic equation $\det(A - \lambda I) = 0$ for an $n$-by-$n$ matrix $A$ is a polynomial
equation of degree $n$. There are two ways in which a polynomial may fail to have
$n$ distinct real roots so that $\mathbb{R}^n$ may fail to have a basis of eigenvectors of $A$. One
way is for the polynomial to have complex roots as shown in Example 10. Another
is for the polynomial to have one or more multiple roots, and Example 9 showed
that in this case a basis of eigenvectors may still exist. The next example shows that
a basis of eigenvectors may fail to exist if there are multiple roots.

EXAMPLE 12 For the operator on $\mathbb{R}^2$ with matrix

$$A = \begin{pmatrix} -1 & 2 \\ -2 & 3 \end{pmatrix}$$

the characteristic equation $\det(A - \lambda I) = 0$ is

$$(-1 - \lambda)(3 - \lambda) - (-2)(2) = \lambda^2 - 2\lambda + 1 = (\lambda - 1)^2 = 0.$$

The only root is $\lambda = 1$. To find the associated eigenvectors we solve

$$(A - I)\mathbf{x} = \begin{pmatrix} -2 & 2 \\ -2 & 2 \end{pmatrix}\mathbf{x} = \mathbf{0}.$$

The solution set is the 1-dimensional subspace of $\mathbb{R}^2$ consisting of all multiples of
the vector $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$. Any two eigenvectors associated with the eigenvalue 1 are linearly
dependent, so they can't form a basis. The result is exactly the same if we consider $A$ as
the matrix of an operator on the complex space $\mathbb{C}^2$; we still fail to have a basis of
eigenvectors.

It's possible to make up the deficiency in the previous example by generalizing the
definition of eigenvector, but in our application to differential equations in Chapter 13
we avoid the need for this by using exponential matrices $e^{tA}$ defined for arbitrary
square matrices $A$.
6C Changing Coordinates
Theorem 5.4 of Section 6 showed us how to find the matrix representation for a
linear function $f: V \to W$ relative to bases $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ in $V$ and $\{\mathbf{w}_1, \ldots, \mathbf{w}_m\}$ in
$W$. Here we assume we have a basis different from the standard basis in $\mathbb{R}^n$ and
show how the matrix of a linear operator $F: \mathbb{R}^n \to \mathbb{R}^n$ relative to the standard basis
is related to the matrix of the same operator relative to the nonstandard basis. We'll
then use this result to show how using a basis of eigenvectors may simplify the
matrix of an operator.

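6.8 Theorem. Let $\mathbf{x} = (x_1, \ldots, x_n)$ be the standard coordinates of a vector in
$\mathbb{R}^n$, and let $\mathbf{y} = (y_1, \ldots, y_n)$ be the coordinates of the same vector relative to a
basis $\{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ for $\mathbb{R}^n$. Then $\mathbf{x} = U\mathbf{y}$, where $U$ is the $n$-by-$n$ matrix with $\mathbf{u}_j$ for
its $j$th column. If $A$ is the $n$-by-$n$ matrix of a linear operator $F: \mathbb{R}^n \to \mathbb{R}^n$, then the
matrix $B$ of $F$ relative to the basis $\{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ for $\mathbb{R}^n$ is related to $A$ by

$$A = UBU^{-1}, \quad \text{or} \quad B = U^{-1}AU.$$

Proof. First observe that $\mathbf{x} = y_1\mathbf{u}_1 + \cdots + y_n\mathbf{u}_n = U\mathbf{y}$. Since the columns of $U$
are basis vectors, they're independent, so $U^{-1}$ exists and $\mathbf{y} = U^{-1}\mathbf{x}$. The coordinates of
$F(\mathbf{x})$ relative to the basis are $B\mathbf{y}$, so its standard coordinates are $A\mathbf{x} = UB\mathbf{y} = UBU^{-1}\mathbf{x}$
for all $\mathbf{x}$, in particular when $\mathbf{x} = \mathbf{e}_k$. Hence $A = UBU^{-1}$. ■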

EXAMPLE 13 Suppose we define an operator $F: \mathbb{R}^2 \to \mathbb{R}^2$ by $F(\mathbf{x}) = A\mathbf{x}$ = ( ! ~ )( ~ ).
Using instead the nonstandard basis { ( ! ),( -: )}, Theorem 6.8 tells us that
the matrix becomes

$B = U^{-1}AU$ = ( _j : ) (! : )(

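EXAMPLE 14 If $A = \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix}$, then Example 1 shows that $\mathbb{R}^2$ has a basis of eigenvectors
$\left\{\begin{pmatrix} 1 \\ 2 \end{pmatrix}, \begin{pmatrix} 1 \\ -2 \end{pmatrix}\right\}$, with associated eigenvalues $\lambda_1 = 3$, $\lambda_2 = -1$. Then $\Lambda = \operatorname{diag}(3, -1)$,
and we can check directly that $A = U\Lambda U^{-1}$, where $U = \begin{pmatrix} 1 & 1 \\ 2 & -2 \end{pmatrix}$, that is,

$$\begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 2 & -2 \end{pmatrix}\begin{pmatrix} 3 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} \frac{1}{2} & \frac{1}{4} \\ \frac{1}{2} & -\frac{1}{4} \end{pmatrix}.$$

In the previous example there's no apparent advantage to using the nonstandard basis, but using the eigenvector
basis for the same operator allows us to simplify by operating with a diagonal
matrix. This change will be useful in Chapter 13.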
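
One practical payoff of the factorization $A = U\Lambda U^{-1}$ is that powers of $A$ reduce to powers of the diagonal entries, since $A^n = U\Lambda^n U^{-1}$. The sketch below is an added illustration using exact rational arithmetic from Python's `fractions` module; the helper names are assumptions. It rebuilds $A$ from $U$ and $\Lambda$ and compares $A^{n}$ computed both ways.

```python
from fractions import Fraction


def mat_mul(p, q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
            for i in range(len(p))]


def inverse_2x2(m):
    """Exact inverse of a 2-by-2 matrix; raises ValueError if it is singular."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]


U = [[1, 1], [2, -2]]      # columns are the eigenvectors (1, 2) and (1, -2)
U_inv = inverse_2x2(U)
Lam = [[3, 0], [0, -1]]    # eigenvalues on the diagonal
A = mat_mul(mat_mul(U, Lam), U_inv)


def power_via_diagonal(n):
    """A^n computed as U * Lam^n * U^{-1}; only the diagonal entries are powered."""
    lam_n = [[3 ** n, 0], [0, (-1) ** n]]
    return mat_mul(mat_mul(U, lam_n), U_inv)


def power_direct(n):
    """A^n computed by repeated multiplication, for comparison (n >= 1)."""
    result = A
    for _ in range(n - 1):
        result = mat_mul(result, A)
    return result
```

For example, `power_via_diagonal(10)` and `power_direct(10)` agree exactly, but the first needs only two matrix products regardless of `n`.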

EXERCISES

The matrices 1 to 6 are operators on $\mathbb{R}^2$ or $\mathbb{R}^3$, but may also be regarded as operators on $\mathbb{C}^2$ or $\mathbb{C}^3$. Find their eigenvalues and state whether or not Theorem 6.7 guarantees a basis of eigenvectors in $\mathbb{R}^2$ or $\mathbb{R}^3$, in $\mathbb{C}^2$ or $\mathbb{C}^3$. For the matrices for which Theorem 6.7 does not guarantee a basis of eigenvectors, find out whether a basis of eigenvectors exists.

1. [matrix illegible]
2. [matrix illegible]
3. [matrix illegible]
4. [matrix illegible]
5. [matrix illegible]
6. [matrix illegible]

7. A rotation about the origin in $\mathbb{R}^2$ through angle $\theta$ has matrix

$$R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$

(a) Show that the only values of $\theta$ with $0 \le \theta < 2\pi$ for which $R_\theta$ has real eigenvalues are $\theta = 0$ and $\theta = \pi$. What are the eigenvectors when there are real eigenvalues?
(b) Explain the results of (a) in geometric terms.

8. Show that the matrix $R_\theta$ of Exercise 7 has complex eigenvalues $\cos\theta \pm i\sin\theta$. Also find associated independent complex eigenvectors.

9. Show that if $A$ = [matrix illegible], then $A^{10} = \begin{pmatrix} -1022 & 2046 \\ -1023 & 2047 \end{pmatrix}$. [Hint: Write $A = U\Lambda U^{-1}$, where $\Lambda$ is diagonal.]

In each of Exercises 10 to 13, the given $n$-by-$n$ matrix defines an operator $L$ relative to the natural basis in $\mathbb{R}^n$ that has $n$ independent real eigenvectors. Find a basis for $\mathbb{R}^n$ consisting of eigenvectors of $L$, and find the diagonal matrix that represents $L$ relative to that basis.

10. [matrix illegible]
11. [matrix illegible]
12. [matrix illegible]
13. [matrix illegible]

In each of Exercises 14 to 17, find a square matrix $U$ and a diagonal matrix $\Lambda$ such that the given matrix $A = U\Lambda U^{-1}$. In Exercise 17 you need to use complex numbers.

14. $A$ = [matrix illegible]
15. $A$ = [matrix illegible]
16. $A$ = [matrix illegible]
17. $A$ = [matrix illegible]

18. (a) Let $\mathbf{u}_1, \ldots, \mathbf{u}_n$ be independent vectors in $\mathbb{R}^n$, and let $U$ be the $n$-by-$n$ matrix with $k$th column $\mathbf{u}_k$. Let $\lambda_1, \ldots, \lambda_n$ be real numbers, and let $\Lambda$ be the diagonal matrix with $k$th diagonal entry $\lambda_k$. Show that the matrix $A = U\Lambda U^{-1}$ satisfies $A\mathbf{u}_k = \lambda_k\mathbf{u}_k$ for $k = 1, \ldots, n$.
(b) Find a 2-by-2 matrix with eigenvalues $\lambda_1 = 2$, $\lambda_2 = 3$ and with associated eigenvectors $\mathbf{u}_1 = (1, 2)$, $\mathbf{u}_2 = (2, 5)$.

In Exercises 19 to 22, show that the image of the given subspace of $C^{(\infty)}(-\infty, \infty)$ under the differentiation operator $D = d/dx$ is contained in the same subspace, so $D$ can be considered as an operator restricted to having the subspace for both domain and range. Find the matrix $A$ of $D$ with respect to the given basis and find the eigenvalues of $A$. Then find a basis of eigenvectors for the given subspace, using complex coefficients if necessary.

19. The subspace with basis $e^{x}$ and $e^{-x}$
20. The subspace with basis $1$, $x$, and $x^2$
21. The subspace with basis $\sin x$ and $\cos x$
22. The subspace with basis $e^{x}\cos x$ and $e^{x}\sin x$

SECTION 7 INNER PRODUCTS


In Chapter 1 we used the dot product in $\mathbb{R}^n$ to introduce length and angle. Here we
discuss a more widely applicable generalization called an inner product, defined as a
function having the same formal properties as the dot product, but defined on a more
general class of vector spaces than the spaces $\mathbb{R}^n$. We then define length, angle, and
orthogonality in terms of inner products. Though these definitions no longer coincide
precisely with the usual geometric concepts, the analogies with intuitive geometry
are still useful.
7A General Properties
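Recall that the dot product $\mathbf{x} \cdot \mathbf{y}$ of vectors $\mathbf{x} = (x_1, \ldots, x_n)$ and $\mathbf{y} = (y_1, \ldots, y_n)$
in $\mathbb{R}^n$ is defined by $\mathbf{x} \cdot \mathbf{y} = x_1y_1 + \cdots + x_ny_n$, and has the properties

Positivity: $\mathbf{x} \cdot \mathbf{x} > 0$, except that $\mathbf{0} \cdot \mathbf{0} = 0$
Symmetry: $\mathbf{x} \cdot \mathbf{y} = \mathbf{y} \cdot \mathbf{x}$
Additivity: $(\mathbf{x} + \mathbf{y}) \cdot \mathbf{z} = \mathbf{x} \cdot \mathbf{z} + \mathbf{y} \cdot \mathbf{z}$
Homogeneity: $(r\mathbf{x}) \cdot \mathbf{y} = r(\mathbf{x} \cdot \mathbf{y})$

For elements of a vector space we define an inner product $\langle \mathbf{x}, \mathbf{y} \rangle$ to be a real-valued
function having the same properties:

7.1
Positivity: $\langle \mathbf{x}, \mathbf{x} \rangle > 0$, except that $\langle \mathbf{0}, \mathbf{0} \rangle = 0$
Symmetry: $\langle \mathbf{x}, \mathbf{y} \rangle = \langle \mathbf{y}, \mathbf{x} \rangle$
Additivity: $\langle \mathbf{x} + \mathbf{y}, \mathbf{z} \rangle = \langle \mathbf{x}, \mathbf{z} \rangle + \langle \mathbf{y}, \mathbf{z} \rangle$
Homogeneity: $\langle r\mathbf{x}, \mathbf{y} \rangle = r\langle \mathbf{x}, \mathbf{y} \rangle$

The dot product on $\mathbb{R}^n$ has Properties 7.1 and so is an example of an inner product.
We need a notation different from $\mathbf{x} \cdot \mathbf{y}$ for a general inner product both to make
it clear which product we're talking about, and because we sometimes use both
together, as in Section 7C.
In Definition 7.1 we assumed homogeneity only in the first entry, but using symmetry
twice we get $\langle \mathbf{x}, r\mathbf{y} \rangle = \langle r\mathbf{y}, \mathbf{x} \rangle = r\langle \mathbf{y}, \mathbf{x} \rangle = r\langle \mathbf{x}, \mathbf{y} \rangle$. Hence $\langle \mathbf{x}, r\mathbf{y} \rangle$ also equals
$r\langle \mathbf{x}, \mathbf{y} \rangle$. Similarly, we assumed additivity only in the first entry, but we have it
in the second entry also: $\langle \mathbf{x}, \mathbf{y} + \mathbf{z} \rangle = \langle \mathbf{x}, \mathbf{y} \rangle + \langle \mathbf{x}, \mathbf{z} \rangle$ as a consequence of additivity
in the first entry and symmetry.
We define the length, or norm, of a vector by

$$\|\mathbf{x}\| = \sqrt{\langle \mathbf{x}, \mathbf{x} \rangle}.$$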

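The dot product and length in $\mathbb{R}^n$ satisfy the Cauchy-Schwarz inequality

$$|\mathbf{x} \cdot \mathbf{y}| \le |\mathbf{x}||\mathbf{y}|.$$

To aid geometric intuition we used a proof of the inequality in Chapter 1, Section 4
that depended on the law of cosines. Here we give a different proof that uses only
the Properties 7.1 and so works for other inner products as well. As a result the law
of cosines holds for general inner products as well. (See Exercise 12.)

7.2 Cauchy-Schwarz Inequality. $|\langle \mathbf{x}, \mathbf{y} \rangle| \le \|\mathbf{x}\|\|\mathbf{y}\|$.

Proof. If either vector is $\mathbf{0}$, the inequality is satisfied because both sides are 0, so
for the rest of the proof we assume that neither vector is $\mathbf{0}$. Suppose first that $\mathbf{x}$ and
$\mathbf{y}$ are unit vectors, that is, that $\|\mathbf{x}\| = \|\mathbf{y}\| = 1$. Then

$$0 \le \|\mathbf{x} - \mathbf{y}\|^2 = \langle \mathbf{x} - \mathbf{y}, \mathbf{x} - \mathbf{y} \rangle = \|\mathbf{x}\|^2 - 2\langle \mathbf{x}, \mathbf{y} \rangle + \|\mathbf{y}\|^2 = 2 - 2\langle \mathbf{x}, \mathbf{y} \rangle,$$

or $0 \le 2 - 2\langle \mathbf{x}, \mathbf{y} \rangle$, so $\langle \mathbf{x}, \mathbf{y} \rangle \le 1$. For nonzero $\mathbf{x}$ and $\mathbf{y}$, $\mathbf{x}/\|\mathbf{x}\|$ and $\mathbf{y}/\|\mathbf{y}\|$ are unit
vectors, so

$$\left\langle \frac{\mathbf{x}}{\|\mathbf{x}\|}, \frac{\mathbf{y}}{\|\mathbf{y}\|} \right\rangle \le 1,$$

and multiplying by $\|\mathbf{x}\|\|\mathbf{y}\|$ and using the homogeneity
property gives $\langle \mathbf{x}, \mathbf{y} \rangle \le \|\mathbf{x}\|\|\mathbf{y}\|$. Now
replace $\mathbf{x}$ by $-\mathbf{x}$ to get $-\langle \mathbf{x}, \mathbf{y} \rangle \le \|\mathbf{x}\|\|\mathbf{y}\|$. The
last two inequalities imply the inequality with the absolute value in it. ■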
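When $\mathbf{x}$ and $\mathbf{y}$ are not zero we can write the Cauchy-Schwarz inequality as
$\dfrac{|\langle \mathbf{x}, \mathbf{y} \rangle|}{\|\mathbf{x}\|\|\mathbf{y}\|} \le 1$, so there is an angle $\theta$ between 0 and $\pi$ such that $\cos\theta = \dfrac{\langle \mathbf{x}, \mathbf{y} \rangle}{\|\mathbf{x}\|\|\mathbf{y}\|}$.
We now define the angle between $\mathbf{x}$ and $\mathbf{y}$ to be this angle $\theta$, and the definition will
always make sense when the vectors are nonzero. As we did for the dot product
in Chapter 1, we say that $\mathbf{x}$ and $\mathbf{y}$ are orthogonal if $\langle \mathbf{x}, \mathbf{y} \rangle = 0$, allowing for the
possibility that $\mathbf{x}$ or $\mathbf{y}$ is zero.
Here are the characteristic properties of length as defined by $\|\mathbf{x}\|$.

7.3
Positivity: $\|\mathbf{x}\| > 0$, except that $\|\mathbf{0}\| = 0$
Homogeneity: $\|r\mathbf{x}\| = |r|\|\mathbf{x}\|$, $r$ real
Triangle Inequality: $\|\mathbf{x} + \mathbf{y}\| \le \|\mathbf{x}\| + \|\mathbf{y}\|$

The proofs are the same as the ones given for the properties 4.9 in Chapter 1,
Section 4, with $\langle \mathbf{x}, \mathbf{y} \rangle$ replacing $\mathbf{x} \cdot \mathbf{y}$ and $\|\mathbf{x}\|$ replacing $|\mathbf{x}|$, and we won't repeat
them here.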

EXAMPLE 1 We've already seen that the dot product $\mathbf{x} \cdot \mathbf{y}$ in $\mathbb{R}^n$ is an example of an inner product.
In particular, if $\mathbf{x} = (x_1, x_2)$ and $\mathbf{y} = (y_1, y_2)$,

$$\mathbf{x} \cdot \mathbf{y} = x_1y_1 + x_2y_2$$

is an inner product in $\mathbb{R}^2$. If we define

$$\langle \mathbf{x}, \mathbf{y} \rangle = x_1y_1 + 2x_2y_2$$

instead, we get another inner product for $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^2$. To see this, all we have
to do is check that the relations given under 7.1 are satisfied, which we leave as an
exercise. Relative to this inner product, length is defined by

$$\|\mathbf{x}\| = \sqrt{x_1^2 + 2x_2^2}.$$

[FIGURE 3.11: (a) the ellipse $\|\mathbf{x}\| = 1$ together with the Euclidean unit circle $|\mathbf{x}| = 1$; (b) the vectors $(1, 1)$ and $(-2, 1)$]

Thus the "unit circle" defined by $\|\mathbf{x}\| = 1$ consists of all points $(x_1, x_2)$ in $\mathbb{R}^2$ that
satisfy $x_1^2 + 2x_2^2 = 1$; this ellipse appears in Figure 3.11(a) along with the Euclidean
unit circle $|\mathbf{x}| = 1$, or $x_1^2 + x_2^2 = 1$. This definition of length adds more weight to
the second coordinate, so for a point to lie on this "unit circle," $|x_2|$ has to be smaller
than on the Euclidean one.
We use the inner product $\langle \mathbf{x}, \mathbf{y} \rangle$ to define the angle $\theta$ between two vectors as
above by the formula

$$\cos\theta = \frac{\langle \mathbf{x}, \mathbf{y} \rangle}{\|\mathbf{x}\|\|\mathbf{y}\|}.$$

Then $\mathbf{x}$ and $\mathbf{y}$ are orthogonal or perpendicular when $\cos\theta = 0$, that is, when
$\langle \mathbf{x}, \mathbf{y} \rangle = x_1y_1 + 2x_2y_2 = 0$. Thus the vectors $\mathbf{e}_1 = (1, 0)$ and $\mathbf{e}_2 = (0, 1)$ are still orthogonal, just as they
were relative to the dot product, but the vectors $(1, 1)$ and $(-1, 1)$ are not. Also the
vectors $(1, 1)$ and $(-2, 1)$, shown in Figure 3.11(b), are orthogonal relative to this
inner product, but not relative to the dot product.
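
The claims about this weighted inner product are easy to test by direct computation. The following Python sketch is an added illustration with assumed function names (`inner`, `dot`, `norm`, `angle`); it implements $\langle \mathbf{x}, \mathbf{y} \rangle = x_1y_1 + 2x_2y_2$ alongside the ordinary dot product, and the norm and angle derived from the weighted product.

```python
import math


def inner(x, y):
    """Weighted inner product <x, y> = x1*y1 + 2*x2*y2 on R^2."""
    return x[0] * y[0] + 2 * x[1] * y[1]


def dot(x, y):
    """Ordinary dot product on R^2, for comparison."""
    return x[0] * y[0] + x[1] * y[1]


def norm(x):
    """Length defined by the weighted inner product."""
    return math.sqrt(inner(x, x))


def angle(x, y):
    """Angle between nonzero x and y relative to the weighted inner product."""
    cosine = inner(x, y) / (norm(x) * norm(y))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    return math.acos(max(-1.0, min(1.0, cosine)))


weighted_pair = inner((1, 1), (-2, 1))   # orthogonal for the weighted product
euclidean_pair = dot((1, 1), (-2, 1))    # not orthogonal for the dot product
other_pair = inner((1, 1), (-1, 1))      # orthogonal for dot, not for weighted
```

The Cauchy-Schwarz inequality guarantees that the ratio passed to `acos` lies in $[-1, 1]$ in exact arithmetic; the clamp only guards against floating-point rounding.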
EXAMPLE 2 Let $C[-\pi, \pi]$ stand for the vector space of continuous real-valued functions $f(x)$
defined for $-\pi \le x \le \pi$. The space $C[-\pi, \pi]$ is infinite-dimensional because it
contains the infinite linearly independent set $\{1, x, x^2, \ldots\}$. We define a useful inner
product on $C[-\pi, \pi]$ by the integral

$$\langle f, g \rangle = \int_{-\pi}^{\pi} f(x)g(x)\,dx.$$

We have $\langle f, g \rangle = \langle g, f \rangle$ simply because $f(x)g(x) = g(x)f(x)$ for $-\pi \le x \le \pi$.
The other properties of the inner product depend on properties of definite integrals;
the verification is left as Exercise 11. The importance of this example depends partly
on the formulas

$$\langle \cos kx, \cos lx \rangle = \int_{-\pi}^{\pi} \cos kx \cos lx\,dx = 0, \quad k \neq l,$$

$$\langle \sin kx, \sin lx \rangle = \int_{-\pi}^{\pi} \sin kx \sin lx\,dx = 0, \quad k \neq l,$$

$$\langle \cos kx, \sin lx \rangle = \int_{-\pi}^{\pi} \cos kx \sin lx\,dx = 0,$$

where $k$ and $l$ are integers. These formulas follow in a straightforward way using
trigonometric identities; their significance here is that in terms of the inner product
$\langle f, g \rangle$, and the orthogonality relation $\langle f, g \rangle = 0$, they assert that certain trigonometric
functions are orthogonal. We're not claiming that the graphs of $\cos kx$ and
$\sin lx$ intersect at right angles, but rather that their ordinary product has average
value zero over the interval $-\pi \le x \le \pi$. If $k = l$, direct computation shows that
$\langle \cos kx, \cos kx \rangle$ and $\langle \sin kx, \sin kx \rangle$ both equal $\pi$ for $k \ge 1$. (If $k = 0$, $\cos kx = 1$
and $\sin kx = 0$, so the integrals are $2\pi$ and 0 instead.) We'll return to this example
in the next section.
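
These orthogonality relations can be confirmed numerically. The sketch below is an added illustration, not the trigonometric-identity argument the text refers to: it approximates $\langle f, g \rangle$ on $[-\pi, \pi]$ by the midpoint rule, which is a numerical approximation chosen here for simplicity. The function names and the number of subintervals are assumptions.

```python
import math


def inner(f, g, n=2000):
    """Approximate the integral of f(x)*g(x) over [-pi, pi] by the midpoint rule."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        x = -math.pi + (i + 0.5) * h  # midpoint of the i-th subinterval
        total += f(x) * g(x)
    return h * total


def cos_k(k):
    """Return the function x -> cos(k*x)."""
    return lambda x: math.cos(k * x)


def sin_k(k):
    """Return the function x -> sin(k*x)."""
    return lambda x: math.sin(k * x)


cos2_cos3 = inner(cos_k(2), cos_k(3))   # expected to be near 0
sin1_sin4 = inner(sin_k(1), sin_k(4))   # expected to be near 0
cos2_sin2 = inner(cos_k(2), sin_k(2))   # expected to be near 0
cos2_cos2 = inner(cos_k(2), cos_k(2))   # expected to be near pi
one_one = inner(cos_k(0), cos_k(0))     # expected to be near 2*pi
```

The results are floating-point approximations, so they should be compared with 0, $\pi$, and $2\pi$ up to a small tolerance rather than for exact equality.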

EXERCISES

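In Exercises 1 to 4, determine whether the given formula defines an inner product on $\mathbb{R}^2$. Verify your answer by showing either that the Properties 7.1 on page 155 are satisfied or that at least one of them fails. Here $\mathbf{x} = (x_1, x_2)$ and $\mathbf{y} = (y_1, y_2)$.

1. $\langle \mathbf{x}, \mathbf{y} \rangle = x_1y_1 + 2x_2y_2$
2. $\langle \mathbf{x}, \mathbf{y} \rangle = x_1y_1 - x_2y_2$
3. $\langle \mathbf{x}, \mathbf{y} \rangle = x_1y_1 + x_1y_2 + x_2y_1 + 2x_2y_2$
4. $\langle \mathbf{x}, \mathbf{y} \rangle = x_1y_1$

5. Sketch the "unit circle" determined by $\|\mathbf{x}\| = 1$, if the norm is determined by the inner product $\langle \mathbf{x}, \mathbf{y} \rangle = 3x_1y_1 + 2x_2y_2$, where $\mathbf{x} = (x_1, x_2)$, $\mathbf{y} = (y_1, y_2)$.

6. (a) Let $V$ be a finite-dimensional vector space with basis $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$. Let

$$\mathbf{x} = x_1\mathbf{v}_1 + \cdots + x_n\mathbf{v}_n$$
$$\mathbf{y} = y_1\mathbf{v}_1 + \cdots + y_n\mathbf{v}_n$$

be representations of $\mathbf{x}$ and $\mathbf{y}$ in the given basis. Show that

$$\langle \mathbf{x}, \mathbf{y} \rangle = (x_1, \ldots, x_n) \cdot (y_1, \ldots, y_n)$$

defines an inner product on $V$.
(b) Show that with the inner product as defined in part (a), the basis elements $\mathbf{v}_1, \ldots, \mathbf{v}_n$ satisfy $\langle \mathbf{v}_i, \mathbf{v}_j \rangle = 0$ if $i \neq j$ and $\langle \mathbf{v}_i, \mathbf{v}_i \rangle = 1$.

7. Verify the orthogonality relations in text Example 2 by using trigonometric identities.

8. Suppose an inner product defined on $\mathbb{R}^2$ has the values $\langle (-1, 2), (-1, 2) \rangle = 4$, $\langle (2, -5), (2, -5) \rangle = 9$, and $\langle (-1, 2), (2, -5) \rangle = 5$. Calculate the lengths $\|\mathbf{e}_1\|$ and $\|\mathbf{e}_2\|$ of the standard basis vectors for the length function associated with the given inner product. [Hint: Express $\mathbf{e}_1$ and $\mathbf{e}_2$ in terms of the vectors $(-1, 2)$ and $(2, -5)$, and use homogeneity and additivity of the inner product.]

9. Show that there is no inner product on $\mathbb{R}^3$ such that $\langle \mathbf{e}_1, \mathbf{e}_1 \rangle = \langle \mathbf{e}_2, \mathbf{e}_2 \rangle = 1$, $\langle \mathbf{e}_3, \mathbf{e}_3 \rangle = 5$, $\langle \mathbf{e}_1, \mathbf{e}_2 \rangle = 0$, and $\langle \mathbf{e}_1, \mathbf{e}_3 \rangle = \langle \mathbf{e}_2, \mathbf{e}_3 \rangle = 2$. [Hint: Show that if homogeneity and additivity hold, then $\langle (2, 2, -1), (2, 2, -1) \rangle$ is negative, so positivity fails.]

10. Let $V$ be a 2-dimensional vector space with an inner product and a basis $\{\mathbf{u}, \mathbf{v}\}$, and let $\langle \mathbf{u}, \mathbf{u} \rangle = a$, $\langle \mathbf{u}, \mathbf{v} \rangle = b$, and $\langle \mathbf{v}, \mathbf{v} \rangle = c$.
(a) Let $\mathbf{x} = p\mathbf{u} + q\mathbf{v}$ and $\mathbf{y} = r\mathbf{u} + s\mathbf{v}$ be vectors in $V$. Use additivity and homogeneity of the inner product to show that

$$\langle \mathbf{x}, \mathbf{y} \rangle = \begin{pmatrix} p & q \end{pmatrix}\begin{pmatrix} a & b \\ b & c \end{pmatrix}\begin{pmatrix} r \\ s \end{pmatrix}.$$

(b) Show that $a > 0$ and $c > 0$, and that the Cauchy-Schwarz inequality implies that $b^2 < ac$.
(c) Show that if $a$, $b$, and $c$ satisfy the conditions of part (b) and $\langle \mathbf{x}, \mathbf{y} \rangle$ is defined by the formula in part (a), then $\langle \mathbf{x}, \mathbf{y} \rangle$ satisfies the conditions for being an inner product. [Hint: To show positivity, write out $\langle \mathbf{x}, \mathbf{x} \rangle$ in terms of $a$, $b$, $c$, $p$, and $q$ and use the technique of completing the square.]

11. (a) Verify that the formula $\int_{-\pi}^{\pi} f(x)g(x)\,dx$ defines an inner product on the space $C[-\pi, \pi]$ of real-valued continuous functions on $[-\pi, \pi]$. [To show that $\langle f, f \rangle > 0$ unless $f = 0$ you may assume that if $f(x) \ge 0$ is continuous but not identically zero on an interval $a \le x \le b$, then $\int_a^b f(x)\,dx > 0$.]
(b) Write out explicitly the meaning of the Cauchy-Schwarz inequality for the inner product in part (a).

12. Prove the law of cosines for general inner products by expanding $\|\mathbf{x} - \mathbf{y}\|^2$ and using the definition of the cosine of the angle $\theta$ between two vectors.

13. Prove that the Pythagorean relation $\|\mathbf{x} - \mathbf{y}\|^2 = \|\mathbf{x}\|^2 + \|\mathbf{y}\|^2$ holds if and only if $\mathbf{x}$ and $\mathbf{y}$ are orthogonal.

14. Prove that if $\|\mathbf{x}\| = \sqrt{\langle \mathbf{x}, \mathbf{x} \rangle}$ is the norm defined by an inner product $\langle \mathbf{x}, \mathbf{y} \rangle$, then $\langle \mathbf{x}, \mathbf{y} \rangle = \frac{1}{4}(\|\mathbf{x} + \mathbf{y}\|^2 - \|\mathbf{x} - \mathbf{y}\|^2)$.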
7B Orthogonal Bases
The standard basis vectors $E = \{\mathbf{e}_1, \ldots, \mathbf{e}_n\}$ in $\mathbb{R}^n$ form an orthogonal set since
$\mathbf{e}_j \cdot \mathbf{e}_k = 0$ if $j \neq k$. If in addition all the vectors in an orthogonal set have length 1, the
set is said to be orthonormal. Thus the set $E$ is orthonormal since $|\mathbf{e}_k|^2 = \mathbf{e}_k \cdot \mathbf{e}_k = 1$
for $k = 1, \ldots, n$. The more restrictive orthonormality is a useful property for a basis
$\{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ to have, because we'll see we can then compute the coordinates in a basis
representation $\mathbf{x} = u_1\mathbf{u}_1 + \cdots + u_n\mathbf{u}_n$ directly as inner products $\langle \mathbf{x}, \mathbf{u}_k \rangle$ without solving
systems of linear equations and also compute inner products $\langle \mathbf{x}, \mathbf{y} \rangle$ as dot products of
coordinate vectors. We'll also see how to construct an orthonormal basis starting from
an arbitrary given basis in a finite-dimensional space with an inner product.
The standard basis $\{\mathbf{e}_1, \ldots, \mathbf{e}_n\}$ is an orthonormal basis for $\mathbb{R}^n$ relative to the usual
dot product in $\mathbb{R}^n$ because $\mathbf{e}_i \cdot \mathbf{e}_j = 0$ if $i \neq j$, and $\mathbf{e}_i \cdot \mathbf{e}_i = 1$.

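A finite-dimensional vector space $V$ has dimension $n$ if $V$ has a basis $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$
with $n$ elements, so according to Theorem 5.9 every linearly independent set of $n$
vectors in $V$ is a basis for $V$. The following theorem shows further that if $\dim(V) = n$
and $V$ contains an orthonormal set $\{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ of $n$ vectors then this set is
automatically linearly independent and so is a basis for $V$. In addition we compute
the coordinates $(u_1, \ldots, u_n)$ of a vector $\mathbf{x}$ relative to this basis from $u_j = \langle \mathbf{x}, \mathbf{u}_j \rangle$.
7.4 Theorem. Suppose $\dim(V) = n$ and that $B = \{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ is an orthonormal
set in $V$. Then the set $B$ is a basis for $V$, and if $\mathbf{x} = u_1\mathbf{u}_1 + \cdots + u_n\mathbf{u}_n$, then
$u_j = \langle \mathbf{x}, \mathbf{u}_j \rangle$ for $j = 1, \ldots, n$.

Proof. To show independence of the $\mathbf{u}_j$, suppose $u_1\mathbf{u}_1 + \cdots + u_j\mathbf{u}_j + \cdots + u_n\mathbf{u}_n = \mathbf{0}$.
Using additivity and homogeneity, the inner product of both sides with $\mathbf{u}_j$ is

$$u_1\langle \mathbf{u}_1, \mathbf{u}_j \rangle + \cdots + u_j\langle \mathbf{u}_j, \mathbf{u}_j \rangle + \cdots + u_n\langle \mathbf{u}_n, \mathbf{u}_j \rangle = u_j\langle \mathbf{u}_j, \mathbf{u}_j \rangle = u_j = 0,$$

since $\langle \mathbf{u}_j, \mathbf{u}_j \rangle = 1$ and $\langle \mathbf{u}_k, \mathbf{u}_j \rangle = 0$ if $k \neq j$. Hence $u_j = 0$ for $j = 1, \ldots, n$, so the
$\mathbf{u}_k$ are independent and form a basis for $V$ by Theorem 5.9. If $\mathbf{x} = u_1\mathbf{u}_1 + \cdots + u_n\mathbf{u}_n$,
a similar computation assuming orthonormality shows that for $j = 1, \ldots, n$

$$\langle \mathbf{x}, \mathbf{u}_j \rangle = u_1\langle \mathbf{u}_1, \mathbf{u}_j \rangle + \cdots + u_j\langle \mathbf{u}_j, \mathbf{u}_j \rangle + \cdots + u_n\langle \mathbf{u}_n, \mathbf{u}_j \rangle = u_j\langle \mathbf{u}_j, \mathbf{u}_j \rangle = u_j.$$

Hence $u_j = \langle \mathbf{x}, \mathbf{u}_j \rangle$. ■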
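In the next two examples we use the dot product in $\mathbb{R}^2$ and $\mathbb{R}^3$.
EXAMPLE 4 As we remarked in Example 3, if we use the dot product the vectors $\mathbf{e}_1 = (1, 0)$ and $\mathbf{e}_2 = (0, 1)$ form an orthonormal basis for $\mathbb{R}^2$. The pair $\mathbf{u}_1 = \left(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right)$ and $\mathbf{u}_2 = \left(-\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right)$, shown in Figure 3.12(a), is also orthonormal. To compute coordinates relative to the basis $\{\mathbf{u}_1, \mathbf{u}_2\}$ in $\mathbb{R}^2$, we use the second part of Theorem 7.4. For example, the coordinates of the vector $(-2, 3)$ are
$$u_1 = (-2, 3) \cdot \left(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right) = \tfrac{1}{\sqrt{2}},$$
$$u_2 = (-2, 3) \cdot \left(-\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right) = \tfrac{5}{\sqrt{2}}.$$
Hence $(-2, 3) = \tfrac{1}{\sqrt{2}}\mathbf{u}_1 + \tfrac{5}{\sqrt{2}}\mathbf{u}_2$.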
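The coordinate rule of Theorem 7.4 can be checked numerically. The following is a minimal sketch, assuming the orthonormal pair is $(1/\sqrt{2}, 1/\sqrt{2})$ and $(-1/\sqrt{2}, 1/\sqrt{2})$; the helper names `dot`, `coordinates`, and `combine` are illustrative and not from the text.

```python
import math

def dot(a, b):
    """Ordinary dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))

def coordinates(x, basis):
    """Coordinates of x relative to an orthonormal basis: u_j = <x, u_j>."""
    return [dot(x, u) for u in basis]

def combine(coords, basis):
    """Rebuild the vector u_1*u_1 + ... + u_n*u_n from its coordinates."""
    n = len(basis[0])
    return [sum(c * u[i] for c, u in zip(coords, basis)) for i in range(n)]

r = 1 / math.sqrt(2)
u1 = (r, r)      # assumed first basis vector
u2 = (-r, r)     # assumed second basis vector
x = (-2.0, 3.0)

coords = coordinates(x, (u1, u2))
rebuilt = combine(coords, (u1, u2))
print(coords, rebuilt)
```

No linear system is solved here: each coordinate is a single dot product, and recombining the coordinates with the basis vectors should return the original vector up to rounding.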

FIGURE 3.12 (a), (b)

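EXAMPLE 5 Figure 3.12(b) shows the three orthogonal unit vectors
$$\mathbf{v}_1 = \left(\tfrac{2}{7}, \tfrac{6}{7}, \tfrac{3}{7}\right), \quad \mathbf{v}_2 = \left(\tfrac{6}{7}, -\tfrac{3}{7}, \tfrac{2}{7}\right), \quad \mathbf{v}_3 = \left(\tfrac{3}{7}, \tfrac{2}{7}, -\tfrac{6}{7}\right).$$
Relative to the basis $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ in $\mathbb{R}^3$, the coordinates of $(1, 2, -1)$ are
$$v_1 = (1, 2, -1) \cdot \left(\tfrac{2}{7}, \tfrac{6}{7}, \tfrac{3}{7}\right) = \tfrac{11}{7},$$
$$v_2 = (1, 2, -1) \cdot \left(\tfrac{6}{7}, -\tfrac{3}{7}, \tfrac{2}{7}\right) = -\tfrac{2}{7},$$
$$v_3 = (1, 2, -1) \cdot \left(\tfrac{3}{7}, \tfrac{2}{7}, -\tfrac{6}{7}\right) = \tfrac{13}{7}.$$
Hence $(1, 2, -1) = \tfrac{11}{7}\mathbf{v}_1 - \tfrac{2}{7}\mathbf{v}_2 + \tfrac{13}{7}\mathbf{v}_3$.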
EXAMPLE 6 Consider the infinite-dimensional subspace $S$ of $C[-\pi, \pi]$ spanned by the set of functions
$$\{1, \cos x, \sin x, \cos 2x, \sin 2x, \ldots\}.$$
In Example 2 of the previous section, we observed that relative to the inner product
$$\langle f, g\rangle = \int_{-\pi}^{\pi} f(x)g(x)\,dx,$$
the orthogonality relations
$$\langle \cos kx, \cos lx\rangle = 0, \quad k \neq l;\ k, l = 0, 1, 2, \ldots$$
$$\langle \sin kx, \sin lx\rangle = 0, \quad k \neq l;\ k, l = 1, 2, \ldots$$
$$\langle \cos kx, \sin lx\rangle = 0, \quad k = 0, 1, 2, \ldots;\ l = 1, 2, \ldots$$
all hold. It follows from Theorem 7.4 that these functions are linearly independent.
We could divide each of the functions in the orthogonal set by its length to get an orthonormal set, but we are mainly interested in computing the coefficients in a linear combination of $\cos kx$ and $\sin kx$, so what's usually done is to alter the inner product by a constant positive factor that absorbs the normalization constants. We can do this because with one exception all these factors are the same:
$$\langle 1, 1\rangle = \int_{-\pi}^{\pi} dx = 2\pi,$$
$$\langle \cos kx, \cos kx\rangle = \int_{-\pi}^{\pi} \cos^2 kx\,dx = \pi,$$
$$\langle \sin kx, \sin kx\rangle = \int_{-\pi}^{\pi} \sin^2 kx\,dx = \pi.$$
Because the factor $\pi$ occurs in each of these numbers, it's customary to alter the definition of the inner product by dividing by $\pi$ and put
$$\langle f, g\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)g(x)\,dx.$$
This doesn't change the orthogonality of the set, but now we have
$$\langle 1, 1\rangle = 2, \quad \langle \cos kx, \cos kx\rangle = 1, \quad \langle \sin kx, \sin kx\rangle = 1.$$

Defining the Fourier coefficients $a_k$ for $k = 0, 1, 2, \ldots$ and $b_k$ for $k = 1, 2, \ldots$ by
$$a_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos kx\,dx, \qquad b_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin kx\,dx,$$
allows us to write a trigonometric polynomial of degree $n$
$$T_n(x) = \frac{a_0}{2} + a_1\cos x + b_1\sin x + \cdots + a_n\cos nx + b_n\sin nx$$
derived from the Fourier coefficients of an arbitrary integrable function $f(x)$ defined for $-\pi \leq x \leq \pi$. Note that $\cos kx = 1$ when $k = 0$, and that the 2 under the $a_0$ comes from dividing by $\langle 1, 1\rangle = 2$. We'll see in Theorem 7.5 that $T_n$ is, in a sense made precise there, the best approximation to $f$ by a trigonometric polynomial of degree less than or equal to $n$.

EXAMPLE 7 Let $f(x) = |x|$ for $-\pi \leq x \leq \pi$. Then
$$a_k = \frac{1}{\pi}\int_{-\pi}^{\pi} |x|\cos kx\,dx, \qquad b_k = \frac{1}{\pi}\int_{-\pi}^{\pi} |x|\sin kx\,dx.$$
Now $|x|\sin kx$ has integral zero over $[-\pi, \pi]$, because it's an odd function. Hence $b_k = 0$ for $k = 1, 2, \ldots$. On the other hand, the graph of $|x|\cos kx$ is symmetric about the $y$-axis, so we can just double the integral over $[0, \pi]$. For $k \neq 0$ we integrate by parts, getting
$$a_k = \frac{2}{\pi}\int_0^{\pi} x\cos kx\,dx = \frac{2}{\pi}\left[\frac{x\sin kx}{k}\right]_0^{\pi} - \frac{2}{k\pi}\int_0^{\pi}\sin kx\,dx$$
$$= \frac{2}{k^2\pi}\Big[\cos kx\Big]_0^{\pi} = \frac{2}{k^2\pi}(\cos k\pi - 1) = \frac{2}{k^2\pi}\left((-1)^k - 1\right) = \begin{cases} 0, & k = 2, 4, 6, \ldots, \\ -\dfrac{4}{k^2\pi}, & k = 1, 3, 5, \ldots. \end{cases}$$
When $k = 0$, we have $a_0 = \frac{2}{\pi}\int_0^{\pi} x\,dx = \pi$. To summarize,
$$a_0 = \pi, \qquad a_k = \begin{cases} 0, & k = 2, 4, 6, \ldots, \\ -\dfrac{4}{k^2\pi}, & k = 1, 3, 5, \ldots, \end{cases} \qquad b_k = 0, \quad k = 1, 2, 3, \ldots.$$
The constant term is $a_0/2$, so the $n$th Fourier approximation $T_n(x)$ for odd $n$ is
$$T_n(x) = \frac{\pi}{2} - \frac{4}{\pi}\cos x - \frac{4}{\pi}\frac{\cos 3x}{3^2} - \cdots - \frac{4}{\pi}\frac{\cos nx}{n^2}.$$
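The closed forms for the Fourier coefficients of $|x|$ can be compared with a direct numerical evaluation of the defining integrals. This is only a rough sketch: the midpoint rule, the step count, and the names `inner`, `a`, and `b` are choices made here for illustration.

```python
import math

def inner(f, g, steps=20000):
    """Approximate (1/pi) * integral of f(x) g(x) over [-pi, pi] by the midpoint rule."""
    h = 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        x = -math.pi + (i + 0.5) * h
        total += f(x) * g(x)
    return total * h / math.pi

def a(k, f):
    """Fourier cosine coefficient a_k of f."""
    return inner(f, lambda x: math.cos(k * x))

def b(k, f):
    """Fourier sine coefficient b_k of f."""
    return inner(f, lambda x: math.sin(k * x))

f = abs
a0, a1, a2, a3 = (a(k, f) for k in range(4))
b1 = b(1, f)
print(a0, a1, a2, a3, b1)
```

Under these assumptions the printed values should be close to $\pi$, $-4/\pi$, $0$, $-4/(9\pi)$, and $0$, in agreement with the integration by parts in the example.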
Approximation of the values $f(x)$ by $T_n(x)$ in the previous example is taken up in Chapter 14, Section 8, but in the present context we consider instead another kind of approximation that's measured by the norm of a vector, be it a function $f(x)$ in $C[-\pi, \pi]$ or a vector $\mathbf{x}$ in $\mathbb{R}^n$. The main theorem is as follows, and it's one of the principal reasons that orthonormal bases are important.

7.5 Theorem. Let $S = \{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ be an orthonormal set in a vector space $V$ with inner product $\langle \mathbf{x}, \mathbf{y}\rangle$ and associated norm $\|\mathbf{x}\|$. Then for every $\mathbf{x}$ in $V$ the distance
$$d_n = \|\mathbf{x} - (u_1\mathbf{u}_1 + \cdots + u_n\mathbf{u}_n)\|$$
between $\mathbf{x}$ and points in the span of $S$ is minimized by choosing the scalar coefficients to be $u_k = \langle \mathbf{x}, \mathbf{u}_k\rangle$; thus the minimum is $d_n = 0$ if $\mathbf{x}$ is in the span of $S$. If $\mathbf{x}$ is not in the span of $S$ the minimum $d_n$ is positive and satisfies
$$d_n^2 = \|\mathbf{x}\|^2 - \sum_{k=1}^{n} \langle \mathbf{x}, \mathbf{u}_k\rangle^2.$$
If $S$ is infinite it is called a complete orthonormal set if $d_n$ tends to zero as $n$ tends to infinity for every $\mathbf{x}$ in $V$.

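Proof. We'll work with $d_n^2$, using additivity and homogeneity of the inner product:
$$\|\mathbf{x} - (u_1\mathbf{u}_1 + \cdots + u_n\mathbf{u}_n)\|^2 = \langle \mathbf{x} - (u_1\mathbf{u}_1 + \cdots + u_n\mathbf{u}_n),\ \mathbf{x} - (u_1\mathbf{u}_1 + \cdots + u_n\mathbf{u}_n)\rangle$$
$$= \|\mathbf{x}\|^2 - 2\sum_{k=1}^{n} u_k\langle \mathbf{x}, \mathbf{u}_k\rangle + \sum_{k=1}^{n} u_k^2.$$
Adding and subtracting $\sum_{k=1}^{n} \langle \mathbf{x}, \mathbf{u}_k\rangle^2$ to the last expression gives a sum of squares:
$$d_n^2 = \|\mathbf{x}\|^2 - \sum_{k=1}^{n} \langle \mathbf{x}, \mathbf{u}_k\rangle^2 + \sum_{k=1}^{n} \langle \mathbf{x}, \mathbf{u}_k\rangle^2 - 2\sum_{k=1}^{n} u_k\langle \mathbf{x}, \mathbf{u}_k\rangle + \sum_{k=1}^{n} u_k^2$$
$$= \|\mathbf{x}\|^2 - \sum_{k=1}^{n} \langle \mathbf{x}, \mathbf{u}_k\rangle^2 + \sum_{k=1}^{n} \left(\langle \mathbf{x}, \mathbf{u}_k\rangle - u_k\right)^2.$$
The $u_k$'s occur only in the last sum on the right, which is always nonnegative and takes on its minimum value of 0 just when $u_k = \langle \mathbf{x}, \mathbf{u}_k\rangle$ for every value of $k$ from 1 to $n$, so these are the values that minimize $d_n^2$. ∎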
In the previous example the inner product and norm on the vector space $C[-\pi, \pi]$ were respectively
$$\langle f, g\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)g(x)\,dx \quad\text{and}\quad \|f\| = \left(\frac{1}{\pi}\int_{-\pi}^{\pi} f^2(x)\,dx\right)^{1/2}.$$
For $f(x) = |x|$ we found the $n$th-degree trigonometric approximation for odd $n$ to be
$$T_n(x) = \frac{\pi}{2} - \frac{4}{\pi}\cos x - \frac{4}{\pi}\frac{\cos 3x}{3^2} - \cdots - \frac{4}{\pi}\frac{\cos nx}{n^2}.$$
Since the function $f(x) = |x|$ isn't differentiable it certainly isn't in the span of the trigonometric system so $\|f - T_n\|$ is always positive. Since $\|f\|^2 = \frac{1}{\pi}\int_{-\pi}^{\pi} x^2\,dx = \frac{2\pi^2}{3}$,
$$d_n^2 = \|f - T_n\|^2 = \frac{2\pi^2}{3} - \left(\frac{\pi^2}{2} + \frac{16}{\pi^2}\sum_{k=1}^{(n+1)/2} \frac{1}{(2k-1)^4}\right).$$
We're not in a position to prove it here, but the trigonometric system is complete so $d_n^2$ tends to zero as $n$ tends to infinity.
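The behavior of $d_n^2$ for $f(x) = |x|$ can be tabulated directly from the formula $d_n^2 = \frac{2\pi^2}{3} - \left(\frac{\pi^2}{2} + \frac{16}{\pi^2}\sum_{k=1}^{(n+1)/2}(2k-1)^{-4}\right)$ for odd $n$. The sketch below only evaluates that formula; it does not prove completeness, and the function name `d_squared` is an illustrative choice.

```python
import math

def d_squared(n):
    """Squared distance d_n^2 between |x| and T_n for odd n, using the
    normalized inner product (1/pi) * integral over [-pi, pi]."""
    terms = sum(1.0 / (2 * k - 1) ** 4 for k in range(1, (n + 1) // 2 + 1))
    return 2 * math.pi ** 2 / 3 - (math.pi ** 2 / 2 + 16 / math.pi ** 2 * terms)

values = [d_squared(n) for n in (1, 3, 5, 11, 51)]
for n, v in zip((1, 3, 5, 11, 51), values):
    print(n, v)
```

The values should stay positive and shrink steadily, which is consistent with the statement that $|x|$ is not in the span of the trigonometric system while $d_n^2$ still tends to zero.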
In Chapter 1, Section 5B we saw how to find the distance between a point $\mathbf{x}$ in $\mathbb{R}^2$ or $\mathbb{R}^3$ and a line or plane when these were given in the respective forms $ax + by = c$ or $ax + by + cz = d$. Using the previous theorem, we can find the distance from a point to a general $k$-plane if we first represent the $k$-plane parametrically using an orthonormal set $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$.

To find the distance from the point $(2, 3, 4)$ in $\mathbb{R}^3$ to the line $\mathbf{x} = t(1, 1, 1)$, first rewrite the line as $\mathbf{x} = s\left(\tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}\right)$. According to Theorem 7.5, the point on the line that produces the minimum distance to $(2, 3, 4)$ is the one for which the single scalar coordinate is $s = (2, 3, 4) \cdot \left(\tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}\right) = \tfrac{9}{\sqrt{3}} = 3\sqrt{3}$. Hence the minimizing point is $(3, 3, 3)$ and the minimum distance is $|(2, 3, 4) - (3, 3, 3)| = |(-1, 0, 1)| = \sqrt{2}$. If the line had been shifted so it didn't contain the origin, for example, $\mathbf{x} = t(1, 1, 1) + (1, 2, 2)$, we would have instead minimized the distance between the point $(2, 3, 4) - (1, 2, 2) = (1, 1, 2)$ and the line $\mathbf{x} = s\left(\tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}\right)$.
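The distance computation can be written as a small routine: translate so the line passes through the origin, normalize the direction, take one inner product, and measure what is left over. This is a sketch under the assumption that the line is given as a base point plus multiples of a direction vector; `distance_to_line` is a hypothetical helper name.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def distance_to_line(p, direction, base=(0.0, 0.0, 0.0)):
    """Distance from p to the line base + t*direction, and the nearest point."""
    q = [pc - bc for pc, bc in zip(p, base)]           # shift the line to the origin
    length = math.sqrt(dot(direction, direction))
    u = [d / length for d in direction]                # unit direction vector
    s = dot(q, u)                                      # the single coordinate <q, u>
    nearest = [bc + s * uc for bc, uc in zip(base, u)]
    gap = [pc - nc for pc, nc in zip(p, nearest)]
    return math.sqrt(dot(gap, gap)), nearest

dist, nearest = distance_to_line((2, 3, 4), (1, 1, 1))
shifted_dist, shifted_nearest = distance_to_line((2, 3, 4), (1, 1, 1), base=(1, 2, 2))
print(dist, nearest)
print(shifted_dist, shifted_nearest)
```

For the line through the origin this should reproduce the nearest point $(3, 3, 3)$ and distance $\sqrt{2}$; for the shifted line the same steps are applied to the translated point $(1, 1, 2)$.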
Theorem 7.5 shows that if the vectors in an $n$-dimensional space are represented using an orthonormal basis, then norms are all computable using Euclidean lengths of coordinate vectors. This is true for inner products also, from which follows the result for norms. See also Exercise 14.
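7.6 Theorem. Let $\{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ be an orthonormal set in a vector space with corresponding inner product $\langle \mathbf{x}, \mathbf{y}\rangle$. If
$$\mathbf{x} = x_1\mathbf{u}_1 + \cdots + x_n\mathbf{u}_n \quad\text{and}\quad \mathbf{y} = y_1\mathbf{u}_1 + \cdots + y_n\mathbf{u}_n,$$
then
$$\langle \mathbf{x}, \mathbf{y}\rangle = (x_1, \ldots, x_n) \cdot (y_1, \ldots, y_n) \quad\text{and}\quad \|\mathbf{x}\| = |(x_1, \ldots, x_n)|.$$
Proof. Using additivity of the inner product and orthonormality of the $\mathbf{u}_k$, we have
$$\langle \mathbf{x}, \mathbf{y}\rangle = \langle x_1\mathbf{u}_1 + \cdots + x_n\mathbf{u}_n,\ y_1\mathbf{u}_1 + \cdots + y_n\mathbf{u}_n\rangle = x_1y_1 + \cdots + x_ky_k + \cdots + x_ny_n. \ ∎$$
Note that setting $\mathbf{y} = \mathbf{u}_k$ and using $\langle \mathbf{u}_k, \mathbf{u}_k\rangle = 1$ and $\langle \mathbf{u}_j, \mathbf{u}_k\rangle = 0$ if $j \neq k$, we get $\langle \mathbf{x}, \mathbf{u}_k\rangle = x_k$. Similarly, $y_k = \langle \mathbf{y}, \mathbf{u}_k\rangle$.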
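A specific orthonormal basis is often the most natural basis for a vector space, but if all we have to start with is a natural inner product and a basis that isn't orthogonal, then Equations 7.7 and 7.8 below serve as recipes for finding orthonormal bases using what's called the Gram-Schmidt process. We start by describing the process in detail. It assumes we have in hand a linearly independent set of vectors $X = \{\mathbf{x}_1, \ldots, \mathbf{x}_n, \ldots\}$, and our aim is to produce from this an orthonormal set $\{\mathbf{u}_1, \ldots, \mathbf{u}_n, \ldots\}$ that spans the same vector subspace that $X$ does. First we pick one of the $\mathbf{x}$'s, say $\mathbf{x}_1$, and normalize $\mathbf{x}_1$ to get
$$\mathbf{u}_1 = \mathbf{x}_1/\|\mathbf{x}_1\|, \quad\text{so}\quad \|\mathbf{u}_1\| = \|\mathbf{x}_1\|/\|\mathbf{x}_1\| = 1.$$
Next we pick another of the $\mathbf{x}$'s, say $\mathbf{x}_2$, and form its projection on $\mathbf{u}_1$, shown in Figure 3.13(a) as a geometric vector defined by $\langle \mathbf{x}_2, \mathbf{u}_1\rangle\mathbf{u}_1$. The justification for using the term projection is that the vector $\mathbf{y}_2$, defined as the difference $\mathbf{y}_2 = \mathbf{x}_2 - \langle \mathbf{x}_2, \mathbf{u}_1\rangle\mathbf{u}_1$, is orthogonal to $\mathbf{u}_1$:
$$\langle \mathbf{y}_2, \mathbf{u}_1\rangle = \langle \mathbf{x}_2 - \langle \mathbf{x}_2, \mathbf{u}_1\rangle\mathbf{u}_1,\ \mathbf{u}_1\rangle = \langle \mathbf{x}_2, \mathbf{u}_1\rangle - \langle \mathbf{x}_2, \mathbf{u}_1\rangle\langle \mathbf{u}_1, \mathbf{u}_1\rangle = 0.$$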
FIGURE 3.13 (a), (b)

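The vector $\mathbf{y}_2$ can't be zero, because by its definition that would imply $\mathbf{x}_2$ and $\mathbf{u}_1$ to be linearly dependent, which they are not. Thus the vector $\mathbf{u}_2 = \mathbf{y}_2/\|\mathbf{y}_2\|$ has length 1.
Next we take $\mathbf{x}_3$ from our independent set and form its projection $\mathbf{p}$ on the subspace spanned by $\mathbf{u}_1$ and $\mathbf{u}_2$, defined by
$$\mathbf{p} = \langle \mathbf{x}_3, \mathbf{u}_1\rangle\mathbf{u}_1 + \langle \mathbf{x}_3, \mathbf{u}_2\rangle\mathbf{u}_2$$
and illustrated in Figure 3.13(b). We define $\mathbf{y}_3$ by subtracting $\mathbf{p}$ from $\mathbf{x}_3$:
$$\mathbf{y}_3 = \mathbf{x}_3 - \langle \mathbf{x}_3, \mathbf{u}_1\rangle\mathbf{u}_1 - \langle \mathbf{x}_3, \mathbf{u}_2\rangle\mathbf{u}_2.$$
We can check that $\mathbf{y}_3$ is orthogonal to both $\mathbf{u}_1$ and $\mathbf{u}_2$ because $\langle \mathbf{u}_1, \mathbf{u}_1\rangle = 1$ and $\langle \mathbf{u}_1, \mathbf{u}_2\rangle = 0$. We get
$$\langle \mathbf{y}_3, \mathbf{u}_1\rangle = \langle \mathbf{x}_3, \mathbf{u}_1\rangle - \langle \mathbf{x}_3, \mathbf{u}_1\rangle\langle \mathbf{u}_1, \mathbf{u}_1\rangle - \langle \mathbf{x}_3, \mathbf{u}_2\rangle\langle \mathbf{u}_2, \mathbf{u}_1\rangle = 0.$$
Also, because $\langle \mathbf{u}_2, \mathbf{u}_2\rangle = 1$,
$$\langle \mathbf{y}_3, \mathbf{u}_2\rangle = \langle \mathbf{x}_3, \mathbf{u}_2\rangle - \langle \mathbf{x}_3, \mathbf{u}_1\rangle\langle \mathbf{u}_1, \mathbf{u}_2\rangle - \langle \mathbf{x}_3, \mathbf{u}_2\rangle\langle \mathbf{u}_2, \mathbf{u}_2\rangle = 0.$$
If $\mathbf{y}_3 = \mathbf{0}$, then $\mathbf{x}_3$, $\mathbf{u}_1$, and $\mathbf{u}_2$ would be dependent; but this is impossible, because the subspace spanned by $\mathbf{u}_1$ and $\mathbf{u}_2$ is the same as that spanned by $\mathbf{x}_1$ and $\mathbf{x}_2$; therefore, $\mathbf{x}_3$, $\mathbf{x}_2$, and $\mathbf{x}_1$ would be dependent. Hence $\mathbf{u}_3 = \mathbf{y}_3/\|\mathbf{y}_3\|$ has length 1.
In this way we successively compute $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k$. To get $\mathbf{y}_{k+1}$ we set
$$\text{7.7}\qquad \mathbf{y}_{k+1} = \mathbf{x}_{k+1} - \langle \mathbf{x}_{k+1}, \mathbf{u}_1\rangle\mathbf{u}_1 - \cdots - \langle \mathbf{x}_{k+1}, \mathbf{u}_k\rangle\mathbf{u}_k.$$
We can verify as before that $\mathbf{y}_{k+1}$ is orthogonal to $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k$. To obtain a unit vector, we can normalize $\mathbf{y}_{k+1}$ to $\mathbf{u}_{k+1} = \mathbf{y}_{k+1}/\|\mathbf{y}_{k+1}\|$. In practice, it may be more convenient to compute the $\mathbf{y}$'s from the equivalent formula
$$\text{7.8}\qquad \mathbf{y}_{k+1} = \mathbf{x}_{k+1} - \frac{\langle \mathbf{x}_{k+1}, \mathbf{y}_1\rangle}{\|\mathbf{y}_1\|^2}\mathbf{y}_1 - \cdots - \frac{\langle \mathbf{x}_{k+1}, \mathbf{y}_k\rangle}{\|\mathbf{y}_k\|^2}\mathbf{y}_k.$$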

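The vectors $\mathbf{x}_1 = (1, -1, 2)$ and $\mathbf{x}_2 = (1, 0, -1)$ span a plane $\mathcal{P}$ in $\mathbb{R}^3$ because they are linearly independent. To find an orthogonal basis for $\mathcal{P}$, we apply the Gram-Schmidt process to the basis $\{\mathbf{x}_1, \mathbf{x}_2\}$ for $\mathcal{P}$. We set $\mathbf{y}_1 = \mathbf{x}_1$ and
$$\mathbf{y}_2 = \mathbf{x}_2 - \frac{\mathbf{x}_2 \cdot \mathbf{y}_1}{|\mathbf{y}_1|^2}\mathbf{y}_1 = (1, 0, -1) - \frac{(-1)}{6}(1, -1, 2) = \left(\tfrac{7}{6}, -\tfrac{1}{6}, -\tfrac{2}{3}\right).$$
Thus the plane $\mathcal{P}$ was defined as the set of all linear combinations
$$s\mathbf{x}_1 + t\mathbf{x}_2,$$
but can also be represented as the set of all linear combinations of $\mathbf{y}_1 = \mathbf{x}_1$ and $\mathbf{y}_2$, namely,
$$u\mathbf{y}_1 + v\mathbf{y}_2,$$
where $\{\mathbf{y}_1, \mathbf{y}_2\}$ is the orthogonal pair $\left\{(1, -1, 2), \left(\tfrac{7}{6}, -\tfrac{1}{6}, -\tfrac{2}{3}\right)\right\}$, shown in Figure 3.14(a).
FIGURE 3.14 (a), (b)
If we add a third vector $\mathbf{x}_3$ that is linearly independent of $\mathbf{x}_1$ and $\mathbf{x}_2$, we can go on to find $\mathbf{y}_3$ so that $\{\mathbf{y}_1, \mathbf{y}_2, \mathbf{y}_3\}$ is an orthogonal basis for $\mathbb{R}^3$, and $\{\mathbf{y}_1, \mathbf{y}_2\}$ is an orthogonal basis for $\mathcal{P}$. For instance, if we take $\mathbf{x}_3 = \mathbf{e}_1$, $\mathbf{y}_3$ works out to be $\left(\tfrac{1}{11}, \tfrac{3}{11}, \tfrac{1}{11}\right)$.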
EXAMPLE 11 Let $\mathcal{P}_n[-1, 1]$ be the vector space of polynomials $f(x) = a_0 + a_1x + \cdots + a_nx^n$ restricted to $-1 \leq x \leq 1$. We define an inner product
$$\langle f, g\rangle = \int_{-1}^{1} f(x)g(x)\,dx.$$
The argument used in Example 3 of Section 5A to show that the functions $1, x, \ldots, x^n$ form a basis for $\mathcal{P}_n$ works also to show that they form a basis for $\mathcal{P}_n[-1, 1]$ when restricted to $[-1, 1]$. To find an orthogonal basis, let $y_0(x) = 1$. Then let
$$y_1(x) = x - \frac{\langle x, 1\rangle}{\langle 1, 1\rangle}1 = x,$$
because $\langle x, 1\rangle = 0$. Next let
$$y_2(x) = x^2 - \frac{\langle x^2, 1\rangle}{\langle 1, 1\rangle}1 - \frac{\langle x^2, x\rangle}{\langle x, x\rangle}x = x^2 - \frac{2/3}{2} = x^2 - \frac{1}{3};$$
this is so because $\langle x^2, 1\rangle = \tfrac{2}{3}$, $\langle x^2, x\rangle = 0$, and $\langle 1, 1\rangle = 2$. The graphs of the three polynomials $y_0(x) = 1$, $y_1(x) = x$, $y_2(x) = x^2 - \tfrac{1}{3}$ are illustrated in Figure 3.14(b). We get the corresponding orthonormal set by dividing successively by $\|1\| = \sqrt{2}$, $\|x\| = \sqrt{2/3}$, and $\|x^2 - \tfrac{1}{3}\| = \sqrt{8/45}$ to get $\left\{\sqrt{\tfrac{1}{2}},\ \sqrt{\tfrac{3}{2}}\,x,\ \sqrt{\tfrac{45}{8}}\left(x^2 - \tfrac{1}{3}\right)\right\}$. Thus we have an orthonormal basis for $\mathcal{P}_2[-1, 1]$. The resulting polynomials are called normalized Legendre polynomials.
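The same Gram-Schmidt recipe can be run on polynomials if the inner product is computed exactly. In this sketch a polynomial is stored as a list of coefficients, lowest degree first, and the integral over $[-1, 1]$ uses $\int_{-1}^{1} x^n\,dx = 2/(n+1)$ for even $n$ and $0$ for odd $n$. The representation and the names `inner`, `subtract`, and `orthogonal_polynomials` are illustrative assumptions.

```python
from fractions import Fraction

def inner(p, q):
    """Exact integral over [-1, 1] of p(x) q(x) for coefficient lists p, q."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            n = i + j
            if n % 2 == 0:      # odd powers of x integrate to zero on [-1, 1]
                total += Fraction(a) * Fraction(b) * Fraction(2, n + 1)
    return total

def subtract(p, c, q):
    """Return the polynomial p - c*q, padding to a common length."""
    size = max(len(p), len(q))
    p = list(p) + [0] * (size - len(p))
    q = list(q) + [0] * (size - len(q))
    return [Fraction(a) - c * Fraction(b) for a, b in zip(p, q)]

def orthogonal_polynomials(degree):
    """Apply Formula 7.8 to the monomials 1, x, ..., x^degree."""
    ys = []
    for n in range(degree + 1):
        monomial = [0] * n + [1]
        y = list(monomial)
        for prev in ys:
            y = subtract(y, inner(monomial, prev) / inner(prev, prev), prev)
        ys.append(y)
    return ys

y0, y1, y2 = orthogonal_polynomials(2)
norms_squared = [inner(y, y) for y in (y0, y1, y2)]
print([str(c) for c in y2], [str(v) for v in norms_squared])
```

The third polynomial should come out as $x^2 - \tfrac{1}{3}$, and the squared norms as $2$, $\tfrac{2}{3}$, and $\tfrac{8}{45}$, which are the normalizing constants quoted in the example.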
EXERCISES

1. Find a vector $(x, y, z)$ in $\mathbb{R}^3$ such that the triple of vectors $(1, 1, 1)$, $(-1, \tfrac{1}{2}, \tfrac{1}{2})$, $(x, y, z)$ forms an orthogonal basis for $\mathbb{R}^3$. Then normalize this basis by dividing each vector by its length.
2. The vectors $(1, 1, 1)$ and $(1, 2, 1)$ span a plane $\mathcal{P}$ in $\mathbb{R}^3$. Use the Gram-Schmidt process to find an orthogonal basis for $\mathbb{R}^3$ in which the first two vectors form an orthogonal basis for $\mathcal{P}$.
3. Find an orthogonal basis for $\mathbb{R}^4$ in which the first three vectors form a basis for the subspace $S$ spanned by $(1, 2, 1, 1)$, $(-1, 0, 1, 0)$, and $(0, 1, 0, 2)$.
4. Let $\mathcal{P}_2$ be the three-dimensional space of quadratic polynomials $p(x) = a + bx + cx^2$, restricted so that $0 \leq x \leq 1$. If $\mathcal{P}_2$ is given the inner product
$$\langle p, q\rangle = \int_0^1 p(x)q(x)\,dx,$$
find an orthonormal basis for $\mathcal{P}_2$. [Hint: One basis for $\mathcal{P}_2$ is $\{1, x, x^2\}$.]
5. Show that applying the Gram-Schmidt process to the three vectors $(3, 0, 0)$, $(1, 1, 0)$, and $(1, 1, 1)$ in order produces an orthogonal basis that normalizes to the standard basis $\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$.
6. (a) Show that applying the Gram-Schmidt process to an orthonormal set in order gives the orthonormal set back again.
(b) Let $\{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ and $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ be two orthonormal sets in a vector space, such that the subspaces spanned by $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$ and $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ are the same for $k = 1, 2, \ldots, n$. Show that $\mathbf{u}_k = \pm\mathbf{v}_k$ for $k = 1, 2, \ldots, n$.
7. Show that if $\{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ and $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ are two orthonormal bases for a vector space, then the matrix $M$ used to change from one set of coordinates to another has columns that form an orthonormal set in $\mathbb{R}^n$. [Hint: Express $\mathbf{u}_k$ as a linear combination of $\mathbf{v}_1, \ldots, \mathbf{v}_n$, and use Theorem 7.6.]
8. Find the distance between the point $(2, 3, 4)$ and the plane parametrized by $\mathbf{x} = u(1, 2, 1) + v(1, -1, 1) + (1, 1, 2)$. Note that conveniently $(1, 2, 1) \cdot (1, -1, 1) = 0$.
9. Find the distance between the point $(2, 3, 4)$ and the plane parametrized by $\mathbf{x} = u(1, 1, 1) + v(1, -1, 1)$. Since $(1, 1, 1) \cdot (1, -1, 1) \neq 0$, use the Gram-Schmidt process first.
10. Let $S = \{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ be an orthonormal set. Prove that the vector $\sum_{k=1}^{n} u_k\mathbf{u}_k$ is orthogonal to $\mathbf{x} - \sum_{k=1}^{n} u_k\mathbf{u}_k$ if and only if $u_k = \langle \mathbf{x}, \mathbf{u}_k\rangle$. This is another way to characterize the choice of coefficients in Theorem 7.5, showing that the nearest point to $\mathbf{x}$ in the span of $S$ is the perpendicular projection of $\mathbf{x}$ onto the span of $S$.
11. For a given inner product $\langle \mathbf{x}, \mathbf{y}\rangle$ on $\mathbb{R}^n$, let $A$ be the $n$-by-$n$ matrix defined to have entries $a_{ij} = \langle \mathbf{e}_i, \mathbf{e}_j\rangle$ for $i, j = 1, \ldots, n$. Show that for arbitrary $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$, $\langle \mathbf{x}, \mathbf{y}\rangle = \mathbf{x} \cdot A\mathbf{y}$.
12. An $n$-by-$n$ matrix $A = (a_{ij})$ is symmetric if $a_{ij} = a_{ji}$ for $i, j = 1, \ldots, n$. Prove that if $A$ is symmetric then $A\mathbf{x} \cdot \mathbf{y} = \mathbf{x} \cdot A\mathbf{y}$ for all $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$.
13. Show that if $A$ is symmetric and $\langle \mathbf{x}, \mathbf{y}\rangle$ is defined to be $\mathbf{x} \cdot A\mathbf{y}$ for $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$, then $\langle \mathbf{x}, \mathbf{y}\rangle$ has all the properties of an inner product except possibly the positivity property: $\langle \mathbf{x}, \mathbf{x}\rangle > 0$ unless $\mathbf{x} = \mathbf{0}$.
14. $A$ is called a positive definite matrix if it is symmetric and $\langle \mathbf{x}, \mathbf{y}\rangle = \mathbf{x} \cdot A\mathbf{y}$ has the positivity property of an inner product. Show that a diagonal matrix is positive definite if and only if all its diagonal entries are positive.
*15. Show that a symmetric 2-by-2 matrix $A = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$ is positive definite if and only if $a$ and $\det A = ac - b^2$ are both positive.
*16. Even if $\langle \mathbf{x}, \mathbf{y}\rangle$ doesn't have the positivity property, but has the other properties of an inner product, Formula 7.8 in the Gram-Schmidt process still makes sense unless $\langle \mathbf{y}_k, \mathbf{y}_k\rangle = 0$ at some stage. This observation leads to an efficient method for determining whether a symmetric matrix is positive definite or not.
(a) Suppose that $\langle \mathbf{x}, \mathbf{y}\rangle$ is defined to be $\mathbf{x} \cdot A\mathbf{y}$ for a symmetric matrix $A$, as in Exercise 13. Show that if applying the formulas of the Gram-Schmidt process to the standard basis vectors $\{\mathbf{e}_1, \ldots, \mathbf{e}_n\}$ leads at some stage to a $\mathbf{y}_k$ with $\langle \mathbf{y}_k, \mathbf{y}_k\rangle \leq 0$ then $A$ is not positive definite.
(b) Show that if $\langle \mathbf{y}_k, \mathbf{y}_k\rangle > 0$ at every stage then $A$ is positive definite.
*17. Show that the result of Exercise 12 remains valid if $\mathbf{x}$ and $\mathbf{y}$ are allowed to have complex entries, and use this to prove that all the eigenvalues of a real symmetric matrix are real numbers. [Hint: If $A\mathbf{y} = \lambda\mathbf{y}$ with $\lambda$ and $\mathbf{y}$ complex, show that $A\bar{\mathbf{y}} = \bar{\lambda}\bar{\mathbf{y}}$ and put $\mathbf{x} = \bar{\mathbf{y}}$ in the equation of Exercise 12.]

7C Rotation and Reflection


In Section 1 we looked at examples of rotation operators in $\mathbb{R}^2$ that preserve the lengths of vector arrows with tails at the origin as well as the angles between the arrows. We saw that rotation by the angle $\theta$ about the origin in $\mathbb{R}^2$ is a linear function with matrix $R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$. For the same reasons, rotation about a line through the origin in $\mathbb{R}^3$ is a linear function $R: \mathbb{R}^3 \to \mathbb{R}^3$ and is therefore given by multiplication by a matrix $A$ whose columns are the images $R(\mathbf{e}_j)$ of the standard basis vectors. Since $R$ preserves lengths and angles, the image vectors $R(\mathbf{e}_j)$ have length 1 and are mutually perpendicular, so like the standard basis vectors, they form an orthonormal basis. Moving $\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ continuously to its new position $\{R(\mathbf{e}_1), R(\mathbf{e}_2), R(\mathbf{e}_3)\}$ we see that the image vectors continue to form a right-handed system, so their scalar triple product in that order equals $+1$. Since the scalar triple product also equals $\det(A) = 1$, we'll see in Chapter 7 that a rotation $R$ in $\mathbb{R}^3$ preserves volumes.

EXAMPLE 12 Given what we know about rotations in $\mathbb{R}^2$ it's easy to find the matrix of a rotation in $\mathbb{R}^3$ about one of the coordinate axes. If $R$ is a rotation through angle $\theta$ about the $x_1$-axis, then $R(\mathbf{e}_1) = \mathbf{e}_1$ and $R$ rotates $\mathbf{e}_2$ and $\mathbf{e}_3$ through the angle $\theta$ in the $x_2x_3$-plane, so its matrix is
$$M_\theta = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix},$$
in which the submatrix in the lower right corner is the same as the matrix $R_\theta$ of Example 8 in Section 1.

More generally, if $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ is an orthonormal basis in $\mathbb{R}^3$ and $R$ is a rotation about an axis in the direction of $\mathbf{u}_1$ through the angle $\theta$ then the matrix of $R$ relative to the basis $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ is the matrix $M_\theta$ of Example 12.
To find the matrix of $R$ relative to the standard basis, we use Theorem 6.8 in Section 6C. If $A$ is the matrix of $R$ relative to the standard basis and $M$ its matrix relative to the basis $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$, then $A = UMU^{-1}$, and $U^{-1}AU = M$, where $U$ is the matrix whose columns are $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$.

EXAMPLE 13 For a numerical example, we'll find the matrix of a rotation $R$ through $60°$ about an axis in the direction of $\mathbf{a} = (1, 1, 1)$. Figure 3.15(a) shows the standard basis vectors and their images under $R$, with the axis of rotation shown as a dotted line.

FIGURE 3.15 (a), (b)

The first step is to find an orthonormal basis $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$ with $\mathbf{u}_1$ in the same direction as $\mathbf{a}$. We could use the Gram-Schmidt process here, but the following method is less work. We'll start by finding an orthogonal basis containing $\mathbf{a}$. For a vector orthogonal to $\mathbf{a}$ we can take an arbitrary $\mathbf{b} = (b_1, b_2, b_3)$ with $\mathbf{a} \cdot \mathbf{b} = b_1 + b_2 + b_3 = 0$, for instance, $\mathbf{b} = (0, 1, -1)$. Since we are working in $\mathbb{R}^3$, we can get a third vector perpendicular to $\mathbf{a}$ and $\mathbf{b}$ by finding the cross product $\mathbf{c} = \mathbf{a} \times \mathbf{b}$, which works out to be $(-2, 1, 1)$. Now we normalize by dividing each of $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ by its length to get unit vectors $\mathbf{u}_1 = (1/\sqrt{3}, 1/\sqrt{3}, 1/\sqrt{3})$, $\mathbf{u}_2 = (0, 1/\sqrt{2}, -1/\sqrt{2})$, and $\mathbf{u}_3 = (-2/\sqrt{6}, 1/\sqrt{6}, 1/\sqrt{6})$. Using these vectors as the columns of a matrix gives
$$U = \begin{pmatrix} 1/\sqrt{3} & 0 & -2/\sqrt{6} \\ 1/\sqrt{3} & 1/\sqrt{2} & 1/\sqrt{6} \\ 1/\sqrt{3} & -1/\sqrt{2} & 1/\sqrt{6} \end{pmatrix}.$$
Since $\cos 60° = 1/2$ and $\sin 60° = \sqrt{3}/2$,
$$M = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1/2 & -\sqrt{3}/2 \\ 0 & \sqrt{3}/2 & 1/2 \end{pmatrix}$$
is the matrix of $R$ relative to the basis $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$.
Then the matrix of $R$ relative to the standard basis is $A = UMU^{-1}$ according to the formula given above. We can use orthonormality to find $U^{-1}$ without any computation. It is simply $U^t$, the transpose of $U$, obtained by flipping $U$ about its main diagonal, so that the rows of $U$ become the columns of $U^t$ and vice versa. In this case,
$$U^t = \begin{pmatrix} 1/\sqrt{3} & 1/\sqrt{3} & 1/\sqrt{3} \\ 0 & 1/\sqrt{2} & -1/\sqrt{2} \\ -2/\sqrt{6} & 1/\sqrt{6} & 1/\sqrt{6} \end{pmatrix}.$$
To see that $U^{-1} = U^t$ when $U$ is a square matrix with orthonormal columns, note that row $i$ of $U^t$ (which is column $i$ of $U$) is just $\mathbf{u}_i$, and column $j$ of $U$ is $\mathbf{u}_j$. The $ij$th entry in the product $U^tU$ is therefore $\mathbf{u}_i \cdot \mathbf{u}_j$, which is 1 when $i = j$ and 0 when $i \neq j$ because $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ is an orthonormal basis. Thus $U^tU$ is the identity matrix and therefore $U^{-1} = U^t$. Finally,
$$A = UMU^{-1} = UMU^t$$
$$= \begin{pmatrix} 1/\sqrt{3} & 0 & -2/\sqrt{6} \\ 1/\sqrt{3} & 1/\sqrt{2} & 1/\sqrt{6} \\ 1/\sqrt{3} & -1/\sqrt{2} & 1/\sqrt{6} \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1/2 & -\sqrt{3}/2 \\ 0 & \sqrt{3}/2 & 1/2 \end{pmatrix} \begin{pmatrix} 1/\sqrt{3} & 1/\sqrt{3} & 1/\sqrt{3} \\ 0 & 1/\sqrt{2} & -1/\sqrt{2} \\ -2/\sqrt{6} & 1/\sqrt{6} & 1/\sqrt{6} \end{pmatrix}$$
$$= \begin{pmatrix} 1/\sqrt{3} & 0 & -2/\sqrt{6} \\ 1/\sqrt{3} & 1/\sqrt{2} & 1/\sqrt{6} \\ 1/\sqrt{3} & -1/\sqrt{2} & 1/\sqrt{6} \end{pmatrix} \begin{pmatrix} 1/\sqrt{3} & 1/\sqrt{3} & 1/\sqrt{3} \\ 1/\sqrt{2} & 0 & -1/\sqrt{2} \\ -1/\sqrt{6} & 2/\sqrt{6} & -1/\sqrt{6} \end{pmatrix}$$
$$= \begin{pmatrix} 2/3 & -1/3 & 2/3 \\ 2/3 & 2/3 & -1/3 \\ -1/3 & 2/3 & 2/3 \end{pmatrix}$$
is the matrix of $R$ relative to the standard basis. As a partial check on the calculation, you can verify that $A\mathbf{u}_1$ is equal to $\mathbf{u}_1$ as it should be.
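The recipe $A = UMU^t$ can be carried out mechanically. The following sketch builds the orthonormal basis from an axis vector and one vector perpendicular to it, using the cross product for the third direction; the function names are illustrative, and the caller is assumed to supply a genuinely perpendicular second vector.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rotation_matrix(axis, perpendicular, degrees):
    """Standard-basis matrix of the rotation about `axis` through `degrees`.
    `perpendicular` must be orthogonal to `axis`."""
    basis = []
    for v in (axis, perpendicular, cross(axis, perpendicular)):
        length = math.sqrt(sum(t * t for t in v))
        basis.append([t / length for t in v])
    U = transpose(basis)                      # columns are u1, u2, u3
    t = math.radians(degrees)
    M = [[1, 0, 0],
         [0, math.cos(t), -math.sin(t)],
         [0, math.sin(t), math.cos(t)]]
    return matmul(matmul(U, M), transpose(U))  # U^{-1} = U^t for orthonormal columns

A = rotation_matrix((1, 1, 1), (0, 1, -1), 60)
for row in A:
    print([round(entry, 6) for entry in row])
```

For the axis $(1, 1, 1)$, the perpendicular vector $(0, 1, -1)$, and $60°$, the rows should agree with $(2/3, -1/3, 2/3)$, $(2/3, 2/3, -1/3)$, $(-1/3, 2/3, 2/3)$ up to rounding.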

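EXAMPLE 14 We'll now see how to find the axis of a rotation $R$ if we're given its matrix $A$. For a numerical example we'll take the matrix
$$A = \begin{pmatrix} \tfrac{2}{7} & \tfrac{6}{7} & \tfrac{3}{7} \\ \tfrac{6}{7} & -\tfrac{3}{7} & \tfrac{2}{7} \\ \tfrac{3}{7} & \tfrac{2}{7} & -\tfrac{6}{7} \end{pmatrix},$$
whose columns are the orthonormal vectors $\mathbf{v}_1$, $\mathbf{v}_2$, $\mathbf{v}_3$ of Example 5, and which you can check has determinant 1.
If $\mathbf{a}$ has the same direction as the axis of rotation, then $R(\mathbf{a}) = A\mathbf{a} = \mathbf{a}$, so $\mathbf{a}$ is an eigenvector of $A$ associated with the eigenvalue 1, and
$$(A - I)\mathbf{a} = \begin{pmatrix} -\tfrac{5}{7} & \tfrac{6}{7} & \tfrac{3}{7} \\ \tfrac{6}{7} & -\tfrac{10}{7} & \tfrac{2}{7} \\ \tfrac{3}{7} & \tfrac{2}{7} & -\tfrac{13}{7} \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} = \mathbf{0}.$$
Row reduction of $(A - I)$ gives a matrix with two rows $\begin{pmatrix} 1 & 0 & -3 \\ 0 & 1 & -2 \end{pmatrix}$ and a third row of zeros, from which we can read off $\mathbf{a} = (3, 2, 1)$ as a solution of $(A - I)\mathbf{a} = \mathbf{0}$, which is therefore an eigenvector of $A$ associated with the eigenvalue 1.
One way to find the angle of rotation of $R$ is to find its matrix relative to an orthonormal basis whose first basis vector has the direction of $\mathbf{a}$, as in Example 13. See Exercise 1 for details.
For a quicker way of finding the angle of a rotation directly from its matrix, see Exercise 2.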
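The claims made about the matrix in this example can be checked exactly with rational arithmetic. The sketch below takes $A$ to be $\tfrac{1}{7}$ times the integer matrix with rows $(2, 6, 3)$, $(6, -3, 2)$, $(3, 2, -6)$, which is an assumption about how the entries read; it then tests that the columns are orthonormal, that the determinant is 1, and that $(3, 2, 1)$ is left fixed.

```python
from fractions import Fraction

# Assumed entries of A: one seventh of an integer matrix.
A = [[Fraction(n, 7) for n in row] for row in ((2, 6, 3), (6, -3, 2), (3, 2, -6))]

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]

def det3(M):
    """Determinant of a 3-by-3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

columns = [list(col) for col in zip(*A)]
gram = [[sum(p * q for p, q in zip(ci, cj)) for cj in columns] for ci in columns]

axis = [3, 2, 1]
image = apply(A, axis)
determinant = det3(A)
print(gram, determinant, image)
```

If the assumed entries are right, `gram` is the identity matrix, `determinant` is 1, and `image` equals `axis`, which is what it means for $(3, 2, 1)$ to lie along the axis of rotation.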

EXAMPLE 15 Orthonormal bases make it easy to describe the geometric operation of reflection in a subspace. Let $V$ be a vector space with an inner product and an orthonormal basis $\{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$, and let $U_k$ be the subspace spanned by the first $k$ basis vectors. Also let $U_0$ be the zero subspace. The linear function $r: V \to V$ defined by
$$r(x_1\mathbf{u}_1 + \cdots + x_n\mathbf{u}_n) = x_1\mathbf{u}_1 + \cdots + x_k\mathbf{u}_k - x_{k+1}\mathbf{u}_{k+1} - \cdots - x_n\mathbf{u}_n$$
is called reflection in the subspace $U_k$. When $n$ is 2 or 3, and $k = n - 1$ this gives the familiar notion of reflection in a line in $\mathbb{R}^2$ or plane in $\mathbb{R}^3$. For instance, using the standard basis in $\mathbb{R}^3$, $U_2$ is the $xy$-plane, and reflection in it sends $(x, y, z)$ to its mirror image $(x, y, -z)$. When $k = 0$ we get reflection through the origin.

Figure 3.15(b) shows a point $\mathbf{p}$, and its reflections in the $xy$-plane, the $z$-axis, and the origin. If $\mathbf{p}$ has coordinates $(x, y, z)$ then
$$\mathbf{a} = (x, y, -z), \quad \mathbf{b} = (-x, -y, z), \quad\text{and}\quad \mathbf{c} = (-x, -y, -z).$$
We leave it as Exercise 6 to show that the operation of reflection in a subspace depends only on the subspace, and not on the particular orthonormal basis used in the preceding definition.
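Reflection in a subspace can be sketched in code by computing coordinates with inner products and flipping the signs of the coordinates beyond the first $k$. This assumes the usual definition, in which the first $k$ coordinates are kept and the remaining ones are negated; the name `reflect` and the basis ordering used for the $z$-axis case are choices made for this illustration.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def reflect(x, basis, k):
    """Reflect x in the span of the first k vectors of an orthonormal basis."""
    coords = [dot(x, u) for u in basis]                        # x_j = <x, u_j>
    signed = [c if j < k else -c for j, c in enumerate(coords)]
    n = len(x)
    return [sum(c * u[i] for c, u in zip(signed, basis)) for i in range(n)]

E = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
p = (1, 2, 3)

a = reflect(p, E, 2)                                  # reflection in the xy-plane
b = reflect(p, [(0, 0, 1), (1, 0, 0), (0, 1, 0)], 1)  # reflection in the z-axis
c = reflect(p, E, 0)                                  # reflection through the origin
print(a, b, c)
```

For $\mathbf{p} = (1, 2, 3)$ the three results should be $(1, 2, -3)$, $(-1, -2, 3)$, and $(-1, -2, -3)$, matching the pattern of the points $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ in the figure.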
EXERCISES

1. This exercise refers to the rotation $R$ with matrix $A$ of Example 14 of the text. Show that the columns $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3$ of
$$U = \begin{pmatrix} 3/\sqrt{14} & 0 & 5/\sqrt{70} \\ 2/\sqrt{14} & -1/\sqrt{5} & -6/\sqrt{70} \\ 1/\sqrt{14} & 2/\sqrt{5} & -3/\sqrt{70} \end{pmatrix}$$
are an orthonormal set, and that the axis of the rotation $R$ has the direction of $\mathbf{u}_1$. Find the matrix $M$ of the rotation $R$ relative to the basis $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ using the relation $M = U^{-1}AU = U^tAU$. Compare the result with the matrix $M_\theta$ of Example 12 in the text to find the angle of rotation.
2. The trace of a square matrix $A$ is defined to be the sum of the entries on its main diagonal, $a_{11} + \cdots + a_{nn}$.
(a) Show that if $A$ and $B$ are square matrices of the same dimension, then $\operatorname{Trace}(AB) = \operatorname{Trace}(BA)$.
(b) Show that if $P$ and $U$ are square matrices of the same dimension, and $U$ has an inverse, then $\operatorname{Trace}(U^{-1}PU) = \operatorname{Trace}(P)$.
(c) Show that if $A$ is the matrix of a rotation through an angle $\theta$ in $\mathbb{R}^3$, then $\operatorname{Trace}(A) = 1 + 2\cos\theta$. [Hint: See Exercise 1 and Example 12 in this section.]
(d) Use part (c) to find the rotation angle of the matrix $A$ in Example 14 in the text.
3. (a) Find matrices $R$ and $S$ for the rotations through $90°$ about the $x$- and $y$-axes in $\mathbb{R}^3$.
(b) Find the axis of the rotation obtained by first doing the rotation about the $x$-axis and then the rotation about the $y$-axis.
4. Do the same as in Exercise 3, but with rotations through the angle θ with cos θ = ¾ and sin θ = ~ instead of 90°.
5. Find the matrix of a rotation about the axis $(1, 1, 0)$ through the angle $\theta$ of Exercise 4.
6. Let $r: V \to V$ be reflection in a subspace $U$ of a space $V$ with an inner product as in Example 15.
(a) Show that $r$ has the property that $r(\mathbf{x}) = \mathbf{x}$ for $\mathbf{x}$ in $U$ and $r(\mathbf{x}) = -\mathbf{x}$ if $\mathbf{x}$ is perpendicular to every vector in $U$.
(b) Show that every linear function from $V$ to $V$ with the property in part (a) must be identical with $r$.

The following exercises outline the steps in a proof that every function from $\mathbb{R}^2$ to $\mathbb{R}^2$ or from $\mathbb{R}^3$ to $\mathbb{R}^3$ that preserves lengths and takes the origin to itself is either a rotation, a reflection, or the composition of a rotation and a reflection. We pointed out just before Examples 7 and 8 in Section 7C that such a function is linear and that its matrix relative to an orthonormal basis has columns that form an orthonormal set.

7. Let $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ be a matrix whose columns form an orthonormal set in $\mathbb{R}^2$.
(a) Show that $a^2 + c^2 = 1$ and either $(c, d) = (-b, a)$ or $(c, d) = (b, -a)$.
(b) Show that if $(c, d) = (-b, a)$, then $M$ is the matrix of a rotation about the origin.
(c) Show that if $(c, d) = (b, -a)$, then $M$ is the matrix of a reflection in a line through the origin in the direction of an eigenvector of $M$.

In Exercises 8 and 9, let $f: \mathbb{R}^3 \to \mathbb{R}^3$ be a function that preserves lengths, and let $A$ be its matrix relative to the standard basis.

8. Show that because $f$ preserves lengths, every eigenvalue of $f$ has absolute value 1, so every real eigenvalue is either 1 or $-1$.
*9. Because the characteristic polynomial of $A$ has degree three, it has at least one real root, so $f$ has at least one eigenvalue $\lambda = \pm 1$. Let $M$ be the matrix of $f$ relative to an orthonormal basis $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ in which $\mathbf{u}_1$ is an eigenvector of $f$ associated with $\lambda$. The columns of $M$ are an orthonormal set, because it is the matrix of $f$ relative to an orthonormal basis.
(a) Show that $M$ has the form $\begin{pmatrix} \lambda & 0 & 0 \\ 0 & a & b \\ 0 & c & d \end{pmatrix}$, with the submatrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ having columns forming an orthonormal set.
(b) Show that if $\lambda = +1$, then $f$ is either a rotation with axis $\mathbf{u}_1$ or a reflection in a plane containing $\mathbf{u}_1$.
(c) Show that if $\lambda = -1$, then $f$ is either reflection in a line perpendicular to $\mathbf{u}_1$ or the composition of a rotation with axis $\mathbf{u}_1$ and reflection in the plane perpendicular to $\mathbf{u}_1$.

Chapter 3 REVIEW

In Exercises 1 to 4 determine whether the given set of vectors is independent.
1. $\mathbf{x} = (2, 3)$, $\mathbf{y} = (-1, 2)$, $\mathbf{z} = (1, 0)$
2. $\mathbf{x} = (0, 1, 2)$, $\mathbf{y} = (0, 0, 1)$, $\mathbf{z} = (1, 0, 0)$
3. $\mathbf{x} = (-1, 2, 0, 3)$, $\mathbf{y} = (0, 1, -1, 4)$, $\mathbf{z} = (-1, 3, -1, 1)$
4. $\mathbf{x} = (0, 0, 0)$, $\mathbf{y} = (1, 0, 0)$, $\mathbf{z} = (0, 1, 0)$

In Exercises 5 to 10, find the matrix of a linear function $f$ with the given properties.
5. The null-space of $f: \mathbb{R}^3 \to \mathbb{R}^3$ is the plane spanned by $(1, 1, 1)$ and $(1, -1, 1)$, and $f(0, 0, 1) = (0, 0, 1)$.
6. The image of $f: \mathbb{R}^2 \to \mathbb{R}^3$ is the plane spanned by $(1, 1, 1)$ and $(1, -1, 1)$.
7. The null-space of $f: \mathbb{R}^3 \to \mathbb{R}^2$ is the line $\mathbf{x} = t(1, 1, 1)$, and $f(1, 0, 0) = (1, 1)$ and $f(0, 1, 0) = (1, 2)$.
8. $f: \mathbb{R}^3 \to \mathbb{R}^3$ has $f(1, 0, 0) = (0, 0, 1)$, $f(0, 0, 1) = (-1, 0, 1)$, and $f(\mathbf{x}) = \mathbf{x}$ for all points $\mathbf{x} = (0, t, 0)$.
9. $f: \mathbb{R}^3 \to \mathbb{R}^3$ has $f(2, 2, 2) = (1, 0, 1)$, and $f(\mathbf{x}) = \mathbf{x}$ for $\mathbf{x}$ in the plane spanned by $(0, 1, 1)$ and $(1, 1, 0)$.
10. $f: \mathbb{R}^4 \to \mathbb{R}^4$ has $f(\mathbf{x}) = \mathbf{x}$ for $\mathbf{x}$ in the subspace spanned by $(1, 1, 1, 1)$ and $(2, 3, 1, 2)$ and $f(\mathbf{x}) = -\mathbf{x}$ for $\mathbf{x}$ perpendicular to the same subspace. [Hint: Find an orthogonal basis whose first two vectors span the given subspace.]

In Exercises 11 to 14, $\mathcal{M}_{m,n}$ is the vector space of $m$-by-$n$ matrices described in Example 2 in Section 2A, and $B_{m,n}$ is the basis for $\mathcal{M}_{m,n}$ consisting of the $mn$ matrices $\{E_{11}, E_{12}, \ldots, E_{mn}\}$, where $E_{ij}$ is the $m$-by-$n$ matrix with 1 in the $ij$th position and zeros elsewhere.
11. Let $f: \mathcal{M}_{2,2} \to \mathcal{M}_{2,3}$ be defined by f(M) = M( -! ~ ;) for $M$ in $\mathcal{M}_{2,2}$. Find the matrix of $f$ relative to the bases $B_{2,2}$ and $B_{2,3}$.
12. Find a basis for the image of the function $f$ of Exercise 11.
13. Let $f: \mathcal{M}_{2,2} \to \mathcal{M}_{2,2}$ be defined by f(M) = (! _;) M (-: -! ) for $M$ in $\mathcal{M}_{2,2}$. Find the matrix of $f$ relative to the basis $B_{2,2}$.
14. Find a basis for the null-space of the function $f$ of Exercise 13.
15. (a) Find matrices $R$ and $S$ for the rotations through $90°$ about the $x$-axis and an angle $\theta$ about the $y$-axis in $\mathbb{R}^3$.
(b) Find the axis and angle of the rotation with matrix $R^{-1}SR$.
16. Find the matrix of a rotation through $30°$ about the line $t(1, 1, 0)$ in $\mathbb{R}^3$.

In Exercises 17 to 20, find the eigenvalues of the matrix; then find associated eigenvectors and determine whether there is a basis of eigenvectors for the matrix operator.
17. ( _; -3) -4
18. ( -4-3 ~)
19. 20. n -b)u D (g -2 1 2 2 -I I
21. Let $A = \begin{pmatrix} 0 & a & b \\ -a & 0 & c \\ -b & -c & 0 \end{pmatrix}$, with $a$, $b$, and $c$ real numbers.
(a) Find the eigenvalues of $A$, and show that one of them is always real.
(b) Find an eigenvector associated with the real eigenvalue.
*(c) Find three linearly independent eigenvectors of $A$ when $a = 1$, $b = 2$, and $c = -2$. (You'll need to use complex numbers.)
22. (a) Show that the polynomials such that $p(0) = p(1)$ form a subspace of the vector space $\mathcal{P}_3[-1, 1]$ defined in Example 11 of Section 7B, and find a basis for it.
(b) For the same subspace find a basis that is orthogonal relative to the inner product defined in Example 11 of the text.
23. Let $M$ be an $n$-by-$n$ matrix with $M^2 = M$. Let $\{\mathbf{v}_1, \ldots, \mathbf{v}_r\}$ be a basis for the null-space of $M$, and $\{\mathbf{w}_1, \ldots, \mathbf{w}_s\}$ a basis for its image. Show that $\{\mathbf{v}_1, \ldots, \mathbf{v}_r, \mathbf{w}_1, \ldots, \mathbf{w}_s\}$ is a basis for $\mathbb{R}^n$, so that $r + s = n$.
24. (a) Find the 2-by-2 matrix $C$ that converts coordinates in $\mathbb{R}^2$ relative to the standard basis $\{\mathbf{e}_1, \mathbf{e}_2\}$ into coordinates relative to the basis $\{\mathbf{v}_1, \mathbf{v}_2\} = \{(1, 1), (-1, 1)\}$.
(b) Suppose that $f: \mathbb{R}^2 \to \mathbb{R}^2$ has matrix A = ( _i -~ ) relative to the standard basis in $\mathbb{R}^2$. Compute the matrix $CAC^{-1}$, where $C$ is the matrix found in part (a), and interpret the result as a matrix of the transformation $f$.
CHAPTER 4

DERIVATIVES

A real-valued function $f(x)$ defined on some interval $a < x < b$ has a derivative at a point $x$ in its domain interval, denoted by $f'(x)$, if
$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}.$$
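The limit in this definition can be watched numerically by evaluating the difference quotient for shrinking values of $h$. The sketch below uses $f(x) = \sin x$ at the arbitrarily chosen point $x_0 = 0.3$, where the derivative is known to be $\cos x_0$; the function name and the sample values of $h$ are illustrative choices.

```python
import math

def difference_quotient(f, x, h):
    """The quotient (f(x + h) - f(x)) / h whose limit as h -> 0 defines f'(x)."""
    return (f(x + h) - f(x)) / h

x0 = 0.3
quotients = [difference_quotient(math.sin, x0, 10.0 ** -p) for p in range(1, 7)]
errors = [abs(q - math.cos(x0)) for q in quotients]
for q, e in zip(quotients, errors):
    print(q, e)
```

The errors should shrink roughly in proportion to $h$. For much smaller $h$ than those used here, floating-point rounding would eventually dominate, so the numerical quotient is only a guide to the limit, not a substitute for it.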
Fundamental interpretations such as velocity and slope give the derivative a primary place in applied mathematics and in geometry, and the formulas of one-variable calculus provide techniques for dealing with the functions such as the trigonometric and exponential functions that arise in these areas.
We'll assume that the reader knows the rules and elementary examples of calculus, and that pictures are familiar that show the graph of a function $f(x)$ along with its tangent line at a point $x_0$ as in Figure 4.1. The purpose of this chapter is to begin extending the definition and interpretations of the derivative from real-valued functions of a real variable, associated with formulas such as $y = f(x)$, to vector-valued functions of a vector variable. The resulting notational change required in this last formula is fairly slight, since we'll now write $\mathbf{y} = f(\mathbf{x})$, but the supply of applications and interpretations will increase considerably. The elementary techniques and examples of one-variable calculus will continue to play an important role throughout the rest of the book. Thus what we're about to do will provide a review of that material.
FIGURE 4.1
Graph with tangent line.

When we speak of a function $f$ with domain $D$ in $\mathbb{R}^n$ and range or image in
$\mathbb{R}^m$, we mean a function defined for all $\mathbf{x}$ in $D$ and taking its values in some subset
of $\mathbb{R}^m$. We'll sometimes use the notation
$$f : D \to \mathbb{R}^m$$
for such functions. When we don't need to call attention to the precise domain $D$ of
$f$ we'll sometimes use the notation
$$\mathbb{R}^n \xrightarrow{f} \mathbb{R}^m$$
to describe a function with domain a subset of $\mathbb{R}^n$ and image a subset of $\mathbb{R}^m$.


We often describe a function $f$ from $\mathbb{R}^n$ to $\mathbb{R}^m$ by a collection of $m$ real-valued
functions called coordinate functions. The coordinate functions will have the same
domain as $f$ itself, and if
$$f(\mathbf{x}) = (f_1(\mathbf{x}), \ldots, f_k(\mathbf{x}), \ldots, f_m(\mathbf{x})),$$
then the real-valued function $f_k$ is called the $k$th coordinate function of $f$. For
example, if
$$f(x, y) = (x^2 - y^2, 2xy),$$
then $f_1(x, y) = x^2 - y^2$ and $f_2(x, y) = 2xy$ are the coordinate functions of $f$.
The linear functions $f(\mathbf{x})$ from $\mathbb{R}^n$ to $\mathbb{R}^m$ are just those functions with domains the
whole space $\mathbb{R}^n$ whose $m$ real-valued coordinate functions have the simple form
$$f_i(x_1, \ldots, x_n) = a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n.$$

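In terms of an $m$-by-$n$ matrix $A = (a_{ij})$, the image points $\mathbf{y}$ of a linear function $f$
are given by the matrix-vector product $f(\mathbf{x}) = A\mathbf{x}$ of Definition 2.1 in Chapter 2, where
$\mathbf{x}$ is in $\mathbb{R}^n$. For example, the function $f : \mathbb{R}^2 \to \mathbb{R}^2$ given by
$$y_1 = x_1 + x_2$$
$$y_2 = x_1 - x_2$$
has matrix form
$$\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.$$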
SECTION 1 FUNCTIONS OF ONE VARIABLE
Here we take up the most straightforward generalization of calculus for real-valued
functions of one real variable: vector-valued functions $\mathbf{x} = f(t)$ of a real variable $t$.
An important difference between vector-valued functions and real-valued functions
is one of geometric interpretation: for vector functions we usually study the image of
$f$, namely the vector values actually taken by $f$, rather than the graph of an equation
$\mathbf{x} = f(t)$. For notation we'll sometimes write vectors as columns instead of rows
with comma separations; this practice sometimes results in a more readable display
and is often required in the context of matrix multiplication.
1A Derivatives
If a point moves in space so as to occupy various positions at a progression of times,
then its position at time t generates a vector-valued position function f with values
$f(t)$. In particular if the position of a point in $\mathbb{R}^3$ at time $t$ is given by
$$f(t) = t\mathbf{x}_1 + \mathbf{x}_0,$$
where $\mathbf{x}_1$ and $\mathbf{x}_0$ are fixed vectors in $\mathbb{R}^3$, then the point is moving on a straight line
in $\mathbb{R}^3$ parallel to $\mathbf{x}_1$ and passing through $\mathbf{x}_0$ as in Figure 4.2(a). More generally, a
function $f$ taking values in $\mathbb{R}^n$ is typically defined in the form
$$f(t) = (f_1(t), \ldots, f_n(t)),$$
where the coordinate functions $f_1(t), \ldots, f_n(t)$ denote the real-valued coordinates
of a point in $\mathbb{R}^n$ at time $t$. This generalization to dimensions higher than 2 or 3 is
not a purely theoretical concept; for instance it's crucial for describing the dynamics
of planetary motion in Chapter 12, Section 3.

FIGURE 4.2
(a) Line, (b) Parabola.

EXAMPLE 1  If $\mathbf{x}_1 = (x_1, y_1, z_1)$ and $\mathbf{x}_0 = (x_0, y_0, z_0)$ are points in $\mathbb{R}^3$, then the function $\mathbb{R} \xrightarrow{f} \mathbb{R}^3$
with image the points $\mathbf{x}(t)$ given by
$$\mathbf{x}(t) = f(t) = t(x_1, y_1, z_1) + (x_0, y_0, z_0) = (tx_1 + x_0, ty_1 + y_0, tz_1 + z_0)$$
gives a parametric representation of a line in $\mathbb{R}^3$ as in Figure 4.2(a).

EXAMPLE 2  The function $g$ from $\mathbb{R}$ to $\mathbb{R}^2$ for which
$$g(t) = (t, t^2)$$
describes a curve in $\mathbb{R}^2$. Because the coordinates $x = t$ and $y = t^2$ satisfy the relation
$y = x^2$, the point $(t, t^2)$ always lies on the parabola with equation $y = x^2$, shown
in Figure 4.2(b).

We define the limit of a vector-valued function $f$ with values in $\mathbb{R}^n$ by using
limits of the real-valued coordinate functions $f_k$ of $f$. Thus if
$$f(t) = (f_1(t), \ldots, f_n(t))$$
is defined for an interval $a < t < b$ containing $t_0$, we write
$$\lim_{t \to t_0} f(t) = \left(\lim_{t \to t_0} f_1(t), \ldots, \lim_{t \to t_0} f_n(t)\right).$$
Similarly a function with values in $\mathbb{R}^n$ is said to be continuous if its real-valued
coordinate functions are all continuous on their common domain interval. These
definitions are treated more generally in Chapter 5, Section 1.

EXAMPLE 3  The function defined by $g(t) = (t, t^2)$ has limit vector $(2, 4)$ at $t = 2$ because
$$\lim_{t \to 2}(t, t^2) = \left(\lim_{t \to 2} t, \lim_{t \to 2} t^2\right) = (2, 4).$$
The function $g$ is continuous for all real $t$ because the coordinate functions $t$ and $t^2$
are continuous.

FIGURE 4.3
(a) Continuous image, (b) Discontinuous image.

The intuitive idea behind continuity of a vector-valued function is similar to that
for a real-valued function: The values of the function should not change abruptly.
Figure 4.3(a) shows the image of a continuous function and Figure 4.3(b) shows the
image of a discontinuous function, both from domain in $\mathbb{R}$ to range space $\mathbb{R}^3$. For
now we'll consider only continuous functions $\mathbb{R} \xrightarrow{g} \mathbb{R}^n$ with $g(t)$ defined on an open
interval $a < t < b$. We'll first define the derivative of $g$ and show how it leads to a
definition of tangent line to the image curve of $g$.
A function $g(t)$ has a derivative $g'(t)$ at a point $t$ in an interval $a < t < b$ if
$$g'(t) = \lim_{h \to 0} \frac{g(t + h) - g(t)}{h},$$
assuming the limit exists. If the limit exists for each $t$ in $(a, b)$, then $g'(t)$ determines
a new function $\mathbb{R} \xrightarrow{g'} \mathbb{R}^n$, just as in the case $n = 1$. The derivative is often
written $dg/dt$.

EXAMPLE 4  Let $g(t) = (t^2, t^3)$. Then writing $g(t)$ as a column vector, we have
$$\lim_{h \to 0} \frac{g(t + h) - g(t)}{h} = \lim_{h \to 0} \frac{1}{h} \begin{pmatrix} (t + h)^2 - t^2 \\ (t + h)^3 - t^3 \end{pmatrix} = \lim_{h \to 0} \begin{pmatrix} \dfrac{(t + h)^2 - t^2}{h} \\[2mm] \dfrac{(t + h)^3 - t^3}{h} \end{pmatrix}.$$
The two entries in this vector have as limits the derivatives of $t^2$ and $t^3$, respectively.
Hence by the definition of the derivatives,
$$\lim_{h \to 0} \frac{(t + h)^2 - t^2}{h} = 2t \quad \text{and} \quad \lim_{h \to 0} \frac{(t + h)^3 - t^3}{h} = 3t^2.$$
By the definition of vector limit the vector limit $g'(t)$ exists, and $g'(t) = (2t, 3t^2)$.

Example 4 suggests that a function $\mathbb{R} \xrightarrow{g} \mathbb{R}^n$ has a derivative at a point $t$ if and
only if each coordinate function of $g$ has a derivative there. This is true, and we
have

1.1  If $g(t) = \begin{pmatrix} g_1(t) \\ \vdots \\ g_n(t) \end{pmatrix}$, then $g'(t) = \begin{pmatrix} g_1'(t) \\ \vdots \\ g_n'(t) \end{pmatrix}$,

where each derivative $g_k'(t)$ is an ordinary derivative of a real-valued function of a
real variable $t$. The resulting vector expression for $g'(t)$ is an immediate consequence
of the definition of the limit function in terms of limits of its coordinate functions.
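
Formula 1.1 can be checked numerically. The following is a minimal sketch, not part of the text: the helper name `derivative`, the step size `h`, and the sample point `t = 1.5` are all assumptions chosen for illustration. It estimates each coordinate derivative of the curve of Example 4 by a central difference quotient and compares the result with $(2t, 3t^2)$.

```python
def derivative(g, t, h=1e-6):
    """Central-difference estimate of g'(t), taken one coordinate at a time."""
    ahead = g(t + h)
    behind = g(t - h)
    return tuple((p - q) / (2 * h) for p, q in zip(ahead, behind))


def g(t):
    # The curve of Example 4: g(t) = (t^2, t^3).
    return (t ** 2, t ** 3)


t = 1.5                        # sample point, chosen arbitrarily
approx = derivative(g, t)      # numerical estimate of g'(t)
exact = (2 * t, 3 * t ** 2)    # g'(t) = (2t, 3t^2) by Formula 1.1
print(approx)
print(exact)
```

The two printed vectors should agree to several decimal places; the difference quotient is only an approximation, so exact equality is not expected.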

EXAMPLE 5  If $g(t) = \begin{pmatrix} \cos t \\ \sin t \end{pmatrix}$, then $g'(t) = \begin{pmatrix} -\sin t \\ \cos t \end{pmatrix}$. Note that as $t$ varies the curve
traced in $\mathbb{R}^2$ by $g(t)$ is a circle of radius 1 centered at the origin. Indeed $|g(t)| =
\sqrt{\cos^2 t + \sin^2 t} = 1$, and the geometric definition of $\cos t$ and $\sin t$ is based on the
interpretation of $t$ as the counterclockwise angle that the radius at $g(t)$ makes with
the positive horizontal axis, as shown in Figure 4.4(a). By a similar argument, $g(-t)$
also traces the same circle, but in the clockwise direction.

EXAMPLE 6  If $h(t) = \begin{pmatrix} t \\ t^2 \\ t^3 \end{pmatrix}$, then $h'(t) = \begin{pmatrix} 1 \\ 2t \\ 3t^2 \end{pmatrix}$. For $0 \le t \le 1$, the points $h(t)$ trace the
curve in $\mathbb{R}^3$ shown in Figure 4.4(b). As a guide for sketching this curve, observe
that its perpendicular projection into the $xy$-plane is the parabola $y = x^2$, and its
projection into the $xz$-plane is the cubic $z = x^3$. The projection $y = z^{2/3}$ into the
$yz$-plane is less familiar; for that interesting curve see Example 9.

Figure 4.4(c) shows that, as $h$ tends to 0, the vector $g(t + h) - g(t)$ has a direction
that should tend to what we would like to call the tangent direction to the curve $\gamma$
at $g(t)$. However, since $g$ is assumed continuous,
$$\lim_{h \to 0} \big(g(t + h) - g(t)\big) = 0,$$
and the zero vector that we get as a limit has no direction. The standard way to
overcome this difficulty is to divide by $h$ before letting $h$ tend to zero. Observe that
division by $h$ will not change the direction of $g(t + h) - g(t)$ if $h$ is positive; it
will reverse it if $h$ is negative. A glance at Figure 4.4(c) shows that this reversal
is desirable for our purposes, because we want the tangent vector to point in the
direction of increasing $t$ along the curve. (What would happen if we divided by $|h|$
instead?) If the derivative $g'(t)$ exists and is not zero, then $g'(t)$ defines the standard
tangent vector to $\gamma$ at $g(t)$. A positive multiple of $g'(t)$ has the same direction as
the standard tangent and so is also called a tangent vector. In this context we'll often
use the notation $\mathbf{x}(t)$ for position instead of $g(t)$ and also use Newton's overdot
notation $\dot{\mathbf{x}}(t) = g'(t)$ for the vector derivative, particularly when $t$ is interpreted as
a time parameter.
The tangent vector arrow $\dot{\mathbf{x}}(t_0)$ is usually pictured so that its tail is at $\mathbf{x}(t_0)$ as in
Figure 4.4(c). The line with direction vector $\dot{\mathbf{x}}(t) = g'(t)$ and passing through $g(t)$
is called the tangent line to $\gamma$ at $\mathbf{x}(t)$. Thus if $\mathbf{x}(t_0)$ is a particular point on a curve,
the tangent line at $\mathbf{x}(t_0)$ will have a parametric representation of the form
$$\mathbf{t}(t) = t\,\dot{\mathbf{x}}(t_0) + \mathbf{x}(t_0).$$

FIGURE 4.4
(a), (b), (c)

EXAMPLE 7  The circle of Example 5 has points $\mathbf{x}(t) = (\cos t, \sin t)$ and tangent vector $\dot{\mathbf{x}}(t) =
(-\sin t, \cos t)$. A typical tangent vector appears in Figure 4.4(a).

Definition  If a curve is the image of a function $g(t)$ such that the derivative $g'(t)$ is
(i) continuous and (ii) never zero, then the curve is called smooth.

The condition that g' (t) be nonzero requires the curve to have a well-defined
tangent line at every point.
The condition that g'(t) be continuous means that the direction and length of the
tangent vector g'(t) change continuously as the point g(t) moves along the curve.
Here's an example of a smooth curve that we'll encounter often.

EXAMPLE 8  The image curve defined parametrically by $\mathbf{x}(t) = (\cos t, \sin t, t)$ lies on the cylinder
of radius 1 shown in Figure 4.5(a). The image curve is called a helix. If we
temporarily set the third coordinate function of $\mathbf{x}(t)$ equal to 0, the image is a circle of
radius 1 centered at $(0, 0, 0)$, because
$$|(\cos t, \sin t, 0)| = \sqrt{\cos^2 t + \sin^2 t} = 1.$$
But with $t$ again in the third coordinate, the image point rises as $t$ increases, in the
direction of the $z$-axis, as shown in Figure 4.5(a), lying all the while above the points
of the circle. A tangent vector is
$$\dot{\mathbf{x}}(t) = (-\sin t, \cos t, 1)$$
and the tangent line to the helix at $\mathbf{x}(0) = (1, 0, 0)$ has the parametric representation
$$\mathbf{t}(t) = t\,\dot{\mathbf{x}}(0) + \mathbf{x}(0) = t(0, 1, 1) + (1, 0, 0).$$
Note that $\dot{\mathbf{x}}(t)$ is a continuous function, and is never zero, so the helix is a smooth
curve.
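
The tangent-line computation for the helix can be sketched in a few lines of code. This is an illustration only; the function names `helix`, `helix_velocity`, and `tangent_line` are hypothetical and do not come from the text. The sketch builds the line $s \mapsto s\,\dot{\mathbf{x}}(t_0) + \mathbf{x}(t_0)$ at $t_0 = 0$ and also confirms that the tangent vector has length $\sqrt{2}$, so it is never zero.

```python
import math


def helix(t):
    # Position on the helix x(t) = (cos t, sin t, t).
    return (math.cos(t), math.sin(t), t)


def helix_velocity(t):
    # Tangent vector (-sin t, cos t, 1), computed coordinate by coordinate.
    return (-math.sin(t), math.cos(t), 1.0)


def tangent_line(t0):
    """Return the map s -> s * x'(t0) + x(t0) for the helix."""
    point = helix(t0)
    direction = helix_velocity(t0)
    return lambda s: tuple(s * d + p for d, p in zip(direction, point))


line = tangent_line(0.0)
start = line(0.0)     # the point of tangency x(0)
later = line(2.0)     # the point 2 * (0, 1, 1) + (1, 0, 0)
speed = math.sqrt(sum(v * v for v in helix_velocity(0.7)))
print(start, later, speed)
```

Evaluating `line` at other parameter values gives further points on the same tangent line.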

EXAMPLE 9  If a point moves in the plane so that at time $t$ its position is $\mathbf{x}(t) = (t^2, t^3)$, then the
tangent vector is $\dot{\mathbf{x}}(t) = (2t, 3t^2)$, with length $|\dot{\mathbf{x}}(t)| = (4t^2 + 9t^4)^{1/2}$. In particular,
$\dot{\mathbf{x}}(0) = 0$. The sketch of the path traced by $\mathbf{x}(t)$ is in Figure 4.5(b) for $-1 \le t \le 1$.
In making the picture it's helpful to observe that the coordinates of a point on the
path satisfy the equation $x = y^{2/3}$; since $x = t^2 \ge 0$ here we also have $y = \pm x^{3/2}$.
The tangent vector shrinks to zero in this example as $\mathbf{x}(t)$ approaches the origin
because, with continuously varying $\dot{\mathbf{x}}(t)$, its length becomes instantaneously zero at
the abrupt change in the direction of motion shown in Figure 4.5(b). In this way the
parametrization describes the geometric situation well. The curve certainly doesn't
deserve to be called smooth at $(0, 0)$, and is said to have a cusp there.

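FIGURE 4.5
(a) Helix, (b) Cusp.

We list here some useful formulas that hold if two vector-valued functions $\mathbf{x}(t) =
f(t)$ and $\mathbf{y}(t) = g(t)$ have vector derivatives on an interval $a < t < b$; we assume
$\varphi(t)$ and $u(t)$ are real valued and differentiable on the same interval.

1.2  $\dfrac{d}{dt}(\mathbf{x} + \mathbf{y}) = \dot{\mathbf{x}} + \dot{\mathbf{y}}$, $\quad \dfrac{d}{dt}(c\mathbf{x}) = c\dot{\mathbf{x}}$, $c$ constant

1.3  $\dfrac{d}{dt}(\varphi\mathbf{x}) = \varphi\dot{\mathbf{x}} + \varphi'\mathbf{x}$

1.4  $\dfrac{d}{dt}(\mathbf{x} \cdot \mathbf{y}) = \dot{\mathbf{x}} \cdot \mathbf{y} + \mathbf{x} \cdot \dot{\mathbf{y}}$

1.5  $\dfrac{d}{dt}\big(\mathbf{x}(u)\big) = u'(t)\,\dot{\mathbf{x}}(u)$, with $u$ taking values in the domain of $\mathbf{x}(t)$

1.6  $\dfrac{d}{dt}(\mathbf{x} \times \mathbf{y}) = \dot{\mathbf{x}} \times \mathbf{y} + \mathbf{x} \times \dot{\mathbf{y}}$, if $\mathbf{x}$ and $\mathbf{y}$ take values in $\mathbb{R}^3$

The preceding formulas all follow from writing $\mathbf{x} = f(t)$ and $\mathbf{y} = g(t)$ in terms
of their coordinate functions and then applying the corresponding differentiation
formulas for real-valued functions along with Formula 1.1. For example, the proof
of 1.5, a version of the chain rule for differentiation, goes like this:
$$\frac{d}{dt}\mathbf{x}(u) = \big(f_1(u), \ldots, f_n(u)\big)' = \big([f_1(u)]', \ldots, [f_n(u)]'\big) = \big(f_1'(u)u', \ldots, f_n'(u)u'\big) = u'\,\dot{\mathbf{x}}(u).$$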
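
A numerical spot check of the product rules 1.4 and 1.6 may help make them concrete. The sketch below is an illustration under stated assumptions: the two sample curves, the evaluation point `t = 0.8`, and the helper names (`dot`, `cross`, `diff_scalar`, `diff_vector`) are all chosen here and are not taken from the text. Each left side is estimated by a central difference quotient, and each right side is computed from the known coordinate derivatives.

```python
import math


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cross(a, b):
    # Cross product of two vectors in R^3.
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


# Two sample curves in R^3 with their coordinatewise derivatives.
def x(t):  return (math.cos(t), math.sin(t), t)
def dx(t): return (-math.sin(t), math.cos(t), 1.0)
def y(t):  return (t, t ** 2, t ** 3)
def dy(t): return (1.0, 2 * t, 3 * t ** 2)


def diff_scalar(F, t, h=1e-5):
    # Central difference quotient for a real-valued function.
    return (F(t + h) - F(t - h)) / (2 * h)


def diff_vector(F, t, h=1e-5):
    # Central difference quotient applied to each coordinate.
    return tuple((p - q) / (2 * h) for p, q in zip(F(t + h), F(t - h)))


t = 0.8
lhs_dot = diff_scalar(lambda s: dot(x(s), y(s)), t)
rhs_dot = dot(dx(t), y(t)) + dot(x(t), dy(t))            # Formula 1.4
lhs_cross = diff_vector(lambda s: cross(x(s), y(s)), t)
rhs_cross = tuple(p + q for p, q in
                  zip(cross(dx(t), y(t)), cross(x(t), dy(t))))   # Formula 1.6
print(lhs_dot, rhs_dot)
print(lhs_cross, rhs_cross)
```

Agreement to several decimal places is evidence for the formulas at this one point, not a proof; the proofs are the coordinate arguments described above.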


1B Velocity and Speed
One reason for singling out $\dot{\mathbf{x}}(t) = g'(t)$ for special attention as the standard tangent
vector, rather than some multiple of it, is that we often want to consider the parameter
$t$ as a time variable, with $\mathbf{x}(t)$ tracing the path of a point moving in $\mathbb{R}^n$. Under this
interpretation, the Euclidean length $|\dot{\mathbf{x}}(t)| = |g'(t)|$ is the natural definition for the
speed of motion along the path $\gamma$ described by $g(t)$ as $t$ varies. To justify the use
of the term speed, we observe that, for small $h$, the number $|g(t + h) - g(t)|/|h|$ is
close to the average rate of traversal of $\gamma$ over a sufficiently short interval from $t$ to
$t + h$. In addition, if $g'(t)$ exists, we'll now show that
$$\lim_{h \to 0} \frac{|g(t + h) - g(t)|}{|h|} = |g'(t)|.$$
By the triangle inequality in the reversed form $\big||\mathbf{x}| - |\mathbf{y}|\big| \le |\mathbf{x} - \mathbf{y}|$ (see p. 32),
$$\left|\frac{|g(t + h) - g(t)|}{|h|} - |g'(t)|\right| \le \left|\frac{g(t + h) - g(t)}{h} - g'(t)\right|.$$
The right side tends to zero as $h$ tends to zero by the definition of $g'(t)$. Hence the
left side tends to zero also. Thus $|g'(t)|$ is a limit of average rates over arbitrarily
small time intervals. It's for this reason that the real-valued function $v$ defined by
$v(t) = |g'(t)|$ is called the speed of $g$. It follows that it's natural to call the vector
$\mathbf{v}(t) = g'(t)$ the velocity vector of the motion at the point $g(t)$. Note that the vector
$\mathbf{v}(t)$ is identical to what we called the standard tangent vector to $\gamma$ at $g(t)$ if $\mathbf{v}(t) \ne 0$.
Velocity $\mathbf{v}(t) = 0$ indicates speed zero and no direction at time $t$.

EXAMPLE 10  Let $\mathbf{x}(t) = (a\cos t, a\sin t, bt)$ with $a$ and $b$ nonzero constants. This is a more
general helix than the one in Example 8, where we took $a = b = 1$. Figure 4.6(a)
shows the choice $a = 1$, $b = \tfrac{1}{2}$ along with $a = -1$, $b = \tfrac{1}{2}$ as a dotted curve.
The two together outline the general configuration of the double helix portion of
the DNA molecule. The velocity at time $t$ is $\dot{\mathbf{x}}(t) = \mathbf{v}(t) = (-a\sin t, a\cos t, b)$.
It follows that the velocity vector is always perpendicular to the vector $\mathbf{r}(t) =
(a\cos t, a\sin t, 0)$, which points horizontally from the axis of the spiral to $\mathbf{x}(t)$. To
see this just check that
$$\mathbf{v}(t) \cdot \mathbf{r}(t) = (-a\sin t, a\cos t, b) \cdot (a\cos t, a\sin t, 0) = 0.$$
The speed at time $t$ is equal to the constant $\sqrt{a^2 + b^2}$, because
$$|\mathbf{v}(t)| = |(-a\sin t, a\cos t, b)| = \sqrt{a^2\sin^2 t + a^2\cos^2 t + b^2} = \sqrt{a^2 + b^2}.$$

FIGURE 4.6
(a), (b)

1C Higher-Order Derivatives; Acceleration

If $\mathbb{R} \xrightarrow{g} \mathbb{R}^n$ has a derivative $\mathbb{R} \xrightarrow{g'} \mathbb{R}^n$, then we can ask for the derivative of $g'$,
which we denote by $g''$ or $d^2g/dt^2$. It may happen that $g''$ is defined at fewer points
than $g$ or $g'$. We write $g^{(3)}$ or $d^3g/dt^3$, and so on for higher-order derivatives, though
these will occur rarely in what follows. The reason is that derivatives of order $n > 2$
don't have as interesting and intuitively appealing interpretations as the first-order
and second-order ones.

EXAMPLE 11  Suppose that $g(t) = (r\cos\omega t, r\sin\omega t, ct)$, where $r$, $c$, and $\omega$ are positive constants.
Then computing first and second derivatives one coordinate at a time gives
$$g'(t) = (-r\omega\sin\omega t, r\omega\cos\omega t, c) \quad \text{and} \quad g''(t) = (-r\omega^2\cos\omega t, -r\omega^2\sin\omega t, 0).$$
Suppose $\mathbb{R} \xrightarrow{g} \mathbb{R}^3$ describes a path in $\mathbb{R}^3$ with velocity vector $\mathbf{v}(t) = g'(t)$. If
we assume that $g'$ itself has a derivative, we define the acceleration vector at $g(t)$
by $\mathbf{a}(t) = g''(t)$. If $\mathbf{x}(t)$ is used to denote the image points of some curve, then along
with $\dot{\mathbf{x}}(t)$ for velocity vectors we may denote acceleration vectors by $\ddot{\mathbf{x}}(t)$.
The physical significance of acceleration $\mathbf{a}(t)$ is that if $\mathbf{x}(t)$ describes the motion
of a particle of constant mass $m$, then $\mathbf{F}(t) = m\mathbf{a}(t)$ is by definition the force vector
acting on the particle. If we denote by $a(t)$ the length of $\mathbf{a}(t)$, then $a(t)$ is called
the magnitude of the acceleration, and $ma(t)$ is called the magnitude of the force
acting on the particle. We detect the presence of acceleration that isn't parallel to
the velocity at $\mathbf{x}(t)$ by observing a bending of the path of motion away from the
straight line through the tangent vector $\dot{\mathbf{x}}(t)$ and toward the direction of $\ddot{\mathbf{x}}(t)$. Look
at Figure 4.6(b) and think of the sideways pull that you feel when going around a
tight curve at high speed. The previous example illustrates the basic idea; there the
acceleration points in a direction perpendicular to the vertical axis of the helix since
the third coordinate of $g''(t)$ is always zero. Section 1F provides another illustration,
and there is a more general treatment in Chapter 8, Section 3.

EXAMPLE 12  If the vector $\mathbf{x}(t) = (r\cos\omega t, r\sin\omega t, ct)$ gives the position at time $t$ of a particle
of mass $m$ in $\mathbb{R}^3$, then the velocity and acceleration vectors, $\mathbf{v}(t) = g'(t)$ and $\mathbf{a}(t) =
g''(t)$, are as computed in Example 11, namely
$$\dot{\mathbf{x}}(t) = (-r\omega\sin\omega t, r\omega\cos\omega t, c) \quad \text{and} \quad \ddot{\mathbf{x}}(t) = (-r\omega^2\cos\omega t, -r\omega^2\sin\omega t, 0).$$
Typical velocity and acceleration vectors are shown in Figure 4.6(b), located
appropriately with tails at $\mathbf{x}(t)$. The lengths of these vectors just happen to be constant as
functions of time $t$, depending only on the constants $r$, $\omega$ and $c$ as follows:
$$|\mathbf{v}(t)| = \sqrt{(r\omega\sin\omega t)^2 + (r\omega\cos\omega t)^2 + c^2} = \sqrt{r^2\omega^2 + c^2}$$
$$|\mathbf{a}(t)| = \sqrt{(r\omega^2\cos\omega t)^2 + (r\omega^2\sin\omega t)^2} = r\omega^2.$$
Note that while the speed and velocity depend on $c$, the acceleration doesn't.
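
The constancy of speed and acceleration magnitude on the helix can be confirmed with a short computation. This is a sketch only; the numerical values of `r`, `omega`, and `c` and the sample times are assumptions picked for illustration, and the function names are hypothetical.

```python
import math

# Sample constants (assumed values, not from the text).
r, omega, c = 2.0, 3.0, 0.5


def velocity(t):
    # First derivative of (r cos(omega t), r sin(omega t), c t).
    return (-r * omega * math.sin(omega * t), r * omega * math.cos(omega * t), c)


def acceleration(t):
    # Second derivative; the third coordinate is always zero.
    return (-r * omega ** 2 * math.cos(omega * t),
            -r * omega ** 2 * math.sin(omega * t),
            0.0)


def length(v):
    return math.sqrt(sum(vi * vi for vi in v))


times = (0.0, 0.7, 2.0)
speeds = [length(velocity(t)) for t in times]
accels = [length(acceleration(t)) for t in times]
expected_speed = math.sqrt(r ** 2 * omega ** 2 + c ** 2)   # sqrt(r^2 w^2 + c^2)
expected_accel = r * omega ** 2                            # r w^2
print(speeds, expected_speed)
print(accels, expected_accel)
```

Changing `c` alters the printed speeds but leaves the acceleration magnitudes unchanged, in line with the remark that closes Example 12.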

1D Arc Length
For a point moving with constant speed $v$ along a curve, the distance covered between
time $t_0$ and time $t_1$ should turn out to be $v$ times $t_1 - t_0$. More generally, we
obtain the distance along a parametrized curve $\gamma$ between $t = t_0$ and $t = t_1$ by
defining it to be the integral of the speed $v(t)$ with respect to time to get arc length $l(\gamma)$:
$$l(\gamma) = \int_{t_0}^{t_1} v(t)\,dt.$$

EXAMPLE 13  If a circle of radius $a$ is parametrized in $\mathbb{R}^2$ by
$$g(t) = (a\cos\omega t, a\sin\omega t), \quad \omega > 0,$$
then
$$v(t) = |g'(t)| = |(-a\omega\sin\omega t, a\omega\cos\omega t)| = a\omega\sqrt{\sin^2\omega t + \cos^2\omega t} = a\omega.$$
Thus the distance covered between times $t_0$ and $t_1$ is
$$l = \int_{t_0}^{t_1} a\omega\,dt = a\omega(t_1 - t_0).$$
The constant $\omega$ is called the angular speed of the motion on the circle.

Image curves that appear to be the same may have different parametrizations that
yield different arc lengths. The following simple example shows that a little care is
needed to avoid producing inconsistent results from different parametrizations.

EXAMPLE 14  Let $g(t) = (\cos t, \sin t)$, for $0 \le t \le 2\pi$ and $h(u) = (\cos u, \sin u)$, for $0 \le u \le 4\pi$.
These functions have the same image, namely a circle of radius 1, but $h$ traces the
circle twice. Since $|g'(t)| = |h'(u)| = 1$,
$$\int_0^{2\pi} |g'(t)|\,dt = \int_0^{2\pi} dt = 2\pi, \quad \text{while} \quad \int_0^{4\pi} |h'(u)|\,du = \int_0^{4\pi} du = 4\pi.$$
Verification of some formal conditions that guarantee equal arc lengths for different
parametrizations is left as Exercise 44.
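
The two arc lengths just computed can also be estimated numerically. The following sketch assumes a midpoint-rule approximation to the integral of speed; the function name `arc_length` and the number of subintervals are choices made here for illustration, not anything prescribed by the text.

```python
import math


def arc_length(velocity, t0, t1, n=10_000):
    """Midpoint-rule estimate of the integral of |velocity(t)| from t0 to t1."""
    dt = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        v = velocity(t0 + (k + 0.5) * dt)
        total += math.sqrt(sum(vi * vi for vi in v)) * dt
    return total


def circle_velocity(t):
    # Derivative of (cos t, sin t); its length is 1 for every t.
    return (-math.sin(t), math.cos(t))


once = arc_length(circle_velocity, 0.0, 2 * math.pi)    # circle traced once
twice = arc_length(circle_velocity, 0.0, 4 * math.pi)   # circle traced twice
print(once, 2 * math.pi)
print(twice, 4 * math.pi)
```

The same image curve yields two different lengths because the second parameter interval covers the circle twice, which is the point of the example.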

EXERCISES

Find the derivatives $f'(t)$ and $f''(t)$ for each of the following functions 1 to 6 at the
indicated point. Then find a parametric representation for the tangent line at
each indicated point.
1. $f(t) = (1 + t^2, 1 + t^3)$, when $t = 2$
2. $f(t) = (t\cos t, t\sin t)$, when $t = \pi/2$
3. $f(t) =$ { :: 1 ), when $t = -1$
4. $f(t) = (t + t^2, t^2 + t^3, t^3 + t^4)$, when $t = -1$
5. $f(t) = (\cos t, \cos 2t, \cos 3t, \cos 4t)$ when $t = \pi/2$
6. $f(t) = t\mathbf{i} + t^2\mathbf{j} + t^3\mathbf{k}$ when $t = 1$
Sketch the curves defined parametrically by the following functions 7 to 12.
7. $f(t) = t(1, 2, 0) + (1, 1, 1)$, $-\infty < t < \infty$
8. $f(t) = (t, t^2, t^3)$, $0 \le t \le 1$
9. $f(t) = (2t, t)$, $-1 \le t \le 1$
10. $h(t) = t\mathbf{i} + t\mathbf{j} + t^2\mathbf{k}$, $-1 \le t \le 2$
11. $f(t) = (2t, |t|)$, $-1 \le t \le 2$
12. $f(t) = (\cos t, \sin t, t)$, $0 \le t \le 2\pi$
13. Suppose that temperature at a point $(x, y, z)$ in $\mathbb{R}^3$ is $T(x, y, z) = x^2 + y^2 + z^2$. A particle
moves so that at time $t$ its location is given by $(x, y, z) = (t, t^2, t^3)$. Find the temperature
at the point occupied by the particle at $t = \tfrac{1}{2}$. What is the rate of change of the
temperature at the particle when $t = \tfrac{1}{2}$?
14. Show that $d(\mathbf{x} \cdot \mathbf{x})/dt = 2\mathbf{x} \cdot \dot{\mathbf{x}}$ if $\dot{\mathbf{x}}$ exists.
15. Use the result of the previous exercise to show that if a curve is traced with constant
speed, then the velocity and acceleration vectors are always perpendicular.
16. A point has position at time $t$ given by $(t, t^2, 1 + t^2)$ for $0 \le t \le 1$. At time $t = 1$ the
point leaves this curve and flies off along the tangent line while maintaining the constant
speed attained at $t = 1$. Where is the point at $t = 2$?
17. If $g(t) = (e^t, t)$ for all real $t$, sketch in $\mathbb{R}^2$ the curve described by $g$ together with the
tangent vectors $g'(0)$ and $g'(1)$.
18. Let $f(t) = (t, t^2, t^3)$ for $0 \le t \le 1$.
(a) Sketch the curve described by $f$ in $\mathbb{R}^3$ and the tangent line at $(\tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{8})$.
(b) Find $|f'(t)|$.
19. If $f(t) = (t, t^2, t^3)$ for all real $t$, find all points of the curve described by $f$ at which
the tangent vector is parallel to the vector $(4, 4, 3)$. Are there points at which the
tangent is perpendicular to $(4, 4, 3)$?
20. Sketch the curve represented by $(x, y) = (t^3, t^5)$, and show that the parametrization
fails to assign a tangent vector at the origin. Find a parametrization of the curve that
does assign a tangent at the origin. Is the curve smooth?
21. Let $g(t) = (\sin 2t, 2\sin^2 t, 2\cos t)$ and show that the image curve lies on a sphere
centered at the origin in $\mathbb{R}^3$. Find the length of the velocity vector $\mathbf{v}(t)$ and show that
the projection of this vector into the $xy$-plane has a constant length.
22. Show that if $f$ is vector valued, differentiable, and never zero for $a < t < b$, then
(a) $f \cdot \dfrac{df}{dt} = |f|\dfrac{d|f|}{dt}$
(b) $|f|$ is constant if and only if $f \cdot f' = 0$
Sketch the following four parametrized curves between the given parameter values $t_0$
and $t_1$. Then find the speed $v(t)$ and calculate the arc length between $t_0$ and $t_1$.
23. $f(t) = (3t, 4t)$; $t_0 = 0$, $t_1 = 4$
24. $g(t) = (2\cos t, 2\sin t)$; $t_0 = 0$, $t_1 = \pi/2$
25. $h(t) = (t, 2t^{3/2})$; $t_0 = 0$, $t_1 = \tfrac{3}{4}$
26. $c(t) = (t, \cosh t)$, $0 \le t \le a$
In 27 and 28 a planet orbits a fixed star in a circular path of radius $a$ with constant
angular speed $\omega$. We parametrize the orbital motion by $\mathbf{x}(t) = (a\cos\omega t, a\sin\omega t)$. A moon
orbits the planet in a circular path in the plane of the planet's path and of radius
$b < a$ with constant angular speed $\delta$ relative to the planet. The relative masses of the
three bodies are such that we neglect the gravitational attraction between the moon and
the star.
27. Find a parametric representation for the path of the moon relative to the fixed star
assuming that the three bodies are in line at $t = 0$.
28. Find the speed $v(t)$ of the moon at time $t$. Under what conditions is $v(t)$ constant?
Differentiation Formula 1.5 listed at the end of Section 1A is proved in the text. Using
similar ideas, do 29 to 32.
29. Prove Formula 1.2 in the text.
30. Prove Formula 1.3 in the text.
31. Prove Formula 1.4 in the text.
32. Prove Formula 1.6, where $\mathbf{x}$ and $\mathbf{y}$ take values in $\mathbb{R}^3$ and are differentiable on an interval.
33. Show that if $\mathbb{R} \xrightarrow{g} \mathbb{R}^n$ has a derivative and $g'(t) = 0$ for $a < t < b$, then $g(t)$ is a
constant vector on that interval. [Hint: Consider the coordinate functions one at a time.]
In 34 and 35 let a differentiable function $g(t)$ represent the position in $\mathbb{R}^3$ at time $t$
of a particle of possibly varying mass $m(t)$. The vector function $\mathbf{P}(t) = m(t)\mathbf{v}(t)$ is
called the linear momentum of the particle. The force vector is $\mathbf{F}(t) = (m(t)\mathbf{v}(t))'$. The
angular momentum about the origin is $\mathbf{L}(t) = g(t) \times \mathbf{P}(t)$, and the torque about the origin
is $\mathbf{N}(t) = g(t) \times \mathbf{F}(t)$. Apply these ideas to the next two exercises.
34. Show that if $\mathbf{F}$ is identically zero, then $\mathbf{P}$ is constant. This is called the law of
conservation of linear momentum.
35. Show that $\mathbf{L}'(t) = \mathbf{N}(t)$, and hence that if $\mathbf{N}$ is identically zero, then $\mathbf{L}$ is constant.
This is called the law of conservation of angular momentum.
36. Show that if a particle has an acceleration vector $\mathbf{a}(t)$ at time $t$ and $\mathbf{v}(t) \ne 0$, then
$v' = \mathbf{t} \cdot \mathbf{a}$, where $\mathbf{t}$ is the unit vector $(1/v)\mathbf{v}$.
37. Show that if $a$, $b$ and $\omega$ are positive constants, then the parametrization
$\mathbf{x}(t) = (a\cos\omega t, b\sin\omega t)$ traces the same ellipse $x^2/a^2 + y^2/b^2 = 1$ in $\mathbb{R}^2$ regardless
of the size of $\omega$.
38. Find the velocity and acceleration vectors for $\mathbf{x}(t)$ in Exercise 37. Show that the
velocity and acceleration are never zero.
Sketch the following four curves for the indicated time intervals. Then add to your
sketch the velocity and acceleration vectors at the designated times.
39. $\mathbf{x}(t) = (t, t, t^2)$, $0 \le t \le 1$; $t = 0, \tfrac{1}{2}, 1$
40. $\mathbf{x}(t) = (2\cos t)\mathbf{i} + (\sin t)\mathbf{j}$, $0 \le t \le 2\pi$; $t = 0, \pi/2, \pi$
41. $\mathbf{x}(t) = (t, t^2, t^3)$, $0 \le t \le 1$; $t = 0, \tfrac{1}{2}, 1$
42. $\mathbf{x}(t) = (\cos t)\mathbf{i} + (\sin t)\mathbf{j} + t\mathbf{k}$, $0 \le t \le 2\pi$; $t = 0, \pi, 2\pi$
43. The normal lapse rate for temperature above the surface of the earth assumes a steady
drop in air temperature of 3°F per 1000 feet of increase in elevation. Under this
assumption, with ground temperature 32°F, and assuming negligible air resistance,
estimate the temperature at time $t$ at the height of a projectile fired straight up with an
initial speed of 300 feet per second. What is the minimum temperature attained?
*44. Parametrizations $\mathbf{x} = g(t)$, $a \le t \le b$ and $\mathbf{x} = h(u)$, $\alpha \le u \le \beta$ are called equivalent
if there is a continuously differentiable $\varphi$ with $\varphi' > 0$ from $[\alpha, \beta]$ onto $[a, b]$ such
that $g(\varphi(u)) = h(u)$. (Note that $\varphi' > 0$ implies $\varphi$ strictly increasing.)
(a) Use Equation 1.5 to show that if $g$ and $h$ are equivalent then $|h'(u)| = |g'(\varphi(u))|\varphi'(u)$.
(b) Use part (a) to change variable in the arc-length integral for $h$ and show that
equivalent parametrizations yield equal arc lengths.
(c) Show that $g(t) = (t, t)$ for $-1 \le t \le 1$ and $h(u) = (-\cos u, -\cos u)$ for $0 \le u \le 5\pi/2$
are not equivalent parametrizations of the line segment from $(-1, -1)$ to $(1, 1)$ in $\mathbb{R}^2$
by showing that they yield different arc lengths. This example shows that the condition
that the function $\varphi(u)$ be increasing can't be omitted from the definition of equivalence
if equal arc length is to be a consequence.

1E Computer Plotting of Space Curves


Short of making a wire model of a 3-dimensional curve, our best recourse for depict-
ing a curve is a perspective drawing. Standard textbooks have such drawings printed
on their pages. A computer screen, like a blackboard or a piece of paper, presents
us with a flat 2-dimensional surface on which we depict geometric objects. Software
designed to make perspective drawings of objects in 3-dimensional space is widely
available. The discussion presented here is independent of any particular software,
but it serves to indicate schematically the logical routine for drawing space curves
in 3-dimensional perspective. The following algorithm is typical:

DEFINE g1(t) = 2 sin(t)
DEFINE g2(t) = 3 cos(t)
DEFINE g3(t) = .4t
FOR t = 0 TO 4π STEP .01
PLOT3D (g1(t), g2(t), g3(t))
NEXT t

FIGURE 4.7

This algorithm plots points on an elliptical helix shown in Figure 4.7, and defined
by three equations of the form $x = g_1(t)$, $y = g_2(t)$, $z = g_3(t)$ with $a \le t \le b$.
In our particular example we get the picture shown, for which $g_1(t) = 2\sin t$,
$g_2(t) = 3\cos t$, $g_3(t) = 0.4t$, and $a = 0$, $b = 4\pi$. The viewing direction here
is along a line joining the point $(1, 1, 1)$ to the origin. The order of the sine and
cosine in the first two coordinate functions makes the helix turn clockwise instead of
counterclockwise as it winds up around the vertical axis. Note that the curve winds
around an elliptical cylinder rather than a circular one.
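
One way the schematic algorithm might be carried out is sketched below. Several things here are assumptions rather than the text's method: the sketch uses a simple orthographic projection along the viewing direction from $(1, 1, 1)$ toward the origin instead of a true perspective drawing, the screen basis vectors `RIGHT` and `UP` are one possible orthonormal choice perpendicular to that direction, and the function names are hypothetical. The output is a list of 2-dimensional screen points that any plotting tool could connect with line segments.

```python
import math


def g1(t): return 2 * math.sin(t)
def g2(t): return 3 * math.cos(t)
def g3(t): return 0.4 * t


# Orthonormal screen axes, both perpendicular to the viewing direction (1, 1, 1).
RIGHT = (-1 / math.sqrt(2), 1 / math.sqrt(2), 0.0)
UP = (-1 / math.sqrt(6), -1 / math.sqrt(6), 2 / math.sqrt(6))


def project(p):
    """Orthographic projection of a point of R^3 onto the screen plane."""
    return (sum(a * b for a, b in zip(p, RIGHT)),
            sum(a * b for a, b in zip(p, UP)))


def helix_points(t_end=4 * math.pi, step=0.01):
    # Mirrors the loop FOR t = 0 TO 4*pi STEP .01 in the algorithm above.
    n = int(t_end / step)
    return [project((g1(k * step), g2(k * step), g3(k * step)))
            for k in range(n + 1)]


points = helix_points()
print(len(points), points[0], points[-1])
```

Replacing `project` by a perspective map, or changing `RIGHT` and `UP`, changes only the view, not the curve being plotted.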
The decision to plot a picture by hand or by computer will usually favor the
computer if a fairly high degree of accuracy is needed for some reason or if the picture
is just too complicated to draw by hand. Otherwise, a quick pencil drawing may
convey the necessary information with less fuss. Some of the information you might
want to convey is that you understand the basic ideas of graphical representation, and
this may best be done, for example on an examination, with a careful pencil drawing.
For this reason it's a good idea not to become overly dependent on having computer
software do your thinking for you until you've become reasonably adept at doing it
for yourself. The assigned exercises will require a mixture of both approaches.

EXERCISES

Plot the following parametrically defined curves 1 to 4.
1. $g(t) = (t\cos t, t\sin t)$, $0 \le t \le 2\pi$
2. $f(t) = (t, \tfrac{1}{2}t^3, \tfrac{1}{2}t^4)$, $0 \le t \le 1$
3. $g(t) = (\sin 2t, 2\sin^2 t, 2\cos t)$, $0 \le t \le 2\pi$
4. $g(t) = (|t|, 2|t - 1|, 3|t + 1|)$, $-2 \le t \le 2$
5. Prove that the curve in Exercise 3 lies on a sphere centered at the origin.
Plot the image curves 6 to 9 subject to the given conditions.
6. $\mathbf{x} = (\cos t, \sin t, t^2)$, $0 \le z \le 3$
7. $\mathbf{x} = (2\cos t, 3\sin t, e^t)$, $1 \le z \le 2$
8. $\mathbf{x} = (t, t\cos t, t\sin t)$, $x^2 + y^2 + z^2 \le 1$
9. $\mathbf{x} = (t^2, t^3, t^4)$, $|y| \le 5$
10. Make computer plots of lines $\mathbf{x} = t\mathbf{a} + \mathbf{b}$ in $\mathbb{R}^3$ for $c \le t \le d$ for a variety of choices
of the vector and scalar parameters.

1F Vector Integration
We've seen in Section 1D that if you know the speed $|\dot{\mathbf{x}}(t)|$ of some point moving in
space, you integrate speed with respect to $t$ to find distance measured along the path
of motion from some chosen point. Finding the actual path of motion requires prior
knowledge of more than just the speed; for that we need to know the velocity vector
$\mathbf{v}(t) = \dot{\mathbf{x}}(t)$. Since we get from position $\mathbf{x}(t)$ to velocity $\dot{\mathbf{x}}(t)$ by vector differentiation
it follows that recovering position from velocity is done by vector integration. Given a
vector valued function $f(t)$ with $n$ real-valued coordinate functions $f_1(t), \ldots, f_n(t)$,
each integrable over some common interval, the indefinite vector integral of $f$ is
defined by
$$\int f(t)\,dt + \mathbf{c} = \left(\int f_1(t)\,dt + c_1, \int f_2(t)\,dt + c_2, \ldots, \int f_n(t)\,dt + c_n\right).$$
The vector constant of integration is $\mathbf{c} = (c_1, c_2, \ldots, c_n)$.

EXAMPLE 15  If $f(t) = (1, t^2)$, then
$$\int f(t)\,dt = \left(\int 1\,dt, \int t^2\,dt\right) = \left(t + c_1, \tfrac{1}{3}t^3 + c_2\right) = \left(t, \tfrac{1}{3}t^3\right) + \mathbf{c}.$$
We interpret the relationship between $f(t)$ and $F(t) = \int f(t)\,dt + \mathbf{c}$ as follows. For
whatever choice of $\mathbf{c}$, the tangent vector to the image curve of $F$ at $F(t)$ is the vector
$f(t)$, usually pictured with its tail at $F(t)$. Figure 4.8(a) shows two choices for $\mathbf{c}$.

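EXAMPLE 16  Suppose $\mathbf{a}$ and $\mathbf{b}$ are constant vectors in $\mathbb{R}^n$ and we want to find the position function
$\mathbf{x}(t)$ consistent with velocity $\dot{\mathbf{x}}(t) = t\mathbf{a} + \mathbf{b}$, as well as with the initially specified
position $\mathbf{x}(t_0)$. Since $\mathbf{x}(t)$ is an integral of $\dot{\mathbf{x}}(t)$, we have
$$\mathbf{x}(t) = \int \dot{\mathbf{x}}(t)\,dt = \int (t\mathbf{a} + \mathbf{b})\,dt = \tfrac{1}{2}t^2\mathbf{a} + t\mathbf{b} + \mathbf{c}.$$
To determine $\mathbf{c}$, we note that $\mathbf{x}(t_0) = \tfrac{1}{2}t_0^2\mathbf{a} + t_0\mathbf{b} + \mathbf{c}$, so $\mathbf{c} = \mathbf{x}(t_0) - \tfrac{1}{2}t_0^2\mathbf{a} - t_0\mathbf{b}$. Thus
$$\mathbf{x}(t) = \tfrac{1}{2}\left(t^2 - t_0^2\right)\mathbf{a} + (t - t_0)\mathbf{b} + \mathbf{x}(t_0).$$
As an alternative we use a definite integral:
$$\mathbf{x}(t) - \mathbf{x}(t_0) = \int_{t_0}^{t} \dot{\mathbf{x}}(u)\,du = \int_{t_0}^{t} (u\mathbf{a} + \mathbf{b})\,du = \left[\tfrac{1}{2}u^2\mathbf{a} + u\mathbf{b}\right]_{t_0}^{t} = \tfrac{1}{2}\left(t^2 - t_0^2\right)\mathbf{a} + (t - t_0)\mathbf{b}.$$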

FIGURE 4.8
(a), (b) $\mathbf{x}(t) = -\tfrac{1}{2}t^2\mathbf{g} + t\mathbf{v}_0$

EXAMPLE 17  Suppose we fire a projectile from ground level and are willing to ignore the effects
of air resistance on the flight of the projectile. (The retarding effect of air resistance
is taken into account in Chapter 12, Section 3.) Thus the only acceleration we need
to take into account after the initial release of the projectile is that of the vector
$-\mathbf{g} = (0, 0, -g)$, where $g$ is the magnitude of gravitational acceleration near our
location on earth. Denote by $\mathbf{x} = \mathbf{x}(t)$ the position of the projectile at time $t$ after
firing, so that the velocity vector is $\mathbf{v} = \dot{\mathbf{x}}(t)$ and the acceleration vector is $\mathbf{a} = \ddot{\mathbf{x}}(t)$.
Equating our two expressions for acceleration gives $\ddot{\mathbf{x}} = -\mathbf{g}$. Writing this equation is
the critical step in predicting the projectile's path. To solve the equation we integrate
both sides twice with respect to $t$ getting successively
$$\dot{\mathbf{x}} = \mathbf{v}(t) = -t\mathbf{g} + \mathbf{c}_1 \quad \text{and} \quad \mathbf{x} = -\tfrac{1}{2}t^2\mathbf{g} + t\mathbf{c}_1 + \mathbf{c}_2.$$
The vectors $\mathbf{c}_1$ and $\mathbf{c}_2$ are constants of integration determined by imposing initial
conditions at time $t = 0$. We place the origin at the firing point, so that $\mathbf{x}(0) = 0$. It
follows that $\mathbf{c}_2 = 0$. Denote the initial velocity vector by $\mathbf{v}_0$, so that $\dot{\mathbf{x}}(0) = \mathbf{v}(0) =
\mathbf{v}_0$. It follows that $\mathbf{c}_1 = \mathbf{v}_0$. Thus the solution to our problem is $\mathbf{x}(t) = -\tfrac{1}{2}t^2\mathbf{g} + t\mathbf{v}_0$.
If the initial velocity vector were directed parallel to the unit vector ( ½, with l, t)
speed vo, = ( j vo, jvo, jvo) · A sketch of the resulting trajectory
we would have vo
is in Figure 4.8(b), assuming g = 32 and vo = 8. Note that Figure 4.8(b) does not
show the relation between time t and position x on the projectile's trajectory.
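
The solution $\mathbf{x}(t) = -\tfrac12 t^2\mathbf{g} + t\mathbf{v}_0$ can be evaluated directly, as in the following Python sketch. The initial direction (0, 0.6, 0.8) is a hypothetical unit vector chosen for illustration and is not the one used for Figure 4.8(b); the function names are likewise assumptions.

```python
def projectile_position(t, v0, g=32.0):
    """x(t) = -(1/2) t^2 g + t v0, with gravity acting in the negative z-direction."""
    return (t * v0[0], t * v0[1], t * v0[2] - 0.5 * g * t * t)

def time_of_flight(v0, g=32.0):
    """Positive time at which the z-coordinate returns to zero."""
    return 2.0 * v0[2] / g

# Hypothetical initial velocity: speed 8 along the unit vector (0, 0.6, 0.8).
speed = 8.0
direction = (0.0, 0.6, 0.8)
v0 = tuple(speed * d for d in direction)

t_land = time_of_flight(v0)
landing = projectile_position(t_land, v0)
```

Setting the third coordinate of $\mathbf{x}(t)$ equal to zero gives the flight time $2v_{0z}/g$, where $v_{0z}$ (a name introduced here) is the vertical component of $\mathbf{v}_0$.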
Section 1F Functions of One Variable 187

EXERCISES

In Exercises 1 to 6, compute the indefinite integrals $F(t) = \int f(t)\,dt + c$; then determine the constant of
integration $c$ so that the associated condition is satisfied.

1. $f(t) = (t^2 + 1,\ t^3 - 1)$; $F(1) = (2, 2)$
2. $f(t) = (t,\ t^2,\ t^3)$; $F(0) = (1, 2, 1)$
3. $f(t) = (t\cos t,\ t\sin t)$; $F(0) = (1, 1)$
4. $f(t) = (1/(t^2 + 1),\ t/(t^2 + 1))$; $F(0) = (0, 1)$
5. $f(t) = (1,\ t^2,\ -1,\ t^2)$; $F(1) = (2, 2, 2, 2)$
6. $f(t) = ta - t^2 b$; $F(t_0) = x_0$

In Exercises 7 to 14, given $\dot{x}(t)$ or $\ddot{x}(t)$, find the $x(t)$ that satisfies the initial conditions.

7. $\dot{x}(t) = (t,\ -t^2)$; $x(0) = (2, 1)$
8. $\dot{x}(t) = t(1, -1)$; $x(1) = (1, 1)$
9. $\dot{x}(t) = (\cos t,\ \sin 2t)$; $x(\pi/2) = (-1, 1)$
10. $\dot{x}(t) = (e^t,\ t)$; $x(0) = (e, 1)$
11. $\dot{x}(t) = (t,\ t,\ t^2)$; $x(1) = (1, -1, 1)$
12. $\dot{x}(t) = t(1, 1, t)$; $x(0) = (2, 1, 2)$
13. $\ddot{x}(t) = (t,\ -t^2)$; $x(0) = (2, 1)$, $\dot{x}(0) = (1, 1)$
14. $\ddot{x}(t) = (t,\ t^2,\ e^{-t})$; $x(1) = (1, 0, 0)$, $\dot{x}(1) = (0, 1, 0)$

15. Suppose you want to kick a ball over an $h$-foot vertical fence $a$ feet away from you in such a way that it just
barely gets over the top of the fence and lands on the ground $b$ feet from the fence on the other side. Assuming
air resistance neglected, what should be the initial angle of elevation $\theta$ of your kick, and what should its initial
speed be?

16. A target is suspended over level ground at height $h_0$, to be released to fall earthward under constant vertical
acceleration $-g$. Simultaneously with the release of the target, a gun aimed directly at the suspended target is fired
from ground level at a horizontal distance $l$ from the point directly below the target. Assume that the speed of bullet
and target are not reduced by air resistance.
(a) Show that the bullet's trajectory will intersect the vertical path of the target only if $2v_0^2 h_0 \ge g(l^2 + h_0^2)$.
(b) Show that the bullet will hit the target if the condition in part (a) is met.
(c) One feature of the conclusion in part (b) is that it happens independently of the size of $v_0$ as long as
it satisfies the condition in part (a). However the distance $d_1$ that the target has fallen when it is hit
does depend on $v_0$. Find $d_1$ assuming $d_1 > h_0$.

17. Superman, while standing atop a 200-foot-high building, sees a scoundrel drop a victim out a window 50 feet
across the street from his building and 100 feet above the pavement below. Reacting instantly, Superman gives
himself a mighty push in just the right direction to plunge under the influence of gravity and effect a dramatic rescue
just before the victim hits the pavement. Neglecting air resistance, estimate Superman's initial velocity vector and
speed. Also estimate Superman's and the victim's speeds at the time of rescue.

18. Someone wants to kick a ball on level ground so it falls back to earth $a$ feet away.
(a) Show that we can do this with infinitely many different initial angles of elevation as long as the
initial speed $v_0$ at which the ball is kicked is at least $\sqrt{ag}$.
(b) Suppose in addition that the ball is to be lobbed over a vertical fence of height $h$ halfway between
the initial and terminal points on the ground. Show that barely clearing the fence requires initial angle
of elevation $\theta = \arctan(4h/a)$ and initial speed $v_0 = \sqrt{g(a^2 + 16h^2)/(8h)}$.

*19. Suppose you want to stand at distance $a$ from the base of a vertical building wall of height $h$ and then kick a ball in
such a way that it lands at distance $b$ back from the edge on the building's flat roof, having just grazed the edge
of the roof as it went by. Show that the initial angle of elevation of your kick is $\theta = \arctan(h/a + h/(a+b))$ and
its initial speed is $v_0 = \sqrt{ga(a + b)/(2h\cos^2\theta)}$. [Hint: Find a parabola containing three crucial points.]

*20. A projectile is fired up from the surface of the earth with initial velocity $(u_0, v_0)$. Under the influence of constant
vertical acceleration $-g$ the projectile reaches height $h_{\max}$ and then falls back to earth. Neglecting air resistance,
show that the fraction of time during its trajectory that the projectile spends above height $h_1$ is $|v_1|/v_0$, where
$(u_1, v_1)$ is the projectile's velocity vector at height $h_1$.

21. Big Bertha. In World War I, Paris was bombarded by guns from the unprecedented distance of 75 miles away, shells
taking 186 seconds to complete their trajectories. Estimate the angle of elevation at which the gun was fired and the
maximum height of the trajectory, assuming negligible air resistance. During a substantial part of the trajectory the
altitude was high enough that air resistance was negligible there.
*22. Fox and Rabbit. Suppose that a rabbit runs with constant speed $v > 0$ on a circular path of radius $a$, and that a fox,
also running with constant speed $v$, pursues the rabbit by starting at the center of the circle, always maintaining a
position on the radius from the center to the rabbit. Show that it takes the fox time $t = \pi a/(2v)$ to catch the rabbit
and that the fox's path is a semicircle.

23. Suppose that in $\mathbb{R}^3$, constant masses of size $m_1, m_2, \ldots, m_n$ are concentrated at the respective points
$x_1, x_2, \ldots, x_n$. The center of mass of the system is defined to be the point

\[ c = \frac{m_1 x_1 + \cdots + m_n x_n}{m_1 + \cdots + m_n}. \]

The momentum of the system is defined to be the vector

\[ p = \frac{d}{dt}(m_1 x_1 + \cdots + m_n x_n) = m_1 \frac{dx_1}{dt} + \cdots + m_n \frac{dx_n}{dt}. \]

Thus the momentum of the system is the velocity vector of the center of mass multiplied by the sum of the masses.
Show that if the momentum of such a system is a constant $p_0$, then the center of mass either remains fixed or moves
with constant speed along a fixed line parallel to $p_0$.

24. Consider the vector differential equation $\ddot{x} + a\dot{x} + bx = 0$ to be solved for vector functions $x = g(t)$.
We assume $a$ and $b$ are scalar constants.
(a) Suppose the scalar equation $r^2 + ar + b = 0$ has roots $r_1$ and $r_2$. Show by substitution that
$x(t) = e^{r_1 t}c_1 + e^{r_2 t}c_2$ satisfies the vector differential equation for fixed arbitrary choices for the
vectors $c_1$ and $c_2$ and for all $t$.
(b) If the roots $r_1$ and $r_2$ of part (a) happen to be equal, the two terms in $x(t)$ collapse into a single
term with arbitrary coefficient $c_1 + c_2$. Show that in that case additional solutions are given by
$x(t) = e^{r_1 t}c_1 + te^{r_1 t}c_2$.

25. Let $f: \mathbb{R} \to \mathbb{R}^n$ be a function defined for $a \le t \le b$. If the coordinate functions $f_1, \ldots, f_n$ of $f$ are
integrable, we define the integral of $f$ over the interval $[a, b]$ by

\[ \int_a^b f(t)\,dt = \Bigl(\int_a^b f_1(t)\,dt,\ \ldots,\ \int_a^b f_n(t)\,dt\Bigr). \]

(a) If $f(t) = (\cos t, \sin t)$ for $0 \le t \le \pi/2$, compute $\int_0^{\pi/2} f(t)\,dt$.
(b) If $g(t) = (t, t^2, t^3)$ for $0 \le t \le 1$, compute $\int_0^1 g(t)\,dt$.

26. If $f: \mathbb{R} \to \mathbb{R}^n$ and $g: \mathbb{R} \to \mathbb{R}^n$ are both integrable over $[a, b]$, show by using the corresponding
properties of integrals of real-valued functions that

\[ \int_a^b k f(t)\,dt = k \int_a^b f(t)\,dt, \quad k \text{ a real number}, \]

\[ \int_a^b \bigl(f(t) + g(t)\bigr)\,dt = \int_a^b f(t)\,dt + \int_a^b g(t)\,dt, \]

where the integrals are defined as in the previous exercise.

27. If $f: \mathbb{R} \to \mathbb{R}^n$ is defined for $a \le t \le b$, and $f'$ is continuous there, prove the following extension of the
fundamental theorem of calculus:

\[ \int_a^b f'(t)\,dt = f(b) - f(a). \]

28. Suppose $x = x(t)$ has two continuous derivatives on an interval, and that $\ddot{x}(t) = r\dot{x}(t)$ for some scalar constant
$r \ne 0$, so that the acceleration vector is parallel to the velocity vector. The purpose of this exercise is to show
that the motion of $x(t)$ is confined to a line.
(a) Verify that the equation $\ddot{x}(t) = r\dot{x}(t)$ is equivalent to
\[ \frac{d}{dt}\bigl(e^{-rt}\dot{x}(t)\bigr) = 0. \]
(b) Show that part (a) implies $\dot{x}(t) = e^{rt}c$ for some constant vector $c$.
(c) Show that part (b) implies that $x(t) = (1/r)e^{rt}c + d$ for constant vectors $c$ and $d$, and hence that $x(t)$
stays on a line.

*29. This exercise generalizes the previous one. Suppose $x = x(t)$ has two continuous derivatives on an interval
$a \le t \le b$ and that $\ddot{x}(t) = g(t)\dot{x}(t)$ for some continuous real-valued function $g(t)$. Thus the acceleration vector, if
not zero, is parallel to the velocity vector. The purpose of this exercise is to show that the motion of $x(t)$ is confined
to a line.
(a) Verify that the equation $\ddot{x}(t) = g(t)\dot{x}(t)$ is equivalent to
\[ \frac{d}{dt}\bigl(e^{-h(t)}\dot{x}(t)\bigr) = 0, \quad\text{where } h(t) = \int_a^t g(u)\,du. \]
(b) Show that part (a) implies $\dot{x}(t) = e^{h(t)}c$ for some constant vector $c$.
(c) Show that part (b) implies that $x(t) = H(t)c + d$ for constant vectors $c$ and $d$, where $H'(t) = e^{h(t)}$.
Hence show that $x(t)$ stays on a line.
Section 2A Several Independent Variables 189

SECTION 2 SEVERAL INDEPENDENT VARIABLES


2A Graph of a Function
The graph of a function $f$ is the set of all ordered pairs $(x, f(x))$, where $x$ is in the
domain of $f$. The graph of $f$ is then said to be represented explicitly by $f$.

EXAMPLE 1 The graph of the real-valued function $f(x) = x^2 - 1$ of a real variable is
the parabola shown in Figure 4.9(a).

Apart from real-valued functions of a real variable, the functions we picture most
effectively by their graphs are the functions $f: \mathbb{R}^2 \to \mathbb{R}$ with graphs in $\mathbb{R}^3$ consisting
of the points

\[ (x, y, z) = (x, y, f(x, y)), \]

where $(x, y)$ is in the domain of $f$. A typical graph of such a function is in
Figure 4.9(b). We often describe the relation between $(x, y)$ and $z$ by writing $z = f(x, y)$.

FIGURE 4.9 (a) $y = x^2 - 1$ (b) $z = f(x, y)$

FIGURE 4.10 (a) (b)

EXAMPLE 2 Here's how to sketch the part of the graph of

\[ f(x, y) = 1 - x - y^2 \]

for which $x \ge 0$, $y \ge 0$, and $1 - x - y^2 = z \ge 0$. First observe that the domain $D$ of
the function that we are interested in has been restricted to the part of the xy-plane
in the first quadrant for which $1 - x - y^2 \ge 0$, or $x \le 1 - y^2$. This domain appears in
Figure 4.10(a), and again in Figure 4.10(b) under the graph of $f$. To sketch the graph
of $f$ itself, it helps to notice that cross sections of the graph obtained by holding
$y = y_0$ fixed and letting $x$ vary are lines whose projections onto the xz-plane satisfy
$z = 1 - x - y_0^2$. Each of these lines joins a point in the yz-plane, where $x = 0$ and
$z = 1 - y_0^2$, to one in the xy-plane, where $z = 0$ and $1 - x - y_0^2 = 0$. Such lines
are in Figure 4.10(b). We could also include cross sections of the graph of $f$ taken
parallel to the yz-plane; such curves are parabolic in shape, with projections onto
the yz-plane satisfying $z = 1 - x_0 - y^2$ for values of $x_0$ between 0 and 1.

EXAMPLE 3 The graph of $f(x, y) = x^2 + y^2$ has the property that $f$ is constant on each circle
of a given radius in the xy-plane and centered at the origin. In other words, cross
sections of the graph taken with planes parallel to the xy-plane are circles, shown
in Figure 4.11(a). All these circles pass through the parabola in the yz-plane with
equation $z = y^2$, because $z = f(0, y) = y^2$.

EXAMPLE 4 A function $f: \mathbb{R}^2 \to \mathbb{R}$ is given by

\[ f(x, y) = -2x - y + 2. \]

Setting $z = f(x, y)$, we get

\[ z = -2x - y + 2 \quad\text{or}\quad 2x + y + z = 2; \]

we see that the graph of $f$ is a plane in $\mathbb{R}^3$. To sketch it, we take cross sections parallel
to the yz-plane, which project into that plane as lines with equations $y + z = 2 - 2x_0$.
Or we may also take cross sections parallel to the xz-plane. Both are shown in
Figure 4.11(b) for $x \ge 0$, $y \ge 0$, $z \ge 0$.
A more direct way to sketch the plane is to locate three points on it by setting, for
example, $(x, y) = (0, 0)$, $(1, 0)$, and $(0, 1)$. The corresponding points on the graph
are $(x, y, z) = (0, 0, 2)$, $(1, 0, 0)$, and $(0, 1, 1)$, shown as dots in Figure 4.11(b).
Joining these dots by lines in this plane gives some idea of the position of the plane.
Alternatively, we find the points where the plane intersects the axes by setting two
of the coordinates equal to zero and solving for the third; doing this we find $(1, 0, 0)$,
$(0, 2, 0)$ and $(0, 0, 2)$.
Note that we may also write the plane's equation as

\[ 2x + y + (z - 2) = (2, 1, 1) \cdot (x, y, z - 2) = 0. \]

This equation shows that our plane is realized as all points $(x, y, z)$ such that the line
joining $(x, y, z)$ to $(0, 0, 2)$ is perpendicular to the vector $(2, 1, 1)$, a normal vector
to the plane.
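
The claims in Example 4 are easy to confirm by direct computation. The Python sketch below recomputes the three sample points, checks perpendicularity to the normal vector with a dot product, and finds the axis intercepts; the helper names are assumptions introduced for illustration.

```python
def dot(u, v):
    """Dot product of two vectors given as tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))

def f(x, y):
    """The function of Example 4."""
    return -2 * x - y + 2

normal = (2, 1, 1)
base = (0, 0, 2)

# Points on the graph above (0, 0), (1, 0), and (0, 1).
graph_points = [(x, y, f(x, y)) for (x, y) in [(0, 0), (1, 0), (0, 1)]]

# The displacement from (0, 0, 2) to any point of the plane is perpendicular to the normal.
dots = [dot(normal, tuple(p - q for p, q in zip(pt, base))) for pt in graph_points]

# Axis intercepts of 2x + y + z = 2: set two coordinates to zero and solve for the third.
intercepts = [(2 / normal[0], 0, 0), (0, 2 / normal[1], 0), (0, 0, 2 / normal[2])]
```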
FIGURE 4.11 (a) (b)

2B Level Sets

Drawing the graph of a function $\mathbb{R}^3 \to \mathbb{R}$ is impossible because the graph is a
subset of $\mathbb{R}^4$. Even in $\mathbb{R}^3$ a graph may be too complicated to draw easily. But in
both cases we may settle for trying to draw the sets on which $f$ is a constant, thereby
getting some picture of the behavior of the function. If $f$ is a real-valued function,
and $k$ is a point in the image of $f$, the level set of $f$ at level $k$ is the set of all
$x$ in the domain of $f$ such that $f(x) = k$. Level sets of $f$ are defined implicitly
by $f$, that is, by regarding them as solution sets of $f(x) = k$, for some $k$. The
implicitly defined level set associated with $f(x) = k$ is sometimes called the graph
of the equation $f(x) = k$, whereas the graph of the function $f(x)$ always refers to
the equation $y = f(x)$.

FIGURE 4.12 (a) (b)

Topographical maps display terrain elevations by showing level curves at equally
spaced levels as in Figure 4.12. Such displays have the advantage over perspective
drawings that foreground features don't obscure what lies in back of them. See
Figure 4.12(a), which shows the terrain levels, and Figure 4.12(b), which shows the
corresponding level curves.
EXAMPLE 5 The function $f(x, y) = x^2 + y^2$ of Example 3 has concentric circles for level sets.
At level $k = 1$ we get $f(x, y) = x^2 + y^2 = 1$, which represents a circle of radius
1 about $(0, 0)$ in the xy-plane. In general, at a level $k > 0$ we get a level curve
$x^2 + y^2 = k$, which is a circle of radius $\sqrt{k}$. See Figure 4.13(a), where the values of
$\sqrt{k}$ are nearly equally spaced. As the surface rises more steeply the level lines will
get closer together. If we don't label the level curves with numerical level value $k$,
we can't tell from level curves alone whether the surface is rising or falling as we
go out from the center.

EXAMPLE 6 The function $f: \mathbb{R}^3 \to \mathbb{R}$ defined by $f(x, y, z) = x^2 + y^2 + z^2$ has level sets in $\mathbb{R}^3$
consisting of points $(x, y, z)$ that satisfy an equation of the form

\[ x^2 + y^2 + z^2 = k \]

for some fixed real number $k$. If $k > 0$, we get a sphere of radius $\sqrt{k}$ centered at
$(0, 0, 0)$, because $x^2 + y^2 + z^2$ is the square of the distance from $(x, y, z)$ to $(0, 0, 0)$. If
$k = 0$, the equation is satisfied only by $(0, 0, 0)$. If $k < 0$, the corresponding level
set is empty. Some level sets are shown in Figure 4.13(b) as concentric spheres. The
graph of $f$ is a subset of $\mathbb{R}^4$ and can't be pictured.

FIGURE 4.13 (a) (b)

EXAMPLE 7 The linear function $g: \mathbb{R}^3 \to \mathbb{R}$ defined by

\[ g(x, y, z) = x + y + z \]

has a graph in $\mathbb{R}^4$, so we can't draw it. The level sets of $g$ are the parallel planes
with equations

\[ x + y + z = k, \]

one for each real number $k$. Three of the planes are shown in Figure 4.14. Note that
each plane is perpendicular to the vector $(1, 1, 1)$, because the equation also takes a
form showing $(1, 1, 1)$ and $(x, y, z) - (0, 0, k)$ perpendicular:

\[ (1, 1, 1) \cdot (x, y, z - k) = 0. \]

FIGURE 4.14

Note that the graph of $f: \mathbb{R}^2 \to \mathbb{R}$ is the set of points $(x, y, z)$ in $\mathbb{R}^3$ such that
$z = f(x, y)$, and that this set is the same as the level set at level $k = 0$ of the
function $g: \mathbb{R}^3 \to \mathbb{R}$ given by $g(x, y, z) = z - f(x, y)$. Whichever point of view
we take, we get the same picture.

EXERCISES

1. Consider the function $f(x, y) = \sqrt{4 - x^2 - y^2}$.
(a) Sketch the domain of $f$, making it as large as possible.
(b) Sketch the graph of $f$.
(c) Sketch the image of $f$.

2. Consider the function $g(x, y) = \ln(x + y)$.
(a) Describe the domain of $g$, making it as large as possible.
(b) For what values of $(x, y)$ does the graph of $g$ lie above the xy-plane?
(c) Describe the image of $g$.

Sketch the graphs of the following functions 3 to 8.

3. $f(x, y) = 2 - x^2 - y^2$
4. $h(x, y) = \dfrac{1}{x^2 + y^2}$
5. $g(x, y) = \sin x$
6. $f(x, y) = 0$
7. $f(x, y) = e^{x+y}$
8. $g(x, y) = \begin{cases} 1 & \text{if } |x| < |y| \\ 0 & \text{if } |x| \ge |y| \end{cases}$

Sketch the following implicitly defined level sets 9 to 14 in $\mathbb{R}^2$ or $\mathbb{R}^3$.

9. $f(x, y) = x + y = 1$
10. $g(x, y) = x^2 + 2y^2 = 1$
11. $f(x, y) = (x^2 + y^2 + 1)^2 - 4x^2 = 0$
12. $f(x, y, z) = x + y + z = 1$
13. $f(x, y, z) = xyz = 0$
14. $f(x, y, z) = x^2 - y^2 = 2$

In Exercises 15 to 18, sketch the level sets of each of the functions $f: \mathbb{R}^3 \to \mathbb{R}$ for the indicated levels $k$.

15. $f(x, y, z) = x + y$, $k = 0, 1, 2$
16. $f(x, y, z) = x^2 + y^2 - z^2$, $k = 0, 1$
17. $f(x, y, z) = \sqrt{x^2 + y^2 + z^2}$, $k = 0, 1$
18. $f(x, y, z) = x + y + z$, $k = 0, 1$

In Exercises 19 to 22, we consider a function $f: \mathbb{R}^3 \to \mathbb{R}^2$ with coordinate functions $f_1$, $f_2$ determined
by $f(x, y, z) = (f_1(x, y, z), f_2(x, y, z))$. For each vector $k = (k_1, k_2)$ in the image of $f$, the equation
$f(x, y, z) = k$ determines a level set in $\mathbb{R}^3$ that is the intersection of the level sets determined by the pair of
equations

\[ f_1(x, y, z) = k_1, \qquad f_2(x, y, z) = k_2. \]

Using this point of view, sketch the following level sets (curves) in $\mathbb{R}^3$.
20. ( ~ ~~ ) =( i)

21. $(x^2 + y^2 + z^2,\ x - z)$ = (~)

22. $(x^2 + y^2 + z^2,\ y - z)$ = (~)

23. Suppose that the density per unit area of a thin film, referred to in (x, y)-coordinates, is given by the formula
$d(x, y) = x^2 + 2y^2 - x + 1$ for $-1 \le x \le 1$ and $-1 \le y \le 1$. Sketch the set of points at which the film
has density ¾.

24. Let the density per unit of volume in a cubical box of side length 2 vary directly as the distance from the center and
inversely as $1 + t^2$, where $t$ is time. If the box is described in $\mathbb{R}^3$ by $|x| \le 1$, $|y| \le 1$, $|z| \le 1$, and if the
density at a corner of the box is 1 when $t = 0$, find a formula for the density at a given point and time. What is the
rate of change of the density at a point ½ unit from the center of the box at time $t = 1$?

25. Suppose the region $D$ in $\mathbb{R}^3$ consists of all points $(x, y, z)$ satisfying both $x^2 + y^2 \le 4$ and $0 \le z \le 5$. Suppose
the temperature at a point $(x, y, z)$ in $D$ is $T(x, y, z) = x^2 + y^2 - z$.
(a) Sketch the region $D$.
(b) Sketch the set of points in $D$ for which the temperature is $-1$ degree.

2C Computer-Generated Graphs
Some graphs of functions $f(x, y)$ are fairly easy to draw by hand. For example,
the graph of $z = \sqrt{1 - x^2 - y^2}$ is a hemisphere of radius 1 over the domain
$x^2 + y^2 \le 1$. A few examples that have been done by a computer are shown in
Figure 4.15. But whether sketching is done by hand or computer, the technique
illustrated here is fundamentally the same in that it consists of drawing curves on
the function's graph that are traced by holding one variable fixed and varying the
other.
To describe this technique another way, sketching the graph of $z = f(x, y)$ is
possible by plotting some carefully chosen curves that lie on the surface, as in
Figure 4.15. The simplest curves to draw are often the ones that are images under
$f$ of line segments in the domain of $f$ that are parallel to the x and y axes, as
in Figure 4.15(b); this approach allows us to use either function values $f(x, y_0)$,
with $y_0$ fixed and $x$ varying, or else $f(x_0, y)$ with $x_0$ fixed and $y$ varying. Thus a
rectangular domain for $f$ such as $0 \le x \le 2$, $\pi/4 \le y \le 3\pi/2$ for $f(x, y) = x\cos y$
might be treated using the following routine. This "program" is not intended to run
in a particular language, but is presented only as a compact way of indicating the
rough structure of such a program.

DEFINE f(x, y) = x * cos(y)

[First plot y-varying curves, spaced by x = 1/4.]
FOR x = 0 TO 2 STEP 0.25
    FOR y = π/4 TO 3π/2 STEP 0.01
        PLOT3D (x, y, f(x, y))
    NEXT y
NEXT x
[Then plot x-varying curves, spaced by y = π/4.]
FOR y = π/4 TO 3π/2 STEP π/4
    FOR x = 0 TO 2 STEP 0.01
        PLOT3D (x, y, f(x, y))
    NEXT x
NEXT y

Following this routine produces the picture shown in Figure 4.16. These drawings
are in a style that can in principle be drawn by hand, drawing one curve on the
graph at a time, with only one of the variables actually varying. Applications such as
Maple, Matlab, and Mathematica make drawings such as this with additional sophistication.
The Web site http://math.dartmouth.edu/~rewn/ also provides some Java
programs in the style of the graphical techniques we use here.

FIGURE 4.15 (a) $z = x^2 - y^2$ (b) $z = xy$ (c) $z = \sqrt{x^2 + y^2}$ (d) $z = \sin(x^2 + y^2)/(x^2 + y^2)$
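
For readers who want something executable, here is one possible Python rendering of the two-family curve routine for $f(x, y) = x\cos y$. It only computes the points on each curve and leaves the drawing to whatever plotting tool is at hand; the helper names frange and grid_curves are assumptions introduced for this sketch and are not part of the Java programs mentioned in the text.

```python
import math

def f(x, y):
    return x * math.cos(y)

def frange(start, stop, step):
    """Values start, start + step, ... that do not exceed stop."""
    n = int(math.floor((stop - start) / step + 1e-9))
    return [start + k * step for k in range(n + 1)]

def grid_curves(f, x_range, y_range, coarse, fine):
    """Return the y-varying and x-varying curve families as lists of (x, y, z) points."""
    (x0, x1), (y0, y1) = x_range, y_range
    (dx_coarse, dy_coarse), (dx_fine, dy_fine) = coarse, fine
    # One curve for each fixed x, traced by letting y vary in small steps.
    y_varying = [[(x, y, f(x, y)) for y in frange(y0, y1, dy_fine)]
                 for x in frange(x0, x1, dx_coarse)]
    # One curve for each fixed y, traced by letting x vary in small steps.
    x_varying = [[(x, y, f(x, y)) for x in frange(x0, x1, dx_fine)]
                 for y in frange(y0, y1, dy_coarse)]
    return y_varying, x_varying

y_curves, x_curves = grid_curves(
    f, (0.0, 2.0), (math.pi / 4, 3 * math.pi / 2),
    coarse=(0.25, math.pi / 4), fine=(0.01, 0.01))
```

Each inner list is one curve on the surface, so passing the lists one at a time to a 3D line-plotting routine should give a wireframe in the style of Figure 4.16.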
The Java programs are designed to ignore error-producing values such as square
roots of negative numbers or undefined function values that may arise from trying
to plot the graph of a function like $f(x, y) = \sqrt{1 - x^2 - y^2}$ over a rectangle that
contains the circular disk $x^2 + y^2 \le 1$. [Here for example, $f(1, 1) = \sqrt{-1}$.] The
natural plotting domain of the programs we use is a rectangle with edges parallel
to rectangular axes, but we may want to plot only over a domain with some other
shape such as a circular or triangular one. We do this easily using the Heaviside
unit step function defined by

\[ H(x) = \begin{cases} 1, & \text{if } x \ge 0, \\ 0, & \text{if } x < 0. \end{cases} \]

FIGURE 4.16 $z = x\cos y$

EXAMPLE 8 Suppose we want a picture of the graph of the function $f(x, y) = \sin(x^2 + y^2)/(x^2 + y^2)$
with its domain restricted to the part of the first quadrant inside the circle
$x^2 + y^2 \le 9$ of radius 3. Using the Heaviside function we define a new function of
two variables $h(x, y)$ by writing $h(x, y) = H(9 - x^2 - y^2)$. It follows that

\[ h(x, y) = \begin{cases} 1, & \text{if } x^2 + y^2 \le 9, \\ 0, & \text{if } x^2 + y^2 > 9. \end{cases} \]

Thus $h(x, y)$ takes the value 1 inside and on the circle of radius 3 centered at the
origin and the value 0 outside the circle. Then the product $h(x, y)f(x, y)$ will take
on the value 0 outside the circle $x^2 + y^2 = 9$ and will be equal to $f(x, y)$ inside
and on the circle. If we sketch the graph of $z = h(x, y)f(x, y)$ over the square
$0 \le x \le 3$, $0 \le y \le 3$ we get a picture like Figure 4.17(a). By suppressing the zero
values we get Figure 4.17(b).

FIGURE 4.17 $z = \sin(x^2 + y^2)/(x^2 + y^2)$, $0 \le x \le 3$, $0 \le y \le 3$. (a) (b)

In the previous example $f(x, y)$ has not been defined at $(x, y) = (0, 0)$ but
we've successfully avoided the issue. One way to do this is to define $f(0, 0) = 1$,
which incidentally will make $f$ continuous at $(0, 0)$. Another way is to incorporate
a feature in the plotting program that allows it to ignore points in the domain that
would normally produce an error message. This is what has been done with the Java
program GPLOT available at the Web site referred to previously.
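
The masking idea of Example 8 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are invented here, the value $f(0, 0) = 1$ is the first of the two options just described, and returning None for suppressed points is one simple stand-in for a plotter that skips them. It does not describe how GPLOT works internally.

```python
import math

def heaviside(x):
    """Unit step function: 1 for x >= 0, 0 for x < 0."""
    return 1 if x >= 0 else 0

def f(x, y):
    """sin(x^2 + y^2)/(x^2 + y^2), extended by the value 1 at the origin."""
    r2 = x * x + y * y
    return 1.0 if r2 == 0 else math.sin(r2) / r2

def masked_f(x, y):
    """The product h(x, y) f(x, y) with h(x, y) = H(9 - x^2 - y^2), as in Figure 4.17(a)."""
    return heaviside(9 - x * x - y * y) * f(x, y)

def plot_value(x, y):
    """Value to plot, or None where h = 0 so the point can be suppressed, as in Figure 4.17(b)."""
    return f(x, y) if heaviside(9 - x * x - y * y) else None
```

Regions cut out by several inequalities can be handled the same way by multiplying several step-function factors, one for each inequality.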
EXERCISES

1. Sketch the graph of $f(x, y) = x^2 - y^2$, for $|x| \le 2$, $|y| \le 2$.
2. Sketch the graph of $f(x, y) = x^2 - y^2$ for $0 \le x \le 2$, $0 \le y \le 2$.
3. Sketch the plane $z = 1 - x - y$ for $0 \le x \le 2$, $0 \le y \le 2$.
4. Sketch the plane $x + 2y + z = 2$ for $1 \le x \le 2$, $1 \le y \le 2$.
5. Sketch the graph of $f(x, y) = x^2 + y^3$ for $|x| \le 3$, $|y| \le 3$.
6. Sketch the graph of $f(x, y) = x^2 + y^2$ for $|x| \le 1$, $|y| \le 1$.
7. Sketch the graph of $f(x, y) = x + y$ for $0 \le x \le 1$, $0 \le y \le 2$.
8. Sketch the graph of $f(x, y) = y^2 - x^3$, $0 \le x \le 2$, $0 \le y \le 1$.
9. Sketch the graph of $f(x, y) = \cos x \sin y$, $0 \le x \le 2\pi$, $0 \le y \le 2\pi$.
10. Sketch the graph of $f(x, y) = \exp(-x - 2y)$, $0 \le x \le 2$, $0 \le y \le 2$.
11. Let $f(x, y) = xy(x^3 + y^3)/(x^2 + y^2)$. Sketch its graph for $-1 \le x \le 1$, $-1 \le y \le 1$. What is the difficulty at
$(x, y) = (0, 0)$?

Using modifications of the Heaviside function $H(x)$, form a product of functions that we'll refer to as $P(x, y)$,
assuming the value 1 on each region as described in 12 to 17 and assuming the value zero elsewhere in $\mathbb{R}^2$.

12. $P(x, y)$ takes the value 1 where $x \ge 0$, $y \ge 0$, and $y \le x$.
13. $P(x, y)$ takes the value 1 where $1 \ge x \ge 0$, $1 \ge y \ge 0$, and $y \ge x$.
14. $P(x, y)$ takes the value 1 where $x^2 + y^2 \le 1$.
15. $P(x, y)$ takes the value 1 where $x^2 + y^2 \le 1$ and $x \ge y$.
16. $P(x, y)$ takes the value 1 on the triangular region in $\mathbb{R}^2$ with vertices $(0, 0)$, $(0, 1)$ and $(1, 0)$.
17. $P(x, y)$ takes the value 1 on the square in $\mathbb{R}^2$ with vertices $(0, 0)$, $(0, 1)$, $(1, 0)$ and $(1, 1)$.

18. Sketch the graph of $f(x, y) = (x - y)^3$ for values of $(x, y)$ simultaneously satisfying $0 \le x \le 2$, $0 \le y \le 2$,
and $y \le x$.
19. Sketch the graph of $f(x, y) = y^2 - x^2$ for values of $(x, y)$ simultaneously satisfying $0 \le x \le 2$, $0 \le y \le 2$,
and $x \le y$.
20. Sketch the graph of $f(x, y) = \cos(x^2 + y^2)/(1 + x^2 + y^2)$, for $|x| \le 2$, $|y| \le 2$.
21. Sketch the graph of $f(x, y) = \cos(x^2 + y^2)/(1 + x^2 + y^2)$, for $x^2 + y^2 \le 2$.
22. Sketch the graph of $z = \sqrt{5 - x^2 - y^2}$ when $1 \le z \le 2$.
23. Plot the part of the plane $z - x - y = 1$ in $\mathbb{R}^3$ for $1 \le x \le 3$ and $1 \le y \le 2$.
*24. Sketch the part of the sphere of radius 1 centered at the origin that lies above the part of the first quadrant in
the xy-plane that lies between the y-axis and the line $y = x$.
*25. Sketch the part of the graph of $f(x, y) = 2 - x^2 - y^3$ that lies above the first and second quadrants in the
xy-plane.

2D Quadric Surfaces
Quadric surfaces are level sets in JR 3 of second-degree polynomials in three variables
x, y, z; they fall into six distinct types illustrated in Figure 4.18, plus some degenerate
cases in which the polynomial depends on only two variables. We'll be returning to
all of these surfaces later in Section 4, where we represent them parametrically in a
way similar to what we used to represent curves in space.
FIGURE 4.18
(1) Hyperboloid of one sheet: $x^2/a^2 + y^2/b^2 - z^2 = k > 0$
(2) Elliptic cone: $x^2/a^2 + y^2/b^2 - z^2 = 0$
(3) Hyperboloid of two sheets: $x^2/a^2 + y^2/b^2 - z^2 = k < 0$
(4) Ellipsoid
(5) Elliptic paraboloid: $x^2/a^2 + y^2/b^2 - z = 0$
(6) Hyperbolic paraboloid: $x^2/a^2 - y^2/b^2 - z = 0$

The elliptic cone in Figure 4.18(2) is a limit of hyperboloids in two ways: (i) As
the waist of the hyperboloid of one sheet pinches in while $k$ decreases to 0 through
positive values it tends to the cone. (ii) As the two separate pieces of the hyperboloid
of two sheets get closer while $k$ increases to 0 through negative values, the
two pieces of the hyperboloid become more pointed and come together to form
the cone.
As well as being level sets at level 0 of functions of three variables, the two
paraboloids are also graphs in $\mathbb{R}^3$ of the respective functions $(x/a)^2 \pm (y/b)^2$ defined
on $\mathbb{R}^2$. The surface of Example 5 is an elliptic paraboloid in which $a = b = 1$. The
spheres of Example 6 are a special case of the ellipsoid in which $a = b = c$.
The degenerate cases mentioned previously are cylinders, which may be level
sets of functions on $\mathbb{R}^3$ that really depend on only two variables, say $x$ and $y$. For
example, the equation $x^2 + y^2 = 1$ that determines a circle in $\mathbb{R}^2$ also determines
a circular cylinder in $\mathbb{R}^3$. Since the equation places no restriction on $z$, a level set
satisfying $x^2 + y^2 = k > 0$ in $\mathbb{R}^3$ contains all lines perpendicular to the xy-plane
and passing through the circle of radius $\sqrt{k}$.
To make highly accurate pictures of surfaces, in particular quadric surfaces, we
use computer graphics. On the other hand, rough sketches are often based on the
observation that well-known curves such as lines, parabolas, ellipses, and hyperbolas
lie on these surfaces and are useful guides in making a drawing.

EXAMPLE 9 The hyperboloid of one sheet defined by $x^2/a^2 + y^2/b^2 - z^2 = k$, with $k > 0$,
contains the hyperbola $x^2/a^2 - z^2 = k$ in the xz-plane. (Just set $y = 0$ to restrict
to the xz-plane.) The same hyperboloid contains the hyperbola $y^2/b^2 - z^2 = k$ in the yz-plane.
(Set $x = 0$ to see this.) Cross sections of the hyperboloid by planes $z = l$ parallel
to the xy-plane are the ellipses $x^2/a^2 + y^2/b^2 = k + l^2$, $z = l$, that lie above the
elliptical level sets $x^2/a^2 + y^2/b^2 = k + l^2$ in the xy-plane. These cross-sectional
curves, together with the two hyperbolas identified previously, form a framework
for the surface. Such a framework appears in the generic picture shown of the
hyperboloid of one sheet. If $a = b$, this surface arises by rotating one of the hyperbolas
about the z-axis. Interchanging $x$ and $z$, or $y$ and $z$, in the given equation
gives hyperboloids with the x-axis and y-axis respectively as principal axis of
symmetry.

EXAMPLE 10 The second picture in our catalogue is the cone $x^2/a^2 + y^2/b^2 = z^2$. This cone
contains the lines $z = \pm x/a$ in the xz-plane, where $y = 0$, and the lines $z = \pm y/b$
in the yz-plane, where $x = 0$. Cross sections of the cone by planes $z = l$ parallel to
the xy-plane are the ellipses $x^2/a^2 + y^2/b^2 = l^2$, $z = l$, that lie above the elliptical
level sets $x^2/a^2 + y^2/b^2 = l^2$ in the xy-plane. If $a = b$ we get a circular cone
generated by rotating one of the lines about the z-axis. Note that the upper half of
this circular cone is the graph of $z = (1/a)\sqrt{x^2 + y^2}$ and the lower half is the graph
of $z = (-1/a)\sqrt{x^2 + y^2}$.

EXAMPLE 11 The hyperboloid of two sheets $x^2/a^2 + y^2/b^2 - z^2 = k$, with $k < 0$, comes in
two pieces, namely the graphs of $z = \pm\sqrt{x^2/a^2 + y^2/b^2 - k}$. Since $k < 0$, the two
graphs contain the points $(0, 0, \pm\sqrt{-k})$, but there are no points of the graphs at
z-levels between these numbers. Cross sections at z-levels outside these numbers are
ellipses as in the previous two examples.
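
The case analysis for the level sets $x^2/a^2 + y^2/b^2 - z^2 = k$ can be summarized in a short Python sketch. The function names are assumptions made for illustration; the semi-axes follow from rewriting the cross section at height $z = l$ as $x^2/a^2 + y^2/b^2 = k + l^2$.

```python
import math

def classify(k):
    """Type of the level set x^2/a^2 + y^2/b^2 - z^2 = k, according to the sign of k."""
    if k > 0:
        return "hyperboloid of one sheet"
    if k < 0:
        return "hyperboloid of two sheets"
    return "elliptic cone"

def cross_section_semi_axes(a, b, k, l):
    """Semi-axes of the ellipse cut out by the plane z = l.

    Returns None when k + l^2 <= 0, that is, when the plane misses the
    surface or meets it in a single point.
    """
    s = k + l * l
    if s <= 0:
        return None
    return (a * math.sqrt(s), b * math.sqrt(s))
```

For instance, with $k = -4$ the plane $z = 1$ lies between the two sheets, so the second function returns None there.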

EXERCISES

1. Use the picture of the generic elliptic cone in Figure 4.18 as a guide in making sketches of the circular cones
(a) $x^2 + y^2 - z^2 = 0$. (b) $x^2 + z^2 - y^2 = 0$. (c) $y^2 + z^2 - x^2 = 0$.

2. Make sketches in $\mathbb{R}^3$ of (a) circular cylinder $x^2 + y^2 = 1$. (b) parabolic cylinder $x^2 - y = 0$. (c) hyperbolic cylinder
$x^2 - y^2 = 1$.

3. The ellipsoid, the elliptic paraboloid, and hyperbolic paraboloid are shown as (4), (5) and (6) in Figure 4.18 as
generic level surfaces of quadratic polynomials at respective levels 1, 0, and 0. What, if anything, would be altered
in the pictures if we had chosen levels 2 in (4), 1 in (5), and 2 in (6)?

4. (a) Show that the intersection of the hyperboloid $H_1$ of one sheet

\[ (x/a)^2 + (y/b)^2 - (z/c)^2 = 1 \]

with a plane perpendicular to the z-axis is an ellipse.
(b) Show that the intersection of the hyperboloid $H_1$ with a plane perpendicular to the x-axis or to the
y-axis is a hyperbola.

5. (a) Show that if a plane perpendicular to the z-axis intersects the hyperboloid $H_2$ of two sheets

\[ (x/a)^2 + (y/b)^2 - (z/c)^2 = -1, \]

then the intersection is an ellipse.
(b) Show that the intersection of the hyperboloid $H_2$ with a plane perpendicular to the x-axis or to the
y-axis is a hyperbola.

6. Identify the curves of intersection of the hyperbolic paraboloid $(x/a)^2 - (y/b)^2 = z$ with (a) planes
perpendicular to the z-axis. (b) planes perpendicular to the x-axis or the y-axis.

7. Consider two distances in $\mathbb{R}^3$: (i) from $(x, y, z)$ to the plane $z = -1$, (ii) from $(x, y, z)$ to the point $(0, 0, 1)$.
The points $(x, y, z)$ for which these distances are equal constitute a quadric surface $Q$. Identify $Q$ and make a
sketch of it.

8. The paraboloids $x^2 + y^2 = z$ and $x^2 + y^2 = 8 - z$ intersect in a curve in $\mathbb{R}^3$. Identify the curve and make a sketch
of it.

Each of the following quadratic equations describes an example of one of the quadric surface types illustrated
in the text. In each case identify the type by name and make a sketch of the surface.

9. $4x^2 - y^2 + 4z^2 = 16$
10. $x^2/4 + y^2/4 - z^2/9 = 1$
11. $x^2/4 - y^2/4 + z^2/9 = 1$
12. $4x^2 + y^2 + 4z^2 = 16$
13. $x^2/4 + y^2/4 + z^2/9 = 1$
14. $x^2/4 - y^2/4 - z^2/9 = 0$
15. $4x^2 - y^2 - z = 0$
16. $x^2/4 + y^2/9 - z = 1$
17. $x^2/4 - y^2/4 + z = 1$
18. $4y^2 + z^2 - x = 0$
19. $x^2/4 + z^2/9 - y = 0$
20. $x^2/4 + y^2/4 + z = 0$

Each of the quadratic equations 21 to 28 in two variables describes a curve in $\mathbb{R}^2$. Each one also describes a
quadric surface of cylindrical type in $\mathbb{R}^3$. In each exercise make a perspective sketch of the underlying curve in the
appropriate 2-dimensional coordinate plane in $\mathbb{R}^3$. Then sketch the cylinder parallel to the remaining axis in $\mathbb{R}^3$.

21. $x^2 + y = 0$
22. $z^2 - x = 0$
23. $y^2 + z^2 = 1$
24. $x^2 - y^2 = 1$
25. $x^2 + z = 1$
26. $z^2 + y = 2$
27. $x^2 + 4z^2 = 4$
28. $x^2 - 2y^2 = 1$

SECTION 3 PARTIAL DERIVATIVES


3A Definition
Partial derivatives are the straightforward generalization to functions from $\mathbb{R}^n$ to $\mathbb{R}$ of the ordinary derivative of a real-valued function of a real variable. For example, if $f$ is defined on $\mathbb{R}^2$ we define the partial derivatives $\partial f/\partial x$ and $\partial f/\partial y$ by

$$\frac{\partial f}{\partial x}(x, y) = \lim_{t \to 0} \frac{f(x + t, y) - f(x, y)}{t},$$

$$\frac{\partial f}{\partial y}(x, y) = \lim_{t \to 0} \frac{f(x, y + t) - f(x, y)}{t}.$$

Thus a partial derivative is the result of differentiating with respect to just one variable at a time with the others held fixed. If the derivatives $\partial f/\partial x$ and $\partial f/\partial y$ exist they are also functions from $\mathbb{R}^2$ to $\mathbb{R}$.
A similar definition works for functions defined on $\mathbb{R}^n$. For each $i = 1, \ldots, n$, we define a new real-valued function called the partial derivative of $f$ with respect to the $i$th variable, denoted by $\partial f/\partial x_i$. For each $\mathbf{x} = (x_1, \ldots, x_n)$ in the domain of $f$, the number $(\partial f/\partial x_i)(\mathbf{x})$ is by definition

3.1    $$\frac{\partial f}{\partial x_i}(\mathbf{x}) = \lim_{t \to 0} \frac{f(x_1, \ldots, x_i + t, \ldots, x_n) - f(x_1, \ldots, x_i, \ldots, x_n)}{t}.$$

The domain space of $\partial f/\partial x_i$ is $\mathbb{R}^n$, and the domain of $\partial f/\partial x_i$ is the subset of the domain of $f$ consisting of all $\mathbf{x}$ for which the preceding limit exists. Thus the domain of $\partial f/\partial x_i$ could conceivably be the empty set. The number $(\partial f/\partial x_i)(\mathbf{x})$ is simply the derivative at $x_i$ of the function of one variable obtained by holding $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n$ fixed and by considering $f$ to be a function of the $i$th variable only. As a result, the differentiation formulas of one-variable calculus apply directly.
It's important to realize that we do not call a function "differentiable" just because
it has partial derivatives. For functions of more than one variable, the concept of
differentiability is a little more complicated than that; the matter is taken up in
Chapter 5.
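EXAMPLE 1 Let $f(x, y, z) = x^2 y + y^2 z + z^2 x$. Then

$$\frac{\partial f}{\partial x}(x, y, z) = 2xy + z^2,$$
$$\frac{\partial f}{\partial y}(x, y, z) = x^2 + 2yz,$$
$$\frac{\partial f}{\partial z}(x, y, z) = y^2 + 2zx.$$

The partial derivatives at $\mathbf{x} = (1, 2, 3)$ are

$$\frac{\partial f}{\partial x}(1, 2, 3) = 4 + 9 = 13,$$
$$\frac{\partial f}{\partial y}(1, 2, 3) = 1 + 12 = 13,$$
$$\frac{\partial f}{\partial z}(1, 2, 3) = 4 + 6 = 10.$$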
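The values just computed for $f(x, y, z) = x^2 y + y^2 z + z^2 x$ at $(1, 2, 3)$ can be checked numerically. What follows is only a minimal sketch and not part of the original text: the helper name `partial` and the step size `h` are illustrative assumptions, and a symmetric (central) difference quotient is used in place of the one-sided quotient of the definition simply because it is more accurate for a fixed step.

```python
def partial(f, point, i, h=1e-6):
    """Central-difference estimate of the i-th partial derivative of f at point.

    Only the i-th coordinate is perturbed; all others are held fixed,
    which mirrors the definition of a partial derivative.
    """
    up = list(point)
    down = list(point)
    up[i] += h
    down[i] -= h
    return (f(*up) - f(*down)) / (2 * h)


def f(x, y, z):
    # The function of the example: x^2 y + y^2 z + z^2 x.
    return x**2 * y + y**2 * z + z**2 * x


point = (1.0, 2.0, 3.0)
estimates = [partial(f, point, i) for i in range(3)]
print(estimates)  # values close to 13, 13, 10
```

The estimates agree with the hand computation to within the rounding error of the difference quotient.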

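EXAMPLE 2 Let $f(u, v) = \sin u \cos v$. Then

$$\frac{\partial f}{\partial u} = \frac{\partial (\sin u \cos v)}{\partial u} = \cos u \cos v,$$
$$\frac{\partial f}{\partial v} = \frac{\partial (\sin u \cos v)}{\partial v} = -\sin u \sin v,$$
$$\frac{\partial f}{\partial v}(\pi/2, \pi/2) = \frac{\partial (\sin u \cos v)}{\partial v}(\pi/2, \pi/2) = -\sin\frac{\pi}{2}\,\sin\frac{\pi}{2} = -1.$$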

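We can repeat the operation of taking partial derivatives. The partial derivative of $\partial f/\partial x_i$ with respect to the $j$th variable is $\partial/\partial x_j(\partial f/\partial x_i)$ and is denoted by $\partial^2 f/\partial x_j \partial x_i$. We may repeat this indefinitely, provided the derivatives exist. An alternative notation for higher-order partial derivatives is illustrated as follows, in which each variable of differentiation is denoted by a subscript:

$$\frac{\partial f}{\partial x_i} = f_{x_i},$$
$$\frac{\partial}{\partial x_j}\left(\frac{\partial f}{\partial x_i}\right) = \frac{\partial^2 f}{\partial x_j\,\partial x_i} = f_{x_i x_j},$$
$$\frac{\partial}{\partial x_i}\left(\frac{\partial f}{\partial x_i}\right) = \frac{\partial^2 f}{\partial x_i^2} = f_{x_i x_i},$$
$$\frac{\partial}{\partial x_k}\left(\frac{\partial^2 f}{\partial x_j\,\partial x_i}\right) = \frac{\partial^3 f}{\partial x_k\,\partial x_j\,\partial x_i} = f_{x_i x_j x_k}.$$

Note that the order of variables in the subscript notation is the opposite of that in the $\partial$-notation, since for example $f_{xy}$ means $(f_x)_y$.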

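EXAMPLE 3 Consider $f(x, y) = xy - x^2$.

$$f_x = \frac{\partial f}{\partial x} = y - 2x,$$
$$f_{xy} = \frac{\partial^2 f}{\partial y\,\partial x} = 1,$$
$$f_{xx} = \frac{\partial^2 f}{\partial x^2} = -2,$$
$$f_{yxx} = \frac{\partial^3 f}{\partial x^2\,\partial y} = 0.$$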

3B Geometric Interpretation
To interpret partial derivatives geometrically, we rely on something we know about real-valued functions of a single variable, namely that the value of the derivative at a point is the slope of the tangent line to the graph of the function at that point. For illustrative purposes it will be enough to consider the graph of a function $f: \mathbb{R}^2 \to \mathbb{R}$, namely, the set of points $(x, y, f(x, y))$ in $\mathbb{R}^3$ where $(x, y)$ is in the domain of $f$. Such a graph is in Figure 4.19 as a surface lying over a rectangle in the xy-plane. The intersection of the surface with the vertical plane determined by the condition $y = b$ is a curve satisfying the conditions

$$z = f(x, y), \qquad y = b.$$

Consider the curve defined by the function $g(x) = f(x, b)$ as a subset of 2-dimensional space. Its slope at $x = a$ is

$$g'(a) = \frac{\partial f}{\partial x}(a, b).$$

FIGURE 4.19

Similarly, at $y = b$ the curve defined by $h(y) = f(a, y)$ has slope equal to

$$h'(b) = \frac{\partial f}{\partial y}(a, b).$$

The angles $\alpha$ and $\beta$ shown in Figure 4.19 therefore satisfy

$$\tan\alpha = \frac{\partial f}{\partial x}(a, b), \qquad \tan\beta = \frac{\partial f}{\partial y}(a, b).$$

The numbers $\tan\alpha$ and $\tan\beta$ are slopes of tangent lines to two curves contained in the graph of the function $f$. For this reason it's natural to try to define a tangent plane to the graph of $f$ just to be the plane containing these two lines. If $f$ satisfies the condition of differentiability defined in Chapter 5, then that turns out to be consistent with our ultimate definition. We see that the set of points $(x, y, z)$ satisfying

3.2    $$z = f(a, b) + (x - a)\frac{\partial f}{\partial x}(a, b) + (y - b)\frac{\partial f}{\partial y}(a, b)$$

is a plane containing the tangent lines found previously. To see this, specify in turn $y = b$ and $x = a$ in the previous equation to determine the respective tangent lines.

EXAMPLE 4 A sketch of the part of the graph of

$$f(x, y) = 1 - 2x^2 - y^2$$

corresponding to $x \ge 0$, $y \ge 0$ is in Figure 4.20. The function $f$ has partial derivatives at $(\tfrac12, \tfrac12)$ given by

$$\frac{\partial f}{\partial x}(\tfrac12, \tfrac12) = -2, \qquad \frac{\partial f}{\partial y}(\tfrac12, \tfrac12) = -1.$$

Since $f(\tfrac12, \tfrac12) = \tfrac14$, the tangent plane to the graph of $f$ at $(\tfrac12, \tfrac12)$ is, by Equation 3.2,

$$z = \tfrac14 - 2(x - \tfrac12) - (y - \tfrac12) = \tfrac74 - 2x - y.$$

We can sketch the tangent plane by drawing the two tangent lines in it determined at $(x, y) = (\tfrac12, \tfrac12)$. It's somewhat easier to locate three points on the plane, for simplicity the intercepts

$$(\tfrac78, 0, 0), \quad (0, \tfrac74, 0), \quad (0, 0, \tfrac74),$$

and then sketch the plane containing these points. The point of tangency on the graph of $f$ is $(\tfrac12, \tfrac12, \tfrac14)$. See Figure 4.20.

FIGURE 4.20
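Equation 3.2 is mechanical enough to automate. The sketch below is an illustration under stated assumptions rather than anything taken from the text: the names `partial_x`, `partial_y` and `tangent_plane` are hypothetical, the partial derivatives are estimated by central differences, and the plane is returned in the expanded form $z = c + px + qy$.

```python
def partial_x(f, a, b, h=1e-6):
    # Central-difference estimate of the partial derivative in x at (a, b).
    return (f(a + h, b) - f(a - h, b)) / (2 * h)


def partial_y(f, a, b, h=1e-6):
    # Central-difference estimate of the partial derivative in y at (a, b).
    return (f(a, b + h) - f(a, b - h)) / (2 * h)


def tangent_plane(f, a, b):
    """Return (c, p, q) with z = c + p*x + q*y, obtained by expanding Equation 3.2."""
    p = partial_x(f, a, b)
    q = partial_y(f, a, b)
    c = f(a, b) - a * p - b * q
    return c, p, q


def f(x, y):
    # The function of the example: 1 - 2x^2 - y^2.
    return 1 - 2 * x**2 - y**2


c, p, q = tangent_plane(f, 0.5, 0.5)
print(c, p, q)  # approximately 1.75, -2, -1
```

Setting $y = 0$, $z = 0$ and so on in $z = c + px + qy$ then gives the three intercepts used to sketch the plane.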
3C Continuity
We discuss continuity for functions of more than one variable extensively in Chapter 5. At this point, we'll consider briefly the case of a function $f: \mathbb{R}^2 \to \mathbb{R}$. To allow $\mathbf{z}$ to approach $\mathbf{x}$ from an arbitrary direction we assume, for each $\mathbf{x} = (x, y)$ in the domain of $f$, that $f(\mathbf{z})$ is defined for all vectors $\mathbf{z} = (z, w)$ satisfying $|\mathbf{x} - \mathbf{z}| < \delta$, where $\delta$ is some positive number. We then say that $f$ is continuous if for each point $\mathbf{x}$ in the domain of $f$

$$\lim_{\mathbf{z} \to \mathbf{x}} f(\mathbf{z}) = f(\mathbf{x}).$$

The limit relation means that we can make $f(\mathbf{z})$ arbitrarily close to $f(\mathbf{x})$ if the distance $|\mathbf{x} - \mathbf{z}|$ from $\mathbf{x}$ to $\mathbf{z}$ is small enough. As usual, the intuitive idea of continuity is that the values of the function $f$ should not change abruptly, resulting, for example, in breaks in the graph of $f$. The graphs shown in Figures 4.19 and 4.20 are those of continuous functions, whereas Figure 4.21 shows a simple example of the graph of a discontinuous function.

FIGURE 4.21

If we assume certain continuity conditions on $f$ and its partial derivatives then the higher-order partial derivatives of $f: \mathbb{R}^2 \to \mathbb{R}$ are independent of the order of differentiation. The precise statement follows, though we remark that a slightly stronger theorem is true. (See Exercise 13 of Chapter 7, Section 3.)

3.3 Clairaut's Theorem. Let $f: \mathbb{R}^2 \to \mathbb{R}$ be continuous and such that $f_x$, $f_y$, $f_{xy}$, and $f_{yx}$ are also continuous on the same domain as $f$. Then $f_{xy} = f_{yx}$.
Proof. Choose $x$, $y$, $h \ne 0$, $k \ne 0$ and $\delta > 0$ so the difference

$$F(h, k) = [f(x + h, y + k) - f(x + h, y)] - [f(x, y + k) - f(x, y)]$$

is defined if $\sqrt{h^2 + k^2} < \delta$. We now apply the mean-value theorem in the variable $x$ to the function
$$G(x) = f(x, y + k) - f(x, y)$$
on the interval with endpoints $x$ and $x + h$. We find
$$G(x + h) - G(x) = hG'(x_1),$$
where $x_1$ is between $x$ and $x + h$. In terms of $F$ and $f$, this last equation is

$$F(h, k) = h[f_x(x_1, y + k) - f_x(x_1, y)].$$

Now apply the mean-value theorem again, this time to the function $H(y) = f_x(x_1, y)$ on the y-interval with endpoints $y$ and $y + k$. We find

$$F(h, k) = hk f_{xy}(x_1, y_1),$$
where $y_1$ is between $y$ and $y + k$. Rewriting $F$ in the form
$$F(h, k) = [f(x + h, y + k) - f(x, y + k)] - [f(x + h, y) - f(x, y)]$$
allows us to follow the same general procedure, this time differentiating with respect to $y$, then $x$. We find
$$F(h, k) = hk f_{yx}(x_2, y_2),$$
where $x_2$ and $y_2$ lie between $x$, $x + h$ and $y$, $y + k$ respectively. Equating the two expressions found for $F(h, k)$, and canceling the factor $hk$, gives
$$f_{xy}(x_1, y_1) = f_{yx}(x_2, y_2).$$
Now let both $h$ and $k$ tend to zero. It follows from the positions of the $x_i$ and $y_i$ that the distances

$$\sqrt{(x_1 - x)^2 + (y_1 - y)^2} \quad\text{and}\quad \sqrt{(x_2 - x)^2 + (y_2 - y)^2}$$

both tend to zero. Therefore, by the continuity of $f_{xy}$ and $f_{yx}$, we get $f_{xy}(x, y) = f_{yx}(x, y)$. The point $(x, y)$ was arbitrary, so $f_{xy} = f_{yx}$ on the domain of $f$. ■
We may apply Theorem 3.3 successively to still higher-order partial derivatives, provided the analogous differentiability and continuity requirements are satisfied. Moreover, by considering only two variables at a time, we can apply the theorem to functions $f: \mathbb{R}^n \to \mathbb{R}$ where $n > 2$. Thus for the commonly encountered functions that have continuous partial derivatives of arbitrarily high order, we have typically
$$\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x},$$
$$\frac{\partial^3 g}{\partial x\,\partial y\,\partial x} = \frac{\partial^3 g}{\partial x^2\,\partial y},$$
$$\frac{\partial^4 h}{\partial z\,\partial x\,\partial y\,\partial z} = \frac{\partial^4 h}{\partial x\,\partial y\,\partial z^2}, \quad\text{etc.}$$
The last two formulas follow from repeated application of the two-variable formula by interchanging two differentiations at a time.
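The difference $F(h, k)$ in the proof of Theorem 3.3 also gives a practical way to approximate a mixed partial derivative, since $F(h, k)/(hk)$ approaches the common value $f_{xy} = f_{yx}$ as $h$ and $k$ tend to zero. The following sketch illustrates this under assumptions of our own: the test function $x^3 y^2 + \sin x\, e^y$, the base point $(1, 1)$ and the step sizes are arbitrary choices, and `mixed_exact` is the mixed partial computed by hand for that particular function.

```python
import math


def f(x, y):
    # Illustrative smooth function (an assumption, not taken from the text).
    return x**3 * y**2 + math.sin(x) * math.exp(y)


def mixed_exact(x, y):
    # Hand-computed f_xy = f_yx for the function f above.
    return 6 * x**2 * y + math.cos(x) * math.exp(y)


def second_difference(x, y, h, k):
    # The quantity F(h, k) from the proof of Theorem 3.3.
    return (f(x + h, y + k) - f(x + h, y)) - (f(x, y + k) - f(x, y))


x0, y0 = 1.0, 1.0
errors = []
for step in (1e-1, 1e-2, 1e-3):
    quotient = second_difference(x0, y0, step, step) / (step * step)
    errors.append(abs(quotient - mixed_exact(x0, y0)))
print(errors)  # the errors shrink as the step shrinks
```

Because $F(h, k)$ is symmetric in the roles of the two variables, the same quotient approximates both $f_{xy}$ and $f_{yx}$, which is the idea behind the proof.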

EXERCISES

In Exercises 1 to 6, find $\partial f/\partial x$ and $\partial f/\partial y$, where $f(x, y)$ is the given function.
1. $x^2 + x\sin(x + y)$
2. $\sin x \cos(x + y)$
3. $e^{x+y+1}$
4. $\arctan(y/x)$
5. $x^y$
6. $\log_x y$

In Exercises 7 to 10, find the general formulas for $\partial f/\partial x$ and $\partial f/\partial y$, then evaluate them at the indicated point $(x, y) = (a, b)$. Then for each function $f$ use Equation 3.2 to write the equation of the tangent plane at the point $(a, b, f(a, b))$ on the graph of $f$. Simplify the equation of the tangent.
7. $f(x, y) = x^2 y + x y^2$, $(a, b) = (1, -1)$
8. $f(x, y) = x^2 - y^2$, $(a, b) = (2, 1)$
9. $f(x, y) = \dfrac{1}{x^2 + y^2}$, $(a, b) = (1, 1)$
10. $f(x, y) = x(y^2 + 1)$, $(a, b) = (0, 2)$

In Exercises 11 to 14, find $\dfrac{\partial^2 f}{\partial y\,\partial x}$ and $\dfrac{\partial^2 f}{\partial x\,\partial y}$, where $f$ is as given.
11. $xy + x^2 y^3$
12. $\sin(x^2 + y^2)$
13. $\dfrac{1}{x^2 + y^2}$
14. a quotient with denominator $x + y$ [numerator illegible in source]

In Exercises 15 to 20, find all first-order partial derivatives of the given functions.
15. $f(x, y, z) = x^2 e^{x+y+z} \cos y$
16. $f(x, y) = x^2 \cos xy$
17. $f(x, y, z, w) = \dfrac{x^2 - y^2}{z^2 + w^2}$
18. $f(x, y, z) = xyz$
19. $f(x, y, z) = x + 2yz$
20. $f(x_1, x_2, x_3) = x_1 x_2 - x_3$
21. Find $\dfrac{\partial^3 f(x, y)}{\partial x^2\,\partial y}$, if $f(x, y) = \ln(x + y)$.

In Exercises 22 to 25, show that the Laplace equation $f_{xx} + f_{yy} = 0$ is satisfied by the given function.
22. $\ln(x^2 + y^2)$
23. $x^3 - 3xy^2$
24. $x/(x^2 + y^2)$
25. $e^x \cos y$
26. If $f(x, y, z) = 1/(x^2 + y^2 + z^2)^{1/2}$, show that
$$f_{xx} + f_{yy} + f_{zz} = 0.$$
27. If $f(x_1, x_2, \ldots, x_n) = 1/(x_1^2 + x_2^2 + \cdots + x_n^2)^{(n-2)/2}$, show that
$$f_{x_1 x_1} + f_{x_2 x_2} + \cdots + f_{x_n x_n} = 0.$$
28. Prove directly, without using Theorem 3.3, the general statement that if $f(x, y)$ is a polynomial in $x$ and $y$, that is, a sum of constant multiples of functions of the form $x^k y^l$, where $k$ and $l$ are nonnegative integers, then $f_{xy}(x, y) = f_{yx}(x, y)$.

For each of the functions 29 to 32, use Equation 3.2 to find a function whose graph is the tangent plane to the graph at the indicated point. Sketch the graph of the function and the tangent plane near the point of tangency.
29. $\sqrt{1 - x^2 - y^2}$ at $(1/2, 1/2, 1/\sqrt{2})$
30. $e^{x+y}$ at $(0, 0, 1)$
31. $e^{-x^2 - y^2}$ at $(0, 0, 1)$
32. $\sqrt{1 - x^3 - y^3}$ at $(0, 0, 1)$
33. Find a parametric representation for the line perpendicular to the tangent plane found in Exercise 29 and passing through the point of tangency.

Verify that the following functions satisfy the diffusion equation $u_{xx} = 4u_t$.
34. $u(x, t) = e^{-a^2 t/4} \cos ax$, where $a$ is constant.
35. $u(x, t) = t^{-1/2} e^{-x^2/t}$ for $t > 0$.

Verify that the functions 36 and 37 satisfy the wave equation $y_{xx} = y_{tt}$.
36. $y(x, t) = \sin(x - t)$
37. $y(x, t) = \cosh(x + t)$

Assume in Exercises 38 and 39 that a pair $u(x, y)$ and $v(x, y)$ of functions satisfy the equations $u_x(x, y) = v_y(x, y)$ and $u_y(x, y) = -v_x(x, y)$, called the Cauchy-Riemann equations, and assume that $u$ and $v$ have continuous derivatives of order two on some domain $D$.
38. Show that $u_{xx}(x, y) + u_{yy}(x, y) = 0$ and $v_{xx}(x, y) + v_{yy}(x, y) = 0$ on $D$.
39. Show that $u(x, y) = e^x \cos y$ and $v(x, y) = e^x \sin y$ satisfy the two Cauchy-Riemann equations.

Harmonic functions on $\mathbb{R}^2$ are real-valued functions $u(x, y)$ that satisfy $u_{xx} + u_{yy} = 0$ for all $(x, y)$ in the domain of $u$. In Exercises 40 to 49, each formula defines a function on some domain $D$ in $\mathbb{R}^2$. In each case describe $D$, and state whether the function is harmonic on $D$ or not.
40. $u(x, y) = x^2 - y^2$
41. $u(x, y) = x^3 - y^3$
42. $u(x, y) = e^x \cos y$
43. $u(x, y) = x^3 - 3xy^2$
44. $u(x, y) = x/y$
45. $u(x, y) = \sin(x - y)$
46. $u(x, y) = \ln(x^2 + y^2)$
47. $u(x, y) = \arctan(y/x)$
48. $u(x, y) = \arctan(x/y)$
49. $u(x, y) = \sin(x + y)$

Concavity. A harmonic function $u$ as defined in the preamble to the previous exercises has the property that at every point of its domain either $u_{xx} = u_{yy} = 0$ or else the graph of $u$ exhibits concavity, up or down, along one or both of the lines through the point and parallel to the x- and y-axes. If $u_{xx} \ne 0$ and $u_{yy} \ne 0$, then the concavities have opposite directions.
50. Illustrate the concavity properties using the specific example $u(x, y) = x^2 - y^2$.
*51. Prove the concavity properties for harmonic functions in general.
*52. If
$$f(x, y) = \begin{cases} 2xy\,\dfrac{x^2 - y^2}{x^2 + y^2}, & \text{for } x^2 + y^2 \ne 0 \\ 0, & \text{for } x = y = 0, \end{cases}$$
show that $f_{xy}(0, 0) = -2$ and $f_{yx}(0, 0) = 2$. [Hint: You need to use the definition of partial derivative.] Why does Theorem 3.3 not apply here?
SECTION 4 PARAMETRIZED SURFACES
Parametrization is both useful and fundamental for the representation of curves, and
the same statement is true for surfaces. More than one parameter is needed in a
vector-valued function to represent differentiable surfaces, as for example planes in
Chapter 1, Section 5. Consequently the derivatives we use for computation will be
partial derivatives with respect to more than one variable.
4A Vector Partial Derivatives
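In the previous section we considered partial derivatives of real-valued functions only. If $f: \mathbb{R}^n \to \mathbb{R}^m$ is a vector-valued function of $n$ variables, it's natural and useful to define its partial derivatives $\partial f/\partial x_i$ by Equation 3.1 that was used for real-valued functions. The difference is that the quotient in that definition now becomes a vector rather than a number, so the limit $\partial f/\partial x_i(\mathbf{x})$ is a vector also. What we have here is a combination of the ideas of Section 1, on vector-valued functions, and Section 3, where the domain variable is a vector. Since vector limits are computed by taking the limit of each coordinate function, it follows immediately that

4.1    If $f(\mathbf{x}) = \begin{pmatrix} f_1(\mathbf{x}) \\ \vdots \\ f_m(\mathbf{x}) \end{pmatrix}$, then $\dfrac{\partial f}{\partial x_i}(\mathbf{x}) = \begin{pmatrix} \dfrac{\partial f_1}{\partial x_i}(\mathbf{x}) \\ \vdots \\ \dfrac{\partial f_m}{\partial x_i}(\mathbf{x}) \end{pmatrix}$.

EXAMPLE 1 Writing $g(x, y)$ as a column vector, suppose that $g: \mathbb{R}^2 \to \mathbb{R}^2$ is

$$g(x, y) = \begin{pmatrix} x^2 y \\ x y^2 \end{pmatrix}.$$
Then
$$\frac{\partial g}{\partial x}(x, y) = \begin{pmatrix} 2xy \\ y^2 \end{pmatrix} \quad\text{and}\quad \frac{\partial g}{\partial y}(x, y) = \begin{pmatrix} x^2 \\ 2xy \end{pmatrix}.$$

EXAMPLE 2 If

$$f(u, v) = \begin{pmatrix} u\cos v \\ u\sin v \\ v \end{pmatrix},$$

then
$$\frac{\partial f}{\partial u}(u, v) = \begin{pmatrix} \cos v \\ \sin v \\ 0 \end{pmatrix} \quad\text{and}\quad \frac{\partial f}{\partial v}(u, v) = \begin{pmatrix} -u\sin v \\ u\cos v \\ 1 \end{pmatrix}.$$

EXAMPLE 3 If $\mathbf{x}$ and $\mathbf{y}$ are constant vectors, and $h(u, v) = u\mathbf{x} + v\mathbf{y}$, then $\partial h/\partial u(u, v) = \mathbf{x}$ and $\partial h/\partial v(u, v) = \mathbf{y}$.

The geometric significance of the vector partial derivative is as follows. If all coordinates but one are held fixed, and the remaining one, say $x_i$, is allowed to vary, then $f(\mathbf{x}) = f(x_1, \ldots, x_i, \ldots, x_n)$ traces an image curve in $\mathbb{R}^m$, sometimes called a coordinate curve if $m \ge n$. Hence by the interpretation of Section 1, the vector $\partial f/\partial x_i(\mathbf{x})$ is a tangent vector to this coordinate curve at the image point $f(\mathbf{x})$.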

EXAMPLE 4 Consider the very simple situation in which $f: \mathbb{R}^2 \to \mathbb{R}^3$ is given by

$$f(u, v) = u\mathbf{x}_1 + v\mathbf{x}_2,$$

which is a parametric representation of a plane containing $\mathbf{x}_1$ and $\mathbf{x}_2$ and also passing through the origin. If $v = v_0$ is held fixed, then as $u$ varies, $f(u, v_0)$ traces a line through $v_0\mathbf{x}_2$ and parallel to $\mathbf{x}_1$. See Figure 4.22(a). The vector partial derivative with respect to $u$ is
$$\frac{\partial f}{\partial u}(u, v_0) = \frac{\partial(u\mathbf{x}_1 + v_0\mathbf{x}_2)}{\partial u} = \mathbf{x}_1.$$

Similarly,
$$\frac{\partial f}{\partial v}(u_0, v) = \frac{\partial(u_0\mathbf{x}_1 + v\mathbf{x}_2)}{\partial v} = \mathbf{x}_2.$$

Thus $\mathbf{x}_1$ and $\mathbf{x}_2$ each plays the role of tangent vector to a line in the plane parametrized by $f$.

Just as a parametric representation of a plane is a linear function image, perhaps shifted by a constant vector, so curved surfaces are parametrized images of nonlinear vector-valued functions. If $f$ is differentiable in a sense to be made precise in Chapter 5, and the vectors $\mathbf{x}_u(u_0, v_0) = \partial f/\partial u(u_0, v_0)$ and $\mathbf{x}_v(u_0, v_0) = \partial f/\partial v(u_0, v_0)$ are linearly independent, we define the tangent plane at $\mathbf{x}(u_0, v_0)$ to a surface parametrized by $\mathbf{x}(u, v) = f(u, v)$ to be the plane passing through $\mathbf{x}(u_0, v_0)$ and parallel to $\mathbf{x}_u(u_0, v_0)$ and $\mathbf{x}_v(u_0, v_0)$. A parametric representation for the tangent plane gives a picture such as Figure 4.22(a) for the image of

4.2    $$\mathbf{x} = u\,\mathbf{x}_u(u_0, v_0) + v\,\mathbf{x}_v(u_0, v_0) + \mathbf{x}(u_0, v_0).$$

If you like, you can make the point of tangency on the tangent plane correspond to the parameter values $(u, v) = (u_0, v_0)$ by replacing the scalar factors $u$ and $v$ in Equation 4.2 by $(u - u_0)$ and $(v - v_0)$. Alternatively, you have the option of using altogether different letters for the parameters, for example using $s$ and $t$:

$$\mathbf{x} = s\,\mathbf{x}_u(u_0, v_0) + t\,\mathbf{x}_v(u_0, v_0) + \mathbf{x}(u_0, v_0).$$

FIGURE 4.22 (a) (b)


EXAMPLE 5 Figure 4.22(b) shows a helical surface that is generated by helical curves, or alternatively by line segments perpendicular to the z-axis. A parametric representation for the surface is
$$f(u, v) = \begin{pmatrix} u\cos v \\ u\sin v \\ v \end{pmatrix},$$

where in the picture we have restricted the parameters $u$ and $v$ so that $0 \le u \le 4$ and $0 \le v \le 3\pi$. If $u = u_0$ is held fixed and $v$ varies, we get a helical curve (see Example 8 in Section 1) winding one and one-half times around the z-axis on a cylinder of radius $u_0$. With $v = v_0$ and $u$ varying, we get a line segment $v_0$ units above the xy-plane. The vector partial derivatives of $f$ were computed in Example 2. At $(u_0, v_0) = (1, \pi/4)$, we get the tangent vectors
$$\frac{\partial f}{\partial u}(1, \pi/4) = \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \\ 0 \end{pmatrix}, \qquad \frac{\partial f}{\partial v}(1, \pi/4) = \begin{pmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \\ 1 \end{pmatrix}$$

to the two parameter curves through the point $f(1, \pi/4) = (1/\sqrt{2}, 1/\sqrt{2}, \pi/4)$ on the surface. The tangent plane at this point is represented parametrically by

$$u\begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \\ 0 \end{pmatrix} + v\begin{pmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \\ 1 \end{pmatrix} + \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \\ \pi/4 \end{pmatrix},$$
as $(u, v)$ ranges over $\mathbb{R}^2$. The plane appears in Figure 4.22(b). Note that the curved surface is not the graph of a function of $(x, y)$, because there are z-values differing by $2\pi$ corresponding to some pairs $(x, y)$. The image surface is called a helicoid.
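The tangent vectors and tangent plane of the helicoid at $(u_0, v_0) = (1, \pi/4)$ can be reproduced directly from the parametrization and Equation 4.2. This is a small sketch for illustration only: the function names `f_u`, `f_v` and `tangent_plane_point`, and the use of $s$, $t$ for the plane parameters, are our own choices rather than notation fixed by the text.

```python
import math


def f(u, v):
    # Helicoid parametrization (u cos v, u sin v, v).
    return (u * math.cos(v), u * math.sin(v), v)


def f_u(u, v):
    # Vector partial derivative with respect to u, coordinate by coordinate.
    return (math.cos(v), math.sin(v), 0.0)


def f_v(u, v):
    # Vector partial derivative with respect to v, coordinate by coordinate.
    return (-u * math.sin(v), u * math.cos(v), 1.0)


def tangent_plane_point(s, t, u0, v0):
    # Equation 4.2 with parameters renamed s and t.
    base, a, b = f(u0, v0), f_u(u0, v0), f_v(u0, v0)
    return tuple(base[i] + s * a[i] + t * b[i] for i in range(3))


u0, v0 = 1.0, math.pi / 4
print(f_u(u0, v0))  # approximately (0.7071, 0.7071, 0.0)
print(f_v(u0, v0))  # approximately (-0.7071, 0.7071, 1.0)
print(tangent_plane_point(0.0, 0.0, u0, v0))  # the point of tangency
```

With $s = t = 0$ the plane passes through the point of tangency, and varying $s$ or $t$ alone moves along one of the two tangent lines.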

EXAMPLE 6 The function $f(u, v) = (u\cos v, u\sin v, v)$ with domain altered from the previous example to the rectangle $-1 \le u \le 1$, $-3\pi \le v \le 3\pi$ has for its image a helicoid $H$ that winds around the central axis. Figure 4.23(a) shows a sketch of a portion of the surface. The tangent plane to $H$ at a point $f(0, v_0) = (0, 0, v_0)$ on the vertical axis is generated by two tangent vectors, $f_u(0, v_0) = (\cos v_0, \sin v_0, 0)$ and $f_v(0, v_0) = (0, 0, 1)$. Since the second of these two vectors is parallel to the central axis, the tangent plane at a point on the axis always contains the axis. Since the dot product of $f_u(0, v_0)$ and $f_v(0, v_0)$ equals zero, the tangent plane also contains segments perpendicular to the central axis and lying in the surface. The surface is pictured as a kind of "skeleton," reminiscent of models of the DNA molecule and their linking of points on pairs of helices shown in Figure 4.23(a).

EXAMPLE 7 A cone $C$ represented parametrically by $\mathbf{x}(u, v) = (u\cos v, u\sin v, u)$ is sketched in Figure 4.23(b). (Note that the parametrization differs only in the third coordinate function from that of the helicoid.) The parameter values $(u, v) = (0, v_0)$ all correspond to the single point $(0, 0, 0)$ on $C$. Furthermore, the attempt to find a pair of independent vectors $\mathbf{x}_u(0, v_0) = (\cos v_0, \sin v_0, 1)$ and $\mathbf{x}_v(0, v_0) = (0, 0, 0)$ at $(0, 0, 0)$ fails to produce a tangent plane, since the second vector fails to provide a tangent direction. At every other point $C$ has a well-defined tangent plane, as you are asked to show in Exercise 19.

FIGURE 4.23 (a) (b)
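One way to see where the cone's tangent vectors fail to be independent is to compute their cross product, using the standard fact (not stated in the text) that two vectors in $\mathbb{R}^3$ are linearly independent exactly when their cross product is nonzero. The sketch below differentiates $\mathbf{x}(u, v) = (u\cos v, u\sin v, u)$ coordinate by coordinate; the helper names `cross` and `normal_length` are hypothetical.

```python
import math


def x_u(u, v):
    # Partial derivative of (u cos v, u sin v, u) with respect to u.
    return (math.cos(v), math.sin(v), 1.0)


def x_v(u, v):
    # Partial derivative of (u cos v, u sin v, u) with respect to v.
    return (-u * math.sin(v), u * math.cos(v), 0.0)


def cross(a, b):
    # Cross product of two vectors in R^3.
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normal_length(u, v):
    # Length of x_u × x_v; zero exactly when the two vectors are dependent.
    n = cross(x_u(u, v), x_v(u, v))
    return math.sqrt(sum(c * c for c in n))


print(normal_length(0.0, 0.7))  # 0.0 at the tip of the cone
print(normal_length(2.0, 1.3))  # approximately 2 * sqrt(2)
```

A short hand computation gives $\mathbf{x}_u \times \mathbf{x}_v = (-u\cos v, -u\sin v, u)$, of length $|u|\sqrt{2}$, so the only failure of independence occurs at $u = 0$.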

The sharp point at the ends of the two symmetric halves of the cone in the
previous example is called "singular" because the surface lacks a certain smoothness
that we like to associate with a typical point of a surface. The official definition
is as follows. Recall the related definition of smooth curve in Section 1 of this
chapter.

Definition A surface $S$ parametrized by a function $g: \mathbb{R}^2 \to \mathbb{R}^n$ is smooth at a point $g(u, v)$ on $S$ if (i) $g_u(u, v)$ and $g_v(u, v)$ have continuous coordinate functions, and (ii) the vector partial derivatives $g_u(u, v)$ and $g_v(u, v)$ are linearly independent, that is, neither is a constant multiple of the other. The interpretation is that smoothness is equivalent to having a well-defined tangent plane at the point $g(u, v)$. A point where a surface is not smooth is called a singular point.

EXAMPLE 8 Here we consider the transformation $h: \mathbb{R}^2 \to \mathbb{R}^2$ given by

$$h(u, v) = (u\cos v, u\sin v)$$

restricted by $0 \le u \le 2$ and $0 \le v \le \pi$. This example comes from the previous one by elimination of the third coordinate in the range, so the image surface is restricted to the image plane. Figure 4.24(a) shows the domain of $h$ and Figure 4.24(b) shows the image. To get an idea of how the transformation behaves, we look at the images under $h$ of the lines parallel to the axes in the domain. The resulting curves in $\mathbb{R}^2$, given by
$$x = u\cos v, \qquad y = u\sin v,$$
are semicircles if $u$ is held fixed and line segments if $v$ is held fixed. The image of $h$ restricted to the rectangle in Figure 4.24(a) is the half-disk in Figure 4.24(b). In this example the image "surface" lies in $\mathbb{R}^2$.

FIGURE 4.24 (a) (b)

Graphs of $z = f(x, y)$. In Section 2 we considered the graphs of real-valued functions $f(x, y)$ as surfaces $S$ in $\mathbb{R}^3$ lying over the domain $D$ of $f$ in the xy-space $\mathbb{R}^2$. There is a simple way to absorb such surfaces into our present discussion of parametrized surfaces. The idea is just to regard $x$ and $y$ as parameters in a parametric representation $(x, y, z) = (x, y, f(x, y))$, where $(x, y)$ varies over $D$. A glance at Figure 4.23(a) shows that finding a single-valued function $f(x, y)$ whose graph is the entire helical surface is impossible, so the parametric representation is really more general. You can verify that letting $f(x, y) = \arctan(y/x)$ is a partial solution to finding a function $f: \mathbb{R}^2 \to \mathbb{R}$ whose graph is a part of the helicoid parametrized by
$$x = u\cos v, \qquad y = u\sin v, \qquad z = v.$$
The following example illustrates the general proposition that when both representations apply, the tangent planes turn out to be the same.

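EXAMPLE 9 The function $f(x, y) = e^{x-y}$ has partial derivatives $f_x(1, 1) = 1$ and $f_y(1, 1) = -1$. Hence the tangent plane to the graph $S$ of $f$ at the point of tangency $(1, 1, 1)$ is
$$z = 1 + (1)(x - 1) + (-1)(y - 1) \quad\text{or}\quad z = x - y + 1.$$

There is no uniquely determined parametric representation for $S$, but there is one that is particularly convenient. Letting $x = u$ and $y = v$ to introduce our usual parameters, we write

$$g(u, v) = \begin{pmatrix} u \\ v \\ e^{u-v} \end{pmatrix}, \quad\text{so}\quad g_u(u, v) = \begin{pmatrix} 1 \\ 0 \\ e^{u-v} \end{pmatrix}, \quad g_v(u, v) = \begin{pmatrix} 0 \\ 1 \\ -e^{u-v} \end{pmatrix}$$

as parametric representations for $S$ and its tangent vectors. Parametrically, the tangent at $g(1, 1) = (1, 1, 1)$ is $\mathbf{x} = g(1, 1) + u\,g_u(1, 1) + v\,g_v(1, 1)$, or

$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + u\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} + v\begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix} = \begin{pmatrix} 1 + u \\ 1 + v \\ 1 + u - v \end{pmatrix}.$$

Eliminating the parameters by means of $u = x - 1$ and $v = y - 1$, we see that $z = 1 + x - y$, which is what we got using the first method.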
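The agreement of the two tangent planes for $f(x, y) = e^{x-y}$ at $(1, 1, 1)$ can also be checked numerically: every point produced by the parametric form should satisfy $z = 1 + x - y$. The following is a minimal sketch under our own naming assumptions (`tangent_point`, `residuals`, and the sample parameter values are illustrative, not from the text).

```python
import math


def g(u, v):
    # Parametrization of the graph of e^(x - y) with x = u, y = v.
    return (u, v, math.exp(u - v))


def g_u(u, v):
    return (1.0, 0.0, math.exp(u - v))


def g_v(u, v):
    return (0.0, 1.0, -math.exp(u - v))


def tangent_point(s, t, u0=1.0, v0=1.0):
    # Point of the parametric tangent plane g(u0, v0) + s g_u + t g_v.
    base, a, b = g(u0, v0), g_u(u0, v0), g_v(u0, v0)
    return tuple(base[i] + s * a[i] + t * b[i] for i in range(3))


# Residual of the equation z = 1 + x - y at several points of the plane.
residuals = []
for s, t in [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (-2.5, 3.0)]:
    x, y, z = tangent_point(s, t)
    residuals.append(z - (1 + x - y))
print(residuals)  # all essentially zero
```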

4B Quadric Surfaces
The definition of quadric surface given in Section 2D is fundamental for many pur-
poses, but parametric representations provide insight into the structure of some of
them, and will also be useful for certain multiple integration problems later on. The
elliptic and hyperbolic paraboloids are graphs of real-valued functions, so these two
types are only a notational change away from parametrization, as we've just seen. Of
the remaining types, we'll treat the elliptic cone and the ellipsoid in detail, leaving
details of the hyperboloids as exercises.

EXAMPLE 10 Geometrically an elliptic cone is generated by all lines that pass through the points of a fixed ellipse in space and that also pass through a fixed point not in the plane of the ellipse. (In particular, an elliptic cone could be one of the familiar right circular cones.) Using coordinates in $\mathbb{R}^3$, let the fixed point be the origin, and let the fixed ellipse be the one that projects parallel to the z-axis from the plane $z = 1$ onto the ellipse $(x/a)^2 + (y/b)^2 = 1$ in the xy-plane. A typical point on the ellipse has coordinates $(a\cos v, b\sin v, 1)$, so a line joining this point to the origin consists of all points of the form

$$\mathbf{x}(u, v) = u(a\cos v, b\sin v, 1) = (au\cos v, bu\sin v, u).$$


FIGURE 4.25 (a) (b)

As $u$ varies for fixed $v$, $\mathbf{x}(u, v)$ traces one of the generating lines of the cone. As $v$ varies for fixed $u \ne 0$, $\mathbf{x}(u, v)$ traces an ellipse with semiaxes $|ua|$ and $|ub|$ in the plane $z = u$. See Figure 4.25(a).
To identify the cone as the quadric surface defined in Section 2, just extract $x$, $y$ and $z$ from the parametrization, and observe that
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = u^2\cos^2 v + u^2\sin^2 v = z^2.$$

EXAMPLE 11 Geometrically an ellipsoid is a closed surface with three perpendicular axes of symmetry such that the plane cross sections perpendicular to these axes are ellipses, possibly circles. Taking the symmetry axes to be the coordinate axes in $\mathbb{R}^3$, we let $a$, $b$ and $c$ be positive numbers and

$$\mathbf{x}(u, v) = \begin{pmatrix} a\cos u\sin v \\ b\sin u\sin v \\ c\cos v \end{pmatrix}.$$

Identification with the quadric surface of Section 2 comes from checking that

$$(x/a)^2 + (y/b)^2 + (z/c)^2 = \cos^2 u\sin^2 v + \sin^2 u\sin^2 v + \cos^2 v = 1.$$
For a fixed $v$ between $0$ and $\pi$ and varying $u$, $\mathbf{x}(u, v)$ traces an elliptic latitude curve $(x/(a\sin v))^2 + (y/(b\sin v))^2 = 1$ in the plane $z = c\cos v$. As $v$ varies for fixed $u$, $\mathbf{x}(u, v)$ traces longitude curves extending between the "north pole" and the "south pole." Typical curves are shown in Figure 4.25(b).
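The identity that places the parametrized points on the ellipsoid is easy to spot-check on a grid of parameter values. The sketch below does this for one arbitrary choice of semiaxes; the values $a = 3$, $b = 2$, $c = 1$, the grid, and the names `ellipsoid_point` and `level` are assumptions made only for the illustration.

```python
import math


def ellipsoid_point(u, v, a, b, c):
    # Parametrization (a cos u sin v, b sin u sin v, c cos v).
    return (a * math.cos(u) * math.sin(v),
            b * math.sin(u) * math.sin(v),
            c * math.cos(v))


def level(p, a, b, c):
    # Left side of the ellipsoid equation (x/a)^2 + (y/b)^2 + (z/c)^2 = 1.
    x, y, z = p
    return (x / a) ** 2 + (y / b) ** 2 + (z / c) ** 2


a, b, c = 3.0, 2.0, 1.0  # illustrative semiaxes
values = []
for i in range(8):        # u runs over [0, 2*pi)
    for j in range(1, 8): # v runs over (0, pi)
        u = 2 * math.pi * i / 8
        v = math.pi * j / 8
        values.append(level(ellipsoid_point(u, v, a, b, c), a, b, c))
print(max(abs(val - 1.0) for val in values))  # tiny rounding error only
```

Fixing `j` in the inner loop and letting `i` vary walks around one latitude curve in the plane $z = c\cos v$; fixing `i` instead walks along a longitude curve.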

EXERCISES

Find formulas for the vector partial derivatives $\partial f/\partial x(x, y)$ and $\partial f/\partial y(x, y)$ of the functions in Exercises 1 to 6.
1. $f(x, y) = \begin{pmatrix} x + y \\ x - y \\ x^2 + y^2 \end{pmatrix}$
2. $f(x, y) = (e^x\cos y, e^x\sin y)$
3. $f(x, y) = $ [formula illegible in source]
4. $f(x, y) = $ [formula illegible in source]
5. $f(x, y) = (e^x, e^y, e^{x+y})$
6. $f(x, y) = $ [formula illegible in source]

For each of the functions in Exercises 7 to 10, find all first-order vector partial derivatives at the indicated point. Then for each one, sketch the curves on the image surface passing through the image point $g(u_0, v_0)$ and having the property that only $u$ varies on one curve, and only $v$ varies on the other; that is, sketch the coordinate curves through $g(u_0, v_0)$. Finally, sketch the two tangent vectors given by the partial derivatives.
7. $g(u, v) = (u, v, u^2 + v^2)$ at $(u_0, v_0) = (1, 1)$
8. $g(u, v) = (u, v, uv)$ when $(u_0, v_0) = (1, 1)$
9. $g(u, v) = (\cos u\sin v, \sin u\sin v, \cos v)$ at $(u_0, v_0) = (\pi/4, \pi/4)$
10. $g(u, v) = (u\cos v, u\sin v, v)$ at $(u_0, v_0) = (1, \pi/4)$

For each of Exercises 11 to 14, sketch the image surface corresponding to the domain given here and sketch the tangent plane at the indicated point.
11. $g(u, v) = (u, v, u^2 + v^2)$, $-2 \le u \le 2$, $-2 \le v \le 2$, tangent where $(u_0, v_0) = (1, 1)$
12. $g(u, v) = (u, v, uv)$, $0 \le u \le 2$, $0 \le v \le 2$, tangent where $(u_0, v_0) = (1, 1)$
13. $g(u, v) = (\cos u\sin v, \sin u\sin v, \cos v)$, $0 \le u \le \tfrac12\pi$, $0 \le v \le \tfrac12\pi$, tangent where $(u_0, v_0) = (\pi/4, \pi/4)$ [Hint: This surface is a piece of a sphere.]
14. $g(u, v) = (u\cos v, u\sin v, v)$, $0 \le u \le 2$, $0 \le v \le 2\pi$, tangent at $(u_0, v_0) = (1, \pi/4)$
15. Let $f(x, y, t) = (x + t, y + t^2)$, for $t \ge 0$, represent the position at time $t$ of a point starting at $(x, y)$ and tracing a path in $\mathbb{R}^2$.
(a) Sketch the paths starting at $(x, y) = (0, 0)$, $(1, 0)$, and $(1, 1)$.
(b) Sketch the path starting at $(-1, 0)$ and give a geometric interpretation for the vector partial derivatives
$$\frac{\partial f}{\partial t}(-1, 0, 0) \quad\text{and}\quad \frac{\partial f}{\partial t}(-1, 0, 1).$$
16. The vector function $f$ is defined by
$$f\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x^2 - y^2 \\ 2xy \end{pmatrix}.$$
Consider the domain space to be the xy-plane and the range space to be the uv-plane.
(a) What are the coordinate functions of $f$?
(b) Find the image of the segment of the line $y = x$ between $(0, 0)$ and $(1, 1)$.
(c) Find the image of the region defined for positive $x$ and $y$, and $x^2 + y^2 < 1$.
(d) Find the angle between the images of the lines $y = 0$ and $y = (1/\sqrt{3})x$.
17. A vector function $f$ from the xy-plane to the uv-plane is defined by
[formula illegible in source], $x \ne 0$.
(a) What are the coordinate functions of $f$?
(b) What are the vector partial derivatives of $f$?
(c) Describe the image of the region bounded by the four lines
$$x = y, \quad y = x - 8, \quad x = -y, \quad y = 8 - x.$$
18. Let a transformation from the xy-plane to itself be given by
$$f\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x + y \\ -x + y \end{pmatrix}.$$
(a) Show that $f$ accomplishes an expansion out from the origin by a factor $\sqrt{2}$ combined with a rotation through an angle $\pi/4$.
(b) Show that the vector partial derivatives of $f$ are constant.
19. Show that the cone $C$ as parametrized in Example 10 of the text is smooth at all points except the pointed tip.
20. Let $f(u, v) = (u^2\cos v, u^2\sin v, u)$.
(a) Show that the image $S$ of $f$ with $(u, v)$ unrestricted coincides with the level set $x^2 + y^2 - z^4 = 0$.
(b) Make a sketch of $S$.
(c) Where are the singular points of $S$?
21. The top half $T$ of the cone as parametrized in Example 10 of the text appears intuitively to be singular at its pointed end, and is singular in the technical sense because, while the natural tangent vectors there are continuous, they aren't linearly independent since one of them is the zero vector.
(a) Show that $T$ is also parametrized by $g(u, v) = (u, v, \sqrt{u^2 + v^2})$ by showing that the images of both parametrizations coincide with the level set $x^2 + y^2 - z^2 = 0$ for $z \ge 0$.
(b) Show that the pointed end of $T$ is singular with respect to the parametrization $g(u, v)$, but for a different reason than for the parametrization given in Example 10.
(c) Show that, with respect to $g(u, v)$, $T$ is smooth at every point other than the tip.
22. (a) Sketch the surface parametrized by
$$\mathbf{x}(u, v) = ((1 - u^2)\cos v, (1 - u^2)\sin v, u), \quad -1 \le u \le 1, \quad -\infty < v < \infty.$$
(b) What are the singular points of the surface in part (a)?
23. Let $a > b > 0$, and consider the set $T$ parametrically represented by
$$\mathbf{x}(u, v) = \begin{pmatrix} (a + b\cos v)\cos u \\ (a + b\cos v)\sin u \\ b\sin v \end{pmatrix}, \quad 0 \le u \le 2\pi, \quad 0 \le v \le 2\pi.$$
(a) Show that $T$ is a torus by identifying two families of circles on it, showing in particular that for fixed $v = v_0$, $\mathbf{x}$ traces a circle of radius $a + b\cos v_0$ parallel to the xy-plane and that for fixed $u = u_0$, $\mathbf{x}$ traces a circle of radius $b$ in a plane containing the z-axis. Note that $a + b$ is the outer radius of the ring, $a - b$ is its inner radius, and $b$ is the radius of the tube.
(b) By eliminating $u$ and $v$ show that the points of $T$ satisfy
$$\left(\sqrt{x^2 + y^2} - a\right)^2 + z^2 = b^2.$$
24. The helical surface represented parametrically by $(x, y, z) = (u\cos v, u\sin v, v)$ for $u > 0$ and $-\pi/2 < v < \pi/2$ is also the graph of a function $f(x, y)$ defined for $x > 0$. Show by eliminating $u$ and $v$ that $f(x, y) = \arctan(y/x)$, where we assume $\arctan(0) = 0$.
25. (a) Verify that the parametrizations
$$\mathbf{x}(u, v) = (a\cos v, b\cos u\sin v, c\sin u\sin v)$$
$$\mathbf{x}(u, v) = (a\cos u\sin v, b\cos v, c\sin u\sin v)$$
are different parametrizations of the ellipsoid $(x/a)^2 + (y/b)^2 + (z/c)^2 = 1$.
(b) Use the parametrizations of part (a) to show that the ellipsoid has elliptic cross sections perpendicular to the x-axis and y-axis.
26. Parametrizations for the hyperboloid of one sheet, $(x/a)^2 + (y/b)^2 - (z/c)^2 = 1$, and of two sheets, $(x/a)^2 + (y/b)^2 - (z/c)^2 = -1$, may involve the hyperbolic functions $\cosh v = \tfrac12(e^v + e^{-v})$ and $\sinh v = \tfrac12(e^v - e^{-v})$.
(a) Verify that $\cosh^2 v - \sinh^2 v = 1$.
(b) Verify that the hyperboloid of one sheet has parametrization
$$\mathbf{x}(u, v) = (a\cos u\cosh v, b\sin u\cosh v, c\sinh v).$$
(c) Verify that each piece of the hyperboloid of two sheets has parametrization
$$\mathbf{x}(u, v) = (a\cos u\sinh v, b\sin u\sinh v, c\cosh v),$$
where $c > 0$ gives the upper half and $c < 0$ gives the lower half.
27. Suppose that $(y, z) = (g(v), h(v))$ parametrizes a curve in the yz-plane of $\mathbb{R}^3$. Rotating this curve about the z-axis generates a surface of revolution $S$. Show that a parametrization for $S$ is given by
$$\mathbf{x}(u, v) = \begin{pmatrix} g(v)\cos u \\ g(v)\sin u \\ h(v) \end{pmatrix}.$$
28. Show that if the graph $S$ of the real-valued function $f(x, y)$ has a tangent plane at $(x_0, y_0, f(x_0, y_0))$ given by Equation 3.2, then the parametrization of $S$ given by
$$\mathbf{x}(u, v) = \begin{pmatrix} u \\ v \\ f(u, v) \end{pmatrix},$$
together with Equation 4.2, yields the same tangent plane.

4C Computer Plotting of Image Surfaces


A parametric representation $x = f_1(u, v)$, $y = f_2(u, v)$, $z = f_3(u, v)$ of a surface
is one in which the surface is the image of the vector function with coordinate
functions $f_1$, $f_2$, and $f_3$ restricted to a domain in $\mathbb{R}^2$ with coordinates $(u, v)$. This is
a generalization of the situation considered when we plotted graphs for $z = f(x, y)$.
There we drew pictures in which the domain of a function was itself a part of the
picture. In particular, for a positive function, the domain could be thought of as lying
in the xy-plane under the graph. In the present situation, the common 2-dimensional
domain of the coordinate functions $f_1$, $f_2$, $f_3$ can usually just as well be thought of
as lying somewhere in some separate plane. The plotting program is very similar,
however.

DEFINE f1(u, v) = u cos(v)
DEFINE f2(u, v) = u sin(v)
DEFINE f3(u, v) = 0.4v
First plot v-varying curves, for selected u.
FOR u = 0 TO 3 STEP 0.5
  FOR v = 0 TO 2π STEP 0.01
    PLOT3D (f1(u, v), f2(u, v), f3(u, v))
  NEXT v
NEXT u
Then plot u-varying curves, for selected v.
FOR v = 0 TO 2π STEP π/5
  FOR u = 0 TO 3 STEP 0.01
    PLOT3D (f1(u, v), f2(u, v), f3(u, v))
  NEXT u
NEXT v
FIGURE 4.26

Executing the routine produces Figure 4.26.
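The routine above can also be sketched in a general-purpose language. The following Python version is only an illustration: it collects the points of the two families of curves in lists instead of drawing them, and the helper names `frange` and `grid_curves` are hypothetical, not part of the text's plotting language. Any 3-dimensional plotting tool could then draw each list as a polygonal curve.

```python
import math

def f(u, v):
    """Parametrization of the helical surface used in the routine above."""
    return (u * math.cos(v), u * math.sin(v), 0.4 * v)

def frange(start, stop, step):
    """Values start, start + step, ... up to approximately stop."""
    n = int(round((stop - start) / step))
    return [start + k * step for k in range(n + 1)]

def grid_curves(u_max=3.0, v_max=2 * math.pi):
    # v-varying curves, one for each selected u
    v_curves = [[f(u, v) for v in frange(0.0, v_max, 0.01)]
                for u in frange(0.0, u_max, 0.5)]
    # u-varying curves, one for each selected v
    u_curves = [[f(u, v) for u in frange(0.0, u_max, 0.01)]
                for v in frange(0.0, v_max, math.pi / 5)]
    return v_curves, u_curves

v_curves, u_curves = grid_curves()
print(len(v_curves), "v-varying curves and", len(u_curves), "u-varying curves")
```

Separating the computation of points from the drawing step keeps the sketch independent of any particular graphics package.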

EXERCISES

In Exercises 1 to 6, sketch the parametrically defined
surfaces. In 5 and 6 you may find it useful to introduce
the heaviside function $H(u^2 + v^2)$ as in Section 2C.
1. $f(u, v) = (\cos u \sin v, \sin u \sin v, \cos v)$, $0 \le u \le \pi/2$, $0 \le v \le \pi/2$
2. $f(u, v) = (\cos u \sin v, \sin u \sin v, 1 + \cos v)$, $0 \le u \le 2\pi$, $0 \le v \le 2\pi$
3. $f(u, v) = (\exp(u - v), u, v)$, $0 \le u \le 2$, $0 \le v \le 2$
4. $f(u, v) = (u + v, u - 2v, 2u + 3v)$, $0 \le u \le 3$, $0 \le v \le 2$
5. $f(u, v) = (u + v, u - 2v, 2u + 3v)$, $0 \le u$, $0 \le v$, $u^2 + v^2 \le 2$
6. $f(u, v) = (u, v, u^2 + v^2)$, $0 \le u$, $0 \le v$, $u^2 + v^2 \le 2$
7. Explain in words the geometric significance of the angle
parameters u, v in Exercises 1 and 2.
In Exercises 8 to 11, sketch the parametrically defined
surfaces subject to the given conditions.
8. $(x, y, z) = (\cos u \sin v, \sin u \sin v, \cos v)$, $0 \le u \le \pi/2$, $0 \le v \le \pi/4$
9. $(x, y, z) = (\cos u \sin v, \sin u \sin v, 1 + \cos v)$, $0 \le u \le 2\pi$, $0 \le v \le 2\pi$, for $\tfrac{1}{2} \le y \le 1$
10. $(x, y, z) = (\exp(u - v), u, v)$, $0 \le u \le 2$, $0 \le v \le 2$
11. $(x, y, z) = (u + v, u - 2v, 2u + 3v)$, $|x| \le 1$, $|y| \le 2$, $|z| \le 3$
In Exercises 12 to 15, find a parametrization and, using
this parametrization, make a computer sketch of an
example of each of the following.
12. The ellipsoid $x^2 + 2y^2 + z^2 = 1$
13. The hyperboloid of one sheet $x^2 + y^2 - 2z^2 = 1$ for $|z| \le 1$
14. The half of the hyperboloid $z^2 - x^2 - y^2 = 1$ for which $z \ge 1$
15. The half of the hyperboloid $z^2 - x^2 - y^2 = 1$ for which $z \le -1$

Chapter 4 REVIEW

1. Let $\mathbf{x}(t) = (e^t \cos t)\mathbf{i} + (e^t \sin t)\mathbf{j} + e^t \mathbf{k}$ and find the velocity vector $\dot{\mathbf{x}}(t)$ and the vector $\mathbf{t}(t)$ of length 1 pointing in the same direction. Also find the acceleration vector $\ddot{\mathbf{x}}(t)$.
2. The motion of a particle is given by the vector function $\mathbf{x}(t) = (\cos 2t)\mathbf{i} + (\sin 2t)\mathbf{j} + t^2 \mathbf{k}$.
(a) Sketch the trajectory of the particle when $0 \le t \le \pi$.
(b) What is the velocity vector when $t = \pi/4$? Add this vector to your sketch.
(c) Suppose the particle leaves its prescribed path when $t = \pi/4$ and continues at the constant velocity it has acquired at that time; where will the particle then be at time $t = \pi/2$?
3. The position of a particle at time t is given by
(a) Find the velocity and acceleration vectors when $t = 0$.
(b) What distance does the particle travel between $t = 0$ and $t = 2\pi$?
(c) Show that the particle always lies on a sphere centered at the origin.
(d) Find a plane that contains the path of the particle, and sketch the path from an appropriate point of view.
4. (a) Sketch the image curve of the function given by $f(t) = (t, \cos t, \sin t)$ for $0 \le t \le 2\pi$.
(b) Find a parametric representation for the line tangent to the curve in part (a) at the point $f(\pi/2) = (\pi/2, 0, 1)$, and add this line to the sketch for part (a).
(c) Find the acceleration vector of the curve at $f(\pi/2)$ and add this to the sketch also.
5. Find the maximum and minimum values of the speed of the motion along the curve traced by $\mathbf{x}(t) = (a \cos t, b \sin t, ct)$, where $a > b > 0$ and $c$ are constant.
6. Suppose $\mathbf{x}(t)$ traces a curve in $\mathbb{R}^3$ that lies in a plane $ax + by + cz = d$. Show that if the curve has nonzero tangent and acceleration vectors at each point $\mathbf{x}(t)$, then $\dot{\mathbf{x}}(t)$ and $\ddot{\mathbf{x}}(t)$ are parallel to the plane of the curve.
7. (a) Sketch the curve parametrized by $\mathbf{x}(t) = (e^t \cos t, e^t \sin t)$ for $0 \le t \le \pi/2$.
(b) Find the length of the curve in part (a).
(c) Repeat parts (a) and (b) with the parameter interval replaced by $-\pi/2 \le t \le 0$.
8. A particle moves so that at time t its position is $\mathbf{x}(t) = (-\sin t, \cos t, t^2)$.
(a) Find the velocity $\mathbf{v}(t) = \dot{\mathbf{x}}(t)$ and the acceleration $\mathbf{a}(t) = \ddot{\mathbf{x}}(t)$.
(b) What is the distance between the positions of the particle at $t = 0$ and at $t = \pi$?
(c) Find a parametric expression for the tangent line to the curve traced out by the particle's motion at the point $\mathbf{x}(\pi)$.
9. Find an equation for the tangent plane to the graph of the equation $z = x^2 + y^3$ at $(1, 2, 9)$.
10. Find a parametric representation for the plane tangent to the image surface of the function $f(u, v) = (u + v, u^2, v^2)$ at the point $f(1, 1) = (2, 1, 1)$.
In Exercises 11 to 16, verify that the functions satisfy the Laplace equation $u_{xx} + u_{yy} = 0$ in two dimensions.
11. $u(x, y) = x^2 - y^2$
12. $u(x, y) = 2xy$
13. $u(x, y) = x^3 y - x y^3$
14. $u(x, y) = \ln(x^2 + y^2)$
15. $u(x, y) = \arctan(y/x)$, $x \ne 0$
16. $u(x, y) = e^{x^2 - y^2} \cos 2xy$
In Exercises 17 and 18, verify that the functions satisfy the 3-dimensional Laplace equation $u_{xx} + u_{yy} + u_{zz} = 0$.
17. $u(x, y, z) = xyz + 2x^2 - y^2 - z^2$
18. $u(x, y, z) = (x^2 + y^2 + z^2)^{-1/2}$, $(x, y, z) \ne (0, 0, 0)$
19. Let $u(x, y, z) = (x^2 + y^2 + z^2)^{\alpha}$, $\alpha$ constant. Show that $u_{xx} + u_{yy} + u_{zz} = 0$ just for $\alpha = 0$ and $\alpha = -1/2$.
20. Let $f(u, v) = (u, v, u^2 + v^2)$.
(a) Sketch the image of $f$.
(b) Compute the partial derivatives $f_u(u, v)$ and $f_v(u, v)$. Compute the equation of the tangent plane to the image surface of $f$ when $u = 1$, $v = 1$.
(c) Sketch the u and v coordinate curves of $f$ through $(1, 1, 2)$; that is, the curve where u varies and $v = 1$, and the curve where v varies and $u = 1$. Sketch the tangent vectors to these curves at $(1, 1, 2)$. What is the relationship between these tangent vectors and the partial derivatives computed in (b)?
21. (a) Show that the intersection of the elliptic cone $(x/a)^2 + (y/b)^2 - z^2 = 0$ with a plane perpendicular to the z-axis is an ellipse.
(b) Show that the intersection of the cone with a plane perpendicular to the x-axis or to the y-axis is a hyperbola.
22. Identify the curve of intersection of the ellipsoid $(x/a)^2 + (y/b)^2 + (z/c)^2 = 1$ with a plane perpendicular to one of the coordinate axes.
23. Identify the curves of intersection of the elliptic paraboloid $(x/a)^2 + (y/b)^2 = z$ with (a) planes perpendicular to the z-axis, (b) planes perpendicular to the x-axis or the y-axis.
24. Consider the helicoid H parametrized by $\mathbf{x}(u, v) = (u \cos v, u \sin v, v)$.
(a) Find a parametrization for the tangent plane to H at the point $(1/\sqrt{2}, 1/\sqrt{2}, \pi/4)$.
(b) Find a normal vector $\mathbf{n}$ of length 1 to H at the same point.
A function $f: \mathbb{R}^n \to \mathbb{R}^m$ represents a set S
(a) explicitly if S is the graph of $f$ in $\mathbb{R}^{n+m}$,
(b) implicitly if S is a level set of $f$ in $\mathbb{R}^n$,
(c) parametrically if S is the image of $f$ in $\mathbb{R}^m$.
For example, the graph of $f: \mathbb{R}^1 \to \mathbb{R}^1$ where $f(x) = x^2$ is an explicit representation of a parabola P in $\mathbb{R}^{1+1} = \mathbb{R}^2$.
25. Find an implicit representation for the parabola P.
26. Find a parametric representation for the parabola P.
27. The function $g: \mathbb{R}^1 \to \mathbb{R}^3$ where $g(t) = (t, t, t^2)$ has as its image a parabola P in $\mathbb{R}^3$, so P is represented parametrically by $g$. Make a sketch of P, and find an explicit representation of P as the graph of some $f: \mathbb{R}^1 \to \mathbb{R}^2$.
CHAPTER 5

DIFFERENTIABILITY

To say that a function $f: \mathbb{R} \to \mathbb{R}^m$ is differentiable on an interval $a < t < b$ means
that the $m$ real-valued coordinate functions of $f(t)$ are differentiable on $a < t < b$
and that $f(t)$ has derivative $f'(t) = (f_1'(t), \dots, f_m'(t))$ for every $t$ in the interval.
The aim of this chapter is to explain the analogue of that simple definition for
functions $f: \mathbb{R}^n \to \mathbb{R}$ and then for $f: \mathbb{R}^n \to \mathbb{R}^m$. The connection between the property
of differentiability and the existence of derivatives is more interesting for the case of a
higher-dimensional domain even when the range space is 1-dimensional. The reason
for this is that with only a 1-dimensional domain there is only one line along which
to differentiate. But whenever the domain has dimension $n \ge 2$ we have to deal with
infinitely many different directions from which to approach a given point, and hence
infinitely many possible directions along which to compute the limit that defines a
derivative. Recall that in the previous chapter all the derivatives we computed for
functions of more than one variable were partial derivatives, each one defined by
a limit taken in a direction parallel to one of the coordinate axes; this allowed us
to use all the machinery of single-variable calculus in our computations. The only
reason we were able to get away with ignoring the infinitely many other possible
directions for computing derivatives was that we restricted our examples to functions
$\mathbb{R}^n \to \mathbb{R}^m$ that were differentiable in the specific sense defined below in Section 2
of the present chapter. Once we have clarified this point, we'll see that extending the
definition of differentiability to vector-valued functions $f$ is really just a matter of
requiring the real-valued coordinate functions of $f$ to be differentiable. Finding the
correct analogue of the derivative $f'(t)$ then comes down to using matrix algebra
in a very natural way, one advantage of which will become apparent in Section 5
of the present chapter on Newton's method for approximate solution of systems of
equations. Another important application appears in Section 3 of Chapter 6 on chain
rule computations.
For functions of one real variable, finite or infinite intervals with or without
endpoints will be sufficiently general domains for our purposes. To make the ideas
mentioned in the previous paragraph precise in higher-dimensional domains we begin
with some terminology for describing significant properties of more variously shaped
subsets of $\mathbb{R}^n$ such as disks and rectangular regions. We follow these examples by
a brief discussion of continuity and then consider differentiability, for real-valued
functions in Section 2, and for vector-valued functions in Section 4.

SECTION 1 LIMITS AND CONTINUITY


To discuss differentiability for general vector functions we need two ideas: the limit of
such a function, and what it means to be an interior point of the domain of a function.

We'll also extend the definitions of limit and continuity from real-valued functions
of one variable to real-valued and vector-valued functions of several variables and
explain how to construct continuous functions of several variables using the familiar
functions of single-variable calculus.
The definition of limit is based on the idea of nearness. The limit relation

$$\lim_{x \to 0} \frac{\sin x}{x} = 1$$

underlies the usual introduction to the calculus of the trigonometric functions and
is most often proved geometrically in that context. The equation says "$(\sin x)/x$ is
arbitrarily close to 1 provided $x$ is sufficiently close to 0." We express nearness on the
real-number line by inequalities such as $|x - 3| < 0.4$, which says that the distance
between the number $x$ and the number 3 is less than 0.4, or equivalently that $x$ lies
in the open interval with center 3 and length 0.8. See Figure 5.1. And we translate
statements such as "$(\sin x)/x$ is arbitrarily close to 1 provided $x$ is sufficiently close
to 0" into statements about inequalities that we can operate on algebraically. Thus
the previous displayed formula says the following: For a given positive number $\epsilon$,
there is a positive number $\delta$ such that if

$$0 < |x - 0| = |x| < \delta, \quad\text{then}\quad \left|\frac{\sin x}{x} - 1\right| < \epsilon.$$

FIGURE 5.1

The condition $0 < |x - 0|$ signifies that the precise value, if any, assigned to the
function at $x = 0$ is irrelevant to the existence of the limit; this condition can only
make it easier to find the required number $\delta > 0$, because $x = 0$ doesn't have to
satisfy the $\epsilon$-inequality.
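The following Python sketch illustrates numerically how a suitable $\delta$ can be matched to a given $\epsilon$ for this limit. It rests on an assumption not made in the text, namely the standard bound $|(\sin x)/x - 1| \le x^2/6$ for $x \ne 0$, which suggests the choice $\delta = \sqrt{6\epsilon}$. The function names are hypothetical, and checking sample points is an illustration, not a proof.

```python
import math

def delta_for(eps):
    """A delta that works for lim_{x->0} sin(x)/x = 1.

    Assumption (not from the text): |sin(x)/x - 1| <= x**2 / 6 for x != 0,
    so delta = sqrt(6 * eps) suffices.
    """
    return math.sqrt(6.0 * eps)

def check(eps, samples=1000):
    """Test the epsilon-inequality at sample points with 0 < |x| < delta."""
    delta = delta_for(eps)
    for k in range(1, samples + 1):
        x = delta * k / (samples + 1)      # 0 < x < delta
        for s in (x, -x):                  # both sides of 0
            if abs(math.sin(s) / s - 1.0) >= eps:
                return False
    return True

for eps in (0.1, 0.01, 0.001):
    print(eps, delta_for(eps), check(eps))
```

Note that the point $x = 0$ is never sampled, in line with the condition $0 < |x - 0|$.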
In the previous chapter we assumed the properties of elementary functions
such as $(\sin x)/x$ to be known, and we will continue to do so here, largely avoiding $\epsilon$ and $\delta$ arguments
and concentrating more on the geometric properties of the natural domain sets in $\mathbb{R}^n$
that play the role of intervals $a < x < b$ and $a \le x \le b$ in $\mathbb{R}$.

1A Neighborhoods
In $\mathbb{R}^n$ a definition of limit also requires the means of asserting that one point is close
to another. For a given $\delta > 0$ and point $\mathbf{x}_0$ in $\mathbb{R}^n$, the set of all points $\mathbf{x}$ in $\mathbb{R}^n$ that
satisfy the inequality

$$|\mathbf{x} - \mathbf{x}_0| < \delta$$

is called a $\delta$-ball with radius $\delta$ and center $\mathbf{x}_0$. For example, if $\mathbf{x}_0 = (1, 2, 1)$, Figure 5.2
shows the set of all $\mathbf{x} = (x, y, z)$ in $\mathbb{R}^3$ such that

$$|\mathbf{x} - \mathbf{x}_0| = \sqrt{(x - 1)^2 + (y - 2)^2 + (z - 1)^2} < 0.5.$$

FIGURE 5.2

Suppose $S$ is a set of points in $\mathbb{R}^n$ and $\mathbf{x}$ a point in $\mathbb{R}^n$. Then $\mathbf{x}$ is a limit point
of $S$ if, for a given $\delta > 0$, there exists a point $\mathbf{y}$ in $S$ such that $0 < |\mathbf{x} - \mathbf{y}| < \delta$.
Translated into English, the definition says that $\mathbf{x}$ is a limit point of $S$ if there are
points in $S$ other than $\mathbf{x}$ that are contained in a ball of arbitrarily small positive radius
with center at $\mathbf{x}$. A $\delta$-ball is sometimes called a neighborhood of the point at its
center. Thus $\mathbf{x}$ is a limit point of $S$ if every neighborhood of $\mathbf{x}$ contains a point of $S$
other than $\mathbf{x}$. Note that a limit point of $S$ need not itself be in $S$.

EXAMPLE 1 The set $S$ in $\mathbb{R}^2$ consisting of all points $(x, y)$ such that

$$x^2 + y^2 < 1,$$

together with the single point $(2, 0)$, appears in Figure 5.3. The set of limit points of
$S$ consists of the circular disk together with the circle

$$x^2 + y^2 = 1.$$

Note, however, that the limit points precisely on the circle of radius 1 are not in $S$.
The point $(2, 0)$ is not a limit point of $S$, even though it is in $S$, because there is no
other point of $S$ within 1 unit of it.

FIGURE 5.3

One way for a point $\mathbf{x}$ to be a limit point of a set $S$ is for $\mathbf{x}$ to be an interior
point of $S$, that is, a point $\mathbf{x}$ in $S$ such that all points within some neighborhood of
$\mathbf{x}$ are also in $S$. A set $S$ all of whose points are interior points is called open. For
example $\mathbb{R}^n$ is an open set, and so is the circular disk in Example 1. If something
occurs at all points in a neighborhood of a point $\mathbf{x}$ then $\mathbf{x}$ is an interior point of the
set where it occurs.
Consider again the set $S$ shown in Figure 5.3 and described in Example 1. The
interior points of $S$, which are also limit points, are those in the open disk represented
by the shaded part of the drawing. A point $\mathbf{x}$ in the disk but not on the circle
$x^2 + y^2 = 1$ is an interior point of $S$, because a disk of small enough radius centered
at $\mathbf{x}$ would be contained in $S$. The point $(2, 0)$ is not an interior point of $S$, even
though it is in $S$. Even if the circle $x^2 + y^2 = 1$ were included in $S$, the points of
the circle would not be interior points. Figure 5.4(a) shows the interior points of $S$
and Figure 5.4(b) shows the limit points of $S$.

In the most common examples of functions $f: D \to \mathbb{R}^m$, the domain $D$ is either
an open set, or else an open set together with some points called boundary points
of $D$. A boundary point of a set $D$ is a point $\mathbf{x}$ such that every neighborhood of $\mathbf{x}$
contains both a point in $D$ and a point not in $D$. Thus $\mathbf{x}$ may be a boundary point
of $D$ without being itself in $D$. But an interior point of $D$ is never a boundary point
of $D$. The boundary of a set $D$ is just the set of all boundary points of $D$, and a
closed set is by definition a set that contains all of its boundary points.

EXAMPLE 3 Figure 5.5 shows examples of closed sets $D$ in $\mathbb{R}$ and in $\mathbb{R}^2$, along with the interior
of $D$, which is always open, and the boundary of $D$. The boundary of the set shown
in Figure 5.3 consists of the points on the circle $x^2 + y^2 = 1$ together with the point
$(2, 0)$.

1B Limits
Here is the definition of limit for a function $f: \mathbb{R}^n \to \mathbb{R}^m$. Let $\mathbf{y}_0$ be a point in $\mathbb{R}^m$
and $\mathbf{x}_0$ a limit point of the domain of $f$. Then $\mathbf{y}_0$ is the limit of $f$ at $\mathbf{x}_0$ if, for a
given $\epsilon > 0$, there is a $\delta > 0$ such that $|f(\mathbf{x}) - \mathbf{y}_0| < \epsilon$ whenever $\mathbf{x}$ is in the domain
of $f$ and satisfies $0 < |\mathbf{x} - \mathbf{x}_0| < \delta$. The relation is written

$$\lim_{\mathbf{x} \to \mathbf{x}_0} f(\mathbf{x}) = \mathbf{y}_0.$$

To put it less formally, the definition says that $f(\mathbf{x})$ is arbitrarily close to $\mathbf{y}_0$ when $\mathbf{x}$
is sufficiently close to $\mathbf{x}_0$ and $\mathbf{x} \ne \mathbf{x}_0$. Geometrically, the idea is this: Given an $\epsilon$-ball
$B_\epsilon$ centered at $\mathbf{y}_0$, there exists a $\delta$-ball $B_\delta$ centered at $\mathbf{x}_0$ whose intersection with the
domain of $f$, except possibly for $\mathbf{x}_0$ itself, is sent by $f$ into $B_\epsilon$. A 2-dimensional
example is pictured in Figure 5.6. The statement

$$\lim_{\mathbf{x} \to \mathbf{x}_0} f(\mathbf{x}) = \mathbf{y}_0$$

is also commonly read "The limit of $f(\mathbf{x})$ as $\mathbf{x}$ approaches $\mathbf{x}_0$ is $\mathbf{y}_0$." We always get
a unique limit if one exists, since if

$$\lim_{\mathbf{x} \to \mathbf{x}_0} f(\mathbf{x}) = \mathbf{y}_1 \quad\text{and}\quad \lim_{\mathbf{x} \to \mathbf{x}_0} f(\mathbf{x}) = \mathbf{y}_2,$$

then by the triangle inequality

$$|\mathbf{y}_1 - \mathbf{y}_2| \le |\mathbf{y}_1 - f(\mathbf{x})| + |f(\mathbf{x}) - \mathbf{y}_2| < \epsilon + \epsilon = 2\epsilon$$

for all $\mathbf{x}$ in a small enough neighborhood of $\mathbf{x}_0$. But we can make $2\epsilon$ as small as we
like, so $|\mathbf{y}_1 - \mathbf{y}_2| = 0$ and $\mathbf{y}_1 = \mathbf{y}_2$.

FIGURE 5.4 (a) Interior points of $S$. (b) Limit points of $S$.

FIGURE 5.5 (a) Closed interval $a \le x \le b$, denoted $[a, b]$. (b) Interior of $D$: open interval $a < x < b$, denoted $(a, b)$. (c) Boundary of $D$. (d) Closed rectangle $a \le x \le b$, $c \le y \le d$. (e) Open interior $a < x < b$, $c < y < d$. (f) Boundary rectangle.

FIGURE 5.6 Domain space $\mathbb{R}^2$ and range space $\mathbb{R}^2$.

EXAMPLE 4 Consider the function defined by

$$f(t) = (\cos t, \sin t).$$

The domain of $f$ is all of $\mathbb{R}$, and at every point $t_0$ of $\mathbb{R}$, $f$ has limit $f(t_0)$. To see
this we use known facts about $\cos t$ and $\sin t$ and consider

$$|f(t) - f(t_0)| = \sqrt{(\cos t - \cos t_0)^2 + (\sin t - \sin t_0)^2} \le |\cos t - \cos t_0| + |\sin t - \sin t_0|. \tag{$*$}$$

The last inequality holds because $\sqrt{a^2 + b^2} \le |a| + |b|$. (Square both sides.) Using
continuity of $\sin t$ and $\cos t$,

$$\lim_{t \to t_0} \cos t = \cos t_0 \quad\text{and}\quad \lim_{t \to t_0} \sin t = \sin t_0.$$

Then we can make both

$$|\cos t - \cos t_0| \quad\text{and}\quad |\sin t - \sin t_0|$$

as small as we like by making $|t - t_0|$ small enough. Hence the inequality $(*)$ shows
that $|f(t) - f(t_0)|$ is as small as we like whenever $|t - t_0|$ is small enough.

EXAMPLE 5 Consider the real-valued function defined in all of $\mathbb{R}^2$ except for $(x, y) = (0, 0)$ by

$$f(x, y) = \frac{1}{x^2 + y^2}.$$

In this example we can write

$$\lim_{\mathbf{x} \to \mathbf{0}} f(\mathbf{x}) = +\infty$$

to describe what happens, because as $|(x, y)|$ tends to 0, its square $x^2 + y^2 = |(x, y)|^2$
tends to zero also, so the fraction tends to $+\infty$. By the convention established in
our definition of limit, only elements of $\mathbb{R}^m$ are acceptable limits, so we say for this
example that the limit fails to exist.

EXAMPLE 6 Let $f$ be real-valued with the same domain as in the preceding example and defined by

$$f(x, y) = \frac{x^2 - y^2}{x^2 + y^2}.$$

There is no limit as $(x, y) \to (0, 0)$. If $(x, y)$ approaches $(0, 0)$ along the line $y = ax$,
we obtain

$$\lim_{x \to 0} \frac{x^2 - y^2}{x^2 + y^2} = \lim_{x \to 0} \frac{x^2(1 - a^2)}{x^2(1 + a^2)} = \frac{1 - a^2}{1 + a^2}.$$

This limit is not independent of $a$, because, for example, the limit equals 0 if $a = 1$,
and 1 if $a = 0$. But for a unique limit to exist we have to be able to approach $(0, 0)$
from all possible directions, so the overall limit fails to exist.
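The dependence on the slope $a$ in Example 6 can be seen numerically. The short Python sketch below evaluates $f$ at points on the line $y = ax$ closer and closer to the origin; the helper name `limit_along_line` is hypothetical and used only for illustration.

```python
def f(x, y):
    """The function of Example 6, defined for (x, y) != (0, 0)."""
    return (x * x - y * y) / (x * x + y * y)

def limit_along_line(a, steps=6):
    """Values of f at points (x, a*x) on the line y = a*x as x shrinks toward 0."""
    return [f(10.0 ** -k, a * 10.0 ** -k) for k in range(1, steps + 1)]

for a in (0.0, 1.0, 2.0):
    values = limit_along_line(a)
    # Along y = a*x the quotient is constant, equal to (1 - a^2)/(1 + a^2).
    print(a, values[-1], (1 - a * a) / (1 + a * a))
```

Different slopes give different values arbitrarily near the origin, which is exactly why no single limit can exist there.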
The functions in Examples 5 and 6 are both real-valued. The following theorem
shows that the problem of the existence and evaluation of a limit for a function
$f: \mathbb{R}^n \to \mathbb{R}^m$ reduces to the same problem for the real-valued coordinate functions.

1.1 Theorem. Given $f: \mathbb{R}^n \to \mathbb{R}^m$, with coordinate functions $f_1, \dots, f_m$, and a
point $\mathbf{y}_0 = (y_1, \dots, y_m)$ in $\mathbb{R}^m$, then

$$\lim_{\mathbf{x} \to \mathbf{x}_0} f(\mathbf{x}) = \mathbf{y}_0 \tag{A}$$

if and only if

$$\lim_{\mathbf{x} \to \mathbf{x}_0} f_i(\mathbf{x}) = y_i, \quad i = 1, \dots, m. \tag{B}$$

Proof. To say that Equations (A) and (B) are equivalent is to say that the distance

$$|f(\mathbf{x}) - \mathbf{y}_0| = \sqrt{(f_1(\mathbf{x}) - y_1)^2 + \cdots + (f_m(\mathbf{x}) - y_m)^2}$$

is arbitrarily small for $\mathbf{x}$ in a small enough neighborhood of $\mathbf{x}_0$ if and only if

$$|f_1(\mathbf{x}) - y_1|, \dots, |f_m(\mathbf{x}) - y_m|$$

also become arbitrarily small. But the equivalence of these last two statements follows
at once from the inequalities

$$|f(\mathbf{x}) - \mathbf{y}_0| \ge |f_i(\mathbf{x}) - y_i|, \quad i = 1, \dots, m, \quad\text{and}\quad |f(\mathbf{x}) - \mathbf{y}_0| \le \sqrt{m}\, \max_{1 \le i \le m} |f_i(\mathbf{x}) - y_i|.$$

After squaring both sides the first inequality follows from $a_1^2 + \cdots + a_m^2 \ge a_i^2$,
$i = 1, \dots, m$, and the second from $a_1^2 + \cdots + a_m^2 \le m \left(\max_{1 \le i \le m} |a_i|\right)^2$. ∎
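The two inequalities used in the proof can be spot-checked numerically. The Python sketch below tests them on randomly chosen vectors; the dimension $m = 4$, the sampling range, and the function names are arbitrary choices made for illustration, and a finite sample does not replace the algebraic argument.

```python
import math
import random

def norm(v):
    """Euclidean length of a vector given as a sequence of numbers."""
    return math.sqrt(sum(c * c for c in v))

def bounds_hold(v, tol=1e-12):
    """Check max|v_i| <= |v| <= sqrt(m) * max|v_i| for a vector v in R^m."""
    m = len(v)
    biggest = max(abs(c) for c in v)
    return biggest <= norm(v) + tol and norm(v) <= math.sqrt(m) * biggest + tol

random.seed(0)  # fixed seed so the run is reproducible
trials = [[random.uniform(-5, 5) for _ in range(4)] for _ in range(1000)]
print(all(bounds_hold(v) for v in trials))
```

Here the vector `v` plays the role of $f(\mathbf{x}) - \mathbf{y}_0$, with coordinates $f_i(\mathbf{x}) - y_i$.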

EXAMPLE 7 For vector functions $f_1$ and $f_2$ defined by

$$f_1(t) = (t, t^2, \sin t), \quad f_2(t) = (t, t^2, \sin(1/t)),$$

we have

$$\lim_{t \to 0} f_1(t) = (0, 0, 0).$$

But $\lim_{t \to 0} f_2(t)$ doesn't exist because the function $\sin(1/t)$ has no limit at $t = 0$.
1C Continuity
Roughly speaking, a continuous function $f$ is one whose values do not change
abruptly. That is, if $\mathbf{x}$ is close to $\mathbf{x}_0$, then $f(\mathbf{x})$ must be close to $f(\mathbf{x}_0)$. This idea is
related to the idea of limit, and the definition of continuity is as follows: A function
$f$ is continuous at $\mathbf{x}_0$ if

(a) $\mathbf{x}_0$ is in the domain of $f$.

(b) $\lim_{\mathbf{x} \to \mathbf{x}_0} f(\mathbf{x}) = f(\mathbf{x}_0)$.

At a nonlimit, or isolated, point of the domain of $f$, we can't ask for a limit; instead
we extend the definition of continuity simply by defining $f$ to be continuous at such
a point. A function is continuous on a subset $S$ of its domain if it's continuous at
every point in $S$. It's an immediate corollary of Theorem 1.1 that

1.2 Theorem. A vector function is continuous at a point if and only if its coordinate
functions are continuous there.

EXAMPLE 8 Returning to Example 7, the function

$$f_1(t) = (t, t^2, \sin t)$$

is continuous at every value of $t$. On the other hand, the function

$$f_2(t) = (t, t^2, \sin(1/t))$$

is continuous on the set $S$ of all real numbers $t$ with $t = 0$ deleted.


A function is simply called continuous if it's continuous at every point of its
domain. From Theorem 1.2 we see that a continuous vector-valued function of a
single variable, $f: \mathbb{R} \to \mathbb{R}^n$, is precisely one for which the coordinate functions
$f_1, \dots, f_n$ are continuous real-valued functions of a real variable. The latter include
most of the functions of ordinary calculus, such as $x^2$, $\sin x$, and, for $x > 0$, $\ln x$.
We use these same functions to construct examples of the continuous coordinate
functions that constitute the vector-valued functions $\mathbb{R}^n \to \mathbb{R}^m$ of a vector variable.
For example, the coordinate functions of

$$f(x, y) = \left( \frac{\sin xy}{e^{x+y}}, \frac{\cos xy}{e^{x+y}} \right)$$

turn out to be continuous. The continuity of these and other examples follows from
repeated application of the following three theorems, together with Theorem 1.2 on
coordinate functions. If you think these theorems are obviously true you're right, but
we'll prove them anyway.

1.3 Theorem. The functions $P_k: \mathbb{R}^n \to \mathbb{R}$, where $P_k(x_1, \dots, x_n) = x_k$, are continuous
for $k = 1, 2, \dots, n$. $P_k$ is called the kth coordinate projection.

1.4 Theorem. The functions $S: \mathbb{R}^2 \to \mathbb{R}$ and $M: \mathbb{R}^2 \to \mathbb{R}$, defined by $S(x, y) = x + y$
and $M(x, y) = xy$, are continuous.

1.5 Theorem. If $f: \mathbb{R}^n \to \mathbb{R}^m$ and $g: \mathbb{R}^m \to \mathbb{R}^p$ are continuous, then the composition
$g(f(\mathbf{x}))$ is continuous wherever $g(f(\mathbf{x}))$ is defined.

Proof of 1.3. We have $|P_k(x_1, \dots, x_n) - P_k(a_1, \dots, a_n)| = |x_k - a_k| \le |\mathbf{x} - \mathbf{a}|$,
so $|P_k(x_1, \dots, x_n) - P_k(a_1, \dots, a_n)|$ is arbitrarily small if $|\mathbf{x} - \mathbf{a}|$ is small
enough. ∎
Proof of 1.4. For $S(x, y) = x + y$, write $|S(x, y) - S(a, b)| = |x - a + y - b| \le
|x - a| + |y - b|$, by the triangle inequality. Hence $|S(x, y) - S(a, b)|$ is small if
$|x - a|$ and $|y - b|$ are small enough. Since $|x - a|$ and $|y - b|$ are both at most
the distance $\sqrt{(x - a)^2 + (y - b)^2}$ from $(x, y)$ to $(a, b)$, making this distance small
enough makes $|S(x, y) - S(a, b)|$ as small as you like.
For $M(x, y) = xy$ use the triangle inequality and factor out $|x|$ and $|b|$ to get
$|M(x, y) - M(a, b)| = |xy - xb + xb - ab| \le |x||y - b| + |x - a||b|$. Keeping $x$
within distance 1 of $a$ makes

$$|M(x, y) - M(a, b)| \le (|a| + 1)|y - b| + |x - a||b|.$$

Hence $|M(x, y) - M(a, b)|$ is as small as you like for a given $(a, b)$ if $|x - a|$
and $|y - b|$ are small enough. From here the argument is the same as for
$S(x, y)$. ∎
Proof of 1.5. We let $\mathbf{a}$ be a limit point of the domain of $g(f(\mathbf{x}))$ and show that
$\lim_{\mathbf{x} \to \mathbf{a}} g(f(\mathbf{x})) = g(f(\mathbf{a}))$. Since $g$ is continuous at $f(\mathbf{a})$ there is a neighborhood $B_s$,
$s > 0$, of $f(\mathbf{a})$ such that $|g(\mathbf{y}) - g(f(\mathbf{a}))|$ is as small as we like, say less than $\epsilon > 0$,
when $\mathbf{y}$ is in $B_s$. Similarly, since $f(\mathbf{x})$ is continuous at $\mathbf{x} = \mathbf{a}$ there is a neighborhood
$B_r$, $r > 0$, of $\mathbf{a}$ such that $f(\mathbf{x})$ is in $B_s$ when $\mathbf{x}$ is in $B_r$. Hence, taking $\mathbf{y} = f(\mathbf{x})$,
$|g(f(\mathbf{x})) - g(f(\mathbf{a}))| < \epsilon$ whenever $\mathbf{x}$ is in $B_r$. ∎

EXAMPLE 9 The function $f(x, y) = \sqrt{1 - x^2 - y^2}$, defined for $|(x, y)| \le 1$, is continuous,
because we can write it as

$$f(x, y) = \sqrt{1 - (P_1(x, y))^2 - (P_2(x, y))^2},$$

so $f(x, y)$ is a composition of continuous functions. Similarly, $g(x, y) = \ln(x + y)$,
defined for $x + y > 0$, is continuous. The product of $f$ and $g$, given by

$$h(x, y) = \sqrt{1 - x^2 - y^2}\, \ln(x + y),$$

is defined on the half-disk that is the intersection of the domains of $f$ and $g$, as shown
in Figure 5.7. The product is a continuous function because it is the composition of
the continuous vector function
$F(x, y) = (f(x, y), g(x, y))$
with the function $M$ of Theorem 1.4.

A function $f: \mathbb{R}^n \to \mathbb{R}^m$ is called linear if its coordinate functions have the form

$$f_k(x_1, \dots, x_n) = a_{k1} x_1 + \cdots + a_{kn} x_n, \quad k = 1, \dots, m,$$

for some scalars $a_{kj}$. The linear functions enhance our understanding of the property
of differentiability taken up in the next section, and are important in their own right
in both pure and applied mathematics. Theorems 1.2, 1.3, and 1.4 show that linear
functions are continuous, as the next theorem spells out.

FIGURE 5.7

1.6 Theorem. A linear function $f: \mathbb{R}^n \to \mathbb{R}^m$ is continuous.

Proof. Each scalar $a_{kj}$ in $a_{k1} x_1 + \cdots + a_{kn} x_n$ is continuous because it is constant,
and each $x_j$ is continuous by Theorem 1.3, so their product and sum are continuous
by Theorem 1.4. Then $f$ is continuous by Theorem 1.2. ∎

EXAMPLE 10 The pair of equations

$$x = 2u - 3v + w \quad\text{and}\quad y = u + v - w$$

defines a linear, therefore continuous, function from points $(u, v, w)$ in $\mathbb{R}^3$ to points
$(x, y)$ in $\mathbb{R}^2$. The examples of the next section will help to illuminate the behavior
of functions such as this one.
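As a small illustration, the linear function of Example 10 can be coded directly from its array of scalars $a_{kj}$. In the Python sketch below, the names `A` and `linear_map` are hypothetical choices for this illustration; each row of `A` lists the coefficients of one coordinate function.

```python
# Coefficient matrix of Example 10: row k holds the scalars a_k1, a_k2, a_k3.
A = [[2.0, -3.0, 1.0],
     [1.0, 1.0, -1.0]]

def linear_map(matrix, point):
    """Apply f_k(x_1, ..., x_n) = a_k1*x_1 + ... + a_kn*x_n for each row k."""
    return tuple(sum(a * x for a, x in zip(row, point)) for row in matrix)

def f(u, v, w):
    """The function (u, v, w) -> (x, y) of Example 10."""
    return linear_map(A, (u, v, w))

print(f(1.0, 0.0, 0.0))
print(f(1.0, 1.0, 1.0))
```

Storing the coefficients as rows mirrors the way each coordinate function is built from the coordinate projections by the sums and products of Theorem 1.4.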
EXERCISES

I. Assuming xo = (l, 2), draw the set of all vectors x in IR 2 17. f(t) = (t, r2 , r3, r4 )
such thal
18. f (u, v) = (u + v, u - v, u 2 + v2 )
(a) Ix - xol ::: 3
(b) Ix - xol = 3 19. f(x, y, z) = (2x, 2y, 2z)
(c) Ix-· xol < 3 In Exercises 20 to 25, detennine at which points the
In Exercises 2 to 11, identify (a) the interior and (b) the function fails to have a limit. Use Theorem 1. I. Take the
boundary of the set of points x = (x, y) in JR 2. ( c) Which domain of each coordinate function as large as possible.
sets are open? (d) Which sets are closed? The domain of f is then the part common to the domains
of all the coordinate functions.
2. Ix - 0, 2) I :::: o.5
3. Ix - 0, 2)1 < o.5 20 _ f ( x ) =( y + tan x )
y ln(x + y)
4. Ix- (l,2)1 < -0.5
5. 0 < X < 3 and O < y < 2
6. 2 ::: x < 3 and O < y < 2
21. f ( ; ) =( x
2
1 )
y2 - I
f
7. x 2 + 2y 2 < I
X
8. x =I- (0, 2) or (1, 2) 22. f (x, y) = -.
smx
- +y
9. x 2
IO. x > 0
II. X
+ y2 >

> y
0
23. f(x, y) = { ,
- .-+y,
smx
2+ y,
if X =/:- 0,
if x = 0

12. Let the set S consisl of the points (x, y) in IR 2 satisfying

~
sin t
0 < x 2 + y2 < 1, together with the interval I ::: x < 2 of
the x axis.
(a) Describe the boundary of S.
(b) What are the interior points of S?
24. /(r) (
cost

sint 2
)
(c) Is S open? Closed? 25. f(u,v)= UV
,
I )
(
1-u 2 -v 2 2-u 2 -v 2
13. Let L be a line and P a plane in JR 3 • Is either P or L an
open subset of JR 3 ? In Exercises 26 to 31, determine at which points the
function fails to be continuous. Take the domain of each
In Exercises 14 to 19, the formula defines a function f
coordinate function as large as possible. The domain of
from !Rn to !Rm for some n and m. In each example, state
what n and m are, and list the real-valued coordinate
f is then the part common to the domains of all the
coordinate functions.
functions of f .
14 . .f(x,y)=(x-y,x 2 -y 2 )
26. f ( ; ) = ( :2 + ;2 )

15. f(x, y) =( 6 ~ ) ( ;, ) x2 + y2

16. f(x, y, z) = (x - y)./x2 + y2 + z2


28. $f(x, y) = \begin{cases} \dfrac{\sin x}{x} + y, & \text{if } x \neq 0 \\ 1 + y, & \text{if } x = 0 \end{cases}$

if $x^2 + y^2 \neq 0$
if $x^2 + y^2 = 0$

30. f ( : ) - ( : ; · , )

31. f(x) = ~ / (1 - |x|), if $\mathbf{x}$ is in $\mathbb{R}^n$

32. A vector function $f$ has a removable discontinuity at $\mathbf{x}_0$ if (1) $f$ is not continuous at $\mathbf{x}_0$, and (2) there is a vector $\mathbf{y}_0$ such that $\lim_{\mathbf{x} \to \mathbf{x}_0} f(\mathbf{x}) = \mathbf{y}_0$. Give examples of functions $f$ and $g$ that are discontinuous at a point $\mathbf{x}_0$ such that $f$ has a removable discontinuity at $\mathbf{x}_0$ and $g$ does not.

33. A function $T: \mathbb{R}^n \to \mathbb{R}^n$ is called a translation acting on $\mathbb{R}^n$ by $\mathbf{y}_0$ if there is a vector $\mathbf{y}_0$ in $\mathbb{R}^n$ such that $T(\mathbf{x}) = \mathbf{x} + \mathbf{y}_0$ for all $\mathbf{x}$ in $\mathbb{R}^n$.
(a) Describe in words the effect on $\mathbb{R}^2$ of translation by $\mathbf{y}_0 = (1, 1)$.
(b) Prove that every translation is a continuous function.

34. Prove that the union of an arbitrary collection of open subsets of $\mathbb{R}^n$ is open.

35. Prove that the intersection of a finite collection of open subsets of $\mathbb{R}^n$ is open.

36. Give an example to show that an intersection of infinitely many open subsets of $\mathbb{R}^n$ may fail to be open.

37. Prove Theorem 1.3 of the text by first proving the inequalities $|x_k| \le |(x_1, \ldots, x_n)|$, $k = 1, \ldots, n$.

38. Prove both parts of Theorem 1.4 of the text. [Hint: For the function $M$, note that by the triangle inequality for absolute value, $|xy - x_0 y_0| \le |xy - x y_0| + |x y_0 - x_0 y_0|$.]

39. If $f$ and $g$ are vector functions with the same domain and same range space, prove
$$\lim_{\mathbf{x} \to \mathbf{x}_0} \bigl(f(\mathbf{x}) + g(\mathbf{x})\bigr) = \lim_{\mathbf{x} \to \mathbf{x}_0} f(\mathbf{x}) + \lim_{\mathbf{x} \to \mathbf{x}_0} g(\mathbf{x}),$$
provided that $\lim_{\mathbf{x} \to \mathbf{x}_0} f(\mathbf{x})$ and $\lim_{\mathbf{x} \to \mathbf{x}_0} g(\mathbf{x})$ exist.

40. Let $S$ be a closed subset of $\mathbb{R}^n$. Prove that the complement of $S$ in $\mathbb{R}^n$ is open.

41. If $S$ is an open subset of $\mathbb{R}^n$, show that the complement of $S$ in $\mathbb{R}^n$ is closed.

42. A function of more than one variable can have a limit along every line through a point without having a limit at that point. For example, define $f(x, y) = x^2 y/(x^4 + y^2)$ for $(x, y) \neq (0, 0)$.
(a) Show that $\lim_{y \to 0} f(0, y) = 0$ and that, for each fixed number $a$, $\lim_{x \to 0} f(x, ax) = 0$.
(b) Show that approaching $(0, 0)$ along the parabola $y = ax^2$ you get limit $a/(1 + a^2)$.
(c) What is the set of possible limits achieved by using the approaches of part (b)?

SECTION 2 REAL-VALUED FUNCTIONS


2A Differentiability and Continuity
We begin by reviewing the definition for a real-valued function $\mathbb{R} \xrightarrow{f} \mathbb{R}$ of a single real variable $x$ on some interval $a < x < b$. We say that $f$ is differentiable at $x_0$ if there is a number $a$ such that

$$\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = a, \quad\text{or}\quad \lim_{x \to x_0} \frac{f(x) - f(x_0) - a(x - x_0)}{x - x_0} = 0,$$

in which case $f$ has derivative $f'(x_0) = a$. The equivalent form on the right calls attention to an important property of the tangent line equation $y = f(x_0) + a(x - x_0)$, namely that its graph approaches the graph of $f(x)$ as $x$ approaches $x_0$ more rapidly than $x - x_0$ approaches 0.
Recall that in Chapter 4, Section 3 we used a modification of this same definition
to motivate the definition of the partial derivatives of a function of more than one
variable. At that point we emphasized that the mere existence of partial derivatives
226 Chapter 5 Differentiability

is not enough to provide a reasonable definition of differentiability for a function


of more than one variable. The reason for this is that partial derivatives at a point
xo take account of the behavior of a function near xo only along lines through xo
and parallel to the axes. Thus we have to consider a function's behavior not only
along infinitely many other individual lines through xo but, even more, throughout
an entire open neighborhood of xo. Thus we formulate the definition as follows.

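2.1 Definition. A function $\mathbb{R}^n \xrightarrow{f} \mathbb{R}$ is differentiable at $\mathbf{x}_0$ if

(i) $\mathbf{x}_0$ is an interior point of the domain of $f$.
(ii) There is a vector $\mathbf{a}$ such that

$$\lim_{\mathbf{x} \to \mathbf{x}_0} \frac{f(\mathbf{x}) - f(\mathbf{x}_0) - \mathbf{a} \cdot (\mathbf{x} - \mathbf{x}_0)}{|\mathbf{x} - \mathbf{x}_0|} = 0.$$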

To be consistent with the custom in the case of a single real variable, the function $f$ is called simply differentiable if it's differentiable at every point of its domain. We'll prove in Theorem 2.2 that there can be only one vector $\mathbf{a}$ for which (ii) is true, and it's called the gradient of the differentiable function $f$ at $\mathbf{x}_0$; it's customary to use the notation $\nabla f(\mathbf{x}_0)$ for the vector $\mathbf{a}$. The symbol $\nabla$ is pronounced "grad" here, so $\nabla f(\mathbf{x})$ becomes "grad $f$ at $\mathbf{x}$." If $f$ is a function of a single real variable $x$ we continue to write the customary $f'(x)$ instead of $\nabla f(x)$.
Remark 1. Since we can't divide by a vector $\mathbf{x} - \mathbf{x}_0$ we instead multiply by the reciprocal of its length in (ii), the crucial point being to ensure that the numerator tends to zero faster than $|\mathbf{x} - \mathbf{x}_0|$ does.
Remark 2. Condition (i) of the definition requires the approach of $\mathbf{x}$ to $\mathbf{x}_0$ to be unrestricted, as compared to the restricted 1-dimensional limits used to define partial derivatives.
Remark 3. According to the definition of differentiability, the domain of a differentiable function is an open set. It's convenient however to extend the definition sufficiently to speak of a differentiable function $f$ defined on an arbitrary subset $S$ of the domain space. By such an $f$ we'll mean the restriction to $S$ of a differentiable function whose domain is an open set containing $S$.

FIGURE 5.8 $z = \sqrt{1 - x^2 - y^2}$.

EXAMPLE 1 The function $f$ defined by $f(x, y) = \sqrt{1 - x^2 - y^2}$ has for its domain the disk $x^2 + y^2 \le 1$. Its graph appears in Figure 5.8. The interior points of the domain are those $(x, y)$ such that $x^2 + y^2 < 1$. We'll see that $f(x, y)$ is differentiable at these interior points. But it doesn't follow by Remark 3 that $f$ is differentiable at the points of the circle $x^2 + y^2 = 1$; indeed we'll see that $f$ can't be extended to be differentiable on an open set containing the circle. See Examples 5 and 6 and Exercise 26.

The next theorem allows us to compute the vector $\nabla f(\mathbf{x})$ in terms of partial derivatives of a differentiable function. The gradient is the key to putting a firm foundation under the notion of tangent plane introduced in Chapter 4. We'll see in Chapter 6 that the gradient of a real-valued function has several natural interpretations.

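2.2 Theorem. If a function $\mathbb{R}^n \xrightarrow{f} \mathbb{R}$ is differentiable at $\mathbf{x}_0$, then the $k$th coordinate of the gradient $\nabla f(\mathbf{x}_0)$ of $f$ at $\mathbf{x}_0$ is the $k$th partial derivative of $f$ at $\mathbf{x}_0$, $k = 1, 2, \ldots, n$. Thus $\nabla f(\mathbf{x}_0)$ is uniquely determined by the differentiability conditions (i) and (ii), and

$$\nabla f(\mathbf{x}_0) = \left( \frac{\partial f}{\partial x_1}(\mathbf{x}_0), \ldots, \frac{\partial f}{\partial x_k}(\mathbf{x}_0), \ldots, \frac{\partial f}{\partial x_n}(\mathbf{x}_0) \right).$$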
Proof. To identify the entries in $\nabla f(\mathbf{x}_0) = \mathbf{a}$ using the definition of this vector, we specialize $\mathbf{x}$ in part (ii) of the definition to vectors of the form $\mathbf{x}_j = \mathbf{x}_0 + t\mathbf{e}_j$. Since $\mathbf{x}_0$ is assumed to be an interior point of the domain of $f$, the point $\mathbf{x}_0 + t\mathbf{e}_j$ is in the domain of $f$ for small enough $t$. Since now $\mathbf{x} - \mathbf{x}_0 = t\mathbf{e}_j$, these special cases of condition (ii) become

$$\lim_{t \to 0} \frac{f(\mathbf{x}_j) - f(\mathbf{x}_0) - \mathbf{a} \cdot (t\mathbf{e}_j)}{t} = 0, \quad j = 1, \ldots, n.$$

Here we've used $|\mathbf{x} - \mathbf{x}_0| = |t\mathbf{e}_j| = |t||\mathbf{e}_j| = |t|$. We then removed the absolute value since the limit is zero, making the sign of the denominator irrelevant. By the homogeneity of the dot product, $\mathbf{a} \cdot (t\mathbf{e}_j) = t\,\mathbf{a} \cdot \mathbf{e}_j$, so we rewrite these limits as

$$\lim_{t \to 0} \frac{f(\mathbf{x}_j) - f(\mathbf{x}_0)}{t} = \mathbf{a} \cdot \mathbf{e}_j, \quad j = 1, \ldots, n.$$

But $\mathbf{x}_j = \mathbf{x}_0 + t\mathbf{e}_j$ differs from $\mathbf{x}_0$ only in the $j$th coordinate, and in that coordinate the difference is just $t$. Hence the limit on the left side of the last equation is just $\partial f/\partial x_j$ at $\mathbf{x}_0$. Since the dot product $\mathbf{a} \cdot \mathbf{e}_j$ is just the $j$th entry in $\mathbf{a}$, we're done. ∎

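EXAMPLE 2 Assume that the real-valued functions $f(x, y) = e^x \sin y + y$ and $g(x, y, z) = xy + yz + zx$ are differentiable. Their gradient vectors are $\nabla f = (f_x, f_y)$ and $\nabla g = (g_x, g_y, g_z)$, so

$$\nabla f(x, y) = (e^x \sin y,\; e^x \cos y + 1) \quad\text{and}\quad \nabla g(x, y, z) = (y + z,\; x + z,\; y + x).$$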
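Gradient formulas like these can be sanity-checked numerically. The following is a minimal sketch, not part of the text: it takes $f(x, y) = e^x \sin y + y$ (a function consistent with the gradient displayed above) and $g(x, y, z) = xy + yz + zx$, and compares the stated gradients with central-difference approximations of the partial derivatives. The helper name `numerical_gradient`, the step size `h`, and the sample points are illustrative assumptions.

```python
import math

def f(x, y):
    # Assumed form of f, consistent with the gradient (e^x sin y, e^x cos y + 1)
    return math.exp(x) * math.sin(y) + y

def g(x, y, z):
    return x * y + y * z + z * x

def numerical_gradient(func, point, h=1e-6):
    """Approximate each partial derivative by a central difference (hypothetical helper)."""
    grad = []
    for k in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[k] += h   # move only the kth coordinate
        backward[k] -= h
        grad.append((func(*forward) - func(*backward)) / (2 * h))
    return grad

def grad_f(x, y):
    # Gradient formula stated in the example
    return [math.exp(x) * math.sin(y), math.exp(x) * math.cos(y) + 1]

def grad_g(x, y, z):
    return [y + z, x + z, y + x]

approx_f = numerical_gradient(f, (0.5, 1.2))
approx_g = numerical_gradient(g, (1.0, 2.0, 3.0))
print(approx_f, grad_f(0.5, 1.2))
print(approx_g, grad_g(1.0, 2.0, 3.0))
```

Agreement at a few sample points is only evidence that the partial derivatives were computed correctly; it says nothing about differentiability itself.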

How can we tell whether or not a vector function is differentiable? Theorem 2.2 only allows us to conclude that a function is not differentiable at a point if one or more of its first-order partials fails to exist at that point, because differentiability of $f$ at $\mathbf{x}$ implies that these partials exist at $\mathbf{x}$. Thus Example 2 is inconclusive to the extent that we have simply assumed that the functions $f$ and $g$ appearing there are differentiable. The converse implication isn't valid, since it's possible for the partials to exist without $f$ being differentiable, as Exercise 24 shows. However, by adding an additional assumption, namely that the partials are themselves continuous on an open set $S$, we'll deduce in Theorem 2.3 the differentiability of $f$ on the entire set $S$. The theorem guarantees differentiability for most examples met in practice.

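2.3 Theorem. Let the domain of $\mathbb{R}^n \xrightarrow{f} \mathbb{R}$ be an open subset $D$ of $\mathbb{R}^n$ on which all partial derivatives $\partial f/\partial x_i$ of $f$ are continuous. Then $f$ is differentiable at every point of $D$.

Proof. Since we can formally write the gradient $\nabla f(\mathbf{x}_0)$ of $f$ at $\mathbf{x}_0$, the theorem will have been proved if we show that $\nabla f(\mathbf{x}_0)$ satisfies

$$\lim_{\mathbf{x} \to \mathbf{x}_0} \frac{f(\mathbf{x}) - f(\mathbf{x}_0) - \nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0)}{|\mathbf{x} - \mathbf{x}_0|} = 0. \qquad (*)$$

If $\mathbf{x} = (x_1, \ldots, x_n)$ and $\mathbf{x}_0 = (a_1, \ldots, a_n)$, set

$$\mathbf{y}_k = (x_1, \ldots, x_k, a_{k+1}, \ldots, a_n), \quad k = 0, 1, \ldots, n,$$

so that in particular $\mathbf{y}_0 = \mathbf{x}_0$ and $\mathbf{y}_n = \mathbf{x}$. We show these points with line segments joining them for three dimensions in Figure 5.9. Then we have

$$f(\mathbf{x}) - f(\mathbf{x}_0) = \sum_{k=1}^{n} \bigl( f(\mathbf{y}_k) - f(\mathbf{y}_{k-1}) \bigr),$$

since only the first and last terms survive in the sum on the right because of cancellation between successive terms.

FIGURE 5.9

Because $\mathbf{y}_k$ and $\mathbf{y}_{k-1}$ differ only in their $k$th coordinates, we can apply the mean-value theorem for real functions of the real variable $x_k$ to get

$$f(\mathbf{y}_k) - f(\mathbf{y}_{k-1}) = (x_k - a_k) \frac{\partial f}{\partial x_k}(\mathbf{z}_k),$$

where $\mathbf{z}_k$ is a point on the segment joining $\mathbf{y}_k$ and $\mathbf{y}_{k-1}$. Then

$$f(\mathbf{x}) - f(\mathbf{x}_0) = \sum_{k=1}^{n} (x_k - a_k) \frac{\partial f}{\partial x_k}(\mathbf{z}_k).$$

We also have, by the definition of $\nabla f(\mathbf{x}_0)$,

$$\nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0) = \left( \frac{\partial f}{\partial x_1}(\mathbf{x}_0), \ldots, \frac{\partial f}{\partial x_n}(\mathbf{x}_0) \right) \cdot (x_1 - a_1, \ldots, x_n - a_n) = \sum_{k=1}^{n} (x_k - a_k) \frac{\partial f}{\partial x_k}(\mathbf{x}_0).$$

Hence

$$\bigl| f(\mathbf{x}) - f(\mathbf{x}_0) - \nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0) \bigr| = \left| \sum_{k=1}^{n} \left( \frac{\partial f}{\partial x_k}(\mathbf{z}_k) - \frac{\partial f}{\partial x_k}(\mathbf{x}_0) \right) (x_k - a_k) \right| \le \sum_{k=1}^{n} \left| \frac{\partial f}{\partial x_k}(\mathbf{z}_k) - \frac{\partial f}{\partial x_k}(\mathbf{x}_0) \right| |\mathbf{x} - \mathbf{x}_0|,$$

where we have used the triangle inequality and the inequalities

$$|x_k - a_k| \le |\mathbf{x} - \mathbf{x}_0| \quad\text{for } k = 1, 2, \ldots, n.$$

Now divide by $|\mathbf{x} - \mathbf{x}_0|$. Since the partial derivatives are assumed continuous at $\mathbf{x}_0$, and the $\mathbf{z}_k$ tend to $\mathbf{x}_0$ as $\mathbf{x}$ does, Equation (*) follows from making $|\mathbf{x} - \mathbf{x}_0|$ tend to zero. ∎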
EXAMPLE 3 The function $f(x, y) = e^x \sin y + y$ of Example 2 is differentiable at all points $(x, y)$. To conclude this from Theorem 2.3 we calculate that

$$(f_x(x, y), f_y(x, y)) = (e^x \sin y,\; e^x \cos y + 1).$$

According to Theorems 1.3 and 1.4 of Section 1, these partial derivatives are continuous for all $(x, y)$, so by Theorem 2.3 the function $f(x, y)$ is differentiable for all $(x, y)$.

EXAMPLE 4 The function $g(x, y, z) = xy + yz + zx$ of Example 2 is differentiable at all points $(x, y, z)$. To apply Theorem 2.3 we compute the vector

$$(g_x(x, y, z),\; g_y(x, y, z),\; g_z(x, y, z)) = (y + z,\; x + z,\; y + x).$$

According to Theorem 1.4 of Section 1, these partial derivatives are continuous for all $(x, y, z)$, so by Theorem 2.3 the function $g(x, y, z)$ is differentiable for all $(x, y, z)$.

EXAMPLE 5 The function $h(x, y) = \sqrt{1 - x^2 - y^2}$ of Example 1 turns out to be differentiable at all points in the open set of $(x, y)$ such that $x^2 + y^2 < 1$, and at no other points. Note that $h$ is not even defined if $x^2 + y^2 > 1$. To apply Theorem 2.3 at the interior points of the circular disk we calculate that

$$(h_x(x, y), h_y(x, y)) = \left( \frac{-x}{\sqrt{1 - x^2 - y^2}},\; \frac{-y}{\sqrt{1 - x^2 - y^2}} \right).$$

According to our Theorems 1.3, 1.4 and 1.5 of Section 1, these partial derivatives are continuous for all $(x, y)$ with $x^2 + y^2 < 1$, so by Theorem 2.3 the function $h(x, y)$ is differentiable for all $(x, y)$ in the open unit disk.

EXAMPLE 6 Continuing with the previous example, the points on the circle $x^2 + y^2 = 1$ are more problematic. According to Remark 3 following Definition 2.1 of differentiability, if it were possible to extend the definition of $h(x, y)$ to an open set containing the circle in such a way that the resulting extension $h_e(x, y)$ became differentiable, then we could claim that the original function $h(x, y)$ is differentiable on the circle. However, such an extension of $h(x, y)$ is impossible for this example. For depending on the signs of $x$ and $y$,

$$h_x(x, y) = \frac{-x}{\sqrt{1 - x^2 - y^2}} \quad\text{and}\quad h_y(x, y) = \frac{-y}{\sqrt{1 - x^2 - y^2}}$$

tend to $\infty$ or $-\infty$ as $(x, y)$ tends to a point on the circle from inside the circle. Thus there is no point on the circle at which both partial derivatives can exist, as would have to be the case if $h(x, y)$ were differentiable there. See Exercise 26.

A single-variable example given in Exercise 25 shows that continuity of the partial derivatives of a function $f$ is not necessary for differentiability of $f$. However, the hypotheses of Theorem 2.3 are referred to often enough to be given a special name. A function $f$ is continuously differentiable on an open set $D$ if the entries in the gradient of $f$ are continuous on $D$. Thus each of the functions of Examples 1 through 4 is not only differentiable but also continuously differentiable in the interior of its domain. In practice we deal almost exclusively with continuously differentiable functions.
We conclude by considering a relationship between continuity and differentiability that our intuition suggests should hold:
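2.4 Differentiability Implies Continuity. If $\mathbb{R}^n \xrightarrow{f} \mathbb{R}$ is differentiable at a point $\mathbf{x}_0$ of its domain, then $f$ is continuous at $\mathbf{x}_0$.

Proof. Differentiability of $f$ at $\mathbf{x}_0$ means that the fraction

$$q(\mathbf{x}) = \frac{f(\mathbf{x}) - f(\mathbf{x}_0) - \nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0)}{|\mathbf{x} - \mathbf{x}_0|}$$

tends to zero as $\mathbf{x}$ tends to $\mathbf{x}_0$. Multiplying this equation by $|\mathbf{x} - \mathbf{x}_0|$ and rearranging terms gives

$$f(\mathbf{x}) - f(\mathbf{x}_0) = q(\mathbf{x})|\mathbf{x} - \mathbf{x}_0| + \nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0).$$

The first term on the right is a product of factors that tend to 0 as $\mathbf{x}$ tends to $\mathbf{x}_0$. By the Cauchy-Schwarz inequality

$$|\nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0)| \le |\nabla f(\mathbf{x}_0)|\,|\mathbf{x} - \mathbf{x}_0|.$$

Thus the second term on the right also tends to zero as $\mathbf{x}$ tends to $\mathbf{x}_0$. So $f(\mathbf{x})$ tends to $f(\mathbf{x}_0)$ as $\mathbf{x}$ tends to $\mathbf{x}_0$, and $f$ is continuous at $\mathbf{x}_0$. ∎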
2B Tangent Approximations
Many of the concepts and techniques of calculus have at their foundations the idea of approximating a graph by a tangent line or plane. In particular, the tangent line to the graph of a differentiable function $\mathbb{R} \xrightarrow{f} \mathbb{R}$ at a domain point $x_0$ is defined to be the graph of the function $\mathbb{R} \xrightarrow{T} \mathbb{R}$ given by $T(x) = f(x_0) + f'(x_0)(x - x_0)$. We refer to the function $T$ as the tangent approximation to $f$ at $x_0$, though the term first-degree Taylor approximation is also appropriate. Figure 5.10 shows the graphical relationship between a typical $f$ and $T$ if $f$ is a real-valued function of one real variable.
In the natural generalization to real-valued functions $\mathbb{R}^n \xrightarrow{f} \mathbb{R}$ of several variables, the tangent approximation takes the form

2.5 $\quad T(\mathbf{x}) = f(\mathbf{x}_0) + \nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0),$

where $\nabla f(\mathbf{x}_0)$ is the $n$-dimensional gradient vector of $f$ at the point $\mathbf{x}_0$. With this formula for $T$ we get the expected result $T(\mathbf{x}_0) = f(\mathbf{x}_0)$, since the dot product of the gradient vector with the zero vector in $\mathbb{R}^n$ is the real number zero. If $f$ is a function of two real variables the picture corresponding to Figure 5.10 is Figure 5.11.

FIGURE 5.10 Tangent for $\mathbb{R} \xrightarrow{f} \mathbb{R}$: $y = f(x_0) + f'(x_0)(x - x_0) = T(x)$.

FIGURE 5.11

In Example 4 of Chapter 4, Section 3B we considered the problem of finding the tangent plane to the graph of $f(x, y) = 1 - 2x^2 - y^2$ where $(x_0, y_0) = (\tfrac{1}{2}, \tfrac{1}{2})$. In that example our discussion of tangency was incomplete in that we ignored the behavior of the function except along lines parallel to the axes. Using our thoroughly justified Equation 2.5 we calculate $\nabla f(x, y) = (-4x, -2y)$, so $\nabla f(\tfrac{1}{2}, \tfrac{1}{2}) = (-2, -1)$ and $f(\tfrac{1}{2}, \tfrac{1}{2}) = \tfrac{1}{4}$. Thus Equation 2.5 becomes

$$T(x, y) = \tfrac{1}{4} + (-2, -1) \cdot (x - \tfrac{1}{2},\; y - \tfrac{1}{2}) = \tfrac{7}{4} - 2x - y,$$

consistent with the tangent plane $z = \tfrac{7}{4} - 2x - y$ we found in the earlier example with somewhat less justification.

It's only for functions of two variables that we can draw a graph of a tangent plane in $\mathbb{R}^3$, but the approximation given by Equation 2.5 is valid quite generally as in the next example, where we consider a function of three variables with graph and tangent in $\mathbb{R}^4$.

l': ~~Me~iJll ~~~r~;i~a~?o~ :(~~~ i~;n:f~~;:~u~~ ~j~;~~'. J~x~z~i , ;: ~;)a!~ ~~~ ~~~ ~~n!e~~
2
3

Hence the approximation given by Equation 2.5 is

$$T(x, y, z) = 1 + (1, 2, 3) \cdot (x - 1,\; y - 1,\; z - 1) = -5 + x + 2y + 3z.$$

In the equation that has the tangent plane at $(1, 1, 1, 1)$ as its graph in $\mathbb{R}^4$, we introduce an additional variable $w$. The desired equation is then

$$w = -5 + x + 2y + 3z.$$
In determining the vector $\nabla f(\mathbf{x}_0)$, we required that the real-valued function

$$f(\mathbf{x}) - T(\mathbf{x}) = f(\mathbf{x}) - f(\mathbf{x}_0) - \nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0)$$

tend to zero faster than $|\mathbf{x} - \mathbf{x}_0|$ as $\mathbf{x}$ tends to $\mathbf{x}_0$. This requirement turned out to uniquely determine the entries in the vector $\nabla f(\mathbf{x}_0)$, at the same time ensuring that the graph of $T(\mathbf{x})$ is a good fit to the graph of $f(\mathbf{x})$. To ensure a good geometric approximation near $\mathbf{x}_0$ we also needed to be able to let $\mathbf{x}$ approach $\mathbf{x}_0$ from every possible direction, so we required that some open neighborhood of $\mathbf{x}_0$ be contained in the domain of $f$. Thus the conditions (i) and (ii) in Definition 2.1 of differentiability are just what are needed to define tangency adequately.
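The "faster than $|\mathbf{x} - \mathbf{x}_0|$" requirement can be observed numerically. The sketch below is an illustration under stated assumptions, not a method from the text: it uses the function $f(x, y) = 1 - 2x^2 - y^2$ at $(\tfrac{1}{2}, \tfrac{1}{2})$ from earlier in this section, builds $T$ from Equation 2.5, and prints the ratio $|f - T| / |\mathbf{x} - \mathbf{x}_0|$ along one arbitrarily chosen direction $(0.6, 0.8)$. The names `T` and `error_ratio` are hypothetical.

```python
import math

def f(x, y):
    return 1 - 2 * x**2 - y**2

def T(x, y):
    # Tangent approximation (Equation 2.5) at the point (1/2, 1/2)
    x0, y0 = 0.5, 0.5
    grad = (-4 * x0, -2 * y0)  # gradient of f evaluated at (x0, y0)
    return f(x0, y0) + grad[0] * (x - x0) + grad[1] * (y - y0)

def error_ratio(dx, dy):
    """|f - T| divided by the distance from (1/2, 1/2); hypothetical helper."""
    dist = math.hypot(dx, dy)
    return abs(f(0.5 + dx, 0.5 + dy) - T(0.5 + dx, 0.5 + dy)) / dist

# Approach (1/2, 1/2) along the unit direction (0.6, 0.8) at shrinking distances s
ratios = [error_ratio(s * 0.6, s * 0.8) for s in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

For this quadratic $f$ the error is exactly $2\,dx^2 + dy^2$, so the printed ratios shrink in proportion to the distance, which is the behavior condition (ii) demands.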

EXERCISES

For each of the functions 1 to 8 find $\nabla f(\mathbf{x})$ at a general point $\mathbf{x}$ in the domain of $f$.

1. $f(x, y) = x^2 - y^2$
2. $f(x, y) = x^2 - y^2 - \sin xy$
3. $f(x, y) = x + 2y$
4. $f(x, y, z) = (x - y)z$
5. $f(x, y, z) = x + y - z^2$
6. $f(x, y, z) = x^2 + y^2 + z^2$
7. $f(x_1, x_2) = x_1^2 + 2x_2^2$
8. $f(x_1, x_2, x_3) = x_1 x_2 x_3$

Find the tangent approximation $T(x, y)$ or $T(x, y, z)$ as appropriate for each of the functions 9-16 at the indicated point and use it to write the equation for the tangent plane in terms of coordinate variables $(x, y)$ or $(x, y, z)$.

9. $f(x, y) = x^3 - y^3$ at $(1, 1)$
10. $f(x, y) = \sin(x^2 + y^2)$ at $(\pi, \pi)$
11. $f(x, y) = x + 2y$ at $(1, 2)$
12. $f(x, y, z) = (x - y)z$ at $(1, 0, 1)$
13. $f(x, y, z) = x + y - z^2$ at $(0, 0, 1)$
14. $f(x, y, z) = x^2 + y^2 - z^2$ at $(1, 1, 1)$
15. $f(x, y, z) = x + 2yz$ at $(1, 1, 1)$
16. $f(x, y, z) = xy^2z^2$ at $(2, 1, 1)$

At which points do the functions 17 to 20 fail to be differentiable? Give a reason for your answer.

17. $f(x, y) = x^{-2} + y^{-2}$
18. $f(x, y) = \sqrt{x^2 - y^2}$
19. $f(x, y) = |x + y|$
20. $f(x, y) = (x^2 + y^2)^{-1}$

21. Consider the function $f: \mathbb{R}^n \to \mathbb{R}$ defined by $f(\mathbf{x}) = |\mathbf{x}|^2 = \mathbf{x} \cdot \mathbf{x}$. Prove that $\nabla f(\mathbf{x}) = 2\mathbf{x}$ for all $\mathbf{x}$ in $\mathbb{R}^n$.

22. Is the function $g: \mathbb{R}^n \to \mathbb{R}$ defined by $g(\mathbf{x}) = |\mathbf{x}|$ differentiable at every point of its domain? Explain your answer. [Hint: $|\mathbf{x}| = \sqrt{\mathbf{x} \cdot \mathbf{x}}$.]

*23. Prove that if the real-valued function $f$ is differentiable at $\mathbf{x}_0$, then
$$\lim_{t \to 0} \frac{f(\mathbf{x}_0 + t\mathbf{x}) - f(\mathbf{x}_0)}{t} = \nabla f(\mathbf{x}_0) \cdot \mathbf{x}.$$

*24. Consider the function
f(x, y) = ~ x - r, $x \neq \pm y$,
f(x, y) = 0, $x = \pm y$.
(a) Prove that $f$ has at the origin the partial derivatives $f_x(0, 0) = f_y(0, 0) = 0$.
(b) Prove that $f$ is not differentiable at $(0, 0)$ by assuming it is differentiable and contradicting the conclusion of the previous exercise.
(c) Prove that $f$ is not differentiable at $(0, 0)$ by contradicting the conclusion of Theorem 2.4.

*25. Show that the function defined by
$$f(x) = \begin{cases} x^2 \sin \dfrac{1}{x}, & x \neq 0 \\ 0, & x = 0 \end{cases}$$
is differentiable for all $x$ but is not continuously differentiable at $x = 0$.

*26. Referring to Examples 5 and 6 in the text, prove that there is no point on the circle $x^2 + y^2 = 1$ for which the one-sided partial derivatives both exist when the defining limits are taken from within the circle.

SECTION 3 DIRECTIONAL DERIVATIVES


3A Definition
A partial derivative of a real-valued function measures the rate of change of the function in a particular coordinate direction, parallel to one of the coordinate axes. To measure the rate of change in an arbitrary direction, we use the directional derivative. We first define the derivative with respect to an arbitrary nonzero vector $\mathbf{v}$. Let $\mathbb{R}^n \xrightarrow{f} \mathbb{R}$ be a real-valued function, and let $\mathbf{v}$ be a vector in the domain space $\mathbb{R}^n$. The derivative with respect to $\mathbf{v}$, denoted by $\partial f/\partial \mathbf{v}$, is the real-valued function defined by

$$\frac{\partial f}{\partial \mathbf{v}}(\mathbf{x}) = \lim_{t \to 0} \frac{f(\mathbf{x} + t\mathbf{v}) - f(\mathbf{x})}{t}.$$

This is a significant extension of our earlier use of the partial derivative notation, but the derivative is still "partial" since it's computed in a single direction. The domain of $\partial f/\partial \mathbf{v}$ is the subset of the domain of $f$ for which the preceding limit exists. In practice we always assume $\mathbf{v} \neq \mathbf{0}$, since the case $\mathbf{v} = \mathbf{0}$ gives no information. (Why?)
The connection between the derivative with respect to a vector and the gradient is provided in the following theorem.
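3.1 Theorem. If $f$ is differentiable at $\mathbf{x} = (x_1, \ldots, x_n)$ and $\mathbf{v} = (v_1, \ldots, v_n)$, then

$$\frac{\partial f}{\partial \mathbf{v}}(\mathbf{x}) = \nabla f(\mathbf{x}) \cdot \mathbf{v}.$$

We can write this formula in terms of coordinates as

$$\frac{\partial f}{\partial \mathbf{v}}(\mathbf{x}) = v_1 \frac{\partial f}{\partial x_1}(\mathbf{x}) + \cdots + v_n \frac{\partial f}{\partial x_n}(\mathbf{x}).$$

Proof. First assume $\mathbf{v} \neq \mathbf{0}$ and note that $|t\mathbf{v}| = |t||\mathbf{v}|$. Since $f$ is differentiable at $\mathbf{x}$,

$$\lim_{t \to 0} \frac{f(\mathbf{x} + t\mathbf{v}) - f(\mathbf{x}) - \nabla f(\mathbf{x}) \cdot (t\mathbf{v})}{|t||\mathbf{v}|} = 0.$$

Since the limit is zero we can remove the vertical bars from $|t|$ to get

$$\lim_{t \to 0} \frac{1}{|\mathbf{v}|} \left( \frac{f(\mathbf{x} + t\mathbf{v}) - f(\mathbf{x})}{t} - \nabla f(\mathbf{x}) \cdot \mathbf{v} \right) = 0.$$

Multiplying by $|\mathbf{v}|$ we get

$$\lim_{t \to 0} \frac{f(\mathbf{x} + t\mathbf{v}) - f(\mathbf{x})}{t} = \nabla f(\mathbf{x}) \cdot \mathbf{v},$$

and the proof is finished for nonzero $\mathbf{v}$. When $\mathbf{v} = \mathbf{0}$, both sides of the previous equation are zero. ∎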
Observe that when $\mathbf{v} = \mathbf{e}_j$, a standard basis vector of length 1, the equation in Theorem 3.1 shows that the derivative with respect to that vector is just the partial derivative with respect to $x_j$, that is,

$$\frac{\partial f}{\partial \mathbf{e}_j}(\mathbf{x}) = \frac{\partial f}{\partial x_j}(\mathbf{x}).$$

As in the previous equation we'll most often want to choose the vectors $\mathbf{v}$ in $\partial f/\partial \mathbf{v}$ to have length 1 so these derivatives serve as standardized rates of change in a variety of directions. Nevertheless the more general definition has its uses, as for example in Exercises 22 and 23.
For each vector $\mathbf{u}$ in $\mathbb{R}^n$ of length $|\mathbf{u}| = 1$, we define the directional derivative of $f$ in the direction of $\mathbf{u}$ to be the function $\partial f/\partial \mathbf{u}$. The reason for the name "directional" derivative is that in $\mathbb{R}^n$ there is a natural way to associate a vector to each direction, namely, take the unit vector in that direction. The number $(\partial f/\partial \mathbf{u})(\mathbf{x})$ is then regarded as a standard measure of the rate of change of the value $f(\mathbf{x})$ in the direction of $\mathbf{u}$.

EXAMPLE 1 Suppose a function $f: \mathbb{R}^3 \to \mathbb{R}$ is $f(x, y, z) = xyz$. We find the directional derivative of $f$ in the direction of the unit vector $\mathbf{u} = (1/2, 1/2, 1/\sqrt{2})$ by letting $\mathbf{x} = (x, y, z)$ and using Theorem 3.1 to get

$$\frac{\partial f}{\partial \mathbf{u}}(\mathbf{x}) = \nabla f(\mathbf{x}) \cdot \mathbf{u} = (yz, xz, xy) \cdot (1/2, 1/2, 1/\sqrt{2}) = \frac{1}{2}\bigl(yz + xz + \sqrt{2}\,xy\bigr).$$

It follows that the directional derivative of $f$ in the direction of $\mathbf{u}$ at $(1, 1, 1)$ has the value $(\partial f/\partial \mathbf{u})(1, 1, 1) = 1 + 1/\sqrt{2}$.
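The value in Example 1 can be checked against the limit definition. The sketch below is an illustration only: it approximates the limit with a symmetric difference quotient (a numerical stand-in that has the same limit for differentiable $f$) and compares it with $\nabla f \cdot \mathbf{u}$ from Theorem 3.1. The helper name `directional_derivative` and the step size `t` are assumptions.

```python
import math

def f(x, y, z):
    return x * y * z

def grad_f(x, y, z):
    return (y * z, x * z, x * y)

def directional_derivative(func, point, v, t=1e-6):
    """Symmetric difference quotient along v; approximates the limit of (f(x + t v) - f(x)) / t."""
    plus = [p + t * c for p, c in zip(point, v)]
    minus = [p - t * c for p, c in zip(point, v)]
    return (func(*plus) - func(*minus)) / (2 * t)

u = (0.5, 0.5, 1 / math.sqrt(2))   # unit vector from Example 1
x = (1.0, 1.0, 1.0)

numeric = directional_derivative(f, x, u)
by_theorem = sum(g * c for g, c in zip(grad_f(*x), u))   # gradient dotted with u
print(numeric, by_theorem, 1 + 1 / math.sqrt(2))
```

The three printed numbers should agree to several decimal places; the dot product is the cheaper computation, which is the practical point of Theorem 3.1.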

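Let $\mathbb{R}^2 \xrightarrow{f} \mathbb{R}$ be a function whose graph is a surface in $\mathbb{R}^3$ and let $\mathbf{u}$ be a unit vector in $\mathbb{R}^2$, i.e., $|\mathbf{u}| = 1$. An example appears in Figure 5.12. The value of the directional derivative $\partial f/\partial \mathbf{u}$ at $\mathbf{x} = (x, y)$ is by definition

$$\frac{\partial f}{\partial \mathbf{u}}(\mathbf{x}) = \lim_{t \to 0} \frac{f(\mathbf{x} + t\mathbf{u}) - f(\mathbf{x})}{t}.$$

The distance between the points $\mathbf{x} + t\mathbf{u}$ and $\mathbf{x}$ is given by

$$|(\mathbf{x} + t\mathbf{u}) - \mathbf{x}| = |t\mathbf{u}| = |t|.$$

Hence the ratio

$$\frac{f(\mathbf{x} + t\mathbf{u}) - f(\mathbf{x})}{t}$$

is the slope of the line through the points $(\mathbf{x} + t\mathbf{u}, f(\mathbf{x} + t\mathbf{u}))$ and $(\mathbf{x}, f(\mathbf{x}))$ on the graph. It follows that the limit of the ratio, $(\partial f/\partial \mathbf{u})(\mathbf{x})$, is the slope of the tangent line at $(\mathbf{x}, f(\mathbf{x}))$ to the curve formed by the intersection of the graph of $f$ with the plane that contains $\mathbf{x}$ and $\mathbf{x} + \mathbf{u}$, and is parallel to the $z$-axis. This curve appears in dashes in Figure 5.12. The angle $\gamma$ in Figure 5.12 indicates the inclination of a tangent line to the graph in the vertical plane containing $\mathbf{u}$ and therefore satisfies the equation

$$\tan \gamma = \frac{\partial f}{\partial \mathbf{u}}(\mathbf{x}).$$

FIGURE 5.12 $\tan \gamma$ is a slope.

The situation here is a generalization of the one shown in Figure 4.19 in Chapter 4, Section 3. If $\mathbf{u} = \mathbf{e}_1$, the angle $\gamma$ becomes the angle $\alpha$ in the earlier figure and

$$\frac{\partial f}{\partial \mathbf{u}} = \frac{\partial f}{\partial x}, \quad\text{with } \mathbf{u} = (1, 0).$$

If $\mathbf{u} = \mathbf{e}_2$, we get $\gamma = \beta$ in Figure 4.19 of Chapter 4, Section 3, and

$$\frac{\partial f}{\partial \mathbf{u}} = \frac{\partial f}{\partial y}, \quad\text{with } \mathbf{u} = (0, 1).$$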
3B Mean-Value Theorem
We assume here some acquaintance with the following fundamental theorem from single-variable calculus, and we state it without proof.

3.2 Mean-Value Theorem. Let $\mathbb{R} \xrightarrow{f} \mathbb{R}$ be continuous on the closed interval $[x, y]$ and differentiable on the open interval $(x, y)$. Then there is a number $x_0$ strictly between $x$ and $y$ such that $f(x) - f(y) = f'(x_0)(x - y)$.

The function $f$ in Theorem 3.2 is a function of a single real variable, so in that context we can write the mean-value equation with both sides divided by $x - y$; this is a natural way to write it because we can interpret each side of the equation as a slope. For a real-valued differentiable function of a vector variable $\mathbf{x}$ we can't divide by $\mathbf{x} - \mathbf{y}$, but there is still a valid generalization with $f'(x_0)(x - y)$ replaced by $\nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{y})$. The formal statement follows.

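3.3 Theorem. Let $\mathbb{R}^n \xrightarrow{f} \mathbb{R}$ be differentiable on an open set containing the line segment $S$ joining two vectors $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$. Then there is a point $\mathbf{x}_0$ on $S$ such that

$$f(\mathbf{y}) - f(\mathbf{x}) = \nabla f(\mathbf{x}_0) \cdot (\mathbf{y} - \mathbf{x}).$$

Proof. Consider the function $g(t) = f(t(\mathbf{y} - \mathbf{x}) + \mathbf{x})$, defined for $0 \le t \le 1$, and set $\mathbf{m}(t) = t(\mathbf{y} - \mathbf{x}) + \mathbf{x}$. Then if $h$ is a real number,

$$g(t + h) - g(t) = f((t + h)(\mathbf{y} - \mathbf{x}) + \mathbf{x}) - f(t(\mathbf{y} - \mathbf{x}) + \mathbf{x}) = f(h(\mathbf{y} - \mathbf{x}) + \mathbf{m}(t)) - f(\mathbf{m}(t)).$$

Dividing both sides by $h \neq 0$ gives

$$\frac{g(t + h) - g(t)}{h} = \frac{f(h(\mathbf{y} - \mathbf{x}) + \mathbf{m}(t)) - f(\mathbf{m}(t))}{h}.$$

Now let $h \to 0$. On the right side we get $\partial f/\partial(\mathbf{y} - \mathbf{x})$ evaluated at $\mathbf{m}(t)$. Hence the limit of the left side exists also, and is $g'(t)$. Thus

$$g'(t) = \frac{\partial f}{\partial(\mathbf{y} - \mathbf{x})}(\mathbf{m}(t)) \quad\text{or}\quad g'(t) = \nabla f(\mathbf{m}(t)) \cdot (\mathbf{y} - \mathbf{x}), \qquad (*)$$

by Theorem 3.1. Now let $t = 1$ and $t = 0$ in the definition of $g(t)$ to get $g(1) - g(0) = f(\mathbf{y}) - f(\mathbf{x})$. But by the mean-value theorem for functions of one variable, applied to $g$ at $t = 1$ and $t = 0$,

$$\frac{g(1) - g(0)}{1 - 0} = f(\mathbf{y}) - f(\mathbf{x}) = g'(t_0),$$

for some $t_0$ satisfying $0 < t_0 < 1$. Setting $t = t_0$ and $\mathbf{x}_0 = \mathbf{m}(t_0)$ in Equation (*) and using this last equation gives $f(\mathbf{y}) - f(\mathbf{x}) = \nabla f(\mathbf{x}_0) \cdot (\mathbf{y} - \mathbf{x})$. ∎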
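The proof reduces the several-variable statement to a one-variable problem for $g(t)$, and that reduction can be imitated numerically. The following sketch is an illustration under assumptions that are not in the text: the sample function $f(x, y) = e^x + y^2$, the endpoints, and the use of bisection are all choices made here. Bisection works for this particular $f$ because $g'(t) - (f(\mathbf{y}) - f(\mathbf{x}))$ is increasing and changes sign on $[0, 1]$; the theorem itself guarantees existence of $t_0$ but not this sign pattern in general.

```python
import math

def f(x, y):
    # Sample function chosen for illustration only
    return math.exp(x) + y**2

def grad_f(x, y):
    return (math.exp(x), 2 * y)

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def mean_value_point(a, b, steps=60):
    """Find t0 in (0, 1) with grad f(m(t0)) . (b - a) = f(b) - f(a) by bisection.

    Hypothetical helper; assumes g'(t) - (f(b) - f(a)) is negative at t = 0
    and positive at t = 1, which holds for the sample data below.
    """
    d = (b[0] - a[0], b[1] - a[1])
    target = f(*b) - f(*a)

    def h(t):
        m = (a[0] + t * d[0], a[1] + t * d[1])   # m(t) = t(b - a) + a
        return dot(grad_f(*m), d) - target       # g'(t) - (g(1) - g(0))

    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if h(mid) < 0:
            lo = mid
        else:
            hi = mid
    t0 = (lo + hi) / 2
    return t0, (a[0] + t0 * d[0], a[1] + t0 * d[1])

a, b = (1.0, 0.0), (2.0, 3.0)
t0, x0 = mean_value_point(a, b)
print(t0, x0)
print(dot(grad_f(*x0), (b[0] - a[0], b[1] - a[1])), f(*b) - f(*a))
```

The last line prints the two sides of the mean-value formula at the computed point; they should agree to within rounding error.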
FIGURE 5.13 (a) Connected. (b) Not connected.

One of the most important conclusions we draw from the mean-value theorem for functions of one variable is that a function with zero derivative on an interval is constant. For a function $f$ of a vector variable, we replace the domain interval by an open set $D$ in $\mathbb{R}^n$ that we assume to be polygonally connected; a polygonally connected set $S$ is one such that any given pair of points in it can be joined by a finite sequence of line segments lying in $S$, that is, by a polygonal path. Figure 5.13 shows a set in $\mathbb{R}^2$ that is connected in this way and also one that is not.

3.4 Theorem. If $\mathbb{R}^n \xrightarrow{f} \mathbb{R}$ is differentiable on a polygonally connected open set $D$ and $\nabla f(\mathbf{x}) = \mathbf{0}$ for every $\mathbf{x}$ in $D$, then $f$ is constant.
Proof. If $\mathbf{x}_1$ and $\mathbf{x}_2$ are points of $D$ joined by a single line segment, then Theorem 3.3 and the assumption that $\nabla f(\mathbf{x}) = \mathbf{0}$ in $D$ together imply that $f(\mathbf{x}_1) = f(\mathbf{x}_2)$. Working stepwise, one additional segment at a time, the same conclusion holds for two points $\mathbf{x}_1$ and $\mathbf{x}_p$ joined by a finite sequence of segments. So $f$ is constant on $D$. ∎
EXERCISES

In each Exercise 1 to 4, with functions defined on $\mathbb{R}^2$ or $\mathbb{R}^3$, find the directional derivative of $f$ in the direction of the unit vector $\mathbf{u}$ at the point $\mathbf{x}$.

1. $f(x, y, z) = x^2 + y^2 + z^2$, $\mathbf{u} = (1/\sqrt{3}, 1/\sqrt{3}, 1/\sqrt{3})$, $\mathbf{x} = (1, 0, 1)$
2. $f(x, y) = x^2 - y^2$, $\mathbf{u} = (1/\sqrt{2}, 1/\sqrt{2})$, $\mathbf{x} = (2, 1)$
3. $f(x, y) = x + y$, $\mathbf{u} = (1, 0)$, $\mathbf{x} = (2, 3)$
4. $f(x, y, z) = xy \sin z$, $\mathbf{u} = (1/\sqrt{2}, 0, -1/\sqrt{2})$, $\mathbf{x} = (1, 1, 1)$

In Exercises 5 to 8, for the real-valued function defined in $\mathbb{R}^2$, find the directional derivative at $\mathbf{x}$ in the direction indicated.

5. $f(x, y) = x^2 - y^2$ at $\mathbf{x} = (1, 1)$ and in the direction $(1/\sqrt{5}, 2/\sqrt{5})$
6. $f(x, y) = e^x \sin y$ at $\mathbf{x} = (1, 0)$ and in the direction $(\cos \alpha, \sin \alpha)$
7. $f(x, y) = e^{x+y}$ at $\mathbf{x} = (1, 1)$ and in the direction of the curve defined by $g(t) = (t^2, t^3)$ at $g(2)$ for $t$ increasing
8. $f(x, y) = (x^2 + y^2)^{-1}$ at the point $(1, 3)$ and in the direction of the vector $(1, 2)$

9. Find the directional derivative at $(1, 0, 0)$ of the function $f(x, y, z) = x^2 + ye^z$ in the direction of the tangent vector at $g(0)$ to the curve in $\mathbb{R}^3$ defined parametrically by $g(t) = (3t^2 + t + 1, 2t, t^2)$.

10. Find the directional derivative at $(1, 0, 0)$ of the function $f(x, y, z) = x^2 + ye^z$ in the direction of increasing $t$ along the curve in $\mathbb{R}^3$ defined by $g(t) = (t^2 - t + 2, t, t + 2)$ at $g(0)$.

11. Find the directional derivative at $(1, 0, 1)$ of the function $f(x, y, z) = 4x^2 y + y^2 z$ in the direction of the vector $(1, 1, 1)$.

12. Find the directional derivative at $(0, 0)$ of the function $f(x, y) = \sin(x + y)$ in the direction of the vector $(a, b)$.

13. Use the cross product to find the direction of a perpendicular $\mathbf{p}$ at $(1, 2, 1)$ to the surface defined parametrically by $(x, y, z) = (u^2 v, u + v, u)$. Then find the directional derivative of $f(x, y, z) = x^3 + y^2 + z$ in the direction of $\mathbf{p}$ at $(1, 2, 1)$.

14. Show that for an arbitrary angle $\alpha$, the vector $\mathbf{u} = (\cos \alpha, \sin \alpha)$ is a unit vector in $\mathbb{R}^2$ inclined at angle $\alpha$ to the positive $x$-axis.

15. For the unit vector $\mathbf{u}$ in Exercise 14, show that
$$\frac{\partial f}{\partial \mathbf{u}} = \cos \alpha \frac{\partial f}{\partial x} + \sin \alpha \frac{\partial f}{\partial y}.$$

16. Let $\mathbf{u}$ be a unit vector in $\mathbb{R}^n$ with direction cosines $\cos \alpha_i = \mathbf{u} \cdot \mathbf{e}_i$, relative to the natural basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$. Show that $\mathbf{u} = (\cos \alpha_1, \cos \alpha_2, \ldots, \cos \alpha_n)$.

17. For the unit vector $\mathbf{u}$ in Exercise 16, show that
$$\frac{\partial f}{\partial \mathbf{u}} = \cos \alpha_1 \frac{\partial f}{\partial x_1} + \cdots + \cos \alpha_n \frac{\partial f}{\partial x_n}.$$

18. If $f: \mathbb{R}^n \to \mathbb{R}$ is differentiable and $\mathbf{u}$ is a unit vector in $\mathbb{R}^n$, show that
$$\frac{\partial f}{\partial(-\mathbf{u})}(\mathbf{x}) = -\frac{\partial f}{\partial \mathbf{u}}(\mathbf{x}),$$
for all $\mathbf{x}$ in $\mathbb{R}^n$.

19. Show that the mean-value formula of Theorem 3.3 also takes the form
$$\frac{f(\mathbf{y}) - f(\mathbf{x})}{|\mathbf{y} - \mathbf{x}|} = \frac{\partial f}{\partial \mathbf{u}}(\mathbf{x}_0),$$
where the unit vector is $\mathbf{u} = (\mathbf{y} - \mathbf{x})/|\mathbf{y} - \mathbf{x}|$.

20. Show that the function $f$ defined by
$$f(x, y) = \begin{cases} \dfrac{x|y|}{\sqrt{x^2 + y^2}}, & (x, y) \neq (0, 0) \\ 0, & (x, y) = (0, 0) \end{cases}$$
has a directional derivative in every direction at $(0, 0)$, but that $f$ is not differentiable at $(0, 0)$. [Hint: If $f$ were differentiable at $(0, 0)$, then we would have $\nabla f(0, 0) = (0, 0)$.]

21. The mean-value theorem doesn't generalize to vector-valued functions, because for a vector-valued function $f(x)$ of even just one variable there may not be a single point $x_0$ between $x$ and $y$ at which $(f(y) - f(x))/(y - x) = f'(x_0)$. Verify this assertion for the example $f(x) = (\sin x, \sin 2x)$ on the interval $0 \le x \le \pi$.

In Exercises 22 and 23, the fundamental derivative approximation $f(\mathbf{x}_0 + \mathbf{h}) \approx f(\mathbf{x}_0) + \nabla f(\mathbf{x}_0) \cdot \mathbf{h}$ for differentiable functions $\mathbb{R}^n \xrightarrow{f} \mathbb{R}$ becomes the first-degree part of a higher-degree approximation. We define the $N$th-degree Taylor approximation to $f(\mathbf{x})$ by

$$f(\mathbf{x} + \mathbf{h}) \approx f(\mathbf{x}) + \frac{\partial f}{\partial \mathbf{h}}(\mathbf{x}) + \frac{1}{2!} \frac{\partial^2 f}{\partial \mathbf{h}^2}(\mathbf{x}) + \frac{1}{3!} \frac{\partial^3 f}{\partial \mathbf{h}^3}(\mathbf{x}) + \cdots + \frac{1}{N!} \frac{\partial^N f}{\partial \mathbf{h}^N}(\mathbf{x}),$$

where we assume that the required $N$th-order derivatives are continuous on an open neighborhood of $\mathbf{x}$. The higher-order derivatives with respect to the vector $\mathbf{h}$ are defined recursively by generalizing ordinary higher-order partials as follows:

$$\frac{\partial^2 f}{\partial \mathbf{h}^2}(\mathbf{x}) = \frac{\partial}{\partial \mathbf{h}} \left( \frac{\partial f}{\partial \mathbf{h}} \right)(\mathbf{x}), \quad \frac{\partial^3 f}{\partial \mathbf{h}^3}(\mathbf{x}) = \frac{\partial}{\partial \mathbf{h}} \left( \frac{\partial^2 f}{\partial \mathbf{h}^2} \right)(\mathbf{x}), \ldots, \text{ and in general } \frac{\partial^N f}{\partial \mathbf{h}^N}(\mathbf{x}) = \frac{\partial}{\partial \mathbf{h}} \left( \frac{\partial^{N-1} f}{\partial \mathbf{h}^{N-1}} \right)(\mathbf{x}).$$

22. Let $f(x, y) = \sin(x^2 + y)$, $\mathbf{x} = (0, 0)$, and $\mathbf{h} = (h, k)$. Compute the second-degree Taylor approximation to $f(h, k)$.

23. Let f (x , y) = eX +»2-z , $\mathbf{x} = (0, 0, 0)$, and $\mathbf{h} = (h, k, l)$. Compute the second-degree Taylor approximation to $f(h, k, l)$.

SECTION 4 VECTOR-VALUED FUNCTIONS


4A Differentiability
Relying on Definition 2.1 for real-valued functions in Section 2, we define a vector-valued function $\mathbb{R}^n \xrightarrow{f} \mathbb{R}^m$, with domain an open set $D$, to be differentiable if each of its $m$ real-valued coordinate functions $f_1, f_2, \ldots, f_m$ is differentiable. If all $n$ first-order partial derivatives of each of the $m$ coordinate functions of $f$ are continuous on $D$ then $f$ is in addition called continuously differentiable on $D$. In this generality we're looking at $m$ times $n$ real-valued partial derivatives altogether. A natural way to organize these partial derivatives is to compute successively the gradient vectors of the coordinate functions $f_1, f_2, \ldots, f_m$ of $f$:

$$\begin{aligned}
\nabla f_1(\mathbf{x}) &= \left( \frac{\partial f_1}{\partial x_1}(\mathbf{x}),\; \frac{\partial f_1}{\partial x_2}(\mathbf{x}),\; \ldots,\; \frac{\partial f_1}{\partial x_n}(\mathbf{x}) \right), \\
\nabla f_2(\mathbf{x}) &= \left( \frac{\partial f_2}{\partial x_1}(\mathbf{x}),\; \frac{\partial f_2}{\partial x_2}(\mathbf{x}),\; \ldots,\; \frac{\partial f_2}{\partial x_n}(\mathbf{x}) \right), \\
&\;\;\vdots \\
\nabla f_m(\mathbf{x}) &= \left( \frac{\partial f_m}{\partial x_1}(\mathbf{x}),\; \frac{\partial f_m}{\partial x_2}(\mathbf{x}),\; \ldots,\; \frac{\partial f_m}{\partial x_n}(\mathbf{x}) \right).
\end{aligned}$$

In practice we hope to be able to observe that each of these $mn$ partial derivatives is continuous on the open set $D$, in which case $f$ is not only differentiable but continuously differentiable.

EXAMPLE 1 The function $\mathbb{R}^2 \xrightarrow{f} \mathbb{R}^2$ defined by

$$f(x, y) = \begin{pmatrix} x^2 - y^2 \\ 2xy \end{pmatrix}$$

has coordinate functions $f_1(x, y) = x^2 - y^2$ and $f_2(x, y) = 2xy$, so

$$\nabla f_1(x, y) = (2x, -2y), \qquad \nabla f_2(x, y) = (2y, 2x).$$

Since the coordinates of $\nabla f_1(x, y)$ and $\nabla f_2(x, y)$ are continuous for all $(x, y)$, the vector-valued function $f$ is continuously differentiable on all of $\mathbb{R}^2$.

An alternative way of displaying the first-order partial derivatives of the coordinate functions of a vector function $f$ is to look directly at the vector partial derivatives of $f$. We'll see that sometimes there's a distinct advantage to displaying a vector function $f$ and its partial derivatives as column vectors.

EXAMPLE 2 The function $\mathbb{R}^2 \xrightarrow{f} \mathbb{R}^2$ of Example 1 defined by

$$f(x, y) = \begin{pmatrix} x^2 - y^2 \\ 2xy \end{pmatrix}$$

has vector partial derivatives

$$\frac{\partial f}{\partial x}(x, y) = \begin{pmatrix} 2x \\ 2y \end{pmatrix}, \qquad \frac{\partial f}{\partial y}(x, y) = \begin{pmatrix} -2y \\ 2x \end{pmatrix}.$$

We draw the same conclusion as in Example 1 from continuity of the four partial derivatives, namely that $f$ is continuously differentiable on all of $\mathbb{R}^2$.
4B The Derivative Matrix
Each of Examples 1 and 2 in Section 4A displays the first-order partial derivatives
of a vector-valued function $\mathbb{R}^2 \xrightarrow{f} \mathbb{R}^2$ as the result of computations that are in
principle somewhat different. In Example 1 we looked at the gradient vectors of
the two coordinate functions, while in Example 2 we looked at the vector partial
derivatives of $f$ itself. Both of these interpretations are important to keep in mind,
but there is a third way of organizing the results of the computations that is specially
important, namely the arrangement of each example's real-valued partial derivatives
in the following 2-by-2 matrix:

\[ f'(x, y) = \begin{pmatrix} \dfrac{\partial f_1}{\partial x}(x, y) & \dfrac{\partial f_1}{\partial y}(x, y) \\[2ex] \dfrac{\partial f_2}{\partial x}(x, y) & \dfrac{\partial f_2}{\partial y}(x, y) \end{pmatrix} = \begin{pmatrix} 2x & -2y \\ 2y & 2x \end{pmatrix}. \]

We have denoted the resulting matrix at a point $(x, y)$ by $f'(x, y)$, a notation that will
be particularly useful when we take up the general chain rule in Chapter 5. Notice
that the rows of this matrix are the successive coordinates of the gradient vectors
of the coordinate functions $f_1(x, y) = x^2 - y^2$ and $f_2(x, y) = 2xy$ of $f(x, y)$.
The columns of the matrix consist of the entries in the vector partial derivatives of
$f(x, y)$.
In general we define the derivative matrix of a differentiable function at $x$ by

\[ f'(x) = \begin{pmatrix}
\dfrac{\partial f_1}{\partial x_1}(x) & \dfrac{\partial f_1}{\partial x_2}(x) & \cdots & \dfrac{\partial f_1}{\partial x_n}(x) \\[2ex]
\dfrac{\partial f_2}{\partial x_1}(x) & \dfrac{\partial f_2}{\partial x_2}(x) & \cdots & \dfrac{\partial f_2}{\partial x_n}(x) \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial f_m}{\partial x_1}(x) & \dfrac{\partial f_m}{\partial x_2}(x) & \cdots & \dfrac{\partial f_m}{\partial x_n}(x)
\end{pmatrix}, \]

where $f_1(x), f_2(x), \ldots, f_m(x)$ are the differentiable coordinate functions of $f(x)$.
Note that the differentiation variable remains the same in each column, producing a
vector partial derivative $\partial f/\partial x_j(x)$, which is a tangent vector at $x$ to a curve in the
image space generated by varying only $x_j$. Across each row a coordinate function
$f_i$ remains the same, producing the coordinate functions of a gradient vector $\nabla f_i(x)$,
which we'll see in Section 1 of the next chapter gives the magnitude and direction
of maximum increase of $f_i$ at $x$.
If $f$ is a real-valued function there is only one coordinate function so the matrix
has only a single row, whose entries are identical with the coordinates of the gradient
vector $\nabla f(x)$. Notationally the only difference between $\nabla f(x)$ and the single-rowed
matrix is that $\nabla f(x)$ has its entries separated by commas. So in every case a derivative
matrix $f'(x)$ can be regarded as having tangent vectors for columns and
gradient vectors for rows.
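The row-and-column description of the derivative matrix can be checked numerically. The following is a minimal sketch, not part of the text: the helper name `jacobian`, the step size `h`, and the use of central differences are all assumptions made for illustration. It builds each column as an approximate vector partial derivative and then transposes, so that each row approximates a gradient of a coordinate function.

```python
def jacobian(f, x, h=1e-6):
    """Approximate the m-by-n derivative matrix of f at x (hypothetical helper).

    f takes a list of n numbers and returns a list of m numbers.
    Central differences are an illustrative choice, not the book's method.
    """
    n = len(x)
    columns = []
    for j in range(n):
        xp = list(x)
        xm = list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        # Column j approximates the vector partial derivative df/dx_j.
        columns.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    m = len(columns[0])
    # Transpose: row i then approximates the gradient of the coordinate function f_i.
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def f(p):
    # The function of Examples 1 and 2: f(x, y) = (x^2 - y^2, 2xy).
    x, y = p
    return [x * x - y * y, 2 * x * y]


J = jacobian(f, [1.0, 2.0])
for row in J:
    print(row)
```

At the point (1, 2) the exact matrix from Examples 1 and 2 is the one with rows (2, -4) and (4, 2), so the printed rows should agree with those values up to rounding.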
EXAMPLE 3. The function $\mathbb{R}^3 \xrightarrow{f} \mathbb{R}^3$ defined by

\[ f(x, y, z) = \begin{pmatrix} x^2 + e^y \\ x + y \sin z \\ x + y \end{pmatrix} \]
240 Chapter 5 Differentiability

has coordinate functions

\[ f_1(x, y, z) = x^2 + e^y, \quad f_2(x, y, z) = x + y \sin z, \quad f_3(x, y, z) = x + y. \]

The derivative matrix at $(x, y, z)$ is the matrix whose columns are the three possible
vector partial derivatives:

\[ f'(x, y, z) = \begin{pmatrix} \dfrac{\partial f}{\partial x}(x, y, z) & \dfrac{\partial f}{\partial y}(x, y, z) & \dfrac{\partial f}{\partial z}(x, y, z) \end{pmatrix}. \]

For this example

\[ f'(x, y, z) = \begin{pmatrix} 2x & e^y & 0 \\ 1 & \sin z & y \cos z \\ 1 & 1 & 0 \end{pmatrix}. \]

In particular the derivative matrix of $f$ at $(1, 1, \pi)$ is the matrix

\[ f'(1, 1, \pi) = \begin{pmatrix} 2 & e & 0 \\ 1 & 0 & -1 \\ 1 & 1 & 0 \end{pmatrix}. \]

The three columns are vectors in $\mathbb{R}^3$ tangent to coordinate curves of $f$ through
$f(1, 1, \pi) = (1 + e, 1, 2)$, and the three rows are gradient vectors at the same point.

EXAMPLE 4. The function $\mathbb{R}^2 \xrightarrow{f} \mathbb{R}^2$ defined by

\[ f(x, y) = (x^2 + 2xy + y^2, \; xy^2 + x^2 y) \]

has coordinate functions $f_1(x, y) = x^2 + 2xy + y^2$ and $f_2(x, y) = xy^2 + x^2 y$.
Thinking in terms of gradients this time we find

\[ \nabla f_1(x, y) = (2x + 2y, \; 2x + 2y) \quad \text{and} \quad \nabla f_2(x, y) = (y^2 + 2xy, \; 2xy + x^2). \]

Hence the derivative of $f$ at $(x, y)$ is given by the matrix

\[ f'(x, y) = \begin{pmatrix} 2x + 2y & 2x + 2y \\ y^2 + 2xy & 2xy + x^2 \end{pmatrix}. \]

EXAMPLE 5. Consider the function $\mathbb{R} \xrightarrow{f} \mathbb{R}^2$

\[ f(t) = \begin{pmatrix} \cos t \\ \sin t \end{pmatrix}, \qquad -\infty < t < \infty. \]

The derivative $f'(t_0)$ is the 2-by-1 matrix

\[ \begin{pmatrix} -\sin t_0 \\ \cos t_0 \end{pmatrix}. \]

It's instructive to consider the matrix as a vector in the range space of $f$ and to
draw it with its tail at the image point $f(t_0)$. For $t_0 = 0, \pi/4, \pi/3, \pi/2$, and $\pi$, the
respective derivative matrices $f'(t_0)$ are

\[ \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad \begin{pmatrix} -\sqrt{2}/2 \\ \sqrt{2}/2 \end{pmatrix}, \quad \begin{pmatrix} -\sqrt{3}/2 \\ 1/2 \end{pmatrix}, \quad \begin{pmatrix} -1 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 0 \\ -1 \end{pmatrix}. \]

Viewed as vectors, drawn with their tails at their corresponding image points under
$f$, these are shown in Figure 5.14. Evidently, for functions of one real variable, the
idea of derivative, introduced here as a matrix, coincides with the vector derivative
developed in Chapter 4, Section 4. The first-degree Taylor expansion approximates
$f(t)$ near $t_0$ and is the vector function of $t$ given by

\[ T(t) = f'(t_0)(t - t_0) + f(t_0), \]

which in terms of matrices becomes

\[ T(t) = t \begin{pmatrix} -\sin t_0 \\ \cos t_0 \end{pmatrix} + \begin{pmatrix} \cos t_0 + t_0 \sin t_0 \\ \sin t_0 - t_0 \cos t_0 \end{pmatrix}. \]

This is the parametric representation of the line tangent to the image of $f$ at $f(t_0)$.

FIGURE 5.14 Tangents to a circular image.
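The tangent line of this example is easy to evaluate directly. The sketch below is an illustration only: the function names `f`, `fprime`, and `tangent` are chosen here and do not come from the text. It evaluates the first-degree approximation T(t) = f'(t0)(t - t0) + f(t0) for the circle and confirms that the derivative vector is perpendicular to the radius vector f(t0), which is what tangency to the unit circle requires.

```python
import math


def f(t):
    # The circle parametrization f(t) = (cos t, sin t).
    return (math.cos(t), math.sin(t))


def fprime(t):
    # The 2-by-1 derivative matrix, written as a pair.
    return (-math.sin(t), math.cos(t))


def tangent(t, t0):
    # First-degree approximation T(t) = f'(t0)(t - t0) + f(t0).
    px, py = f(t0)
    vx, vy = fprime(t0)
    return (px + (t - t0) * vx, py + (t - t0) * vy)


t0 = math.pi / 4
radius = f(t0)
velocity = fprime(t0)
# The dot product of the radius and the tangent vector should vanish.
dot = radius[0] * velocity[0] + radius[1] * velocity[1]
print(dot)
print(tangent(t0 + 0.1, t0), f(t0 + 0.1))
```

The last line prints the tangent approximation next to the true value of f a short parameter distance from t0, so the two pairs can be compared by eye.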

4C Tangent Approximations
We've so far encountered two settings of the tangent approximation, the first being
the one appropriate for the case $\mathbb{R} \xrightarrow{f} \mathbb{R}$ of a differentiable real-valued function of
a single real variable, namely

\[ T(x) = f(x_0) + f'(x_0)(x - x_0). \]

In this familiar setting in single-variable calculus, we use $T(x)$ to write the equation
of the tangent line to the graph of $y = f(x)$ at $x_0$ in the form

\[ y = f(x_0) + f'(x_0)(x - x_0). \]

In the second setting we get a geometric picture of the tangent approximation of a
real-valued function $f(x, y)$ at $(x_0, y_0)$ by drawing the graph of $z = f(x, y)$ together
with the tangent approximation to the surface at the point $(x_0, y_0, f(x_0, y_0))$, namely

\[ T(x, y) = f(x_0, y_0) + \nabla f(x_0, y_0) \cdot (x - x_0, y - y_0). \]

Thus the tangent plane is the graph of the equation

\[ z = f(x_0, y_0) + \nabla f(x_0, y_0) \cdot (x - x_0, y - y_0). \]

The function $T(x, y)$ is a good first-degree approximation to the function $f(x, y)$,
provided $(x, y)$ is close to $(x_0, y_0)$. Figure 5.11 in Section 3 is a picture of the
relation between the graphs of $z = f(x, y)$ and $z = T(x, y)$. The more general
version of $T(x)$ for real-valued functions is

\[ T(x) = f(x_0) + \nabla f(x_0) \cdot (x - x_0). \]

We've seen that for general $n$ and $m$ a differentiable function $\mathbb{R}^n \xrightarrow{f} \mathbb{R}^m$ has an
$m$-by-$n$ derivative matrix $f'(x)$ with $m$ rows, one for each coordinate function, and
$n$ columns, one for each coordinate variable in $x$. It is this matrix that plays the role
that the gradient plays in the case of a single real-valued coordinate function. Thus
the first-degree Taylor approximation to $f: \mathbb{R}^n \to \mathbb{R}^m$ at $x_0$ is

4.1 Definition

\[ T(x) = f(x_0) + f'(x_0)(x - x_0), \]

where the vectors $x - x_0$ must be interpreted as column vectors so that
we can multiply on the left by the derivative matrix $f'(x_0)$ using the
row-by-column rule for matrix multiplication.

EXAMPLE 6. The function $\mathbb{R}^2 \xrightarrow{f} \mathbb{R}^3$ defined by

\[ f(u, v) = \begin{pmatrix} u \cos v \\ u \sin v \\ v \end{pmatrix}, \qquad 0 \le u \le 1, \quad 0 \le v \le 2\pi, \]

has as image one complete turn of a circular helix in $\mathbb{R}^3$. This function $f$ has
derivative matrix

\[ f'(u, v) = \begin{pmatrix} \cos v & -u \sin v \\ \sin v & u \cos v \\ 0 & 1 \end{pmatrix}. \]

Since the entries in this matrix are continuous real-valued functions on all of $\mathbb{R}^2$, we
can conclude that $f(u, v)$ is continuously differentiable on its rectangular domain
$0 \le u \le 1$, $0 \le v \le 2\pi$. Note that the columns of the 3-by-2 matrix $f'(u_0, v_0)$ in
a single-variable context are tangent vectors $\mathbf{u}_0$ and $\mathbf{v}_0$ that generate a tangent plane
at a point $f(u_0, v_0)$ consisting of all points

\[ T(u, v) = f(u_0, v_0) + (u - u_0)\mathbf{u}_0 + (v - v_0)\mathbf{v}_0. \]

This plane is the image of the first-degree tangent approximation at $(u_0, v_0)$. Note
also that the inclusion of the terms $-u_0$ and $-v_0$ just has the effect of shifting the
parameter values so the point of tangency on the plane corresponds to $(u_0, v_0)$. Deleting
those two terms would still produce the same plane as image; with $(u, v) = (0, 0)$
the parameter values indicate the point of tangency on the plane. See Figure 4.22(b)
in Chapter 4, Section 4, where the columns of the matrix $f'(u, v)$ were written as
the vector partials $\partial f/\partial u$ and $\partial f/\partial v$.

EXAMPLE 7. The function $\mathbb{R}^3 \xrightarrow{f} \mathbb{R}^3$ defined for all $(x, y, z)$ by

\[ f(x, y, z) = \begin{pmatrix} -x + y + z \\ 2x - 2y + 2z \\ 3x + 3y - 3z \end{pmatrix} \]

has as image all of $\mathbb{R}^3$. To see this note that $f$ has derivative matrix

\[ f'(x, y, z) = \begin{pmatrix} -1 & 1 & 1 \\ 2 & -2 & 2 \\ 3 & 3 & -3 \end{pmatrix}. \]

This constant 3-by-3 matrix $A = f'(x, y, z)$ has positive determinant 24 and so
is invertible with inverse $A^{-1}$ using Theorem 5.7 in Chapter 2, Section 5. Note that
$f(x) = Ax$, where $x$ is a 3-dimensional column vector. Hence there is an inverse
function $f^{-1}(y) = A^{-1}y$ that takes each point $y$ in the image of $f$ back to its
corresponding point $x$ in the domain of $f$.

The first-degree approximation defined by Equation 4.1 has the same essential
character as the two special cases we considered previously, namely that $f(x) - T(x)$
tends to zero faster than $|x - x_0|$ as $x$ tends to $x_0$. Here is the complete formal statement
and proof.

4.2 Theorem. If $\mathbb{R}^n \xrightarrow{f} \mathbb{R}^m$ is differentiable at $x_0$, then

\[ \lim_{x \to x_0} \frac{f(x) - f(x_0) - f'(x_0)(x - x_0)}{|x - x_0|} = 0, \]

and $f'(x_0)$ is the unique matrix that satisfies this equation.

Proof. According to the definition of the matrix-vector product $f'(x_0)(x - x_0)$, the
$i$th coordinate in the product $f'(x_0)(x - x_0)$ is just the dot product of the $i$th row of
the matrix $f'(x_0)$ with the column vector $x - x_0$. But the entries in the $i$th row of
the matrix are just the coordinates of $\nabla f_i(x_0)$, so the $i$th coordinate of the desired
limit equation is

\[ \lim_{x \to x_0} \frac{f_i(x) - f_i(x_0) - \nabla f_i(x_0) \cdot (x - x_0)}{|x - x_0|} = 0. \]

We assumed $f$ was differentiable at $x_0$, which means that each real-valued function
$f_i$ is differentiable there, so the limit is valid in each coordinate. Theorem 1.1 in
Section 1 of this chapter says that the vector limit is valid if, and only if, the limit
is valid in each coordinate. So the vector limit is valid. Finally, the matrix $f'(x_0)$ is
the unique matrix that has the gradients of the coordinate functions as rows. •
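The limit in Theorem 4.2 can be observed numerically. The sketch below is an illustration under stated assumptions: it uses the function f(x, y) = (x^2 - y^2, 2xy) from Section 4A, the base point (1, 2), and displacements of the form (s, -s); the helper name `remainder_ratio` is invented here. It computes |f(x) - f(x0) - f'(x0)(x - x0)| / |x - x0| for shrinking s.

```python
import math


def f(x, y):
    return (x * x - y * y, 2 * x * y)


def fprime(x, y):
    # Derivative matrix of f, stored as a pair of rows.
    return ((2 * x, -2 * y), (2 * y, 2 * x))


def remainder_ratio(x0, y0, dx, dy):
    """Return |f(x) - f(x0) - f'(x0)(x - x0)| / |x - x0| for x = x0 + (dx, dy)."""
    base = f(x0, y0)
    moved = f(x0 + dx, y0 + dy)
    a = fprime(x0, y0)
    # Row-by-column product of the derivative matrix with the column (dx, dy).
    linear = (a[0][0] * dx + a[0][1] * dy, a[1][0] * dx + a[1][1] * dy)
    rem = (moved[0] - base[0] - linear[0], moved[1] - base[1] - linear[1])
    return math.hypot(rem[0], rem[1]) / math.hypot(dx, dy)


ratios = [remainder_ratio(1.0, 2.0, s, -s) for s in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

For this particular quadratic function the remainder is (dx^2 - dy^2, 2 dx dy), whose length is dx^2 + dy^2, so the ratio equals the length of the displacement and shrinks in proportion to s.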

4.3 Corollary. If $A$ is a constant $m$-by-$n$ matrix, then the function $\mathbb{R}^n \xrightarrow{f} \mathbb{R}^m$
defined by $f(x) = Ax$ has $A$ for its derivative matrix, that is, $f'(x) = A$.

Proof. The derivative matrix $f'(x_0)$ is the unique matrix that satisfies the equation
of Theorem 4.2. Observe that since $A(x - x_0) = Ax - Ax_0$, then

\[ \frac{Ax - Ax_0 - A(x - x_0)}{|x - x_0|} = \frac{Ax - Ax_0 - Ax + Ax_0}{|x - x_0|} = 0. \]

The limit as $x$ tends to $x_0$ is the 0-vector, so $A$ is the unique derivative matrix. •


We saw in Chapter 2, Section 2C that a function of the form $F(x) = Ax$ is linear
in the sense that $F(sx + ty) = sF(x) + tF(y)$ for all scalars $s$, $t$ and $n$-dimensional
vectors $x$, $y$. Thus it's a linear function that's the crucial part of the first-degree
Taylor approximation $T(x) = f'(x_0)(x - x_0) + f(x_0)$ to $f(x)$ near $x = x_0$.
EXERCISES

In Exercises 1 to 10, find the derivative matrix f' at a


general point of the domain of the function f from !Rn 2. f(u v)
'
=( UC?SV )
U Sln V
to !Rm.
+ siny )
x
l. f (x, y) =( /J. y )
3. f(x,y,z)= y+cosz
( x+y+z

20. A(:)-(:~:) at ( : ) - ( b)
5. f(x) = x 2ex
6. f(x,y) = (exY,xy)
21. r(: )-( :;~;~) " (: )-(; )
22. f (x, y, z) = (x + y + z, xy + yz + zx, xyz) at (x, y, z)
UV )
1. f(u. v. w) - ( VU! 23. Let P be the function from R3 to R2 defined by
WU P(x, y, z) = (x, y).
(a) What is the geometric interpretation of this transfor-
8. f (t J = ( ~: ) mation?
(b) Show that P is differentiable at all points and find
the derivative matrix of P at (l, 1, l).

9. f(x, y) = ( :~ ~ ;~ ) 24. (a) Draw the curve in R 2 defined parametrically by the


function
xy

10. f(u, v) = (u, v, 11 2 + v 2) g(t) = (t -1,t 2 - 3t +2), -00 < 1< 00.

L1 Exercises 11 to 14, let / be the vector function (b) Find formulas describing the tangent approximations
to g near t = 0 and near t = 2.
defined by

f ( X)
y
=( x2
2xy
-y2) (c) Draw the lines defined parametrically by the tangent
approximation.
25. Let f be the function given in Exercise I, and let
Find the derivative matrix of f at the following points:

11. ( ; )
xo =( 6) , YI =( OOI ) ,

12. ( ~ )
Y2 =( O~I ) , Y3 = ( ~: ! )·
(a) Compute f(xo + y;) for i = I, 2, 3.
13. ( 6) (b) Find the tangent approximation to J(xo + y) for an
arbitrary vector y.
(c) Use part (b) to find approximations to the vectors
1/./2 ) J(xo + y;), i = I, 2, 3.
14. ( 1/./2
26. (a) Sketch the graph in R3 defined explicitly by the
In Exercises 15 to 22, find the derivative matrix of the function
function at the indicated point. f(x,y)=4-x 2 -y2.
1 (b) Find the tangent approximation to f (i) near (x, y) =
15. f ( ; ) = x2 + y2 at ( ; ) =( )
(0, 0) and (ii) near (x, y) = (2, 0).
16. g(x, y, z) = xyz at (x, y, z) = (l, 0, 0) (c) Draw· the graphs of the approximations in (b).
27. What is the derivative matrix f'(x, y, z) of the function
17. J(t) =( sint ) at t = !L
cos 1 4
a1
f (x, y, z) = b1 a2
b2 b3 ) ( xy )
a3 +( ao
ho ) ?
18.1(,)-(:,),11-1 (
CJ c2 c3 z co

28. A translation is a function of the form T(x) = x + b,

)!~
19. g(x.y)= ( 2 ) at (x,y)=(l,2)
for a fixed vector b. What is the derivative matrix of a
translation from R" to R 11 ?
Section 5 Newton's Method 245
*29. Show that if $f: \mathbb{R}^m \to \mathbb{R}^n$ and $g: \mathbb{R}^m \to \mathbb{R}^n$ are differentiable
at $x_0$ and $a$ is a real number, then $f + g$ and $af$ are differentiable at $x_0$ and
(a) $(f + g)'(x_0) = f'(x_0) + g'(x_0)$
(b) $(af)'(x_0) = af'(x_0)$

SECTION 5 NEWTON'S METHOD


In this section we treat Newton's method for approximating a solution of an equation
$f(x) = 0$, where $\mathbb{R}^n \xrightarrow{f} \mathbb{R}^n$ is a nonlinear function. We begin by looking at the
idea of approximating a vector in $\mathbb{R}^n$ by a sequence of vectors in $\mathbb{R}^n$. We are used to
thinking of a real number like $\sqrt{2}$ as being approximated by a sequence of rational
numbers, say, 1, 1.4, 1.41, 1.414, .... The idea extends immediately to vectors.

First we define the limit of a sequence in $\mathbb{R}^n$. Let $x_1, x_2, x_3, \ldots$ be an infinite
sequence of vectors in $\mathbb{R}^n$. Suppose there is a vector $x$ in $\mathbb{R}^n$ such that, for any given
$\epsilon > 0$, there is an integer $N$ for which

\[ |x_k - x| < \epsilon \]

whenever $k \ge N$. Then we say that the given sequence converges to the limit $x$,
and we write

\[ \lim_{k \to \infty} x_k = x. \]

We can summarize by saying that the sequence $x_1, x_2, x_3, \ldots$ converges to $x$ if
$|x_k - x|$ is arbitrarily small for all sufficiently large $k$. Figure 5.15 shows a sequence
with entries lying within $\epsilon$ of $x$ whenever $k \ge 6$.

FIGURE 5.15

Consider the vector $(\sqrt{2}, \pi)$ in $\mathbb{R}^2$. Suppose that $\sqrt{2}$ is approximated by the decimal
expansion sequence 1, 1.4, 1.41, 1.414, ... and that $\pi$ is approximated by 3, 3.1,
3.14, 3.141, .... Then we can form the sequence of vectors (1, 3), (1.4, 3.1), (1.41,
3.14), (1.414, 3.141), ... to approximate the vector $(\sqrt{2}, \pi)$. We leave as an exercise
showing that if $x_1, x_2, x_3, \ldots$ and $y_1, y_2, y_3, \ldots$ are the sequences approximating $\sqrt{2}$
and $\pi$ respectively, then $\lim_{k \to \infty} x_k = \sqrt{2}$ and $\lim_{k \to \infty} y_k = \pi$ implies that
$\lim_{k \to \infty} (x_k, y_k) = (\sqrt{2}, \pi)$.
We look first at Newton's method for approximating a solution of an equation
$f(x) = 0$ where $f$ is real-valued and $x$ is a real variable. We assume that $f$ is
continuously differentiable. If the graph of $f$ should happen to be convex as shown
in Figure 5.16, then it's geometrically apparent that the tangent line to the graph
at $(x_0, f(x_0))$ crosses the $x$-axis at a point $x_1$ that is a better approximation to the
solution $x$ than $x_0$ is. Having chosen $x_0$ somewhat arbitrarily, and having found $x_1$,
we can repeat the process. This time we use the tangent line at $(x_1, f(x_1))$ and call
its intersection with the $x$-axis $x_2$. Thus we can generate a sequence of numbers
$x_0, x_1, x_2, \ldots$ approximating $x$.

FIGURE 5.16

In practice, we need a formula for computing the sequence $x_1, x_2, \ldots$. We observe
first that the tangent line at $(x_0, f(x_0))$ has the equation

5.1 \[ y = f'(x_0)(x - x_0) + f(x_0). \]

Since the approximation $x_1$ is found by intersecting the tangent with the $x$-axis, we
set $y = 0$ in the above equation and solve for $x_1$. The result is

\[ 0 = f'(x_0)(x_1 - x_0) + f(x_0), \quad \text{or} \quad f'(x_0)(x_1 - x_0) = -f(x_0). \]

If $f'(x_0) \ne 0$,

\[ x_1 - x_0 = -\frac{f(x_0)}{f'(x_0)}, \quad \text{or} \quad x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}. \]

Having found $x_1$, to find $x_2$ we replace $x_0$ by $x_1$ in the last formula to get

\[ x_2 = x_1 - \frac{f(x_1)}{f'(x_1)}. \]

In general, we compute $x_{k+1}$ by

5.2 \[ x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}. \]

EXAMPLE 2. The equation $x^2 - 3 = 0$ has two solutions, $\sqrt{3}$ and $-\sqrt{3}$. To approximate $\sqrt{3}$ we
choose $x_0 = 2$ and compute $x_{k+1}$ from $x_k$ by Formula 5.2, which in this case is

\[ x_{k+1} = x_k - \frac{x_k^2 - 3}{2x_k} = \frac{x_k^2 + 3}{2x_k}. \]

Thus we get $x_1 = \frac{7}{4} = 1.75$. Substituting this value in the preceding formula for
$k = 1$ gives $x_2 = \frac{97}{56} \approx 1.732142857$. This approximation to $\sqrt{3}$ is correct to three
decimal places. Calculating one more step gives $x_3 \approx 1.7320508$, which is correct
to the number of displayed digits.
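The iteration in this example takes only a few lines to carry out. The following sketch is illustrative rather than definitive: the function name `newton` and its argument names are assumptions made here, and no safeguard is included for a zero derivative. It applies the formula x_{k+1} = x_k - f(x_k)/f'(x_k) to f(x) = x^2 - 3 starting from x_0 = 2.

```python
def newton(f, fprime, x0, steps):
    """Return the list [x0, x1, ..., x_steps] of Newton iterates (hypothetical helper)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        # One Newton step: intersect the tangent line at (x, f(x)) with the x-axis.
        xs.append(x - f(x) / fprime(x))
    return xs


xs = newton(lambda x: x * x - 3.0, lambda x: 2.0 * x, 2.0, 3)
for k, x in enumerate(xs):
    print(k, x)
```

The printed values for k = 1, 2, 3 can be compared with the values 1.75, 1.732142857, and 1.7320508 obtained in the example.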

We follow a similar procedure to the one just described if $f$ is a function from
$\mathbb{R}^n$ to $\mathbb{R}^n$. The difference is that, in this case, $x_1$, $x_0$, and $f(x_0)$ are vectors in $\mathbb{R}^n$ and
$f'(x_0)$ is an $n$-by-$n$ derivative matrix. To approximate a solution of a vector equation
$f(x) = 0$, we consider the equation generalizing Equation 5.1 that defines the value of the
tangent approximation to $f$ near $x_0$, that is,

5.3 \[ y = f'(x_0)(x - x_0) + f(x_0), \]

where $x_0$ is chosen as an initial approximation to the desired solution $x$. In
Figure 5.16, Equation 5.3 is the equation of the tangent to the graph of $f$ at
$(x_0, f(x_0))$. As before, we set $y = 0$ in Equation 5.3 to get

\[ 0 = f'(x_0)(x_1 - x_0) + f(x_0), \quad \text{or} \quad f'(x_0)(x_1 - x_0) = -f(x_0). \]

If $f'(x_0)$ has an inverse matrix $[f'(x_0)]^{-1}$, we apply the inverse to both sides to get

\[ x_1 - x_0 = -[f'(x_0)]^{-1} f(x_0), \quad \text{or} \quad x_1 = x_0 - [f'(x_0)]^{-1} f(x_0). \]

In this equation $[f'(x_0)]^{-1} f(x_0)$ is the vector we get by applying the inverse of the
matrix $f'(x_0)$ to the vector $f(x_0)$. The vector $x_1$ is the first improvement on the
initial approximation $x_0$ to the solution $x$.

We can repeat what we have just done, replacing $x_0$ by $x_1$ to get

\[ x_2 = x_1 - [f'(x_1)]^{-1} f(x_1). \]

After $k + 1$ steps we have the general formula for

5.4 Newton's Method.

\[ x_{k+1} = x_k - [f'(x_k)]^{-1} f(x_k). \]

EXAMPLE 3. Consider the pair of equations

\[ x^2 + y^2 = 2, \qquad x^2 - y^2 = 1, \]

the intersecting graphs of which appear in Figure 5.17. There are four solutions to
the pair of equations. To find approximate solutions by Newton's method we define

\[ f(x, y) = \begin{pmatrix} x^2 + y^2 - 2 \\ x^2 - y^2 - 1 \end{pmatrix} \]

and solve the equation $f(x, y) = (0, 0)$. Since $f$ is a function from $\mathbb{R}^2$ to $\mathbb{R}^2$, we
require both $x^2 + y^2 - 2 = 0$ and $x^2 - y^2 - 1 = 0$, and it's helpful to sketch the curves
defined by these two equations. The exact solutions are represented by the four points
of intersection of the circle $x^2 + y^2 - 2 = 0$ and the hyperbola $x^2 - y^2 - 1 = 0$ shown
in Figure 5.17. The choice of an initial approximation depends on which solution we
want to approximate. To look for the solution in the first quadrant, we try $x_0 = (1, 1)$.

FIGURE 5.17

Since

\[ f(x, y) = \begin{pmatrix} x^2 + y^2 - 2 \\ x^2 - y^2 - 1 \end{pmatrix}, \quad \text{we have} \quad f'(x, y) = \begin{pmatrix} 2x & 2y \\ 2x & -2y \end{pmatrix} \]

and

\[ [f'(x, y)]^{-1} = \begin{pmatrix} \frac{1}{4}x^{-1} & \frac{1}{4}x^{-1} \\[1ex] \frac{1}{4}y^{-1} & -\frac{1}{4}y^{-1} \end{pmatrix}. \]

Then the right side of Equation 5.4 becomes

\[
\begin{aligned}
\begin{pmatrix} x \\ y \end{pmatrix} - [f'(x, y)]^{-1} f(x, y)
&= \begin{pmatrix} x \\ y \end{pmatrix} - \begin{pmatrix} \frac{1}{4}x^{-1} & \frac{1}{4}x^{-1} \\[1ex] \frac{1}{4}y^{-1} & -\frac{1}{4}y^{-1} \end{pmatrix} \begin{pmatrix} x^2 + y^2 - 2 \\ x^2 - y^2 - 1 \end{pmatrix} \\
&= \begin{pmatrix} x \\ y \end{pmatrix} - \begin{pmatrix} \dfrac{2x^2 - 3}{4x} \\[2ex] \dfrac{2y^2 - 1}{4y} \end{pmatrix}
= \begin{pmatrix} \dfrac{2x^2 + 3}{4x} \\[2ex] \dfrac{2y^2 + 1}{4y} \end{pmatrix}.
\end{aligned}
\]

This vector is the analog of the expression $(x^2 + 3)/2x$ in the previous example
and is the formula by which the sequence of approximations is actually computed.
Setting $x_0 = (x_0, y_0) = (1, 1)$, we get

\[ x_1 = \begin{pmatrix} \dfrac{2x_0^2 + 3}{4x_0} \\[2ex] \dfrac{2y_0^2 + 1}{4y_0} \end{pmatrix} = \begin{pmatrix} \frac{5}{4} \\[1ex] \frac{3}{4} \end{pmatrix} = \begin{pmatrix} 1.25 \\ 0.75 \end{pmatrix}. \]

Substituting $x_1$ into the same formula gives

\[ x_2 = \begin{pmatrix} \dfrac{2(1.25)^2 + 3}{4(1.25)} \\[2ex] \dfrac{2(0.75)^2 + 1}{4(0.75)} \end{pmatrix} \approx \begin{pmatrix} 1.225 \\ 0.70833 \end{pmatrix}. \]

Substituting our approximate value for $x_2$ gives

\[ x_3 = \begin{pmatrix} \dfrac{2(1.225)^2 + 3}{4(1.225)} \\[2ex] \dfrac{2(0.70833)^2 + 1}{4(0.70833)} \end{pmatrix} \approx \begin{pmatrix} 1.22474 \\ 0.707108 \end{pmatrix}. \]

Similarly, we get

\[ x_4 \approx \begin{pmatrix} 1.22474 \\ 0.707107 \end{pmatrix} \quad \text{and} \quad x_5 \approx \begin{pmatrix} 1.22474 \\ 0.707107 \end{pmatrix}. \]

As in the previous example, you can check that further iteration using only five
places after the decimal point doesn't produce a change.
In this example, the two simultaneous equations can actually be solved by elimination
to yield $x = (\sqrt{1.5}, \sqrt{0.5})$. The approximation $x_4 = (1.22474, 0.707107)$
happens to be correct to that many decimal places. We get the other three vector
solutions by symmetry. Referring to Figure 5.17, we get these vectors by changing
one or both signs of the coordinates to minus. The numerical procedure could have
been applied by taking as initial estimate $x_0$ one of the vectors $(-1, -1)$, $(-1, 1)$,
or $(1, -1)$.
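The vector iteration of this example can be sketched in code as follows. This is a minimal illustration, not the Java applets mentioned in the text: the names `f`, `fprime`, and `newton_step` are chosen here, and instead of forming the inverse matrix explicitly, each step solves the 2-by-2 linear system f'(x) s = f(x) by Cramer's rule, which gives the same vector [f'(x)]^{-1} f(x).

```python
import math


def f(x, y):
    # Circle and hyperbola of the example, written as a single vector function.
    return (x * x + y * y - 2.0, x * x - y * y - 1.0)


def fprime(x, y):
    # Derivative matrix of f, stored as a pair of rows.
    return ((2.0 * x, 2.0 * y), (2.0 * x, -2.0 * y))


def newton_step(x, y):
    (a, b), (c, d) = fprime(x, y)
    u, v = f(x, y)
    det = a * d - b * c
    if det == 0.0:
        raise ZeroDivisionError("derivative matrix is not invertible at this point")
    # Cramer's rule for the system f'(x) s = f(x); s equals [f'(x)]^{-1} f(x).
    sx = (u * d - b * v) / det
    sy = (a * v - c * u) / det
    return (x - sx, y - sy)


iterates = [(1.0, 1.0)]
for _ in range(5):
    iterates.append(newton_step(*iterates[-1]))

for k, p in enumerate(iterates):
    print(k, p)
print("exact:", (math.sqrt(1.5), math.sqrt(0.5)))
```

Starting instead from (-1, -1), (-1, 1), or (1, -1) should steer the same loop toward the solutions in the other quadrants, in line with the symmetry remark in the example.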

In choosing an initial approximation $x_0$, some care must be used in getting a
sufficiently close approximation. For instance, if in Example 3 we wanted the solution
in the first quadrant, then making too gross an error in choosing $x_0$ could lead to
approximating the wrong solution. In many examples a sketch or similar geometric
analysis of the function $f$ will show how $x_0$ should be chosen.
In using Newton's method in $\mathbb{R}^n$ for large $n$, it may be very time-consuming to
invert the matrix $f'(x_k)$ at each step of the iteration. In such cases we can use the

5.5 Modified Newton Method:

\[ x_{k+1} = x_k - [f'(x_0)]^{-1} f(x_k). \]

As with Newton's method we derive a sequence of approximations to a solution of
$f(x) = 0$. For $k = 0$, the formula defining $x_1$ is the same as the Newton formula.
For $k \ge 1$, $x_{k+1}$ as defined by Equation 5.5 will in general be different from the
corresponding value determined by the Newton formula in Equation 5.4, because the
matrix $[f'(x_0)]^{-1}$ remains the same at each step in Equation 5.5.

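EXAMPLE 4. Returning to the equation of Example 3, namely,

\[ \begin{pmatrix} x^2 + y^2 - 2 \\ x^2 - y^2 - 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \]

we apply Equation 5.5 with $x_0 = (1, 1)$. Then

\[ [f'(x_0)]^{-1} = \begin{pmatrix} 2 & 2 \\ 2 & -2 \end{pmatrix}^{-1} = \begin{pmatrix} \frac{1}{4} & \frac{1}{4} \\[1ex] \frac{1}{4} & -\frac{1}{4} \end{pmatrix}, \]

so

\[
\begin{aligned}
x - [f'(x_0)]^{-1} f(x)
&= \begin{pmatrix} x \\ y \end{pmatrix} - \begin{pmatrix} \frac{1}{4} & \frac{1}{4} \\[1ex] \frac{1}{4} & -\frac{1}{4} \end{pmatrix} \begin{pmatrix} x^2 + y^2 - 2 \\ x^2 - y^2 - 1 \end{pmatrix} \\
&= \begin{pmatrix} x \\ y \end{pmatrix} - \begin{pmatrix} \dfrac{2x^2 - 3}{4} \\[2ex] \dfrac{2y^2 - 1}{4} \end{pmatrix}
= \begin{pmatrix} \dfrac{-2x^2 + 4x + 3}{4} \\[2ex] \dfrac{-2y^2 + 4y + 1}{4} \end{pmatrix}.
\end{aligned}
\]

Then $x_1$, defined by $x_1 = x_0 - [f'(x_0)]^{-1} f(x_0)$, is

\[ x_1 = \begin{pmatrix} \dfrac{-2x_0^2 + 4x_0 + 3}{4} \\[2ex] \dfrac{-2y_0^2 + 4y_0 + 1}{4} \end{pmatrix} = \begin{pmatrix} 1.25 \\ 0.75 \end{pmatrix}. \]

In the next step

\[ x_2 = \begin{pmatrix} \dfrac{-2(1.25)^2 + 4(1.25) + 3}{4} \\[2ex] \dfrac{-2(0.75)^2 + 4(0.75) + 1}{4} \end{pmatrix} = \begin{pmatrix} 1.21875 \\ 0.71875 \end{pmatrix}. \]

Continuing, we arrive at

\[ x_{10} = \begin{pmatrix} 1.22474 \\ 0.707107 \end{pmatrix}, \]

which agrees with the result obtained in Example 3, though it takes more steps.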
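For comparison with the full Newton iteration, here is a hedged sketch of the modified method in which the inverse of the derivative matrix is computed once, at the starting point, and reused at every step. The helper names `inverse_2x2` and `modified_newton` are invented for this illustration, and the explicit 2-by-2 inverse formula is used only because the system is so small.

```python
import math


def f(x, y):
    return (x * x + y * y - 2.0, x * x - y * y - 1.0)


def fprime(x, y):
    return ((2.0 * x, 2.0 * y), (2.0 * x, -2.0 * y))


def inverse_2x2(m):
    # Explicit inverse of a 2-by-2 matrix; assumes a nonzero determinant.
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def modified_newton(x0, y0, steps):
    # The inverse is formed once, at the initial point, and never updated.
    inv = inverse_2x2(fprime(x0, y0))
    points = [(x0, y0)]
    for _ in range(steps):
        x, y = points[-1]
        u, v = f(x, y)
        points.append((x - (inv[0][0] * u + inv[0][1] * v),
                       y - (inv[1][0] * u + inv[1][1] * v)))
    return points


points = modified_newton(1.0, 1.0, 10)
print(points[1], points[2], points[10])
```

Printing more of the list shows the slower, roughly geometric shrinking of the error from step to step, in contrast with the rapid improvement of the full Newton iterates in Example 3.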

In deciding whether to use the Newton Formula 5.4 or its modification 5.5, note
that Formula 5.4 produces faster convergence than 5.5, that is, it achieves a smaller
error in a given number of steps; on the other hand Formula 5.5 has the advantage
that it requires calculation of the derivative matrix $f'$ and its inverse at only one
point. Thus if computing the inverse matrices $[f'(x_k)]^{-1}$ is going to be particularly
time-consuming, it may be worth taking the extra iteration steps that Equation 5.5
may require to achieve the desired accuracy.
Note. Java applets NEWTON, NEWT2, and NEWT3 are available at
http://math.dartmouth.edu/~rewn/ for implementing Newton's method.

EXERCISES

I. (a) Sketch the graph of f(x) = Vi-x for -2 < x < 2. (a) Sketch the curves satisfying each of the two
(b) Sketch the tangent lines to the graph of f at xo = equations.
¾, xo =-¾,and xo = -¼- (b) Defining f by
(c) For each of the three choices for xo in part (b), what
solution of f (x) = 0 can the Newton iteration be
expected to converge to?
f(x, v) =( x2 +y - I )
· x+_v2-2 '
(d) Discuss the choice xo = V'J/9 for an initial approx-
imation to a solution of f(x) = 0.
find f'(x, y), [f'(x, y)]- 1• and (x, y) = [f'(x, y)]- 1
2. To get some idea of how Newton's method can fail, f(x,y).
observe that the equation Vi = 0 has the unique solution (c) Using the sketch in part (a), choose an initial approx-
x = 0. Show that applying Newton's method to this imation xo = (xo, yo) to the solution off (x, y) = 0
equation with initial guess xo for the solution produces the that lies in the fourth quadrant of the xy-plane.
subsequent approximations x,, = (-2)11 xo. What happens (d) Compute XJ = (x1, YI) by Formula 5.1.
if xo =/:- 0 as n increases? [Strictly speaking, the method (e) Compute xs = (xs, y5).
doesn't apply if we start with xo = 0, since f' (0) fails to
exist.]
5. Let

3. (a) To approximate che solution of cosx - x = 0 by


Newton's method, show that when xo is chosen, then
fork::: 0,
Xk sin Xk + cos Xk and find f'(I, 2, -1). Taking xo = (I, 2, -1), apply the
.t}+I = .
I+ SlnXk modified Newton Formula 5.2 to approximate a solution
to f(x, y, z) = (2.1, 5.7, 8.2).
(b) Assuming xo = I find x4.
6. Let
4. Find approximate solutions to the pair of equations
g(u, v) =( u2 + uv2 )

u + v·3
x
2
+ y- l = 0
x+_v2-2=0 Noting that g(I, I)= (2, 2), use Newton's method 5.1 or
its modification 5.2 to approximate a solution to g(u, v) =
by following these steps: (1.9, 2. 1).

Chapter 5 REVIEW

State whether each of the following sets I to 8 is 2. All (x, y) in IR 2 such that lx l < I and IYI::: I
(a) open, (b) closed, or (c) neither. Also describe each
3. Al.I (x, y, z) in !R 1 such that x 2 + y2 + z2 s I and x < 0
set's (d) interior, and (e) boundary.
4. All (x. y, z) in IR 3 such that x > I , y > 0, and z > - I
I. The positive quadrant in JR1, that is, the set of points
{(x, y)lx > 0, y > O} 5. The xy-plane in JR 3

6. The set in R 2 consisting of all (x, y) with I(x, y) I < 1, 21. f(x, y, z, w) = xy + yz +zw +wx
along with the point (2, 2)
In Exercises 22 to 27, compute all second-order partial
7. The set of all (x, y, z) in R 3 such that x > 0, y > 1. and derivatives of the given function, including mixed partial
z> 3 derivatives such as a2 flax ay.
8. The set of vectors x in R4 such that Ix - (1, 1, 1, 1) I ::::: 2 22. f(x, y) = x3 - y3
9. Consider the set S in R 3 consisting of the points (x, y, z) 23. f(x, y) = e'" siny
such that x 2 + y2 + z 2 ::::: 1 when y ::::: 0, and such that
x 2 + y2 + z 2 < 1 when y > 0. Describe (a) the interior of 24. f(x, y) = x/(x 2 + y 2)
S and (b) the boundary of S. What is the smallest closed 25. f(x, y, z) = yzex
set containing S?
26. f(x, y, z) = cos(x + y + z)
In Exercises 10 to 15, suppose the function IR ~ JR is 21. f(x,y,z)=x4+y3+z2
defined by
In Exercises 28 to 33, find the derivative matrix for the
if X:::: 0,
H(x) = 11, given function.
0, if X < 0. 28. f(x, y) = (x + y, x 2 + y2)
29. f (u, v) = (u, v, u + v, u - 1)
What points of discontinuity, if any, does the composi-
tion H(/ (x, y)) have in its domain of definition, given 30. f(t) = (t, t 2 , t 3 )
each of the following definitions for f (x, y)? 31. f(x, y) = (½x 2 - ½i, xy)
10. f(x, y) = x 2 + y2 32. f(x, y,z) = (x + y, y + z, z +x)
11. f(x,y,z)=x 2 +y2+z 2 33. f(u, v, w) = (uv, vw, wu, uvw)
12. f(x, y) = + y2)
ln(x 2 34. Assuming lul = 1, find the rate of increase aj/au(l, 2, 3)
of f(x, y, z) = xyz in the direction of the vector (100,
13. f (x, y) = (x - y)/(1 + (x - y)2)
200, 500).
14. f(x, y) = Y 35. Let f(x, y) = xy2 and let u = (cos 0, sin0).
15. f(x, y) = 1/(1 +x 2 + y 2) (a) Compute (n//au)(-1, 2) in terms of 0.
In Exercises 16 to 21, compute 'i/f(x) at a general point (b) In what direction does this function increase most
x in the domain of f. rapidly from the value f ( - l, 2)?
*36. Let two fixed unit vectors in IR2 satisfy u1 # ±u2, Sup-
16. f (x, y) = e'" sin) . . 1 denvattves
pose the d1recttona . . -af (x) and -,
af- (x ) are
17. f(x, y) = (x 2 + .:,,2)- 1 au1 au2
18. f(x, y, z) = xy + xyz 2 continuous functions of x on all of IR 2 • Is f continuously
differentiable? Does your answer change if the unit vec-
19. f(x,y,z)=x-y tors u1 # ±u2 vary continuously from one point x to
20. f(x, y, z) = x/() 2 + z2) another?
CHAPTER 6

VECTOR DIFFERENTIAL
CALCULUS

The previous chapter stresses the fundamentals of multivariable differentiation
together with some geometric interpretations. The present chapter has to do with
extended techniques and interpretations that have proved to be particularly useful in
applications. Each section contains a mixture of the geometric settings introduced in
Chapter 4.

SECTION 1 GRADIENT FIELDS


1A Basic Properties
If $f$ is a differentiable real-valued function $\mathbb{R}^n \xrightarrow{f} \mathbb{R}$, then we saw in Section 2
of Chapter 5 that we could calculate the function $\nabla f$ in rectangular coordinates
$(x_1, x_2, \ldots, x_n)$ using

\[ \nabla f(x) = \left( \frac{\partial f}{\partial x_1}(x), \ldots, \frac{\partial f}{\partial x_n}(x) \right). \]

In Chapter 5 we concentrated on isolated values of the gradient of a real-valued
function $f(x)$ and paid relatively little attention to it in a context in which it was
important to regard it as a function from $\mathbb{R}^n$ to $\mathbb{R}^n$. As a function of $x$, the image
of $\nabla f(x)$ is most often pictured as a vector field, which we'll define here. We'll
then refer to a vector field generated by $\nabla f$ for some differentiable function $f$
as a gradient field. (In Chapter 12 we'll study vector fields that aren't necessarily
gradient fields.)

Plotting a vector field $F(x, y) = (f(x, y), g(x, y))$ in $\mathbb{R}^2$ is
in principle a very simple matter. Associated with each point $(x, y)$ in the domain
of $F$ is an arrow from $(0, 0)$ to the point $F(x, y)$. We translate this arrow parallel
to itself so that it starts at $(x, y)$ and ends at $(x, y) + F(x, y)$. Carrying out
this routine for a suitably chosen selection of points in the plane will produce a
sketch of the vector field. Carrying out this procedure by hand for a large number
of points is extremely tedious for all but the simplest vector fields, and computer
plotting is the preferred alternative, as described in Subsection 1D. Figure 6.1
shows some sketches that include a few such arrows in $\mathbb{R}^2$ and $\mathbb{R}^3$. To make
sense of these pictures physically, you can imagine that the direction and length
of an arrow $\nabla f(x)$ are respectively the direction and speed of a fluid flow at the
point $x$ to which the arrow is attached. (See Subsection 1D for a discussion of
flow lines.)

EXAMPLE 1. The real-valued function $f(x, y) = \frac{1}{2}x^2 + y$ is differentiable in all of $\mathbb{R}^2$. Therefore
$\nabla f(x, y) = (\partial f/\partial x(x, y), \partial f/\partial y(x, y))$ is also defined on $\mathbb{R}^2$ and necessarily takes
its values in $\mathbb{R}^2$. We have

\[ \nabla f(x, y) = (x, 1). \]

Figure 6.1(a) shows a sketch with just 20 vectors of the vector field. For instance,

\[ \nabla f(1, 0) = (1, 1), \quad \nabla f(-1, 0) = (-1, 1), \quad \nabla f(0, 2) = (0, 1). \]

Notice that the length $|\nabla f(x, y)| = \sqrt{x^2 + 1}$ is independent of $y$, but increases as $|x|$
does. Also, the arrows starting from a single vertical line, with the same $x$-coordinate,
are parallel, with the same length because, as we remarked, $\nabla f(x, y)$ is independent
of $y$ in this example.

FIGURE 6.1 (a), (b)

EXAMPLE 2. The function $g(x, y, z) = \frac{1}{4}(x^2 + y^2 + z^2)$ has a gradient in $\mathbb{R}^3$,

\[ \nabla g(x, y, z) = \left( \tfrac{1}{2}x, \tfrac{1}{2}y, \tfrac{1}{2}z \right). \]

The direction of the field is directly away from the origin at each point, as shown in
Figure 6.1(b).

The gradient of a function is important for several reasons, one of which is that
it appears often in applications, for example to the concept of energy in Chapter 9,
Section 2. Also, we proved in Chapter 5, Section 3, Theorem 3.1 that we can write
the directional derivative of a real-valued function $f$ with respect to a unit vector $u$
in terms of the gradient of $f$. Thus if $f$ is differentiable, then

1.1 \[ \frac{\partial f}{\partial u}(x) = \nabla f(x) \cdot u. \]
The following theorem is the origin of the mathematical use of the term gradient.
The term appears in several other areas, for example road construction, where it
refers to the slope of a road.

1.2 Theorem. Let $\mathbb{R}^n \xrightarrow{f} \mathbb{R}$ be differentiable in an open set $D$ in $\mathbb{R}^n$. Then at
each point $x$ in $D$ for which $\nabla f(x) \ne 0$, the vector $\nabla f(x)$ points in the direction of
maximum increase for $f$. The number $|\nabla f(x)|$ is the maximum rate of increase of
$f$ at $x$.

Proof. Recall that the dot product of two vectors equals the product of their lengths
multiplied by the cosine of the angle between them. Given a unit vector $u$, we have,
by Equation 1.1 and the assumption that $|u| = 1$,

\[ \frac{\partial f}{\partial u}(x) = \nabla f(x) \cdot u = |\nabla f(x)||u| \cos\theta = |\nabla f(x)| \cos\theta, \]

where $\theta$ is the angle between $u$ and $\nabla f(x)$. Hence the directional derivative assumes
its maximum value when $\cos\theta = 1$ and $\theta = 0$, that is to say when $u$ has the same
direction as $\nabla f(x)$. Thus $\nabla f(x)$ points in the direction of maximum increase of $f$
at the point $x$, and $|\nabla f(x)|$ is the maximum rate of increase there. •

Figure 6.2(a) shows the graph of a function $\mathbb{R}^2 \xrightarrow{f} \mathbb{R}$ along with some level
curves and gradient arrows in the domain of $f$, shown in Figure 6.2(b). The arrows
represent unit vectors in the direction of maximum increase at their tails.

FIGURE 6.2 (a), (b)

Remark. If the maximum rate of increase of $f$ at $x_0$ is positive in the direction $u$,
then the maximum rate of decrease has the opposite direction, namely the direction
$-u$ of $-\nabla f(x_0)$, but the same absolute magnitude. The reason is that

\[ \frac{\partial f}{\partial(-u)}(x_0) = \nabla f(x_0) \cdot (-u) = -\nabla f(x_0) \cdot u = -\frac{\partial f}{\partial u}(x_0). \]

EXAMPLE 3
Let $f(x, y) = e^{xy}$. Then $\nabla f(x, y) = (y e^{xy}, x e^{xy})$; thus at $(1, 2)$ the function $f$ increases most rapidly in the direction $\nabla f(1, 2) = (2e^2, e^2)$, which has the same direction as the unit vector $(2/\sqrt{5}, 1/\sqrt{5})$. The rate of increase in this direction is $|\nabla f(1, 2)| = \sqrt{5}\,e^2$. Similarly,

\[ \nabla f(-1, 2) = (2e^{-2}, -e^{-2}) \]

and has direction $(2/\sqrt{5}, -1/\sqrt{5})$, with maximum rate of increase at $(-1, 2)$ equal to $\sqrt{5}\,e^{-2}$. The maximum rate of decrease occurs in the opposite direction.
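The computation in this example can be checked numerically. The following is only a sketch: the helper names (`grad_f`, `unit`, `directional_derivative`) and the step size `h` are illustrative choices, not part of the text. It compares the norm of the gradient at $(1, 2)$ with a central-difference estimate of the directional derivative along the unit gradient direction.

```python
import math

def f(x, y):
    # The function of the example: f(x, y) = exp(x*y).
    return math.exp(x * y)

def grad_f(x, y):
    # Gradient computed by hand: (y e^{xy}, x e^{xy}).
    e = math.exp(x * y)
    return (y * e, x * e)

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def unit(v):
    n = norm(v)
    return tuple(c / n for c in v)

def directional_derivative(func, p, u, h=1e-6):
    # Central-difference estimate of the derivative of func at p along u.
    fwd = func(p[0] + h * u[0], p[1] + h * u[1])
    bwd = func(p[0] - h * u[0], p[1] - h * u[1])
    return (fwd - bwd) / (2 * h)

g = grad_f(1.0, 2.0)
u_max = unit(g)        # direction of maximum increase at (1, 2)
rate_max = norm(g)     # maximum rate of increase at (1, 2)
estimate = directional_derivative(f, (1.0, 2.0), u_max)
```

Under these assumptions `u_max` should agree with $(2/\sqrt{5}, 1/\sqrt{5})$ and `rate_max` with $\sqrt{5}\,e^2$, while `estimate` should match `rate_max` up to the finite-difference error.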

1B Chain Rule
Next we'll prove a chain rule for differentiating the composition $g(f(t))$ of a function $f: \mathbb{R} \to \mathbb{R}^n$ and a function $g: \mathbb{R}^n \to \mathbb{R}$. For example, if $f$ and $g$ are

\[ f(t) = (t, t^2, t) \quad\text{and}\quad g(x, y, z) = x \cos(y + z), \]

then

\[ g(f(t)) = t \cos(t^2 + t). \]


In this example the composition defines a new function from $\mathbb{R}$ to $\mathbb{R}$. For example, with $g(x, y, z)$ denoting temperature at a point $(x, y, z)$ in a region $D$ of $\mathbb{R}^3$ and $f$ describing the motion of a point along a path lying in $D$, we may be interested in finding the rate of change of temperature with respect to $t$ along the path. The following theorem gives a formula for doing this in terms of the gradient of $g$ and the vector derivative of $f$.
1.3 Theorem. Let $g$ be real-valued and continuously differentiable on an open set $D$ in $\mathbb{R}^n$ and let $f(t)$ be defined and differentiable for $a < t < b$, taking its values in $D$. Then the composite function $F(t) = g(f(t))$ is differentiable for $a < t < b$ and

\[ F'(t) = \nabla g(f(t)) \cdot f'(t). \]


Proof. By definition,

\[ F'(t) = \lim_{h \to 0} \frac{F(t + h) - F(t)}{h} = \lim_{h \to 0} \frac{g(f(t + h)) - g(f(t))}{h}, \]

if the limit exists. Since $f$ is differentiable, it is continuous. Then we can choose $\delta > 0$ such that, whenever $|h| < \delta$, $f(t + h)$ is always inside an open ball centered at $f(t)$ and contained in $D$. We now apply the Mean-Value Theorem 3.2 of Section 3, Chapter 5 to $g$, getting

\[ g(\mathbf{y}) - g(\mathbf{x}) = \nabla g(\mathbf{x}_0) \cdot (\mathbf{y} - \mathbf{x}), \]

where $\mathbf{x}_0$ is some point on the segment joining $\mathbf{y}$ and $\mathbf{x}$. Letting $\mathbf{x} = f(t)$ and $\mathbf{y} = f(t + h)$, with $|h| < \delta$, we have

\[ \frac{F(t + h) - F(t)}{h} = \nabla g(\mathbf{x}_0) \cdot \frac{f(t + h) - f(t)}{h}. \]

The vector $\mathbf{x}_0$ is now some point on the segment joining $f(t)$ and $f(t + h)$. (Note that $\mathbf{x}_0$ is in the domain $D$ of $g$ because it lies on a radius of a ball contained in $D$.) Since $g$ was assumed continuously differentiable, $\nabla g(\mathbf{x})$ is continuous, and so $\nabla g(\mathbf{x}_0)$ tends to $\nabla g(f(t))$ as $h$ tends to zero. The dot product is continuous, so $F'(t)$ exists, with

\[ F'(t) = \lim_{h \to 0} \nabla g(\mathbf{x}_0) \cdot \frac{f(t + h) - f(t)}{h} = \nabla g(f(t)) \cdot f'(t). \] ■
Remark. Theorem 1.3 is true under the weaker assumption that g is differen-
tiable. The previous proof avoids some technical work needed to prove the stronger
theorem.
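As an informal check of Theorem 1.3, the sketch below applies the formula $F'(t) = \nabla g(f(t)) \cdot f'(t)$ to the functions $f(t) = (t, t^2, t)$ and $g(x, y, z) = x\cos(y + z)$ used earlier in this subsection and compares the result with a central-difference estimate. The function names and the step size are illustrative assumptions.

```python
import math

def f(t):
    # Path f(t) = (t, t^2, t).
    return (t, t * t, t)

def f_prime(t):
    # Vector derivative of f.
    return (1.0, 2.0 * t, 1.0)

def g(x, y, z):
    return x * math.cos(y + z)

def grad_g(x, y, z):
    # Gradient of g computed by hand.
    s = math.sin(y + z)
    return (math.cos(y + z), -x * s, -x * s)

def F(t):
    # Composite function F(t) = g(f(t)) = t cos(t^2 + t).
    return g(*f(t))

def F_prime_chain(t):
    # Chain rule of Theorem 1.3: gradient of g at f(t), dotted with f'(t).
    return sum(a * b for a, b in zip(grad_g(*f(t)), f_prime(t)))

def F_prime_numeric(t, h=1e-6):
    # Central-difference estimate of F'(t).
    return (F(t + h) - F(t - h)) / (2 * h)

def F_prime_direct(t):
    # Derivative of t cos(t^2 + t) by the one-variable rules.
    return math.cos(t * t + t) - t * (2 * t + 1) * math.sin(t * t + t)
```

For any sample value of `t`, the three functions `F_prime_chain`, `F_prime_direct`, and `F_prime_numeric` should agree up to rounding and finite-difference error.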

EXAMPLE 4
Let $g(x, y) = x^2 y + x y^3$ for $(x, y)$ in $\mathbb{R}^2$. Let $f(t)$ be differentiable in some neighborhood of $t = t_0$ and take its values in $\mathbb{R}^2$. If it is known only that $f(t_0) = (-1, 1)$ and $f'(t_0) = (2, 3)$, then the composition, $F(t) = g(f(t))$, is known only at $t = t_0$, and $F'(t_0)$ cannot be computed by direct differentiation. However, by the previous theorem we have

\[ F'(t_0) = \nabla g(f(t_0)) \cdot f'(t_0). \]

We find that $\nabla g(x, y) = (2xy + y^3, x^2 + 3xy^2)$ so $\nabla g(f(t_0)) = (-1, -2)$. Then

\[ F'(t_0) = (-1, -2) \cdot (2, 3) = -8. \]
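The arithmetic above is easy to reproduce. In the sketch below the gradient of $g$ is evaluated at $f(t_0) = (-1, 1)$ and dotted with $f'(t_0) = (2, 3)$. As a cross-check, one hypothetical path consistent with the given data, $f(t) = (-1 + 2t, 1 + 3t)$ with $t_0 = 0$, is differentiated numerically; that particular path is an assumption made only for illustration, since the example specifies $f$ at a single point.

```python
def g(x, y):
    return x ** 2 * y + x * y ** 3

def grad_g(x, y):
    # Gradient of g computed by hand.
    return (2 * x * y + y ** 3, x ** 2 + 3 * x * y ** 2)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

f_t0 = (-1, 1)        # given value f(t0)
f_prime_t0 = (2, 3)   # given derivative f'(t0)
F_prime_t0 = dot(grad_g(*f_t0), f_prime_t0)

def f_sample(t):
    # Hypothetical path with f(0) = (-1, 1) and f'(0) = (2, 3).
    return (-1 + 2 * t, 1 + 3 * t)

def F_sample_numeric(h=1e-6):
    # Central-difference estimate of the derivative of g(f_sample(t)) at t = 0.
    return (g(*f_sample(h)) - g(*f_sample(-h))) / (2 * h)
```

Any other differentiable path with the same value and derivative at $t_0$ would give the same derivative of the composition, which is the point of the example.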

We'll extend Theorem 1.3 to vector-valued functions F in Section 2.


1C Normal Vectors
We'll now see how gradient fields are connected with the level sets of real-valued functions. Level sets were defined and illustrated in Section 2B of Chapter 4. Recall that the level set $S$ at level $k$ of $f$ is the set of points $\mathbf{x}$ in the domain of $f$ such that $f(\mathbf{x}) = k$. For $f: \mathbb{R}^2 \to \mathbb{R}$ we're usually interested in $S$ when $S$ is a curve, and for $f: \mathbb{R}^3 \to \mathbb{R}$ when $S$ is a surface. These two cases are illustrated in Figure 6.3.
One possible approach to finding a tangent to a level set $S$ of a function $f$ is to first parametrize $S$ and then use the parametrization to determine the tangent as in Sections 1 and 4 of Chapter 4. However we can also proceed more directly as follows. Define a normal vector to $S$ at a point $\mathbf{x}_0$ on $S$ to be a vector $\mathbf{n} \neq 0$ that is perpendicular to every smooth curve on $S$ that passes through $\mathbf{x}_0$. If such a vector $\mathbf{n}$ exists, it's then natural to define the tangent to $S$ at $\mathbf{x}_0$ to be the unique line or plane that contains $\mathbf{x}_0$ and is perpendicular to $\mathbf{n}$. We show below that under appropriate hypotheses on $f$ we can take $\mathbf{n}$ to be the gradient vector $\nabla f(\mathbf{x}_0)$. These ideas are illustrated in Figure 6.3(a) for level curves and Figure 6.3(b) for level surfaces. The precise statement follows.
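1.4 Theorem. Let $f: \mathbb{R}^n \to \mathbb{R}$, $n \geq 2$, be continuously differentiable at $\mathbf{x}_0$, and let $S$ be a level set of $f$ containing $\mathbf{x}_0$. If $\nabla f(\mathbf{x}_0) \neq 0$, then $\nabla f(\mathbf{x}_0)$ is a normal vector to $S$ at $\mathbf{x}_0$, and all points $\mathbf{x}$ in a tangent line or plane to $S$ at $\mathbf{x}_0$ satisfy the equation

1.5    \[ \nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0) = 0. \]

Proof. Suppose $g(t)$ parametrizes some smooth curve $\gamma$ on $S$ and that $g(t_0) = \mathbf{x}_0$. We need to know that $g'(t_0)$ is perpendicular to $\nabla f(\mathbf{x}_0)$, that is $\nabla f(\mathbf{x}_0) \cdot g'(t_0) = 0$. To see this apply the chain rule to the function $h(t) = f(g(t))$ to get

\[ h'(t_0) = \nabla f(g(t_0)) \cdot g'(t_0) = \nabla f(\mathbf{x}_0) \cdot g'(t_0). \]

Since $\gamma$ lies on a level set of $f$ at some level $k$, the function $h(t)$ is constantly equal to $k$. Consequently, $h'(t_0) = 0$, so $\nabla f(\mathbf{x}_0)$ is perpendicular to the tangent vector $g'(t_0)$. If $\mathbf{x}$ is a point on a tangent to $S$ at $\mathbf{x}_0$, then the vector $\mathbf{x} - \mathbf{x}_0$ is parallel to the tangent vector $g'(t_0)$. It follows that $\mathbf{x} - \mathbf{x}_0$ also is perpendicular to $\nabla f(\mathbf{x}_0)$, so Equation 1.5 holds. ■

[FIGURE 6.3: (a) a level curve with a normal vector; (b) a level surface $f = k_0$ with a normal vector.]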

EXAMPLE 5
The function $f(x, y) = x^3 + y^2$ has a level curve passing through the point $(1, 2)$ and having the equation $x^3 + y^2 = 5$. Then $\nabla f(1, 2) = (3(1)^2, 2(2)) = (3, 4)$. According to Theorem 1.4 the tangent line to this curve at $(x, y) = (1, 2)$ has equation

\[ (3, 4) \cdot (x - 1, y - 2) = 0, \quad\text{or}\quad 3(x - 1) + 4(y - 2) = 0, \quad\text{or}\quad 3x + 4y = 11. \]

EXAMPLE 6
The function $f(x, y, z) = x^2 + y^2 - z^2$ has for one of its level surfaces a cone $C$ consisting of all points satisfying $x^2 + y^2 - z^2 = 0$. The point $\mathbf{x}_0 = (1, 1, \sqrt{2})$ lies on $C$, and to find the tangent plane to $C$ at $\mathbf{x}_0$ we compute $\nabla f(\mathbf{x}_0) = (2, 2, -2\sqrt{2})$. Then

\[ \nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0) = (2, 2, -2\sqrt{2}) \cdot (x - 1, y - 1, z - \sqrt{2}), \]

and according to Theorem 1.4, the tangent plane is given by $(x - 1) + (y - 1) - \sqrt{2}(z - \sqrt{2}) = 0$, or $x + y - \sqrt{2}\,z = 0$. This plane is shown in Figure 6.4(a). Notice that both $C$ and its tangent contain a common line with direction $(1, 1, \sqrt{2})$, and the normal vector to the tangent is perpendicular to that line.

[FIGURE 6.4: (a) the cone $C$ with its tangent plane; (b) the graph of a function $f(x, y)$ with a path of steepest ascent; (c) level curves with perpendicular vectors.]
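The claims about the cone can be verified with a few lines of arithmetic. This is a hedged sketch in which `plane_residual` and the sample parameter values are illustrative names and choices; it checks that points on the line through the origin with direction $(1, 1, \sqrt{2})$ lie both on the cone and on the tangent plane, and that the gradient at $\mathbf{x}_0$ is perpendicular to that direction.

```python
import math

def f(x, y, z):
    # The cone is the level set f = 0.
    return x * x + y * y - z * z

def grad_f(x, y, z):
    return (2 * x, 2 * y, -2 * z)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

x0 = (1.0, 1.0, math.sqrt(2.0))
normal = grad_f(*x0)                  # (2, 2, -2*sqrt(2))
direction = (1.0, 1.0, math.sqrt(2.0))

def plane_residual(p):
    # Left side of Equation 1.5: grad f(x0) . (p - x0); zero on the tangent plane.
    return dot(normal, tuple(pi - xi for pi, xi in zip(p, x0)))

# Sample points s * (1, 1, sqrt(2)) on the common line.
line_points = [tuple(s * d for d in direction) for s in (-2.0, 0.0, 0.5, 3.0)]
```

Each point in `line_points` should give values of `f` and `plane_residual` that vanish up to rounding.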

Putting together Theorems 1.2 and 1.4, we get the following theorem.

1.6 Theorem. The direction of maximum increase of a differentiable function $f$ at $\mathbf{x}_0$ is perpendicular to the level set of $f$ containing $\mathbf{x}_0$, assuming $\nabla f(\mathbf{x}_0) \neq 0$.

Proof. The reason is that $\nabla f(\mathbf{x}_0)$ is the direction of maximum increase of $f$ at $\mathbf{x}_0$ and at the same time is perpendicular to the level set through $\mathbf{x}_0$ determined by $f(\mathbf{x}) = k$. ■

Figure 6.4(b) shows the graph of a differentiable function $f(x, y)$, and Figure 6.4(c) shows some level curves with perpendicular vectors. The curve running from bottom to top on the graph in Figure 6.4(b) has the property that the horizontal component of a tangent vector at $(x, y, f(x, y))$ always points in the direction of maximum increase for $f$, that is, in the direction of $\nabla f(x, y)$. Such a path is called a path of steepest ascent, and in Figure 6.4(b) this path appears to lead directly to the point at which the maximum value of $f$ is attained; indeed such paths are sometimes used to locate maxima in practice. See Section 4F.
EXERCISES

In Exercises 1 to 6 find $\nabla f(\mathbf{x})$ for each of the following functions at a general point $\mathbf{x}$ in the domain of $f$.
1. $f(x, y) = x^2 - y^2$.
2. $f(x, y) = x^2 - y^2 - \sin xy$.
3. $f(x, y) = x + 2y$.
4. $f(x, y, z) = (x - y)z$.
5. $f(x, y, z) = x + y - z^2$.
6. $f(\mathbf{x}) = |\mathbf{x}|^2$, for $\mathbf{x}$ in $\mathbb{R}^n$.

In Exercises 7 to 10 find the gradient of each of the following functions at the indicated point $\mathbf{x}_0$. Then find the unit vector $\mathbf{u}$ that points in the direction of maximum increase of the function at $\mathbf{x}_0$, and also find the rate of maximum increase at $\mathbf{x}_0$.
7. $f(x, y) = x^2 - y^3$ at $(x_0, y_0) = (1, 1)$
8. $g(x, y) = xy^2$ at $(x_0, y_0) = (-1, 2)$
9. $h(x, y, z) = xy \sin z$ at $(x_0, y_0, z_0) = (1, 2, \pi)$
10. $p(x, y, z, w) = (x^2 + y^2 + z^2 + w^2)^{1/2}$ at $(x_0, y_0, z_0, w_0) = (1, 1, 1, 2)$

In Exercises 11 to 14, sketch the vector fields described by the functions from $\mathbb{R}^2$ to $\mathbb{R}^2$ or $\mathbb{R}^3$ to $\mathbb{R}^3$. To do this pick a few points $\mathbf{x}$ in the indicated domain and draw the arrow for $F(\mathbf{x})$ with its tail located at the point $\mathbf{x}$.
11. $F(x, y) = (1, x)$ for $-1 \leq x \leq 2$, $0 \leq y \leq 2$
12. $F(x, y) = (-y, x)$ for $x^2 + y^2 \leq 4$
13. $F(x, y) = (y, x)$ for $x^2 + y^2 \leq 4$
14. F(x, y, z) = -f (x, y, z) for $x^2 + y^2 + z^2 \leq 4$

In Exercises 15 to 18, first compute $\nabla f$, and then sketch the vector field $F = \nabla f$.
15. $f(x, y) = xy + y^2$
16. $f(x, y, z) = x^2 + y^2 + z^2$
17. $f(x, y, z) = x^2 + y^2$
18. $f(x, y, z) = x - y + z$

In Exercises 19 to 24, find, if possible, a normal vector and the tangent line or plane to each of the following level curves or surfaces at the indicated points.
19. $x^2 + y^2 - z^2 = 2$ at $(x, y, z) = (1, 1, 0)$
20. $x \sin y = 0$ at $(x, y) = (0, \pi/2)$ and at $(x, y) = (0, 0)$
21. $|\mathbf{x}| = 1$ at $\mathbf{x} = \mathbf{e}_1$, the first natural basis vector in $\mathbb{R}^n$
22. $x^2 y + yz + w = 3$ at $(x, y, z, w) = (1, 1, 1, 1)$
23. $xyz = 1$ at $(x, y, z) = (1, 1, 1)$
24. $xyz = 0$ at $(x, y, z) = (1, 2, 0)$

25. If $f: \mathbb{R}^2 \to \mathbb{R}$ is continuously differentiable, its graph is defined implicitly in $\mathbb{R}^3$ as the level surface $S$ of the function $F(x, y, z) = z - f(x, y)$ given by $F(x, y, z) = 0$.
(a) Show that $\nabla F = (-\partial f/\partial x, -\partial f/\partial y, 1)$, which is never the zero vector.
(b) Find a normal vector and the tangent plane to the graph of $f(x, y) = xy + ye^x$ at $(x, y) = (1, 1)$.

26. (a) The function $f(x, y) = x^2 + y^2$ has $\nabla f(0, 0) = (0, 0)$, which fails to indicate that there is a direction of maximum increase for $f$ at $(x, y) = (0, 0)$. Is this reasonable? What happens at $(0, 0)$?
(b) Are there directions of maximum increase for $f(x, y) = xy$ and $g(x, y) = x^2 - y^2$ at $(x, y) = (0, 0)$? Does Theorem 1.2 apply?

27. If $g(x, y) = e^{x+y}$ and $f'(0) = (1, 2)$, use the chain rule to find $F'(0)$, where $F(t) = g(f(t))$ and $f(0) = (1, -1)$.

28. Let $\gamma$ be a curve in $\mathbb{R}^3$ being traversed at time $t = 1$ with speed 2 and in the direction of $(1, -1, 2)$. If $t = 1$ corresponds to the point $(1, 1, 1)$ on $\gamma$, find the rate of change of the function $x + y + z + xyz$ along $\gamma$ at $t = 1$.

29. If $f(x, y, z) = \sin x$ and $F(t) = (\cos t, \sin t, t)$, find $g'(\pi)$, where $g(t) = f(F(t))$.

30. Let $F: \mathbb{R} \to \mathbb{R}^n$ be differentiable. Let $f: \mathbb{R}^n \to \mathbb{R}$ be continuously differentiable, and such that the composition $g(t) = f(F(t))$ exists. If $F'(t_0)$ is tangent to the level surface of $f$ at $F(t_0)$, show that $g'(t_0) = 0$.

31. A spaceship is traveling in $\mathbb{R}^2$ along a path such that at time $t \geq 0$ the ship is at $g(t) = (3t^2, t^3)$. The intensity of gamma radiation at the point $(x, y)$ in $\mathbb{R}^2$ is $I(x, y) = x^2 - y^2$ wherever $I(x, y) \geq 0$. Describe fully, using a labeled sketch where appropriate, the following:
(a) The level curve of $I$ that the ship is on at $t = 1$.
(b) The path of the ship for $t \geq 0$.
(c) The gradient vector of $I$ at the ship's position when $t = 1$.
(d) The ship's velocity vector at $t = 1$.
(e) The time, if there is one, when the ship stops increasing its radiation risk and begins its race to safety. Does its course become more dangerous later on?

32. If $T(x, y, z)$ represents the temperature at a point $(x, y, z)$ of a region $R$ in $\mathbb{R}^3$, the vector field $\nabla T$ is called the temperature gradient. Under certain physical assumptions $\nabla T(x, y, z)$ is negatively proportional to the vector that represents the direction and rate per unit of area of heat flow at $(x, y, z)$. The sets on which $T$ is constant are called isotherms. If the isotherms of a temperature function are concentric spheres, prove that the temperature gradient points either toward or away from the center of the spheres.

33. Show that the vector field defined on $\mathbb{R}^2$ by $F(x, y) = (-y, x)$ is not of the form $\nabla f(x, y)$ for a function $f$. [Hint: Suppose that $\partial f/\partial x\,(x, y) = -y$ and $\partial f/\partial y\,(x, y) = x$. Then differentiate the first equation with respect to $y$ and the second with respect to $x$.]

For each of the following fields 34 to 37, find an $f$ such that $\nabla f = F$. The previous exercise established, by specific example, that some vector fields are not gradient fields. The distinction between gradient fields and nongradient fields is investigated in some detail in Section 2 of Chapter 9. But when a vector field $F$ is a gradient field, a little guesswork based on experience with indefinite integrals will sometimes yield a real-valued function $f$ such that $\nabla f = F$. For example, if $F(x, y) = (x, y)$, a little thought leads to the guess that $\nabla(x^2/2 + y^2/2) = F(x, y)$. It follows from Theorem 3.4 of the previous chapter that every two solutions to the problem differ by at most an additive constant.
34. $F(x, y) = (y, x)$
35. $F(x, y, z) = (x, y, z)$
36. $F(x, y) = (e^{x+y}, e^{x+y})$
37. $F(x, y) = (x^2, y^2)$

38. The level surfaces of a function $f: \mathbb{R}^n \to \mathbb{R}$ are called the equipotential surfaces of the vector field $\nabla f$, and $f$ is called the potential function of the field.
(a) Show that the equipotential surfaces are perpendicular to the field.
(b) Find the equipotential surfaces of the field $\nabla f(x, y, z) = (x, y, z)$.
(c) Find the field of which $f(x, y, z) = (x^2 + y^2 + z^2)^{-1/2}$, the Newtonian potential, is the potential function.
(d) Find the field of which $f(x, y) = -\tfrac{1}{2}\log(x^2 + y^2)$, the logarithmic potential, is the potential function.
(e) Show that the generalized Newtonian potential $f(\mathbf{x}) = |\mathbf{x}|^{2-n}$ in $\mathbb{R}^n$, $n \geq 3$, satisfies $\nabla f(\mathbf{x}) = (2 - n)|\mathbf{x}|^{-n}\mathbf{x}$.

39. The vector equation of motion for the position $\mathbf{x}(t)$ at time $t$ of a single planet relative to a star fixed at the origin has the form $\ddot{\mathbf{x}} = -k|\mathbf{x}|^{-3}\mathbf{x}$, where $k$ is a positive constant depending on the gravitational constant and the masses of the two bodies. See Equations 3.2 in Chapter 12, Section 3.
(a) Show that the magnitude of the acceleration vector obeys an inverse-square law: $|\ddot{\mathbf{x}}| = k/|\mathbf{x}|^2$.
(b) Show that the vector equation is equivalent to a pair of equations, where $\mathbf{x} = (x, y)$:
\[ \ddot{x} = -kx(x^2 + y^2)^{-3/2}, \qquad \ddot{y} = -ky(x^2 + y^2)^{-3/2}. \]
(c) Show that the vector field $F(x, y) = (-kx(x^2 + y^2)^{-3/2}, -ky(x^2 + y^2)^{-3/2})$ is equal to $\nabla f(x, y)$, where $f(x, y) = k(x^2 + y^2)^{-1/2}$.

1D Plotting Vector Fields; Flow Lines


As described in Subsection 1A, a vector field is often pictured by drawing an arrow representing $\nabla f(\mathbf{x})$, but shifted to a parallel arrow with its tail at $\mathbf{x}$ instead of at the origin. When plotting by computer the initial points for the arrows are located at the points of a rectangular lattice spaced $s$ units apart. A good routine puts arrow points at the tips of the arrows and little dots at the initial points. The basic routine is as follows for $(f(x, y), g(x, y)) = (\tfrac{1}{10}(x + y), \tfrac{1}{10}(x - y))$ in a rectangle $x_1 < x < x_2$, $y_1 < y < y_2$.

    DEFINE f(x,y) = (1/10)(x + y)
    DEFINE g(x,y) = (1/10)(x - y)
    FOR x = x1 TO x2 STEP s
        FOR y = y1 TO y2 STEP s
            PLOT ARROW: (x,y) TO (x + f(x,y), y + g(x,y))
        NEXT y
    NEXT x

[FIGURE 6.5: the vector field $(\tfrac{1}{10}(x + y), \tfrac{1}{10}(x - y))$ for $-3 < x < 3$, $-3 < y < 3$.]

Using such a routine produces Figure 6.5 for $-3 < x < 3$, $-3 < y < 3$. Note particularly the factor $\tfrac{1}{10}$ in the vector field $(\tfrac{1}{10}(x + y), \tfrac{1}{10}(x - y))$ sketched in Figure 6.5. Without that factor the arrows in the graphic representation would be 10 times as long, and the resulting overlap of arrows would lead to a confusing picture. (Try it, leaving out the scale factor!) Thus we often scale the vector lengths in a field, either down or up, to get a better picture, bearing in mind what specific scaling has taken place. What we're often interested in anyway is the relative strength of a field as it varies from point to point, and that information comes across very well in a properly scaled sketch.
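The lattice routine described above might be sketched in Python as follows. This version only computes the arrow segments, as (tail, tip) pairs, and leaves the drawing to whatever plotting tool is at hand. The function names, the `scale` parameter, and the lattice spacing 0.5 are assumptions made for illustration.

```python
import math

def field(x, y, scale=0.1):
    # The field (scale*(x + y), scale*(x - y)); scale = 0.1 is the factor 1/10.
    return (scale * (x + y), scale * (x - y))

def arrow_segments(x1, x2, y1, y2, s, scale=0.1):
    # Return a list of (tail, tip) pairs on a lattice with spacing s.
    nx = int(round((x2 - x1) / s))   # number of lattice steps in x
    ny = int(round((y2 - y1) / s))   # number of lattice steps in y
    arrows = []
    for i in range(nx + 1):
        x = x1 + i * s
        for j in range(ny + 1):
            y = y1 + j * s
            u, v = field(x, y, scale)
            arrows.append(((x, y), (x + u, y + v)))
    return arrows

def longest_arrow(arrows):
    # Length of the longest arrow, useful for judging overlap against s.
    return max(math.hypot(tip[0] - tail[0], tip[1] - tail[1])
               for tail, tip in arrows)

scaled = arrow_segments(-3, 3, -3, 3, 0.5)               # with the factor 1/10
unscaled = arrow_segments(-3, 3, -3, 3, 0.5, scale=1.0)  # factor left out
```

Comparing `longest_arrow(scaled)` with `longest_arrow(unscaled)` shows the effect of the scale factor: with spacing 0.5 the unscaled arrows are many lattice cells long, which is the overlap the text warns about.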
Recall that a gradient field is a vector field having the form $F(x, y) = \nabla f(x, y)$, where the real-valued function $f$ is called a potential function of $F$. In $\mathbb{R}^2$, for example,

\[ \nabla f(x, y) = \left( \frac{\partial f}{\partial x}(x, y), \frac{\partial f}{\partial y}(x, y) \right). \]

Considerable insight is needed to distinguish a gradient field from one that isn't a gradient just by looking at sketches such as Figure 6.6. For example, the one in Figure 6.6(a) is a scaled version of the gradient field $\nabla(xy) = (y, x)$ while the one in Figure 6.6(b) is a scaled version of $F(x, y) = (y, -x)$, which is not a gradient field. We'll see later in Chapters 8 and 9 that the "rotational" character of the field on the right is a visual clue that it isn't a gradient field. Theorem 2.5 in Chapter 9, Section 2 provides a computational criterion.
Often a potential function $f$ has an infinite singularity, that is, a point of its domain where $f$ tends to infinity. At such a point $\nabla f$ will not only fail to be defined but at nearby points will have gradient vectors that become arbitrarily long. If such a singularity occurs in the course of making a sketch, we need to make allowances for it in our algorithm. The simplest thing to do is arrange it so that the plotting steps give the troublesome point a wide berth.
In principle, we can make a perspective sketch of a vector field in $\mathbb{R}^3$. The trouble is that the tendency of the arrows to overlap each other often makes the picture hard to interpret.
Flow Lines. Suppose a continuously differentiable vector field $F$ is defined on some open subset $S$ of $\mathbb{R}^n$. The image of a parametrized curve $\mathbf{x} = g(t)$, with image in $S$, is called a flow line of $F$ if the velocity vector of the curve at a point $\mathbf{x} = g(t)$ in $S$ coincides with the vector $F(\mathbf{x})$, that is,

\[ g'(t) = F(g(t)). \]

[FIGURE 6.6: (a) $\nabla(xy)$ scaled to $(y/10, x/10)$; (b) $F(x, y) = (y/10, -x/10)$.]
If F is a velocity field, its flow lines represent the paths of particles moving with
velocities given by F. There is a detailed discussion of this relationship between
vector fields and curves in Chapter 12 on systems of differential equations. For now
we simply observe that given a reasonably accurate sketch of a vector field, we can
sketch in some typical flow lines, as in Figure 6.6. The idea is to draw curves that
appear to be tangent to the arrows of the field. The speed and direction of traversal
of the curve are determined by the length and direction of the arrows of the field,
though if the field arrows are scaled as previously described, then only direction and
relative speed are apparent directly from the sketch.
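A flow line can also be traced numerically by stepping along the field. The sketch below uses the classical fourth-order Runge-Kutta method, which is a technique chosen here for illustration and is not described in the text, to follow the flow line of $F(x, y) = (y, -x)$ starting at $(1, 0)$. The step size and number of steps are arbitrary choices.

```python
import math

def F(x, y):
    # The unscaled field of Figure 6.6(b).
    return (y, -x)

def rk4_step(field, x, y, h):
    # One classical Runge-Kutta step for the system g'(t) = field(g(t)).
    k1 = field(x, y)
    k2 = field(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
    k3 = field(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
    k4 = field(x + h * k3[0], y + h * k3[1])
    return (x + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
            y + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)

def flow_line(field, start, h, steps):
    # Approximate points g(0), g(h), g(2h), ... on the flow line through start.
    points = [start]
    x, y = start
    for _ in range(steps):
        x, y = rk4_step(field, x, y, h)
        points.append((x, y))
    return points

points = flow_line(F, (1.0, 0.0), 0.01, 628)
radii = [math.hypot(px, py) for px, py in points]
```

For this field the computed points should stay at very nearly constant distance from the origin, and the final point should be close to $(\cos t, -\sin t)$ at $t = 6.28$, which solves $g'(t) = F(g(t))$ with $g(0) = (1, 0)$.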

EXERCISES

Plot the vector fields 1 to 6 in $\mathbb{R}^2$. Use scaling of the vector lengths as it seems appropriate. Add some sketches of typical flow lines to each of the vector field sketches.
1. $F(x, y) = (x, y)$, $-2 \leq x \leq 2$, $-2 \leq y \leq 2$
2. $F(x, y) = (x, 0)$, $-4 \leq x \leq 4$, $-4 \leq y \leq 4$
3. $F(x, y) = (2x, x)$, $-4 \leq x \leq 4$, $-4 \leq y \leq 4$
4. $F(x, y) = \nabla(x^2 y^2)$, $-2 \leq x \leq 2$, $-2 \leq y \leq 2$
5. $F(x, y) = \nabla e^{x+y}$, $-4 \leq x \leq 4$, $-4 \leq y \leq 4$
6. $F(x, y) = \nabla \tfrac{1}{2}\log(x^2 + y^2)$, $-4 \leq x \leq 4$, $-4 \leq y \leq 4$, $(x, y) \neq (0, 0)$

7. (a) Verify that if $a$ and $b$ are constants, not both zero, then the image of the curve parametrized by
\[ (x, y) = (a \cos t + b \sin t, \; b \cos t - a \sin t) \]
is a flow line of the vector field $F(x, y) = (y, -x)$.
(b) Show that the flow lines of part (a) are circles.

8. (a) Verify that if $a$ and $b$ are constants, not both zero, then the image of the curve parametrized by
\[ (x, y) = (a \cosh t + b \sinh t, \; b \cosh t + a \sinh t) \]
is a flow line of the vector field $F(x, y) = (y, x)$.
(b) Show that the flow lines of part (a) are either hyperbolas or lines.

SECTION 2 THE CHAIN RULE


One of the most useful one-variable calculus formulas is the chain rule, used to compute the derivative of the composition of one function with another:

\[ \frac{d}{dx}\, g(f(x)) = g'(f(x))\, f'(x), \quad\text{or}\quad \frac{dz}{dx} = \frac{dz}{dy}\frac{dy}{dx}, \]

in the compressed Leibniz notation with $z = g(y)$ and $y = f(x)$. The generalization of both formulas to several variables is just as valuable and, properly formulated, is just as easy to state. We proved the special case where $g$ is real-valued and $f$ is a function of one real variable $t$ in Theorem 1.3 of Section 1.
If two functions $f$ and $g$ are related so that the range space of $f$ is the same as the domain space of $g$, we may form the composite function $g \circ f$ by first applying $f$ and then $g$. Thus we define

\[ g \circ f(\mathbf{x}) = g(f(\mathbf{x})) \]

for every vector $\mathbf{x}$ such that $\mathbf{x}$ is in the domain of $f$ and $f(\mathbf{x})$ is in the domain of $g$. The domain of $g \circ f$ consists of those vectors $\mathbf{x}$ that are carried by $f$ into the domain of $g$. An abstract picture of the composition of two functions is shown in Figure 6.7.

[FIGURE 6.7: The image of $f$ and the domain of $g$ must overlap for $g \circ f$ to be defined.]

EXAMPLE 1
Suppose that we are given a two-dimensional region in which the points move about according to some specified law. It may be known that, for a given position with coordinates $(u, v)$, a point is always to be found at some definite later time in a position $(x, y)$. Then $(x, y)$ and $(u, v)$ are related by equations of the form

\[ x = g_1(u, v), \qquad y = g_2(u, v). \]

In vector notation these equations might be written

\[ \mathbf{x} = g(\mathbf{u}), \]

where $\mathbf{x} = (x, y)$, $\mathbf{u} = (u, v)$, and $g$ has coordinate functions $g_1$, $g_2$. Now suppose that the position $\mathbf{u} = (u, v)$ of a point is itself determined as a function of other variables $(s, t)$ by equations

\[ u = f_1(s, t), \qquad v = f_2(s, t). \]

These may be written in vector form as

\[ \mathbf{u} = f(\mathbf{s}), \]

where $\mathbf{s} = (s, t)$ and $f$ has coordinate functions $f_1$, $f_2$. Then $(x, y)$ and $(s, t)$ are related by

\[ x = g_1(f_1(s, t), f_2(s, t)), \qquad y = g_2(f_1(s, t), f_2(s, t)), \]

or

\[ \mathbf{x} = g(f(\mathbf{s})). \]

2A General Formula and Examples


To see what the derivative $(g \circ f)'$ should be in terms of the derivatives $g'$ and $f'$, suppose that we are given

\[ \mathbb{R}^n \xrightarrow{\;f\;} \mathbb{R}^m \xrightarrow{\;g\;} \mathbb{R}^p \]

and suppose first that $f$ and $g$ are the linear functions

\[ f(\mathbf{x}) = A\mathbf{x}, \qquad g(\mathbf{y}) = B\mathbf{y}, \]

where $A$ and $B$ are constant matrices having respective shapes $m$ by $n$ and $p$ by $m$. Then by associativity of matrix multiplication,

\[ g \circ f(\mathbf{x}) = B(A\mathbf{x}) = BA\mathbf{x}. \]

But it follows from Corollary 4.3 of Chapter 5 that for a function $f(\mathbf{x}) = A\mathbf{x}$ or $g(\mathbf{x}) = B\mathbf{x}$ generated by multiplying a vector by a constant matrix, the derivative matrix is just the constant matrix. Thus

\[ g' = B, \quad f' = A \quad\text{and}\quad (g \circ f)' = BA. \]

Hence for functions $f$ and $g$ defined by matrix-vector products we have

2.1    \[ (g \circ f)' = g' f'. \]

It's a remarkable fact that for differentiable $f$ and $g$ the previous formula is the correct extension of the chain rule if we take care to evaluate the derivatives at the proper points, as in the next example.

EXAMPLE 2
Consider the special case in which $f$ is a function of a single real variable and $g$ is real-valued. Then $g \circ f$ is a real function of a real variable. Theorem 1.3 shows that if $f$ and $g$ are continuously differentiable, then

\[ (g \circ f)'(t) = \nabla g(f(t)) \cdot f'(t). \]

That is, in terms of coordinate functions,

\[ (g \circ f)'(t) = \left( \frac{\partial g}{\partial y_1}(f(t)), \ldots, \frac{\partial g}{\partial y_m}(f(t)) \right) \cdot \left( f_1'(t), \ldots, f_m'(t) \right). \]

The right side of this last equation equals a matrix product in terms of derivative matrices

\[ g'(f(t)) = \left( \frac{\partial g}{\partial y_1}(f(t)), \; \ldots, \; \frac{\partial g}{\partial y_m}(f(t)) \right) \]

and

\[ f'(t) = \begin{pmatrix} f_1'(t) \\ \vdots \\ f_m'(t) \end{pmatrix} \]

as $(g \circ f)'(t) = g'(f(t))\,f'(t)$. The product of $g'(f(t))$ and $f'(t)$ is defined by multiplication of matrices of size 1-by-$m$ and $m$-by-1 and is equivalent to the dot product of the two matrices looked at as vectors in $\mathbb{R}^m$. Thus for the case where the domain of $f$ and the range of $g$ are both 1-dimensional, we can regard the formulas

\[ \nabla g(f(t)) \cdot f'(t) \quad\text{and}\quad g'(f(t))\,f'(t) \]

as different notations for the same thing.

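The chain rule is valid under the assumption that $f$ and $g$ are differentiable, but for a proof requiring less detailed analysis, we make the stronger assumption of continuous differentiability.

2.2 Chain Rule. Let $f$ be continuously differentiable near $\mathbf{x}$, and let $g$ be continuously differentiable near $f(\mathbf{x})$, with

\[ \mathbb{R}^n \xrightarrow{\;f\;} \mathbb{R}^m \xrightarrow{\;g\;} \mathbb{R}^p. \]

If $g \circ f$ is defined on an open set containing $\mathbf{x}$, then $g \circ f$ is continuously differentiable at $\mathbf{x}$, and

\[ (g \circ f)'(\mathbf{x}) = g'(f(\mathbf{x}))\,f'(\mathbf{x}). \]

Proof. We need only show that the derivative matrix of $g \circ f$ at $\mathbf{x}$ has continuous entries given by the entries in the product of $g'(f(\mathbf{x}))$ and $f'(\mathbf{x})$. These matrices have the respective forms

\[ \begin{pmatrix} \dfrac{\partial g_1}{\partial y_1}(f(\mathbf{x})) & \cdots & \dfrac{\partial g_1}{\partial y_m}(f(\mathbf{x})) \\ \vdots & & \vdots \\ \dfrac{\partial g_p}{\partial y_1}(f(\mathbf{x})) & \cdots & \dfrac{\partial g_p}{\partial y_m}(f(\mathbf{x})) \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} \dfrac{\partial f_1}{\partial x_1}(\mathbf{x}) & \cdots & \dfrac{\partial f_1}{\partial x_n}(\mathbf{x}) \\ \vdots & & \vdots \\ \dfrac{\partial f_m}{\partial x_1}(\mathbf{x}) & \cdots & \dfrac{\partial f_m}{\partial x_n}(\mathbf{x}) \end{pmatrix}. \]

The product of the matrices has for its $ij$th entry the sum of products

\[ \sum_{k=1}^{m} \frac{\partial g_i}{\partial y_k}(f(\mathbf{x}))\, \frac{\partial f_k}{\partial x_j}(\mathbf{x}). \tag{1} \]

But this expression is just the dot product of two vectors $\nabla g_i(f(\mathbf{x}))$ and $(\partial f/\partial x_j)(\mathbf{x})$. It follows from Theorem 1.3 that

\[ \nabla g_i(f(\mathbf{x})) \cdot \frac{\partial f}{\partial x_j}(\mathbf{x}) = \frac{\partial (g_i \circ f)}{\partial x_j}(\mathbf{x}), \tag{2} \]

because we are differentiating with respect to the single variable $x_j$. This establishes the matrix relation, because the entries in $(g \circ f)'(\mathbf{x})$ are by definition given by the right side of Equation (2). Since $g$ and $f$ are continuously differentiable, Formula (1) represents a continuous function of $\mathbf{x}$ for each $i$ and $j$. Hence $g \circ f$ is continuously differentiable. ■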
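Let

\[ f(x, y) = (x^2 + y^2, \; x^2 - y^2) \]

and let

\[ g(u, v) = (uv, \; u + v). \]

We find

\[ g'(u, v) = \begin{pmatrix} v & u \\ 1 & 1 \end{pmatrix} \quad\text{and}\quad f'(x, y) = \begin{pmatrix} 2x & 2y \\ 2x & -2y \end{pmatrix}. \]

To find $(g \circ f)'(2, 1)$, we note that $f(2, 1) = (5, 3)$ and compute

\[ g'(5, 3) = \begin{pmatrix} 3 & 5 \\ 1 & 1 \end{pmatrix} \quad\text{and}\quad f'(2, 1) = \begin{pmatrix} 4 & 2 \\ 4 & -2 \end{pmatrix}. \]

Then the product of these last two matrices gives

\[ (g \circ f)'(2, 1) = \begin{pmatrix} 32 & -4 \\ 8 & 0 \end{pmatrix}. \]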
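The matrix product in this example can be reproduced with a short computation. The sketch below is illustrative only, with `matmul` and `chain_rule_matrix` as made-up helper names. It multiplies $g'(f(x, y))$ by $f'(x, y)$ and, as an independent check, compares the result with the derivative matrix of the composite $g(f(x, y)) = (x^4 - y^4, 2x^2)$ worked out directly.

```python
def f(x, y):
    return (x * x + y * y, x * x - y * y)

def f_prime(x, y):
    # Derivative matrix of f.
    return [[2 * x, 2 * y],
            [2 * x, -2 * y]]

def g_prime(u, v):
    # Derivative matrix of g(u, v) = (uv, u + v).
    return [[v, u],
            [1, 1]]

def matmul(A, B):
    # Product of two matrices given as lists of rows.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def chain_rule_matrix(x, y):
    # g' is evaluated at f(x, y) and f' at (x, y), as the chain rule requires.
    return matmul(g_prime(*f(x, y)), f_prime(x, y))

def direct_matrix(x, y):
    # Derivative matrix of the composite (x^4 - y^4, 2x^2), found by hand.
    return [[4 * x ** 3, -4 * y ** 3],
            [4 * x, 0]]

result = chain_rule_matrix(2, 1)
```

Here `result` should equal the matrix with rows (32, -4) and (8, 0), in agreement with `direct_matrix(2, 1)`.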

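It's common practice in calculus to denote a function by the same symbol as a typical element of its range. Thus the derivative of a function $f: \mathbb{R} \to \mathbb{R}$ is often denoted, in conjunction with the equation $y = f(x)$, by $dy/dx$. Similarly, the partial derivatives of a function $f: \mathbb{R}^3 \to \mathbb{R}$ are commonly written as

\[ \frac{\partial w}{\partial x}, \quad \frac{\partial w}{\partial y}, \quad\text{and}\quad \frac{\partial w}{\partial z}, \]

along with the explanatory equation $w = f(x, y, z)$. For example, if $w = f(x, y, z) = xy^2 e^{x+3z}$, then

\[ \frac{\partial w}{\partial x} = y^2 e^{x+3z} + xy^2 e^{x+3z}, \quad \frac{\partial w}{\partial y} = 2xy\, e^{x+3z}, \quad \frac{\partial w}{\partial z} = 3xy^2 e^{x+3z}. \]

This notation has the disadvantage that it doesn't contain specific reference to the function being differentiated, but it's convenient and is the traditional language of calculus. To illustrate its convenience, suppose that the functions $g$ and $f$ are given by real-valued coordinate functions

\[ w = g(x, y, z), \quad x = f_1(s, t), \quad y = f_2(s, t), \quad z = f_3(s, t). \]

Then by the chain rule,

\[ \begin{pmatrix} \dfrac{\partial w}{\partial s} & \dfrac{\partial w}{\partial t} \end{pmatrix} = \begin{pmatrix} \dfrac{\partial g}{\partial x} & \dfrac{\partial g}{\partial y} & \dfrac{\partial g}{\partial z} \end{pmatrix} \begin{pmatrix} \dfrac{\partial x}{\partial s} & \dfrac{\partial x}{\partial t} \\ \dfrac{\partial y}{\partial s} & \dfrac{\partial y}{\partial t} \\ \dfrac{\partial z}{\partial s} & \dfrac{\partial z}{\partial t} \end{pmatrix}. \]

Matrix multiplication yields

\[ \begin{aligned} \frac{\partial w}{\partial s} &= \frac{\partial g}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial g}{\partial y}\frac{\partial y}{\partial s} + \frac{\partial g}{\partial z}\frac{\partial z}{\partial s}, \\ \frac{\partial w}{\partial t} &= \frac{\partial g}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial g}{\partial y}\frac{\partial y}{\partial t} + \frac{\partial g}{\partial z}\frac{\partial z}{\partial t}. \end{aligned} \tag{A} \]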

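We get a slightly different-looking application of the chain rule if the domain space of $f$ is one-dimensional, that is, if $f$ is a function of one variable. Consider, for example,

\[ w = g(u, v), \qquad \begin{pmatrix} u \\ v \end{pmatrix} = f(t) = \begin{pmatrix} f_1(t) \\ f_2(t) \end{pmatrix}. \]

The composition $g \circ f$ is in this case a real-valued function of one variable. Its derivative is the 1-by-1 matrix whose entry is the ordinary derivative

\[ \frac{d(g \circ f)}{dt} = \frac{dw}{dt}. \]

The derivatives of $g$ and $f$ are defined, respectively, by the derivative matrices

\[ \begin{pmatrix} \dfrac{\partial w}{\partial u} & \dfrac{\partial w}{\partial v} \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} \dfrac{du}{dt} \\ \dfrac{dv}{dt} \end{pmatrix}. \]

Hence the chain rule implies that

\[ \frac{dw}{dt} = \begin{pmatrix} \dfrac{\partial w}{\partial u} & \dfrac{\partial w}{\partial v} \end{pmatrix} \begin{pmatrix} \dfrac{du}{dt} \\ \dfrac{dv}{dt} \end{pmatrix} \quad\text{or}\quad \frac{dw}{dt} = \frac{\partial w}{\partial u}\frac{du}{dt} + \frac{\partial w}{\partial v}\frac{dv}{dt}. \tag{B} \]

This is the case treated in Section 1, using the gradient, where we would have written

\[ \frac{dw}{dt} = \nabla g \cdot f'. \]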
Let us suppose that both $f$ and $g$ are real-valued functions of one variable, the situation we meet in one-variable calculus. The derivatives of $f$ at $t$, of $g$ at $s = f(t)$, and of $g \circ f$ at $t$ are represented by the three 1-by-1 derivative matrices $f'(t)$, $g'(s)$, and $(g \circ f)'(t)$, respectively. The chain rule implies that

\[ (g \circ f)'(t) = g'(s)\, f'(t). \]

If the functions are presented in the form

\[ x = g(s), \qquad s = f(t), \]

the chain rule appears as the familiar equation

\[ \frac{dx}{dt} = \frac{dx}{ds}\frac{ds}{dt}. \tag{C} \]

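EXAMPLE 7
Given that

\[ \begin{cases} x = u^2 + v^3 \\ y = e^{uv} \end{cases} \quad\text{and}\quad \begin{cases} u = t + 1 \\ v = e^t, \end{cases} \]

find $dx/dt$ and $dy/dt$ at $t = 0$. Let $f: \mathbb{R} \to \mathbb{R}^2$ and $g: \mathbb{R}^2 \to \mathbb{R}^2$ be the functions defined by

\[ f(t) = \begin{pmatrix} t + 1 \\ e^t \end{pmatrix} = \begin{pmatrix} u \\ v \end{pmatrix}, \quad -\infty < t < \infty, \]

\[ g\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} u^2 + v^3 \\ e^{uv} \end{pmatrix} = \begin{pmatrix} x \\ y \end{pmatrix}, \quad -\infty < u < \infty, \; -\infty < v < \infty. \]

The derivative $f'(t)$ is defined by the 2-by-1 derivative matrix

\[ \begin{pmatrix} \dfrac{du}{dt} \\ \dfrac{dv}{dt} \end{pmatrix} = \begin{pmatrix} 1 \\ e^t \end{pmatrix}. \]

The derivative $g'(u, v)$ is

\[ \begin{pmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\ \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{pmatrix} = \begin{pmatrix} 2u & 3v^2 \\ v e^{uv} & u e^{uv} \end{pmatrix}. \]

The dependence of $x$ and $y$ on $t$ is given by

\[ \begin{pmatrix} x \\ y \end{pmatrix} = (g \circ f)(t), \quad -\infty < t < \infty. \]

Hence the two derivatives $dx/dt$ and $dy/dt$ are the entries in the derivative matrix of the composite function $g \circ f$. The chain rule therefore implies that

\[ \begin{pmatrix} \dfrac{dx}{dt} \\ \dfrac{dy}{dt} \end{pmatrix} = \begin{pmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\ \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{pmatrix} \begin{pmatrix} \dfrac{du}{dt} \\ \dfrac{dv}{dt} \end{pmatrix}. \]

That is,

\[ \begin{aligned} \frac{dx}{dt} &= \frac{\partial x}{\partial u}\frac{du}{dt} + \frac{\partial x}{\partial v}\frac{dv}{dt}, \\ \frac{dy}{dt} &= \frac{\partial y}{\partial u}\frac{du}{dt} + \frac{\partial y}{\partial v}\frac{dv}{dt}. \end{aligned} \tag{D} \]

Substitution of the specific entries in $f'(t)$ and $g'(u, v)$ gives

\[ \begin{aligned} \frac{dx}{dt} &= 2u + 3v^2 e^t, \\ \frac{dy}{dt} &= v e^{uv} + u e^{uv + t}. \end{aligned} \tag{$*$} \]

If $t = 0$, then $\begin{pmatrix} u \\ v \end{pmatrix} = f(0) = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$, and we get $u = v = 1$. It follows that

\[ \frac{dx}{dt}(0) = 2 + 3 = 5, \qquad \frac{dy}{dt}(0) = e + e = 2e. \]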
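The values found in this example can be checked numerically. The following sketch, with illustrative function names and an arbitrary step size, evaluates the chain-rule formulas at $t = 0$ and compares them with central-difference estimates of the derivatives of $x(t)$ and $y(t)$.

```python
import math

def f(t):
    # u = t + 1, v = e^t.
    return (t + 1.0, math.exp(t))

def g(u, v):
    # x = u^2 + v^3, y = e^{uv}.
    return (u * u + v ** 3, math.exp(u * v))

def chain_rule_derivatives(t):
    # dx/dt and dy/dt from the product g'(f(t)) f'(t).
    u, v = f(t)
    du_dt, dv_dt = 1.0, math.exp(t)
    dx_dt = 2 * u * du_dt + 3 * v * v * dv_dt
    dy_dt = v * math.exp(u * v) * du_dt + u * math.exp(u * v) * dv_dt
    return (dx_dt, dy_dt)

def numeric_derivatives(t, h=1e-6):
    # Central-difference estimates for the composite function g(f(t)).
    x_plus, y_plus = g(*f(t + h))
    x_minus, y_minus = g(*f(t - h))
    return ((x_plus - x_minus) / (2 * h), (y_plus - y_minus) / (2 * h))

dx0, dy0 = chain_rule_derivatives(0.0)
```

With these definitions `dx0` should be 5 and `dy0` should be $2e$, and `numeric_derivatives(0.0)` should agree with both up to the finite-difference error.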
The definition of matrix multiplication gives the derivative formulas resulting from applications of the chain rule a formal pattern that helps the memory. The pattern is particularly evident when the coordinate functions are denoted by scalar variables, as in Formulas (A), (B), (C), and (D). All formulas of the general form

\[ \cdots + \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t} + \cdots \]

have the disadvantage of not containing explicit reference to the points at which the various derivatives are evaluated. It's essential to know this information, and we can find it by going to the formula

\[ (g \circ f)'(\mathbf{x}) = g'(f(\mathbf{x}))\, f'(\mathbf{x}). \]

It follows that derivatives appearing in the matrix $f'(\mathbf{x})$ are evaluated at $\mathbf{x}$, and those in the matrix $g'(f(\mathbf{x}))$ are evaluated at $f(\mathbf{x})$. This is the reason for setting $t = 0$ and $u = v = 1$ in Formula ($*$) to obtain the final answers in Example 7.

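EXAMPLE 8
Let $z = xy$ and

\[ \begin{cases} x = f(u, v) \\ y = g(u, v). \end{cases} \]

Suppose that when $u = 1$ and $v = 2$, we have

\[ \frac{\partial x}{\partial u} = -1, \quad \frac{\partial x}{\partial v} = 3, \quad \frac{\partial y}{\partial u} = 5, \quad \frac{\partial y}{\partial v} = 0. \]

Suppose also that $f(1, 2) = 2$ and $g(1, 2) = -2$. What is $\partial z/\partial u\,(1, 2)$? The chain rule implies that

\[ \frac{\partial z}{\partial u} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial u}. \tag{E} \]

When $u = 1$ and $v = 2$, we are given that

\[ x = f(1, 2) = 2 \quad\text{and}\quad y = g(1, 2) = -2. \]

Hence

\[ \frac{\partial z}{\partial x}(2, -2) = y\,\Big|_{x=2,\,y=-2} = -2, \qquad \frac{\partial z}{\partial y}(2, -2) = x\,\Big|_{x=2,\,y=-2} = 2. \]

To obtain $\partial z/\partial u$ at $(u, v) = (1, 2)$, it's necessary to know at what points to evaluate the partial derivatives that appear in Equation (E). In greater detail, the chain rule implies that

\[ \frac{\partial z}{\partial u}(1, 2) = \frac{\partial z}{\partial x}(2, -2)\,\frac{\partial x}{\partial u}(1, 2) + \frac{\partial z}{\partial y}(2, -2)\,\frac{\partial y}{\partial u}(1, 2). \]

Hence

\[ \frac{\partial z}{\partial u}(1, 2) = (-2)(-1) + (2)(5) = 12. \]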
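EXAMPLE 9
If $w = f(ax^2 + bxy + cy^2)$ and $y = x^2 + x + 1$, we may want $dw/dx$ at $x = -1$. The solution relies on formulas that follow from the chain rule such as (A), (B), (C), (D), and (E). Let $z$ be defined by

\[ z = ax^2 + bxy + cy^2. \]

Then $w = f(z)$, and since $\partial x/\partial x = 1$,

\[ \frac{dz}{dx} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial x} + \frac{\partial z}{\partial y}\frac{dy}{dx} = \frac{\partial z}{\partial x} + \frac{\partial z}{\partial y}\frac{dy}{dx}. \]

Hence

\[ \frac{dw}{dx} = \frac{df}{dz}\frac{dz}{dx} = \frac{df}{dz}\left( \frac{\partial z}{\partial x} + \frac{\partial z}{\partial y}\frac{dy}{dx} \right) = f'(z)\bigl(2ax + by + (bx + 2cy)(2x + 1)\bigr). \]

If $x = -1$, then $y = 1$, and so $z = a - b + c$. Thus

\[ \frac{dw}{dx}(-1) = f'(a - b + c)(-2a + 2b - 2c). \]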
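To see the final formula in action, one can pick a concrete outer function and constants. The choices $f = \sin$ and $(a, b, c) = (1, 2, 3)$ below are purely illustrative assumptions, since the example leaves $f$, $a$, $b$, and $c$ general. The sketch compares the formula $f'(a - b + c)(-2a + 2b - 2c)$ with a central-difference estimate of $dw/dx$ at $x = -1$.

```python
import math

# Illustrative choices; the example itself leaves f, a, b, c unspecified.
a, b, c = 1.0, 2.0, 3.0
f = math.sin
f_prime = math.cos

def y_of(x):
    return x * x + x + 1.0

def w(x):
    # w = f(a x^2 + b x y + c y^2) with y = x^2 + x + 1.
    y = y_of(x)
    return f(a * x * x + b * x * y + c * y * y)

def dw_dx_formula(x):
    # f'(z) (2ax + by + (bx + 2cy)(2x + 1)), the general formula of the example.
    y = y_of(x)
    z = a * x * x + b * x * y + c * y * y
    return f_prime(z) * (2 * a * x + b * y + (b * x + 2 * c * y) * (2 * x + 1))

def dw_dx_numeric(x, h=1e-6):
    # Central-difference estimate of dw/dx.
    return (w(x + h) - w(x - h)) / (2 * h)

at_minus_one = dw_dx_formula(-1.0)
closed_form = f_prime(a - b + c) * (-2 * a + 2 * b - 2 * c)
```

With these sample values `at_minus_one` and `closed_form` should both equal $-4\cos 2$, and `dw_dx_numeric(-1.0)` should agree up to the finite-difference error.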

EXERCISES

1. Assume (a) Find the matrices g'(f(x, y)) and f'(x, y).
2
f ( x ) =( x + xy + 1 ) (b) Use part (a) to find the matrices (g 0 f)'(l, 1) and
y y2 +2 ' (g of)' (0, 0).

,(:)-(T)

2. Assume f(t) = [...] and g [...].
(a) Find the matrices g'(f(t)) and f'(t).
(b) Use part (a) to find the matrices (g ∘ f)'(1) and (g ∘ f)'(0).
3. Consider the curve defined parametrically by
\[ f(t) = \begin{pmatrix} t \\ t^2 - 4 \\ e^{t-2} \end{pmatrix}, \qquad -\infty < t < \infty. \]
Let g be a real-valued differentiable function with domain ℝ³. If x0 = (2, 0, 1), and
\[ \frac{\partial g}{\partial x}(x_0) = 4, \qquad \frac{\partial g}{\partial y}(x_0) = 2, \qquad \frac{\partial g}{\partial z}(x_0) = 2, \]
find d(g ∘ f)/dt at t = 2.
4. Let z = xy² and suppose that x = 2u + 3v. Assume also that y is a function of u and v with the properties that when (u, v) = (2, 1) then y = -1, ∂y/∂u = 5 and ∂y/∂v = -2. Find ∂z/∂u and ∂z/∂v when (u, v) = (2, 1).
5. Consider the functions
\[ f\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} u + v \\ u - v \\ u^2 - v^2 \end{pmatrix} \]
and

F(x, y, z) = x² + y² + z² = w.

(a) Find the derivative matrix of F ∘ f at (u, v).
(b) Find ∂w/∂u and ∂w/∂v.
6. Let u = f(x, y). Make the change of variables x = r cos θ, y = r sin θ. Given that
\[ \frac{\partial f}{\partial x} = x^2 + 2xy - y^2 \quad\text{and}\quad \frac{\partial f}{\partial y} = x^2 - 2xy + 2, \]
find ∂f/∂θ, when r = 2 and θ = π/2.
7. If g(x, y, z) = √(x² + y² + z²) and f(r, θ) = (r cos θ, r sin θ, [...]), find g'(f(r, θ)) and f'(r, θ); then multiply these together and find (g ∘ f)'(2, π).
8. Vector functions f and g are defined by
\[ f\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} u\cos v \\ u\sin v \end{pmatrix}, \qquad 0 < u < \infty,\quad -\pi/2 < v < \pi/2, \]
and g(x, y) = ([...], arctan [...]), 0 < x < ∞.
(a) Find the derivative matrix of g ∘ f at (u, v).
(b) Find the derivative matrix of f ∘ g at (x, y).
(c) Are the following statements true or false?
(i) Domain of f = domain of g ∘ f.
(ii) Domain of g = domain of f ∘ g.
9. Let v be a tangent vector at x0 to a curve defined parametrically by a differentiable vector function g. If x0 is in the domain of a differentiable vector function F, prove that F'(x0)v, if not zero, is a tangent vector at F(x0) to the curve defined parametrically by F ∘ g.
10. The convention of denoting coordinate functions by real variables has its pitfalls. Resolve the following paradox: Let w = f(x, y, z) and z = g(x, y). By the chain rule
\[ \frac{\partial w}{\partial x} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial x} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial x} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial x}. \]
The quantities x and y are unrelated, so that ∂y/∂x = 0. However ∂x/∂x = 1. Hence
\[ \frac{\partial w}{\partial x} = \frac{\partial w}{\partial x} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial x}, \]
and so, subtracting ∂w/∂x from both sides,
\[ 0 = \frac{\partial w}{\partial z}\frac{\partial z}{\partial x}. \]
In particular, take w = 2x + y + 3z and z = 5x + 18. Then
\[ \frac{\partial w}{\partial z} = 3 \quad\text{and}\quad \frac{\partial z}{\partial x} = 5. \]
It follows that 0 = 15.

11. If y = f(x - at) + g(x + at), where a is constant and f and g are twice differentiable, show that
\[ a^2\,\frac{\partial^2 y}{\partial x^2} = \frac{\partial^2 y}{\partial t^2} \qquad\text{(wave equation).} \]
12. Let U(x, y) = f(x + iy) + f(x - iy), where i² = -1. Show that U_xx + U_yy = 0.
13. If f(tx, ty) = tⁿ f(x, y) for some integer n, and for all x, y, and t, show that
\[ x\,\frac{\partial f}{\partial x} + y\,\frac{\partial f}{\partial y} = n f(x, y). \]
14. (a) If

w = f(x, y, z, t), x = g(u, z, t), and z = h(u, t),

write a formula for dw/dt, where by this symbol is meant the rate of change of w with respect to t, and where all the interrelations of w, x, z, t are taken into account.
(b) If

w = f(x, y, z, t) = 2xy + 3z + t²,
g(u, z, t) = ut sin z,
h(u, t) = 2u + t,

evaluate dw/dt at the point u = 1, t = 2, y = 3, by using the formula you derived in part (a) and also by substituting in the functions for x and z and then differentiating.
15. Consider a real-valued function f(x, y) such that

f_x(2, 1) = 3, f_y(2, 1) = -2, f_xx(2, 1) = 0,
f_xy(2, 1) = f_yx(2, 1) = 1, f_yy(2, 1) = 2.

Let g: ℝ² → ℝ² be defined by

g(u, v) = (u + v, uv).

Find ∂²(f ∘ g)/∂v∂u at (1, 1).

In Exercises 16 to 19, let f be real-valued and differentiable.
16. If u(x, y) = f(ax + by), show that b ∂u/∂x = a ∂u/∂y.
17. If u(x, y) = f(xy), show that x ∂u/∂x = y ∂u/∂y.
18. If u(x, y) = f(x/y), show that x ∂u/∂x = -y ∂u/∂y, y ≠ 0.
19. If u(x, y) = f(x² + y²), show that y ∂u/∂x = x ∂u/∂y.

If in Exercises 20 to 25 f and g are of the following types, decide whether g ∘ f, or f ∘ g, or neither one, can possibly be defined.
20. f: ℝ² → ℝ², g: ℝ² → ℝ³
21. f: ℝ³ → ℝ², g: ℝ² → ℝ
22. f: ℝ → ℝ², g: ℝ → ℝ²
23. f: ℝ³ → ℝ², g: ℝ³ → ℝ³
24. f: ℝ → ℝ², g: ℝ³ → ℝ²
25. f: ℝ → ℝ³, g: ℝ³ → ℝ³

*26. A 2-dimensional Hamiltonian system is a pair of equations of the form

dx/dt = H_y(x, y, t), dy/dt = -H_x(x, y, t).

The function H of three variables that determines the system is called its Hamiltonian. Suppose that the pair (x(t), y(t)) satisfies the system, and consider two functions of t:
\[ \text{(i)}\ \frac{d}{dt}\bigl[H(x(t), y(t), t)\bigr], \qquad \text{(ii)}\ H_t(x(t), y(t), t), \]
where the partial derivative of the Hamiltonian in (ii) is computed before substituting x(t) and y(t) for x and y.
(a) Show that (i) and (ii) are equal as functions of t.
(b) Show that if H is independent of t, then the curve parametrized by (x(t), y(t)) lies on a level curve of H.

2B Changing Variables
One of the most important uses of the chain rule is computing the effect of a change of
variable on the form of important expressions such as |∇f|, the length of a gradient.
In doing computations of this kind it's often simpler and clearer to use the subscript
notation for partial derivatives, as in the next example.

EXAMPLE 10 If u = u(x, y) is a differentiable real-valued function, then the length of the
gradient |∇u| = (u_x² + u_y²)^(1/2) limits how fast u can change in an arbitrary direction.
Suppose new variables z, w are introduced by setting z = x + y, w = x - y, with
inverse relations x = (z + w)/2, y = (z - w)/2. Because of the way it arose in the
definition of differentiability, ∇f has an intrinsic meaning independent of the choice
of coordinates. But so far we've calculated ∇f only in terms of standard rectangular
coordinates in ℝⁿ, so it's important to know how to compute it directly from the
vector (ū_z, ū_w), where ū(z, w) = u((z + w)/2, (z - w)/2). To do this we compute
u_x and u_y in terms of ū_z and ū_w using the chain rule. Noting from z = x + y and
w = x - y that z_x = w_x = z_y = 1 and w_y = -1, we find
\[ u_x = \bar u_z z_x + \bar u_w w_x = \bar u_z + \bar u_w, \qquad u_y = \bar u_z z_y + \bar u_w w_y = \bar u_z - \bar u_w. \]
Hence ∇u = (ū_z + ū_w, ū_z - ū_w), with ū_z and ū_w evaluated at (z, w). Squaring and
adding the coordinates of ∇u gives
\[ |\nabla u|^2 = (\bar u_z + \bar u_w)^2 + (\bar u_z - \bar u_w)^2 = 2\bigl(\bar u_z^2 + \bar u_w^2\bigr). \]
This equation tells us that in this case we can compute |∇u| directly from the length
of (ū_z, ū_w) if we just multiply this length by √2.
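The factor √2 can be confirmed numerically. The following sketch assumes an illustrative test function u(x, y) = sin(x) e^y and an arbitrary sample point; neither comes from the text. It estimates the partial derivatives in the (z, w) coordinates by central differences and compares them with the exact gradient in (x, y).

```python
import math

def u(x, y):
    """Illustrative test function (an assumption, not from the text)."""
    return math.sin(x) * math.exp(y)

def grad_u(x, y):
    """Exact gradient of u in the (x, y) coordinates."""
    return (math.cos(x) * math.exp(y), math.sin(x) * math.exp(y))

def u_bar(z, w):
    """The same function expressed in the new coordinates z = x + y, w = x - y."""
    return u((z + w) / 2, (z - w) / 2)

def grad_u_bar(z, w, h=1e-6):
    """Central-difference estimate of the partial derivatives of u_bar."""
    dz = (u_bar(z + h, w) - u_bar(z - h, w)) / (2 * h)
    dw = (u_bar(z, w + h) - u_bar(z, w - h)) / (2 * h)
    return (dz, dw)

x, y = 0.7, -0.3                  # arbitrary sample point
z, w = x + y, x - y
ux, uy = grad_u(x, y)
uz, uw = grad_u_bar(z, w)
lhs = math.hypot(ux, uy)                     # |grad u| in (x, y)
rhs = math.sqrt(2) * math.hypot(uz, uw)      # sqrt(2) times the length in (z, w)
print(lhs, rhs)
```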

EXAMPLE 11 For functions of a point (x, y) in ℝ², the Laplace operator Δ acting on twice
continuously differentiable functions u = u(x, y) produces a continuous function Δu:

Δu(x, y) = u_xx(x, y) + u_yy(x, y).

The Laplace operator Δ is an extension to higher dimensions of the second-order
differential operator d²/dx² that occurs naturally in many problems in pure and
applied mathematics. It is important to know the form that Δ takes after a change
of variable. For example, suppose new variables are introduced by setting z = x + y
and w = x - y, with inverse relations x = (z + w)/2, y = (z - w)/2. We need to
find out how Δ is expressed in terms of partial derivatives of ū with respect to z
and w, where ū(z, w) = u((z + w)/2, (z - w)/2). First compute u_x and u_y as in the
previous example to get
\[ u_x = \bar u_z + \bar u_w, \qquad u_y = \bar u_z - \bar u_w. \]
Then
\[ \begin{aligned} u_{xx} &= (\bar u_z)_x + (\bar u_w)_x = (\bar u_{zz} z_x + \bar u_{zw} w_x) + (\bar u_{wz} z_x + \bar u_{ww} w_x) \\ &= (\bar u_{zz} + \bar u_{zw}) + (\bar u_{wz} + \bar u_{ww}), \\ u_{yy} &= (\bar u_z)_y - (\bar u_w)_y = (\bar u_{zz} z_y + \bar u_{zw} w_y) - (\bar u_{wz} z_y + \bar u_{ww} w_y) \\ &= (\bar u_{zz} - \bar u_{zw}) - (\bar u_{wz} - \bar u_{ww}). \end{aligned} \]
Adding the results of these two computations cancels ū_zw and ū_wz to give
\[ u_{xx} + u_{yy} = 2(\bar u_{zz} + \bar u_{ww}). \]
In terms of Δ this equation is Δ_(x,y) u = 2Δ_(z,w) ū. When interpreting this equation
it's important to remember that while u and ū take the same value at corresponding
coordinate pairs (x, y) and (z, w), the formal expression of the Laplace operator may
change in going over to the (z, w) coordinates.
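As a sanity check on the factor 2, the sketch below evaluates both sides for one illustrative polynomial, u(x, y) = x³y + xy², whose Laplacian 6xy + 2x is computed by hand. The test function, the sample point, and the step size are assumptions chosen for the check; the second derivatives in (z, w) are estimated with central second differences.

```python
def u(x, y):
    """Illustrative polynomial (an assumption, not from the text)."""
    return x**3 * y + x * y**2

def laplacian_u(x, y):
    """Exact Laplacian in (x, y): u_xx + u_yy = 6xy + 2x."""
    return 6 * x * y + 2 * x

def u_bar(z, w):
    """u expressed in the coordinates z = x + y, w = x - y."""
    return u((z + w) / 2, (z - w) / 2)

def second_difference(g, t, h=1e-3):
    """Central second-difference estimate of g''(t)."""
    return (g(t + h) - 2 * g(t) + g(t - h)) / h**2

def laplacian_u_bar(z, w, h=1e-3):
    """Estimate of the sum of the pure second partials of u_bar in z and w."""
    u_zz = second_difference(lambda s: u_bar(s, w), z, h)
    u_ww = second_difference(lambda s: u_bar(z, s), w, h)
    return u_zz + u_ww

x, y = 1.2, -0.5                  # arbitrary sample point
z, w = x + y, x - y
lhs = laplacian_u(x, y)
rhs = 2 * laplacian_u_bar(z, w)
print(lhs, rhs)
```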

The term transformation is sometimes used instead of function, particularly when


the domain and range spaces are the same. When using a change of coordinates we
use the term coordinate transformation. It's important to know to what extent each
choice of coordinates in one system corresponds to a unique choice in the other. In
the two previous examples, the coordinate transformations were made by invertible
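linear transformations. In Example 10 we had
\[ \begin{pmatrix} z \\ w \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \tfrac12 & \tfrac12 \\ \tfrac12 & -\tfrac12 \end{pmatrix} \begin{pmatrix} z \\ w \end{pmatrix}. \]
These matrix equations establish a one-to-one correspondence between pairs (x, y)
in one copy of ℝ² and pairs (z, w) in another copy. We've seen in Section 5 on
determinants in Chapter 2 that a linear coordinate change z = Ax with square matrix
A is one-to-one precisely when det A ≠ 0. Establishing an analogous criterion for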
nonlinear transformations is possible only locally, meaning in some neighborhood
of a point. The general result, too technical to prove here, is as follows.

2.3 Inverse Function Theorem. Let F: ℝⁿ → ℝⁿ be continuously differentiable
on an open subset S of ℝⁿ, and let x0 be a point in S with invertible derivative matrix
F'(x0). Then there is an open neighborhood N of x0 such that F has a continuously
differentiable inverse function F⁻¹ defined on the image set F(N). The derivative
matrix of F⁻¹ is related to F'(x) by (F⁻¹)'(F(x)) = F'(x)⁻¹ for x in N.

Briefly the theorem says that F has a continuously differentiable local inverse in
some neighborhood of a point x0 where F'(x0) is invertible, or equivalently where
det F'(x0) ≠ 0. The scalar det F'(x) is called the Jacobian determinant of F(x),
and it's crucial for changing variables in multiple integrals in Chapter 7.
EXAMPLE 12 The pair of equations x = u + v, y = u² - v² determines a transformation F from
points (u, v) in ℝ² to points (x, y) in ℝ². But note that F sends every point (u, v)
for which u = -v onto the single point (x, y) = (0, 0). To put it another way, the
entire line u + v = 0 gets sent by F into the single point (0, 0), so without some
restriction on its domain the transformation F can't be one-to-one. On the other
hand, the derivative matrix of F at (u, v) is
\[ F'(u, v) = \begin{pmatrix} 1 & 1 \\ 2u & -2v \end{pmatrix}, \quad\text{so}\quad \det F'(u, v) = -2(u + v). \]
The inverse function theorem implies that F has a continuously differentiable inverse
defined in a neighborhood of every point F(u, v) for which det F'(u, v) = -2(u + v) ≠ 0.
Note that these are exactly the points not on the line u + v = 0, which is
collapsed by F into a single point. For this particular transformation we can actually
compute the coordinate functions for the inverse where it exists:
\[ u = \frac12\left(x + \frac{y}{x}\right), \qquad v = \frac12\left(x - \frac{y}{x}\right), \qquad x = u + v \neq 0. \]


Figure 6.8(b) shows some image curves of vertical and horizontal line segments in
the (u, v)-plane. The points on the diagonal in Figure 6.8(a) all have (x, y) = (0, 0)
as their image.
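The claims about this transformation can be spot-checked in a few lines. The sketch below is illustrative only: the function names and the sample point (1.5, 0.25) are assumptions. It checks that the stated inverse undoes F away from the line u + v = 0, that points on that line collapse to the origin, and that a finite-difference derivative matrix of the inverse agrees with the inverse of F'(u, v), as the inverse function theorem asserts.

```python
def F(u, v):
    """The transformation (u, v) -> (x, y) = (u + v, u^2 - v^2)."""
    return (u + v, u * u - v * v)

def F_inverse(x, y):
    """Inverse coordinate functions, valid only where x = u + v != 0."""
    if x == 0:
        raise ValueError("F is not invertible on the line u + v = 0")
    return ((x + y / x) / 2, (x - y / x) / 2)

def inverse_of_derivative(u, v):
    """Inverse of F'(u, v) = [[1, 1], [2u, -2v]] by the 2-by-2 inverse formula."""
    det = -2 * (u + v)
    if det == 0:
        raise ValueError("det F'(u, v) = 0 on the line u + v = 0")
    return [[-2 * v / det, -1 / det], [-2 * u / det, 1 / det]]

def numeric_derivative_of_inverse(x, y, h=1e-6):
    """Central-difference derivative matrix of F_inverse at (x, y)."""
    ux1, vx1 = F_inverse(x + h, y)
    ux0, vx0 = F_inverse(x - h, y)
    uy1, vy1 = F_inverse(x, y + h)
    uy0, vy0 = F_inverse(x, y - h)
    return [[(ux1 - ux0) / (2 * h), (uy1 - uy0) / (2 * h)],
            [(vx1 - vx0) / (2 * h), (vy1 - vy0) / (2 * h)]]

u0, v0 = 1.5, 0.25                       # sample point with u0 + v0 != 0
x0, y0 = F(u0, v0)
round_trip = F_inverse(x0, y0)           # should recover (u0, v0)
exact = inverse_of_derivative(u0, v0)
approx = numeric_derivative_of_inverse(x0, y0)
collapsed = [F(t, -t) for t in (-3.0, 0.5, 2.0)]   # points on the line u + v = 0
print(round_trip, exact, approx, collapsed)
```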

We treat more examples in detail in Section 5 where both sets of variables have
standard geometric interpretations.

EXERCISES

1. Let u(x, y) be differentiable for all (x, y) in ℝ². Let x = s + t, y = s - t and ū(s, t) = u(s + t, s - t). Use the chain rule to show that
\[ \left(\frac{\partial \bar u}{\partial s}\right)^2 + \left(\frac{\partial \bar u}{\partial t}\right)^2 = 2\left(\frac{\partial u}{\partial x}\right)^2 + 2\left(\frac{\partial u}{\partial y}\right)^2. \]
2. Let x = u² - v² and y = 2uv, and suppose that z = f(x, y) is differentiable. Show that [...]
3. Let z = f(x, y) be differentiable, and let x = u cos v and y = u sin v. Show that for u ≠ 0
\[ \left(\frac{\partial z}{\partial x}\right)^2 + \left(\frac{\partial z}{\partial y}\right)^2 = \left(\frac{\partial z}{\partial u}\right)^2 + \frac{1}{u^2}\left(\frac{\partial z}{\partial v}\right)^2. \]
4. Suppose that z = f(x, y) is differentiable for all (x, y) and that x = e^(u+v) + e^(u-v) and y = e^(u+v) - e^(u-v).
(a) Compute the first-order partial derivatives of x and y with respect to u and with respect to v. Then express these four derivatives in terms of x and y.
(b) Show that [...]
5. Let f: ℝ² → ℝ be continuously differentiable, let x(u, v) = u + v and y(u, v) = u² - v², and set z = f(x(u, v), y(u, v)). Show that z_u² - z_v² = 4x f_x f_y + 4y f_y².
6. Let f: ℝ² → ℝ be continuously differentiable, let x(u, v) = u + v and y(u, v) = u² + v², and set z = f(x(u, v), y(u, v)). Show that z_u² + z_v² = 2f_x² + 4x f_x f_y + 4y f_y².
7. As in Example 10 of the text, show that if (x, y) and (z, w) are related by

z = (x + y)/√2, w = (x - y)/√2,

then u_xx + u_yy = ū_zz + ū_ww.
8. F(u, v) is defined by the equations x = u + v, y = u² - v² in Example 12 of the text.
(a) Show that for fixed v = v0 and varying u the image curves are upward-pointing parabolas passing through (x, y) = (0, 0) as indicated in Figure 6.8(b).
(b) Show that for fixed u = u0 and varying v the image curves are downward-pointing parabolas passing through (x, y) = (0, 0) as indicated in Figure 6.8(b).
(c) Derive the inverse relations u = ½(x + y/x), v = ½(x - y/x), x ≠ 0. [Hint: y = x(u - v).]
9. Define f: ℝ² → ℝ² by the equations x = 2uv, y = u² - v².
(a) Show that the images of lines through the origin in the uv-plane are half-lines emanating from (x, y) = (0, 0) and that each point of a half-line except for (0, 0) is the image of two points.
(b) Show that image curves of circles u² + v² = a² in the uv-plane are circles x² + y² = a⁴ in the xy-plane.
(c) Compute det f'(u, v) and show that, if (u0, v0) ≠ (0, 0), the inverse function theorem implies the existence of a local inverse in a neighborhood of f(u0, v0).
10. Define P: ℝ² → ℝ² by the equations x = u cos v, y = u sin v for u > 0.
(a) Show that for fixed v = v0 and varying u > 0 the image curves in the xy-plane are half-lines emanating from (x, y) = (0, 0).
(b) Show that for fixed u = u0 and varying v the image curves in the xy-plane are circles of radius u0, each one traced infinitely often.
(c) Compute det P'(u, v) and show that if u0 ≠ 0, the inverse function theorem implies the existence of a local inverse in a neighborhood of P(u0, v0).
*11. For a continuously differentiable function f: ℝ → ℝ the inverse function theorem sharpens slightly to assert that if f'(x0) ≠ 0, then f is either strictly increasing or else strictly decreasing in some neighborhood of x0. Prove this using the mean-value theorem for derivatives, and without appealing to the statement of the inverse function theorem. Make sure to make use of the continuity of f'; the conclusion is false without that assumption.
12. The conditions of the inverse function theorem guarantee the existence of a continuously differentiable inverse function. If one of the main hypotheses, F'(x0) ≠ 0, fails there may still be a merely continuous inverse. Verify this last assertion using the function f: ℝ → ℝ defined by f(x) = x³.
13. Use the chain rule to show that under the assumptions of the inverse function theorem 2.3, (F⁻¹)'(F(x0)) = (F'(x0))⁻¹, that is, the derivative matrix at F(x0) of the inverse mapping is equal to the inverse of the matrix F'(x0).

SECTION 3 IMPLICIT DIFFERENTIATION


It may happen that two vectors are related by a formula that doesn't express either
one directly as a function of the other. For example, the formula
\[ \frac{pv}{t} = k \]
expresses the relationship between pressure p, volume v, and temperature t, of the
gas in some container. Or the equations
\[ x^2 + y^2 + z^2 = 1, \qquad x + y + z = 0 \]
may be interpreted as a relation between the three coordinates of a point on both the
sphere of radius 1 centered at (0, 0, 0) in ℝ³ and a plane through the origin. In neither
example do the equations give an explicit formula for any of the coordinates in terms
of others. In this section we study the application of calculus to such relations.

FIGURE 6.8 (a) (b)

For two functions F: ℝ² → ℝ and f: ℝ → ℝ, the equation

F(x, y) = 0

defines f implicitly if F(x, f(x)) = 0 for every x in the domain of f. The zero on
the right side of the equation could in practice be an arbitrary constant c. But since
F(x, y) = c is equivalent to G(x, y) = F(x, y) - c = 0, it's customary to absorb
the constant into the function F in a generic context.
EXAMPLE 1 Let F(x, y) = x² + y² - 1. Then the condition that F(x, f(x)) = x² + (f(x))² - 1 = 0,
for every x in the domain of f, is satisfied by each of the following choices for f:
\[ f_1(x) = \sqrt{1 - x^2}, \qquad -1 \le x \le 1, \]
\[ f_2(x) = -\sqrt{1 - x^2}, \qquad -1 \le x \le 1, \]
\[ f_3(x) = \begin{cases} \sqrt{1 - x^2}, & -1 \le x \le 0, \\ -\sqrt{1 - x^2}, & 0 < x \le 1. \end{cases} \]
Their graphs are shown in Figure 6.9. It follows from the definition of an implicitly
defined function that all three functions f1, f2, f3 are defined implicitly by the
equation x² + y² - 1 = 0.

FIGURE 6.9 (a) (b) (c)
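A short check makes the point concrete: each of the three functions satisfies F(x, f(x)) = 0 on its domain, and the third one does so even though it jumps at x = 0. The sketch below is illustrative; the sample grid is an arbitrary choice, and a small guard against rounding is added under the square root.

```python
import math

def F(x, y):
    return x * x + y * y - 1

def upper(x):
    # max(0.0, ...) guards against tiny negative values caused by rounding
    return math.sqrt(max(0.0, 1 - x * x))

def f1(x):
    return upper(x)

def f2(x):
    return -upper(x)

def f3(x):
    """Upper semicircle for x <= 0, lower semicircle for x > 0."""
    return upper(x) if x <= 0 else -upper(x)

samples = [-1 + k / 10 for k in range(21)]          # grid on [-1, 1]
residuals = [max(abs(F(x, f(x))) for x in samples) for f in (f1, f2, f3)]
jump = f3(0.0) - f3(1e-9)                           # close to 2: f3 is discontinuous at 0
print(residuals, jump)
```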

Consider a function F: ℝ^(n+m) → ℝᵐ. We can write an arbitrary element in ℝ^(n+m)
as (x1, ..., xn, y1, ..., ym), or as a pair (x, y), where x = (x1, ..., xn) and y =
(y1, ..., ym). In this way F looks like either a function of the two vector variables,
x in ℝⁿ and y in ℝᵐ, or a function of the single vector variable (x, y) in ℝ^(n+m). The
function G: ℝⁿ → ℝᵐ is defined implicitly by the equation

F(x, y) = 0

if F(x, G(x)) = 0 for every x in the domain of G.

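EXAMPLE 2 The equations
\[ x + y + z - 1 = 0, \qquad 2x + z + 2 = 0 \tag{*} \]
determine y and z as functions of x. We get

y = x + 3, z = -2x - 2.

Notice that the number of equations is the same as the number of variables that we
solve for, two in this example.
In terms of a function F: ℝ³ → ℝ², Equations (*) are
\[ \begin{aligned} F\left(x, \begin{pmatrix} y \\ z \end{pmatrix}\right) &= \begin{pmatrix} x + y + z - 1 \\ 2x + z + 2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \\ &= \begin{pmatrix} 1 \\ 2 \end{pmatrix} x + \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} y \\ z \end{pmatrix} + \begin{pmatrix} -1 \\ 2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \end{aligned} \]
The implicitly defined function G: ℝ → ℝ² is
\[ G(x) = \begin{pmatrix} y \\ z \end{pmatrix} = \begin{pmatrix} x + 3 \\ -2x - 2 \end{pmatrix}. \]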

Although Example 1 shows that an implicitly defined function need not be continuous,
we'll be primarily concerned in this section with functions that are not only
continuous but also differentiable. The implicit function theorem described in
Theorem 3.4 gives conditions for the existence of a differentiable G defined by an equation
F(x, G(x)) = 0. However, we consider here the problem of finding the derivative of
G only when G and G' are both assumed to exist. Suppose the functions F: ℝ² → ℝ
and G: ℝ → ℝ are differentiable and that

F(x, G(x)) = 0

for every x in the domain of G. Then the chain rule applied to F(x, G(x)) yields,
in terms of the partial derivatives F_x and F_y,

F_x(x, G(x)) + F_y(x, G(x))G'(x) = 0.

Since in a typical application we don't have an explicit formula for y = G(x), we
rewrite the previous equation as
\[ F_x(x, y) + F_y(x, y)\frac{dy}{dx} = 0. \]
Solving the last equation for dy/dx gives
\[ \frac{dy}{dx} = -\frac{F_x(x, y)}{F_y(x, y)}, \qquad\text{if } F_y(x, y) \neq 0. \tag{3.1} \]

EXAMPLE 3 If F(x, y) = x² + y² - 1 = 0 is thought of as defining y implicitly as a function
y = y(x), we can differentiate both sides with respect to x to get
\[ 2x + 2y\frac{dy}{dx} = 0. \]
Solving for dy/dx gives
\[ \frac{dy}{dx} = -\frac{x}{y} \qquad\text{if } y \neq 0. \]
For example, at the point (x0, y0) = (1/√2, 1/√2), we have F(x0, y0) = 0, and
\[ \frac{dy}{dx}(x_0, y_0) = -\frac{2x_0}{2y_0} = -1. \]
Thus the graph of the implicitly defined function has slope -1 at (x0, y0). Figure 6.10
shows the tangent line there.

FIGURE 6.10 The curve x² + y² - 1 = 0

The process just described is called implicit differentiation, and it extends to
vector-valued functions of several variables too.
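Formula 3.1 is easy to try on the circle. The sketch below, with hypothetical helper names, evaluates -F_x/F_y at (1/√2, 1/√2) and compares it with a central-difference slope of the explicit upper semicircle y = √(1 - x²), which is the implicitly defined function passing through that point.

```python
import math

def F_x(x, y):
    """Partial derivative of F(x, y) = x^2 + y^2 - 1 with respect to x."""
    return 2 * x

def F_y(x, y):
    """Partial derivative of F(x, y) = x^2 + y^2 - 1 with respect to y."""
    return 2 * y

def dy_dx(x, y):
    """Implicit differentiation: dy/dx = -F_x / F_y, valid where F_y != 0."""
    if F_y(x, y) == 0:
        raise ZeroDivisionError("F_y vanishes; formula 3.1 does not apply here")
    return -F_x(x, y) / F_y(x, y)

x0 = y0 = 1 / math.sqrt(2)
slope = dy_dx(x0, y0)

# Cross-check against the explicit branch y = sqrt(1 - x^2) through (x0, y0).
h = 1e-6
numeric = (math.sqrt(1 - (x0 + h) ** 2) - math.sqrt(1 - (x0 - h) ** 2)) / (2 * h)
print(slope, numeric)
```

At a point such as (1, 0), where F_y = 0, the helper raises an error instead of returning a slope, which mirrors the condition attached to formula 3.1.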

EXAMPLE 4 Given the equations

x² + y² + z² - 6 = 0, xyz + 2 = 0,

suppose that x and y are differentiable functions of z, that is, the function defined
implicitly by the equations is of the form (x, y) = G(z). To compute dx/dz and
dy/dz, we apply the chain rule to the given equations to get
\[ 2x\frac{dx}{dz} + 2y\frac{dy}{dz} + 2z = 0, \qquad yz\frac{dx}{dz} + xz\frac{dy}{dz} + xy = 0. \]
We can solve these new equations for dx/dz and dy/dz. The solution is
\[ \begin{pmatrix} dx/dz \\ dy/dz \end{pmatrix} = \begin{pmatrix} \dfrac{x(y^2 - z^2)}{z(x^2 - y^2)} \\[2ex] \dfrac{y(z^2 - x^2)}{z(x^2 - y^2)} \end{pmatrix}, \]
which is the matrix G'(z). Notice that the corresponding values for x and y have to be
known to make the formula completely explicit. That is, from the information given
so far, there is no possible way of evaluating dx/dz at z = 1. On the other hand, given
the point (x, y, z) = (1, -2, 1) satisfying both equations, we have (dx/dz)(1) = -1.
The reason is that, just as in Example 1, there is more than one function G defined
implicitly by the given equations. By specifying a particular point on its graph, we
determine G uniquely in the vicinity of the point.
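The dependence on the chosen point can be seen by solving the 2-by-2 linear system for dx/dz and dy/dz directly. In the sketch below the function name is hypothetical, and the second point (-2, 1, 1) is an extra illustration not taken from the text; it also satisfies both equations with z = 1 but gives a different value of dx/dz.

```python
def implicit_derivatives(x, y, z):
    """Solve  2x dx/dz + 2y dy/dz = -2z,  yz dx/dz + xz dy/dz = -xy  by Cramer's rule."""
    a11, a12, b1 = 2.0 * x, 2.0 * y, -2.0 * z
    a21, a22, b2 = y * z, x * z, -x * y
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("coefficient matrix is singular at this point")
    dx_dz = (b1 * a22 - a12 * b2) / det
    dy_dz = (a11 * b2 - a21 * b1) / det
    return dx_dz, dy_dz

def on_both_surfaces(x, y, z):
    """True if (x, y, z) satisfies x^2 + y^2 + z^2 = 6 and xyz = -2."""
    return x * x + y * y + z * z == 6 and x * y * z == -2

point_a = (1.0, -2.0, 1.0)      # the point used in the example
point_b = (-2.0, 1.0, 1.0)      # another solution with z = 1 (illustrative)
result_a = implicit_derivatives(*point_a)
result_b = implicit_derivatives(*point_b)
print(result_a, result_b)
```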

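EXAMPLE 5 Consider

xu + yv + zw = 1,
x + y + z + u + v + w = 0,
xy + zuv + w = 1.

Suppose that each of x, y, and z is a function of u, v, and w. To find the partial
derivatives of x, y, and z with respect to w, we differentiate the three equations using
the chain rule.
\[ \begin{aligned} u\frac{\partial x}{\partial w} + v\frac{\partial y}{\partial w} + w\frac{\partial z}{\partial w} + z &= 0, \\ \frac{\partial x}{\partial w} + \frac{\partial y}{\partial w} + \frac{\partial z}{\partial w} + 1 &= 0, \\ y\frac{\partial x}{\partial w} + x\frac{\partial y}{\partial w} + uv\frac{\partial z}{\partial w} + 1 &= 0. \end{aligned} \]
The linear system for x_w, y_w, z_w in matrix form is
\[ \begin{pmatrix} u & v & w \\ 1 & 1 & 1 \\ y & x & uv \end{pmatrix} \begin{pmatrix} x_w \\ y_w \\ z_w \end{pmatrix} = \begin{pmatrix} -z \\ -1 \\ -1 \end{pmatrix}. \]
Solving this linear system is simplest using Cramer's rule, giving, for example,
x_w as
\[ \frac{\partial x}{\partial w} = \frac{uv^2 + xz + w - zuv - xw - v}{u^2 v + vy + wx - yw - ux - uv^2}. \]
Similarly, we could solve for ∂y/∂w and ∂z/∂w. To find partials with respect to u,
differentiate the original equations with respect to u and solve for ∂x/∂u, ∂y/∂u,
and ∂z/∂u. Partials with respect to v are found by the same method.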
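The Cramer's rule computation can be reproduced mechanically. The sketch below is an illustration with hypothetical function names; the sample values are arbitrary and are not claimed to satisfy the three original equations, so the check concerns only the linear algebra: the solution of the 3-by-3 system and its agreement with the quotient of determinants for the partial of x with respect to w.

```python
def det3(m):
    """Determinant of a 3-by-3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def partials_wrt_w(x, y, z, u, v, w):
    """Solve the linear system for (x_w, y_w, z_w) by Cramer's rule."""
    A = [[u, v, w], [1, 1, 1], [y, x, u * v]]
    rhs = [-z, -1, -1]
    d = det3(A)
    if d == 0:
        raise ValueError("coefficient matrix is singular for these values")
    solution = []
    for col in range(3):
        M = [row[:] for row in A]
        for r in range(3):
            M[r][col] = rhs[r]          # replace one column by the right-hand side
        solution.append(det3(M) / d)
    return tuple(solution)

def x_w_closed_form(x, y, z, u, v, w):
    """The quotient stated in the example for the partial of x with respect to w."""
    num = u * v**2 + x * z + w - z * u * v - x * w - v
    den = u**2 * v + v * y + w * x - y * w - u * x - u * v**2
    return num / den

vals = dict(x=1.0, y=2.0, z=3.0, u=0.5, v=-1.0, w=2.0)   # arbitrary sample values
x_w, y_w, z_w = partials_wrt_w(**vals)
print(x_w, x_w_closed_form(**vals))
```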

The computation indicated in Example 5 leads to the nine entries in the derivative
matrix of an implicitly defined vector function. For the computation to work it's
necessary to have the number of given equations equal the number of implicitly
defined coordinate functions, just as in Example 2. To get more insight into the
reason for this requirement, suppose we are given a differentiable vector function

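\[ F(u, v, x, y) = \begin{pmatrix} F_1(u, v, x, y) \\ F_2(u, v, x, y) \end{pmatrix} \]
and that the equations

F_1(u, v, x, y) = 0, F_2(u, v, x, y) = 0

implicitly define a differentiable function (x, y) = G(u, v). Differentiating these
equations with respect to u and v using the chain rule, we get
\[ \begin{aligned} \frac{\partial F_1}{\partial u} + \frac{\partial F_1}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial F_1}{\partial y}\frac{\partial y}{\partial u} &= 0, & \frac{\partial F_1}{\partial v} + \frac{\partial F_1}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial F_1}{\partial y}\frac{\partial y}{\partial v} &= 0, \\ \frac{\partial F_2}{\partial u} + \frac{\partial F_2}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial F_2}{\partial y}\frac{\partial y}{\partial u} &= 0, & \frac{\partial F_2}{\partial v} + \frac{\partial F_2}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial F_2}{\partial y}\frac{\partial y}{\partial v} &= 0. \end{aligned} \]
These equations written in matrix form are
\[ -\begin{pmatrix} \dfrac{\partial F_1}{\partial u} & \dfrac{\partial F_1}{\partial v} \\[1.5ex] \dfrac{\partial F_2}{\partial u} & \dfrac{\partial F_2}{\partial v} \end{pmatrix} = \begin{pmatrix} \dfrac{\partial F_1}{\partial x} & \dfrac{\partial F_1}{\partial y} \\[1.5ex] \dfrac{\partial F_2}{\partial x} & \dfrac{\partial F_2}{\partial y} \end{pmatrix} \begin{pmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[1.5ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{pmatrix}. \]
The last matrix on the right is the derivative matrix G'(u, v). Solving for it, we get
\[ G'(u, v) = \begin{pmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[1.5ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{pmatrix} = -\begin{pmatrix} \dfrac{\partial F_1}{\partial x} & \dfrac{\partial F_1}{\partial y} \\[1.5ex] \dfrac{\partial F_2}{\partial x} & \dfrac{\partial F_2}{\partial y} \end{pmatrix}^{-1} \begin{pmatrix} \dfrac{\partial F_1}{\partial u} & \dfrac{\partial F_1}{\partial v} \\[1.5ex] \dfrac{\partial F_2}{\partial u} & \dfrac{\partial F_2}{\partial v} \end{pmatrix}. \tag{3.2} \]
To be able to solve uniquely for the matrix G'(u, v) it's essential that the inverse
matrix appearing in Equation 3.2 should exist. In particular, this requires that the
matrix to be inverted be square, in other words that the number of equations originally
given must equal the number of implicitly determined variables; equivalently, the
number of variables you solve for must be the same as the number of equations
that determine them, just as for linear systems for which you may expect a unique
solution.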
The analog of Equation 3.2 holds for an arbitrary number of equations under suit-
able hypotheses and the proof follows similar lines. We summarize the generalization
of Equations 3.1 and 3.2 as follows.

3.3 Theorem. Suppose F: ℝ^(n+m) → ℝᵐ and G: ℝⁿ → ℝᵐ are differentiable and
that y = G(x) satisfies F(x, y) = 0 for all x in some open subset of ℝⁿ. Then
\[ G'(x) = -F_y^{-1}(x, G(x))\,F_x(x, G(x)), \]
provided that the m-by-m matrix F_y is invertible. The derivative matrix F_y is
computed with x held fixed and F_x is computed with y held fixed.
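As a small worked instance of this formula, take the linear system of Example 2, where F(x, (y, z)) = (x + y + z - 1, 2x + z + 2). There F_x is the column (1, 2) and F_y is the 2-by-2 matrix with rows (1, 1) and (0, 1), so the formula should return the derivative of G(x) = (x + 3, -2x - 2). The sketch below uses hypothetical helper names and solves F_y s = F_x by Cramer's rule rather than forming the inverse explicitly.

```python
def solve_2x2(a, b):
    """Solve the 2-by-2 system a s = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("F_y is not invertible")
    s0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    s1 = (a[0][0] * b[1] - a[1][0] * b[0]) / det
    return [s0, s1]

# Partial derivative matrices of F(x, (y, z)) = (x + y + z - 1, 2x + z + 2).
F_x = [1.0, 2.0]                      # derivatives with respect to x
F_y = [[1.0, 1.0], [0.0, 1.0]]        # derivatives with respect to (y, z)

# G'(x) = -F_y^{-1} F_x
G_prime = [-s for s in solve_2x2(F_y, F_x)]

def G(x):
    """Explicit solution from Example 2, used only as a cross-check."""
    return (x + 3, -2 * x - 2)

h = 1e-6
numeric = [(G(0.4 + h)[i] - G(0.4 - h)[i]) / (2 * h) for i in range(2)]
print(G_prime, numeric)
```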

The subscript notation used in the theorem is illustrated in the next example.

EXAMPLE 6 Suppose that
\[ F(x, y, z) = \begin{pmatrix} x^2 y + xz \\ xz + yz \end{pmatrix} \]
and that we choose the vector variables x = x, y = (y, z). Then
\[ F_x(x, y, z) = \begin{pmatrix} 2xy + z \\ z \end{pmatrix} \quad\text{and}\quad F_{(y,z)}(x, y, z) = \begin{pmatrix} x^2 & x \\ z & x + y \end{pmatrix}. \]
Note that the vector y must be chosen so that F_y is a square matrix and that the
implicit differentiation formula works only when that matrix is invertible. Thus we
must choose (x, y) so that

det F_y(x, y) ≠ 0.

For the choice made in Example 6, we have
\[ \det\begin{pmatrix} x^2 & x \\ z & x + y \end{pmatrix} = x^3 + x^2 y - xz, \]
so the formula fails at any point (x, y, z) for which x³ + x²y - xz = 0.


Theorem 3.3 is very general as far as it goes, but it does nothing to guarantee the
existence of a function G such that F(x, G(x)) = 0; this is done, in theory at least, by
Theorem 3.4. The proof is too complicated to give here, but there's a straightforward
equivalence with the inverse function theorem of Section 2B.

3.4 Implicit Function Theorem. Let F: ℝ^(n+m) → ℝᵐ be a continuously
differentiable function. Suppose for some x0 in ℝⁿ and some y0 in ℝᵐ that

(i) F(x0, y0) = 0 and (ii) F_y(x0, y0) is an invertible m-by-m matrix.

Then there is a unique continuously differentiable function G: ℝⁿ → ℝᵐ defined on
an open neighborhood N of x0 in ℝⁿ such that F(x, G(x)) = 0 for all x in N and
G(x0) = y0.

Note that condition (ii) of the statement is equivalent to det F_y(x0, y0) ≠ 0
and is just what is needed to make sense of the formula for computing G'(x) in
Theorem 3.3. Theorem 3.4 is useful for identifying points at which the level sets
S of a function are smooth. This identification typically has to be done piecemeal,
treating one "patch" of a curve or surface at a time.

EXAMPLE 7 The circular cone F(x, y, z) = x² + y² - z² = 0 has two conical parts, symmetric
about the z-axis, each with a sharp point at (0, 0, 0). Applying the implicit function
theorem, we can ask where z is represented locally as the graph of a smooth function
z = z(x, y). We compute F_z(x, y, z) = -2z. Hence near any point (x, y, z) where
z ≠ 0 the desired function exists. This example is atypical in that we can actually
find such functions. Here z = ±√(x² + y²) meet the requirements for the top
and bottom halves of the cone respectively. Conceivably, the origin might not be
singular with respect to representation as either x = x(y, z) or y = y(x, z). But
F_x(0, 0, 0) = F_y(0, 0, 0) = 0 also, so the origin is singular with respect to those
possibilities also.

EXERCISES

1. The equation x² + y² - 1 = 0 is satisfied by many values of (x, y), including (1, 0), (0, 1), and (1/√2, 1/√2). Use implicit differentiation in both parts (a) and (b).
(a) Express dy/dx in terms of x and y, and evaluate at (x, y) = (1/√2, 1/√2). Does it make sense to evaluate at (x, y) = (0, 1) or (x, y) = (-1, 1)?
(b) Express dx/dy in terms of x and y, and evaluate at (x, y) = (1/√2, -1/√2). Does it make sense to evaluate at (x, y) = (1, 0) or (x, y) = (0, 1)?
(c) Solve the given equation explicitly for y in terms of x and interpret the results of part (a) graphically.
(d) Solve the given equation explicitly for x in terms of y and interpret the results of part (b) graphically.
2. The equation x² - y² - 1 = 0 is satisfied by many points including (x, y) = (√3, √2), (x, y) = (1, 0) and (x, y) = (-1, 0). Use implicit differentiation in both parts (a) and (b).
(a) Express dy/dx in terms of x and y, and evaluate at (x, y) = (√3, √2). Does it make sense to evaluate at (x, y) = (1, 0) or (x, y) = (1, 1)?
(b) Express dx/dy in terms of x and y, and evaluate at (x, y) = (√2, 1). Does it make sense to evaluate at (x, y) = (1, 0) or (x, y) = (0, 1)?
(c) Solve the given equation explicitly for y in terms of x and interpret the results of part (a) graphically.
(d) Solve the given equation explicitly for x in terms of y and interpret the results of part (b) graphically.

In Exercises 3 to 6, use implicit differentiation to find dy/dx, and, if possible, dx/dy at the indicated point.
3. xy + 1 = 0 at (x, y) = (-1, 1)
4. xe^y + ye^x = 0 at (x, y) = (0, 0)
5. x + y(x² + 1) + ½ = 0 at (x, y) = (-1, ¼)
6. x² + y² = 1 at (x, y) = (1/√2, 1/√2)

7. Suppose that x²y + yz = 0 and xyz + 1 = 0.
(a) Find dx/dz and dy/dz at (x, y, z) = (1, 1, -1).
(b) Find dy/dx and dz/dx at (x, y, z) = (1, 1, -1).
(c) Find dx/dy and dz/dy at (x, y, z) = (1, 1, -1).
8. If x + y - u - v = 0 and x - y + 2u + v = 0, find ∂x/∂u, ∂y/∂u, ∂x/∂v, and ∂y/∂v by
(a) first solving for x and y in terms of u and v
(b) implicit differentiation
9. If Exercise 7 is expressed in the general vector notation of Theorem 3.3, what are F, x, y, F_x, and F_y for part (a)? Part (b)? Part (c)?
10. If Exercise 8 is expressed in the vector notation of Theorem 3.3, what is the matrix G'(x)?
11. If x² + yu + xv + w = 0, x + y + uvw + 1 = 0, then, regarding x and y as functions of u, v, and w, find ∂x/∂u and ∂y/∂u at (x, y, u, v, w) = (1, -1, 1, 1, -1).
12. The equations 2x³y + yx² + t² = 0, x + y + t - 1 = 0 implicitly define a curve
\[ f(t) = \begin{pmatrix} x(t) \\ y(t) \end{pmatrix} \quad\text{that satisfies}\quad f(1) = \begin{pmatrix} -1 \\ 1 \end{pmatrix}. \]
Find the tangent line to the curve when t = 1.
13. Let the equation x²/4 + y² + z²/9 - 1 = 0 define z implicitly as a function z = f(x, y) near the point x = 1, y = √11/6, z = 2. The graph of the function f is a surface. Find its tangent plane at (1, √11/6, 2).
14. Suppose the equation F(x, y, z) = 0 implicitly defines z = f(x, y) and that z0 = f(x0, y0). Suppose further that the surface that is the graph of z = f(x, y) has a tangent plane at (x0, y0), as defined in Chapter 4, Section 3B. Show that
\[ (x - x_0)\frac{\partial F}{\partial x}(x_0, y_0, z_0) + (y - y_0)\frac{\partial F}{\partial y}(x_0, y_0, z_0) + (z - z_0)\frac{\partial F}{\partial z}(x_0, y_0, z_0) = 0 \]
is an equation for this tangent plane.
15. The equations

2x + y + 2z + u - v - 1 = 0
xy + z - u + 2v - 1 = 0
yz + xz + u² + v = 0

define x, y, and z as functions of u and v near (x, y, z, u, v) = (1, 1, -1, 1, 1).
(a) Find the derivative matrix of the implicitly defined function
\[ \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x(u, v) \\ y(u, v) \\ z(u, v) \end{pmatrix} = f(u, v) \quad\text{at } (u, v) = (1, 1). \]
(b) The function f parametrically defines a surface in the (x, y, z) space. Find the tangent plane to it at the point (1, 1, -1).
16. Show that the hyperboloid of two sheets x² + y² - z² + 1 = 0 has two pieces, not intersecting, that are graphs of smooth functions of the form z = z(x, y).
(a) Do this by finding explicit representations for the two graphs.
(b) Apply the implicit function theorem to show this for a neighborhood N of every point x0 in ℝ².
17. It's intuitively evident that a sphere of radius a centered at the origin is a smooth surface S near all of its points.
(a) Explain how this follows from the implicit function theorem, applied with each of x, y and z as dependent variable, to the function F(x, y, z) = x² + y² + z² - a².
(b) Find representations for parts of S in six pieces, so showing explicitly the smoothness of S.
18. The sphere x² + y² + z² - 4 = 0 and the plane x + y + z - 2 = 0 have points in common, for example (2, 0, 0), and the two surfaces appear to intersect in a circle.
(a) Find the center c and radius a of the circle.
(b) The implicit function theorem doesn't actually find an explicit parametrization of the circle for you, but it does show that this can in principle be done in overlapping pieces. Explain how.
19. The equation xyz - yz² + x²y = 1 is satisfied by the points on a level set S in ℝ³ of F(x, y, z) = xyz - yz² + x²y.
(a) Find all points (x, y, z) such that F_x(x, y, z) = 0, where the implicit function theorem fails to guarantee the existence of x = x(y, z).
(b) Some of the points found in part (a) may not lie on S. Show that the only points on S at which it's impossible to apply the implicit function theorem to guarantee existence of x = x(y, z) are the solutions of 2x + z = 0 and 5yz² + 4 = 0.
(c) Solve the x-quadratic equation xyz - yz² + x²y = 1 explicitly for x = x(y, z). The points where this solution fails to define a continuously differentiable function should be the same as the points found in part (b).
20. The equation xyz - yz² + x²y = 1 in Exercise 19 is satisfied by the points on a level set S in ℝ³ of F(x, y, z) = xyz - yz² + x²y.
(a) Find all points (x, y, z) such that F_z(x, y, z) = 0, as required by the implicit function theorem for the existence of z = z(x, y).
(b) Some of the points found in part (a) may not lie on S. Show that all points on S near which it's impossible to apply the implicit function theorem to guarantee existence of z = z(x, y) are the solutions of x - 2z = 0 and 5x²y - 4 = 0.
(c) Solve the z-quadratic equation xyz - yz² + x²y = 1 explicitly for z = z(x, y). The points where this solution fails to define a continuously differentiable function should be the same as the points found in part (b).
(d) The results of parts (b) of Exercises 19 and 20 together imply that S is a smooth surface at all of its points. Explain why.
*21. Show that the inverse function theorem (Theorem 2.3) follows from the implicit function theorem (Theorem 3.4) by setting F(x, y) = x - f(y).

SECTION 4 EXTREME VALUES


4A Critical Points
The problem of finding the maximum and minimum values of a real-valued function
of several variables is important in many branches of applied mathematics, as well
as in pure mathematics. Familiar examples are extremes of temperature, speed, or
economic profit, each of which may be a function of more than one variable in a
practical problem.
A real-valued function f has an absolute maximum value at x0 if, for all x in
the domain of f,

f(x) ≤ f(x0),

and an absolute minimum value if, instead,

f(x0) ≤ f(x).

The number f(x0) is called a local maximum value or a local minimum value if
there is a neighborhood N of x0 such that, respectively,

f(x) ≤ f(x0) or f(x0) ≤ f(x),

for all x in N. A maximum or minimum value of f is called an extreme value.


A point $\mathbf{x}_0$ at which an extreme value occurs is called an extreme point. The routines of single-variable and multivariable calculus just identify extreme points, and the corresponding extreme values, of real-valued differentiable functions under the assumption that these extreme values do indeed exist. The fundamental theorem that guarantees existence of extreme values makes no reference to differentiability of a function $f\colon S \to \mathbb{R}$ but only to continuity on a set $S$ in $\mathbb{R}^n$ that is both closed, so contains all its boundary points, and bounded, so is contained in a ball of finite radius. For functions of a real variable $x$, the typical closed, bounded set encountered in this context is an interval $a \le x \le b$, often denoted $[a, b]$. We state the theorem without its fairly technical proof. The conditions of the theorem are designed to specifically exclude two possibilities: (i) that $f$ is unbounded on $S$, and (ii) that $f$ might approach a limit on $S$ that is not attained at a point in $S$.

4.1 Theorem. For a function $f\colon S \to \mathbb{R}$ assume (i) that $S$ is a closed, bounded subset of $\mathbb{R}^n$, and (ii) that $f$ is continuous on $S$. Then $f$ assumes its absolute maximum and absolute minimum values on $S$.

EXAMPLE 1 Consider the function defined by $f(x, y) = x^2 + y^2$ for points $(x, y)$ in the set $S$ of points that lie inside or on the ellipse $x^2 + 2y^2 = 1$. Note that $S$ is closed and bounded and that $f$ is continuous on $S$. The graph of $f$ is shown in Figure 6.11.

[FIGURE 6.11]


Suppose that $f$ has an extreme value (i.e., maximum or minimum) at a point $(x_0, y_0)$ in the interior of the ellipse. Then both functions $f_1$ and $f_2$ defined by

$$f_1(x) = f(x, y_0), \qquad f_2(y) = f(x_0, y)$$

must also have extreme values at $x_0$ and $y_0$, respectively. Applying the familiar criterion for differentiable functions of one variable, we have

$$f_1'(x_0) = f_2'(y_0) = 0.$$

Since

$$f_1'(x_0) = \frac{\partial f}{\partial x}(x_0, y_0) \quad\text{and}\quad f_2'(y_0) = \frac{\partial f}{\partial y}(x_0, y_0),$$

a necessary condition for $f$ to have an extreme value at $(x_0, y_0)$ is

$$\frac{\partial f}{\partial x}(x_0, y_0) = \frac{\partial f}{\partial y}(x_0, y_0) = 0.$$

In this example, with $f(x, y) = x^2 + y^2$,

$$\frac{\partial f}{\partial x}(x, y) = 2x \quad\text{and}\quad \frac{\partial f}{\partial y}(x, y) = 2y,$$

and so the only extreme value of $f$ in the interior of the ellipse occurs at $(x_0, y_0) = (0, 0)$. From the graph of $f$, shown in Figure 6.11, we see that the value 0 there is a local minimum. We next consider the values of $f$ on the boundary curve itself. The ellipse is defined parametrically by the function

$$g(t) = (x, y) = \left(\cos t, \tfrac{1}{\sqrt{2}}\sin t\right), \qquad 0 \le t < 2\pi.$$

Thus the values of $f$ on the ellipse are given as the values of the composition $f \circ g$. Any extreme values of $f$ on the ellipse will be extreme for $f \circ g$. The latter is a real-valued function of one variable, and we treat it in the usual way, that is, by setting its derivative equal to zero. By the chain rule, we obtain

$$\begin{aligned}
\frac{d}{dt}(f \circ g) &= \nabla f(g(t)) \cdot g'(t) \\
&= \left(2\cos t, \tfrac{2}{\sqrt{2}}\sin t\right) \cdot \left(-\sin t, \tfrac{1}{\sqrt{2}}\cos t\right) \\
&= -2\cos t \sin t + \sin t \cos t \\
&= -\tfrac{1}{2}\sin 2t.
\end{aligned}$$

Extreme values therefore may occur at $t = 0$, $\pi/2$, $\pi$, and $3\pi/2$. The corresponding values of $(x, y)$ are $(1, 0)$, $(0, 1/\sqrt{2})$, $(-1, 0)$, and $(0, -1/\sqrt{2})$, and those of $f$ are $1$, $\tfrac{1}{2}$, $1$, and $\tfrac{1}{2}$, respectively. We see that the absolute minimum of $f$ is 0 at $(0, 0)$ and that the absolute maximum of $f$ occurs at the two points $(1, 0)$ and $(-1, 0)$. Notice that the two extreme values of $f \circ g$ that occur at $t = \pi/2$ and $3\pi/2$ are not extreme for $f$, as we see by looking at Figure 6.11.
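The comparison of interior and boundary candidates carried out in this example can be replayed numerically. The following Python sketch is only an illustration under our own naming (`interior`, `boundary`, and `candidates` are not from the text); it evaluates $f(x, y) = x^2 + y^2$ at the interior critical point and at the four boundary parameter values $t = 0, \pi/2, \pi, 3\pi/2$, then picks out the smallest and largest values.

```python
import math

def f(x, y):
    # The function of the example.
    return x**2 + y**2

def g(t):
    # Parametrization of the boundary ellipse x^2 + 2y^2 = 1.
    return (math.cos(t), math.sin(t) / math.sqrt(2))

# Interior candidate: the only point where both partials 2x and 2y vanish.
interior = [((0.0, 0.0), f(0.0, 0.0))]

# Boundary candidates: zeros of d/dt f(g(t)) = -(1/2) sin 2t on [0, 2*pi).
boundary = []
for t in (0.0, math.pi / 2, math.pi, 3 * math.pi / 2):
    x, y = g(t)
    boundary.append(((x, y), f(x, y)))

candidates = interior + boundary
abs_min = min(value for _, value in candidates)
abs_max = max(value for _, value in candidates)
print("absolute minimum:", abs_min)
print("absolute maximum:", abs_max)
```

The sketch only compares finitely many candidate values; that such a comparison suffices rests on Theorem 4.1, which guarantees that the extreme values exist on the closed, bounded set $S$.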

The methods used in the preceding example are valid in any number of dimensions.
The next theorem is the principal criterion used in this extension, and although we can
prove it by reducing it to the single-variable method, we give a proof that contains
the single-variable situation as a special case.

4.2 Theorem. If a differentiable function $\mathbb{R}^n \xrightarrow{f} \mathbb{R}$ has a local extreme value at a point $\mathbf{x}_0$ interior to its domain, then $\nabla f(\mathbf{x}_0) = \mathbf{0}$.

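Proof. Suppose $f$ has a local minimum at $\mathbf{x}_0$. For any unit vector $\mathbf{u}$ in $\mathbb{R}^n$, there is an $\epsilon > 0$ such that if $-\epsilon < t < \epsilon$, then $f(\mathbf{x}_0) \le f(\mathbf{x}_0 + t\mathbf{u})$. Hence, for $0 < t < \epsilon$,

$$0 \le \frac{f(\mathbf{x}_0 + t\mathbf{u}) - f(\mathbf{x}_0)}{t}, \quad\text{and}\quad 0 \le \frac{f(\mathbf{x}_0 - t\mathbf{u}) - f(\mathbf{x}_0)}{t}.$$

It follows from Theorem 3.1 of Chapter 5 that

$$\frac{\partial f}{\partial \mathbf{u}}(\mathbf{x}_0) = f'(\mathbf{x}_0)\mathbf{u}.$$

Therefore

$$0 \le \lim_{t \to 0+} \frac{f(\mathbf{x}_0 + t\mathbf{u}) - f(\mathbf{x}_0)}{t} = f'(\mathbf{x}_0)\mathbf{u},$$

$$0 \le \lim_{t \to 0+} \frac{f(\mathbf{x}_0 - t\mathbf{u}) - f(\mathbf{x}_0)}{t} = f'(\mathbf{x}_0)(-\mathbf{u}) = -f'(\mathbf{x}_0)\mathbf{u}.$$

We conclude that $f'(\mathbf{x}_0)\mathbf{u} = 0$. Because $\mathbf{u}$ is an arbitrary unit vector, $f'(\mathbf{x}_0) = 0$.

The argument for a maximum value is similar. ∎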

The previous theorem is what we should expect. Recall that

$$\frac{\partial f}{\partial \mathbf{u}}(\mathbf{x}_0) = f'(\mathbf{x}_0)\mathbf{u},$$

and that the derivative with respect to $\mathbf{u}$ measures the rate of change of $f$ in the direction of $\mathbf{u}$. At an extreme point in the interior of the domain of $f$, this rate should be zero in every direction. The importance of the theorem is that of all the interior points $\mathbf{x}$ of the domain of $f$ we need to look for extreme points only among those for which $f'(\mathbf{x}) = 0$. Points $\mathbf{x}$ for which $f'(\mathbf{x}) = 0$ are called critical points of $f$.
4B Constraints
As we did in Example 1 we'll consider in more detail real-valued functions $f$ on open sets $D$, trying to find the extreme points of $f$ when $f$ has its domain restricted to some subset $S$ of $D$. Two possibilities that we must look out for are

1. A point $\mathbf{x}$ where $\nabla f(\mathbf{x}) = \mathbf{0}$ is not necessarily an extreme point for $f$.

2. $f$ may have an extreme point $\mathbf{x}$ on a set $S$ without having $\nabla f(\mathbf{x}) = \mathbf{0}$.

EXAMPLE 2 Let $f(x, y, z) = xyz$ in the set defined by $|x| \le 1$, $|y| \le 1$, $|z| \le 1$. Thus the domain of $f$ is the cube $C$ with edges of length 2 illustrated in Figure 6.12(a). The condition $\nabla f(\mathbf{x}) = \mathbf{0}$ for critical points amounts to $(yz, xz, xy) = (0, 0, 0)$. The solutions of this equation are the points satisfying $x = y = 0$, or $x = z = 0$, or $y = z = 0$; in other words, the coordinate axes. Since $f$ has the value zero at all of its critical points, and since $f$ has both positive and negative values in the neighborhood of each of these points, no critical point can be an extreme point. Furthermore, a little thought shows that $f$ has maximum value 1 and minimum value $-1$ on $C$. These values occur at the eight corners of the cube, none of which is a critical point for $f$.
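Both phenomena in this example are easy to observe numerically. The sketch below is an illustration only: the sample point `p` on the $x$-axis and the step `h` are arbitrary choices of ours, not values from the text. It checks that the gradient of $f(x, y, z) = xyz$ vanishes at `p` while $f$ takes both signs nearby, and that the extreme values $\pm 1$ occur at corners of the cube where the gradient is not zero.

```python
from itertools import product

def f(x, y, z):
    return x * y * z

def grad_f(x, y, z):
    # Gradient of xyz, as computed in the example.
    return (y * z, x * z, x * y)

# A critical point on the x-axis (arbitrary illustrative choice).
p = (0.5, 0.0, 0.0)

# Nearby points where f is positive and negative, so p is not extreme.
h = 0.1
near_pos = f(p[0], h, h)
near_neg = f(p[0], h, -h)

# Values of f at the eight corners of the cube.
corner_values = {c: f(*c) for c in product((-1, 1), repeat=3)}

print("gradient at p:", grad_f(*p))
print("nearby values:", near_pos, near_neg)
print("max and min at corners:",
      max(corner_values.values()), min(corner_values.values()))
```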

The boundary set $S$ of a region $R$ in $\mathbb{R}^n$ is itself never an open subset of $\mathbb{R}^n$, so examining critical points of $f$ is of no use in finding whatever extreme points of $f$ may lie on $S$, as on the boundary of the cube in the previous example. More generally, we may be interested in maximizing or minimizing a function $f$ whose domain is restricted to a lower-dimensional set, say a curve or a surface, that we may not necessarily regard as the boundary of some region.

EXAMPLE 3 The function $f(x, y, z) = y^2 - z - x$ has as its gradient the vector

$$\nabla f(x, y, z) = (-1, 2y, -1),$$

so $f$ has no critical points as a function defined on $\mathbb{R}^3$. However, suppose that $f$ is restricted to the curve $\gamma$ defined parametrically by

$$(x, y, z) = (t, t^2, t^3), \qquad -\infty < t < \infty.$$

On $\gamma$, $f$ takes the values $F(t) = f(t, t^2, t^3) = t^4 - t^3 - t$, while $t$ varies over $(-\infty, \infty)$. We have

$$F'(t) = 4t^3 - 3t^2 - 1 = (t - 1)(4t^2 + t + 1).$$

Then $F'(t)$ is zero only at $t = 1$. Furthermore, since $F''(t) = 12t^2 - 6t$, we have $F''(1) > 0$. It follows that $f$ has a relative minimum at the point $(1, 1, 1)$ while restricted to the curve $\gamma$. The minimum value of $f$ on $\gamma$ is $f(1, 1, 1) = -1$, and there are no other extreme values.
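The one-variable computation in this example can be checked mechanically. In the sketch below, which is illustrative only, the scan window $[-3, 3]$ and the grid spacing are assumptions of ours; since $F(t) \to \infty$ as $|t| \to \infty$, a bounded window suffices to locate the minimum. The quadratic factor $4t^2 + t + 1$ has negative discriminant, which is why $t = 1$ is the only real zero of $F'$.

```python
def F(t):
    # f restricted to the curve (t, t^2, t^3).
    return t**4 - t**3 - t

def dF(t):
    return 4 * t**3 - 3 * t**2 - 1

def d2F(t):
    return 12 * t**2 - 6 * t

# Discriminant b^2 - 4ac of the factor 4t^2 + t + 1 of F'(t).
disc = 1**2 - 4 * 4 * 1

# Coarse scan over an assumed window [-3, 3] with spacing 0.001.
ts = [k / 1000 for k in range(-3000, 3001)]
t_best = min(ts, key=F)

print("F'(1) =", dF(1), " F''(1) =", d2F(1), " discriminant =", disc)
print("grid minimizer:", t_best, " minimum value:", F(t_best))
```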

EXAMPLE 4 Suppose the function $f(x, y, z) = x + y + z$ is restricted to the intersection of the two surfaces

$$x^2 + y^2 = 1, \qquad z = 2$$

shown in Figure 6.12(b). The curve $C$ of intersection is parametrized by

$$(x, y, z) = (\cos t, \sin t, 2), \qquad 0 \le t < 2\pi.$$

The function $f$ on $C$ takes the value $F(t) = \cos t + \sin t + 2$. We have $F'(t) = -\sin t + \cos t$, so $F'(t) = 0$ at $t = \pi/4$ and $t = 5\pi/4$. Since $F''(\pi/4) < 0$ and $F''(5\pi/4) > 0$,

$$F(\pi/4) = 2 + \sqrt{2}$$

is the maximum and

$$F(5\pi/4) = 2 - \sqrt{2}$$

is the minimum value for $f$ on $C$.

This problem can also be done by setting $y = \pm\sqrt{1 - x^2}$ and $z = 2$ in $f(x, y, z)$ and then finding the maximum and minimum values of the two functions

$$y = x \pm \sqrt{1 - x^2} + 2$$

on the interval $-1 \le x \le 1$.

[FIGURE 6.12: panels (a) and (b)]

4C Lagrange Multipliers
The solution of the previous problem depended on our being able to find a concrete
parametric representation for the curve of intersection of the plane z - 2 = 0 and the
cylinder x 2 + y 2 - 1 = 0. When a specific parametrization is not readily available,
we can still sometimes apply the method of Lagrange multipliers, to be described
next. The method consists in verifying the pure existence of a parametric represen-
tation and then deriving necessary conditions for there to be an extreme point for a
function f when restricted to the curve or surface.

4.3 Lagrange Multiplier Method. Suppose that a function $\mathbb{R}^n \xrightarrow{f} \mathbb{R}$ is differentiable and is restricted to a set $S$ and that $f$ has a local extreme point at $\mathbf{x}_0$ on $S$. Suppose that near $\mathbf{x}_0$, the set $S$ is a smooth level set of a function $\mathbb{R}^n \xrightarrow{G} \mathbb{R}^m$ with $m < n$ and coordinate functions $G_1, G_2, \ldots, G_m$. Then there are constants $\lambda_1, \lambda_2, \ldots, \lambda_m$ such that $\mathbf{x}_0$ is a critical point of the real-valued function $f + \lambda_1 G_1 + \cdots + \lambda_m G_m$, that is,

$$\nabla f(\mathbf{x}_0) + \lambda_1 \nabla G_1(\mathbf{x}_0) + \cdots + \lambda_m \nabla G_m(\mathbf{x}_0) = \mathbf{0}.$$

[FIGURE 6.13: (a) $n = 2$, $m = 1$; (b) $n = 3$, $m = 2$]

Why does the method work? What the Lagrange method does for us in practice is allow us to restrict attention to solutions $\mathbf{x}_0$ of the previous equation that also lie on $S$. A complete proof of the method's correctness is fairly complicated, but the geometric idea behind it is quite plausible. Since $\mathbf{x}_0$ is a local extreme point for $f$ on $S$, derivatives of $f$ in directions parallel to $S$ at $\mathbf{x}_0$ should be zero; in other words,

$$\frac{\partial f}{\partial \mathbf{u}}(\mathbf{x}_0) = \nabla f(\mathbf{x}_0) \cdot \mathbf{u} = 0$$

for every unit vector $\mathbf{u}$ tangent to $S$ at $\mathbf{x}_0$. But vectors $\mathbf{u}$ tangent to $S$ are perpendicular to the $m$ normal vectors $\nabla G_1(\mathbf{x}_0), \nabla G_2(\mathbf{x}_0), \ldots, \nabla G_m(\mathbf{x}_0)$, as illustrated in Figure 6.13 for the cases $n = 2$, $m = 1$ and $n = 3$, $m = 2$. The previous displayed equation shows that $\mathbf{u}$ is also perpendicular to $\nabla f(\mathbf{x}_0)$. The pictures suggest, and it can be proved, that $\nabla f(\mathbf{x}_0)$ is then a linear combination of the vectors $\nabla G_k(\mathbf{x}_0)$, a combination that we choose to write here in the form

$$\nabla f(\mathbf{x}_0) + \lambda_1 \nabla G_1(\mathbf{x}_0) + \cdots + \lambda_m \nabla G_m(\mathbf{x}_0) = \mathbf{0}.$$

If $m = 1$, then $\nabla f(\mathbf{x}_0)$ will typically be parallel to $\nabla G_1(\mathbf{x}_0)$, as shown in Figure 6.13(a), but in any case there are constants $\lambda_k$ such that the Lagrange condition is satisfied at $\mathbf{x}_0$. Figure 6.14 is a picture that includes the graph of a function $\mathbb{R}^2 \xrightarrow{f} \mathbb{R}$ and shows some critical vectors $\nabla f$ perpendicular to the set $S$ determined by $G(x, y) = 0$.

[FIGURE 6.14: graph of $f$ over the curve $G(x, y) = 0$, with vectors $\nabla f$]

Remark 1. It's important to understand that the Lagrange condition is only a
necessary condition that must hold at a local extreme point, which is why we can
use it to exclude many other points from consideration. The Lagrange condition may
hold at some points that are not extreme points, just as the gradient of a function f
may be zero at points that are not extreme points of f.
Remark 2. In all max-min problems it's important to have some grounds for
believing that the desired extreme points do indeed exist. In addition we need to be
able to distinguish among the critical points for those that provide relative maxima
and minima. Section 4E gives a second derivative test that's sometimes helpful
for doing this. For the Lagrange method to be effective, we also need to assure
ourselves that the set S to which f is restricted is not only closed and bounded but
also sufficiently smooth, so sharp corners on curves and sharp edges in surfaces have
to be examined separately. We deal with these issues in the Exercises.

EXAMPLE 5 The problem of Example 4 is that of finding the extreme points of $f(x, y, z) = x + y + z$ subject to the conditions

$$G_1(x, y, z) = x^2 + y^2 - 1 = 0, \qquad G_2(x, y, z) = z - 2 = 0.$$

According to Theorem 4.3 we write

$$(x + y + z) + \lambda_1 (x^2 + y^2 - 1) + \lambda_2 (z - 2).$$

The critical points of this function of $x$, $y$, and $z$ occur when

$$1 + 2\lambda_1 x = 0, \qquad 1 + 2\lambda_1 y = 0, \qquad 1 + \lambda_2 = 0.$$

In addition, we have to satisfy the two given conditions on $x$, $y$, and $z$, making in all five equations in the five variables $x, y, z, \lambda_1, \lambda_2$. Solution of such a system is often best carried out by looking for simplifying substitutions among the equations, aimed at reducing the number of variables in any one equation. The condition $G_2 = 0$ in this problem says that $z = 2$, so we're already down to four variables. Subtracting the second Lagrange condition from the first gives

$$2\lambda_1 (x - y) = 0, \quad\text{so either } \lambda_1 = 0 \text{ or } x = y.$$

The choice $\lambda_1 = 0$ is inconsistent with $1 + 2\lambda_1 x = 0$, so we reject that possibility and accept $x = y$. Setting $x = y$ in $x^2 + y^2 - 1 = 0$ gives $2y^2 = 1$, so $(x, y) = \pm(1/\sqrt{2}, 1/\sqrt{2})$. Thus the only values of $f$ that we need to check are at $(x, y, z) = (\pm 1/\sqrt{2}, \pm 1/\sqrt{2}, 2)$, with matching signs. There we find respectively $2 + \sqrt{2}$ for a maximum and $2 - \sqrt{2}$ for a minimum as in Example 4. In some problems it's convenient to find explicit values for one or more of the $\lambda$'s along the way, though that wasn't necessary in this example. (It was nevertheless important not to neglect any possible $\lambda$-values.)
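The five equations of this example can be checked by direct substitution. The Python sketch below is only a verification aid: the function name `lagrange_residuals` is our own, and the multiplier values are computed explicitly from $1 + 2\lambda_1 x = 0$ and $1 + \lambda_2 = 0$ purely so that all five left-hand sides can be evaluated, even though the text did not need them.

```python
import math

def lagrange_residuals(x, y, z, lam1, lam2):
    """Left-hand sides of the five equations; all should be zero."""
    return (
        1 + 2 * lam1 * x,    # partial derivative in x
        1 + 2 * lam1 * y,    # partial derivative in y
        1 + lam2,            # partial derivative in z
        x**2 + y**2 - 1,     # constraint G1 = 0
        z - 2,               # constraint G2 = 0
    )

s = 1 / math.sqrt(2)
solutions = []
for sign in (1, -1):
    x = y = sign * s
    lam1 = -1 / (2 * x)      # solves 1 + 2*lam1*x = 0
    lam2 = -1.0              # solves 1 + lam2 = 0
    solutions.append((x, y, 2.0, lam1, lam2))

# Values of f = x + y + z at the two candidate points.
values = [x + y + z for (x, y, z, _, _) in solutions]

for sol, val in zip(solutions, values):
    print(sol[:3], "f =", val, "residuals:", lagrange_residuals(*sol))
```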

EXAMPLE 6 Find the maximum value of $f(x, y, z) = x - y + z$, subject to the condition $x^2 + y^2 + z^2 = 1$. The function

$$x - y + z + \lambda(x^2 + y^2 + z^2 - 1)$$

has critical points satisfying

$$1 + 2\lambda x = 0, \qquad -1 + 2\lambda y = 0, \qquad 1 + 2\lambda z = 0,$$

and

$$x^2 + y^2 + z^2 = 1.$$

The solutions of these four equations are found as in the previous example:

$$\lambda = \pm\frac{\sqrt{3}}{2}, \qquad x = -y = z = \mp\frac{1}{\sqrt{3}}.$$

The maximum of $f$ occurs at $(1/\sqrt{3}, -1/\sqrt{3}, 1/\sqrt{3})$. The maximum value is $\sqrt{3}$. What is the minimum value?
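A brute-force comparison gives an independent check on the maximum found by the Lagrange method. The sketch below samples the unit sphere in spherical coordinates and records the largest value of $f(x, y, z) = x - y + z$; the grid sizes `n_theta` and `n_phi` are arbitrary assumptions of ours, so the result agrees with $\sqrt{3}$ only up to the grid resolution.

```python
import math

def f(x, y, z):
    return x - y + z

# Assumed grid resolution; finer grids give closer agreement.
n_theta, n_phi = 200, 400

best_val, best_pt = -math.inf, None
for i in range(n_theta + 1):
    theta = math.pi * i / n_theta          # polar angle in [0, pi]
    for j in range(n_phi):
        phi = 2 * math.pi * j / n_phi      # azimuth in [0, 2*pi)
        p = (math.sin(theta) * math.cos(phi),
             math.sin(theta) * math.sin(phi),
             math.cos(theta))
        v = f(*p)
        if v > best_val:
            best_val, best_pt = v, p

print("sampled maximum:", best_val, "at", best_pt)
print("Lagrange value: ", math.sqrt(3))
```

Sampling of this kind can only suggest where extreme values lie; the Lagrange equations are what pin down the exact point.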
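EXAMPLE 7 Let $g(x_1, x_2, \ldots, x_n) = 0$ implicitly define a surface $S$ in $\mathbb{R}^n$ and let $\mathbf{a} = (a_1, a_2, \ldots, a_n)$ be a fixed point not on $S$. Suppose we want to minimize locally the distance from $\mathbf{a}$ to $S$. Minimizing the distance from $\mathbf{a}$ to $S$ is the same as minimizing the square of the distance, which is easier to differentiate. Applying the Lagrange method, we look for points $\mathbf{p}$ on $S$ that are critical points of

$$\sum_{k=1}^{n} (x_k - a_k)^2 + \lambda g(x_1, \ldots, x_n)$$

for some $\lambda$. The critical points satisfy, in addition to $g(x_1, \ldots, x_n) = 0$, the equations

$$\begin{aligned}
2(x_1 - a_1) + \lambda \frac{\partial g}{\partial x_1}(x_1, \ldots, x_n) &= 0 \\
&\;\;\vdots \\
2(x_n - a_n) + \lambda \frac{\partial g}{\partial x_n}(x_1, \ldots, x_n) &= 0.
\end{aligned}$$

In vector form, these equations reduce at the critical point $\mathbf{p}$ to

$$\begin{pmatrix} p_1 - a_1 \\ \vdots \\ p_n - a_n \end{pmatrix} = -\frac{\lambda}{2} \begin{pmatrix} \dfrac{\partial g}{\partial x_1}(\mathbf{p}) \\ \vdots \\ \dfrac{\partial g}{\partial x_n}(\mathbf{p}) \end{pmatrix},$$

where $\mathbf{p} = (p_1, \ldots, p_n)$. The vector $\mathbf{p} - \mathbf{a}$ on the left is then either zero or parallel to the normal vector to $S$ at $\mathbf{p}$, which appears on the right side of the equation. In other words, we have shown that $\mathbf{p} - \mathbf{a}$ is perpendicular to $S$, or else $\mathbf{p} = \mathbf{a}$. A two-dimensional example is illustrated in Figure 6.15, where $\mathbf{p}$ provides a local, but not a global, minimum.

[FIGURE 6.15]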

EXAMPLE 8 Suppose that a cylindrical can is to contain a fixed volume $V$ and that its surface area, with top and bottom, is to be as small as possible. If the radius of the can is $x$, and its height is $y$, then $V = \pi x^2 y$. We want to minimize the total area $2\pi x^2 + 2\pi x y$ of the top, bottom, and sides. We write

$$F(x, y) = 2\pi x^2 + 2\pi x y + \lambda(\pi x^2 y - V)$$

and look for critical points of $F$. We find that $F_x = 0$, $F_y = 0$ reduce to

$$2x + y + \lambda x y = 0, \qquad 2x + \lambda x^2 = 0.$$

The second equation is satisfied if $x = 0$ or if $\lambda x = -2$. But $x = 0$ would require $V = 0$, so we substitute $\lambda x = -2$ into the first equation to get $2x = y$. Thus height $y$ must equal diameter $2x$. The value of $x$ for a given volume $V$ can then be determined from the equation $2\pi x^3 = \pi x^2 y = V$.
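The conclusion that the height equals the diameter can be turned into a small calculation. In the sketch below, the function names and the sample volume `V = 1000.0` are hypothetical choices of ours; the code solves $2\pi x^3 = V$ for the radius, sets $y = 2x$, and compares the resulting area with the areas of two other cans of the same volume.

```python
import math

def can_dimensions(volume):
    """Radius x and height y from 2*pi*x**3 = V and y = 2x."""
    x = (volume / (2 * math.pi)) ** (1 / 3)
    return x, 2 * x

def area(x, y):
    # Top and bottom plus the side of the can.
    return 2 * math.pi * x**2 + 2 * math.pi * x * y

V = 1000.0  # hypothetical volume, in arbitrary cubic units
x, y = can_dimensions(V)

def area_at_radius(r):
    # Area of the can of radius r whose height is forced by the volume V.
    return area(r, V / (math.pi * r**2))

print("radius:", x, " height:", y)
print("area at optimum:     ", area_at_radius(x))
print("area at 0.9 * radius:", area_at_radius(0.9 * x))
print("area at 1.1 * radius:", area_at_radius(1.1 * x))
```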

EXAMPLE 9 The planes

$$x + y + z - 1 = 0 \quad\text{and}\quad x + y - z = 0$$

intersect in a line $S$ as shown in Figure 6.16(a). Let $f(x, y, z) = xy$, and restrict $f$ to the line $S$. Using the Lagrange method to maximize $f$ on $S$, we consider

$$xy + \lambda(x + y + z - 1) + \mu(x + y - z).$$

Its critical points occur when

$$y + \lambda + \mu = 0, \qquad x + \lambda + \mu = 0, \qquad \lambda - \mu = 0.$$

The only point that satisfies these conditions, together with the condition that it lie on $S$, is $\mathbf{x}_0 = (\tfrac{1}{4}, \tfrac{1}{4}, \tfrac{1}{2})$. The maximum value is thus $\tfrac{1}{16} = f(\mathbf{x}_0)$. Note that $f$ has no minimum on $S$. We have also that $\nabla f(\mathbf{x}_0) = (\tfrac{1}{4}, \tfrac{1}{4}, 0)$, which is perpendicular to $S$. The unit vector $\mathbf{u}$ in the direction of $\nabla f(\mathbf{x}_0)$ is shown in Figure 6.16(a) with its initial point moved to $\mathbf{x}_0$.

4D Saddle Points
A critical point $\mathbf{x}_0$ of a function $f$ such that $f(\mathbf{x}_0)$ is neither a local maximum nor a local minimum value for $f$ is called a saddle point for $f$.

EXAMPLE 10 Let $f(x, y) = y^2 - x^2$. Then $f_x(0, 0) = f_y(0, 0) = 0$, so $\mathbf{x}_0 = (0, 0)$ is a critical point. Since $f(x, 0) = -x^2 < 0$ for all $x \ne 0$, and $f(0, y) = y^2 > 0$ for all $y \ne 0$, $f(0, 0) = 0$ is neither a local maximum nor a local minimum value and so $(0, 0)$ is a saddle point. The graph of $f$ is shown in Figure 6.16(b), a hyperbolic paraboloid.

[FIGURE 6.16: (a) the line $S$ with the unit vector $\mathbf{u}$; (b) graph of $f(x, y) = y^2 - x^2$]

Summary of Methods. To find the maximum and minimum values of a differentiable function $f$ defined on a region $R$ in $\mathbb{R}^n$, compare the values of $f$ at the following points:

(a) Critical points of $f$ in the interior of $R$

(b) Points on the boundary of $R$

In case (b), or in the case where $R$ has no interior points, we can do either of the following:

1. Find a parametric representation $g$ for the boundary of $R$, in which case we have a new problem with the function $f \circ g$ defined on a set of one lower dimension.
2. Use the Lagrange method.

EXERCISES

In Exercises 1 to 8, find all critical points, if any, of the given function.

1. $f(x, y) = x^2 + 4xy - y^2 - 8x - 6y$
2. $f(x, y) = x^2 + y^2 - 2xy - y^3$
3. $f(x, y) = x^2 - y^2 - 2xy - y$
4. $f(x, y) = x^2 - y^2 + x + y$
5. $f(x, y, z) = x^2 + y^2 - z^2 - xz + x$
6. $f(x, y, z) = x^2 + 4xy - y^2 + z^2 - 8x - 6y + z$
7. $f(x, y, z, w) = xy + yz + zw + wx + x + y - z + w$
8. $f(x, y, z, w) = x^2 - z^2 - w^2 - 2xy + 2zw$

In Exercises 9 to 14, find the points at which the largest and smallest values are attained by the function in the region.

9. $x + y$ inside and on the square with corners $(\pm 1, \pm 1)$
10. $x + y \sin x$ in the rectangular region $0 \le x \le 2\pi$, $-1 \le y \le 1$
11. $x^2 + 24xy + 8y^2$ in the circular region $x^2 + y^2 \le 25$
12. $1/(x^2 + y^2)$ in the circular region $(x - 2)^2 + y^2 \le 1$
13. $x^2 + y^2 + (2\sqrt{2}/3)xy$ in the elliptical region $x^2 + 2y^2 \le 1$
14. $x^2 + 2y^2$ in the circular region $x^2 + y^2 \le 1$

15. Find the points that are farthest from the origin on the closed curve in $\mathbb{R}^3$ parametrized by

$$g(t) = (\cos t, \sin t, \sin(t/2)).$$

In 16 to 19, find all critical points of the following functions in the given region.

16. $x + y + z$ where $x^2 + y^2 + z^2 \le 1$
17. $xy - xz$ where $|x| \le 1$, $|y| \le 1$, $|z| \le 1$
18. $x^2 + y^2 - z^2 + z$ where $x^2 + y^2 + z^2 \le 1$
19. $x^3 - y^3 + z^3 - x + y - z$ where $0 \le x \le 1$, $0 \le y \le 1$, $0 \le z \le 1$

20. Find the maximum value of the function $x(y + z)$, given that $x^2 + y^2 = 1$ and $xz = 1$.

21. Find the minimum value of $x + y^2$, subject to the condition $2x^2 + y^2 = 1$.

22. Let $f(x, y)$ and $g(x, y)$ be continuously differentiable, and suppose that subject to the condition $g(x, y) = 0$, $f(x, y)$ attains its maximum value $M$ at $(x_0, y_0)$. Show that the level curve $f(x, y) = M$ is tangent to the curve $g(x, y) = 0$ at $(x_0, y_0)$.

23. A rectangular box with a square base and no top is to contain exactly 108 cubic inches. Find the dimensions that yield the minimum surface area.

24. A rectangular box with a square base and no top is to contain exactly 54 cubic inches. Find the dimensions that yield the minimum cost if base material costs four times as much as side material.

25. A rectangular box with a square base and no top is to contain volume $V$. Find the dimensions that yield the minimum cost if material for one side costs twice as much as for the other three sides and material for the base costs three times as much as for the less expensive sides.

26. (a) Find the minimum distance in $\mathbb{R}^2$ from the circle $x^2 + y^2 = 1$ to the line $x + y = 4$. [Hint: Treat the square of the distance as a function of four variables.]
(b) Solve part (a) geometrically and compare answers.

27. (a) Find the maximum value of $x^2 + xy + y^2 + yz + z^2$, subject to the condition $x^2 + y^2 + z^2 = 1$.
(b) Find the maximum value of the same function subject to the conditions $x^2 + y^2 + z^2 = 1$ and $x + \sqrt{2}\,y + z = 0$.

28. (a) Find the points $\mathbf{x}_0$ at which $f(x, y) = x^2 - y^2 - y$ attains its maximum on the circle $x^2 + y^2 = 1$.
(b) Find the directions in which $f$ increases most rapidly at $\mathbf{x}_0$.

29. The planes $x + y - z - 2w = 1$ and $x - y + z + w = 2$ intersect in a set $\mathcal{F}$ in $\mathbb{R}^4$. Find the point on $\mathcal{F}$ that is nearest to the origin.

30. Let $\mathbf{x}_1, \ldots, \mathbf{x}_N$ be points in $\mathbb{R}^n$, and let

$$f(\mathbf{x}) = \sum_{k=1}^{N} |\mathbf{x} - \mathbf{x}_k|^2.$$

Find the point at which $f$ attains its minimum and find the minimum value.

*31. Prove by solving an appropriate minimum problem that if $a_k > 0$, $k = 1, \ldots, n$, then

$$(a_1 a_2 \cdots a_n)^{1/n} \le \frac{a_1 + a_2 + \cdots + a_n}{n}.$$

32. Let $f(x, y) = 2x^2 + y^2$ be restricted to the set $S$ satisfying $x^{1/3} + y^{1/3} = 1$ as well as $|x| \le 1$ and $|y| \le 1$.
(a) What conclusion can you draw from applying the Lagrange multiplier method to $f$ on $S$?
(b) Sketch the set $S$, and explain why $f$ attains both maximum and minimum values on $S$. What are these values, and where are they attained?

33. Consider the extreme-value problem for $f(x, y, z) = x$ restricted to the set $C_h$ satisfying $x^2 + y^2 + z^2 - 1 = 0$ and $z = h$ for constant $h \ne \pm 1$.
(a) For what values of $h$ is $C_h$ a closed, bounded, nonempty set in $\mathbb{R}^3$?
(b) What conclusion can you draw from applying the Lagrange multiplier method to this problem for the values of $h$ found in part (a)?
(c) What are the maximum and minimum values attained by $f$ on $C_h$?

34. The extreme-value problem for $f(x, y) = x^2 - y^2$ subject to the condition $x + y = 1$ has no solution for either a maximum or a minimum.
(a) Explain why this is so.
(b) What is the result of trying to apply the Lagrange method to this problem? Does the Lagrange theorem really apply here? Explain why or why not.

35. The extreme-value problem for $f(x, y, z) = x^2 - y^2 + z^2$ subject to the conditions $x + y + z = 3$ and $x^2 + y^2 + z^2 = 3$ has a unique solution. Find the unique solution to the problem.

36. A rectangular box with no top is to have surface area 48 square units. Our problem is to choose dimensions that maximize the volume $V = xyz$.
(a) Use the Lagrange method to eliminate from consideration those $(x, y, z)$ that can't maximize $V$.
(b) Show by using the constraint on $(x, y, z)$ that $V$ tends to zero if $x$ or $y$ or $z$ tends to $+\infty$.
(c) Use the results of parts (a) and (b) to find the constrained values of $x$, $y$, and $z$ that maximize $V$.

37. (a) A rectangular shed with an open front and no flooring is to be built to shelter 108 cubic feet. If the roof material costs twice as much as the material for the three walls, what dimensions will be the least expensive?
(b) How would the answer change if roofing costs the same as walls?

38. A rectangular building is to be built to contain a fixed volume $V$. Heat loss through the roof and walls is proportional to area, and heat loss through the floor is negligible. Heat loss through the roof material is 3 times as rapid as through the wall material. What dimensions will minimize heat loss?

39. Let $\mathbb{R}^n \xrightarrow{F} \mathbb{R}^m$ be differentiable on $\mathbb{R}^n$. Prove that a smooth curve or surface $S$ defined implicitly by $F(\mathbf{x}) = \mathbf{k}$ is a closed set in $\mathbb{R}^n$. [Hint: To show that $S$ contains a given boundary point $\mathbf{x}_0$, let $\{\mathbf{x}_j\}$ be a sequence of points in $S$ such that $\lim_{j \to \infty} \mathbf{x}_j = \mathbf{x}_0$ and consider $\lim_{j \to \infty} F(\mathbf{x}_j)$.]

4E Second-Derivative Criterion
In this section we'll identify strict local extreme points, that is, points $\mathbf{x}_0$ for which a strict inequality $f(\mathbf{x}_0) > f(\mathbf{x})$ or $f(\mathbf{x}_0) < f(\mathbf{x})$ holds for $\mathbf{x} \ne \mathbf{x}_0$ in some neighborhood of $\mathbf{x}_0$. For functions of one real variable with two continuous derivatives the second-derivative test says that at a critical point $x_0$ interior to its domain $f$ has (i) a local minimum if $f''(x_0) > 0$, (ii) a local maximum if $f''(x_0) < 0$, or (iii) neither of these if $f''$ changes sign at $x_0$. The intuitive geometric content of these alternatives is as follows: near $x_0$ the graph of $f$ (i) is concave up and so stays above the horizontal tangent through $(x_0, f(x_0))$ if $f''(x_0) > 0$, (ii) is concave down and stays below the tangent if $f''(x_0) < 0$, or (iii) crosses the tangent in case $f''$ changes sign at $x_0$. We'll see that the alternatives for functions $\mathbb{R}^n \to \mathbb{R}$ are very similar, the main technical difference being the criteria that we use to decide about concavity. The essence of the extension to higher dimensions consists of examining the second-order directional derivative, defined naturally by

$$\frac{\partial^2 f}{\partial \mathbf{u}^2}(\mathbf{x}) = \frac{\partial}{\partial \mathbf{u}}\left(\frac{\partial f}{\partial \mathbf{u}}\right)(\mathbf{x}).$$

The basic analysis is in the following theorem, which contains the 1-dimensional case and so bears out the intuitive review we started with.

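4.4 Theorem. Let $\mathbb{R}^n \xrightarrow{f} \mathbb{R}$ be twice continuously differentiable on an open subset of $\mathbb{R}^n$ that contains a critical point $\mathbf{x}_0$ of $f$.

(i) If $\dfrac{\partial^2 f}{\partial \mathbf{u}^2}(\mathbf{x}_0) > 0$ for all unit vectors $\mathbf{u}$, then $f(\mathbf{x}_0)$ is a strict local minimum value.
(ii) If $\dfrac{\partial^2 f}{\partial \mathbf{u}^2}(\mathbf{x}_0) < 0$ for all unit vectors $\mathbf{u}$, then $f(\mathbf{x}_0)$ is a strict local maximum value.
(iii) If $\dfrac{\partial^2 f}{\partial \mathbf{u}^2}(\mathbf{x}_0)$ is positive for some $\mathbf{u}$ and negative for others, then $\mathbf{x}_0$ is a saddle point.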

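Proof. We first observe that if $g(x)$ is real-valued and twice continuously differentiable on an interval containing 0 in its interior, then

$$g(x) = g(0) + g'(0)x + \int_0^x (x - t) g''(t)\,dt. \qquad (*)$$

To see this just compute the integral by parts and apply the fundamental theorem of calculus to the remaining integral. Now let $g(x) = f(\mathbf{x}_0 + x\mathbf{u})$, where $\mathbf{u}$ is a unit vector in $\mathbb{R}^n$. By the chain rule $g'(x) = \nabla f(\mathbf{x}_0 + x\mathbf{u}) \cdot \mathbf{u}$; applying the chain rule again we get

$$g''(x) = \nabla\bigl(\nabla f(\mathbf{x}_0 + x\mathbf{u}) \cdot \mathbf{u}\bigr) \cdot \mathbf{u} = \frac{\partial}{\partial \mathbf{u}}\left(\frac{\partial f}{\partial \mathbf{u}}\right)(\mathbf{x}_0 + x\mathbf{u}).$$

Since $\mathbf{x}_0$ is a critical point for $f$, it follows that $g'(0) = \nabla f(\mathbf{x}_0) \cdot \mathbf{u} = 0$. Replacing $g$ by the corresponding expressions in $f$ everywhere in Equation $(*)$, we get

$$f(\mathbf{x}_0 + x\mathbf{u}) = f(\mathbf{x}_0) + \int_0^x (x - t)\,\frac{\partial^2 f}{\partial \mathbf{u}^2}(\mathbf{x}_0 + t\mathbf{u})\,dt.$$

But $\partial^2 f/\partial \mathbf{u}^2$ is continuous, so for small enough $t$ the sign of $\partial^2 f/\partial \mathbf{u}^2(\mathbf{x}_0 + t\mathbf{u})$ is the same as the sign of $\partial^2 f/\partial \mathbf{u}^2(\mathbf{x}_0)$. Case (i) now follows by checking that the inequality $f(\mathbf{x}_0 + x\mathbf{u}) - f(\mathbf{x}_0) > 0$ holds at all points $x$ for which $0 < |x\mathbf{u}| = |x| < \delta$, for some positive $\delta > 0$; cases (ii) and (iii) are similar. ∎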
Remark. Just as in the 1-dimensional case, the three cases of Theorem 4.4 don't cover all possibilities. (See Exercise 1.) For example, $\partial^2 f/\partial \mathbf{u}^2(\mathbf{x}_0)$ could be zero for all unit vectors $\mathbf{u}$, in which case the statement yields no information.
Theorem 4.4 is in a way a straightforward generalization of the second-derivative criterion of single-variable calculus. But in practice the transition from dimension 1 to dimension 2 is a distinctive one, because in $\mathbb{R}^2$ there are infinitely many ways to approach a critical point, while in $\mathbb{R}$ there are at most two approaches, from the right or the left. For dimension 3 or more the additional distinctions are mainly technical, so we'll concentrate on functions $\mathbb{R}^2 \to \mathbb{R}$.
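4.5 Theorem. Suppose that $f(x, y)$ has continuous second-order partials $f_{xx}$, $f_{xy} = f_{yx}$ and $f_{yy}$ defined on an open set containing $\mathbf{x}_0$. Let $\mathbf{u} = (u, v)$. Then

$$\frac{\partial^2 f}{\partial \mathbf{u}^2}(\mathbf{x}_0) = f_{xx}(\mathbf{x}_0)u^2 + 2f_{xy}(\mathbf{x}_0)uv + f_{yy}(\mathbf{x}_0)v^2.$$

Proof. We know that $\partial f/\partial \mathbf{u} = \nabla f \cdot \mathbf{u} = f_x u + f_y v$, so the second-order derivative $\partial^2 f/\partial \mathbf{u}^2$ is

$$\begin{aligned}
\nabla(\nabla f \cdot \mathbf{u}) \cdot \mathbf{u} &= \nabla(f_x u + f_y v) \cdot (u, v) \\
&= (f_x u + f_y v)_x u + (f_x u + f_y v)_y v \\
&= f_{xx}u^2 + 2f_{xy}uv + f_{yy}v^2. \quad ∎
\end{aligned}$$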
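The formula of Theorem 4.5 can be compared against a direct numerical second derivative of $t \mapsto f(\mathbf{x}_0 + t\mathbf{u})$. The sketch below is an illustration under assumptions of ours: the sample polynomial $f(x, y) = x^2 y + y^3 - y$, the base point $(1, 0.5)$, the direction angle $0.7$, and the step `h` are all arbitrary choices, and the central second difference is a standard approximation rather than anything specific to the text.

```python
import math

def f(x, y):
    # Sample polynomial; its second partials are 2y, 2x, 6y.
    return x**2 * y + y**3 - y

def second_directional_formula(x, y, u, v):
    # f_xx u^2 + 2 f_xy u v + f_yy v^2, as in Theorem 4.5.
    fxx, fxy, fyy = 2 * y, 2 * x, 6 * y
    return fxx * u**2 + 2 * fxy * u * v + fyy * v**2

def second_directional_numeric(x, y, u, v, h=1e-4):
    # Central second difference of t -> f(x + t u, y + t v) at t = 0.
    return (f(x + h * u, y + h * v) - 2 * f(x, y)
            + f(x - h * u, y - h * v)) / h**2

angle = 0.7  # arbitrary direction
u, v = math.cos(angle), math.sin(angle)
exact = second_directional_formula(1.0, 0.5, u, v)
approx = second_directional_numeric(1.0, 0.5, u, v)
print("formula:", exact, " finite difference:", approx)
```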
In applications the second partials in Theorem 4.5 are to be evaluated at some critical point $\mathbf{x}_0 = (x_0, y_0)$. Often the decision as to whether we're in case (i), (ii), (iii), or none of these, in Theorem 4.4 will follow from very elementary observations.

EXAMPLE 11 Suppose $f_{xx}(x_0, y_0) = p > 0$, $f_{yy}(x_0, y_0) = q > 0$ and $f_{xy}(x_0, y_0) = 0$. Then $(\partial^2 f/\partial \mathbf{u}^2)(x_0, y_0) = pu^2 + qv^2 > 0$, since $u$ and $v$ can't both be zero if $|(u, v)| = 1$. Thus we're in case (i) of Theorem 4.4, and $f(x_0, y_0)$ would be a strict minimum at a critical point. On the other hand, if $f_{xx}(x_0, y_0) = f_{yy}(x_0, y_0) = 0$ and $f_{xy}(x_0, y_0) = r \ne 0$, then $(\partial^2 f/\partial \mathbf{u}^2)(x_0, y_0) = 2ruv$. Since $2ruv$ can be both positive and negative, a critical point at $(x_0, y_0)$ would be a saddle point.

EXAMPLE 12 Let $f(x, y) = x^2 y + y^3 - y$. Then $f'(x, y) = (2xy, x^2 + 3y^2 - 1) = (0, 0)$ has solutions $(0, \pm 1/\sqrt{3})$ and $(\pm 1, 0)$, so these four points are the critical points. We have $f_{xx}(x, y) = 2y$, $f_{xy}(x, y) = 2x$, $f_{yy}(x, y) = 6y$, so $f_{xx}(0, \pm 1/\sqrt{3}) = \pm 2/\sqrt{3}$, $f_{xy}(0, \pm 1/\sqrt{3}) = 0$, $f_{yy}(0, \pm 1/\sqrt{3}) = \pm 6/\sqrt{3}$. Then the second directional derivative reduces to

$$\frac{\partial^2 f}{\partial \mathbf{u}^2} = \pm\frac{2}{\sqrt{3}}u^2 \pm \frac{6}{\sqrt{3}}v^2.$$

It follows that $(0, 1/\sqrt{3})$ is a strict minimum point and $(0, -1/\sqrt{3})$ is a strict maximum. At $(1, 0)$ we find $f_{xx}(1, 0) = 0$, $f_{xy}(1, 0) = 2$, $f_{yy}(1, 0) = 0$, so by Theorem 4.5 the second derivative is $4uv$, which exhibits different signs depending on whether $u$ and $v$ have the same or opposite sign. Hence $(1, 0)$ is a saddle point. Similarly, $(-1, 0)$ is a saddle point.

If the quadratic polynomial of Theorem 4.5 doesn't yield to the quick sign analysis
that applied in the two previous examples, we have a simple test based on the
discriminant of a quadratic equation.

4.6 Theorem. Let $D = f_{xx}(x_0, y_0) f_{yy}(x_0, y_0) - f_{xy}^2(x_0, y_0)$, where $\mathbb{R}^2 \xrightarrow{f} \mathbb{R}$ is twice continuously differentiable. Assume $(x_0, y_0)$ is a critical point of $f(x, y)$.

(i) If $D > 0$ and $f_{xx}(x_0, y_0) > 0$ or $f_{yy}(x_0, y_0) > 0$, then $f(x_0, y_0)$ is a strict local minimum.
(ii) If $D > 0$ and $f_{xx}(x_0, y_0) < 0$ or $f_{yy}(x_0, y_0) < 0$, then $f(x_0, y_0)$ is a strict local maximum.
(iii) If $D < 0$, then $f(x, y)$ has a saddle point at $(x_0, y_0)$.

Proof. Since $\mathbf{u} = (u, v)$ is a unit vector, $u$ and $v$ can't both be zero. Thus for any choice of $u$ and $v$, the quadratic polynomial $\partial^2 f/\partial \mathbf{u}^2$ of Theorem 4.5 can be written in at least one of the two forms

$$u^2\left[f_{xx} + 2f_{xy}(v/u) + f_{yy}(v/u)^2\right] \quad\text{or}\quad v^2\left[f_{xx}(u/v)^2 + 2f_{xy}(u/v) + f_{yy}\right].$$

Deciding whether either of these two functions changes sign or not comes down to deciding whether either of the quadratic polynomials in $v/u$ or $u/v$ has a real root or not; if not, there is no sign change. But "no real root" is equivalent to

$$4f_{xy}^2(x_0, y_0) - 4f_{xx}(x_0, y_0)f_{yy}(x_0, y_0) < 0,$$

in other words to $D > 0$. To decide in that event whether the quadratic polynomial is negative or positive, all we have to do is check one of the coefficients $f_{xx}(x_0, y_0)$ or $f_{yy}(x_0, y_0)$. This covers (i) and (ii). Case (iii) is now quite simple. If $D < 0$, there are two distinct real root values for $v/u$ or $u/v$, so there are two definite sign changes as $(u, v)$ varies. Hence $(x_0, y_0)$ is a saddle point. ∎

If $D = 0$ no conclusions follow from Theorem 4.6. Keeping the next example in mind along with Figure 6.17 makes it easier to recall the alternatives of Theorem 4.6.

EXAMPLE 13 The general quadratic polynomial $f(x, y) = ax^2 + 2bxy + cy^2$ has critical points satisfying

$$\begin{cases} 2ax + 2by = 0 \\ 2bx + 2cy = 0 \end{cases} \quad\text{or}\quad \begin{pmatrix} 2a & 2b \\ 2b & 2c \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$

Hence there is a single critical point at $(x_0, y_0) = (0, 0)$ unless the determinant $D = 4ac - 4b^2 = 0$. Note that $f_{xx} = 2a$, $f_{yy} = 2c$ and $f_{xy} = f_{yx} = 2b$. If $a = c = 1$ and $b = 0$, then $D = 2^2 > 0$, and the graph of $f(x, y)$ is an elliptic paraboloid with minimum at $(0, 0)$ [Case (i)]. If $a = c = -1$ and again $b = 0$, then $D = (-2)^2 > 0$, and we get an elliptic paraboloid with maximum at $(0, 0)$ [Case (ii)]. If $a$ and $c$ have opposite signs with $b = 0$, or if $a = c = 0$ with $b \ne 0$, then $D < 0$ and the graph of $f(x, y)$ is a hyperbolic paraboloid with a saddle point at $(0, 0)$ [Case (iii)]. See Figure 6.17.

[FIGURE 6.17: (i) $D = f_{xx}f_{yy} > 0$, $f_{xx} > 0$, concave up in $x$ and $y$; (ii) $D = f_{xx}f_{yy} > 0$, $f_{xx} < 0$, concave down in $x$ and $y$; (iii) $D = f_{xx}f_{yy} < 0$, concave up in $x$, down in $y$]

EXAMPLE 14 Let $f(x, y) = 3x^2 - 6xy + 5y^2 + y^3$. Since $f_x(x, y) = 6x - 6y$ and $f_y(x, y) = -6x + 10y + 3y^2$, the equation $f_x = 0$ shows that a critical point must satisfy $x = y$. From the equation $f_y = 0$ we then get $3y^2 + 4y = y(3y + 4) = 0$. Hence the critical points are at $(0, 0)$ and $(-\tfrac{4}{3}, -\tfrac{4}{3})$.
To classify these points we compute

$$f_{xx}(x, y) = 6, \qquad f_{xy}(x, y) = -6, \qquad f_{yy}(x, y) = 6y + 10.$$

At $(0, 0)$ we find $D = 60 - 36 = 24$, so $(0, 0)$ is a local extreme point. Because $f_{xx}(0, 0) = 6 > 0$, $f(0, 0) = 0$ is a strict minimum. At $(-\tfrac{4}{3}, -\tfrac{4}{3})$ we find $D = (6)(2) - 36 = -24$, so this point is a saddle.
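The test of Theorem 4.6 is mechanical enough to express as a short function. The sketch below is only one possible rendering: the name `classify` and the returned strings are our own, and the second partials of $f(x, y) = 3x^2 - 6xy + 5y^2 + y^3$ are entered by hand rather than computed symbolically. When $D > 0$ the partials $f_{xx}$ and $f_{yy}$ have the same nonzero sign, so checking $f_{xx}$ alone suffices.

```python
def classify(fxx, fxy, fyy):
    """Apply the discriminant test of Theorem 4.6 at a critical point."""
    D = fxx * fyy - fxy**2
    if D > 0:
        # fxx and fyy share a sign here, so fxx decides the case.
        return "strict local minimum" if fxx > 0 else "strict local maximum"
    if D < 0:
        return "saddle point"
    return "no conclusion"

def second_partials(x, y):
    # f_xx, f_xy, f_yy for f(x, y) = 3x^2 - 6xy + 5y^2 + y^3.
    return 6.0, -6.0, 6.0 * y + 10.0

critical_points = [(0.0, 0.0), (-4 / 3, -4 / 3)]
results = {p: classify(*second_partials(*p)) for p in critical_points}
for p, kind in results.items():
    print(p, "->", kind)
```

The same function applied to the quadratic $ax^2 + 2bxy + cy^2$, whose second partials are $2a$, $2b$, $2c$, gives for instance `classify(2, 0, 2)` for $a = c = 1$, $b = 0$ and `classify(0, 2, 0)` for $a = c = 0$, $b = 1$.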

EXERCISES

1. (a) Show that the functions f(x) = x³ and g(x) = x⁴ behave differently at
their critical points, but that the second-derivative criterion of Theorem 4.4
fails to distinguish between their behaviors.
(b) Find the critical points of the functions f(x, y) = (x + y)³ and
g(x, y) = (x + y)⁴, and describe the behavior of f and g near these points.
Show also that Theorem 4.4 fails to distinguish between the two behaviors.

2. (a) For twice-differentiable functions f: ℝ → ℝ there are in principle two
values for the second-order derivative ∂²f/∂u²(x), because we can let u = ±1.
Show that both second partials are equal to f″(x).
(b) Show more generally that for twice-differentiable functions f: ℝⁿ → ℝ

\[
\frac{\partial^2 f}{\partial(-\mathbf{u})^2}(\mathbf{x}) = \frac{\partial^2 f}{\partial\mathbf{u}^2}(\mathbf{x}).
\]

Note that it's not correct to show this by simply replacing (−u)² by u² in
the previous equation.

Find the critical points of each of the functions 3 to 13, and try to apply the
second-derivative test to determine whether each critical point is a maximum,
a minimum or a saddle point. Note, however, that if the conditions of
Theorem 4.4 don't apply you may have to make a decision based on special
features of the problem.

3. f(x, y) = x² − 2x + y² + 4y
4. f(x, y) = x² + 4xy − y² − 8x − 6y
5. f(x, y) = x² − xy − y² + 5y
6. f(x, y) = x² − 2y² − x
7. f(x, y) = x⁴ + y⁴
8. f(x, y) = (x − y)⁴
9. f(x, y) = x² + 2xy
10. f(x, y) = x³ − y³ − 2xy
11. f(x, y) = x⁻¹ + xy − 8y⁻¹
12. f(x, y) = e^(−x² − y²)
13. f(x, y) = e^(x² − y²)

14. (a) Integrate by parts to show that

\[
\int_{x_0}^{x} (x - t)\,g''(t)\,dt = g(x) - g(x_0) - g'(x_0)(x - x_0),
\]

as in the proof of Theorem 4.4.
(b) Use the equation of part (a) to establish directly the second-derivative
criterion for functions g: ℝ → ℝ: At a critical point x₀, (i) g″(x₀) > 0 implies
g(x₀) is a minimum, (ii) g″(x₀) < 0 implies g(x₀) is a maximum, (iii) g″(x)
changing sign at x₀ implies g(x₀) is neither a minimum nor a maximum.
(c) Give three examples to show that if g″(x₀) = 0, then g(x₀) can be a
maximum, a minimum, or neither.

15. For second-degree polynomials, the second-order derivative is closely
related to the quadratic part of the polynomial.
(a) Show that if p(x, y) = ax² + bxy + cy², then for u = (u, v),

\[
\frac{1}{2}\frac{\partial^2 p}{\partial\mathbf{u}^2}(x, y) = p(u, v), \quad\text{independent of } (x, y).
\]

(b) Show that if q(x, y, z) = ax² + by² + cz² + lyz + mzx + nxy, then for
u = (u, v, w),

\[
\frac{1}{2}\frac{\partial^2 q}{\partial\mathbf{u}^2}(x, y, z) = q(u, v, w), \quad\text{independent of } (x, y, z).
\]

16. For twice continuously differentiable functions f: ℝⁿ → ℝ the n-by-n matrix

\[
H_f(\mathbf{x}) = \bigl(f_{x_j x_k}(\mathbf{x})\bigr)_{j,k = 1,\ldots,n}
\]

is called the Hessian matrix of f. Note that Hf(x) is symmetric relative to
its main diagonal since f_{x_j x_k} = f_{x_k x_j}.
(a) Show that the discriminant D of Theorem 4.6 is the determinant of a
2-by-2 Hessian matrix.
(b) If u = (u₁, u₂, …, uₙ) is a unit vector, show that the second-order
directional derivative of f can be written using the matrix-vector product
Hf u as either […]
(c) Write out the analogue of the formula in Theorem 4.5 for a generic twice
continuously differentiable function f(x) = f(x, y, z) with u = (u, v, w).

*17. The roots λ₁, …, λₙ of the polynomial equation det(Hf(x₀) − λI) = 0 are
called the eigenvalues of Hf(x₀), and we can use them to characterize a
critical point x₀ of f. It's possible to prove that if all λₖ > 0, then f(x₀) is
a strict local minimum, and if all λₖ < 0, then f(x₀) is a strict local
maximum. If roots of both signs occur, then x₀ is a saddle point.
(a) Show that criteria (i), (ii), and (iii) of Theorem 4.6 are implied by this
statement in the two-dimensional case.
(b) Apply the statement to
f(x, y, z) = x² + y² + z² − 2xy − 4yz − 6xz.

*18. The principal subminors

\[
H_f^{(m)} = \bigl(f_{x_j x_k}(\mathbf{x})\bigr)_{j,k = 1,\ldots,m}
\]

of the Hessian matrix play a role in the generalization of Theorem 4.6 to
higher dimensions. It can be shown that if x₀ is a critical point of f and the
determinants det Hf⁽ᵐ⁾(x₀), m = 1, …, n are positive then f has a strict
local minimum at x₀. Furthermore, if these determinants alternate in sign so
that (−1)ᵐ det Hf⁽ᵐ⁾(x₀) are positive then f has a strict local maximum at
x₀. Show that this formulation has the criteria (i) and (ii) of Theorem 4.6 as
special cases.

4F Steepest Ascent Method


This is an alternative to the standard calculus method for finding the maximum
or minimum value of a continuously differentiable function f: ℝⁿ → ℝ when the
extreme point is in an open subset of the domain of f. Our classic strategy so
far is to restrict attention to the critical points of f, that is, the solutions of the n
simultaneous equations embodied in the single vector equation f′(x) = 0, but finding
those solutions can be problematic, even with Newton's method.
Steepest Ascent Method. Though the method described here applies in any
number of dimensions, it's easiest to visualize in the special case f: ℝ² → ℝ. Imagine
that the graph of f represents the topography of some mountainous terrain, and that
an ambitious climber is determined always to head up the steepest way from a
given point. Figures 6.4(b) and (c) in Section 1 illustrate the setting. This strategy
determines the climber's path, and it's natural to call that path a path of steepest
ascent. To make mathematics out of these remarks, all we have to do is to recall
that at a point x = (x, y) in the plane, the direction of maximum increase of f is
the same as the direction of the gradient vector ∇f(x), provided this vector is not
zero. (This effectively tells the climber what horizontal compass heading gives the
steepest way up at each point of the path.) It seems reasonable that a path of steepest
ascent will in general lead to the "top," at which point we would have reached a
local maximum of f where ∇f = 0. (It might only be the summit of one of the
foothills.) To reach a local minimum we would always head in the direction −∇f(x)
opposite to that of the gradient. We'd then call what we're doing the steepest descent
method.
The numerical implementation of steepest ascent amounts to taking a succession
of small steps along the direction of the gradient, at each point x along the way
observing the value f(x). At each step we make a decision about whether to continue
or not and record the value of f at the end of the last step as our estimate for a local
maximum value for f.
Each step in the process has the same general form as the very first step. Most
of the computation takes place in the domain of f, which in the case of ℝ² we can
think of as a topographic map of the graph of f. Having decided on x₀, a starting
point, we move in the direction of the gradient vector at x₀ by a certain distance to
a new point x₁:

\[
\mathbf{x}_1 = \mathbf{x}_0 + h_0 \nabla f(\mathbf{x}_0), \quad\text{where } h_0 > 0.
\]

In general we go from xₙ to xₙ₊₁ as follows:

\[
\mathbf{x}_{n+1} = \mathbf{x}_n + h_n \nabla f(\mathbf{x}_n), \quad\text{where } h_n > 0,\ n = 1, 2, \ldots
\]

By following the gradient in small steps we move at each step from a point xₙ in a
direction determined by the vector u = ∇f(xₙ) having two properties:

(i) u points in the direction of maximum increase of f.

(ii) u is perpendicular to the level set of f containing xₙ.

To search for a local minimum value, we think of pursuing the path of steepest
descent by making hₙ < 0. This choice will tend to move us downhill in the
direction opposite to the gradient direction.
The remaining question is how the numerical factors hₙ are to be chosen. The
simplest choice is to make all hₙ the same value h; this choice has the consequence that as
we approach the location of a maximum value for f, where the gradient is zero,
continuity of ∇f will cause the vectors h∇f(xₙ) to get shorter and shorter. This
is desirable, for otherwise we face the danger of taking a big step right past the
extreme point. The following routine summarizes the method. From a different point
of view what we're doing here is finding approximate solutions to a system of two
differential equations, namely

\[
\frac{dx}{dt} = f_x(x, y), \qquad \frac{dy}{dt} = f_y(x, y).
\]

At the end of the section we describe a method for improving the accuracy of the
process by varying h.

DEFINE f(x, y) = 1 - x^2 - 2y^2
DEFINE fx(x, y) = -2x          (x-coordinate of ∇f(x, y).)
DEFINE fy(x, y) = -4y          (y-coordinate of ∇f(x, y).)
SET h = 0.4
INPUT (x, y)                   (Starting point.)
DO
    SET (x1, y1) = (x, y)      (Keep for later use.)
                               (Next compute new position.)
    SET (x, y) = (x + h fx(x, y), y + h fy(x, y))
    PRINT x, y, f(x, y)
LOOP UNTIL |f(x, y) - f(x1, y1)| < ε
                               (Stop if change in f is < ε.)
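
For readers who want to run the routine, here is a minimal Python sketch of the same fixed-step loop. It is an illustration under stated assumptions rather than the text's own program: the names `grad_f` and `steepest_ascent` and the safety cap `max_steps` are additions, while f, h = 0.4 and the stopping rule are taken from the routine above. Starting from (0.5, 0.5), the first row it prints is 0.1, −0.3, 0.81.

```python
def f(x, y):
    """The sample function f(x, y) = 1 - x^2 - 2y^2."""
    return 1 - x**2 - 2 * y**2


def grad_f(x, y):
    """Gradient (fx, fy) of the sample function."""
    return -2 * x, -4 * y


def steepest_ascent(x, y, h=0.4, eps=0.001, max_steps=1000):
    """Fixed-step steepest ascent.

    Returns the list of rows (x, y, f(x, y)) that the routine would print.
    max_steps is a safety cap added here; it is not part of the routine.
    """
    rows = []
    for _ in range(max_steps):
        x_old, y_old = x, y              # keep the previous point
        fx, fy = grad_f(x, y)
        x, y = x + h * fx, y + h * fy    # step along the gradient
        rows.append((x, y, f(x, y)))
        if abs(f(x, y) - f(x_old, y_old)) < eps:
            break                        # change in f is below eps
    return rows


for x, y, value in steepest_ascent(0.5, 0.5):
    print(f"{x:.7f} {y:.7f} {value:.6f}")
```

To search for a minimum of some other function, one would pass a negative `h`, in line with the remark above about steepest descent.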

EXAMPLE 15 The function used to illustrate the preceding routine is
f(x, y) = 1 − x² − 2y², with a single critical point at (x, y) = (0, 0) and maximum
value 1 at that point.
Figure 6.18 shows some curves on the graph of f along with their projections into
the xy-plane. Starting with x₀ = y₀ = 0.5 and ε = 0.001 the output of numbers
xₙ, yₙ and f(xₙ, yₙ) from the next to last line of the program would be
as follows:

x            y             f(x, y)

0.1          -0.3          0.81
0.02         0.18          0.9348
0.004        -0.108        0.976656
0.0008       0.0648        0.991601
0.00016      -0.03888      0.996977
0.000032     0.023328      0.998912
0.0000064    -0.0139968    0.999608

Note that the step size h = 0.4 is fairly large; this choice results in repeatedly
overshooting the origin, as evidenced by the alternating sign in the y-coordinates. A
smaller choice, say h = 0.1, avoids this effect but requires more steps to achieve the
same accuracy. With a very small, and very inefficient, step size the points (x, y)
would approximate a flow line of the gradient field that cuts perpendicularly across
level curves of f as shown in the picture that accompanies the routine.

FIGURE 6.18

EXAMPLE 16 The function chosen for this example, f(x, y) = sin²x + sin²y, has its critical points
at the solutions of the equations
fx(x, y) = 2 sin x cos x = sin 2x = 0
fy(x, y) = 2 sin y cos y = sin 2y = 0.

The solutions are all of the form (x, y) = (jπ/2, kπ/2), where j and k are integers.
If j = 2l, k = 2m are both even we get f(lπ, mπ) = 0 for a local minimum,
and if j = 2l + 1, k = 2m + 1 are both odd we get f((2l + 1)π/2, (2m + 1)π/2) = 2 for
a local maximum. Having this information before doing the computing allows us
to experiment intelligently with different starting points and different sizes for h to
see what the results are. You're asked to do this in the exercises. The trick is to
coordinate the choice for h with the starting point; too small or too large a value for
h can require many steps to reach an acceptable approximation, or may lead to no
convergence at all. In our example, starting at (x, y) = (1.5, 1.5), computation with
step size h = 0.4 and ε = 0.0001 ends after three steps; note that π/2 ≈ 1.57079:

x          y          f(x, y)

1.55645    1.55645    1.99959
1.56793    1.56793    1.99998
1.57022    1.57022    2.0000
In the preceding example we were able to start reasonably close to the extreme
point at (x, y) = (π/2, π/2) because we already knew the exact location of the point.
If we always had that kind of information there would be no need for numerical
methods at all. In practice, we need to make educated guesses about the location
of extreme points. If the domain of the real-valued function f under consideration
is two-dimensional, making a computer-aided graph of f may be helpful. Failing
that, Newton's method for approximate root location may be helpful, as described
in Chapter 5, Section 5.
The preceding program outline is crude in that it forces you to stick rigidly with
a single step size h throughout the computation. This rigidity is clearly a defect
when the approximations xₙ are getting close to a critical point, causing repeated
overshooting from side to side. What we need is a method for automatically choosing
the step size and adjusting it as the process proceeds. One way to do this is to think
of replacing f(x + u), where u = h∇f(x), by its second-degree Taylor expansion

\[
f(\mathbf{x}) + \frac{\partial f}{\partial\mathbf{u}}(\mathbf{x}) + \frac{1}{2}\frac{\partial^2 f}{\partial\mathbf{u}^2}(\mathbf{x}).
\]
In dimension 2, with u = (u, v), the middle term is

\[
\nabla f(x, y)\cdot(u, v) = f_x(x, y)\,u + f_y(x, y)\,v.
\]

From Theorem 4.5 in Section 4E we have for the second-order derivative in the third term

\[
f_{xx}(x, y)\,u^2 + 2f_{xy}(x, y)\,uv + f_{yy}(x, y)\,v^2.
\]

Since in our application u = h fx(x, y) and v = h fy(x, y), the expressions get a bit
cluttered unless we suppress the evaluations at (x, y), so with that in mind we have
the Taylor approximation

\[
f + h\,(f_x^2 + f_y^2) + \frac{h^2}{2}\,\bigl(f_{xx}f_x^2 + 2f_{xy}f_xf_y + f_{yy}f_y^2\bigr).
\]

We now maximize this function, which is possible since it's quadratic with respect
to the variable h, and has a maximum if the coefficient of h² is negative. Setting the
derivative with respect to h equal to zero, we find the critical value of h to be

\[
h = -\frac{f_x^2(x, y) + f_y^2(x, y)}
{f_{xx}(x, y)f_x^2(x, y) + 2f_{xy}(x, y)f_x(x, y)f_y(x, y) + f_{yy}(x, y)f_y^2(x, y)}.
\]
Replacing the line that sets h = 0.4 in the earlier program by this more complicated
expression requires earlier insertion into the program of definitions for the second
derivative functions fxx(x, y), fxy(x, y), and fyy(x, y).
Note that the second derivative that appears in the denominator of the expression
for h will typically be negative if we're looking for a maximum, so h
will be positive. Correspondingly, in looking for a minimum h will be negative.
The applet ASCENT/DESCENT implements the method for fixed h, and
ASCENT/DESCENT+ does the same for the variable h method. Both applets are
available at http://math.dartmouth.edu/~rewn/. The former program is simpler to
apply because it requires the user to calculate only first partial derivatives, so you'd
resort to the automatic step-size version only if the simpler version fails.
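
The variable step size can be tried on the earlier sample function f(x, y) = 1 − x² − 2y². The sketch below is an illustration only and is not the ASCENT/DESCENT+ applet: the function names, the `max_steps` cap and the zero-gradient guard are assumptions added here. It computes h at each step as minus the squared gradient length divided by the quadratic form fxx fx² + 2 fxy fx fy + fyy fy².

```python
def f(x, y):
    """Sample function f(x, y) = 1 - x^2 - 2y^2."""
    return 1 - x**2 - 2 * y**2


def grad_f(x, y):
    """First partials (fx, fy)."""
    return -2 * x, -4 * y


def second_partials(x, y):
    """Second partials (fxx, fxy, fyy); constant for this quadratic f."""
    return -2.0, 0.0, -4.0


def variable_step(x, y):
    """Step size h = -(fx^2 + fy^2) / (fxx fx^2 + 2 fxy fx fy + fyy fy^2)."""
    fx, fy = grad_f(x, y)
    fxx, fxy, fyy = second_partials(x, y)
    denominator = fxx * fx**2 + 2 * fxy * fx * fy + fyy * fy**2
    return -(fx**2 + fy**2) / denominator


def ascent_variable_h(x, y, eps=0.001, max_steps=100):
    """Steepest ascent with h recomputed at every step.

    Returns rows (x, y, f(x, y), h). max_steps and the zero-gradient
    guard are safety additions, not part of the text's routine.
    """
    rows = []
    for _ in range(max_steps):
        fx, fy = grad_f(x, y)
        if fx == 0 and fy == 0:
            break                        # already at a critical point
        h = variable_step(x, y)
        x_new, y_new = x + h * fx, y + h * fy
        rows.append((x_new, y_new, f(x_new, y_new), h))
        small_change = abs(f(x_new, y_new) - f(x, y)) < eps
        x, y = x_new, y_new
        if small_change:
            break
    return rows


for x, y, value, h in ascent_variable_h(0.5, 0.5):
    print(f"{x:.6f} {y:.6f} {value:.6f} h={h:.6f}")
```

For this f the denominator is negative whenever the gradient is nonzero, so every computed h is positive, as the preceding paragraph predicts for a maximum.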

EXERCISES

1. The function sin²x + sin²y has infinitely many local maxima, with maximum
value 2, and local minima, with minimum value 0. Find numerical
approximations for the coordinates of those extreme points that lie in the
rectangle 0 ≤ x ≤ 5, 0 ≤ y ≤ 5.

2. Verify the details in the derivation of the variable step factor h.

In Exercises 3 to 6, find approximate values for the coordinates of the local
extreme (both max. and min.) points and the local extreme values of the
following functions.

3. f(x, y) = (x + 2y) e^(−x² − 2y²), for (x, y) in ℝ²
4. f(x, y) = (x² + 2y) e^(−3x² − 2y²), for x² + y² ≤ 9
5. f(x, y) = ((x − 1)² + y²)⁻¹ + ((x + 1)² + y²)⁻¹, for (x, y) in ℝ²
6. f(x, y) = exp(−x² − 2y²) sin(x² + y²), x² + y² ≤ 1

The method of steepest ascent (or descent) applied to a real-valued function
f(x) generates a sequence of numbers xₙ₊₁ = xₙ + h f′(xₙ). Apply this approach
to Exercises 7 and 8 using a hand calculator.

7. f(x) = exp(−x²) sin x, 0 ≤ x ≤ 4
8. f(x) = (x³ + 2x + 1) e^(−x²), for x in ℝ

9. A rectangular box has two faces contained in the positive xz-plane and
yz-planes, with their common edge on the z-axis. The two corners farthest
from the origin lie on the graphs of e^(−0.1x² − 0.2y²) and −e^(−0.4x² − 0.1y²),
respectively.
(a) Use steepest ascent to estimate the dimensions that will maximize the
volume of the box.
(b) Solve a similar problem by hand if the two functions are replaced by
e^(−x² − y²) and −e^(−x² − y²).

SECTION 5 CURVILINEAR COORDINATES


We can often simplify formulas that occur in mathematics and its applications by
finding good descriptions of the quantities to be singled out for special attention.
Since in practice these quantities are usually represented by vectors whose entries
are real-number coordinates, our problem here is one of choosing the most useful
system of coordinates. Thus we consider introducing coordinates in ℝⁿ different
from the natural coordinates xₖ that appear in the designation of a typical point
(x₁, …, xₙ). Specifically, to each point (x₁, …, xₙ) there will be assigned a new
n-tuple (u₁, …, uₙ). If we are to be able to switch back and forth from one set
of coordinates to the other, the assignment described must be one to one, that is,
for each (x₁, …, xₙ) there will be just one n-tuple (u₁, …, uₙ) and vice versa.
In practice we sometimes make the new coordinate assignment for some specific
subregion of ℝⁿ rather than for the whole space. In what follows we'll denote the
space of new coordinate vectors (u₁, …, uₙ) by 𝒰ⁿ to help us change variables
coherently from the standard coordinates (x₁, …, xₙ) in ℝⁿ.
5A Polar Coordinates
Consider two copies of 2-dimensional space, the xy-plane ℝ². One of these we'll
rename the rθ-plane, denoting it by 𝒰². The function P: 𝒰² → ℝ² defined by

\[
\begin{pmatrix} x \\ y \end{pmatrix}
= P\begin{pmatrix} r \\ \theta \end{pmatrix}
= \begin{pmatrix} r\cos\theta \\ r\sin\theta \end{pmatrix},
\qquad
\begin{cases} 0 < r < \infty \\ 0 \le \theta < 2\pi \end{cases}
\]

has a simple geometric description. The image under P of a point (r, θ) is the point
x = (x, y) whose distance from the origin is r and such that the angle from the
positive x-axis to x in the counterclockwise direction is θ. See Figure 6.19.
The image of P consists of all of ℝ² except for the origin, so for any point (x, y) ≠ (0, 0)
in ℝ² there are numbers r and θ, called polar coordinates of x, such that

5.1    x = r cos θ and y = r sin θ.

FIGURE 6.19 (the rθ-plane with r-axis and θ-axis; the xy-plane with x-axis and y-axis)
For two points (r₁, θ₁) and (r₂, θ₂) in the domain of P, the equations

r₁ cos θ₁ = r₂ cos θ₂,
r₁ sin θ₁ = r₂ sin θ₂

hold whenever r₁ = r₂ and θ₁ = θ₂ + 2πm for some integer m. Hence the polar
coordinates of a point (x, y) in ℝ² are not uniquely specified without some restrictions
on r and θ. However if (x, y) ≠ (0, 0) the polar coordinates of (x, y) are uniquely
specified up to an integer multiple of 2π in the θ-coordinate. To see this, square both
sides of the two displayed equations and add to get r₁² = r₂². Since r₁ > 0 and r₂ > 0 we
conclude that r₁ = r₂. But then cos θ₁ = cos θ₂ and sin θ₁ = sin θ₂, so (cos θ₁, sin θ₁)
and (cos θ₂, sin θ₂) represent the same point on a circle of radius 1 centered at the
origin in ℝ². Hence θ₁ = θ₂ + 2πm for some integer m.
The preceding paragraph says that P is not one-to-one, but that it becomes so
if its domain is restricted to be a subset of a rectangular half-strip in the rθ-plane
defined by inequalities
0 < r < ∞, θ₀ ≤ θ < θ₀ + 2π.

So restricted, P does have an inverse function, and we can find some partial formulas
for the inverse by solving the equations x = r cos θ, y = r sin θ for r and θ. We
obtain, for x ≠ 0,

\[
r = \sqrt{x^2 + y^2}, \qquad \theta = \arctan\frac{y}{x} + k\pi.
\]

We have used the common convention of restricting an inverse trigonometric function
to the principal branch of the corresponding multiple-valued function. Hence the
image of arctan is the interval −π/2 < θ < π/2. It follows that the function
defined by

\[
\begin{pmatrix} r \\ \theta \end{pmatrix}
= \begin{pmatrix} \sqrt{x^2 + y^2} \\ \arctan\dfrac{y}{x} \end{pmatrix},
\qquad x > 0,
\]

is the inverse of the restriction of P by 0 < r < ∞ and −π/2 < θ < π/2. Similarly

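the function defined by

\[
\begin{pmatrix} r \\ \theta \end{pmatrix}
= \begin{pmatrix} \sqrt{x^2 + y^2} \\ \operatorname{arccot}\dfrac{x}{y} \end{pmatrix},
\qquad y > 0,
\]

is the inverse of the restriction of P by 0 < r < ∞ and 0 < θ < π.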
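
In numerical work the separate branch formulas are usually replaced by a single two-argument arctangent. The sketch below takes that route, which is a substitution for the arctan branches discussed above and not the text's own construction. It uses Python's standard `math.atan2` and `math.hypot`; the function names `to_rect` and `to_polar` are hypothetical. It inverts P on the half-strip 0 < r < ∞, 0 ≤ θ < 2π.

```python
import math


def to_rect(r, theta):
    """The map P: (r, theta) -> (x, y) = (r cos theta, r sin theta)."""
    return r * math.cos(theta), r * math.sin(theta)


def to_polar(x, y):
    """Inverse of P on 0 < r, 0 <= theta < 2*pi (up to rounding).

    The origin is rejected because it has no unique polar coordinates.
    """
    if x == 0 and y == 0:
        raise ValueError("polar coordinates are not defined at the origin")
    r = math.hypot(x, y)            # sqrt(x^2 + y^2)
    theta = math.atan2(y, x)        # angle in (-pi, pi]
    if theta < 0:
        theta += 2 * math.pi        # shift the lower half-plane into (pi, 2*pi)
    return r, theta


for point in [(1.0, 1.0), (-1.0, 0.0), (0.0, -2.0)]:
    r, theta = to_polar(*point)
    print(point, "->", (round(r, 6), round(theta, 6)))
```

Applying `to_rect` to the output of `to_polar` returns the starting point up to rounding, which is a quick way to confirm that the restriction of P to the half-strip is one-to-one.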

EXAMPLE 1 We have not defined polar coordinates for the origin of the xy-plane simply because

\[
\begin{pmatrix} 0\cdot\cos\theta \\ 0\cdot\sin\theta \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\quad\text{for all } \theta,
\]

so the one-to-one requirement fails at the origin. This failure causes no real difficulty;
for example, the equation in rectangular coordinates of the lemniscate,

\[
(x^2 + y^2)^2 = 2(x^2 - y^2) \tag{1}
\]

becomes, upon introduction of polar coordinates,

\[
r^2 = 2\cos 2\theta, \quad r > 0. \tag{2}
\]

The image under P of the set of pairs (r, θ) that satisfy Equation 2 is precisely the
set of pairs (x, y) that satisfy Equation 1, except for the origin. We may simply fill
in this one point. See Figure 6.20.

FIGURE 6.20

5B Spherical Coordinates
Consider the function S: 𝒰³ → ℝ³, defined by

5.2
\[
S\begin{pmatrix} r \\ \phi \\ \theta \end{pmatrix}
= \begin{pmatrix} r\sin\phi\cos\theta \\ r\sin\phi\sin\theta \\ r\cos\phi \end{pmatrix},
\qquad
\begin{cases} 0 < r < \infty \\ 0 < \phi < \pi \\ 0 \le \theta < 2\pi. \end{cases}
\]

Here, for simplicity, we have restricted the domain of S from the outset so that
S is one-to-one. Its range is all of ℝ³ with the exception of the z-axis. Hence it
assigns spherical coordinates (r, φ, θ) to every point of ℝ³ except those on the
z-axis. As with polar coordinates in the plane, the spherical coordinates (r, φ, θ)
of a point x = (x, y, z) have a simple geometric interpretation. See Figure 6.21(b).
The number r is the distance from x to the origin. The coordinate φ is the angle
in radians between the vector x and the positive z-axis. Finally, θ is the angle in
radians from the positive x-axis to the projected image (x, y, 0) of x on the xy-plane.
The symbols φ and θ are sometimes interchanged, particularly in physical
applications.
We can compute an explicit expression for the inverse function, which we denote
by S⁻¹, by solving the equations

x = r sin φ cos θ,
y = r sin φ sin θ,
z = r cos φ,

for r, θ, and φ. We get, for y ≥ 0,

\[
\begin{pmatrix} r \\ \phi \\ \theta \end{pmatrix}
= S^{-1}\begin{pmatrix} x \\ y \\ z \end{pmatrix}
= \begin{pmatrix}
\sqrt{x^2 + y^2 + z^2} \\[4pt]
\arccos\dfrac{z}{\sqrt{x^2 + y^2 + z^2}} \\[8pt]
\arccos\dfrac{x}{\sqrt{x^2 + y^2}}
\end{pmatrix},
\qquad x^2 + y^2 > 0.
\]

Since the image of the principal branch of the arccosine function is the interval
0 ≤ θ ≤ π, this function is actually the inverse of the function obtained by restricting the
domain of S by the further condition 0 ≤ θ ≤ π. To get values of θ in the interval
π < θ < 2π, corresponding to y < 0, we replace the third coordinate in the
preceding formula by 2π minus its value. Note that when φ = π/2 the spherical
coordinate transformation reduces to x = r cos θ, y = r sin θ, z = 0; this amounts
to changing to polar coordinates in the plane z = 0, that is in the xy-plane in ℝ³.

FIGURE 6.21
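
A round trip through S and its inverse is an easy way to test the formulas. The following sketch is illustrative only; the function names are hypothetical and the clamping of the arccosine argument is a floating-point precaution added here. For y < 0 it takes θ = 2π − arccos(x/√(x² + y²)), which is the angle in (π, 2π) having the same cosine.

```python
import math


def spherical_to_rect(r, phi, theta):
    """The map S of Equation 5.2."""
    return (r * math.sin(phi) * math.cos(theta),
            r * math.sin(phi) * math.sin(theta),
            r * math.cos(phi))


def _acos_clamped(value):
    """arccos with the argument clamped to [-1, 1] to absorb rounding."""
    return math.acos(max(-1.0, min(1.0, value)))


def rect_to_spherical(x, y, z):
    """Inverse of S for points off the z-axis, with 0 <= theta < 2*pi."""
    rho_squared = x * x + y * y
    if rho_squared == 0:
        raise ValueError("points on the z-axis have no spherical coordinates")
    r = math.sqrt(rho_squared + z * z)
    phi = _acos_clamped(z / r)
    theta = _acos_clamped(x / math.sqrt(rho_squared))  # principal branch
    if y < 0:
        theta = 2 * math.pi - theta                    # lower half: (pi, 2*pi)
    return r, phi, theta


for u in [(1.0, math.pi / 4, math.pi / 3), (2.0, 2.0, 4.0)]:
    print(u, "->", rect_to_spherical(*spherical_to_rect(*u)))
```

The second sample point has θ = 4.0, which lies between π and 2π, so it exercises the y < 0 branch.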
EXAMPLE 2 Three surfaces in ℝ³ defined by spherical coordinate equations r = 1, φ = π/4,
and θ = π/3, respectively, are shown in Figure 6.22. The corresponding rectangular
coordinate equations derived from the preceding expressions for S⁻¹ are respectively

\[
x^2 + y^2 + z^2 = 1, \quad\text{with } x^2 + y^2 > 0
\]
\[
z = \frac{1}{\sqrt{2}}\sqrt{x^2 + y^2 + z^2}, \quad\text{with } z > 0
\]
\[
y = \sqrt{3}\,x, \quad\text{with } x > 0.
\]

FIGURE 6.22 Curves of intersection of the three surfaces: φ = π/4, θ = π/3 (r varies); θ = π/3, r = 1 (φ varies); and the curve along which θ varies.
5C Cylindrical Coordinates
The coordinate transformation is defined by

5.3
\[
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
= \begin{pmatrix} r\cos\theta \\ r\sin\theta \\ z \end{pmatrix},
\qquad
\begin{cases} 0 < r < \infty \\ -\pi < \theta \le \pi \\ -\infty < z < \infty. \end{cases}
\]

The coordinates (r, θ, z) are obtained by a straightforward extension to ℝ³ of
polar coordinates in ℝ². Figure 6.23 shows the effect of varying each of the three
coordinates.

FIGURE 6.23
5D Jacobian Matrices
The name "curvilinear" is applied to coordinates for the reason that if all but one
of the nonrectangular coordinates are held fixed and the remaining one is varied,
the coordinate transformation defines a curve in ℝⁿ. Thus in plane polar coordinates
the coordinate curves are circles and straight lines, as shown in Figure 6.24(b).
For spherical coordinates, typical coordinate curves are the circle, semi-circle, and
half-line obtained as intersections of the pairs of surfaces shown in Figure 6.22. The
curves and surfaces obtained by varying one or more curvilinear coordinate variables
play the same role that the natural coordinate lines and planes of ℝⁿ do. For example,
to say that a point in ℝ³ has rectangular coordinates (x, y, z) = (1, 2, 1) is to say that
it lies at the intersection of the coordinate planes x = 1, y = 2, and z = 1. Similarly,
saying that a point in ℝ³ has spherical coordinates (r, φ, θ) = (1, π/4, π/3) is to
say that the point lies at the intersection of the surfaces shown in Figure 6.22.

FIGURE 6.24 (a) Coordinate lines in the rθ-plane, with r = 2 (θ varies) and θ constant (r varies); (b) the corresponding coordinate curves in the xy-plane.
Generalizing from the preceding examples, we see that a system of curvilinear
coordinates in ℝⁿ is determined by a function T: 𝒰ⁿ → ℝⁿ. It's assumed that for
some open subset N in the domain of T, the restriction of T to N is one-to-one and
therefore has an inverse T⁻¹. The curvilinear coordinates of a point x lying in the
image set T(N) are

\[
\mathbf{u} = T^{-1}(\mathbf{x}).
\]

We impose fairly stringent regularity conditions on a coordinate transformation;
specifically we'll assume that at every point u of N the function T is continuously
differentiable and that T′(u) is invertible. Thus the inverse function theorem
(Theorem 2.3) will apply to T locally.
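The polar, spherical, and cylindrical coordinate changes, represented by
Equations 5.1, 5.2, and 5.3, have derivative matrices

5.4
\[
\begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix},
\]

5.5
\[
\begin{pmatrix}
\sin\phi\cos\theta & r\cos\phi\cos\theta & -r\sin\phi\sin\theta \\
\sin\phi\sin\theta & r\cos\phi\sin\theta & r\sin\phi\cos\theta \\
\cos\phi & -r\sin\phi & 0
\end{pmatrix},
\]

5.6
\[
\begin{pmatrix}
\cos\theta & -r\sin\theta & 0 \\
\sin\theta & r\cos\theta & 0 \\
0 & 0 & 1
\end{pmatrix},
\]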

respectively. These matrices, and, more generally, the derivative matrices of
differentiable coordinate transformations, are called Jacobian matrices. We've seen
in Chapter 5, Section 4 that the columns of these matrices have simple geometric
interpretations; each column of a Jacobian matrix is obtained by differentiation of
the coordinate functions with respect to a single variable, while holding the other
variables fixed. This means that the jth column of the matrix represents a tangent
vector to the curvilinear coordinate curve for which the jth coordinate is allowed to
vary. That is, let the coordinate transformation be given by T: 𝒰ⁿ → ℝⁿ. Then the jth
column of the matrix of the derivative T′(u₀) is a tangent vector, which we'll denote
by cⱼ, at x₀ = T(u₀), to the curvilinear coordinate curve formed by allowing only
the jth coordinate of u₀ to vary. Tangent vectors are shown (with their initial points
translated to the point x₀) in Figure 6.25 for some polar, spherical, and cylindrical
coordinate curves. The coordinates of the tangent vectors c₁, …, cₙ are rectangular
coordinates, not curvilinear coordinates.
FIGURE 6.25 Tangent vectors to coordinate curves: (a) polar, (b) spherical, (c) cylindrical.

Remark. We can now see that the Jacobian matrix itself of a coordinate
transformation is the matrix of a certain first-degree change of coordinates at each point.
To see this, consider curvilinear coordinates in ℝⁿ given by x = T(u), where u
is the curvilinear coordinate variable. Fix a point x₀ having curvilinear coordinates
u₀. At x₀ we can introduce a new origin and new unit vectors e₁, …, eₙ with
the same directions as the natural basis vectors for ℝⁿ. Then multiplication by the
matrix T′(u₀) transforms the vectors e₁, …, eₙ into the vectors c₁, …, cₙ that are
the tangent vectors to the curvilinear coordinate curves. Figure 6.25 illustrates the
relation between the eⱼ and the cⱼ. Notice that the vectors c₁, …, cₙ will be linearly
independent if and only if T′(u₀) is invertible. This is one reason for requiring not
only that a coordinate transformation be one-to-one in a neighborhood of a point,
but also that its derivative be invertible.
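
The statement that the jth column of the Jacobian matrix is a tangent vector to the jth coordinate curve can be checked numerically. The sketch below is an illustration under assumptions and is not taken from the text: it approximates each column of the spherical Jacobian matrix by a central difference along one coordinate, then compares the determinant with r² sin φ. The names `jacobian_fd` and `det3` and the step size 1e-6 are choices made here.

```python
import math


def spherical(u):
    """The spherical coordinate map: (r, phi, theta) -> (x, y, z)."""
    r, phi, theta = u
    return [r * math.sin(phi) * math.cos(theta),
            r * math.sin(phi) * math.sin(theta),
            r * math.cos(phi)]


def jacobian_fd(T, u, step=1e-6):
    """Central-difference estimate of T'(u), returned as a list of rows."""
    columns = []
    for j in range(len(u)):
        plus, minus = list(u), list(u)
        plus[j] += step
        minus[j] -= step
        # Column j approximates the tangent vector c_j obtained by varying
        # only the j-th curvilinear coordinate.
        columns.append([(a - b) / (2 * step)
                        for a, b in zip(T(plus), T(minus))])
    return [list(row) for row in zip(*columns)]   # transpose columns to rows


def det3(m):
    """Determinant of a 3-by-3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


u0 = (1.0, math.pi / 4, math.pi / 3)
J = jacobian_fd(spherical, u0)
print("determinant:", det3(J))
print("r^2 sin(phi):", u0[0] ** 2 * math.sin(u0[1]))
```

A determinant that stays away from zero at u₀ is the numerical counterpart of the requirement that the tangent vectors c₁, c₂, c₃ be linearly independent there.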

EXERCISES

In Exercises 1 to 4, make a sketch using xy coordinates in ℝ² of the curves
given in polar coordinates.

1. r = 1, π ≤ θ ≤ 3π/2
2. r = θ, 0 ≤ θ ≤ π/2
3. r(sin θ − cos θ) = π/2, r > 0, π/2 ≤ θ ≤ π
4. r = −π/2 cos θ, π ≤ θ ≤ 3π/2

In Exercises 5 to 8, make a sketch using xyz coordinates in ℝ³ of the curves
and surfaces given in spherical coordinates.

5. r = 2, 0 ≤ θ ≤ π/4, π/4 ≤ φ ≤ π/2
6. 1 ≤ r ≤ 2, θ = π/2, φ = π/4
7. 0 ≤ r ≤ 1, 0 ≤ θ ≤ π/2, φ = π/4
8. 0 ≤ r ≤ 1, θ = π/4, 0 ≤ φ ≤ π/4

9. Use cylindrical coordinates in ℝ³ to describe the region defined in
rectangular coordinates by 0 ≤ x, x² + y² ≤ 1.

10. Let (r, θ) be polar coordinates in ℝ². The equation
[…], 0 ≤ t ≤ π/2,
describes a curve in 𝒰². Sketch this curve, and sketch its image in ℝ² under
the polar coordinate transformation.

11. Let (r, φ, θ) be spherical coordinates in ℝ³. The equation
[…], 0 ≤ t ≤ π/2,
determines a curve in ℝ³ (as well as in the rφθ space 𝒰³). Sketch the curve in
ℝ³. [Suggestion: The curve lies on a sphere.]

12. Compute the determinants of the matrices 5.4, 5.5, and 5.6, and show that
they are
(a) ∂(x, y)/∂(r, θ) = r
(b) ∂(x, y, z)/∂(r, φ, θ) = r² sin φ
(c) ∂(x, y, z)/∂(r, θ, z) = r
For which points (x, y) in ℝ², or (x, y, z) in ℝ³, do the matrices 5.4, 5.5, and
5.6 fail to be invertible?

13. With a, b, c positive, the equations

x = ar sin φ cos θ
y = br sin φ sin θ,    0 < φ < π, 0 ≤ θ < 2π,
z = cr cos φ,

define ellipsoidal coordinates in ℝ³. For a = 1, b = c = 2, sketch a typical
example of each of the three kinds of coordinate surface.

14. Compute the coordinates of tangent vectors to the coordinate curves for
the general ellipsoidal coordinates given in Exercise 13, when a = b = 1,
c = 2, and r = ½, φ = θ = π/2.

15. Let r, φ, and θ be spherical coordinates in ℝ³. The equation
[…]
determines a curve in ℝ³. Compute the coordinates of a tangent vector to the
curve.

16. Prove that in 3-dimensional spherical coordinates, the sphere
x₁² + x₂² + x₃² = 1 has the equation r = 1.

Chapter 6 REVIEW

2 2
In Exercises I to 6, let /(x, y, z) = x2 + y2 - z2 , 8. Suppose T(x, y, x) = Ke-c<x +Y +,h is the temperature
g(u,v) = (cosu,sinu,v), h(u,v) = (u+v,uv), and at a point (x, y, z) inside a solid ball x 2 + y2 + z2 :'.S a 2 ,
K(x, y) = xy. Find where K and c are positive constants. Show that the
surfaces of equal temperature (i.e., the isotherms) are
1. f'(x,y,z) spheres, and that the temperature gradients point toward
2. h'(u, v) the center of the ball. What is the magnitude of the
temperature gradient on a sphere of radius b < a?
3. g' (rr /3, n 2 /36)
9. Define F(x, y) = (x/Jx 2 + y 2, y/Jx2 + y 2) at points
4. (goh)'(rr/6, n/6) (x, y) "# (0, 0).
(a) Sketch the vector field defined by F(x, y).
5. BK /Bu at (1, ../3), where u is the unit vector in the
direction of the gradient of K at (1, ../3) (b) Find the maximum rate of increase of Jx 2 + y2 at
(x, y) "# (0, 0).
a
6. - f(K(x, y), K(x + y, x - y), x2 + y 2 ) 10. Let x = u 2 - v 2 and y = u2 + v2 . Assume z = f(x, y)
ax
has partial derivatives fx(O, 2) = 3 and /y(O, 2) = 4.
7. If the temperature at a point (x, y, z) of a solid ball of Find 8z/8u and 8z/8v at (u, v) = (1, I).
radius 3 centered at (0, 0, 0) is given by T(x, y, z) =
yz + zx + xy, find the direction in which T is increasing 11. A function Rn -1.... JR is said to be homogeneous of
most rapidly at(), I, 2). degree m if f (tx) = tm /(x) for all t > 0 and all x in the
310 Chapter 6 Vector Differential Calculus

domain of $f$. For example $f(x, y) = (x^3 + y^3)\cos(y/x)$ is homogeneous of degree 3. Show that if $f$ is differentiable and homogeneous of degree $m$, then $f(x) = \frac{1}{m}\, x \cdot \nabla f(x)$.

12. Let $f(z)$ be differentiable and $w = f(ax + by)$, $a$, $b$ constant. Show that $b\,w_x = a\,w_y$.

13. Let $h(z)$ be a real-valued function, differentiable for all real $z$. Define $u(x, t) = h(x - at)$, where $a$ is a constant. Show that
$$a^2 \frac{\partial^2 u}{\partial x^2} - \frac{\partial^2 u}{\partial t^2} = 0 \quad \text{for all pairs } (x, t).$$

14. Suppose $w = f(x, y)$ is differentiable and $x = u + v$, $y = u - v$. Show that
$$\frac{\partial w}{\partial u}\,\frac{\partial w}{\partial v} = \left(\frac{\partial f}{\partial x}\right)^2 - \left(\frac{\partial f}{\partial y}\right)^2.$$

15. Suppose that $f(u, v) = u^2 v + uv^3$ and that $u$ and $v$ are differentiable functions of $s$ and $t$ with $u(2, 1) = -2$, $u_s(2, 1) = 3$, $v(2, 1) = 2$ and $v_s(2, 1) = -4$. Find $\frac{\partial f}{\partial s}(2, 1)$.

16. Define $F$ from the $uv$ plane to the $xy$ plane by
$$F(u, v) = (x, y) = \left(e^{u+v} + e^{u-v},\; e^{u+v} - e^{u-v}\right).$$
(a) Show that an image point $(x, y)$ necessarily satisfies $x > |y|$. Sketch the region $R$ in the $xy$ plane determined by $x > |y|$.
(b) Show that $F$ satisfies the hypotheses of the Inverse Function Theorem 2.3 in Section 2, and so has a continuously differentiable inverse defined in some neighborhood of each image point $(x, y)$ in $R$.
(c) Show that $F$ has a global inverse defined for all $(x, y)$ with $x > |y|$ by
$$u = \tfrac{1}{2}\ln\left(\tfrac{1}{2}(x + y)\right) + \tfrac{1}{2}\ln\left(\tfrac{1}{2}(x - y)\right), \qquad v = \tfrac{1}{2}\ln\left(\tfrac{1}{2}(x + y)\right) - \tfrac{1}{2}\ln\left(\tfrac{1}{2}(x - y)\right).$$

17. The pressure $P$, volume $V$ and temperature $T$ of a gas are related by $PV = 6T$.
(a) Show that it's impossible to have $P$, $V$, and $T$ all increase simultaneously with each one increasing at its own constant rate.
(b) Assume $T$ increases steadily at 2° per minute and $V$ at 3 cubic centimeters per minute. At $t = 0$, $T = 10°$ and $V = 8$ cubic centimeters. Find $dP/dt$ at $t = 0$.
(c) Using the same data as in part (b), find $dP/dt$ at $t = 30$.

18. The graphs of $x + yz = 0$ and $y + xz = 0$ intersect in a curve containing the point $(x, y, z) = (1, -1, 1)$ and parametrized by functions of the form $y = y(x)$, $z = z(x)$.
(a) Find $y'(1)$ and $z'(1)$ by implicit differentiation.
(b) Find a tangent vector to the curve at $(1, -1, 1)$.

19. A function $z = f(x, y)$ is defined implicitly by $x + y^2 + z^2 = 3z$. Find $z_x$ and $z_y$ at $(1, 1, 1)$.

20. The equation $e^x + e^y + e^z - 3xyz = 3$ defines $x = x(y, z)$, $y = y(x, z)$ and $z = z(x, y)$ near $(x, y, z) = (0, 0, 0)$.
(a) Find $\partial x/\partial y$ at this point.
(b) Find the other possible partial derivatives at $(0, 0, 0)$ without any additional computation, and explain your reasoning.

21. Consider the curve $\gamma$ of intersection of the two level surfaces
$$x^2 + 2y^2 + 3z^2 = 6, \qquad x^2 + y^2 - z^2 = 1.$$
You are asked to find a tangent vector to $\gamma$ at $(1, 1, 1)$ in each of two different ways:
(a) Solve explicitly for $x = x(z)$ and $y = y(z)$ to find a parametric representation $(x, y, z) = (x(z), y(z), z)$. Then find a tangent vector.
(b) Use implicit differentiation to find $dx/dz$ and $dy/dz$ near $z = 1$. Then find a tangent vector.

22. The equations $x^2 + y^2 + z^2 = 6$, $x + y + z = 4$ define $z = z(x)$ and $y = y(x)$ as differentiable functions near $(x, y, z) = (1, 2, 1)$.
(a) Sketch the curve defined by the intersection of the two surfaces.
(b) Find $dz/dx$ and $dy/dx$ at the point $(x, y, z) = (1, 2, 1)$.
(c) Find a vector parallel to the line tangent to the curve in part (a) at the point $(1, 2, 1)$. [Hint: A parametric representation for the curve near $(1, 2, 1)$ is given by $(x, y(x), z(x))$.]

23. Let $f(x, y)$ be differentiable, and let $u = (\cos\theta, \sin\theta)$.
(a) Show that the directional derivative $\partial f/\partial u$ has a critical point as a function of $\theta$ whenever $u$ and $\nabla f$ are parallel.
(b) How does the result of part (a) relate to Theorem 1.2 in Section 1?

24. Find a point on the sphere $x^2 + y^2 + z^2 = 28$ at which the tangent plane is parallel to the tangent plane to $xy + \ln z = 6$ at the point $(2, 3, 1)$.

25. The equations
$$\sin(x + u) - \cos(y + v) + xy - u + uv = 1$$
$$u + v + x + y - ux - vy = 5$$
Section SD Curvilinear Coordinates 311

implicitly define $x$ and $y$ as functions of $u$ and $v$. Find $x_u$ and $y_u$ at the point $(u, v, x, y) = (2, 1, -2, -1)$.

26. Let $f(x, y, z) = 3x^2 z + 3xy - 6y^2 - 3z + 3z^3$.
(a) Find $\nabla f(x, y, z)$.
(b) Let $S$ be the level set of $f$ at 0. Find a unit vector that is perpendicular to $S$ at the point $(1, 1, 1)$.
(c) Find the tangent plane to $S$ at $(1, 1, 1)$.

27. The equations $xy + zt = -4$, $x^2 + y + z + t^2 = 8$ define $x$ and $z$ as differentiable functions of $y$ and $t$ near $(x, y, z, t) = (-2, 1, -1, 2)$. Find $\partial x/\partial y$ and $\partial z/\partial y$ at this point.

28. A point moves on a differentiable curve in the $xy$-plane in $\mathbb{R}^3$ so that its position at time $t$ is $(x(t), y(t), 0)$. A corresponding point on the graph of a differentiable function $z = f(x, y)$ is then at $(x(t), y(t), f(x(t), y(t)))$. Suppose that the velocity vector of the plane curve is given by the gradient field of $f(x, y)$, so that $(dx/dt, dy/dt) = (f_x(x, y), f_y(x, y))$ for a point $(x, y)$ on the curve.
(a) Show that, if $z(t) = f(x(t), y(t))$, then $dz/dt = |\nabla f(x, y)|^2$.
(b) Show that the point on the graph of $f(x, y)$ corresponding to $(x(t), y(t))$ has speed
$$v = |\nabla f(x, y)|\sqrt{1 + |\nabla f(x, y)|^2}.$$

29. Find the points where $f(x, y) = x^2 + y$ attains its maximum and minimum values on the subset of $\mathbb{R}^2$ where $x^2 + 2y^2 \le 1$.

30. Find the maximum value of $x + 2y$ subject to $x^2 + y^2 = 1$.

31. A rectangular box is to be constructed so that the length of one of its internal diagonals is 3 units. What is the maximum possible volume?

32. (a) It's clear that the minimum value of $f(x, y, z) = xyz$ subject to the conditions $0 \le x$, $0 \le y$, $0 \le z$ and $x + y + z = 1$ is zero. Find the maximum value.
(b) What if $xyz$ is replaced by $xy^2z^3$?

33. Let $u = f(r)$ and $r = \sqrt{x^2 + y^2}$. Show that
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = \frac{d^2 f}{dr^2} + \frac{1}{r}\frac{df}{dr} \quad \text{if } r \ne 0.$$

34. Let $f(x, y) = (x^2 - y^2, 2xy) = (u, v)$ and $g(u, v) = (e^u \cos v, e^u \sin v) = (s, t)$. Using the chain rule, show that $\partial s/\partial x = \partial t/\partial y$ and $\partial s/\partial y = -\partial t/\partial x$.

35. Find all critical points in $\mathbb{R}^3$ of the function $f(x, y, z) = xy + yz + xz$.

36. Find the maximum and minimum values of the function $f(x, y) = x^2 + 4xy + y^2$ on the region in $\mathbb{R}^2$ defined by $x^2 + y^2 \le 1$.

37. Check all critical points of $2xy + y^2 + 4y + 2x$ for local maximality and minimality.

38. Use Lagrange multipliers to find the point or points on the parabola $y = x^2$ closest to the point $(0, b)$, where $(0, b)$ is a fixed point on the $y$-axis. Note that the answer will depend on $b$.

39. Let $x_0$ be a nonzero vector in $\mathbb{R}^n$. Define a real-valued function on $\mathbb{R}^n$ by $f(x) = x_0 \cdot x$. Without calculating any derivatives, show that the maximum value of $f(x)$ for $x$ restricted by $|x| = 1$ is $|x_0|$. What is the minimum value? Do these answers change if the restriction is replaced by $|x| \le 1$?

40. Find the minimum value of $f(x, y) = 3x^2 + 2y^2 - 6x - 4y$ subject to the condition $x + 2y = 4$. What can you say about the maximum? Answer both questions if the condition is replaced by $x + 2y \le 4$.

41. Let $f(x, y) = x^2 y - 2x - y$.
(a) Find all critical points of $f(x, y)$ in $\mathbb{R}^2$.
(b) Find the maximum and minimum values of $f(x, y)$ when $(x, y)$ is restricted to lie on the line segment joining the points $(0, 1)$ and $(1, 0)$.

42. Find the points on $x^2 + 2xy + 3y^2 = 14$ which are closest to and farthest from the origin.

43. (a) Find all critical points of the function $x^3 + 3xy + y^2$.
(b) Find the maximum value of the function in part (a) given that $(x, y)$ is restricted to the closed square $-1 \le x \le 1$, $-1 \le y \le 1$.

44. Find the maximum and minimum values of the function $g(x, y) = x^2 + x + y + y^2$ over the closed region $x^2 + y^2 \le 1$.

45. Suppose $(x_0, y_0)$ is a point in $\mathbb{R}^2$ with polar coordinates $(r_0, \theta_0)$. Show that the polar coordinate equation $r^2 - 2rr_0\cos(\theta - \theta_0) + r_0^2 = a^2$ represents the circle with radius $a$ and center $(x_0, y_0)$.

46. A cone with vertex at the origin in $\mathbb{R}^3$, and that intersects planes through the $z$-axis in two perpendicular lines, has a simple representation in terms of spherical $(r, \phi, \theta)$ coordinates. Find such a representation. Then find an equation in terms of rectangular $(x, y, z)$ coordinates for the same cone.
CHAPTER 7

MULTIPLE INTEGRATION

This chapter is devoted to the study of integrals of functions with domains in $\mathbb{R}^n$.


Such integrals occur in many branches of pure and applied mathematics, with inter-
pretations such as volume, mass, probability, and flux. In Section 1 we start with
iterated integrals because they have a direct interpretation in terms of volume, and
because they are then immediately available for computing the multiple integrals
introduced in Section 2.

SECTION 1 ITERATED INTEGRALS


1A Integration over a Rectangle
Recall that we can interpret the definite integral

$$\int_a^b f(x)\,dx$$

as the signed area between the x-axis and the graph of f. We want to extend this
idea to the integral of a function $\mathbb{R}^2 \xrightarrow{f} \mathbb{R}$. Suppose that f(x, y) is a function
defined on a rectangle $a \le x \le b$, $c \le y \le d$. By

$$\int_c^d f(x, y)\,dy$$

is meant simply the definite integral of the function of one variable obtained by
holding x fixed; for example,

$$\int_0^2 x^3 y^2\,dy = \left[\frac{x^3 y^3}{3}\right]_{y=0}^{y=2} = \frac{8}{3}x^3.$$

As this example shows, if the integral exists, it depends on x. Thus we may set

$$F(x) = \int_c^d f(x, y)\,dy$$

and form the iterated integral in the order first y, then x:

$$\int_a^b F(x)\,dx = \int_a^b \left[\int_c^d f(x, y)\,dy\right] dx.$$

To interpret the iterated integral, look at Figure 7.1(a). For a fixed value of x, the
integral with respect to y is the area of the shaded region, which we have called
F(x). Then we can interpret the iterated integral, which is the integral of the area
function F(x) with respect to x, as the volume of the region between the graph of
f and the rectangle $a \le x \le b$, $c \le y \le d$. If f assumes negative values there, we
interpret the integral as a signed volume.

FIGURE 7.1 Graph of f with (a) a cross section of area F(x) and (b) a cross section of area G(y).
We can also define an iterated integral in the opposite order. We set

$$G(y) = \int_a^b f(x, y)\,dx,$$

and form the iterated integral in the order first x, then y:

$$\int_c^d G(y)\,dy = \int_c^d \left[\int_a^b f(x, y)\,dx\right] dy.$$

We may also interpret this last integral as the signed volume under the graph of f,
as suggested by Figure 7.1(b). Intuitively we expect the two iterated integrals to be
equal, and this will follow from Theorem 2.2 in the next section.
A common notational convention, which we'll sometimes use, is to omit the
brackets and write the previous iterated integral as

$$\int_c^d dy \int_a^b f(x, y)\,dx.$$

This alternative notation has the advantage of emphasizing which variable goes with
which integral sign, namely, x with $\int_a^b$ and y with $\int_c^d$.
EXAMPLE 1 Consider $f(x, y) = x^2 + y$, defined on the rectangular region
$0 \le x \le 1$, $1 \le y \le 2$.

$$\int_0^1 dx \int_1^2 (x^2 + y)\,dy = \int_0^1 \left[x^2 y + \frac{y^2}{2}\right]_{y=1}^{y=2} dx = \int_0^1 \left[(2x^2 + 2) - \left(x^2 + \tfrac{1}{2}\right)\right] dx = \int_0^1 \left(x^2 + \tfrac{3}{2}\right) dx = \frac{1}{3} + \frac{3}{2} = \frac{11}{6}.$$

To interpret this example geometrically, look at the surface defined by $z = x^2 + y$
shown in Figure 7.2(a). For each x in the interval between 0 and 1, the integral

$$\int_1^2 (x^2 + y)\,dy = x^2 + \frac{3}{2}$$

is the area of the shaded cross section. It is customary to interpret the definite integral
of an area-valued function as volume. Thus we can regard the iterated integral

$$\int_0^1 dx \int_1^2 (x^2 + y)\,dy = \frac{11}{6}$$

as the volume of the 3-dimensional region lying below the surface and above the
rectangle $0 \le x \le 1$, $1 \le y \le 2$.

EXAMPLE 2 We can perform the integration in Example 1 in the opposite order.

$$\int_1^2 dy \int_0^1 (x^2 + y)\,dx = \int_1^2 \left[\frac{x^3}{3} + yx\right]_{x=0}^{x=1} dy = \int_1^2 \left(\tfrac{1}{3} + y\right) dy = \left[\frac{y}{3} + \frac{y^2}{2}\right]_1^2 = \left(\tfrac{2}{3} + 2\right) - \left(\tfrac{1}{3} + \tfrac{1}{2}\right) = \frac{11}{6}.$$

This time

$$\int_0^1 (x^2 + y)\,dx = \frac{1}{3} + y$$

is the area of a cross section parallel to the xz-plane. See Figure 7.2(b). The second
integral again gives the volume of the 3-dimensional region lying below the surface
$z = x^2 + y$ and above the rectangle $0 \le x \le 1$, $1 \le y \le 2$. So it isn't surprising that
the two iterated integrals of Examples 1 and 2 are equal.

FIGURE 7.2 The surface $z = x^2 + y$ with (a) a cross section for fixed x and (b) a cross section for fixed y.
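The agreement between Examples 1 and 2 can also be checked numerically. The following is a minimal sketch, not part of the text's development: it approximates each one-variable integral with the midpoint rule (a choice made here for illustration), and the helper name `midpoint_integral` and the strip count `n` are assumptions of the sketch.

```python
def midpoint_integral(g, lo, hi, n=200):
    """Approximate the integral of g over [lo, hi] by the midpoint rule with n strips."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def f(x, y):
    return x**2 + y


def F(x):
    # Cross-sectional area for fixed x: integrate over y from 1 to 2.
    return midpoint_integral(lambda y: f(x, y), 1.0, 2.0)


def G(y):
    # Cross-sectional area for fixed y: integrate over x from 0 to 1.
    return midpoint_integral(lambda x: f(x, y), 0.0, 1.0)


y_then_x = midpoint_integral(F, 0.0, 1.0)  # order used in Example 1
x_then_y = midpoint_integral(G, 1.0, 2.0)  # order used in Example 2

print(y_then_x, x_then_y, 11 / 6)
```

Both approximations should land close to 11/6, with the small discrepancy coming from the midpoint rule rather than from the order of integration.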

1B Nonrectangular Regions
It is important to be able to integrate over subsets of the plane that are more general
than rectangles. In such problems, the limits in the first integration will depend on
the remaining variable.

EXAMPLE 3 Consider the iterated integral

$$\int_0^1 dx \int_0^{1-x^2} (x + y)\,dy = \int_0^1 \left[xy + \frac{y^2}{2}\right]_{y=0}^{y=1-x^2} dx = \int_0^1 \left[x(1 - x^2) + \frac{(1 - x^2)^2}{2}\right] dx = \int_0^1 \left[x - x^3 + \frac{1 - 2x^2 + x^4}{2}\right] dx = \frac{31}{60}.$$

For each x between 0 and 1, the number y is between 0 and $1 - x^2$. In other
words, the point (x, y) runs along the line segment joining (x, 0) and $(x, 1 - x^2)$.
As x varies between 0 and 1, this line segment sweeps out the shaded region B
as shown in Figure 7.3(a). The integrand f(x, y) = x + y has the graph shown in
Figure 7.3(b), and the iterated integral is the volume under the graph and above the
region B, shown in Figure 7.3(b).

Suppose we are given an iterated integral over a plane region B in which the integrand
is the constant function f defined by f(x, y) = 1, for all (x, y) in B. The integral
may then be interpreted either as the volume of the slab of unit thickness and with
base B or simply as the area of B. For example,

$$\int_0^1 dx \int_0^{1-x^2} dy = \frac{2}{3}$$

is the area of the region B shown in Figure 7.3(a). The integral also represents the
volume of the solid region of height 1 based on the plane region B and shown in
Figure 7.3(c).
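Iterated integrals whose inner limits depend on the outer variable translate directly into nested one-variable integrations. The sketch below is only an illustration under stated assumptions: the midpoint rule stands in for exact integration, and the names `midpoint_integral`, `iterated_integral`, `bottom`, and `top` are introduced here, not taken from the text. It reproduces the values 31/60 and 2/3 found above for the region B.

```python
def midpoint_integral(g, lo, hi, n=400):
    """Approximate the integral of g over [lo, hi] by the midpoint rule with n strips."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def iterated_integral(f, a, b, lower, upper, n=400):
    """Integrate f(x, y) over lower(x) <= y <= upper(x), then over a <= x <= b."""
    def inner(x):
        # For fixed x the limits of the y-integration depend on x.
        return midpoint_integral(lambda y: f(x, y), lower(x), upper(x), n)
    return midpoint_integral(inner, a, b, n)


def bottom(x):
    return 0.0


def top(x):
    return 1.0 - x**2


# Volume under the graph of f(x, y) = x + y above B (Example 3).
volume = iterated_integral(lambda x, y: x + y, 0.0, 1.0, bottom, top)

# Area of B, obtained by integrating the constant function 1.
area = iterated_integral(lambda x, y: 1.0, 0.0, 1.0, bottom, top)

print(volume, 31 / 60)
print(area, 2 / 3)
```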

A region of integration is often described by inequalities, or else by specifying
its boundary curves using equations. The usual procedure for setting up an iterated
integral over such a region is as follows:
1. Sketch the region of integration B using the given information.
2. If they are not already given, find equations whose graphs make up the boundary of B.
3. Using the equations for the boundary curves in the limits of integration, write
the iterated integral in the order that seems simplest.

FIGURE 7.3 (a) The region B. (b) The graph of f(x, y) = x + y over B. (c) The solid of height 1 over B.
For example, to integrate over a region A between the graphs of y = u(x) and
y = v(x) for x between a and b, shown in Figure 7.4(a), we would choose the order
of integration to be first with respect to y, then x. On the other hand, the region B
in Figure 7.4(b) would naturally lead us to the opposite order: first x, then y.

FIGURE 7.4 (a) Region A between y = v(x) and y = u(x), $a \le x \le b$, with integral $\int_a^b dx \int_{v(x)}^{u(x)} f(x, y)\,dy$. (b) Region B between x = s(y) and x = r(y), $c \le y \le d$, with integral $\int_c^d dy \int_{s(y)}^{r(y)} f(x, y)\,dx$.

EXAMPLE 5 Let f be defined by f(x, y) = xy over the region B bounded by the vertical lines
x = -1 and x = 2 and by the graphs $y = 1 + x^2$ and $y = -x^2$, shown in
Figure 7.5(a). To find the iterated integral of f over B in the order first y, then
x, we think of holding x fixed somewhere between -1 and 2 and letting y vary
between $y = -x^2$ and $y = 1 + x^2$. Thus we get the single integral

$$\int_{-x^2}^{1+x^2} xy\,dy.$$

Then integrate with respect to x between x = -1 and x = 2 to get

$$\int_{-1}^{2} \left[\int_{-x^2}^{1+x^2} xy\,dy\right] dx.$$

To compute the value of the integral we write it as

$$\int_{-1}^{2} x\left[\int_{-x^2}^{1+x^2} y\,dy\right] dx = \int_{-1}^{2} x\left[\frac{y^2}{2}\right]_{-x^2}^{1+x^2} dx = \frac{1}{2}\int_{-1}^{2} (x + 2x^3)\,dx = \frac{9}{2}.$$

The choice of the order of integration was dictated by noticing that integrating the
other way, first with respect to x, then y, leads to two separate integrals when y is
between 1 and 5 or between -4 and 0. See Figure 7.5(b).

FIGURE 7.5 The region B, panels (a) and (b).

EXAMPLE 6 Consider the region D in $\mathbb{R}^2$ defined by the inequalities

$$0 \le x \quad \text{and} \quad x^2 + y^2 \le 1.$$

This region is shown in Figure 7.6(a) and consists of the half-disk that is the
intersection of the circular disk $x^2 + y^2 \le 1$ and the right half-plane $x \ge 0$. The boundary
of D consists of the line segment x = 0 for $-1 \le y \le 1$ on the left and the graph
of $x = \sqrt{1 - y^2}$ for $-1 \le y \le 1$ on the right. To integrate the function f(x, y) = x
over D, we integrate first with respect to x and then with respect to y to get

$$\int_{-1}^{1} dy \int_0^{\sqrt{1-y^2}} x\,dx = \int_{-1}^{1} \left[\tfrac{1}{2}x^2\right]_0^{\sqrt{1-y^2}} dy = \frac{1}{2}\int_{-1}^{1} (1 - y^2)\,dy = \frac{1}{2}\left[y - \tfrac{1}{3}y^3\right]_{-1}^{1} = \frac{1}{2}\left[\tfrac{2}{3} - \left(-\tfrac{2}{3}\right)\right] = \frac{2}{3}.$$

We can also integrate in the other order. We think of D as bounded above by the
graph of $y = \sqrt{1 - x^2}$ and below by the graph of $y = -\sqrt{1 - x^2}$. We first integrate
with respect to y between $-\sqrt{1 - x^2}$ and $\sqrt{1 - x^2}$, then with respect to x between
0 and 1, as indicated in Figure 7.6(b). The result is

$$\int_0^1 dx \int_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}} x\,dy = \int_0^1 x\,[y]_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}}\,dx = \int_0^1 2x\sqrt{1 - x^2}\,dx = \left[-\tfrac{2}{3}(1 - x^2)^{3/2}\right]_0^1 = \frac{2}{3}.$$

In this example the two orders of integration lead to computations of about equal
complexity. In practice it may happen that you get stuck using one order, so you try
the other.

FIGURE 7.6 The half-disk D, (a) swept out by horizontal segments and (b) swept out by vertical segments between $y = -\sqrt{1 - x^2}$ and $y = \sqrt{1 - x^2}$.
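The two descriptions of the half-disk D can be compared numerically. This is a rough sketch only: it uses the midpoint rule, and the names `midpoint_integral` and `half_chord` are assumptions introduced for the illustration. Because the boundary function has a square root, the approximation in the order first y, then x converges more slowly near x = 1, so only modest accuracy should be expected.

```python
import math


def midpoint_integral(g, lo, hi, n=400):
    """Approximate the integral of g over [lo, hi] by the midpoint rule with n strips."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def f(x, y):
    return x


def half_chord(t):
    # Boundary of the unit disk solved for the other variable: sqrt(1 - t^2).
    return math.sqrt(1.0 - t * t)


# First x (from 0 to sqrt(1 - y^2)), then y (from -1 to 1).
x_then_y = midpoint_integral(
    lambda y: midpoint_integral(lambda x: f(x, y), 0.0, half_chord(y)),
    -1.0, 1.0)

# First y (from -sqrt(1 - x^2) to sqrt(1 - x^2)), then x (from 0 to 1).
y_then_x = midpoint_integral(
    lambda x: midpoint_integral(lambda y: f(x, y), -half_chord(x), half_chord(x)),
    0.0, 1.0)

print(x_then_y, y_then_x, 2 / 3)
```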
EXAMPLE 7 Let f be defined by $f(x, y) = x^2 y + xy^2$ over the region bounded by y = |x|, y = 0,
x = -1 and x = 1. See Figure 7.7. The two iterated integrals over the region are

$$\int_{-1}^{1} dx \int_0^{|x|} (x^2 y + xy^2)\,dy$$

and

$$\int_0^1 dy \int_{-1}^{-y} (x^2 y + xy^2)\,dx + \int_0^1 dy \int_{y}^{1} (x^2 y + xy^2)\,dx.$$

The second integral breaks into two pieces because, for fixed y between 0 and 1, the
integration with respect to x is carried out over two separate intervals. Computation
of the integral is straightforward. We get

$$\int_0^1 \left(\left[\frac{x^3 y}{3} + \frac{x^2 y^2}{2}\right]_{-1}^{-y} + \left[\frac{x^3 y}{3} + \frac{x^2 y^2}{2}\right]_{y}^{1}\right) dy = \int_0^1 \frac{2}{3}(y - y^4)\,dy = \frac{1}{3} - \frac{2}{15} = \frac{1}{5}.$$

The iterated integral in the other order is

$$\int_{-1}^{1} \left[\frac{x^2 y^2}{2} + \frac{x y^3}{3}\right]_0^{|x|} dx = \int_{-1}^{1} \left(\frac{x^4}{2} + \frac{x|x|^3}{3}\right) dx = \int_{-1}^{1} \frac{x^4}{2}\,dx + \int_{-1}^{1} \frac{x|x|^3}{3}\,dx.$$

The functions $x^4/2$ and $x|x|^3/3$ are even and odd, respectively. It follows that the
sum of the two integrals is

$$\frac{1}{2}\int_{-1}^{0} x^4\,dx - \frac{1}{3}\int_{-1}^{0} x^4\,dx + \frac{1}{2}\int_0^1 x^4\,dx + \frac{1}{3}\int_0^1 x^4\,dx = \int_0^1 x^4\,dx = \frac{1}{5}.$$

FIGURE 7.7 The region bounded by y = |x|, y = 0, x = -1, and x = 1.
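The way the integral in Example 7 splits into two pieces in one order, but not in the other, shows up clearly when the computation is written out as code. The following is a hedged numerical sketch using the midpoint rule; `midpoint_integral` and `x_cross_section` are names chosen here for illustration.

```python
def midpoint_integral(g, lo, hi, n=400):
    """Approximate the integral of g over [lo, hi] by the midpoint rule with n strips."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def f(x, y):
    return x**2 * y + x * y**2


# First y, then x: one inner integral from y = 0 up to y = |x|.
y_then_x = midpoint_integral(
    lambda x: midpoint_integral(lambda y: f(x, y), 0.0, abs(x)),
    -1.0, 1.0)


def x_cross_section(y):
    # For fixed y the region consists of two x-intervals: [-1, -y] and [y, 1].
    left = midpoint_integral(lambda x: f(x, y), -1.0, -y)
    right = midpoint_integral(lambda x: f(x, y), y, 1.0)
    return left + right


# First x, then y: the inner integration is the sum of two pieces.
x_then_y = midpoint_integral(x_cross_section, 0.0, 1.0)

print(y_then_x, x_then_y, 1 / 5)
```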

1C Higher Dimensions
Iterated integrals for functions defined on sets of dimension greater than 2 can also
be computed by repeated 1-dimensional integration.

EXAMPLE 8 We compute the value of the iterated integral

$$\int_0^1 dx \int_0^{1-x} dy \int_0^{1-x-y} x\,dz.$$

Since x is held fixed during the integration with respect to z and y, the integral is

$$\int_0^1 x\,dx \int_0^{1-x} dy \int_0^{1-x-y} dz = \int_0^1 x\,dx \int_0^{1-x} [z]_0^{1-x-y}\,dy = \int_0^1 x\,dx \int_0^{1-x} (1 - x - y)\,dy = \int_0^1 \left(\tfrac{1}{2}x - x^2 + \tfrac{1}{2}x^3\right) dx = \frac{1}{24}.$$

The region of integration is shown in Figure 7.8(a). It is bounded on top by the
graph of z = 1 - x - y and on three other faces by the coordinate planes x = 0,
y = 0, z = 0. The first integration, with respect to z, is along a vertical segment
depending on x and y. The second integration, with respect to y, sweeps out a
triangular region depending on x. The third integration, with respect to x, pushes the
triangle across the entire solid. Note that the graph of the function f(x, y, z) = x
cannot be pictured in $\mathbb{R}^3$.
If the integrand f(x, y, z) = x is replaced by the constant function with value
1, then we interpret the resulting iterated integral as the volume of the solid region
shown in Figure 7.8(a). We compute the volume as follows:

$$\int_0^1 dx \int_0^{1-x} dy \int_0^{1-x-y} dz = \int_0^1 dx \int_0^{1-x} [z]_0^{1-x-y}\,dy = \int_0^1 dx \int_0^{1-x} (1 - x - y)\,dy = \int_0^1 \tfrac{1}{2}(1 - x)^2\,dx = \frac{1}{6}.$$

We may interpret the integrand f(x, y, z) = x in the previous integral as a variable
density that increases from 0 as we move away from the yz-plane in the positive
x-direction. The integral would then be the total mass of the solid.

EXAMPLE 9 In this example, the function f(x, y, z) = x + y is integrated over the region in
$\mathbb{R}^3$ shown in Figure 7.8(b). We integrate first with respect to z, then y, and then x
to get

$$\int_0^1 dx \int_{x^2}^{x} dy \int_0^{x} (x + y)\,dz = \int_0^1 dx \int_{x^2}^{x} [xz + yz]_{z=0}^{z=x}\,dy = \int_0^1 dx \int_{x^2}^{x} (x^2 + xy)\,dy$$

$$= \int_0^1 \left[x^2 y + \tfrac{1}{2}xy^2\right]_{y=x^2}^{y=x} dx = \int_0^1 \left(\tfrac{3}{2}x^3 - x^4 - \tfrac{1}{2}x^5\right) dx = \left[\tfrac{3}{8}x^4 - \tfrac{1}{5}x^5 - \tfrac{1}{12}x^6\right]_0^1 = \frac{3}{8} - \frac{1}{5} - \frac{1}{12} = \frac{11}{120}.$$

We can interpret the first integration with respect to z as taking place along a vertical
segment joining the points (x, y, 0) and (x, y, x). The integration with respect to
y takes place for each fixed x between 0 and 1, and sweeps out a vertical rectangle
between the xy-plane and the plane z = x. [The graph of the integrand
f(x, y, z) = x + y cannot be drawn in Figure 7.8(b); the sketch shown there is
the domain of f for the purposes of integration, but the graph of f would be
in $\mathbb{R}^4$.]
In the integration with respect to x, the vertical rectangle sweeps out the 3-dimensional
region between the planar top and bottom. We can interpret the value of the
integral as the total mass of a solid region with variable density equal to f(x, y, z) =
x + y at the point (x, y, z).

FIGURE 7.8 (a) The region under the graph of z = 1 - x - y. (b) The region under the plane z = x.
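The triple iterated integrals of Examples 8 and 9 follow the same nesting pattern, one more level deep. The sketch below is an illustration under assumptions: it uses the midpoint rule at every level, and the function `triple_integral` and its parameter names are hypothetical, introduced only for this check of the values 1/24, 1/6, and 11/120.

```python
def midpoint_integral(g, lo, hi, n=60):
    """Approximate the integral of g over [lo, hi] by the midpoint rule with n strips."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def triple_integral(f, a, b, y_lo, y_hi, z_lo, z_hi, n=60):
    """Integrate f(x, y, z) first over z, then over y, then over x.

    y_lo, y_hi are functions of x; z_lo, z_hi are functions of (x, y).
    """
    def over_z(x, y):
        return midpoint_integral(lambda z: f(x, y, z), z_lo(x, y), z_hi(x, y), n)

    def over_y(x):
        return midpoint_integral(lambda y: over_z(x, y), y_lo(x), y_hi(x), n)

    return midpoint_integral(over_y, a, b, n)


# Example 8: 0 <= z <= 1 - x - y, 0 <= y <= 1 - x, 0 <= x <= 1.
mass_8 = triple_integral(lambda x, y, z: x, 0.0, 1.0,
                         lambda x: 0.0, lambda x: 1.0 - x,
                         lambda x, y: 0.0, lambda x, y: 1.0 - x - y)
volume_8 = triple_integral(lambda x, y, z: 1.0, 0.0, 1.0,
                           lambda x: 0.0, lambda x: 1.0 - x,
                           lambda x, y: 0.0, lambda x, y: 1.0 - x - y)

# Example 9: 0 <= z <= x, x^2 <= y <= x, 0 <= x <= 1.
mass_9 = triple_integral(lambda x, y, z: x + y, 0.0, 1.0,
                         lambda x: x**2, lambda x: x,
                         lambda x, y: 0.0, lambda x, y: x)

print(mass_8, 1 / 24)
print(volume_8, 1 / 6)
print(mass_9, 11 / 120)
```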

1D Solids with Known Sectional Areas


Integration over solid regions B in $\mathbb{R}^3$ can sometimes be simplified if the coordinates
used to describe B are chosen so that plane cross-sections perpendicular to one of
the axes have area that you already know or can routinely compute in advance.
Examples are shown in Figure 7.9.

EXAMPLE 10 A solid column C of height h has circular cross-sections with radii that decrease
linearly from b at the base to a at the top. Our problem is to find its volume V(C).
We choose an x-interval from x = 0 to x = h to coincide with the column's axis of
symmetry as shown in Figure 7.9(a). The radius and area of a cross-sectional disk
at distance x from the base are given, consistently with r(0) = b and r(h) = a, by

$$r(x) = \left(\frac{a - b}{h}\right)x + b \quad \text{and} \quad A(x) = \pi r^2(x).$$

This formula for A(x) could have been the result of an additional, but unnecessary,
iterated integration with respect to y and z. In any case the volume of the column is

$$V(C) = \int_0^h A(x)\,dx = \pi \int_0^h \left[\left(\frac{a - b}{h}\right)x + b\right]^2 dx.$$

We can compute this integral either by making the substitution u = r(x) or by
squaring the bracketed expression for r(x). The result is

$$V(C) = \frac{\pi h}{3}\,\frac{b^3 - a^3}{b - a} = \frac{\pi h}{3}(b^2 + ab + a^2).$$

Note that when a = 0 we get a formula for the volume of a right circular cone with
base radius b and height h: $V = \tfrac{1}{3}\pi b^2 h$. If a = b we get the volume of a cylinder.

FIGURE 7.9 (a) A column with radius b at the base and a at height h. (b) A solid torus.

EXAMPLE 11 A solid torus B is generated by rotating a circular disk of radius a about a line l in
the same plane and at distance c > a from the center of the disk. See Figure 7.9(b).
If B is sliced by a plane perpendicular to l the resulting intersection is a flat annular
region with inner radius of the form $r_0 = c - \rho$ and outer radius $r_1 = c + \rho$. Thus
the annulus will have area

$$A = \pi r_1^2 - \pi r_0^2 = \pi\left((c + \rho)^2 - (c - \rho)^2\right) = 4\pi c\rho.$$

The increments $\pm\rho$ depend on the level at which the slice is made. At level z
measured along the horizontal axis having label l, we see that $\rho = \sqrt{a^2 - z^2}$. Hence
$A(z) = 4\pi c\sqrt{a^2 - z^2}$. To find V(B) we integrate A(z) from -a to a:

$$V(B) = \int_{-a}^{a} A(z)\,dz = 4\pi c \int_{-a}^{a} \sqrt{a^2 - z^2}\,dz.$$

This last integral is usually computed using the substitution $z = a\sin u$. However
a moment's thought shows that the integral represents the area of a semicircle of
radius a, namely $\tfrac{1}{2}\pi a^2$. Hence $V(B) = (4\pi c)(\tfrac{1}{2}\pi a^2) = 2\pi^2 c a^2$, $0 < a < c$.
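Integrating a known sectional area is a single 1-dimensional integration, so the formulas of Examples 10 and 11 are easy to test numerically. The sketch below is an informal check rather than a derivation: the midpoint rule, the function names `column_volume` and `torus_volume`, and the sample dimensions are all assumptions made for illustration.

```python
import math


def midpoint_integral(g, lo, hi, n=2000):
    """Approximate the integral of g over [lo, hi] by the midpoint rule with n strips."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def column_volume(a, b, h):
    """Volume of the column of Example 10: radius b at x = 0, radius a at x = h."""
    def area(x):
        r = (a - b) / h * x + b
        return math.pi * r * r
    return midpoint_integral(area, 0.0, h)


def torus_volume(a, c):
    """Volume of the torus of Example 11 from annular sections of area 4*pi*c*rho."""
    def area(z):
        rho = math.sqrt(a * a - z * z)
        return 4.0 * math.pi * c * rho
    return midpoint_integral(area, -a, a)


# Sample dimensions chosen only for the comparison.
a, b, h = 1.0, 2.0, 3.0
print(column_volume(a, b, h), math.pi * h / 3 * (b * b + a * b + a * a))
print(column_volume(0.0, b, h), math.pi * b * b * h / 3)      # cone, a = 0
print(torus_volume(1.0, 3.0), 2 * math.pi**2 * 3.0 * 1.0**2)  # a = 1, c = 3
```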

EXERCISES

In Exercises 1 to 14 evaluate the following iterated integrals and sketch the region of integration for each.

1. 1.~ 2 [l\x y2+xy3)dy]dx

2. fo [/ Ix - 21 sinydx] dy

3. [1° (x + y2)dy] dx

4. forr/ [!~ sinx dx] dy

5. lo y2 (x 2 + y)dx 1 -2 1 dy 0

6. $\int_{-1}^{1} dx \int_0^{|x|} dy$

7. lo dx lo,./1-"i dy

8. 1-l dx 12.x ex+ydy

9. $\int_0^{\pi/2} dy \int_0^{\cos y} x \sin y\,dx$

10. $\int_1^2 dx \int_{x^2}^{x^3} x\,dy$

11. lo [1oz [foy dx] dy] dz

12. fo [lx [lox+y ydz] dy] dx

14. $\int_{-1}^{1} dx \int_0^{|x|} dy \int_0^1 (x + y + z)\,dz$

15. Evaluate the integral lorr sinxdx fo dy fo\x+y+z)dz.

16. Evaluate the integral Jo dx ix dy lx+y dz ix dw. -x -x-y -z

17. Sketch the subset B of $\mathbb{R}^2$ defined by $0 \le x \le 1$, $0 \le y \le x$, and write down the integral over B in each of the two possible orders of $f(x, y) = x \sin y$. Evaluate both integrals.

18. Sketch the region defined by $x \ge 0$, $x^2 + y^2 \le 2$, and $x^2 + y^2 \ge 1$. Write down the integral over the region in each of the two possible orders of $f(x, y) = x^2$. Evaluate both integrals.

19. Consider two real valued functions c(x) and d(x) of a real variable x. Suppose that, for all x in the interval $a \le x \le b$, we have $c(x) \le d(x)$.
(a) Make a sketch of two such functions and of the subset B of the xy-plane consisting of all (x, y) such that $a \le x \le b$ and $c(x) \le y \le d(x)$.
(b) Express the area of B as an iterated integral.
(c) Set up the iterated integral of f(x, y) over B.

20. Sketch the subset B of $\mathbb{R}^3$ defined by $0 \le x \le 1$, $0 \le y \le 1$, and $0 \le z \le 2$. Write down the iterated integral with order of integration z, then y, and then x, of the function $f(x, y, z) = x^2 + z$ over the subset B. Compute the integral.

21. Sketch the region defined by $0 \le x \le 1$, $x^2 \le y \le \sqrt{x}$, and $0 \le z \le x + y$, and evaluate the iterated integral, in some order, of f(x, y, z) = x + y + z over the region.
22. Let f be defined by f(x, y, z) = 1 on the hemisphere bounded by the plane z = 0 and the surface $z = \sqrt{1 - x^2 - y^2}$. Evaluate an iterated integral of f in some order over the region.

23. A solid region B has a circular base of radius a. Cross-sections by planes perpendicular to a fixed diameter of the circle are squares. Sketch B and find its volume.

24. Derive the formula $V = \tfrac{4}{3}\pi a^3$ for the volume of a sphere of radius a by computing an integral of areas of parallel slices of the sphere.

25. (a) Two planes intersect at angle $\pi/4$. A cylinder of radius a with central axis perpendicular to one of the planes intersects that plane in a disk of radius a tangent to the line l of intersection of the planes. Sketch the wedge-shaped solid W bounded by the planes and the cylinder.
(b) Find the volume of W by integrating sectional areas perpendicular to the axis of the cylinder.
(c) Find the volume of W by integrating sectional areas perpendicular to l.

26. Let f be defined by $f(x_1, \ldots, x_n) = x_1 x_2 \cdots x_n$ on the cube $0 \le x_1 \le 1$, $0 \le x_2 \le 1$, ..., $0 \le x_n \le 1$. Evaluate
$$\int_0^1 dx_1 \int_0^1 dx_2 \cdots \int_0^1 x_1 x_2 \cdots x_n\,dx_n.$$

27. Evaluate

28. (a) Evaluate $I_N = \int_0^N dy \int_0^N e^{-x-y}\,dx$.
(b) Evaluate the improper integral $\int_0^\infty dy \int_0^\infty e^{-x-y}\,dx$ by finding $\lim_{N \to \infty} I_N$.

29. (a) Evaluate $I_\delta = \int_\delta^1 dy \int_\delta^1 \frac{1}{\sqrt{xy}}\,dx$.
(b) Evaluate $\int_0^1 dy \int_0^1 \frac{1}{\sqrt{xy}}\,dx$ by finding $\lim_{\delta \to 0+} I_\delta$.

SECTION 2 MULTIPLE INTEGRALS


2A Definition
Recall that the integral of a function with values f(x) assumed on some interval
$a \le x \le b$ is defined by

$$\lim_{\max(\Delta x_k) \to 0} \sum_{k=0}^{K} f(x_k)\,\Delta x_k = \int_a^b f(x)\,dx,$$

where $\Delta x_k$ is the distance between points of subdivision shown in Figure 7.10(a) and
$x_k$ is some point in the kth interval. The sum on the left is most simply interpreted
as the total area of a collection of rectangles. The purpose of this section is to extend
the definition of integral to functions with values f(x) where x is in some subset B
of $\mathbb{R}^n$ for $n \ge 2$.
The extension proceeds very naturally if we keep in mind the analog of Figure
7.10(a) shown in Figure 7.10(b) for the case n = 2. A segment with length $\Delta x$
is replaced by a rectangle in the xy-plane with dimensions $\Delta x$ and $\Delta y$, and the
rectangle area $f(x)\Delta x$ is replaced by a volume $f(x)\Delta x \Delta y$. The integral will then
be defined as a limit of sums of volumes.
Although integration over intervals is adequate for practically all purposes in
dimension 1, we need more general sets in $\mathbb{R}^n$. We first consider some simple sets
in $\mathbb{R}^n$. A closed coordinate rectangle is a subset of $\mathbb{R}^n$ consisting of all points
$x = (x_1, \ldots, x_n)$ that satisfy a set of inequalities

$$a_i \le x_i \le b_i, \quad i = 1, \ldots, n. \tag{1}$$

FIGURE 7.10 (a) Rectangles approximating the area under a graph over $a \le x \le b$. (b) The analog for n = 2.
Section 2A Multiple Integrals 323
If in Formula (I) some of the symbols":::;" are replaced by"<," the resulting set is
y
still called a coordinate rectangle. In particular, if all the inequalities are of the form
ai < Xi < bi, the set is open and is an open coordinate rectangle. A coordinate rect-
angle has its edges parallel to the coordinate axes. Throughout this section the word
rectangle will be understood to mean coordinate rectangle. Figure 7.11 illustrates
X
rectangles in JR2 and IR 3 . A "rectangle" in JR is just an interval.
Let R be a rectangle (open, closed, or neither) defined by Formula (I), and with
replacement of some or all symbols "::::" by "<" pennitted. The volume or content
l::5x::54
of R, written V(R}, is by definition the product of the lengths of the edges of R, and
-1::5y::51 so equals
(a)
(2)

In the examples shown in Figure 7 .11,

V(R2) = (4 - 1)(1 - (-1)) = 6 and V(R3) = (3 - 1)(3 - 1)(2 - I)= 4.

If, for some i in Formula (I), ai = bi, then R is called degenerate and V (R) = 0.
For rectangles in IR 2, content is the same thing as area, and we often write A(R)
instead of V(R) to have the notation remind us of area rather than volume.
A subset B of JR.n is called bounded if there is a real number k such that lxl < k
for all x in B. A finite set of (n - I )-dimensional planes in ]Rn (lines in JR 2 ) parallel
to the coordinate planes will be called a grid. A grid separates ]Rn into a finite
number of closed, bounded rectangles R1, ... , Rr and a finite number of unbounded
regions. A grid covers a subset B of ]Rn if B is contained in the union of the
bounded rectangles R1, ... , Rr, so a set B can be covered by a grid if and only if
FIGURE 7.11 B is bounded. As a measure of the fineness of a grid, we take the maximum of the
lengths of the edges of the rectangles R 1, . . . , Rr. This number is called the mesh
of the grid. In Figure 7.12 the shadings are parts of planes that cut B3 .
Remark. When $n \ge 4$ a note is in order about the planes that form a grid. In the
space $\mathbb{R}^n$ each of the planes of dimension n - 1 that we use to form a grid will be
perpendicular to one of the coordinate axes. For example, planes with equations $x_1 = c$
consist of all points $(c, x_2, x_3, \ldots, x_n)$. Thus vectors of the form $(0, x_2, x_3, \ldots, x_n)$
joining two points in this plane will be perpendicular to a vector $(d, 0, 0, \ldots, 0)$ with
$d \ne 0$ that is parallel to the $x_1$ axis. Similar remarks apply to each of the other n - 1
types of plane with respective equation types $x_2 = c$, $x_3 = c$, ..., $x_n = c$. Each of
these planes has content zero in $\mathbb{R}^n$, and the volume of a box bounded by n pairs of
parallel planes is the product of the distances between the two planes in each pair.

FIGURE 7.12 Grids covering a set B: lines in the plane forming rectangles $R_1, \ldots, R_r$, and planes in space.
We now give a definition of the multiple integral, called the Riemann integral
after Bernhard Riemann. Consider a function $\mathbb{R}^n \xrightarrow{f} \mathbb{R}$ and a set B such that
(a) B is a bounded subset of the domain of f.
(b) f is bounded on B.

Assertion (b) means that there exists a real number K such that $|f(x)| \le K$, for
all x in B. The multiple integral of f over B will be defined in terms of the function
$f_B$, which is f altered to be zero outside B, that is,

$$f_B(x) = \begin{cases} f(x), & \text{if } x \text{ is in } B \\ 0, & \text{if } x \text{ is not in } B. \end{cases}$$

Figure 7.13 shows the shaded graph of a function $f_B$ cut from a graph over the
first quadrant in $\mathbb{R}^2$. Let G be a grid that covers B and has mesh equal to m(G).
In each of the bounded rectangles $R_i$ formed by G, with i = 1, ..., r, choose an
arbitrary point $x_i$. The sum

$$\sum_{i=1}^{r} f_B(x_i)\,V(R_i)$$

is called a Riemann sum for f over B. Its value, for given f and B, depends on
G and $x_1, \ldots, x_r$. If, no matter how we choose grids G with mesh m(G) tending to
zero, it happens that

$$\lim_{m(G) \to 0} \sum_{i=1}^{r} f_B(x_i)\,V(R_i)$$

exists and is always the same number, then this limit is the integral of f over B
and is denoted by $\int_B f\,dV$. If the integral exists, f is said to be integrable over B.
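A Riemann sum for $f_B$ is straightforward to compute once a grid and sample points are fixed. The sketch below makes several choices that the definition leaves open, and they are assumptions of the illustration only: the grid cuts a covering rectangle into n by n equal pieces, each sample point is the center of its rectangle, and the names `riemann_sum` and `in_unit_disk` are introduced here. With f = 1 and B the unit disk, the sums should approach the area of the disk as the mesh 2/n tends to zero.

```python
import math


def riemann_sum(f, in_B, a, b, c, d, n):
    """Riemann sum of f_B over the grid cutting [a, b] x [c, d] into n * n equal rectangles.

    f_B equals f on B and 0 outside B; each rectangle is sampled at its center.
    """
    dx = (b - a) / n
    dy = (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        for j in range(n):
            y = c + (j + 0.5) * dy
            if in_B(x, y):  # outside B the term f_B(x_i) V(R_i) is zero
                total += f(x, y) * dx * dy
    return total


def in_unit_disk(x, y):
    return x * x + y * y <= 1.0


# The mesh of each grid is 2 / n, so it tends to zero as n grows.
for n in (10, 40, 160):
    print(n, riemann_sum(lambda x, y: 1.0, in_unit_disk, -1.0, 1.0, -1.0, 1.0, n))

area_estimate = riemann_sum(lambda x, y: 1.0, in_unit_disk, -1.0, 1.0, -1.0, 1.0, 200)
print(area_estimate, math.pi)
```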

FIGURE 7.13 The graph of a function $f_B$ over the first quadrant in $\mathbb{R}^2$.

The limit that defines the multiple integral is somewhat different from the limit
of a vector function defined in Chapter 5, Section 1, although the idea behind it is
similar. The defining equation

$$\lim_{m(G) \to 0} \sum_{i=1}^{r} f_B(x_i)\,V(R_i) = \int_B f\,dV$$

means that, for any $\varepsilon > 0$, there exists $\delta > 0$ such that if G is any grid that covers
B and has mesh less than $\delta$, and S is an arbitrary Riemann sum for $f_B$ formed from
G, then

$$\left|S - \int_B f\,dV\right| < \varepsilon.$$

It should be emphasized that the integral is not defined for functions f and sets
B unless the boundedness conditions on f and on B are satisfied. Without these
conditions, even the Riemann sums may not be defined.
If f is a real-valued function of one real variable, that is, if n = 1, and if B is an
interval $a \le x \le b$, the Riemann integral of f over B is the familiar definite integral

$$\int_a^b f(x)\,dx.$$

Other common notations for the integral of $\mathbb{R}^n \xrightarrow{f} \mathbb{R}$ over B are

$$\int_B f\,dA \quad \text{and} \quad \int_B f(x, y)\,dx\,dy, \quad \text{if } n = 2,$$

$$\int_B f(x, y, z)\,dx\,dy\,dz, \quad \text{if } n = 3,$$

$$\int_B f\,dx_1 \cdots dx_n, \quad \text{for arbitrary } n.$$
2B Existence
Multiple integrals are often computed by first rewriting them as iterated integrals,
which are then evaluated by repeated application of 1-dimensional integration
techniques. Even though they are too technical to prove here, it is nevertheless important
to have criteria for the existence of an integral $\int_B f(x)\,dV$. The criteria provided
below in Theorem 2.1 impose conditions (i) on the set of boundary points of B and
(ii) on the set of discontinuity points of f. Both (i) and (ii) require that the respective
sets be negligible in the following sense: A set S has zero content if $\int_S 1\,dV = 0$. For
example, finite sets of points, finite collections of smooth curves in $\mathbb{R}^2$ and $\mathbb{R}^3$, and
finite collections of smooth curves and surfaces in $\mathbb{R}^3$ all have zero content, though
we won't prove this.

2.1 Theorem. Let $\mathbb{R}^n \xrightarrow{f} \mathbb{R}$ be defined and bounded on a bounded set B such
that (i) the boundary of B has zero content and (ii) f is continuous except possibly
on a set of zero content. Then f is Riemann integrable over B.
326 Chapter 7 Multiple Integration

EXAMPLE 1  We'll evaluate $\int_B (2x + y)\,dx\,dy$ directly from its definition, where B is the rectangle
$0 \le x \le 1$, $0 \le y \le 2$. Note that (i) the boundary of B consists of four line segments
having total content zero, and (ii) $f(x, y) = 2x + y$ is continuous everywhere. Thus
we know that f is integrable on B, so we can use an arbitrary sequence of Riemann
sums with mesh tending to 0 to evaluate the integral. For each n = 1, 2, ..., consider
the grid $G_n$ consisting of the lines

\[ x = \frac{i}{n},\ i = 0, \ldots, n \quad\text{and}\quad y = \frac{j}{n},\ j = 0, \ldots, 2n. \]

See Figure 7.14(a). The mesh of $G_n$ is $1/n$, and the area of each rectangle $R_{ij}$ is
$1/n^2$. Setting

\[ \mathbf{x}_{ij} = (x_i, y_j) = \left(\frac{i}{n}, \frac{j}{n}\right), \]

we form the Riemann sum, partly illustrated in Figure 7.14(b),

\begin{align*}
\sum_{i=1}^{n}\sum_{j=1}^{2n}\left(\frac{2i}{n} + \frac{j}{n}\right)\frac{1}{n^2}
  &= \frac{1}{n^3}\left(4n\sum_{i=1}^{n} i + n\sum_{j=1}^{2n} j\right) \\
  &= \frac{1}{n^2}\left(\frac{4n^2 + 4n}{2} + \frac{4n^2 + 2n}{2}\right) \\
  &= \frac{4n^2 + 3n}{n^2} = 4 + \frac{3}{n}.
\end{align*}

FIGURE 7.14 (a), (b)

Hence

\[ \int_B (2x + y)\,dx\,dy = \lim_{n\to\infty}\left(4 + \frac{3}{n}\right) = 4. \]
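The limit above can be checked by forming the same Riemann sums numerically. The following is a minimal sketch, not part of the text: the function name `riemann_sum` is hypothetical, and exact rational arithmetic is used so that each sum can be compared with $4 + 3/n$ without rounding error.

```python
from fractions import Fraction

def riemann_sum(n):
    """Riemann sum of f(x, y) = 2x + y over 0 <= x <= 1, 0 <= y <= 2,
    using the grid with spacing 1/n and sample points (i/n, j/n)."""
    cell_area = Fraction(1, n * n)          # each cell is (1/n) by (1/n)
    total = Fraction(0)
    for i in range(1, n + 1):               # x_i = i/n
        for j in range(1, 2 * n + 1):       # y_j = j/n
            x, y = Fraction(i, n), Fraction(j, n)
            total += (2 * x + y) * cell_area
    return total

for n in (1, 10, 100):
    s = riemann_sum(n)
    print(n, s, float(s))                   # each sum equals 4 + 3/n
```

As n grows the printed values approach 4, in agreement with the limit computed by hand.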

Direct evaluation of a multiple integral would be very difficult for most functions
we want to integrate. Fortunately in many instances we can evaluate the multiple
integral by repeated application of ordinary 1-dimensional integration instead of by
finding the limits of Riemann sums. The pertinent theorem, which we don't prove,
is the following.

2.2 Theorem. Let B be a subset of $\mathbb{R}^n$ such that the iterated integral of f
exists over B. If, in addition, the multiple integral

\[ \int_B f\,dV \]

exists, then the two integrals are equal.

Since the argument that proves Theorem 2.2 applies equally well to any order of
iterated integration, we have an immediate corollary:

2.3 Theorem. If $\int_B f\,dV$ exists and iterated integrals exist for some orders of
integration, then all these integrals are equal.

We won't prove this theorem, but looking at Example 1 gives us an intuitive
justification for interchanging the order of integration. In a Riemann sum we can
add first with respect to one index, then with respect to the other, to get

\[ \sum_i \Big[\sum_j f(x_i, y_j)\,\Delta y_j\Big]\Delta x_i = \sum_j \Big[\sum_i f(x_i, y_j)\,\Delta x_i\Big]\Delta y_j, \]

where $\Delta x_i$ and $\Delta y_j$ are the dimensions of the rectangle $R_{ij}$. Thus we can expect
these sums to tend to the respective integrals as the mesh of the grid tends to zero:

\[ \int_B f(x, y)\,dA = \int \Big[\int f(x, y)\,dy\Big] dx = \int \Big[\int f(x, y)\,dx\Big] dy. \]

2C Double Integrals
Computing multiple integrals by iterated integration often requires us to describe
the region over which we are to integrate so we can make reasonable choices for
the order of integration and the limits of integration. We start with 2-dimensional
examples. A double integral is usually written

\[ \iint_B f(x, y)\,dx\,dy \quad\text{or}\quad \int_B f(x, y)\,dx\,dy, \]

where B is a subset of $\mathbb{R}^2$ on which f is defined. If f is nonnegative, we may
interpret the integral as the volume between B and the graph of f in $\mathbb{R}^3$.

EXAMPLE 2  Compute $\int_B (2x + y)\,dx\,dy$ where B is the rectangle

\[ 0 \le x \le 1, \quad 0 \le y \le 2. \]

This is the same integral that occurs in Example 1, but we evaluate it here by iterated
integration as follows.

\begin{align*}
\int_B (2x + y)\,dx\,dy &= \int_0^1 dx \int_0^2 (2x + y)\,dy \\
  &= \int_0^1 \Big[2xy + \tfrac{1}{2}y^2\Big]_{y=0}^{y=2} dx \\
  &= \int_0^1 (4x + 2)\,dx \\
  &= \big[2x^2 + 2x\big]_0^1 = 4.
\end{align*}

Integration in the other order, first with respect to x and then with respect to y,
would produce the same final result.

EXAMPLE 3  Consider the 2-dimensional region B satisfying $0 \le x \le 2$ and $0 \le y \le x^2$ shown
in Figure 7.15(a). The double integral of a function with values f(x, y) defined on
B equals an iterated integral in either of two orders. Integrating first with respect to
y, we would have

\[ \int_B f(x, y)\,dx\,dy = \int_0^2 dx \int_0^{x^2} f(x, y)\,dy. \]

The integral with respect to y, namely

\[ \int_0^{x^2} f(x, y)\,dy, \]

depends on x and represents the area of the shaded vertical slice shown in Figure
7.15(b). If we want to integrate first with respect to x, perhaps to make the indefinite
integrals easier to find, we hold y fixed with $0 \le y \le 4$ and note that then x satisfies
$\sqrt{y} \le x \le 2$. The double integral is then given by [see Figure 7.15(c)]

\[ \int_B f(x, y)\,dx\,dy = \int_0^4 dy \int_{\sqrt{y}}^2 f(x, y)\,dx. \]

For example, if f(x, y) = xy, we would have

\begin{align*}
\int_B xy\,dx\,dy &= \int_0^4 dy \int_{\sqrt{y}}^2 xy\,dx \\
  &= \int_0^4 \Big[\tfrac{1}{2}x^2 y\Big]_{x=\sqrt{y}}^{x=2} dy \\
  &= \int_0^4 \big(2y - \tfrac{1}{2}y^2\big)\,dy \\
  &= \Big[y^2 - \tfrac{1}{6}y^3\Big]_0^4 = 16 - \frac{32}{3} = \frac{16}{3}.
\end{align*}

FIGURE 7.15 (a), (b), (c)
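Both orders of integration for f(x, y) = xy over the region $0 \le x \le 2$, $0 \le y \le x^2$ can be compared numerically. The sketch below is illustrative only: the helper `midpoint` is a hypothetical name for a plain midpoint-rule approximation, and the number of subintervals is an arbitrary choice.

```python
import math

def midpoint(g, a, b, n=400):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def f(x, y):
    return x * y

# Integrate in y first: for each x in [0, 2], y runs from 0 to x^2.
y_first = midpoint(lambda x: midpoint(lambda y: f(x, y), 0.0, x * x), 0.0, 2.0)

# Integrate in x first: for each y in [0, 4], x runs from sqrt(y) to 2.
x_first = midpoint(lambda y: midpoint(lambda x: f(x, y), math.sqrt(y), 2.0), 0.0, 4.0)

print(y_first, x_first, 16 / 3)
```

The two approximations agree with each other and with the exact value 16/3 to within the discretization error of the rule.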
EXAMPLE 4  The inequality $x^2 + y^2 \le 1$ defines a disk D in the xy-plane shown in Figure 7.16(a).
The volume above D and under the graph of $f(x, y) = x^2 + y^2$ is shown in
Figure 7.16(b). For each fixed x satisfying $-1 \le x \le 1$, we have y restricted
so that

\[ -\sqrt{1 - x^2} \le y \le \sqrt{1 - x^2}. \]

We compute the double integral of f over D as an iterated integral as follows.

\begin{align*}
\int_D (x^2 + y^2)\,dx\,dy &= \int_{-1}^{1} dx \int_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}} (x^2 + y^2)\,dy \\
  &= \frac{4}{3}\int_0^1 \Big(\sqrt{1 - x^2} + 2x^2\sqrt{1 - x^2}\Big)\,dx \\
  &= \frac{4}{3}\Big(\frac{\pi}{4} + \frac{\pi}{8}\Big) = \frac{\pi}{2}.
\end{align*}

The indefinite integrals needed in the last step are in the Appendix and most standard
tables; they're evaluated by making the substitution $x = \sin\theta$.

FIGURE 7.16 (a), (b)
More complicated regions like those shown in Figure 7.17 are handled by cutting
them up into disjoint regions $C_k$ over each of which it is possible to compute an
iterated integral. Then use as often as necessary the equation

\[ \int_{C_1 \cup C_2} f(x, y)\,dx\,dy = \int_{C_1} f(x, y)\,dx\,dy + \int_{C_2} f(x, y)\,dx\,dy; \]

this is proved as Theorem 3.6 in the next section.

FIGURE 7.17 (a), (b)

2D Triple Integrals
A triple integral is usually written

\[ \iiint_B f(x, y, z)\,dx\,dy\,dz \quad\text{or}\quad \int_B f(x, y, z)\,dx\,dy\,dz, \]

where B is a subset of $\mathbb{R}^3$ on which f is defined. If f is nonnegative, the number
f(x, y, z) can be interpreted as the density of material at the point (x, y, z), so under
this interpretation the value of the integral becomes the total mass of material in B.

EXAMPLE 5  If R is the 3-dimensional rectangle defined by

\[ 0 \le x \le 2, \quad 0 \le y \le 1, \quad 0 \le z \le 2, \]

we can sketch it as in Figure 7.18. The integral of f(x, y, z) = xyz over R is then
computed as an iterated integral in any of the possible orders.

\begin{align*}
\int_R xyz\,dx\,dy\,dz &= \int_0^2 dx \int_0^1 dy \int_0^2 xyz\,dz \\
  &= \int_0^2 x\,dx \int_0^1 y\,dy \int_0^2 z\,dz \\
  &= \Big[\tfrac{1}{2}x^2\Big]_0^2 \Big[\tfrac{1}{2}y^2\Big]_0^1 \Big[\tfrac{1}{2}z^2\Big]_0^2 = 2.
\end{align*}

FIGURE 7.18

We have factored the x or y out of each integral in which it's not the integration
variable, being careful not to do this when x or y is the integration variable.

EXAMPLE 6  Let f(x, y, z) = xyz, and let the subset B of $\mathbb{R}^3$ be defined by $x^2 + y^2 + z^2 \le 4$,
$x \ge 0$, $y \ge 0$, $z \ge 0$. B is the interior and boundary of one-eighth of the spherical
ball of radius 2 with center at the origin, shown in Figure 7.19. The integral $\int_B f\,dV$
equals the triple iterated integral of the function f(x, y, z) = xyz over B. For fixed
x and y, the variable z runs from 0 to $\sqrt{4 - x^2 - y^2}$, which are the limits of the first
integration with respect to z. The result of this integration is a function of x and y
that must be integrated over the 2-dimensional subset obtained by projecting B on
the xy-plane, that is, over the region

\[ x^2 + y^2 \le 4, \quad x \ge 0, \quad y \ge 0. \]

For fixed x, the variable y runs from 0 to $\sqrt{4 - x^2}$; hence these are the limits on
the integration with respect to y. Finally, x runs from 0 to 2, so we conclude that

\[ \int_B f\,dV = \int_0^2 dx \int_0^{\sqrt{4-x^2}} dy \int_0^{\sqrt{4-x^2-y^2}} xyz\,dz. \]

FIGURE 7.19

Then

\begin{align*}
\int_B f\,dV &= \frac{1}{2}\int_0^2 x\,dx \int_0^{\sqrt{4-x^2}} y\,(4 - x^2 - y^2)\,dy \\
  &= \frac{1}{2}\int_0^2 x\left(2(4 - x^2) - \frac{x^2}{2}(4 - x^2) - \frac{(4 - x^2)^2}{4}\right) dx.
\end{align*}

The last integral simplifies to

\[ \int_0^2 \Big(2x - x^3 + \tfrac{1}{8}x^5\Big)\,dx = \frac{4}{3}. \]
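The intermediate steps for the octant of the ball $x^2 + y^2 + z^2 \le 4$ with integrand xyz can be written out one integration at a time. The display below is a worked sketch of those steps, using only the limits stated in the example:

```latex
\begin{align*}
\int_0^{\sqrt{4-x^2-y^2}} xyz\,dz
  &= xy\Big[\tfrac12 z^2\Big]_0^{\sqrt{4-x^2-y^2}}
   = \tfrac12\,xy\,(4 - x^2 - y^2), \\
\int_0^{\sqrt{4-x^2}} y\,(4 - x^2 - y^2)\,dy
  &= \Big[\tfrac12 (4 - x^2)\,y^2 - \tfrac14 y^4\Big]_0^{\sqrt{4-x^2}}
   = \tfrac14 (4 - x^2)^2, \\
\tfrac12 \int_0^2 \tfrac14\,x\,(4 - x^2)^2\,dx
  &= \int_0^2 \Big(2x - x^3 + \tfrac18 x^5\Big)\,dx
   = \Big[x^2 - \tfrac14 x^4 + \tfrac1{48} x^6\Big]_0^2
   = \tfrac43 .
\end{align*}
```

Expanding $\tfrac18 x(16 - 8x^2 + x^4)$ gives the polynomial in the last line, so the value 4/3 follows term by term.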
2E Content and Mass
If an integrand f(x, y) is constantly 1 on a region R in $\mathbb{R}^2$, then the integral of f
over R can be interpreted as the volume of the solid B with base R and height 1.
An alternative interpretation is as the area of R. More generally, and more precisely,
if a set B in $\mathbb{R}^n$ satisfies the conditions of Theorem 2.1, we define the content of B
to be

\[ V(B) = \int_B dV. \]

In case n = 2 we call this number the area of B and denote it by A(B). When $n \ge 3$
we speak of n-dimensional volume and retain the notation V(B), or else use $V_n(B)$
if the dimension isn't clear from the context. One virtue of this definition is that it
allows us to remove the ambiguity inherent in the multiple ways of computing via
iterated integrals. However, the definition is dependent on the choice of coordinates
used to describe B; this point is addressed in Section 4.

EXAMPLE 7  Referring back to the previous Example 4, the volume of the region B that lies
vertically between the disk D and the graph of $f(x, y) = x^2 + y^2$ is equal, by
Theorem 2.2 and our definition of V(B), to

\[ V(B) = \int_B dV = \int_D \left[\int_0^{f(x,y)} dz\right] dx\,dy. \]

The integral with respect to z works out immediately to f(x, y), so the value $\pi/2$
of the integral computed in Example 4 is assigned to V(B) without concern about
order of integration.

If $\mu(\mathbf{x})$ is nonnegative and integrable over B, then $\mu(\mathbf{x})$ can be interpreted as
the density at x of a mass distribution $\mu$. Then, assuming V(B) > 0, the integral

\[ M(B) = \int_B \mu(\mathbf{x})\,dV \]

is called the total mass of the distribution $\mu$ on B. The type of density considered
here describes mass per volume unit, and in case V(B) = 0, the integral of an
integrable density $\mu$ over B will always be zero. Some alternative densities for
curves and surfaces with zero volume are described in Chapter 8, Section 2 and
Chapter 9, Section 3.

EXAMPLE 8  Referring again to Example 4, we take B to be the 2-dimensional disk D and let
$\mu(x, y) = x^2 + y^2$. Think of D as being made of material that increases in density
from 0 at the center to 1 at the edge as the square of the distance from the center.
According to the computation in Example 4, the total mass is $M(D) = \pi/2$.

The importance of the multiple integral is due partly to the variety of interpretations
that stem from it. Content and total mass are conceptually two of the simplest;
others are discussed in subsequent sections.

EXERCISES

In Exercises 1 to 4, make a drawing of the set B and compute $\int_B f\,dx\,dy$.

1. $f(x, y) = x^2 + 3y^2$ and B is the disk $x^2 + y^2 \le 1$.
2. $f(x, y) = 1/(x + y)$ and B is the region bounded by the lines y = x, x = 1, x = 2, y = 0.
3. $f(x, y) = x \sin xy$ and B is the rectangle $0 \le x \le \pi$, $0 \le y \le 1$.
4. $f(x, y) = x^2 - y^2$ and B consists of all (x, y) such that $0 \le x \le 1$ and $x^2 - y^2 \ge 0$.

In Exercises 5 and 6, use the definition of the double integral as a limit of Riemann sums to compute the
integrals $\int_B f(x, y)\,dx\,dy$ with given f and B. Then verify your answer by computing an appropriate iterated
integral. The following formulas will be useful.

\[ \sum_{i=1}^{n} i = \frac{n(n+1)}{2}, \qquad \sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}, \qquad \sum_{i=1}^{n} i^3 = \Big(\sum_{i=1}^{n} i\Big)^2. \]

5. $f(x, y) = x + 4y$ and B is the rectangle $0 \le x \le 2$, $0 \le y \le 1$.
6. $f(x, y) = 3x^3 + 2y$ and B is the rectangle $0 \le x \le 2$, $0 \le y \le 1$.

In Exercises 7 to 10, find the volume under the graph of f and above the set B, where

7. $f(x, y) = x + y^2$ and B is the rectangle with corners (1, 1), (1, 3), (2, 3), and (2, 1).
8. $f(x, y) = x + y + 2$ and B is the region bounded by the curves $y^2 = x$ and x = 2.
9. $f(x, y) = |x + y|$ and B is the disk $x^2 + y^2 \le 1$.
10. $f(x, y) = x^2 + y^2$ and B is the square with corners at $(x, y) = (\pm 1, \pm 1)$.

11. Find by integration the area of the subset of $\mathbb{R}^2$ bounded by the curve
    $x^2 - 2x + 4y^2 - 8y + 1 = 0$.
12. Given that f(x, y, z) = xyz and that
    \[ \int_B f(x, y, z)\,dx\,dy\,dz = \int_0^2 dx \int_0^x dy \int_0^{x+y} xyz\,dz, \]
    sketch the region B and evaluate the integral.
13. Sketch the region B in $\mathbb{R}^3$ bounded by the surface $z = 4 - 4x^2 - y^2$ and the xy-plane. Set up the volume of B
    as a triple integral and also as a double integral. Compute the volume.
14. Write an expression for the volume of the ball $x^2 + y^2 + z^2 \le a^2$
    (a) as a triple integral.
    (b) as a double integral.
15. Sketch in $\mathbb{R}^3$ the two cylindrical solids defined by $x^2 + z^2 \le 1$ and $y^2 + z^2 \le 1$, respectively. Find the
    volume of their intersection.
16. The 4-dimensional ball B of radius 1 and with center at the origin is the subset of $\mathbb{R}^4$ defined by
    $x_1^2 + x_2^2 + x_3^2 + x_4^2 \le 1$. Set up an expression for the volume V(B) as a fourfold iterated integral.
17. A hemispherical bowl of radius a contains liquid with maximum depth h. Find the volume of the liquid.
18. Cavalieri's principle as originally formulated in the 17th century states that two solids that have equal
    cross-sectional areas at the same height will have equal volumes.
    (a) Assuming that the hypotheses of the principle hold for the two solids with square cross sections that
        follow, find the volume of the one below.
        [Figure: two solids, (a) and (b)]
    (b) Explain how Cavalieri's principle follows from Theorem 2.2 and the definition of area and volume.
19. A semicircular steel plate with a two-foot radius has a concentric semicircle of 6-inch radius removed from its
    straight edge. If the steel has uniform density $\mu = 12$ pounds per square foot, find the total mass of the plate.
20. A rectangular 2-by-3 foot steel plate has been machined so that its density varies linearly from 10 pounds per square
    foot to 12 pounds per square foot as measured in the long direction. Find the total mass of the plate.
21. A column with circular cross section varying from diameter 12 inches to diameter 8 inches is 10 feet long. The
    density $\mu$ of the material in the column varies linearly along the length of the column from 50 pounds per cubic
    foot at the thick end to 40 pounds per cubic foot at the thin end. Find the total mass. See Figure 7.9(a).

SECTION 3 INTEGRATION THEOREMS


The emphasis in the two previous sections is on computational technique and on inter-
pretation of the integral. Here we look at four characteristic properties of integrals
and show how some other important properties follow from them.

3.1 Theorem. Linearity: If f and g are integrable over B and a and b are any
two real numbers, then af + bg is integrable over B and

\[ \int_B (af + bg)\,dV = a\int_B f\,dV + b\int_B g\,dV. \]

3.2 Theorem. Positivity: If f is nonnegative and integrable over B, then

\[ \int_B f\,dV \ge 0. \]

3.3 Theorem. If R is a rectangle, then $\int_R dV = V(R)$, where the content V(R)
is defined as the product of the lengths of the edges of R.

In the next theorem recall that $f_B(\mathbf{x})$ is defined to equal $f(\mathbf{x})$ for x in B and to
equal 0 for x not in B.

3.4 Theorem. If B is a subset of a bounded set C, then $\int_B f\,dV$ exists if and
only if $\int_C f_B\,dV$ exists. Whenever both integrals exist, they are equal.

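Proof of 3.1. Let $\varepsilon > 0$ be given, and choose $\delta > 0$ so that if $S_1$ and $S_2$ are
Riemann sums for $f_B$ and $g_B$ respectively whose grids have mesh less than $\delta$, then

\[ |a|\,\Big|S_1 - \int_B f\,dV\Big| < \frac{\varepsilon}{2} \quad\text{and}\quad |b|\,\Big|S_2 - \int_B g\,dV\Big| < \frac{\varepsilon}{2}. \]

Let S be any Riemann sum for $(af + bg)_B$ whose grid has mesh less than $\delta$. Then

\begin{align*}
S &= \sum_i (af + bg)_B(\mathbf{x}_i)\,V(R_i) \\
  &= a\sum_i f_B(\mathbf{x}_i)\,V(R_i) + b\sum_i g_B(\mathbf{x}_i)\,V(R_i) = aS_1 + bS_2.
\end{align*}

Hence

\begin{align*}
\Big|S - a\int_B f\,dV - b\int_B g\,dV\Big| &= \Big|aS_1 - a\int_B f\,dV + bS_2 - b\int_B g\,dV\Big| \\
  &\le |a|\,\Big|S_1 - \int_B f\,dV\Big| + |b|\,\Big|S_2 - \int_B g\,dV\Big| \\
  &< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{align*}

Thus

\[ \lim_{m(G)\to 0} \sum_i (af + bg)_B(\mathbf{x}_i)\,V(R_i) = a\int_B f\,dV + b\int_B g\,dV, \]

and the proof is complete. •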


Proof of 3.2. Since all the Riemann sums are nonnegative, the limit must also be
nonnegative. •
Proof of 3.3. This follows immediately from Theorems 2.1 and 2.2. •
Proof of 3.4. The existence and the value of the integral $\int_B f\,dV = \int f_B\,dV$
depend only on the function $f_B$. Similarly, $\int_C f_B\,dV$ is defined by using $(f_B)_C$,
which is equal to $f_B$. •
We can now prove the next two theorems directly.
3.5 Theorem. If f and g are integrable over B, and $f \le g$ on B, then

\[ \int_B f\,dV \le \int_B g\,dV. \]

If in addition |f| is integrable over B then

\[ \Big|\int_B f\,dV\Big| \le \int_B |f|\,dV. \]

Proof. The function g - f is nonnegative and, by Theorem 3.1, is integrable over
B. Hence, by Theorems 3.1 and 3.2,

\[ \int_B g\,dV - \int_B f\,dV = \int_B (g - f)\,dV \ge 0, \]

from which the conclusion follows. The second part is left as Exercise 2. •

The next theorem establishes an analog for the equation

\[ \int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx \]

that holds for functions of one variable.


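3.6 Theorem. If f is integrable over each of two disjoint sets $B_1$ and $B_2$, then
f is integrable over their union and

\[ \int_{B_1 \cup B_2} f\,dV = \int_{B_1} f\,dV + \int_{B_2} f\,dV. \]

Proof. By Theorem 3.4,

\[ \int_{B_1} f\,dV = \int_{B_1 \cup B_2} f_{B_1}\,dV \quad\text{and}\quad \int_{B_2} f\,dV = \int_{B_1 \cup B_2} f_{B_2}\,dV. \]

Since $B_1$ and $B_2$ are disjoint, $f_{B_1 \cup B_2} = f_{B_1} + f_{B_2}$. Hence, by Theorem 3.1, the
function $f_{B_1 \cup B_2}$ is integrable over $B_1 \cup B_2$, and

\[ \int_{B_1 \cup B_2} f_{B_1 \cup B_2}\,dV = \int_{B_1 \cup B_2} f_{B_1}\,dV + \int_{B_1 \cup B_2} f_{B_2}\,dV. \]

Finally, by Theorem 3.4 again, f is integrable over $B_1 \cup B_2$ and

\[ \int_{B_1 \cup B_2} f\,dV = \int_{B_1 \cup B_2} f_{B_1 \cup B_2}\,dV, \]

which completes the proof. •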


The possibility of changing order of integration has a number of consequences
other than its convenience for computing multiple integrals. One of these is the theo-
rem for change of order in partial differentiation, proved in Section 3 of Chapter 4 by
other means, and in a slightly stronger form in Exercise 13 of this section. Another
consequence is the Leibniz Rule for interchanging differentiation and integration.

3.7 Leibniz Rule. If $(\partial g/\partial y)(x, y)$ is continuous for $a \le x \le b$ and $c \le y \le d$,
then

\[ \frac{d}{dy}\int_a^b g(x, y)\,dx = \int_a^b \frac{\partial g}{\partial y}(x, y)\,dx. \]

Proof. The trick is to start with the following change in order of integration:

\[ \int_c^y \left[\int_a^b g_y(t, s)\,dt\right] ds = \int_a^b \left[\int_c^y g_y(t, s)\,ds\right] dt. \]

For each t the integral in square brackets on the right evaluates to g(t, y) - g(t, c).
(Use the version of the Fundamental Theorem of Calculus that tells you how to
integrate a derivative.) Thus the previous equation becomes

\[ \int_c^y \left[\int_a^b g_y(t, s)\,dt\right] ds = \int_a^b g(t, y)\,dt - \int_a^b g(t, c)\,dt. \]

Note that the subtracted term is a constant. Now apply the other version of the
Fundamental Theorem of Calculus to both sides, differentiating with respect to y. On the left we undo the y-integration,
so we get the desired result

\[ \int_a^b g_y(t, y)\,dt = \frac{d}{dy}\int_a^b g(t, y)\,dt. \] •
EXAMPLE 1  Let $G(y) = \int_0^1 \sin(y e^x)\,dx$. There seems to be no way to evaluate the integral in
terms of elementary functions, but we can find $G'(y)$ using the Leibniz rule. We find

\[ G'(y) = \int_0^1 \cos(y e^x)\,e^x\,dx = \left[\frac{1}{y}\sin(y e^x)\right]_0^1 = \frac{1}{y}\big(\sin(ey) - \sin y\big), \quad y \ne 0. \]
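The closed form for $G'(y)$ can be compared with a finite-difference derivative of a numerically computed $G(y) = \int_0^1 \sin(y e^x)\,dx$. This is a rough sketch under stated assumptions: the helper names `midpoint`, `G`, `G_prime_formula`, and `G_prime_numeric` are hypothetical, and the step sizes are arbitrary choices.

```python
import math

def midpoint(g, a, b, n=2000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def G(y):
    # G(y) = integral from 0 to 1 of sin(y * e^x) dx, approximated numerically
    return midpoint(lambda x: math.sin(y * math.exp(x)), 0.0, 1.0)

def G_prime_formula(y):
    # Result of the Leibniz rule: (sin(e*y) - sin(y)) / y, valid for y != 0
    return (math.sin(math.e * y) - math.sin(y)) / y

def G_prime_numeric(y, h=1e-4):
    # Central difference quotient of the numerically integrated G
    return (G(y + h) - G(y - h)) / (2 * h)

for y in (0.5, 1.0, 2.0):
    print(y, G_prime_numeric(y), G_prime_formula(y))
```

The two columns agree to several decimal places, which is what differentiating under the integral sign predicts.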

EXAMPLE 2  If u and v are fixed numbers, both positive or both negative, the formula

\[ F(y, u, v) = \int_u^v \frac{1}{x}\,e^{yx}\,dx \]

defines a function of y. To find the derivative of F with respect to y, we can write,
using Theorem 3.7,

\begin{align*}
\frac{d}{dy}F(y, u, v) &= \int_u^v \frac{\partial}{\partial y}\left[\frac{1}{x}\,e^{yx}\right] dx \\
  &= \int_u^v e^{yx}\,dx.
\end{align*}

EXERCISES

1. Consider the rectangles

   $B_1$ defined by $0 < x \le 1$, $0 \le y < 1$,
   $B_2$ defined by $1 \le x \le 2$, $-1 \le y \le 1$,

   and the function

   \[ f(x, y) = \begin{cases} 2x - y, & \text{if } x < 1, \\ x^2 + y, & \text{if } x \ge 1. \end{cases} \]

   Compute

   \[ \int_{B_1 \cup B_2} f(x, y)\,dx\,dy. \]

2. Use the first part of Theorem 3.5 to show that if f and |f| are integrable over B, then

   \[ \Big|\int_B f\,dV\Big| \le \int_B |f|\,dV. \]

   [Hint: $-|f| \le f \le |f|$.]

3. Use the result of Exercise 2 to show that if $f\colon \mathbb{R}^n \to \mathbb{R}$ is continuous on a set B, and $\mathbf{x}_0$ is interior to B, then

   \[ \lim_{r \to 0} \frac{1}{V(B_r)} \int_{B_r} f\,dV = f(\mathbf{x}_0), \]

   where $B_r$ is a ball of radius r centered at $\mathbf{x}_0$.

4. Let B be the subset of $\mathbb{R}^2$ consisting of all points (x, y) such that $0 \le y \le 1$, and x is rational, $0 \le x \le 1$. Does
   the area exist?

5. On the rectangle $0 \le x \le 1$ and $0 \le y \le 1$, let f(x, y) = 1, if x is rational, and f(x, y) = 2y if x is
   irrational. Show that

   \[ \int_0^1 dx \int_0^1 f(x, y)\,dy = 1, \]

   but that f is not Riemann integrable over the rectangle.

6. Interchange of order may not always work for improper integrals. Prove that

   \[ \int_0^1 dy \int_0^\infty \big(e^{-xy} - 2e^{-2xy}\big)\,dx \ne \int_0^\infty dx \int_0^1 \big(e^{-xy} - 2e^{-2xy}\big)\,dy. \]

7. Let $f\colon \mathbb{R}^n \to \mathbb{R}^m$ be defined on a set B in $\mathbb{R}^n$. We define

   \[ \int_B f\,dV = \Big(\int_B f_1\,dV, \ldots, \int_B f_m\,dV\Big), \]

   provided that the integrals of the coordinate functions $f_1, \ldots, f_m$ of f all exist.

   (a) Show that if $f\colon \mathbb{R}^n \to \mathbb{R}^m$ and $g\colon \mathbb{R}^n \to \mathbb{R}^m$ are both integrable over B, then

       \[ \int_B (af + bg)\,dV = a\int_B f\,dV + b\int_B g\,dV, \]

       where a and b are constants.

   (b) If k is some fixed vector in $\mathbb{R}^m$, and $f\colon \mathbb{R}^n \to \mathbb{R}^m$ is integrable over B, show that

       \[ \mathbf{k} \cdot \int_B f\,dV = \int_B \mathbf{k} \cdot f\,dV. \]

   (c) Show that if $f\colon \mathbb{R}^n \to \mathbb{R}^m$ and $|f|\colon \mathbb{R}^n \to \mathbb{R}$ are integrable over B, then

       \[ \Big|\int_B f\,dV\Big| \le \int_B |f|\,dV. \]

       [Hint: By the Cauchy-Schwarz inequality, $f(\mathbf{x}) \cdot \int_B f\,dV \le |f(\mathbf{x})|\,\big|\int_B f\,dV\big|$ for all x
       in B. Integrate with respect to x and apply the result of part (b).]

8. (a) Use the Leibniz rule together with the chain rule to prove that if $g_y(x, y)$ is continuous, and $h_1(y)$ and
       $h_2(y)$ are differentiable, then

       \[ \frac{d}{dy}\int_{h_1(y)}^{h_2(y)} g(t, y)\,dt = \int_{h_1(y)}^{h_2(y)} g_y(t, y)\,dt + h_2'(y)\,g(h_2(y), y) - h_1'(y)\,g(h_1(y), y). \]

       [Hint: The integral on the left has the form $G(y, h_1(y), h_2(y))$.]

   (b) Use part (a) to compute $F'(y)$, where

       \[ F(y) = \int_{-y}^{y} \frac{1}{x}\big(1 - e^{-xy}\big)\,dx. \]

In Exercises 9 to 12, use the Leibniz rule to find the indicated derivative of the given function.

9. $f(y) = \int_0^1 (y^2 + t^2)\,dt$. Find $f'(y)$.

10. $g(t) = \int_1^2 \frac{1}{x}\,e^{tx}\,dx$. Find $g'(t)$.

11. $h(x) = \int_0^x (x - u)\,e^{u^2}\,du$. Find $h'(x)$ and $h''(x)$.

12. $k(s) = \int_0^1 \frac{u^s - 1}{\ln u}\,du$. Assuming $s > -1$, find $k'(s)$ and then find $k(s)$.

*13. Prove the following stronger version of Theorem 3.3 of Chapter 4: If $f_x$, $f_y$, $f_{yx}$, and $f_{xy}$ are continuous on an
     open set, then $f_{xy} = f_{yx}$. [Hint: Apply the Leibniz rule to the equation

     \[ f(x, y) - f(a, y) = \int_a^x f_x(t, y)\,dt, \]

     and then differentiate both sides with respect to x.]

SECTION 4 CHANGE OF VARIABLE


A change of variable is often used in a 1-dimensional integral to simplify the integrand.
For example, we can make the substitution x = u + 1 and dx = du to find

\[ \int_1^2 x\sqrt{x - 1}\,dx = \int_0^1 (u + 1)\,u^{1/2}\,du = \Big[\tfrac{2}{5}u^{5/2} + \tfrac{2}{3}u^{3/2}\Big]_0^1 = \frac{16}{15}. \]

The aim of this change of variable was to simplify the integrand, and the change of
interval of integration from $1 \le x \le 2$ to $0 \le u \le 1$ makes very little difference in
the computation. In computing multiple integrals it's more often the corresponding
change in the region of integration that we're concerned with, the point being that we
can use a change of variable to simplify the region. We first consider some simple
examples of multivariable coordinate changes.
4A Polar Coordinates

In a double integral $\int_D f(x, y)\,dx\,dy$ over a circular region, it is often helpful to
introduce polar coordinates by the transformation

\[ x = r\cos\theta, \qquad y = r\sin\theta. \]

Corresponding regions in the xy-plane and $r\theta$-plane are shown in Figure 7.20. As
r and $\theta$ vary within the limits $r_0 \le r \le r_1$ and $\theta_0 \le \theta \le \theta_1$, the values of x
and y give the coordinates of the points in the shaded part D of the disk shown in
Figure 7.20(c). Rather than approximate the value of an integral by decomposing D
using a rectangular grid, we can think of using a polar coordinate grid. A typical
subdivision S of such a grid is shown in Figure 7.20(b). Using elementary geometry,
we can compute the exact area of S as a certain fraction, namely, $(\Delta\theta)/2\pi$, of the
region between circles of radius r and $r + \Delta r$. We find

\begin{align*}
A(S) &= \frac{\Delta\theta}{2\pi}\big[\pi(r + \Delta r)^2 - \pi r^2\big] \\
     &= \frac{\Delta\theta}{2}\big[2r\,\Delta r + (\Delta r)^2\big] = r\,\Delta r\,\Delta\theta + \frac{1}{2}(\Delta r)^2\,\Delta\theta.
\end{align*}

If $\Delta r$ and $\Delta\theta$ are small, the second term is relatively small compared with the first
term, and we make the approximation $A(S) \approx r\,\Delta r\,\Delta\theta$. Because $x = r\cos\theta$ and
$y = r\sin\theta$, we can approximate the integral of f over D as follows:

\begin{align*}
\int_D f(x, y)\,dA &\approx \sum_{k=1}^{N} f(r_k\cos\theta_k, r_k\sin\theta_k)\,A(S_k) \\
  &\approx \sum_{k=1}^{N} f(r_k\cos\theta_k, r_k\sin\theta_k)\,r_k\,\Delta r_k\,\Delta\theta_k.
\end{align*}

FIGURE 7.20 (a), (b), (c)

It follows from Equation 4.4 in Section 4D that

\[ \textbf{4.1} \qquad \int_D f(x, y)\,dA = \int_{\theta_0}^{\theta_1} d\theta \int_{r_0}^{r_1} f(r\cos\theta, r\sin\theta)\,r\,dr. \]

The computations in the next two examples are considerably simpler than direct
use of iterated integrals with respect to x and y.

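EXAMPLE 1  If f is constantly 1 on a quarter-circle Q of radius 1, we get

\[ \int_Q dA = \int_0^{\pi/2} d\theta \int_0^1 r\,dr = \frac{\pi}{4}, \]

which is the area of a quarter-circle of radius 1.

EXAMPLE 2  If $f(x, y) = x^2 + y^2$ on a half-circle H, then $f(r\cos\theta, r\sin\theta) = r^2$, so the
integrand in Equation 4.1 becomes $r^2 \cdot r = r^3$.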
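The polar area element can be explored numerically. The sketch below is an illustration rather than part of the text: `exact_cell_area` and `polar_sum` are hypothetical helper names, the grid is uniform in r and $\theta$, and sample points are taken at cell midpoints. It compares the exact area of one polar cell with $r\,\Delta r\,\Delta\theta$ and then sums $f\,r\,\Delta r\,\Delta\theta$ over a quarter-disk of radius 1.

```python
import math

def exact_cell_area(r, dr, dtheta):
    """Exact area of the polar cell between radii r and r + dr with angle dtheta."""
    return dtheta / (2 * math.pi) * (math.pi * (r + dr) ** 2 - math.pi * r ** 2)

def polar_sum(f, r0, r1, t0, t1, n=200):
    """Sum of f(r cos t, r sin t) * r * dr * dt over an n-by-n polar grid,
    sampling at the midpoint of each cell."""
    dr = (r1 - r0) / n
    dt = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        r = r0 + (i + 0.5) * dr
        for j in range(n):
            t = t0 + (j + 0.5) * dt
            total += f(r * math.cos(t), r * math.sin(t)) * r * dr * dt
    return total

# One small cell: exact area versus the approximation r * dr * dtheta.
print(exact_cell_area(1.0, 0.01, 0.01), 1.0 * 0.01 * 0.01)

# Area of a quarter-disk of radius 1 (f = 1) compared with pi/4.
quarter_area = polar_sum(lambda x, y: 1.0, 0.0, 1.0, 0.0, math.pi / 2)
print(quarter_area, math.pi / 4)
```

The single-cell comparison shows the neglected term $\tfrac12(\Delta r)^2\Delta\theta$ directly, and the sum reproduces the quarter-disk area.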

4B Spherical Coordinates
We introduce spherical coordinates in $\mathbb{R}^3$ by the transformation

\[ \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} r\sin\phi\cos\theta \\ r\sin\phi\sin\theta \\ r\cos\phi \end{pmatrix}. \]

Corresponding regions are shown in Figure 7.21. The spherical coordinate "cube" C
has by a direct calculation a volume approximated by

\[ V(C) \approx r^2\sin\phi\,\Delta r\,\Delta\phi\,\Delta\theta. \]

It will follow from Equation 4.4 that the equation

\[ \textbf{4.2} \qquad \int_B f(x, y, z)\,dV = \int_{\theta_0}^{\theta_1} d\theta \int_{\phi_0}^{\phi_1} d\phi \int_{r_0}^{r_1} \bar f(r, \phi, \theta)\,r^2\sin\phi\,dr \]

is valid, where $\bar f(r, \phi, \theta) = f(r\sin\phi\cos\theta, r\sin\phi\sin\theta, r\cos\phi)$.

FIGURE 7.21

EXAMPLE 3  A solid ball B of radius a is described by the spherical coordinate inequalities

\[ 0 \le r \le a, \quad 0 \le \phi \le \pi, \quad 0 \le \theta < 2\pi. \]

We can compute the volume of the ball by

\begin{align*}
\int_B dV &= \int_0^{2\pi} d\theta \int_0^{\pi} d\phi \int_0^a r^2\sin\phi\,dr \\
  &= \int_0^{2\pi} d\theta \int_0^{\pi} \sin\phi\,d\phi \int_0^a r^2\,dr \\
  &= \big[\theta\big]_0^{2\pi}\,\big[-\cos\phi\big]_0^{\pi}\,\Big[\tfrac{1}{3}r^3\Big]_0^a \\
  &= (2\pi)(2)\big(\tfrac{1}{3}a^3\big) = \tfrac{4}{3}\pi a^3,
\end{align*}

the formula for the volume of a sphere of radius a.

4C Cylindrical Coordinates
The transformation

\[ \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} r\cos\theta \\ r\sin\theta \\ z \end{pmatrix} \]

is used to introduce cylindrical coordinates in $\mathbb{R}^3$. Corresponding regions are shown
in Figure 7.22. Note the close connection with plane polar coordinates. Using the
result of the similar calculation for polar coordinates, we can see that the volume of
a cylindrical coordinate "cube" C is given approximately by

\[ V(C) \approx r\,\Delta r\,\Delta\theta\,\Delta z. \]

FIGURE 7.22 (a), (b)

It will follow from Equation 4.4 that the equation

\[ \int_B f(x, y, z)\,dV = \int_{z_0}^{z_1} dz \int_{\theta_0}^{\theta_1} d\theta \int_{r_0}^{r_1} \bar f(r, \theta, z)\,r\,dr \]

is valid, where $\bar f(r, \theta, z) = f(r\cos\theta, r\sin\theta, z)$.


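EXAMPLE 4  A longitudinal wedge cut from a cylinder C of height h and radius a is described
by the inequalities

\[ 0 \le r \le a, \quad 0 \le \theta \le w, \quad 0 \le z \le h. \]

To integrate the function $f(x, y, z) = x^2 + y^2 + z^2$ over C, we compute as follows.

\begin{align*}
\int_C (x^2 + y^2 + z^2)\,dV &= \int_0^h dz \int_0^w d\theta \int_0^a (r^2 + z^2)\,r\,dr \\
  &= w\int_0^h \Big(\tfrac{1}{4}a^4 + \tfrac{1}{2}a^2 z^2\Big)\,dz \\
  &= w\Big(\tfrac{1}{4}a^4 h + \tfrac{1}{6}a^2 h^3\Big).
\end{align*}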

The formulas for integration in polar, spherical, and cylindrical coordinates can all
be derived in a uniform way. The computation involves the determinants of the
Jacobian matrices of the coordinate transformations. For polar coordinates, we have

\[ \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} r\cos\theta \\ r\sin\theta \end{pmatrix} \quad\text{and}\quad \det\begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} = r. \]

For spherical coordinates, we have, correspondingly,

\[ \det\begin{pmatrix} \sin\phi\cos\theta & r\cos\phi\cos\theta & -r\sin\phi\sin\theta \\ \sin\phi\sin\theta & r\cos\phi\sin\theta & r\sin\phi\cos\theta \\ \cos\phi & -r\sin\phi & 0 \end{pmatrix} = r^2\sin\phi. \]

And for cylindrical coordinates

\[ \det\begin{pmatrix} \cos\theta & -r\sin\theta & 0 \\ \sin\theta & r\cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} = r. \]

Thus we see that the extra factor in the integrand on the right side of Equation 4.1
and Equation 4.2 is in each case supplied by the Jacobian determinant of the coordinate
transformation. The expression $r\,\Delta r\,\Delta\theta$ is called the area element in polar
coordinates. Similarly $r^2\sin\phi\,\Delta r\,\Delta\phi\,\Delta\theta$ is called the volume element in spherical
coordinates, whereas in cylindrical coordinates the volume element is $r\,\Delta r\,\Delta\theta\,\Delta z$.
These formulas will be generalized in Section 4D.
Cylindrical Shells. Suppose a solid figure B is generated by rotating a plane
region R about a line l in the same plane. If R doesn't intersect l and is bounded
above and below by graphs of u(r) and v(r), we can imagine that B is composed
of coaxial cylindrical shells, each one of height h(r) = u(r) - v(r) depending on
the distance r of a point on the shell from the line. See Figure 7.23(a). When such a
shell is slit vertically and rolled out flat, it has surface area $S(r) = 2\pi r\,h(r)$, which
is the circumference $2\pi r$ of the shell multiplied by its height h(r). It seems plausible
that we can find the volume of B by computing the integral of S(r) over the relevant
interval $r_0 \le r \le r_1$:

\[ \textbf{4.3} \qquad V(B) = \int_{r_0}^{r_1} S(r)\,dr = \int_{r_0}^{r_1} 2\pi r\,h(r)\,dr. \]

FIGURE 7.23 (a), (b)

To make a rigorous reconciliation of Equation 4.3 with the fundamental definition
$V(B) = \int_B dx\,dy\,dz$, we simply introduce cylindrical coordinates as follows:

\[ V(B) = \int_0^{2\pi} d\theta \int_{r_0}^{r_1} \left[\int_{v(r)}^{u(r)} dz\right] r\,dr = 2\pi\int_{r_0}^{r_1} \big(u(r) - v(r)\big)\,r\,dr. \]

Since u(r) - v(r) = h(r), Equation 4.3 follows immediately.

EXAMPLE 5  Rotate a square with side length a about a line l in the same plane, where l is parallel
to an edge of the square and lies at distance c > a/2 from the center of the square.
The resulting solid B is a ring with right-angled corners. See Figure 7.23(b).
To use Equation 4.3 to compute V(B), we note that r should run from $r_0 = c - \tfrac{1}{2}a$
to $r_1 = c + \tfrac{1}{2}a$ and that h(r) = a for all r in the interval. Then

\[ V(B) = \int_{c - a/2}^{c + a/2} 2\pi r a\,dr = \pi a\Big[\big(c + \tfrac{1}{2}a\big)^2 - \big(c - \tfrac{1}{2}a\big)^2\Big] = 2\pi a^2 c. \]

This result follows in an interesting way using Pappus's theorem, discussed in
Section 5.
4D Jacobi's Theorem
The foregoing discussion shows that the Jacobian determinant of a coordinate transformation
is related to the volume of the natural curvilinear coordinate subdivisions.
The Jacobian determinants of arbitrary one-to-one continuously differentiable transformations
discussed in Chapter 6, Section 2B have similar interpretations. In what
follows it will usually be more convenient to consider the domain space and range
space of T as distinct. We therefore regard T as a transformation from one copy
of $\mathbb{R}^n$, which we label $\mathcal{U}^n$, to another copy, which we continue to label $\mathbb{R}^n$, writing
typically T(u) = x where u is in $\mathcal{U}^n$ and x is in $\mathbb{R}^n$. The statement of the
n-dimensional change-of-variable theorem follows. We won't give the proof because
it's quite complicated. Typical regions R and T(R) are shown in Figure 7.24.

FIGURE 7.24 (a), (b)

4.4 Jacobi's Theorem. Let $T\colon \mathcal{U}^n \to \mathbb{R}^n$ be a continuously differentiable transformation.
Let R be a set in $\mathcal{U}^n$ having a boundary consisting of finitely many smooth
sets. Suppose that R and its boundary are contained in the interior of the domain of
T and that

(i) T is one-to-one on the interior of R.
(ii) det T', the Jacobian determinant of T, is not zero in the interior of R.

If the function f is bounded and continuous on the image of R under T, denoted by
T(R), then we have

\[ \int_{T(R)} f(\mathbf{x})\,dV_{\mathbf{x}} = \int_R f(T(\mathbf{u}))\,|\det T'(\mathbf{u})|\,dV_{\mathbf{u}}. \]
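Using Leibniz notation for the Jacobian determinant of

\[ T(u, v) = (F(u, v), G(u, v)), \]

we have

\[ \det T' = \frac{\partial(x, y)}{\partial(u, v)} = \det\begin{pmatrix} \dfrac{\partial F}{\partial u} & \dfrac{\partial F}{\partial v} \\[2mm] \dfrac{\partial G}{\partial u} & \dfrac{\partial G}{\partial v} \end{pmatrix}. \]

Then Jacobi's formula becomes

\[ \int_{T(R)} f(x, y)\,dx\,dy = \int_R f\big(F(u, v), G(u, v)\big)\left|\frac{\partial(x, y)}{\partial(u, v)}\right| du\,dv. \]

In three dimensions, we have, with

\[ T(u, v, w) = (F(u, v, w), G(u, v, w), H(u, v, w)), \]

the formulas

\[ \det T' = \frac{\partial(x, y, z)}{\partial(u, v, w)} = \det\begin{pmatrix} \dfrac{\partial F}{\partial u} & \dfrac{\partial F}{\partial v} & \dfrac{\partial F}{\partial w} \\[2mm] \dfrac{\partial G}{\partial u} & \dfrac{\partial G}{\partial v} & \dfrac{\partial G}{\partial w} \\[2mm] \dfrac{\partial H}{\partial u} & \dfrac{\partial H}{\partial v} & \dfrac{\partial H}{\partial w} \end{pmatrix} \]

and

\[ \int_{T(R)} f(x, y, z)\,dx\,dy\,dz = \int_R f\big(F(u, v, w), G(u, v, w), H(u, v, w)\big)\left|\frac{\partial(x, y, z)}{\partial(u, v, w)}\right| du\,dv\,dw. \]

FIGURE 7.25 (a), (b)

Aside from the computation of det T', the application of the transformation formula
is a matter of finding the geometric relationship between the subset R and its
image T(R) for a transformation T.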

EXAMPLE 6  The integral $\int_P (x + y)\,dx\,dy$, in which P is the parallelogram shown in Figure 7.25,
transforms into an integral over a rectangle R. This is done using the transformation

\[ \begin{pmatrix} x \\ y \end{pmatrix} = T\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} u + v \\ v \end{pmatrix}. \]

The Jacobian determinant of T is

\[ \det T' = \begin{vmatrix} 1 & 1 \\ 0 & 1 \end{vmatrix} = 1. \]

By the change-of-variable theorem,

\begin{align*}
\int_P (x + y)\,dx\,dy &= \int_R [(u + v) + v]\,[1]\,du\,dv \\
  &= \int_0^2 du \int_0^1 (u + 2v)\,dv = 4.
\end{align*}

The transformation T is one-to-one because it is a linear transformation with nonzero
determinant. Notice that the region of integration in the given integral is in the range
of the transformation rather than in its domain.
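The parallelogram computation can be cross-checked by integrating both sides of the change-of-variable formula numerically. This sketch rests on an assumption: it takes the transformation to be T(u, v) = (u + v, v) on the rectangle $0 \le u \le 2$, $0 \le v \le 1$, which is consistent with the integrand (u + v) + v and the unit Jacobian above, so that P is described by $0 \le y \le 1$, $y \le x \le y + 2$. The helper `midpoint` is a hypothetical name.

```python
def midpoint(g, a, b, n=200):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

# Right side: over the rectangle R, integrand f(T(u, v)) * |det T'| = (u + 2v) * 1.
over_R = midpoint(lambda u: midpoint(lambda v: (u + v) + v, 0.0, 1.0), 0.0, 2.0)

# Left side: directly over P (assumed image of R), x from y to y + 2 for 0 <= y <= 1.
over_P = midpoint(lambda y: midpoint(lambda x: x + y, y, y + 2.0), 0.0, 1.0)

print(over_R, over_P)   # both approximate the value 4
```

Agreement of the two numbers illustrates that the Jacobian factor, here equal to 1, is exactly what makes the two integrals match.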

EXAMPLE 7  The polar coordinate transformation

\[ \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} u\cos v \\ u\sin v \end{pmatrix} \]

goes between the regions shown in Figure 7.26. The Jacobian is

\[ \det T' = \begin{vmatrix} \cos v & -u\sin v \\ \sin v & u\cos v \end{vmatrix} = u. \]

The transformation is one-to-one between R and T(R). We can see this geometrically,
because of the interpretation of v and u as angle and radius, respectively, or directly
from the relations

\[ u = \sqrt{x^2 + y^2}, \qquad \cos v = \frac{x}{\sqrt{x^2 + y^2}}, \]

together with the fact that cos v is one-to-one for $0 \le v \le \pi/2$. Given the integral
of $x^2 + y^2$ over T(R), we can transform as follows:

\[ \int_{T(R)} (x^2 + y^2)\,dA = \int_R u^2 \cdot u\,dA = \int_1^2 u^3\,du \int_0^{\pi/2} dv = \frac{15\pi}{8}. \]

FIGURE 7.26 (a), (b)

EXAMPLE 8  Let B be the positive octant in 3-dimensional space $\mathbb{R}^3$ defined by the inequalities

\[ x^2 + y^2 + z^2 \le 1, \quad x \ge 0, \quad y \ge 0, \quad z \ge 0. \]

To transform the integral $\int_B (x^2 + y^2)\,dx\,dy\,dz$, we can define T by

\[ \begin{pmatrix} x \\ y \\ z \end{pmatrix} = T\begin{pmatrix} u \\ v \\ w \end{pmatrix} = \begin{pmatrix} u\sin v\cos w \\ u\sin v\sin w \\ u\cos v \end{pmatrix}. \]

Restricting (u, v, w) to the rectangle R in $\mathcal{U}^3$ defined by

\[ 0 \le u \le 1, \quad 0 \le v \le \frac{\pi}{2}, \quad 0 \le w \le \frac{\pi}{2}, \]

we get T(R) = B. The corresponding regions are shown in Figure 7.27. Since

\[ u = \sqrt{x^2 + y^2 + z^2}, \qquad \cos v = \frac{z}{\sqrt{x^2 + y^2 + z^2}}, \qquad \cos w = \frac{x}{\sqrt{x^2 + y^2}}, \]

we conclude that the transformation T is one-to-one from R to B except on the
boundary planes u = 0 and v = 0. The Jacobian determinant is

\[ \det T' = \begin{vmatrix} \sin v\cos w & u\cos v\cos w & -u\sin v\sin w \\ \sin v\sin w & u\cos v\sin w & u\sin v\cos w \\ \cos v & -u\sin v & 0 \end{vmatrix} = u^2\sin v. \]

FIGURE 7.27

The transformed integral is

\begin{align*}
\int_B (x^2 + y^2)\,dx\,dy\,dz &= \int_R (u^2\sin^2 v\cos^2 w + u^2\sin^2 v\sin^2 w)\,u^2\sin v\,du\,dv\,dw \\
  &= \int_0^1 u^4\,du \int_0^{\pi/2} \sin^3 v\,dv \int_0^{\pi/2} dw \\
  &= \frac{1}{5}\cdot\frac{2}{3}\cdot\frac{\pi}{2} = \frac{\pi}{15}.
\end{align*}
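Because the transformed integrand $u^4 \sin^3 v$ separates into a product, the value $\pi/15$ can be checked with three one-variable integrals. This is a minimal sketch: `midpoint` is a hypothetical helper, and the w-integral is taken exactly as $\pi/2$ since its integrand is constant.

```python
import math

def midpoint(g, a, b, n=2000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

radial = midpoint(lambda u: u ** 4, 0.0, 1.0)                    # about 1/5
polar = midpoint(lambda v: math.sin(v) ** 3, 0.0, math.pi / 2)   # about 2/3
azimuth = math.pi / 2                                            # integral of dw

value = radial * polar * azimuth
print(value, math.pi / 15)
```

The product of the three factors agrees with $\pi/15$ to within the accuracy of the midpoint rule.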

EXERCISES

In the definite integrals 1 and 2, make the indicated change of variable together with the appropriate change in the limits of integration. Then compute the resulting integral.

1. Let $x = \sqrt{u}$ in $\int_0^2 x e^{\cdots}\,dx$.

2. Let $x = \sin\theta$ in $\int_0^1 \cdots\,dx$.

3. Let $B$ be the region in $\mathbb{R}^2$ described by the inequalities $0 \le x$, and $x^2 + y^2 \le 4$.
(a) Sketch the region $B$ and describe it by using polar coordinates.
(b) Use the polar coordinates and Equation 4.1 to evaluate the double integral $\int_B \sqrt{x^2 + y^2}\,dx\,dy$.
(c) Use polar coordinates to evaluate the double integral $\int_B x\,dx\,dy$.

4. (a) Let $R$ be the region in $\mathbb{R}^2$ bounded by the $x$-axis and the polar coordinate curve $r = 1 + \cos\theta$ for $0 \le \theta \le \pi$. Sketch $R$ and find its area.
(b) Compute the integral $\int_R (x^2 + y^2)\,dx\,dy$, where $R$ is the region in part (a).

5. Let $A$ be the annular region in $\mathbb{R}^2$ consisting of points $(x, y)$ that satisfy $1 \le x^2 + y^2 \le 4$. Suppose that $A$ is used as a pattern for a flat annular piece of plastic that has density at each point inversely proportional to the distance of the point from the center of the hole. What is the total mass of the piece of plastic if the density is 10 grams per unit area at the inner edge of the region?

6. Compute $\int_0^{\cdots} dx \int_0^{\cdots} (x^2 + y^2)^3\,dy$.

7. Compute $\int_D \cos(x^2 + y^2)\,dx\,dy$, where $D$ is the disk of radius $\sqrt{\pi/2}$ centered at $(0, 0)$.

8. Compute the area bounded by the polar coordinate curves $\theta = 0$, $\theta = \pi/4$, and $r = \theta^2$.

9. Find the area bounded by the lemniscate $(x^2 + y^2)^2 = 2a^2(x^2 - y^2)$ by changing to polar coordinates.

10. Let $B$ be the region in $\mathbb{R}^3$ described by the inequalities $0 \le x$, $0 \le y$, $0 \le z$, and $x^2 + y^2 + z^2 \le 4$.
(a) Sketch the region $B$, and describe it by using spherical coordinates.
(b) Use spherical coordinates and Equation 4.2 to evaluate the triple integral $\int_B \sqrt{x^2 + y^2 + z^2}\,dx\,dy\,dz$.
(c) Use spherical coordinates to evaluate the triple integral $\int_B z\,dx\,dy\,dz$.

Use spherical coordinates to compute the triple integrals 11 and 12.

11. $\int_B \sqrt{x^2 + y^2}\,dx\,dy\,dz$, where $B$ is the solid ball of radius 1 centered at the origin in $\mathbb{R}^3$.

12. $\int_C z^2\,dx\,dy\,dz$, where $C$ is the region in $\mathbb{R}^3$ described by $1 \le x^2 + y^2 + z^2 \le 4$.

13. Find the total mass of a solid ball of radius $a$ with density at each point equal to the distance from the point to the center of the ball.

Use cylindrical coordinates to compute the integrals 14 and 15.

14. $\int_B z\,dx\,dy\,dz$, where $B$ satisfies $1 \le z \le 2$, $x^2 + y^2 \le 1$.

15. $\int_C (x^2 + y^2)\,dx\,dy\,dz$, where $C$ is the region in $\mathbb{R}^3$ described by $0 \le x$, $0 \le y$, $0 \le z \le 1$, and $x^2 + y^2 \le 2$.

16. Prove the 1-dimensional change-of-variable formula

$$\int_{\phi(a)}^{\phi(b)} f(x)\,dx = \int_a^b f(\phi(u))\,\phi'(u)\,du,$$

under the assumptions that $f$ is continuous and $\phi$ is continuously differentiable. [Hint: Differentiate both sides with respect to $b$.]

For Exercises 17 to 20 use multiple integration to prove the geometric volume formulas.

17. $S$ is a sphere of radius $a$. Show $V(S) = \frac{4}{3}\pi a^3$.

18. $C$ is a cone of height $h$ and base radius $a$. Show $V(C) = \frac{1}{3}\pi a^2 h$.
19. $L$ is a right circular cylinder of height $k$ and radius $a$. Show $V(L) = \pi a^2 k$.

20. $R$ is a slice of thickness $k$ perpendicular to the axis of a right circular cone having maximum radius $b$ and minimum radius $a$. Show that its volume is $V(R) = \frac{\pi}{3}(a^2 + ab + b^2)k$. Explain how Exercises 18 and 19 are essentially special cases of this.

21. Consider the transformation $T$ defined by

$$\cdots$$

Let $R_{uv}$ be the region $1 \le u^2 + v^2 \le 4$, $u \ge 0$, $v \ge 0$.
(a) Sketch the image region $R_{xy} = T(R_{uv})$.
(b) Compute $\int_{R_{xy}} \frac{dx\,dy}{\sqrt{x^2 + y^2}}$.

22. Define a transformation from the $uv$-plane to the $xy$-plane by $x = u + v$, $y = u^2 - v$. Let $R_{uv}$ be the region bounded by (1) the $u$-axis, (2) the $v$-axis, and (3) the line $u + v = 2$.
(a) Find and sketch the image region $R_{xy}$.
(b) Compute the integral $\int_{R_{xy}} \frac{dx\,dy}{\sqrt{1 + 4x + 4y}}$.

23. Let a transformation of the $uv$-plane to the $xy$-plane be given by

$$x = u, \qquad y = v(1 + u^2),$$

and let $R_{uv}$ be the rectangular region given by $0 \le u \le 3$ and $0 \le v \le 2$.
(a) Find and sketch the image region $R_{xy}$.
(b) Find $\dfrac{\partial(x, y)}{\partial(u, v)}$.
(c) Transform $\int_{R_{xy}} x\,dx\,dy$ to an integral over $R_{uv}$ and compute either one of them.

24. Rotate a circular disk of radius $a$ about a line in the same plane at distance $c > a$ from the center of the circle. This generates a solid torus $B$. Find the volume of $B$ using the cylindrical shell approach to setting up an integral for $V(B)$, as in Equation 4.3. (A way to verify the correctness of the answer is to use the Pappus theorem of Section 5.)

25. If $0 < c < a$ the solid figure obtained by rotation in the previous exercise is no longer a torus but rather a kind of dimpled sphere. Make a sketch of such a solid for the case $c = 1$, $a = 2$, and find its volume.

26. A cylindrical hole of radius $\frac{1}{2}a$ is bored through the center of a sphere of radius $a$. Find the volume of the remaining solid.

27. A cylindrical hole of radius $b$ is to be bored through the center of a solid ball of radius $a > b$. If you want a proportion $k < 1$ of the volume of the whole sphere to remain as a ring, how should you choose $b$?

*28. A solid ball $B_a$ of radius $a$ is spherically homogeneous if its density is constant on every spherical shell with center at the center of $B_a$. The purpose of this exercise is to establish Newton's result that according to the inverse-square law the gravitational attraction of a spherically homogeneous ball acting at a point $\mathbf{p}$ is the same as it would be if all the mass of the ball were concentrated at its center. If $\mathbf{p}$ is inside the ball, the part of the ball at distance from the center greater than $|\mathbf{p}|$ is irrelevant. By definition, if $B_a$ is centered at the origin and has density $\mu(|\mathbf{x}|)$ at $\mathbf{x}$, the attracting force vector on a particle of mass 1 at $\mathbf{p}$ is given by $G$ times the 3-dimensional vector integral

$$\int_{B_a} \mu(|\mathbf{x}|)\,\frac{\mathbf{x} - \mathbf{p}}{|\mathbf{x} - \mathbf{p}|^3}\,dV_{\mathbf{x}} = -\frac{M_p}{|\mathbf{p}|^3}\,\mathbf{p},$$

where $M_p$ is the mass of the part of $B_a$ that lies within distance $|\mathbf{p}|$ of its center, and $G$ is the gravitational constant. Newton showed without using our techniques that the integral equals the expression on the right.
(a) Choose perpendicular $(x, y, z)$-axes with origin at the center of $B_a$ and positive $z$-axis passing through $\mathbf{p} = (0, 0, p)$. Show, without computing any integrals, that the $x$ and $y$ coordinates of the vector integral are zero and that the $z$ coordinate is given in spherical coordinates by

$$2\pi \int_0^a r^2\mu(r)\left[\int_0^\pi \frac{(r\cos\phi - p)\sin\phi}{(r^2 - 2pr\cos\phi + p^2)^{3/2}}\,d\phi\right] dr.$$

(b) Let $\cos\phi = u$ and integrate by parts to show that the inner integral in part (a) is

$$\int_{-1}^{1} (ru - p)(r^2 + p^2 - 2pru)^{-3/2}\,du = \begin{cases} -2/p^2, & p > r, \\ 0, & p < r. \end{cases}$$

(c) Show that the mass of $B_p$ is $4\pi \int_0^p r^2\mu(r)\,dr$, and then use the previous results to prove Newton's formula for the attracting force.
(d) Use the results of parts (a) and (b) to show that the attracting force of matter distributed in a spherically homogeneous way between two concentric spheres cancels out, and so paradoxically exerts zero gravitational attraction at points inside the inner sphere.

(e) Specialize the results of parts (a) and (b) to the case of a homogeneous ball with constant density $\mu$ to show that the gravitational attraction of the ball on a unit point-mass inside the ball and $r$ units from the center has magnitude $\frac{4}{3}\pi\mu G r$.

SECTION 5 CENTROIDS AND MOMENTS


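If positive masses $m_1, \dots, m_N$ are concentrated at the respective points $\mathbf{x}_1, \dots, \mathbf{x}_N$
in space, then the center of mass of the system is the point

$$\bar{\mathbf{x}} = \frac{1}{M}\sum_{k=1}^{N} m_k\mathbf{x}_k, \quad\text{where } M = \sum_{k=1}^{N} m_k. \tag{5.1}$$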

Thus the center of mass $\bar{\mathbf{x}}$ is a weighted average of the position vectors $\mathbf{x}_k$ in the
system. We'll see that $\bar{\mathbf{x}}$ is the unique point at which a physical system consisting
of masses $m_k$ at points $\mathbf{x}_k$ would "balance" under the influence of constant gravity
if the mutual distances $|\mathbf{x}_k - \mathbf{x}_j|$ are held fixed. The meaning of the term "balance"
is expressed by saying that the "moment" of the system about an arbitrary plane $P$
through $\bar{\mathbf{x}}$ is zero. To make these ideas precise, we define the moment
$M_P$ of the mass system about a plane $P$ to be the weighted algebraic sum of the
distances from the points to $P$. See Figure 7.28. If $\mathbf{n} \cdot (\mathbf{x} - \mathbf{x}_0) = 0$ is the equation
of a plane through $\mathbf{x}_0$, normalized so that $|\mathbf{n}| = 1$, then the distance from $\mathbf{x}_k$ to the
plane is $\mathbf{n} \cdot (\mathbf{x}_k - \mathbf{x}_0)$ if $\mathbf{x}_k$ is on the side of the plane toward which $\mathbf{n}$ points and is
minus that number if $\mathbf{x}_k$ is on the other side. (See Section 5 of Chapter 1.) A formula
suitable for computation of $M_P$ is then

$$M_P = \sum_{k=1}^{N} m_k\,\mathbf{n} \cdot (\mathbf{x}_k - \mathbf{x}_0).$$

The moment $M_P$ is independent of the point $\mathbf{x}_0$ on the plane $P$, since $M_P$ is just
a weighted sum of distances to $P$. However, the sign of $M_P$ does depend on an
arbitrary choice for the direction of the unit vector $\mathbf{n}$.

5.2 Theorem. Let $P$ be an arbitrary plane containing the center of mass $\bar{\mathbf{x}}$ of the
system of masses $m_1, m_2, \dots, m_N$ at the respective points $\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N$. Then the
moment $M_P$ of the system about $P$ is zero.

FIGURE 7.28 Point masses $m_k$ located at the points $\mathbf{x}_k$.
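Proof. To verify that the moment about a plane $P_0$ containing the center of mass
$\bar{\mathbf{x}}$ is 0, we replace the generic point $\mathbf{x}_0$ by $\bar{\mathbf{x}}$ and use the distributive law for the dot
product. We get

$$M_{P_0} = \sum_{k=1}^{N} m_k\,\mathbf{n} \cdot (\mathbf{x}_k - \bar{\mathbf{x}}) = \mathbf{n} \cdot \left[\sum_{k=1}^{N} m_k\mathbf{x}_k - \left(\sum_{k=1}^{N} m_k\right)\bar{\mathbf{x}}\right].$$

The vector in square brackets is $\mathbf{0}$ by the definition of $\bar{\mathbf{x}}$, so $M_{P_0} = 0$. ∎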
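Theorem 5.2 is easy to test numerically for a small system. The sketch below computes the center of mass from Equation 5.1 and then the moment about a plane through it; the function names, the sample masses and points, and the unit normal are illustrative assumptions, not data from the text.

```python
def center_of_mass(masses, points):
    """Equation 5.1: weighted average of the position vectors."""
    total_mass = sum(masses)
    dim = len(points[0])
    return tuple(
        sum(m * p[i] for m, p in zip(masses, points)) / total_mass
        for i in range(dim)
    )

def moment_about_plane(masses, points, n, x0):
    """M_P = sum of m_k * n . (x_k - x0), for a unit normal n and a point x0 on P."""
    return sum(
        m * sum(ni * (pi - qi) for ni, pi, qi in zip(n, p, x0))
        for m, p in zip(masses, points)
    )

# Hypothetical system of three point masses in R^3.
masses = [1.0, 2.0, 3.0]
points = [(1.0, 1.0, 0.0), (1.0, 0.0, 2.0), (0.0, 1.0, 1.0)]
xbar = center_of_mass(masses, points)

# Any unit normal gives a plane through xbar; the moment should vanish up to rounding.
n = (0.6, 0.8, 0.0)
print(xbar, moment_about_plane(masses, points, n, xbar))
```

Replacing `xbar` by a point off the center of mass in the last call gives a nonzero moment in general, which illustrates why the balance point is unique.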
For mass that is distributed according to a continuous, nonnegative density
$\mu(\mathbf{x}) \ge 0$ over a body $B$ in space, by analogy with Equation 5.1, we define the
center of mass of the distribution $\mu$ over $B$ to be the point

$$\bar{\mathbf{x}} = \frac{1}{M(B)}\int_B \mu(\mathbf{x})\,\mathbf{x}\,dV, \quad\text{if the total mass } M(B) = \int_B \mu(\mathbf{x})\,dV \text{ is positive.} \tag{5.3}$$

The term centroid is used for $\bar{\mathbf{x}}$ if the distribution $\mu(\mathbf{x})$ is uniformly equal to
1, in which case we're talking about a purely geometric property of $B$ rather than
a physical property associated with mass. From this perspective it's appropriate to
write $V(B)$ for volume instead of $M(B)$.
Just as with Equation 5.1 for discrete masses, the center of mass of a continuous
density $\mu(\mathbf{x})$ should be a vector that we picture as a point in space. This is just what
we get from Equation 5.3. The reason is that the vector-valued integral of a vector-valued
function is defined to be the vector whose coordinates are the integrals of the
individual coordinate functions $x$, $y$ and $z$ of the vector $\mathbf{x} = (x, y, z)$. In applying
Equation 5.3 in $\mathbb{R}^3$, we have

$$\mu(\mathbf{x})\,\mathbf{x} = \bigl(x\mu(x, y, z),\ y\mu(x, y, z),\ z\mu(x, y, z)\bigr).$$


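Since $M$ is a positive number, we can compute $\bar{\mathbf{x}} = (\bar{x}, \bar{y}, \bar{z})$ by

$$\bar{x} = \frac{1}{M}\int_B x\,\mu(x, y, z)\,dx\,dy\,dz, \qquad
\bar{y} = \frac{1}{M}\int_B y\,\mu(x, y, z)\,dx\,dy\,dz, \qquad
\bar{z} = \frac{1}{M}\int_B z\,\mu(x, y, z)\,dx\,dy\,dz. \tag{5.4}$$

FIGURE 7.29 (a) The hemisphere $H$ of Example 1; (b) the quarter-disk $Q$ of Example 2.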

EXAMPLE 1. To find the center of mass of a solid hemisphere $H$ with radius $a$ and constant density
$\mu$, we'll take as given the formula $\frac{4}{3}\pi a^3$ for the volume of a sphere. The total mass
of $H$ is then $M = \frac{1}{2}\left(\frac{4}{3}\pi a^3\right)\mu = 2\pi\mu a^3/3$. To compute $\bar{\mathbf{x}}$, we introduce coordinates
as shown in Figure 7.29(a). We note that $\bar{x} = \bar{y} = 0$, because $H$ is symmetric about
the $yz$-plane and the $xz$-plane. To find $\bar{z}$ we change to spherical coordinates, getting

$$\begin{aligned}
\int_H \mu z\,dx\,dy\,dz &= \mu \int_0^{2\pi} d\theta \int_0^{\pi/2} d\phi \int_0^a (r\cos\phi)\,r^2\sin\phi\,dr \\
&= \mu \int_0^{2\pi} d\theta \int_0^{\pi/2} \sin\phi\cos\phi\,d\phi \int_0^a r^3\,dr \\
&= \mu\,\bigl[\theta\bigr]_0^{2\pi}\,\bigl[\tfrac{1}{2}\sin^2\phi\bigr]_0^{\pi/2}\,\bigl[\tfrac{1}{4}r^4\bigr]_0^a \\
&= \mu(2\pi)\left(\tfrac{1}{2}\right)\left(\tfrac{1}{4}a^4\right) = \pi\mu a^4/4.
\end{aligned}$$

Hence $\bar{z} = \pi\mu a^4/(4M) = (\pi\mu a^4)/(8\pi\mu a^3/3) = 3a/8$. In other words, the center
of mass is $\frac{3}{8}$ of the way along the axis of symmetry of $H$, measured from the flat
surface of $H$.

In mechanical problems it's often convenient to idealize a flat piece of material,
treating it as a plane region $R$ that carries with it a density function $\mu(x, y)$ that may
be constant or may vary from point to point of $R$. Choosing the plane in which $R$ sits
to be the $xy$-plane and assuming $\mu(x, y)$ independent of $z$ forces the moment about
this plane to be zero, so we automatically get $\bar{z} = 0$. The remaining two coordinates
are then computed from

$$\bar{x} = \frac{1}{M}\int_R x\,\mu(x, y)\,dx\,dy, \qquad
\bar{y} = \frac{1}{M}\int_R y\,\mu(x, y)\,dx\,dy, \quad\text{where } M = \int_R \mu(x, y)\,dx\,dy. \tag{5.5}$$

EXAMPLE 2. Let $Q$ be a quarter-disk of radius $a$, weighted so that the density at a point is
equal to the square of the distance from the center of the full disk. We can start by
introducing rectangular coordinates, placing $Q$ in the first quadrant with vertex at
$(0, 0)$, as in Figure 7.29(b). Then $\mu(x, y) = x^2 + y^2$. Because $Q$ is symmetric about
the line $y = x$ it's apparent from Equations 5.5 that $\bar{x} = \bar{y}$, so we'll just compute $\bar{x}$.
Changing to polar coordinates is not required, but is helpful. We find

$$M = \int_Q \mu(x, y)\,dx\,dy = \int_0^{\pi/2} d\theta \int_0^a r^2\,r\,dr = \left(\frac{\pi}{2}\right)\left(\frac{a^4}{4}\right) = \frac{\pi a^4}{8}.$$

Similarly,

$$\int_Q x\,\mu(x, y)\,dx\,dy = \int_0^{\pi/2} d\theta \int_0^a (r\cos\theta)\,r^2\,r\,dr
= \int_0^{\pi/2} \cos\theta\,d\theta \int_0^a r^4\,dr = (1)\left(\frac{a^5}{5}\right) = \frac{a^5}{5}.$$

Hence $\bar{x} = \bar{y} = (a^5/5)/(\pi a^4/8) = 8a/(5\pi)$. It follows that the center of mass is
$8\sqrt{2}/(5\pi) \approx 72\%$ of the way out from the vertex at the origin along the line of
symmetry of the quarter circle. This is reasonable; the weighting is heavier near the
circular edge of $Q$, so we expect a value more than $\frac{1}{2}$.
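The remark that polar coordinates are helpful but not required can be illustrated by redoing the computation on a rectangular grid. This is a rough sketch only: the function name `quarter_disk_xbar`, the grid size, and the treatment of the curved boundary (a cell counts if its midpoint lies in $Q$) are assumptions made for illustration.

```python
import math

def quarter_disk_xbar(a=1.0, n=400):
    """Estimate x-bar for the quarter-disk Q with density x^2 + y^2,
    using midpoints of an n-by-n rectangular grid on [0, a] x [0, a]."""
    h = a / n
    mass = 0.0
    moment_x = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            if x * x + y * y <= a * a:       # keep only cells whose midpoint is in Q
                mu = x * x + y * y           # density at the midpoint
                mass += mu * h * h
                moment_x += x * mu * h * h
    return moment_x / mass

print(quarter_disk_xbar(), 8 / (5 * math.pi))  # both should be close to 0.509
```

The grid estimate is only as accurate as the staircase approximation of the circular edge allows, which is one practical reason the polar form of the integral is preferable here.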

We define the moment $M_P$ of the mass density $\mu$ distributed over $B$ to be the
number

$$M_P = \int_B \mu(\mathbf{x})\,\mathbf{n} \cdot (\mathbf{x} - \mathbf{x}_0)\,dV,$$

where $\mathbf{n} \cdot (\mathbf{x} - \mathbf{x}_0) = 0$ is a normalized equation for the plane $P$. As in the case
of a discrete distribution, $M_P$ changes sign if $\mathbf{n}$ changes direction, and in many
applications there is a natural choice for this direction.

5.6 Theorem. The mass and moment of the union of disjoint regions $B_1$ and $B_2$
about a plane $P$ is the sum of their respective masses and moments about $P$:

$$M(B_1 \cup B_2) = M(B_1) + M(B_2) \quad\text{and}\quad M_P(B_1 \cup B_2) = M_P(B_1) + M_P(B_2).$$

Proof. Both equations are immediate consequences of Theorem 3.6 of Section 3,
which expresses additivity of the integral as a function of the domain of integration. ∎
Denoting the moments of $B$ about the $yz$-plane, the $xz$-plane and the $xy$-plane in
$\mathbb{R}^3$ by $M_{yz}$, $M_{xz}$ and $M_{xy}$ respectively, we summarize Equations 5.4 as

$$\bar{x} = M_{yz}/M, \qquad \bar{y} = M_{xz}/M, \qquad \bar{z} = M_{xy}/M, \quad\text{where } M \text{ is the mass of } B.$$

By Theorem 5.6, to compute the center of mass of the union of two disjoint bodies
$B_1$, $B_2$, we can write

$$\bar{x} = \frac{M_{yz}(B_1) + M_{yz}(B_2)}{M(B_1) + M(B_2)}, \qquad
\bar{y} = \frac{M_{xz}(B_1) + M_{xz}(B_2)}{M(B_1) + M(B_2)}, \qquad
\bar{z} = \frac{M_{xy}(B_1) + M_{xy}(B_2)}{M(B_1) + M(B_2)}.$$

These formulas extend by successive application to an arbitrary finite union of distinct
bodies.

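EXAMPLE 3. Consider the body $B$ consisting of a hemispherical region $H_a$ of radius $a$ with a
concentric hemisphere $H_b$ of radius $b < a$ removed. Assuming uniform density $\mu$,
the mass of $B$ is $M = \frac{2}{3}\pi\mu(a^3 - b^3)$. Choose coordinates in $\mathbb{R}^3$ so that the flat
base of $B$ rests in the $xy$-plane, with the axis of symmetry along the positive $z$-axis.
Then $\bar{x} = \bar{y} = 0$. To compute $\bar{z}$ we borrow the information from Example 1 that $H_a$
has moment about the $xy$-plane $M_{xy}(H_a) = \frac{1}{4}\pi\mu a^4$. Since moments can be added,
we find

$$M_{xy}(H_a) = M_{xy}(B) + M_{xy}(H_b).$$

Hence $M_{xy}(B) = \frac{1}{4}\pi\mu a^4 - \frac{1}{4}\pi\mu b^4 = \frac{1}{4}\pi\mu(a^4 - b^4)$. For the center of mass of $B$
we have

$$\bar{z} = \frac{M_{xy}(B)}{M} = \frac{\frac{1}{4}\pi\mu(a^4 - b^4)}{\frac{2}{3}\pi\mu(a^3 - b^3)} = \frac{3(a^4 - b^4)}{8(a^3 - b^3)}.$$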
Recall that the location of the centroid of a geometric object is a geometric
property of the object, though it coincides with center of mass if we think of area
or volume as a uniform mass distribution of density 1. The next theorem invokes
the idea of centroid and is an intuitively appealing consequence of our ability to
give two different geometric interpretations to the same mathematical object. In this
case we're rotating a plane region $R$ about a line $L$ in the same plane as $R$ but not
intersecting $R$.

5.7 Pappus's Theorem. The volume of a solid of revolution $B$ about a line $L$ is
equal to the area of the rotated region $R$ times the circumference $2\pi\bar{r}$ of the circle
traced by its centroid during rotation: $V(B) = 2\pi\bar{r}A(R)$.

Proof. According to the "cylindrical shell" analysis for volume of revolution discussed
in Section 4C,

$$V(B) = 2\pi \int_a^b r\,h(r)\,dr, \qquad 0 < a \le r \le b,$$

where $z = h(r)$ defines the height of the region $R$ measured parallel to $L$ and at
distance $r$ from the line. Since $\frac{1}{A(R)}\int_a^b r\,h(r)\,dr$ is the average of $r$ over $R$, dividing
both sides of this equation by $2\pi A(R)$ shows that
$V(B)/(2\pi A(R)) = \bar{r}$, the distance of the centroid of $R$ from $L$. Now just multiply
this last equation by $2\pi A(R)$ to get Pappus's formula. ∎

Pappus's theorem is particularly simple to apply when we know on geometric
grounds where the centroid of $R$ is located relative to the axis of rotation, as in the next
example.

EXAMPLE 4. Rotate an $a$-by-$b$ rectangle $R$ about a line $L$ in the same plane and parallel to an
edge of length $a$ and lying at distance $d$ from the nearest edge. Figure 7.23(b) in the
previous Section 4C shows the solid for the case $a = b$. The resulting solid $B$ is a
ring with sharp corners. Since the centroid is $d + b/2$ units from $L$, the circle it traces
has circumference $2\pi(d + b/2) = \pi(2d + b)$. Since $R$ is a rectangle, $A(R) = ab$, so
$V(B) = \pi(2d + b)ab$.
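The ring of Example 4 gives a convenient check that Pappus's formula agrees with the cylindrical-shell integral used in the proof. The sketch below assumes the shell height is the constant $h(r) = a$ for $d \le r \le d + b$; the function names and sample dimensions are illustrative only.

```python
import math

def ring_volume_shells(a, b, d, n=1000):
    """V(B) = 2*pi * integral of r*h(r) dr over d <= r <= d + b, with h(r) = a,
    approximated by a midpoint sum with n shells."""
    dr = b / n
    return 2 * math.pi * sum((d + (k + 0.5) * dr) * a * dr for k in range(n))

def ring_volume_pappus(a, b, d):
    """V(B) = (circumference traced by the centroid) * (area of the rectangle)."""
    centroid_distance = d + b / 2
    return 2 * math.pi * centroid_distance * (a * b)

# Hypothetical dimensions: a = 2, b = 3, d = 1.
print(ring_volume_shells(2.0, 3.0, 1.0), ring_volume_pappus(2.0, 3.0, 1.0))
```

Because the shell integrand $r\,a$ is linear in $r$, the midpoint sum is exact up to rounding, so the two printed values should agree to many digits.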
EXERCISES

In Exercises 1 to 4, find the center of mass $\bar{\mathbf{x}}$ of each of the following discrete mass distributions. Sketch the given points and the center of mass $\bar{\mathbf{x}}$.

1. In $\mathbb{R}$, $m_1 = 1$ at $x_1 = 1$, $m_2 = 3$ at $x_2 = 2$, $m_3 = 2$ at $x_3 = -4$.

2. In $\mathbb{R}^2$, $m_1 = 1$ at $\mathbf{x}_1 = (1, 1)$, $m_2 = 2$ at $\mathbf{x}_2 = (1, 0)$, $m_3 = 3$ at $\mathbf{x}_3 = (0, 1)$.

3. In $\mathbb{R}^2$, $m_1 = 1$ at $\mathbf{x}_1 = (1, -1)$, $m_2 = 2$ at $\mathbf{x}_2 = (1, 2)$, $m_3 = 3$ at $\mathbf{x}_3 = (-1, 1)$.

4. In $\mathbb{R}^3$, $m_1 = \frac{1}{2}$ at $\mathbf{x}_1 = (1, -1, 2)$, $m_2 = \frac{1}{4}$ at $\mathbf{x}_2 = (0, 1, 2)$, $m_3 = \cdots$ at $\mathbf{x}_3 = (1, 1, 1)$.

In Exercises 5 to 8, find the center of mass $\bar{\mathbf{x}}$ of each of the following continuous distributions with density $\mu$ defined on the set described. Sketch the set and show the location of the center of mass $\bar{\mathbf{x}}$.

5. Let $I$ be the interval $0 \le x \le 2$ in $\mathbb{R}$, and let $\mu(x) = 1 - \frac{1}{2}x$.

6. Let $D$ be the disk of radius 1 centered at the origin in $\mathbb{R}^2$, and let $\mu(x, y) = |x + y|$.

7. Let $Q$ be the quarter-disk of radius 1 in the first quadrant with edges on the axes in $\mathbb{R}^2$, and let $\mu(x, y) = x + y$.

8. Let $C$ be the unit cube in $\mathbb{R}^3$ defined by $0 \le x \le 1$, $0 \le y \le 1$, $0 \le z \le 1$, and let $\mu(x, y, z) = xyz$.

9. Find the centroid of the region $R$ in the first quadrant of $\mathbb{R}^2$ bounded by the graph of $y = x^2$, the line $y = 4$ and the $y$-axis.
10. Find the centroid of a solid right circular cone of base radius $a$, and height $b$ from base to tip.

11. Show that the center of mass of a homogeneous solid ball is at the center of the ball.

12. Use the result of Example 3 of the text to find the center of mass of a hemispherical surface of radius $a$ and constant surface density $\mu$. (The result will be corroborated later with an approach tailored to more general surfaces.)

13. (a) Find the centroid of the part of the annulus with radii $a > b$ and center $(0, 0)$ that lies in the first quadrant.
(b) Use part (a) to locate the centroid of a quarter circle of radius $a$.

14. A square region $R$ of side length $a$ is rotated about a line parallel to a diagonal and containing a vertex not on that diagonal. Find the volume of the solid generated this way.

15. Use Pappus's theorem and the volume $\frac{4}{3}\pi a^3$ of a ball of radius $a$ to find the centroid of a plane semicircular region.

16. A right circular cone of height $h$ and base area $A$ has volume $V = \frac{1}{3}hA$. Use Pappus's theorem to find the centroid of a right triangle with perpendicular side lengths $a$ and $b$.

17. Let $B$ be a set in $\mathbb{R}^n$ and $\mu(\mathbf{x})$ the density at $\mathbf{x}$ in $B$. If $\mathbf{x}_0$ is a fixed point in $\mathbb{R}^n$ and $\mathbf{n} \cdot (\mathbf{x} - \mathbf{x}_0) = 0$ is the normalized equation of a plane $P$ through $\mathbf{x}_0$, we've defined the moment of $B$ about $P$ by

$$M_P(B) = \int_B \mu(\mathbf{x})\,\mathbf{n} \cdot (\mathbf{x} - \mathbf{x}_0)\,dV.$$

(a) Show that $M_P$ is independent of $\mathbf{x}_0$ as long as $\mathbf{x}_0$ is on the same plane $P$. [Hint: If $\mathbf{x}_1$ is another point on $P$, then $\mathbf{n} \cdot (\mathbf{x}_1 - \mathbf{x}_0) = 0$.]
(b) Show that if $P$ passes through the center of mass $\bar{\mathbf{x}}$ of $B$, then $M_P = 0$, an extension of Theorem 5.2.

Exercises 18 to 22 refer to the moment of inertia $I(\mathbf{z}_0)$ of a mass density $\mu(\mathbf{x})$ over a set $R$ in $\mathbb{R}^2$ about $\mathbf{z}_0$, where $\mathbf{z}_0$ is a fixed point in $\mathbb{R}^2$; the number $I(\mathbf{z}_0)$ is

$$I(\mathbf{z}_0) = \int_R |\mathbf{x} - \mathbf{z}_0|^2\,\mu(\mathbf{x})\,dA.$$

18. Find the moment of inertia of a disk $R_a$ of radius $a$ about its center if (i) $R_a$ has constant density $\mu(\mathbf{x}) = 1$ and (ii) $R_a$ has density $\mu(\mathbf{x}) = |\mathbf{x}|^p$, $p > 0$.

19. Find the moment of inertia of a square $S$ of side $b$ about one corner if $S$ has uniform density $\mu$.

20. Find the moment of inertia of a square $S$ of side $b$ about its center if $S$ has uniform density $\mu$.

21. Show that the moment of inertia $I(\mathbf{z}_0)$ of $R$ as defined previously satisfies

$$I(\mathbf{z}_0) = M(R)\,|\mathbf{z}_0 - \bar{\mathbf{x}}|^2 + I(\bar{\mathbf{x}}),$$

where $\bar{\mathbf{x}}$ is the center of mass of the weighted region $R$. [Hint: In the definition of $I(\mathbf{z}_0)$, $|\mathbf{x} - \mathbf{z}_0|^2$ can be replaced by the dot product of $(\mathbf{x} - \bar{\mathbf{x}}) + (\bar{\mathbf{x}} - \mathbf{z}_0)$ with itself.]

22. Use the previous exercise to show that $I(\mathbf{z}_0)$ is minimized by taking $\mathbf{z}_0$ to be the center of mass $\bar{\mathbf{x}}$ of $R$.

23. Let $B_1, \dots, B_N$ be nonoverlapping regions with union $B$, having respective masses $M(B_k)$, $M(B)$ and centers of mass $\bar{\mathbf{x}}_k$, $\bar{\mathbf{x}}$.
(a) Prove that $\bar{\mathbf{x}}$ is given by a weighted sum of the $\bar{\mathbf{x}}_k$ as

$$\bar{\mathbf{x}} = \frac{1}{M(B)}\sum_{k=1}^{N} M(B_k)\,\bar{\mathbf{x}}_k.$$

This reduces finding $\bar{\mathbf{x}}$ to the case for point masses in Equation 5.1.
(b) Illustrate part (a) with the example in which $B_1$ and $B_2$ are the rectangles in $\mathbb{R}^2$ having the same uniform density 1 and respective corners at $(1, 3)$, $(3, 3)$, $(3, 2)$, $(1, 2)$ and $(3, -1)$, $(4, -1)$, $(4, -2)$, $(3, -2)$.

SECTION 6 IMPROPER INTEGRALS


The underlying definition of the Riemann integral as a limit of weighted sums of
function values requires the integrand to be a bounded function $f(\mathbf{x})$ that is defined on
a bounded domain $B$. The definition extends to some functions that are unbounded
or that aren't necessarily zero outside a bounded set. Such an integral is called
"improper" and is defined as a limit of "proper" integrals of bounded functions
defined on bounded sets $B_\delta$ of a family $\{B_\delta\}$ that expands to cover all of $B$. The
indices $\delta$ are chosen at our convenience to tend, either increasingly or decreasingly,
to some finite number or to infinity.

EXAMPLE 1. Let $f(x) = x^{-1/3}$ for $0 < x \le 1$. See Figure 7.30(a). The integral of $f$ over this
interval $I$ is not an ordinary Riemann integral because $f(x)$ tends to infinity as
$x \to 0$. To assign a value to $\int_0^1 f(x)\,dx$, we let $I_\delta$ be the interval $[\delta, 1]$ and first
compute

$$\int_{I_\delta} f(x)\,dx = \int_\delta^1 x^{-1/3}\,dx = \left[\tfrac{3}{2}x^{2/3}\right]_\delta^1 = \tfrac{3}{2}\left(1 - \delta^{2/3}\right).$$

Thus a value of the integral is determined by

$$\int_I f(x)\,dx = \lim_{\delta \to 0+} \int_{I_\delta} f(x)\,dx = \lim_{\delta \to 0+} \tfrac{3}{2}\left(1 - \delta^{2/3}\right) = \tfrac{3}{2}.$$

EXAMPLE 2. Let $g(x) = e^{-x}$ for $0 \le x$. See Figure 7.30(b). The integral of $g$ over this interval
$J$ is not an ordinary Riemann integral because the domain of the integrand $g(x)$ is
an unbounded interval. To assign a value to $\int_0^\infty g(x)\,dx$, we let $J_\delta$ be the interval
$[0, \delta]$ and first compute

$$\int_{J_\delta} g(x)\,dx = \int_0^\delta e^{-x}\,dx = \left[-e^{-x}\right]_0^\delta = 1 - e^{-\delta}.$$

Thus a value of the integral is determined by

$$\int_J g(x)\,dx = \lim_{\delta \to \infty} \int_{J_\delta} g(x)\,dx = \lim_{\delta \to \infty} \left(1 - e^{-\delta}\right) = 1.$$
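The two limits above can be watched numerically by tabulating the proper integrals for a few values of $\delta$. The sketch below simply evaluates the closed forms $\frac{3}{2}(1 - \delta^{2/3})$ and $1 - e^{-\delta}$ derived in the two examples; the function names and the sample values of $\delta$ are illustrative assumptions.

```python
import math

def proper_integral_power(delta):
    """Integral of x^(-1/3) over [delta, 1], from the closed form 3/2 (1 - delta^(2/3))."""
    return 1.5 * (1.0 - delta ** (2.0 / 3.0))

def proper_integral_exp(delta):
    """Integral of e^(-x) over [0, delta], from the closed form 1 - e^(-delta)."""
    return 1.0 - math.exp(-delta)

# delta shrinks toward 0 for the unbounded integrand ...
for delta in (1e-1, 1e-3, 1e-6, 1e-9):
    print(delta, proper_integral_power(delta))

# ... and grows toward infinity for the unbounded interval.
for delta in (1.0, 5.0, 20.0, 50.0):
    print(delta, proper_integral_exp(delta))
```

The first table approaches $\frac{3}{2}$ fairly slowly, since the error is $\frac{3}{2}\delta^{2/3}$, while the second approaches 1 exponentially fast.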

FIGURE 7.30 (a) Graph of $f(x) = x^{-1/3}$; (b) graph of $g(x) = e^{-x}$.

We say that $\int_B f(\mathbf{x})\,dV$ is defined as an improper integral if a limit

$$\int_B f(\mathbf{x})\,dV = \lim_{\delta} \int_{B_\delta} f(\mathbf{x})\,dV$$

is finite and independent of the family $\{B_\delta\}$ used
to define it. It's possible to show that if either $f(\mathbf{x}) \ge 0$ on $B$, or $f(\mathbf{x}) \le 0$ on
$B$, then the choice of expanding sets $B_\delta$ that cover all of $B$ doesn't affect the final
outcome; the limit value assigned to the improper integral will either be finite, in
which case we speak of convergence of the integral to that value, or else we have
divergence of the integral, perhaps to $+\infty$ or to $-\infty$.

EXAMPLE 3. Since $\ln z \le 0$ for $0 < z \le 1$, the function $f(x, y) = -\ln(xy)$ is non-negative on
the square $S$: $0 < x \le 1$, $0 < y \le 1$. See Figure 7.31(a). Noting that $f$ is bounded
on the square $S_\delta$ determined by $\delta \le x \le 1$, $\delta \le y \le 1$, we integrate $f$ over $S_\delta$ and
compute the limit as $\delta$ tends to 0.

$$\begin{aligned}
\int_{S_\delta} -\ln(xy)\,dx\,dy &= \int_{S_\delta} -(\ln x + \ln y)\,dx\,dy \\
&= -\int_\delta^1 dy \int_\delta^1 \ln x\,dx - \int_\delta^1 dx \int_\delta^1 \ln y\,dy \\
&= -2\int_\delta^1 dy \int_\delta^1 \ln x\,dx \\
&= -2(1 - \delta)\bigl[x\ln x - x\bigr]_\delta^1 = -2(1 - \delta)(-1 - \delta\ln\delta + \delta).
\end{aligned}$$

An elementary limit calculation, $\lim_{\delta \to 0+} \delta\ln\delta = 0$, shows that the improper integral
is convergent and we can write $\int_S -\ln(xy)\,dx\,dy = 2$.

EXAMPLE 4. The integral $\int_D 1/(x^2 + y^2)^p\,dx\,dy$, where $D$ is the disk $x^2 + y^2 \le 1$ in $\mathbb{R}^2$, is
improper if $p > 0$. To check for convergence, first compute an ordinary Riemann
integral over the annulus $D_\delta$ determined by $\delta^2 \le x^2 + y^2 \le 1$. See Figure 7.31(b).
This computation is best done by changing to polar coordinates. The case $p = 1$
turns out to be special, so we'll assume $p \ne 1$.

$$\begin{aligned}
\int_{D_\delta} \frac{1}{(x^2 + y^2)^p}\,dx\,dy &= \int_0^{2\pi} d\theta \int_\delta^1 r^{-2p}\,r\,dr = 2\pi \int_\delta^1 r^{1 - 2p}\,dr \\
&= 2\pi\left[(2 - 2p)^{-1} r^{2(1 - p)}\right]_\delta^1 = \frac{\pi}{1 - p}\left(1 - \delta^{2(1 - p)}\right).
\end{aligned}$$

Now let $\delta \to 0+$. If $0 < p < 1$ the limit of the integrals over $D_\delta$ is $\pi/(1 - p)$, so
the improper integral converges to that value. If $p > 1$ the integrals tend to $+\infty$, so
the improper integral diverges for $p > 1$. The case $p = 1$ is left as an exercise.

FIGURE 7.31 (a) The square $S$ of Example 3; (b) the annulus $D_\delta$ of Example 4.

EXAMPLE 5. The function $f(x, y) = 1/(x^2y^2)$, defined for $x \ge 1$ and $y \ge 1$, has the graph shown
in Figure 7.32. If $B$ is the set of points $(x, y)$ for which $x \ge 1$ and $y \ge 1$, it is natural
to define $\int_B f\,dA$ in such a way that it stands for the volume under the graph of
$f$. We can approximate this volume by computing the volume lying above bounded
subrectangles of $B$. To be specific, let $B_N$ be the rectangle with corners at $(1, 1)$ and
$(N, N)$ and with edges parallel to the edges of $B$. For $N > 1$ we have

$$\int_{B_N} f\,dA = \int_1^N dx \int_1^N \frac{1}{x^2y^2}\,dy = \left(1 - \frac{1}{N}\right)^2.$$

As $N$ tends to infinity, the rectangles $B_N$ eventually cover every point of $B$, and the
regions above the $B_N$ fill out the region under the graph of $f$. Then we define

$$\int_B f\,dA = \lim_{N \to \infty} \int_{B_N} f\,dA = 1.$$

FIGURE 7.32 Graph of $f(x, y) = 1/(x^2y^2)$ over the rectangle $B_N$.

Probability Densities. Integrals over unbounded regions occur often in statistics
and statistical mechanics. If $p(\mathbf{x}) \ge 0$ for $\mathbf{x}$ in some subset $S$ of $\mathbb{R}^n$, and the function
$p$ is normalized so that

$$\int_S p(\mathbf{x})\,dV = 1, \tag{6.1}$$

then $p(\mathbf{x})$ can be interpreted as the density of a statistical outcome. To be more
specific, suppose $E$ is an experiment with possible outcomes in $S$. Then the probability
that the outcome of the experiment lies in a subset $B$ of $S$ can sometimes be
expressed in the form

$$\Pr[E \text{ in } B] = \int_B p(\mathbf{x})\,dV,$$

for some density function $p(\mathbf{x})$. For example, the coordinates of a vector outcome $\mathbf{x}$
might be the results of measuring simultaneously several distinct properties of some
physical object. In analogy with the center of mass of a mass distribution, we define
the mean of a probability distribution as the vector

$$\mathbf{m}[p] = \int_S \mathbf{x}\,p(\mathbf{x})\,dV.$$

EXAMPLE 6. The symmetric normal probability density in $\mathbb{R}^2$ is the function defined in all of
$\mathbb{R}^2$ by

$$N(x, y) = \frac{1}{2\pi\sigma^2}\,e^{-(x^2 + y^2)/2\sigma^2}.$$

The number $\sigma^2$ is a positive constant. To verify that $N$ is a probability density,
we need to check the normalization condition 6.1. The integral in question is an
improper double integral, which we can evaluate using polar coordinates as follows:

$$\begin{aligned}
\int_{\mathbb{R}^2} N(x, y)\,dx\,dy &= \frac{1}{2\pi\sigma^2} \lim_{R \to \infty} \int_0^{2\pi} d\theta \int_0^R e^{-r^2/2\sigma^2}\,r\,dr \\
&= \frac{1}{\sigma^2} \lim_{R \to \infty} \left[-\sigma^2 e^{-r^2/2\sigma^2}\right]_0^R \\
&= \lim_{R \to \infty} \left[-e^{-R^2/2\sigma^2} + 1\right] = 1.
\end{aligned}$$

Thus the integral of $N$ has value 1 as we wanted to show and so is independent of
the constant $\sigma^2$, called the variance of the normal density in $\mathbb{R}^2$. The variance is a
measure of the dispersion away from the origin of the density. The mean of $N$ is
zero in both its coordinates, because the integrals

$$\int_{\mathbb{R}^2} x\,N(x, y)\,dx\,dy \quad\text{and}\quad \int_{\mathbb{R}^2} y\,N(x, y)\,dx\,dy$$

are both zero. For example,

$$\int_{\mathbb{R}^2} x\,e^{-(x^2 + y^2)/2\sigma^2}\,dx\,dy = \int_{-\infty}^{\infty} x\,e^{-x^2/2\sigma^2}\,dx \int_{-\infty}^{\infty} e^{-y^2/2\sigma^2}\,dy.$$

The first integral on the right is zero because the integrand is an odd function.

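EXAMPLE 7. The function

$$p(x, y) = e^{-x-y}, \qquad x \ge 0,\ y \ge 0,$$

defines a probability density in the first quadrant $Q$ of $\mathbb{R}^2$. All we have to check is
that the integral of $p$ over $Q$ is equal to 1. We compute

$$\begin{aligned}
\int_Q e^{-x-y}\,dx\,dy &= \int_0^\infty e^{-x}\,dx \int_0^\infty e^{-y}\,dy \\
&= \left(\lim_{N \to \infty} \left[-e^{-x}\right]_0^N\right)^2 \\
&= \left(\lim_{N \to \infty} \left(-e^{-N} + 1\right)\right)^2 = 1.
\end{aligned}$$

The probability that the outcome of the experiment $E$ is in some rectangle $R$: $a \le x \le b$,
$c \le y \le d$, where $a$ and $c$ are nonnegative, is

$$\begin{aligned}
\Pr[E \text{ in } R] &= \int_R e^{-x-y}\,dx\,dy \\
&= \left[-e^{-x}\right]_a^b \left[-e^{-y}\right]_c^d \\
&= (e^{-a} - e^{-b})(e^{-c} - e^{-d}).
\end{aligned}$$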
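The rectangle formula above translates directly into a small function, which can then be compared with a brute-force midpoint sum of the density. This is a sketch only; the names `prob_rectangle` and `prob_rectangle_midpoint`, and the sample rectangle, are illustrative assumptions.

```python
import math

def prob_rectangle(a, b, c, d):
    """Pr[E in R] for the density e^(-x-y) on a <= x <= b, c <= y <= d (a, c >= 0)."""
    return (math.exp(-a) - math.exp(-b)) * (math.exp(-c) - math.exp(-d))

def prob_rectangle_midpoint(a, b, c, d, n=200):
    """The same probability estimated by a midpoint sum of the density over R."""
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for j in range(n):
        x = a + (j + 0.5) * hx
        for k in range(n):
            y = c + (k + 0.5) * hy
            total += math.exp(-x - y)
    return total * hx * hy

print(prob_rectangle(0.0, 1.0, 0.0, 1.0), prob_rectangle_midpoint(0.0, 1.0, 0.0, 1.0))
print(prob_rectangle(0.0, math.inf, 0.0, math.inf))  # whole quadrant Q
```

Passing `math.inf` for `b` and `d` recovers the normalization check over all of $Q$, since `math.exp(-math.inf)` evaluates to `0.0`.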

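EXAMPLE 8. The squaring trick used in the previous example, in combination with a switch
to polar coordinates, allows us to show that the integral of $e^{-x^2}$ between $-\infty$ and
$\infty$ equals $\sqrt{\pi}$ as follows:

$$\begin{aligned}
\left(\int_{-\infty}^{\infty} e^{-x^2}\,dx\right)^2 &= \int_{\mathbb{R}^2} e^{-x^2 - y^2}\,dx\,dy \\
&= \int_0^{2\pi} d\theta \int_0^\infty e^{-r^2}\,r\,dr \\
&= 2\pi\left[-\tfrac{1}{2}e^{-r^2}\right]_0^\infty = (2\pi)\left(\tfrac{1}{2}\right) = \pi.
\end{aligned}$$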
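A direct numerical estimate of $\int_{-\infty}^{\infty} e^{-x^2}\,dx$ gives an independent check on the value $\sqrt{\pi}$. The sketch below truncates the integral to $[-L, L]$ and uses a midpoint sum; the cutoff `L = 8` and the number of subintervals are assumptions chosen so that the neglected tails are far below rounding error.

```python
import math

def gaussian_integral(L=8.0, n=2000):
    """Midpoint-sum estimate of the integral of e^(-x^2) over [-L, L]."""
    h = 2 * L / n
    return h * sum(math.exp(-(-L + (k + 0.5) * h) ** 2) for k in range(n))

print(gaussian_integral(), math.sqrt(math.pi))  # the two values should agree closely
```

This is an instance of the improper-integral definition in action: the interval $[-L, L]$ plays the role of the expanding set, and the estimate stabilizes once $L$ is moderately large.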

EXERCISES

In Exercises 1 to 9, determine which of the following improper integrals have finite values, and for those that do, compute the value.

1. $\int^{\infty} (1 + x)^{-4}\,dx$.

2. $\int_0^\infty x^{-3}\,dx$.

3. $\int_R \frac{x}{y^2}\,dx\,dy$, where $R$ is the infinite rectangle $0 \le x \le 1$, $2 \le y$ in $\mathbb{R}^2$.

4. $\int_R \frac{\cdots}{y}\,dx\,dy$, where $R$ is the same as in Exercise 3.

5. $\int_C e^{-x-y-z}\,dx\,dy\,dz$, where $C$ is the infinite rectangle $0 \le x \le 1$, $0 \le y \le 1$, $0 \le z$ in $\mathbb{R}^3$.

6. $\int_C \frac{1}{\cdots}\,dx\,dy\,dz$, where $C$ is the same as in Exercise 5.

7. $\int_D \ln(x^2 + y^2)\,dx\,dy$, where $D$ is the disk $x^2 + y^2 \le 1$ in $\mathbb{R}^2$. [Hint: Let $D_\delta$ be the annulus $\delta^2 \le x^2 + y^2 \le 1$, and change to polar coordinates.]

8. $\int_D (x^2 + y^2)^{-1}\,dx\,dy$, where $D$ is the disk $x^2 + y^2 \le 1$ in $\mathbb{R}^2$.

9. $\int_Q \frac{dx\,dy}{\sqrt{x^2 + y^2}}$, where $Q$ is the quarter-disk $x \ge 0$, $y \ge 0$, $x^2 + y^2 \le 1$ in $\mathbb{R}^2$. [Hint: Use polar coordinates as in Exercise 7.]

10. For which values of $\alpha$ does $\int_0^1 x^\alpha\,dx$ have a finite value as an improper or ordinary Riemann integral?

11. For which values of $\alpha$ does $\int_1^\infty x^\alpha\,dx$ have a finite value as an improper or as an ordinary Riemann integral?

12. (a) Show that $p(x) = e^{-x}$ is a probability density on the interval $0 \le x < \infty$ in $\mathbb{R}$.
(b) If $E$ is an experiment with probability density $p$, as given in part (a), find the probability that the outcome of $E$ lies between $a$ and $b$, where $0 \le a < b$.

13. (a) For what constant $k$ is the function

$$p(x, y) = k(1 - x^2 - y^2), \qquad x^2 + y^2 \le 1,$$

a probability density on a disk of radius 1?
(b) If the outcomes of an experiment $E$ are distributed according to the density of part (a), find the probability that $E$ has an $x$-coordinate bigger than $\frac{1}{2}$.
(c) What is the mean of $p$?

14. If an experiment $E$ has as its probability density the symmetric normal density in $\mathbb{R}^2$ with constant $\sigma^2$, find the probability that the coordinates of the outcome are both positive.

15. (a) Show that

$$\frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-x^2/2\sigma^2}\,dx = 1$$

by using the result of Example 6 in the text that

$$\frac{1}{2\pi\sigma^2} \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} e^{-(x^2 + y^2)/2\sigma^2}\,dy = 1.$$
(b) Deduce from part (a) that

$$N_m(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x - m)^2/2\sigma^2}$$

is a probability density on the interval $-\infty < x < \infty$.
(c) Show that the mean of $N_m$ is

$$\int_{-\infty}^{\infty} x\,N_m(x)\,dx = m.$$

(d) Show that

$$\int_{-\infty}^{\infty} (x - m)^2\,N_m(x)\,dx = \sigma^2.$$

The number $\sigma^2$ is the variance of $N_m$.

16. (a) Show that the vector mean $\mathbf{m}[N_{m,n}]$ of

$$N_{m,n}(x, y) = \frac{1}{2\pi\sigma^2}\,e^{-[(x - m)^2 + (y - n)^2]/2\sigma^2}$$

is $\mathbf{m}[N_{m,n}] = (m, n)$.
(b) The variance of a density $p(\mathbf{x})$ on $\mathbb{R}^2$ with vector mean $\mathbf{m} = (m, n)$ is defined by

$$\int_{\mathbb{R}^2} |\mathbf{x} - \mathbf{m}|^2\,p(\mathbf{x})\,dV.$$

Compute the variance of the density $N_{m,n}$ of part (a).

17. The Maxwell distribution for a gas molecule of mass $m$ at temperature $T$ assigns a probability that the molecule's velocity vector at a given instant lies in a region $B$ of the 3-dimensional space of possible velocity vectors $\mathbf{v} = (v_1, v_2, v_3)$. A fundamental assumption is that for a given speed $v = |\mathbf{v}|$, all possible directions in the velocity space are equally likely. In other words, we assume that the probability density for $\mathbf{v}$ is spherically symmetric, so it is appropriate to restrict attention to events dependent entirely on statements about the speed $v$. We write the corresponding probabilities in terms of spherical coordinates with radial variable $v$ in the form

$$\Pr[a \le v \le b] = \int_0^\pi \sin\phi\,d\phi \int_0^{2\pi} d\theta \int_a^b f(v)\,v^2\,dv = 4\pi \int_a^b f(v)\,v^2\,dv.$$

The function $f(v)$ has been determined on theoretical and experimental grounds to be

$$f(v) = \left(\frac{m}{2\pi kT}\right)^{3/2} e^{-mv^2/(2kT)}.$$

The Boltzmann constant $k$ is a factor relating the mean kinetic energy of the molecule at temperature $T$, as stated in part (c).
(a) Calculate the integral that verifies $\Pr[0 \le v < \infty] = 1$.
(b) Show that the mean speed $m[4\pi v^2 f(v)]$ is $\sqrt{8kT/(\pi m)}$.

SECTION 7 NUMERICAL INTEGRATION

A double integral $\int_R f(x, y)\,dx\,dy$ on a rectangle R determined by $a \le x \le b$, $c \le y \le d$ is an iterated integral in either of two orders:

$$\int_R f(x, y)\,dx\,dy = \int_c^d\left[\int_a^b f(x, y)\,dx\right]dy = \int_a^b\left[\int_c^d f(x, y)\,dy\right]dx.$$

If evaluating either of these two integrals is possible by finding a succession of two
indefinite integrals, then that is probably the best method. (For example, that would
give us an answer for lots of values of a, b, c, d.) If we can't find the required
indefinite integrals, it may still be possible to evaluate the integral for particular
choices of limits, perhaps by some clever change of variable. [See Exercise 4(a).]
When all else fails, numerical approximations are available, though only for specific
numerical choices of the limits. We'll consider two such approximation methods.
360 Chapter 7 Multiple Integration

7A Midpoint Approximations
For a 1-dimensional integral $\int_a^b f(x)\,dx$ we partition the interval $a \le x \le b$ into
p equal subintervals with endpoints $x_j = a + j(b - a)/p$, $j = 0, \ldots, p$. The
midpoint approximation is

$$\int_a^b f(x)\,dx \approx \frac{b-a}{p}\sum_{j=0}^{p-1} f\bigl(a + (j + \tfrac12)(b - a)/p\bigr),$$

which evaluates f (x) at the midpoint of each of the p subintervals.


For a double integral over a rectangle R we impose a grid on R with intersection
points at

$$(x_j, y_k) = \bigl(a + j(b - a)/p,\; c + k(d - c)/q\bigr);$$

here p and q are the respective numbers of lines in the grid in the x and y directions,
while j runs from 0 to p and k runs from 0 to q. Since the dimensions of each
rectangle are $(b - a)/p$ by $(d - c)/q$, the midpoint of the grid rectangle with lower
left corner $(x_j, y_k)$ is at

$$(\bar{x}_j, \bar{y}_k) = \bigl(x_j + \tfrac12(b - a)/p,\; y_k + \tfrac12(d - c)/q\bigr) = \bigl(a + (j + \tfrac12)(b - a)/p,\; c + (k + \tfrac12)(d - c)/q\bigr).$$

The midpoint approximation to the value of the integral is

$$\int_R f(x, y)\,dx\,dy \approx \frac{(b - a)(d - c)}{pq}\sum_{j=0}^{p-1}\sum_{k=0}^{q-1} f(\bar{x}_j, \bar{y}_k).$$

The approximate value we get this way is just the value of a Riemann sum of the
type used to define the integral. In particular, if f is a positive function, the
approximate value is a sum of volumes of vertical boxes as illustrated in Figure 7.14(b) of
Section 2. The routine to implement the midpoint approximation method is a double
loop of the form

SET s = 0
FOR j = 0 TO p - 1
FOR k = 0 TO q - 1
LET s = s + f(a + (j + .5)*(b - a)/p, c + (k + .5)*(d - c)/q)
NEXT k
NEXT j
PRINT s*(b - a)*(d - c)/(p*q)

The rest of the routine consists of a definition for f, and an assignment of values to
the limits a, b, c, d.
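
As an illustration only, the double loop above might be transcribed into Python as follows. The function name `midpoint_double` and the choice of p = q = 100 are assumptions made for this sketch, not part of the text; the test integrand is the one suggested in Exercise 1 of this section, whose exact integral over the unit square is 1/3 + 1/5 = 8/15.

```python
def midpoint_double(f, a, b, c, d, p, q):
    """Midpoint approximation to the integral of f over a <= x <= b, c <= y <= d.

    p and q are the numbers of subdivisions in the x and y directions.
    """
    s = 0.0
    for j in range(p):                      # j = 0, ..., p - 1
        x = a + (j + 0.5) * (b - a) / p     # midpoint in the x direction
        for k in range(q):                  # k = 0, ..., q - 1
            y = c + (k + 0.5) * (d - c) / q # midpoint in the y direction
            s += f(x, y)
    # Each grid rectangle has area (b - a)(d - c)/(pq).
    return s * (b - a) * (d - c) / (p * q)


# Hypothetical test run: f(x, y) = x^2 + y^4 on the unit square.
approx = midpoint_double(lambda x, y: x**2 + y**4, 0.0, 1.0, 0.0, 1.0, 100, 100)
print(approx, 8 / 15)
```

Increasing p and q and watching the printed digits settle is one informal way to judge how many subdivisions a given accuracy requires.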
To integrate over a region D in $\mathbb{R}^2$ that's more complicated than a rectangle,
simply enclose D in a rectangle R. Then define f by its given values for (x, y)
in D and define f(x, y) = 0 for (x, y) outside D. The Heaviside function H of
Chapter 4, Section 2C is helpful here. Alternatively, it may be simpler first to make
a change of variable in the integral that results in integration over a rectangle.
The analogous formula for the midpoint approximation for a function g(x, y, z)
integrated over a rectangular region R with extreme corners at (a, c, e) and (b, d, f) is

$$\int_R g(x, y, z)\,dx\,dy\,dz \approx \frac{(b - a)(d - c)(f - e)}{pqr}\sum_{j=0}^{p-1}\sum_{k=0}^{q-1}\sum_{l=0}^{r-1} g(\bar{x}_j, \bar{y}_k, \bar{z}_l),$$

where

$$(\bar{x}_j, \bar{y}_k, \bar{z}_l) = \bigl(x_j + \tfrac12(b - a)/p,\; y_k + \tfrac12(d - c)/q,\; z_l + \tfrac12(f - e)/r\bigr) = \bigl(a + (j + \tfrac12)(b - a)/p,\; c + (k + \tfrac12)(d - c)/q,\; e + (l + \tfrac12)(f - e)/r\bigr).$$
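
A minimal Python sketch of the three-variable midpoint formula is given below, assuming the box has corners (a, c, e) and (b, d, f) as above. The name `midpoint_triple` is hypothetical. The sample integrand x + y + z on the unit cube is chosen because the midpoint rule reproduces integrals of linear functions exactly (up to rounding), so the result should agree with the exact value 3/2.

```python
def midpoint_triple(g, a, b, c, d, e, f, p, q, r):
    """Midpoint approximation to the integral of g over the box
    a <= x <= b, c <= y <= d, e <= z <= f with p, q, r subdivisions."""
    total = 0.0
    for j in range(p):
        x = a + (j + 0.5) * (b - a) / p
        for k in range(q):
            y = c + (k + 0.5) * (d - c) / q
            for l in range(r):
                z = e + (l + 0.5) * (f - e) / r
                total += g(x, y, z)
    # Each small box has volume (b - a)(d - c)(f - e)/(pqr).
    return total * (b - a) * (d - c) * (f - e) / (p * q * r)


# Linear integrand on the unit cube; the exact integral is 3/2.
print(midpoint_triple(lambda x, y, z: x + y + z, 0, 1, 0, 1, 0, 1, 4, 4, 4))
```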

7B Simpson Approximations
If the integrand f in a multiple integral is a fairly smooth function we can take
advantage of its smoothness by repeated use of the 1-dimensional Simpson approximation
over an even number p of intervals:

$$\int_a^b f(x)\,dx \approx \frac{b-a}{3p}\bigl(f(x_0) + 4f(x_1) + 2f(x_2) + \cdots + 4f(x_{p-1}) + f(x_p)\bigr),$$

where $x_j = a + j(b - a)/p$. The pattern for the coefficients $s_j^{(p)}$ is such that the
first and the last coefficients are $s_0^{(p)} = s_p^{(p)} = 1$, while the intermediate values are
given by the formula $s_j^{(p)} = 3 - (-1)^j$, in other words, alternating 4's and 2's,
beginning and ending with 4. The requirement for the even number of subdivision
intervals comes from the geometry underlying the method; with just two intervals,
the Simpson approximation is precisely the integral over $a \le x \le b$ of the unique
quadratic polynomial that interpolates the values of f at a, $\tfrac12(a + b)$ and b.
To apply the Simpson formula to a double integral we use a two-stage Simpson
approximation to an iterated integral

$$\int_c^d\left[\int_a^b f(x, y)\,dx\right]dy.$$

Thinking of y as held fixed for the moment, start with

$$F(y) = \int_a^b f(x, y)\,dx \approx \frac{b-a}{3p}\sum_{j=0}^{p} s_j^{(p)} f(x_j, y), \quad\text{where } x_j = a + j(b - a)/p.$$

Letting $y_k = c + k(d - c)/q$, with q even, we approximate

$$\int_c^d F(y)\,dy \approx \frac{d-c}{3q}\sum_{k=0}^{q} s_k^{(q)} F(y_k).$$

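Now replace $F(y_k)$ by the Simpson approximation previously obtained to get

$$\int_c^d\left[\int_a^b f(x, y)\,dx\right]dy \approx \frac{(b - a)(d - c)}{9pq}\sum_{k=0}^{q} s_k^{(q)}\left(\sum_{j=0}^{p} s_j^{(p)} f(x_j, y_k)\right) = \frac{(b - a)(d - c)}{9pq}\sum_{j=0}^{p}\sum_{k=0}^{q} s_j^{(p)} s_k^{(q)} f(x_j, y_k).$$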

Note that in this formula we use the values of f at all grid points in the rectangle R,
including those on its boundary. The double loop in the implementing routine has
the form

SET s = 0
FOR j = 0 TO p (Odd number of values, with p even.)
FOR k = 0 TO q (Odd number of values, with q even.)
LET s = s + S(j,p)*S(k,q)*f(a + j*(b - a)/p, c + k*(d - c)/q)
NEXT k
NEXT j
PRINT s*(b - a)*(d - c)/(9*p*q)

Before the loop we need to include the definition

$$S(j, p) = \begin{cases} 3 - (-1)^j, & \text{if } 0 < j < p,\\ 1, & \text{if } j = 0 \text{ or } j = p.\end{cases}$$

The remarks at the end of Section 7A about defining f(x, y) on nonrectangular
regions apply here as well.
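
The Simpson double loop and its coefficient function can be sketched in Python along the following lines. The names `simpson_weight` and `simpson_double` are illustrative choices, and the evenness check is an addition made here for safety rather than something stated in the routine above. Since the 1-dimensional Simpson rule integrates cubic polynomials exactly, the sample integrand $x^3 y^2$ on $0 \le x \le 1$, $0 \le y \le 2$ should reproduce the exact value 2/3 even with p = q = 2.

```python
def simpson_weight(j, p):
    """Simpson coefficient s_j^(p): 1 at the two ends, 3 - (-1)^j in between."""
    if j == 0 or j == p:
        return 1
    return 3 - (-1) ** j        # 4 for odd j, 2 for even j


def simpson_double(f, a, b, c, d, p, q):
    """Two-stage Simpson approximation over a <= x <= b, c <= y <= d."""
    if p <= 0 or q <= 0 or p % 2 or q % 2:
        raise ValueError("p and q must be positive even integers")
    s = 0.0
    for j in range(p + 1):                  # j = 0, ..., p (grid points, ends included)
        x = a + j * (b - a) / p
        for k in range(q + 1):              # k = 0, ..., q
            y = c + k * (d - c) / q
            s += simpson_weight(j, p) * simpson_weight(k, q) * f(x, y)
    return s * (b - a) * (d - c) / (9 * p * q)


# Cubic in x, quadratic in y: the exact integral is (1/4)(8/3) = 2/3.
print(simpson_double(lambda x, y: x**3 * y**2, 0.0, 1.0, 0.0, 2.0, 2, 2))
```

For an integrand that vanishes outside a nonrectangular region, the same routine applies to an enclosing rectangle, though the jump at the boundary can be expected to slow convergence.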

EXERCISES

1. Use a program to implement the midpoint approximation for an f(x, y) and test it on the example $f(x, y) = x^2 + y^4$ over the rectangle R: $0 \le x \le 1$, $0 \le y \le 1$. Having computed the correct value via indefinite integrals, you can find out how small p and q can be while still producing four-place accuracy.

2. Use a program to implement Simpson's approximation for f(x, y) and test it on the same example as in the previous problem to find minimal values for p and q for four-place accuracy, such that reducing either p or q fails to yield that degree of accuracy, in particular increasing p or q should not change the first four digits.

In Exercises 3 to 8, use the midpoint or Simpson approximations to find an approximate value to four-place accuracy for $\int_R f(x, y)\,dx\,dy$ where f, R are

3. $f(x, y) = 1 - x^2 - y^2$, R: $0 \le x \le 2$, $0 \le y \le 3$.
4. $f(x, y) = 1 - x - y$, R: $0 \le x$, $0 \le y$, $x + y \le 1$.
5. $f(x, y) = 1 - x^2 - y^2$, R: $x^2 + y^2 \le 1$.
6. $f(x, y) = 1 - x^2 - y^2$, R: $x^2 + y^2 \le 1$, $y \ge 0$.
7. $f(x, y) = 1 - x^2 - y^2$, R: $0 \le x$, $0 \le y$, $x^2 + y^2 \le 1$.
8. $f(x, y) = y$, R: $0 \le x \le 1$, $0 \le y \le x$.
9. The double integral

$$G(a, b, c, d) = \int_a^b \int_c^d e^{-x^2 - y^2}\,dx\,dy$$

can't be evaluated in terms of elementary functions of a, b, c, d. Nevertheless, we can still find such a value when the rectangular region is replaced by a circular region of radius a centered at the origin. The trick is first to make the change of variable $x = r\cos\theta$, $y = r\sin\theta$, $dx\,dy = r\,dr\,d\theta$, then to compute the resulting double integral by iterated integration over the region $0 \le r \le R$, $0 \le \theta \le 2\pi$.
(a) Compute the value of the integral over $\mathbb{R}^2$ by using polar coordinates over a disk of radius R as R tends to $\infty$.
(b) Compute approximations to $\pi$ by finding Simpson approximations to 4G(0, a, 0, a) for suitable values of a.
(c) Estimate how large you need to make the positive number a in part (b) to get four-place accuracy.

In Exercises 10 to 13, use Simpson's approximation to find approximate values for the following integrals. The Simpson formula for a triple integral over a 3-dimensional rectangle R is

$$\iiint_R g(x, y, z)\,dx\,dy\,dz = \int_e^f\left[\int_c^d\left[\int_a^b g(x, y, z)\,dx\right]dy\right]dz \approx \frac{(b - a)(d - c)(f - e)}{27pqr}\sum_{j=0}^{p}\sum_{k=0}^{q}\sum_{l=0}^{r} s_j^{(p)} s_k^{(q)} s_l^{(r)}\, g(x_j, y_k, z_l).$$

10. $\int_R \ln(xyz)\,dx\,dy\,dz$, R: $1 \le x \le 2$, $1 \le y \le 3$, $2 \le z \le 3$.

11. $\int_R \sqrt{x^2 + y^2 + z^2}\,dx\,dy\,dz$, R: $x^2 + y^2 + z^2 \le 1$, $0 \le z$.

12. $\int_R e^{(x+y+z)}\,dx\,dy\,dz$, R: $0 \le x \le 1$, $0 \le y \le 2$, $0 \le z \le 3$.

13. $\int_R (x + y + z)\,dx\,dy\,dz$, R: $0 \le x \le 1$, $0 \le y \le 1$, $0 \le z \le 1$.

14. (a) Sketch the region in $\mathbb{R}^2$ bounded by the four lines $x + y = 1$, $x + 2y = 4$, $x - 2y = -1$ and $x - 3y = 1$.
(b) Find an approximate value for the area of the region in part (a). Can you find the exact value, 187/120, by elementary geometry?
(c) Find an approximate value for the integral of $f(x, y) = x^3 + y^3$ over the region described in part (a).

15. (a) Sketch the region in the positive octant of $\mathbb{R}^3$ bounded by the planes $x + y + z = 3$, $x + y + 2z = 6$, $z = 1$ and $z = 2$.
(b) Find an approximate value for the volume of the region in part (a). Can you find the exact value by elementary geometry?
(c) Find an approximate value for the integral of $f(x, y, z) = x^4 + y^4 + z^4$ over the region described in part (a).

Chapter 7 REVIEW

Sketch the region of integration in each of the following 1

iterated integrals, and evaluate the integral as given. 4. fo dy foy (x + y}2dx.


Then evaluate the integral in the reversed order as a
check. 2

1 2
S. forr [fo rr sin(x - y)dy] dx.

I. fo [fo xldx] dy.


6. fo
1
[L: 4
x dy] dx.

2. ill [12 ex-ydy] dx. Sketch the region of integration stated for each of the
following double integrals. Then evaluate the integral.

.£ rlox c-Ydy] dx. You may want to pick your order carefully, and you
2
3• may even want to change coordinates.

24. (a) Sketch the region R bounded by the graphs of


1. f (x + y) dx dy; T is the triangle with corners at (0, 0),
y = x 3 and x = y 2 •
]T
(1, 0) and (1, 1).
(b) The double integral l x dx dy is equal to each of
8. 1 y dA; R is the half of the disk of radius 1 centered at two iterated integrals over R; write down both of
ccf. 0) where y ~ 0. them and evaluate one of them.

9. l x dx dy; Q is the part in the first quadrant of the disk


25. Let R be the region in the plane between the parabola
y = x 2 and the line y = 2x + 3. Write the integral
of radius 1 centered at (0, 0). l x
2
yd A as an iterated integral in both possible orders;

10. 1+ (x 2 y2) 5 dx dy; D is the disk of radius 4 centered at then evaluate one of them.
Make a sketch of each of the following plane or solid
elf. o).
regions numbered 26 through 33, and find the area or
11. 1 (x 2 - y2)2 dx dy; D is the disk of radius l centered at volume as the case may be. Use your own reasonable
choices for the constants in making the sketches.
c& o).
12. l (l+x 2+y2)- 312 dV; Dis the disk of radius 2 centered
26. Elliptic region E in JR2 : x 2 /a 2 + y2 / b2 :S I.
27. Solid B based on the region E of the previous exercise
at (0, 0). and with square cross sections perpendicular to the x-axis.

13. ls (x 2 + y2) dx dy; S is the square of side 2 centered at


28. Solid B based on the region E of the previous two exer-
cises and with semicircular cross sections perpendicular
(1, 0), edges parallel to the axes. to the x-axis.

14. ls x 2y2 dx dy; S is the square Ix I + IY I :S: 1. 29. Ellipsoid B: x 2/ a 2 + y2 / b 2 + z 2/ c2 :S 1. Note that B has
elliptical cross sections.
Sketch the 3-dimensional region of integration of each 30. Region H in JR2 : x 2 :S: y 2 + l, IYI :S: l.
of the following integrals. Then evaluate the integral, 31. Solid B generated by rotating the region H of the previous
possibly after a change of coordinates. exercise about the y-axis.

15. fo
1
[i 2
[fi\yzdz] dy]dx.
32. Solid B generated by rotating the region H of the previous
exercises about the x-axis.

1
33. Solid B bounded by the three coordinate planes in JR 3 and
16. fo [lax [lay xyzdz] dy] dx. the plane ax+ by+ cz = I, where a, b, c are positive.
34. For the integral
17. fc z(x:: + i)dV; C is the solid cylinder x2 + y 2 :':: 1,
0 :S: z :S: 2.
Jof I dy 1./Y
Y 2xydx,
18. fc (x 2 + y2 + ·z.2) dV; C is the solid cylinder x 2 + y 2 ::: 4,
0 :S: z :S: l. (a) Sketch the region of integration in JR2 •
19. L (x 2 + y2 +z 2) dV; B is the solid ball x 2 + y 2 +z 2 :s l.
(b)
35. Given
Evaluate the integral.

20. L z dx dy dz; B is the solid ball x 2 + y2 + z2 :':: I.

21. £ (x 2 + y2) dV; K is the solid cone Jx 2 + y 2 ::: z::: 1.


11 I J2-2x 1
2xdydx.

22. £ zdV; K is the solid cone ½Jx2 + y 2 :S z :s 3. (a)


(b)
Sketch the region of integration.
Write an equivalent integral with the order of inte-

23. £ )x2 + y2JV; K is the solid cone Jx 2 + y2 :S z :S 2. (c)


gration reversed.
Evaluate either integral, or both as a check.
36. Lel B be lhe region bounded by the xy-plane, the yz-
plane, the xz-plane and the plane x + 2y + z = 4. (a)
Make a sketch of B. (b) Find the volume of B.
37. Integrate the function f(x, y) = 3x 2 +2y, over the region 47. The equations x = u + v, y = u + 3v define a transfor-
in the plane bounded by the curves y = x 2 and y = 2 - x. mation from the uv-plane to the xy-plane that carries the

38. Compute l sin(x - y) dx dy, where R is the region


points inside the rectangle with corners at (0, 0), (2, 0),
(2, I) and (0, I) onto a region R. Compute l ydA.
bounded by parallelogram with vertices at (0, 0), (0, Jr /2),
(Jr /2, Jr /2), and (n /2, Jr). 48. The equations x = 2u + 1/(v + 1), y = u + v define

39. Compute l Jx 2 + y 2 dxdy, where R is the region


a transformation from the uv-plane to lhe xy-plane that
carries the points inside the rectangle with corners at (0,
enclosed by the part of the polar coordinate curve r = 0), (l, 0), (I, I) and (0, 1) onto a region R. Sketch R and
cos 8 traced out as 8 varies from -Jr /2 to Jr /2. Sketch find its area.
the region R. 49. A vertical cylinder C has a flat base in the xy-plane
40. Let B be the part in the first octant of the solid ball consisting of the semicircle for which x 2 + y2 :::: 9 and
in JR3 of radius 3 centered at the origin. Use spherical x :::: 0. The top of C is part of the graph of z = 1+x+ y 2 •

coordinates to compute l y zd V .
Find the volume of C.
SO. Let R be the region in the first quadrant of the xy-plane
41. Let C be a solid cylinder of radius I symmetric about the where 4 ~ xy:::: 9 and I ::S y/x ::S 4.
z-axis. Let W be the wedge-shaped subset of C where (a) Solve the equations u = xy and v = y/x uniquely
0 ~ z ~ x. Write an iterated integral for l zdV
(b)
for x and/ in R in terms of u and v.
Express JR x- 2 dxdy as an integral in u and v, and
(a) in rectangular coordinates.
(b) in cylindrical coordinates. find its value.
(c) Evaluate the multiple integral in whichever way you In Exercises 51 to 56, evaluate those improper integrals
prefer. that have finite values; for those that don't, explain why
42. (a) Set up an iterated integral whose evaluation will they fail to converge.
yield the volume of the spherical ball Ba of radius 2
51. { e-(x +i) dx dy.
a centered at the origin in JR3. }JR.2
(b) Use Jacobi's theorem to show that every spherical
ball of radius a in JR 3 has the same volume. 52. { (x 2 + y2)- 1 dx dy.
}R2
43. Let R be the region in JR 3 determined by the inequalities
0 ~ x, 0 ::s y, x 1 + y 2 :::: I and 0:::: z:::: x 2 + y 2 • Evaluate 53.1 (x 2 + y2)-J/ 3 dA.
l xyzdV. '
54. r
.r2+y2~,
2
e-../xZ+y dx dy.
44. Compute the value of the integral off (x, y) = x + y 2 2 }R2
over the triangle in JR 2 with corners at (0, 0), (l, 0), and 55. r e-(x +i+z ) dx dy dz.
23
2
f.!
(I, I) by computing iterated integrals in both possible }JR.3
orders.
45. Find limits of integration, some of which may be noncon- 56.1 x2+y2+z2iJ
(x
2
+ y2 + z2 )- 2dV.
stant, for this integral if the region of integration is the
circular disk of radius 2 centered at (3, 0): In Exercises 57 to 60, decide for what values of a the
integrals have finite values.
j~b [id f(x, y) dx] dy. 57.
1.r2+y2~J (x 2
1
+ y 2)a
dA

46. The region of integration for h(x, y, z) is bounded above


by the plane z = I and below by the circular paraboloid
z = x2 + y2; find the limits, which may be non-
constant:

60.1 x2+y2+z2~1 (x 2
;
+y + z2 )a
dV
(a)
(b)
Sketch the region of integration.
Write the integral in terms of rectangular coordi-
nates.
{21r fl {~2 (c) Write the integral in terms of spherical coordinates.
61. lo d0 lo dr lo rdz is expressed m cylindrical (d) Compute the value of the integral.
coordinates.
CHAPTER 8

INTEGRALS AND DERIVATIVES


ON CURVES

One version of the Fundamental Theorem of Calculus is

$$\int_a^b f'(t)\,dt = f(b) - f(a).$$

The present chapter features just one of several ways to extend the Fundamental
Theorem, and we postpone further extensions to Chapter 9. Here we first introduce a
simple generalization of the integral itself, called the line integral over a parametrized
curve. Combining the line integral with the gradient operator we'll arrive at an
analogue of the Fundamental Theorem,

$$\int_{\mathbf{a}}^{\mathbf{b}} \nabla f(\mathbf{x})\cdot d\mathbf{x} = f(\mathbf{b}) - f(\mathbf{a}),$$

which we'll use to express a key relationship between the physical concepts work
and energy. Recall that curves in $\mathbb{R}^2$ and $\mathbb{R}^3$ are in a sense negligible in the
multiple integrals of Chapter 7. However, all the integrals in this chapter will reduce to
ordinary one-variable integrals for computational purposes, even though the concepts
that give rise to them have a distinctly higher-dimensional flavor.
Differentiation along a curve is a concept intrinsic to the geometry of the curve,
typically distinct from differentiation of the curve's parametrization. The idea is
particularly important in describing motion along curves. In some applications to
motion on curves we use the term particle to describe a body moving on a curved
path, but we often understand particle to stand for the position of the center of mass
of what is really a large and complex body such as the earth, a simplification that
turns out to be adequate for many purposes.

SECTION 1 LINE INTEGRALS


1A Definition and Examples
The integral $\int_a^b f(x)\,dx$ of a real-valued function of one scalar variable generalizes
in several ways. One generalization that has applications in physics is the line integral,
which we describe here. Let $F:\mathbb{R}^3 \to \mathbb{R}^3$ be a continuous function defined in a region
D of $\mathbb{R}^3$. We picture F as a vector field, that is, as an assignment of the arrow F(x)
to the point x for each x in D. A sketch of a 3-dimensional vector field is shown
in Figure 8.1(a). Suppose also that $\gamma$ is a curve lying in D, and parametrized by a
function g(t), continuously differentiable for $a \le t \le b$.

368 Chapter 8 Integrals and Derivatives on Curves

FIGURE 8.1

(a) (b)

We'll be particularly interested in the arrows of the field F that stem from points
on $\gamma$, as shown in Figure 8.1(b). These arrows will depend on t in a specific way
if we introduce the composition F(g(t)). At each point g(t) on $\gamma$ there is also a
tangent vector g'(t), and the dot product

$$F(g(t))\cdot g'(t)$$

is a continuous real-valued function for $a \le t \le b$. The line integral of F over $\gamma$ is,
by definition,

$$\int_a^b F(g(t))\cdot g'(t)\,dt. \tag{1.1}$$

EXAMPLE 1  If a vector field is given in $\mathbb{R}^3$ by $F(x, y, z) = (x^2, y^2, z^2)$ and $\gamma$ is given by
$g(t) = (t, t^2, t^3)$ for $0 \le t \le 1$, then the integral of F over $\gamma$ is

$$\int_0^1 (t^2, t^4, t^6)\cdot(1, 2t, 3t^2)\,dt = \int_0^1 (t^2 + 2t^5 + 3t^8)\,dt = \left[\tfrac13 t^3 + \tfrac13 t^6 + \tfrac13 t^9\right]_0^1 = 1.$$
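
As a numerical cross-check of Formula 1.1 on this example, one might approximate the one-variable integral with the midpoint rule of Chapter 7, Section 7. The sketch below is only an illustration: the helper name `line_integral`, its argument order, and the choice of n = 1000 subintervals are assumptions made here, and the derivative g'(t) is supplied by hand rather than computed.

```python
def line_integral(F, g, dg, a, b, n=1000):
    """Midpoint-rule approximation to the integral of F(g(t)) . g'(t) for a <= t <= b.

    F  : vector field, called with the coordinates of a point
    g  : parametrization t -> point on the curve
    dg : derivative t -> tangent vector g'(t)
    """
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        # Dot product of the field at g(t) with the tangent vector g'(t).
        total += sum(Fi * di for Fi, di in zip(F(*g(t)), dg(t)))
    return total * h


# The field and curve of Example 1; the exact value of the line integral is 1.
F = lambda x, y, z: (x**2, y**2, z**2)
g = lambda t: (t, t**2, t**3)
dg = lambda t: (1.0, 2 * t, 3 * t**2)

print(line_integral(F, g, dg, 0.0, 1.0))
```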

We interpret the line integral in qualitative terms as follows. The dot product

$$F(g(t))\cdot\frac{g'(t)}{|g'(t)|}$$

is the coordinate of F(g(t)) in the direction of the unit tangent vector to $\gamma$ at g(t).
Then $F(g(t))\cdot g'(t)$, the integrand in Formula 1.1, is the tangential coordinate of
F(g(t)) times $|g'(t)|$, the speed of traversal of $\gamma$ at g(t). In particular, if F(g(t)) is
always perpendicular to $\gamma$ at g(t), the integrand, and hence the integral, will be zero.
At the other extreme, for a given field F, if the speed $|g'(t)|$ is prescribed at each
point of the curve, then the integrand will be maximized by choosing a curve $\gamma$ that
at each point has the same direction as the field there. Thus the integrand in the line
integral is a local measure of the circulation of the vector field along $\gamma$. The term
circulation is justified by the frequent interpretation of F as the velocity field of a
fluid flow.
Section 1A Line Integrals 369

Equation 1.1 works in any number of dimensions. If $F:\mathbb{R}^n \to \mathbb{R}^n$ is a vector field,
and $g:\mathbb{R} \to \mathbb{R}^n$ describes for $a \le t \le b$ a smooth curve $\gamma$, lying in the domain
D of F, the line integral of F over $\gamma$ is still defined by Equation 1.1 in which
the dot product is now formed in $\mathbb{R}^n$. In general the circulation of F over $\gamma$ is
defined to be the value of the integral of F over a curve $\gamma$, whether $\gamma$ is a closed
curve or not.
EXAMPLE 2  Let F(x, y) = (x, y) define a vector field in $\mathbb{R}^2$. The curve given by $g(t) =
(\cos t, \sin t)$ for $0 \le t \le \pi/2$ is the quarter-circle shown in Figure 8.2(a) together
with some tangent vectors and some vectors of the field. Because the field is
perpendicular to the curve at each point, we expect the integral to be zero, and we have

$$\int_0^{\pi/2} F(g(t))\cdot g'(t)\,dt = \int_0^{\pi/2} (\cos t, \sin t)\cdot(-\sin t, \cos t)\,dt = \int_0^{\pi/2} (-\cos t\sin t + \sin t\cos t)\,dt = \int_0^{\pi/2} 0\,dt = 0.$$

FIGURE 8.2  (a)  (b)

Work. An important physical interpretation of the line integral arises as follows.
Suppose that the function $F:\mathbb{R}^3 \to \mathbb{R}^3$ determines a continuous force field in a region
D in $\mathbb{R}^3$. Thus F(x) represents the magnitude and direction of a force applied at
x. To define the work W done in moving a particle along a curve $\gamma$ in D, we use
a preliminary definition for linear motion in a constant field, namely that work is
scalar force acting in the direction of motion multiplied by distance covered. Thus
work is the product

$$W = F_t\,s,$$

where s is the distance traversed and $F_t$ is the force in the unit direction t of motion.
In Figure 8.2(b) a particle moves along a line having direction vector t with $|t| = 1$,
and it is subject at each point x to the constant force vector F. The coordinate of F
in the direction of motion is $F_t = F\cdot t$, so work is

$$W = (F\cdot t)\,s = (\text{force coordinate}) \times (\text{distance}).$$

FIGURE 8.3

For motion along a continuously differentiable curve, we begin by approximating
the curve by tangent vectors. If the curve $\gamma$ is described parametrically by $g:\mathbb{R} \to \mathbb{R}^3$
with g(t) defined for $a \le t \le b$, then the arrows representing the tangent vectors

$$g'(t_{k-1})(t_k - t_{k-1}), \quad t_0 < t_1 < \cdots < t_K,$$

will approximate $\gamma$ as shown in Figure 8.3, since the number $|g'(t_{k-1})|\,|t_k - t_{k-1}|$
approximates the distance from $g(t_{k-1})$ to $g(t_k)$. We fix a point $x_k = g(t_k)$ on $\gamma$, and
near $x_k$ approximate F by the constant field $F(x_k)$. That is, near $x_k$ we approximate
F(x) by the vector field that assigns the constant vector $F(x_k)$ to every point. The
tangential coordinate of $F(x_k)$ is $F(x_k)\cdot t(t_k)$, where $t(t) = g'(t)/|g'(t)|$. Thus the
work done in moving a particle along $\gamma$ from $x_k$ to $x_{k+1}$ is approximately

$$w_k = \bigl(F(x_k)\cdot t(t_k)\bigr)\,|g'(t_k)|\,(t_{k+1} - t_k) = F(g(t_k))\cdot g'(t_k)\,(t_{k+1} - t_k).$$

Letting $m(P) = \max_{1 \le k \le K}(t_k - t_{k-1})$, we get

$$\lim_{m(P)\to 0}\sum_{k=0}^{K-1} w_k = \int_a^b F(g(t))\cdot g'(t)\,dt,$$

an integral formula that we define to be the work done by the field F in moving the
particle through the domain of F along $\gamma$.
A suggestive shorthand notation for the general line integral uses the unit tangent
vector $t(t) = g'(t)/|g'(t)|$ to a smooth path of integration on which $g'(t) \ne 0$. Given
that arc length along such a path $\gamma$ is defined by $s(t) = \int_a^t |g'(u)|\,du$, it's natural to
write $ds = |g'(t)|\,dt$ for the so-called arc length differential. The line integral for
work is then the natural extension of the special case $W = (F\cdot t)s$:

$$W = \int_\gamma F\cdot t\,ds.$$

This way of writing the integral captures the essence of our interpretation of the line
integral, since $F\cdot t$ is the coordinate of F along the tangent direction to $\gamma$.
The assumptions that F be continuous and that g' be continuous assured that the
integrand $F(g(t))\cdot g'(t)$ would be continuous and hence that the line integral would
exist. However, these conditions are stronger than necessary. It's enough to assume
that the path of integration is piecewise smooth and then that the vector field F is
sufficiently regular so the integral in Formula 1.1 exists. Thus the derivative g' may
be discontinuous at finitely many points, allowing $\gamma$ to have sharp corners at some
points.

EXAMPLE 3  Let a vector field be defined in $\mathbb{R}^3$ by F(x, y, z) = (x, y, z). Let the curve $\gamma$ in $\mathbb{R}^3$
be given by $g(t) = (\cos t, \sin t, |t - \pi/2|)$ for $0 \le t \le \pi$. Then $\gamma$ has a corner at
(0, 1, 0), where $t = \pi/2$. Indeed, g is not differentiable there, and
$\lim_{t\to\pi/2-} g'(t) = (-1, 0, -1)$ and $\lim_{t\to\pi/2+} g'(t) = (-1, 0, 1)$, showing that the direction of the tangent
jumps abruptly at $t = \pi/2$. Nevertheless, the integral of F over $\gamma$ exists. To compute
it, the interval of integration would ordinarily be broken at $t = \pi/2$. But in this
particular case $F(g(t))\cdot g'(t) = t - \pi/2$ unless $t = \pi/2$. It follows that

$$\int_\gamma F\cdot t\,ds = \int_0^{\pi} F(g(t))\cdot g'(t)\,dt = \int_0^{\pi} \left(t - \frac{\pi}{2}\right)dt = 0.$$


A convenient notation for line integrals denotes the parametrization of $\gamma$ by $g(t) =
(x(t), y(t), z(t))$ for $a \le t \le b$. If the coordinate functions of F are $F_1$, $F_2$, and $F_3$,
and we suppress the variable t in the integrand, we get

$$\int_a^b F(g(t))\cdot g'(t)\,dt = \int_a^b \left[F_1(x, y, z)\frac{dx}{dt} + F_2(x, y, z)\frac{dy}{dt} + F_3(x, y, z)\frac{dz}{dt}\right]dt.$$

The last integral abbreviates to

$$\int_\gamma F_1\,dx + F_2\,dy + F_3\,dz.$$

This is still shorter if we write $d\mathbf{x} = (dx, dy, dz)$, giving a convenient shorthand for
the line integral of F over $\gamma$:

$$\int_\gamma F\cdot d\mathbf{x}.$$

EXAMPLE 4  Let a vector field be given in $\mathbb{R}^3$ by

$$F(x, y, z) = (x - y,\; y - z,\; z - x).$$

A curve $\gamma$, given by $g(t) = (t, -t, t^2)$ for $0 \le t \le 1$, passes through the field. We
compute the integral of F over $\gamma$ as follows. First, the values of F on $\gamma$ are given by

$$F(g(t)) = (2t,\; -t - t^2,\; t^2 - t).$$

We write

$$d\mathbf{x} = g'(t)\,dt = (1, -1, 2t)\,dt.$$

Then $F\cdot d\mathbf{x} = F(g(t))\cdot g'(t)\,dt = (2t, -t - t^2, t^2 - t)\cdot(1, -1, 2t)\,dt$, so

$$\int_\gamma F\cdot d\mathbf{x} = \int_0^1 \bigl[(2t)(1) + (-t - t^2)(-1) + (t^2 - t)(2t)\bigr]\,dt = \int_0^1 (2t^3 - t^2 + 3t)\,dt = \frac{5}{3}.$$

If we can choose coordinates so that one or more of the sections of a line integral
path are parallel to an axis, computing the value of an integral may be substantially
simplified, as in the following example.
FIGURE 8.4  (a)  (b)

EXAMPLE 5  Let F(x, y) = (x, y) define a 2-dimensional velocity field along the path $\delta$ from
(0, 0) to (a, 0) along the x-axis and then along the line segment from (a, 0) to
(a, b). Thus $\delta$ consists of a path $\delta_1$ along the x-axis followed by a path $\delta_2$ parallel
to the y-axis. See Figure 8.4(a), where it's assumed that a > 0 and b > 0. No
matter how the path is parametrized by a function $g(t) = (g_1(t), g_2(t))$, we see
that $g_2'(t) = 0$ along $\delta_1$. In other words, dy = 0 along $\delta_1$. Similarly, dx = 0
along $\delta_2$. Since $F\cdot d\mathbf{x} = x\,dx + y\,dy$, we can compute the circulation of F along
$\delta$ using x and y as parameters on $\delta_1$ and $\delta_2$ respectively. For arbitrary a and b
we have

$$\int_\delta x\,dx + y\,dy = \int_{\delta_1} x\,dx + \int_{\delta_2} y\,dy = \int_0^a x\,dx + \int_0^b y\,dy = \tfrac12 a^2 + \tfrac12 b^2.$$

EXAMPLE 6  Let F(x, y) = (x, y) as in the previous example. This time we integrate F over the
closed path, shown in Figure 8.4(a), consisting of two circular arcs with their ends
joined by radial segments. The entire path $\gamma$ is to be traced counterclockwise. Over
the circular arcs the tangent vectors t are perpendicular to the vector field arrows,
so $F\cdot t = 0$ there. Thus the integral of F is zero over $\gamma_2$ and $\gamma_4$. Let t be the unit
tangent at a typical point of the segment $\gamma_1$. Since F and t point in the same direction
along $\gamma_1$, $F\cdot t = |F| = \sqrt{x^2 + y^2}$ at a point (x, y) of $\gamma_1$. In contrast, the tangent
vectors to $\gamma_3$ all point toward the origin. Since F and t point in opposite directions
along $\gamma_3$, $F\cdot t = -|F| = -\sqrt{x^2 + y^2}$ at a point (x, y) of $\gamma_3$. The integrals of F are
thus negatives of each other:

$$\int_{\gamma_3} F\cdot t\,ds = -\int_{\gamma_1} F\cdot t\,ds.$$

Thus the two remaining integrals cancel in the computation of the integral over $\gamma$,
giving zero net circulation over the complete circuit.

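EXAMPLE 7  Figure 8.4(b) shows the 3-dimensional vector field $F(x, y, z) = -y\,\mathbf{i} + x\,\mathbf{j} + \mathbf{k}$ along
with the helix $\mathbf{x}(t) = (2\cos t)\mathbf{i} + (2\sin t)\mathbf{j} + t\,\mathbf{k}$. To integrate F over the helix we
compute $F\cdot d\mathbf{x}$ along the curve. Note that

$$F(\mathbf{x}(t)) = (-2\sin t)\mathbf{i} + (2\cos t)\mathbf{j} + \mathbf{k}, \quad\text{and}\quad \dot{\mathbf{x}}(t) = (-2\sin t)\mathbf{i} + (2\cos t)\mathbf{j} + \mathbf{k}.$$

Hence the velocity vector to the helix at a point $\mathbf{x}$ coincides with the vector $F(\mathbf{x})$,
so $F(\mathbf{x})\cdot\dot{\mathbf{x}} = |\dot{\mathbf{x}}|^2 = 5$. It follows that the circulation of F along the helix from $\mathbf{x}(a)$
to $\mathbf{x}(b)$ is

$$\int_a^b F(\mathbf{x}(t))\cdot\dot{\mathbf{x}}(t)\,dt = \int_a^b 5\,dt = 5(b - a).$$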

Equivalent Parametrizations. Different parametrizations may describe the
same image curve, for example $f(t) = (t, t^2)$, $0 \le t \le 1$ and $g(u) = (u^2, u^4)$,
$0 \le u \le 1$ both describe the same segment of a parabola extending between (0,
0) and (1, 1). It's conceivable that line integrals of the same field using the two
functions f and g might be different. However, these parametrizations are related in
two significant ways. One is that each segment of the image curve is traced the same
number of times (once in this example) by either representation. The other is that
there is a correspondence between points t and u of the parameter domain such that
corresponding points on the curve have the tangent vectors f'(t) and g'(u) pointing
in the same direction. These conditions should be met if we expect to get the same
value for a line integral using either parametrization. The conditions will be satisfied
in general by imposing the requirement that parametrizations

$$\mathbf{x} = f(t),\; a \le t \le b \quad\text{and}\quad \mathbf{x} = g(u),\; \alpha \le u \le \beta$$

be equivalent, meaning that there is a continuously differentiable function $\phi$ with
$\phi' > 0$ from $[\alpha, \beta]$ onto $[a, b]$ such that $f(\phi(u)) = g(u)$. (Note that $\phi' > 0$ implies
that $\phi$ is strictly increasing, so changing to the new parametrization won't produce
a zero tangent where there wasn't one before.) The following theorem allows us
some latitude in the choice of a convenient parametrization for a curve $\gamma$. The
proof is a direct application of the chain rule and the change-of-variable theorem for
integrals.
1.2 Theorem. Equivalent parametrizations of a curve $\gamma$ yield the same value for

$$\int_\gamma F\cdot d\mathbf{x}.$$

Proof. Let the parameter correspondence be $t = \phi(u)$, so that $f(\phi(u)) = g(u)$.
By the change-of-variable theorem

$$\int_\gamma F\cdot d\mathbf{x} = \int_a^b F(f(t))\cdot f'(t)\,dt = \int_\alpha^\beta F(f(\phi(u)))\cdot f'(\phi(u))\,\phi'(u)\,du.$$

By the chain rule, $g'(u) = [f(\phi(u))]' = f'(\phi(u))\,\phi'(u)$, so

$$\int_\gamma F\cdot d\mathbf{x} = \int_\alpha^\beta F(g(u))\cdot g'(u)\,du. \qquad\blacksquare$$

EXAMPLE 8  Consider three parametrizations for the line segment joining $\mathbf{a}$ and $\mathbf{b}$:

$$f(t) = t\mathbf{b} + (1 - t)\mathbf{a}, \quad g(u) = u^2\mathbf{b} + (1 - u^2)\mathbf{a}, \quad h(v) = (1 - v)\mathbf{b} + v\mathbf{a},$$

with parameter interval [0, 1] for all three. The first two are equivalent via the
function $\phi(u) = u^2$, and indeed the tangent vectors for both of them have the same
direction as $\mathbf{b} - \mathbf{a}$, pointing in the direction of traversal of the segment. With h(v)
we have $t = \phi(v) = 1 - v$ so $\phi'(v) < 0$, and the curve is traversed in the opposite
direction, from $\mathbf{b}$ to $\mathbf{a}$; this changes the sign of the integral as compared with the
other two parametrizations.
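
The effect of the three parametrizations can be checked numerically. In the sketch below every specific choice is an assumption made for illustration: the endpoints a = (1, 0) and b = (0, 2), the field F(x, y) = (x - y, x + y), and the helper names. For this field a direct computation with f(t) gives the integrand 1 + 5t, so the integral along the segment from a to b is 7/2; the equivalent parametrization g should agree, and the reversed parametrization h should give -7/2.

```python
def line_integral(F, g, dg, a, b, n=1000):
    """Midpoint-rule approximation to the integral of F(g(t)) . g'(t) for a <= t <= b."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        total += sum(Fi * di for Fi, di in zip(F(*g(t)), dg(t)))
    return total * h


A, B = (1.0, 0.0), (0.0, 2.0)          # hypothetical endpoints a and b
D = (B[0] - A[0], B[1] - A[1])         # the constant direction b - a


def F(x, y):
    """Sample field, chosen only for illustration."""
    return (x - y, x + y)


def seg(s):
    """The point s*b + (1 - s)*a on the segment."""
    return (s * B[0] + (1 - s) * A[0], s * B[1] + (1 - s) * A[1])


# f(t) = t b + (1 - t) a,  g(u) = f(u^2),  h(v) = f(1 - v)
I_f = line_integral(F, lambda t: seg(t), lambda t: D, 0.0, 1.0)
I_g = line_integral(F, lambda u: seg(u * u), lambda u: (2 * u * D[0], 2 * u * D[1]), 0.0, 1.0)
I_h = line_integral(F, lambda v: seg(1 - v), lambda v: (-D[0], -D[1]), 0.0, 1.0)

print(I_f, I_g, I_h)
```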
1B Fundamental Theorem of Calculus
Recall that the gradient $\nabla f$ of a differentiable function f from $\mathbb{R}^n$ to $\mathbb{R}$ is the vector
field defined by

$$\nabla f(\mathbf{x}) = \left(\frac{\partial f}{\partial x_1}(\mathbf{x}), \ldots, \frac{\partial f}{\partial x_n}(\mathbf{x})\right).$$

If $\nabla f$ is continuous, it generalizes the derivative in the formula for the fundamental
theorem of calculus for one variable:

$$(*)\qquad \int_a^b f'(t)\,dt = f(b) - f(a).$$

1.3 Theorem. Let f be a continuously differentiable real-valued function defined
in an open set D of $\mathbb{R}^n$. (Thus $\nabla f$ is a continuous vector field in D.) If $\gamma$ is a smooth
curve in D with initial and terminal points $\mathbf{a}$ and $\mathbf{b}$, then

$$\int_\gamma \nabla f\cdot d\mathbf{x} = f(\mathbf{b}) - f(\mathbf{a}).$$

In particular, the value of the line integral of a gradient field over a curve depends
only on the endpoints of the curve; thus in this case, the notation

$$\int_{\mathbf{a}}^{\mathbf{b}} \nabla f\cdot d\mathbf{x} = f(\mathbf{b}) - f(\mathbf{a})$$

is justified. If $\mathbf{a}$ equals $\mathbf{b}$, $\int_{\mathbf{a}}^{\mathbf{b}} \nabla f\cdot d\mathbf{x} = 0$, so the integral of a gradient field over a
path starting and ending at the same point is zero. (See Figure 8.5.)

FIGURE 8.5  (a) $\int_{\gamma_1}\nabla f\cdot d\mathbf{x} = \int_{\gamma_2}\nabla f\cdot d\mathbf{x}$

Proof. Suppose $\gamma$ is parametrized by g(t) with $a \le t \le b$, and $g(a) = \mathbf{a}$, $g(b) = \mathbf{b}$.
Using first the definition of the line integral we have

$$\int_\gamma \nabla f\cdot d\mathbf{x} = \int_a^b \nabla f(g(t))\cdot g'(t)\,dt = \int_a^b \frac{d}{dt} f(g(t))\,dt,$$

where in the second step we used the chain rule, Theorem 1.3 of Chapter 6. But by
Equation (*), the fundamental theorem for one variable, the last integral is equal to
$f(g(b)) - f(g(a)) = f(\mathbf{b}) - f(\mathbf{a})$. $\blacksquare$

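Consider the vector field $\nabla f(x, y)$ in $\mathbb{R}^2$, where $f(x, y) = \tfrac{1}{2}(x^2 + y^2)$. Then $\nabla f(x, y) = (x, y)$. If $\gamma$ is some continuously differentiable curve with respective initial and final endpoints $\mathbf{x}_1 = (x_1, y_1)$ and $\mathbf{x}_2 = (x_2, y_2)$, then

\[
\begin{aligned}
\int_\gamma \nabla f(\mathbf{x}) \cdot d\mathbf{x} = \int_{(x_1, y_1)}^{(x_2, y_2)} x\,dx + y\,dy &= f(x_2, y_2) - f(x_1, y_1) \\
&= \tfrac{1}{2}(x_2^2 + y_2^2) - \tfrac{1}{2}(x_1^2 + y_1^2) \\
&= \tfrac{1}{2}(x_2^2 - x_1^2) + \tfrac{1}{2}(y_2^2 - y_1^2).
\end{aligned}
\]

This is what we would expect formally from the fundamental theorem.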

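Similarly, if $f(x, y) = xy$, then $\nabla f(x, y) = (y, x)$, and for a continuously differentiable curve $\gamma$ from $(x_1, y_1)$ to $(x_2, y_2)$,

\[ \int_\gamma y\,dx + x\,dy = x_2 y_2 - x_1 y_1. \]

In particular, if the path starts at $(x_0, y_0) = (1, 2)$, and ends at $(x, y)$, we find that

\[ \int_{(1,2)}^{(x,y)} y\,dx + x\,dy = f(x, y) - f(1, 2) = xy - 2. \]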

Examples 9 and 10 show in detail how line integrals solve a vector equation

\[ \nabla f(x, y) = (F_1(x, y), F_2(x, y)) \]

for $f(x, y)$ subject to a condition of the form

\[ f(x_0, y_0) = 0, \]

provided that $(F_1(x, y), F_2(x, y))$ is a given gradient field. The solution is then

\[ f(x, y) = \int_{(x_0, y_0)}^{(x, y)} F_1(x, y)\,dx + F_2(x, y)\,dy. \]

More generally the line integral

\[ f(\mathbf{x}) = \int_{\mathbf{x}_0}^{\mathbf{x}} \mathbf{F}(\mathbf{x}) \cdot d\mathbf{x} \quad \text{solves} \quad \nabla f(\mathbf{x}) = \mathbf{F}(\mathbf{x}), \text{ with } f(\mathbf{x}_0) = 0, \]

assuming again that $\mathbf{F}(\mathbf{x})$ is indeed a gradient field.
The vector differential equation $\nabla f = \mathbf{F}$ is discussed in more detail in Section 2 of Chapter 9. The examples in Exercises 9 to 12 show that if $\mathbf{F}$ is not a gradient field, the value of a line integral of $\mathbf{F}$ over a path joining two points may depend on the path and not just on its endpoints.
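The contrast between gradient and non-gradient fields can be checked numerically. The following is a minimal sketch, not part of the text: the helper `line_integral`, the two paths from $(1, 2)$ to $(3, 5)$, and the midpoint rule are all illustrative choices. It integrates the gradient field $(y, x)$ of $f(x, y) = xy$ and the field $(-y, x)$, which is not a gradient field, over both paths.

```python
def line_integral(F, g, dg, t0, t1, n=2000):
    """Midpoint-rule approximation of the integral of F(g(t)) . g'(t) over [t0, t1]."""
    h = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * h
        Fx, Fy = F(*g(t))
        dx, dy = dg(t)
        total += (Fx * dx + Fy * dy) * h
    return total

def grad_f(x, y):
    # Gradient of f(x, y) = x*y
    return (y, x)

def rot(x, y):
    # The field (-y, x), which is not a gradient field
    return (-y, x)

# Two hypothetical paths from (1, 2) to (3, 5), both on 0 <= t <= 1
def line(t):
    return (1 + 2 * t, 2 + 3 * t)

def dline(t):
    return (2.0, 3.0)

def curve(t):
    return (1 + 2 * t**2, 2 + 3 * t**3)

def dcurve(t):
    return (4 * t, 9 * t**2)

grad_on_line = line_integral(grad_f, line, dline, 0.0, 1.0)
grad_on_curve = line_integral(grad_f, curve, dcurve, 0.0, 1.0)
rot_on_line = line_integral(rot, line, dline, 0.0, 1.0)
rot_on_curve = line_integral(rot, curve, dcurve, 0.0, 1.0)

print(grad_on_line, grad_on_curve)  # both close to f(3, 5) - f(1, 2)
print(rot_on_line, rot_on_curve)    # these two differ
```

For the gradient field both values approximate $f(3, 5) - f(1, 2) = 13$, as Theorem 1.3 predicts; for $(-y, x)$ a direct computation gives $-1$ on the segment and $\tfrac{1}{5}$ on the curved path, so the integral depends on the path.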

EXERCISES

In Exercises 1 to 8 compute the line integrals.

1. $\int_\gamma x\,dx + x^2\,dy + y\,dz$, where $\gamma$ is given by $g(t) = (t, t, t)$, for $0 \le t \le 1$.

2. $\int_\gamma (x + y)\,dx + dy$, where $\gamma$ is given by $g(t) = (t, t^2)$, $0 \le t \le 1$.

3. $\int_{\gamma_1} x\,dy$ and $\int_{\gamma_2} x\,dy$, where $\gamma_1$ is given by $g(t) = (\cos t, \sin t)$ for $0 \le t \le 2\pi$, and where $\gamma_2$ is given by $h(t) = (\cos t, \sin t)$ for $0 \le t \le 4\pi$.

4. $\int_{\gamma_1} (dx + dy)$, where $\gamma_1$ is given parametrically by $(x, y) = (\cos t, \sin t)$, $0 \le t \le 2\pi$.

5. $\int_{\gamma_1} \dfrac{dx + dy}{x^2 + y^2}$, where $\gamma_1$ is the curve in Exercise 4.

6. $\int_\gamma (e^x\,dx + z\,dy + \sin z\,dz)$, where $\gamma$ is given by $(x, y, z) = (t, t^2, t^3)$, $0 \le t \le 1$.

7. $\int_\gamma \mathbf{F} \cdot d\mathbf{x}$, where $\mathbf{F}(x, y, z) = (z, x, y)$ and $\gamma$ is given parametrically by $(x, y, z) = (\cos t, \sin t, t)$, $0 \le t \le 2\pi$.

8. $\int_\gamma \mathbf{F} \cdot d\mathbf{x}$, where $\mathbf{F}(x, y, z, w) = (x, x, y, xw)$ and $\gamma$ is given by $(x, y, z, w) = (t, 1, t, t)$, $0 \le t \le 2$.

In Exercises 9 to 12, let $\gamma_1$ be given by $(x, y) = (\cos t, \sin t)$, $0 \le t \le \pi/2$ and $\gamma_2$ by $(x, y) = (1 - u, u)$, $0 \le u \le 1$. Compute $\int_{\gamma_1} (f\,dx + g\,dy)$ and $\int_{\gamma_2} (f\,dx + g\,dy)$ for the given choices of $f$ and $g$.

9. $f(x, y) = x$, $g(x, y) = x + 1$.

10. $f(x, y) = x + y$, $g(x, y) = 1$.

11. $f(x, y) = \dfrac{1}{x^2 + y^2}$, $g(x, y) = \dfrac{1}{x^2 + y^2}$.

12. $f(x, y) = xy$, $g(x, y) = x + 1$.

In Exercises 13 to 16, let $\gamma_2$ be given on $0 \le u \le 1$ by $(x, y) = (1 - u, u)$. The vector fields $(F_1(x, y), F_2(x, y))$ are gradient fields. Use Theorem 1.3 to compute $\int_{\gamma_2} (F_1\,dx + F_2\,dy)$ for the given choices of $F_1$ and $F_2$.

13. $F_1(x, y) = x^2$, $F_2(x, y) = y^2$.

14. $F_1(x, y) = xy^2$, $F_2(x, y) = x^2 y$.

15. $F_1(x, y) = \sin y$, $F_2(x, y) = x \cos y$.

16. $F_1(x, y) = e^{x - y}$, $F_2(x, y) = -e^{x - y}$.

17. Find the work done in moving a particle along the curve $(x, y, z) = (t, t, t^2)$, $0 \le t \le 2$, under the influence of the field $\mathbf{F}(x, y, z) = (x + y, y, y)$.

18. (a) Find the work done by the force field $\mathbf{F}(x, y) = y\mathbf{i} - x\mathbf{j}$ in moving a particle clockwise once around the circle of radius 1 centered at the origin in $\mathbb{R}^2$.
    (b) How does the answer to part (a) change if the circle is moved so that its center is at an arbitrary point $(a, b)$?

19. Consider the vector field $\mathbf{F}(x, y) = (y, x)$ and the curve $g(t) = (e^t, e^{-t})$ for $0 \le t \le 1$.
    (a) Sketch $\mathbf{F}$ and $g$ in the same picture.
    (b) Compute the integral of $\mathbf{F}$ over the curve.

20. Show that
    \[ f(t) = (\cos t, \sin t), \quad 0 \le t \le \pi/2, \qquad \text{and} \qquad g(u) = \left( \frac{1 - u^2}{1 + u^2}, \frac{2u}{1 + u^2} \right), \quad 0 \le u \le 1, \]
    are equivalent parametrizations of a quarter-circle. (The relevant definition of equivalence is given in the preamble to Theorem 1.2.)

21. Show that $f(t) = (t^{1/2}, t^{3/2})$, $1 \le t \le 2$, and $g(u) = (u, u^3)$, $1 \le u \le \sqrt{2}$ are equivalent parametrizations of a cubic curve. (The relevant definition of equivalence is given in the preamble to Theorem 1.2.)

22. Show that if $\int_\gamma \mathbf{F} \cdot d\mathbf{x}$ and $\int_\gamma \mathbf{G} \cdot d\mathbf{x}$ exist, then
    \[ \int_\gamma (a\mathbf{F} + b\mathbf{G}) \cdot d\mathbf{x} = a \int_\gamma \mathbf{F} \cdot d\mathbf{x} + b \int_\gamma \mathbf{G} \cdot d\mathbf{x}, \]
    where $a$ and $b$ are constants.
23. Let a function $g(t)$ represent the position of a particle of varying mass $m(t)$ in $\mathbb{R}^3$ at time $t$. Then the velocity vector of the particle is $\mathbf{v}(t) = g'(t)$, and the force acting on the particle at $g(t)$ is $\mathbf{F}(g(t)) = [m(t)\mathbf{v}(t)]'$.
    (a) Show that $\mathbf{F}(g(t)) \cdot g'(t) = m'(t)v^2(t) + m(t)v(t)v'(t)$, where $v$ is the speed of the particle.
    (b) Show that if $m(t)$ is constant, then the work done in moving the particle over its path between times $t = a$ and $t = b$ is $w = (m/2)(v^2(b) - v^2(a))$. [$\tfrac{1}{2}mv^2(t)$ is the kinetic energy of the particle.]

24. Sketch the vector field $\mathbf{F}(x, y) = (x, y)$. Explain on geometric grounds why $\int_\gamma \mathbf{F} \cdot d\mathbf{x} = 0$ if the path $\gamma$ is confined to a circle centered at the origin.

25. Sketch the vector field $\mathbf{F}(x, y) = (-y, x)$. Explain on geometric grounds why $\int_\gamma \mathbf{F} \cdot d\mathbf{x} \ne 0$ if the path $\gamma$ traces an ellipse centered at the origin and with major and minor axes not necessarily parallel to the $(x, y)$-axes.

26. The purpose of this exercise is to display a pattern in the results of integrating the vector field $\mathbf{F}(x, y) = x\mathbf{j} = (0, x)$ over some closed paths in $\mathbb{R}^2$.
    (a) Make a sketch of the vector field $\mathbf{F}$.
    (b) Compute $\int_c \mathbf{F} \cdot d\mathbf{x}$, where $c$ is a circular path of radius $a$, centered at $(\alpha, \beta)$ and traced counterclockwise.
    (c) Compute $\int_r \mathbf{F} \cdot d\mathbf{x}$, where $r$ is a rectangle with sides parallel to the axes, traced counterclockwise. [Hint: Only the vertical sides of the rectangle make a nonzero contribution.]
    (d) Do a computation analogous to the ones in parts (b) and (c) for a triangle with vertices at $(0, 0)$, $(a, 0)$ and $(0, b)$ where $a$ and $b$ are positive. What is the pattern in the answers?

27. The purpose of this exercise is to display a pattern in the results of integrating the vector field $\mathbf{G}(x, y) = -\tfrac{1}{2}y\mathbf{i} + \tfrac{1}{2}x\mathbf{j} = (-\tfrac{1}{2}y, \tfrac{1}{2}x)$ over some closed paths in $\mathbb{R}^2$.
    (a) Make a sketch of the vector field $\mathbf{G}$.
    (b) Compute $\int_c \mathbf{G} \cdot d\mathbf{x}$, where $c$ is a circular path of radius $a$, centered at $(\alpha, \beta)$ and traced counterclockwise.
    (c) Compute $\int_r \mathbf{G} \cdot d\mathbf{x}$, where $r$ is a rectangle with sides parallel to the axes, traced counterclockwise.
    (d) Do a computation analogous to the ones in parts (b) and (c) for a triangle with vertices at $(0, 0)$, $(a, 0)$ and $(0, b)$ where $a$ and $b$ are positive. What is the pattern in the answers?

In Exercises 28 to 31, compute $\int_\gamma \nabla f \cdot d\mathbf{x}$ for the indicated choices of $f$ and $\gamma$.

28. $f(x, y) = x^2 + y^2$; $\gamma$: $g(t) = (1 + t^2, 1 - t^2)$, $-1 \le t \le 2$.

29. $f(x, y, z) = x - y^2 + z$; $\gamma$: $g(t) = (t, t^2, -t^2)$, $0 \le t \le 2$.

30. $f(x, y) = x^3 - 2y^3$; $\gamma$: line segment from $(1, 1)$ to $(5, -1)$.

31. $f(x, y, z) = (x - y + z)^2$; $\gamma$: one turn of a helix from $(1, 0, 0)$ to $(1, 0, 4)$.

32. (a) Sketch the vector field $\mathbf{F}(x, y) = (y, 0)$.
    (b) Show that the vector field of part (a) can't be the gradient of a real-valued function $f$ by finding distinct paths from some point $\mathbf{a}$ to another point $\mathbf{b} \ne \mathbf{a}$ such that the integrals of $\mathbf{F}$ over these paths have different values.
    (c) Show that the vector field of part (a) can't be the gradient of a real-valued function $f$ by finding a closed path $\gamma$ starting and ending at the same point such that $\int_\gamma \mathbf{F} \cdot d\mathbf{x} \ne 0$.

33. (a) Sketch the vector field $\mathbf{F}(x, y) = (-y, x)$.
    (b) Show that the vector field of part (a) can't be the gradient of a real-valued function $f$ by finding distinct paths from some point $\mathbf{a}$ to another point $\mathbf{b} \ne \mathbf{a}$ such that the integrals of $\mathbf{F}$ over these paths have different values.
    (c) Show that the vector field of part (a) can't be the gradient of a real-valued function $f$ by finding a closed path $\gamma$ starting and ending at the same point such that $\int_\gamma \mathbf{F} \cdot d\mathbf{x} \ne 0$.

34. Assume a continuous vector field $\mathbf{F}(\mathbf{x})$ satisfies (i) $|\mathbf{F}(\mathbf{x})| = k$ for a constant $k > 0$, and (ii) $\mathbf{F}(\mathbf{x})$ is tangent at each point $\mathbf{x}$ to a continuously differentiable curve $\gamma$ of finite length. Prove that $\int_\gamma \mathbf{F} \cdot d\mathbf{x}$ equals $\pm k$ times the length of $\gamma$.
SECTION 2 WEIGHTED CURVES AND SURFACES OF REVOLUTION


In Chapter 4, Section 1 the arc length function $s = s(t)$ of a curve parametrized by $\mathbf{x} = g(t)$ on $t_0 \le t \le t_1$ is defined by

\[ s(t) = \int_{t_0}^{t} |\dot{\mathbf{x}}(u)|\,du. \]

The definition is a natural one, because the length $|\dot{\mathbf{x}}(t)|$ is the speed at which the curve is being traced at time $t$. But a given path in space can be traced at many different varying speeds, including possible multiple back-and-forth tracings. Hence it's useful to have a standard parametrization for a curve that depends only on the intrinsic geometry of the image set of the curve. If points in the image correspond one-to-one with values of arc length measured from a specific point $\mathbf{x}_0$ on the image curve, we can use arc length $s$ as the parameter in a representation of the curve by a function $g(s)$, where $g(s_0) = \mathbf{x}_0$. It would then follow that $s = \int_{s_0}^{s} |\dot{g}(u)|\,du$. Differentiating both sides of the equation with respect to $s$ gives $1 = |\dot{g}(s)|$. For this reason we say that a curve $g(t)$, $t_0 \le t \le t_1$ is parametrized by arc length if $|\dot{g}(t)| = 1$ for all $t$ in the parameter interval; in other words the curve is traced with constant speed 1. The expression $|\dot{\mathbf{x}}(t)|\,dt = |g'(t)|\,dt$ is traditionally called the arc length element of the curve.

Let $a > 0$ be the radius of a circle centered at the origin in $\mathbb{R}^2$. The simplest parametrization for this circle is $g(t) = (a \cos t, a \sin t)$, $0 \le t \le 2\pi$. Since $|\dot{g}(t)| = |(-a \sin t, a \cos t)| = a$, the circle has been parametrized by arc length just when $a = 1$. For other values of $a$, note that the arc length of the part of the circle corresponding to the parameter interval $0 \le u \le t$ is

\[ s = \int_0^t |\dot{g}(u)|\,du = \int_0^t a\,du = at. \]

This equation $s = at$ suggests the idea of introducing arc length $s$ as parameter by substituting $s/a$ for $t$ in the given parametrization; we get

\[ g(s/a) = (a \cos(s/a), a \sin(s/a)), \]

so we now have an arc length parametrization, since

\[ \left| \frac{dg(s/a)}{ds} \right| = |(-\sin(s/a), \cos(s/a))| = 1. \]

Whether we denote the parameter by $s$ or some other letter $t$ is irrelevant; what is important for applications is that the point $g(t/a)$ is $t$ units along the curve from $g(0)$, or equivalently, that the curve is traced with uniform speed 1.

The plane curve $g(t) = (t, \tfrac{2}{3}t^{3/2})$ has velocity vector $\dot{g}(t) = (1, t^{1/2})$, so the speed is $|\dot{g}(t)| = \sqrt{1 + t}$. Arc length in terms of the parameter $t$ measured from 0 is then

\[ s(t) = \int_0^t \sqrt{1 + u}\,du = \left[ \tfrac{2}{3}(1 + u)^{3/2} \right]_0^t = \tfrac{2}{3}(1 + t)^{3/2} - \tfrac{2}{3}. \]

Solving for $t$ in terms of $s$ gives $t = (1 + \tfrac{3}{2}s)^{2/3} - 1$. Exercise 14 shows that if we use arc length as parameter, letting $h(s) = g((1 + \tfrac{3}{2}s)^{2/3} - 1)$, then $|h'(s)| = 1$, so the curve is now traced by $h(s)$ with uniform speed 1 for $s > 0$.
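The unit-speed claim for this reparametrization can be spot-checked numerically. The sketch below is illustrative only: it assumes the curve $g(t) = (t, \tfrac{2}{3}t^{3/2})$ and the substitution $t = (1 + \tfrac{3}{2}s)^{2/3} - 1$ from the example, and the function names and the central-difference step `eps` are hypothetical choices, not anything from the text.

```python
import math

def g(t):
    # The plane curve g(t) = (t, (2/3) t^(3/2))
    return (t, (2.0 / 3.0) * t**1.5)

def t_of_s(s):
    # Inverse of the arc length function s(t) = (2/3)(1 + t)^(3/2) - 2/3
    return (1.0 + 1.5 * s) ** (2.0 / 3.0) - 1.0

def h(s):
    # Reparametrization of the same curve by arc length s
    return g(t_of_s(s))

def speed(curve, s, eps=1e-6):
    # Central-difference estimate of |curve'(s)|
    x1, y1 = curve(s + eps)
    x0, y0 = curve(s - eps)
    return math.hypot(x1 - x0, y1 - y0) / (2.0 * eps)

speeds_h = [speed(h, s) for s in (0.5, 1.0, 2.0, 5.0)]
speeds_g = [speed(g, t) for t in (0.5, 1.0, 2.0, 5.0)]
print(speeds_h)  # each value should be close to 1
print(speeds_g)  # close to sqrt(1 + t), which is not constant
```

The comparison with `speeds_g` shows why the reparametrization matters: the original parameter traces the curve at speed $\sqrt{1 + t}$, while the arc length parameter traces it at speed 1.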

Weighted Curves. Thinking in terms of arc length parametrization is particularly appropriate for integration over a curve for which points in the image of the curve correspond one-to-one to parameter values. In particular, suppose we have a scalar-valued weight function $\mu$ that assigns a number $\mu(h(s))$ to each point $h(s)$ at distance $s$ along the curve. For example, $\mu$ might be the mass density of a wire bent into the shape of some curve. Then the total mass $M$ of the weighted curve is naturally defined to be

\[ M = \int_{s_0}^{s_1} \mu(h(s))\,ds. \tag{2.1} \]

If the curve happens to be parametrized by something other than arc length, using instead a function $g(t)$ on $t_0 \le t \le t_1$, the arc length function $s$ is given by

\[ s(t) = \int_{t_0}^{t} |\dot{g}(u)|\,du, \qquad t_0 \le t \le t_1. \]

The significance of this equation for the two vector functions $h(s)$ and $g(t)$ that parametrize the curve is that $h(s(t)) = g(t)$ for $t_0 \le t \le t_1$. Since $ds/dt = |\dot{g}(t)|$, the one-dimensional change-of-variable theorem for integrals applied to the integral for $M$ gives

\[ M = \int_{s_0}^{s_1} \mu(h(s))\,ds = \int_{t_0}^{t_1} \mu(h(s(t)))\,\frac{ds(t)}{dt}\,dt = \int_{t_0}^{t_1} \mu(g(t))\,|\dot{g}(t)|\,dt. \]

The differential $ds = |\dot{g}(t)|\,dt$ is called the arc length differential for a curve parametrized by $g(t)$.

Consider a full turn of the helix described by

\[ g(t) = (a \cos t, a \sin t, t), \qquad a > 0, \quad 0 \le t \le 2\pi. \]

Suppose that the density of the helix at a point $\mathbf{x}$ is equal to the square of the distance from $\mathbf{x}$ to the midpoint $\mathbf{q} = (0, 0, \pi)$ of the helix's axis. Thus the density at $g(t)$ is

\[ |g(t) - \mathbf{q}|^2 = a^2 \cos^2 t + a^2 \sin^2 t + (t - \pi)^2 = a^2 + (t - \pi)^2. \]

Since $|\dot{g}(t)| = \sqrt{(-a \sin t)^2 + (a \cos t)^2 + 1} = \sqrt{a^2 + 1}$, the total mass is

\[
\begin{aligned}
M_a &= \int_0^{2\pi} (a^2 + (t - \pi)^2)\sqrt{a^2 + 1}\,dt \\
&= \sqrt{a^2 + 1} \int_{-\pi}^{\pi} (a^2 + t^2)\,dt = \sqrt{a^2 + 1}\,(2\pi a^2 + 2\pi^3/3).
\end{aligned}
\]
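As a numerical cross-check of this computation, the mass integral $\int \mu(g(t))\,|\dot{g}(t)|\,dt$ can be evaluated directly from the parametrization. This is only a sketch: the helper names are hypothetical, and Simpson's rule is an illustrative choice of quadrature rather than anything prescribed by the text.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def helix_mass(a):
    q = (0.0, 0.0, math.pi)  # midpoint of the helix's axis

    def g(t):
        return (a * math.cos(t), a * math.sin(t), t)

    def speed(t):
        # |g'(t)| for g'(t) = (-a sin t, a cos t, 1)
        return math.sqrt((-a * math.sin(t)) ** 2 + (a * math.cos(t)) ** 2 + 1.0)

    def density(t):
        # Squared distance from g(t) to q
        return sum((p - c) ** 2 for p, c in zip(g(t), q))

    return simpson(lambda t: density(t) * speed(t), 0.0, 2.0 * math.pi)

def closed_form(a):
    # The value sqrt(a^2 + 1) (2 pi a^2 + 2 pi^3 / 3) found in the example
    return math.sqrt(a * a + 1.0) * (2.0 * math.pi * a * a + 2.0 * math.pi**3 / 3.0)

for a in (0.5, 1.0, 2.0):
    print(a, helix_mass(a), closed_form(a))
```

Because the integrand here is a quadratic polynomial in $t$ times a constant, Simpson's rule reproduces the closed form up to rounding error; for a less convenient density the same `helix_mass` pattern would give only an approximation.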



FIGURE 8.6

Surfaces of Revolution. Certain weightings of a curve provide a way to find some surface areas. By a surface of revolution we'll mean a surface $S$ in $\mathbb{R}^3$ that is generated by rotating a plane curve about a fixed line lying in the plane of the curve. For example, a curve in $\mathbb{R}^2$ parametrized by $(x(s), y(s))$, $s_0 \le s \le s_1$, can be rotated about a variety of lines including, but not restricted to, the $x$-axis and the $y$-axis, to produce a surface. See Figure 8.6 where the rotation is around the $y$-axis. During rotation a point at distance $s$ along the curve traces out a circle of radius $r(s)$ and circumference $2\pi r(s)$ centered on the axis of rotation. It's plausible that the area swept out by a short segment of length $ds$ is about $2\pi r(s)\,ds$, so it's natural to define the surface area $\sigma(S)$ of $S$ by an integral with respect to arc length

\[ \sigma(S) = \int_{s_0}^{s_1} 2\pi r(s)\,ds. \tag{2.2} \]

This definition gives the expected result for commonly met surfaces such as cylinders, spheres, and cones, and is a special case of a more comprehensive definition given in the next chapter that includes surfaces that aren't necessarily surfaces of revolution.

In a typical application we're given a curve parametrized in $\mathbb{R}^2$ by $g(t) = (x(t), y(t))$, $t_0 \le t \le t_1$. The arc length element is $ds = \sqrt{\dot{x}(t)^2 + \dot{y}(t)^2}\,dt$. If $y(t) \ge 0$ and we rotate the curve about the $x$-axis then $r(s(t)) = y(t)$. Hence

\[ \sigma(S) = \int_{t_0}^{t_1} 2\pi y(t)\sqrt{\dot{x}(t)^2 + \dot{y}(t)^2}\,dt. \tag{2.3} \]

The circle parametrized by $(y(t), z(t)) = (a \cos t, b + a \sin t)$, $0 \le t \le 2\pi$ has radius $a$ and center at $(0, b)$ in the $yz$ plane. If $0 < a < b$, rotating about the $y$-axis generates a torus or "donut" surface $T$, shown in Figure 8.7(a). The arc length element is $|\dot{g}(t)|\,dt = \sqrt{(-a \sin t)^2 + (a \cos t)^2}\,dt = a\,dt$, and $r(s(t)) = (b + a \sin t)$. Then

\[
\begin{aligned}
\sigma(T) &= \int_0^{2\pi} 2\pi (b + a \sin t)\,a\,dt \\
&= 2\pi ab \int_0^{2\pi} dt + 2\pi a^2 \int_0^{2\pi} \sin t\,dt = 4\pi^2 ab.
\end{aligned}
\]

FIGURE 8.7 (a) (b)

If $0 < b < a$, the surface of revolution is more complicated than a torus; to make the surface area computation valid we would have to replace the integrand by its absolute value.
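The torus computation is easy to confirm numerically by applying the surface-area integral $\int 2\pi r\,ds$ directly. The sketch below is an illustration under stated assumptions: the function name, the midpoint rule, and the sample values $a = 1$, $b = 3$ are hypothetical choices made only for this check.

```python
import math

def surface_area_of_revolution(radius, speed, t0, t1, n=1000):
    """Midpoint-rule approximation of the integral of 2*pi*radius(t)*speed(t) dt.

    radius(t) is the distance from the curve point to the axis of rotation,
    and speed(t) is |g'(t)|, so speed(t) dt plays the role of ds.
    """
    h = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * h
        total += 2.0 * math.pi * radius(t) * speed(t) * h
    return total

a, b = 1.0, 3.0  # sample values with 0 < a < b

torus_area = surface_area_of_revolution(
    radius=lambda t: b + a * math.sin(t),  # distance from the y-axis
    speed=lambda t: a,                     # |g'(t)| = a on the circle
    t0=0.0,
    t1=2.0 * math.pi,
)
torus_exact = 4.0 * math.pi**2 * a * b

print(torus_area, torus_exact)
```

With $0 < b < a$ the radius function $b + a \sin t$ changes sign, which is the situation in which the text says the integrand must be replaced by its absolute value; in this sketch that would mean passing `abs(b + a * math.sin(t))` as the radius.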

If we rotate the graph $y = f(x)$ for a differentiable function $f(x)$, $a \le x \le b$, we can use $x$ as the parameter in $g(x) = (x, f(x))$. Then $ds = \sqrt{1 + f'(x)^2}\,dx$, and $r(s(x)) = f(x)$. If $f(x) \ge 0$, the area of the surface generated by rotating the graph about the $x$-axis is

\[ \sigma(S) = \int_a^b 2\pi f(x)\sqrt{1 + f'(x)^2}\,dx. \]

A line segment of length $l$ extends from the origin in $\mathbb{R}^2$ to a point $h$ units above the positive $x$-axis and is then rotated about the $x$-axis to produce a cone $C$. Slitting the cone along the segment allows it to be rolled out flat as a sector of a circle, as shown in Figure 8.7(b). The circular arc of the sector has length $2\pi h$, which is the circumference of the cone's base. The area of the sector is thus $2\pi h/(2\pi l) = h/l$ times the area $\pi l^2$ of a full circle of radius $l$. Hence the cone should have area $\sigma(C) = (h/l)\pi l^2 = \pi h l$.

Using instead the previous displayed formula, we'll represent the segment as the graph of $f(x) = (h/\sqrt{l^2 - h^2})\,x$ for $0 \le x \le \sqrt{l^2 - h^2}$. We find $ds = (l/\sqrt{l^2 - h^2})\,dx$. In agreement with our purely geometric computation we get

\[
\begin{aligned}
\sigma(C) &= \int_0^{\sqrt{l^2 - h^2}} 2\pi \left( h/\sqrt{l^2 - h^2} \right) x \left( l/\sqrt{l^2 - h^2} \right) dx \\
&= \frac{2\pi h l}{l^2 - h^2} \int_0^{\sqrt{l^2 - h^2}} x\,dx = \pi h l.
\end{aligned}
\]

This example provides some supporting evidence for the correctness of the definition of $\sigma(S)$.
EXERCISES

In Exercises 1 to 4, find the length $l(\gamma)$ of the indicated curves.

1. $(x, y) = (t, \ln \cos t)$, $0 \le t \le 1$.

2. $(x, y) = (t^2, \tfrac{2}{3}t^3 - \tfrac{1}{2}t)$, $0 \le t \le 2$.

3. $y = x^{3/2}$, $0 \le x \le 5$.

4. $g(t) = (6t^2, 4\sqrt{2}\,t^3, 3t^4)$, $-1 \le t \le 2$.

5. If a curve is described in plane polar coordinates by a function $r = f(\theta)$ for $a \le \theta \le b$, then in rectangular coordinates the curve may be parametrized by
   \[ (x, y) = (r \cos\theta, r \sin\theta) = (f(\theta)\cos\theta, f(\theta)\sin\theta), \qquad a \le \theta \le b. \]
   (a) Show that the arc length formula for a curve $r = f(\theta)$ in polar coordinates is
   \[ l(\gamma) = \int_a^b \sqrt{f(\theta)^2 + f'(\theta)^2}\,d\theta. \]
   (b) Sketch the curve given by $r = (1 + \cos\theta)$ for $0 \le \theta \le \pi$ and find its length.

6. A 5-foot piece of wire is coiled in a uniform spiral 3 inches in diameter. Find the height of the coil if it contains six complete turns.

7. Find the total mass of the helix $g(t) = (a \cos t, a \sin t, bt)$, $0 \le t \le 2\pi$, if its density per unit length at $(x, y, z)$ is equal to $x^2 + y^2 + z^2$.

8. Let $\gamma$ be a continuously differentiable curve with endpoints $\mathbf{p}_1$ and $\mathbf{p}_2$. Let $\lambda$ be the line segment $\mathbf{p}_1 + t(\mathbf{p}_2 - \mathbf{p}_1)$, $0 \le t \le 1$. Prove that $l(\lambda) \le l(\gamma)$. Thus the shortest distance between two points is a straight line. [Hint: Use the result of Exercise 7(c) of Chapter 7, Section 3.]

9. Find the total mass of the wire with shape $(x, y, z) = (6t^2, 4\sqrt{2}\,t^3, 3t^4)$, $0 \le t \le 1$,
   (a) if the density at the point corresponding to $t$ is $t^2$.
   (b) if the density at a point is equal to the square of its distance from the $yz$-plane.

10. Suppose $\gamma$ is given by $g(t)$ for $a \le t \le b$ and $\gamma$ is then reparametrized by arc length $s$ so that $t = t(s)$. Show that the line integral equation
    \[ \int_a^b \mathbf{F}(g(t)) \cdot g'(t)\,dt = \int_0^{l(\gamma)} \mathbf{F}(h(s)) \cdot \mathbf{t}(s)\,ds \]
    holds, where $h(s) = g(t(s))$ and $\mathbf{t}(s) = (dh/ds)(s)$. [Hint: Use the change of variable theorem for integrals.]

11. Compute the surface area of a sphere $S_a$ of radius $a$ in two ways using Equation 2.2.
    (a) Parametrize a semicircle in $\mathbb{R}^2$ by $g(t) = (a \cos t, a \sin t)$, $0 \le t \le \pi$. Show that $ds = a\,dt$ and $r(s(t)) = a \sin t$. Then rotate the semicircle about the horizontal axis to get $\sigma(S_a) = 4\pi a^2$.
    (b) Parametrize a semicircle by $g(t) = (t, \sqrt{a^2 - t^2})$, $-a \le t \le a$. Show that $ds = a(a^2 - t^2)^{-1/2}\,dt$ and $r(s(t)) = (a^2 - t^2)^{1/2}$. Then rotate the semicircle about the horizontal axis to get $\sigma(S_a) = 4\pi a^2$.

12. The graph of $y = a \cosh(x/a)$, $0 \le x \le b$, is rotated about the $x$-axis in $\mathbb{R}^2$ to generate a surface $S$. Find $\sigma(S)$.

13. (a) Set up an integral for the arc length of the ellipse in $\mathbb{R}^2$ parametrized by $(x, y) = (a \cos t, b \sin t)$, $0 \le t \le 2\pi$.
    (b) Assume $a > b$ and show that the arc length integral found in part (a) is equal to
    \[ 4a \int_0^{\pi/2} \sqrt{1 - k^2 \sin^2 t}\,dt, \qquad k^2 = (1 - b^2/a^2). \]
    This integral is a standard form of an elliptic integral; it can't be evaluated using elementary functions if $0 < k^2 < 1$.
    (c) Approximate the length of the ellipse if $a = 2$ and $b = 1$, either by direct numerical approximation of the integral using Simpson's rule or by finding the value of the elliptic integral in a table.

14. If $\gamma$ is given by $g(t) = (t, \tfrac{2}{3}t^{3/2})$, and $h(s) = g((1 + \tfrac{3}{2}s)^{2/3} - 1)$, show that $|h'(s)| = 1$ for $s > 0$, and that the parametrization $h(s)$ is an arc length parametrization for $\gamma$.

15. Compute $|\dot{g}(t)|$ for the helix $g(t) = (\cos t, \sin t, t)$ and use the result to find an arc length parametrization for this helix.

16. In Example 3 of the text, if the radius $a$ of the helix tends to zero the total mass $M_a$ tends to $2\pi^3/3$. What geometric interpretation does this number have in the present context?

17. The centroid of a curve $\gamma$ of finite length $l(\gamma)$ is the average position $\mathbf{p}_0$ of the points on the curve. Thus the vector $\mathbf{p}_0$ is given in terms of an arc length parametrization $h(s)$ or more general parametrizations $g(t)$ in a vector-valued integral by
    \[ \mathbf{p}_0 = \frac{1}{l(\gamma)} \int_{s_0}^{s_1} h(s)\,ds = \frac{1}{l(\gamma)} \int_{t_0}^{t_1} g(t)\,|\dot{g}(t)|\,dt. \]
    (a) Let $\gamma$ be an arc of a circle of radius $a$ such that the ends of the arc subtend angle $\theta$ at the center of the circle. Show that the centroid of $\gamma$ lies at distance $(2a/\theta)\sin(\theta/2)$ from the center of the circle, measured along the line from the center of the circle to the midpoint of the arc.
    (b) Show that the half-turn of a helix parametrized by $g(t) = (a \cos t, a \sin t, bt)$, $0 \le t \le \pi$ has its centroid at $\mathbf{p}_0 = (0, 2a/\pi, b\pi/2)$.

18. Using the definition of centroid of a curve $\gamma$ in the previous exercise, prove Pappus's theorem: Rotating a plane curve about a line in the same plane generates a surface $S$ of area $\sigma(S)$ equal to $l(\gamma)$ times the circumference of the circle traced by rotating the centroid of $\gamma$ about the line. [Hint: The distance in $\mathbb{R}^2$ from a point $\mathbf{y}$ to a line is $|\mathbf{n} \cdot (\mathbf{y} - \mathbf{x}_0)|$, where $\mathbf{n} \cdot (\mathbf{y} - \mathbf{x}_0) = 0$ is a normalized equation for the line.]

SECTION 3 NORMAL VECTORS AND CURVATURE

The purpose of this section is to analyze the connection between the shape of a smooth curve in space and the variety of possible motions that can occur along the path of the curve. We'll denote position on a curve as a function of time by $\mathbf{x} = \mathbf{x}(t)$ and assume $\mathbf{x}(t)$ is twice continuously differentiable. Since arc length $s(t)$ along a curve parametrized by $\mathbf{x}(t)$ is an integral of speed $|\dot{\mathbf{x}}(t)|$, it follows that the speed is $v = ds/dt$, or sometimes more conveniently $v = \dot{s}$. Thus

\[ \frac{ds}{dt}(t) = \dot{s}(t) = |\dot{\mathbf{x}}(t)|. \]

It's customary to denote the vector of length 1 having the same direction as the velocity or tangent vector $\mathbf{v}(t) = \dot{\mathbf{x}}(t)$ by $\mathbf{t}(t)$. Recall that, by definition, $\dot{s} \ne 0$ on a smooth curve. Thus we can write $\dot{\mathbf{x}}(t) = \dot{s}(t)\mathbf{t}(t)$ or $\mathbf{t} = (1/\dot{s})\dot{\mathbf{x}}$, with $|\mathbf{t}| = 1$.

Turning to the acceleration vector along the curve, we have by definition $\ddot{\mathbf{x}} = \dot{\mathbf{v}} = d(\dot{s}\mathbf{t})/dt$. Now apply the product rule for a scalar times a vector, Formula 1.3 in Chapter 4, Section 1, to get

\[ \ddot{\mathbf{x}} = \ddot{s}\,\mathbf{t} + \dot{s}\,\dot{\mathbf{t}}. \tag{3.1} \]

As a first step in interpreting Equation 3.1, we'll verify in the following proof that the vector $\dot{\mathbf{t}}(t)$ is orthogonal to $\mathbf{t}(t)$, that is $\mathbf{t}(t) \cdot \dot{\mathbf{t}}(t) = 0$. Since $\mathbf{t} \ne 0$, this means that either (i) $\dot{\mathbf{t}} = 0$ or else (ii) $\dot{\mathbf{t}}$ is perpendicular to $\mathbf{t}$. In case $\dot{\mathbf{t}}(t) \ne 0$, we define a unit vector $\mathbf{n}(t)$ called the principal normal to the curve at a point by $\mathbf{n}(t) = (|\dot{\mathbf{t}}|^{-1})\dot{\mathbf{t}}$. Thus $\dot{\mathbf{t}} = |\dot{\mathbf{t}}|\mathbf{n}$. This observation allows us to refine Equation 3.1 as follows.

3.2 Theorem. For a twice-differentiable smooth curve $\mathbf{x}(t)$, the acceleration vector is expressible as a sum of orthogonal components as

\[ \ddot{\mathbf{x}} = \ddot{s}\,\mathbf{t} + \dot{s}\,|\dot{\mathbf{t}}|\,\mathbf{n}. \]

Proof. The orthogonality of $\mathbf{t}$ and $\mathbf{n}$ follows from having $|\mathbf{t}|^2 = \mathbf{t} \cdot \mathbf{t}$ equal to a constant, namely 1 in this case; just differentiate $\mathbf{t} \cdot \mathbf{t} = 1$ with respect to time $t$ using the product rule. On the left we get $(d/dt)\,\mathbf{t} \cdot \mathbf{t} = \dot{\mathbf{t}} \cdot \mathbf{t} + \mathbf{t} \cdot \dot{\mathbf{t}} = 2\,\mathbf{t} \cdot \dot{\mathbf{t}}$. On the right side the derivative of 1 is 0, so $\mathbf{t} \cdot \dot{\mathbf{t}} = 0$. Hence $\mathbf{t}$ and $\dot{\mathbf{t}}$ are orthogonal. Since $\dot{\mathbf{t}} = |\dot{\mathbf{t}}|\mathbf{n}$, we can rewrite Equation 3.1 as claimed. $\blacksquare$

FIGURE 8.8

Figure 8.8 is a typical picture of how $\mathbf{t}$ and $\mathbf{n}$ relate to the path followed by a curve. The two orthogonal components, $\mathbf{a}_t = \ddot{s}\,\mathbf{t}$ and $\mathbf{a}_n = \dot{s}\,|\dot{\mathbf{t}}|\,\mathbf{n}$ are called respectively the tangential acceleration and the centripetal acceleration of motion along the curve. The tangential acceleration measures the rate of change of speed along the curve. The centripetal acceleration measures the rate at which the motion bends away from a straight-line path. The force that bends the path of a particle of mass $m$ generates the centripetal acceleration, so the centripetal component of that force is $m\mathbf{a}_n$, and the total force is $m\mathbf{a}_t + m\mathbf{a}_n$.

If a path is traversed with constant speed, i.e., $\dot{s} = v_0 = \text{const.}$, then $\ddot{s} = 0$, and Equation 3.1 reduces to $\ddot{\mathbf{x}} = v_0|\dot{\mathbf{t}}|\mathbf{n}$. Thus any acceleration vectors $\ddot{\mathbf{x}}$ are perpendicular to the path of motion, i.e., are centripetal. This is true, for example, for the helical motion $\mathbf{x}(t) = (a \cos t, a \sin t, bt)$, since $\dot{s} = \sqrt{a^2 + b^2}$ is constant.

We can get more insight into centripetal acceleration by introducing a measure of the shape of a curve at each point called curvature. If $\mathbf{x} = \mathbf{x}(t)$ traces a twice-differentiable smooth curve then the curvature at $\mathbf{x}(t)$ is the scalar

\[ \kappa(t) = \left| \frac{d\mathbf{t}}{ds} \right|. \tag{3.3} \]

The unit tangent vector $\mathbf{t}$ and the arc length $s$ are intrinsic to the geometry of the image path, so the derivative $d\mathbf{t}/ds$ is intrinsic also. Therefore the scalar curvature $\kappa(t)$ is an intrinsic measure of the turning rate of the unit tangent vector $\mathbf{t}$, viewed as a function of arc length. In particular, if the tangent vector doesn't turn at all, as for a straight line when $\mathbf{t}$ is constant, we get $\kappa = 0$.

The definition of curvature involves a derivative with respect to arc length $s$, so a direct approach to computing curvature would start with introducing arc length as parameter. The following theorem allows us to avoid this step, often awkward in practice; it also allows us to identify the role that curvature plays in the centripetal term $\mathbf{a}_n = \dot{s}\,|\dot{\mathbf{t}}|\,\mathbf{n}$ of Theorem 3.2.

3.4 Theorem. For a twice-differentiable smooth curve $\mathbf{x}(t)$, the curvature function is determined by $|\dot{\mathbf{t}}| = \dot{s}\kappa$ so $\kappa = |\dot{\mathbf{t}}|/\dot{s}$. Hence the acceleration vector of Theorem 3.2 equals $\ddot{\mathbf{x}} = \ddot{s}\,\mathbf{t} + \dot{s}^2\kappa\,\mathbf{n}$.

Proof. By the chain rule for vector functions of a scalar, $d\mathbf{t}/dt = \dot{s}\,(d\mathbf{t}/ds)$. Hence, since $\dot{s} \ge 0$, $|\dot{\mathbf{t}}| = |\dot{s}\,(d\mathbf{t}/ds)| = \dot{s}\,|d\mathbf{t}/ds| = \dot{s}\kappa$. Since $\dot{s} > 0$ for smooth curves we can write $\kappa = |\dot{\mathbf{t}}|/\dot{s}$. $\blacksquare$

EXAMPLE 2

The helix $\mathbf{x}(t) = (a \cos t, a \sin t, bt)$ has radius $a > 0$ and vertical climb rate $b > 0$. To find its curvature $\kappa$ we first compute

\[ \dot{\mathbf{x}}(t) = (-a \sin t, a \cos t, b). \]

Hence $\dot{s} = \sqrt{a^2 \sin^2 t + a^2 \cos^2 t + b^2} = \sqrt{a^2 + b^2}$. Hence

\[ \mathbf{t}(t) = (1/\dot{s})\,\dot{\mathbf{x}}(t) = (a^2 + b^2)^{-1/2}(-a \sin t, a \cos t, b), \]

so $\dot{\mathbf{t}}(t) = (a^2 + b^2)^{-1/2}(-a \cos t, -a \sin t, 0)$. Then $|\dot{\mathbf{t}}| = a(a^2 + b^2)^{-1/2}$, so $\kappa = (1/\dot{s})|\dot{\mathbf{t}}| = a/(a^2 + b^2)$.

As $b$ tends to 0 with $a$ fixed, a single turn of the helix approaches a circle of radius $a$, while $\kappa$ approaches $1/a$. Decreasing $a$ then gives greater curvature, so qualitatively tighter curling goes with greater curvature. On the other hand, as $b$ gets large a turn of the helix stretches out in the direction of its axis, making $\kappa$ tend to 0. If $b = 0$, the curve we get is a circle of radius $a$ with constant curvature $\kappa = 1/a$.

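Some special formulas for computing curvature are taken up in Exercises 12 to 15. Here are two, one written directly in terms of the time derivatives $\dot{\mathbf{x}}$ and $\ddot{\mathbf{x}}$ of the position vector $\mathbf{x}(t)$ that traces the curve, another for the graph of $y = f(x)$.

\[ \kappa(t) = \frac{\sqrt{|\dot{\mathbf{x}}|^2 |\ddot{\mathbf{x}}|^2 - (\dot{\mathbf{x}} \cdot \ddot{\mathbf{x}})^2}}{|\dot{\mathbf{x}}|^3}, \quad \text{assuming } \dot{\mathbf{x}} \ne 0. \tag{3.5} \]

\[ \kappa(x) = \frac{|y''|}{(1 + (y')^2)^{3/2}}. \tag{3.6} \]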
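The velocity-and-acceleration formula for curvature, $\kappa = \sqrt{|\dot{\mathbf{x}}|^2|\ddot{\mathbf{x}}|^2 - (\dot{\mathbf{x}} \cdot \ddot{\mathbf{x}})^2}\,/\,|\dot{\mathbf{x}}|^3$, translates directly into a few lines of code. The following sketch is illustrative: the function names and sample values are hypothetical, and the derivatives are entered by hand rather than computed symbolically. It checks the formula against the helix value $a/(a^2 + b^2)$ and against the graph formula applied to $y = x^2$.

```python
import math

def curvature(vel, acc):
    """Curvature from velocity and acceleration vectors of any dimension."""
    vv = sum(c * c for c in vel)
    aa = sum(c * c for c in acc)
    va = sum(p * q for p, q in zip(vel, acc))
    # max(..., 0.0) guards against tiny negative values caused by rounding
    return math.sqrt(max(vv * aa - va * va, 0.0)) / vv**1.5

def helix_curvature(a, b, t):
    vel = (-a * math.sin(t), a * math.cos(t), b)
    acc = (-a * math.cos(t), -a * math.sin(t), 0.0)
    return curvature(vel, acc)

def parabola_curvature(t):
    # x(t) = (t, t^2): velocity (1, 2t), acceleration (0, 2)
    return curvature((1.0, 2.0 * t), (0.0, 2.0))

a, b = 2.0, 1.0
print(helix_curvature(a, b, 0.7), a / (a * a + b * b))
print(parabola_curvature(0.5), 2.0 / (1.0 + 4.0 * 0.5**2) ** 1.5)
```

For the parabola the second printed value is the graph formula $|y''|/(1 + (y')^2)^{3/2}$ with $y' = 2x$ and $y'' = 2$, so agreement between the two numbers is a small consistency check between the two formulas.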

EXERCISES

1. Show that the curvature of a plane circular path of radius $a > 0$ is $1/a$.

2. Find the curvature $\kappa(t)$ of the parabola $\mathbf{x}(t) = (t, t^2)$ for $-\infty < t < \infty$.

3. Centripetal acceleration $\mathbf{a}_n$ increases in magnitude if either speed $\dot{s}$ or curvature $\kappa$ is increased by a factor $p > 1$. Which does more to increase $|\mathbf{a}_n|$?

4. For the circular helix motion $\mathbf{x}(t) = (a \cos t, a \sin t, bt)$, show that the tangential component of the acceleration is always zero and that the centripetal component at a point of the path is directed toward the axis of the helix.

5. Equation 3.5 expresses curvature $\kappa$ in terms of the square root of an expression of the form $|\mathbf{a}|^2|\mathbf{b}|^2 - (\mathbf{a} \cdot \mathbf{b})^2$; is this expression always nonnegative? Explain.

6. Motion along a linear path can be described by $\mathbf{x}(t) = \phi(t)\mathbf{c} + \mathbf{d}$, $\mathbf{c} \ne 0$, where we assume that the real-valued function $\phi(t)$ is strictly increasing and has two continuous derivatives. Show that the acceleration vector has centripetal component identically zero.

7. Here is the converse to the statement in the previous exercise: Suppose $\mathbf{x}(t)$ has the centripetal component of its acceleration identically equal to zero, but that $\dot{s}(t) \ne 0$. Then the path of motion is a straight line. Prove this as follows.
   (a) Show that $\dot{\mathbf{t}} = 0$ and hence that $\dot{\mathbf{x}} = \dot{s}\mathbf{c}$ for some constant vector $\mathbf{c} \ne 0$.
   (b) Integrate the result of part (a) with respect to time $t$ to show that $\mathbf{x}(t) = s(t)\mathbf{c} + \mathbf{d}$ for some constant vector $\mathbf{d}$, so that the path of motion is a line.

8. Let $a$, $b$ and $\omega$ be positive constants. Let $g(t) = (a \cos \omega t, a \sin \omega t, bt)$, $t \ge 0$.
   (a) Find explicitly the arc length parametrization $h(s)$ of the curve.
   (b) Find the unit tangent and principal normal vectors at an arbitrary point $h(s)$.
   (c) Find the curvature $\kappa(s)$.

9. Show that the curve $(x, y) = (\cos s, \sin s)$, $0 \le s \le 2\pi$ is parametrized by arc length. Sketch the curve together with its velocity and acceleration vectors at $s = \pi/2$.

10. (a) Show that for a line given by $g(t) = t\mathbf{x}_1 + \mathbf{x}_0$, the curvature is identically zero.
    (b) Show that if a curve $\gamma$, parametrized by arc length and given by a function $f(s)$, has a tangent at every point and has curvature identically zero, then $\gamma$ is a straight line.

11. Use Theorem 3.2 to show that if a particle of constant mass $m$ moves so that at time $t$ it is at $\mathbf{x}(t)$, then the work done in traversing a part of the path having length $s_0$ is equal to an integral with respect to arc length $s$ along $\mathbf{x}(t)$ in the form
    \[ W = \int_0^{s_0} m\,\frac{d^2 s}{dt^2}\,ds. \]

12. Here are two different ways to prove Equation 3.5 for curvature.
    (a) Verify the equation using the substitutions $\dot{\mathbf{x}} = \dot{s}\,\mathbf{t}$ and $\ddot{\mathbf{x}} = \ddot{s}\,\mathbf{t} + \dot{s}^2\kappa\,\mathbf{n}$. Then expand the dot products and use the relations $\mathbf{t} \cdot \mathbf{t} = \mathbf{n} \cdot \mathbf{n} = 1$ and $\mathbf{t} \cdot \mathbf{n} = 0$.
    (b) Derive the equation by first computing the time derivative of the vector $\mathbf{t} = (\dot{\mathbf{x}} \cdot \dot{\mathbf{x}})^{-1/2}\dot{\mathbf{x}}$. Then use the formula $\kappa = |\dot{\mathbf{t}}|/|\dot{\mathbf{x}}|$.

13. A twice differentiable real-valued function $f(x)$ has as its graph a smooth curve in $\mathbb{R}^2$.
    (a) Show that the curvature of the graph of $f$ is
    \[ \kappa(x) = \frac{|f''(x)|}{(1 + f'(x)^2)^{3/2}}. \]
    This follows from Equation 3.5 if we use $x$ as parameter in $(x, y) = (x, f(x))$.
    (b) If $y = \cos x$ on $[-\pi/2, \pi/2]$, find the maximum and minimum of $\kappa(x)$.
    (c) Find all the points of maximum and minimum $\kappa(x)$ on the graph of $y = x^4$.

14. (a) Use Equation 3.5 to show that if $\theta$ is the angle between $\dot{\mathbf{x}}$ and $\ddot{\mathbf{x}}$, then
    \[ \kappa = \frac{|\ddot{\mathbf{x}}|\,|\sin\theta|}{|\dot{\mathbf{x}}|^2}. \]
    (b) If the speed and magnitude of acceleration are given, what does part (a) tell you about the dependence of $\kappa$ on $\theta$? Explain why your answer to this question agrees, or fails to agree, with your physical intuition.

15. (a) Show that for a special case of a plane curve parametrized by two scalar functions $x(t)$ and $y(t)$ Equation 3.5 reduces to
    \[ \kappa = \frac{|\dot{x}\ddot{y} - \dot{y}\ddot{x}|}{(\dot{x}^2 + \dot{y}^2)^{3/2}}. \]
    (b) Find the curvature $\kappa(t)$ for $x = t^2$, $y = t^3$ when $t \ne 0$. This is not a smooth curve at $t = 0$; what is the limit of $\kappa(t)$ as $t$ tends to 0?

16. Position on a helical curve is $\mathbf{x}(t) = (a \cos\phi(t), a \sin\phi(t), b\phi(t))$, where $\phi(t)$ is a twice-differentiable real-valued function of time $t$. Imagine a bead of mass 1 that starts sliding from rest at $t = 0$, with negligible friction, on a helical wire whose central axis is vertical, and so parallel to the acceleration $-g$ of gravity. In that case $\phi(t) = -\tfrac{1}{2}bgt^2/(a^2 + b^2)$.
    (a) Show that for $t > 0$, $\dot{s}(t) = bgt/\sqrt{a^2 + b^2}$.
    (b) Since curvature depends only on the shape of the helix, which depends on $a$ and $b$ but not $\phi(t)$, the curvature $\kappa(t)$ is given by Example 2 of the text. Use this information to show that the tangential and normal components of the acceleration vector have respective magnitudes $bg/\sqrt{a^2 + b^2}$ and $ab^2g^2t^2/(a^2 + b^2)^2$.

*17. If a smooth curve is parametrized by arc length, its curvature is $\kappa(s) = |(d/ds)\mathbf{t}(s)|$. Show that if $\theta(s, h)$ is the angle between $\mathbf{t}(s)$ and $\mathbf{t}(s + h)$, which tends to zero as $h$ tends to zero, then
    \[ \kappa(s) = \lim_{h \to 0} \frac{|\theta(s, h)|}{h}. \]
    [Hint: Show that $|\mathbf{t}(s + h) - \mathbf{t}(s)| = \sqrt{2 - 2\cos\theta(s, h)}$.]

SECTION 4 FLOW LINES, DIVERGENCE, AND CURL

Suppose ~n ~ ~,, is a vector field. A differentiable curve x = x(t) with the


property that at each of its points x the corresponding velocity vector coincides x
with the field vector F(x) is called a flow line of the field F; thus x(t) = F{x(t)) for
each t in some interval a < t < b. If the field F represents the velocity field of a fluid
flow, a flow line models the path followed by a particular molecule of the fluid, or
alternatively the path traced by a small foreign object dropped into the fluid at some
point. Finding explicit formulas for flow lines is generally not possible except under
some special assumptions discussed in Chapter 12, Section 2. Indeed we often have
to resort to the approximate numerical methods of Chapter 12, Section 5, to make
accurate pictures of flow lines. However, given a reasonably accurate sketch of a
vector field, some flow lines can usually be sketched in roughly by hand; the idea is
to draw a flow line so it appears to be tangent to an arrow of the field if the arrow's
tail is on the curve. (The computer methods for drawing flow lines described in
Chapter 12 use essentially this idea.) Figure 8.9(a) shows such a sketch. The lengths
of the nearby field arrows give an indication of the speed with which a flow line is
traversed at a given point.

FIGURE 8.9 (a) $F(x, y) = \tfrac16(x - y)i + \tfrac16(x + y)j$; (b) $F(x, y) = \tfrac14 x\,i + \tfrac12 y\,j$

EXAMPLE 1 A sketch of the 2-dimensional vector field $F$ defined by the vector equation
$F(x, y) = \tfrac16(x - y)i + \tfrac16(x + y)j$ appears in Figure 8.9(a). For example, $F(1, 1) = \tfrac13 j$ is
represented by a vertical arrow of length $\tfrac13$ with its tail at the point $(1, 1)$. Four flow lines have
been sketched in with attention paid to their tangency relation to the arrows in the
field sketch.
EXAMPLE 2 If in the previous example we had wanted instead a sketch of the field $6F(x, y) =
(x - y)i + (x + y)j$ we might have preferred to settle for the picture shown in
Figure 8.9(a) anyway. The point is that making the arrows 6 times as long for $6F$
makes for a more cluttered picture, particularly if the domain is enlarged beyond the
one shown in the figure. In accepting the temporary convention that the arrow lengths
be scaled down by the factor $\tfrac16$, we make a clearer picture at the small expense of
asking our minds to interpret the picture as if the arrows were 6 times as long. The
flow lines will appear to be the same in either case, though they will be traced with
6 times the velocity in $6F$ as in $F$. There is nothing about the image of the flow lines
themselves that shows their velocities, so we rely on our interpretation of the field
arrows for that information.
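The idea of following the field arrows step by step can be tried out numerically. The sketch below is only an illustration, not the method of Chapter 12: it assumes the field $F(x, y) = \tfrac16(x - y)i + \tfrac16(x + y)j$ of Figure 8.9(a), uses classical Runge-Kutta steps, and compares the result with a closed-form flow line that can be checked by differentiation. The function names and the starting point $(1, 0)$ are choices made here for the example.

```python
import math

def field(x, y):
    """F(x, y) = (1/6)(x - y) i + (1/6)(x + y) j, the field of Figure 8.9(a)."""
    return ((x - y) / 6.0, (x + y) / 6.0)

def rk4_flow_line(f, x0, y0, t_end, steps):
    """Approximate the flow line through (x0, y0) with classical Runge-Kutta steps."""
    h = t_end / steps
    x, y = x0, y0
    path = [(x, y)]
    for _ in range(steps):
        k1x, k1y = f(x, y)
        k2x, k2y = f(x + 0.5 * h * k1x, y + 0.5 * h * k1y)
        k3x, k3y = f(x + 0.5 * h * k2x, y + 0.5 * h * k2y)
        k4x, k4y = f(x + h * k3x, y + h * k3y)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
        path.append((x, y))
    return path

def exact_flow_line(x0, y0, t):
    """Closed form: rotate (x0, y0) through the angle t/6 and stretch by e^(t/6)."""
    c, s, r = math.cos(t / 6.0), math.sin(t / 6.0), math.exp(t / 6.0)
    return (r * (c * x0 - s * y0), r * (s * x0 + c * y0))

path = rk4_flow_line(field, 1.0, 0.0, 6.0, 600)
print(path[-1], exact_flow_line(1.0, 0.0, 6.0))
```

The two printed points should agree to many decimal places. Replacing `field` by $6F$ traces the same spiral, only six times as fast, which is the point made about the arrow lengths.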

If a field $F$ is known to be a gradient field, with $\dot{x} = F(x) = \nabla f(x)$, then
Theorem 1.4 of Section 1C tells us that the flow lines are perpendicular to the level
sets of $f$, as shown in Figure 8.9(b). Thus for a gradient field $\nabla f$ we have the option
of drawing flow lines by first drawing level sets of $f$ and using these as a guide in
drawing flow lines.

EXAMPLE 3 Suppose $f(x, y) = \tfrac18 x^2 + \tfrac14 y^2$. The gradient field $F = \nabla f$ is then given by
$$F(x, y) = \tfrac14 x\,i + \tfrac12 y\,j.$$
Figure 8.9(b) shows a sketch of the field together with some level curves of the
function $f$. Some flow lines have also been sketched in. These flow lines have the
property that, as well as being tangent to vectors of the field, they are perpendicular
to the level curves that they cross. To sketch the flow lines we can use either tangency
to the field arrows or perpendicularity to the level curves as a guide, whichever seems
simpler. In this example the level curves are the family of ellipses $\tfrac18 x^2 + \tfrac14 y^2 = k$ or
equivalently, with $c = 8k$, $x^2 + 2y^2 = c$; the ellipses are fairly easy to draw, so we
might prefer using these as an aid in sketching the perpendicular flow lines instead
of first sketching the vector field.

Divergence of a Vector Field. Important aspects of the behavior of a differentiable
vector field $F$ and its flow lines are characterized by the divergence
of $F$, abbreviated $\operatorname{div} F(x)$ and defined as the sum of the main diagonal elements
of the derivative matrix $F'$. Thus the divergence of $F$ is a real-valued function
wherever $F$ is differentiable. For example, if $F$ has real-valued coordinate functions
$F_1, F_2, \ldots, F_n$, then
$$\operatorname{div} F(x, y) = \frac{\partial F_1}{\partial x}(x, y) + \frac{\partial F_2}{\partial y}(x, y), \quad \text{for } F: \mathbb{R}^2 \to \mathbb{R}^2, \tag{4.1}$$
$$\operatorname{div} F(x, y, z) = \frac{\partial F_1}{\partial x}(x, y, z) + \frac{\partial F_2}{\partial y}(x, y, z) + \frac{\partial F_3}{\partial z}(x, y, z), \tag{4.2}$$
for $F: \mathbb{R}^3 \to \mathbb{R}^3$. If we write $\nabla = (\partial/\partial x, \partial/\partial y, \partial/\partial z)$ and $F = (F_1, F_2, F_3)$ then it
makes sense to write $\operatorname{div} F = \nabla \cdot F$.

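EXAMPLE 4 The 2-dimensional field $F(x, y) = xy\,i + (x - y^2)j$ has
$$\operatorname{div} F(x, y) = \frac{\partial (xy)}{\partial x} + \frac{\partial (x - y^2)}{\partial y} = y + (-2y) = -y.$$

EXAMPLE 5 The 3-dimensional field $F(x, y, z) = x\,i + y\,j + z\,k$ has
$$\operatorname{div} F(x, y, z) = \frac{\partial x}{\partial x} + \frac{\partial y}{\partial y} + \frac{\partial z}{\partial z} = 1 + 1 + 1 = 3.$$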
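The two computations above can be checked numerically. The following is a minimal sketch, assuming central differences with a hypothetical step size `h`; the function name `divergence` and the sample points are choices made here, not part of the text.

```python
def divergence(F, point, h=1e-5):
    """Sum of central-difference estimates of dF_i/dx_i at the given point."""
    total = 0.0
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        total += (F(*forward)[i] - F(*backward)[i]) / (2.0 * h)
    return total

def example4(x, y):
    """F(x, y) = xy i + (x - y^2) j, whose divergence is -y."""
    return (x * y, x - y ** 2)

def example5(x, y, z):
    """F(x, y, z) = x i + y j + z k, whose divergence is 3."""
    return (x, y, z)

print(divergence(example4, (2.0, 3.0)))        # should be close to -3
print(divergence(example5, (1.0, -2.0, 0.5)))  # should be close to 3
```

The sum runs over the diagonal entries of the derivative matrix only, which is exactly the definition of the divergence.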

To attach some meaning to $\operatorname{div} F$, think of $F$ as the velocity field of a 2 or
3-dimensional fluid flow. Using Gauss's theorem in Sections 1C and 4B of the
following Chapter 9 we show that $\operatorname{div} F(x)$ is the expansion rate of fluid at $x$ per
unit of area and volume respectively; in terms of the density $\rho(x)$ of the fluid at $x$
this property can be expressed as the continuity equation
$$\operatorname{div} F(x) = -\frac{\partial \rho}{\partial t}(x). \tag{4.3}$$
In particular, if $\operatorname{div} F(x) < 0$ in a region then the fluid is contracting there and
becoming more dense, because $\rho_t > 0$. If $\operatorname{div} F(x) = 0$ the fluid volume and density
remain constant, and if $\operatorname{div} F(x) > 0$ the fluid is expanding and becoming less dense.
We can even use $\operatorname{div} F$ to measure the compression or expansion of a fluid such
as a gas in another way. Indeed a special case of Theorem 1.6 in Section 1D of
Chapter 12 shows that if $\operatorname{div} F$ is constant, then in time $t > 0$ a region of volume
$V$ will flow into a region of volume $e^{t \operatorname{div} F}$ times $V$; such pairs of regions appear in
Figure 8.10. For $t > 0$, the factor $e^{t \operatorname{div} F}$ is greater than 1 if $\operatorname{div} F > 0$, is equal to 1 if $\operatorname{div} F = 0$,
and less than 1 if $\operatorname{div} F < 0$. A basic assumption in this discussion is that no fluid is
being created or destroyed near $x$.

EXAMPLE 6 The field $F(x, y) = \tfrac18(-x + y)i + \tfrac18(-x - y)j$ shown in Figure 8.10(a) has
$\operatorname{div} F(x, y) = -\tfrac18 - \tfrac18 = -\tfrac14$ in all of $\mathbb{R}^2$, so the fluid is being compressed at a constant
rate per unit of area; in other words the fluid density is increasing. Following
the points of the shaded region in the second quadrant along the flow lines of the
field for some fixed time unit, we see the initial area compressed into a smaller
region in the first quadrant. If the flow direction is reversed, the fluid expands.
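The area factor $e^{t \operatorname{div} F}$ can be observed directly for this field. The sketch below is an illustration under stated assumptions: it takes $F(x, y) = \tfrac18(-x + y)i + \tfrac18(-x - y)j$ with $\operatorname{div} F = -\tfrac14$, moves the three vertices of a hypothetical triangle in the second quadrant along their flow lines with Runge-Kutta steps, and compares the ratio of areas with $e^{-t/4}$. Because this particular field is linear, its flow carries triangles to triangles, so following the vertices is enough.

```python
import math

def field(x, y):
    """F(x, y) = (1/8)(-x + y) i + (1/8)(-x - y) j, with div F = -1/4."""
    return ((-x + y) / 8.0, (-x - y) / 8.0)

def flow(f, point, t_end, steps=400):
    """Carry a point along its flow line for time t_end using Runge-Kutta steps."""
    h = t_end / steps
    x, y = point
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
        k3 = f(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
        k4 = f(x + h * k3[0], y + h * k3[1])
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return (x, y)

def triangle_area(p, q, r):
    """Area of the triangle with vertices p, q, r."""
    return 0.5 * abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1]))

start = [(-2.0, 1.0), (-1.0, 1.0), (-1.0, 2.0)]   # illustrative region, second quadrant
t = 3.0
end = [flow(field, p, t) for p in start]
ratio = triangle_area(*end) / triangle_area(*start)
print(ratio, math.exp(-t / 4.0))
```

The two printed numbers should agree closely. For a nonlinear field the image of a triangle is no longer a triangle, so a finer polygon or a Jacobian computation would be needed instead.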

EXAMPLE 7 If we were to reverse the direction of the field in the previous example and consider
instead the field $-F$, the arrows in the sketch would all reverse direction and the
flow lines would spiral outward from the origin as in Figure 8.9(a). We would have
$\operatorname{div}(-F)(x, y) = \tfrac14$, and conclude that we have expansion of areas along flow lines
of $-F$.

FIGURE 8.10 (a) $F(x, y) = \tfrac18(-x + y)i + \tfrac18(-x - y)j$; area decreased: $\operatorname{div} F(x, y) = -\tfrac14$. (b) $G(x, y) = -\tfrac14\bigl(y/\sqrt{x^2 + y^2}\bigr)i + \tfrac14\bigl(x/\sqrt{x^2 + y^2}\bigr)j$; area preserved: $\operatorname{div} G(x, y) = 0$.

EXAMPLE 8 The field $G(x, y) = -\tfrac14\bigl(y/\sqrt{x^2 + y^2}\bigr)i + \tfrac14\bigl(x/\sqrt{x^2 + y^2}\bigr)j$ shown in Figure 8.10(b) has
$$\operatorname{div} G(x, y) = \frac{\partial}{\partial x}\left(\frac{-y}{4\sqrt{x^2 + y^2}}\right) + \frac{\partial}{\partial y}\left(\frac{x}{4\sqrt{x^2 + y^2}}\right) = \frac{xy}{4(x^2 + y^2)^{3/2}} - \frac{xy}{4(x^2 + y^2)^{3/2}} = 0.$$

Since div G is identically 0 the points in the shaded region in the first quadrant move
along their flow lines during a fixed time interval into a region of the same area.

EXAMPLE 9 The 3-dimensional vector field $F(x, y, z) = xy^2 i - yz^2 j + x^2 z\,k$ would be fairly
complicated to sketch. However we can get some significant information about its
behavior by computing $\operatorname{div} F$. We find $\operatorname{div} F(x, y, z) = y^2 - z^2 + x^2$, so $\operatorname{div} F(x) < 0$
whenever $x^2 + y^2 < z^2$, that is, precisely when $x = (x, y, z)$ is inside a right circular
cone symmetric about the $z$-axis. Thus a fluid flow with velocity field $F$ would be
compressive inside the cone and expansive outside the cone.

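The Curl of a Vector Field. The term curl is meant to suggest that we're trying
to measure the local tendency of a vector field and its flow lines to circulate around
some axis. In dimension 3 the curl of a differentiable vector field $F = F_1 i + F_2 j + F_3 k$
is the 3-dimensional field
$$\operatorname{curl} F = \left(\frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z}\right) i + \left(\frac{\partial F_1}{\partial z} - \frac{\partial F_3}{\partial x}\right) j + \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) k. \tag{4.4}$$
To help in recalling the formula, we can express $\operatorname{curl} F$ as a kind of cross-product of
the gradient operator $(\partial/\partial x, \partial/\partial y, \partial/\partial z)$ and $F = (F_1, F_2, F_3)$. Thus
$$\operatorname{curl} F = \det\begin{pmatrix} i & j & k \\ \partial/\partial x & \partial/\partial y & \partial/\partial z \\ F_1 & F_2 & F_3 \end{pmatrix}.$$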

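EXAMPLE 10 If $F(x, y, z) = (y, z^2, x^3)$, then
$$\operatorname{curl} F = \det\begin{pmatrix} i & j & k \\ \partial/\partial x & \partial/\partial y & \partial/\partial z \\ y & z^2 & x^3 \end{pmatrix} = -2z\,i - 3x^2\,j - k.$$
We can conclude that along the $y$-axis, where $z = x = 0$, the vectors of the field
$\operatorname{curl} F$ all have length 1 and point in the direction of the negative $z$-axis.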
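Formula 4.4 translates directly into a numerical check. This is a rough sketch, assuming central differences with a hypothetical step `h`; the helper name `partial` and the sample points are choices made here for illustration.

```python
def curl(F, p, h=1e-5):
    """Central-difference estimate of curl F at the point p = (x, y, z)."""
    def partial(i, j):
        """Estimate of the partial derivative of component F_i in direction j."""
        fwd = list(p)
        bwd = list(p)
        fwd[j] += h
        bwd[j] -= h
        return (F(*fwd)[i] - F(*bwd)[i]) / (2.0 * h)
    return (partial(2, 1) - partial(1, 2),   # dF3/dy - dF2/dz
            partial(0, 2) - partial(2, 0),   # dF1/dz - dF3/dx
            partial(1, 0) - partial(0, 1))   # dF2/dx - dF1/dy

def example10(x, y, z):
    """F(x, y, z) = (y, z^2, x^3)."""
    return (y, z ** 2, x ** 3)

print(curl(example10, (0.0, 5.0, 0.0)))   # a point on the y-axis
print(curl(example10, (1.0, 2.0, 3.0)))   # a general point
```

At the point on the $y$-axis the estimate should be close to $(0, 0, -1)$, a unit vector along the negative $z$-axis, and at $(1, 2, 3)$ it should be close to $(-6, -3, -1)$.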

EXAMPLE 11 Sometimes we can choose coordinates so that the third coordinate function of the
given vector field is identically zero and the other two coordinate functions are
independent of $z$, that is,
$$F(x, y, z) = F_1(x, y)\,i + F_2(x, y)\,j.$$
Then all partials with respect to $z$ are zero, and
$$\operatorname{curl} F = \det\begin{pmatrix} i & j & k \\ \partial/\partial x & \partial/\partial y & \partial/\partial z \\ F_1(x, y) & F_2(x, y) & 0 \end{pmatrix} = 0\,i + 0\,j + \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) k.$$
We conclude that the arrows representing the curl of a field of this special kind either
have length zero or else are parallel to the $z$-axis.

We can look at a 2-dimensional vector field as a horizontal slice of the very
special kind of 3-dimensional field described in the previous example: $F(x, y) =
F_1(x, y)\,i + F_2(x, y)\,j$. As we see from the example,
$$\operatorname{curl} F(x, y) = (\partial F_2/\partial x - \partial F_1/\partial y)\,k,$$
so we define the curl, sometimes called the scalar curl, of a 2-dimensional vector
field to be the real-valued function $\operatorname{curl} F = (\partial F_2/\partial x - \partial F_1/\partial y)$. The scalar
curl plays an important part in Green's theorem taken up in Section 1 of the next
chapter, and it's particularly helpful in conveying an intuitive feeling for the significance
of $\operatorname{curl} F$ in general. It will follow from Green's theorem that if the scalar
curl of a 2-dimensional field is continuous and positive at a point $(x, y)$, then the
line integral of $F$ over a small enough counterclockwise oriented circle centered
at $(x, y)$ will be positive; thus a field with positive scalar curl will have positive
counterclockwise circulation near $(x, y)$. If $\operatorname{curl} F(x, y) < 0$ the circulation will be
clockwise near $(x, y)$. If $\operatorname{curl} F = 0$ identically, the circulation will be zero near
every point.
The statements in the previous paragraph can't be interpreted as predictions about
how a fluid particle would move at a given time and place; that information is given
by the vector values of F. Indeed, without some external constraint, a fluid particle
will simply follow a flow line with its velocity at each point x determined by F(x).
Circulation, as defined by a line integral in Section 1, is just a cumulative measure
of the effect of the field along a particular path.

EXAMPLE 12 The 2-dimensional field $F(x, y) = \tfrac18(-x + y)i + \tfrac18(-x - y)j$ of Example 6, shown
in Figure 8.10(a), has
$$\operatorname{curl} F(x, y) = \frac18\frac{\partial(-x - y)}{\partial x} - \frac18\frac{\partial(-x + y)}{\partial y} = -\frac18 - \frac18 = -\frac14.$$
This tells us that near each point $(x, y)$ the circulation around a counterclockwise
oriented circle is negative, or alternatively that the circulation around a clockwise
circle is positive.

EXAMPLE 13 The vector field $G(x, y) = -\tfrac14\bigl(y/\sqrt{x^2 + y^2}\bigr)i + \tfrac14\bigl(x/\sqrt{x^2 + y^2}\bigr)j$ of Example 8
shown in Figure 8.10(b) has a scalar curl given for $(x, y) \neq (0, 0)$ by
$$\operatorname{curl} G(x, y) = \frac{(x^2 + y^2)^{1/2} - x^2(x^2 + y^2)^{-1/2}}{4(x^2 + y^2)} + \frac{(x^2 + y^2)^{1/2} - y^2(x^2 + y^2)^{-1/2}}{4(x^2 + y^2)} = \frac{1}{4\sqrt{x^2 + y^2}} > 0.$$
Since $\operatorname{curl} G$ is positive everywhere on the domain of $G$ we can conclude that the
circulation of $G$ around small enough counterclockwise circles in the domain of
$G$ will be positive. It's tempting to think that by some reasoning we can conclude
that the circulation around the closed flow lines shown in Figure 8.10(b)
will also be positive; the conclusion is true, but for a somewhat different reason
explained in Section 1 on line integrals, namely that the tangent vectors to the
curve coincide with the vectors of the field. (See also Exercise 8 of the present
section.)

EXAMPLE 14 The vectors of the field $G$ in the previous example all have length $\tfrac14$. By varying the
arrow-lengths but leaving the directions alone we find a field having the same flow
lines as $G$ but traced with different speeds. For example, consider the vector field
$$H(x, y) = \frac{4}{\sqrt{x^2 + y^2}}\,G(x, y) = \frac{-y}{x^2 + y^2}\,i + \frac{x}{x^2 + y^2}\,j.$$
Only on the circle $x^2 + y^2 = 16$ do the two vector fields coincide; outside this circle
the arrows of $H$ are shorter, while inside the circle they are longer. The scalar curl
of $H$ is
$$\operatorname{curl} H(x, y) = \frac{\partial}{\partial x}\left(\frac{x}{x^2 + y^2}\right) - \frac{\partial}{\partial y}\left(\frac{-y}{x^2 + y^2}\right) = \frac{(x^2 + y^2) - 2x^2}{(x^2 + y^2)^2} + \frac{(x^2 + y^2) - 2y^2}{(x^2 + y^2)^2} = 0, \quad (x, y) \neq (0, 0).$$
We conclude that the circulation of $H$ is zero near every point. Nevertheless, it's
intuitively evident that the circulation along the flow lines will be nonzero. (See
Exercise 8.)
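The contrast between zero scalar curl and nonzero circulation along a flow line can be seen numerically. The sketch below is only an illustration: it approximates the circulation of $H$ over two counterclockwise circles by a midpoint rule, one centered at the origin (a flow line of $H$) and one centered at the hypothetical point $(3, 0)$ that does not enclose the origin. The function name `circulation` and the number of sample points are assumptions of the example.

```python
import math

def H(x, y):
    """H(x, y) = (-y i + x j) / (x^2 + y^2), defined away from the origin."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def circulation(F, cx, cy, radius, n=2000):
    """Midpoint-rule estimate of the line integral of F over a counterclockwise circle."""
    total = 0.0
    dt = 2.0 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dt
        x = cx + radius * math.cos(t)
        y = cy + radius * math.sin(t)
        fx, fy = F(x, y)
        # Tangent vector of the circle is (-radius sin t, radius cos t).
        total += (fx * (-radius * math.sin(t)) + fy * (radius * math.cos(t))) * dt
    return total

around_origin = circulation(H, 0.0, 0.0, 1.0)
away_from_origin = circulation(H, 3.0, 0.0, 1.0)
print(around_origin, 2.0 * math.pi)
print(away_from_origin)
```

The first value should be close to $2\pi$ and the second close to zero. The difference comes from the origin, where $H$ is undefined, lying inside the first circle but not the second.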

We'll see in Chapter 9 that the scalar curl measures the local tendency of a 2-dimensional
vector field to have a nonzero circulation about a point, as defined by
line integrals of the field over small closed paths around the point. However, such
a tendency by no means implies that a particle acted upon by a velocity field with
nonzero scalar curl will exhibit vortex motion locally. Indeed Figure 8.10(b) shows
that the circular flow lines would cut right across a small circular path centered at a
point other than the origin. Similar remarks apply to the 3-dimensional vector field
$\operatorname{curl} F$ of a 3-dimensional field $F$.

FIGURE 8.11 curl F(x)

In $\mathbb{R}^3$ we can ask if there is an interpretation not only for the magnitude of
$\operatorname{curl} F$ but also for its direction, assuming $\operatorname{curl} F(x) \neq 0$. The answer is yes, and we
interpret the vector curl as follows. Let $P$ be a plane through $x$ with unit normal
vector $n$. For each point of $P$ that is also in the domain of $F$, project the vector $F(x)$
perpendicularly onto $P$ as shown in Figure 8.11 to get a 2-dimensional vector field
$F_n$ in $P$ having a scalar curl, namely $\operatorname{curl} F_n$.
The following three observations show that understanding the special case of the
scalar curl is a help in understanding the 3-dimensional vector field $\operatorname{curl} F$.

(i) $|\operatorname{curl} F(x)|^2 = |\operatorname{curl} F_n(x)|^2 + (n \cdot \operatorname{curl} F(x))^2$.
(ii) Choosing $n$ perpendicular to $\operatorname{curl} F(x)$, which means the vector $\operatorname{curl} F(x)$ is
parallel to $P$, maximizes the absolute value of $\operatorname{curl} F_n(x)$ among all choices
for a plane through $x$.
(iii) If $\operatorname{curl} F_n(x) > 0$ in a neighborhood of a point $x_0$, then the circulation of
$F_n$ around nearby circular paths centered at $x_0$ in $P$ is positive, following
the fingers of the right hand rule, with thumb pointing in the direction of
$\operatorname{curl} F_n(x)$, as shown in Figure 8.11.

Statement (i) is just the Theorem of Pythagoras for the large triangle in Figure 8.11.
Statement (ii) follows from the first statement, since $\operatorname{curl} F_n(x)$ is maximized by
making $n \cdot \operatorname{curl} F(x) = 0$. Statement (iii) will follow from Example 5 in Section 1 of
Chapter 9.

EXAMPLE 15 For a 3-dimensional vector field of the form
$$F(x, y, z) = -y\,i + x\,j + \varphi(z)\,k$$
we find $\operatorname{curl} F(x, y, z) = 0\,i + 0\,j + 2k = 2k$ regardless of our choice for $\varphi(z)$. The
third coordinate of $\operatorname{curl} F$, namely 2, is equal to the scalar curl of the 2-dimensional
field $F_n(x, y) = -y\,i + x\,j$ that we get if we project the arrows of $F(x, y, z)$ onto
the $xy$-plane, with unit normal $n = k$; this projection is the 2-dimensional field
$F_n(x, y) = -y\,i + x\,j$ with scalar curl equal to 2.

EXERCISES

In Exercises 1 to 6, compute $\operatorname{div} F$ and, as appropriate, the scalar or vector curl for each of the indicated vector fields.
1. $F(x, y) = \cos(xy)\,i + \sin(xy)\,j$.
2. $F(x, y, z) = xy\,i + yz\,j + zx\,k$.
3. $F(x, y) = (2x - y)\,i + (x - 3y)\,j$.
4. $F(x, y, z) = yz\,i + xz\,j + xy\,k$.
5. $F(x, y) = (x^2 + y^2)^2\,i + (x^2 - y^2)^2\,j$.
6. F(x, y , z) = (x 2y, J1z 2 , xz 3 ).

In Exercises 7 to 10 describe the region in $\mathbb{R}^2$ or $\mathbb{R}^3$ in which $\operatorname{div} F$ is positive, and hence in which the corresponding flow is expanding.
7. $F(x, y) = (x^2 + y^2)\,i + (x^2 - y^2)\,j$.
8. $F(x, y, z) = xy\,i + yz\,j + zx\,k$.
9. $F(x, y) = e^{x+y}\,i + e^{x-y}\,j$.
10. $F(x, y, z) = x^3\,i - y^3\,j + z^3\,k$.

Note. General methods for deriving the parametrizations of flow lines in the next three exercises are taken up in Chapter 12.

11. (a) Verify that the gradient field $F(x, y) = \tfrac14 x\,i + \tfrac12 y\,j$ in Example 3 of the text has flow lines parametrized by $x(t) = c_1 e^{t/4}\,i + c_2 e^{t/2}\,j$, where $c_1$ and $c_2$ are arbitrary real constants.
(b) Show that the flow lines in part (a) usually follow parabolic paths, degenerating in some cases into straight lines heading away from the origin.
12. (a) Verify that the vector field $6F(x, y) = (x - y)\,i + (x + y)\,j$ in Example 2 of the text has flow lines given by $x(t) = Ae^t\cos(t + \alpha)\,i + Ae^t\sin(t + \alpha)\,j$, where $A$ and $\alpha$ are arbitrary real constants.
(b) Show that the flow lines described in part (a) follow generally spiral paths, in one case degenerating into a point.
13. (a) Verify that the vector field $G(x, y) = -\tfrac14\bigl(y/\sqrt{x^2 + y^2}\bigr)i + \tfrac14\bigl(x/\sqrt{x^2 + y^2}\bigr)j$ in Example 8 of the text has flow lines parametrized by
$$x(t) = A\cos(\tfrac14 t/A + \alpha)\,i + A\sin(\tfrac14 t/A + \alpha)\,j,$$
where $A$ and $\alpha$ are real constants with $A > 0$.
(b) Show that the flow lines described in part (a) are counterclockwise circles centered at the origin in $\mathbb{R}^2$.
14. Let $F$ be a continuously differentiable vector field, suppose $x = x(t)$, $a \le t \le b$, parametrizes a flow line $\gamma$ of $F$, and consider the circulation integral $\int_\gamma F \cdot dx$ of $F$ on $\gamma$.
(a) Show that if $\dot{s}(t)$ is the speed along $\gamma$ at time $t$, then
$$\int_\gamma F \cdot dx = \int_a^b \dot{s}^2(t)\,dt.$$
(b) Use part (a) to show that the circulation of $F$ is always positive along a flow line of positive length.
15. Suppose that a 3-dimensional vector field $F$ is continuously differentiable and that $F \cdot \operatorname{curl} F$ is identically zero. Show that the line integral of $\operatorname{curl} F$ along a flow line of $F$ is equal to zero.
16. Suppose that a 3-dimensional vector field $F$ is differentiable and that $g$ is a differentiable real-valued function defined on the domain of $F$. Show that
$$(gF) \cdot \operatorname{curl}(gF) = g^2\,F \cdot \operatorname{curl} F$$
holds identically on the domain of $F$. (This is easy to show if $g$ is constant, but otherwise is a little more work.)
*17. (a) Show that if $F = \nabla f$ is a gradient field with $f: \mathbb{R}^3 \to \mathbb{R}$ twice continuously differentiable, then $\operatorname{curl} F$ is identically zero.
(b) Use part (a) and the result of the previous exercise to find a 3-dimensional vector field $G$ that isn't a gradient field, but such that $G \cdot \operatorname{curl} G$ is identically zero.
(c) Find a 3-dimensional differentiable vector field $F$ such that $F \cdot \operatorname{curl} F$ is not always zero.
18. The scalar curl of a continuously differentiable 2-dimensional gradient field $F = \nabla f$ is always zero.
(a) Show this by computing the second-order derivatives in $\operatorname{curl}(\nabla f)$.
(b) Show this by calculating the circulation integral $\oint_c \nabla f \cdot dx$ over a closed curve $c$.
19. The divergence of the curl of a twice continuously differentiable vector field $F: \mathbb{R}^3 \to \mathbb{R}^3$ is identically zero.
(a) Prove this by direct computation of the required mixed partial derivatives.
(b) What can you conclude about the effect of motion along flow lines of $\operatorname{curl} F$ on volume?
20. The curl of the gradient of a twice continuously differentiable function $f: \mathbb{R}^3 \to \mathbb{R}$ is identically zero.
(a) Prove this by direct computation of the required mixed partial derivatives.
(b) What can you conclude about the effect on local circulation of $\nabla f$?
21. The divergence of the gradient of a twice continuously differentiable real-valued function $f: \mathbb{R}^2 \to \mathbb{R}$ is the Laplacian of $f$, denoted by $\Delta f$; thus $\Delta f = \operatorname{div}(\nabla f)$. What can you conclude about the effect of flow with velocity $\nabla f$ on area from the sign of the Laplacian of $f$ in various regions? In particular, suppose that $f$ is a harmonic function, which by definition is a function with the property that $\Delta f = 0$; what can you then conclude about the effect of flow with velocity $F = \nabla f$?
22. (a) Show that the flow line of the 2-dimensional field $F(x, y) = y\,i + x\,j$ passing through $(u, v)$ is parametrized by
$$x(t) = \tfrac12(u + v)e^{t} + \tfrac12(u - v)e^{-t}, \qquad y(t) = \tfrac12(u + v)e^{t} - \tfrac12(u - v)e^{-t}.$$
(b) Define the transformations $T_t(u, v) = (x(t), y(t))$ from $\mathbb{R}^2$ to $\mathbb{R}^2$ using the definitions of $x(t)$ and $y(t)$ in part (a). Show that the Jacobian determinants $\partial(x, y)/\partial(u, v)$ of the transformations $T_t$ are identically equal to 1 for all $t$.
(c) Show that $\operatorname{div} F(x, y)$ is identically zero. How does this relate to part (b)?

Chapter 8 REVIEW

Compute the following line integrals by whatever correct method seems simplest.
1. $\oint_s xy\,dx + (x^2 + y^2)\,dy$, where $s$ is the closed counterclockwise oriented square with corners at $(0, 0)$, $(1, 0)$, $(1, 1)$ and $(0, 1)$.
2. $\int_q xy\,dx + (x^2 + y^2)\,dy$, where $q$ is the part in the first quadrant of the counterclockwise oriented circle $x^2 + y^2 = 1$.
3. $\int_a^b (y^2z^3\,dx + 2xyz^3\,dy + 3xy^2z^2\,dz)$, where $a = (1, 1, 1)$ and $b = (2, 2, 2)$. The field is the gradient field of a function that's fairly easy to guess.
4. $\int_\lambda y^2z^3\,dx + xyz^3\,dy + xy^2z^2\,dz$, where $\lambda$ is the line from $(1, 1, 1)$ to $(2, 2, 2)$.
5. $\oint_c y\,dx$, where $c$ is the counterclockwise oriented circle $x^2 + y^2 = 1$.

In Exercises 6 to 9, consider a line integral $I(\gamma) = \int_\gamma x^2 y\,dy$, where $\gamma$ starts at $(0, 0)$ and ends at $(1, 1)$.
6. Compute $I(\gamma_1)$ if $\gamma_1$ is parametrized by $g(t) = (t^2, t^3)$ for $0 \le t \le 1$.
7. Compute $I(\gamma_2)$ if $\gamma_2$ is parametrized by $g(t) = (t, t^2)$ for $0 \le t \le 1$.
8. Compute $I(\gamma_3)$ if $\gamma_3$ is parametrized by $g(t) = (t^2, t^4)$ for $0 \le t \le 1$.
9. Explain in general terms why $I(\gamma_2) = I(\gamma_3)$, while $I(\gamma_1)$ has a different value.

10. (a) Which, if any, of the parametrizations in the previous exercise are equivalent, so that the line integrals of an arbitrary continuous vector field $F$ over them will be equal?
(b) Suppose $F = \nabla f$ is a continuous gradient field on $\mathbb{R}^2$. Which of the integrals of $F$ along the three curves in the previous exercise will be equal?
11. Find the work done by the force field $F(x, y, z) = (x, 2y, z)$ on a particle moving from $(1, 0, 0)$ to $(-1, 2, 1)$ on the straight line segment joining these two points.
12. Find the work done by the force field $F(x, y, z) = (0, 0, mg)$ on a particle moving up through exactly two full turns of the elliptic helix $h(u) = (\cos u, 2\sin u, u)$.
13. Let $\gamma$ be a smooth parametrized curve, and let $F$ be a vector field that assigns to each point $x$ on $\gamma$ the unit tangent vector to the curve, pointing in the direction of traversal. Show that $\int_\gamma F(x) \cdot dx = l(\gamma)$, the length of $\gamma$.
14. Define a vector field $F(x) = x$ and parametrize the segment joining point $a$ and point $b$ by $x(t) = tb + (1 - t)a$, with $0 \le t \le 1$.
(a) Show by direct computation of the line integral that
$$\int_\gamma F \cdot dx = \tfrac12\bigl(|b|^2 - |a|^2\bigr).$$
(b) Do the computation in part (a) by finding a real-valued function $f$ such that $\nabla f = F$ and applying Theorem 1.3, the Fundamental Theorem of Calculus for line integrals.
15. Let $t$ stand for time and consider the time-dependent plane vector field
$$F(t, x, y) = \bigl((1 - t)x - ty,\; tx + (1 - t)y\bigr).$$
(a) Find the work done by this field on a particle at $g(t) = (\cos t, \sin t)$ in the time interval $0 \le t \le 2\pi$.
(b) How does the answer to part (a) change if, instead of varying with time, the vectors of the field are constantly equal to their values at some fixed time $t_0$? Explain why the answer is geometrically evident for $t_0 = 0$ and for $t_0 = 1$.
16. Let $\gamma$ be a smooth curve parametrized by $x = g(s)$, $0 \le s \le l$, where $s$ stands for arc length measured along the curve starting at $g(0)$ and ending at $g(l)$. Show that if $F$ is a continuous vector field defined along $\gamma$, then
$$\int_\gamma F \cdot dx = \int_0^l F(g(s)) \cdot t(s)\,ds,$$
where $t(s)$ is the unit tangent vector to the curve at $g(s)$ pointing in its direction of traversal.
17. An outdoor sculpture consists of a vertical wall of heavy sheet steel. The base of the wall follows the curve $x = t^3 - 3t$, $y = 3t^2$ for $1 \le t \le 2$ where $x$ and $y$ are measured in meters and the height of the wall at $(x, y)$ is $y$. If the steel weighs 30 kilograms per square meter, what is the total weight of the steel used? [Hint: Consider a weighted curve.]
18. (a) The graph of the parabola $y = x^2$ for $0 < a \le x \le b$ is rotated about the $y$-axis. Find the surface area generated.
(b) What is the area of the surface generated by rotation about the $x$-axis?
19. Show that a parabola has its maximum curvature at its vertex, where the parabola's line of symmetry intersects the curve.
20. Show that the curvature of the graph of $y = e^x$ tends to zero as $x \to \pm\infty$. Where is the point of maximum curvature?
21. Let $x(t) = (a\cos t, a\sin t, bt)$ where $a$ and $b$ are nonnegative constants.
(a) For a fixed positive value of $b$, what values of $a$ yield the maximum and minimum values for the curvature $\kappa$?
(b) For a fixed value of $a$, what values of $b$ yield the maximum and minimum values for the curvature $\kappa$?
22. Let $x(t)$ trace a smooth curve with speed $\dot{s}(t)$ along the curve.
(a) Show that $|\ddot{x}(t)|^2 \ge |\ddot{s}(t)|^2$.
(b) Show that the magnitude $\dot{s}^2\kappa$ of the centripetal component of acceleration along the curve is equal to $\sqrt{|\ddot{x}(t)|^2 - |\ddot{s}(t)|^2}$ at $x(t)$, so that the discrepancy between $|\ddot{x}|$ and $|\ddot{s}|$ is zero just when $\kappa = 0$.
23. Using the cross-product, show that curvature of a smooth curve $x(t)$ is
$$\kappa(t) = \frac{|\dot{x}(t) \times \ddot{x}(t)|}{|\dot{x}(t)|^3}.$$
24. Consider the decomposition $\ddot{x}(t) = \ddot{s}(t)\,t(t) + \dot{s}^2(t)\kappa(t)\,n(t)$ of acceleration of motion along $x = g(t)$ into perpendicular components. Along with $\ddot{x}$, the tangential and normal components of $\ddot{x}$ define vector fields on the path of motion, so we can integrate them along the path.
(a) Show by direct computation that the line integral of the normal component $\dot{s}^2(t)\kappa(t)\,n(t)$ along a given part $\gamma$ of the path is always zero.
(b) Show by direct computation that the line integral of the tangential component $\ddot{s}(t)\,t(t)$ along a given part $\gamma$ of the path from $x(a)$ to $x(b)$ is equal to $\tfrac12\dot{s}^2(b) - \tfrac12\dot{s}^2(a)$.
(c) Use the results of parts (a) and (b) to compute $\int_\gamma \ddot{x} \cdot dx$.
CHAPTER 9

VECTOR FIELD THEORY

The fundamental theorem of calculus for one variable says that if $f'$ is integrable
for $a \le t \le b$, then
$$\int_a^b f'(t)\,dt = f(b) - f(a). \tag{1}$$
In Section 1 of the previous chapter, the theorem was extended to line integrals of
a gradient $\nabla f$ by the equation
$$\int_a^b \nabla f(x) \cdot dx = f(b) - f(a). \tag{2}$$
The main theorems of the present chapter are also variations on the idea that an
integral of some kind of derivative of a function can be evaluated by using only the
values of the function itself on a boundary set, for example the endpoints $a$ and $b$ in
the fundamental theorem stated above. We begin with the version known as Green's
theorem.

SECTION 1 GREEN'S THEOREM

1A Statement and Examples
Let $D$ be a plane region whose boundary is a single curve $\gamma$, parametrized by a
function $g$ in such a way that, as $t$ increases from $a$ to $b$, $g(t)$ traces $\gamma$ once in the
counterclockwise direction as indicated by the oriented circle on the integral sign.
An example is shown in Figure 9.1. If $F$ and $G$ are real-valued functions defined
on $D$, including its boundary, then the formula for Green's Theorem says that
$$\int_D \left(\frac{\partial G}{\partial x} - \frac{\partial F}{\partial y}\right) dx\,dy = \oint_\gamma F\,dx + G\,dy, \tag{3}$$
under appropriate differentiability conditions on $F$ and $G$, and on the boundary
curve $\gamma$. It's enough to assume additionally about $\gamma$ that it's piecewise smooth,
which means that $\gamma$ consists of finitely many curves each of which is smooth as
defined in Section 1A of Chapter 4, though the condition that the parametrization
have nonzero derivative isn't really necessary here. The requirement that $\gamma$ be traced
counterclockwise is the analog of the requirement that in Equations (1) and (2), the
differences on the right have to be taken in the proper order. We can further strengthen
the analogy of Equation (3) with Equations (1) and (2) if we think of the integrand
$(\partial G/\partial x) - (\partial F/\partial y)$ as a kind of derivative of the vector field $\mathbf{F} = (F, G)$.

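EXAMPLE 1 Suppose that $D$ is the square defined by $-1 \le x \le 1$, $-1 \le y \le 1$, and let $F$ and
$G$ be defined on $D$ by $F(x, y) = -ye^x$ and $G(x, y) = xe^y$. Then
$$\frac{\partial G}{\partial x}(x, y) - \frac{\partial F}{\partial y}(x, y) = e^y + e^x,$$
so
$$\int_D \left(\frac{\partial G}{\partial x} - \frac{\partial F}{\partial y}\right) dx\,dy = \int_{-1}^{1} dx \int_{-1}^{1} (e^y + e^x)\,dy = \int_{-1}^{1} (e + 2e^x - e^{-1})\,dx = 4(e - e^{-1}).$$

FIGURE 9.1 The square $D$ with boundary pieces $\gamma_1$, $\gamma_2$, $\gamma_3$, $\gamma_4$.

We parametrize the boundary curve $\gamma$ in four pieces $\gamma_i$, $i = 1, 2, 3, 4$, by
$$g_1(t) = (1, t), \quad g_2(t) = (-t, 1), \quad g_3(t) = (-1, -t), \quad g_4(t) = (t, -1), \qquad -1 \le t \le 1.$$
Notice that the traversal of $\gamma$ is counterclockwise, as is shown in Figure 9.1. On the
first side of the square we have
$$\int_{\gamma_1} F\,dx + G\,dy = \int_{\gamma_1} -ye^x\,dx + xe^y\,dy = \int_{-1}^{1} \left[(-te)\frac{dx}{dt} + e^t\frac{dy}{dt}\right] dt = \int_{-1}^{1} e^t\,dt = e - \frac{1}{e}.$$
Similarly, the integrals over the other three sides are also equal to $(e - 1/e)$, so
$$\oint_\gamma F\,dx + G\,dy = 4\left(e - \frac{1}{e}\right).$$
Equation (3) is thus verified for this particular example.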
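Both sides of Equation (3) for this square can also be approximated numerically. The following is a sketch under stated assumptions: it uses midpoint rules with hypothetical grid sizes, and it assumes one particular counterclockwise parametrization of the four sides, each over $-1 \le t \le 1$; any equivalent parametrization would give the same line integral.

```python
import math

def F(x, y):
    return -y * math.exp(x)

def G(x, y):
    return x * math.exp(y)

def double_integral(n=400):
    """Midpoint rule for the integral of dG/dx - dF/dy = e^y + e^x over the square."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h
        for j in range(n):
            y = -1.0 + (j + 0.5) * h
            total += (math.exp(y) + math.exp(x)) * h * h
    return total

# Assumed counterclockwise sides for -1 <= t <= 1: (point g(t), constant derivative g'(t)).
sides = [
    (lambda t: (1.0, t), (0.0, 1.0)),     # right side, traced upward
    (lambda t: (-t, 1.0), (-1.0, 0.0)),   # top side, traced right to left
    (lambda t: (-1.0, -t), (0.0, -1.0)),  # left side, traced downward
    (lambda t: (t, -1.0), (1.0, 0.0)),    # bottom side, traced left to right
]

def boundary_integral(n=4000):
    """Midpoint rule for the line integral of F dx + G dy around the square."""
    h = 2.0 / n
    total = 0.0
    for g, (dx, dy) in sides:
        for k in range(n):
            t = -1.0 + (k + 0.5) * h
            x, y = g(t)
            total += (F(x, y) * dx + G(x, y) * dy) * h
    return total

exact = 4.0 * (math.e - 1.0 / math.e)
print(double_integral(), boundary_integral(), exact)
```

All three printed values should agree to several decimal places. Tracing any side in the opposite direction changes the sign of its contribution, which is why the orientation matters.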

In computing a line integral, a given parametrization can always be replaced


by an equivalent one for which the line integral will have the same value. In the
previous example the boundary curve y was given what appears to be the simplest
parametrization, although an equivalent one would do. The question becomes more
important if the boundary is presented without a parametrization but merely as a set.
It may be necessary to choose a parametrization, and if Green's theorem is to be
applied, we'll see that this must be done so that the boundary is traced just once,
and in the proper counterclockwise direction.
The need for a counterclockwise instead of clockwise traversal of the boundary
curve is apparent when we observe that reversal of direction changes the sign of a
nonzero line integral, whether over a closed path or not. In other words, we have
the following theorem.

1.1 Theorem. Let $\gamma$ be a smooth curve and let $F$ be a continuous vector field
defined on $\gamma$. Denote by $\gamma^-$ the curve $\gamma$ traced in the opposite direction. Then
$$\int_{\gamma^-} F \cdot dx = -\int_\gamma F \cdot dx.$$

Proof If $\gamma$ is parametrized by $x(t) = g(t)$ for $a \le t \le b$, we parametrize $\gamma^-$ by
$x(t) = g(a + b - t)$ over the same interval; this change reverses direction, going from
$g(b)$ to $g(a)$ instead of the other way around. Since $dg(a + b - t)/dt = -g'(a + b - t)$,
we have
$$\int_{\gamma^-} F \cdot dx = -\int_a^b F(g(a + b - t)) \cdot g'(a + b - t)\,dt.$$
Now change the variable of integration by $t = a + b - u$, $dt = -du$ to get
$$\int_{\gamma^-} F \cdot dx = \int_b^a F(g(u)) \cdot g'(u)\,du = -\int_a^b F(g(u)) \cdot g'(u)\,du = -\int_\gamma F \cdot dx. \quad\blacksquare$$
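The reparametrization $t \mapsto a + b - t$ used in the proof is easy to test numerically. The sketch below is an illustration only: the field, the arc $y = x^2$, and the midpoint-rule helper `line_integral` are all choices made here rather than anything taken from the text.

```python
def line_integral(F, g, dg, a, b, n=2000):
    """Midpoint-rule estimate of the integral of F(g(t)) . g'(t) over a <= t <= b."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        fx, fy = F(*g(t))
        dx, dy = dg(t)
        total += (fx * dx + fy * dy) * h
    return total

def F(x, y):
    """An arbitrary smooth field chosen for illustration."""
    return (x * y, x + y * y)

a, b = 0.0, 1.0

def g(t):
    """The arc y = x^2 from (0, 0) to (1, 1)."""
    return (t, t * t)

def dg(t):
    return (1.0, 2.0 * t)

def g_rev(t):
    """The same arc traced from (1, 1) back to (0, 0)."""
    return g(a + b - t)

def dg_rev(t):
    """Chain rule: the derivative of g(a + b - t) is -g'(a + b - t)."""
    dx, dy = dg(a + b - t)
    return (-dx, -dy)

forward = line_integral(F, g, dg, a, b)
backward = line_integral(F, g_rev, dg_rev, a, b)
print(forward, backward)
```

The two printed values should be negatives of each other, up to rounding, as Theorem 1.1 predicts.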

We can prove Green's Theorem most easily for regions $D$ such that $\gamma$, the boundary
of $D$, is crossed at most twice by a line parallel to a coordinate axis. Such a
region is called simple. Thus a coordinate line intersects the boundary of a simple
region either in a line segment or else in at most two points. Using Theorem 1.1 we
can extend the theorem to finite unions of simple regions. A few such are shown
in Figure 9.2, where only $D_1$ is simple. As shown for $D_2$, when the boundary of
the region is not a single curve, only the outer boundary is traced counterclockwise,
while the inner boundary is traced clockwise. A rule that covers all cases is to trace
each piece of the boundary so that the region is always to the left as a point traces
the boundary. Line integrals around a path that begins and ends at the same point,
called a closed path or circuit, are important enough that they are often distinguished
from other integrals by means of an integral sign like $\oint$, or perhaps $\ointctrclockwise$ to indicate
a direction of traversal.

FIGURE 9.2 (a), (b), (c)

1.2 Green's Theorem. Let $D$ be a bounded plane region that is a finite union of
simple regions, each with a boundary consisting of a piecewise smooth curve. Let
$F$ and $G$ be continuously differentiable real-valued functions defined on $D$ together
with $\gamma$, the boundary of $D$. Then
$$\int_D \left(\frac{\partial G}{\partial x} - \frac{\partial F}{\partial y}\right) dx\,dy = \oint_\gamma F\,dx + G\,dy,$$
where $\gamma$ is parametrized so that it's traced once, with $D$ on the left.

Proof. Consider first the case in which D is a simple region, with boundary γ
parametrized by

\[
g(t) = (g_1(t), g_2(t)), \qquad a \le t \le b.
\]

Since

\[
\oint_\gamma F\,dx + G\,dy = \oint_\gamma F\,dx + \oint_\gamma G\,dy,
\]

we can work with each of the terms on the right separately. We have

\[
\oint_\gamma F\,dx = \int_a^b F(g_1(t), g_2(t))\, g_1'(t)\,dt.
\]

FIGURE 9.3

The curve γ consists of the graphs of two functions u(x) and v(x), perhaps together
with one or two vertical segments, as shown in Figure 9.3. On a vertical segment,
g1 is constant, so g1' = 0 there. On the remaining parts of γ we apply the change
of variable x = g1(t) so that, on the top curve, g2(t) = y = u(x), whereas on the
bottom, g2(t) = y = v(x). It follows that

\[
\oint_\gamma F(x, y)\,dx = \int_\beta^\alpha F(x, u(x))\,dx + \int_\alpha^\beta F(x, v(x))\,dx,
\]

where the integration from β to α occurs because the graph of u is traced from right
to left. Reversing the limits in the first integral, we get

\[
\begin{aligned}
\oint_\gamma F(x, y)\,dx
  &= \int_\alpha^\beta \left[ -F(x, u(x)) + F(x, v(x)) \right] dx \\
  &= \int_\alpha^\beta \left[ -\int_{v(x)}^{u(x)} \frac{\partial F}{\partial y}(x, y)\,dy \right] dx \\
  &= -\int_D \frac{\partial F}{\partial y}\,dx\,dy.
\end{aligned}
\]

A similar proof, referring to Figure 9.4, shows that

\[
\oint_\gamma G(x, y)\,dy = \int_D \frac{\partial G}{\partial x}\,dx\,dy.
\]

Combining this equation with the previous one gives Green's Theorem for the special
class of simple regions.
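FIGURE 9.4

We now extend the theorem to a finite union, D = D1 ∪ ··· ∪ DK, of simple
regions, each with a piecewise smooth boundary curve γk, k = 1, ..., K. Applying
Green's Theorem to each simple region Dk we get

\[
\int_{D_k} \left( \frac{\partial G}{\partial x} - \frac{\partial F}{\partial y} \right) dx\,dy
  = \oint_{\gamma_k} F\,dx + G\,dy.
\]

The sum of the integrals over the Dk is an integral over D; so

\[
\int_D \left( \frac{\partial G}{\partial x} - \frac{\partial F}{\partial y} \right) dx\,dy
  = \oint_{\gamma_1} F\,dx + G\,dy + \cdots + \oint_{\gamma_K} F\,dx + G\,dy.
\]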

FIGURE 9.5

Now the boundary of D consists of pieces taken from several of the curves γk. In
addition, there may be parts of curves γk that are not a part of γ but that act as a
common boundary to two simple regions. The effect is illustrated in Figure 9.5.
A piece δ of common boundary will be traced in one direction or the opposite
depending on which simple region it's associated with. But for a line integral we
always have, by Theorem 1.1,

\[
\int_\delta F\,dx + G\,dy + \int_{\delta^-} F\,dx + G\,dy = 0,
\]

where δ⁻ is δ traced in reverse order. Thus although the parts of the curves γk that
make up γ contribute to $\oint_\gamma F\,dx + G\,dy$, the other parts cancel, leaving

\[
\int_D \left( \frac{\partial G}{\partial x} - \frac{\partial F}{\partial y} \right) dx\,dy
  = \oint_\gamma F\,dx + G\,dy.
\]

This completes the proof of Green's Theorem. ∎
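As a numerical sanity check of the statement just proved, the following sketch compares both sides of Green's Theorem on the unit disk. The choice F = −y³, G = x³ is an illustrative assumption, not an example from the text; for it, ∂G/∂x − ∂F/∂y = 3x² + 3y², and both sides should be close to 3π/2. The function names are hypothetical helpers and the midpoint rule is only one convenient way to approximate the integrals.

```python
import math

def F(x, y):
    return -y ** 3          # illustrative choice, not from the text

def G(x, y):
    return x ** 3           # illustrative choice, not from the text

def curl(x, y):
    # dG/dx - dF/dy for the choice above
    return 3.0 * x * x + 3.0 * y * y

def area_integral_unit_disk(h, n=200):
    """Midpoint rule in polar coordinates for the integral of h over the unit disk."""
    dr, dt = 1.0 / n, 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            t = (j + 0.5) * dt
            total += h(r * math.cos(t), r * math.sin(t)) * r * dr * dt
    return total

def boundary_integral_unit_circle(F, G, n=2000):
    """Midpoint rule for F dx + G dy around the counterclockwise unit circle."""
    dt = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        # dx = -sin t dt, dy = cos t dt
        total += (F(x, y) * (-math.sin(t)) + G(x, y) * math.cos(t)) * dt
    return total

lhs = area_integral_unit_disk(curl)
rhs = boundary_integral_unit_circle(F, G)
print(lhs, rhs, 1.5 * math.pi)
```

The disk is traced counterclockwise so that the region stays on the left, matching the orientation convention in the theorem.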


FIGURE 9.6 (a) (b)

1B Changing Paths
The last part of the proof just given extends Green's Theorem from simple regions to
those such as are shown in Figure 9.6. The extension has an important consequence
for line integrals $\int F\,dx + G\,dy$ over two closed curves γ and δ, when the functions
F and G are defined in the region D between γ and δ. In Figure 9.6(a), the curves
are traced in the same direction (counterclockwise in the figure), and in Figure 9.6(b),
the curves go from one point to another in the same direction. Given this relative
orientation of the two curves, if the equation

\[
\frac{\partial G}{\partial x} - \frac{\partial F}{\partial y} = 0 \tag{4}
\]

holds throughout D, then we can conclude that

\[
\int_\gamma F\,dx + G\,dy = \int_\delta F\,dx + G\,dy.
\]

We'll show the validity of this principle in the next two examples.

EXAMPLE 2 Let F and G be defined by

\[
F(x, y) = \frac{-y}{x^2 + y^2}, \qquad G(x, y) = \frac{x}{x^2 + y^2},
\]

for (x, y) ≠ (0, 0). Direct computations show that these functions satisfy
Equation (4). If γ is the ellipse shown in Figure 9.7 and defined by

then the integral $\oint_\gamma F\,dx + G\,dy$ would be troublesome to compute directly, even
using tables. However, we can apply Green's Theorem to the region D between γ
and the circle c of radius 1 about the origin, parametrized by (x, y) = (cos t, sin t),
for 0 ≤ t ≤ 2π. Because Equation (4) is satisfied, Green's Theorem yields

\[
\oint_\gamma F\,dx + G\,dy + \oint_{c^-} F\,dx + G\,dy = 0,
\]

where c⁻ is c traced clockwise, so that D is on its left. By Equation 1 the last
equation is equivalent to

\[
\oint_\gamma F\,dx + G\,dy = \oint_c F\,dx + G\,dy.
\]

But on c we have x² + y² = 1, so

\[
\oint_c F\,dx + G\,dy = \oint_c -y\,dx + x\,dy
  = \int_0^{2\pi} (\sin^2 t + \cos^2 t)\,dt = 2\pi.
\]

FIGURE 9.7

It's important to observe that Green's Theorem could not have been applied directly
to the entire interior of the ellipse because ∂G/∂x and ∂F/∂y fail to exist at
the origin.
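The conclusion of Example 2 can be checked numerically. The sketch below integrates F dx + G dy directly around an ellipse and around the unit circle; since the ellipse of Figure 9.7 is not specified here, the semi-axes 2 and 3 are an assumption made only for illustration, and `ellipse_circulation` is a hypothetical helper name. Both values should be close to 2π.

```python
import math

def F(x, y):
    return -y / (x * x + y * y)

def G(x, y):
    return x / (x * x + y * y)

def ellipse_circulation(a, b, n=2000):
    """Midpoint rule for F dx + G dy around (a cos t, b sin t), 0 <= t <= 2*pi."""
    dt = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = a * math.cos(t), b * math.sin(t)
        dx, dy = -a * math.sin(t) * dt, b * math.cos(t) * dt
        total += F(x, y) * dx + G(x, y) * dy
    return total

# Assumed ellipse (semi-axes 2 and 3) versus the unit circle c.
print(ellipse_circulation(2.0, 3.0), ellipse_circulation(1.0, 1.0), 2.0 * math.pi)
```

The direct integral is awkward by hand but easy numerically, which is consistent with replacing the ellipse by the circle in the example.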

EXAMPLE 3 The curve γ1 given by g(t) = (t, t²), 0 ≤ t ≤ 1, is shown in Figure 9.8. Suppose
that F(x, y) = (F(x, y), G(x, y)) is a continuously differentiable vector field for
x² + y² < 4 and satisfies Equation (4), namely Gx(x, y) − Fy(x, y) = 0 in the disk
of radius 2. The line integral of F over γ1 could perhaps be computed directly in
the form

\[
\int_{\gamma_1} \mathbf{F} \cdot d\mathbf{x} = \int_0^1 \left[ F(t, t^2) + G(t, t^2)(2t) \right] dt.
\]

FIGURE 9.8

But there are other possibilities. For example, the curve γ2 can be parametrized by
g2(t) = (t, t), 0 ≤ t ≤ 1. Since we can apply Green's Theorem to the region between
γ1 and γ2, Equation (4) implies that

\[
\int_D \left( \frac{\partial G}{\partial x} - \frac{\partial F}{\partial y} \right) dx\,dy = 0,
\]

and hence

\[
\int_{\gamma_1} \mathbf{F} \cdot d\mathbf{x} + \int_{\gamma_2^-} \mathbf{F} \cdot d\mathbf{x} = 0.
\]

Here γ2⁻ is given by g2⁻(t) = (1 − t, 1 − t) for 0 ≤ t ≤ 1. Then the line integrals
over γ1 and γ2 are equal by Equation 1, and the latter integral is then

\[
\int_{\gamma_2} \mathbf{F} \cdot d\mathbf{x} = \int_0^1 \left[ F(t, t) + G(t, t) \right] dt.
\]

Another alternative would be to replace γ1 by γ3, where γ3 is parametrized in two
pieces, one horizontal and one vertical, by

\[
g_3(t) =
\begin{cases}
(t, 0), & 0 \le t \le 1, \\
(1, t), & 0 \le t \le 1.
\end{cases}
\]

Thus

\[
\int_{\gamma_3} \mathbf{F} \cdot d\mathbf{x} = \int_0^1 F(t, 0)\,dt + \int_0^1 G(1, t)\,dt.
\]

This may be easier to compute than either of the integrals over γ1 and γ2, although
all three are equal. In Exercise 5 you are asked to compute the value of a specific
example using each of these three paths.
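The three parametrizations above can be compared numerically. The sketch below uses the field of Exercise 5, F = y and G = x, which satisfies Equation (4); the helper `line_integral` and the midpoint rule are illustrative choices, not part of the text. All three values should be close to 1.

```python
def F(x, y):
    return y

def G(x, y):
    return x

def line_integral(g, dg, n=1000):
    """Midpoint rule for F dx + G dy along g(t), 0 <= t <= 1, with derivative dg(t)."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y = g(t)
        dx, dy = dg(t)
        total += (F(x, y) * dx + G(x, y) * dy) * h
    return total

# gamma_1: the parabola (t, t^2)
i1 = line_integral(lambda t: (t, t * t), lambda t: (1.0, 2.0 * t))
# gamma_2: the diagonal (t, t)
i2 = line_integral(lambda t: (t, t), lambda t: (1.0, 1.0))
# gamma_3: horizontal piece (t, 0) followed by vertical piece (1, t)
i3 = (line_integral(lambda t: (t, 0.0), lambda t: (1.0, 0.0))
      + line_integral(lambda t: (1.0, t), lambda t: (0.0, 1.0)))
print(i1, i2, i3)
```

On γ3 the horizontal piece contributes nothing because F(t, 0) = 0, which is why that path is the easiest to evaluate by hand.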

The previous examples are typical applications of Green's Theorem; in summary
we have the

Path independence principle. If a plane vector field F(x, y) = (F(x, y), G(x, y))
satisfies ∂G/∂x − ∂F/∂y = 0 in a region whose boundary is the union of two paths
γ1 and γ2 with common initial and terminal points, then

\[
\int_{\gamma_1} \mathbf{F} \cdot d\mathbf{x} = \int_{\gamma_2} \mathbf{F} \cdot d\mathbf{x}.
\]

The paths may even intersect at points other than their common endpoints, as
indicated in Figure 9.6(b); the only requirement is that we be able to apply Green's
Theorem to the region or regions bounded by the curves.

1C Physical Interpretations
Green's Theorem has two distinct but closely related physical interpretations. We
assume D to be a region in ℝ² whose boundary is a single counterclockwise-oriented
curve γ. If γ has a smooth parametrization g(t) = (g1(t), g2(t)), a ≤ t ≤ b, and has
a nonzero tangent at each point, we can form the unit tangent and normal vectors

\[
\mathbf{t}(t) = \frac{g'(t)}{|g'(t)|}
  = \left( \frac{g_1'(t)}{|g'(t)|}, \frac{g_2'(t)}{|g'(t)|} \right)
\]

and

\[
\mathbf{n}(t) = \left( \frac{g_2'(t)}{|g'(t)|}, -\frac{g_1'(t)}{|g'(t)|} \right).
\]

An example is shown in Figure 9.9. Note that this normal vector isn't related to
the curvature of γ and doesn't necessarily have the same direction as the principal
normal to the curve, as defined in Chapter 8, Section 3.

Stokes's Theorem in the Plane. In Section 4 of Chapter 8 we defined the scalar
curl of a vector field as a real-valued function. Stokes's Theorem in that context
equates an area integral over a region D to a line integral over the boundary of D.

If F = (F, G) is a continuously differentiable vector field defined on a region
containing D and γ, then using the abbreviation |g'(t)| dt = ds, the line integral in
Green's Theorem is

\[
\oint_\gamma F\,dx + G\,dy
  = \int_a^b \mathbf{F}(g(t)) \cdot \mathbf{t}(t)\,|g'(t)|\,dt
  = \oint_\gamma \mathbf{F} \cdot \mathbf{t}\,ds.
\]

We define a real-valued function, curl F, called the scalar curl of F, by

\[
\operatorname{curl} \mathbf{F}(\mathbf{x})
  = \frac{\partial G}{\partial x}(\mathbf{x}) - \frac{\partial F}{\partial y}(\mathbf{x}).
\]

FIGURE 9.9

Green's Theorem then becomes

\[
\int_D \operatorname{curl} \mathbf{F}\,dA = \oint_\gamma \mathbf{F} \cdot \mathbf{t}\,ds,
\]

sometimes called Stokes's Theorem for the plane. Now interpret F as the velocity
field of a fluid flow in the plane, which means that at each point x the arrow
representing F(x) has the direction of the flow at x, with the speed of the flow there
equal to the length of the arrow. The line integral represents the circulation of the
flow around γ in the counterclockwise direction. (Recall that circulation of a vector
field over a smooth curve, closed or not, was defined in Chapter 8, Section 1.)
Stokes's Theorem says that this circulation is equal to the integral of curl F over D.
In particular, if curl F is identically zero in D, then the circulation is zero for every
smooth circuit γ contained in D, whether γ is oriented counterclockwise or not. For
this conclusion to hold, it's necessary that curl F be defined throughout the inside of
every circuit in D to which Stokes's Theorem is applied. Conversely, we can show
that if the circulation is zero over every smooth circuit, then the function curl F must
be identically zero. See Exercise 16 and Section 2 for an alternative approach. In
Section 5, we treat the scalar curl generalized to a vector field F = (F, G, H) in ℝ³:

\[
\operatorname{curl} \mathbf{F}(x, y, z)
  = \left( \frac{\partial H}{\partial y} - \frac{\partial G}{\partial z} \right) \mathbf{i}
  + \left( \frac{\partial F}{\partial z} - \frac{\partial H}{\partial x} \right) \mathbf{j}
  + \left( \frac{\partial G}{\partial x} - \frac{\partial F}{\partial y} \right) \mathbf{k}.
\]
This vector field curl F reduces to

\[
\operatorname{curl} \mathbf{F}(x, y)
  = \left( \frac{\partial G}{\partial x} - \frac{\partial F}{\partial y} \right) \mathbf{k}
\]

when F is independent of z and H is identically zero, because all partials with
respect to z are zero and all partials of H are zero. This 3-dimensional field
curl F(x, y) is in all respects but one the same as the scalar curl that occurs in
Green's Theorem, the difference being that the vector curl has a direction, a direction
that's always perpendicular to the xy-plane.
Now suppose the scalar Gx(x, y) − Fy(x, y) is positive in a neighborhood of a
point x0 = (x0, y0) in ℝ². Applying Green's Theorem over a disk D centered at x0,
and making D small enough that Gx(x, y) − Fy(x, y) > 0 there, we get

\[
0 < \int_D \left( G_x(x, y) - F_y(x, y) \right) dA
  = \int_D \operatorname{curl} \mathbf{F}\,dA
  = \oint_\gamma \mathbf{F} \cdot \mathbf{t}\,ds,
\]

where γ is the counterclockwise oriented boundary circle of D. It follows that the
predominant circulation of the flow, taken counterclockwise around the boundary of
D, and given by the line integral on the right, is positive. Thus the orientation of
the vector k relative to the circulation follows the right-hand rule, with the fingers
curling in the direction of the flow and the thumb pointing in the direction of k. See
Figure 8.11 in Chapter 8.

In the previous example we interpreted the field F as the velocity field of a fluid
flow in D. That is, the vector field F at each point of D represents the speed and
direction of the flow at that point. In this case the line integral in Stokes's Theorem
is called the circulation of F around γ, and Stokes's Theorem says that the circulation
of F along γ is the integral of curl F over D. Thus if curl F is identically zero in
D, then the circulation is zero around every smooth closed curve with its interior
contained in D. A field F for which curl F is zero is called irrotational for this
reason.
Gauss's Theorem in the Plane. Using the divergence of a vector field introduced
in Section 4 of Chapter 8, we can rewrite Green's Theorem in another way. Instead
of applying the fundamental Equation (3) to the field F = (F, G), here we
apply it to the related vector field H = (−G, F). If t = (a, b) is a unit tangent vector
pointing so that the region is on the left, then the perpendicular vector n = (b, −a)
is a unit vector that points away from the region as shown in Figure 9.9. Since
F = (F, G), we have

\[
\mathbf{H} \cdot \mathbf{t} = (-G, F) \cdot (a, b) = -aG + bF = (F, G) \cdot (b, -a)
  = \mathbf{F} \cdot \mathbf{n}.
\]

Hence the line integral of H over γ becomes

\[
\oint_\gamma \mathbf{H} \cdot d\mathbf{x} = \oint_\gamma \mathbf{H} \cdot \mathbf{t}\,ds
  = \oint_\gamma \mathbf{F} \cdot \mathbf{n}\,ds.
\]

On the other hand, the area integral for Green's Theorem applied to H is

\[
\int_D \left( \frac{\partial F}{\partial x} + \frac{\partial G}{\partial y} \right) dx\,dy.
\]

We define a real-valued function div F called the divergence of F by

\[
\operatorname{div} \mathbf{F}(x, y)
  = \frac{\partial F}{\partial x}(x, y) + \frac{\partial G}{\partial y}(x, y).
\]

In terms of the divergence, Green's Theorem is

\[
\int_D \operatorname{div} \mathbf{F}\,dA = \oint_\gamma \mathbf{F} \cdot \mathbf{n}\,ds.
\]

This version of Green's Theorem is called the 2-dimensional Gauss's Theorem, or
the 2-dimensional divergence theorem.

Using the fluid flow interpretation, in which F represents the velocity field of a
fluid flow, the line integral in the divergence theorem is the integral of the outward
normal coordinate F · n of F over γ and gives the rate at which fluid is flowing out
of the region D bounded by γ. The value of this line integral is called the total flow
rate, or flux, of F across γ in the outward direction. Gauss's Theorem shows
that the flux across γ, denoted Φ(γ), is equal to the integral of the divergence of
F over the region bounded by γ. Thus div F(x) measures the rate of change of the
density of the fluid at the point x. If div F(x) is predominantly positive in D, then
Φ(γ), the outward flow, will be positive, while a negative Φ(γ) indicates that more
fluid is going into D than is going out. If div F is identically zero, then F is said
to represent an incompressible flow, since the flow into and out of arbitrarily small
neighborhoods of every point will be exactly balanced.

A flow in the plane determined by the vector field F(x, y) = y i + x j is incompressible
since div F(x, y) = 0. Hence

\[
0 = \int_D \operatorname{div} \mathbf{F}\,dA = \oint_\gamma \mathbf{F} \cdot \mathbf{n}\,ds
  = \oint_\gamma -x\,dx + y\,dy
\]

for every circular disk D with counterclockwise oriented boundary circle γ. Indeed,
if we were to parametrize γ by x = x0 + r cos t, y = y0 + r sin t for 0 ≤ t ≤ 2π
we would discover that the total flux of F across γ is 0 for all choices of the point
(x0, y0) and radius r.
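The claim about zero flux can be tried out with the parametrization just described. In the sketch below, `outward_flux` is a hypothetical helper that approximates the integral of F · n ds for F = (y, x), using the outward unit normal n = (cos t, sin t) and ds = r dt on the circle; the sample centers and radii are arbitrary choices.

```python
import math

def outward_flux(x0, y0, r, n=2000):
    """Midpoint-rule flux of F = (y, x) across the circle of radius r about (x0, y0)."""
    dt = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = x0 + r * math.cos(t), y0 + r * math.sin(t)
        nx, ny = math.cos(t), math.sin(t)     # outward unit normal
        total += (y * nx + x * ny) * r * dt   # F . n ds with ds = r dt
    return total

for circle in [(0.0, 0.0, 1.0), (2.0, -1.0, 0.5), (-3.0, 4.0, 2.0)]:
    print(circle, outward_flux(*circle))
```

Each printed flux should be zero up to rounding error, whatever center and radius are chosen.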

In Chapter 8, Section 4 we introduced without proof the continuity equation for
fluid flow, namely

\[
\textbf{1.3} \qquad \frac{\partial \rho}{\partial t}(\mathbf{x}) = -\operatorname{div} \mathbf{F}(\mathbf{x}),
\]

where F is the velocity field of a fluid flow and ρ(x) is the density of the fluid
at x. Proving the 2-dimensional continuity equation is an interesting application of
Gauss's Theorem in the next example.

Let F(x) be the continuously differentiable velocity field of a fluid flow in
2-dimensional space, with continuously differentiable fluid density ρ(x), and let D
be an arbitrary region of finite area in the domain of F(x). We assume that no fluid
is created or destroyed in D, so any change in density is due to compression or
expansion of the fluid. Then

\[
\frac{d}{dt} \int_D \rho\,dx\,dy = -\oint_\delta \mathbf{F} \cdot \mathbf{n}\,ds,
\]

where n is the outward-pointing unit normal to the boundary curve δ of D. The left
side is the rate of change of mass with respect to t, which, because we assume no
fluid is created or destroyed in D, must be due to flux across δ, as measured by
the right side. We need the minus sign because the integral by itself measures total
outward flux, which would be positive if the density ρ were decreasing, making the
left side negative. Now apply the Leibniz differentiation rule of Chapter 7, Section 3
on the left and Gauss's Theorem on the right to get

\[
\int_D \frac{\partial \rho}{\partial t}\,dx\,dy = -\int_D \operatorname{div} \mathbf{F}\,dx\,dy,
\quad \text{or} \quad
\int_D \left( \frac{\partial \rho}{\partial t} + \operatorname{div} \mathbf{F} \right) dx\,dy = 0.
\]

Since D is arbitrary we can choose D to be a disk Dr(x0) of radius r centered at
an arbitrary point x0 in the domain of F(x). Since the integrand in the right-hand
equation is continuous it must be identically zero; otherwise a nonzero value for it at
x0 would, for small enough positive r, give a nonzero value for the integral over Dr.
This establishes Equation 1.3.

EXERCISES

In Exercises 1 to 4, use Green's theorem to compute the value of the line integral
$\oint_\gamma y\,dx + x^2\,dy$, where γ is the indicated closed path.

1. The circle given by g(t) = (cos t, sin t), 0 ≤ t ≤ 2π
2. The square with corners at (±1, ±1), traced counterclockwise
3. The square with corners at (0, 0), (1, 0), (1, 1), and (0, 1), traced counterclockwise
4. The ellipse x² + 4y² = 4, traced clockwise

5. Use each of the three paths γ1, γ2, or γ3 in Example 3 of the text to compute this
   path-independent line integral from (0, 0) to (1, 1): $\int_{\gamma_k} y\,dx + x\,dy$.
6. Let γ be the curve parametrized by g(t) = (2 cos t, 3 sin t), 0 ≤ t ≤ 2π.
   Compute $\oint_\gamma (2x + y)\,dx + (x + 3y)\,dy$.

In Exercises 7 to 10, evaluate the following line integrals by whatever method seems
simplest.

7. $\oint_\gamma (x - y)\,dx + (x + y)\,dy$, where γ is a triangle traced counterclockwise
   and having for its three vertices (0, 0), (1, 0), and (1, 1)
8. Use the same integrand as in Exercise 7, but change the path to the square with
   corners at (0, 0), (1, 0), (1, 1), and (0, 1), traced counterclockwise.
9. $\oint_c (x^2 - y^2)\,dx + (x^2 + y^2)\,dy$, where c is the circle of radius 1 centered
   at the origin and traced clockwise
10. $\oint_\gamma (x^2 - y^2)\,dx + (x^2 + y^2)\,dy$, where γ is the circle of radius 1
    centered at the origin traced clockwise, together with the circle of radius 2 traced
    counterclockwise
11. Show that if D is a simple region bounded by a piecewise smooth curve γ, traced
    counterclockwise, then the area of D is given by

    \[ A(D) = \tfrac{1}{2} \oint_\gamma (-y\,dx + x\,dy). \]

12. Let f be a real-valued function with continuous second-order derivatives in an
    open set D in ℝ². Let F be the vector field defined in D by F(x) = ∇f(x), the
    gradient of f. Show that if F(x) = (F(x), G(x)), then the equation
    (∂G/∂x) − (∂F/∂y) = 0 is satisfied in D.
13. (a) If f(x, y) = arctan(y/x) for x > 0, compute ∇f(x, y).
    (b) Show that the formulas for the coordinate functions of ∇f found in part (a)
        define a continuous vector field F(x, y) = (F(x, y), G(x, y)) for all
        (x, y) ≠ (0, 0).
    (c) Show that there is no function g such that ∇g(x, y) = F(x, y) for all
        (x, y) ≠ (0, 0). [Hint: If g existed, then the line integral of ∇g would be
        independent of the path as long as the path avoided (0, 0).]
14. (a) Consider a particle moving in a plane vertical to the surface of the earth and
        subject to the gravitational field N(x, y) = (0, mg), where m is the mass of the
        particle and g is the acceleration of gravity. Show that as the particle moves
        in the plane, the amount of work done is independent of the path between
        two points and depends only on the initial and final points. In particular, the
        work done in moving along a closed path is zero.
    (b) Replace the field N by a field F = (F, G) satisfying (∂G/∂x) = (∂F/∂y)
        throughout the plane. Show that the same conclusions hold.
15. Assume that the vector field F = (F, G) is a gradient field, that is, F = ∇f for
    some real-valued f. Show that Green's Formula can be written in the form

    \[ \int_D \Delta f\,dA = \oint_\gamma \nabla f \cdot \mathbf{n}\,ds, \]

    where Δf = (∂²f/∂x²) + (∂²f/∂y²), the Laplacian of f.
16. (a) Show that if f(x, y) is a continuous real-valued function defined in an open
        set B of ℝ², and $\int_D f(x, y)\,dx\,dy = 0$ for every circular disk D in B, then
        f(x, y) is identically zero in B. [Hint: Show that if f(x0) ≠ 0 for some x0 in
        B, then there is a disk D centered at x0 such that |f(x)| ≥ δ for some δ > 0,
        and all x in D.]
    (b) Use part (a) and Stokes's Theorem to show that if curl F is continuous in an
        open set D, and the circulation of F is zero around every smooth circuit in D,
        then F is irrotational in D; that is, curl F is identically zero in D.
    (c) Use part (a) and Gauss's Theorem to show that if div F is continuous in D and
        the flux Φ(γ) = 0 for every smooth circuit γ in D, then F is incompressible.
17. Define

    \[ \mathbf{F}(x, y) = \frac{-y}{x^2 + y^2}\,\mathbf{i} + \frac{x}{x^2 + y^2}\,\mathbf{j}
       \quad \text{for } (x, y) \ne (0, 0). \]

    (a) Show that div F is identically zero. What implication does this have for areas
        of regions under the influence of the flow generated by F?
    (b) Show that curl F is identically zero. What implication does this fact have for
        the circulation of F around circular paths that don't go around the origin?
    (c) What is the circulation of F along a counterclockwise-oriented circle of radius
        a centered at the origin? Does this result contradict part (b)? Explain your
        answer.

The equations curl F = 0 and div F = 0 occur in complex variable theory in a slightly
different form as the Cauchy-Riemann equations. In Exercises 18 to 21 show that if
u(x, y) and v(x, y) are the real and imaginary parts, respectively, of the following
complex-valued functions, then the vector field given by F(x, y) = (u(x, y), −v(x, y))
is irrotational and incompressible.

18. (x + iy)²
19. (x + iy)³
20. $e^{x + iy}$
21. ½ ln(x² + y²) + i arctan(y/x), x > 0

SECTION 2 CONSERVATIVE VECTOR FIELDS


2A Potentials
The examples of the previous section show that, under certain conditions, it's possible
to alter the path of integration in a line integral in the plane without affecting the
value of the integral. Not all line integrals have this property, but those that do are
particularly important, not only for the computational reasons already illustrated, but
also because of their relation to the gradient. We have the following theorem, valid
in ℝⁿ, which is a converse to Theorem 1.3 of Chapter 8, Section 1.
2.1 Theorem. Let F be a continuous vector field defined in a polygonally
connected open subset D of ℝⁿ. If the line integral

\[
\int_\gamma \mathbf{F} \cdot d\mathbf{x}
\]

is independent of the piecewise smooth path γ from x0 to x in D, then the real-valued
function defined by

\[
f(\mathbf{x}) = \int_{\mathbf{x}_0}^{\mathbf{x}} \mathbf{F} \cdot d\mathbf{x}
\]

is continuously differentiable and satisfies the vector equation ∇f = F throughout D.


Proof. We have to show that, for each x in D, ∇f(x) = F(x). Since x is an interior
point of D, there is a ball of radius δ centered at x and contained in D. This implies
that, for any unit vector u and for all real numbers t satisfying |t| < δ, the vectors
x + tu are contained in D. Since the line integral is independent of the path, we
choose an arbitrary piecewise smooth path from x0 to x, lying in D, and extend it
by a linear segment to the vector x + tu, |t| < δ, as shown in Figure 9.10. Then

\[
\begin{aligned}
f(\mathbf{x} + t\mathbf{u}) - f(\mathbf{x})
  &= \int_{\mathbf{x}_0}^{\mathbf{x} + t\mathbf{u}} \mathbf{F} \cdot d\mathbf{x}
   - \int_{\mathbf{x}_0}^{\mathbf{x}} \mathbf{F} \cdot d\mathbf{x} \\
  &= \int_{\mathbf{x}}^{\mathbf{x} + t\mathbf{u}} \mathbf{F} \cdot d\mathbf{x} \\
  &= \int_0^t \mathbf{F}(\mathbf{x} + v\mathbf{u}) \cdot \mathbf{u}\,dv.
\end{aligned}
\]

FIGURE 9.10

In the result of this computation we let u = ej, the jth standard basis vector in
ℝⁿ. Then

\[
\frac{\partial f}{\partial x_j}(\mathbf{x})
  = \lim_{t \to 0} \frac{f(\mathbf{x} + t\mathbf{e}_j) - f(\mathbf{x})}{t}
  = \lim_{t \to 0} \frac{1}{t} \int_0^t \mathbf{F}(\mathbf{x} + v\mathbf{e}_j) \cdot \mathbf{e}_j\,dv.
\]

Since the integral in this last limit is zero when t = 0, the limit is the derivative
with respect to t of the integral, evaluated at t = 0. By the fundamental theorem of
calculus, this is just the integrand evaluated at v = 0, so

\[
\frac{\partial f}{\partial x_j}(\mathbf{x}) = \mathbf{F}(\mathbf{x}) \cdot \mathbf{e}_j = F_j(\mathbf{x}),
\]

where Fj is the jth coordinate function of F. Since F was assumed continuous, so
are the partial derivatives ∂f/∂xj; therefore f is continuously differentiable on D.
Finally, the equations (∂f/∂xj)(x) = Fj(x), j = 1, ..., n, taken all together mean
that ∇f = F in D.
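The construction in Theorem 2.1 can be imitated numerically: define f by integrating F along a path from a fixed base point, then compare a finite-difference gradient of f with F. In the sketch below the field F = (2xy, x² + 3y²) is an illustrative assumption (it is the gradient of x²y + y³, so its line integrals are path independent), the base point is taken to be the origin, and straight segments are used as the paths; `potential` and `numerical_gradient` are hypothetical helper names.

```python
def F(x, y):
    # Illustrative gradient field (an assumption): gradient of x^2*y + y^3.
    return (2.0 * x * y, x * x + 3.0 * y * y)

def potential(x, y, n=2000):
    """f(x, y) = line integral of F along the straight segment from (0, 0) to (x, y)."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * h
        fx, fy = F(s * x, s * y)
        total += (fx * x + fy * y) * h   # F(g(s)) . g'(s) ds with g(s) = (s*x, s*y)
    return total

def numerical_gradient(x, y, h=1e-4):
    """Central-difference approximation to the gradient of the potential at (x, y)."""
    dfdx = (potential(x + h, y) - potential(x - h, y)) / (2.0 * h)
    dfdy = (potential(x, y + h) - potential(x, y - h)) / (2.0 * h)
    return (dfdx, dfdy)

print(numerical_gradient(1.0, 2.0), F(1.0, 2.0))
```

The two printed pairs should agree to several decimal places, which is what ∇f = F predicts at the sample point.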
A vector field F for which there is a real-valued function f such that F = ∇f is
called a conservative field or gradient field. In that case f is called a field potential
of F. The next example motivates this terminology.

Suppose that a continuous force field F is defined in a region D of ℝ³. Suppose also
that the work done in moving a particle from one point to another under the influence
of the field is independent of whatever path it's constrained to take between the two
points. Thus if x1 and x2 are two points in the field and W(x1, x2) represents the
work done in going from x1 to x2, we can write

\[
W(\mathbf{x}_1, \mathbf{x}_2) = \int_{\mathbf{x}_1}^{\mathbf{x}_2} \mathbf{F} \cdot d\mathbf{x}.
\]

If the particle follows a particular path given by x(t) = g(t), then the velocity and
acceleration vectors are v(t) = g'(t) and a(t) = g''(t), and we have F(g(t)) =
ma(t), where m is the mass of the particle. We write v = |v|, so that v² = v · v.
Hence if x1 = g(t1) and x2 = g(t2), then

\[
W(\mathbf{x}_1, \mathbf{x}_2) = \int_{t_1}^{t_2} m\,\mathbf{a}(t) \cdot \mathbf{v}(t)\,dt.
\]

But since a(t) = v'(t), and (d/dt)v²(t) = 2v(t) · v'(t), we have

\[
W(\mathbf{x}_1, \mathbf{x}_2) = \frac{m}{2} \int_{t_1}^{t_2} \frac{d}{dt}\left[ v^2(t) \right] dt
  = \frac{m}{2} \left( v^2(t_2) - v^2(t_1) \right). \tag{1}
\]

The function T(t) = (m/2)v²(t) is called the kinetic energy of the particle at time t.
On the other hand, if we fix a point x0 in D, then by Theorem 2.1, the equation

\[
U(\mathbf{x}) = -\int_{\mathbf{x}_0}^{\mathbf{x}} \mathbf{F} \cdot d\mathbf{x}
\]

defines a continuously differentiable function U in D. Using independence of path
to integrate from x1 to x2 via x0, we get

\[
W(\mathbf{x}_1, \mathbf{x}_2)
  = \int_{\mathbf{x}_0}^{\mathbf{x}_2} \mathbf{F} \cdot d\mathbf{x}
  - \int_{\mathbf{x}_0}^{\mathbf{x}_1} \mathbf{F} \cdot d\mathbf{x}
  = -U(\mathbf{x}_2) + U(\mathbf{x}_1). \tag{2}
\]

Comparison of Equations (1) and (2) shows that

\[
U(\mathbf{x}_1) + \frac{m}{2} v^2(t_1) = U(\mathbf{x}_2) + \frac{m}{2} v^2(t_2).
\]

In other words, along the path traced by g(t), the sum U(g(t)) + T(t) is a constant,
independent of t, called the total energy of the particle. For this reason, the function
U(x), which is a function of position in D, is called the potential energy of the
field F. Thus the potential energy is minus the field potential. Notice that there is
an arbitrary choice made in defining the potential in that the point x0 was picked to
have zero potential. The choice of some other point x1 in place of x0 would change
the function U by at most an additive constant equal to W(x0, x1). It is the constant
total energy that's "conserved" and that gives rise to the term conservative field.
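Conservation of U + T can be seen concretely for a particle in a constant gravitational field. The sketch below is an illustration under stated assumptions: the field is taken as F = (0, −M·GRAV) with the y-axis pointing up, the base point x0 is the origin (so U = M·GRAV·y), and the mass, initial velocity, and sample times are arbitrary values chosen for the example.

```python
M = 2.0           # mass (illustrative value)
GRAV = 9.8        # acceleration of gravity
V0 = (3.0, 4.0)   # initial velocity at the origin (illustrative value)

def position(t):
    # Path g(t) of a particle obeying F = M * a with F = (0, -M * GRAV).
    return (V0[0] * t, V0[1] * t - 0.5 * GRAV * t * t)

def velocity(t):
    return (V0[0], V0[1] - GRAV * t)

def kinetic(t):
    vx, vy = velocity(t)
    return 0.5 * M * (vx * vx + vy * vy)

def potential_energy(t):
    # U(x) = -(integral of F from the origin to x) = M * GRAV * y for this field.
    _, y = position(t)
    return M * GRAV * y

energies = [kinetic(t) + potential_energy(t) for t in (0.0, 0.25, 0.5, 1.0, 2.0)]
print(energies)
```

Every entry should equal the initial kinetic energy, (M/2)|V0|² = 25, up to rounding.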

2B Path Independence
For a vector field F defined in a region D of ℝⁿ, independence of path in the line
integral $\int \mathbf{F} \cdot d\mathbf{x}$ means that

\[
\textbf{2.2} \qquad
\int_{\gamma[\mathbf{x}_1, \mathbf{x}_2]} \mathbf{F} \cdot d\mathbf{x}
  = \int_{\delta[\mathbf{x}_1, \mathbf{x}_2]} \mathbf{F} \cdot d\mathbf{x},
\]

where γ[x1, x2] and δ[x1, x2] are any two piecewise smooth curves in D having
initial point x1 and terminal point x2. An alternative formulation of the independence
property is that

\[
\textbf{2.3} \qquad \oint_\gamma \mathbf{F} \cdot d\mathbf{x} = 0
\]

for every piecewise smooth closed curve γ lying in D. The equivalence of the
two properties follows from the observations that γ[x1, x2] followed by δ[x1, x2] in
reverse direction is a closed path, and that a closed path may be regarded as two
paths joining x1 and x2, but traced in opposite directions.
The following theorem is a formal summary of three equivalent characteristics of
gradient fields, the first of which is just our original definition.

2.4 Theorem. Let F be a continuous vector field defined in a polygonally
connected open set D in ℝⁿ. Then each of the following three statements implies the
others.
(a) The integral of F over every piecewise smooth path from x1 to x2 in D has
the same value, so we can write the integral as
$\int_{\mathbf{x}_1}^{\mathbf{x}_2} \mathbf{F}(\mathbf{x}) \cdot d\mathbf{x}$.
(b) The integral over every piecewise smooth closed path γ in D is zero, that is,
$\oint_\gamma \mathbf{F}(\mathbf{x}) \cdot d\mathbf{x} = 0$.
(c) There is a continuously differentiable function f : D → ℝ such that F is the
gradient of f, that is, F(x) = ∇f(x) for all x in D.
Proof. To see that (a) implies (b), let the two points x1 and x2 on the closed path γ
separate γ into two paths, p from x1 to x2 and another, q, from x2 to x1. Reversing
direction on the second of these paths gives another path q⁻ from x1 to x2. Assuming
(a), we have
$\int_p \mathbf{F} \cdot d\mathbf{x} = \int_{q^-} \mathbf{F} \cdot d\mathbf{x}$. Hence
$\int_p \mathbf{F} \cdot d\mathbf{x} - \int_{q^-} \mathbf{F} \cdot d\mathbf{x} = 0$. Thus we
obtain (b) from

\[
\oint_\gamma \mathbf{F} \cdot d\mathbf{x}
  = \int_p \mathbf{F} \cdot d\mathbf{x} + \int_q \mathbf{F} \cdot d\mathbf{x}
  = \int_p \mathbf{F} \cdot d\mathbf{x} - \int_{q^-} \mathbf{F} \cdot d\mathbf{x} = 0.
\]

To see that (b) implies (a), we reverse the previous argument, letting r and s be
two given piecewise smooth paths from x1 to x2. Then r together with the reversed
curve s⁻ make up a closed path γ over which F has integral zero. It follows that

\[
\int_r \mathbf{F} \cdot d\mathbf{x} + \int_{s^-} \mathbf{F} \cdot d\mathbf{x} = 0,
\quad \text{so} \quad
\int_r \mathbf{F} \cdot d\mathbf{x} - \int_s \mathbf{F} \cdot d\mathbf{x} = 0.
\]

Finally, Theorem 1.3 of Chapter 8, Section 1 states that (c) implies (a), while
Theorem 2.1 of the present section states that (a) implies (c). It then follows from
the previous implications that (b) and (c) imply each other. ∎
2C Derivative Criterion
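A more intrinsic criterion for deciding whether a continuous vector field is a gradient
field arises as follows. Suppose first that F : ℝ² → ℝ² is continuous on an open set D,
and that F is a gradient field, that is, there is a real-valued function f defined on D
such that ∇f = F. In terms of coordinate functions F1 and F2 of F, this means that

\[
\frac{\partial f}{\partial x_1} = F_1 \quad \text{and} \quad \frac{\partial f}{\partial x_2} = F_2.
\]

If F itself is continuously differentiable, we can form the second partials,

\[
\frac{\partial^2 f}{\partial x_2\,\partial x_1} = \frac{\partial F_1}{\partial x_2}
\quad \text{and} \quad
\frac{\partial^2 f}{\partial x_1\,\partial x_2} = \frac{\partial F_2}{\partial x_1},
\]

and conclude from their equality that

\[
\frac{\partial F_1}{\partial x_2} = \frac{\partial F_2}{\partial x_1} \tag{3}
\]

throughout D. By the definition of curl F, Equation (3) says curl F = 0. This equation
has an extended consequence: We consider a more general vector field F : ℝⁿ → ℝⁿ,
which we assume continuously differentiable in an open subset D of ℝⁿ. If F is a
gradient field, there is an f such that ∇f = F, or, in terms of coordinate functions,

\[
\frac{\partial f}{\partial x_j} = F_j, \qquad j = 1, \ldots, n.
\]

Differentiating with respect to xi and using the equality of the mixed second
partials of f, we get

\[
\frac{\partial F_j}{\partial x_i} = \frac{\partial^2 f}{\partial x_i\,\partial x_j}
  = \frac{\partial^2 f}{\partial x_j\,\partial x_i} = \frac{\partial F_i}{\partial x_j}. \tag{4}
\]

The functions ∂Fi/∂xj are the entries in the n-by-n Jacobian matrix of F : ℝⁿ → ℝⁿ,
and Equation (4) expresses its symmetry, which means that F' equals its transpose
across its main diagonal.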

2.5 Theorem. If F : ℝⁿ → ℝⁿ is a continuously differentiable gradient field, then
F', the Jacobian matrix of F, is symmetric.

EXAMPLE 2 The converse of Theorem 2.5 is false, as we see by looking at an example in ℝ².
The vector field

\[
\mathbf{F}(x, y) = \left( \frac{-y}{x^2 + y^2}, \frac{x}{x^2 + y^2} \right)
\]

is defined for all (x, y) ≠ (0, 0). You can check that ∂F1/∂y = ∂F2/∂x, but there is
no continuously differentiable f such that ∇f(x, y) = F(x, y) for all (x, y) ≠ (0, 0).
The underlying reason is that for x > 0 the function f(x, y) = arctan(y/x) satisfies
∇f = F, but this f cannot be extended to be a single-valued solution of the equation
in the entire plane with the origin deleted. (See Exercise 13 of the previous section.)
If f(x, y) could be so extended to a function g(x, y), we would have

\[
\oint_c \mathbf{F}(\mathbf{x}) \cdot d\mathbf{x} = \oint_c \nabla g(\mathbf{x}) \cdot d\mathbf{x}
  = g(1, 0) - g(1, 0) = 0,
\]

where c lies on x² + y² = 1, starting and ending at (1, 0). This is impossible, since
explicit calculation along c, traced once counterclockwise with (x, y) = (cos t, sin t),
shows

\[
\oint_c \mathbf{F}(\mathbf{x}) \cdot d\mathbf{x} = \oint_c -y\,dx + x\,dy
  = \int_0^{2\pi} (\sin^2 t + \cos^2 t)\,dt = 2\pi \ne 0.
\]
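Both halves of this counterexample can be checked numerically: the Jacobian symmetry condition holds away from the origin, yet the circulation around the unit circle is not zero. In the sketch below, `jacobian_asymmetry` and `circulation_unit_circle` are hypothetical helper names, and central differences with the midpoint rule are just one convenient choice of approximation.

```python
import math

def F(x, y):
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def jacobian_asymmetry(x, y, h=1e-5):
    """Central-difference estimate of dF1/dy - dF2/dx at (x, y)."""
    dF1_dy = (F(x, y + h)[0] - F(x, y - h)[0]) / (2.0 * h)
    dF2_dx = (F(x + h, y)[1] - F(x - h, y)[1]) / (2.0 * h)
    return dF1_dy - dF2_dx

def circulation_unit_circle(n=2000):
    """Midpoint rule for F . dx around (cos t, sin t), 0 <= t <= 2*pi."""
    dt = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        f1, f2 = F(math.cos(t), math.sin(t))
        total += (f1 * (-math.sin(t)) + f2 * math.cos(t)) * dt
    return total

print(jacobian_asymmetry(1.0, 2.0), jacobian_asymmetry(-0.5, 0.3))
print(circulation_unit_circle(), 2.0 * math.pi)
```

The first line should print values near zero and the second a value near 2π, so the symmetry condition alone does not force the circulation to vanish.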
Example 2 shows that the nature of the region D on which F is defined is signif-
icant in determining whether F is a gradient field. By making a special assumption
about D we can obtain a partial converse to Theorem 2.5.

2.6 Theorem. Let R be an open coordinate rectangle in ℝⁿ, and let F be a
continuously differentiable vector field on R. If F'(x), the Jacobian matrix of F,
satisfies ∂Fi/∂xj = ∂Fj/∂xi, then F is a gradient field.

Proof. Pick a fixed point x0 in R and let x be any other point of R. We consider
paths from x0 to x, each consisting of a sequence of line segments parallel to the
axes and such that each coordinate variable varies on at most one such segment.
Three-dimensional examples are shown in Figure 9.11. The reason for looking at
such paths is to be able to approach x from any coordinate direction for the purpose
of taking partial derivatives at x. Choosing one of these paths, call it γx, define a
real-valued function f by

\[
f(\mathbf{x}) = \int_{\gamma_{\mathbf{x}}} \mathbf{F} \cdot d\mathbf{x}. \tag{5}
\]

FIGURE 9.11

Although the particular path γx is only one of several of the same type, we'll see that
any of the other possible choices would lead to the same value for f(x). The reason
is that any one of these paths can be altered step by step into any one of the others by
changes, each of which leaves the value of the integral (5) unaltered. Each path can be
described as a sequence of segments, along which only one coordinate variable varies.
(For example, the dashed path in Figure 9.11 corresponds to x1, x2, x3, and the solid
one to x2, x3, x1.) We can change one such sequence into another by successively
interchanging adjacent variables in pairs until the desired order is reached. But each
interchange replaces a pair of segments (δi, δj) by another pair (δj', δi') lying in the
same 2-dimensional plane. To see that the replacement leaves the value of the integral
invariant, we form the circuit δ consisting of the segments δi and δj, followed by δi'
and δj' in the reverse of their original directions. On these segments, xi and xj are
the only variables that vary, so we can write the circuit integral as

\[
\oint_\delta \mathbf{F} \cdot d\mathbf{x} = \oint_\delta F_i\,dx_i + F_j\,dx_j.
\]

We apply Green's Theorem to the 2-dimensional rectangle Rδ bounded by δ and get

\[
\oint_\delta \mathbf{F} \cdot d\mathbf{x}
  = \int_{R_\delta} \left( \frac{\partial F_j}{\partial x_i} - \frac{\partial F_i}{\partial x_j} \right) dx_i\,dx_j = 0,
\]

since by the symmetry assumption, ∂Fj/∂xi − ∂Fi/∂xj = 0 in R. Thus

\[
\oint_\delta \mathbf{F} \cdot d\mathbf{x}
  = \int_{(\delta_i, \delta_j)} \mathbf{F} \cdot d\mathbf{x}
  - \int_{(\delta_j', \delta_i')} \mathbf{F} \cdot d\mathbf{x} = 0,
\]

and so the change of path leaves the value of the integral invariant.

Once it has been established that x can be approached along a path of integration
that varies only in an arbitrary coordinate, say the kth, we have, as in the proof of
Theorem 2.1, the equation (∂f/∂xk)(x) = Fk(x), for all k. Thus ∇f(x) = F(x) for all
x in R. ∎

EXAMPLE 3

Applying Theorem 2.6 to the field

$$\mathbf{F}(x, y) = \left(\frac{-y}{x^2+y^2},\ \frac{x}{x^2+y^2}\right), \qquad (x, y) \neq (0, 0),$$

of Example 2, we conclude that $\mathbf{F}$, when restricted to any coordinate rectangle not containing the origin, is a gradient field. This is true, for example, in any of the four half-planes bounded by a coordinate axis. A potential function $f$ for the half-plane $x > 0$ can be computed by the line integral

$$f(x, y) = \int_{(1,0)}^{(x,y)} \frac{-y\,dx}{x^2+y^2} + \frac{x\,dy}{x^2+y^2},$$

where the path of integration is any piecewise smooth curve from $(1, 0)$ to $(x, y)$. A polygonal path from $(1, 0)$ to $(x, 0)$ and from $(x, 0)$ to $(x, y)$ is shown in Figure 9.12. On the first segment, the entire integral is zero because $y$ is identically zero, and on the second segment, with $x$ constant, the integral reduces to

$$\int_0^y \frac{x\,dy}{x^2+y^2} = \arctan\left(\frac{y}{x}\right).$$

[FIGURE 9.12: the polygonal path from (1, 0) to (x, 0) to (x, y).]

The most general potential of $\mathbf{F}$ in the right half-plane differs from this one by at most a constant. (Why?) The general solution of $\nabla f = \mathbf{F}$ in the half-plane is therefore

$$f(x, y) = \arctan\frac{y}{x} + C.$$
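The conclusion of Example 3 can be spot-checked numerically. The following is a minimal sketch, not part of the text: it compares a central-difference gradient of $\arctan(y/x)$ with the field at a few sample points in the half-plane $x > 0$. The helper names, the sample points, and the step size `h` are all assumptions chosen for illustration.

```python
import math

def F(x, y):
    """The field of Example 3, defined away from the origin."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def f(x, y):
    """Candidate potential on the half-plane x > 0."""
    return math.atan(y / x)

def grad(func, x, y, h=1e-6):
    """Central-difference approximation to the gradient of func at (x, y)."""
    fx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    fy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return (fx, fy)

def max_gradient_error(points):
    """Largest componentwise gap between grad f and F over the sample points."""
    worst = 0.0
    for x, y in points:
        gx, gy = grad(f, x, y)
        Fx, Fy = F(x, y)
        worst = max(worst, abs(gx - Fx), abs(gy - Fy))
    return worst

# Hypothetical sample points, all with x > 0.
samples = [(0.5, -2.0), (1.0, 0.0), (2.0, 3.0), (4.0, -1.5)]
print(max_gradient_error(samples))
```

The printed value should be tiny, limited only by finite-difference and rounding error. The same check fails if a sample point with $x < 0$ is used together with a path-based potential started at $(1, 0)$, which is the point of restricting to a half-plane.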

EXAMPLE 4

In this example we look at the vector field

$$\mathbf{F}(x, y) = \left(\frac{x}{x^2+y^2},\ \frac{y}{x^2+y^2}\right), \qquad (x, y) \neq (0, 0).$$

Like the field in Example 3 this one satisfies the hypotheses of Theorem 2.6, so we conclude from the theorem that there is a potential function $f$ such that $\nabla f = \mathbf{F}$ on any half-plane bounded by a coordinate axis. But in contrast to Example 3, for this field there is a continuous potential function defined everywhere in $\mathbb{R}^2$ except at the origin. This information can't be obtained simply by verifying the symmetry condition, but requires checking one of the conditions of Theorem 2.4. To actually find a potential we calculate the line integral of $\mathbf{F}(x, y)$ from $(1, 0)$ to $(x, y)$ along the polygonal path from $(1, 0)$ to $(x, 0)$ and then from $(x, 0)$ to $(x, y)$ as in Example 3. After some cancellation the result is

$$f(x, y) = \int_1^x \frac{dx}{x} + \int_0^y \frac{y\,dy}{x^2+y^2} = \frac{1}{2}\ln(x^2+y^2), \qquad (x, y) \neq (0, 0).$$

Thus we have a potential function f (x, y) for the vector field F(x, y ), usually called
the logarithmic potential, valid everywhere in the plane except at the origin.
Section 2D Conservative Vector Fields 417

2D Indefinite Integration
Given a vector field $\mathbb{R}^n \xrightarrow{\mathbf{F}} \mathbb{R}^n$, finding a real-valued function $f$ such that $\nabla f = \mathbf{F}$ amounts to solving for the function $f$ in the system of partial differential equations

$$\frac{\partial f}{\partial x_k}(\mathbf{x}) = F_k(\mathbf{x}), \qquad k = 1, 2, \ldots, n,$$

where $F_1, F_2, \ldots, F_n$ are the given coordinate functions of $\mathbf{F}$. It's sometimes simpler to avoid working with definite integrals along explicit paths of integration as in the previous example and instead use indefinite integrals. Assuming equality of mixed partials, the consistency conditions $\partial F_i/\partial x_j = \partial F_j/\partial x_i$ must be satisfied, so it usually makes sense to verify them before proceeding further; failure of even one of these equations means that there is no solution $f$ to the system of equations.

EXAMPLE 5

Suppose $\mathbf{F}(x, y) = (y^2, 2xy + 1)$, and we're looking for a function $\mathbb{R}^2 \xrightarrow{f} \mathbb{R}$ such that $f_x(x, y) = y^2$ and $f_y(x, y) = 2xy + 1$. Since $\partial(y^2)/\partial y = \partial(2xy + 1)/\partial x = 2y$ for all $(x, y)$, Theorem 2.6 guarantees that the desired function is defined for all $(x, y)$. To find $f$, start for example with $f_x(x, y) = y^2$ and integrate with respect to $x$ while holding $y$ fixed. We get

$$\int f_x(x, y)\,dx = f(x, y) = \int y^2\,dx = xy^2 + C(y).$$

It's important in principle, and often in practice, to allow the "constant" of integration $C(y)$ to depend on the temporarily fixed, but arbitrary, value $y$. Now apply $\partial/\partial y$ to this partly determined expression for $f(x, y)$ and compare the result with the given expression $f_y(x, y)$. We find

$$\frac{\partial (xy^2 + C(y))}{\partial y} = 2xy + C'(y) = 2xy + 1.$$

Canceling $2xy$, we see that we need to have $C'(y) = 1$, so $C(y) = y + c$ where $c$ is a real constant. (It's only at this point of the process that we have concrete evidence for the existence of solutions; they are $f(x, y) = xy^2 + y + c$, now directly verifiable as solutions.) As a final payoff, we see that a line integral of $\mathbf{F}$ from, for example, $\mathbf{a} = (1, 1)$ to $\mathbf{b} = (2, 3)$, is really a line integral of a gradient field. We can choose the constant $c = 0$. Hence

$$\int_{\mathbf{a}}^{\mathbf{b}} \mathbf{F}\cdot d\mathbf{x} = f(2, 3) - f(1, 1) = 21 - 2 = 19.$$

Theorem 3.4 of Chapter 5 guarantees that any two solutions $f$ to the equation $\nabla f = \mathbf{F}$ differ by at most a constant, so we know we have the most general solution in the previous example. Note also that the "partial integrations" in the example are essentially integrations along paths parallel to the axes.
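As a rough check of the payoff in this example, the sketch below evaluates the potential difference $f(2, 3) - f(1, 1)$ and compares it with a midpoint-rule approximation of the line integral of $\mathbf{F}$ along the straight segment from $(1, 1)$ to $(2, 3)$. The straight-line path, the function names, and the number of subdivisions are assumptions made for illustration; any piecewise smooth path would do.

```python
def F(x, y):
    """The field F(x, y) = (y^2, 2xy + 1) of the example."""
    return (y * y, 2 * x * y + 1)

def f(x, y):
    """The potential found by indefinite integration, with c = 0."""
    return x * y * y + y

def line_integral(field, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of field . dx
    along the straight segment from a to b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        Fx, Fy = field(a[0] + t * dx, a[1] + t * dy)
        total += (Fx * dx + Fy * dy) / n
    return total

a, b = (1.0, 1.0), (2.0, 3.0)
print(f(*b) - f(*a))           # potential difference
print(line_integral(F, a, b))  # numerical line integral along the segment
```

Both printed numbers should agree with the value 19 computed in the text, the second one up to a small quadrature error.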

EXAMPLE 6

Let

$$\mathbf{F}(x, y, z) = (F_1(x, y, z), F_2(x, y, z), F_3(x, y, z)) = e^{y+z}\,(y,\ x(y + 1),\ xy).$$

We want to solve

$$f_x(x, y, z) = ye^{y+z}, \qquad f_y(x, y, z) = x(y + 1)e^{y+z}, \qquad f_z(x, y, z) = xye^{y+z},$$

that is, find $f$ such that $\nabla f = (F_1, F_2, F_3)$. The three consistency conditions, for example, $\partial F_1/\partial y = (y + 1)e^{y+z} = \partial F_2/\partial x$, all hold, so we go ahead and integrate, choosing to start with the first equation $f_x(x, y, z) = ye^{y+z}$. We find

$$f(x, y, z) = \int f_x(x, y, z)\,dx = \int ye^{y+z}\,dx = xye^{y+z} + C(y, z).$$

The constant of integration may depend on the two variables not involved in the integration. Now apply $\partial/\partial z$ to this last expression for $f$ to get $f_z(x, y, z) = xye^{y+z} + C_z(y, z)$. The third equation, $f_z(x, y, z) = xye^{y+z}$, of our given system shows by comparison that $C_z(y, z) = 0$. This says $C(y, z) = C(y)$ is independent of $z$, so $f(x, y, z) = xye^{y+z} + C(y)$. Now compute $f_y(x, y, z) = x(y + 1)e^{y+z} + C'(y)$; comparison with the second equation of the system shows that $C'(y) = 0$, so $C(y)$ is constant. Thus $f(x, y, z) = xye^{y+z} + c$.
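The two steps of this example, checking the consistency conditions and then confirming the potential, can be imitated numerically. The sketch below is illustrative only: it uses central differences at one hypothetical test point, and the names `partial`, `symmetry_defect`, and `gradient_defect` are assumptions, not notation from the text.

```python
import math

def F(x, y, z):
    """F = e^(y+z) (y, x(y+1), xy), as in the example."""
    e = math.exp(y + z)
    return (y * e, x * (y + 1) * e, x * y * e)

def f(x, y, z):
    """The potential found in the example, with c = 0."""
    return x * y * math.exp(y + z)

def partial(func, p, i, h=1e-5):
    """Central difference of func with respect to coordinate i at the point p."""
    lo, hi = list(p), list(p)
    lo[i] -= h
    hi[i] += h
    return (func(*hi) - func(*lo)) / (2 * h)

def symmetry_defect(p):
    """Largest |dF_i/dx_j - dF_j/dx_i| over the three pairs i < j."""
    worst = 0.0
    for i in range(3):
        for j in range(i + 1, 3):
            dFi_dxj = partial(lambda *q: F(*q)[i], p, j)
            dFj_dxi = partial(lambda *q: F(*q)[j], p, i)
            worst = max(worst, abs(dFi_dxj - dFj_dxi))
    return worst

def gradient_defect(p):
    """Largest componentwise gap between grad f and F at p."""
    return max(abs(partial(f, p, i) - F(*p)[i]) for i in range(3))

p = (1.5, 0.5, -0.3)  # hypothetical test point
print(symmetry_defect(p), gradient_defect(p))
```

Both defects should be near zero. A nonzero symmetry defect at any point would signal, as the text says, that no potential exists.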

EXERCISES

1. Consider the approximation to the earth's gravitational field acting on a particle of mass 1 represented by the vector field $\mathbf{F}(x, y, z) = (0, 0, -g)$.
   (a) Find for $\mathbf{F}$ the potential energy function $U(x, y, z)$ that is zero when $(x, y, z) = (0, 0, 0)$.
   (b) If a particle of mass 1 has at $(0, 0, 0)$ a velocity vector $(v_1, v_2, v_3)$ with $v_3 > 0$, and no force but $\mathbf{F}$ acts on the particle, find the path of the particle.
   (c) Verify that the sum of potential energy and kinetic energy remains constant for the path of part (b).

2. Show that if $\mathbf{F}$ and $\mathbf{G}$ are gradient fields defined on the same domain $D$, then $\mathbf{F} + \mathbf{G}$ and $c\mathbf{F}$ are gradient fields, where $c$ is a constant.

In Exercises 3 to 6, use Theorem 2.4, Theorem 2.5, or 2.6 to decide whether the vector field is a gradient field.

3. $\mathbf{F}(x, y) = (x - y, x + y)$, for $(x, y)$ in $\mathbb{R}^2$

4. $\mathbf{G}(x, y, z) = (y, z, x)$, for $(x, y, z)$ in $\mathbb{R}^3$

5. $\mathbf{H}(x, y) = \left(\dfrac{-y}{x^2+y^2}, \dfrac{x}{x^2+y^2}\right)$, $(x, y) \neq (0, 0)$

6. $\mathbf{K}(x, y) = \left(\dfrac{x}{x^2+y^2}, \dfrac{y}{x^2+y^2}\right)$, $(x, y) \neq (0, 0)$

7. Use Theorem 2.5 to show that the vector fields in Exercises 3 and 4 are not gradient fields in any open subset at all of $\mathbb{R}^2$ or $\mathbb{R}^3$ respectively.

8. Show that the vector field $\mathbf{H}$ of Exercise 5 is a gradient field in the region $y > 0$ of $\mathbb{R}^2$ and find an explicit representation for its potential.

9. Consider the vector field defined in $\mathbb{R}^3$, with the $z$-axis deleted, by
   $$\mathbf{F}(x, y, z) = \left(\frac{-y}{x^2+y^2}, \frac{x}{x^2+y^2}, 0\right).$$
   Is $\mathbf{F}$ a gradient field?

In Exercises 10 to 13, find a field potential for the given field.

10. $\mathbf{F}(x, y, z) = (2xy, x^2 + z^2, 2yz)$

11. $\mathbf{G}(x, y) = (y\cos xy, x\cos xy)$

12. $\mathbf{H}(x, y) = \left(\dfrac{-y}{x^2+y^2}, \dfrac{x}{x^2+y^2}\right)$, $(x, y) \neq (0, 0)$

13. $\mathbf{K}(x, y) = \left(\dfrac{x}{x^2+y^2}, \dfrac{y}{x^2+y^2}\right)$, $(x, y) \neq (0, 0)$

14. Consider the vector field $\mathbf{F}$ which is the gradient of the Newtonian potential $f(\mathbf{x}) = -|\mathbf{x}|^{-1}$ for nonzero $\mathbf{x}$ in $\mathbb{R}^3$. Find the work done in moving a particle from $(1, 1, 1)$ to $(-2, -2, -2)$ along a smooth curve lying in the domain of $\mathbf{F}$.

15. Give a detailed proof of the equivalence of Relations 2.2 and 2.3 of the text.

16. In $\mathbb{R}^n$, how many paths can there be from $\mathbf{x}_0$ to $\mathbf{x}$ of the special kind described in the proof of Theorem 2.6?

17. Apply the method of indefinite integration to find a potential of the field $\mathbf{F}(x, y) = (2x/(x^2 + y^2),\ 2y/(x^2 + y^2))$.

18. Redo Example 5 of the text by first integrating the equation $f_y(x, y) = 2xy + 1$ with respect to $y$.

In Exercises 19 to 22, find a potential $f$ following the method of Examples 5 and 6 of the text.

19. Find $f$, if $\nabla f(x, y) = (e^y, xe^y)$.

20. Find $f$, if $\nabla f(x, y) = (y^2 + 2xy, 2xy + x^2)$.

21. Find $f$, if $\nabla f(x, y, z) = (y + z, z + x, x + y)$.

22. Find $f$, if $\nabla f(x, y, z) = (yz + z, xz, xy + x)$.

23. Find the function $f$ of Exercise 20 by direct computation of a line integral of $\nabla f$ from $(0, 0)$ to $(x, y)$.

24. Find the function $f$ of Exercise 21 by direct computation of a line integral from $(0, 0, 0)$ to $(x, y, z)$.

25. Verify that the line integral of the field $\mathbf{F}(x, y) = 2x/(x^2 + y^2)\,\mathbf{i} + 2y/(x^2 + y^2)\,\mathbf{j}$ is zero around every closed path that avoids the origin.

26. Suppose $\mathbf{x} = \mathbf{x}(t)$ satisfies the vector equation $\ddot{\mathbf{x}} + \nabla U(\mathbf{x}) = 0$ on a $t$-interval $I$.
   (a) Show that the scalar equation $\dot{\mathbf{x}}\cdot\ddot{\mathbf{x}} + \nabla U(\mathbf{x})\cdot\dot{\mathbf{x}} = 0$ holds on $I$.
   (b) Show that $d|\dot{\mathbf{x}}|^2/dt = 2(\dot{\mathbf{x}}\cdot\ddot{\mathbf{x}})$.
   (c) Apply part (b) to part (a) and integrate to show that $\tfrac{1}{2}|\dot{\mathbf{x}}|^2 + U(\mathbf{x}) = C$, where $C$ is constant. This equation is called a first integral of the vector differential equation, because the order of differentiation has been reduced from two to one.

27. A particle of unit mass moves with constant angular velocity $\omega$ on a circle of radius $a$ about the origin in $\mathbb{R}^2$. The centripetal force field that constrains the position $\mathbf{x}$ of the particle to remain on its circular path is $\mathbf{F}(\mathbf{x}) = -a\omega^2\mathbf{x}$.
   (a) Show that $f(\mathbf{x}) = -\tfrac{1}{2}a\omega^2|\mathbf{x}|^2$ is a field potential for $\mathbf{F}$, that is, show that $\nabla f(\mathbf{x}) = \mathbf{F}(\mathbf{x})$.
   (b) Show that the total work done by the field during one complete traversal of the circle is equal to zero.
   (c) Compute the work done by the field if, instead of traversing the circle, the particle moves along a smooth path from a circle of radius $a$ to a circle of radius $b > a$.

28. The location $\mathbf{x}$ of a single satellite relative to a fixed earth at the origin in $\mathbb{R}^2$ is governed by the force field $\mathbf{F}(\mathbf{x}) = -k|\mathbf{x}|^{-3}\mathbf{x}$, where $k$ is a positive constant.
   (a) Show that $f(\mathbf{x}) = k/|\mathbf{x}|$ is a field potential for $\mathbf{F}$, that is, show that $\nabla f(\mathbf{x}) = \mathbf{F}(\mathbf{x})$.
   (b) Show that the total work done by the field during one complete satellite orbit is equal to zero.
   (c) Compute the work done by the field if, instead of moving on a circular orbit of radius $a$, the satellite moves along a smooth path to a circular orbit of radius $b > a$.

SECTION 3 SURFACE INTEGRALS


3A Normal Vectors
In Chapter 8, we defined integrals both of a real-valued function and of a vector field over a smooth curve. Defining an integral over a surface $S$ leads to a different geometric situation with a close analogy with the line integral. As described in Chapter 4, Section 4, we assume as our primary representation for a surface $S$ a parametrization by a continuously differentiable function $\mathbb{R}^2 \xrightarrow{g} \mathbb{R}^3$. We'll write $g$ as

$$g(u, v) = \begin{pmatrix} g_1(u, v) \\ g_2(u, v) \\ g_3(u, v) \end{pmatrix}, \tag{1}$$

with $\mathbf{u} = (u, v)$ in the interior of some set $D$ in $\mathbb{R}^2$, which we assume bounded by finitely many smooth curves. We further assume that, at each point $g(u, v)$ of $S$, the tangent vectors defined by the vector partial derivatives

$$\frac{\partial g}{\partial u}(u, v), \qquad \frac{\partial g}{\partial v}(u, v)$$

determine a 2-dimensional tangent plane to $S$; in other words, that the two tangents are linearly independent. If $S$ satisfies all these conditions, we'll refer to it as a piece of smooth surface.
[FIGURE 9.13: (a) the parameter domain in the $uv$-plane; (b) the surface with the normal vector $\partial g/\partial u \times \partial g/\partial v$.]

On a smooth curve, the choice of a parametrization going from one endpoint to the other establishes an orientation for the curve. Analogously, on a piece of smooth surface, a one-to-one parametrization determines a standard normal vector

$$\frac{\partial g}{\partial u}(u, v) \times \frac{\partial g}{\partial v}(u, v) \tag{2}$$

pointing out of one side or the other of the surface at $g(u, v)$. See Figure 9.13(b).
3B Area and Mass
We recall that the length of the cross product of two vectors $\mathbf{a}$ and $\mathbf{b}$ is the area of the parallelogram spanned by $\mathbf{a}$ and $\mathbf{b}$. In particular,

$$\left|\frac{\partial g}{\partial u}(u, v) \times \frac{\partial g}{\partial v}(u, v)\right|$$

represents the area of the outlined tangent parallelogram shown in Figure 9.13(b). If we think of scaling down such parallelograms by factors $du$ and $dv$ at the points $g(u_k, v_k)$ corresponding to the corner points $(u_k, v_k)$ of a grid over $D$, then it's natural to define the area of $S$ by integrating the parallelogram area over $D$:

3.1   $$a(S) = \int_D \left|\frac{\partial g}{\partial u}(u, v) \times \frac{\partial g}{\partial v}(u, v)\right| du\,dv.$$

We assume that $\mathbb{R}^2 \xrightarrow{g} \mathbb{R}^3$ is one-to-one so that each part of the image surface is covered just once. The integral over $D$ will exist as a finite Riemann integral because $g$ is assumed to be continuously differentiable. The expression

$$da = |g_u(u, v) \times g_v(u, v)|\,du\,dv$$

is called the area element differential for the parametrized surface $S$. In addition, if $\mu(\mathbf{x})$ is a continuous scalar-valued function defined for $\mathbf{x}$ on $S$, then

3.2   $$\int_S \mu\,da = \int_D \mu(g(u, v)) \left|\frac{\partial g}{\partial u}(u, v) \times \frac{\partial g}{\partial v}(u, v)\right| du\,dv$$

exists. If $\mu(\mathbf{x}) \geq 0$, then Equation 3.2 defines the total mass due to the density $\mu$.
[FIGURE 9.14: panels (a) and (b).]

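EXAMPLE 1

Let $S$ be parametrized by

$$g(u, v) = \begin{pmatrix} u \\ v \\ u^2 + v^2 \end{pmatrix}, \qquad u^2 + v^2 \leq a^2;$$

thus $S$ is actually the graph of $z = x^2 + y^2$ for $x^2 + y^2 \leq a^2$. The surface is shown in Figure 9.14(a). Then

$$\frac{\partial g}{\partial u}(u, v) = \begin{pmatrix} 1 \\ 0 \\ 2u \end{pmatrix}, \qquad \frac{\partial g}{\partial v}(u, v) = \begin{pmatrix} 0 \\ 1 \\ 2v \end{pmatrix}.$$

We have

$$\frac{\partial g}{\partial u}(u, v) \times \frac{\partial g}{\partial v}(u, v) = \left(\begin{vmatrix} 0 & 2u \\ 1 & 2v \end{vmatrix},\ \begin{vmatrix} 2u & 1 \\ 2v & 0 \end{vmatrix},\ \begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix}\right) = (-2u, -2v, 1).$$

The length of this vector is

$$|(-2u, -2v, 1)| = \sqrt{4u^2 + 4v^2 + 1}.$$

Changing to polar coordinates gives

$$a(S) = \int_{u^2+v^2 \leq a^2} \sqrt{4u^2 + 4v^2 + 1}\,du\,dv = \int_0^{2\pi} d\theta \int_0^a \sqrt{4r^2 + 1}\;r\,dr = 2\pi\left[\frac{1}{12}(4r^2 + 1)^{3/2}\right]_0^a = \frac{\pi}{6}\left((4a^2 + 1)^{3/2} - 1\right).$$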
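Formula 3.1 lends itself to a direct numerical approximation. The following is a minimal sketch under stated assumptions: it uses a midpoint rule over a rectangular parameter domain and central differences for the partial derivatives, and it parametrizes the paraboloid of Example 1 by polar parameters $(r, \theta)$ rather than by the $(u, v)$ of the text, so that the domain is a rectangle. The function names, grid size `n`, and step `h` are hypothetical.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def surface_area(g, u_range, v_range, n=200, h=1e-6):
    """Midpoint-rule approximation of the integral of |g_u x g_v| over a
    rectangle, with g_u and g_v estimated by central differences."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            gu = [(p - q) / (2 * h) for p, q in zip(g(u + h, v), g(u - h, v))]
            gv = [(p - q) / (2 * h) for p, q in zip(g(u, v + h), g(u, v - h))]
            total += math.sqrt(sum(c * c for c in cross(gu, gv))) * du * dv
    return total

def paraboloid(r, theta):
    """Polar parametrization of z = x^2 + y^2 (an assumed alternative to
    the Cartesian parametrization used in the text)."""
    return (r * math.cos(theta), r * math.sin(theta), r * r)

a = 1.0
numeric = surface_area(paraboloid, (0.0, a), (0.0, 2 * math.pi))
exact = math.pi / 6 * ((4 * a * a + 1) ** 1.5 - 1)
print(numeric, exact)
```

The two printed values should agree to several digits. That a different parametrization of the same surface gives the same area is the content of Exercise 11 in the exercises for this section.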

The surface in the previous example can be thought of as a piece of the graph of an equation $z = f(x, y)$, where in the example $f(x, y) = x^2 + y^2$, with the domain of integration the disk $x^2 + y^2 \leq a^2$. In general, such a graph can be parametrized by a function $\mathbb{R}^2 \xrightarrow{g} \mathbb{R}^3$ of the form

$$g(x, y) = \begin{pmatrix} x \\ y \\ f(x, y) \end{pmatrix}, \qquad (x, y) \text{ in } D.$$

Then $g_x(x, y) \times g_y(x, y) = (-f_x(x, y), -f_y(x, y), 1)$ so the area differential becomes

3.3   $$da = \sqrt{f_x^2(x, y) + f_y^2(x, y) + 1}\;dx\,dy = \sqrt{|\nabla f(x, y)|^2 + 1}\;dx\,dy.$$

Using this approach in the previous example would have led to the same integral that we computed before for the surface area, namely

$$a(S) = \int_{x^2+y^2 \leq a^2} \sqrt{4x^2 + 4y^2 + 1}\;dx\,dy.$$

EXAMPLE 2

To compute the area of a sphere $S_a$ of radius $a$ using Equation 3.3, we start with the hemispherical graph $H_a$ of $z = \sqrt{a^2 - x^2 - y^2}$ over the disk $x^2 + y^2 < a^2$. See Figure 9.14(b). Then we have $f_x(x, y) = -x(a^2 - x^2 - y^2)^{-1/2}$ and $f_y(x, y) = -y(a^2 - x^2 - y^2)^{-1/2}$, so Equation 3.3 becomes

$$da = \sqrt{\frac{x^2 + y^2}{a^2 - x^2 - y^2} + 1}\;dx\,dy = \frac{a}{\sqrt{a^2 - x^2 - y^2}}\;dx\,dy.$$

This is an improper integral, because the integrand tends to infinity as $(x, y)$ tends from within the disk to an arbitrary point on the boundary. To compute the integral, integrate first over a smaller disk of radius $b$, and then let $b \to a$. This can be done by changing to polar coordinates and then letting $b \to a$. We get

$$a(H_a) = \lim_{b \to a} \int_{x^2+y^2 < b^2} \frac{a}{\sqrt{a^2 - x^2 - y^2}}\;dx\,dy = \lim_{b \to a}\, a \int_0^{2\pi} d\theta \int_0^b \frac{r}{\sqrt{a^2 - r^2}}\;dr$$

$$= 2\pi a \lim_{b \to a} \left[-(a^2 - r^2)^{1/2}\right]_0^b = 2\pi a \lim_{b \to a} \left[-(a^2 - b^2)^{1/2} + a\right] = 2\pi a^2.$$

Hence $a(S_a) = 2a(H_a) = 4\pi a^2$, the formula for the area of a sphere of radius $a$.
EXAMPLE 3

We parametrize a complete turn of a helicoid surface of width $a$ by

$$g(u, v) = \begin{pmatrix} u\cos v \\ u\sin v \\ v \end{pmatrix}, \qquad 0 \leq u \leq a, \quad 0 \leq v \leq 2\pi.$$

Suppose the helicoid is weighted with density $\mu(x, y, z) = \sqrt{x^2 + y^2}$ at $(x, y, z)$. In other words, the density at a point of the helicoid equals its distance from the central axis of the surface, which is just $u$ in terms of the parameter pair $(u, v)$. To find the total mass distributed this way, we compute

$$g_u(u, v) \times g_v(u, v) = (\cos v, \sin v, 0) \times (-u\sin v, u\cos v, 1) = (\sin v, -\cos v, u).$$

The weighted area differential is thus

$$\mu\,da = u\,|g_u(u, v) \times g_v(u, v)|\,du\,dv = u\,|(\sin v, -\cos v, u)|\,du\,dv = u\sqrt{1 + u^2}\;du\,dv.$$

The total mass is

$$M_a = \int_0^{2\pi} dv \int_0^a u\sqrt{1 + u^2}\;du = 2\pi\left[\tfrac{1}{3}(1 + u^2)^{3/2}\right]_0^a = \frac{2\pi}{3}\left[(1 + a^2)^{3/2} - 1\right].$$

[FIGURE 9.15]
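The mass integral of Formula 3.2 for the helicoid can be approximated directly from the parametrization. This is a sketch only: the midpoint rule, the grid size `n`, the width `a = 2`, and the function names are assumptions chosen for illustration, while the partial derivatives are the ones that follow from the parametrization in the example.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def helicoid_mass(a, n=200):
    """Midpoint-rule approximation of the integral of mu |g_u x g_v| du dv
    for g(u, v) = (u cos v, u sin v, v), 0 <= u <= a, 0 <= v <= 2*pi,
    with density mu equal to the distance from the z-axis."""
    du, dv = a / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        for j in range(n):
            v = (j + 0.5) * dv
            g_u = (math.cos(v), math.sin(v), 0.0)
            g_v = (-u * math.sin(v), u * math.cos(v), 1.0)
            normal = cross(g_u, g_v)  # equals (sin v, -cos v, u)
            density = math.hypot(u * math.cos(v), u * math.sin(v))
            total += density * math.sqrt(sum(c * c for c in normal)) * du * dv
    return total

a = 2.0
numeric = helicoid_mass(a)
exact = 2 * math.pi / 3 * ((1 + a * a) ** 1.5 - 1)
print(numeric, exact)
```

The numerical value should match the closed form $(2\pi/3)[(1 + a^2)^{3/2} - 1]$ to within the quadrature error of the grid.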

3C Integrating Vector Fields


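The main purpose of this section is the definition of the integral of a continuous vector field $\mathbb{R}^3 \xrightarrow{\mathbf{F}} \mathbb{R}^3$ over a surface $S$. Continuing with the assumption that $S$ is a piece of smooth surface represented by the function $g$ of Equation (1), we compare the standard normal vector $\partial g/\partial u \times \partial g/\partial v$ with the vector field $\mathbf{F}$ at a point $g(u, v)$ of $S$. These are shown in Figure 9.15 at one point. If $\mathbf{n}$ is a unit normal to $S$ at $g(u, v)$, then the dot product $\mathbf{F}\cdot\mathbf{n}$ at $g(u, v)$ is the coordinate of $\mathbf{F}$ in the direction of $\mathbf{n}$. But since

$$\mathbf{n} = \frac{\dfrac{\partial g}{\partial u} \times \dfrac{\partial g}{\partial v}}{\left|\dfrac{\partial g}{\partial u} \times \dfrac{\partial g}{\partial v}\right|},$$

it follows that

$$\mathbf{F}(g(u, v)) \cdot \left(\frac{\partial g}{\partial u}(u, v) \times \frac{\partial g}{\partial v}(u, v)\right)$$

is equal to the coordinate of $\mathbf{F}(g(u, v))$ in the direction of $\mathbf{n}$, multiplied by the area of the tangent parallelogram spanned by $\partial g/\partial u$ and $\partial g/\partial v$ at $g(u, v)$. We define the surface integral of $\mathbf{F}$ over $S$ by

3.4   $$\int_D \mathbf{F}(g(u, v)) \cdot \left(\frac{\partial g}{\partial u}(u, v) \times \frac{\partial g}{\partial v}(u, v)\right) du\,dv,$$

and denote it by $\int_S \mathbf{F}\cdot d\mathbf{S}$ or $\int_S \mathbf{F}\cdot\mathbf{n}\,da$.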

Suppose that a continuous vector field $\mathbb{R}^3 \xrightarrow{\mathbf{F}} \mathbb{R}^3$ describes the speed and direction of a fluid flow at each point of a region $R$ in which it's defined. We'll define, using a surface integral, the flux, or rate of flow across a piece of smooth surface $S$, lying in the region $R$. If $S$ is perfectly flat and $\mathbf{F}$ is a constant field, then the flux is equal to $F_n\,a(S)$, where $F_n$ is the coordinate $\mathbf{F}\cdot\mathbf{n}$ of $\mathbf{F}$ in the direction of a unit normal $\mathbf{n}$ to $S$. Thus, for a flat $S$, the flux is equal to the volume of the tube of fluid illustrated in Figure 9.16, which shows the amount of fluid passing through its base in one time unit. Because $F_n = \mathbf{F}\cdot\mathbf{n}$, we define, for a flat $S$ with area $a(S)$, the flux to be the rate of flow of $\mathbf{F}$ across $S$ given by the formula

$$\Phi(\mathbf{F}, S) = \mathbf{F}\cdot\mathbf{n}\;a(S).$$

[FIGURE 9.16]

If $S$ is a piece of smooth surface in a region $R$, we partition $S$ along coordinate curves of the form $u = \text{constant}$ and $v = \text{constant}$ and assume that, within each part of $S$ so formed, the field $\mathbf{F}$ is constant. Approximating $S$ by tangent parallelograms $S_k$ having for adjacent edges the vectors $\Delta u\,g_u(\mathbf{u}_k)$ and $\Delta v\,g_v(\mathbf{u}_k)$ gives $S_k$ the area

$$a(S_k) = |g_u(\mathbf{u}_k) \times g_v(\mathbf{u}_k)|\,\Delta u\,\Delta v.$$

See Figure 9.17. The approximate flux across a typical subdivision $S_k$ of $S$ will have the form

$$\Phi_k = \mathbf{F}(g(\mathbf{u}_k)) \cdot \mathbf{n}_k\;a(S_k) = \mathbf{F}(g(\mathbf{u}_k)) \cdot \left(\frac{\partial g}{\partial u}(\mathbf{u}_k) \times \frac{\partial g}{\partial v}(\mathbf{u}_k)\right) \Delta u\,\Delta v,$$

since the length $|g_u(\mathbf{u}_k) \times g_v(\mathbf{u}_k)|$ cancels from the area $a(S_k)$ and the denominator of $\mathbf{n}_k$. The sum

$$\sum_{k=1}^N \Phi_k = \sum_{k=1}^N \mathbf{F}(g(\mathbf{u}_k)) \cdot \left(\frac{\partial g}{\partial u}(\mathbf{u}_k) \times \frac{\partial g}{\partial v}(\mathbf{u}_k)\right) \Delta u\,\Delta v$$

[FIGURE 9.17]

becomes a better approximation to what we would like to call the flux of $\mathbf{F}$ across $S$ as the subdivision of $S$ is refined by making the corresponding grid $G$ finer in the parameter domain $D$. On the other hand, if $\mathbf{F}$ is continuous on $S$ and $g$ is continuously differentiable on $D$, then

$$\lim_{m(G) \to 0} \sum_{k=1}^N \Phi_k = \int_D \mathbf{F}(g(\mathbf{u})) \cdot \left(\frac{\partial g}{\partial u}(\mathbf{u}) \times \frac{\partial g}{\partial v}(\mathbf{u})\right) du\,dv = \int_S \mathbf{F}\cdot d\mathbf{S},$$

which is the previously defined integral of $\mathbf{F}$ over $S$. Consequently, we define the flux of $\mathbf{F}$ across $S$ to be the rate of flow given by

$$\Phi(\mathbf{F}, S) = \int_S \mathbf{F}\cdot d\mathbf{S} = \int_S \mathbf{F}\cdot\mathbf{n}\,da.$$

We remark that the sign of $\Phi$ would change if $S$ were reparametrized so that the unit normal vector $\mathbf{n}$ determined by the parametrization pointed in the opposite direction.

Suppose the vector field $\mathbf{F}(\mathbf{x}) = (F_1(\mathbf{x}), F_2(\mathbf{x}), F_3(\mathbf{x}))$ is tangent to the surface $S$ at every point $\mathbf{x}$ of $S$. Then $\mathbf{F}\cdot\mathbf{n} = 0$ at every point of $S$, so the surface integral $\int_S \mathbf{F}\cdot\mathbf{n}\,da$ is zero. At the other extreme, if the continuous vector field $\mathbf{F}(\mathbf{x})$ is perpendicular to $S$ at every point $\mathbf{x}$ of $S$ we expect that the integral of $\mathbf{F}$ over $S$ will be different from zero. For example, if $\mathbf{F}$ coincides with the standard unit normal vector $\mathbf{n}$ at each point of $S$, then $\int_S \mathbf{F}\cdot\mathbf{n}\,da = \int_S \mathbf{n}\cdot\mathbf{n}\,da = \int_S da$, which is just the area of $S$.
The motivation for the definition of flux given previously is stated in terms of the velocity field of a fluid flow, because that is the physical setting for flux measurements across surfaces that we most easily visualize. However, some of the most important applications of surface integrals concern the flux of the more abstract fields: gravitational, electric, and magnetic. The next example is fundamental for these areas.

A body of mass $M$ concentrated at the origin in $\mathbb{R}^3$ generates a gravitational field at $\mathbf{x}$ that attracts a body of mass $m$ toward the origin with force

$$\mathbf{F}(\mathbf{x}) = -\frac{GMm}{|\mathbf{x}|^3}\,\mathbf{x},$$

where $G$ is the universal gravitational constant. Note that the magnitude of the field at $\mathbf{x}$ is $|\mathbf{F}(\mathbf{x})| = GMm|\mathbf{x}|^{-2}$; in other words, this is an inverse-square law of attraction. To compute the flux of this field across a sphere $S_a$ of radius $a$ centered at the origin, we make the simplifying observation that if $\mathbf{x}$ is on $S_a$, then we have $\mathbf{F}(\mathbf{x}) = -GMma^{-2}\mathbf{n}$, where $\mathbf{n}$ is an outward-pointing unit vector directed from $\mathbf{x}$ away from the origin. Since $\mathbf{n}\cdot\mathbf{n} = 1$, the flux of the field is

$$\Phi = \int_{S_a} \mathbf{F}(\mathbf{x})\cdot\mathbf{n}\,da = -\frac{GMm}{a^2} \int_{S_a} \mathbf{n}\cdot\mathbf{n}\,da = -\frac{GMm}{a^2} \int_{S_a} da.$$

This last integral is the area of the sphere $S_a$, namely $4\pi a^2$, so the flux is $\Phi = -4\pi GMm$. The most significant feature of this result is that $\Phi$ is independent of $a$, the radius of the sphere. We'll use Gauss's Theorem in the next section to show that the same phenomenon holds for closed surfaces other than spheres.
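The independence of the flux from the radius can be observed numerically by applying Formula 3.4 to a parametrized sphere. The sketch below makes several assumptions for illustration: it uses the spherical parametrization that appears in Exercise 6 of this section, it sets the constant $GMm$ to a placeholder value `K = 1`, and it takes the cross product in the order $g_v \times g_u$, because for this parametrization that is the order that points outward.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def sphere_flux(a, K=1.0, n=200):
    """Midpoint-rule approximation of the flux of F(x) = -K x / |x|^3
    outward across the sphere of radius a, parametrized by
    g(u, v) = (a cos u sin v, a sin u sin v, a cos v)."""
    du, dv = 2 * math.pi / n, math.pi / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        for j in range(n):
            v = (j + 0.5) * dv
            x = (a * math.cos(u) * math.sin(v),
                 a * math.sin(u) * math.sin(v),
                 a * math.cos(v))
            g_u = (-a * math.sin(u) * math.sin(v), a * math.cos(u) * math.sin(v), 0.0)
            g_v = (a * math.cos(u) * math.cos(v), a * math.sin(u) * math.cos(v),
                   -a * math.sin(v))
            normal = cross(g_v, g_u)  # outward for this parametrization
            r3 = sum(c * c for c in x) ** 1.5
            field = tuple(-K * c / r3 for c in x)
            total += sum(fc * nc for fc, nc in zip(field, normal)) * du * dv
    return total

for radius in (0.5, 1.0, 3.0):
    print(radius, sphere_flux(radius), -4 * math.pi)
```

Each printed flux should be close to $-4\pi K$, whatever the radius, in line with the computation in the text.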

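The coordinates of the standard normal vector to a surface parametrized by $\mathbf{x}(u, v) = g(u, v)$ can be written in terms of 2-by-2 Jacobian determinants as follows. Let $\mathbf{x} = (x, y, z)$. Then the coordinates of the cross-product $\partial g/\partial u \times \partial g/\partial v$ have the form

$$\frac{\partial(y, z)}{\partial(u, v)} = \begin{vmatrix} y_u & y_v \\ z_u & z_v \end{vmatrix}, \qquad \frac{\partial(z, x)}{\partial(u, v)} = \begin{vmatrix} z_u & z_v \\ x_u & x_v \end{vmatrix}, \qquad \frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} x_u & x_v \\ y_u & y_v \end{vmatrix}.$$

Thus we can write general surface integrals of a vector field $\mathbf{F} = (F_1, F_2, F_3)$ in either of the successively more abbreviated forms

$$\int_S \mathbf{F}\cdot d\mathbf{S} = \int_D \left(F_1\frac{\partial(y, z)}{\partial(u, v)} + F_2\frac{\partial(z, x)}{\partial(u, v)} + F_3\frac{\partial(x, y)}{\partial(u, v)}\right) du\,dv = \int_S (F_1\,dy\,dz + F_2\,dz\,dx + F_3\,dx\,dy).$$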

The last abbreviation is analogous to our abbreviation for a line integral:

3D Orientation
In computing a line integral over a piecewise smooth curve, it's customary to orient the smooth pieces of the curve coherently so that the terminal point of one piece is the same as the initial point of the one that follows it. To integrate a vector field over a piecewise smooth surface, we need a notion of orientation for pieces of smooth surfaces $S$. If $\mathbb{R}^2 \xrightarrow{g} \mathbb{R}^3$ represents $S$ parametrically with $g$ defined on $D$, then Figure 9.13 shows how $D$ and $S$ might possibly be related. The edge of $S$, corresponding under $g$ to the boundary of $D$, we'll call the border of $S$. As a point $\mathbf{u}$ moves around the piecewise smooth boundary of $D$ in the counterclockwise direction, its image $g(\mathbf{u})$ traces the border of $S$ with what we'll call its positive orientation. It will be convenient later to use the notation $\partial S$ to denote the positively oriented border of $S$.
An alternative way to describe the positive orientation is as follows. Define the positive side of $S$ by saying that "positive" is the side of $S$ out from which the normal vector $\partial g/\partial u \times \partial g/\partial v$ points. If you then walk on the positive side of $S$ keeping $S$ on your left as you follow the border around, you are going in its positive direction. See Figure 9.18(b) for a picture. The equivalence of these two notions

[FIGURE 9.18: panels (a), (b), and (c).]

of positivity can be stated and proved as a formal theorem, but we won't attempt that here.
A piecewise smooth surface is defined to be a finite union of pieces of smooth surface that are joined along common border curves. Figure 9.18 shows some examples. The border curve of each piece of surface has a positive orientation that comes from some parametrization of that piece. The parametrizations of two adjacent pieces are coherent if they give opposite orientations to common border curves, as in parts (a) and (b) of Figure 9.18. A piecewise smooth surface is said to be orientable if its adjacent pieces can be parametrized coherently. The border orientation of a single piece can always be reversed to accommodate a neighbor by interchanging the roles of its two parameters, for example replacing $(u, v)$ by $(v, u)$ throughout. Parts (a) and (b) of Figure 9.18 show orientable surfaces. However, part (c) of Figure 9.18 shows two rectangular strips joined together, one of them with a twist. This surface is not orientable, because no matter how the orientations of the pieces are changed there will be some part of the common border traced in the same direction. The resulting surface is called a Möbius strip.
We define the integral of a continuous vector field over a piecewise smooth surface to be the sum of the integrals over each of its smooth pieces. Thus if $S = S_1 \cup S_2$,

$$\int_S \mathbf{F}\cdot d\mathbf{S} = \int_{S_1} \mathbf{F}\cdot d\mathbf{S} + \int_{S_2} \mathbf{F}\cdot d\mathbf{S}.$$

This definition holds even if $S$ is not orientable, but in practice it's of little interest to integrate a vector field over a nonorientable surface. On the other hand, we can compute the integral of a real-valued function over a surface without regard to orientation. The reason is that in Formulas 3.1 and 3.2 the differential of surface area,

$$da = \left|\frac{\partial g}{\partial u} \times \frac{\partial g}{\partial v}\right| du\,dv,$$

doesn't change when the orientation is reversed by interchanging the roles of $u$ and $v$. But in Formula 3.4, the vector surface differential,

$$d\mathbf{S} = \left(\frac{\partial g}{\partial u} \times \frac{\partial g}{\partial v}\right) du\,dv,$$

does change sign when $u$ and $v$ are interchanged. We observe that the surface differential $d\mathbf{S}$ can also be written in the form

$$d\mathbf{S} = \mathbf{n}\,da,$$

where $\mathbf{n}$ is a unit normal to the surface. Possible choices for this vector are considered in the following examples.

For a flat surface $S$ parallel to the $xy$-plane, there are two possible choices for the unit normal; $\mathbf{n}$ must be either $(0, 0, 1)$ or $(0, 0, -1)$. In Figure 9.19(a) either choice would in principle be appropriate for the rectangle $S_1$ in the $xy$-plane, but having chosen one, say $\mathbf{n}_1 = (0, 0, 1)$, there is only one possible choice of $\mathbf{n}$ for the rectangle $S_2$ in the $xz$-plane that will lead to a coherent orientation. Following the conventions illustrated in Figure 9.19(a), we choose $\mathbf{n}_2 = (0, 1, 0)$. (Note that with these choices the two normals point out from the same side of the two-piece surface.) To compute the total flux of the vector field $\mathbf{F}(x, y, z) = (y, z, x)$ over $S_1 \cup S_2$ with this orientation, we first note that $\mathbf{F}(x, y, z)\cdot\mathbf{n}_1 = (y, z, x)\cdot(0, 0, 1) = x$ on $S_1$. Also on $S_1$ we have $da = dx\,dy$. Hence

$$\int_{S_1} \mathbf{F}\cdot\mathbf{n}_1\,da = \int_{S_1} x\,dx\,dy = \int_0^1 dy \int_0^1 x\,dx = \frac{1}{2}.$$

On $S_2$, we have $\mathbf{F}(x, y, z)\cdot\mathbf{n}_2 = (y, z, x)\cdot(0, 1, 0) = z$, so

$$\int_{S_2} \mathbf{F}\cdot\mathbf{n}_2\,da = \int_{S_2} z\,dx\,dz = \int_0^1 dx \int_0^1 z\,dz = \frac{1}{2}.$$

The total flux across the oriented surface is then equal to $\tfrac{1}{2} + \tfrac{1}{2} = 1$.

[FIGURE 9.19: panels (a) and (b).]
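The two flat-piece integrals above can be reproduced with a few lines of code. This is a sketch under the assumption, read off from the limits of integration in the text, that $S_1$ is the unit square $0 \leq x \leq 1$, $0 \leq y \leq 1$ in the $xy$-plane and $S_2$ is the unit square $0 \leq x \leq 1$, $0 \leq z \leq 1$ in the $xz$-plane. The helper `flux_flat` is a hypothetical name, not notation from the text.

```python
def F(x, y, z):
    """The field F(x, y, z) = (y, z, x) of the example."""
    return (y, z, x)

def flux_flat(point, normal, n=100):
    """Midpoint-rule flux of F across a flat unit square.
    point(s, t) maps the parameters in [0, 1] x [0, 1] to the surface;
    normal is the constant unit normal, so the area element is ds dt."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        for j in range(n):
            t = (j + 0.5) * h
            Fx, Fy, Fz = F(*point(s, t))
            total += (Fx * normal[0] + Fy * normal[1] + Fz * normal[2]) * h * h
    return total

flux_S1 = flux_flat(lambda x, y: (x, y, 0.0), (0, 0, 1))  # square in the xy-plane
flux_S2 = flux_flat(lambda x, z: (x, 0.0, z), (0, 1, 0))  # square in the xz-plane
print(flux_S1, flux_S2, flux_S1 + flux_S2)
```

Reversing either normal alone would flip the sign of that piece's contribution, which is why the coherent choice of $\mathbf{n}_1$ and $\mathbf{n}_2$ matters for the total.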

The orientation of a surface is sometimes most naturally determined so that all normal vectors are "outward-pointing" or "inward-pointing." This way of describing coherent orientation is particularly appropriate for a closed surface that comprises the boundary of a solid 3-dimensional region, as in the next example.

A vector field is defined in $\mathbb{R}^3$ by $\mathbf{F}(x, y, z) = (x, y, z)$. Let $C$ denote the part of the cone $z = \sqrt{x^2 + y^2}$ between $z = 0$ and $z = b$. We close the top of the cone with a flat disk $D$ of radius $b$ in the plane $z = b$. See Figure 9.19(b). If we choose $\mathbf{n}$ to be an outward-pointing unit normal at each point of $C$ or $D$, this will be consistent with Figure 9.19(b). To compute the flux of $\mathbf{F}$ across $C$, note that on $C$ the arrows representing $\mathbf{F}$ lie along $C$, and so they are perpendicular to $\mathbf{n}$. Thus $\mathbf{F}\cdot\mathbf{n} = 0$ on $C$, so the flux of $\mathbf{F}$ across $C$ is zero. At a point $(x, y, b)$ of $D$ we have $\mathbf{F}(x, y, b)\cdot\mathbf{n} = (x, y, b)\cdot(0, 0, 1) = b$. Hence the flux of $\mathbf{F}$ out across $D$ is

$$\int_D \mathbf{F}\cdot\mathbf{n}\,da = \int_D b\,dx\,dy = \pi b^3,$$

which is then the total flux across $C \cup D$.

Suppose the vector field in the previous example is replaced by the field $\mathbf{G}(x, y, z) = (x, y, 0)$, but we retain the closed surface $C \cup D$ shown in Figure 9.19(b). This time the normal vector $\mathbf{n}$ to $D$ is perpendicular to the field, so $\int_D \mathbf{G}\cdot\mathbf{n}\,da = 0$.

The conical surface is parametrized either by $(x, y, z) = (v\cos u, v\sin u, v)$ or else by $g(x, y) = (x, y, \sqrt{x^2 + y^2})$ for $(x, y)$ in the disk $R_b$ defined by $x^2 + y^2 \leq b^2$. We choose the latter, in which case

$$g_x \times g_y = \left(-x/\sqrt{x^2 + y^2},\ -y/\sqrt{x^2 + y^2},\ 1\right) \quad\text{and}\quad |g_x \times g_y| = \sqrt{2}.$$

The area element on $C$ is $da = |g_x \times g_y|\,dx\,dy = \sqrt{2}\,dx\,dy$. Since the third coordinate of $g_x \times g_y$ is positive, this vector points in, so we must change sign to get an outward-pointing normal. However, we'll compute $\mathbf{G}\cdot\mathbf{n}$ just by observing that $\mathbf{G}$ is parallel to the $xy$-plane and $\mathbf{n}$ is perpendicular to the lines on $C$. Thus the angle between $\mathbf{G}$ and $\mathbf{n}$ is $\pi/4$ at every point of $C$, and

$$\mathbf{G}\cdot\mathbf{n} = |\mathbf{G}|\,|\mathbf{n}|\cos(\pi/4) = \frac{\sqrt{x^2 + y^2}}{\sqrt{2}}.$$

It follows, changing the integral to polar coordinates, that

$$\int_C \mathbf{G}\cdot\mathbf{n}\,da = \int_{R_b} \frac{\sqrt{x^2 + y^2}}{\sqrt{2}}\,\sqrt{2}\;dx\,dy = \int_0^{2\pi} d\theta \int_0^b r^2\,dr = \frac{2}{3}\pi b^3.$$

Thus the total flux of the field $\mathbf{G}$ out across $C \cup D$ is $\tfrac{2}{3}\pi b^3$.
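Both cone computations, for $\mathbf{F}(x, y, z) = (x, y, z)$ and for $\mathbf{G}(x, y, z) = (x, y, 0)$, can be checked against Formula 3.4 using the first parametrization mentioned above, $(x, y, z) = (v\cos u, v\sin u, v)$. The sketch below is an illustration under assumptions: the midpoint rule, the grid size, and the value `b = 1.5` are arbitrary, and the code relies on the fact that for this parametrization $g_u \times g_v = (v\cos u, v\sin u, -v)$, whose negative third coordinate makes it point out of the solid cone.

```python
import math

def cone_flux(field, b, n=200):
    """Midpoint-rule approximation of the outward flux of field across the
    cone z = sqrt(x^2 + y^2), 0 <= z <= b, parametrized by
    g(u, v) = (v cos u, v sin u, v), 0 <= u <= 2*pi, 0 <= v <= b."""
    du, dv = 2 * math.pi / n, b / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        for j in range(n):
            v = (j + 0.5) * dv
            x, y, z = v * math.cos(u), v * math.sin(u), v
            # g_u x g_v = (v cos u, v sin u, -v): outward for the solid cone.
            normal = (x, y, -v)
            Fx, Fy, Fz = field(x, y, z)
            total += (Fx * normal[0] + Fy * normal[1] + Fz * normal[2]) * du * dv
    return total

def F(x, y, z):
    return (x, y, z)

def G(x, y, z):
    return (x, y, 0.0)

b = 1.5
print(cone_flux(F, b))                             # F is tangent to the cone
print(cone_flux(G, b), 2 * math.pi * b ** 3 / 3)   # compare with (2/3) pi b^3
```

The first value should be essentially zero and the second pair should agree closely, matching the two results in the text for the conical piece $C$.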

EXERCISES

1. (a) Sketch the plane triangle T in JR 3 parametrized by (b) Find the flux across P of the vector field
g(u, v) = (2u + v, v, 3u + v) for F(x,y, z) =-xi+ yj + zk.
0 ~ U, 0 ~ V, U + V ~ 1. 6. Use the parametrization
(b) Find the area of T.
a cos u sin v )
2. (a) Sketch the plane elliptical region E in the part of g(u,v)= asinusinv, 0 ~ u ~ 2rr,
the plane z = 4 - x - 2y that lies above the disk ( acosv
x 2 + y2 ~ I in the xy-plane.
(b) Find the area of£. for a sphere of radius a to show that the area of the sphere
is 4rra 2 •
3. (a) Sketch the part of the graph of the hyperbolic
paraboloid z = y 2 - x 2 that lies above the disk 7. A repelling electric field E(x) = /x/- 3 x has flux <I> across
x 2 + y 2 ~ l in the xy-plane. the sphere of radius a centered at the origin. Find <I>.
(b) Find the area of the part of the graph described in 8. (a) Find the area of the spiral ramp represented para-
part (a). metrically by

4. Let a and b be vectors in JR 3 • Let P be the part of a


plane parametrized by x(u, v) = ua + vb for parameter
variables (u, v) in a region R with area A(R). Show that
g(u,u)~(::!::). O~u~l, O~u~k

the area of P is /a x b/A(R).


(b) Let the surface of part (a) have a density per unit of
5. Let P be the parl of the graph of z = x2 + y lying area at each point equal to the distance of that point
vertically above the square O :5: x ~ 1, 0 ~ y ~ I in the from the central axis of the surface. Find the total
xy-plane. mass of the weighted surface.
=
(a) Assume that P is weighted by density µ(x, y, z)
x. Find the total mass of the weighted surface. 9. Compute Is F • dS, where

(a) F(x, y, z) = (x, y, z) and Sis given by (b) Find the area of the graph of J(x, y) = x2 + y for
0::::x::::1,0::::y::::l.
/11-1)) 0:::: u:::: ), Show that if JR 3 ~ IR is continuously differentiable
g(11,v)=
( u+v ,
LIV
0 :'.':: V :'.:: 2.
15. (a)
and implicitly determines a piece of smooth surface
Son which 8G /az =/; 0, and which lies over a region
(b) F(x, y. z) = (x 2 , 0, 0) and Sis given by D of the xy-plane, then

g(u,v)=
(
1/COSIJ)
us~nv,
0:::: u :::: l,
0:::: v:::: 2T(. a(S) = l (~~r (~~r (~~r
+ +
10. Find the total mass of a spherical film having density at
aa1-
i}z
1
each point equal to the linear distance of the point from
a single fixed point on the sphere.
X
l dxdy.
*11. Let x = g(u, v), for (u, v) in D, and x = h(s, t), for (s, t) Assume that just one point of S lies over each point
in B, be parametrizations for the same piece of smooth of D.
surface S in R 3 . If there is a one-to-one transformation (b) Compute the surface area of the hemisphere
T, continuously differentiable both ways between D and B, such that the Jacobian determinant of T is positive, and such that $g(u, v) = h(T(u, v))$ for $(u, v)$ in D, then g and h are called equivalent parametrizations of S.
(a) Show that equivalent parametrizations assign the same surface area to S. (Hint: Use the change-of-variable theorem.)
(b) Show that equivalent parametrizations assign the same value to the surface integral of a vector field over S.

12. Let the temperature at a point $(x, y, z)$ of a region R be given by a continuously differentiable function $T(x, y, z)$. Then the vector field $\nabla T$ is called the temperature gradient, and under some reasonable assumptions about the region, $\nabla T(x, y, z)$ is proportional to the direction and rate of flow of heat per unit of area at $(x, y, z)$.
(a) If $T(x, y, z) = x^2 + y^2$ for $x^2 + y^2 \le 4$, find the total rate of flow of heat across the cylindrical surface $x^2 + y^2 = 1$, $0 \le z \le 1$.
(b) Give an example of a continuously differentiable vector field that cannot be a temperature gradient.

13. The Newtonian potential function $(x^2 + y^2 + z^2)^{-1/2}$ has as its gradient the attractive force field F of a charged particle at the origin acting on an oppositely charged particle at $(x, y, z)$. The flux of the field across a piece of smooth surface S is defined to be the surface integral of F over S. Show that the flux of F across a sphere of radius a centered at the origin is independent of a.

14. (a) If $\mathbb{R}^2 \xrightarrow{f} \mathbb{R}$ is continuously differentiable on a set D bounded by a piecewise smooth curve, show that the area of the graph of f is
$$\sigma(S) = \int_D \sqrt{1 + (f_x)^2 + (f_y)^2}\; dx\,dy.$$

[...] using part (a).

16. (a) Show that if a surface S is the graph of $z = f(x, y)$ for $(x, y)$ in D, then the surface integral of $F = (F_1, F_2, F_3)$ over S is
$$\int_D \left( -F_1 \frac{\partial f}{\partial x} - F_2 \frac{\partial f}{\partial y} + F_3 \right) dx\,dy.$$
(b) Use part (a) to compute the integral of $F(x, y, z) = (x, y, z)$ over the graph of $z = x^2 + y$ for $0 \le x \le 1$, $0 \le y \le 1$.

In Exercises 17 to 20, find a parametrization as a piecewise smooth orientable surface with outward-pointing normal for the given surface.

17. The cylindrical can with bottom and no top given by $x^2 + y^2 = 1$, $0 \le z \le 1$ and $x^2 + y^2 \le 1$, $z = 0$

18. The funnel given by $x^2 + y^2 - z^2 = 0$, $1 \le z \le 4$ and $x^2 + y^2 = 1$, $0 \le z \le 1$

19. The trough given by $y - z = 0$, $0 \le x \le 1$, $0 \le z \le 1$, and $y + z = 0$, $0 \le x \le 1$, $0 \le z \le 1$

20. The top half of the sphere of radius 1 centered at the origin in $\mathbb{R}^3$, together with the disk of radius 1 centered at the origin of the xy-plane in $\mathbb{R}^3$

21. Let F be the vector field in $\mathbb{R}^3$ given by $F(x, y, z) = (x, y, 2z - x - y)$. Find the integral of F over the oriented surface of Exercise 17.
22. Let F be a continuous fluid flow field and let M be a piecewise smooth Möbius strip lying in the domain of F. Is it possible to define the flux of F across M?

23. Parametrize the set of Exercise 17 so it is reoriented, with normals pointing out on the bottom and in on the sides. Compute the integral of $F(x, y, z) = (x, y, 2z - x - y)$ over this surface.

24. Prove that if F and G are continuous vector fields on a piece of smooth surface S, then
$$\int_S (aF + bG) \cdot dS = a \int_S F \cdot dS + b \int_S G \cdot dS,$$
where a and b are constants.

*25. (a) Let F be a continuous vector field on a piece of smooth surface S. Show that
$$\left| \int_S F \cdot dS \right| \le M \sigma(S),$$
where M is the maximum of $|F(\mathbf{x})|$ for $\mathbf{x}$ on S. [Hint: Write $\int F \cdot dS$ in the form $\int F \cdot \mathbf{n}\, d\sigma$ and use Theorem 3.5 in Chapter 7, Section 3.]
(b) Show that if the piece of S shrinks to a point $\mathbf{x}_0$ in such a way that $\sigma(S)$ tends to zero, then $\{1/\sigma(S)\} \int_S F \cdot dS$ tends to $F(\mathbf{x}_0) \cdot \mathbf{n}_0$, where $\mathbf{n}_0$ is a unit normal to S at $\mathbf{x}_0$.

26. Let f be a real-valued continuously differentiable function of one variable, nonnegative for $a \le x \le b$. The graph of f, rotated around the x-axis, generates a surface of revolution S in $\mathbb{R}^3$.
(a) Find a parametric representation of S in terms of f.
(b) Prove that $\sigma(S) = 2\pi \int_a^b f(x) \sqrt{1 + (f'(x))^2}\; dx$.

27. The solid angle determined by a solid cone C with vertex at the origin in $\mathbb{R}^3$ is defined to be the surface area of the intersection of C with the unit sphere $|\mathbf{x}| = 1$.
(a) Show that 2-dimensional reduction of this definition leads to the usual definition of the angle between two lines.
(b) Compute the solid angle determined by the cone $x^2 + y^2 \le 2z^2$, $0 \le z$.

28. Let $G(x, y, z) = y\mathbf{j} + z\mathbf{k}$ define a vector field in $\mathbb{R}^3$ and let $S_1 \cup S_2$ be the oriented two-piece surface described in Example 6 of the text. Compute the flux of G across $S_1 \cup S_2$ as oriented in the example.

29. Let $H(x, y, z) = (x, 2y, 3z)$ define a vector field in $\mathbb{R}^3$ and let $C \cup D$ be the closed surface described in Example 7 of the text. Compute the flux of H out across $C \cup D$.

30. Let $f(x, y, z)$ be a continuously differentiable function defined on a smooth surface S in $\mathbb{R}^3$. Suppose that every level surface of f is perpendicular to S wherever the two surfaces intersect. (This just means that their normal vectors are perpendicular at each point of intersection.) Prove that $\int_S \nabla f \cdot dS = 0$.

SECTION 4 GAUSS'S THEOREM


4A Statement and Examples
Gauss's Theorem is a fairly straightforward generalization of Green's Theorem from $\mathbb{R}^2$ to $\mathbb{R}^3$. Both of these theorems are generalizations of the Fundamental Theorem of Calculus, so we can expect them to play a fundamental role in relating multivariable integrals to multivariable derivatives. We begin with a region R in $\mathbb{R}^3$ having as boundary a piecewise smooth surface S. Each piece of S will be parametrized by a continuously differentiable function $\mathbb{R}^2 \xrightarrow{g} \mathbb{R}^3$ with the normal vector $\partial g/\partial u \times \partial g/\partial v$ pointing away from R at each point of S. The boundary surface S is then said to have positive orientation, and we denote the positively oriented boundary of R by $\partial R$. To state the theorem, we consider a vector field F, continuously differentiable on R and its boundary. We define the divergence of F to be the real-valued function div F defined on R by

$$\operatorname{div} F(\mathbf{x}) = \frac{\partial F_1}{\partial x}(\mathbf{x}) + \frac{\partial F_2}{\partial y}(\mathbf{x}) + \frac{\partial F_3}{\partial z}(\mathbf{x}),$$

where $F_1, F_2, F_3$ are the coordinate functions of F. If F is the velocity field of a fluid flow, we can interpret $\operatorname{div} F(\mathbf{x})$ as the rate at $\mathbf{x}$ of expansion or contraction of the fluid per volume unit. In particular, if $\operatorname{div} F(\mathbf{x}) > 0$ the fluid is expanding at $\mathbf{x}$ and if $\operatorname{div} F(\mathbf{x}) < 0$ the fluid is contracting at $\mathbf{x}$. This interpretation is justified in Section 4B. (See also Chapter 8, Section 4 and Chapter 12, Section 1D.)

EXAMPLE 1. Let $F(x, y, z) = (x^3, y^2, z)$. Then $\operatorname{div} F(x, y, z) = 3x^2 + 2y + 1$.
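The computation in Example 1 can be checked numerically. The following is a minimal Python sketch and not part of the text: the helper name `divergence`, the step size `h`, and the sample point are illustrative assumptions, and the partial derivatives are approximated by central differences rather than computed symbolically.

```python
def divergence(F, p, h=1e-5):
    """Approximate div F at the point p = (x, y, z) by central differences."""
    total = 0.0
    for i in range(3):
        plus = list(p)
        minus = list(p)
        plus[i] += h
        minus[i] -= h
        # i-th coordinate function differentiated with respect to the i-th variable
        total += (F(*plus)[i] - F(*minus)[i]) / (2 * h)
    return total


def F(x, y, z):
    # The field of Example 1
    return (x**3, y**2, z)


x, y, z = 1.0, 2.0, -0.5
approx = divergence(F, (x, y, z))
exact = 3 * x**2 + 2 * y + 1  # formula from Example 1
print(approx, exact)
```

The two printed numbers should agree to many decimal places; the agreement is a check on the formula, not a proof of it.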


The formula for Gauss's Theorem, or the divergence theorem, is

$$\int_R \operatorname{div} F \, dV = \int_{\partial R} F \cdot dS.$$

Gauss's Theorem is like Green's Theorem and the formula

$$\int_{\mathbf{a}}^{\mathbf{b}} \nabla f \cdot d\mathbf{x} = f(\mathbf{b}) - f(\mathbf{a}),$$

in that it relates an integral of some kind of derivative of a function to the behavior of that function on a boundary. In each case the orientation of the boundary is important. For example, if we apply Gauss's Theorem to the region R in $\mathbb{R}^3$ given by $1 \le |\mathbf{x}| \le 2$, then the oriented boundary, denoted $\partial R$, must be such that its normal vectors on the outer sphere point out from the sphere, and on the inner sphere point in from the sphere, as shown in Figure 9.20. We'll say in general that $\partial R$ is positively oriented with respect to R if the normal vectors given by the parametrization of $\partial R$ point away from R.
We'll prove Gauss's Theorem for the case in which R is a finite union of simple regions, where a simple region in $\mathbb{R}^3$ is one whose boundary is crossed by a line parallel to a coordinate axis at most twice. For example, the non-simple region between two spheres, shown in Figure 9.20, splits into a union of eight simple regions, one in each coordinate octant.
4.1 Gauss's Theorem. Let R be a finite union of simple regions in $\mathbb{R}^3$, having a positively oriented piecewise smooth boundary $\partial R$. If F is a continuously differentiable vector field on R and $\partial R$, then

$$\int_R \operatorname{div} F \, dV = \int_{\partial R} F \cdot dS.$$
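As a sanity check on the statement of Theorem 4.1, the sketch below compares both sides of Gauss's formula for the field $F(x, y, z) = (x^3, y^2, z)$ on the unit cube $0 \le x, y, z \le 1$. The choice of region, the midpoint-rule integration, and all function names are assumptions made for illustration; a direct hand computation gives 3 for both sides.

```python
def midpoints(n):
    """Midpoints of n equal subintervals of [0, 1]."""
    return [(i + 0.5) / n for i in range(n)]


def F(x, y, z):
    return (x**3, y**2, z)


def div_F(x, y, z):
    return 3 * x**2 + 2 * y + 1


def volume_integral(n=40):
    """Midpoint-rule approximation of the integral of div F over the unit cube."""
    pts = midpoints(n)
    cell = (1.0 / n) ** 3
    return sum(div_F(x, y, z) for x in pts for y in pts for z in pts) * cell


def outward_flux(n=40):
    """Midpoint-rule flux of F out of the cube, one pair of opposite faces at a time."""
    pts = midpoints(n)
    patch = (1.0 / n) ** 2
    total = 0.0
    for a in pts:
        for b in pts:
            total += F(1.0, a, b)[0] - F(0.0, a, b)[0]  # faces x = 1 and x = 0
            total += F(a, 1.0, b)[1] - F(a, 0.0, b)[1]  # faces y = 1 and y = 0
            total += F(a, b, 1.0)[2] - F(a, b, 0.0)[2]  # faces z = 1 and z = 0
    return total * patch


print(volume_integral(), outward_flux())
```

On the faces where a coordinate is 0 the outward normal points in the negative direction, which is why those contributions enter with a minus sign.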
FIGURE 9.20
FIGURE 9.21 (labels: y = s(x, z); axes x, y, z)

Proof. In terms of coordinate functions of F, Gauss's formula reads

$$\int_R \left( \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z} \right) dx\,dy\,dz = \int_{\partial R} F_1\,dy\,dz + F_2\,dz\,dx + F_3\,dx\,dy.$$

We assume first that R is a simple region and prove only the equation

$$\int_R \frac{\partial F_2}{\partial y}\,dx\,dy\,dz = \int_{\partial R} F_2\,dz\,dx,$$

the proofs for the terms containing $F_1$ and $F_3$ being similar. Addition of the resulting equations will then prove the theorem for simple regions. Because R is simple, $\partial R$ consists of the graphs of two functions, $s(x, z)$ and $r(x, z)$, perhaps together with pieces consisting of lines parallel to the y-axis as shown in Figure 9.21. Let

$$g(u, v) = \begin{pmatrix} g_1(u, v) \\ g_2(u, v) \\ g_3(u, v) \end{pmatrix}, \quad (u, v) \text{ in } D,$$

be a parametrization for $\partial R$ that orients it positively. Then by the definition of the surface integral,

$$\int_{\partial R} F_2\,dz\,dx = \int_D F_2(g_1, g_2, g_3)\,\frac{\partial(g_3, g_1)}{\partial(u, v)}\,du\,dv, \tag{1}$$

and, on the sections of $\partial R$ that are parallel to the y-axis, the normal vector to $\partial R$ is perpendicular to the y-axis. Hence $\partial(g_3, g_1)/\partial(u, v)$, the second coordinate of the normal, is equal to zero, thus eliminating the part of the integral that is not on the graph of r or s. We now apply the change-of-variable theorem to the two remaining parts of the integral in Equation (1). The appropriate transformations are

$$\begin{pmatrix} z \\ x \end{pmatrix} = \begin{pmatrix} g_3(u, v) \\ g_1(u, v) \end{pmatrix},$$

with $(u, v)$ in either $D_r$ or $D_s$, where $D_r$ and $D_s$ are the parts of D corresponding to the graphs of r and s. The Jacobian determinant $\partial(g_3, g_1)/\partial(u, v)$ is positive on the graph of r and negative on the graph of s, because it represents the $x_2$-coordinate (that is, the y-coordinate) of the outward normal. On $D_r$ we have $g_2(u, v) = r(x, z)$, whereas on $D_s$ we have $g_2(u, v) = s(x, z)$. Using these facts, we get from the change-of-variable theorem and Equation (1),

$$\int_{\partial R} F_2\,dz\,dx = \int_{R_2} F_2(x, s(x, z), z)(-1)\,dx\,dz + \int_{R_2} F_2(x, r(x, z), z)\,dx\,dz,$$

where $R_2$ is the plane region we get by projecting R onto the xz-plane. These last two integrals are not surface integrals, but rather 2-dimensional multiple integrals.

Then by the fundamental theorem of calculus,

$$\int_{\partial R} F_2\,dz\,dx = \int_{R_2} \left[ \int_{s(x,z)}^{r(x,z)} \frac{\partial F_2}{\partial y}(x, y, z)\,dy \right] dx\,dz = \int_R \frac{\partial F_2}{\partial y}\,dx\,dy\,dz.$$

Similar arguments involving $F_1$ and $F_3$ complete the proof for simple regions, since the addition of the three resulting equations gives

$$\int_{\partial R} F_1\,dy\,dz + F_2\,dz\,dx + F_3\,dx\,dy = \int_R \left( \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z} \right) dx\,dy\,dz.$$

This is the Gauss formula in coordinate form.
The extension of Gauss's Theorem to a finite union R of simple regions is essentially the same as the analogous extension of Green's Theorem. In the present case, when two simple regions have a common boundary surface, the respective outward normals will be negatives of one another. The corresponding surface integrals are then negatives of one another, and so cancel out. The remaining surface integrals add up to the integral over the surface $\partial R$. ■

EXAMPLE 2. Example 5 of Section 3 consists of showing that the flux of the gradient field F of the potential function

$$(x^2 + y^2 + z^2)^{-1/2}$$

across a sphere of radius a, centered at the origin, is independent of a. Using Gauss's Theorem we can prove something more general, and with a minimum of calculation. Let $S_1$ and $S_2$ be any two piecewise smooth closed surfaces, one contained in the other, both containing the origin, and bounding a region R between them; for example, R might be the region between two spheres, as shown in Figure 9.20. A routine calculation shows that the gradient is

$$F(x, y, z) = (x^2 + y^2 + z^2)^{-3/2}(-x\mathbf{i} - y\mathbf{j} - z\mathbf{k}),$$

and then that the divergence of this field is zero (i.e., $\operatorname{div} F = 0$ everywhere except at the origin). In particular, $\operatorname{div} F = 0$ throughout R. Applying Gauss's Theorem to R gives

$$\int_{\partial R} F \cdot dS = \int_R \operatorname{div} F\,dV = 0.$$

But $\partial R$ consists of $S_1$ with inward pointing normal and $S_2$ with outward pointing normal; so, with the understanding that $S_1$ stands for the inner surface with reversed normal, we get

$$\int_{\partial R} F \cdot dS = \int_{S_1} F \cdot dS + \int_{S_2} F \cdot dS = 0.$$

Thus the integrals over the outward-oriented surfaces are equal. To find the actual value, it's enough to compute it for one surface, say a sphere. The result is $-4\pi$, as shown in Example 5 of Section 3 with $GMm = 1$. This result is a special case of one version of Gauss's Law, which says that the gravitational flux out across a surface S containing a mass distribution of total mass M on R is $-4\pi M$.
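The independence of the flux from the enclosing surface can be illustrated numerically. The sketch below is an assumption-laden illustration, not the text's method: it restricts attention to spheres (possibly off-center), uses a midpoint rule in spherical angles, and the function name `flux_through_sphere` is hypothetical. For spheres enclosing the origin the value should be close to $-4\pi$, and for a sphere that does not enclose the origin it should be close to 0.

```python
import math


def flux_through_sphere(center, radius, n_phi=200, n_theta=400):
    """Outward flux of F = -x/|x|^3 across a sphere (the origin must not lie on it)."""
    cx, cy, cz = center
    dphi = math.pi / n_phi
    dtheta = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_phi):
        phi = (i + 0.5) * dphi
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            # Outward unit normal at the sample point
            nx = math.sin(phi) * math.cos(theta)
            ny = math.sin(phi) * math.sin(theta)
            nz = math.cos(phi)
            x = cx + radius * nx
            y = cy + radius * ny
            z = cz + radius * nz
            r3 = (x * x + y * y + z * z) ** 1.5
            f_dot_n = -(x * nx + y * ny + z * nz) / r3
            # Area element on the sphere: radius^2 sin(phi) dphi dtheta
            total += f_dot_n * radius**2 * math.sin(phi) * dphi * dtheta
    return total


print(flux_through_sphere((0.0, 0.0, 0.0), 1.0))
print(flux_through_sphere((0.3, 0.0, 0.0), 2.5))
print(-4 * math.pi)
```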

Example 2 is a typical application of Gauss's Theorem. The same argument implies the following general statement.

Surface independence principle. If a vector field F satisfies $\operatorname{div} F = 0$ in a region R whose entire boundary consists of two surfaces $S_1$ and $S_2$, one with inward-oriented normal and the other with outward-oriented normal, then

$$\int_{S_1} F \cdot dS = \int_{S_2} F \cdot dS. \tag{4.2}$$

In other words, F has the same flux across the two surfaces $S_1$ and $S_2$.

Two examples of appropriately oriented pairs $S_1$, $S_2$ of surfaces are in Figures 9.22(a) and (b). We choose an orientation for each surface so we can apply Gauss's Theorem to the region bounded by $S_1$ and $S_2$. The principle is analogous to the Path Independence Principle for line integrals in the plane, one difference being that in the plane case the derivative condition was stated in terms of scalar curl as $\operatorname{curl} F = 0$ or $\partial F_2/\partial x_1 = \partial F_1/\partial x_2$. Stokes's Theorem in Section 5 will add further insight into these ideas.

EXAMPLE 3. It's easy to check that the field $F(x, y, z) = (xz, yz, -z^2)$ satisfies $\operatorname{div} F(\mathbf{x}) = z + z - 2z = 0$ everywhere. It follows from the Surface Independence Principle that if two surfaces $S_1$ and $S_2$ bound a region R, one with inward-oriented normal, the other with outward-oriented normal, then F has the same flux across the two surfaces. If one of the two surfaces, say $S_1$, is contained in the xy-plane, as in Figure 9.22(a), then the flux across $S_1$ is zero, because $F(x, y, z) = 0$ when $z = 0$. Hence the flux across $S_2$ is also zero.
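To make the conclusion concrete, the sketch below takes one hypothetical choice of surfaces: $S_1$ the unit disk in the xy-plane and $S_2$ the upper unit hemisphere. It approximates the flux of $F = (xz, yz, -z^2)$ across the hemisphere with a midpoint rule; the function name and the grid sizes are assumptions for illustration. On the unit sphere the outward normal at a point equals the point itself, which keeps the code short.

```python
import math


def hemisphere_flux(n_phi=200, n_theta=200):
    """Flux of F = (xz, yz, -z^2) across the unit hemisphere z >= 0, normal pointing up and out."""
    dphi = (math.pi / 2) / n_phi
    dtheta = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_phi):
        phi = (i + 0.5) * dphi
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            # Point on the unit sphere; it is also the outward unit normal there
            x = math.sin(phi) * math.cos(theta)
            y = math.sin(phi) * math.sin(theta)
            z = math.cos(phi)
            f_dot_n = (x * z) * x + (y * z) * y + (-z * z) * z
            total += f_dot_n * math.sin(phi) * dphi * dtheta
    return total


print(hemisphere_flux())  # should be very close to 0
```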

4B Interpretation of Divergence
FIGURE 9.22 (panels (a) and (b))

The divergence of a vector field F at a point $\mathbf{x}$ is a measure of the tendency of the field to radiate away from $\mathbf{x}$, hence the term divergence. To justify this interpretation, consider a solid ball $B_a$ of radius a centered at a point $\mathbf{x}_0$ in the interior of the set on which F is continuously differentiable. Apply Gauss's formula to F on $B_a$ and divide both sides of the equation by the volume $V(B_a)$ to get

$$\frac{1}{V(B_a)} \int_{B_a} \operatorname{div} F\,dV = \frac{1}{V(B_a)} \int_{\partial B_a} F \cdot \mathbf{n}\,d\sigma,$$

where $\mathbf{n}$ is the outward-pointing unit normal vector to the spherical boundary surface $S_a = \partial B_a$. The ratio on the left is the average value of div F in a neighborhood of $\mathbf{x}_0$, and so tends to $\operatorname{div} F(\mathbf{x}_0)$ as a tends to zero. (See Exercise 3 of Chapter 7, Section 3.) The integral on the right is the average flux, per unit of volume, of F directed out across $S_a$. Hence this average flux tends to the limit of the left side, namely $\operatorname{div} F(\mathbf{x}_0)$, as a tends to zero. The number $\operatorname{div} F(\mathbf{x}_0)$ is the expansion rate of F at $\mathbf{x}_0$. Gauss's Theorem itself is often called the divergence theorem because the theorem is a statement about div F. In particular, if $\operatorname{div} F(\mathbf{x}) > 0$ the flow generated by F is expanding near $\mathbf{x}$ and if $\operatorname{div} F(\mathbf{x}) < 0$ the flow generated by F is contracting near $\mathbf{x}$. (See also Chapter 8, Section 4 and Chapter 12, Section 1D.)

FIGURE 9.23
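The limit described above can be observed numerically. The sketch below is illustrative only: it uses the field $F = (x^3, y^2, z)$ of Example 1 in Section 4A, the base point $(1, 2, 3)$ where the divergence is $3 + 4 + 1 = 8$, and a midpoint rule on the sphere; the function name `flux_per_volume` and the radii are assumptions. As the radius shrinks, the flux per unit volume should approach 8.

```python
import math


def F(x, y, z):
    return (x**3, y**2, z)


def flux_per_volume(F, center, a, n_phi=200, n_theta=400):
    """Outward flux of F across the sphere of radius a about center, divided by the ball's volume."""
    cx, cy, cz = center
    dphi = math.pi / n_phi
    dtheta = 2 * math.pi / n_theta
    flux = 0.0
    for i in range(n_phi):
        phi = (i + 0.5) * dphi
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            # Outward unit normal on the sphere
            nx = math.sin(phi) * math.cos(theta)
            ny = math.sin(phi) * math.sin(theta)
            nz = math.cos(phi)
            fx, fy, fz = F(cx + a * nx, cy + a * ny, cz + a * nz)
            flux += (fx * nx + fy * ny + fz * nz) * a * a * math.sin(phi) * dphi * dtheta
    return flux / (4.0 / 3.0 * math.pi * a**3)


for a in (0.5, 0.1, 0.01):
    print(a, flux_per_volume(F, (1.0, 2.0, 3.0), a))
```

For this particular field the average of $3x^2 + 2y + 1$ over the ball works out to $8 + 3a^2/5$, so the printed values should decrease toward 8.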

EXAMPLE 4. Consider the vector field $F(x, y, z) = x^3\mathbf{i} + y^3\mathbf{j} + z^3\mathbf{k}$. A glance at the sketch of F in Figure 9.23 shows the field radiating away from the origin with increasing strength as distance from the origin increases. So it follows from the definition of flux that the average flux of F over a sphere centered at the origin will be positive. A little additional thought shows on similar grounds that the average flux of F out from a sphere that doesn't enclose the origin will also be positive. The reason is that the part of the flow more distant from the origin is both outward-directed and stronger than the parts nearer to the origin. Beyond these simple remarks, we can conclude from Gauss's Theorem that the total flux of F out across an arbitrary smooth surface bounding a solid 3-dimensional region R is positive and equal to

$$\int_R \operatorname{div} F(x, y, z)\,dx\,dy\,dz = \int_R 3(x^2 + y^2 + z^2)\,dx\,dy\,dz.$$
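For a concrete instance of Example 4, take R to be the unit ball, a choice made here only for illustration. The volume integral of $3(x^2 + y^2 + z^2)$ over the unit ball is $12\pi/5$ in spherical coordinates, and the sketch below compares that value with a midpoint-rule approximation of the flux of F across the unit sphere. The function names are hypothetical.

```python
import math


def sphere_flux(n_phi=200, n_theta=400):
    """Outward flux of F = (x^3, y^3, z^3) across the unit sphere (midpoint rule)."""
    dphi = math.pi / n_phi
    dtheta = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_phi):
        phi = (i + 0.5) * dphi
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            x = math.sin(phi) * math.cos(theta)
            y = math.sin(phi) * math.sin(theta)
            z = math.cos(phi)
            # On the unit sphere n = (x, y, z), so F . n = x^4 + y^4 + z^4
            total += (x**4 + y**4 + z**4) * math.sin(phi) * dphi * dtheta
    return total


def ball_integral(n_r=1000):
    """Integral of 3 r^2 over the unit ball, using shells of area 4 pi r^2 (midpoint rule in r)."""
    dr = 1.0 / n_r
    return sum(3 * r**2 * 4 * math.pi * r**2 * dr
               for r in ((k + 0.5) * dr for k in range(n_r)))


print(sphere_flux(), ball_integral(), 12 * math.pi / 5)
```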

In Chapter 8, Section 4 we introduced without proof the continuity equation for fluid flow, namely

$$\frac{\partial \rho}{\partial t}(\mathbf{x}) = -\operatorname{div} F(\mathbf{x}), \tag{4.3}$$

where F is the velocity field of a fluid flow and $\rho(\mathbf{x})$ is the density of the fluid at $\mathbf{x}$. Proving the continuity equation is a nice application of Gauss's Theorem, as follows.

EXAMPLE 5. Let $F(\mathbf{x})$ be the continuously differentiable velocity field of a fluid flow in 3-dimensional space, with continuously differentiable fluid density $\rho(\mathbf{x})$, and let B be an arbitrary region of finite volume in the domain of $F(\mathbf{x})$. We assume that no fluid is created or destroyed in B so any change in density is due to compression or expansion of the fluid. Then

$$\frac{d}{dt} \int_B \rho\,dV = -\int_{\partial B} F \cdot \mathbf{n}\,dS,$$

where $\mathbf{n}$ is the outward-pointing unit normal to the surface $\partial B$. The left side is the rate of change of mass with respect to t, which because of the absence of creation or destruction of fluid must be due to flux across $\partial B$, as measured by the right side. We need the minus sign because the integral by itself measures total outward flux, which would be positive if the left side were negative, and vice versa. Now apply the Leibniz rule on the left and Gauss's theorem on the right to get

$$\int_B \frac{\partial \rho}{\partial t}\,dV = -\int_B \operatorname{div} F\,dV, \quad \text{or} \quad \int_B \left( \frac{\partial \rho}{\partial t} + \operatorname{div} F \right) dV = 0.$$

Since B is arbitrary we can choose B to be a ball $B_r(\mathbf{x}_0)$ of radius r centered at an arbitrary point $\mathbf{x}_0$ in the domain of $F(\mathbf{x})$. Since the integrand in the right-hand integral is continuous it must be identically zero, otherwise a nonzero value for it would, for small enough positive r, give a nonzero value for the integral over $B_r$. This establishes Equation 4.3.

EXERCISES

In Exercises 1 to 4, compute the divergence of the vector field F.

1. $F(x, y, z) = (x^2, y^2, z^2)$
2. $F(x, y, z) = (\sin xy, 0, 0)$
3. $F(x, y, z) = (y, z, x)$
4. $F(x, y, z) = (xy, yz, zx)$

In Exercises 5 to 8, verify Gauss's Theorem for the vector field F and regions R in $\mathbb{R}^3$. Sketch R, together with a few outward-pointing normal vectors.

5. $F(x, y, z) = (x^2, y^2, z^2)$; R: $x^2 + y^2 \le 1$, $0 \le z \le 1$
6. $F(x, y, z) = (y, -x, 0)$; R: $x^2 + y^2 + z^2 \le 4$
7. $F(x, y, z) = (0, 0, z)$; R: $x^2 + y^2 \le 1$, $0 \le z \le 1$
8. $F(x, y, z) = (x, y, z)$; R: $0 \le x \le 1$, $0 \le y \le 1$, $0 \le z \le 1$

In Exercises 9 to 12, sketch the closed surface S, and compute $\int_S F \cdot dS$ over S by using Gauss's Theorem. Assume that the normal vectors to S point out.

9. $F(x, y, z) = (x, y, z)$; S: $x^2 + y^2 + z^2 = 4$
10. $F(x, y, z) = (x, x, x)$; S: cylindrical surface $x^2 + y^2 = 1$, $0 \le z \le 1$; bottom $x^2 + y^2 \le 1$, $z = 0$; top $x^2 + y^2 \le 1$, $z = 1$
11. $F(x, y, z) = (xz, -yz, xy)$; S: $x^2 + 2y^2 + 3z^2 = 1$
12. $F(x, y, z) = (x, y, z)$; S: bottom $x^2 + y^2 \le 1$, $z = 0$; top $z = 1 - x^2 - y^2$

In Exercises 13 and 14, prove the identity for a twice continuously differentiable vector field F or real-valued function f.

13. $\operatorname{div}(\operatorname{curl} F)(\mathbf{x}) = 0$
14. $\operatorname{curl}(\nabla f)(\mathbf{x}) = 0$

15. (a) Show that for $f(x, y, z) = (x^2 + y^2 + z^2)^{-1/2}$ the equation $\operatorname{div}(\nabla f)(\mathbf{x}) = 0$ holds for all $\mathbf{x} \ne 0$.
(b) Show by example that $\operatorname{div}(\nabla f)(\mathbf{x}) \ne 0$ may hold for some twice continuously differentiable function f.
(c) If the operator $\Delta$ is defined by $\Delta f = \operatorname{div}(\nabla f)$, find a formula for $\Delta f$ in terms of partial derivatives of f. A function such that $\Delta f(\mathbf{x}) = 0$ for all $\mathbf{x}$ in the domain of f is called a harmonic function, and $\Delta$ is called the Laplace operator.

16. The trace of a square matrix is defined as the sum of the elements on its main diagonal. If $\mathbb{R}^n \xrightarrow{F} \mathbb{R}^n$ is a differentiable vector field, we define div F to be the real-valued function given by
$$\operatorname{div} F(\mathbf{x}) = \operatorname{tr} F'(\mathbf{x}),$$
where tr A stands for the trace of A. Show that in the 2- and 3-dimensional cases this definition agrees with those previously given.

In Exercises 17 and 18, use Gauss's Theorem to compute $\int_S F \cdot dS$ over the sphere of radius 1 centered at the origin in $\mathbb{R}^3$ and with outward-pointing normal.

17. $F(x, y, z) = (x^2, y^2, z^2)$
18. $F(x, y, z) = (xz^2, 0, z^3)$

19. Show that for a region R to which Gauss's Theorem applies, the volume of R is given by
$$V(R) = \frac{1}{3} \int_{\partial R} x\,dy\,dz + y\,dz\,dx + z\,dx\,dy.$$

20. (a) Use Gauss's theorem to prove that if F is a continuously differentiable vector field with zero divergence in a region R, then the integral of F over $\partial R$ is zero.
(b) Write an intuitive argument, based on the interpretation of the divergence, for the assertion in part (a).

21. Let S be the ellipsoid $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} + \dfrac{z^2}{c^2} = 1$, and let $D(x, y, z)$ be the distance from the origin to the tangent plane to S at $(x, y, z)$.
(a) Let
$$F(x, y, z) = \left( \frac{x}{a^2}, \frac{y}{b^2}, \frac{z}{c^2} \right).$$
Show that $F \cdot \mathbf{n} = D^{-1}$, where $\mathbf{n}$ is the outward unit normal to S at $(x, y, z)$.
(b) Show that $\displaystyle\int_S D^{-1}\,d\sigma = \frac{4\pi}{3} \left( \frac{bc}{a} + \frac{ca}{b} + \frac{ab}{c} \right)$.

22. A vector field $\mathbb{R}^3 \xrightarrow{F} \mathbb{R}^3$ defined in a region R is called incompressible in R if $\operatorname{div} F(\mathbf{x}) = 0$ for all $\mathbf{x}$ in R. If F is continuously differentiable and incompressible in R, show that the flux of F is zero across every sufficiently small sphere with its interior in R.

23. Suppose that $u(x, y, z)$ is twice continuously differentiable in a region R and that u is a harmonic function in R, that is $u_{xx} + u_{yy} + u_{zz} = 0$ in R. Show that if the boundary of R consists of finitely many smooth surfaces, then the outward flux of the field $\nabla u$ across $\partial R$ is zero.

24. Use the surface independence principle, Equation 4.2, to compute the flux of the constant field $F(x, y, z) = (0, 0, 1)$ across the hemisphere $z = \sqrt{1 - x^2 - y^2}$, where $x^2 + y^2 \le 1$.

25. A field F for which $\operatorname{div} F(\mathbf{x}) = 0$ everywhere is called divergence free. Show that the flux of a divergence-free field across a smooth closed surface is zero.

26. Define the vector field $F(x, y, z) = (ax, by, cz)$, where a, b, and c are constants.
(a) Find the flux of F across a sphere of radius $\rho > 0$, oriented so that its normal vector points out from the sphere.
(b) Answer the question of part (a) with F defined instead by $F(x, y, z) = (yz, zx, xy)$.

27. Gauss's Law. The gravitational field generated by an integrable mass density $\mu$ defined on a region R is
$$F(\mathbf{x}) = G \int_R \frac{\mu(\mathbf{y})(\mathbf{y} - \mathbf{x})}{|\mathbf{y} - \mathbf{x}|^3}\,dV_{\mathbf{y}}.$$
(a) Show that the flux of this field across a smooth closed surface S with no points of R inside or on S is zero. Do this by interchanging the order of surface and volume integrals.
(b) Show that the flux of this field across a smooth closed surface S containing all points of R in its interior is $-4\pi G \int_R \mu\,dV$.

SECTION 5 STOKES'S THEOREM


5A Statement and Examples
An important extension of Green's theorem is as follows. Instead of considering a plane region D bounded by a curve, we can think of lifting such a region, together with its boundary curve, into a 2-dimensional surface S in $\mathbb{R}^3$. Then S will have as its border a space curve $\gamma$ corresponding to the boundary of D. The lifting is made precise by defining on D and its piecewise smooth boundary a function $\mathbb{R}^2 \xrightarrow{g} \mathbb{R}^3$ having S as the image of D. A typical picture is shown in Figure 9.24. The region D has its boundary oriented counterclockwise, and $\gamma$, the border curve of S, inherits what we'll call the positive orientation with respect to S. If we parametrize the boundary of D by $\mathbb{R} \xrightarrow{h} \mathbb{R}^2$, for $a \le t \le b$, then the composition $g(h(t))$ will describe the border of S. We'll denote the positively oriented border of S by $\partial S$. (The term border instead of boundary is used to avoid confusion with what we earlier called the boundary of S in Chapter 5, Section 1.)

FIGURE 9.24 (panels (a) and (b))
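We can now relate the line integral of a vector field F around $\partial S$ to the surface integral of an associated vector field over S. We assume that $\mathbb{R}^3 \xrightarrow{F} \mathbb{R}^3$ is a continuously differentiable vector field whose domain contains S. In Chapter 8, Section 4 we defined the vector field curl F by

$$\operatorname{curl} F(\mathbf{x}) = \left( \frac{\partial F_3}{\partial y}(\mathbf{x}) - \frac{\partial F_2}{\partial z}(\mathbf{x}),\ \frac{\partial F_1}{\partial z}(\mathbf{x}) - \frac{\partial F_3}{\partial x}(\mathbf{x}),\ \frac{\partial F_2}{\partial x}(\mathbf{x}) - \frac{\partial F_1}{\partial y}(\mathbf{x}) \right), \tag{5.1}$$

where $F_1$, $F_2$, and $F_3$ are the coordinate functions of F. If the domain of F is an open set, then the domain of curl F is the same set. As a memory aid, we express curl F as a formal cross-product of the gradient operator $(\partial/\partial x, \partial/\partial y, \partial/\partial z)$ and $F = (F_1, F_2, F_3)$, namely

$$\operatorname{curl} F = \det \begin{pmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \partial/\partial x & \partial/\partial y & \partial/\partial z \\ F_1 & F_2 & F_3 \end{pmatrix}.$$

EXAMPLE 1. If $F(x, y, z) = (y^2, z^2, x^2)$, then

$$\operatorname{curl} F = \det \begin{pmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \partial/\partial x & \partial/\partial y & \partial/\partial z \\ y^2 & z^2 & x^2 \end{pmatrix} = -2z\mathbf{i} - 2x\mathbf{j} - 2y\mathbf{k}.$$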
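Formula 5.1 translates directly into a small numerical routine. The sketch below is an illustration, not part of the text: the helper name `curl`, the step size `h`, and the sample point are assumptions, and each partial derivative is replaced by a central difference. Applied to the field of Example 1, the result should be close to $(-2z, -2x, -2y)$.

```python
def curl(F, p, h=1e-5):
    """Approximate curl F at p = (x, y, z) using central differences in Formula 5.1."""

    def partial(i, j):
        # Partial derivative of the i-th coordinate function with respect to the j-th variable
        plus = list(p)
        minus = list(p)
        plus[j] += h
        minus[j] -= h
        return (F(*plus)[i] - F(*minus)[i]) / (2 * h)

    return (
        partial(2, 1) - partial(1, 2),  # dF3/dy - dF2/dz
        partial(0, 2) - partial(2, 0),  # dF1/dz - dF3/dx
        partial(1, 0) - partial(0, 1),  # dF2/dx - dF1/dy
    )


def F(x, y, z):
    # The field of Example 1
    return (y**2, z**2, x**2)


x, y, z = 1.0, 2.0, 3.0
print(curl(F, (x, y, z)))
print((-2 * z, -2 * x, -2 * y))  # value predicted by Example 1
```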

The vector field curl F is a kind of derivative of the field F, and it plays a central role in another extension of the Fundamental Theorem of Calculus called Stokes's Theorem, to be proved later as Theorem 5.2:

$$\int_S \operatorname{curl} F \cdot dS = \int_{\partial S} F \cdot d\mathbf{x}. \tag{1}$$

The way in which the positively oriented border curve $\partial S$ in Equation (1) inherits its orientation from a parametrization of S is crucial to the validity of Equation (1). For example, an incorrect choice will produce the wrong sign on the right side. Even worse, failure to orient the two border curves coherently in Figure 9.24(b) can lead to a result having no significant relation to an integral over the surface.

If F were essentially a 2-dimensional vector field, with $F_3 = 0$ and $F_1$ and $F_2$ independent of z, then only the third coordinate of curl F, namely $\partial F_2/\partial x - \partial F_1/\partial y$, would be different from zero. In addition we could write $dS = dx\,dy$, so Stokes's Theorem would reduce to Green's Theorem of Section 1.

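EXAMPLE 2. Let S be the helicoid parametrized by

$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} u \cos v \\ u \sin v \\ v \end{pmatrix}, \quad \text{for } 0 \le u \le 1,\ 0 \le v \le \frac{\pi}{2}.$$

Then the border of S consists of three line segments and a spiral curve shown in Figure 9.25 together with the domain D of the parametrization. Restricting the parametrization of S to the boundary of D, traced counterclockwise, gives the following parametrizations of the smooth pieces of the border of S:

$$\begin{aligned}
\gamma_1 &: (x, y, z) = (0,\ 1 - t,\ \pi/2), & 0 \le t \le 1, \\
\gamma_2 &: (x, y, z) = (0,\ 0,\ \pi/2 - t), & 0 \le t \le \pi/2, \\
\gamma_3 &: (x, y, z) = (t,\ 0,\ 0), & 0 \le t \le 1, \\
\gamma_4 &: (x, y, z) = (\cos t,\ \sin t,\ t), & 0 \le t \le \pi/2.
\end{aligned}$$

Now let F be the vector field $F(x, y, z) = (z, x, y)$. The line integrals of F over the $\gamma_i$ are all of the form

$$\int_{\gamma_i} z\,dx + x\,dy + y\,dz.$$

FIGURE 9.25 (panel (a): the domain D in the uv-plane with sides labeled (1) to (4); panel (b): the helicoid S in xyz-space)

It's easy to see that the integrals over $\gamma_1$, $\gamma_2$, and $\gamma_3$ are all zero, whereas over $\gamma_4$ we get

$$\int_{\gamma_4} F \cdot d\mathbf{x} = \int_0^{\pi/2} (\cos^2 t + \sin t - t \sin t)\,dt = \frac{\pi}{4}.$$

On the other hand $\operatorname{curl} F(x_1, x_2, x_3) = (1, 1, 1)$; so the integral of curl F over S is

$$\int_S \operatorname{curl} F \cdot dS = \int_D \left( \frac{\partial(y, z)}{\partial(u, v)} + \frac{\partial(z, x)}{\partial(u, v)} + \frac{\partial(x, y)}{\partial(u, v)} \right) du\,dv = \int_0^1 du \int_0^{\pi/2} (\sin v - \cos v + u)\,dv = \frac{\pi}{4}.$$

This verifies Equation (1) for our special example.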
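Both sides of Equation (1) for the helicoid can also be approximated numerically. The sketch below is illustrative: the function names and grid sizes are assumptions, the line integral is taken only over the spiral piece (the only piece with a nonzero contribution), and the surface integrand is built from the cross product $\partial g/\partial u \times \partial g/\partial v$ with $\operatorname{curl} F = (1, 1, 1)$. Both values should be close to $\pi/4$.

```python
import math


def spiral_line_integral(n=2000):
    """Midpoint rule for the integral of z dx + x dy + y dz along (cos t, sin t, t), 0 <= t <= pi/2."""
    dt = (math.pi / 2) / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y, z = math.cos(t), math.sin(t), t
        dx, dy, dz = -math.sin(t), math.cos(t), 1.0
        total += (z * dx + x * dy + y * dz) * dt
    return total


def helicoid_surface_integral(n=400):
    """Midpoint rule for the integral of curl F . (g_u x g_v) over D, where curl F = (1, 1, 1)."""
    du = 1.0 / n
    dv = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        for j in range(n):
            v = (j + 0.5) * dv
            gu = (math.cos(v), math.sin(v), 0.0)
            gv = (-u * math.sin(v), u * math.cos(v), 1.0)
            normal = (
                gu[1] * gv[2] - gu[2] * gv[1],
                gu[2] * gv[0] - gu[0] * gv[2],
                gu[0] * gv[1] - gu[1] * gv[0],
            )
            total += (normal[0] + normal[1] + normal[2]) * du * dv
    return total


print(spiral_line_integral(), helicoid_surface_integral(), math.pi / 4)
```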

The proof that we give of Stokes's Theorem depends on an application of Green's Theorem to the region D on which the parametrization of S is defined. For this reason we need to assume enough about D to make Green's Theorem hold on it. Also, if $\mathbb{R}^2 \xrightarrow{g} \mathbb{R}^3$ is the parametrization of S, we'll want the second-order partial derivatives of g to be continuous, that is, g should be twice continuously differentiable on D. We can relax these conditions, but to do so makes the proof much more difficult.

5.2 Stokes's Theorem. Let S be a piece of smooth surface in $\mathbb{R}^3$, parametrized by a twice continuously differentiable function g. Assume that D, the parameter domain of g, is a finite union of simple regions bounded by a piecewise smooth curve. If F is a continuously differentiable vector field defined on S, then

$$\int_S \operatorname{curl} F \cdot dS = \int_{\partial S} F \cdot d\mathbf{x},$$

where $\partial S$ is the positively oriented border of S.
Proof. Let $F_1$, $F_2$, $F_3$ be coordinate functions of F. We'll prove that

$$\int_{\partial S} F_1\,dx = \int_S -\frac{\partial F_1}{\partial y}\,dx\,dy + \frac{\partial F_1}{\partial z}\,dz\,dx. \tag{2}$$

The proofs that

$$\int_{\partial S} F_2\,dy = \int_S -\frac{\partial F_2}{\partial z}\,dy\,dz + \frac{\partial F_2}{\partial x}\,dx\,dy$$

and

$$\int_{\partial S} F_3\,dz = \int_S -\frac{\partial F_3}{\partial x}\,dz\,dx + \frac{\partial F_3}{\partial y}\,dy\,dz$$

are similar, and addition of the three equations gives Stokes's formula. To prove Equation (2), suppose that $h(t) = (u(t), v(t))$ is a counterclockwise-oriented parametrization of $\delta$, the boundary of D. Then $g(h(t))$ is a piecewise smooth parametrization of the border of S, which by definition is then positively oriented. Using $g_1$, $g_2$, $g_3$ for the coordinate functions of g, we can write the differential dx as

$$dx = \frac{d}{dt}\,g_1(u(t), v(t))\,dt.$$

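This substitution and the chain rule give

$$\begin{aligned}
\int_{\partial S} F_1\,dx &= \int_a^b F_1(g(u, v))\,\frac{d}{dt} g_1(u, v)\,dt \\
&= \int_a^b F_1(g(u, v)) \left[ \frac{\partial g_1}{\partial u}(u, v)\,\frac{du}{dt} + \frac{\partial g_1}{\partial v}(u, v)\,\frac{dv}{dt} \right] dt \\
&= \int_{\delta} F_1(g)\,\frac{\partial g_1}{\partial u}\,du + F_1(g)\,\frac{\partial g_1}{\partial v}\,dv.
\end{aligned}$$

This last integral is a line integral around the region D in $\mathbb{R}^2$, and we can apply Green's Theorem to it, getting

$$\int_{\partial S} F_1\,dx = \int_D \left[ \frac{\partial}{\partial u} \left( F_1(g)\,\frac{\partial g_1}{\partial v} \right) - \frac{\partial}{\partial v} \left( F_1(g)\,\frac{\partial g_1}{\partial u} \right) \right] du\,dv. \tag{3}$$

The assumption that g is twice continuously differentiable ensures that the integral over D will exist. The same assumption allows us to interchange the order of partial differentiation in a computation which shows that

$$\frac{\partial}{\partial u} \left( F_1(g)\,\frac{\partial g_1}{\partial v} \right) - \frac{\partial}{\partial v} \left( F_1(g)\,\frac{\partial g_1}{\partial u} \right) = -\frac{\partial F_1}{\partial y}\,\frac{\partial(g_1, g_2)}{\partial(u, v)} + \frac{\partial F_1}{\partial z}\,\frac{\partial(g_3, g_1)}{\partial(u, v)}. \tag{4}$$

Substitution of this identity into Equation (3) gives Equation (2), thus completing the proof. A suggestion for deriving Equation (4) is given in Exercise 17. ■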
5B Interpretation of Curl
Using Stokes's Theorem we can derive an interpretation for the vector field curl F that gives some information about F itself. Let $\mathbf{x}_0$ be a point of an open set on which F is continuously differentiable. Let $\mathbf{n}_0$ be an arbitrary unit vector pointing away from $\mathbf{x}_0$, and construct a disk $S_r$ of radius r centered at $\mathbf{x}_0$ and perpendicular to $\mathbf{n}_0$. This is shown in Figure 9.26. Applying Stokes's Theorem to F on the surface $S_r$ and its border $\gamma_r$ gives

$$\int_{\gamma_r} F \cdot d\mathbf{x} = \int_{S_r} \operatorname{curl} F \cdot dS.$$

The value of the line integral was defined more generally in Chapter 8, Section 1A to be the circulation of F around $\gamma_r$. For small r, the circulation around $\gamma_r$ is a measure of the tendency of the field near $\mathbf{x}_0$ to rotate around the axis determined by $\mathbf{n}_0$. On the other hand, the surface integral is, for small enough r, nearly equal to the dot product $\operatorname{curl} F(\mathbf{x}_0) \cdot \mathbf{n}_0$ multiplied by the area of $S_r$. See Exercise 22 of Section 3.

FIGURE 9.26

It follows that the circulation around $\gamma_r$ tends to be larger if $\mathbf{n}_0$ points in the same direction as $\operatorname{curl} F(\mathbf{x}_0)$. Thus we can think of $\operatorname{curl} F(\mathbf{x}_0)$ as determining the axis about which the circulation of F is greatest near $\mathbf{x}_0$. Similarly, $|\operatorname{curl} F(\mathbf{x}_0)|$ measures the magnitude of the circulation around this axis near $\mathbf{x}_0$. Mechanically speaking, if the vanes of a paddle-wheel were attached to an arrow $\mathbf{n}_0$ (see Figure 9.26) and inserted in a velocity field F at $\mathbf{x}_0$, the wheel would be expected to rotate most rapidly with $\mathbf{n}_0$ held parallel to the vector $\operatorname{curl} F(\mathbf{x}_0)$ and not at all with $\mathbf{n}_0$ perpendicular to $\operatorname{curl} F(\mathbf{x}_0)$. In summary, we can think intuitively about the curl of a field as follows:

(i) The direction of $\operatorname{curl} F(\mathbf{x})$ is the axis about which F rotates most rapidly at $\mathbf{x}$.
(ii) The length of $\operatorname{curl} F(\mathbf{x})$ determines the maximum rate of rotation at $\mathbf{x}$.
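The relation between circulation and $\operatorname{curl} F \cdot \mathbf{n}_0$ can be seen in a short computation. The sketch below is an illustration under stated assumptions: it uses the field $F = (y^2, z^2, x^2)$, whose curl is $(-2z, -2x, -2y)$, the base point $(1, 2, 3)$, and a hypothetical helper `circulation_per_area` that takes an orthonormal pair `e1`, `e2` spanning the disk, so that the disk's normal is `e1` × `e2`. Dividing the circulation by the disk's area should give a value close to the component of the curl along that normal.

```python
import math


def F(x, y, z):
    return (y**2, z**2, x**2)


def circulation_per_area(F, center, e1, e2, r=1e-2, n=400):
    """Circulation of F around the circle center + r(cos t e1 + sin t e2), divided by pi r^2.

    e1 and e2 are assumed orthonormal; the circle is traced counterclockwise
    as seen from the tip of the normal e1 x e2.
    """
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        c, s = math.cos(t), math.sin(t)
        point = [center[i] + r * (c * e1[i] + s * e2[i]) for i in range(3)]
        velocity = [r * (-s * e1[i] + c * e2[i]) for i in range(3)]
        field = F(*point)
        total += sum(field[i] * velocity[i] for i in range(3)) * dt
    return total / (math.pi * r**2)


x0 = (1.0, 2.0, 3.0)
i_hat, j_hat, k_hat = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
print(circulation_per_area(F, x0, j_hat, k_hat))  # normal i: compare with -2z = -6
print(circulation_per_area(F, x0, k_hat, i_hat))  # normal j: compare with -2x = -2
print(circulation_per_area(F, x0, i_hat, j_hat))  # normal k: compare with -2y = -4
```

Of the three coordinate directions, the circulation is largest in magnitude about the x-axis here, in line with the first coordinate of the curl being the largest in absolute value at this point.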

The extension of Stokes's Theorem to piecewise smooth orientable surfaces is very simple, though we need to be careful in orienting the border of such a surface. Figure 9.27 illustrates the method. The surfaces $S_1$ and $S_2$ have their borders joined so as to produce a piecewise smooth positively oriented surface, which we denote by $S_1 \cup S_2$. Recall that the surface integral of a vector field F over $S_1 \cup S_2$ has already been defined by

$$\int_{S_1 \cup S_2} F \cdot dS = \int_{S_1} F \cdot dS + \int_{S_2} F \cdot dS.$$

FIGURE 9.27

The piece of common border curve, indicated by a dashed line in Figure 9.27, will be traced in opposite directions, depending on whether the parametrization induced by $S_1$ or by $S_2$ is used. Hence the respective line integrals of F over the common border will have opposite sign, and when the line integrals over $\partial S_1$ and $\partial S_2$ are added, the integrals over the common part will cancel, leaving a line integral over the rest of the borders of $S_1$ and $S_2$. It is this remaining part that we call the positively oriented border of $S_1 \cup S_2$, and denote by $\partial(S_1 \cup S_2)$. With this understanding, we write Stokes's Theorem in the form

$$\int_S \operatorname{curl} F \cdot dS = \int_{\partial S} F \cdot d\mathbf{x},$$

for a piecewise smooth surface S.

EXAMPLE 3. We can regard a sphere as a piecewise smooth surface on which all of the border curves cancel one another. Indeed if we parametrize a sphere $S_a$ in $\mathbb{R}^3$ by

$$g(u, v) = \begin{pmatrix} a \sin v \cos u \\ a \sin v \sin u \\ a \cos v \end{pmatrix}, \quad 0 \le u \le 2\pi,\ 0 \le v \le \pi,$$

then the positively oriented "border" of the sphere consists of the half-circle shown in Figure 9.28 traced once in each direction. Thus the half-circle corresponds to the segments $u = 0$ and $u = 2\pi$ in the parameter domain. (What happens to the segments $v = 0$ and $v = \pi$?) The result is that a line integral over $\partial S_a$ will be zero, and Stokes's Theorem applied to a vector field F on $S_a$ gives

$$\int_{S_a} \operatorname{curl} F \cdot dS = 0.$$

FIGURE 9.28 (panels (a) and (b))

A surface like that in Example 3, in which the border is effectively nonexistent for the purpose of line integration over $\partial S$, is called a closed surface.

EXAMPLE 4. According to the electromagnetic theory embodied in Maxwell's equations, the vector current flow I in an electrical conductor is related to the magnetic field B that the current flow induces in the surrounding space by the equation $\operatorname{curl} B = I$. To apply Stokes's Theorem to this equation, let a bordered surface S cut the conductor cross-sectionally. Then

$$\int_S I \cdot dS = \int_S \operatorname{curl} B \cdot dS = \int_{\partial S} B \cdot d\mathbf{x}.$$

The first integral is the total current flux across S, and the last one is the circulation of the magnetic field around the border curve $\partial S$ that encircles the conductor. The equality of these two quantities is called Ampère's law.

A vector field for which $\operatorname{div} F = 0$ is called a divergence-free field; for a quick way to find one, start with an arbitrary twice continuously differentiable vector field $G = (G_1, G_2, G_3)$ and set $F = \operatorname{curl} G$. Then $\operatorname{div} F = 0$, since

$$\operatorname{div} F = \frac{\partial}{\partial x} \left( \frac{\partial G_3}{\partial y} - \frac{\partial G_2}{\partial z} \right) + \frac{\partial}{\partial y} \left( \frac{\partial G_1}{\partial z} - \frac{\partial G_3}{\partial x} \right) + \frac{\partial}{\partial z} \left( \frac{\partial G_2}{\partial x} - \frac{\partial G_1}{\partial y} \right) = 0$$

by equality of mixed partials. This way of generating fields that are locally divergence-free exhausts all possibilities, since locally every divergence-free field is the curl of some other field G. Though we won't prove this, note that it's analogous to the result of Theorem 2.6 of Section 2C which implies in $\mathbb{R}^3$ that locally every curl-free field, that is, a field F for which $\operatorname{curl} F = 0$, is the gradient of some scalar-valued function f, that is, $\nabla f = F$.
Let G(x, y, z) = (y sin z, x cos z, z sin x). Define a divergence-free vector field F by
F = curl G. Thus

F(x, y, z) = (x sin z, y cos z - z cos x, cos z - sin z),

and div F(x, y, z) = sin z + cos z + (- sin z - cos z) = 0.
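The computation above can be spot-checked numerically. The following is a minimal sketch, not part of the text: it approximates partial derivatives by central differences (the helper names `partial`, `curl`, and `div`, the step size, and the sample point are all choices made here for illustration) and compares curl G with the stated F, then evaluates div F.

```python
import math

def G(x, y, z):
    # The field G from the example.
    return (y * math.sin(z), x * math.cos(z), z * math.sin(x))

def F_claimed(x, y, z):
    # The field stated in the text as curl G.
    return (x * math.sin(z),
            y * math.cos(z) - z * math.cos(x),
            math.cos(z) - math.sin(z))

def partial(field, comp, var, p, h=1e-5):
    # Central-difference estimate of d(field[comp]) / d(x_var) at the point p.
    hi, lo = list(p), list(p)
    hi[var] += h
    lo[var] -= h
    return (field(*hi)[comp] - field(*lo)[comp]) / (2 * h)

def curl(field, p):
    # Coordinate definition of the curl, with derivatives estimated numerically.
    return (partial(field, 2, 1, p) - partial(field, 1, 2, p),
            partial(field, 0, 2, p) - partial(field, 2, 0, p),
            partial(field, 1, 0, p) - partial(field, 0, 1, p))

def div(field, p):
    # Coordinate definition of the divergence.
    return sum(partial(field, i, i, p) for i in range(3))

p = (0.7, -1.3, 2.1)          # arbitrary sample point (an assumption)
curl_G = curl(G, p)
div_F = div(F_claimed, p)
print(curl_G, F_claimed(*p), div_F)
```

Agreement at one sample point is only a sanity check, of course; the identity div(curl G) = 0 itself rests on the equality of mixed partials.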

To simplify the pictures, we'll consider here some vector fields in which the
z-coordinate is zero. Our pictures will show a horizontal slice of the field. Figure 9.29
shows three snapshots of a time-dependent field

$$F_t(x, y, z) = \bigl((1 - t)x - ty,\; tx + (1 - t)y,\; 0\bigr)$$

taken at times $t = 0$, $\tfrac12$ and $1$. With t fixed we compute $\operatorname{div} F_t(x, y, z) = 2 - 2t$ and
$\operatorname{curl} F_t(x, y, z) = (0, 0, 2t)$. The field varies from curl-free at $t = 0$ to divergence-free
at $t = 1$. Note carefully the geometric character of these extreme states. Intermediate
states have both curl and divergence nonzero.

[Figure 9.29: snapshots of $F_t(x, y, z) = ((1 - t)x - ty,\; tx + (1 - t)y,\; 0)$, $0 \le t \le 1$. (a) $t = 0$: div $\ne 0$, curl $= 0$. (b) $t = \tfrac12$: mixture. (c) $t = 1$: curl $\ne 0$, div $= 0$.]
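The values of the divergence and curl quoted for this field follow directly from the coordinate definitions. As a short worked check (the labels $P$ and $Q$ for the first two coordinate functions are introduced here only for convenience):

$$P = (1 - t)x - ty, \qquad Q = tx + (1 - t)y,$$

$$\operatorname{div} F_t = \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} = (1 - t) + (1 - t) = 2 - 2t,$$

$$\operatorname{curl} F_t = \left(0,\; 0,\; \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) = \bigl(0,\; 0,\; t - (-t)\bigr) = (0, 0, 2t).$$

At $t = 0$ the field is the radial field $(x, y, 0)$, and at $t = 1$ it is the rotational field $(-y, x, 0)$.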
5C Simple Connectedness
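Stokes's Theorem has many applications and in particular gives information about
gradient fields, that is, fields F such that $F = \nabla f$, or, using an alternative notation,
F = grad f. If we assume that F is the continuously differentiable gradient field of
f, we can form the vector field curl F. But $F = (\partial f/\partial x, \partial f/\partial y, \partial f/\partial z)$, so we get
immediately from the definition of curl F and the equality of mixed partials that

$$\operatorname{curl}(\operatorname{grad} f)(x) = 0, \tag{5.3}$$

for all x in the domain of f. We have already met the condition curl F = 0 in
Theorems 2.5 and 2.6, where, for the 3-dimensional case that we consider here, it
was stated in terms of the Jacobian matrix

$$F' = \begin{pmatrix} \dfrac{\partial F_1}{\partial x} & \dfrac{\partial F_1}{\partial y} & \dfrac{\partial F_1}{\partial z} \\[2ex] \dfrac{\partial F_2}{\partial x} & \dfrac{\partial F_2}{\partial y} & \dfrac{\partial F_2}{\partial z} \\[2ex] \dfrac{\partial F_3}{\partial x} & \dfrac{\partial F_3}{\partial y} & \dfrac{\partial F_3}{\partial z} \end{pmatrix}.$$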
The symmetry of F' about its main diagonal is equivalent to curl F = 0. Theo-
rem 2.5 says, in particular, that if F is a gradient field, then curl F is identically
zero. Theorem 2.6 gives only a partial converse, to the effect that if curl F is iden-
tically zero, then there is some rectangle in which F equals a gradient field. This
is sometimes paraphrased by saying that F is locally a gradient field. Example 2 of
Section 2 shows that the strict converse is false. Using Stokes's Theorem, we can
prove another partial converse, in which the local condition is replaced by a different
kind of restriction on the domain of the given field.
For this purpose we'll define a simply connected open set B in $\mathbb{R}^n$. Roughly, a set
B is simply connected if every closed curve $\gamma$ in B can be continuously contracted
to a point in such a way as to stay within B during the contraction. As $\gamma$ contracts
to a point, it sweeps out a surface S lying in B, and $\gamma$ is the border of S. The region
between two spheres shown in Figure 9.30(a) is simply connected, because a closed
curve can slip past the inner ball and then shrink to a point. However, the open ball
with a hole bored all the way through it isn't simply connected, because a surface
whose border encircles the hole must lie at least partly in the hole, and so outside
B. See Figure 9.30(b). In $\mathbb{R}^2$, the typical simply connected region is the inside of
a closed curve, whereas the outside of such a curve is not simply connected. In
Figure 9.31(a), the curve $\gamma$ is the border of the surface consisting of the part of the
plane lying inside $\gamma$. However, the presence of the hole in Figure 9.31(b) prevents a
similar construction. More precisely, we'll say that an open set B is simply connected
if every piecewise smooth closed curve $\gamma$ lying in B is the border of some piecewise
smooth orientable surface S lying in B, and with parameter domain a disk in $\mathbb{R}^2$. We
assume for applications that S is parametrized by twice continuously differentiable
functions.

[Figure 9.30, panels (a) and (b)]

[Figure 9.31, panels (a) and (b)]
Now we can prove the following.

5.4 Theorem. Let F be a continuously differentiable vector field defined on an
open set B in $\mathbb{R}^2$ or $\mathbb{R}^3$. If

(a) B is simply connected, and

(b) curl F is identically zero in B,

then F is a gradient field in B, that is, there is a real-valued function f such that
$F = \nabla f$.

Proof. By Theorem 2.4 it is enough to show that $\oint_\gamma F \cdot dx = 0$ for every piecewise
smooth closed curve $\gamma$ lying in B. Because B is simply connected, there is a piecewise
smooth surface S of which $\gamma$ is the border and to which we can apply Stokes's
Theorem in either two or three dimensions. Thus

$$\oint_{\partial S} F \cdot dx = \int_S \operatorname{curl} F \cdot dS = 0,$$

as we wanted to show. ∎
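The key step of the proof, that zero curl forces zero circulation around a border curve, can be illustrated numerically. The sketch below is only an illustration under assumptions chosen here: the closed curve $(\cos t, \sin t, \sin 2t)$, the gradient field of $f(x, y, z) = xyz + x^2$, and the rotational field $(-y, x, 0)$ with curl $(0, 0, 2)$ are all hypothetical test data, and the integral is approximated by the midpoint rule.

```python
import math

def circulation(F, curve, dcurve, n=2000):
    # Midpoint-rule approximation of the line integral of F over a closed
    # curve parametrized on [0, 2*pi].
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        Fx, Fy, Fz = F(*curve(t))
        dx, dy, dz = dcurve(t)
        total += (Fx * dx + Fy * dy + Fz * dz) * dt
    return total

def curve(t):
    # A closed space curve (an assumption made for this sketch).
    return (math.cos(t), math.sin(t), math.sin(2 * t))

def dcurve(t):
    # Derivative of the curve above.
    return (-math.sin(t), math.cos(t), 2 * math.cos(2 * t))

def grad_field(x, y, z):
    # Gradient of f(x, y, z) = x*y*z + x**2, so its curl is zero.
    return (y * z + 2 * x, x * z, x * y)

def rot_field(x, y, z):
    # curl = (0, 0, 2), so this field is not a gradient field.
    return (-y, x, 0.0)

circ_grad = circulation(grad_field, curve, dcurve)
circ_rot = circulation(rot_field, curve, dcurve)
print(circ_grad, circ_rot)
```

The first circulation comes out zero to rounding error, as the theorem predicts for a gradient field on all of $\mathbb{R}^3$; the second equals $2\pi$, which is the flux of $(0, 0, 2)$ through any surface bordered by this curve.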
EXERCISES

In Exercises 1 to 4, compute curl F.

1. $F(x, y, z) = (y - z^2,\; z - x^2,\; x - y^2)$
2. $F(x, y, z) = (z, 2y, 3z)$
3. $F(x, y, z) = (x - y,\; z - x,\; y - z)$
4. $F(x, y, z) = (x, y, z)$

In Exercises 5 to 8, verify by computing both integrals that Stokes's Theorem holds for the vector field F and
surface S. Sketch S and its border, showing orientation.

5. $F(x, y, z) = (x, y, z)$; $S$: $g(u, v) = (u, v, \sqrt{1 - u^2 - v^2})$, $u^2 + v^2 \le 1$
6. $F(x, y, z) = (z, x, y)$; $S$: $g(u, v) = (u, v, 1 - u^2 - v^2)$, $u^2 + v^2 \le 1$
7. $F(x, y, z) = (x, y, 0)$; $S$: $g(u, v) = (u, v, u^2 + v^2)$, $u^2 + v^2 \le 4$
8. $F(x, y, z) = (x, y, z)$; $S$: $g(u, v) = (\cos u, \sin u, v)$, $0 \le u \le 2\pi$, $0 \le v \le 2$

In Exercises 9 and 10, compute $\int_S \operatorname{curl} F \cdot dS$ by using Stokes's Theorem. In other words, choose a properly
oriented parametrization for the border curve $\gamma$ of S, and compute $\oint_\gamma F \cdot dx$.

9. $F(x, y, z) = (y, z, x)$; $S$: $g(u, v) = (u, v, \sqrt{1 - u^2 - v^2})$, $u^2 + v^2 \le 1$
10. $F(x, y, z) = (z^2, x^2, y^2)$; $S$: $g(u, v) = (u, v, u^2 + v^2)$, $u^2 + v^2 \le 4$
11. (a) Verify that if F(x, y, z) is independent of z and the third coordinate function of F is identically zero,
then Stokes's Theorem, applied to a planar surface in the xy plane, becomes Green's Theorem.
(b) Consider the function $\mathbb{R}^2 \to \mathbb{R}^3$ defined by
$$g(u, v) = \begin{pmatrix} u \cos v \\ u \sin v \\ [\ldots] \end{pmatrix}, \qquad 1 \le u \le 2, \quad 0 \le v \le 4\pi.$$
If S is the image in $\mathbb{R}^3$ of g, give a precise description of the oriented border of S.
(c) Use Stokes's Theorem to compute the integral of F(x, y, z) = (x, x, 0) over the border of S as oriented
by the parametrization in part (b).
12. Show that Stokes's formula can be written in the form
$$\int_S \operatorname{curl} F \cdot n \, d\sigma = \oint_{\partial S} F \cdot t \, ds,$$
where n is a unit normal to S and t is a unit tangent to $\partial S$.
*13. Use the result of Exercise 25 of Section 3 and Stokes's Theorem to prove that if F is a continuously differentiable
vector field at $x_0$, then
$$\lim_{r \to 0} \frac{1}{A(D_r)} \oint_c F \cdot t \, ds = \operatorname{curl} F(x_0) \cdot n_0,$$
where $D_r$ is a disk of radius r centered at $x_0$, $n_0$ is a unit normal to the disk, and c is the boundary of $D_r$.
14. Prove that if F is a continuously differentiable vector field such that at each point x of a piece of smooth surface S,
the vector curl F(x) is tangent to S, then the integral of F around the border of S is zero.
15. Let F be a differentiable vector field defined in an open subset B of $\mathbb{R}^3$. Use the decomposition of a square matrix
A into symmetric and skew-symmetric parts given by $A = \tfrac12(A + A^t) + \tfrac12(A - A^t)$ to show that for all y in $\mathbb{R}^3$
$$F'(x)y = S(x)y + \tfrac12 \operatorname{curl} F(x) \times y,$$
where S(x) is a symmetric matrix.
16. Let F(x, y, z) be the gradient field of the real-valued Newtonian potential
$$f(x, y, z) = (x^2 + y^2 + z^2)^{-1/2}.$$
Show that the circulation of F is zero around a smooth closed curve that is sufficiently close to a point of the
domain of F.
17. Carry out the computation of the identity in Equation (4) of the proof of Stokes's Theorem. For the first term on
the left of Equation (4) we find
$$\frac{\partial}{\partial u}\left((F_1 \circ g)\frac{\partial g_1}{\partial v}\right) = (F_1 \circ g)\frac{\partial^2 g_1}{\partial u\,\partial v} + \left(\frac{\partial F_1}{\partial x}\frac{\partial g_1}{\partial u} + \frac{\partial F_1}{\partial y}\frac{\partial g_2}{\partial u} + \frac{\partial F_1}{\partial z}\frac{\partial g_3}{\partial u}\right)\frac{\partial g_1}{\partial v}.$$
The second term works out similarly. Then subtract, using equality of mixed partials.
18. A vector field $F: \mathbb{R}^3 \to \mathbb{R}^3$ defined in a region R is called irrotational in R if curl F(x) = 0 for all x in R. If F
is continuously differentiable and irrotational in R, show that the circulation of F is zero around every sufficiently
small circular path in R.
19. Consider a cylindrical can C of radius 1 having a closed flat bottom and open top with an unspecified smooth border
$\partial C$, oriented as shown in Figure 9.32. Let F(x, y, z) = (x, x, 0). What is the value of the line integral of F over
the border of C?

[Figure 9.32]
20. Compute the integral of curl F, where $F(x, y, z) = (y^3, -x^3, z^3)$, over the hemisphere $x^2 + y^2 + z^2 = 1$, $z \ge 0$,
by considering an integral over the disk that closes the bottom of the hemisphere.
21. Show that the open subset of $\mathbb{R}^2$ consisting of $\mathbb{R}^2$ with the origin deleted is not simply connected by finding a vector
field F for which curl F is identically zero, but such that F is not a gradient field. [Hint: See Exercise 7 of Section 1.]
22. The open set in $\mathbb{R}^2$ consisting of $\mathbb{R}^2$ with two points $x_1$ and $x_2$ deleted is not simply connected. However,
you're asked here to show that if F is any continuously differentiable vector field in such a region such that
curl F = 0 there, then the integral of F over the smooth curve shown in Figure 9.33 is equal to zero.

[Figure 9.33]

23. Finding a vector field $G = (G_1, G_2, G_3)$ such that curl G = F, where $F = (F_1, F_2, F_3)$ is (necessarily)
divergence free, is in principle a rather complicated-looking problem that in practice may have a fairly straightforward
solution.
(a) Show that finding G amounts to solving this system of partial differential equations for $G_1$, $G_2$, and $G_3$:
$$\frac{\partial G_3}{\partial y} - \frac{\partial G_2}{\partial z} = F_1, \qquad \frac{\partial G_1}{\partial z} - \frac{\partial G_3}{\partial x} = F_2, \qquad \frac{\partial G_2}{\partial x} - \frac{\partial G_1}{\partial y} = F_3.$$
(b) Show that if F is continuously differentiable on all of $\mathbb{R}^3$, the system of equations in part (a) always has
solutions in which, for example, $G_3$ is constant and $G_1$ and $G_2$ are given by
$$G_1(x, y, z) = \int_0^z F_2(x, y, t)\,dt - \int_0^y F_3(x, t, 0)\,dt, \qquad G_2(x, y, z) = -\int_0^z F_1(x, y, t)\,dt.$$
To verify this you need to use the Leibniz rule for differentiation under the integral sign and the
assumption that div F = 0.

In Exercises 24 to 27, use the result of part (b) of the previous exercise to find a vector field G such that
curl G = F, and check that the equation is satisfied for each of the following fields, in which div F = 0.

24. $F(x, y, z) = (2x, -y, -z)$
25. $F(x, y, z) = (y, z, x)$
26. $F(x, y, z) = (yz, xz, xy)$
27. $F(x, y, z) = (x, -y, 3x)$

The vector fields G found in the previous exercises are not the only ones for which curl G = F when div F = 0.

28. Show that adding a gradient field will also work, that is, $\operatorname{curl}(G + \nabla f) = F$.
29. Show that if G and H satisfy curl G(x) = curl H(x) = F(x) for all x in $\mathbb{R}^3$, then $G - H = \nabla f$ for some
$f: \mathbb{R}^3 \to \mathbb{R}$.

It's possible to prove that if div F(x) = 0 for all x in $\mathbb{R}^3$, then
$$G(x) = \int_0^1 [F(tx) \times (tx)]\,dt$$
defines a vector field such that curl G = F. In Exercises 30 to 33, use this formula to find G given F,
and check that curl G = F, for each of the following fields for which the divergence is zero.

30. $F(x, y, z) = (2x, -y, -z)$
31. $F(x, y, z) = (y, z, x)$
32. $F(x, y, z) = (yz, xz, xy)$
33. $F(x, y, z) = (x, -y, 3x)$

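SECTION 6 THE OPERATORS $\nabla$, $\nabla\times$ AND $\nabla\cdot$

6A Derivative Formulas
To facilitate the application of the Gauss and Stokes theorems, it's helpful to extend
the use of the symbol $\nabla$, called "del," that is used in denoting the gradient field of
a real-valued function. In terms of the natural basis i, j, k for $\mathbb{R}^3$, we recall that

$$\nabla f = \frac{\partial f}{\partial x}\,i + \frac{\partial f}{\partial y}\,j + \frac{\partial f}{\partial z}\,k. \tag{6.1}$$

This equation defines $\nabla$ as an operator from real-valued differentiable functions
$\mathbb{R}^3 \to \mathbb{R}$ to vector fields $\mathbb{R}^3 \to \mathbb{R}^3$. If we write

$$\nabla = \frac{\partial}{\partial x}\,i + \frac{\partial}{\partial y}\,j + \frac{\partial}{\partial z}\,k, \tag{6.2}$$

then Equation 6.1 follows by application of both sides of Equation 6.2 to f.
The formalism just described makes the following definitions natural. If F is a
differentiable vector field given by

$$F(x) = F_1(x)\,i + F_2(x)\,j + F_3(x)\,k,$$

then the operator $\nabla\times$ is defined by taking the formal cross product of $\nabla$ and F to get

$$\nabla \times F = \begin{vmatrix} \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ F_2 & F_3 \end{vmatrix} i + \begin{vmatrix} \dfrac{\partial}{\partial z} & \dfrac{\partial}{\partial x} \\ F_3 & F_1 \end{vmatrix} j + \begin{vmatrix} \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} \\ F_1 & F_2 \end{vmatrix} k = \left(\frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z}\right) i + \left(\frac{\partial F_1}{\partial z} - \frac{\partial F_3}{\partial x}\right) j + \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) k. \tag{6.3}$$

Thus $\nabla \times F$ is the vector field that we have called the curl of F and written curl F.
Similarly, for a differentiable vector field F, we define the operator $\nabla\cdot$ by taking the
formal dot product of $\nabla$ and F to get

$$\nabla \cdot F = \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z}. \tag{6.4}$$

This real-valued function we have called the divergence of F and have written div F.
The meaning of the notation just introduced is easy to remember if Equation 6.2 is
kept in mind.
Using the $\nabla$ notation, Stokes's formula becomes

$$\int_S (\nabla \times F) \cdot n \, d\sigma = \oint_{\partial S} F \cdot t \, ds, \tag{6.5}$$

and Gauss's formula becomes

$$\int_R \nabla \cdot F \, dV = \int_{\partial R} F \cdot n \, d\sigma. \tag{6.6}$$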
To exploit these formulas fully, we need some identities involving $\nabla$. In the following
formulas, f and g are real-valued differentiable functions, F and G are differentiable
vector fields, and a and b are constants.

$$\nabla(af + bg) = a\nabla f + b\nabla g \tag{1}$$
$$\nabla(fg) = f\nabla g + g\nabla f \tag{2}$$
$$\nabla \times (aF + bG) = a\nabla \times F + b\nabla \times G \tag{3}$$
$$\nabla \times (fF) = f\nabla \times F + \nabla f \times F \tag{4}$$
$$\nabla \cdot (aF + bG) = a\nabla \cdot F + b\nabla \cdot G \tag{5}$$
$$\nabla \cdot (fF) = f\nabla \cdot F + \nabla f \cdot F \tag{6}$$
$$\nabla \cdot (F \times G) = (\nabla \times F) \cdot G - F \cdot (\nabla \times G). \tag{7}$$

Checking each of these formulas is a matter of writing out the expressions using the
coordinate definitions of the operators. All of these formulas compress into a useful
form the extraordinary amount of clutter that writing the corresponding coordinate
expressions requires. Note also that with the possible exception of Formula 7 the
formulas appear to be natural extensions of familiar calculus formulas and so are
easy to remember.
Using the same kind of verification used for Formulas 1 to 7 establishes that if f
and F are twice differentiable, then

$$\nabla \cdot (\nabla \times F) = 0, \tag{8}$$
$$\nabla \times (\nabla f) = 0, \tag{9}$$
$$\nabla \cdot \nabla f = \nabla^2 f, \tag{10}$$

where $\nabla^2 f$ is just shorthand for $\nabla \cdot (\nabla f)$ and so denotes the Laplace operator

$$\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}.$$

Equations (8) and (9) are the same as those in Exercises 13 and 14 of Section 4,
where we used the notations div and curl.
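As an illustration of what "writing out the expressions using the coordinate definitions" looks like in practice, here is a sketch of the check of Formula (6). The indexed names $x_1, x_2, x_3$ for $x, y, z$ are an abbreviation adopted only for this display. By the product rule applied to each coordinate,

$$\nabla \cdot (fF) = \sum_{i=1}^{3} \frac{\partial (fF_i)}{\partial x_i} = \sum_{i=1}^{3} f\,\frac{\partial F_i}{\partial x_i} + \sum_{i=1}^{3} \frac{\partial f}{\partial x_i}\,F_i = f\,\nabla \cdot F + \nabla f \cdot F.$$

The other first-order identities go the same way; only Formulas (8) and (9) need the equality of mixed partials in addition to the product rule.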
6B Green's Identities
The preceding formulas imply many special cases of the Gauss and Stokes theorems.
A particularly important kind arises if the vector field F is assumed to be a gradient
$\nabla f$, or a multiple $f\nabla g$ of a gradient. If we set $F = \nabla f$ in Equation 6.6, the result is

$$\int_R \nabla \cdot \nabla f \, dV = \int_{\partial R} \nabla f \cdot n \, d\sigma. \tag{11}$$

But by Formula (10), $\nabla \cdot \nabla f = \nabla^2 f$, and by Equation 1 of Chapter 6, Section 1,
$\nabla f \cdot n = (\partial/\partial n) f$. Thus we have

$$\int_R \nabla^2 f \, dV = \int_{\partial R} \frac{\partial f}{\partial n} \, d\sigma. \tag{12}$$

If we replace F in Formula 6.6 by $f\nabla g$, instead of by $\nabla f$, we have, from Equation (6),

$$\nabla \cdot (f\nabla g) = f\,\nabla \cdot \nabla g + \nabla f \cdot \nabla g,$$

and so Gauss's formula yields

$$\int_R f\,\nabla^2 g \, dV + \int_R \nabla f \cdot \nabla g \, dV = \int_{\partial R} f\,\frac{\partial g}{\partial n} \, d\sigma. \tag{6.7}$$

This is called Green's first identity. Because of the symmetry in the middle term,
interchange of f and g and subtraction of the corresponding terms gives Green's
second identity:

$$\int_R (f\,\nabla^2 g - g\,\nabla^2 f) \, dV = \int_{\partial R} \left(f\,\frac{\partial g}{\partial n} - g\,\frac{\partial f}{\partial n}\right) d\sigma. \tag{6.8}$$
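The subtraction that produces Green's second identity can be spelled out as follows; this is only a sketch of the step described in words above, written with $\partial R$ for the boundary surface of R. Green's first identity, and the same identity with the roles of f and g interchanged, read

$$\int_R f\,\nabla^2 g \, dV + \int_R \nabla f \cdot \nabla g \, dV = \int_{\partial R} f\,\frac{\partial g}{\partial n} \, d\sigma, \qquad \int_R g\,\nabla^2 f \, dV + \int_R \nabla g \cdot \nabla f \, dV = \int_{\partial R} g\,\frac{\partial f}{\partial n} \, d\sigma.$$

Since $\nabla f \cdot \nabla g = \nabla g \cdot \nabla f$, the middle integrals cancel when the second equation is subtracted from the first, leaving

$$\int_R (f\,\nabla^2 g - g\,\nabla^2 f) \, dV = \int_{\partial R} \left(f\,\frac{\partial g}{\partial n} - g\,\frac{\partial f}{\partial n}\right) d\sigma.$$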

EXAMPLE 1

Let R be a polygonally connected region in $\mathbb{R}^3$ with a piecewise smooth boundary
surface S. If h is a real-valued function defined in R, we consider the Poisson
equation

$$\nabla^2 u = h$$

subject to a preassigned boundary condition, $u(x) = \phi(x)$ for x on S. We suppose
that there is at least one solution u(x) defined in R and satisfying the boundary
condition. We can prove, using Green's first identity, that such a solution must be
unique. Let us suppose that there were two solutions $u_1$ and $u_2$; then the function
u defined by $u(x) = u_1(x) - u_2(x)$ would satisfy the Laplace equation $\nabla^2 u = 0$
in R, together with the boundary condition u(x) = 0 on S. Setting f = g = u in
Formula 6.7 gives

$$\int_R u\,\nabla^2 u \, dV + \int_R |\nabla u|^2 \, dV = \int_S u\,\frac{\partial u}{\partial n} \, d\sigma.$$

But the first and last terms are zero because $\nabla^2 u = 0$ in R and u = 0 on S. It
follows from $\int_R |\nabla u|^2 \, dV = 0$ that $\nabla u = 0$ identically on R and S. Hence u must
be a constant in the polygonally connected region R. Finally, u must be identically
zero because u(x) = 0 for x on S. We remark that the Laplace equation is the
special case of the Poisson equation obtained by taking h identically zero; thus
we have proved a uniqueness theorem for the Laplace equation also. The Laplace
and Poisson equations are important in various physical problems. For example, the
steady-state temperature in a homogeneous solid satisfies Laplace's equation. If h
is the density of electric charge in a region of space, then the electrostatic field is
proportional to the gradient of a solution u of the Poisson equation $\nabla^2 u = h$.

EXAMPLE 2

Green's second identity, Equation 6.8, illuminates many features of the Laplace
operator $\nabla^2$. In particular, suppose u and v are harmonic functions on a region R of $\mathbb{R}^3$
and its smooth boundary surface $\partial R$, that is, suppose $\nabla^2 u = 0$ and $\nabla^2 v = 0$ on
$R \cup \partial R$. Setting f = u and g = v in Equation 6.8 makes the left side zero, so

$$\int_{\partial R} \left(u\,\frac{\partial v}{\partial n} - v\,\frac{\partial u}{\partial n}\right) d\sigma = 0 \qquad\text{or}\qquad \int_{\partial R} u\,\frac{\partial v}{\partial n} \, d\sigma = \int_{\partial R} v\,\frac{\partial u}{\partial n} \, d\sigma.$$

This last equation expresses a symmetry that holds for an arbitrary pair of harmonic
functions on $R \cup \partial R$. In particular, with v = 1 as constant harmonic function,
$\partial v/\partial n = 0$ so we conclude that

$$\int_{\partial R} \frac{\partial u}{\partial n} \, d\sigma = 0.$$

In words this last equation says that the average value over $\partial R$ of the normal
derivative of a harmonic function must be zero. This last result can also be obtained directly
from Green's first identity. (See Exercise 16.)

From Equation 6.6 we can derive some equations for vector-valued integrals. Let
v be an arbitrary constant vector and let F(x) = f(x)v, where f is real-valued and
continuously differentiable on a region R. Then because $\nabla \cdot (fv) = \nabla f \cdot v$ (verify!),
Formula 6.6 becomes

$$\int_R \nabla f \cdot v \, dV = \int_{\partial R} f\,v \cdot n \, d\sigma.$$

Since v is constant,

$$v \cdot \int_R \nabla f \, dV = v \cdot \int_{\partial R} f\,n \, d\sigma.$$

Because v is arbitrary, we can successively set v equal to $e_1$, $e_2$, $e_3$, and conclude
that the two vector integrals have the same coordinates in $\mathbb{R}^3$. Hence

$$\int_R \nabla f \, dV = \int_{\partial R} f\,n \, d\sigma. \tag{13}$$

Similarly replacing F in Equation 6.6 by $v \times F$, where v is a constant vector, we can
conclude that

$$\int_R \nabla \times F \, dV = \int_{\partial R} n \times F \, d\sigma. \tag{14}$$

6C Changing Coordinates
We've already dealt with this issue in Chapter 6, Section 2B and Section 5, and in
Chapter 7, Section 4. In the first instance we looked mainly at fairly simple linear
examples, in the second mainly at geometry, and in the third mainly just at Jacobian
determinants in multiple integrals. Here we take up nonlinear changes of variable
as they affect first and second order differential operators. The calculations can be
fairly messy, so to understand them it's important to follow a general principle.
Changing from rectangular coordinates x in $\mathbb{R}^n$ to curvilinear coordinates u we
use a coordinate transformation x = T(u) that's both continuously differentiable
and invertible with continuously differentiable inverse $T^{-1}$. Suppose v(x) is a scalar
or vector function, and for reasons of symmetry in the field we'd like to compute
$\nabla \cdot v$, not in x-coordinates but in u-coordinates. But we defined the divergence as a
sum of partial derivatives of the x-variables, and just adding up partials with respect
to the u-variables isn't correct. To clarify the notation we introduce a new function
$\bar v(u) = v(T(u))$. Thus we can differentiate and integrate $\bar v(u)$ directly in terms of
u-variables. Using the derivative matrix $T'(u)$ and its inverse $(T')^{-1}(u)$, the switch
to new coordinates sorts out as follows.

6.9 Theorem. If v(x) is continuously differentiable then $v'(x) = \bar v'(u)\,(T')^{-1}(u)$,
where x = T(u) is a coordinate transformation and $\bar v(u) = v(T(u))$.

Proof. Since x = T(u), and we assume $T^{-1}$ exists and is differentiable, we have
$v(x) = \bar v(T^{-1}(x))$, and the chain rule gives us

$$v'(x) = \bar v'(T^{-1}(x))\,(T^{-1})'(x).$$

By Theorem 2.3 of Chapter 6 the derivative of an inverse is the inverse of the
derivative, so $(T^{-1})'(x) = (T')^{-1}(u)$. Finally note that $\bar v'(T^{-1}(x)) = \bar v'(u)$. ∎

We'll show how this theorem works with plane polar coordinates.

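EXAMPLE 3

Divergence and gradient in polar coordinates. We first compute the inverse of the
derivative matrix for the polar coordinate transformation

$$\begin{aligned} x &= r\cos\theta \\ y &= r\sin\theta \end{aligned}\;: \qquad \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}^{-1} = \begin{pmatrix} \cos\theta & \sin\theta \\ -(1/r)\sin\theta & (1/r)\cos\theta \end{pmatrix}.$$

To find $\nabla u(x, y)$ in polar coordinates we use Theorem 6.9 and multiply this inverse
matrix by the derivative matrix of $\bar u(r, \theta)$:

$$\begin{pmatrix} \dfrac{\partial \bar u}{\partial r} & \dfrac{\partial \bar u}{\partial \theta} \end{pmatrix} \begin{pmatrix} \cos\theta & \sin\theta \\ -(1/r)\sin\theta & (1/r)\cos\theta \end{pmatrix}.$$

Thus

$$\nabla u = \left(\cos\theta\,\frac{\partial \bar u}{\partial r} - \frac1r\sin\theta\,\frac{\partial \bar u}{\partial \theta}\right) i + \left(\sin\theta\,\frac{\partial \bar u}{\partial r} + \frac1r\cos\theta\,\frac{\partial \bar u}{\partial \theta}\right) j, \quad\text{for } r > 0. \tag{15}$$

In particular we see that we can replace partial differentiation of u with respect to x
and y by action on $\bar u$ according to

$$\frac{\partial u}{\partial x} = \cos\theta\,\frac{\partial \bar u}{\partial r} - \frac1r\sin\theta\,\frac{\partial \bar u}{\partial \theta} \qquad\text{and}\qquad \frac{\partial u}{\partial y} = \sin\theta\,\frac{\partial \bar u}{\partial r} + \frac1r\cos\theta\,\frac{\partial \bar u}{\partial \theta}. \tag{16}$$

Note that the coefficients of the partials with respect to r and $\theta$ are just the columns
of the inverse matrix. Thus the divergence of a vector field $F(x, y) = F_1(x, y)\,i + F_2(x, y)\,j$
becomes in polar coordinates

$$\nabla \cdot F = \left(\cos\theta\,\frac{\partial \bar F_1}{\partial r} - \frac1r\sin\theta\,\frac{\partial \bar F_1}{\partial \theta}\right) + \left(\sin\theta\,\frac{\partial \bar F_2}{\partial r} + \frac1r\cos\theta\,\frac{\partial \bar F_2}{\partial \theta}\right). \tag{17}$$

If $\bar u(r)$ and $\bar F(r)$ are functions of r alone then Equations (15) and (17) simplify to

$$\nabla u = \cos\theta\,\frac{\partial \bar u}{\partial r}\,i + \sin\theta\,\frac{\partial \bar u}{\partial r}\,j \qquad\text{and}\qquad \nabla \cdot F = \cos\theta\,\frac{\partial \bar F_1}{\partial r} + \sin\theta\,\frac{\partial \bar F_2}{\partial r}.$$

Note that while the vector coordinates are computed using polar coordinates, the
vectors $\nabla u$ are represented using standard basis vectors i and j in $\mathbb{R}^2$.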

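An operator of the form $\nabla^2 = \nabla \cdot \nabla$ is called a Laplacian, and it operates on
functions defined in spaces of various dimensions and expressed in a variety of
coordinate systems. Here's the simplest example in non-rectangular coordinates.

EXAMPLE 4

Laplacian in polar coordinates. In plane rectangular coordinates $\nabla^2$ acts on a
scalar-valued function by $\nabla^2 u(x, y) = u_{xx}(x, y) + u_{yy}(x, y)$. Switching to polar
coordinates we let $\bar u(r, \theta) = u(r\cos\theta, r\sin\theta)$. In operator form Equations 16 lead
to the operator replacements

$$\frac{\partial}{\partial x} \;\text{by}\; \left(\cos\theta\,\frac{\partial}{\partial r} - \frac1r\sin\theta\,\frac{\partial}{\partial \theta}\right) \qquad\text{and}\qquad \frac{\partial}{\partial y} \;\text{by}\; \left(\sin\theta\,\frac{\partial}{\partial r} + \frac1r\cos\theta\,\frac{\partial}{\partial \theta}\right). \tag{18}$$

Applying the first of these to the first equation in (16) gives

$$\begin{aligned} \frac{\partial^2 u}{\partial x^2} &= \left(\cos\theta\,\frac{\partial}{\partial r} - \frac1r\sin\theta\,\frac{\partial}{\partial \theta}\right)\left(\cos\theta\,\frac{\partial \bar u}{\partial r} - \frac1r\sin\theta\,\frac{\partial \bar u}{\partial \theta}\right) \\ &= \cos\theta\,\frac{\partial}{\partial r}\left(\cos\theta\,\frac{\partial \bar u}{\partial r}\right) - \cos\theta\,\frac{\partial}{\partial r}\left(\frac1r\sin\theta\,\frac{\partial \bar u}{\partial \theta}\right) \\ &\quad - \frac1r\sin\theta\,\frac{\partial}{\partial \theta}\left(\cos\theta\,\frac{\partial \bar u}{\partial r}\right) + \frac{1}{r^2}\sin\theta\,\frac{\partial}{\partial \theta}\left(\sin\theta\,\frac{\partial \bar u}{\partial \theta}\right). \end{aligned}$$

We get the polar expression for $\partial/\partial y$ from $\partial/\partial x$ by interchanging $\cos\theta$ and $\sin\theta$ and
replacing $-$ by $+$, so

$$\begin{aligned} \frac{\partial^2 u}{\partial y^2} &= \left(\sin\theta\,\frac{\partial}{\partial r} + \frac1r\cos\theta\,\frac{\partial}{\partial \theta}\right)\left(\sin\theta\,\frac{\partial \bar u}{\partial r} + \frac1r\cos\theta\,\frac{\partial \bar u}{\partial \theta}\right) \\ &= \sin\theta\,\frac{\partial}{\partial r}\left(\sin\theta\,\frac{\partial \bar u}{\partial r}\right) + \sin\theta\,\frac{\partial}{\partial r}\left(\frac1r\cos\theta\,\frac{\partial \bar u}{\partial \theta}\right) \\ &\quad + \frac1r\cos\theta\,\frac{\partial}{\partial \theta}\left(\sin\theta\,\frac{\partial \bar u}{\partial r}\right) + \frac{1}{r^2}\cos\theta\,\frac{\partial}{\partial \theta}\left(\cos\theta\,\frac{\partial \bar u}{\partial \theta}\right). \end{aligned}$$

With the help of $\sin^2\theta + \cos^2\theta = 1$ we can extract from the first and last terms of
the sum $u_{xx} + u_{yy}$ the terms $\bar u_{rr}$ and $r^{-2}\bar u_{\theta\theta}$. Everything else but $r^{-1}\bar u_r$ cancels, so
in polar coordinates

$$\nabla^2 \bar u = \frac{\partial^2 \bar u}{\partial r^2} + \frac1r\frac{\partial \bar u}{\partial r} + \frac{1}{r^2}\frac{\partial^2 \bar u}{\partial \theta^2}, \quad 0 < r. \tag{19}$$

Exercise 20 shows it's easier to verify this equation than derive it from $u_{xx} + u_{yy}$.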
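The polar formula for the Laplacian can also be tested numerically on a specific function. The sketch below is an illustration only: the test function $u(x, y) = x^3 y + y^2$ (whose rectangular Laplacian is $6xy + 2$), the sample point, and the finite-difference step are all assumptions made here, and the derivatives of $\bar u$ are approximated by central differences rather than computed exactly.

```python
import math

def u(x, y):
    # Hypothetical test function; u_xx + u_yy = 6*x*y + 2.
    return x**3 * y + y**2

def laplacian_rectangular(x, y):
    # Exact rectangular Laplacian of the test function above.
    return 6 * x * y + 2

def u_bar(r, theta):
    # The same function expressed in polar coordinates.
    return u(r * math.cos(theta), r * math.sin(theta))

def laplacian_polar(r, theta, h=1e-3):
    # u_rr + (1/r) u_r + (1/r^2) u_theta_theta, by central differences.
    center = u_bar(r, theta)
    u_r = (u_bar(r + h, theta) - u_bar(r - h, theta)) / (2 * h)
    u_rr = (u_bar(r + h, theta) - 2 * center + u_bar(r - h, theta)) / h**2
    u_tt = (u_bar(r, theta + h) - 2 * center + u_bar(r, theta - h)) / h**2
    return u_rr + u_r / r + u_tt / r**2

r, theta = 1.7, 0.6            # arbitrary sample point with r > 0
polar_value = laplacian_polar(r, theta)
rect_value = laplacian_rectangular(r * math.cos(theta), r * math.sin(theta))
print(polar_value, rect_value)
```

The two values agree up to the discretization error of the difference quotients. Like Exercise 20, this is a verification rather than a derivation.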

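Laplacian in cylindrical coordinates. An argument analogous to the ones in
Examples 3 and 4 is based on the inverse of the derivative matrix Formula 5.6
of Chapter 6, Section 5, namely

$$\begin{aligned} x &= r\cos\theta \\ y &= r\sin\theta \\ z &= z \end{aligned}\;: \qquad \begin{pmatrix} \cos\theta & -r\sin\theta & 0 \\ \sin\theta & r\cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -(1/r)\sin\theta & (1/r)\cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

Using the polar coordinate result, we find

$$\nabla^2 \bar u = \frac{\partial^2 \bar u}{\partial r^2} + \frac1r\frac{\partial \bar u}{\partial r} + \frac{1}{r^2}\frac{\partial^2 \bar u}{\partial \theta^2} + \frac{\partial^2 \bar u}{\partial z^2}, \quad r > 0. \tag{20}$$

Laplacian in spherical coordinates. An argument analogous to the ones in Examples
3 and 4 is based on the inverse of the derivative matrix Formula 5.5 of Chapter 6,
Section 5, namely

$$\begin{pmatrix} \sin\phi\cos\theta & r\cos\phi\cos\theta & -r\sin\phi\sin\theta \\ \sin\phi\sin\theta & r\cos\phi\sin\theta & r\sin\phi\cos\theta \\ \cos\phi & -r\sin\phi & 0 \end{pmatrix}^{-1} = \begin{pmatrix} \sin\phi\cos\theta & \sin\phi\sin\theta & \cos\phi \\[1ex] \dfrac{\cos\phi\cos\theta}{r} & \dfrac{\cos\phi\sin\theta}{r} & -\dfrac{\sin\phi}{r} \\[2ex] -\dfrac{\sin\theta}{r\sin\phi} & \dfrac{\cos\theta}{r\sin\phi} & 0 \end{pmatrix}.$$

The replacements of the partials with respect to x, y, and z are, respectively,

$$\frac{\partial}{\partial x} \;\text{by}\; \left(\sin\phi\cos\theta\,\frac{\partial}{\partial r} + \frac{\cos\phi\cos\theta}{r}\,\frac{\partial}{\partial \phi} - \frac{\sin\theta}{r\sin\phi}\,\frac{\partial}{\partial \theta}\right),$$

$$\frac{\partial}{\partial y} \;\text{by}\; \left(\sin\phi\sin\theta\,\frac{\partial}{\partial r} + \frac{\cos\phi\sin\theta}{r}\,\frac{\partial}{\partial \phi} + \frac{\cos\theta}{r\sin\phi}\,\frac{\partial}{\partial \theta}\right),$$

$$\frac{\partial}{\partial z} \;\text{by}\; \left(\cos\phi\,\frac{\partial}{\partial r} - \frac{\sin\phi}{r}\,\frac{\partial}{\partial \phi}\right).$$

Applying these twice in succession to a function $\bar u(r, \phi, \theta)$, and then adding the
results, gives for $0 < r$, $0 < \phi < \pi$,

$$\nabla^2 \bar u = \frac{\partial^2 \bar u}{\partial r^2} + \frac2r\frac{\partial \bar u}{\partial r} + \frac{1}{r^2}\frac{\partial^2 \bar u}{\partial \phi^2} + \frac{\cos\phi}{r^2\sin\phi}\frac{\partial \bar u}{\partial \phi} + \frac{1}{r^2\sin^2\phi}\frac{\partial^2 \bar u}{\partial \theta^2}.$$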
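As with the polar case, the spherical formula can be spot-checked numerically. This is a minimal sketch under assumptions made here, not a derivation: it takes $x = r\sin\phi\cos\theta$, $y = r\sin\phi\sin\theta$, $z = r\cos\phi$ as the spherical coordinate transformation, uses the hypothetical test function $u(x, y, z) = x^2 z + y$ (whose rectangular Laplacian is $2z$), and approximates the derivatives of $\bar u$ by central differences.

```python
import math

def u(x, y, z):
    # Hypothetical test function; u_xx + u_yy + u_zz = 2*z.
    return x**2 * z + y

def u_bar(r, phi, theta):
    # The same function in spherical coordinates (phi measured from the z-axis).
    return u(r * math.sin(phi) * math.cos(theta),
             r * math.sin(phi) * math.sin(theta),
             r * math.cos(phi))

def laplacian_spherical(r, phi, theta, h=1e-3):
    # Spherical Laplacian formula, with derivatives by central differences.
    c = u_bar(r, phi, theta)
    u_r = (u_bar(r + h, phi, theta) - u_bar(r - h, phi, theta)) / (2 * h)
    u_rr = (u_bar(r + h, phi, theta) - 2 * c + u_bar(r - h, phi, theta)) / h**2
    u_p = (u_bar(r, phi + h, theta) - u_bar(r, phi - h, theta)) / (2 * h)
    u_pp = (u_bar(r, phi + h, theta) - 2 * c + u_bar(r, phi - h, theta)) / h**2
    u_tt = (u_bar(r, phi, theta + h) - 2 * c + u_bar(r, phi, theta - h)) / h**2
    s = math.sin(phi)
    return (u_rr + 2 * u_r / r + u_pp / r**2
            + math.cos(phi) * u_p / (r**2 * s) + u_tt / (r**2 * s**2))

r, phi, theta = 1.4, 0.9, 0.5   # arbitrary sample point, r > 0 and 0 < phi < pi
spherical_value = laplacian_spherical(r, phi, theta)
rect_value = 2 * r * math.cos(phi)   # 2*z at the same point
print(spherical_value, rect_value)
```

Agreement here depends on the angle convention assumed above; with a different labeling of the two angles the roles of $\phi$ and $\theta$ in the formula would be interchanged.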

EXERCISES

For Exercises 1 to 7, verify the corresponding identity (1)-(7) in Section 6A. For Exercises 8 to 10, verify the
corresponding identity (8)-(10) in Section 6A.

11. Prove that if v is a constant vector and x is not zero, then
$$\nabla \times \frac{v \times x}{|x|} = \frac{v}{|x|} + \frac{v \cdot x}{|x|^3}\,x.$$

In Exercises 12 and 13, assume x in $\mathbb{R}^3$, and prove the equation for $x \ne 0$.

12. $\nabla\left(\dfrac{1}{|x|}\right) = -\dfrac{x}{|x|^3}$
13. $\nabla^2\left(\dfrac{1}{|x|}\right) = 0$
14. Replace F in Equation 6.6 of the text by $v \times F$ where v is a constant vector in $\mathbb{R}^3$. Use this to prove Formula
(14) at the end of Section 6B. [Hint: Use Formula (7) in Section 6A, and Equation 8 in Section 4C of Chapter 1
to show that the dot product of the two sides of Formula (14) with v are equal as in the proof of Formula (13).]
15. If T(x) is the steady-state temperature at a point x of an open set R in $\mathbb{R}^3$, then the flux of the temperature
gradient across any smooth surface in R is zero. Use this fact and Equation (12) to prove that a steady-state temperature
function that is twice continuously differentiable is harmonic, i.e., $\nabla^2 T = 0$. [Hint: Suppose that $\nabla^2 T(x_0) > 0$.
Prove that $\nabla^2 T(x) > 0$ in some ball centered at $x_0$.]
16. Use Green's first identity to prove that if u is a harmonic function on a region R together with its smooth boundary
$\partial R$ in $\mathbb{R}^3$, then the average value over $\partial R$ of the normal derivative of u must be zero.
17. Consider the Newtonian potential function $N(x) = |x|^{-1}$ and its associated gradient field $\nabla N(x)$. (See Exercise 16.)
Prove that N(x) can be interpreted as the work done in moving a particle from $\infty$ to x along some smooth path
through the field $\nabla N$.
18. Show that if f(x, y) equals a function $\bar f(\sqrt{x^2 + y^2}) = \bar f(r)$, then
$$\nabla^2 f(x, y) = \frac{\partial^2 \bar f(r)}{\partial r^2} + \frac1r\frac{\partial \bar f(r)}{\partial r}.$$
19. Show that if f(x, y, z) equals a function $\bar f(\sqrt{x^2 + y^2 + z^2}) = \bar f(r)$, then
$$\nabla^2 f(x, y, z) = \frac{\partial^2 \bar f(r)}{\partial r^2} + \frac2r\frac{\partial \bar f(r)}{\partial r}.$$
20. Verify the formula for $\nabla^2 \bar u(r, \theta)$ in text Example 4 by computing $\bar u_r$, $\bar u_{rr}$, and $\bar u_{\theta\theta}$ from
$\bar u(r, \theta) = u(r\cos\theta, r\sin\theta)$. This computation doesn't qualify as a derivation of the polar form of $\nabla^2 u(x, y)$.
21. Verify that the cancellations claimed at the end of text Example 4 do occur.

Chapter 9 REVIEW

1. (a) Find a function f such that $\nabla f(x, y) = (3x^2 y,\; x^3 + 3y^2)$.
(b) Use your answer to part (a) to prove that
$$\int_\gamma 3x^2 y \, dx + (x^3 + 3y^2) \, dy = 8$$
for any path $\gamma$ from (1, 1) to (1, 2).
2. Let $\gamma$ be the closed curve consisting of line segments from (0, 0) to (1, 0), from (1, 0) to (1, 2) and from (1, 2) back
to (0, 0). Show that
$$\oint_\gamma (-xy + \sin x^2) \, dx + \cos^2 y \, dy = \frac23.$$
3. Let R be the region in $\mathbb{R}^2$ above the x-axis and below the curve $\gamma$ parametrized by $g(t) = (1 + t^2,\; t - t^2)$ with
$0 \le t \le 1$. Use Green's Theorem to prove that the area of R is $\tfrac16$.
4. (a) Find the unique choice $\beta_0$ of the constant $\beta$ for which the line integral $I(\gamma; \beta)$ is independent of the
path $\gamma$ in the first quadrant, where
$$I(\gamma; \beta) = \int_\gamma \frac{1 + \beta y^2}{(1 + xy)^2} \, dx + \frac{1 + \beta x^2}{(1 + xy)^2} \, dy.$$
(b) Prove that $I(\gamma; \beta_0) = (x_0 + y_0)/(1 + x_0 y_0)$ for all piecewise smooth paths in the first quadrant from
(0, 0) to $(x_0, y_0)$, using whatever method seems most convenient.
(c) Explain to what extent the results of parts (a) and (b) extend to other quadrants.
5. Let S be the closed surface that is the boundary of the solid region inside the cylinder $x^2 + y^2 = 4$ and
between the planes z = 0 and z = 2. Suppose S is positively oriented, with normal vector pointing out at
each point. Find the flux of the vector field $F(x, y, z) = (x^3,\; y^3 + x,\; xy)$ across S.
6. Consider the vector field in $\mathbb{R}^3$ given by $F(x) = \dfrac{1}{|x|^\beta}\,x$ for $x \ne 0$.
(a) Prove that for every real constant $\beta$, curl F(x) = 0.
(b) What does part (a) tell you about the circulation of F around closed paths in $\mathbb{R}^3$?
(c) Prove that div F(x) = 0 if and only if $\beta = 3$, that is, just when $|F(x)| = |x|^{-2}$.
(d) For $\beta = 3$ what does part (c) tell you about the flux of F across closed surfaces in $\mathbb{R}^3$?
(e) Prove that the flux of F across the sphere of radius a centered at the origin is $4\pi a^{3-\beta}$; is this consistent
with your answer to part (d)?
7. The circulation and flux of the vector field F(x, y) = (-x, -y) relative to the circle $x^2 + y^2 = 1$ can be
computed in several ways, some easier than others.
(a) Find the total circulation of F(x, y) around the circle $x^2 + y^2 = 1$ (relative to the counterclockwise
direction).
(b) Find the total flux of F across the circle $x^2 + y^2 = 1$, relative to outward-pointing unit normal vectors.
8. Let F be given by F(x, y, z) = (x + y, y + z, z + x), and let $g(u, v) = (u\cos v, u\sin v, v)$ parametrize a helicoidal
surface H.
(a) Compute curl F(x, y, z).
(b) Compute the normal vector $g_u(u, v) \times g_v(u, v)$.
(c) Compute the integral $\int_{H_1} \operatorname{curl} F \cdot dS$, where $H_1$ is the part of the helicoid corresponding to
$0 \le u \le 1$, $0 \le v \le 4\pi$, representing two complete turns of a helicoid of width 1.
9. Consider the family of 2-dimensional vector fields defined by
$$F_\alpha(x, y) = \frac{-y}{(x^2 + y^2)^\alpha}\,i + \frac{x}{(x^2 + y^2)^\alpha}\,j, \qquad (x, y) \ne (0, 0),$$
where $\alpha$ is a positive constant. Note that $\operatorname{curl} F_0(x, y) = 2k$.
(a) Compute the scalar curl of $F_\alpha(x, y)$ and prove that it's zero if and only if $\alpha = 1$, in which case it's
identically zero.
(b) What can you say, depending on $\alpha > 0$, about the circulation of $F_\alpha$ around a smooth closed curve that
doesn't contain the origin?
(c) What can you say, depending on $\alpha$, about the circulation of $F_\alpha$ around a circle centered at the origin
that encircles the origin once, counterclockwise?
10. Find a function f(x, y) such that $\nabla f(x, y) = (2xy + y^3 + 1,\; x^2 + 3xy^2)$. Explain why you cannot find an
f(x, y) such that $\nabla f(x, y) = (x^2 + 3xy^2,\; 2xy + y^3 + 1)$.
11. Let R be a plane region with piecewise smooth boundary curve $\partial R$, oriented counterclockwise. Prove that
$$\oint_{\partial R} x \, dy = -\oint_{\partial R} y \, dx = A(R), \qquad\text{and}\qquad \oint_{\partial R} x \, dx = \oint_{\partial R} y \, dy = 0.$$
12. Find $\tfrac12 \oint_c -y \, dx + x \, dy$, where c is the boundary of the region between the curves $y = x^2$ and $y = 8 - x^2$,
oriented counterclockwise. Sketch the region, its boundary, and the field.
13. The conical graph of $z = 2\sqrt{x^2 + y^2}$, $x^2 + y^2 \le 1$ has the hemispherical graph of
$z = 2 + \sqrt{1 - x^2 - y^2}$, $x^2 + y^2 \le 1$ placed over its top to form a surface S.
(a) Find parametrizations that orient each of the two parts of S such that the normals point outward over
the whole surface.
(b) Compute the total surface area of S and its enclosed volume.
(c) Find the flux of F(x, y, z) = (x, y, z) across S.
14. Let S be the portion of the sphere $x^2 + y^2 + z^2 = 1$ above the xy plane. Calculate $\iint_S z \, d\sigma$ in two ways:
(a) Directly, as the integral of a function over a surface.
(b) By noting that
$$\iint_S z \, d\sigma = \iint_S (0, 0, 1) \cdot (x, y, z) \, d\sigma = \iint_S \operatorname{curl} F \cdot n \, d\sigma,$$
where the field F is F(x, y, z) = (-y/2, x/2, 0) and S is given an upward orientation, and then applying
Stokes's Theorem, either to change the surface of integration or to convert to a line integral.
15. Let f and g be scalar functions of three variables whose second partial derivatives are all continuous.
(a) Prove $(\nabla f) \times (\nabla g) = \operatorname{curl}(f\nabla g)$.
(b) Let f(x, y, z) = x + y + z and $g(x, y, z) = x^2 + y^2 - z^2$. Compute
$$\iint_S ((\nabla f) \times (\nabla g)) \cdot n \, d\sigma,$$
where S is the hemisphere $x^2 + y^2 + z^2 = 1$, $z \ge 0$ and n is the upward directed normal.
[Hint: Think about the change of surface or line integral using Stokes's Theorem.]
16. (a) Find a formula for the function $f: \mathbb{R}^2 \to \mathbb{R}$ such that $\nabla f(x, y) = (\sin y,\; x\cos y)$.
(b) Use your answer to part (a) to compute $\int_\gamma \sin y \, dx + x\cos y \, dy$, where $\gamma$ is any path from
(1, 1) to (1, 2).
17. Let $\gamma$ be the counterclockwise path consisting of the part of the circle $x^2 + y^2 = 1$ lying in the first quadrant together
with the segments $0 \le x \le 1$ and $0 \le y \le 1$ on the x- and y-axes. Use Green's Theorem to compute
$$\oint_\gamma xy \, dx + y \, dy.$$
18. Let $F: \mathbb{R}^3 \to \mathbb{R}^3$ be twice continuously differentiable.
(a) Prove that div(curl F) is identically zero.
(b) Use the result of part (a) to prove that if S is the oriented boundary surface of a region R in $\mathbb{R}^3$, then
the surface integral $\int_S \operatorname{curl} F \cdot dS$ is 0.
Section 6C The Operators V, Vx and v. 459
19. Let S be a part of a plane parametrized by g(u, v) = 23. Suppose F is a continuously differentiable vector field on
(u + v, 2u + v, u - v) for 0 S u S 1 and 0 S v S 1. Let a simply connected region B in JR 3 in which div F = 0.
as denote the border of S with the positive orientation. Let S1 and S2 be smooth surfaces in B with the same
With F(x, y, z) = (y, z, x), use Stokes's Theorem to find border. Explain why the flux of F is the same for both
a surface integral equal to f F-dx. Evaluate the resulting smfaces.
las 24. Let u(x) be an harmonic function, that is,
surface integral.
uxx +uyy +uzz = 0, in a region B of JR 3 having a smooth
20. Let F(x) be a continuously differentiable vector field boundary aB.
on a region B, including its piecewise smooth boundary (a) Prove that f l"vuj 2 dV = f u au du, where n is
surface aB. Suppose that at each point x of aB the vector 1B laB an
the outward-pointing unit normal vector to aB.
F(x) is tangent to aB. Explain why f divFdV = 0. (b) Assume that in addition to being harmonic u is
laB
homogeneous of degree m (i.e., u(tx) = tmu(x)
21, Let B be a region in llf' with a piecewise smooth boundary for t > 0). Prove that if aB is a sphere of
aB.
1 radius a centered at the origin, then l Ivu 12 d V =
(a)
(b)
Prove that V(B) =½
aB
x • dS.
Prove that V(B) is also equal to each of
~
a aB
1 u 2 du. [Hint: Prove that x • Vu(x) = mu(x).J
the following three (equal) surface integrals: 25. Use Gauss's Theorem and the definition of centroid to

1aB
x dy dz,
laB
f y dzdx, f
laB
zdx dy. prove that

22. Define a vector field F(x) = f (x)v, where vis a constant


vector and f is a continuously differentiable function
ls 2
x dy dz+ y2 dzdx + z2 dx dy
JR 3 ~ JR defined on a region B with a piecewise smooth = irra 3(xo +yo+ zo),
boundary surface aB.
(a) 7.plyGauss's Theorem to prove that where S is the sphere of radius a centered at (xo, Yo, zo).

B
V• Vf(x)dV =1 aB
f(x)v•dS. 26. Let f and g be twice continuously differentiable real-
valued functions defined on IR 3 , and let A denote the
(b) Use part (a) to establish an equation for vector-
valued integrals: 1 B
Vf(x) dV ={
laB
f(x) dS.
Laplace operator Au = Uxx + Uyy + Uzz·
(a) Prove that A(fg) = f Ag+ gAJ + 2Vf • Vg.
(b) Prove that !l.(fg) = f Ag+gAf if the level surfaces
(c) What conclusion can you draw from part (b) if f is
identically zero on aB? of f and g intersect only at right angles.
CHAPTER 10

FIRST-ORDER DIFFERENTIAL
EQUATIONS

This chapter is a brief introduction to differential equations, and it makes no
assumption about prior knowledge of the subject. However, the use of derivatives in
studying functions of a single variable is assumed to be familiar from elementary calculus.
One of Newton's many great discoveries was that while we may lack precise
information about the values y(x) of a function or its derivatives y'(x) and y"(x), we can
sometimes find an equation, called a differential equation, relating the function and
its derivatives. For example,

y'(x) + y(x) = x and y"(x) + y'(x) = 0


are differential equations, usually abbreviated

y' +y =x and y" + y' = 0.


A solution of a differential equation on an interval a < x < b is a function
y(x) which, when substituted along with its relevant derivatives into the
differential equation, satisfies the equation for all x in the interval. For example, it's
a routine check that

y(x) = x - 1 and y(x) = e^{-x}


are respective solutions of the preceding two equations. As with algebraic equations,
we'll want to find all solutions of a differential equation. We'll also consider the
geometric interpretation of an equation and its solutions and the derivation of the
equation from scientific principles. This chapter takes up these matters for
first-order equations, that is, differential equations in which the derivatives that occur are
of first order.

SECTION 1 DIRECTION FIELDS


We can interpret a differential equation of the form

y' = F(x, y)

as assigning a slope y' to a point (x, y). The assignment is usually represented
geometrically by drawing through the point with coordinates (x, y) a line segment
with slope y' = F(x, y). Just such an array of points and segments is shown in
Figure 10.1(b). A collection of points with directions attached is called a direction
field or slope field, and geometrically speaking the assignment of slopes to points is
the essence of the equation y' = F(x, y).
It's important to be clear about the distinction between direction fields and the
vector fields introduced in Chapter 6, Section 1. A picture of a direction field always
contains a 1-dimensional domain axis whose positive direction determines a direction
for the solution curves, while speed is equal to the slope of segments relative to this
axis. In a vector field we need some device such as an arrow point to indicate
direction, and we indicate speed by the length of a segment. A 1-dimensional vector
field has its arrows all on the same line, not a helpful picture, which is one reason
for resorting to direction fields here instead.

1A Plotting Direction Fields
When a thin film of fluid flows steadily over a plane surface, the particles of fluid trace
in the plane paths called flowlines. Figure 10.1(a) illustrates such a flow by showing
some of its flowlines. In practice, we might try to describe the flow by giving even
less information, namely, just some short line segments tangent to the flowlines at a
selection of points. Figure 10.1(b) shows some tangent segments, chosen from among
the tangents to the paths in Figure 10.1(a). Visually it's fairly easy to reconstruct the
significant features of Figure 10.1(a) from Figure 10.1(b); to do this graphically, we
can sketch curves through the selected points, making them appear to be tangent to
the segment through each point. A study of such a reconstruction is the geometric
theme of this chapter.
[Figure 10.1: (a) flowlines of a flow; (b) tangent segments chosen from the flowlines in (a).]
There are two natural ways to produce a sketch of a direction field. One way
is to draw tangents to flowlines. The other is to make the sketch associated with the
first-order differential equation y' = F(x, y) by drawing a short segment with slope
F(x, y) through the point (x, y). These two ways of looking at a direction field blend
together when we solve the differential equation. The reason is that a solution y(x)
satisfies

y'(x) = F(x, y(x));


therefore the graph of y(x) has a slope equal to the slope specified by the differential
equation

y' = F(x, y)

at the point (x, y) = (x, y(x)). In particular, the curves in Figure 10.1(a) are the
graphs of solutions coming from the direction field in Figure 10.1(b).

EXAMPLE 1. Suppose that by physical measurement we decide that the directions in a flow of
particles are determined according to the differential equation

y' = -y/x, for x ≠ 0.

At each point in the xy-plane, except for points on the y-axis, where x = 0, the
equation specifies a numerical slope y'. We can make a table of some sample points
and slopes:

(x, y)       y' = -y/x
(1, 1)       -1
(1, 2)       -2
(2, 1)       -1/2
(-1, 2)      2
(-2, 2)      1
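
As a quick check on the table, the slopes can be recomputed directly from the formula y' = -y/x. The following Python lines are only a sketch; the names `slope` and `sample_points` are introduced here for illustration and are not part of the text.

```python
def slope(x, y):
    """Slope assigned by y' = -y/x to the point (x, y)."""
    if x == 0:
        raise ValueError("the equation assigns no slope on the y-axis (x = 0)")
    return -y / x

# Sample points taken from the table in the text.
sample_points = [(1, 1), (1, 2), (2, 1), (-1, 2), (-2, 2)]
for x, y in sample_points:
    print((x, y), slope(x, y))
```

Drawing a short segment with each computed slope at the corresponding point is exactly the hand-plotting procedure described next.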

By plotting some points and at each drawing a short segment with the specified slope,
we get the picture of the direction field shown in Figure 10.2(a). The shape of the
curves tangent to the segments in Figure 10.2(a) is fairly easy to sketch, and some are
in the figure. Note in particular that the positive and negative x-axes, where y = 0,
are such curves, but that the vertical y-axis is excluded because of the restriction
that x ≠ 0.
We can use calculus in this example to find formulas for the solution curves. We
multiply the given equation by x and rearrange to give

xy' + y = 0.

Treating y as a function of x, the product rule for differentiation shows that our
equation is the same as

(xy)' = 0.

But this means that the product xy must be a constant: xy = c. In other words,

y = c/x.

The graphs of y = c/x, for various choices of c, are just the curves tangent to the
segments of the direction field in Figure 10.2(a). In particular, c = 0 corresponds to
the x-axis except for x = 0, and c = 1 corresponds to the curves sketched in the
first and third quadrants. Finally, we can verify directly that given a constant c,

y = c/x satisfies y' = -y/x, for x ≠ 0.
[Figure 10.2: Direction fields and solution graphs. (a) y' = -y/x; (b) y' = cos x.]

The reason is that y' = -c/x^2, and on the other hand, -y/x = -c/x^2 also. Hence

y' = -c/x^2 = -y/x.

EXAMPLE 2. The special case of the equation y' = F(x, y) in which F is independent of y takes
the form

y' = G(x).

We assume that G is a continuous function on some interval. To solve the equation,
we integrate both sides with respect to x, getting formally,

y = ∫ G(x) dx + C.

Here the indefinite integral stands for any function whose derivative is G. We know
that any two such integrals differ by at most an additive constant C. Each different
constant C gives a graph parallel to the others because the function

F(x, y) = G(x)

is independent of y; that is, direction segments lying on the same vertical line are
all parallel.
For example, the differential equation

y' = cos x

has solutions

y = ∫ cos x dx + C
  = sin x + C.

The direction field generated by

y' = G(x) = cos x

is sketched in Figure 10.2(b) together with the particular solutions y = sin x + 1 and
y = sin x - 1.

The previous example suggests that solving first-order differential equations is
something like finding indefinite integrals. In particular, we should expect solutions of
a first-order differential equation to be distinguished from one another by specifying
an arbitrary constant as in the preceding examples. The usual way to single out a
particular solution is to specify that its graph should pass through some preassigned
point (x_0, y_0). An initial condition for a solution y(x) of a first-order differential
equation requires y(x) to satisfy a condition of the form y(x_0) = y_0. The problem
of satisfying

y' = F(x, y) and y(x_0) = y_0,

a differential equation and an initial condition, is called an initial-value problem.



Sometimes people refer loosely to a solution formula that contains an arbitrary
constant as a "general solution formula" for a first-order differential equation, even
though the formula may not contain all possible solutions as special cases. We'll
avoid the term "general solution" except when we can actually show that the formula
really does contain all solutions.

EXAMPLE 3. We return here to the differential equation

y' = -y/x

of Example 1, with solutions

y = c/x, x ≠ 0, c constant.

We find the particular solution whose graph passes through the point (x_0, y_0) = (1/2, 2)
by substituting these values into the solution formula. The constant c is determined
by the equation

2 = c/(1/2),

so that c = 1. Thus the solution to the initial-value problem

y' = -y/x, y(1/2) = 2

is given by the formula

y = 1/x, x > 0.

The graph of this solution is shown in Figure 10.2(a). To find the solution curve
through an arbitrary preassigned point (x_0, y_0), with x_0 ≠ 0, we make the substitution

y_0 = c/x_0

and find c = x_0 y_0. The solution through (x_0, y_0) is evidently

y = x_0 y_0 / x, x ≠ 0.
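
The substitution c = x_0 y_0 is easy to mechanize. The Python sketch below is an illustration under that formula only; the function name `solution_through` is hypothetical and chosen here for convenience.

```python
def solution_through(x0, y0):
    """Return (c, y), where y(x) = c/x is the solution of y' = -y/x through (x0, y0)."""
    if x0 == 0:
        raise ValueError("x0 must be nonzero")
    c = x0 * y0
    return c, (lambda x: c / x)

# The initial condition y(1/2) = 2 from the example.
c, y = solution_through(0.5, 2)
print(c, y(0.5), y(1.0))
```

The returned function is meaningful only for x with the same sign as x_0, since the solution curve cannot cross the y-axis.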

The preceding examples all suggest that, with minor exceptions, through each
point of a direction field there is a unique solution curve for y' = F(x, y) and that
the solution extends without hindrance wherever the field is defined. The following
examples show that neither of these statements is true in general.

EXAMPLE 4. The differential equation

y' = √y for y ≥ 0,  y' = 0 for y < 0
[Figure 10.3: (a) y' = √y for y ≥ 0, y' = 0 for y < 0; (b) y' = 1 + y^2.]

has two distinct solutions passing through (x_0, y_0) = (0, 0):

y(x) = 0, for -∞ < x < ∞,

and

y(x) = 0 for -∞ < x < 0,  y(x) = x^2/4 for 0 ≤ x < ∞.

The first solution has its graph coinciding with the x-axis for all x, whereas the graph
of the second coincides with the x-axis for x ≤ 0 and then assumes a parabolic shape,
shown in Figure 10.3(a). Thus a solution to the differential equation is not uniquely
determined by the requirement that its graph pass through (0, 0). For still more
solutions of this differential equation see Exercise 6.
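
The failure of uniqueness can be confirmed by substituting both candidate solutions into the equation at a few sample points. The Python sketch below does this with hand-coded derivatives; all function names are assumptions made for this illustration.

```python
import math

def F(y):
    """Right-hand side of the equation: sqrt(y) for y >= 0, and 0 for y < 0."""
    return math.sqrt(y) if y >= 0 else 0.0

def zero_solution(x):
    return 0.0, 0.0              # value y(x) and derivative y'(x)

def parabolic_solution(x):
    if x < 0:
        return 0.0, 0.0
    return x * x / 4, x / 2      # value y(x) and derivative y'(x)

def satisfies_equation(solution, xs, tol=1e-12):
    """True if y'(x) equals F(y(x)) at every sample point x."""
    for x in xs:
        y, dy = solution(x)
        if abs(dy - F(y)) > tol:
            return False
    return True

xs = [-2.0, -1.0, 0.0, 0.5, 1.0, 3.0]
print(satisfies_equation(zero_solution, xs), satisfies_equation(parabolic_solution, xs))
```

Both functions pass through (0, 0) and both satisfy the equation at the sample points, yet they differ for x > 0.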

EXAMPLE 5. The differential equation

y' = 1 + y^2

has the solution y = tan x with its graph passing through (x_0, y_0) = (0, 0). But the
solution tends to infinity discontinuously at x = ±π/2, despite having F(x, y) =
1 + y^2 well-behaved throughout the entire xy-plane. Figure 10.3(b) makes it clear
graphically why the solution can't be carried on continuously outside the interval
-π/2 < x < π/2.
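
A short numerical look at y = tan x shows both that it satisfies the equation and how quickly it grows near π/2. This Python sketch is illustrative only; the sample points are chosen arbitrarily.

```python
import math

def F(x, y):
    """Right-hand side of y' = 1 + y^2 (independent of x)."""
    return 1 + y * y

# The derivative of tan x is 1/cos^2 x, which should agree with 1 + tan^2 x.
for x in [0.0, 1.0, 1.5, 1.57]:
    y = math.tan(x)
    dy = 1 / math.cos(x) ** 2
    print(x, y, math.isclose(dy, F(x, y), rel_tol=1e-9))
```

The printed values of y increase without bound as x approaches π/2 from the left.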

The question of existence and uniqueness of solutions is taken care of for large
classes of differential equations by using the methods of the next three chapters.
We'll state without proof a theorem for the initial-value problem y' = F(x, y),
y(xo) = yo.

1.1 Existence and Uniqueness Theorem. Suppose the function F(x, y) and its
partial derivative F_y(x, y) are both continuous for a < x < b and α < y < β,
and that a < x_0 < b and α < y_0 < β. Then the initial-value problem y' =
F(x, y), y(x_0) = y_0 has a unique solution defined on some subinterval of a < x <
b; if in addition there is a constant B such that |F_y(x, y)| < B for all x in the
interval and for all real numbers y, then this solution will exist on the entire interval
a < x < b.

EXERCISES

By substituting into the given differential equation in Exercises 1 to 6, verify that
the corresponding formula to the right gives one or more solutions to the differential
equation. Then determine the arbitrary constant so that the differentiable function
y(x) satisfies the given initial condition of the form y(a) = b and satisfies the given
differential equation on an interval containing a.

1. y' = y + 1; y = Ce^x - 1, y(0) = 2
2. dy/dx = -x/y; y = √(a^2 - x^2), |x| < a, y(1) = 4
3. y' + y = 0; y = Ke^{-x}, y(5) = 6
4. y' = 1/x, x ≠ 0; y = log|x| + C, y(-1) = 3
5. y' = y^2; y = (C - x)^{-1}, y(3) = 2
6. y' = 1 + y^2; y = tan(x + c), y(1) = 1

For each of the differential equations 7 to 10 of the form y' = F(x, y), sketch the
associated direction field, locating a short segment with slope F(x, y) at enough
points (x, y) so that a geometric pattern begins to appear. Then sketch into the same
picture a solution graph containing the given point (x_0, y_0).

7. y' = X~, (x_0, y_0) = (1, 2)
8. dy/dx = -x/y, (x_0, y_0) = (1, 1)
9. dy/dx = y + x, (x_0, y_0) = (1, -1)
10. y' = x^2, (x_0, y_0) = (1, 0)

Isoclines. An isocline in a direction field is a curve along which the directions of the
field are all the same. Finding the isoclines of a field is helpful in sketching the field
because the direction segments on an isocline are all parallel. For the direction field
determined by a differential equation y' = F(x, y), the isoclines satisfy equations of
the form F(x, y) = m, where m is some constant slope. In each of Exercises 11 to 16,
sketch several isoclines, and then sketch the direction field by drawing parallel
segments crossing the isocline curves F(x, y) = m with slope m.

11. y' = -~/x                    12. y' = x + y
13. y' = x^2 + y^2               14. y' = x^2
15. y' = (1 - y)/x, x ≠ 0        16. y' = y

The differential equations 17 to 22 are of the special form y' = f(x), having isoclines
that are lines parallel to the y-axis. Thus to sketch the direction field you need to
determine only one slope on each such line, making all slope-segments centered on
that line parallel to the first one. Sketch the direction field for each of the following
differential equations and then use the field to sketch in a few solution graphs.

17. y' = x^3                     18. y' = -√(x^2 + 1)
19. y' = 1/(1 + x^4)             20. y' = x^4
21. y' = -√(1 - x^3)             22. y' = x/(1 + x^4)

23. (a) Verify that a differential equation of the form y' = F(x), where F is
continuous on an interval containing x_0, has solutions on that interval of the form
y(x) = y_0 + ∫_{x_0}^{x} F(t) dt.
(b) Prove that the solution in part (a) is uniquely determined by the requirement that
y(x_0) = y_0. [Hint: Suppose there are two solutions, y_1(x) and y_2(x). What is
(y_1 - y_2)'?]

24. (a) For the differential equation y' = √y, y ≥ 0, show that there are infinitely
many different solutions passing through the point (x_0, y_0) = (0, 0). [Hint:
Consider y = (x - a)^2/4 for x ≥ a and y = 0 for x < a.]
(b) Verify that the equation y' = √y for y ≥ 0 has the identically zero solution
y(x) = 0. Explain why the uniqueness part of Theorem 1.1 is not contradicted by this
example.
(c) Prove that there is no value of the number a such that the formula for solutions
through (x_0, y_0) = (0, 0) found in part (a) yields the identically zero solution as a
special case.

25. The differential equation y' = √(1 - y^2) is satisfied by y(x) = sin(x + a) on any
interval on which y'(x) ≥ 0. The differential equation is also satisfied by y(x) = 1
and y(x) = -1 (identically 1 and identically -1). Show that on the interval
-π/2 < x < π/2 there are infinitely many different solutions passing through (0, 1)
and also infinitely many different solutions passing through (0, -1). Explain why the
uniqueness part of Theorem 1.1 is not contradicted by this example.
1B Numerical Methods
Hand-plotting direction fields and solution graphs is good practice for understand-
ing the concepts, but computer graphics programs are much better for producing
accurate pictures, particularly when the geometry is complicated. Figure 10.4 shows
an example that would be difficult to deal with by hand. We can use commercially
available software to sketch direction fields that most people would consider too tire-
some to sketch by hand, for example Maple, Matlab and Mathematica. The Web site
http://math.dartmouth.edu/~rewn contains the Java program DFDEM that will
plot a direction field and allow you to draw solution graphs tangent to the field and
starting at a graphically determined initial point.
A straightforward way to implement a numerical routine for making approxima-
tions to the solution of the initial-value problem

y' = F(x, y), y(xo) = Yo

is to start with equally spaced x values

x_0, x_1 = x_0 + h, x_2 = x_1 + h, ..., x_{m+1} = x_m + h

and use the tangent line approximation

y - y_k = F(x_k, y_k)(x - x_k) at the point (x, y) = (x_k, y_k)

to get an approximate value y_{k+1} at x_{k+1}. Setting x = x_{k+1} in the tangent line
equation and noting that x_{k+1} - x_k = h gives

y_{k+1} - y_k = F(x_k, y_k)(x_{k+1} - x_k), or y_{k+1} = y_k + hF(x_k, y_k).

The value y_{k+1} is called the kth Euler approximation at x_{k+1}, and the entire process
is called Euler's method.
[Figure 10.4]
A computing routine to print approximate values for the solution of the initial-
value problem

y' = xy, y(0) = 1

for x values between 0 and 1, with step size h = 0.01, might look like this:

DEFINE F(X, Y) = X*Y
SET X = 0
SET Y = 1
SET H = 0.01
DO
   SET Y = Y + H * F(X, Y)
   SET X = X + H
   PRINT X, Y
LOOP WHILE X < 1
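
For readers who prefer a current language, here is one way the same routine might be written in Python. It is a sketch, not the text's own program: the function name `euler` and the use of a fixed step count n (instead of the test X < 1) are choices made here, and the comparison value uses the fact that y = e^{x^2/2} solves y' = xy, y(0) = 1, which can be checked by substitution.

```python
import math

def euler(F, x0, y0, h, n):
    """Return lists of x and y values from n Euler steps of size h."""
    xs, ys = [x0], [y0]
    for k in range(n):
        y_next = ys[-1] + h * F(xs[-1], ys[-1])   # y_{k+1} = y_k + h F(x_k, y_k)
        xs.append(x0 + (k + 1) * h)               # avoids accumulating round-off in x
        ys.append(y_next)
    return xs, ys

xs, ys = euler(lambda x, y: x * y, 0.0, 1.0, 0.01, 100)
exact = math.exp(0.5)        # value at x = 1 of y = exp(x^2 / 2)
print(xs[-1], ys[-1], exact - ys[-1])
```

Counting steps rather than testing X < 1 sidesteps the possibility that repeated addition of 0.01 lands slightly below 1 and triggers an extra step.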

To improve accuracy, we think first of increasing m, the number of subdivisions
of the interval from x_0 = 0 to x_m = 1, thus making h smaller. But increasing
the number of subdivisions increases the likelihood of significant round-off error
accumulation in the arithmetic. Rather than carelessly increasing m, we prefer to use
a simple improvement of the method. The improvement produces a smaller error
at each step without increasing the number of steps, so the error increases more
slowly.
The improved Euler method uses a process called prediction-correction. The
method assigns a corrected slope to each approximating segment that is the
average of the Euler slope at x_k and what the predicted Euler slope would have been
at x_{k+1}.
We'll now use y_{k+1} to denote our improved approximate value at step k + 1,
and use p_{k+1} for the corresponding simple Euler prediction, based on a previously
computed value y_k. We follow these steps to approximate the solution to the
initial-value problem y' = F(x, y), y(x_0) = y_0:

1. Compute the slope F(x_k, y_k).
2. Determine a predictor estimate p_{k+1} by p_{k+1} = y_k + hF(x_k, y_k).
3. Compute the average of the two slopes F(x_k, y_k) and a predicted slope
F(x_{k+1}, p_{k+1}) and use it to determine y_{k+1} by

y_{k+1} = y_k + h(F(x_k, y_k) + F(x_{k+1}, p_{k+1}))/2
        = y_k + (1/2)h(F(x_k, y_k) + F(x_{k+1}, p_{k+1})).

The formula for y_{k+1} displayed above comes from writing the equation for the
line through (x_k, y_k) with the average slope and then setting x = x_{k+1} to get the
corresponding value y = y_{k+1}. A computing routine to implement the method for
the initial-value problem y' = xy, y(0) = 1 might look like this, with step size
h = 0.01, printing values for x between 0 and 1:

DEFINE F(X, Y) = X*Y
SET X = 0
SET Y = 1
SET H = 0.01
DO WHILE X < 1
   SET P = Y + H*F(X, Y)
   SET Y = Y + (H/2)*(F(X, Y) + F(X+H, P))
   SET X = X + H
   PRINT X, Y
LOOP
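
The prediction-correction steps can also be sketched in Python. This is an illustration only; the function name `improved_euler` and the fixed step count n are assumptions made here, and the error is measured against y = e^{x^2/2}, which satisfies y' = xy, y(0) = 1.

```python
import math

def improved_euler(F, x0, y0, h, n):
    """Return lists of x and y values from n prediction-correction steps of size h."""
    xs, ys = [x0], [y0]
    for k in range(n):
        x, y = xs[-1], ys[-1]
        slope = F(x, y)                                  # step 1: Euler slope at x_k
        p = y + h * slope                                # step 2: predictor p_{k+1}
        y_next = y + (h / 2) * (slope + F(x + h, p))     # step 3: average the two slopes
        xs.append(x0 + (k + 1) * h)
        ys.append(y_next)
    return xs, ys

xs, ys = improved_euler(lambda x, y: x * y, 0.0, 1.0, 0.01, 100)
print(xs[-1], ys[-1], abs(ys[-1] - math.exp(0.5)))
```

With the same step size h = 0.01, the printed error at x = 1 should be far smaller than the error of the simple Euler method.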

At each stage the value p_k is the prediction and y_k is the correction. If h < 0
the approximations move from larger to smaller x values. For graphic output use
some form of PLOT instead of PRINT. Matlab, Maple, and Mathematica software is
available for doing the following exercises. Also Java applets DFDEM, IORDPLOT,
and IORD are at the Web site http://math.dartmouth.edu/~rewn/.

EXERCISES

In Exercises 1 to 12, make computer-aided direction field sketches for each equation
for -2 ≤ x ≤ 2, -2 ≤ y ≤ 2, and then add a few solution graphs.

1. y' = sin(x - y)               2. y' = e^{sin(x + y)}
3. y' = √(9 - y^3)               4. y' = x^4 - y^3
5. 4y' = sin(x^2 + y^2)          6. y' = (1 + y^4)^{-1}
7. y' = cos(x^2 + y^2)           8. y' = y/(x^2 + 1)
9. y' = (1 + x^4)^{-1}           10. y' = cos(xy)
11. y' = e^{-y^2}                12. y' = (1 + y^2) ~

SECTION 2 APPLICATIONS
One of Newton's many contributions to science was the observation that it's useful
to formulate a differential equation for a physically interesting unknown and then
solve the equation. This apparently simple observation has had a profound influence
on science.
2A Direct Integration
We'll consider first some problems that are reducible to solving differential equations
of the form dy/dx = F(x). If F(x) is continuous on some interval, then all solutions
are

y(x) = ∫ F(x) dx + C or y(x) = G(x) + C,

where C is an arbitrary constant and G' (x) = F (x) on the interval in question.
To satisfy an initial condition y(xo) = yo, solve for the constant C in the equation
y(xo) = G(xo) + C, getting C = y(xo) - G(xo). This routine for solving the initial-
value problem proves the special case of Theorem 1.1 of the previous section in
which F(x, y) = F(x) is a function of x alone. In geometric terms, we see that the
slopes of a continuously varying direction field generated by an equation y' = F(x)
determine a function y(x) on the interval of definition of F(x) whose graph satisfies
two conditions: (i) It passes through a given point (xo, yo) if xo is in the interval.
(ii) It is tangent to a direction segment at each of its points.

EXAMPLE 1. If F(x) = (1 + x^2)^{-1} for all real x, then all solutions of y' = F(x) have the form

y = ∫ F(x) dx + C
  = ∫ dx/(1 + x^2) + C = arctan x + C.

We'll assume that the arctangent function is the principal branch, the branch for
which arctan 0 = 0. To satisfy the initial condition y(1) = π/2 we need

π/2 = arctan 1 + C = π/4 + C,

so we take C = π/4, making the unique solution to the initial-value problem y =
arctan x + π/4.
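
When no antiderivative is available in closed form, the same initial-value problem can be handled by numerical quadrature of y(x) = y_0 + ∫ F(t) dt taken from x_0 to x. The Python sketch below uses the midpoint rule and compares the result with arctan x + π/4; the function name `solve_by_quadrature` and the choice of the midpoint rule are assumptions made for this illustration.

```python
import math

def F(x):
    return 1 / (1 + x * x)

def solve_by_quadrature(F, x0, y0, x, n=1000):
    """Approximate y(x) = y0 + integral of F from x0 to x by the midpoint rule."""
    h = (x - x0) / n
    total = sum(F(x0 + (i + 0.5) * h) for i in range(n))
    return y0 + h * total

x = 2.0
approx = solve_by_quadrature(F, 1.0, math.pi / 2, x)
exact = math.atan(x) + math.pi / 4
print(approx, exact)
```

The two printed numbers should agree to several decimal places, which is a useful check on the constant C = π/4.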

[Figure 10.5: signs of y and v for a projectile, with distance measured (a) down and (b) up from the top of the building.]

EXAMPLE 2. Let g be a constant approximation to the acceleration of gravity near the surface of
the earth. A projectile is fired straight up from the top of a building (y_0 = 0) with
velocity v(0) = 1000 feet per second. If we choose to measure distance up from the
top of the building, as in Figure 10.5(b), then at time t ≥ 0 we have dv/dt = -g
since gravity acts to decrease velocity. Integrating dv/dt = -g with respect to t
using initial condition v(0) = 1000 and the estimate g = 32.2 gives

v = dy/dt = -gt + C_1
  ≈ -32.2t + 1000.

Integrating the velocity with initial condition y(0) = 0 to find y(t) gives

y(t) = -(1/2)gt^2 + v(0)t + C_2
     ≈ -16.1t^2 + 1000t.

The maximum height is reached when v = 0, at t_max ≈ 1000/32.2 ≈ 31 seconds.
The maximum height is y_max ≈ -16.1(t_max)^2 + 1000 t_max ≈ 15,528 feet. Note that
if we had measured y down from the top of the building our original differential
equation would then have been dv/dt = g, and the first initial condition would have
been v(0) = -1000. See Figure 10.5(a).
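
The arithmetic in this example is easy to reproduce. The Python sketch below evaluates the velocity and height formulas with g = 32.2 and v(0) = 1000; the variable and function names are chosen here for illustration.

```python
g = 32.2        # feet per second squared (the text's estimate)
v0 = 1000.0     # initial upward velocity, feet per second

def velocity(t):
    return -g * t + v0

def height(t):
    return -0.5 * g * t * t + v0 * t

t_max = v0 / g            # time at which velocity(t) = 0
y_max = height(t_max)
print(t_max, y_max)
```

Substituting t_max = v(0)/g into the height formula shows that y_max = v(0)^2/(2g), which is another way to obtain the maximum height.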

2B Separation of Variables
We'll now consider more examples where first-order differential equations arise from
geometric or scientific assumptions. The equations are chosen so that we can solve
them by a method called separation of variables, illustrated in the next example.

EXAMPLE 3. It's often observed in biological studies that the rate of change dP/dt
of the size P(t) of a bacteria population at time t is very nearly proportional to
P(t). Expressing this proportionality in the form

dP/dt = kP, (1)
where k is a constant, gives a first-order differential equation for P. Because the
derivation of the differential equation depends on assumptions that may not be
precisely true, we can expect a solution P(t) to be at best an approximation to the
true situation. It's our purpose here to study this approximation. Experience with the
exponential function allows us to guess one solution. If we let

P(t) = Ke^{kt},

we see that P'(t) = kP(t) for all real numbers t. In other words, P = Ke^{kt} is
a solution. If we had not been able to guess a solution, or if we wanted to try to
find still other solutions, we would have proceeded as follows. Assuming that the
population size is always positive, we can divide Equation (1) by P, getting

(1/P) dP/dt = k.

Next we integrate both sides of the equation with respect to t:

∫ (1/P)(dP/dt) dt = ∫ k dt.

The integral on the left is ln P; that on the right is kt. Both integrals are determined
only to within an additive constant. Hence we can lump the constants together and
write

ln P = kt + c,

where c is the constant of integration, as yet undetermined. Taking the exponential of
both sides and recalling that exp(ln P) = P, we have

P(t) = e^{kt + c}.

Now set the positive constant e^c = P_0, so

P(t) = P_0 e^{kt}.

Figure 10.6 shows the graph of P for, rather arbitrarily, k = 2 and various choices
of P_0. The constant P_0 is usually determined by observing that, for t = 0, we have
P_0 = P(0), which is the size of the population at t = 0. If instead of P(0) we
happen to know P(t_1) for some t_1 > 0, then the equation

P(t_1) = P_0 e^{kt_1}

leads to

P_0 = P(t_1)e^{-kt_1}.

Hence

P(t) = P(t_1)e^{k(t - t_1)}

for all t > 0.
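
The formula in terms of a later measurement P(t_1) can be checked against the form P_0 e^{kt}. The Python sketch below does so; the numerical values of k, t_1, and P(t_1) are hypothetical and serve only as an illustration.

```python
import math

def population(t, t1, P1, k):
    """P(t) = P(t1) * exp(k (t - t1)), the solution of dP/dt = kP with P(t1) = P1."""
    return P1 * math.exp(k * (t - t1))

k, t1, P1 = 2.0, 1.5, 400.0       # hypothetical values chosen for illustration
P0 = P1 * math.exp(-k * t1)       # implied population at t = 0
print(P0, population(0.0, t1, P1, k), population(3.0, t1, P1, k))
```

The first two printed values agree, confirming that both formulas describe the same solution.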

A condition that requires a solution P(t) to satisfy an equation of the form P(t_0) =
P_0 is called an initial condition. The term comes from the interpretation of t_0 in
applications as a starting time for an evolving process.
[Figure 10.6: graphs of P(t) = P_0 e^{2t} for P_0 = 2, 1, 0.5, and 0.1.]
We could solve the differential equation in the previous example because we could
rewrite it as an equation between two functions, for each of which we could find an
indefinite integral. The typical equation of this type looks like

g(y) dy/dx = f(x), (2)

though this form isn't possible for all first-order differential equations. (See
Exercise 9.) By assuming that y is some differentiable function of x, we can try
to find an indefinite integral with respect to x for each side, and so write

∫ g(y) (dy/dx) dx = ∫ f(x) dx. (3)

If there is an indefinite integral G of g, such that G'(y) = g(y), then we have, by
the chain rule, dG(y)/dx = g(y) dy/dx. If we can also find an indefinite integral
F such that F'(x) = f(x), then we can integrate Equation (3) to get

G(y) = F(x) + C, (4)

relating y and x. There still remains the problem of solving this last equation for y
in terms of x.
The process outlined is usually called separation of variables because it involves
getting the x' s on one side of the equation and the y' s on the other. The whole matter
becomes simpler notationally if we cancel the dx's on the left side of Equation (3).
The resulting formal equation

∫ g(y) dy = ∫ f(x) dx (5)

still leads to Equation (4) for the solution. The original Equation (2) is sometimes
written in the symmetric form

g(y) dy = f(x) dx,

which can be interpreted as either

g(y) dy/dx = f(x) or g(y) = f(x) dx/dy.

Analogously, we can try to find y as a function of x, or x as a function of y, from
Equation (4), whichever suits our purpose better.

EXAMPLE 4. The differential equation

dy/dx = y/x (6)
has associated with it the direction field with slope y/x at the point (x, y). Since the
slope y/x is just the same as the slope of the line from (0, 0) to (x, y), the direction
field looks like the sketch shown in Figure 10.7. It appears that the solution curves
are radial lines from the origin. To prove this, we write the equation in either of the
two forms

(1/y) dy/dx = 1/x or dy/y = dx/x,

assuming both y ≠ 0 and x ≠ 0. Integration of both sides of the first equation with
respect to x gives

∫ (1/y)(dy/dx) dx = ∫ (1/x) dx,

or, formally,

∫ dy/y = ∫ dx/x.

In either formulation, we find

ln|y| = ln|x| + c.

Taking the exponential of both sides gives

|y| = e^{ln|x| + c}

or

|y| = e^c |x|.

Removing the absolute value symbols, we get

y = ±e^c x.

Since e^c is always positive, and since y = 0 is a solution of Equation (6), a solution
formula for Equation (6) is

y = kx,

where k is any real number. In other words, the graphs of the solutions are lines
through the origin.
[Figure 10.7: direction field for dy/dx = y/x.]

EXAMPLE 5. Suppose that a tank containing a chemical in solution is divided into two
compartments by a porous membrane. Suppose that the chemical in one compartment is
maintained at a fixed concentration C (e.g., in grams per liter) and let u(t) be the
concentration of the chemical in the other compartment at time t. It may sometimes
be determined experimentally that diffusion takes place across the dividing
membrane in such a way that the rate of change of the concentration u(t) is proportional
to the difference in concentrations.

Then

-du = k(C - u),


dt
where k is the constant of proportionality. To solve the differential equation, we
write it in the form
du
- - =kdt.
C-u
Carrying out the integration gives

- In jC - uj =kt+ c.

We can now take exponentials of both sides to get

jC - ul = e-ce-kt_

Removing absolute values gives

C- u = Ke-kt,
where K is now any nonzero constant. Finally,

u(t) =C- Ke -kr_

To determine the constant K (remember that C is given at the start), we could, for
example, measure u(0). Setting t = 0 in the preceding solution formula then gives

\[ u(0) = C - K \quad\text{or}\quad K = C - u(0). \]

Hence

\[ u(t) = C - (C - u(0))e^{-kt}. \]

The constant k also could be determined experimentally by measuring $u(t_1)$ for some
$t_1 > 0$ (see Exercise 19). The shape of the graph of u(t) is shown in Figure 10.8
for a single arbitrary choice of C and k and for values of u(0) that are relatively
larger and smaller than C. Indeed, the original differential equation u' = k(C - u),
with k > 0, shows that whenever C > u(t), then u'(t) > 0, so that u is increasing.
Similarly whenever C < u(t), we must have u'(t) < 0, so that u is decreasing. We
assumed above that $u \neq C$; if u(0) = C then u(t) = C is a solution, so u is constant.

FIGURE 10.8
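
The remark about finding k from a later measurement can be made concrete with a short Python sketch. It evaluates the solution formula u(t) = C - (C - u(0))e^{-kt} and recovers k from one additional reading u(t1). The function names and the sample data (C = 10, u(0) = 2, u(3) = 6) are hypothetical choices made for illustration; they do not come from the text.

```python
import math

def concentration(t, C, u0, k):
    """Closed-form solution u(t) = C - (C - u0) * exp(-k t) of u' = k (C - u)."""
    return C - (C - u0) * math.exp(-k * t)

def rate_constant(C, u0, t1, u1):
    """Recover k from one later measurement u(t1) = u1 (needs u0 != C and t1 > 0)."""
    ratio = (C - u1) / (C - u0)   # equals exp(-k * t1)
    return -math.log(ratio) / t1

# Hypothetical data: fixed concentration C = 10, u(0) = 2, and u(3) = 6.
C, u0 = 10.0, 2.0
k = rate_constant(C, u0, t1=3.0, u1=6.0)
print(k)                              # equals ln(2) / 3 for this data
print(concentration(3.0, C, u0, k))   # reproduces the measurement u(3) = 6
```

Solving C - u(t1) = (C - u(0))e^{-k t1} for k is all that `rate_constant` does, so it requires u(0) different from C, in line with the assumption made in the example.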

EXAMPLE 6  Chemical solutions in a tank, for example solutions of salt in water, are often subject
to inflow and outflow of a particular chemical at different rates. If S = S(t) is the
amount of chemical in the tank at time t, then

\[ \frac{dS}{dt} = (\text{rate of inflow}) - (\text{rate of outflow}). \tag{7} \]
As an example, suppose that a full 100-gallon tank contains 150 pounds of salt in
solution at time t = 0, that a salt solution with a concentration of 2 pounds per
gallon is being added at a rate of 2 gallons per minute, and that thoroughly mixed
salt solution is flowing out of the tank at a rate of 2 gallons per minute. Thus salt is
flowing in at a constant rate of 4 pounds per minute and is overflowing at a rate of
2S(t)/100 pounds per minute at time t. The differential equation

\[ \frac{dS}{dt} = 4 - \frac{2S}{100} \]

then expresses the general relation of Equation (7). To solve the equation for S(t),
we write it as

\[ \frac{dS}{S - 200} = -\frac{2\,dt}{100}. \]

Integration gives

\[ \ln|S - 200| = -\frac{t}{50} + c \]

or

\[ |S - 200| = e^{c}e^{-t/50}. \]

Removing the absolute value, we get

\[ S(t) = 200 \pm e^{c}e^{-t/50}. \]

But S(0) = 150 by assumption, so $\pm e^{c}$ must be equal to -50. The amount of salt
at time t is then

\[ S(t) = 200 - 50e^{-t/50}, \]

and the graph of S is shown in Figure 10.9.
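
As a numerical cross-check of the formula S(t) = 200 - 50e^{-t/50}, the following Python sketch integrates dS/dt = 4 - 2S/100 from S(0) = 150 with a classical fourth-order Runge-Kutta step and compares the result with the closed form. The step count and the comparison time t = 60 are arbitrary choices made for illustration; the Runge-Kutta method is not part of the text's solution method.

```python
import math

def rk4(f, t0, y0, t_end, steps):
    """Classical fourth-order Runge-Kutta approximation of y(t_end) for y' = f(t, y)."""
    h = (t_end - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

def salt_rate(t, S):
    # Inflow of 4 pounds per minute minus outflow of 2 S / 100 pounds per minute.
    return 4.0 - 2.0 * S / 100.0

def salt_exact(t):
    # Solution formula derived in the example.
    return 200.0 - 50.0 * math.exp(-t / 50.0)

numeric = rk4(salt_rate, 0.0, 150.0, 60.0, 600)
print(numeric, salt_exact(60.0))   # the two values should agree closely
```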


If instead of salt solution pouring in at a steady rate, the rate is allowed to vary
with time, then the method is the same. Suppose, for example, that salt solution
at a concentration of 2 pounds per gallon is poured into the same tank at a rate
r(t) = 2 - t/2 for the first four hours and then no more solution is introduced, so
there's no overflow after t = 4. The graph of r is shown in Figure 10.10(a). The
differential equation for S is now dS/dt = 0 when t > 4, and when $0 \le t \le 4$,

\[ \frac{dS}{dt} = 2\left(2 - \frac{t}{2}\right) - \left(2 - \frac{t}{2}\right)\frac{S}{100}. \]

FIGURE 10.9

Written in the form

\[ \frac{dS}{S - 200} = -\frac{(2 - (t/2))\,dt}{100} \]

the last equation is still easy to solve. We get

\[ \ln|S - 200| = -\frac{1}{100}\left(2t - \frac{t^2}{4}\right) + c \]

or

\[ |S - 200| = e^{c}e^{-(t/50)+(t^2/400)}. \]

Removing the absolute value gives

\[ S(t) = 200 \pm e^{c}e^{-(t/50)+(t^2/400)}, \]

and, as before, the assumption that S(0) = 150 shows that $\pm e^{c} = -50$ is correct.
Thus

\[ S(t) = 200 - 50e^{-(t/50)+(t^2/400)} \]

FIGURE 10.10 (a), (b)
is the solution to the problem for $0 \le t \le 4$. For t > 4, the simple differential
equation

\[ \frac{dS}{dt} = 0 \]

means that there is no further change in the amount of salt. The correct value of the
solution

\[ S(t) = C, \quad 4 < t, \]

comes from setting t = 4 in the formula that holds for $0 \le t \le 4$. We find that

\[ S(4) = 200 - 50e^{-0.04} \approx 152. \]

The graph of S, for both $t \le 4$ and t > 4, is shown in Figure 10.10(b). There is no
single elementary formula to represent the function, so we write

\[ S(t) = \begin{cases} 200 - 50e^{-(t/50)+(t^2/400)}, & 0 \le t \le 4, \\ 200 - 50e^{-0.04}, & 4 < t. \end{cases} \]
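
The piecewise solution can be checked with a few lines of Python. The sketch below codes the inflow rate r(t) and the two-branch formula for S(t), then compares a central-difference estimate of dS/dt with the right-hand side 2r(t) - r(t)S/100 at a few sample times. The function names, the step size h, and the sample times are assumptions made for this illustration.

```python
import math

def inflow_rate(t):
    """Rate r(t) = 2 - t/2 up to t = 4, and zero afterwards."""
    return 2.0 - t / 2.0 if t <= 4.0 else 0.0

def salt(t):
    """Piecewise solution S(t) with S(0) = 150; nothing changes after t = 4."""
    tau = min(t, 4.0)
    return 200.0 - 50.0 * math.exp(-tau / 50.0 + tau ** 2 / 400.0)

def residual(t, h=1e-5):
    """Central-difference dS/dt minus the right-hand side 2 r(t) - r(t) S / 100."""
    dS = (salt(t + h) - salt(t - h)) / (2 * h)
    r = inflow_rate(t)
    return dS - (2.0 * r - r * salt(t) / 100.0)

print(salt(0.0))    # the initial amount of salt
print(salt(4.0))    # the constant value that holds for t > 4
print(max(abs(residual(t)) for t in (0.5, 1.0, 2.0, 3.5, 6.0)))   # close to zero
```

The sample times are kept away from t = 4 because the central difference would straddle the point where the formula changes branches.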

EXAMPLE 7  A satellite moving on a radial line away from a planet is subject to a force of
magnitude

\[ F = -\frac{GMm}{r^2}, \]

where r is the distance between the two bodies, m is the mass of the satellite, M is
the mass of the planet, and G is a constant depending on the units of measurement
(see Figure 10.11). The acceleration of the satellite has magnitude $a = d^2r/dt^2$, so
we also have

\[ F = ma = m\frac{d^2r}{dt^2}. \]
Equating the two expressions for F and canceling m gives the second-order differential
equation

\[ \frac{d^2r}{dt^2} = -\frac{GM}{r^2}. \]

FIGURE 10.11

If we let v = dr/dt denote the radial speed, then by the chain rule,

\[ \frac{d^2r}{dt^2} = \frac{dv}{dt} = \frac{dv}{dr}\frac{dr}{dt} = v\frac{dv}{dr}. \]

The differential equation then becomes

\[ v\frac{dv}{dr} = -\frac{GM}{r^2}. \]

Integration of both sides with respect to r gives

\[ \int_{v_0}^{v} v\,dv = -GM\int_{r_0}^{r} \frac{dr}{r^2} \]

or

\[ \frac{v^2}{2} - \frac{v_0^2}{2} = \frac{GM}{r} - \frac{GM}{r_0}, \]

where $v_0$ is the radial speed at distance $r_0$ from the planet. This relation between
speed and distance enables us to determine the escape speed of the satellite, namely
the speed $v_0$ that must be attained at distance $r_0$ so that the speed v always remains
positive thereafter. We must have

\[ v^2 = v_0^2 + \frac{2GM}{r} - \frac{2GM}{r_0} > 0. \]

Since $GM/r \to 0$ as $r \to \infty$, the only way the inequality can hold is to have

\[ v_0^2 - \frac{2GM}{r_0} > 0. \]

The critical escape speed that must be exceeded at distance $r_0$ is thus

\[ v_0 = \sqrt{\frac{2GM}{r_0}}. \]
The analysis presented here ignores the possibility that the planet moves under the
influence of the satellite; practically speaking this is fine if the satellite has negligible
mass compared with the planet's mass. Also, considerations involving kinetic and
potential energy show that a formula of this type for escape speed is correct even
if the satellite is on a nonradial path. We treat both of these issues in Chapter 12,
Section 3, where we allow for two bodies having commensurate masses as would
be the case in a double star system.
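
To get a feeling for the size of the critical speed, here is a small Python sketch that evaluates v0 = sqrt(2GM/r0) and the relation v^2 = v0^2 + 2GM/r - 2GM/r0. The numerical values for the earth are standard SI figures supplied as assumptions for this illustration; they are not taken from the text, which works in miles and feet in the exercises.

```python
import math

def escape_speed(GM, r0):
    """Critical speed v0 = sqrt(2 GM / r0) at distance r0 from the planet's center."""
    return math.sqrt(2.0 * GM / r0)

def speed_squared(r, r0, v0, GM):
    """v^2 at distance r from the relation v^2 = v0^2 + 2GM/r - 2GM/r0."""
    return v0 ** 2 + 2.0 * GM / r - 2.0 * GM / r0

# Assumed SI values for the earth (not from the text).
GM_EARTH = 3.986e14   # cubic meters per second squared
R_EARTH = 6.371e6     # meters

v_esc = escape_speed(GM_EARTH, R_EARTH)
print(v_esc)   # roughly 11.2 kilometers per second

# Slightly above the critical speed, v^2 is still positive very far away.
stays_positive = speed_squared(1e12, R_EARTH, 1.01 * v_esc, GM_EARTH) > 0
# Slightly below it, the formula gives a negative v^2 at that distance,
# which means the satellite stopped and turned back before getting there.
falls_back = speed_squared(1e12, R_EARTH, 0.99 * v_esc, GM_EARTH) < 0
print(stays_positive, falls_back)
```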

EXERCISES

In Exercises 1 to 10, solve each differential equation by direct integration, and find
the particular solution that satisfies the associated initial condition by determining
one or more constants of integration.

1. y' = x(1 - x), y(0) = 1
2. $ds/dt = (t + 1)^2$, s(1) = 2
3. $y' = x/(1 - x^2)$, y(0) = 1
4. $du/dv = v^2 + 1$, u(-1) = 1
5. y'' = sin x, y(0) = 1, y'(0) = 1
6. y''' = 1, y(0) = y'(0) = y''(0) = 0
7. $dz/dt = te^{t}$, z(0) = 1
8. y' = arctan x, y(0) = 0
9. $d^2x/dt^2 = e^{t}$, x(0) = 1, dx/dt(0) = 0
10. y'''' = x, y(0) = y''(0) = 0, y'(1) = y'''(1) = 1

11. A projectile is fired up from ground level with an initial velocity of 5000 feet per
second. What is the maximum altitude attained, and how long does it take to get there,
assuming g = 32 feet per second per second?

12. A weight is dropped from 5000 feet above ground. How long does it take to reach the
ground, and with what final velocity does it hit? Assume g = 32 ft/sec^2.

13. Suppose the two objects described in Exercises 11 and 12 are released at the same
time and are aimed directly at each other.
(a) How long after release do they meet, and at what height above ground?
(b) What initial velocity should the projectile be given so that the two objects meet
2500 feet above ground?

14. A projectile is fired up from ground level so that its maximum height will be 5000
feet. What is its initial velocity?

15. A weight is thrown down from 5000 feet above ground so as to reach the ground in
10 seconds. What is the velocity of the throw?

16. Suppose the objects described in the previous two exercises are sent on their way at
the same time and are aimed directly at each other.
(a) About how long after release do they meet, and at what height above ground?
(b) What initial velocity should the projectile be given so that the two objects meet
2500 feet above ground?

Find a solution formula for each of the following differential equations, and then find a
particular solution that satisfies the given additional condition. Verify by substitution
that your solution does satisfy the differential equation.

17. dy/dt = 2y, y(0) = 2
18. dy/dt = 2ty, y(0) = 2
19. $y' = x/y^2$, y(1) = 0
20. dy/dx = -x/y, y(1) = 1

21. (a) Suppose that a spherical ball of dry ice evaporates in such a way that the rate
of evaporation dV/dt is always proportional to the radius r of the ball. Use
$V = \frac{4}{3}\pi r^3$ to show that the first-order differential equation satisfied by r as a
function of t is of the form

\[ \frac{dr}{dt} = \frac{k}{r}, \]

where k is a negative constant.
(b) Solve the differential equation in part (a), and use the observed measurements that
at time t = 0 the radius of the ball is 1 inch, whereas 1 hour later the radius is 1/2
inch, to determine a particular solution as well as the constant k.
(c) How long does it take the ball to evaporate completely starting with a radius of
1 inch?

22. Psychological studies of stimulus and response often attempt to treat these as
numerical variables s and r related by an equation of the form r = f(s). It's sometimes
hypothesized that f satisfies a differential equation of the form

\[ \frac{dr}{ds} = k\frac{r^n}{s}, \quad\text{with } k > 0. \]

Which of the two hypotheses on the exponent n, n = 0 or n = 1, is consistent with the
following table of experimental values?

   r      s
   0.5    1
   1      2
   3      6
23. (a) Suppose that a 100-gallon tank containing 150 pounds of salt in solution at
t = 0 has pure water added at a rate of 2 gallons per minute and that the resulting
mixture is drawn off at a rate of 2 gallons per minute also. Find a differential equation
satisfied by S(t), the amount of salt in the tank at time t, and solve the equation to
find S(t). Is S increasing or decreasing? What is $\lim_{t\to\infty} S(t)$?
(b) Suppose that the process described in part (a) is modified as follows: (i) each
2 gallons flowing in per minute contains 1 pound of salt; (ii) only 1 gallon of solution
is drawn off per minute; (iii) 1 gallon of water per minute is boiled away as steam.
Answer the same questions as in part (a).

24. Assume that a membrane separating a vat into two compartments has a porosity that
is variable with time, so that the equation

\[ \frac{du}{dt} = k(t)(C - u) \]

is satisfied by u(t), the concentration of some chemical in one of the compartments.
Suppose that by measuring u(t) we find that

\[ u(t) = C(1 - e^{-t^2}). \]

What is the corresponding porosity factor k(t)?

25. If the solution u found in Example 5 of the text has the form

\[ u(t) = 10 - 5e^{-kt}, \]

and u(2) = 5, what is the constant k?

26. Show that the differential equation

\[ \frac{dy}{dx} = y + x \]

cannot be written in the form

\[ g(y)\frac{dy}{dx} = f(x), \]

and therefore cannot be solved by separating variables.

27. (a) Sketch the direction fields for the two differential equations

\[ \frac{dy}{dx} = \frac{y}{x} \quad\text{and}\quad \frac{dy}{dx} = -\frac{x}{y} \]

and show that at points (x, y), for which both are defined, the direction fields are
perpendicular.
(b) What are the solution curves of the two differential equations in part (a)?

28. A function F of (x, y) is called homogeneous of degree n if
$F(tx, ty) = t^n F(x, y)$ for all x, y, and t.
(a) Show that if F is homogeneous of degree zero, then the substitution y = xu
transforms the differential equation

\[ \frac{dy}{dx} = F(x, y) \]

into

\[ \frac{du}{dx} = \frac{F(1, u) - u}{x}, \]

so that this equation is of the form that's solvable by separation of variables.
(b) Show that $F(x, y) = (x^2 + y^2)/2xy$ is homogeneous, and use the substitution of
part (a) to change the equation

\[ \frac{dy}{dx} = \frac{x^2 + y^2}{2xy} \]

into an equation of the form

\[ \frac{du}{dx} = G(x, u). \]

(c) Solve the last differential equation of part (b), and replace u by y/x in the
resulting solution. Then check to see that you have found a solution to the equation
$y' = (x^2 + y^2)/2xy$.

29. (a) Referring to Example 7 of the text, suppose that the planet is the earth, having
a radius of 4000 miles, and that the acceleration of gravity at the surface of the earth
is -0.006 miles per second per second (-32.2 feet per second per second, approximately).
Find the constant GM.
(b) Find the initial velocity that a projectile fired from the surface of the earth would
need in order not to fall back to the surface of the earth.
(c) Find the velocity that a projectile would need 1000 miles above the surface of the
earth in order not to fall back to the surface of the earth.

30. Let g be the acceleration of gravity near the surface of the earth. ($g \approx 32.2$ feet
per second per second.) By Newton's law an object falling with negligible air resistance
has acceleration

\[ \frac{dv}{dt} = g, \]

where v = v(t) is the velocity of the object at time t. Use integration to derive the
following relations.
(a) $v(t) = gt + v_0$, where $v_0$ is the velocity at time 0.
(b) $s(t) = \frac{1}{2}gt^2 + v_0 t + s_0$, where s = s(t) is the distance at time t of the object
from the reference point s = 0.

31. An object dropped near the earth's surface falls distance $s(t) = \frac{1}{2}gt^2$ in time t.
In particular, $s_0 = s(0) = 0$ and $v_0 = s'(0) = 0$.
(a) Show that s = s(t) satisfies the first-order differential equation

\[ \frac{ds}{dt} = \sqrt{2gs}. \]

(b) Show that the differential equation in part (a) has each member of the
one-parameter family

\[ s = \left(\sqrt{\frac{g}{2}}\,t + c\right)^2 = \frac{1}{2}gt^2 + \sqrt{2g}\,ct + c^2 \]

as a solution.
(c) Show that the solution in part (b) satisfies $v_0^2 = 2gs_0$.

32. The general solution to the falling object problem treated in the two previous
problems is

\[ s = \frac{1}{2}gt^2 + v_0 t + s_0, \]

where $v_0$ is initial velocity and $s_0$ is initial displacement.
(a) Show that s = s(t) satisfies the first-order differential equation

\[ \frac{ds}{dt} = \sqrt{v_0^2 + 2g(s - s_0)}. \]

[Hint: Solve for t in terms of both s and ds/dt.]
(b) Show that the expression $v_0^2 + 2g(s - s_0)$ under the radical is always nonnegative,
given our assumptions on s. (Hint: When does that expression reach its minimum as a
function of t?)

33. Flow of liquid from a tank. A cylindrical tank with cross-sectional area A has an
outlet hole in its side near the bottom. If h = h(t) is the height of an ideal fluid above
the outlet at time t, and a is the area of the outlet hole, then V(t), the remaining
fluid volume at time t, satisfies Torricelli's equation

\[ \frac{dV}{dt} = -a\sqrt{2gh}. \]

An intuitive justification for the equation is to note that it depends on having the
outlet velocity equal to the free-fall velocity of a drop of fluid from height h, as
derived in the previous exercise; thus -dV/dt equals area a times outlet velocity
$\sqrt{2gh}$. (A thoroughly scientific justification depends on principles of fluid
mechanics.) Thus for an ideal fluid, the equation takes the form

\[ \frac{dh}{dt} = -\frac{a}{A}\sqrt{2gh}. \]

(a) Show that the Torricelli equation has a solution of the form $h(t) = (bt + c)^2$.
Then determine what the constants b and c must be.
(b) Use your answer to part (a) to find out how long it would take for the fluid height
above the outlet to drop from $h_0$ to 0. In particular, estimate how long it would take
to empty a full cylindrical tank with diameter 10 feet, height 20 feet, and circular
outlet at the bottom with diameter 6 inches.

34. The first-order nonlinear equation $dy/dx = e^{-x^2 - y}$ can in principle be solved by
using separation of variables.
(a) Try to find an effective solution formula for the initial-value problem with initial
condition y(0) = -1, and explain the difficulty you encounter.
(b) Make a computer graphics plot of the solution to the initial-value problem in
part (a).

SECTION 3 LINEAR EQUATIONS


The first-order differential equation of the form

y' = F(x, y)
has a particularly important special case, namely the one in which

F(x, y) = -g(x)y + f(x)

for some functions g and f. The resulting differential equation is usually written in
the normalized form

\[ y' + g(x)y = f(x). \tag{1} \]

For reasons explained at the end of the section the equation is called a first-order
linear differential equation.

EXAMPLE 1  If f happens to be identically zero, we can find solutions y to Equation (1) by
assuming $y \neq 0$ and writing

\[ \frac{y'}{y} = -g(x). \]

Integrating with respect to x, we get

\[ \ln|y| = -G(x) + c, \]

where G is an indefinite integral of g and c is a constant. Taking the exponential of
both sides gives

\[ |y| = e^{c}e^{-G(x)}, \]

and removing the absolute value allows us to replace the positive constant $e^{c}$ by an
arbitrary nonzero constant K:

\[ y = Ke^{-G(x)} = Ke^{-\int g(x)\,dx}. \]

3A Exponential Integrating Factors


The method of solution used in Example 1 fails if the function f in Equation (1) is
not zero; it also has the technical defect that it forces us to assume $y \neq 0$ (conceivably
there are solutions that take on the value zero). We avoid both objections at once
if we use the following method, suggested by the form of the solution found in
Example 1. For the differential equation

\[ y' + g(x)y = f(x), \]

written in normalized form, we define an exponential integrating factor to be

\[ M(x) = e^{\int g(x)\,dx}, \]

where $\int g(x)\,dx$ is an indefinite integral of g. The trick is next to multiply the
differential equation by M to get

\[ e^{\int g(x)\,dx}\,y' + g(x)e^{\int g(x)\,dx}\,y = f(x)e^{\int g(x)\,dx}. \]


The whole point is that the left side can now be written as the derivative of
$e^{\int g(x)\,dx}\,y$, because, by the product rule, applied to the factors $e^{\int g(x)\,dx}$ and y,

\[ \frac{d}{dx}\left(e^{\int g(x)\,dx}\,y\right) = e^{\int g(x)\,dx}\,y' + g(x)e^{\int g(x)\,dx}\,y. \]

Thus we have rewritten the standard linear differential equation in the form

\[ \frac{d}{dx}\left(e^{\int g(x)\,dx}\,y\right) = e^{\int g(x)\,dx}f(x); \]

it remains only to integrate both sides with respect to x and then solve for y. The
integrating factor M(x) is sometimes called an exponential multiplier.

EXAMPLE 2  To find all solutions of the linear differential equation

\[ y' = xy + x, \]

we first rewrite the equation in the standard form

\[ y' - xy = x. \]

The exponential multiplier is then found by identifying the coefficient function
g(x) = -x and computing

\[ M(x) = e^{\int g(x)\,dx} = e^{-\int x\,dx} = e^{-(1/2)x^2}. \]

Multiplying the differential equation by M gives

\[ e^{-(1/2)x^2}y' - xe^{-(1/2)x^2}y = xe^{-(1/2)x^2}. \]

But we know from the preceding discussion, or we could verify directly, that this
last equation is the same as

\[ \frac{d}{dx}\left(e^{-(1/2)x^2}y\right) = xe^{-(1/2)x^2}. \]

Integrating both sides with respect to x gives

\[ e^{-(1/2)x^2}y = \int xe^{-(1/2)x^2}\,dx + C = -e^{-(1/2)x^2} + C. \]
Then multiplying by $e^{(1/2)x^2}$ gives

\[ y = -1 + Ce^{(1/2)x^2} \]

for the solution. Figure 10.12 shows the graph of the particular solution satisfying
y(O) = 0.

Two points should be emphasized about applying the exponential multiplier
method:

1. The linear differential equation must be in standard form

y' + g(x)y = f(x)

before identification of the coefficient function g for the purpose of computing
$M(x) = e^{\int g(x)\,dx}$.
2. The differential equation

y' + g(x)y = f (x)


and its multiplied form

M(x)y' + M(x)g(x)y = M(x)f(x)


are completely equivalent to one another in the sense that any solution of one
equation is also a solution of the other. The reason is that the multiplier M,
being an exponential function, is never equal to zero, so we can multiply and
divide by it as we please.
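
The exponential multiplier method can also be carried out numerically, which is useful when the integrals involved have no elementary formula. The Python sketch below is one possible implementation under stated assumptions: it normalizes M so that M(x0) = 1, approximates both integrals with the trapezoid rule, and is tested on the equation y' - xy = x of Example 2 with y(0) = 0. The function name `solve_linear`, the grid size, and the evaluation point x = 1.5 are choices made for this illustration.

```python
import math

def solve_linear(g, f, x0, y0, x_end, n=2000):
    """Approximate y(x_end) for y' + g(x) y = f(x) with y(x0) = y0.

    With M(x) = exp(integral of g from x0 to x) we have M(x0) = 1 and
    M(x) y(x) = y0 + integral of M f from x0 to x. Both integrals are
    accumulated with the trapezoid rule on n equal subintervals.
    """
    h = (x_end - x0) / n
    G = 0.0              # running integral of g
    I = 0.0              # running integral of M * f
    x = x0
    Mf_prev = f(x0)      # M(x0) * f(x0), since M(x0) = 1
    for _ in range(n):
        x_next = x + h
        G += h * (g(x) + g(x_next)) / 2
        Mf_next = math.exp(G) * f(x_next)
        I += h * (Mf_prev + Mf_next) / 2
        Mf_prev = Mf_next
        x = x_next
    return math.exp(-G) * (y0 + I)

# Example 2 in standard form: y' - x y = x, with y(0) = 0, so C = 1 in the solution formula.
approx = solve_linear(lambda x: -x, lambda x: x, 0.0, 0.0, 1.5)
exact = -1.0 + math.exp(0.5 * 1.5 ** 2)
print(approx, exact)   # the two values should agree to several decimal places
```

Note that g must be read off from the standard form, as point 1 above requires: here g(x) = -x, not x.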

3B Applications

EXAMPLE 3  Suppose that a 100-gallon vat contains 10 pounds of a certain chemical dissolved in
water and that a solution of the same chemical is being run into the vat at a rate of
3 gallons per minute. The solution being run into the vat has a concentration that
increases slowly with time according to the formula

\[ C(t) = 1 - e^{-t/100}. \]

The solution is kept thoroughly mixed and the excess is drawn off, also at a rate of
3 gallons per minute. Let S(t) stand for the amount of chemical in the tank at time
$t \ge 0$. We have rate of inflow minus rate of outflow equal to

\[ \frac{dS}{dt}(t) = 3C(t) - 3\,\frac{S(t)}{100} = 3(1 - e^{-t/100}) - \frac{3}{100}S(t). \]

The resulting first-order equation can't be solved by separation of variables unless
C(t) were to be replaced by a constant. But the equation is linear in any case, and
in standard form is

\[ \frac{dS}{dt} + \frac{3}{100}S = 3(1 - e^{-t/100}). \]

An exponential multiplier is given by

\[ M(t) = e^{\int (3/100)\,dt} = e^{3t/100}. \]

Multiplying the equation by M puts it in the form

\[ \frac{d}{dt}\left(e^{3t/100}S\right) = 3(e^{3t/100} - e^{2t/100}), \]

and integration with respect to t gives

\[ e^{3t/100}S(t) = \int 3(e^{3t/100} - e^{2t/100})\,dt + K = 100e^{3t/100} - 150e^{2t/100} + K. \]

Then multiplication by $e^{-3t/100}$ gives

\[ S(t) = 100 - 150e^{-t/100} + Ke^{-3t/100}. \]

To determine the constant K we recall that the vat initially contains 10 pounds of
the chemical, so that S(0) = 10. Then setting t = 0 in the formula for S(t) gives

\[ 10 = -50 + K, \quad\text{or}\quad K = 60. \]

Thus the desired particular solution is

\[ S(t) = 100 - 150e^{-t/100} + 60e^{-3t/100}. \]

Notice that

\[ \lim_{t\to\infty} C(t) = 1, \]

so that the concentration of the solution being added approaches 1 pound per gallon.
From this information we could conclude on physical grounds that the total amount
of chemical in the 100-gallon tank should approach 100 pounds; indeed the formula for
S(t) shows that

\[ \lim_{t\to\infty} S(t) = 100. \]
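
A quick way to confirm the particular solution is to substitute it back into the differential equation numerically. The Python sketch below does this with a central-difference derivative and also checks the initial value and the limiting behavior. The step size h and the sample times are arbitrary choices for this illustration.

```python
import math

def S(t):
    """Particular solution found in the example, with S(0) = 10."""
    return 100.0 - 150.0 * math.exp(-t / 100.0) + 60.0 * math.exp(-3.0 * t / 100.0)

def rhs(t, s):
    """Inflow 3 C(t) minus outflow 3 s / 100, where C(t) = 1 - exp(-t / 100)."""
    return 3.0 * (1.0 - math.exp(-t / 100.0)) - 3.0 * s / 100.0

def residual(t, h=1e-4):
    """Central-difference dS/dt minus the right-hand side of the equation."""
    return (S(t + h) - S(t - h)) / (2 * h) - rhs(t, S(t))

print(S(0.0))                                              # the initial 10 pounds
print(max(abs(residual(t)) for t in (1.0, 50.0, 200.0)))   # close to zero
print(S(2000.0))                                           # near the limiting value 100
```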

EXAMPLE 4  Let f(t) be the concentration of a chemical solution on one side of a porous
membrane, and let u(t) be the concentration on the other side. Suppose that diffusion
takes place through the membrane in such a way that

\[ \frac{du}{dt} = 2(f(t) - u), \]
that is, so that the rate of change of u is proportional to the difference in
concentrations. If u(0) = 3, and f is maintained so that

\[ f(t) = \begin{cases} 4, & 0 \le t < 10, \\ 1, & 10 \le t, \end{cases} \]

then we can most easily solve the equation by writing it as

\[ \frac{du}{dt} + 2u = 2f(t). \]

An exponential multiplier M is given by

\[ M(t) = e^{\int 2\,dt} = e^{2t}. \]

Hence the differential equation is

\[ \frac{d}{dt}\left(e^{2t}u\right) = 2e^{2t}f(t). \]

Integration of both sides from t = 0 to t = s gives

\[ \int_0^s \frac{d}{dt}\left(e^{2t}u(t)\right)dt = \int_0^s 2e^{2t}f(t)\,dt \]

or

\[ e^{2s}u(s) - u(0) = \int_0^s 2e^{2t}f(t)\,dt. \]

Then

\[ u(s) = u(0)e^{-2s} + 2e^{-2s}\int_0^s e^{2t}f(t)\,dt. \tag{2} \]

Using the integral with limits is convenient here because we can write, according to
the definition of f,

\[ \int_0^s e^{2t}f(t)\,dt = \begin{cases} \int_0^s 4e^{2t}\,dt, & 0 \le s \le 10, \\ \int_0^{10} 4e^{2t}\,dt + \int_{10}^s e^{2t}\,dt, & 10 \le s, \end{cases} \]

\[ = \begin{cases} 2(e^{2s} - 1), & 0 \le s \le 10, \\ 2(e^{20} - 1) + \frac{1}{2}(e^{2s} - e^{20}), & 10 \le s, \end{cases} \]

\[ = \begin{cases} 2(e^{2s} - 1), & 0 \le s \le 10, \\ \frac{3}{2}e^{20} - 2 + \frac{1}{2}e^{2s}, & 10 \le s. \end{cases} \]

Then returning to Equation (2),

\[ u(s) = 3e^{-2s} + \begin{cases} 4 - 4e^{-2s}, & 0 \le s \le 10, \\ (3e^{20} - 4)e^{-2s} + 1, & 10 \le s, \end{cases} \]

\[ = \begin{cases} 4 - e^{-2s}, & 0 \le s \le 10, \\ (3e^{20} - 1)e^{-2s} + 1, & 10 \le s. \end{cases} \]

Sketching the graph of the solution u is left as an exercise.
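
Before sketching the graph, it can help to confirm numerically that the two branches of the solution fit together and that each satisfies u' = 2(f(t) - u). The Python sketch below assumes the values f = 4 on [0, 10) and f = 1 afterwards, as used in the integrals of the example; the function names, the step size, and the sample points are choices made for this illustration.

```python
import math

def f(t):
    """Maintained concentration: 4 for 0 <= t < 10, then 1."""
    return 4.0 if t < 10.0 else 1.0

def u(s):
    """Closed-form solution of u' = 2 (f(t) - u) with u(0) = 3."""
    if s <= 10.0:
        return 4.0 - math.exp(-2.0 * s)
    return (3.0 * math.exp(20.0) - 1.0) * math.exp(-2.0 * s) + 1.0

def residual(s, h=1e-5):
    """Central-difference u'(s) minus 2 (f(s) - u(s)); keep s away from 10."""
    return (u(s + h) - u(s - h)) / (2 * h) - 2.0 * (f(s) - u(s))

print(u(0.0))                      # the initial value 3
print(u(10.0), u(10.0 + 1e-9))     # the two branches agree at s = 10
print(max(abs(residual(s)) for s in (0.5, 3.0, 10.5, 12.0)))   # close to zero
```

The solution is continuous at s = 10 even though f jumps there; only the slope of u changes abruptly.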

EXAMPLE 5  Newton's law of cooling asserts that the rate of change of the surface
temperature u(t) of an object is proportional to the difference between u(t) and the
temperature f(t) of the surrounding medium. Thus

\[ \frac{du}{dt} = k(f - u), \quad k > 0. \]

The constant k must be positive to be consistent with knowing that if f(t) > u(t)
then du/dt > 0. We can't solve this differential equation by separation of variables
unless f is constant, but in any case the equation is linear, with the form

\[ \frac{du}{dt} + ku = kf(t). \]

One important problem is to figure out how to control the temperature u(t) in some
desired way by choosing f(t) properly. Our solution method leads to

\[ \frac{d}{dt}\left(e^{kt}u\right) = ke^{kt}f(t). \]

Hence

\[ e^{kt}u(t) = k\int e^{kt}f(t)\,dt + C. \]

It's convenient to choose the indefinite integral to have the value 0 when t = 0 so
we can write it as a definite integral. We get

\[ u(t) = ke^{-kt}\int_0^t e^{k\tau}f(\tau)\,d\tau + u(0)e^{-kt}, \]

where we have replaced C by u(0). If we now take f(t) to be a constant and call it
$f_0$, the solution becomes

\[ u(t) = f_0(1 - e^{-kt}) + u(0)e^{-kt} = f_0 + (u(0) - f_0)e^{-kt}. \]

Since $\lim_{t\to\infty} e^{-kt} = 0$, we find that $\lim_{t\to\infty} u(t) = f_0$, which is reasonable from physical
intuition. Other choices for f(t) are considered in the Exercises.
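
The general solution formula with a definite integral lends itself to numerical evaluation for any reasonable surrounding temperature f(t). The Python sketch below approximates the integral with Simpson's rule and compares the result with the closed form for constant f. The temperatures, the value of k, and the function names are hypothetical values chosen for illustration, not data from the text.

```python
import math

def cooling(t, u0, k, f, n=1000):
    """u(t) = u0 e^{-kt} + k e^{-kt} * integral from 0 to t of e^{k tau} f(tau) d tau.

    The integral is approximated by Simpson's rule with n subintervals (n even).
    """
    if t == 0:
        return u0
    h = t / n
    total = f(0.0) + math.exp(k * t) * f(t)
    for i in range(1, n):
        tau = i * h
        weight = 4 if i % 2 else 2
        total += weight * math.exp(k * tau) * f(tau)
    integral = total * h / 3
    return math.exp(-k * t) * (u0 + k * integral)

def cooling_constant(t, u0, k, f0):
    """Closed form when the surrounding temperature is the constant f0."""
    return f0 + (u0 - f0) * math.exp(-k * t)

# Hypothetical values: object at 90 degrees, surroundings at 20 degrees, k = 0.1 per minute.
numeric = cooling(15.0, 90.0, 0.1, lambda tau: 20.0)
exact = cooling_constant(15.0, 90.0, 0.1, 20.0)
print(numeric, exact)   # the two values should agree closely
```

Passing a different function for `f`, for example one that changes value at some time, gives the corresponding temperature curve without any further hand computation.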

EXERCISES

In Exercises 1 to 4, assume that y represents some differentiable function of x. Find an
exponential multiplier M for each combination such that the product has the form
(d/dx)(M(x)y).

1. y' + 2y
2. dy/dx + xy
3. dy/dx + (2/x)y
4. $y' + \pi y$

In Exercises 5 to 8, find the general solution of each of the following linear equations,
and then find a particular solution that satisfies the given initial condition.

5. ds/dt + ts = t, s(0) = 0
6. y' = y + 1, y(0) = 1
7. 2 dy/dx = xy, y(1) = 0
8. $t\,dP/dt + P = t^3$, P(1) = 0

9. Salt solution enters a 100-gallon tank of initially pure water from two different
sources. One source provides water containing 1 pound of salt per gallon at a rate of 2
gallons per minute. A second source provides 3 gallons of salt solution per minute at a
varying concentration $C(t) = 2e^{-2t}$, measured in pounds of salt per gallon. Assume
that the contents of the tank are kept thoroughly mixed at all times and that solution is
drawn off at a rate of 5 gallons per minute. Find the amount of salt in the tank at an
arbitrary time t > 0.

10. The current i(t) in an electric circuit satisfies the differential equation

\[ L\frac{di}{dt} + Ri = E(t), \]

where L and R are positive constants, called inductance and resistance, respectively,
and E(t) is an applied voltage. Show that

\[ i(t) = \frac{1}{L}e^{-Rt/L}\int_0^t e^{Ru/L}E(u)\,du + i(0)e^{-Rt/L}. \]

11. A pellet of mass m falling under the influence of gravity through a viscous medium
has a velocity v(t) at time t, satisfying

\[ m\frac{dv}{dt} = mg - kv. \]

Here, g is the acceleration of gravity and k is a positive constant depending on the
viscosity. Show that

\[ v(t) = \left(v(0) - \frac{mg}{k}\right)e^{-kt/m} + \frac{mg}{k}. \]

12. Choosing an appropriate scale, sketch the graph of a typical function u found at
the conclusion of Example 5 of the text, if $0 < u(0) < f_0$. What is the maximum value
of u, and what is $\lim_{s\to\infty} u(s)$?

In Exercises 13 and 14, use Newton's law of cooling to find the result of choosing f(t)
in Example 5 of the text in two ways.

13. $f(t) = e^{-2t}$, for $t \ge 0$, with u(0) = 10

14. $f(t) = \begin{cases} f_0 \text{ (constant)}, & \text{for } 0 \le t \le 1, \\ 0, & \text{for } 1 < t, \end{cases}$ with u(0) = 5

15. A container of milk at 70° F is placed in a mixture of ice and brine constantly at
30° F. Assume the validity of Newton's law described in Example 5 and that the milk
has reached 40° after 15 minutes.
(a) Find an approximate value for the constant k in Newton's law.
(b) When will the milk reach 35°?

16. Suppose that a metal bar initially at 300° F is immersed in a water bath at 100° F
for 30 minutes and then is transferred to another water bath at 50° F. Assume the
validity of Newton's law described in Example 5 of the text.
(a) What will the temperature of the bar be after an additional 30 minutes, assuming
the cooling coefficient for the iron in water is k = 0.1?
(b) Suppose that initially the bar is cooled for 30 minutes in air at 100°, for which the
cooling coefficient is only k = 0.07 and is then immersed in water for 30 minutes. What
will the temperature of the bar be at the end of the hour?

17. Verify directly that if $y_1(x)$ is a solution of

\[ y' + gy = 0, \]

then $cy_1(x)$ is a solution, for every constant c.

18. Verify directly that if $y_1(x)$ and $y_2(x)$ are solutions of the respective equations

\[ y' + gy = f_1 \quad\text{and}\quad y' + gy = f_2, \]

then $c_1y_1 + c_2y_2$ is, for every pair of constants $c_1$, $c_2$, a solution of

\[ y' + gy = c_1f_1 + c_2f_2. \]

19. An initially full 100 cubic-foot tank starts with 10 pounds of salt dissolved in
water. At a certain time additional salt solution begins to enter the tank at a rate of 1
cubic foot per hour, while thoroughly mixed solution runs out a drain at the same rate.
However, the amount of salt in the added solution decreases at a constant rate from 1
pound per cubic foot initially all the way down to zero pounds per cubic foot at the end
of one hour.
(a) Find the amount of salt in the tank at a given time during the first hour. In
particular, about how much salt will be in the tank at the end of one hour?
(b) If pure water continues to run into the tank after the first hour at the rate of 1
cubic foot per hour, how much more time will it take for the total amount of salt in the
tank to reach 5 pounds?

20. Two 100-gallon mixing tanks are initially full of pure water. A solution containing
one pound of salt per gallon of water pours into the first tank at the rate of one gallon
per minute. Thoroughly mixed solution runs from the first tank to the second at the rate
of one gallon per minute, where it too is thoroughly mixed in before draining away at
1 gallon per minute. There will always be at least as much salt in the first tank as in
the second; find the maximum amount of this excess.

21. A 100-gallon tank is initially full of pure water. Salt solution is added for 10
minutes at the rate of 1 gallon per minute with salt content of the added solution
increasing linearly over the 10 minutes from 1 pound per gallon to 2 pounds per gallon.
Thoroughly mixed salt solution is drawn off at the rate of one gallon per minute.
Estimate the amount of salt in the tank at the end of the 10 minutes.

22. A 100-gallon mixing vat is initially half-full of pure water. Two gallons of salt
solution per minute at a concentration of one pound of salt per gallon begin to flow in,
while one gallon per minute of mixed solution flows out. Estimate the amount of salt in
the vat at the moment it begins to overflow.

23. A 100-gallon tank initially contains 50 gallons of water with a total of 10 pounds
of salt dissolved in it. A drain is opened in the bottom that is regulated so as to let
out 1 gallon of solution per minute. Simultaneously, salt solution begins to be added at
2 gallons per minute with a concentration of 2 pounds per gallon.
(a) How much salt is in the tank when it first becomes full and starts to overflow?
(b) If the process is allowed to continue with overflow at an additional outflow of 1
gallon per minute, what is the upper limit for the total amount of salt in the tank?
Estimate the additional time after the start of overflow for the amount of salt in the
tank to reach 175 pounds.

24. The first-order linear equation $dy/dx - (\sin x)y = e^{-x^2}$ can in principle be
solved by using an exponential integrating factor.
(a) Try to find an effective solution formula for the initial-value problem with initial
condition y(0) = -1, and explain the difficulty you encounter.
(b) Make a computer graphics plot of the solution to the initial-value problem of
part (a).

Chapter 10 REVIEW

In Exercises 1 to 14, find all functions that satisfy the differential equation.

1. x(dy/dx) + y - x = 0
2. $dy/dx = 1/(y(1 - x)^2)$
3. $dx/dt = tx + e^{t}$
4. (1 + x)y' + y = cos x
5. $y^3y' = (y^4 + 1)e^{x}$
6. $dy/dx = 4x^3y - y$, y(1) = 1
7. $xy' + (2x - 3)y = x^4$
8. y' = xy + y
9. $t(dx/dt) = -2x + t^3$, x(2) = 1
10. t(dx/dt) = 1
11. $dx/dt = -3x^2$
12. dy/dt + ty = 1
13. $dx/dt = (x + t)^2$ [Hint: Let x + t = y.]
14. $dy/dt = \cos^2 y$

15. Consider the differential equation $dy/dx = e^{x-y}$.
(a) In what region of the (x, y)-plane are all solutions strictly increasing?
(b) In what region of the (x, y)-plane are all solutions concave up?
(c) Is the line y = x a solution graph?
(d) Is the line y = x an isocline?
(e) Solve the differential equation by separation of variables. Can you get the
information asked for above directly from your solution formula? Which approach seems
simpler?

16. Consider the differential equation $dy/dx = e^{x-y}$.
(a) What conclusions can you draw from Theorem 1.1 on existence and uniqueness about
solutions of this equation?
(b) Can a solution graph passing through the point (x, y) = (0, 1) cross the line y = x?
Explain your reasoning.

17. Consider the family of linear equations y' + ay = c, with a, c constant, $a \neq 0$.
(a) Show (i) that the isoclines of the direction field of this equation are horizontal
lines and (ii) that every such line is an isocline.
(b) Sketch the direction field associated with the differential equation y' + 2y = 1.

18. Early experiments with objects dropped from rest above the earth led to the
conjecture that after an object had fallen distance s its velocity would be proportional
to s. Under the contemporary assumption that the acceleration of gravity is constant,
the velocity is proportional to $\sqrt{s}$.
(a) Is the early conjecture consistent with initial velocity zero? Explain your reasoning.
(b) Is the early conjecture consistent with positive initial velocity? How would
acceleration be related to s under this assumption?

19. A 100-gallon mixing vat is initially full of pure water, whereupon two gallons of
salt solution per minute is added, each gallon containing 1 pound of salt. Water
evaporates from the tank at the rate of one gallon per minute, and the excess solution
overflows into a drain. Find the amount of salt in the tank at time t under the given
assumptions and also under the altered assumption that the tank initially contains 50
pounds of salt in solution.

20. Coffee cooling. We are presented two choices for cooling one cup of coffee over a
period of 10 minutes: (i) let the coffee cool by itself for 10 minutes and then add
cream, or (ii) add the same amount of cream right away and then allow the mixture to
cool for 10 minutes. Assume that mixing quantity p of liquid at temperature $T_0$ and
quantity q at temperature $T_1$ instantly results in quantity p + q with average
temperature given by $(pT_0 + qT_1)/(p + q)$. Which method will end up with cooler
coffee?
CHAPTER 11

SECOND-ORDER EQUATIONS

Most of this chapter is about linear differential equations such as

$$y'' - 2y = x, \tag{1}$$
$$y'' - 3y = e^x, \tag{2}$$
$$y'' - 3y' + 2y = f(x), \tag{3}$$

in which the left-hand side is a sum of multiples, with constant coefficients, of functions $y(x)$, $y'(x)$, $y''(x)$. Such equations have several important applications that are taken up in Section 4 and are called linear constant-coefficient equations. The reason for focusing on second-order equations is that it's the first- and second-order derivatives of a function that have the most important interpretations, velocity and acceleration, matched geometrically with slope and concavity. Quite apart from any applications, these differential equations are interesting because we can describe their solutions very explicitly in terms of solution formulas. Furthermore the set of all solutions of a linear equation has a form that enables us to take a geometric view of it, similar to the form of the set of solutions of an algebraic system $Ax = b$. The solution techniques and concepts of this chapter apply also to many of the systems of equations in Chapter 12.
In Section 3C we'll relax somewhat the requirement that the coefficients of com-
bination in the linear equation should be constants. And finally in Section 7 we'll
remove the restriction to linear equations and discuss some features of nonlinear
equations such as $y'' + (y')^2 = 0$.

SECTION 1 DIFFERENTIAL OPERATORS


Before beginning a systematic treatment of constant-coefficient linear differential
operators, we'll look at some simple examples of linear differential equations.
1A Examples

EXAMPLE 1  We can write the differential equation $y' - ry = 0$, where $r$ is a constant, as

$$y' = ry, \tag{4}$$

which specifies that the rate of change of $y$ is proportional to the value of $y$ for every value of the variable $x$. This type of equation appears in Chapter 10, Section 2 for describing population growth. To find solutions we use repeatedly the formula $y' = re^{rx}$ for the derivative of $y = e^{rx}$. It follows that Equation (4) is satisfied if we take $y = e^{rx}$. More generally, if $c$ is an arbitrary constant, then Equation (4) is satisfied if we take

$$y = ce^{rx}, \tag{5}$$

because the $c$ will cancel on both sides. Equation (5) gives the most general solution to (4); observe that we can write (5) in the form

$$e^{-rx}y = c.$$

Differentiating with respect to $x$ gives

$$(e^{-rx}y)' = 0$$

or, using the product rule for derivatives,

$$e^{-rx}y' - re^{-rx}y = 0.$$

Dividing by $e^{-rx}$ leaves $y' - ry = 0$, which is the given Equation (4) rewritten. But now we can reverse these steps, supposing that $y$ is some solution. We start with

$$y' - ry = 0$$

and then multiply by $e^{-rx}$ to get

$$e^{-rx}y' - re^{-rx}y = 0.$$

By the product rule, this last equation is

$$(e^{-rx}y)' = 0.$$

Integrating both sides with respect to $x$ gives

$$e^{-rx}y = c,$$

where $c$ is a constant of integration. Multiplying both sides by $e^{rx}$ shows that $y$ must be of the form

$$y = ce^{rx}.$$

Thus we have shown that $ce^{rx}$ is the most general solution of $y' = ry$ in the sense that all particular solutions arise from specifying the value of $c$.

The method used in the preceding example consists of multiplying the expression $y' + ay$ by $e^{ax}$ and then recognizing the result as the derivative $(e^{ax}y)' = e^{ax}y' + ae^{ax}y$. We'll use this exponential multiplier $e^{ax}$ repeatedly in what follows.

EXAMPLE 2  To solve the differential equation

$$y' - 3y = e^x,$$

we multiply by $e^{-3x}$ and get

$$e^{-3x}y' - 3e^{-3x}y = e^{-2x},$$

which is the same as

$$(e^{-3x}y)' = e^{-2x}.$$

Now we integrate both sides with respect to $x$, getting

$$e^{-3x}y = -\tfrac{1}{2}e^{-2x} + c,$$

where $c$ is some constant of integration. Then multiplying by $e^{3x}$ we obtain

$$y = -\tfrac{1}{2}e^x + ce^{3x}$$

for the most general solution. We can verify directly that we have indeed found some solutions, one for each value of $c$. What we have shown additionally is that all solutions must be of the form $-\tfrac{1}{2}e^x + ce^{3x}$.
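
The direct verification mentioned above can be sketched numerically. The following is only an illustrative check, not part of the text's method: the function names `y`, `y_prime`, and `residual` are hypothetical, and the derivative is entered by hand from the solution formula $y = -\tfrac{1}{2}e^x + ce^{3x}$.

```python
import math

def y(x, c):
    """Candidate solution y = -(1/2) e^x + c e^(3x) from Example 2."""
    return -0.5 * math.exp(x) + c * math.exp(3 * x)

def y_prime(x, c):
    """Derivative of the candidate solution, differentiated term by term."""
    return -0.5 * math.exp(x) + 3 * c * math.exp(3 * x)

def residual(x, c):
    """Left side minus right side of y' - 3y = e^x; near zero for a solution."""
    return y_prime(x, c) - 3 * y(x, c) - math.exp(x)

# Try several values of the arbitrary constant c and several points x.
for c in (-2.0, 0.0, 1.5):
    for x in (-1.0, 0.0, 0.7):
        print(c, x, residual(x, c))
```

Each printed residual should be zero up to floating-point rounding, whatever the value of the constant.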

Before considering more complicated examples, it will be useful to describe some notation that is often used in solving differential equations. We let $D$ stand for differentiation with respect to some agreed-on variable, say $x$, and interpret $D + 2$, $D^2 - 1$, and similar expressions as operations acting on suitably differentiable functions $y$. For example,

$$(D + 2)y = Dy + 2y = y' + 2y,$$
$$(D^2 - 1)y = D^2y - y = D(Dy) - y = y'' - y.$$

An important observation is that $D$ acts linearly on $y$; the term linear operator is sometimes used to avoid possible confusion over $y$ itself being a function of $x$, though not necessarily a linear function. To see that $D$ acts linearly all we have to do is recall two familiar properties of differentiation:

$$D(y_1 + y_2) = Dy_1 + Dy_2,$$
$$D(cy) = cDy, \quad c \text{ constant}.$$

These two equations express the linearity of $D$. Repeated application of linear operators is linear, so it follows that the operators $D^2$, $D^3$, and in general $D^n$ are also linear. Because scalar multiplication is a linear operation and because the sum of linear operations is linear, the operator $(D + a)$ is linear for all constants $a$. Putting these ideas together allows us to conclude that expressions such as

$$D^2 + a, \qquad D^2 + aD + b, \qquad (D + s)(D + t)$$

are all linear operators, with the respective interpretations

$$\begin{aligned}
(D^2 + a)y &= y'' + ay,\\
(D^2 + aD + b)y &= y'' + ay' + by,\\
(D + s)(D + t)y &= (D + s)(y' + ty)\\
&= D(y' + ty) + s(y' + ty)\\
&= y'' + ty' + sy' + sty\\
&= y'' + (t + s)y' + sty\\
&= (D^2 + (s + t)D + st)y.
\end{aligned}$$

The last computation shows that for constants $s$ and $t$

$$(D + s)(D + t) = D^2 + (s + t)D + st,$$

and also that

$$(D + t)(D + s) = D^2 + (s + t)D + st.$$

Thus if $a$ is constant we can formally multiply operators of the form $D - a$ as we do ordinary polynomials with variable $D$. Conversely it's also sometimes important to be able to factor an operator, for example, $D^2 - 1$. We see immediately that for this example

$$D^2 - 1 = (D - 1)(D + 1) = (D + 1)(D - 1).$$
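
Since constant-coefficient operators multiply like polynomials in $D$, the product can be sketched as a polynomial multiplication on coefficient lists. This is a minimal illustration under an assumed representation: the list `[a0, a1, a2]` stands for $a_0 + a_1D + a_2D^2$, and the name `multiply_operators` is hypothetical.

```python
def multiply_operators(p, q):
    """Multiply two constant-coefficient operators in D.

    Each operator is a coefficient list [a0, a1, a2, ...] standing for
    a0 + a1*D + a2*D^2 + ...; the product is ordinary polynomial
    multiplication, which is valid only because the coefficients are constant.
    """
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result

s, t = 3, 2
print(multiply_operators([s, 1], [t, 1]))   # coefficients of (D + 3)(D + 2)
print(multiply_operators([t, 1], [s, 1]))   # same factors in the other order
print(multiply_operators([-1, 1], [1, 1]))  # coefficients of (D - 1)(D + 1)
```

Both orders of the factors give the same coefficient list, which mirrors the commutativity shown above for constant $s$ and $t$.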

Returning to differential equations, suppose we are given one of the form

$$y'' + ay' + by = 0;$$

Equation (3) at the beginning of this chapter is similar to this, with $a = -3$, $b = 2$. Writing the equation using differential operators gives

$$(D^2 + aD + b)y = 0.$$

If we try to find a solution of the form $y = e^{rx}$, then $Dy = re^{rx}$ and $D^2y = r^2e^{rx}$, so $e^{rx}$ is a solution if and only if $r^2e^{rx} + are^{rx} + be^{rx} = 0$. Then dividing by $e^{rx}$ gives the condition on $r$

$$r^2 + ar + b = 0,$$

called the characteristic equation of the given differential equation.

EXAMPLE 3  The differential equation

$$y'' - 3y' + 2y = 0$$

has characteristic equation

$$r^2 - 3r + 2 = 0 \quad\text{or}\quad (r - 1)(r - 2) = 0.$$

The roots are $r_1 = 1$ and $r_2 = 2$, so there are solutions

$$y_1(x) = e^x, \qquad y_2(x) = e^{2x}.$$

The operator $L = D^2 - 3D + 2$ is linear, so if both $L(y_1) = 0$ and $L(y_2) = 0$ then we also have $L(c_1y_1 + c_2y_2) = 0$, and additional solutions are given by

$$y(x) = c_1e^x + c_2e^{2x}$$

for each pair of constants $c_1$, $c_2$. This formula gives all the solutions, but to prove that, we must proceed differently as shown next.
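
As a rough numerical companion to the characteristic-equation idea, the sketch below finds real characteristic roots with the quadratic formula and checks that a combination $c_1e^{r_1x} + c_2e^{r_2x}$ satisfies the equation. The function names are hypothetical, and the sketch assumes $a^2 - 4b \ge 0$; complex roots are the subject of Section 2.

```python
import math

def real_characteristic_roots(a, b):
    """Roots of r^2 + a*r + b = 0, assuming a^2 - 4b >= 0."""
    disc = a * a - 4 * b
    if disc < 0:
        raise ValueError("complex characteristic roots are not handled here")
    s = math.sqrt(disc)
    return (-a - s) / 2, (-a + s) / 2

def residual(x, c1, c2, a, b):
    """Value of y'' + a*y' + b*y at x for y = c1*e^(r1 x) + c2*e^(r2 x)."""
    r1, r2 = real_characteristic_roots(a, b)
    e1, e2 = math.exp(r1 * x), math.exp(r2 * x)
    y = c1 * e1 + c2 * e2
    dy = c1 * r1 * e1 + c2 * r2 * e2
    d2y = c1 * r1 * r1 * e1 + c2 * r2 * r2 * e2
    return d2y + a * dy + b * y

print(real_characteristic_roots(-3, 2))   # roots for y'' - 3y' + 2y = 0
print(residual(0.5, 2.0, -1.0, -3, 2))    # should be zero up to rounding
```

The check confirms only that these combinations are solutions; as the text says, showing that there are no others takes a different argument.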

1B Factoring Operators
Our general method of solution will be to factor an operator into factors of the form
(D+s) and (D+t), and then apply the exponential multiplier method of Examples 1
and 2 repeatedly.

EXAMPLE 4  Suppose we want to find all functions $y = y(x)$ that satisfy

$$y'' + 5y' + 6y = 0.$$

We write the equation in operator form as

$$(D^2 + 5D + 6)y = 0.$$

Next we try to factor the operator. We see that

$$(D^2 + 5D + 6) = (D + 3)(D + 2);$$

thus we need to solve

$$(D + 3)(D + 2)y = 0.$$

To find all solutions, we suppose that $y$ is some solution. Letting

$$(D + 2)y = u$$

for the moment, we substitute $u$ into the previous equation and arrive at

$$(D + 3)u = 0.$$

But we can solve this first-order linear equation for $u$ if we multiply through by $e^{3x}$. We get

$$e^{3x}Du + 3e^{3x}u = 0$$

or

$$D(e^{3x}u) = 0.$$

Therefore

$$e^{3x}u = c_1,$$

for some constant $c_1$, and so

$$u = c_1e^{-3x}.$$

Recall now that we have temporarily set $(D + 2)y = u$. We then have

$$(D + 2)y = c_1e^{-3x}.$$

Multiply this first-order linear equation by $e^{2x}$ to get

$$e^{2x}Dy + 2e^{2x}y = c_1e^{-x} \quad\text{or}\quad D(e^{2x}y) = c_1e^{-x}.$$

Integrating with respect to $x$ gives

$$e^{2x}y = -c_1e^{-x} + c_2, \quad\text{so}\quad y = -c_1e^{-3x} + c_2e^{-2x}.$$

Since the constants $c_1$ and $c_2$ are arbitrary anyway, we can change the sign on the first one to get

$$y = c_1e^{-3x} + c_2e^{-2x}$$

for the form of the most general solution.

We find the exponential multiplier used in the previous examples as follows: multiply $(D + a)y$ by $e^{ax}$ to get $D(e^{ax}y)$, that is,

$$e^{ax}(D + a)y = D(e^{ax}y).$$

Repeated application of this formula to constant-coefficient equations leads to the following general theorem that lets us write the solutions knowing only the equation's characteristic roots, that is, the roots of the characteristic equation. We consider first the case of second-order equations of the form

$$y'' + ay' + by = (D - r_1)(D - r_2)y = 0,$$

where $r_1$ and $r_2$ are real numbers, and the values of the arbitrary constants $c_1$ and $c_2$ are determined by initial conditions that prescribe the values $y(x_0)$ and $y'(x_0)$ of the solution and its derivative at a point $x_0$.

1.1 Theorem. The constant-coefficient equation

$$y'' + ay' + by = 0,$$

with unequal characteristic roots $r_1$, $r_2$ has its most general solution of the form

$$y = c_1e^{r_1x} + c_2e^{r_2x}.$$

If $r_1 = r_2$, then $e^{r_2x}$ is replaced in the general solution formula by $xe^{r_1x}$ to get

$$y = c_1e^{r_1x} + c_2xe^{r_1x}.$$

The constants $c_1$, $c_2$ are uniquely determined by prescribing initial conditions $y(x_0) = y_0$, $y'(x_0) = z_0$.
Proof. In operator form, the differential equation is

$$(D - r_1)(D - r_2)y = 0.$$

We assume $y(x)$ is a solution and show that it has the form claimed in the theorem. Set $z(x) = (D - r_2)y(x)$ and substitute $z$ for $(D - r_2)y$ in the previous equation. Now solve the resulting equation $(D - r_1)z = 0$ to get

$$z = c_1e^{r_1x}.$$

Note that $c_1$ is determined by $c_1 = z(x_0)e^{-r_1x_0} = (y'(x_0) - r_2y(x_0))e^{-r_1x_0}$. Given the relation between $y$ and $z$, the solution $y$ then satisfies

$$(D - r_2)y = c_1e^{r_1x},$$

and multiplication by $e^{-r_2x}$ gives

$$D(e^{-r_2x}y) = c_1e^{(r_1 - r_2)x}.$$

If $r_1 \neq r_2$, we integrate to get

$$e^{-r_2x}y = \frac{c_1}{r_1 - r_2}e^{(r_1 - r_2)x} + c_2. \tag{$*$}$$

Now multiply by $e^{r_2x}$ to get

$$y = \frac{c_1}{r_1 - r_2}e^{r_1x} + c_2e^{r_2x}.$$

For neatness, we can rename the constant $c_1/(r_1 - r_2)$ and call it $c_1$ to get

$$y = c_1e^{r_1x} + c_2e^{r_2x}.$$

If $r_1 = r_2$, we have $D(e^{-r_1x}y) = c_1$, so integrating both sides gives

$$e^{-r_1x}y = c_1x + c_2. \tag{$**$}$$

In that case,

$$y = c_1xe^{r_1x} + c_2e^{r_1x}.$$

Finally, note that once $c_1$ is determined from $y(x_0)$ and $y'(x_0)$ as noted previously, the constant $c_2$ can in any case be determined from the value $y(x_0)$ alone; just solve the appropriate equation $(*)$ or $(**)$ for $c_2$ with $x = x_0$. ∎

The problem of finding a solution to a differential equation that also satisfies given initial conditions is called an initial-value problem. Geometrically, the initial conditions described in Theorem 1.1 require the graph of a solution to go through a given point $(x_0, y_0)$ with given slope $y'(x_0) = z_0$. It's possible to extract from the proof of Theorem 1.1 some formulas for determining coefficients $c_1$ and $c_2$ from initial conditions in a linear combination of two solutions. Since such formulas aren't particularly memorable, it's usually just as efficient to work directly, as in the next example.

EXAMPLE 5  Suppose we want a solution graph of $y'' + 5y' + 6y = 0$ that passes through $(0, 1)$ with slope 2. In other words we want the solution for which $y(0) = 1$ and $y'(0) = 2$. Since the characteristic equation of the differential equation is $r^2 + 5r + 6 = 0$, the characteristic roots are $r = -2$ and $r = -3$. All solutions thus have the form

$$y(x) = c_1e^{-2x} + c_2e^{-3x}.$$

The corresponding derivative formula is

$$y'(x) = -2c_1e^{-2x} - 3c_2e^{-3x}.$$

Setting $x = 0$ in $y(x)$ and $y'(x)$ we get the two equations

$$y(0) = c_1 + c_2 = 1 \quad\text{and}\quad y'(0) = -2c_1 - 3c_2 = 2.$$

Solving these for $c_1$ and $c_2$ gives the unique solution $c_1 = 5$, $c_2 = -4$, so our solution to the initial-value problem is $y(x) = 5e^{-2x} - 4e^{-3x}$. Figure 11.1 shows the graphs of $y(x)$ and $y'(x)$. Note that $y(x)$ attains its maximum when $y'(x) = 0$ at $x = \ln\frac{6}{5}$.

FIGURE 11.1  Graphs of $y(x) = 5e^{-2x} - 4e^{-3x}$ and $y'(x) = -10e^{-2x} + 12e^{-3x}$.
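
Working directly from initial conditions amounts to solving two linear equations in two unknowns. The following sketch does this for unequal real roots; the function name `initial_value_constants` and its argument order are assumptions made for illustration, and the repeated-root case is deliberately rejected rather than handled.

```python
import math

def initial_value_constants(r1, r2, x0, y0, z0):
    """Constants c1, c2 for y = c1*e^(r1 x) + c2*e^(r2 x) with
    y(x0) = y0 and y'(x0) = z0, assuming r1 != r2."""
    if r1 == r2:
        raise ValueError("repeated root: use c1*e^(r x) + c2*x*e^(r x) instead")
    # With A = c1*e^(r1 x0) and B = c2*e^(r2 x0) the conditions read
    #   A + B = y0,   r1*A + r2*B = z0.
    A = (z0 - r2 * y0) / (r1 - r2)
    B = (r1 * y0 - z0) / (r1 - r2)
    return A * math.exp(-r1 * x0), B * math.exp(-r2 * x0)

# Example 5: roots -2 and -3, with y(0) = 1 and y'(0) = 2.
c1, c2 = initial_value_constants(-2.0, -3.0, 0.0, 1.0, 2.0)
print(c1, c2)

# The derivative of the solution should vanish at x = ln(6/5).
x_max = math.log(6 / 5)
print(-2 * c1 * math.exp(-2 * x_max) - 3 * c2 * math.exp(-3 * x_max))
```

For Example 5 this should reproduce $c_1 = 5$ and $c_2 = -4$, with the printed derivative value zero up to rounding.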

EXAMPLE 6  The equation $y'' + 2y' + y = 0$ has for its characteristic equation $r^2 + 2r + 1 = 0$ with a repeated characteristic root $r_1 = r_2 = -1$. Theorem 1.1 says that the general solution is $y(x) = c_1e^{-x} + c_2xe^{-x}$. Initial conditions $y(0) = 1$, $y'(0) = 0$ require first that $y(0) = c_1 = 1$. Since $y'(x) = -c_1e^{-x} + c_2e^{-x} - c_2xe^{-x}$, the second condition requires $y'(0) = -c_1 + c_2 = 0$. Hence $c_1 = c_2 = 1$ and the initial-value problem has solution $y = e^{-x} + xe^{-x}$.

Suppose that in the previous example we didn't want to satisfy initial conditions at
a single point, but wanted instead a solution graph passing through two given points
in the xy-plane. Such conditions applied to a single solution at more than one point
are called boundary conditions. The problem of finding a solution of a differential
equation that also satisfies boundary conditions is called a boundary-value problem.
Boundary-value problems are theoretically more complicated than initial-value
problems and don't always have solutions. (See Exercises 10 and 11 in the next
section.) Nevertheless some boundary-value problems are quite simple computation-
ally. The next example is of this kind.

EXAMPLE 7  Our aim is to find the solution to $y'' - 4y = 0$ that satisfies boundary conditions $y(0) = 1$ and $y(1) = 2$. Thus in this example we need work only with the expression

$$y(x) = c_1e^{2x} + c_2e^{-2x}$$

for the general solution itself, since values of $y'(x)$ aren't involved. The resulting equations for the coefficients are

$$c_1 + c_2 = 1, \qquad e^2c_1 + e^{-2}c_2 = 2.$$

Though it's not guaranteed by Theorem 1.1, these equations turn out here to have a unique solution, namely

$$c_1 = \frac{2 - e^{-2}}{e^2 - e^{-2}}, \qquad c_2 = \frac{e^2 - 2}{e^2 - e^{-2}}.$$

The desired solution is

$$y(x) = \left(\frac{2 - e^{-2}}{e^2 - e^{-2}}\right)e^{2x} + \left(\frac{e^2 - 2}{e^2 - e^{-2}}\right)e^{-2x}.$$
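
The boundary-value computation in this example can be sketched as a two-by-two linear solve by Cramer's rule. This is an illustrative sketch only: `boundary_value_constants` is a hypothetical helper that assumes unequal real characteristic roots and raises an error when the determinant vanishes.

```python
import math

def boundary_value_constants(r1, r2, x1, y1, x2, y2):
    """Constants c1, c2 for y = c1*e^(r1 x) + c2*e^(r2 x) with
    y(x1) = y1 and y(x2) = y2, solved by Cramer's rule."""
    a11, a12 = math.exp(r1 * x1), math.exp(r2 * x1)
    a21, a22 = math.exp(r1 * x2), math.exp(r2 * x2)
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("boundary conditions do not determine c1, c2 uniquely")
    c1 = (y1 * a22 - a12 * y2) / det
    c2 = (a11 * y2 - a21 * y1) / det
    return c1, c2

# y'' - 4y = 0 has roots 2 and -2; impose y(0) = 1 and y(1) = 2.
c1, c2 = boundary_value_constants(2.0, -2.0, 0.0, 1.0, 1.0, 2.0)
print(c1, c2)
print(c1 + c2)                                # value of y(0)
print(c1 * math.exp(2) + c2 * math.exp(-2))   # value of y(1)
```

The two printed boundary values should come out as 1 and 2 up to rounding, matching the closed-form coefficients above.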

EXERCISES

In Exercises 1 to 6, with $D = d/dx$, compute

1. $(D + 1)e^{-2x}$
2. $(D^2 + 1)e^x$
4. $(D^2 + D - 1)\sin x$
5. $(D^2 + 1)x\cos x$
6. $(D^2 - 1)xe^{-ix}$

In Exercises 7 to 14, find the characteristic equation of each of the following differential equations. Then solve the characteristic equation and use the roots to write the general solution of the differential equation. Finally determine the arbitrary constants in the general solution to produce the solution to the initial-value or boundary-value problem.

7. $y'' + y' - 6y = 0$, $y(0) = 2$, $y'(0) = 2$
8. $2y'' - y = 0$, $y(0) = 1$, $y'(0) = 0$
9. $y'' + 2y' + y = 0$, $y(0) = 1$, $y'(0) = 2$
10. $y'' + 3y' + ys = 0$, $y(1) = 1$, $y'(1) = 1$
11. $y'' - y' = 0$, $y(0) = 1$, $y(1) = 0$
12. $y'' - 3y' - y = 0$, $y(0) = 0$, $y(1) = 0$
13. $2y'' - 3y' + y = 0$, $y(0) = 0$, $y'(0) = 0$
14. $3y'' + 3y' = 0$, $y(1) = 1$, $y(2) = 2$

In Exercises 15 to 20, sketch the graph of the given function of $x$. Then find a differential equation of the form

$$y'' + ay' + by = 0$$

of which each is a solution; write the general solution of the differential equation and verify that the given function is a special case of your general solution.

16. $e^x + e^{-x}$
17. $1 + x$
18. $2e^{2x} - 3e^{3x}$
20. $e^{-3x} + e^{5x}$

[Hint: What characteristic roots go with each solution?]

In Exercises 21 to 26, put each of the linear differential equations in the operator form $(aD^2 + bD + c)y = 0$. Then factor the operator, e.g. $D^2 - 1 = (D - 1)(D + 1)$.

21. $y'' + 2y' + y = 0$
22. $y'' - 2y = 0$
23. $2y'' - y = 0$
24. $y'' + 3y' = 0$
25. $y'' = 0$
26. $y'' - y' = 0$

Each of the equations 27 to 30 has a factored operator form:

$$(D - r_1)(D - r_2)y = f(x).$$

In each case let $(D - r_2)y = z$ and solve

$$(D - r_1)z = f(x)$$

for the most general possible $z$. Having found $z$, solve $(D - r_2)y = z$.

27. $D(D - 3)y = 0$
29. $y'' - y = 1$
30. $y'' + 2y' + y = x$

31. The differential equation $y'' + (1/x)y' - (1/x^2)y = 0$, $x > 0$, has operator form $(D^2 + (1/x)D - 1/x^2)y = 0$.
(a) Show that the equation can also be written $D(D + 1/x)y = 0$.
(b) Solve the equation in part (a) by letting $z = (D + 1/x)y$ and solving a succession of first-order equations.
(c) Show that $D(D + 1/x) \neq (D + 1/x)D$.
(d) Solve $(D + 1/x)Dy = 0$.

32. The hyperbolic cosine and hyperbolic sine are defined by

$$\cosh x = \frac{e^x + e^{-x}}{2}, \qquad \sinh x = \frac{e^x - e^{-x}}{2}.$$

(a) Show that, if constants $d_1$ and $d_2$ are suitably chosen in terms of $c_1$ and $c_2$, then

(b) Express the general solution of

in terms of hyperbolic functions.

33. (a) Show that the characteristic equation of

$$Ay'' + By' + Cy = 0,$$

with $A$, $B$, $C$ constant, $A \neq 0$, has real roots if and only if $B^2 \geq 4AC$.
(b) Show that when $B^2 > 4AC$, the general solution of the differential equation in part (a) also has the form

where $\alpha = -B/2A$, $\beta = \sqrt{B^2 - 4AC}/2A$.

34. Assume $|A| < \tfrac{1}{4}$ in the equation

$$Ay'' + y' + y = 0$$

and show that, as $A$ tends to 0, and with proper choice of arbitrary constants, there are solutions of this equation tending, for each fixed $x$, to solutions of $y' + y = 0$.

35. The differential equation $y'' - 2y' + y = 0$ has infinitely many solutions $y(x)$ with graphs passing through the point $(0, 1)$. Find the three that have slopes $-1$, 0 and 1 at that point and sketch their graphs.

36. Initial conditions $y(x_0) = a_0$ and $y'(x_0) = a_1$, imposed at a single point $x_0$, will always be satisfied by some solution of $y'' - 3y' + 2y = 0$; show that the boundary conditions $y'(0) = a_0$ and $y(\ln 2) = a_1$, at two different points $x_0 = 0$ and $x_1 = \ln 2$, are satisfied only if $a_1 = 2a_0$, and even then not uniquely.

37. A chain of length $l$ and mass density $\delta$ per unit of length lies unattached and in a straight line on the deck of a ship.
(a) If the chain runs out over the side with no force acting on it but gravity, in particular without friction, at constant acceleration $g$, show that the amount $y$ hanging over the side satisfies $d^2y/dt^2 = (g/l)y$ as long as $0 \leq y \leq l$. (Assume the deck is more than height $l$ above the water.)
(b) How fast is the chain accelerating as the last link goes over the side?
(c) Find $y(t)$ if $y(0) = y_0$ and $dy/dt(0) = v_0$.
(d) Show that if the chain starts from rest with length $y_0 > 0$ hanging over the side, then the last link goes over the side at time $t_1 = \sqrt{l/g}\,\ln\bigl((l + \sqrt{l^2 - y_0^2})/y_0\bigr)$.

38. Here is an alternative way to arrive at the modified exponential solution $xe^{mx}$ when the characteristic equation of $y'' + ay' + by = 0$ has $m$ as a double root. First write the equation in operator form as $(D - m)^2y = 0$, or as

$$y'' - 2my' + m^2y = 0.$$

Now try to find a solution of the form $y = e^{mx}u(x)$ by substitution into the displayed equation. (This technique is used in an essential way in Section 3C for dealing with linear differential equations with nonconstant coefficients.) [Hint: Show that $u''(x) = 0$.]

39. This exercise gives a clue as to why the factor $x$ occurs in the "equal-root" case.
(a) Show that $(D - r)(D - (r + h))y = 0$, or $(D^2 - (2r + h)D + r(r + h))y = 0$, can also be written $y'' - (2r + h)y' + r(r + h)y = 0$.
(b) Show that if $h \neq 0$ the general solution is $y_h = c_1e^{(r+h)x} + c_2e^{rx}$.
(c) Let $c_1 = 1/h$, $c_2 = -1/h$, and show that with these choices $\lim_{h \to 0} y_h(x) = xe^{rx}$ for all $x$.
(d) Show that the limit in part (c) is, by definition, the derivative of the solution $e^{rx}$ with respect to $r$.

*40. Assume that the characteristic roots of the constant-coefficient differential equation $y'' + ay' + by = 0$ are real numbers $r_1$, $r_2$. Show that if $x_1 \neq x_2$, the boundary-value problem

$$y'' + ay' + by = 0, \quad y(x_1) = y_1, \quad y(x_2) = y_2$$

always has a unique solution for given numbers $y_1$ and $y_2$. [Hint: Consider the cases $r_1 \neq r_2$ and $r_1 = r_2$ separately, and show that you can always solve for the desired constants $c_1$ and $c_2$ in the general solution.]

41. Two functions $y_1(x)$ and $y_2(x)$ are linearly independent on an $x$-interval if and only if neither one is a constant multiple of the other.
(a) Show that $e^{rx}$ and $e^{sx}$ are linearly independent on a given interval $a < x < b$ if $r \neq s$.
(b) Show that $e^{rx}$ and $xe^{rx}$ are linearly independent on a given interval $a < x < b$.

SECTION 2 COMPLEX SOLUTIONS


2A Complex Exponentials
Complex numbers arise just as naturally in the solution of constant-coefficient equations of the form

$$ay'' + by' + cy = 0,$$

as they do in the solution of the related algebraic equation

$$ax^2 + bx + c = 0.$$

But to exploit fully the analogy between these two kinds of equation we need the complex exponential function, defined for purely imaginary numbers $ix$, with $x$ real, by

$$e^{ix} = \cos x + i\sin x.$$

Our motivation for this definition comes from Equations 2.1 and 2.2 below. See also Exercise 40.

The absolute value of a complex number $\alpha + i\beta$ is $|\alpha + i\beta| = \sqrt{\alpha^2 + \beta^2}$, and equals its distance from the complex number 0. Figure 11.2 shows that in $e^{ix}$ we can interpret $x$ as an angle. The absolute value $|e^{ix}|$ equals 1 for all $x$ because

$$|e^{ix}| = |\cos x + i\sin x| = \sqrt{\cos^2 x + \sin^2 x} = 1.$$

Using the addition formulas for sine and cosine shows that

$$\begin{aligned}
(\cos x + i\sin x)(\cos x' + i\sin x') &= (\cos x\cos x' - \sin x\sin x') + i(\cos x\sin x' + \sin x\cos x')\\
&= \cos(x + x') + i\sin(x + x').
\end{aligned}$$

It follows that

$$e^{ix}e^{ix'} = e^{i(x + x')}.$$

FIGURE 11.2  The point $e^{ix} = \cos x + i\sin x$ on the unit circle, with horizontal coordinate $\cos x$ and vertical coordinate $\sin x$.

In particular, when $x' = -x$, we get $e^{ix}e^{-ix} = 1$, so that

$$\frac{1}{e^{ix}} = e^{-ix}.$$

These equations justify using the exponential notation: The function $e^{ix}$ behaves very much like the real-valued exponential $e^x$, for which $e^xe^{x'} = e^{x + x'}$ and $1/e^x = e^{-x}$.
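
The exponential-like behavior of $\cos x + i\sin x$ can be spot-checked numerically. In this sketch the helper name `cis` is an assumption of the example, not notation from the text; it implements the definition directly and is compared with Python's built-in complex exponential `cmath.exp`.

```python
import cmath
import math

def cis(x):
    """cos x + i sin x, the text's definition of e^(ix) for real x."""
    return complex(math.cos(x), math.sin(x))

x, xp = 0.7, -2.3
print(abs(cis(x)))                           # absolute value, expected to be 1
print(abs(cis(x) * cis(xp) - cis(x + xp)))   # product rule e^(ix) e^(ix') = e^(i(x+x'))
print(abs(cis(x) * cis(-x) - 1))             # reciprocal rule 1/e^(ix) = e^(-ix)
print(abs(cis(x) - cmath.exp(1j * x)))       # agreement with the library exponential
```

All four printed quantities other than the first should be zero up to floating-point rounding.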
In addition to its algebraic simplicity, another reason for using the complex exponential function is the simplicity of the formulas for its derivative and integral. To differentiate or integrate a complex-valued function $u(x) + iv(x)$ with respect to the real variable $x$, we simply differentiate or integrate the real and imaginary parts. By definition,

$$\frac{d}{dx}\bigl(u(x) + iv(x)\bigr) = \frac{du}{dx}(x) + i\frac{dv}{dx}(x),$$

and

$$\int \bigl(u(x) + iv(x)\bigr)\,dx = \int u(x)\,dx + i\int v(x)\,dx.$$

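Then the derivative of $e^{ix}$ with respect to $x$ is given by

$$\begin{aligned}
\frac{d}{dx}e^{ix} &= \frac{d}{dx}(\cos x + i\sin x)\\
&= -\sin x + i\cos x\\
&= i(\cos x + i\sin x) = ie^{ix}.
\end{aligned}$$

In short, we have

$$\frac{d}{dx}e^{ix} = ie^{ix}.$$

Similarly,

$$\int e^{ix}\,dx = \frac{1}{i}e^{ix} + c,$$

where $c$ may be a real or complex constant. These are analogous to the formulas for the derivative and integral of $e^{ax}$ when $a$ is real. More generally, we can define

$$e^{(\alpha + i\beta)x} = e^{\alpha x}e^{i\beta x} = e^{\alpha x}(\cos\beta x + i\sin\beta x)$$

and compute

$$\frac{d}{dx}e^{(\alpha + i\beta)x} = (\alpha + i\beta)e^{(\alpha + i\beta)x} \tag{2.1}$$

and

$$\int e^{(\alpha + i\beta)x}\,dx = \frac{1}{\alpha + i\beta}e^{(\alpha + i\beta)x} + c, \quad \alpha + i\beta \neq 0. \tag{2.2}$$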
These computations are left as exercises. We are now in a position to discuss the differential equation

$$(D^2 + aD + b)y = 0$$

when the factored operator

$$(D - r_1)(D - r_2)$$

contains complex numbers $r_1$ and $r_2$. We'll see that the usual techniques, as discussed in Section 1, still apply. The exponential multiplier method goes over formally unchanged because of Equation 2.1; we have

$$D(e^{rx}y) = e^{rx}(D + r)y,$$

whether $r$ is real or complex.

EXAMPLE 1  Consider the differential equation $y'' + y = 0$. We write the equation in operator form,

$$(D^2 + 1)y = 0,$$

and factor $D^2 + 1$ to get

$$(D - i)(D + i)y = 0.$$

Then set

$$(D + i)y = u, \tag{1}$$

and try to solve

$$(D - i)u = 0$$

for $u$. As in the real case, we multiply by a factor designed to make the left side the derivative of a product. The same multiplier rule suggests that the correct factor is $e^{-ix}$, so we write

$$e^{-ix}(D - i)u = 0.$$

Since $D(e^{-ix}u) = e^{-ix}(D - i)u$, we can write

$$D(e^{-ix}u) = 0.$$

We integrate both sides with respect to $x$ to get

$$e^{-ix}u = c_1 \quad\text{or}\quad u = c_1e^{ix}.$$

Substituting this result for $u$ into Equation (1) gives

$$(D + i)y = c_1e^{ix},$$

which must now be solved for $y$. We do it by multiplying through by $e^{ix}$ to get

$$e^{ix}(D + i)y = c_1e^{2ix}$$

or, since $D(e^{ix}y) = e^{ix}(D + i)y$,

$$D(e^{ix}y) = c_1e^{2ix}.$$

Integrating gives

$$e^{ix}y = \frac{c_1}{2i}e^{2ix} + c_2$$

or

$$y = \frac{c_1}{2i}e^{ix} + c_2e^{-ix}.$$

On replacing the arbitrary constant $c_1$ by $2ic_1$, we have

$$\begin{aligned}
y &= c_1e^{ix} + c_2e^{-ix}\\
&= c_1(\cos x + i\sin x) + c_2(\cos x - i\sin x)\\
&= (c_1 + c_2)\cos x + i(c_1 - c_2)\sin x.
\end{aligned}$$

To simplify the solution we can set $d_1 = c_1 + c_2$ and $d_2 = i(c_1 - c_2)$. This involves no change in generality in the constants because for given $d_1$ and $d_2$ we can solve for $c_1$ and $c_2$. Solving for $c_1$ and $c_2$, we find

$$c_1 = \tfrac{1}{2}(d_1 - id_2) \quad\text{and}\quad c_2 = \tfrac{1}{2}(d_1 + id_2).$$
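
The passage between the complex constants $c_1$, $c_2$ and the constants $d_1$, $d_2$ can be sketched as a pair of conversion functions. The function names here are hypothetical; the formulas are the ones just given, $d_1 = c_1 + c_2$, $d_2 = i(c_1 - c_2)$ and their inverses.

```python
import cmath
import math

def real_form_constants(c1, c2):
    """d1, d2 with c1*e^(ix) + c2*e^(-ix) = d1*cos x + d2*sin x."""
    return c1 + c2, 1j * (c1 - c2)

def complex_form_constants(d1, d2):
    """Inverse conversion: c1 = (d1 - i*d2)/2, c2 = (d1 + i*d2)/2."""
    return 0.5 * (d1 - 1j * d2), 0.5 * (d1 + 1j * d2)

d1, d2 = 3.0, -4.0
c1, c2 = complex_form_constants(d1, d2)
print(c1, c2)                        # a complex-conjugate pair when d1, d2 are real
print(real_form_constants(c1, c2))   # recovers d1 and d2 (as complex numbers)

x = 1.1
lhs = c1 * cmath.exp(1j * x) + c2 * cmath.exp(-1j * x)
rhs = d1 * math.cos(x) + d2 * math.sin(x)
print(abs(lhs - rhs))                # the two forms of the solution agree
```

Note that real $d_1$, $d_2$ correspond to complex-conjugate $c_1$, $c_2$, which is why real-valued solutions appear even though the exponentials are complex.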


Whenever the equation $y'' + ay' + by = 0$ has coefficients that are real numbers, if the characteristic equation

$$r^2 + ar + b = 0$$

has complex roots as in Example 2, these complex roots will be conjugate to each other, that is, of the form $r_1 = \alpha + i\beta$ and $r_2 = \alpha - i\beta$ where $\alpha$ and $\beta$ are real numbers. This follows from the quadratic formula

$$r = \frac{-a \pm \sqrt{a^2 - 4b}}{2}$$

and the assumption that $a^2 - 4b < 0$. Thus

$$r_1 = -\frac{a}{2} + \frac{i}{2}\sqrt{4b - a^2}, \qquad r_2 = -\frac{a}{2} - \frac{i}{2}\sqrt{4b - a^2}.$$

It follows that the complex solutions

$$y = c_1e^{(\alpha + i\beta)x} + c_2e^{(\alpha - i\beta)x}, \quad \alpha, \beta \text{ real},$$

can always be written

$$\begin{aligned}
y &= e^{\alpha x}(c_1e^{i\beta x} + c_2e^{-i\beta x})\\
&= e^{\alpha x}[c_1(\cos\beta x + i\sin\beta x) + c_2(\cos\beta x - i\sin\beta x)]\\
&= e^{\alpha x}[(c_1 + c_2)\cos\beta x + i(c_1 - c_2)\sin\beta x]\\
&= d_1e^{\alpha x}\cos\beta x + d_2e^{\alpha x}\sin\beta x.
\end{aligned}$$

This is a form of the solution that is often used in practice, so we include it in the statement of the following Theorem 2.3. The proof is formally the same as that of Theorem 1.1 in the previous section, so we omit it. The only difference here is that we can interpret the solutions as being complex-valued, though the case of real roots is automatically included also.

2.3 Theorem. The differential equation

$$y'' + ay' + by = 0, \quad a, b \text{ constant},$$

has for its general solution

$$\begin{aligned}
y &= c_1e^{r_1x} + c_2e^{r_2x}, & r_1 &\neq r_2,\\
y &= c_1xe^{r_1x} + c_2e^{r_1x}, & r_1 &= r_2,
\end{aligned}$$

where $r_1$, $r_2$ are the roots of $r^2 + ar + b = 0$. If $a$ and $b$ are real numbers with $a^2 - 4b < 0$, we can write $r_1 = \alpha + i\beta$, $r_2 = \alpha - i\beta$. The general solution can then be written

$$y = d_1e^{\alpha x}\cos\beta x + d_2e^{\alpha x}\sin\beta x.$$

Initial conditions $y(x_0) = y_0$, $y'(x_0) = z_0$ can always be satisfied by a unique choice of $c_1$ and $c_2$.

EXAMPLE 2  The equation $y'' + \omega^2y = 0$, with $\omega^2 > 0$, has characteristic equation $r^2 + \omega^2 = 0$ with characteristic roots $r = \pm i\omega$. The associated solutions to the differential equation are

$$y = c_1\cos\omega x + c_2\sin\omega x.$$

Functions of this form are called harmonic oscillations because of the role they play in the analysis of sound waves, and the differential equation is called a harmonic oscillator equation.

EXAMPLE 3  We classify the solutions of

$$Ay'' + By' + Cy = 0$$

according to relations among the constants $A$, $B$, and $C$. The characteristic equation is

$$Ar^2 + Br + C = 0$$

with roots

$$r_1, r_2 = \frac{-B \pm \sqrt{B^2 - 4AC}}{2A}.$$

If $B^2 - 4AC > 0$, the roots are real and unequal, so the solutions are

$$y = c_1e^{r_1x} + c_2e^{r_2x}.$$

If $B^2 - 4AC = 0$, the roots are equal and real, so solutions are all of the form

$$y = c_1e^{r_1x} + c_2xe^{r_1x}.$$

Finally, with $B^2 - 4AC < 0$, the solutions have the form

$$y = c_1e^{\alpha x}\cos\beta x + c_2e^{\alpha x}\sin\beta x,$$

where $\alpha = -B/2A$ and $\beta = \sqrt{4AC - B^2}/2A$. The case $B^2 - 4AC = 0$ is critical in that it represents the division between oscillatory solutions, for which $B^2 - 4AC < 0$, and nonoscillatory solutions ($B^2 - 4AC > 0$). The physical significance of this classification is explained in Section 4. It's apparent in any case that the presence of sine and cosine functions in a solution produces oscillatory behavior.
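
The three-way classification by the sign of $B^2 - 4AC$ can be sketched as a small function. The name `classify` and its return format are assumptions made for this illustration; the test for a zero discriminant is exact, so it is reliable only for coefficients that floating-point arithmetic represents exactly.

```python
import math

def classify(A, B, C):
    """Classify solutions of A*y'' + B*y' + C*y = 0 by the discriminant.

    Returns a label and the associated numbers: the two real roots, the
    single repeated root, or the pair (alpha, beta) for the oscillatory case.
    """
    if A == 0:
        raise ValueError("A must be nonzero for a second-order equation")
    disc = B * B - 4 * A * C
    if disc > 0:
        s = math.sqrt(disc)
        return "real distinct", ((-B - s) / (2 * A), (-B + s) / (2 * A))
    if disc == 0:
        return "real repeated", (-B / (2 * A),)
    # Complex roots alpha +/- i*beta give e^(alpha x) cos(beta x), e^(alpha x) sin(beta x).
    return "oscillatory", (-B / (2 * A), math.sqrt(-disc) / (2 * A))

print(classify(1, 5, 6))   # y'' + 5y' + 6y = 0
print(classify(1, 2, 1))   # y'' + 2y' + y = 0
print(classify(1, 2, 5))   # y'' + 2y' + 5y = 0
```

The third call corresponds to solutions built from $e^{-x}\cos 2x$ and $e^{-x}\sin 2x$.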

EXAMPLE 4  An alternative way to find solutions of the differential equation

$$y'' + 2y' + 5y = 0$$

is as follows. This approach is technically simpler than the method of successive integration, but fails to prove that we've found all solutions, as the integration method does. We let $y(x) = e^{rx}$ be a trial solution and try to determine what values of $r$ will indeed yield a solution. Since

$$y'(x) = re^{rx}, \qquad y''(x) = r^2e^{rx},$$

we must have, after substitution into the differential equation,

$$r^2e^{rx} + 2re^{rx} + 5e^{rx} = 0.$$

But since $e^{rx} \neq 0$ for all complex numbers $rx$, we can divide by $e^{rx}$ to get

$$r^2 + 2r + 5 = 0.$$

The polynomial on the left is just the characteristic polynomial of the differential equation, and its roots are

$$r_1 = -1 + 2i, \qquad r_2 = -1 - 2i.$$

It follows that

$$y_1(x) = e^{(-1 + 2i)x} = e^{-x}(\cos 2x + i\sin 2x)$$

and

$$y_2(x) = e^{(-1 - 2i)x} = e^{-x}(\cos 2x - i\sin 2x)$$

are complex-valued solutions of the differential equation. Hence the linear combination

$$\begin{aligned}
y(x) &= c_1e^{-x}(\cos 2x + i\sin 2x) + c_2e^{-x}(\cos 2x - i\sin 2x)\\
&= (c_1 + c_2)e^{-x}\cos 2x + i(c_1 - c_2)e^{-x}\sin 2x
\end{aligned}$$

is also a complex-valued solution. Here it's understood that $c_1$ and $c_2$ may be complex numbers. But as we saw at the end of Example 1, we can choose $c_1$ and $c_2$ so that the combinations $c_1 + c_2$ and $i(c_1 - c_2)$ assume arbitrary values, in particular, real values. Thus the solutions we have found have the form

$$y(x) = d_1e^{-x}\cos 2x + d_2e^{-x}\sin 2x.$$

EXERCISES

In Exercises 1 to 4, show that each of the given complex numbers has absolute value 1. Then find a real number $x$ such that the complex number has the form $e^{ix} = \cos x + i\sin x$; for example,

$$\frac{\sqrt{3} + i}{2} = \cos\frac{\pi}{6} + i\sin\frac{\pi}{6} = e^{i\pi/6}.$$

1. $i$
2. $(1 + i)/\sqrt{2}$
3. $(1 - i)/\sqrt{2}$
4. $(-\sqrt{3} - i)/2$

In Exercises 5 to 8, for each pair $d_1$, $d_2$ of real numbers given, find complex numbers $c_1$, $c_2$ such that

5. $d_1 = 1$, $d_2 = 0$
6. $d_1 = 4$, $d_2 = -2$
7. $d_1 = 0$, $d_2 = \pi$
8. $d_1 = 1$, $d_2 = 1$

9. Recall that a function is periodic, with period $p$, if $f(x + p) = f(x)$ for all $x$ in the domain of $f$.
(a) Show that $e^{ix}$ has period $2k\pi$ if $k$ is an integer.
(b) Show that $e^{i\beta x}$ is periodic for $\beta$ real, and find the smallest positive period if $\beta \neq 0$.

10. Show that $e^{-i\beta x} = \cos\beta x - i\sin\beta x$. What properties of cos and sin are used here?

Solve each of the differential equations 11 to 14 by factoring the differential operator associated with it and then successively solving a pair of first-order linear equations.

11. $y'' + y = 1$
12. $y'' + 2y' + 2y = 0$
13. $y'' + 2y = 0$
14. $y'' + y' = x$

Find the roots of the characteristic equation of each of the differential equations 15 to 22. Then write the general solution of the differential equation, replacing complex exponentials by $e^{\alpha x}\cos\beta x$ and $e^{\alpha x}\sin\beta x$ where it's appropriate. Finally determine the constants of integration so the given initial conditions will be satisfied.

15. $y'' + 2y = 0$, $y(0) = 0$, $y'(0) = 1$
16. $2y'' + 3y' = 0$, $y(0) = 1$, $y'(0) = 0$
17. $y'' - 2y' + 2y = 0$, $y(\pi) = 0$, $y'(\pi) = 0$
18. $y'' - y' + y = 0$, $y(0) = 2$, $y'(0) = -1$
19. $2y'' + y' - y = 0$, $y(0) = 0$, $y'(0) = 2$
20. $y'' + y' = 2y$, $y(0) = 0$, $y(1) = 1$
21. $2y'' + y' + y = 0$, $y(0) = 0$, $y'(0) = 0$
22. $3y'' - y' + y = 0$, $y(0) = 0$, $y(1) = 0$

In Exercises 23 to 30, find a second-order differential equation $y'' + ay' + by = 0$ that has the given solution. [Hint: What are the characteristic roots associated with each solution?]

23. $\sin 2x$
24. $e^{2x}\sin 2x$
25. $e^x\cos 2x$
26. $\cos 2x$
27. $\cos(x/2)$
28. $xe^{2x}$
29. $\sin 3x - \cos 3x$
30. $x - 7$

In Exercises 31 to 33, we deal with the issue that a constant-coefficient differential equation with complex characteristic roots may fail to have a unique solution if we impose certain critically chosen boundary conditions.

31. Show that the boundary-value problem $y'' + y = 0$, $y(0) = 1$, $y(\pi) = 1$ has no solution. [Hint: You know all solutions of $y'' + y = 0$.]

32. Show that the boundary-value problem $y'' + y = 0$, $y(0) = 1$, $y(2\pi) = 1$ has infinitely many solutions.

33. Show that if the constant-coefficient equation $y'' + ay' + by = 0$ has real characteristic roots, then the associated boundary-value problem $y(x_1) = y_1$, $y(x_2) = y_2$ always has a unique solution if $x_1 \neq x_2$.

34. (a) Show that $c_1\cos\beta x + c_2\sin\beta x$ also has the form $A\cos(\beta x - \phi)$, where $A = \sqrt{c_1^2 + c_2^2}$ and $\phi$ satisfies $\cos\phi = c_1/A$ and $\sin\phi = c_2/A$.
(b) The result of part (a) is useful because it shows that

$$c_1\cos\beta x + c_2\sin\beta x$$

has a graph that is the same as that of $\cos\beta x$ shifted by a phase angle $\phi$ and multiplied by an amplitude $A$. Sketch the graph of

$$\cos 2x + \sqrt{3}\sin 2x$$

by first finding $\phi$ and $A$.

35. (a) Let $c_1$ and $c_2$ be real numbers. Show that $y(x) = c_1\cos\beta x + c_2\sin\beta x$ can also be written as $A\sin(\beta x + \theta)$ where $A = \sqrt{c_1^2 + c_2^2}$ is the amplitude of $y(x)$ and $\theta$ satisfies $\sin\theta = c_1/A$ and $\cos\theta = c_2/A$.
(b) The result of part (a) says that

$$c_1\cos\beta x + c_2\sin\beta x = A\sin(\beta x + \theta)$$

for appropriately chosen real numbers $A$ and $\theta$. Use this result to sketch the graph of $y(x) = \cos 2x + \sqrt{3}\sin 2x$, by first finding $A$ and $\theta$.
(c) Show that

$$A\sin(\beta x + \theta) = A\cos(\beta x - \phi),$$

where $\phi = \pi/2 - \theta$. The number $\phi$ (or sometimes $-\theta$) is called a phase angle, and $A$ is the amplitude of the trigonometric function $A\sin(\beta x + \theta)$.

36. Verify Equation 2.1 in the text.

37. Verify Equation 2.2 in the text.

*38. Find all real or complex solutions of $y'' + iy' = 0$.

*39. Find all real or complex solutions of $y'' + iy = 0$.

40. Separate the real and imaginary terms in the infinite series

$$\sum_{k=0}^{\infty} \frac{(ix)^k}{k!}$$

into two power series. Then use the result as another justification of the definition $e^{ix} = \cos x + i\sin x$.

Complex-valued differentiable functions $f(x) = u(x) + iv(x)$ and $g(x) = s(x) + it(x)$ obey the same basic rules relative to differentiation that real-valued functions do. In Exercises 41 to 44, use the corresponding relations for real-valued functions to show that the following formulas hold on an interval $a < x < b$ on which both $f$, $g$ and $f/g$ are differentiable complex-valued functions.

41. $(f + g)' = f' + g'$
42. $(cf)' = cf'$, $c$ constant
43. $(fg)' = fg' + f'g$
44. $(f/g)' = (f'g - fg')/g^2$, $g \neq 0$

45. (a) Show that if $y = y(x)$ is a solution to the constant-coefficient equation $y'' + ay' + by = 0$ and $c$ is constant, then the function $y_c$ defined by $y_c(x) = y(x + c)$ is also a solution. [Hint: It's not necessary to know the form of $y(x)$ in terms of elementary functions.]
(b) Generalize the result of part (a) to a solution $y(x)$ of the $n$th order constant-coefficient equation

$$y^{(n)} + a_{n-1}y^{(n-1)} + \cdots + a_1y' + a_0y = 0.$$

(c) Generalize the result of part (a) to a solution $y(x)$, $a < x < b$, of a second-order equation, linear or nonlinear, of the form $y'' = F(y, y')$. [Hint: $y_c(x)$ will in general be defined on an interval different from $a < x < b$.]

46. Let $y(x)$ be a solution of a second-order equation $y'' + ay' + by = 0$. Can constants $a$ and $b$ be chosen so that $y(x) = x\cos x$? What about the function $\sin x + \cos 2x$? Justify your answers.

47. For fixed $\alpha$ and fixed $\beta \neq 0$, show directly that $e^{\alpha x}\cos\beta x$ and $e^{\alpha x}\sin\beta x$ are linearly independent. What if $\beta = 0$?

2B Higher-Order Equations
While most of the differential equations that arise directly in applications have order 1 or
2, understanding higher-order equations is technically useful for solving certain second-
order equations in Section 3 and for solving systems of equations in Chapter 12. There-
fore it's useful to record an extension of Theorem 1.1 as follows. The proof continues
step-by-step as in the proof of Theorem 1.1 and uses no new ideas so we omit it.
2.4 Theorem. The differential equation
$(D - r_1)(D - r_2) \cdots (D - r_n)y = 0$,
with characteristic roots $r_k$ all different has its most general solution of the form
$y = c_1 e^{r_1 x} + c_2 e^{r_2 x} + \cdots + c_n e^{r_n x}$.
508 Chapter 11 Second-Order Equations

If some $r_k$ are equal, say $r_1 = r_2 = \cdots = r_m$, then $e^{r_2 x}, e^{r_3 x}, \ldots, e^{r_m x}$ are
replaced in the general solution formula by $xe^{r_1 x}, x^2 e^{r_1 x}, \ldots, x^{m-1}e^{r_1 x}$ respectively.
The constants $c_1, c_2, \ldots, c_n$ are uniquely determined by prescribing values
$y(x_0), y'(x_0), \ldots, y^{(n-1)}(x_0)$ of the solution and its derivatives at a single point $x_0$.

The key to applying Theorem 2.4 is finding the characteristic roots of a differential
equation

$y^{(n)} + a_{n-1}y^{(n-1)} + \cdots + a_1 y' + a_0 y = 0$,

that is, finding the roots of the purely algebraic characteristic equation

$r^n + a_{n-1}r^{n-1} + \cdots + a_1 r + a_0 = 0$.
Although there are general formulas for the roots if n is 3 or 4, these are awkward
to use and we'll rely in our examples on equations that reduce to solving quadratic
equations.

EXAMPLE 5 To solve $y''' - 4y'' + 4y' = 0$, we solve the characteristic equation $r^3 - 4r^2 + 4r = 0$.
We observe that we can factor the left side:

$r(r^2 - 4r + 4) = r(r - 2)^2 = 0$.

The roots are 0 and 2, where 2 is a double root. According to Theorem 2.4 the
general solution to the differential equation is a linear combination of the solutions
$e^{0x} = 1$, $e^{2x}$ and $xe^{2x}$. Thus the solution is $y = c_1 + c_2 e^{2x} + c_3 xe^{2x}$.
It will be useful for us to apply the idea behind Theorem 2.4 in reverse order,
starting with solutions and arriving at a differential equation that has those solutions.

EXAMPLE 6 To find a constant-coefficient equation $L(y) = 0$ of least possible order having
$e^x$, $e^{2x}$ and $e^{-2x}$ for solutions, we write the linear operator equation

$(D - 1)(D - 2)(D + 2)y = 0$.

The function $e^{-2x}$ is a solution, because $(D + 2)e^{-2x} = 0$. Since the operator
factors may appear in any order, and $(D - 2)e^{2x} = 0$ and $(D - 1)e^x = 0$, all
three functions are solutions of the differential equation. Linear combinations
$y = c_1 e^x + c_2 e^{2x} + c_3 e^{-2x}$ constitute the general solution, as in Theorem 2.4.

2C Independent Solutions
The substitution method used in the preceding example doesn't by itself show that the
solution formula obtained gives the most general solution; this is true and it follows
from Theorem 2.3. Knowing about the solutions to the nth-order constant-coefficient
equation is useful in Section 3 for understanding some special types of second-order
nonhomogeneous equations y" + ay' +by= f(x). Otherwise it's mainly 4th-order
equations that arise directly in applications. We restate Theorem 2.4 to take account
of the trigonometric form that solutions may take if the constant coefficients in the
differential equation are all real.
2.5 Theorem. The nth-order constant-coefficient equation $L(y) = 0$ has for its
general solution a sum of constant multiples

$c_1 y_1(x) + \cdots + c_n y_n(x)$

of solutions $y_k(x)$, where the $c_k$ are arbitrary constants; these constants are uniquely
prescribed by initial conditions:

$y(x_0) = z_0$, $y'(x_0) = z_1$, $\ldots$, $y^{(n-1)}(x_0) = z_{n-1}$.

If $r_1, \ldots, r_n$ are the roots of the characteristic equation
$r^n + a_{n-1}r^{n-1} + \cdots + a_1 r + a_0 = 0$, the terms $y_k(x)$ in the solution can each be written
in the form $x^l e^{r_k x}$, $l = 0, 1, \ldots, m - 1$, where $m$ is the multiplicity of the root $r_k$.
If roots $\alpha + i\beta$ and $\alpha - i\beta$ occur in complex conjugate pairs, then the corresponding
pairs of exponential solutions are equivalent to

$x^l e^{\alpha x}\cos\beta x$, $x^l e^{\alpha x}\sin\beta x$.

The constants of integration in the solution formulas we've just been dealing with
appear frequently in the next section. An expression of the form

$c_1 y_1 + c_2 y_2 + \cdots + c_n y_n$

is called a linear combination of $y_1, y_2, \ldots, y_n$ with coefficients $c_1, c_2, \ldots, c_n$.
Alternatively, the numbers Ck may be regarded as parameters, and the linear combi-
nation displayed above is an example of an n-parameter family of functions.

EXAMPLE 7 The differential equation $y^{(4)} - y = 0$ has characteristic equation $r^4 - 1 = 0$. To
find the roots, we note that since $r^4 = 1$ then either $r^2 = 1$ or else $r^2 = -1$. Hence
the roots are $r_1 = 1$, $r_2 = -1$, $r_3 = i$ and $r_4 = -i$. The first two roots provide the
solutions $e^x$ and $e^{-x}$. The second pair provides $e^{ix}$ and $e^{-ix}$ or the alternative form
$\cos x$ and $\sin x$. The complete solution is then

$y(x) = c_1 e^x + c_2 e^{-x} + c_3\cos x + c_4\sin x$.

Initial conditions $y(0) = 0$, $y'(0) = 1$, $y''(0) = 2$, $y'''(0) = 1$ impose conditions on
the constants $c_k$. For example,

$y'(x) = c_1 e^x - c_2 e^{-x} - c_3\sin x + c_4\cos x$,

so $y'(0) = c_1 - c_2 + c_4 = 1$. The complete set of conditions reduces to

$c_1 + c_2 + c_3 = 0$
$c_1 - c_2 + c_4 = 1$
$c_1 + c_2 - c_3 = 2$
$c_1 - c_2 - c_4 = 1$.

Straightforward elimination shows that $c_1 = 1$, $c_2 = 0$, $c_3 = -1$ and $c_4 = 0$. Thus
the particular solution that satisfies the initial conditions is $y(x) = e^x - \cos x$.
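
The elimination step in this example can be checked mechanically. The following is a minimal sketch, not part of the text's method: it assumes the four conditions are written as a matrix equation and solves it by Gauss-Jordan elimination with exact rational arithmetic. The helper name `solve_linear_system` is hypothetical, introduced only for illustration.

```python
from fractions import Fraction

def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with exact rational arithmetic.

    Assumes the square matrix a is nonsingular; otherwise the pivot
    search below raises StopIteration.
    """
    n = len(a)
    # Build the augmented matrix [a | b] with Fraction entries.
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so the pivot equals 1.
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [row[n] for row in m]

# Rows encode y(0), y'(0), y''(0), y'''(0) for
# y = c1 e^x + c2 e^-x + c3 cos x + c4 sin x.
A = [
    [1, 1, 1, 0],
    [1, -1, 0, 1],
    [1, 1, -1, 0],
    [1, -1, 0, -1],
]
rhs = [0, 1, 2, 1]
c1, c2, c3, c4 = solve_linear_system(A, rhs)
print(c1, c2, c3, c4)
```

The same routine could be reused for the initial-value problems in the exercises, since each set of initial conditions is linear in the constants.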

EXAMPLE 8 The third-order differential equation $(D - 1)^3 y = 0$, which looks like
$y''' - 3y'' + 3y' - y = 0$ when written without operator notation, has the single characteristic
root $r = 1$ with multiplicity 3. The general solution is, by Theorem 2.4,

$y(x) = c_1 e^x + c_2 xe^x + c_3 x^2 e^x$.

EXAMPLE 9 The equation $y^{(4)} + \lambda y'' = 0$ is satisfied under certain conditions by a function
$y(x)$ that describes the lateral deflection, measured $x$ units from one end, of a uniform
column under a vertical compressive force. (The constant $\lambda = P/p$ depends on the
structure of the column and on the vertical load $P$ applied to it.) The characteristic
equation is $r^4 + \lambda r^2 = 0$, or $r^2(r^2 + \lambda) = 0$. With $\lambda > 0$, the roots are $r_1 = r_2 = 0$
and $r_3 = i\sqrt{\lambda}$, $r_4 = -i\sqrt{\lambda}$. The general solution is

$y(x) = c_1 + c_2 x + c_3\cos\sqrt{\lambda}\,x + c_4\sin\sqrt{\lambda}\,x$.

Initial conditions at a single point xo are physically uninteresting in this problem.


What is usually done is to impose "boundary conditions" on y(x) and y'(x), or else
on y(x) and y"(x) at points corresponding to the two ends of the column, say at
x = 0 and x = L. The existence of a unique solution depends critically on the value
of A. These matters are taken up in the Exercises.

Finding the roots of an nth degree characteristic equation is in general a difficult


problem. The following examples illustrate some fairly simple special cases.

EXAMPLE 10 We can factor the cubic in $r^3 + 2r^2 + 5r = 0$ to get $r(r^2 + 2r + 5) = 0$. Apart from the
root $r_1 = 0$ due to the first factor, there are the roots $r_2 = -1 + 2i$ and $r_3 = -1 - 2i$
of the quadratic equation $r^2 + 2r + 5 = 0$, so the corresponding solutions of the
differential equation

$y''' + 2y'' + 5y' = 0$

are

$y = c_1 + c_2 e^{-x}\cos 2x + c_3 e^{-x}\sin 2x$.
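
As a quick numerical cross-check of the roots found in this example, here is a small sketch that applies the quadratic formula with complex arithmetic and then evaluates the cubic at each root. The function names `quadratic_roots` and `cubic` are illustrative choices, not notation from the text.

```python
import cmath

def quadratic_roots(a, b, c):
    """Return both roots of a r^2 + b r + c = 0, allowing complex values."""
    disc = cmath.sqrt(b * b - 4 * a * c)
    return (-b + disc) / (2 * a), (-b - disc) / (2 * a)

def cubic(r):
    """Characteristic polynomial r^3 + 2 r^2 + 5 r of y''' + 2y'' + 5y' = 0."""
    return r**3 + 2 * r**2 + 5 * r

# The factor r contributes the root 0; the quadratic factor gives the other two.
roots = [0, *quadratic_roots(1, 2, 5)]
for r in roots:
    # The second column should be zero up to rounding error.
    print(r, abs(cubic(r)))
```

The real part -1 and imaginary part 2 of the complex pair are exactly the numbers that appear in the factors $e^{-x}$ and $\cos 2x$, $\sin 2x$ of the solution.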

EXAMPLE 11 The fourth degree, or quartic, equation $r^4 - 13r^2 + 36 = 0$ is also a quadratic equation
in $r^2$, with solutions $r^2 = (13 \pm 5)/2$. Since $r^2 = 4$ or $r^2 = 9$, the four distinct roots
are $r = \pm 2, \pm 3$. Hence the solutions to

$y^{(4)} - 13y'' + 36y = 0$

consist of all members of the four-parameter family

$y = c_1 e^{2x} + c_2 e^{-2x} + c_3 e^{3x} + c_4 e^{-3x}$.

It will be useful in Section 3 to apply the ideas behind Theorem 2.5 in reverse
order, starting with solutions and arriving at a differential equation that has those
solutions.

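To find a constant-coefficient equation $L(y) = 0$ of least possible order having $e^x$, $e^{2x}$
and $e^{-2x}$ for solutions, we write the linear operator equation

$(D - 1)(D - 2)(D + 2)y = 0$.

The function $e^{-2x}$ is a solution, because $(D + 2)e^{-2x} = 0$. Since the operator
factors may appear in any order, and $(D - 2)e^{2x} = 0$ and $(D - 1)e^x = 0$, all
three exponentials are solutions of the differential equation. (By the linearity of the
equation, linear combinations $y = c_1 e^x + c_2 e^{2x} + c_3 e^{-2x}$ constitute the general
solution.)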

Let us find a constant-coefficient equation $L(y) = 0$ of least possible order having
$\cos x$ and $\sin 2x$ for solutions. These solutions would have arisen from characteristic
roots $\pm i$ and $\pm 2i$ respectively. We write the linear operator equation

$(D^2 + 1)(D^2 + 4)y = 0$.
The function $y = \sin 2x$ is a solution, because $(D^2 + 4)\sin 2x = 0$. Since the operator
factors may appear in any order, and $(D^2 + 1)\cos x = 0$, both functions are solutions
of the differential equation. (Linear combinations
$y = c_1\cos x + c_2\sin x + c_3\cos 2x + c_4\sin 2x$ constitute the general solution.)

Independence of basic solutions. A set of $n$ functions $y_1(x), y_2(x), \ldots, y_n(x)$
defined on an interval is linearly independent on that interval if the identity

$c_1 y_1(x) + c_2 y_2(x) + \cdots + c_n y_n(x) = 0$

holds there only with the choice $c_1 = c_2 = \cdots = c_n = 0$ for the constants.
Exercise 53 asks you to show that a set of two or more functions is linearly independent
if and only if no one of them is expressible as a linear combination of the
others. Thus there is no ambiguity about the coefficients in the expression of a function
as a linear combination of elements from a given independent set of functions.
A set $B$ of linearly independent functions whose linear combinations constitute all
solutions of a linear differential equation is called a basis for the solution set. That
the solutions $x^l e^{\alpha x}\cos\beta x$, $x^l e^{\alpha x}\sin\beta x$ listed at the conclusion of Theorem 2.5 form
a linearly independent basis is a simple consequence of the theorem, as follows.

2.6 Corollary. The solutions $x^l e^{r_k x}$ listed in Theorem 2.5, including those with
real form $x^l e^{\alpha x}\cos\beta x$ and $x^l e^{\alpha x}\sin\beta x$, are linearly independent.
Proof. Suppose a linear combination $y(x)$ of these $n$ solutions is identically zero:

$c_1 y_1(x) + \cdots + c_n y_n(x) = 0$.

To prove independence we need to show that all $c_k = 0$. The linear combination
is a solution of a linear homogeneous differential equation $L(y) = 0$ with possibly
multiple characteristic roots $\alpha + i\beta$. Since this solution is identically zero, it satisfies
initial conditions $y(x_0) = y'(x_0) = \cdots = y^{(n-1)}(x_0) = 0$ at a given point $x_0$ of
the interval. Since the choice $c_k = 0$ produces this solution, and since Theorem 2.5
guarantees uniqueness of the coefficients $c_k$, the numbers $c_k$ must all be zero. •

EXERCISES

In Exercises 1 to 10, find the general solution to the differential equation.

1. $y''' + y = 0$   2. $y^{(4)} + 2y'' + y = 0$
3. $y''' - 2y' = 0$   4. $y^{(4)} - y'' = 0$
5. $y''' - 16y' = 0$   6. $y^{(4)} - 4y'' + 4y = 0$
7. $(D^2 + 4)(D^2 - 1)y = 0$   8. $D(D - 1)^3 y = 0$
9. $y''' - 2y'' = 0$   10. $y^{(4)} = 0$

In Exercises 11 to 28, find a constant-coefficient linear differential equation of the smallest
possible order that has the function $y(x)$ as solution. [Hint: What are the characteristic roots
associated with each function?]

11. $y(x) = e^{5x}$   12. $y(x) = xe^{5x}$
13. $y(x) = x^2$   14. $y(x) = x^2 e^{-x}$
15. $y(x) = x + e^x$   16. $y(x) = x^3$
17. $y(x) = x^5 + x^2 e^x$   18. $y(x) = -1$
19. $y(x) = \cos 4x$   20. $y(x) = x\cos 4x$
21. $y(x) = x^2\cos 4x$   22. $y(x) = x\sin 4x$
23. $y(x) = xe^x\sin x$   24. $y(x) = x^3\cos 4x$
25. $y(x) = x^5$   26. $y(x) = \cos 4x + \sin 3x$
27. $y(x) = e^{-x}\cos x$   28. $y(x) = x\cos x + \cos 2x$

In Exercises 29 to 38, there are given families of solutions to some linear constant-coefficient
differential equations of order more than 2. In each case find a differential equation of least
possible order satisfied by the family.

29. $c_1 + c_2 x + c_3 e^x$
30. $c_1\cos x + c_2\sin x + c_3 + c_4 x$
31. $c_1\cos x + c_2\sin x + c_3 e^x$
32. $c_1\cos x + c_2\sin x + c_3\cos 3x + c_4\sin 3x$
33. $c_1\cos 2x + c_2\sin 2x + c_3 e^{-x}$
34. $c_1 + c_2 x + c_3 x^2$
35. $c_1\cos 3x + c_2\sin 3x + c_3$
36. $c_1 e^x + c_2 e^{-x} + c_3\cos x + c_4\sin x$
37. $c_1 e^x\cos 2x + c_2 e^x\sin 2x + c_3$
38. $c_1\cos x + c_2\sin x$

Each of the solution families in Exercises 29 to 32 is listed again in Exercises 39 to 42 along
with a set of initial conditions. Find the correct values for the constants $c_k$ so the conditions
will be satisfied.

39. $c_1 + c_2 x + c_3 e^x$; $y(0) = 1$, $y'(0) = 2$, $y''(0) = 1$
40. $c_1\cos x + c_2\sin x + c_3 + c_4 x$; $y(0) = 2$, $y'(0) = y''(0) = y'''(0) = 0$
41. $c_1\cos x + c_2\sin x + c_3 e^x$; $y(0) = 2$, $y'(0) = y''(0) = -3$
42. $c_1\cos x + c_2\sin x + c_3\cos 3x + c_4\sin 3x$; $y(0) = y'(0) = 1$, $y''(0) = -1$, $y'''(0) = 3$

In Exercises 43 to 48, find the general solution to each equation.

43. $y^{(4)} - y = 0$   44. $y^{(4)} - 2y'' + y = 0$
45. $y^{(4)} - 2y' = 0$   46. $y^{(4)} + y = 0$
47. $(D^2 - 4)(D^2 - 1)y = 0$   48. $D^2(D - 1)^2 y = 0$
[Hint: For $r^4 + 1 = 0$, note that $r^2 = \pm i = \pm e^{i(\pi/2)}$. Then $r = \pm ie^{i(\pi/4)}, \pm e^{i(\pi/4)}$.]

49. Explain why $y(x) = \cos x + \sin 2x$ can't be the solution to a constant-coefficient equation of
the form $y'' + ay' + by = 0$. Find an equation of higher order that $y(x)$ does satisfy.
50. The general solution $y(x) = c_1 + c_2 x + c_3\cos\sqrt{\lambda}\,x + c_4\sin\sqrt{\lambda}\,x$ to $y^{(4)} + \lambda y'' = 0$ is derived in
Example 7 of the text; use it to do the following.
(a) If $\lambda = 4\pi^2$, find the infinitely many solutions that satisfy the boundary conditions
$y(0) = y(1) = 0$, $y'(0) = y'(1) = 0$.
(b) Under the assumption that $y(x)$ represents the horizontal deflection of a column under a
vertical compressing force, we can interpret the boundary conditions in part (a) to mean that
the ends of the column are rigidly embedded in floor and ceiling. Sketch some typical solutions,
assuming small displacements.
(c) Show that if $\lambda = \pi^2$ then the only solution satisfying the boundary conditions in part (a)
is the identically zero solution.
51. Euler beam equation. Suppose a uniform horizontal beam has profile shape $y = y(x)$, with $x$
measured from the left end. For a rather rigid beam with uniform loading, $y(x)$ typically
satisfies the fourth order differential equation $y^{(4)} = -P$, where $P > 0$ is a constant depending
on the characteristics of the beam. If the left end of the beam is embedded horizontally in a
wall at $x = 0$, called a cantilever support, then $y'(0) = 0$, indicating that the beam is flat
there. If the beam is just supported from below at $x = L$, but at the same level, say level 0,
then we'll use boundary conditions $y(0) = y(L) = 0$, $y'(0) = 0$, and $y''(L) = 0$. Imagine the beam
extended beyond $x = L$ and bending with an inflection at $x = L$.
(a) Solve the differential equation by four successive integrations, and use the boundary
conditions to show that the beam's shape is described by the graph of
$y(x) = -\frac{P}{48}(2x^4 - 5Lx^3 + 3L^2 x^2)$.
(b) Show that the graph of $y(x)$ has an inflection on $0 < x < L$. What is the maximum downward
vertical deflection from level 0 on $0 < x < L$?
(c) Make a sketch that shows the qualitative features of the graph of $y(x)$ for $0 \le x \le L$.
52. Rotating shaft. A differential equation for the lateral displacement $y = y(x)$ at distance $x$
from one end of a uniform rotating shaft is
$y^{(4)} - \lambda y = 0$,
where the constant $\lambda > 0$ is proportional to the speed of rotation.
(a) Show that the most general solution to the differential equation may be written
$y = c_1\cosh\sqrt[4]{\lambda}\,x + c_2\sinh\sqrt[4]{\lambda}\,x + c_3\cos\sqrt[4]{\lambda}\,x + c_4\sin\sqrt[4]{\lambda}\,x$,
where $\cosh u = (e^u + e^{-u})/2$ and $\sinh u = (e^u - e^{-u})/2$.
(b) Let $\lambda = n^4\pi^4$ where $n$ is a positive integer, and find solutions of the differential
equation subject to boundary conditions $y(0) = y''(0) = 0$, $y(1) = y''(1) = 0$.
(c) Sketch the graphs of the solutions found in part (b) for $n = 1, 2, 3$.
(d) Show that if $\lambda$ is not of the form prescribed in part (b) then the only solution to the
problem posed there is the identically zero solution.
53. Prove that a set $\{y_1(x), y_2(x), \ldots, y_n(x)\}$ of functions defined on an interval is linearly
independent if and only if no one of them is a linear combination, using constant coefficients,
of any remaining functions in the set.

SECTION 3 NONHOMOGENEOUS EQUATIONS


3A Superposition
An operator, for example, $L = D^2 + D + 1$, is linear if for functions $y_1$, $y_2$, and
constants $c$, both

(i) $L(y_1 + y_2) = L(y_1) + L(y_2)$, and

(ii) $L(cy) = cL(y)$.
To a given linear operator L we can associate the homogeneous equation

L(y) = 0,
and for a given function f we can also consider the nonhomogeneous equation

L(y) = f.
The associated homogeneous equation is the special case of the nonhomogeneous
equation obtained by letting f be the identically zero function, and this special
case is fundamental to understanding the more general case. For constant-coefficient
linear differential operators, the theorems of the two preceding sections give a com-
plete description of the set of all solutions to the homogeneous equation.
Furthermore, the exponential multiplier method developed there provides a practical
method for solving many nonhomogeneous equations. The next example illustrates
the method.

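EXAMPLE 1 Given
$y'' + 2y' + y = e^{3x}$,
we write the characteristic polynomial in the form $D^2 + 2D + 1$ and factor it, putting
the equation in the form
$(D + 1)^2 y = e^{3x}$.
Letting $(D + 1)y = u$, we try to solve

$(D + 1)u = e^{3x}$.
Multiplication by $e^x$ gives

$e^x(D + 1)u = e^{4x}$
or
$D(e^x u) = e^{4x}$.

Then integration gives

$e^x u = \frac{1}{4}e^{4x} + c_1$,
or
$u = \frac{1}{4}e^{3x} + c_1 e^{-x}$.

Since $(D + 1)y = u$, we have

$(D + 1)y = \frac{1}{4}e^{3x} + c_1 e^{-x}$.

Again multiplying by $e^x$, we get

$e^x(D + 1)y = \frac{1}{4}e^{4x} + c_1$
or
$D(e^x y) = \frac{1}{4}e^{4x} + c_1$.
Then
$e^x y = \frac{1}{16}e^{4x} + c_1 x + c_2$

or
$y = \frac{1}{16}e^{3x} + c_1 xe^{-x} + c_2 e^{-x}$.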

In the preceding example, the solution breaks naturally into a sum of two parts
$y_h$ and $y_p$:
$y_h = c_1 xe^{-x} + c_2 e^{-x}$,
$y_p = \frac{1}{16}e^{3x}$.

The function $y_h$ is called the homogeneous part of the solution because it's a solution
of the homogeneous equation

$L(y) = 0$

associated with $L(y) = f$. The function $y_p$ is called a particular solution of

$L(y) = f$
because it's just that: a particular solution, though not the most general one. We
sometimes refer to the homogeneous part of a general solution as the homogeneous
solution. We get $y_p$ by setting $c_1 = c_2 = 0$ in the general solution. The breakup
of the solution into two parts is an example of a general property of linear operators
discussed in Chapter 2, Section 2C on systems of linear algebraic equations. The
principle is important enough, and at the same time simple enough, that we state it
here also.
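
The split into $y_h$ and $y_p$ for the equation $y'' + 2y' + y = e^{3x}$ of Example 1 can be sanity-checked numerically. The sketch below is an illustration only: it approximates derivatives by central finite differences, so the printed residuals are small but not exactly zero, and the constants chosen for $y_h$ are arbitrary sample values.

```python
import math

def residual(y, f, x, h=1e-4):
    """Approximate y'' + 2y' + y - f at x using central finite differences."""
    d1 = (y(x + h) - y(x - h)) / (2 * h)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / (h * h)
    return d2 + 2 * d1 + y(x) - f(x)

def y_p(x):
    """Particular solution found in Example 1."""
    return math.exp(3 * x) / 16

def y_h(x, c1=2.0, c2=-1.0):
    """Homogeneous part with arbitrary sample constants c1, c2."""
    return c1 * x * math.exp(-x) + c2 * math.exp(-x)

f = lambda x: math.exp(3 * x)   # right-hand side of the nonhomogeneous equation
zero = lambda x: 0.0            # right-hand side of the homogeneous equation

for x in (0.0, 0.5, 1.0):
    # Columns: residual of y_p in L(y) = f, of y_h in L(y) = 0, of the sum in L(y) = f.
    print(x, residual(y_p, f, x), residual(y_h, zero, x),
          residual(lambda t: y_h(t) + y_p(t), f, x))
```

All three residual columns stay near zero, which is what the decomposition into a homogeneous part plus a particular solution predicts.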
3.1 Theorem. Let $L$ be a linear operator. Let $f$ be a function, and let $y_p$ be a
function in the domain of $L$ such that $L(y_p) = f$. Then every solution $y$ of

$L(y) = f$ is a sum $y = y_h + y_p$,

where $y_h$ is a solution to $L(y) = 0$.

Proof. Suppose that $L(y) = f$ and that also $L(y_p) = f$. Then since $L$ is linear,
$L(y - y_p) = L(y) - L(y_p) = f - f = 0$.
It follows that $y - y_p = y_h$ for some homogeneous solution $y_h$. But then $y = y_h + y_p$
as we wanted to show. •
The method of Example 1 can always be used to find the most general solution
to an equation $L(y) = f$ of the form

$(D - r_1) \cdots (D - r_n)y = f$.
In second-order examples we can use Theorems 1.1 or 2.3, since $y_h$ follows immediately
from the roots of the characteristic polynomial. Theorem 3.1 then says that if
we find the general homogeneous part of the solution $y_h$ using Theorems 1.1 or 2.3,
and somehow find a particular solution $y_p$, then the general solution of the given
equation is $y_h + y_p$.
To find $y_p$ it's often convenient to take advantage of the linearity of $L$ in case the
right-hand side $f$ is a sum of two or more terms. If we want to solve

$L(y) = a_1 f_1 + a_2 f_2$ (1)

and we can find solutions $y_1$ and $y_2$ such that

$L(y_1) = f_1$, $L(y_2) = f_2$,

then because $L$ is linear, the function

$y = a_1 y_1 + a_2 y_2$

is a solution of Equation (1). In this context, the property of linearity is sometimes
called the superposition principle because the desired solution is found by
superposition (i.e., addition) of solutions of more than one equation.

EXAMPLE 2 In Example 1 we found that the differential equation

$(D + 1)^2 y = e^{3x}$
had the general solution

$y = \frac{1}{16}e^{3x} + c_1 xe^{-x} + c_2 e^{-x}$.

When $c_1 = c_2 = 0$ we get the particular solution $y_1 = \frac{1}{16}e^{3x}$. If we now wanted to
solve
$(D + 1)^2 y = e^{3x} + 1$, (2)

we would not have to start all over again, but would only have to find a particular
solution for
$(D + 1)^2 y = 1$.

This could be solved by using exponential multipliers, but in this case the differential
equation is so simple that we can guess a solution, namely, $y_2 = 1$. Then a particular
solution of Equation (2) is $y_p = \frac{1}{16}e^{3x} + 1$, and the general solution is

$y = \frac{1}{16}e^{3x} + 1 + c_1 xe^{-x} + c_2 e^{-x}$.

Solving

$(D + 1)^2 y = e^{3x} + e^{-x}$

with an extra term on the right requires us to find a particular solution to

$(D + 1)^2 y = e^{-x}$.

To do this we could return to the exponential multiplier method, and let

$(D + 1)y = u$.
Then we solve
$(D + 1)u = e^{-x}$

by using the multiplier $e^x$ to get

$e^x(D + 1)u = 1$
or
$D(e^x u) = 1$.

Hence

$u = xe^{-x} + c_1 e^{-x}$.

Solving $(D + 1)y = u$ for $y$, we use again the multiplier $e^x$ to get

$D(e^x y) = x + c_1$.

Integration then gives
$e^x y = \frac{1}{2}x^2 + c_1 x + c_2$
or
$y = \frac{1}{2}x^2 e^{-x} + c_1 xe^{-x} + c_2 e^{-x}$.
Using the linearity of $(D + 1)^2$, we conclude that the general solution of
$(D + 1)^2 y = e^{3x} + e^{-x}$

is

$y = \frac{1}{16}e^{3x} + \frac{1}{2}x^2 e^{-x} + c_1 xe^{-x} + c_2 e^{-x}$.
The exponential multiplier method shown previously for finding particular solu-
tions of L(y) = f (x) will provide a solution if we can perform the integrations
involving f (x). The method of undetermined coefficients explained next has more
restricted applicability, but is often more efficient when it does apply.
3B Undetermined Coefficients
The method depends on the observation that if we want to solve

L(y) = f (x),
where f (x) is itself a solution of a homogeneous equation My = 0, then
M(L(y)) = M(f (x)) = 0.
Then the desired solution y(x) must be among the solutions of

M(L(y)) = 0.
If $M$ and $L$ are linear constant-coefficient operators then the solutions of the preceding
equation are linear combinations of functions of the form $x^k e^{rx}$, where $r$ may
be real or complex. Thus the only computational problem is to determine the so far
"undetermined coefficients" of combination that will actually give a solution of the
original equation L(y) = f (x). Guessing the operator M is based on experience
solving homogeneous equations.
EXAMPLE 3 The differential equation

$y'' - y = e^x$

in operator form is
$(D^2 - 1)y = e^x$.
Since $(D - 1)e^x = 0$, we have for any solution $y(x)$,
$(D - 1)(D^2 - 1)y = (D - 1)e^x = 0$.

Hence any particular solution $y$ must have the form

$y(x) = c_1 e^{-x} + c_2 e^x + c_3 xe^x$.

Since the first two terms are solutions of the associated homogeneous equation
$y'' - y = 0$, we can concentrate on the remaining term $c_3 xe^x$. To find $c_3$, we compute

$y_p(x) = c_3 xe^x$
$y_p'(x) = c_3(xe^x + e^x)$
$y_p''(x) = c_3(xe^x + 2e^x)$.
Thus for $y_p'' - y_p = e^x$ to hold, we must have

$c_3(xe^x + 2e^x) - c_3 xe^x = e^x$,

or
$(2c_3 - 1)e^x = 0$.
Hence $2c_3 = 1$, so that $c_3 = \frac{1}{2}$. The general solution is thus

$y = c_1 e^{-x} + c_2 e^x + \frac{1}{2}xe^x$.

EXAMPLE 4 We can write the differential equation

$y'' + y = 2\sin x$
in operator form as
$(D^2 + 1)y = 2\sin x$.
Since $(D^2 + 1)\sin x = 0$, it follows that any solution $y(x)$ to the given equation
must satisfy
$(D^2 + 1)^2 y = 0$.

Hence $y$ must have the form

$y(x) = c_1 x\cos x + c_2 x\sin x + c_3\cos x + c_4\sin x$.

The last two terms satisfy the homogeneous equation $y'' + y = 0$, so we try to
determine $c_1$ and $c_2$ so that

$y_p(x) = c_1 x\cos x + c_2 x\sin x$

satisfies $y'' + y = 2\sin x$. We compute

$y_p'(x) = c_1\cos x - c_1 x\sin x + c_2\sin x + c_2 x\cos x$,

$y_p''(x) = -2c_1\sin x - c_1 x\cos x + 2c_2\cos x - c_2 x\sin x$.

Then to satisfy the given differential equation we substitute $y_p$ and its derivatives
to get
$-2c_1\sin x + 2c_2\cos x = 2\sin x$.
Because $\cos x$ and $\sin x$ are linearly independent on any interval we must have
$c_1 = -1$ and $c_2 = 0$. Thus $y_p(x) = -x\cos x$ is a particular solution, and
$y(x) = c_3\cos x + c_4\sin x - x\cos x$
is the general solution.
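
A direct substitution check of the result of this example is easy to script. The sketch below assumes the general solution $c_3\cos x + c_4\sin x - x\cos x$, uses a second derivative worked out by hand, and picks arbitrary sample values for the constants; the function names are illustrative only.

```python
import math

def y(x, c3=1.5, c4=-0.5):
    """General solution c3 cos x + c4 sin x - x cos x (sample constants)."""
    return c3 * math.cos(x) + c4 * math.sin(x) - x * math.cos(x)

def y_second(x, c3=1.5, c4=-0.5):
    """Second derivative of y, differentiated by hand.

    (-x cos x)'' = 2 sin x + x cos x, and the homogeneous terms change sign.
    """
    return -c3 * math.cos(x) - c4 * math.sin(x) + 2 * math.sin(x) + x * math.cos(x)

# y'' + y - 2 sin x should vanish, up to rounding, at every sample point.
residuals = [y_second(x) + y(x) - 2 * math.sin(x) for x in (-4.0, 0.0, 1.0, 2.5)]
print(max(abs(r) for r in residuals))
```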
Here is an outline of the routine for finding the terms in a linear combination for
a trial particular solution $y_p$ to $L(y) = f(x)$, where $f$ is a constant multiple of

$x^n e^{\alpha x}\cos\beta x$ or $x^n e^{\alpha x}\sin\beta x$,

perhaps with $n$, $\alpha$ or $\beta$ equal to zero.

(i) Include in the linear combination Yp the function f itself and all terms in
its derivative set, consisting of the linearly independent sets of functions of
which f and its successive derivatives are linear combinations. For example,
the derivative set of $x^2 + x\sin x$ consists of the two sets $\{x^2, x, 1\}$ and
$\{x\sin x, x\cos x, \sin x, \cos x\}$.
(ii) If a term included in step (i) happens to be a solution of the homogeneous
equation, multiply that term and all terms in its derivative set by the single
lowest power xk such that the resulting terms are no longer homogeneous
solutions.
(iii) Form a linear combination with undetermined constant coefficients of the
terms from (ii), and determine the values of the coefficients by substitution
into L(y) = f .

Here are some examples of functions $f(x)$ and corresponding trial solutions $y_p(x)$,
assuming no term in $y_p(x)$ satisfies the homogeneous equation.

$f(x) = ce^{rx}$; $y_p(x) = Ae^{rx}$
$f(x) = cx^2$; $y_p(x) = Ax^2 + Bx + C$
$f(x) = cx^2 e^{rx}$; $y_p(x) = (Ax^2 + Bx + C)e^{rx}$
$f(x) = c\cos\beta x$; $y_p(x) = A\cos\beta x + B\sin\beta x$
$f(x) = cx\sin\beta x$; $y_p(x) = (Ax + B)\sin\beta x + (Cx + D)\cos\beta x$
$f(x) = ce^{\alpha x}\cos\beta x$; $y_p(x) = e^{\alpha x}(A\cos\beta x + B\sin\beta x)$

EXAMPLE 5 The differential equation

$y'' + 2y' + y = 3e^{-x}$
has the homogeneous solution

$y_h(x) = c_1 e^{-x} + c_2 xe^{-x}$.

For a nonhomogeneous solution, we try functions of the form

$y_p(x) = Ax^2 e^{-x}$.

Since
$y_p'(x) = A(2x - x^2)e^{-x}$,
$y_p''(x) = A(2 - 4x + x^2)e^{-x}$,

substitution into the nonhomogeneous differential equation gives

$A(2 - 4x + x^2)e^{-x} + 2A(2x - x^2)e^{-x} + Ax^2 e^{-x} = 3e^{-x}$.

The terms with $x$ and $x^2$ as factors all cancel, and we are left with

$2Ae^{-x} = 3e^{-x}$,

so $A = \frac{3}{2}$. Thus $y_p(x) = \frac{3}{2}x^2 e^{-x}$ is a particular solution, and the general solution is

$y(x) = c_1 e^{-x} + c_2 xe^{-x} + \frac{3}{2}x^2 e^{-x}$.
To find the form of a trial solution Yp(x) for a nonhomogeneous equation it's
often simpler, and just as effective, to make an educated guess at Yp(x) rather than
methodically following the three rules listed above. For example, a little experience
shows that $y_p = Axe^{-x}$ is a good choice for the equation $y'' - y = e^{-x}$.
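
The substitution step of the undetermined coefficient method can be automated for trial functions of the form $p(x)e^{rx}$. The sketch below is one possible illustration, not the text's procedure: it stores the polynomial $p$ as a coefficient list and uses the rule $D(pe^{rx}) = (p' + rp)e^{rx}$. The helper names `d_poly_exp` and `apply_operator` are hypothetical.

```python
from fractions import Fraction

def d_poly_exp(p, r):
    """Differentiate p(x) e^{rx}, where p is a coefficient list [a0, a1, ...].

    D(p e^{rx}) = (p' + r p) e^{rx}, so only the polynomial factor changes.
    """
    deriv = [k * p[k] for k in range(1, len(p))] + [0]
    return [dp + r * a for dp, a in zip(deriv, p)]

def apply_operator(coeffs, p, r):
    """Apply a0 + a1 D + a2 D^2 + ... to p(x) e^{rx}; return the new polynomial factor."""
    result = [Fraction(0)] * len(p)
    term = [Fraction(a) for a in p]
    for a in coeffs:
        result = [acc + a * t for acc, t in zip(result, term)]
        term = d_poly_exp(term, r)
    return result

# Example 5: L = D^2 + 2D + 1 applied to the trial function x^2 e^{-x}.
image = apply_operator([1, 2, 1], [0, 0, 1], -1)
A = Fraction(3) / image[0]          # match A * L(x^2 e^{-x}) with 3 e^{-x}
print(image, A)

# Closing remark: L = D^2 - 1 applied to x e^{-x}, matched with e^{-x}.
image2 = apply_operator([-1, 0, 1], [0, 1], -1)
A2 = Fraction(1) / image2[0]
print(image2, A2)
```

In both cases the image is a constant multiple of $e^{-x}$, so a single coefficient $A$ is enough; if higher-degree terms survived, each would give one more linear condition on the unknown coefficients.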

EXERCISES

In Exercises 1 to 10, find the general solution of the differential equations and then find
the particular solution satisfying $y(0) = 0$ and $y'(0) = 1$.

1. $y'' - y = e^{2x}$   2. $y'' - y = 3e^x$
3. $y'' + 2y' + y = e^x$   4. $y'' - y = x$
5. $y'' - y = e^x + x$   6. $y'' - 2y = \cos 2x$
7. $y'' + y = \cos x$   8. $y'' = \cos x + \sin x$
9. $y'' + y = x\cos x$   10. $y'' - y = xe^{-x}$

In Exercises 11 to 14, use factored operators and the exponential multiplier method to find
the general solution for the differential equation.

11. $y'' + y' - 2y = e^x$   12. $y'' - y = e^{2x}$
13. $y'' + y = e^{ix}$   14. $y'' = x$

In Exercises 15 to 22, find a homogeneous differential equation of least possible order for
which the given function is a solution.

15. $e^x + 2e^{2x}$   16. $e^x\cos x - e^x\sin x$
17. $x + 1$   18. $xe^x - 2e^x$
19. $x\sin 3x$   20. $x^2\cos 4x$
21. $xe^x\sin x$   22. $x^3 e^{-x}\cos 2x$

In Exercises 23 to 28, find the appropriate form for a trial solution for the equation. For
example you would use $y_p = A\cos 2x + B\sin 2x$ for $y'' - y = \sin 2x$.

23. $y'' - y = \cos x$   24. $y'' + y = \cos x$
25. $y'' - y = e^x$   26. $y'' - y = xe^x$
27. $y'' - 2y' + y = xe^x$   28. $y'' = x^5$

In Exercises 29 to 32, find the general solution of the equation by first finding the general
solution of the associated homogeneous equation and then adding to it a particular solution
found by the undetermined coefficient method. Then find the particular solution satisfying
$y(0) = 0$, $y'(0) = 1$ and sketch the graph of that solution.

29. $y'' + 4y' + 4y = 3x$   30. $y'' - y' - 12y = 2e^{4x}$
31. $y'' + 2y' + 2y = e^x$   32. $y'' - y' = x$

In Exercises 33 to 42, find the general form for a trial solution $y_p$ for each of the following.
(For example, if $y'' - y = e^x$, choose $y_p = Axe^x$.) You need not determine the coefficient values.

33. $y'' - 4y = xe^{2x} + e^{2x}$
34. $y'' + y = x^2\cos x$
35. $y'' - 5y' + 6y = xe^{2x} + e^{3x}$
36. $y'' + 4y = x^2\cos 2x - 2\sin 2x$
37. $y'' - 4y = e^{2x} + 5\cos x$
38. $y'' + y = 3x\sin(x - 3)$
39. $y'' - y' = x^2 + 2e^x$
40. $y''' - y = e^{x/2}\sin\sqrt{3}\,x$
41. $y''' = 1 + x + x^3$
42. $y'' + y = x^{99}\cos x$

Falling body in a resisting medium. The distance $y(t)$ covered in time $t$ by a falling body of
mass $m$ under the sole influence of a constant gravitational field satisfies a differential
equation of the form $d^2y/dt^2 = g$. (If distance is measured up from the surface of the earth,
then instead we would use $d^2y/dt^2 = -g$.) To take atmosphere resistance into account in a
simple way, we write the differential equation in terms of force rather than resistance in
the form
$m\frac{d^2y}{dt^2} = gm - k\frac{dy}{dt}$ or $\frac{d^2y}{dt^2} = g - \frac{k}{m}\frac{dy}{dt}$.
Here $k$ is a positive constant that is used to express a retarding force proportional to
velocity $dy/dt$. This equation applies to Exercises 43 to 46.

43. (a) Show that the general solution to the retarded falling body equation is
$y(t) = c_1 + c_2 e^{-kt/m} + \frac{mg}{k}t$.
(b) Show that if initial conditions $y(0) = y_0$ and $y'(0) = v_0$ are observed, then
$c_1 = y_0 - (m/k)(mg/k - v_0)$ and $c_2 = (m/k)(mg/k - v_0)$.
(c) Show that regardless of the choice of initial conditions, $\lim_{t\to\infty} y'(t) = mg/k$; this limit
is called the terminal velocity of the falling body.
(d) If the initial velocity $y'(0) = v_0$ is negative, show that the velocity reaches zero at time
$t_1 = (m/k)\ln(1 - kv_0/(mg))$. (Remember that $y$ is measured down toward the attracting body, so
negative initial velocity is directed up.)
44. Suppose a falling body is subject to a linear friction force $ky'(t)$ and has mass $m$.
(a) If the initial velocity is 0, show that the distance covered in time $t$ is
$y(t) = \frac{mg}{k}t - \frac{m^2 g}{k^2}(1 - e^{-kt/m})$.
(b) Show that the formula for $y(t)$ in part (a) satisfies $y(t) \le \frac{1}{2}gt^2$ for $t \ge 0$.
[Hint: $y'' = g - (k/m)y' \le g$. Now integrate from 0 to $t$.]
(c) Find the analogue of the formula given in part (a) for the case of initial velocity
$y'(0) = v_0$.
45. (a) For a body of mass $m$ subject to friction constant $k$, show that initial velocity $v_0$
leads to velocity at time $t$ given by
$y'(t) = \frac{mg}{k} + \left(v_0 - \frac{mg}{k}\right)e^{-kt/m}$.
(b) Find the limit as $k$ tends to 0 of the formula for $y'(t)$ in part (a). Does this agree with
the free-fall formula $v_0 + gt$?
46. We can estimate the friction constant $k$ by using an observed value of the terminal velocity
$v_\infty = mg/k$ of a falling body dropped from rest.
(a) Find $k$ if a body of weight $w = 100$ pounds achieves terminal velocity $v_\infty = 180$ feet per
second.
(b) How far must the body in part (a) fall to attain the velocity of 150 feet per second?
*47. Use successive integration by the exponential multiplier method to show that if $f$ is
continuous on an interval containing $x_0$, then the equation
$(D - r_1)(D - r_2)y = f(x)$
has solution

satisfying $y(x_0) = y'(x_0) = 0$.

3C Variation of Parameters
The undetermined coefficient method is inadequate if the nonhomogeneity involves
a term that is not itself a solution of some homogeneous equation. If we know a
nontrivial solution $y_1$ of a homogeneous equation, we can try to find a function $u(x)$
such that $y(x) = y_1(x)u(x)$ will be a solution of the associated nonhomogeneous
equation. We can find the complete solution this way, and the procedure is called
variation of parameters. The substitution $y(x) = y_1(x)u(x)$ will leave us with a
linear differential equation that we can solve for $u(x)$; we then solve this equation to
find the "variable parameter" $u(x)$. This method doesn't require that the coefficients
be constant.

IE~!\'IP~E l ~ We know that the associated homogeneous equation for

y" +2y' + y = e-x


522 Chapter 11 Second-Order Equations

has characteristic equation $(r + 1)^2 = 0$ and so has $y_1(x) = e^{-x}$ for a solution.
Letting $y = y_1(x)u(x) = e^{-x}u(x)$, we compute

$$y' = e^{-x}(u' - u), \qquad y'' = e^{-x}(u'' - 2u' + u).$$

Substitution into the given differential equation followed by division by $e^{-x}$ yields

$$(u'' - 2u' + u) + 2(u' - u) + u = 1, \quad\text{simplifying to}\quad u'' = 1.$$

We integrate $u'' = 1$ twice to get $u = \tfrac12 x^2 + c_1x + c_2$. So our solution is

$$y = y_1(x)u(x) = e^{-x}u(x) = \tfrac12 x^2e^{-x} + c_1xe^{-x} + c_2e^{-x},$$

in agreement with what we would have found by the method of Section 3B.

EXAMPLE 7  It's routine to check that the equation

$$x^2y'' - 2xy' + 2y = x^4$$

has $y_1(x) = x$ for a solution of the associated homogeneous equation. (We just accept
this as a guess for now.) Letting $y = y_1(x)u(x) = xu(x)$, we compute

$$y = xu, \qquad y' = xu' + u, \qquad y'' = xu'' + 2u'.$$

Substitution followed by simplification of the differential equation yields

$$x^2(xu'' + 2u') - 2x(xu' + u) + 2xu = x^4 \quad\text{or}\quad x^3u'' = x^4.$$

We divide by $x^3$ and integrate $u'' = x$ twice to get $u = \tfrac16 x^3 + c_1x + c_2$. So our
solution is

$$y = y_1(x)u(x) = xu(x) = \tfrac16 x^4 + c_1x^2 + c_2x.$$
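The solution $y = \tfrac16 x^4 + c_1x^2 + c_2x$ of this example can be spot-checked numerically. The sketch below is not part of the text; the function name `residual` and the sample values are assumptions chosen for illustration. It uses hand-computed derivatives and evaluates $x^2y'' - 2xy' + 2y - x^4$, which should vanish for every choice of constants.

```python
def residual(x, c1, c2):
    """Left side minus right side of x^2 y'' - 2x y' + 2y = x^4
    for y = x^4/6 + c1*x^2 + c2*x (derivatives computed by hand)."""
    y = x**4 / 6 + c1 * x**2 + c2 * x
    dy = 2 * x**3 / 3 + 2 * c1 * x + c2      # y'
    d2y = 2 * x**2 + 2 * c1                  # y''
    return x**2 * d2y - 2 * x * dy + 2 * y - x**4
```

A value such as `residual(1.7, 2.0, -3.0)` should be zero up to rounding error, whatever constants are supplied.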
Note that what we just did would have given us the second independent solution
$y_2(x) = x^2$ regardless of what function of $x$ we had on the right side of the equation.
In the previous example we found the homogeneous solution $y_1(x) = x$ by guessing,
but lacking a correct guess we could have found a solution by the method described
in Exercise 28 at the end of this section.
In applying variation of parameters, we can reduce by one the number of integrations
required if we already know two independent solutions $y_1$, $y_2$ of the associated
homogeneous equation. While it's similar in principle to what we did previously,
the routine using two solutions is lengthy enough that rather than repeat it for every
application we'll standardize it, along with the final result, as follows. Suppose $a(x)$,
$b(x)$, and $f(x)$ are continuous on some interval. Then consider
the normalized equation

y" + a(x)y' + b(x)y = f (x),


Section 3C Nonhomogeneous Equations 523
in which the coefficient of $y''$ equals 1. If $y_1$, $y_2$ are homogeneous solutions, we
form the linear combination

$$y(x) = y_1(x)u_1(x) + y_2(x)u_2(x),$$

where $u_1(x)$ and $u_2(x)$ are to be determined so that $y(x)$ is a solution of the
nonhomogeneous equation. What we do now is what we did in the previous example:
compute the derivatives $y'$ and $y''$ and substitute into the nonhomogeneous equation.
Then rearrange the terms as follows:

$$(y_1'' + ay_1' + by_1)u_1 + (y_2'' + ay_2' + by_2)u_2 + (y_1u_1' + y_2u_2')' + a(y_1u_1' + y_2u_2') + (y_1'u_1' + y_2'u_2') = f.$$
The first two collections of terms are zero because $y_1$ and $y_2$ are homogeneous
solutions. What remains of the equation will be satisfied if we can choose $u_1$ and $u_2$
so that the two equations

$$y_1u_1' + y_2u_2' = 0, \qquad y_1'u_1' + y_2'u_2' = f \tag{3.2}$$

hold identically for all $x$ on the interval in question. We solve this system of equations
for $u_1'$ and $u_2'$, with the result that

$$u_1'(x) = \frac{-y_2(x)f(x)}{y_1(x)y_2'(x) - y_2(x)y_1'(x)}, \qquad u_2'(x) = \frac{y_1(x)f(x)}{y_1(x)y_2'(x) - y_2(x)y_1'(x)}.$$

The expression in the denominators is the same in both formulas and we can write
it as the 2-by-2 determinant

$$w(x) = \begin{vmatrix} y_1(x) & y_2(x) \\ y_1'(x) & y_2'(x) \end{vmatrix} = y_1(x)y_2'(x) - y_2(x)y_1'(x),$$

called the Wronskian determinant of the pair $y_1$, $y_2$. It's possible to prove that
$w(x)$ is never zero if $y_1$, $y_2$ are linearly independent solutions, so the examples we
consider here will have that property. To complete the solution, integrate the formulas
for $u_1'$, $u_2'$ to find $u_1$ and $u_2$, and then combine with $y_1$, $y_2$ to get a particular solution

$$y_p(x) = y_1(x)u_1(x) + y_2(x)u_2(x) = y_1(x)\int\frac{-y_2(x)f(x)}{w(x)}\,dx + y_2(x)\int\frac{y_1(x)f(x)}{w(x)}\,dx. \tag{3.3}$$

Then the general solution of the original normalized equation is

$$y(x) = c_1y_1(x) + c_2y_2(x) + y_p(x).$$


Because Equations 3.2 are easier to remember than Equation 3.3, people some-
times prefer to start with them in each problem and carry out the rest of the
computation to arrive at Equation 3.3. The next example will be done that way.
Because of the way that f enters Equations 3.2 and 3.3, the equation to be solved
must be in normalized form to make these formulas valid.

EXAMPLE 8  We normalize $x^2y'' - 2xy' + 2y = x^3$, say for positive $x$, to get

$$y'' - \frac{2}{x}\,y' + \frac{2}{x^2}\,y = x, \qquad x > 0.$$

It's routine to check that the equation has homogeneous solutions $y_1(x) = x$, $y_2(x) = x^2$.
For this example Equations 3.2 are

$$xu_1' + x^2u_2' = 0, \qquad u_1' + 2xu_2' = x.$$

Multiplying the second equation by $x$ and then subtracting the first equation from it
gives $x^2u_2' = x^2$, or $u_2' = 1$. It then follows from the first equation that $u_1' = -x$.
Integrating to find $u_1$ and $u_2$ gives

$$u_1(x) = -\tfrac12 x^2, \qquad u_2(x) = x.$$

A particular solution is

$$y_p = y_1u_1 + y_2u_2 = x\left(-\tfrac12 x^2\right) + x^2\cdot x = \tfrac12 x^3.$$

Adding constants of integration to $u_1$ and $u_2$ would only add a linear combination of
homogeneous solutions to $y_p$. In any case, we have the general solution

$$y(x) = c_1x + c_2x^2 + \tfrac12 x^3.$$
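Solving Equations 3.2 is a 2-by-2 linear solve at each point $x$, so it can be sketched with Cramer's rule. The helper `solve_u_primes` below is a hypothetical name introduced for illustration and is not from the text; it takes the numeric values of $y_1$, $y_1'$, $y_2$, $y_2'$, and $f$ at one point. The sample call uses $y_1 = x$, $y_2 = x^2$, $f = x$ at $x = 2$.

```python
def solve_u_primes(y1, dy1, y2, dy2, f):
    """Solve Equations 3.2 at one point by Cramer's rule:
         y1*u1'  + y2*u2'  = 0
         dy1*u1' + dy2*u2' = f
    All arguments are numeric values at that point. Returns (u1', u2')."""
    w = y1 * dy2 - y2 * dy1  # Wronskian value at the point
    if w == 0:
        raise ZeroDivisionError("Wronskian vanishes at this point")
    return (-y2 * f / w, y1 * f / w)

x = 2.0
# y1 = x, y1' = 1, y2 = x^2, y2' = 2x, f = x
u1p, u2p = solve_u_primes(x, 1.0, x**2, 2 * x, x)
```

At $x = 2$ the returned pair should agree with $u_1' = -x$ and $u_2' = 1$.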

We'll now solve a generalization of the previous example, but using Equation 3.3 instead of
Equations 3.2.

EXAMPLE 9  In normalized form, we consider

$$y'' - \frac{2}{x}\,y' + \frac{2}{x^2}\,y = f(x), \qquad x > 0.$$

Homogeneous solutions are $y_1(x) = x$, $y_2(x) = x^2$. The Wronskian determinant of
$y_1$, $y_2$ is

$$w(x) = \begin{vmatrix} x & x^2 \\ 1 & 2x \end{vmatrix} = 2x^2 - x^2 = x^2.$$

Equation 3.3 reduces to

$$y_p(x) = x\int\frac{-x^2f(x)}{x^2}\,dx + x^2\int\frac{xf(x)}{x^2}\,dx = -x\int f(x)\,dx + x^2\int\frac{f(x)}{x}\,dx.$$
To make the integration fairly easy, we can use the example $f(x) = x\cos x$. For
this choice, we get

$$\begin{aligned}
y_p &= -x\int x\cos x\,dx + x^2\int\cos x\,dx \\
    &= -x\left(x\sin x - \int\sin x\,dx\right) + x^2\int\cos x\,dx \\
    &= -x(x\sin x + \cos x) + x^2\sin x \\
    &= -x\cos x.
\end{aligned}$$
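As a sanity check on this result, the sketch below estimates $y'' - (2/x)y' + (2/x^2)y$ for $y_p(x) = -x\cos x$ by central differences and compares it with $f(x) = x\cos x$. The function names and the step size `h` are assumptions made for illustration; the comparison is only approximate because finite differences carry truncation and rounding error.

```python
import math

def yp(x):
    """Particular solution found in the example."""
    return -x * math.cos(x)

def lhs(y, x, h=1e-4):
    """Central-difference estimate of y'' - (2/x) y' + (2/x^2) y at x > 0."""
    d1 = (y(x + h) - y(x - h)) / (2 * h)            # approximates y'(x)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / h**2    # approximates y''(x)
    return d2 - 2 * d1 / x + 2 * y(x) / x**2
```

For instance, `lhs(yp, 1.3)` should be close to `1.3 * math.cos(1.3)`.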
3D Green's Functions
A slight modification of Equation 3.3 yields formulas for the initial-value problem
$y(x_0) = y'(x_0) = 0$ associated with a second-order constant-coefficient operator. The
first step is to convert the integrals in Equation 3.3 into definite integrals over the
interval from $x_0$ to $x$:

$$\begin{aligned}
y_p(x) &= y_1(x)\int_{x_0}^{x}\frac{-y_2(t)f(t)}{w(t)}\,dt + y_2(x)\int_{x_0}^{x}\frac{y_1(t)f(t)}{w(t)}\,dt \\
       &= \int_{x_0}^{x}\frac{y_1(t)y_2(x) - y_2(t)y_1(x)}{w(t)}\,f(t)\,dt,
\end{aligned} \tag{3.4}$$

where $w(t) = y_1(t)y_2'(t) - y_2(t)y_1'(t)$ is the Wronskian determinant of $y_1(t)$, $y_2(t)$.
Setting $x = x_0$ in Equation 3.4 gives $y_p(x_0) = 0$. It's left as Exercise 19 or 20 to
show that $y_p'(x_0) = 0$ also. Thus we define the Green's function to be

$$G(x,t) = \frac{y_1(t)y_2(x) - y_2(t)y_1(x)}{w(t)}, \tag{3.5}$$

and this generates the solution

$$y_p(x) = \int_{x_0}^{x} G(x,t)f(t)\,dt$$

to the initial-value problem

$$y'' + a(x)y' + b(x)y = f(x), \qquad y(x_0) = y'(x_0) = 0.$$


For equations $y'' + ay' + by = f(x)$, where $a$ and $b$ are real constants, the
homogeneous solutions reduce to three distinct types, identified in Theorem 2.3 of
Section 2A, depending on the nature of the characteristic roots of $r^2 + ar + b = 0$.
Since the Green's functions are constructed from the solutions of constant-coefficient
equations, $G(x,t)$ also has just three types:

(i) $G(x,t) = \dfrac{1}{r_1 - r_2}\left(e^{r_1(x-t)} - e^{r_2(x-t)}\right)$, $r_1 \ne r_2$
(ii) $G(x,t) = (x - t)e^{r_1(x-t)}$, $r_1 = r_2$
(iii) $G(x,t) = \dfrac{1}{\beta}e^{\alpha(x-t)}\sin\beta(x - t)$, $r_1 = \alpha + i\beta$, $r_2 = \alpha - i\beta$
Using what we know about solving constant-coefficient equations, these formulas
are fairly easy to remember, and it's straightforward in Exercise 21 to derive them
from Equation 3.5.
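The three cases can be collected into one routine that branches on the discriminant of $r^2 + ar + b = 0$. This is a minimal sketch rather than anything given in the text; the name `green` and the exact test `disc == 0` are assumptions, and with floating-point coefficients one would normally compare against a small tolerance instead.

```python
import math

def green(x, t, a, b):
    """Green's function G(x, t) for y'' + a*y' + b*y = f(x) with constant a, b,
    selected by the sign of the discriminant of r^2 + a*r + b = 0."""
    disc = a * a - 4 * b
    s = x - t
    if disc > 0:                                   # (i) distinct real roots
        root = math.sqrt(disc)
        r1, r2 = (-a + root) / 2, (-a - root) / 2
        return (math.exp(r1 * s) - math.exp(r2 * s)) / (r1 - r2)
    if disc == 0:                                  # (ii) repeated root r1 = -a/2
        return s * math.exp(-a * s / 2)
    alpha, beta = -a / 2, math.sqrt(-disc) / 2     # (iii) roots alpha +/- i*beta
    return math.exp(alpha * s) * math.sin(beta * s) / beta
```

With $a = -3$, $b = 2$ the roots are 2 and 1, so `green(x, t, -3, 2)` should reproduce $e^{2(x-t)} - e^{(x-t)}$.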
Quite apart from the neatness with which Equation 3.4 displays a solution, it's
convenient for calculating solutions when the forcing function $f(x)$ is discontinuous,
as in the next example.

EXAMPLE 10  The normalized constant-coefficient equation $y'' - 3y' + 2y = f(x)$ has characteristic
equation $r^2 - 3r + 2 = 0$ with roots $r_1 = 2$ and $r_2 = 1$ and independent solutions
$y_1 = e^{2x}$, $y_2 = e^x$. Then $w(t) = -e^{3t}$ and the Green's function solution is

$$y_p(x) = \int_{x_0}^{x}\frac{e^{2t}e^{x} - e^{t}e^{2x}}{-e^{3t}}\,f(t)\,dt = \int_{x_0}^{x}\left(e^{2(x-t)} - e^{(x-t)}\right)f(t)\,dt.$$

Suppose $f(x) = -1$ if $x < 0$ and $f(x) = 2$ if $0 \le x$. In a purely formal sense the two
cases differ only by a constant factor. Assume first that $x < 0$, where $f(x) = -1$.
In that case, with $x_0 = 0$,

$$y_p(x) = -\int_0^x\left(e^{2(x-t)} - e^{(x-t)}\right)dt = -e^{2x}\int_0^x e^{-2t}\,dt + e^{x}\int_0^x e^{-t}\,dt = e^{x} - \tfrac12 e^{2x} - \tfrac12 \quad\text{for } x < 0.$$

For the case $x \ge 0$, just replace the factor $-1$ by 2 in the integral to get altogether

$$y_p(x) = \begin{cases} e^{x} - \tfrac12 e^{2x} - \tfrac12, & x < 0, \\ 1 + e^{2x} - 2e^{x}, & x \ge 0. \end{cases}$$

Note that the behavior of the two parts is quite different: $\lim_{x\to-\infty} y_p(x) = -\tfrac12$, but
$\lim_{x\to\infty} y_p(x) = \infty$.
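The piecewise formula of this example can be compared with a direct numerical evaluation of the Green's function integral. The sketch below uses a midpoint rule with `n` subintervals; the function names and the choice of quadrature are assumptions for illustration only. Because the forcing is constant on each side of 0, the integrand is smooth on the whole interval from 0 to $x$.

```python
import math

def forcing(t):
    """Step forcing of the example: -1 for t < 0, 2 for t >= 0."""
    return -1.0 if t < 0 else 2.0

def yp_numeric(x, n=2000):
    """Midpoint-rule value of the integral from 0 to x of
    (e^{2(x-t)} - e^{(x-t)}) f(t) dt."""
    h = x / n                      # negative when x < 0, which orients the integral
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += (math.exp(2 * (x - t)) - math.exp(x - t)) * forcing(t)
    return total * h

def yp_exact(x):
    """Closed form obtained in the example."""
    if x < 0:
        return math.exp(x) - 0.5 * math.exp(2 * x) - 0.5
    return 1 + math.exp(2 * x) - 2 * math.exp(x)
```

Evaluating both functions at a negative and a positive $x$ should give values that agree to several decimal places.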

EXAMPLE 11  Recall from earlier examples that $x^2y'' - 2xy' + 2y = 0$ has independent solutions
$y_1(x) = x$, $y_2(x) = x^2$. The Wronskian is $w(x) = x^2 \ne 0$ except at $x = 0$. The
Green's function is $G(x,t) = (x^2/t) - x$. We have the solution to the normalized
nonhomogeneous equation $y'' - (2/x)y' + (2/x^2)y = f(x)$ given by

$$y(x) = c_1x + c_2x^2 + \int_{x_0}^{x}\left(\frac{x^2}{t} - x\right)f(t)\,dt, \qquad x_0 \ne 0.$$

If $f(x) = x^3$ the integral term is

$$\begin{aligned}
y_p(x) &= \int_{x_0}^{x}\left(\frac{x^2}{t} - x\right)t^3\,dt \\
       &= x^2\int_{x_0}^{x} t^2\,dt - x\int_{x_0}^{x} t^3\,dt = \tfrac13 x^2\left(x^3 - x_0^3\right) - \tfrac14 x\left(x^4 - x_0^4\right) \\
       &= \tfrac1{12}x^5 - \tfrac13 x_0^3x^2 + \tfrac14 x_0^4x.
\end{aligned}$$
This is the solution that satisfies $y(x_0) = y'(x_0) = 0$. The terms containing $x_0$
combine with the homogeneous solution to give the solution $y = c_1x + c_2x^2 + \tfrac1{12}x^5$.

$$w(x) = \begin{vmatrix} x & x^2 + 1 \\ 1 & 2x \end{vmatrix} = x^2 - 1, \qquad x \ne \pm 1.$$

The Green's function for an initial-value problem is then

$$G(x,t) = \frac{t(x^2 + 1) - (t^2 + 1)x}{t^2 - 1}.$$

Suppose we want the solution with

$$f(x) = \begin{cases} 0, & -1 < x < 0, \\ 1, & 0 \le x \le \tfrac12, \\ 0, & \tfrac12 < x < 1, \end{cases}$$

and satisfying $y(0) = y'(0) = 0$. This is

$$y_p(x) = \int_0^x G(x,t)f(t)\,dt = \begin{cases} 0, & -1 < x < 0, \\ \int_0^x G(x,t)\,dt, & 0 \le x \le \tfrac12, \\ \int_0^{1/2} G(x,t)\,dt, & \tfrac12 < x < 1. \end{cases}$$

We compute, for $0 \le x \le \tfrac12$,

$$\begin{aligned}
\int_0^x G(x,t)\,dt &= (x^2 + 1)\int_0^x\frac{t}{t^2 - 1}\,dt - x\int_0^x\frac{t^2 + 1}{t^2 - 1}\,dt \\
                    &= (x^2 + 1)\left[\tfrac12\ln(1 - x^2)\right] - x\left[x + \ln\left(\frac{1 - x}{1 + x}\right)\right].
\end{aligned}$$

The complete solution is then

$$y_p(x) = \begin{cases} 0, & -1 < x < 0, \\ \tfrac12(x^2 + 1)\ln(1 - x^2) - x\ln\bigl((1 - x)/(1 + x)\bigr) - x^2, & 0 \le x \le \tfrac12, \\ \tfrac12\ln\left(\tfrac34\right)(x^2 + 1) - \left(\tfrac12 - \ln 3\right)x, & \tfrac12 < x < 1. \end{cases}$$

The third line comes from evaluating the bracketed expressions in the previous integral
evaluation at $x = \tfrac12$. The first and third lines are solutions of the homogeneous
equation on their respective intervals, because $f(x) = 0$ there. Finding the solution
that satisfies more general initial conditions, $y(0) = y_0$, $y'(0) = z_0$, is just a matter
of solving for $c_1$ and $c_2$ from

$$y(x) = c_1x + c_2(x^2 + 1) + y_p(x).$$

We have

$$y'(x) = c_1 + 2c_2x + y_p'(x).$$

But $y_p(0) = y_p'(0) = 0$, so we find $c_1$, $c_2$ from equations $c_2 = y_0$, $c_1 = z_0$.
Summary. What we have seen in this section is a collection of methods for
finding explicit solution formulas of the form

$$y(x) = c_1y_1(x) + c_2y_2(x) + y_p(x) \tag{3.6}$$

for differential equations of the form

$$y'' + a(x)y' + b(x)y = f(x). \tag{3.7}$$

It will follow from Theorem 1.1 in Chapter 12 that Equation 3.7 always has solutions
of the form in Equation 3.6 on an interval $x_0 \le x \le x_1$ if $a(x)$ and $b(x)$
are continuous on that interval. In the constant-coefficient case we've seen that by
choosing $c_1$ and $c_2$ properly we could satisfy arbitrary initial conditions of the form
$y(x_0) = y_0$, $y'(x_0) = z_0$. This last possibility follows simply from the meaning of
$c_1$ and $c_2$ as arbitrary constants of integration. If $a(x)$ and $b(x)$ are continuous
functions, the analogous theorem still holds, though we won't prove it. The next example
illustrates how this works.

EXAMPLE 13  The equation $(x - 1)y'' - xy' + y = 1$ has a particular solution $y_p(x) = 1$, and the
associated homogeneous equation has solutions $y_1(x) = x$ and $y_2(x) = e^x$. Initial
conditions $y(x_0) = y_0$, $y'(x_0) = z_0$ are satisfied by solving for $c_1$, $c_2$ in

$$y_0 = c_1y_1(x_0) + c_2y_2(x_0) + 1, \qquad z_0 = c_1y_1'(x_0) + c_2y_2'(x_0).$$

If $x_0 = 0$ it turns out that $c_1 = z_0 - y_0 + 1$ and $c_2 = y_0 - 1$. But our method fails
to apply at $x_0 = 1$, because the coefficient $(x - 1)$ of $y''$ is zero there so $y''(1)$
is not determined by $y(1)$ and $y'(1)$. You're asked to show in Exercise 27 that the
initial-value problem at $x_0 = 1$ has a solution only if $z_0 = y_0 - 1$ and that in that
case the solution is not unique.
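The pair of equations for $c_1$, $c_2$ in this example is a 2-by-2 linear system whose determinant is $e^{x_0}(x_0 - 1)$, which makes the failure at $x_0 = 1$ visible. The following sketch solves it by Cramer's rule with $y_1 = x$, $y_2 = e^x$, $y_p = 1$; the name `fit_constants` and the choice to raise an error at the singular point are assumptions for illustration, not part of the text.

```python
import math

def fit_constants(x0, y0, z0):
    """Solve  c1*x0 + c2*e^{x0} = y0 - 1  and  c1 + c2*e^{x0} = z0  for (c1, c2)."""
    e = math.exp(x0)
    det = x0 * e - e            # equals e^{x0} * (x0 - 1), zero at x0 = 1
    if det == 0:
        raise ValueError("x0 = 1: the system is singular")
    c1 = ((y0 - 1) * e - e * z0) / det
    c2 = (x0 * z0 - (y0 - 1)) / det
    return c1, c2
```

At $x_0 = 0$ the result should match $c_1 = z_0 - y_0 + 1$, $c_2 = y_0 - 1$, while a call with $x_0 = 1$ is rejected.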

EXERCISES

For each equation in Exercises 1 to 4, find or guess a solution $y_1$ of the associated
homogeneous equation. Then determine $u(x)$ so that $y(x) = y_1(x)u(x)$ is a solution of
the given equation containing two arbitrary constants.
1. $y'' - 4y' + 4y = e^x$
2. $y'' + (1/x)y' = x$, $x > 0$
3. $x^2y'' - 3xy' + 3y = x^4$, $x > 0$ [Hint: Try $y_1 = x^n$, for some $n$.]
4. $xy'' - (2x + 1)y' + (x + 1)y = 3x^2e^x$, $x > 0$ [Hint: Try $y_1(x) = e^{rx}$.]
In Exercises 5 to 10, find a particular solution $y_p$ by solving Equation 3.2 for $u_1(x)$,
$u_2(x)$ to get $y_p(x) = y_1(x)u_1(x) + y_2(x)u_2(x)$. If suitable $y_1$ and $y_2$ are not
given, find them first. Then find a solution that contains two arbitrary constants to the
nonhomogeneous equation.
5. $y'' + y' - 2y = e^{2x}$
6. $y'' + y = \tan x$, $-\pi/2 < x < \pi/2$
7. $y'' + y = \sec x$, $-\pi/2 < x < \pi/2$
8. $y'' - y = xe^x$
9. $y'' = x^2e^x$
10. $x^2y'' - 2xy' + 2y = 1$; $y_1(x) = x$, $y_2(x) = x^2$, $x > 0$
In Exercises 11 to 14, use Equation 3.3 to find a formula for a solution $y_p(x)$. Complete
the required integration if you can. Don't forget to make sure that the equation is normalized.
11. $y'' - 2y' + y = e^x$
12. $y'' + 3y' + 2y = 1 + e^x$
13. $y'' + 3y' + 2y = (1 + e^x)^{-1}$
14. $2y'' + 5y = e^x$
In Exercises 15 to 18, find a pair of independent solutions to the associated homogeneous
equation and use them to write the Green's function for the normalized equation. Then
normalize the equation and solve the given initial-value problem.
15. $y'' + 3y' + 2y = \begin{cases} 0, & x < 1, \\ 1, & 1 \le x; \end{cases}$  $y(0) = 1$, $y'(0) = 2$
16. $2y'' + 4y = \begin{cases} 1, & x < 1, \\ 0, & 1 \le x; \end{cases}$  $y(0) = -1$, $y'(0) = 1$
17. $y'' + (1/x)y' = 1/x$; $y(1) = 0$, $y'(1) = 2$
18. $x^2y'' - 2xy' + 2y = \begin{cases} 0, & 0 < x < 2, \\ 3, & 2 \le x \le 4, \\ -1, & 4 < x; \end{cases}$  $y(3) = 0$, $y'(3) = 0$
19. Show that the derivative of the Green's function Formula 3.4 is

$$y_p'(x) = \int_{x_0}^{x}\frac{y_1(t)y_2'(x) - y_2(t)y_1'(x)}{w(t)}\,f(t)\,dt.$$

Show then that $y_p'(x_0) = 0$. [Hint: Use Equation 3.4 separated into two integrals, and then
apply the product rule for differentiation.]
20. The Leibniz rule for differentiating an integral states that

$$\frac{d}{dx}\int_{a(x)}^{b(x)} F(x,t)\,dt = \int_{a(x)}^{b(x)}\frac{\partial F}{\partial x}(x,t)\,dt + b'(x)F(x,b(x)) - a'(x)F(x,a(x)),$$

if $\partial F/\partial x$ is continuous. Use this result to establish the formula for $y_p'(x)$ in
Exercise 19.
21. The constant-coefficient equation $y'' + ay' + by = f(x)$ has homogeneous solutions
$y_1(x) = e^{r_1x}$, $y_2(x) = e^{r_2x}$. These solutions are independent if $r_1 \ne r_2$; otherwise we
consider $y_1(x) = e^{r_1x}$, $y_2(x) = xe^{r_1x}$. Find the Green's function for the equation in the
following cases by using Equation 3.5 of the text.
(a) $r_1 \ne r_2$
(b) $r_1 = r_2$
(c) $r_1 = \alpha + i\beta$, $r_2 = \alpha - i\beta$, $\beta \ne 0$
22. Sketch the graph of the solution found in Example 10 of the text.
Do the calculation of the Green's function integral in Example 11 of the text for the choices
23. $f(x) = x$
24. $f(x) = -1$, $x < 0$; $f(x) = 1$, $x \ge 0$
Do the calculation of the Green's function integral in Example 10 of the text for the choices
25. $f(x) = e^x$
26. $f(x) = e^x$, $x < 0$; $f(x) = e^{-x}$, $x \ge 0$
27. Consider the differential equation $(x - 1)y'' - xy' + y = 1$ of Example 13 of the text.
Show that the initial-value problem $y(1) = y_0$, $y'(1) = z_0$ has a solution only if
$z_0 = y_0 - 1$, and that in that case there are infinitely many solutions. [Hint: $x$ and $e^x$
are homogeneous solutions.]
28. (a) Show that we can solve the Euler differential equation

$$x^2y'' + axy' + by = 0, \qquad a, b \text{ real constants},$$

as follows, assuming $x > 0$. Let $y = x^{\mu}$, so that $y' = \mu x^{\mu-1}$, and
$y'' = \mu(\mu - 1)x^{\mu-2}$. Show that for $y$ to solve the differential equation, $\mu$ must
satisfy the indicial equation

$$\mu^2 + (a - 1)\mu + b = 0.$$

(b) Show that if the indicial equation has real roots $\mu_1 \ne \mu_2$ then $y_1 = x^{\mu_1}$,
$y_2 = x^{\mu_2}$ are solutions.
(c) Show that if the indicial equation has complex conjugate roots $\mu_1 = \alpha + i\beta$,
$\mu_2 = \alpha - i\beta$ then $y_1 = x^{\alpha}\cos(\beta\ln x)$, $y_2 = x^{\alpha}\sin(\beta\ln x)$ are
solutions. Note that, by definition, $x^{\alpha+i\beta} = x^{\alpha}e^{i\beta\ln x}$ for $x > 0$.
(d) Show that if $\mu_1$ is a double root of the indicial equation then $y_1 = x^{\mu_1}$,
$y_2 = x^{\mu_1}\ln x$ are solutions. Use Theorem 2.4 for this.
In Exercises 29 to 32, use the results of the previous exercise to solve the following
Euler Equations.
29. $x^2y'' + xy' - y = 0$
30. $x^2y'' + 4xy' + y = 0$
31. $x^2y'' + 3xy' + y = 0$
32. $x^2y'' + xy' + y = 0$
33. Let $f(x)$ and $g(x)$ be twice differentiable functions on an interval $a < x < b$ on which
$f(x)g'(x) - f'(x)g(x) \ne 0$.
(a) Show that the 3-by-3 determinant equation

$$\begin{vmatrix} y & f & g \\ y' & f' & g' \\ y'' & f'' & g'' \end{vmatrix} = 0$$

is a second-order homogeneous linear equation on the interval $(a, b)$ having $f(x)$ and
$g(x)$ as solutions.
(b) Find a second-order homogeneous linear equation having $f(x) = \sin x$ and
$g(x) = x\sin x$ as solutions for $0 < x < \pi$.
(c) Find a second-order homogeneous linear equation having $f(x) = x$ and $g(x) = e^x$ as
solutions for all $x \ne 1$.
34. It's sometimes erroneously inferred from insufficient evidence that the Wronskian
determinant

$$W[f, g](x) = \begin{vmatrix} f(x) & g(x) \\ f'(x) & g'(x) \end{vmatrix}$$

of two linearly independent functions $f$ and $g$ can't equal 0. Put this idea to rest by
computing $W[f, g](x)$ for all real $x$ when $f(x) = x^2$ and $g(x) = x|x|$. You need to use
the definition of derivative to compute $g'(0)$. What's true is that $W[y_1, y_2](x) \ne 0$ for
two independent solutions of a second-order equation.

SECTION 4 OSCILLATIONS
A second-order constant-coefficient linear differential equation

$$a\frac{d^2x}{dt^2} + b\frac{dx}{dt} + cx = f(t)$$

often has a physical interpretation that allows a neat classification of the solutions
and equations into distinct types, depending on the relations between the constants
$a$, $b$, $c$ and the function $f$. Equations of this kind are important not only because
of their direct physical applicability, but also because of the insight they yield about
related nonlinear phenomena. A typical mechanism that we can analyze using a
constant-coefficient equation is shown in Figure 11.3, in cross section. Automobile
shock absorbers and artillery recoil mechanisms are designed using the principles
illustrated here. The working parts consist of a piston that travels in a cylinder
containing fluid, and a spring that can expand and compress. A spring usually exerts
a force roughly proportional to its extension or compression from its equilibrium
position, denoted by 0 on the fixed scale. Thus if $x$ is the amount of displacement
from 0, then the force $f_S$ exerted by the spring is representable, according to Hooke's
law, by

$$f_S = -hx, \qquad h > 0,$$

where for small enough displacements $h$ is constant. We also assume that the frictional
force $f_F$ in the mechanism due to the viscosity of the fluid is proportional to
the velocity:

$$f_F = -k\frac{dx}{dt}, \qquad k > 0.$$

The time-dependent external force $f_E = f(t)$ acts independently of $f_S$ and $f_F$,
which are, in turn, assumed to act independently of one another, so that the total
force acting parallel to the scale in the figure is

$$f_S + f_F + f_E = -hx - k\frac{dx}{dt} + f(t).$$
FIGURE 11.3  Piston-and-spring mechanism in cross section, with a fixed scale marked -1, 0, +1.

On the other hand, general physical principles assert that this force must also be
equal to the mass $m$ of the moving parts times the acceleration $d^2x/dt^2$. Thus

$$m\frac{d^2x}{dt^2} + k\frac{dx}{dt} + hx = f(t). \tag{4.1}$$

We'll investigate various assumptions about $k$, $h$, and $f(t)$. The mass $m$ will be
a fixed positive constant. We consider first harmonic oscillation, also called free
oscillation, with external force $f$ identically zero.

4A Harmonic Oscillation
We assume that $k = 0$ and $f = 0$. These assumptions represent an ideal situation
that can only be approximated by the mechanism shown in Figure 11.3. Under these
assumptions the differential equation becomes

$$\frac{d^2x}{dt^2} + \frac{h}{m}\,x = 0.$$

The solutions all have the form

$$x(t) = c_1\cos\sqrt{h/m}\,t + c_2\sin\sqrt{h/m}\,t.$$

To interpret the solution easily it's good to rewrite it as follows. We choose an $\alpha$
such that

$$\cos\alpha = \frac{c_1}{\sqrt{c_1^2 + c_2^2}}, \qquad \sin\alpha = \frac{c_2}{\sqrt{c_1^2 + c_2^2}};$$

such an $\alpha$ can always be found if $c_1$ and $c_2$ are not both zero. Then letting
$A = \sqrt{c_1^2 + c_2^2}$ gives

$$x(t) = A\left(\cos\alpha\cos\sqrt{h/m}\,t + \sin\alpha\sin\sqrt{h/m}\,t\right) = A\cos\left(\sqrt{h/m}\,t - \alpha\right).$$

This last formula shows that $A$ is the amplitude of an oscillation about the equilibrium
position. The number $\omega_0 = \sqrt{h/m}$ is the number of complete oscillations
in time $2\pi$, called the circular frequency of the oscillation. Thus if $\omega_0 = 2$, both
$\cos(\omega_0t)$ and $\sin(\omega_0t)$ go through 2 oscillation periods in time $2\pi$. The graph of the
displacement as a function of time is shown in Figure 11.4 for the choice $\alpha = \pi/2$,
$A = 2$, and $h = m = 1$. Changing the number $\alpha$, called the phase angle, shifts the
graph to the right or left. The frequency of the oscillation depends only on the ratio
$h/m$; increasing $h$ or decreasing $m$ increases the frequency of the oscillation. That
this ideal oscillation is periodic is a direct consequence of the assumption that there
is no friction.

FIGURE 11.4  Displacement $x(t)$ as a function of $t$ for $\alpha = \pi/2$, $A = 2$, $h = m = 1$.
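The passage from the constants $c_1$, $c_2$ to the amplitude $A$ and phase angle $\alpha$ can be sketched in a few lines. The helper name `amplitude_phase` is an assumption for illustration. It uses the two-argument arctangent so that $\cos\alpha = c_1/A$ and $\sin\alpha = c_2/A$ hold in every quadrant, which a one-argument arctangent of $c_2/c_1$ does not guarantee when $c_1 \le 0$.

```python
import math

def amplitude_phase(c1, c2):
    """Return (A, alpha) with c1 = A*cos(alpha) and c2 = A*sin(alpha),
    so that c1*cos(w*t) + c2*sin(w*t) = A*cos(w*t - alpha)."""
    return math.hypot(c1, c2), math.atan2(c2, c1)
```

For example, $c_1 = 3$, $c_2 = 4$ should give $A = 5$, and the two forms of $x(t)$ should then agree at any sample time.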

EXAMPLE 1  If in Equation 4.1 we take $m = h = 1$, $k = 0$, and $f = 0$, then the resulting
differential equation

$$\frac{d^2x}{dt^2} + x = 0$$

has the solution

$$x(t) = A\cos(t - \alpha).$$

The initial conditions

$$x(0) = x_0, \qquad \frac{dx}{dt}(0) = v_0,$$

require that

$$A\cos(-\alpha) = x_0, \qquad -A\sin(-\alpha) = v_0.$$

We can solve these equations for $A$ and $\alpha$ in terms of $x_0$ and $v_0$ to get

$$A = \sqrt{x_0^2 + v_0^2}, \qquad \alpha = \arctan\frac{v_0}{x_0}.$$

In the special case when $x_0 = 0$, the initial displacement from equilibrium is zero,
and we find

$$A = |v_0|, \qquad \alpha = \pm\frac{\pi}{2}.$$

Thus the solution is

$$x(t) = |v_0|\cos\left(t \mp \frac{\pi}{2}\right) = \pm|v_0|\sin t.$$

The sign has to be chosen so that $\pm|v_0| = v_0$. Other special cases are treated in the
exercises.

4B Damped Oscillation
The piston in the mechanism shown in Figure 11.3 exerts a damping force that
depends on the viscosity of the medium where the piston moves. If we continue
to assume that $f = 0$ in Equation 4.1, then we have to deal with the differential
equation

$$\frac{d^2x}{dt^2} + \frac{k}{m}\frac{dx}{dt} + \frac{h}{m}\,x = 0.$$

The characteristic equation is

$$r^2 + \frac{k}{m}\,r + \frac{h}{m} = 0,$$
which has the roots

$$r_1 = \frac{1}{2m}\left(-k + \sqrt{k^2 - 4mh}\right), \qquad r_2 = \frac{1}{2m}\left(-k - \sqrt{k^2 - 4mh}\right).$$

We distinguish three cases depending on the discriminant $k^2 - 4mh$.

Overdamping: $k^2 - 4mh > 0$. Because $k$ and $h$ are both positive this case
occurs when $k > 2\sqrt{mh}$. Physically this inequality means that the friction constant
$k$ exceeds the constant $\sqrt{mh}$, depending on the spring stiffness $h$ and the mass $m$,
by a factor of more than 2. The effect of the assumption $k > 2\sqrt{mh}$ is to make the
roots of the characteristic equation satisfy

$$r_2 < r_1 < 0.$$

As a result, the general solution has the form

$$x(t) = c_1e^{r_1t} + c_2e^{r_2t},$$

where the exponentials decrease as $t$ increases.

EXAMPLE 2  We assume $m = 2$, $h = 1$, and $k = 3$, so that $k > 2\sqrt{mh}$. Then

$$r_1 = -\tfrac12, \qquad r_2 = -1,$$

so that the displacement from equilibrium at time $t$ is

$$x(t) = c_1e^{-t/2} + c_2e^{-t}.$$

A typical graph is shown in Figure 11.5. The maximum displacement occurs at just
one point, after which the displacement tends steadily to 0 as $t$ increases.

FIGURE 11.5  Graph of a typical overdamped displacement $x(t)$.

Underdamping: $k^2 - 4mh < 0$. This case occurs when $k < 2\sqrt{mh}$, so that,
relative to $\sqrt{mh}$, the friction constant $k$ is small. The characteristic roots are now
complex conjugates of one another:

$$r_1 = \frac{1}{2m}\left(-k + i\sqrt{4mh - k^2}\right), \qquad r_2 = \frac{1}{2m}\left(-k - i\sqrt{4mh - k^2}\right).$$

The general form of the displacement function is then

$$\begin{aligned}
x(t) &= e^{-kt/2m}\left(c_1\cos\frac{\sqrt{4mh - k^2}}{2m}\,t + c_2\sin\frac{\sqrt{4mh - k^2}}{2m}\,t\right) \\
     &= Ae^{-kt/2m}\cos\left(\frac{\sqrt{4mh - k^2}}{2m}\,t - \alpha\right),
\end{aligned}$$

where $A = \sqrt{c_1^2 + c_2^2}$, and $\alpha = \arctan(c_2/c_1)$.


FIGURE 11.6  Graph of $x(t) = e^{-t}\cos t$.

EXAMPLE 3  Take $h = 2$, $m = 1$, and $k = 2$. Then $k < 2\sqrt{mh}$, and the displacement at time $t$ is

$$x(t) = Ae^{-t}\cos(t - \alpha).$$

Figure 11.6 shows the graph of such a function with $A = 1$ and $\alpha = 0$. It's easy
to check that this choice for the constants $A$ and $\alpha$ gives a solution satisfying the
initial conditions

$$x(0) = 1, \qquad \frac{dx}{dt}(0) = -1.$$

Critical Damping: $k = 2\sqrt{mh}$. This case lies between overdamping and underdamping,
and it is critical in the sense that an arbitrarily small change in one of
the parameters $k$, $m$, or $h$ will disturb the equality $k = 2\sqrt{mh}$ and produce one
of the other two cases. Numerically, the case of critical damping is distinguished
by the equality of the characteristic roots: $r_1 = r_2 = -k/2m$. It follows that the
displacement function is given by

$$x(t) = c_1te^{-kt/2m} + c_2e^{-kt/2m}.$$

EXAMPLE 4  Take $m = h = 1$ and $k = 2$. Then

$$x(t) = c_1te^{-t} + c_2e^{-t}.$$

If $x(0) = x_0$ and $dx/dt(0) = v_0$, then

$$c_1 = v_0 + x_0, \qquad c_2 = x_0,$$

so that

$$x(t) = \left[(v_0 + x_0)t + x_0\right]e^{-t}.$$

Figure 11.7 shows four possibilities, depending on the size of $v_0$, the initial velocity.
A critically damped displacement is like an overdamped one in that there is no
oscillation from one side to the other of the equilibrium position. But with fixed
mass $m$ and spring constant $h$, the critical viscosity value $k = 2\sqrt{mh}$ produces the
most rapid return toward equilibrium from an initial position in which $v_0 = 0$. The
physical reason is that a higher viscosity produces a more sluggish return, and lower
viscosity allows oscillation.
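The three damping cases, together with the harmonic case $k = 0$, are decided entirely by the sign of $k^2 - 4mh$. A minimal sketch of that classification follows; the function name `classify` and the returned labels are assumptions for illustration, and exact comparison with zero is only appropriate for exactly representable parameter values.

```python
def classify(m, k, h):
    """Classify m x'' + k x' + h x = 0 (m > 0, h > 0, k >= 0) by k^2 - 4mh."""
    if k == 0:
        return "harmonic"
    disc = k * k - 4 * m * h
    if disc > 0:
        return "overdamped"
    if disc < 0:
        return "underdamped"
    return "critically damped"
```

The parameter choices $m = 2$, $k = 3$, $h = 1$; then $m = 1$, $k = 2$, $h = 2$; then $m = h = 1$, $k = 2$ used in this section's examples should fall into the overdamped, underdamped, and critically damped cases respectively.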
FIGURE 11.7  Critically damped displacement $x(t)$ starting at $x_0$, for four ranges of the
initial velocity: (a) $v_0 > 0$; (b) $v_0 = 0$; (c) $-x_0 < v_0 < 0$; (d) $v_0 < -x_0$.

4C Forced Oscillation
In the specific instances considered so far, the differential equation

$$m\frac{d^2x}{dt^2} + k\frac{dx}{dt} + hx = f(t)$$

has been subject to initial conditions $x(0) = x_0$, $dx/dt(0) = v_0$, but the external
force function $f$ has been assumed to be identically zero. The resulting free oscillation
is described by a solution of a homogeneous differential equation. (Note that
free doesn't imply undamped.) When $f$ is not identically zero, we speak of forced
oscillation. From a purely mathematical point of view, there is no reason why the
function $f$ on the right-hand side of the preceding differential equation cannot be
chosen to be an arbitrary continuous function for $t \ge 0$. However, a force function
$f$ that assumes large values could easily drive the oscillations outside the range
in which we can maintain the original assumptions used to derive the differential
equation. (For example, stretching a spring too far might change its characteristics
to the point of destroying its elasticity altogether.) For this reason, the function $f$ is
chosen in the examples and exercises to have a rather restricted range of values. In
every example we can use the decomposition of a solution $x(t)$ into homogeneous
and particular parts,

$$x(t) = x_h(t) + x_p(t).$$

The solution $x_h(t)$ has already been discussed earlier in this section for various
choices of $m$, $k$, and $h$ in the homogeneous differential equation. What remains to be
done is to discuss the effect of adding a solution of the nonhomogeneous equation.
If $k > 0$, the analysis given in the earlier examples shows that every homogeneous
solution tends to zero like an exponential of the form $e^{-kt/2m}$. Thus for values of
$t$ that make $kt/2m$ moderately large, the addition of the homogeneous solution
has a negligible effect. Such an effect is called transient, and the complementary
particular solution is called the steady-state solution.
EXAMPLE 5  If $f(t) = a_0\cos\omega t$, then the differential equation

$$m\frac{d^2x}{dt^2} + k\frac{dx}{dt} + hx = a_0\cos\omega t, \qquad k > 0,$$

has a particular solution of the form

$$x_p(t) = A\cos\omega t + B\sin\omega t$$

that we can find by the undetermined coefficient method of Section 3B. Substitution
of $x_p$ into the equation yields

$$(h - \omega^2m)A + \omega kB = a_0, \qquad \omega kA - (h - \omega^2m)B = 0.$$

It follows that

$$A = \frac{(h - \omega^2m)\,a_0}{(h - \omega^2m)^2 + \omega^2k^2}, \qquad B = \frac{\omega k\,a_0}{(h - \omega^2m)^2 + \omega^2k^2},$$

so the particular solution we get is

$$\begin{aligned}
x_p(t) &= \frac{a_0}{(h - \omega^2m)^2 + \omega^2k^2}\left((h - \omega^2m)\cos\omega t + \omega k\sin\omega t\right) \\
       &= \frac{a_0}{\sqrt{(h - \omega^2m)^2 + \omega^2k^2}}\cos(\omega t - \alpha),
\end{aligned}$$

where $\alpha = \arctan(\omega k/(h - \omega^2m))$. What we have found is just one of many solutions,
each satisfying different initial conditions. But since $k > 0$, the exponential in $x_h(t)$
decreases to zero so our particular solution is the steady-state solution. Notice that
$x_p(t)$ is much like the external force $f(t)$. Since $f(t) = a_0\cos\omega t$,

$$x_p(t) = \frac{1}{\sqrt{(h - \omega^2m)^2 + \omega^2k^2}}\,f\!\left(t - \frac{\alpha}{\omega}\right).$$

The choice $\omega^2 = h/m$ makes the maximum amplitude of $x_p$ equal to $a_0\sqrt{m}/(k\sqrt{h})$.
This choice of the frequency $\omega$ is called resonant because it produces a response
of large amplitude for values of the system parameter $k$ that are small relative to
$\sqrt{m/h}$. Notice that in this example of resonance we have $\alpha = \arctan(\infty) = \pi/2$,
so that

$$x_p(t) = \frac{a_0}{\omega k}\cos\left(\omega t - \frac{\pi}{2}\right).$$

Thus in a system with $\sqrt{m}$ large relative to $k\sqrt{h}$, a small external force may produce
vibrations of large amplitude if the external force oscillates at the resonant frequency.
For this reason, resonance can completely upset an operating system, even though
the external force remains small in magnitude.
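The steady-state amplitude and phase lag of this example are easy to tabulate for given parameters. The sketch below is an illustration only; the name `steady_state` is hypothetical. It computes the phase with the two-argument arctangent, which agrees with $\arctan(\omega k/(h - \omega^2m))$ when $h - \omega^2m > 0$ and returns $\pi/2$ exactly at resonance, where the one-argument form would divide by zero.

```python
import math

def steady_state(m, k, h, a0, w):
    """Amplitude and phase lag of x_p(t) = amp*cos(w*t - alpha) for
    m x'' + k x' + h x = a0 cos(w t), with k > 0."""
    p = h - w * w * m      # coefficient pairing with cos(w t)
    q = w * k              # coefficient pairing with sin(w t)
    return a0 / math.hypot(p, q), math.atan2(q, p)
```

With $m = 1$, $h = 4$, $k = 0.1$, $a_0 = 1$ and the resonant choice $\omega = 2$, the amplitude should equal $a_0\sqrt{m}/(k\sqrt{h}) = 5$ and the phase lag should be $\pi/2$.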
EXERCISES

Note. Some of the following exercises use the overdot notation $\dot x = dx/dt$ and
$\ddot x = d^2x/dt^2$ for first and second time derivatives, used also in Chapter 4, Section 1.
Each of the differential equations 1 to 6 generates a free (i.e., undriven) oscillation.
(a) Without solving it first, classify each equation according to type: harmonic,
overdamped, underdamped, or critically damped. (b) Find the general solution formula
for each equation.
1. $d^2x/dt^2 + 2\,dx/dt + x = 0$
2. $d^2x/dt^2 + 2\,dx/dt + 2x = 0$
3. $\ddot x + 9x = 0$
4. $\ddot x + 3\dot x + x = 0$
5. $\ddot x + a\dot x + a^2x = 0$, $a > 0$
6. $d^2x/dt^2 + \tfrac14\,dx/dt + \tfrac12 x = 0$
For each of the following general solutions of a second-order, constant-coefficient
equation, find the choice of the arbitrary constants that satisfies the corresponding
initial conditions. Sketch the solution that you find. Also find the differential equation
of lowest order that the solution satisfies.
7. $x(t) = c_1\cos 2t + c_2\sin 2t$; $x(0) = 0$, $dx/dt(0) = 1$
8. $x(t) = A\cos(3t - \phi)$; $x(0) = 1$, $dx/dt(0) = -1$
9. $x(t) = c_1te^{-2t} + c_2e^{-2t}$; $x(0) = 0$, $dx/dt(0) = -1$
10. $x(t) = Ae^{-t}\cos(2t - \phi)$; $x(0) = 2$, $dx/dt(0) = 2$
11. $x(t) = c_1e^{-2t} + c_2e^{-4t}$; $x(0) = -1$, $dx/dt(0) = -1$
Find the steady-state solution to each of the differential equations 12 and 13. Also
estimate the earliest time beyond which the transient solution remains less than
0.01, assuming initial conditions $x(0) = 1$, $\dot x(0) = 0$.
12. $d^2x/dt^2 + 2\,dx/dt + 2x = 2\cos 3t$ [Hint: Show $|x(t)| \le \sqrt2\,e^{-t}$.]
13. $d^2x/dt^2 + 3\,dx/dt + 2x = \cos t$
14. There is a well-known analogy between the behavior of a damped mass-spring system
and of an RLC electrical circuit. Here $L$ is the inductance (analogue of mass) of a
coil, $R$ is the resistance (analogue of friction constant) in the circuit, and $C$ is the
capacitance (analogue of reciprocal of spring stiffness) or ability of a capacitor to
store a charge. The differential equation satisfied by the charge $Q(t)$ on the capacitor
at time $t$ is

$$L\frac{d^2Q}{dt^2} + R\frac{dQ}{dt} + \frac{1}{C}\,Q = E(t),$$

where $E(t)$ is the voltage impressed on the circuit from an external source. The charge
$Q(t)$ is related to the current flow $I(t)$ by $I = dQ/dt$.
(a) Derive the relations that must hold between $R$, $L$, and $C$ in order that the response
of $Q(t)$ should be respectively underdamped, critically damped, and overdamped.
(b) Show that if $C = \infty$ (capacitor is absent) the equation for $I(t)$ is $L\,dI/dt + RI = E(t)$.
(c) Solve the equation in part (b) when $E(t) = E\sin\omega t$ and $I(0) = 0$, and show that,
if $t$ is large enough, the current response differs negligibly from

$$\frac{E}{Z}\sin(\omega t - \theta),$$

where $Z = \sqrt{R^2 + \omega^2L^2}$ and $\cos\theta = R/Z$. The function $Z(\omega)$ is called the
impedance of the circuit in response to the sinusoidal input of frequency $\omega$.
*(d) Show that, if $0 < C < \infty$, a long-term response to the input voltage
$E(t) = E\sin\omega t$ is $E(t + \alpha)/Z(\omega)$, where $Z(\omega) = \sqrt{R^2 + [\omega L - 1/(\omega C)]^2}$,
$\tan\alpha = (C^{-1} - \omega^2L)/(\omega R)$.
15. We can determine the range of validity of Hooke's law for a given spring, and the
corresponding value of a Hooke constant $h$, as follows. Hang the spring with known
weights $w_j = m_jg$, $j = 1, \ldots, n$, attached to the free end. If the additional extension
is always proportional to the additional weight, then Hooke's law is valid for this
range of extensions, and $h$ is the constant of proportionality. A similar procedure
applies to compression.
(a) Assume distance units are in feet. A spring with a 5-pound weight appended has
length 6 inches, but with an 8-pound weight appended has length 1 foot. If the spring
satisfies Hooke's law with constant $h$, find $h$.
(b) What if distance is measured in meters and force in kilograms in part (a)? (There
are about 3.28 feet in a meter and 2.2 pounds in a kilogram.)
(c) Suppose we know Hooke's constant to be $h = 120$ for a certain spring. We observe
that between hanging a 20-pound weight from it and then a larger weight we get an
additional extension of 6 inches. How big is the larger weight?
(d) A spring is compressed to length 20 cm by a force of 5 kg, to length 10 cm by a
force of 6 kg, and to 5 cm by a force of 7 kg. Discuss the possible validity of Hooke's
law given this information.
538 Chapter 11 Second-Order Equations

16. Answer the following questions about solutions x(t) of 26. cos t - sin t , sin t
the forced harmonic oscillator x + hx = cos w t, if 27. A weight of mass 111 = 1 is attached by springs with
x(0) = (h - w 2 ) - 1 and i{O) = 0. Hooke constants h l, h2 to two fixed vertical supports. The
(a) If h = 2, what value should w have to make the weight oscillates along a horizontal line with negligible
response amplitude equal 4? friction. By analyzing the force due to each spring, show
{b) If w = 2, what value should h have to make the that the displacement x = x (t) of the weight from
response amplitude equal to 5? equilibrium satisfies x = -(111 + h2)x .
(c) If h = 2, what is the unique positive value of w
28. A weight of mass m is attached by springs with Hooke
for which the response x(t) becomes unbounded as
constants h 1, h2 to two fixed vertical supports. The weight
t -4 oo?
oscillates along a horizontal tine with negligible friction.
(d) If w = 10, for what range of h-valucs will the
Let the respective unstressed lengths of the two springs be
response amplitude remain between 3 and 4?
I 1, t 2 , and let b denote the distance between the supports.
17. Answer the following questions about solutions x(t) to (a) By analyzing the force due to each spring, show that
the damped and unforced equation mx
+ ki + hx = 0. the displacement x = x(t) of the weight from the
(a) If m = 2, how should h and k be related so that the support attached to the first spring satisfies
solutions will be oscillatory?
(b) If h = k = 1, how should m be chosen so that all
nontrivial solutions will oscillate?
(c) If m = h = I. how should k be chosen so that x(t) (b) Use the result of part {a) to show that the equilibrium
has circular frequency w = ½? value for x(t) is
(d) If m = h = 1 how should k be chosen so that x(t)
is oscillatory?
18. Answer the following questions about solutions x(t) of
the damped and forced equation mx =
+ ki + hx cos w t . (c) Show that the constant value Xe of part (b) is the
(a) If m = k = h = 1, how should w be chosen so that solution to the differential equation that satisfies the
the amplitude of the steady-state solution will be 1? initial conditions x(0) = Xe, i(0) = 0.
(b) If k = w =
1, what relation must hold between m 29. The recoil mechanism of an artillery piece is designed
and h so that the steady-state amplitude will be 1? containing a linearly damped spring mechanism. The
{c) If m = h = w = I, how should k be chosen so that spring stiffness h and the damping factor k should be
the amplitude of the steady-state solution will be 2? chosen so that after firing the gun barrel will tend to its
(d) What polynomial relation must hold among m, h, k, original position before firing without additional oscilla-
and w if the frequency of the transient solution is tion. We'll assume a given initial velocity Vo and ma~s m
to be the same as the steady-state frequency? As a for the gun barrel during recoil, and also a fixed maximum
special case, show that if the homogeneous solutions recoil distance E, always attained by the barrel.
have frequency w, and also h = mw 2 , then there is {a) How should h and k be chosen so that under these
no transient solution. [Hint: Show k = 0.) conditions the gun barrel undergoes critical damping
Find the amplitude A, the frequency w/2rr and a phase after firing?
angle </J for each of the following periodic functions. (b) Write the differential equation for displacement x(t)
so as to display dependence on the parameters Vo
19. 2 cost + 3 sin t
and E instead of h and k.
20. - 2 cos 2t + 3 sin 2t
30. During construction of a suspension bridge, two towers
21. sinm + 2cosnl have been erected, and a IO-ton weight is suspended
22. sin(3t) between the towers by a cable anchored to both tow-
ers. Because of the elastic properties of the cable and the
By how much are each of the following pairs of oscil- towers, it takes a ½-ton force to move the weight side-
lations out of phase? You can decide this by expressing ways by 0.1 feel. An earthquake moves the base of each
each pair in the general form A cos(wt-</J), cos(wt-i/f). tower sideways with identical displacements of the form
23. (-v13/2)cost+(l/2)sint, cost 0.25 cos 61 feet in t seconds. Assume a linear model for
the lateral force on the weight and that damping is negli-
24. (I /2) cost + (-v13/2) sin t, cost gible. Find Hooke's constant h and the natural unforced
25. (I/Ji) cost+ (I/Ji) sin t , sin, frequency of oscillation for the weight.
Section 4C Oscillations 539
31. The differential equation (c) Show that 4mh - k 2 > 0 is equivalent to assuming
mx + ki + hx = ao sin wt that the transient response xh(t) is oscillatory.
(d) Show that if k 2 ~ 2hm then F'(w) = 0 only
determines the displacement x(t) of a damped spring with when w = 0 and that F(w) is strictly increasing
external forcing f (t) = ao sin wt as a function of time t. for w ~ 0. Hence conclude that in this case the
A frequency w that maximizes the amplitude of x (t) is maximum response is laol/ h, and occurs only for
called a maximum resonance frequency. This exercise the constant f (t) = ao .
asks you to investigate the maximum resonance frequency
for fixed w, and under various assumptions about the 33. The purpose of this exercise is to show that if m, k and
mechanism. h are positive constants, then for large enough t each
particular solution of mx+ki+hx = bo sin wt is bounded
(a) Consider the mechanically ideal case where k = 0.
by a number proportional to lbol -
Show that choosing w to equal the natural circular
(a) Show that the solutions of the associated homoge-
frequency we, = ,,/h7m produces a response x(t)
that contains the factor t and hence has deviations neous equation all tend to zero as t ---+ oo. (These
from the equilibrium position x = 0 that become are the transient solutions.)
(b) Show that
arbitrarily large as t increases. (Thus there is no the-
oretical maximum resonance frequency in this case, -bo
though in practice the maximum response will be Xp(t) = k2w2 + (h - mw2 )2
limited by the structural capacity of the mechanism
to accommodate wide deviations from equilibrium.) x (kw cos wt - (h - mw2 ) sin wt)
(b) k is a fixed positive number in Example 5 of the
is a particular solution and that it satisfies lxp(t)I .:::
text. Show that the steady-state displacement xp(t)
has maximum amplitude when h and m are chosen lbol/Jk 2w2 + (h - mw 2 )2.
so that ,/hTm = w and that the maximum amplitude [Hint: See Example 5 of the text.]
is ao(wk)- 1• (Making such choices for h and m (c) Show how to conclude from the results of (a) and
constitute tuning of the mechanism for maximum (b) that every solution is bounded for t ~ 0.
response.) 34. The purpose of this exercise is to observe the effect on
(c) Continuing with the ideas of part (b), show that we the individual solutions of the initial-value problem
get a small response amplitude in the steady-state
solution to a given forcing frequency w by making x + hx = sin wt, x(0) = i(O) = 0
lh - w 2ml l&rge.
of letting the parameter w approach the positive constant
32. In the previous exercise we assumed the input frequency ../ii. (The differential equation represents a highly ideal-
in Example 5 of ihe text to be fixed and considered the ized situation from a physical point of view, because there
effect of varying h and m in the differe~tial equation. is no damping term.)
Suppose now that h, m and k are fixed positive numbers
(a) Show that the unique solution to the initial-value
and that we want to choose w so as to maximize the
problem with w 'I= ../ii is
amplitude of the response xp(t).
(a) Show that the amplitude factor
112
x(t) = h ~ w2 (sin wt - ~ sin ./hr),
p(w) = ( (h - w 2m) 2 + k2w2)-
and that the solution satisfies
of xp(t) is maximized when the function F(w) =
(h - w 2m)2 + k 2w 2 is minimized. 1 + w/,,/h
lx(t)I .'.:: lh - w21 for all values of t.
(b) Show that if k 2 < 2hm, then F'(w) = 0 when
w 2 = (2hm - k 2 )/(2m 2). Conclude from this that
in this case the maximum response amplitude
(b) Show that as w approaches ../ii, the solution values
found in part (a) approach
2mlaol
occurs for w = woy~
l - v:;;;;, _!_ (-,,/ht cos ..fiit + sin ..fiit) .
k,,/4hm - k 2 2h
where wo = .,/Tilm is the natural circular frequency Show also that in contrast to the inequality in part
of the undamped (k = 0), unforced (ao = 0) (a), this function oscillates with arbitrarily large
mechanism. amplitude as I tends to infinity.

(c) Find an initial-value problem that has the function (b) What is the phase difference between cos(wt - a)
obtained in part (b) as a solution. and sin(wt - fJ)? [Hint: Express the second one in
[Hi11t: What happens to the original differential equation terms of cosine.]
as w-+ ./h?] 38. Let f(t) =
sin at+ sinfJt where a and f3 are positive
35. Suppose that an undamped, but forced, oscillator ha~
numbers.
the form (a) Show that if f3 = ra for some rational number r
n then f (t) is periodic for some period p > 0, i.e.,
x+ 2x = Lak cos kt. f(t + p) = f(t) for al1 t. Show also that p can be
k=O expressed as (possibly different) integer multiples of
both rr /a and rr / /3.
(a) Use the linearity of the differential equation to show *fb) Prove that if an f (t) of the form given above is
that it has the particular solution periodic with period p > 0 then f3 = ra for some
n
rational number r . Thus for example sin t + sin ,,/21
~ Ok can't be periodic. [Hint: Check that f(p) = 0 and
Xp(t) = ~ 2_ k2 COS kt.
f"(p) = 0. Then conclude that (a 2 - {3 2) sinap =
k=O
(a 2 - {3 2 ) sin /Jp = 0, so that either a = ±/3 or else
The trigonometric sum on the right in the differential ap and /Jp arc integer multiples of rr .]
equation is an example of a Fourier series, discussed 39. Suppose we want to construct a damped hannonic oscil-
in general in Chapter 14. An extension of such a sum lator with Hooke constant h = 2 and damping constant
to an infinite series can represent a very general class k = 3. What is the lower limit mo for the mass m such that
of functions. oscillatory solutions are possible? Docs oscillation occur
(b) How does the solution in part (a) change if the left form= mo?
side of the differential equation is replaced by x+4x
and the right side remains the same? In Exercises 40 to 43, suppose a physical process is
36. Consider the differential equation accurately modeled by a differential equation of the form

x+ 2Sx = 16cos 3t . d 2x dx
m- 2
+k- +hx = 0,
(a) Show that the equation has general solution
dt dt
with 111, k and h positive constants. It may be possible
x(t) = c1 cos St + c2 sin St+ cos 3t .
by observation to draw conclusions about the parameters
(b) Show that the particular solution satisfying x(O) = 0, in the underlying process. Given each of the following
= 0 is xp(t) = cos3t -cos St .
x(O) sets of information about the constants and a solution,
(c) Show that cos 3t - cos St = 2 sin 4t sin t. find the implications for the other constants.
(d) Use the result of part (c) to sketch the graph of the 40. m = I, x(t) = e- 31 cos6t
particular solution found in part (b) for O ~ t ~ 2rr.
41. h = I, x(t) = e- sin St
1

37. (a) The phase difference between cos(wt - a) and


cos(wt - /3) is a - /3. What is the time-shift required 42. k = I, x(t) = e- 12 cos(t/2)
1

to put the two oscillations in phase? 43. k = 3, h = 2, x(t) = e- 41 sin4t

SECTION 5 LAPLACE TRANSFORMS


The techniques described earlier in the chapter are used to find formulas for the
solution of initial-value problems such as

\[ y'' + ay' + by = f(t), \qquad y(0) = y_0, \quad y'(0) = y_1. \]

Recall that the routine so far has been to solve the homogeneous equation by finding
the roots of the characteristic equation, then solve the general nonhomogeneous
equation by an integration that involves not only $f(t)$ but two independent
homogeneous solutions. If we are willing to assume that $f(t)$ is defined for all $t \ge 0$,
and that $f(t)$ and the solution $y(t)$ don't grow too rapidly as $t \to \infty$, we can use
an alternative method that incorporates all of these steps and has for historical
reasons achieved considerable popularity in electrical engineering and control theory.
Experience with exponential integrating factors shows that it's natural to multiply a
solution $y(t)$ by a factor $e^{-st}$, where $s$ is some real or complex number, to get a
product $e^{-st}y(t)$. What seems less natural, but nevertheless turns out to be effective,
is to integrate with respect to $t$ between $0$ and $\infty$. This gives us an improper integral,
leading for example to the calculation

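5.1
\[ \int_0^\infty e^{-st} e^{at}\,dt = \frac{1}{s-a}, \]

which holds when $s - a > 0$. We'll use this result repeatedly, and to verify it we
first compute the partial integral

\[ \int_0^T e^{-(s-a)t}\,dt = \left[ -\frac{1}{s-a}\,e^{-(s-a)t} \right]_0^T
   = -\frac{1}{s-a}\,e^{-(s-a)T} + \frac{1}{s-a}. \]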
When $T \to \infty$, the exponential factor tends to zero, so letting $T \to \infty$ on both
sides proves the formula correct. If for some real or complex numbers $s$, an integral
of the form

\[ \int_0^\infty e^{-st} f(t)\,dt \]

converges to a function $\mathcal{L}[f]$ depending on $s$, then $\mathcal{L}[f]$ is called the Laplace
transform of the function $f$. The key to using the Laplace transform to solve
differential equations is the following formula:

5.2
\[ \int_0^\infty e^{-st} y'(t)\,dt = -y(0) + s\int_0^\infty e^{-st} y(t)\,dt, \]

which we prove under the assumptions (i) the improper integrals are convergent, and
(ii) $\lim_{t\to\infty} e^{-st}y(t) = 0$. Indeed, integration by parts of the partial integral on the left
gives

\[ \int_0^T e^{-st} y'(t)\,dt = \left[ e^{-st}y(t) \right]_0^T + s\int_0^T e^{-st} y(t)\,dt
   = e^{-sT}y(T) - y(0) + s\int_0^T e^{-st} y(t)\,dt. \]

Because of assumption (ii), the first term on the right tends to $0$ as $T \to \infty$.
Equation 5.2 follows by letting $T \to \infty$ in the preceding equation.

The example

\[ y' + 2y = 0, \qquad y(0) = 3, \]

is too simple to show the real advantages of using Laplace transforms, but it does
illustrate the general principles involved. We form the Laplace transform of both sides
of the differential equation by multiplying both sides by $e^{-st}$ and then integrating
from $0$ to $\infty$ with respect to $t$. The result is

\[ \int_0^\infty e^{-st} y'(t)\,dt + 2\int_0^\infty e^{-st} y(t)\,dt = 0. \]

Equation 5.2 allows us to rewrite the first integral, obtaining

\[ -3 + s\int_0^\infty e^{-st} y(t)\,dt + 2\int_0^\infty e^{-st} y(t)\,dt = 0. \]

Here we have used the assumption that $y(0) = 3$. We also rely on our knowledge of
the exponential nature of the solution to justify the assumptions (i) and (ii) needed
for the application of Equation 5.2. The previous equation can now be solved for the
Laplace transform of the solution $y(t)$ in the form

\[ \int_0^\infty e^{-st} y(t)\,dt = \frac{3}{s+2}. \tag{1} \]

Thus we have found not the solution $y(t)$, but its Laplace transform. However, if we
set $a = -2$ in Equation 5.1, and then multiply by 3, we get

\[ \int_0^\infty e^{-st}\,3e^{-2t}\,dt = \frac{3}{s+2}. \tag{2} \]

Since we already know from the general theory of this chapter that $y(t)$ must be an
exponential solution, there remains only the one question of the constants involved,
and we see by comparing Equations (1) and (2) that the solution

\[ y(t) = 3e^{-2t} \]
satisfies our requirements.
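
As an informal numerical check of this example, the following Python sketch approximates the improper integral of $e^{-st}\,3e^{-2t}$ by Simpson's rule on a truncated interval and compares it with $3/(s+2)$. The function name `laplace_numeric`, the cutoff `T`, and the step count `n` are illustrative choices and not part of the text; the truncation is harmless here only because the integrand decays exponentially.

```python
import math

def laplace_numeric(f, s, T=40.0, n=40000):
    """Approximate the integral of exp(-s*t) * f(t) over [0, T] by Simpson's rule.

    T is an assumed cutoff standing in for infinity; n is the number of
    subintervals (forced to be even, as Simpson's rule requires).
    """
    if n % 2:
        n += 1
    h = T / n
    total = f(0.0) + math.exp(-s * T) * f(T)
    for i in range(1, n):
        t = i * h
        weight = 4 if i % 2 else 2
        total += weight * math.exp(-s * t) * f(t)
    return total * h / 3

def y(t):
    """Candidate solution y(t) = 3 e^{-2t} of y' + 2y = 0, y(0) = 3."""
    return 3 * math.exp(-2 * t)

for s in (0.5, 1.0, 3.0):
    print(f"s = {s}: numeric = {laplace_numeric(y, s):.8f}, 3/(s+2) = {3 / (s + 2):.8f}")
```

The two columns should agree to many decimal places for each positive $s$, which is consistent with Equations (1) and (2).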

To apply the Laplace transform to differential equations with order higher than
one, we need a simple extension of Equation 5.2. To simplify the notation, we write
5.2 in the form

\[ \mathcal{L}[y'](s) = -y(0) + s\,\mathcal{L}[y](s). \]

Applying this equation to $\mathcal{L}[y''](s)$, the Laplace transform of $y''$, gives

\[ \mathcal{L}[y''](s) = -y'(0) + s\,\mathcal{L}[y'](s). \]

Applying the equation again gives

\[ \mathcal{L}[y''](s) = -y'(0) + s\{-y(0) + s\,\mathcal{L}[y](s)\}
   = -y'(0) - s\,y(0) + s^2\,\mathcal{L}[y](s). \]
The same routine leads after $n$ steps to the formula

5.3
\[ \mathcal{L}[y^{(n)}](s) = -y^{(n-1)}(0) - s\,y^{(n-2)}(0) - \cdots - s^{n-1}y(0) + s^n\,\mathcal{L}[y](s). \]

The assumptions (i) and (ii) needed for 5.2 have to be increased to

(a) the integrals $\int_0^\infty e^{-st} y^{(k)}(t)\,dt$ are convergent for $k = 1, 2, \ldots, n$.

(b) $\lim_{t\to\infty} e^{-st} y^{(k)}(t) = 0$ for $k = 0, 1, \ldots, n-1$.

EXAMPLE 2 We can solve the differential equation

\[ y'' - y' - 2y = 3e^{t}, \]

with initial conditions

\[ y(0) = 1, \qquad y'(0) = 0, \]

by applying the Laplace transform to both sides. For simplicity, we denote the
Laplace transform of $y$ by $Y$, that is, $Y(s) = \mathcal{L}[y](s)$. Using Equation 5.3 for $n = 1$
and $2$, together with the initial conditions, we find

\[ \mathcal{L}[y] = Y(s), \]
\[ \mathcal{L}[y'] = -y(0) + sY(s) = -1 + sY(s), \]
\[ \mathcal{L}[y''] = -y'(0) - s\,y(0) + s^2Y(s) = -s + s^2Y(s). \]

Because integration from $0$ to $\infty$ and multiplication by $e^{-st}$ are both linear operations,
the equation

\[ \mathcal{L}[y'' - y' - 2y] = \mathcal{L}[3e^{t}] \]

simplifies to

\[ \mathcal{L}[y''] - \mathcal{L}[y'] - 2\mathcal{L}[y] = 3\mathcal{L}[e^{t}]. \]

The expressions found for $\mathcal{L}[y]$, $\mathcal{L}[y']$, and $\mathcal{L}[y'']$, together with 5.1, allow us to
write the equation as

\[ [-s + s^2Y(s)] - [-1 + sY(s)] - 2[Y(s)] = \frac{3}{s-1}. \]

Rearrangement gives

\[ (s^2 - s - 2)Y(s) = \frac{3}{s-1} + s - 1 = \frac{s^2 - 2s + 4}{s-1} \]

or

\[ Y(s) = \frac{s^2 - 2s + 4}{(s-1)(s^2 - s - 2)}. \]
Having found an expression for $Y(s)$, our problem is now to identify precisely the
solution $y(t)$ that satisfies $\mathcal{L}[y](s) = Y(s)$. Because $Y(s)$ is a rational function, it can
theoretically always be broken down according to the partial fraction decomposition
usually associated with the computation of indefinite integrals. In our example the
decomposition works because the denominator of $Y$ factors. We need to determine
the coefficients $A$, $B$, and $C$ in

\[ \frac{s^2 - 2s + 4}{(s-1)(s+1)(s-2)} = \frac{A}{s-1} + \frac{B}{s+1} + \frac{C}{s-2}. \]

Multiplying through by $(s - 1)$ with $s \ne 1$, and then letting $s$ go to $1$, gives $A = -\tfrac{3}{2}$.
Similarly, we multiply by $(s + 1)$ and then set $s = -1$ to get $B = \tfrac{7}{6}$. Finally, we
multiply by $(s - 2)$ and then set $s = 2$ to get $C = \tfrac{4}{3}$.
As a result, we have

\[ Y(s) = \frac{-\tfrac{3}{2}}{s-1} + \frac{\tfrac{7}{6}}{s+1} + \frac{\tfrac{4}{3}}{s-2}. \]

Equation 5.1 now allows us to identify $y(t)$ as

\[ y(t) = -\tfrac{3}{2}e^{t} + \tfrac{7}{6}e^{-t} + \tfrac{4}{3}e^{2t}. \]

As a check on the computation, we can verify that $y(0) = 1$ and $y'(0) = 0$.
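
The check suggested above can be carried out mechanically. The sketch below, with the hypothetical names `TERMS`, `y`, and `residual`, stores the solution as a sum of exponentials $c\,e^{rt}$, so that each derivative just multiplies a term by $r$, and then evaluates the initial conditions and the residual of $y'' - y' - 2y = 3e^{t}$ at a few sample points. It is only a floating-point sanity check, not a proof.

```python
import math

# Pairs (r, c) for the terms c * exp(r * t) of the solution, using the
# partial fraction coefficients A = -3/2, B = 7/6, C = 4/3 found above.
TERMS = [(1, -3 / 2), (-1, 7 / 6), (2, 4 / 3)]

def y(t, order=0):
    """Return the derivative of the given order of y at t (order 0 is y itself)."""
    return sum(c * r ** order * math.exp(r * t) for r, c in TERMS)

def residual(t):
    """Left side minus right side of y'' - y' - 2y = 3 e^t."""
    return y(t, 2) - y(t, 1) - 2 * y(t) - 3 * math.exp(t)

print("y(0)  =", y(0.0))
print("y'(0) =", y(0.0, 1))
print("largest |residual| on [0, 2]:",
      max(abs(residual(k / 10)) for k in range(21)))
```

The printed initial values should be $1$ and $0$ up to rounding, and the residual should be at the level of floating-point noise.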
In the examples given previously, we have used the linearity of $\mathcal{L}$, the Laplace
transform operator. The property is formally expressed by the two equations

5.4
\[ \mathcal{L}[y_1 + y_2] = \mathcal{L}[y_1] + \mathcal{L}[y_2], \qquad
   \mathcal{L}[cy] = c\,\mathcal{L}[y], \quad c = \text{constant}. \]

These equations, together with Equation 5.3 for the Laplace transform of a derivative,
need to be supplemented in practice by the calculation of Laplace transforms
of specific functions as for example in Equation 5.1, which asserts that $\mathcal{L}[e^{at}](s) =
1/(s - a)$. Table 11.1 contains more than enough entries to do all the problems in this
section, although more elaborate tables may contain several hundred entries. Such
tables are meant to be used in both directions, so that while the entry

\[ \mathcal{L}[t^n] = \frac{n!}{s^{n+1}}, \qquad n = 0, 1, 2, \ldots \]

provides the transform of $f(t) = t^n$, it also provides, after division by $n!$, the inverse
transform

\[ \mathcal{L}^{-1}\!\left[\frac{1}{s^{n+1}}\right] = \frac{t^n}{n!}, \qquad n = 0, 1, 2, \ldots \]

For the proof that for every Laplace transform $Y$ there is a unique function $y$ such
that $\mathcal{L}[y] = Y$, we can refer to more theoretical accounts of the subject. All the
entries in Table 11.1 are computed using elementary integration techniques.
TABLE 11.1 Table of Laplace transforms.

      $f(t)$                                            $\mathcal{L}[f](s) = \int_0^\infty e^{-st} f(t)\,dt$

  1.  $1$                                               $1/s$
  2.  $t$                                               $1/s^2$
  3.  $t^n$, $n = 0, 1, 2, \ldots$                      $n!/s^{n+1}$
  4.  $e^{at}$                                          $1/(s-a)$
  5.  $te^{at}$                                         $1/(s-a)^2$
  6.  $t^n e^{at}$, $n = 0, 1, 2, \ldots$               $n!/(s-a)^{n+1}$
  7.  $\sin bt$                                         $b/(s^2+b^2)$
  8.  $\cos bt$                                         $s/(s^2+b^2)$
  9.  $t\sin bt$                                        $2bs/(s^2+b^2)^2$
 10.  $t\cos bt$                                        $(s^2-b^2)/(s^2+b^2)^2$
 11.  $e^{at}\sin bt$                                   $b/((s-a)^2+b^2)$
 12.  $e^{at}\cos bt$                                   $(s-a)/((s-a)^2+b^2)$
 13.  $e^{at} - e^{bt}$                                 $(a-b)/((s-a)(s-b))$
 14.  $ae^{at} - be^{bt}$                               $(a-b)s/((s-a)(s-b))$
 15.  $(b-c)e^{at} + (c-a)e^{bt} + (a-b)e^{ct}$         $(a-b)(b-c)(a-c)/((s-a)(s-b)(s-c))$
 16.  $\sin bt - bt\cos bt$                             $2b^3/(s^2+b^2)^2$

Partial fraction decomposition. For finding inverse transforms it's sometimes
essential to decompose a rational function $P(s)/Q(s)$, with degree of $P$ less than the
degree of $Q$, according to the following two rules; otherwise, long division shows
that $P(s)/Q(s) = T(s) + R(s)/Q(s)$, where the remainder $R(s)$ has lower degree than $Q(s)$.

1. If the denominator $Q(s)$ has the factor $(s - a)^m$ as the highest power of
   $s - a$ that divides $Q(s)$, then include in the decomposition of $P(s)/Q(s)$ the
   fractions of the form

   \[ \frac{A_j}{(s-a)^j}, \qquad j = 1, 2, \ldots, m. \]

2. If the denominator $Q(s)$ has the factor $(s^2 + ps + q)^n$ as the highest power of
   $s^2 + ps + q$ that divides $Q(s)$, then include in the decomposition of $P(s)/Q(s)$
   the fractions of the form

   \[ \frac{B_k s + C_k}{(s^2 + ps + q)^k}, \qquad k = 1, 2, \ldots, n. \]
EXAMPLE 3 To find the function $f(t)$ having Laplace transform

\[ F(s) = \frac{s+1}{(s-1)^2(s^2+1)}, \]

we decompose the function into a sum of fractions as follows:

\[ \frac{s+1}{(s-1)^2(s^2+1)} = \frac{A}{s-1} + \frac{B}{(s-1)^2} + \frac{Cs+D}{s^2+1}. \]

To compute $B$, we can multiply through by $(s - 1)^2$ and then set $s = 1$. We get
$B = 1$. The same kind of trick doesn't apply directly to the other coefficients, but
if we subtract $1/(s - 1)^2$ from both sides we find we can cancel $(s - 1)$ on the left
to get

\[ \frac{-s^2 + s}{(s-1)^2(s^2+1)} = \frac{-s}{(s-1)(s^2+1)} = \frac{A}{s-1} + \frac{Cs+D}{s^2+1}. \]

Now multiply by $(s - 1)$ and then set $s = 1$ to get $A = -\tfrac{1}{2}$. As a result,

\[ \frac{-s}{(s-1)(s^2+1)} = \frac{-\tfrac{1}{2}}{s-1} + \frac{Cs+D}{s^2+1}. \]

To find $C$ and $D$, we can multiply through by $(s - 1)(s^2 + 1)$ to get

\[ -s = -\tfrac{1}{2}(s^2 + 1) + (s - 1)(Cs + D). \]

Rearranging the powers of $s$ gives

\[ \tfrac{1}{2}s^2 - s + \tfrac{1}{2} = Cs^2 + (D - C)s - D. \]

We equate coefficients of like powers on both sides and find that $C = \tfrac{1}{2}$ whereas
$D = -\tfrac{1}{2}$. The result is that

\[ \frac{s+1}{(s-1)^2(s^2+1)} = \frac{-\tfrac{1}{2}}{s-1} + \frac{1}{(s-1)^2}
   + \frac{\tfrac{1}{2}s}{s^2+1} - \frac{\tfrac{1}{2}}{s^2+1}. \]

From the table of transforms, we conclude that

\[ \mathcal{L}^{-1}[F(s)] = -\tfrac{1}{2}e^{t} + te^{t} + \tfrac{1}{2}\cos t - \tfrac{1}{2}\sin t. \]

The coefficients $A$, $B$, $C$, and $D$ can also be computed by multiplying through by
$(s - 1)^2(s^2 + 1)$ at the first step and then equating coefficients of like powers of $s$
on both sides of the equation. The resulting linear equations can then be solved for
the coefficients.
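
A decomposition like the one in this example is easy to double-check with exact rational arithmetic. The sketch below (the names `F`, `decomposition`, and `samples` are illustrative) compares both sides at several rational points using Python's `fractions.Fraction`. After clearing denominators, the two sides are polynomials of degree at most 3, so exact agreement at more than three points away from the pole $s = 1$ confirms the identity.

```python
from fractions import Fraction

def F(s):
    """The transform F(s) = (s + 1) / ((s - 1)^2 (s^2 + 1))."""
    return (s + 1) / ((s - 1) ** 2 * (s ** 2 + 1))

def decomposition(s):
    """A/(s-1) + B/(s-1)^2 + (Cs + D)/(s^2 + 1) with the coefficients found above."""
    A, B, C, D = Fraction(-1, 2), Fraction(1), Fraction(1, 2), Fraction(-1, 2)
    return A / (s - 1) + B / (s - 1) ** 2 + (C * s + D) / (s ** 2 + 1)

# Rational sample points k/3, skipping k = 3 (the pole s = 1).
samples = [Fraction(k, 3) for k in range(-9, 10) if k != 3]
print(all(F(s) == decomposition(s) for s in samples))
```

Because `Fraction` arithmetic is exact, equality here is genuine equality and not agreement up to rounding.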

EXAMPLE 4 The differential equation

\[ y'' + 4y = 3\sin 2t \]

with initial conditions $y(0) = 1$, $y'(0) = -1$, transforms into

\[ -y'(0) - s\,y(0) + s^2Y(s) + 4Y(s) = 3\,\frac{2}{s^2+4}, \]

by using Equation 5.3 on the left side and Entry 7 of the transform table for the right
side. Using the given initial values, we get

\[ (s^2 + 4)Y(s) = \frac{6}{s^2+4} + s - 1 \]

or

\[ Y(s) = \frac{6}{(s^2+4)^2} + \frac{s}{s^2+4} - \frac{1}{s^2+4}. \]

To use entries 16, 8, and 7 in the table, we first write

\[ Y(s) = \frac{3}{8}\cdot\frac{16}{(s^2+4)^2} + \frac{s}{s^2+4} - \frac{1}{2}\cdot\frac{2}{s^2+4}. \]

From the table, we read directly

\[ y(t) = \tfrac{3}{8}(\sin 2t - 2t\cos 2t) + \cos 2t - \tfrac{1}{2}\sin 2t
        = -\tfrac{1}{8}\sin 2t + \cos 2t - \tfrac{3}{4}t\cos 2t. \]

The amount of arithmetic required is less than if we had used any of the methods
described earlier in the chapter; the reason is that we have relied heavily on the
already-computed table of Laplace transforms.
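
Since a slip in the coefficients is easy to make in a computation like this one, a quick numerical check is worthwhile. The sketch below assumes only the closed form $y(t) = -\tfrac{1}{8}\sin 2t + \cos 2t - \tfrac{3}{4}t\cos 2t$ and estimates the derivatives by central differences, so the check does not depend on differentiating by hand. The helper names `d1`, `d2`, and `residual` and the step sizes are illustrative choices.

```python
import math

def y(t):
    """Candidate solution of y'' + 4y = 3 sin 2t, y(0) = 1, y'(0) = -1."""
    return -math.sin(2 * t) / 8 + math.cos(2 * t) - 0.75 * t * math.cos(2 * t)

def d1(f, t, h=1e-5):
    """Central-difference estimate of the first derivative of f at t."""
    return (f(t + h) - f(t - h)) / (2 * h)

def d2(f, t, h=1e-4):
    """Central-difference estimate of the second derivative of f at t."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / (h * h)

def residual(t):
    """Approximate value of y'' + 4y - 3 sin 2t at t."""
    return d2(y, t) + 4 * y(t) - 3 * math.sin(2 * t)

print("y(0)  =", y(0.0))
print("y'(0) ~", d1(y, 0.0))
print("residuals:", [residual(t) for t in (0.5, 1.0, 2.0)])
```

The residuals are not exactly zero because finite differences are approximate, but they should be several orders of magnitude smaller than the terms of the equation.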

EXERCISES

In Exercises 1 to 4, compute directly, assuming that $y(t)$ has a transform $Y(s)$ and
is such that all required integrals and limits exist and are finite.

1. Integrate to verify that
   \[ \int_0^\infty e^{-st}e^{at}\,dt = \frac{1}{s-a}, \quad \text{for } s > a. \]

2. Use integration by parts to verify that
   \[ \mathcal{L}[t](s) = \int_0^\infty te^{-st}\,dt = \frac{1}{s^2}, \quad \text{for } s > 0. \]

3. Integrate once by parts to verify that if $y(0) = 1$ then
   \[ \int_0^\infty e^{-st}y'(t)\,dt = sY(s) - 1. \]

4. Integrate twice by parts to verify that if $y(0) = 2$ and $y'(0) = 3$, then
   \[ \int_0^\infty e^{-st}y''(t)\,dt = s^2Y(s) - 2s - 3. \]

In Exercises 5 to 10, by computing the appropriate integral, or by using Table 11.1
of Laplace transforms, compute $\mathcal{L}[f](s)$ where $f(t)$ is as follows:

5. t sin 2t 6. cos t + 2 sin t 7. t 2+ 2t - I ( d) Solve the differential equation y" = H (t - a), 0 <
a, with initial conditions y(0) = I, y'(0) = 0.
8. cos(t + a) 9. (2t + l)e 31 10. e' + e-1

24. Let P (D) be an nth-order, linear, constant-coefficient,


Use Table 11.1 to find the inverse Laplace transforms of differential operator. Show that
the following functions /(s)-that is, to find y(t) such
that L[y](s) = F(s). f..,[P(D)y](s) = P(s)f..,[yJ(s) + Q(s),
I 2
11. - , - - 12. s2 +4 for some polynomial Q of degree n - 1. Use induction.
s~ - I
13. I s 25. (a) Show that if f (t) is defined and differentiable only
(s - 2) 2 +9 14. s2 - 4 for t > 0 (instead of t 2: 0), then
4s l l
15. 2 2 16 - - - - -
(s +4) • s2 (s - 1)2 f..,[f'](s) =- f (O+) + s£[f](s),
In Exercises 17 to 22. use the Laplace transform lo
solve the following initial-value problems. Check by where f(0+) = Jim f(t).
1--.0+
substitution. (b) Show that if the limits t<k>(o+) all exist, then
17. y'-y=t,y(0)=2 Equation 5.3 generalize to
18. y' +2y = I, y(0) =I f..,[f(n)](s) = _t<n-i>(o+) _ ... -s<n-1) f(O+)
19. y' + 3y = cos2t, y(0) = 0
20. y" + y = e- 1 + I. y(0) = -1, y'(0) = l
+ s"f..,[f](s).
21. 2y" - y'
21.
= 2cos3t, y(0) = 0, y'(0) = 2
y" + y' + y = l, y(0) = y'(0) = 0 26. The a~sumption that 1 00
1/'(t)I dt < oo implies that

23. (a) Define the Heaviside function


00
Jim 1 e- sr f'(t)dt = loo f'(t)dt.
H (t) ={ 0, ~f t < 0 ,. . . o+Jo Jo
I, If O ::st.
Show that, under the additional a~sumption that Jim f (t)
Show that ,(,[H(t -a)J(s) = (1/s)e-a". , ..... 00
exists,
(b) Show that if g(t) = H(t - a)f(t) for O ::S t and Jim f(t) = .v--.0+
Jim s,(,[f](s).
a 2: 0, then f-->00

.(,[g](s) = e-as f..,[j(t + a)](s). This formula makes it possible to determine something
about the long-run behavior of f from the behavior of
(c) Sketch the graph of H(t) - H(t - I). f..,[f](s) nears = 0, without finding f.

SECTION 6 CONVOLUTION
Let us review the solution of the second-order differential equation

\[ y'' + py' + qy = f(t), \qquad y(0) = y_0, \quad y'(0) = y_1. \]

Taking the Laplace transform of both sides gives

\[ (s^2 + ps + q)Y(s) = F(s) + y'(0) + s\,y(0) + p\,y(0). \]

The polynomial factor $P(s) = s^2 + ps + q$ on the left is the characteristic polynomial
of the operator $D^2 + pD + q$. The reciprocal $Q(s) = 1/P(s)$ is called the transfer
function of the operator, and if we multiply by $Q(s)$, or divide by $P(s)$, we get the
formula

\[ Y(s) = \frac{F(s) + y'(0) + s\,y(0) + p\,y(0)}{P(s)} \]

for the Laplace transform $Y = \mathcal{L}[y]$. The remaining step is to find the inverse
transform $y(t) = \mathcal{L}^{-1}[Y](t)$. The essence of the method is to use the Laplace transform
to reduce the solution of the problem to some routine algebraic manipulations.
In addition to the table of specific Laplace transforms in the previous section
(Table 11.1), there are a number of general formulas, such as Formula 5.3, that
are useful in solving problems. The most important of these answers the following
question: If $F(s)$ and $G(s)$ are the Laplace transforms of $f(t)$ and $g(t)$, respectively,
what function has Laplace transform equal to the product $F(s)G(s)$? It turns out that
under rather general hypotheses there is an answer, given by the convolution integral

\[ f * g(t) = \int_0^t f(u)\,g(t-u)\,du. \]

The function $f * g(t)$ is called the convolution of the functions $f(t)$ and $g(t)$ and is
defined for $t \ge 0$, provided that $f$ and $g$ are integrable on every finite interval. The
convolution $f * g$ is to be thought of as a kind of product of $f$ and $g$, and it turns
out that $f * g = g * f$, although this is not obvious from the definition. The basic
information about convolutions is summarized as follows.
6.1 Theorem. Let $f(u)$ and $g(u)$ be integrable on $0 \le u \le t$ for every positive
$t$; then $f * g$ and $g * f$ both exist and are equal, that is, convolution is commutative:

\[ f * g = g * f. \]

If $|f(t)|$ and $|g(t)|$ are such that $\mathcal{L}[|f|](s)$ and $\mathcal{L}[|g|](s)$ are both finite, then

\[ \mathcal{L}[f * g](s) = \bigl(\mathcal{L}[f](s)\bigr)\bigl(\mathcal{L}[g](s)\bigr). \]

Proof. The first statement follows from changing variable in the definition of $f * g$.
We have, on replacing $u$ by $t - v$,

\[ f * g(t) = \int_0^t f(u)\,g(t-u)\,du = -\int_t^0 f(t-v)\,g(v)\,dv
   = \int_0^t g(v)\,f(t-v)\,dv = g * f(t). \]

Proving the second statement involves an important technical point for which we
won't give a proof, but the rest of the argument is complete. To simplify the writing
of limits of integration, we can extend both $f(t)$ and $g(t)$ to have the value $0$ for
$t < 0$. Since $e^{-su}$ is independent of $v$, we can write

\[ \mathcal{L}[f](s)\,\mathcal{L}[g](s) = \int_{-\infty}^{\infty} e^{-su}f(u)\,du \int_{-\infty}^{\infty} e^{-sv}g(v)\,dv
   = \int_{-\infty}^{\infty} f(u)\left[\int_{-\infty}^{\infty} e^{-s(u+v)}g(v)\,dv\right]du. \]

We next make the change of variable $v = t - u$ in the inner integral to get

\[ \mathcal{L}[f](s)\,\mathcal{L}[g](s) = \int_{-\infty}^{\infty}\left[\int_{-\infty}^{\infty} f(u)\,e^{-st}g(t-u)\,dt\right]du. \]

Under our assumptions we can interchange the order of integration using a theorem
called Fubini's theorem. Then we have

\[ \mathcal{L}[f](s)\,\mathcal{L}[g](s) = \int_{-\infty}^{\infty}\left[\int_{-\infty}^{\infty} e^{-st}f(u)\,g(t-u)\,du\right]dt. \]

Because we have assumed $f(t)$ and $g(t)$ are zero for $t < 0$, the inner integral is zero
for $t < 0$. It follows that we need the $t$ integration only for $0 \le t < \infty$. Similarly
we need the $u$ integration only for $0 \le u \le t$. Hence

\[ \mathcal{L}[f](s)\,\mathcal{L}[g](s) = \int_0^\infty e^{-st}\left[\int_0^t f(u)\,g(t-u)\,du\right]dt
   = \mathcal{L}[f * g](s), \]

which is what we wanted to prove. ∎
EXAMPLE 1 From Table 11.1, we see that $\mathcal{L}[t](s) = 1/s^2$ and $\mathcal{L}[\sin t](s) = 1/(s^2 + 1)$.
It follows from Theorem 6.1 that

\[ \frac{1}{s^2}\cdot\frac{1}{s^2+1} = \mathcal{L}\left[\int_0^t (t-u)\sin u\,du\right](s). \]

Holding $t$ fixed, we can use integration by parts to show that

\[ \int_0^t (t-u)\sin u\,du = \bigl[-(t-u)\cos u\bigr]_0^t - \int_0^t \cos u\,du = t - \sin t. \]

We could have obtained the same result by computing a partial fraction decomposition
of the form

\[ \frac{1}{s^2}\cdot\frac{1}{s^2+1} = \frac{A}{s} + \frac{B}{s^2} + \frac{Cs+D}{s^2+1}, \]

and then finding the inverse transform of each term.
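
The convolution integral in this example can also be approximated directly. The following sketch (the function `convolve` and its step count `n` are illustrative assumptions, not from the text) evaluates $\int_0^t f(u)\,g(t-u)\,du$ by Simpson's rule, compares the result for $f(t) = t$, $g(t) = \sin t$ with $t - \sin t$, and also illustrates the commutativity asserted in Theorem 6.1.

```python
import math

def convolve(f, g, t, n=2000):
    """Simpson approximation of the convolution integral of f and g at t >= 0.

    n is the number of subintervals and must be even.
    """
    if t == 0:
        return 0.0
    h = t / n
    total = f(0.0) * g(t) + f(t) * g(0.0)
    for i in range(1, n):
        u = i * h
        weight = 4 if i % 2 else 2
        total += weight * f(u) * g(t - u)
    return total * h / 3

def identity(u):
    """The function f(t) = t."""
    return u

for t in (0.5, 1.0, 3.0):
    fg = convolve(identity, math.sin, t)
    gf = convolve(math.sin, identity, t)
    print(f"t = {t}: f*g = {fg:.10f}, g*f = {gf:.10f}, t - sin t = {t - math.sin(t):.10f}")
```

All three columns should agree closely for each value of $t$.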

Table 11.2 lists the most frequently used general properties of the Laplace transform.
The entries that haven't already been discussed follow from elementary calculus
techniques. The table omits the precise conditions under which each formula
holds. The distinction between Formulas 2 and 3 and Equations 5.2 and 5.3 of the
previous section occurs because in 5.2 and 5.3 we assumed that we were dealing
with solutions of differential equations and that these solutions had continuous
derivatives at $t = 0$. The corresponding formulas in Table 11.2 are valid under the
weaker assumption that $\lim_{t\to 0+} f^{(k)}(t)$ exists, but is not necessarily equal to $f^{(k)}(0)$.
(See Exercise 8 of Section 5.)
TABLE 11.2 General properties of the Laplace transform.

1. $\mathcal{L}[af + bg] = a\,\mathcal{L}[f] + b\,\mathcal{L}[g]$, $a$, $b$ constant

2. $\mathcal{L}[f'](s) = s\,\mathcal{L}[f](s) - f(0+)$, where $f(0+) = \lim_{t\to 0+} f(t)$

3. $\mathcal{L}[f^{(n)}](s) = s^n\mathcal{L}[f](s) - s^{n-1}f(0+) - s^{n-2}f'(0+) - \cdots - f^{(n-1)}(0+)$

4. $\mathcal{L}\left[\int_0^t f(u)\,du\right](s) = \dfrac{1}{s}\,\mathcal{L}[f](s)$

5. $\mathcal{L}[e^{at}f(t)](s) = \mathcal{L}[f](s - a)$

6. $\mathcal{L}[f(t - a)](s) = e^{-as}\mathcal{L}[f](s)$, $a > 0$, $f(t) = 0$ if $t < 0$

7. $\mathcal{L}[t f(t)](s) = -\dfrac{d}{ds}\mathcal{L}[f](s)$

8. $\mathcal{L}[f * g](s) = \mathcal{L}[f](s)\,\mathcal{L}[g](s)$

EXAMPLE 2 From Table 11.1, we find that

\[ \mathcal{L}[\sin t](s) = \frac{1}{s^2+1}. \]

Taking $f(t) = \sin t$ in Formula 4 of Table 11.2 gives

\[ \frac{1}{s(s^2+1)} = \frac{1}{s}\,\mathcal{L}[\sin t](s)
   = \mathcal{L}\left[\int_0^t \sin u\,du\right](s) = \mathcal{L}[-\cos t + 1]. \]

Hence

\[ \mathcal{L}^{-1}\!\left[\frac{1}{s(s^2+1)}\right] = -\cos t + 1. \]

Repeating the application of Formula 4 gives

\[ \frac{1}{s^2(s^2+1)} = \mathcal{L}\left[\int_0^t (-\cos u + 1)\,du\right] = \mathcal{L}[-\sin t + t]. \]

This establishes the formula

\[ \mathcal{L}^{-1}\!\left[\frac{1}{s^2(s^2+1)}\right] = -\sin t + t, \]

which was derived in Example 1 using convolution.

EXAMPLE 3 Starting with the formula

\[ \mathcal{L}[\cos t](s) = \frac{s}{s^2+1}, \]

we can apply Formula 7 in Table 11.2 to get

\[ \mathcal{L}[t\cos t](s) = -\frac{d}{ds}\,\frac{s}{s^2+1} = \frac{s^2-1}{(s^2+1)^2}. \]

Another application of the same formula gives

\[ \mathcal{L}[t^2\cos t](s) = -\frac{d}{ds}\,\frac{s^2-1}{(s^2+1)^2} = \frac{2s^3-6s}{(s^2+1)^3}. \]

FIGURE 11.8
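
The two differentiations in this example can be sanity-checked numerically. The sketch below applies a central-difference approximation of $-d/ds$ (Formula 7 of Table 11.2) to the closed forms and compares the result with the next transform in the chain. The names `F0`, `F1`, `F2`, and `neg_derivative`, and the step size `h`, are illustrative choices.

```python
def F0(s):
    """Transform of cos t."""
    return s / (s ** 2 + 1)

def F1(s):
    """Claimed transform of t cos t."""
    return (s ** 2 - 1) / (s ** 2 + 1) ** 2

def F2(s):
    """Claimed transform of t^2 cos t."""
    return (2 * s ** 3 - 6 * s) / (s ** 2 + 1) ** 3

def neg_derivative(F, s, h=1e-5):
    """Central-difference estimate of -dF/ds at s."""
    return -(F(s + h) - F(s - h)) / (2 * h)

for s in (0.5, 1.0, 2.0):
    print(f"s = {s}: -F0' = {neg_derivative(F0, s):.8f}, F1 = {F1(s):.8f}; "
          f"-F1' = {neg_derivative(F1, s):.8f}, F2 = {F2(s):.8f}")
```

Each estimated derivative should match the corresponding closed form to roughly the accuracy of the finite-difference step.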

EXAMPLE 4 To apply Formula 6 of Table 11.2 to the function $f(t) = t$, we define $f(t) = 0$ for
$t < 0$. The graphs of $f(t)$ and $f(t - a)$ are in Figure 11.8 for $a = 1$ and $a = 2$.
Each function is zero where it's not positive. From Formula 6 we find

\[ \mathcal{L}[f(t-1)](s) = e^{-s}\,\mathcal{L}[f(t)](s) = e^{-s}\,\mathcal{L}[t](s) = e^{-s}\,\frac{1}{s^2}. \]

Similarly,

\[ \mathcal{L}[f(t-2)](s) = e^{-2s}\,\mathcal{L}[f(t)](s) = e^{-2s}\,\mathcal{L}[t](s) = e^{-2s}\,\frac{1}{s^2}. \]
EXERCISES

In Exercises 1 to 4, find the convolution $f * g$ of the given pair of functions.

1. $f(t) = t$, $g(t) = e^{-t}$, $t \ge 0$
2. $f(t) = t$, $g(t) = (t^2 + 1)$, $t \ge 0$
3. $f(t) = 1$, $g(t) = 1$, $t \ge 0$
4. $f(t) = t$, $g(t) = \cos t$, $t \ge 0$

In Exercises 5 to 8, use the convolution of two functions to find the inverse Laplace transform of each of the given products of Laplace transforms.

5. $\dfrac{1}{s^2(s + 1)}$
6. $\dfrac{1}{(s - 1)(s - 2)}$
7. $\dfrac{1}{s^3}\,e^{-2s}$
8. $\dfrac{1}{(s^2 + 1)(s - 1)}$

In Exercises 9 to 14, use the formulas in Tables 11.1 and 11.2 to find the inverse Laplace transform of the given function.

9. $\dfrac{e^{-2s}}{s(s + 3)^2}$
10. $\dfrac{1}{s(s^2 + 4)}$
11. $\dfrac{1}{s^2 + 2s + 2}$
12. $\dfrac{1}{s^2 + 1}$
13. $(e^{-s} + 1)/s$
14. $\dfrac{s}{s^2 + 10}$

In Exercises 15 to 18, solve the given initial-value problem and check by substitution.

15. $y'' - y = \sin 2t + 1$, $y(0) = 1$, $y'(0) = -1$
16. $y'' + 2y = t$, $y(0) = 0$, $y'(0) = 1$
17. $y'' + y' = t + e^{-t}$, $y(0) = 2$, $y'(0) = 1$
18. $y'' + y = \sin t$, $y(0) = 0$, $y'(0) = 1$
19. Solve the equation $y' + y = \int_0^t y(u)\,du + t$, given that $y(0) = 1$.
20. Solve the equation $y' - y = \int_0^t y(u)\,du$, given that $y(0) = 1$.
21. (a) Use Formula 4 in Table 11.2 repeatedly to show that the Laplace transform of the $(n + 1)$-fold iterated integral of $f$ from $0$ to $t$ is
\[ \frac{1}{s^{n+1}}\,F(s). \]
(b) Use the Convolution Theorem 6.1 to show that the $(n + 1)$-fold iterated integral equals
\[ \frac{1}{n!}\int_0^t (t - u)^n f(u)\,du. \]
22. One possible definition of the gamma function, denoted by $\Gamma(z)$, is
\[ \Gamma(z) = \int_0^\infty t^{z-1} e^{-t}\,dt, \quad z > 0. \]
(a) Use integration by parts to show that $\Gamma(z + 1) = z\,\Gamma(z)$.
(b) Deduce from part (a) that $\Gamma(n + 1) = n!$, for $n = 0, 1, 2, \ldots$.
(c) Show that if $a > -1$, then
*23. The condition on the absolute values of $f$ and $g$ in Theorem 2.1, that $\mathcal{L}[|f|](s)$ and $\mathcal{L}[|g|](s)$ both be finite, is needed to allow interchange of integration order in the second part of the proof. These hypotheses are usually routine to verify in ordinary practice, but to see that it really is a restriction we need an example of an $f$ that has a transform but such that $|f|$ doesn't.
(a) Let
\[ f(t) = e^{t + e^t}\sin\left(e^{e^t}\right). \]
Show that
\[ \int_0^\infty e^{-st} f(t)\,dt = \int_e^\infty \frac{\sin u}{(\ln u)^s}\,du, \]
and that this is finite for all $s > 0$. [Hint: Express the second integral as an alternating infinite series.]
(b) Show that $\int_0^\infty e^{-st}|f(t)|\,dt = +\infty$ for all $s$. [Hint: Compare the analogue of the second integral above with a smaller, but divergent, infinite series.]
(c) Prove that if $\mathcal{L}[|f(t)|](s)$ is finite then so is $F(s) = \mathcal{L}[f(t)](s)$. [Hint: Compare $\left|\int_0^\infty e^{-st} f(t)\,dt\right|$ and $\int_0^\infty e^{-st}|f(t)|\,dt$.]

SECTION 7 NONLINEAR EQUATIONS


In the earlier sections of this chapter we've seen a complete treatment of the initial-value problem

\[ \ddot{y} + a\dot{y} + by = f(t), \quad y(t_0) = y_0, \quad \dot{y}(t_0) = z_0, \]

where $a$ and $b$ are real constants, and $f(t)$ is continuous on an interval $J$ containing $t_0$. In particular we proved the existence of a unique solution for the problem and exhibited a general solution formula of the form

\[ y(t) = c_1 y_1(t) + c_2 y_2(t) + y_p(t), \]

where $y_1(t)$ and $y_2(t)$ are solutions of the associated homogeneous equation with $f(t) = 0$ on $J$, and the constants $c_1$, $c_2$ are chosen to satisfy the initial conditions. The analogous problem where coefficients $a = a(t)$ and $b = b(t)$ in the differential equation are continuous functions on the interval $J$ has solutions of the same form, and this will follow from Theorem 7.1. However, Theorem 7.1 covers much more than linear equations; also included in its scope are nonlinear equations such as

\[ \ddot{y} = y\dot{y}, \quad\text{or}\quad \ddot{y} = t y^2, \quad\text{or}\quad \ddot{y} = -\sin y. \]


554 Chapter 11 Second-Order Equations

7.1 Existence and Uniqueness Theorem. Assume the function of three variables $f(t, y, z)$ and its two first-order partial derivatives $f_y(t, y, z)$ and $f_z(t, y, z)$ are continuous for $t$ in an interval $I$ and for all $(y, z)$ in an open rectangle $R$ containing $(y_0, z_0)$ shown in the figure. Then the initial-value problem

\[ \ddot{y} = f(t, y, \dot{y}), \quad y(t_0) = y_0, \quad \dot{y}(t_0) = z_0 \]

has a unique solution on some subinterval $J$ of $I$ containing $t_0$. If in addition there is a constant $B$ such that $|f_y(t, y, z)| \le B$ and $|f_z(t, y, z)| \le B$ for all $t$ in $I$ and all $(y, z)$ in $\mathbb{R}^2$, then the unique solution is defined on the entire interval $I$.

EXAMPLE 1 Of the three equations preceding Theorem 7.1, $\ddot{y} = -\sin y$ is the only one that satisfies the boundedness condition. In this example $|f_y(t, y, z)| = |\cos y| \le 1$, and $f_z(t, y, z) = 0$, so the initial-value problem has a solution for all real $t$, starting with arbitrary $t_0$, $y_0$, and $z_0$.

We'll consider in Sections 7A and 7B two special cases of the equation $\ddot{y} = f(t, y, \dot{y})$ for which we can sometimes find explicit solutions. The trick in each case is to do what we did with constant-coefficient linear equations: reduce the problem to a pair of successive integrations. These methods apply also to some linear equations, including the ones discussed earlier in this chapter.

7A Dependent Variable Absent: $\ddot{y} = f(t, \dot{y})$

Given that $\ddot{y} = f(t, \dot{y})$ the natural thing to do is first let $z = \dot{y}$, giving us a system of two first-order equations

\[ \dot{z} = f(t, z), \qquad \dot{y} = z \]

to solve first for $z$ and then for $y$. In the next example we solve a linear equation with a variable coefficient, one that we can't solve using the constant-coefficient methods of the previous sections in this chapter.

EXAMPLE 2 Suppose we want to solve the initial-value problem

\[ \ddot{y} = \frac{1}{t}\,\dot{y}, \quad y(t_0) = y_0, \quad \dot{y}(t_0) = z_0, \]

with $t_0 \ne 0$. Letting $z = \dot{y}$, this problem is equivalent to

\[ \dot{z} = \frac{1}{t}\,z, \quad z(t_0) = z_0, \]
\[ \dot{y} = z, \quad y(t_0) = y_0. \]

The top equation doesn't contain $y$ and is first-order linear with integrating factor $\exp\left(\int (-1/t)\,dt\right) = 1/t$, so the integrable form of the equation is

\[ (1/t)\dot{z} - (1/t^2)z = 0 \quad\text{or}\quad \frac{d}{dt}(z/t) = 0, \quad\text{with solution } z = c_1 t. \]

The initial condition on $z$ requires $c_1 = z_0/t_0$, so $z(t) = z_0 t/t_0$. To find $y$ we integrate $\dot{y} = z$ to get $y = \tfrac{1}{2} z_0 t^2/t_0 + c_2$. Finally, the initial condition $y(t_0) = y_0$ requires $c_2 = y_0 - \tfrac{1}{2} z_0 t_0$, so $y = \tfrac{1}{2} z_0 t^2/t_0 + y_0 - \tfrac{1}{2} z_0 t_0$. A routine check shows this to be the solution to the initial-value problem for $t > 0$ when $t_0 > 0$, or for $t < 0$ when $t_0 < 0$.
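The "routine check" can also be done numerically. The sketch below uses hypothetical function names and arbitrarily chosen sample values of $t_0$, $y_0$, $z_0$; it evaluates the closed-form solution of this example and estimates $\ddot{y} - \dot{y}/t$ by finite differences, which should come out close to zero.

```python
def solution(t, t0, y0, z0):
    """Closed form from the example: y = z0 t^2/(2 t0) + y0 - z0 t0/2."""
    return 0.5 * z0 * t**2 / t0 + y0 - 0.5 * z0 * t0

def residual(t, t0, y0, z0, eps=1e-3):
    """Finite-difference estimate of y'' - y'/t along the closed form."""
    def y(u):
        return solution(u, t0, y0, z0)
    d1 = (y(t + eps) - y(t - eps)) / (2 * eps)            # approximates y'
    d2 = (y(t + eps) - 2 * y(t) + y(t - eps)) / eps**2    # approximates y''
    return d2 - d1 / t

# sample data, chosen only for illustration
t0, y0, z0 = 2.0, 3.0, 4.0
print(solution(t0, t0, y0, z0))   # should reproduce y0
print(residual(5.0, t0, y0, z0))  # should be close to 0
```

Because the solution is a quadratic in $t$, the difference quotients are exact apart from rounding.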

7B Independent Variable Absent: $\ddot{y} = f(y, \dot{y})$

As usual we reduce the problem to two first-order problems by letting $dy/dt = z$ and considering the system

\[ \frac{dz}{dt} = f(y, z), \qquad \frac{dy}{dt} = z. \]

At this point we make an assumption, namely that $z$ is expressible as a differentiable function of $y$, an assumption that will be verifiable in practice. Under that assumption we apply the chain rule for functions of a single variable as follows:

\[ \frac{dz}{dt} = \frac{dz}{dy}\,\frac{dy}{dt}, \quad\text{or, since } \frac{dy}{dt} = z, \quad \frac{dz}{dt} = z\,\frac{dz}{dy}. \]

Thus the shortcut to remember is the replacement of $\dfrac{dz}{dt}$ by $z\,\dfrac{dz}{dy}$ in the top equation of our first-order system, yielding

\[ z\,\frac{dz}{dy} = f(y, z), \qquad \frac{dy}{dt} = z. \]

If we can solve the top equation for $z$ as a function of $y$ then we can put the result in the bottom equation with some hope of solving for $y = y(t)$. Note that we would then have $\dot{y} = z(y(t))$ as a check on the accuracy of our computations.

EXAMPLE 3 The initial-value problem $\ddot{y} = y\dot{y}$, $y(0) = 0$, $\dot{y}(0) = \tfrac{1}{2}$ breaks down, according to the outline above, into

\[ z\,\frac{dz}{dy} = zy, \quad z(0) = \tfrac{1}{2}, \]
\[ \frac{dy}{dt} = z, \quad y(0) = 0. \]

Assuming $z \ne 0$, we can divide by $z$ in the top equation to get the separable equation

\[ \frac{dz}{dy} = y, \quad\text{with solutions } z = \tfrac{1}{2}y^2 + c_1. \]

Using the two initial conditions $z(0) = \tfrac{1}{2}$ and $y(0) = 0$ we see that $c_1 = \tfrac{1}{2}$, so $z = \tfrac{1}{2}y^2 + \tfrac{1}{2}$. (Note that $z$ is never zero.) Putting this expression for $z$ into the bottom equation of the system gives the equation

\[ \frac{dy}{dt} = \tfrac{1}{2}y^2 + \tfrac{1}{2}, \quad\text{or, in separated form,}\quad \frac{dy}{y^2 + 1} = \tfrac{1}{2}\,dt. \]

Integration gives $\arctan y = \tfrac{1}{2}t + c_2$, or since $y(0) = 0$, $\arctan y = \tfrac{1}{2}t$. Hence the solution to the second-order initial-value problem is $y = \tan\left(\tfrac{1}{2}t\right)$.

Note. The solution $y = \tan\left(\tfrac{1}{2}t\right)$ found in the previous example has its domain restricted by the behavior of $\tan x$ near $x = \pm\pi/2$. Thus the solution is valid only when $-\pi/2 < t/2 < \pi/2$, or $-\pi < t < \pi$. This is an instance of Theorem 7.1 where the domain interval $J$ of a solution is actually smaller than the domain interval $I$ in which the differential equation makes sense. There is no way to tell from looking at the differential equation $\ddot{y} = y\dot{y}$ just what the interval $J$ will be, because $J$ depends critically on the initial conditions.
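To see the restricted domain numerically, we can integrate the system $\dot{y} = z$, $\dot{z} = yz$ with $y(0) = 0$, $z(0) = \tfrac{1}{2}$ and compare with $\tan(t/2)$. The sketch below uses the classical fourth-order Runge-Kutta step, which is a swapped-in technique and not one of the methods of this chapter; the function names and the step size are assumptions made for illustration.

```python
import math

def rk4_step(y, z, h):
    """One classical Runge-Kutta step for y' = z, z' = y*z."""
    def f(y, z):
        return z, y * z
    k1y, k1z = f(y, z)
    k2y, k2z = f(y + 0.5 * h * k1y, z + 0.5 * h * k1z)
    k3y, k3z = f(y + 0.5 * h * k2y, z + 0.5 * h * k2z)
    k4y, k4z = f(y + h * k3y, z + h * k3z)
    y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
    z += h * (k1z + 2 * k2z + 2 * k3z + k4z) / 6
    return y, z

def integrate(t_end, h=0.001):
    """Integrate from t = 0 with y(0) = 0, z(0) = 1/2 up to t_end."""
    y, z = 0.0, 0.5
    for _ in range(round(t_end / h)):
        y, z = rk4_step(y, z, h)
    return y, z

for t_end in (1.0, 2.0, 3.0):
    y, z = integrate(t_end)
    # compare with tan(t/2) and with the relation z = y^2/2 + 1/2
    print(t_end, y, math.tan(t_end / 2), z - (0.5 * y * y + 0.5))
```

The computed values grow rapidly as $t$ approaches $\pi$, in line with the blow-up of $\tan(t/2)$ there.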

EXAMPLE 4 Initially we best describe the motion of a pendulum with no force but gravity acting on it in terms of the angle $\theta = \theta(t)$ that the pendulum makes with a vertical line as shown in Figure 11.9. Here we'll assume an ideal pendulum with a rod of negligible mass attaching a weight of mass $m$ to the pivot, and with all mass concentrated at the weight's center of gravity at distance $l$ from the pivot. Figure 11.9 shows the set-up, where the possible positions for the center of mass describe a circle of radius $l$. The typical motion is a back and forth oscillation, such as we considered in Section 4.

The downward-directed gravitational force $F$ of magnitude $mg$ has a radial component $F_R$ of magnitude $|gm\cos\theta|$ and a component $F_T$ of magnitude $|gm\sin\theta|$ tangential to the circle. At any given position the gravitational force component in the direction of motion is perpendicular to the component directed along the length of the pendulum. The coordinates relative to these directions are $F_T = -gm\sin\theta$ and $F_R = -gm\cos\theta$. It follows by the Pythagorean relation that the sum of the squares of the perpendicular component magnitudes must be $g^2m^2$. The force $F_R$ acting along the length of the pendulum toward its end must be exactly balanced by an opposite force at the pivot, and these forces play no other role in our description of the motion. If $\theta$ is measured in radians, distance along the circular path of motion is $y = l\theta$, so we can express the force coordinate $F_T$ in the direction of motion as mass times acceleration: $F_T = m\,d^2(l\theta)/dt^2$. Equating our two expressions for $F_T$ gives the differential equation satisfied by $\theta = \theta(t)$:

\[ m\,\frac{d^2(l\theta)}{dt^2} = -gm\sin\theta. \]

FIGURE 11.9 Pendulum analysis.

The minus sign signifies that the signed velocity $d(l\theta)/dt$ is decreasing if $0 < \theta < \pi$ and increasing if $-\pi < \theta < 0$. If $y(t)$ stands for distance measured along the circle then $\theta(t) = y(t)/l$, leading to the alternative form

\[ \frac{d^2y}{dt^2} = -g\sin\frac{y}{l}. \]

Theorem 7.1 guarantees the existence of a unique solution $y(t)$ to an initial-value problem with $y(t_0) = y_0$, $\dot{y}(t_0) = z_0$, but this nonlinear equation has no solutions in terms of elementary functions.

If $y$ remains small, say $|y| < 0.1$, we may find approximate solutions that are acceptable for some purposes by using the tangent-line replacement for the graph of $\sin y$ near $y = 0$, namely $\sin y \approx y$. This approximation leads to the linear equation

\[ \frac{d^2y}{dt^2} = -\frac{g}{l}\,y. \]

It's routine to check that this differential equation has among its solutions

\[ y = \cos\sqrt{g/l}\,t \quad\text{and}\quad y = \sin\sqrt{g/l}\,t. \]

The solutions of the linear version of the pendulum equation are examples of harmonic oscillation, studied in detail in Section 4A. The solutions of the nonlinear pendulum equation are oscillatory also, but have their own distinct character, which we'll take up in the Exercises and in Section 8 on numerical methods. In particular, it's not at all obvious from looking at the differential equation $\ddot{y} = -\sin y$ that it has periodic solutions, but this is nevertheless true and doesn't depend on the periodicity of the sine function.
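How good is the replacement of $\sin y$ by $y$? The short sketch below tabulates the relative error for a few values of $y$, and also evaluates the period $2\pi\sqrt{l/g}$ of the linearized solutions $\cos\sqrt{g/l}\,t$ and $\sin\sqrt{g/l}\,t$. The function names and the numerical value $g = 9.8$ (meters per second squared) are assumptions made for illustration.

```python
import math

def relative_error(y):
    """Relative error made by replacing sin y with y (y nonzero)."""
    return abs(y - math.sin(y)) / abs(math.sin(y))

def linear_period(length, g=9.8):
    """Period 2*pi*sqrt(l/g) of cos(sqrt(g/l) t) and sin(sqrt(g/l) t)."""
    return 2 * math.pi * math.sqrt(length / g)

for y in (0.05, 0.1, 0.5, 1.0):
    print(y, relative_error(y))

print(linear_period(1.0))  # period for a pendulum of length 1
```

For $|y| < 0.1$ the relative error stays below about two tenths of a percent, while at $y = 1$ it is close to nineteen percent, which is why the linear equation is trusted only for small swings.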

EXERCISES

In Exercises 1 to 6, use the method of Section 7A to solve the initial value problem, stating explicitly for what values of the independent variable your solution is defined.

1. $t\ddot{y} + \dot{y} = 0$; $y(1) = 0$, $\dot{y}(1) = 1$
2. $t^2\ddot{y} + \dot{y}^2 = 0$; $y(1) = \dot{y}(1) = -1$
3. $\ddot{y} + \dot{y}^2 = 0$; $y(0) = 0$, $\dot{y}(0) = 1$
4. $t\ddot{y} + \dot{y} = 0$; $y(1) = \dot{y}(1) = 1$
5. $t\ddot{y} + \dot{y} = t^3$; $y(1) = \dot{y}(1) = 1$
6. $\ddot{y} + \dot{y} = 0$; $y(0) = \dot{y}(0) = 1$

In Exercises 7 to 12, use the method of Section 7B to solve the initial value problem, stating explicitly for what values of the independent variable your solution is defined.

7. $y\ddot{y} - \dot{y}^2 = 0$; $y(0) = \dot{y}(0) = 1$
8. $y^2\ddot{y} + \dot{y}^3 = 0$; $y(0) = \dot{y}(0) = 1$
9. $\ddot{y} - \dot{y}^3 = 0$; $y(0) = \dot{y}(0) = 1$
10. $\ddot{y} + \dot{y}^2 = 0$; $y(0) = 0$, $\dot{y}(0) = 1$
11. $\ddot{y} + \dot{y}^2 = 1$; $y(0) = \dot{y}(0) = 0$
12. $\ddot{y} - y^3 = 0$; $y(0) = \dot{y}(0) = \sqrt{2}$

13. Show that the differential equation $t^2\ddot{y} + \dot{y}^2 = 0$ has more than one solution satisfying $y(0) = \dot{y}(0) = 0$. Explain why this doesn't contradict the uniqueness part of Theorem 7.1.
14. Show that the differential equation $t\ddot{y} + \dot{y} = t^3$ has more than one solution satisfying $y(0) = \dot{y}(0) = 0$. Explain why this doesn't contradict the uniqueness part of Theorem 7.1.
15. (a) Solve the initial value problems for $\ddot{y} = -y$ and $\ddot{y} = y$, using the same initial conditions $y(0) = 0$, $\dot{y}(0) = 1$ for both equations.
(b) Sketch the graphs of both solutions of part (a).
(c) Use the complex exponential $e^{it}$ to find a relation between the solutions of part (a).
16. (a) Solve the initial-value problem $\ddot{y} = y\dot{y}$, $y(0) = 0$, $\dot{y}(0) = -\tfrac{1}{2}$.
(b) Sketch the graph of the solution to part (a).
(c) Use the complex exponential $e^{it/2}$ to find a relation between the solution to part (a) and the solution to the problem in Example 3 of the text.
17. (a) Apply the method of Section 7B to the initial-value problem
\[ \ddot{y} = -\sin y, \quad y(0) = y_0, \quad \dot{y}(0) = z_0 \]
for a nonlinear pendulum equation to establish the equation
\[ \tfrac{1}{2}\dot{y}^2 = \cos y - \cos y_0 + \tfrac{1}{2}z_0^2. \]
(b) Use the result of part (a) to show that if $-\cos y_0 + \tfrac{1}{2}z_0^2 > 1$ the pendulum will rotate "over the top" repeatedly.
(c) Use the result of part (a) to show that to have oscillatory motion, with angle $y$ strictly between $-\pi$ and $\pi$, we must have $-\cos y_0 + \tfrac{1}{2}z_0^2 < 1$.
18. (a) Apply the method of Section 7B to the initial-value problem
\[ \ddot{y} = -\sin y, \quad y(0) = \eta, \quad \dot{y}(0) = 0 \]
for a nonlinear pendulum equation to establish the equation
\[ \tfrac{1}{2}\dot{y}^2 = \cos y - \cos\eta, \]
where $-\pi < \eta < \pi$.
(b) Prove that the time $T(\eta)$ it takes for the pendulum in part (a) to fall from angle $y = \eta$ to the vertical position at angle $y = 0$ is
\[ T(\eta) = \int_0^{\eta} \frac{dy}{\sqrt{2(\cos y - \cos\eta)}}. \]
(c) Make a change of variable in the integral in part (b) chosen to show that $T(\eta) = K(k)$, where $k = \sin(\eta/2)$, with $0 < \eta < \pi$, and
\[ K(k) = \int_0^{\pi/2} \frac{d\phi}{\sqrt{1 - k^2\sin^2\phi}}, \quad k < 1, \]
which is an elliptic integral and isn't computable using elementary functions.

7C Phase Space
If we can't find an explicit formula for the solution of a second-order equation and we want to study a particular solution satisfying given initial conditions, one option is to apply numerical methods as discussed in Section 8 and another is to display solutions in what is called phase space, described here. To do either of these we first convert second-order equations $\ddot{y} = f(t, y, \dot{y})$, with initial conditions $y(t_0) = y_0$, $\dot{y}(t_0) = z_0$, into first-order systems as follows. Let $\dot{y} = z$ so that $\dot{z} = \ddot{y}$. Then $\dot{z} = f(t, y, \dot{y})$, and we can write the original second-order initial-value problem as

\[ \dot{y} = z, \quad y(t_0) = y_0, \]
\[ \dot{z} = f(t, y, z), \quad z(t_0) = z_0. \]

There are two advantages to this reformulation. A purely technical advantage is that it allows us to apply the first-order numerical methods of Chapter 10, Section 2 to second-order problems. The other advantage is conceptual, in that the first-order system gives equal weight to displaying the two fundamental quantities, position $y$ and velocity $z = \dot{y}$. The 2-dimensional $(y, z)$-space is the phase space of the second-order equation. For the purpose of plotting curves in phase space we restrict attention to equations $\ddot{y} = f(y, \dot{y})$ in which the function $f$ is not explicitly dependent on time $t$; such equations are called autonomous.

EXAMPLE 5 The harmonic oscillator equation $\ddot{y} = -\omega^2 y$ is equivalent to the system

\[ \dot{y} = z, \qquad \dot{z} = -\omega^2 y. \]

The solutions $y = A\cos(\omega t - \alpha)$ of the second-order equation correspond to solutions

\[ y = A\cos(\omega t - \alpha), \qquad z = -\omega A\sin(\omega t - \alpha) \]

of the first-order system. Since

\[ y^2 + \frac{z^2}{\omega^2} = A^2\cos^2(\omega t - \alpha) + A^2\sin^2(\omega t - \alpha) = A^2, \]

the vector functions $(y(t), z(t))$ trace out ellipses in the phase space. Figure 11.10(a) shows some curves in phase space and Figure 11.10(b) shows the corresponding solution graphs, which relate time and position. The phase curves relate the fundamental quantities position and velocity. In Figure 11.10 half the width of an ellipse corresponds to the amplitude of a graph.

FIGURE 11.10 (a) $y^2 + z^2/4 = A^2$. (b) $y = A\sin(2t)$.
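The ellipse relation of Example 5 can be confirmed by sampling points along a solution. In the sketch below the function names and the sample values of $A$, $\omega$, and $\alpha$ are chosen only for illustration; the quantity $y^2 + z^2/\omega^2$ should equal $A^2$ at every sampled time.

```python
import math

def phase_point(t, A, omega, alpha):
    """Point (y, z) on the phase curve of y = A cos(omega t - alpha)."""
    y = A * math.cos(omega * t - alpha)
    z = -omega * A * math.sin(omega * t - alpha)
    return y, z

def ellipse_invariant(y, z, omega):
    """The combination y^2 + z^2/omega^2, constant along a phase curve."""
    return y**2 + z**2 / omega**2

A, omega, alpha = 1.5, 2.0, 0.3  # illustrative values
for k in range(5):
    t = 0.4 * k
    y, z = phase_point(t, A, omega, alpha)
    print(t, y, z, ellipse_invariant(y, z, omega))
```

With $\omega = 2$ the invariant is $y^2 + z^2/4$, the form used in the caption of Figure 11.10(a).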

The previous example illustrates an important point about phase curves and periodic solutions $y = y(t)$ of second-order differential equations $\ddot{y} = f(y, \dot{y})$.

7.2 Periodicity Theorem. Assume $f_y(y, \dot{y})$ and $f_z(y, \dot{y})$ are bounded continuous functions of $y$ and $\dot{y}$, and that $y(t)$ is a solution of $\ddot{y} = f(y, \dot{y})$. Then $y(t)$ is periodic if and only if the corresponding phase space curve traced by $(y(t), \dot{y}(t))$ is a closed loop.

Proof. If $y(t)$ is periodic with period $P > 0$, that is $y(t + P) = y(t)$ for all $t$, then differentiation shows that also $\dot{y}(t + P) = \dot{y}(t)$. Thus the vector function defined by $(y, z) = (y(t), \dot{y}(t))$ traces a closed loop in the $yz$ phase space whenever $t$ traverses an interval of length $P$. Conversely, if this same vector function traces a closed loop when $t$ traverses an arbitrary interval $t_0 \le t \le t_0 + P$ of length $P$, then applying the Existence and Uniqueness Theorem 7.1 with the initial values $(y(t_0), \dot{y}(t_0)) = (y(t_0 + P), \dot{y}(t_0 + P))$ implies that the solution $y(t)$ repeats over intervals $t_0 + kP \le t \le t_0 + (k + 1)P$ for integer $k$ and so is periodic.

Example 5 is about a linear equation that has explicit solutions in terms of the
periodic functions sine and cosine. Since the corresponding phase curves are ellipses,
Theorem 7.2 implies that a solution must be periodic with period $P$ equal to the time
it takes to traverse the corresponding ellipse.

The next example is the nonlinear pendulum equation, for which the solutions are
well understood but for which there are no simple formulas.

EXAMPLE 6 The simplest form of the pendulum equation is $\ddot{y} = -\sin y$. Letting $z = \dot{y}$ we arrive at the equivalent 2-dimensional system

\[ \dot{y} = z, \qquad \dot{z} = -\sin y \]

for the phase space variables $y$ and $z$. We interpret $y$ as a displacement angle in radians and $z$ as its angular velocity. Applying the method of Section 7B, we solve

\[ z\,\frac{dz}{dy} = -\sin y \quad\text{to get}\quad \tfrac{1}{2}z^2 = \cos y + c, \quad\text{or}\quad z = \pm\sqrt{2\cos y - 2\cos y_0 + z_0^2}, \]

where $y_0$ and $z_0$ are initial values for $y$ and $z$. To get a real value for $\dot{y}$ we must have $-2 \le -2\cos y_0 + z_0^2$. Furthermore, if $-2\cos y_0 + z_0^2 > 2$, then $z$ is either a positive or negative periodic function of $y$ whose graph can't cross the $y$ axis to form the closed loop that goes with a solution that's periodic as a function of $t$. Thus the solutions $y(t)$ that are periodic are generated by initial conditions satisfying

\[ -2 \le -2\cos y_0 + z_0^2 < 2; \]

we can think of forming the corresponding phase curves by joining at the $y$ axis the two graphs we get by choosing opposite signs for the square root. See Figure 11.11(a). The periodic rotational curves above and below the $y$ axis go with "over-the-top" motions of the pendulum, motions with increasing angle for $z = \dot{y} > 0$ and decreasing angle for $z = \dot{y} < 0$. On the closed loops the motions are "back-and-forth" periodic, with the top part of the loop representing increasing angle $y$ and the bottom part decreasing angle $y$. The periodicity of the periodic solutions $y = y(t)$ of $\ddot{y} = -\sin y$ depends in no way on the periodicity of $\sin y$. For comparison we include a phase portrait for a somewhat more realistic linearly damped pendulum; the top two curves in Figure 11.11(b) represent motions that start out rotating over the top but after a while are damped down to swinging back and forth with decreasing amplitude. We take up the plotting of these curves in Section 8.
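The inequalities of this example amount to a test on the single number $-2\cos y_0 + z_0^2$. The sketch below packages that test; the function names, the returned labels, and the tolerance used to detect the borderline values are assumptions made for illustration.

```python
import math

def energy_level(y0, z0):
    """The quantity -2 cos(y0) + z0^2 that appears in the inequalities."""
    return z0**2 - 2 * math.cos(y0)

def classify(y0, z0, tol=1e-12):
    """Classify the phase curve through (y0, z0) for y'' = -sin y."""
    e = energy_level(y0, z0)
    if e > 2 + tol:
        return "rotational"
    if e >= 2 - tol:
        return "separating curve or unstable equilibrium"
    if e <= -2 + tol:
        return "stable equilibrium"
    return "oscillatory"

print(classify(1.0, 1.0))      # initial values used later in Section 8
print(classify(0.0, 2.1))      # enough speed to go over the top
print(classify(0.0, 0.0))      # hanging at rest
print(classify(math.pi, 0.0))  # balanced straight up
```

The borderline value 2 corresponds to the separating curves and the unstable equilibrium points described in the Note that follows.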

Note. Other than closed loops and rotational traces, we list three types of special points and curves in Figure 11.11(a). The unstable equilibrium points and separating curves listed as types 2 and 3 are highly theoretical, and are impossible to realize mechanically.

1. Single points $(y_1, 0)$ on the $y$ axis at the center of closed loops represent vertical stable equilibrium positions of the pendulum, hanging down, and with velocity $z = 0$.
2. Single points $(y_2, 0)$ midway between the points of type 1 represent vertical unstable equilibrium positions of the pendulum, balanced up, with velocity $z = 0$.
3. Curves extending between two points of type 2 separate the rotational motions from the back-and-forth motions and are the phase-space traces of motions that tend away from and toward unstable equilibrium without ever attaining it.

FIGURE 11.11 Phase portraits for (a) $\ddot{y} = -\sin y$ and (b) a linearly damped pendulum equation. The horizontal axis is $y$ and the vertical axis is $z = \dot{y}$.

The single points referred to as type 2 are traces in phase space of constant solutions $y(t) = C$ of the differential equation $\ddot{y} = -\sin y$. Note that $\dot{y}(t) = 0$ so the corresponding phase space plot of a constant solution is a point $(y, z) = (C, 0)$ on the $y$-axis. In general a constant solution of a differential equation must satisfy $\dot{y}(t) = 0$, and is called an equilibrium solution because the position $y(t)$ doesn't change over time. The pendulum example shows that an equilibrium may be very unstable, as in the upward vertical position $(y, z) = (\pi, 0)$ of a pendulum. In general an equilibrium point $(y, z) = (C, 0)$ is called stable if all phase curves starting sufficiently close to it remain close to it. Otherwise the point $(y, z) = (C, 0)$ is called unstable.

EXAMPLE 7 The pendulum equation $\ddot{y} = -\sin y$, with phase portrait shown in Figure 11.11(a), has equilibrium solutions satisfying $\dot{y}(t) = 0$, so these solutions also satisfy $\ddot{y}(t) = 0$. For the pendulum equation this amounts to asking for the solutions of $-\sin y = 0$. The solutions of this equation are $y = k\pi$ for integer values of $k$. Thus the equilibrium points are at $(y, z) = (k\pi, 0)$. When $k$ is an even integer, these points lie at the centers of the closed loops in Figure 11.11(a), and these points are stable equilibrium points, because a closed loop starting close enough to such a point remains close to the point. When $k$ is an odd integer, a loop starting near the point loops nearly $2\pi$ units away while going around a stable point that is nearly $\pi$ units away.

EXERCISES

In Exercises 1 to 6, solve the initial-value problem and sketch the graph of the solution $y = y(t)$. Then sketch the trace of the solution in $yz$ phase space, indicating the direction of traversal as $t$ increases.

1. $\ddot{y} + y = 1$, $y(0) = 2$, $\dot{y}(0) = 0$
2. $\ddot{y} - y = 0$, $y(0) = 1$, $\dot{y}(0) = 0$
3. $\ddot{y} = 1$, $y(0) = \dot{y}(0) = 0$
4. $\ddot{y} + y = 0$, $y(0) = 2$, $\dot{y}(0) = -1$
5. $\ddot{y} + y = 0$, $y(0) = 2$, $\dot{y}(0) = 0$
6. $\ddot{y} - y = 0$, $y(0) = 1$, $\dot{y}(0) = 0$

In Exercises 7 to 14, use the method of Section 7B to find a relation depending on a constant $C$ between $y$ and $z = \dot{y}$; then use this to sketch a phase portrait of the differential equation containing at least three curves.

7. $\ddot{y} + y = 1$
8. $\ddot{y} - y = 0$
9. $\ddot{y} = 1$
10. $\ddot{y} = 0$
11. $\ddot{y} + y = 0$
12. $\ddot{y} + 2y^3 = 0$
13. $\ddot{y} - 2y^3 = 0$
14. $\ddot{y} = 1 + y$

15. The differential equation $\ddot{y} - \dot{y} = 0$ has phase plots that decompose into three distinct parts, some of which correspond to constant equilibrium solutions.
(a) Make a phase portrait of the differential equation in $(y, z) = (y, \dot{y})$ space, making clear where the equilibrium points are and what the directions of traversal are for the other phase curves.
(b) On the basis of your sketch for part (a) do the equilibrium points appear to be stable or unstable? Explain your answer.
16. The differential equation $\ddot{y} + \dot{y} = 0$ has phase plots that decompose into three distinct parts, some of which correspond to constant equilibrium solutions.
(a) Make a phase portrait of the differential equation in $(y, z) = (y, \dot{y})$ space, making clear where the equilibrium points are and what the directions of traversal are for the other phase curves.
(b) On the basis of your sketch for part (a) do the equilibrium points appear to be stable or unstable? Explain your answer.

SECTION 8 NUMERICAL METHODS

EXAMPLE 1 We saw in Section 4 that the differential equation

\[ m\ddot{y} + k\dot{y} + hy = f(t) \]

is satisfied by the displacement function $y(t)$ of a vibrating spring. There, the factors $m$ (mass), $k$ (frictional constant), and $h$ (spring stiffness) were constants, whereas $f(t)$, the externally applied force, was allowed to be a nonconstant function. Using the method of characteristic equations, we were able to make a fairly complete analysis of the solutions of the differential equation. But suppose that some or all of $m = m(t)$, $k = k(t)$, and $h = h(t)$ vary with time. For example, the spring may weaken, or the friction may increase because of heating, or the weight of the mechanism might increase or decrease for some reason. Assuming that $m(t) > 0$, we can divide by it to get an equation of the form

\[ \ddot{y} = a(t)\dot{y} + b(t)y + f(t), \]

where $a = -k/m$ and $b = -h/m$. In exceptional circumstances, it may be possible to solve this differential equation explicitly, but usually we have to settle for an approximate solution.

Second-order differential equations are either linear, of the form

\[ \frac{d^2y}{dt^2} + a(t)\frac{dy}{dt} + b(t)y = f(t), \]

or nonlinear, for example, the damped pendulum equation

\[ \frac{d^2y}{dt^2} + k\frac{dy}{dt} + \frac{g}{l}\sin y = 0. \]

Equations of both types appear in the very general form

\[ \frac{d^2y}{dt^2} = F\left(t, y, \frac{dy}{dt}\right). \]

It is this form that we'll treat here along with initial conditions of the form

\[ y(t_0) = y_0, \qquad \frac{dy}{dt}(t_0) = z_0, \]

where $t_0$, $y_0$, and $z_0$ are given constants. The Existence and Uniqueness Theorem 1.1 described in the introduction to this chapter guarantees a unique solution to this problem if $F(t, y, z)$ and its partial derivatives with respect to $y$ and $z$ are continuous on some interval containing $t_0$ and in some rectangle containing the point $(y_0, z_0)$. It's important to realize that the only type of problem for which we so far have a universally effective method of actually displaying such solutions is the linear equation with $a(t)$ and $b(t)$ both constant. Even then, if $f(t)$ is not a function we can integrate explicitly, we may have trouble finding a formula for a particular solution. What we may then settle for is a numerical approximation $y_k$ to the value $y(t_k)$ of the true solution at a discrete set of points $t_k$. Such approximations were treated in Chapter 2 for first-order equations, and the methods used here for second-order equations are simple modifications of the first-order methods.

If the purpose in solving a differential equation is just to obtain numerical values or a graph for some particular solution, then a purely numerical approach may be more efficient than first finding a solution formula and then finding the desired graphical or numerical results from the formula. On the other hand, if what you want is to display the nature of a solution's dependence on certain parameters in the differential equation, or on initial conditions, then solution by formula is preferable if at all possible. Beyond that, detailed properties of a solution, such as whether it's periodic or only approximately so, can be hard to get from a numerical approximation and easy to get from a formula such as $y(t) = 2\cos 3t$.
8A Euler's Method

Numerical methods for first-order equations are motivated by using the interpretation of the first derivative as a slope. Rather than trying to make something of the interpretation of the second derivative in a second-order equation, what we'll do is find a pair of first-order equations equivalent to a given second-order equation and then apply first-order methods to the simultaneous solution of the pair of equations. The principle is easiest to understand in the general case

\[ y'' = F(t, y, y'), \quad y(t_0) = y_0, \quad y'(t_0) = z_0. \]

There are many ways to find an equivalent pair of first-order equations, but the most natural is usually to introduce the first derivative $y'$ as a new unknown function $z$. We write $z = y'$, $z' = y''$, so we can replace $y'' = F(t, y, y')$ by $z' = F(t, y, z)$. The pair to be solved numerically is then

\[ y' = z, \quad y(t_0) = y_0, \]
\[ z' = F(t, y, z), \quad z(t_0) = z_0. \]

Since $y'(t_0) = z(t_0) = z_0$, there is an initial condition that goes naturally with each equation. To find an approximate solution, we can do what we would do with a single first-order equation, except that at each step we find new approximate values for both unknown functions $y$ and $z$, and then use these values to compute new approximations in the next step. The iterative formulas are as follows for the simple Euler method, with step size $h$:

\[ y_{k+1} = y_k + h z_k, \]
\[ z_{k+1} = z_k + h F(t_k, y_k, z_k), \]

where $t_k = t_0 + kh$. The starting values $y_0$, $z_0$ come from the initial conditions.
EXAMPLE 2 The initial-value problem

\[ \ddot{y} = -\sin y, \quad y(0) = \dot{y}(0) = 1, \quad\text{for } 0 \le t < 20 \]

describes the motion of a pendulum with fairly large amplitude, so large that the linear approximation $\ddot{y} = -y$ would be inadequate. We go ahead to solve numerically the system

\[ \dot{y} = z, \quad y(0) = 1, \]
\[ \dot{z} = -\sin y, \quad z(0) = 1. \]

We can express the computation as follows.

DEFINE F(T,Y,Z) = -SIN(Y)
SET T=0
SET Y=1
SET Z=1
SET H=0.01
DO WHILE T < 20
SET S=Y
SET Y=Y+H*Z
SET Z=Z+H*F(T,S,Z)
SET T=T+H
PRINT T,Y
LOOP

Note the command SET S=Y, saving the current value of $y$ for use two lines later; without this precaution, the advanced value $y + hz$ would be used, which is not correct. The printout results in 2000 values of $t$ from 0.01 to 20 by steps of size 0.01 along with the corresponding $y$-values. We could also print the $z$-values, which are approximations to the values of the derivative $\dot{y}$. This is useful in making a phase-space plot of the solution, plotting approximations to the points $(y(t), \dot{y}(t))$. In the formal routine listed above, the line PRINT T,Y would be replaced by something like PLOT Y,Z for a phase plot.
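For readers who prefer a language that runs directly, here is one possible Python rendering of the Euler routine above. It is a sketch under two assumptions of ours: the function and variable names are invented for illustration, and the loop runs for a fixed number of steps rather than testing T < 20, which sidesteps roundoff in the accumulated value of $t$.

```python
import math

def euler_system(F, t0, y0, z0, h, n_steps):
    """Simple Euler method for y' = z, z' = F(t, y, z).

    Returns the list of computed rows (t, y, z), starting row included."""
    t, y, z = t0, y0, z0
    rows = [(t, y, z)]
    for _ in range(n_steps):
        y_old = y                    # plays the role of SET S=Y
        y = y + h * z                # advance y using the current z
        z = z + h * F(t, y_old, z)   # advance z using the saved old y
        t = t + h
        rows.append((t, y, z))
    return rows

def pendulum(t, y, z):
    # right-hand side F(t, y, z) = -sin y of the pendulum equation
    return -math.sin(y)

rows = euler_system(pendulum, 0.0, 1.0, 1.0, 0.01, 2000)
for t, y, z in rows[::200]:          # print every 200th row only
    print(round(t, 2), y, z)
```

Keeping the $z$-values in each row means the same output serves for a phase-space plot of the points $(y, z)$.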
Rather than displaying a table with 2000 entries, Figure 11.12 shows the graph of $y$ as an unbroken curve for the initial conditions $y(0) = \dot{y}(0) = 1$. The replacement of $\sin y$ by $y$ in the differential equation is inappropriate in this instance, because the values of $y$ that occur are too large to make the approximation a good one. The linearized initial-value problem $\ddot{y} = -y$, $y(0) = \dot{y}(0) = 1$ has solution $\cos t + \sin t = \sqrt{2}\sin\left(t + \tfrac{\pi}{4}\right)$, whose graph is shown in Figure 11.12 as a dotted curve. The graphs show that the solution to the nonlinear pendulum equation differs substantially in amplitude and period from the solution to the linear equation. At very small amplitudes the discrepancy is much less, because the approximation of $\sin y$ by $y$ is better the closer $y$ is to zero.

FIGURE 11.12 Solutions $y(t)$ to $\ddot{y} = -\sin y$ and $\ddot{y} = -y$ (dotted), both with $y(0) = \dot{y}(0) = 1$.

8B Improved Euler Method

The improved Euler method results from applying the single-variable version of the method to each of the equations $y' = z$, $z' = F(t, y, z)$, as we did for the Euler method:

\[ p_{k+1} = y_k + h z_k, \]
\[ q_{k+1} = z_k + h F(t_k, y_k, z_k), \]
\[ y_{k+1} = y_k + \frac{h}{2}\,(z_k + q_{k+1}), \]
\[ z_{k+1} = z_k + \frac{h}{2}\,[F(t_k, y_k, z_k) + F(t_{k+1}, p_{k+1}, q_{k+1})]. \]

Here $p_k$ and $q_k$ provide the simple Euler estimates that are then used to compute the final estimates for $y_k$ and $z_k = \dot{y}_k$; $t_k = t_0 + kh$ as before. (As with the simple Euler method, the value $y_k$ has to be kept for use in computing $z_{k+1}$ and cannot be replaced by $y_{k+1}$ without significant error.)

The advantage of the modification is that the error in the final estimates is substantially reduced without adding much complexity to the computation.

EXAMPLE 3 The initial-value problem

\[ y'' + y = 0, \quad y(0) = 0, \quad y'(0) = 1 \]

has solution $y = \sin x$. A numerical table of values compares the Euler method, the improved Euler method, and the correct value rounded to six decimal places (Table 11.3). The value of $h$ used is 0.001, but only every hundredth value is given. The only discrepancies in the last two columns are between the final digits in the sixth, eleventh and last entries.

An algorithm to produce the first, third, and fourth columns in the previous table
might look like this. The routine produces only every hundredth row of the computed
values.

DEFINE F(T,Y,Z) = -Y
SET T=0
SET Y=0
SET Z=1
SET H=.001
FOR J=1 TO 30
FOR K=1 TO 100
SET P=Y+H*Z
SET Q=Z+H*F(T,Y,Z)
SET S=Y
SET Y=Y+.5*H*(Z+Q)
SET Z=Z+.5*H*(F(T,S,Z)+F(T+H,P,Q))
SET T=T+H
NEXT K
PRINT T, Y, SIN(T)
NEXT J
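The routine above translates readily into Python. The following sketch, with function names of our own choosing, runs both the simple and the improved Euler steps on $y'' + y = 0$, $y(0) = 0$, $y'(0) = 1$ with $h = 0.001$ and prints every hundredth value alongside the sine, in the layout of Table 11.3. Time is computed as a step count times $h$, an assumption made here to avoid accumulated roundoff.

```python
import math

def euler_step(F, t, y, z, h):
    """One simple Euler step for y' = z, z' = F(t, y, z)."""
    return y + h * z, z + h * F(t, y, z)

def improved_euler_step(F, t, y, z, h):
    """One improved Euler step for y' = z, z' = F(t, y, z)."""
    p = y + h * z                 # Euler predictor for y
    q = z + h * F(t, y, z)        # Euler predictor for z
    y_new = y + 0.5 * h * (z + q)
    z_new = z + 0.5 * h * (F(t, y, z) + F(t + h, p, q))
    return y_new, z_new

def tabulate(step, F, h=0.001, blocks=30, per_block=100):
    """Advance from y = 0, z = 1 and record (t, y) once per block of steps."""
    y, z = 0.0, 1.0
    n = 0
    rows = []
    for _ in range(blocks):
        for _ in range(per_block):
            y, z = step(F, n * h, y, z, h)
            n += 1
        rows.append((n * h, y))
    return rows

def F(t, y, z):
    # y'' = -y
    return -y

euler_rows = tabulate(euler_step, F)
improved_rows = tabulate(improved_euler_step, F)
for (t, ye), (_, yi) in zip(euler_rows, improved_rows):
    print(f"{t:.1f} {ye:.6f} {yi:.6f} {math.sin(t):.6f}")
```

Both returned values in `improved_euler_step` are computed from the old `y` and `z`, so no separate saved copy of `y` is needed here.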

Recall that in this algorithm we are dealing with a pair of equations of the form

\[ \dot{y} = z, \qquad \dot{z} = F(t, y, z). \]

The Web site http://math.dartmouth.edu/~rewn/ has Java applets 20RD and 20RDPLOT that use this routine. Decreasing the step size $h$ often improves accuracy in the Euler methods. This requires more steps to reach a given value of $t$ and may produce

TABLE 11.3

x Euler y Imp-Euler y y = sin x

0.1 0.099838 0.099834 0.099833

0.2 0.198689 0.198669 0.198669
0.3 0.295564 0.295520 0.295520
0.4 0.389496 0.389418 0.389418
0.5 0.479545 0.479426 0.479426
0.6 0.564812 0.564643 0.564642
0.7 0.644443 0.644218 0.644218
0.8 0.717643 0.717356 0.717356
0.9 0.783679 0.783327 0.783327
1.0 0.841892 0.841471 0.841471
1.1 0.891697 0.891208 0.891207
1.2 0.932598 0.932039 0.932039
1.3 0.964185 0 .963558 0.963558
1.4 0.986140 0.985450 0.985450
1.5 0.998243 0.997495 0.997495
1.6 1.000370 0.999574 0.999574
1.7 0.992508 0.991665 0.991665
1.8 0.974725 0.973848 0.973848
1.9 0.947200 0.946300 0.946300
2.0 0.910297 0.909297 0.909297
2.1 0.864117 0.863209 0.863209
2.2 0.809387 0.808496 0.808496
2.3 0.746564 0.745705 0.745705
2.4 0.676275 0.675463 0.675463
2.5 0.599221 0.598472 0.598472
2.6 0.516173 0.515501 0.515501
2.7 0.427958 0.427379 0.427379
2.8 0.335458 0.334988 0.334988
2.9 0.239597 0.239249 0.239249
3.0 0.141333 0.141119 0.141120

more approximate solution values than is convenient. Thus we'd print results only
after m steps of calculation. For example, h =
0.001 and m =
10 would produce
approximate values with argument differences of 0.01. The applets just referred to
allow for this feature.

EXERCISES

MATLAB, Maple, and Mathematica are widely available for doing these exercises.
In addition there are Java applets 2ORD, 2ORDPLOT, and PHASEPLOT at the Web site
http://math.dartmouth.edu/~rewn/ and the Heaviside function H(t) is available for
use in these applets.

1. The Airy equation ÿ + ty = 0 has solutions for t > 0 somewhat similar to
solutions of ÿ + y = 0.
(a) Estimate the location of positive t-values for which y(t) = 0 for the solution
satisfying y(0) = 0 and ẏ(0) = 2. What changes if you replace the condition
ẏ(0) = 0 by ẏ(0) = a for various choices of a? [Hint: Look for successive
approximate values for y with opposite sign.]
(b) What can you say about the questions in part (a) on an interval t_1 ≤ t ≤ 0
with t_1 < 0?
2. The Bessel equation of order zero is

    x^2 y'' + x y' + x^2 y = 0.

For 1 ≤ x ≤ 40, estimate the location of the zero values of a solution satisfying
y(1) = 1, y'(1) = 0.
3. Make a numerical comparison of the solution of ji = (a) If the terminal velocity for the linear model is 36
- siny, y(O) == 0, j,(0) = l, with the solution of ji = -y feet per second, what is k?
using the same initial conditions. In particular, estimate (b) Estimate the time it takes for the linear model to
the discrepancies between the location of successive zero reach velocity 35.99 ft/sec.
values for the two solutions, one of which is y(t) = sin t. (c) If the terminal velocity for the nonlinear model with
4. The nonlinear equation ex = 1. 1 is 36 feet per second, estimate k. This
requires trial and error.
d2y dy 2 (d) Estimate the time it takes for the nonlinear model of
dx 2 + k dx + hy = 0, h, k, constant, part (c) to reach velocity 35.99 ft/sec.
has solutions defined near x =
0. Compare the behavior of 9. Bead on a wire. Consider a bead sliding without friction
numerical solutions of the initial value problem y(O) = 0, under constant vertical gravity along a wire bent into
y' (0) = 1 with the corresponding behavior when the the shape of the twice continuously differentiable graph
nonlinear term hy 2 is replaced by a linear term h y. To do of y = f (x). It turns out that x = x(t) satisfies the
this you should investigate the result of choosing several equation x = -(g + /"(x)x 2)/'(x)/(1 + f'(x)2). With
different values of k > 0 and h > 0. J(x) = -x 3 +4x 2 - 3x, g =
32 and x(0) =
0, estimate
how large x(0) > 0 should be for the bead to overcome
5. The linear equation
periodic oscillation and go over the hump in the wire.
d 2y dy
dx2 + a(x) dx + b(x)y = 0 PENDULUM
with continuous coefficients a(x) and b(x) occurs often 10. (a) Use the Euler method for
with nonconstant coefficients. For other choices of the y" = F(t, y, y'), y(to) = YO, y'(to) == zo,
coefficients, we use numerical methods. Study the behav-
ior of numerical solutions of the initial value problem and apply it to the pendulum equation with
y(O) = 0, y' (0) = 1 as follows.: F(t, y, y') = -16 siny, y(O) = 0, y'(O) = 0.5.
(a) Let a(x) = sin x and b(x) = cosx for O::: x.::: 2rr . (b) Do part (a) using the improved Euler method.
(b) Let a(x) = e-x 12 and b(x) =
e-x/ 3 for OS x .::: 1. For small oscillations of y, the approximation
sin y ~ y is fairly good, leading to the replace-
6. Nonuniqueness. The initial-value problem ji =
3.jy, ment of the pendulum equation y" + (g/ /) sin y = 0
y(0) = 0, j,(0) = 0 has the identically zero solution. by the linearized equation y" + (g / l) y = 0, with
(a) Verify that y(t) = -kt
4
is also a solution. solutions of the form
(b) Investigate the application of the Euler methods to
the problem. y(t) = ci cos ./iii t + c2 sin ./iii t.
Assuming g/ I = 16, compare y(t) with the
FALLING OBJECTS improved Euler approximation to the solution of
7. Suppose that the displacement y(t) of a falling object is the nonlinear equation under initial conditions.
subject to a nonlinear friction force (c) y(O) = 0, y'(O) = 0.1
(d) y(O) = 0, y' (0) = 4
mji = -kya + mg. 11. Consider the damped pendulum equation 0 = -(g/ l) sin
(a) Find numerical approximations to y(t) in the range 0-(k/m)O, with g =
32.2, l = 20, k = 0.03, and m = 5.
0 ::: t .::: 20, with g = 32.2, y(O) = 0, y(O) = 0, For the solution with 0(0) = 0, 0(0) = 0.2.
m = 1 and a = 1.5. Use the values k = 0, 0.1, (a) Estimate the maximum angles 0 for OS t .::: 15.
0.5, 1, and sketch the graphs of y = y(t) using an (b) Estimate the successive times between occurrences
appropriate scale. of the value 0 = 0 for OS t S 15.
(b) Estimate the values of k that, along with y(O) = 0 (c) Repeat part (b), but with initial conditions 0(0) = 0,
and the other parameter values in part (a), produce 0(0) = 2.0
approximately the values y(O) =
0, y(5) 66, = 12. Consider the following modification of the pendulum
y(lO) = 137 and y(l5) = 208. equation m/0 = -gm sin 0, written here in terms of
8. A nonlinear model for an object of mass 1 dropped from forces. If the pendulum pivot is moved vertically from its
rest has frictional force y = -kya + g, y(O) = y(O) = O; usual fixed position at level 0, so that at time tit is at f(t),
the model is linear if a =
1. In numerical work assume with f (0)=0, the additional vertical force component is
e = 32 ft/sec 2. mJ(t). Thus the vertical force due to gravity alone is

replaced by ( - gm+ m/(t)) sin 0. It follows that the detect long-term approach to periodic behavior for some
equation for displacement angle 0 = 0(t) becomes k < 0.

m/0 = - gm sin0 + m/(t) sin0 or 16. Plot closed periodic phase paths for the Morse model of
displacement y from equilibrium of the distance between
.. I .. the two atoms of a diatomic molecule:
0 = ( - g + f(t)) sin 0 .
1
Show that if f(t) = at, with a constant, then
(a) ji = K(e-2ay - t -ay ) , K =a= I.
there is no change in acceleration as compared with
the fixed-pivot case and hence that the equation 17. Consider the nonlinear oscillator equation mi +kili l.8 +
for the displacement angle 0 remains the same: hx = 0, 0 5 /3 = canst. Let m = 1, k = 0.2, h = 5
0 = -(g / l) sin 0. and suppose that x(0) = 0, ..i:(O) = 5. Compute numerical
(b) Show that in the case of a general twice- approximations to x(t) on the range O 5 t 5 20 for
differentiable f (t), the position of the pendulum f3 = 0, 0.5, I. Sketch the resulting graphs using computer
weight at time tis x(t) = (t sin0(t), f(t)-l cos0), graphics.
where 0 satisfies either of the differential equations
displayed above. 18. Chaos. The nonlinear Duffing oscillator models a
(c) Let g = 32,/ = 5,111 = 1 and f(t) = 2sin4t. Plot periodically-driven, damped initial-value process:
(t, 0(t)) fort-values 0.01 apart between O and 100,
assuming 0(0) = 0 and 0(0) = 0.01, 0.001, 0.0001. ji + ky - y + y3 = A cos wt, y(O) = ½, j,(0) = 0.
Note the long term deviation in behavior as com-
pared ~ith the identically zero solution correspond- (a) Make a computer plot of the solution y(t) for O 5
ing to 0(0) = 0. t 5 300 for the parameter choices k = 0.2, A =
(d) Plot the path of the pendulum weight under the 0.3, w = 1 and initial conditions y(O) = y (O) = 0.
assumptions in part (c). The behavior of the damped, periodically driven
Duffing equation is often described as "chaotic,"
which in practical terms means unpredictable. In par-
OSCILLATORS AND PHASE SPACE ticular, the specific output that you get will depend
13. An unforced oscillator displacement x = x(t) satisfies significantly not only on the parameter and initial
mx + ki + hx = 0, where k and h may depend on time t. values, but on the choice of numerical method and
(a) Suppose that k(t) = 0.2(1 - e- 0· 1'), h = 5, 111 = I
step size, and even on the internal arithmetic of the
and that x(0) = 0, i(0) = 5. Compute a numerical machine used to generate the output. For this reason
approximation to x(t) on the range of O 5 t 5 20. it seems impossible to describe accurately the global
Then sketch the graph of x = x(t). shape of the output from the damped, periodically
(b) Do part (a) using instead k = 0 and /z(t) = 5(1 - driven Duffing oscillator.
e-0.21). (b) Change just the damping constant in part (a) to
k = 0, and make a plot of the solution. Comment
14. Make phase-plots of the soft spring oscillator equation on the qualitative changes that you sec as compared
ji = - y y 3 +8y under each of the following assumptions. with the output in part (a) .
(a) y = 8 = I (c) Make several phase plane plots, starting at different
(b) y = 1, 8 = 2 points in the (y, y)-plane, of solutions of the Duffing
(c) y = 2, 8 = I equation. Use the parameter values k = 0.2, A =
15. Make solution graphs and (y, y) phase plots for the 0.3, and w = I.
periodically driven hard spring oscillator equation (d) Experiment with part (c) by trying your own choices
ji = -y - y3 + ky + Jo
cost and use the results to for the three parameter values.

Chapter 11 REVIEW

In Exercises 1 to 6, find all solutions that satisfy the 2. y" + y = x sin x


given equation. 3. y" - y = sin x
1. y" + 2y' + y = e-x + 3ex 4. y" - y' - y = 1, y(0) = y' (0) = 1
5. y" + 2y' + 3y = I neither function is a constant multiple of the other on such
6. (D - 1) 2y = x 3 - ;; an interval.

7. y 111 =X
21. (a) Show that the functions e" and e-., are linearly
independent on an interval a < x < b by showing
8. y"" = Bly directly that the equation
9. y" + 9y = sin 3x, y(O) = 1, y'(O) = 0
10. (D 2 + 4)y = cos 3:i, using Section 3C
11. y" + y = 0, y(O) = -1, y(rr) = I can't be satisfied for all real x unless the constants
c1 and c2 are both zero.
12. y" + y = 0, y(O) = 0, y(rr/2) = 2 (b) Show that the equation 2e" - 3e-x = 0 is satisfied
The basic real solution forms for y" + ay' + by = 0 are for exactly one real x and that 2e" + 3e-x == 0 is
(e 71X, er2x}, (e 71 x, xe71 x} and (eax cos/Jx, eax sin/Jx} satisfied for no real x.
and are prototypes for Exercises 13 and 14. (c) For what complex values of x is 2e" + 3e-x = 0
satisfied?
13. Make a corresponding list of triples for the equation
22. Suppose that
y 111 + ay'' + by' + cy = 0.
14. Make a corresponding list of quadruples for
y"" + ay"' + by" + cy' + dy = 0.
15. Derive from scratch the fundamental sinusoidal solutions (a) Find the general solution y(t) of this equation.
to the harmonic oscillator problem ji + y = 0, y(O) = 0, (b) Let
j,(O) = 1. Do this in the following steps: (Our earlier dy
derivations were made using complex exponentials.) z(t) = dt(t).
(a) Multiply the equation by j,, and then integrate with
Show that the parametrized curve (y(t), z(t)) traces
respect tot to get ½f
+ ½Y 2 = C. clockwise a circular path or else reduces to a single
(b) Find C, solve for j,, and solve the resulting first order
point.
equation to get y(t) = sin(t + c).
23. Theorem 2.4 implies that there is a one-to-one correspon-
16. Suppose that y1 (x) and Y2 (x) are real-valued functions dence between the set of all n- tuples of initial values
defined for all real x and you know that YI (x) is not a
(zo, z1, ... , Zn-1) and all solutions to the nth-order homo-
constant multiple of Y2(x). Are the two functions neces- geneous equation L(y) = 0. Explain how this conclusion
sarily linearly independent? Explain your answer, using follows.
an example if necessary.
24. Theorem 2.4 implies that there is a one-to-one corre-
17. Let the functions YI, Y2 be defined for all real x by spondence between the set of all n-tuples of inirial val-
YI (x) = e'" and Y2 (x) = t!x, where r and s are unequal ues (zo, z1, ... , Zn-I) and all n-tuples (CJ, c2, ... , cn) of
complex numbers. Show that YI and Y2 are linearly coefficients of linear combination. Explain how this con-
independent even if complex constants CJ, cz are allowed clusion follows.
in CJ YI (x) + c2y2(x ).
25. (a) Show that the family of differential equations
18. For what values of the constant b do the nonidentically y 11 - (2r + h)y' + r(r + h)y = 0, depending on
zero solutions of y" + y' + by = 0 oscillate as func- the parameter h, can also be written
tions of x? (D - r)(D - (r + h) }y = 0.
19. The current 1 (t) flowing through a certain electric circuit (b) Show that the equations of pru1 (a) have solutions
at time t satisfies y = CJ e<r+h)x + c2e' x if h :/= 0.
(c) Let CJ = l/h,c2 = -1/h, and show that for
I"+Rl'+l=sint, each fixed x, the resulting solution Yh(x), tends as
h - 0 to
where R > 0 is a constant resistance. The equation has d rx rx
a solution of the form 1 (t) = A sin(t - ex) for certain y = dr
-e =xe .
constants A > 0 and ex . Find A and ex.
20. Show that the functions e" and e-" are linearly indepen- In Exercises 26 to 29, assume that L is a linear operator
dent on an interval a < x < b by showing directly that such that L(YI) = w1, L(y2) = w2, and L(y3) = w3 .

Using just this information, find linear combinations z integral ft


e-st f (t) dt exists for some values of s.
of Yt, Y2 and y3 so that the given equation holds. Disregarding the problem of actually finding a formula
26. L(z) = 2w1 - 3w2 in Exercises 40 to 47, decide whether the given function
has a Laplace transform for some values of s.
27. L(z) = w1 + 2w2 - 4w3 30. 12e1 31. t Int
28. L(z) = L(2z) + w1 + w2
32. Int 33. e
12
29. L(z) =0
34. e- 12 35. sin(ln t)
The use of the Laplace transform is limited to those
functions f (t), defined fort > 0 for which the improper 36. sin(l / t) 37. t- 1
CHAPTER 12

INTRODUCTION TO SYSTEMS

In the two previous chapters we have considered differential equations whose solu-
tions are real-valued functions of a real variable. It turns out that a natural and useful
generalization is to consider vector differential equations (or, equivalently, systems
of real differential equations) whose solutions are vector-valued functions of a real
variable. There are two main reasons for making this generalization: one is that
many phenomena in applied mathematics can most naturally be expressed in vector
form; another is that real differential equations of order higher than one can often be
reduced advantageously to vector equations of order one. Both of these statements
will be explained in this chapter.
In dealing with vector equations it will be convenient to use the letter t to denote
the variable with respect to which derivatives are taken. This choice has the advantage
that applications most frequently involve time-dependent phenomena, and also that
the letters x, y, z, etc., are left free to denote space coordinates, as usual. To write
general systems of differential equations more compactly we'll use the notation
x = (x, y) for 2-dimensional systems, x = (x, y, z) for 3-dimensional systems, and
x = (x1, ... , Xn) for n-dimensional systems.

SECTION 1 VECTOR FIELDS


1A Geometric Interpretation
A first-order differential equation of the form

    dx/dt = F(t, x)

has a real-valued solution of the form x = x(t). For example, the equation

    dx/dt = x + t

has the general solution x(t) = Ce^t - t - 1, defined for all real numbers t. Similarly
a pair of equations

    dx/dt = F(t, x, y)
    dy/dt = G(t, x, y),

called a system of dimension 2, has as a solution a pair of functions

    x = x(t)
    y = y(t),


defined on some interval a < t < b and satisfying the differential equations. As t
increases from a to b, the point in the xy-plane with coordinates (x(t), y(t)) will
trace a path, perhaps like the one in Figure 12.1(b). Such a path is called a trajectory
of the system. It's important to remember that a solution of a system is a function
of t and that its trajectory is the image of this function, containing only a part of the
information contained in the solution; by itself, the trajectory fails to show explicitly
the correspondence between values of t and values (x(t), y(t)) of the solution, nor
does the trajectory display the speed of traversal. Nevertheless a sketch of several
judiciously chosen trajectories of a system, called for historical reasons a phase
portrait of the system, is often enough to convey important information about the
system, particularly if the directions of traversal are shown by inserting appropriate
arrow points. Note however that a "trajectory" may be a single point x_0 arising from
a constant solution x(t) = x_0; in that case we refer respectively to an equilibrium
point and an equilibrium solution.

[FIGURE 12.1: (a) the interval from a to b on the t-axis; (b) a trajectory in the
xy-plane running from (x(a), y(a)) to (x(b), y(b)).]

In the description of scientific problems, the variable t often represents time. Thus
the derivatives x' and y' may stand for the rates of change of x and y with respect
to time; if t is to be interpreted as time, the derivatives are often written with dots
instead of primes: ẋ, ẏ.
EXAMPLE 1 Consider the system

    ẋ = x
    ẏ = 2y.

This system is particularly simple because each unknown function occurs in just one
equation. Such a system is called uncoupled. We can therefore solve each equation
separately to get the general solution

    x(t) = c_1 e^t
    y(t) = c_2 e^{2t}.

We see that x(0) = c_1 and y(0) = c_2, so that imposing initial conditions such as

    x(0) = 1
    y(0) = 2

will determine the values of c_1 and c_2 and will single out the particular solution

    x(t) = e^t > 0,
    y(t) = 2e^{2t} > 0.

The trajectory of this solution satisfies

    y = 2x^2,

with the restriction that x(t) > 0 and y(t) > 0. Figure 12.2 shows a part of the
trajectory for -∞ < t < ∞. Because x(t) = e^t can't be negative, the trajectory
consists of only half of the parabola y = 2x^2.

[FIGURE 12.2: the trajectory y = 2x^2, x > 0, passing through (1, 2) at t = 0.]
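As a quick numerical check of Example 1, the following sketch evaluates the particular solution at a few sample times and compares y with 2x^2. The function name solution and the chosen sample times are assumptions made for illustration only.

```python
import math

def solution(t, c1=1.0, c2=2.0):
    """Solution of x' = x, y' = 2y with x(0) = c1 and y(0) = c2."""
    return c1 * math.exp(t), c2 * math.exp(2 * t)

points = [solution(t) for t in (-2.0, -1.0, 0.0, 0.5, 1.0)]
for x, y in points:
    # With c1 = 1 and c2 = 2, each point should lie on the half-parabola
    # y = 2x^2 with x > 0.
    print(f"x = {x:.4f}   y = {y:.4f}   2x^2 = {2 * x * x:.4f}")
```

Other choices of c1 and c2 give points on a curve of the form y = (c2/c1^2) x^2 when c1 is not zero, which is one way to see how the phase portrait is built up.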
We can write the system of differential equations in Example 1 in vector notation
by letting x = (x, y), dx/dt = (dx/dt, dy/dt), and F(x, y) = (x, 2y). Then the
system becomes

    dx/dt = F(x).

Similarly, the system

    dx/dt = x + y + t
    dy/dt = x - y - t

would be written dx/dt = F(t, x), where

    F(t, x, y) = (x + y + t, x - y - t).

When we speak of a general first-order system of dimension n in normal form,
we'll mean a system of the form

    dx_1/dt = F_1(t, x_1, x_2, ..., x_n)
    dx_2/dt = F_2(t, x_1, x_2, ..., x_n)
      ...
    dx_n/dt = F_n(t, x_1, x_2, ..., x_n).

Using the vector notations

    x = (x_1, ..., x_n)  and  dx/dt = (dx_1/dt, ..., dx_n/dt)

we can write the system more compactly as

    dx/dt = F(t, x),

where F(t, x) = (F_1(t, x), ..., F_n(t, x)). A solution x = x(t) is a vector-valued
function of a real variable t on an interval a < t < b, such that substitution into the
differential equation satisfies the system of equations on the interval, as (x(t), y(t)) =
(e^t, 3e^{2t}) satisfies (ẋ, ẏ) = (x, 2y) in Example 1. The image of the interval under this
function is a trajectory curve in R^n. Such a trajectory in R^3 is shown in Figure 12.3.
One advantage of the vector interpretation of a system is that the derivative dx/dt
has a geometric meaning that the derivatives dx_i/dt do not have when taken
separately: dx/dt is a tangent vector to a trajectory, and if t is time, then dx/dt is a
velocity vector. Formally, tangent and velocity are matters of definition. The following
discussion shows how the formal definitions are suggested by the intuitive ideas
behind tangent and velocity.

FIGURE 12.3

Reviewing the vector derivative, Figure 12.3 shows the points x(t), x(t + h), and
the chord x(t + h) - x(t) joining them. If we multiply the vector by 1/h, then

    (x(t + h) - x(t)) / h

will be parallel to the chord, and if it has a limit as h → 0, this limit vector can
reasonably be defined to be a tangent vector at x(t). The limit is defined as in Chapter
4 so that

    (dx/dt)(t) = lim_{h→0} (x(t + h) - x(t)) / h
               = ( lim_{h→0} (x_1(t + h) - x_1(t)) / h, ..., lim_{h→0} (x_n(t + h) - x_n(t)) / h )
               = ( (dx_1/dt)(t), (dx_2/dt)(t), ..., (dx_n/dt)(t) ).

Thus (dx/dt)(t) is a tangent vector at x(t). The velocity interpretation is valid when t is
time because the Euclidean length |(dx/dt)(t)| is defined to be the speed of traversal
of the trajectory at x(t). The reason is that for small values of h, the Euclidean length

    |(x(t + h) - x(t)) / h|

is nearly the average rate of traversal of the trajectory over the interval from t to
t + h. This approximation is really good only if x(t) is differentiable, in which case,
because length is continuous,

    lim_{h→0} |(x(t + h) - x(t)) / h| = |(dx/dt)(t)|.
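The limit just described can be illustrated numerically. The sketch below uses the trajectory (e^t, 3e^{2t}) mentioned earlier in connection with Example 1 and compares difference quotients with the tangent vector (x, 2y) at t = 0. The helper names difference_quotient and length are assumptions introduced for this illustration.

```python
import math

def x(t):
    """A solution of (x', y') = (x, 2y): x(t) = (e^t, 3e^{2t})."""
    return (math.exp(t), 3 * math.exp(2 * t))

def difference_quotient(t, h):
    """The chord x(t + h) - x(t) multiplied by 1/h, computed coordinate by coordinate."""
    a, b = x(t), x(t + h)
    return tuple((bi - ai) / h for ai, bi in zip(a, b))

def length(v):
    """Euclidean length of a vector given as a tuple."""
    return math.sqrt(sum(c * c for c in v))

t = 0.0
tangent = (math.exp(t), 6 * math.exp(2 * t))  # (x, 2y) evaluated on the trajectory
for h in (0.1, 0.01, 0.001):
    q = difference_quotient(t, h)
    print(h, q, length(q))
print("tangent vector:", tangent, "speed:", length(tangent))
```

As h shrinks, the printed quotients should approach the tangent vector and their lengths should approach the speed.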

EXAMPLE 2 The pair of equations

    dx/dt = -y
    dy/dt = x

has the form dx/dt = F(x), where x = (x, y) and F(x) = (-y, x). We interpret
the system as saying that the tangent vector to a solution trajectory passing through
(x, y) has coordinates (-y, x). Figure 12.4(b) shows a sketch of some tangent vectors
at a few points, made by first locating the arrow that represents (-y, x) and then
moving the arrow parallel to itself so that its tail coincides with the point (x, y) as in
Figure 12.4(a). The picture suggests that the trajectories have a circular shape. Notice
also that the arrows get longer as they get farther from the origin, which indicates
that the speed along a trajectory is greater when the trajectory is farther from the
origin. We postpone actually solving the equations until we develop a systematic
procedure in the next section.
Figure 12.4(b) plays somewhat the same role that a sketch of a direction field does
for a single first-order equation, but the present picture contains more information:
here, the lengths and orientations of the tangents are significant, whereas in a direction
field all the important information is conveyed by the slopes of the line segments.
The reason is that representing the slope of a graph by a single number is possible
only for real functions of a real variable.

[FIGURE 12.4: (a) the arrow F(x, y) = (-y, x) translated so that its tail is at (x, y);
(b) a sketch of the vector field F(x, y) = (-y, x).]

1B Autonomous Systems

If, in the general first-order system

    dx/dt = F(t, x),

the function F(t, x) is independent of t, then the system is called autonomous and
so has the form

    dx/dt = F(x).
Example 2 describes an autonomous system. For an autonomous system, the tangent
vector F(x) located at the point x is always the same regardless of what time it is
when the trajectory passes through x. Such an assignment of vectors F(x) to points
x is called a vector field and Figure 12.4(b) shows a sketch of one. For the general
nonautonomous system, the function F(t, x) specifies a tangent vector (to a trajectory
through x) that may be different for each t, in this way producing a time-dependent
vector field. The pictorial analogue of the static vector field shown in Figure 12.4 for
a time-dependent vector field would be a sequence of "snapshots" taken at different
times. Each snapshot would have the same general form as Figure 12.4, but would
show changes in the individual arrows as time t varies.
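The two observations made about the field of Example 2, that the arrows suggest circular trajectories and that they lengthen away from the origin, can be checked at a few points. This is a minimal sketch; the sample points are arbitrary choices and not taken from the figure.

```python
import math

def F(x, y):
    """Autonomous vector field of Example 2."""
    return (-y, x)

samples = [(1.0, 0.0), (0.0, 2.0), (-1.0, 1.0), (3.0, -4.0)]
for x, y in samples:
    u, v = F(x, y)
    dot = x * u + y * v        # zero when the arrow is perpendicular to the position vector
    speed = math.hypot(u, v)   # arrow length, the speed along a trajectory through (x, y)
    print((x, y), (u, v), "dot =", dot, "speed =", speed, "distance =", math.hypot(x, y))
```

Because F takes no time argument, the same arrow is attached to a point whenever a trajectory passes through it, which is the defining feature of an autonomous system.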
EXAMPLE 3 The 2-dimensional system

    dx/dt = (1 - t)x - ty
    dy/dt = tx + (1 - t)y

determined by the time-dependent vector field

    F(t, x, y) = ((1 - t)x - ty, tx + (1 - t)y)
               = (1 - t)(x, y) + t(-y, x)

is not autonomous. Figure 12.5 shows some sketches of the vector fields for
t = 0, 1/2, 1.
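The "snapshot" idea for Example 3 can be sketched by evaluating the field at the three times t = 0, 1/2, 1 on a few points. The helper name blend, which forms the combination (1 - t)(x, y) + t(-y, x), and the sample points are assumptions introduced here.

```python
def F(t, x, y):
    """Time-dependent vector field of Example 3."""
    return ((1 - t) * x - t * y, t * x + (1 - t) * y)

def blend(t, x, y):
    """The same field written as (1 - t)(x, y) + t(-y, x)."""
    return tuple((1 - t) * a + t * b for a, b in zip((x, y), (-y, x)))

points = [(1.0, 0.0), (0.0, 1.0), (1.0, 2.0)]
for t in (0.0, 0.5, 1.0):
    # One snapshot of the field per value of t.
    print("t =", t, [F(t, x, y) for x, y in points])
```

At t = 0 the arrows point radially outward, at t = 1 they coincide with the field of Example 2, and at t = 1/2 they are the average of the two.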
Suppose the 2-dimensional system dx/dt = F(t, x, y), dy/dt = G(t, x, y) is such
that the ratio G(t, x, y)/F(t, x, y) = R(x, y) happens to be independent of t. This
would occur in particular if neither F nor G depended explicitly on t. Since the
chain rule allows us to write

    dy/dx = (dy/dt) / (dx/dt)

under fairly general conditions, we can conclude that there are trajectory curves of
the system satisfying the differential equation

    dy/dx = R(x, y).

If we can solve this equation, we have a way to plot trajectories without finding
solutions x(t), y(t). For example, dx/dt = ty, dy/dt = -tx leads us to consider

    dy/dx = -x/y,

which has solutions x^2 + y^2 = c, representing circular trajectories if c > 0. Using
this method, you are asked to sketch some trajectories for the systems in Exercises 24
to 27. Looking at the vector field will tell you roughly how a trajectory is traced.
The trajectory of a constant solution, for which ẋ = ẏ = 0 for all t, is just a single
point.
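The claim that dx/dt = ty, dy/dt = -tx has circular trajectories can be tested numerically without solving for x(t) and y(t). The sketch below applies an improved Euler step to the vector system and records x^2 + y^2 along the way. The step size, the starting point (1, 0), and the names field and heun are assumptions chosen for illustration.

```python
def field(t, x, y):
    """Right side of dx/dt = t*y, dy/dt = -t*x."""
    return (t * y, -t * x)

def heun(t, x, y, h, steps):
    """Improved Euler steps for the 2-dimensional system above."""
    for _ in range(steps):
        k1 = field(t, x, y)
        # Simple Euler prediction at time t + h.
        px, py = x + h * k1[0], y + h * k1[1]
        k2 = field(t + h, px, py)
        # Average the two slopes to get the corrected values.
        x, y = x + 0.5 * h * (k1[0] + k2[0]), y + 0.5 * h * (k1[1] + k2[1])
        t += h
    return t, x, y

t, x, y = 0.0, 1.0, 0.0
radii_squared = []
for _ in range(30):
    t, x, y = heun(t, x, y, 0.001, 100)
    radii_squared.append(x * x + y * y)
    print(f"t = {t:3.1f}   x = {x: .6f}   y = {y: .6f}   x^2 + y^2 = {x * x + y * y:.8f}")
```

The computed values of x^2 + y^2 should stay very close to 1, while the point moves around the circle faster and faster as t grows.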

[FIGURE 12.5: the vector field of Example 3 at three times: (a) F(0, x, y) = (x, y);
(b) F(1, x, y) = (-y, x); (c) F(1/2, x, y) = (1/2)(x - y, x + y).]

1C Second-Order Equations
Systems of first-order differential equations arise very naturally in the study of higher-
order equations. We restrict ourselves here to the most important case, namely second
order, a context we discussed also in Chapter 11, Section 7. In a second-order
1-dimensional equation ÿ = f(t, y, ẏ) we reduce the order to 1 and simultaneously
raise the dimension to 2 by introducing a new dependent variable z. We let ẏ = z.
Then ÿ = ż. Thus an initial-value problem

    ÿ = f(t, y, ẏ),  y(t_0) = y_0,  ẏ(t_0) = z_0

is equivalent to a 2-dimensional system of the form

    ẏ = z,           y(t_0) = y_0,
    ż = f(t, y, z),  z(t_0) = z_0,

a first-order system ẋ = F(t, x), with x = (y, z) and time-dependent vector field

    F(t, y, z) = (z, f(t, y, z)).
The yz-space is the phase space or state space of the second-order equation. The
state space is important in part because it explicitly displays all the initial-value
information necessary for determining the system's evolution. Note, for example, that
the solution graphs of the original equation of order 2 don't display the vital velocity
component ẏ(t) at times t > t_0. The image plot of the 2-dimensional system's typical
solutions is a phase portrait of the original 1-dimensional equation.

EXAMPLE 4 The second-order differential equation

    ÿ + y = 0

converts into a first-order system if we let ẏ = z. With ÿ = ż we get

    ż = -y
    ẏ = z.

This is the system considered in Example 2, with its vector field sketched in Figure 12.4.
We're familiar with the general solution to ÿ + y = 0 and its derivative z(t) = ẏ(t),
namely

    y(t) = c_1 cos t + c_2 sin t,  z(t) = -c_1 sin t + c_2 cos t.

We determine particular solutions most naturally by choosing initial conditions. For
example, y(0) = 1, ẏ(0) = 0 are equivalent to y(0) = 1, z(0) = 0, which implies

    y(0) = c_1 = 1,
    z(0) = c_2 = 0.

The vector solution determined by these values of c_1 and c_2 is

    (y(t), z(t)) = (cos t, -sin t).

The trajectory of this solution is a circle of radius 1, shown in Figure 12.6(a), which
taken altogether is a phase portrait for the original equation ÿ + y = 0, consisting of
the circles

    y^2 + z^2 = (c_1 cos t + c_2 sin t)^2 + (-c_1 sin t + c_2 cos t)^2
              = c_1^2 + c_2^2 = r^2.

These circles are traced repeatedly as t tends to infinity.

FIGURE 12.6 (a) Circles: ÿ + y = 0. (b) Spirals: ÿ + (2/5)ẏ + (1/5)y = 0.

The curves in Figure 12.6 are trajectories of vector solutions y = y(t), z = z(t)
and as such have directions as indicated by arrow points in the figures. If a phase
curve has parts on both sides of the y axis, these directions in a phase portrait always
have a clockwise orientation relative to the usual orientation of the y-axis and z-axis.
The general principle is that when z = ẏ is positive then y is increasing, and when
z = ẏ is negative then y is decreasing. Since z = ẏ = 0 on the y axis, it follows
that dy/dz = ẏ/ż = 0 wherever ż ≠ 0, so a trajectory that crosses the y axis crosses
vertically, as displayed in both parts of Figure 12.6.

The differential equation ÿ + (2/5)ẏ + (1/5)y = 0 has general solution
y = e^{-t/5}(c_1 cos (2/5)t + c_2 sin (2/5)t) and is equivalent to the first-order system

    ẏ = z
    ż = -(1/5)y - (2/5)z.
We solve the second-order equation for y and then find z = ẏ to solve the system:

    y = e^{-t/5}(c_1 cos (2/5)t + c_2 sin (2/5)t)
    z = (1/5)e^{-t/5}((2c_2 - c_1) cos (2/5)t - (c_2 + 2c_1) sin (2/5)t).

Figure 12.6(b) shows some phase space plots starting at eight different points; the
trajectories all spiral in toward the origin as t increases.
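The formulas for y and z can be checked against the system ẏ = z, ż = -(1/5)y - (2/5)z by comparing central difference quotients with the right sides. This is a rough numerical check under assumed sample values of c_1, c_2, and t; the function name residuals is introduced only for this sketch.

```python
import math

def y(t, c1, c2):
    """Closed-form y(t) for the damped equation."""
    return math.exp(-t / 5) * (c1 * math.cos(2 * t / 5) + c2 * math.sin(2 * t / 5))

def z(t, c1, c2):
    """Closed-form z(t), the derivative of y(t)."""
    return (math.exp(-t / 5) / 5) * ((2 * c2 - c1) * math.cos(2 * t / 5)
                                     - (c2 + 2 * c1) * math.sin(2 * t / 5))

def residuals(t, c1, c2, h=1e-5):
    """Return (y' - z, z' + y/5 + 2z/5), with derivatives from central differences."""
    ydot = (y(t + h, c1, c2) - y(t - h, c1, c2)) / (2 * h)
    zdot = (z(t + h, c1, c2) - z(t - h, c1, c2)) / (2 * h)
    yt, zt = y(t, c1, c2), z(t, c1, c2)
    return ydot - zt, zdot + yt / 5 + 2 * zt / 5

for t in (0.0, 1.0, 5.0, 10.0):
    print(t, residuals(t, 1.0, -2.0))
```

Both residuals should be zero up to the error of the difference quotients, for any choice of the constants.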

A major advantage of plotting a phase curve as compared with plotting graphs is
that a phase curve typically compresses the long-term behavior of the functions y(t)
and ẏ(t) = z(t) into a relatively small picture, as compared with a vain attempt to
depict graphs extended over an infinite t-axis. A disadvantage of a phase curve is
that it fails to associate points on the curve with specific values of t. These ideas
appear in more detail in Section 7C of the previous chapter.

EXERCISES

The uncoupled systems I to 4 are solvable by treating 7. F(x ,y) = (x+ l,y}
each equation separately. Find the general solution, and 8. F(t,x,y) = (t,y)
then find the particular solution that satisfies the given
initial conditions. 9. F(x, y, z) = (x, ½J, jZ)
10. F(t , x, y, z) = (x + t , y - t, z)
1. dx/dt = x + I, x(0) = I,
dy/dt = y, =
y(0) 2 11. Sketch the vector field F(x, y) = (-y,x). Then sketch
the trajectory curve tangent to arrows in the field sketch,
2. dx/dt = t, x(l) = 0,
starting at (x. y) = (1, 0).
dy/dt = y, y(l) = 0
12. Show that the system i = -ty, y = tx has circular solu-
3. dx/dt = x, x(0) = 0, tion trajectories of radius r > 0, traced with increasing
dy/dt = !J,
y(0) = l, speed rt as time increases. [Hint: Show that xi+ yy = O.]
dz/dt = ½z, z(O) = -1 In Exercises 13 and 14, by letting y = z, express
4. dx/dt = x + t, x(0) = 0, each second-order differential equation as a first-order
dy/dt=y-t, y(0)=0, system of dimension 2. Also find the corresponding
dz/dt = z, z(O) = l initial conditions for y(O) and z(O), and solve the initial-
5. For the system in Exercise 2, there is a vector-valued value problem for y and z.
function F(t, x), with x = (x, y) such that the system 13. y + y + y = 0, y(0) = l, j,(0) = l
has the form dx/dt = F(t, x).
14. y + tj- = t, y(0) = 0, y(0} = l
(a) Find F.
(b) Find the speed of a trajectory through x at time t. Find first-order systems equivalent to the differential
equations 15 to 18 by setting dy/dt = z and, if appro-
6. For the system in Exercise 4, there is a vector-valued
priate, dz/dt = w.
function F(t, x), with x = (x, y, z) such that the system
has the form dx/dt F(t, x).= 15. d 2 y/dt 2 + (dy/dt) 2 + y2 = e1
(a) Find F. 16. d 2 y/dt 2 = y (dy/dt)
(b) Find the speed of a trajectory through x at time t.
17. d 3 y/dt 3 = (d 2 y/dt 2 }2 - y (dy/dt) - t
Sketch the vector fields 7 to IO by drawing a few arrows
18. d 3 y/dt 3 = 12xdy/dt
for F(x) or F(t. x) with their tails at selected points x of
the fom1 (x, y) or (x, y, z). In Exercises 8 and 10, make In Exercises 19 to 22, reduce the system to normal form,
separate sketches for t = - 1, t = 0, and t 1. = with each first derivative by itself on the left side.
19. $dx/dt + dy/dt = t$,
    $dx/dt - dy/dt = y$
20. $dx/dt + dy/dt = y$,
    $dx/dt + 2\,dy/dt = x$
21. $2\,dx/dt + dy/dt + x + 5y = t$,
    $dx/dt + dy/dt + 2x + 2y = 0$
22. $dx/dt - dy/dt = e^{-t}$,
    $dx/dt + dy/dt = e^{t}$
23. Sketch a phase portrait for the second-order equation $\ddot y = \dot y$, indicating the directions of traversal where appropriate. Note however that since one of the equations in the system will be $\dot y = z$ these directions are always left to right when $z > 0$ and right to left when $z < 0$. [Hint: One set of trajectories consists of the individual points on the $y$ axis.]
24. Sketch a phase portrait for the second-order equation $\ddot y = y$, indicating the directions of traversal. [Hint: Two families of hyperbolas make up the trajectories.]
25. Consider the 2-dimensional coupled system
    $$\dot x = x + y, \qquad \dot y = 4x + y.$$
    (a) Change the coordinates $(x, y)$ to $(z, w)$ with the relations
    $$x = z + w, \qquad y = 2z - 2w,$$
    and show that this change results in the uncoupled system
    $$\dot z = 3z, \qquad \dot w = -w.$$
    (b) Solve the uncoupled system in part (a) for $z$ and $w$, and use the coordinate change to solve for $x$ and $y$. Then verify by substitution that your solution for $x$ and $y$ satisfies the given system.

In Exercises 26 to 29, convert the problem of solving the system into solving an equation of the form $dy/dx = R(x, y)$ as in Example 4 of the text and then solve the first-order equation and sketch some trajectories of the system.
26. $dx/dt = x - y$, $dy/dt = x^2 - y^2$
27. $dx/dt = e^{2y}$, $dy/dt = e^{x+y}$
28. $dx/dt = e^{t} y$, $dy/dt = e^{t} x$
29. $dx/dt = xy + y^2$, $dy/dt = x + y$
30. The 1-dimensional equation $\ddot y = -g$ is used to determine the motion of an object moving perpendicularly to the surface of a large attracting body and subject to no other forces. Suppose the range of motion is extended to a vertical plane with horizontal coordinate $x$. If there are still no forces acting horizontally, the single equation is replaced by the 2-dimensional uncoupled system
    $$\ddot x = 0, \qquad \ddot y = -g.$$
    (a) Solve the 2-dimensional system, subject to the four initial conditions
    $$x(0) = 0, \quad y(0) = 0, \quad \dot x(0) = z_0 > 0, \quad \dot y(0) = w_0 > 0.$$
    (b) Show that the trajectory of the solution found in part (a) follows a parabolic path.
    (c) Show that the maximum height is attained when the horizontal displacement is $z_0 w_0/g$ and that the maximum height is $w_0^2/(2g)$.
    (d) Show that the horizontal distance traversed before returning to height $y(0) = 0$ is $2 z_0 w_0/g$. Show also that for a given initial speed $v_0 = \sqrt{z_0^2 + w_0^2}$, this horizontal distance is maximized by having $z_0 = w_0$.
31. A projectile fired against air resistance proportional to velocity satisfies the uncoupled system
    $$\ddot x = -k\dot x, \qquad \ddot y = -k\dot y - g, \qquad k > 0.$$
    (a) Solve the 2-dimensional system, subject to the four initial conditions
    $$x(0) = 0, \quad y(0) = 0, \quad \dot x(0) = z_0 > 0, \quad \dot y(0) = w_0 > 0.$$
    The equations are linear with constant coefficients so you can use the methods of Chapter 11.
    (b) Show that the trajectory of the solution found in part (a) rises to a unique maximum at time $t_{\max} = (1/k)\ln(1 + k w_0/g)$.
    (c) Show that the position of maximum height has coordinates
    $$x_{\max} = \frac{z_0 w_0}{g + k w_0}, \qquad y_{\max} = \frac{w_0}{k} - \frac{g}{k^2}\ln\!\left(1 + \frac{k w_0}{g}\right).$$
    (d) Show that as $k$ tends to zero the maximum height tends to $\tfrac12 w_0^2/g$.
32. Here is an outline of a derivation of the pendulum equation $\ddot\theta = -(g/l)\sin\theta$ using a system of differential equations.
    (a) Show that if $x$, $y$ are rectangular coordinates and $\theta$ is the angle formed by the vector $(x, y)$, measured counterclockwise from the downward vertical direction, then $x = l\sin\theta$ and $y = -l\cos\theta$. [Hint: These equations are a slight modification of the usual polar coordinate relations.]
    (b) Show that $\ddot x = -l\sin\theta\,\dot\theta^2 + l\cos\theta\,\ddot\theta$ and $\ddot y = l\cos\theta\,\dot\theta^2 + l\sin\theta\,\ddot\theta$.
    (c) Use the representation $\ddot x = 0$, $\ddot y = -g$ for the coordinates of the acceleration of gravity together with the result of part (b), to derive the pendulum equation. (This derivation safely ignores the lengthwise, or radial, force on the pendulum, since that force is always perpendicular to the path of motion.) [Hint: Eliminate terms containing $\dot\theta^2$.]
33. Heat exchange. The temperatures $u(t) \ge v(t)$ of two bodies in thermal contact with each other may be governed for the warmer body by Newton's law of cooling and for the cooler body by the analogous heating law:
    $$\frac{du}{dt} = -p(u - v), \qquad \frac{dv}{dt} = q(u - v),$$
    where $p$ and $q$ are positive constants and the equations are subject to initial conditions $u(0) = u_0$, $v(0) = v_0$. (Note that $p$ may be different from $q$ if the two bodies have different capacities to absorb heat.) This is a coupled system, but because of its simple form we can solve it as follows.
    (a) Show that $q\,du/dt + p\,dv/dt = 0$. Then integrate with respect to $t$ to show that $q u(t) + p v(t) = c_0$, where $c_0 = q u_0 + p v_0$.
    (b) Use the relation between $u$ and $v$ derived in part (a) together with the given system to derive a single differential equation satisfied by $u(t)$. Then solve this equation using the initial conditions.
    (c) Find a formula for $v(t)$.
    (d) Find out how long it takes for the initial temperature difference between the two bodies to be cut in half.
34. Bugs in mutual pursuit. Four identical bugs are on a flat table, each moving at the same constant speed $v$. Use $(x, y)$-coordinates on the table, and locate bugs 1 through 4 initially in respective quadrants 1 through 4, each at one of the points $(\pm 1, \pm 1)$. Bug 1 always heads directly toward bug 2, bug 2 toward bug 3, bug 3 toward bug 4, and bug 4 toward bug 1, so their paths are mutually congruent.
    (a) Use the symmetry of the paths to show that the bugs are at all times at the corners of a square, and in particular, if bug 1 is at $(x, y)$, then 2, 3 and 4 are respectively at $(-y, x)$, $(-x, -y)$ and $(y, -x)$.
    (b) Show for a bug at $(x, y)$ that $\dot y/\dot x = (y - x)/(x + y)$ and that $\dot x^2 + \dot y^2 = v^2$.
    (c) Use part (b) to show that the path $(x, y) = (x(t), y(t))$ followed by bug 1 satisfies the nonlinear autonomous system
    $$\frac{dx}{dt} = \frac{-v}{\sqrt2}\,\frac{x + y}{\sqrt{x^2 + y^2}}, \qquad \frac{dy}{dt} = \frac{v}{\sqrt2}\,\frac{x - y}{\sqrt{x^2 + y^2}}.$$

1D Existence, Uniqueness, and Flows (Optional)


Reduction of a system to first-order normal form isn't always possible, but when it is, and certain differentiability conditions are met, it's possible to apply the following theorem, the proof of which we omit. See M. W. Hirsch and S. Smale, Differential Equations, Dynamical Systems, and Linear Algebra, Ch. 15, Academic Press (1974).
1.1 Existence and Uniqueness Theorem. Suppose that $\mathbf{F}(t, \mathbf{x})$ and the entries $\partial F_j/\partial x_i$ in its derivative matrix $\mathbf{F}_{\mathbf{x}}(t, \mathbf{x})$ with respect to $\mathbf{x}$ are continuous for $t$ in an interval $I$ containing $t_0$, and for $\mathbf{x}$ in an open rectangle $R$ in $\mathbb{R}^n$ containing $\mathbf{x}_0$. Then the system $\dot{\mathbf{x}} = \mathbf{F}(t, \mathbf{x})$ has a unique solution satisfying $\mathbf{x}(t_0) = \mathbf{x}_0$ on some subinterval $J$ of $I$ containing $t_0$, and $\mathbf{x}(t)$ is a continuously differentiable function of the vector $\mathbf{x}_0$. If in addition the entries in $\mathbf{F}_{\mathbf{x}}(t, \mathbf{x})$ are bounded for all $\mathbf{x}$ in $\mathbb{R}^n$, then the solutions exist for all $t$ in $I$, as, for example, in the case of a nonhomogeneous linear system $\dot{\mathbf{x}} = A(t)\mathbf{x} + \mathbf{b}(t)$.
The uniqueness part of the theorem tells us that for an autonomous system $\dot{\mathbf{x}} = \mathbf{F}(\mathbf{x})$ that satisfies the hypotheses of the theorem, two different solution trajectories can never have a point $\mathbf{x}_0$ in common, that is, "trajectories of an autonomous system can't cross." Put more formally, what happens is this.
1.2 Corollary. If the autonomous system $\dot{\mathbf{x}} = \mathbf{F}(\mathbf{x})$ satisfies the conditions of the Uniqueness Theorem, and two solution trajectories of the system have a point $\mathbf{x}_0$ in common, then on either side of $\mathbf{x}_0$ one trajectory is contained in the other.

Proof. If two trajectories did agree at $\mathbf{x}_0$, this common value taken as initial value would dictate that the trajectories are the same from that time on until one of them terminates. Similarly the reverse trajectory that satisfies $\dot{\mathbf{x}} = -\mathbf{F}(\mathbf{x})$, and that coincides as a curve with a trajectory approaching $\mathbf{x}_0$ from the other side, would also be uniquely determined until termination of one of them. •

EXAMPLE. The autonomous system $\dot x = -y$, $\dot y = x$ has solutions $x(t) = A\cos(t + \alpha)$, $y(t) = A\sin(t + \alpha)$, where $A \ge 0$ and $\alpha$ are constants.

These solutions trace circular trajectories of radius A shown in Figure 12.7(a).

A trajectory of a nonautonomous system can cross itself, and distinct trajectories can cross each other, as the next example explains.

Figure 12.7(b) shows some computer plots of trajectories for the nonautonomous system $\dot x = (1 - t)x - ty$, $\dot y = tx + (1 - t)y$. Each one of the four trajectories is shown crossing one of the others. One trajectory of a nonautonomous system may very well cross another one, or even intersect itself at a nonzero angle, because on arrival at the same point in phase space at a different time there may have been a change of direction in the vector field $\mathbf{F}(t, \mathbf{x})$. Snapshots of the vector field of this system are in Figure 12.5. The graphs of different solutions in $(t, x, y)$-space will have no points in common, because $t$ varies from point to point.

[Figure 12.7: (a) Autonomous system trajectories. (b) Nonautonomous system trajectories.]

Flows. The trajectories of an autonomous system $\dot{\mathbf{x}} = \mathbf{F}(\mathbf{x})$ are called the flow lines of the vector field $\mathbf{F}$, and we can picture them, as shown for example in Figure 12.7(a), as the possible paths followed by fluid particles in a steady fluid flow with velocity vector $\mathbf{F}(\mathbf{x})$ at $\mathbf{x}$. These ideas are also discussed in Chapter 8, Section 4. In what follows we'll assume that the autonomous vector field $\mathbf{F}$ satisfies the conditions of Theorem 1.1 in some region $B$ in $\mathbb{R}^n$, thus guaranteeing (i) that there is a unique flow line through each $\mathbf{x}$ in $B$ and (ii) that distinct flow lines have no points in common. We associate with each such $n$-dimensional vector field $\mathbf{F}$ a family of flow transformations $T_t$ from $B$ to $B$ defined by

1.3 $T_t(\mathbf{x}) = \mathbf{y}(t)$, where $\mathbf{y}(t)$ solves $\dot{\mathbf{y}} = \mathbf{F}(\mathbf{y})$ with initial value $\mathbf{y}(0) = \mathbf{x}$.

In words, $T_t(\mathbf{x})$ is the point on the flow line of $\mathbf{F}$ starting at $\mathbf{x}$ that the flow reaches after time $t$.
The system $\dot x = -y$, $\dot y = x$ has circular trajectories, as in Figure 12.7(a). Thus a flow line of radius $A$ for the vector field $\mathbf{F}(x, y) = (-y, x)$ is parametrized by $x(t) = A\cos(t + \alpha)$, $y(t) = A\sin(t + \alpha)$. To start one of these flow lines at a fixed point $(u, v)$ when $t = 0$, we note that $A = \sqrt{u^2 + v^2}$ and write

$$T_t(u, v) = \left(\sqrt{u^2 + v^2}\,\cos(t + \alpha),\ \sqrt{u^2 + v^2}\,\sin(t + \alpha)\right),$$

where $\alpha$ is an angle that the radius from the origin to $(u, v)$ makes with the positive $x$-axis. (Thus $\alpha(u, v) = \arctan(v/u)$, extended for $u = 0$ and $v \ne 0$ to be the odd multiple of $\pi/2$ that makes $\alpha(u, v)$ continuous.)

Since sine and cosine are periodic functions, the vector-valued function $\phi : \mathbb{R}^3 \to \mathbb{R}^2$ defined by $\phi(t, u, v) = T_t(u, v)$ in the previous example is not one-to-one as a function of $t$ and $(u, v)$ unless $t$ is somehow restricted. However for fixed $t = t_0$, the function $T_{t_0} : \mathbb{R}^2 \to \mathbb{R}^2$ turns out not only to be one-to-one but to have a nice inverse, namely $T_{-t_0}$. It's a straightforward exercise to show that $T_{t_0}$ is just a rotation about the origin through angle $t_0$. Hence the inverse of $T_{t_0}$ is $T_{-t_0}$. We'll see that this simple relationship between $T_t$ and its inverse holds very generally.
The flow transformations $T_t$ defined above have the composition property

1.4 $T_t T_s = T_{t+s}$; in other words, $T_t(T_s(\mathbf{x})) = T_{t+s}(\mathbf{x})$,

whenever all three transformations are defined. Equation 1.4 holds because the system
$$\frac{d\mathbf{x}}{dt} = \mathbf{F}(\mathbf{x}),$$
of which $\mathbf{y}(t) = T_t(\mathbf{x})$ is a solution, has a unique solution starting at $T_t(\mathbf{x}_0)$ whose value $s$ time units later must coincide with the unique solution value achieved by starting at $\mathbf{x}_0$ and running for time $t + s$. Furthermore, the reversed system,
$$\frac{d\mathbf{x}}{dt} = -\mathbf{F}(\mathbf{x}),$$
has solution trajectories traced in the direction $-\mathbf{F}(\mathbf{x})$ exactly opposite to that of the solutions of $d\mathbf{x}/dt = \mathbf{F}(\mathbf{x})$. We can use solutions of the reversed system to define $T_t$ for $t < 0$ by $T_t(\mathbf{x}_0) = \mathbf{z}(-t)$, where $\mathbf{z}(t)$ satisfies
$$\frac{d\mathbf{z}}{dt} = -\mathbf{F}(\mathbf{z}), \qquad \mathbf{z}(0) = \mathbf{x}_0.$$
It follows that each of $T_{-t}$ and $T_t$ is an inverse operator to the other, so

1.5 $T_{-t} T_t = T_t T_{-t} = I$,

where $I$ is an identity operator that leaves points fixed.
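
The composition and inverse properties 1.4 and 1.5 can be checked numerically for the rotation flow of $\dot x = -y$, $\dot y = x$. The following is a minimal sketch, assuming the closed form $T_t(u, v) = (u\cos t - v\sin t,\ u\sin t + v\cos t)$, which follows from the parametrization above by the angle-addition formulas; the function names `flow` and `close`, the starting point, and the times are illustrative choices, not part of the text.

```python
import math

def flow(t, point):
    """Flow transformation T_t of x' = -y, y' = x: rotation through angle t."""
    u, v = point
    return (u * math.cos(t) - v * math.sin(t),
            u * math.sin(t) + v * math.cos(t))

def close(p, q, tol=1e-12):
    """True if two points agree coordinate by coordinate within tol."""
    return all(abs(a - b) <= tol for a, b in zip(p, q))

p0 = (2.0, -1.0)   # arbitrary starting point (illustrative assumption)
t, s = 0.7, 1.9    # arbitrary times (illustrative assumption)

# Property 1.4: flowing for time s and then for time t equals flowing for t + s.
composition_ok = close(flow(t, flow(s, p0)), flow(t + s, p0))

# Property 1.5: T_{-t} undoes T_t.
inverse_ok = close(flow(-t, flow(t, p0)), p0)

print(composition_ok, inverse_ok)
```

A check of this kind does not prove the properties; it only illustrates them for one vector field where the flow is known explicitly.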

We now consider the effect of $T_t$ on area in $\mathbb{R}^2$ and volume in higher dimensions. This effect is measured locally for each fixed $t$ by the Jacobian determinant $J_t$ of the transformation $T_t$. Section 4 of Chapter 8 contains a detailed examination of the consequences of the next theorem in the context of interpreting the divergence of a vector field.
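1.6 Theorem. Let $\mathbf{F}(\mathbf{x})$ be a continuously differentiable vector field on $\mathbb{R}^n$ such that the derivative matrix $\mathbf{F}'(\mathbf{x})$ has bounded entries. The system $\dot{\mathbf{x}} = \mathbf{F}(\mathbf{x})$ defines for $t \ge 0$ a family of one-to-one transformations $\mathbb{R}^n \to \mathbb{R}^n$ by $T_t(\mathbf{x}) = \mathbf{y}(t)$, where $\mathbf{y}(t)$ represents the flow line of the vector field $\mathbf{F}$ starting at $\mathbf{x}$. For fixed $t$ the flow transformation $T_t$ is volume-preserving in its action on a region $B$ of $\mathbb{R}^n$ if and only if $\operatorname{div}\mathbf{F}$ is identically zero in $B$. If $\operatorname{div}\mathbf{F} < 0$ in $B$, then $T_t$ is volume-decreasing in $B$, and if $\operatorname{div}\mathbf{F} > 0$ in $B$, then $T_t$ is volume-increasing in $B$. In the special case that $\operatorname{div}\mathbf{F}(\mathbf{x})$ is constant, the Jacobian determinant of $T_t$ is $J_t(\mathbf{x}) = e^{t\,\operatorname{div}\mathbf{F}(\mathbf{x})}$.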

Proof. The existence of a unique solution curve passing through each $\mathbf{x}_0$ ensures that $T_t$ is a well-defined transformation. Furthermore $T_t$ is one-to-one because if $T_t(\mathbf{x}_0) = T_t(\mathbf{x}_1)$ for some $t > 0$, and for $\mathbf{x}_0 \ne \mathbf{x}_1$, then there would be two distinct solution curves to the system
$$\frac{d\mathbf{x}}{dt} = -\mathbf{F}(\mathbf{x})$$
starting at $\mathbf{y}_0 = T_t(\mathbf{x}_0) = T_t(\mathbf{x}_1)$ and passing back through $\mathbf{x}_0$ and $\mathbf{x}_1$, respectively. But this is impossible, again by the uniqueness theorem.
To see what the transformation $T_t$ does to volumes, we use the following.

1.7 Lemma. The Jacobian determinant $J_t(\mathbf{x})$ satisfies
$$\frac{d}{dt}J_t(\mathbf{x}) = \operatorname{div}\mathbf{F}(\mathbf{y}(t))\,J_t(\mathbf{x}).$$


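Lemma Proof. (Exercise 10 is the 2-dimensional case of this proof.) Let
$$T_t(\mathbf{x}) = \mathbf{y}(t) = (y_1(t), \ldots, y_n(t)),$$
where $\dot{\mathbf{y}}(t) = \mathbf{F}(\mathbf{y}(t))$ and $\mathbf{y}(0) = \mathbf{x}$. By Theorem 1.1 $T_t(\mathbf{x})$ is a continuously differentiable function of $\mathbf{x}$. We use the Leibniz notation of Chapter 7, Section 4D to write
$$J_t = \frac{\partial(y_1(t), \ldots, y_n(t))}{\partial(x_1, \ldots, x_n)}.$$
The derivative of a determinant is the sum of the determinants obtained by differentiating one row at a time, as shown in Exercise 9. With rows indexed by $i$, we then have
$$\frac{dJ_t}{dt} = \sum_{i=1}^{n} \frac{\partial(y_1, \ldots, \dot y_i, \ldots, y_n)}{\partial(x_1, \ldots, x_n)}.$$
By the chain rule, the $ik$th determinant entry in the $i$th term above is
$$\frac{\partial \dot y_i}{\partial x_k} = \sum_{j=1}^{n} \frac{\partial \dot y_i}{\partial y_j}\,\frac{\partial y_j}{\partial x_k}.$$
By row-linearity of the determinant, the $i$th term in the sum for $dJ_t/dt$ is then
$$\sum_{j=1}^{n} \frac{\partial \dot y_i}{\partial y_j}\,\frac{\partial(y_1, \ldots, y_j, \ldots, y_n)}{\partial(x_1, \ldots, x_k, \ldots, x_n)},$$
where $y_j$ occupies the $i$th row. But the determinants in this last sum are 0 (two rows equal) unless $j = i$, in which case the determinant is $J_t$ and its multiplier is $\partial \dot y_i/\partial y_i$. To finish proving the lemma we have
$$\frac{dJ_t}{dt} = J_t \sum_{i=1}^{n} \frac{\partial \dot y_i(t)}{\partial y_i} = J_t \sum_{i=1}^{n} \frac{\partial F_i(\mathbf{y}(t))}{\partial y_i} = J_t \operatorname{div}\mathbf{F}(\mathbf{y}(t)).$$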

We'll now finish proving the theorem. If $\operatorname{div}\mathbf{F} = 0$ then by the lemma, $J_t$ is constant as a function of $t$. But $T_0$ is an identity transformation, so $J_0 = 1$ and $J_t = 1$ for $t > 0$ also. Hence the transformation $T_t$ is volume-preserving by the Jacobi change-of-variable theorem for multiple integrals. Conversely, volume-preservation implies $J_t = 1$, so $\operatorname{div}\mathbf{F} = 0$. Finally, the first-order linear differential equation for $J_t$ with initial condition $J_0 = 1$ has the solution
$$J_t(\mathbf{x}) = \exp\left(\int_0^t \operatorname{div}\mathbf{F}(T_u(\mathbf{x}))\,du\right).$$
(Note that if $\operatorname{div}\mathbf{F}$ is constant, then the exponent is just $t\,\operatorname{div}\mathbf{F}$.) The statements about volume-decreasing and volume-increasing follow as previously by Jacobi's theorem. •

EXAMPLE. You can see directly that the uncoupled system $\dot x = x$, $\dot y = 2y$, $\dot z = 3z$ has the solution $x(t) = ue^{t}$, $y(t) = ve^{2t}$, $z(t) = we^{3t}$ with initial values $x(0) = u$, $y(0) = v$, $z(0) = w$. The flow generated by the vector field $\mathbf{F}(x, y, z) = (x, 2y, 3z)$ is therefore

$$\phi(t, u, v, w) = T_t(u, v, w) = (ue^{t}, ve^{2t}, we^{3t}).$$

Since $\operatorname{div}\mathbf{F}(x, y, z) = 1 + 2 + 3 = 6$, either Theorem 1.6 or direct computation tells us that the Jacobian determinant of $T_t$ is $J_t(u, v, w) = e^{6t}$. It then follows from Jacobi's theorem 4.4 of Chapter 7, Section 4D that in a time interval of length $t$ the flow transformation $T_t$ sends a set $B$ of volume $V(B)$ into a set $T_t(B)$ of volume $e^{6t}V(B)$. For this example, $T_t$ expands length in each coordinate direction separately, giving again the volume expansion factor $e^{t}e^{2t}e^{3t} = e^{6t}$.
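
The value $J_t = e^{6t}$ can also be approximated without using the explicit solution. The sketch below is only an illustration under stated assumptions: it integrates $\dot x = x$, $\dot y = 2y$, $\dot z = 3z$ with a classical fourth-order Runge-Kutta step (a standard method, not one taken from this section), estimates the derivative matrix of $T_t$ by central differences, and compares its determinant with $e^{6t}$. The names `rk4_flow` and `jacobian_det`, the step count, and the sample point are hypothetical choices.

```python
import math

def F(p):
    """Vector field F(x, y, z) = (x, 2y, 3z) from the example."""
    x, y, z = p
    return (x, 2.0 * y, 3.0 * z)

def rk4_flow(p, t, steps=200):
    """Approximate T_t(p) with classical fourth-order Runge-Kutta steps."""
    h = t / steps
    for _ in range(steps):
        k1 = F(p)
        k2 = F(tuple(a + 0.5 * h * b for a, b in zip(p, k1)))
        k3 = F(tuple(a + 0.5 * h * b for a, b in zip(p, k2)))
        k4 = F(tuple(a + h * b for a, b in zip(p, k3)))
        p = tuple(a + h * (b + 2 * c + 2 * d + e) / 6.0
                  for a, b, c, d, e in zip(p, k1, k2, k3, k4))
    return p

def jacobian_det(p, t, eps=1e-6):
    """Determinant of the derivative of T_t at p, by central differences."""
    rows = []
    for j in range(3):
        plus, minus = list(p), list(p)
        plus[j] += eps
        minus[j] -= eps
        fp, fm = rk4_flow(tuple(plus), t), rk4_flow(tuple(minus), t)
        # rows[j][i] approximates d y_i / d x_j; transposing leaves det unchanged.
        rows.append([(a - b) / (2.0 * eps) for a, b in zip(fp, fm)])
    a, b, c = rows
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

t = 0.5
estimate = jacobian_det((1.0, -2.0, 0.5), t)
print(estimate, math.exp(6.0 * t))
```

Because this flow is linear in the starting point, the estimate does not depend on which point is chosen, in agreement with the constant-divergence case of Theorem 1.6.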

EXERCISES

1. The domain of $t$-values for which the solution of even an autonomous system exists may be quite restricted. Illustrate this point by deriving the explicit solution to the 1-dimensional initial-value problem $\dot x = ax^2$, $x(0) = 1$, where $a > 0$ is constant.
2. If $x(0) = 0$ Theorem 1.1 on existence and uniqueness of solutions fails to apply to the 1-dimensional equation
   $$\dot x = \begin{cases} \sqrt{x}, & x \ge 0, \\ 0, & x < 0. \end{cases}$$
   (a) Explain why the theorem doesn't apply if $x(0) = 0$.
   (b) Find two distinct solutions to the equation, both satisfying $x(0) = 0$.
3. Can the flow of a continuously differentiable 2-dimensional vector field send a region of positive area into a region of area zero in finite time? Explain your answer.
4. What is the flow of the identically zero vector field on $\mathbb{R}^3$?
5. A 2-dimensional Hamiltonian system has the form
   $$\dot x = \frac{\partial H}{\partial y}, \qquad \dot y = -\frac{\partial H}{\partial x},$$
   where the real-valued Hamiltonian function $H(x, y)$ is assumed to be twice continuously differentiable.
   (a) Show that the flow of a 2-dimensional Hamiltonian system preserves areas.
   (b) Show that the system $\dot x = -y$, $\dot y = x$ is Hamiltonian, and find a Hamiltonian function $H(x, y)$ for this system.
   (c) Show that the flow lines of a 2-dimensional Hamiltonian system follow level curves of the associated Hamiltonian function.
6. Show that the second-order equation $\ddot x = -f(x)$ is equivalent to a first-order system if we set $y = \dot x$. Then show that the first-order system is a Hamiltonian system, as defined in the previous exercise, with Hamiltonian
   $$H(x, y) = \tfrac12 y^2 + U(x), \quad\text{where } U'(x) = f(x).$$
   The function $U(x)$ is the potential energy of the system, and $H(x, y)$ is the total energy.
7. A 2-dimensional gradient system has the form
   $$\dot x = \frac{\partial U}{\partial x}, \qquad \dot y = \frac{\partial U}{\partial y},$$
   where the real-valued potential function $U(x, y)$ is assumed to be twice continuously differentiable. Show that the flow of such a system preserves areas if and only if $U_{xx} + U_{yy}$ is identically zero, that is, if and only if $U(x, y)$ is a harmonic function.
8. Consider the 2-dimensional uncoupled system $\dot x = x^3$, $\dot y = y^3$.
   (a) Sketch the vector field of the system near the origin.
   (b) Compute the Jacobian determinant $J_t$ of the flow transformation $T_t$ of the system.
   (c) Let $B$ be a region of positive area in $\mathbb{R}^2$. Use the result of part (b) to show that the area of the image of $B$ under $T_t$ is bigger than the area of $B$ if $t > 0$ and less than the area of $B$ if $t < 0$.
   (d) Can you draw the same conclusion as in part (c) if the original system is replaced by $\dot x = x^2$, $\dot y = y^2$? Explain your reasoning.
9. Show that if the entries in an $n$-by-$n$ matrix $A(t) = (a_{ij}(t))$ are differentiable functions of a real variable $t$, then the derivative of $\det A(t)$ is computed by differentiating the entries in one row of $A(t)$ at a time and then adding the resulting $n$ determinants.
   (a) Carry out the proof for 2-by-2 matrices by first expanding the determinant and then applying the product rule for differentiation.
   (b) Carry out the proof for $n$-by-$n$ matrices by induction, first expanding the determinant by one row, for example the first row, and then applying the induction hypothesis to the cofactors.
10. If the proof of the lemma for Theorem 1.6 is restricted to dimension 2, the computation is in principle no simpler but we avoid using so many subscripts. In particular, we deal with the flow transformation $T_t(u, v) = (x(t), y(t))$ generated by the system $\dot x = F(x, y)$, $\dot y = G(x, y)$ with initial conditions $x(0) = u$, $y(0) = v$.
    (a) For fixed $t$ let $J_t(u, v)$ be the Jacobian determinant of $T_t(u, v)$ with respect to $u$ and $v$. Use the system to show that
    $$\frac{d}{dt}J_t = (F_u y_v - F_v y_u) + (x_u G_v - x_v G_u).$$
    (b) Apply the chain rule, for example $F_u = F_x x_u + F_y y_u$, to the partials of $F$ and $G$ in part (a) to show that $(d/dt)J_t = (F_x + G_y)J_t = \operatorname{div}(F, G)\,J_t$. Here $F$ and $G$ are evaluated at $(x(t), y(t))$.
    (c) Noting that $T_0$ is an identity transformation, so that $J_0 = \det T_0 = 1$, solve the first-order linear differential equation for $J_t$ in part (b) to show that $J_t = \exp\left(\int_0^t \operatorname{div}(F, G)\,dt\right)$, where $F$ and $G$ are evaluated at $(x(t), y(t))$.
11. The last part of the proof of Theorem 1.6 shows that the Jacobian determinant of a flow transformation $T_t$ is $J_t(\mathbf{x}) = \exp\left(\int_0^t \operatorname{div}\mathbf{F}(T_u(\mathbf{x}))\,du\right)$.
    (a) Show that if $\operatorname{div}\mathbf{F}$ is constant, then the exponent is $t\,\operatorname{div}\mathbf{F}$.
    (b) Show generally that the exponent is $t$ times the time-average of $\operatorname{div}\mathbf{F}$ over the part of the flow line starting at $\mathbf{x}$ traced between time 0 and time $t$.

SECTION 2 LINEAR SYSTEMS


2A Elimination Method
In this section we define what it means for a system of first-order differential equations
to be linear. As in the case of a single differential equation, the methods of solution
are explicit for constant-coefficient linear systems, and we'll concentrate on such
systems. Unfortunately there is no generally applicable analogue of the method for
solving a single linear differential equation with nonconstant coefficients.
Recall that a single first-order differential equation is called linear if it is equivalent to an equation of the form
$$\frac{dx}{dt} = a(t)x + b(t),$$
where $a$ and $b$ are real-valued functions defined on some interval. Similarly an $n$-dimensional first-order system of differential equations is called a linear system if it has the normal form
$$\frac{d\mathbf{x}}{dt} = A(t)\mathbf{x} + \mathbf{b}(t),$$
where $A(t)$ is an $n$-by-$n$ matrix
$$A(t) = \begin{pmatrix} a_{11}(t) & \cdots & a_{1n}(t) \\ a_{21}(t) & \cdots & a_{2n}(t) \\ \vdots & & \vdots \\ a_{n1}(t) & \cdots & a_{nn}(t) \end{pmatrix}$$
and $\mathbf{b}(t)$ is an $n$-dimensional vector
$$\mathbf{b}(t) = \begin{pmatrix} b_1(t) \\ b_2(t) \\ \vdots \\ b_n(t) \end{pmatrix},$$
both with coordinate functions defined on an interval.

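EXAMPLE 1. The right side of the vector differential equation
$$\begin{pmatrix} dx/dt \\ dy/dt \end{pmatrix} = \begin{pmatrix} 2 & 4 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} 2 \\ 4 \end{pmatrix}$$
is, after carrying out the matrix product in its first term,
$$\begin{pmatrix} 2 & 4 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} 2 \\ 4 \end{pmatrix} = \begin{pmatrix} 2x + 4y + 2 \\ x - y + 4 \end{pmatrix}.$$
Hence the vector equation can also be written as a system,
$$\frac{dx}{dt} = 2x + 4y + 2, \qquad \frac{dy}{dt} = x - y + 4.$$
It's convenient to denote differentiation with respect to $t$ by $D$; that is, let $D = d/dt$. Then regrouping the terms involving $x$ and $y$ on the left side gives
$$\begin{aligned} (D - 2)x - 4y &= 2 \\ -x + (D + 1)y &= 4. \end{aligned}$$
At this point, we follow a routine similar to the row reduction method for solving linear algebraic equations. For example, we can operate on the second equation with the differential operator $(D - 2)$ to eliminate $x$ when we add the first equation to the second:
$$\begin{aligned} (D - 2)x - 4y &= 2 \\ -(D - 2)x + (D - 2)(D + 1)y &= (D - 2)4. \end{aligned}$$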
Addition gives
$$(D - 2)(D + 1)y - 4y = (D - 2)4 + 2$$
or
$$D^2 y - Dy - 6y = -6.$$
We can solve this equation by the methods of the previous chapter, because it contains only one unknown function, namely $y(t)$. The characteristic equation is
$$r^2 - r - 6 = (r + 2)(r - 3) = 0,$$
which has roots $r_1 = -2$ and $r_2 = 3$. Hence the general solution of the associated homogeneous equation is
$$y_h(t) = c_1 e^{-2t} + c_2 e^{3t}.$$
By inspection, we see that $y_p = 1$ is a particular solution, so what we have shown is that if $(x(t), y(t))$ is an arbitrary solution of the original system, then $y(t)$ must be of the form
$$y(t) = c_1 e^{-2t} + c_2 e^{3t} + 1.$$
We use the second equation of the system to express $x(t)$ directly in terms of $y(t)$:
$$x(t) = (D + 1)y - 4 = -c_1 e^{-2t} + 4c_2 e^{3t} - 3.$$
If we put $x(t)$ and $y(t)$ together in vector form, we get
$$\begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = c_1 e^{-2t}\begin{pmatrix} -1 \\ 1 \end{pmatrix} + c_2 e^{3t}\begin{pmatrix} 4 \\ 1 \end{pmatrix} + \begin{pmatrix} -3 \\ 1 \end{pmatrix}.$$

Our method of solution guaranteed only that every solution of the original system
must be a special case of the general formula just obtained. Therefore we should
substitute the formula into the system to see if it really provides a solution for every
choice of $c_1$ and $c_2$. The general theory to be developed in Chapter 13, Section 3
applies, showing that for systems of the form dx/dt = Ax+b(t), the general solution
of an n-dimensional system contains n arbitrary constants. Therefore substitution is
necessary in such an example only if the number of arbitrary constants present
is greater than the dimension of the system, in which case the substitution leads to
relations between the constants. Substitution is always a useful check on the accuracy
of a computation.
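
The substitution check recommended above is easy to mechanize. The following is a minimal sketch, assuming the general solution $x = -c_1e^{-2t} + 4c_2e^{3t} - 3$, $y = c_1e^{-2t} + c_2e^{3t} + 1$ from Example 1; it differentiates those formulas by hand and evaluates the residuals of $dx/dt = 2x + 4y + 2$ and $dy/dt = x - y + 4$ at a few sample values. The function names and the sample constants are illustrative choices.

```python
import math

def solution(t, c1, c2):
    """General solution (x(t), y(t)) found in Example 1."""
    x = -c1 * math.exp(-2 * t) + 4 * c2 * math.exp(3 * t) - 3
    y = c1 * math.exp(-2 * t) + c2 * math.exp(3 * t) + 1
    return x, y

def derivative(t, c1, c2):
    """Term-by-term derivatives of the two formulas in solution()."""
    dx = 2 * c1 * math.exp(-2 * t) + 12 * c2 * math.exp(3 * t)
    dy = -2 * c1 * math.exp(-2 * t) + 3 * c2 * math.exp(3 * t)
    return dx, dy

def residual(t, c1, c2):
    """Left side minus right side for each equation of the system."""
    x, y = solution(t, c1, c2)
    dx, dy = derivative(t, c1, c2)
    return dx - (2 * x + 4 * y + 2), dy - (x - y + 4)

# Sample constants and times chosen arbitrarily for the check.
samples = [(1.0, 0.0), (0.0, 1.0), (-2.5, 0.75)]
times = [0.0, 0.3, 1.0]
max_residual = max(abs(r)
                   for c1, c2 in samples
                   for t in times
                   for r in residual(t, c1, c2))
print(max_residual)  # should be zero up to rounding error
```

A numerical spot check like this catches sign and coefficient slips; it is not a substitute for carrying out the substitution symbolically.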

The previous example was misleadingly simple, because after solving for y(t)
it was not necessary to solve another differential equation to find x(t); as a result,
no extra arbitrary constants were introduced, so there was no need to find relations
among the constants so the number of constants would equal the dimension of the
system. We'll content ourselves here with such simple examples, leaving the more
complicated ones for Chapter 13 where we use more efficient methods.

The general theory of linear operators applies to linear systems of differential equations. If we let $D = d/dt$, then the equation
$$\frac{d\mathbf{x}}{dt} = A(t)\mathbf{x} + \mathbf{b}(t)$$
takes the form
$$(D - A(t))\mathbf{x} = \mathbf{b}(t)$$
since both $D$ and $A(t)$ act as operators on vector functions $\mathbf{x}(t)$. The solutions of the system can then be derived in two parts: the homogeneous solutions (i.e., solutions of the homogeneous equation) and a particular solution. By Theorem 3.1 of Chapter 11, Section 3, each solution of the nonhomogeneous equation is the sum of a fixed particular solution and some solution of the homogeneous linear equation
$$\frac{d\mathbf{x}}{dt} = A(t)\mathbf{x}.$$
For instance, in the previous example
$$\mathbf{x}_p(t) = \begin{pmatrix} -3 \\ 1 \end{pmatrix}$$
is a particular solution that happens to be constant, and
$$\mathbf{x}_h(t) = c_1\begin{pmatrix} -e^{-2t} \\ e^{-2t} \end{pmatrix} + c_2\begin{pmatrix} 4e^{3t} \\ e^{3t} \end{pmatrix}$$
represents all the homogeneous solutions.


2B Nonstandard Forms
Two somewhat more general looking types of constant-coefficient linear systems
arise in applications. Both reduce to the normal form we have already considered,
namely systems of the form dx/dt = Ax+b. One advantage of this reduction is that it
enables us to apply the geometric intuition associated with tangent vectors and vector
fields to equations to which these interpretations are not directly applicable. Another
advantage is that the general theory of linear equations develops most naturally in
standard form. Finally, standard form is used for application of numerical methods,
where in practice we rely most heavily on existence and uniqueness theory.
The next two examples illustrate two types of reduction to standard form and show
how to solve them by the elimination method we have already used in Example 1.

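EXAMPLE 2. The second-order system
$$\frac{d^2x}{dt^2} = x + 2y + t, \qquad \frac{d^2y}{dt^2} = 3x + 2y$$
reduces to a first-order system of dimension 4 by letting
$$u = \frac{dx}{dt}, \qquad v = \frac{dy}{dt}.$$
The system then becomes
$$\begin{aligned} \frac{du}{dt} &= x + 2y + t \\ \frac{dv}{dt} &= 3x + 2y \\ \frac{dx}{dt} &= u \\ \frac{dy}{dt} &= v. \end{aligned}$$
This system is of the standard form $d\mathbf{x}/dt = A\mathbf{x} + \mathbf{b}(t)$, where
$$A = \begin{pmatrix} 0 & 0 & 1 & 2 \\ 0 & 0 & 3 & 2 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} \quad\text{and}\quad \mathbf{b}(t) = \begin{pmatrix} t \\ 0 \\ 0 \\ 0 \end{pmatrix}.$$
The order $u, v, x, y$ has been used in forming the matrix. Alternatively, the system takes the form
$$\begin{aligned} Du - x - 2y &= t \\ Dv - 3x - 2y &= 0 \\ -u + Dx &= 0 \\ -v + Dy &= 0. \end{aligned}$$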
The numerical methods in Section 4 will apply directly to the system in standard form. We take up matrix methods for solving the first-order system in Chapter 13, Sections 1 to 3, but here we use just the techniques of Chapter 11. A moment's thought shows that the elimination method applied to the first-order system will simply take us back to the original second-order system, or one equivalent to it. Therefore we might as well try to solve the second-order system directly by elimination. We first solve the associated homogeneous system, and, as usual, we write $D = d/dt$ to get
$$\begin{aligned} (D^2 - 1)x - 2y &= 0 \\ -3x + (D^2 - 2)y &= 0. \end{aligned}$$
If we multiply the second equation by 2 and operate on the first with $(D^2 - 2)$, then addition of the resulting equations eliminates $y$:
$$(D^2 - 2)(D^2 - 1)x - 6x = 0.$$
Multiplying out the operators gives
$$(D^4 - 3D^2 - 4)x = 0.$$
We solve the equation by finding its characteristic roots from the equation
$$r^4 - 3r^2 - 4 = (r^2 + 1)(r^2 - 4) = 0;$$

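they are evidently $r_1 = i$, $r_2 = -i$, $r_3 = 2$, $r_4 = -2$. Hence the homogeneous solution for $x$ is
$$x(t) = c_1\cos t + c_2\sin t + c_3 e^{2t} + c_4 e^{-2t}.$$
The first of the two homogeneous equations allows us to solve for $y$ directly:
$$y(t) = \tfrac12 (D^2 - 1)x(t).$$
A straightforward calculation shows that
$$y(t) = -c_1\cos t - c_2\sin t + \tfrac32 c_3 e^{2t} + \tfrac32 c_4 e^{-2t}.$$
In vector form the homogeneous solution looks like
$$\begin{pmatrix} x_h(t) \\ y_h(t) \end{pmatrix} = c_1\begin{pmatrix} \cos t \\ -\cos t \end{pmatrix} + c_2\begin{pmatrix} \sin t \\ -\sin t \end{pmatrix} + c_3\begin{pmatrix} e^{2t} \\ \tfrac32 e^{2t} \end{pmatrix} + c_4\begin{pmatrix} e^{-2t} \\ \tfrac32 e^{-2t} \end{pmatrix}.$$
To find a particular solution, we try
$$\begin{pmatrix} x_p(t) \\ y_p(t) \end{pmatrix} = \begin{pmatrix} at + b \\ ct + d \end{pmatrix},$$
which we substitute into the given system, getting
$$\begin{aligned} 0 &= (at + b) + 2(ct + d) + t \\ 0 &= 3(at + b) + 2(ct + d), \end{aligned}$$
or
$$\begin{aligned} 0 &= (a + 2c + 1)t + (b + 2d) \\ 0 &= (3a + 2c)t + (3b + 2d). \end{aligned}$$
It follows that
$$\begin{aligned} b + 2d &= 0, & a + 2c &= -1, \\ 3b + 2d &= 0, & 3a + 2c &= 0. \end{aligned}$$
The solutions are $b = d = 0$, and $a = \tfrac12$, $c = -\tfrac34$. The particular solution is then
$$\begin{pmatrix} x_p(t) \\ y_p(t) \end{pmatrix} = \begin{pmatrix} \tfrac12 t \\ -\tfrac34 t \end{pmatrix}.$$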

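We could have computed the particular solutions along with the homogeneous solution, by applying elimination to the nonhomogeneous system.
A typical set of initial conditions for the system might take the form
$$\begin{pmatrix} x(0) \\ y(0) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \qquad \begin{pmatrix} dx/dt\,(0) \\ dy/dt\,(0) \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$
Applying these to the general solution, consisting of homogeneous solution plus particular solution, we get
$$c_1\begin{pmatrix} 1 \\ -1 \end{pmatrix} + c_2\begin{pmatrix} 0 \\ 0 \end{pmatrix} + c_3\begin{pmatrix} 1 \\ \tfrac32 \end{pmatrix} + c_4\begin{pmatrix} 1 \\ \tfrac32 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},$$
$$c_1\begin{pmatrix} 0 \\ 0 \end{pmatrix} + c_2\begin{pmatrix} 1 \\ -1 \end{pmatrix} + c_3\begin{pmatrix} 2 \\ 3 \end{pmatrix} + c_4\begin{pmatrix} -2 \\ -3 \end{pmatrix} + \begin{pmatrix} \tfrac12 \\ -\tfrac34 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$
These two vector equations are equivalent to the equations
$$\begin{aligned} c_1 + c_3 + c_4 &= 0 \\ -c_1 + \tfrac32 c_3 + \tfrac32 c_4 &= 0 \\ c_2 + 2c_3 - 2c_4 &= -\tfrac12 \\ -c_2 + 3c_3 - 3c_4 &= \tfrac74. \end{aligned}$$
These equations have the unique solution $c_1 = 0$, $c_2 = -1$, $c_3 = \tfrac18$, $c_4 = -\tfrac18$, so the particular solution we are looking for is
$$\begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = \begin{pmatrix} -\sin t + \tfrac18 e^{2t} - \tfrac18 e^{-2t} + \tfrac12 t \\ \sin t + \tfrac{3}{16} e^{2t} - \tfrac{3}{16} e^{-2t} - \tfrac34 t \end{pmatrix}.$$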
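
The remark that numerical methods apply directly to the standard form can be illustrated with the 4-dimensional system of this example. The sketch below is only an illustration under stated assumptions: it uses a classical fourth-order Runge-Kutta step (a standard method, not necessarily the one developed in Section 4) on $du/dt = x + 2y + t$, $dv/dt = 3x + 2y$, $dx/dt = u$, $dy/dt = v$ with $x(0) = y(0) = 0$, $u(0) = 0$, $v(0) = 1$, and compares the result at $t = 1$ with the closed-form solution $x = -\sin t + \tfrac18(e^{2t} - e^{-2t}) + \tfrac12 t$, $y = \sin t + \tfrac{3}{16}(e^{2t} - e^{-2t}) - \tfrac34 t$. The names `rhs`, `rk4`, and `closed_form` and the step count are hypothetical choices.

```python
import math

def rhs(t, s):
    """Right side of the first-order system, state ordered as (u, v, x, y)."""
    u, v, x, y = s
    return (x + 2 * y + t, 3 * x + 2 * y, u, v)

def rk4(s, t0, t1, steps=2000):
    """Integrate from t0 to t1 with classical fourth-order Runge-Kutta steps."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = rhs(t, s)
        k2 = rhs(t + h / 2, tuple(a + h / 2 * b for a, b in zip(s, k1)))
        k3 = rhs(t + h / 2, tuple(a + h / 2 * b for a, b in zip(s, k2)))
        k4 = rhs(t + h, tuple(a + h * b for a, b in zip(s, k3)))
        s = tuple(a + h * (b + 2 * c + 2 * d + e) / 6
                  for a, b, c, d, e in zip(s, k1, k2, k3, k4))
        t += h
    return s

def closed_form(t):
    """Solution (x(t), y(t)) of the initial-value problem in Example 2."""
    sinh_part = math.exp(2 * t) - math.exp(-2 * t)
    x = -math.sin(t) + sinh_part / 8 + t / 2
    y = math.sin(t) + 3 * sinh_part / 16 - 3 * t / 4
    return x, y

# Initial state (u, v, x, y) = (x'(0), y'(0), x(0), y(0)) = (0, 1, 0, 0).
numeric = rk4((0.0, 1.0, 0.0, 0.0), 0.0, 1.0)
exact = closed_form(1.0)
error = max(abs(numeric[2] - exact[0]), abs(numeric[3] - exact[1]))
print(error)
```

Agreement between the two computations is a useful cross-check on the constants $c_1, \ldots, c_4$, since an error in any one of them would show up as a visible discrepancy.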
Next is an example of a first-order system that is not presented in standard form.

EXAMPLE 3. The system of differential equations
$$\begin{aligned} \frac{dx}{dt} + \frac{dy}{dt} &= 2x + 4y \\ 2\frac{dx}{dt} + 3\frac{dy}{dt} &= 2x + 6y \end{aligned}$$
takes the form $d\mathbf{x}/dt = A\mathbf{x}$ if we apply the elimination method to the left side. We multiply the first equation by 2 and subtract from the second to get
$$\frac{dy}{dt} = -2x - 2y.$$
Now we subtract this equation from the first one to get
$$\frac{dx}{dt} = 4x + 6y.$$
As a result we can rewrite the equation as
$$\begin{aligned} (D - 4)x - 6y &= 0 \\ 2x + (D + 2)y &= 0. \end{aligned}$$
We can now proceed as in Example 1 to eliminate $x$. We multiply the first equation by 2 and operate on the second with $(D - 4)$. Subtracting the first from the second gives
$$(D - 4)(D + 2)y + 12y = 0$$
or
$$(D^2 - 2D + 4)y = 0.$$
The roots of the characteristic equation $r^2 - 2r + 4 = 0$ are $r_1 = 1 + i\sqrt3$, $r_2 = 1 - i\sqrt3$, so $y(t)$ has the form
$$y(t) = c_1 e^{(1 + i\sqrt3)t} + c_2 e^{(1 - i\sqrt3)t} = d_1 e^{t}\cos\sqrt3\,t + d_2 e^{t}\sin\sqrt3\,t.$$
Using the second equation of the modified system, we can express $x(t)$ directly in terms of $y(t)$:
$$\begin{aligned} x(t) &= -\tfrac12 (D + 2)y(t) \\ &= -\tfrac12 (D + 2)\left(d_1 e^{t}\cos\sqrt3\,t + d_2 e^{t}\sin\sqrt3\,t\right) \\ &= d_1 e^{t}\left(-\tfrac32\cos\sqrt3\,t + \tfrac{\sqrt3}{2}\sin\sqrt3\,t\right) + d_2 e^{t}\left(-\tfrac32\sin\sqrt3\,t - \tfrac{\sqrt3}{2}\cos\sqrt3\,t\right). \end{aligned}$$
In vector form we can write
$$\begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = d_1 e^{t}\begin{pmatrix} -\tfrac32\cos\sqrt3\,t + \tfrac{\sqrt3}{2}\sin\sqrt3\,t \\ \cos\sqrt3\,t \end{pmatrix} + d_2 e^{t}\begin{pmatrix} -\tfrac32\sin\sqrt3\,t - \tfrac{\sqrt3}{2}\cos\sqrt3\,t \\ \sin\sqrt3\,t \end{pmatrix}.$$
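
Since the system in this example was not given in standard form, it is worth checking the final formulas against the original pair of equations rather than the reduced ones. The following is a minimal sketch, assuming the solution formulas displayed above; derivatives are approximated by central differences (an assumption made to keep the code short, so the residuals are small but not exactly zero), and the names `xy`, `ddt`, and `residuals` are illustrative.

```python
import math

ROOT3 = math.sqrt(3.0)

def xy(t, d1, d2):
    """Solution (x(t), y(t)) of Example 3 with constants d1, d2."""
    e, c, s = math.exp(t), math.cos(ROOT3 * t), math.sin(ROOT3 * t)
    x = (d1 * e * (-1.5 * c + 0.5 * ROOT3 * s)
         + d2 * e * (-1.5 * s - 0.5 * ROOT3 * c))
    y = d1 * e * c + d2 * e * s
    return x, y

def ddt(t, d1, d2, h=1e-5):
    """Central-difference approximation to (dx/dt, dy/dt)."""
    xp, yp = xy(t + h, d1, d2)
    xm, ym = xy(t - h, d1, d2)
    return (xp - xm) / (2 * h), (yp - ym) / (2 * h)

def residuals(t, d1, d2):
    """Residuals of the original equations, which are not in standard form."""
    x, y = xy(t, d1, d2)
    dx, dy = ddt(t, d1, d2)
    return dx + dy - (2 * x + 4 * y), 2 * dx + 3 * dy - (2 * x + 6 * y)

# Sample constants and times chosen arbitrarily for the check.
worst = max(abs(r)
            for d1, d2 in [(1.0, 0.0), (0.0, 1.0), (2.0, -3.0)]
            for t in [0.0, 0.4, 1.0]
            for r in residuals(t, d1, d2))
print(worst)  # small, limited by the finite-difference error
```

The residuals here are bounded by the truncation error of the difference quotient, so a tolerance around $10^{-6}$ is a reasonable pass criterion for the sample times used.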

EXERCISES

In Exercises I to 4, classify the first-order system as In Exercises 5 and 6, solve by first eliminating one
linear or nonlinear: of the unknown functions. Then determine the arbitrary
constants so that the initial conditions are satisfied.
dx dv
1. - = I +x 2 + y 2. ___:,_ = 12 +z dx
di di 5. dt = 6x + 8y, x(O) = I
dy dz
-dt = +x + y
1
2 - = 13 + y
di dv
d~ = -4x - 6y, y(O) =0
dx dx
3. - =t x + y +
2
e1 4. - =tx
di
+y dx
di
dy dy
6. --
dt
= x + 2y
'
x(O) =0
-=l -=x+y
di dt dy
dt =-2x+y, y(0)=-1
In Exercises 7 and 8, find a particular solution for the dx dy dx dy
system. Then noting that the homogeneous equations are 11. -+-
dt dt
=x +ty. 12. - +e1 - =x+e1 •
dt dt
the same as the ones in Exercises 5 and 6, write the most dx dy 2 dx dy
general solution. - -2-=x+t • -+-=y+e-1 •
dt dt dt dt

7. ( ~;~~:) = ( _: _:) (;) + C) 13. (a) Verify that if x1 (t) and x2(t) are solutions of the
system dx/dt = A(t)x, where A(t) is an n-by-n
[Hint: Try x =at+ b; y =ct+ d.J matrix then cix1 (t) + c2x2(t) is also a solution for
arbitrary constants c1 and c2 .
8. ( ~;~~~) = ( _; ~) (;) + (~I) (b) If dx/dt = Ax, where A is an m-by-n matrix, can
you haven 'Im? Give an example or explain why
[Hint: Use undetermined coefficients as in part (a).) not.
9. Solve by elimination (c) Show that for a system of the form dx/dt = A(t)x+
b(t), the conclusion of part (a) follows only if b(t)
is identically zero.
dx
- = x+ z.
dt In Exercises 14 to 17, classify the system as linear or
dy nonlinear.
dt =x+2y.
14. dx/dt + dz/dt = 1, 15. d 2 x/dt 2 + dy/dt = 0,
dz
- = -z. dx/dt - t(dz/dt) =x y2 + t(d y/dt 2 )
2
=t
dt
16. dx/dt + d z/dt = 1, 17. d x/dt - dy/dt = x 2 ,
2 2

Then satisfy the initi&I condition x(0) = l, y(0) = - 1, dx/dt - t 2 (dz/dt) =O y +t 2 (d 2 y/dt 2 ) = o
z(O) = 2.
10. (a) Find a first-order system of dimension 4 equivalent to the second-order system

$$\ddot{x} - 3x - 2y = 0,$$
$$\ddot{y} - y + 2x = 0.$$

[Hint: Let $\dot{x} = u$, $\dot{y} = v$.]
(b) By solving for $\dot{x}$, $\dot{y}$, $\dot{u}$, and $\dot{v}$ in the first-order system obtained in part (a), write the
equivalent 4-dimensional system in the form $d\mathbf{x}/dt = A\mathbf{x}$, where $A$ is a 4-by-4 matrix.
(c) By collecting terms properly, write the system in part (a) in the form

$$L_1(D)x + L_2(D)y = 0$$
$$L_3(D)x + L_4(D)y = 0,$$

where each $L_k(D)$ is a second-order constant-coefficient operator.
(d) Use the method of elimination to solve the system found in part (c).

In Exercises 11 and 12, apply row operations to the terms containing first derivatives to reduce the
system to the standard form $d\mathbf{x}/dt = A(t)\mathbf{x} + \mathbf{b}(t)$, where $A(t)$ is a square matrix.

In Exercises 18 and 19, use elimination by operator multiplication to get rid of one of the dependent
variables. Solve the resulting equation for the remaining variable, and then determine the general
solution $(x(t), y(t))$.

18. $dx/dt = x + 2y$, $dy/dt = x + y + t$
19. $dx/dt = -y - t$, $dy/dt = x + t$

In Exercises 20 to 23, reduce the system to the standard form with just one first derivative on the left
side of each equation.

20. $dx/dt + dy/dt = t$, $dx/dt - dy/dt = x$
21. $dx/dt + dy/dt = y$, $dx/dt + 2\,dy/dt = x$
22. $2\,dx/dt + dy/dt + x + 5y = t$, $dx/dt + dy/dt + 2x + 2y = 0$
23. $dx/dt + dy/dt = \sin t$, $dx/dt - dy/dt = \cos t$

In Exercises 24 to 27, use elimination by operator multiplication to get rid of one of the dependent
variables. Solve the resulting equation for the remaining variable and then determine the general
solution of the system. Substitution may be necessary to find relations among constants. Then
determine the constants so that the initial conditions are satisfied.
594 Chapter 12 Introduction to Systems

24. $d^2x/dt^2 - x + dy/dt + y = 0$, $dx/dt - x + d^2y/dt^2 + y = 0$, $x(0) = y(0) = 0$, $\dot{x}(0) = 0$, $\dot{y}(0) = 1$
25. $d^2x/dt^2 - dy/dt = 0$, $dx/dt + d^2y/dt^2 = 0$, $x(0) = 1$, $y(0) = 0$, $\dot{x}(0) = \dot{y}(0) = 0$
26. $d^2x/dt^2 - dy/dt = t$, $dx/dt + dy/dt = x + y$, $x(0) = y(0) = 0$, $\dot{x}(0) = 1$
27. $d^2x/dt^2 - y = e^t$, $d^2y/dt^2 + x = 0$, $x(0) = y(0) = \dot{x}(0) = \dot{y}(0) = 0$

In Exercises 28 to 31, introduce new independent variables $u = \dot{x}$, $v = \dot{y}$, and reduce the system to
first-order standard form in $u$, $v$, $x$, and $y$. (Section 2 of Chapter 13 develops an efficient method
for solving these systems in first-order standard form.)

28. $d^2x/dt^2 - x + dy/dt + y = 0$, $dx/dt - x + d^2y/dt^2 + y = 0$
29. $d^2x/dt^2 - dy/dt = 0$, $dx/dt + d^2y/dt^2 = 0$
30. $d^2x/dt^2 - dy/dt = t$, $dx/dt + dy/dt = x + y$
31. $d^2x/dt^2 - y = e^t$, $d^2y/dt^2 + x = 0$

None of the linear systems in Exercises 32 to 35 is equivalent to a first-order system in standard form.
Discuss the solutions, or lack thereof.

32. $dx/dt + dy/dt = 0$, $dx/dt + dy/dt = 1$
33. $dx/dt + dy/dt = 0$, $dx/dt + dy/dt = x$
34. $dx/dt + dy/dt = t$, $dx/dt + dy/dt = x$
35. $dx/dt + dy/dt = y$, $dx/dt + dy/dt = x$

36. (a) Find a first-order system of dimension 4 equivalent to the second-order system

$$\ddot{y} - 3x - 2y = 0,$$
$$\ddot{x} - y + 2x = 0.$$

(b) Write the system found in part (a) in standard form.
(c) Solve the system in part (a).

37. (a) Find a 2-dimensional system of order 2 satisfied by the $x$ and $y$ coordinates of the solutions to

$$\dot{x} = z + w,$$
$$\dot{y} = z - w,$$
$$\dot{z} = x - y,$$
$$\dot{w} = x + y.$$

Then solve the second-order system and use its solution to solve the given system.
(b) Find a 2-dimensional system of order 2 satisfied by the $z$ and $w$ coordinates of the solutions to
the system in part (a).

38. It's generally not possible to find closed-form solutions for linear systems with nonconstant
coefficient functions. Here is one that can nevertheless be solved readily by solving a second-order
equation for $y$. Find the general solution.

$$\dot{x} = (t^{-1} - t)x - t^2 y, \quad t > 0,$$
$$\dot{y} = x + ty.$$

SECTION 3 APPLICATIONS
The examples in this section are all of a type that arise frequently in applied math-
ematics. For some we'll be able to give complete solutions, whereas the others are
examples for which we need the numerical methods described in the next section.

EXAMPLE 1  Figure 12.8 shows two 50-gallon tanks connected by flow pipes and with inlets and
outlets all having the rates of flow as marked in gallons per minute (g/m). The flow
rates are arranged so that each tank is maintained at its capacity at all times. We
suppose that each tank initially contains salt solution at a concentration in pounds per
gallon that we leave unspecified for the moment, that the left-hand tank is receiving
salt solution at a concentration of 1 pound per gallon, and that the right-hand tank is
receiving pure water. The problem is to find out what happens to the amount of salt,
in pounds, as time goes on.

FIGURE 12.8 Fluid exchange. Two 50-gal. tanks; labeled rates: 1 g/m at 1 lb/g into the left tank, 1 g/m pure water
into the right tank, and flows of 3 g/m, 2 g/m, and 2 g/m through the connecting and outlet pipes.

We assume that each tank is kept thoroughly mixed at
all times, so that the concentration of salt is always the same throughout the whole
tank. In the left-hand tank, with salt content $x(t)$, the rate of change of the amount
of salt is $dx/dt$. On the other hand, because of the various flow rates, we can break
this rate of change into three parts:

$$\frac{dx}{dt} = -4\left(\frac{x}{50}\right) + 3\left(\frac{y}{50}\right) + 1,$$
where x / 50 is the concentration of salt in the left tank and y / 50 the concentration in
the right tank, both in pounds per gallon. The term -4(x/50) is the rate of outflow
of salt, and the other two terms represent the rate of inflow. Similarly,

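$$\frac{dy}{dt} = 2\left(\frac{x}{50}\right) - 3\left(\frac{y}{50}\right).$$

Thus we have a system of differential equations that we can write as

$$\frac{dx}{dt} = -\frac{4}{50}x + \frac{3}{50}y + 1$$
$$\frac{dy}{dt} = \frac{2}{50}x - \frac{3}{50}y.$$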
To solve it, we can use the elimination method, first writing the system in the form

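$$\left(D + \frac{4}{50}\right)x - \frac{3}{50}y = 1$$
$$-\frac{2}{50}x + \left(D + \frac{3}{50}\right)y = 0.$$

We multiply the first equation by $\frac{2}{50}$ and operate on the second by $\left(D + \frac{4}{50}\right)$. Addition
of the two equations then gives

$$\left(D + \frac{4}{50}\right)\left(D + \frac{3}{50}\right)y - \frac{6}{(50)^2}y = \frac{2}{50},$$

or

$$\left(D^2 + \frac{7}{50}D + \frac{6}{(50)^2}\right)y = \frac{2}{50}.$$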

The roots of the characteristic equation come from the factorization

$$r^2 + \frac{7}{50}r + \frac{6}{(50)^2} = \left(r + \frac{1}{50}\right)\left(r + \frac{6}{50}\right)$$

and are $r_1 = -\frac{1}{50}$ and $r_2 = -\frac{6}{50}$. A particular solution is evidently $y_p(t) = \frac{50}{3}$, a
constant. Thus in general,

$$y(t) = c_1 e^{-(1/50)t} + c_2 e^{-(6/50)t} + \frac{50}{3}.$$


Using the second equation of the system to write x(t) in terms of y(t), we find

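$$x(t) = \frac{50}{2}\left(D + \frac{3}{50}\right)y(t) = c_1 e^{-(1/50)t} - \frac{3}{2}c_2 e^{-(6/50)t} + \frac{50}{2}.$$

Thus the general solution is

$$x(t) = c_1 e^{-(1/50)t} - \frac{3}{2}c_2 e^{-(6/50)t} + \frac{50}{2}$$
$$y(t) = c_1 e^{-(1/50)t} + c_2 e^{-(6/50)t} + \frac{50}{3}.$$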

From these equations, we see immediately that

$$\lim_{t\to\infty} x(t) = \frac{50}{2}, \qquad \lim_{t\to\infty} y(t) = \frac{50}{3}.$$

In other words, the concentration, in pounds per gallon, in the left tank approaches
$\frac{1}{2}$, and in the right tank approaches $\frac{1}{3}$.
The constants $c_1$ and $c_2$ depend on the initial values $x(0)$ and $y(0)$. Thus the
equations

$$x(0) = c_1 - \frac{3}{2}c_2 + \frac{50}{2}$$
$$y(0) = c_1 + c_2 + \frac{50}{3}$$

determine $c_1$ and $c_2$ when $x(0)$ and $y(0)$ are known. The values $x(t_1)$ and $y(t_1)$ at
a time t1 also determine the constants. We leave these details as an exercise.
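
Because the two conditions on the constants are linear, the computation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `tank_constants` and `tank_solution` are ours, and the initial state x(0) = y(0) = 0 (both tanks starting with pure water) is a hypothetical choice, not one made in the text.

```python
import math

def tank_constants(x0, y0):
    """Solve c1 - 1.5*c2 = x0 - 25 and c1 + c2 = y0 - 50/3 for (c1, c2)."""
    a = x0 - 25.0
    b = y0 - 50.0 / 3.0
    c2 = (b - a) / 2.5      # subtract the first equation from the second
    c1 = b - c2
    return c1, c2

def tank_solution(t, c1, c2):
    """Salt amounts (x(t), y(t)) in pounds from the general solution."""
    e1 = math.exp(-t / 50.0)
    e6 = math.exp(-6.0 * t / 50.0)
    x = c1 * e1 - 1.5 * c2 * e6 + 25.0
    y = c1 * e1 + c2 * e6 + 50.0 / 3.0
    return x, y

# Hypothetical initial state: both tanks start with pure water.
c1, c2 = tank_constants(0.0, 0.0)
x_late, y_late = tank_solution(2000.0, c1, c2)   # long after start-up
```

For large t the computed amounts settle near 50/2 and 50/3 pounds, in agreement with the limits found above.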

EXAMPLE 2  Consider two weights of mass $m_1$ and $m_2$ separated by springs from each other and
from fixed walls. Suppose the springs have stiffness constants $k_1$, $k_2$, $k_3$ as shown
in Figure 12.9; thus the restoring force toward the motionless equilibrium position
for the $i$th spring is proportional to $k_i$. Let $x$ and $y$ be the displacements from
equilibrium of the first and second weights. The force acting on the first weight is
equal to $m_1(d^2x/dt^2)$, but we also have

$$m_1\frac{d^2x}{dt^2} = -k_1 x + k_2(y - x).$$

FIGURE 12.9 Mass-spring system, with displacements $x$ and $y$.

The choice of signs is dictated by whether a positive displacement causes an increase
or decrease in velocity. Similarly,

$$m_2\frac{d^2y}{dt^2} = -k_2(y - x) - k_3 y.$$

In deriving both equations, we have neglected frictional forces. We can rewrite the
system in the form

$$\frac{d^2x}{dt^2} = -\frac{k_1 + k_2}{m_1}x + \frac{k_2}{m_1}y$$
$$\frac{d^2y}{dt^2} = \frac{k_2}{m_2}x - \frac{k_2 + k_3}{m_2}y.$$

For example, if the weights are equal, say $m_1 = m_2 = 1$, and $k_1 = k_2 = k_3 = 1$,
then we write the system as

$$(D^2 + 2)x - y = 0$$
$$-x + (D^2 + 2)y = 0.$$

Operating on the second equation with $(D^2 + 2)$ and adding gives

$$(D^4 + 4D^2 + 3)y = (D^2 + 1)(D^2 + 3)y = 0.$$

The general solution of this equation is

$$y(t) = c_1\cos t + c_2\sin t + c_3\cos\sqrt{3}\,t + c_4\sin\sqrt{3}\,t.$$

Using the second of the pair of equations to find $x$ gives

$$x(t) = (D^2 + 2)y(t) = c_1\cos t + c_2\sin t - c_3\cos\sqrt{3}\,t - c_4\sin\sqrt{3}\,t.$$

The constants $c_1$, $c_2$, $c_3$, and $c_4$ would be determined by initial displacements and
velocities, namely $x(0)$, $y(0)$, $\dot{x}(0)$, $\dot{y}(0)$.
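
A quick numerical check of the general solution can be sketched as follows. The fragment approximates second derivatives by central differences and evaluates the left sides of the two operator equations; the function names, the step size, and the particular constants are illustrative assumptions of ours, not values from the text.

```python
import math

SQRT3 = math.sqrt(3.0)

def spring_solution(t, c1, c2, c3, c4):
    """Displacements (x, y) of the two unit masses with k1 = k2 = k3 = 1."""
    slow = c1 * math.cos(t) + c2 * math.sin(t)                   # frequency 1
    fast = c3 * math.cos(SQRT3 * t) + c4 * math.sin(SQRT3 * t)   # frequency sqrt(3)
    return slow - fast, slow + fast

def second_derivative(f, t, h=1e-4):
    """Central-difference approximation to f''(t)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)

def residuals(t, consts):
    """Left sides of (D^2 + 2)x - y = 0 and -x + (D^2 + 2)y = 0 at time t."""
    x = lambda s: spring_solution(s, *consts)[0]
    y = lambda s: spring_solution(s, *consts)[1]
    r1 = second_derivative(x, t) + 2.0 * x(t) - y(t)
    r2 = -x(t) + second_derivative(y, t) + 2.0 * y(t)
    return r1, r2

consts = (0.7, -1.2, 0.4, 2.0)   # arbitrary illustrative constants
r1, r2 = residuals(1.3, consts)  # both should be zero up to discretization error
```

Both residuals come out small, limited only by the finite-difference approximation, whatever constants are chosen.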

The oscillation $\cos\sqrt{3}\,t$ in the previous example is called a normal mode of the
oscillation, and it is determined by its circular frequency $\sqrt{3}$. The other normal
mode, $\cos t$, with circular frequency $\mu = 1$ appearing in the same example arises
from different initial conditions. In identifying a normal mode, it's customary to
focus attention on the circular frequency itself. Thus the typical normal mode looks
like $\cos\mu t$ with circular frequency $\mu$. The normal modes are important characteristics
of an oscillatory system, and particularly efficient routes to their computation are in
Chapter 13 using eigenvalue methods and exponential matrices.

EXAMPLE 3  Suppose the third spring is removed altogether from our previous example, so that
$k_3 = 0$. We're left with

$$\frac{d^2x}{dt^2} = -2x + y,$$
$$\frac{d^2y}{dt^2} = x - y,$$

or, in operator form,

$$(D^2 + 2)x - y = 0$$
$$-x + (D^2 + 1)y = 0.$$

Elimination of $x$ proceeds as in the previous example, but this time the differential
equation for $y$ is $(D^4 + 3D^2 + 1)y = 0$, with characteristic equation

$$r^4 + 3r^2 + 1 = 0.$$

Regarding the left side as a quadratic function of $r^2$, we find $r^2 = (-3 \pm \sqrt{5})/2$. Both
values are negative, so the characteristic roots are $\pm i\sqrt{(3 + \sqrt{5})/2} \approx \pm 1.62i$ and
$\pm i\sqrt{(3 - \sqrt{5})/2} \approx \pm 0.62i$. The normal modes from which solutions are constructed
have circular frequencies $\mu_1 = \sqrt{(3 + \sqrt{5})/2}$ and $\mu_2 = \sqrt{(3 - \sqrt{5})/2}$.
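
The same elimination applies to the general two-mass system, and the resulting quadratic in $r^2$ can be solved numerically. The sketch below is our own generalization, not a formula stated in the text: writing $a = (k_1+k_2)/m_1$, $d = (k_2+k_3)/m_2$, $b = k_2/m_1$, $c = k_2/m_2$, elimination gives $(D^2 + a)(D^2 + d)y - bc\,y = 0$. The function name `normal_mode_frequencies` is hypothetical.

```python
import math

def normal_mode_frequencies(k1, k2, k3, m1=1.0, m2=1.0):
    """Circular frequencies (larger, smaller) of the two-mass, three-spring system.

    Solves s^2 + (a + d)s + (a*d - b*c) = 0 for s = r^2 and returns sqrt(-s).
    """
    a = (k1 + k2) / m1
    d = (k2 + k3) / m2
    bc = (k2 / m1) * (k2 / m2)
    disc = math.sqrt((a - d) ** 2 + 4.0 * bc)   # always real
    s_fast = (-(a + d) - disc) / 2.0
    s_slow = (-(a + d) + disc) / 2.0
    # Clamp at zero to guard against round-off when a root is exactly 0.
    return math.sqrt(max(0.0, -s_fast)), math.sqrt(max(0.0, -s_slow))

# The case treated above: unit masses, k1 = k2 = 1, third spring removed.
mu_fast, mu_slow = normal_mode_frequencies(1.0, 1.0, 0.0)
```

With $k_1 = k_2 = k_3 = 1$ the same function returns the frequencies $\sqrt{3}$ and 1 of the earlier example.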
EXAMPLE 4  A typical autonomous second-order system has the form

$$\ddot{x} = f(x, y, \dot{x}, \dot{y})$$
$$\ddot{y} = g(x, y, \dot{x}, \dot{y}),$$

with initial conditions $x(t_0) = x_0$, $y(t_0) = y_0$, $\dot{x}(t_0) = u_0$, $\dot{y}(t_0) = v_0$. A particularly
important special case is that of Newton's equations of planetary motion,

$$\ddot{x} = \frac{-kx}{(x^2 + y^2)^{3/2}}, \qquad \ddot{y} = \frac{-ky}{(x^2 + y^2)^{3/2}},$$
in which k is a positive constant. In these equations, x and y stand for the rectangular
coordinates of a planet in a planar orbit relative to a fixed sun at the origin.
We'll derive the planetary motion equations in a more general vector form that
allows for the motion of both bodies, which could be applied to a double star interaction
for example. Let $\mathbf{x}_1 = \mathbf{x}_1(t)$ and $\mathbf{x}_2 = \mathbf{x}_2(t)$ represent the positions at time $t$
of two bodies in space such that each acts on the other by the inverse square law
of gravitational attraction, with no other forces considered. If $m_1$ and $m_2$ are the
respective masses of the two bodies, the magnitude of the mutually attractive force
is then

$$F = \frac{G m_1 m_2}{r^2},$$

where $r = |\mathbf{x}_1 - \mathbf{x}_2|$ is the distance between $\mathbf{x}_1$ and $\mathbf{x}_2$ (i.e., the length of the vector
between them). The gravitational constant $G$ is about $6.673 \cdot 10^{-11}$ if the relevant
units are meters, kilograms, and seconds. The normalized vectors

$$\mathbf{u}_2 = \frac{\mathbf{x}_1 - \mathbf{x}_2}{|\mathbf{x}_1 - \mathbf{x}_2|}, \qquad \mathbf{u}_1 = -\frac{\mathbf{x}_1 - \mathbf{x}_2}{|\mathbf{x}_1 - \mathbf{x}_2|}$$

have length 1 and point respectively from the second body to the first, and vice
versa. Thus the vectors that describe the force acting on each body are the product
of magnitude $F$ and a normalized direction unit vector $\mathbf{u}$; the vector $F\mathbf{u}_1$ acts on the
first body and $F\mathbf{u}_2$ acts on the second body. Since these forces can also be described
by Newton's second law as mass times acceleration, we have

$$m_1\ddot{\mathbf{x}}_1 = F\mathbf{u}_1, \qquad m_2\ddot{\mathbf{x}}_2 = F\mathbf{u}_2.$$

FIGURE 12.10 Equal but opposite forces: $F/m_1 < F/m_2$, $m_1 > m_2$.

Figure 12.10 shows the positions and force vectors. The acceleration vectors, which
actually govern the motion, are depicted as if $m_1$ is much larger than $m_2$. Written
out in more detail these Newton equations are

3.1   $\ddot{\mathbf{x}}_1 = -\dfrac{G m_2}{|\mathbf{x}_1 - \mathbf{x}_2|^3}(\mathbf{x}_1 - \mathbf{x}_2), \qquad \ddot{\mathbf{x}}_2 = -\dfrac{G m_1}{|\mathbf{x}_1 - \mathbf{x}_2|^3}(\mathbf{x}_2 - \mathbf{x}_1),$

where $m_1$ has been canceled from the first equation and $m_2$ from the second. Subtracting
the second equation from the first gives

$$\ddot{\mathbf{x}}_1 - \ddot{\mathbf{x}}_2 = -\frac{G(m_1 + m_2)(\mathbf{x}_1 - \mathbf{x}_2)}{|\mathbf{x}_1 - \mathbf{x}_2|^3}.$$

Equations 3.1 form a system of vector equations for the motions of the two bodies
relative to some coordinate system. If a moving coordinate system has its origin
maintained at the center of mass of one of the bodies, say the second, we can let
$\mathbf{x} = \mathbf{x}_1 - \mathbf{x}_2$ and consider only the equation of relative motion for the first body:

3.2   $\ddot{\mathbf{x}} = -\dfrac{G(m_1 + m_2)}{|\mathbf{x}|^3}\,\mathbf{x}.$
600 Chapter 12 Introduction to Systems

Writing $\mathbf{x} = (x, y, z)$ and $G(m_1 + m_2) = k$, we get three scalar-valued equations:

3.3   $\ddot{x} = \dfrac{-kx}{(x^2 + y^2 + z^2)^{3/2}}, \qquad \ddot{y} = \dfrac{-ky}{(x^2 + y^2 + z^2)^{3/2}}, \qquad \ddot{z} = \dfrac{-kz}{(x^2 + y^2 + z^2)^{3/2}}.$

A solution of this nonlinear system will describe a trajectory of the first body relative
to the second, or vice versa. We can eliminate the third equation from consideration
by choosing $(x, y, z)$ coordinates so that initial conditions on $z$ are $z(0) = \dot{z}(0) = 0$.
By Theorem 1.1 of Section 1D, if $\mathbf{x}(0) \neq 0$ the system has a unique solution with
its third coordinate $z = z(t)$ identically zero. Thus the system is 2-dimensional and
has order 2:

$$\ddot{x} = \frac{-kx}{(x^2 + y^2)^{3/2}}, \qquad \ddot{y} = \frac{-ky}{(x^2 + y^2)^{3/2}}.$$

There are no simple formulas for the solutions x = x(t), y = y(t) of these
equations. The classical approach to the problem is to derive certain significant prop-
erties of the solutions without actually finding the solutions explicitly. For example,
the trajectories have reasonably simple equations. These properties are usually stated
as Kepler's laws of planetary motion, laws that were discovered empirically from
astronomical observation before the work of Newton. Kepler's laws hold for solutions
to Newton's equations that have closed paths for trajectories.

1. The path described by a solution $(x(t), y(t))$ is an ellipse with one focus at
the sun.
2. The radius from the sun to the planet sweeps out equal areas in equal periods
of time.
3. If $T$ is the time required to complete one orbit and $a$ is half the major axis
of the orbit, then

$$T^2 = \frac{4\pi^2}{G(m_1 + m_2)}\,a^3.$$
The derivation of these beautiful laws from Newton's differential equations is
given in many physics texts and in some calculus texts. (An outline of the derivation
is given in a series of exercises at the end of this section.)
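
As a plausibility check on the third law, the period can be computed directly from $T^2 = 4\pi^2 a^3 / (G(m_1 + m_2))$. In this sketch the value of $G$ is the one quoted in the text, while the masses of the sun and the earth and the semi-major axis of the earth's orbit are rounded reference values that we supply as assumptions; the function name is ours.

```python
import math

G = 6.673e-11            # gravitational constant, value quoted in the text (SI units)

def orbital_period(a, m1, m2):
    """Period in seconds from Kepler's third law, T^2 = 4*pi^2*a^3 / (G*(m1 + m2))."""
    return 2.0 * math.pi * math.sqrt(a ** 3 / (G * (m1 + m2)))

# Assumed rounded data, not taken from the text:
SUN_MASS = 1.989e30      # kg
EARTH_MASS = 5.976e24    # kg
EARTH_ORBIT_A = 1.496e11 # m, half the major axis of the earth's orbit

period_days = orbital_period(EARTH_ORBIT_A, SUN_MASS, EARTH_MASS) / 86400.0
```

With these inputs the computed period is close to one year, as the law requires.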
Although the results just described tell us a great deal about planetary motion, if
what we want to know is the position or velocity of a planet at a given time, then
we may resort to numerical methods of the kind described in the next section. These
methods apply directly to a first-order system of arbitrary dimension, and to apply
them to Newton's equations we consider an equivalent system of four first-order
equations. Let $\dot{x} = u$ and $\dot{y} = v$. Because $\ddot{x} = \dot{u}$ and $\ddot{y} = \dot{v}$, the system takes the
first-order form

$$\dot{x} = u$$
$$\dot{y} = v$$
$$\dot{u} = -\frac{G(m_1 + m_2)x}{(x^2 + y^2)^{3/2}}$$
$$\dot{v} = -\frac{G(m_1 + m_2)y}{(x^2 + y^2)^{3/2}}.$$

The prescription for initial position $(x(t_0), y(t_0))$ and velocity $(\dot{x}(t_0), \dot{y}(t_0))$ takes
the form

$$x(t_0) = x_0, \qquad u(t_0) = u_0,$$
$$y(t_0) = y_0, \qquad v(t_0) = v_0.$$

Thus $(x_0, y_0)$ represents the position of the planet at time $t = t_0$, whereas $u_0$ and $v_0$
are the rates of change of $x$ and $y$ at the same time.
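
To indicate how the four-dimensional first-order form is used, here is a minimal sketch that advances the state $(x, y, u, v)$ with the classical fourth-order Runge-Kutta step. The choice of that particular method is our assumption, standing in for whatever methods the next section develops, and the normalization $k = G(m_1 + m_2) = 1$ with a circular orbit of radius 1 and speed 1 is a hypothetical test case whose period is $2\pi$.

```python
import math

def newton_rhs(state, k):
    """Right side of the first-order system: (x', y', u', v')."""
    x, y, u, v = state
    r3 = (x * x + y * y) ** 1.5
    return (u, v, -k * x / r3, -k * y / r3)

def rk4_step(state, h, k):
    """One classical Runge-Kutta step of size h."""
    def shifted(s, ds, scale):
        return tuple(si + scale * di for si, di in zip(s, ds))
    s1 = newton_rhs(state, k)
    s2 = newton_rhs(shifted(state, s1, h / 2.0), k)
    s3 = newton_rhs(shifted(state, s2, h / 2.0), k)
    s4 = newton_rhs(shifted(state, s3, h), k)
    return tuple(
        s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for s, a, b, c, d in zip(state, s1, s2, s3, s4)
    )

def integrate(state, t_end, steps, k=1.0):
    """Advance the state from t = 0 to t = t_end in equal steps."""
    h = t_end / steps
    for _ in range(steps):
        state = rk4_step(state, h, k)
    return state

# Hypothetical test orbit: k = 1, position (1, 0), velocity (0, 1).
start = (1.0, 0.0, 0.0, 1.0)
final = integrate(start, 2.0 * math.pi, 2000)   # one full period
```

After one period the computed state returns very nearly to its starting value, which is a useful sanity check before trying noncircular initial conditions.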

In analyzing orbits it's important to understand that while an ordinary planetary
orbit is an ellipse, which is a closed curve, increasing the orbital speed sufficiently
gives an unbounded hyperbolic orbit of a type that is observed for some nonreturning
comets. The critical speed, called the escape speed, is given by the formula

$$v_e = \sqrt{\frac{2G(m_1 + m_2)}{r}}.$$

In other words, at distance $r$ from the sun, a speed greater than $v_e$ implies a hyperbolic
trajectory, and a speed less than $v_e$ implies an elliptic trajectory. For a derivation of
the formula for $v_e$ under the assumption that the less massive body has a negligible
effect on the other one and that the motion is radial, see Example 6 of Chapter 10,
Section 2. Exercise 28 in the present section shows how to eliminate the radial-motion
assumption.
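
For a sense of scale, the following sketch evaluates an escape speed numerically. It assumes the standard two-body form $v_e = \sqrt{2G(m_1 + m_2)/r}$; the value of $G$ is the one quoted earlier, while the earth's mass and radius and the 100-kilogram projectile are illustrative inputs that we supply, and the function name is ours.

```python
import math

G = 6.673e-11   # gravitational constant in SI units, as quoted in the text

def escape_speed(m1, m2, r):
    """Escape speed sqrt(2*G*(m1 + m2)/r) at separation r, in meters per second."""
    return math.sqrt(2.0 * G * (m1 + m2) / r)

# Assumed illustrative data: the earth and a 100 kg projectile at the surface.
EARTH_MASS = 5.976e24    # kg
EARTH_RADIUS = 6.368e6   # m
v_surface = escape_speed(EARTH_MASS, 100.0, EARTH_RADIUS)
```

The result is a little over 11 kilometers per second, and the square-root dependence on $1/r$ means that quadrupling the distance halves the escape speed.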

EXERCISES

1. Suppose that two 100-gallon tanks of salt solution contain amounts of salt $y(t)$ and $z(t)$ at time $t$.
Suppose that the solution in the $y$ tank is flowing to the $z$ tank at a rate of 1 gallon per minute, and
that the solution in the $z$ tank is flowing to the $y$ tank at the rate of 4 gallons per minute. Suppose
also that the overflow from the $y$ tank goes down the drain, whereas the $z$ tank is kept full by the
addition of fresh water. Assume that each tank is kept thoroughly mixed at all times.
(a) Find a linear system satisfied by $y$ and $z$.
(b) Find the general solution of the system in part (a) and then determine the constants in it so that
the initial values will be $y(0) = 10$ and $z(0) = 20$.
(c) Draw the graphs of the particular solutions found in part (b) and interpret the results.

2. In Example 1 of the text, the general solution to a system of differential equations is found to be

$$x(t) = c_1 e^{-(1/50)t} - \tfrac{3}{2}c_2 e^{-(6/50)t} + \tfrac{50}{2},$$
$$y(t) = c_1 e^{-(1/50)t} + c_2 e^{-(6/50)t} + \tfrac{50}{3}.$$

(a) Find values for the constants $c_1$ and $c_2$ so that the initial conditions x(0) = 25, y(0) = j are satisfied.
(b) Show that it's possible to choose $c_1$ and $c_2$ so that an arbitrary initial condition
$(x(0), y(0)) = (x_0, y_0)$ is satisfied. Is this a reasonable state of affairs from a physical standpoint?

3. In Example 2 of the text, the system

$$(D^2 + 2)x - y = 0$$
$$-x + (D^2 + 2)y = 0$$

is shown to have the general solution

$$x(t) = c_1\cos t + c_2\sin t - c_3\cos\sqrt{3}\,t - c_4\sin\sqrt{3}\,t,$$
$$y(t) = c_1\cos t + c_2\sin t + c_3\cos\sqrt{3}\,t + c_4\sin\sqrt{3}\,t,$$

where $x(t)$ and $y(t)$ are interpreted as the displacements at time $t$ of two masses in a mass-spring
physical system.
(a) Show the initial conditions

$$x(0) = 0, \quad \dot{x}(0) = 1$$
$$y(0) = 1, \quad \dot{y}(0) = 0$$

are satisfied by choosing the constants properly in the general solution.

(b) Show that general initial conditions of the form $x(0) = x_0$, $y(0) = y_0$, $\dot{x}(0) = u_0$, $\dot{y}(0) = v_0$
can always be satisfied.

4. Two points start from $x_1 = 0$ and $x_2 = 1$ on a line and move with positions $x_1(t)$ and $x_2(t)$ at
time $t \ge 0$. Suppose that the $x_1$-point always maintains its velocity at exactly 10 units per second
greater than that of the $x_2$-point. Suppose also that the sum of the two velocities is $e^{-t}$ for $t \ge 0$.
(a) Express the relation between the velocities as a first-order system.
(b) Describe the motion of the two points. Are they ever at the same position at the same time?

5. (a) Show that under initial conditions of the special form

$$x(0) = x_0 > 0, \quad \dot{x}(0) = u_0$$
$$y(0) = 0, \quad \dot{y}(0) = 0,$$

Newton's equations of planetary motion reduce to

$$\frac{d^2x}{dt^2} = -\frac{GM}{x^2},$$

together with the condition that $y(t)$ is identically zero.
(b) Taking the physical situation into account, what can you say about the behavior of a solution
$x(t)$ of the reduced system in part (a) if $u_0 = 0$?

6. The Lotka-Volterra equations

$$\frac{dH}{dt} = (a - bP)H,$$
$$\frac{dP}{dt} = (cH - d)P,$$

with $a, b, c, d > 0$, model the size relationship of parasite $P(t)$ and host $H(t)$ populations at time $t$.
(a) Show that if $P(t) > a/b$, then $H(t)$ decreases, and that if $H(t) < d/c$, then $P(t)$ decreases. Show
also that the equilibrium points $(H_e, P_e)$ are $(0, 0)$ and $(d/c, a/b)$.
(b) Show that the parameterized solution curves $(H, P) = (H(t), P(t))$ satisfy

$$\frac{dH}{dP} = \frac{(a - bP)H}{(cH - d)P}$$

and solve this equation by separation of variables to get

$$\frac{H^d}{e^{cH}} \cdot \frac{P^a}{e^{bP}} = k,$$

where $k$ is constant.
(c) Show that for $k > 0$ the curves found in part (b) are closed circuits in the $HP$-plane. Thus the
Lotka-Volterra theory models the cyclic variation in the sizes of certain populations. [Hint: There are
at most two positive $x$-values for which $f(x) = x^a/e^{bx}$ has a given value.]

7. Two 100-gallon tanks X and Y contain initially 50 and 100 gallons, respectively, of pure water. From
an external source, salt solution is added to Y at 1 gallon per minute (gpm), each gallon containing
1 pound of salt. Mixed solution flows from Y to X at 2 gpm and from X to Y at 1 gpm. Let $x = x(t)$
and $y = y(t)$ be the respective amounts of salt in X and Y at time $t \ge 0$. Note. You're not asked to
solve any differential equations for this question.
(a) At what time $t_1$ will X begin to overflow? Express the total amount of salt in the two tanks as a
function of $t$ while $0 \le t \le t_1$.
(b) Find a system of differential equations satisfied by $x(t)$ and $y(t)$ for $0 \le t \le t_1$.
(c) Find a system of differential equations satisfied by $x(t)$ and $y(t)$ for $t_1 \le t$, while X is overflowing.

8. Two 100-gallon tanks X and Y are initially full of salt solution, with $x_0$ pounds of salt in X and $y_0$
pounds of salt in Y. Mixed solution is pumped from X to Y at 2 gallons per minute and from Y to X at
3 gallons per minute. Pure water evaporates from X at 2 gallons per minute. Let $x = x(t)$ and
$y = y(t)$ be the respective amounts of salt in X and Y at time $t \ge 0$.
(a) At what time $t_1$ will one of the tanks first overflow or become empty?
(b) Find, but don't solve, a system of differential equations satisfied by $x(t)$ and $y(t)$ for $0 \le t \le t_1$.
(c) Show that $x(t) + y(t)$ remains constant and that the amount of salt in each tank separately is
constant whenever $x_0 = \frac{3}{2}y_0$.
(d) Assume that $x(0) = 10$ and $y(0) = 20$. Use the equation $x(t) + y(t) = 30$ to solve the system you
found in part (b).

9. Two tanks, one of capacity 100 gallons, the other of capacity 200 gallons, are each initially half-full
of liquid. The 100-gallon tank starts with nothing but pure water, but the other tank starts out with
10 pounds of salt dissolved in the water. Solution flows from the 100-gallon tank to the other tank at
2 gallons per minute. Solution flows in the opposite direction at 1 gallon per minute. Pure water is
added to the 100-gallon tank at 1 gallon per minute. The entire process is stopped if either tank
becomes empty or either tank overflows.
(a) How long does it take for the process to stop?
(b) Write down the system of differential equations and initial conditions whose solution describes the
process as a function of time. As a check, notice that the total amount of salt present in the system
remains unchanged.
(c) Use the check in part (b) to find a first-order linear initial-value problem for $x(t)$ alone.
(d) Solve the initial-value problem in part (c) for $x(t)$. Then find $y(t)$, and estimate the amount of
salt in each tank when the process stops.

Normal modes. In Exercises 10 to 13, calculate the circular frequencies of the various constituent
oscillations associated with the system of Example 2 of the text under the following assumptions.

10. $m_1 = m_2 = 1$, $k_2 = 2$, $k_1 = k_3 = 1$
11. $m_1 = m_2 = 1$, $k_1 = k_2 = 1$, $k_3 = 2$
12. $m_1 = 1$, $m_2 = 2$, $k_1 = 1$, $k_2 = k_3 = 4$
13. $m_1 = 1$, $m_2 = 2$, $k_1 = 2$, $k_2 = k_3 = 3$

14. Suppose that the middle spring is removed from the system governed by the equations of Example 2
of the text.
(a) Show that the system becomes uncoupled.
(b) What are the normal modes?

15. A 2-dimensional mechanical system $m_1\ddot{x} = f(x, y)$, $m_2\ddot{y} = g(x, y)$ is called conservative if there
is a potential function $U(x, y)$ such that

$$\frac{\partial U(x, y)}{\partial x} = -f(x, y) \quad\text{and}\quad \frac{\partial U(x, y)}{\partial y} = -g(x, y).$$

(a) Show that this two-body system is conservative by computing a potential:

$$m_1\ddot{x} = -(k_1 + k_2)x + k_2 y,$$
$$m_2\ddot{y} = k_2 x - (k_2 + k_3)y.$$

(b) The kinetic energy of the system is $T = \frac{1}{2}(m_1\dot{x}^2 + m_2\dot{y}^2)$. Show that for a general
conservative system of the type considered here, the total energy $T + U$ is constant. [Hint: Multiply
the first equation by $\dot{x}$, the second by $\dot{y}$, add the two equations and integrate.]

*16. In deriving the equations of Example 2 of the text to establish the precise location of each mass
relative to the other, and to the spring supports, we needed to know in advance the equilibrium
positions of the two masses. This is a problem of finding an equilibrium solution to the appropriate
equations of motion, that is, finding a constant solution, for which all time derivatives are zero. For
the equations derived in Example 2, it's routine to check that the unique equilibrium solution is
$x(t) = 0$, $y(t) = 0$. Indeed we chose our coordinates so that these would be the equilibrium solutions,
so we get no new information. This exercise asks you to derive a form of the equations of motion that
predicts mathematically the relative equilibrium positions of the two masses.
Instead of measuring the locations of the two masses shown in Figure 12.9 from their equilibrium
positions we can measure both displacements from the same point at the left-hand support. If we
know the unstressed (i.e., relaxed) lengths $l_1$, $l_2$, $l_3$ of the three springs, and the distance $b$ between
the supports, this approach allows us to determine the precise location of the equilibrium positions.
(This information was assumed known in our earlier analysis.) Let $z$ and $w$ be the respective distances
of masses $m_1$ and $m_2$ from the left end, as shown in Figure 12.9.
(a) Show that

$$m_1\,d^2z/dt^2 = -k_1(z - l_1) + k_2((w - z) - l_2),$$
$$m_2\,d^2w/dt^2 = -k_2((w - z) - l_2) + k_3((b - w) - l_3).$$

(b) Show that the equations are equivalent to

$$m_1\,d^2z/dt^2 = -(k_1 + k_2)z + k_2 w + k_1 l_1 - k_2 l_2,$$
$$m_2\,d^2w/dt^2 = k_2 z - (k_2 + k_3)w + k_2 l_2 - k_3 l_3 + k_3 b.$$

(c) The equations derived in part (b) are similar to the ones derived in Example 2 of the text except
for the presence of additional constant terms on the right side. Thus they constitute a nonhomogeneous
system rather than a homogeneous one. Since equilibrium solutions are constant, the second
derivatives $\ddot{z}$ and $\ddot{w}$ are identically zero. Consequently, to find the equilibrium positions, all we have
to do is set the right sides of the differential equations equal to zero and solve for $z$ and $w$. Find the
equilibrium solutions in terms of the $l$s and $k$s.

17. Let $g$ be the acceleration of gravity at the surface of a homogeneous solid spherical body of mass $M$
and radius $R$. Use the inverse-square law to show that $g = GM/R^2$, where $G$ is the gravitational
constant in appropriate units of measurement. Assume the mass of the body is concentrated at its
center.

18. (a) Use the equation established in the previous exercise to estimate the gravitational constant
using a measured value of 9.8 meters per second squared for the acceleration of gravity near the
surface of the earth. (Use the values for the mass and radius of the earth $m = 6 \cdot 10^{24}$ kg,
$R = 6368$ km.)
(b) Estimate the acceleration of gravity near the surface of the earth using the value $6.67 \cdot 10^{-11}$ for
the gravitational constant $G$.

19. The outer radius (not the thickness!) of the earth's atmospheric shell is about $5600 \cdot 10^3$ meters,
and the earth's mass is about $5.976 \cdot 10^{24}$ kilograms. With $G = 6.673 \cdot 10^{-11}$, estimate the escape
speed required at the outer limit of the atmosphere for a projectile of mass 100 kilograms. How is your
answer affected if the projectile mass is instead 1000 kilograms? How about $10^{22}$ kilograms?

20. Suppose at some time that two bodies subject only to their mutual gravitational attraction are at
distance $r_0$ apart and are receding from each other along a fixed line at a certain fraction $q$ of escape
velocity, where $0 < q < 1$. Show that their separation velocity reaches zero, and the bodies start to
"fall" toward each other, when their distance apart becomes $r_0/(1 - q^2)$.

21. This exercise is a reminder that there would be no such thing as escape velocity for a body of
constant mass if the acceleration of gravity were really constant. For linear motion away from the
attracting body, we would have $\ddot{x} = -g$, for some positive constant $g$. Show that no matter how
large $x_0 = x(0) > 0$ and $v_0 = \dot{x}(0) > 0$ are, $x(t)$ has a finite maximum.

22. (a) Use Kepler's second law, equal areas swept out in equal times, to show that a planet moving in
circular orbit must have constant speed.
(b) Use Kepler's third law, $T^2 = 4\pi^2 a^3/(G(m_1 + m_2))$, together with the result of part (a), to show
that a circular orbit of radius $a$ has constant orbital speed $v = \sqrt{G(m_1 + m_2)/a}$. [Hint: Express $v$ in
terms of the period $T$.]

23. The uniform orbital speed of a satellite of mass $m_1$ at distance $x_0$ from an attracting body of mass
$m_2$ is the speed $v_1$ that the satellite must attain to keep it in a uniform circular orbit.
(a) Show that the orbit

$$\mathbf{x} = x_0(\cos (v/x_0)t, \sin (v/x_0)t)$$

represents circular motion of radius $x_0$ with uniform speed $v$ and acceleration $\ddot{\mathbf{x}}$ toward the origin of
magnitude $v^2/x_0$. This acceleration vector is called centripetal acceleration. [Hint: Compute $|\mathbf{x}|$, $|\dot{\mathbf{x}}|$
and $|\ddot{\mathbf{x}}|$.]
(b) Show that if gravitational acceleration $G(m_1 + m_2)/x_0^2$ is to provide precisely the centripetal
acceleration of the circular orbit found in part (a), then the uniform orbital speed will be
$v_1 = \sqrt{G(m_1 + m_2)/x_0}$.
(c) How is uniform orbital speed related to escape speed?
(d) How many days would there be in a month if the earth's moon had a uniform circular orbit of
radius equal to 384,404 kilometers, the mean distance of the moon from the earth, and the earth's
motion around the sun is ignored? (Because of the earth's motion around the sun, the number of days
from full moon on earth to full moon is about two days more than the answer to this exercise.)

24. The synchronous orbit of a body of mass $m$ about a uniformly rotating body of mass $M > m$ is the
one that maintains the orbiting body directly over one point on the rotating one. Assume the mass of
each body is concentrated at its center.
(a) Use the first two Kepler laws to show that a synchronous orbit is necessarily circular, and that it
must lie in the plane of the equator of the rotating body.
(b) Use the third Kepler law to show that if $T$ is the period of rotation of the larger body, then the
radius of the synchronous orbit is $R = KT^{2/3}$, where $K = \sqrt[3]{G(M + m)/4\pi^2}$.
(c) Show that the synchronous orbit about the earth for a small satellite has radius approximately
6.62846 times the radius of the earth, or 26,246 miles. (Continuing orbital correction of communication
satellites is required because of uneven mass concentrations on earth and the influence of other bodies
such as the sun and the moon.)

25. The Newton equations for orbits of a single planet of mass $m_2$ relative to a fixed sun of mass $m_1$
have the form

$$\ddot{x} = \frac{-kx}{(x^2 + y^2)^{3/2}}, \qquad \ddot{y} = \frac{-ky}{(x^2 + y^2)^{3/2}},$$

where $k = G(m_1 + m_2)$.
(a) Find the relationship that must hold between the positive constants $a$ and $\omega$ so that these
differential equations will have solutions with circular orbits described by $x(t) = a\cos\omega t$,
$y(t) = a\sin\omega t$.
(b) Show that the relationship described in part (a) expresses the third Kepler law.
(c) Show that the orbit

$$\mathbf{x} = (a\cos\omega t, a\sin\omega t), \quad \omega = \text{const.} > 0$$

obeys the second Kepler law.

26. A vector system $\ddot{\mathbf{x}} = -F(\mathbf{x})$ is called conservative if there is a real-valued potential energy
function $U(\mathbf{x})$ such that $F(\mathbf{x}) = \nabla U(\mathbf{x})$. For a 1-dimensional vector field the relation is just
$F(x) = U'(x)$; a potential function is determined only up to an additive constant.
(a) Verify that the Newtonian vector field

$$F(x, y) = \left(\frac{kx}{(x^2 + y^2)^{3/2}}, \frac{ky}{(x^2 + y^2)^{3/2}}\right)$$

has $U(x, y) = -k(x^2 + y^2)^{-1/2}$ as potential.
(b) The kinetic energy of a body of mass 1 following a path $(x, y) = (x(t), y(t))$ is
$T = \frac{1}{2}(\dot{x}^2 + \dot{y}^2)$, and the total energy of motion in a conservative field is

$$E = T + U = \frac{1}{2}(\dot{x}^2 + \dot{y}^2) - \frac{k}{\sqrt{x^2 + y^2}}$$

for the Newtonian field. Verify that the total energy $E$ is constant for the motion in the vector field of
part (a). [Hint: Show that $dE/dt = 0$, and use the Newton equations of motion.]
(c) Verify that for motion governed by the equation $\ddot{\mathbf{x}} = -F(\mathbf{x})$ the total energy $E$ is constant if the
vector field is conservative: $F(\mathbf{x}) = \nabla U(\mathbf{x})$.

The results of the next four exercises establish the validity of Kepler's laws.

27. We've seen that the orbit of one body relative to a second always lies in a fixed plane containing
both bodies. This is often shown as follows.
(a) Show that if a body of mass $m$ has a path of motion that obeys the inverse-square law
$m\ddot{\mathbf{x}} = -(k/|\mathbf{x}|^3)\mathbf{x}$, then the motion is confined to a plane through the center of attraction
determined by the initial position and the velocity vectors. [Hint: Establish the relation
$\frac{d}{dt}(\mathbf{x} \times \dot{\mathbf{x}}) = \mathbf{x} \times \ddot{\mathbf{x}}$ to show that the plane containing $\mathbf{x}$ and $\dot{\mathbf{x}}$ is perpendicular to a fixed vector.]
(b) A central force law is one such that motion is governed by an equation of the form
$\ddot{\mathbf{x}} = G(\mathbf{x})\mathbf{x}$, where $G(\mathbf{x})$ is a real-valued function. Show that motion subject to a central force law
is confined to a plane.

28. The angular momentum of a planet at position $\mathbf{x}$ in its plane orbit about the sun is the vector
$\mathbf{L} = \mathbf{x} \times m\dot{\mathbf{x}}$, that is, $\mathbf{L}$ is the cross-product of the position vector $\mathbf{x}$ with the linear momentum
vector $m\dot{\mathbf{x}}$.
(a) Introduce rectangular coordinates $x$, $y$ in the plane of motion so that $\mathbf{x} = (x, y, 0)$ to show that the

Express area swept out along an orbit as an integral with respect to time $t$ between $t$ and $t + \tau$.]

29. Kepler's second law (radius vector from sun to planet sweeps out equal areas in equal times) holds
for all central force laws, that is, force laws expressible in the form $\ddot{\mathbf{x}} = G(\mathbf{x})\mathbf{x}$, where $G(\mathbf{x})$ is some
real-valued function. This includes as special cases the inverse-square law of attraction, where
$G(\mathbf{x}) = -k|\mathbf{x}|^{-3}$, $k > 0$, and the Coulomb repulsion law, where $G(\mathbf{x}) = k|\mathbf{x}|^{-3}$, $k > 0$; the latter
governs interaction of particles bearing electric charges of the same sign.
(a) Assuming planar motion and using rectangular coordinates $(x, y)$ for $\mathbf{x}$, show that a central force
law has the form

$$\ddot{x} = G(x, y)x, \qquad \ddot{y} = G(x, y)y,$$

and conclude that $x\ddot{y} - y\ddot{x} = 0$ for a motion governed by a central force law.
(b) Use the conclusion of part (a) to show that $x\dot{y} - y\dot{x} = h$ for some constant $h$.
(c) Change to polar coordinates by $x = r\cos\theta$, $y = r\sin\theta$ to show that $x\dot{y} - y\dot{x} = r^2\dot{\theta}$ and hence,
using part (b), show that $r^2\dot{\theta} = h$ for some constant $h$. The result says that for motion in a central
force field, the angular velocity $\dot{\theta}$ is inversely proportional to the square of the distance from the
center of the field.
(d) Use the result of part (c) and a computation of area in polar coordinates to prove Kepler's second
law for a central force field by showing that, as a function of time $t$, area swept out has the form
$A = \frac{1}{2}ht + c$. Explain why this proves Kepler's second law under the given assumptions.
*(e) Apply Green's theorem to the equation $x\dot{y} - y\dot{x} = h$ derived in part (b) to show directly, without
using polar coordinates, that Kepler's second law holds.

*30. A single planet with position $\mathbf{x} = \mathbf{x}(t)$ obeying $\ddot{\mathbf{x}} = -(k/|\mathbf{x}|^3)\mathbf{x}$ follows an elliptic, parabolic, or hyperbolic
path. Here is an outline of a way to show this by deriving
length L = ILi of angular momentum equals L =
a linear differential equation from the vector equation.
mlxj, - yil.
(b) Show that in terms of polar coordinates x = r cos 0, (a) Use x = (r cos 0, r sin 0) to express the vector
y = r sin 0, the angular momentum is mr 2iJ, if equation of motion in the two polar coordinate
iJ > 0. equations;: - rB 2 = -k/r 2 , r0 + 2rB = O.
(c) Kepler's second law of planetary motion decrees (b) Show that ~he second equation derived in part (a)
that the radius joining a planet to the sun sweeps implies r 20 = h for some constant h, and use
out equal areas in equal times. Use the Kepler law this to write the other equation in the form r =
together with the formula h 2 r- 3 - kr- 2. In particular, show that if h = 0 the
motion is confined to a line and results either in
collision or escape.
A=! (9i r 2 d0 (c) Use the results of part (b) to show that if h ::f. 0,
2101
for area in polar coordinates to show that the angular I d2r 2 I (dr ) 2 k
momentum mr 2 8 is constant on an orbit. [Hint: r2 d0 2 - r3 d0. =~ - 112.

[Hint: Use the chain rule to express ;- and ;: in terms and ro are initial speed and distance. Starting with the
of derivatives with respect to 0.] Newton vector equation x = -kx/\x\ 3 , we form the dot
(d) Make the change of variable r = 1/u to show that product of both sides by x to get the scalar equation
the equation in part (c) becomes the second-order X•X =
-k(x •x)/ jxj 3 .
linear equation d 2 u/d0 2 + u = k/ h 2 . (a) Show that the left side of the previous equation is
(e) Show that the solution u = 1/r = d v2 ~ .
A cos(0 + a) + k/ h 2 to the previous equation rep- equa I to dt , where v -vx • X Ix! .
= =
2
resents an ellipse, parabola or hyperbola in polar (b) Show that the right side of that same equation is
coordinates according as IA! < k/h 2 , IAI = k/h 2, equal to k(Vjxj- 1) • x, where V is the gradient
or IA! > k/ h2 . [Hint: Let x = r cos(0 + a), operator: VJ= (afiax, aflay, at1az).
y = r sin(0 + a), a rotation by a of the original (c) Show that the result of part (b) is also k(d/dt)lxl- 1,
xy-axes.] and conclude that
(0 Each focus of an ellipse lies on the major axis at
d v2 d 1
=
distance c from the center where c 2 a 2 - b2 and a - - =k - -
and bare the semi-axis. The eccentricity is e = c/a . dt 2 dt r
Show that for an elliptic orbit, the center of attraction (d) Integrate the previous equation between O and an
is at one focus and the eccentricity is jAjh 2 /k. Then arbitrary positive time t to get the equation relating
show that the polar equation for an orbit is v, vo, r and ro.
h2 /k 32. Suppose a projectile is fired directly away from and at
r----- distance xo from the center of mass of a planet with initial
- 1 + ecos0 ·
speed zo. If zo is Jess than the escape speed, show that
[Hint: For the first part, convert the polar equation, the maximum additional distance attained from the center
with a = 0, to rectangular coordinates.] of mass of the planet is
(g) Assume that the orbit in part (f) is elliptic, with
0 :5 e < I. Show that the time for one complete
revolution is T = 2rrnb/ h. Then show that h 2 / k =
2GM-xozl'
b2 /a to derive the third Kepler Jaw T 2 = 4rr 2a 3/k.
[Hint: The sum of the maximum and minimum where M is the sum of the masses of the two bodies.
values for r is equal to 2a.]
33. A 2-dimensional Hamiltonian system is a pair of differ-
*31. We established the formula for escape speed in the text ential equations of the form
under the assumption that the relative distance separating
two bodies, subject only to the forces of mutual grav- dx/dt = Hy(x, y, t), dy/dt = -Hx(X, y, t).
itational attraction, would always be measured radially
along the same fixed line. The purpose of this problem The function H that determines the system is called
its Hamiltonian. Suppose that (x(t), y(t)) satisfies the
is to show that the fixed-line assumption isn't necessary,
and that the relative speed v and distance r are always system, and consider two functions oft:
d
related by (a) dt [H (x(t), y(t), t)],
v2 v2 k k
- - _Q_ (b) H1 (x(t), y(t), t),
2 2 r ro where the partial derivative of the Hamiltonian in (ii) is
Here the constant is k =
Gm, where m is the sum of computed before substituting x(t) and y(t) for x and y.
the two masses, G is the gravitational constant, and vo Show that these two functions oft are equal.

SECTION 4 NUMERICAL METHODS


Our numerical methods apply to a first-order vector initial-value problem

$$\frac{d\mathbf{x}}{dt} = \mathbf{F}(t, \mathbf{x}), \qquad \mathbf{x}(t_0) = \mathbf{x}_0.$$
If the system is linear and has an explicit solution formula, that formula may well be
preferred to a numerical approximation, because the approximation may be unable
to give a convincing description of a solution's long-term behavior. However, even
for a solvable system the numerical approach may be the quickest way to get some
short-term qualitative information about solution trajectories.
4A Euler's Method
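We choose a step of size $h$ to find successive approximations $\mathbf{x}_k$ to the true values
$\mathbf{x}(t_0 + kh)$ of the solution $\mathbf{x}(t)$. The idea is to use the derivative approximation

$$\frac{\mathbf{x}(t + h) - \mathbf{x}(t)}{h} \approx \mathbf{F}(t, \mathbf{x}),$$

in the form

$$\mathbf{x}(t + h) \approx \mathbf{x}(t) + h\mathbf{F}(t, \mathbf{x}).$$

Thus having found $\mathbf{x}_k$ corresponding to $t_k = t_0 + kh$, we define the approximation
$\mathbf{x}_{k+1}$ at $t_{k+1}$ by

$$\mathbf{x}_{k+1} = \mathbf{x}_k + h\mathbf{F}(t_k, \mathbf{x}_k).$$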
Starting at a point $\mathbf{x}_0$, this equation generates a sequence $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_m$ of arrow
tips $\mathbf{x}_{k+1}$ designed to lie close to a trajectory containing $\mathbf{x}_k$. Figure 12.11(a) shows
an example of an autonomous vector field $\mathbf{F}(\mathbf{x})$ along with some arrows tangent to
points $\mathbf{x}_k$ on a solution trajectory, the latter shown as a dotted curve. If we scale these
tangent arrows down in length by a small enough factor $h > 0$, we can expect the tip
of each arrow to land at points $\mathbf{x}_k + h\mathbf{F}(t_k, \mathbf{x}_k)$ that are good approximations to points
on the trajectory. Having accepted one of these approximations as $\mathbf{x}_{k+1}$, we may then
go on similarly to the next approximation by starting at $\mathbf{x}_{k+1}$. Figure 12.11(b) shows
how using a small scale factor $h$ can improve the approximation.
For a 2-dimensional system,

$$\begin{aligned}
\dot{x} &= F(t, x, y), & x(t_0) &= x_0,\\
\dot{y} &= G(t, x, y), & y(t_0) &= y_0,
\end{aligned}$$

the 0th step starts with $x_0$ and $y_0$. Then

$$\begin{aligned}
x_1 &= x_0 + hF(t_0, x_0, y_0)\\
y_1 &= y_0 + hG(t_0, x_0, y_0).
\end{aligned}$$

FIGURE 12.11 (a) Vector field, trajectory, and tangents. (b) Effect of scaling on a tangent.

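Next, with $t_1 = t_0 + h$,

$$\begin{aligned}
x_2 &= x_1 + hF(t_1, x_1, y_1)\\
y_2 &= y_1 + hG(t_1, x_1, y_1).
\end{aligned}$$

In general, with $t_k = t_{k-1} + h = t_0 + kh$, we get

$$\begin{aligned}
x_{k+1} &= x_k + hF(t_k, x_k, y_k)\\
y_{k+1} &= y_k + hG(t_k, x_k, y_k).
\end{aligned}$$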
The basic loop for computer implementation is then

    S = X                      (Save previous X.)
    X = X + H * F(T, X, Y)
    Y = Y + H * G(T, S, Y)
    T = T + H,

where the letters on the left represent the new values and the letters on the right
represent the values computed in the previous step of the loop.

EXAMPLE 1 The system

$$\begin{aligned}
\dot{x} &= ty + 1, & x(0) &= 1\\
\dot{y} &= x, & y(0) &= -1
\end{aligned}$$

arose in Example 4 of the previous section. With step size $h = 0.1$, the loop

    S = X
    X = X + H * (T * Y + 1)
    Y = Y + H * S
    T = T + H

produces the values in Table 12.1.
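
The Euler loop for a 2-dimensional system can be sketched in Python as follows. This is only an illustrative sketch, not the program used to prepare the text: the function name `euler_system` and its argument order are assumptions made here, and the right-hand sides are those of the system in Example 1.

```python
def euler_system(F, G, t0, x0, y0, h, steps):
    """Return a list of (t, x, y) rows produced by Euler's method.

    F and G are the right-hand sides of dx/dt = F(t, x, y) and
    dy/dt = G(t, x, y).  (Hypothetical helper, named for illustration.)
    """
    t, x, y = t0, x0, y0
    rows = [(t, x, y)]
    for _ in range(steps):
        # Both right-hand sides are evaluated at the old (t, x, y);
        # the tuple assignment plays the role of the saved value S.
        x, y = x + h * F(t, x, y), y + h * G(t, x, y)
        t += h
        rows.append((t, x, y))
    return rows


# Example 1: dx/dt = t*y + 1, dy/dt = x, with x(0) = 1, y(0) = -1.
rows = euler_system(lambda t, x, y: t * y + 1.0,
                    lambda t, x, y: x,
                    0.0, 1.0, -1.0, 0.1, 25)

for t, x, y in rows[:4]:
    print(f"{t:4.1f} {x:9.4f} {y:9.4f}")
```

Returning every row makes it easy to tabulate or plot the approximate trajectory; a smaller `h` with more `steps` covers the same time interval more accurately at greater cost.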

4B Improved Euler Method


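A more accurate numerical method for a vector equation is a modification of the
Euler method in which, instead of using the tangent vector $\mathbf{F}(t_k, \mathbf{x}_k)$ to find the next
value, we use the vector average

$$\tfrac{1}{2}\left[\mathbf{F}(t_k, \mathbf{x}_k) + \mathbf{F}(t_k + h, \mathbf{p}_{k+1})\right],$$

where $\mathbf{p}_{k+1}$ is the value that the Euler method would have predicted, namely

$$\mathbf{p}_{k+1} = \mathbf{x}_k + h\mathbf{F}(t_k, \mathbf{x}_k).$$

Using the predictor value $\mathbf{F}(t_k + h, \mathbf{p}_{k+1})$ to modify the Euler estimate, we define
the improved Euler approximation to be

$$\mathbf{x}_{k+1} = \mathbf{x}_k + \frac{h}{2}\left[\mathbf{F}(t_k, \mathbf{x}_k) + \mathbf{F}(t_k + h, \mathbf{p}_{k+1})\right].$$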
TABLE 12.1

   t       x       y     dx/dt = ty + 1   dy/dt = x

   0      1      -1          1               1
   0.1    1.1    -0.89       1               1.1
   0.2    1.19   -0.77       0.92            1.19
   0.3    1.28   -0.64       0.87            1.28
   0.4    1.35   -0.51       0.85            1.35
   0.5    1.44   -0.36       0.85            1.44
   0.6    1.52   -0.21       0.89            1.52
   0.7    1.61   -0.05       0.97            1.61
   0.8    1.70    0.12       1.08            1.70
   0.9    1.81    0.30       1.24            1.81
   1      1.94    0.49       1.44            1.94
   1.1    2.09    0.70       1.70            2.09
   1.2    2.26    0.93       2.02            2.26
   1.3    2.48    1.18       2.41            2.48
   1.4    2.73    1.45       2.88            2.73
   1.5    3.03    1.75       3.45            3.03
   1.6    3.39    2.09       4.14            3.39
   1.7    3.83    2.47       4.96            3.83
   1.8    4.35    2.91       5.95            4.35
   1.9    4.97    3.41       7.13            4.97
   2      5.72    3.98       8.56            5.72
   2.1    6.62    4.64      10.28            6.62
   2.2    7.69    5.41      12.36            7.69
   2.3    8.98    6.31      14.88            8.98
   2.4   10.53    7.36      17.93           10.53
   2.5   12.40    8.60      21.64           12.40

Figure 12.12 shows a geometric rationale for using this particular modification to
get an improved estimate. The tip of the arrow $\mathbf{x}_{k+1}$ lies at the midpoint between
the tip of the Euler arrow computed at $\mathbf{x}_k$ and time $t_k$ and the tip of the Euler arrow
computed at $\mathbf{p}_{k+1}$ and time $t_k + h$, then translated back to $\mathbf{x}_k$. Thus the improved
estimate uses information not just from the pair $(t_k, \mathbf{x}_k)$ but also estimated future
information from $(t_k + h, \mathbf{p}_{k+1})$.
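FIGURE 12.12

$$\mathbf{x}_{k+1} = \tfrac{1}{2}\left[\big(\mathbf{x}_k + h\mathbf{F}(t_k, \mathbf{x}_k)\big) + \big(\mathbf{x}_k + h\mathbf{F}(t_k + h, \mathbf{p}_{k+1})\big)\right]
= \mathbf{x}_k + \frac{h}{2}\left[\mathbf{F}(t_k, \mathbf{x}_k) + \mathbf{F}(t_k + h, \mathbf{p}_{k+1})\right]$$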

For a 2-dimensional system

$$\begin{aligned}
\dot{x} &= F(t, x, y), & x(t_0) &= x_0\\
\dot{y} &= G(t, x, y), & y(t_0) &= y_0,
\end{aligned}$$

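we start with $(x_0, y_0)$. Then letting $\mathbf{p} = (p, q)$, we compute the Euler approximation

$$\begin{aligned}
p_1 &= x_0 + hF(t_0, x_0, y_0)\\
q_1 &= y_0 + hG(t_0, x_0, y_0),
\end{aligned}$$

followed by the modified approximation

$$\begin{aligned}
x_1 &= x_0 + \frac{h}{2}\left[F(t_0, x_0, y_0) + F(t_0 + h, p_1, q_1)\right]\\
y_1 &= y_0 + \frac{h}{2}\left[G(t_0, x_0, y_0) + G(t_0 + h, p_1, q_1)\right].
\end{aligned}$$

At the $(k + 1)$th step, we compute $t_k = t_{k-1} + h = t_0 + kh$ and

$$\begin{aligned}
p_{k+1} &= x_k + hF(t_k, x_k, y_k)\\
q_{k+1} &= y_k + hG(t_k, x_k, y_k)\\
x_{k+1} &= x_k + \frac{h}{2}\left[F(t_k, x_k, y_k) + F(t_{k+1}, p_{k+1}, q_{k+1})\right]\\
y_{k+1} &= y_k + \frac{h}{2}\left[G(t_k, x_k, y_k) + G(t_{k+1}, p_{k+1}, q_{k+1})\right].
\end{aligned}$$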
EXAMPLE 2 Consider the second-order equation

$$\ddot{y} + y = 0; \qquad y(0) = 0, \quad \dot{y}(0) = 1.$$

Reducing the equation to a first-order system by letting $\dot{y} = x$ gives

$$\begin{aligned}
\dot{x} &= -y, & x(0) &= 1\\
\dot{y} &= x, & y(0) &= 0.
\end{aligned}$$

The recursive formulas for the basic loop have the form

    P = X - H * Y
    Q = Y + H * X
    X = X + (H/2) * (-Y - Q)
    Y = Y + (H/2) * (X + P),

where, as before, the letters on the right represent the values from the previous step.
The results in Table 12.2 use step size $h = 0.01$ but record only every tenth step,
including also the values of $\cos t$ and $\sin t$ for comparison, because $(x(t), y(t)) =
(\cos t, \sin t)$ is the correct elementary formula for the solution.
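
The improved Euler recursion for a 2-dimensional system can be sketched in Python as follows. The function name `improved_euler_2d` is an assumption introduced for this illustration; the right-hand sides are those of the first-order system in Example 2, and the exact solution $(\cos t, \sin t)$ is used only as a check.

```python
import math


def improved_euler_2d(F, G, t0, x0, y0, h, steps):
    """Advance dx/dt = F(t, x, y), dy/dt = G(t, x, y) by improved Euler.

    (Hypothetical helper, named for illustration.)  Returns the final
    (t, x, y) after the given number of steps of size h.
    """
    t, x, y = t0, x0, y0
    for _ in range(steps):
        fx, gy = F(t, x, y), G(t, x, y)        # slopes at the current point
        p, q = x + h * fx, y + h * gy          # Euler predictor (p, q)
        # Average the slope at the current point with the slope at the predictor.
        x, y = (x + 0.5 * h * (fx + F(t + h, p, q)),
                y + 0.5 * h * (gy + G(t + h, p, q)))
        t += h
    return t, x, y


# Example 2: dx/dt = -y, dy/dt = x, with x(0) = 1, y(0) = 0.
t, x, y = improved_euler_2d(lambda t, x, y: -y,
                            lambda t, x, y: x,
                            0.0, 1.0, 0.0, 0.01, 100)
print(t, x - math.cos(t), y - math.sin(t))   # differences from the exact solution
```

Storing the slopes `fx` and `gy` before updating avoids evaluating each right-hand side twice at the same point and ensures that both updates use values from the previous step.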

If $\mathbf{F}(t, \mathbf{x})$ is continuously differentiable in $\mathbf{x}$, then the one-step error using the
Euler method is of order $h^2$, where $h$ is the step size. Using the improved Euler
method, the one-step error is of order $h^3$, provided $\mathbf{F}(t, \mathbf{x})$ is twice continuously
differentiable in $\mathbf{x}$.
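
The two error orders can be observed on a simple test problem. The sketch below, in Python, is an illustration under assumptions chosen here: it takes a single step from $x(0) = 1$ for the scalar equation $\dot{x} = x$, whose exact solution is $e^t$, and compares the one-step errors for $h = 0.1$ and $h = 0.05$. Halving $h$ should divide an error of order $h^2$ by about 4 and an error of order $h^3$ by about 8.

```python
import math


def euler_step(f, t, x, h):
    """One Euler step for dx/dt = f(t, x)."""
    return x + h * f(t, x)


def improved_euler_step(f, t, x, h):
    """One improved Euler step: average the slopes at x and at the predictor p."""
    p = x + h * f(t, x)
    return x + 0.5 * h * (f(t, x) + f(t + h, p))


def f(t, x):
    # Test equation dx/dt = x with x(0) = 1; exact solution is exp(t).
    return x


def one_step_errors(h):
    exact = math.exp(h)
    return (abs(exact - euler_step(f, 0.0, 1.0, h)),
            abs(exact - improved_euler_step(f, 0.0, 1.0, h)))


e1, i1 = one_step_errors(0.1)
e2, i2 = one_step_errors(0.05)
print("Euler error ratio:         ", e1 / e2)   # close to 2**2
print("improved Euler error ratio:", i1 / i2)   # close to 2**3
```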
TABLE 12.2

   T        X          Y         cos T      sin T

   0      1          0          1.0        0.0
   0.1    0.99505    0.09983    0.99500    0.09983
   0.2    0.98011    0.19868    0.98006    0.19867
   0.3    0.95538    0.29553    0.95533    0.29552
   0.4    0.92110    0.38944    0.92105    0.38942
   0.5    0.87762    0.47945    0.87757    0.47943
   0.6    0.82537    0.56467    0.82533    0.56465
   0.7    0.76487    0.64425    0.76483    0.64422
   0.8    0.69673    0.71740    0.69669    0.71736
   0.9    0.62163    0.78337    0.62159    0.78333
   1      0.54031    0.84152    0.54028    0.84148
   1.1    0.45360    0.89126    0.45358    0.89121
   1.2    0.36235    0.93209    0.36233    0.93204
   1.3    0.26749    0.96361    0.26747    0.96356
   1.4    0.16995    0.98550    0.16994    0.98545
   1.5    0.07071    0.99754    0.07071    0.99749
   1.6   -0.02922    0.99962   -0.02922    0.99957

Recall from Section 3 that Newton's equations of planetary motion for a planet of
mass $m_1$ orbiting a star of mass $m_2$ are

$$\begin{aligned}
\ddot{x} &= -\frac{G(m_1 + m_2)x}{(x^2 + y^2)^{3/2}}, & x(0) &= x_0, & \dot{x}(0) &= u_0,\\
\ddot{y} &= -\frac{G(m_1 + m_2)y}{(x^2 + y^2)^{3/2}}, & y(0) &= y_0, & \dot{y}(0) &= v_0.
\end{aligned}$$

These equations are derived in the previous section, where we remarked that the
second-order system is equivalent to the first-order system

$$\begin{aligned}
\dot{x} &= u, & x(0) &= x_0\\
\dot{y} &= v, & y(0) &= y_0\\
\dot{u} &= -\frac{G(m_1 + m_2)x}{(x^2 + y^2)^{3/2}}, & u(0) &= u_0\\
\dot{v} &= -\frac{G(m_1 + m_2)y}{(x^2 + y^2)^{3/2}}, & v(0) &= v_0.
\end{aligned}$$

We often choose units of measurement so that $G(m_1 + m_2) = 4\pi^2$, although for some
purposes we could just as well choose them so that $G(m_1 + m_2) = 1$. In choosing
initial values for position and velocity, recall that we get an elliptic orbit only if the
orbital speed is less than the escape speed, $v_e = \sqrt{2G(m_1 + m_2)/r}$. In other words,
we want

$$\sqrt{u_0^2 + v_0^2} < \sqrt{\frac{2G(m_1 + m_2)}{\sqrt{x_0^2 + y_0^2}}}.$$

Thus if $G(m_1 + m_2) = 1$ and $(x_0, y_0) = (1, 0)$, we should choose $(u_0, v_0)$ so that

$$u_0^2 + v_0^2 < 2$$

TABLE 12.3

   t        x            y

   0      1            0
   0.5    0.877588     0.479435
   1.0    0.540327     0.841496
   1.5    0.070792     0.997561
   2.0   -0.416073     0.909418
   2.5   -0.801107     0.598749
   3.0   -0.990088     0.141517
   3.5   -0.936777    -0.350347
   4.0   -0.654219    -0.756473
   4.5   -0.211552    -0.977463
   5.0    0.282893    -0.959211
   5.5    0.708087    -0.706161
   6.0    0.959937    -0.280329

to get a closed trajectory. Table 12.3 for $(t, x, y)$ came from using the improved
Euler method, having chosen $G(m_1 + m_2) = 1$, $(x_0, y_0) = (1, 0)$, and $(u_0, v_0) =
(0, 1)$. The step size was $h = 0.01$, but the result is printed only for every 50 steps.

The initial conditions we have chosen in these examples are satisfied by the
solution (x(t), y(t)) = (cost, sin t), which has a circular orbit for its trajectory.
Hence we can use this solution as a check on the accuracy of our method of numerical
approximation.
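
A computation of this kind can be sketched in Python using a vector form of the improved Euler method. The names `improved_euler` and `newton_field` are assumptions introduced for this illustration, the state is ordered as $(x, y, u, v)$, and the units are chosen so that $G(m_1 + m_2) = 1$; the circular solution $(\cos t, \sin t)$ serves as the check.

```python
import math


def improved_euler(f, t0, state0, h, steps):
    """Improved Euler for a first-order vector system dx/dt = f(t, x).

    (Hypothetical helper, named for illustration.)  The state is a list
    of floats; the final (t, state) is returned.
    """
    t, s = t0, list(state0)
    for _ in range(steps):
        k1 = f(t, s)
        p = [si + h * ki for si, ki in zip(s, k1)]       # Euler predictor
        k2 = f(t + h, p)
        s = [si + 0.5 * h * (a + b) for si, a, b in zip(s, k1, k2)]
        t += h
    return t, s


def newton_field(t, s, gm=1.0):
    """Right-hand side of the first-order Newton system, state (x, y, u, v).

    gm stands for G(m1 + m2); the default 1.0 is a choice of units.
    """
    x, y, u, v = s
    r3 = (x * x + y * y) ** 1.5
    return [u, v, -gm * x / r3, -gm * y / r3]


# Initial position (1, 0) and velocity (0, 1): the circular orbit.
t, s = improved_euler(newton_field, 0.0, [1.0, 0.0, 0.0, 1.0], 0.01, 50)
print(t, s[0] - math.cos(t), s[1] - math.sin(t))   # differences from (cos t, sin t)
```

Writing the method for a list-valued state means the same routine applies unchanged to systems of any dimension; only the function supplying the right-hand side changes.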

EXERCISES

Software for doing these exercises is widely available; of size h = 0.1, both by computing from the
in particular the Web site http://math.dartmouth.edu/ explicit exponential solution formula and by a direct
~,rewn contains applicable Java applets, along with numerical solution of the system using either the
some graphical demonstration applets for specific appli- Euler method or its improved modification.
cations. 2. Find a table of approximations to the solution x(t), y(t)
of the system
1. The first order autonomous system
x=x+i
x=y y=x 2 +v+t,
y=x with initial condition x(O) = I, y(O) = 2. Use a step of
size h = 0.1 on the interval 0 .::: t =s ½, and make the
is equivalent to the single equation _v = y, via the relation
approximation with
_y = x.
(a) The Euler method
(a) Show that the system has solutions
(b) The improved Euler method.
x(t) = qe + c2e-
1 1
, 3. (a) Show that the second-order equation
y(t) = cie
1
cze-
-
1

.Y-2.v + y = r
(b) Find the panicular solution satisfying ,\"(0) = with initial conditions y(O) = I, j,(0) 2, IS
l , y(O) = 2. equivalent to the first order system
(c) Compute a table of numerical approximations to
the particular solution found in part (b ). Do this .i =
2x - y + t, x(O) = 2,
computation on the interval 0 ::: t ::: ! in steps y=x, y(O)=I.
(b) Find a numerical approximation to the solution of MIXING
the system in part (a) for the interval 0 :::: t ::::: J.
(c) Solve the given second-order equation by using its 12. Tanks I at capacity I 00 gallons and 2 at 200 gallons are
characteristic equation, and compare the solution initially full of salt solution. Tank 1 has 5 gallons per
with the numerical results of part (b). minute of salt solution at I pound per gallon running in
while mixed solution is drawn off, also at 5 gallons per
4- (a) Find a first-order system equivalent to
minute, with an additional 3 gallons per minute flowing
y"-ty'-y=t, y(O)=l, y'(0)=2. out to tank 2. Tank 2 has 2 gallons per minute of pure
water running in and 3 gallons per minute being drawn
(b) Find a numerical approximation to the solution of off, while 2 gallons per minute more flow to tank I.
the system in part (a) for the interval O :::: t ::::: I. (a) Find a system of differential equations satisfied by
the salt contents of the tanks up to the time when
Apply the improved Euler method to the I -dimensional one is empty.
systems in Exercises 5 to 8. (b) Make a computer plot that compares the graphs of
the components of the solution to part (a), assuming
5. dx = 1-x 113 ,x(O) = ½ tank I has initially IO pounds of salt and tank 2 has
dt
20 pounds. Estimate the maximum amount of salt in
6. dy = t 2 + y2, y(O) = 1 each tank and the time when these are attained.
dt
dx
7. -
dt
= vl~
+r,x(O) = 0
13. Tanks I at capacity 100 gallons and 2 at 200 gallons
are initially half-full of salt solution. Tank I has 5 gal-
dy . lons per minute of pure water running in while mixed
8. - =smy, y(O)
dt
=I solution is drawn off at 4 gallons per minute. with
an additional 3 gallons per minute pumped to tank 2.
9. Make a computer plot of the solution to the bug-pursuit
Tank 2 has 2 gallons per minute of salt solution at 1
problem of Exercise 34 in Section IC.
pound per gallon running in and I gallon per minute
10. The general Lorenz system is .i =a(y - x), y = being drained off, while 3 gallons per minute pour into
px - y - xz, i. = +
-/3z xy, where /3, p, a are pos- tank I.
itive constants. For certain values of the parameters, in (a) Find a system of differential equations satisfied by
particular f3 =8/3, p =
28, a = 10, solution trajec- the salt contents of the tanks up to the time when
tories exhibit an often-studied type of unpredictable or one is empty.
"chaotic" oscillation. Plot the orbits with these parameter (b) Make a computer plot that compares the graphs
choices and initial value (x, y, z) = (2, 2, 21) . A partic- of the components of the solution to part (a),
ularly good view is obtained by projecting on the plane assuming tank I has initially IO pounds of salt
through the origin perpendicular to the vector (- 2, 3, I). and tank 2 has 20 pounds. Estimate the minimum
Note the effect of small changes in the initial vector on amount of salt in tank 1 and the time when this is
the successive numbers of circuits in each spiral configu- attained.
ration.
11. A basic result of multivariable calculus, proved in PLANETARY ORBITS
Chapter 5, Section I, says that the gradient vector
Vf(x, y) = (f,(x, y), /y(x, y)) is perpendicular to the 14. The system of Newton equations
level curve of f that contains (x, y); consequently, x = -x(x 2 + y2)- 312 , y = -y(x 2 + y2)- 312 with initial
( - /y(x, y), fx(x, y)) is tangent to a level set, which conditions x(O) = 1, y(O) = 0, i(O) = 0, y = vo,
is then a trajectory of the system x = -fy(x, y), has a solution with a closed trajectory if vo < ..fi.
=
y fx(x, y). Assume that f(x, y) x 2 = + ½y2. Use the improved Euler method to make an approximate
(a) Use these ideas to make a computer-graphics plot of computation of the trajectory of a single orbit if
the elliptic level curves of f. (a) vu= 0.35 (b) vo = 0.7 (c) vo = 1.4
(b) Make a computer-graphics plot of some orthogonal 15. When the inverse-square law F = Gm1m2r- 2 is replaced
trajectories, that is, curves perpendicular to the level by F = Gm1m2r-P, where 0 < p, the Newton equations
set of the function /. for the orbit of a single planet about a fixed sun take
(c) Identify the well-known family of orthogonal trajec- the form
tory curves by solving the relevant uncoupled system
analytically.

Assume k = and make a pictorial comparison of the 19. Make phase plots of the hard spring oscillator equation
orbits with initial conditions x(O) = I, y(0) = 0, x(O) = y = -y y 3 + 8y under each of the following assumptions.
0, y(O) = 0.5 for the choices p = 1.9, p = 2 and p = 2.1. (a) y = S = 1
Discuss the differences among the three cases. (b) y=l,8=2
(c) y = 2. 8 = l
OSCILLATORY SYSTEMS 20. Make solution graphs and (y, y) phase plots for the
periodic driven hard spring oscillator equation ji =
16. The equations ii = -(g/l)sin0 + <i, 2 sin0cos0,
¢,
-204>cot0, 0 # kn, k integer, govern the spherical pen-
fo
- y - y 3 + k y + cost and use the results to detect
Jong-term approach to periodic behavior.
dulum, where </J and O < 0 < TC are spherical coordinate
angles where 0 is measured from the downward-pointing Time-dependent linear spring mechanism. The
vertical axis in JR 3 and </J is the longitudinal angle. Use equation ji+k(t)y+h(t)y = sint represents an oscillator
g/ l = I. externally forced by f (t) = sin t, damped by factor k(t)
(a) Assuming g / l = 1, plot the trajectory in 0¢,-space and with stiffness h(t). Use initial conditions y(O) = 0,
for a solution if 0(0) = 0(0) = 1, ¢,(0) = 0, 4>(0) = y(O) = 1 to plot solutions with the following choices
]. for k(t) and h(t).
(b) Assuming g I l = I, make a 3-dimensional perspec-
tive plot of the (x, y, z) path of the bob if the initial 21. k(t) = 0, h(t) = e 110 1

conditions are as in part (a). 22. k(t) = 0, h(t) = e-t


17. The van der Pol equation is i - + x = 0,
a(l - x 2 ).i- 23. k(t) = ti!• h(t) = e 110 1

where a is a positive constant.


24. k(t) = (I - e- 1 ), h(t) = I
(a) Let y = x and write the first-order system in x and
y that is equivalent to the van der Pol equation with 25. k(t) = 1, h(t) = 1/(l + t 2)
k = 0.1. 26. k(t) = 0, h(t) = t /(1 + t 2 )
(b) Use the improved Euler method to plot numerical
solutions to the system found in part (a), using
initial values x(O) = 2, y(O) = 0, while successively INTERACTING POPULATIONS
letting a = 0.1, 1.0, and 2.0. Plot for O < t < t1,
where t1 is in each plot large enough that the 27. The special Lotka-Volterra system ii = (3 - 2P)H,
trajectory of (x(t), y(t)) appears to be a closed P = (½H - l) P is described in Exercise 6 of the previous
loop. section.
(c) Repeat the three experiments in part (b) with initial (a) Using the initial conditions H (0) = 3, P (0) = ~,
values x(O) = 1, y(O) = 1, and with x(O) = 3, compute sufficiently close approximations to the
y(O) = 0. The closed loops being approximated in solutions (H(T), P(t)) so that the values nearly
part (b) are each an example of a limit cycle. return to the initial values.
(b) Sketch the graphs of H = H(t) and P = P(t) using
18. The nonlinear Duffing oscillator with periodic external
the same vertical axis and the same horizontal t-axis
driver
for the time interval found in part (a).
j; + ky - y + y3 = A cos wt, y(O) = ½, y(O) = 0 (c) Sketch an approximate trajectory in the HP-plane
for the solution found in part (a).
is equivalent to the 2-dimcnsional first-order system 28. The Lotka-Volterra system ii = (3 - 2P)H, P =
obtained by setting y = z. (½H - l)P has solutions (H(t), P(t)) that are periodic,
(a) Make a computer plot of the solution y(t) for because a given orbit always returns to its initial point
0 ::: t ::: I 00 for the parameter choice k = 0.25, in some finite time t1; thus H(t + t1) = H(t) and
A = 0.35, w = I. Compare this result graphically P(t +ti) = P(t) for all t. The time period t1 depends on
with what you get using k = 0.25, A = 0, w = 1 the particular orbit however. Make numerical estimates of
and then k = 0, A = 0.35, w = I. t1 for the system on orbit& with (a) H(O) = 3, P(O) = ½-
(b) Make (y, z) = (y, y) phase plots using the three sets (b) H(O) = 3, P(O) = }, (c) H(O) = 3, P(O) = 2.
of parameter values suggested in part (a). Estimate
the apparent period r of the solution you get in the A refinement iI = (a - bP)H(L - H),
undamped (k = 0) case, indicated by the return of P = (cH - d)P(M - P), a, b, c, d positive constants,
the phase path to its starting point. of the Lalka-Volterra equations takes account of fixed
limits L > H(t) and M > P(t) to the growth of the 33. Estimate the time it takes for an orbit to close under
host and parasite populations. Assume L > d/c > I and each of the three sets of initial conditions proposed in
M > a/b > 1. The restrictions L and M are typically Exercise 28.
imposed by lack of sufficient habitat or food supply. Use 34. The three-species system i = =
x(3 - y), j, y(x + z - 3),
a = 3, b = d = 2, c = 1 and L = 4, M = 3 to plot the i: = z(2 - y) has equilibrium solutions at (3, 3, 0) and
trajectory of the system for (0, 2, 3), as well as the trivial one at (0, 0, 0). Use initial
29. H(O) = 2, P(O) = 1 conditions (x, y, z) = (0.C)()l, 3, 2) to plot the following.
(a) The orbit of (x(t), y(t))
30. H(O) = 2, P(0) = ~
(b) The orbit of (y(t), z(t))
31. H(O) = ~. P(O) = 2 (c) The orbit of (x(t), z(t))
32. H(O) = ~. P(O) = ~

Chapter 12 REVIEW

Solve the initial-value problems in Exercises I to 12. dx/dt = t, x(O) = 0,


12. dy/dt = y, y(O) = 0,
1. dx/dt = x 2 + 1, x(0) = 1, dz/dt = y + z, z(0) = 1
dy/dt = y, y(O) = 2 13. Suppose that the autonomous differential equation x =
F(x) has solution x = x(t) satisfying x(0) = xo and
2. dx/dt = t, x(l) = 0, x( l) = x 1• What, if anything, can you say about a solution
dy/dt = X + y, y(l) = 0 to x = -F(x)?
3. dx/dt = y + 1, x(O) = l, 14. Consider the system (i, j,, i:) = (-wy, wx, u), where w
dy/dt = x, y(O) = 2 and <r are nonzero constants.
(a) Without solving the system, show that the acceler-
4. dx/dt = -y, x(l) = 0, ation vector of a nonzero solution trajectory (i) is
dy/dt = -x, y(l) = 0 always perpendicular to its velocity vector, (ii) is
parallel to the xy-plane, with length equal to w 2
5. i = 3x - 4y, xCO) = l,
times the length of the corresponding position vector
j, = 4x - ?y, y(O) =2 (x, y, 0).
6. i = 3x - 5y,x(O) = 0, (b) Find the complete solution of the system. Then
sketch the trajectory passing through (l, 0, 0),
j, = X - y, y(O) = l assuming w = <r = 1.
1. i = y + t, x(O) = 1, Let f (x) = f (x, y) be a continuously differentiable
j, = 4x - l , y(O) =2 function defined on ~ 2 • The gradient field Vf(x, y) =
8. i = 2, x(O) = 0, (fx (x, y), /y (x, y)) generates the autonomous vector
differential equation x = V/(x), called a gradient
j, = x + y, y(O) = l system.
9. x = -3x + 2y, x(O) = 3, i(O) = 0 15. Show that the nonconstant solution trajectories of a gradi-
y = 2x - 2y, y(O) = 3, j,(0) = 0 ent system arc perpendicular to the level curves f (x, y) =
10. x = y, x(O) = 0, i(O) = l, C off.

y = x, y(O) = 1, j,(0) = 0 16. Illustrate Exercise 15 using the example f(x, y) =


x2 + y2.
11. dx/dt = -y, x(O) = 0,
*17. Suppose to and ti are arbitrary points in the domain of
dy/dt = x, y(O) = 1, a solution x(t) of a gradient system with x(to) =
xo and
dz/dt = z, z(0) = -1 x(t1) = x 1, where XO and x 1 lie on the same smooth level

set off, that is f (xo) = f (x1 ). Show that j'/ 1 Jx(t)i 2 dt = 18. Show that the solution trajectories of a Hamiltonian sys-
O and hence that x(t) must be a constant so~ution. tem are level curves of H (x, y). [Hint: See Exercise 15.]
Let H(x) = H(x, y) be a continuously differentiable 19. Illustrate Exercise 18 using the example H (x, y) =
function of two real variables. The vector field H(x , y) = x2-y2.
(Hy(x, y), -Hx(x, y)) is call~d a H_amiltoni~n ~eld,
and the autonomous vector d1fferenual equation x = 20. Illustrate Exercise 18 using the example H (x, y)
H(x), called a Hamiltonian system. x2+y2.
CHAPTER 13

MATRIX METHODS

This chapter is about some special techniques for solving linear systems of differential
equations, principally those with constant coefficients. The methods all depend on
the notion of eigenvalue and eigenvector, introduced in Section 1. The results are
analogous to those for a single linear constant-coefficient equation, and the methods
developed for them are special cases of the eigenvector analysis in this chapter.
Section 4 deals with equilibrium and stability for linear and nonlinear systems.

SECTION 1 EIGENVALUES AND EIGENVECTORS


1A Exponential Solutions
We've discussed the vector differential equation

$$\frac{d\mathbf{x}}{dt} = A\mathbf{x}$$

in Chapter 12 for a few examples. The examples show that if $A$ is a constant $n$-by-$n$
matrix, then we can expect to find exponential solutions. For example, in the case
$n = 1$, we would have $\dot{x} = ax$, with solutions of the form $x(t) = ce^{at}$. Consequently,
we try solutions

$$\mathbf{x}(t) = e^{\lambda t}\mathbf{u},$$

where $\mathbf{u}$ is a constant vector in $\mathbb{R}^n$. Differentiation of $\mathbf{x}(t)$ gives

$$\frac{d\mathbf{x}}{dt} = \lambda e^{\lambda t}\mathbf{u}.$$

Since the matrix $A$ acts linearly, we also have

$$A\mathbf{x} = e^{\lambda t}A\mathbf{u}.$$

Thus to solve the differential equation, we must have, after division by $e^{\lambda t}$,

$$A\mathbf{u} = \lambda\mathbf{u}.$$

The case $\mathbf{u} = 0$ is too trivial to be interesting, so with that possibility ruled out we
define the nonzero vector $\mathbf{u}$ to be an eigenvector of the matrix $A$, and the number
$\lambda$ to be the corresponding eigenvalue. Going the other way, if $\mathbf{u}$ is an eigenvector
with eigenvalue $\lambda$, then $\mathbf{x}(t) = e^{\lambda t}\mathbf{u}$ is a solution to the differential equation, and so
is an arbitrary scalar multiple $ce^{\lambda t}\mathbf{u}$.

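EXAMPLE 1 To solve the system

$$\begin{pmatrix} dx/dt \\ dy/dt \end{pmatrix} = \begin{pmatrix} x + y \\ 4x + y \end{pmatrix}
= \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix},$$

try to find nonzero vectors $\mathbf{u} = (u, v)$ that satisfy the eigenvector equation

$$\begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix} = \lambda\begin{pmatrix} u \\ v \end{pmatrix}$$

for some number $\lambda$. In other words, try to find numbers $\lambda$ such that the equation

$$\begin{pmatrix} 1 - \lambda & 1 \\ 4 & 1 - \lambda \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

has nonzero solutions. If the 2-by-2 matrix is invertible, then only the solution
$(u, v) = (0, 0)$ exists, so we assume it isn't invertible. Theorem 5.7 of Chapter 2,
Section 5 tells us that a square matrix $A$ is invertible if and only if $\det A \neq 0$. Hence
we require

$$\det\begin{pmatrix} 1 - \lambda & 1 \\ 4 & 1 - \lambda \end{pmatrix} = 0.$$

In other words,

$$(1 - \lambda)^2 - 4 = \lambda^2 - 2\lambda - 3 = (\lambda - 3)(\lambda + 1) = 0.$$

The only solutions are $\lambda = 3$ and $\lambda = -1$.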
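
As a quick check of this computation, the characteristic equation of a 2-by-2 matrix with entries $a, b, c, d$ is $\lambda^2 - (a + d)\lambda + (ad - bc) = 0$, which the quadratic formula solves directly. The Python sketch below is only an illustration; the function name `eigenvalues_2x2` is an assumption, and the sketch handles real eigenvalues only.

```python
import math


def eigenvalues_2x2(a, b, c, d):
    """Real roots of det([[a - lam, b], [c, d - lam]]) = 0, smaller root first.

    (Hypothetical helper, named for illustration.)
    """
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4.0 * det
    if disc < 0:
        raise ValueError("complex eigenvalues are not handled in this sketch")
    root = math.sqrt(disc)
    return ((trace - root) / 2.0, (trace + root) / 2.0)


# Matrix of the system dx/dt = x + y, dy/dt = 4x + y.
lams = eigenvalues_2x2(1.0, 1.0, 4.0, 1.0)
print(lams)

# Each eigenvalue should make the determinant of A - lam*I vanish.
for lam in lams:
    print(lam, (1.0 - lam) * (1.0 - lam) - 4.0)
```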

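Case (a). $\lambda = 3$. We want nonzero vectors $(u, v)$ such that

$$\begin{pmatrix} -2 & 1 \\ 4 & -2 \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$

The two numerical equations

$$\begin{aligned}
-2u + v &= 0\\
4u - 2v &= 0
\end{aligned}$$

are equivalent to $v = 2u$, so we can choose $u = 1$, $v = 2$. Thus since $\lambda = 3$,

$$\mathbf{x}_1(t) = e^{3t}\begin{pmatrix} 1 \\ 2 \end{pmatrix}$$

is a solution and so is a numerical multiple $c_1\mathbf{x}_1(t)$.

Case (b). $\lambda = -1$. We want nonzero vectors $(u, v)$ such that

$$\begin{pmatrix} 2 & 1 \\ 4 & 2 \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$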
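Note that $u = 1$, $v = -2$ will do, so

$$\mathbf{x}_2(t) = e^{-t}\begin{pmatrix} 1 \\ -2 \end{pmatrix}$$

is a solution, and so is a numerical multiple of it. Thus the general
solution of the vector differential equation is

$$\mathbf{x}(t) = c_1 e^{3t}\begin{pmatrix} 1 \\ 2 \end{pmatrix} + c_2 e^{-t}\begin{pmatrix} 1 \\ -2 \end{pmatrix}.$$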

Geometric interpretation. If we take $c_2 = 0$ and make $c_1 > 0$ in the previous
example, the resulting solution $c_1 e^{3t}\begin{pmatrix} 1 \\ 2 \end{pmatrix}$ traces a half-line in the $xy$ plane that as
$t$ increases extends away from the origin in the direction of the eigenvector $\begin{pmatrix} 1 \\ 2 \end{pmatrix}$.
Similarly with $c_1 = 0$ and $c_2 > 0$, we get a half-line traced by $c_2 e^{-t}\begin{pmatrix} 1 \\ -2 \end{pmatrix}$ toward
the origin and parallel to the eigenvector $\begin{pmatrix} 1 \\ -2 \end{pmatrix}$. More generally each solution of
the system traces a curve in the $xy$ plane that results from forming a particular linear
combination of points that correspond to the same $t$-value. We'll now pursue this
observation further. To see the geometric significance of the computation in Example
1, observe that neither of the vectors

$$\mathbf{u} = \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \qquad \mathbf{v} = \begin{pmatrix} 1 \\ -2 \end{pmatrix}$$

is a multiple of the other, as shown in Figure 13.1. Using the parallelogram law, a
vector $\mathbf{x}$ in $\mathbb{R}^2$ is expressible using coordinates $z$ and $w$ relative to these vectors as

$$\mathbf{x} = z\mathbf{u} + w\mathbf{v}.$$

FIGURE 13.1 Eigenvectors u, v.
620 Chapter 13 Matrix Methods

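Figure 13.1 shows a typical $\mathbf{x}$ as a linear combination of $\mathbf{u}$ and $\mathbf{v}$. Thus given a vector $\mathbf{x}$, by the linearity of multiplication by $A$,

$$A\mathbf{x} = Az \begin{pmatrix} 1 \\ 2 \end{pmatrix} + Aw \begin{pmatrix} 1 \\ -2 \end{pmatrix} = zA \begin{pmatrix} 1 \\ 2 \end{pmatrix} + wA \begin{pmatrix} 1 \\ -2 \end{pmatrix}.$$

Since $\begin{pmatrix} 1 \\ 2 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ -2 \end{pmatrix}$ are eigenvectors, with eigenvalues 3 and -1, respectively,

$$A\mathbf{x} = 3z \begin{pmatrix} 1 \\ 2 \end{pmatrix} - w \begin{pmatrix} 1 \\ -2 \end{pmatrix}.$$

In other words, $A$ has the effect of multiplying the first vector $\begin{pmatrix} 1 \\ 2 \end{pmatrix}$ by 3 and the second vector $\begin{pmatrix} 1 \\ -2 \end{pmatrix}$ by -1. Thus expressing vectors as linear combinations of eigenvectors shows that the action of $A$ is given by the diagonal matrix

$$B = \begin{pmatrix} 3 & 0 \\ 0 & -1 \end{pmatrix}.$$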
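One way to confirm that the change to eigenvector coordinates produces the diagonal matrix above is to compute $U^{-1}AU$ directly, where $U$ has the two eigenvectors as columns. The sketch below uses exact rational arithmetic; the helper names `matmul` and `inverse_2x2` are hypothetical, and the matrix entries are taken from Example 1.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def inverse_2x2(M):
    """Inverse of a 2-by-2 matrix with nonzero determinant, using exact fractions."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1, 1], [4, 1]]   # matrix of Example 1
U = [[1, 1], [2, -2]]  # columns are the eigenvectors (1, 2) and (1, -2)

# Matrix of the same linear map expressed in (z, w) coordinates
B = matmul(inverse_2x2(U), matmul(A, U))
print(B)
```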
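It follows that relative to the $(z, w)$ coordinates, the vector differential equation is

$$\frac{d}{dt} \begin{pmatrix} z \\ w \end{pmatrix} = \begin{pmatrix} 3 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} z \\ w \end{pmatrix}.$$

So expressed in $(z, w)$ coordinates, the system is uncoupled, that is,

$$\frac{dz}{dt} = 3z, \qquad \frac{dw}{dt} = -w.$$

These two differential equations are particularly simple to solve, because each one involves only one unknown function. We have

$$z(t) = c_1 e^{3t}, \qquad w(t) = c_2 e^{-t},$$

using $(z, w)$ coordinates. This shows geometrically why we can write the general solution as

$$\mathbf{x}(t) = c_1 e^{3t} \begin{pmatrix} 1 \\ 2 \end{pmatrix} + c_2 e^{-t} \begin{pmatrix} 1 \\ -2 \end{pmatrix}.$$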

1B Eigenvector Matrices
The procedure in the previous example generalizes to arbitrary dimensions. We proceed as follows to solve the $n$-dimensional constant-coefficient equation

$$\frac{d\mathbf{x}}{dt} = A\mathbf{x}:$$

1. Find the eigenvalues of $A$ by finding the roots of the polynomial equation
$\det(A - \lambda I) = 0$.
2. For each eigenvalue $\lambda_k$, find an eigenvector $\mathbf{u}_k$ by solving
$(A - \lambda_k I)\mathbf{u} = 0$.
Theorem 4.3 of Chapter 2 guarantees the existence of an eigenvalue.
3. If the solutions $e^{\lambda_1 t}\mathbf{u}_1, \ldots, e^{\lambda_n t}\mathbf{u}_n$ are linearly independent, so that none is a
linear combination of the others, write the general solution

$$\mathbf{x}(t) = c_1 e^{\lambda_1 t}\mathbf{u}_1 + \cdots + c_n e^{\lambda_n t}\mathbf{u}_n.$$
We'll see shortly that if the matrix $U$ with the $\mathbf{u}_k$ as columns is invertible then we
do get the most general solution. In particular we prove in Chapter 3, Section 7B
that if the eigenvalues of $A$ are all different, then the corresponding solutions will
always be linearly independent. If the solutions $e^{\lambda_k t}\mathbf{u}_k$ are linearly dependent, the
procedure outlined produces solutions but not the most general one. In this case, we
can use the elimination method explained in Chapter 12 or the exponential matrix
method described in the next section.
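If $A$ has some complex numbers for eigenvalues, then the same method still works,
with the complex exponential replacing the real exponential.

EXAMPLE 2  The system

$$\frac{dx}{dt} = x - y, \qquad \frac{dy}{dt} = x + y$$

has matrix

$$A = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}.$$

The eigenvalues are solutions of

$$\det \begin{pmatrix} 1-\lambda & -1 \\ 1 & 1-\lambda \end{pmatrix} = 0,$$

which is the same as

$$(1-\lambda)^2 + 1 = \lambda^2 - 2\lambda + 2 = 0,$$

which has solutions $\lambda_1 = 1 + i$ and $\lambda_2 = 1 - i$.

Case (a). $\lambda = 1 + i$. The eigenvectors are the nonzero solutions of

$$\begin{pmatrix} -i & -1 \\ 1 & -i \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \quad \text{that is, of} \quad -iu - v = 0, \quad u - iv = 0.$$

One solution is

$$\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 1 \\ -i \end{pmatrix}.$$

Case (b). $\lambda = 1 - i$. The eigenvectors are the nonzero solutions of

$$\begin{pmatrix} i & -1 \\ 1 & i \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \quad \text{that is, of} \quad iu - v = 0, \quad u + iv = 0.$$

One solution is

$$\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 1 \\ i \end{pmatrix}.$$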
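The general solution of the differential equation is then

$$\mathbf{x}(t) = c_1 e^{(1+i)t} \begin{pmatrix} 1 \\ -i \end{pmatrix} + c_2 e^{(1-i)t} \begin{pmatrix} 1 \\ i \end{pmatrix}$$

$$= c_1 e^t \begin{pmatrix} \cos t + i \sin t \\ -i \cos t + \sin t \end{pmatrix} + c_2 e^t \begin{pmatrix} \cos t - i \sin t \\ i \cos t + \sin t \end{pmatrix}$$

$$= \begin{pmatrix} (c_1 + c_2) e^t \cos t + i(c_1 - c_2) e^t \sin t \\ -i(c_1 - c_2) e^t \cos t + (c_1 + c_2) e^t \sin t \end{pmatrix}.$$

If we rename the constants so that $c_1 + c_2 = d_1$ and $i(c_1 - c_2) = d_2$, then

$$\mathbf{x}(t) = \begin{pmatrix} d_1 e^t \cos t + d_2 e^t \sin t \\ -d_2 e^t \cos t + d_1 e^t \sin t \end{pmatrix} = d_1 \begin{pmatrix} e^t \cos t \\ e^t \sin t \end{pmatrix} + d_2 \begin{pmatrix} e^t \sin t \\ -e^t \cos t \end{pmatrix}.$$

Because $c_1 = (d_1 - i d_2)/2$ and $c_2 = (d_1 + i d_2)/2$, the constants $c_1$ and $c_2$ can always
be chosen so that $d_1$ and $d_2$ have arbitrary preassigned values; in particular, we can
choose them to make $d_1$ and $d_2$ real numbers.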
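The passage from the complex form of the solution to the real form can be checked numerically. The sketch below evaluates both forms at one value of $t$, with $c_1$ and $c_2$ computed from arbitrarily chosen real constants $d_1$ and $d_2$; the function names and the sample values are assumptions made for illustration only.

```python
import cmath
import math

def complex_form(t, c1, c2):
    """c1*e^{(1+i)t}*(1, -i) + c2*e^{(1-i)t}*(1, i), returned as a pair."""
    e1 = cmath.exp((1 + 1j) * t)
    e2 = cmath.exp((1 - 1j) * t)
    return (c1 * e1 + c2 * e2, -1j * c1 * e1 + 1j * c2 * e2)

def real_form(t, d1, d2):
    """d1*(e^t cos t, e^t sin t) + d2*(e^t sin t, -e^t cos t)."""
    et = math.exp(t)
    return (d1 * et * math.cos(t) + d2 * et * math.sin(t),
            -d2 * et * math.cos(t) + d1 * et * math.sin(t))

d1, d2 = 2.0, -3.0                      # sample real constants
c1, c2 = (d1 - 1j * d2) / 2, (d1 + 1j * d2) / 2
t = 0.7                                 # sample time

xc = complex_form(t, c1, c2)
xr = real_form(t, d1, d2)
# The two forms should agree, and the complex form should have zero imaginary part
print(max(abs(xc[0] - xr[0]), abs(xc[1] - xr[1])))
```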

As usual, we may want to choose arbitrary constants in general solutions such
as those preceding so as to satisfy prescribed initial conditions. If we assume that
the eigenvectors $\mathbf{u}_1, \ldots, \mathbf{u}_n$ of the $n$-by-$n$ matrix $A$ are linearly independent, so that
none of them is a linear combination of the others, this calculation reduces to a
routine as follows. We denote by $U$ the matrix with column vectors $\mathbf{u}_1, \ldots, \mathbf{u}_n$ in
some fixed order. The matrix $U = (\mathbf{u}_1, \ldots, \mathbf{u}_n)$ is called the eigenvector matrix of
the system. We denote by $\Lambda_t$ the diagonal matrix

$$\Lambda_t = \begin{pmatrix} e^{\lambda_1 t} & & 0 \\ & \ddots & \\ 0 & & e^{\lambda_n t} \end{pmatrix}$$

with corresponding eigenvalues $\lambda_k$ in the same order as the eigenvectors. If $\mathbf{c}$ is an
arbitrary constant column vector with entries $c_1, \ldots, c_n$, we can form the vector-valued function

$$\mathbf{x}(t) = U \Lambda_t \mathbf{c}.$$

The vector $\mathbf{x}(t)$ has the form

$$\mathbf{x}(t) = c_1 e^{\lambda_1 t}\mathbf{u}_1 + \cdots + c_n e^{\lambda_n t}\mathbf{u}_n$$

and so is a solution of $d\mathbf{x}/dt = A\mathbf{x}$, though not the most general solution unless the
vectors $\mathbf{u}_k$ are linearly independent. But if the $\mathbf{u}_k$ are independent, thus making $U$
invertible, we can let $\mathbf{c} = U^{-1}\mathbf{x}_0$ for some $\mathbf{x}_0$ in $\mathbb{R}^n$. Hence

$$\mathbf{x}(t) = U \Lambda_t U^{-1} \mathbf{x}_0.$$


Since $\Lambda_0 = I$, we have $\mathbf{x}(0) = U U^{-1} \mathbf{x}_0 = \mathbf{x}_0$. Thus $\mathbf{x}(t)$ is the
solution of the vector differential equation that satisfies $\mathbf{x}(0) = \mathbf{x}_0$. To satisfy an
initial condition $\mathbf{x}(t_0) = \mathbf{x}_0$ at $t = t_0$, we let $\mathbf{c}$ in $\mathbf{x}(t) = U \Lambda_t \mathbf{c}$ equal

$$\mathbf{c} = \Lambda_{-t_0} U^{-1} \mathbf{x}_0.$$

Since $\Lambda_t$ is a diagonal matrix with entries $e^{\lambda_k t}$, $\Lambda_t \Lambda_{-t_0} = \Lambda_{t - t_0}$ and

$$\mathbf{x}(t) = U \Lambda_{t - t_0} U^{-1} \mathbf{x}_0.$$

This formula gives all solutions, because every solution has some value at $t_0$, and
the formula assigns the arbitrary value $\mathbf{x}_0$ there. By Theorem 1.1 in Chapter 12,
Section 1D, the solution is uniquely determined by the initial condition, so there is
a one-to-one correspondence between solutions and initial conditions.
If the eigenvector matrix $U$ isn't invertible we don't get all possible solutions
this way, but we'll see in the next section that the role of $U \Lambda_t U^{-1}$ is assumed
by a routinely computed exponential matrix that provides all solutions in the form
$\mathbf{x}(t) = e^{tA}\mathbf{x}_0$.

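EXAMPLE 3  In Example 1, we found for

$$A = \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix}$$

the eigenvalues $\lambda_1 = 3$, $\lambda_2 = -1$ with eigenvectors

$$\mathbf{u}_1 = \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \qquad \mathbf{u}_2 = \begin{pmatrix} 1 \\ -2 \end{pmatrix}.$$

We have

$$\Lambda_t = \begin{pmatrix} e^{3t} & 0 \\ 0 & e^{-t} \end{pmatrix}, \qquad U = \begin{pmatrix} 1 & 1 \\ 2 & -2 \end{pmatrix}, \qquad U^{-1} = \begin{pmatrix} \tfrac12 & \tfrac14 \\ \tfrac12 & -\tfrac14 \end{pmatrix}.$$

Thus if $\mathbf{x}_0 = (2, 3)$, the solution

$$\mathbf{x}(t) = \begin{pmatrix} 1 & 1 \\ 2 & -2 \end{pmatrix} \begin{pmatrix} e^{3t} & 0 \\ 0 & e^{-t} \end{pmatrix} \begin{pmatrix} \tfrac12 & \tfrac14 \\ \tfrac12 & -\tfrac14 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = \begin{pmatrix} \tfrac74 e^{3t} + \tfrac14 e^{-t} \\ \tfrac72 e^{3t} - \tfrac12 e^{-t} \end{pmatrix}$$

satisfies the differential equation $d\mathbf{x}/dt = A\mathbf{x}$ and the initial condition $\mathbf{x}(0) = (2, 3)$.

To satisfy instead the initial condition $\mathbf{x}(1) = (3, 4)$, we form

$$\mathbf{x}(t) = \begin{pmatrix} 1 & 1 \\ 2 & -2 \end{pmatrix} \begin{pmatrix} e^{3(t-1)} & 0 \\ 0 & e^{-(t-1)} \end{pmatrix} \begin{pmatrix} \tfrac12 & \tfrac14 \\ \tfrac12 & -\tfrac14 \end{pmatrix} \begin{pmatrix} 3 \\ 4 \end{pmatrix} = \begin{pmatrix} \tfrac52 e^{3(t-1)} + \tfrac12 e^{-(t-1)} \\ 5 e^{3(t-1)} - e^{-(t-1)} \end{pmatrix}.$$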
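The formula $\mathbf{x}(t) = U \Lambda_{t-t_0} U^{-1} \mathbf{x}_0$ translates directly into a short routine. The following is a minimal sketch for the 2-by-2 case with real eigenvalues only; the function name `solve_ivp_eigen` and its argument layout are hypothetical. It is applied to the two initial conditions of the example above.

```python
import math

def solve_ivp_eigen(U, lams, x0, t, t0=0.0):
    """Evaluate x(t) = U * Lambda_{t - t0} * U^{-1} * x0 for a 2-by-2 system.

    U is the eigenvector matrix (assumed invertible), lams the two real
    eigenvalues in the same order as the columns of U.
    """
    (a, b), (c, d) = U
    det = a * d - b * c
    # Step 1: coefficient vector U^{-1} x0
    c1 = (d * x0[0] - b * x0[1]) / det
    c2 = (-c * x0[0] + a * x0[1]) / det
    # Step 2: multiply by the diagonal matrix Lambda_{t - t0}
    w1 = math.exp(lams[0] * (t - t0)) * c1
    w2 = math.exp(lams[1] * (t - t0)) * c2
    # Step 3: multiply by U
    return (a * w1 + b * w2, c * w1 + d * w2)

U = [[1, 1], [2, -2]]
lams = (3, -1)

print(solve_ivp_eigen(U, lams, (2, 3), 0.0))          # initial condition x(0) = (2, 3)
print(solve_ivp_eigen(U, lams, (3, 4), 1.0, t0=1.0))  # initial condition x(1) = (3, 4)
print(solve_ivp_eigen(U, lams, (2, 3), 0.5))
```

Evaluating at $t = t_0$ should return the initial vector itself, which gives a quick consistency check before the routine is used at other times.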

EXERCISES

Find the eigenvalues and eigenvectors of the matrices in of the system in the form x(t) = qe.1,. 1' u 1 +c2e.1,. 2'u2. In
Exercises I to 6. the case of complex eigenvalues, convert the solution to
real form. Then use the initial conditions to determine
CJ and c2.

7. ( ~ ) - ( =! ; ) (; ).( ;;g; )- ( ~ )
dx
8. -
dt
= 3x, X (0) = l,
dy
dt -- 2_y, _11 (0) =0
dx
9. - =x +4v, x(0)
dt -
=l
The 2-dimensional systems of differential equations in dy
....:.. =5v, _v(0)= l
Exercises 7 to 10 can all be written dx = Ax in which dt -
dt
the matrix A has constant entries. In each example,
dx )
( :;cit =( ~
find the eigenvalues of A, and for each eigenvalue find
a corresponding eigenvector. Use the eigenvalues and 10.
-1 ) (
2
X )
y '
( X (Q)
y(O)
) =( l )
0
eigenvectors of the system to write the general solution
Section 2A Matrix Exponentials 625
11. (a) Find the general homogeneous solution of the 15. The second-order, constant-coefficient, differential equa-
system tion

d2 y dy
-+a-+by =0
dt 2 dt

is equivalent to the first-order system


(b) Find a particular solution of the system in part (a).
(c) Find the particular solution x of the system in part dx
(a) that satisfies the condition x(O) = (1, 1, 2). - =-ax-by
dt
12. (a) Find the general solution of the system dy
-=x.
dx dt
dt =y
Show that the eigenvalues of the matrix
dy
-=-x
dt
dz
dt = -z
by a method of your choice. are the same as the characteristic roots of the second-order
(b) What are the eigenvalues of the following matrix? differential equation.
16. Prove that a square matrix U is invertible if and only if its
0 1 0) columns are linearly independent. [Hint: Row operations
( -10 00 -10 reduce U to / .]
17. Let U be an invertible n-by-n matrix with columns
u 1, .•. , u11 • Let D be the diagonal matrix with diago-
13. (a) Find the general solution of the system nal entries AJ, . .. , An, and define the n-by-n matrix A by
A=Unu- 1•
dx
- =-x+y (a) Show that Auk = AkUk, fork= I, . . . , n.
dt
(b) Find the 2-by-2 matrix that has eigenvalues u 1 =
dy (2, 3), u2 = (I, I), and corresponding eigenvectors
dt = -y. >..1 = 3, >..2 = l.
(c) Show that the system dx/dt = Ax has solutions
(b) What are the eigenvectors of the following matrix?
eAk 1uk, fork= 1, ... , n.
(d) Find a 2-dimensional system having
-1 I )
( 0 -1

14. Show that if the eigenvectors of the n-by-n matrix A span


IR" and the eigenvalues of A have negative real parts, then
all solutions of dx/dt = Ax tend to zero as t tends 1.0 +oo. as its general solution.

SECTION 2 MATRIX EXPONENTIALS


2A Definition
To round out the theory of linear systems having constant coefficients, we use the
idea of the exponential of a square matrix $A$, which we define using the familiar
power series for the exponential function, namely

$$e^x = 1 + x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \cdots.$$

If $A$ has dimensions $n$-by-$n$ and $I$ is the $n$-by-$n$ identity matrix, we consider the
finite sum

$$I + A + \frac{1}{2!}A^2 + \cdots + \frac{1}{k!}A^k = \sum_{j=0}^{k} \frac{1}{j!}A^j.$$
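This sum of $n$-by-$n$ matrices is also an $n$-by-$n$ matrix. We define the exponential of
$A$ by

$$e^A = \lim_{k \to \infty} \sum_{j=0}^{k} \frac{1}{j!}A^j = \sum_{j=0}^{\infty} \frac{1}{j!}A^j,$$

where the existence of the matrix limit is understood to mean that the limit exists in
each of the $n^2$ entries in the matrix. It's sometimes convenient to use the notation
$\exp A$ for $e^A$. For example, if

$$A = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}, \quad A^2 = \begin{pmatrix} 2^2 & 0 \\ 0 & 3^2 \end{pmatrix}, \quad \ldots, \quad A^j = \begin{pmatrix} 2^j & 0 \\ 0 & 3^j \end{pmatrix},$$

then

$$\exp \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix} = \lim_{k \to \infty} \sum_{j=0}^{k} \frac{1}{j!} \begin{pmatrix} 2^j & 0 \\ 0 & 3^j \end{pmatrix} = \begin{pmatrix} \sum_{j=0}^{\infty} \frac{2^j}{j!} & 0 \\ 0 & \sum_{j=0}^{\infty} \frac{3^j}{j!} \end{pmatrix} = \begin{pmatrix} e^2 & 0 \\ 0 & e^3 \end{pmatrix}.$$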
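The definition by partial sums can be tried out directly. The sketch below accumulates the terms $A^j/j!$ for the diagonal matrix of the example and compares the diagonal entries with $e^2$ and $e^3$; the function name `exp_partial_sum` and the choice of 40 terms are assumptions made for illustration.

```python
import math

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def exp_partial_sum(A, terms):
    """Sum of A^j / j! for j = 0, ..., terms - 1."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]  # j = 0 term: identity
    term = [row[:] for row in total]
    for j in range(1, terms):
        # Update A^(j-1)/(j-1)! to A^j/j! with one multiplication and one division
        term = [[x / j for x in row] for row in matmul(term, A)]
        total = [[total[r][c] + term[r][c] for c in range(n)] for r in range(n)]
    return total

S = exp_partial_sum([[2, 0], [0, 3]], 40)
print(S[0][0] - math.exp(2), S[1][1] - math.exp(3))
```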
It's remarkable that the exponential of a square matrix always exists and has many
of the properties of the ordinary real or complex exponential function. In the matrix
exponential's most useful form, $A$ is multiplied by a scalar $t$ that we can pull out of
the powers $(tA)^j$ to give $t^j A^j$. The most important properties of $e^{tA}$ are as follows.

2.1 Theorem. If $A$ is an $n$-by-$n$ real or complex matrix, the matrix series

$$\sum_{j=0}^{\infty} \frac{t^j}{j!} A^j$$

converges to an $n$-by-$n$ matrix $e^{tA}$ satisfying
(a) $e^{(t+s)A} = e^{tA}e^{sA} = e^{sA}e^{tA}$ for scalars $t$ and $s$
(b) $e^{tA}$ is invertible, and $e^{-tA}e^{tA} = e^{tA}e^{-tA} = I$
(c) $\dfrac{d}{dt}e^{tA} = Ae^{tA} = e^{tA}A$
Proof. If $A = (a_{ij})$, choose a positive number $b$ such that $|a_{ij}| \le b$ for $i, j = 1, \ldots, n$. Since the entries in $A^2$ are of the form

$$a_{i1}a_{1j} + \cdots + a_{in}a_{nj},$$

it follows that they are all at most $nb^2$ in absolute value. Proceeding inductively, the
entries in $A^k$ are at most $n^{k-1}b^k$ in absolute value. It follows that each entry in $e^A$
is defined by an absolutely convergent infinite series dominated by the convergent
series

$$1 + b + \frac{nb^2}{2!} + \cdots + \frac{n^{j-1}b^j}{j!} + \cdots,$$

as discussed in Chapter 14, Section 3D. Hence all the entries exist and $e^A$ is defined.
These estimates show that if the entries $a_{ij}$ in $A$ are replaced by the entries $t a_{ij}$ in
$tA$, then the convergence is uniform on every bounded interval $c \le t \le d$. (See
Chapter 14, Section 4.)
To prove property (a), we apply the binomial theorem to $(t + s)^j$ to get

$$e^{(t+s)A} = \sum_{j=0}^{\infty} \frac{(t+s)^j}{j!} A^j = \sum_{j=0}^{\infty} \frac{1}{j!} \left[ \sum_{l=0}^{j} \binom{j}{l} t^l s^{j-l} A^j \right].$$

Since $\binom{j}{l} = j!/(l!\,(j-l)!)$, we can cancel $j!$ to get

$$e^{(t+s)A} = \sum_{j=0}^{\infty} \left[ \sum_{l=0}^{j} \frac{t^l A^l}{l!} \cdot \frac{s^{j-l} A^{j-l}}{(j-l)!} \right].$$

This last sum is just the product of the two absolutely convergent series that represent
$e^{tA}$ and $e^{sA}$ respectively. Since $s$ and $t$ commute in the original series, we also get
the product in the other order.
Property (b) follows from (a) on taking $t = 1$ and $s = -1$. Since $e^{0A} = I$, property
(a) implies $e^{A}e^{-A} = e^{-A}e^{A} = I$, and replacing $A$ by $tA$ gives the identity as stated.
Formally we can compute the derivative of $e^{tA}$ from the definition by

$$\frac{d}{dt}e^{tA} = \frac{d}{dt} \sum_{j=0}^{\infty} \frac{t^j A^j}{j!} = \sum_{j=1}^{\infty} \frac{j\,t^{j-1}A^j}{j!} = A \sum_{j=1}^{\infty} \frac{t^{j-1}A^{j-1}}{(j-1)!} = Ae^{tA}.$$

Note that the factor $A$ could be taken out on the right just as well as on the left.
This computation using term-by-term differentiation of series is justified because the
differentiated series in each entry is uniformly convergent by the estimates made in
the first part of the proof. See Chapter 14, Section 4. ∎
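The three properties in Theorem 2.1 can be tested numerically on a sample matrix. The sketch below approximates $e^{tA}$ by a partial sum of the defining series and measures how far properties (a), (b), and (c) are from holding; the helper names, the 60-term cutoff, the sample values of $t$ and $s$, and the central-difference step are all assumptions chosen for illustration, and the check is evidence rather than a proof.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def expm_series(A, t, terms=60):
    """Partial sum of sum_j (t^j / j!) A^j with the given number of terms."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for j in range(1, terms):
        term = [[x * t / j for x in row] for row in matmul(term, A)]
        total = [[total[r][c] + term[r][c] for c in range(n)] for r in range(n)]
    return total

def max_abs_diff(X, Y):
    """Largest entrywise difference between two matrices."""
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

A = [[1.0, 1.0], [4.0, 1.0]]
I2 = [[1.0, 0.0], [0.0, 1.0]]
t, s, h = 0.3, -0.5, 1e-5

# (a) e^{(t+s)A} versus e^{tA} e^{sA}
err_a = max_abs_diff(expm_series(A, t + s), matmul(expm_series(A, t), expm_series(A, s)))
# (b) e^{tA} e^{-tA} versus I
err_b = max_abs_diff(matmul(expm_series(A, t), expm_series(A, -t)), I2)
# (c) central-difference approximation of d/dt e^{tA} versus A e^{tA}
plus, minus = expm_series(A, t + h), expm_series(A, t - h)
deriv = [[(plus[r][c] - minus[r][c]) / (2 * h) for c in range(2)] for r in range(2)]
err_c = max_abs_diff(deriv, matmul(A, expm_series(A, t)))
print(err_a, err_b, err_c)
```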

2B Solving Systems
The simplest justification for introducing $e^{tA}$ is to show that

$$\mathbf{x}(t) = e^{tA}\mathbf{x}_0$$

defines the solution of

$$\frac{d\mathbf{x}}{dt} = A\mathbf{x}, \qquad \mathbf{x}(0) = \mathbf{x}_0.$$

First, to differentiate the vector $e^{tA}\mathbf{x}_0$, we need only differentiate each entry in the
matrix $e^{tA}$, because $\mathbf{x}_0$ has constant entries. Hence

$$\frac{d}{dt}e^{tA}\mathbf{x}_0 = Ae^{tA}\mathbf{x}_0$$

by part (c) of the previous theorem. Thus the differential equation is satisfied. Second,
to show that the initial condition is satisfied, note that

$$\mathbf{x}(0) = I\mathbf{x}_0 = \mathbf{x}_0.$$

More generally,

$$\mathbf{x}(t) = e^{(t - t_0)A}\mathbf{x}_0$$

satisfies the initial condition $\mathbf{x}(t_0) = \mathbf{x}_0$, since $e^{0A} = I$. It's one of the nice features
of $e^{tA}$ that it always exists and that the equation $\mathbf{x}(t) = e^{tA}\mathbf{c}$ provides all solutions
of $d\mathbf{x}/dt = A\mathbf{x}$ as the constant vector $\mathbf{c}$ ranges over all constant vectors with the same
dimension as $\mathbf{x}$; this will follow from Theorem 3.2 in Section 3A.

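EXAMPLE 1  We can write the system

$$\frac{dx}{dt} = x + y, \qquad \frac{dy}{dt} = y$$

in matrix form $d\mathbf{x}/dt = A\mathbf{x}$ as

$$\begin{pmatrix} dx/dt \\ dy/dt \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}.$$

We compute

$$A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \quad A^2 = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}, \quad \ldots, \quad A^k = \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix}, \quad \ldots.$$

Then

$$e^{tA} = \sum_{k=0}^{\infty} \frac{t^k}{k!} \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} \sum_{k=0}^{\infty} \frac{t^k}{k!} & \sum_{k=1}^{\infty} \frac{t^k}{(k-1)!} \\ 0 & \sum_{k=0}^{\infty} \frac{t^k}{k!} \end{pmatrix} = \begin{pmatrix} e^t & te^t \\ 0 & e^t \end{pmatrix}.$$

Hence the solution with initial conditions $x(0) = x_0$, $y(0) = y_0$ is

$$\begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = \begin{pmatrix} e^t & te^t \\ 0 & e^t \end{pmatrix} \begin{pmatrix} x_0 \\ y_0 \end{pmatrix} = \begin{pmatrix} x_0 e^t + y_0 t e^t \\ y_0 e^t \end{pmatrix}.$$

To find the solution with initial conditions $x(t_0) = x_0$, $y(t_0) = y_0$, just replace
$t$ by $(t - t_0)$ everywhere in the vector solution for $t_0 = 0$; this is simpler than
recomputing undetermined coefficients and it shows one of the many advantages of
using the exponential matrix $e^{tA}$.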
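The remark about replacing $t$ by $t - t_0$ can be illustrated with the closed form of $e^{tA}$ from this example. The sketch below hard-codes that closed form, evaluates the shifted solution, and checks the differential equations $dx/dt = x + y$, $dy/dt = y$ by a central difference; the function names, the sample initial data, and the step size are assumptions made for illustration.

```python
import math

def exp_tA(t):
    """Closed form of e^{tA} for A = [[1, 1], [0, 1]] from the example."""
    et = math.exp(t)
    return [[et, t * et], [0.0, et]]

def solution(t, x0, y0, t0=0.0):
    """Solution with x(t0) = x0, y(t0) = y0, obtained by replacing t with t - t0."""
    E = exp_tA(t - t0)
    return (E[0][0] * x0 + E[0][1] * y0, E[1][0] * x0 + E[1][1] * y0)

x0, y0, t0 = 5.0, -1.0, 2.0
print(solution(t0, x0, y0, t0))  # value at t = t0 should reproduce (x0, y0)

# Check dx/dt = x + y and dy/dt = y at a sample time with a central difference
t, h = 2.7, 1e-6
xp, yp = solution(t + h, x0, y0, t0)
xm, ym = solution(t - h, x0, y0, t0)
x, y = solution(t, x0, y0, t0)
residual = max(abs((xp - xm) / (2 * h) - (x + y)), abs((yp - ym) / (2 * h) - y))
print(residual)
```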

2C Relationship to Eigenvectors
The connection between exponential solutions and the eigenvector method of the
previous section is as follows. If the eigenvectors of the square matrix $A$ are linearly
independent, we form the eigenvector matrix

$$U = (\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n)$$

with these vectors as columns. We also form the diagonal matrix

$$\Lambda_t = \begin{pmatrix} e^{\lambda_1 t} & & 0 \\ & \ddots & \\ 0 & & e^{\lambda_n t} \end{pmatrix},$$

where $\lambda_k$ is the eigenvalue of $\mathbf{u}_k$. Then we have seen that

$$\mathbf{x}(t) = U \Lambda_t U^{-1} \mathbf{x}_0$$

solves the initial value problem for the equation $d\mathbf{x}/dt = A\mathbf{x}$. Since

$$\mathbf{x}(t) = e^{tA}\mathbf{x}_0$$

solves the same problem, we are faced with the question of whether the two solutions
are the same. By Theorem 1.1 in Chapter 12, Section 1D, there is only one solution
satisfying $\mathbf{x}(0) = \mathbf{x}_0$. (Exercise 18 in Section 3 indicates a direct proof.) Hence
$e^{tA}\mathbf{x}_0 = U \Lambda_t U^{-1}\mathbf{x}_0$ for all $t$. Since $\mathbf{x}_0$ is arbitrary, it follows that

2.2  $$e^{tA} = U \Lambda_t U^{-1}.$$
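EXAMPLE 2  In Example 1 of the previous section, we solved the system

$$\begin{pmatrix} dx/dt \\ dy/dt \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$

by finding the eigenvalues $\lambda_1 = 3$, $\lambda_2 = -1$ and corresponding eigenvectors $(1, 2)$,
$(1, -2)$ of the 2-by-2 matrix $A$ of the system. Thus

$$U = \begin{pmatrix} 1 & 1 \\ 2 & -2 \end{pmatrix}, \qquad \Lambda_t = \begin{pmatrix} e^{3t} & 0 \\ 0 & e^{-t} \end{pmatrix}, \qquad U^{-1} = \begin{pmatrix} \tfrac12 & \tfrac14 \\ \tfrac12 & -\tfrac14 \end{pmatrix}$$

and

$$e^{tA} = U \Lambda_t U^{-1} = \begin{pmatrix} \tfrac12 e^{3t} + \tfrac12 e^{-t} & \tfrac14 e^{3t} - \tfrac14 e^{-t} \\ e^{3t} - e^{-t} & \tfrac12 e^{3t} + \tfrac12 e^{-t} \end{pmatrix}.$$

As a check on the computation, notice that $e^{tA} = I$ when $t = 0$. This example shows
that if the eigenvectors of $A$ are linearly independent, it may well be easier to use
them to compute $e^{tA}$ than to use the matrix power series definition.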
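Equation 2.2 can be compared against the power series definition for the matrix of this example. The following sketch computes $U \Lambda_t U^{-1}$ from the eigenvalues 3 and -1 and the eigenvectors (1, 2) and (1, -2), then compares it entry by entry with a 60-term partial sum of the series; the function names and the sample value of $t$ are hypothetical choices for illustration.

```python
import math

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def expm_series(A, t, terms=60):
    """Partial sum of the defining series sum_j (t^j / j!) A^j."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for j in range(1, terms):
        term = [[x * t / j for x in row] for row in matmul(term, A)]
        total = [[total[r][c] + term[r][c] for c in range(n)] for r in range(n)]
    return total

def exp_via_eigen(t):
    """U * Lambda_t * U^{-1} for A = [[1, 1], [4, 1]] (Equation 2.2)."""
    U = [[1.0, 1.0], [2.0, -2.0]]
    Lam = [[math.exp(3 * t), 0.0], [0.0, math.exp(-t)]]
    U_inv = [[0.5, 0.25], [0.5, -0.25]]
    return matmul(matmul(U, Lam), U_inv)

A = [[1.0, 1.0], [4.0, 1.0]]
t = 0.8
E_eigen = exp_via_eigen(t)
E_series = expm_series(A, t)
gap = max(abs(E_eigen[r][c] - E_series[r][c]) for r in range(2) for c in range(2))
print(gap)
```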

Equation 2.2 is ineffective for computing $e^{tA}$ if the eigenvector matrix $U$ fails
to have an inverse. For example if $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ there is only a single repeated
eigenvalue $\lambda = 1$, and all corresponding eigenvectors have the form $\begin{pmatrix} u \\ 0 \end{pmatrix}$ with
$u \neq 0$. Hence the only possibility for $U$ is a matrix of the form $\begin{pmatrix} u_1 & u_2 \\ 0 & 0 \end{pmatrix}$, which
is never invertible. Section 2D provides an alternative method for computing $e^{tA}$
that's often more efficient.

EXERCISES

In Exercises 1 to 6, find the exponential e1 A of the


matrix A by first computing the successive tenns I, t A,
t 2 A 2 /2 1. . . . m
• the senes
. de ti muon.
..

l.h(-~ :) 2.A=(; ~) (d) Find the solution of the initial-value problem

3. A= ( : 1)
1
(i 0)
4. A=
0 -,
.
(ID-(~ )C)

5. A = u:~ ) 6. A = (

7. In Example l of the text, we showed that


~ :)
( ~~~~ ) =( -~ ) .

In Exercise 8 to I 0, find the square matrix e' A in terms


of the general identity matrix I.
exp t ( ~ !)=( ~ 1 1
:, ) •
8. A= I 9. A= 21 10. A=-/
Verify directly for this example that

(a) exp (, ( ~ ) ) and exp ( - t ( ~ ) ) are


In Exercise 11 to 19, compute e'A for each matrix A
using Equation 2.2. Then find the inverse matrix e-tA,
inverse to one another. and check your original computation by showing that the
derivative of e1A at t = 0 is equal to A.
(b) exp (, ( ~ 1
) ) exp (s ( ~ ))
=exp (u + s) ( ~ ! ) ). 11. A.= -3 2 )
( -4 3
12. A= ( I
0 5
4 )

13. A-(: -: ) 14. h u~) Define cos t A to be the real part of the series and sin t A
to be the imaginary pan, so that

eitA =costA+isintA.

IS. A-(: :) 16. A-c~ -~ =n Show that the matrices cost A and sin t A satisfy
(a) cos(-tA)
d
dt
=
cost A , sin(-tA) = -sintA
. d .
(b) -costA=-AsmtA, -smtA=Acos tA
dt
4 4 [Hint: Express cost A and sintA in terms of ei 1A.]
17. A= ( O I )
-6 5
18. A= ( -
-6 6
) (c) (costA) 2 + (sintA) 2 =
J, where I is the n-by-n
identity matrix

- 1 22. Let A be the 2-by-2 matrix

19. A= 0
(
0 -½
20. (a) Use the method of elimination to find the general Define cost A and sin t A as in Exercise 21, and verify the
solution of the system formulas given in (a), (b), and (c).
23. Show that if A is an n-by-n matrix, then a system of
the form

has solutions of the form


(b) Use the result of part (a) to compute the matrix e1A
where x(t) = (costA)c1 + (sinrA)c2,
A= ( 9 -4)
4 1 . where CJ and c2 are constant 11-dimensional vectors, and
cost A and sin t A are the n-by-11 matrices defined in
Exercise 21. Is this always the most general solution?
[Hint: Find solutions such that x 1(0) = e 1 and
x2(0) == e2.] 24. Let A be the n-by-n matrix with all entries equal to 1.
(c) Show that the eigenvectors of A are linearly depen- (a) Show that A2 = nA, and more generally that Ak =
dent. nk-l A foe integer k 2: I.
I
21. Let A be an n-by-n matrix with real entries. The matrix (b) Show that e1
A =I+ -(enc - l)A.
eitA is defined by n
(c) Find the four entries in e1A when 11 = 2.
25. It turns out that if AB = BA, then eAeB = eA+B_
Find noncommuting 2-by-2 matrices for which this last
equation fails.

2D Computing e'A in Practice


The method we describe here is usually simpler than appealing directly to the definition of $e^{tA}$ or first finding an eigenvector matrix as described in the previous
Section 2C. Furthermore, the method doesn't depend on knowing the eigenvectors
of $A$, and it works whether $A$ has linearly independent eigenvectors or not. Our aim is to
show that an $n$-by-$n$ exponential matrix always reduces to a polynomial in $A$:

2.3  $$e^{tA} = \sum_{j=0}^{n-1} b_j(t) A^j,$$

where the coefficient functions $b_k(t)$ contain the eigenvalues $\lambda_1, \ldots, \lambda_n$ of $A$ explicitly. The next theorem shows how to compute the coefficients by solving a system
of linear equations. The complete proof of the theorem is complicated, so we'll
just sketch it. There is a complete proof in Chapter 7, Section 2 of Introduction to
Differential Equations, 2nd ed., by Richard Williamson, McGraw-Hill (2001).
2.4 Theorem. The coefficient functions $b_k(t)$ in the matrix Equation 2.3 satisfy
the linear scalar equations

(i)  $$e^{t\lambda_k} = \sum_{j=0}^{n-1} b_j(t) \lambda_k^j, \qquad k = 1, \ldots, n.$$

If some $m$ of the eigenvalues $\lambda_k$ are equal, say $\lambda_1 = \cdots = \lambda_m$, then the following
$m - 1$ additional relations hold:

(ii)  $$\frac{d^k}{d\lambda^k} e^{t\lambda} = \frac{d^k}{d\lambda^k} \sum_{j=0}^{n-1} b_j(t) \lambda^j \quad \text{at } \lambda = \lambda_1, \text{ for } k = 1, \ldots, m-1.$$

Sketch of Proof. We'll assume Equation 2.3 holds for some choice of the coefficients $b_j(t)$. Let $\mathbf{v}_k$ be an eigenvector of $A$ corresponding to eigenvalue $\lambda_k$: $A\mathbf{v}_k = \lambda_k \mathbf{v}_k$, $\mathbf{v}_k \neq 0$. Apply the matrix sum on the right side of Equation 2.3 to $\mathbf{v}_k$, noting
that $A^j \mathbf{v}_k = \lambda_k^j \mathbf{v}_k$:

$$e^{tA}\mathbf{v}_k = \sum_{j=0}^{n-1} b_j(t) A^j \mathbf{v}_k = \left( \sum_{j=0}^{n-1} b_j(t) \lambda_k^j \right) \mathbf{v}_k.$$

Similarly, apply the matrix $e^{tA}$ to $\mathbf{v}_k$ to get another expression for the same thing:

$$e^{tA}\mathbf{v}_k = \lim_{N \to \infty} \sum_{j=0}^{N} \frac{t^j}{j!} A^j \mathbf{v}_k = \sum_{j=0}^{\infty} \frac{t^j}{j!} \lambda_k^j \mathbf{v}_k = e^{t\lambda_k} \mathbf{v}_k.$$

Since $\mathbf{v}_k$ is an eigenvector, it isn't zero, so the coefficients of $\mathbf{v}_k$ at the ends of the
previous two displayed lines must be equal, that is, Equations (i) hold.
If there are multiple eigenvalues we alter the entries of $A$ slightly to produce
matrices $A_h$ with distinct eigenvalues whose limit as $h \to 0$ is $A$. Equations (ii)
follow by calculating an appropriate limit of a difference quotient as $h \to 0$. ∎

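EXAMPLE 3  We saw in Example 1 of Section 1 that the matrix $A = \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix}$ had eigenvalues
$\lambda_1 = -1$, $\lambda_2 = 3$. Equations (i) of Theorem 2.4 are then

$$e^{-t} = b_0(t) - b_1(t), \qquad e^{3t} = b_0(t) + 3b_1(t).$$

Solve for $b_0(t)$ and $b_1(t)$ to get $b_1(t) = -\tfrac14 e^{-t} + \tfrac14 e^{3t}$, $b_0(t) = \tfrac34 e^{-t} + \tfrac14 e^{3t}$. Plugging
these coefficient functions into Equation 2.3 gives

$$e^{tA} = \left( \tfrac34 e^{-t} + \tfrac14 e^{3t} \right) \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + \left( -\tfrac14 e^{-t} + \tfrac14 e^{3t} \right) \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix} = \begin{pmatrix} \tfrac12 e^{-t} + \tfrac12 e^{3t} & -\tfrac14 e^{-t} + \tfrac14 e^{3t} \\ -e^{-t} + e^{3t} & \tfrac12 e^{-t} + \tfrac12 e^{3t} \end{pmatrix}.$$

As a partial check on the accuracy of our computation we can verify that our expression for $e^{tA}$ equals the identity matrix $I$ when $t = 0$.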
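For a 2-by-2 matrix with two distinct eigenvalues, Equations (i) of Theorem 2.4 form a pair of linear equations that can be solved in closed form: subtracting them gives $b_1$, and back-substitution gives $b_0$. The sketch below carries this out with complex arithmetic, so that it applies both to the real eigenvalues of the example above and to a complex-conjugate pair; the function names are hypothetical, and the repeated-eigenvalue case is deliberately not handled.

```python
import cmath
import math

def coefficients_2x2(lam1, lam2, t):
    """Solve e^{t*lam_k} = b0 + b1*lam_k (k = 1, 2), assuming lam1 != lam2."""
    e1, e2 = cmath.exp(t * lam1), cmath.exp(t * lam2)
    b1 = (e2 - e1) / (lam2 - lam1)   # difference of the two equations
    b0 = e1 - b1 * lam1              # back-substitution into the first equation
    return b0, b1

def exp_tA_2x2(A, lam1, lam2, t):
    """e^{tA} = b0*I + b1*A (Equation 2.3 with n = 2)."""
    b0, b1 = coefficients_2x2(lam1, lam2, t)
    (a, b), (c, d) = A
    return [[b0 + b1 * a, b1 * b], [b1 * c, b0 + b1 * d]]

t = 0.4
# Real eigenvalues -1 and 3
E_real = exp_tA_2x2([[1, 1], [4, 1]], -1, 3, t)
# Complex eigenvalues 1 + i and 1 - i; the imaginary parts of the result cancel
E_complex = exp_tA_2x2([[1, -1], [1, 1]], 1 + 1j, 1 - 1j, t)
print(E_real[1][0].real, math.exp(3 * t) - math.exp(-t))
print(E_complex[0][1].real, -math.exp(t) * math.sin(t))
```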

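EXAMPLE 4  In Example 2 of Section 1 we saw that the matrix $\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$ has eigenvalues
$\lambda_1 = 1 + i$ and $\lambda_2 = 1 - i$. Equations (i) of Theorem 2.4 are then

$$e^{(1+i)t} = b_0(t) + b_1(t)(1 + i), \qquad e^{(1-i)t} = b_0(t) + b_1(t)(1 - i).$$

Subtracting the second equation from the first gives

$$e^{(1+i)t} - e^{(1-i)t} = 2i\, b_1(t),$$

so $b_1(t) = \dfrac{1}{2i} e^t \left( e^{it} - e^{-it} \right) = e^t \sin t$.
Substituting for $b_1(t)$ in the first of the equations above containing $b_0$,

$$b_0(t) = -(1 + i) e^t \sin t + e^t (\cos t + i \sin t) = e^t (\cos t - \sin t).$$

Then

$$\exp\left( t \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} \right) = e^t (\cos t - \sin t) \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + e^t \sin t \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} e^t \cos t & -e^t \sin t \\ e^t \sin t & e^t \cos t \end{pmatrix}.$$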

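EXAMPLE 5  The matrix $A = \begin{pmatrix} 0 & 1 \\ -4 & -4 \end{pmatrix}$ has for its characteristic equation $\lambda^2 + 4\lambda + 4 = 0$,
with eigenvalues $\lambda_1 = \lambda_2 = -2$. Equations (i) of Theorem 2.4 are identical,

$$e^{t\lambda_1} = b_0(t) + b_1(t)\lambda_1 \quad \text{or} \quad e^{-2t} = b_0(t) - 2b_1(t).$$

Hence we need another equation satisfied by the coefficients. As suggested by Theorem 2.4, we differentiate formally with respect to $\lambda_1$ in the first equation above and
then set $\lambda_1 = -2$ to get

$$te^{t\lambda_1} = b_1(t) \quad \text{or} \quad te^{-2t} = b_1(t).$$

We see right away that $b_0(t) = e^{-2t} + 2te^{-2t}$. Equation 2.3 then becomes

$$e^{tA} = b_0(t) \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + b_1(t) \begin{pmatrix} 0 & 1 \\ -4 & -4 \end{pmatrix} = (1 + 2t)e^{-2t} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + te^{-2t} \begin{pmatrix} 0 & 1 \\ -4 & -4 \end{pmatrix} = e^{-2t} \begin{pmatrix} 1 + 2t & t \\ -4t & 1 - 2t \end{pmatrix}.$$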

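EXAMPLE 6  Let $A = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix}$. The characteristic equation is $(2 - \lambda)^3 = 0$, so there is a
triple root $\lambda_1 = 2$. Equations (i) of Theorem 2.4 reduce to

$$e^{2t} = b_0(t) + 2b_1(t) + 4b_2(t).$$

We get two additional relations among the $b_k(t)$ by differentiating the first equation
above twice with respect to $\lambda_1$ and then setting $\lambda_1 = 2$ after each differentiation:

$$te^{2t} = b_1(t) + 4b_2(t), \qquad t^2 e^{2t} = 2b_2(t).$$

Solving the last three displayed equations for the $b_k(t)$ gives

$$b_0(t) = (1 - 2t + 2t^2)e^{2t}, \qquad b_1(t) = (t - 2t^2)e^{2t}, \qquad b_2(t) = \tfrac12 t^2 e^{2t}.$$

Equation 2.3 is then

$$e^{tA} = (1 - 2t + 2t^2)e^{2t} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} + (t - 2t^2)e^{2t} \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix} + \tfrac12 t^2 e^{2t} \begin{pmatrix} 4 & 4 & 1 \\ 0 & 4 & 4 \\ 0 & 0 & 4 \end{pmatrix} = e^{2t} \begin{pmatrix} 1 & t & \tfrac12 t^2 \\ 0 & 1 & t \\ 0 & 0 & 1 \end{pmatrix}.$$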
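The coefficient functions found for the triple eigenvalue can be checked by assembling $b_0 I + b_1 A + b_2 A^2$ numerically and comparing with the closed form $e^{2t}$ times the matrix with rows $(1, t, t^2/2)$, $(0, 1, t)$, $(0, 0, 1)$. The sketch below does this at one sample time; the function names and the sample value are assumptions made for illustration.

```python
import math

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

A = [[2, 1, 0], [0, 2, 1], [0, 0, 2]]
A2 = matmul(A, A)
I3 = [[int(i == j) for j in range(3)] for i in range(3)]

def exp_tA(t):
    """b0*I + b1*A + b2*A^2 with the coefficients found for the triple root 2."""
    e = math.exp(2 * t)
    b0 = (1 - 2 * t + 2 * t * t) * e
    b1 = (t - 2 * t * t) * e
    b2 = 0.5 * t * t * e
    return [[b0 * I3[i][j] + b1 * A[i][j] + b2 * A2[i][j] for j in range(3)] for i in range(3)]

def closed_form(t):
    """e^{2t} times the upper triangular matrix with rows (1, t, t^2/2), (0, 1, t), (0, 0, 1)."""
    e = math.exp(2 * t)
    return [[e, t * e, 0.5 * t * t * e], [0.0, e, t * e], [0.0, 0.0, e]]

t = 0.5
E, C = exp_tA(t), closed_form(t)
gap = max(abs(E[i][j] - C[i][j]) for i in range(3) for j in range(3))
print(gap)
```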

The multiple-eigenvalue case in the proof we gave for Theorem 2.4 is incomplete.
However the following remarkable algebraic result makes the theorem plausible and
leads to a complete proof. This theorem allows in principle for the possibility of
collapsing the infinite series for $e^{tA}$ into a finite sum by replacing powers of $A$
higher than $n - 1$ by lower powers as in Equation 2.3. The next example works out
the 2-by-2 case.

2.5 Cayley-Hamilton Theorem. If $A$ is an $n$-by-$n$ matrix with characteristic
polynomial

$$P(\lambda) = \det(A - \lambda I),$$

then the matrix polynomial obtained by substituting $A^k$ for $\lambda^k$ in $P(\lambda)$ satisfies
$P(A) = 0$, with the understanding that $A^0 = I$ replaces $\lambda^0 = 1$ in the substitution.

The theorem, which we won't prove, is often stated briefly as "a square matrix
satisfies its own characteristic equation."

EXAMPLE 7  Suppose $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. The characteristic polynomial of $A$ is

$$\det \begin{pmatrix} a - \lambda & b \\ c & d - \lambda \end{pmatrix} = \lambda^2 - (a + d)\lambda + (ad - bc).$$

The Cayley-Hamilton theorem asserts that $A^2 - (a + d)A + (ad - bc)I = 0$, or
$A^2 = (a + d)A - (ad - bc)I$. Multiplying this last equation by $A$ gives

$$A^3 = (a + d)A^2 - (ad - bc)A$$
$$= (a + d)\bigl((a + d)A - (ad - bc)I\bigr) - (ad - bc)A$$
$$= \bigl((a + d)^2 - (ad - bc)\bigr)A - (a + d)(ad - bc)I.$$
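Both identities in this example can be confirmed on a concrete matrix with integer arithmetic. The sketch below uses the matrix with rows (1, 1) and (4, 1) from the earlier examples as sample data; the function names are hypothetical.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def cayley_hamilton_residual(A):
    """A^2 - (a + d)*A + (a*d - b*c)*I for a 2-by-2 matrix; should be the zero matrix."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    A2 = matmul(A, A)
    I2 = [[1, 0], [0, 1]]
    return [[A2[i][j] - trace * A[i][j] + det * I2[i][j] for j in range(2)] for i in range(2)]

def cube_from_formula(A):
    """A^3 = ((a + d)^2 - (a*d - b*c))*A - (a + d)*(a*d - b*c)*I."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    I2 = [[1, 0], [0, 1]]
    return [[(trace * trace - det) * A[i][j] - trace * det * I2[i][j] for j in range(2)]
            for i in range(2)]

A = [[1, 1], [4, 1]]
print(cayley_hamilton_residual(A))
print(cube_from_formula(A) == matmul(A, matmul(A, A)))
```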

Continuing as in the previous example, we see that a power of a 2-by-2 matrix
$A$, and hence a polynomial $p(A)$ of degree $n \ge 2$, equals a first-degree polynomial: $p(A) = \alpha A + \beta I$. The main difficulty in proving Theorem 2.4 is that we're
dealing with an infinite sum that defines $e^{tA}$. Nevertheless Theorem 2.4 implies that
the successive additions to coefficient terms in $I, A, \ldots, A^{n-1}$ converge to sums
$b_0, \ldots, b_{n-1}$ that we compute by the routine of Theorem 2.4.
If a square matrix $A$ is invertible, then the Cayley-Hamilton Theorem allows us
to write $A^{-1}$ as a polynomial in $A$. For if

$$a_0 I + a_1 A + \cdots + a_{n-1}A^{n-1} \pm A^n = 0$$

is the characteristic equation of $A$ with $\lambda$ replaced by $A$, we can multiply by $A^{-1}$
to get

$$a_0 A^{-1} + a_1 I + \cdots + a_{n-1}A^{n-2} \pm A^{n-1} = 0.$$

Now solve for $A^{-1}$, noting that $a_0 = \det A \neq 0$.

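EXAMPLE 8  The matrix $A = \begin{pmatrix} 2 & 3 & 1 \\ 0 & 2 & 2 \\ 0 & 0 & 2 \end{pmatrix}$ has characteristic equation

$$(2 - \lambda)^3 = 8 - 12\lambda + 6\lambda^2 - \lambda^3 = 0,$$

so $8I - 12A + 6A^2 - A^3 = 0$. Multiply by $A^{-1}$ to get

$$8A^{-1} - 12I + 6A - A^2 = 0 \quad \text{or} \quad 8A^{-1} = 12I - 6A + A^2.$$

Hence to find $A^{-1}$ we need only divide by 8 after computing

$$8A^{-1} = 12 \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} - 6 \begin{pmatrix} 2 & 3 & 1 \\ 0 & 2 & 2 \\ 0 & 0 & 2 \end{pmatrix} + \begin{pmatrix} 4 & 12 & 10 \\ 0 & 4 & 8 \\ 0 & 0 & 4 \end{pmatrix} = \begin{pmatrix} 4 & -6 & 4 \\ 0 & 4 & -4 \\ 0 & 0 & 4 \end{pmatrix}.$$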
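The inverse obtained from the Cayley-Hamilton relation can be verified with exact rational arithmetic. The sketch below forms $(12I - 6A + A^2)/8$ for the matrix of this example and multiplies the result by $A$; the variable names are hypothetical, and the coefficients 12, -6, and 1 are specific to the characteristic equation $(2 - \lambda)^3 = 0$.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

A = [[2, 3, 1], [0, 2, 2], [0, 0, 2]]
I3 = [[int(i == j) for j in range(3)] for i in range(3)]
A2 = matmul(A, A)

# 8 A^{-1} = 12 I - 6 A + A^2, from the characteristic equation of this particular A
A_inv = [[Fraction(12 * I3[i][j] - 6 * A[i][j] + A2[i][j], 8) for j in range(3)]
         for i in range(3)]

print(A_inv)
print(matmul(A, A_inv) == I3)
```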

EXERCISES

In Exercises 1 to 4, solve the initial-value problem


x = Ax for the given matrix A and corresponding initial 9. Let A =( ~ 1) where ex, fl are real numbers.

condition by first finding e1 A. (a) Show that if ex i= fl, then A has two linearly
independent eigenvectors.
1. ( I~ =; ); x(0) = ( _; ) (b) Show that if ex = fl, then the only eigenvectors of A
are of the form u = ( ~ ). c # 0.
2. ( -1 _: ); x(l) =( ~ ) Compute in each of the two cases.

,. o1-n n
(c) e'A

10. How should Equation 2.3 be interpreted when 11 = I?


1
;x(O) - ( 11. Use the definition of e A as a matrix power series to show
that e11 = e1 I .

4.(~l 01 I) = (2)
12. Theorem 2.4 shows that the coefficients bk(t) in an expan-
~ ; x(O) ~ sion
11 - l
In Exercises 5 and 6, find e1 A for the given matrix A. e1A = L)dt)Ak
k=O

S. A iI i0 -l-1
=( ~I )
are completely determined by the eigenvalues of the n-
hy-n matrix A. For example, the matrices ( ~ ~ ).

6. A-(l -05)
0
0
0.5
-1 (i f ) both have characteristic polynomial with I
2 2.5 0~5 and 2 as roots. Hence the bk(t) are the same for all these
-1 0.5 1.5 matrices regardless of the value of the entry fl .
7. Find the appropriate exponential matrix and use it to (a) Compute bo(t) and b1 (t) for the two matrices above,
solve and use them to find the corresponding exponential
matrices, each depending on the parameter fl .
dx
-=2t
dt
+ Z, (b) Compute the exponential matrix for ( ~ ! ).
dy
- =-x+3y
dt
+ Z, 13. Use Theorem 2.5 to compute A2 and A 3 if A =
dz
- =-x
dt
+ 4z. ( ! ~ ).
14. Verify the Cayley-Hamilton Theorem for the matrix
(Note that A = 3 is a triple eigenvalue.)
8. Solve by whatever method seems simplest:
( -~ -! ).
In Exercise 15 to 23, use the Cayley-Hamilton theorem
to find the inverse matrix if the given matrix is invertible.
dx
-=2x+ z,

~
dt
dy
dt = y + w, 15. (_: : : ) 16. ( - : : )
dz

~ ~ ~~
-dt =2z+w '
dw 17. ( : ) 18. ( ~I ) , t real
-=-y+w.
dt I -3 -7 0 0

C :) J n.c
2 -1 0
19. 0
0
20.

C -1
0
e'
0
,; )-"'"'
e'
2 -1 3 0 0
0 2 0 0 2 0 0
21. 22.
0 0 0 0 3 0
0 0 0 4 0 0 0 4

2E Independent Solutions
The discussion of this section shows that there is an exponential solution formula
$\mathbf{x}(t) = e^{tA}\mathbf{c}$ for every equation $d\mathbf{x}/dt = A\mathbf{x}$ in which $A$ is a constant square matrix.
In Example 1 we had

$$e^{tA}\mathbf{c} = \begin{pmatrix} e^t & te^t \\ 0 & e^t \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = c_1 \begin{pmatrix} e^t \\ 0 \end{pmatrix} + c_2 \begin{pmatrix} te^t \\ e^t \end{pmatrix}.$$

Since we want different solutions for every different choice of $c_1$, $c_2$, it's important
to avoid the redundancy that would occur in the formula in case one of the two
columns in the matrix is a constant multiple of the other. We'll see that this cannot
happen in general, but to state the general result we need to look more closely at what
is meant by linear independence of vector functions. Let $\mathbf{x}_1(t), \mathbf{x}_2(t), \ldots, \mathbf{x}_m(t)$ be
$n$-dimensional column vectors whose entries are functions on some common interval
$a < t < b$. (It's not ruled out that some or all of the entries may happen to be
constant.) Vector functions $\mathbf{x}_k(t)$, $k = 1, \ldots, m$, defined on a $t$-interval are said to
be linearly independent if whenever

$$c_1 \mathbf{x}_1(t) + c_2 \mathbf{x}_2(t) + \cdots + c_m \mathbf{x}_m(t) = 0$$

for all $t$, then the constant coefficients $c_k$ are all zero. When we have only two
functions ($m = 2$), asserting linear independence is the same as saying that neither
function is a constant multiple of the other. The reason is that if either $c_1$ or $c_2$ is not
zero we could divide by it and express one vector as a multiple of the other. Similarly,
if we have $m > 2$ vector functions, their linear independence means that none of
them is equal to a sum of scalar multiples, or linear combination, of the others. The
negation of linear independence of a set of vectors is called linear dependence, and
it means simply that at least one of the vectors is a linear combination of the others.

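The vector functions

$$\mathbf{x}_1(t) = \begin{pmatrix} e^t \\ 0 \end{pmatrix}, \qquad \mathbf{x}_2(t) = \begin{pmatrix} te^t \\ e^t \end{pmatrix}$$

that form the exponential matrix in Example 1 are linearly independent. For

$$c_1 \begin{pmatrix} e^t \\ 0 \end{pmatrix} + c_2 \begin{pmatrix} te^t \\ e^t \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

is the same as

$$c_1 e^t + c_2 t e^t = 0, \qquad c_2 e^t = 0.$$

It follows that $c_2 = 0$. Hence $c_1 = 0$. This conclusion holds for each given value of $t$,
so in particular the constant vectors

$$\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$

are linearly independent. Just set $t = 0$.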
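Consider the vector functions

$$\mathbf{x}_1(t) = \begin{pmatrix} e^t \\ 0 \\ 0 \end{pmatrix}, \qquad \mathbf{x}_2(t) = \begin{pmatrix} 0 \\ e^t \\ te^t \end{pmatrix}, \qquad \mathbf{x}_3(t) = \begin{pmatrix} 0 \\ e^t \\ t^2 e^t \end{pmatrix}.$$

The check for independence for $-\infty < t < \infty$ is to solve

$$c_1 \mathbf{x}_1(t) + c_2 \mathbf{x}_2(t) + c_3 \mathbf{x}_3(t) = 0$$

for $c_1$, $c_2$, $c_3$. This is the same as

$$c_1 e^t = 0, \qquad c_2 e^t + c_3 e^t = 0, \qquad c_2 t e^t + c_3 t^2 e^t = 0.$$

The first equation shows that $c_1 = 0$. The middle equation implies $c_2 = -c_3$, so the
last equation says $c_2 t - c_2 t^2 = 0$ for all $t$. Thus $c_2 = c_3 = 0$, so the vector functions
are independent as defined on an interval $a < t < b$. Note, however, that when $t = 0$
we get

$$\mathbf{x}_1(0) = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \qquad \mathbf{x}_2(0) = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \qquad \mathbf{x}_3(0) = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix},$$

and these constant vectors are linearly dependent. This shows that functions may be
linearly independent while their restrictions to some smaller domain (in this example
a single point) may be linearly dependent.
Here is a theorem that guarantees independence of the columns of an exponential
matrix for all values of $t$, and hence the existence of an independent set of solutions
for $\dot{\mathbf{x}} = A\mathbf{x}$ for every constant square matrix $A$.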
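2.6 Theorem. Let $A$ be an $n$-by-$n$ matrix with constant entries and let $\mathbf{x}_k(t)$ be the
$k$th column of the exponential matrix $e^{tA}$. Then the vector functions $\mathbf{x}_1(t), \ldots, \mathbf{x}_n(t)$
are linearly independent over an arbitrary set of $t$-values.

Proof. Apply the matrix $e^{-tA}$ to both sides of the vector equation

$$c_1 \mathbf{x}_1(t) + \cdots + c_n \mathbf{x}_n(t) = 0.$$

Using the distributivity of matrix multiplication, we get

$$c_1 e^{-tA}\mathbf{x}_1(t) + \cdots + c_n e^{-tA}\mathbf{x}_n(t) = 0.$$

But $e^{-tA}$ is the inverse of the matrix whose $k$th column is $\mathbf{x}_k(t)$. Hence $e^{-tA}\mathbf{x}_k(t)$
is the $k$th column of the identity matrix $I$. Thus our equation becomes

$$c_1 \mathbf{e}_1 + \cdots + c_n \mathbf{e}_n = 0,$$

where $\mathbf{e}_k$ is the column vector with 1 in the $k$th entry and 0 elsewhere. Adding up
the linear combination gives

$$\begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}.$$

So all $c_k = 0$ and the vectors $\mathbf{x}_k(t)$ are linearly independent. ∎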


EXERCISES

Each of the given matrices in Exercises 1 to 4 is the exponential matrix of some
constant matrix $A$. For each matrix, find $A$ by computing the derivative of $e^{tA}$ at
$t = 0$. Then express the vector function $x(t) = e^{tA}c$ as a linear combination of the
columns of $e^{tA}$ and verify that each column of $e^{tA}$ is a solution of $\dot{x} = Ax$.

I. ,,A - ( ~ :.) z.,'A - ( ~ -~ )

3.,"-0 -: J)
4. lA = ( eo~ r:,' ~ )
0 e21

Not every square matrix with linearly independent columns is an exponential matrix.
For example, an exponential matrix $e^{tA}$ must equal $I$ when $t = 0$ and properties (a)
and (b) of Theorem 2.1 must hold. In Exercises 5 to 8 show that the matrix has linearly
independent columns, but is not an exponential matrix.

9. Theorem 2.6 is a simple consequence of the following more general theorem: If $A(t)$
is an invertible square matrix for each t in some interval $a < t < b$, then the columns
of $A(t)$ are linearly independent vector functions on that interval. Show how to prove
this theorem using the ideas in the proof of Theorem 2.6.
10. Let $D$ be the n-by-n diagonal matrix with entries $d_1, \ldots, d_n$ on the main diagonal
and zeros elsewhere. Show that $e^{tD}$ is the diagonal matrix with entries
$e^{d_1 t}, \ldots, e^{d_n t}$.
11. Prove that a set $\{x_1(t), x_2(t), \ldots, x_m(t)\}$ of vector-valued functions of the same
dimension n is linearly independent on an interval $a \le t \le b$ if and only if no one of
them is a linear combination of the others.

640 Chapter 13 Matrix Methods

SECTION 3 NONHOMOGENEOUS SYSTEMS


3A Solution Formula
To develop efficient methods for solving nonhomogeneous systems, we need a for-
mula for the derivative of a matrix product, or, more particularly, the product of a
matrix and a vector. The rule is similar to the usual product rule for derivatives:

3.1  \[ \frac{d}{dt}\bigl[A(t)B(t)\bigr] = \left[\frac{d}{dt}A(t)\right]B(t) + A(t)\left[\frac{d}{dt}B(t)\right]. \]

However, the order of the factors on the right is important because it involves
matrix multiplication, which is not in general commutative. To prove the formula,
we differentiate one entry at a time on the left side; the derivative of the ijth entry is

\[ \frac{d}{dt}\sum_{k=1}^{n} a_{ik}(t)\,b_{kj}(t) = \sum_{k=1}^{n} \frac{da_{ik}}{dt}(t)\,b_{kj}(t) + \sum_{k=1}^{n} a_{ik}(t)\,\frac{db_{kj}}{dt}(t). \]

But this is just the ijth entry in the sum of products of matrices on the right, so the
formula is proved. In our first application $B(t)$ will be a column vector $x(t)$.

The proof of the next theorem is formally just an application of the exponential
multiplier method of Chapter 10, Section 3A.
3.2 Theorem. The vector differential equation

\[ \frac{dx}{dt} = Ax + b(t), \]

where $A$ is a constant matrix and $b(t)$ is a continuous function on some interval, has
for its general solution

\[ x(t) = e^{tA}\int e^{-tA}b(t)\,dt + e^{tA}c, \]

where $c$ is an arbitrary constant vector. In particular, the homogeneous equation
$dx/dt = Ax$ has $x(t) = e^{tA}c$ for its general solution.

Proof. We rewrite the differential equation as $dx/dt - Ax = b(t)$, then multiply
through by the matrix $e^{-tA}$ to get

\[ e^{-tA}\frac{dx}{dt} - e^{-tA}Ax = e^{-tA}b(t). \]

By Equation 3.1, the product rule for differentiation of matrices, and the fact that
$\frac{d}{dt}e^{-tA} = -e^{-tA}A$, this is the same as

\[ \frac{d}{dt}\bigl[e^{-tA}x\bigr] = e^{-tA}b(t). \]

Integration of both sides gives

\[ e^{-tA}x = \int e^{-tA}b(t)\,dt + c. \]

Since $e^{tA}$ is the inverse of $e^{-tA}$, we can multiply through by $e^{tA}$ to get

\[ x(t) = e^{tA}\int e^{-tA}b(t)\,dt + e^{tA}c. \] •

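EXAMPLE 1 In the first example of the previous section, we saw that the system

\[ \begin{pmatrix} dx/dt \\ dy/dt \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} \]

had associated with it the matrix

\[ e^{tA} = \begin{pmatrix} e^{t} & te^{t} \\ 0 & e^{t} \end{pmatrix}. \]

Hence to find a particular solution of

\[ \begin{pmatrix} dx/dt \\ dy/dt \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} e^{t} \\ e^{-t} \end{pmatrix}, \]

we compute the particular solution

\[
\begin{aligned}
e^{tA}\int e^{-tA}\begin{pmatrix} e^{t} \\ e^{-t} \end{pmatrix} dt
&= \begin{pmatrix} e^{t} & te^{t} \\ 0 & e^{t} \end{pmatrix}\int \begin{pmatrix} e^{-t} & -te^{-t} \\ 0 & e^{-t} \end{pmatrix}\begin{pmatrix} e^{t} \\ e^{-t} \end{pmatrix} dt \\
&= \begin{pmatrix} e^{t} & te^{t} \\ 0 & e^{t} \end{pmatrix}\int \begin{pmatrix} 1 - te^{-2t} \\ e^{-2t} \end{pmatrix} dt \\
&= \begin{pmatrix} e^{t} & te^{t} \\ 0 & e^{t} \end{pmatrix}\begin{pmatrix} t + \tfrac12 te^{-2t} + \tfrac14 e^{-2t} \\ -\tfrac12 e^{-2t} \end{pmatrix} \\
&= \begin{pmatrix} te^{t} + \tfrac14 e^{-t} \\ -\tfrac12 e^{-t} \end{pmatrix}.
\end{aligned}
\]

Adding the particular solution just found to the general homogeneous solution we
already had gives

\[
x(t) = \begin{pmatrix} e^{t} & te^{t} \\ 0 & e^{t} \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} + \begin{pmatrix} te^{t} + \tfrac14 e^{-t} \\ -\tfrac12 e^{-t} \end{pmatrix}
= \begin{pmatrix} c_1 e^{t} + c_2 te^{t} + te^{t} + \tfrac14 e^{-t} \\ c_2 e^{t} - \tfrac12 e^{-t} \end{pmatrix}
\]

for the general solution.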
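A particular solution obtained from the formula of Theorem 3.2 can be checked by substituting it back into the equation. The sketch below does this numerically for the example above, assuming the forcing term is $b(t) = (e^{t}, e^{-t})$ as read from the worked computation; the function names and the finite-difference step are illustrative choices, not part of the text.

```python
import math

def x_p(t):
    """Particular solution found in the example: (t e^t + e^{-t}/4, -e^{-t}/2)."""
    return (t * math.exp(t) + 0.25 * math.exp(-t), -0.5 * math.exp(-t))

def rhs(t, x):
    """Right side A x + b(t) with A = [[1, 1], [0, 1]] and b(t) = (e^t, e^{-t})."""
    return (x[0] + x[1] + math.exp(t), x[1] + math.exp(-t))

def residual(t, h=1e-5):
    """Largest entry of |dx_p/dt - (A x_p + b)|, using a central difference for dx_p/dt."""
    plus, minus = x_p(t + h), x_p(t - h)
    derivative = [(p - m) / (2 * h) for p, m in zip(plus, minus)]
    return max(abs(d - f) for d, f in zip(derivative, rhs(t, x_p(t))))

for t in (-1.0, 0.0, 0.3, 1.5):
    print(t, residual(t))   # each residual should be at rounding level
```

The residual stays near zero at every sample time, which is consistent with the closed-form computation; the same check works for any candidate particular solution.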

3B Variation of Parameters
Even in the case of an n-by-n matrix $A(t)$ with nonconstant continuous entries there
is a formula for a particular solution of

\[ \frac{dx}{dt} = A(t)x + b(t) \]

in terms of solutions of the related homogeneous equation. Suppose $x_1(t), \ldots, x_n(t)$
is a set of n linearly independent solutions of

\[ \frac{dx}{dt} = A(t)x. \]

We now form the n-by-n fundamental matrix

\[ X(t) = \bigl(x_1(t) \ \cdots \ x_n(t)\bigr) \]

whose columns are these independent vector solutions.

EXAMPLE 2 If $A$ is a constant matrix, then the matrix $X(t) = e^{tA}$ is an example of a
fundamental matrix, because its columns are linearly independent solutions of
$dx/dt = Ax$. In Example 1, a fundamental matrix is

\[ X_1(t) = e^{tA} = \begin{pmatrix} e^{t} & te^{t} \\ 0 & e^{t} \end{pmatrix}. \]

Another fundamental matrix for the same system is

\[ X_2(t) = \begin{pmatrix} te^{t} & e^{t} \\ e^{t} & 0 \end{pmatrix}, \]

although $X_2(t)$ is not an exponential matrix, because $X_2(0) \neq I$.

Having found a fundamental matrix $X(t)$, we try to find a vector-valued function
$v(t)$ such that

\[ x_p(t) = X(t)v(t) \]

is a solution of the nonhomogeneous equation. It turns out that this can always be
done as follows. Using the product rule for differentiation, we substitute $X(t)v(t)$
into the nonhomogeneous equation to get

\[ \frac{dX(t)}{dt}v(t) + X(t)\frac{dv(t)}{dt} = A(t)X(t)v(t) + b(t). \]

Since each column of $X(t)$ is a solution of the homogeneous equation, we have

\[ \frac{dX(t)}{dt} = A(t)X(t). \]

Therefore, the first term cancels on each side, leaving

\[ X(t)\frac{dv(t)}{dt} = b(t). \]

Since the columns of $X(t)$ are independent as vector functions, it follows (see
Exercise 19) that these columns are independent vectors for each fixed t. Hence
the inverse matrix $X^{-1}(t)$ exists, and multiplying by it gives

\[ \frac{dv(t)}{dt} = X^{-1}(t)b(t). \]

Integration gives the formula for $v(t)$:

\[ v(t) = \int X^{-1}(t)b(t)\,dt. \]

Finally,

3.3  \[ x_p(t) = X(t)v(t) = X(t)\int X^{-1}(t)b(t)\,dt. \]

Notice that this formula is the same as the one previously derived in the
constant-coefficient case, with $e^{tA}$ now replaced by the more general $X(t)$. This process for
finding $x_p$ is sometimes called variation of parameters, because to find it we replace
the constant vector $v_0$ in the homogeneous vector solution $X(t)v_0$ by a function that
varies with t.

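EXAMPLE 3 It's routine to verify that the homogeneous system associated with

\[ \begin{pmatrix} dx/dt \\ dy/dt \end{pmatrix} = \begin{pmatrix} 1 & e^{t} \\ 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} e^{t} \\ e^{2t} \end{pmatrix} \]

has independent solutions

\[ x_1(t) = \begin{pmatrix} e^{t} \\ 0 \end{pmatrix}, \qquad x_2(t) = \begin{pmatrix} e^{2t} \\ e^{t} \end{pmatrix}. \]

We form a fundamental matrix $X(t)$ and its inverse:

\[ X(t) = \begin{pmatrix} e^{t} & e^{2t} \\ 0 & e^{t} \end{pmatrix}, \qquad X^{-1}(t) = \begin{pmatrix} e^{-t} & -1 \\ 0 & e^{-t} \end{pmatrix}. \]

Formula 3.3 gives the particular solution

\[
\begin{aligned}
x_p(t) &= \begin{pmatrix} e^{t} & e^{2t} \\ 0 & e^{t} \end{pmatrix}\int \begin{pmatrix} e^{-t} & -1 \\ 0 & e^{-t} \end{pmatrix}\begin{pmatrix} e^{t} \\ e^{2t} \end{pmatrix} dt \\
&= \begin{pmatrix} e^{t} & e^{2t} \\ 0 & e^{t} \end{pmatrix}\int \begin{pmatrix} 1 - e^{2t} \\ e^{t} \end{pmatrix} dt \\
&= \begin{pmatrix} e^{t} & e^{2t} \\ 0 & e^{t} \end{pmatrix}\begin{pmatrix} t - \tfrac12 e^{2t} \\ e^{t} \end{pmatrix}
= \begin{pmatrix} te^{t} + \tfrac12 e^{3t} \\ e^{2t} \end{pmatrix}.
\end{aligned}
\]

The general solution is then

\[ x(t) = c_1 x_1(t) + c_2 x_2(t) + x_p(t). \]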
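Formula 3.3 can also be evaluated numerically when the integral has no convenient closed form. The sketch below is an illustration under stated assumptions: it uses the fundamental matrix $X(t)$ of the example, takes the forcing term to be $b(t) = (e^{t}, e^{2t})$ (inferred from the worked integrals), replaces the indefinite integral by a definite integral from 0 to t, and approximates that integral with composite Simpson's rule. The function names are hypothetical.

```python
import math

def X(t):
    """Fundamental matrix with columns x_1(t) = (e^t, 0) and x_2(t) = (e^{2t}, e^t)."""
    return [[math.exp(t), math.exp(2 * t)], [0.0, math.exp(t)]]

def X_inv(t):
    """Inverse of the 2-by-2 matrix X(t) by the adjugate formula."""
    (p, q), (r, s) = X(t)
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def forcing(t):
    """Nonhomogeneous term b(t) = (e^t, e^{2t}) (assumed from the example)."""
    return [math.exp(t), math.exp(2 * t)]

def mat_vec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def particular(t, steps=200):
    """X(t) times the integral from 0 to t of X^{-1}(s) b(s) ds (Simpson's rule, steps even)."""
    h = t / steps
    total = [0.0, 0.0]
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        g = mat_vec(X_inv(i * h), forcing(i * h))
        total = [total[0] + weight * g[0], total[1] + weight * g[1]]
    v = [h / 3 * total[0], h / 3 * total[1]]
    return mat_vec(X(t), v)

def closed_form(t):
    """x_p(t) from the example plus the homogeneous solution (1/2) x_1(t) - x_2(t)."""
    e1, e2, e3 = math.exp(t), math.exp(2 * t), math.exp(3 * t)
    return [t * e1 + 0.5 * e3 + 0.5 * e1 - e2, e2 - e1]

numeric, exact = particular(1.0), closed_form(1.0)
```

Starting the integral at 0 changes the answer only by a homogeneous solution, here $\tfrac12 x_1(t) - x_2(t)$, which is why `closed_form` includes that term; any such choice gives a valid particular solution.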

3C Summary of Methods
For linear systems in the standard form $dx/dt = Ax + b$, and hence for systems and
equations reducible to this form, we usually proceed as follows:

1. Find the general solution of the homogeneous equation $dx/dt = Ax$, either
by elimination, by the eigenvector method, if applicable, or by finding $e^{tA}$
directly. In the constant-coefficient case the homogeneous solution is always
of the form $x_h(t) = e^{tA}c$, where $c$ is a constant vector.
2. Find a particular solution to the nonhomogeneous equation, either as a
by-product of the elimination method, by undetermined coefficients, if applicable,
or by Formula 3.2 or 3.3.
3. Write the general solution as $x(t) = x_h(t) + x_p(t)$.

If $A$ isn't constant there is no general method for finding $x_h(t)$, and we'll very likely
have to use numerical methods.
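As one illustration of what "numerical methods" can mean here, the following sketch applies the classical fourth-order Runge-Kutta method to a system $dx/dt = A(t)x + b(t)$ with nonconstant $A(t)$. The text does not prescribe a particular method; the choice of Runge-Kutta, the function names, and the sample system (picked because its exact solution is known) are all assumptions made for this example.

```python
import math

def rk4(f, t0, x0, t1, steps):
    """Integrate dx/dt = f(t, x) from t0 to t1 with the classical RK4 method."""
    h = (t1 - t0) / steps
    t, x = t0, list(x0)
    for _ in range(steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k1)])
        k3 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k2)])
        k4 = f(t + h, [xi + h * ki for xi, ki in zip(x, k3)])
        x = [xi + h / 6 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
        t += h
    return x

def f(t, x):
    """Sample system: A(t) = [[2t, 0], [0, -1]], b(t) = (0, 1)."""
    return [2 * t * x[0], -x[1] + 1.0]

approx = rk4(f, 0.0, [1.0, 0.0], 1.0, 200)
# Exact solution with x(0) = (1, 0): (e^{t^2}, 1 - e^{-t}).
exact = [math.exp(1.0), 1.0 - math.exp(-1.0)]
```

Comparing `approx` with `exact` on a test problem like this one is a reasonable sanity check before applying the same routine to a system whose solution is unknown.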
EXERCISES

In Exercises 1 to 4, use Equation 3.2 to solve the initial-value problem of the form
$dx/dt = Ax + b(t)$, $x(t_0) = x_0$. The associated homogeneous equations $dx/dt = Ax$
were found in Exercises 11, 12, and 13 for Section 2A to 2C to have the exponential
matrices $e^{tA}$ needed here in Exercises 2, 3, and 4.

1. ( ~ )-o O)(x) + (e -I)· 1 2 y e-1 ,
   ( ;~~~ ) = ( =~ )
2. ( ~ )-( =i nc )+( n,
   (;rn ) = ( ~)
3. (~ )-u nc)+(H
   ( ;~~~ ) =( ~)
4. ( ~f)-(; 21 -I2 ) ( xy ) + ( e 2e2r ) ;
   ( ~~~~ ) =( =1 )

5. (a) Show that for a solution of the form $x(t) = e^{tA}c$, where $c$ is a constant vector,
to satisfy the condition $x(t_0) = x_0$, we must have $c = e^{-t_0 A}x_0$.
(b) Show that if $X(t)$ is an n-by-n matrix with linearly independent columns, in
particular if $X(t)$ is a fundamental matrix, then in order for

\[ x(t) = X(t)c, \quad c \text{ constant}, \]

to satisfy $x(t_0) = x_0$, we must have $c = X^{-1}(t_0)x_0$. Exercise 19 shows that $X(t_0)$
is invertible.

6. Let $X(t)$ be a fundamental matrix whose columns span the set of solutions of the
homogeneous equation

\[ \frac{dx}{dt} = A(t)x. \]

(a) Show that if $x_p(t)$ is a particular solution of the nonhomogeneous system

\[ \frac{dx}{dt} = A(t)x + b(t), \]

then the general solution of the nonhomogeneous system is

\[ x(t) = x_p(t) + X(t)c, \]

where $c$ is a constant.
(b) Show that if the general solution in part (a) is to satisfy an initial condition
$x(t_0) = x_0$, then $c$ should be chosen so that

\[ c = X^{-1}(t_0)\bigl(x_0 - x_p(t_0)\bigr). \]

For Exercises 7 to 10, consider the systems in Exercises 1 to 4 respectively, which are
solvable by the method of undetermined coefficients: Form linear combinations of
the terms, and their derivatives, that occur in each entry of the nonhomogeneous part
of the differential equation, taking care to include appropriate multiples by powers
of t for terms that are also homogeneous solutions. Then substitute into the equation
to determine the coefficients of combination. In Exercises 7 to 10, use this method on
the corresponding system in Exercises 1 to 4.
In Exercises 11 and 12, the system has the homogeneous solutions shown. Verify that
these are linearly independent solutions. Find a particular solution of the
nonhomogeneous equation, using Equation 3.3.

3
2t
1
- 2t 2
)
t -]
t- l t- I
1 0

13. (a) Let $x_1(t), \ldots, x_n(t)$ be continuously differentiable functions taking values in
$R^n$, and forming a linearly independent set of vectors for each t. Let $X(t)$ be the
n-by-n matrix with columns $x_1(t), \ldots, x_n(t)$. Show that if we define

\[ A(t) = X'(t)X^{-1}(t), \]

then the system $dx/dt = A(t)x$ has $x_1(t), \ldots, x_n(t)$ as solutions and thus has $X(t)$
as a fundamental matrix.
(b) Find a first-order homogeneous linear system of the form $dx/dt = A(t)x$ having

x1(t) = ( 2:~, ) , x2(t) = ( ;, )

as solutions. Are these two solutions linearly independent?

14. Let $A$ be an n-by-n invertible matrix of constants, and let $b$ be a fixed vector in
$R^n$. Show that the equation

\[ \frac{dx}{dt} = Ax + b \]

always has $x_p = -A^{-1}b$ for a particular solution.

15. Show that if $A$ is a constant n-by-n matrix, then the equation

\[ \frac{dx}{dt} = Ax \]

has $X(t) = e^{tA}$ for its fundamental matrix of independent column solutions with
$X(0) = I$.

In Exercises 16 and 17, $A(t)$ is a square matrix with entries differentiable on
$a < t < b$.

16. (a) Show that if $A(t)$ and $dA(t)/dt$ commute, then

\[ \frac{dA^2(t)}{dt} = 2A(t)\frac{dA(t)}{dt}. \]

(b) Generalize part (a) to

\[ \frac{dA^k(t)}{dt} = kA^{k-1}(t)\frac{dA(t)}{dt}. \]

17. Use the previous exercise to show that

\[ \frac{de^{A(t)}}{dt} = e^{A(t)}\frac{dA(t)}{dt}. \]

18. Modify the derivation of Formula 3.2 to show that, for $A$ constant and $b(t)$
continuous, the initial-value problem

\[ \frac{dx}{dt} = Ax + b(t), \quad x(t_0) = x_0, \]

has a unique solution of the form

\[ x(t) = e^{tA}\int_{t_0}^{t} e^{-uA}b(u)\,du + e^{(t-t_0)A}x_0. \]

[Hint: Integrate from $t_0$ to $t$ instead of using an indefinite integral.]

*19. Let $X(t)$ be a fundamental matrix of independent solutions of $\dot{x} = A(t)x$ where
$A(t)$ has continuous entries on some interval $a < t < b$. Prove for each $t_0$ in the
interval that $X(t_0)$ is invertible as follows.
(a) Assuming the contrary, show that there is a vector $c \neq 0$ such that $X(t_0)c = 0$.
(b) Show that the initial-value problem $\dot{x} = A(t)x$, $x(t_0) = 0$ has the solution
$x(t) = X(t)c$ on the interval, and also has the identically zero solution there.
(c) Use the uniqueness part of Theorem 1.1 in Chapter 12, Section 1D to show that
$X(t)c$ is identically zero on the interval, thus contradicting the linear independence
of the columns of the fundamental matrix $X(t)$.

SECTION 4 EQUILIBRIUM AND STABILITY


An equilibrium solution of a single differential equation involving a time derivative,
or of a system of such differential equations, is just a solution that is constant over
time. The reason for using the term equilibrium solution rather than constant solution
is that we're mainly interested in the stability of solutions that result from small
perturbations of an equilibrium solution. For the purposes of graphing, an equilibrium
solution appears as a single point in the space of the dependent variables, so we often
find it natural to refer to an equilibrium solution as an equilibrium point.

FIGURE 13.2 Phase portraits (a) and (b) in the $(y, z)$ plane, with $z = \dot{y}$. At
$(y, z) = (2k\pi, 0)$, stable in (a) and asymptotically stable in (b); unstable at
$(y, z) = ((2k+1)\pi, 0)$ in (a) and (b).

EXAMPLE 1 The classic example of a stable equilibrium in a mechanical system is a
pendulum in a motionless downward position. In the pendulum equation and its equivalent
2-dimensional system, the equilibrium positions are expressed by solutions $y(t) = 2k\pi$
for integer k. A slight perturbation leads to motion that varies only slightly from the
equilibrium position, expressed by solutions $y(t)$ having small amplitude. At the
other extreme are equilibrium solutions $y(t) = (2k+1)\pi$ representing precariously
balanced upward vertical pendulum positions; a slight perturbation always leads to
positions remote from the unstable equilibrium, no matter how slight the perturbation.
Thus the stable equilibrium points are at $(y, z) = (2k\pi, 0)$ and the unstable points are
at $(y, z) = ((2k+1)\pi, 0)$. See Figure 13.2(a), which we discussed also in Example 7
of Chapter 11, Section 7C.

In general we consider behavior of solutions $x = x(t)$ of autonomous systems
$\dot{x} = F(x)$ near an equilibrium point $x_0$, that is, a point $x_0$ such that $F(x_0) = 0$. The
basic types of behavior are as follows:

1. $x_0$ is asymptotically stable if there is a number $d_0 > 0$ such that every
solution starting within distance $d_0$ of $x_0$ tends to $x_0$ as t tends to infinity.
2. $x_0$ is stable if there is a $d_0 > 0$ such that all solutions starting within some
distance $d_1 < d_0$ from $x_0$ remain within distance $d_0$ of $x_0$. Note that asymptotic
stability near an equilibrium point $x_0$ implies stability there.
3. An equilibrium point $x_0$ that is not stable is called unstable.
4A Linear Systems
An n-dimensional autonomous linear system in first-order standard form is

\[ \dot{x} = Ax + b, \]

where $A$ is a constant n by n matrix and $b$ is a constant n-dimensional vector. An
equilibrium solution is a constant vector $x_0$ satisfying

\[ Ax_0 + b = 0 \quad\text{or}\quad Ax_0 = -b. \]

If $A$ is invertible, there is a unique equilibrium point given by $x_0 = -A^{-1}b$. If $A^{-1}$
fails to exist, then either (a) there will be no equilibrium point, or else (b) there
will be an entire line or plane consisting entirely of equilibrium points. In either
case, if $x = x_0$ is an equilibrium solution, the homogeneous plus particular form
$x(t) = x_h(t) + x_0$ of the general solution shows that the qualitative behavior of $x(t)$
near $x_0$ is the same as the behavior of the homogeneous solution $x_h(t)$ near $x = 0$.
So to find out about stability of $x_0$ we just need to find out about the behavior of
solution trajectories of $\dot{x} = Ax$ near $x = 0$. But we know from Sections 1 and 2 that,
for constant square matrices $A$, the basic solutions of $\dot{x} = Ax$ are all of the form
$e^{\lambda t}$, or $t^k e^{\lambda t}$, where $\lambda$ is an eigenvalue of $A$. Thus the qualitative behavior of
solutions is entirely determined by the eigenvalues of $A$ along with their multiplicities.
The case in which zero is an eigenvalue is atypical and is often called degenerate, because
it implies the existence of nonzero solutions $x$ to $Ax = 0$, which as we remarked
earlier implies an entire line or plane of equilibrium points.
The following displays I to VIII of types of 2-dimensional trajectory behavior of
$\dot{x} = Ax$ cover all possibilities for which zero is not an eigenvalue, and thus for which
the origin is the only equilibrium point. Where there are side-by-side pictures in a
category the example on the left has the standard basis vectors as eigenvectors, while
the one on the right has an eigenvector that's tilted relative to the axes. When there is
at least one eigenvalue with positive real part the equilibrium is necessarily unstable.
If one eigenvalue is negative and one is positive the equilibrium is called a saddle
point, as shown in Display II. An extension to a display for 3-dimensional space
would contain more pictures that take into account whether the third eigenvalue is
positive or negative.

I. Unstable Node: $0 < \lambda_1 < \lambda_2$

Example: $x = 2e^{t}$, $y = e^{3t}$; $8y = x^3$.

II. Saddle (Unstable): $\lambda_1 < 0 < \lambda_2$

Example: $x = e^{-t}$, $y = 3e^{t}$; $xy = 3$.

III. Asymptotically Stable Node: $\lambda_1 < \lambda_2 < 0$

Example: $x = 2e^{-3t}$, $y = e^{-t}$; $x = 2y^3$.

IV. Unstable Spiral: $\lambda_1 = p + iq$, $\lambda_2 = p - iq$, $p > 0$, $q \neq 0$

Example: $x = e^{t}\cos 2t$, $y = e^{t}\sin 2t$; $x^2 + y^2 = e^{\arctan(y/x)}$.

V. Stable Center: $\lambda_1 = iq$, $\lambda_2 = -iq$, $q \neq 0$

Example: $x = 2\cos t$, $y = \sin t$; $x^2 + 4y^2 = 4$.

VI. Asymptotically Stable Spiral: $\lambda_1 = -p + iq$, $\lambda_2 = -p - iq$, $p > 0$, $q \neq 0$

Example: $x = e^{-t}\cos t$, $y = e^{-t}\sin t$; $x^2 + y^2 = e^{-2\arctan(y/x)}$.

VII. Unstable Star: $\lambda_1 = \lambda_2 > 0$

Example: $x = 2e^{t}$, $y = 3e^{t}$; $3x = 2y$.
Example: $x = e^{t}$, $y = te^{t}$; $y = x\ln x$.

VIII. Asymptotically Stable Star: $\lambda_1 = \lambda_2 < 0$

Example: $x = e^{-t}$, $y = 3e^{-t}$; $3x = y$.
Example: $x = e^{-t}$, $y = te^{-t}$; $y = -x\ln x$.
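The eight cases in Displays I to VIII can be read off mechanically from the eigenvalues of a 2-by-2 matrix. The sketch below is one hedged way to do that: the function names, the tolerance, and the use of the quadratic formula on the trace and determinant are choices made for this illustration, and the labels simply reuse the display titles from the text.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from lambda^2 - (a + d) lambda + (ad - bc) = 0."""
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace - root) / 2, (trace + root) / 2

def classify(a, b, c, d, tol=1e-12):
    """Name the behavior of x' = ax + by, y' = cx + dy near the origin."""
    l1, l2 = eigenvalues_2x2(a, b, c, d)
    if abs(l1) < tol or abs(l2) < tol:
        return "degenerate (zero eigenvalue)"
    if abs(l1.imag) > tol:                      # complex pair p +/- iq
        p = l1.real
        if p > tol:
            return "IV. unstable spiral"
        if p < -tol:
            return "VI. asymptotically stable spiral"
        return "V. stable center"
    r1, r2 = sorted((l1.real, l2.real))         # real eigenvalues, r1 <= r2
    if abs(r1 - r2) < tol:
        return "VII. unstable star" if r1 > 0 else "VIII. asymptotically stable star"
    if r1 < 0 < r2:
        return "II. saddle (unstable)"
    return "I. unstable node" if r1 > 0 else "III. asymptotically stable node"

print(classify(2, 0, 0, 3))     # I. unstable node
print(classify(1, 1, 4, 1))     # II. saddle (unstable)
print(classify(0, 1, -1, 0))    # V. stable center
```

Comparing floating-point eigenvalues with a tolerance is fragile near the boundaries between cases, so a result for a borderline matrix should be confirmed by hand.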

EXAMPLE 2 The unique equilibrium solution $x = x_0$, $y = y_0$ of the system

\[ \dot{x} = x + y + 1, \qquad \dot{y} = 4x + y - 1 \]

satisfies

\[ x + y + 1 = 0, \qquad 4x + y - 1 = 0, \]

and this solution is $x_0 = \tfrac23$, $y_0 = -\tfrac53$. So for stability of this constant solution we
study the homogeneous system

\[ \dot{x} = x + y, \quad \dot{y} = 4x + y, \qquad\text{or}\qquad \dot{\mathbf{x}} = \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix}\mathbf{x}. \]

We saw in Example 1 of Section 1 that the eigenvalues of the 2 by 2 matrix are
$\lambda = 3$ and $\lambda = -1$, so there are solutions

where $u$ and $v$ are respective eigenvectors of 3 and $-1$. The presence of solutions of
the form $x_1(t)$ is enough to demonstrate that the equilibrium point $(\tfrac23, -\tfrac53)$ of the
original nonhomogeneous system is unstable. The reason is that by taking $c_2 \neq 0$
small enough in absolute value we can produce solutions starting arbitrarily close to
equilibrium that tend arbitrarily far away as t increases. Thus it's just the presence
of the positive eigenvalue $\lambda = 3$ that is decisive. Note that the values of the
eigenvectors $u$ and $v$ are irrelevant in making the decision; these vectors are important in
determining the shape and direction of the solution trajectories, but they don't affect
stability.
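The two computations in this example, solving $Ax_0 = -b$ for the equilibrium and confirming the eigenvalues, are easy to check in exact arithmetic. This is a small sketch using Cramer's rule with rational numbers; the helper names `solve_2x2` and `char_poly` are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction

def solve_2x2(a, b, c, d, r1, r2):
    """Solve a*x + b*y = r1, c*x + d*y = r2 by Cramer's rule (needs ad - bc != 0)."""
    det = Fraction(a * d - b * c)
    return Fraction(r1 * d - b * r2) / det, Fraction(a * r2 - c * r1) / det

# Equilibrium of x' = x + y + 1, y' = 4x + y - 1: solve A x0 = -b with b = (1, -1).
x0, y0 = solve_2x2(1, 1, 4, 1, -1, 1)

def char_poly(lam):
    """det(A - lam I) for A = [[1, 1], [4, 1]]."""
    return (1 - lam) * (1 - lam) - 4

print(x0, y0)                          # 2/3 -5/3
print(char_poly(3), char_poly(-1))     # 0 0
```

Since one root of the characteristic polynomial is positive, the equilibrium is unstable regardless of where the equilibrium point itself lies.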

EXAMPLE 3 The 3-dimensional system

\[ \dot{x} = y, \quad \dot{y} = -x - y, \quad \dot{z} = x - z, \qquad\text{in matrix form}\qquad \dot{\mathbf{x}} = \begin{pmatrix} 0 & 1 & 0 \\ -1 & -1 & 0 \\ 1 & 0 & -1 \end{pmatrix}\mathbf{x}, \]

has a unique equilibrium point at the origin (0, 0, 0). The characteristic equation is

\[ \det\begin{pmatrix} -\lambda & 1 & 0 \\ -1 & -1-\lambda & 0 \\ 1 & 0 & -1-\lambda \end{pmatrix} = 0, \]

which works out to be $\lambda^3 + 2\lambda^2 + 2\lambda + 1 = 0$. By inspection we see that $\lambda_1 = -1$
is a root. Factoring out $\lambda + 1$ leaves $\lambda^2 + \lambda + 1 = 0$, so the eigenvalues are

\[ \lambda_1 = -1, \quad \lambda_2 = \tfrac12\bigl(-1 + \sqrt{3}\,i\bigr), \quad \lambda_3 = \tfrac12\bigl(-1 - \sqrt{3}\,i\bigr). \]

Since all roots have negative real part, either $-1$ or $-\tfrac12$, the equilibrium is
asymptotically stable, with all solutions tending to the origin as t increases. We could
compute the general solution without much trouble, but we don't need that if we're
only checking stability near equilibrium. Note that showing the existence of just one
eigenvalue with positive real part would have been enough to guarantee instability.
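The eigenvalues quoted in this example can be confirmed by evaluating $\det(A - \lambda I)$ directly. The sketch below expands the 3-by-3 determinant along the first row; `det3` and `char_det` are hypothetical helper names used only here.

```python
import cmath

def det3(M):
    """Determinant of a 3-by-3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def char_det(lam):
    """det(A - lam I) for the matrix of the 3-dimensional system in the example."""
    A = [[0, 1, 0], [-1, -1, 0], [1, 0, -1]]
    return det3([[A[r][s] - (lam if r == s else 0) for s in range(3)] for r in range(3)])

roots = [-1, (-1 + cmath.sqrt(-3)) / 2, (-1 - cmath.sqrt(-3)) / 2]
for lam in roots:
    print(lam, abs(char_det(lam)))   # each value should be zero up to rounding
```

All three roots have negative real part, which is the only fact needed for the stability conclusion.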

x= (
-1
~
-1
-1
0 -1
~)x
have respective characteristic equations

with respective eigenvalues

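$\lambda = -1 \pm i$, $\lambda = -1$ and $\lambda = 1 \pm i$, $\lambda = 1$.

The respective solutions are

\[ \text{and}\quad x(t) = \begin{pmatrix} c_1 e^{t}\cos t \\ c_2 e^{t}\sin t \\ c_3 e^{t} \end{pmatrix}. \]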

The origin is respectively stable and unstable for these solutions as shown in Figure
13.3. Note that the trajectories appear to be very similar, but the ones in (b) nec-
essarily start at a positive distance from the origin, while in theory the ones in (a)
approach arbitrarily close to the origin.

The previous examples make the next theorem plausible.

4.1 Linear Stability. Let $A$ be a real n-by-n constant matrix. The equilibrium
solution $x_0 = 0$ for the homogeneous system $\dot{x} = Ax$ is

(a) asymptotically stable if every eigenvalue of $A$ has negative real part, and is
(b) unstable if $A$ has at least one eigenvalue with positive real part.

The same conclusions hold for stability of an equilibrium solution for $\dot{x} = Ax + b$,
where $b$ is an n-dimensional constant vector. If all eigenvalues of $A$ have real
part zero, we can draw no immediate conclusion in dimension $n \ge 4$, for then an
equilibrium solution $x_0$ may be stable or may be unstable, but if $n \le 3$ and the
eigenvalues are all distinct then $x_0$ will be stable.

FIGURE 13.3 Trajectories in $(x, y, z)$ space. (a) Stable. (b) Unstable.

Proof. The method for computing the exponential matrix described in Section 2D
shows that every entry in the exponential matrix $e^{tA}$ has the form $e^{\lambda t}Q(t)$, where
$Q(t)$ is a polynomial and $\lambda$ is an eigenvalue of $A$. If every such $\lambda$ has negative real
part, then the entries, and hence all solutions, tend to zero as t tends to $+\infty$. On
the other hand, if a single eigenvalue $\lambda$, with eigenvector $v$, has positive real part,
then the solution $x(t) = \delta e^{\lambda t}v$ is unbounded in every nonzero coordinate as t tends
to $+\infty$, regardless of how small the positive number $\delta$ is chosen.
If the nonhomogeneous equation has $x_0$ for an equilibrium solution, the homogeneous
plus particular form $x(t) = e^{tA}c + x_0$ for the general solution shows that the
same conclusions hold for the nonhomogeneous equation. The last statement of the
theorem is settled by checking out the two examples in Exercise 22 and noting that,
with real parts all zero and the eigenvalues distinct, when $n = 2$ the eigenvalues are
$\pm iq$ with $q$ real and nonzero, and when $n = 3$ the eigenvalues are $\pm iq$ and 0. •

EXERCISES

Find the eigenvalues $\lambda_1$, $\lambda_2$ associated with each system 1 through 8. Then classify
the behavior of each system near equilibrium by name according to the types categorized
in the Displays I to VIII of the text and sketch a phase portrait for each system.

1. $dx/dt = -3x + 2y$, $dy/dt = -4x + 3y$.
2. $dx/dt = x + 4y$, $dy/dt = 5y$.
3. $dx/dt = 2x - y$, $dy/dt = x + 2y$.
4. $dx/dt = x$, $dy/dt = 3x + y$.
5. $\dot{x} = 2x - 3y$, $\dot{y} = -2x - 2y$.
6. $\dot{x} = -x + y$, $\dot{y} = -4x - y$.
7. $\dot{x} = x + y$, $\dot{y} = 2x$.
8. $\dot{x} = x + y$, $\dot{y} = x - y$.

Each of the second-order equations 9 to 12 is equivalent to a first-order system
obtained by letting $dx/dt = y$. Classify by name, according to the list I-VIII, each
equation, or equivalently the system, by finding the characteristic roots of the given
second-order equation directly. Sketch a phase portrait for each equation.

9. $d^2x/dt^2 - x = 0$.
10. $d^2x/dt^2 + dx/dt - x = 0$.
11. $d^2x/dt^2 + dx/dt + x = 0$.
12. $d^2x/dt^2 + x = 0$.

The second-order equations 13 through 16 determine the evolution $x(t)$ of a spring
system subject to a frictional force $k(dx/dt)$, where $k \ge 0$ is constant. Find conditions
on $k$ under which the equation belongs to any or all of the classes in the standard I-VIII.

13. $d^2x/dt^2 + k\,dx/dt + x = 0$.
14. $d^2x/dt^2 + k\,dx/dt + 3x = 0$.
15. $2\,d^2x/dt^2 + k\,dx/dt + x = 0$.
16. $m\,d^2x/dt^2 + k\,dx/dt + 3x = 0$, $m > 0$.

17. Show that the system $\dot{x} = ax + by$, $\dot{y} = cx + dy$ has infinitely many constant
equilibrium solutions if and only if $ad - bc = 0$. [Hint: Try to solve
$ax + by = cx + dy = 0$.]
18. Consider the general solution $x = c_1 e^{\lambda_1 t} + c_2 e^{\lambda_2 t}$, $y = c_3 e^{\lambda_1 t} + c_4 e^{\lambda_2 t}$ of a
2-dimensional linear system in which $\lambda_1 < \lambda_2 < 0$, and for which (0, 0) is an
asymptotically stable node.
(a) Show that, if $c_2$ and $c_4$ are not both zero, then as $t \to \infty$ the slope
$dy/dx = (dy/dt)/(dx/dt)$ of the associated trajectory approaches $c_4/c_2$, interpreted
as a vertical slope if $c_2 = 0$.
(b) Show that if $c_2 = c_4 = 0$, but $c_1$ and $c_3$ are not both zero, then all trajectories are
straight lines with slopes $c_3/c_1$, or a vertical line if $c_1 = 0$.
19. What difference, if any, is there between the phase portraits of the system
$\dot{x} = f(x, y)$, $\dot{y} = g(x, y)$ and the system $\dot{x} = -f(x, y)$, $\dot{y} = -g(x, y)$?
20. Show that if the system $\dot{x} = ax + by$, $\dot{y} = cx + dy$ has real characteristic roots
then it has an asymptotically stable node at the origin if and only if $a + d < 0$ and
$ad - bc > 0$.
21. (a) Solve the system $\dot{x} = y$, $\dot{y} = -x - y$, $\dot{z} = x - z$.
(b) Show that every solution tends to the equilibrium point (0, 0, 0), which is thus
asymptotically stable.
(c) What changes if the last equation is replaced by $\dot{z} = x + z$?
22. Use the systems $\dot{x} = Ax$ with these matrices to confirm the last statement in
Theorem 4.1:

(J ~ 0~
0
0 -2
~) 1
0
and
(j
1
0
0 0
0 -5
0
1

4B Nonlinear Systems
To extend eigenvalue analysis of equilibrium solutions to autonomous nonlinear
systems we start by finding the equilibrium points $x_0 = (a_1, \ldots, a_n)$ of a nonlinear
system $\dot{x} = F(x)$, that is, points such that $F(x_0) = 0$. To do this for linear systems we
had the routine of Chapter 2, Section 2, but if $F(x)$ is nonlinear we have to resort
to ad hoc methods or perhaps numerical approximation using Newton's method
as described in Chapter 5, Section 5. After locating an equilibrium point $x_0$, the
next step is to linearize the system at $x_0$, replacing each real-valued equation
$\dot{x}_k = F_k(x_1, \ldots, x_n)$ in the system by its linearization

\[ \dot{x}_k = \sum_{j=1}^{n} \frac{\partial F_k}{\partial x_j}(x_0)\,(x_j - a_j) + F_k(x_0), \qquad k = 1, \ldots, n. \]

The resulting linear autonomous system is called the linearization of $\dot{x} = F(x)$ at
$x_0$, and we can write it as $\dot{x} = F'(x_0)(x - x_0) + F(x_0)$ using the derivative matrix

\[ F'(x_0) = \begin{pmatrix} \dfrac{\partial F_1}{\partial x_1}(x_0) & \cdots & \dfrac{\partial F_1}{\partial x_n}(x_0) \\ \vdots & & \vdots \\ \dfrac{\partial F_n}{\partial x_1}(x_0) & \cdots & \dfrac{\partial F_n}{\partial x_n}(x_0) \end{pmatrix}. \]

For simplicity we work with the homogeneous equation $\dot{x} = F'(x_0)x$ at $x_0$, as we
did in Section 4A, and we'll see that the eigenvalues of the constant matrix $F'(x_0)$
are the key to our criteria.
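When the partial derivatives are tedious to compute by hand, the derivative matrix $F'(x_0)$ can be approximated by central differences. The following is a minimal sketch under stated assumptions: the function name `jacobian`, the step size, and the sample field $F(x, y) = (x, -y + x^2)$ are chosen for illustration, and a finite-difference matrix is only an approximation to the exact derivative matrix.

```python
def jacobian(F, x0, h=1e-6):
    """Approximate the derivative matrix F'(x0); entry [k][j] estimates dF_k/dx_j."""
    n = len(x0)
    columns = []
    for j in range(n):
        plus, minus = list(x0), list(x0)
        plus[j] += h
        minus[j] -= h
        # Central difference in the j-th coordinate gives column j of F'(x0).
        columns.append([(p - m) / (2 * h) for p, m in zip(F(plus), F(minus))])
    return [[columns[j][k] for j in range(n)] for k in range(len(columns[0]))]

def F(v):
    """Sample nonlinear vector field F(x, y) = (x, -y + x^2)."""
    x, y = v
    return [x, -y + x * x]

J = jacobian(F, [0.0, 0.0])
print(J)   # close to [[1, 0], [0, -1]]
```

At the origin the quadratic term contributes nothing to the derivative matrix, so the linearization keeps only the linear part of $F$.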

EXAMPLE 5 A 2-dimensional system and its linearization are respectively

\[ \dot{x} = x, \quad \dot{y} = -y + x^2 \qquad\text{and}\qquad \dot{x} = x, \quad \dot{y} = -y. \]

For both the nonlinear system and its linearization there is a single equilibrium point
at $(x_0, y_0) = (0, 0)$. The characteristic equation of the 2-by-2 matrix is
$(1 - \lambda)(-1 - \lambda) = \lambda^2 - 1 = 0$, with roots $\lambda = \pm 1$. Thus the basic solutions are linear
combinations of $e^{t}$ and $e^{-t}$ and so are of the unstable saddle type in Display II in
Section 4A. Theorem 4.2 guarantees that the solutions of the nonlinear system will
be similarly unstable.
Aside from the equilibrium point at the origin, in Figure 13.4(a) there are four
other exceptional phase curves in the phase portrait: the positive and negative y axes
directed toward the origin, and the left and right halves of the parabola $y = \tfrac13 x^2$,
both directed away from the origin. To derive the equations that these parabolic
trajectories satisfy you can solve the first-order linear equation

\[ \frac{\dot{y}}{\dot{x}} = \frac{dy}{dx} = \frac{-y + x^2}{x}, \qquad x \neq 0. \]

FIGURE 13.4 (a) Nonlinear saddle. (b) Lorenz trajectory; $\beta = \tfrac83$, $\rho = 28$, $\sigma = 10$.

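EXAMPLE 6 The general Lorenz system is

\[ \dot{x} = -\sigma x + \sigma y, \qquad \dot{y} = \rho x - y - xz, \qquad \dot{z} = xy - \beta z, \]

and it has been studied extensively in recent years because of the apparently chaotic
behavior of its solution trajectories near equilibrium. The equilibrium solutions are
just the solutions to the algebraic system we get by setting the right-hand sides equal
to zero. Looking at the special case $\beta = \sigma = 1$, $\rho = 2$, we solve

\[ -x + y = 0, \qquad 2x - y - xz = 0, \qquad xy - z = 0. \]

Noting from the first equation that $x = y$, we then see that there are just three
solutions: (1, 1, 1), (0, 0, 0) and (-1, -1, 1). The derivative matrix $F'(x, y, z)$ for
the linearization at $(x, y, z)$ is

\[
\begin{pmatrix}
\dfrac{\partial(-x + y)}{\partial x} & \dfrac{\partial(-x + y)}{\partial y} & \dfrac{\partial(-x + y)}{\partial z} \\
\dfrac{\partial(2x - y - xz)}{\partial x} & \dfrac{\partial(2x - y - xz)}{\partial y} & \dfrac{\partial(2x - y - xz)}{\partial z} \\
\dfrac{\partial(xy - z)}{\partial x} & \dfrac{\partial(xy - z)}{\partial y} & \dfrac{\partial(xy - z)}{\partial z}
\end{pmatrix}
= \begin{pmatrix} -1 & 1 & 0 \\ 2 - z & -1 & -x \\ y & x & -1 \end{pmatrix}.
\]

Evaluating this matrix at $(x, y, z) = (1, 1, 1)$ gives

\[ F'(1, 1, 1) = \begin{pmatrix} -1 & 1 & 0 \\ 2 - z & -1 & -x \\ y & x & -1 \end{pmatrix}_{(1,1,1)} = \begin{pmatrix} -1 & 1 & 0 \\ 1 & -1 & -1 \\ 1 & 1 & -1 \end{pmatrix}. \]

Similarly, we get the linearization matrices at (0, 0, 0) and (-1, -1, 1) by evaluating
the same derivative matrix at these two additional points:

\[ F'(0, 0, 0) = \begin{pmatrix} -1 & 1 & 0 \\ 2 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}; \qquad F'(-1, -1, 1) = \begin{pmatrix} -1 & 1 & 0 \\ 1 & -1 & 1 \\ -1 & -1 & -1 \end{pmatrix}. \]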
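The equilibrium points and the three linearization matrices for the special case $\beta = \sigma = 1$, $\rho = 2$ can be tabulated with a few lines of code. This is a sketch for checking the arithmetic only; the function names `F` and `F_prime` are introduced here for illustration.

```python
def F(x, y, z):
    """Right-hand sides of the special Lorenz system (beta = sigma = 1, rho = 2)."""
    return (-x + y, 2 * x - y - x * z, x * y - z)

def F_prime(x, y, z):
    """Derivative matrix of F at (x, y, z), computed by hand from the partial derivatives."""
    return [[-1, 1, 0],
            [2 - z, -1, -x],
            [y, x, -1]]

equilibria = [(1, 1, 1), (0, 0, 0), (-1, -1, 1)]
for point in equilibria:
    assert F(*point) == (0, 0, 0)      # each point really is an equilibrium
    print(point, F_prime(*point))
```

The eigenvalues of each printed matrix then decide the stability of the corresponding equilibrium point.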
The next theorem draws conclusions about stability from the eigenvalues of derivative
matrices. For a general n-dimensional system the number of possibilities is very
large, so we list only a few general categories. We omit the detailed proof.

4.2 Linearization Theorem. Assume that the real-valued coordinate functions
$F_k$ of $F$ are continuously differentiable and that the system $\dot{x} = F(x)$ has an
equilibrium point at $x_0$. The equilibrium solution $x_0$ for the system is asymptotically stable
if every eigenvalue of the derivative matrix $F'(x_0)$ has negative real part. The point
$x_0$ is unstable if $F'(x_0)$ has at least one eigenvalue with positive real part, and is
called a saddle point if both signs occur. If all eigenvalues of $F'(x_0)$ have real part
zero, we can draw no definite conclusion, and the equilibrium $x_0$ may be stable or
may be unstable. Exercise 5 has an example of each possibility.

EXAMPLE 7 The special Lorenz system treated in Example 6 has an equilibrium at
(1, 1, 1) with characteristic polynomial $P(\lambda) = \det(F'(1, 1, 1) - \lambda I)$. From Example 6
we see that

\[ P(\lambda) = \det\begin{pmatrix} -1-\lambda & 1 & 0 \\ 1 & -1-\lambda & -1 \\ 1 & 1 & -1-\lambda \end{pmatrix}. \]

Computing the determinant, we get $P(\lambda) = -\bigl((\lambda + 1)^3 + 1\bigr)$, and we see by inspection
that $\lambda_1 = -2$ is a root. Division by $\lambda + 2$ gives $P(\lambda) = -(\lambda + 2)(\lambda^2 + \lambda + 1)$. The
roots of the quadratic factor are $\lambda_2 = (-1 + \sqrt{3}\,i)/2$ and $\lambda_3 = (-1 - \sqrt{3}\,i)/2$. Thus
the real parts of all three eigenvalues are negative, so we conclude from Theorem 4.2
that (1, 1, 1) is an asymptotically stable equilibrium solution.

EXAMPLE 8 Continuing with the special Lorenz system, we examine the equilibrium at
(0, 0, 0). The relevant characteristic polynomial is evaluated at (0, 0, 0) and is
$P(\lambda) = \det(F'(0, 0, 0) - \lambda I)$, or

\[ P(\lambda) = \det\begin{pmatrix} -1-\lambda & 1 & 0 \\ 2 & -1-\lambda & 0 \\ 0 & 0 & -1-\lambda \end{pmatrix}. \]

The determinant is $-(\lambda + 1)^3 + 2\lambda + 2 = -(\lambda + 1)(\lambda^2 + 2\lambda - 1)$. The roots are
$\lambda_1 = -1$, $\lambda_2 = -1 - \sqrt{2}$ and $\lambda_3 = -1 + \sqrt{2}$. Since $\lambda_3 > 0$ we conclude from
Theorem 4.2 that (0, 0, 0) is an unstable equilibrium. This point is a saddle point,
since there are two negative eigenvalues that contribute to making the other basic
solutions tend to zero. Checking out the equilibrium at (-1, -1, 1) is left as an
exercise.

EXAMPLE 9 The general Lorenz system

\[ \dot{x} = \sigma(y - x), \quad \dot{y} = \rho x - y - xz, \quad \dot{z} = -\beta z + xy, \qquad \beta, \rho, \sigma \text{ positive constants}, \]

has been studied extensively with the aim of understanding trajectories such as the
one shown in Figure 13.4(b). With the choice of parameters shown there, the
equilibrium points, aside from the one at the origin, are at $(\pm 6\sqrt{2}, \pm 6\sqrt{2}, 27)$. The
trajectory shown in the figure has initial point (2, 2, 21). It winds around in the area
of one equilibrium an apparently random number of times, then switches toward the
other equilibrium with similar behavior, continuing back and forth unpredictably.
The eigenvalues of the linearizations are the same at the two equilibrium points;
they are approximately as follows: $\lambda_1 \approx -13.85$, $\lambda_2, \lambda_3 \approx 0.09 \pm 10.19i$. Thus
these two points are saddle points, and each one has a surface containing it on which
all trajectories gradually spiral away from the point, as well as a trajectory at a
positive angle to the surface that converges to the point. The typical trajectory behavior
lies somewhere between these extremes, winding away from one equilibrium until
it is attracted by the other, then reversing. The number of circuits about each point,
and the path taken, is very sensitive to minute changes in the initial conditions.
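The approximate eigenvalues quoted above can be reproduced from the linearization at the equilibrium $(6\sqrt{2}, 6\sqrt{2}, 27)$. The sketch below assumes the parameter values $\sigma = 10$, $\rho = 28$, $\beta = 8/3$ (consistent with the equilibrium coordinates given in the text), builds the characteristic polynomial from the trace, principal minors, and determinant of the derivative matrix, and refines the quoted values with Newton's method. The helper names are hypothetical.

```python
import math

def char_coeffs(J):
    """Coefficients (c2, c1, c0) of det(lam I - J) = lam^3 + c2 lam^2 + c1 lam + c0."""
    (a, b, c), (d, e, f), (g, h, i) = J
    trace = a + e + i
    minors = (a * e - b * d) + (a * i - c * g) + (e * i - f * h)
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return -trace, minors, -det

def newton_root(coeffs, guess, iterations=50):
    """Refine a (possibly complex) root of the cubic by Newton's method."""
    c2, c1, c0 = coeffs
    lam = complex(guess)
    for _ in range(iterations):
        p = lam**3 + c2 * lam**2 + c1 * lam + c0
        dp = 3 * lam**2 + 2 * c2 * lam + c1
        lam -= p / dp
    return lam

sigma, rho, beta = 10.0, 28.0, 8.0 / 3.0        # assumed parameter values
a = math.sqrt(beta * (rho - 1))                 # equals 6 * sqrt(2)
# Derivative matrix [[-sigma, sigma, 0], [rho - z, -1, -x], [y, x, -beta]] at (a, a, rho - 1).
J = [[-sigma, sigma, 0.0], [1.0, -1.0, -a], [a, a, -beta]]
coeffs = char_coeffs(J)
lam1 = newton_root(coeffs, -13.85)
lam2 = newton_root(coeffs, complex(0.09, 10.19))
print(lam1, lam2)
```

The real root is strongly negative while the complex pair has a small positive real part, which matches the description of trajectories slowly spiraling away within a surface while being attracted along a transverse direction.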

Edward Lorenz began the study of the Lorenz system by using it to approximate
more complicated differential equations in the study of weather patterns; hence the
interest in the system's sensitivity to initial conditions, sometimes called the butterfly
effect. For more details about the system see Colin Sparrow, The Lorenz Equations:
Bifurcations, Chaos, and Strange Attractors, Springer-Verlag (1982).

EXERCISES

1. Assume $A$ is not the $0$ matrix and let $\dot{\mathbf{x}} = A\mathbf{x}$ be a 2-dimensional autonomous system for which $\det A = 0$. Show that the system has zero for an eigenvalue and that the equilibrium solutions make up an entire line in $\mathbb{R}^2$.

2. The nonautonomous system

$$\dot{x} = (1 - t)x - ty, \qquad \dot{y} = tx + (1 - t)y$$

exhibits a change of character at its lone equilibrium point. This system appears also in Example 3 of Chapter 12, Section 1.
(a) Show that for each real number $t$ the system has a single equilibrium point at $(x, y) = (0, 0)$.
(b) Show that while $t > 1$ solutions behave in a stable manner near the equilibrium point, and that behavior is unstable when $t < 1$.

3. Show that the equilibrium points of the Lorenz system

$$\dot{x} = \sigma(y - x), \qquad \dot{y} = \rho x - y - xz, \qquad \dot{z} = -\beta z + xy,$$

with $\beta$, $\rho$, $\sigma$ positive constants, are $(0, 0, 0)$ and, if $\rho > 1$, the two points $\left(\pm\sqrt{\beta(\rho - 1)},\ \pm\sqrt{\beta(\rho - 1)},\ \rho - 1\right)$.

4. (a) According to Theorem 4.2, the nonlinear system $\dot{x} = x$, $\dot{y} = -y + x^2$ of Example 5 has an unstable saddle equilibrium at $(x, y) = (0, 0)$. Solve the system explicitly to show the unstable saddle behavior of solutions near $(0, 0)$.
(b) Show that the trajectories for which $x \ne 0$ satisfy the linear equation $dy/dx = (-y + x^2)/x$, and identify the equations $y = y(x)$ of the two parabolic trajectories.

5. Linearized analysis is inadequate for some systems, as claimed in Theorem 4.2, and we can see this by looking at the family of nonlinear systems

$$\dot{x} = y + ax(x^2 + y^2), \qquad \dot{y} = -x + ay(x^2 + y^2),$$

where $a$ is constant.
(a) Show that the only equilibrium point is $(x_0, y_0) = (0, 0)$, regardless of the value of $a$.
(b) Show that the linearized system associated with $(0, 0)$ is $\dot{x} = y$, $\dot{y} = -x$, and that the origin is a stable center as $t \to \infty$. Note that $(0, 0)$ is also a stable [...]
(c) Show that in polar coordinates the given system takes the form $\dot{r} = ar^3$, $\dot{\theta} = -1$. [Hint: Apply $d/dt$ to the equations $x = r\cos\theta$, $y = r\sin\theta$.]
(d) Solve the polar-form system in part (c), and show that if $a > 0$ the equilibrium point is unstable, and that if $a < 0$ it is stable. Thus the parameter value $a = 0$ is called a bifurcation point for the system, because the stability of the system at the equilibrium point changes in a fundamental way as $a$ increases through zero.

6. A nonlinear pendulum with frictional damping has a displacement angle $\theta = \theta(t)$ that satisfies $\ddot{\theta} + (k/(lm))\dot{\theta} + (g/l)\sin\theta = 0$, where $k > 0$ is constant.
(a) Show that the equation for $\theta$ is equivalent, with $x = \theta$, $y = \dot{\theta}$, to the first-order system

$$\dot{x} = y, \qquad \dot{y} = -\frac{g}{l}\sin x - \frac{k}{lm}\,y.$$

(b) Show that the equilibrium points of the system in part (a) are independent of $k$.
(c) Show that the unstable equilibrium points are all saddles.
(d) An equilibrium solution $\mathbf{x}_0$ of a nonlinear system is a node or a spiral point if $\mathbf{x}_0$ is a node or spiral point for the linearization at $\mathbf{x}_0$. Show that the stable equilibrium points are nodes if $k^2 > 4glm^2$ and asymptotic spirals if $k^2 < 4glm^2$. Why does this distinction make sense physically?

7. Consider the nonlinear system

$$\dot{x} = A - Bx - x + x^2 y, \qquad \dot{y} = Bx - x^2 y, \qquad A \ne 0.$$

(a) Find the system's single equilibrium point.
(b) Assume that $A > 0$ and $B > A^2 + 1$. Trajectories starting sufficiently near, but not at, the equilibrium point exhibit limit cycle behavior in that they approach a closed trajectory. Investigate this claim by using a graphical-numerical method.

8. The Lotka-Volterra equations that model the interaction between the sizes $H$ and $P$ of certain host and parasite populations are

$$\frac{dH}{dt} = (a - bP)H, \qquad \frac{dP}{dt} = (cH - d)P, \qquad a, b, c, d > 0.$$
(a) Show that for $H > 0$ and $P > 0$, the only equilibrium point is $(H_0, P_0) = (d/c, a/b)$, and find the associated linearized system.
(b) Show that the equilibrium solution of the linearized system is a stable center.
(c) Discuss the equilibrium solution of the nonlinear system at $(H, P) = (0, 0)$. Do your conclusions make sense, given the interpretation of $P$ and $H$ as sizes of parasite and host populations respectively?

9. Find all equilibrium points of $\dot{x} = -y(1 - x^2 - y^2)$, $\dot{y} = x(1 - x^2 - y^2)$. Then show that all other trajectories are circles and that no trajectory converges to an equilibrium point. [Hint: $dy/dx = -x/y$.]

10. Find all equilibrium points of $\dot{x} = x(2 - x - y)$, $\dot{y} = y(x - 1)$ and discuss their stability. Then by hand or using computer graphics, make a sketch of some typical trajectories near the equilibrium points.

11. Discuss the stability of the equilibrium solutions of the system

$$\dot{x} = -x(x^2 + y^2 - 1), \qquad \dot{y} = -y(x^2 + y^2 + 1).$$

12. The system $\dot{x} = -x(x^2 + y^2 - 1)$, $\dot{y} = -y(x^2 + y^2 + 1)$ reduces to $\dot{x} = -x(x^2 - 1)$ on the $x$-axis and to $\dot{y} = -y(y^2 + 1)$ on the $y$-axis.
(a) Solve the $y$-equation explicitly to show that a trajectory with $y(0) = y_0$ on the $y$-axis converges to $(0, 0)$ as $t \to \infty$.
(b) Solve the $x$-equation explicitly to show that if $|x_0| < 1$ a trajectory starting at $x_0$ on the $x$-axis converges to either $(1, 0)$ or $(-1, 0)$ as $t \to \infty$ and converges to the origin as $t \to -\infty$.

13. Consider the system $\dot{x} = x(1 - x^2 - y^2) - y$, $\dot{y} = x + y(1 - x^2 - y^2)$.
(a) Show that the system has circular trajectories of radius 1.
(b) Show that all solutions $(x, y)$ satisfy $(d/dt)(y/x) = 1 + (y/x)^2$ when $x \ne 0$.
(c) Use part (b) to show the polar angle $\theta$ of a point on a nonconstant trajectory satisfies $\theta = \arctan(y/x) = t + c$ and hence that all such trajectories wind counterclockwise infinitely often around the origin.
(d) Show that all solutions $(x, y)$ satisfy $x\dot{x} + y\dot{y} = (x^2 + y^2)(1 - x^2 - y^2)$.
(e) Use part (d) to show that the polar radius $r$ of a point on a trajectory satisfies $dr/dt = r(1 - r^2)$, with solutions $r = ke^t/\sqrt{k^2 e^{2t} \pm 1}$, the sign depending on whether $0 < r < 1$ or $r > 1$. Show that the trajectories of these solutions approach the circular trajectory $x^2 + y^2 = 1$ as $t \to \infty$.

14. Consider the system

$$\dot{x} = x(1 - x^2 - y^2)^3 - y(1 - x^2 - y^2)^2 - y^3,$$
$$\dot{y} = x(1 - x^2 - y^2)^2 + y(1 - x^2 - y^2)^3 + xy^2.$$

(a) Show that all solutions satisfy $x\dot{x} + y\dot{y} = (x^2 + y^2)(1 - x^2 - y^2)^3$ and hence that the polar radius $r$ of a point on a trajectory satisfies $dr/dt = r(1 - r^2)^3$.
(b) Show that the system has trajectories on the unit circle satisfying $\dot{x} = -y^3$, $\dot{y} = xy^2$. Explain why the unit semicircle with $y > 0$ has trajectories with $(1, 0)$ and $(-1, 0)$ as limit points. What can you say about the semicircle with $y < 0$?

15. The van der Pol equation $\ddot{x} + a(x^2 - 1)\dot{x} + x = 0$ is equivalent to the system $\dot{x} = y$, $\dot{y} = -x - a(x^2 - 1)y$.
(a) Find the linearization of the system near $(x_0, y_0) = (0, 0)$.
(b) Discuss the behavior of solutions near $(x_0, y_0) = (0, 0)$ and their dependence on the constant $a > 0$. What happens if $a = 0$ or if $a < 0$?

Chapter 13 REVIEW

The initial-value problems in Exercises 1 to 12 are


solvable (i) by the elimination method of the previous 2. X = ( i ; ) + ( ;r ) ,
x x(O) =( !)
chapter, (ii) by computing eigenvalues and eigenvectors,
(iii) by computing an exponential matrix, or (iv) by an
ad hoc approach involving an educated guess. You are
3. x= ( ~ =~ ) + ( ;;r ) ,
x x(O) = ( ~ )

asked to use whatever approach to a particular problem


seems most efficient. 4. x= ( -b )x + ( ~ ) . = ( ~ )
~ x(O)

1. x= ( i =~ ) x +( :r ) , x(O) = ( ~ ) 5. x=( i -; )x+( ;r ), x(l)=(-~)



=( ~ )
14. (a) Find an equivalent first-order system of dimension
sin t ) , x(O) 2n for the n-dimensional second-order initial-value
problem $\ddot{\mathbf{x}} = \mathbf{x}$, $\mathbf{x}(0) = \mathbf{x}_0$, $\dot{\mathbf{x}}(0) = \mathbf{z}_0$, and write
the resulting $2n$-dimensional system as an uncoupled
sequence of 2-dimensional coupled systems.
(b) Solve the given second-order system for the case
n = I and deduce the solution to the general case
from part (a) and this special case.
(c) Deduce from the result of part (b) the form of the
exponential matrix for the $2n$-dimensional system of
-1 -1
part (a).
I -1

n
-1 1 15. (a) Find an equivalent first-order system of dimension
2n for the n-dimensional second-order initial-value
1
problem $\ddot{\mathbf{x}} = -\mathbf{x}$, $\mathbf{x}(0) = \mathbf{x}_0$, $\dot{\mathbf{x}}(0) = \mathbf{z}_0$, and write
-2 ] ) x, x(O) - ( the resulting Zn-dimensional system as an uncoupled
I
sequence of 2-dimensional coupled systems.

D•0)• D
12. i=x+e1,x(O)=en,n:::::2
+ x(O) - (
(b) Solve the given second-order system for the case
n = I and deduce the solution to the general case
from part (a) and this special case.
(c) Deduce from the result of part (b) the form of the
exponential matrix for the $2n$-dimensional system of
part (a).
13. Find the general solution of the second-order system $\ddot{x} = x + y$, $\ddot{y} = x - y$ by first writing it as a first-order system of dimension 4 in matrix form and finding an exponential matrix in complex form.

16. If all eigenvalues of the $n$-by-$n$ matrix $A$ are real, show that the system $t\dot{\mathbf{x}} = A\mathbf{x}$ has solutions $\mathbf{x}(t) = t^{\lambda}\mathbf{u}$ for $t > 0$, where $\mathbf{u}$ is an eigenvector of $A$ with eigenvalue $\lambda$.
CHAPTER 14

INFINITE SERIES

The study of numerical infinite series is an important branch of the study of numerical
approximation. In the first section we'll treat limits of sequences somewhat informally.
Later on we'll use calculus techniques to deal with sequential limits. Section 1
introduces the idea of convergence of a series, and Section 2 complements it using
Taylor expansions. Additional sections deal with the more technical aspects of convergence.
The chapter closes with power series solutions of ordinary differential
equations and an introduction to Fourier series and the 1-dimensional heat and wave
equations.

SECTION 1 EXAMPLES AND DEFINITIONS


Infinite series generalize finite sums to infinite sums of the form

$$a_1 + a_2 + a_3 + \cdots,$$

where the three dots indicate that a term $a_k$ is included for each of the infinitely
many positive integers $k = 1, 2, 3, \ldots$. The decimal expansion

$$\frac{1}{3} = 0.3333\cdots$$

is really an example of an infinite series, because suitably rewritten it's an infinite
sum in the form

$$\frac{1}{3} = \frac{3}{10} + \frac{3}{100} + \frac{3}{1000} + \frac{3}{10{,}000} + \cdots = \frac{3}{10} + \frac{3}{10^2} + \frac{3}{10^3} + \frac{3}{10^4} + \cdots.$$

In both decimal and sum form the dots at the end mean that the most obvious pattern
is to be carried on indefinitely. In principle this convention could lead to ambiguity,
but we'll be more specific when necessary. To write infinite series more briefly, and
less ambiguously, recall the $\Sigma$-notation for finite sums, defined by

$$\sum_{k=m}^{n} a_k = a_m + a_{m+1} + \cdots + a_n, \qquad m \le n.$$

For example,

$$\sum_{k=1}^{11} k = 1 + 2 + 3 + \cdots + 11.$$

A natural extension allows us to write a general infinite series in more compressed
form like this:

$$a_1 + a_2 + a_3 + \cdots = \sum_{k=1}^{\infty} a_k.$$
Writing formulas like this requires us to have a formula for the general term $a_k$.
Thus, for example,

$$\frac{3}{10} + \frac{3}{10^2} + \frac{3}{10^3} + \frac{3}{10^4} + \cdots = \sum_{k=1}^{\infty} \frac{3}{10^k}.$$

Assigning an appropriate numerical value to an infinite series is usually done by
taking the limit as $n \to \infty$ of the $n$th partial sum $s_n = \sum_{k=1}^{n} a_k$. Thus by definition,
the sum $s$ of a series is

$$\sum_{k=1}^{\infty} a_k = \lim_{n\to\infty} \sum_{k=1}^{n} a_k = s,$$

if the limit exists. If the series has a sum in this sense, then the series is said to
converge to $s$, and if the limit fails to exist, the series is said to diverge.
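The definition can be tried out numerically. The following is only a sketch in Python, with the helper names `partial_sum` and `decimal_term` chosen here for illustration; it computes exact partial sums of the series $3/10 + 3/10^2 + \cdots$ and shows how far each one is from $1/3$.

```python
from fractions import Fraction

def partial_sum(term, n):
    """Return s_n = a_1 + a_2 + ... + a_n for the series with kth term term(k)."""
    return sum(term(k) for k in range(1, n + 1))

def decimal_term(k):
    # kth term of the series 3/10 + 3/10^2 + 3/10^3 + ...
    return Fraction(3, 10**k)

for n in (1, 2, 5, 10):
    s_n = partial_sum(decimal_term, n)
    # The gap 1/3 - s_n shrinks by a factor of 10 with each added term.
    print(n, float(s_n), float(Fraction(1, 3) - s_n))
```

Exact fractions are used so that the shrinking gap is not blurred by floating-point rounding.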

EXAMPLE 1 The geometric series with ratio $x$ is

$$\sum_{k=0}^{\infty} x^k = 1 + x + x^2 + \cdots.$$

There is a simple formula for the partial sums if $x \ne 1$, given by

$$s_n = 1 + x + x^2 + \cdots + x^{n-1} = \frac{1 - x^n}{1 - x}.$$

To verify the formula, multiply both sides by $1 - x$ and note that all but two terms
cancel on the left. If $0 < |x| < 1$, then $x^n$ tends to $0$ as $n \to \infty$; to see this take
$n > \ln\varepsilon / \ln|x|$ to make $|x|^n < \varepsilon < 1$. Thus for all $x$ for which $|x| < 1$ the sum is

$$\sum_{k=0}^{\infty} x^k = \lim_{n\to\infty} \frac{1 - x^n}{1 - x} = \frac{1}{1 - x}.$$

If $|x| > 1$, then $x^n$ is unbounded as $n \to \infty$, so the sum fails to exist. If $x = -1$,
the formula for $s_n$ gives $1$ if $n$ is odd and $0$ if $n$ is even, so there is no limit then
either. Finally, if $x = 1$ the formula for $s_n$ is invalid, but in that case we see directly
that $s_n = n$, which tends to $\infty$ as $n \to \infty$.
We can get some feeling for convergence of a sum by looking at specific numerical
examples. In each of the next three examples we can find a simple formula for the
partial sums, something that is not possible for many important infinite series.
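As a quick numerical check of the geometric partial-sum formula, here is a small Python sketch. The function names are illustrative only; the sums follow the convention $s_n = 1 + x + \cdots + x^{n-1}$ used in the formula.

```python
def geometric_partial_sum(x, n):
    """s_n = 1 + x + ... + x**(n-1), added term by term."""
    return sum(x**k for k in range(n))

def geometric_closed_form(x, n):
    """Closed form (1 - x**n) / (1 - x), valid only for x != 1."""
    return (1 - x**n) / (1 - x)

for x in (0.5, -0.5, -1, 2):
    # Ratios with |x| < 1 settle down; x = -1 oscillates; x = 2 grows without bound.
    sums = [geometric_partial_sum(x, n) for n in (1, 2, 3, 10, 11)]
    print(x, sums)
```

Comparing the printed rows for the four ratios gives a concrete picture of the convergent, oscillating, and unbounded cases.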

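EXAMPLE 2 Consider the geometric series with ratio $x = \frac{1}{3}$. Using Example 1, we have

$$\sum_{k=0}^{\infty} \frac{1}{3^k} = 1 + \frac{1}{3} + \frac{1}{3^2} + \frac{1}{3^3} + \cdots
= \lim_{n\to\infty} \sum_{k=0}^{n} \frac{1}{3^k}
= \lim_{n\to\infty} \frac{1 - (1/3)^{n+1}}{1 - (1/3)}
= \frac{1 - 0}{1 - (1/3)} = \frac{3}{2}.$$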
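EXAMPLE 3 Here is a particularly simple series.

$$\sum_{k=1}^{\infty} \left(\frac{1}{k} - \frac{1}{k+1}\right)
= \lim_{n\to\infty} \left[\left(1 - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \cdots + \left(\frac{1}{n} - \frac{1}{n+1}\right)\right]$$

$$= \lim_{n\to\infty} \left(1 - \frac{1}{n+1}\right) = 1 - \lim_{n\to\infty} \frac{1}{n+1} = 1 - 0 = 1.$$

This is an example of what is called a "telescoping" series, because the interior terms
cancel in the partial sum. Since $(1/k) - (1/(k+1)) = 1/(k(k+1))$, the series can also
be written

$$\sum_{k=1}^{\infty} \frac{1}{k(k+1)} = 1.$$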
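The telescoping cancellation can be confirmed with exact arithmetic. This is a minimal Python sketch with the function name `telescoping_partial_sum` chosen here for illustration; it adds the terms $1/(k(k+1))$ directly and compares the result with $1 - 1/(n+1)$.

```python
from fractions import Fraction

def telescoping_partial_sum(n):
    """Add 1/(k(k+1)) for k = 1, ..., n using exact fractions."""
    return sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))

for n in (1, 2, 10, 100):
    s_n = telescoping_partial_sum(n)
    # Each partial sum should agree with the telescoped value 1 - 1/(n+1).
    print(n, s_n, s_n == 1 - Fraction(1, n + 1))
```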

EXAMPLE 4 The infinite series $\sum_{k=1}^{\infty} k = 1 + 2 + 3 + \cdots$ is divergent. To see this, note that the
$n$th partial sum is the sum of the first $n$ integers:

$$s_n = 1 + 2 + \cdots + n = \frac{n(n + 1)}{2}.$$

Hence $\lim_{n\to\infty} s_n = \infty$, so we agree to say that the series "diverges to $+\infty$."

EXAMPLE 5 The geometric series with $x = -2$ is

$$\sum_{k=0}^{\infty} (-2)^k = 1 - 2 + 2^2 - 2^3 + \cdots,$$

and the $n$th partial sum is, according to the formula in Example 1,

$$s_n = \frac{1 - (-2)^n}{1 - (-2)} = \frac{1 + (-1)^{n+1} 2^n}{3}.$$

As $n \to \infty$, the numerator oscillates between being large and positive and large
and negative. Since the partial sums do not tend to a fixed finite value, the series
diverges.

EXERCiSES

5 6
'£ (~--
II
1
In Exercises l to 4, use the definition L ak = 1
• k=I + )
k k 1
2. I: k - L (k - o
k=2 k=o3
k=m 4 II
am + am+ 1 + · · · + a11 for the I:-notation to write out and 3. I: 2-k 4. 6 I: (- 1/
simplify the sums. k=O k=I
i Exercises 5 to 10, find a formula for the kth tenn, The symbol kl, called k-factorial is defined by
= 1, 2, 3, ... , of the infinite series
that is consistent
vith the given part of the series.
I I 1
kl-{
.-
l,
k(k-1)· · ·3·2·1,
k=O,
k~l.
5· 1 + 2 + 3 + 4 + ...
1
6.1--+ - --+···
l I Thus 0! = 1, l! = I, 2! = 2-1 = 2, 3! = 3-2·1 = 6, and
2 4 8 so on. Rewrite each of the infinite series 26 to 31 using
I I 1 factorial notation. Then write out the first three terms.
7 · 1-3 + 2·4 + 3.5 + ... 00 3-k
26. E - -
I 1 I k=i1·2· · ·k
8' 3 + 2.3 2 + 3.33 + ...
4 5 6
27. I; l
9' 2.32 k=I 2-4 · · · (2k)
+ 3.3 3 + 4.3 4 + ...
10. -
I
- 2-3
1
- + -3.4 - ···
I 28. I: 2k
k=I 2·4-6 · · · (2k)
1-2
In Exercises 11 to 14, verify that each of the partial 29. I; l
sum formulas is correct. Then find the sum of the k=l k(k + l)(k + 2) · · · (2k - 1)(2k)
corresponding infinite series as n ~ oo if it exists. 00 1.3.5 .. · (2k - I)
30. E-----
n 1 n k=I 2-4·6 · · · (2k)
ll. k~I 2k(2k + 2) = 4(n + I) 00
(-Ii
31. E-- ---
n 3 1-3 · · · (2k + I)
k=o
12. Ek =4-4-n
k==O 4 In Exercises 32 to 37, write out Sn for n = 1, 2, and 3.
n Also find limn-H)O Sn
13. }: 2k = n (n + 1)
k=I
32. Sn = ( 1 + ~)
14. E -- -
n (2k + 1 1) =2- 2-n
k=O 2k 3n 2 -1 2"
34. Sn= -2-- 35. Sn= n+I
4n +n 3
In Exercise 15 to 20, what is the sum of the geometric 3n +4 n!O + 1
series? 36. S n = - - 37. Sn= -n-(n_9_+_1_)

E (l)k
n
15. 00
- 16. E
oo ( --
3)k
Verify the equations in Exercises 38 to 41.
k=O 6 k=o 4
00 I 1 00
I 1
11. E -
oo ( I )k 18. E
oo (
-1-2 )k , x ¥ o 38. k~l 10k =9 39. k~2 3k =6
k=I l + lT k=O 1 +x
40
00
( 1 )k 1 4
00
l I
19.
00
E e-2k
00
20. }:(0.0li
· k'f-1 -4 = -5 1. k~3 k(k + 1) =3
k=O k=O 42. Find an infinite series with positive terms that converges
In Exercises 21 to 24, write the repeating decimal expan- to 3.
sions as an infinite geometric series and find the sum of 43. Find an infinite series with alternating positive and nega-
each one. tive terms that converges to 3.

21. 0.888~ 22. 0.10101010 44. Assume that the series }:~ 1 ak = s is convergent. Show
that L~m
ak is also convergent if m > I. What is the
23. 0.123123123 24. 1.23452345 sum of the second series in terms of s and the terms ak?
25. Prove that an infinite decimal expansion that repeats 45. Cantor set. From the interval [O, l], the open middle third
periodically from some point on must represent a rational
(j, j) is deleted. Then the open middle third is deleted
number. from each of the two remaining closed intervals, then

the open middle third from each of the four remaining (a) Find a formula for Cn, the sum of the lengths ,
closed intervals, and so on. The set C remaining after the intervals remaining after n steps.
entire infinite sequence of deletions is called the Cantor (b) Show that limn-+oo Cn = 0, showing that C has
middle-third set. length zero.

SECTION 2 TAYLOR SERIES


2A Taylor Polynomials
A polynomial $f$ of degree $n$ is just a sum of numerical multiples of powers of $x$:
$f(x) = c_0 + c_1 x + \cdots + c_n x^n$. To examine the behavior of $f$ near a point $x = a$, it's useful
to be able to write $f(x)$ as a polynomial in powers of $(x - a)$. The next theorem
gives a simple formula for the coefficients $a_k$ in terms of derivatives $f^{(k)}(a)$ of $f(x)$
at $a$ and the factorial function defined by $k! = 1\cdot 2\cdot 3\cdots k$ for $k \ge 1$ and $0! = 1$.

2.1 Theorem. If $f(x)$ is a polynomial of degree $n$ of the form

$$f(x) = a_0 + a_1(x - a) + a_2(x - a)^2 + \cdots + a_n(x - a)^n, \quad\text{then}\quad a_k = \frac{1}{k!} f^{(k)}(a).$$

Thus $a_0 = f(a)$, $a_1 = f'(a)$, $a_2 = \frac{1}{2} f''(a)$, $a_3 = \frac{1}{6} f'''(a)$, $a_4 = \frac{1}{24} f^{(4)}(a)$, and
so on.
Proof. To prove the formula for the coefficients $a_k$ we note first that $k$ successive
differentiations of the terms $a_j(x - a)^j$ of $f(x)$ with $0 \le j \le k - 1$ give zero,
starting with the constant $a_0$ and going through the term of degree $k - 1$. All the
remaining terms but the one of degree $k$ are zero at $x = a$, because each contains
a positive power of $(x - a)$. Hence the only surviving term in the computation of
$f^{(k)}(a)$ is

$$\frac{d^k}{dx^k}\left[a_k (x - a)^k\right] = 1\cdot 2\cdots (k - 1)\cdot k\, a_k.$$

In other words, $f^{(k)}(a) = k!\, a_k$. The formula for $a_k$ follows upon division by $k!$. •

It will follow from the next theorem that every polynomial has the form in Theorem 2.1
using powers of $x - a$ for arbitrary $a$. Let $f(x) = 1 - x^2 + 2x^3$. To write
$f(x)$ in terms of powers of $x - a$ with $a = 1$, we compute

$$f(1) = 2, \quad f'(1) = 4, \quad f''(1) = 10, \quad f'''(1) = 12.$$

Hence

$$f(x) = 2 + \frac{4}{1!}(x - 1) + \frac{10}{2!}(x - 1)^2 + \frac{12}{3!}(x - 1)^3
= 2 + 4(x - 1) + 5(x - 1)^2 + 2(x - 1)^3.$$
More generally, suppose that $f(x)$ is a function, not necessarily a polynomial,
that has $n$ derivatives at $x = a$. We can use the coefficients $a_k$ given by Theorem 2.1
to produce a polynomial

$$T_n(x) = \sum_{k=0}^{n} \frac{1}{k!} f^{(k)}(a)(x - a)^k,$$

called the $n$th degree Taylor polynomial of $f(x)$ at $x = a$. Theorem 2.1 shows that
if $f(x)$ is itself a polynomial of degree at most $n$, then the coefficients $a_k$ are designed
so that $T_n(x) = f(x)$ for all real $x$. When $f(x)$ is not a polynomial, $T_n(x)$ will not
usually be equal to $f(x)$ except at $x = a$. The difference $f(x) - T_n(x) = R_n(x)$
is the Taylor remainder. If $R_n(x)$ is small, $T_n(x)$ will be a good approximation to
$f(x)$. Here is a simple estimate for the size of the remainder, under the assumption
that $f^{(n+1)}(x)$ is continuous on an interval containing $a$.

2.2 Taylor Remainder Theorem. Suppose $f$ has $n + 1$ continuous derivatives
on an open interval containing $a$. Then for all $x$ in the interval,

$$f(x) = \sum_{k=0}^{n} \frac{1}{k!} f^{(k)}(a)(x - a)^k + R_n(x),$$

where

$$R_n(x) = \frac{1}{(n + 1)!} f^{(n+1)}(c)(x - a)^{n+1},$$

and $c$ is a number between $x$ and $a$. If $f(x)$ is a polynomial of degree $n$, then $R_n(x)$
is identically zero, so $f(x)$ is a polynomial in powers of $x - a$.

Proof. With $x$ held fixed and $x \ne a$, define the unique number $K$ by

$$f(x) = \sum_{k=0}^{n} \frac{1}{k!} f^{(k)}(a)(x - a)^k + K(x - a)^{n+1}.$$

With $K$ determined this way, define $g(t)$ by

$$g(t) = -f(x) + \sum_{k=0}^{n} \frac{1}{k!} f^{(k)}(t)(x - t)^k + K(x - t)^{n+1}.$$

Note that $g(a) = 0$ because of the way $K$ is defined, and that $g(x) = 0$ no matter
what value $K$ has. Applying the product rule to the terms in the summation over $k$,
we find that differentiation with respect to $t$ gives

$$g'(t) = \sum_{k=0}^{n} \frac{1}{k!} f^{(k+1)}(t)(x - t)^k - \sum_{k=1}^{n} \frac{1}{(k-1)!} f^{(k)}(t)(x - t)^{k-1} - (n + 1)K(x - t)^n.$$

All but two terms cancel, so

$$g'(t) = \frac{1}{n!} f^{(n+1)}(t)(x - t)^n - (n + 1)K(x - t)^n.$$

By the Mean-Value Theorem for derivatives, there is a number $c$ between $x$ and $a$
such that $g'(c) = 0$. (Recall that $g(x) = g(a) = 0$.) But the equation $g'(c) = 0$
allows us to solve for $K$ to get

$$K = \frac{1}{(n + 1)!} f^{(n+1)}(c),$$

which is what we wanted to show.
The $n$th degree Taylor polynomial $T_n(x)$ is also called the $n$th degree Taylor
approximation to $f(x)$ about $x = a$. Note that to increase the degree of approximation,
all we do is add another term without altering the previous terms. Without
worrying about convergence, we can write the infinite Taylor series

2.3
$$\sum_{k=0}^{\infty} \frac{1}{k!} f^{(k)}(a)(x - a)^k = f(a) + f'(a)(x - a) + \frac{1}{2!} f''(a)(x - a)^2 + \frac{1}{3!} f'''(a)(x - a)^3 + \cdots$$

and arrive at the $n$th degree Taylor approximation by stopping after $(n + 1)$ terms.
The infinite series is defined only when $f(x)$ has derivatives of all orders at $x = a$,
and even then may not converge except at $x = a$.

EXAMPLE 2 Let $f(x) = x^{1/2}$ for $x \ge 0$. Then $f'(x) = \frac{1}{2}x^{-1/2}$, $f''(x) = -\frac{1}{4}x^{-3/2}$, $f'''(x) = \frac{3}{8}x^{-5/2}$, and so on. We find, with $a = 1$ in Taylor's formula,

$$\sqrt{x} = 1 + \frac{1}{1!}\left(\frac{1}{2}\right)(x - 1) + \frac{1}{2!}\left(-\frac{1}{4}\right)(x - 1)^2 + R_2(x)
= 1 + \frac{1}{2}(x - 1) - \frac{1}{8}(x - 1)^2 + R_2(x),$$

where $R_2(x) = (1/3!)\,\frac{3}{8}\,c^{-5/2}(x - 1)^3$, and $c$ is somewhere between $x$ and $1$. Note
that the first two terms of the approximation give the function $T_1$ whose graph is
the tangent line to the graph of $y = \sqrt{x}$ at $x = 1$. The first three terms describe
a quadratic function $T_2$ that approximates $\sqrt{x}$ near $x = 1$. The graphs of both
approximations are shown in Figure 14.1(a). Note that the approximations get better
the closer we get to $x = 1$.

EXAMPLE 3 The Taylor approximations of $f(x) = e^x$ are particularly simple to compute, because
$f^{(k)}(x) = e^x$ for $k = 1, 2, 3, \ldots$. Since $f^{(k)}(0) = e^0 = 1$, the formal infinite series
of Equation 2.3, with $a = 0$, becomes

$$\sum_{k=0}^{\infty} \frac{1}{k!} x^k = 1 + \frac{1}{1!}x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \cdots.$$

The partial sums $T_1(x) = 1 + x$ and $T_2(x) = 1 + x + \frac{1}{2}x^2$ are compared with $e^x$
graphically in Figure 14.1(b). The remainder after $n + 1$ terms is

$$R_n(x) = \frac{1}{(n + 1)!} e^c x^{n+1}.$$

FIGURE 14.1 (a) $y = \sqrt{x}$ with $T_1(x)$ and $T_2(x)$. (b) $y = e^x$ with $T_1(x)$ and $T_2(x)$.

We can then estimate that $e^x$ differs from $\sum_{k=0}^{n} \frac{1}{k!} x^k$ by at most

$$|R_n(x)| \le \frac{e}{(n + 1)!}, \quad\text{if } 0 \le x \le 1.$$
2B Convergence of Taylor Series


Taylor's formula in the form

$$f(x) - \sum_{k=0}^{n} \frac{1}{k!} f^{(k)}(a)(x - a)^k = R_n(x)$$

gives us a way of showing that in some cases

$$f(x) = \sum_{k=0}^{\infty} \frac{1}{k!} f^{(k)}(a)(x - a)^k,$$

with the series on the right converging for $x$ in at least some subinterval of the
domain of $f$. All we have to do is show that

$$\lim_{n\to\infty} R_n(x) = 0$$

for the values of $x$ in question. This at once proves convergence of the series and
shows that the sum at $x$ is $f(x)$.

2.4 Here is a list of important Taylor expansions that many people find useful to
remember.

(a) $\displaystyle e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!} = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots, \quad -\infty < x < \infty$

(b) $\displaystyle \cos x = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k)!} x^{2k} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots, \quad -\infty < x < \infty$

(c) $\displaystyle \sin x = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)!} x^{2k+1} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots, \quad -\infty < x < \infty$

(d) $\displaystyle \ln(1 + x) = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} x^k = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots, \quad -1 < x < 1$

EXAMPLE 4 As we showed in Example 3, Taylor's formula for $e^x$ with $a = 0$ is

$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} + R_n(x),$$

where $R_n(x) = \dfrac{1}{(n + 1)!} e^c x^{n+1}$ for some $c$ between $x$ and $0$. To prove the infinite
series expansion 2.4(a), we have to show that the remainder satisfies

$$\lim_{n\to\infty} \frac{1}{(n + 1)!} e^c x^{n+1} = 0$$

for all real numbers $x$. Pick a fixed value for $x$. Since $c \le |c| \le |x|$, we see that
$e^c \le e^{|x|}$ for all relevant values of $c$. Hence, with $x$ fixed, all we need to show is that

$$\lim_{n\to\infty} \frac{1}{(n + 1)!} x^{n+1} = 0.$$

We prove this as follows. Choose $k > l \ge 2x > 0$, hold $l$ fixed, and write

$$\frac{x^k}{k!} = \frac{x}{1}\cdot\frac{x}{2}\cdots\frac{x}{l}\cdot\frac{x}{l+1}\cdots\frac{x}{k}.$$

By the assumption on $k$, $l$, and $x$, we have $x/l \le \frac{1}{2}$. Thus for $0 < x < l$ we have

$$\frac{x^k}{k!} \le \frac{x^l}{l!}\cdot\frac{1}{2^{k-l}}.$$

Letting $k \to \infty$ shows that $x^k/k! \to 0$. Allowing $-l \le x < 0$ only makes the sign
alternate. Similar arguments apply to the other series listed previously, and these are
left as exercises.

To find a series expansion for $e^{-2x}$, there is no need to start from scratch. Simply
replace $x$ by $-2x$ everywhere in 2.4(a):

$$e^{-2x} = \sum_{k=0}^{\infty} \frac{1}{k!} (-2)^k x^k.$$

To find an expansion for $\ln(1 - x)$ in powers of $x$, replace $x$ by $-x$ in 2.4(d):

$$\ln(1 - x) = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} (-1)^k x^k, \quad -1 < -x < 1,$$

$$= -\sum_{k=1}^{\infty} \frac{1}{k} x^k, \quad -1 < x < 1.$$
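The two substituted series can be checked against library values. The following Python sketch sums finitely many terms of each expansion; the function names and the sample point $x = 0.5$ are illustrative assumptions.

```python
from math import exp, factorial, log

def exp_minus_2x(x, n):
    """Partial sum of sum_{k=0}^{n} (-2)^k x^k / k!, the series for e^(-2x)."""
    return sum((-2)**k * x**k / factorial(k) for k in range(n + 1))

def log_one_minus(x, n):
    """Partial sum of -sum_{k=1}^{n} x^k / k, the series for ln(1 - x), |x| < 1."""
    return -sum(x**k / k for k in range(1, n + 1))

x = 0.5
print(exp_minus_2x(x, 30), exp(-2 * x))
print(log_one_minus(x, 60), log(1 - x))
```

Agreement at one point is only a sanity check, not a proof, but it catches sign slips in the substitution.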
FIGURE 14.2 (a) $y = \cos x$, $T_0(x)$, $T_2(x)$, $T_4(x)$. (b) $y = \sin x$, $T_1(x)$, $T_3(x)$, $T_5(x)$.

Figure 14.2(a) shows the graph of $\cos x$ along with the Taylor expansion partial sum
graphs of $T_0(x) = 1$, $T_2(x) = 1 - \frac{1}{2}x^2$, and $T_4(x) = 1 - \frac{1}{2}x^2 + \frac{1}{24}x^4$.

Figure 14.2(b) shows the graph of $\sin x$ along with the Taylor expansion partial sum
graphs of $T_1(x) = x$, $T_3(x) = x - \frac{1}{6}x^3$, and $T_5(x) = x - \frac{1}{6}x^3 + \frac{1}{120}x^5$.

EXERCISES

In Exercises 1 to 4, write the polynomials in the indi- oo x2k


cated fonn; that is, find the coefficients q by using The- 15. coshx = (eX + e-x)/2 = I: --
orem 2.1. k=O (2k)!
oo x2k+I
1. I + x + x 2 = co + c1 (x - 2) + c2(x - 2) 2 16. sinhx = (ex - e-x)/2 = I; - - -
2. 2x - x 3 =co+ CJ (x - 2) + c2(x - 2) 2 + c3(x - 2) 3
k=O (2k + ])!

3. l + x 2 =co+ c1 (.r. + 1) + c2(x + 1) 2


17. By using a Taylor expansion about x = 0, prove the
binomial theorem:
4. (I - x) + (1 - x)2 =co+ CJX + c2x 2
In Exercises 5 to 8, find the coefficients co, ci, c2 in the
indicated Taylor expansion. Compute the remainder as
in Theorem 2.2.
5. x 113 =co+ CJ (x - 1) + c2(x - 1) 2 + R2(x) where the binomial coefficients are given by
6. x/(1 + x 2) =co+ CJX + c2x 2 + R2(x)
7. 1/x =co+ CJ (x + 1) + c2(x + 1) 2 + R2(x) n) n! n(n-l)· ·· (n-k+I)
( k =k!(n-k)!= k(k-1)···2-l ·
8. e2x =co+ cIx + c2x 2 + R2(x)
In Exercises 9 to 14, find infinite series expansions
for each of the following functions about x = 0, by In Exercises 18 to 21, sketch using the same coordinate
modifying one of the series given in Equations 2.4. axes the graphs of f and the Taylor approximations Tk.
Also estimate the remainder for -re :S x :S re.
9. e-x 10. cos2x 11. sin(x/2)
2 18. f (x) = cosx and T2(x) = 1 - ½x 2
12. ln(l - x 2) 13. eX 14. xex
19. f(x) = cosx and T4(x) = 1- ½x 2 + ~x 4
Use the definitions of coshx and sinhx along with
20. f(x) = sinx and T3(x) = x - ¼x 3
the Taylor expansion for ex to establish the Taylor
expansions in Exercises 15 and 16. 21. f(x) = sinx and T5(x) = x - ¼x 3 + Ji0 x 5
00 (b) Using the method of Example 4 of th
22. (a) Show that '°' xk
L.J
= - -, - I
1-x
< x < I, of show that
k=O
Example I in Section l is the same as the Taylor 00
(-Ii
expansion of (I - x)- 1 about x = 0. sinx = '°'---x2k+l
L.J (2k + I)! '
-oo < x < c
(b) Derive from part (a) the expansion about x = I k=O

25. Show that


0 < X < 2.
OO (-l)k+l
ln(I +x) =L k :l, for-l<x<I.
k=I
(c) Use the identity
l+x
= 2 + (x - 1)
1/2
- - - - - to show that
I + (x - 1)/2
26. Use x = -½ in Formula 2.4(d) to show that ln2 =
ooI
I oo Lk2k·
- - = ~)-llTk-l(X - J/ for - J < X < 3.
k=l

(°"oo (°"oo
I +x k=O
27. Show that L...k=O (-Ii)
--,;i-- Lk=O k!I) = 1.
23. (a) If/(.()= cosx, show that the Taylor coefficients of 28. Show that
f about x = 0 arc
1<nl(0) = { 0. n odd,
n! . (-1)"12/n!, n even.

(b) Using the method of Example 4 of the text, 29. An even function f is a function such that / ( -x) = f (x)
show that for all real x, and an odd function is a function such that
f(-x) = -f(x) for all real x.

L (-Ii
00
2k (a) Show that the odd-order derivatives J<2k+ 1l(O) of
cnsx = (lk)! x , -oo < x < oo. an even function are all zero. [For example, f(x) =
k=O
cosx.]
(b) Show that the even-order derivatives J< 2k>(O) of an
24. (a) If f(x) = sinx, show that the Taylor coefficients of odd function are all zero. [For example, f(x) =
/ about x = 0 are sinx.]
(c) What conclusions can you make about the Taylor
j(n)(O) ={ (-l)(n-1)/2/II!, n odd, expansions about a = 0 of even functions and odd
n! 0, n even. functions?

SECTION 3 CONVERGENCE CRITERIA


3A Convergence of Sequences
For infinite series in general, we need a theory of convergence where we can prove
convergence without already knowing the sum of the series. The key property of real
numbers that we assume is the following.

3.1 Nondecreasing Sequence Principle. Let $\{s_n\}$, $n = 1, 2, 3, \ldots$ be a nondecreasing
sequence of real numbers, so that $s_n \le s_{n+1}$ for $n = 1, 2, 3, \ldots$. Then either
there is a finite upper bound $b$ such that $s_n \le b$ for all $n$, in which case

(i) $\displaystyle \lim_{n\to\infty} s_n = s$, for some number $s \le b$,

or else there is no finite upper bound, in which case

(ii) $\displaystyle \lim_{n\to\infty} s_n = +\infty$.

This principle is a fundamental property of real numbers and indeed is built into their
very definition. Our attitude here is to accept the principle as a plausible assertion
about the real number line. Figure 14.3 illustrates both cases of the principle.

EXAMPLE 1 Every decimal expansion of the form

$$0.b_1 b_2 b_3 \ldots,$$

where $0 \le b_k \le 9$, is really an infinite series with $k$th term $a_k = b_k 10^{-k}$. We
observe that

$$s_n = \sum_{k=1}^{n} \frac{b_k}{10^k} \le \sum_{k=1}^{n} \frac{9}{10^k}
= 9\sum_{k=1}^{n} \frac{1}{10^k} = 9\left(-1 + \frac{1 - 10^{-n-1}}{1 - (1/10)}\right) = 1 - \frac{1}{10^n}.$$

FIGURE 14.3 (a) $\lim_{n\to\infty} s_n = s \le b$. (b) $\lim_{n\to\infty} s_n = \infty$.

Note that $s_{n+1} = s_n + b_{n+1}/10^{n+1} \ge s_n$, and that $s_n \le b = 1$. Hence the nondecreasing
sequence principle asserts that the series $\sum_{k=1}^{\infty} b_k 10^{-k}$, and hence the related
decimal expansion, converges to some number $s \le 1$. This is what we expect; the
largest number we can represent in the decimal form $0.b_1 b_2 b_3 \ldots$ is

$$1 = 0.9999\ldots.$$

EXAMPLE 2 The infinite series with $k$th term $a_k = k^{-1} 2^{-k}$ has $n$th partial sum

$$s_n = \sum_{k=1}^{n} \frac{1}{k 2^k} \le \sum_{k=0}^{n} \frac{1}{2^k} = 2\left(1 - 2^{-n-1}\right) \le 2.$$

Since $s_{n+1} = s_n + 2^{-n-1}/(n + 1) > s_n$, and $s_n \le b = 2$, the nondecreasing sequence
principle shows that the series has a sum

$$s = \sum_{k=1}^{\infty} \frac{1}{k 2^k} \le 2.$$
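A short computation illustrates the two hypotheses of the nondecreasing sequence principle for this series. The Python sketch below is only an illustration; the name `partial_sums` is a choice made here. The comparison with $\ln 2$ uses the value of the sum stated in Exercise 26 of Section 2.

```python
from math import log

def partial_sums(n_max):
    """Return [s_1, s_2, ..., s_n_max] for the series with kth term 1/(k 2^k)."""
    sums, total = [], 0.0
    for k in range(1, n_max + 1):
        total += 1.0 / (k * 2**k)
        sums.append(total)
    return sums

s = partial_sums(40)
print(s[:5])                      # the first few partial sums increase
print(max(s) <= 2)                # every partial sum stays below the bound b = 2
print(s[-1], log(2))              # the limit appears to agree with ln 2
```

The bound $b = 2$ is far from sharp, which is typical: the principle guarantees a limit without identifying it.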

The equations

$$\lim_{n\to\infty} s_n = s \quad\text{and}\quad \lim_{n\to\infty} (s_n - s) = 0$$

are completely equivalent, and very often it's simpler to show that a sequence of
numbers $s_n - s$ tends to $0$ than to show that $s_n$ tends to $s$. Here we record two
important kinds of sequence that have zero for a limit.

3.2
(a) $\displaystyle \lim_{n\to\infty} \frac{1}{n^{\alpha}} = 0$, if $\alpha > 0$
(b) $\displaystyle \lim_{n\to\infty} \frac{1}{b^n} = 0$, if $|b| > 1$

The reason in each case is that the denominator, along with its absolute value in (b),
tends to infinity as $n \to \infty$. See Exercises 21 and 22 for details.

3B Sums and Multiples of Series


The general distributive and associative laws for finite sums are

$$c\sum_{k=1}^{n} a_k = \sum_{k=1}^{n} c\,a_k \quad\text{and}\quad \sum_{k=1}^{n} a_k + \sum_{k=1}^{n} b_k = \sum_{k=1}^{n} (a_k + b_k).$$

Both laws extend to convergent infinite series as follows.

3.3 Theorem. Suppose $c$ is a fixed real number and that

$$\sum_{k=1}^{\infty} a_k \quad\text{and}\quad \sum_{k=1}^{\infty} b_k$$

are convergent series of real numbers. Then the series with $k$th terms $c\,a_k$ and $a_k + b_k$,
respectively, are convergent also, with

$$c\sum_{k=1}^{\infty} a_k = \sum_{k=1}^{\infty} c\,a_k \quad\text{and}\quad \sum_{k=1}^{\infty} a_k + \sum_{k=1}^{\infty} b_k = \sum_{k=1}^{\infty} (a_k + b_k).$$

Proof. Use of the distributive and associative laws for finite sums shows that the
proof reduces to showing that if

$$s_n = \sum_{k=1}^{n} a_k \quad\text{and}\quad t_n = \sum_{k=1}^{n} b_k,$$

then

$$c\lim_{n\to\infty} s_n = \lim_{n\to\infty} c\,s_n \quad\text{and}\quad \lim_{n\to\infty} s_n + \lim_{n\to\infty} t_n = \lim_{n\to\infty} (s_n + t_n).$$

These limit relations are proved in the same way as the more familiar analogues for
a continuous variable $x$, which we assume:

$$c\lim_{x\to\infty} f(x) = \lim_{x\to\infty} c\,f(x)$$

and

$$\lim_{x\to\infty} f(x) + \lim_{x\to\infty} g(x) = \lim_{x\to\infty} (f(x) + g(x)). \quad •$$

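EXAMPLE 3 We have by Theorem 3.3,

(a) $\displaystyle 3\sum_{k=0}^{\infty} 2^{-k} = \sum_{k=0}^{\infty} 3\cdot 2^{-k}$

(b) $\displaystyle \sum_{k=0}^{\infty} 2^{-k} + \sum_{k=0}^{\infty} 3^{-k} = \sum_{k=0}^{\infty} \left(2^{-k} + 3^{-k}\right)$

(c) $\displaystyle 2\sum_{k=0}^{\infty} 2^{-k} + 3\sum_{k=0}^{\infty} 3^{-k} = \sum_{k=0}^{\infty} \left(2^{-k+1} + 3^{-k+1}\right)$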
It's often convenient to change just finitely many terms in a series, in which case
the numerical sum of a convergent series may change, but convergence itself is not affected.

3.4 Theorem. Altering a finite number of terms in an infinite series has no effect
on whether the series converges or diverges.

Proof. Suppose all changes occur among the first $M$ terms, replacing $a_k$ by $a_k'$ for
$k = 1, 2, \ldots, M$. Then for $n > M$, the new partial sums $s_n'$ differ from $s_n$ by a fixed
amount, $s_n' - s_n = d$, independent of $n$. Hence

$$\lim_{n\to\infty} s_n' = d + \lim_{n\to\infty} s_n,$$

so the two series both converge or both diverge.



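EXAMPLE 4 We saw in Example 2 of Section 1 that

$$\sum_{k=0}^{\infty} \frac{1}{3^k} = 1 + \frac{1}{3} + \frac{1}{3^2} + \frac{1}{3^3} + \cdots = \frac{3}{2}.$$

Leaving out the first two terms, we get

$$\sum_{k=2}^{\infty} \frac{1}{3^k} = \frac{1}{3^2} + \frac{1}{3^3} + \cdots = \frac{3}{2} - 1 - \frac{1}{3} = \frac{1}{6}.$$

Increasing the first term by 2, we get

$$(1 + 2) + \sum_{k=1}^{\infty} \frac{1}{3^k} = 3 + \frac{1}{3} + \frac{1}{3^2} + \frac{1}{3^3} + \cdots = \frac{3}{2} + 2 = \frac{7}{2}.$$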
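The constant offset $d$ in the proof of Theorem 3.4 shows up directly in a computation. The following Python sketch uses the series $\sum 1/3^k$ with its first term increased by 2; the helper names `term`, `altered_term`, and `partial_sum` are illustrative assumptions.

```python
from fractions import Fraction

def term(k):
    """kth term of the geometric series with ratio 1/3, starting at k = 0."""
    return Fraction(1, 3**k)

def altered_term(k):
    # Same series with only the first term (k = 0) increased by 2.
    return term(k) + 2 if k == 0 else term(k)

def partial_sum(f, n):
    """Sum of f(k) for k = 0, ..., n."""
    return sum(f(k) for k in range(n + 1))

for n in (0, 1, 5, 20):
    diff = partial_sum(altered_term, n) - partial_sum(term, n)
    # diff is the fixed amount d = 2, independent of n.
    print(n, diff, float(partial_sum(altered_term, n)))
```

Since the offset never changes, the altered partial sums approach $3/2 + 2 = 7/2$.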

EXERCISES

Exercises 1 to 8, define Sn for pos1t1ve integers n. I 5 II 19 I 3 5 7


Determine which sequences have limits and which do ll. 5' ii' 19' 29.. . 12 · 3' 7' -9, ... -:s,
not. In case the limit value exists, find its value. 13. Given an infinite sequence of numbers Sn for n =
1 II I +n 2 1, 2, 3, ... , there is always an infinite series Lk=I ak that
1. Sn=]+
2 2. Sn= n +I 3,sn =- - has Sn for its nth partial sum .
n 11
(a) Show that if a 1 = s1 and ak = Sk - Sk-1 for
2n +3 I +2n
5.sn=~ 6. Sn = arctan(n 2) k = 2,3,4, ... , then
4. Sn= 2n + 2
II
cosmr (2n)!
7.sn = - -
n 8. Sn = 2nn! L llk = Sn for n = I, 2, 3, ....
k=I

In Exercise 9 to 12, find a formula for the nth entry in an (b) Find a simple formula for the kth term ak of an
infinite sequence with the given four values. Then find infinite series that has s 11 = I + (l/11) for its nth
limn->oo Sn,
partial sum. What is the sum of 1 ak? Lk=
I 2 3 4 3 4 5 6 14. Let L~Iak be a series of nonnegative terms. Show that
9 10 the series either converges or else diverges to infinity, in
• 2' 3' 4' 5 · .. · 2' 3' 4' 5' ...
674 Chapter 14 Infinite Series

the sense that In Exercise 17 to 20, write the given expression as a


single infinite series.
n
Jim
n--+oo L..,
'°"'
ak = +oo. 17. L
00
2-k +L
00
2-k+I
k=I k=I k=I
00 00

15. Find limn ..... 00 (11 + 1) /nn by letting n


11
= 1/x in 1s. 2 I: rk -4 I: 2-A:-- 1
k=O k=O
]n ((n + l)n/nn) with X ~ o+.
00 00

16. Show that lilll/c->oo xk / k! = 0 for x > 0 by choosing 19. I: (2/3l - I: (2/3/+ 1
k=O k=O
k > I ::: 2x and writing
00 k 00 1
20
XI.: X X X X X · k~I k 3 +1 - k~J k 3 +1
-=-·-···-·--···-
k! 1 2 / /+1 k 21. Prove that limn->oo n-a = 0 if ex > 0. [Hint: To make
11-a < E, make n > E-l/a.]

Then use the assumption x / I .::: ½, What happens if 22. Prove that limn--. 00 b-n = 0 if lbl > 1. [Hint: Show
X < 0? lhln = (1 + 6)n ::: 1 + 116 by the binomial theorem.]

3C Series with Nonnegative Terms


There is no universal criterion, other than the definition, for deciding about the
convergence or divergence of infinite series. However, the tests we explain next have
been found to be useful for large classes of series that have practical importance.
Here is the simplest test of all to apply, but the only conclusion we can draw from
it is divergence of a series.
3.5 Term Test for Divergence. If $\sum_{k=1}^{\infty} a_k$ converges, then $\lim_{k\to\infty} a_k = 0$. In other words, if $a_k$ fails to tend to 0 as $k \to \infty$, then the series diverges.

Proof. Let $s_n = \sum_{k=1}^{n} a_k$. By assumption $\lim_{n\to\infty} s_n = s$, for some finite number $s$. Hence $\lim_{n\to\infty} s_{n-1} = s$ also. It follows from $s_n - s_{n-1} = a_n$ that
\[ \lim_{n\to\infty} a_n = \lim_{n\to\infty} (s_n - s_{n-1}) = \lim_{n\to\infty} s_n - \lim_{n\to\infty} s_{n-1} = s - s = 0. \quad\bullet \]
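EXAMPLE 5 The series
\[ \sum_{k=1}^{\infty} \frac{k}{k+1} = \frac{1}{2} + \frac{2}{3} + \frac{3}{4} + \cdots \]
has kth term $a_k = k/(k+1)$. Since
\[ \lim_{k\to\infty} a_k = \lim_{k\to\infty} \frac{k}{k+1} = \lim_{k\to\infty} \frac{1}{1 + (1/k)} = 1, \]
the series fails to converge because $a_k$ doesn't tend to 0.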

Warning. It is not true, just because $\lim_{k\to\infty} a_k = 0$, that the series $\sum_{k=1}^{\infty} a_k$ converges. The harmonic series $\sum_{k=1}^{\infty} (1/k)$, shown to diverge in Example 6, is a counterexample, because $\lim_{k\to\infty} 1/k = 0$, but the series diverges.
There are close analogies between integrals over an infinite interval and infinite series. Thus the improper integral of $f(x)$ from $a$ to $\infty$ is convergent if it has a finite value determined by
\[ \int_a^{\infty} f(x)\,dx = \lim_{b\to\infty} \int_a^b f(x)\,dx; \]
otherwise, the integral is divergent. For example,
\[ \int_1^{\infty} e^{-px}\,dx = \lim_{b\to\infty} \int_1^b e^{-px}\,dx = \lim_{b\to\infty} \frac{e^{-p} - e^{-pb}}{p} = \frac{e^{-p}}{p} \quad\text{if } p > 0. \]

If $p < 0$, the computation shows that the improper integral diverges to $\infty$. (What about $p = 0$?) For another example, consider
\[ \int_0^{\infty} \frac{dx}{1 + x^2} = \lim_{b\to\infty} \int_0^b \frac{dx}{1 + x^2} = \lim_{b\to\infty} [\arctan b - \arctan 0] = \frac{\pi}{2}. \]

The analogy with series is that to compute an improper integral you first compute
"proper" integrals over finite intervals and then find their limit over intervals with
length tending to oo. The next theorem shows that there is a very useful connection
between the convergence and divergence of particular infinite series and improper
integrals.

3.6 Integral Test. Let $\sum_{k=1}^{\infty} a_k$ be a series of positive terms, and suppose $f$ is a decreasing function such that $f(k) = a_k$ for $k = 1, 2, 3, \ldots$. Then the series and improper integral,
\[ \sum_{k=1}^{\infty} a_k \quad\text{and}\quad \int_1^{\infty} f(x)\,dx, \]
either both converge or both diverge.

FIGURE 14.4 (a), (b): areas of rectangles with heights $a_k$ compared with the area under the graph of $f$.

Proof. Suppose first that the integral converges. Looking at Figure 14.4(a) shows that, by comparing areas, we have
\[ \sum_{k=2}^{N} a_k \le \int_1^{N} f(x)\,dx \le \int_1^{\infty} f(x)\,dx < \infty. \]
The series converges because the partial sums are increasing (because $a_k > 0$) and bounded above by $a_1$ plus the value of the integral. Then the nondecreasing sequence principle (3.1) applies.

Figure 14.4(b) shows that
\[ \int_1^{N} f(x)\,dx \le \sum_{k=1}^{N} a_k \le \sum_{k=1}^{\infty} a_k < \infty, \]
so again Principle 3.1 applies to show that the integral converges to a finite number as $N \to \infty$ if the series converges. •
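The two area comparisons in the proof can be checked numerically for a specific function. This is a rough sketch under an assumed choice, $f(x) = 1/x^2$, whose integral from 1 to $n$ is $1 - 1/n$; the function names are illustrative only.

```python
def f(x):
    """Sample decreasing positive function, chosen for illustration."""
    return 1.0 / (x * x)

def integral_1_to(n):
    """Closed form of the integral of 1/x^2 from 1 to n."""
    return 1.0 - 1.0 / n

def check_sandwich(n):
    """True when a_2+...+a_n <= integral from 1 to n <= a_1+...+a_(n-1)."""
    lower = sum(f(k) for k in range(2, n + 1))
    upper = sum(f(k) for k in range(1, n))
    return lower <= integral_1_to(n) <= upper

print(all(check_sandwich(n) for n in range(2, 200)))
```

The lower comparison corresponds to rectangles lying under the graph, the upper one to rectangles lying over it.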

EXAMPLE 6 The harmonic series $\sum_{k=1}^{\infty} (1/k)$ diverges, because with $f(x) = 1/x$, we have $f(k) = 1/k$. But
\[ \int_1^{\infty} \frac{1}{x}\,dx = \lim_{N\to\infty} \int_1^{N} \frac{1}{x}\,dx = \lim_{N\to\infty} \ln N = \infty. \]
The series diverges even though $\lim_{k\to\infty} (1/k) = 0$, as we pointed out in the warning about misapplication of the term test.

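EXAMPLE 7 To decide about the p-series $\sum_{k=1}^{\infty} k^{-p}$ for $p > 0$, let $f(x) = 1/x^p$. We have, for $p \ne 1$,
\[ \int_1^{\infty} \frac{1}{x^p}\,dx = \lim_{N\to\infty} \int_1^{N} \frac{1}{x^p}\,dx = \lim_{N\to\infty} \left[ \frac{1}{(1-p)x^{p-1}} \right]_1^N = \lim_{N\to\infty} \frac{1}{1-p}\left[ \frac{1}{N^{p-1}} - 1 \right] = \begin{cases} \dfrac{1}{p-1}, & p > 1, \\ +\infty, & p < 1. \end{cases} \]
Hence we have convergence for $p > 1$ and divergence for $p < 1$. The case $p = 1$ is the harmonic series, and when $p \le 0$ the terms of the series fail to tend to zero, so the series diverges by the term test. For $p > 1$, the p-series defines a function $\zeta(p)$ called the Riemann zeta-function:
\[ \zeta(p) = \sum_{k=1}^{\infty} \frac{1}{k^p}. \]
For $p = 2$ and $p = 1.1$, the series
\[ \zeta(2) = \sum_{k=1}^{\infty} \frac{1}{k^2} \quad\text{and}\quad \zeta(1.1) = \sum_{k=1}^{\infty} \frac{1}{k^{1.1}} \]
both converge. But for $p = \tfrac{1}{2}$ and $p = 0.9$, the series
\[ \sum_{k=1}^{\infty} \frac{1}{\sqrt{k}} \quad\text{and}\quad \sum_{k=1}^{\infty} \frac{1}{k^{0.9}} \]
both diverge.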
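A quick numerical illustration of the p-series behavior follows. It is only a sketch: the bound $1 + 1/(p-1)$ comes from adding the first term to the value of the integral computed above, and the lower bound for $p = 1/2$ is the integral of $x^{-1/2}$ from 1 to $n+1$; the function names are illustrative.

```python
def p_series_partial_sum(p, n):
    """Sum of k**(-p) for k = 1..n."""
    return sum(k ** (-p) for k in range(1, n + 1))

def integral_bound(p):
    """First term plus the integral of x**(-p) over [1, infinity), for p > 1."""
    return 1.0 + 1.0 / (p - 1.0)

def sqrt_lower_bound(n):
    """Integral of x**(-1/2) from 1 to n+1, a lower bound when p = 1/2."""
    return 2.0 * ((n + 1) ** 0.5 - 1.0)

for p in (2.0, 1.1):
    print(p, p_series_partial_sum(p, 10000), integral_bound(p))
print(0.5, p_series_partial_sum(0.5, 10000), sqrt_lower_bound(10000))
```

For $p > 1$ the partial sums stay below a fixed bound, while for $p = 1/2$ they exceed a quantity that grows without limit as $n$ increases.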
Application of the integral test usually depends on being able to compute some
indefinite integral, but examples in which the computation is awkward can sometimes
be handled by comparison with a related series.

3.7 Comparison Test. Suppose $0 \le a_k \le b_k$.

(i) If $\sum_{k=1}^{\infty} b_k$ converges, then $\sum_{k=1}^{\infty} a_k$ converges.

(ii) If $\sum_{k=1}^{\infty} a_k$ diverges, then $\sum_{k=1}^{\infty} b_k$ diverges.

Proof. To prove (i), note that
\[ s_n = \sum_{k=1}^{n} a_k \le \sum_{k=1}^{n} b_k \le \sum_{k=1}^{\infty} b_k = b. \]
Since $a_k \ge 0$, the partial sums $s_n$ form a nondecreasing sequence, bounded by $b$. Then $\sum_{k=1}^{\infty} a_k$ converges by Principle 3.1(i). To prove (ii), note that
\[ s_n = \sum_{k=1}^{n} a_k \le \sum_{k=1}^{n} b_k, \]
and that by Principle 3.1(ii), $\lim_{n\to\infty} s_n = \infty$. Hence $\sum_{k=1}^{\infty} b_k$ diverges also. •
EXAMPLE 8 Consider the series $\displaystyle\sum_{k=2}^{\infty} \frac{1}{k^2 \ln k}$. Since $\ln k \ge 1$ for $k \ge 3$, we have
\[ 0 \le \frac{1}{k^2 \ln k} \le \frac{1}{k^2}, \quad\text{for } k \ge 3. \]
Since $\sum_{k=3}^{\infty} 1/k^2$ is a convergent p-series by the integral test, we know that $\sum_{k=3}^{\infty} 1/(k^2 \ln k)$ converges by the comparison test. Hence the given series, with the additional term $1/(4 \ln 2)$, converges also.

EXAMPLE 9 Consider the series
\[ \sum_{k=1}^{\infty} \frac{1}{k^{1/2}(k+1)^{1/3}}. \]
Since, for $k \ge 1$,
\[ \frac{1}{(k+1)^{5/6}} = \frac{1}{(k+1)^{1/2}(k+1)^{1/3}} \le \frac{1}{k^{1/2}(k+1)^{1/3}}, \]
the given series diverges by comparison with the divergent p-series for $p = \tfrac{5}{6}$.
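The termwise inequalities used in these two comparison arguments can be spot-checked numerically. This is only an illustrative sketch over a finite range of $k$ (the range limit 10000 is arbitrary); it does not replace the proofs.

```python
import math

def inequality_ln(k):
    """0 <= 1/(k^2 ln k) <= 1/k^2, expected to hold for k >= 3."""
    small = 1.0 / (k * k * math.log(k))
    return 0.0 <= small <= 1.0 / (k * k)

def inequality_root(k):
    """1/(k+1)^(5/6) <= 1/(k^(1/2) (k+1)^(1/3)), expected for k >= 1."""
    left = 1.0 / (k + 1) ** (5.0 / 6.0)
    right = 1.0 / (k ** 0.5 * (k + 1) ** (1.0 / 3.0))
    return left <= right

print(all(inequality_ln(k) for k in range(3, 10000)))
print(all(inequality_root(k) for k in range(1, 10000)))
```

Note that the first inequality fails at $k = 2$, since $\ln 2 < 1$, which is why the comparison starts at $k = 3$.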
3D Absolute Convergence
If all terms of a series are nonpositive from some point on, we can simply multiply the series by $(-1)$ and apply the tests of the previous subsection. For a series that has infinitely many terms $a_k$ of both signs, we try if possible to show that the series with kth term $|a_k|$ converges. If
\[ \sum_{k=1}^{\infty} |a_k| \]
converges we say that $\sum_{k=1}^{\infty} a_k$ is absolutely convergent. The special terminology is justified because absolute convergence is in general stronger than ordinary convergence in the sense that every absolutely convergent series converges, but not conversely. The key theorem is this:

3.8 Theorem. If $\sum_{k=1}^{\infty} |a_k|$ converges, then so does $\sum_{k=1}^{\infty} a_k$. In other words, absolute convergence implies convergence.

Proof. Since $0 \le a_k + |a_k| \le 2|a_k|$, the comparison test shows that $\sum_{k=1}^{\infty} (a_k + |a_k|)$ converges, because $2\sum_{k=1}^{\infty} |a_k| = \sum_{k=1}^{\infty} 2|a_k|$ is assumed to converge. Hence
\[ \sum_{k=1}^{\infty} a_k = \sum_{k=1}^{\infty} (a_k + |a_k|) - \sum_{k=1}^{\infty} |a_k| \]
converges also, because it is the difference of two convergent series. •


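EXAMPLE 10 The series
\[ \sum_{k=1}^{\infty} \frac{(-1)^k}{k^2} \]
converges and even converges absolutely, because
\[ \sum_{k=1}^{\infty} \left| \frac{(-1)^k}{k^2} \right| = \sum_{k=1}^{\infty} \frac{1}{k^2} \]
is a convergent p-series for $p = 2$.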


EXAMPLE 11 The series $\displaystyle\sum_{k=1}^{\infty} \frac{(-1)^k}{k\sqrt{k+1}}$ converges absolutely by the comparison test for series with positive terms. The reason is that
\[ \left| \frac{(-1)^k}{k\sqrt{k+1}} \right| = \frac{1}{k\sqrt{k+1}} \le \frac{1}{k^{3/2}}, \]
and $\sum_{k=1}^{\infty} (1/k^{3/2})$ is a convergent p-series with $p = \tfrac{3}{2}$.

For series with nonnegative terms, convergence is the same as absolute convergence, so a test that proves one proves the other also.

EXAMPLE 12 The geometric series $\sum_{k=0}^{\infty} r^k$ converges to $1/(1-r)$ for $|r| < 1$, by Example 1 of Section 1. We have
\[ \sum_{k=0}^{\infty} \frac{1}{2^k} = 2, \quad\text{with } r = \frac{1}{2}, \]
and
\[ \sum_{k=0}^{\infty} \frac{(-1)^k}{2^k} = \frac{2}{3}, \quad\text{with } r = -\frac{1}{2}. \]
A series $\sum_{k=0}^{\infty} (\pm 1) 2^{-k}$ converges absolutely, whatever choice of sign we make in each term, because it converges when we always choose the plus sign. However, there is no general way to determine the sum of the series as there is for the two geometric series with all plus signs or with alternating signs.

The following test applies to series with kth term $a_k \ne 0$ from some point on and deals directly with absolute convergence.

3.9 Ratio Test. Let $\sum_{k=1}^{\infty} a_k$ be a series for which $\lim_{k\to\infty} |a_{k+1}|/|a_k|$ exists.

(i) If $\displaystyle\lim_{k\to\infty} \left| \frac{a_{k+1}}{a_k} \right| < 1$, the series converges absolutely.

(ii) If $\displaystyle\lim_{k\to\infty} \left| \frac{a_{k+1}}{a_k} \right| > 1$, or is infinite, the series fails to converge.

If the limit of the ratio $|a_{k+1}/a_k|$ of successive terms fails to exist or if the limit is 1, no assertion is being made about convergence of the series. Note also that if a series of positive terms fails to converge absolutely then it fails to converge at all.

Proof. We'll assume $a_k > 0$ since we are concerned only with terms of the form $|a_k|$ in the proof.

Case (i). Since the limit of $a_{k+1}/a_k$ is less than 1, there is a number $r < 1$ such that $a_{k+1}/a_k \le r$ for all sufficiently large values of $k$, say $k \ge N$. Thus $a_{k+1} \le r a_k$ for $k = N, N+1, \ldots$. Hence
\[ a_{N+k} \le r a_{N+k-1} \le r^2 a_{N+k-2} \le \cdots \le r^k a_N. \]
Since $0 \le r < 1$, the series
\[ \sum_{k=0}^{\infty} r^k a_N = a_N \sum_{k=0}^{\infty} r^k \]
converges. Hence $\sum_{k=0}^{\infty} a_{N+k}$ converges also by the comparison test. Including the finitely many terms $a_1, \ldots, a_{N-1}$ shows that $\sum_{k=1}^{\infty} a_k$ converges also.

Case (ii). This time the limit of $a_{k+1}/a_k$ is bigger than 1, so there is a number $r > 1$ for which $a_{k+1}/a_k \ge r$ for $k$ sufficiently large, say $k \ge N$. Thus $a_{k+1} \ge r a_k$ for $k = N, N+1, \ldots$. Hence
\[ a_{N+k} \ge r a_{N+k-1} \ge r^2 a_{N+k-2} \ge \cdots \ge r^k a_N. \]
Since $r > 1$, $r^k$ tends to $\infty$ as $k \to \infty$, so the terms $a_{N+k}$ fail to tend to 0 and the series diverges by the term test. •
EXAMPLE 13 The series $\sum_{k=1}^{\infty} k^2/2^k$ converges because, with $a_k = k^2/2^k$ and $a_{k+1} = (k+1)^2/2^{k+1}$, the ratio test gives
\[ \lim_{k\to\infty} \frac{(k+1)^2/2^{k+1}}{k^2/2^k} = \lim_{k\to\infty} \frac{(k+1)^2\, 2^k}{k^2\, 2^{k+1}} = \lim_{k\to\infty} \left(1 + \frac{1}{k}\right)^2 \cdot \frac{1}{2} = \frac{1}{2} < 1. \]
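The limit of the ratio for the series with terms $k^2/2^k$ can be observed directly. The sketch below uses exact rational arithmetic; the helper names `term` and `ratio` are illustrative and not from the text.

```python
from fractions import Fraction

def term(k):
    """Exact value of a_k = k^2 / 2^k."""
    return Fraction(k * k, 2**k)

def ratio(k):
    """Exact value of a_(k+1) / a_k, which equals (1 + 1/k)^2 / 2."""
    return term(k + 1) / term(k)

for k in (1, 10, 100, 1000):
    print(k, float(ratio(k)))
```

The printed ratios decrease toward 1/2, the limit computed by hand.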

EXAMPLE 14 Consider the series $\sum_{k=0}^{\infty} x^k/k!$, where $k!$, $k$ factorial, is $1 \cdot 2 \cdot 3 \cdots k$ if $k \ge 1$, and $0! = 1$. We have $a_k = x^k/k!$ and $a_{k+1} = x^{k+1}/(k+1)!$, so
\[ \lim_{k\to\infty} \left| \frac{a_{k+1}}{a_k} \right| = \lim_{k\to\infty} \frac{|x^{k+1}/(k+1)!|}{|x^k/k!|} = \lim_{k\to\infty} \frac{k!\,|x|}{(k+1)!} = \lim_{k\to\infty} \frac{|x|}{k+1} = 0 < 1. \]
Hence the series of terms depending on $x$ converges by case (i) of Test 3.9 for all $x$. We have already seen that the series converges to $e^x$.

3E Alternating Series
An alternating series is one in which the terms are alternately positive and negative. The alternating harmonic series is an example:
\[ \sum_{k=1}^{\infty} (-1)^{k+1} \frac{1}{k} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots. \]
Some of these series converge by the following criterion.

3.10 Leibniz Test. If $\sum_{k=1}^{\infty} a_k$ is an alternating series such that

(i) $|a_k| \ge |a_{k+1}|$ for $k = 1, 2, 3, \ldots$,

and

(ii) $\displaystyle\lim_{k\to\infty} a_k = 0$,

then the partial sums $s_n$ converge to a sum $s$, with error $|s - s_n|$ at most $|a_{n+1}|$.

Proof. Suppose $a_1 = p_1$, $a_2 = -p_2$, $a_3 = p_3$, and so on, with $p_k \ge 0$. Then the partial sum $\sum_{k=1}^{2n} a_k$ is
\[ s_{2n} = (p_1 - p_2) + (p_3 - p_4) + \cdots + (p_{2n-1} - p_{2n}), \]
where the terms $p_{2k-1} - p_{2k}$ are all nonnegative, because $p_k = |a_k| \ge |a_{k+1}| = p_{k+1}$. Hence $s_{2n}$ is nondecreasing as $n$ increases. For the same reason, grouping the terms differently shows that
\[ p_1 \ge s_{2n} = p_1 - (p_2 - p_3) - \cdots - (p_{2n-2} - p_{2n-1}) - p_{2n}. \]
Hence the partial sums $s_{2n}$ are bounded above and nondecreasing, while the partial sums $s_{2n+1} = p_1 - (p_2 - p_3) - \cdots - (p_{2n-2} - p_{2n-1}) - (p_{2n} - p_{2n+1})$ are nonincreasing. By Principle 3.1 for bounded sequences
\[ \lim_{n\to\infty} s_{2n} = \lim_{n\to\infty} \sum_{k=1}^{2n} a_k = s, \]
for some number $s$. But $s_{2n+1} = s_{2n} + a_{2n+1}$, so since $\lim_{n\to\infty} a_{2n+1} = 0$ by (ii),
\[ \lim_{n\to\infty} s_{2n+1} = \lim_{n\to\infty} \sum_{k=1}^{2n+1} a_k = s \]
also. Hence all $s_n$ converge to $s$. Since $s_{2n} \le s \le s_{2n+1}$ it follows that if $m$ is even $s_m \le s \le s_{m+1}$ and if $m$ is odd $s_{m+1} \le s \le s_m$. Thus $|s - s_m| \le |a_{m+1}|$. •
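The error estimate $|s - s_n| \le |a_{n+1}|$ can be tested on the alternating harmonic series. This sketch assumes the standard fact, not proved in this passage, that the sum of that series is $\ln 2$; the function names are illustrative.

```python
import math

def alt_harmonic_partial_sum(n):
    """s_n = 1 - 1/2 + 1/3 - ... with n terms."""
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))

def leibniz_bound(n):
    """|a_(n+1)| = 1/(n+1), the bound given by the Leibniz test."""
    return 1.0 / (n + 1)

S = math.log(2.0)  # assumed value of the sum (standard fact)
for n in (1, 10, 100, 1000):
    print(n, abs(S - alt_harmonic_partial_sum(n)), leibniz_bound(n))
```

The observed error is roughly half the bound, which is consistent with the estimate and also shows how slowly this series converges.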
EXAMPLE 15 The alternating harmonic series converges, because with $a_k = (-1)^{k+1}/k$, we have

(i) $\displaystyle\left| \frac{(-1)^{k+1}}{k} \right| > \left| \frac{(-1)^{k+2}}{k+1} \right|$,

and

(ii) $\displaystyle\lim_{k\to\infty} \frac{(-1)^{k+1}}{k} = 0$.

Note that the alternating harmonic series fails to converge absolutely because $\sum_{k=1}^{\infty} 1/k$ is divergent.

EXAMPLE 16 The series
\[ \sum_{k=2}^{\infty} \frac{(-1)^k}{\ln k} = \frac{1}{\ln 2} - \frac{1}{\ln 3} + \frac{1}{\ln 4} - \cdots \]
converges because (i) $1/\ln k \ge 1/\ln(k+1)$ and (ii) $\lim_{k\to\infty} 1/\ln k = 0$. The Leibniz test implies convergence.
EXERCISES

oo I
In Exercises I to 6, determine the convergence or diver- 5. I:-2-
gence of the infinite series by using the term test (for k=l k + l
divergence) or the integral test (for convergence or
In Exercises 7 to 12, detennine the convergence or
divergence). Show carefully how the test you use applies
divergence of the infinite series by using the comparison
in each case.
test or the ratio test. Show carefully how the test you
oo k2 00 k use applies in each case.
t.1:-kI 2. I:-2-
k=l k + l
2
k=l + 00 2k 00 1

+ kl)k + k)3/2
00
7. k"fl 3k+l 8. k"fl (k2
3. I: ke-k 4. I:
00 (
i
1
k=I k=I
10. ~ _l_
00
9. I:-2- /;;;1 n2"
n=2 n Inn
682 Chapter 14 Infinite Series

11 f- 1-·-
j3 + 1
• j=l
12.
j=l
I; j:
./T+T
2 33. (a)
(b)
Prove that
Prove that
I:t.. 1 kxk =
L~I kZ-k
x/(1 - x) 2, for lxl < I.
= 2.

In Exercises 13 to 18, determine whether the series In Exercises 34 to 41, determine the real values of x for
converges absolutely or not. For those alternating series which the series converges.
that fail to converge absolutely, try to apply the Leibniz oo I oo 1
test for convergence. 34. L -kxk 35. L 2 xk
k=l k=I k

n.
oo (-I)*
I: - -
oo (-J)*+l
14. I: - - 36 _ I: sin;x 00 1
37. I: ---r<x - 1)*
k=z k 2 Ink k=I k2 k=I k k=I 2

15. E (-l~j 16. E (~l)j j 38.


oo I
L 2
-:x j
~ j .
39. L.. - - x l
j=I .Ji j=I J+1 j=I j j=I j2+ 1
00
i1. I: - 2 -
m=om + 1
(-1)1H
40.
00
L (lnxl
k=O
41. f (-3-)k
k=O + 1 x2

In Exercises 19 to 24, determine the convergence, abso- 42. (a) Show that t(p), which is defined by the p-series as
lute convergence, or divergence of the series. t(p) = L~I k-P, is decreasing asp increases, for
p > 1.
oo (-Il oo k (b) Show that (1-Z-P)s(p) = l+l/3P+I/5P+l/7P+
19. I: - - 20. I: - 3-
k=2 1n(l/ k) k=l k +I
00 (-2)k 00 kl (c) Show that (1 - 21-P)t(p) = 1 - I/2P + l/3P -
21. I: - 2 - 22. I: _ · l/4P + ...
k=I k + 1 k=l (2k)!
43. The function f(x) = rt is defined for > 0 by
E(1 + ~)
x
23.
k=I k,
z-k 24. E(-ll-k_!
k=I
-
(k+l)!
f(x) = exlnx _
(a) Use l'Hopital's rule to prove that limx-+O+ xx = I,
In Exercises 25 to 30, determine the convergence or and so conclude that limk-+oo (1/k/ 1/k) = I.
divergence of the series. (b) Prove that I: (!)
k=I k
l/k diverges.
00 k2 oo kk
25. I: ---r 26. I: - (c) Prove that I:~ 1 k(l/k) diverges.
k=l 3 k=l k!
44. Prove that if a is a real number and lbl > 1 then
27. f k! 28. I: (-l)k lim
na
-b = 0, by applying the ratio test to ""~ nab-".
k= I k ..;1nk
k=2 n-..oc- n ~n- 1

29. E sink 30. I: __l _ oo 1


k=I k2 n=I 2n2 - n 45. Prove that L --- converges if a > I and diverges
k=Z k(lnkt
31. (a) Prove that I:~ 1 kxk converges absolutely for if a:'.:: 1.
lxl < I.
*46. Prove that if ak ~ 0 and I:~ 1 ak converges then so
(b) Provethat(l-x)I:~ 1 kxk =I:~ 1 xk,forlxl < 1.
do L~I ll2k and I:r=o"(2Hl)· Is the conclusion true
32. Prove that L~t xk = x/(1- x), for lxl < 1. without the condition ak ~ O?

SECTION 4 UNIFORM CONVERGENCE


Let $f_k(x)$, $k = 1, 2, 3, \ldots$, be a sequence of real-valued functions defined for all $x$ in some set $S$. Then for each $x$, we consider the series $\sum_{k=1}^{\infty} f_k(x)$. If it converges for each $x$ in $S$, we say that the series converges pointwise on $S$. Calling the limit $f(x)$ for each $x$ in $S$, we write
\[ f(x) = \sum_{k=1}^{\infty} f_k(x) = \lim_{N\to\infty} \sum_{k=1}^{N} f_k(x). \]
This means that for each $x$ in $S$ there is a number $f(x)$ such that, given $\epsilon > 0$, there is an integer $K$ sufficiently large that
\[ \left| \sum_{k=1}^{N} f_k(x) - f(x) \right| < \epsilon, \]
whenever $N \ge K$.

EXAMPLE 1 The series $\sum_{k=0}^{\infty} x^k$ has for its $(N+1)$st partial sum the finite sum
\[ \sum_{k=0}^{N} x^k = \begin{cases} \dfrac{1 - x^{N+1}}{1 - x}, & x \ne 1, \\ N + 1, & x = 1. \end{cases} \]
Then
\[ \sum_{k=0}^{\infty} x^k = \lim_{N\to\infty} \sum_{k=0}^{N} x^k = \frac{1}{1 - x}, \quad\text{for } -1 < x < 1. \]
For real values of $x$ outside the interval $(-1, 1)$, the series fails to converge.

EXAMPLE 2 The trigonometric series $\sum_{k=1}^{\infty} (\sin kx)/k^2$ converges pointwise for all real $x$. The reason is that we can compare its terms with those of the convergent series $\sum_{k=1}^{\infty} 1/k^2$, by observing that
\[ \left| \frac{\sin kx}{k^2} \right| \le \frac{1}{k^2}, \quad k = 1, 2, \ldots. \]
The result is that the given series even converges absolutely.

An infinite series $\sum_{k=1}^{\infty} f_k(x)$ that converges for each $x$ in a set $S$ to a number $f(x)$ defines a function $f$ on $S$. However, in general we can conclude very little about the properties of $f$ from pointwise convergence alone. For this reason it's sometimes helpful to consider a stronger form of convergence on $S$. We say that $\sum_{k=1}^{\infty} f_k$ converges uniformly to a function $f$ on a set $S$ if, given $\epsilon > 0$, there is an integer $K$ such that for all $x$ in $S$ and for all $N \ge K$,
\[ \left| \sum_{k=1}^{N} f_k(x) - f(x) \right| < \epsilon. \]
The definition just given should be compared carefully with that of pointwise convergence. Notice that uniform convergence implies pointwise convergence, but not conversely. Roughly speaking, uniform convergence of a series of functions defined on a set $S$ means that the series converges with at least a certain minimum rate for all points in $S$. A pointwise convergent series may have points at which the convergence is increasingly slow. Figure 14.5 is a picture of uniform and nonuniform convergence to the same function $f$; $s_N(x)$ and $t_N(x)$ are $N$th partial sums of two series.
FIGURE 14.5 (a) uniform convergence; (b) nonuniform convergence, on an interval from $a$ to $b$.

To determine that a series converges uniformly, we have the following.

4.1 Weierstrass Test. Let $\sum_{k=1}^{\infty} f_k$ be a series of real-valued functions defined on a set $S$. If there is a constant series $\sum_{k=1}^{\infty} p_k$, such that

1. $|f_k(x)| \le p_k$ for all $x$ in $S$ and for $k = 1, 2, \ldots$,

2. $\sum_{k=1}^{\infty} p_k$ converges,

then $\sum_{k=1}^{\infty} f_k$ converges uniformly to a function $f$ defined on $S$.

Proof. The comparison test for series shows that $\sum_{k=1}^{\infty} f_k(x)$ converges (even absolutely) for each $x$ in $S$ to a number that we'll write $f(x)$. Hence we can write
\[ f(x) - \sum_{k=1}^{N} f_k(x) = \sum_{k=1}^{\infty} f_k(x) - \sum_{k=1}^{N} f_k(x) = \sum_{k=N+1}^{\infty} f_k(x). \]
It follows that
\[ \left| f(x) - \sum_{k=1}^{N} f_k(x) \right| \le \sum_{k=N+1}^{\infty} |f_k(x)| \le \sum_{k=N+1}^{\infty} p_k. \]
Since $\sum_{k=1}^{\infty} p_k$ converges, we can, given $\epsilon > 0$, find a $K$ such that $\sum_{k=N}^{\infty} p_k < \epsilon$ if $N > K$. This completes the proof, because the number $K$ depends only on $\epsilon$ and not on $x$. •
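The key estimate in the Weierstrass argument, that the gap between two partial sums is bounded by a tail of the constant series regardless of $x$, can be illustrated for the series with terms $(\sin kx)/k^2$ and bounds $p_k = 1/k^2$. The grid of sample points and the function names below are illustrative assumptions.

```python
import math

def partial_sum(x, n):
    """Partial sum of sin(kx)/k^2 for k = 1..n."""
    return sum(math.sin(k * x) / k**2 for k in range(1, n + 1))

def majorant_tail(n, m):
    """Sum of the constant bounds p_k = 1/k^2 for k = n+1..m."""
    return sum(1.0 / k**2 for k in range(n + 1, m + 1))

def worst_gap(n, m, points=101):
    """Largest |s_m(x) - s_n(x)| over a grid of x in [-10, 10]."""
    xs = [-10.0 + 20.0 * i / (points - 1) for i in range(points)]
    return max(abs(partial_sum(x, m) - partial_sum(x, n)) for x in xs)

print(worst_gap(50, 400), majorant_tail(50, 400))
```

The first number never exceeds the second, and the second depends only on the indices, not on $x$.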

EXAMPLE 3 The trigonometric series $\sum_{k=1}^{\infty} (\sin kx)/k^2$ converges uniformly for all real $x$, because
\[ \left| \frac{\sin kx}{k^2} \right| \le \frac{1}{k^2}, \]
and $\sum_{k=1}^{\infty} 1/k^2$ converges. However, the power series $\sum_{k=0}^{\infty} x^k$, while it converges pointwise for $-1 < x < 1$, fails to converge uniformly on $(-1, 1)$. See Exercise 6. The Weierstrass test applies on symmetric closed subintervals $[-r, r]$ with $0 < r < 1$ by observing that $|x^k| \le r^k$ for $x$ on $[-r, r]$ and that $\sum_{k=0}^{\infty} r^k$ converges if $0 \le r < 1$. Hence the power series converges pointwise on $(-1, 1)$ and uniformly on $[-r, r]$ for any $r$ with $0 < r < 1$.
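The contrast between $[-r, r]$ and $(-1, 1)$ for the geometric series can be made concrete. The sketch below measures the error $|1/(1-x) - \sum_{k=0}^{N} x^k|$, which equals $|x|^{N+1}/|1-x|$ by the partial-sum formula; the grid size and function names are illustrative.

```python
def geometric_error(x, n):
    """|1/(1-x) - sum of x^k for k = 0..n|, for x in (-1, 1)."""
    partial = sum(x**k for k in range(n + 1))
    return abs(1.0 / (1.0 - x) - partial)

def max_error_on_grid(r, n, points=201):
    """Largest error over an evenly spaced grid on [-r, r]."""
    xs = [-r + 2.0 * r * i / (points - 1) for i in range(points)]
    return max(geometric_error(x, n) for x in xs)

N = 20
print(max_error_on_grid(0.5, N), 0.5 ** (N + 1) / (1 - 0.5))
print(geometric_error(1 - 1 / (N + 1), N))  # a point that moves toward 1 with N
```

On $[-1/2, 1/2]$ the worst error is tiny and is controlled by $r^{N+1}/(1-r)$, while at a point chosen close to 1 depending on $N$ the error stays large, which is the obstruction to uniform convergence on $(-1, 1)$.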

The next four theorems are about uniformly convergent series of functions. They all assert that certain limit operations are interchangeable with the summing of a series, provided that a certain series converges uniformly. If uniform convergence is replaced by pointwise convergence, then the resulting statements fail to hold in general. See Exercise 9.

4.2 Theorem. Let $f_1, f_2, f_3, \ldots$ be a sequence of functions defined on a set $S$ in $\mathbb{R}^n$. Suppose $x_0$ is a limit point of $S$, and suppose that the limit
\[ \lim_{x\to x_0} f_k(x) \]
exists for $k = 1, 2, \ldots$. Then
\[ \lim_{x\to x_0} \sum_{k=1}^{\infty} f_k(x) = \sum_{k=1}^{\infty} \lim_{x\to x_0} f_k(x), \]
provided the series of numbers on the right converges and the series on the left converges uniformly on $S$.

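Proof. Let $\lim_{x\to x_0} f_k(x) = a_k$. Then adding and subtracting $\sum_{k=1}^{N} f_k(x)$ and $\sum_{k=1}^{N} a_k$, we get
\[ \left| \sum_{k=1}^{\infty} f_k(x) - \sum_{k=1}^{\infty} a_k \right| \le \left| \sum_{k=1}^{\infty} f_k(x) - \sum_{k=1}^{N} f_k(x) \right| + \left| \sum_{k=1}^{N} f_k(x) - \sum_{k=1}^{N} a_k \right| + \left| \sum_{k=1}^{N} a_k - \sum_{k=1}^{\infty} a_k \right|. \tag{1} \]
Now let $\epsilon > 0$. Since $\sum_{k=1}^{\infty} f_k$ converges uniformly, we can choose $K$ such that $N > K$ implies
\[ \left| \sum_{k=1}^{\infty} f_k(x) - \sum_{k=1}^{N} f_k(x) \right| < \frac{\epsilon}{3}, \quad\text{for all } x \text{ in } S. \]
Then choose an $N > K$ such that
\[ \left| \sum_{k=1}^{N} a_k - \sum_{k=1}^{\infty} a_k \right| < \frac{\epsilon}{3}. \]
Finally, pick $\delta > 0$ so that $|x - x_0| < \delta$ implies, via the relation
\[ \lim_{x\to x_0} \sum_{k=1}^{N} f_k(x) = \sum_{k=1}^{N} a_k, \quad\text{that}\quad \left| \sum_{k=1}^{N} f_k(x) - \sum_{k=1}^{N} a_k \right| < \frac{\epsilon}{3}. \]
Then for $x$ satisfying $|x - x_0| < \delta$, the left side of inequality (1) is less than $\epsilon$. •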

4.3 Corollary. If $\sum_{k=1}^{\infty} f_k$ is a uniformly convergent series of continuous functions $f_k$ defined on a set $S$ in $\mathbb{R}^n$, then the function $f$ defined by $f(x) = \sum_{k=1}^{\infty} f_k(x)$ is continuous on $S$.

In the next two theorems we restrict ourselves to functions of one variable, although by treating one variable at a time, we can apply them to functions of several variables.

4.4 Theorem. If the series $\sum_{k=1}^{\infty} f_k$ converges uniformly on the interval $[a, b]$, and the functions $f_k$ are continuous on $[a, b]$, then
\[ \sum_{k=1}^{\infty} \int_a^b f_k(x)\,dx = \int_a^b \left[ \sum_{k=1}^{\infty} f_k(x) \right] dx. \]
Proof. By Corollary 4.3 the function $\sum_{k=1}^{\infty} f_k(x)$ is continuous on $[a, b]$ and so is integrable there. We have
\[ \int_a^b \left[ \sum_{k=1}^{\infty} f_k(x) \right] dx - \sum_{k=1}^{N} \int_a^b f_k(x)\,dx = \int_a^b \sum_{k=N+1}^{\infty} f_k(x)\,dx. \tag{2} \]
Let $\epsilon > 0$, and choose $K$ so large that if $N > K$, then
\[ \left| \sum_{k=N+1}^{\infty} f_k(x) \right| < \epsilon (b - a)^{-1}, \quad\text{for all } x \text{ in } [a, b]. \]
In general, if $g(x)$ is a continuous function, then
\[ \left| \int_a^b g(x)\,dx \right| \le (b - a) \max_{a \le x \le b} |g(x)|, \]
so if $g(x)$ is the sum of the terms $f_k(x)$ from $N + 1$ to infinity, then
\[ \left| \int_a^b \sum_{k=N+1}^{\infty} f_k(x)\,dx \right| \le (b - a) \cdot \epsilon \cdot (b - a)^{-1} = \epsilon, \quad\text{for } N > K. \]
Thus the left side of equation (2) is at most $\epsilon$ in absolute value for $N > K$. •
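Term-by-term integration can be tried on a series that the Weierstrass test handles. As an assumed example, not taken from the text, the geometric series with ratio $-x^2$ sums to $1/(1+x^2)$ and converges uniformly on $[0, r]$ for $0 < r < 1$, so integrating each term from 0 to $r$ should reproduce $\arctan r$.

```python
import math

def termwise_integral(r, n_terms):
    """Sum over k < n_terms of the integral of (-1)^k x^(2k) from 0 to r."""
    return sum((-1) ** k * r ** (2 * k + 1) / (2 * k + 1) for k in range(n_terms))

r = 0.5  # illustrative endpoint inside (-1, 1)
print(termwise_integral(r, 40), math.atan(r))
```

The two printed values agree to machine precision, as the interchange of sum and integral predicts.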
The interchange of differentiation with the summing of a series requires somewhat
more in the way of hypotheses than did the previous theorem on integration.

4.5 Theorem. Let $f_1, f_2, f_3, \ldots$ be a sequence of continuously differentiable functions defined on an interval $[a, b]$. If $\sum_{k=1}^{\infty} f_k(x) = f(x)$ for all $x$ in $[a, b]$ (pointwise convergence), and if $\sum_{k=1}^{\infty} df_k/dx$ converges uniformly on $[a, b]$, then $f$ is continuously differentiable, and
\[ \frac{d}{dx} \sum_{k=1}^{\infty} f_k(x) = \sum_{k=1}^{\infty} \frac{df_k}{dx}(x). \]

Proof. By the fundamental theorem of calculus
\[ \sum_{k=1}^{N} [f_k(x) - f_k(a)] = \sum_{k=1}^{N} \int_a^x f_k'(t)\,dt = \int_a^x \left[ \sum_{k=1}^{N} f_k'(t) \right] dt. \]
We let $N$ tend to infinity, using pointwise convergence, $\sum_{k=1}^{\infty} f_k(x) = f(x)$, on the left and uniform convergence together with Theorem 4.4 on the right. Hence
\[ f(x) - f(a) = \int_a^x \left[ \sum_{k=1}^{\infty} f_k'(t) \right] dt. \]
The integrand is continuous by Corollary 4.3, so differentiation of both sides of the last equation gives $f'(x) = \sum_{k=1}^{\infty} f_k'(x)$, which is the conclusion of the theorem.

EXAMPLE 4 Consider the trigonometric series
\[ \sum_{k=1}^{\infty} \frac{\sin kx}{k^4}. \]
The series converges absolutely for all real $x$, because the terms are dominated by $k^{-4}$. Furthermore, the series of derivatives of the terms of the given series is
\[ \sum_{k=1}^{\infty} \frac{\cos kx}{k^3}. \]
Similarly, this series converges uniformly for all $x$ by the Weierstrass test, because
\[ \left| \frac{\cos kx}{k^3} \right| \le \frac{1}{k^3}, \]
and because $\sum_{k=1}^{\infty} (1/k^3)$ converges. Hence by Theorem 4.5,
\[ \frac{d}{dx} \sum_{k=1}^{\infty} \frac{\sin kx}{k^4} = \sum_{k=1}^{\infty} \frac{\cos kx}{k^3}. \]
The same kind of argument applies to give
\[ \frac{d^2}{dx^2} \sum_{k=1}^{\infty} \frac{\sin kx}{k^4} = -\sum_{k=1}^{\infty} \frac{\sin kx}{k^2}. \]
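The term-by-term derivative formula for the series with terms $(\sin kx)/k^4$ can be compared against a numerical derivative. This is a sketch on truncated sums only; the truncation length, the step size `h`, and the function names are illustrative assumptions.

```python
import math

def s(x, n=500):
    """Partial sum of sin(kx)/k^4 for k = 1..n."""
    return sum(math.sin(k * x) / k**4 for k in range(1, n + 1))

def ds(x, n=500):
    """Partial sum of the differentiated terms cos(kx)/k^3."""
    return sum(math.cos(k * x) / k**3 for k in range(1, n + 1))

def central_difference(g, x, h=1e-4):
    """Symmetric difference quotient approximating g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

x0 = 1.0
print(central_difference(s, x0), ds(x0))
```

The difference quotient of the sine series matches the cosine series closely, in line with Theorem 4.5.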

EXERCISES

1. Show that the series $\sum_{k=0}^{\infty} x^k$ converges uniformly for $-d \le x \le d$ if $0 < d < 1$.

2. (a) Show that the trigonometric series $\sum_{k=1}^{\infty} (\cos kx)/k^2$ converges uniformly for all real $x$.
(b) Prove that the series of part (a) defines a continuous function for all real $x$.

3. Show that if a trigonometric series $\frac{a_0}{2} + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx)$ converges uniformly on $[-\pi, \pi]$, then it converges uniformly for all real $x$.

4. (a) Show that if $|c_k| < B$ for some fixed number $B$, then the series
\[ u(x, t) = \sum_{k=1}^{\infty} c_k e^{-k^2 t} \sin kx \]
is a solution of $u_{xx} = u_t$ satisfying $u(0, t) = u(\pi, t) = 0$ when $t > 0$ and $x$ is in $[0, \pi]$. [Hint: For arbitrary $\delta > 0$, apply Theorem 4.5 with $t \ge \delta$.]
(b) Show that, if $u(x, t)$ in part (a) is defined for $t = 0$ by a series convergent for each $x$, then $u(x, t)$ is continuous on the set $S$ in $\mathbb{R}^2$ defined by $0 \le t$, $0 \le x \le \pi$.
(c) Show that the function $u(x, t)$ is infinitely often differentiable with respect to both $x$ and $t$, for $t > 0$.

5. Show that if a trigonometric series as displayed in Exercise 3 satisfies the conditions $|a_k| \le A/k^2$, $|b_k| \le B/k^2$, for $k = 1, 2, 3, \ldots$ and fixed constants $A$ and $B$, then the series converges uniformly for all real $x$.

6. By considering the partial sums of the power series $\sum_{k=0}^{\infty} x^k$ for $-1 < x < 1$, show that the series fails to converge uniformly on $(-1, 1)$. Show uniform convergence for $-\tfrac{1}{2} \le x \le \tfrac{1}{2}$.

*7. Show that $\sum_{k=1}^{\infty} (-1)^k (1 - x) x^k$ converges uniformly on $(0, 1]$, but that $\sum_{k=1}^{\infty} (1 - x) x^k$ converges only pointwise on $(0, 1]$. [Hint: For the first part use the error estimate in Theorem 3.10.]

8. (a) Assume that the series $\sum_{k=1}^{\infty} k^2 a_k$ and $\sum_{k=1}^{\infty} k^2 b_k$ both converge absolutely. Show that
\[ w(x, t) = \sum_{k=1}^{\infty} \sin kx\, (a_k \cos kat + b_k \sin kat) \]
is a solution of the 1-dimensional wave equation $a^2 w_{xx} = w_{tt}$. [Hint: Use the Weierstrass test and Theorem 4.5.]
(b) Show that the solution $w(x, t)$ of part (a) satisfies the boundary conditions $w(0, t) = w(\pi, t) = 0$ for $t \ge 0$ and an initial condition $w(x, 0) = h(x)$, where $h$ is twice continuously differentiable.

SECTION 5 POWER SERIES


A power series is an infinite series of the form
\[ \sum_{k=0}^{\infty} a_k (x - a)^k = a_0 + a_1 (x - a) + a_2 (x - a)^2 + \cdots. \]
Such a series defines a function of $x$ for all $x$ for which the series converges. The Taylor series discussed in Section 4 are examples of power series associated with known functions such as $e^x$, $\cos x$, and $1/(1 - x)$. Our point of view here will be
different in that we'll start with the series, rather than with some other representation
for the function, and then study the series directly as a function of x. This point of
view is essential to the use of power series in solving ordinary differential equations.
5A Interval of Convergence
We speak of the series $\sum_{k=0}^{\infty} a_k (x - a)^k$ as being a power series "about" $x = a$, because the set of real numbers $x$ for which such a series converges is always either an interval with its midpoint at $x = a$, or else is the whole real line. We won't prove this in general, but the examples will illustrate it. Figure 14.6(a) shows an interval of convergence; one-half its length is called the radius of convergence, denoted by $R$.

FIGURE 14.6 (a) An interval of convergence of radius $R$ about $a$, with divergence outside it. (b) The interval of convergence from $-1$ to $3$, with $R = 2$, and divergence outside it.
EXAMPLE 1 The power series about $x = 1$ with kth term $\bigl(-(x - 1)/2\bigr)^k$, for $k = 0, 1, 2, \ldots$, is a geometric series of the form $\sum_{k=0}^{\infty} r^k$, with $r = -(x - 1)/2$. For a geometric series we saw in Section 1 that it's just when $|r| = |-(x - 1)/2| < 1$ that convergence holds, in other words, when $|x - 1| < 2$. Thus the interval of convergence is $-1 < x < 3$, with $x = 1$ as its midpoint. See Figure 14.6(b). The radius of convergence is $R = 2$.

EXAMPLE 2 We can test the power series about $x = 0$ given by
\[ \sum_{k=1}^{\infty} \frac{1}{k 2^k} x^k \]
for absolute convergence by using the ratio test. The kth term is $a_k = 2^{-k} x^k / k$, so
\[ \lim_{k\to\infty} \left| \frac{a_{k+1}}{a_k} \right| = \lim_{k\to\infty} \left| \frac{2^{-k-1} x^{k+1}/(k+1)}{2^{-k} x^k / k} \right| = \lim_{k\to\infty} \frac{1}{2} \left( \frac{k}{k+1} \right) |x| = \frac{1}{2}|x|. \]
By the ratio test, the series converges absolutely when $\tfrac{1}{2}|x| < 1$, that is, when $|x| < 2$, and diverges when $\tfrac{1}{2}|x| > 1$, that is, when $|x| > 2$. Thus the interval of convergence has radius $R = 2$ and is centered at $x = 0$. Because the ratio test gives no information when the limit, in this case $|x|/2$, is equal to 1, we have to check that case separately. The points satisfying $|x|/2 = 1$ are just the points $x = 2$ and $x = -2$. These points are the endpoints of the interval of convergence, and direct substitution into the series shows that at $x = 2$ we have $\sum_{k=1}^{\infty} 1/k$, which diverges; at $x = -2$ we have $\sum_{k=1}^{\infty} (-1)^k/k$, which is a convergent alternating series. Thus the precise interval of convergence is $-2 \le x < 2$.
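The ratio computed in this example can be evaluated numerically for sample values of $x$ inside and outside the interval. The sketch below writes the term as $(x/2)^k/k$, which is the same quantity; the function names and the sample index 200 are illustrative.

```python
def term(x, k):
    """a_k = x^k / (k 2^k), computed as (x/2)^k / k."""
    return (x / 2.0) ** k / k

def ratio(x, k):
    """|a_(k+1) / a_k|, which equals (|x|/2) * k/(k+1)."""
    return abs(term(x, k + 1) / term(x, k))

for x in (1.0, -1.9, 3.0):
    print(x, ratio(x, 200))
```

For $|x| < 2$ the ratio settles below 1 and for $|x| > 2$ above 1. At the endpoints $x = \pm 2$ the ratio tends to 1, so the test is silent there and the separate check is needed.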

EXAMPLE 3 The series
\[ \sum_{k=0}^{\infty} \frac{x^k}{k!} \]
is the Taylor expansion of the function $e^x$, and we showed in Section 2 that the series converges, even absolutely, to $e^x$ for all real values of $x$. Hence we see that the interval of convergence is $-\infty < x < \infty$, and it's customary to say that the radius of convergence is $R = \infty$.

EXAMPLE 4 Consider the power series that has kth term $a_k = x^{2k}/(3^k (k + 2))$. We apply the ratio test.
\[ \lim_{k\to\infty} \left| \frac{a_{k+1}}{a_k} \right| = \lim_{k\to\infty} \left| \frac{x^{2k+2}/(3^{k+1}(k + 3))}{x^{2k}/(3^k (k + 2))} \right| = \lim_{k\to\infty} |x|^2 \frac{k + 2}{3(k + 3)} = \frac{1}{3} x^2. \]
By the ratio test, we have convergence for $|x| < \sqrt{3}$ and divergence for $|x| > \sqrt{3}$. A separate check shows divergence for $x = \pm\sqrt{3}$.

5B Differentiation and Integration

One reason the power series representation of functions is so useful is that, in the interior of its interval of convergence, we can differentiate and integrate a power series term by term. To see what this means in practice, we first consider some examples. The term "interior" of an interval is meant specifically not to include either endpoint of the interval.

EXAMPLE 5 The Taylor expansion
\[ \frac{1}{1 - x} = \sum_{k=0}^{\infty} x^k = 1 + x + x^2 + \cdots \]
is valid for $-1 < x < 1$. In the interior of the interval of convergence, that is for $-1 < x < 1$, we integrate both sides and include a constant of integration to get
\[ -\ln(1 - x) = c + \sum_{k=0}^{\infty} \frac{x^{k+1}}{k + 1} = c + x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots. \]
The constant $c$ is determined by setting $x = 0$ in both sides. We get
\[ 0 = -\ln 1 = 0 + c, \]
so
\[ -\ln(1 - x) = \sum_{k=1}^{\infty} \frac{x^k}{k} = x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots, \quad -1 < x < 1. \]
Computing the successive derivatives of $-\ln(1 - x)$ at $x = 0$ shows that the preceding expansion is just the Taylor expansion of $-\ln(1 - x)$ about $x = 0$. Notice that the function $\ln(1 - x)$ is defined for all $x < 1$, but that the series fails to converge when $x < -1$. Symmetry of the interval of convergence about $x = 0$ and failure of $\ln(1 - x)$ to be bounded near $x = 1$ cause the limitation of the domain of convergence.
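The expansion of $-\ln(1 - x)$ can be checked against the library logarithm at a few interior points. This is a sketch only; the sample points and the number of terms are illustrative choices.

```python
import math

def log_series(x, n):
    """Partial sum of x^k / k for k = 1..n, approximating -ln(1 - x)."""
    return sum(x**k / k for k in range(1, n + 1))

for x in (0.5, -0.5, 0.99):
    print(x, log_series(x, 60), -math.log(1.0 - x))
```

With 60 terms the agreement is excellent at $x = \pm 0.5$ but still poor at $x = 0.99$, reflecting how convergence slows near the endpoint where $\ln(1 - x)$ is unbounded.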

EXAMPLE 6 Consider the Taylor expansion
\[ \sin x = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k + 1)!} x^{2k+1} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots, \]
valid for all real $x$. If we compute derivatives on both sides, we get
\[ \cos x = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k + 1)!} (2k + 1) x^{2k} = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k)!} x^{2k} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots. \]
This is just the Taylor expansion of $\cos x$. The theorem that justifies the preceding computations is as follows.

5.1 Theorem. A power series $\sum_{k=0}^{\infty} a_k (x - a)^k$ is arbitrarily often differentiable or integrable term by term in the interior of an interval of convergence of the form $|x - a| < R$, where $R > 0$.

Proof. The case of an expansion about a point other than a = 0 follows from a
simple change of variable, as in Exercises 28 and 29. We prove first the part about
integration, under the assumption that the series
\[ f(x) = \sum_{k=0}^{\infty} a_k x^k \]

represents the function f(x) in the interval |x| < R. Assuming $-R < x_1 < R$, we'll
prove that

\[ \int_0^{x_1} f(x)\,dx = \sum_{k=0}^{\infty} a_k \int_0^{x_1} x^k\,dx = \sum_{k=0}^{\infty} a_k \frac{x_1^{k+1}}{k+1}. \]

To use Theorem 4.4, we need to verify that the given power series converges
uniformly on the interval between 0 and $x_1$. But if we choose s and r so that
$R > s > r > |x_1|$, then the series converges at x = s. Hence its terms tend to zero
and so are bounded in absolute value by some number m: $|a_k s^k| \le m$. Then for
$|x| \le r < s$, we have

\[ |a_k x^k| = |a_k s^k| \left|\frac{x}{s}\right|^k \le m \left(\frac{r}{s}\right)^k. \]

The series with kth term $m(r/s)^k$ is a convergent geometric series, because 0 < r/s < 1.
This shows by Theorems 4.1 and 4.3 that the given series converges uniformly on
(-r, r) to a continuous function, which is necessarily f(x) because the series is assumed
to converge to f(x) at each point x. Because r is a number such that 0 < r < R, we
can include an arbitrary x in (-R, R) in an interval of uniform convergence, so we can
integrate term by term on all intervals contained in $[0, x_1]$.
For the differentiation part of the theorem, we start with the same series for f (x)
and show that
\[ f'(x) = \sum_{k=1}^{\infty} k a_k x^{k-1}. \]

To do this, we apply Theorem 4.5 by showing that the differentiated series converges
uniformly on every interval $-r \le x \le r$, where 0 < r < R. Choose a number c
such that r < c < R. Since $\sum_{k=0}^{\infty} a_k c^k$ converges, there is a number b such that
$|a_k| c^k \le b$. Then for x in [-r, r] we have

\[ |k a_k x^{k-1}| \le k |a_k| r^{k-1} \le k\,\frac{b}{c}\left(\frac{r}{c}\right)^{k-1}. \]

The series with kth term $k(b/c)(r/c)^{k-1}$ converges, by the ratio test, because
0 < r/c < 1. Hence the series with kth term $k a_k x^{k-1}$ converges uniformly on [-r, r]
by the Weierstrass test. This allows us to differentiate term by term on all intervals
[-r, r] with 0 < r < R. Thus we can differentiate at each x such that -R < x < R,
simply by choosing r so that |x| < r < R. •
Theorem 5.1 allows us to differentiate and integrate a power series repeatedly,
because the result of performing one such operation on a power series convergent
when |x - a| < R is just another power series that is also convergent when
|x - a| < R.

EXAMPLE 7. Starting with

\[ \frac{1}{1 - x} = \sum_{k=0}^{\infty} x^k = 1 + x + x^2 + \cdots, \qquad |x| < 1, \]

we differentiate once to get

\[ \frac{1}{(1 - x)^2} = \sum_{k=1}^{\infty} k x^{k-1} = 1 + 2x + 3x^2 + \cdots, \qquad |x| < 1. \]

Differentiating again gives

\[ \frac{2}{(1 - x)^3} = \sum_{k=2}^{\infty} (k-1)k\, x^{k-2} = 1\cdot 2 + 2\cdot 3\,x + 3\cdot 4\,x^2 + \cdots, \qquad |x| < 1. \]
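Term-by-term differentiation acts on the list of coefficients in a simple way, which the following short sketch makes explicit for the truncated geometric series. The helper name `differentiate` is hypothetical, and the truncation to six coefficients is only for display.

```python
def differentiate(coeffs):
    """Coefficients of the term-by-term derivative of sum coeffs[k] * x^k."""
    return [k * c for k, c in enumerate(coeffs)][1:]

geometric = [1] * 6                 # 1 + x + x^2 + ... (truncated)
first = differentiate(geometric)    # coefficients of 1/(1 - x)^2
second = differentiate(first)       # coefficients of 2/(1 - x)^3
print(first)
print(second)
```

The second list reproduces the products 1·2, 2·3, 3·4, ... that appear in the last display of the example.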

If we have a power series representation for a function f about a point x = a, then
the power series is automatically the Taylor series for f about x = a. Thus we
needn't verify this in practice. Specifically, we have the following theorem.

5.2 Theorem. If

\[ f(x) = \sum_{k=0}^{\infty} a_k (x - a)^k \]

on some interval |x - a| < R, then the series is the Taylor series of f about x = a.
That is,

\[ a_n = \frac{1}{n!} f^{(n)}(a), \qquad n = 0, 1, 2, \ldots. \]

Proof. Differentiating n times in the expansion for f knocks out the first n terms,
leaving

\[ f^{(n)}(x) = \sum_{k=n}^{\infty} k(k-1)\cdots(k-n+1)\, a_k (x - a)^{k-n} = n!\,a_n + (n+1)!\,a_{n+1}(x - a) + \cdots. \]

Now set x = a on both sides. All terms on the right become zero except the first, so

\[ f^{(n)}(a) = n!\,a_n, \]

which is what we wanted to show.

5C Finding Limits by Using Series
A convergent Taylor expansion

\[ f(x) = \sum_{k=0}^{\infty} a_k (x - a)^k \]

about some point x = a represents a differentiable, and hence continuous, function
f. It follows that we can compute a limit of f(x) as x approaches a by setting
x = a to get

\[ \lim_{x \to a} f(x) = a_0. \]

This idea applies to calculation of fairly complicated limits, as the following example
shows.

To show that

\[ \lim_{x \to 0} \frac{x - \sin x}{x^3} = \frac{1}{6} \]

we just observe that $x - \sin x = x - \left(x - \frac{1}{6}x^3 + \frac{1}{120}x^5 - \cdots\right)$. Hence

\[ \frac{x - \sin x}{x^3} = \frac{1}{6} - \frac{1}{120}x^2 + \cdots. \]

Taking the limit as x → 0 amounts to setting x = 0 in the continuous function
represented by the series. The resulting limit is 1/6.
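A quick numerical check of this limit is sketched below. It compares the quotient with the two-term series value 1/6 - x²/120; the function names are illustrative only.

```python
import math

def quotient(x):
    """The expression (x - sin x) / x^3 whose limit at 0 is sought."""
    return (x - math.sin(x)) / x**3

def two_term_series(x):
    """First two terms of the series for the quotient."""
    return 1 / 6 - x**2 / 120

for x in (0.5, 0.1, 0.01):
    print(x, quotient(x), two_term_series(x))
```

For very small x the subtraction x - sin x loses precision in floating point, which is one practical reason to prefer the series form near x = 0.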
5D Products and Quotients
We can multiply and divide power series very much like polynomials. The product
of two power series about the same point gives a third series called their Cauchy
product, which is simply the series formed by collecting equal powers of x:

\[ (a_0 + a_1 x + a_2 x^2 + \cdots)(b_0 + b_1 x + b_2 x^2 + \cdots) = a_0 b_0 + (a_1 b_0 + a_0 b_1)x + (a_0 b_2 + a_1 b_1 + a_2 b_0)x^2 + \cdots. \tag{5.3} \]

The coefficient of $x^k$ is $c_k = a_0 b_k + a_1 b_{k-1} + \cdots + a_{k-1} b_1 + a_k b_0$, and x may be
replaced by (x - a) in all three series. The relevant theorem, which we won't prove
here, states that the Cauchy product converges to the correct value in the interior of
the common interval of convergence of the two factors.

EXAMPLE 9. Here are several simple examples.

(a) Recall that a polynomial in x is a (finite) power series:

\[ (1 + x)(1 + x + x^2 + x^3 + \cdots) = 1 + 2x + 2x^2 + 2x^3 + \cdots, \]

which is valid for -1 < x < 1.

(b) Multiplying together the power series about x = 0 for $(x - 1)^{-1}$ and ln(1 - x),
we get, after canceling minus signs,

\[ (1 + x + x^2 + x^3 + \cdots)\left(x + \tfrac{1}{2}x^2 + \tfrac{1}{3}x^3 + \cdots\right) = x + \left(1 + \tfrac{1}{2}\right)x^2 + \left(1 + \tfrac{1}{2} + \tfrac{1}{3}\right)x^3 + \cdots = x + \tfrac{3}{2}x^2 + \tfrac{11}{6}x^3 + \cdots, \]

which is valid for -1 < x < 1.

(c) Multiplying together the power series for 1/x and ln x about x = 1, we get

\[ \left(1 - (x-1) + (x-1)^2 - (x-1)^3 + \cdots\right)\left((x-1) - \tfrac{1}{2}(x-1)^2 + \tfrac{1}{3}(x-1)^3 - \cdots\right) \]
\[ = (x-1) + \left(-1 - \tfrac{1}{2}\right)(x-1)^2 + \left(1 + \tfrac{1}{2} + \tfrac{1}{3}\right)(x-1)^3 - \cdots = (x-1) - \tfrac{3}{2}(x-1)^2 + \tfrac{11}{6}(x-1)^3 - \cdots. \]

Examples (b) and (c) show that it may be impossible to find a simple formula for
the kth coefficient in a power series.
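The Cauchy product formula translates directly into a short routine. The sketch below uses exact rational arithmetic and reproduces the coefficients of part (b); the function name `cauchy_product` and the five-term truncation are assumptions made for illustration.

```python
from fractions import Fraction

def cauchy_product(a, b):
    """Coefficients c_k = a_0 b_k + a_1 b_(k-1) + ... + a_k b_0 of the product series."""
    n = min(len(a), len(b))
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(n)]

geometric = [Fraction(1)] * 5                                      # 1 + x + x^2 + ...
log_part = [Fraction(0)] + [Fraction(1, k) for k in range(1, 5)]   # x + x^2/2 + x^3/3 + ...

product = cauchy_product(geometric, log_part)
print(product)
```

Only the first min(len(a), len(b)) coefficients are returned, because later coefficients would depend on terms that were truncated away.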

While division of power series is theoretically the reverse of multiplication, there
are some things to watch out for in practice. Suppose we are looking for coefficients
$a_k$ that satisfy

\[ \frac{c_0 + c_1 x + c_2 x^2 + \cdots}{b_0 + b_1 x + b_2 x^2 + \cdots} = a_0 + a_1 x + a_2 x^2 + \cdots. \]

We do need $b_0 \ne 0$, since $a_0 = c_0 / b_0$. (However, if $b_0 = 0$ and $b_1 \ne 0$, we
could factor an x from the denominator and then proceed.) Also, if the series in
the denominator takes on the value zero for some, possibly complex, value of x,
that would limit the convergence interval of the quotient series. With those two
observations in mind, we could then proceed to find the $a_k$ by solution of the sequence
of equations

\[ c_0 = a_0 b_0, \qquad c_1 = a_0 b_1 + a_1 b_0, \qquad c_2 = a_0 b_2 + a_1 b_1 + a_2 b_0, \qquad \ldots \]

for the desired values of the $a_k$. You can carry this process out as far as you like,
because the solution of each equation depends only on the solution of the previous
equations in the list. For this reason you can also truncate the series in the numerator
and denominator and use polynomial long division to compute a preassigned number
of terms.

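To compute the first few terms in the expansion of $(1 + e^x)^{-1}$, note that $c_0 = 1$, $c_1 =
c_2 = \cdots = 0$. Also $b_0 = 2$, $b_k = 1/k!$, k = 1, 2, 3, .... We then solve

\[ 2a_0 = 1, \qquad a_0 + 2a_1 = 0, \qquad \tfrac{1}{2}a_0 + a_1 + 2a_2 = 0, \qquad \tfrac{1}{6}a_0 + \tfrac{1}{2}a_1 + a_2 + 2a_3 = 0. \]

The result is $a_0 = \tfrac{1}{2}$, $a_1 = -\tfrac{1}{4}$, $a_2 = 0$, $a_3 = \tfrac{1}{48}$, .... Hence

\[ \frac{1}{1 + e^x} = \frac{1}{2} - \frac{1}{4}x + \frac{1}{48}x^3 + \cdots. \]

To then find the first three terms in the expansion of sin x/(1 + e^x), we compute

\[ \left(x - \tfrac{1}{6}x^3 + \cdots\right)\left(\tfrac{1}{2} - \tfrac{1}{4}x + \tfrac{1}{48}x^3 + \cdots\right) = \tfrac{1}{2}x - \tfrac{1}{4}x^2 - \tfrac{1}{12}x^3 + \cdots. \]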
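The triangular system used for division can be solved one coefficient at a time. The following sketch, with the hypothetical name `divide_series`, carries this out in exact arithmetic for the quotient 1/(1 + e^x) discussed above.

```python
from fractions import Fraction
from math import factorial

def divide_series(c, b):
    """First len(c) coefficients a_k of (sum c_k x^k) / (sum b_k x^k); requires b[0] != 0."""
    a = []
    for k in range(len(c)):
        # c_k = a_0 b_k + ... + a_(k-1) b_1 + a_k b_0, solved for a_k
        known = sum(a[i] * b[k - i] for i in range(k))
        a.append((c[k] - known) / b[0])
    return a

n = 4
numerator = [Fraction(1)] + [Fraction(0)] * (n - 1)                        # the constant 1
denominator = [Fraction(2)] + [Fraction(1, factorial(k)) for k in range(1, n)]  # 1 + e^x

quotient = divide_series(numerator, denominator)
print(quotient)
```

Each new coefficient uses only the ones already found, which is why truncating both series to n terms still gives the first n coefficients of the quotient exactly.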

EXERCISES

Using the ratio test, or by other means, in Exercises In Exercises 15 to 18, use the Taylor expansion ln(l -
1 to 8 find the interval of convergence of the power
series. In case the interval has finite endpoints, determine x) =- E!xk
k
k=I
to derive the Taylor expansion, valid
whether the series converges when x is equal to each of for Ix I < 1, for the function.
the endpoints, and sketch the interval.
+ x) !2 ln ( 11 +-xx)
1. f: !xk
k
k=I
2. E
00

k=I
1
- (x
k
-2/
15. ln(l 16.

17. ln(l + x 2) 18. x 2 In(] - x 3 )


3. E-
1 00

k=O !(2k)
-x2k 4. E k 2 <x + 1/
k=I
In Exercises 19 to 22, use the Taylor expansion
k 1
(l-x)- 1 = L~oxk to derive the Taylor expansion for
00 00
5. E --<x +2/ 6. E ~---,,--xk
k=I k + 1 k=O.Jk2+1 the function about the point a.
00 l 00
,. E -<x + 3)2k+1 s. E zkx2k 19.
1
+ x 2 , about a = 0
1
20. - -2 , about a =0
k=I y'k k=O 1 1-x
In Exercises 9 to 14, use the Taylor expansion ex = 1
21. -, about a =1
X
22. - - . about a= 0
d E_!_k! xk to derive the Taylor expansions about the
k=O
X 1-x

point a = 0 for the function. 23. Use the relation d(arctanx)/dx = 1/(1 + x 2 ) and the
result of Exercise 19 to derive the Taylor expansion of
10. (e-' + e-x)/2 = cosh x arctan x about a = 0.
11. (ex - e-x)/2 = sinhx 12. xex
2
24. Use the relation d(l + x 2 )- 1 /dx == -2x/(l + x 2 ) 2 and
n. c 14. e5x the result of Exercise 19 to derive the Taylor expansion
of (1 + x 2)-2 about a = 0.

25. Use the Taylor expansion (1 - x)- 1 = I:~ 0 xk to from the formula
prove that
)xii< R.

30. (a) Prove that limn_, c-) n In ( I + ~) = a by using the


First find the expansion for (1 - x )- 3 • Taylor expansion for ln(l + x) about x = 0.
26. Prove that L~ox-k = x/(x - 1) for !xi> I. (b) Use part (a) to prove that limn--->oo (I+ ~r
= ea.

27. Let a be a real number. In Exercises 31 to 34, use Taylor expansions to find the
(a) Prove that, if j(x) = (1 + x)a, then j(k)(0) limit.
a(a -1) ··· (a -k + I), so that the Taylor expansion 2 5
1. Jim ln(l + x ) ln(l - x )
of (1 + x)a about O is 3 x--->0 x2 32. J~ xS
·m x + In( I - x) cos x - I + x 2/2
~ a(a - 1) .. · (a - k + I) k
33. ll 34. Jim
x->0 x2 x--->0 x4
(I +xt =I+~ k! X •
k=I 35. Find the Taylor expansion of f (x) = 1/(x + c) about
X = 0 for C-/= 0.
(b) Write out the first four terms of the expansion in part 36. Find the Taylor expansion of g(x) = 1/[(x + c)(x + d)]
(a) for a = 3, a = -3, and a = ½- about x = 0, for c -/= 0, d -/= 0, and c -/= d, by expressing
28. Derive the formula g as a sum of two fractions.
37. Prove that, if f (x) = L~o CkXk converges in some
Ix _ al < R, interval, then

00

J"(x) + f(x) = L[ck + (k + l)(k + 2)ck+2]xk


from the formula
k=O

d 00 k) 00 k-1
dx (
Lakx = Lkakx , lxl < R. in the interior of the same interval.
k=O k=l 38. Use Theorem 5.2 to prove that, if two power series
29. Derive the formula 00

Lak(X - al and
k=O

converge to the same function on an interval containing


x = a, then ak = bk for k = 0, 1, 2, 3 ....

SECTION 6 DIFFERENTIAL EQUATIONS


If a differential equation of the form

\[ y' = f(x, y) \]

has a solution y = y(x) near $x = x_0$, the equation determines the derivative $y'(x_0)$
from $y'(x_0) = f(x_0, y(x_0))$. If f is sufficiently differentiable, even higher derivatives
are similarly determined at $x_0$, as in the following example.

The equation $y' = y^2$ has $y = (1 - x)^{-1}$ for a solution satisfying y(0) = 1. The
power series expansion

\[ y = \frac{1}{1 - x} = 1 + x + x^2 + x^3 + \cdots \]

is a Taylor series, so by the Taylor coefficient formulas we also have

\[ y = y(0) + \frac{y'(0)}{1!}x + \frac{y''(0)}{2!}x^2 + \frac{y'''(0)}{3!}x^3 + \cdots. \]

Comparison of coefficients of $x^k$ in the two expansions shows that $y^{(k)}(0)/k! = 1$,
so $y^{(k)}(0) = k!$, k = 0, 1, 2, .... Suppose, however, that we had no formula for
the coefficients to begin with. (Most examples are of this sort.) Starting with the
given differential equation, we compute successive derivatives and then simplify by
substitution from the earlier equations:

\[ y' = y^2, \]
\[ y'' = 2yy' = 2y^3, \]
\[ y''' = 6y^2 y' = 6y^4, \]
\[ y^{(4)} = 24y^3 y' = 24y^5. \]

The general pattern is evidently $y^{(k)} = k!\,y^{k+1}$. The formal Taylor expansion for
a solution y = y(x) about $x_0$ with $y(x_0) = y_0$ is then

\[ y(x) = y_0 + y_0^2 (x - x_0) + y_0^3 (x - x_0)^2 + \cdots = y_0\left(1 + y_0 (x - x_0) + y_0^2 (x - x_0)^2 + \cdots\right) = \frac{y_0}{1 - y_0 (x - x_0)}. \]

We can treat higher-order equations, for example, $y'' = f(x, y, y')$, in a way
similar to what we used in the previous example.

EXAMPLE 2. Suppose we want successive derivatives $y^{(k)}(0)$ of a solution of

\[ y'' = yy' \]

given that y(0) = 1 and y'(0) = -1. First compute from the given equation some
formulas for higher derivatives. Then simplify by substituting the given values y(0) =
1, y'(0) = -1. We find

\[ y'' = yy'; \qquad y''(0) = -1, \]
\[ y''' = yy'' + (y')^2 = y^2 y' + (y')^2; \qquad y'''(0) = 0, \]
\[ y^{(4)} = 2y(y')^2 + y^2 y'' + 2y'y'' = 4y(y')^2 + y^3 y'; \qquad y^{(4)}(0) = 3. \]

The first five terms of the Taylor expansion of y(x) about x = 0 then add up to

\[ y(0) + \frac{y'(0)}{1!}x + \frac{y''(0)}{2!}x^2 + \frac{y'''(0)}{3!}x^3 + \frac{y^{(4)}(0)}{4!}x^4 = 1 - x - \frac{1}{2}x^2 + \frac{1}{8}x^4. \]

In this example there doesn't seem to be a simple coefficient pattern, but we could
compute as many terms as we had time and space for.
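The same coefficients can be generated mechanically. The sketch below does not differentiate the equation repeatedly as the example does; instead it assumes a power series for y, substitutes it into y'' = yy', and matches coefficients using a Cauchy product. The function name `taylor_coeffs` is a label chosen here.

```python
from fractions import Fraction

def taylor_coeffs(y0, y1, n):
    """First n Taylor coefficients about 0 of the solution of y'' = y*y', y(0)=y0, y'(0)=y1."""
    c = [Fraction(y0), Fraction(y1)]
    for k in range(n - 2):
        # Coefficient of x^k in y*y', where y' has coefficients (j + 1) * c[j + 1]
        rhs = sum(c[i] * (k - i + 1) * c[k - i + 1] for i in range(k + 1))
        # Coefficient of x^k in y'' is (k + 1)(k + 2) c[k + 2]
        c.append(rhs / ((k + 1) * (k + 2)))
    return c

coeffs = taylor_coeffs(1, -1, 5)
print(coeffs)
```

Raising n gives as many further coefficients as desired, at the cost of one Cauchy-product sum per coefficient.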

EXAMPLE 3. From the differential equation

\[ y'' = xy' - y \]

we compute, using substitution after differentiating,

\[ y''' = xy'' = x(xy' - y) = x^2 y' - xy, \]
\[ y^{(4)} = x^2 y'' + xy' - y = x^2 (xy' - y) + xy' - y = (x^3 + x)y' - (x^2 + 1)y. \]

If we denote y(0) by $c_0$ and y'(0) by $c_1$, then

\[ y''(0) = -c_0, \qquad y'''(0) = 0, \qquad y^{(4)}(0) = -c_0. \]

Thus the Taylor expansion of y(x) about x = 0 has the form

\[ y(x) = c_0 + c_1 x - \frac{c_0}{2!}x^2 + 0 - \frac{c_0}{4!}x^4 - \cdots = c_0\left(1 - \tfrac{1}{2}x^2 - \tfrac{1}{24}x^4 - \cdots\right) + c_1 x. \]

For comparison, note that if the original differential equation were replaced by

\[ y'' = -y \]

then solutions would have the form

\[ y(x) = c_0 \cos x + c_1 \sin x, \]

and that $y(0) = c_0$, $y'(0) = c_1$.
EXERCISES

In Exercises 1 to 4, find the first three nonzero terms in the Taylor expansion about
x = 0 of y = y(x) if y and its derivatives satisfy the given relation.

1. $y' = y^2 + y$, y(0) = 1
2. $y' = y^2 + x$, y(0) = -1
3. $y' = xy$, y(0) = 2
4. $y'' = xy$, y(0) = 1, y'(0) = 0
5. Find the first four nonzero terms in the Taylor expansion of y = y(x) about x = 0
if $y'' = yy'$ and y(0) = y'(0) = 1.
6. Find the first four nonzero terms in the Taylor expansion of y = y(x) about x = 1
if $y''' = y$ and y(1) = 2, y'(1) = 0, y''(1) = 1.
7. Suppose $y'' = x^2 y$ while $y(0) = c_0$ and $y'(0) = c_1$. Show that y = y(x) has the form

\[ y = c_0\left(1 + \tfrac{1}{12}x^4 + \cdots\right) + c_1\left(x + \tfrac{1}{20}x^5 + \cdots\right). \]

8. Show that if $y''' = y^2 y'$ and y(0) = y'(0) = y''(0) = 1, then

\[ y = 1 + x + \tfrac{1}{2}x^2 + \tfrac{1}{6}x^3 + \tfrac{1}{8}x^4 + \tfrac{3}{40}x^5 + \cdots. \]
SECTION 7 POWER SERIES SOLUTIONS
The solutions of many of the differential equations we have studied may be represented
in terms of their Taylor expansions. For example, polynomials, the elementary
transcendental functions cos x, sin x, $e^x$, and linear combinations of all these have
Taylor expansions that are valid for all x. Beyond these examples there is a large and
important class of differential equations that has solutions representable by power
series, and even if the solution so represented is not a combination of elementary
functions at all, the infinite series expansion may serve to define a new function
nearly as important as some of the more familiar ones. Furthermore, the partial sums
of a series expansion often give useful approximations to the true solution.
Recall that the Taylor expansion of a function f has the form

\[ \sum_{k=0}^{\infty} \frac{f^{(k)}(x_0)}{k!}(x - x_0)^k = f(x_0) + \frac{1}{1!}f'(x_0)(x - x_0) + \frac{1}{2!}f''(x_0)(x - x_0)^2 + \cdots. \]
If such a series converges anywhere but at x = xo then it's absolutely convergent
in an interval xo - R < x < xo + R that is symmetric about xo, and within such
an interval we can treat such series very much like the polynomials in x that arise
as special cases. In particular we can add and multiply Taylor expansions about the
same point x 0 , and a Taylor expansion is differentiable or integrable term by term to
produce the Taylor expansion of the derivative or integral of the expanded function
within the interval of convergence. A function f is called analytic if its Taylor series
converges to f (x) for all x in such an interval.

EXAMPLE 1. The Taylor expansions

\[ e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}, \qquad -\infty < x < \infty, \]
\[ \cos x = \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k}}{(2k)!}, \qquad -\infty < x < \infty, \]
\[ \sin x = \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k+1}}{(2k+1)!}, \qquad -\infty < x < \infty, \]
\[ \frac{1}{1 - x} = \sum_{k=0}^{\infty} x^k, \qquad -1 < x < 1, \]
\[ \ln x = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}(x - 1)^k}{k}, \qquad 0 < x < 2, \]

are computable directly from the general Taylor formula and will converge to the
value of the function on the left for each x in the indicated interval. Furthermore,
we may compute the derivative of a function such as cos 2x as follows:

\[ \frac{d}{dx}\cos 2x = \frac{d}{dx}\sum_{k=0}^{\infty} \frac{(-1)^k (2x)^{2k}}{(2k)!} = \sum_{k=0}^{\infty} \frac{d}{dx}\,\frac{(-1)^k (2x)^{2k}}{(2k)!} \]
\[ = \sum_{k=1}^{\infty} \frac{(-1)^k (2k)(2x)^{2k-1}(2)}{(2k)!} = 2\sum_{k=1}^{\infty} \frac{(-1)^k (2x)^{2k-1}}{(2k-1)!} \]
\[ = -2\sum_{k=0}^{\infty} \frac{(-1)^k (2x)^{2k+1}}{(2k+1)!} = -2\sin 2x. \]

In the last step we simply replaced k by k + 1 throughout to make the expansion
look more like the expansion in the preceding examples.

Many of the most important examples of series solutions of differential equations


are expressible as power series about the point x = 0. Since Taylor expansions about
zero are also a little easier to work with, most of our examples will be of that kind.
The next example shows how to solve a familiar differential equation using series.
To solve y'' + y = 0, we try to find a solution of the form

\[ y(x) = \sum_{k=0}^{\infty} c_k x^k. \]

This form of the expansion is particularly appropriate if we want to solve an initial-
value problem with y(0) and y'(0) specified, because then $c_0 = y(0)$ and $c_1 = y'(0)$,
if there is a Taylor expansion for the solution about x = 0. Proceeding under that
assumption for the moment, we compute

\[ y'(x) = \sum_{k=1}^{\infty} k c_k x^{k-1}, \]
\[ y''(x) = \sum_{k=2}^{\infty} (k-1)k\, c_k x^{k-2} = \sum_{k=0}^{\infty} (k+1)(k+2)\, c_{k+2} x^k. \]

We have shifted the summation index by 2 in the expression for y''(x) to make
addition to the series for y(x) more convenient. We find that for y(x) to represent a
solution we must have

\[ y''(x) + y(x) = \sum_{k=0}^{\infty} c_k x^k + \sum_{k=0}^{\infty} (k+1)(k+2)\, c_{k+2} x^k = \sum_{k=0}^{\infty} \left[c_k + (k+1)(k+2)\, c_{k+2}\right] x^k = 0. \]
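Next we use the fundamental property of a Taylor series that for an expansion to be
identically zero all the coefficients must be zero. Hence

\[ c_{k+2} = -\frac{c_k}{(k+1)(k+2)}, \qquad k = 0, 1, 2, \ldots. \]

Recalling that we can specify a particular solution by determining the numbers $c_0 =
y(0)$, $c_1 = y'(0)$, it's natural to compute recursively

\[ \begin{array}{ll}
c_2 = -\dfrac{c_0}{1\cdot 2} = -\dfrac{c_0}{2!}, & c_3 = -\dfrac{c_1}{2\cdot 3} = -\dfrac{c_1}{3!}, \\[2ex]
c_4 = -\dfrac{c_2}{3\cdot 4} = \dfrac{c_0}{4!}, & c_5 = -\dfrac{c_3}{4\cdot 5} = \dfrac{c_1}{5!}, \\[2ex]
c_6 = -\dfrac{c_4}{5\cdot 6} = -\dfrac{c_0}{6!}, & c_7 = -\dfrac{c_5}{6\cdot 7} = -\dfrac{c_1}{7!}, \\[1ex]
\quad\vdots & \quad\vdots \\[1ex]
c_{2k} = -\dfrac{c_{2k-2}}{(2k-1)(2k)} = (-1)^k \dfrac{c_0}{(2k)!}, & c_{2k+1} = -\dfrac{c_{2k-1}}{2k(2k+1)} = (-1)^k \dfrac{c_1}{(2k+1)!}.
\end{array} \]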

If we take $y(0) = c_0$ and $y'(0) = c_1 = 0$, then only the first column contains nonzero
entries, and we get the solution

\[ y_0(x) = c_0 - \frac{c_0}{2!}x^2 + \frac{c_0}{4!}x^4 - \cdots + (-1)^k \frac{c_0}{(2k)!}x^{2k} + \cdots = c_0 \cos x. \]

On the other hand, the choice $y(0) = c_0 = 0$ and $y'(0) = c_1$ makes the entries in
the first column all zero, so we get another solution from the second column:

\[ y_1(x) = c_1 x - \frac{c_1}{3!}x^3 + \frac{c_1}{5!}x^5 - \cdots + (-1)^k \frac{c_1}{(2k+1)!}x^{2k+1} + \cdots = c_1 \sin x. \]

Thus the general solution is $y(x) = c_0 \cos x + c_1 \sin x$ as we expected.
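The two-term recurrence for y'' + y = 0 is easy to run numerically. The sketch below builds the coefficients from chosen initial values and compares the resulting partial sums with cos x and sin x; the names `series_solution` and `evaluate` and the 30-term cutoff are assumptions for illustration.

```python
import math

def series_solution(c0, c1, n):
    """First n coefficients from c_(k+2) = -c_k / ((k + 1)(k + 2))."""
    c = [c0, c1]
    for k in range(n - 2):
        c.append(-c[k] / ((k + 1) * (k + 2)))
    return c

def evaluate(coeffs, x):
    """Value of the polynomial sum coeffs[k] * x^k."""
    return sum(ck * x**k for k, ck in enumerate(coeffs))

cosine_like = series_solution(1.0, 0.0, 30)   # y(0) = 1, y'(0) = 0
sine_like = series_solution(0.0, 1.0, 30)     # y(0) = 0, y'(0) = 1
print(evaluate(cosine_like, 1.0), math.cos(1.0))
print(evaluate(sine_like, 1.0), math.sin(1.0))
```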

Solutions of Airy's equation y'' + xy = 0 are not obtainable in terms of elementary
functions, so we try

\[ y(x) = \sum_{k=0}^{\infty} c_k x^k, \qquad y''(x) = \sum_{k=2}^{\infty} (k-1)k\, c_k x^{k-2}. \]
Substitution into the differential equation gives

\[ y''(x) + xy(x) = \sum_{k=2}^{\infty} (k-1)k\, c_k x^{k-2} + \sum_{k=0}^{\infty} c_k x^{k+1} = \sum_{k=0}^{\infty} (k+1)(k+2)\, c_{k+2} x^k + \sum_{k=1}^{\infty} c_{k-1} x^k = 0, \]

where this time we shifted the index in the first summation up by 2 and in the second
summation down by 1 to make the exponents of x agree. Thus we get a single term,
the constant $2c_2$, in the first summation that does not correspond to a term in the
second summation. We can write then

\[ 2c_2 + \sum_{k=1}^{\infty} \left[(k+1)(k+2)\, c_{k+2} + c_{k-1}\right] x^k = 0; \]

setting all coefficients equal to zero gives

\[ c_2 = 0, \qquad c_{k+2} = -\frac{c_{k-1}}{(k+1)(k+2)}, \qquad k = 1, 2, 3, \ldots. \]
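We find as a result that the terms are determined in sequences with indexes differing
by 3, and that

\[ 0 = c_2 = c_5 = c_8 = \cdots = c_{3k+2} = \cdots. \]

However, if $c_0$ or $c_1$ is not zero, we compute as follows:

\[ \begin{array}{ll}
c_3 = -\dfrac{c_0}{2\cdot 3}, & c_4 = -\dfrac{c_1}{3\cdot 4}, \\[2ex]
c_6 = -\dfrac{c_3}{5\cdot 6} = \dfrac{c_0}{2\cdot 3\cdot 5\cdot 6}, & c_7 = -\dfrac{c_4}{6\cdot 7} = \dfrac{c_1}{3\cdot 4\cdot 6\cdot 7}, \\[2ex]
c_9 = -\dfrac{c_6}{8\cdot 9} = -\dfrac{c_0}{2\cdot 3\cdot 5\cdot 6\cdot 8\cdot 9}, & c_{10} = -\dfrac{c_7}{9\cdot 10} = -\dfrac{c_1}{3\cdot 4\cdot 6\cdot 7\cdot 9\cdot 10}, \\[1ex]
\quad\vdots & \quad\vdots \\[1ex]
c_{3k} = -\dfrac{c_{3k-3}}{(3k-1)3k} = \dfrac{(-1)^k c_0}{2\cdot 3\cdot 5\cdot 6 \cdots (3k-1)3k}, & c_{3k+1} = -\dfrac{c_{3k-2}}{3k(3k+1)} = \dfrac{(-1)^k c_1}{3\cdot 4\cdot 6\cdot 7 \cdots 3k(3k+1)}.
\end{array} \]

The solution determined by $y(0) = c_0 = 1$, $y'(0) = c_1 = 0$ is

\[ y_0(x) = 1 + \sum_{k=1}^{\infty} \frac{(-1)^k x^{3k}}{2\cdot 3\cdot 5\cdot 6 \cdots (3k-1)3k}, \]

and the solution determined by $y(0) = c_0 = 0$, $y'(0) = c_1 = 1$ is

\[ y_1(x) = x + \sum_{k=1}^{\infty} \frac{(-1)^k x^{3k+1}}{3\cdot 4\cdot 6\cdot 7 \cdots 3k(3k+1)}. \]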

The power series for $y_0$ and $y_1$ both converge for all x because the denominators of
the kth terms each contain increasing integer factors, 2k in number. Hence each series
has terms dominated by those of an everywhere convergent series, for example,

\[ \left| \frac{(-1)^k x^{3k}}{2\cdot 3\cdot 5\cdot 6 \cdots (3k-1)3k} \right| \le \frac{|x|^{3k}}{(2k)!}. \]

In practice estimates of this kind are useful for testing the accuracy we get by
stopping with a specified number of terms in a Taylor expansion. In this example,
to get an estimate when $|x| \le 1$, we estimate the tail of the factorial series by a
geometric series with ratio $1/4n^2$:

\[ (2n)! \sum_{k=n}^{\infty} \frac{1}{(2k)!} = 1 + \frac{1}{(2n+1)(2n+2)} + \frac{1}{(2n+1)\cdots(2n+4)} + \cdots \le 1 + \frac{1}{4n^2} + \frac{1}{(4n^2)^2} + \cdots = \frac{4n^2}{4n^2 - 1}. \]

Thus the error in stopping after n - 1 terms on the interval $-1 \le x \le 1$ is at most

\[ \frac{4n^2}{4n^2 - 1} \cdot \frac{1}{(2n)!}. \]

For n = 5, that is, keeping terms of degree 4, the error is at most $3 \cdot 10^{-7}$.
Since the two solutions $y_0$ and $y_1$ are linearly independent, the general solution
of y'' + xy = 0 has the form

\[ y(x) = c_0 y_0(x) + c_1 y_1(x). \]


The solutions of y" + cy = 0 have infinitely many zero values if c > 0 and at most
one such when c < 0. Analogously it's true that a solution of the Airy equation has
infinitely many positive zeros, but at most one negative zero.
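The three-step recurrence for Airy's equation can be run in exact arithmetic to confirm the coefficient pattern and the size of the neglected tail. This is a minimal sketch; the function name `airy_coeffs` and the cutoff at 16 coefficients are choices made here.

```python
from fractions import Fraction

def airy_coeffs(c0, c1, n):
    """First n coefficients for y'' + x*y = 0: c_2 = 0, c_(k+2) = -c_(k-1) / ((k+1)(k+2))."""
    c = [Fraction(c0), Fraction(c1), Fraction(0)]
    for k in range(1, n - 2):
        c.append(-c[k - 1] / ((k + 1) * (k + 2)))
    return c[:n]

c = airy_coeffs(1, 0, 16)            # coefficients of y_0 through x^15
print(c[3], c[6], c[9])

# Size at x = 1 of the terms beyond degree 12 that are present in this truncation
tail = abs(float(sum(c[13:])))
print(tail)
```

The printed tail is far below the bound 3·10^-7 quoted in the text for n = 5, which is consistent with that bound being a deliberately generous estimate.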

EXAMPLE 4. The Legendre equation of index m is

\[ (1 - x^2)y'' - 2xy' + m(m+1)y = 0. \]

To find solutions in the form of power series, we let

\[ y(x) = \sum_{k=0}^{\infty} c_k x^k, \qquad y'(x) = \sum_{k=1}^{\infty} k c_k x^{k-1}, \qquad y''(x) = \sum_{k=2}^{\infty} (k-1)k\, c_k x^{k-2}, \]

so that the $c_k$ are determined by

\[ (1 - x^2)\sum_{k=2}^{\infty} (k-1)k\, c_k x^{k-2} - 2x\sum_{k=1}^{\infty} k c_k x^{k-1} + m(m+1)\sum_{k=0}^{\infty} c_k x^k = 0. \]

Shifting the index by 2 in the first sum allows us to write

\[ [2c_2 + m(m+1)c_0] + [2\cdot 3\, c_3 - 2c_1 + m(m+1)c_1]x + \sum_{k=2}^{\infty} \left[(k+1)(k+2)\, c_{k+2} - \left((k-1)k + 2k - m(m+1)\right) c_k\right] x^k = 0. \]

Setting the coefficient of each power of x equal to zero gives

\[ c_2 = -\frac{m(m+1)}{2}\, c_0, \qquad c_3 = -\frac{(m-1)(m+2)}{2\cdot 3}\, c_1, \]
\[ c_{k+2} = \frac{(k+1+m)(k-m)}{(k+1)(k+2)}\, c_k, \qquad k \ge 2. \]

Since the recurrence relation contains a shift by 2, it's natural to split the coefficients
into those of even and those of odd index:

\[ c_4 = \frac{(3+m)(2-m)}{3\cdot 4}\, c_2 = \frac{(m-2)m(m+3)(m+1)}{4!}\, c_0, \]
\[ c_6 = \frac{(5+m)(4-m)}{5\cdot 6}\, c_4 = -\frac{(m-4)(m-2)m(m+5)(m+3)(m+1)}{6!}\, c_0, \]

and

\[ c_5 = \frac{(4+m)(3-m)}{4\cdot 5}\, c_3 = \frac{(m-3)(m-1)(m+4)(m+2)}{5!}\, c_1, \]
\[ c_7 = \frac{(6+m)(5-m)}{6\cdot 7}\, c_5 = -\frac{(m-5)(m-3)(m-1)(m+6)(m+4)(m+2)}{7!}\, c_1. \]

If m = 2l is a positive even integer, then all even coefficients are zero beyond $c_{2l}$.
Thus the series expansion with even powers reduces to an even polynomial in that
case. Similarly, if m = 2l + 1 is a positive odd integer, the series expansion with odd
powers reduces to an odd polynomial. For example, when m = 4 and $c_0 = c_1 = 1$,
we get the solutions

\[ P_4(x) = 1 - \frac{4\cdot 5}{2!}x^2 + \frac{2\cdot 4\cdot 7\cdot 5}{4!}x^4, \]
\[ Q_4(x) = x - \frac{3\cdot 6}{3!}x^3 + \frac{1\cdot 3\cdot 8\cdot 6}{5!}x^5 - \frac{(-1)\cdot 1\cdot 3\cdot 10\cdot 8\cdot 6}{7!}x^7 + \cdots. \]

Using the ratio test for convergence (see Exercise 6) shows that the infinite series
solution converges for -1 < x < 1. These two solutions form the basis for the
collection of all solutions of the homogeneous Legendre equation of index 4 on the
interval -1 < x < 1. In general, the Legendre equation of integer index m has two
independent solutions of which one is a polynomial and the other is not.
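The termination of one of the two series for integer m can be seen by running the recurrence. In the sketch below the general formula is applied from k = 0 onward, which agrees with the separate expressions for c_2 and c_3 given above; the function name `legendre_coeffs` is an assumption made for illustration.

```python
from fractions import Fraction

def legendre_coeffs(m, c0, c1, n):
    """First n coefficients from c_(k+2) = (k+1+m)(k-m) / ((k+1)(k+2)) * c_k."""
    c = [Fraction(c0), Fraction(c1)]
    for k in range(n - 2):
        c.append(Fraction((k + 1 + m) * (k - m), (k + 1) * (k + 2)) * c[k])
    return c

even_part = legendre_coeffs(4, 1, 0, 10)   # even solution for m = 4
odd_part = legendre_coeffs(4, 0, 1, 8)     # odd solution for m = 4
print(even_part)
print(odd_part)
```

For m = 4 the even list stops after the x^4 coefficient, because the factor (k - m) vanishes at k = 4, while the odd list continues indefinitely.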

EXERCISES

1. Use the method of power series to derive the general solution
$y = c_0 \cosh x + c_1 \sinh x$ to the differential equation y'' - y = 0.

2. (a) Show that, if y = f(x) is a solution of y'' + xy = 0, then y = f(-x) is a solution of
y'' - xy = 0.
(b) Use the result of part (a) together with the result of Example 3 of the text to find a
power series expansion for the general solution of y'' - xy = 0.

3. (a) Apply the method of power series to solve the first-order differential equation
y' + 2xy = 0.
(b) Solve the differential equation in part (a) by finding an exponential multiplier and
then integrating.
(c) Do the results of parts (a) and (b) agree for all x?

4. (a) Apply the power series method to find the general solution of the differential
equation y'' - xy' = 0 in the form $y(x) = c_0 y_0(x) + c_1 y_1(x)$.
(b) Solve the differential equation in part (a) by solving the equivalent system
y' = u, u' = xu.
(c) Do the results of parts (a) and (b) agree for all x?

5. (a) Apply the power series method to find the general solution of the differential
equation y'' + xy' + y = 0 in the form $y(x) = c_0 y_0(x) + c_1 y_1(x)$.
(b) Show that a special case of the general solution found in part (a) is the solution
$y(x) = e^{-x^2/2}$.
(c) Apply the power series method to find a solution of the differential equation
y'' + xy' + y = x.
(d) Combine the result of part (a) with that of part (c) to write the general solution of
y'' + xy' + y = x.

6. Apply the ratio test for series convergence to the series solution found for the Legendre
equation in Example 4 of the text. Show that the series converges for -1 < x < 1.
[Hint: Split into even and odd parts; show that each converges separately.]

7. The Bessel equation of index n is

\[ y'' + \frac{1}{x}y' + \left(1 - \frac{n^2}{x^2}\right)y = 0. \]

(a) Show that when n = 0 the coefficients of a solution of the form $\sum_{k=0}^{\infty} c_k x^k$ satisfy
$(k+2)^2 c_{k+2} = -c_k$.
(b) Show that, if we choose $c_0 = 1$, $c_1 = 0$ in part (a), we get the solution

\[ J_0(x) = \sum_{k=0}^{\infty} (-1)^k 2^{-2k} (k!)^{-2} x^{2k}, \]

called a Bessel function of order 0.
(c) Show that $J_1(x) = -J_0'(x)$ defines a solution of the Bessel equation of index 1.

8. A Bessel function of integer order n is defined by

\[ J_n(x) = \left(\frac{x}{2}\right)^n \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k}}{2^{2k}\, k!\, (n+k)!}. \]

(a) Show that $J_n$ satisfies the Bessel equation of index n given in Exercise 7.
(b) Show that, if $y_n$ is a solution of the Bessel equation of index n, and
$u_n(x) = \sqrt{x}\, y_n(x)$, then $u_n$ satisfies

\[ u'' + \left(1 - \frac{4n^2 - 1}{4x^2}\right)u = 0. \]

SECTION 8 FOURIER SERIES


8A Introduction
Phenomena that are approximately periodic occur so often in nature that their study
has generated a large branch of mathematics known as Fourier analysis, in recognition
of one of its originators, Jean-Baptiste Fourier. The prime examples of periodic
functions are the sine and cosine functions, and it is these functions that formed the
basis of Fourier's own investigations. For the moment disregarding convergence, we
define a trigonometric series to be one of the form

\[ \frac{a_0}{2} + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx). \tag{8.1} \]

In the special circumstances that the coefficients $a_k$, $b_k$ arise from an integrable
function f(x) using the Euler formulas

\[ a_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos kx\,dx, \qquad b_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin kx\,dx, \tag{8.2} \]

then the trigonometric series is called the Fourier series of f. The coefficients $a_k$,
$b_k$ as given by Equations 8.2 are called the Fourier coefficients of f. This choice
is justified by Theorem 8.5. The most fundamental question about a Fourier series
is the extent to which the series represents the function. The importance of such a
representation stems partly from the possibility of incorporating the individual terms
of the series into a solution of certain differential equations.

Our first examples illustrate the beautiful way in which the partial sums of a
Fourier series attempt to mimic the function f that generates them. A partial sum

\[ S_N(x) = \frac{a_0}{2} + \sum_{k=1}^{N} (a_k \cos kx + b_k \sin kx) \tag{8.3} \]

is called a trigonometric polynomial. Note that each term in $S_N$ is a periodic
function f of period 2π, that is, f(x + 2π) = f(x) for all x. It follows that $S_N$ is
also periodic:

\[ S_N(x + 2\pi) = S_N(x). \]

Note also that the Fourier coefficients $a_k$, $b_k$ are determined by integral formulas that
use the values of f(x) only for $-\pi \le x \le \pi$. For these reasons, we'll sometimes
restrict attention to values of x in the interval of length 2π between -π and π.
8B Orthogonality
The functions cos kx, sin kx that occur in a Fourier series are the most important
examples of orthogonal functions on the interval $-\pi \le x \le \pi$. For integers k and
l, orthogonality of cos kx and sin kx means that

\[ \frac{1}{\pi}\int_{-\pi}^{\pi} \cos kx \sin lx\,dx = 0, \tag{8.4} \]

\[ \frac{1}{\pi}\int_{-\pi}^{\pi} \cos kx \cos lx\,dx = \begin{cases} 0, & k \ne l, \\ 1, & k = l \ne 0, \end{cases} \]

\[ \frac{1}{\pi}\int_{-\pi}^{\pi} \sin kx \sin lx\,dx = \begin{cases} 0, & k \ne l \text{ or } k = l = 0, \\ 1, & k = l \ne 0. \end{cases} \]

These formulas are usually proved using trigonometric identities, but they can also
be proved by first writing the sines and cosines in terms of $e^{ikx}$ and $e^{ilx}$. (See
Exercise 29.) As a sample application of the orthogonality relations 8.4, suppose
that a trigonometric series satisfies some condition that allows us to integrate it term
by term on the interval $-\pi \le x \le \pi$. For example, the series might converge
uniformly on the interval, or it might be only a finite sum, with all $a_k$ and $b_k$ equal
to zero from some point on.
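The orthogonality relations 8.4 can be checked numerically. The following sketch approximates the normalized integrals with a midpoint rule; the helper names `inner`, `cos_k`, and `sin_k` and the number of sample points are assumptions introduced here.

```python
import math

def inner(u, v, n=2000):
    """Midpoint-rule approximation of (1/pi) * integral of u(x) v(x) over [-pi, pi]."""
    h = 2 * math.pi / n
    xs = (-math.pi + (j + 0.5) * h for j in range(n))
    return sum(u(x) * v(x) for x in xs) * h / math.pi

def cos_k(k):
    return lambda x: math.cos(k * x)

def sin_k(k):
    return lambda x: math.sin(k * x)

print(inner(cos_k(2), sin_k(3)))   # mixed product: expected near 0
print(inner(cos_k(2), cos_k(3)))   # different indices: expected near 0
print(inner(cos_k(2), cos_k(2)))   # equal nonzero indices: expected near 1
print(inner(sin_k(3), sin_k(3)))   # equal nonzero indices: expected near 1
```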

8.5 Theorem. Suppose the trigonometric series 8.1 converges to a function f(x)
and is integrable term-by-term over the interval [-π, π] to give the integral of f(x).
Then the coefficients $a_k$, $b_k$ of f(x) are given by the Euler formulas 8.2.

Proof. We denote the sum of the series by

\[ f(x) = \frac{a_0}{2} + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx). \]

Then for a fixed integer $l \ge 0$,

\[ \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos lx\,dx = \frac{1}{\pi}\int_{-\pi}^{\pi} \left[\frac{a_0}{2} + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx)\right]\cos lx\,dx \]
\[ = \frac{a_0}{2\pi}\int_{-\pi}^{\pi} \cos lx\,dx + \sum_{k=1}^{\infty} \left[\frac{a_k}{\pi}\int_{-\pi}^{\pi} \cos kx \cos lx\,dx + \frac{b_k}{\pi}\int_{-\pi}^{\pi} \sin kx \cos lx\,dx\right]. \]

The first two of Equations 8.4 show all but one of these last terms is zero, the only
survivor being the term with the factor $a_l$. We find for $l \ne 0$,

\[ \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos lx\,dx = \frac{a_l}{\pi}\int_{-\pi}^{\pi} \cos lx \cos lx\,dx = a_l. \]

When l = 0 the only nonzero term that survives in the sum is the first one. Since the
integral of 1 over the interval is 2π, we get $a_0$. A similar computation in Exercise 30
shows that $\frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin lx\,dx = b_l$. •

Theorem 8.5 shows that no choices other than Formulas 8.2 for determining the
$a_k$ and $b_k$ are possible if we want to represent a reasonably large class of functions
f(x) by the trigonometric series of Equation 8.1.

EXAMPLE 1. Let f(x) = |x| for $-\pi \le x \le \pi$. Then

\[ a_k = \frac{1}{\pi}\int_{-\pi}^{\pi} |x|\cos kx\,dx, \qquad b_k = \frac{1}{\pi}\int_{-\pi}^{\pi} |x|\sin kx\,dx. \]

Now |x| sin kx has integral zero over [-π, π], because it's an odd function. Hence
$b_k = 0$ for k = 1, 2, .... On the other hand, the graph of |x| cos kx is symmetric
about the y-axis, so we can just double the integral over [0, π]. For $k \ne 0$ we
integrate by parts, getting

\[ a_k = \frac{2}{\pi}\int_0^{\pi} x\cos kx\,dx = \frac{2}{\pi}\left[\frac{x\sin kx}{k}\right]_0^{\pi} - \frac{2}{k\pi}\int_0^{\pi} \sin kx\,dx = \left[\frac{2}{k^2\pi}\cos kx\right]_0^{\pi} = \frac{2}{k^2\pi}(\cos k\pi - 1) \]
\[ = \frac{2}{k^2\pi}\left((-1)^k - 1\right) = \begin{cases} 0, & k = 2, 4, 6, \ldots, \\ -\dfrac{4}{k^2\pi}, & k = 1, 3, 5, \ldots. \end{cases} \]

When k = 0, we have $a_0 = \frac{2}{\pi}\int_0^{\pi} x\,dx = \pi$. To summarize,

\[ a_0 = \pi, \qquad a_k = \begin{cases} 0, & k = 2, 4, 6, \ldots, \\ -\dfrac{4}{k^2\pi}, & k = 1, 3, 5, \ldots, \end{cases} \qquad b_k = 0, \quad k = 1, 2, 3, \ldots. \]

Hence the Nth Fourier approximation is given for N = 1, 3, 5, ... by the trigonometric
polynomial

\[ S_N(x) = \frac{\pi}{2} - \frac{4}{\pi}\cos x - \frac{4}{\pi}\frac{\cos 3x}{3^2} - \cdots - \frac{4}{\pi}\frac{\cos Nx}{N^2}. \]

If N is even, we have $S_N(x) = S_{N-1}(x)$. Figure 14.7 shows how the graphs of
$S_0$, $S_1$, and $S_3$ approximate that of |x| on [-π, π]; additional terms improve the
approximation.

FIGURE 14.7 Fourier approximations to f(x) = |x| on [-π, π]: (a) $s_0(x) = \pi/2$; (b) $s_1(x) = s_0(x) - \frac{4}{\pi}\cos x$; (c) $s_3(x) = s_1(x) - \frac{4}{9\pi}\cos 3x$.
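The Euler formulas 8.2 can also be evaluated numerically, which gives an independent check on the coefficients found for f(x) = |x|. The sketch below uses a midpoint rule; the function name `fourier_coeffs` and the choice of 2000 sample points are assumptions, and the results are approximate.

```python
import math

def fourier_coeffs(f, k, n=2000):
    """Midpoint-rule approximations to the Euler formulas for a_k and b_k on [-pi, pi]."""
    h = 2 * math.pi / n
    xs = [-math.pi + (j + 0.5) * h for j in range(n)]
    a_k = sum(f(x) * math.cos(k * x) for x in xs) * h / math.pi
    b_k = sum(f(x) * math.sin(k * x) for x in xs) * h / math.pi
    return a_k, b_k

for k in range(4):
    print(k, fourier_coeffs(abs, k))

print(math.pi, -4 / math.pi, -4 / (9 * math.pi))   # exact a_0, a_1, a_3 for comparison
```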

EXAMPLE 2. Let

\[ g(x) = \begin{cases} 1, & 0 \le x \le \pi, \\ -1, & -\pi \le x < 0. \end{cases} \]

To compute $a_k$ and $b_k$ we break the interval of integration [-π, π] at 0:

\[ a_k = -\frac{1}{\pi}\int_{-\pi}^{0} \cos kx\,dx + \frac{1}{\pi}\int_0^{\pi} \cos kx\,dx. \]

Since cosine is an even function, cos(-kx) = cos kx, the two integrals are equal, so
we get $a_k = 0$. Similarly,

\[ b_k = -\frac{1}{\pi}\int_{-\pi}^{0} \sin kx\,dx + \frac{1}{\pi}\int_0^{\pi} \sin kx\,dx. \]

Since sine is an odd function, sin(-x) = -sin x, the integrals themselves are negatives
of each other, and we get for $k \ne 0$,

\[ b_k = \frac{2}{\pi}\int_0^{\pi} \sin kx\,dx = \frac{2}{\pi}\left[\frac{-\cos kx}{k}\right]_0^{\pi} = \frac{2}{k\pi}\left(-(-1)^k + 1\right) = \begin{cases} 0, & k \text{ even}, \\ \dfrac{4}{k\pi}, & k \text{ odd}. \end{cases} \]

In summary,

\[ a_k = 0, \quad k = 0, 1, 2, \ldots, \qquad b_k = \begin{cases} 0, & k = 2, 4, 6, \ldots, \\ \dfrac{4}{k\pi}, & k = 1, 3, 5, \ldots. \end{cases} \]

Hence for N odd, the Nth Fourier approximation to g is given by

\[ S_N(x) = \frac{4}{\pi}\sin x + \frac{4}{\pi}\frac{\sin 3x}{3} + \cdots + \frac{4}{\pi}\frac{\sin Nx}{N}. \]

The graphs of $S_1$, $S_3$, and $S_5$ are shown in Figure 14.8, together with that of g(x).

Figure 14.8 panels: (a) $s_1(x) = \frac{4}{\pi}\sin x$; (b) $s_3(x) = s_1(x) + \frac{4}{3\pi}\sin 3x$.
8C Convergence of Fourier Series

An important question is whether the Fourier approximations $S_N(x)$ converge as
$N \to \infty$ to $f(x)$, where $f(x)$ is the function on $[-\pi, \pi]$ from which the Fourier
coefficients are computed. The Fourier series of $f(x)$ is by definition the infinite
series

\[
\frac{a_0}{2} + \sum_{k=1}^{\infty} \left(a_k \cos kx + b_k \sin kx\right),
\]

where $a_k$ and $b_k$ are given by the Euler Formulas 8.2. Theorem 8.6 gives some
conditions on $f$ under which we can use the Fourier series to represent $f$. Suppose
that the graph of $f$ is not only bounded on $[-\pi, \pi]$ but piecewise monotone, which
means that the interval $[-\pi, \pi]$ breaks into finitely many subintervals, with endpoints
$-\pi = x_1 < x_2 < \cdots < x_n = \pi$, such that $f(x)$ is either nondecreasing or
nonincreasing on each open subinterval $(x_k, x_{k+1})$. It's possible to prove that the Fourier
series of $f$ will then converge to the $2\pi$-periodic extension of $f(x)$ illustrated in
Figure 14.9 wherever $f$ is continuous, and at a discontinuity at $x_0$ will converge to
the "average" value

\[
\tfrac{1}{2}\left[f(x_0-) + f(x_0+)\right].
\]

FIGURE 14.8 Step-function approximations $S_N(x)$ for $N = 1, 3, 5$. Panels: (a) $S_1(x) = \frac{4}{\pi}\sin x$; (b) $S_3(x) = S_1(x) + \frac{4}{3\pi}\sin 3x$; (c) $S_5(x) = S_3(x) + \frac{4}{5\pi}\sin 5x$.
FIGURE 14.9 Typical periodic extension.

Here $f(x_0-)$ stands for the left-hand limit of $f$ at $x_0$, and $f(x_0+)$ stands for the
right-hand limit. The graph of a typical piecewise monotone function appears in
Figure 14.9 with average value at jumps indicated by dots.

8.6 Theorem. Let $f$ be bounded and piecewise monotone on $[-\pi, \pi]$. Then the
Fourier series of $f$ converges at every point $x$ of the interval to $\frac{1}{2}[f(x-) + f(x+)]$. In
particular, if $f$ is continuous at $x$, then the series converges to $f(x)$. At $x = \pm\pi$, the
series converges to $\frac{1}{2}[f(\pi-) + f(-\pi+)]$. (A somewhat stronger version is called
Dirichlet's theorem; the conclusion is the same, with somewhat weaker assumptions
about $f$.)

Examples 1 and 2 gave an indication of how partial sums of a Fourier series


converge. In each of those examples, the function satisfies the condition of piecewise
monotonicity; hence the series converges to the value claimed in Theorem 8.6.

The function $g$ defined in Example 2 is arbitrarily assigned the value 1 at $x = 0$.
Theorem 8.6 implies that the Fourier series of $g$ converges as follows:

\[
\frac{4}{\pi}\sum_{k=0}^{\infty} \frac{\sin(2k+1)x}{2k+1}
= \begin{cases} 1, & 0 < x < \pi,\\ 0, & x = 0,\\ -1, & -\pi < x < 0. \end{cases}
\]

To be very specific, we can set $x = \pi/2$ and arrive at the alternating series expansion

\[
\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots.
\]
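Setting $x = \pi/2$ in the series for $g$ gives $\sin((2k+1)\pi/2) = (-1)^k$, so the sum $1 - \frac13 + \frac15 - \cdots$ should equal $\pi/4$. A small sketch (the name `leibniz_partial_sum` is ours) shows how slowly the partial sums approach that value:

```python
import math

def leibniz_partial_sum(terms):
    """Sum of the first `terms` terms of 1 - 1/3 + 1/5 - 1/7 + ..."""
    return sum((-1) ** k / (2 * k + 1) for k in range(terms))

if __name__ == "__main__":
    for terms in (10, 100, 1000):
        s = leibniz_partial_sum(terms)
        print(terms, s, abs(s - math.pi / 4))
```

By the alternating series estimate, the error after `terms` terms is smaller than the first omitted term, $1/(2\cdot\text{terms} + 1)$.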

Theorem 8.6 gives a reason beyond the one in Theorem 8.5 for choosing the
coefficients in a trigonometric series according to Euler Formulas 8.2. Assuming
piecewise monotonicity, the resulting sequence of trigonometric polynomials will
converge to the function $f$ and its average value at jumps. Since the partial sums
of a Fourier series are themselves periodic functions, the function to which they
converge is also periodic, a function we called a periodic extension of $f$. A periodic
extension may differ from the precise definition of $f(x)$ at some points in the interval
$-\pi \le x \le \pi$, but changing a value $f(x_0)$ at a point $x_0$ has no effect on the
integral formulas for the Fourier coefficients, so it's customary to make such changes
whenever it's convenient in defining a periodic extension of a function from an
interval to the entire real number line.
Figure 14.10 shows a function $f$ extended periodically, with period $2\pi$, from the
interval $[-\pi, \pi]$ to other values of $x$. Since the partial sums of the Fourier series are
also periodic with period $2\pi$, whatever convergence takes place on $[-\pi, \pi]$ extends
periodically to all values of $x$.
FIGURE 14.10 Periodic extension of $f(x) = x + \pi$ from $(-\pi, \pi)$ with $S_4(x)$: the function $f(x) = x + \pi$ extended periodically from $(-\pi, \pi)$ and normalized at jumps. The 4th Fourier partial sum is superimposed.

The coefficient values $a_k$ and $b_k$ that we get from the Euler formulas are
independent of the finite values assigned to $f(x)$ at isolated points; this is because the
definite integrals in the Euler formulas don't distinguish between two functions that
differ at finitely many points. For example, the two functions

\[
f(x) = \begin{cases} 0, & -\pi \le x \le 0,\\ 1, & 0 < x \le \pi \end{cases}
\qquad\text{and}\qquad
g(x) = \begin{cases} 0, & -\pi \le x < 0,\\ 1, & 0 \le x \le \pi \end{cases}
\]

differ only at $x = 0$, where $f(0) = 0$ and $g(0) = 1$. Intuitively speaking, the areas
under the two graphs should be the same, namely $\pi$, and this is an important property
of the integral. Since the Fourier series of a piecewise monotone function converges
to the average of the right and left limits at each point $x$, it makes sense simply to
redefine such a periodic function to have the average value at each jump discontinuity
and refer to this as the normalized function.
We restate Theorem 8.6 as follows.

8.7 Theorem. If a bounded piecewise monotone function is extended periodically
and is also normalized to have the value $\frac{1}{2}(f(x+) + f(x-))$ at its jump
discontinuities, then its Fourier series converges to the function at every point.

Strictly speaking, Theorem 8.7 has implications here only for functions of period
$2\pi$, as does Theorem 8.6, but we'll see in the next section that a modified statement
is valid for functions with positive period $2p$.

EXAMPLE 4 If the function $f(x) = x + \pi$ is extended periodically from $-\pi < x < \pi$ to
other values of $x$, its graph consists of parallel line segments of slope 1; it remains
undefined at odd integer multiples of $\pi$ since it's initially undefined at $\pm\pi$. To
produce a normalized version of the function defined for all $x$, all we have to do
is define the function to have the value $\pi$ at odd multiples of $\pi$. We compute the
Fourier series for the function as originally defined, and the series will converge to
the normalized function for all real $x$. The periodically extended function is shown in
Figure 14.10 together with the 4th partial sum of the Fourier expansion. The Fourier
coefficients are computed as follows:
\[
a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} (x + \pi)\,dx
    = \frac{1}{\pi}\int_{-\pi}^{\pi} x\,dx + \frac{1}{\pi}\int_{-\pi}^{\pi} \pi\,dx
    = 0 + 2\pi = 2\pi.
\]

(Note that the integral of $x$ over an interval symmetric about 0 is always 0.) When
$k > 0$,

\[
a_k = \frac{1}{\pi}\int_{-\pi}^{\pi} (x + \pi)\cos kx\,dx
    = \frac{1}{\pi}\int_{-\pi}^{\pi} x\cos kx\,dx + \frac{1}{\pi}\int_{-\pi}^{\pi} \pi\cos kx\,dx = 0.
\]

The last integral above is 0 because the indefinite integral is 0 at $\pm\pi$. The previous
one is most easily seen to be zero by observing that the integrand $x\cos kx$ is an odd
function, so that the integral over $[-\pi, 0]$ is the negative of the integral over $[0, \pi]$.
Now for the $b_k$'s,

\[
b_k = \frac{1}{\pi}\int_{-\pi}^{\pi} (x + \pi)\sin kx\,dx
    = \frac{1}{\pi}\int_{-\pi}^{\pi} x\sin kx\,dx + \frac{1}{\pi}\int_{-\pi}^{\pi} \pi\sin kx\,dx
    = \frac{2(-1)^{k+1}}{k} + 0 = \frac{2(-1)^{k+1}}{k}.
\]

The last integral is zero because the integrand is an odd function. The previous one
is computed using integration by parts, with

\[
\frac{1}{\pi}\int_{-\pi}^{\pi} x\sin kx\,dx
 = -\frac{1}{k\pi}\,x\cos kx\,\Big|_{-\pi}^{\pi} + \frac{1}{k\pi}\int_{-\pi}^{\pi}\cos kx\,dx
 = -\frac{1}{k}\cos k\pi - \frac{1}{k}\cos(-k\pi) + 0
 = \frac{2(-1)^{k+1}}{k}.
\]

The full expansion, including the constant $a_0/2$, is then

\[
f(x) = \pi + \sum_{k=1}^{\infty} \frac{2(-1)^{k+1}}{k}\sin kx
     = \pi + 2\sin x - \sin 2x + \tfrac{2}{3}\sin 3x - \tfrac{1}{2}\sin 4x + - \cdots.
\]
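The behavior described by Theorem 8.7 can be observed directly on the expansion of $f(x) = x + \pi$. The sketch below is illustrative only, and the name `sawtooth_partial_sum` is ours; it evaluates $S_N(x) = \pi + \sum_{k=1}^{N} \frac{2(-1)^{k+1}}{k}\sin kx$.

```python
import math

def sawtooth_partial_sum(x, N):
    """S_N(x) = pi + sum_{k=1}^{N} 2(-1)^(k+1)/k * sin(kx) for f(x) = x + pi."""
    return math.pi + sum(2 * (-1) ** (k + 1) / k * math.sin(k * x)
                         for k in range(1, N + 1))

if __name__ == "__main__":
    for N in (4, 40, 400):
        # interior point x = 1, and the jump at x = pi
        print(N, sawtooth_partial_sum(1.0, N), sawtooth_partial_sum(math.pi, N))
```

At an interior point such as $x = 1$ the values approach $1 + \pi$, while at the jump $x = \pi$ every partial sum equals the normalized value $\pi$ up to rounding, since each sine term vanishes there.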
EXERCISES

The following observations about values of sine and cosine are useful for computing
Fourier coefficients; $k$ is always an integer here.

1. $\sin k\pi = 0$
2. $\cos k\pi = (-1)^k$
3. $\sin(k + \frac{1}{2})\pi = (-1)^k$
4. $\cos(k + \frac{1}{2})\pi = 0$

In Exercises 1 to 10, compute the Fourier coefficients of each of the functions and
write the corresponding Fourier series in the form of Equation 8.1. Sketch the graph
of each function extended to have period $2\pi$ on the interval $-2\pi \le x \le 2\pi$. Finally,
sketch, relative to the same axes, the graphs of the first three partial sums
$S_0(x)$, $S_1(x)$, $S_2(x)$ of the Fourier series.

1. $f(x) = x$, $-\pi < x \le \pi$
2. $f(x) = \begin{cases} -\pi - x, & -\pi < x < 0,\\ \pi - x, & 0 \le x \le \pi \end{cases}$
3. $f(x) = x^2$, $-\pi < x \le \pi$
4. $f(x) = |x| + 1$, $-\pi \le x \le \pi$
5. $f(x) = \begin{cases} 0, & -\pi < x \le 0,\\ 1, & 0 < x \le \pi \end{cases}$
6. $f(x) = x + 1$, $-\pi < x \le \pi$
7. $f(x) = \begin{cases} -\pi, & -\pi < x < 0,\\ \pi, & 0 \le x \le \pi \end{cases}$
8. $f(x) = 2x + 1$, $-\pi < x < \pi$
9. $f(x) = -|x|$, $-\pi \le x \le \pi$
10. $f(x) = \begin{cases} -1, & -\pi < x \le 0,\\ 2, & 0 < x \le \pi \end{cases}$

Exercises 11 to 20. By Theorem 8.6, the Fourier series of each function $f(x)$ in
Exercises 1 to 10 converges to some function $F(x)$ whose graph may differ at some
points from the graph of $f(x)$. For Exercises 11 to 20, sketch the corresponding $F(x)$
on $-2\pi \le x \le 2\pi$, paying close attention to values at $x = 0$, $\pm\pi$, and $\pm 2\pi$.

21. Show that if $f(x)$ and $g(x)$ have the Fourier coefficients $a_k$, $b_k$ and $a_k'$, $b_k'$
respectively, then $\alpha f(x) + \beta g(x)$, where $\alpha$ and $\beta$ are constant, has Fourier
coefficients $\alpha a_k + \beta a_k'$, $\alpha b_k + \beta b_k'$.

The $N$th partial sum of a trigonometric series is called a trigonometric polynomial
of degree $N$, and a trigonometric polynomial is necessarily the Fourier series of
the function it represents. For example, the identity $\cos^2 x = \frac{1}{2} + \frac{1}{2}\cos 2x$ is the
Fourier expansion of $\cos^2 x$. In Exercises 22 to 27 find the Fourier series of the
function by using appropriate identities, for example the ones in the next exercise.

22. $\sin^2 x$
23. $\cos^3 x$
24. $\sin 2x \cos x$
25. $\sin^3 x$
26. $\cos^4 x - \sin^4 x$
27. $\sin 5x + \cos 3x$

28. Establish the orthogonality relations in Equations 8.4 of the text, by using the
following trigonometric identities to compute the relevant integrals.

\[
\begin{aligned}
\cos\alpha\cos\beta &= \tfrac{1}{2}\cos(\alpha + \beta) + \tfrac{1}{2}\cos(\alpha - \beta),\\
\sin\alpha\cos\beta &= \tfrac{1}{2}\sin(\alpha + \beta) + \tfrac{1}{2}\sin(\alpha - \beta),\\
\sin\alpha\sin\beta &= \tfrac{1}{2}\cos(\alpha - \beta) - \tfrac{1}{2}\cos(\alpha + \beta)
\end{aligned}
\]

*29. Establish the orthogonality relations in Equations 8.4 of the text, by using the
identities $\cos nx = \frac{1}{2}(e^{inx} + e^{-inx})$, $\sin nx = \frac{1}{2i}(e^{inx} - e^{-inx})$ together with the
identity $e^{i(\alpha+\beta)} = e^{i\alpha}e^{i\beta}$ to compute the relevant integrals.

30. Carry out the details of the proof that $\frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin lx\,dx = b_l$, parallel to
the computation in Theorem 8.5.

31. Let $f(x) = \sqrt{|x|}$ for $-\pi \le x \le \pi$. Does $f$ satisfy the hypotheses of Theorem 8.6?

32. Let $f(x)$ be an odd function on $-\pi \le x \le \pi$ (i.e., $f(-x) = -f(x)$) and let $g(x)$
be an even function (i.e., $g(-x) = g(x)$). Let $a_k$, $b_k$ and $a_k'$, $b_k'$ be the Fourier
coefficients of $f$ and $g$ respectively. Show that

\[
a_k = 0, \qquad b_k = \frac{2}{\pi}\int_0^{\pi} f(x)\sin kx\,dx, \qquad
a_k' = \frac{2}{\pi}\int_0^{\pi} g(x)\cos kx\,dx, \qquad b_k' = 0.
\]

SECTION 9 APPLIED FOURIER EXPANSIONS


The direct application of Fourier methods to practical problems usually requires
adapting the standard formulation presented in the previous section to intervals other
than $[-\pi, \pi]$. In the present section we describe some of these adaptations and their
application. With these modifications Theorem 8.6 on convergence extends
immediately to arbitrary finite intervals.

9A General Intervals
While the interval $[-\pi, \pi]$ is a natural one for Fourier expansions because it is a
period interval for the trigonometric functions, it may be that a function encountered
in an application needs to be approximated on some other interval. If the function $f$
to be approximated is defined not on the interval $[-\pi, \pi]$ but on $[-p, p]$, a suitable
change in the computation of the approximation is as follows. With $f$ defined on
$[-p, p]$, we define

\[
f_p(x) = f\!\left(\frac{px}{\pi}\right), \qquad -\pi \le x \le \pi.
\]

Then we can compute the Fourier coefficients of $f_p$ by the Euler Formulas 8.2. The resulting
trigonometric polynomials $S_N$ will converge to $f_p$ on $[-\pi, \pi]$ as in Theorems 8.6
and 8.7. To approximate $f$ on $[-p, p]$, we consider

\[
S_N\!\left(\frac{\pi x}{p}\right) = \frac{a_0}{2} + \sum_{k=1}^{N}\left(a_k\cos\frac{k\pi x}{p} + b_k\sin\frac{k\pi x}{p}\right), \qquad -p \le x \le p.
\]

The coefficients $a_k$ and $b_k$ are computed directly in terms of $f$ by making a change
of variable. We have

\[
a_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f_p(x)\cos kx\,dx
    = \frac{1}{\pi}\int_{-\pi}^{\pi} f\!\left(\frac{px}{\pi}\right)\cos kx\,dx
    = \frac{1}{p}\int_{-p}^{p} f(x)\cos\frac{k\pi x}{p}\,dx.
\]

A similar computation holds for $b_k$, and we have

9.1
\[
a_k = \frac{1}{p}\int_{-p}^{p} f(x)\cos\frac{k\pi x}{p}\,dx, \qquad
b_k = \frac{1}{p}\int_{-p}^{p} f(x)\sin\frac{k\pi x}{p}\,dx
\]

for the coefficients in the Fourier approximation

\[
\frac{a_0}{2} + \sum_{k=1}^{N}\left(a_k\cos\frac{k\pi x}{p} + b_k\sin\frac{k\pi x}{p}\right)
\]

to the $2p$-periodic extension of the function $f$ defined on $[-p, p]$.

EXAMPLE 1 If

\[
h(x) = \begin{cases} 1, & 0 \le x \le p,\\ -1, & -p \le x < 0, \end{cases}
\]

then

\[
a_k = 0, \quad k = 0, 1, 2, \ldots,
\]

\[
b_k = \frac{2}{p}\int_0^{p} \sin\frac{k\pi x}{p}\,dx
    = \frac{2}{\pi}\int_0^{\pi} \sin kx\,dx
    = \begin{cases} 0, & k = 2, 4, 6, \ldots,\\[2pt] \dfrac{4}{k\pi}, & k = 1, 3, 5, \ldots. \end{cases}
\]

Hence the $N$th Fourier approximation to $h$ is given, for odd $N$, by

\[
S_N(x) = \frac{4}{\pi}\sin\frac{\pi x}{p} + \frac{4}{3\pi}\sin\frac{3\pi x}{p} + \cdots + \frac{4}{N\pi}\sin\frac{N\pi x}{p}, \qquad -p \le x \le p.
\]

FIGURE 14.11 Intervals of length $b - a = 2p$, with $p = (b - a)/2$.

For a function $f$ defined on an arbitrary interval $a \le x \le b$, it's helpful to think of
a periodic extension $F$ of $f$ having period $b - a$ and defined for all real numbers $x$.
Such an extension appears in Figure 14.10. We set $2p = b - a$ so that $p = (b - a)/2$
and $-p = -(b - a)/2$. We then compute the Fourier coefficients of $F$ over the
interval $[-p, p]$ according to Formula 9.1. Also, because the integrands in Formula
9.1 have period $2p$, we can use the geometric observation that we can perform
the integration over an interval of length $2p = b - a$, in particular, over $[a, b]$ as
in Figure 14.11. (See Exercise 7 for a nongeometric proof.) The reinterpretation of
Formulas 9.1 is

9.2
\[
a_k = \frac{2}{b-a}\int_a^b f(x)\cos\frac{2k\pi x}{b-a}\,dx, \qquad
b_k = \frac{2}{b-a}\int_a^b f(x)\sin\frac{2k\pi x}{b-a}\,dx.
\]

The associated trigonometric polynomials are

\[
S_N(x) = \frac{a_0}{2} + \sum_{k=1}^{N}\left(a_k\cos\frac{2k\pi x}{b-a} + b_k\sin\frac{2k\pi x}{b-a}\right).
\]

Equations 9.2 are useful computationally in part because the way $f(x)$ is defined
may make it easier to compute its integral over the interval $[a, b]$ rather than $[-p, p]$.

EXAMPLE 2 Let $f(x) = x$, for $0 < x < 1$. We find, integrating by parts for $k \ne 0$,

\[
a_k = 2\int_0^1 x\cos 2k\pi x\,dx
    = 2\left[\frac{x\sin 2k\pi x}{2k\pi}\right]_0^1 - \frac{2}{2k\pi}\int_0^1 \sin 2k\pi x\,dx = 0,
\]

\[
b_k = 2\int_0^1 x\sin 2k\pi x\,dx
    = 2\left[-\frac{x\cos 2k\pi x}{2k\pi}\right]_0^1 + \frac{2}{2k\pi}\int_0^1 \cos 2k\pi x\,dx
    = -\frac{\cos 2k\pi}{k\pi} = -\frac{1}{k\pi}.
\]

Since $a_0 = 2\int_0^1 x\,dx = 1$, then $a_0/2 = \frac{1}{2}$, and the Fourier series is

\[
\frac{1}{2} - \frac{1}{\pi}\left(\sin 2\pi x + \frac{\sin 4\pi x}{2} + \frac{\sin 6\pi x}{3} + \cdots\right).
\]
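Formula 9.2 is easy to approximate numerically on any interval $[a, b]$. The following sketch uses a plain midpoint rule; the name `coefficients_on_interval` and the choice of quadrature are our own assumptions, not part of the text. For $f(x) = x$ on $(0, 1)$ it should reproduce $a_0 = 1$, $a_k = 0$, and $b_k = -1/(k\pi)$.

```python
import math

def coefficients_on_interval(f, a, b, k, n=20000):
    """Midpoint-rule estimates of (a_k, b_k) from Formula 9.2 on [a, b]."""
    h = (b - a) / n
    scale = 2 * math.pi * k / (b - a)   # frequency 2k*pi/(b - a)
    ak = bk = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h           # midpoint of the i-th subinterval
        ak += f(x) * math.cos(scale * x)
        bk += f(x) * math.sin(scale * x)
    factor = 2 / (b - a) * h
    return ak * factor, bk * factor

if __name__ == "__main__":
    for k in range(4):
        print(k, coefficients_on_interval(lambda x: x, 0.0, 1.0, k))
```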

9B Sine and Cosine Expansions


An expansion in terms only of cosines or only of sines is sometimes more convenient
to use than a general Fourier expansion, and is really necessary in Section 10. We
start with the observation that the cosine terms in a Fourier expansion are even
functions [i.e., $\cos(-k\pi x/p) = \cos(k\pi x/p)$], and that the sine terms are odd [i.e.,
$\sin(-k\pi x/p) = -\sin(k\pi x/p)$]. It follows that if $f$ is an even periodic function,
the product $f(x)\sin(k\pi x/p)$ is odd. (Simple verification.) Therefore for the Fourier
sine coefficient $b_k$, we have by Equation 9.1,

\[
b_k = \frac{1}{p}\int_{-p}^{p} f(x)\sin\frac{k\pi x}{p}\,dx = 0.
\]

Hence aside from a possible constant term, an even function has only cosine terms
in its Fourier expansion. Similarly, if $f$ is an odd periodic function, the product
$f(x)\cos(k\pi x/p)$ is also odd; so for the Fourier cosine coefficient we have

\[
a_k = \frac{1}{p}\int_{-p}^{p} f(x)\cos\frac{k\pi x}{p}\,dx = 0.
\]

Thus an odd function has only sine terms in its Fourier expansion.
Suppose that, given a function $f(x)$ defined just on the interval $0 \le x \le p$, we want
to find a trigonometric series expansion for $f$ consisting only of sine terms, or
sometimes only of cosine terms. The trick is to extend the definition of $f$ from the
interval $0 \le x \le p$ to all real $x$ in such a way that the extension is periodic of
period $2p$ and either is odd or else is even. We then compute the Fourier series
of the extension. If $f_e$ is an even periodic extension of $f$, then $f_e$ will have only
cosine terms in its Fourier series in an expansion designed to represent $f$ just on
$0 \le x \le p$. Similarly if $f_o$ is an odd periodic extension of $f$, then $f_o$ has only sine
terms in an expansion designed to represent $f$ just for $0 \le x \le p$.

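9.3 Sine expansion. For $f(x)$ on $0 \le x \le p$ we have

\[
\sum_{k=1}^{\infty} b_k\sin\frac{k\pi x}{p}, \qquad b_k = \frac{2}{p}\int_0^{p} f(x)\sin\frac{k\pi x}{p}\,dx.
\]

9.4 Cosine expansion. For $f(x)$ on $0 \le x \le p$ we have

\[
\tfrac{1}{2}a_0 + \sum_{k=1}^{\infty} a_k\cos\frac{k\pi x}{p}, \qquad a_k = \frac{2}{p}\int_0^{p} f(x)\cos\frac{k\pi x}{p}\,dx.
\]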

We illustrate the method with two examples.


EXAMPLE 3 We'll compute the cosine expansion for the function defined by $f(x) = 1 - x$ for
$0 \le x \le 2$. We consider the even periodic extension shown in Figure 14.12. To find
the extension we define $f_e$ by $f_e(x) = f(-x)$ for $-2 \le x < 0$, and then extend
periodically, with period 4, to the whole $x$-axis. We use Formula 9.4 to compute the
Fourier-cosine expansion of $f_e$. (Since $f_e$ is even, we know that $b_k = 0$ for all $k$.)
The coefficient formula in 9.4 allows us to write

\[
a_k = \frac{1}{2}\int_{-2}^{2} f_e(x)\cos\frac{k\pi x}{2}\,dx = \int_0^{2} f_e(x)\cos\frac{k\pi x}{2}\,dx.
\]

FIGURE 14.12 Even extension of $f(x) = 1 - x$ from $0 \le x \le 2$.

Since on $0 \le x \le 2$, the function $f_e$ is the same as the given function $f(x) = 1 - x$,
we integrate by parts for $k > 0$:

\[
a_k = \int_0^{2} (1 - x)\cos\frac{k\pi x}{2}\,dx
    = \left[\frac{2}{k\pi}(1 - x)\sin\frac{k\pi x}{2}\right]_0^{2} + \frac{2}{k\pi}\int_0^{2}\sin\frac{k\pi x}{2}\,dx
    = \frac{4}{\pi^2k^2}\left[1 - \cos k\pi\right]
    = \begin{cases} 0, & k \text{ even},\\ 8/(\pi^2k^2), & k \text{ odd}. \end{cases}
\]

Finally, $a_0 = \int_0^{2}(1 - x)\,dx = 0$. Thus the cosine expansion of $f$ on $0 \le x \le 2$ has
for its general nonzero term

\[
\frac{8}{\pi^2k^2}\cos\left(\frac{k\pi x}{2}\right), \qquad k \text{ odd}.
\]

Written out, the expansion of the given function looks like

\[
f(x) = \frac{8}{\pi^2}\left(\frac{\cos(\pi x/2)}{1} + \frac{\cos(3\pi x/2)}{9} + \frac{\cos(5\pi x/2)}{25} + \cdots\right), \qquad 0 \le x \le 2.
\]

EXAMPLE 4 Starting with the same function as in Example 3, $f(x) = 1 - x$ for $0 \le x \le 2$,
we compute its sine expansion by considering the odd periodic extension shown in
Figure 14.13. We first define $f_o(x) = -f(-x)$ for $-2 \le x < 0$, and then extend
periodically with period 4. (Since $f_o$ is odd, we know that $a_k = 0$ for all $k$.) Also,
by Equations 9.1 or 9.3,

\[
b_k = \frac{1}{2}\int_{-2}^{2} f_o(x)\sin\frac{k\pi x}{2}\,dx = \int_0^{2} f_o(x)\sin\frac{k\pi x}{2}\,dx.
\]

FIGURE 14.13 Odd extension of $f(x) = 1 - x$ from $0 \le x \le 2$.

But $f_o(x) = 1 - x$ for $0 \le x \le 2$, so integrating by parts gives

\[
\begin{aligned}
b_k &= \int_0^{2} (1 - x)\sin\frac{k\pi x}{2}\,dx
     = \left[-\frac{2}{k\pi}(1 - x)\cos\frac{k\pi x}{2}\right]_0^{2} - \frac{2}{k\pi}\int_0^{2}\cos\frac{k\pi x}{2}\,dx\\
    &= \frac{2}{k\pi}(-1)^k + \frac{2}{k\pi} - \frac{2}{k\pi}\left[\frac{2}{k\pi}\sin\frac{k\pi x}{2}\right]_0^{2}
     = \begin{cases} 0, & k \text{ odd},\\[2pt] \dfrac{4}{k\pi}, & k \text{ even}. \end{cases}
\end{aligned}
\]

Thus the general nonzero term in the sine expansion is $\frac{4}{k\pi}\sin(k\pi x/2)$, for even
$k > 0$. Writing $k = 2j$ turns this term into $\frac{2}{j\pi}\sin j\pi x$, so the sine expansion of $f(x)$
is then

\[
f(x) = \frac{2}{\pi}\left(\sin\pi x + \frac{\sin 2\pi x}{2} + \frac{\sin 3\pi x}{3} + \cdots\right), \qquad 0 < x < 2,
\]

with convergence to zero at $x = 0$ and $x = 2$.

Note that Examples 1 and 2 of the previous section are cosine and sine expansions
respectively of the given functions restricted to $0 \le x \le \pi$.
The Java Applet FOURIER at Web site http://math.dartmouth.edu/~rcwn/ approximates
Fourier coefficients using Simpson's rule and then plots graphs of partial
sums. An alternative is to use computer algebra software such as Maple, MATLAB or
Mathematica to compute Fourier coefficients for elementary functions.
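To give a rough idea of how Simpson's rule can be used for this purpose, here is a small sketch of our own; it is not the applet's code, and the names `simpson`, `sine_coefficient`, and `cosine_coefficient` are hypothetical. It approximates the coefficients in Formulas 9.3 and 9.4.

```python
import math

def simpson(func, a, b, n=200):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def sine_coefficient(f, p, k, n=200):
    """b_k of Formula 9.3: (2/p) * integral from 0 to p of f(x) sin(k pi x / p)."""
    return 2 / p * simpson(lambda x: f(x) * math.sin(k * math.pi * x / p), 0.0, p, n)

def cosine_coefficient(f, p, k, n=200):
    """a_k of Formula 9.4: (2/p) * integral from 0 to p of f(x) cos(k pi x / p)."""
    return 2 / p * simpson(lambda x: f(x) * math.cos(k * math.pi * x / p), 0.0, p, n)

if __name__ == "__main__":
    f = lambda x: 1 - x            # the function of Examples 3 and 4, with p = 2
    for k in range(1, 5):
        print(k, cosine_coefficient(f, 2.0, k), sine_coefficient(f, 2.0, k))
```

For $f(x) = 1 - x$ on $0 \le x \le 2$ the output should agree with the exact values found above: $a_k = 8/(\pi^2k^2)$ for odd $k$ and $b_k = 4/(k\pi)$ for even $k$, with the remaining coefficients zero.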
9C Differential Equations
Given a linear differential operator $L$, for example the harmonic oscillator operator
$L = D^2 + \omega^2$, we can use Fourier series to solve the nonhomogeneous equation
$Ly = f(t)$ if the forcing function $f$ is periodic with period $2p$ and representable by
a Fourier series with coefficients $a_k$, $b_k$. For example, to solve

\[
Ly = \sum_{k=1}^{\infty} b_k\sin\frac{k\pi t}{p},
\]

we first find a simple particular solution $y_k(t)$ of the equation

\[
Ly = \sin\frac{k\pi t}{p}, \qquad k = 1, 2, 3, \ldots.
\]

The linearity of $L$ leads us to a formal particular solution of the linear equation as

\[
y = \sum_{k=1}^{\infty} b_k\,y_k(t).
\]

The next example is typical of the case in which $L$ is a constant-coefficient operator.


EXAMPLE 5 Suppose we want to solve the forced harmonic oscillator equation $\ddot{y} + \omega^2 y = f(t)$,
where $f(t)$ has period $2p = 2$ and is defined on the interval $0 \le t < 2$ by

\[
f(t) = \begin{cases} 1, & 0 \le t < 1,\\ -1, & 1 \le t < 2. \end{cases}
\]

Figure 14.14(a) shows the graph of $f$ for $0 \le t < 8$, called a square wave.
The square-wave input $f$ has Fourier expansion

\[
f(t) = \frac{2}{\pi}\sum_{k=1}^{\infty} \frac{1 - (-1)^k}{k}\sin(k\pi t)
     = \frac{4}{\pi}\sum_{n=0}^{\infty} \frac{1}{2n+1}\sin((2n+1)\pi t),
\]

with the understanding that the series converges to 0 at the jump discontinuities. The
computation appears in Example 1. The differential equation $\ddot{y} + \omega^2 y = \sin(k\pi t)$
has the particular solution $y_k(t) = (\omega^2 - k^2\pi^2)^{-1}\sin(k\pi t)$. It follows that a solution
to $\ddot{y} + \omega^2 y = f(t)$ is formally

\[
y(t) = \frac{4}{\pi}\sum_{n=0}^{\infty} \frac{\sin((2n+1)\pi t)}{(2n+1)\left(\omega^2 - (2n+1)^2\pi^2\right)}.
\]

Note that if $\omega$ is an odd multiple of $\pi$, one of the terms in the series is undefined
and would have to be corrected. Indeed, for $\omega$ close to $(2n+1)\pi$ the corresponding
term in the series will have a large amplitude. Partial sums to 100 terms are graphed
in Figure 14.14 for various values of $\omega$. A formula for the general solution would
have to contain additional terms $c_1\cos\omega t + c_2\sin\omega t$. The numerical values of the
coefficients in the series for $y(t)$ are larger when $n = 0$ than when $n > 0$, particularly
for the choices of $\omega^2$ in Figure 14.14(c) and (d); this explains the dominance of
the first term in the graphs. Note also that Theorem 4.5 applies to the displayed
solution, so we can compute $y'(t)$ from term-by-term differentiation, and the output
is decidedly smoother than the square-wave input $f(t)$. On intervals $k \le t < k + 1$
the solution must satisfy $\ddot{y} + \omega^2 y = \pm 1$, so on such intervals $y(t) = c_k\cos\omega t +
d_k\sin\omega t \pm \omega^{-2}$, with the pieces fitting together smoothly at $t = k$ to produce a
periodic solution.

FIGURE 14.14 Panels: (a) square-wave input; (b) $\omega^2 = 5$ output; (c) $\omega^2 = 9$ output; (d) $\omega^2 = 10.5$ output.
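The formal solution can be tested numerically: differentiating a partial sum of $y(t)$ twice term by term and adding $\omega^2$ times the partial sum should reproduce the corresponding partial sum of the square wave. The sketch below is our own illustration under that assumption, with hypothetical function names.

```python
import math

def square_wave_partial_sum(t, terms):
    """(4/pi) * sum over n < terms of sin((2n+1) pi t) / (2n+1)."""
    return 4 / math.pi * sum(math.sin((2 * n + 1) * math.pi * t) / (2 * n + 1)
                             for n in range(terms))

def response_partial_sum(t, omega, terms):
    """Partial sum of y(t); omega must not be an odd multiple of pi."""
    total = 0.0
    for n in range(terms):
        m = 2 * n + 1
        total += math.sin(m * math.pi * t) / (m * (omega**2 - (m * math.pi) ** 2))
    return 4 / math.pi * total

def response_second_derivative(t, omega, terms):
    """Term-by-term second derivative of the same partial sum."""
    total = 0.0
    for n in range(terms):
        m = 2 * n + 1
        total += (-(m * math.pi) ** 2 * math.sin(m * math.pi * t)
                  / (m * (omega**2 - (m * math.pi) ** 2)))
    return 4 / math.pi * total

if __name__ == "__main__":
    omega, t = math.sqrt(5.0), 0.3
    lhs = response_second_derivative(t, omega, 100) + omega**2 * response_partial_sum(t, omega, 100)
    print(lhs, square_wave_partial_sum(t, 100))
```

The two printed numbers agree up to rounding, which is just the statement that each term $y_k$ solves $\ddot{y} + \omega^2 y = \sin(k\pi t)$.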

EXERCISES

1. Find the Fourier series for the function

\[
f(x) = -x, \qquad -2 < x < 2.
\]

To what values will the series converge at $x = 2$ and $x = -2$?

2. Find the Fourier series for the function

\[
f(x) = 1 + x, \qquad 1 < x < 2.
\]

To what values will the series converge at $x = 1$ and $x = 2$?

3. Let $f$ be an odd function on $[-p, p]$, that is, $f(-x) = -f(x)$, and let $g$ be an even
function, that is, $g(-x) = g(x)$. Let $a_k$, $b_k$ and $a_k'$, $b_k'$ be the Fourier coefficients of
$f$ and $g$, respectively. Show that

\[
a_k = 0, \qquad b_k = \frac{2}{p}\int_0^{p} f(x)\sin(k\pi x/p)\,dx, \qquad
a_k' = \frac{2}{p}\int_0^{p} g(x)\cos(k\pi x/p)\,dx, \qquad b_k' = 0.
\]

In Exercises 4 to 9, extend the function to an interval to the left of $x = 0$ so that the
extended function is odd. In each case sketch the graph of the odd periodic
extension of $f$ after normalizing $f$ to have the average value at jump discontinuities.
In the same picture sketch also the graph of the sum of the first two nonzero terms
of the Fourier expansion of the extended function, which should contain only sine terms.

4. $f(x) = 1$, $0 < x < \pi$
5. $f(x) = 1 - x$, $0 < x < 1$
6. $f(x) = x^2$, $0 < x < \pi$
7. $f(x) = \cos x$, $0 < x < \pi/2$
8. $f(x) = \begin{cases} 1, & 0 < x < 1,\\ 0, & 1 \le x < 2 \end{cases}$
9. $f(x) = \begin{cases} 0, & 0 < x < 1,\\ x - 2, & 1 \le x < 2 \end{cases}$

In Exercises 10 to 15, extend the function to an interval to the left of $x = 0$ so that
the extended function is even. In each case sketch the graph of the even periodic
extension of $f$ after normalizing $f$ to have the average value at jump discontinuities.
In the same picture sketch also the graph of the sum of the first two nonzero terms
of the Fourier expansion, which should contain only cosine terms plus perhaps a constant.

10. $f(x) = 1$, $0 < x < \pi$
11. $f(x) = 1 - x$, $0 < x < 1$
12. $f(x) = x^2$, $0 < x < \pi$
13. $f(x) = \sin x$, $0 < x < \pi/2$
14. $f(x) = \begin{cases} 0, & 0 < x < 1,\\ 1, & 1 \le x < 2 \end{cases}$
15. $f(x) = \begin{cases} 0, & 0 \le x < 1,\\ x, & 1 \le x < 2 \end{cases}$

16. Find
(a) the Fourier cosine expansion and
(b) the Fourier sine expansion of the function

\[
f(x) = x, \qquad 0 < x < \pi.
\]

(c) Compare the results of (a) and (b) with the complete Fourier expansion of

\[
g(x) = x, \qquad -\pi < x < \pi.
\]

In Exercises 17 to 24, assume the relevant combinations are defined.

17. Prove that a product of even functions is even.
18. Prove that a product of odd functions is even.
19. Prove that the product of an even function and an odd function is odd.
20. Prove that a linear combination of even functions is even, and a linear combination
of odd functions is odd.
21. Show that if $f$ is periodic and differentiable, then $f'$ is periodic.
22. Show by example that if $f'$ is periodic, then $f$ need not be periodic.
23. Show that if $f$ is even and differentiable, then $f'$ is odd.
[Hint: Consider the limit of $(f(-x + h) - f(-x))/h$ as $h \to 0$.]
24. Show that if $f$ is odd and differentiable, then $f'$ is even. See the hint for the
previous exercise.

In Exercises 25 and 26, prove that the set of functions
$\{\sqrt{2/p}\,\sin(k\pi x/p)\}_{k=1}^{\infty}$ is an orthonormal set on the
interval $0 \le x \le p$, that is, show that

\[
\frac{2}{p}\int_0^{p} \sin\frac{k\pi x}{p}\,\sin\frac{l\pi x}{p}\,dx = \begin{cases} 0, & k \ne l,\\ 1, & k = l. \end{cases}
\]

25. Use the identity $\sin\alpha\sin\beta = \frac{1}{2}\cos(\alpha - \beta) - \frac{1}{2}\cos(\alpha + \beta)$
to prove the preceding statement directly.
26. Show how the statement above follows from Equations 8.4 of Section 8.
27. Find a formal Fourier expansion for a particular solution to the differential
equation $\ddot{y} - a^2 y = f(t)$, where $f(t)$ is the square wave defined in Example 5 of the text.
SECTION 10 HEAT AND WAVE EQUATIONS


Here we show how to use Fourier series to solve problems in 1-dimensional heat
conduction and wave motion. As often happens in applications, we first find an
equation that is satisfied by the physical quantity under study, and then apply some
mathematics, in this case Fourier expansions and separation of variables, to solve
the equation.
10A One-Dimensional Heat Equation
Suppose we are given a thin wire of uniform density and length $p$. Let $u(x, t)$ be
the temperature at time $t$ at a point $x$ units from one end. Suppose $0 \le x \le p$ and
that $t \ge 0$. We assume that heat transfer takes place only along the direction of the
heat conductor and that the temperature at the two ends is held fixed. Thus we can
represent the wire as a straight segment along an $x$-axis and represent temperature
as the graph of a function $u = u(x, t)$, as in Figure 14.15.

FIGURE 14.15 Temperatures at equally spaced times.

A basic physical principle of heat conduction is that heat flow is proportional to,
and in the direction opposite to, the temperature gradient $\nabla u$. Recall that $\nabla u$ is the
direction in which the temperature increases most rapidly, so it's reasonable that
heat should flow in the opposite direction, from hotter to colder. Since the medium
is 1-dimensional, represented by a segment of the $x$-axis, the gradient is just $u_x(x, t)$,
so if $u_x(x, t) > 0$ heat flows to the left at $x$, while if $u_x(x, t) < 0$ heat flows to the
right at $x$. Thus the rate of change of heat in a segment $[x_1, x_2]$ is

\[
k\left[-\frac{\partial u}{\partial x}(x_1, t) + \frac{\partial u}{\partial x}(x_2, t)\right], \tag{1}
\]

where the number $k$ is the heat conductivity of the wire, assumed to be constant
over the length of this wire.
By a version of the Fundamental Theorem of Calculus, the rate of change of heat
in the segment in Equation (1) is equal to

\[
k\int_{x_1}^{x_2} \frac{\partial^2 u}{\partial x^2}(x, t)\,dx. \tag{2}
\]

An alternative expression for the rate of change of heat in the segment is

\[
\frac{d}{dt}\int_{x_1}^{x_2} c\rho\,u(x, t)\,dx = c\rho\int_{x_1}^{x_2} \frac{\partial u}{\partial t}(x, t)\,dx, \tag{3}
\]

where the constants $c$ and $\rho$ are the heat capacity and density of the wire per unit
of length. Equating the expressions for rate of heat change in the wire in Equations
2 and 3 gives

\[
a^2\int_{x_1}^{x_2} \frac{\partial^2 u}{\partial x^2}(x, t)\,dx = \int_{x_1}^{x_2} \frac{\partial u}{\partial t}(x, t)\,dx,
\]

where $a^2 = k/(c\rho)$. Allowing $x_2$ to vary, we differentiate both sides of this last
equation with respect to $x_2$, to get, after replacing $x_2$ by $x$, the

10.1 One-dimensional heat equation
\[
a^2\,\frac{\partial^2 u}{\partial x^2}(x, t) = \frac{\partial u}{\partial t}(x, t).
\]

Equation 10.1 is linear in the sense that if $u_1$ and $u_2$ are solutions, then so are
linear combinations $c_1u_1 + c_2u_2$, the reason being that both $\partial^2/\partial x^2$ and $\partial/\partial t$ act linearly.
To single out particular solutions, we start by imposing two boundary conditions that
specify temperature zero at the ends of the wire of length $p$,

\[
u(0, t) = 0 \quad\text{and}\quad u(p, t) = 0, \qquad t \ge 0, \tag{4}
\]

and one initial condition that specifies the initial temperature at all points $x$,

\[
u(x, 0) = h(x), \qquad 0 \le x \le p. \tag{5}
\]

Separation of variables. The standard way to solve this problem is by an
extension to partial differential equations of separation of variables for ordinary
differential equations. The method has many applications, so it's important to
understand its principles. In either setting it's sometimes hard to tell in advance whether
the method will work or not. We start by trying to find product solutions of the form

\[
u(x, t) = X(x)T(t).
\]

The boundary conditions translate into $X(0) = X(p) = 0$. If such product solutions
exist, substitution into $a^2u_{xx} = u_t$ gives

\[
a^2X''(x)T(t) = X(x)T'(t), \qquad \text{for } 0 \le x \le p,\ 0 < t.
\]

Dividing through by $X(x)T(t)$, we get

\[
a^2\,\frac{X''(x)}{X(x)} = \frac{T'(t)}{T(t)}.
\]

Note that if $x$ varies nothing changes on the right. Similarly varying $t$ changes
nothing on the left. Hence both sides must be equal to some constant $C$. We now
set both sides of the equation equal to a constant, letting $C = -\lambda^2$ for convenience:

\[
a^2X'' + \lambda^2X = 0, \qquad T' + \lambda^2T = 0.
\]

The first of these equations has solutions

\[
X(x) = c_1\cos(\lambda/a)x + c_2\sin(\lambda/a)x.
\]

But the boundary condition $X(0) = 0$ implies $c_1 = 0$, and $X(p) = 0$ then implies
$c_2\sin(\lambda/a)p = 0$. Unless we make $c_2 = 0$ also, we can satisfy this condition only
by choosing $\lambda$ so that $(\lambda/a)p = k\pi$, where $k$ is an integer. That is, we must take
$\lambda = (ka\pi)/p$. The result is that $X(x)$ has the form

\[
X(x) = c_2\sin(k\pi/p)x, \qquad k = 1, 2, \ldots.
\]

With $\lambda = ka\pi/p$, the differential equation for $T(t)$ is now $T' + (ka\pi/p)^2T = 0$, and
its solutions are

\[
T(t) = c\,e^{-(k^2a^2\pi^2/p^2)t}.
\]

Except for a constant factor, the product solutions $u_k(x, t) = X_k(x)T_k(t)$ are

\[
u_k(x, t) = e^{-(k^2a^2\pi^2/p^2)t}\sin(k\pi/p)x, \qquad k = 1, 2, \ldots.
\]

Since the heat equation is linear, linear combinations of the functions $u_k(x, t)$,

\[
u_N(x, t) = \sum_{k=1}^{N} b_k\,e^{-(k^2a^2\pi^2/p^2)t}\sin(k\pi/p)x,
\]
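As a sanity check on the separation argument, one can confirm numerically that a product solution $u_k(x, t) = e^{-(k^2a^2\pi^2/p^2)t}\sin(k\pi x/p)$ satisfies $a^2u_{xx} = u_t$ and vanishes at the ends of the wire. The sketch below uses central differences; the step size `h` and the function names are our own assumptions, and the residual is only expected to be small, not exactly zero.

```python
import math

def product_solution(x, t, k, a, p):
    """u_k(x, t) = exp(-(k a pi / p)^2 t) * sin(k pi x / p)."""
    return math.exp(-(k * a * math.pi / p) ** 2 * t) * math.sin(k * math.pi * x / p)

def heat_residual(x, t, k, a, p, h=1e-4):
    """Central-difference estimate of a^2 u_xx - u_t; near zero for a solution."""
    u = lambda xx, tt: product_solution(xx, tt, k, a, p)
    u_xx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / h**2
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    return a**2 * u_xx - u_t

if __name__ == "__main__":
    print(heat_residual(0.7, 0.2, k=2, a=1.5, p=3.0))
    print(product_solution(0.0, 0.2, 2, 1.5, 3.0), product_solution(3.0, 0.2, 2, 1.5, 3.0))
```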
are also solutions. But recall that we still have to satisfy an initial condition $u(x, 0) =
h(x)$. This amounts to setting $t = 0$ in the previous equation and requiring the
coefficients $b_k$ to be chosen so that

\[
h(x) = \sum_{k=1}^{N} b_k\sin(k\pi/p)x.
\]

It's important to understand the separation technique, because it has many
applications, some of which are in Section 10C and in the review problems at the end of
the chapter.
If the function $h(x)$ satisfies the conditions of Theorem 8.6 of Section 8, we can
let $N$ tend to infinity in $u_N(x, t)$ and get a Fourier series representation that we
incorporate using the Fourier sine expansion Formula 9.3 into a

10.2 Solution formula for $a^2\,\dfrac{\partial^2 u}{\partial x^2} = \dfrac{\partial u}{\partial t}$, $u(0, t) = u(p, t) = 0$, $u(x, 0) = h(x)$.

\[
u(x, t) = \sum_{k=1}^{\infty} b_k\,e^{-(k^2a^2\pi^2/p^2)t}\sin(k\pi/p)x, \qquad
\text{where } b_k = \frac{2}{p}\int_0^{p} h(x)\sin\frac{k\pi x}{p}\,dx.
\]

Note that the decreasing exponential factors make the series converge very rapidly,
so we'll be able to differentiate the series term-by-term often enough with respect to
$x$ and $t$ to verify that we do have a solution. (See Theorems 4.1 and 4.5.)

EXAMPLE 1 To be more specific about solving the heat equation, we assume for simplicity that
$p = \pi$. Recall that to solve $a^2u_{xx} = u_t$ with boundary condition $u(0, t) = u(\pi, t) =
0$ and initial condition $u(x, 0) = h(x)$, we want in general to be able to represent
$h(x)$ by an infinite series of the form

\[
h(x) = \sum_{k=1}^{\infty} b_k\sin kx. \tag{6}
\]

Suppose, for example, that $h(x)$ is given for $x$ in $[0, \pi]$ by

\[
h(x) = \begin{cases} x, & 0 \le x \le \pi/2,\\ \pi - x, & \pi/2 \le x \le \pi. \end{cases}
\]

To make Equation 6 represent the Fourier sine expansion of $h(x)$, we extend $h$ to
the interval $-\pi \le x \le \pi$ so that the cosine terms in the expansion of $h$ will all be
zero, leaving only the sine terms to be computed. We do this by extending the graph
of $h$ symmetrically about the origin. According to Formula 9.3, and replacing $x$ by
$\pi - x$ in the second integral below,

\[
\begin{aligned}
b_k &= \frac{2}{\pi}\int_0^{\pi} h(x)\sin kx\,dx
     = \frac{2}{\pi}\int_0^{\pi/2} x\sin kx\,dx + \frac{2}{\pi}\int_{\pi/2}^{\pi} (\pi - x)\sin kx\,dx\\
    &= \frac{2}{\pi}\int_0^{\pi/2} x\sin kx\,dx + (-1)^{k+1}\,\frac{2}{\pi}\int_0^{\pi/2} x\sin kx\,dx
     = \frac{4}{k^2\pi}\sin\left(\frac{k\pi}{2}\right)
     = \begin{cases} 0, & k = 2, 4, 6, \ldots,\\[2pt] \dfrac{4}{k^2\pi}, & k = 1, 5, 9, \ldots,\\[2pt] -\dfrac{4}{k^2\pi}, & k = 3, 7, 11, \ldots. \end{cases}
\end{aligned}
\]

Theorem 8.6 then implies that

\[
h(x) = \frac{4}{\pi}\left(\frac{\sin x}{1^2} - \frac{\sin 3x}{3^2} + \frac{\sin 5x}{5^2} - \frac{\sin 7x}{7^2} + - \cdots\right), \qquad \text{for } 0 \le x \le \pi.
\]

From Equation 10.2 we expect that the solution to our problem is

\[
u(x, t) = \frac{4}{\pi}\left(e^{-a^2t}\,\frac{\sin x}{1^2} - e^{-3^2a^2t}\,\frac{\sin 3x}{3^2} + e^{-5^2a^2t}\,\frac{\sin 5x}{5^2} - + \cdots\right).
\]

To verify that $u(x, t)$ satisfies $a^2u_{xx} = u_t$ for $t > 0$ we use Theorem 4.5, noting
that the exponential factors provide the required uniform convergence. Theorem 4.2
shows that $\lim_{t \to 0} u(x, t) = h(x)$ for $0 \le x \le \pi$. The graphs of $h(x)$ and $u(x, t)$
appear in Figure 14.16.
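The series for $u(x, t)$ is simple to evaluate. The sketch below is an illustration of ours, with $a = 1$ as an assumed default and hypothetical names `tent` and `heat_solution`; it sums the odd-index terms $\frac{4}{\pi}\,\frac{\sin(k\pi/2)}{k^2}\,e^{-k^2a^2t}\sin kx$ and compares the result at $t = 0$ with the initial temperature.

```python
import math

def tent(x):
    """Initial temperature h(x): x on [0, pi/2], pi - x on [pi/2, pi]."""
    return x if x <= math.pi / 2 else math.pi - x

def heat_solution(x, t, a=1.0, terms=200):
    """Partial sum over the first `terms` odd k of the series for u(x, t)."""
    total = 0.0
    for k in range(1, 2 * terms, 2):
        sign = 1.0 if k % 4 == 1 else -1.0   # sin(k*pi/2) for odd k
        total += sign / k**2 * math.exp(-(k * a) ** 2 * t) * math.sin(k * x)
    return 4 / math.pi * total

if __name__ == "__main__":
    for t in (0.0, 0.1, 0.5, 1.0):
        print(t, heat_solution(math.pi / 2, t))
    print(tent(1.0), heat_solution(1.0, 0.0))
```

The temperature at the midpoint starts near $\pi/2$ and decays toward zero, and the ends stay at temperature zero because every sine term vanishes there.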

lOB Steady-State Solutions


A solution $u(x,t) = v(x)$ of the heat equation that is independent of $t$ is called
a steady-state solution, because it doesn't vary with time. The heat equation for
such functions becomes simply $v''(x) = 0$, and all solutions are necessarily of the
form $u(x,t) = v(x) = \alpha + \beta x$, where $\alpha$ and $\beta$ are constant. Solutions of this type
are useful for solving the time-dependent problem when we have nonhomogeneous
boundary conditions, which have the form

$$u(0,t) = u_0, \quad u(p,t) = u_1, \quad t > 0,$$

where at least one of $u_0$ and $u_1$ is a nonzero constant. The idea is to choose $\alpha$, $\beta$ in
the steady-state solution $v(x) = \alpha + \beta x$ so that

$$v(0) = u_0, \quad v(p) = u_1.$$

Then

$$u(x,t) = w(x,t) + v(x)$$
FIGURE 14.16 Time-varying temperatures for equally spaced x.
726 Chapter 14 Infinite Series

will satisfy $u(0,t) = u_0$, $u(p,t) = u_1$ if $w(x,t)$ is a solution of the heat equation
satisfying homogeneous conditions

$$w(0,t) = w(p,t) = 0, \quad t > 0.$$

Note that the function $w + v$ is indeed a solution of the heat equation, because both
$w$ and $v$ are solutions and because the heat operator $a^2 D_{xx} - D_t$ acts linearly.
To solve problems of the form

$$a^2 u_{xx} = u_t, \quad u(0,t) = u_0, \quad u(p,t) = u_1, \quad u(x,0) = h(x),$$

we first find a steady-state solution $v(x) = \alpha + \beta x$. We need $v(0) = u_0 = \alpha$ and
$v(p) = u_1 = \alpha + \beta p$. Thus $\alpha = u_0$, and $\beta = (u_1 - u_0)/p$. Then solve the heat
equation $a^2 w_{xx} = w_t$ with boundary conditions $w(0,t) = w(p,t) = 0$ and initial
condition $w(x,0) = h(x) - v(x)$. The solution to the original problem has the form
$u(x,t) = v(x) + w(x,t)$, or

$$u(x,t) = v(x) + \sum_{k=1}^{\infty} b_k e^{-k^2(\pi^2 a^2/p^2)t} \sin\frac{k\pi x}{p}, \quad\text{where}\quad b_k = \frac{2}{p}\int_0^p \bigl(h(x) - v(x)\bigr)\sin\frac{k\pi x}{p}\,dx. \tag{10.3}$$

The reason $u(x,0) = h(x)$ is that when $t = 0$ the series represents $h(x) - v(x)$.
All terms of the series are zero when $x$ is $0$ or $p$, so $u(0,t) = v(0) = u_0$ and
$u(p,t) = v(p) = u_1$.
EXAMPLE 2
To solve the problem

$$a^2 u_{xx} = u_t, \quad u(0,t) = 10, \quad u(5,t) = 30, \quad u(x,0) = h(x) = 10 - 2x, \quad 0 < x < 5,$$

we first find the steady-state solution $v(x) = 10 + 4x$. The desired solutions

$$u(x,t) = (10 + 4x) + w(x,t)$$

come from finding solutions $w(x,t)$ that satisfy homogeneous boundary conditions

$$w(0,t) = w(5,t) = 0, \quad t > 0,$$

and an initial condition

$$u(x,0) = w(x,0) + v(x) = h(x);$$

this last condition is just

$$w(x,0) = h(x) - v(x) = -6x.$$
Thus our solution $u(x,t)$ has the form of Equation 10.3, where the $b_k$ are Fourier
sine coefficients

$$b_k = \frac{2}{5}\int_0^5 -6x\,\sin\frac{k\pi x}{5}\,dx = -\frac{60(-1)^{k+1}}{k\pi}.$$

The solution to the problem is thus

$$u(x,t) = (10 + 4x) - \frac{60}{\pi}\sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k}\, e^{-k^2 a^2 \pi^2 t/25} \sin\frac{k\pi x}{5}.$$
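The coefficients in this example can be cross-checked by numerical integration. The sketch below assumes the data of the example ($p = 5$, $u_0 = 10$, $u_1 = 30$, $h(x) = 10 - 2x$); the helper names `simpson`, `steady_state`, and `sine_coeff` are hypothetical and introduced only for illustration.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    step = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * step)
    return total * step / 3

def steady_state(u0, u1, p):
    """Return v(x) = alpha + beta*x with v(0) = u0 and v(p) = u1."""
    beta = (u1 - u0) / p
    return lambda x: u0 + beta * x

def sine_coeff(k, h, v, p):
    """b_k = (2/p) * integral_0^p (h(x) - v(x)) sin(k pi x / p) dx."""
    return (2 / p) * simpson(
        lambda x: (h(x) - v(x)) * math.sin(k * math.pi * x / p), 0.0, p)

p = 5.0
v = steady_state(10.0, 30.0, p)      # v(x) = 10 + 4x
h = lambda x: 10.0 - 2.0 * x         # initial temperature from the example

for k in range(1, 5):
    closed_form = -60.0 * (-1) ** (k + 1) / (k * math.pi)
    print(k, round(sine_coeff(k, h, v, p), 6), round(closed_form, 6))
```

The two printed columns should agree closely, which supports the closed form $b_k = -60(-1)^{k+1}/(k\pi)$.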
EXERCISES

Solve the heat equation $a^2 u_{xx} = u_t$ with boundary and initial conditions 1 through 6.

1. $u(0,t) = u(p,t) = 0$; $u(x,0) = \sin(\pi x/p)$, $0 < x < p$
2. $u(0,t) = u(1,t) = 0$; $u(x,0) = x$, $0 < x < 1$
3. $u(0,t) = u(1,t) = 0$; $u(x,0) = 1 - x$, $0 < x < 1$
4. $u(0,t) = u(\pi,t) = 0$; $u(x,0) = x(\pi - x)$, $0 < x < \pi$
5. $u(0,t) = u(\pi,t) = 0$; $u(x,0) = \sin x + \tfrac{1}{2}\sin 2x$, $0 < x < \pi$
6. $u(0,t) = u(2,t) = 0$; $u(x,0) = \begin{cases} 0, & 0 < x < 1, \\ 1, & 1 < x < 2 \end{cases}$

Find steady-state solutions $u(x,t) = v(x)$ of the heat equation $a^2 u_{xx} = u_t$ that satisfy each of the conditions 7 through 10.

7. $u(0,t) = -1$, $u(2,t) = 1$
8. $u(0,t) = 0$, $u(100,t) = 100$
9. $u_x(0,t) = 1$, $u(1,t) = 2$
10. $u_x(0,t) = -1$, $u(1,t) = 3$

Solve the heat equation $a^2 u_{xx} = u_t$, given boundary and initial conditions 11 through 14.

11. $u(0,t) = 1$, $u(p,t) = 3$; $u(x,0) = \sin(\pi x/p)$, $0 < x < p$
12. $u(0,t) = -1$, $u(2,t) = 1$; $u(x,0) = x$, $0 < x < 2$
13. $u(0,t) = 0$, $u(1,t) = 1$; $u(x,0) = 1 - x$, $0 < x < 1$
14. $u(0,t) = -1$, $u(1,t) = 2$; $u(x,0) = 3x - 1$, $0 < x < 1$

Find all solutions of the form $u(x,t) = X(x)T(t)$ for each of the equations 15 through 18:

15. $u_{xx} + u_x = u_t$
16. $u_{tt} - u_{xx} = u_t$
17. $x u_x = 2u_t$
18. $u_{xx} = u_{tt}$

Find the steady-state solution to each of the problems 19 through 22.

19. $u_{xx} = u_t + 2$; $u(0,t) = 1$, $u(1,t) = 2$
20. $u_{xx} = u_t + u$; $u(0,t) = 0$, $u(2,t) = 0$
21. $u_{xx} = u_t + x$; $u(0,t) = 1$, $u(1,t) = 2$
22. $u_{xx} = u_t - 2x + 1$; $u(0,t) = 1$, $u(1,t) = 0$

Insulated endpoints. We can interpret the 1-dimensional heat flow problem

$$u_{xx} = u_t, \quad u_x(0,t) = u_x(p,t) = 0, \quad u(x,0) = f(x), \quad 0 \le x \le p,$$

as requiring the conducting medium to have insulated ends at $x = 0$ and $x = p$. Because the temperature gradient $u_x$ is always 0 at the endpoints, there is no heat flow past those points. The solution involves Fourier cosine series. The next four exercises are about problems of this kind.

23. (a) Show that product solutions $u(x,t) = T(t)X(x)$ of the insulated endpoint problem have the form $u_k(x,t) = a_k e^{-k^2\pi^2 t/p^2}\cos k(\pi/p)x$.
(b) Use the Fourier cosine expansion for $f(x)$ on $0 \le x \le p$ to solve the boundary value problem for a general initial temperature $f(x)$.
(c) What is the steady-state temperature function $u(x,\infty)$ for $0 \le x \le p$?
24. Solve the heat equation $a^2 u_{xx} = u_t$ with insulated end conditions $u_x(0,t) = u_x(1,t) = 0$ and initial condition $u(x,0) = x$ for $0 < x < 1$.
25. Solve the heat equation $a^2 u_{xx} = u_t$ with insulated end conditions $u_x(0,t) = u_x(1,t) = 0$ and initial condition $u(x,0) = 1$ for $0 < x < 1$.
26. Solve the heat equation $a^2 u_{xx} = u_t$ with insulated end conditions $u_x(0,t) = u_x(\pi,t) = 0$ and initial condition $u(x,0) = \cos x$ for $0 < x < \pi$.
27. The partial differential equation $t u_{xx} = u_t$ has product solutions of the form $u(x,t) = X(x)T(t)$.
(a) Find two ordinary differential equations satisfied by $X(x)$ and $T(t)$ respectively. (The equation for $X$ should not contain $t$ and the equation for $T$ should not contain $x$. Take care to allow for separation constant $\lambda = 0$.)
(b) Solve the ordinary differential equations found in part (a), and use the results to specify the general form of the solutions $X(x)T(t)$.
28. The partial differential equation $t x u_x = u_t$ has product solutions of the form $u(x,t) = X(x)T(t)$ for $x > 0$ and $t > 0$. Find the general form of all such solutions.
29. Suppose that $u(x,t)$ satisfies $a^2 u_{xx} = u_t$ and that $u(x,t_0)$ is concave up as a function of $x$ for $x_0 < x < x_1$. Show that for each $x$ with $x_0 < x < x_1$ there is a time $t(x)$ such that $u(x,t)$ increases as $t$ increases from $t_0$ to $t(x)$. What if $u(x,t_0)$ is concave down?
30. Verify that the partial differential operator $L = a^2 D_{xx} - D_t$, defined by $L(u) = a^2 u_{xx} - u_t$, is linear in its action on twice differentiable functions $u$, $v$; that is, show that $L(cu + dv) = cL(u) + dL(v)$ for constants $c$ and $d$.

31. The assumption that the separation constant $C$ for the problem $a^2 u_{xx} = u_t$, $u(0,t) = u(p,t) = 0$ has the special form $-\lambda^2$ is convenient but not essential. Derive the product solutions $u_k(x,t) = e^{-k^2(a^2\pi^2/p^2)t}\sin(k\pi/p)x$ using $C$ instead of $-\lambda^2$.

10C One-Dimensional Wave Equation


Think of a stretched elastic string of length $p$ and uniform density $\rho$ placed along
an $x$-axis in $\mathbb{R}^3$. Suppose that the ends of the string are fixed at $x = 0$ and $x = p$
by opposite forces of magnitude $F$. If the string is made to vibrate starting at time
$t = 0$, our problem is to predict the position at time $t > 0$ of a point $\mathbf{x}(s,t)$ on the
string a distance $s$ along the string from the end fixed at $x = 0$. Figure 14.17 shows
an exaggerated string shape with a unit tangent $\mathbf{t}(s) = d\mathbf{x}(s)/ds$ at $\mathbf{x}(s,t)$.

FIGURE 14.17 String and tangent.

We imagine the string partitioned into short pieces of length $\Delta s$ and then derive
two different expressions for the total force vector acting on a typical segment of
the subdivision. To find the tension force $\mathbf{T}$ acting on the small piece, note that the
opposing forces at $\mathbf{x}(s)$ and $\mathbf{x}(s + \Delta s)$ are

$$F\mathbf{t}(s + \Delta s) \quad\text{and}\quad -F\mathbf{t}(s), \quad\text{so}\quad \mathbf{T} = F[\mathbf{t}(s + \Delta s) - \mathbf{t}(s)].$$

But by Newton's law, the force $\mathbf{T}$ acting on the small piece equals mass $\rho\,\Delta s$ times
acceleration $\mathbf{a}$, so $\mathbf{T} = (\rho\,\Delta s)\mathbf{a}$. Hence

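$$\rho\,\mathbf{a} = \frac{\mathbf{T}}{\Delta s} = F\,\frac{\mathbf{t}(s + \Delta s) - \mathbf{t}(s)}{\Delta s}.$$

Letting $\Delta s \to 0$ gives $\rho\,\mathbf{a} = F\,\dfrac{\partial \mathbf{t}}{\partial s}(s,t)$. By definition $\mathbf{t} = \dfrac{\partial \mathbf{x}}{\partial s}$ and $\mathbf{a} = \dfrac{\partial^2 \mathbf{x}}{\partial t^2}$, so

$$\rho\,\mathbf{a} = F\,\frac{\partial \mathbf{t}(s,t)}{\partial s} = F\,\frac{\partial^2 \mathbf{x}(s,t)}{\partial s^2}, \quad\text{and also}\quad \rho\,\mathbf{a} = \rho\,\frac{\partial^2 \mathbf{x}(s,t)}{\partial t^2}.$$

Equating the two formulas for $\rho\,\mathbf{a}$ and dividing by $\rho$ gives the

10.4 Vector wave equation

$$(F/\rho)\,\frac{\partial^2 \mathbf{x}(s,t)}{\partial s^2} = \frac{\partial^2 \mathbf{x}(s,t)}{\partial t^2}.$$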

Before proceeding we'll pause to interpret this equation. From Chapter 8, Section 3
recall that $|\partial^2 \mathbf{x}/\partial s^2| = |\partial \mathbf{t}/\partial s| = \kappa(s)$ is the curvature of the string at $\mathbf{x}(s)$.
Thus the magnitude of the acceleration on the right side increases with increasing
force $F$ and curvature $\kappa$, and decreases with increasing density $\rho$. Equation 10.4
is linear, but it requires us to use arc length $s$ along the string as an independent
variable, something that's very hard to measure in practice.
Setting $F/\rho = a^2$ we write the vector differential equation as a system of three
scalar equations

$$a^2 x_{ss} = x_{tt}, \qquad a^2 y_{ss} = y_{tt}, \qquad a^2 z_{ss} = z_{tt}.$$
The motion's $x$-axis component is usually slight, indeed zero where we assumed
the ends are fixed, so the equation for $x(s,t)$ is usually set aside. Between the
other two equations there is little difference in physical significance unless we make
some other special assumption. To be specific we assume the string has been set in
motion so that movement is entirely in the $xy$-plane, so $z(s,t) = 0$, and we're left
with only the middle equation for $y$. Equation 3.6 in Chapter 8, Section 3 shows
that $y_{ss} = y_{xx}/(1 + y_x^2)^{3/2}$. If the slopes $y_x$ are small enough, we can replace the
nonlinear expression $y_{ss}$ by the slightly larger $y_{xx}$, so with a slight loss of precision
we replace the system of three equations by one linear equation for $y(x,t)$:

10.5 One-dimensional linear wave equation

$$a^2\,\frac{\partial^2 y}{\partial x^2}(x,t) = \frac{\partial^2 y}{\partial t^2}(x,t).$$
This differential equation doesn't specify the displacements $y(x,t)$ of a string of
length $p$ unless we impose some boundary and initial conditions:

$$y(0,t) = y(p,t) = 0 \quad\text{for } t > 0,$$

and

$$y(x,0) = f(x) \quad\text{and}\quad \frac{\partial y}{\partial t}(x,0) = g(x).$$

The first pair of equations holds the string on the $x$-axis at $x = 0$ and $x = p$. The
second pair specifies for $0 \le x \le p$ the initial shape of the string, perhaps from
plucking as on a harp string, and its initial velocity, perhaps from hammering as on
a piano string.
Separation of variables. As in solving the heat equation we use separation of
variables and rely now on the linearity of Equation 10.5 for constructing solutions
that satisfy the boundary and initial conditions. Start by setting

$$y(x,t) = X(x)T(t),$$

so Equation 10.5 becomes

$$a^2 X''(x)T(t) = X(x)T''(t), \quad\text{or}\quad a^2\,\frac{X''(x)}{X(x)} = \frac{T''(t)}{T(t)}.$$

The right side of the second equation is independent of $t$ because the left side is
independent of $t$, so both sides are constant. For convenience in treating the heat
equation we chose a special form for the constant; this wasn't really necessary as
we'll show here by calling the separation constant simply $\lambda$. We write

$$\frac{X''(x)}{X(x)} = \lambda \quad\text{and}\quad \frac{T''(t)}{T(t)} = a^2\lambda.$$

The first equation is $X'' = \lambda X$ and has solutions

$$X(x) = c_1 e^{\sqrt{\lambda}\,x} + c_2 e^{-\sqrt{\lambda}\,x}.$$

The boundary conditions $y(0,t) = y(p,t) = 0$ require $X(0) = X(p) = 0$, so

$$c_1 + c_2 = 0, \qquad c_1 e^{\sqrt{\lambda}\,p} + c_2 e^{-\sqrt{\lambda}\,p} = 0.$$

Solving for $c_1$ and $c_2$ shows that to get nonzero solutions we must have

$$e^{2\sqrt{\lambda}\,p} = 1.$$

Allowing for complex exponents and complex values for $c_1$ and $c_2$, we see that
for some integer $k$ we must have $2\sqrt{\lambda}\,p = 2k\pi i$. Thus $\sqrt{\lambda} = k\pi i/p$, and the
corresponding solutions $X(x)$ have the form

$$X_k(x) = c_1 e^{(k\pi i/p)x} - c_1 e^{-(k\pi i/p)x} = 2c_1 i \sin\frac{k\pi}{p}x = b_k \sin\frac{k\pi}{p}x.$$

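Since we now know that $\lambda = (k\pi i/p)^2 = -(k\pi/p)^2$, the equation for $T$ is

$$T'' + (k\pi a/p)^2\,T = 0 \quad\text{with solutions}\quad T_k(t) = C_k \cos\frac{k\pi a}{p}t + D_k \sin\frac{k\pi a}{p}t.$$

The product solutions $X_k(x)T_k(t)$ therefore have the form

$$y(x,t) = \left[A_k \cos\frac{k\pi a}{p}t + B_k \sin\frac{k\pi a}{p}t\right]\sin\frac{k\pi}{p}x.$$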

We now form finite or infinite sums of these terms and try to satisfy the initial
conditions by choosing Ak and Bk to be the appropriate Fourier sine coefficients.

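10.6 Solution formula for $a^2\,\dfrac{\partial^2 y}{\partial x^2} = \dfrac{\partial^2 y}{\partial t^2}$, satisfying $y(0,t) = y(p,t) = 0$ and $y(x,0) = f(x)$, $\dfrac{\partial y}{\partial t}(x,0) = g(x)$:

$$y(x,t) = \sum_{k=1}^{\infty}\left[A_k \cos\frac{k\pi a}{p}t + B_k \sin\frac{k\pi a}{p}t\right]\sin\frac{k\pi}{p}x, \quad\text{where}$$

$$A_k = \frac{2}{p}\int_0^p f(x)\sin\frac{k\pi}{p}x\,dx, \quad\text{and}\quad B_k = \frac{2}{k\pi a}\int_0^p g(x)\sin\frac{k\pi}{p}x\,dx.$$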
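Formula 10.6 translates directly into a short numerical routine. The following is a minimal sketch, assuming the coefficients $A_k$ and $B_k$ are approximated by Simpson's rule and the series is truncated after a fixed number of terms; the names `simpson` and `wave_solution` and the sample values of $p$ and $a$ are assumptions made for illustration.

```python
import math

def simpson(f, lo, hi, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    step = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * step)
    return total * step / 3

def wave_solution(f, g, p, a, terms=10):
    """Return y(x, t), a partial sum of Formula 10.6 for initial shape f and velocity g."""
    A, B = [], []
    for k in range(1, terms + 1):
        mode = lambda x, k=k: math.sin(k * math.pi * x / p)
        A.append((2 / p) * simpson(lambda x: f(x) * mode(x), 0.0, p))
        B.append((2 / (k * math.pi * a)) * simpson(lambda x: g(x) * mode(x), 0.0, p))

    def y(x, t):
        total = 0.0
        for k in range(1, terms + 1):
            w = k * math.pi * a / p          # angular frequency of the k-th mode
            total += (A[k - 1] * math.cos(w * t) + B[k - 1] * math.sin(w * t)) \
                     * math.sin(k * math.pi * x / p)
        return total
    return y

# Sample data (assumed): string of length 2, wave speed a = 3.
p, a = 2.0, 3.0
y_shape = wave_solution(lambda x: 0.5 * math.sin(math.pi * x / p), lambda x: 0.0, p, a)
y_struck = wave_solution(lambda x: 0.0, lambda x: math.sin(math.pi * x / p), p, a)
print(y_shape(0.7, 0.4), y_struck(0.7, 0.4))
```

For a one-mode initial shape $f(x) = A\sin(\pi x/p)$ with $g = 0$, only $A_1$ is nonzero, so the routine should reproduce $A\cos(\pi a t/p)\sin(\pi x/p)$ up to quadrature error.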

Figure 14.18 shows equally time-spaced string positions for a plucked string.

A simple example of Formula 10.6 is a string that's initially stationary, so $g(x) = 0$.
If the initial displacement is $f(x) = A\sin(\pi x/p)$, for $0 \le x \le p$, the boundary and
initial conditions will be

$$y(0,t) = y(p,t) = 0, \quad\text{and}\quad y(x,0) = A\sin\frac{\pi x}{p}, \quad \frac{\partial y}{\partial t}(x,0) = 0.$$

FIGURE 14.18 Plucked string.

Since $g(x) = 0$, all $B_k = 0$. Since $f(x)$ is a one-term trigonometric polynomial of
the same type as the general solution formula, $A_1 = A$ and $A_k = 0$ for $k \ne 1$. The
solution according to Formula 10.6, and graphed in Figure 14.18, is

$$y(x,t) = A\cos\frac{\pi a}{p}t\,\sin\frac{\pi x}{p}.$$
If we want to relax our assumption that the string's motion is confined to a plane,
all we have to do is reintroduce the equation $a^2 z_{xx} = z_{tt}$ along with its own initial
conditions and solve that problem in the same way. Thus we'd have a vector solution
$(y(x,t), z(x,t))$ with the same independent variables $x$ and $t$.

EXERCISES

Solve the wave equation $a^2 u_{xx} = u_{tt}$ with boundary and initial conditions 1 to 4.

1. $u(0,t) = u(\pi,t) = 0$, $u(x,0) = \sin x$, $u_t(x,0) = 0$
2. $u(0,t) = u(\pi,t) = 0$, $u(x,0) = \begin{cases} x, & 0 < x \le \pi/2, \\ \pi - x, & \pi/2 < x < \pi, \end{cases}$ $u_t(x,0) = 0$
3. $u(0,t) = u(\pi,t) = 0$, $u(x,0) = 0$, $u_t(x,0) = \begin{cases} 0, & 0 < x \le \pi/2, \\ 1, & \pi/2 < x < \pi \end{cases}$
4. $u(0,t) = u(1,t) = 0$, $u(x,0) = x(1 - x)$, $u_t(x,0) = \sin \pi x$
5. The nonhomogeneous wave equation

incorporates $g$, the acceleration of gravity, into the vibrating string problem.
(a) Find the equilibrium solutions $u(x,t) = v(x)$ of the nonhomogeneous equation.
(b) Among the solutions found in part (a), select the solution that satisfies the boundary conditions $u(0,t) = u(p,t) = 0$.
(c) Explain how to modify the Fourier solution method for $a^2 u_{xx} = u_{tt}$ to cover the solution of the nonhomogeneous problem.
6. The d'Alembert solution to the wave equation. This method predates the Fourier series method, but is not so popular because it's not so widely applicable. Let $U(x)$ and $V(x)$ be twice differentiable functions for all real $x$.
(a) Show that $u(x,t) = U(x + at) + V(x - at)$ defines a solution of $a^2 u_{xx} = u_{tt}$ that is valid for all real $x$ and all real $t$.

(b) Assuming $f(x)$ twice differentiable, show that
$$u(x,t) = \tfrac{1}{2}[f(x + at) + f(x - at)]$$
is a solution to the wave equation of the form described in part (a) that also satisfies the initial conditions $u(x,0) = f(x)$, $u_t(x,0) = 0$.
(c) Assuming $g'(x)$ continuous, show that
$$u(x,t) = \frac{1}{2a}\int_{x-at}^{x+at} g(s)\,ds$$
is a solution to the wave equation of the form described in part (a) that also satisfies the initial conditions $u(x,0) = 0$, $u_t(x,0) = g(x)$.
(d) Combine the results of parts (b) and (c) to find a solution formula for the wave equation subject to the initial conditions $u(x,0) = f(x)$, $u_t(x,0) = g(x)$.
7. (a) Show that the d'Alembert solution $U(x + at) + V(x - at)$ described in the previous exercise represents the sum of two wave motions, the first moving left with speed $a > 0$, the other moving right with the same speed.
(b) Show for the general term in the solution of Equation 10.6 that
$$[A\cos(k\pi a/p)t + B\sin(k\pi a/p)t]\sin(k\pi/p)x = \sqrt{A^2 + B^2}\,\cos(k\pi a t/p - \theta)\sin(k\pi x/p),$$
where $\theta$ depends on $A$ and $B$. This term is called a standing wave.
(c) Show that the term in part (b) also has the d'Alembert form
$$\tfrac{1}{2}\sqrt{A^2 + B^2}\,\bigl[\sin\bigl((k\pi/p)(x + at) - \theta\bigr) + \sin\bigl((k\pi/p)(x - at) + \theta\bigr)\bigr].$$
[Hint: Use $\sin(a + b) + \sin(a - b) = 2\sin a\cos b$.]
8. This exercise imposes boundary conditions and initial velocity zero on d'Alembert's solution described in Exercise 6.
(a) Let $f(x)$ be twice differentiable for $0 \le x \le p$, with $f(0) = f(p) = 0$. Extend $f(x)$ to the interval $-p < x < 0$ by defining
$$f(x) = -f(-x), \quad\text{if } -p < x < 0.$$
Then extend $f(x)$ to $-\infty < x < \infty$ to have period $2p$. Show that $f(x)$ so extended is not only periodic, but odd, that is, $f(-x) = -f(x)$ for all real $x$.
(b) Sketch the odd periodic extension of $f(x) = x(1 - x)$, as described in part (a), from $0 \le x \le 1$ to $-\infty < x < \infty$.
(c) Show that if $f(x)$ is odd and has period $2p$ for $-\infty < x < \infty$, then
$$u(x,t) = \tfrac{1}{2}[f(x + at) + f(x - at)]$$
is a d'Alembert solution to $a^2 u_{xx} = u_{tt}$ that satisfies the boundary conditions $u(0,t) = u(p,t) = 0$ and initial conditions $u(x,0) = f(x)$ and $u_t(x,0) = 0$.
9. This exercise imposes boundary conditions and initial displacement zero on d'Alembert's solution described in Exercise 6.
(a) Let $G(x)$ be twice differentiable for $0 \le x \le p$, with $G'(0) = G'(p) = 0$. Extend $G(x)$ to the interval $-p < x < 0$ by defining
$$G(x) = G(-x), \quad\text{if } -p < x < 0.$$
Then extend $G(x)$ to $-\infty < x < \infty$ to have period $2p$. Show that $G(x)$ so extended is not only periodic, but even, that is, $G(-x) = G(x)$ for $-\infty < x < \infty$.
(b) Sketch the even periodic extension of $G(x) = x^2(1 - x)^2$, as described in part (a), from $0 \le x \le 1$ to $-\infty < x < \infty$.
(c) Show that if $G(x)$ is even and has period $2p$ for $-\infty < x < \infty$, then the function
$$u(x,t) = (1/2a)[G(x + at) - G(x - at)]$$
is a d'Alembert solution to $a^2 u_{xx} = u_{tt}$ that satisfies the boundary conditions $u(0,t) = u(p,t) = 0$ and initial conditions $u(x,0) = 0$ and $u_t(x,0) = G'(x)$. Why did we assume $G'(0) = G'(p) = 0$?
10. The 1-dimensional wave equation with linear friction constant $k$ is

where $k$ is a positive constant. Find all bounded solutions $u(x,t) = X(x)T(t)$.
11. The partial differential equation $t u_{xx} = u_{tt}$ has product solutions of the form $u(x,t) = X(x)T(t)$. Find ordinary differential equations, each dependent on a parameter $\lambda$, satisfied by $X(x)$ and $T(t)$ respectively. The equation for $X$ is to be independent of $t$ and the equation for $T$ is to be independent of $x$.

Chapter 14 REVIEW

In Exercises 1 to 10, use what you know of specific infinite series to identify a sum in closed form for the given series, determining also its domain of convergence.

1. $\sum_{k=0}^{\infty} (-1)^k (x - 5)^k$
2. $\sum_{k=0}^{\infty} (x + 1)^{2k}$
3. $\sum_{k=0}^{\infty} (x^2 + 1)^{-k}$
4. $\sum_{k=0}^{\infty} (-1)^k (x - 1)^k/k!$
5. $\sum_{k=0}^{\infty} (x + 1)^{2k}/k!$
6. $\sum_{k=0}^{\infty} 2^k (x - 1)^k/k!$
7. $\sum_{k=0}^{\infty} (-1)^k (x - 5)^{2k}/(2k)!$
8. $\sum_{k=0}^{\infty} (-1)^k (x + 1)^{2k+1}/(2k + 1)!$
9. $\sum_{k=0}^{\infty} (x^2 - 1)^{-2k}/(2k)!$
10. $\sum_{k=1}^{\infty} (-1)^{k+1} (x^2 - 1)^k/k$

For the functions in Exercises 11 to 22, find the Taylor expansion about the indicated point $a$, and state the domain of convergence. In some cases you may be able to do this most easily by some means other than by the Taylor formula $c_k = f^{(k)}(a)/k!$.

11. $(1 - x^3)^{-1}$; $a = 0$
12. $(2x - x^2)^{-1}$; $a = 1$
13. $\ln(1 - 2x)$; $a = 0$
14. $e^{-x^3}$; $a = 0$
15. $e^{x-1}$; $a = 1$
16. $e^{x-1}$; $a = 0$
17. $\cos(2x)$; $a = 0$
18. $\sin x^2$; $a = 0$
19. $\sin(x + \pi)$; $a = 0$
20. $\sin x + \sin 2x$; $a = 0$
21. $e^x - e^{2x}$; $a = 0$
22. $(1 + x)^{-1}(1 - x)^{-1}$; $a = 0$

State all real values of $x$ for which the series in Exercises 23 to 28 converges.

23. $\sum_{k=1}^{\infty} k^2 x^k$
24. $\sum_{k=1}^{\infty} k(x^2 - 1)^k$
25. $\sum_{k=1}^{\infty} (x/k)^k$
26. $\sum_{k=1}^{\infty} \sin kx/k^2$
27. $\sum_{k=1}^{\infty} 2^{-k}\cos kx$

Test the series in Exercises 29 to 34 for convergence. If the series converges, state whether it also converges absolutely.

29. $\sum_{k=1}^{\infty} (k + 1)/k^2$
30. $\sum_{k=1}^{\infty} (k/2)^k$
31. $\sum_{k=1}^{\infty} (-1)^k e^k/k^2$
32. $\sum_{k=1}^{\infty} k!/k^2$
33. $\sum_{k=1}^{\infty} (k/2^k)^k$
34. $\sum_{k=1}^{\infty} (-1)^k e^k/k^k$

35. (a) Let $L$ be defined as an operator by $L(u) = u_{xx} - u_t$. Show that $L$ is a linear operator and conclude that linear combinations of solutions of the heat equation are also solutions.
(b) Show that boundary conditions of the form $u(a,t) = u(b,t) = 0$ are linear in the sense that if two functions satisfy the conditions, then so does a linear combination of the two functions.
(c) Show that a boundary condition of the form $u(a,t) = 1$ is not linear in the sense of part (b).
(d) Show that the initial condition $u(x,t) = f(x)$ is not linear in the sense of part (a) unless $f(x)$ is identically zero.
36. Example 4 in Chapter 9, Section 6 shows that the 2-dimensional Laplace equation $\nabla^2 u = 0$ in polar coordinates is
$$\frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\frac{\partial u}{\partial r} + \frac{1}{r^2}\frac{\partial^2 u}{\partial \theta^2} = 0.$$
(a) Letting $u(r,\theta) = R(r)\Theta(\theta)$, show that separation of variables leads to two equations, a harmonic oscillator equation and an Euler equation,
$$\Theta'' + \lambda^2\Theta = 0, \quad\text{and}\quad r^2 R'' + rR' - \lambda^2 R = 0.$$
(b) Show that $\Theta'' + \lambda^2\Theta = 0$ has solutions satisfying $\Theta(0) = \Theta(2\pi)$ if $\lambda = k$ for integer $k$, with solutions $a_k\cos k\theta + b_k\sin k\theta$. Thus $\Theta$ takes the same value at polar angles $\theta = 0$ and $\theta = 2\pi$.
(c) Show that the Euler equation has solutions $r^k$ and $r^{-k}$ for integer $k$, but that negative exponents are ruled out by the boundary condition that $u(r,\theta)$ should be finite at the origin.
(d) Show that if $f(\theta)$ has Fourier series representation
$$f(\theta) = \frac{a_0}{2} + \sum_{k=1}^{\infty} (a_k\cos k\theta + b_k\sin k\theta)$$
on $0 \le \theta \le 2\pi$, then for $0 \le r \le 1$ the function
$$u(r,\theta) = \frac{a_0}{2} + \sum_{k=1}^{\infty} r^k (a_k\cos k\theta + b_k\sin k\theta)$$
defines a function $u(r,\theta)$ that solves the Laplace equation in the interior of the unit disk and also satisfies the boundary condition $u(1,\theta) = f(\theta)$.
37. The Laplace equation in spherical coordinates is
$$\nabla^2 u = \frac{\partial^2 u}{\partial r^2} + \frac{2}{r}\frac{\partial u}{\partial r} + \frac{1}{r^2}\frac{\partial^2 u}{\partial \phi^2} + \frac{\cos\phi}{r^2\sin\phi}\frac{\partial u}{\partial \phi} + \frac{1}{r^2\sin^2\phi}\frac{\partial^2 u}{\partial \theta^2} = 0.$$
Assuming there is a function of the form $u(r,\phi,\theta) = R(r)\Phi(\phi)\Theta(\theta)$ that satisfies this partial differential equation, find second-order ordinary differential equations satisfied by $R$, $\Phi$, and $\Theta$. [Hint: First multiply by $r^2\sin^2\phi$.]
APPENDIX

FINDING INDEFINITE INTEGRALS

The table at the end of this appendix lists some frequently occurring integrals. As
a supplement to the table, you may find it useful to use a symbolic calculator or
software that provides some indefinite integrals. If you don't see how to compute an
indefinite integral directly and don't find it elsewhere, you may find that one of the
following techniques works. Integration constants are omitted, since they're not the
main issue here.

I IDENTITY SUBSTITUTIONS
Rewriting the integrand using an algebraic, trigonometric, exponential, or logarith-
mic identity will sometimes convert an apparently intractable integrand into an
amenable one.

EXAMPLE
The integral $\int (e^x + e^{3x})^2\,dx$ can be rewritten by squaring out the binomial to get

$$\int (e^x + e^{3x})^2\,dx = \int (e^{2x} + 2e^{4x} + e^{6x})\,dx = \tfrac{1}{2}e^{2x} + \tfrac{1}{2}e^{4x} + \tfrac{1}{6}e^{6x}.$$

To integrate $\cos^2 x$, recall the trigonometric identity $\cos 2x = 2\cos^2 x - 1$, which
is equivalent to $\cos^2 x = \tfrac{1}{2}(1 + \cos 2x)$. Thus Formula 33 in the table follows for
$a = 1$ from

$$\int \cos^2 x\,dx = \tfrac{1}{2}\int (1 + \cos 2x)\,dx = \tfrac{1}{2}x + \tfrac{1}{4}\sin 2x.$$

Identities to facilitate the integration of rational functions $P(x)/Q(x)$, where $P(x)$
and $Q(x)$ are polynomials, are derived by partial fraction decomposition, described
in detail in Chapter 11, Section 5 on Laplace transforms.

EXAMPLE
A partial fraction decomposition is used to compute Formula 6 in the table:

$$\int \frac{dx}{(x - a)(x - b)} = \int \left[\frac{1}{(a - b)(x - a)} - \frac{1}{(a - b)(x - b)}\right] dx = \frac{\ln|x - a|}{a - b} - \frac{\ln|x - b|}{a - b} = \frac{1}{a - b}\ln\left|\frac{x - a}{x - b}\right|.$$
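A candidate antiderivative such as Formula 6 can be sanity-checked by differentiating it numerically and comparing with the integrand. The sketch below uses a central difference; the constants $a = 2$, $b = -1$, the sample points, and the function names are assumptions chosen only for illustration.

```python
import math

def derivative(F, x, step=1e-5):
    """Central-difference approximation to F'(x)."""
    return (F(x + step) - F(x - step)) / (2 * step)

a, b = 2.0, -1.0   # sample constants with a != b (assumed for illustration)

def integrand(x):
    return 1.0 / ((x - a) * (x - b))

def antiderivative(x):
    """Formula 6: (1/(a - b)) * ln|(x - a)/(x - b)|."""
    return math.log(abs((x - a) / (x - b))) / (a - b)

# Sample points on each side of the singularities at x = a and x = b.
for x in (-3.0, 0.5, 4.0):
    print(x, derivative(antiderivative, x), integrand(x))
```

Agreement at a few points is not a proof, but a mismatch quickly exposes a sign or coefficient error.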

736 Appendix Finding Indefinite Integrals

II SUBSTITUTION FOR THE INTEGRATION VARIABLE


Awkwardness in an integrand can sometimes be circumvented by a substitution. If
the given integration variable is $x$, a substitution $x = g(u)$, $dx = g'(u)\,du$ may
simplify the integrand enough that the corresponding integral in $u$ can be computed.
Then replace $u$ by its equivalent in terms of $x$ using the inverse relation $u = h(x)$,
where $g(h(x)) = x$.

An awkward occurrence of $\sqrt{x}$ can be circumvented by letting $x = u^2$, $dx = 2u\,du$.
For example,

$$\int \frac{dx}{\sqrt{x} + 1} = \int \frac{2u\,du}{u + 1} = 2\int \left(1 - \frac{1}{u + 1}\right) du.$$

The last step comes from division of $u$ by $u + 1$. Now integrate with respect to $u$
and reintroduce $x$ using the inverse relation $u = \sqrt{x}$ to get

$$\int \frac{dx}{\sqrt{x} + 1} = 2\bigl(u - \ln(u + 1)\bigr) = 2\bigl(\sqrt{x} - \ln(\sqrt{x} + 1)\bigr).$$
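EXAMPLE 2
To integrate $\sqrt{1 - x^2}$, set $x = \sin u$, $dx = \cos u\,du$. Then

$$\int \sqrt{1 - x^2}\,dx = \int \sqrt{1 - \sin^2 u}\,\cos u\,du = \int \cos^2 u\,du = \tfrac{1}{2}u + \tfrac{1}{4}\sin 2u, \quad\text{by Example 2.}$$

Since $\sin 2u = 2\sin u\cos u$, we find $\int \sqrt{1 - x^2}\,dx = \tfrac{1}{2}\arcsin x + \tfrac{1}{2}x\sqrt{1 - x^2}$.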
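The result of a substitution can also be checked on a definite integral: the numerical value of $\int_0^{0.8}\sqrt{1 - x^2}\,dx$ should equal the difference of the antiderivative at the endpoints. This is a minimal sketch; the interval $[0, 0.8]$, the midpoint rule, and the function names are assumptions made for illustration.

```python
import math

def midpoint_integral(f, lo, hi, n=100_000):
    """Midpoint-rule approximation to the integral of f over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(f(lo + (i + 0.5) * step) for i in range(n))

def integrand(x):
    return math.sqrt(1.0 - x * x)

def antiderivative(x):
    """(1/2) arcsin x + (1/2) x sqrt(1 - x^2), from the substitution x = sin u."""
    return 0.5 * math.asin(x) + 0.5 * x * math.sqrt(1.0 - x * x)

lo, hi = 0.0, 0.8   # sample interval inside (-1, 1), assumed for illustration
numeric = midpoint_integral(integrand, lo, hi)
exact = antiderivative(hi) - antiderivative(lo)
print(numeric, exact)
```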

III SUBSTITUTION FOR A PART OF THE INTEGRAND


Here we try to write a given integral in the form $\int f(g(x))\,g'(x)\,dx$ and set $u = g(x)$,
$du = g'(x)\,dx$ in the hope that we can compute the indefinite integral $F(u) = \int f(u)\,du$.
If $F'(u) = f(u)$, the result is

$$\int f(g(x))\,g'(x)\,dx = \int f(u)\,du = F(g(x)).$$

In the integral $\int \cos^2 x \sin x\,dx$ we note the square of a function, namely $g(x) = \cos x$,
multiplied by a function, $\sin x$, which is easily modified to be the derivative
$g'(x) = -\sin x$. By including the constant factor $-1$ in the integrand and
compensating with a "$-$" before the integral, we rewrite the integral as

$$\int \cos^2 x \sin x\,dx = -\int (\cos x)^2(-\sin x)\,dx.$$
Section V Integration by Parts 737
It's now natural to think of substituting $u$ for $g(x) = \cos x$ and $du$ for $g'(x)\,dx = (-\sin x)\,dx$ to get

$$\int \cos^2 x \sin x\,dx = -\int u^2\,du = -\tfrac{1}{3}u^3 = -\tfrac{1}{3}\cos^3 x.$$

IV INTEGRATION BY PARTS
This technique is one of the most important, because of its frequent use in deriving
other general formulas; it is embodied in the formula

$$\int f(x)g'(x)\,dx = f(x)g(x) - \int f'(x)g(x)\,dx,$$

which follows from the product rule for differentiation. To apply the method, you
need to recognize the integrand of a given integral as a product of two functions;
one of them, $f(x)$, you differentiate and the other one, $g'(x)$, you try to identify as a
function you can integrate easily. If one choice for $f(x)$ and $g'(x)$ fails to work you
may want to try another. Formulas 20, 22, and 26 in the table can be computed by
a single application of integration by parts. Formulas 21, 27, and 32 are computed
by repeated integration by parts.

$$\int x\sin x\,dx = (x)(-\cos x) - \int (1)(-\cos x)\,dx = -x\cos x + \sin x.$$
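For a definite integral the integration by parts formula reads $\int_c^d f g'\,dx = f g\,\big|_c^d - \int_c^d f' g\,dx$, and both sides can be evaluated numerically. The sketch below does this for $f(x) = x$, $g'(x) = \sin x$ on $[0, \pi]$; the interval and the helper `simpson` are assumptions introduced only for illustration.

```python
import math

def simpson(f, lo, hi, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    step = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * step)
    return total * step / 3

f = lambda x: x                 # factor to differentiate
df = lambda x: 1.0              # f'(x)
g = lambda x: -math.cos(x)      # an antiderivative of g'(x) = sin x
dg = math.sin                   # g'(x)

lo, hi = 0.0, math.pi
left = simpson(lambda x: f(x) * dg(x), lo, hi)
right = f(hi) * g(hi) - f(lo) * g(lo) - simpson(lambda x: df(x) * g(x), lo, hi)
print(left, right)
```

Both values should agree, and for this choice they can also be compared with $[-x\cos x + \sin x]_0^{\pi} = \pi$.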
V INTEGRAL TABLE
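1. $\int (ax + b)^n\,dx = \dfrac{1}{a(n + 1)}(ax + b)^{n+1}, \quad n \ne -1$

2. $\int \dfrac{dx}{ax + b} = \dfrac{1}{a}\ln|ax + b|$

3. $\int x(ax + b)^n\,dx = \dfrac{1}{a^2(n + 2)}(ax + b)^{n+2} - \dfrac{b}{a^2(n + 1)}(ax + b)^{n+1}, \quad n \ne -1, -2$

4. $\int \dfrac{x\,dx}{ax + b} = \dfrac{x}{a} - \dfrac{b}{a^2}\ln|ax + b|$

5. $\int \dfrac{x\,dx}{(ax + b)^2} = \dfrac{b}{a^2(ax + b)} + \dfrac{1}{a^2}\ln|ax + b|$

6. $\int \dfrac{dx}{(x - a)(x - b)} = \dfrac{1}{a - b}\ln\left|\dfrac{x - a}{x - b}\right|$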

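7. $\int \dfrac{dx}{(ax + b)(cx + d)} = \dfrac{1}{ad - bc}\ln\left|\dfrac{ax + b}{cx + d}\right|, \quad ad - bc \ne 0$

8. $\int \dfrac{x\,dx}{(ax + b)^3} = \dfrac{b}{2a^2(ax + b)^2} - \dfrac{1}{a^2(ax + b)}$

9. $\int x\sqrt{ax + b}\,dx = \dfrac{2(3ax - 2b)}{15a^2}(ax + b)^{3/2}$

10. $\int x^2\sqrt{ax + b}\,dx = \dfrac{2(15a^2x^2 - 12abx + 8b^2)(ax + b)^{3/2}}{105a^3}$

11. $\int \dfrac{x\,dx}{\sqrt{ax + b}} = \dfrac{2(ax - 2b)}{3a^2}\sqrt{ax + b}$

12. $\int \dfrac{dx}{a^2 + x^2} = \dfrac{1}{a}\arctan\dfrac{x}{a}$

13. $\int \dfrac{dx}{a^2 - x^2} = \dfrac{1}{2a}\ln\left|\dfrac{a + x}{a - x}\right|$

14. $\int \dfrac{dx}{x(ax^2 + b)} = \dfrac{1}{2b}\ln\left|\dfrac{x^2}{ax^2 + b}\right|$

15. $\int \sqrt{p^2 - x^2}\,dx = \tfrac{1}{2}x\sqrt{p^2 - x^2} + \tfrac{1}{2}p^2\arcsin(x/p)$

16. $\int \sqrt{x^2 \pm p^2}\,dx = \tfrac{1}{2}x\sqrt{x^2 \pm p^2} \pm \tfrac{1}{2}p^2\ln\bigl(x + \sqrt{x^2 \pm p^2}\bigr)$

17. $\int \dfrac{dx}{\sqrt{p^2 - x^2}} = \arcsin(x/p)$

18. $\int \dfrac{dx}{\sqrt{x^2 \pm p^2}} = \ln\bigl(x + \sqrt{x^2 \pm p^2}\bigr)$

19. $\int e^{ax}\,dx = \dfrac{1}{a}e^{ax}$

20. $\int xe^{ax}\,dx = \dfrac{1}{a^2}(ax - 1)e^{ax}$

21. $\int x^2e^{ax}\,dx = \dfrac{1}{a^3}(a^2x^2 - 2ax + 2)e^{ax}$

22. $\int \ln ax\,dx = x\ln ax - x$

23. $\int x\ln ax\,dx = \tfrac{1}{2}x^2\ln ax - \tfrac{1}{4}x^2$

24. $\int x^2\ln ax\,dx = \tfrac{1}{3}x^3\ln ax - \tfrac{1}{9}x^3$

25. $\int \sin ax\,dx = -\dfrac{1}{a}\cos ax$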

Section V Integral Table 739

26. $\int x\sin ax\,dx = \dfrac{1}{a^2}\sin ax - \dfrac{1}{a}x\cos ax$

27. $\int x^2\sin ax\,dx = \dfrac{2}{a^2}x\sin ax + \dfrac{2}{a^3}\cos ax - \dfrac{1}{a}x^2\cos ax$

28. $\int \sin^2 ax\,dx = \dfrac{x}{2} - \dfrac{\sin 2ax}{4a}$

29. $\int \sin^3 ax\,dx = -\dfrac{1}{a}\cos ax + \dfrac{\cos^3 ax}{3a}$

30. $\int \cos ax\,dx = \dfrac{1}{a}\sin ax$

31. $\int x\cos ax\,dx = \dfrac{1}{a^2}\cos ax + \dfrac{1}{a}x\sin ax$

32. $\int x^2\cos ax\,dx = \dfrac{2}{a^2}x\cos ax - \dfrac{2}{a^3}\sin ax + \dfrac{1}{a}x^2\sin ax$

33. $\int \cos^2 ax\,dx = \dfrac{x}{2} + \dfrac{\sin 2ax}{4a}$

34. $\int \cos^3 ax\,dx = \dfrac{1}{a}\sin ax - \dfrac{\sin^3 ax}{3a}$

35. $\int \sin ax\sin bx\,dx = \dfrac{\sin(a - b)x}{2(a - b)} - \dfrac{\sin(a + b)x}{2(a + b)}, \quad |a| \ne |b|$

36. $\int \cos ax\cos bx\,dx = \dfrac{\sin(a - b)x}{2(a - b)} + \dfrac{\sin(a + b)x}{2(a + b)}, \quad |a| \ne |b|$

37. $\int \sin ax\cos bx\,dx = -\dfrac{\cos(a - b)x}{2(a - b)} - \dfrac{\cos(a + b)x}{2(a + b)}, \quad |a| \ne |b|$

38. $\int \tan ax\,dx = -\dfrac{1}{a}\ln\cos ax$

39. $\int \tan^2 ax\,dx = \dfrac{1}{a}\tan ax - x$

40. $\int \tan^3 ax\,dx = \dfrac{1}{2a}\tan^2 ax + \dfrac{1}{a}\ln\cos ax$

41. $\int \sec ax\,dx = \dfrac{1}{a}\ln\tan(ax/2 + \pi/4)$

42. $\int \sec^2 ax\,dx = \dfrac{1}{a}\tan ax$

43. $\int \sec^3 ax\,dx = \dfrac{1}{2a}\tan ax\sec ax + \dfrac{1}{2a}\ln\tan(ax/2 + \pi/4)$

44. $\int \tan ax\sec ax\,dx = \dfrac{1}{a}\sec ax$


45. $\int \csc ax\,dx = \dfrac{1}{a}\ln|\tan(ax/2)|$

46. $\int \cot ax\,dx = \dfrac{1}{a}\ln|\sin ax|$

47. $\int e^{ax}\sin bx\,dx = \dfrac{e^{ax}}{a^2 + b^2}(a\sin bx - b\cos bx)$

48. $\int e^{ax}\cos bx\,dx = \dfrac{e^{ax}}{a^2 + b^2}(a\cos bx + b\sin bx)$
ANSWERS TO ODD-NUMBERED EXERCISES

CHAPTER 1: VECTORS
Section 1: Coordinate Vectors
Exercise Set 1 (pgs. 7-8)
1. (a) (-1, 6). (b) (0, 14). (c) (4, -6).
3. (a) (9, -3, 0). (b) (18, 9, -13). (c) (1, 1, -4).
5. 5i - 8j. 7. (1 - 2c)i + (4 - d)j.
9. (a, b) = (3, 2) is the only solution.
11. No a and b satisfy ax + by = (3, 0, 0). The only possibility is c = 5, a = b = 1.
13. (a) -x + 2y - z = 5i. (b) 6x - 2y + z = 5j. (c) -4x + 3y + z = 5k.
15. Let $x = (x_1, \ldots, x_n)$ be a vector in $\mathbb{R}^n$ and let $r$ and $s$ be real numbers. Then apply the definitions.
17. Apply the definitions. 19. Apply the definitions.
21. By inspection, $(-2, 3) = -2e_1 + 3e_2$. 23. $(2, -7) = -\tfrac{5}{2}(1, 1) + \tfrac{9}{2}(1, -1)$.
25. Let x = 2i, y = i - 3j and z = 3i + 2j - 2k.
(a) $i = \tfrac{1}{2}x$. (b) $j = \tfrac{1}{6}x - \tfrac{1}{3}y$. (c) $k = \tfrac{11}{12}x - \tfrac{1}{3}y - \tfrac{1}{2}z$.
27. 700 ink, 90000 paper, 5500 binding. 29. $\tfrac{1}{4}(x(2) + x(8) + x(14) + x(20))$.

Section 2: Geometric Vectors


Exercise Set 2A-E (pgs. 16-17)
1. (-1/2, 3/2). 3. (1,0, 1).


[Figures: Exercise 1, Exercise 3, Exercise 5]

5. $2\sqrt{2}$. 7. 6.
742 Answers to Odd-Numbered Exercises

9. x + y = (1, 1) + (1, -1) = (2, 0),
x - y = (1, 1) - (1, -1) = (0, 2),
x + 2y = (1, 1) + 2(1, -1) = (3, -1).

[Figure: Exercise 7]

11. x + y = (1, 1, 1) + (1, 1, -1) = (2, 2, 0),
x - y = (1, 1, 1) - (1, 1, -1) = (0, 0, 2),
x + 2y = (1, 1, 1) + 2(1, 1, -1) = (3, 3, -1).

[Figures: Exercise 9, Exercise 11]

13, 15, 17, 19, 21, 23. [Figures]

25. Hint: Rearrange terms. 27. (0, 0).
29. Hint: Arrange vector arrows tail-to-tip.
31. (a) Hint: Consider ta + (1 - t)b - a and ta + (1 - t)b - b.
(b) Hint: Use ideas from part (a).
(c) Hint: Use ideas from part (b).
33. About 0.321i + 0.883j + 0.342k.
35. About (-1557, -2928, 2237). 37. About 2420 miles.

Section 3: Lines and Planes


Exercise Set 3AB (pgs. 23-24)
1. x = t(2, 1) + (-1, 2).
3. x = t(1, 0, 1) + (1, 2, 2).

[Figures: Exercise 1, Exercise 3]

5. (a) [Figure]
(b) p = (-5/3, 8/3).
(c) Hint: Show a contradiction.
7. Lines are parallel and equal. 9. Lines are parallel but not equal.
11. Linearly independent. 13. Linearly independent.
15. $x = t_1(1, 1, 0) + t_2(0, 1, 1)$. 17. $x = t_1(1, -1, 0) + t_2(1, 0, -1) + (1, 0, 0)$.
[Figures: Exercise 15, Exercise 17]

19. (a) (2, -2, 0). (b) $x = t_1(1, -1, 2) + t_2(2, -2, 0) + (1, 2, 1)$.


21. Hint: Find the midpoints.
23. (a) Use either coordinates or the laws. (b) Hint: Divide by the scalar. (c) Hint: Suppose first that the conditions
hold.
25. Pick two points in S + T.

Section 4: Dot Products


Exercise Set 4A-C (pgs. 31-32)

1. 10 3. -5
5. (a) 1. (b) 1. (c) rr /4.
7. (a) 8. (b) 3. (c) 0.4759 rad. ~ 27 .3° .
9. Angle: 0.6147 radians (~35.2°. Distance~ I= r8 = 4000(0.6147) ~ 2459 miles.
11. (a) Positivity Hint: Sum of squares is nonegative. (b) Symmetry Routine check. (c) Additivity Routine check.
(d) Homogeneity Routine check.
13. (a) .JJ· (b) (2/3, 2/3, 2/3); (1/3, -5/3, 4/3).
15. (a) - ~- (b) (-9/14, -27/14, 18/14); (37/14, -15/14, -4/14).
17. IA - Bl= .JTIIT, IA - Cl= .Jill, IA - Cl= 5, AB acute, AC right, BC acute.
19. (a) Routine calculation. (b) Routine calculation. (c) $1/\sqrt{6}$.
21. Routine calculation. 23. $3120/\sqrt{11} \approx 941$ watts.
25. The total work done is the same either way: $3750(\sqrt{3} + 1)/\sqrt{2} \approx 7244$ foot-pounds.
27. Hint: Consider two cases. 29. Follow the outlined steps.

Section 5: Euclidean Geometry


Exercise Set SAB (pgs. 36-37)
1. y = 3.
3. x + 2y + 4z = -1. (Note: The vector N in the diagram below is parallel to the given perpendicular but is only half as
long. The right triangle with N as one leg has its other leg in the plane and its hypotenuse parallel with the z-axis.)
5. (x, y, z) = (2/5 + 1, -2/5, 4(2/5) - 2) = (7 /5, -2/5, -2/5).
7. Every point on the line is in the plane.
746 Answers to Odd-Numbered Exercises

3 (2,3)

Exercise 1 Exercise 3

9. (x, y, z) = (-1/3 + 1, 1, -1/3 + 1) = (2/3, 1, 2/3).


11. Angle is 90° with cosine zero. 13. Angle is 90° with cosine zero.
15. [Figure: sketch of y + 2z = 1] 17. [Figure: sketch of x + y - z = 1]

19. 3x - 2y + 5z = 9.
21. (a) x - 2y + z = 0. (b) x = 1 or z = 1 and many others. The three points lie on a line parallel to the y-axis so don't
determine a unique plane.
23. 1/√5. 25. 1/√3. (1, 0, -1) is below P.
27. 3/√14. (1, 0, -1) is below P.
To locate the origin relative to P, set x = y = 0 in the equation for P to find out where the plane crosses the z-axis.
This gives 3z = 1, or z = 1/3. Thus, (0, 0, 1/3) is on the plane and (0, 0, 0) is one-third unit below it; i.e., the origin is
below P. It follows that (1, 0, -1) and the origin are on the same side of P.
29. |d|.

Section 6: The Cross Product


Exercise Set 6 (pgs. 42-44)
1. -e3. 3. (1, -2, 1).

[Figures: Exercise 1, Exercise 3]

5. √2. 7. 3√3/2.
9. 2x - 5y + 2z = 5. 11. Routine check.
13. (a) a = (-2, 1, -2), b = (2, -1, -2), c = (2, 3, -2) are shown on the right with their tails at the apex.
(b) u = a x b = (-4, -8, 0), v = b x c = (8, 0, 8), w = c x a = (-4, 8, 8).
(c) cos α = 0, cos β = 1/√2, cos γ = 2/3.
15. (a) Routine computation. (b) Hint: Use a little trigonometry. (c) Hint: More trigonometry.
17. Routine computation. 19. 3.
21. V(B) = 17.

[Figures: Exercise 19, Exercise 21]

23. Follow the steps. 25. (a) Unequal. In (b), (c), (d) pairs are equal.
27. Use the hint in the exercise.

Chapter 1 Review (pgs. 44-45)

1. C. 3. 3k.
5. 2e1 - e2 + 3e3 + 2e4. 7. (4, 1, -2) = -2(1, 2, 3) + (6, 5, 4).

9.

11. s(-1, 1) + (1, 2), t(5, 7) + (4, 5), intersecting at (3/2, 3/2).
13. f(t) = t(6/√38, -1/√38, -1/√38) + (-5, 3, 4).
15. (a) Hint: Show d(t) is always a scalar multiple of a fixed vector. (b) If p1(t) = tv1 + o1 and p2(t) = tv2 + o2, the
collision is at the time when these are equal if that time is positive; otherwise there is no collision.
17. K and M are parallel and are the same line. L and M are parallel but are not the same line.
19. s(3, -2, 0) + t(0, 2, -5) + (3, 0, 0). 21. s(1, 0, 0) + t(0, 1, 0) + (1, 2, 3).
23. x = t(3, 1, 2). (The values t = 0, 1, -2 give the three points.)
25. x = s(-2, 0, 2) + t(-3, 0, 1) + (1, 2, 3).
27. (a) Angle between a and b is less than the angle between a and c. (b) (0, -1, 1). (c) √2/2. (d) √11/2.
29. 18 units.
31. (a) 6/√13 units. (b) x = t(3, -2) + (3, 0). (c) ±(8/√13, 12/√13).
33. (a) x = t(l, -2). (b) n = (1/0, -2/0), c = -2/0. (c) 0; the origin and (3, 5) are not on the same side
of L.
35. x = t(0, 3, -3) + (8/3, 0, 1/3); (8/3, 2/3, -1/3).

CHAPTER 2: EQUATIONS AND MATRICES


Section 1: Systems of Linear Equations
Exercise Set 1A (pgs. 51-52)
1. Two lines intersecting in the point (3/2, -1/2). 3. Three planes intersecting in (0, 0, 0).
5. No solutions. Each pair of planes intersects in a different line parallel to (1, -1, 0).
7. Intersect at (4, -2, 7). 9. Intersect at (7, 11).
11. (a) (a, b, c) = (-5/6, 19/6, 1) and f(x) = -(5/6)x^2 + (19/6)x + 1.
(b) y = I 1/2. The path is a straight line, not a parabola.
13. 8 cubic yards or 32 tons of sand, 2 cubic yards or 2 tons of cinders.
15. b = 2a1 - ~a2. 17. ja1 - ½a3.

Exercise Set 1B (p. 58)


I. Voltages at junctions 11 = A, ]z, h , l4 = B, ls, 16 in the diagram

are JO,
44 2
~l 1 , ~1~ , 4, :fi°, 2;}/, current of ¥i ~ 2.32 amps flows in at A and out at B.
1
3. With junctions labelled 11, ... , ls as shown,

let Vi be the voltage at l; . With fixed voltages v1 = 1 and v4 = 0, other voltages are = ti, = f4,
u2 v3
v5 = f.i, L'6 = ;, v1 = ~. vs= f4· Current of¥~ 1.71 amps flows in at 11 and out at ]4.
5. Magnitudes proportional to 2√5, √2, √10.
7. No. Any resultant (a, b, c) of forces acting in the given directions is a linear combination with positive coefficients, and
must have a > b > c > 0.
9. P2 = ~- - - - 11. Pl = P2 = P3 = P4 = 1/2.
13. P1 = 16/29, P2 = 15/29, P3 = 17/29, P4 = 23/29. 15. r = (-a + 3, a - 2, -a + 4, a), for any a.
17. A= !h, B = th, C = ½h,

Section 2: Matrix Methods


Exercise Set 2A-C (pgs. 69-70)

1. e=n G) G} = X = -1/1. y = -5/1.

J. oi -n m-m
(x, y, z) = (-t, t + 1, t) = t(-1, 1, 1) + (0, 1, 0).
x+z=0
5• X
3x
+2y
+y=0
= }' (X) = (-1/5)
y 3/5
· 7, y =1 '
x+y=0
9. X = 24/15, y = 13/15, Z = -7/15.
11. (a) (6 ~ =n• (b) X = t(l , 1, 1). (C) X = t(l , 1, 1) + (3/5, -1/5, 0).

13. (a) 01
(0
01 6.), (b) X = 0, (C) X = (0, 1, 0).
0 0
15. i = -3(1, 2) + 2(2, 3), j = 2(1, 2) - (2, 3). 17. (1, 2, 1, 0) + 2(2, -1, 0, 1).
19. (0, 0, 0, -1, 2). 21. V = -a+ 2b.
23. v = -~a - ~b - ~c. 25. t(O, 4, 8) + (-4, -2, 0).
27. Show that if the lines are non-parallel, equations have a unique solution (whether lines intersect or not). Show that if
the lines are parallel, they lie in a plane, and any line that is in the plane and perpendicular to both given lines will do.
29. Hint: A(w - v) =Aw-Av. 31. Hint: A(t1x1 +t2x2) = t1Ax1 +t2Ax2 = t1b1 +t2b2.

Exercise Set 2D (p. 73)

1. r(1, 0, 0, -3) + s(0, 1, 0, 2) + t(0, 0, 1, -1) + (0, 0, 0, 3).

5. (a) {a, b, d} and {a, c, d} independent, {b, c, d} dependent.


3.
(~~ ~ 0001) .

7. Hint: When is one vector a scalar multiple of another?


9. t(1, -3, 1). 11. Solutions are s(0, 1, 0, 0) + t(-1, 0, -1, 1).
13. Hint: First show that every row and column contains a leading entry.

Section 3: Matrix Algebra

Exercise Set 3A-D (pgs. 80-81)

I. =; _;). 3. ~ ;) .

5. : :~· . 7. ; I!)·
9. 2 4 .
2 4
11. -3
-5
-1
6
-4)·
-9

:::(
8A r!r!); 8 h,s 3 colomas, A h,s 2 rows
17
, (- I I

9
~: - - :) .
2
-3 3 3 -
19. DC is not defined; D has 2 columns, C has 3 rows. 21. X and Y are 2-by-3.
23. X is 2-by-2, Y is 2-by-3. 25. X and Y are 2-by-3.

27. Ci= n} Cj= (:} Ck= U) 29, ~)

31. (38). -I
-]
33.
-]
-I
_:).
35. AO is defined when O is n-by-p for some p and is then m-by-p. OA is defined when O is p-by-m for some p and is
then p-by-n.
37. (a) Definition of matrix product.
(b) If c has entries c1, ..., cn and r has entries r1, ..., rn, then M = cr has entries mij = ci rj, for i, j = 1, ..., n.

39. A2 =( =! ;). A
3
= (-~; g). p(A) = (-~~ n-
41. A2 =(6 ~ ~).A =(-6
0 0 9
3

0
~
0
~).
27

p(A) = (~
0
~
0
~) -
12

43. UV= 0, VU= (1~ =~~).


45. (a) Use distributive laws for matrix multiplication.
(b) Use distributive laws to show (A + B)^2 = A^2 + AB + BA + B^2. This equals A^2 + 2AB + B^2 if and only if
AB = BA, so any matrices with AB ≠ BA, such as those in Exercise 43, are examples.
2 3
47. X = (~ ~). p(X) = 0.

49. (a) For the given numbers, p(x) = x 2 + 2 and A= G =n- A 2 = (-~ -~) so p(A) = 0.
.
(b) Start with A2 = (aca++ c
2
dbe abb + bd)
c + 2
d and go on from there.
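
The point of answer 45(b) in this set, that (A + B)^2 expands to A^2 + AB + BA + B^2 and matches A^2 + 2AB + B^2 only when AB = BA, can be illustrated with a small pure-Python sketch. The matrices A and B below are hypothetical examples chosen so that AB ≠ BA; they are not the matrices of Exercise 43, and the helper names are assumptions made for this illustration.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matadd(*Ms):
    """Entrywise sum of any number of matrices of the same shape."""
    return [[sum(entries) for entries in zip(*rows)] for rows in zip(*Ms)]

# Hypothetical non-commuting pair (not taken from the exercises).
A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]

AB, BA = matmul(A, B), matmul(B, A)
square = matmul(matadd(A, B), matadd(A, B))               # (A + B)^2
expanded = matadd(matmul(A, A), AB, BA, matmul(B, B))     # A^2 + AB + BA + B^2
binomial = matadd(matmul(A, A), AB, AB, matmul(B, B))     # A^2 + 2AB + B^2

print(AB != BA, square == expanded, square == binomial)
```

For this pair the distributive expansion agrees with (A + B)^2 while the binomial-style formula does not.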

Section 4: Inverse Matrices


Exercise Set 4A-C (pgs. 86-88)

3. ( 16/3 -20/3)·
1. (_: -~) - -20/3 40/3

5 -! _
· A -
( 4/11
-3/11
1/11) ( 5/ll)
2/11 ,X= -1/11 . 7. (-1! ~ -~).
2 0 l
9. No inverse; row reduction gives a row of zeros.

l 2)-I ('-3/2 1/2 ) ( 1/2 11/2 6)


ll. ( 5 6 = 5/4 -1/4 'X = -1/4 -17/4 -5 .

13. Hint: Simplify (AB)(B^(-1)A^(-1)) using associativity.


15. Hint: Evaluate (I + A + A^2)(I - A).
-1
-3/4)
17.
(~ ;} 0
1/2
0

21. Apply definition.


"· (f
1/2 0 -1/8
0
0 0
- 1/4 .

23. Apply definition.


1/4

25. Use the results of Exercises 21 and 22. 27. Use a trigonometric identity.

29. Note that ½+ ½+ ¼= } + t = l.


31. Hint: The rows of Q^t are the columns of Q and vice versa.
33. Hints: (a) (A + A^t)^t = A^t + A. (b) What is (A + A^t) + (A - A^t)? (c) What is A^t(A^(-1))^t?
35. Hint: p(xk) is equal to the dot product of (a0, ..., an) and (1, xk, xk^2, ..., xk^n).

Section 5: Determinants
Exercise Set 5A-E (pgs. 98-99)
1. det A = 24, det(2A) = 192. 3. det(2A) = 2^n det A.
5. 32. 7. det A = 7, det B = 2, det AB = det BA = 14.

13. (-!~ ~ -~)-


2 0 l
15. ( 7/~
-3/4
-I I/~
5/2
~)-
-I

17. det A = 0; not invertible. 19. (~~ I~~ ~0


0
I
0
=~~:).
-1/4
1/4
21. The determinant is (1 - t)(2 - t)(3 - t). The matrix fails to have an inverse when t = 1, 2, or 3.
23. The determinant is -4(t - 4). The matrix fails to have an inverse when t = 4.
25. Expand by the first row and compare with the definition of the cross product.
27. Hint: If A = I, what is the result of expansion by the first row?
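
The scaling rule in answer 3 of this set, det(2A) = 2^n det A for an n-by-n matrix, can be spot-checked numerically. The sketch below uses cofactor expansion along the first row. The 3-by-3 matrix is a hypothetical example, picked only because its determinant happens to be 24 as in answer 1; it is not claimed to be the matrix from the exercise.

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

# Hypothetical upper-triangular example with det = 2 * 3 * 4 = 24.
A = [[2, 1, 0],
     [0, 3, 1],
     [0, 0, 4]]
two_A = [[2 * entry for entry in row] for row in A]

n = len(A)
print(det(A), det(two_A), 2 ** n * det(A))
```

Each of the n rows of 2A carries a factor of 2, which is where the factor 2^n comes from.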

Chapter 2 Review (pgs. 99-101)

1. Dimensions of A and B don't match. 3. (~ -~ -~)-

5. Dimensions of A and EB don't match. 7. D^(-1) does not exist.

9. C - A must be invertible. If it is, then X = 2(C - A)^(-1)B.
11. X = 3I is always a solution, unique if A is invertible.
13. Sometimes true (e.g. if A = /), sometimes false, e.g. if A = G ~). B = (~
15. Sometimes true (e.g. if A = /), sometimes false, e.g. if A = (~ ~). B = (~
17. Always true.

21. (a) t (=:} (=:) W·


(bl 1 + (c) no soluHons.

There are solutions for b = (b1, b2, b3) when b3 = 0.


23. No solution unless b = -2; if b = -2 then x = 1, y = 0, z = 2 is one of many solutions.
25. v =-a+ 2b. 27. Impossible.
29. ( 1/13  -5/13 )
    ( 2/13   3/13 )
31. Not invertible. 33. Not invertible.
35. ( -1/5   6/5  -4/5  -8/5  16/5 )
    (  3/5  -3/5   2/5   4/5  -8/5 )
    ( -1/5   1/5   1/5   2/5  -4/5 )
    ( -1/5   1/5   1/5  -3/5   6/5 )
    (  1/5  -1/5  -1/5   3/5  -1/5 )
37. (a) Compute the products.
(b) In DA each row of A is multiplied by the corresponding diagonal entry of D, in AD each column is multiplied.
(c) B is 3-by-3 in all cases. If a, b, c are all different, B is diagonal. If a = b = c, B is arbitrary. If a = b ≠ c then
b31 = b32 = b13 = b23 = 0.
39. a = 2, b = -3, c = 4. 41. t(-1, 0, -1, 1).
43. 9. 45. -20.
CHAPTER 3: VECTOR SPACES & LINEARITY
Section 1: Linear Functions on R^n
Exercise Set 1A-C (pgs. 110-112)

l. G:), not one-to-one. 3. ( ~ ; } on< ,~one.

5. /(ei) = (\
2
), f (ei) = (3~ 2
)-

7. /(e1) = G). /(e2) = (-n· /(e3) = (-D-


9. Hint: Calculate Rθ x · Rθ x.

11. G=D, domain == range == R


2
.

13. ( ~ ~ -:) , domain == range = R 3 •


-2 0 5
15. (a) Hint: Image of (x, y) is (y, x ).
(b) ( -~ -~)

(c) (- ~ _ ~) = -1, 180° rotation, (same as reflection through the origin).

17. ::: :::-~:w(e~ M~ltl~)at:o:~:~ 1:av(e~te f1)s ~xed and rotates e2 and CJ 90° in the yz-plane.
1

0 0 1 0 0 1
The two products represent rotations of 90° in opposite directions about the z-axis.
19. (  9/49  18/49   6/49 )
    ( 18/49  36/49  12/49 )
    (  6/49  12/49   4/49 )
21. Line r(1, 3). 23. Line t(1, -1).
25. Plane s(1, -2, 0) + t(-2, -1, -5). 27. Both make sense.
29. Only g o f makes sense. 31. a = (f(e1), ..., f(en)).

[Figure: domain points and image points (a = 1)]

33. (a) ( 1  a )
        ( 0  1 )
(b) fa(1, 0) = (1, 0), fa(0, 1) = (a, 1), fa(-1, 0) = (-1, 0), fa(0, -1) = (-a, -1).
(c) Hint: If a > 0 and y > 0 then x + ay > x. (d) The x-axis. (e) Horizontal lines. (f) f_(a+b).

Section 2: Vector Spaces


Exercise Set 2AB (pgs. 118-119)
1. Subspace. 3. Subspace.
5. Not a subspace. 7. Subspace.
9. Not a subspace. 11. (1, 0, 1), (0, 1, 2) (not the only correct answer).
13. Hint: Use additivity and homogeneity of the dot product.
15. Sequences that are 0 after the nth term; sequences that are 0 after some finite number of terms; the sequence in which
every term is 1.
17. Subspace. 19. Subspace.
21. Subspace. 23. Subspace.
25. Hint: (f1 + f2)(a) = f1(a) + f2(a), (r f1)(a) = r f1(a).
27. Partial answer: T is in the span of S since (3, 5, 4) = (2, 3, 1) + (1, 2, 3) and (1, 1, -2) = (2, 3, 1) - (1, 2, 3).
29. Hint: Look at Theorems 2.1 and 2.4.
31. Hint: If a is in A but not in B, and b is in B but not in A, show that a + b is in neither.
33. (a) f(x) = √(x - a) is one example, |x - (a + b)/2| is another.
(b) f1(x) = ∫_a^x f(t) dt, where f is as in part (a).
(c) Take fk(x) = (x - a)^(k+1/2), or more generally, fk(x) = ∫_a^x f_(k-1)(t) dt.
35. Hint: f((x, y, z)) = x + 2y + z. 37. Always true.
39. True for S1 and S2 the x- and y-axes in R^3, false for the x- and y-axes in R^2.
41. True. If x is in S, then 2x is not.

Section 3: Linear Functions


Exercise Set 3A-C (pgs. 125-126)

1. G~)- 3. (-~ ~) -
5. One-to-one. f^(-1)(x1, x2, x3, ...) = ½(x1, x2, x3, ...). Domain of f^(-1) is all of R^∞.
7. Not one-to-one.
9. (g o f)(x1, x2, ...) = (f o g)(x1, x2, ...) = (2x1, 4x2, 6x3, ...).
11. (g o p)(x1, x2, x3, ...) = (0, 2x1, 3x2, 4x3, ...), (p o g)(x1, x2, x3, ...) = (0, x1, 2x2, 3x3, ...).
13. Use the right distributive law and the scalar commutativity laws for matrix multiplication (Theorem 3.2 in Chapter 2).
15. Du(x) = 6x^2 - 4, xu(x) = 2x^4 - 4x^2, D(xu(x)) = 8x^3 - 8x, xDu(x) = x(6x^2 - 4) = 6x^3 - 4x.
17. (a) (Dx - xD)u = (xu' + u) - xu' = u.
(c) No. (D^2 - x^2)u = D^2 u - x^2 u = u'' - x^2 u, but (D + x)(D - x)u = (D + x)(u' - xu) =
D(u' - xu) + x(u' - xu) = (u'' - xu' - u) + xu' - x^2 u = u'' - (1 + x^2)u.
19. Hint: Use some basic integration formulas.
21. π. 23. (0, 2).
25. 2x^2 + 3x. 27. No. f'(0) is not defined.
29. No. Every function in the image of L has value 0 at 0.
31. No inverse.
2/3
-7/9)
33. /- 1(y) has matrix 1/~ -2/9 . Domain of 1- 1 is R2 •
( 1/3
35. No inverse.
Section 4: Image and Null-Space
Exercise Set 4A-C (pgs. 130-131)
1. Range: R^3. Image: plane through 0 spanned by (1, 0, 1) and (0, 1, 1). Linear, null-space {0}.
3. Range: R^3. Image: plane u(1, 0, 2) + v(0, 1, 1) + (0, 0, 1). Not linear.
5. Range and image: R^2. Linear, null-space t(1, 2, -3).
7. Image: C(-∞, ∞). Not linear.
9. Range: C^(1)(-∞, ∞). Image: subspace of f with f(0) = 0. Linear, null-space {0}.
11. Image: R^2; null-space: {0}.
13. Image: the plane spanned by (1, 0, 1) and (4, 1, 1). Null-space: the line t(1, 0, -2).
15. (a) t(5, 2). (c) (1, -1); t(5, 2) + (1, -1).
17. Hint: Show that f(x) = (f(e1), ..., f(en)) · x.
19. (a) Use definition of linearity and properties of D.
(b) (D - 1)(x + 1) = (x + 1)' - (x + 1) = 1 - (x + 1) = -x. All solutions: y(x) = ce^x + x + 1.
21. (b) If Gu is the zero function, taking derivatives gives tu(t) = 0 for all t, so u(t) = 0 for t ≠ 0. Since u is continuous,
it is the zero function, so G is one-to-one.
(c) Polynomials of the form x^2 p(x) with p(x) in Pn.
(d) Polynomials of the form x^2 p(x) with p(x) in P.
(e) The domain of G^(-1) is the same as the image of G, and consists of continuously differentiable functions g with
g(0) = g'(0) = 0. For such a g, taking f(t) = g'(t)/t for t ≠ 0 and f(0) = lim_(t→0) g'(t)/t gives a continuous
f = G^(-1)(g).
(f) The constant function 1.
23. (a) The reflection of (x, y) is (-x, y) and (Ru)(x) = u(-x).
(b) (R 2 )(u(x)) = u(-(-x)) = u(x).
(c) Hint: If u(x) = u(-x) then Ru= u.
(d) Image the odd functions, null-space the even functions.
(e) F} = F,,, F; =
F0 •

Section 5: Coordinates and Dimension


Exercise Set 5AB (pgs. 137-138)
1. Spanning: (x, y) = ½(y - x)(-1, 1) + ½(x + y)(1, 1). Independence: neither vector is a scalar multiple of the other.
3. Partial answer. Spanning: (x, y, z) = (x - y)(1, 0, 0) + (y - z)(1, 1, 0) + z(1, 1, 1).
5. 1. 7. 2.
9. If ae^x + be^(2x) + ce^(3x) = 0 for all x, we can, for instance, set x = 0, x = ln 2 and x = ln 3 to get
( 1  1   1 ) (a)
( 2  4   8 ) (b) = 0.
( 3  9  27 ) (c)
Row reduction shows that the only solution is a = b = c = 0, so the given functions are
linearly independent. Another proof is to multiply the equation by e^(-x) and take the limit as x goes to -∞ to show
a = 0, and then similarly show b and then c are 0.
11. If a cos x + b sin x = 0 for all x, putting x = 0 and x = π/2 gives a = 0 and b = 0, so sin x and cos x are linearly
independent.
13. (b) For e^x, (1, 1); for e^(-x), (1, -1). 15. (b) (1, -1, 1).
17. For cos^2 x, (1/2, 0, 0, 1/2, 0); for sin^2 x, (1/2, 0, 0, -1/2, 0).
19. Partial answer: A product f(x)g(x) is a linear combination of terms of the form cos ax cos bx, cos ax sin bx, and
sin ax sin bx, with a ≤ p and b ≤ q. The trigonometric identity cos ax cos bx = ½cos(a - b)x + ½cos(a + b)x shows
directly that if a > b then cos ax cos bx is in T_(p+q). What about other terms? What if a ≤ b?
21. {e^x, e^(-x)}. 23. {cos x, sin x, sin 2x}.
25. Image: {(2, 1), (1, 2)}. Null-space: {0}.

27. Image: {(2, 0, 1), (4, 1, 3)}; null-space: {(1, -1, 1)}.
29. 1.
31. Let p1(x), ..., pk(x) have different degrees d1, ..., dk, and suppose them ordered so that d1 < d2 < ... < dk. Suppose
cm is the last non-zero coefficient in a linear combination q(x) = c1 p1(x) + ... + ck pk(x). Then the coefficient of x^(dm)
is non-zero in cm pm(x) and zero in all other terms, so q(x) is not the zero polynomial.
33. One possibility is p(x) = x.

: : : ~: ~::::::o::(~ ~2, I, 1) - (2, I , -2, 1) :.((~ (


3
~!)
41. (a) S(rp(x)) = S(rp)(x) = (rp)(x + 1) = rp(x + 1) = rSp(x), and similarly for additivity, so S is linear.
(b) For p(x) = a0 + a1x + a2x^2, p(x + 1) = (a0 + a1 + a2) + (a1 + 2a2)x + a2x^2, (Dp)(x) = a1 + 2a2x, and
½(D^2 p)(x) = a2, so p(x + 1) = p(x) + (Dp)(x) + ½(D^2 p)(x).
(c) By Taylor's formula p(x + h) = p(x) + p'(x)h + p''(x)h^2/2 + ... + (1/n!)p^(n)(x)h^n for all polynomials of degree ≤ n,
since all their derivatives of order > n are zero. Putting h = 1 shows that S and I + D + ½D^2 + ... + (1/n!)D^n have
the same effect on polynomials in Pn.
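
The identity in answer 41, that the shift Sp(x) = p(x + 1) agrees with I + D + D^2/2! + ... + D^n/n! on polynomials of degree at most n, can be checked on coefficient lists. The sketch below is an illustration under stated assumptions: a polynomial is stored as [a0, a1, ..., an], and the function names are invented for this example.

```python
from fractions import Fraction
from math import comb, factorial

def shift(coeffs):
    """Coefficients of p(x + 1), expanding each a_k (x + 1)^k binomially."""
    out = [Fraction(0)] * len(coeffs)
    for k, a in enumerate(coeffs):
        for j in range(k + 1):
            out[j] += a * comb(k, j)
    return out

def derivative(coeffs):
    """Coefficients of p'(x); the zero polynomial is returned as [0]."""
    return [k * a for k, a in enumerate(coeffs)][1:] or [Fraction(0)]

def taylor_shift(coeffs):
    """Apply I + D + D^2/2! + ... + D^n/n! to p, where n = len(coeffs) - 1."""
    out = [Fraction(0)] * len(coeffs)
    term = [Fraction(a) for a in coeffs]       # term holds D^k p
    for k in range(len(coeffs)):
        for j, a in enumerate(term):
            out[j] += a / factorial(k)
        term = derivative(term)
    return out

p = [1, 2, 3]                                   # p(x) = 1 + 2x + 3x^2
print(shift(p) == taylor_shift(p))
```

Exact rational arithmetic is used so that the division by k! introduces no rounding error in the comparison.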

Exercise Set 5C (pgs. 142-143)


1. (a) Eij has 1 in row i column j, where all other Epq have 0, so is not a linear combination of them. An m-by-n
matrix with entries aij is the sum of aij Eij, so the Eij are a spanning set. Thus {Eij} is a basis for the space of
m-by-n matrices, and the dimension is mn. (b) n.
3. Hint: A linear combination c1 f(x1) + ... + ck f(xk) = f(c1x1 + ... + ckxk) is 0 if and only if c1x1 + ... + ckxk = 0.
Dimensions of domain and image are equal.
5. Hint: Either the intersection contains two linearly independent vectors or it doesn't.
7. A basis for V is a linearly independent subset of W. Apply Theorem 5.6.
9. (a) Let {b1, ..., bn} be a basis for R^n with {b1, ..., bk} a basis for S. One possible f is defined by
f(c1 b1 + ... + ck bk + c_(k+1) b_(k+1) + ... + cn bn) = c_(k+1) b_(k+1) + ... + cn bn.
(b) Hint: Let fk(x) be the kth coordinate of f(x) for f as in part (a). Consider the null-spaces of f_(k+1), ..., fn.
11. Hint: Show that in the notation of Theorem 5.10, k is the dimension of the null-space of f and r is the dimension of
the image.
13. (a) By Exercise 7 in this section dim(image of f) ≤ dim W < dim V so by Exercise 11, dim(null-space of f) > 0.
(b) Hint: If Ax = 0 then BAx = 0. What does part (a) say about the null-spaces of the operators defined by
multiplication by A and BA?
multiplication by A and BA?

Section 6: Eigenvalues & Eigenvectors


Exercise Set 6A (pg. 148)

1. (n and ( =i) arc associated with;,. = 7 and (-n is associated with;,. = - 5. The others arc not eigenvectors.

3.;,. = 2, (~);;,. = -2, (-~). 5.),. = 0, (-~);A= 4, C)·


7. hO,(g} A-1, (!}A-2, G)
9. If f(u) = λu then u = f^(-1)(λu) = λf^(-1)(u), so f^(-1)(u) = λ^(-1)u.
11. Hints:
(a) (e^(±kx))'' = k^2 e^(±kx).
(b) (sin kx)'' = -k^2 sin kx, (cos kx)'' = -k^2 cos kx.
(c) 1, x.
(d) For λ = k^2 > 0 we want c1 e^(kx) + c2 e^(-kx) to be 0 when x = 0 and x = π, so c1 + c2 = 0 and c1 e^(kπ) + c2 e^(-kπ) = 0.
Then c2 = -c1, and c1 e^(kπ) - c1 e^(-kπ) = 0, or c1 e^(-kπ)(e^(2kπ) - 1) = 0. e^(-kπ) ≠ 0, and e^(2kπ) - 1 ≠ 0 because k ≠ 0, so
c1 = c2 = 0. Thus c1 e^(kx) + c2 e^(-kx) is the zero function and therefore not an eigenvector. For λ = 0, c1 + c2 x = 0
when x = 0 and x = π also implies c1 = c2 = 0. For λ = -k^2 < 0, c1 cos kx + c2 sin kx = 0 when x = 0 only if
c1 = 0, and is 0 when x = π only if c2 sin kπ = 0. This is possible with c2 ≠ 0 only when k is an integer and
λ = -k^2.
13. λ1 = 1 + √2, u1 = (√2, 1); λ2 = 1 - √2, u2 = (-√2, 1). G stretches by a factor of 1 + √2 in the direction of (√2, 1)
and by 1 - √2 in the direction of (-√2, 1), with reversal of direction because 1 - √2 < 0.
15. x(t) = 2c1 e^(2t) - 2c2 e^(-2t), y(t) = c1 e^(2t) + c2 e^(-2t).
17. x(t) = -2c1 + 2c2 e^(4t), y(t) = c1 + c2 e^(4t).
Exercise Set 6BC (pgs. 154-155)
1. λ1 = ½(1 + √5), λ2 = ½(1 - √5). Theorem 6.7 guarantees a basis of eigenvectors in R^2.
3. λ1 = i, λ2 = -i. There is a basis of eigenvectors in C^2 but not in R^2.
5. λ1 = √10, λ2 = -√10 and λ3 = -1. There is a basis of eigenvectors in R^3.
7. (a) The eigenvalues of Rθ are cos θ ± i sin θ and are real only when sin θ = 0, so θ = 0 or θ = π. For θ = 0 every
non-zero vector is an eigenvector associated with the eigenvalue 1, and for θ = π every non-zero vector is an
eigenvector associated with the eigenvalue -1.
(b) For Rθ u to be a real multiple of u, it must have either the same or the opposite direction.

9. Partial answer: U = ~ ! )- 11. BasisC) ·G} G~) · matrix

13. B~is (:) . (: • (~} matrll 15. U =G~).A=(~ ~)


( ~ ~)-
-~
0 0 I

17 I 1 ) A = (I+ i.J3 0 )·
· U= ( l+i.J3 l-i.J3' 0 1-i.J3

19. A= (b -~)- Eigenvalues I, -1, eigenvectors ex and e-x_

21. A = ( ~ - b)- Eigenvalues i, -i; eigenvectors cos x + i sin x, cos x - i sin x.

Section 7: Inner Products


Exercise Set 7A (pg. 158)
1. An inner product. 3. An inner product.
5. Ellipse with axes from (-1/√3, 0) to (1/√3, 0) and from (0, -1/√2) to (0, 1/√2).
7. Partial hint: cos kx cos lx = ½cos(k + l)x + ½cos(k - l)x. There are similar formulas for the other products.
9. (2e1 + 2e2 - e3, 2e1 + 2e2 - e3) = 4(e1, e1) + 4(e2, e2) + (e3, e3) + 8(e1, e2) - 4(e1, e3) - 4(e2, e3) = -3 < 0, so
positivity fails.
11. (a) Hint: Use linearity properties of the integral, and note that f(x)^2 ≥ 0.
(b) |∫_(-π)^π f(x)g(x) dx| ≤ (∫_(-π)^π f^2(x) dx)^(1/2) (∫_(-π)^π g^2(x) dx)^(1/2).

[Figure: the "unit" circle 3x^2 + 2y^2 = 1 under the norm determined by the given inner product, and the unit circle x^2 + y^2 = 1 under the standard Euclidean norm]

13. Hint: |x - y|^2 = (x - y, x - y). Use additivity of the inner product.

Exercise Set 7B (pg. 167)


1. x1 = (1, 1, 1), x2 = (-1, ½, ½), x3 = (0, 1, -1) form an orthogonal basis. Then {(1/√3)x1, (1/√(3/2))x2, (1/√2)x3} is
an orthonormal basis.
3. {(1, 2, 1, 1), (-1, 0, 1, 0), (4, 1, 4, -10), (3, -4, 3, 2)}, or any scalar multiples of these.
5. Just apply the process.
7. Hint: Use Theorem 7.6 to express the dot product of the matrix columns in terms of the inner product.
9. Use Gram-Schmidt (for the ordinary dot product) to find the orthonormal basis
{U1 = (1/√3, 1/√3, 1/√3), U2 = (1/√6, -2/√6, 1/√6), U3 = (1/√2, 0, -1/√2)}. Distance = √2.
11. Hint: Use additivity and homogeneity of the inner product.
13. Hint: Use additivity and homogeneity of the dot product, and distributivity of matrix multiplication. Positivity may fail,
for example with A = 0 or A = -I.
15. Hint: a = (e1, e1) > 0. Write out ((x, y), (x, y)) and complete the square.
17. Additional hints: For a complex vector y, y · ȳ > 0 unless y = 0. λ is real if λ = λ̄.
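
The orthogonal basis quoted in answer 1 of this set can be verified directly with the ordinary dot product. The sketch below assumes only the three vectors as printed; the helper names are illustrative. It checks pairwise orthogonality, recovers the squared lengths 3, 3/2 and 2 that give the normalizing factors, and confirms that the scaled vectors have length 1.

```python
from fractions import Fraction
import math

# Vectors from answer 1; Fractions keep the dot products exact.
x1 = (1, 1, 1)
x2 = (-1, Fraction(1, 2), Fraction(1, 2))
x3 = (0, 1, -1)

def dot(u, v):
    """Ordinary dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

pairwise = [dot(x1, x2), dot(x1, x3), dot(x2, x3)]     # each should be 0
squared_norms = [dot(v, v) for v in (x1, x2, x3)]      # 3, 3/2, 2

# Normalize: divide each vector by its length.
orthonormal = [tuple(float(c) / math.sqrt(dot(v, v)) for c in v)
               for v in (x1, x2, x3)]
lengths = [math.sqrt(dot(u, u)) for u in orthonormal]
print(pairwise, squared_norms, lengths)
```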
Exercise Set 7C (pgs. 170-171)

~ u,. M ~ G-1
0
1. To ,erify the ,xis of mtation. ,hock that Au, ~) . Angle is n.
0 -1

3. (a) R= (~
0
~
1
-~).s= ( ~
0 -1
O ~) -
0 0
(b) SR= (
-1
~~ -I)-A,is(l.1.-1 ).

5. (  4/5      1/5    -2√2/5 )
   (  1/5      4/5     2√2/5 )
   ( 2√2/5   -2√2/5     3/5  )
7. Hints:
(a) a^2 + b^2 = c^2 + d^2 = 1, and since ac + bd = 0, a^2 c^2 = b^2 d^2.
(b) Since a^2 + b^2 = 1 there is a θ such that cos θ = a and sin θ = b.
(c) (a + 1, b) is an eigenvector for 1, and (a - 1, b) for -1. These vectors are orthogonal (because a^2 + b^2 = 1) and
can be normalized to be orthonormal.
9. Hints:
(a) The first column is as shown because u1 is an eigenvector for λ = ±1. The rest of the first row is 0 because
columns 2 and 3 are orthogonal to column 1.

(b) Since λ = 1, f leaves points on the line through u1 fixed. Apply Exercise 7 to the 2-by-2 submatrix in rows and columns 2 and 3. In case (b)
of Exercise 7, f is a rotation about the axis u1; in case (c) it is a reflection in the plane spanned by u1 and the line
of reflection in the u2u3-plane.
(c) In case (b) of Exercise 7, f is the composition of a rotation with axis u1 with reflection in the u2u3-plane. In case
(c), f is a reflection in the line of reflection in the u2u3-plane.

Chapter 3 Review (pg. 171)

1. Dependent. 3. Independent.

5. ( ~~
-1 0
~)-
1
7. C ~ =!)·
-1 3 0 0
0 1 0 0

l~~
1/2
-1/2) 2 2 0 0
9. ( 2 - 1 . 11.
0 0 -1 3
-1/2 1/2 1/2
0 0 0 1
0 0 2 2
1 - 1
-1
1
-1 -2
1
2 -2 .
2
-!)
15. (a) R = (b ~ -~),
0 1 0
S = (cot
sin 0
~
0
si~O).
cos 0
(b) Rotation of angle -0 about the z-axis.

17. A[= -5, UJ = G} >..2 = 2, u2 = (-n- Basis.

19. At - I . Ut - (g} A2 - 2. u, - (:} A3 - 3. U3 - n} BMi<.

n) e,:';} (-\ =il)


21. (a) 0, ±i√(a^2 + b^2 + c^2).

::: :~:~::::.':~::: :::gonvectm

23. Hint: Put x = Mx + (x - Mx), and show that x - Mx is in the null-space of M.

CHAPTER 4: DERIVATIVES
Exercise Set 1A-D (pgs. 182-184)
1. f'(t) = (2t, 3t^2), t(s) = s(4, 12) + (5, 9).

3. f'(t) = (2 e~:).
t(s) = 2 + s( :=~) (:=~)
5. f'(t) = (-sin t, -2 sin 2t, -3 sin 3t, -4 sin 4t); t(s) = sf'(π/2) + f(π/2) = s(-1, 0, 3, 0) + (0, -1, 0, 1).

7, 9, 11. [Figures]

13. T(1/2) = 21/64, T'(1/2) = 27/16.


17. [Figure]

19. (1/2, 1/4, 1/8); No. 21. 2√(1 + 4 sin^2 t).


23. v(t) = 5, l(y) = 20 25. v(t) = ,Jf+9t, l(y) = ¥
h(S/3)'

[Figure]

27. (acoswt+bcosot,asinwt+bsinSt)

39, 41. [Figures]

43. T(t) = 32 - ~(-16t 2 + 300t), Tmin ~ 27.8° F

Exercise Set 1E (pg. 185)

1, 3, 7, 9. [Figures; Exercises 7 and 9 include the projection of the curve onto the xy-plane]
, y

Exercise Set 1F (pgs. 187-188)


1. F(t) = (½1 3+ t,¾t4-t) +c, c = (2/3, 11/4)

3. F(t) = (t sin t + cos t, -t cos t + sin t) + c, c = (0, 1).

5. F(t) = (t, ¼t3, -t, ¼t 3 ) + c, c = (1, 5/3, 3, 5/3)


7. (½ 12 + 2, -½ t 3
+ 1) 9. x(t) = ( sin t - 2, -½ cos 21 + ½)
ll. 2 2
x(t) = (½1 + ½, ½1 - ~, ½1 3+ j) 13. x(1) = (¾1 3+ 1+2, --b1 4+1 + 1)

15. tan θ = h(1/a + 1/b), v0^2 = abg/(2h cos^2 θ).

17. v(0) = (20, -40); v0 ≈ 45 ft/sec; Clark's speed at time of rescue: 122 ft/sec; victim's speed at time of rescue: 80 ft/sec.

21. θ ≈ 54.4°; maximum height ≈ 26.2 miles.

25. (a) (1, 1). (b) (1/2, 1/3, 1/4).

Exercise Set 2AB (pgs. 192-193)

1, 3, 5, 7. [Figures]

9, 11, 13, 15, 17, 19. [Figures]

21, 23, 25. [Figures]

Exercise Set 2C (pgs. 195-196)

1. [Figure]


5, 7, 9, 11. [Figures]

13. P(x, y) = H(x)H(y)H(1 - y)H(y - x)
15. P(x, y) = H(1 - x^2 - y^2)H(x - y)
17. P(x, y) = H(x)H(1 - x)H(y)H(1 - y)
19, 21. [Figures]

23, 25. [Figures]

Exercise Set 2D (pg. 198)

1. (a) [Figure: x^2 + y^2 - z^2 = 0]

3. The length of the axes of the ellipsoid of level 2 increases by a factor of √2; the vertex (saddle point) of the elliptic
paraboloid (hyperbolic paraboloid) of level 1 is (0, 0, -1/c).
7. Q is an elliptic paraboloid.

9. Hyperboloid of one sheet. 11. Hyperboloid of one sheet. 13. Ellipsoid.
15. Hyperbolic paraboloid. 17. Hyperbolic paraboloid.
19. Elliptic paraboloid. 21. Parabolic cylinder.
[Figures: Exercises 9 through 21]

23. Parabolic cylinder. 25. Circular cylinder.
27. Elliptic cylinder.
[Figures: Exercises 23, 25, 27]

Exercise Set 3A-C (pgs. 203-204)


1. fx = 2x + x cos(x + y) + sin(x + y), fy = x cos(x + y).
3. fx = e^(x+y+1), fy = e^(x+y+1). 5. fx = yx^(y-1), fy = x^y ln x.
7. fx = 2xy + y^2, fy = x^2 + 2xy, fx(1, -1) = -1, fy(1, -1) = -1; x + y + z = 0.
9. fx = -2x/(x^2 + y^2)^2, fy = -2y/(x^2 + y^2)^2, fx(1, 1) = -½, fy(1, 1) = -½; x + y + 2z = 3.
11. fyx = fxy = 1 + 6xy^2. 13. fyx = fxy = 8xy/(x^2 + y^2)^3.
15. fx = (2x + x^2)e^(x+y+z) cos y, fy = x^2 e^(x+y+z)(cos y - sin y), fz = x^2 e^(x+y+z) cos y.
17. fx = 2x/(z^2 + w^2), fy = -2y/(z^2 + w^2), fz = 2z(y^2 - x^2)/(z^2 + w^2)^2, fw = 2w(y^2 - x^2)/(z^2 + w^2)^2.
19. fx = 1, fy = 2z, fz = 2y. 21. fyxx = 2/(x + y)^3.
29. r(x, y) = -(1/√2)x - (1/√2)y + √2. 31. r(x, y) = 1.
[Figures: f(x, y) = (1 - x^2 - y^2)^(1/2) and f(x, y) = exp(-x^2 - y^2)]
33. (1, 1, √2)t + (1/2, 1/2, √2/2). 41. D is the xy-plane; u is not harmonic on D.
43. D is the xy-plane; u is harmonic on D. 45. D is the xy-plane; u is not harmonic on D.
47. D is the xy-plane with the y-axis deleted; u is harmonic on D.
49. D is the xy-plane; u is not harmonic on D.

Exercise Set 4AB (pgs. 210-212)

5. fx = (e^x, 0, e^(x+y)), fy = (0, e^y, e^(x+y)).

7. gu(1, 1) = (1, 0, 2), gv(1, 1) = (0, 1, 2).
9. gu(π/4, π/4) = (-1/2, 1/2, 0);
gv(π/4, π/4) = (1/2, 1/2, -√2/2).
[Figures: Exercise 7, Exercise 9]

11. 2x + 2y - z = 2. 13. x + y + √2 z = 2.
[Figures: Exercise 11, Exercise 13]

15. (a) (b) / 1 (-1, 0, t) and /r(-1, 0, t) are the


velocity vectors at time t of a point
y
1(1,1,1'/. starting at (- 1, 0),

[Figure: Exercise 15(a)]

17. (a) f1(x, y) = x (x ≠ 0), f2(x, y) = (x + y)^2/(4x).
(b) fx(x, y) = (1, (2x(x + y) - (x + y)^2)/(4x^2)), fy(x, y) = (0, (x + y)/(2x)).
(c) The image in the uv-plane of the given region is bounded by four curves: v = 11 (0 < u ::: 4), v = 16/u
(4::: u ::: 8), u = u - 8 + 16/u (15 ti ½Cl!), v = 0 (0 < u ::: 4)
~~v,.f:.'i
Exercise Set 4C (pg. 213)
1. 3.
z (1,2,2)

}~~,\
4 X '- (e2,2,0) ,-.

5.

7. u is the angle made by the positive x-axis and the projection of the vector f onto the xy-plane; v is the angle made by
the positive z-axis and the vector f .
9, 11. [Figures]

13. (x, y, z) = (cos u cosh v, sin u cosh v, (1/√2) sinh v). 15. (x, y, z) = (cos u sinh v, sin u sinh v, -cosh v).
[Figures: the surfaces x^2 + y^2 - 2z^2 = 1 and z^2 - x^2 - y^2 = 1]

Chapter 4 Review (pgs. 214-215)

1. x'(t) = e^t(cos t - sin t)i + e^t(cos t + sin t)j + e^t k,
x''(t) = (-2e^t sin t)i + (2e^t cos t)j + e^t k,
t(t) = (1/√3)(cos t - sin t)i + (1/√3)(cos t + sin t)j + (1/√3)k.
3. (a) x'(0) = (5/√2, 5/√2, 0), x''(0) = (0, 0, -5). (b) 10π.

(d) [Figure: the curve x(t)]

5. max = √(a^2 + c^2), min = √(b^2 + c^2).

7. λ1 = curve parametrized on [0, π/2] and λ2 = curve parametrized on [-π/2, 0]. l(λ1) = (e^(π/2) - 1)√2,
l(λ2) = (1 - e^(-π/2))√2.
[Figure]

9. 2x + 12y - z = 17
23. (a) ellipses if k > 0, single point (0, 0, 0) if k = 0. (b) parabolas.
25. F(x, y) = y - x^2.
27. h(x) = (x, x^2).
[Figure: the curve g(t) = (t, t, t^2) and its projection onto the xy-plane]
CHAPTER 5: DIFFERENTIABILITY
Section 1: Limits And Continuity
Exercise Set lA-C (pgs. 224-225)

1. [Figures for the three sets:] (a) |x − x0| ≤ 3 (b) |x − x0| = 3 (c) |x − x0| < 3

3. The interior of S is S and, therefore, S is open; the boundary of S is the circle of radius 3 centered at (1, 2).
5. Let S = {(x, y) | 0 < x < 3, 0 < y < 2}. The interior of S is S and, therefore, S is open. The boundary of S consists of
the four line segments l1, l2, l3, l4, where l1 has endpoints (0, 0) and (3, 0); l2 has endpoints (3, 0) and (3, 2); l3 has
endpoints (3, 2) and (0, 2); and l4 has endpoints (0, 2) and (0, 0).
7. The set S = {(x, y) | x² + 2y² < 1} contains all points inside the ellipse E: x² + 2y² = 1. The interior of S is S and,
therefore, S is open. The boundary of S is E.
9. The set S = {(x, y) | x² + y² > 0} is the xy-plane with the origin deleted. The interior of S is S and, therefore, S is
open. The boundary of S consists of the single point (0, 0).
11. The given set S = {(x, y) | x > y} is the region below the line y = x in the xy-plane. The interior of S is S and,
therefore, S is open. The boundary of S consists of all points on the line y = x.
13. Lines and planes in ℝ³ are not open subsets of ℝ³ because no point on a line or a plane is an interior point. For
example, in the case of a line, if x0 is a point on the line then every neighborhood of x0 contains points not on the line.
No point on a line is an interior point (in fact, they are all boundary points). A similar observation holds for planes.
15. Since the 2-by-2 matrix with rows (1, 3) and (0, 2) sends the column vector (x, y) to (x + 3y, 2y), the given function
can be written as f(x, y) = (x + 3y, 2y). Thus, the domain space and the range space are both of dimension 2
(i.e., n = m = 2). The real-valued coordinate functions of f are f1(x, y) = x + 3y and f2(x, y) = 2y.
17. The domain space of f(t) = (t, t², t³, t⁴) has dimension 1, and the range space has dimension 4 (i.e., n = 1, m = 4).
The real-valued coordinate functions of f are f1(t) = t, f2(t) = t², f3(t) = t³ and f4(t) = t⁴.
19. The domain space and range space of f(x, y, z) = (2x, 2y, 2z) are both of dimension 3 (i.e., n = m = 3). The
real-valued coordinate functions of f are f1(x, y, z) = 2x, f2(x, y, z) = 2y and f3(x, y, z) = 2z.
21. The coordinate functions of the given function f are
f1(x, y) = y/(x² + 1) and f2(x, y) = x/(y² − 1).

f1 is continuous everywhere on the xy-plane. f2 is continuous on the xy-plane except on the horizontal lines y = ±1,
where lim_{x→x0} f2(x) fails to exist. It follows that lim_{x→x0} f(x) fails to exist for x0 on the horizontal lines y = ±1.
(Note: For points of the form x0 = (x0, ±1), where x0 ≠ 0, lim_{x→x0} f2(x) fails to exist because f2 becomes infinitely
large near x0. But for points of the form x0 = (0, ±1), lim_{x→x0} f2(x) fails to exist because its value depends on the direction
from which we approach x0.)
23. There is only one coordinate function of the given function f; namely,

f(x, y) = x/sin x + y, if x ≠ 0;
f(x, y) = 2 + y, if x = 0.

For points x0 on the vertical lines l_n: x = nπ (n a nonzero integer), lim_{x→x0} f(x) fails to exist, but this limit does
exist at all other points in the xy-plane. (Note: f is not continuous at points of the form (0, y0), but lim_{x→(0,y0)} f(x)
does exist and equals 1 + y0.)
25. The coordinate functions of the given function f are f1(u, v) = uv/(1 − u² − v²) and f2(u, v) = 1/(2 − u² − v²). f1
is continuous everywhere on the uv-plane except on the circle C1: u² + v² = 1. For points x0 on C1, lim_{x→x0} f1(x) fails
to exist. f2 is continuous everywhere on the uv-plane except on the circle C2: u² + v² = 2. For points x0 on C2,
lim_{x→x0} f2(x) fails to exist. It follows that f is continuous everywhere on the uv-plane except on the two circles C1 and
C2, and lim_{x→x0} f(x) fails to exist for points x0 on these two circles.

27. The coordinate functions of the given function f are f1(u, v) = 3u − 4v and f2(u, v) = u + 8, both of which are
continuous on the uv-plane. Thus f is continuous on the uv-plane.
29. The given function f is continuous on the xy-plane except possibly at (0, 0). However, it was shown in Example 6 that
lim_{x→(0,0)} f(x, y) fails to exist, so that f can't be continuous at (0, 0) (regardless of how f(0, 0) is defined).
31. For x ∈ ℝⁿ, the function f(x) = |x|/(1 − |x|²) isn't continuous for points x ∈ ℝⁿ that are 1 unit from the origin. That
is, f isn't continuous on the n-dimensional unit sphere.
33. (a) The translation T(x, y) = (x, y) + (1, 1) takes each point in the xy-plane and moves it a distance of
|(1, 1)| = √2 units along a line parallel to the line y = x.
(b) Hint: Use Theorem 1.4.
35. Hint: Each time you include another open set you may need a smaller ball inside.
37. Hint: Use the definition of length.
39. Hint: Use the triangle inequality.
41. Hint: Assume the contrary and reach a contradiction.
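The note in answer 23 of this set (the limit at (0, y0) exists and equals 1 + y0 even though f(0, y0) = 2 + y0, while the limit fails to exist on the lines x = nπ) can be illustrated numerically. This is a minimal sketch, assuming the piecewise definition of f as printed in that answer; the sample points are arbitrary.

```python
import math

def f(x, y):
    # Piecewise function from answer 23: x/sin x + y for x != 0, and 2 + y at x = 0
    if x == 0:
        return 2.0 + y
    return x / math.sin(x) + y

if __name__ == "__main__":
    y0 = 3.0
    # Approaching (0, y0): values tend to 1 + y0, not to f(0, y0) = 2 + y0
    for k in range(1, 7):
        print(10.0 ** -k, f(10.0 ** -k, y0))
    print("value at x = 0:", f(0.0, y0))
    # Approaching the line x = pi: values grow without bound
    for k in range(1, 7):
        print(math.pi - 10.0 ** -k, f(math.pi - 10.0 ** -k, y0))
```

The first loop shows the removable discontinuity at x = 0; the second shows the blow-up near x = π.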

Section 2: Real-Valued Functions


Exercise Set 2AB (pg. 232)
1. (f_x, f_y) = (2x, −2y). 3. (f_x, f_y) = (1, 2).
5. (f_x, f_y, f_z) = (1, 1, −2z). 7. (f_{x1}, f_{x2}) = (2x1, 8x2).
9. z = 3x − 3y. 11. z = x + 2y.
13. w = x + y − 2z + 1. 15. w = x + 2y + 2z − 2.
17. When x = 0 and/or y = 0. 19. When x + y = 0.
21. Routine calculation. 23. Hint: Use the definition of differentiability.
25. Hint: Use the definition of f′(0).
Section 3: Directional Derivatives
Exercise Set 3AB (pgs. 236-237)
1. 4/√3. 3. 1.
5. −2/√5. 7. 4e²/√10.
9. 2/√5. 11. 4/√3.
13. ±(−3/√3 + 4/√3 + 1/√3 = 2/√3). 15. Routine calculation.
17. Routine calculation. 19. Hint: If |y − x| ≠ 0, divide by it.
21. Use the hint.
23. f(h, k, l) ≈ 1 + 0 + ½(2h² + 2k² − 2l²) = 1 + h² + k² − l².

Section 4: Vector-Valued Functions


Exercise Set 4A-C (pgs. 243-245)

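1. [matrix illegible] 3. [matrix illegible; legible entries include cos y and −sin z]
5. (x² + 2x)e^x. 7. [matrix illegible]
9. [matrix illegible] 11. [2-by-2 matrix; legible entries: 2x, −2y, 2y, 2x]
13. [matrix illegible] 15. (2 2).
17. [column vector with entries h/2 and −h/2] 19. [matrix illegible]
21. [matrix illegible]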
23. (a) P is the projection of the vector (x, y, z) onto the xy-plane. (b) G~ ~) .
25. (a) f(xo + Y1) = ( tl), f(Xo + Y2) = (~:~). f(Xo + y3) = e:11).
(b) T (x, y) = ( x +; ) .
(c) f(xo+Y1) =:::: (i~i). f(Xo+Y2) =~ (~:D, f(Xo+Y3) ~ (~:n-
27. The 3-by-3 matrix in the definition of f.
29. Hint: Use the definition of the derivative matrix in both parts.

Section 5: Newton's Method


Exercise Set 5 (pg. 250)

1. (a) The graph of f(x) = x^{1/3} − x, −2 < x < 2 is shown on the right. [Figure.]
(b) The tangent lines l1, l2 and l3 to the graph of f for x0 = 3/4,
x0 = −3/4 and x0 = −1/4 (resp.) are shown on the graph given
in part (a). Their respective points of tangency p1, p2 and p3 are
also shown.
(c) The equation x^{1/3} − x = 0 can be factored as x^{1/3}(1 − x^{1/3})(1 +
x^{1/3}) = 0, from which we obtain the three solutions 1, −1 and 0.
(c) 1, −1, −1 respectively.
(d) This choice gives a solution remote from the starting value.
3. (a) Routine calculation. (b) x3 = x4 = 0.739085133.
5. x9 = x10 = (0.980222741, 1.993801602, −0.874024343) is one solution.
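The iterations reported above can be reproduced with a few lines of Newton's method. The sketch below makes two assumptions that are not stated in this answer key: that the real cube root is used for x^{1/3} when x is negative, and that the equation behind answer 3(b) is cos x = x (the value 0.739085133 agrees with the solution of that equation, but the exercise statement is not reproduced here). The helper names are illustrative.

```python
import math

def newton(f, df, x0, tol=1e-12, max_iter=100):
    """Plain Newton iteration x <- x - f(x)/f'(x), stopping when the step is tiny."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            break
    return x

def cbrt(x):
    # Real cube root, defined for negative x as well (an assumption about the text's convention)
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def f(x):
    # f(x) = x^(1/3) - x from answer 1
    return cbrt(x) - x

def df(x):
    # f'(x) = (1/3) |x|^(-2/3) - 1, valid for x != 0
    return abs(x) ** (-2.0 / 3.0) / 3.0 - 1.0

def g(x):
    # Hypothetical equation for answer 3(b): cos x - x = 0
    return math.cos(x) - x

def dg(x):
    return -math.sin(x) - 1.0

if __name__ == "__main__":
    for x0 in (0.75, -0.75, -0.25):
        print(x0, newton(f, df, x0))
    print(newton(g, dg, 1.0))
```

Starting from −1/4 the first step lands far from the starting value before the iteration settles on −1, which matches the remark in part (d).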
Chapter 5 Review (pgs. 250-251)
1. (a) Open. (b) Not closed. (d) Equals the set. (e) Nonnegative x and y axes.
3. (c) Neither. (d) The set with the semicircle deleted. (e) The semicircle together with the segment −1 ≤ y ≤ 1 on
the y-axis.
5. (a) Open. (b) Closed. (d) Equals the set. (e) Empty.
7. (a) Open. (b) Not closed. (d) Equals the set. (e) Three parts of planes: z = 3 where x ~ 0 and y ~ 1, y =I
where x ~ 0 and z ~ 3, x = 0 where y ~ I and z ~ 3.
9. (a) The interior of S is the solid unit sphere in ℝ³ without its "skin," and (b) the boundary of S is its "skin". It follows
that the smallest closed set containing S is the solid unit sphere in ℝ³ together with its "skin."
11. H ∘ f has no points of discontinuity.
13. The points of discontinuity of H ∘ f are the points on the line y = x.
15. No points of discontinuity.
17. (−2x(x² + y²)^{−2}, −2y(x² + y²)^{−2}).
19. (1, −1, 0). 21. (y + w, x + z, y + w, z + x).
23. f_xx = e^x sin y, f_yy = −e^x sin y, f_xy = f_yx = e^x cos y.
25. f_xx = yze^x, f_yy = f_zz = 0, f_xy = f_yx = ze^x, f_xz = f_zx = ye^x, f_yz = f_zy = e^x.
27. f_xx = 12x², f_yy = 6y, f_zz = 2, f_xy = f_yx = f_xz = f_zx = f_yz = f_zy = 0.

29. [matrix illegible] 31. [matrix illegible]

33. [matrix illegible; the last row reads (vw, uw, uv)]

35. (a) 4 cos θ − 4 sin θ. (b) (1/√2, −1/√2).
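Second-partial answers such as 25 in this review can be spot-checked with central differences. The sketch below assumes f(x, y, z) = yz e^x, a function inferred from the listed partials rather than quoted from the exercise; the helper name and the sample point are illustrative.

```python
import math

def f(x, y, z):
    # Assumed function, consistent with the partials listed in answer 25
    return y * z * math.exp(x)

def second_partial(func, point, i, j, h=1e-4):
    """Central-difference estimate of the second partial with respect to variables i and j."""
    def shifted(di, dj):
        q = list(point)
        q[i] += di
        q[j] += dj
        return func(*q)
    return (shifted(h, h) - shifted(h, -h) - shifted(-h, h) + shifted(-h, -h)) / (4.0 * h * h)

if __name__ == "__main__":
    p = (0.5, 2.0, 3.0)
    names = "xyz"
    for i in range(3):
        for j in range(3):
            print("f_" + names[i] + names[j], second_partial(f, p, i, j))
```

At p = (0.5, 2, 3) the estimates should agree with yz e^x, z e^x, y e^x, e^x and 0 in the pattern given in the answer, including the symmetry f_xy = f_yx.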


CHAPTER 6: VECTOR DIFFERENTIAL CALCULUS
Exercise Set lA-C (pgs. 257-259)
1. ∇f(x) = (2x, −2y)
3. ∇f(x) = (1, 2)
5. ∇f(x) = (1, 1, −2z)
7. ∇f(x) = (2x, −3y²), |∇f(x)| = √(4x² + 9y⁴), |∇f(1, 1)| = √13, u = (2/√13, −3/√13)
9. ∇h(x) = (y sin z, x sin z, xy cos z),
|∇h(x)| = √((x² + y²) sin² z + x²y² cos² z),
∇h(1, 2, π) = (0, 0, −2), u = (0, 0, −1)
11., 13. [Figures.]
15. ∇f(x, y) = (y, x + 2y) 17. ∇f(x, y, z) = (2x, 2y, 0).
[Figures for Exercises 15 and 17.]
19. normal vector (2, 2, 0); tangent plane x + y = 2
21. normal vector e1; tangent hyperplane x1 = 1 (x1 is the first-coordinate variable for points in ℝⁿ)
23. normal vector (1, 1, 1); tangent plane x + y + z = 3
25. (b) normal vector (−1 − e, −1 − e, 1); tangent plane x + y − z/(1 + e) = 1.
27. F′(0) = 3 29. g′(π) = −1
31. (a) x 2 2
- y2 = 8 (b) y = (x/3) 31 , t 2:: 0 (c) (6, -2) (d) (6, 3) (e) maximum radiation at t = v16; radiation
decreases fort> J6 and is zero fort 2:: 3.
z) ½x
35. If J (x, y, = 2 + + }z 2 !l 37. / (x, y) = 3 + ½x ½y3
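The gradient answers in this set (for instance answer 7, with its unit vector u in the direction of steepest increase at (1, 1)) can be checked by finite differences. This is a minimal sketch, assuming f(x, y) = x² − y³, a function chosen only because its gradient matches (2x, −3y²); the exercise statement itself is not reproduced here.

```python
import math

def f(x, y):
    # Assumed function whose gradient is (2x, -3y^2), as in answer 7
    return x * x - y ** 3

def numerical_gradient(func, point, h=1e-6):
    """Central-difference gradient of func at point."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((func(*forward) - func(*backward)) / (2.0 * h))
    return grad

def unit(v):
    """Scale v to length 1."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

if __name__ == "__main__":
    g = numerical_gradient(f, (1.0, 1.0))
    print("gradient:", g)
    print("length:", math.sqrt(sum(c * c for c in g)))
    print("unit direction:", unit(g))
```

The length should be close to √13 and the unit direction close to (2/√13, −3/√13).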

Exercise Set 1D (p. 261)

1. 3. 5.

Exercise Set 2A (pgs. 269-271)

1. (a) f′(x, y) and g′(f(x, y)): [matrices illegible]
(b) (g ∘ f)′(1, 1) and (g ∘ f)′(0, 0): [matrices illegible]
3. d(g ∘ f)/dt(2) = 14
5. (a) (F ∘ f)′(u, v) = (4u + 4u³ − 4uv², 4v − 4vu² + 4v³) (b) ∂w/∂u = 4u + 4u³ − 4uv², ∂w/∂v = 4v − 4vu² + 4v³
7. ∂w/∂r = √2; ∂w/∂θ = 0
9. Hint: If g(u0) = x0, use the chain rule to compute (F ∘ g)′(u0).
13. Hint: f(tx, ty) can be computed in two ways.
15. ∂²(f ∘ g)/∂v∂u(1, 1) = 2
21. g ∘ f can possibly be defined; f ∘ g cannot be defined.
23. g ∘ f cannot be defined; f ∘ g can possibly be defined.
25. g ∘ f can possibly be defined; f ∘ g cannot be defined.
Exercise Set 2B (pgs. 274-275)
9. (a), (b) Hint: Use a parametric representation for a line and a circle in the uv-plane, then apply f.
(c) det(f′) = −2(u² + v²).
11. Hint: Choose ε > 0 such that ε < |f′(x0)| and use the continuity of f′ to show that 0 < |f′(x)| on some open interval
containing x0. Then use the mean-value theorem.
13. Hint: (F⁻¹ ∘ F)(x) = x for all x in some neighborhood of x0. Now apply the chain rule to this equation.

Exercise Set 3 (pgs. 281-283)


1. (a) dy/dx = −x/y, (y ≠ 0), dy/dx(1/√2, 1/√2) = −1; dy/dx(0, 1) makes sense but dy/dx(−1, 1) does not.
(b) dx/dy = −y/x, (x ≠ 0), dx/dy(1/√2, −1/√2) = 1; dx/dy(1, 0) makes sense but dx/dy(0, 1) does not.

(c) y = ±√(1 − x²) (d) x = ±√(1 − y²)
[Figures: the graphs y = ±(1 − x²)^{1/2} and x = ±(1 − y²)^{1/2}, marking the points where dy/dx = 0, dy/dx = −1, and where dy/dx or dx/dy is undefined.]

3. dy/dx(−1, 1) = 1, dx/dy(−1, 1) = 1
5. dy/dx(−1, 1/4) = −1/4, dx/dy(−1, 1/4) = −4
7. (a) dx/dz(1, 1, −1) = −1/2, dy/dz(1, 1, −1) = 3/2 (b) dy/dx(1, 1, −1) = −3, dz/dx(1, 1, −1) = −2
(c) dx/dy(1, 1, −1) = −1/3, dz/dy(1, 1, −1) = 2/3

9. F(x, y, z) = [vector illegible] for all three parts.
For part (a), x = z and y = (x, y); for part (b), x = x and y = (y, z); for part (c), x = y and y = (x, z).
The matrices F_x and F_y for each part are illegible in this copy.
11. ∂x/∂u(1, −1, 1, 1, −1) = 0, ∂y/∂u(1, −1, 1, 1, −1) = 1
13. 9x + (6√11)y + 8z = 36.
15. (a) f′(1, 1) = [matrix illegible] (b) 14x + 9y + 11z = 12
17. (b) x1(y, z) = √(a² − y² − z²), x2(y, z) = −√(a² − y² − z²), y1(x, z) = √(a² − x² − z²), y2(x, z) = −√(a² − x² − z²),
z1(x, y) = √(a² − x² − y²), z2(x, y) = −√(a² − x² − y²)
19. (a) (x, 0, z), (x, y, −2x) (b) (x, −1/(5x²), −2x) (c) x1(y, z) = (−yz + √(y(5yz² + 4)))/(2y),
x2(y, z) = (−yz − √(y(5yz² + 4)))/(2y)
Exercise Set 4A-D (pgs. 292-293)


1. (2, 1) 3. (x, y) = (−1/4, −1/4)
5. (x, y, z) = (−2/5, 0, 1/5) 7. no critical points
9. maximum value at (1, 1); minimum value at (−1, −1)
11. maximum value at (3, −4) and (−3, 4); minimum value at (4, −3) and (−4, 3)
13. maximum value at (2/√5, 1/√10) and (−2/√5, −1/√10); minimum value at (0, 0).

15. (−1, 0, 1) and (−1, 0, −1) 17. (0, t, t), |t| ≤ √2
19. (1/√3, 1/√3, 1/√3) 21. −1/√2
23. 6 by 6 by 3 25. (5V/6)^{1/3} by (5V/6)^{1/3} by (36V/25)^{1/3}
27. (a) 1 + 1/√2 (b) 1 29. (27/19, −7/19, 7/19, −3/19)
31. Hint: Minimize the function f(x) = (a1 + ⋯ + aN + x)/(N + 1) − (a1 ⋯ aN x)^{1/(N+1)} for x > 0 and use induction.

33. (a) |h| < 1 (b) maximum and minimum of f on C_h are [value illegible] and its negative, respectively. (c) 0 is both the
maximum and minimum value of f on C1.
35. The given plane is tangent to the given sphere at the point (1, 1, 1). So, f(1, 1, 1) = 1 is both the maximum and
minimum of f subject to the given conditions.
37. (a) length = 3, width = height = 6 (b) length = height = 3(2)^{1/3}, width = 6(2)^{1/3}

Exercise Set 4E (pgs. 297-298)


3. (1, −2) (minimum) 5. (1, 2) (saddle)
7. (0, 0) (minimum) 9. (0, 0) (saddle)
11. (−1/2, 4) (maximum) 13. (0, 0) (saddle)
17. (b) (0, 0, 0) is a saddle point.

Exercise Set 4F (pgs. 302-303)


3. local extreme point (1/√6, 1/√6) gives the local maximum value √6 e^{−1/2}; local extreme point (−1/√6, −1/√6)
gives the local minimum value −√6 e^{−1/2}
5. no extreme points ((0, 0) is a saddle point)
7. The point 0.653271187 gives the extreme value 0.396652961; the point 3.292310007 gives the extreme value
−0.000002945.
9. (a) 1.49546 by 1.77438 by 0.72437 (b) 1/√2 by 1/√2 by 2e^{−1}

Exercise Set 5A-D (pgs. 308-309)


1., 3. [Figures; legible labels: the arc x² + y² = 1, x ≤ 0, y ≤ 0, with endpoints (−1, 0) and (0, −1).]
5.

9. R = {(r cos θ, r sin θ, z) | 0 ≤ r ≤ 1, −π/2 ≤ θ ≤ π/2}
11. [Figure: the curve (sin t cos t, sin² t, cos t), 0 ≤ t ≤ π/2, starting at (0, 0, 1) when t = 0.]
13. [Figure: the surface x² + (y/2)² + (z/2)² = 1, with θ0 = 2π/3, r0 = 1 and the line y = −2√3 x.]
15. (−2t sin t sin t² + cos t cos t², 2t sin t cos t² + cos t sin t², −sin t)

Chapter 6 Review (pgs. 309-311)

1. f′(x, y, z) = (2x, 2y, −2z) 3. g′(π/3, π²/36) = [matrix illegible]

5. ∂K/∂u(1, √3) = 2 7. in the direction of (3, 3, 2).


9. (a) [Figure: direction field in the xy-plane.] (b) rate of increase is constant at 1.
15. ∂f/∂s(2, 1) = 80
17. (b) dP/dt(0) = −21/16 (c) dP/dt(30) = −3/73
19. z_x(1, 1, 1) = 1, z_y(1, 1, 1) = 2
21. (a), (b) tangent vector: (5, −4, 1)
23. (b) the result of part (a) implies Theorem 1.2.

25. x_u(2, 1, −2, −1) = 3, y_u(2, 1, −2, −1) = 1/2
27. ∂x/∂y(−2, 1, 1, 2) = 4/9, ∂z/∂y(−2, 1, −1, 2) = 7/9
29. maximum at (√14/4, 1/4), (−√14/4, 1/4); minimum at (0, 1/√2), (0, −1/√2).
31. 3√3 cubic units 35. (0, 0, 0)
37. The only critical point (−1, −1) is a saddle point
39. minimum value of f(x) is −|x0|. The same results are obtained if the restriction |x| = 1 is replaced by |x| ≤ 1.
41. (a) (1, 1), (−1, −1) (b) maximum value is −1, minimum value is −2.
43. (a) (0, 0), (3/2, −9/4) (b) maximum value is 5.

CHAPTER 7: MULTIPLE INTEGRATION


Section 1: Iterated Integrals
Exercise Set lA-D (pgs. 321-322)

1. -~- 3. ¥-
y y
2
2

·1 X
X

67
5 • 28 ' 7. j.
y
y
, ~ -· ----
1 "
' ~1-x)
112

"-,
B \

1 X
-2

11. ¼-

y
rr/2
•=cos y

'-./
\
B \
........•...__J__ _
1 X

13. 1. 15. 2π + 6.

'• ..

17. ~ sin 1 - cos 1 + t Y

2
2[i4x-2.x
19. (a) A(B) =
1
0
2
_
smnx
1 dy ] dx.

[i4x-2.x
(b)
1
0
2
_
sm,r ,-
f(x, y)dy ] dx.

21. tJo

,~,
z z

~
2
/!''·':,
y=x'/2

,{::~J~
,-.Jl--i- (1,1,0)
~~2
X

z z
25. (b) rra3. cross-section Z cross-sections
(0,a,2a) perpendicular to W perpendicular to

w ,
:, I
the axis of the . _,--\ the the line y=-a

"''"'" / -----!
',. -~ y,,,

-. : t:>~x'
y=-:
- ")~/-~
,, ' x-- ,,
1- X ·, / , ]- X

27. Let F_n be the value of the expression for n ≥ 2; F_2 = 1/2 and F_n = 7/12 for n ≥ 3.
29. (a) (] - 8112 ) 2 . (b) 4.

Section 2: Multiple Integrals


Exercise Set 2A-E (pgs. 332-333)

1. π. 3. π.
5. 6. 7. [value illegible]
9. 2. 11. 2π.

13. V(B) = ∫_{−1}^{1} dx ∫_{−2√(1−x²)}^{2√(1−x²)} dy ∫_0^{4−4x²−y²} dz = 4π.
Equivalently, V(B) = ∫_{−1}^{1} dx ∫_{−2√(1−x²)}^{2√(1−x²)} (4 − 4x² − y²) dy.

15. 16π. [Figure.]

17. πh²(a − h/3). 19. 45π/2 lb.


21. About 252.4 lb.
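The volume in answer 13 of this set, the solid under z = 4 − 4x² − y² over the region 4x² + y² ≤ 4, can be checked with a nested midpoint rule. This is a minimal sketch; the limits are those of the iterated integral as reconstructed from the garbled answer, and the grid size n is an arbitrary choice.

```python
import math

def volume_midpoint(n=400):
    """Midpoint-rule estimate of the iterated integral of 4 - 4x^2 - y^2
    for -1 <= x <= 1 and -2*sqrt(1 - x^2) <= y <= 2*sqrt(1 - x^2)."""
    hx = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * hx
        half_width = 2.0 * math.sqrt(1.0 - x * x)   # y runs over [-half_width, half_width]
        hy = 2.0 * half_width / n
        inner = 0.0
        for j in range(n):
            y = -half_width + (j + 0.5) * hy
            inner += 4.0 - 4.0 * x * x - y * y
        total += inner * hy * hx
    return total

if __name__ == "__main__":
    print(volume_midpoint(), 4.0 * math.pi)
```

The estimate should be close to 4π ≈ 12.566.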

Section 3: Integration Theorems


Exercise Set 3 (pgs. 336-337)

1. [value illegible] 3. Hint: ∫_B f(x0) dV = V(B) f(x0).
5. Use the definition of the double integral. 7. (a) and (b) are routine. For (c) use the hint.
9. 2y. 11. h′(x) = ∫ e^{u²} du (limits illegible in this copy); h″(x) = e^{x²}.
13. Use the hint.

Section 4: Change Of Variable


Exercise Set 4A-D (pgs. 346-348)
1. ½(e⁴ − 1).
3. (a) B is the part of the first quadrant of the xy-plane bounded by the circle of radius 2 centered at the origin.
(b) 4f. (c) f
5. M(A) = 20π grams. 7. π.
2
9. 2a. 11. 4"2 .
13. :rra 4 . 15. 2:{.
17. Use spherical coordinates. 19. Use cylindrical coordinates.
21. (a) R_xy is the upper half (y ≥ 0) of the annular region of inner radius 1 and outer radius 4 centered at the origin.
(b) 3π.

23. (b) 1 + u 2 . (c) ~- (a)


25. ;2 + 6:rrJ3.
16

y
20
B
zj typical
cylindrical shell

"""·---
D

X...,.-"
,..~

21. b = aJl - k213.



Section 5: Centroids and Moments


Exercise Set 5 (pgs. 352-353)
xI y rnr2
1. - ¾- I ",=1 fT\,F3 3. (0, 1) . 2 •
I '-! • • (1,2)
-4 ,0 1 2
1 x=(0,1)

-1 1 X
rn 1=1
-1 •
(1,-1)

x
y
0 2/3 2 7. (~¥- 3<;t2>) ~~ (0.48,0.48). Q ,,

~~"'\
/ '/Y=X

9. (3/4, 12/5).
11. Hint: Center the ball at the origin.
13. (a) x̄ = (4(a³ − b³)/(3π(a² − b²)), 4(a³ − b³)/(3π(a² − b²))).
(b) x̄ = (4a/(3π), 4a/(3π)).
15. The centroid of R is on the line of symmetry at distance 4a/(3π)
units from the flat edge.

17. (a) Use the hint. (b) Hint: Break M_p(B) into two integrals.
19. 2μb⁴/3. 21. Use the hint.
23. Hint: Use basic properties of integrals.

Section 6: Improper Integrals


Exercise Set 6 (pgs. 358-359)
1. 1/3. 3. 1/4.
S. (I - e-1)2. 7. -Jr.

9. rr/2 . 11. {+oo,


-1/(a. + I),
a> -I
a< -1.

13. (a) k = 2/rr. (b) } - ~ 0.13. (c) (0, 0).


¥}-
15. (a) Hint: The iterated integral is the square of an integral. (b) Routine. (c) Hint: Change variables to get rid of the
m in the exponent. (d) Hint: Same as for (c).
17. Hint: Use the result of part (a) of Exercise 15. (b) Hint: Compute f000 3
v 3 e-av dv.

Section 7: Numerical Integration


Exercise Set 7 (pgs. 362-363)
1. With p = 86, q = 87 about 0.5333000. Since 8/15 = 0.5333..., the above approximation is accurate to four decimal places.
3. Simpson with p = q = 2 gives the exact value −20. The midpoint approximation doesn't give the exact value but
rather approaches it from below as p and q increase.
5. The exact value is π/2 ≈ 1.570796327. Because the given region R is not a rectangle, the function that was used for
the approximation was F(x, y) = (1 − x² − y²)H(1 − x² − y²), where H(x) is the Heaviside unit step function,
introduced in Chapter 4, which is 1 for x ≥ 0 and 0 otherwise. Thus, F(x, y) = f(x, y) inside the unit disk and is zero
elsewhere. The Simpson approximation and the midpoint approximation were then applied to F(x, y) on the rectangle
−1 ≤ x ≤ 1, −1 ≤ y ≤ 1 for the values p = q = 100 and p = q = 150. The two methods produced 1.57082, 1.57080
respectively.
7. The region R given here is the part of the region defined in Exercise 5 above that lies in the first quadrant of the
xy-plane. Since the integrand is the same here as it was there, and since it is symmetric with respect to the origin, it
follows that the exact value of the integral is one-fourth the value found in Exercise 5; namely π/8 ≈ 0.392699082. As
in Exercise 5 above, the given region R is not a rectangle. Therefore, just as in Exercise 5, the function that was used
for the approximations was F(x, y) = (1 − x² − y²)H(1 − x² − y²). However, here we applied the Simpson
approximation and the midpoint approximation on the rectangle 0 ≤ x ≤ 1, 0 ≤ y ≤ 1. Six decimal place accuracy was
first achieved with the Simpson approximation at p = q = 136, while the same accuracy was first achieved with the
midpoint approximation at p = q = 70.
9. (a) π. (b) G(0, 1.000, 0, 1.000) ≈ 2.23098516, G(0, 2.000, 0, 2.000) ≈ 3.11227036,
G(0, 2.600, 0, 2.600) ≈ 3.140110976, G(0, 3.575, 0, 3.575) ≈ 3.14158996, G(0, 3.600, 0, 3.600) ≈ 3.1415904.
(c) The list suggests that G(0, a, 0, a) first approximates π to four decimal places at about a = 3.58.
11. The exact value is π/2. Simpson with p = q = r = 30 gives 1.5733481677, accurate to only two places.
13. The exact value is 3/2 and Simpson with p = q = r = 2 gives that with 10-place accuracy.
15. (a) The exact value is 7/6.

[Figure: the region R, with the points (1, 0, 2), (0, 1, 2), (2, 0, 1) and (0, 2, 1) marked, and the projection of R onto the xy-plane.]

(b) If H(x) is the Heaviside function, then H(3 − x − y − z) has the value 1 for points below the plane x + y + z = 3
and is zero elsewhere. The smallest rectangle, call it R0, containing R is 0 ≤ x ≤ 2, 0 ≤ y ≤ 2, 1 ≤ z ≤ 2 (see
figure). We integrate the Heaviside unit step function H(3 − x − y − z) over R0. When Simpson's rule in three
dimensions was applied to this function over R0, the approximation for p = q = r = 50 was 1.176576, which is
accurate to only one decimal place. When the midpoint approximation in three dimensions was applied with
p = q = r = 50, the result was 1.166400, which is accurate to three decimal places. Simpson's rule is usually
better for smooth functions, but not here, since H is not continuous.
(c) The apparent superiority of the midpoint approximation suggested by the result of part (b) compelled us to forego
the use of the Simpson approximation in favor of the midpoint approximation, which was applied to the function
(x⁴ + y⁴ + z⁴)H(3 − x − y − z) over the rectangle R0 described in part (b). Using p = q = r = 50, the result was
6.679043. In order to check this answer, one can directly compute

∫_R (x⁴ + y⁴ + z⁴) dx dy dz = ∫_1^2 dz ∫_0^{3−z} dy ∫_0^{3−y−z} (x⁴ + y⁴ + z⁴) dx = 1403/210 ≈ 6.680952381.

We see that our midpoint approximation is accurate to only one decimal place (although rounding to two places
produces two-place accuracy).
(Note: In previous exercises, much larger values of p and q were tried than were tried here. The reason
is that the number of operations required to carry out both the Simpson approximation and the midpoint
approximation in dimension 3 is roughly proportional to the cube of the number of subintervals used. Thus, all
things being equal, computation time is much longer in three dimensions than it is in one or two dimensions.)
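The Heaviside device used in Exercises 5, 7 and 15 (extend the integrand by zero to an enclosing rectangle, then apply a product rule) can be sketched as follows. This is an illustrative implementation of composite midpoint and Simpson rules, not the program used to produce the figures quoted above; the function names are hypothetical and the digits it prints may differ slightly from those in the answers.

```python
import math

def heaviside(t):
    # Unit step: 1 for t >= 0 and 0 otherwise
    return 1.0 if t >= 0 else 0.0

def disk_integrand(x, y):
    # F(x, y) = (1 - x^2 - y^2) H(1 - x^2 - y^2) from Exercise 5
    s = 1.0 - x * x - y * y
    return s * heaviside(s)

def below_plane(x, y, z):
    # H(3 - x - y - z) from Exercise 15(b)
    return heaviside(3.0 - x - y - z)

def midpoint_2d(f, ax, bx, ay, by, p, q):
    hx, hy = (bx - ax) / p, (by - ay) / q
    total = 0.0
    for i in range(p):
        x = ax + (i + 0.5) * hx
        for j in range(q):
            total += f(x, ay + (j + 0.5) * hy)
    return total * hx * hy

def simpson_weights(n):
    # Composite Simpson weights 1, 4, 2, ..., 4, 1 (n must be even)
    if n % 2:
        raise ValueError("Simpson's rule needs an even number of subintervals")
    return [1 if k in (0, n) else (4 if k % 2 else 2) for k in range(n + 1)]

def simpson_2d(f, ax, bx, ay, by, p, q):
    hx, hy = (bx - ax) / p, (by - ay) / q
    wx, wy = simpson_weights(p), simpson_weights(q)
    total = 0.0
    for i, wi in enumerate(wx):
        x = ax + i * hx
        for j, wj in enumerate(wy):
            total += wi * wj * f(x, ay + j * hy)
    return total * hx * hy / 9.0

def midpoint_3d(f, box, n):
    (ax, bx), (ay, by), (az, bz) = box
    hx, hy, hz = (bx - ax) / n, (by - ay) / n, (bz - az) / n
    total = 0.0
    for i in range(n):
        x = ax + (i + 0.5) * hx
        for j in range(n):
            y = ay + (j + 0.5) * hy
            for k in range(n):
                total += f(x, y, az + (k + 0.5) * hz)
    return total * hx * hy * hz

if __name__ == "__main__":
    print("midpoint, disk:", midpoint_2d(disk_integrand, -1, 1, -1, 1, 100, 100))
    print("Simpson, disk: ", simpson_2d(disk_integrand, -1, 1, -1, 1, 100, 100))
    print("midpoint, 3-D: ", midpoint_3d(below_plane, ((0, 2), (0, 2), (1, 2)), 50))
```

The triple loop in midpoint_3d makes the cost grow with the cube of the number of subintervals, which is the point of the closing note in Exercise 15(c).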

Chapter 7 Review (pgs. 363-366)

1. 2/3. 3. e² − 3.
[Figures for Exercises 1 and 3.]

5. 0. 7. 1/2.
[Figures for Exercises 5 and 7.]

9. 1/3. The region Q is shown below on the left.

11. π/6. The unit disk is in the middle figure below.

13. 20/3. The square S in ℝ² of side length 2 centered at (1, 0) is below on the right.

[Figures for Exercise 9, Exercise 11 and Exercise 13.]

15. 15/8. 17. π.
19. 4π/5; B is the solid ball of radius 1 centered at the origin.
Note: The regions of integration in Exercises 21 and 23 below are truncated cones with heights 1 and 2 respectively, so
the figure is shown only once.
21. π/10. 23. 8π/3.
[Figure for Exercises 21 and 23.]

25. 3616/35. 27. 16ab²/3.
[Figures: for Exercise 25, the region between y = x² and y = 2x + 3, with the point (3, 9) marked; for Exercise 27, a solid with square cross-sections.]

29. 4πabc/3. 31. 8π/3.
[Figures for Exercises 29 and 31.]
33. 1/(6abc). 35. (b) ∫_0^{√2} ∫_0^{√(1 − y²/2)} 2x dx dy. (c) 2√2/3.
[Figure: the region R in the first quadrant bounded by the axes and the curve y = (2 − 2x²)^{1/2}.]

37. 477/20. 39. 4/9.


r=cos8 , j0j,;.lfl2

41. (a)
1 [! ~ [1x ] ]
0
1
-~ 0
z dz dy dx. (b) 11 [farccosz[11
0 -arc,:o,: lSCCIJ
]]
zr dr d0 dz. (c) Jt /16.

43. 1/32.

45. a= -2, b = 2, C = 3 - ./4- y 2 , d =3 - ./4- y 2 •


47. 10. 49. V(C) = ll?n/8 + 18.
51. Jt. 53. 3Jt /2.
55. 4Jt/3. 57. a < 1.

59. a< 3/2.

~ 12,r d0 1,r/2 sin </J d</J 12 r


f 1 1
J JJ -xLy2
2
61. (b) dx dy dz. (c) dr. (d) 2rr /3.
-1 -~ 0 0 0 O

z
H
~-Jf,-m~r--
, · ; , mmr.-,-, -. = _, ,,
I, ' ·,., ~'j ', ·1 z varies
-<- l . \
I
./
,,
• l

,,
X , "-,, _y
/ I l: and r vary ··"-

CHAPTER 8: INTEGRALS AND DERIVATIVES ON CURVES


Exercise Set lAB (pgs. 376-377)
1. 4/3 3. on γ1: π; on γ2: 2π
5. 0 7. 3π
9. on γ1: 1/2 + π/4; on γ2: 1 11. on γ1: 0; on γ2: 0
13. 0 15. 0

17. 478/15

19. (a) [Figure: the field F(x, y) = (y, x) and the curve g(t) = (e^t, e^{−t}), with the points t = 0 and t = 1 marked.] (b) 0

21. Hint: Let φ(u) = u².


25. Because the flow lines of the field F(x, y) = (−y, x) are concentric circles about the origin, the line integral of F
around an elliptical path γ centered at the origin is not affected by rotating γ about the origin. Also, the angle θ between
the velocity vector of a continuously differentiable parametrization of γ and the field always satisfies 0 ≤ θ < π/2, so that
the integrand of the line integral around γ is always positive and therefore ∫_γ F · dx ≠ 0.
[Figure: the field F(x, y) = (−y, x).]

27. (a) See graph for Exer. 25 in this section (b) πa² (c) ab (d) ½ab; the answers are the areas of the enclosed
regions.
29. −18 31. 24
33. (a) [Figure.] (b) Hint: Try the line segment with endpoints (0, 1), (1, 0),
and the half circle joining these points.
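The two claims in answer 25 of this set (the integral of F(x, y) = (−y, x) around a centered ellipse is unchanged by rotation, and it is not zero) can be illustrated numerically. The sketch below assumes the standard counterclockwise parametrization (a cos t, b sin t) rotated through an angle phi; the function name and the midpoint quadrature are illustrative choices.

```python
import math

def circulation_on_ellipse(a, b, phi=0.0, n=2000):
    """Midpoint-rule estimate of the line integral of F = (-y, x) around the ellipse
    with semi-axes a and b, rotated by the angle phi about the origin."""
    c, s = math.cos(phi), math.sin(phi)
    dt = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        # Point on the rotated ellipse and its velocity vector
        x = a * math.cos(t) * c - b * math.sin(t) * s
        y = a * math.cos(t) * s + b * math.sin(t) * c
        dx = -a * math.sin(t) * c - b * math.cos(t) * s
        dy = -a * math.sin(t) * s + b * math.cos(t) * c
        total += (-y * dx + x * dy) * dt   # F . x'(t) dt
    return total

if __name__ == "__main__":
    print(circulation_on_ellipse(3.0, 2.0))
    print(circulation_on_ellipse(3.0, 2.0, phi=0.7))
    print(2.0 * math.pi * 3.0 * 2.0)
```

For this parametrization the integrand is the constant ab, so both estimates equal 2πab, twice the enclosed area.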

Exercise Set 2 (pgs. 382-383)


1. ln(sec 1 + tan 1) 3. 335/27
5. (b) l(γ) = 4
[Figure: the curve γ given by r = 1 + cos θ, 0 ≤ θ ≤ π.]

7. √(a² + b²)(2πa² + ⅓(8b²π³)) 9. (a) 5 (b) 126

13. (a) ∫_0^{2π} √(a² cos² t + b² sin² t) dt (c) ≈ 9.6888

15. |g′(t)| = √2, h(s) = (cos(s/√2), sin(s/√2), s/√2), 0 ≤ s

17. (a) Hint: Use l(γ) = a0 in the formula for p0 and interpret the result.
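The length l(γ) = 4 in answer 5(b) can be checked with the polar arc-length formula ds = √(r² + (dr/dθ)²) dθ applied to r = 1 + cos θ on 0 ≤ θ ≤ π. This is a minimal sketch; the curve and interval are those read from the figure label, and the midpoint rule with n subintervals is an arbitrary choice.

```python
import math

def polar_arc_length(r, dr, theta0, theta1, n=10000):
    """Midpoint-rule estimate of the integral of sqrt(r^2 + r'^2) over [theta0, theta1]."""
    h = (theta1 - theta0) / n
    total = 0.0
    for k in range(n):
        theta = theta0 + (k + 0.5) * h
        total += math.hypot(r(theta), dr(theta))
    return total * h

def r(theta):
    return 1.0 + math.cos(theta)

def dr(theta):
    return -math.sin(theta)

if __name__ == "__main__":
    print(polar_arc_length(r, dr, 0.0, math.pi))
```

Here the integrand simplifies to 2 cos(θ/2), whose integral from 0 to π is 4.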
Exercise Set 3 (pgs. 385-386)
3. speed 5. Yes, by the Cauchy-Schwarz inequality.
9. [Figure: the curve γ traced by (cos s, sin s), 0 ≤ s ≤ 2π, with acceleration vectors.]

13. (b) κmax = 1, κmin = 0 (c) minimum curvature at x = 0, maximum curvature at x = 56^{1/6} and x = −56^{1/6}.
15. (b) 6/(|t|(4 + 9t²)^{3/2}); ∞

Exercise Set 4 (pgs. 394-395)


1. div F = −y sin(xy) + x cos(xy), curl F = y cos(xy) + x sin(xy)
3. div F = −1, curl F = 2
5. div F = 4x³ − 4xy(x − y) + 4y³, curl F = 4x³ − 4xy(x + y) − 4y³
7. the points in ℝ² that lie below the line y = x
9. the points in ℝ² above the x-axis
17. (b) G = (y, 0, 0) (c) F = (z, x, y)
19. (b) neither expansion nor contraction occurs and volume is preserved
21. If Δf > 0 (< 0) in a region then a given mass is moved to a region of larger (smaller) area and therefore density
decreases (increases), and if Δf = 0 (i.e., f is harmonic) in a region then a given mass is moved to a region of the
same area and density is preserved.
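Divergence and scalar curl answers for plane fields, such as answer 1 of this set, can be spot-checked by central differences using div F = ∂F1/∂x + ∂F2/∂y and curl F = ∂F2/∂x − ∂F1/∂y. The sketch below assumes the field F(x, y) = (cos xy, sin xy), which is inferred from the printed answer rather than quoted from the exercise.

```python
import math

def field(x, y):
    # Assumed field, consistent with the div and curl given in answer 1
    return (math.cos(x * y), math.sin(x * y))

def div_and_curl(F, x, y, h=1e-5):
    """Central-difference estimates of the divergence and scalar curl of a plane field."""
    f1_x = (F(x + h, y)[0] - F(x - h, y)[0]) / (2.0 * h)
    f1_y = (F(x, y + h)[0] - F(x, y - h)[0]) / (2.0 * h)
    f2_x = (F(x + h, y)[1] - F(x - h, y)[1]) / (2.0 * h)
    f2_y = (F(x, y + h)[1] - F(x, y - h)[1]) / (2.0 * h)
    return f1_x + f2_y, f2_x - f1_y

if __name__ == "__main__":
    x, y = 0.7, -1.3
    print(div_and_curl(field, x, y))
    print(-y * math.sin(x * y) + x * math.cos(x * y),
          y * math.cos(x * y) + x * math.sin(x * y))
```

The two printed pairs should agree to several decimal places.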

Chapter 8 Review (pgs. 395-396)


1. 1/2 3. 63
5. −π 7. 1/3
9. I (Y2) = I (YJ) because the parametrizations a.re equivalent. But I (YJ) has a different value because the parametrization
is not equivalem to either of the other two.
11. 9/2
15. (a) 2π² (b) 2πt0; if t0 = 0 then the vector field is perpendicular to the path traced by g(t); and if t0 = 1 then the
field vectors point in the same direction as g′(t).
17. 2304 kg
19. Hint: Use the formula for curvature given in Exercise 13 in Section 3.
21. (a) maximum curvature when a = b, minimum curvature when a = 0 (b) maximum curvature when b = 0, no
minimum curvature

CHAPTER 9: VECTOR FIELD THEORY


Exercise Set lABC (pgs. 408-409)
1. −π 3. 0
5. 1 7. 1
9. 0
11. Hint: Use Green's Theorem and the field F = (F, G) = (−y, x).
13. (a) ∇f = (−y/(x² + y²), x/(x² + y²)), x > 0
15. Hint: Use Gauss's Theorem in the plane.
17. (a) regions not containing the origin flow into regions of the same area (b) circulation is zero (c) 2π (F is not
continuous on the interiors of circles centered at the origin so that Stokes's Theorem in the plane does not apply for
these regions; i.e., parts (b) and (c) are not contradictory)

Exercise Set 2A-D (pgs. 418-419)


1. (a) U(x, y, z) = gz (b) x(t) = (v1 t, v2 t, −½gt² + v3 t), t ≥ 0
3. not a gradient field 5. not a gradient field
9. No 11. sin xy
13. ½ ln(x² + y²) 17. ln(x² + y²)
19. f(x, y) = xe^y 21. f(x, y, z) = xy + xz + yz
23. f(x, y) = xy2 + x 2y 27. (c) -½ka.i2(b 2 - a 2 )

Exercise Set 3A-D (pgs. 429-431)

1. (a) [Figure: the surface, with the points (2, 0, 3) and (0, 0, 0) marked, and its projection onto the xy-plane.] (b) √14/2

3. (a) [Figure.] (b) [value illegible]; yes

5. (a) (6^{3/2} − 2^{3/2})/12 (b) 1

7. 4π 9. (a) −2 (b) 0
7. 4:rr 9. (a) -2 (b) 0
15. (a) Hint: S is the graph of a continuously differentiable function g(x, y) = z. Now use the result of Exercise 14 in this
section with f = g.
17. The pair of parametrizations g1(u, v) = (cos u, sin u, v), 0 ≤ u ≤ 2π, 0 ≤ v ≤ 1 (cylindrical side) and
g2(u, v) = (v cos u, v sin u, 0), 0 ≤ u ≤ 2π, 0 ≤ v ≤ 1 (bottom) are coherent.
19. The pair of parametrizations g1(u, v) = (v, u, u), 0 ≤ u ≤ 1, 0 ≤ v ≤ 1 and
g2(u, v) = (v, u − 1, 1 − u), 0 ≤ u ≤ 1, 0 ≤ v ≤ 1 are coherent and parametrize the two sides of the trough.
21. 2π
23. Change g1 as given above in the answer to Exercise 17 to f(u, v) = (cos v, sin v, u), 0 ≤ u ≤ 1, 0 ≤ v ≤ 2π
(cylinder) and keep the parametrization g2 as shown there. The resulting integral has value −2π.
27. (a) g(x, θ) = (x, f(x) cos θ, f(x) sin θ), a ≤ x ≤ b, 0 ≤ θ ≤ 2π
29. 0
31. Hint: Let n(x, y, z) be the unit normal to S at the point (x, y, z) and show that ∇f · n is identically zero on S.
Exercise Set 4AB (pgs. 437-438)
1. 2x +2y +2z 3. 0
5. 7. Hint: Use the parametrizations f(u, v) = (u cos v, u sin v, 1), 0 ≤ u ≤ 1, 0 ≤ v ≤ 2π for the top of R,
g(u, v) = (cos u, sin u, v), 0 ≤ u ≤ 2π, 0 ≤ v ≤ 1 for the cylindrical side of R, and
h(u, v) = (v cos u, v sin u, 0), 0 ≤ u ≤ 2π, 0 ≤ v ≤ 1 for the bottom of R.

[Figure: the surface S: x² + 2y² + 3z² = 1.]

9. 32π 11. 0
13. Hint: Routine calculation


15. (b) f(x, y, z) = x² (c) Δf = ∂²f/∂x1² + ⋯ + ∂²f/∂xn²
17. 0
19. Hint: Use Gauss's Theorem with the field F(x, y, z) = (x, y, z).
21. (a) Hint: Show that F(x0) · n(x0) = |F(x0)| and then use Theorem 5.2 in Chapter 1.
(b) Hint: Use part (a) with Gauss's Theorem and the formula V(S) = 4π(abc)/3.
23. Hint: A direct result of Gauss's Theorem and the definition of flux.
25. Hint: A direct result of Gauss's Theorem and the definition of flux.
27. (a) Hint: For each fixed y0 in R, let H_{y0}(x) = (y0 − x)|y0 − x|^{−3} for x not in R. Show that div H_{y0}(x) = 0 for all x
not in R. (b) Hint: Let S_a be a sphere of radius a with a large enough to enclose S, choose the standard spherical
coordinate parametrization of S_a and show that ∫_{S_a} H_{y0} · dS = −4π, then use the Surface Independence Principle on
S and S_a.

Section 5: Stokes's Theorem

Exercise Set SABC (pgs. 447-449)


1. (−2y − 1, −2z − 1, −2x − 1) 3. (0, 0, 0)

5., 7. [Figures.]

9. Parametrize the border of S by f(t) = (cos t, sin t, 0); ∫_S curl F · dS = −π.
11. (b) Border of S consists of all points on either of the two circles x² + y² = 1 and x² + y² = 4. (c) 3π
13. Hint: Use the results of Exercise 25 in Section 3 and Exercise 12 in this section.
15. Hint: Using the coordinate functions of F, compute F′(x) − [F′(x)]^t and curl F(x) × y, for x, y in ℝ³. Then use the
results of Exercise 33 in Section 4 of Chapter 2.
19. 0 21. F(x, y) = (−y/(x² + y²), x/(x² + y²))
23. (a) Hint: Using coordinate functions, find the equations that must hold in order for curl G = F
25. G(x, y, z) = (z²/2 − xy, −yz, c), c constant
27. G(x, y, z) = (−yz − 3xy, −xz, c), c constant
29. Hint: Show that curl G(x) − curl H(x) = curl(G − H)(x). Then use Theorem 5.4.
31. G(x) = ½(z² − xy, x² − yz, y² − xz)
33. G(x) = ~(-yz-3xy,3x 2 -xz,2xy)

Exercise Set 6ABC (pgs. 456-457)


1. Hint: routine computation 3. Hint: routine computation
5. Hint: routine computation 7. Hint: routine computation
9. Hint: routine computation
11. Hint: Use identity (4) in the text with f = 1/|x| and F =vx x.
13. Hint: Use Exercise 12 in this section and identity (10) in the text.
15. Hint: Use the hint in the text.
17. Hint: Use Theorem 1.3 in Chapter 8 and let the lower limit of integration go to ∞.
19. Hint: Use the chain rule to write fx, Jy and fz in terms off, . The rest is lengthy but routine computation.
21. Hint: routine (but lengthy) computation

Chapter 9 Review (pgs. 457-459)


1. (a) f(x, y) = x^3 y + y^3 5. 48π
7. (a) 0 (b) -2π
9. (a) 2(1 - a)/(x^2 + y^2)^a (b) positive circulation (counterclockwise flow) if a < 1, negative circulation (clockwise
flow) if a > 1, and no flow if a = 1. (c) circulation is positive for all a > 0 with maximum value of 2π when a = 1.
11. Hint: Use Green's Theorem with the four pairs of functions (F(x, y), G(x, y)) = (0, x), (-y, 0), (x, 0), (0, y)
13. (a) g(x, y) = (x, y, 2 + √(1 - x^2 - y^2)) (hemisphere), h(x, y) = (x, y, 2√(x^2 + y^2)) (cone), for x^2 + y^2 ≤ 1 (b) surface
area: (2 + √5)π; volume: 4π/3 (c) 4π
15. (a) Hint: Use identities (4) and (9) in Section 6. (b) 0
17. -1/3 19. 0
1
2dudv 2 J Jd =
21. (a) Hint: Use Gauss's Theorem. (b) Hint: Apply Gauss's Theorem to each of the fields (x, 0, 0), (0, y, 0), (0, 0, z).
23. A third surface S in B can always be found that intersects S_1 and S_2 only on their common border, and
parametrizations of S_1, S_2 and S can be found such that all normals are pointing in the proper directions to apply
Gauss's Theorem to the closed piecewise smooth surfaces S_1 ∪ S and S_2 ∪ S. The Surface Independence Principle can
then be used to conclude that the flux across S_1 and the flux across S_2 are both equal to the flux across S.

CHAPTER 10: FIRST-ORDER DIFFERENTIAL EQUATIONS


Section 1: Direction Fields
Exercise Set 1A (pg. 466)
1. y(x) = 3e^x - 1, -∞ < x < ∞.

3. y(x) = 6e^5 e^{-x} = 6e^{5-x}, -∞ < x < ∞.


5. y(x) = (27 - X )-1 , X ~ 7 /2.
7. y' = y/x, (x_0, y_0) = (1, 2) 9. dy/dx = y + x, (x_0, y_0) = (1, -1)
[Direction-field sketches for Exercises 7, 9, 11, 13, and 15]

17. y' = x^3 19. y' = 1/(1 + x^4) 21. y' = (1 - x^3)^{1/3}
[Direction-field sketches for Exercises 17, 19, and 21]

23. (a) Hint: Use the Fundamental Theorem of Calculus. (b) Use the hint
25. Hint: If y(x) is a solution so is y(x + a). Theorem 1.1 doesn't apply because the derivative of √(1 - y^2) isn't bounded
near y = ±1.

[Sketches of the solutions y_a(x) for a ≥ π/2 and for a ≤ -π/2]

Exercise Set 1B (pg. 469)

1. y' = sin(x - y) 3. y' = ./9 - y3 5. y' = sin(x 2 + y 2 )


[Direction-field sketches for Exercises 1, 3, and 5]

7. y' = cos(x^2 + y^2) 11. y' = e^{-y^2}
[Direction-field sketches for Exercises 7 and 11]

Section 2: Applied Integration



Exercise Set 2AB (pgs. 478-480)


1. y(x) = ½x 2 - ix 3 + 1, -oo < x < oo.
3. y(x) = -½ ln(1 - x^2) + 1, |x| < 1.
5. y(x) = -sin x + 2x + 1, -∞ < x < ∞.
7. z(t) = te^t - e^t + 2, -∞ < t < ∞. 9. x(t) = e^t, -∞ < t < ∞.
11. t_max = 5000/32 = 156.25 seconds. y_max = 390625 ft ≈ 74 miles.
13. (a) t = 1 second at an altitude of y_1(1) = -16 + 5000 = 4984 feet. (b) v_0 = 400 ft/sec.
15. -340 ft/sec, where the negative sign indicates that the weight is thrown downward.
17. y(t) = 2e^{2t}, -∞ < t < ∞. 19. y(x) = ((3/2)x^2 + (13/2))^{1/3}.

21. (a) Hint: Find dV/dt. (b) r(t) = √(1 - ½t^2). (c) 1.414 hours, about 1 hr, 25 min.
23. (a) S(t) = 150e^{-t/50}, t ≥ 0, S is decreasing, and lim_{t→∞} S(t) = 0.
(b) S(t) = 100 + 50e^{-t/100}, t ≥ 0, S(t) is decreasing, lim_{t→∞} S(t) = 100.
25. k = 0.
27. (a) Hint: Consider the product of the two slopes. (b) y(x) = C_1 x, x ≠ 0; x^2 + y^2 = C_2, y ≠ 0, where
C_2 = 2K_2 > 0.
[Figures: direction field for dy/dx = y/x; direction field for dy/dx = -x/y]

29. (a) GM = 96000 mi^3/sec^2. (b) About 6.93 mi/sec ≈ 24900 mi/hr. (c) About 6.20 mi/sec ≈ 22300 mi/hr.
31. (a) Use the hint. (b) Routine calculation. (c) Routine calculation.
33. (a) For bt + c > 0, b = -(a/A)√(g/2). The constants c = -√(h(0)), b = (a/A)√(g/2) also work. (b) Letting
h(0) = h_0, t_e = (A/a)√(2h_0/g). If A = 25π sq ft (tank is 10 feet in diameter), a = π/16 sq ft (hole diameter is 6 in.),
h_0 = 20 feet, and g = 32.2 ft/sec^2 then t_e ≈ 446 sec ≈ 7.4 minutes.
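
The arithmetic in Exercise 33(b) can be reproduced with a short script. This is only a sketch that evaluates the emptying-time formula t_e = (A/a)√(2h_0/g) with the numbers quoted in the answer; the function name `drain_time` is made up for illustration and is not from the text.

```python
import math

def drain_time(tank_area, hole_area, h0, g=32.2):
    """Emptying time t_e = (A/a) * sqrt(2*h0/g), in seconds.

    tank_area and hole_area are in sq ft, h0 in ft, g in ft/sec^2.
    """
    return (tank_area / hole_area) * math.sqrt(2.0 * h0 / g)

A = 25 * math.pi   # cross-section of a tank 10 ft in diameter
a = math.pi / 16   # area of a hole 6 in (0.5 ft) in diameter
t_e = drain_time(A, a, h0=20.0)

print(round(t_e), "sec, about", round(t_e / 60, 1), "minutes")
```

Since A/a = 400 here, the result depends only on √(2h_0/g), which is why a modest change in g barely moves the answer.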

Section 3: Linear Equations


Exercise Set 3AB (pgs. 487-488)
1. M(x) = e^{2x}. 3. M(x) = x^2.
5. s(t) = 1 - e^{-t^2/2}. 7. y(x) = 0.
9. S(t) = -40- 1.fi°e-21 + t~fl <:.1/20_
11. Find M(t).

15. (a) k = (1/15) ln 4 ≈ 0.0924. (b) 22.5 min.


17. Routine calculation.
19. (a) S(t) = 10100 - 100t - 10090e^{-t/100}, 0 ≤ t ≤ 1; S(1) ≈ 10.397 pounds. (b) About 73.2 hours.
21. S(10) ≈ 14.35 pounds.
23. (a) S(50) = 155 pounds. (b) About 29.4 min

Chapter 10 Review (pgs. 488-489)


1. y(.\) = f + f. Note that x # 0 if C # 0, but that x can take any value if C = 0.
3. y(t) = e12;2 J e,-,2;2dt + Ce'212.
5. 4I ln(y 4 + 1) = e-' + C, C:::: 0 or y = (Ke 4e' - 1) 1/4 , K::::: 1.
7. y(x) = 4 + Cx^3 e^{-2x}, derived assuming x ≠ 0, but the solution is valid for all x.
9. x(t) = ½t 3 - -ffi, t > 0.
11. x(t) = 1/(3t + C) and x(t) = 0.
13. x(t) = t + tan(t + C).
15. (a) Increasing over the xy-plane. (b) x < y, above the line y = x. (c) Yes. (d) Yes.
(e) y(x) = ln(e^x + C) = x + ln(1 + Ce^{-x}) gives the same conclusions but with more work.
17. (a) Isoclines are y = (c - k)/a.

[Direction-field sketch for Exercise 17]

19. If S_0 = 0 or S_0 = 50, S(t) = 200 - 200e^{-t/100} or S(t) = 200 - 150e^{-t/100}.



CHAPTER 11: SECOND-ORDER EQUATIONS


Exercise Set 1AB (pgs. 498-499)
1. -e^{-2x} 3. 27e^{3x}

5. -2sinx

7. Char. eq.: r^2 + r - 6 = 0; roots: r_1 = 2, r_2 = -3; general solution: y(x) = c_1e^{2x} + c_2e^{-3x}; particular solution:

y(x) = -e2x + -e- x 3


5 5
9. Char. eq.: r^2 + 2r + 1 = 0; roots: r_1 = r_2 = -1; general solution: y(x) = c_1e^{-x} + c_2xe^{-x}; particular solution:
y(x) = e^{-x} + 3xe^{-x}

11. Char. eq.: r^2 - r = 0; roots: r_1 = 0, r_2 = 1; general solution: y(x) = c_1 + c_2e^x; particular solution:
y(x) = (e - e^x)/(e - 1)

13. Char. eq.: 2r^2 - 3r + 1 = 0; roots: r_1 = 1, r_2 = 1/2; general solution: y(x) = c_1e^x + c_2e^{x/2}; particular solution:
y(x) = 0
15. y'' + 2y' + y = 0, y(x) = c_1e^{-x} + c_2xe^{-x}; [graph of y(x) = xe^{-x}]
17. y'' = 0, y(x) = c_1 + c_2x; [graph of y(x) = 1 + x]
19. y'' + 2y' + y = 0, y(x) = c_1e^{-x} + c_2xe^{-x}; [graph of y(x) = xe^{-x} - e^{-x}]

21. (D^2 + 2D + 1)y = 0; D^2 + 2D + 1 = (D + 1)(D + 1)

23. (2D^2 - 1)y = 0; 2D^2 - 1 = 2(D - 1/√2)(D + 1/√2)

25. D^2 y = 0; D^2 = DD

27. z = c (c constant); y(x) = c_1 + c_2e^{3x}

29. z = -1 + ce^x (c constant); y(x) = -1 + c_1e^x + c_2e^{-x}

31. (b) y(x) = c_1x + c_2/x (c) Hint: Apply each operator to the nonzero constant function y(x) = c.
(d) y(x) = c_1 ln |x| + c_2

33. (b) Hint: Write cosh βx and sinh βx in terms of exponential functions.

35. y(x) = e^x - 2xe^x, y(x) = e^x - xe^x, y(x) = e^x
[Graphs of the three solutions]

37. (a) Hint: Write Newton's equation F = ma in two ways and equate the results. (b) g
(c) y(t) = y_0 cosh(√(g/l) t) + v_0 √(l/g) sinh(√(g/l) t) (d) Hint: Set v_0 = 0 in the solution in part (c) and use an inverse
hyperbolic function to solve y(t) = l for t.
39. (b) Hint: Find the roots of the characteristic equation. (c) Hint: Use l'Hôpital's rule.

Exercise Set 2A (pgs. 506-507)


1. x = π/2 3. x = -π/4
5. c_1 = 1/2, c_2 = 1/2 7. c_1 = -iπ/2, c_2 = iπ/2
9. (a) Hint: Use the definition of the complex exponential and the periodicity of the sine and cosine functions.
(b) p = 2π/β
11. y(x) = 1 + c_1 cos x + c_2 sin x 13. y(x) = c_1 cos √2 x + c_2 sin √2 x
15. roots: r_{1,2} = ±i√2; general solution: y(x) = c_1 cos √2 x + c_2 sin √2 x; c_1 = 0, c_2 = 1/√2
17. roots: r_{1,2} = 1 ± i; general solution: y(x) = e^x(c_1 cos x + c_2 sin x); c_1 = c_2 = 0
19. roots: r_1 = 1/2, r_2 = -1; general solution: y(x) = c_1e^{x/2} + c_2e^{-x}; c_1 = 4/3, c_2 = -4/3
21. roots: r_{1,2} = -1/4 ± i√7/4; general solution: y(x) = e^{-x/4}(c_1 cos (√7/4)x + c_2 sin (√7/4)x); c_1 = c_2 = 0
23. y'' + 4y = 0 25. y'' - 2y' + 5y = 0
27 . .)'
11
i
+ y =0 29. y" + 9y =0

33. Hint: Treat the cases of unequal roots and of equal roots separately.
35. (a) Hint: Use the identity sin(a + b) = sin a cos b + cos a sin b
(b) A = 2, θ = π/6 (c) Hint: Use the cofunction identity sin(π/2 - a) = cos a.

y(x) = cos 2x + √3 sin 2x = 2 sin(2x + π/6)

37. Hint: Break the integral into its real and imaginary parts.
39. y(x) = c_1e^{(1+i)x/√2} + c_2e^{-(1+i)x/√2}, c_1 and c_2 are real or complex constants.
41. Hint: routine computation
43. Hint: routine computation
45. (a) Hint: Use the chain rule to find d^2y_c/dx^2 and dy_c/dx. (b) Hint: Use the chain rule to find d^k y_c/dx^k for
k = 1, ..., n. (c) Hint: Use the fact that y(x + c) and y'(x + c) are differentiable on a - c < x < b - c.

47. Hint: For β ≠ 0, show that if the given functions are not linearly independent on some open interval I then tan βx is
constant on some open subinterval of I. If β = 0 then the given functions are not linearly independent.
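
The amplitude-phase form in Exercise 35 of this set (A = 2, θ = π/6 for cos 2x + √3 sin 2x) can be checked numerically. The sketch below assumes the convention c_1 cos u + c_2 sin u = A sin(u + θ), which follows from the identity sin(a + b) = sin a cos b + cos a sin b cited in the hint; the helper name `amplitude_phase` is hypothetical.

```python
import math

def amplitude_phase(c1, c2):
    """Return (A, theta) such that c1*cos(u) + c2*sin(u) = A*sin(u + theta).

    Expanding A*sin(u + theta) gives A*sin(theta) = c1 and A*cos(theta) = c2.
    """
    A = math.hypot(c1, c2)
    theta = math.atan2(c1, c2)
    return A, theta

A, theta = amplitude_phase(1.0, math.sqrt(3.0))

# Largest discrepancy between the two forms at a few sample points.
samples = [0.1 * k for k in range(-30, 31)]
max_gap = max(
    abs(math.cos(2 * x) + math.sqrt(3.0) * math.sin(2 * x) - A * math.sin(2 * x + theta))
    for x in samples
)
print(A, theta, max_gap)
```

Using `atan2` rather than a bare arctangent keeps θ in the correct quadrant when c_2 is negative or zero.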
Exercise Set 2BC (pgs. 512-513)

1. y(x) = c_1e^{-x} + c_2e^{x/2} cos (√3/2)x + c_3e^{x/2} sin (√3/2)x
3. y(x) = c_1 + c_2e^{√2 x} + c_3e^{-√2 x} 5. y(x) = c_1 + c_2e^{4x} + c_3e^{-4x}
7. y(x) = c_1e^x + c_2e^{-x} + c_3e^{2x} + c_4e^{-2x} 9. y(x) = c_1 + c_2x + c_3e^x
11. y' - 5y = 0 13. y''' = 0
15. y''' - y'' = 0 17. y^(9) - y^(6) = 0
19. y'' + 16y = 0 21. y^(6) + 48y^(4) + 768y'' + 4096y = 0
23. y^(4) - 4y''' + 8y'' - 8y' + 4y = 0
25. y^(6) = 0 27. y'' + 2y' + 2y = 0
29. y''' - y'' = 0 31. y''' - y'' + y' - y = 0
33. y''' + y'' + 4y' + 4y = 0 35. y''' + 9y = 0
37. y''' - 2y'' + 5y' = 0 39. c_1 = 0, c_2 = c_3 = 1
41. c_1 = 5/2, c_2 = -5/2, c_3 = -1/2 43. y(x) = c_1e^x + c_2e^{-x} + c_3 cos x + c_4 sin x
45. y(x) = c_1 + c_2e^{2^{1/3}x} + e^{-2^{-2/3}x}(c_3 cos(2^{-2/3}√3 x) + c_4 sin(2^{-2/3}√3 x))
47. y(x) = c_1e^{2x} + c_2e^{-2x} + c_3e^x + c_4e^{-x}
49. Any constant-coefficient linear equation having y(x) = cos x + sin 2x as a solution must have a characteristic equation
having the four roots ±i and ±2i, so the order of the D.E. must be at least four. The given function is a solution of
y^(4) + 5y'' + 4y = 0.
51. (a) y(x) = c_1 + c_2x + c_3x^2 + c_4x^3 - (1/24)Px^4, 0 ≤ x ≤ L (b) maximum downward vertical deflection:
≈ -0.005416122PL^4
(c) P = 0.02 and L = 10 feet.
[Graph of y(x) = -(1/48)P(3L^2x^2 - 5Lx^3 + 2x^4), 0 ≤ x ≤ L, with x = 0.58L, the inflection point, and the maximum downward vertical deflection marked]
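
The deflection figure quoted in Exercise 51(b) can be recovered from the curve y(x) = -(1/48)P(3L^2x^2 - 5Lx^3 + 2x^4) shown with that answer. The sketch below works with the scaled variable s = x/L, so that y/(PL^4) depends on s alone; the names `deflection`, `s_star`, and `y_min` are illustrative choices, not notation from the text.

```python
def deflection(s):
    """Scaled deflection y/(P*L**4) at x = s*L, for 0 <= s <= 1."""
    return -(3 * s**2 - 5 * s**3 + 2 * s**4) / 48.0

# dy/ds = 0 gives s*(8*s**2 - 15*s + 6) = 0; the root of the quadratic
# factor that lies inside (0, 1) is the point of largest downward deflection.
s_star = (15 - 33 ** 0.5) / 16
y_min = deflection(s_star)

# Coarse grid scan as an independent sanity check on the critical point.
grid_min = min(deflection(k / 1000) for k in range(1001))

print(s_star, y_min, grid_min)
```

The interior critical point falls near s = 0.58, which agrees with the x = 0.58L label in the figure.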

Exercise Set 3AB (pgs. 520-521)


l2x I X I2x
= ciex + c2e-x + =-
l. gen. sol.: y(x)
3
e , part. sol.: y(x)
3e- + 3e
3. gen. sol.: y(x) = cte-x + c2xe-x + 4I c, part. sol.: y(x) = -~i:e-
I
+ I xe- + e
2 4
X X X

I 3x 3 x IC
5. gen. sol.: y(X} = qex + c2e-x + xc - X, part. SO}.: y(x) = e - e- + x - X
2 4 4 2
I 1
7. gen. sol.: y(x) = c1 cosx + c2 sinx + x sinx, part. sol.: y(x) = sinx + x sinx
2 2
. 1 12.
9. gen. sol.: y(x) = c1cosx + c2 smx + -xcosx +
4
-x smx, part. sol.: y(x)
4
= -43.smx + 4-xcosx
1 12 .
+ -x smx
4
11. y(x) = ~xex + ctC + c2e-2x 13. y(x) =- ~ieix + c1ix + c2e-ix

15. y" - 3y' + 2y = 0 17. y" = 0


19. y^(4) + 18y'' + 81y = 0 21. y^(4) - 4y''' + 8y'' - 8y' + 4y = 0
23. y_p(x) = A cos x + B sin x 25. y_p(x) = Axe^x
27. y_p(x) = Ax^2e^{-x} + Bx^3e^{-x}
29. gen. sol.: y(x) = c_1e^{-2x} + c_2xe^{-2x} - 3/4 + (3/4)x, part. sol.: y(x) = (3/4)e^{-2x} + (7/4)xe^{-2x} - 3/4 + (3/4)x
[Graph of y(x) = (3/4)e^{-2x} + (7/4)xe^{-2x} - 3/4 + (3/4)x]

31. gen. sol.: y(x) = c_1e^{-x} cos x + c_2e^{-x} sin x + (1/5)e^x, part. sol.: y(x) = -(1/5)e^{-x} cos x + (3/5)e^{-x} sin x + (1/5)e^x

33. y_p(x) = Ax^2e^{2x} + Bxe^{2x} 35. y_p(x) = Axe^{2x} + Bx^2e^{2x} + Cxe^{3x}

37. y_p(x) = Axe^{2x} + B cos x + C sin x 39. y_p(x) = Ax + Bx^2 + Cx^3 + Dxe^x
41. y_p(x) = Ax^3 + Bx^4 + Cx^5 + Dx^6 45. (b) lim_{k→0+} ẏ(t) = v_0 + gt
Exercise Set 3CD (pgs. 528-530)

1. y_1(x) = e^{2x}, u(x) = e^{-x} + c_1x + c_2 3. y_1(x) = x, u(x) = (1/3)x^3 + c_1x^2 + c_2
5. y_1(x) = e^x, y_2(x) = e^{-2x}, y_p(x) = ¼e^{2x}, y(x) = c_1e^x + c_2e^{-2x} + ¼e^{2x}
7. y_1 = cos x, y_2 = sin x, y_p(x) = cos x ln(cos x) + x sin x, y(x) = c_1 cos x + c_2 sin x + x sin x + cos x ln(cos x)
9. y_1 = 1, y_2 = x, y_p(x) = x^2e^x - 4xe^x + 6e^x, y(x) = c_1 + c_2x + x^2e^x - 4xe^x + 6e^x
ll. yp(x)=- 1x 2e·r "
13. )'p(x)=e-~<(a+e-<)(-l+ ·
lnJa+e·'J)
2
15. homo. sols.: y_1(x) = e^{-x}, y_2(x) = e^{-2x}; G(x, t) = e^{-(x-t)} - e^{-2(x-t)}; part. sol.:
y(x) = 4e^{-x} - 3e^{-2x} + { 0 for x < 1; ½ - e^{1-x} + ½e^{2(1-x)} for 1 ≤ x }
17. homo. sols.: y_1(x) = 1 and y_2(x) = ln x; G(x, t) = t ln x - t ln t; part. sol.: y(x) = ln x + x - 1
19. Hint: Use the hint in the text.
21. (i) G(x, t) = (1/(r_1 - r_2))(e^{r_1(x-t)} - e^{r_2(x-t)}), (ii) G(x, t) = (x - t)e^{r(x-t)}, (iii) G(x, t) = (1/β)e^{α(x-t)} sin β(x - t)
l 2 2 l 3
23. Yp(X) = -2XoX - XQX + 2x ' x/xo > 0
25. y_p(x) = (x_0 - 1)e^x + (e^{-x_0})e^{2x} - xe^x
27. Hint: Use the hint in the text.
29. y(x) = c_1x + c_2x^{-1}, x > 0
31. y(x) = c_1x^{-1} + c_2x^{-1} ln x, x > 0
33. (a) Hint: Expand the given determinant about the first column.
(b) y'' - (2 cot x)y' + (2 cot^2 x + 1)y = 0, 0 < x < π (c) (x - 1)y'' - xy' + y = 0
Exercise Set 4A-C (pgs. 531-540)
1. (a) critically damped (b) x(t) = c_1e^{-t} + c_2te^{-t}
3. (a) harmonic (b) x(t) = c_1 cos 3t + c_2 sin 3t
5. (a) underdamped (b) x(t) = e^{-at/2}(c_1 cos (a√3/2)t + c_2 sin (a√3/2)t)

7. c_1 = 0, c_2 = 1/2; ẍ + 4x = 0; [graph of x(t) = (1/2) sin 2t]
9. c_1 = -1, c_2 = 0; ẍ + 4ẋ + 4x = 0; [graph of x(t) = -te^{-2t}]
11. c_1 = -5/2, c_2 = 3/2; ẍ + 6ẋ + 8x = 0; [graph of x(t) = -(5/2)e^{-2t} + (3/2)e^{-4t}]

l 3 . O
13. xp(t)= cm.t+
10 10 smt, t:=:::5. l
15. (a) h = 6 lbs/ft (b) h ≈ 8.9 kg/m (c) 80 lbs (d) Hooke's Law is not valid over this range of compression.
17. (a) k^2 < 8h (b) m > 1/4 (c) k = √3 (d) 0 < k < 2
19. A = √13; frequency = 1/(2π); φ = arctan(3/2) ≈ 0.9828 radians
21. A = √5; frequency = 1/2; φ = arctan(1/2) ≈ 0.4636 radians
23. π/6 radians 25. π/4 radians
29. (a) k = 2mV_0/(eE), h = mV_0^2/(eE)^2 (b) Hint: By part (a), k/m and h/m depend only on the parameters V_0 and E.
31. (a) Hint: Show that the steady-state response is of the form x_p(t) = At cos ω_0t + Bt sin ω_0t (b), (c) Hint: Show that
the amplitude of the steady-state response is |a_0|/√((h - ω^2m)^2 + ω^2k^2)
33. (a) Hint: Show that the real parts of the roots of the characteristic equation are negative. (b) Hint: See Example 5
in the text. (c) Hint: Write the general solution as the sum of the transient solution and the steady-state solution and
then use the triangle inequality.

II

35. (a) Hint: Straighforward substitution (b) xp(t) = ~ t sin2t + L 4


: \2 cos 2t (n ~ 2), Xp(t) =~ (n = 0),
k=0.k;t2

Xp(t) = ao
4
a1
+ 3 cost (n = 1)
37. (a) (α - β)/ω (b) α - β - π/2 39. m_0 = 9/8 (no oscillation for m = 9/8)
41. m = 1/26 and k = 1/13 43. No such m exists.

Exercise Set 5 (pgs. 547-548)


4s 2 2 1
5
• (s2 + 4)2 7 -+---
• s3
s2 s
2 1 1 I 1 I
9. ----,-+--
2
11. -e - -e-
(s - 3) s- 3 2 2
13. ~ e21 sin 3t 15. t sin 2t
3 2 3
17. y(t) = -1 - t + 3e 1
19. y(t) = cos 2t +
13 sin 2t - e- 3'
13 13
2 152
21. y(t) = - -374 cos 3t -
- sin 3t - 4 + -e'l2
111 37
23. (a), (b) Hint: Apply the definition of Laplace transform. 1
(c) (d) y(t) = 1 + u - a) 2 H (t - a).
2
y

H(t)-H(t-1)

25. (b) Hint: Use induction and the result of part (a).

Exercise Set 6 (pgs. 552-553)


1. t - I+ e- 1 3. t
1 2
S. -1 + t + e- 1
7. (t-2) H (t-2)
2
9. -1 - -e
9 9
1 -31 1
- -te -31
3
11. e_, sin t
7 I 13 I 1
13. H (t-1)+1 15. y(t) = - 1 +-e + -e- - -sin2t
IO 10 5
1 2
17. v(t)
-
= 5 - t + -t
2
- 3e- 1 - te- 1
,./5
19. y(t) = -1 +2e- 12 cosh 1
t
2
21. (a) Hint: Use induction on n ≥ 0. (b) Use Theorem 6.1
23. (a),(b),(c) Hint: Use the hints in the text.

Exercise Set 7 AB (pgs. 557-558)


1. y(t) = ln t, t > 0 3. y(t) = ln(t + 1), t > -1
1 IS 3
5. y(t)=
16
t + Int+
4
4
16
, t>0 7. _y(t) = e', -oo < t < oo

9. y = 2 - √(1 - 2t), t < 1/2 11. y(t) = ln(cosh t), -∞ < t < ∞

13. Both y_1(t) = -½t^2 and y(t) = 0 are solutions. This does not contradict Theorem 7.1 because, in this case, the function
f in the theorem is f(t, y, ẏ) = -t^{-2}ẏ^2, which is not continuous at t = 0.
15. (a) y_1(t) = sin t (for ÿ = -y), y_2(t) = sinh t (for ÿ = y)
(b) (c) y_2(it) = iy_1(t)

l
17. (b) Hint: Show that (c) Hint: Show that oscillatory motion cannot occur if
2y2(t) is always positive.
1 2
-cos yo+
2z0 = 1
Exercise Set 7C (pg. 561)

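1. y(t) = 1 + cos t; phase curve (y - 1)^2 + z^2 = 1
[Graph of y(t) = 1 + cos t and its phase curve in the yz-plane, with t = 0 marked]

3. y(t) = ½t^2; phase curve y = (1/2)z^2
[Graph of y(t) = (1/2)t^2 and its phase curve in the yz-plane, with t = 0 marked]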

5. y(t) = 2
[Graph of y(t) = 2; the phase portrait is the point (2, 0)]

7. (y - 1)^2 + z^2 = C, C ≥ 0 9. y = ½z^2 + C
[Phase curves for Exercises 7 and 9]

11. y^2 + z^2 = C, C ≥ 0 13. z^2 - y^4 = C

15. (a) Equilibrium solutions are the points on the y-axis.
(b) Since all solutions with non-zero initial velocity move away from the origin, all equilibrium points are unstable.
[Phase-plane sketch: lines z = y + c, with the cases z_0 > 0 and z_0 < 0 marked]

Exercise Set 8AB (pgs. 566-568)

1. (a) 0.00, 2.68, 4.35, 5.75, 6.99, 8.14, 9.20, 10.22, 11.17;
the zeros do not change as long as a ≠ 0 (a = 0 gives the zero solution)
(b) If a ≠ 0 then y(t) ≠ 0 for all t < 0.
3. The zeros of sin t are t ≈ 3.14k, k an integer. The zeros of the solution
of ÿ = -sin y, y(0) = 0, ẏ(0) = 1 are t ≈ 3.34k, k an integer.
5. (a) (0.00, 0.00), (0.25, 0.24), (0.50, 0.46), (0.75, 0.63), (1.00, 0.75), (1.25, 0.82), (1.50, 0.86),
(1.75, 0.89), (2.00, 0.93), (2.25, 0.98), (2.50, 1.05), (2.75, 1.17), (3.00, 1.34), (3.25, 1.58),
(3.50, 1.93), (3.75, 2.44), (4.00, 3.15), (4.25, 4.16), (4.50, 5.54), (4.75, 7.39), (5.00, 9.74),
(5.25, 12.52), (5.50, 15.51), (5.75, 18.33), (6.00, 20.50), (6.25, 21.58)
(b) (0.00, 0.000), (0.05, 0.049), (0.10, 0.095), (0.15, 0.139), (0.20, 0.181), (0.25, 0.220),
(0.30, 0.257), (0.35, 0.292), (0.40, 0.325), (0.45, 0.356), (0.50, 0.385), (0.55, 0.412),
(0.60, 0.437), (0.65, 0.461), (0.70, 0.483), (0.75, 0.503), (0.80, 0.521), (0.85, 0.538),
(0.90, 0.553), (0.95, 0.567), (1.00, 0.579)
7. (a) [Graph omitted] (b) k ≈ 0.6

9. i(O) ~ 3.67
11. (a) θ(1.24) = 0.1572, θ(6.2) = 0.154875, θ(11.16) = 0.152584 (b) 0, 2.48, 4.96, 7.44, 9.92, 12.40, 14.88
(c) 0, 3.1, 6.18, 9.25, 12.3
13. (a) k(t) = 0.2(1 - e^{-0.1t}), h = 5, m = 1 (b) k = 0, h(t) = 5(1 - e^{-0.2t}), m = 1
[Graphs of x for parts (a) and (b)]

15. y=-y-i-½Y+foCOSt, y(0)=0,j,(0)=3,0~t~40


[Phase-plane sketch for Exercise 15]

17. [Plots for μ = 0, μ = 0.5, and μ = 1]

Chapter 11 Review (pgs. 568-570)

l. y(x) = cie-x + c2xe-x + ~x 2e-x + ~ex


3. y(x) = ciex + c2e-x + -1 sinx
2

5. y(x) = c1e-x cos(J21) + c2e-x sin(ht) + -1


3
2
7. y(x) = 2~ x 4 + c;x + c2x + c3
I 1
9. y(x) = cos3x - xcos3x + sin3x
6 18
11. y(x) = -cosx + c2 sinx
13. {e^{r_1x}, e^{αx} cos βx, e^{αx} sin βx}, {e^{r_1x}, e^{r_2x}, e^{r_3x}}, {e^{r_1x}, xe^{r_1x}, e^{r_2x}}, {e^{r_1x}, xe^{r_1x}, x^2e^{r_1x}}, where α, β are real (β ≠ 0) and
r_1, r_2, r_3 are different real numbers.
17. Hint: If k is a complex constant then e^{kx} ≠ 0 for all real x and e^{kx} is constant on an open interval of the real line if,
and only if, k = 0.
19. A= 1/R, a =rr/2

21. (b) Hint: e±x > 0 for all real x. (c) x = ~ ln(3/2) + i(2n + l)rr/2, nan integer.
25. (a) Hint: Expand (D - r)(D - (r + h))y = 0.
(b) Hint: Find the roots of the characteristic equation.
(c) Hint: Use the definition of d(e^{rx})/dr as the limit of a quotient.
27. z = y_1 + 2y_2 - 4y_3 29. z = 0
31. yes 33. no
35. yes 37. no

CHAPTER 12: INTRODUCTION TO SYSTEMS


Section 1: Vector Fields
Exercise Set 1ABC (pgs. 578-580)
1. (x(t), y(t)) = (-1 + 2e^t, 2e^t). 3. (x(t), y(t), z(t)) = (0, e^{t/2}, -e^{t/3}).

.5. (a) F(t, x) = (r, y). (b) ldx/dyl = ltl.


7. The vector field F(x, y) = (x + 1, y). 9. The vector field F(x, y, z) = (x, y/2, z/3).
[Sketches of the vector fields for Exercises 7, 9, and 11]

13. Use the hint.


IS. dy/dt = z, d:./d1 =-,2 - : 2 + e 1

17. dy/dt = z, dz/dt = x, dw/dt = w 2 - yz - t.


19. dx/dt = (t + y)/2, dy/dt = (t - y)/2.
21. dx/dt = x - 3y + t, dy/dt = -3x + y - t.
23. [Sketch omitted]

25. (a) Simple substitution. (b) x = c_1e^{3t} + c_2e^{-t}, y = 2c_1e^{3t} - 2c_2e^{-t}.

27. y = ln(e^x + C).
[Graph of y = ln(e^x + C)]

29. x = ½y^2 + C.

31. (a) x(t) = (z_0/k)(1 - e^{-kt}), y(t) = ((kw_0 + g)/k^2)(1 - e^{-kt}) - (g/k)t.
(b) Hint: Solve y(t) = 0. (c) Routine calculation. (d) Routine calculation.
33. (a) Hint: Use the differential equations. (b) Hint: p ≠ 0. (c) Hint: Use part (b). (d) t_0 = ln 2/(p + q).
Exercise Set 1D (pgs. 584-585)
1. Domain is t < 1/a. 3. No.
5. (a) Use Theorem 1.6. (b) H(x, y) = -(1/2)(x^2 + y^2) (c) Show H is constant on flow lines.
7. Use Theorem 1.6. 9. (a) Routine. (b) Fairly complicated.
11. (a) Routine. (b) (1/t)∫_0^t div F(T_u(x)) du.

Section 2: Linear Systems


Exercise Set 2AB (pgs. 592-594)
1. Nonlinear. 3. Linear.
5. x(t) = cosh 2t + 3 sinh 2t, y(t) = -2 sinh 2t.
7. Hint: Try x = at + b, y = ct + d.
9. y(t) = (1/3)e^{-t} - 2e^t + (2/3)e^{2t}.
(I t//3) ( /3) 2
2t t
11. A(t) = O 3 , b(t) = -t2/ 3 .
13. (a) Routine calculation. (b) Matrix must be square. (c) Suppose b(t) ≠ 0.
15. Nonlinear. 17. Nonlinear.
19. (x(t), y(t)) = (-c_1 sin t + c_2 cos t - 1 - t, c_1 cos t + c_2 sin t + 1 - t).
21. dx/dt = -x + 2y, dy/dt = x - y.

23. dx/dt = ½(sin t + cos t), dy/dt = ½(sin t - cos t).

25. x(t) = 1, y(t) = 0.
27. x(t) = }e' - ¼e ,q., ((I + ./2) cos {l-t - sin {!- t) - ¼e-4 1 ( (1 - ./2) cos .J,}:-t + sin ft),

y(t) = -½e' + ¼e4 1 (cos .J,}:-t + (l + ./2) sin .J,/:-t) + ¼e-4 1 (cos {l-t + (-1 + ./2) sin /,/-1).
29. ẋ = u, ẏ = v, u̇ = v, v̇ = -u.
31. ẋ = u, ẏ = v, u̇ = y + e^t, v̇ = -x.
33. x(t) = 0, y(t) = C.
35. x(t) = ce^{t/2}, y(t) = ce^{t/2}.
37. (a) x(t) = cie..fil + c2e-..fi.1, y(t) = c3 cos ..fit+ c4 sin ..fit.
z(l)= ..fi. (CJ e..fi., - c2e-..fil - c3 sin ..fit + C4 cos ..fit).
w(t) = {1 (c1e..fil - c2e-..fi.1 + c3 sin ..fit - qcos ..fit).
(b) z= 2w, w = 2z.
Section 3: Applications
Exercise Set 3 (pgs. 601-606)
1. (a) dy/dt = (4/100)z - (4/100)y, dz/dt = (1/100)y - (4/100)z.
(b) y(t) = 25e^{-t/50} - 15e^{-3t/50}, z(t) = 12.5e^{-t/50} + 7.5e^{-3t/50}.
(c) y_max ≈ y(15) ≈ 12. From t = 15 both decrease to zero.

3. (a) Follows from (b). (b) ( ! -~ \( ~; ) = ( ;~ ).


( ! -1 )(~: )= ( :~ ). C
:,

Hint: y(t) = 0 is the unique solution of one equation.


0
5. (a) Q.

(b) The planet will collide with the star.


7. (a) For 0 ≤ t ≤ 50, x(t) + y(t) = t.
(b) ẋ = y/50 - x/(50 + t), x(0) = 0, ẏ = 1 + x/(50 + t) - y/50, y(0) = 0.
(c) ẋ = y/50 - x/50, ẏ = 1 + x/100 - y/50, t ≥ 50.

9. (a) t = 100. (b) ẋ = y/(100 + t) - 2x/50, x(0) = 0, ẏ = 2x/50 - y/(100 + t), y(0) = 10.
(c) ẋ + (1/25 + 1/(100 + t))x = 10/(100 + t), x(0) = 0. (d) x(t) = (250/(100 + t))(1 - e^{-t/25}),
y(t) = 10 - (250/(100 + t))(1 - e^{-t/25}).
11. μ_1 = √((5 - √5)/2) ≈ 1.1756 and μ_2 = √((5 + √5)/2) ≈ 1.9021.
13. μ_1 = √((8 - √2)/2) ≈ 1.8146 and μ_2 = √((8 + √2)/2) ≈ 2.1696.
15. (a) U(x, y) = (1/2)(k_1 + k_2)x^2 - k_2xy + (1/2)(k_2 + k_3)y^2.
(b) Hint: Multiply m_1ẍ = -U_x by ẋ, and m_2ÿ = -U_y by ẏ, then add.
17. Hint: Equate two expressions for the acceleration at the surface.
19. v_e ≈ 1.1086 × 10^4 m/s. 21. Hint: Integrate ẍ = -g.
23. (a) Use the hint. (b) Hint: G(m1r 2>
x0
= !:.
XO
(c) Ve= ..fiv1. (d) About 27.28 days.

25. (a) ω^2 = k/a^3. (b) Hint: aω is the orbital speed. (c) Hint: Let the orbit be r = f(θ) in polar coordinates.
27. (a) Use the hint. (b) Use what worked in part (a).
29. (a) Routine arithmetic. (b) Hint: Differentiate xẏ - yẋ. (c) Hint: Use the chain rule. (d) Hint: For the last part
consider A(t + r) - A(t). (e) Apply Green's Theorem to a region swept out by a radius in time t.
31. Follow the steps. 33. Use the chain rule.
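
The closed forms in Exercise 1(b) of this set can be checked against the system in part (a) by comparing a numerical derivative with the right-hand sides. In this sketch the coefficient of e^{-3t/50} in z(t) is taken to be 7.5, which is an assumption: it is the value the system in (a) requires, and the printed digit is hard to read. The function names are illustrative only.

```python
import math

def y(t):
    return 25.0 * math.exp(-t / 50) - 15.0 * math.exp(-3 * t / 50)

def z(t):
    # 7.5 is assumed here; it is the value forced by the equations in part (a).
    return 12.5 * math.exp(-t / 50) + 7.5 * math.exp(-3 * t / 50)

def residuals(t, h=1e-5):
    """Central-difference derivatives minus the right-hand sides of the system."""
    dy = (y(t + h) - y(t - h)) / (2 * h)
    dz = (z(t + h) - z(t - h)) / (2 * h)
    return (dy - (0.04 * z(t) - 0.04 * y(t)),
            dz - (0.01 * y(t) - 0.04 * z(t)))

# dy/dt = 0 reduces to exp(2t/50) = 1.8, so y peaks at t = 25*ln(1.8).
t_peak = 25.0 * math.log(1.8)
worst = max(abs(r) for t in (0.0, 5.0, 15.0, 60.0) for r in residuals(t))

print(t_peak, y(t_peak), worst)
```

The peak time and peak value come out close to the t = 15 and y ≈ 12 quoted in part (c).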

Section 4: Numerical Methods


Exercise Set 4AB (pgs. 612-615)
1. (a) & (b) Routine calculations.
(c)
t     IE x      formula x   IE y      formula y
0.0   1.00000   1.00000     2.00000   2.00000
0.1   1.20500   1.20533     2.11000   2.11017
0.2   1.42202   1.42273     2.24105   2.24146
0.3   1.65324   1.65437     2.39446   2.39519
0.4   1.90095   1.90257     2.57175   2.57289
0.5   2.16763   2.16981     2.77471   2.77634

3. (a) Routine calculation. (b) & (c) The table shows the comparison.

t     IE x      IE y      formula x   formula y
0.0   2.00000   1.00000   2.00000     1.00000
0.2   2.70993   1.46714   2.70996     1.46715
0.4   3.68521   2.10158   3.68528     2.10163
0.6   5.00851   2.96431   5.00866     2.96442
0.8   6.78615   4.13513   6.78640     4.13532
1.0   9.15444   5.71797   9.15484     5.71828

5. The improved Euler method was used with step size h = 0.001. Of the 5,000 values generated, the left-hand table
below records every 500th value.
7. The improved Euler method was used with step size h = 0.001 to compute a numerical approximation of the solution.
Of the 1500 values generated, the right-hand table below records every 150th value.

 t     x(t)        |   t      x(t)
0.0   0.500000     |  0.00   0.000000
0.5   0.591083     |  0.15   0.150008
1.0   0.662880     |  0.30   0.300243
1.5   0.720503     |  0.45   0.451852
2.0   0.767312     |  0.60   0.607861
2.5   0.805669     |  0.75   0.774373
3.0   0.837303     |  0.90   0.962467
3.5   0.863521     |  1.05   1.19204
4.0   0.885335     |  1.20   1.50098
4.5   0.903540     |  1.35   1.97104
5.0   0.918770     |  1.50   2.81982
Table for Exercise 5  |  Table for Exercise 7

9. [Figure: sketch in the xy-plane with the points (-1, 1), (1, 1), (-1, -1) and (1, -1) marked.]

11. Let $f(x, y) = x^2 + \tfrac{1}{2}y^2$.

(a) The level curves of f are the trajectories of the system

$\dot{x} = -f_y(x, y) = -y$,
$\dot{y} = f_x(x, y) = 2x$,

and are shown in the top sketch on the right. The improved Euler method was used with step size h = 0.001.
(b) Curves perpendicular to the level sets of f are the trajectories of the system

$\dot{x} = f_x(x, y) = 2x$,
$\dot{y} = f_y(x, y) = y$,

and are shown in the bottom sketch on the right.


(c) The system shown in part (b) is uncoupled. By inspection, the general solutions are
seen to be $x(t) = c_1 e^{2t}$ and $y(t) = c_2 e^{t}$. If initial conditions are given by $x(t_0) = x_0$,
$y(t_0) = y_0$, then we are led to the specific solution

$x(t) = x_0 e^{2(t - t_0)}$ and $y(t) = y_0 e^{t - t_0}$.

If $x_0 = y_0 = 0$ then the trajectory is the single point (0, 0), which corresponds to the
identically zero solution. If $x_0 = 0$ and $y_0 \neq 0$ then the trajectory consists of all positive
multiples of $y_0$, which is the positive y-axis if $y_0 > 0$ and the negative y-axis if $y_0 < 0$.
In a similar fashion, if $x_0 \neq 0$ and $y_0 = 0$ then the solution trajectory is the positive
x-axis if $x_0 > 0$ and the negative x-axis if $x_0 < 0$.
If $x_0 \neq 0$ and $y_0 \neq 0$ then note first that the corresponding trajectory stays in the same quadrant for all t. Second, the
solution equations can be written as $(1/x_0)x = e^{2(t - t_0)}$ and $(1/y_0)y = e^{t - t_0}$, so that $(1/y_0)^2 y^2 = e^{2(t - t_0)} = (1/x_0)x$, or
$x = cy^2$, where $c = x_0/y_0^2$.
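The parabola relation from part (c) of Exercise 11 can be checked numerically with the same improved Euler method the answers use. This is a minimal sketch for the uncoupled system x' = 2x, y' = y; the initial point (1, 2), the time span, and the function names are illustrative assumptions, not values from the text.

```python
import math

def improved_euler_path(field, x, y, t_end, h):
    """Improved Euler (Heun) steps for an autonomous planar system; returns the visited points."""
    points = [(x, y)]
    for _ in range(round(t_end / h)):
        fx1, fy1 = field(x, y)                        # slopes at the current point
        fx2, fy2 = field(x + h * fx1, y + h * fy1)    # slopes at the Euler-predicted point
        x += 0.5 * h * (fx1 + fx2)
        y += 0.5 * h * (fy1 + fy2)
        points.append((x, y))
    return points

def gradient_field(x, y):
    # Part (b): x' = f_x = 2x, y' = f_y = y for f(x, y) = x^2 + y^2/2.
    return 2.0 * x, y

if __name__ == "__main__":
    x0, y0 = 1.0, 2.0            # illustrative initial point (t0 = 0), not from the text
    c = x0 / y0 ** 2             # constant in x = c y^2
    path = improved_euler_path(gradient_field, x0, y0, t_end=1.0, h=0.001)
    x1, y1 = path[-1]
    print("numeric :", x1, y1)
    print("formula :", x0 * math.exp(2.0), y0 * math.exp(1.0))
    print("max |x - c y^2| along the path:", max(abs(x - c * y * y) for x, y in path))
```

The last printed value measures how far the computed points drift from the parabola; it should shrink roughly like h squared as the step size is reduced.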

13. (a) The process stops when tank 1 first becomes full at time t = 50 minutes.
.i = 3y/(100 + t) - ?x/(50 + 1), x{O) = xo,
j, = 2 + 3x/(50 + 1) - 4y/(100 + I), y(0) = YO, 0 ::': I .::: 50. 60

(b) The improved Euler method with step size h = 0.01 was used to plot the solutions.
[Figure: solution curves, one labeled tank 1; horizontal axis: time t in minutes, from 0 to 50.]

15. [Figure: three panels, (a) p = 1.9, (b) p = 2, (c) p = 2.1.]

17. (a) $\dot{x} = y$, $\dot{y} = \alpha(1 - x^2)y - x$.

(b) [Figure: three panels in the xy-plane, for α = 0.1, α = 1.0 and α = 2.0.]

(c) [Figure: two rows of three panels in the xy-plane, each row for α = 0.1, α = 1.0 and α = 2.0.]
19. The improved Euler method with step size h = 0.005 was used to plot five phase curves of the hard spring oscillator
=
equation y -yy 3 + 8y for various pairs (y, 8). There are three equilibrium solutions; namely, y = 0, y = ...fy7K and
y = -../ylK, whose phase curves are the three points (0, 0), (../yJK, 0) and (-,./y[h, 0). The results are shown below.

(a) γ = δ = 1 (b) γ = 1, δ = 2 (c) γ = 2, δ = 1
[Figure: three panels of phase curves in the yz-plane, one for each case.]

In Exercises 21, 23, and 25, the improved Euler method with step size h = 0.001 was used to plot the solution
y = y(t) of the oscillator $\ddot{y} + k(t)\dot{y} + h(t)y = \sin t$, $y(0) = 0$, $\dot{y}(0) = 1$, where the damping factor k(t) and the spring
stiffness h(t) are time dependent. The results are shown below.
21. [Figure: graph of y.] $\ddot{y} + e^{t/2}y = \sin t$, $y(0) = 0$, $\dot{y}(0) = 1$.
23. [Figure: graph of y.] $\ddot{y} + (1/10)\dot{y} + e^{t/2}y = \sin t$, $y(0) = 0$, $\dot{y}(0) = 0$.
25. [Figure: graph of y for t from 0 to 30.] $\ddot{y} + \dot{y} + (1/(1 + t^2))y = \sin t$, $y(0) = 0$, $\dot{y}(0) = 1$.
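The plots for Exercises 21, 23 and 25 come from applying the improved Euler method to a second-order equation, which first has to be rewritten as a first-order system. The sketch below shows one way to do that, using the Exercise 25 equation y'' + y' + y/(1 + t^2) = sin t with y(0) = 0, y'(0) = 1 as the example; the function name and its parameters are illustrative assumptions, and the printed values are only a coarse sample, not a reproduction of the book's figure.

```python
import math

def improved_euler_oscillator(k, stiffness, forcing, y0, v0, t_end, h):
    """Improved Euler for y'' + k(t) y' + stiffness(t) y = forcing(t), starting at t = 0.

    The equation is rewritten as the first-order system
        y' = v,   v' = forcing(t) - k(t) v - stiffness(t) y.
    Returns the lists of t values and y values.
    """
    def rhs(t, y, v):
        return v, forcing(t) - k(t) * v - stiffness(t) * y

    t, y, v = 0.0, y0, v0
    ts, ys = [t], [y]
    for _ in range(round(t_end / h)):
        dy1, dv1 = rhs(t, y, v)                            # slopes at the start of the step
        dy2, dv2 = rhs(t + h, y + h * dy1, v + h * dv1)    # slopes at the predicted end point
        y += 0.5 * h * (dy1 + dy2)
        v += 0.5 * h * (dv1 + dv2)
        t += h
        ts.append(t)
        ys.append(y)
    return ts, ys

if __name__ == "__main__":
    ts, ys = improved_euler_oscillator(
        k=lambda t: 1.0,                             # damping factor k(t) = 1
        stiffness=lambda t: 1.0 / (1.0 + t * t),     # spring stiffness h(t) = 1/(1 + t^2)
        forcing=math.sin,
        y0=0.0, v0=1.0, t_end=30.0, h=0.001)
    for i in range(0, len(ts), 5000):                # print every 5000th value
        print(f"t = {ts[i]:5.1f}   y = {ys[i]: .5f}")
```

Passing different functions for `k` and `stiffness` covers the other two exercises without changing the integrator.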

27. Consider the Lotka-Volterra system iJ = (3 - 2P)H, P = (!H - l)P.


(a) Using the improved Euler method with step size h = 0.001, various values of ti for the interval 0 ::: t :::: t 1 were
used to sketch a phase curve of the solution correponding to the initial conditions H(O) = 3, P(O) =l
It was found that the corresponding phase curve almost closed with $t_1 = 3.6$ and was closed when $t_1 = 3.7$. The "PLOT
H, P" command was then changed to "PRINT T, H, P" on the interval $0 \le t \le 3.7$. The numerical evidence
suggested that the orbit time was approximately 3.666 time units (truncated to three places).
(b) The result of part (a) shows that the graphs of H(t) and P(t) are periodic of period ≈ 3.666 time units. Using the
same program as was used in part (a), the "PRINT T, H, P" command was suppressed and the commands "PLOT
T, H" and "PLOT T, P" were simultaneously activated to plot one period of the graphs of H and P on the same
set of axes. The result is in the figure on the left.

(c) Here, the "PLOT H, P" command was used to plot the trajectory in the HP-plane of the solution found in part (a).
The result is shown below in the figure on the right. The arrows indicate the direction of the trajectory.

[Figures: graphs of H and P against time t (left); trajectory in the HP-plane (right).]

29. Set a = 3, b = d = 2, c = 1, L = 4 and M = 3 in the given refinement of the Lotka-Volterra equations.
The conditions H(0) = 3, P(0) = ½ then determine the unique solution of the initial-value problem

$\dot{H} = (3 - 2P)H(4 - H)$, $\dot{P} = (H - 2)P(3 - P)$; $H(0) = 3$, $P(0) = 1/2$.

The plot of the trajectory of the solution is shown on the right. The arrows indicate the
direction of the trajectory.
31. Set a = 3, b = d = 2, c = 1, L = 4 and M = 3 in the given refinement of the Lotka-Volterra equations.
The conditions H(0) = 3, P(0) = 2 then determine the unique solution of the initial-value problem

$\dot{H} = (3 - 2P)H(4 - H)$, $\dot{P} = (H - 2)P(3 - P)$; $H(0) = 3$, $P(0) = 2$.

The plot of the trajectory of the solution is shown on the right. The arrows indicate the
direction of the trajectory.
33. First observe that H(0) = 3 for each of the initial-value problems suggested by Exercises 29, 30, 31, 32 in this section.
Using the improved Euler method with step size h = 0.001, various values of $t_1$ for the interval $0 \le t \le t_1$ were used
to sketch a phase curve of the four solutions corresponding to the initial conditions H(0) = 3, P(0) = $P_0$, where
$P_0 = 1/2,\ 3/2,\ 2,\ 5/2$. Visually comparing the curves for various values of $t_1$ allowed us to hone in on an estimate $t_0$ of the
actual orbit time $t^*$ such that $t_0 - 0.01 < t^* < t_0$. Once $t_0$ was determined for a given $P_0$, the "PLOT H, P" command
was suppressed and the "PRINT T, H, P" command was used on the time interval $0 \le t \le t_0$. Toward the end of this
printout the columns for H(t) and P(t) contained values close to H(t) = 3 and P(t) = $P_0$. The corresponding values
of t were therefore close to $t^*$. Using this method, the values of $t^*$ for the four orbits were estimated to be

$t^* \approx 1.911$ for $P_0 = 1/2$;
$t^* \approx 1.562$ for $P_0 = 3/2$;
$t^* \approx 1.630$ for $P_0 = 2$;
$t^* \approx 1.911$ for $P_0 = 5/2$.
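The orbit-time search described in Exercise 33 can be automated instead of comparing plots by eye. The sketch below is one possible approach, not the book's program: it assumes the refined system H' = (3 - 2P)H(4 - H), P' = (H - 2)P(3 - P) from Exercise 29, steps it with the improved Euler method, and reports the first time P crosses its starting value upward again. The function names, the crossing test, and the drift check on the quantity ln(P(3 - P)) + (1/2) ln(H(4 - H)) (constant along exact solutions because the system is separable) are all additions made for illustration.

```python
import math

def field(H, P):
    # Refined Lotka-Volterra system with a = 3, b = d = 2, c = 1, L = 4, M = 3.
    return (3.0 - 2.0 * P) * H * (4.0 - H), (H - 2.0) * P * (3.0 - P)

def first_integral(H, P):
    # Constant along exact solutions with 0 < H < 4 and 0 < P < 3.
    return math.log(P * (3.0 - P)) + 0.5 * math.log(H * (4.0 - H))

def orbit_time(P0, H0=3.0, h=0.001, t_max=5.0):
    """Return (estimated orbit time, largest drift of the first integral).

    At H0 = 3 the value of P is increasing, so the orbit has closed when P
    next crosses P0 from below. The orbit time is None if no crossing occurs.
    """
    H, P, t = H0, P0, 0.0
    w0 = first_integral(H0, P0)
    drift = 0.0
    for _ in range(round(t_max / h)):
        dH1, dP1 = field(H, P)
        dH2, dP2 = field(H + h * dH1, P + h * dP1)
        H_new = H + 0.5 * h * (dH1 + dH2)
        P_new = P + 0.5 * h * (dP1 + dP2)
        if P < P0 <= P_new:
            # Linear interpolation inside the step that crosses P = P0.
            return t + h * (P0 - P) / (P_new - P), drift
        H, P, t = H_new, P_new, t + h
        drift = max(drift, abs(first_integral(H, P) - w0))
    return None, drift

if __name__ == "__main__":
    for P0 in (0.5, 1.5, 2.0, 2.5):
        period, drift = orbit_time(P0)
        print(f"P0 = {P0:3.1f}   orbit time = {period}   first-integral drift = {drift:.2e}")
```

The printed orbit times can be compared with the four estimates listed in the answer. The substitution P → 3 - P together with time reversal maps the system to itself, which is why the starting values 1/2 and 5/2 should give the same orbit time.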

Chapter 12 Review (pgs. 615-616)


1. The system is uncoupled. $(x(t), y(t)) = (\tan(t + \pi/4),\ 2e^{t})$, $-3\pi/4 < t < \pi/4$.
3. $\mathbf{x}(t) = \bigl((3/2)e^{t} + (1/2)e^{-t},\ (3/2)e^{t} - (1/2)e^{-t}\bigr)$, $-\infty < t < \infty$.

5. $(x(t), y(t)) = (e^{-5t},\ 2e^{-5t})$. 7. $(x(t), y(t)) = (e^{2t},\ 2e^{2t} - t)$.

9. With w1 = J(-5 + ./17)/2 and w:i = J(-5 - ./17)/2,

y(t)= 3 ( l+ ./17
15 ) cosw1t+ 3 ( 1- .JPj
15 ) cosw:it.
2 2
11. (x(t), y(t), z(t)) = (- sint, cost, -e1).
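13. If $\mathbf{x}_* = \mathbf{x}_*(t)$ solves $\dot{\mathbf{x}} = F(\mathbf{x})$, then $\mathbf{x}(t) = \mathbf{x}_*(-t)$ solves $\dot{\mathbf{x}} = -F(\mathbf{x})$.
15. Hint: From Chapter 5, Section 1, $\nabla f(x, y)$ is orthogonal to a level curve through (x, y).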
f~ J,:
17, Hint: Show that 1 V f • dx = 1 lx(t)l 2 dt, and use the Fundamental Theorem of Calculus for line integrals.
19. Hint: Solve the specific Hamiltonian system explicitly.

CHAPTER 13: MATRIX METHODS


Section 1: Eigenvalues & Eigenvectors

Exercise Set lAB (pgs. 624-625)


1. 2, (1, 2); 3, (3, 5).
3. -i + i:!p, (-6, 3- iv'lS);-i - i:!p, (-6, 3 + j,J'fs)
5, 1, (1, 0, 0); 2, (1, 0, 1).
1
7. -e ( ~ ) + 2e-1 ( ~ ) · 9. ( ::: ) ·

11. (al ,,(,)-",-• ( =i ) +~e (-1 )+q,' ( ~ )· (bl •, - (-~½ )


(c) x(t) - -j,-• ( =l )-2" (-1 )+2,• ( i) + ( -~;[)
13. (a) $x(t) = c_1 t e^{-t} + c_2 e^{-t}$, $y(t) = c_1 e^{-t}$. (b) -1, nonzero multiples of (1, 0).
15. $\det(A - \lambda I) = \lambda^2 + a\lambda + b$, which has the same roots as the characteristic equation.
17. (a) Hint: Uek = Uk (b) A = ( ~ =! )-
(c) Use the definition of eigenvector.
(d) $\dot{x} = -3x + 4y$, $\dot{y} = -6x + 7y$.

Section 2: Matrix Exponentials

Exercise Set 2A-C (pgs. 630-631)

-te-1 ) ( et (t + s)e1+s )
e-t = I. (b) 0 e'+• .

(c) Hint: both equal (


e1 te' + e' ). (d) ( -e' 2~,21e' ).
0 e'

9. e 21 I . 11. e'A = (' -e, + 2e-1 e, - e_, ) ·


-2e 1 + 2e- 1

13. $e^{tA} = \begin{pmatrix} e^{2t}\cos t & -e^{2t}\sin t \\ e^{2t}\sin t & e^{2t}\cos t \end{pmatrix}$.

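The answer to Exercise 13 can be cross-checked against the power series definition of the matrix exponential. In the sketch below the matrix A = [[2, -1], [1, 2]] is an inference rather than a quotation: it is the derivative of the stated exponential at t = 0, which is what A must be if the answer is correct. The helper names and the number of series terms are illustrative.

```python
import math

def mat_mul(X, Y):
    """Product of two 2-by-2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_exp(A, t, terms=30):
    """Truncated power series I + tA + (tA)^2/2! + ... for a 2-by-2 matrix A."""
    tA = [[t * a for a in row] for row in A]
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[x / n for x in row] for row in mat_mul(term, tA)]   # now (tA)^n / n!
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def closed_form(t):
    """The exponential matrix stated in the answer to Exercise 13."""
    e, c, s = math.exp(2.0 * t), math.cos(t), math.sin(t)
    return [[e * c, -e * s], [e * s, e * c]]

if __name__ == "__main__":
    A = [[2.0, -1.0], [1.0, 2.0]]    # inferred as d/dt of the closed form at t = 0
    t = 0.7                          # arbitrary sample time
    print("series     :", mat_exp(A, t))
    print("closed form:", closed_form(t))
```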
4 4
15_ e' A = ..L ( l 5e ' cosh .Jtsr 5./t5e ' sinh .Jtsr ) .
15 3Jise 41 sinh .Jtsr I5e 4' cosh Jisr

~ e2, ) .
2 -1
rA - ( 3e2r - 2e3'
11· e -
6e 2, -6e 31
-e2,
- 2e 21
+ e3' )
+ 3e 3' ·
19. e A
1
= ½( ~ e1 : e2, e'
e' - e21 e' + e2'
21. (a) Hint: In the series for $e^{itA}$, the even terms are real and the odd terms imaginary.
(b), (c) Hint: $\cos tA = \tfrac{1}{2}(e^{itA} + e^{-itA})$ and $\sin tA = -\tfrac{i}{2}(e^{itA} - e^{-itA})$.
23. Hint: Use part (b) of Exercise 21. To show that this is the most general solution, show that $c_1$ and $c_2$ can be chosen to
match any given values for $\mathbf{x}(0)$ and $\dot{\mathbf{x}}(0)$.

25. One possible example: A = ( : ~ ) , B = ( ~ : ).


eAeB =( e e2 - e ) eA+B - I ( e2 + 1 e2 - I )
c- I e2 - e + 1 ' - 1 e2 - 1 e 2 + 1 ·

Exercise Set 20 (pgs. 636-637)


21
-5e21 +6e 3' 3e -3e ' )
3
(
2
- 19e ' +2le
3
' )
l. el A =( -IOe 2' + IOe 31 6e ' - 5e 3' ; x(t)
2 = -38e 2' + 35e 3' •

3. e'A =( ~
0
';'
0
~,e~ :t~i::: );
e 21
x(t) =(
1
;' ) .
0
21 21 21
½(e + 1) ½<e - 1) ½<e - 1) O )
A e' - l(e2r + 1) e' - l(e2r + 1) _l(e21 - 1) 0
5. e' = 2 2
-e' + e 1
2 2
-e1 + e I
2 2r
e O '
(
e' + te 1 - ½<e 21 + 1) te' - ½<e21 - 1) -½(e 21 - 1) e 1 + te 1 - ½(e 21 - 1)
cosh t sinh t sinh t O )
1 1 1 - cosh t 1 - sinh t - sinh t 0
which can also be expressed as e A = e _ 1 + e' _ 1 + e1 e1 0 ·
(
1 + t - cosh t t - sinh t - sinh t 1+ t - sinh t

~t ~ ~
1
7. Exponential matrix is e 31 ( ) .
O I+ t -f
9. (a) Eigenvectors ( 1, 0) and (0, a - fJ) are linearly independent if a -1- /J)
(b) Hint: The only solutions of ( ~ ~ ) ( : ) =( ~ ) are multiples of (1, 0).

(c) ( eat
0
e~=:'
eP'
)·, ( eat
0
teat )
ea' .

11. Hint: Jk =I for all integers k.


2
13. P(A)=A -5A-2soA -5A-2/=0.ThenA =5A+2/=(
2 2
1
; ~~ ),A 3 =5A 2 +2A=( !i ii:).
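Exercise 13 uses the Cayley-Hamilton relation A^2 - 5A - 2I = 0 to reduce powers of A to combinations of A and I. The sketch below illustrates the reduction on a hypothetical matrix, A = [[1, 2], [3, 4]], chosen only because its characteristic polynomial is λ^2 - 5λ - 2; it is not confirmed to be the matrix of the exercise.

```python
def mat_mul(X, Y):
    """Product of two 2-by-2 integer matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def lin_comb(alpha, X, beta, Y):
    """Entrywise alpha*X + beta*Y."""
    return [[alpha * X[i][j] + beta * Y[i][j] for j in range(2)] for i in range(2)]

I2 = [[1, 0], [0, 1]]
A = [[1, 2], [3, 4]]          # hypothetical example: trace 5, determinant -2

# Powers by direct multiplication.
A2_direct = mat_mul(A, A)
A3_direct = mat_mul(A2_direct, A)

# Powers from the Cayley-Hamilton relation A^2 = 5A + 2I, hence A^3 = 5A^2 + 2A.
A2_reduced = lin_comb(5, A, 2, I2)
A3_reduced = lin_comb(5, A2_reduced, 2, A)

if __name__ == "__main__":
    print("A^2:", A2_direct, A2_reduced)
    print("A^3:", A3_direct, A3_reduced)
```

The reduced form needs no matrix multiplications at all, which is the point of the exercise: every higher power of a 2-by-2 matrix is a linear combination of A and I.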
15. A- 1 = A2 - 3A + 31 = ( - lj ~ -~ ) .
2 0 1

17. r' = l<A


2
+sA-261) =( j -1 -D
19. det A = 0, no inverse.

21. A-
1
= j(-A 3 +8A 2 - 21A + 221) =( ~
-1

2
0
1
0 -: )
-8
-4
I •

I
0 0 4

23. A- 1
= ,-•(A2 - (I+ 2e')A + (2e' + e")I) =( i 0
e-1
0 e-t
0
-te- 1
)
Exercise Set 2E (pgs. 639-640)

1 0 0)
I.A=(~~). 3. A=
( 0
0
2
0
0
3
.

5. Hint: Set t = 0.
7. Hint: Compare the square of the matrix with its value when tis replaced by 2t.
9. Hint: Multiply the linear combination $c_1\mathbf{x}_1 + \cdots + c_n\mathbf{x}_n = 0$, where $\mathbf{x}_k$ is the kth column of A, by $A^{-1}$.
11. Hint: $\mathbf{x}_1(t) = c_2\mathbf{x}_2(t) + \cdots + c_m\mathbf{x}_m(t)$ if and only if $(-1)\mathbf{x}_1(t) + c_2\mathbf{x}_2(t) + \cdots + c_m\mathbf{x}_m(t) = 0$.

Section 3: Nonhomogeneous Systems


Exercise Set 3A-C (pgs. 644-645)
31 51
-~e -½e'+½) (-I-¼et-te'+¼e )
1· 2 1 '
3· 5 5i - 4e1
I .
( -3e21 - 3e-1 4e
5. {b) Hint: xo = X(to)c. 31
7. x(t) = cie + f-
}et, y(t) = c2e 2' - ½e- 1 •
9. x(t) = (c1 - c2)e 1 + c2e 5t - 1 - te 1, y(t) = c2e5t - ¼et.
11. ( ;;1;:).
13. {a) Hint: x-'(t)xk(t) = ek. {b) .i = -x + e1y, y == -2et x + 3y.
15. Use the definition of fundamental matrix.
17. Hint: Apply the result of Exercise 16(b) to the power series expansion of eA<1).
19. Fill in the details of the given line of reasoning.

Section 4: Equilibrium & Stability


Exercise Set 4A {pgs. 652-653)
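1. $\lambda_1 = -1$, $\lambda_2 = 1$; Saddle, type II. 3. $\lambda_{1,2} = 2 \pm i$; Unstable spiral, type IV.

[Figures: phase portraits in the xy-plane for Exercises 1 and 3.]

5. $\lambda_{1,2} = \pm\sqrt{10}$; Saddle, type II. 7. $\lambda_1 = -1$, $\lambda_2 = 2$; Saddle, type II.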

11. Roots -½ ±if; Asymptotically stable spiral, type VI.


_r_-:::;,~ ·,, '
/

2 X

13. k = 0, stable center, type V; 0 < k < 2, asymptotically stable spiral, type VI; k = 2, asymptotically stable star, type VIII;
k > 2, asymptotically stable node, type III.
15. k = 0, stable center, type V; $0 < k < \sqrt{8}$, asymptotically stable spiral, type VI; $k = \sqrt{8}$, asymptotically stable star, type
VIII; $k > \sqrt{8}$, asymptotically stable node, type III.
17. At a constant equilibrium solution, $\dot{x} = \dot{y} = 0$, so there are infinitely many of them when ax + by = cx + dy = 0 has
infinitely many solutions, which happens if and only if $\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc = 0$.
19. The trajectories are the same, but traversed in the opposite direction (time reversal).
21. (a) x(t) = e-t/Z [c 1 cos :{j-t + c2 sin {l-t ] ,
y(t) = ½e-t/2 [ (.J3c2 - c1) cos "41 - (.J3c1 + c2) sin ft],
z(t) = CJe-t + ½e-t/Z [<c1 - .J3c2)cos {l-t + (.J3c1 +c2)sin 41].
(b) All terms in the solution have exponential factors that go to zero as t goes to +oo.
(c) The solution for z(t) contains a term c3e1 which is unbounded as t goes to +oo. The equilibrium point (0, 0, 0) is
unstable.
Exercise Set 4B (pgs. 657-658)
1. The solutions of the system of linear equations Ax = 0 form a line of equilibrium solutions of the autonomous system.
3. Hint: At an equilibrium point, $\sigma(y - x) = 0$, $\rho x - y - xz = 0$, and $-\beta z + xy = 0$. Then x = y, so $x(\rho - 1 - z) = 0$.
5. (a) Hint: At an equilibrium point, $y + ax(x^2 + y^2) = 0$ and $-x + ay(x^2 + y^2) = 0$. For $a \neq 0$, multiply the first
equation by x, the second by y, and add.
(b) The linearized system at (0, 0) is $\dot{x} = y$, $\dot{y} = -x$, with eigenvalues ±i, so it has a stable center at (0, 0).
(c) Hint: With $x = r\cos\theta$ and $y = r\sin\theta$, the chain rule gives $\dot{x} = \dot{r}\cos\theta - \dot{\theta}r\sin\theta$, $\dot{y} = \dot{r}\sin\theta + \dot{\theta}r\cos\theta$. Substitute
in the given differential equations, multiply one by $\cos\theta$, the other by $\sin\theta$, and add.
(d) $\theta(t) = -t + c_1$, $r(t) = (c_2 - 2at)^{-1/2}$.
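The polar-coordinate computation behind parts (c) and (d) of Exercise 5 can be written out in a few lines. The following is a sketch under the assumption that the system is $\dot{x} = y + ax(x^2 + y^2)$, $\dot{y} = -x + ay(x^2 + y^2)$, which is what the equilibrium equations in part (a) suggest; the constants $c_1$, $c_2$ are arbitrary.

```latex
\begin{align*}
r\dot{r} &= x\dot{x} + y\dot{y}
          = x\bigl(y + a x r^2\bigr) + y\bigl(-x + a y r^2\bigr) = a r^4
  &&\Longrightarrow\quad \dot{r} = a r^3, \\
r^2\dot{\theta} &= x\dot{y} - y\dot{x}
          = x\bigl(-x + a y r^2\bigr) - y\bigl(y + a x r^2\bigr) = -r^2
  &&\Longrightarrow\quad \dot{\theta} = -1, \\
\int \frac{dr}{r^3} &= \int a\,dt
  \;\Longrightarrow\; -\frac{1}{2r^2} = at - \frac{c_2}{2}
  &&\Longrightarrow\quad r(t) = (c_2 - 2at)^{-1/2}, \qquad \theta(t) = -t + c_1 .
\end{align*}
```

For a > 0 the radius becomes unbounded as t approaches $c_2/(2a)$, so under this assumption the nonlinear system behaves differently near the origin from the stable center of its linearization in part (b).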
7. (a) (A, B/ A). (b} The figure shows the case A= 1, B = 3 with an equilibrium point at (I, 3).
y

·1 3 4 5 X

9. Equilibrium points are at (0, 0) and on the circle $1 - x^2 - y^2 = 0$. Hint: $\frac{d}{dt}(x^2 + y^2) = 2(x\dot{x} + y\dot{y}) = 0$.
11. Equilibrium points: ( -1, 0), (0, 0), (I, 0). (-1 , 0) and (1, 0) are stable, with derivative matrix ( - ~ _ ~ ). and

eigenvalues -2, -2. (0, 0) is unstable , with derivative matrix ( b _~ ). and eigenvalues I, -J.
13. (a) x(t) = cos t, y(t) = sin t is a solution. (b) Evaluate (d/dt)(y/x) in terms of x and y.
(c) Hint: $\tan\theta = y/x$. (e) Hint: $(d/dt)r^2 = (d/dt)(x^2 + y^2) = 2(x\dot{x} + y\dot{y})$.
15. (a) $\dot{x} = y$, $\dot{y} = -x + ay$.
(b) The type of equilibrium at (0, 0) for the linearized system is: for 0 < a < 2, unstable spiral; for a = 2, unstable
star; for a > 2, unstable node; for a = 0, stable center; for -2 < a < 0, stable spiral; for a = -2, stable star; for
a < -2, stable node.
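The classification in part (b) of Exercise 15 follows from the roots of λ^2 - aλ + 1 = 0, the characteristic equation of the coefficient matrix of x' = y, y' = -x + ay. A minimal sketch of that case analysis, using the same labels as the answer (the function names are illustrative):

```python
import cmath

def eigenvalues(a):
    """Roots of lambda^2 - a*lambda + 1 = 0, i.e. the eigenvalues of [[0, 1], [-1, a]]."""
    root = cmath.sqrt(a * a - 4.0)
    return (a + root) / 2.0, (a - root) / 2.0

def classify(a):
    """Label the equilibrium at (0, 0) with the names used in the answer."""
    if a == 0:
        return "stable center"             # purely imaginary eigenvalues
    stability = "unstable" if a > 0 else "stable"
    disc = a * a - 4.0
    if disc < 0:
        shape = "spiral"                   # complex pair with nonzero real part a/2
    elif disc == 0:
        shape = "star"                     # repeated real eigenvalue (the answer's label)
    else:
        shape = "node"                     # two real eigenvalues of the same sign
    return f"{stability} {shape}"

if __name__ == "__main__":
    for a in (-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0):
        print(f"a = {a:4.1f}   eigenvalues = {eigenvalues(a)}   type: {classify(a)}")
```

The two real eigenvalues have the same sign whenever |a| > 2 because their product is 1, which is why no saddle appears in the list.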
Chapter 13 Review (pgs. 658-659)

1. x(t) = ¾er+ ie-t - iter - 2, y(t) = ¾e' + !e- 1 - ½ter - 1.


3. x(t) = iet + te' - ie- 1 - ½e2t, y(t) = -}e-1+ }e21 .

5 , x(t) = 5i f er<7- ./5)/2 + s1fer<1+./5)f2; y(t) = _ s1f er(1-./5)/2 _ 5if er(7+./5)/2.


7. x(t) = e1 - 2te 1, y(t) = 2e1 - 2te1, z(t) -2te1 • =
= +
9. x(t) = e1, y(t) e1 e21 , z(t) e1 - e21 • =
ll . x(t) = 2e 21 - I , y(t) = 2e 21 , z(t) = 0, w(t) = te1 + e1 •
+ e-i>.t + ei>..r + eAI, -e->..r - + ieiAI - eM w3, - -- e-).J + e-i>..r + ei>..r - ~ 1w 2 , -e->.t + iei>..r - i ei>.t - e>..1w

Ji ~ }
2e->..r iei>..r

13. X~Ax.where x ~ (x. y • .i. j,) and A~ ( f Eigen,alues ,re 1 ~ 211 4


• iA. - 1. •nd -;1. By

Theorem 2.4,
ho + b2 b2 bI+ b3 b3 )
tA 2 3 b2 ho - b2 b3 b1 - b3 ,
e =ho+ b1A + b2A + b3A = bi+ b
2 3 bi ho+ b
2
b
2
, where ho, b1, b2, b3 are functions
(
b1 -b1 + 2b3 b2 ho - b2
satisfying e>..r = bo + )..b1 + A2b2 + ).. 3b3, ew =ho+ i)..b1 - ).. 2b2 - i).. 3b3, e->..r = ho - Ab1 + A2bi - ).. 3b3, and
e- iJ.J = ho - i )..b1 - A2b2 + i ).3b3 . This gives b1, b2, b3, b4 as linear combinations of e>..r, ei>..r, e- >..r , e-i>..r that can be
written more compactly as ho= ½(cos At+ coshAt), b1 = ¼). 3(sin)..t + sinhAt), b2 = 2(cos)..t - coshAt), -¼}..
b3 = -¼}..(sinM - sinh).t) . In te1ms of initial conditions at t = 0,
x(t) = {½(cos ,\.t + cosh At) - ¼h(cos)..t - coshM)}x(O) - ¼h(cos).t - cosh)..t)y(O) + (¼2 314 (sin ).1 + sinh)..t ) -
¼2 114 (sin)..t - sinhAt)}i(O) - ¼2 114 (sin)..t + sinh)..t)y(O),
y(t) = -¼h(cos)..t + coshM)x(O) + (½(cos)..t + cosh)..t) + ¼h(cosAt - cosh)..r)}y(O) - ¼2 114 (sinAt -
sinhM)i(O) + (¼2 314 (sinAt + sinhAt) + ¼2 114 (sinM - sinhM)}y(O), where).. stands for 2 114 •
15. (a) Letting $\mathbf{x}_0 = (a_1, \ldots, a_n)$ and $\mathbf{y}_0 = (b_1, \ldots, b_n)$, the system is equivalent to the sequence of systems
$\dot{x}_k = y_k$, $\dot{y}_k = -x_k$, $x_k(0) = a_k$, $y_k(0) = b_k$ for k = 1, ..., n.
(b) $x_k(t) = a_k\cos t + b_k\sin t$ for k = 1, ..., n.
(c) The exponential matrix has n 2-by-2 blocks $\begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix}$ on its diagonal and is zero elsewhere.

CHAPTER 14: INFINITE SERIES


Exercise Set 1 (pgs. 662-664)
1. 5/6 3. 31/16
5. $a_k = 1/k$, $k \ge 1$ 7. $a_k = \dfrac{1}{k(k + 2)}$, $k \ge 1$

k+3
9• ak = (k + 1)3k+l'
11. Hint: Follow Example 3 in the text. Sum of the infinite series is 1/4.
13. Hint: Use induction. The infinite series diverges to oo.
15. 6/5 17. I /n
19. $e^2/(e^2 - 1)$ 21. $8\sum_{k=1}^{\infty}(1/10)^k = 8/9$

23. $123\sum_{k=1}^{\infty}(1/1000)^k = 41/333$
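The values in Exercises 21 and 23 are sums of geometric series that start at k = 1, so each equals scale · r/(1 - r). A short check with exact rational arithmetic, where the function names are illustrative:

```python
from fractions import Fraction

def geometric_sum_from_one(scale, ratio):
    """Exact value of scale * sum over k >= 1 of ratio**k, valid for |ratio| < 1."""
    ratio = Fraction(ratio)
    if abs(ratio) >= 1:
        raise ValueError("the geometric series diverges for |ratio| >= 1")
    return scale * ratio / (1 - ratio)

def partial_sum(scale, ratio, n):
    """Exact value of scale * (ratio + ratio**2 + ... + ratio**n)."""
    ratio = Fraction(ratio)
    return scale * sum(ratio ** k for k in range(1, n + 1))

if __name__ == "__main__":
    print(geometric_sum_from_one(8, Fraction(1, 10)))        # Exercise 21
    print(geometric_sum_from_one(123, Fraction(1, 1000)))    # Exercise 23
    print(float(partial_sum(8, Fraction(1, 10), 5)))         # partial sum approaching 8/9
```

`Fraction` reduces 123/999 to lowest terms automatically, which is the simplification step in Exercise 23.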
25. Hint: First show that the assertion holds for numbers w satisfying 0 < w < 1.
00
l I I I ~ (k - 1) ! _ ~ _I__ _I .. .
27• ; 2k k, = 2 + s+ + ·..
48 29 • L.., (2k)! - 2 + 24 + 360
k=l
+
00
k 2* k! I I
31. I)-1) (
k=O 2k+I)!
=I - -
3
+ - - ···
15
33. $s_1 = 1$, $s_2 = 3/2$, $s_3 = 5/3$, $\lim_{n\to\infty} s_n = 2$
35. $s_1 = 2/9$, $s_2 = 4/27$, $s_3 = 8/81$, $\lim_{n\to\infty} s_n = 0$
37. $s_1 = 1$, $s_2 = 1025/1026$, $s_3 = 29525/29526$, $\lim_{n\to\infty} s_n = 1$
39. Hint: The given series is a convergent geometric series that begins with k = 2.
41. Hint: Write the given series as a telescoping series and use Example 2 of the text.
00

43. L -3r ( I -
k=O
-1)*
r
= 3, for 1/2 < r < 1.

45. (a) Hint: Show that all subintervals remaining after the nth step are of the same length and that the number of such
subintervals is twice the number remaining after the (n - l)st step.
(b) Hint: For a given $\varepsilon > 0$ and any positive integer n satisfying $n > \ln\varepsilon/\ln(2/3)$, show that $0 < (2/3)^n < \varepsilon$.

Exercise Set 2AB (pgs. 669-670)


1. $7 + 5(x - 2) + (x - 2)^2$ 3. $2 - 2(x + 1) + (x + 1)^2$
5 813
5. co= I, ci = 1/3, c2 = -1/9, R2(x) = 81 c- (x - 1) 3 , for some c between I and x.
7. co= - 1, CJ= - 1, c2 = I, R2(x) = -c-4 (x + 1)3 for some c between -J and x.
9 ~ (-Ii k 11 ~ (-1/ 2k+I
· L
~o
k!x ' L 22.Hl(2k+ J)!x
k~
oo I
13
'Lk!
'°'
-x2* 15. Hint: Combine the series for ex and e-x and
k=O simplify.
17. Hint: Use Theorem 2.1 with f(x) = (x + a)n.
5
19. IR4(x)I ~ rr /120 21. IRs(x)I =~ rr 6 /720

y y

23. (b) Hint: Follow the given directions.


25. Hint: Compute the nth degree Taylor polynomial with remainder $R_n(x)$ of ln(1 + x) about x = 0 and show that $R_n(x)$
tends to 0 as $n \to \infty$.
27. Hint: Let x = 1 and x = -1 in Formula 2.4(a) in the text.
29. (a) Hint: Differentiate both sides of the equation f(x) = f(-x) n times and let x = 0.
(b) Differentiate both sides of the equation f(-x) = -f(x) n times and let x = 0.
(c) The terms in the Taylor expansion about x = 0 of an even function contain only even powers of x; and the terms
in the Taylor expansion about x = 0 of an odd function contain only odd powers of x.

Exercise Set 3AB (pgs. 673-674)


1. $\lim_{n\to\infty} s_n = 1$ 3. $\lim_{n\to\infty} s_n = \infty$
5. $\lim_{n\to\infty} s_n = 0$ 7. $\lim_{n\to\infty} s_n = 0$
9. $s_n = n/(n + 1)$, n = 1, 2, ..., $\lim_{n\to\infty} s_n = 1$

n2 +n - I .
11. Sn == , II== I, 2, . . . , hinn--+oc Sn = I
n + 3n + J
2

I oo
13. (a) Hint: Lk=I ak is a telescoping sum. (b) a1 = 2, ak = - ---, k >
k(k - I) -
2, I:C,k
k=l
=I
00

15. e
00

19. I: 2k 1i+ 1 21. Hint: Use the given hint.


h:O

Exercise Set 3CDE (pgs. 681-682)


1. diverges (term test) 3. converges (integral test)
5. converges (integral test) 7. converges (comparison test, ratio test)
9. converges (comparison test) 11. converges (comparison test)
13. converges absolutely 15. converges
17. converges absolutely 19. converges
21. diverges 23. converges absolutely
25. converges 27. converges
29. converges
31. (a) Hint: Use the ratio test.
(b) Hint: Multiply the partial sum $\sum_{k=1}^{n} kx^k$ by 1 - x, combine the resulting terms and then let $n \to \infty$.
33. (a) Hint: Multiply the partial sum $\sum_{k=1}^{n} kx^k$ by $(1 - x)^2$, combine the resulting terms and then let $n \to \infty$.
(b) Let x = 1/2 in part (a).
35. $|x| \le 1$ 37. $-1 < x < 3$
39. $-1 \le x < 1$ 41. $|x| > \sqrt{2}$

43. (a) Hint: Follow the given steps. (b) Hint: Use the term test and part (a). (c) Hint: Use part (b) and the fact that
kl/k ~ (1/k)lfk_

45. Hint: Use the integral test.

Exercise Set 4 (pgs. 687-688)


1. Hint: Show that L~o dk converges and use the Weierstrass test.

3. Hint: Show that the given series has period $2\pi$ on $\mathbb{R}$.

5. Hint: Show that the absolute value of the kth term is dominated by $(A + B)/k^2$ and apply the Weierstrass test.
7. Hint: For the first series, use the first derivative test to find the maximum value of $(1 - x)x^{n+1}$. For the second series,
use Exercise 32 in the last section to show pointwise convergence on [0, 1] to a function f(x), and then show that if N
is any given positive integer then $|s_N(x) - f(x)|$ can be made arbitrarily close to 1.

Exercise Set SA-D (pgs. 695-696)


1. $-1 \le x < 1$ 3. $-\infty < x < \infty$

5. $-3 < x < -1$ 7. $-4 < x < -2$


I
9•
oo (-Ji k
L --x·
k!
11 L (2k +. l)!
oo

• k=O
x2k+I
k"'O
00

n. I: 1!x2k
k=1
00

11. I: (-l)k+I
k x2.1:
k=I
oc
21. 1:<-lh:r - 1l
k=O
25. Hint: Find the Taylor expansions for $1/(1 - x)^2$ and $1/(1 - x)^3$, and then use the partial fraction decomposition of
x(x + 1)/(1 - x) .

27. (a) Hint: Follow the given steps. (b) $1 + 3x + 3x^2 + x^3$ (a = 3), $1 - 3x + 6x^2 - 10x^3$ (a = -3),
$1 + \tfrac{1}{2}x - \tfrac{1}{8}x^2 + \tfrac{1}{16}x^3$ (a = 1/2)
29. Hint: Use the change of variable t = x + a.
31. 1 33. 1/2

35. L (-1/
00
ck+! x
k

k=O
37. Hint: Shift the index of summation in the series for f''(x) and then add the series for f''(x) and f(x) by combining
like powers of x.

Exercise Set 6 (pg. 698)


1
1. 1 + 2x + 3x 2 3• 2 +x 2 + -x
4
4

7. Hint: Find the first four nonzero terms of y(x) and separate the result into two series.
Exercise Set 7 (pgs. 705-706)
1. Hint: Follow the steps in Example 2 of the text.

Cl<) (-ll 2
3. (a) y(x) = co I:-,-x21c (b) y(x) = ce-x (c) Yes
hO k.

~ (-1/ 21:
00

5. (a) Yh(x) = co L 2kk! x + c1 L


(-ll2kk! 21:+1
(2k + l)! x (b) Hint: Set co= 1, CJ = 0 in part (a). (c) Yp(x) = 21x
k=O k=O
2 oo (-ll2kk!
+ c, L
X
(d) y(x) = 2 + coe-x /2 (2k + l)! x21:+1
k=O
7. (a), (b) Hint: Apply the power series method to the Bessel equation of index 0.

(c) Hint: Show that xlo = -x1; - J~ (there is no need to use the series representation of lo(x) found in part (b))

Exercise Set 8ABC (pgs. 712-713)


2 00
4(-li
+ I: --2 -
2(-J/+1 Ir
1. L --- sin kx
00
3. -
3 k=l k
cos kx
k=I k
y
y

I oo "> 00 4

2+L
5. (Zk - I) sin(2k + l)x 7. """' - - sin(2k + l)x
· k=O + Ir
62k+l

y f(x) s,(x) Y f(x)

00 4
9. -~ +L Zk cos(2k + l)x 11.
- k=O ( + I) 2Ir
y

~,-~;:/2' -2• -•

,/
.//
~n:
•X

15.
13. Yj
-------·
-2n .•- - - -
1 t-.--2.x
'
1

-2• 2• X

17. 19. y

·2• /

·2• 2• X

21. Hint: Use Equations 8.2 in the text and the linearity of the integral.
23. $\tfrac{3}{4}\cos x + \tfrac{1}{4}\cos 3x$ 25. $\tfrac{3}{4}\sin x - \tfrac{1}{4}\sin 3x$
27. cos3x + sin5x
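The answers to Exercises 23 and 25 have the form of the triple-angle expansions cos^3 x = (3/4) cos x + (1/4) cos 3x and sin^3 x = (3/4) sin x - (1/4) sin 3x. Assuming those are the functions being expanded (the exercise statements are not reproduced here), the coefficients can be recovered numerically from the Euler formulas. The sketch below uses an equally spaced rectangle rule, which is very accurate for trigonometric polynomials; the function name and the sample count are illustrative.

```python
import math

def fourier_coefficients(f, k, n=2048):
    """Approximate a_k = (1/pi) * integral of f(x) cos(kx) and b_k = (1/pi) * integral of f(x) sin(kx)
    over [-pi, pi], using an n-point rectangle rule."""
    dx = 2.0 * math.pi / n
    a = b = 0.0
    for j in range(n):
        x = -math.pi + j * dx
        a += f(x) * math.cos(k * x)
        b += f(x) * math.sin(k * x)
    return a * dx / math.pi, b * dx / math.pi

def cos_cubed(x):
    return math.cos(x) ** 3      # assumed function for Exercise 23

def sin_cubed(x):
    return math.sin(x) ** 3      # assumed function for Exercise 25

if __name__ == "__main__":
    for k in range(1, 5):
        a_k, _ = fourier_coefficients(cos_cubed, k)
        _, b_k = fourier_coefficients(sin_cubed, k)
        print(f"k = {k}   a_k of cos^3 = {a_k: .6f}   b_k of sin^3 = {b_k: .6f}")
```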
29. Hint: Straightforward but tedious integration 31. Yes.

Exercise Set 9ABC (pgs. 720-721)


~ 4(-1/ . hrx
1. L . , - - - s m - 3. Hint: routine computation
k=l hr 2

5. fo(X)= {2n+Ql -X, 2n < x < 2n + 2;} = Loo k2.


-sm JrX
X = 2n, kJr
k=l

2nJr < x < (2n + l)Jr; IL


I
COSX, oo Sk
1. fo(X) = -COSX, (2n - l)Jr < x < 2nJr; = 2 sin2kx
0, X = nJr, k=l ( 4k - })Jr
y
0, 4n - 1 < x < 411 + I;
x - 2 - 4n, 4n +I< x < 4n + 3;
9. fo(x)=
X =4n - J;

x=4n+I,

=~
~
(-2. hr
cos hr - -
2
4
k 2n 2
- sin hr) sin hrx
2 2
k=I
I - 2n + x,
11. f. (x)
e
= { 1+2n-x, 2n - I < x < 2n; }
- -
2n<x<2n+I,
= -21 + Loo ---,---"7cos(
4
(2k+J)2n2
k
2 + 1)nx
k=O

-3 -2 -1 2

00
2nn<x<(2n+l)n;
13. fe(X) = {-smx,
si~x,
(2n - l)n _::: x _::: 2nn, }
= -n2 + L -4
(4k2 - ])Jr
COS 2
k
X

k=I

~
n/2
I,

/
-J t ~~ x
I
Y fe

-x + 4n, 4n - I < x _::: 4n;


x - 411, 4;n < x < 4n + I;
15. f.,(x)=
0, 4n +1< x < 4n + 3;

x = 2n + I,

= -I + Loo [ -2 sin -kn + -24-2 ( cos -kn - 1)] cos -


knx
-
4 kn 2 k n 2 2
k=l
17. Hint: routine computation 19. Hint: routine computation

21. Hint: Consider the limit of (/ (x + p + h) - f (x + p)) / h as h --+ O.


23. Hint: Use the hint in the text. 25. Hint: routine computation
27. Yp(x) = ¾L~o (2n+l) [(2n+I)
1
2 1r 2 +a 2
] sin(2n + l)nt

Exercise Set lOAB (pgs. 727-728)

1. $u(x, t) = e^{-(\pi^2 a^2/p^2)t}\sin\dfrac{\pi x}{p}$, $0 \le x \le p$, $t \ge 0$
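The answer to Exercise 1 can be verified by direct substitution. The check below assumes the heat equation is written as $a^2 u_{xx} = u_t$ with zero boundary values at x = 0 and x = p, and that the initial temperature is $\sin(\pi x/p)$; the exercise statement itself is not reproduced here, so these conditions are inferred from the form of the answer.

```latex
\begin{align*}
u(x,t) &= e^{-(\pi^2 a^2/p^2)t}\,\sin\frac{\pi x}{p}, \\
u_t &= -\frac{\pi^2 a^2}{p^2}\,u, \qquad
u_{xx} = -\frac{\pi^2}{p^2}\,u
\quad\Longrightarrow\quad a^2 u_{xx} = u_t, \\
u(0,t) &= 0, \qquad
u(p,t) = e^{-(\pi^2 a^2/p^2)t}\sin\pi = 0, \qquad
u(x,0) = \sin\frac{\pi x}{p}.
\end{align*}
```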
00
3. u(x, t) = ~ l:._e-k1 a:1r
21
sinknx, 0 < x < l, t > 0
~kn - - -

5. u(x, t) = e-a 21 sinx + !e-4a


2
21
sin 2x, 0 < x < n, t > 0
- - -
7. v(x) = -1 + x 9. v(x) = 1 + x
11. u(x, t) = l+
2
p
x + (1- !) n
e-<a
2 2 2I
1r f P > sin J_CX +
p k=2 kn
t 2 2 2 2
_}:__[3(-ll - lJe-* <a 1r f P >I sin knx,
P
O :'.:':= x :'.:': p, t > O

00
2 1
~ -e-4k'la
J2
13. u(x, t) = x+ -
L.,k
1r
1
sin2knx,O < x < l,t > 0
- - -
1T k=I

(c1er 1x +c2er2x)eAI , ).. #-l/4;


15. u(x,t) = { (cie-xf2+c2xe-xf2)e-tf4, ).. = -1/4.
17. u(x, t) = cxAeft/ 2, x > 0 19. v(x) =x2+ 1
1 5
21. v(x) = 6x 3 + 6x + 1
= -1lop J(x) dx + (2- LP f(x) cos -knx dx ) e-k = -1lop /(x) dx
00

23. (b) u(x, t)


P O
L
k=I P P O
2 2
1r
2 knx
t/p cos -
P
(c) u(x, oo)
P 0
25. u(x,t)=l,0:'.:':=x:'.:':=1,t:=::0
27. (a) X" + ).. 2X = 0, T' + )..2 tT = 0 (b) If)..= 0 then X(x) = c1 + c2x, T(t) =
co, u(x, t) = C 1 + C2x.
212 212 2
If)..;;/- 0 then X(x) = A cos)..(x - a), T(t) = coe-). 12 , u(x, t) = ce - A 1 cos)..(x - a).
29. Hint: u(x, t) concave up implies uxx(x. to) > 0, which implies u I (x, to) > 0, which implies u is increasing. Use a
similar argument if u(x, t) is concave down.
31. Hint: Except for the obvious minor changes, use the derivation of the product solutions that was used in the text.

Exercise Set lOC (pgs. 731-732)


1. u(x, t) = cos at sinx
00

3. u(x, t) = ~ - 2-2 k ] sinkat sinkx


[ (-t)*+l +cos~
L., k na 2
k=I

S. (a) v(x) =
2
!2
2
x + c 1x + c2 (b) v(x) = ~ 2 x(x - p) (c) First solve the homogeneous equation a 2u.u = u11 ,
u(0, t) = u(p, t ) = 0, and let w(.\', t) denote the solution. The solution of the nonhomogeneous problem is then
u(x, t) = w(x, t) + v(x), where v(x) is given in part (b).
7. (a) Hint: Let $t_0$ be a fixed time and for $t > t_0$ let $b = a(t - t_0)$. Show that $U(x + at_0) = U((x - b) + at)$ and
$V(x - at_0) = V((x + b) - at)$. (b) Hint: Use the identity $\cos(\alpha - \beta) = \cos\alpha\cos\beta + \sin\alpha\sin\beta$. (c) Hint: Use the
identity $2\sin\alpha\cos\beta = \sin(\alpha + \beta) + \sin(\alpha - \beta)$.
9. (a) Hint: routine computation

(c) Hint: Show that G''(0) and G''(p) exist as two-sided derivatives. The condition G'(0) = G'(p) = 0 insures that G
is continuously differentiable on all of $\mathbb{R}$.
11. X" - AX = 0, T" - MT = 0
Chapter 14 Review (pgs. 733--734)
1. J/(x - 4), 4 < X < 6 3. I+ l/x 2 ,x ::f.O
2
5. e<x+l) , -oo < x < 00 7. cos(x - 5), -oo < x < oo
00

9. cosh (x 2 ~ 1) , Ix! =I- I 11. L(x 3 l , -1 < x < I


k=O
-2k k 1
oc 1 oo 1
13. ~ - X - - < X < - 15. ~-(X-l)k,-OO<X<OO
Lk '2- 2 L kl
k=I k=O
00 (-l)k4k oo (-J)k+I
17. I : - - - X2k, -00 < X < 00 19. ~ - - - x 2 k + I -oo < X < 00
k=O (2k) ! L (2k+ J)! '
k=O
21 • "00 l-2k)
~k=O ( I<! Xk , -00 < X < 00

23. lxl < 1 25. -00 < X < 00


27. -00 < X < 00 29. diverges
31. diverges 33. absolutely convergent
35. (a), (b), (c), (d) Hints: routine computations.
37. r 2 R" + 2r R' - y R = 0, (sin ct, )<I>"+ (sin ct, cos ct,)<I>' + (y sin 2 ct, - ).)<I> = 0, 0" + ).0 = 0
INDEX

A autonomous system, 575 circular frequency, 53 1, 597


axis, coordinate, 8 circulation, 369, 405
absolute maximum, 283
Clairaut's Theorem, 202, 337
absolute minimum, 283
B closed
absolute value, 500
under addition, 114
absolutely convergent, 678 ball, 217 under scalar multiplication, 114
acceleration basis, 132, 511 closed set, 218, 283
magnitude, 181 standard, 133 closed surface, 444
vector, 181 Bessel equation, 705 coefficient matrix, 59
addition Big Bertha, 187 coefficients, 509
of matrices, 74 Boltzmann constant, 359 cofactor, 89
of vectors, 1, 13 border, 426 coffee cooling, 489
additivity boundary, 218 columns, 59
of cross product, 40 boundary conditions, 497 complete orthonormal set, 162
of dot product, 24, 155 boundary point, 218 complex exponential, 500
of inner product, 155 boundary-value problem, 497, 499
component, 5
Airy equation, 701 bounded set, 283, 323
perpendicular, 29
alternating, 93 bugs in mutual pursuit, 580
vector, 29
alternating harmonic series, 680
composite function, 261
alternating series, 680
C composition, 107
amplitude, 507, 53 I
Cantor set, 663 flow transformation, 582
analytic function, 699
Cauchy product of series, 693 of functions, I 23
angle
Cauchy-Schwarz Inequality, 156 of linear functions, 123
between planes, 34
Cauchy-Riemann equations, 204 conductivity, heat, 722
between vectors, 25
angular momentum, 183, 605 Cavalieri's principle, 333 cone
center of mass, 188, 348, 349 elliptic, 209
angular speed, 182
central force field, 605 conjugate, complex, 503
anticomrnutativity, cross product, 4-0
arc length, 182, 377 centripetal acceleration, 384, 604 conservation
differential, 370, 379 centripetal force field, 419 angular momentum, 183
arc length element, 378 centroid, 349, 382 linear momentum, 183
area, 331 chaos, 568 conservative field, 411
parallelogram, 40 characteristic conservative system, 603, 604
surface, 420 equation, 146, 493, 508 consistent system, 417
area element, 341, 420 polynomial, 548 content, 323, 331
arrow roots, 146, 496, 508 zero, 325
equivalence, 15 value, 143 continuity equation, 388, 407, 436
tail, tip, 15 vector, 143 continuous
vector, 9, 15 circuit, 399 at a point, 22 I
autonomous equation, 558 electrical, 52 at isolated point, 221

834 Index

on a set, 222 directional, 234 eigenvector, 143, 617


on domain, 222 partial, 198 eigenvector matrix, 622
continuous function, 175, 202 with respect to vector, 233 elementary matrix, 88
continuous functions derivative matrix, 239 elementary modification, 48, 61
vector space of, 113 derivative set, 519 elementary multiplication, 48, 61
continuously differentiable, 229, 238 detenninant, 82 ellipsoid, 210
converge, 66 I Jacobian , 273 elliptic cone, 209
convergence diagonal matrices, 85 elliptic integral, 382, 558
pointwise, 682 differentiable energy
uniform, 683 at a point, 226 kinetic, 411
convergent, 675 continuously, 238 potential, 412
convergent integral, 354 on a subset, 226 entry, 74
convergent sequence on domain, 226 equation
vector, 245 differentiable function, 216, 225 of a line, 33
convex, 24 vector-valued, 237 of a plane, 33
convolution, 549 differential operator, 120 equilibrium point, 572, 646
convolution integral, 549 differential equation, 460 equilibrium solution, 561, 572, 603, 645
coordinate first-order, 460 equipotential surface, 259
perpendicular direction, 29 diffusion equation, 204 equivalent systems of equations, 46, 61
relative to a basis, 134 dimension, 139, 140 equivalent arrows, 15
coordinate axes, 8 finite, 132 equivalent parametrizations
coordinate curve, 205 infinite, 132 of curves, 184, 372
coordinate function, 174 dimensions, 59 of surface, 430
coordinate projection, 222 directed line segment, 15 escape speed, 477
coordinate rectangle, 322 direction Euler approximation, 467
coordinate transfonnation, 273, 453 same, opposite, 18 improved, 608
coordinate vector, 134 direction cosines, 32, 237
Euler beam equation, 512
coordinates direction field, 461 Euler differential equation, 529
standard, 5 directional derivative, 234
Euler equation, 733
Coulomb repulsion law, 605 second-order , 294
Euler formulas, 706
critical damping, 534 Dirichlet's theorem, 710
Euler's method, 467
critical point, 286 distance, IO
improved, 468
cross product to line or plane, 35
even function, 131, 720
and area, 40 diverge, 661
expansion rate, field, 436
definition, 37 divergence, 354, 407
explicit representation, 189, 215
determinant form, 37 field, 431
perpendicularity, 37 exponential matrix, 626
divergence of field, 406
right-hand rule, 40 exponential multiplier, 482, 491
divergence theorem, 432, 436
curl, 390, 391, 439 extreme point, 283
plane, 407
curl-free field, 444 extreme value, 283
divergence, of field, 388
curvature, 384 divergence-free field, 438,
curvilinear coordinates 444 F
general, 307 divergent, 675
cusp, 179 factorial, 663, 664, 680
diverges to infinity, 673
cylindrical coordinates, 306 field potential, 411
domain, 104, 173
cylindrical shell method, 341 finite-dimensional, 140
dot product, 24
first integral, 419
Duffing oscillator, 568, 614
flow, 581
D
flow line, 260, 386, 461, 581
degenerate rectangle, 323 E flow transformation, 581
density, 420, 722 eccentricity, 606 flow velocity, 30
density, mass, 331 echelon form, 73 flux, 30, 424, 425
derivative, 173, 176, 225 eigenvalue, 143, 298, 617 plane, 407
force Heaviside function, 194 inverse
magnitude, 181 helicoid, 207, 423 elementary operation, 48
vector, 181 helix, 178 function, 108, 124
force vector, 183 Hessian matrix, 298 linear, 109
forced oscillation, 535 homogeneity matrix, 82, 109
Fourier coefficients, 161, 706 of cross product, 40 computation of, 83
Fourier series, 706 of dot product, 24, 155 inverse image, 131
fox and rabbit, 188 of inner product, 155 inverse operator, 582
free oscillation, 531 of length, 28, 156 inverse-square law, 259
function homogeneous equation, 65, 129, 513, 514 invertible, 82
even, 131 homogeneous function, 309, 479 irrotational vector field, 406, 448
inverse, 124 homogeneous solution, 514 isocline, 466
linear, 119 homogeneous systems, 50 isolated point, 221
odd, 131 Hooke's law, 530 iterated integral, 312
fundamental matrix, 642 hyperboloid
one sheet, 212 J
G two sheets, 212 Jacobian determinant, 273
hyperplane, 71 Jacobian matrix, 307
Gauss formula, 432
Gauss's Law, 435, 438
Gauss's Theorem K
plane, 407 k-plane, 141
identity matrix, 78
Geometric interpretation, 90 Kepler's laws, 600, 605
image, 104, 173
geometric series, 661 kinetic energy, 411, 603, 605
inverse, 131
Gibbs, Josiah Willard, 43
impedance, 537
gradient, 226
temperature, 258
implicit L
definition, 258, 275, 276
gradient field, 252, 411, 615 Lagrange multiplier method, 287
differentiation, 277
gradient system, 585, 615 Laplace equation, 204, 214, 452
representation, 190, 215
Gram-Schmidt process, 164 Laplace operator, 272, 437
improper integral, 354, 675
graph of function, 189 Laplace transform, 541
improved Euler method, 468 Laplacian, 395, 409, 455
gravitational field, 438
incompressible flow, 407, 438 polar coordinates, 455
Green's first identity, 452
inconsistent system, 49 cylindrical coordinates, 456
Green's function, 525
independence, linear, 21, 68, 132 spherical coordinates, 456
Green's second identity, 452
checking for, 71 law of cosines, 25
Green's Theorem, 397
independence of path, 412 leading entry, 64
grid, 323
indicial equation, 529 Legendre polynomials, 166
infinite dimensional, 132 Leibniz notation
H infinite-dimensional, 140 Jacobian determinant, 343
Hamilton, Sir William Rowan, 43 initial condition, 463, 472, 495 Leibniz rule, 529
Hamiltonian field, 616 initial-value problem, 463, 496 lemniscate, 304, 346
Hamiltonian function, 584 inner product, 155 length, 11, 155
Hamiltonian system, 271, 584, 606, 616 insulated endpoints, heat, 727 level set, 190
hard spring, 614 integrable function, 324 limit, 175
harmonic function, 204, 395, 437, 438, integral of function, 218
459, 585 improper, 354 uniqueness, 219
harmonic oscillation, 504, 531 iterated, 312 vector sequence, 245
harmonic oscillator equation, 504, 733 multiple, 324 limit cycle, 614
harmonic series, 674, 676 vector, 185, 349 limit point, 217
heat capacity, 722 integrating factor line, 18
heat conductivity, 722 exponential, 481 equation, 33
heat exchange, 580 interior point, 218 parametric representation, 18

through two points, 18 N orthogonal, 87, 383


with given direction, 18 vectors, 156
n-dimensional volume, 331 orthogonal functions, 707
line integral, 368
n-dimensional column vectors, 59 orthogonal trajectories, 613
line segment, 15 n-dimensional row vectors, 59
linear, 102, 481 orthogonality relations, 707
n-parameter family, 509 orthonormal, 159
constant-coefficient equation, 490 neighborhood, 217
function, 119, 174, 223 orthonormal set, 721
network, 52 overdamping, 533
operator, 120, 513 Newton equations, 599 overdot notation, 178
system, 585 Newton's equations, 598
linear combination, 4, 509, 637 Newton's law of cooling, 486, 487
linear dependence, 637 P
Newton's method, 247
linear independence, 21, 68, 132, 499 modified, 248 p-series, 676
checking for, 71 Newtonian potential, 259, 418, 448 Pappus's theorem, 383
of vector functions, 637 node, 657 parallel, 18
linear momentum, 183 nonautonomous lines, 19
linear operator, 492 system, 581 parallelepiped, 41
linear system, 46 nonhomogeneous, 513 parallelogram rule, 13
linearity, 102 nonleading, 63 parameter, 18, 21
of integration, 333 norm, 155 parametric representation, 215
linearly dependent, 68 normal form of a curve, 174
linearly independent, 68, 511 system, 586 by arc length, 378
local normal form system, 573 of a line, 18
inverse, 273 normal lapse rate, 184 of a plane, 21
maximum, 283 normal mode, 597, 603 of a surface, 205
minimum, 283 normal probability density, 356 parametrization
property, 273 normal vector, 256 partial derivative, 198, 199
logarithmic potential, 259, 416 normalization, vector, 28 particle, 367
Lorenz system, 613 normalized equation, 522 particular solution, 515
Lotka-Volterra equations, 602, 614 line or plane, 35 path
normalized form steepest ascent, 257, 299
linear first-order, 481 pendulum equation
M normalized function, discontinuous, 711 derivation, 579
null-space, 127 periodic extension, 709
main diagonal, 78
periodic function, 706
mass, total, 331
phase angle, 506, 532
matrix, 59 O phase portrait, 572, 577
matrix exponential, 626 odd function, 131, 720 phase shift, 507
matrix product, 75 Ohm's law, 52 phase space, 558, 576
maximum resonance frequency, 539 one-to-one, 108 piecewise monotone function, 709
Maxwell distribution, 359 open set, 218 piecewise smooth, 397
Maxwell's equations, 444 operations piecewise smooth surface, 427
mean speed, 359 vector, 2 plane, 21, 70
mean, distribution, 356 operations on vectors equation of, 33
median, 24 rules for, 2 parametric representation, 21
mesh, grid, 323 operator, 120 through the origin, 21
midpoint, 12 differential, 120 pointwise convergence, 682
midpoint approximation, 360 orientable surface, 427 Poisson equation, 452
minor, 88 orientation polar coordinates, 303
moment, 348, 350 positive, 426 polygonally connected, 236
moment of inertia, 353 oriented border, positive, 426 polynomial
momentum, 188 oriented boundary trigonometric, 138
angular, 183 surface, 432 polynomials, vector space of, 113
position vector, 9, 15 Riemann zeta-function, 676 standard normal vector, 420
positive definite, 167 right side, 59 state space, 576
positive orientation, 426, 431 right-hand rule, 40 steady-state, 535
boundary, 438 rotating shaft, 513 steepest ascent, 299
positively oriented border, 438 row rearrangement, 83 path, 257
positively oriented boundary row-by-column multiplication, 76 steepest descent, 299
surface, 432 rows, 59 Stokes's Theorem, 439
positivity plane, 405
of dot product, 24, 155
of inner product, 155
S strict
maximum, 293
of integral, 333 saddle point, 291, 647, 655
minimum, 293
of length, 28, 156 scalar, 1
subspace, 114
potential scalar curl, 391, 405
proper, 116
logarithmic, 259, 416 scalar multiple, 1, 75
trivial, 116
of a function, 123
Newtonian, 259 sum, 74, 661
potential energy, 412, 585, 604 scalar multiplication, 11
infinite series, 661
scalar triple product, 41, 99
potential function, 259, 260, 585, 603 of functions, 123
second-derivative test, 293
power series, 688 sum of vectors, 1
separation of variables
prediction-correction, 468 superposition principle, 70,
first-order equation, 472
principal normal, 383 515
separation of variables, second-order
principal subminor, 298 surface
linear, 722
probability, 356 orientable, 427
separation of variables, second-order
probability density, 356 revolution, 212
partial, 729
product, 60 surface area, 380
shear transformation, 111
product rule, 98 surface integral, 424
simple region, 399, 432
projectile surface of revolution, 380,
simply connected, 446
air resistance, 579 431
Simpson approximation, 361
projection, 8, 28, 165 symmetric, 87
singular point, 208
coordinate, 222 symmetric matrix, 167
singularity, infinite, 260
orthogonal, 164 Jacobian, 414
skew-symmetric, 87
vector, 29 symmetry
slope field, 461
proper subspace, 116 dot product, 24
smooth
Pythagoras theorem, 26 of dot product, 155
curve, 178
surface, 208 of inner product, 155
Q smooth surface synchronous orbit, 604

quaternion, 43 piece, 419


piecewise, 427 T
solid angle, 431
R solution, 460 tail, 15
ℝ, 1 span, 116, 132
spanning set, 132 tangent
ℝⁿ, 1
rabbit and fox, 188 speed, 180 plane, 201, 206, 256
radius of convergence, 688 angular, 182 tangent approximation, 230
random walk, 55 trajectory, 574 tangent vector
range, 104, 173 spherical coordinates, 305 standard, 178
reduced matrix, 62, 64 spherical homogeneity, 347 tangential acceleration, 384
reflection in a subspace, 170 spherical pendulum, 614 Taylor approximation
remainder, Taylor, 665 spiral point, 657 first-degree, 230
removable discontinuity, 225 square, 59 first-degree, 242
resonant, 536 square wave, 719 higher degree, 237, 666
reversed triangle inequality, 32 stable equilibrium point, 561 Taylor polynomial, 665
Riemann sum, 324 standard basis, 5, 133 Taylor remainder, 665

Taylor series, 666 triple product vector integral, 185, 349


temperature gradient, 258, 430 scalar, 41 vector space, 3
term test, 674 trivial subspace, 116 matrix, 112
terminal velocity, 521 tuning, 539 of continuous functions, 113
trigonometric polynomial, 161 of polynomials, 113
tip, 15 U of sequences, 113
torque, 183 over the real numbers, 112
Torricelli's equation, 480 uncoupled, 572 vector surface differential, 427
total energy, 412, 585, 603, 605 underdamping, 533 velocity
total mass, 331, 349, 420 uniform convergence, 683 flow, 30
curve, 379 uniform orbital speed, 604 velocity vector, 180
trace, 171 unit vector, 28 volume, 323
trace, matrix, 437 upper triangular, 86 n-dimensional, 331
trajectory, 572 volume element, 341
transfer function, 549 V
transformation, 120, 273 van der Pol equation, 614 W
coordinate, 273 variance, 357, 359
transient, 535 wave equation, 204
variation of parameters, 521, 643 work, 30, 369, 370
translation vector, 1, 3
operator, 225, 244 Wronskian determinant, 523, 530
geometric, 8
vector, 13, 15 in ℝⁿ, 1
transpose, 87, 96, 169 normalization, 28 Z
triangle inequality, 28, 156 unit, 28 zero matrix, 75
trigonometric polynomial, 138, 706, 713 vector addition, 1, 13 zero solution, 50
trigonometric series, 706 vector field, 252, 367, 575 zero subspace, 116