Linear Algebra


Stephen H. Friedberg
Arnold J. Insel
Lawrence E. Spence

Illinois State University

Prentice Hall, Englewood Cliffs, New Jersey 07632

Library of Congress Cataloging-in-Publication Data

Friedberg, Stephen H.
    Linear algebra / Stephen H. Friedberg, Arnold J. Insel,
  Lawrence E. Spence.—2nd ed.
      p.  cm.
    Includes indexes.
    ISBN 0-13-537102-3
    1. Algebras, Linear.  I. Insel, Arnold J.  II. Spence, Lawrence E.
  III. Title.
  QA184.F75 1989                                          88-28568
  512'.5—dc19                                                  CIP

Editorial/production supervision and interior design: Kathleen M. Lafferty
Cover design: Wanda Lubelska
Manufacturing buyer: Paula Massenaro
Cover illustration: Based on a lithograph by Victor Vasarely

© 1989, 1979 by Prentice-Hall, Inc.
A Paramount Communications Company
Englewood Cliffs, New Jersey 07632

All rights reserved. No part of this book may be reproduced, in any form or by any means, without permission in writing from the publisher.

Printed in the United States of America

10 9 8 7

ISBN 0-13-537102-3

Prentice-Hall International (UK) Limited, London
Prentice-Hall of Australia Pty. Limited, Sydney
Prentice-Hall Canada Inc., Toronto
Prentice-Hall Hispanoamericana, S.A., Mexico
Prentice-Hall of India Private Limited, New Delhi
Prentice-Hall of Japan, Inc., Tokyo
Simon & Schuster Asia Pte. Ltd., Singapore
Editora Prentice-Hall do Brasil, Ltda., Rio de Janeiro

To our families

Contents

Preface

1  Vector Spaces
   1.1  Introduction
   1.2  Vector Spaces
   1.3  Subspaces  14
   1.4  Linear Combinations and Systems of Linear Equations  21
   1.5  Linear Dependence and Linear Independence  31
   1.6  Bases and Dimension  35
   1.7* Maximal Linearly Independent Subsets  49
        Index of Definitions for Chapter 1  52

2  Linear Transformations and Matrices  65
   2.1  Linear Transformations, Null Spaces, and Ranges  65
   2.2  The Matrix Representation of a Linear Transformation  72
   2.3  Composition of Linear Transformations and Matrix Multiplication  84
   2.4  Invertibility and Isomorphisms  93
   2.5  The Change of Coordinate Matrix  101
   2.6* Dual Spaces  108
   2.7* Homogeneous Linear Differential Equations with Constant Coefficients
        Index of Definitions for Chapter 2  125

3  Elementary Matrix Operations and Systems of Linear Equations  127
   3.1  Elementary Matrix Operations and Elementary Matrices  128
   3.2  The Rank of a Matrix and Matrix Inverses  132
   3.3  Systems of Linear Equations—Theoretical Aspects  147
   3.4  Systems of Linear Equations—Computational Aspects  160
        Index of Definitions for Chapter 3  170

4  Determinants  171
   4.1  Determinants of Order 2  171
   4.2  Determinants of Order n  182
   4.3  Properties of Determinants  190
   4.4  Summary—Important Facts about Determinants  205
        Index of Definitions for Chapter 4  213

5  Diagonalization  214
   5.1  Eigenvalues and Eigenvectors  214
   5.2  Diagonalizability  231
   5.3* Matrix Limits and Markov Chains  252
   5.4  Invariant Subspaces and the Cayley-Hamilton Theorem  280
        Index of Definitions for Chapter 5  293

6  Inner Product Spaces  295
   6.1  Inner Products and Norms  295
   6.2  The Gram-Schmidt Orthogonalization Process and Orthogonal Complements  304
   6.3  The Adjoint of a Linear Operator  314
   6.4  Normal and Self-Adjoint Operators  325
   6.5  Unitary and Orthogonal Operators and Their Matrices  333
   6.6  Bilinear and Quadratic Forms  348
   6.7  Orthogonal Projections and the Spectral Theorem  355
   6.8* Einstein's Special Theory of Relativity  385
   6.9* Conditioning and the Rayleigh Quotient  398
   6.10* The Geometry of Orthogonal Operators  406
        Index of Definitions for Chapter 6  415

7  Canonical Forms  416
   7.1  Generalized Eigenvectors  416
   7.2  Jordan Canonical Form  430
   7.3  The Minimal Polynomial  451
   7.4* Rational Canonical Form  459
        Index of Definitions for Chapter 7  480

Appendices  481
   A  Sets  481
   B  Functions  483
   C  Fields  484
   D  Complex Numbers  488
   E  Polynomials  492

Answers to Selected Exercises  501

List of Frequently Used Symbols  518

Index of Theorems  519

Index  522

*Sections denoted by an asterisk are optional.

Preface

The language and concepts of matrix theory and, more generally, of linear algebra have come into widespread usage in the social and natural sciences, computer science, and statistics. In addition, linear algebra continues to be of great importance in modern treatments of geometry and analysis.

The primary purpose of Linear Algebra, Second Edition, is to present a careful treatment of the principal topics of linear algebra and to illustrate the power of the subject through a variety of applications. Although the only formal prerequisite for the book is a one-year course in calculus, the material in Chapters 6 and 7 requires the mathematical sophistication of typical college juniors and seniors (who may or may not have had some previous exposure to the concepts of the subject).

The book is organized to permit a number of different courses (ranging from three to six semester hours in length) to be taught from it. The core material (vector spaces, linear transformations and matrices, systems of linear equations, determinants, and diagonalization) is found in Chapters 1 through 5. The remaining chapters, treating inner product spaces and canonical forms, are completely independent and may be studied in any order. In addition, throughout the book are a variety of applications to such areas as differential equations, economics, geometry, and physics. These applications are not central to the mathematical development, however, and may be excluded at the discretion of the instructor.

We have attempted to make it possible for many of the important topics of linear algebra to be covered in a one-semester course. This goal has led us to develop the major topics with fewer unnecessary preliminaries than in a traditional approach. (Our treatment of the Jordan canonical form, for instance, does not require any theory of polynomials.) The resulting economy permits us to cover most of the book (omitting many of the optional sections and a detailed discussion of determinants) in a one-semester four-hour course for students who have had some prior exposure to linear algebra.

Chapter 1 of the book presents the basic theory of finite-dimensional vector spaces—subspaces, linear combinations, linear dependence and independence, bases, and dimension. The chapter concludes with an optional section in which we prove the existence of a basis in infinite-dimensional vector spaces.

Linear transformations and their relationship to matrices are the subject of Chapter 2. We discuss there the null space and range of a linear transformation, matrix representations of a linear transformation, isomorphisms, and change of coordinates. Optional sections on dual spaces and homogeneous linear differential equations end the chapter.

The applications of vector space theory and linear transformations to systems of linear equations are found in Chapter 3. We have chosen to defer this important subject so that it can be presented as a consequence of the preceding material. This approach allows the familiar topic of linear systems to illuminate the abstract theory and permits us to avoid messy matrix computations in the presentation of Chapters 1 and 2. There will be occasional examples in these chapters, however, where we shall want to solve systems of linear equations. (Of course, these examples will not be a part of the theoretical development.) The necessary background is contained in Section 1.4.

Determinants, the subject of Chapter 4, are of much less importance than they once were. In a short course, we prefer to treat determinants lightly so that more time may be devoted to the material in Chapters 5 through 7. Consequently we have presented two alternatives in Chapter 4—a complete development of the theory (Sections 4.1 through 4.3) and a summary of important facts that are needed for the remaining chapters (Section 4.4).

Chapter 5 discusses eigenvalues, eigenvectors, and diagonalization. One of the most important applications of this material occurs in computing matrix limits. We have therefore included an optional section on matrix limits and Markov chains in this chapter even though the most general statement of some of the results requires a knowledge of the Jordan canonical form.  Section 5.4 contains material on invariant subspaces and the Cayley-Hamilton theorem.

Inner product spaces are the subject of Chapter 6. The basic mathematical structure of inner product spaces (inner products; the Gram-Schmidt process; orthogonal complements; adjoint transformations; normal, self-adjoint, orthogonal, and unitary operators; orthogonal projections; and the spectral theorem) is contained in Sections 6.1 through 6.7. Sections 6.8, 6.9, and 6.10 contain diverse applications of the rich inner product space structure.

Canonical forms are treated in Chapter 7. Sections 7.1 and 7.2 develop the Jordan canonical form, Section 7.3 presents the minimal polynomial, and Section 7.4 discusses the rational canonical form.
There are five appendices. The first four, which discuss sets, functions, fields, and complex numbers, respectively, are intended to review basic ideas used throughout the book. Appendix E on polynomials is used primarily in Chapters 5 and 7, especially in Section 7.4. We prefer not to discuss the appendices independently but rather to refer to them as the need arises.

The following diagram illustrates the dependencies among the various chapters.

    Chapter 1
        |
    Chapter 2
        |
    Chapter 3
        |
    Sections 4.1-4.3 or Section 4.4
        |
    Sections 5.1 and 5.2
       /              \
  Section 5.4     Chapters 6 and 7

One final word is required about our notation. Sections denoted by an asterisk (*) are optional and may be omitted as the instructor sees fit. An exercise denoted by the dagger symbol (†) is not optional, however—we use this symbol to identify an exercise that will be cited at some later point of the text.

Special thanks are due to Kathleen M. Lafferty for her cooperation and fine work during the production process.

DIFFERENCES BETWEEN THE FIRST AND SECOND EDITIONS

In the 10 years that have elapsed since the first edition, we have discovered more efficient ways to present a number of the key topics contained in the first edition. Most of the existing material that relied on direct sums and the results about direct sums has been rewritten or relegated to the exercises. Also, a number of exercises and examples have been added to reflect the new material.

In Chapter 1 many proofs are more readable without the dependence on direct sums. Section 1.3 no longer contains results about direct sums except in the exercises. Section 1.4 now employs Gaussian elimination.

In Chapter 2 the proof of the dimension theorem in Section 2.1 no longer uses direct sums. In Section 2.3 the proof of Theorem 2.15, and in Section 2.4 the proof of Theorem 2.21, have been improved. Also, exercises on quotient spaces have been added to Section 2.4. The material on the change of coordinate matrix in Section 2.5 has been rewritten. In particular, Theorem 2.24 is less general, but more understandable, than in the first edition; the general case of Theorem 2.24 is included in the exercises.

In Chapter 3 systems of linear equations are now solved by the more efficient Gauss-Jordan method, rather than by the Gaussian elimination technique used in the first edition.

In Chapter 4 a considerably simplified proof of Cramer's rule, which no longer depends on the classical adjoint, is presented. In fact, the classical adjoint now appears as a simple exercise that exploits the new development of Cramer's rule.
appears
In Chapter 5 the proofs of Theorems 5.10 and 5.12 in Section 5.2 have been simplified. Theorem 5.14 characterizes diagonalizability without the earlier dependence on direct sums. Direct sums are now treated in an optional subsection. Gerschgorin's disk theorem is now contained in Section 5.3, and Section 5.4 combines the Cayley-Hamilton theorem and invariant subspaces. The minimal polynomial is now found in Section 7.3.

Chapters 6 and 7 in the first edition have been interchanged in the second edition. This alteration reflects not only the authors' tastes, but those of a number of our readers.

In Chapter 6 the material in Section 6.2 has been rewritten to improve the clarity and flow of the topics. The results needed for least squares now appear in Section 6.2 and no longer rely on direct sums. Least squares and minimal solutions are introduced earlier and are now contained in Section 6.3. In Section 6.4 the addition of Schur's theorem allows a completely new and improved approach to normal and self-adjoint operators which no longer relies on results concerning T*-invariant subspaces. These results are now exercises. Section 6.5 has been rewritten to take advantage of the new approach in Section 6.4. A new subsection of Section 6.5 treats rigid motions in the plane. Sylvester's law of inertia is now included with the material on bilinear forms in Section 6.6. For reasons of continuity, the optional sections on the special theory of relativity, conditioning, and the geometry of orthogonal operators are now located at the end of the chapter.

In Chapter 7 the proofs in Section 7.1 have been improved. In Section 7.2 the proof of the Jordan canonical form has been simplified and no longer uses direct sums, which are now dealt with in a subsection. Section 7.3 treats the minimal polynomial. In Section 7.4 the proof of the rational canonical form has been simplified and no longer uses either direct sums or quotient spaces.

Stephen H. Friedberg
Arnold J. Insel
Lawrence E. Spence

1  Vector Spaces

1.1 INTRODUCTION

Many familiar physical notions, such as forces, velocities,† and accelerations, involve both a magnitude (the amount of the force, velocity, or acceleration) and a direction. Any such entity involving both magnitude and direction is called a vector. Vectors are represented by arrows in which the length of the arrow denotes the magnitude of the vector and the direction of the arrow represents the direction of the vector. In most physical situations involving vectors, only the magnitude and direction of the vector are significant; consequently, we regard vectors with the same magnitude and direction as being equal irrespective of their positions.

In this section the geometry of vectors is discussed. This geometry is derived from physical experiments that test the manner in which two vectors interact.

Familiar situations suggest that when two vectors act simultaneously at the same point, the magnitude of the resultant vector (the vector obtained by adding the two original vectors) need not be the sum of the magnitudes of the original two. For example, a swimmer swimming upstream at the rate of 2 miles per hour against a current of 1 mile per hour does not progress at the rate of 3 miles per hour. For in this instance the motions of the swimmer and the current oppose each other, and the rate of progress of the swimmer is only 1 mile per hour upstream. If, however, the swimmer is moving downstream (with the current), then his or her rate of progress is 3 miles per hour downstream.

Experiments show that vectors add according to the following parallelogram law (see Figure 1.1).

Parallelogram Law for Vector Addition. The sum of two vectors x and y that act at the same point P is the vector beginning at P that is represented by the diagonal of the parallelogram having x and y as adjacent sides.

†The word "velocity" is being used here in its scientific sense—as an entity having both magnitude and direction. The magnitude of a velocity (without regard for the direction of motion) is called its speed.

[Figure 1.1: the parallelogram law for vector addition]

Since opposite sides of a parallelogram are parallel and of equal length, the endpoint Q of the arrow representing x + y can also be obtained by allowing x to act at P and then allowing y to act at the endpoint of x. Similarly, the endpoint of the vector x + y can be obtained by first permitting y to act at P and then allowing x to act at the endpoint of y. Thus two vectors x and y that both act at a point P may be added "tail-to-head"; that is, either x or y may be applied at P and a vector having the same magnitude and direction as the other may be applied to the endpoint of the first. If this is done, the endpoint of the second vector is the endpoint of x + y.

The addition of vectors can be described algebraically with the use of analytic geometry. In the plane containing x and y, introduce a coordinate system with P at the origin. Let (a1, a2) denote the endpoint of x and (b1, b2) denote the endpoint of y. Then as Figure 1.2(a) shows, the endpoint Q of x + y is (a1 + b1, a2 + b2). Henceforth, when a reference is made to the coordinates of the endpoint of a vector, the vector should be assumed to emanate from the origin. Moreover, since a vector beginning at the origin is completely determined by its endpoint, we will sometimes refer to the point x rather than the endpoint of the vector x if x is a vector emanating from the origin.

Besides the operation of vector addition, there is another natural operation that can be performed on vectors—the length of a vector may be magnified or contracted without changing the direction of the vector. This operation, called scalar multiplication, consists of multiplying the vector by a real number.

[Figure 1.2: (a) the endpoint of x + y is (a1 + b1, a2 + b2); (b) the endpoint of tx is (ta1, ta2)]

If the vector x is represented by an arrow, then for any real number t, the vector tx will be represented by an arrow having length |t| times the length of the arrow representing x, having the same direction as x if t > 0 and the opposite direction if t < 0. Two nonzero vectors x and y are called parallel if y = tx for some nonzero real number t. (Thus nonzero vectors having the same direction or opposite directions are parallel.)

To describe scalar multiplication algebraically, again introduce a coordinate system into a plane containing the vector x so that x emanates from the origin. If the endpoint of x has coordinates (a1, a2), then the coordinates of the endpoint of tx are easily shown to be (ta1, ta2) [see Figure 1.2(b)].

The algebraic descriptions of vector addition and scalar multiplication for vectors in a plane yield the following properties for arbitrary vectors x, y, and z and arbitrary real numbers a and b:

1. x + y = y + x.
2. (x + y) + z = x + (y + z).
3. There exists a vector denoted 0 such that x + 0 = x for each vector x.
4. For each vector x there is a vector y such that x + y = 0.
5. 1x = x.
6. (ab)x = a(bx).
7. a(x + y) = ax + ay.
8. (a + b)x = ax + bx.

Arguments similar to those above show that these eight properties, as well as the geometric interpretations of vector addition and scalar multiplication, are true also for vectors acting in space rather than in a plane. We will use these results to write equations of lines and planes in space.

Consider first the equation of a line in space that passes through two distinct points P and Q. Let O denote the origin of a coordinate system in space, and let u and v denote the vectors that begin at O and end at P and Q, respectively. If w denotes the vector beginning at P and ending at Q, then "tail-to-head" addition shows that u + w = v, and hence w = v - u, where -u denotes the vector (-1)u. (See Figure 1.3, in which quadrilateral OPQR is a parallelogram.) Since a scalar multiple of w is parallel to w but possibly of a different length than w, any point on the line joining P and Q may be obtained as the endpoint of a vector of the form tw that begins at P. Conversely, the endpoint of every vector of the form tw that begins at P lies on the line joining P and Q. Thus an equation of the line through P and Q is x = u + tw = u + t(v - u), where t is a real number and x denotes an arbitrary point on the line. Notice also that the endpoint R of the vector w in Figure 1.3 has coordinates equal to the difference of the coordinates of Q and P.

[Figure 1.3]
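The eight properties above can be spot-checked numerically. The sketch below (the tuple representation, helper names, and sample values are our own, not the text's) verifies each property for particular plane vectors; a finite check illustrates the properties but of course does not prove them.

```python
# Plane vectors as coordinate pairs; addition and scalar multiplication
# are coordinatewise, as described in the text.
def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

def scale(t, x):
    return (t * x[0], t * x[1])

x, y, z = (3.0, 1.0), (-2.0, 4.0), (5.0, -6.0)
a, b = 2.0, -3.0

assert add(x, y) == add(y, x)                                 # property 1
assert add(add(x, y), z) == add(x, add(y, z))                 # property 2
assert add(x, (0.0, 0.0)) == x                                # property 3
assert add(x, scale(-1.0, x)) == (0.0, 0.0)                   # property 4
assert scale(1.0, x) == x                                     # property 5
assert scale(a * b, x) == scale(a, scale(b, x))               # property 6
assert add(scale(a, x), scale(a, y)) == scale(a, add(x, y))   # property 7
assert add(scale(a, x), scale(b, x)) == scale(a + b, x)       # property 8
```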

Example 1

We will find the equation of the line through the points P and Q having coordinates (-2, 0, 1) and (4, 5, 3), respectively. The endpoint of the vector emanating from the origin and having the same direction as the vector beginning at P and terminating at Q has coordinates (4, 5, 3) - (-2, 0, 1) = (6, 5, 2). Hence the desired equation is

    x = (-2, 0, 1) + t(6, 5, 2).
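The computation in Example 1 can be reproduced mechanically: a point on the line is x = u + t(v - u), computed coordinatewise. The function name and representation below are ours, not the text's.

```python
def line_point(p, q, t):
    """Point on the line through p and q: x = p + t(q - p), coordinatewise."""
    w = tuple(qi - pi for pi, qi in zip(p, q))   # direction vector w = q - p
    return tuple(pi + t * wi for pi, wi in zip(p, w))

P, Q = (-2, 0, 1), (4, 5, 3)
# The direction vector has coordinates (4,5,3) - (-2,0,1) = (6, 5, 2).
assert tuple(qi - pi for pi, qi in zip(P, Q)) == (6, 5, 2)
assert line_point(P, Q, 0) == P      # t = 0 gives P
assert line_point(P, Q, 1) == Q      # t = 1 gives Q
```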

Now let P, Q, and R denote any three noncollinear points in space. These points determine a unique plane, whose equation can be found by the use of our previous observations about vectors. Let u and v denote the vectors beginning at P and ending at Q and R, respectively. Observe that any point in the plane containing P, Q, and R is the endpoint S of a vector x beginning at P and having the form t1u + t2v for some real numbers t1 and t2. The endpoint of t1u will be the point of intersection of the line through P and Q with the line through S parallel to the line through P and R (see Figure 1.4). A similar procedure will locate the endpoint of t2v. Moreover, for any real numbers t1 and t2, t1u + t2v is a vector lying in the plane containing P, Q, and R. It follows that an equation of the plane containing P, Q, and R is

    x = P + t1u + t2v,

where t1 and t2 are arbitrary real numbers and x denotes an arbitrary point in the plane.

[Figure 1.4]

Example 2

Let P, Q, and R be the points having coordinates (1, 0, 2), (-3, -2, 4), and (1, 8, -5), respectively. The endpoint of the vector emanating from the origin and having the same length and direction as the vector beginning at P and terminating at Q is

    (-3, -2, 4) - (1, 0, 2) = (-4, -2, 2).

Similarly, the endpoint of the vector emanating from the origin and having the same length and direction as the vector beginning at P and terminating at R is

    (1, 8, -5) - (1, 0, 2) = (0, 8, -7).

Hence the equation of the plane containing the three given points is

    x = (1, 0, 2) + t1(-4, -2, 2) + t2(0, 8, -7).

Any mathematical structure possessing the eight properties on page 3 is called a "vector space." In the next section we formally define a vector space and consider many examples of vector spaces other than the ones mentioned above.
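Example 2 can be checked the same way. The helper below (our own sketch) computes x = P + t1·u + t2·v with u = Q - P and v = R - P, coordinatewise.

```python
def plane_point(p, q, r, t1, t2):
    """Point in the plane through p, q, r: x = p + t1*u + t2*v,
    where u = q - p and v = r - p."""
    u = tuple(qi - pi for pi, qi in zip(p, q))
    v = tuple(ri - pi for pi, ri in zip(p, r))
    return tuple(pi + t1 * ui + t2 * vi for pi, ui, vi in zip(p, u, v))

P, Q, R = (1, 0, 2), (-3, -2, 4), (1, 8, -5)
# u = (-4, -2, 2) and v = (0, 8, -7), as computed in Example 2.
assert plane_point(P, Q, R, 0, 0) == P   # t1 = t2 = 0 gives P
assert plane_point(P, Q, R, 1, 0) == Q   # t1 = 1, t2 = 0 gives Q
assert plane_point(P, Q, R, 0, 1) == R   # t1 = 0, t2 = 1 gives R
```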

EXERCISES

1. Determine if the vectors emanating from the origin and terminating at the following pairs of points are parallel.
   (a) (3, 1, 2) and (6, 4, 2)
   (b) (-3, 1, 7) and (9, -3, -21)
   (c) (5, -6, 7) and (-5, 6, -7)
   (d) (2, 0, -5) and (5, 0, -2)

2. Find the equations of the lines through the following pairs of points in space.
   (a) (3, -2, 4) and (-5, 7, 1)
   (b) (2, 4, 0) and (-3, -6, 0)
   (c) (3, 7, 2) and (3, 7, -8)
   (d) (-2, -1, 5) and (3, 9, 7)

3. Find the equations of the planes containing the following points in space.
   (a) (2, -5, -1), (0, 4, 6), and (-3, 7, 1)
   (b) (3, -6, 7), (-2, 0, -4), and (5, -9, -2)
   (c) (-8, 2, 0), (1, 3, 0), and (6, -5, 0)
   (d) (1, 1, 1), (5, 5, 5), and (-6, 4, 2)

4. What are the coordinates of the vector 0 in the Euclidean plane that satisfies condition 3 on page 3? Prove that this choice of coordinates satisfies condition 3.

5. Prove that if the vector x emanates from the origin of the Euclidean plane and terminates at the point with coordinates (a1, a2), then the vector tx that emanates from the origin terminates at the point with coordinates (ta1, ta2).

6. Show that the midpoint of the line segment joining the points (a, b) and (c, d) is ((a + c)/2, (b + d)/2).

7. Prove that the diagonals of a parallelogram bisect each other.

1.2 VECTOR SPACES

Because diverse entities, such as the forces acting in a plane and the polynomials with real number coefficients, both permit natural definitions of addition and scalar multiplication that possess properties 1 through 8 on page 3, it is natural to abstract these properties in the following definition.

Definition. A vector space (or linear space) V over a field† F consists of a set on which two operations (called addition and scalar multiplication, respectively) are defined so that for each pair of elements x, y in V there is a unique element x + y in V, and for each element a in F and each element x in V there is a unique element ax in V, such that the following conditions hold:

(VS 1) For all x, y in V, x + y = y + x (commutativity of addition).

(VS 2) For all x, y, z in V, (x + y) + z = x + (y + z) (associativity of addition).

(VS 3) There exists an element in V denoted by 0 such that x + 0 = x for each x in V.

(VS 4) For each element x in V there exists an element y in V such that x + y = 0.

(VS 5) For each element x in V, 1x = x.

(VS 6) For each pair of elements a, b in F and each element x in V, (ab)x = a(bx).

(VS 7) For each element a in F and each pair of elements x, y in V, a(x + y) = ax + ay.

(VS 8) For each pair of elements a, b in F and each element x in V, (a + b)x = ax + bx.

The elements x + y and ax are called the sum of x and y and the product of a and x, respectively.

The elements of the field F are called scalars and the elements of the vector space V are called vectors. The reader should not confuse this use of the word "vector" with the physical entity discussed in Section 1.1; the word "vector" is now being used to describe any element of a vector space.

A vector space will frequently be discussed in the text without explicitly mentioning its field of scalars. The reader is cautioned to remember, however, that every vector space will be regarded as a vector space over a given field, which will be denoted by F.

†See Appendix C. With few exceptions, however, the reader may interpret the word "field" to mean "field of real numbers" (which we denote by R) or "field of complex numbers" (which we denote by C).
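Because (VS 1)-(VS 8) are universally quantified statements, no program can prove that a proposed structure satisfies them; a finite spot check can, however, quickly expose a failing axiom. The checker below is entirely our own construction (not part of the text) and runs the eight conditions over small sample sets, here for pairs of integers under coordinatewise operations.

```python
def check_axioms(add, smul, zero, neg, vectors, scalars):
    """Spot-check (VS 1)-(VS 8) over finite samples; necessary, not sufficient."""
    for x in vectors:
        assert add(x, zero) == x                      # (VS 3)
        assert add(x, neg(x)) == zero                 # (VS 4)
        assert smul(1, x) == x                        # (VS 5)
        for y in vectors:
            assert add(x, y) == add(y, x)             # (VS 1)
            for z in vectors:
                assert add(add(x, y), z) == add(x, add(y, z))         # (VS 2)
            for a in scalars:
                assert smul(a, add(x, y)) == add(smul(a, x), smul(a, y))  # (VS 7)
        for a in scalars:
            for b in scalars:
                assert smul(a * b, x) == smul(a, smul(b, x))              # (VS 6)
                assert smul(a + b, x) == add(smul(a, x), smul(b, x))      # (VS 8)
    return True

add = lambda x, y: tuple(xi + yi for xi, yi in zip(x, y))
smul = lambda a, x: tuple(a * xi for xi in x)
neg = lambda x: smul(-1, x)
assert check_axioms(add, smul, (0, 0), neg,
                    [(0, 0), (1, 2), (-3, 5)], [-2, 0, 1, 3])
```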

In the remainder of this section we introduce several important examples of vector spaces that will be studied throughout the text. Observe that in describing a vector space it is necessary to specify not only the vectors but also the operations of addition and scalar multiplication.

An object of the form (a1, ..., an), where the entries ai are elements of a field F, is called an n-tuple with entries from F. Two n-tuples (a1, ..., an) and (b1, ..., bn) are defined to be equal if and only if ai = bi for i = 1, 2, ..., n.

Example 1

The set of all n-tuples with entries from a field F forms a vector space, which we denote by Fn, under the operations of coordinatewise addition and scalar multiplication; that is, if x = (a1, ..., an) ∈ Fn, y = (b1, ..., bn) ∈ Fn, and c ∈ F, then

    x + y = (a1 + b1, ..., an + bn)   and   cx = (ca1, ..., can).

For example, in R4,

    (3, -2, 0, 5) + (-1, 1, 4, 2) = (2, -1, 4, 7)

and

    -5(1, -2, 0, 3) = (-5, 10, 0, -15).

Since a 1-tuple with an entry from F may be regarded as an element of F, we will write F rather than F1 for the vector space of 1-tuples from F. Elements of Fn will often be written as column vectors rather than as row vectors (a1, ..., an).
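The two computations in R4 above can be verified directly. The helper names below are ours; the operations are the coordinatewise ones just defined.

```python
def vadd(x, y):
    """Coordinatewise addition in F^n."""
    return tuple(a + b for a, b in zip(x, y))

def smul(c, x):
    """Coordinatewise scalar multiplication in F^n."""
    return tuple(c * a for a in x)

# The two computations in R^4 from Example 1:
assert vadd((3, -2, 0, 5), (-1, 1, 4, 2)) == (2, -1, 4, 7)
assert smul(-5, (1, -2, 0, 3)) == (-5, 10, 0, -15)
```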

An m x n matrix with entries from a field F is a rectangular array of the form

    ( a11  a12  ...  a1n )
    ( a21  a22  ...  a2n )
    (  :    :         :  )
    ( am1  am2  ...  amn )

where each entry aij (1 ≤ i ≤ m, 1 ≤ j ≤ n) is an element of F. The entries ai1, ai2, ..., ain compose the ith row of the matrix, and this row will often be regarded as a row vector in Fn, whereas the entries a1j, a2j, ..., amj compose the jth column of the matrix and will often be regarded as a column vector in Fm.

In this book we denote matrices by capital italic letters (e.g., A, B, and C), and we denote the entry of a matrix A that lies in row i and column j by Aij. In addition, if the number of rows and columns of a matrix are equal, the matrix is called square. The m x n matrix having each entry equal to zero is called the zero matrix and is denoted by 0.

Two m x n matrices A and B are called equal if all their corresponding entries are equal, that is, if Aij = Bij for 1 ≤ i ≤ m and 1 ≤ j ≤ n.

Example 2

The set of all m x n matrices with entries from a field F is a vector space, which we denote by Mmxn(F), under the following operations of addition and scalar multiplication: For A, B ∈ Mmxn(F) and c ∈ F,

    (A + B)ij = Aij + Bij   and   (cA)ij = cAij.

For instance, addition and scalar multiplication in M2x3(R) are computed entrywise in this way.
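Entrywise matrix addition and scalar multiplication can be sketched as follows. The particular 2 x 3 matrices below are our own illustrative choices (the numerical instance displayed in the original is not recoverable from this scan), and the representation as tuples of row tuples is likewise ours.

```python
# m x n matrices stored as tuples of row tuples; operations are entrywise.
def madd(A, B):
    return tuple(tuple(a + b for a, b in zip(ra, rb)) for ra, rb in zip(A, B))

def msmul(c, A):
    return tuple(tuple(c * a for a in row) for row in A)

A = ((1, 0, -2),
     (3, 4, 1))      # a 2 x 3 matrix over R
B = ((5, -1, 2),
     (0, 2, -3))

assert madd(A, B) == ((6, -1, 0), (3, 6, -2))
assert msmul(2, A) == ((2, 0, -4), (6, 8, 2))
assert madd(A, B)[1][2] == A[1][2] + B[1][2]   # (A + B)_ij = A_ij + B_ij
```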

Example 3

Let S be any nonempty set and F be any field, and let ℱ(S, F) denote the set of all functions from S into F. Two elements f and g in ℱ(S, F) are called equal if f(s) = g(s) for each s ∈ S. The set ℱ(S, F) is a vector space under the operations of addition and scalar multiplication defined for f, g ∈ ℱ(S, F) and c ∈ F by

    (f + g)(s) = f(s) + g(s)   and   (cf)(s) = c[f(s)]

for each s ∈ S. Note that these are the familiar operations of addition and scalar multiplication for functions as used in algebra and calculus.

A polynomial with coefficients from a field F is an expression of the form

    f(x) = a_n x^n + a_(n-1) x^(n-1) + ... + a_1 x + a_0,

where n is a nonnegative integer and a_n, ..., a_1, a_0 are elements of F. If f(x) = 0, that is, if a_n = a_(n-1) = ... = a_0 = 0, then f(x) is called the zero polynomial, and the degree of f(x) is defined to be -1; otherwise, the degree of a polynomial is defined to be the largest exponent of x that appears in the representation

    f(x) = a_n x^n + a_(n-1) x^(n-1) + ... + a_1 x + a_0

with a nonzero coefficient. Note that the polynomials of degree zero are of the form f(x) = c for some nonzero scalar c. Two polynomials are equal if and only if they have the same degree and the coefficients of like powers of x are equal.

When F is a field containing an infinite number of elements, we will usually regard a polynomial with coefficients from F as a function from F into F. In this case the value of the function

    f(x) = a_n x^n + a_(n-1) x^(n-1) + ... + a_0

at c ∈ F is the scalar

    f(c) = a_n c^n + a_(n-1) c^(n-1) + ... + a_0.

Here either of the notations f or f(x) will be used for the polynomial function

    f(x) = a_n x^n + a_(n-1) x^(n-1) + ... + a_0.

Example 4

The set of all polynomials with coefficients from a field F is a vector space, which we denote by P(F), under the following operations: For

    f(x) = a_n x^n + a_{n−1} x^{n−1} + ··· + a_0

and

    g(x) = b_n x^n + b_{n−1} x^{n−1} + ··· + b_0

in P(F) and c ∈ F,

    (f + g)(x) = (a_n + b_n)x^n + (a_{n−1} + b_{n−1})x^{n−1} + ··· + (a_0 + b_0)

and

    (cf)(x) = ca_n x^n + ca_{n−1} x^{n−1} + ··· + ca_0.

We will see in Exercise 21 of Section 2.4 that the vector space defined in the example below is essentially the same as P(F).
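The operations of Example 4 are easy to sketch with coefficient lists. The representation below (a list [a_0, a_1, ..., a_n] of coefficients in ascending order) and the helper names are ours, for illustration only.

```python
# Polynomial addition and scalar multiplication on coefficient lists.

def poly_add(f, g):
    # pad the shorter list with zeros, then add coefficients of like powers
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [a + b for a, b in zip(f, g)]

def poly_scale(c, f):
    return [c * a for a in f]

f = [1, 2, 3]            # f(x) = 1 + 2x + 3x^2
g = [4, 5]               # g(x) = 4 + 5x
s = poly_add(f, g)       # 5 + 7x + 3x^2
t = poly_scale(2, f)     # 2 + 4x + 6x^2
```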

Example 5

Let F be any field. A sequence in F is a function a from the positive integers into F. As usual, the sequence a such that a(n) = a_n will be denoted {a_n}. Let V consist of all sequences {a_n} in F that have only a finite number of nonzero terms. If {a_n} and {b_n} are in V and t ∈ F, then {a_n} + {b_n} is that sequence {c_n} in V such that c_n = a_n + b_n (n = 1, 2, ...), and t{a_n} is that sequence {d_n} in V such that d_n = ta_n (n = 1, 2, ...). Under these operations V is a vector space.
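A convenient way to experiment with the sequences of Example 5 is to store only the nonzero terms. The sparse dictionary representation {index: value} and the helper names below are an assumption of this sketch, not part of the text.

```python
# Sequences with only finitely many nonzero terms, stored sparsely.

def seq_add(a, b):
    # c_n = a_n + b_n; drop terms that become zero so the support stays finite
    out = {}
    for n in set(a) | set(b):
        v = a.get(n, 0) + b.get(n, 0)
        if v != 0:
            out[n] = v
    return out

def seq_scale(t, a):
    # d_n = t * a_n
    return {n: t * v for n, v in a.items()} if t != 0 else {}

a = {1: 2, 5: -1}        # a_1 = 2, a_5 = -1, all other terms zero
b = {1: 3, 2: 4}
c = seq_add(a, b)        # c_1 = 5, c_2 = 4, c_5 = -1
```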

Our next two examples contain sets on which an addition and a scalar multiplication are defined, but which are not vector spaces.

Example 6

Let S = {(a1, a2): a1, a2 ∈ R}. For (a1, a2), (b1, b2) ∈ S and c ∈ R, define

    (a1, a2) + (b1, b2) = (a1 + b1, a2 − b2)    and    c(a1, a2) = (ca1, ca2).

Since (VS 1), (VS 2), and (VS 8) all fail to hold, S is not a vector space under these operations.
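The failure of (VS 1), commutativity of addition, in Example 6 can be seen on a single pair of points. The sketch below is ours, for illustration.

```python
# The addition of Example 6 is not commutative, so (VS 1) fails.

def add6(p, q):
    (a1, a2), (b1, b2) = p, q
    return (a1 + b1, a2 - b2)   # second coordinates are subtracted

x = (1, 2)
y = (3, 4)
left = add6(x, y)    # (4, -2)
right = add6(y, x)   # (4, 2)
```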

Example 7

Let S be as in Example 6, and for (a1, a2), (b1, b2) ∈ S and c ∈ R, define

    (a1, a2) + (b1, b2) = (a1 + b1, 0)    and    c(a1, a2) = (ca1, 0).

Then S is not a vector space under these operations because (VS 3) [hence (VS 4)] and (VS 5) fail.

We conclude this section with a few of the elementary consequences of the definition of a vector space.

Proposition 1.1 (Cancellation Law for Vector Addition). If x, y, and z are elements of a vector space V such that x + z = y + z, then x = y.

Proof. There exists an element v in V such that z + v = 0 (VS 4). Thus

    x = x + 0 = x + (z + v) = (x + z) + v = (y + z) + v = y + (z + v) = y + 0 = y

by (VS 2) and (VS 3).

Corollary 1. The vector 0 described in (VS 3) is unique.

Proof. Exercise.

Corollary 2. The vector y described in (VS 4) is unique.

Proof. Exercise.

The vector 0 in (VS 3) is called the zero vector of V, and the vector y in (VS 4) (that is, the unique vector such that x + y = 0) is called the additive inverse of x and is denoted by −x.

The following result contains some of the elementary properties of scalar multiplication.

Proposition 1.2. In any vector space V, the following statements are true:
(a) 0x = 0 for each x ∈ V.
(b) (−a)x = −(ax) = a(−x) for each a ∈ F and each x ∈ V.
(c) a0 = 0 for each a ∈ F.

Proof. (a) By (VS 8), (VS 1), and (VS 3) it follows that

    0x + 0x = (0 + 0)x = 0x = 0x + 0 = 0 + 0x.

Hence 0x = 0 by Proposition 1.1.

(b) The element −(ax) is the unique element of V such that ax + [−(ax)] = 0. Thus if ax + (−a)x = 0, Corollary 2 of Proposition 1.1 implies that (−a)x = −(ax). But by (VS 8),

    ax + (−a)x = [a + (−a)]x = 0x = 0

by (a). Hence (−a)x = −(ax). In particular, (−1)x = −x. So by (VS 6),

    a(−x) = a[(−1)x] = [a(−1)]x = (−a)x.

(c) The proof of (c) is similar to the proof of (a).
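Proposition 1.2 can be checked numerically in a concrete vector space such as R^2. The sketch below is ours, and of course a numerical check is no substitute for the proof above.

```python
# Sanity check of Proposition 1.2 in the vector space R^2.

def scale(a, x):
    return tuple(a * xi for xi in x)

def neg(x):
    return scale(-1, x)

x = (3.0, -2.0)
a = 4.0

zero_check = scale(0, x)           # (a): 0x should be the zero vector
left = scale(-a, x)                # (b): (-a)x ...
right = neg(scale(a, x))           #      ... equals -(ax)
```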

EXERCISES

1. Label the following statements as being true or false.
(a) Every vector space contains a zero vector.
(b) A vector space may have more than one zero vector.
(c) In any vector space ax = bx implies that a = b.
(d) In any vector space ax = ay implies that x = y.
(e) An element of F^n may be regarded as an element of M_{n×1}(F).
(f) An m × n matrix has m columns and n rows.
(g) In P(F) only polynomials of the same degree may be added.
(h) If f and g are polynomials of degree n, then f + g is a polynomial of degree n.
(i) If f is a polynomial of degree n and c is a nonzero scalar, then cf is a polynomial of degree n.
(j) A nonzero element of F may be considered to be an element of P(F) having degree zero.
(k) Two functions in F(S, F) are equal if and only if they have the same values at each element of S.

2. Write the zero vector of M_{3×4}(F).

3. If M = [matrix not legible in this copy], what are M_13, M_21, and M_22?

4. Perform the operations indicated.
(a)–(d) [matrices not legible in this copy]
(e) (2x^4 − 7x^3 + 4x + 3) + (8x^3 + 2x^2 − 6x + 7)
(f) (−3x^3 + 7x^2 + 8x − 6) + (2x^3 − 8x + 10)
(g) 5(2x^7 − 6x^4 + 8x^2 − 3x)
(h) 3(x^5 − 2x^3 + 4x + 2)

Exercises 5 and 6 show why the definitions of matrix addition and scalar multiplication (as defined in Example 2) are the appropriate ones.

5. Richard Gard ("Effects of Beaver on Trout in Sagehen Creek, California," J. Wildlife Management, 25, 221–242) reports the following numbers of trout having crossed beaver dams in Sagehen Creek:

Upstream Crossings
                 Fall    Spring    Summer
Brook trout
Rainbow trout
Brown trout

Downstream Crossings
                 Fall    Spring    Summer
Brook trout
Rainbow trout
Brown trout

[the numerical entries of these tables are not legible in this copy]

Record the upstream and downstream crossing data as two 3 × 3 matrices, and verify that the sum of these matrices gives the total number of crossings (both upstream and downstream) categorized by trout species and season.

6. At the end of May, a furniture store had the following inventory:

                       Early                 Mediter-
                       American   Spanish    ranean     Danish
Living room suites
Bedroom suites
Dining room suites

[the numerical entries of this table are not legible in this copy]

Record these data as a 3 × 4 matrix M. To prepare for its June sale, the store decided to double its inventory on each of the items listed above. Assuming that none of the present stock is sold until the additional furniture arrives, verify that the inventory on hand after the order is filled is described by the matrix 2M. If the inventory at the end of June is described by a matrix A [not legible in this copy], interpret 2M − A. How many suites of each category were sold during the June sale?

7. Let S = {0, 1} and F = R, the field of real numbers. In F(S, R), show that f = g and f + g = h, where f(x) = 2x + 1, g(x) = 1 + 4x − 2x^2, and h(x) = 5^x + 1.
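Since S = {0, 1} has only two points, the equalities asserted in Exercise 7 can be checked by direct evaluation (taking h(x) = 5^x + 1). The sketch below is ours, for illustration.

```python
# Checking the claims of Exercise 7 at the two points of S = {0, 1}.
# In F(S, R), functions are equal exactly when they agree at every point of S.

f = lambda x: 2 * x + 1
g = lambda x: 1 + 4 * x - 2 * x ** 2
h = lambda x: 5 ** x + 1

S = [0, 1]
f_equals_g = all(f(x) == g(x) for x in S)
sum_equals_h = all(f(x) + g(x) == h(x) for x in S)
```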

8. In any vector space V, show that (a + b)(x + y) = ax + ay + bx + by for any x, y ∈ V and any a, b ∈ F.

9. Prove Corollaries 1 and 2 to Proposition 1.1 and Proposition 1.2(c).

10. Let V denote the set of all differentiable real-valued functions defined on the real line. Prove that V is a vector space under the operations of addition and scalar multiplication defined in Example 3.

11. Let V = {0} consist of a single vector 0, and define 0 + 0 = 0 and c0 = 0 for each c in F. Prove that V is a vector space over F. (V is called the zero vector space.)

12. A real-valued function f defined on the real line is called an even function if f(−x) = f(x) for each real number x. Prove that the set of even functions defined on the real line, with the operations of addition and scalar multiplication defined in Example 3, is a vector space.

13. Let V denote the set of ordered pairs of real numbers. If (a1, a2) and (b1, b2) are elements of V and c is an element of R, define

    (a1, a2) + (b1, b2) = (a1 + b1, a2b2)    and    c(a1, a2) = (ca1, a2).

Is V a vector space under these operations? Justify your answer.

14. Let V = {(a1, ..., an): ai ∈ C for i = 1, 2, ..., n}. Is V a vector space over the field of real numbers with the operations of coordinatewise addition and multiplication?

15. Let V = {(a1, ..., an): ai ∈ R for i = 1, 2, ..., n}. Is V a vector space over the field of complex numbers with the operations of coordinatewise addition and multiplication?

16. Let V denote the set of all m × n matrices with real number entries. Is V a vector space over the field of rational numbers under the usual definitions of matrix addition and scalar multiplication?

17. Let V = {(a1, a2): a1, a2 ∈ R}. For (a1, a2), (b1, b2) ∈ V and c ∈ R, define

    (a1, a2) + (b1, b2) = (a1 + b1, a2 + b2)

and

    c(a1, a2) = (0, 0) if c = 0    and    c(a1, a2) = (ca1, a2/c) if c ≠ 0.

Is V a vector space under these operations? Justify your answer.

18. Let V = {(a1, a2): a1, a2 ∈ C}. For (a1, a2), (b1, b2) ∈ V and c ∈ C, define

    (a1, a2) + (b1, b2) = (a1 + 2b1, a2 + 3b2)    and    c(a1, a2) = (ca1, ca2).

Is V a vector space under these operations? Justify your answer.

19. Let V = {(a1, a2): a1, a2 ∈ F}, where F is an arbitrary field. Define addition of elements of V coordinatewise, and for c ∈ F and (a1, a2) ∈ V, define

    c(a1, a2) = (a1, 0).

Is V a vector space under these operations? Justify your answer.

20. How many elements are there in the vector space M_{m×n}(Z2)?

1.3 SUBSPACES

In the study of any algebraic structure, it is of interest to examine subsets that possess the same structure as the set under consideration. The appropriate notion of substructure for vector spaces is introduced in this section.

Definition. A subset W of a vector space V over a field F is called a subspace of V if W is a vector space over F under the operations of addition and scalar multiplication defined on V.

In any vector space V, note that V and {0} are subspaces. The latter is called the zero subspace of V.

Fortunately, it is not necessary to verify all of the vector space conditions in order to prove that a subset of a vector space V is in fact a subspace. Since conditions (VS 1), (VS 2), (VS 5), (VS 6), (VS 7), and (VS 8) hold for all elements of V, these conditions automatically hold for the elements of any subset of V. Thus a subset W of V is a subspace of V if and only if the following four conditions hold:

1. x + y ∈ W whenever x ∈ W and y ∈ W.
2. ax ∈ W whenever a ∈ F and x ∈ W.
3. The zero vector of V belongs to W.
4. The additive inverse of each element of W belongs to W.

Actually, condition 4 is redundant, as the following theorem shows.

Theorem 1.3. Let V be a vector space and W a subset of V. Then W is a subspace of V if and only if the following three conditions hold for the operations defined in V:
(a) 0 ∈ W.
(b) x + y ∈ W whenever x ∈ W and y ∈ W.
(c) ax ∈ W whenever a ∈ F and x ∈ W.
Proof. If W is a subspace of V, then W is a vector space under the operations of addition and scalar multiplication defined on V. Hence conditions (b) and (c) hold, and there exists an element 0′ ∈ W such that x + 0′ = x for each x ∈ W. But also x + 0 = x, and thus 0′ = 0 by Proposition 1.1. So condition (a) holds.

Conversely, if conditions (a), (b), and (c) hold, the discussion preceding this theorem shows that W is a subspace of V if the additive inverse of each element of W belongs to W. But if x ∈ W, then (−1)x belongs to W by condition (c), and −x = (−1)x by Proposition 1.2. Hence W is a subspace of V.
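Theorem 1.3 reduces a subspace check to three closure conditions. As an illustrative sketch (ours, not the text's), the code below tests the three conditions on sample elements of the set of 2 × 2 real matrices with trace zero, a subset treated later in this section.

```python
# Checking the three conditions of Theorem 1.3 on sample elements
# for W = { A in M_{2x2}(R) : trace(A) = 0 }.

def trace(A):
    return A[0][0] + A[1][1]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

in_W = lambda A: trace(A) == 0       # membership test for the subset
zero = [[0, 0], [0, 0]]

A = [[1, 2], [3, -1]]                # trace 0
B = [[5, 0], [7, -5]]                # trace 0

cond_a = in_W(zero)                  # (a): the zero vector lies in W
cond_b = in_W(add(A, B))             # (b): closure under addition
cond_c = in_W(scale(-3, A))          # (c): closure under scalar multiplication
```

Of course, testing finitely many samples only illustrates the conditions; the subspace proofs in the text hold for all elements.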

The theorem above provides a simple method for determining whether or not a given subset of a vector space is a subspace. Normally, it is this result that is used to prove that a subset is, in fact, a subspace.

The transpose M^t of an m × n matrix M is the n × m matrix obtained from M by interchanging the rows with the columns; that is, (M^t)_ij = M_ji. For example,

    [1  −2   3]t     [ 1   0]
    [0   5  −1]   =  [−2   5]
                     [ 3  −1]

A symmetric matrix is a matrix M such that M^t = M. Clearly, a symmetric matrix must be square. The set W of all symmetric matrices in M_{n×n}(F) is a subspace of M_{n×n}(F) since the conditions of Theorem 1.3 hold:

(a) The zero matrix is equal to its transpose and hence belongs to W.

It is easily proved that for any matrices A and B and any scalars a and b, (aA + bB)^t = aA^t + bB^t (see Exercise 3). Using this fact, we can easily establish conditions (b) and (c) of Theorem 1.3 as follows.

(b) If A ∈ W and B ∈ W, then A^t = A and B^t = B. Thus (A + B)^t = A^t + B^t = A + B, so that A + B ∈ W.

(c) If A ∈ W, then A^t = A. So for any a ∈ F, we have (aA)^t = aA^t = aA. Thus aA ∈ W.

The following examples provide further illustrations of the concept of a subspace. The first three are particularly important.

Example 1

Let M be an n × n matrix. The (main) diagonal of M consists of the entries M_11, M_22, ..., M_nn. An n × n matrix D is called a diagonal matrix if each entry not on the diagonal of D is zero, that is, if D_ij = 0 whenever i ≠ j. The set of all diagonal matrices in M_{n×n}(F) is a subspace of M_{n×n}(F).

Example 2

Let n be a nonnegative integer, and let P_n(F) consist of all polynomials in P(F) having degree less than or equal to n. (Notice that the zero polynomial is an element of P_n(F) since its degree is −1.) Then P_n(F) is a subspace of P(F).

Example 3

The set C(R) consisting of all continuous real-valued functions defined on R is a subspace of F(R, R), where F(R, R) is as defined in Example 3 of Section 1.2.

Example 4

The trace of an n × n matrix M, denoted tr(M), is the sum of all the entries of M lying on the diagonal; that is,

    tr(M) = M_11 + M_22 + ··· + M_nn.

The set of n × n matrices having trace equal to zero is a subspace of M_{n×n}(F) (see Exercise 6).

Example 5

The set of matrices in M_{m×n}(F) having nonnegative entries is not a subspace of M_{m×n}(F) because condition (c) of Theorem 1.3 does not hold.

The next theorem provides a method of forming a new subspace from other subspaces.

Theorem 1.4. Any intersection of subspaces of a vector space V is a subspace of V.

Proof. Let C be a collection of subspaces of V, and let W denote the intersection of all the subspaces in C. Since every subspace contains the zero vector, 0 ∈ W. Let a ∈ F and x, y be elements of W; then x and y are elements of each subspace in C. Hence x + y and ax are elements of each subspace in C (because the sum of two vectors in a subspace and the product of a scalar and a vector from a subspace belong to that subspace). Thus x + y ∈ W and ax ∈ W, so that W is a subspace of V by Theorem 1.3.

Having shown that the intersection of subspaces of V is a subspace of V, it is natural to consider whether or not the union of subspaces of V is a subspace of V. It is easily seen that the union of subspaces must satisfy conditions (a) and (c) of Theorem 1.3, but condition (b) need not hold. In fact, it can be readily shown (see Exercise 18) that the union of two subspaces of V is a subspace of V if and only if one of the subspaces contains the other. It is natural, however, to expect that there should be a method of combining two subspaces W1 and W2 to obtain a larger subspace that contains both W1 and W2. As we have suggested above, the key to finding such a subspace is the requirement that it satisfy condition (b) of Theorem 1.3. We explore this idea in Exercise 21.

EXERCISES

1. Label the following statements as being true or false.
(a) If V is a vector space and W is a subset of V that is a vector space, then W is a subspace of V.
(b) The empty set is a subspace of every vector space.
(c) If V is a vector space other than the zero vector space {0}, then V contains a subspace W such that W ≠ V.
(d) The intersection of any two subsets of V is a subspace of V.
(e) An n × n diagonal matrix can never have more than n nonzero entries.
(f) The trace of a square matrix is the product of its entries on the diagonal.

2. Determine the transpose of each of the following matrices. In addition, if the matrix is square, compute its trace.
(a)–(d) [matrices not legible in this copy]
(e) (1, −1, 3, 5)

3. Prove that (aA + bB)^t = aA^t + bB^t for any A, B ∈ M_{m×n}(F) and any a, b ∈ F.

4. Prove that (A^t)^t = A for each A ∈ M_{m×n}(F).

5. Prove that A + A^t is symmetric for any square matrix A.

6. Prove that tr(aA + bB) = a tr(A) + b tr(B) for any A, B ∈ M_{n×n}(F).

7. Prove that diagonal matrices are symmetric matrices.

8. Determine whether the following sets are subspaces of R^3 under the operations of addition and scalar multiplication defined on R^3. Justify your answers.
(a) W1 = {(a1, a2, a3) ∈ R^3: a1 = 3a2 and a3 = −a2}
(b) W2 = {(a1, a2, a3) ∈ R^3: a1 = a3 + 2}
(c) W3 = {(a1, a2, a3) ∈ R^3: 2a1 − 7a2 + a3 = 0}
(d) W4 = {(a1, a2, a3) ∈ R^3: a1 − 4a2 − a3 = 0}
(e) W5 = {(a1, a2, a3) ∈ R^3: a1 + 2a2 − 3a3 = 1}
(f) W6 = {(a1, a2, a3) ∈ R^3: 5a1^2 − 3a2^2 + 6a3^2 = 0}

9. Let W1, W3, and W4 be as in Exercise 8. Describe W1 ∩ W3, W1 ∩ W4, and W3 ∩ W4, and observe that each is a subspace of R^3.

10. Verify that W1 = {(a1, ..., an) ∈ F^n: a1 + ··· + an = 0} is a subspace of F^n, but that W2 = {(a1, ..., an) ∈ F^n: a1 + ··· + an = 1} is not.

11. Is the set W = {f ∈ P(F): f = 0 or f has degree n} a subspace of P(F) if n ≥ 1? Justify your answer.

12. An m × n matrix A is called upper triangular if all entries lying below the diagonal are zero, that is, if A_ij = 0 whenever i > j. Verify that the upper triangular matrices form a subspace of M_{m×n}(F).

13. Verify that for any s0 ∈ S, {f ∈ F(S, F): f(s0) = 0} is a subspace of F(S, F).

14. Is the set of all differentiable real-valued functions defined on R a subspace of C(R)? Justify your answer.

15. Let C^n(R) denote the set of all real-valued functions defined on the real line that have a continuous nth derivative (and hence continuous derivatives of orders 1, 2, ..., n). Verify that C^n(R) is a subspace of F(R, R).

16. Prove that a subset W of a vector space V is a subspace of V if and only if W ≠ ∅, and ax ∈ W and x + y ∈ W whenever a ∈ F and x, y ∈ W.

17. Prove that a subset W of a vector space V is a subspace of V if and only if 0 ∈ W and ax + y ∈ W whenever a ∈ F and x, y ∈ W.

18. Let W1 and W2 be subspaces of a vector space V. Prove that W1 ∪ W2 is a subspace of V if and only if W1 ⊆ W2 or W2 ⊆ W1.

19. Let F1 and F2 be fields. A function g ∈ F(F1, F2) is called an even function if g(−x) = g(x) for each x ∈ F1 and is called an odd function if g(−x) = −g(x) for each x ∈ F1. Prove that the set of all even functions in F(F1, F2) and the set of all odd functions in F(F1, F2) are subspaces of F(F1, F2).

20.† Prove that if W is a subspace of a vector space V and x1, ..., xn are elements of W, then a1x1 + ··· + anxn is an element of W for any scalars a1, ..., an.

21.† Definition. If S1 and S2 are nonempty subsets of a vector space V, then the sum of S1 and S2, denoted S1 + S2, is the set {x + y: x ∈ S1 and y ∈ S2}. Prove that if W1 and W2 are subspaces of a vector space V, then W1 + W2 is a subspace of V that contains both W1 and W2.

22. Definition. A vector space V is said to be the direct sum of W1 and W2, denoted by V = W1 ⊕ W2, if W1 and W2 are subspaces of V such that W1 ∩ W2 = {0} and W1 + W2 = V. Show that F^n is the direct sum of the subspaces

    W1 = {(a1, ..., an) ∈ F^n: an = 0}

and

    W2 = {(a1, ..., an) ∈ F^n: a1 = ··· = a_{n−1} = 0}.

23. Let W1 denote the set of all polynomials f in P(F) such that f(x) = 0 or, in the representation

    f(x) = a_n x^n + a_{n−1} x^{n−1} + ··· + a_0,

the coefficients a0, a2, a4, ... of all even powers of x equal zero. Likewise, let W2 denote the set of all polynomials g in P(F) such that g(x) = 0 or, in the representation

    g(x) = b_m x^m + b_{m−1} x^{m−1} + ··· + b_0,

the coefficients b1, b3, b5, ... of all odd powers of x equal zero. Prove that P(F) = W1 ⊕ W2.

24. Let W1 = {A ∈ M_{m×n}(F): A_ij = 0 whenever i > j} and W2 = {A ∈ M_{m×n}(F): A_ij = 0 whenever i ≤ j}. (W1 is the set of upper triangular matrices defined in Exercise 12.) Show that M_{m×n}(F) = W1 ⊕ W2.

25. Let V denote the vector space consisting of all upper triangular n × n matrices (as defined in Exercise 12), and let W1 denote the subspace of V consisting of all diagonal matrices. Show that V = W1 ⊕ W2, where W2 = {A ∈ V: A_ij = 0 whenever i ≥ j}.

26. A matrix M is called skew-symmetric if M^t = −M. Clearly, a skew-symmetric matrix is square. Let W1 be the subspace of M_{n×n}(R) consisting of the symmetric n × n matrices, and let W2 be the set of all skew-symmetric n × n matrices with entries from R. Prove that W2 is a subspace of M_{n×n}(R) and that M_{n×n}(R) = W1 ⊕ W2.

27. Let W1 = {A ∈ M_{n×n}(F): A_ij = 0 whenever i ≤ j}, and let W2 denote the set of all symmetric n × n matrices. Both W1 and W2 are subspaces of M_{n×n}(F). Prove that M_{n×n}(F) = W1 ⊕ W2. Compare Exercises 26 and 27.

28. Let W1 and W2 be subspaces of a vector space V. Prove that V is the direct sum of W1 and W2 if and only if each element in V can be uniquely written as x1 + x2, where x1 ∈ W1 and x2 ∈ W2.

29. Let W be a subspace of a vector space V over a field F. For any v ∈ V the set {v} + W = {v + w: w ∈ W} is called the coset of W containing v. It is customary to denote this coset by v + W rather than {v} + W. Prove the following:
(a) v + W is a subspace of V if and only if v ∈ W.
(b) v1 + W = v2 + W if and only if v1 − v2 ∈ W.
(c) Addition and scalar multiplication by elements of F can be defined in the collection S = {v + W: v ∈ V} of all cosets of W as follows:

    (v1 + W) + (v2 + W) = (v1 + v2) + W    for all v1, v2 ∈ V

and

    a(v + W) = av + W    for all v ∈ V and a ∈ F.

Prove that the operations above are well-defined; i.e., show that if v1 + W = v1′ + W and v2 + W = v2′ + W, then

    (v1 + W) + (v2 + W) = (v1′ + W) + (v2′ + W)

and

    a(v1 + W) = a(v1′ + W)

for all a ∈ F.
(d) Prove that the set S is a vector space under the operations defined above. This vector space is called the quotient space of V modulo W and is denoted by V/W.

†Exercises denoted by † will be referenced in other sections of the book.

1.4 LINEAR COMBINATIONS AND SYSTEMS OF LINEAR EQUATIONS

In Section 1.1 it was shown that the equation of the plane through three noncollinear points P, Q, and R in space is x = P + t1u + t2v, where u and v denote the vectors beginning at P and ending at Q and R, respectively, and t1 and t2 denote arbitrary real numbers. An important special case occurs when P is the origin. In this case the equation of the plane simplifies to x = t1u + t2v, and the set of all points in this plane is a subspace of R^3. (This will be proved as Theorem 1.5.) Expressions of the form t1u + t2v, where t1 and t2 are scalars and u and v are vectors, play a central role in the theory of vector spaces. The appropriate generalization of such expressions is presented in the following definition.

Definition. Let V be a vector space and S a nonempty subset of V. A vector x in V is said to be a linear combination of elements of S if there exist a finite number of elements y1, ..., yn in S and scalars a1, ..., an in F such that

    x = a1y1 + a2y2 + ··· + anyn.

In this situation it is also customary to say that x is a linear combination of y1, ..., yn.

Observe that in any vector space V, 0x = 0 for each x ∈ V. Thus the zero vector is a linear combination of any nonempty subset of V.
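Forming a linear combination in a coordinate space is purely mechanical; the sketch below (representation and helper names ours) shows the computation in R^3.

```python
# Forming a linear combination a1*y1 + ... + an*yn in R^3,
# with vectors represented as tuples.

def scale(a, v):
    return tuple(a * vi for vi in v)

def add(u, v):
    return tuple(ui + vi for ui, vi in zip(u, v))

def linear_combination(coeffs, vectors):
    total = (0, 0, 0)
    for a, v in zip(coeffs, vectors):
        total = add(total, scale(a, v))
    return total

# 2(1, 0, 1) - 1(0, 2, 3) = (2, -2, -1)
x = linear_combination([2, -1], [(1, 0, 1), (0, 2, 3)])
```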

Example 1

Table 1.1 shows the vitamin content of 100 grams of 12 foods with respect to vitamins A, B1 (thiamine), B2 (riboflavin), niacin, and C (ascorbic acid). We will record the vitamin content of 100 grams of each food as a column vector in R^5 — for example, the vitamin vector for apple butter is

    [0.00]
    [0.01]
    [0.02]
    [0.20]
    [2.00]

TABLE 1.1  Vitamin Content of 100 Grams of Certain Foods*

                                                A        B1      B2    Niacin     C
                                             (units)    (mg)    (mg)    (mg)    (mg)
Apple butter                                   (0)      0.01    0.02    0.2       2
Raw, unpared apples (freshly harvested)         90      0.03    0.02    0.1       4
Chocolate-coated candy with coconut center     (0)      0.02    0.07    0.2      (0)
Clams (meat only)                              100      0.10    0.18    1.3      10
Cupcake from mix (dry form)                    (0)      0.05    0.06    0.3      (0)
Cooked farina (unenriched)                     (0)      0.01    0.01    0.1      (0)
Jams and preserves                              10      0.01    0.03    0.2       2
Coconut custard pie (baked from mix)           (0)      0.02    0.25    0.4      (0)
Raw brown rice                                 (0)      0.34    0.05    4.7      (0)
Cooked spaghetti (unenriched)                  (0)      0.01    0.01    0.3      (0)
Soy sauce                                      (0)      0.02    0.02    0.4      (0)
Raw wild rice                                  (0)      0.45    0.63    6.2      (0)

Source: Bernice K. Watt and Annabel L. Merrill, Composition of Foods (Agriculture Handbook Number 8), Consumer and Food Economics Research Division, U.S. Department of Agriculture, Washington, D.C., 1963.

*Zeros in parentheses indicate that the amount of a vitamin present is either none or too small to measure.

Considering the vitamin vectors for cupcake, coconut custard pie, brown rice, soy sauce, and wild rice, we see that

    [0.00]     [0.00]   [0.00]   [0.00]   [0.00]
    [0.05]     [0.02]   [0.34]   [0.02]   [0.45]
    [0.06] + 2 [0.25] + [0.05] + [0.02] = [0.63]
    [0.30]     [0.40]   [4.70]   [0.40]   [6.20]
    [0.00]     [0.00]   [0.00]   [0.00]   [0.00]

Thus the vitamin vector for wild rice is a linear combination of the vitamin vectors for cupcake, coconut custard pie, raw brown rice, and soy sauce. So 100 grams of cupcake, 200 grams of coconut custard pie, 100 grams of raw brown rice, and 100 grams of soy sauce provide exactly the same amounts of the five vitamins as 100 grams of raw wild rice. Similarly, since

      [0.00]   [90.00]   [0.00]   [0.00]   [10.00]   [0.00]   [100.00]
      [0.01]   [ 0.03]   [0.02]   [0.01]   [ 0.01]   [0.01]   [  0.10]
    2 [0.02] + [ 0.02] + [0.07] + [0.01] + [ 0.03] + [0.01] = [  0.18]
      [0.20]   [ 0.10]   [0.20]   [0.10]   [ 0.20]   [0.30]   [  1.30]
      [2.00]   [ 4.00]   [0.00]   [0.00]   [ 2.00]   [0.00]   [ 10.00]

200 grams of apple butter, 100 grams of apples, 100 grams of chocolate candy, 100 grams of farina, 100 grams of jam, and 100 grams of spaghetti provide exactly the same amounts of the five vitamins as 100 grams of clams.

Throughout Chapters 1 and 2 we will encounter many different situations in which it is necessary to determine whether or not a vector can be expressed as a linear combination of other vectors, and if so, how. This question often reduces to the problem of solving a system of linear equations.
combination of
yx

must

we

a2y2 + azyz +

= flx(l,2,1)

and

+ a2(-2,

-2),

043\302\274

^3

2, 3),

(0,

8, 16).

y3 =(-3,

are scalars

if there

determine

= axyx +

(2,6,8)

- (-2, -4,

0,-3),

(2,

3\302\274

Thus

y2

2, 1),

(1,

a4, and a5

ax, a2, a3,

asy5

-4, -2) +

a3(0,2,3)

+ fl4(2,0,-3)
+

(ax

\342\200\224

\342\200\224

4- 2a4

2a2

\342\200\224

3a5,2a1

4-

Aa2

2a3

(2,6,8)

of

combination

of

5-tuple

scalars

2a2

can be expressed as a linear

if and only if thereisa

a5(-3,8,16)

8a5,

\342\200\224

ax
Hence

that

such

3a3

yl9

y2,

\342\200\224

3a4

16a5).

y3, yA, and

a5) satisfying

{ax ,a2,az,aA,

y5

the system

of linear equations
\342\200\224

2a2

ax

\342\200\224

<

2ax

4-

Aa2

obtained by equating corresponding


To solve system (1),
same solutionsbut
easier

will

3a4

3a3

8a5 =

+ 16a5 =

the

replace

(1)

in the

coordinates

the

we

3a5

2a3

ax - 2a2+

\342\200\224

2a4

preceding
equation.
another
by
system with the

system

to be used

expresses
someof unknowns
in terms
of others by eliminating certain unknowns from
one.
To begin, we eliminate ax from every equation
all equations
except
2 times
the first equation
the
first
to the second and
by adding
to the third. The result is the
times
the
first equation
new
system:
solve.

to

is

which

The

procedure

the

the

\342\200\224

\342\200\224

except

following

\\ax

<

\342\200\224

2a2

4-

that
In this case happened
except the first, also eliminated
neednot happen general..
We
and
then
eliminate
system
a3,
it

we

in

(2)

first

multiply

2a4

second

14a5=

- 4a4 +

3a3

- 5aA + 19a5 =

while

ax from

eliminating

every equation
want
to solve the

now

\342\200\224

2a2

every equation
except the first. This

second equation in

third equation. To do this,


\\, which produces

a3 from the

we

\342\226\240

ax

(2)

6.

a2 from

by

equation

3a5

2a3

for

the

\342\200\224

\342\200\224

2a4

= 1

5aA

2a4

a3

3a3

\342\200\224

3a5

la5
19a5

= 6.

Chap. 1

24
add

we

Next

second

the

\342\200\2243
times

\342\200\224

\342\200\224

2a4

3a5

- 2a4 +

a3

by eliminating a4 from

2a5

3.

of

equation

every

7a5

\342\200\224

a4

{ax
We continue

to the third,

equation

4-

2a2

Vector

Spaces

obtaining

(3)

the third. This

(3) except

yields
(ax

\342\200\224

a3

unknown

in each of

present

Thus for any choice of the scalars


-

a5

a vector

+ 0^2 +

8) = -4^

7^3

yu

any

Section

to

solutions

3.4

1. The
If

equation,

unknown
then
equations.

two

y3,

0y5,

yA,

and

of

obtained

by

y5.
to simplify

operations

of an equation

that

system

to anotherequation

operations do not change

the

system. Note that


the

the

constant

nonzero

equations that had

is

in

equations

prove that these

we

following

these

employed

of

set

to

operations

properties:

in each equationis
the first unknown with a nonzero
unknown occurs with a zero coefficient
in

first nonzero coefficient

an

other

multiple

we will

the original

a system of

obtain

2.

constant

any

Adding

In

3, a5)

system:

1. Interchangingthe orderof
2. Multiplying any equation by a
3.

2a5

3;y4

types

the original

7,

(\342\200\2244,0,7,3,0)

is a linear combinationof
y2,
The procedure illustrated above uses three

so that (2,6, 8)

of the form

Therefore,

solution

(2, 6,

other

we find

-3a5 +

vector

the

of the

terms

in

a4)

first

the

a5,

?to system (1).In particular,


= ti-and a5 = 0 is a
to
(1).

a2

setting

3.

+ 3.

- 4, a2,

is a solution

(4)

form,

2a5
and

a2

(2a2

a5

- 3a5+

I a3 =
a4

this

in

= 2a2 -

[ ax

(ax, a2, a3, a4s a5)=

the equations (ax,a3, and


(4)

for

(a2 and a5). Rewritingsystem

unknowns

3a5

\342\200\224

form: It is easyto solve

of the desired

a system

is

(4)

2a5=

a4 System

a5

4-

2a2

one.

in

coefficient

each

some
of

the

Linear Combinationsand Systemsof

Sec. 1.4

unknown

first

3. The first unknown with a nonzero coefficient in any equation has a larger subscript than the first unknown with a nonzero coefficient in any preceding equation.

Once a system with properties 1, 2, and 3 has been obtained, it is easy to solve for some of the unknowns in terms of the others (as in the example above). To help clarify the meaning of these properties, note that none of the following systems meets these requirements:

    x1        + x4 + 3x5 =  6
        -2x2 + 2x3 - 5x4 = -1                                (5)

    x1 + 5x3 - 6x5 = 0
          x3 - 2x5 = 1                                       (6)

    x1       - x4 + 3x5 =  6
         x3       - 2x5 =  4
    x2 + 3x3 - x4       = -5                                 (7)

Specifically, system (5) does not satisfy condition 1 because the first nonzero coefficient in the second equation is -2; system (6) does not satisfy condition 2 because x3, the first unknown with a nonzero coefficient in the second equation, occurs with a nonzero coefficient in the first equation; and system (7) does not satisfy condition 3 because x2, the first unknown with a nonzero coefficient in the third equation, does not have a larger subscript than x3, the first unknown with a nonzero coefficient in the second equation.

If, however, in the course of using operations 1, 2, and 3 an equation of the form 0 = c, where c is nonzero, is obtained, then the original system has no solutions (see Example 2). We will return to the study of systems of linear equations in Chapter 3. We will discuss there the theoretical basis for this method of solving systems of linear equations and further simplify the procedure by the use of matrices.

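The three operations just listed can be sketched in code. This is an illustrative sketch only — the function names and the toy two-equation system are my own, not the text's — using exact rational arithmetic so that no rounding intrudes; it applies operations 2 and 3 and checks that a known solution still satisfies the transformed system.

```python
from fractions import Fraction

# An equation is represented as (coefficient list, right-hand side).

def interchange(eqs, i, j):
    """Operation 1: interchange two equations."""
    eqs[i], eqs[j] = eqs[j], eqs[i]

def scale(eqs, i, c):
    """Operation 2: multiply equation i by a nonzero constant c."""
    coeffs, rhs = eqs[i]
    eqs[i] = ([c * a for a in coeffs], c * rhs)

def add_multiple(eqs, i, j, c):
    """Operation 3: add c times equation i to equation j."""
    (ci, ri), (cj, rj) = eqs[i], eqs[j]
    eqs[j] = ([a + c * b for a, b in zip(cj, ci)], rj + c * ri)

def satisfies(eqs, x):
    """Check whether the vector x solves every equation."""
    return all(sum(a * v for a, v in zip(coeffs, x)) == rhs
               for coeffs, rhs in eqs)

# Toy system with solution (1, 2):  x1 + x2 = 3,  2x1 - x2 = 0.
eqs = [([Fraction(1), Fraction(1)], Fraction(3)),
       ([Fraction(2), Fraction(-1)], Fraction(0))]
sol = (Fraction(1), Fraction(2))

add_multiple(eqs, 0, 1, Fraction(-2))  # eliminate x1 from equation 2
scale(eqs, 1, Fraction(-1, 3))         # make its leading coefficient one
print(eqs[1], satisfies(eqs, sol))     # the solution set is unchanged
```

After the two operations the second equation reads x2 = 2, and the original solution still satisfies the system, in line with the claim that these operations do not change the set of solutions.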
Example 2

We will show that 2x^3 - 2x^2 + 12x - 6 is a linear combination of x^3 - 2x^2 - 5x - 3 and 3x^3 - 5x^2 - 4x - 9 in P3(R), but that 3x^3 - 2x^2 + 7x + 8 is not.

In the first case we wish to find scalars a and b such that

    2x^3 - 2x^2 + 12x - 6 = a(x^3 - 2x^2 - 5x - 3) + b(3x^3 - 5x^2 - 4x - 9)
                          = (a + 3b)x^3 + (-2a - 5b)x^2 + (-5a - 4b)x + (-3a - 9b).

Thus we are led to the following system of linear equations:

     a + 3b =  2
    -2a - 5b = -2
    -5a - 4b = 12
    -3a - 9b = -6.

Adding appropriate multiples of the first equation to the others in order to eliminate a, we find

    a + 3b =  2
         b =  2
       11b = 22
         0 =  0.

Now adding the appropriate multiples of the second equation to the others yields

    a = -4
    b =  2
    0 =  0
    0 =  0.

Hence

    2x^3 - 2x^2 + 12x - 6 = -4(x^3 - 2x^2 - 5x - 3) + 2(3x^3 - 5x^2 - 4x - 9).

In the second case we wish to show that there are no scalars a and b for which

    3x^3 - 2x^2 + 7x + 8 = a(x^3 - 2x^2 - 5x - 3) + b(3x^3 - 5x^2 - 4x - 9).

As above, we obtain a system of linear equations

     a + 3b =  3
    -2a - 5b = -2
    -5a - 4b =  7
    -3a - 9b =  8.                                           (8)

Eliminating a as before yields

    a + 3b =  3
         b =  4
       11b = 22
         0 = 17.

But the presence of the inconsistent equation 0 = 17 indicates that system (8) has no solutions. Hence 3x^3 - 2x^2 + 7x + 8 is not a linear combination of x^3 - 2x^2 - 5x - 3 and 3x^3 - 5x^2 - 4x - 9.

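Both computations in Example 2 can be double-checked mechanically. A minimal sketch — the tuple encoding of polynomials and the helper name `combo` are illustrative assumptions, not the book's notation:

```python
from fractions import Fraction as F

# Coefficient tuples (x^3, x^2, x, constant) for the two given polynomials.
p1 = (F(1), F(-2), F(-5), F(-3))   # x^3 - 2x^2 - 5x - 3
p2 = (F(3), F(-5), F(-4), F(-9))   # 3x^3 - 5x^2 - 4x - 9

def combo(a, b):
    """Coefficients of a*p1 + b*p2."""
    return tuple(a * u + b * v for u, v in zip(p1, p2))

# First case: a = -4, b = 2 reproduces 2x^3 - 2x^2 + 12x - 6.
print(combo(F(-4), F(2)))            # (2, -2, 12, -6)

# Second case: the first two equations of system (8) force b = 4 and
# a = 3 - 3b = -9, but these values fail the remaining equations,
# confirming that system (8) is inconsistent.
target = (F(3), F(-2), F(7), F(8))
print(combo(F(-9), F(4)) == target)  # False
```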
The set of all linear combinations of the elements of a nonempty subset of a vector space provides another example of a subspace, as the following result shows.

Theorem 1.5. If S is a nonempty subset of a vector space V, then the set W consisting of all linear combinations of elements of S is a subspace of V. Moreover, W is the smallest subspace of V containing S in the sense that W is a subset of any subspace of V that contains S.

Proof. First, we will use Theorem 1.3 to prove that W is a subspace of V. Since S is nonempty, 0 ∈ W. If y and z are elements of W, then y and z are linear combinations of elements of S. So there exist elements x1, ..., xn and w1, ..., wm of S such that y = a1x1 + ... + anxn and z = b1w1 + ... + bmwm for some choice of scalars a1, ..., an and b1, ..., bm. Now

    y + z = a1x1 + ... + anxn + b1w1 + ... + bmwm

and

    cy = ca1x1 + ... + canxn

are linear combinations of elements of S; so y + z and cy are elements of W for any scalar c. Thus W is a subspace of V.

Now let W' denote any subspace of V that contains S. If y is an element of W, then y is a linear combination of elements of S; say y = a1x1 + ... + anxn, where a1, ..., an ∈ F and x1, ..., xn ∈ S. Because S ⊆ W', we have x1, ..., xn ∈ W'. Hence y = a1x1 + ... + anxn is an element of W' by Exercise 20 of Section 1.3. Since y, an arbitrary element of W, belongs to W', we have W ⊆ W'. This completes the proof.

Definition. The subspace W described in Theorem 1.5 is called the span of S (or the subspace generated by the elements of S) and is denoted span(S). For convenience we define span(∅) = {0}.

Observe that Theorem 1.5 shows that x is a linear combination of the elements of S if and only if x is an element of span(S). For instance, in R^3, it is easily seen that span({(1, 0, 0), (0, 1, 0)}) is the xy-plane.

Definition. A subset S of a vector space V generates (or spans) V if span(S) = V. In this situation we also say that the elements of S generate (or span) V.

Example 3

The vectors (1, 1, 0), (1, 0, 1), and (0, 1, 1) generate R^3 since an arbitrary element (a1, a2, a3) of R^3 is a linear combination of the three given vectors; in fact, the scalars r, s, and t for which

    r(1, 1, 0) + s(1, 0, 1) + t(0, 1, 1) = (a1, a2, a3)

are

    r = (1/2)(a1 + a2 - a3),  s = (1/2)(a1 - a2 + a3),  and  t = (1/2)(-a1 + a2 + a3).

Example 4

The polynomials x^2 + 3x - 2, 2x^2 + 5x - 3, and -x^2 - 4x + 4 generate P2(R) since each of the three given polynomials belongs to P2(R) and each polynomial ax^2 + bx + c in P2(R) is a linear combination of these three; namely,

    (-8a + 5b + 3c)(x^2 + 3x - 2) + (4a - 2b - c)(2x^2 + 5x - 3)
        + (-a + b + c)(-x^2 - 4x + 4) = ax^2 + bx + c.

Example 5

The matrices

    [1 1]   [1 1]   [1 0]   [0 1]
    [1 0],  [0 1],  [1 1],  [1 1]

generate M2x2(R) since an arbitrary element

    [a11 a12]
    [a21 a22]

of M2x2(R) can be expressed as a linear combination of the four given matrices as follows:

    [a11 a12]
    [a21 a22] = ((1/3)a11 + (1/3)a12 + (1/3)a21 - (2/3)a22) [1 1; 1 0]
              + ((1/3)a11 + (1/3)a12 - (2/3)a21 + (1/3)a22) [1 1; 0 1]
              + ((1/3)a11 - (2/3)a12 + (1/3)a21 + (1/3)a22) [1 0; 1 1]
              + (-(2/3)a11 + (1/3)a12 + (1/3)a21 + (1/3)a22) [0 1; 1 1].

(Here [p q; r s] denotes the matrix with first row p, q and second row r, s.)

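The coefficient formulas of Example 4 can be verified for any choice of a, b, and c. A small sketch with hypothetical helper names:

```python
from fractions import Fraction as F

# Generators of Example 4 as (x^2, x, constant) coefficient tuples.
g = [(F(1), F(3), F(-2)),    # x^2 + 3x - 2
     (F(2), F(5), F(-3)),    # 2x^2 + 5x - 3
     (F(-1), F(-4), F(4))]   # -x^2 - 4x + 4

def express(a, b, c):
    """Apply the scalars from Example 4 and return the resulting polynomial."""
    s = (-8*a + 5*b + 3*c, 4*a - 2*b - c, -a + b + c)
    return tuple(sum(si * gi[k] for si, gi in zip(s, g)) for k in range(3))

print(express(F(3), F(-1), F(5)))   # (3, -1, 5): recovers ax^2 + bx + c
```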
EXERCISES

1. Label the following statements as being true or false.
   (a) The zero vector is a linear combination of any nonempty set of vectors.
   (b) The span of ∅ is ∅.
   (c) If S is a subset of a vector space V, then span(S) equals the intersection of all subspaces of V that contain S.
   (d) In solving a system of linear equations it is permissible to multiply an equation by any constant.
   (e) In solving a system of linear equations it is permissible to add a multiple of one equation to another.
   (f) Every system of linear equations has a solution.

2. Solve the following systems of linear equations by the method introduced in this section.
   (a)  2x1 - 2x2 - 3x3       = -2
        3x1 - 3x2 - 2x3 + 5x4 =  7
         x1 -  x2 - 2x3 -  x4 = -2
   (b)  3x1 - 7x2 + 4x3 = 10
         x1 - 2x2 +  x3 =  3
        2x1 -  x2 - 2x3 =  6
   (c)   x1 + 2x2 -  x3 +  x4 = 5
         x1 + 4x2 - 3x3 - 3x4 = 6
        2x1 + 3x2 -  x3 + 4x4 = 8
   (d)   x1 + 2x2 + 2x3       =  2
         x1       + 8x3 + 5x4 = -6
         x1 +  x2 + 5x3 + 5x4 =  3
   (e)   x1 + 2x2 -  4x3 -   x4 +  x5 =   7
        -x1       + 10x3 -  3x4 - 4x5 = -16
        2x1 + 5x2 -  5x3 -  4x4 -  x5 =   2
        4x1 + 11x2 - 7x3 - 10x4 - 2x5 =   7
   (f)   x1 + 2x2 +  6x3 = -1
        2x1 +  x2 +   x3 =  8
        3x1 +  x2 -   x3 = 15
         x1 + 3x2 + 10x3 = -5

3. For each of the following lists of vectors in R^3, determine whether or not the first vector can be expressed as a linear combination of the other two.
   (a) (-2, 0, 3), (1, 3, 0), (2, 4, -1)
   (b) (1, 2, -3), (-3, 2, 1), (2, -1, -1)
   (c) (3, 4, 1), (1, -2, 1), (-2, -1, 1)
   (d) (2, -1, 0), (1, 2, -3), (1, -3, 2)
   (e) (5, 1, -5), (1, -2, -3), (-2, 3, -4)
   (f) (-2, 2, 2), (1, 2, -1), (-3, -3, 3)

4. For each of the following lists of polynomials in P3(R), determine whether or not the first polynomial can be expressed as a linear combination of the other two.
   (a) x^3 - 3x + 5,  x^3 + 2x^2 - x + 1,  x^3 + 3x^2 - 1
   (b) 4x^3 + 2x^2 - 6,  x^3 - 2x^2 + 4x + 1,  3x^3 - 6x^2 + x + 4
   (c) -2x^3 - 11x^2 + 3x + 2,  x^3 - 2x^2 + 3x - 1,  2x^3 + x^2 + 3x - 2
   (d) x^3 + x^2 + 2x + 13,  2x^3 - 3x^2 + 4x + 1,  x^3 - x^2 + 2x + 3
   (e) x^3 - 8x^2 + 4x,  x^3 - 2x^2 + 3x - 1,  x^3 - 2x + 3
   (f) 6x^3 - 3x^2 + x + 2,  x^3 - x^2 + 2x + 3,  2x^3 - 3x + 1

5. In F^n let ej denote the vector whose jth coordinate is 1 and whose other coordinates are 0. Prove that {e1, e2, ..., en} generates F^n.

6. Show that Pn(F) is generated by {1, x, x^2, ..., x^n}.

7. Show that the matrices

       [1 0]   [0 1]   [0 0]   [0 0]
       [0 0],  [0 0],  [1 0],  [0 1]

   generate M2x2(F).

8. Show that if

       M1 = [1 0; 0 0],  M2 = [0 0; 0 1],  and  M3 = [0 1; 1 0],

   then the span of {M1, M2, M3} is the set of all symmetric 2 x 2 matrices.

9.† For any element x in a vector space, prove that span({x}) = {ax: a ∈ F}. Interpret this result geometrically in R^3.

10. Show that a subset W of a vector space V is a subspace of V if and only if span(W) = W.

11. Show that if S1 and S2 are subsets of a vector space V such that S1 ⊆ S2, then span(S1) ⊆ span(S2). In particular, if S1 ⊆ S2 and span(S1) = V, deduce that span(S2) = V.

12. Show that if S1 and S2 are arbitrary subsets of a vector space V, then span(S1 ∪ S2) = span(S1) + span(S2).

13. Let S1 and S2 be subsets of a vector space V. Prove that span(S1 ∩ S2) ⊆ span(S1) ∩ span(S2). Give an example in which span(S1 ∩ S2) and span(S1) ∩ span(S2) are equal and an example in which they are unequal.

remarked

we

a plane
is of the

of
equation
is the origin,

of which
through three noncollinearpointsin space,
=
where
form x
+ t2v,
x in
u, v e R3 and tx and t2 are scalars.Thusa
R3 is a linear combination
of u, e R3 if and only if x lies in the planecontaining
one

vector

txu

v (see

and

Figure

\\

We

.5).

that in

see, therefore,

3
R

the

of two

span

nonparallel

vectors has a simple geometricinterpretation. similar


can be
interpretation
vector
in R3 (see Exercise
9 of Section 1.4).
given for the span of a single
A

nonzero

*>

hv

1.5

Figure

the

In

x =

equation

txu + t2v,

x depends on

v in

and

sense

the

that x

vector is a
of the
others
is called a linearly dependent set. Consider,
combination
=
the
S = {xu x2, x3, x4} c
where
set
4),
example,
xx
(2,
=
=
=
if
is
and
To
see
S
-1).
(1, -2,
x2
(1, -1,3),
x3 (1,1, -1),
a vectorin S
a linear
we
check
or
not
there
is
must
whether
dependent,
is a linear combinationof

and

v.

in which

set

at least one

linear
for

R3,

\342\200\2241,

linearly

xA

that

of the

combination

and x3 if and

others. Now the

only if

are

there

vector

scalars

a, b,

x4.= axl
that

if

is,

and

x4

is a

however,

solution.
that

linear

The
this

of xl3

combination

x2,

c such that

+bx2 + oc3,

\342\200\224
\342\200\224a

4a + 3b
a

and

b + c,
b +
+ 3b
c,4a
linear combination of xx,x2, and x3 if and
2a + b + c =
1
[

-ahas

is

if

only

x4 == (2a +

Thus

xA

is

c=

c).

only

if

the

system

-2

- c = -1

verify that no such solution exists.Notice,


for we
not show that the set S is not linearly
dependent,

reader
does

\342\200\224

should

Chap. 1

32

must now check

be written as

or x3 can

x2,

xXs

can be shown,in

in S. It

vectors

other

the

of

combination

not

or

whether

Vector Spaces

that

fact,

a linear

is a

x3

linear

So S is
and x4; specifically, = 2xx
+ 0x4.
3x2
indeed linearly dependent.
for linear dependence given
that
the condition
this
We see
example
to
use
because
it may require checking several vectors to
above is
the
in S is a linear combination of the others.
seeif
vector
reformulating
of linear
that
in the following way, we obtain a definition
definition

of xx, x2,

combination

\342\200\224

x3

from

inconvenient

some

By

dependence

is

to use.

easier

vector space

A subset Sofa

Definition.

is

there exist a finite'number of distinct

if
ax, a2,...,

vectors

all

not

anJ

axxx + a2x2
In this case

To
this

also

say

that

the

subset

we

must

find

will

we

show

definition,

dependent
scalars

S and

in

xx,x2,...,xn

\342\200\242
\342\200\242
\342\200\242

anxn

= 0.

of S are

elements

the

that

scalars

linearly dependent.

is linearlydependentusing

R3 defined above

S of

\302\2531*1

be linearly

that

such

zero,

to

said

and a4, not all

ax, a2, az,

+ fl4*4 =

that

such

zero,

ft

fl2*2

^3*3

a2 +

a3

- 2a4,4ax + 3a2-

that is, suchthat


(2ax

+ a3

a2

+ a4,

ax

Thus we mustffind-a solution to

the

\342\200\224 \342\200\224

ilax

in which

a2

ax

a4)

= (0,0,0).

system

a2 + a3

a3

4ax + 3a2

not all the unknowns are


=
2, a2 =
1.4, we find that

a3

\342\200\224

+ a4 = 0
\342\200\224

\342\200\224

0,

2a4

a4

a3

zero. Using

the

in

discussed

techniques

solution.
Notice that using the definition
of linear dependence
stated above, we are
able to
that
S is linearly
dependent
by solving only one system of
should
that
the two conditions for linear
equations. reader
verify
above
discussed
are, in fact, equivalent
(see Exercise 11).
It is easily
that
in any
vector
space a subset S that contains the
vector
must
be linearly
For since 0 = 1 0, the zero vector is a linear
dependent.
combination of elementsof S which
some
coefficient
is nonzero.
Section

ax

\342\200\224

3,

a3

\342\200\224

1,

a4

= 0

is one such

by

check

The

dependence

seen

zero

\342\200\242

in

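The dependence relation found above for S can be confirmed directly. A brief sketch (the variable names are mine):

```python
# The vectors of S and the nontrivial solution found in the text.
x1, x2, x3, x4 = (2, -1, 4), (1, -1, 3), (1, 1, -1), (1, -2, -1)
coeffs = (2, -3, -1, 0)            # a1, a2, a3, a4

residual = tuple(sum(c * v[k] for c, v in zip(coeffs, (x1, x2, x3, x4)))
                 for k in range(3))
print(residual)   # (0, 0, 0): 2*x1 - 3*x2 - x3 + 0*x4 = 0
```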
Example 1

In R^4 the set

    S = {(1, 3, -4, 2), (2, 2, -4, 0), (1, -3, 2, -4), (-1, 0, 1, 0)}

is linearly dependent because

    4(1, 3, -4, 2) - 3(2, 2, -4, 0) + 2(1, -3, 2, -4) + 0(-1, 0, 1, 0) = (0, 0, 0, 0).

Similarly, in M2x3(R) the set

    {[1 -3 2; -4 0 5],  [-3 7 4; 6 -2 -7],  [-2 3 11; -1 -3 2]}

is linearly dependent since

    5[1 -3 2; -4 0 5] + 3[-3 7 4; 6 -2 -7] - 2[-2 3 11; -1 -3 2] = [0 0 0; 0 0 0].

As before, we will also say that the elements of the set are linearly dependent in this case.

Definition. A subset S of a vector space that is not linearly dependent is said to be linearly independent. As before, we will also say that the elements of S are linearly independent in this case.

The following facts about linearly independent sets are true in any vector space.

1. The empty set is linearly independent, for linearly dependent sets must be nonempty.
2. A set consisting of a single nonzero vector is linearly independent. For if {x} is linearly dependent, then ax = 0 for some nonzero scalar a. Thus

       x = a^(-1)(ax) = a^(-1)·0 = 0.

3. For any vectors x1, x2, ..., xn, we have a1x1 + a2x2 + ... + anxn = 0 if a1 = 0, a2 = 0, ..., an = 0. We call this the trivial representation of 0 as a linear combination of x1, x2, ..., xn. A set is linearly independent if and only if the only representations of 0 as linear combinations of its distinct elements are the trivial representations.

The condition in item 3 provides a very useful method for determining if a finite set is linearly independent. This technique is illustrated in the following example.

Example 2

Let xk denote the vector in F^n whose first k - 1 coordinates are zero and whose last n - k + 1 coordinates are 1. Then {x1, x2, ..., xn} is linearly independent, for if a1x1 + a2x2 + ... + anxn = 0, equating the corresponding coordinates of the left and right sides of this equality gives the following system of equations:

    a1                     = 0
    a1 + a2                = 0
    a1 + a2 + a3           = 0
    ...
    a1 + a2 + a3 + ... + an = 0.

Clearly, the only solution of this system is a1 = a2 = ... = an = 0.

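The matrix relation of Example 1 can likewise be checked entrywise. A sketch — the nested-tuple encoding of the matrices is an illustrative choice of mine:

```python
# The three matrices of Example 1, given row by row.
A = ((1, -3, 2), (-4, 0, 5))
B = ((-3, 7, 4), (6, -2, -7))
C = ((-2, 3, 11), (-1, -3, 2))

result = tuple(tuple(5 * a + 3 * b - 2 * c for a, b, c in zip(ra, rb, rc))
               for ra, rb, rc in zip(A, B, C))
print(result)   # ((0, 0, 0), (0, 0, 0)): 5A + 3B - 2C is the zero matrix
```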
The following useful results are immediate consequences of the definitions of linear dependence and linear independence.

Theorem 1.6. Let V be a vector space, and let S1 ⊆ S2 ⊆ V. If S1 is linearly dependent, then S2 is linearly dependent.

Proof. Exercise.

Corollary. Let V be a vector space, and let S1 ⊆ S2 ⊆ V. If S2 is linearly independent, then S1 is linearly independent.

Proof. Exercise.

EXERCISES

1. Label the following statements as being true or false.
   (a) If S is a linearly dependent set, then each element of S is a linear combination of other elements of S.
   (b) Any set containing the zero vector is linearly dependent.
   (c) The empty set is linearly dependent.
   (d) Subsets of linearly dependent sets are linearly dependent.
   (e) Subsets of linearly independent sets are linearly independent.
   (f) If a1x1 + a2x2 + ... + anxn = 0 and x1, x2, ..., xn are linearly independent, then all the scalars ai equal zero.

2. In F^n let ej denote the vector whose jth coordinate is 1 and whose other coordinates are 0. Prove that {e1, e2, ..., en} is linearly independent.

3. Show that the set {1, x, x^2, ..., x^n} is linearly independent in Pn(F).

4. Prove that the matrices

       [1 0]   [0 1]   [0 0]   [0 0]
       [0 0],  [0 0],  [1 0],  [0 1]

   are linearly independent in M2x2(F).

5. Find a linearly independent set of matrices that generates the vector space of 2 x 2 diagonal matrices.

6.† Show that {x, y} is linearly dependent if and only if x or y is a multiple of the other.

7. Give an example of three linearly dependent vectors in R^3 such that none of the three is a multiple of another.

8. Let S = {x1, x2, ..., xn} be a linearly independent subset of a vector space V over the field Z2. How many elements are there in span(S)? Justify your answer.

9. Prove Theorem 1.6 and its corollary.

10. Let V be a vector space over a field of characteristic not equal to 2.
    (a) Prove that {u, v} is linearly independent if and only if {u + v, u - v} is linearly independent.
    (b) Prove that {u, v, w} is linearly independent if and only if {u + v, u + w, v + w} is linearly independent.

11. Prove that a set S of vectors is linearly dependent if and only if S = {0} or there exist distinct vectors y, x1, x2, ..., xn in S such that y is a linear combination of x1, x2, ..., xn.

12. Let S = {x1, x2, ..., xn} be a finite set of vectors. Prove that S is linearly dependent if and only if x1 = 0 or xk+1 ∈ span({x1, x2, ..., xk}) for some k < n.

13. Let M be a square upper triangular matrix (as defined in Exercise 12 of Section 1.3) having nonzero diagonal entries. Prove that the columns of M are linearly independent.

14. Let f and g be the functions in F(R, R) defined by f(t) = e^(rt) and g(t) = e^(st), where r ≠ s. Prove that f and g are linearly independent in F(R, R). Hint: Suppose that ae^(rt) + be^(st) = 0. Let t = 0 to obtain an equation involving a and b. Then differentiate and let t = 0 to obtain a second equation involving a and b. Solve these equations for a and b.

15. Prove that a set S of vectors is linearly independent if and only if each finite subset of S is linearly independent.

1.6 BASES AND DIMENSION

A subset S of a vector space V that is linearly independent and generates V possesses a very useful property—every element of V can be expressed in one and only one way as a linear combination of elements of S. (This property is proved in Theorem 1.7.) It is this result that makes linearly independent generating sets the building blocks of vector spaces.

Definition. A basis β for a vector space V is a linearly independent subset of V that generates V. (If β is a basis for V, we will also say that the elements of β form a basis for V.)

Example 1

Recalling that span(∅) = {0} and that ∅ is linearly independent, we see that ∅ is a basis for the vector space {0}.

Example 2

In F^n, let

    e1 = (1, 0, 0, ..., 0),  e2 = (0, 1, 0, ..., 0),  ...,  en = (0, 0, ..., 0, 1);

{e1, e2, ..., en} is readily seen to be a basis for F^n and is called the standard basis for F^n.

Example 3

In Mmxn(F), let Mij denote the matrix whose only nonzero entry is a 1 in the ith row and jth column. Then {Mij: 1 ≤ i ≤ m, 1 ≤ j ≤ n} is a basis for Mmxn(F).

Example 4

In Pn(F) the set {1, x, x^2, ..., x^n} is a basis. We call this basis the standard basis for Pn(F).

Example 5

In P(F) the set {1, x, x^2, ...} is a basis.

Observe that a basis need not be finite. In fact, we will see later in this section that no basis for P(F) can be finite. Hence not every vector space has a finite basis.

The following theorem, which will be used frequently in Chapter 2, shows the most significant property of a basis.

Theorem 1.7. Let V be a vector space and β = {x1, ..., xn} be a subset of V. Then β is a basis for V if and only if each vector y in V can be uniquely expressed as a linear combination of vectors in β, i.e., can be expressed in the form

    y = a1x1 + a2x2 + ... + anxn

for unique scalars a1, ..., an.

Proof. Let β be a basis for V. If y ∈ V, then y ∈ span(β) since span(β) = V. Thus y is a linear combination of the elements of β. Suppose that

    y = a1x1 + ... + anxn  and  y = b1x1 + ... + bnxn

are two such representations of y. Subtracting the second equality from the first gives

    0 = (a1 - b1)x1 + ... + (an - bn)xn.

Since β is linearly independent, it follows that a1 - b1 = ... = an - bn = 0. Thus a1 = b1, ..., an = bn, so y is uniquely expressible as a linear combination of the elements of β.

The proof of the converse is an exercise.

Theorem 1.7 shows that if the elements x1, ..., xn of V form a basis for V, then every vector y in V can be uniquely expressed in the form

    y = a1x1 + ... + anxn

for appropriately chosen scalars a1, ..., an. Thus y determines a unique n-tuple of scalars (a1, ..., an) and, conversely, each n-tuple of scalars determines a unique vector y in V by using the entries of the n-tuple as the coefficients of a linear combination of the vectors in β. This fact suggests that V is like the vector space F^n, where n is the number of vectors in a basis for V. We will see in Section 2.4 that this is indeed the case.

Theorem 1.9 identifies a large class of vector spaces having finite bases. First, however, we must prove a preliminary result.

Theorem 1.8. Let S be a linearly independent subset of a vector space V, and let x be an element of V that is not in S. Then S ∪ {x} is linearly dependent if and only if x ∈ span(S).

Proof. If S ∪ {x} is linearly dependent, then there are vectors x1, ..., xn in S ∪ {x} and nonzero scalars a1, ..., an such that a1x1 + ... + anxn = 0. Because S is linearly independent, one of the xi's, say x1, equals x. Thus a1x + a2x2 + ... + anxn = 0, and so

    x = a1^(-1)(-a2x2 - ... - anxn).

Since x is a linear combination of x2, ..., xn, which are elements of S, we have x ∈ span(S).

Conversely, suppose that x ∈ span(S). Then there exist vectors x1, x2, ..., xn in S and scalars a1, ..., an such that x = a1x1 + ... + anxn. Hence

    0 = a1x1 + ... + anxn + (-1)x,

and since x ≠ xi for i = 1, ..., n, the set {x1, ..., xn, x} is linearly dependent. Thus S ∪ {x} is linearly dependent by Theorem 1.6.

Theorem 1.9. If a vector space V is generated by a finite set S0, then a subset of S0 is a basis for V. Hence V has a finite basis.

Proof. If S0 = ∅ or S0 = {0}, then V = {0} and ∅ is a subset of S0 that is a basis for V. Otherwise S0 contains a nonzero element x1, and {x1} is a linearly independent set. Continue, if possible, choosing elements x2, ..., xr in S0 so that {x1, x2, ..., xr} is linearly independent. Since S0 is a finite set, we must eventually reach a stage at which S = {x1, ..., xr} is a linearly independent subset of S0, but adjoining to S any element of S0 not in S produces a linearly dependent set. We will show that S is a basis for V. Since S is linearly independent, it suffices to prove that span(S) = V. Because span(S0) = V, it suffices by Theorem 1.5 to show that S0 ⊆ span(S). If x ∈ S, then clearly x ∈ span(S). Otherwise, if x ∈ S0 but x ∉ S, then the construction of S shows that S ∪ {x} is linearly dependent. So x ∈ span(S) by Theorem 1.8. Thus S0 ⊆ span(S).

The method by which the basis was obtained in the proof of Theorem 1.9 produces a useful way of obtaining bases. An example of this procedure is given below.

Example 6

The reader should check that the elements (2, -3, 5), (8, -12, 20), (1, 0, -2), (0, 2, -1), and (7, 2, 0) generate R^3. We will select a basis for R^3 from among these elements. To start, select any nonzero element from these elements, say (2, -3, 5), as one of the elements of the basis. Since 4(2, -3, 5) = (8, -12, 20), the set {(2, -3, 5), (8, -12, 20)} is linearly dependent (Exercise 6 of Section 1.5). Hence we do not include (8, -12, 20) in our basis. Since (1, 0, -2) is not a multiple of (2, -3, 5) and vice versa, the set {(2, -3, 5), (1, 0, -2)} is linearly independent. Thus we include (1, 0, -2) in our basis. Proceeding to the next element in the generating set, we include the element (0, 2, -1) into our basis or exclude it according to whether the set {(2, -3, 5), (1, 0, -2), (0, 2, -1)} is linearly independent or linearly dependent. An easy calculation shows that the set is linearly independent; so we include (0, 2, -1) in our basis. The final element of the generating set, (7, 2, 0), is excluded from or included into the basis according to whether

    {(2, -3, 5), (1, 0, -2), (0, 2, -1), (7, 2, 0)}

is linearly independent or linearly dependent. Since

    2(2, -3, 5) + 3(1, 0, -2) + 4(0, 2, -1) - (7, 2, 0) = (0, 0, 0),

the set is linearly dependent, and we exclude (7, 2, 0) from the basis. So the set {(2, -3, 5), (1, 0, -2), (0, 2, -1)} is a basis for R^3.

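The selection procedure of Example 6 (and of the proof of Theorem 1.9) can be sketched computationally. The helper names and the independence test by exact row reduction are illustrative choices of mine, not the text's:

```python
from fractions import Fraction as F

def independent(vectors):
    """Test linear independence by row-reducing with exact arithmetic."""
    if not vectors:
        return True          # the empty set is linearly independent
    rows = [list(map(F, v)) for v in vectors]
    n = len(rows[0])
    rank, col = 0, 0
    while rank < len(rows) and col < n:
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0),
                     None)
        if pivot is None:
            col += 1
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col] != 0:
                m = rows[r][col] / rows[rank][col]
                rows[r] = [x - m * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
        col += 1
    return rank == len(vectors)

def select_basis(generating_set):
    """Greedy procedure of Theorem 1.9: keep a vector iff independence survives."""
    basis = []
    for v in generating_set:
        if independent(basis + [v]):
            basis.append(v)
    return basis

gens = [(2, -3, 5), (8, -12, 20), (1, 0, -2), (0, 2, -1), (7, 2, 0)]
print(select_basis(gens))   # [(2, -3, 5), (1, 0, -2), (0, 2, -1)]
```

Run on the generating set of Example 6, the sketch keeps exactly the three vectors selected in the text and rejects (8, -12, 20) and (7, 2, 0).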
The following theorem and its corollaries are among the most significant results in Chapter 1.

Theorem 1.10 (Replacement Theorem). Let V be a vector space having a basis β containing exactly n elements. Let S = {y1, ..., ym} be a linearly independent subset of V containing exactly m elements, where m ≤ n. Then there exists a subset S1 of β containing exactly n - m elements such that S ∪ S1 generates V.

Proof. The proof will be by induction on m. The induction begins with m = 0; for in this case S = ∅, so S1 = β clearly satisfies the conclusion of the theorem.

Now assume that the theorem is true for some m, where m < n. We will prove that the theorem is true for m + 1. Let S = {y1, ..., ym+1} be a linearly independent subset of V containing exactly m + 1 elements. Since {y1, ..., ym} is linearly independent by the corollary to Theorem 1.6, we may apply the inductive hypothesis to conclude that there exists a subset {x1, ..., xn-m} of β such that {y1, ..., ym} ∪ {x1, ..., xn-m} generates V. Thus there exist scalars a1, ..., am, b1, b2, ..., bn-m such that

    a1y1 + ... + amym + b1x1 + ... + bn-m·xn-m = ym+1.        (9)

Observe that some bi, say b1, is nonzero, for otherwise ym+1 is a linear combination of y1, ..., ym, which would imply that {y1, ..., ym, ym+1} is linearly dependent, in contradiction to the assumption that S is linearly independent. Solving (9) for x1 gives

    x1 = (-b1^(-1)a1)y1 + ... + (-b1^(-1)am)ym + (b1^(-1))ym+1
             + (-b1^(-1)b2)x2 + ... + (-b1^(-1)bn-m)xn-m.

Hence x1 ∈ span({y1, ..., ym, ym+1, x2, ..., xn-m}). But since y1, ..., ym, x2, ..., xn-m are clearly elements of span({y1, ..., ym+1, x2, ..., xn-m}), it follows that

    {x1, x2, ..., xn-m, y1, ..., ym} ⊆ span({y1, ..., ym+1, x2, ..., xn-m}).

Thus Theorem 1.5 implies that

    span({y1, ..., ym+1, x2, ..., xn-m}) = V.

So the choice of S1 = {x2, ..., xn-m} proves that the theorem is true for m + 1. This completes the proof.

To illustrate the replacement theorem, note that S = {x^2 + 4, x + 6} is a linearly independent subset of P2(F). Since {1, x, x^2} is a basis for P2(F), the replacement theorem shows that there must be a subset S1 of β containing exactly n - m = 3 - 2 = 1 element such that S ∪ S1 generates P2(F). In this example any subset of β containing exactly one element will suffice for S1. Hence the subset S1 in Theorem 1.10 need not be unique.

Corollary 1. Let V be a vector space having a basis β containing exactly n elements. Then any linearly independent subset of V containing exactly n elements is a basis for V.

Proof. Let S = {y1, ..., yn} be a linearly independent subset of V containing exactly n elements. Applying the replacement theorem, we see that there exists a subset S1 of β containing n - n = 0 elements such that S ∪ S1 generates V; so S generates V. Since S is also linearly independent, S is a basis for V.

Example 7

The vectors (1, -3, 2), (4, 1, 0), and (0, 2, -1) form a basis for R^3 by Corollary 1. Note that we do not have to check that the given vectors span R^3, for if

    a1(1, -3, 2) + a2(4, 1, 0) + a3(0, 2, -1) = (0, 0, 0),

then a1, a2, and a3 must satisfy the system of equations

      a1 + 4a2       = 0
    -3a1 +  a2 + 2a3 = 0
     2a1       -  a3 = 0.

But it is easily seen that the only solution of this system is a1 = 0, a2 = 0, and a3 = 0. Hence the vectors (1, -3, 2), (4, 1, 0), and (0, 2, -1) are linearly independent and therefore form a basis for R^3 by Corollary 1.

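A quick numeric cross-check of Example 7: the three vectors are linearly independent exactly when the determinant of the 3 x 3 matrix having them as rows is nonzero (determinants are developed later, in Chapter 4; the function below is an illustrative sketch):

```python
def det3(u, v, w):
    """Determinant of the 3 x 3 matrix with rows u, v, w (cofactor expansion)."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

vectors = [(1, -3, 2), (4, 1, 0), (0, 2, -1)]
print(det3(*vectors))   # 3, nonzero: only the trivial solution exists
```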
Corollary 2. Let V be a vector space having a basis β containing exactly n elements. Then any subset of V containing more than n elements is linearly dependent. Consequently, any linearly independent subset of V contains at most n elements.
Proof. Let S be a subset of V containing more than n elements. In order to reach a contradiction, assume that S is linearly independent. Let S1 be any subset of S containing exactly n elements; then S1 is a basis for V by Corollary 1. Because S1 is a proper subset of S, we can select an element x of S that is not an element of S1. Since S1 is a basis for V, x ∈ span(S1) = V. Thus Theorem 1.8 implies that S1 ∪ {x} is linearly dependent. But S1 ∪ {x} ⊆ S; so S is linearly dependent—a contradiction. We therefore conclude that S is linearly dependent.

Example 8

Let S = {x^2 + 7, 2x, 8x^2 - 4x, 7x + 2}. Although we can prove directly that S is a linearly dependent subset of P2(F), this conclusion follows immediately from Corollary 2 since {1, x, x^2} is a basis for P2(F) containing fewer elements than S.

Corollary 3. Let V be a vector space having a basis β containing exactly n elements. Then every basis for V contains exactly n elements.

Proof. Let S be a basis for V. Since S is linearly independent, S contains at most n elements by Corollary 2. Suppose that S contains exactly m elements; then m ≤ n. But also S is a basis for V and β is a linearly independent subset of V. So Corollary 2 may be applied with the roles of β and S interchanged to yield n ≤ m. Thus m = n.

If a vector space has a basis containing a finite number of elements, then the corollary above asserts that the number of elements in each basis for the space is the same. This result makes the following definitions possible.

Definitions. A vector space is called finite-dimensional if it has a basis consisting of a finite number of elements. If a vector space V is finite-dimensional, then the unique number of elements in each basis for V is called the dimension of V and is denoted dim(V). A vector space that is not finite-dimensional is called infinite-dimensional.

The following results are consequences of Examples 1 through 5.

Example 10

The vector space {0} has dimension zero. The vector space Fn has dimension n.

Example 11

The vector space Mm×n(F) has dimension mn.

Sec. 1.6 Bases and Dimension 41

Example 12

The vector space Pn(F) has dimension n + 1.

Example 13

The vector space P(F) is infinite-dimensional.

The following two examples show that the dimension of a vector space depends on its field of scalars.

Example 14

Over the field of complex numbers, the vector space of complex numbers has dimension 1. (A basis is {1}.)

Example 15

Over the field of real numbers, the vector space of complex numbers has dimension 2. (A basis is {1, i}.)
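Examples 14 and 15 can be mimicked with Python's built-in complex numbers (an illustration of ours, not from the text): over the reals, every complex number decomposes uniquely as a real combination of 1 and i, which is exactly the statement that {1, i} is a basis.

```python
# Over R, the complex number z decomposes uniquely as a*1 + b*i with a, b real,
# which is the statement that {1, i} is a basis (Example 15).
z = 3.5 - 4.0j
a, b = z.real, z.imag        # coordinates of z relative to the basis {1, i}
print((a, b), a * 1 + b * 1j == z)
```

Over the complex field itself, the single coordinate of z relative to {1} is z, reflecting dimension 1.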
Corollary 4. Let V be a vector space having dimension n, and let S be a subset of V that generates V and contains at most n elements. Then S is a basis for V and contains exactly n elements.

Proof. There exists a subset S1 of S such that S1 is a basis for V (Theorem 1.9). By Corollary 3, S1 contains exactly n elements. But S1 is a subset of S, and S contains at most n elements. Hence S = S1; so S is a basis for V.
Example 16

It follows from Example 4 of Section 1.4 and Corollary 4 that

{x2 + 3x − 2, 2x2 + 5x − 3, −x2 − 4x + 4}

is a basis for P2(R).

Example 17

It follows from Example 5 of Section 1.4 and Corollary 4 that

{(1 1 / 1 0), (1 1 / 0 1), (1 0 / 1 1), (0 1 / 1 1)}

is a basis for M2×2(R).

Corollary 5. Let β be a basis for a finite-dimensional vector space V, and let S be a linearly independent subset of V. There exists a subset S1 of β such that S ∪ S1 is a basis for V. Thus every linearly independent subset of V can be extended to a basis for V.

Proof. Let S be a linearly independent subset of V containing exactly m elements, where dim(V) = n. By Corollary 2 we know that S must contain m ≤ n elements. The replacement theorem guarantees that there is a subset S1 of β containing exactly n − m elements such that S ∪ S1 generates V. Clearly, S ∪ S1 contains at most n elements; so Corollary 4 implies that S ∪ S1 is a basis for V.

Because Theorem 1.9, the replacement theorem, and the five corollaries of the replacement theorem contain a wealth of information about the relationships among linearly independent sets, bases, and generating sets, we will summarize here the main results of this section in order to put them into better perspective.
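The extension in Corollary 5 is constructive, and the construction can be sketched in code: starting from a linearly independent set, adjoin each basis vector that enlarges the span. The sketch below works over F = Q using exact rational arithmetic; `rank` and `extend_to_basis` are helper names of ours, not notation from the text.

```python
# Corollary 5 as a greedy procedure over Q: keep any basis vector that
# strictly increases the rank of the collected set.
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors over Q, by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r, col, n = 0, 0, len(rows[0]) if rows else 0
    while r < len(rows) and col < n:
        piv = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if piv is None:
            col += 1
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
        col += 1
    return r

def extend_to_basis(S, basis):
    """Adjoin vectors of `basis` to the independent set S until it spans."""
    result = list(S)
    for x in basis:
        if rank(result + [x]) > rank(result):
            result.append(x)
    return result

S = [(1, 1, 0)]                        # linearly independent in Q^3
std = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
B = extend_to_basis(S, std)
print(len(B))  # 3: S has been extended to a basis of Q^3
```

Note that (0, 1, 0) is skipped because it already lies in the span of the first two collected vectors, mirroring the proof's count of exactly n − m adjoined vectors.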

If a vector space V has dimension n, then every basis for V contains exactly n vectors, and V is said to be finite-dimensional. Moreover, every linearly independent subset of V contains no more than n vectors and can be extended to a basis for V by including appropriately chosen vectors. Also, each generating set for V contains at least n vectors and can be reduced to a basis for V by excluding appropriately chosen vectors. The Venn diagram in Figure 1.6 depicts these relationships. We will see in Section 2.4 that every vector space over F of dimension n is essentially Fn.

Figure 1.6

The Lagrange Interpolation Formula

The preceding results can be applied to obtain a useful formula. Let c0, c1, ..., cn be distinct elements in an infinite field F. The polynomials f0(x), f1(x), ..., fn(x) defined by

fi(x) = [(x − c0) ··· (x − c_{i−1})(x − c_{i+1}) ··· (x − cn)] / [(ci − c0) ··· (ci − c_{i−1})(ci − c_{i+1}) ··· (ci − cn)]

are called the Lagrange polynomials (associated with c0, c1, ..., cn). Note that each fi(x) is a polynomial of degree n and hence is an element of Pn(F). By regarding fi(x) as a polynomial function fi: F → F, we see that

fi(cj) = 0 if i ≠ j and fi(cj) = 1 if i = j.   (10)

We will use this property of the Lagrange polynomials to show that β = {f0, f1, ..., fn} is a linearly independent subset of Pn(F). Since the dimension of Pn(F) is n + 1, it will follow from Corollary 1 of Theorem 1.10 that β is a basis for Pn(F). To show that β is linearly independent, suppose that

Σ_{i=0}^{n} ai fi = 0   for some scalars a0, a1, ..., an,

where 0 denotes the zero function. Then

Σ_{i=0}^{n} ai fi(cj) = 0   for j = 0, 1, ..., n.

But also Σ_{i=0}^{n} ai fi(cj) = aj by (10). Hence aj = 0 for j = 0, 1, ..., n; so β is linearly independent.

Because β is a basis for Pn(F), every polynomial function g in Pn(F) is a linear combination of elements of β, say

g = Σ_{i=0}^{n} bi fi.

Then

g(cj) = Σ_{i=0}^{n} bi fi(cj) = bj,

so

g = Σ_{i=0}^{n} g(ci) fi

is the unique representation of g as a linear combination of elements of β. This representation is called the Lagrange interpolation formula. Notice that the argument above shows that if b0, b1, ..., bn are any n + 1 elements of F (not necessarily distinct), then the polynomial function

g = Σ_{i=0}^{n} bi fi

is the unique element of Pn(F) such that g(cj) = bj. Thus we have found the unique polynomial of degree not exceeding n that has specified values bj at given points cj in its domain (j = 0, 1, ..., n).

For example, let us construct the real polynomial g of degree at most 2 whose graph contains the points (1, 8), (2, 5), and (3, −4). (Thus in the notation above, c0 = 1, c1 = 2, c2 = 3, b0 = 8, b1 = 5, and b2 = −4.) The Lagrange polynomials associated with c0, c1, and c2 are

f0(x) = [(x − 2)(x − 3)] / [(1 − 2)(1 − 3)] = (1/2)(x2 − 5x + 6),

f1(x) = [(x − 1)(x − 3)] / [(2 − 1)(2 − 3)] = −1(x2 − 4x + 3),

and

f2(x) = [(x − 1)(x − 2)] / [(3 − 1)(3 − 2)] = (1/2)(x2 − 3x + 2).

Hence the desired polynomial is

g(x) = Σ_{i=0}^{2} bi fi(x) = 8 f0(x) + 5 f1(x) − 4 f2(x)
     = 4(x2 − 5x + 6) − 5(x2 − 4x + 3) − 2(x2 − 3x + 2)
     = −3x2 + 6x + 5.

An important consequence of the Lagrange interpolation formula is the following result: If f ∈ Pn(F) and f(ci) = 0 for n + 1 distinct elements c0, c1, ..., cn in F, then f is the zero function.
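The interpolation formula above translates directly into code. The sketch below (a helper of ours, using exact rational arithmetic) rebuilds g from the three data points alone and agrees with the worked result g(x) = −3x2 + 6x + 5.

```python
from fractions import Fraction

# Lagrange interpolation: evaluate the unique polynomial of degree at most n
# through the points (c_j, b_j), using the formula g = sum_j b_j * f_j.
def lagrange(points):
    pts = [(Fraction(c), Fraction(b)) for c, b in points]
    def g(x):
        x = Fraction(x)
        total = Fraction(0)
        for i, (ci, bi) in enumerate(pts):
            term = bi
            for j, (cj, _) in enumerate(pts):
                if j != i:
                    term *= (x - cj) / (ci - cj)   # one factor of f_i(x)
            total += term
        return total
    return g

g = lagrange([(1, 8), (2, 5), (3, -4)])
print(g(1), g(2), g(3))  # 8 5 -4, matching the worked example
```

Evaluating g elsewhere, say at 0 or 4, gives the values of −3x2 + 6x + 5 there, illustrating the uniqueness statement.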

The Dimension of Subspaces

Our next result relates the dimension of a subspace to the dimension of the vector space that contains it.

Theorem 1.11. Let W be a subspace of a finite-dimensional vector space V. Then W is finite-dimensional and dim(W) ≤ dim(V). Moreover, if dim(W) = dim(V), then W = V.

Proof. Let dim(V) = n. If W = {0}, then W is finite-dimensional and dim(W) = 0 ≤ n. Otherwise, W contains a nonzero element x1; so {x1} is a linearly independent set. Continuing in this way, choose elements x1, x2, ..., xk in W such that {x1, x2, ..., xk} is linearly independent. Since no linearly independent subset of V can contain more than n elements, this process must stop at a stage where k ≤ n and {x1, x2, ..., xk} is linearly independent but adjoining any other element of W produces a linearly dependent set. Theorem 1.8 now implies that {x1, x2, ..., xk} generates W, and hence it is a basis for W. Therefore, dim(W) = k ≤ n.

If dim(W) = n, then a basis for W is a linearly independent subset of V containing n elements. But Corollary 1 of Theorem 1.10 implies that this basis for W is also a basis for V; so W = V.
Example 18

Let

W = {(a1, a2, a3, a4, a5) ∈ F5: a1 + a3 + a5 = 0, a2 = a4}.

It is easily shown that W is a subspace of F5 having

{(1, 0, 0, 0, −1), (0, 0, 1, 0, −1), (0, 1, 0, 1, 0)}

as a basis. Thus dim(W) = 3.

Example 19

The set of diagonal n × n matrices is a subspace W of Mn×n(F) (see Example 1 of Section 1.3). A basis for W is

{M11, M22, ..., Mnn},

where Mij is the matrix in which the only nonzero entry is a 1 in the ith row and jth column. Thus dim(W) = n.
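The membership claims of Example 18 are easy to check mechanically (a snippet of ours, not from the text): each claimed basis vector satisfies the two defining conditions of W, and the three vectors are visibly independent because each has a 1 in a position where the others vanish.

```python
# Example 18: W = {(a1,...,a5) in F^5 : a1 + a3 + a5 = 0 and a2 = a4}.
basis = [(1, 0, 0, 0, -1), (0, 0, 1, 0, -1), (0, 1, 0, 1, 0)]

checks = [a1 + a3 + a5 == 0 and a2 == a4 for (a1, a2, a3, a4, a5) in basis]
print(checks, len(basis))  # all True, and dim(W) = 3
```

Two independent linear conditions on F5 leave 5 − 2 = 3 degrees of freedom, consistent with dim(W) = 3.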

Example 20

We saw in Section 1.3 that the set of symmetric n × n matrices is a subspace W of Mn×n(F). A basis for W is

{Aij: 1 ≤ i ≤ j ≤ n},

where Aij is the n × n matrix having 1 in the ith row and jth column, 1 in the jth row and ith column, and 0 elsewhere. It follows that

dim(W) = n + (n − 1) + ··· + 1 = (1/2)n(n + 1).

Corollary. If W is a subspace of a finite-dimensional vector space V, then W has a finite basis, and any basis for W is a subset of a basis for V.

Proof. Theorem 1.11 shows that W has a finite basis S. If β is a basis for V, then the replacement theorem shows that there exists a subset S1 of β such that S ∪ S1 is a basis for V. Hence S is a subset of the basis S ∪ S1 for V.

Example 21

The set W of all polynomials of the form

a18 x18 + a16 x16 + ··· + a2 x2 + a0,

where a18, a16, ..., a2, a0 ∈ F, is a subspace of P18(F). A basis for W is {1, x2, ..., x16, x18}, which is a subset of the standard basis for P18(F).
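The count in Example 20 can be verified by enumeration (a snippet of ours): there is exactly one basis matrix Aij for each pair i ≤ j.

```python
# One symmetric basis matrix A^{ij} for each pair 1 <= i <= j <= n gives
# dim(W) = n + (n - 1) + ... + 1 = n(n + 1)/2.
def symmetric_basis_size(n):
    return sum(1 for i in range(1, n + 1) for j in range(1, n + 1) if i <= j)

print([symmetric_basis_size(n) for n in range(1, 6)])  # [1, 3, 6, 10, 15]
```

The values are the triangular numbers, as the closed form predicts.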

We conclude this section by using Theorem 1.11 to determine the subspaces of R2 and R3. Since R2 has dimension 2, subspaces of R2 can be of dimensions 0, 1, or 2 only. The only subspaces of dimensions 0 or 2 are {0} and R2, respectively. Any subspace of R2 having dimension 1 consists of all scalar multiples of some nonzero vector in R2 (Exercise 9 of Section 1.4).

If a point of R2 is identified in the natural way with a point on the Euclidean plane, then it is possible to describe the subspaces of R2 geometrically: a subspace of R2 having dimension 0 consists of the origin of the Euclidean plane, a subspace of R2 with dimension 1 consists of a line through the origin, and a subspace of R2 having dimension 2 is the entire Euclidean plane.

As above, the subspaces of R3 must have dimensions 0, 1, 2, or 3. Interpreting these possibilities geometrically, we see that a subspace of dimension zero must be the origin of Euclidean 3-space, a subspace of dimension 1 is a line through the origin, a subspace of dimension 2 is a plane through the origin, and a subspace of dimension 3 is Euclidean 3-space itself.
EXERCISES

1. Label the following statements as being true or false.
(a) The zero vector space has no basis.
(b) Every vector space that is generated by a finite set has a basis.
(c) Every vector space has a finite basis.
(d) A vector space cannot have more than one basis.
(e) If a vector space has a finite basis, then the number of elements in every basis is the same.
(f) The dimension of Pn(F) is n.
(g) The dimension of Mm×n(F) is m + n.
(h) Suppose that V is a finite-dimensional vector space, that S1 is a linearly independent subset of V, and that S2 is a subset of V that generates V. Then S1 cannot contain more elements than S2.
(i) If S generates the vector space V, then every vector in V can be written as a linear combination of elements of S in only one way.
(j) Every subspace of a finite-dimensional space is finite-dimensional.
(k) If V is a vector space having dimension n, then V has exactly one subspace with dimension 0 and exactly one subspace with dimension n.

2. Determine which of the following sets are bases for R3.
(a)

{(1,0,-1),(2,5,1),(0,-4,3)}

(b)

{(2,-4,1),(0,3,-1),(6,0,-1)}

(c) {(1,2,-1),(1,0,2),(2,1,1)}

(d) {(-1,3,1),(2,-4,-3),(-3,8,2)}

(e) {(1,-3, -2),(-3, 1,3),(-2,-10, -2)}

3. Determine which of the following sets are bases for P2(R).
(a) {−1 − x + 2x2, 2 + x − 2x2, 1 − 2x + 4x2}
(b) {1 + 2x + x2, 3 + x2, x + x2}
(c) {1 − 2x − 2x2, −2 + 3x − x2, 1 − x + 6x2}
(d) {−1 + 2x + 4x2, 3 − 4x − 10x2, −2 − 5x − 6x2}
(e) {1 + 2x − x2, 4 − 2x + x2, −1 + 18x − 9x2}

4. Give three different bases for F2 and for M2×2(F).

5. Is {(1, 4, −6), (1, 5, 8), (2, 1, 1), (0, 1, 0)} a linearly independent subset of R3? Justify your answer.

6. Do the polynomials x3 − 2x2 + 1, 4x2 − x + 3, and 3x − 2 generate P3(R)? Justify your answer.

7. The vectors x1 = (2, −3, 1), x2 = (1, 4, −2), x3 = (−8, 12, −4), x4 = (1, 37, −17), and x5 = (−3, −5, 8) generate R3. Find a subset of {x1, ..., x5} that is a basis for R3.

8. Let W denote the subspace of R5 consisting of all vectors for which the sum of the coordinates equals zero. The vectors

x1 = (2, −3, 4, −5, 2), x2 = (−6, 9, −12, 15, −6),
x3 = (3, −2, 7, −9, 1), x4 = (2, −8, 2, −2, 6),
x5 = (−1, 1, 2, 1, −3), x6 = (0, −3, −18, 9, 12),
x7 = (1, 0, −2, 3, −2), and x8 = (2, −1, 1, −9, 7)

generate W. Find a subset of {x1, ..., x8} that is a basis for W.

9. The vectors x1 = (1, 1, 1, 1), x2 = (0, 1, 1, 1), x3 = (0, 0, 1, 1), and x4 = (0, 0, 0, 1) form a basis for F4. Find the unique representation of an arbitrary vector (a1, a2, a3, a4) in F4 as a linear combination of x1, x2, x3, and x4.

10. Let {x, y} be a basis for a vector space V. Show that both {x + y, ax} and {ax, by} are bases for V, where a and b are arbitrary nonzero scalars.

11. Suppose that V is a vector space with a basis {x1, x2, x3}. Show that {x1 + x3, x2 + x3, x3} is also a basis for V.

12. The set of solutions to the system of equations

x1 − 2x2 + x3 = 0
2x1 − 3x2 + x3 = 0

is a subspace of R3. Find a basis for this subspace.

13. Find bases for the following subspaces of F5:

W1 = {(a1, a2, a3, a4, a5) ∈ F5: a1 − a3 − a4 = 0}

and

W2 = {(a1, a2, a3, a4, a5) ∈ F5: a2 = a3 = a4, a1 + a5 = 0}.

What are the dimensions of W1 and W2?

14. The set of all n × n matrices having trace equal to zero is a subspace W of Mn×n(F) (see Example 4 of Section 1.3). Find a basis for W. What is the dimension of W?

15. The set of all upper triangular n × n matrices is a subspace W of Mn×n(F) (see Exercise 12 of Section 1.3). Find a basis for W. What is the dimension of W?

16. The set of all skew-symmetric n × n matrices is a subspace W of Mn×n(F) (see Exercise 26 of Section 1.3). Find a basis for W. What is the dimension of W?

17. Find a basis for the vector space in Example 5 of Section 1.2. Justify your answer.

18. Complete the proof of Theorem 1.7.

19. Let V be a vector space having dimension n, and let S be a subset of V that generates V.
(a) Prove that a subset of S is a basis for V. (Be careful not to assume that S is finite.)
(b) Prove that S contains at least n elements.

Exercises 20 through 24 require knowledge of the sum and direct sum of subspaces, as defined in the exercises of Section 1.3.

20. (a) Prove that if W1 and W2 are finite-dimensional subspaces of a vector space V, then the subspace W1 + W2 is finite-dimensional, and dim(W1 + W2) = dim(W1) + dim(W2) − dim(W1 ∩ W2). Hint: Extend a basis for W1 ∩ W2 to a basis for W1 and to a basis for W2.
(b) Let W1 and W2 be finite-dimensional subspaces of a vector space V, and let V = W1 + W2. Deduce that V is the direct sum of W1 and W2 if and only if dim(V) = dim(W1) + dim(W2).

21. Let V = M2×2(F), and let

W1 = {(a b / c a) ∈ V: a, b, c ∈ F} and W2 = {(0 a / −a b) ∈ V: a, b ∈ F}.

Prove that W1 and W2 are subspaces of V, and find the dimensions of W1, W2, W1 + W2, and W1 ∩ W2.

22. Let W1 and W2 be subspaces of a vector space V having dimensions m and n, respectively, where m ≥ n.
(a) Prove that dim(W1 ∩ W2) ≤ n and dim(W1 + W2) ≤ m + n.
(b) Give examples of subspaces W1 and W2 of R3 for which dim(W1 ∩ W2) = n and dim(W1 + W2) = m + n.
(c) Give examples of subspaces W1 and W2 of R3 for which dim(W1 ∩ W2) < n and dim(W1 + W2) < m + n.

23. (a) Let W1 and W2 be subspaces of a vector space V such that V = W1 ⊕ W2. If β1 and β2 are bases for W1 and W2, respectively, show that β1 ∩ β2 = ∅ and β1 ∪ β2 is a basis for V.
(b) Conversely, let β1 and β2 be disjoint bases for subspaces W1 and W2, respectively, of a vector space V. Prove that if β1 ∪ β2 is a basis for V, then V = W1 ⊕ W2.

24. Prove that if W1 is any subspace of a finite-dimensional vector space V, then there exists a subspace W2 of V such that V = W1 ⊕ W2.

25. Prove that a vector space is infinite-dimensional if and only if it contains an infinite linearly independent subset.

The following exercise requires familiarity with Exercise 29 of Section 1.3.

26. Let W be a subspace of a finite-dimensional vector space V, and let {x1, x2, ..., xk} be a basis for W. Extend this basis to a basis {x1, x2, ..., xk, xk+1, ..., xn} for V.
(a) Prove that {xk+1 + W, xk+2 + W, ..., xn + W} is a basis for V/W.
(b) Derive a formula relating dim(V), dim(W), and dim(V/W).

1.7* MAXIMAL LINEARLY INDEPENDENT SUBSETS

In this section several important results from Section 1.6 are extended to include infinite-dimensional vector spaces. Our principal goal is to prove that every vector space has a basis. This result is of importance in the study of infinite-dimensional vector spaces because it is often difficult to construct a basis for such a space explicitly.

The difficulty that arises in extending the theorems of the preceding section to infinite-dimensional spaces is that the principle of mathematical induction, which played a crucial role in many of the proofs of Section 1.6, is no longer adequate. We will use instead a more general result called the maximal principle, which requires the following terminology.

Definition. Let ℱ be a family of sets. A member M of ℱ is called maximal (with respect to set inclusion) if no member of ℱ properly contains M.

Example 1

Let ℱ be the family of all subsets of a nonempty set S. (ℱ is called the power set of S.) The set S is easily seen to be a maximal element of ℱ.

Definition. A collection of sets 𝒞 is called a chain (or nest or tower) if for each pair of sets A and B in 𝒞, either A ⊆ B or B ⊆ A.

Example 2

For each positive integer n, let An denote the set consisting of the integers 1, 2, ..., n. Then {An: n = 1, 2, 3, ...} is a chain; in fact, Am ⊆ An if and only if m ≤ n.

With this terminology we can now state the maximal principle.

Maximal Principle. Let ℱ be a family of sets. If for each chain 𝒞 ⊆ ℱ there exists a member of ℱ that contains each member of 𝒞, then ℱ contains a maximal element.

Because the maximal principle guarantees the existence of maximal elements in a family of sets, it is useful to reformulate the definition of a basis in terms of a maximal property. We will subsequently show that this reformulation is equivalent to the original definition of a basis.

Definition. Let S be a subset of a vector space V. A maximal linearly independent subset of S is a subset B of S satisfying both of the following conditions:

(a) B is linearly independent.
(b) Any subset of S that properly contains B is linearly dependent.

Example 3

Let

S = {2x3 − 2x2 + 12x − 6, x3 − 2x2 − 5x − 3, 3x3 − 5x2 − 4x − 9}

in P3(R). Example 2 of Section 1.4 shows that

{x3 − 2x2 − 5x − 3, 3x3 − 5x2 − 4x − 9}

is a maximal linearly independent subset of S. In this case, however, any two-element subset of S is easily shown to be linearly independent and hence is a maximal linearly independent subset of S. Thus maximal linearly independent subsets of a set need not be unique.

A basis β for a vector space V is a maximal linearly independent subset of V, because

(a) β is linearly independent by definition, and
(b) if x ∈ V and x ∉ β, then β ∪ {x} is linearly dependent by Theorem 1.8 because span(β) = V.

Our next result shows that the converse of this statement is also true.

Theorem 1.12. Let V be a vector space and S a subset that generates V. If β is a maximal linearly independent subset of S, then β is a basis for V.

Proof. Let β be a maximal linearly independent subset of S. Because β is linearly independent, it suffices to prove that β generates V. We claim that S ⊆ span(β). Suppose that S ⊄ span(β); then there exists x ∈ S such that x ∉ span(β). Since β is linearly independent, it follows from Theorem 1.8 of Section 1.4 that β ∪ {x} is linearly independent, and we have contradicted the fact that β is a maximal linearly independent subset of S. Hence S ⊆ span(β). Because span(S) = V, it follows from Exercise 11 of Section 1.4 that span(β) = V.

Corollary. A subset β of a vector space V is a basis for V if and only if β is a maximal linearly independent subset of V.

In view of the preceding corollary, we can accomplish our goal of proving that every vector space has a basis by proving that every vector space contains a maximal linearly independent subset. This result follows immediately from the next theorem.

Theorem 1.13. Let S be a linearly independent subset of a vector space V. There exists a maximal linearly independent subset of V that contains S.

Proof. Let ℱ denote the family of all linearly independent subsets of V that contain S. We will use the maximal principle to show that ℱ contains a maximal element. In order to apply the maximal principle, we must show that if 𝒞 is a chain in ℱ, then there exists a member U of ℱ that contains each member of 𝒞. We will show that U, the union of the members of 𝒞, is the desired set.

Since U clearly contains each member of 𝒞, it suffices to prove that U ∈ ℱ (i.e., that U is a linearly independent subset of V that contains S). Now each element of 𝒞 is a subset of V containing S; hence U is a subset of V containing S. To prove that U is linearly independent, let u1, ..., un be vectors in U and c1, ..., cn be scalars such that

c1u1 + ··· + cnun = 0.

Because ui ∈ U for i = 1, ..., n, there exist sets A1, ..., An in 𝒞 such that ui ∈ Ai for i = 1, ..., n. But since 𝒞 is a chain, one of these sets, say Ak, contains all the others. Thus u1, ..., un ∈ Ak. However, Ak is a linearly independent set; so c1u1 + ··· + cnun = 0 implies that c1 = ··· = cn = 0. Therefore, U is linearly independent.

The maximal principle implies that ℱ contains a maximal element. This element is easily seen to be a maximal linearly independent subset of V that contains S.

Corollary. Every vector space has a basis.

It can be shown, analogously to Corollary 3 to Theorem 1.10, that every basis for an infinite-dimensional vector space has the same cardinality (see, e.g., N. Jacobson, Lectures in Abstract Algebra, Vol. 3, D. Van Nostrand Company, New York, 1964, p. 154). Exercises 2 through 5 extend other results from Section 1.6 to include infinite-dimensional vector spaces.

EXERCISES

1. Label the following statements as being true or false.
(a) Every family of sets contains a maximal element.
(b) Every chain of sets contains a maximal element.
(c) If a family of sets has a maximal element, then that maximal element is unique.
(d) If a chain of sets has a maximal element, then that maximal element is unique.
(e) A basis for a vector space is a maximal linearly independent subset of that vector space.
(f) A maximal linearly independent subset of a vector space is a basis for that vector space.

2. Let W be a subspace of a (not necessarily finite-dimensional) vector space V. Prove that any basis for W is a subset of a basis for V.

3. Prove the following infinite-dimensional version of Theorem 1.7: Let β be a subset of an infinite-dimensional vector space V. Then β is a basis for V if and only if, for each nonzero vector y in V, there exist unique vectors x1, ..., xn in β and unique nonzero scalars c1, ..., cn such that y = c1x1 + ··· + cnxn.

4. Prove the following generalization of Theorem 1.9: Let S1 and S2 be subsets of a vector space V such that S1 ⊆ S2. If S1 is linearly independent and S2 generates V, then there exists a basis β for V such that S1 ⊆ β ⊆ S2. Hint: Apply the maximal principle to the family of all linearly independent subsets of S2 that contain S1, and proceed as in the proof of Theorem 1.13.

5. Prove the following generalization of the replacement theorem: Let β be a basis for a vector space V, and let S be a linearly independent subset of V. There exists a subset S1 of β such that S ∪ S1 is a basis for V.
INDEX OF DEFINITIONS FOR CHAPTER 1

Additive inverse 10
Basis 35
Cancellation law 10
Chain 49
Column vector 7
Coset 16
Degree of a polynomial 8
Diagonal elements 19
Diagonal matrix 19
Dimension 40
Direct sum 16
Even function 16
Finite-dimensional space 40
Generates 28
Infinite-dimensional space 40
Lagrange interpolation formula 43
Lagrange polynomials 43
Linear combination 21
Linearly dependent 32
Linearly independent 33
Matrix 8
Maximal element of a family of sets 49
Maximal linearly independent subset 50
n-tuple 7
Odd function 16
Polynomial 8
Quotient space 16
Row vector 7
Scalar 10
Sequence 15
Skew-symmetric matrix 20
Span of a subset 27
Spans 28
Square matrix 8
Standard basis for Fn 35
Standard basis for Pn(F) 36
Subspace 14
Subspace generated by the elements of a set 27
Sum of subsets 19
Symmetric matrix 19
Trace 19
Transpose 15
Trivial representation 33
Upper triangular matrix 18
Vector 10
Vector space 7
Zero matrix 8
Zero polynomial 8
Zero subspace 14
Zero vector 10
Zero vector space 13

2 Linear Transformations and Matrices

In Chapter 1 we developed the theory of abstract vector spaces in considerable detail. It is now natural to consider those functions defined on vector spaces that in some sense "preserve" the structure. These special functions are called "linear transformations," and they abound in both pure and applied mathematics. In calculus, the operations of differentiation and integration provide us with two of the most important examples of linear transformations (see Examples 1 and 2 of Section 2.1). These two examples allow us to reformulate many of the problems in differential and integral equations in terms of linear transformations on particular vector spaces (see Sections 2.7 and 5.2).

In geometry, rotations, reflections, and projections (see Examples 5, 6, and 7 of Section 2.1) provide us with another class of linear transformations. Later we use these transformations to study the rigid motions of Rn (Section 6.10). In the remaining chapters we will see further examples of linear transformations occurring in both the physical and social sciences. Throughout this chapter we assume that all vector spaces are over a common field F.

2.1 LINEAR TRANSFORMATIONS, NULL SPACES, AND RANGES

In this section we consider a number of examples of linear transformations. Many of these transformations will be studied in more detail in later sections.

Definition. Let V and W be vector spaces (over F). A function T: V → W is called a linear transformation from V into W if for all x, y ∈ V and c ∈ F we have

(a) T(x + y) = T(x) + T(y) and
(b) T(cx) = cT(x).

We often simply call T linear. The reader should verify the following facts about a function T: V → W.

1. If T is linear, then T(0) = 0.
2. T is linear if and only if T(ax + y) = aT(x) + T(y) for all x, y ∈ V and a ∈ F.
3. T is linear if and only if for x1, ..., xn ∈ V and a1, ..., an ∈ F we have

T(Σ_{i=1}^{n} ai xi) = Σ_{i=1}^{n} ai T(xi).

We generally use property 2 to prove that a given transformation is linear.
Example

and

V=Pn(JR)

denotesthe

of

Example

To

show

(a5

that T is

by

linear, let g and

=
/i)

2 above,

h)r

- ag' + ft'

T is

Then

V.

in

vectors

al(g)

+ T(ft).

on

R.

Let

/W*
by the elementary

transformation

a linear

functions

propertiesof

integral.

notation,are

the
the

of

remainder
identity

of linear transformations that appear


the book, and therefore, deserve their own

examples

important

very
in

frequently

be

T(/)

Two

linear.

is

\302\273

real-valued
Let V = C{R), the vector spaceof continuous
-*
R by
Define T: V
a,beR9a<b.

for all f

where f

T(/)=/',

Now

T(a5 +
So by property

f.

T:V->W

Define

W=Pn\342\200\2361(JR).

derivative

a e R.

and

Pn(R)

the

verify the following

F.

as

Let

should

reader

The

linear.

function

55

Ranges

+ T(y).

We often simplycall

about a

and

Spaces,

and

transformations.

zero

the identity transformation


spaces V and
(over
F) we define
iv: -* V by Iv(x) = x for all x e and the zero transformation
T0: V -> W by
are linear.
T0(x) 0 for all x e V. It is clear that both these transformations
of Iv.
We often write
instead
We now look at some additional examplesof linear
For vector

==

of

transformations.

56

and Matrices

Linear Transformations

Chap.

Example 3

Define T: R2 → R2 by

T(a1, a2) = (2a1 + a2, a1).

To show that T is linear, let c ∈ F and x, y ∈ R2, where x = (b1, b2) and y = (d1, d2). Since

cx + y = (cb1 + d1, cb2 + d2),

we have

T(cx + y) = (2(cb1 + d1) + cb2 + d2, cb1 + d1).

Also,

cT(x) + T(y) = c(2b1 + b2, b1) + (2d1 + d2, d1)
             = (2cb1 + cb2 + 2d1 + d2, cb1 + d1)
             = (2(cb1 + d1) + cb2 + d2, cb1 + d1).

So T is linear.

Example 4

Define T: Mm×n(F) → Mn×m(F) by T(A) = At, where At is as defined in Section 1.3. Then T is a linear transformation by Exercise 3 of Section 1.3.
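Property 2 can be spot-checked numerically for the map of Example 3 (a check of ours, not in the text): T(cx + y) and cT(x) + T(y) agree on sample inputs, as the algebra above guarantees for all inputs.

```python
# Spot-check of property 2 for Example 3: T(a1, a2) = (2*a1 + a2, a1).
def T(v):
    a1, a2 = v
    return (2 * a1 + a2, a1)

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def scale(c, v):
    return (c * v[0], c * v[1])

x, y, c = (3, -1), (2, 5), 4
print(T(add(scale(c, x), y)) == add(scale(c, T(x)), T(y)))  # True
```

A finite check is evidence, not a proof; the computation in Example 3 is what establishes linearity for every c, x, y.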

As we will see in Chapter 6, the applications of linear algebra to geometry are wide and varied. The main reason for this is that most of the important geometrical transformations are linear. Three particular transformations that we now consider are the rotation, the reflection, and the projection. We leave the proofs of linearity to the reader.

Example 5

For 0 ≤ θ < 2π define Tθ: R2 → R2 by

Tθ(a1, a2) = (a1 cos θ − a2 sin θ, a1 sin θ + a2 cos θ).

Tθ is called the rotation by θ [see Figure 2.1(a)].

Example 6

Define T: R2 → R2 by T(a1, a2) = (a1, −a2). T is called the reflection about the x-axis [see Figure 2.1(b)].

Example 7

Define T: R2 → R2 by T(a1, a2) = (a1, 0). T is called the projection on the x-axis [see Figure 2.1(c)].

Figure 2.1  (a) Rotation  (b) Reflection  (c) Projection
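Examples 5 through 7 are easy to exercise numerically (a sketch of ours): a quarter-turn rotation sends (1, 0) to (0, 1), while reflection and projection act coordinatewise.

```python
import math

# Examples 5-7: rotation by theta, reflection about the x-axis,
# and projection on the x-axis.
def rotate(theta, v):
    a1, a2 = v
    return (a1 * math.cos(theta) - a2 * math.sin(theta),
            a1 * math.sin(theta) + a2 * math.cos(theta))

def reflect(v):
    return (v[0], -v[1])

def project(v):
    return (v[0], 0)

x, y = rotate(math.pi / 2, (1.0, 0.0))
print(abs(x) < 1e-12 and abs(y - 1.0) < 1e-12)  # quarter turn sends e1 to e2
print(reflect((3.0, 4.0)), project((3.0, 4.0)))
```

Floating-point rotation is only approximate, hence the tolerance in the check; reflection and projection are exact.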

two

with

associated

sets

important

very

linear transformations: the \"range\"and \"null


The
determination
space.\"
sets allows
us to examine more closely the intrinsicproperties

these

of

linear

of

transformation.

Definitions. Let V and W be vector spaces, and let T: V → W be linear. We define the null space (or kernel) N(T) of T to be the set of all vectors x in V such that T(x) = 0; that is,

N(T) = {x ∈ V : T(x) = 0}.

We define the range (or image) R(T) of T to be the subset of W consisting of all images (under T) of elements of V; that is,

R(T) = {T(x) : x ∈ V}.

Example 8

Let V and W be vector spaces, and let I: V → V and T0: V → W be the identity and zero transformations, respectively, defined above. Then N(I) = {0}, R(I) = V, N(T0) = V, and R(T0) = {0}.
Example 9

Define T: R³ → R² by T(a1, a2, a3) = (a1 − a2, 2a3). It is left as an exercise to verify that

N(T) = {(a, a, 0) : a ∈ R} and R(T) = R².

In Examples 8 and 9 we see that the range and null space of each of the linear transformations is a subspace. The next result shows that this is true in general.
Theorem 2.1. Let V and W be vector spaces and T: V → W be linear. Then N(T) and R(T) are subspaces of V and W, respectively.

Proof. To clarify the notation, we use the symbols 0_V and 0_W to denote the zero vectors of V and W, respectively.

Let x, y ∈ N(T) and c ∈ F. Since T(0_V) = 0_W, we have 0_V ∈ N(T). Because T(x) = T(y) = 0_W, we have

T(x + y) = T(x) + T(y) = 0_W + 0_W = 0_W and T(cx) = cT(x) = c0_W = 0_W.

Hence x + y ∈ N(T) and cx ∈ N(T), so that N(T) is a subspace of V.

Since T(0_V) = 0_W, we have 0_W ∈ R(T). Now let x, y ∈ R(T) and c ∈ F. Then there exist v and w in V such that T(v) = x and T(w) = y. So

T(v + w) = T(v) + T(w) = x + y and T(cv) = cT(v) = cx.

Thus x + y ∈ R(T) and cx ∈ R(T), so that R(T) is a subspace of W. ∎

The next theorem provides a method for finding a spanning set for the range of a linear transformation. With this accomplished, a basis for the range is easy to discover (see Example 6 of Section 1.6).

Theorem 2.2. Let V and W be vector spaces, and let T: V → W be linear. If β = {x1, ..., xn} is a basis for V, then

R(T) = span(T(β)) = span({T(x1), ..., T(xn)}).
Proof. Clearly, T(xi) ∈ R(T) for each i. Because R(T) is a subspace, R(T) contains span({T(x1), ..., T(xn)}) = span(T(β)).

Now suppose that y ∈ R(T). Then y = T(x) for some x ∈ V. Because β is a basis for V, we have

x = Σ_{i=1}^{n} a_i x_i for some a1, ..., an ∈ F.

Since T is linear, it follows that

y = T(x) = Σ_{i=1}^{n} a_i T(x_i) ∈ span(T(β)).

Thus R(T) = span(T(β)). ∎
The following example illustrates the usefulness of this result.

Example 10

Define the linear transformation T: P2(R) → M2×2(R) by

T(f) = ( f(1) − f(2)   0
         0             f(0) ).

Since β = {1, x, x²} is a basis for P2(R), we have

R(T) = span(T(β)) = span({T(1), T(x), T(x²)})

     = span({ ( 0  0      ( −1  0      ( −3  0
                0  1 ),      0  0 ),      0  0 ) })

     = span({ ( 0  0      ( −1  0
                0  1 ),      0  0 ) }).

Thus we have found a basis for R(T), and so dim(R(T)) = 2.
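The span computation of Example 10 can be checked by row reduction. The sketch below is an illustration of ours, not part of the text: each 2 × 2 image matrix is flattened to a vector of length 4, and a small Gaussian-elimination routine over exact rationals computes the dimension of their span.

```python
# Example 10 numerically: flatten T(1), T(x), T(x^2) and compute the
# dimension of their span by row reduction over exact rationals.
from fractions import Fraction

def row_rank(rows):
    """Rank of a list of row vectors via Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in rows]
    r = 0  # pivots found so far
    for col in range(len(rows[0])):
        piv = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def T(f):
    """Image of f = (a0, a1, a2) under T, flattened row by row."""
    ev = lambda t: f[0] + f[1] * t + f[2] * t * t
    return [ev(1) - ev(2), 0, 0, ev(0)]

images = [T((1, 0, 0)), T((0, 1, 0)), T((0, 0, 1))]  # T(1), T(x), T(x^2)
print(row_rank(images))  # 2 = dim(R(T)), as found in Example 10
```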

As in Chapter 1, we measure the "size" of a subspace by its dimension. The null space and range are so important that we attach special names to their respective dimensions.

Definitions. Let V and W be vector spaces, and let T: V → W be linear. If N(T) and R(T) are finite-dimensional, then we define the nullity of T, denoted nullity(T), and the rank of T, denoted rank(T), to be the dimensions of N(T) and R(T), respectively.

Reflecting on the action of a linear transformation, we see intuitively that the larger the nullity, the smaller the rank. In other words, the more vectors that are carried into 0 by the transformation, the smaller the range. The same heuristic reasoning tells us that the larger the rank, the smaller the nullity. This balance between the rank and the nullity is made precise in the next theorem, appropriately called the dimension theorem.
Theorem 2.3 (Dimension Theorem). Let V and W be vector spaces, and let T: V → W be linear. If V is finite-dimensional, then

nullity(T) + rank(T) = dim(V).

Proof. Suppose that dim(V) = n, and let {x1, ..., xk} be a basis for N(T). By the corollary to Theorem 1.11 we may extend {x1, ..., xk} to a basis β = {x1, ..., xn} for V. We will show that the set S = {T(x_{k+1}), ..., T(xn)} is a basis for R(T).

First we prove that S generates R(T). Using the fact that T(xi) = 0 for 1 ≤ i ≤ k, and Theorem 2.2, we have

R(T) = span({T(x1), ..., T(xn)}) = span({T(x_{k+1}), ..., T(xn)}) = span(S).

Now we prove that S is linearly independent. Suppose that

Σ_{i=k+1}^{n} b_i T(x_i) = 0 for b_{k+1}, ..., bn ∈ F.

Since T is linear, we have

T( Σ_{i=k+1}^{n} b_i x_i ) = 0. So Σ_{i=k+1}^{n} b_i x_i ∈ N(T).

Hence there exist c1, ..., ck ∈ F such that

Σ_{i=k+1}^{n} b_i x_i = Σ_{i=1}^{k} c_i x_i, or Σ_{i=1}^{k} (−c_i) x_i + Σ_{i=k+1}^{n} b_i x_i = 0.

Since β is a basis for V, we have b_i = 0 for all i. Hence S is linearly independent. ∎
If we apply the dimension theorem to the linear transformation T defined in Example 9, we have that nullity(T) + 2 = 3, so nullity(T) = 1.

The reader should review the concepts of "one-to-one" and "onto" presented in Appendix B. Interestingly, for a linear transformation both of these concepts are intimately connected with the rank and nullity of the transformation. This will be demonstrated in the next two theorems.
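The arithmetic of this application of the dimension theorem can be mirrored in code. The sketch below is ours, for illustration only: it computes the rank of the transformation of Example 9 from the images of the standard basis vectors and recovers the nullity from nullity(T) = dim(V) − rank(T).

```python
# Dimension-theorem check for Example 9: T(a1, a2, a3) = (a1 - a2, 2*a3).
from fractions import Fraction

def row_rank(rows):
    """Rank of a list of row vectors via Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(rows[0])):
        piv = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def T(v):
    a1, a2, a3 = v
    return [a1 - a2, 2 * a3]

# The row rank of the images of e1, e2, e3 equals rank(T).
rank_T = row_rank([T([1, 0, 0]), T([0, 1, 0]), T([0, 0, 1])])
nullity_T = 3 - rank_T  # dimension theorem: nullity(T) + rank(T) = dim(V)
print(rank_T, nullity_T)  # 2 1
```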

Theorem 2.4. Let V and W be vector spaces, and let T: V → W be linear. Then T is one-to-one if and only if N(T) = {0}.

Proof. Suppose that T is one-to-one and x ∈ N(T). Then T(x) = 0 = T(0). Since T is one-to-one, we have x = 0. Hence N(T) = {0}.

Now assume that N(T) = {0}, and suppose that T(x) = T(y). Then 0 = T(x) − T(y) = T(x − y). Hence x − y ∈ N(T) = {0}. So x − y = 0, or x = y. This means that T is one-to-one. ∎

The reader should observe that Theorem 2.4 allows us to conclude that the transformation defined in Example 9 is not one-to-one.

Surprisingly, the conditions of being one-to-one and onto are equivalent in an important special case.

Theorem 2.5. Let V and W be vector spaces of equal (finite) dimension, and let T: V → W be linear. Then T is one-to-one if and only if T is onto.
Proof. From the dimension theorem we have

nullity(T) + rank(T) = dim(V).

Now, with the use of Theorem 2.4, we have that T is one-to-one if and only if N(T) = {0}, if and only if nullity(T) = 0, if and only if rank(T) = dim(V), if and only if rank(T) = dim(W), and if and only if dim(R(T)) = dim(W). By Theorem 1.11 this equality is equivalent to R(T) = W, the definition of T being onto. ∎

The linearity of T in Theorems 2.4 and 2.5 is essential, for it is easy to construct examples of functions from R into R that are not one-to-one but are onto, and vice versa.

The following two examples make use of the theorems above in determining whether a given linear transformation is one-to-one or onto.
Example 11

Define T: P2(R) → P3(R) by

T(f)(x) = 2f′(x) + ∫₀ˣ 3f(t) dt.

Now

R(T) = span({T(1), T(x), T(x²)}) = span({3x, 2 + (3/2)x², 4x + x³}).

Hence rank(T) = 3. Since dim(P3(R)) = 4, T is not onto. From Theorem 2.3, nullity(T) + 3 = 3. So nullity(T) = 0, and thus N(T) = {0}. By Theorem 2.4, T is one-to-one.
Example 12

Define T: F² → F² by T(a1, a2) = (a1 + a2, a1). It is easy to see that N(T) = {0}; so T is one-to-one. Hence Theorem 2.5 tells us that T must be onto.

In Exercise 14 it is stated that if T is linear and one-to-one, then a subset S is linearly independent if and only if T(S) is linearly independent. Example 13 illustrates the use of this result.
Example 13

Define T: P2(R) → R³ by T(a0 + a1x + a2x²) = (a0, a1, a2). Clearly, T is linear and one-to-one. Let S = {2 − x + 3x², x + x², 1 − 2x²}. Then S is linearly independent in P2(R) if and only if

T(S) = {(2, −1, 3), (0, 1, 1), (1, 0, −2)}

is linearly independent in R³.

In Example 13 we transferred a problem from the vector space of polynomials to a problem in the vector space of 3-tuples. This technique will be exploited more fully later.

One of the most important properties of linear transformations is that they are completely determined by their action on a basis. This result, which follows from the next theorem and corollary, will be used frequently throughout the book.

Theorem 2.6. Let V and W be vector spaces over a common field F, and suppose that V is finite-dimensional with a basis {x1, ..., xn}. For any vectors y1, ..., yn in W there exists exactly one linear transformation T: V → W such that T(xi) = yi for i = 1, ..., n.
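The independence question of Example 13, once transferred to R³, reduces to a determinant. The sketch below is ours, not part of the text: it checks that the 3 × 3 determinant formed from the coordinate vectors is nonzero, so S is linearly independent.

```python
# Example 13 in coordinates: S is linearly independent in P2(R) exactly
# when its coordinate vectors in R^3 are, i.e. when this 3x3 determinant
# is nonzero.

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Coordinate vectors of 2 - x + 3x^2, x + x^2, and 1 - 2x^2.
TS = [(2, -1, 3), (0, 1, 1), (1, 0, -2)]
print(det3(TS))  # -8: nonzero, so S is linearly independent
```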

Proof. Let x ∈ V. Then

x = Σ_{i=1}^{n} a_i x_i,

where a1, ..., an are unique scalars. Define T: V → W by

T(x) = Σ_{i=1}^{n} a_i y_i.

(a) T is linear: Suppose that u, v ∈ V and d ∈ F. Then we may write

u = Σ_{i=1}^{n} b_i x_i and v = Σ_{i=1}^{n} c_i x_i.

Now

du + v = Σ_{i=1}^{n} (d b_i + c_i) x_i.

So

T(du + v) = Σ_{i=1}^{n} (d b_i + c_i) y_i = d Σ_{i=1}^{n} b_i y_i + Σ_{i=1}^{n} c_i y_i = dT(u) + T(v).

(b) Clearly, T(xi) = yi for i = 1, ..., n.

(c) T is unique: For suppose that U: V → W is linear and U(xi) = yi for i = 1, ..., n. Then for x ∈ V with

x = Σ_{i=1}^{n} a_i x_i,

we have

U(x) = Σ_{i=1}^{n} a_i U(x_i) = Σ_{i=1}^{n} a_i y_i = T(x).

Hence U = T. ∎

Corollary. Let V and W be vector spaces, and suppose that V has a finite basis {x1, ..., xn}. If U, T: V → W are linear and U(xi) = T(xi) for i = 1, ..., n, then U = T.

Example 14

Define T: R² → R² by T(a1, a2) = (2a2 − a1, 3a1), and suppose that U: R² → R² is linear. If we know that U(1, 2) = (3, 3) and U(1, 1) = (1, 3), then U = T. This follows from the corollary and from the fact that {(1, 2), (1, 1)} is a basis for R².

EXERCISES

1. Label the following statements as being true or false. In each part, V and W are finite-dimensional vector spaces (over F), and T is a function from V into W.
(a) If T is linear, then T preserves sums and scalar products.
(b) If T(x + y) = T(x) + T(y), then T is linear.
(c) T is one-to-one if and only if N(T) = {0}.
(d) If T is linear, then T(0_V) = 0_W.
(e) If T is linear, then nullity(T) + rank(T) = dim(W).
(f) If T is linear, then T carries linearly independent subsets of V onto linearly independent subsets of W.
(g) If T, U: V → W are both linear and agree on a basis of V, then T = U.
(h) Given x1, x2 ∈ V and y1, y2 ∈ W, there exists a linear transformation T: V → W such that T(x1) = y1 and T(x2) = y2.

For Exercises 2 through 6, prove that T is a linear transformation and find bases for both N(T) and R(T). Then compute the nullity and rank of T and verify the dimension theorem. Finally, use the appropriate theorems in this section to determine whether T is one-to-one or onto.

2. T: R³ → R²; T(a1, a2, a3) = (a1 − a2, 2a3).

3. T: R² → R³; T(a1, a2) = (a1 + a2, 0, 2a1 − a2).

4. T: M2×3(F) → M2×2(F);

T ( a11 a12 a13     ( 2a11 − a12   a13 + 2a12
    a21 a22 a23 ) =   0            0           ).

5. T: P2(R) → P3(R); T(f(x)) = xf(x) + f′(x).

6. T: Mn×n(F) → F; T(A) = tr(A). Recall that tr(A) = Σ_{i=1}^{n} A_ii.

7. Verify the statements made in Examples 1, 2, and 3 at the beginning of this section.

8. Verify that the transformations defined in Examples 5 and 6 are linear.

9. For the following transformations T: R² → R², state why T is not linear.
(a) T(a1, a2) = (1, a2)
(b) T(a1, a2) = (a1, a1²)
(c) T(a1, a2) = (sin a1, 0)
(d) T(a1, a2) = (|a1|, a2)
(e) T(a1, a2) = (a1 + 1, a2)

10. Suppose that T: R² → R² is linear, T(1, 0) = (1, 4), and T(1, 1) = (2, 5). What is T(2, 3)? Is T one-to-one?

11. Prove that there exists a linear transformation T: R² → R³ such that T(1, 1) = (1, 0, 2) and T(2, 3) = (1, −1, 4). What is T(8, 11)?

12. Is there a linear transformation T: R³ → R² such that T(1, 0, 3) = (1, 1) and T(−2, 0, −6) = (2, 1)?

13. Let V and W be vector spaces, let T: V → W be linear, and let {y1, ..., yk} be a linearly independent subset of R(T). If S = {x1, ..., xk} is chosen so that T(xi) = yi for i = 1, ..., k, prove that S is linearly independent.

14. Let V and W be vector spaces and T: V → W be linear.
(a) Prove that T is one-to-one if and only if T carries linearly independent subsets of V onto linearly independent subsets of W.
(b) Suppose that T is one-to-one and that S is a subset of V. Prove that S is linearly independent if and only if T(S) is linearly independent.

15. Recall the definition of P(R) in Section 1.2. Define T: P(R) → P(R) by

T(f)(x) = ∫₀ˣ f(t) dt.

Prove that T is linear and one-to-one, but not onto.

16. Let T: P(R) → P(R) be defined by T(f) = f′. Prove that T is onto, but not one-to-one.

17. Let V and W be finite-dimensional vector spaces and T: V → W be linear.
(a) Prove that if dim(V) < dim(W), then T cannot be onto.
(b) Prove that if dim(V) > dim(W), then T cannot be one-to-one.

18. Give an example of a linear transformation T: R² → R² such that N(T) = R(T).

19. Give an example of distinct linear transformations T and U such that N(T) = N(U) and R(T) = R(U).

20. Let V and W be vector spaces with subspaces V1 and W1, respectively. If T: V → W is linear, prove that T(V1) is a subspace of W and that {x ∈ V : T(x) ∈ W1} is a subspace of V.

21. Let T: R³ → R be linear. Show that there exist scalars a, b, and c such that T(x, y, z) = ax + by + cz for all (x, y, z) ∈ R³. Describe geometrically the possibilities for the null space of T. Can you generalize this result for T: Fⁿ → F? State and prove an analogous result for T: Fⁿ → Fᵐ.

Definition. Let V be a vector space and W1 be a subspace of V. A function T: V → V is called a projection on W1 if
(a) there exists a subspace W2 of V such that V = W1 ⊕ W2 (recall the definition of direct sum in the exercises of Section 1.3), and
(b) for x = x1 + x2, where x1 ∈ W1 and x2 ∈ W2, we have T(x) = x1.

In Exercises 22 through 24 assume the notation above.

22. Prove that T is linear and W1 = {x : T(x) = x}.

23. Prove that W1 = R(T) and W2 = N(T).

24. Describe T if W1 = V or if W1 is the zero subspace.

25. Suppose that W is a subspace of a finite-dimensional vector space V. Prove that there exists a projection on W.

26. Let V be a vector space, and let T: V → V be linear. A subspace W of V is said to be T-invariant if T(x) ∈ W for every x ∈ W, i.e., T(W) ⊆ W.
(a) Prove that the subspaces {0}, V, R(T), and N(T) are all T-invariant.
(b) If W is a T-invariant subspace of V, define T_W: W → W by T_W(x) = T(x) for all x ∈ W. Prove that T_W is linear.
(c) If T is a projection on W, show that W is T-invariant and that T_W = I_W.
(d) If V = R(T) ⊕ W and W is T-invariant, show that W ⊆ N(T). Prove that if V is also finite-dimensional, then W = N(T).
(e) Show that N(T_W) = N(T) ∩ W and R(T_W) = T(W).

27. Prove the following generalization of Theorem 2.6: Let V and W be vector spaces over a common field, and let β be a basis for V. Then for any function f: β → W there exists exactly one linear transformation T: V → W such that T(x) = f(x) for all x ∈ β.

28. A function T: V → W between vector spaces V and W is called additive if T(x + y) = T(x) + T(y) for all x, y ∈ V. Prove that if V and W are vector spaces over the field of rational numbers, then any additive function from V into W is a linear transformation.

29. Prove that there is an additive function T: R → R (as defined in Exercise 28) that is not linear. Hint: Regard R as a vector space over the field of rational numbers Q. By the corollary to Theorem 1.13 this vector space has a basis β. Let x and y be two distinct elements of β, and define f: β → R by f(x) = y, f(y) = x, and f(z) = z otherwise. By Exercise 27 there exists a linear transformation T: R → R such that T(z) = f(z) for all z ∈ β. Then T is additive, but for c = y/x, T(cx) ≠ cT(x).

The following exercise requires familiarity with the definition of quotient space given in Exercise 29 of Section 1.3.

30. Let V be a vector space and W be a subspace of V. Define the mapping η: V → V/W by η(v) = v + W for v ∈ V.
(a) Prove that η is a linear transformation from V onto V/W and that N(η) = W.
(b) Suppose that V is finite-dimensional. Use part (a) and the dimension theorem to derive a formula relating dim(V), dim(W), and dim(V/W).
(c) Read the proof of the dimension theorem. Compare the method of solving part (b) with the method of deriving the same result as outlined in Exercise 26 of Section 1.6.

2.2 THE MATRIX REPRESENTATION OF A LINEAR TRANSFORMATION
Until now we have studied linear transformations by examining their ranges and null spaces. We now embark upon one of the most useful approaches to the analysis of a linear transformation on a finite-dimensional vector space: the representation of a linear transformation by a matrix. In fact, we develop a one-to-one correspondence between matrices and linear transformations that allows us to utilize properties of one to study properties of the other.

We first need the concept of an "ordered basis" for a vector space.

Definition. Let V be a finite-dimensional vector space. An ordered basis for V is a basis for V endowed with a specific order; that is, an ordered basis for V is a finite sequence of linearly independent elements of V that generate V.
Example 1

Let V have the ordered basis β = {x1, x2, x3}. Then γ = {x2, x1, x3} is also an ordered basis for V, but β ≠ γ as ordered bases.

For the vector space Fⁿ we call {e1, ..., en} the standard ordered basis for Fⁿ. Similarly, for the vector space Pn(F) we call the set {1, x, ..., xⁿ} the standard ordered basis for Pn(F).

Now that we have introduced the concept of an ordered basis, we will be able to identify abstract vectors in an n-dimensional vector space with n-tuples. This identification will be provided through the use of "coordinate vectors," as introduced below.
Definition. Let β = {x1, ..., xn} be an ordered basis for a finite-dimensional vector space V. For x ∈ V we define the coordinate vector of x relative to β, denoted [x]_β, by

[x]_β = (a1, ..., an)ᵗ, where x = Σ_{i=1}^{n} a_i x_i.

Notice that [xi]_β = ei in the definition above. It is left as an exercise to show that the correspondence x → [x]_β provides us with a linear transformation from V to Fⁿ. We will study this transformation in Section 2.4 in more detail.

Example 2

Let V = P2(R), and let β = {1, x, x²} be the standard ordered basis for V. If f(x) = 4 + 6x − 7x², then

[f]_β = (4, 6, −7)ᵗ.
Sec. 2.2 The Matrix Representation of a Linear Transformation

Let us now proceed with the promised matrix representation of a linear transformation. Suppose that V and W are finite-dimensional vector spaces with ordered bases β = {x1, ..., xn} and γ = {y1, ..., ym}, respectively. Let T: V → W be linear. Then there exist unique scalars a_ij ∈ F (1 ≤ i ≤ m and 1 ≤ j ≤ n) such that

T(x_j) = Σ_{i=1}^{m} a_ij y_i for 1 ≤ j ≤ n.

Definition. Using the notation above, we call the m × n matrix A defined by A_ij = a_ij the matrix that represents T in the ordered bases β and γ, and write A = [T]_β^γ. If V = W and β = γ, we write simply A = [T]_β.

Notice that the jth column of A is simply [T(x_j)]_γ. Also observe that if U: V → W is a linear transformation such that [U]_β^γ = [T]_β^γ, then it follows from the corollary to Theorem 2.6 that U = T.

We will illustrate the computation of [T]_β^γ in the next several examples.
Example 3

Define T: P3(R) → P2(R) by T(f) = f′. Let β and γ be the standard ordered bases for P3(R) and P2(R), respectively. Then

T(1)  = 0·1 + 0·x + 0·x²
T(x)  = 1·1 + 0·x + 0·x²
T(x²) = 0·1 + 2·x + 0·x²
T(x³) = 0·1 + 0·x + 3·x²

So

[T]_β^γ = ( 0 1 0 0
            0 0 2 0
            0 0 0 3 ).

Note that the coefficients of T(x_j), when written as a linear combination of the elements of γ, give the entries of the jth column.
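The matrix of Example 3 does real work: multiplying it by the coordinate vector of f produces the coordinate vector of f′. The sketch below is ours, for illustration; the sample polynomial is our own choice.

```python
# [T] from Example 3 converts coordinates in P3(R) to coordinates of the
# derivative in P2(R): [T] [f]_beta = [f']_gamma.

D = [[0, 1, 0, 0],   # the matrix [T] of Example 3
     [0, 0, 2, 0],
     [0, 0, 0, 3]]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

f = [4, 6, -7, 2]    # coordinates of f(x) = 4 + 6x - 7x^2 + 2x^3
print(matvec(D, f))  # [6, -14, 6]: coordinates of f'(x) = 6 - 14x + 6x^2
```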

Example 4

Define T: R² → R³ by T(a1, a2) = (a1 + 3a2, 0, 2a1 − 4a2). Let β and γ be the standard ordered bases for R² and R³, respectively. Now

T(1, 0) = (1, 0, 2) = 1e1 + 0e2 + 2e3

and

T(0, 1) = (3, 0, −4) = 3e1 + 0e2 − 4e3.

Hence

[T]_β^γ = ( 1  3
            0  0
            2 −4 ).

If we let γ′ = {e3, e2, e1}, then

[T]_β^γ′ = ( 2 −4
             0  0
             1  3 ).
Now that we have defined a procedure for associating matrices with linear transformations, we will see shortly that this association "preserves" addition. To make this more explicit, we need some preliminary discussion about the addition of linear transformations.

Definition. Let V and W be vector spaces, and let T, U: V → W be arbitrary functions. We define T + U: V → W by (T + U)(x) = T(x) + U(x) for all x ∈ V, and aT: V → W by (aT)(x) = aT(x) for all x ∈ V and a ∈ F.

Of course, this is just the usual definition of addition and scalar multiplication for functions. We are fortunate, however, to have the result that both sums and scalar multiples of linear transformations are also linear.

Theorem 2.7. Let V and W be vector spaces, and let T, U: V → W be linear. Then
(a) aT + U is linear for all a ∈ F, and
(b) using the operations of addition and scalar multiplication defined above, the collection of all linear transformations from V into W is a vector space over F.

Proof. (a) Let x, y ∈ V and c ∈ F. Then

(aT + U)(cx + y) = aT(cx + y) + U(cx + y)
                 = a[cT(x) + T(y)] + cU(x) + U(y)
                 = acT(x) + cU(x) + aT(y) + U(y)
                 = c[aT + U](x) + [aT + U](y).
So aT + U is linear.

(b) Noting that T0, the zero transformation, plays the role of the zero element, it is easy to show that the collection of all linear transformations from V into W is a vector space over F. ∎

Definition. Let V and W be vector spaces over F. We denote the vector space of all linear transformations from V into W by L(V, W). In the case where V = W, we will write L(V) instead of L(V, V).

In Section 2.3 we will see a complete identification of L(V, W) with the vector space M_{m×n}(F), where n and m are the dimensions of V and W, respectively. This identification is easily established by the use of the next theorem.

Theorem 2.8. Let V and W be finite-dimensional vector spaces with ordered bases β and γ, respectively, and let T, U: V → W be linear transformations. Then
(a) [T + U]_β^γ = [T]_β^γ + [U]_β^γ, and
(b) [aT]_β^γ = a[T]_β^γ for all a ∈ F.

Proof. Let β = {x1, ..., xn} and γ = {y1, ..., ym}. There exist unique scalars a_ij and b_ij (1 ≤ i ≤ m, 1 ≤ j ≤ n) such that

T(x_j) = Σ_{i=1}^{m} a_ij y_i and U(x_j) = Σ_{i=1}^{m} b_ij y_i for 1 ≤ j ≤ n.

Hence

(T + U)(x_j) = Σ_{i=1}^{m} (a_ij + b_ij) y_i.

Thus

([T + U]_β^γ)_ij = a_ij + b_ij = ([T]_β^γ + [U]_β^γ)_ij.

So (a) is proved, and the proof of (b) is similar. ∎
Example 5

Define T: R² → R³ by T(a1, a2) = (a1 + 3a2, 0, 2a1 − 4a2) and U: R² → R³ by U(a1, a2) = (a1 − a2, 2a1, 3a1 + 2a2). Let β and γ be the standard ordered bases of R² and R³, respectively. Then

[T]_β^γ = ( 1  3
            0  0
            2 −4 )

(as computed in Example 4), and

[U]_β^γ = ( 1 −1
            2  0
            3  2 ).

If we now compute T + U using the definitions above, we obtain

(T + U)(a1, a2) = (2a1 + 2a2, 2a1, 5a1 − 2a2).

So

[T + U]_β^γ = ( 2  2
                2  0
                5 −2 ),

which is simply [T]_β^γ + [U]_β^γ, illustrating Theorem 2.8.
EXERCISES

1. Label the following statements as being true or false. Assume that V and W are finite-dimensional vector spaces with ordered bases β and γ, respectively, and that T, U: V → W denote linear transformations.
(a) For any scalar a, aT + U is a linear transformation from V into W.
(b) [T]_β^γ = [U]_β^γ implies that T = U.
(c) If m = dim(V) and n = dim(W), then [T]_β^γ is an m × n matrix.
(d) [T + U]_β^γ = [T]_β^γ + [U]_β^γ.
(e) L(V, W) is a vector space.
(f) L(V, W) = L(W, V).

2. Let β and γ be the standard ordered bases for Rⁿ and Rᵐ, respectively. For the following linear transformations T: Rⁿ → Rᵐ, compute [T]_β^γ.
(a) T: R² → R³ defined by T(a1, a2) = (2a1 − a2, 3a1 + 4a2, a1).
(b) T: R³ → R² defined by T(a1, a2, a3) = (2a1 + 3a2 − a3, a1 + a3).
(c) T: R³ → R defined by T(a1, a2, a3) = 2a1 + a2 − 3a3.
(d) T: R³ → R³ defined by T(a1, a2, a3) = (2a2 + a3, −a1 + 4a2 + 5a3, a1 + a3).
(e) T: Rⁿ → Rⁿ defined by T(a1, a2, ..., an) = (a1, a1, ..., a1).
(f) T: Rⁿ → Rⁿ defined by T(a1, a2, ..., an) = (an, a_{n−1}, ..., a1).
(g) T: Rⁿ → R defined by T(a1, a2, ..., an) = a1 + an.

3. Let T: R² → R³ be defined as T(a1, a2) = (a1 − a2, a1, 2a1 + a2). Let β be the standard ordered basis for R² and γ = {(1, 1, 0), (0, 1, 1), (2, 2, 3)}. Compute [T]_β^γ. If α = {(1, 2), (2, 3)}, compute [T]_α^γ.

4. Define T: M2×2(R) → P2(R) by

T ( a b     = (a + b) + (2d)x + bx².
    c d )

Let

β = { ( 1 0   ( 0 1   ( 0 0   ( 0 0
        0 0 ),  0 0 ),  1 0 ),  0 1 ) } and γ = {1, x, x²}.

Compute [T]_β^γ.

5. For the following parts, let

α = { ( 1 0   ( 0 1   ( 0 0   ( 0 0
        0 0 ),  0 0 ),  1 0 ),  0 1 ) }, β = {1, x, x²}, and γ = {1}.

(a) Define T: M2×2(F) → M2×2(F) by T(A) = Aᵗ. Compute [T]_α.
(b) Define T: P2(R) → M2×2(F) by

T(f) = ( f′(0)   2f(1)
         0       f″(3) ),

where ′ denotes differentiation. Compute [T]_β^α.
(c) Define T: M2×2(F) → F by T(A) = tr(A). Compute [T]_α^γ.
(d) Define T: P2(R) → R by T(f) = f(2). Compute [T]_β^γ.
(e) If

A = ( 1 −2
      0  4 ),

compute [A]_α.
(f) If f(x) = 3 − 6x + x², compute [f]_β.
(g) For a ∈ F, compute [a]_γ.

6. Prove part (b) of Theorem 2.8.

7. Let V be an n-dimensional vector space with an ordered basis β. Define T: V → Fⁿ by T(x) = [x]_β. Prove that T is linear.

8. Let V be the vector space of complex numbers regarded as a vector space over the field R. Define T: V → V by T(z) = z̄, where z̄ is the complex conjugate of z. Prove that T is linear, and compute [T]_β, where β = {1, i}. Show that T is not linear if V is regarded as a vector space over the field C.

9. Let V be a vector space with the ordered basis β = {x1, ..., xn}. Define x0 = 0. By Theorem 2.6 there exists a linear transformation T: V → V defined by T(x_j) = x_j + x_{j−1} for j = 1, ..., n. Compute [T]_β.

10. Let V be an n-dimensional vector space, and let T: V → V be a linear transformation. Suppose that W is a T-invariant subspace of V (see Exercise 26 of Section 2.1) having dimension k. Show that there is a basis β for V such that [T]_β has the form

( A B
  O C ),

where A is a k × k matrix and O is an (n − k) × k zero matrix.

11. Let V be a finite-dimensional vector space, and let T be a projection on a subspace of V. (See the definition of projection preceding Exercise 22 of Section 2.1.) Find an ordered basis β for V such that [T]_β is a diagonal matrix.

12. Let V and W be vector spaces, and let T and U be nonzero linear transformations from V into W. If R(T) ∩ R(U) = {0}, prove that {T, U} is a linearly independent subset of L(V, W).

13. Let V = P(R), and for j ≥ 1 define Tj(f) = f^(j), where f^(j) is the jth derivative of f. Prove that the set {T1, T2, ..., Tn} is a linearly independent subset of L(V) for any positive integer n.

14. Let V and W be vector spaces, and let S be a subset of V. Define S⁰ = {T ∈ L(V, W) : T(x) = 0 for all x ∈ S}. Prove that
(a) S⁰ is a subspace of L(V, W);
(b) if S1 and S2 are subsets of V and S1 ⊆ S2, then S2⁰ ⊆ S1⁰;
(c) if V1 and V2 are subspaces of V, then (V1 + V2)⁰ = V1⁰ ∩ V2⁰.

15. Let V and W be vector spaces such that dim(V) = dim(W), and let T: V → W be linear. Find ordered bases β and γ for V and W, respectively, such that [T]_β^γ is a diagonal matrix.

2.3 COMPOSITION OF LINEAR TRANSFORMATIONS AND MATRIX MULTIPLICATION
In Section 2.2 we learned how to associate a matrix with a linear transformation in such a way that both sums and scalar multiples of matrices are associated with the corresponding sums and scalar multiples of the transformations. The question now arises as to how the matrix representation of a composition of linear transformations is related to the matrix representations of each of the associated linear transformations. The attempt to answer this question will lead to a definition of matrix multiplication. We use the notation UT for the composition of linear transformations U and T, as contrasted with g∘f for the composition of arbitrary functions g and f. Specifically, we have the following definition.

Definition. Let V, W, and Z be vector spaces, and let T: V → W and U: W → Z be linear. We define the composition UT: V → Z by (UT)(x) = U(T(x)) for all x ∈ V.

Our first result shows that the composition of linear transformations is linear.

Sec. 2.3 Composition of Linear Transformations and Matrix Multiplication
Theorem 2.9. Let V, W, and Z be vector spaces and T: V → W and U: W → Z be linear. Then UT: V → Z is linear.

Proof. Let x, y ∈ V and a ∈ F. Then

UT(ax + y) = U(T(ax + y)) = U(aT(x) + T(y)) = aU(T(x)) + U(T(y)) = a(UT)(x) + UT(y). ∎

The following theorem lists some of the properties of the composition of linear transformations.
Theorem 2.10. Let V be a vector space. Let T, U1, U2 ∈ L(V). Then
(a) T(U1 + U2) = TU1 + TU2 and (U1 + U2)T = U1T + U2T,
(b) T(U1U2) = (TU1)U2,
(c) TI = IT = T, and
(d) a(U1U2) = (aU1)U2 = U1(aU2) for all a ∈ F.

Proof. Exercise. ∎

We are now in a position to define the product of two matrices. Let T: V → W and U: W → Z be linear transformations, and let A = [U]_β^γ and B = [T]_α^β, where α = {x1, ..., xn}, β = {y1, ..., ym}, and γ = {z1, ..., zp} are ordered bases for V, W, and Z, respectively. Because of the analogy with Theorem 2.8, it seems reasonable to define the product AB of the two matrices so that AB = [UT]_α^γ. For 1 ≤ j ≤ n we have

(UT)(x_j) = U(T(x_j)) = U( Σ_{k=1}^{m} B_kj y_k ) = Σ_{k=1}^{m} B_kj U(y_k)
          = Σ_{k=1}^{m} B_kj ( Σ_{i=1}^{p} A_ik z_i ) = Σ_{i=1}^{p} ( Σ_{k=1}^{m} A_ik B_kj ) z_i = Σ_{i=1}^{p} C_ij z_i,

where

C_ij = Σ_{k=1}^{m} A_ik B_kj.

This computation suggests the following definition of matrix multiplication.

Definition. Let A be an m × n matrix and B be an n × p matrix. We define the product of A and B, denoted AB, to be the m × p matrix such that

(AB)_ij = Σ_{k=1}^{n} A_ik B_kj for 1 ≤ i ≤ m, 1 ≤ j ≤ p.
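The defining formula for (AB)_ij translates directly into code. The sketch below is ours; the sample matrices are invented for illustration. It computes a (2 × 3)·(3 × 1) product, which yields a 2 × 1 matrix.

```python
# Matrix multiplication exactly as defined: (AB)_ij = sum_k A_ik * B_kj.

def matmul(A, B):
    m, n, p = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A), "inner dimensions must agree"
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]

A = [[1, 2, 1],
     [0, 4, -1]]     # a 2 x 3 matrix
B = [[4], [2], [5]]  # a 3 x 1 matrix
print(matmul(A, B))  # [[13], [3]], a 2 x 1 matrix
```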

Note that (AB)_ij is the sum of products of corresponding elements from the ith row of A and the jth column of B. Some interesting applications of this definition are presented at the end of this section.

The reader should observe that in order for the product AB to be defined, there are restrictions regarding the relative sizes of A and B. The following mnemonic device is helpful: "(m × n)·(n × p) = (m × p)"; that is, in order for the product AB to be defined, the two "inner" dimensions must be equal, and the size of the product is given by the two "outer" dimensions.

Example 1

The product of a 2 × 3 matrix and a 3 × 1 matrix is a 2 × 1 matrix; notice the symbolic relationship (2 × 3)·(3 × 1) = 2 × 1.

As is the case with composition of functions, matrix multiplication is not commutative: even if both of the matrix products AB and BA are defined, it need not be true that AB = BA.

Recalling the definition of the transpose of a matrix from Section 1.3, we show that if A is an m × n matrix and B is an n × p matrix, then (AB)ᵗ = BᵗAᵗ. Since

((AB)ᵗ)_ij = (AB)_ji = Σ_{k=1}^{n} A_jk B_ki

and

(BᵗAᵗ)_ij = Σ_{k=1}^{n} (Bᵗ)_ik (Aᵗ)_kj = Σ_{k=1}^{n} B_ki A_jk,

we are done. Hence the transpose of a product is the product of the transposes in the opposite order.

The following theorem is an immediate consequence of our definition of matrix multiplication.

multiplication.
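Both observations can be checked numerically. The sketch below (plain Python; helper names are ours) verifies the failure of commutativity and the transpose rule (AB)ᵗ = BᵗAᵗ for the 2 × 2 pair used above.

```python
def mat_mul(A, B):
    n = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    # Rows of the transpose are the columns of A.
    return [list(col) for col in zip(*A)]

A = [[1, 1], [0, 0]]
B = [[0, 1], [1, 0]]

# Matrix multiplication is not commutative:
assert mat_mul(A, B) != mat_mul(B, A)

# The transpose of a product is the product of the transposes, reversed:
assert transpose(mat_mul(A, B)) == mat_mul(transpose(B), transpose(A))
```

Reversing the factors on the right-hand side is essential: transpose(A) is 2 × 2 here, but for rectangular matrices only the order BᵗAᵗ even has matching dimensions.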

Theorem 2.11. Let V, W, and Z be finite-dimensional vector spaces with ordered bases α, β, and γ, respectively. Let T: V → W and U: W → Z be linear transformations. Then

[UT]^γ_α = [U]^γ_β [T]^β_α.

Corollary. Let V be a finite-dimensional vector space with an ordered basis β. Let T, U ∈ ℒ(V). Then [UT]_β = [U]_β [T]_β.

We illustrate Theorem 2.11 in the following example.

Example 2

Define U: P₃(R) → P₂(R) by U(f) = f′, as in Example 3 of Section 2.2, and define T: P₂(R) → P₃(R) by

T(f)(x) = ∫₀ˣ f(t) dt.

Let α = {1, x, x², x³} and β = {1, x, x²}. Clearly UT = I. To illustrate Theorem 2.11, observe that

[UT]_β = [U]^β_α [T]^α_β = (0 1 0 0; 0 0 2 0; 0 0 0 3)(0 0 0; 1 0 0; 0 1/2 0; 0 0 1/3) = (1 0 0; 0 1 0; 0 0 1).

The 3 × 3 matrix above is called an "identity matrix" and is defined below, along with a very useful notation, the "Kronecker delta."

Definitions. We define the Kronecker delta δ_ij by δ_ij = 1 if i = j and δ_ij = 0 if i ≠ j, and the n × n identity matrix Iₙ by (Iₙ)_ij = δ_ij.

Thus the identity matrix has ones down the diagonal and zeros elsewhere; for example,

I₁ = (1),   I₂ = (1 0; 0 1),   I₃ = (1 0 0; 0 1 0; 0 0 1).

We see in the next theorem that the identity matrix acts as a unity element in Mₙₓₙ(F). When the context is sufficiently clear, we sometimes omit the subscript n from Iₙ.

Theorem 2.12. For any n × n matrix A we have IₙA = AIₙ = A. Furthermore, if V is a finite-dimensional vector space of dimension n with an ordered basis β, then [I_V]_β = Iₙ.

Proof. Since

(IₙA)_ij = Σ_{k=1}^n (Iₙ)_ik A_kj = Σ_{k=1}^n δ_ik A_kj = A_ij,

we have IₙA = A. Similarly, AIₙ = A. Let β = {x₁, ..., xₙ}. Then for each j we have

I_V(x_j) = x_j = Σ_{i=1}^n δ_ij x_i.

Hence [I_V]_β = Iₙ. ∎

For an n × n matrix A we define A² = AA, A³ = A²A, and, in general, Aᵏ = Aᵏ⁻¹A for k = 2, 3, .... We define A⁰ = Iₙ.

With this notation we see that, for example, if

A = (0 1; 0 0),

then A² = 0 (the zero matrix) even though A ≠ 0. Thus the cancellation property for multiplication in fields is not valid for matrices. The next theorem shows, however, that matrix multiplication does distribute over addition.
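The failure of cancellation is easy to witness concretely; the sketch below (plain Python, helper name ours) squares the nonzero matrix just mentioned and obtains the zero matrix.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# A nonzero matrix whose square is the zero matrix: A*A = A*0 yet A != 0,
# so "cancel the common factor A" is not a valid step for matrices.
A = [[0, 1],
     [0, 0]]
A2 = mat_mul(A, A)
assert A2 == [[0, 0], [0, 0]] and A != A2
```

In a field, xy = 0 forces x = 0 or y = 0; this two-line example shows that Mₙₓₙ(F) has no such property once n ≥ 2.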
Theorem 2.13. Let A be an m × n matrix, and let B and C be n × p matrices. Then

A(B + C) = AB + AC,

and for any scalar a,

a(AB) = (aA)B = A(aB).

Proof. We have

[A(B + C)]_ij = Σ_{k=1}^n A_ik (B + C)_kj = Σ_{k=1}^n (A_ik B_kj + A_ik C_kj)
= Σ_{k=1}^n A_ik B_kj + Σ_{k=1}^n A_ik C_kj = (AB)_ij + (AC)_ij = [AB + AC]_ij.

So A(B + C) = AB + AC. The remainder of the proof is left as an exercise. ∎

Corollary. Let A be an m × n matrix, let B₁, ..., B_k be n × p matrices, and let a₁, ..., a_k ∈ F. Then

A(Σ_{i=1}^k a_i B_i) = Σ_{i=1}^k a_i AB_i.

Proof. Exercise. ∎

If A is an m × n matrix, we sometimes write A = (A⁽¹⁾, ..., A⁽ⁿ⁾), where A⁽ʲ⁾ is the jth column of the matrix A.

For the following theorem, e_j will denote the jth column of I_p.

Theorem 2.14. Let A be an m × n matrix and B be an n × p matrix. Then
(a) (AB)⁽ʲ⁾ = AB⁽ʲ⁾;
(b) B⁽ʲ⁾ = Be_j.

Proof. (a) We have

(AB)⁽ʲ⁾ = ((AB)_1j, ..., (AB)_mj)ᵗ = (Σ_k A_1k B_kj, ..., Σ_k A_mk B_kj)ᵗ = A(B_1j, ..., B_nj)ᵗ = AB⁽ʲ⁾.

Hence (a) holds. The proof of (b) is left as an exercise. ∎

The next result justifies much of our past work. It utilizes both the matrix representation of a linear transformation and matrix multiplication in order to evaluate the transformation at any given vector.

Theorem 2.15. Let V and W be finite-dimensional vector spaces having ordered bases β and γ, respectively, and let T: V → W be linear. Then, for each x ∈ V, we have

[T(x)]_γ = [T]^γ_β [x]_β.

Proof. Let A = [T]^γ_β, suppose that β = {x₁, ..., xₙ}, and define the two maps φ, ψ: V → Fᵐ by φ(x) = [T(x)]_γ and ψ(x) = A[x]_β for all x ∈ V. We must show that φ = ψ. To prove the theorem, we first show that both maps are linear. By an exercise of Section 2.2, φ is the composition of the two linear functions x → T(x) and T(x) → [T(x)]_γ, so φ is linear. Similarly, ψ is the composition of the two linear functions x → [x]_β and [x]_β → A[x]_β, the latter linear by Theorem 2.13, and hence ψ is linear. Now by the corollary to Theorem 2.6, we need only show that φ(x_j) = ψ(x_j) for all j. By the definition of A, we have φ(x_j) = [T(x_j)]_γ = A⁽ʲ⁾, and by Theorem 2.14 we have ψ(x_j) = A[x_j]_β = Ae_j = A⁽ʲ⁾. Hence φ = ψ, and the theorem is proved. ∎

Example

Let T: P₃(R) → P₂(R) be the linear transformation defined in Example 3 of Section 2.2 (T(f) = f′), and let β and γ be the standard ordered bases for P₃(R) and P₂(R), respectively. If A = [T]^γ_β, then

A = (0 1 0 0; 0 0 2 0; 0 0 0 3).

We illustrate Theorem 2.15 by verifying that [T(p)]_γ = [T]^γ_β [p]_β, where p(x) = 2 − 4x + x² + 3x³. Let q = T(p); then q(x) = p′(x) = −4 + 2x + 9x². Hence

[T(p)]_γ = [q]_γ = (−4, 2, 9)ᵗ.

But also

[T]^γ_β [p]_β = A[p]_β = (0 1 0 0; 0 0 2 0; 0 0 0 3)(2, −4, 1, 3)ᵗ = (−4, 2, 9)ᵗ.
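The verification above is a single matrix-vector product; the sketch below (plain Python, helper name ours) repeats it, representing each polynomial by its coordinate vector in the standard basis.

```python
def mat_vec(A, x):
    # Matrix-vector product: (Ax)_i = sum over k of A_ik * x_k.
    return [sum(A[i][k] * x[k] for k in range(len(x))) for i in range(len(A))]

# [T] for differentiation P3(R) -> P2(R) in the standard ordered bases.
A = [[0, 1, 0, 0],
     [0, 0, 2, 0],
     [0, 0, 0, 3]]

p = [2, -4, 1, 3]          # coordinates of p(x) = 2 - 4x + x^2 + 3x^3
q = [-4, 2, 9]             # coordinates of p'(x) = -4 + 2x + 9x^2
assert mat_vec(A, p) == q  # [T(p)] = [T][p]  (Theorem 2.15)
```

Changing p to any other cubic and recomputing q = p′ gives the same agreement, since the identity holds for every x ∈ V.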

We complete this section with the introduction of the "left-multiplication transformation" L_A, where A is an m × n matrix. This transformation is probably the most important tool for transferring properties about transformations to analogous properties about matrices, and vice versa. For example, we will use it to prove that matrix multiplication is associative.

Definition. Let A be an m × n matrix with entries from a field F. We denote by L_A the mapping L_A: Fⁿ → Fᵐ defined by L_A(x) = Ax (the matrix product of A and x) for each column vector x ∈ Fⁿ. We call L_A a left-multiplication transformation.

We see in the next theorem that not only is L_A linear but, in fact, it has a great many other useful properties. These properties are all quite natural and so are easy to remember.

Theorem 2.16. Let A be an m × n matrix with entries from F. Then the left-multiplication transformation L_A: Fⁿ → Fᵐ is linear. Furthermore, if B is any other m × n matrix (with entries from F) and β and γ are the standard ordered bases for Fⁿ and Fᵐ, respectively, then we have the following properties.
(a) [L_A]^γ_β = A.
(b) L_A = L_B if and only if A = B.
(c) L_{A+B} = L_A + L_B and L_{aA} = aL_A for all a ∈ F.
(d) If T: Fⁿ → Fᵐ is linear, then there exists a unique m × n matrix C such that T = L_C. In fact, C = [T]^γ_β.
(e) If E is an n × p matrix, then L_{AE} = L_A L_E.
(f) If m = n, then L_{Iₙ} = I_{Fⁿ}.

Proof. The fact that L_A is linear follows immediately from Theorem 2.13 and its corollary.
(a) The jth column of [L_A]^γ_β is equal to L_A(e_j). But L_A(e_j) = Ae_j = A⁽ʲ⁾. Hence [L_A]^γ_β = A.
(b) If L_A = L_B, we may use (a) to write A = [L_A]^γ_β = [L_B]^γ_β = B. Hence A = B. The proof of the converse is trivial.
(c) The proof is left to the reader.
(d) Let C = [T]^γ_β. By Theorem 2.15 we have [T(x)]_γ = [T]^γ_β [x]_β, so T(x) = Cx = L_C(x) for all x ∈ Fⁿ. Hence T = L_C. The uniqueness of C follows from (b).
(e) For any j we have

L_{AE}(e_j) = (AE)e_j = (AE)⁽ʲ⁾ = AE⁽ʲ⁾ = A(Ee_j) = L_A(L_E(e_j)) = (L_A L_E)(e_j).

Hence L_{AE} = L_A L_E by the corollary to Theorem 2.6.
(f) The proof of (f) is left to the reader. ∎

We now use left-multiplication transformations to establish an important property about matrix multiplication.

Theorem 2.17. Let A, B, and C be matrices such that A(BC) is defined. Then (AB)C is also defined and A(BC) = (AB)C; that is, matrix multiplication is associative.

Proof. It is left to the reader to show that (AB)C is defined. Using (e) of Theorem 2.16 and the associativity of functional composition, we have

L_{A(BC)} = L_A L_{BC} = L_A (L_B L_C) = (L_A L_B) L_C = L_{AB} L_C = L_{(AB)C}.

So from (b) of Theorem 2.16, it follows that A(BC) = (AB)C. ∎

Needless to say, this theorem could be proved directly from the definition of matrix multiplication. The proof above, however, provides a prototype of many of the arguments that utilize the relationships between linear transformations and matrices.
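The proof's idea — that L_{AB} is the composite function L_A ∘ L_B, and function composition is associative — can be spot-checked numerically. The sketch below (plain Python; helper names are ours) compares the two parenthesizations and compares L_{AB} against the composite on a sample vector.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def L(A):
    """The left-multiplication transformation x -> Ax on column vectors."""
    return lambda x: [sum(A[i][k] * x[k] for k in range(len(x)))
                      for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
C = [[2, 0], [1, 3]]
x = [1, -1]

# (AB)C = A(BC), since both equal the matrix of the composite L_A L_B L_C.
assert mat_mul(mat_mul(A, B), C) == mat_mul(A, mat_mul(B, C))
# L_{AB} = L_A composed with L_B, checked at a sample vector.
assert L(mat_mul(A, B))(x) == L(A)(L(B)(x))
```

A single sample vector does not prove L_{AB} = L_A L_B, of course; it merely illustrates the identity the proof establishes for all x.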
An Application

A large and varied collection of interesting applications arises in connection with special matrices called "incidence matrices." An incidence matrix is a square matrix in which all the entries are either zero or one and, for convenience, all the diagonal entries are zero. If we have a relationship on a group of n objects that we denote by 1, 2, ..., n, then we define the associated incidence matrix A by A_ij = 1 if i is related to j, and A_ij = 0 otherwise.

To make things concrete, suppose that we have four people, each of whom owns a communication device. If the relationship on this group is "can transmit to," then A_ij = 1 if i can send (a message) to j, and A_ij = 0 otherwise. Suppose that

A = (0 1 0 0; 1 0 0 1; 0 1 0 1; 1 1 1 0).

Then since A₃₄ = 1 and A₁₄ = 0, we see that person 3 can send to 4 but 1 cannot send to 4.

We can obtain an interesting interpretation of the entries of A². Consider, for instance,

(A²)₃₁ = A₃₁A₁₁ + A₃₂A₂₁ + A₃₃A₃₁ + A₃₄A₄₁.

Note that any term A₃ₖAₖ₁ equals 1 if and only if both A₃ₖ and Aₖ₁ equal 1, that is, if and only if 3 can send to k and k can send to 1. Thus (A²)₃₁ gives the number of ways in which 3 can send to 1 in two stages (or in one relay). Since A₃₂A₂₁ = A₃₄A₄₁ = 1, we see that 3 can send to 1 in two ways in two stages. In general, (A + A² + ⋯ + Aⁿ)_ij is the number of ways in which i can send to j in at most n stages.
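The counting interpretation of A² can be confirmed mechanically. The sketch below (plain Python; it uses the 4 × 4 "can send to" matrix from the example above, though any 0–1 matrix with zero diagonal would do) checks that (A²)₃₁ equals the number of intermediaries k with 3 → k → 1.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# "Can send to" incidence matrix of the four-person communication example.
A = [[0, 1, 0, 0],
     [1, 0, 0, 1],
     [0, 1, 0, 1],
     [1, 1, 1, 0]]

A2 = mat_mul(A, A)

# (A^2)_31 (people numbered from 1) counts two-stage routes 3 -> k -> 1.
routes_3_to_1 = [k for k in range(4) if A[2][k] and A[k][0]]
assert A2[2][0] == len(routes_3_to_1) == 2   # via persons 2 and 4
```

Summing A + A² + ⋯ + Aⁿ in the same way tallies routes of every length up to n, as stated in the text.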

A maximal collection of three or more people with the property that any two can send to each other is called a clique. The problem of determining which people belong to cliques seems at first to be quite difficult in general. However, if we define a new matrix B by B_ij = 1 if i and j can send to each other, and B_ij = 0 otherwise, then it can be shown (see Exercise 16) that person i belongs to a clique if and only if (B³)_ii > 0. For example, for the relationship of the communication example above, we form the matrix B as described and compute B³. In this case all the diagonal entries of B³ are zero, and we conclude that there are no cliques in this relationship.

Our final example of the use of incidence matrices is concerned with the concept of dominance. A relation among a group of people is called a dominance relation if the associated incidence matrix A has the property that A_ij = 1 if and only if A_ji = 0 for all i ≠ j; that is, given any two people, exactly one of them dominates (or, using the terminology of our first example, can send a message to) the other. For such a relation, it can be shown (see Exercise 18) that the matrix A + A² has a row [column] in which each entry is positive except for the entry in the diagonal position. In other words, there is at least one person who dominates [is dominated by] all the others in one or two stages. In fact, it can be shown that any person who dominates [is dominated by] the greatest number of people in the first stage has this property.

Consider, for example, a dominance relation among five people. The reader should verify that its incidence matrix corresponds to a dominance relation and then compute A + A². For one such relation one finds that persons 1, 3, 4, and 5 dominate (can send messages to) all the others in at most two stages, while persons 1, 2, 3, and 5 are dominated by (can receive messages from) all the others in at most two stages.
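The clique test via B³ is a few lines of code. The sketch below (plain Python; the function name and the input matrix choice are ours) applies Exercise 16's criterion to the four-person communication matrix and confirms that no one belongs to a clique.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def clique_members(A):
    """Indices i (from 0) with (B^3)_ii > 0, where B_ij = 1 iff i and j can
    send to each other; by the clique criterion these are the clique members."""
    n = len(A)
    B = [[1 if A[i][j] and A[j][i] else 0 for j in range(n)] for i in range(n)]
    B3 = mat_mul(mat_mul(B, B), B)
    return [i for i in range(n) if B3[i][i] > 0]

# "Can send to" matrix of the four-person communication example: the mutual
# pairs are 1-2, 2-4, and 3-4 only, which form no triangle, hence no clique.
A = [[0, 1, 0, 0],
     [1, 0, 0, 1],
     [0, 1, 0, 1],
     [1, 1, 1, 0]]
assert clique_members(A) == []
```

(B³)_ii counts closed three-step walks i → j → k → i in the mutual-communication graph, which is why a positive diagonal entry signals membership in a triangle, and hence in a clique.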

EXERCISES

1. Label the following statements as being true or false. In what occurs below, V, W, and Z denote vector spaces with ordered (finite) bases α, β, and γ, respectively; T: V → W and U: W → Z are linear; and A and B denote matrices.
(a) [UT]^γ_α = [T]^β_α[U]^γ_β.
(b) [T(x)]_β = [T]^β_α[x]_α for all x ∈ V.
(c) [U(y)]_γ = [U]^γ_β[y]_β for all y ∈ W.
(d) [I_V]_α = I.
(e) [T²]^β_α = ([T]^β_α)².
(f) A² = I implies that A = I or A = −I.
(g) T = L_A for some matrix A.
(h) A² = 0 implies that A = 0, where 0 denotes the zero matrix.
(i) L_{A+B} = L_A + L_B.
(j) If A is square and A_ij = δ_ij for all i and j, then A = I.

2. (a) For the given matrices A, B, C, and D, compute A(2B + 3C), (AB)D, and A(BD).
(b) For the given matrices A, B, and C, compute Aᵗ, AᵗB, BCᵗ, CB, and CA.

3. Let g(x) = 3 + x. Define T: P₂(R) → P₂(R) by T(f) = f′g + 2f, and define U: P₂(R) → R³ by U(a + bx + cx²) = (a + b, c, a − b). Let β = {1, x, x²} and γ = {e₁, e₂, e₃}.
(a) Compute [U]^γ_β, [T]_β, and [UT]^γ_β directly. Then use Theorem 2.11 to verify your result.
(b) Let h(x) = 3 − 2x + x². Compute [h]_β and [U(h)]_γ. Then use [U]^γ_β from part (a) and Theorem 2.15 to verify your result.

4. For each of the following parts, let T be the linear transformation defined in the corresponding part of the referenced exercise of Section 2.2. Use Theorem 2.15 to compute the following:
(a) [T(A)]_α, where A is the given matrix.
(b) [T(f)]_α, where f(x) = 4 − 6x + 3x².
(c) [T(A)]_γ, where A is the given matrix.
(d) [T(f)]_γ, where f(x) = 6 − x + 2x².

5. Complete the proof of Theorem 2.13 and its corollary.

6. Prove (b) of Theorem 2.14.

7. Prove Theorem 2.10. Also, state and prove a more general result in which the domains and codomains of the transformations need not coincide.

8. Find linear transformations U, T: F² → F² such that UT = T₀ (the zero transformation) but TU ≠ T₀. Use your answer to find matrices A and B such that AB = 0 but BA ≠ 0.

9. Let A be an n × n matrix. Prove that A is a diagonal matrix if and only if A_ij = δ_ij A_ij for all i and j.

10. Let V be a vector space, and let T: V → V be linear. Prove that T² = T₀ if and only if R(T) ⊆ N(T).

11. Let V, W, and Z be vector spaces, and let T: V → W and U: W → Z be linear.
(a) If UT is one-to-one, prove that T is one-to-one. Must U also be one-to-one?
(b) If UT is onto, prove that U is onto. Must T also be onto?
(c) If U and T are one-to-one and onto, prove that UT is also.

12. Let A and B be n × n matrices. Recall that the trace of A, written tr(A), equals Σ_{i=1}^n A_ii. Prove that tr(AB) = tr(BA) and tr(A) = tr(Aᵗ).

13. Let V be a finite-dimensional vector space, and let T: V → V be linear.
(a) If rank(T) = rank(T²), prove that R(T) ∩ N(T) = {0}. Deduce that V = R(T) ⊕ N(T) (see the exercises of Section 1.3).
(b) Prove that there exists a positive integer k such that V = R(Tᵏ) ⊕ N(Tᵏ).

14. Let V be a vector space. Determine all linear transformations T: V → V such that T = T². Hint: Note that x = T(x) + (x − T(x)) for every x in V, and show that V = {y: T(y) = y} ⊕ N(T) (see the exercises of Section 1.3).

15. Using only the definition of matrix multiplication, prove that multiplication of matrices is associative.

16. For an incidence matrix A with the related matrix B defined by B_ij = 1 if i is related to j and j is related to i, and B_ij = 0 otherwise, prove that i belongs to a clique if and only if (B³)_ii > 0.

17. Use Exercise 16 to determine the cliques in the relations corresponding to the given incidence matrices.

18. Let A be an incidence matrix that is associated with a dominance relation. Prove that the matrix A + A² has a row [column] in which each entry is positive except for the entry in the diagonal position.

19. Prove that the given incidence matrix corresponds to a dominance relation, and use Exercise 18 to determine which person(s) dominate [are dominated by] all the others within two stages.

20. Let A be an n × n incidence matrix that corresponds to a dominance relation. Determine the number of nonzero entries of A.

2.4 INVERTIBILITY AND ISOMORPHISMS

The concept of invertibility is introduced quite early in the study of functions. Fortunately, many of the intrinsic properties of functions are shared by their inverses. For example, in calculus we learned that the properties of being continuous or differentiable are generally retained by the inverse functions. We will see in this section (Theorem 2.18) that the inverse of a linear transformation is also linear. This result will greatly aid us in the study of "inverses" of matrices. As one might expect from Section 2.3, the inverse of the left-multiplication transformation L_A (when it exists) can be used to determine properties of the inverse of the matrix A.

In the remainder of this section we apply many of the results about invertibility to the concept of "isomorphism." We will see that finite-dimensional vector spaces (over F) of equal dimension may be identified. These ideas will be made more precise shortly.

The facts about inverse functions presented in Appendix B are, of course, true for linear transformations. Nevertheless, we repeat some of these definitions for use in this section.

Definition. Let V and W be vector spaces, and let T: V → W be linear. A function U: W → V is an inverse of T if TU = I_W and UT = I_V. If T has an inverse, then T is invertible. As noted in Appendix B, if T is invertible, then the inverse is unique, and we write U = T⁻¹.

The following facts hold for invertible functions T and U:
1. (TU)⁻¹ = U⁻¹T⁻¹.
2. (T⁻¹)⁻¹ = T; in particular, T⁻¹ is invertible.
We also use the fact that a function is invertible if and only if it is one-to-one and onto.

Example 1

Define T: P₁(R) → R² by T(a + bx) = (a, a + b). The reader can verify directly that T⁻¹: R² → P₁(R) is defined by T⁻¹(c, d) = c + (d − c)x. Observe that T⁻¹ is also linear. As Theorem 2.18 demonstrates, this is true in general.

Theorem 2.18. Let V and W be vector spaces, and let T: V → W be linear and invertible. Then T⁻¹: W → V is linear.

Proof. Let y₁, y₂ ∈ W and c ∈ F. Since T is one-to-one and onto, there exist unique vectors x₁ and x₂ such that T(x₁) = y₁ and T(x₂) = y₂. Thus x₁ = T⁻¹(y₁) and x₂ = T⁻¹(y₂); so

T⁻¹(cy₁ + y₂) = T⁻¹[cT(x₁) + T(x₂)] = T⁻¹[T(cx₁ + x₂)] = cx₁ + x₂ = cT⁻¹(y₁) + T⁻¹(y₂). ∎

It now follows immediately from Theorem 2.5 that if T is a linear transformation between vector spaces of equal (finite) dimension, then the conditions of being invertible, one-to-one, and onto are all equivalent.

We are now ready to define the inverse of a matrix. The reader should note the analogy with the inverse of a linear transformation.

Definition. Let A be an n × n matrix. Then A is invertible if there exists an n × n matrix B such that AB = BA = I. The matrix B is unique and is called the inverse of A, written A⁻¹. (If C were another such matrix, then C = CI = C(AB) = (CA)B = IB = B.)

In Section 3.2 we will learn a technique for actually computing the inverse of a matrix. At this point we would like to develop a number of results that relate the inverses of matrices to the inverses of linear transformations.

Lemma. Let T: V → W be linear, where V and W are finite-dimensional vector spaces. If T is invertible, then dim(V) = dim(W).

Proof. Because T is one-to-one and onto, we have nullity(T) = 0 and rank(T) = dim(R(T)) = dim(W). So by the dimension theorem it follows that dim(V) = dim(W). ∎
Theorem 2.19. Let V and W be finite-dimensional vector spaces with ordered bases β and γ, respectively. Let T: V → W be linear. Then T is invertible if and only if [T]^γ_β is invertible. Furthermore, [T⁻¹]^β_γ = ([T]^γ_β)⁻¹.

Proof. Suppose that T is invertible. By the lemma we have dim(V) = dim(W); let n = dim(V), so that [T]^γ_β is an n × n matrix. Now T⁻¹: W → V satisfies TT⁻¹ = I_W and T⁻¹T = I_V. Thus

Iₙ = [I_V]_β = [T⁻¹T]_β = [T⁻¹]^β_γ [T]^γ_β

by Theorem 2.11. Similarly, [T]^γ_β [T⁻¹]^β_γ = Iₙ. Hence [T]^γ_β is invertible and ([T]^γ_β)⁻¹ = [T⁻¹]^β_γ.

Now let A = [T]^γ_β be invertible. There exists an n × n matrix B such that AB = BA = Iₙ. By Theorem 2.6 there exists U ∈ ℒ(W, V) such that

U(y_j) = Σ_{i=1}^n B_ij x_i   for j = 1, 2, ..., n,

where γ = {y₁, ..., yₙ} and β = {x₁, ..., xₙ}. Then [U]^β_γ = B. To show that UT = I_V, observe that

[UT]_β = [U]^β_γ [T]^γ_β = BA = Iₙ = [I_V]_β

by Theorem 2.11. So UT = I_V; similarly, TU = I_W. Thus T is invertible. ∎

Example

For the vector spaces P₁(R) and R², choose the ordered bases β = {1, x} and γ = {e₁, e₂}, respectively. In the notation of Example 1, we have

[T]^γ_β = (1 0; 1 1)   and   [T⁻¹]^β_γ = (1 0; −1 1).

It can be verified by matrix multiplication that each matrix is the inverse of the other.

Corollary 1. Let V be a finite-dimensional vector space with an ordered basis β, and let T: V → V be linear. Then T is invertible if and only if [T]_β is invertible. Furthermore, [T⁻¹]_β = ([T]_β)⁻¹.

Proof. Exercise. ∎

Corollary 2. Let A be an n × n matrix. Then A is invertible if and only if L_A is invertible. Furthermore, (L_A)⁻¹ = L_{A⁻¹}.

Proof. Exercise. ∎
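The matrix-multiplication check suggested above takes only a few lines. The sketch below (plain Python; helper name ours) multiplies the matrices of T(a + bx) = (a, a + b) and of T⁻¹(c, d) = c + (d − c)x, relative to β = {1, x} and γ = {e₁, e₂}, in both orders.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Columns are coordinate vectors of the images of the basis vectors:
# T(1) = (1, 1), T(x) = (0, 1);  T^{-1}(e1) = 1 - x, T^{-1}(e2) = x.
M_T    = [[1, 0], [1, 1]]
M_Tinv = [[1, 0], [-1, 1]]

I2 = [[1, 0], [0, 1]]
assert mat_mul(M_T, M_Tinv) == I2 and mat_mul(M_Tinv, M_T) == I2
```

Getting the identity in both orders is exactly the content of Theorem 2.19: the matrix of T⁻¹ is the inverse of the matrix of T.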

The notion of invertibility may be used to formalize what may already have been observed by the reader: certain vector spaces strongly resemble one another except for the form of their elements. For example, in the case of M₂ₓ₂(F) and F⁴, if we associate to each matrix (a b; c d) the 4-tuple (a, b, c, d), we see that sums and scalar products associate in a similar manner; that is, in terms of the vector space structure, these two vector spaces may be considered identical or "isomorphic."

Definition. Let V and W be vector spaces. We say that V is isomorphic to W if there exists a linear transformation T: V → W that is invertible. Such a linear transformation is called an isomorphism from V onto W.

We leave as an exercise the proof of the fact that "is isomorphic to" is an equivalence relation on the class of vector spaces.

Example 4

Define T: F² → P₁(F) by T(a₁, a₂) = a₁ + a₂x. Clearly T is invertible; so F² is isomorphic to P₁(F).

Example 5

Define T: P₃(R) → M₂ₓ₂(R) by

T(f) = (f(1) f(2); f(3) f(4)).

It is easily verified that T is linear. By use of the Lagrange interpolation formula in Section 1.6, it can be shown (compare with Exercise 20) that T(f) = 0 only when f is the zero polynomial. Thus T is one-to-one, and since dim(P₃(R)) = dim(M₂ₓ₂(R)), T is invertible. We conclude that P₃(R) is isomorphic to M₂ₓ₂(R).

In each of Examples 4 and 5 the reader may have observed that isomorphic vector spaces have equal dimensions. As the next theorem shows, this is no coincidence.

Theorem 2.20. Let V and W be finite-dimensional vector spaces (over the same field F). Then V is isomorphic to W if and only if dim(V) = dim(W).

Proof. Suppose that V is isomorphic to W, and let T: V → W be an isomorphism from V onto W. By the lemma preceding Theorem 2.19, we have dim(V) = dim(W).

Now suppose that dim(V) = dim(W), and let β = {x₁, ..., xₙ} and γ = {y₁, ..., yₙ} be bases for V and W, respectively. By Theorem 2.6 there exists a linear transformation T: V → W such that T(xᵢ) = yᵢ for i = 1, ..., n. We have

R(T) = span(T(β)) = span(γ) = W,

so T is onto. From Theorem 2.5 we have that T is also one-to-one. Hence T is an isomorphism. ∎

Corollary. If V is a vector space over F of dimension n, then V is isomorphic to Fⁿ.

Until now we have associated linear transformations with their matrix representations. We are now in a position to prove that the collection of all linear transformations between two given vector spaces may be identified with the appropriate vector space of m × n matrices.

Theorem 2.21. Let V and W be finite-dimensional vector spaces over F of dimensions n and m, respectively, and let β and γ be ordered bases for V and W, respectively. Then the function Φ: ℒ(V, W) → M_{m×n}(F), defined by Φ(T) = [T]^γ_β for every T ∈ ℒ(V, W), is an isomorphism.

Proof. Theorem 2.8 allows us to conclude that Φ is linear. Hence we must show that Φ is one-to-one and onto. This will be accomplished if we can show that for every m × n matrix A there exists a unique linear transformation T: V → W such that Φ(T) = A. Let β = {x₁, ..., xₙ} and γ = {y₁, ..., yₘ}, and let A be a given m × n matrix. By Theorem 2.6 there exists a unique linear transformation T: V → W such that

T(x_j) = Σ_{i=1}^m A_ij y_i   for 1 ≤ j ≤ n.

But this means that [T]^γ_β = A, or Φ(T) = A. Thus Φ is an isomorphism. ∎

Corollary. Let V and W be finite-dimensional vector spaces of dimensions n and m, respectively. Then ℒ(V, W) is finite-dimensional of dimension mn.

Proof. The proof follows from Theorems 2.20 and 2.21 and the fact that dim(M_{m×n}(F)) = mn. ∎

We conclude this section with a result that allows us to see more clearly the relationship between linear transformations defined on abstract finite-dimensional vector spaces and linear transformations defined on Fⁿ. We begin by naming the transformation x → [x]_β introduced in Section 2.2.

Definition. Let β be an ordered basis for an n-dimensional vector space V over the field F. The standard representation of V with respect to β is the function φ_β: V → Fⁿ defined by φ_β(x) = [x]_β for each x ∈ V.

Example

Let V = R², β = {(1, 0), (0, 1)}, and γ = {(1, 2), (3, 4)}. For x = (1, −2) we have

φ_β(x) = [x]_β = (1, −2)ᵗ   and   φ_γ(x) = [x]_γ = (−5, 2)ᵗ.

We have observed earlier that φ_β is a linear transformation. The following theorem tells us much more.

Theorem 2.22. For any finite-dimensional vector space V with ordered basis β, φ_β is an isomorphism.

Proof. Exercise. ∎

This theorem provides us with an alternate proof that an n-dimensional vector space is isomorphic to Fⁿ (see the corollary to Theorem 2.20).
We are now ready to use the standard representation of a vector space along with the matrix representation of a linear transformation to study the relationship between the linear transformation T: V → W, where V and W are abstract finite-dimensional vector spaces, and the linear transformation L_A: Fⁿ → Fᵐ, where A = [T]^γ_β and β and γ are arbitrary ordered bases of V and W, respectively.

[Figure 2.2: a diagram with T: V → W across the top, φ_β: V → Fⁿ and φ_γ: W → Fᵐ down the sides, and L_A: Fⁿ → Fᵐ across the bottom.]

Let us first consider Figure 2.2. Notice that there are two compositions of linear transformations that map V into Fᵐ:
1. Map V into Fⁿ with φ_β and follow this transformation with L_A; this composition yields L_A φ_β.
2. Map V into W with T and follow it by φ_γ to obtain the composition φ_γ T.
These two compositions are depicted by the dashed arrows in the diagram. By a simple reformulation of Theorem 2.15, we may conclude that

L_A φ_β = φ_γ T.

Heuristically, this relationship indicates that after V and W are identified with Fⁿ and Fᵐ via φ_β and φ_γ, respectively, we may "identify" T with L_A.

Example 7

Recall the transformation T: P₃(R) → P₂(R) defined in Example 3 of Section 2.2 (T(f) = f′). Let β and γ be the standard ordered bases for P₃(R) and P₂(R), respectively, and let φ_β: P₃(R) → R⁴ and φ_γ: P₂(R) → R³ be the corresponding standard representations of P₃(R) and P₂(R). If A = [T]^γ_β, then

A = (0 1 0 0; 0 0 2 0; 0 0 0 3).

Consider the polynomial p(x) = 2 + x − 3x² + 5x³. We show that L_A φ_β(p) = φ_γ T(p). Now

L_A φ_β(p) = A(2, 1, −3, 5)ᵗ = (1, −6, 15)ᵗ.

But since T(p) = p′ = 1 − 6x + 15x², we have

φ_γ T(p) = (1, −6, 15)ᵗ.

So L_A φ_β(p) = φ_γ T(p). Try repeating this example with different polynomials p(x).
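Both routes around Figure 2.2 can be computed side by side. The sketch below (plain Python; the helper names are ours) follows Example 7: one route multiplies by A = [T], the other differentiates the coordinate vector directly.

```python
def mat_vec(A, x):
    return [sum(A[i][k] * x[k] for k in range(len(x))) for i in range(len(A))]

def derivative_coords(p):
    """phi_gamma(T(p)): coordinates of p' when p is given by its coordinates
    in the standard basis {1, x, x^2, x^3}."""
    return [k * p[k] for k in range(1, len(p))]

# A = [T] for differentiation, relative to the standard ordered bases.
A = [[0, 1, 0, 0],
     [0, 0, 2, 0],
     [0, 0, 0, 3]]

p = [2, 1, -3, 5]   # p(x) = 2 + x - 3x^2 + 5x^3
# The two compositions around the diagram agree: L_A(phi_beta(p)) = phi_gamma(T(p)).
assert mat_vec(A, p) == derivative_coords(p) == [1, -6, 15]
```

Replacing p with the coordinates of any other cubic leaves the assertion true, which is the "identify T with L_A" principle in action.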

EXERCISES

1.

as being true or false.For

the following statements


W are vector spaces with
T: V -> W is linear. A and

Label
and
and

(a)

([1-^

(b)

ordered (finite)bases a

if and only

if T is one-to-oneand onto.

F5.

(e) Pn(F) is isomorphicto


A and
(f) AB = I implies

and

if

Pm(F)

only

if n

= m.

invertible.

\302\243
are

that

= A.

{g){A~Yl

invertible if and

(h) /1 is
2.f

respectively,

4 =

LA1 where
[T]\302\243.
is
to
isomorphic
M2x3(^)

(d)

/?,

[1-\302\276

is invertible

(c)

and

following,

matrices.

are

the

if

only

is

LA

invertible.

be square in order to possess inverse.


and B be n x n invertible matrices. Prove that

(i)

A must

Let

an

is

AB

and

invertible

{AByl=B~lA~K

3.f

4. Prove that

5.
6.

7.

If

A2

hence

10.

=
\302\243

/1^1).

11. Let ~
the

class

(We

\"is

of

= 0, then

B = 0.

that AB is

(A*)'1

= {A\"1)*.

invertible. Prove that

to show that

arbitrarymatrices

and
and

invertible.
A = B\"1
n matrices such that AB = /n.
(and
are in effect saying that for square matrices,a
if AB is

Prove

inverse.)

transformation definedin

Example

is

one-to-one.

2.22.

Theorem
mean

and

invertible.

such

an example

a two-sided

is

Prove that the


Prove

n x

be

inverse

onesided

9.

and

invertible

of Theorem2.19.

matrices

Give

be invertible

AB
be

cannot

1 and 2

Bbe\302\253xn

not

Let

is

A1

and

invertible

that

invertible.

are

need

is

0, prove

and

/1

Let

if

Corollaries

Prove

8.f

Prove that

Let A be invertible.

isomorphic

vector

spaces

to.\"

over

equivalencerelationon
as defined in Appendix A.

Prove

that

~ is an

92

12.

Chap.

and Matrices

Transformations

Linear

12. Let V = P_2(F). Construct an isomorphism from V onto F^3.

13. Let V and W be finite-dimensional vector spaces, and let T: V -> W be an isomorphism. If β is a basis for V, prove that T(β) is a basis for W.

14. Let B be an n x n invertible matrix. Define Φ: M_{nxn}(F) -> M_{nxn}(F) by Φ(A) = B^{-1}AB. Prove that Φ is an isomorphism.

15.† Let V and W be finite-dimensional vector spaces and T: V -> W be an isomorphism. Let V_0 be a subspace of V.
(a) Prove that T(V_0) is a subspace of W.
(b) Prove that dim(V_0) = dim(T(V_0)).

16. Repeat Example 7 with the polynomial p(x) = 1 + x + 2x^2 + x^3.

17. Let V = M_{2x2}(R), the four-dimensional vector space of 2 x 2 matrices having real entries. Recall from Example 4 of Section 2.1 the linear transformation T: V -> V defined by T(A) = A^t for each A in V.
(a) Let β = {E^{11}, E^{12}, E^{21}, E^{22}}, where E^{ij} denotes the 2 x 2 matrix having the i, j entry equal to one and all other entries zero. Prove that β is an ordered basis for V.
(b) Compute [T]_β.
(c) Let φ_β denote the standard representation of V with respect to β. Verify that φ_β(T(M)) = [T]_β φ_β(M) for M in V.

18.† Let T: V -> W be a linear transformation from an n-dimensional vector space V to an m-dimensional vector space W. Let β and γ be ordered bases for V and W, respectively. Prove that rank(T) = rank(L_A) and that nullity(T) = nullity(L_A), where A = [T]_β^γ. Hint: Apply Exercise 15 to Figure 2.2.

19. Let V and W be finite-dimensional vector spaces with ordered bases β = {x_1, ..., x_n} and γ = {y_1, ..., y_m}, respectively. By Theorem 2.6 there exists a linear transformation T_{ij}: V -> W such that

T_{ij}(x_k) = y_i if k = j  and  T_{ij}(x_k) = 0 if k ≠ j.

First prove that {T_{ij}: 1 ≤ i ≤ m, 1 ≤ j ≤ n} is a basis for L(V, W). Then let E^{ij} be the m x n matrix with 1 in the ith row and jth column and 0 elsewhere, and prove that [T_{ij}]_β^γ = E^{ij}. Again by Theorem 2.6 there exists a linear transformation Φ: L(V, W) -> M_{mxn}(F) such that Φ(T_{ij}) = E^{ij}. Prove that Φ is an isomorphism.

Sec. 2.5   The Change of Coordinate Matrix   93

20. Let c_0, c_1, ..., c_n be distinct elements of an infinite field F. Define T: P_n(F) -> F^{n+1} by T(f) = (f(c_0), ..., f(c_n)). Prove that T is an isomorphism. Hint: Use the Lagrange polynomials associated with c_0, ..., c_n.

21. Let V denote the vector space defined in Example 5 of Section 1.2, and let W = P(F). Define T: V -> W by

T(σ) = Σ_{i=0}^{n} σ(i)x^i,

where n is the largest integer such that σ(n) ≠ 0. Prove that T is an isomorphism.

The following exercise requires familiarity with the definition of quotient space given in Exercise 29 of Section 1.3 and with Exercise 30 of Section 2.1.

22. Let T: V -> Z be a linear transformation of a vector space V onto a vector space Z. Define the mapping

T̄: V/N(T) -> Z  by  T̄(v + N(T)) = T(v)

for any coset v + N(T) in V/N(T).
(a) Prove that T̄ is well-defined; that is, prove that if v + N(T) = v' + N(T), then T(v) = T(v').
(b) Prove that T̄ is linear.
(c) Prove that T̄ is an isomorphism.
(d) Prove that the diagram shown in Figure 2.3 commutes; that is, prove that T = T̄η.

    V --------T-------→ Z
     \                 ↗
      η \          / T̄
         ↘       /
        V/N(T)

Figure 2.3

2.5 THE CHANGE OF COORDINATE MATRIX

In many areas of mathematics, a change of variable is used to simplify the appearance of an expression. For example, in calculus an antiderivative of 2xe^{x^2} can be found by making the change of variable u = x^2. The resulting expression is of such a simple form that an antiderivative is easily recognized:

∫ 2xe^{x^2} dx = ∫ e^u du = e^u = e^{x^2}.

94   Chap. 2   Linear Transformations and Matrices

Figure 2.4

Similarly, in plane geometry the change of variable

x = (1/√5)(2x' − y'),   y = (1/√5)(x' + 2y')

can be used to transform the equation 2x^2 − 4xy + 5y^2 = 1 into the simpler equation (x')^2 + 6(y')^2 = 1, in which form it is easily seen to be the equation of an ellipse (see Figure 2.4). We will see how this change of variable is determined in Section 6.5. Geometrically, the change of variable is a change in the way that the position of a point P in the plane is described. This is done by introducing a new frame of reference, an x'y'-coordinate system with coordinate axes rotated from the original xy-coordinate axes. In this case the new coordinate axes are chosen to lie in the directions of the axes of the ellipse. The unit vectors along the x'-axis and the y'-axis form an ordered basis

β' = { (1/√5)(2, 1), (1/√5)(−1, 2) }

for R^2, and the change of variable is actually a change from [P]_β = (x, y), the coordinate vector of P relative to the standard ordered basis β = {e_1, e_2}, to [P]_{β'} = (x', y'), the coordinate vector of P relative to the new rotated ordered basis β'.

A natural question arises: How can a coordinate vector relative to one basis be changed into a coordinate vector relative to the other? Notice that the old coordinates can be represented by the matrix equation

( x )         ( 2  −1 ) ( x' )
(   ) = (1/√5)(       ) (    ).
( y )         ( 1   2 ) ( y' )

Notice also that the matrix

Q = (1/√5) ( 2  −1 )
           ( 1   2 )

equals [I]_{β'}^{β}, where I denotes the identity transformation on R^2. Thus [v]_β = Q[v]_{β'} for all v in R^2. A similar result is true in general.
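The arithmetic behind this change of variable is easy to check numerically. The following sketch is a supplement to the text (not from the book) using NumPy: it verifies that the columns of Q are orthonormal and that substituting x = Qx' turns the quadratic form 2x^2 − 4xy + 5y^2, whose symmetric matrix is A, into (x')^2 + 6(y')^2.

```python
import numpy as np

# Change of coordinate matrix: its columns are the beta' vectors
# expressed in the standard ordered basis beta.
Q = (1 / np.sqrt(5)) * np.array([[2.0, -1.0],
                                 [1.0,  2.0]])

# Symmetric matrix of the quadratic form 2x^2 - 4xy + 5y^2.
A = np.array([[ 2.0, -2.0],
              [-2.0,  5.0]])

# [v]_beta = Q [v]_{beta'}: the first beta' vector has
# beta'-coordinates (1, 0), so this recovers the first column of Q.
v_beta = Q @ np.array([1.0, 0.0])

# Substituting x = Q x' into x^t A x gives (x')^t (Q^t A Q) x',
# so Q^t A Q should be diag(1, 6).
D = Q.T @ A @ Q
print(np.round(D, 10))
```

Since the columns of Q are unit eigenvectors of A (with eigenvalues 1 and 6), the product Q^t A Q is diagonal, which is exactly why the cross term 4xy disappears.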

Theorem 2.23. Let β and β' be two ordered bases for a finite-dimensional vector space V, and let Q = [I_V]_{β'}^{β}, where I_V denotes the identity transformation on V. Then
(a) Q is invertible.
(b) For any v in V, [v]_β = Q[v]_{β'}.

Proof. (a) Since I_V is invertible, Q is invertible by Theorem 2.19.
(b) For any v in V,

[v]_β = [I_V(v)]_β = [I_V]_{β'}^{β}[v]_{β'} = Q[v]_{β'}

by Theorem 2.15.

The matrix Q = [I_V]_{β'}^{β} defined in Theorem 2.23 is called a change of coordinate matrix. Because of part (b) of the theorem, we say that Q changes β'-coordinates into β-coordinates. Observe that if β = {x_1, x_2, ..., x_n} and β' = {x'_1, x'_2, ..., x'_n}, then

x'_j = Σ_{i=1}^{n} Q_{ij} x_i

for j = 1, 2, ..., n; that is, the jth column of Q is [x'_j]_β.

Notice that if Q changes β'-coordinates into β-coordinates, then Q^{-1} changes β-coordinates into β'-coordinates (see Exercise 10).

Example 1. Let V = R^2, β = {(1,1), (1,−1)}, and β' = {(2,4), (3,1)}. Since (2,4) = 3(1,1) − 1(1,−1) and (3,1) = 2(1,1) + 1(1,−1), the matrix that changes β'-coordinates into β-coordinates is

Q = (  3  2 )
    ( −1  1 ).

96   Chap. 2   Linear Transformations and Matrices

Thus, for instance, [(2,4)]_β = Q[(2,4)]_{β'}.

Suppose now that T: V -> V is a linear transformation on a finite-dimensional vector space V and that β and β' are ordered bases for V. Then T can be represented by the matrices [T]_β and [T]_{β'}. What is the relationship between these matrices? The next theorem provides a simple answer using a change of coordinate matrix.

Theorem 2.24. Let T: V -> V be a linear transformation on a finite-dimensional vector space V, and let β and β' be ordered bases for V. Let Q be the change of coordinate matrix that changes β'-coordinates into β-coordinates. Then

[T]_{β'} = Q^{-1}[T]_β Q.

Proof. Let I be the identity transformation on V. Then T = IT = TI; hence, by Theorem 2.11,

Q[T]_{β'} = [I]_{β'}^{β}[T]_{β'} = [IT]_{β'}^{β} = [TI]_{β'}^{β} = [T]_β [I]_{β'}^{β} = [T]_β Q.

Therefore, [T]_{β'} = Q^{-1}[T]_β Q.
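Theorem 2.24 is easy to test numerically. The sketch below is a supplement (the operator and basis are illustrative choices, not from the text): take T(a, b) = (b, a) on R^2, whose matrix in the standard ordered basis β is the swap matrix, and β' = {(1,1), (1,−1)}. Since T fixes (1,1) and negates (1,−1), the computed [T]_{β'} should be diag(1, −1).

```python
import numpy as np

# [T]_beta for T(a, b) = (b, a) relative to the standard ordered basis.
T_beta = np.array([[0.0, 1.0],
                   [1.0, 0.0]])

# Q changes beta'-coordinates into beta-coordinates: its columns are
# the beta' vectors (1,1) and (1,-1) in standard coordinates.
Q = np.array([[1.0,  1.0],
              [1.0, -1.0]])

# Theorem 2.24: [T]_{beta'} = Q^{-1} [T]_beta Q.
T_beta_prime = np.linalg.inv(Q) @ T_beta @ Q
print(np.round(T_beta_prime, 10))
```

The diagonal result reflects the fact that β' consists of eigenvectors of T, the simplest situation the theorem can produce.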

Example 2

Let V = R^3, and let T: V -> V be defined by

T(a_1, a_2, a_3) = (2a_1 + a_2, a_1 − a_2, 3a_3).

Let β be the standard ordered basis for R^3, and let

β' = { (1, 0, 0), (−2, 1, 0), (−1, 1, 1) },

which is an ordered basis for R^3. To illustrate Theorem 2.24, we note that

[T]_β = ( 2   1  0 )
        ( 1  −1  0 )
        ( 0   0  3 ).

Sec. 2.5   The Change of Coordinate Matrix   97

Let Q be the matrix that changes β'-coordinates into β-coordinates. Since β is the standard ordered basis for R^3, the columns of Q are simply the elements of β' written in the same order (see Exercise 11). Thus

Q = ( 1  −2  −1 )
    ( 0   1   1 )
    ( 0   0   1 ).

It can easily be verified that

Q^{-1} = ( 1  2  −1 )
         ( 0  1  −1 )
         ( 0  0   1 ).

In Section 3.2 a method for computing Q^{-1} will be described. By Theorem 2.24,

[T]_{β'} = Q^{-1}[T]_β Q,

and a straightforward multiplication shows that

[T]_{β'} = ( 4  −9  −8 )
           ( 1  −3  −5 )
           ( 0   0   3 ).

To show that this is the correct matrix, we verify that the image under T of the jth element of β' is the linear combination of the elements of β' with the entries of the jth column as its coefficients. For example, for j = 2 we have

T(−2, 1, 0) = (−3, −3, 0) = −9(1, 0, 0) − 3(−2, 1, 0) + 0(−1, 1, 1),

and the coefficients −9, −3, and 0 are the entries of the second column of [T]_{β'}.

It is often useful to apply Theorem 2.24 in the reverse direction, as the next example shows.

Example 3

Recall the reflection of R^2 about the x-axis defined in Section 2.1; its rule (x, y) -> (x, −y) is easy to obtain. We now derive the less obvious rule for the reflection T of R^2 about the line y = 2x (see Figure 2.5). We wish to find an expression for T(a, b) for any (a, b) in R^2. Since T is linear, it is completely determined by its values on a basis for R^2. Clearly, T(1, 2) = (1, 2) and T(−2, 1) = −(−2, 1) = (2, −1). Therefore, if we let

β' = {(1, 2), (−2, 1)},

98   Chap. 2   Linear Transformations and Matrices

Figure 2.5

then β' is an ordered basis for R^2 and

[T]_{β'} = ( 1   0 )
           ( 0  −1 ).

Let β be the standard ordered basis for R^2, and let Q be the matrix that changes β'-coordinates into β-coordinates. Then

Q = ( 1  −2 )
    ( 2   1 ),

and Q^{-1}[T]_β Q = [T]_{β'}. We can solve this equation for [T]_β to obtain [T]_β = Q[T]_{β'}Q^{-1}. Because the reader can verify that

Q^{-1} = (1/5) (  1  2 )
               ( −2  1 ),

it follows that

[T]_β = (1/5) ( −3  4 )
              (  4  3 ).

Since β is the standard ordered basis, T is left-multiplication by [T]_β. Thus for any (a, b) in R^2, we have

T( a ) = (1/5) ( −3a + 4b )
 ( b )         (  4a + 3b ).

The relationship between the matrices [T]_{β'} and [T]_β in Theorem 2.24 will be the subject of further study in Chapters 5, 6, and 7. At this time, however, we introduce the name for this relationship.

Definition. Let A and B be elements of M_{nxn}(F). We say that B is similar to A if there exists an invertible matrix Q in M_{nxn}(F) such that B = Q^{-1}AQ.
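One quantity that similar matrices always share is the trace (a proof is requested in Exercise 9 below, via tr(XY) = tr(YX)). A quick numerical spot-check, offered here as a supplement with an arbitrary A and a randomly chosen Q:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
Q = rng.standard_normal((3, 3))
# A random Gaussian matrix is invertible with probability 1;
# we simply assume this Q is invertible here.
B = np.linalg.inv(Q) @ A @ Q   # B is similar to A

# tr(B) = tr(Q^{-1} A Q) = tr(A Q Q^{-1}) = tr(A).
print(np.trace(A), np.trace(B))
```

The same cyclic-trace argument shows that all matrices in a similarity class share a single trace, which is why trace is well-defined for a linear operator independently of the ordered basis chosen.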

Sec. 2.5   The Change of Coordinate Matrix   99

Observe that the relation of similarity is an equivalence relation (see Exercise 8). Notice also that in this terminology Theorem 2.24 can be stated as follows: If T: V -> V is a linear transformation on a finite-dimensional vector space V, and if β and β' are any ordered bases for V, then [T]_{β'} is similar to [T]_β.

Theorem 2.24 can be generalized to allow T: V -> W, where W is distinct from V. In this case we can change bases in V as well as in W (see Exercise 7).
EXERCISES

1. Label the following statements as true or false.
(a) If Q is the change of coordinate matrix that changes β'-coordinates into β-coordinates, where β' = {x'_1, ..., x'_n} and β = {x_1, ..., x_n}, then the jth column of Q is [x_j]_{β'}.
(b) Every change of coordinate matrix is invertible.
(c) Let T: V -> V be a linear transformation on a finite-dimensional vector space V, and let β and β' be ordered bases for V. Then [T]_β = Q[T]_{β'}Q^{-1}, where Q is the change of coordinate matrix that changes β'-coordinates into β-coordinates.
(d) The matrices A, B in M_{nxn}(F) are called similar if B = Q^t A Q for some Q in M_{nxn}(F).
(e) Let T: V -> V be a linear transformation on a finite-dimensional vector space V. Then for any ordered bases β and γ for V, [T]_β is similar to [T]_γ.

2. For each of the following pairs of ordered bases β and β' for R^2, find the change of coordinate matrix that changes β'-coordinates into β-coordinates.
(a) β = {e_1, e_2} and β' = {(a_1, a_2), (b_1, b_2)}
(b) β = {(−1, 3), (2, −1)} and β' = {(0, 10), (5, 0)}
(c) β = {(2, 5), (−1, −3)} and β' = {e_1, e_2}
(d) β = {(−4, 3), (2, −1)} and β' = {(2, 1), (−4, 1)}

3. For each of the following pairs of ordered bases β and β' for P_2(R), find the change of coordinate matrix that changes β'-coordinates into β-coordinates.
(a) β = {x^2, x, 1} and β' = {a_2 x^2 + a_1 x + a_0, b_2 x^2 + b_1 x + b_0, c_2 x^2 + c_1 x + c_0}
(b) β = {1, x, x^2} and β' = {a_2 x^2 + a_1 x + a_0, b_2 x^2 + b_1 x + b_0, c_2 x^2 + c_1 x + c_0}

100   Chap. 2   Linear Transformations and Matrices

(c) β = {2x^2 + x, 3x^2 + 1, x^2} and β' = {1, x, x^2}
(d) β = {x^2 − x + 1, x + 1, x^2 + 1} and β' = {x^2 + x + 4, 4x^2 − 3x + 2, 2x^2 + 3}
(e) β = {x^2 − x, x^2 + 1, x − 1} and β' = {5x^2 − 2x − 3, −2x^2 + 5x + 5, 2x^2 − x − 3}
(f) β = {2x^2 − x + 1, x^2 + 3x − 2, −x^2 + 2x + 1} and β' = {9x − 9, x^2 + 21x − 2, 3x^2 + 5x + 2}

4. Let T: R^2 -> R^2 be defined by T(a, b) = (2a + b, a − 3b), let β be the standard ordered basis for R^2, and let β' = {(1, 1), (1, 2)}. Use Theorem 2.24 to find [T]_{β'}.

5. Let T: P_1(R) -> P_1(R) be defined by T(p) = p', the derivative of p in P_1(R). Let β = {1, x} and β' = {1 + x, 1 − x}. Use the fact that [T]_β is easily computed and Theorem 2.24 to find [T]_{β'}.

6. Let L denote the line y = mx, where m ≠ 0, and let T be the reflection of R^2 about L. Find an expression for T(x, y).

7. Prove the following generalization of Theorem 2.24. Let T: V -> W be a linear transformation from a finite-dimensional vector space V to a finite-dimensional vector space W, let β and β' be ordered bases for V, and let γ and γ' be ordered bases for W. Then [T]_{β'}^{γ'} = P^{-1}[T]_β^γ Q, where Q is the matrix that changes β'-coordinates into β-coordinates and P is the matrix that changes γ'-coordinates into γ-coordinates.

8. For A and B in M_{nxn}(F), define A ~ B to mean that A is similar to B. Prove that ~ is an equivalence relation on M_{nxn}(F).

9. Prove that if A and B are similar n x n matrices, then tr(A) = tr(B). Hint: Use Exercise 12 of Section 2.3.

10. Let V be a finite-dimensional vector space with ordered bases α, β, and γ.
(a) Prove that if Q and R are the change of coordinate matrices that change α-coordinates into β-coordinates and β-coordinates into γ-coordinates, respectively, then RQ is the change of coordinate matrix that changes α-coordinates into γ-coordinates.
(b) Prove that if Q changes α-coordinates into β-coordinates, then Q^{-1} changes β-coordinates into α-coordinates.

Sec. 2.6   Dual Spaces   101

11. Let A be an n x n matrix with entries in a field F, let β be an ordered basis for F^n, and let B = [L_A]_β. Prove that B = Q^{-1}AQ, where Q is the n x n matrix whose jth column equals the jth element of β.

12.† Let V be a finite-dimensional vector space over a field F, and let β = {x_1, ..., x_n} be an ordered basis for V. Let Q be an n x n invertible matrix with entries from F. Define

x'_j = Σ_{i=1}^{n} Q_{ij} x_i   for 1 ≤ j ≤ n,

and set β' = {x'_1, ..., x'_n}. Prove that β' is a basis for V and hence that Q is the change of coordinate matrix changing β'-coordinates into β-coordinates.

13. Prove the converse of Exercise 7: If A and B are each m x n matrices over a field F, and if there exist invertible m x m and n x n matrices P and Q, respectively, such that B = PAQ, then there exist an n-dimensional vector space V and an m-dimensional vector space W (both over F), ordered bases β and β' for V and γ and γ' for W, and a linear transformation T: V -> W such that

A = [T]_β^γ   and   B = [T]_{β'}^{γ'}.

Hints: Let V = F^n, W = F^m, T = L_A, and let β and γ be the standard ordered bases for F^n and F^m, respectively. Let β' be the ordered basis for F^n obtained from β via Q (as justified by Exercise 12), and let γ' be obtained from γ via P^{-1}.

2.6* DUAL SPACES

In this section we are concerned exclusively with linear transformations from a vector space V into its field of scalars F, which is itself a vector space of dimension 1 over F. Such a linear transformation is called a linear functional on V. We generally use the letters f, g, h, ... to denote linear functionals. As we will see in Example 1, the definite integral provides us with one of the most important examples of a linear functional in mathematics.

Example 1

Let V be the vector space of continuous real- (or complex-) valued functions on the interval [a, b]. The function f: V -> R (or C) defined by

f(x) = ∫_a^b x(t) dt

is a linear functional on V. If the interval is [0, 2π] and n is an integer, the

102   Chap. 2   Linear Transformations and Matrices

function defined by

h_n(x) = (1/2π) ∫_0^{2π} x(t) e^{−int} dt

is also a linear functional. In analysis texts the scalar h_n(x) is called the nth Fourier coefficient of x.

Example 2

Let V = M_{nxn}(F), and define f: V -> F by f(A) = tr(A), the trace of A. By Exercise 6 of Section 1.3, we have that f is a linear functional.

Example 3

Let V be a finite-dimensional vector space with the ordered basis β = {x_1, x_2, ..., x_n}. For each i = 1, ..., n, define f_i(x) = a_i, where

[x]_β = (a_1, a_2, ..., a_n)

is the coordinate vector of x relative to β. Then f_i is a linear functional on V called the ith coordinate function with respect to the basis β. Note that f_i(x_j) = δ_ij. These linear functionals play a very important role in the theory of dual spaces (see Theorem 2.25).

Definition. For a vector space V over F, we define the dual space of V to be the vector space L(V, F), denoted by V*.

Thus V* is the vector space consisting of all linear functionals on V with the operations of addition and scalar multiplication as defined in Section 2.2. Note that if V is finite-dimensional, then by Theorem 2.20

dim(V*) = dim(L(V, F)) = dim(V)·dim(F) = dim(V).

Hence V and V* are isomorphic. We also define the double dual V** of V to be the dual of V*. We will show, in fact, that there is a natural identification of V and V** in the finite-dimensional case.

Theorem 2.25. Suppose that V is a finite-dimensional vector space with the ordered basis β = {x_1, ..., x_n}. Let f_i (1 ≤ i ≤ n) be the coordinate functions with respect to β as defined above, and let β* = {f_1, ..., f_n}. Then β* is an ordered basis for V*, and for any f in V* we have

f = Σ_{i=1}^{n} f(x_i) f_i.

Definition. We call β* the dual basis of β.

Sec. 2.6   Dual Spaces   103

Proof. Let f be in V*. Since dim(V*) = n, we need only show that

f = Σ_{i=1}^{n} f(x_i) f_i,

for then it will follow that β* generates V*. Let

g = Σ_{i=1}^{n} f(x_i) f_i.

For 1 ≤ j ≤ n, we have

g(x_j) = ( Σ_{i=1}^{n} f(x_i) f_i )(x_j) = Σ_{i=1}^{n} f(x_i) f_i(x_j) = Σ_{i=1}^{n} f(x_i) δ_ij = f(x_j).

Hence g = f by the corollary to Theorem 2.6, and we are done.
Example 4

Let β = {(2,1), (3,1)} be an ordered basis for R^2. To explicitly determine the dual basis β* = {f_1, f_2} of β, we need to consider the equations

1 = f_1(2, 1) = f_1(2e_1 + e_2) = 2f_1(e_1) + f_1(e_2)
0 = f_1(3, 1) = f_1(3e_1 + e_2) = 3f_1(e_1) + f_1(e_2).

Solving these equations, we obtain f_1(e_1) = −1 and f_1(e_2) = 3; i.e., f_1(x, y) = −x + 3y. Similarly, it can be shown that f_2(x, y) = x − 2y.
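The computation in Example 4 can be organized as a single matrix inversion: if the columns of a matrix B are the basis vectors of β in standard coordinates, then B^{-1}B = I says exactly that the ith row of B^{-1} applied to the jth basis vector gives δ_ij, so the rows of B^{-1} are the coefficient vectors of the dual basis functionals. A supplementary NumPy sketch for β = {(2,1), (3,1)}:

```python
import numpy as np

# Columns of B are the vectors of beta = {(2,1), (3,1)}.
B = np.array([[2.0, 3.0],
              [1.0, 1.0]])

# Row i of B^{-1} lists the coefficients of f_{i+1}:
# f_i(x, y) = (row i) . (x, y), since (B^{-1} B) = I encodes f_i(x_j) = delta_ij.
rows = np.linalg.inv(B)
f1, f2 = rows[0], rows[1]
print(f1)  # coefficients of f1(x, y) = -x + 3y
print(f2)  # coefficients of f2(x, y) =  x - 2y
```

Running this reproduces the coefficients found by hand in the example: f1 = (−1, 3) and f2 = (1, −2).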

We now assume that V and W are finite-dimensional vector spaces over F with ordered bases β and γ, respectively. In Section 2.4 we proved that there exists a one-to-one correspondence between linear transformations T: V -> W and m x n matrices (over F) via the correspondence T <-> [T]_β^γ. For a matrix of the form A = [T]_β^γ, the question arises as to whether or not there exists a linear transformation U associated with T in some natural way such that U may be represented in some basis as A^t. Of course, if m ≠ n, it would be impossible for U to be a linear transformation from V into W. We now answer this question by applying what we have already learned about dual spaces.

Theorem 2.26. Let V and W be finite-dimensional vector spaces over F with ordered bases β and γ, respectively. For any linear transformation T: V -> W, the mapping T^t: W* -> V* defined by T^t(g) = gT for all g in W* is a linear transformation with the property that

[T^t]_{γ*}^{β*} = ([T]_β^γ)^t.

Proof. For g in W*, it is clear that T^t(g) = gT is a linear functional on V and hence is an element of V*. Thus T^t maps W* into V*. We leave the proof that T^t is linear to the reader.

104   Chap. 2   Linear Transformations and Matrices

To complete the proof, let β = {x_1, ..., x_n} and γ = {y_1, ..., y_m}, with dual bases β* = {f_1, ..., f_n} and γ* = {g_1, ..., g_m}, respectively. For convenience, let A = [T]_β^γ and B = [T^t]_{γ*}^{β*}. Then

T(x_i) = Σ_{k=1}^{m} A_{ki} y_k   for 1 ≤ i ≤ n.

We must show that B = A^t. Theorem 2.25 shows that

T^t(g_j) = g_j T = Σ_{i=1}^{n} (g_j T)(x_i) f_i,

so

B_{ij} = (g_j T)(x_i) = g_j(T(x_i)) = g_j( Σ_{k=1}^{m} A_{ki} y_k ) = Σ_{k=1}^{m} A_{ki} g_j(y_k) = A_{ji} = (A^t)_{ij}.

Hence B = A^t.

The linear transformation T^t defined in Theorem 2.26 is called the transpose of T. It is clear that T^t is the unique linear transformation U such that [U]_{γ*}^{β*} = ([T]_β^γ)^t.

We now concern ourselves with demonstrating that any finite-dimensional vector space V can be identified in a very natural way with its double dual V**. There is, in fact, an isomorphism between V and V** that does not depend on any choice of bases for the two vector spaces.

For a vector x in V, we define x̂: V* -> F by x̂(f) = f(x) for every f in V*. It is easy to verify that x̂ is a linear functional on V*, so x̂ is in V**. The correspondence x <-> x̂ allows us to define the desired isomorphism between V and V**.

Lemma. Let V be a finite-dimensional vector space, and let x be in V. If x̂(f) = 0 for all f in V*, then x = 0.

Proof. Let x ≠ 0. We show that there exists f in V* such that x̂(f) ≠ 0. Choose an ordered basis β = {x_1, ..., x_n} for V such that x_1 = x. Let {f_1, ..., f_n} be the dual basis of β. Then f_1(x_1) = 1 ≠ 0. Let f = f_1.

Theorem 2.27. Let V be a finite-dimensional vector space, and define ψ: V -> V** by ψ(x) = x̂. Then ψ is an isomorphism.

Sec. 2.6   Dual Spaces   105

Proof. (a) ψ is linear: Let x, y be in V and a in F. For f in V* we have

ψ(x + ay)(f) = f(x + ay) = f(x) + a f(y) = x̂(f) + a ŷ(f) = (x̂ + aŷ)(f).

Hence

ψ(x + ay) = x̂ + aŷ = ψ(x) + aψ(y).

(b) ψ is one-to-one: Suppose that ψ(x) is the zero functional on V* for some x in V. Then x̂(f) = 0 for every f in V*. By the previous lemma, we conclude that x = 0.

(c) ψ is an isomorphism: This follows from (b) and the fact that dim(V) = dim(V**).

Corollary. Let V be a finite-dimensional vector space with dual space V*. Then every ordered basis of V* is the dual basis of some ordered basis of V.

Proof. Let {f_1, ..., f_n} be an ordered basis of V*. We may combine Theorems 2.25 and 2.27 to conclude that for this basis of V* there exists a dual basis {x̂_1, ..., x̂_n} in V**; that is,

δ_ij = x̂_i(f_j) = f_j(x_i).

Thus {f_1, ..., f_n} is the dual basis of {x_1, ..., x_n}.

Although many of the ideas of this section, for example the existence of a dual space, can be extended to the case in which V is not finite-dimensional, only a finite-dimensional vector space is isomorphic to its double dual via the map x -> x̂. In fact, for infinite-dimensional vector spaces, V and V* are never isomorphic.
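Before the exercises, the coordinate content of Theorem 2.26 can be checked numerically. Identify a functional g in (F^m)* with its coefficient vector c, so that g(y) = c·y; then T^t(g) = g∘L_A should have coefficient vector A^t c. The sketch below is a supplement (A, c, and the test point are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 2))   # matrix of T = L_A : F^2 -> F^3
c = rng.standard_normal(3)        # coefficient vector of g in (F^3)*

def g(y):
    return c @ y                  # g(y) = c . y

def Tt_g(x):
    return g(A @ x)               # T^t(g) = g composed with T

# The functional T^t(g) on F^2 should have coefficient vector A^t c.
x = rng.standard_normal(2)
print(Tt_g(x), (A.T @ c) @ x)
```

The two printed values agree for every x, which is the statement [T^t] = ([T])^t written out in standard coordinates.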

EXERCISES

1. Label the following statements as true or false. Assume that all vector spaces are finite-dimensional.
(a) Every linear functional is a linear transformation.
(b) A linear functional defined on a field may be represented as a 1 x 1 matrix.
(c) Every vector space is isomorphic to its dual space.
(d) Every vector space is the dual of some other vector space.
(e) If T is an isomorphism from V onto V* and β is a finite ordered basis for V, then T(β) = β*.
(f) If T is a linear transformation from V to W, then the domain of (T^t)^t is V**.
(g) If V is isomorphic to W, then V* is isomorphic to W*.
(h) The derivative of a function may be considered as a linear functional on the vector space of differentiable functions.

106   Chap. 2   Linear Transformations and Matrices

2. For the following functions f on a vector space V, determine which are linear functionals.
(a) V = P(R); f(p) = 2p'(0) + p''(1), where ' denotes differentiation
(b) V = R^2; f(x, y) = (2x, 4y)
(c) V = M_{2x2}(F); f(A) = tr(A)
(d) V = R^3; f(x, y, z) = x^2 + y^2 + z^2
(e) V = P(R); f(p) = ∫_0^1 p(t) dt
(f) V = M_{2x2}(R); f(A) = A_{11}

3. For each vector space V and ordered basis β below, find explicit formulas for the vectors of the dual basis β* of β.
(a) V = R^3; β = {(1,0,1), (1,2,1), (0,0,1)}
(b) V = P_2(R); β = {1, x, x^2}

4. Let V = R^3, and define f_1, f_2, f_3 in V* as follows:

f_1(x, y, z) = x − 2y,   f_2(x, y, z) = x + y + z,   f_3(x, y, z) = y − 3z.

Prove that {f_1, f_2, f_3} is a basis for V*, and then find a basis for V for which it is the dual.

5. Let V = P_1(R), and for p in V define f_1, f_2 in V* by

f_1(p) = ∫_0^1 p(t) dt   and   f_2(p) = ∫_0^2 p(t) dt.

Prove that {f_1, f_2} is a basis for V*, and find a basis for V for which it is the dual.

6. Define f in (R^2)* by f(x, y) = 2x + y and T: R^2 -> R^2 by T(x, y) = (3x + 2y, x).
(a) Compute T^t(f).
(b) Compute [T^t]_{β*}, where β = {e_1, e_2} is the standard ordered basis for R^2 and β* = {f_1, f_2}, by finding scalars a, b, c, and d such that T^t(f_1) = af_1 + cf_2 and T^t(f_2) = bf_1 + df_2.
(c) Compute [T]_β and ([T]_β)^t, and compare your results with part (b).

7. Let V = P_1(R) and W = R^2 with respective ordered bases β = {1, x} and γ = {e_1, e_2}. Define T: V -> W by

T(p) = (p(0) − 2p(1), p(0) + p'(0)),

where p' is the derivative of p.
(a) For f in W* defined by f(a, b) = a − 2b, compute T^t(f).
(b) Compute [T^t]_{γ*}^{β*} without appealing to Theorem 2.26.
(c) Compute [T]_β^γ and its transpose, and compare your results with part (b).

8. Show that every plane through the origin in R^3 may be identified with the null space of an element of (R^3)*. State an analogous result in R^2.

9. Let T be a function from F^n into F^m. Prove that T is linear if and only if there exist f_1, ..., f_m in (F^n)* such that T(x) = (f_1(x), ..., f_m(x)) for all x in F^n. Hint: If T is linear, define f_i(x) = (g_i T)(x) for x in F^n and 1 ≤ i ≤ m; that is, f_i = T^t(g_i), where {g_1, ..., g_m} is the dual basis of the standard ordered basis of F^m.

Sec. 2.6   Dual Spaces   107

10. Let V = P_n(F), and let c_0, c_1, ..., c_n be distinct scalars in F.
(a) For 0 ≤ i ≤ n, define f_i in V* by f_i(p) = p(c_i). Prove that {f_0, ..., f_n} is a basis for V*. Hint: Apply any linear combination of this set that equals the zero functional to the polynomial p(t) = (t − c_1)(t − c_2) ⋯ (t − c_n), and deduce that the first coefficient is zero.
(b) Use the corollary to Theorem 2.27 and part (a) to show that there exist unique polynomials p_0, ..., p_n such that p_i(c_j) = δ_ij for 0 ≤ i ≤ n. These polynomials are the Lagrange polynomials defined in Section 1.6.
(c) For any scalars a_0, ..., a_n (not necessarily distinct), deduce that there exists a unique polynomial q of degree at most n such that q(c_i) = a_i for 0 ≤ i ≤ n. In fact, q = Σ_{i=0}^{n} a_i p_i.
(d) Deduce the Lagrange interpolation formula:

p = Σ_{i=0}^{n} p(c_i) p_i

for any p in V.
(e) Prove that

∫_a^b p(t) dt = Σ_{i=0}^{n} p(c_i) d_i,

where d_i = ∫_a^b p_i(t) dt. Suppose now that c_i = a + i(b − a)/n for i = 0, ..., n. For n = 1, the above result yields the trapezoidal rule for evaluating the definite integral of a polynomial. For n = 2, this result yields Simpson's rule for polynomials.

11. Let V and W be finite-dimensional vector spaces over F, and let ψ_1 and ψ_2 be the isomorphisms between V and V** and W and W**, respectively, as defined in Theorem 2.27. Let T: V -> W be linear, and define T^{tt} = (T^t)^t. Prove that the diagram depicted in Figure 2.6 commutes; that is, prove that ψ_2 T = T^{tt} ψ_1.

108   Chap. 2   Linear Transformations and Matrices

            T
     V ----------→ W
     |             |
 ψ_1 |             | ψ_2
     ↓    T^{tt}   ↓
    V** ---------→ W**

Figure 2.6

12. Let V be a finite-dimensional vector space with the ordered basis β. Prove that ψ(β) = β**, where ψ is defined as in Theorem 2.27.

In Exercises 13 through 17, V denotes a finite-dimensional vector space over F. For every subset S of V, define the annihilator S^0 of S as

S^0 = {f in V*: f(x) = 0 for all x in S}.

13. (a) Prove that S^0 is a subspace of V*.
(b) If W is a subspace of V and x is not in W, prove that there exists f in W^0 such that f(x) ≠ 0.
(c) Prove that S^{00} = span(ψ(S)), where ψ is defined as in Theorem 2.27.
(d) For subspaces W_1 and W_2, prove that W_1 = W_2 if and only if W_1^0 = W_2^0.
(e) For subspaces W_1 and W_2, show that (W_1 + W_2)^0 = W_1^0 ∩ W_2^0.

14. Prove that if W is a subspace of V, then dim(W) + dim(W^0) = dim(V). Hint: Extend an ordered basis {x_1, ..., x_k} of W to an ordered basis β = {x_1, ..., x_n} of V. Let β* = {f_1, ..., f_n}, and prove that {f_{k+1}, ..., f_n} is a basis of W^0.

15. Suppose that W is a finite-dimensional vector space and that T: V -> W is linear. Prove that N(T^t) = (R(T))^0.

16. Use Exercises 14 and 15 to deduce that rank(L_{A^t}) = rank(L_A) for any A in M_{mxn}(F).

17. Let T: V -> V be a linear transformation, and let W be a subspace of V. Prove that W is T-invariant (as defined in Exercise 26 of Section 2.1) if and only if W^0 is T^t-invariant.
2.7* HOMOGENEOUS LINEAR DIFFERENTIAL EQUATIONS WITH CONSTANT COEFFICIENTS

As an introduction to this section, let us consider the following physical problem. A weight of mass m is attached to a vertically suspended spring that is allowed to stretch until the forces acting on the weight are in equilibrium. Suppose that the weight is now motionless and impose an XY-coordinate system

Sec. 2.7   Homogeneous Linear Differential Equations with Constant Coefficients   109

Figure 2.7

with the weight at the origin and the spring lying along the upper part of the Y-axis (see Figure 2.7).

Suppose that at a certain time, say t = 0, the weight is lowered a distance s along the Y-axis and released. The spring then begins to oscillate. Let us describe the motion of the spring. At any time t ≥ 0, let F(t) denote the force acting on the weight and y(t) denote the coordinate of the weight along the Y-axis. For example, y(0) = −s. The second derivative of y with respect to time, y''(t), is the acceleration of the weight at time t; hence Newton's second law states that

F(t) = my''(t).    (1)

It is reasonable to assume that the force acting on the weight is totally due to the tension of the spring and that this force satisfies Hooke's law: The force acting on the weight is proportional to its displacement from the equilibrium position, but acts in the opposite direction. If k > 0 is the proportionality constant, then Hooke's law states that

F(t) = −ky(t).    (2)

Combining (1) and (2), we obtain my'' = −ky, or

y'' = −(k/m)y.    (3)

The expression (3) is an example of a "differential equation." A differential equation in an unknown function y = y(t) is an equation involving y, t, and derivatives of y. If the differential equation is of the form

a_n y^(n) + a_{n−1} y^(n−1) + ⋯ + a_1 y^(1) + a_0 y = f,    (4)

where a_0, a_1, ..., a_n and f are functions of t and y^(k) denotes the kth derivative of y, then the equation is said to be linear. The functions a_i are called the coefficients of the differential equation (4). Thus (3) is an example of a linear differential equation in which the coefficients are constants and the function f is identically zero. When the function f in (4) is identically zero, the equation is called homogeneous.

110   Chap. 2   Linear Transformations and Matrices

In this section we apply the linear algebra we have studied to solve homogeneous linear differential equations with constant coefficients. If a_n ≠ 0, we say that the differential equation is of order n. In this case we divide both sides of (4) by a_n to obtain a new, but equivalent, equation with leading coefficient 1, where b_i = a_i/a_n for i = 0, 1, ..., n − 1. Because of this observation, we shall always assume that the coefficient a_n in (4) is 1. A solution to (4) is a function that, when substituted for y, reduces (4) to an identity.
Example 1

The function y(t) = sin √(k/m) t is a solution to (3) since

y''(t) = −(k/m) sin √(k/m) t = −(k/m) y(t)

for all t. Notice, however, that substituting y(t) = t into (3) yields

y''(t) = 0 ≠ −(k/m) t,

which is not identically zero. Thus y(t) = t is not a solution to (3).
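The verification in Example 1 can also be spot-checked numerically. The sketch below is a supplement to the text: it codes the exact second derivative of sin(√(k/m) t) by hand and checks that the residual y'' + (k/m)y vanishes at sample points, while y(t) = t leaves a nonzero residual. The values k = 2 and m = 1 are illustrative choices, not from the text.

```python
import math

k, m = 2.0, 1.0
omega = math.sqrt(k / m)

def y(t):
    return math.sin(omega * t)

def y_second(t):
    # Exact second derivative of sin(omega * t).
    return -omega**2 * math.sin(omega * t)

# y'' + (k/m) y vanishes identically for the sine solution...
residuals = [y_second(t) + (k / m) * y(t) for t in (0.0, 0.5, 1.0, 2.5)]
print(residuals)

# ...but for y(t) = t we get y'' = 0, so the residual is (k/m) t != 0.
bad_residual = 0.0 + (k / m) * 1.0
print(bad_residual)
```

Because −omega**2 = −(k/m), the sine residuals are zero to machine precision, mirroring the identity checked symbolically in the example.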

While attempting to solve differential equations, we will discover that it is useful to view solutions as complex-valued functions of a real variable even though the solutions that are meaningful to us in a physical sense are real-valued functions of a real variable. The convenience of this viewpoint will become clear later. Thus we will be concerned with the vector space F(R, C) (as defined in Example 3 of Section 1.2). In order to consider complex-valued functions of a real variable as solutions to differential equations, we must define what it means to differentiate such functions. Given a complex-valued function x ∈ F(R, C) of a real variable t, there exist unique real-valued functions x_1 and x_2 of t such that

    x(t) = x_1(t) + i x_2(t) for t ∈ R,

where i is the imaginary number such that i² = -1. We say that x_1 is the real part and x_2 is the imaginary part of x.

Definition. Given a function x ∈ F(R, C) with real part x_1 and imaginary part x_2, we say that x is differentiable if x_1 and x_2 are differentiable. If x is differentiable, we define the derivative x' of x by

    x' = x_1' + i x_2'.
Sec. 2.7 Homogeneous Linear Differential Equations with Constant Coefficients

In Example 2 we illustrate some computations with complex-valued functions.
Example 2
If x(t) = cos 2t + i sin 2t, then

    x'(t) = -2 sin 2t + i(2 cos 2t)

and

    x²(t) = (cos 2t + i sin 2t)² = (cos² 2t - sin² 2t) + i(2 sin 2t cos 2t) = cos 4t + i sin 4t.

We next find the real and imaginary parts of x². Since x²(t) = cos 4t + i sin 4t, the real part of x²(t) is cos 4t, and the imaginary part is sin 4t.
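The computations in Example 2 can be confirmed numerically. The following sketch is an added illustration, not from the text; it models x(t) as a Python complex number and approximates x'(t) by a difference quotient:

```python
import math

def x(t):
    # x(t) = cos 2t + i sin 2t
    return complex(math.cos(2 * t), math.sin(2 * t))

t = 0.7

# the square should have real part cos 4t and imaginary part sin 4t
sq = x(t) ** 2
assert abs(sq.real - math.cos(4 * t)) < 1e-12
assert abs(sq.imag - math.sin(4 * t)) < 1e-12

# the derivative should be -2 sin 2t + i(2 cos 2t)
h = 1e-6
dx = (x(t + h) - x(t - h)) / (2 * h)
assert abs(dx.real - (-2 * math.sin(2 * t))) < 1e-6
assert abs(dx.imag - 2 * math.cos(2 * t)) < 1e-6
```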

The following theorem indicates that we may limit our investigations to a vector space considerably smaller than F(R, C). Its proof, which is illustrated by Example 3, involves a simple induction argument, which we omit.

Theorem 2.28. Any solution to a homogeneous linear differential equation with constant coefficients has derivatives of all orders; that is, if x is a solution to such an equation, then x^(k) exists for every positive integer k.
Example 3
As an illustration of Theorem 2.28, consider the equation

    y^(2) + 4y = 0.

Clearly, to qualify as a solution, a function y must have two derivatives. If y is a solution, however, then

    y^(2) = -4y.

Thus, since y^(2) is a constant multiple of a function y that has two derivatives, y^(2) must have two derivatives, so y^(4) exists. Since y^(4) is a constant multiple of a function that we have shown has at least two derivatives, it also has at least two derivatives, and hence y^(6) exists. Continuing in this manner, we can show that any solution has derivatives of all orders.

Definition. We use C^∞ to denote the set of all functions in F(R, C) that have derivatives of all orders.

It is a simple exercise to show that C^∞ is a subspace of F(R, C) and hence a vector space over C. In view of Theorem 2.28, it is this vector space that is of interest to us. For x ∈ C^∞, the derivative x' of x also lies in C^∞.
We can use the derivative operation to define a mapping D: C^∞ → C^∞ by

    D(x) = x' for x ∈ C^∞.

It is easy to show that D is a linear transformation. More generally, consider any polynomial over C of the form

    p(t) = a_n t^n + a_{n-1} t^{n-1} + ... + a_1 t + a_0.

Then if we define

    p(D) = a_n D^n + a_{n-1} D^{n-1} + ... + a_1 D + a_0 I,

p(D) is a linear transformation on C^∞ (see Appendix E).

Definitions. For any polynomial p(t) over C of positive degree, p(D) is called a differential operator. The order of the differential operator p(D) is the degree of the polynomial p(t).

Differential operators are useful since they provide us with a means of reformulating a differential equation in the context of linear algebra. Any homogeneous linear differential equation with constant coefficients,

    y^(n) + a_{n-1} y^(n-1) + ... + a_1 y^(1) + a_0 y = 0,

can be rewritten by means of differential operators as

    (D^n + a_{n-1} D^{n-1} + ... + a_1 D + a_0 I)(y) = 0.

Definition. Given the differential equation above, the complex polynomial

    p(t) = t^n + a_{n-1} t^{n-1} + ... + a_1 t + a_0

is called the auxiliary polynomial associated with the equation.

For example, (3) has the auxiliary polynomial

    p(t) = t² + (k/m).

Any homogeneous linear differential equation with constant coefficients can be rewritten as

    p(D)(y) = 0,

where p(t) is the auxiliary polynomial associated with the equation. Clearly, this observation implies the following theorem.
Theorem 2.29. The set of all solutions to a homogeneous linear differential equation with constant coefficients coincides with the null space of p(D), where p(t) is the auxiliary polynomial associated with the equation.

Corollary. The set of all solutions to a homogeneous linear differential equation with constant coefficients is a subspace of C^∞.

In view of the above corollary, we call the set of solutions to a homogeneous linear differential equation the solution space of the equation. A practical way of describing such a space is to find a basis for it. We will now examine a certain class of functions that is of use in finding bases for these solution spaces.

For a real number s, we are familiar with the real number e^s, where e is the unique real number whose natural logarithm is 1 (that is, ln(e) = 1). We know, for instance, certain properties of exponentiation, namely

    e^{s+t} = e^s e^t and e^{-s} = 1/e^s

for any real numbers s and t. We now extend the definition of powers of e to include complex numbers in such a way that these properties remain true.

Definition. Let c = a + ib be a complex number with real part a and imaginary part b. Define

    e^c = e^a (cos b + i sin b).

For example, for c = 2 + i(π/3),

    e^c = e² (cos(π/3) + i sin(π/3)).

Clearly, if c is real (b = 0), we obtain the usual result: e^c = e^a. It can be shown by the use of trigonometric identities that

    e^{c+d} = e^c e^d and e^{-c} = 1/e^c

for any complex numbers c and d.

Definition. Let c be a complex number. The function f: R → C defined by f(t) = e^{ct} for all t in R is called an exponential function.

The derivative of an exponential function, as described in the following theorem, is as one would expect. The proof involves a straightforward computation, which we leave as an exercise.

Theorem 2.30. For any exponential function f(t) = e^{ct}, f'(t) = c e^{ct}.

We will use exponential functions to describe all solutions to a homogeneous linear differential equation of order 1. (Recall that the order of such an equation is the degree of its auxiliary polynomial.) Thus an equation of order 1 is of the form

    y' + a_0 y = 0.    (5)

Theorem 2.31. The solution space for (5) is of dimension 1 and has {e^{-a_0 t}} as a basis.
Proof. Clearly (5) has e^{-a_0 t} as a solution. Suppose that x(t) is any solution to (5). Then

    x'(t) = -a_0 x(t) for all t ∈ R.

Define

    z(t) = e^{a_0 t} x(t).

Differentiating z yields

    z'(t) = (e^{a_0 t})' x(t) + e^{a_0 t} x'(t) = a_0 e^{a_0 t} x(t) - a_0 e^{a_0 t} x(t) = 0.

(Notice that the familiar product rule for differentiation holds for complex-valued functions of a real variable. A justification of this involves a lengthy, although direct, computation.)

Since z' is identically zero, z is a constant function. (Again, this fact, well known for real-valued functions, is also true for complex-valued functions. The proof, which relies on the real case, involves looking separately at the real and imaginary parts of z.) Thus there exists a complex number c such that

    z(t) = e^{a_0 t} x(t) = c for all t ∈ R.

So

    x(t) = c e^{-a_0 t}.

We conclude that any solution to (5) is a linear combination of e^{-a_0 t}.

Another way of formulating Theorem 2.31 is as follows.

Corollary. For any complex number c, the null space of the differential operator D - cI has {e^{ct}} as a basis.
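The complex exponential defined above and the derivative formula of Theorem 2.30 can be checked against Python's built-in complex exponential. This sketch is an added illustration, not from the text; the values of a, b, d, and t0 are arbitrary choices:

```python
import cmath
import math

# e^{a+ib} = e^a (cos b + i sin b): compare the definition with cmath.exp
a, b = 1.3, -0.8
c = complex(a, b)
lhs = cmath.exp(c)
rhs = math.exp(a) * complex(math.cos(b), math.sin(b))
assert abs(lhs - rhs) < 1e-12

# e^{c+d} = e^c e^d for complex c and d
d = complex(-0.4, 2.1)
assert abs(cmath.exp(c + d) - cmath.exp(c) * cmath.exp(d)) < 1e-12

# Theorem 2.30: (e^{ct})' = c e^{ct}, checked by a difference quotient
f = lambda t: cmath.exp(c * t)
t0, h = 0.5, 1e-6
dq = (f(t0 + h) - f(t0 - h)) / (2 * h)
assert abs(dq - c * f(t0)) < 1e-5
```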

We next concern ourselves with differential equations of order greater than one. Given an nth-order homogeneous linear differential equation with constant coefficients,

    y^(n) + a_{n-1} y^(n-1) + ... + a_1 y^(1) + a_0 y = 0,

its auxiliary polynomial

    p(t) = t^n + a_{n-1} t^{n-1} + ... + a_1 t + a_0

factors into a product of polynomials of degree 1:

    p(t) = (t - c_1)(t - c_2) ... (t - c_n),

where c_1, c_2, ..., c_n are (not necessarily distinct) complex numbers. (This follows from the fundamental theorem of algebra in Appendix D.) Thus

    p(D) = (D - c_1 I)(D - c_2 I) ... (D - c_n I).

Now the operators D - c_i I commute, and so by Exercise 9 we have that

    N(D - c_i I) ⊆ N(p(D)) for all i.

Since the solutions to the differential equation coincide with N(p(D)), we can conclude the following result.

Theorem 2.32. Let p(t) be the auxiliary polynomial for a homogeneous linear differential equation with constant coefficients. For any complex number c, if c is a zero of p(t), then e^{ct} is a solution to the differential equation.

Example 4
Given the differential equation

    y'' - 3y' + 2y = 0,

its auxiliary polynomial is

    p(t) = t² - 3t + 2 = (t - 1)(t - 2).

Hence c = 1 and c = 2 are zeros of p(t). Thus by Theorem 2.32, e^t and e^{2t} are solutions to the differential equation.
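As a numerical illustration of Example 4 (added here, not part of the text), one can verify by finite differences that e^t and e^{2t} satisfy y'' - 3y' + 2y = 0, while an exponential whose exponent is not a zero of p(t) leaves a nonzero residual:

```python
import math

def residual(c, t, h=1e-5):
    # residual of y'' - 3y' + 2y at time t for y(s) = e^{cs},
    # using central finite differences for the derivatives
    y = lambda s: math.exp(c * s)
    d1 = (y(t + h) - y(t - h)) / (2 * h)
    d2 = (y(t + h) - 2 * y(t) + y(t - h)) / h**2
    return d2 - 3 * d1 + 2 * y(t)

# c = 1 and c = 2 are the zeros of t^2 - 3t + 2, so both exponentials solve it
for c in (1, 2):
    for t in (0.0, 0.5, 1.0):
        assert abs(residual(c, t)) < 1e-4

# c = 3 is not a zero: at t = 0 the residual is 9 - 9 + 2 = 2
assert abs(residual(3, 0.0)) > 1.0
```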


It is a simple matter to show that {e^t, e^{2t}} is linearly independent. Thus, if we can show that the solution space of the equation in Example 4 is two-dimensional, we can conclude that {e^t, e^{2t}} is a basis for the solution space. This result follows from the following theorem.

Theorem 2.33. For any differential operator p(D) of order n, the null space of p(D) is an n-dimensional subspace of C^∞.

As a preliminary to the proof of Theorem 2.33, we must establish two lemmas.

Lemma 1. The differential operator D - cI: C^∞ → C^∞ is onto for any complex number c.
Proof. Let x ∈ C^∞. We wish to find a function y ∈ C^∞ such that (D - cI)y = x. Define a function w by

    w(t) = x(t) e^{-ct} for t ∈ R.

Clearly, w ∈ C^∞ because x and e^{-ct} lie in C^∞. Let w_1 and w_2 be the real and imaginary parts of w, respectively. Since w_1 and w_2 are continuous, they have antiderivatives, say W_1 and W_2, respectively. Define W: R → C by

    W(t) = W_1(t) + i W_2(t) for t ∈ R.

Then W ∈ C^∞, and the real and imaginary parts of W are W_1 and W_2, respectively. Clearly, W_1' = w_1 and W_2' = w_2, so W' = w. Finally, define y: R → C by

    y(t) = W(t) e^{ct} for t ∈ R.

Then y ∈ C^∞, and

    (D - cI)y(t) = y'(t) - c y(t) = [W'(t) e^{ct} + W(t) c e^{ct}] - c W(t) e^{ct} = w(t) e^{ct} = x(t) e^{-ct} e^{ct} = x(t).

Thus (D - cI)y = x.
Lemma 2. Let V be a vector space, and suppose that T and U are linear operators on V such that U is onto and the null spaces of T and U are finite-dimensional. Then the null space of TU is finite-dimensional, and

    dim(N(TU)) = dim(N(T)) + dim(N(U)).

Proof. Let p = dim(N(T)), q = dim(N(U)), and let {u_1, u_2, ..., u_p} and {v_1, v_2, ..., v_q} be bases for N(T) and N(U), respectively. Since U is onto, we can choose for each i (i = 1, ..., p) an element w_i in V such that U(w_i) = u_i. In this way we obtain a set of p distinct elements {w_1, w_2, ..., w_p}. Note that for any i and j, w_i ≠ v_j, for otherwise u_i = U(w_i) = U(v_j) = 0, a contradiction. Hence the set

    β = {w_1, w_2, ..., w_p, v_1, ..., v_q}

contains p + q distinct elements. To prove the lemma, it suffices to show that β is a basis for N(TU).

We first show that β ⊆ N(TU). Since TU(v_j) = T(U(v_j)) = T(0) = 0 and TU(w_i) = T(U(w_i)) = T(u_i) = 0 for all i and j, both the w_i's and the v_j's lie in N(TU).

Now suppose that v ∈ N(TU). Then 0 = TU(v) = T(U(v)). Thus U(v) ∈ N(T). So there exist scalars a_1, a_2, ..., a_p such that

    U(v) = a_1 u_1 + a_2 u_2 + ... + a_p u_p = U(a_1 w_1 + a_2 w_2 + ... + a_p w_p).

Hence

    U(v - (a_1 w_1 + a_2 w_2 + ... + a_p w_p)) = 0.

We conclude that v - (a_1 w_1 + a_2 w_2 + ... + a_p w_p) lies in N(U). It follows that there exist scalars b_1, b_2, ..., b_q such that

    v - (a_1 w_1 + a_2 w_2 + ... + a_p w_p) = b_1 v_1 + b_2 v_2 + ... + b_q v_q,

or

    v = a_1 w_1 + a_2 w_2 + ... + a_p w_p + b_1 v_1 + b_2 v_2 + ... + b_q v_q.

Therefore, β spans N(TU).

We next show that β is linearly independent. Let a_1, a_2, ..., a_p, b_1, b_2, ..., b_q be scalars such that

    a_1 w_1 + a_2 w_2 + ... + a_p w_p + b_1 v_1 + b_2 v_2 + ... + b_q v_q = 0.    (6)

Applying U to both sides of (6), we obtain

    a_1 u_1 + a_2 u_2 + ... + a_p u_p = 0.

Since {u_1, u_2, ..., u_p} is linearly independent, the a_i's are all zero. Thus (6) reduces to

    b_1 v_1 + b_2 v_2 + ... + b_q v_q = 0.

Again, the linear independence of {v_1, v_2, ..., v_q} implies that the b_i's are all zero. We conclude that β is linearly independent and hence is a basis for N(TU). Thus N(TU) is finite-dimensional with dim(N(TU)) = p + q = dim(N(T)) + dim(N(U)).

Proof of Theorem 2.33. The proof is by mathematical induction on the order n of the differential operator p(D). The first-order case coincides with the corollary to Theorem 2.31. For some integer n > 1, suppose that Theorem 2.33 holds for any differential operator of order less than n, and suppose that we are given a differential operator p(D) of order n. The polynomial p(t) can be factored into a product of two polynomials as

    p(t) = q(t)(t - c)

for some polynomial q(t) of degree n - 1 and some complex number c. Thus the given differential operator may be rewritten as

    p(D) = q(D)(D - cI).

Now by Lemma 1, D - cI is onto; by the corollary to Theorem 2.31, dim(N(D - cI)) = 1; and by the induction hypothesis, dim(N(q(D))) = n - 1. Thus, by applying Lemma 2, we conclude that

    dim(N(p(D))) = dim(N(q(D))) + dim(N(D - cI)) = (n - 1) + 1 = n.

Corollary. For any nth-order homogeneous linear differential equation with constant coefficients, the solution space is an n-dimensional subspace of C^∞.

The corollary to Theorem 2.33 reduces the problem of finding all solutions to an nth-order homogeneous linear differential equation with constant coefficients to the problem of finding a set of n linearly independent solutions to the equation. By the results of Chapter 1 any such set must be a basis for the solution space. The following theorem enables us to find a basis quickly for many such equations. Hints for its proof are provided in the exercises.

Theorem 2.34. Given n distinct complex numbers c_1, c_2, ..., c_n, the set of exponential functions {e^{c_1 t}, e^{c_2 t}, ..., e^{c_n t}} is linearly independent.

Proof. Exercise.

Corollary. For any nth-order homogeneous linear differential equation with constant coefficients, if the auxiliary polynomial p(t) has n distinct zeros c_1, c_2, ..., c_n, then {e^{c_1 t}, e^{c_2 t}, ..., e^{c_n t}} is a basis for the solution space.
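One concrete way to see the independence asserted by Theorem 2.34 (an illustration added here, not from the text): the matrix whose kth row holds the kth derivatives of the functions e^{c_i t} evaluated at t = 0 is the Vandermonde matrix of the exponents c_i, and it is invertible exactly when the c_i are distinct. A small pure-Python sketch:

```python
from itertools import permutations

def det(m):
    # Leibniz-formula determinant; adequate for the tiny matrices used here
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
        sign = -1 if inv % 2 else 1
        prod = 1
        for i, j in enumerate(perm):
            prod *= m[i][j]
        total += sign * prod
    return total

def wronskian_at_zero(cs):
    # row k holds the kth derivative of each e^{c_i t} at t = 0, namely c_i**k,
    # so this is the Vandermonde matrix of the exponents
    n = len(cs)
    return [[c**k for c in cs] for k in range(n)]

assert det(wronskian_at_zero([1, 2, 3])) != 0   # distinct exponents: nonzero Wronskian
assert det(wronskian_at_zero([1, 2, 2])) == 0   # repeated exponent: singular matrix
```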

Example 5
We will find all solutions to the differential equation

    y'' + 5y' + 4y = 0.

Since the auxiliary polynomial factors as

    p(t) = (t + 4)(t + 1),

p(t) has two distinct zeros, -1 and -4. Thus {e^{-t}, e^{-4t}} is a basis for the solution space. So any solution to the given equation is of the form

    y(t) = b_1 e^{-t} + b_2 e^{-4t}

for some constants b_1 and b_2.
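A numerical check of Example 5 (an added sketch, not from the text; the constants b_1 and b_2 below are arbitrary):

```python
import math

# zeros of the auxiliary polynomial t^2 + 5t + 4 by the quadratic formula
a, b, c = 1, 5, 4
disc = math.sqrt(b * b - 4 * a * c)
roots = sorted([(-b - disc) / (2 * a), (-b + disc) / (2 * a)])
assert roots == [-4.0, -1.0]

# any combination b1 e^{-t} + b2 e^{-4t} should satisfy y'' + 5y' + 4y = 0
b1, b2 = 2.5, -0.7
y = lambda t: b1 * math.exp(-t) + b2 * math.exp(-4 * t)
h, t = 1e-5, 0.9
d1 = (y(t + h) - y(t - h)) / (2 * h)
d2 = (y(t + h) - 2 * y(t) + y(t - h)) / h**2
assert abs(d2 + 5 * d1 + 4 * y(t)) < 1e-4
```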

Example 6
We will find all solutions to the differential equation

    y'' + 9y = 0.

The auxiliary polynomial p(t) = t² + 9 factors as (t - 3i)(t + 3i) and hence has distinct zeros c_1 = 3i and c_2 = -3i. Thus {e^{3it}, e^{-3it}} is a basis for the solution space. A more useful basis is obtained by applying Exercise 7. Since

    cos 3t = (1/2)(e^{3it} + e^{-3it}) and sin 3t = (1/2i)(e^{3it} - e^{-3it}),

it follows that {cos 3t, sin 3t} is also a basis. This basis has an advantage over the original one in that it consists of the familiar sine and cosine functions and makes no reference to the imaginary number i.

Next consider the differential equation

    y'' + 2y' + y = 0,

for which the auxiliary polynomial is p(t) = (t + 1)². By Theorem 2.32, e^{-t} is a solution to this equation. By the corollary to Theorem 2.33, its solution space is two-dimensional. In order to find a basis for the solution space, we need a solution to the equation that is linearly independent of e^{-t}. The reader can verify that te^{-t} will do. Thus {e^{-t}, te^{-t}} is a basis for the solution space. This result can be generalized as follows.
Theorem 2.35. Let the auxiliary polynomial of a homogeneous linear differential equation with constant coefficients be p(t) = (t - c)^n, where c is a complex number and n is a positive integer. Then the set

    β = {e^{ct}, te^{ct}, ..., t^{n-1} e^{ct}}

is a basis for the solution space of the equation.

Proof. Since the solution space is n-dimensional, we need only show that β is linearly independent and lies in the solution space. First, observe that for any positive integer k,

    (D - cI)(t^k e^{ct}) = k t^{k-1} e^{ct} + c t^k e^{ct} - c t^k e^{ct} = k t^{k-1} e^{ct}.

Hence for k < n,

    (D - cI)^n (t^k e^{ct}) = 0.

It follows that β is a subset of the solution space.

We next show that β is linearly independent. Consider any linear combination of the elements of β that equals the zero function, say

    b_1 t^{n-1} e^{ct} + b_2 t^{n-2} e^{ct} + ... + b_n e^{ct} = 0    (7)

for some scalars b_1, ..., b_n. Dividing by e^{ct} in (7), we obtain

    b_1 t^{n-1} + b_2 t^{n-2} + ... + b_{n-1} t + b_n = 0.    (8)

Thus the left-hand side of (8) must be the zero polynomial function. Hence the coefficients b_1, b_2, ..., b_n are all zero. We conclude that β is linearly independent and hence is a basis for the solution space.
Example 7
Given the differential equation

    y^(4) - 4y^(3) + 6y^(2) - 4y^(1) + y = 0,

we wish to find a basis for its solution space. Since its auxiliary polynomial is

    p(t) = t⁴ - 4t³ + 6t² - 4t + 1 = (t - 1)⁴,

we can immediately conclude by Theorem 2.35 that {e^t, te^t, t²e^t, t³e^t} is a basis for the solution space. So any solution y to the given equation is of the form

    y(t) = b_1 e^t + b_2 t e^t + b_3 t² e^t + b_4 t³ e^t

for some scalars b_1, b_2, b_3, and b_4.
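The factorization in Example 7 can be verified by expanding (t - 1)⁴ with a small polynomial-multiplication routine (an added sketch, not from the text):

```python
def poly_mul(p, q):
    # multiply polynomials given as coefficient lists, lowest degree first
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

factor = [-1, 1]             # the polynomial t - 1
p = [1]
for _ in range(4):
    p = poly_mul(p, factor)  # build (t - 1)^4

# coefficients of t^4 - 4t^3 + 6t^2 - 4t + 1, lowest degree first
assert p == [1, -4, 6, -4, 1]
```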

The most general situation is stated in the following theorem (whose proof we leave as an exercise).

Theorem 2.36. Given a homogeneous linear differential equation with constant coefficients whose auxiliary polynomial is

    p(t) = (t - c_1)^{n_1} (t - c_2)^{n_2} ... (t - c_k)^{n_k},

where n_1, n_2, ..., n_k are positive integers and c_1, c_2, ..., c_k are distinct complex numbers, the following set is a basis for the solution space of the equation:

    {e^{c_1 t}, t e^{c_1 t}, ..., t^{n_1 - 1} e^{c_1 t}, ..., e^{c_k t}, t e^{c_k t}, ..., t^{n_k - 1} e^{c_k t}}.

Example 8
Consider the differential equation

    y^(3) - 4y^(2) + 5y^(1) - 2y = 0.

We will find a basis for its solution space. Since the auxiliary polynomial factors as

    p(t) = t³ - 4t² + 5t - 2 = (t - 1)²(t - 2),

we conclude from Theorem 2.36 that a basis for the solution space is

    {e^t, t e^t, e^{2t}}.

Thus any solution to the given differential equation is of the form

    y(t) = b_1 e^t + b_2 t e^t + b_3 e^{2t}

for some scalars b_1, b_2, and b_3.
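Similarly, the factorization used in Example 8 can be verified by polynomial multiplication (an added sketch, not from the text):

```python
def poly_mul(p, q):
    # multiply polynomials given as coefficient lists, lowest degree first
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# (t - 1)^2 (t - 2), built from the factors [-1, 1] and [-2, 1]
p = poly_mul(poly_mul([-1, 1], [-1, 1]), [-2, 1])

# matches the auxiliary polynomial t^3 - 4t^2 + 5t - 2, lowest degree first
assert p == [-2, 5, -4, 1]
```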

EXERCISES

1. Label the following statements as being true or false.
(a) The set of solutions to an nth-order homogeneous linear differential equation with constant coefficients is an n-dimensional subspace of C^∞.
(b) The solution space of a homogeneous linear differential equation is the null space of a differential operator.
(c) The auxiliary polynomial of a homogeneous linear differential equation with constant coefficients is a solution to the differential equation.
(d) Any solution to a homogeneous linear differential equation with constant coefficients is of the form a e^{ct} or a t^k e^{ct}, where a and c are complex numbers and k is a positive integer.
(e) Any linear combination of solutions to a given homogeneous linear differential equation with constant coefficients is also a solution to the given equation.
(f) For any homogeneous linear differential equation with constant coefficients having auxiliary polynomial p(t), if c_1, c_2, ..., c_k are the distinct zeros of p(t), then {e^{c_1 t}, e^{c_2 t}, ..., e^{c_k t}} is a basis for the solution space of the given differential equation.
(g) Given any polynomial p(t) ∈ P(C), there exists a homogeneous linear differential equation with constant coefficients whose auxiliary polynomial is p(t).

2. For each of the following, determine whether the statement is true or false. Justify your claim with either a proof or a counterexample, whichever is appropriate.
(a) Any finite-dimensional subspace of C^∞ is the solution space of a homogeneous linear differential equation with constant coefficients.
(b) There exists a homogeneous linear differential equation with constant coefficients whose solution space has {t, t²} as a basis.
(c) For any homogeneous linear differential equation with constant coefficients, if x is a solution to the equation, so is its derivative x'.
Given two polynomials p(t) and q(t) in P(C), if x ∈ N(p(D)) and y ∈ N(q(D)), then:
(d) x + y ∈ N(p(D)q(D)).
(e) xy ∈ N(p(D)q(D)).
3. Find bases for the solution spaces of the following differential equations.
(a) y'' + 2y' + y = 0
(b) y''' = y'
(c) y^(4) - 2y^(2) + y = 0
(d) y^(2) + 3y^(1) + 5y = 0
(e) y^(4) - y^(2) = 0

4. Find bases for the following subspaces of C^∞.
(a) N(D² - D - I)
(b) N(D³ - 3D² + 3D - I)
(c) N(D³ + 6D² + 8D)

5. Show that C^∞ is a subspace of F(R, C).

6. (a) Show that D: C^∞ → C^∞ is a linear transformation.
(b) Show that any differential operator is a linear transformation on C^∞.
7. Prove that if {x, y} is a basis for a vector space over C, then so is {(1/2)(x + y), (1/2i)(x - y)}.

8. Given a second-order homogeneous linear differential equation with constant coefficients, suppose that the auxiliary polynomial has distinct complex conjugate roots a + ib and a - ib, where a, b ∈ R. Show that {e^{at} cos bt, e^{at} sin bt} is a basis for the solution space.

9. Given a collection {U_1, U_2, ..., U_n} of pairwise commutative linear transformations of a vector space (i.e., transformations such that U_i U_j = U_j U_i for all i, j), prove that for any i = 1, 2, ..., n,

    N(U_i) ⊆ N(U_1 U_2 ... U_n).
10. Prove Theorem 2.34 and its corollary. Hint: Suppose that

    b_1 e^{c_1 t} + b_2 e^{c_2 t} + ... + b_n e^{c_n t} = 0

(where the c_i's are distinct). To show the b_i's are zero, apply mathematical induction on n. First establish the theorem for n = 1. Assuming that the theorem is true for n - 1 functions, apply the operator D - c_n I to both sides of the equation above to establish the theorem for n distinct exponential functions.

11. Prove Theorem 2.36. Hint: First verify that the alleged basis lies in the solution space. Then verify that this set is linearly independent by mathematical induction on k. The case k = 1 is Theorem 2.35. Assuming that the theorem holds for k - 1 distinct c_i's, apply the operator (D - c_k I)^{n_k} to any linear combination of the alleged basis that equals 0.

12. Let V be the solution space of an nth-order homogeneous linear differential equation with constant coefficients having auxiliary polynomial p(t). Prove that if p(t) = g(t)h(t), where g(t) and h(t) are polynomials of positive degree, then

    N(h(D_V)) = R(g(D_V)) = g(D)(V),

where D_V: V → V is defined by D_V(x) = x' for x ∈ V. Hint: First prove that g(D)(V) ⊆ N(h(D)). Then prove that the two spaces have the same finite dimension.
13. A differential equation

    y^(n) + a_{n-1} y^(n-1) + ... + a_1 y^(1) + a_0 y = x

is called a nonhomogeneous linear differential equation with constant coefficients if the coefficients a_i are constant and x is a function that is not identically zero.
(a) Prove that for any x ∈ C^∞ there exists y ∈ C^∞ such that y is a solution to the differential equation above. Hint: Use Lemma 1 to Theorem 2.33 to show that if

    p(t) = t^n + a_{n-1} t^{n-1} + ... + a_1 t + a_0,

then p(D): C^∞ → C^∞ is onto.
(b) Let V be the solution space for the homogeneous linear equation

    y^(n) + a_{n-1} y^(n-1) + ... + a_1 y^(1) + a_0 y = 0.

Prove that if z is any solution to the nonhomogeneous linear differential equation above, then the set of all solutions to the nonhomogeneous equation is

    {z + y : y ∈ V}.

14. Given any nth-order homogeneous linear differential equation with constant coefficients, prove that, for any solution x and any t_0 ∈ R, if x(t_0) = x'(t_0) = ... = x^(n-1)(t_0) = 0, then x = 0 (the zero function). Hint: Use mathematical induction on n. First prove the conclusion for the case n = 1. Next suppose that it is true for equations of order n - 1, and consider an nth-order equation with auxiliary polynomial p(t). Factor

    p(t) = q(t)(t - c)

for some polynomial q(t) of degree n - 1 and some complex number c, and define z = q(D)x. Show that z(t_0) = 0 and z' - cz = 0 to conclude that z = 0. Now apply the induction hypothesis.

15. Let V be the solution space of an nth-order homogeneous linear differential equation with constant coefficients. Fix t_0 ∈ R, and define a mapping Φ: V → C^n by

    Φ(x) = (x(t_0), x'(t_0), ..., x^(n-1)(t_0)) for each x in V.

(a) Prove that Φ is linear and that its null space is trivial. Deduce that Φ is an isomorphism. Hint: Use Exercise 14.
(b) Prove the following: For any nth-order homogeneous linear differential equation with constant coefficients, any t_0 ∈ R, and any complex numbers c_0, c_1, ..., c_{n-1} (not necessarily distinct), there exists exactly one solution, x, to the given differential equation such that x(t_0) = c_0 and x^(k)(t_0) = c_k for k = 1, ..., n - 1.

16. Pendular Motion. It is well known that the motion of a pendulum is approximated by the differential equation

    θ'' + (g/l)θ = 0,

where θ(t) is the angle in radians that the pendulum makes with a vertical line at time t (see Figure 2.8), interpreted so that θ is positive if the pendulum is to the right and negative if the pendulum is to the left of the vertical line as viewed by the reader. Here l is the length of the pendulum and g is the magnitude of acceleration due to gravity. The variable t and the constants l and g must be in compatible units (e.g., t in seconds, l in meters, and g in meters per second per second).
(a) Express an arbitrary solution to this equation as a linear combination of two fixed real-valued functions.
(b) Find the unique solution to the equation that satisfies the conditions

    θ(0) = θ_0 > 0 and θ'(0) = 0.

(The significance of these conditions is that at time t = 0 the pendulum is displaced from the vertical by θ_0 radians and has zero velocity.)
(c) Prove that it takes 2π√(l/g) units of time for the pendulum to make one circuit back and forth. (This time is called the period of the pendulum.)
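For Exercise 16, the claimed solution and period can be sanity-checked numerically. This sketch is an added illustration, not from the text; g, l, and θ_0 below are arbitrary values:

```python
import math

g, l = 9.8, 1.5          # illustrative values in SI units
w = math.sqrt(g / l)
theta0 = 0.1

# candidate solution with theta(0) = theta0 and theta'(0) = 0
theta = lambda t: theta0 * math.cos(w * t)

# check theta'' + (g/l) theta = 0 by a central finite difference
h, t = 1e-5, 0.4
d2 = (theta(t + h) - 2 * theta(t) + theta(t - h)) / h**2
assert abs(d2 + (g / l) * theta(t)) < 1e-4

# the motion repeats after one period 2*pi*sqrt(l/g)
T = 2 * math.pi * math.sqrt(l / g)
assert abs(theta(0.4 + T) - theta(0.4)) < 1e-9
```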
17. Periodic Motion of a Spring with Damping. At the beginning of this section we discussed the motion of an oscillating spring under the assumption that the only force acting on the spring was the force due to the tension of the spring. We found in this case that (3) describes the motion of the spring.
(a) Find the general form of all solutions to (3).

If we analyze the behavior of the general solution in part (a), we see that the solution is a periodic function. Hence (3) indicates that the spring will never stop oscillating. We know from experience, however, that the amplitude of the oscillation will decrease until the motion finally ceases. The reason that the solutions in part (a) do not exhibit this behavior is that we ignored the effect of friction on the moving weight. At low speeds such as those under consideration, the resistance of the air provides an example of viscous damping: the resistance is proportional to the velocity of the moving weight but opposite in direction. To correct for air resistance, we must add the term -ry' to (2), in which the constant r > 0 depends on the medium in which the motion takes place (in this case, air), and the term -ry' has a negative sign because the resistance is always opposite to the direction of the motion. Thus the differential equation of motion is my'' = -ky - ry'; that is,

    my'' + ry' + ky = 0.

(b) Find the general solution to this equation.
(c) Find the unique solution in part (b) that satisfies the initial conditions y(0) = 0 and y'(0) = v_0.
(d) For the solution y(t) in part (c), show that the amplitude of the oscillation decreases to zero; that is, prove that lim_{t→∞} y(t) = 0.

18. At the beginning of this section, it was stated that it is useful to view solutions to differential equations as complex-valued functions of a real variable even though the solutions that are meaningful to us in a physical sense are real-valued. Justify this point of view.

19. The following set of exercises does not involve linear algebra. We list them for the sake of completeness.
(a) Prove Theorem 2.28. Hint: Use mathematical induction on the number of derivatives possessed by a solution.
(b) For any c, d ∈ C, prove that

    e^{c+d} = e^c e^d and e^{-c} = 1/e^c.

(c) Prove Theorem 2.30.
(d) Verify the product rule of differentiation for complex-valued functions of a real variable: For any differentiable functions x and y in F(R, C), xy is differentiable and

    (xy)' = x'y + xy'.

Hint: Find the real and imaginary parts of xy in terms of those of x and y.
(e) Prove that if x ∈ F(R, C) and x' = 0, then x is a constant function.
INDEX OF DEFINITIONS FOR CHAPTER 2

Annihilator 108
Auxiliary polynomial 112
Change of coordinate matrix 95
Clique 81
Coefficients of a differential equation 109
Coordinate function 102
Coordinate vector relative to a basis 66
Differential equation 109
Differential operator 112
Dimension theorem 59
Dominance relation 81
Double dual 102
Dual basis 102
Dual space 102
Exponential function 113
Fourier coefficient 102
Homogeneous linear differential equation 110
Identity matrix 75
Identity transformation 55
Incidence matrix 80
Invariant subspace 64
Inverse of a linear transformation 85
Inverse of a matrix 87
Invertible linear transformation 85
Invertible matrix 87
Isomorphic vector spaces 86
Isomorphism 86
Kronecker delta 75
Left-multiplication transformation 78
Linear functional 101
Linear transformation 67
Matrix representing a linear transformation 73
Nonhomogeneous differential equation 122
Null space of a linear transformation 57
Nullity of a linear transformation 59
Order of a differential equation 110
Ordered basis 66
Product of matrices 86
Projection 64
Range 57
Rank of a linear transformation 87
Reflection 56
Rotation 56
Similar matrices 98
Solution space of a homogeneous differential equation 113
Solution to a differential equation 110
Standard ordered basis for F^n 66
Standard ordered basis for P_n(F) 66
Standard representation of a vector space with respect to a basis 89
Transpose of a linear transformation 104
Zero transformation 55
3 Elementary Matrix Operations and Systems of Linear Equations

This chapter is devoted to two related objectives:

1. the study of certain "rank-preserving" operations on matrices;
2. the application of these operations and the theory of linear transformations to the solution of systems of linear equations.

As a consequence of objective 1, we will obtain a simple method for computing the rank of a linear transformation between finite-dimensional vector spaces by applying these rank-preserving matrix operations to a matrix that represents the transformation.

The solution of systems of linear equations is probably the most important application of linear algebra. The familiar method of elimination for solving systems of linear equations, which was discussed in Section 1.4, involves the elimination of variables so that a simpler system can be obtained. The technique by which the variables are eliminated utilizes three types of operations:

1. interchanging any two equations in the system;
2. multiplying any equation in the system by a nonzero constant;
3. adding a multiple of one equation to another equation.

We will see in Section 3.3 that a system of linear equations can be expressed as a single matrix equation. In this representation of the system, the three operations above are the "elementary row operations" for matrices. These operations will provide a convenient computational method for determining all solutions of a system of linear equations.

3.1 ELEMENTARY MATRIX OPERATIONS AND ELEMENTARY MATRICES

In this section we define the elementary operations that will be used throughout the chapter. In subsequent sections we will use these operations to obtain simple computational methods for determining the rank of a linear transformation and the solution to a system of linear equations. There are two types of elementary matrix operations: row operations and column operations. As we will see, the row operations are more useful. They arise from the three operations that can be used to eliminate variables in a system of linear equations.
Let A be an m x n matrix over a field F. Recall that A can be considered as an array of m rows or as an array of n columns, A = (A^(1), A^(2), ..., A^(n)).

Definitions. Let A be an m x n matrix, as above. Any one of the following three operations on the rows [columns] of A is called an elementary row [column] operation:

(a) interchanging any two rows [columns] of A;
(b) multiplying any row [column] of A by a nonzero constant;
(c) adding any constant multiple of a row [column] of A to another row [column].

Any one of these three operations will be called an elementary operation. Elementary operations are of type 1, type 2, or type 3 depending on whether they are obtained by (a), (b), or (c).
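As a numerical illustration (in Python with NumPy; the helper function names below are ours, not the text's), the three elementary row operations can be sketched as:

```python
import numpy as np

# Sketches of the three elementary row operations (types 1, 2, 3).

def interchange(A, p, q):        # type 1: swap rows p and q
    B = A.copy()
    B[[p, q]] = B[[q, p]]
    return B

def scale(A, p, c):              # type 2: multiply row p by c != 0
    B = A.copy()
    B[p] = c * B[p]
    return B

def add_multiple(A, p, q, c):    # type 3: add c times row q to row p
    B = A.copy()
    B[p] = B[p] + c * B[q]
    return B

A = np.array([[1., 2., 3., 4.],
              [2., 1., -1., 3.],
              [4., 0., 1., 2.]])

B = interchange(A, 0, 1)         # a type 1 operation
C = scale(A, 1, 3.0)             # a type 2 operation
D = add_multiple(A, 0, 2, 4.0)   # a type 3 operation

# Each operation is undone by an operation of the same type:
assert np.allclose(interchange(B, 0, 1), A)
assert np.allclose(scale(C, 1, 1 / 3), A)
assert np.allclose(add_multiple(D, 0, 2, -4.0), A)
```

That each operation is reversed by an operation of the same type is exactly the content of Theorem 3.2 below.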

Example 1

Let

A = ( 1 2  3 4
      2 1 -1 3
      4 0  1 2 ).

Interchanging A_(2) with A_(1) is an operation of type 1. The resulting matrix is

B = ( 2 1 -1 3
      1 2  3 4
      4 0  1 2 ).

Multiplying the second column of A by 3 is an operation of type 2. The resulting matrix is

C = ( 1 6  3 4
      2 3 -1 3
      4 0  1 2 ).

Finally, adding 4 times the third row of A to the first row is an operation of type 3. The resulting matrix is

D = ( 17 2  7 12
       2 1 -1  3
       4 0  1  2 ).

Definition. An n x n elementary matrix is a matrix obtained by performing an elementary operation on I_n. The elementary matrix is said to be of type 1, 2, or 3 according to whether the elementary operation performed on I_n is a type 1, 2, or 3 operation, respectively.

For example, interchanging the first two rows of I_3 produces the elementary matrix

E = ( 0 1 0
      1 0 0
      0 0 1 ).

Note that E can also be obtained by interchanging the first two columns of I_3. In fact, any elementary matrix can be obtained in at least two ways: either by performing an elementary row operation on I_n or by performing an elementary column operation on I_n. Similarly,

( 1 0 -2
  0 1  0
  0 0  1 )

is an elementary matrix since it can be obtained from I_3 by an elementary column operation of type 3 (adding -2 times the first column of I_3 to the third column) or by an elementary row operation of type 3 (adding -2 times the third row to the first row).

Our first theorem shows that performing an elementary row [column] operation on a matrix is equivalent to multiplying the matrix by an elementary matrix.

Theorem 3.1. Let A be a matrix in M_{m x n}(F), and suppose that B is obtained from A by performing an elementary row [column] operation. Then there exists an m x m [n x n] elementary matrix E such that B = EA [B = AE]. In fact, E is obtained from I_m [I_n] by performing the same elementary row [column] operation as that which was performed on A to obtain B. Conversely, if E is an elementary m x m [n x n] matrix, then EA [AE] is the matrix obtained from A by performing the same elementary row [column] operation as that which produces E from I_m [I_n].

The proof, which we omit, requires verifying Theorem 3.1 for each type of elementary row operation. The proof for column operations can then be obtained by using the matrix transpose to transform a column operation into a row operation. The details are left as an exercise.

Example 2

Consider the matrix B in Example 1. In Example 1, B was obtained from A by interchanging the first two rows of A. Performing this same operation on I_3, we obtain the elementary matrix

E = ( 0 1 0
      1 0 0
      0 0 1 ).

Note that EA = B.

In the second part of Example 1, C was obtained from A by multiplying the second column of A by 3. Performing this same operation on I_4, we obtain the elementary matrix

E = ( 1 0 0 0
      0 3 0 0
      0 0 1 0
      0 0 0 1 ).

Observe that AE = C.

It is a useful fact that the inverse of an elementary matrix is also an elementary matrix.
Theorem 3.2. Elementary matrices are invertible, and the inverse of an elementary matrix is an elementary matrix of the same type.

Proof. In view of the fact that any elementary n x n matrix can be obtained by an elementary row operation on I_n, we need consider only three cases, one for each type of operation. Let E be an elementary n x n matrix.

Case 1. Suppose that E is obtained by interchanging the pth and qth rows of I_n (p != q), an elementary row operation of type 1. It is easily shown that E^2 = I_n. Hence E is invertible, and E is its own inverse, an elementary matrix of type 1.

Case 2. Suppose that E is obtained by multiplying the pth row of I_n by a nonzero constant c, an elementary row operation of type 2. Since c != 0, c has a multiplicative inverse, and we may form the elementary matrix E' of type 2 obtained by multiplying the pth row of I_n by c^(-1). It is easily shown that E'E = EE' = I_n. Hence E is invertible, and its inverse is the type 2 elementary matrix E'.

Case 3. Suppose that E is obtained by adding c times the qth row of I_n to the pth row of I_n, where p != q and c is any scalar. Thus E can be obtained from I_n by an elementary row operation of type 3. Observe that I_n can be obtained from E via an elementary row operation of type 3, namely, by adding -c times the qth row of E to the pth row of E. By Theorem 3.1 there is an elementary matrix E' (of type 3) such that E'E = I_n. Thus by Exercise 8 of Section 2.4, E is invertible and its inverse is the type 3 elementary matrix E'.
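Theorems 3.1 and 3.2 can be checked numerically. The following NumPy sketch (the helper name is ours) builds a type 3 elementary matrix, verifies that left multiplication by it performs the corresponding row operation, and verifies that its inverse is an elementary matrix of the same type:

```python
import numpy as np

def elementary_add(n, p, q, c):
    """Type 3 elementary matrix: add c times row q to row p of I_n."""
    E = np.eye(n)
    E[p] += c * E[q]
    return E

A = np.array([[1., 2., 3., 4.],
              [2., 1., -1., 3.],
              [4., 0., 1., 2.]])

# Theorem 3.1: performing the row operation on A directly gives EA.
E = elementary_add(3, 2, 0, -4.0)   # subtract 4 times row 1 from row 3
B = A.copy()
B[2] += -4.0 * B[0]
assert np.allclose(E @ A, B)

# Theorem 3.2: the inverse is the type 3 matrix that adds +4 times row 1.
E_inv = elementary_add(3, 2, 0, 4.0)
assert np.allclose(E @ E_inv, np.eye(3))
assert np.allclose(np.linalg.inv(E), E_inv)
```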

EXERCISES

1. Label the following statements as being true or false.
(a) An elementary matrix is always square.
(b) The only entries of an elementary matrix are zeros and ones.
(c) The n x n identity matrix is an elementary matrix.
(d) The product of two n x n elementary matrices is an elementary matrix.
(e) The inverse of an elementary matrix is an elementary matrix.
(f) The sum of two n x n elementary matrices is an elementary matrix.
(g) The transpose of an elementary matrix is an elementary matrix.
(h) If B is a matrix that can be obtained by performing an elementary row operation on a matrix A, then B can also be obtained by performing an elementary column operation on A.
(i) If B is a matrix that can be obtained by performing an elementary row operation on a matrix A, then A can be obtained by performing an elementary row operation on B.

2. Let

B = ( 1  0 3        C = ( 1  0  3
      1 -2 1              0 -2 -2
      1 -3 1 )            1 -3  1 ).

Find an elementary operation which will transform B into C. Then, by means of several additional elementary operations, transform C into I_3.

3. Prove the assertion made on page 129: any elementary n x n matrix can be obtained in at least two ways, either by performing an elementary row operation on I_n or by performing an elementary column operation on I_n.

4. Prove that E is an elementary matrix if and only if E^t is.

5. Prove that if B can be obtained from A by an elementary row [column] operation, then B^t can be obtained from A^t by the corresponding elementary column [row] operation.

6. Prove Theorem 3.1.

7. Verify that E^2 = I_n for the n x n elementary matrix E of type 1 defined in case 1 of the proof of Theorem 3.2.

8. Verify the assertion made in case 2 of the proof of Theorem 3.2: E'E = EE' = I_n.

9. Prove that any elementary row [column] operation of type 2 can be obtained by dividing some row [column] by a nonzero scalar.

10. Prove that any elementary row [column] operation of type 3 can be obtained by subtracting a multiple of some row [column] from another row [column].

11. Prove that any elementary row [column] operation of type 1 can be obtained by a succession of three elementary row [column] operations of type 3 followed by one elementary row [column] operation of type 2.
3.2 THE RANK OF A MATRIX AND MATRIX INVERSES

In this section we define the rank of a matrix. We will then use elementary operations to compute the rank of a matrix or of a linear transformation. The section concludes with a procedure for computing the inverse of an invertible matrix.

Definition. If A is a matrix in M_{m x n}(F), we define the rank of A, denoted rank(A), to be the rank of the linear transformation L_A: F^n -> F^m.

Many results about the rank of matrices follow immediately from the corresponding facts about linear transformations. An important result of this type, which follows from Theorem 2.5 and the corollary of Theorem 2.19, is that an n x n matrix is invertible if and only if its rank is n.

We would like the definition above to satisfy the condition that the rank of a linear transformation is equal to the rank of any matrix representing the transformation. Our first theorem shows that this condition is, in fact, fulfilled.

Theorem 3.3. Let T: V -> W be a linear transformation between finite-dimensional vector spaces, and let β and γ be ordered bases for V and W, respectively. Then rank(T) = rank([T]_β^γ).

Proof. This is a restatement of Exercise 18 of Section 2.4.

Now that the problem of finding the rank of a linear transformation has been reduced to the problem of finding the rank of a matrix, we need a result that will allow us to perform rank-preserving operations on matrices. The next theorem and its corollary tell us how to do this.

Theorem 3.4. Let A be an m x n matrix. If P and Q are invertible m x m and n x n matrices, respectively, then

(a) rank(AQ) = rank(A),
(b) rank(PA) = rank(A),

and therefore,

(c) rank(PAQ) = rank(A).

Proof. First observe that

R(L_AQ) = R(L_A L_Q) = L_A L_Q(F^n) = L_A(L_Q(F^n)) = L_A(F^n) = R(L_A)

since L_Q is onto. Therefore,

rank(AQ) = dim(R(L_AQ)) = dim(R(L_A)) = rank(A).

This establishes (a). To establish (b), apply Exercise 15 of Section 2.4 to T = L_P. We omit the details. Finally, applying (a) and (b), we obtain

rank(PAQ) = rank(PA) = rank(A).

Corollary. Elementary row and column operations on a matrix are rank-preserving.

Proof. If B is obtained from a matrix A by an elementary row operation, then there exists an elementary matrix E such that B = EA. By Theorem 3.2, E is invertible, and hence rank(B) = rank(A) by Theorem 3.4. The proof that elementary column operations preserve rank is left as an exercise.

Now that we have a class of matrix operations that preserve rank, we need a way of examining a transformed matrix to ascertain its rank. The next theorem is the first of several that enable us to do this.

Theorem 3.5. The rank of any matrix equals the maximum number of its linearly independent columns; that is, the rank of a matrix is the dimension of the subspace generated by its columns.

Proof. For any A in M_{m x n}(F),

rank(A) = rank(L_A) = dim(R(L_A)).

Let β = {e_1, e_2, ..., e_n} be the standard ordered basis for F^n. Then β spans F^n, and hence by Theorem 2.2,

R(L_A) = span(L_A(β)) = span{L_A(e_1), L_A(e_2), ..., L_A(e_n)}.

But we have seen that L_A(e_j) = A^(j). Hence

R(L_A) = span{A^(1), A^(2), ..., A^(n)}.

Thus

rank(A) = dim(R(L_A)) = dim(span{A^(1), A^(2), ..., A^(n)}).

Example 1

Let

A = ( 1 0 1
      0 1 1
      1 0 1 ).
Observe that the first and second columns of A are linearly independent and that the third column is a linear combination of the first two. Thus

rank(A) = dim(span{(1, 0, 1)^t, (0, 1, 0)^t}) = 2.

To compute the rank of a matrix A, it is frequently useful to postpone the use of Theorem 3.5 until A has been suitably modified by means of appropriate elementary row and column operations, so that the number of linearly independent columns is obvious. The corollary to Theorem 3.4 guarantees that the rank of the modified matrix is the same as the rank of A. One such modification of A can be obtained by using elementary row and column operations to introduce zero entries. The following example illustrates this procedure.
Example 2

Let

A = ( 1 2 1
      1 0 3
      1 1 2 ).

If we subtract the first row of A from rows 2 and 3 (type 3 elementary row operations), the result is

( 1  2 1
  0 -2 2
  0 -1 1 ).

If we now subtract twice the first column from the second column and subtract the first column from the third column (type 3 elementary column operations), we obtain

( 1  0 0
  0 -2 2
  0 -1 1 ).

It is now obvious that the maximum number of linearly independent columns of this matrix is 2. Hence the rank of A is 2.

The next theorem uses this process of modifying a matrix by means of elementary row and column operations to transform it into a particularly simple form. The power of this theorem can be seen in its corollaries.

Theorem 3.6. Let A be an m x n matrix of rank r. Then r <= m, r <= n, and, by means of a finite number of elementary row and column operations, A can be transformed into a matrix D such that

(a) D_ii = 1 for i <= r,
(b) D_ij = 0 for i != j, and
(c) D_ii = 0 for i > r.

Theorem 3.6 and its corollaries are quite important. The proof of the theorem, though easy to understand, is tedious to read. As an aid in following the proof, we first consider an example.
Example 3

Consider the matrix

A = ( 0 2 4  2 2
      4 4 4  8 0
      8 2 0 10 2
      6 3 2  9 1 ).

By means of a succession of elementary row and column operations, we will transform A into a matrix D as in Theorem 3.6. We list several of the intermediate matrices, but on some occasions a matrix is transformed from the preceding one by means of several elementary operations. The number above each arrow indicates how many operations are involved. Try to identify the nature of each operation (row or column and type).
A  --1-->  ( 4 4 4  8 0     --1-->  ( 1 1 1  2 0
             0 2 4  2 2               0 2 4  2 2
             8 2 0 10 2               8 2 0 10 2
             6 3 2  9 1 )             6 3 2  9 1 )

   --2-->  ( 1  1  1  2 0   --3-->  ( 1  0  0  0 0
             0  2  4  2 2             0  2  4  2 2
             0 -6 -8 -6 2             0 -6 -8 -6 2
             0 -3 -4 -3 1 )           0 -3 -4 -3 1 )

   --> ... -->  ( 1 0 0 0 0
                  0 1 0 0 0
                  0 0 1 0 0
                  0 0 0 0 0 )  =  D.

By Theorem 3.4, rank(A) = rank(D). But clearly rank(D) = 3; so rank(A) = 3.

Note that the first two elementary operations (an interchange of rows 1 and 2 followed by multiplication of the new first row by 1/4) result in a 1 in the 1,1 position, and that the next several operations (all of type 3) result in 0's everywhere in the first row and first column except for the 1 in the 1,1 position. Subsequent operations do not change the first row or the first column.
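Assuming the matrix A of Example 3 as printed above, its rank, and that of the diagonal form D, can be confirmed numerically (a NumPy check, not part of the text):

```python
import numpy as np

A = np.array([[0., 2., 4., 2., 2.],
              [4., 4., 4., 8., 0.],
              [8., 2., 0., 10., 2.],
              [6., 3., 2., 9., 1.]])

# The diagonal form of Theorem 3.6 with r = 3.
D = np.zeros((4, 5))
D[0, 0] = D[1, 1] = D[2, 2] = 1.0

assert np.linalg.matrix_rank(A) == 3
assert np.linalg.matrix_rank(D) == 3
# One dependency that forces rank < 4: row 4 is the average of rows 2 and 3.
assert np.allclose(A[3], (A[1] + A[2]) / 2)
```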

With this example in mind, we proceed with the proof of Theorem 3.6.

Proof of Theorem 3.6. If A is the zero matrix, r = 0 by Exercise 3. In this case the conclusion follows with D = A.

Now suppose that A != O and r = rank(A); then r > 0. The proof will be by mathematical induction on m, the number of rows of A.

Suppose that m = 1. By means of at most one type 1 column operation and at most one type 2 column operation, A can be transformed into a matrix with a 1 in the 1,1 position. By means of at most n - 1 type 3 column operations, this matrix can in turn be transformed into the matrix

D = (1 0 ... 0).

Note that there is one linearly independent column in D. So rank(D) = rank(A) = 1 by Theorem 3.4 and Theorem 3.5. Thus the theorem is established for m = 1.

Next assume that the theorem holds for any matrix with at most m - 1 rows (for some m > 1). We will prove that the theorem holds for any matrix with m rows.

Suppose that A is any m x n matrix. If n = 1, the theorem can be established in a manner analogous to that for m = 1. We now suppose that n > 1.

Since A != O, A_ij != 0 for some i, j. By means of at most one elementary row operation and at most one elementary column operation (each of type 1), we can move the nonzero entry to the 1,1 position (just as was done in Example 3). By means of at most one additional type 2 operation, we can assure a 1 in the 1,1 position. (Look at the second operation in Example 3.) By means of at most m - 1 type 3 row operations and at most n - 1 type 3 column operations, we can eliminate all nonzero entries in the first row and the first column with the exception of the 1 in the 1,1 position. (In Example 3 we used two row and three column operations to do this.)

Thus, with a finite number of elementary operations, A can be transformed into a matrix

B = ( 1 0 ... 0
      0
      .    B'
      0 )

where B' is an (m - 1) x (n - 1) matrix. In Example 3, for instance,

B' = (  2  4  2 2
       -6 -8 -6 2
       -3 -4 -3 1 ).

By Exercise 11, B' has rank one less than B. Since rank(A) = rank(B) = r, rank(B') = r - 1. Hence r - 1 <= m - 1 and r - 1 <= n - 1, so that r <= m and r <= n.

Also, by the induction hypothesis, B' can be transformed by a finite number of elementary row and column operations into an (m - 1) x (n - 1) matrix D' such that (D')_ii = 1 if i <= r - 1, (D')_ij = 0 if i != j, and (D')_ii = 0 if i > r - 1. That is, D' consists of all zeros except for ones in the first r - 1 positions of its main diagonal. Let

D = ( 1 0 ... 0
      0
      .    D'
      0 ).

We see that the theorem now follows once we show that D can be obtained from B by means of a finite number of elementary operations; this follows by repeated applications of Exercise 12, since each elementary operation on B' can be performed on B without affecting the first row and first column. Thus, since A can be transformed into B and B can be transformed into D, each by a finite number of elementary operations, A can be transformed into D by a finite number of elementary operations.

Finally, since D' contains ones in its first r - 1 diagonal positions, D contains ones in its first r diagonal positions and zeros elsewhere. Thus D_ii = 1 if i <= r, D_ii = 0 if i > r, and D_ij = 0 if i != j. This establishes the theorem.
Corollary 1. Let A be an m x n matrix of rank r. Then there exist invertible matrices B and C of dimensions m x m and n x n, respectively, such that D = BAC, where D is an m x n matrix satisfying

(a) D_ii = 1 for i <= r,
(b) D_ij = 0 if i != j, and
(c) D_ii = 0 if i > r.

Proof. By Theorem 3.6, A can be transformed by means of a finite number of elementary row and column operations into the matrix D. We can appeal to Theorem 3.1 each time we perform an elementary operation. Thus there exist elementary m x m matrices E_1, E_2, ..., E_p and elementary n x n matrices G_1, G_2, ..., G_q such that

D = E_p E_{p-1} ... E_2 E_1 A G_1 G_2 ... G_q.

By Theorem 3.2, each E_i and each G_j is invertible. Let

B = E_p E_{p-1} ... E_1   and   C = G_1 G_2 ... G_q.

Then B and C are invertible by Exercise 2 of Section 2.4, and D = BAC.

Corollary 2. Let A be any m x n matrix. Then

(a) rank(A^t) = rank(A).
(b) The rank of any matrix equals the maximum number of its linearly independent rows; that is, the rank of a matrix is the dimension of the subspace generated by its rows.
(c) The rows and columns of any matrix generate subspaces of the same dimension, numerically equal to the rank of the matrix.

Proof. (a) By Corollary 1 there exist invertible matrices B and C such that D = BAC, where D satisfies the conditions stated in the corollary. Taking transposes, we have

D^t = C^t A^t B^t.

Since B and C are invertible, so are B^t and C^t by Exercise 3 of Section 2.4. Hence by Theorem 3.4,

rank(A^t) = rank(C^t A^t B^t) = rank(D^t).

Suppose that r = rank(A). Then D^t is an n x m matrix of the form of the matrix D in Corollary 1, and hence rank(D^t) = r by Theorem 3.5. Thus

rank(A^t) = rank(D^t) = r = rank(A).

This establishes (a). The proofs of (b) and (c) are left as exercises.

Corollary 3. Every invertible matrix is a product of elementary matrices.

Proof. If A is an invertible n x n matrix, then rank(A) = n. Hence by Corollary 1 there exist invertible matrices B and C such that D = BAC, where D is an n x n matrix with D_ii = 1 for i <= n and D_ij = 0 for i != j. Thus D = I_n, and so

A = B^(-1) I_n C^(-1) = B^(-1) C^(-1).

As in the proof of Corollary 1, note also that B = E_p E_{p-1} ... E_1 and C = G_1 G_2 ... G_q, where the E_i's and G_j's are elementary matrices, so that

A = E_1^(-1) E_2^(-1) ... E_p^(-1) G_q^(-1) G_{q-1}^(-1) ... G_1^(-1).

The inverse of an elementary matrix is an elementary matrix (Theorem 3.2), and hence A is a product of elementary matrices.

We now use Corollary 2 to relate the rank of a matrix product to the rank of each factor. Notice how the proof exploits the relationship between the rank of a matrix and the rank of a linear transformation.

Theorem 3.7. Let T: V -> W and U: W -> Z be linear transformations on finite-dimensional vector spaces V, W, and Z, and let A and B be matrices such that the product AB is defined. Then

(a) rank(UT) <= rank(U).
(b) rank(UT) <= rank(T).
(c) rank(AB) <= rank(A).
(d) rank(AB) <= rank(B).

Proof. (a) Clearly, R(T) is contained in W. Hence

R(UT) = UT(V) = U(T(V)) = U(R(T)), which is contained in U(W) = R(U).

Thus

rank(UT) = dim(R(UT)) <= dim(R(U)) = rank(U).
This establishes part (a).

(c) By part (a),

rank(AB) = rank(L_AB) = rank(L_A L_B) <= rank(L_A) = rank(A).

This establishes part (c).

(d) By part (c) and Corollary 2 of Theorem 3.6,

rank(AB) = rank((AB)^t) = rank(B^t A^t) <= rank(B^t) = rank(B).

This establishes part (d).

(b) Let α, β, and γ be ordered bases for V, W, and Z, respectively, and let A' = [U]_β^γ and B' = [T]_α^β. Then A'B' = [UT]_α^γ by Theorem 2.11. Hence, by part (d),

rank(UT) = rank(A'B') <= rank(B') = rank(T).

This establishes part (b).

It is important to be able to compute the rank of any matrix. We can use the corollary to Theorem 3.4, Theorems 3.5 and 3.6, and Corollary 2 of Theorem 3.6 to accomplish this goal. The object is to use elementary row and column operations on a matrix to "simplify" it (so that the transformed matrix has lots of zero entries) until a simple observation enables us to determine how many linearly independent rows or columns the matrix has, and thus to determine its rank.

Example 4
(a) Let

A = ( 1 2  1
      2 1 -1 ).

Note that the first and second rows of A are linearly independent, since one is not a multiple of the other. Thus rank(A) = 2.
(b) Let

A = ( 1  3 1 1
      1  0 1 1
      1  3 1 1 ).

In this case there are several ways to proceed. Suppose we begin with an elementary row operation to obtain a zero in the 2,1 position. Subtracting the first row from the second row, we obtain

( 1  3 1 1
  0 -3 0 0
  1  3 1 1 ).

Now note that the third row is identical to the first row, and that the first and second rows are linearly independent. Thus rank(A) = 2.

As an alternative method, note that the first and third rows of A are identical and that the first, third, and fourth columns of A are identical, while the first and second columns of A are linearly independent. Hence rank(A) = 2.

(c) Let

A = ( 2 1 1
      1 2 1
      1 1 2 ).

Using various elementary row and column operations as above, we obtain a sequence of matrices; it is clear that the last matrix in the sequence has three linearly independent rows and hence has rank 3. Thus rank(A) = 3.
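Parts (c) and (d) of Theorem 3.7 can also be illustrated numerically (a NumPy sketch; the matrices are our own examples, not the text's):

```python
import numpy as np

# Generic case: rank(AB) <= min(rank(A), rank(B)).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 2))
B = rng.standard_normal((2, 5))

rA = np.linalg.matrix_rank(A)
rB = np.linalg.matrix_rank(B)
rAB = np.linalg.matrix_rank(A @ B)
assert rAB <= min(rA, rB)

# The inequality can be strict: here A2 and B2 each have rank 1,
# but their product is the zero matrix, of rank 0.
A2 = np.array([[1., 0.], [0., 0.]])
B2 = np.array([[0., 0.], [0., 1.]])
assert np.linalg.matrix_rank(A2 @ B2) == 0
assert min(np.linalg.matrix_rank(A2), np.linalg.matrix_rank(B2)) == 1
```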

In summary, perform row and column operations until the matrix is simplified enough so that the maximum number of linearly independent rows or columns is obvious.

The Inverse of a Matrix

We have remarked that an n x n matrix is invertible if and only if its rank is n. Since we know how to compute the rank of any matrix, we can always test a matrix to determine whether it is invertible. We now provide a simple technique for computing the inverse of a matrix that utilizes elementary row operations.

Definition. Let A and B be m x n and m x p matrices, respectively. By the augmented matrix (A|B) we mean the m x (n + p) matrix (A^(1), ..., A^(n), B^(1), ..., B^(p)), where A^(i) and B^(j) denote the ith column of A and the jth column of B, respectively.

Let A be an invertible n x n matrix, and consider the n x 2n augmented matrix C = (A | I_n). By Exercise 15 we have

A^(-1) C = (A^(-1) A | A^(-1) I_n) = (I_n | A^(-1)).    (1)
By Corollary 3 to Theorem 3.6, A^(-1) is a product of elementary matrices, say A^(-1) = E_p E_{p-1} ... E_1. Thus (1) becomes

E_p E_{p-1} ... E_1 (A | I_n) = A^(-1) C = (I_n | A^(-1)).

Because multiplying a matrix on the left by an elementary matrix transforms the matrix by an elementary row operation (Theorem 3.1), we have the following result: If A is an invertible n x n matrix, then it is possible to transform the matrix (A | I_n) into the matrix (I_n | A^(-1)) by means of a finite number of elementary row operations.

Conversely, suppose that A is invertible and that the matrix (A | I_n) can be transformed into a matrix of the form (I_n | B) by a finite number of elementary row operations. Let E_1, E_2, ..., E_p be the elementary matrices associated with these elementary row operations as in Theorem 3.1; then

E_p ... E_2 E_1 (A | I_n) = (I_n | B).    (2)

Letting M = E_p ... E_2 E_1, we have from (2) that

M(A | I_n) = (MA | M) = (I_n | B).

Hence MA = I_n and M = B. It follows that M is invertible and M = A^(-1). So B = A^(-1). Thus we have the following result: If A is an invertible n x n matrix and the matrix (A | I_n) is transformed into a matrix of the form (I_n | B) by means of a finite number of elementary row operations, then B = A^(-1).

The following example demonstrates this procedure.

Example 5
Let us compute the inverse of the matrix

A = ( 0 2 4
      2 4 2
      3 3 1 ).

[The reader may wish to verify that rank(A) = 3, to be assured that A is invertible.] To compute A^(-1), we must use elementary row operations to transform

(A | I) = ( 0 2 4 | 1 0 0
            2 4 2 | 0 1 0
            3 3 1 | 0 0 1 )

into (I | A^(-1)). An efficient method for accomplishing this transformation is to change each column of A successively, beginning with the first column, into the corresponding column of I. Since we need a nonzero entry in the 1,1 position, we begin by interchanging rows 1 and 2. The result is

( 2 4 2 | 0 1 0
  0 2 4 | 1 0 0
  3 3 1 | 0 0 1 ).

In order to place a 1 in the 1,1 position, we must multiply the first row by 1/2; this operation yields

( 1 2 1 | 0 1/2 0
  0 2 4 | 1  0  0
  3 3 1 | 0  0  1 ).

We now complete work in the first column by adding -3 times row 1 to row 3 to obtain

( 1  2  1 | 0  1/2  0
  0  2  4 | 1   0   0
  0 -3 -2 | 0 -3/2  1 ).

In order to change the second column of the matrix above into the second column of I, we multiply row 2 by 1/2 to obtain a 1 in the 2,2 position. This operation produces

( 1  2  1 |  0   1/2  0
  0  1  2 | 1/2   0   0
  0 -3 -2 |  0  -3/2  1 ).

We can now complete work on the second column by adding -2 times row 2 to row 1 and 3 times row 2 to row 3. The result is

( 1 0 -3 | -1   1/2  0
  0 1  2 | 1/2   0   0
  0 0  4 | 3/2 -3/2  1 ).

Only the third column remains to be changed. In order to place a 1 in the 3,3 position, we multiply row 3 by 1/4; this operation yields

( 1 0 -3 | -1   1/2   0
  0 1  2 | 1/2   0    0
  0 0  1 | 3/8 -3/8  1/4 ).

Adding appropriate multiples of row 3 to rows 1 and 2 completes the process and gives

( 1 0 0 |  1/8 -5/8  3/4
  0 1 0 | -1/4  3/4 -1/2
  0 0 1 |  3/8 -3/8  1/4 ).

Thus

A^(-1) = (  1/8 -5/8  3/4
           -1/4  3/4 -1/2
            3/8 -3/8  1/4 ).
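The (A | I_n) -> (I_n | A^(-1)) procedure of this section can be sketched in NumPy as follows. The function name is ours, the code assumes A is invertible, and a pivot search (a type 1 interchange, as in Example 5) is included so the sketch also handles a zero in a pivot position:

```python
import numpy as np

def inverse_by_row_reduction(A):
    """Row-reduce the augmented matrix (A | I) to (I | A^{-1})."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])
    for j in range(n):
        p = j + np.argmax(np.abs(M[j:, j]))
        M[[j, p]] = M[[p, j]]            # type 1: bring a nonzero pivot up
        M[j] /= M[j, j]                  # type 2: make the pivot 1
        for i in range(n):
            if i != j:
                M[i] -= M[i, j] * M[j]   # type 3: clear the rest of the column
    return M[:, n:]

A = np.array([[0., 2., 4.],
              [2., 4., 2.],
              [3., 3., 1.]])
A_inv = inverse_by_row_reduction(A)

assert np.allclose(A @ A_inv, np.eye(3))
# Agrees with the answer obtained in Example 5:
expected = np.array([[ 1/8, -5/8,  3/4],
                     [-1/4,  3/4, -1/2],
                     [ 3/8, -3/8,  1/4]])
assert np.allclose(A_inv, expected)
```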

Being able to compute the inverse of a matrix allows us to compute the inverse of a linear transformation. The following example demonstrates this technique.

Example 6

Let T: P2(R) -> P2(R) be defined by T(f) = f + f' + f'', where f' and f'' denote the first and second derivatives of f. It is easily shown that N(T) = {0}, so that T is invertible. Taking β = {1, x, x^2}, we have

  [T]_β = (1 1 2)
          (0 1 2)
          (0 0 1).

Now ([T]_β)^-1 = [T^-1]_β by Corollary 1 to Theorem 2.19. The inverse of this matrix is

  (1 -1  0)
  (0  1 -2)
  (0  0  1).

Hence by Theorem 2.15 we have

  [T^-1(a0 + a1 x + a2 x^2)]_β = [T^-1]_β [a0 + a1 x + a2 x^2]_β.

Therefore

  T^-1(a0 + a1 x + a2 x^2) = (a0 - a1) + (a1 - 2a2)x + a2 x^2.
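The formula for T^-1 obtained in Example 6 can be verified directly. In the sketch below (our check, not the book's), a polynomial a0 + a1*x + a2*x^2 is stored as the coefficient triple (a0, a1, a2), and the two maps are checked to compose to the identity.

```python
# T(f) = f + f' + f'' on P2(R), in coordinates relative to {1, x, x^2}:
# if f = (a0, a1, a2), then f' = (a1, 2*a2, 0) and f'' = (2*a2, 0, 0).
def T(f):
    a0, a1, a2 = f
    return (a0 + a1 + 2 * a2, a1 + 2 * a2, a2)

# The inverse formula derived in Example 6.
def T_inv(g):
    b0, b1, b2 = g
    return (b0 - b1, b1 - 2 * b2, b2)

for f in [(1, 0, 0), (0, 1, 0), (0, 0, 1), (5, -3, 2)]:
    assert T_inv(T(f)) == f and T(T_inv(f)) == f
print("T_inv inverts T on all test polynomials")
```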

EXERCISES

1. Label the following statements as being true or false.
(a) The rank of a matrix is equal to the number of its nonzero columns.
(b) The product of two matrices always has rank equal to the lesser of the ranks of the two matrices.
(c) The m x n zero matrix is the only m x n matrix having rank 0.
(d) Elementary row operations preserve rank.
(e) Elementary column operations do not necessarily preserve rank.
(f) The rank of a matrix is equal to the maximum number of linearly independent rows in the matrix.
(g) The inverse of a matrix can be computed exclusively by means of elementary row operations.
(h) An n x n matrix is of rank at most n.
(i) An n x n matrix having rank n is invertible.

2. Find the rank of the following matrices.

3. Prove that for any m x n matrix A, rank(A) = 0 if and only if A is the zero matrix.

4. Use elementary row and column operations to transform each of the following matrices into a matrix D satisfying the conditions of Theorem 3.6, and then determine the rank of each matrix.

5. For each of the following matrices, compute the rank and the inverse if it exists.

6. For each of the following linear transformations T, determine whether T is invertible, and compute T^-1 if it exists.
(a) T: P2(R) -> P2(R) defined by T(f) = f'' + 2f' - f.
(b) T: P2(R) -> P2(R) defined by T(f)(x) = (x + 1)f'(x).
(c) T: R^3 -> R^3 defined by T(a1, a2, a3) = (a1 + 2a2 + a3, -a1 + a2 + 2a3, a1 + a3).
(d) T: R^3 -> P2(R) defined by T(a1, a2, a3) = (a1 + a2 + a3) + (a1 - a2 + a3)x + a1 x^2.
(e) T: P2(R) -> R^3 defined by T(f) = (f(-1), f(0), f(1)).
(f) T: M2x2(R) -> R^4 defined by

  T(A) = (tr(A), tr(A^t), tr(EA), tr(AE)),

where

  E = (0 1)
      (1 0).
7. Express the invertible matrix

  (1 2 1)
  (1 0 1)
  (1 1 2)

as a product of elementary matrices.

8. Let A be an m x n matrix. Prove that if c is any nonzero scalar, then rank(cA) = rank(A).

9. Complete the proof of the corollary to Theorem 3.4 by showing that elementary column operations preserve rank.

10. Prove Theorem 3.6 for the case that A is an m x 1 matrix.

11. Let

  B = (1 0 )
      (0 B'),

where B' is an m x n submatrix of B. Prove that if rank(B) = r, then rank(B') = r - 1.

12. Let B' and D' be m x n matrices, and let B and D be (m + 1) x (n + 1) matrices defined by

  B = (1 0 )   and   D = (1 0 )
      (0 B')             (0 D').

Prove that if B' can be transformed into D' by an elementary row [column] operation, then B can be transformed into D by an elementary row [column] operation.

13. Prove parts (b) and (c) of Theorem 3.4.

14. Let T, U: V -> W be linear transformations.
(a) Prove that R(T + U) is contained in R(T) + R(U).
(b) Prove that if W is finite-dimensional, then rank(T + U) <= rank(T) + rank(U).
(c) Deduce from (b) that rank(A + B) <= rank(A) + rank(B) for any m x n matrices A and B.

15. Suppose that A and B are matrices having n rows. Prove that M(A | B) = (MA | MB) for any m x n matrix M.

16. Prove that if B is a 3 x 1 matrix and C is a 1 x 3 matrix, then the 3 x 3 matrix BC has rank at most 1. Conversely, show that if A is any 3 x 3 matrix having rank 1, then there exist a 3 x 1 matrix B and a 1 x 3 matrix C such that A = BC.

17. Supply the details to the proof of part (b) of Theorem 3.4.
3.3 SYSTEMS OF LINEAR EQUATIONS - THEORETICAL ASPECTS

This section and the next are devoted to the study of systems of linear equations, which arise naturally in both the physical and social sciences. In this section we apply results from Chapter 2 to describe the solution sets of systems of linear equations as subsets of a vector space. In Section 3.4, elementary row operations are used to provide a computational method for finding all solutions to such systems.

The system of equations

  a11 x1 + a12 x2 + ... + a1n xn = b1
  a21 x1 + a22 x2 + ... + a2n xn = b2
    .
    .
    .
  am1 x1 + am2 x2 + ... + amn xn = bm                    (S)

where a_ij and b_i (1 <= i <= m and 1 <= j <= n) are elements of a field F and x1, x2, ..., xn are n variables taking values in F, is called a system of m linear equations in n unknowns over the field F.

The m x n matrix A whose i, j entry is a_ij is called the coefficient matrix of the system (S). If we let

  X = (x1)          B = (b1)
      ( .)   and        ( .)
      (xn)              (bm),

then the system (S) may be rewritten as a single matrix equation

  AX = B.

To exploit the results that we have developed, we will frequently consider a system of equations as a single matrix equation.

A solution to system (S) is an n-tuple s in F^n such that As = B. The set of all solutions to system (S) is called the solution set of the system.

Example 1

(a) Consider the system

  x1 + x2 = 3
  x1 - x2 = 1.

By use of familiar techniques we can solve the system above and conclude that there is only one solution: x1 = 2, x2 = 1; that is, s = (2, 1). In matrix form the system can be written

  (1  1)(x1) = (3)
  (1 -1)(x2)   (1),

so

  A = (1  1)   and   B = (3)
      (1 -1)             (1).

(b) Consider the system

  2x1 + 3x2 +  x3 = 1
   x1 -  x2 + 2x3 = 6,

i.e.,

  (2  3 1) X = (1)
  (1 -1 2)     (6).

This system has many solutions, such as s = (-6, 2, 7) and s = (8, -4, -3).

(c) Consider the system

  x1 + x2 = 0
  x1 + x2 = 1,

i.e.,

  (1 1) X = (0)
  (1 1)     (1).

It is evident that this system has no solutions.

Thus we see that a system of linear equations can have one, many, or no solutions. We must be able to recognize when a system has a solution and then be able to describe all its solutions. This section and the next are devoted to this end.

We begin our study of systems of equations by examining the class of "homogeneous" systems of linear equations. As we will see, the set of solutions to a homogeneous system of linear equations forms a subspace of F^n. We can then apply the theory of vector spaces to this set of solutions. For example, a basis for the solution space can be found, and any solution can be expressed as a linear combination of the basis vectors.

Definitions. A system AX = B of m linear equations in n unknowns is said to be homogeneous if B = 0. Otherwise the system is said to be nonhomogeneous.

Any homogeneous system has at least one solution, namely, the zero vector. This solution is called the trivial solution. The next result gives further information about the set of solutions to a homogeneous system.

Theorem 3.8. Let AX = 0 be a homogeneous system of m linear equations in n unknowns over a field F. Let K denote the set of all solutions to AX = 0. Then K = N(L_A); hence K is a subspace of F^n of dimension n - rank(L_A) = n - rank(A).

Proof. Clearly

  K = {s in F^n : As = 0} = N(L_A);

hence K is a subspace of F^n. The second part now follows from the dimension theorem.

Corollary. If m < n, the system AX = 0 has a nontrivial solution.

Proof. Suppose that m < n. Then rank(A) = rank(L_A) <= m. Hence

  dim(K) = n - rank(L_A) >= n - m > 0,

where K = N(L_A). Since dim(K) > 0, K is not {0}. Thus there exists s in K with s not equal to 0; so s is a nontrivial solution to AX = 0.
Example 2

(a) Consider the system

  x1 + 2x2 + x3 = 0
  x1 -  x2 - x3 = 0.

Let

  A = (1  2  1)
      (1 -1 -1)

be the coefficient matrix of this system. It is clear that rank(A) = 2. If K is the solution set of this system, then dim(K) = 3 - 2 = 1. Thus any nonzero solution will constitute a basis for K. For example, since (1, -2, 3) is a nonzero solution,

  K = {t(1, -2, 3) : t in R}.

(b) Consider the system x1 - 2x2 + x3 = 0 of one equation in three unknowns. If A = (1, -2, 1) is the coefficient matrix, then rank(A) = 1. Hence if K is the solution set, then dim(K) = 3 - 1 = 2. Note that (2, 1, 0) and (-1, 0, 1) are linearly independent elements of K. Thus they constitute a basis for K, so that

  K = {t1(2, 1, 0) + t2(-1, 0, 1) : t1, t2 in R}.
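The claimed basis vectors in Example 2 can be checked by direct substitution. A quick sketch (our check, not the book's):

```python
# Multiply a coefficient matrix by a candidate solution vector.
def apply(M, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in M]

# (a): coefficient matrix of x1 + 2x2 + x3 = 0,  x1 - x2 - x3 = 0.
A1 = [[1, 2, 1], [1, -1, -1]]
assert apply(A1, [1, -2, 3]) == [0, 0]

# (b): coefficient matrix of x1 - 2x2 + x3 = 0.
A2 = [[1, -2, 1]]
assert apply(A2, [2, 1, 0]) == [0]
assert apply(A2, [-1, 0, 1]) == [0]
print("all basis vectors solve their homogeneous systems")
```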

In Section 3.4 we will discuss how to find a basis for the solution set of an arbitrary homogeneous system.

We now turn to the study of nonhomogeneous systems. Our next result shows that the solution set of a nonhomogeneous system AX = B can be described in terms of the solution set of the homogeneous system AX = 0. We refer to the equation AX = 0 as the homogeneous system corresponding to AX = B.

Theorem 3.9. Let K be the solution set of a system of linear equations AX = B, and let K_H be the solution set of the corresponding homogeneous system AX = 0. Then for any solution s to AX = B,

  K = {s} + K_H = {s + k : k in K_H}.

Proof. Let s be any solution to AX = B; then As = B. We must show that K = {s} + K_H. If w is in K, then Aw = B. Hence

  A(w - s) = Aw - As = B - B = 0.

So w - s is in K_H. Thus there exists k in K_H such that w = s + k. So w is in {s} + K_H, and therefore K is contained in {s} + K_H.

Conversely, suppose that w is in {s} + K_H; then w = s + k for some k in K_H. But then

  Aw = A(s + k) = As + Ak = B + 0 = B;

so w is in K. Thus {s} + K_H is contained in K, and therefore K = {s} + K_H.
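Theorem 3.9 can be illustrated numerically. The sketch below (our own check; the particular system is the one from Example 2(a) with a right-hand side attached) verifies that every vector of the form s + t*k, with s a particular solution and k spanning the homogeneous solution set, again solves AX = B.

```python
A = [[1, 2, 1], [1, -1, -1]]
B = [7, -4]
s = [1, 1, 4]        # a particular solution: As = B
k = [1, -2, 3]       # a homogeneous solution: Ak = 0

def apply(M, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in M]

assert apply(A, s) == B and apply(A, k) == [0, 0]
# Every element of {s} + K_H solves AX = B, as Theorem 3.9 asserts.
for t in range(-3, 4):
    w = [si + t * ki for si, ki in zip(s, k)]
    assert apply(A, w) == B
print("every s + t*k solves AX = B")
```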

Example 3

(a) Consider the system

  x1 + 2x2 + x3 = 7
  x1 -  x2 - x3 = -4.

The corresponding homogeneous system is the system of Example 2(a). It is easily verified that

  s = (1, 1, 4)

is a solution to the nonhomogeneous system above. So the solution set of the system is

  K = {(1, 1, 4) + t(1, -2, 3) : t in R}

by Theorem 3.9.

(b) Consider the system x1 - 2x2 + x3 = 4. The corresponding homogeneous system is the system of Example 2(b). Since

  s = (4, 0, 0)

is a solution to this system, the solution set K can be written as

  K = {(4, 0, 0) + t1(2, 1, 0) + t2(-1, 0, 1) : t1, t2 in R}.

The following theorem provides us with a means of computing the entire solution set of certain systems of equations.

Theorem 3.10. Let AX = B be a system of n linear equations in n unknowns. If A is invertible, then the system has exactly one solution, namely A^-1 B. Conversely, if the system has exactly one solution, then A is invertible.

Proof. Suppose that A is invertible. Substituting A^-1 B into the system, we have

  A(A^-1 B) = (A A^-1)B = B.

Thus A^-1 B is a solution. If s is an arbitrary solution, then As = B. Multiplying both sides by A^-1 gives s = A^-1 B. Thus the system has one and only one solution, namely A^-1 B.

Conversely, suppose that the system has exactly one solution s. Let K_H denote the solution set of the corresponding homogeneous system AX = 0. By Theorem 3.9, {s} = {s} + K_H. But this can occur only if K_H = {0}. Thus N(L_A) = {0}, and hence A is invertible.

Example 4

Consider the system of three equations in three unknowns:

         2x2 + 4x3 = 2
  2x1 + 4x2 + 2x3 = 3
  3x1 + 3x2 +  x3 = 1.

In Example 5 of Section 3.2 we computed the inverse of the coefficient matrix A of this system. Thus A is invertible, and the system has exactly one solution, namely

  A^-1 B = ( 1/8 -5/8  3/4)(2)   (-7/8)
           (-1/4  3/4 -1/2)(3) = ( 5/4)
           ( 3/8 -3/8  1/4)(1)   (-1/8).

We use this technique for solving systems of linear equations with invertible coefficient matrices in the application that concludes this section.

In Example 1(c) we saw a system of linear equations that has no solutions. We now establish a criterion for determining when a system has solutions. This criterion involves the rank of the coefficient matrix of the system AX = B and the rank of the matrix (A | B). The matrix (A | B) is called the augmented matrix of the system AX = B.

Theorem 3.11. Let AX = B be a system of m linear equations in n unknowns. Then the system has at least one solution if and only if rank(A) = rank(A | B).

Proof. To say that AX = B has a solution is equivalent to saying that B is in R(L_A). In the proof of Theorem 3.5 we saw that the columns of A span R(L_A); that is,

  R(L_A) = span{A^(1), A^(2), ..., A^(n)}.

Thus AX = B has a solution if and only if B is in span{A^(1), A^(2), ..., A^(n)}. But B is in this span if and only if

  span{A^(1), A^(2), ..., A^(n)} = span{A^(1), A^(2), ..., A^(n), B}.

This last statement is equivalent to

  dim(span{A^(1), A^(2), ..., A^(n)}) = dim(span{A^(1), A^(2), ..., A^(n), B}).

So by Theorem 3.5 the equation above reduces to rank(A) = rank(A | B).
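Theorem 3.10 applied to Example 4 can be carried out in exact arithmetic. A sketch (ours, not the book's), using the inverse computed in Section 3.2:

```python
from fractions import Fraction

A = [[0, 2, 4], [2, 4, 2], [3, 3, 1]]
A_inv = [[Fraction(1, 8), Fraction(-5, 8), Fraction(3, 4)],
         [Fraction(-1, 4), Fraction(3, 4), Fraction(-1, 2)],
         [Fraction(3, 8), Fraction(-3, 8), Fraction(1, 4)]]
B = [2, 3, 1]

# The unique solution guaranteed by Theorem 3.10 is s = A^-1 B.
s = [sum(row[j] * B[j] for j in range(3)) for row in A_inv]
print(s)  # → [Fraction(-7, 8), Fraction(5, 4), Fraction(-1, 8)]

# Check that s really satisfies the original system: As = B.
assert [sum(A[i][j] * s[j] for j in range(3)) for i in range(3)] == B
```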

Example 5

Recall the system of equations

  x1 + x2 = 0
  x1 + x2 = 1

given in Example 1(c). Since

  A = (1 1)   and   (A | B) = (1 1 0)
      (1 1)                   (1 1 1),

rank(A) = 1 and rank(A | B) = 2. Because the two ranks are unequal, the system has no solutions.

Example 6

We will use Theorem 3.11 to determine if (3, 3, 2) is in the range of the linear transformation T: R^3 -> R^3 defined by

  T(a1, a2, a3) = (a1 + a2 + a3, a1 - a2 + a3, a1 + a3).

Now (3, 3, 2) is in R(T) if and only if there exists a vector s = (x1, x2, x3) in R^3 such that T(s) = (3, 3, 2). Such a vector s must be a solution to the system

  x1 + x2 + x3 = 3
  x1 - x2 + x3 = 3
  x1      + x3 = 2.

Since the ranks of the coefficient matrix and the augmented matrix of this system are 2 and 3, respectively, it follows by Theorem 3.11 that this system has no solutions. Hence (3, 3, 2) is not in R(T).
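The rank comparisons used in Examples 5 and 6 are easy to automate. Below is a minimal rank routine (our sketch, not the book's algorithm) based on Gaussian elimination with exact rational arithmetic, applied to both examples.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix, computed by Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    rk, top = 0, 0
    for col in range(len(M[0])):
        pivot = next((r for r in range(top, len(M)) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[top], M[pivot] = M[pivot], M[top]
        for r in range(top + 1, len(M)):
            f = M[r][col] / M[top][col]
            M[r] = [a - f * b for a, b in zip(M[r], M[top])]
        top += 1
        rk += 1
    return rk

# Example 5: rank(A) = 1 but rank(A | B) = 2, so no solutions exist.
print(rank([[1, 1], [1, 1]]), rank([[1, 1, 0], [1, 1, 1]]))        # → 1 2
# Example 6: the ranks 2 and 3 differ, so (3, 3, 2) is not in R(T).
print(rank([[1, 1, 1], [1, -1, 1], [1, 0, 1]]),
      rank([[1, 1, 1, 3], [1, -1, 1, 3], [1, 0, 1, 2]]))           # → 2 3
```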

An Application

In 1973, Wassily Leontief won the Nobel Prize in economics for his work in developing a mathematical model that may be used to describe various economic phenomena. We close this section by applying some of the ideas we have studied to illustrate two special cases of his work.

We begin by considering a simple society composed of three people (industries): a farmer who grows all the food, a tailor who makes all the clothing, and a carpenter who builds all the housing. We assume that each person sells to and buys from a central pool and that everything produced is consumed. Since no commodities either enter or leave the system, this case is referred to as the closed model.

Each of the three individuals consumes all three of the commodities produced in the society. Suppose that the proportion of each of the commodities consumed by each person is given in the following table. Notice that the columns of the table must sum to 1.

              Food   Clothing   Housing
  Farmer      0.40     0.20      0.20
  Tailor      0.10     0.70      0.20
  Carpenter   0.50     0.10      0.60

Let p1, p2, and p3 denote the incomes of the farmer, tailor, and carpenter, respectively. To assure that this society survives, we require that the consumption of each individual equals his or her income. In the case of the farmer, this requirement translates into the equation

  0.40 p1 + 0.20 p2 + 0.20 p3 = p1.

Thus we need to consider the system of linear equations

  0.40 p1 + 0.20 p2 + 0.20 p3 = p1
  0.10 p1 + 0.70 p2 + 0.20 p3 = p2
  0.50 p1 + 0.10 p2 + 0.60 p3 = p3

or, equivalently, AP = P, where

  P = (p1)
      (p2)
      (p3)

and A is the coefficient matrix of the system. In this context A is called the input-output (or consumption) matrix, and AP = P is called the equilibrium condition.

For matrices B and C of the same size we use the notation B >= C [B > C] to mean B_ij >= C_ij [B_ij > C_ij] for all i and j. The matrix B is called nonnegative [positive] if B >= O [B > O], where O is the zero matrix.

At first it may seem reasonable to replace the equilibrium condition by the inequality AP <= P, that is, the requirement that consumption not exceed production. But in fact AP <= P implies that AP = P in the closed model. For otherwise there exists a k for which

  p_k > sum_j A_kj p_j.

Hence, since the columns of A sum to 1,

  sum_i p_i > sum_i sum_j A_ij p_j = sum_j (sum_i A_ij) p_j = sum_j p_j,

which is a contradiction.

One solution to the homogeneous system (I - A)X = 0, which is equivalent to the equilibrium condition, is

  P = (0.25)
      (0.35)
      (0.40).

We may interpret this to mean that the society will survive if the farmer, tailor, and carpenter have incomes in the proportions 25:35:40 (or 5:7:8).

Notice that we are interested not in just any solution to the system but in one that is nonnegative. Thus we must consider the question of whether or not the system (I - A)X = 0 has a nonnegative solution, where A is a matrix whose entries are nonnegative and whose columns sum to 1. A useful theorem in this direction (whose proof may be found in "Applications of Matrices to Economic Models and Social Science Relationships," by Ben Noble, Proceedings of the Summer Conference for College Teachers of Applied Mathematics, CUPM, Berkeley, California, 1971) is stated below.

Theorem 3.12. Let A be an n x n input-output matrix having the form

  A = (B C)
      (D E),

where D is a 1 x (n - 1) positive vector and C is an (n - 1) x 1 positive vector. Then (I - A)X = 0 has a one-dimensional solution set that is generated by a nonnegative vector.

Observe that any input-output matrix with all positive entries satisfies the hypothesis of this theorem.

In the open model we assume that there is an outside demand for each of the commodities produced. Returning to our simple society, let x1, x2, and x3 be the monetary values of the food, clothing, and housing produced, respectively, with respective outside demands d1, d2, and d3. Let A be the 3 x 3 matrix such that A_ij represents the amount (in a fixed monetary unit such as the dollar) of commodity i required to produce one monetary unit of commodity j. Then the value of the surplus of food in the society is

  x1 - (A11 x1 + A12 x2 + A13 x3),

that is, the value of the food produced minus the value of the food consumed in producing the three commodities. The assumption that everything produced is consumed gives us the equilibrium condition for the open model: the surplus of each of the three commodities must equal the corresponding outside demand. Hence

  x_i - sum_{j=1}^{3} A_ij x_j = d_i   for i = 1, 2, 3.

In general, we must find a nonnegative solution to (I - A)X = D, where D is a nonnegative demand vector. It is easy to see that if (I - A)^-1 exists and is nonnegative, then the desired solution is (I - A)^-1 D.

Recall that for a real number a, the series 1 + a + a^2 + ... converges to (1 - a)^-1 if |a| < 1. Similarly, it can be shown (using the concept of convergence of matrices developed in Section 5.3) that the series I + A + A^2 + ... converges to (I - A)^-1 if {A^n} converges to the zero matrix. In this case (I - A)^-1 is nonnegative, since the matrices I, A, A^2, ... are nonnegative.

To illustrate the open model, suppose that 30 cents worth of food, 10 cents worth of clothing, and 30 cents worth of housing are required for the production of $1 worth of food. Similarly, suppose that 20 cents worth of food, 40 cents worth of clothing, and 20 cents worth of housing are required for the production of $1 worth of clothing. Finally, suppose that 30 cents worth of food, 10 cents worth of clothing, and 30 cents worth of housing are required for the production of $1 worth of housing. Then the input-output matrix is

  A = (0.30 0.20 0.30)
      (0.10 0.40 0.10)
      (0.30 0.20 0.30);

so

  I - A = ( 0.70 -0.20 -0.30)
          (-0.10  0.60 -0.10)
          (-0.30 -0.20  0.70)

and

  (I - A)^-1 = (2.0 1.0 1.0)
               (0.5 2.0 0.5)
               (1.0 1.0 2.0).

Since (I - A)^-1 is nonnegative, we can find a (unique) nonnegative solution to (I - A)X = D for any demand D. For example, suppose that there are outside demands of $30 billion in food, $20 billion in clothing, and $10 billion in housing. If we set

  D = (30)
      (20)
      (10),

then

  X = (I - A)^-1 D = (90)
                     (60)
                     (70).

So a gross production of $90 billion of food, $60 billion of clothing, and $70 billion of housing is necessary to meet the required demands.
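The open-model computation above is a single matrix-vector product. A short sketch (ours), using the inverse displayed in the text and checking that production minus internal consumption reproduces the demand vector:

```python
# X = (I - A)^-1 D with the demand vector D = (30, 20, 10), in billions.
inv = [[2.0, 1.0, 1.0],
       [0.5, 2.0, 0.5],
       [1.0, 1.0, 2.0]]            # (I - A)^-1 from the text
D = [30, 20, 10]
X = [sum(row[j] * D[j] for j in range(3)) for row in inv]
print(X)  # → [90.0, 60.0, 70.0]

# Consistency check: the surplus X - AX equals the outside demand D.
A = [[0.30, 0.20, 0.30], [0.10, 0.40, 0.10], [0.30, 0.20, 0.30]]
surplus = [X[i] - sum(A[i][j] * X[j] for j in range(3)) for i in range(3)]
assert all(abs(s - d) < 1e-9 for s, d in zip(surplus, D))
```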

EXERCISES

1. Label the following statements as being true or false.
(a) Any system of linear equations has at least one solution.
(b) Any system of linear equations has at most one solution.
(c) Any homogeneous system of linear equations has at least one solution.
(d) Any system of n linear equations in n unknowns has at most one solution.
(e) Any system of n linear equations in n unknowns has at least one solution.
(f) If the homogeneous system corresponding to a given system of linear equations has a solution, then the given system has a solution.
(g) If the coefficient matrix of a homogeneous system of n linear equations in n unknowns is invertible, then the system has no nontrivial solutions.
(h) The solution set of any system of m linear equations in n unknowns is a subspace of F^n.
2. For each of the following homogeneous systems of linear equations, find the dimension of and a basis for the solution set.

(a)  x1 + 3x2 = 0
    2x1 + 6x2 = 0

(b)  x1 + x2 -  x3 = 0
    4x1 + x2 - 2x3 = 0

(c)  x1 + 2x2 - x3 = 0
    2x1 +  x2 + x3 = 0

(d) 2x1 + x2 - x3 = 0
     x1 - x2 + x3 = 0

(e) x1 + 2x2 - 3x3 + x4 = 0

(f) x1 + 2x2 = 0
    x1 -  x2 = 0

(g) x1 + x2 + x3 - x4 = 0
         x2 - x3 + x4 = 0

3. Using the results of Exercise 2, find all solutions to the following systems.

(a)  x1 + 3x2 = 5
    2x1 + 6x2 = 10

(b)  x1 + x2 -  x3 = 1
    4x1 + x2 - 2x3 = 3

(c)  x1 + 2x2 - x3 = 3
    2x1 +  x2 + x3 = 6

(d) 2x1 + x2 - x3 = 5
     x1 - x2 + x3 = 1

(e) x1 + 2x2 - 3x3 + x4 = 1

(f) x1 + 2x2 = 5
    x1 -  x2 = -1

(g) x1 + x2 + x3 - x4 = 1
         x2 - x3 + x4 = 1

4. For each of the following systems of linear equations with coefficient matrix A:
(1) Show that A is invertible.
(2) Compute A^-1.
(3) Use A^-1 to solve the system.

(a)  x1 + 3x2 = 4
    2x1 + 5x2 = 3

(b)  x1 + 2x2 - x3 = 5
     x1 +  x2 + x3 = 1
    2x1 - 2x2 + x3 = 4

5. Give an example of a system of n linear equations in n unknowns with infinitely many solutions.
6. Let T: R^3 -> R^2 be defined by T(a, b, c) = (a + b, 2a - c). Determine T^-1(1, 11).

7. Determine which of the following systems of linear equations has a solution.

(a)  x1 +  x2 -  x3 + 2x4 = 2
     x1 +  x2 + 2x3       = 1
    2x1 + 2x2 +  x3 + 2x4 = 3

(b)  x1 + x2 -  x3 = 1
    2x1 + x2 + 3x3 = 2

(c)  x1 + 2x2 + 3x3 = 1
     x1 +  x2 -  x3 = 0
     x1 + 2x2 +  x3 = 3

(d)  x1 +  x2 + 3x3 - x4 = 0
     x1 +  x2 +  x3 + x4 = 1
     x1 - 2x2 +  x3 - x4 = 1
    4x1 +  x2 + 8x3 - x4 = 0

(e)  x1 + 2x2 -  x3 = 1
    2x1 +  x2 + 2x3 = 3
     x1 - 4x2 + 7x3 = 4

8. Let T: R^3 -> R^3 be defined by T(a, b, c) = (a + b, b - 2c, a + 2c). For each vector B in R^3 determine whether B is in R(T).
(a) B = (1, 3, -2)
(b) B = (2, 1, 1)

9. Prove that the system of linear equations AX = B has a solution if and only if B is in R(L_A).

10. Prove or give a counterexample to the following statement: If the coefficient matrix of a system of m linear equations in n unknowns has rank m, then the system has a solution.

11. In the closed model of Leontief with input-output matrix

  A = (7/16  1/2  3/16)
      (5/16  1/6  5/16)
      (1/4   1/3  1/2 ),

at what ratio must the farmer, tailor, and carpenter produce in order for equilibrium to be attained?

12. In a certain economy, suppose that 60% of all goods and 30% of all services are used in the production of goods. What proportion of the total economic output is used in the production of goods?

13. In the notation of the open model of Leontief, suppose that

  A = (1/2 1/5)   and   D = (5)
      (1/5 1/2)             (3)

is the demand vector. How much of each commodity must be produced to satisfy this demand?

14. A certain economy consisting of the two sectors of goods and services supports a defense system that consumes $90 billion worth of goods and $20 billion worth of services from the economy but does not contribute to economic production. Suppose that 0.5 unit of goods and 0.2 unit of services are required to produce 1 unit of goods, and that 0.3 unit of goods and 0.6 unit of services are required to produce 1 unit of services. What must the total output of the economic system be to support this defense system?
3.4 SYSTEMS OF LINEAR EQUATIONS - COMPUTATIONAL ASPECTS

In Section 3.3 we obtained a necessary and sufficient condition for a system of linear equations to have solutions (Theorem 3.11) and learned how to express all the solutions to a nonhomogeneous system in terms of the solutions of the corresponding homogeneous system (Theorem 3.9). The latter result enables us to determine all the solutions to a given system if we can find one solution to the given system and a basis for the solution set of the corresponding homogeneous system. In this section we use elementary row operations to accomplish these two objectives. The essence of this technique is to transform a given system of linear equations into a system having the same solutions but which is easier to solve (as in Section 1.4).

Definition. Two systems of m linear equations in n unknowns are called equivalent if they have the same solution set.

The following theorem and corollary give a useful method for obtaining equivalent systems.

Theorem 3.13. Let

  (S): AX = B

be a system of m linear equations in n unknowns, and let C be an invertible m x m matrix. Then the system

  (S'): (CA)X = CB

is equivalent to (S).

Proof. Let K be the solution set for (S) and K' the solution set for (S'). If w is in K, then Aw = B. So CAw = CB; hence w is in K'. Thus K is contained in K'.

Conversely, if w is in K', then CAw = CB. Hence

  Aw = C^-1(CAw) = C^-1(CB) = B;

so w is in K. Thus K' is contained in K, and therefore K = K'.

Corollary. Let AX = B be a system of m linear equations in n unknowns. If (A' | B') is obtained from (A | B) by a finite number of elementary row operations, then the system A'X = B' is equivalent to the original system.

Proof. Suppose that (A' | B') is obtained from (A | B) by elementary row operations. These operations may be executed by multiplying (A | B) on the left by elementary m x m matrices E1, E2, ..., Ep. Let C = Ep ... E2 E1; then

  (A' | B') = C(A | B) = (CA | CB).

Since each Ei is invertible, so is C. Now A' = CA and B' = CB. Thus by Theorem 3.13 the system A'X = B' is equivalent to the system AX = B.
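The row-reduction procedure developed in the remainder of this section can be sketched in code. The following is a minimal Gauss-Jordan routine (our sketch, not the book's pseudocode) applied to the augmented matrix of the example system treated below; exact rational arithmetic keeps the intermediate entries recognizable.

```python
from fractions import Fraction

# Augmented matrix of: 3x1 + 2x2 + 3x3 - 2x4 = 1,
#                       x1 +  x2 +  x3       = 3,
#                       x1 + 2x2 +  x3 -  x4 = 2.
M = [[3, 2, 3, -2, 1],
     [1, 1, 1, 0, 3],
     [1, 2, 1, -1, 2]]
M = [[Fraction(x) for x in row] for row in M]
rows, cols = len(M), len(M[0])
r = 0
for c in range(cols - 1):                  # never pivot on the constants column
    pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
    if pivot is None:
        continue
    M[r], M[pivot] = M[pivot], M[r]        # type 1: interchange rows
    M[r] = [x / M[r][c] for x in M[r]]     # type 2: scale the pivot to 1
    for i in range(rows):                  # type 3: clear the pivot column
        if i != r and M[i][c] != 0:
            f = M[i][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
    r += 1

print([[int(x) for x in row] for row in M])
# → [[1, 0, 1, 0, 1], [0, 1, 0, 0, 2], [0, 0, 0, 1, 3]]
```

The final matrix encodes x1 + x3 = 1, x2 = 2, x4 = 3, the same reduced system obtained by hand in the text.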

We now describe a method


illustrate this method with the

for

of

equations

+ 3x3

x< +

3 2 3

use

in

occurs

to the right

a column

1. Put a 1

this step

is

the

in

row,

first

first

the

times

to the third

in

row to obtain
\342\200\2243

in

zeros

must

the

add

first

the
\342\200\224

row

row to obtain

the

previous
column,and

next
In

row(s).

multiplying

-2

we

2 1

0 -1
o -4

i >j.)

2
0 3

use the first

3. Put a

row.

preceding

column. In our example


to the second row and then add
times

row

first

the

-1

of the first

positions

remaining

each

In our example we can accomplish


and third rows. The resulting matrix is

3 operations,

type

of

of

column.

first

means

and

if Ai} = 0 whenever

triangular

upper

the augmented matrix into an


it
entry of each row is 1
entry

By

-1

1 2 1

2.

We

equations.

2.

of the firstnonzero

interchanging

by

of linear

1
3

-2

to reduce
operations
in which
the first nonzero

matrix

that matrix

(Recall

x4

row

elementary

triangular

upper

AX

system

by

\342\200\224

will

= B.

= 3

111
We

CB. Thus

augmented matrix

we form the

First

2x4

+ x3

xx + 2x2

\342\200\224

x-

Xn

the

system

any

+ 2x2

and B' =

= CA

A'

solving

system

3xx

elementary mxm

{CA\\CB).

is equivalent to

A'X = B'

system

row

elementary

by

\\B)

then

Now

C.

is

(A

by multiplying by

= C{A\\B) =

{A'\\B')

from

is obtained

(A'\\B')

161

Aspects

Equations\342\200\224Computational

we

the

our
can

second

row

example
make

row

the

by

-5

possible column, without using


the second
column is the leftmost possible
second
row, second column entry a 1

the

in

-1

leftmost

by

\342\200\224

1.

This

operation

produces

162

Elementary MatrixOperationsand Systems

Chap..3

of

-1

-1

-4

type 3 operationsto
preceding step. In our example
the third row.The resulting

4. Now use

-1

-5

-1

in the

the second

row to

-1

-1

-3

-9

5. Repeat steps 3 and 4 on eachsucceeding


row
In our example this can be accomplished
by
This

times

four

1 created

is

matrix

no

until

nonzero
the

multiplying

-1

1 0

-1

0 0 0

the

complete

this

in

4. Hence

zeros in row

its

column.

nonzero

first

In

our

each

certain

eliminating

third row to

we add the

one, column

four

10

12

the

the

row

each

and

row

first

the

two,

and

column

matrix is

The resulting

7. Make

simplification
in

upward,

nonzero

last

the

column

to obtain

rows

second

four.

lies

of

multiples

is

row

row

by

beginning with the last


row
to the rows above. In our
row, and the first nonzero entry

work

objective,-we

nonzero row, and add


example the third

of this

row

entry

accomplish

third

-1

obtained the desiredmatrix.To


of the augmented
matrix, we must make the firstnonzero
only nonzero
entry in its column. (This correspondsto
unknowns
all but one of the equations.)
from
To

remain.

We have now

6.

rows

produces

operation

\342\200\224|.

the

below

add

must

we

Equations

zeros

obtain

Linear

10

next-to-last

in the

entry

we must add

example

first row in order to make


This operation produces

first

the

row,.second

10

10
0

10

row the only nonzeroentry

in

\342\200\2242
times

column

the

second

entry

row

become

to the

zero.

Sec.3.4

Linear

of

Systems

in step
performed with the second row, at

8. Repeat

the process described

163

Aspects

Equations\342\200\224Computational

for

which

each

preceding

time

the

it .is

until

row

reduction

is

process

complete.

already reached

Since we have

matrix is the
the

to

corresponds

the

desired reduction

the

of

of linear

system

8, the preceding
This
matrix

in step

described

point

matrix.

augmented

equations
+

Xjl

X3

=2

x2

=
xA

3.

by the corollary to Theorem 3.13this system


original system. But this system is easilysolved.Obviously,
Moreover, xx and x3 can have any values provided that
that

Recall

x3

have xx =

t, we then

system has the

\342\200\224

t.

an

Thus

is

x2

the

xA = 3.

2 and

their sum is 1.

Letting

to the original

solution

arbitrary

to

equivalent

form

that

Observe

is a basis

for the homogeneous system

of

to

corresponding

equations

the

given

system.

In the example above we performed elementary row operations on the augmented matrix of the system until we obtained the augmented matrix of a system having properties 1, 2, and 3 on pages 24–25. Such a matrix has a special name.

Definition. A matrix is said to be in row echelon form if the following three conditions are satisfied:

(a) Any row containing a nonzero entry precedes any row in which all the entries are zero (if any).
(b) The first nonzero entry in each row is the only nonzero entry in its column.
(c) The first nonzero entry in each row is 1, and it occurs in a column to the right of the leading 1 in any preceding row.

164   Chap. 3   Elementary Matrix Operations and Systems of Linear Equations
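The three conditions of the definition translate directly into a checker. The function below is an illustrative sketch (the name `is_row_echelon` is ours, not the text's); it tests condition (a) by rejecting a nonzero row that follows a zero row, condition (b) by requiring each leading entry to be alone in its column, and condition (c) by requiring each leading entry to be 1 and strictly to the right of the previous one.

```python
def is_row_echelon(A):
    # A is a list of rows; returns True iff A satisfies (a), (b), (c)
    m = len(A)
    leads = []
    seen_zero_row = False
    for r in range(m):
        nonzero = [j for j, x in enumerate(A[r]) if x != 0]
        if not nonzero:
            seen_zero_row = True
            continue
        if seen_zero_row:                  # (a): nonzero rows come first
            return False
        j = nonzero[0]
        if A[r][j] != 1:                   # (c): leading entry is 1
            return False
        if leads and j <= leads[-1]:       # (c): strictly to the right
            return False
        if any(A[i][j] != 0 for i in range(m) if i != r):
            return False                   # (b): alone in its column
        leads.append(j)
    return True
```

For example, the reduced matrix of the preceding example passes, while a matrix whose first column contains two nonzero entries fails.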
Example 1

(a) The last matrix on page 162 is in row echelon form. Note that the first nonzero entry of each nonzero row is 1 and that the column containing this entry has all zeros otherwise. Also note that each time we move downward to a new row, we must move to the right one or more columns to find the first nonzero entry of the new row.

(b) The following matrices are not in row echelon form: one because the column containing the first nonzero entry of the first row contains more than one nonzero entry; another because the first nonzero entry of the second row is not in a column to the right of the first nonzero entry of the first row; and another because the first nonzero entry of the first row is not 1.

It can be shown (see Exercise 7) that the row echelon form of a matrix is unique; that is, if different sequences of elementary row operations are used to transform a matrix into matrices Q and Q′ in row echelon form, then Q = Q′. Thus, although there are many different sequences of elementary row operations that can be used to transform a given matrix into row echelon form, they all produce the same result.
The procedure used above to reduce an augmented matrix to row echelon form is called Gaussian elimination. It consists of two separate parts.

1. In the forward pass, the augmented matrix is transformed into an upper triangular matrix in which the first nonzero entry of each row is 1, and it occurs in a column to the right of the first nonzero entry of each preceding row.

2. In the backward pass (or back-substitution), the upper triangular matrix is transformed into row echelon form.

Of all the methods for transforming a matrix into its row echelon form, Gaussian elimination requires the fewest arithmetic operations. (For large matrices, it requires approximately 50% fewer operations than the Gauss–Jordan method, in which the matrix is transformed into row echelon form by using the first nonzero entry in each row to make zero all the other entries in its column.) Because of this efficiency, Gaussian elimination is the preferred method when solving systems of linear equations on a computer. In this context, the Gaussian elimination procedure is usually modified in order to minimize roundoff errors. Since a discussion of these techniques is inappropriate here, readers interested in such matters are referred to books on numerical analysis.

When a matrix is in row echelon form, the corresponding system of linear equations is easy to solve. We present below a procedure for solving any system of linear equations for which the augmented matrix is in row echelon form. First, however, note that every matrix can be transformed into row echelon form by Gaussian elimination. In the forward pass we satisfy conditions (a) and (c) in the definition of row echelon form and also make zero all entries below the first nonzero entry in each row. Then in the backward pass we make zero all entries above the first nonzero entry in each row, thereby satisfying condition (b) in the definition of row echelon form.

Theorem 3.14. Gaussian elimination transforms any matrix into its row echelon form.

We now describe a method for solving a system in which the augmented matrix is in row echelon form. To illustrate this procedure, we consider the system
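The two passes can be sketched in code. The function below is a minimal illustration (the name `rref` and the use of exact rational arithmetic are our choices, not the text's): the forward loop normalizes each leading entry to 1 and clears the entries below it, and the backward loop then clears the entries above each leading 1.

```python
from fractions import Fraction

def rref(mat):
    # Gaussian elimination: a forward pass to an upper triangular matrix
    # with leading 1s, then a backward pass (back-substitution) that
    # clears the entries above each leading 1 -- producing the row
    # echelon form in the sense of the definition above.
    A = [[Fraction(x) for x in row] for row in mat]
    m, n = len(A), len(A[0])
    pivots, r = [], 0
    for c in range(n):
        if r == m:
            break
        # find a row with a nonzero entry in column c, at or below row r
        p = next((i for i in range(r, m) if A[i][c] != 0), None)
        if p is None:
            continue
        A[r], A[p] = A[p], A[r]
        A[r] = [x / A[r][c] for x in A[r]]      # make the leading entry 1
        for i in range(r + 1, m):               # forward pass: zeros below
            A[i] = [a - A[i][c] * b for a, b in zip(A[i], A[r])]
        pivots.append(c)
        r += 1
    for r in range(len(pivots) - 1, -1, -1):    # backward pass: zeros above
        c = pivots[r]
        for i in range(r):
            A[i] = [a - A[i][c] * b for a, b in zip(A[i], A[r])]
    return A
```

Exact fractions are used so that the result can be compared entry-by-entry with a hand computation; a numerical implementation would instead use pivoting strategies to control roundoff, as noted above.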
2x1 + 3x2 +  x3 + 4x4 − 9x5 = 17
 x1 +  x2 +  x3 +  x4 − 3x5 =  6
 x1 +  x2 +  x3 + 2x4 − 5x5 =  8
2x1 + 2x2 + 2x3 + 3x4 − 8x5 = 14

for which the augmented matrix is

[2 3 1 4 −9 | 17]
[1 1 1 1 −3 |  6]
[1 1 1 2 −5 |  8]
[2 2 2 3 −8 | 14].

Applying Gaussian elimination to the augmented matrix of the system produces a sequence of matrices; the forward pass yields

[1 1  1 1 −3 | 6]
[0 1 −1 2 −3 | 5]
[0 0  0 1 −2 | 2]
[0 0  0 0  0 | 0],

and the backward pass then produces the row echelon form

[1 0  2 0 −2 | 3]
[0 1 −1 0  1 | 1]
[0 0  0 1 −2 | 2]
[0 0  0 0  0 | 0].

The system of linear equations corresponding to this last matrix is

x1 + 2x3 − 2x5 = 3
x2 −  x3 +  x5 = 1
      x4 − 2x5 = 2.

Notice that we have ignored the last row since it consists entirely of zeros. To solve a system for which the augmented matrix is in row echelon form, divide the variables x1, x2, ..., x5 into two sets. The first set consists of those variables that appear as leftmost variables in one of the equations of the system (in this case the set is {x1, x2, x4}). The second set consists of the remaining variables (in this case {x3, x5}). To each variable in the second set, assign a parametric value t1, t2, ... (x3 = t1, x5 = t2), and then solve for the variables of the first set in terms of those in the second set:

x1 = −2x3 + 2x5 + 3 = −2t1 + 2t2 + 3
x2 =   x3 −  x5 + 1 =   t1 −  t2 + 1
x4 =        2x5 + 2 =        2t2 + 2.

Thus an arbitrary solution, s, is of the form

s = (−2t1 + 2t2 + 3, t1 − t2 + 1, t1, 2t2 + 2, t2)
  = (3, 1, 0, 2, 0) + t1(−2, 1, 1, 0, 0) + t2(2, −1, 0, 2, 1),

where t1, t2 ∈ R. Notice that

{(−2, 1, 1, 0, 0), (2, −1, 0, 2, 1)}

is a basis for the solution set of the corresponding homogeneous system of equations.

Therefore, in simplifying the augmented matrix of the system to row echelon form, we are in effect simultaneously finding a particular solution to the original system and a basis for the solution set of the associated homogeneous system. Moreover, this procedure detects when a system has no solutions, for by Exercise 3, solutions exist if and only if, in the reduction of the augmented matrix to row echelon form, we do not obtain a row in which the only nonzero entry lies in the last column.

Sec. 3.4   Systems of Linear Equations—Computational Aspects   167

Thus to use this procedure for solving a system AX = B of m linear equations in n unknowns, we need only begin to transform the augmented matrix (A|B) into its row echelon form (A′|B′) by means of Gaussian elimination. If a row is obtained in which the only nonzero entry lies in the last column, then the original system has no solutions. Otherwise, discard any zero rows from (A′|B′), and write the system of equations corresponding to the resulting matrix. Solve this system as described above to obtain an arbitrary solution of the form

s = s0 + t1u1 + t2u2 + ··· + t(n−m′)u(n−m′),

where m′ is the number of nonzero rows in A′ (m′ ≤ m). The preceding equation suggests that an arbitrary solution s cannot be expressed in terms of fewer than n − m′ parameters. The following theorem states that s cannot, in fact, be expressed in fewer parameters.

Theorem 3.15. Let AX = B be a system of m nonzero equations in n unknowns. Suppose that rank(A) = rank(A|B) and that (A|B) is in row echelon form. Then

(a) rank(A) = m.
(b) If the general solution obtained by the procedure above is of the form

s = s0 + t1u1 + t2u2 + ··· + t(n−m)u(n−m),

then {u1, u2, ..., u(n−m)} is a basis for the solution set of the corresponding homogeneous system, and s0 is a solution to the original system.

Proof. Since (A|B) is in row echelon form, rank(A|B) equals the number of its nonzero rows by Exercises 5 and 6; hence rank(A) = rank(A|B) = m. Setting t1 = t2 = ··· = t(n−m) = 0, we see that s0 is a solution to the original system. Let K be the solution set for AX = B, and let KH be the solution set for AX = 0. By Theorem 3.9, K = {s0} + KH. Hence

KH = K − {s0} = span({u1, u2, ..., u(n−m)}).

Since rank(A) = m, dim(KH) = n − m. Thus, since KH is generated by a set {u1, u2, ..., u(n−m)} containing at most n − m elements, we conclude that this set is a basis for KH.
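The two-set procedure used in the example and in Theorem 3.15 can be written down mechanically. The function below is a sketch under our own naming (`solve_from_rref` is not from the text): given a reduced augmented matrix with its zero rows removed, it reads off the particular solution s0 from the last column and one basis vector of the homogeneous solution set for each free (second-set) variable.

```python
def solve_from_rref(R):
    # R: augmented matrix in row echelon form, zero rows removed.
    # Returns (s0, basis): a particular solution and a basis of the
    # solution set of the corresponding homogeneous system.
    m, n = len(R), len(R[0]) - 1
    pivots = [next(j for j, x in enumerate(row) if x != 0) for row in R]
    free = [j for j in range(n) if j not in pivots]
    s0 = [0] * n
    for i, p in enumerate(pivots):
        s0[p] = R[i][n]              # leftmost variables take the constants
    basis = []
    for f in free:
        u = [0] * n
        u[f] = 1                     # set this parameter to 1, others to 0
        for i, p in enumerate(pivots):
            u[p] = -R[i][f]          # solve for the leftmost variables
        basis.append(u)
    return s0, basis
```

Applied to the row echelon form of the example, it reproduces the particular solution (3, 1, 0, 2, 0) and the basis {(−2, 1, 1, 0, 0), (2, −1, 0, 2, 1)} found above.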
EXERCISES

1. Label the following statements as being true or false.
(a) If (A′|B′) is obtained from (A|B) by a finite sequence of elementary column operations, then the systems AX = B and A′X = B′ are equivalent.
(b) If (A′|B′) is obtained from (A|B) by a finite sequence of elementary row operations, then the systems AX = B and A′X = B′ are equivalent.
(c) If A is an n × n matrix with rank n, then the row echelon form of A is In.
(d) Any matrix can be put in row echelon form by means of a finite sequence of elementary row operations.
(e) If (A|B) is in row echelon form, then the system AX = B must have a solution.
(f) Let AX = B be a system of m linear equations in n unknowns for which the augmented matrix is in row echelon form. If this system has solutions, then the dimension of the solution set of AX = 0 is n − m′, where m′ equals the number of nonzero rows in the augmented matrix.
(g) If a matrix A is transformed by elementary row operations into a matrix A′ in row echelon form, then the number of nonzero rows in A′ equals the rank of A.

2. Solve the following systems of linear equations using Gaussian elimination.
(a)  x1 + 2x2 −  x3 = −1
    2x1 + 2x2 +  x3 =  1
    3x1 + 5x2 − 2x3 = −1
(b)  x1 − 2x2 −  x3 = 1
    2x1 + 3x2 +  x3 = 6
    3x1 + 5x2 −  x3 = 7
(c)  x1 + 2x2       + 2x4 =  6
    3x1 + 5x2 −  x3 + 6x4 = 17
    2x1 + 4x2 +  x3 + 2x4 = 12
    2x1       − 7x3 + 11x4 = 7
(d) 3x1 − 2x2 + 9x3 − 3x4 = 12
     x1 +  x2 −  x3 +  x4 =  3
    2x1 −  x2 + 4x3 − 2x4 =  9
(e)  x1 − 4x2 −  x3 +  x4 =  3
    2x1 − 8x2 +  x3 − 4x4 =  9
    −x1 + 4x2 − 2x3 + 5x4 = −6
(f)  x1 + 2x2 −  x3 + 3x4 = 2
    2x1 + 4x2 −  x3 + 6x4 = 5
          x2        + 2x4 = 3
(g) 2x1 − 2x2 −  x3 + 6x4 − 2x5 = 1
     x1 −  x2 +  x3 + 2x4 −  x5 = 2
    4x1 − 4x2 + 5x3 + 7x4 −  x5 = 6
(h) 3x1 −  x2 +  x3 −  x4 + 2x5 =  5
     x1 −  x2 −  x3 − 2x4 −  x5 =  2
    5x1 − 2x2 +  x3 − 3x4 + 3x5 = 10
    2x1 −  x2       − 2x4 +  x5 =  5
(i) 2x1       + 3x3       − 4x5 = −5
    3x1       − 6x3 + 9x4 + 3x5 =  6
    7x1 − 5x2 + 8x3 + 8x4 −  x5 = 10
(j)  x1 + 2x2 −  x3 + 3x4 − 2x5 = 2
          x2 +  x3 + 7x4 −  x5 = 2
    2x1 + 3x2 − 2x3 + 7x4 − 5x5 = 2

3. Suppose that the augmented matrix of a system AX = B is transformed into a matrix (A′|B′) in row echelon form by a finite sequence of elementary row operations.
(a) Prove that rank(A′) ≠ rank(A′|B′) if and only if (A′|B′) contains a row in which the only nonzero entry lies in the last column.
(b) Deduce that AX = B has solutions if and only if (A′|B′) contains no row in which the only nonzero entry lies in the last column.

4. For each of the following systems, apply Exercise 3 to determine if the system has solutions. If there are solutions, find all of them. Finally, find a basis for the solution set of the corresponding homogeneous system.
(a)  x1 + 2x2 −  x3 +  x4 =  2
    2x1 +  x2 +  x3 −  x4 =  3
     x1 + 2x2 − 3x3 + 2x4 = −2
(b)  x1 +  x2 − 3x3 +  x4 = −2
     x1 +  x2 +  x3 −  x4 =  2
     x1 +  x2       +  x4 =  1
(c)  x1 +  x2 − 3x3 +  x4 =  1
     x1 +  x2 +  x3 −  x4 =  2
     x1 +  x2 −  x3       =  0

5. Prove that if a matrix A is in row echelon form, then rank(A) equals the number of nonzero rows in A.

6. If (A|B) is in row echelon form, prove that A is also in row echelon form.

7. (a) Prove that if a matrix Q can be transformed into a matrix Q′ by a finite number of elementary row operations, then Q′ can be transformed into Q by a finite number of elementary row operations.
(b) Deduce that if Q and Q′ are m × n matrices in row echelon form, each of which can be obtained from a matrix A by a finite number of elementary row operations, then Q = Q′. (Thus there is a unique matrix in row echelon form that can be obtained from A by a finite number of elementary row operations.) Hint: Use induction on n.

INDEX OF DEFINITIONS FOR CHAPTER 3

Augmented matrix 141
Augmented matrix of a system of linear equations 147
Backward pass 164
Closed model of a simple economy 154
Coefficient matrix of a system of linear equations 147
Consumption matrix 155
Elementary column operation 128
Elementary matrix 129
Elementary operation 128
Elementary row operation 128
Equilibrium condition for a simple economy 155
Equivalent systems of linear equations 149
Forward pass 164
Gaussian elimination 164
Homogeneous system corresponding to a nonhomogeneous system 148
Homogeneous system of linear equations 148
Input–output matrix 155
Nonhomogeneous system of linear equations 148
Nonnegative matrix 156
Open model of a simple economy 155
Positive matrix 156
Rank of a matrix 132
Row echelon form 164
Solution set of a system of linear equations 148
Solution to a system of linear equations 147
System of linear equations 147
Trivial solution 149
Type 1, 2, and 3 elementary operations 128
At one time determinants played a major role in the study of linear algebra. Now, however, they are of much less importance. In fact, virtually our only use of the determinant is in the computation of "eigenvalues." For this reason the important facts about determinants needed for later chapters are summarized in Section 4.4. The reader who is not interested in pursuing a development of the theory of determinants may proceed immediately to that section.

The determinant of a square matrix with entries from a field F is a scalar (element of F). Thus we may regard the determinant as a function having domain Mn×n(F) and taking values in F. Although the determinant of a square matrix can be defined in terms of the entries of the matrix, the resulting definition is cumbersome to use in verifying properties of the determinant. Instead of defining the determinant in this manner, we define a determinant as a function δ: Mn×n(F) → F possessing three properties. This chapter begins in Section 4.1 with a study of the determinant in a simple special setting; there we also investigate the geometric significance of the determinant in terms of the concepts of area and orientation. In Section 4.2 we define a determinant on Mn×n(F), and in that section we also verify that the familiar method of evaluating a determinant by expansion along a column produces a determinant in the sense of our definition. Section 4.3 contains further important properties of the determinant and a proof that the determinant is unique, i.e., that the three defining properties of a determinant are satisfied by one and only one function from Mn×n(F) into F. Readers who have studied advanced calculus will recall that a change of coordinates in multiple integrals necessitated the use of a determinant called the Jacobian.

4.1 DETERMINANTS OF ORDER 2

Eventually we will assign to each n × n matrix with entries from a field F a scalar called the "determinant" of the matrix, but first we consider an easy special case.

Definition. The determinant of a 2 × 2 matrix A with entries from a field F is the scalar A11A22 − A12A21, which we denote by det(A).

Chap. 4   Determinants   172

Example

Consider the following element of M2×2(R):

A = [1 2; 3 4].

Then det(A) = 1·4 − 2·3 = −2.

In the discussion that follows, it will be convenient to represent a matrix A in terms of its rows; as before, we will write A(1) and A(2) to denote the rows of A, and

det [A(1); A(2)]

to denote its determinant. The determinant has the following important properties.
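The definition is a one-line computation. The helper below is an illustrative sketch (the name `det2` is ours); it takes the matrix as a list of rows and returns A11·A22 − A12·A21.

```python
def det2(A):
    # determinant of a 2x2 matrix A = [[A11, A12], [A21, A22]]
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]
```

For the matrix of the example it returns 1·4 − 2·3 = −2.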
Theorem 4.1. The determinant of a 2 × 2 matrix satisfies the following three conditions:

(a) The determinant is a linear function of each row when the other row is held fixed; that is,

det [A(1) + A′(1); A(2)] = det [A(1); A(2)] + det [A′(1); A(2)]

and

det [cA(1); A(2)] = c·det [A(1); A(2)]

for all scalars c in F, and similarly for the second row.
(b) If the rows of A are identical, then det(A) = 0.
(c) If I is the 2 × 2 identity matrix, then det(I) = 1.

Proof. (a) Let A(1) = (A11, A12), A′(1) = (A′11, A′12), and A(2) = (A21, A22); then

det [cA(1) + A′(1); A(2)] = (cA11 + A′11)A22 − (cA12 + A′12)A21
= c(A11A22 − A12A21) + (A′11A22 − A′12A21)
= c·det [A(1); A(2)] + det [A′(1); A(2)].

A similar argument proves that the determinant is also a linear function of the second row.
(b) If the rows of A are identical, then A has the form

[A11 A12; A11 A12].

So det(A) = A11A12 − A12A11 = 0.
(c) Since I = [1 0; 0 1], we have det(I) = 1·1 − 0·0 = 1.

Sec. 4.1   Determinants of Order 2   173

The next result shows that the three properties mentioned in Theorem 4.1 completely characterize the determinant as defined above.

Theorem 4.2. Let δ: M2×2(F) → F be any function having the following three properties:

(a) δ is a linear function of each row when the other row is held fixed.
(b) If A ∈ M2×2(F) has identical rows, then δ(A) = 0.
(c) If I is the 2 × 2 identity matrix, then δ(I) = 1.

Then δ = det; that is, δ(A) = A11A22 − A12A21 for each A ∈ M2×2(F).

Proof. Let I denote the 2 × 2 identity matrix, and let

M1 = [1 0; 1 0],   M2 = [0 1; 0 1],   and   M3 = [0 1; 1 0].

We will first prove that δ(M3) = −1. Using properties (b) and (a), we have

0 = δ [1+0 0+1; 1+0 0+1] = δ(I) + δ(M1) + δ(M2) + δ(M3) = 1 + 0 + 0 + δ(M3).

Thus δ(M3) = −1. Now let

A = [A11 A12; A21 A22]

be an arbitrary element of M2×2(F). Using property (a) to expand δ(A) row by row, we obtain

δ(A) = A11A21·δ [1 0; 1 0] + A11A22·δ [1 0; 0 1] + A12A21·δ [0 1; 1 0] + A12A22·δ [0 1; 0 1]
= A11A21·δ(M1) + A11A22·δ(I) + A12A21·δ(M3) + A12A22·δ(M2)
= A11A21(0) + A11A22(1) + A12A21(−1) + A12A22(0)
= A11A22 − A12A21 = det(A).

So δ = det.
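The three conditions of Theorems 4.1 and 4.2 can be spot-checked numerically. The snippet below is an illustrative sketch with our own names and sample values (they are not from the text); it verifies condition (a) for the first row with a particular choice of rows and scalar, and conditions (b) and (c) directly.

```python
def det2(A):
    # det of A = [[A11, A12], [A21, A22]]
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

# condition (a): linearity in the first row, with the second row fixed
r, rp, s, c = [1, 2], [5, -3], [4, 7], 6
lhs = det2([[c * r[0] + rp[0], c * r[1] + rp[1]], s])
rhs = c * det2([r, s]) + det2([rp, s])
```

Conditions (b) and (c) correspond to `det2([s, s]) == 0` and `det2([[1, 0], [0, 1]]) == 1`.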
The Area of a Parallelogram

Motivated by this characterization of the determinant of a 2 × 2 matrix, in Section 4.2 we define a determinant on Mn×n(F) as a function possessing the three properties of Theorem 4.1. But first we will use this uniqueness property to study the geometric significance of the determinant of a 2 × 2 matrix. In particular, we will find that the sign of the determinant is of geometric importance in the study of orientation.

By the angle between two vectors u and v in R2, we mean the angle with measure θ (0 ≤ θ < 2π) formed by vectors having the same magnitudes and directions as u and v but emanating from the origin (see Figure 4.1). Given three vectors u, v, and w emanating from the same point, we say that v lies between u and w if the angle between u and w equals the sum of the angles between u and v and between v and w (see Figure 4.2).

Figure 4.1   Angle between two vectors in R2

Figure 4.2   (Here v lies between u and w; w does not lie between u and v.)

Given an ordered basis β = {u, v} for R2, where u = (a1, a2) and v = (b1, b2), we denote by

det [u; v]

the scalar det [a1 a2; b1 b2], and define the orientation of β to be the real number

O [u; v] = det [u; v] / |det [u; v]|.

(It follows from Exercise 10 that the denominator is not zero.) Clearly O [u; v] = ±1. Notice that O [e1; e2] = 1 and O [e2; e1] = −1.

Figure 4.3   A right-handed coordinate system and a left-handed coordinate system

In general (see Exercise 11), O [u; v] = 1 if and only if the ordered basis {u, v} forms a right-handed coordinate system, and O [u; v] = −1 if and only if {u, v} forms a left-handed coordinate system. [Recall that a coordinate system {u, v} is right-handed if u can be rotated in a counterclockwise direction through an angle θ with measure such that 0 < θ < π to coincide in direction with v; otherwise, {u, v} is a left-handed coordinate system (see Figure 4.3).] For convenience, we define O [u; v] = 1 if {u, v} is linearly dependent.

Any ordered set {u, v} in R2 determines a parallelogram in the following manner. Regarding u and v as arrows emanating from the origin of R2, we call the parallelogram having u and v as adjacent sides the parallelogram determined by u and v (see Figure 4.4). Observe that if the set {u, v} is linearly dependent, i.e., if u and v are parallel, then the "parallelogram" determined by u and v is actually a line segment, which we consider to be a degenerate parallelogram having area zero.

Figure 4.4   Parallelograms determined by u and v

There is an interesting relationship between A [u; v], the area of the parallelogram determined by u and v, and det [u; v], which we now investigate. Observe first, however, that since det [u; v] may be negative, we cannot expect that

A [u; v] = det [u; v].

But we can prove that

A [u; v] = O [u; v] · det [u; v],

from which it follows that A [u; v] = |det [u; v]|.

In arguing that A [u; v] = O [u; v] · det [u; v], we will employ a technique that, although somewhat indirect, can be generalized to Rn. First, since O [u; v] = ±1, we may multiply both sides of the desired equation by O [u; v] to obtain the equivalent form

O [u; v] · A [u; v] = det [u; v].

We will establish this equation by verifying that the three conditions of Theorem 4.2 are satisfied by the function δ defined by δ [u; v] = O [u; v] · A [u; v].

(a) We begin by showing that

δ [u; cv] = c · δ [u; v]

for any real number c. Observe that this conclusion is immediate if c = 0 or if u and v are parallel, because then the parallelogram determined by u and cv is degenerate and so both sides are zero. So assume that c ≠ 0 and that {u, v} is linearly independent. Regarding cv as the base of the parallelogram determined by u and cv, we see that its base has length |c| times the length of v, while the altitude h of the parallelogram determined by u and cv is the same as the altitude of the parallelogram determined by u and v (see Figure 4.5). Hence

A [u; cv] = base × altitude = |c| · (length of v) · h = |c| · A [u; v].

Moreover, replacing v by cv preserves the hand of the coordinate system when c > 0 and reverses it when c < 0, so that O [u; cv] = (c/|c|) · O [u; v]. Thus

δ [u; cv] = O [u; cv] · A [u; cv] = (c/|c|) · O [u; v] · |c| · A [u; v] = c · δ [u; v].

A similar argument shows that δ [cu; v] = c · δ [u; v].

Figure 4.5

We next prove that

δ [u; au + bw] = b · δ [u; w]

for any u, w ∈ R2 and any real numbers a and b. The conclusion is immediate if b = 0, since then au + bw is parallel to u and both sides are zero. Otherwise, if b ≠ 0, observe that the parallelograms determined by u and bw and by u and au + bw have a common base u and the same altitude (see Figure 4.6), and that bw and au + bw lie on the same side of the line through u. Hence

δ [u; au + bw] = δ [u; bw] = b · δ [u; w].

So the desired conclusion is obtained in either case.

Figure 4.6

We are now able to show that

δ [u; v1 + v2] = δ [u; v1] + δ [u; v2]

for all u, v1, v2 ∈ R2. Since the result is immediate if u = 0, we assume that u ≠ 0. Choose any vector w ∈ R2 such that {u, w} is linearly independent. Then for each i there exist scalars ai and bi such that vi = ai·u + bi·w (i = 1, 2). Thus

δ [u; v1 + v2] = δ [u; (a1 + a2)u + (b1 + b2)w] = (b1 + b2) · δ [u; w]
= b1 · δ [u; w] + b2 · δ [u; w] = δ [u; v1] + δ [u; v2].

A similar argument shows that

δ [u1 + u2; v] = δ [u1; v] + δ [u2; v]

for all u1, u2, v ∈ R2.

(b) Since the parallelogram determined by u and u is degenerate, δ [u; u] = 0 for any u ∈ R2.

(c) Because the parallelogram determined by e1 and e2 is the unit square,

δ [e1; e2] = O [e1; e2] · A [e1; e2] = 1·1 = 1.

Therefore δ satisfies the three conditions of Theorem 4.2, and hence δ = det. So the area of the parallelogram determined by u and v equals O [u; v] · det [u; v].

Thus, for example, the area of the parallelogram determined by u = (−1, 5) and v = (4, −2) is

O [u; v] · det [u; v] = | det [−1 5; 4 −2] | = |(−1)(−2) − 5·4| = 18.
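The orientation and area formulas above reduce to a few lines of code. The helpers below are an illustrative sketch (the names are ours); `orientation` returns det/|det|, with the stated convention that a linearly dependent pair has orientation 1, and `area` computes O·det, which equals |det|.

```python
def det2(u, v):
    # determinant of the 2x2 matrix with rows u and v
    return u[0] * v[1] - u[1] * v[0]

def orientation(u, v):
    # O[u; v] = det/|det|; by the convention above, 1 for a dependent pair
    return 1 if det2(u, v) >= 0 else -1

def area(u, v):
    # A[u; v] = O[u; v] * det[u; v] = |det[u; v]|
    return orientation(u, v) * det2(u, v)
```

For u = (−1, 5) and v = (4, −2) this gives the area 18 computed in the example.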
EXERCISES

1. Label the following statements as being true or false.
(a) The determinant of a 2 × 2 matrix is a linear function of each row of the matrix when the other row is held fixed.
(b) If I is the 2 × 2 identity matrix, then det(I) = 1.
(c) If both rows of a 2 × 2 matrix A are identical, then det(A) = 0.
(d) If u and v are vectors in R2 emanating from the origin, then the area of the parallelogram having u and v as adjacent sides is det [u; v].
(e) A coordinate system {u, v} is right-handed if and only if its orientation equals 1.
(f) The determinant is a linear transformation from M2×2(F) into F.

2. Compute the determinants of the following elements of M2×2(R).
(a) [6 −3; 2 4]
(b) [−5 2; 6 1]
(c) [8 0; 3 −1]

3. Compute the determinants of the following elements of M2×2(C).
(a) [−1 + i   1 − 4i; 3 + 2i   2 − 3i]
(b) [5 − 2i   6 + 4i; −3 + i   7i]
(c) [2i   3; 4   6i]

4. For each of the following pairs of vectors u and v in R2, compute the area of the parallelogram determined by u and v.
(a) u = (3, −2) and v = (1, 3)
(b) u = (−3, 5) and v = (−6, 1)
(c) u = (4, −2) and v = (2, 5)
(d) u = (3, 4) and v = (2, −6)

5. Prove that if B is the matrix obtained by interchanging the rows of a 2 × 2 matrix A, then det(B) = −det(A).

6. Prove that det(Aᵗ) = det(A) for any A ∈ M2×2(F).

7. Prove that if A is an upper triangular 2 × 2 matrix, then the determinant of A equals the product of the entries of A lying on the diagonal.

8. Prove that det(AB) = det(A)·det(B) for any A, B ∈ M2×2(F).

9. The classical adjoint of a 2 × 2 matrix A is the matrix

C = [A22 −A12; −A21 A11].

Prove the following.
(a) CA = AC = [det(A)]I.
(b) det(C) = det(A).
(c) The classical adjoint of Aᵗ is Cᵗ.

10. (a) Use Exercise 9(a) to prove that a 2 × 2 matrix A is invertible if and only if det(A) ≠ 0.
(b) Prove that if A is invertible, then A⁻¹ = [det(A)]⁻¹·C, where C is the classical adjoint of A.

11. Prove that the ordered basis {u, v} for R2 forms a right-handed coordinate system if and only if its orientation equals 1. Hint: Recall the definition of a rotation as given in Example 5 of Section 2.1.

4.2 DETERMINANTS OF ORDER n

We have seen in Theorem 4.2 that the three properties stated in Theorem 4.1 completely characterize the determinant of a 2 × 2 matrix. We will soon define the determinant of an n × n matrix in terms of these properties, but first we require some preliminary results. To begin, we name the first of the conditions that characterize the determinant of a 2 × 2 matrix.

Definition. A function δ: Mn×n(F) → F is said to be an n-linear function if δ is a linear function of each row of an n × n matrix when the remaining rows are held fixed; that is, if for each i = 1, 2, ..., n we have

δ [A(1); ...; cA(i) + A′(i); ...; A(n)] = c·δ [A(1); ...; A(i); ...; A(n)] + δ [A(1); ...; A′(i); ...; A(n)]

whenever c is a scalar and A′(i) is the ith row of an element of Mn×n(F).

Example 1

Theorem 4.1 shows that the function det: M2×2(F) → F defined by det(A) = A11A22 − A12A21 is a 2-linear function.

Example 2

The function δ: Mn×n(F) → F defined by δ(A) = 0 for each A ∈ Mn×n(F) is an n-linear function.

Example 3

Define δ: Mn×n(F) → F by δ(A) = A1j·A2j·…·Anj; that is, δ(A) equals the product of the entries of the jth column of A. Then δ is an n-linear function for each j (1 ≤ j ≤ n), since

δ [A(1); ...; cA(i) + A′(i); ...; A(n)] = A1j ··· A(i−1)j (cAij + A′ij) A(i+1)j ··· Anj
= c(A1j ··· A(i−1)j Aij A(i+1)j ··· Anj) + (A1j ··· A(i−1)j A′ij A(i+1)j ··· Anj)
= c·δ [A(1); ...; A(i); ...; A(n)] + δ [A(1); ...; A′(i); ...; A(n)].

Sec. 4.2   Determinants of Order n   183

Example 4

The function δ: Mn×n(F) → F defined by δ(A) = A11A22···Ann (that is, δ(A) equals the product of the entries of A lying on the diagonal) is an n-linear function.

Example 5

The function δ: Mn×n(F) → F defined by δ(A) = tr(A) is not an n-linear function.
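Examples 3 and 5 can be contrasted numerically. The snippet below is an illustrative sketch (the names, the sample matrix, and the particular row replacement are ours, not the text's): it replaces row i of a matrix A by cA(i) + A′(i) and checks both sides of the n-linearity condition for the column-product function and for the trace.

```python
def col_product(A, j):
    # Example 3: delta(A) = product of the entries of the jth column
    p = 1
    for row in A:
        p *= row[j]
    return p

def trace(A):
    # Example 5: delta(A) = tr(A), which is NOT an n-linear function
    return sum(A[i][i] for i in range(len(A)))

A  = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
Ap = [[1, 2, 3], [1, 1, 1], [7, 8, 9]]   # agrees with A except in row i
c, i = 2, 1
B = [row[:] for row in A]
B[i] = [c * x + y for x, y in zip(A[i], Ap[i])]   # row i -> c*A(i) + A'(i)
```

The column product satisfies the condition, while the trace fails it (the unchanged diagonal entries are added once on the left but c + 1 times on the right).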
Qur next result showsthat


other n-linear functions.

Proposition 4.3.
linearfunction
the
(where

Section

by S(A) =

defined

combination

linear

sum

functions

n-linear

two

of

scalar

and

may

be

combined

n-linear

functions

are as defined

product

to produce

is an n-

in Example3 of

1.2).

Let

Proof.

St and

the linear combination

S2 be n-linear

S = ab1 +

A.(1)

cAty

L(n)

functions, and let a

and

then

bd2,

L(D

A^

= a'd1

Cy4(o

AW

A{i<*)

cA^ + Aft

b'S2

\\

4\302\273

be

scalars.

If S

is

184

h\\
= a

cd1

h\\
+

Am

+ b

A\302\256

\\AV>I

a'd1

\\-4(n)

ffl

f'\"\\

+ b-52

A(i)

4i (o

/M\"

\\A{\\
=

5.

cmS-

^(n)

\\y4(n}/

fc\\

fc\\

Ai)

Determinants

Chap.

A(i)

a-di

Mi,\\

c*<5 ^(0

:+<5 ^(0

^(i)

&\"<52

W-

vw-

MA\"

A[i)

each

for

i,

^ i ^

n.

W/

Thus S is

function.

n-linear

an

of n-linear functions is

Any linear combination

Corollary.

n-linear

an

'.

function.

Exercise.

Proof

The following definitionnames


characterizes the determinant of a 2
x

An

Definition.

two adjacent

whenever

n-linear

function

of

second

the

the

three

that

properties

matrix.

3 is said

to be alternating

<5(A)

if

rows are identical

Example 6

Of

three

the

functions

given

in Examples

2, 3, and

4, only

the

first

is

than

it

alternating.

first

n-linear

The following result


In particular,
appears.

shows that the precedingdefinition


is
to
be assumed
there is no needfor the
stronger

rows

adjacent

in the definition.

Proposition 4.4.

Let 8:

Then both of thefollowingare


(a)

If
then

Mnxn(F)

->

be

an alternating

n4inear

function.

true:

obtained
by interchanging
=
-5(A).
5(B)

B is

any two rows ofannxn

matrix

A,

Sec. 4.2

If two rows

(b)

185

Order

of

Determinants

of an n x

matrix

Proof. We first prove that if


adjacent rows of A, then 6(B)=
rows i and i + 1 of
interchanging

is

are
obtained

any

interchanging

by

-5(/1). Suppose

that

is

obtained

two
by

Ld)

L(D'

5(A) == 0.

then

identical,

An(i+D

thus

A(i)

vw

\\4.)

Now

L(l)

= 5

(f+1)

L(n)

= 5
A{i)

= 0 +
is

5
Now

where

of

rows

<j\\

all,

function. Thus 3(B) = -5(/1).


that
B is obtained
from A by interchanging rows i and j,
suppose
with
rows i and i + 1, successivelyinterchangeadjacent
Beginning
the rows are in the order
until

an

n-linear

alternating

with

i4w

...,
/1(1),
process

are

interchanges

requires

the

needed

preceding

/!(,\342\226\240_!),

/1(,-),

/1/f+D,...,

^4//-1),

\342\200\224

interchange.

This

5(/1) + 5(B)+

/l/i),...,
In

\\

since

/1(/

/I/.-),

\342\200\224
\342\200\224

/1(,-),

/1(,-+1),...,

/!(\342\200\236).

to produce this ordering. Now successively


row until the rows are in the order
+ 1),

..., ^.fi'-l)'

interchanges

^-(1)9

of adjacent

^ti+D'*

rows

^
\342\226\240
\342\200\242>
L(n)-

and produces

the

186

Chap.4
Hence

B.

matrix

= (-ly-^-iy*-1-1

S(B)

It remains to

j < A then
8(A)= 0

8{A) =

(*

by

the

8(B)=

Hence

(a) and

conditions

satisfies

We are now prepared to define a determinant.

Definition. A determinant on M_{n×n}(F) is an alternating n-linear function δ: M_{n×n}(F) → F such that δ(I) = 1.

Observe that the function δ: M_{1×1}(F) → F defined by δ(A) = A11 is a determinant on M_{1×1}(F), for it clearly satisfies the three requirements of this definition. Moreover, it follows from Theorems 4.1 and 4.2 that by defining the determinant of a 2 × 2 matrix A as A11·A22 − A12·A21, we obtain a determinant on M_{2×2}(F) in the sense of the definition above. Our next result enables us to define a determinant on M_{n×n}(F) inductively.

Proposition 4.5. Let δ be an alternating n-linear function on M_{n×n}(F), and for an (n + 1) × (n + 1) matrix A with entries from F let Ã_{ij} denote the n × n matrix obtained from A by deleting the ith row and jth column. For each j (1 ≤ j ≤ n + 1), define ε_j: M_{(n+1)×(n+1)}(F) → F by

ε_j(A) = Σ_{i=1}^{n+1} (−1)^{i+j} A_{ij} · δ(Ã_{ij})

for each (n + 1) × (n + 1) matrix A. Then ε_j is an alternating (n + 1)-linear function on the (n + 1) × (n + 1) matrices with entries from F. Moreover, if δ is a determinant, then so is ε_j.
Proof. The scalar δ(Ã_{ij}) does not depend on row i of A, because Ã_{ij} is obtained by deleting the ith row of A; and since δ is an n-linear function, δ(Ã_{ij}) is a linear function of each row of A except the ith. Thus each term A_{ij}·δ(Ã_{ij}) is a linear function of each row of A, and so

ε_j(A) = Σ_{i=1}^{n+1} (−1)^{i+j} A_{ij}·δ(Ã_{ij})

is a linear combination of (n + 1)-linear functions. Hence ε_j is an (n + 1)-linear function by the corollary to Proposition 4.3.

We now prove that ε_j is alternating. If A is an (n + 1) × (n + 1) matrix in which rows k and k + 1 are identical, then Ã_{ij} has two identical rows whenever i ≠ k and i ≠ k + 1. Thus δ(Ã_{ij}) = 0 whenever i ≠ k and i ≠ k + 1, and hence

ε_j(A) = (−1)^{k+j} A_{kj}·δ(Ã_{kj}) + (−1)^{(k+1)+j} A_{(k+1)j}·δ(Ã_{(k+1)j}).

But because rows k and k + 1 of A are equal, we have Ã_{kj} = Ã_{(k+1)j} and A_{kj} = A_{(k+1)j}; so the two terms above differ only in sign, and ε_j(A) = 0. This proves that ε_j is alternating.

Finally, suppose that δ is a determinant. Let I denote the (n + 1) × (n + 1) identity matrix, and let Ĩ_{ij} denote the n × n matrix obtained from I by deleting row i and column j. Since I_{ij} = 0 if i ≠ j and I_{jj} = 1, and since Ĩ_{jj} is the n × n identity matrix, we have

ε_j(I) = (−1)^{j+j}·I_{jj}·δ(Ĩ_{jj}) = δ(I_n) = 1.

Thus ε_j is a determinant on the (n + 1) × (n + 1) matrices with entries from F. ∎
Corollary 1. Let δ be a determinant on M_{n×n}(F). Then for any j (1 ≤ j ≤ n + 1), the function ε_j defined in Proposition 4.5 is a determinant on the (n + 1) × (n + 1) matrices with entries from F.

Corollary 2. There exists a determinant on M_{n×n}(F) for any positive integer n.

Proof. The proof is by induction on n. When n = 1, the function det: M_{1×1}(F) → F defined by det(A) = A11 is a determinant on M_{1×1}(F). Assume that there exists a determinant δ on the n × n matrices with entries from F. Then for any j (1 ≤ j ≤ n + 1), the function ε_j defined in Proposition 4.5 is a determinant on the (n + 1) × (n + 1) matrices with entries from F. This completes the induction. ∎

Definitions. If δ is a determinant on M_{n×n}(F), then the scalar (−1)^{i+j}·δ(Ã_{ij}) is called the cofactor of A_{ij} (with respect to the determinant δ), and the formula given in Proposition 4.5 is called the expansion of the determinant along the jth column.
Example 7

Let A denote the following element of M_{3×3}(F):

A = ( 1 2 3
      4 5 6
      7 8 9 ).

The cofactors of A12, A22, and A32 are

(−1)^{1+2}·det( 4 6; 7 9 ) = (−1)(4·9 − 6·7) = 6,
(−1)^{2+2}·det( 1 3; 7 9 ) = 1(1·9 − 3·7) = −12,
(−1)^{3+2}·det( 1 3; 4 6 ) = (−1)(1·6 − 3·4) = 6,

respectively. Hence the expansion of the determinant of A along the second column is

ε_2(A) = A12(6) + A22(−12) + A32(6) = 2·6 + 5(−12) + 8·6 = 0.

Similarly, the cofactors of A13, A23, and A33 are

(−1)^{1+3}·det( 4 5; 7 8 ) = 1(4·8 − 5·7) = −3,
(−1)^{2+3}·det( 1 2; 7 8 ) = (−1)(1·8 − 2·7) = 6,
(−1)^{3+3}·det( 1 2; 4 5 ) = 1(1·5 − 2·4) = −3,

respectively. Hence the expansion of A along the third column is

ε_3(A) = A13(−3) + A23(6) + A33(−3) = 3(−3) + 6(6) + 9(−3) = 0.

We will see in Theorem 4.9 that the equality of ε_2(A) and ε_3(A) in this example is not coincidental. In fact, we will see that there is exactly one determinant on M_{n×n}(F).
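The expansion of Proposition 4.5 can be carried out recursively by a short program. The sketch below (in Python; the function names are ours) expands along a chosen column and can be used to confirm the computations of Example 7:

```python
# Recursive cofactor expansion of a determinant along a chosen column j,
# in the sense of Proposition 4.5 (0-indexed here).

def minor(A, i, j):
    """Matrix obtained from A by deleting row i and column j."""
    return [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]

def det_by_column(A, j=0):
    """Expansion of det(A) along column j."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** (i + j) * A[i][j] * det_by_column(minor(A, i, j))
               for i in range(n))

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
# Expansions along the second and third columns agree, as in Example 7.
assert det_by_column(A, 1) == det_by_column(A, 2) == 0
```

Whatever column is chosen, the value computed is the same, which is the content of Theorem 4.9 below.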
EXERCISES

1. Label the following statements as being true or false.
(a) Any determinant on M_{n×n}(F) is a linear function of each row of a matrix when the other n − 1 rows are held fixed.
(b) If δ is a determinant and two rows of a matrix A are identical, then δ(A) = 0.
(c) If B is a matrix obtained from an n × n matrix A by interchanging any two rows, then δ(B) = −δ(A) for any determinant δ on M_{n×n}(F).
(d) The function δ: M_{n×n}(F) → F defined by δ(A) = 0 for each A ∈ M_{n×n}(F) is a determinant on M_{n×n}(F).
(e) For any n ≥ 2 there is a determinant on M_{n×n}(F).
(f) Any determinant δ: M_{n×n}(F) → F is a linear transformation.

2. Verify that if A is the 3 × 3 matrix in Example 7, then the expansion of the determinant of A along the first column equals zero.

3. Evaluate the determinant of each of the following matrices by expanding along the second column and also along the third column. [Each matrix is an element of M_{3×3}(C).]
(a) (b) (c) (d)

4. Which of the following functions δ: M_{3×3}(F) → F are 3-linear functions? Justify each answer.
(a) δ(A) = c, where c is any nonzero scalar
(b) δ(A) = A22
(c) δ(A) = A11·A23·A32
(d) δ(A) = A11·A21·A32
(e) δ(A) = A11·A31·A32
(f) δ(A) = A11²·A12·A13²
(g) δ(A) = A11·A22·A33

5. (a) Determine all the 1-linear functions δ: M_{1×1}(F) → F.
(b) Determine all determinants on M_{1×1}(F).

6. Prove the equality of the three functions ε_j: M_{3×3}(F) → F (j = 1, 2, 3) defined in Proposition 4.5 for each A ∈ M_{3×3}(F) by

ε_j(A) = Σ_{i=1}^{3} (−1)^{i+j} A_{ij}·det(Ã_{ij}),

where Ã_{ij} is the 2 × 2 matrix obtained from A by deleting the ith row and jth column and det denotes the determinant of a 2 × 2 matrix.

7. Prove that the unique determinant on M_{2×2}(F) is a 2-linear function of the columns of a matrix and that the determinant of a 2 × 2 matrix in which the two columns are identical is zero.

8. The proof of Theorem 4.2 shows that if δ is a 2-linear function δ: M_{2×2}(F) → F, then

δ(A) = A11·A22·δ(I) + A11·A21·δ(M1) + A12·A22·δ(M2) + A12·A21·δ(M3),

where I, M1, M2, and M3 are as in the proof of the theorem. Prove that for any scalars a, b, c, d ∈ F the function

ε(A) = A11·A22·a + A11·A21·b + A12·A22·c + A12·A21·d

is a 2-linear function. Deduce that δ': M_{2×2}(F) → F is a 2-linear function if and only if it is of the above form for some scalars a, b, c, and d.

9. Show that if F is a field not of characteristic two (as defined in Appendix D), then condition (a) of Proposition 4.4 implies condition (b). This result is not true in arbitrary fields, however.

10. Prove the corollary to Proposition 4.3.
4.3 PROPERTIES OF DETERMINANTS

There are several important properties of the determinant of a matrix. These properties are quite useful in evaluating a determinant and are summarized in the next theorem.

Theorem 4.6. Any determinant δ on M_{n×n}(F) has the following properties:
(a) If B is a matrix obtained from A by multiplying each entry of some row of A by a scalar c, then δ(B) = c·δ(A).
(b) If two rows of A are identical, then δ(A) = 0.
(c) If B is a matrix obtained from A by interchanging two rows, then δ(B) = −δ(A).
(d) If one row of A consists entirely of zero entries, then δ(A) = 0.
(e) If B is a matrix obtained from A by adding a scalar multiple of row i to row j (i ≠ j), then δ(B) = δ(A).

Proof. Property (a) is a consequence of the fact that δ is an n-linear function, whereas properties (b) and (c) are consequences of Proposition 4.4.

(d) Suppose that row i of A consists entirely of zero entries. Then, because δ is an n-linear function,

δ(A) = δ( A(1),..., 0·A(i),..., A(n) ) = 0·δ( A(1),..., A(i),..., A(n) ) = 0.

(e) Let B be obtained from A ∈ M_{n×n}(F) by adding c times row i to row j. Assume for the sake of argument that i < j. Then

δ(B) = δ( A(1),..., A(i),..., A(j) + cA(i),..., A(n) )
     = δ( A(1),..., A(i),..., A(j),..., A(n) ) + c·δ( A(1),..., A(i),..., A(i),..., A(n) )
     = δ(A) + c·0 = δ(A)

by the n-linearity of δ and property (b). ∎

Observe that properties (a), (c), and (e) of Theorem 4.6 show how the determinant of a matrix changes when an elementary row operation is performed on the matrix. We can reformulate these properties in terms of elementary matrices.
Corollary. Let E1, E2, and E3 be elementary matrices of types 1, 2, and 3, respectively. If E2 is obtained by multiplying a row of I by the nonzero scalar c, then for any determinant δ on M_{n×n}(F),

δ(E1) = −1,  δ(E2) = c,  and  δ(E3) = 1.
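This corollary can be checked numerically. The sketch below (in Python; helper names are ours) builds one elementary matrix of each type from the 4 × 4 identity and evaluates det on it by cofactor expansion:

```python
# Check of the corollary: det of elementary matrices of types 1, 2, 3
# is -1, c, and 1, respectively.

def minor(A, i, j):
    return [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Cofactor expansion along the first column."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** i * A[i][0] * det(minor(A, i, 0)) for i in range(len(A)))

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

n, c = 4, 7
E1 = identity(n); E1[0], E1[2] = E1[2], E1[0]                         # type 1: interchange rows
E2 = identity(n); E2[1] = [c * x for x in E2[1]]                      # type 2: scale a row by c
E3 = identity(n); E3[3] = [a + 5 * b for a, b in zip(E3[3], E3[0])]   # type 3: add 5*(row 0) to row 3

assert (det(E1), det(E2), det(E3)) == (-1, c, 1)
```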
This corollary is one of the key ingredients in the proof of the uniqueness of a determinant on M_{n×n}(F). We now prove the first of the two theorems needed to establish this uniqueness; it computes the determinant of any noninvertible matrix.
Theorem 4.7. Let δ be a determinant on M_{n×n}(F), and let A be an element of M_{n×n}(F) having rank less than n. Then δ(A) = 0.

Proof. Since rank(A) < n, the rows A(1), A(2),..., A(n) of A are linearly dependent (Corollary 2 of Theorem 3.6). Hence there are scalars c1,..., cn, not all zero, such that

c1·A(1) + c2·A(2) + ··· + cn·A(n) = 0.

Assume for the sake of argument that c1 ≠ 0. Let B be the matrix obtained from A by adding to the first row the multiple c1^{−1}·ci·A(i) of row i for each i (i = 2,..., n). Then the first row of B consists entirely of zero entries, so δ(B) = 0. But δ(B) = δ(A) by property (e) of Theorem 4.6. Therefore δ(A) = 0. ∎

The next result establishes that a determinant on M_{n×n}(F) behaves well with respect to matrix multiplication. This theorem is the final fact needed to prove the uniqueness of a determinant on M_{n×n}(F), but it is also of considerable importance in its own right. In particular, it provides a test for the invertibility of a matrix.
Lemma. Let δ be a determinant on M_{n×n}(F). If E is an n × n elementary matrix with entries from F and B ∈ M_{n×n}(F), then δ(EB) = δ(E)·δ(B).

Proof. Suppose that multiplication on the left by E interchanges two rows of B. Then δ(EB) = −δ(B) by Theorem 4.6(c). But δ(E) = −1 by the corollary to Theorem 4.6; so δ(EB) = δ(E)·δ(B). Similar proofs establish the result for elementary matrices corresponding to the multiplication of a row by a nonzero scalar or the addition of a scalar multiple of one row to another. ∎

Theorem 4.8. Let δ be a determinant on M_{n×n}(F), and let A and B be arbitrary elements of M_{n×n}(F). Then δ(AB) = δ(A)·δ(B).

Proof. If rank(A) < n, then rank(AB) ≤ rank(A) < n. Hence by Theorem 4.7 we have δ(AB) = 0 = δ(A)·δ(B) in this case.

If rank(A) = n, then A is invertible and hence is the product of elementary matrices (Corollary 3 of Theorem 3.6), say A = Em ··· E1, where each Ei is an elementary matrix. By the lemma we have

δ(AB) = δ(Em ··· E1·B) = δ(Em)·δ(E(m−1) ··· E1·B) = ··· = δ(Em) ··· δ(E1)·δ(B) = δ(Em ··· E1)·δ(B) = δ(A)·δ(B). ∎

Corollary 1. Let δ be a determinant on M_{n×n}(F), and let A ∈ M_{n×n}(F) be invertible. Then δ(A) ≠ 0, and δ(A^{−1}) = [δ(A)]^{−1}.

Proof. Since A is invertible,

δ(A)·δ(A^{−1}) = δ(A·A^{−1}) = δ(I) = 1.

So δ(A) ≠ 0, and δ(A^{−1}) = [δ(A)]^{−1}. ∎

Corollary 2. Let δ be a determinant on M_{n×n}(F), and let A ∈ M_{n×n}(F). Then the following conditions are equivalent:
(a) δ(A) = 0.
(b) A is not invertible.
(c) rank(A) < n.

Proof. Corollary 1 shows that condition (a) implies condition (b), for if δ(A) = 0, then A is not invertible. That condition (b) implies condition (c) follows from a remark in Section 3.2. Finally, Theorem 4.7 shows that condition (c) implies condition (a). ∎
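Theorem 4.8 is easy to spot-check numerically. The sketch below (Python; helper names ours) multiplies two integer matrices and compares det(AB) with det(A)·det(B):

```python
# Spot-check of Theorem 4.8: det(AB) = det(A) * det(B).

def minor(A, i, j):
    return [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Cofactor expansion along the first column."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** i * A[i][0] * det(minor(A, i, 0)) for i in range(len(A)))

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[2, 1, 0], [1, 3, 1], [0, 1, 2]]
B = [[1, 0, 2], [0, 1, 1], [3, 1, 0]]
assert det(matmul(A, B)) == det(A) * det(B)
```

Here det(A) = 8 and det(B) = −7, so both sides equal −56.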
It was proved in Theorems 4.1 and 4.2 that there is exactly one determinant on M_{2×2}(F). We are now able to prove a similar result for M_{n×n}(F).

Theorem 4.9. There is exactly one determinant on M_{n×n}(F).

Proof. The existence of a determinant on M_{n×n}(F) was proved in Corollary 2 of Proposition 4.5. We will complete the proof by showing that if δ1 and δ2 are both determinants on M_{n×n}(F), then δ1 = δ2. Let A be an arbitrary n × n matrix with entries from F. If rank(A) < n, then δ1(A) = 0 = δ2(A) by Theorem 4.7. If rank(A) = n, then A is invertible and hence is the product of elementary matrices (Corollary 3 of Theorem 3.6), say A = Em ··· E1, where each Ei is an elementary matrix. Since δ1(Ei) = δ2(Ei) for each i (1 ≤ i ≤ m) by the corollary to Theorem 4.6, we have

δ1(A) = δ1(Em) ··· δ1(E1) = δ2(Em) ··· δ2(E1) = δ2(A)

by Theorem 4.8. Hence δ1 = δ2. ∎

Henceforth the unique determinant on M_{n×n}(F) will be denoted by det.

Corollary. For any j (1 ≤ j ≤ n),

det(A) = Σ_{i=1}^{n} (−1)^{i+j} A_{ij}·det(Ã_{ij}),

where Ã_{ij} is the (n − 1) × (n − 1) matrix obtained from A by deleting row i and column j.
Evaluating Determinants

The preceding corollary shows that the determinant of an n × n matrix can be evaluated by expanding along any column; if n > 2, the resulting expansion involves determinants of matrices of size (n − 1) × (n − 1), each of which can be expanded along any column, and this process can be continued until the expansion involves only determinants of 2 × 2 matrices, which can then be evaluated by the rule

det(A) = A11·A22 − A12·A21.

Observe, however, that the evaluation of det(Ã_{ij}) can be avoided whenever A_{ij} = 0, for the product A_{ij}·det(Ã_{ij}) is zero regardless of the value of the determinant. Therefore it is beneficial to expand along a column containing as many zero entries as possible. The next two examples illustrate this procedure.
Example 1

Let A denote the following element of M_{4×4}(F):

A = ( 1 1 0 1
      1 0 1 1
      0 0 1 1
      1 1 1 1 ).

To minimize the computation required to evaluate det(A), we expand along the second column:

det(A) = Σ_{i=1}^{4} (−1)^{i+2} A_{i2}·det(Ã_{i2})
       = (−1)^{1+2}·1·det( 1 1 1; 0 1 1; 1 1 1 ) + (−1)^{2+2}·0·det(Ã_{22})
         + (−1)^{3+2}·0·det(Ã_{32}) + (−1)^{4+2}·1·det( 1 0 1; 1 1 1; 0 1 1 ).

(The determinant of the first of the four 3 × 3 matrices is zero because that matrix has two identical rows.) We now evaluate the remaining 3 × 3 determinant by expanding along the first column to obtain

det( 1 0 1; 1 1 1; 0 1 1 ) = (−1)^{1+1}·1·det( 1 1; 1 1 ) + (−1)^{2+1}·1·det( 0 1; 1 1 ) + (−1)^{3+1}·0·det( 0 1; 1 1 )
                           = 1·1·0 + (−1)·1·(−1) + 0 = 1,

so that det(A) = (−1)·1·0 + 0 + 0 + 1·1·1 = 1.

For matrices with more rows and columns the savings are greater. For instance, expanding a 5 × 5 matrix B along columns chosen at each stage to contain as many zero entries as possible can reduce the evaluation of det(B) to a single 2 × 2 determinant after only three expansions.
As these examples suggest, the process of evaluating a determinant by expanding along a column is quite tedious, even when there are zero entries present. Without zero entries, this process is inefficient. Instead, we can utilize property (e) of Theorem 4.6 to change a matrix A into a matrix B having the same determinant as A but having zero entries in one or more columns. This is essentially the same process that is used to reduce a matrix to row echelon form. Examples of this technique follow.

Example 2

Let A denote an element of M_{4×4}(R). By adding suitable multiples of one row to another (which, by Theorem 4.6(e), leaves the determinant unchanged), we may introduce zero entries into the first column below the first row, expand along that column, and then repeat the procedure on the resulting 3 × 3 determinant until only a 2 × 2 determinant remains. Here the computation ends with

det(A) = −7[(−2)(−1) − 6·4] = −7(−22) = 154.

Example 3

Let A denote an element of M_{4×4}(R). This time the determinant will be evaluated by the use of Theorem 4.6(e) so that A is transformed into an upper triangular matrix having the same determinant as A: we successively introduce zero entries below the diagonal, one column at a time. The determinant of the upper triangular matrix can then be evaluated by successive expansions along the first column.
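The technique of Example 3 is exactly what a program would do. Below is a sketch in Python (names ours; exact arithmetic via Fraction is an implementation choice): reduce A to upper triangular form using only type (e) operations, which leave the determinant unchanged, plus row interchanges, each of which negates it, and then multiply the diagonal entries.

```python
from fractions import Fraction

def det_by_reduction(A):
    """Evaluate det(A) by reduction to upper triangular form."""
    A = [[Fraction(x) for x in row] for row in A]
    n, sign = len(A), 1
    for k in range(n):
        # find a row at or below row k with a nonzero entry in column k
        pivot = next((r for r in range(k, n) if A[r][k] != 0), None)
        if pivot is None:
            return Fraction(0)              # rank < n, so det = 0 (Theorem 4.7)
        if pivot != k:
            A[k], A[pivot] = A[pivot], A[k]
            sign = -sign                    # Theorem 4.6(c)
        for r in range(k + 1, n):
            m = A[r][k] / A[k][k]
            A[r] = [a - m * b for a, b in zip(A[r], A[k])]  # Theorem 4.6(e)
    prod = Fraction(sign)
    for k in range(n):
        prod *= A[k][k]                     # product of diagonal entries
    return prod

assert det_by_reduction([[1, 2], [5, 3]]) == -7
```

This requires on the order of n³ arithmetic operations, far fewer than the roughly n! operations of full cofactor expansion.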
Until now, the roles played by the rows and columns of a matrix in the study of determinants have been quite different: a determinant was defined as a function on M_{n×n}(F) that satisfies certain properties involving the rows of a matrix, whereas the evaluation of a determinant is accomplished by expanding along the columns of a matrix. These roles are interchangeable, and we now verify this fact by showing that the determinants of A and A^t are equal. (Since the rows of A are the columns of A^t, and vice versa, this result will be sufficient to prove that the roles of rows and columns are interchangeable.)

Theorem 4.10. For any n × n matrix A, det(A^t) = det(A).

Proof. If rank(A) < n, then A is not invertible; so A^t is not invertible, and rank(A^t) < n. So det(A) = 0 = det(A^t) in this case.

If rank(A) = n, then A is invertible and hence is the product of elementary matrices (Corollary 3 of Theorem 3.6), say A = Em ··· E1, where E1,..., Em are elementary matrices. Since det(Ei^t) = det(Ei) for each i (see Exercise 5), it follows from Theorem 4.8 that

det(A^t) = det(E1^t ··· Em^t) = det(E1^t) ··· det(Em^t) = det(E1) ··· det(Em) = det(Em ··· E1) = det(A). ∎

Corollary. Any statement about determinants that involves the rows of a matrix can be restated in terms of the columns of the matrix, and any statement about determinants that involves the columns of a matrix can be restated in terms of the rows of the matrix. In particular, if A is an n × n matrix, then

det(A) = Σ_{j=1}^{n} (−1)^{i+j} A_{ij}·det(Ã_{ij}),

where Ã_{ij} is the (n − 1) × (n − 1) matrix obtained from A by deleting row i and column j.

Example 4

Let A denote an element of M_{4×4}(R). In this case the computation required to evaluate det(A) can be minimized by expanding along the third row. Expanding there and then reducing the resulting 3 × 3 determinant gives

det(A) = −5[(−15)(2) − (10)(7)] = −5(−100) = 500.
Our final result allows us to evaluate very easily the determinant of an upper triangular matrix. This result makes the technique used in Example 3 an efficient method for evaluating determinants.

Theorem 4.11. The determinant of an upper triangular square matrix is the product of its diagonal entries.

Proof. The proof is by induction on n. If n = 2, then an upper triangular matrix A has the form

A = ( A11 A12
      0   A22 ),

so det(A) = A11·A22, proving the theorem if n = 2. Assume that the theorem is true for upper triangular (n − 1) × (n − 1) matrices, and let A be an upper triangular n × n matrix. Then A has the form

( A11 A12 ··· A1(n−1) A1n
  0   A22 ··· A2(n−1) A2n
  ⋮
  0   0   ···   0     Ann ).

Expanding along the first column, we see that

det(A) = A11·det( A22 ··· A2n
                  ⋮
                  0  ···  Ann ) = A11·(A22 ··· Ann)

by the induction hypothesis. This completes the proof. ∎
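Theorem 4.11 can be confirmed on a small example (Python sketch; helper names ours): cofactor expansion along the first column of an upper triangular matrix collapses, step by step, to the product of the diagonal entries.

```python
# Check of Theorem 4.11 on a 3x3 upper triangular matrix.

def minor(A, i, j):
    return [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Cofactor expansion along the first column."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** i * A[i][0] * det(minor(A, i, 0)) for i in range(len(A)))

U = [[2, 7, -1], [0, 3, 5], [0, 0, -4]]
assert det(U) == 2 * 3 * (-4) == -24
```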
Corollary. If A ∈ M_{n×n}(F) is lower triangular (that is, A_{ij} = 0 whenever i < j), then the determinant of A is the product of its diagonal entries.

Proof. Exercise. ∎

As in Section 4.1, it is possible to interpret determinants geometrically. We can interpret the determinant of an element of M_{n×n}(R),

det( A(1)
     ⋮
     A(n) ),

as the n-dimensional volume (the generalization of area in R² and volume in R³) of the parallelepiped having the vectors A(1),..., A(n) as adjacent sides. (For a proof of this result, see Serge Lang, Analysis I, Addison-Wesley, Reading, Mass., 1968, pp. 413-418.)

In our earlier discussion of the geometric significance of the determinant formed from the vectors in an ordered basis for R², we saw that this determinant is positive if and only if the ordered basis induces a right-handed coordinate system. A similar statement is true in R^n: an ordered basis for R^n induces a right-handed coordinate system if and only if the corresponding determinant is positive. For instance, one ordered basis for R³ induces a left-handed coordinate system, since its determinant is −2 < 0, whereas another induces a right-handed coordinate system in R³, since its determinant is positive.

More generally, any two ordered bases β and γ for R^n induce coordinate systems having the same orientation (both right-handed or both left-handed) if and only if det(Q) > 0, where Q is the change of coordinate matrix changing γ-coordinates into β-coordinates.
Cramer's Rule

The next result shows that the solution of a system of equations with an invertible coefficient matrix can be expressed in terms of determinants.

Theorem 4.12 (Cramer's Rule). Let AX = B be the matrix form of a system of n linear equations in n unknowns, where X = (x1, x2,..., xn)^t and B = (b1, b2,..., bn)^t. If det(A) ≠ 0, then this system has a unique solution, and for each k (1 ≤ k ≤ n)

xk = [det(A)]^{−1}·det(Mk),

where Mk is the n × n matrix obtained from A by replacing its kth column by B.

Proof. If det(A) ≠ 0, then this system has a unique solution by Corollary 2 of Theorem 4.8 and Theorem 3.10. Let k be an integer such that 1 ≤ k ≤ n, and define Xk to be the matrix obtained from the n × n identity matrix by replacing its kth column by X. Expanding along the kth row produces

det(Xk) = xk·det(I(n−1)) = xk.

Now by Theorem 2.14 we have

A·Xk = A·(e1,..., e(k−1), X, e(k+1),..., en)
     = (Ae1,..., Ae(k−1), AX, Ae(k+1),..., Aen)
     = (A^{(1)},..., A^{(k−1)}, B, A^{(k+1)},..., A^{(n)}) = Mk.

So, by Theorem 4.8,

det(Mk) = det(A·Xk) = det(A)·det(Xk) = det(A)·xk.

Hence xk = [det(A)]^{−1}·det(Mk). ∎

Example 5

We will use Cramer's rule to solve a matrix equation AX = B in three unknowns. It is easily checked that det(A) = 6, so that Cramer's rule applies. Computing the three determinants det(M1), det(M2), and det(M3) and dividing each by det(A) = 6 gives the unique solution of the given system,

(x1, x2, x3) = (2, −1, 2).

In applications involving systems of linear equations, we sometimes need to know that the solution consists of integers. Cramer's rule is useful in this situation because it shows that a system of n linear equations in n unknowns with integral coefficients has an integral solution if det(A) = ±1. On the other hand, Cramer's rule is not useful for computation because the solution of a system of n linear equations in n unknowns requires evaluating n + 1 determinants of n × n matrices, and hence it is much less efficient than the method of Gaussian elimination discussed in Section 3.4. Thus our interest in Cramer's rule is theoretical and aesthetic, rather than computational.
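Theorem 4.12 translates directly into code. The sketch below (Python; function names ours, exact arithmetic via Fraction assumed) forms each Mk by replacing the kth column of A by B and divides determinants:

```python
# Cramer's rule (Theorem 4.12): x_k = det(M_k) / det(A), where M_k is A
# with its kth column replaced by B. Requires det(A) != 0.

from fractions import Fraction

def minor(A, i, j):
    return [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]

def det(A):
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** i * A[i][0] * det(minor(A, i, 0)) for i in range(len(A)))

def cramer(A, b):
    d = det(A)
    if d == 0:
        raise ValueError("Cramer's rule requires det(A) != 0")
    xs = []
    for k in range(len(A)):
        M_k = [row[:k] + [b[i]] + row[k+1:] for i, row in enumerate(A)]
        xs.append(Fraction(det(M_k), d))
    return xs

# the system 2*x1 + x2 = 5, x1 + 3*x2 = 5
assert cramer([[2, 1], [1, 3]], [5, 5]) == [2, 1]
```

As the text notes, this costs n + 1 determinant evaluations and is far less efficient than Gaussian elimination; its value is theoretical.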
EXERCISES

1. Label the following statements as being true or false.
(a) If two rows of A are identical, then det(A) = 0.
(b) If B is a matrix obtained from A by interchanging two rows, then det(B) = −det(A).
(c) If B is a matrix obtained from A by multiplying a row of A by a scalar c, then det(B) = c·det(A).
(d) If B is a matrix obtained from A by adding a scalar multiple of row i to row j (i ≠ j), then det(B) = det(A).
(e) If E is an elementary matrix, then det(E) = ±1.
(f) If A, B ∈ M_{n×n}(F), then det(AB) = det(A)·det(B).
(g) A matrix M ∈ M_{n×n}(F) is invertible if and only if det(M) = 0.
(h) A matrix M ∈ M_{n×n}(F) has rank n if and only if det(M) ≠ 0.
(i) The determinant of a matrix may be evaluated by expanding along any row.
(j) For any A ∈ M_{n×n}(F), det(A^t) = −det(A).
(k) The determinant of a diagonal matrix is the product of its diagonal entries.
(l) Let AX = B be the matrix form of a system of n linear equations in n unknowns, where X = (x1, x2,..., xn)^t. If det(A) ≠ 0 and if Mk is the matrix obtained from A by replacing the kth row of A by B^t, then the unique solution of the system is given by xk = [det(A)]^{−1}·det(Mk) for each k (1 ≤ k ≤ n).
(m) Every system of n linear equations in n unknowns can be solved by Cramer's rule.

2. Evaluate each determinant in the manner indicated.
(a) Expand along the second column.
(b) Expand along the first row.
(c) Expand along the third column.
(d) Expand along the fourth row.

3. Evaluate the determinants of the matrices below by any legitimate method. In each case C is the field of scalars.
(a) (b) (c) (d) (e) (f) (g)

4. Prove that an upper or lower triangular n × n matrix is invertible if and only if all its diagonal entries are nonzero.

5. Prove that if E is an elementary matrix, then E^t is an elementary matrix of the same type as E.

6. Complete the proof of Theorem 4.10 by proving that det(E^t) = det(E) for any elementary matrix E.

7. (a) A matrix B ∈ M_{n×n}(R) is called orthogonal if BB^t = I. Prove that if B is orthogonal, then det(B) = ±1.
(b) A matrix B ∈ M_{n×n}(C) is called unitary if BB* = I, where (B*)_{ij} is the complex conjugate of B_{ji}. Prove that if B is unitary, then |det(B)| = 1.

8. A matrix B in M_{n×n}(C) is called skew-symmetric if B^t = −B. Prove that if B ∈ M_{n×n}(C) is skew-symmetric and n is odd, then det(B) = 0.

9.† Complete the proof of the lemma to Theorem 4.8.

10. Let β = {x1,..., xn} be a subset of F^n containing n distinct vectors, and let B denote the element of M_{n×n}(F) whose jth column is the vector xj. Prove that β is a basis for F^n if and only if det(B) ≠ 0.

11. Suppose that A ∈ M_{n×n}(F) can be written in the form

A = ( B1 B2
      O  B3 ),

where B1 and B3 are square matrices. Prove that det(A) = det(B1)·det(B3).

12. Prove that det(cA) = c^n·det(A) for any scalar c and any A ∈ M_{n×n}(F).

13. Let T: Pn(F) → F^{n+1} be the linear transformation defined in Section 2.4 by T(f) = (f(c0),..., f(cn)), where c0, c1,..., cn are distinct elements of an infinite field F. Let β be the standard ordered basis for Pn(F) and γ be the standard ordered basis for F^{n+1}.
(a) Show that M = [T]_β^γ has the form

( 1 c0 c0² ··· c0^n
  1 c1 c1² ··· c1^n
  ⋮
  1 cn cn² ··· cn^n ).

A matrix with this form is called a Vandermonde matrix.
(b) Prove that det(M) ≠ 0 by using Exercise 20 of Section 2.4.
(c) Prove that det(M) is the product of all terms of the form cj − ci for 0 ≤ i < j ≤ n.

14. Use the results of this section to prove Exercise 8 of Section 2.4: If A and B are n × n matrices such that AB = I_n, then A and B are invertible (and hence B = A^{−1}).

15. Prove that if A and B are similar matrices, then det(A) = det(B).

16. Solve the following systems of linear equations by Cramer's rule.
(a) a11·x1 + a12·x2 = b1
    a21·x1 + a22·x2 = b2, where a11·a22 − a12·a21 ≠ 0
(b)-(f) ask for the solutions of various systems of three linear equations in three unknowns.

17. Let A ∈ M_{n×n}(F), and let cjk denote the cofactor of Ajk in A.
(a) Prove that cjk = det(A^{(1)},..., A^{(k−1)}, ej, A^{(k+1)},..., A^{(n)}), the determinant of the matrix obtained from A by replacing column k of A by ej.
(b) Show that A·(cj1, cj2,..., cjn)^t = det(A)·ej for 1 ≤ j ≤ n. Hint: Apply Cramer's rule to AX = ej.
(c) Deduce that AC = det(A)·I_n, where C is the n × n matrix such that Cij = cji.
(d) Show that if det(A) ≠ 0, then A^{−1} = [det(A)]^{−1}·C.
Sec. 4.4 Summary-Important Facts about Determinants 205

Definition. The matrix C defined in part (c) of Exercise 17 is called the classical adjoint of A.

18. Find the classical adjoint of each of the following matrices.
(a) ( A11 A12
      A21 A22 )
(b) (c) (d) (e) (f) (g) (h)

19. Let C be the classical adjoint of A ∈ M_{n×n}(F). Prove the following.
(a) det(C) = [det(A)]^{n−1}.
(b) C^t is the classical adjoint of A^t.
(c) If A is an invertible upper triangular matrix, then C and A^{−1} are both upper triangular.

20. Let y1, y2,..., yn be linearly independent functions in C^∞. Let T: C^∞ → C^∞ be the linear transformation defined by

T(y) = det( y       y1       y2      ···  yn
            y'      y1'      y2'     ···  yn'
            ⋮
            y^{(n)} y1^{(n)} y2^{(n)} ··· yn^{(n)} ).

Prove that N(T) ⊇ span({y1, y2,..., yn}).

4.4 SUMMARY-IMPORTANT FACTS ABOUT DETERMINANTS
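The classical adjoint of Exercise 17 is straightforward to compute. The sketch below (Python; names ours) builds C with Cij equal to the cofactor of Aji and verifies the identity AC = det(A)·I from part (c):

```python
# Classical adjoint: C[i][j] = cofactor of A[j][i]; then A*C = det(A)*I.

def minor(A, i, j):
    return [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]

def det(A):
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** i * A[i][0] * det(minor(A, i, 0)) for i in range(len(A)))

def classical_adjoint(A):
    n = len(A)
    return [[(-1) ** (i + j) * det(minor(A, j, i)) for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[2, 1, 0], [1, 3, 1], [0, 1, 2]]
C = classical_adjoint(A)
d = det(A)
assert matmul(A, C) == [[d, 0, 0], [0, d, 0], [0, 0, d]]
```

When det(A) ≠ 0, dividing C by det(A) yields A^{−1}, as in part (d) of the exercise.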
In this section

we

the

summarize

needed for the remainderof

been derivedin Sections


stated without proofs.

4.2

the

and

text.

4.3;

important
The

results

consequently

properties

contained
the facts

of the

determinant

have
presented here are

in this section

206

element
1. If

2.

If

x n matrix A having entries

of an n

determinant

The

det(A),whichcan

of F denoted

If

is

1 x

1, then

is

2 x

2, then

is

\342\200\224

3)

+ 1 times
deleting
precise

> 2, then the determinantof

the

In

(n

(n

evaluated

question.

i of

A)

or

scalar

j of A\\

of column

entries

the

obtained

the

above

formulas

the

in

by
The

(-iy+My'det(I\302\243i)

matrix

of row

entries

the

by

by

1)

obtained

matrix

entry

by

multiplied

t(-iy+Ml7'det(IlV)

is evaluated

\342\200\224

\342\200\224\342\200\2361)

1)

(n

as the

expressed

of A

\342\200\224

the

det(^)=

(if the determinant

1)

(n

and column containing

det(,4)=

(if the determinantis

\342\200\224

be

can

column

or

row

some

the:determinantof an

from A the row


formula is

example,

= (-^(3)-(2)(5)=-11

of each entry of

sum of products

manner:

following

the single entry of A.


for
Thus,
det(A) = AtlA22
A12A21.

n for n

n x

the

in

computed

is an

field

from

det{A) = Alu

det(_5

3.

be

Determinants

Chap.

row i and

by deleting

is
(\342\200\2241)'+Jdet(y4y)

Ay

is

column j from

the

called

where

cofactor

of

evaluated
as the sum
language the determinantof
of products of each entry ofsome
of A multiplied
or
column
by the cofactor
of n determinants
in
of that entry. Thus
terms
of
expressed
~
x
These determinants are then evaluatedin
1) matrices.
(n
(n
~
of (n
and
so forth, until 2x2
matrices are
determinants
2) x (n 2) matrices,
obtained.Thedeterminantsof 2 x 2 matrices are then evaluated as in item 2

the

entry

Ay. In this

is

row

is

det(A)

\342\200\224

terms

1)

of

\342\200\224

the

above.
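As a brief illustration (a sketch added here, not part of the original text), the recursion described in items 1-3 can be written in Python; the function below expands along the first row.

```python
# Illustrative sketch of items 1-3: determinant by cofactor expansion
# along the first row. A matrix is given as a list of rows.

def det(a):
    n = len(a)
    if n == 1:                       # item 1: a 1 x 1 matrix
        return a[0][0]
    if n == 2:                       # item 2: a 2 x 2 matrix
        return a[0][0] * a[1][1] - a[0][1] * a[1][0]
    total = 0
    for j in range(n):               # item 3: each entry times its cofactor
        a_tilde = [row[:j] + row[j + 1:] for row in a[1:]]  # delete row 1, column j
        total += (-1) ** j * a[0][j] * det(a_tilde)
    return total

print(det([[-2, 1], [5, 3]]))        # -11
```

The recursion terminates because each call reduces the size of the matrix by one, exactly as described in item 3.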
Let us consider several examples of this technique. First we will evaluate the determinant of the 4 x 4 matrix

    A = ( 2  1  1  5 )
        ( 1  1 -4 -1 )
        ( 2  0 -3  1 )
        ( 3  6  1  2 )

by expanding along the fourth row. This requires knowing the cofactors of each entry of that row. The cofactor of A41 = 3 is

    (-1)^{4+1} det( 1  1  5 )
                  ( 1 -4 -1 )
                  ( 0 -3  1 )

Let us evaluate the determinant above by expanding along the first column. Then

    det( 1  1  5 ) = (-1)^{1+1}(1) det( -4 -1 ) + (-1)^{2+1}(1) det(  1  5 ) + (-1)^{3+1}(0) det(  1  5 )
       ( 1 -4 -1 )                    ( -3  1 )                    ( -3  1 )                    ( -4 -1 )
       ( 0 -3  1 )

                  = 1(1)[(-4)(1) - (-1)(-3)] + (-1)(1)[(1)(1) - (5)(-3)] + 0
                  = -7 - 16 + 0 = -23.

Thus the cofactor of A41 is (-1)^5(-23) = 23.

Similarly, expanding the cofactor of A42 = 6 along the second row gives

    det( 2  1  5 ) = (-1)^{2+1}(1) det(  1  5 ) + (-1)^{2+2}(-4) det( 2  5 ) + (-1)^{2+3}(-1) det( 2  1 )
       ( 1 -4 -1 )                    ( -3  1 )                     ( 2  1 )                     ( 2 -3 )
       ( 2 -3  1 )

                  = (-1)(1)[(1)(1) - (5)(-3)] + (1)(-4)[(2)(1) - (5)(2)] + (-1)(-1)[(2)(-3) - (1)(2)]
                  = -16 + 32 - 8 = 8.

So the cofactor of A42 is (-1)^6(8) = 8.

The cofactor of A43 = 1 is

    (-1)^{4+3} det( 2  1  5 )
                  ( 1  1 -1 )
                  ( 2  0  1 )

Computing this determinant by expanding along the third row, we find

    det( 2  1  5 ) = (-1)^{3+1}(2) det( 1  5 ) + 0 + (-1)^{3+3}(1) det( 2  1 )
       ( 1  1 -1 )                    ( 1 -1 )                       ( 1  1 )
       ( 2  0  1 )

                  = 1(2)[(1)(-1) - (5)(1)] + 0 + 1(1)[(2)(1) - (1)(1)] = -12 + 0 + 1 = -11.

Hence the cofactor of A43 is (-1)^7(-11) = 11.

Finally, the cofactor of A44 = 2 is

    (-1)^{4+4} det( 2  1  1 )
                  ( 1  1 -4 )
                  ( 2  0 -3 )

Evaluating this determinant by expanding along the second column, we obtain

    det( 2  1  1 ) = (-1)^{1+2}(1) det( 1 -4 ) + (-1)^{2+2}(1) det( 2  1 ) + 0
       ( 1  1 -4 )                    ( 2 -3 )                    ( 2 -3 )
       ( 2  0 -3 )

                  = (-1)(1)[(1)(-3) - (-4)(2)] + 1(1)[(2)(-3) - (1)(2)] + 0
                  = -5 - 8 + 0 = -13.

Therefore, the cofactor of A44 is (-1)^8(-13) = -13.

We can now evaluate the determinant of A by multiplying each entry of the fourth row by its cofactor; this gives

    det(A) = 3(23) + 6(8) + 1(11) + 2(-13) = 69 + 48 + 11 - 26 = 102.

For the sake of comparison, we will also compute the determinant of A by expanding along the second column. The reader should verify that the cofactors of A12, A22, and A42 are 14, 40, and 8, respectively. Thus

    det(A) = (1)(14) + (1)(40) + (0)(cofactor of A32) + (6)(8) = 14 + 40 + 0 + 48 = 102.

Of course, the fact that the value 102 is obtained again is no surprise, since the determinant of A is independent of the choice of row or column used in the expansion.

Observe that the computation of det(A) is easier when expanded along the second column than when expanded along the fourth row. The difference is the presence of a zero in the second column, which made it unnecessary to evaluate one of the cofactors (the cofactor of A32). For this reason it is beneficial to evaluate the determinant of a matrix by expanding along a row or column of the matrix that contains the largest number of zero entries. In fact, it is often helpful to introduce zeros into a row or column of the matrix by means of elementary row operations before computing the determinant. This technique utilizes the first three properties of the determinant.
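The two expansions above can be machine-checked with a short script (an illustrative sketch, not part of the text): both the fourth-row and the second-column expansions of A give 102.

```python
# Check that expanding along the fourth row and along the second column
# of the 4 x 4 matrix A above give the same value, 102.

def det(a):
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] *
               det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def cofactor(a, i, j):
    a_tilde = [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]
    return (-1) ** (i + j) * det(a_tilde)

A = [[2, 1, 1, 5],
     [1, 1, -4, -1],
     [2, 0, -3, 1],
     [3, 6, 1, 2]]

along_row_4 = sum(A[3][j] * cofactor(A, 3, j) for j in range(4))
along_col_2 = sum(A[i][1] * cofactor(A, i, 1) for i in range(4))
print(along_row_4, along_col_2)   # 102 102
```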

Properties of the Determinant

1. If B is a matrix obtained by interchanging two rows or two columns of A, then det(B) = -det(A).
2. If B is a matrix obtained by multiplying each entry of some row or column of A by a scalar c, then det(B) = c det(A).
3. If B is a matrix obtained from A by adding a multiple of row i to row j or a multiple of column i to column j, where i ≠ j, then det(B) = det(A).

To illustrate the use of these three properties in evaluating determinants, we will compute the determinant of the 4 x 4 matrix A considered previously. Our procedure will be to introduce zeros into the second column of A by employing property 3 and then to expand along that column. (The elementary row operations used here consist of adding multiples of row 1 to rows 2 and 4.) This procedure yields

    det(A) = det(  2  1  1   5 ) = (-1)^{1+2}(1) det( -1 -5  -6 ) = (-1) det( -1 -5  -6 )
                ( -1  0 -5  -6 )                    (  2 -3   1 )           (  2 -3   1 )
                (  2  0 -3   1 )                    ( -9 -5 -28 )           ( -9 -5 -28 )
                ( -9  0 -5 -28 )

The resulting determinant of a 3 x 3 matrix can be evaluated in the same manner: Use type 3 elementary row operations to introduce two zeros into the first column, and then expand along that column. Continuing from above, we have

    det(A) = (-1) det( -1  -5  -6 ) = (-1)(-1)^{1+1}(-1) det( -13 -11 )
                     (  0 -13 -11 )                         (  40  26 )
                     (  0  40  26 )

           = (-13)(26) - (-11)(40) = -338 + 440 = 102.

The reader should compare this calculation of det(A) with the preceding ones to see how much less work is required when properties 1, 2, and 3 are employed.
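The three properties, and the elimination technique they support, can be spot-checked in Python (an illustrative sketch; the 3 x 3 matrix is the one produced in the computation above):

```python
from fractions import Fraction

def det(a):
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] *
               det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

M = [[-1, -5, -6], [2, -3, 1], [-9, -5, -28]]
d = det(M)                                        # -102

swapped = [M[1], M[0], M[2]]                      # property 1: interchange two rows
scaled = [[3 * x for x in M[0]], M[1], M[2]]      # property 2 with c = 3
added = [M[0], [x + 2 * y for x, y in zip(M[1], M[0])], M[2]]  # property 3

print(det(swapped) == -d, det(scaled) == 3 * d, det(added) == d)

def det_by_elimination(a):
    """Reduce to upper triangular form (properties 1 and 3), then take the
    signed product of the diagonal entries."""
    m = [[Fraction(x) for x in row] for row in a]
    n, sign = len(m), 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]   # property 1: the sign changes
            sign = -sign
        for r in range(col + 1, n):               # property 3: det unchanged
            f = m[r][col] / m[col][col]
            m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    result = Fraction(sign)
    for i in range(n):
        result *= m[i][i]
    return result

A = [[2, 1, 1, 5], [1, 1, -4, -1], [2, 0, -3, 1], [3, 6, 1, 2]]
print(det_by_elimination(A))                      # 102
```

Exact rational arithmetic (`Fraction`) is used so that the pivoting steps introduce no rounding error.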


In the following chapters we will often have to evaluate the determinant of matrices having special forms. The next three properties of the determinant are frequently useful in this regard.

4. The determinant of the identity matrix is 1: det(I) = 1.
5. If two rows (or columns) of a matrix are identical, then the determinant of the matrix is zero.
6. The determinant of an upper triangular matrix is the product of its diagonal entries.

As an illustration of property 6, notice that the determinant of an upper triangular matrix having the diagonal entries -3, 4, and -6 is (-3)(4)(-6) = 72.

Property 6 provides an efficient method for evaluating the determinant of a matrix:

1. Use Gaussian elimination and properties 1, 2, and 3 above to reduce the matrix to an upper triangular matrix.
2. Compute the product of the diagonal entries.

For instance, a 4 x 4 matrix that reduces in this way to an upper triangular matrix with diagonal entries 1, 1, 3, and 6 has determinant 1 * 1 * 3 * 6 = 18.
The remaining four properties of the determinant are perhaps the most significant; they are frequently used in later chapters. Indeed, the last of them provides a simple characterization of invertible matrices (see property 10).

7. For any A ∈ M_{n x n}(F), det(A^t) = det(A).
8. For any A, B ∈ M_{n x n}(F), det(AB) = det(A) · det(B).
9. If Q is an invertible matrix, then det(Q^{-1}) = [det(Q)]^{-1}.
10. A matrix Q is invertible if and only if det(Q) ≠ 0.

Thus, for example, property 10 guarantees that the matrix A on page 206 is invertible, because det(A) = 102 ≠ 0. Therefore rank(A) = 4 by the remark on page 132.
EXERCISES

1. Label the following statements as being true or false.
(a) The determinant of a square matrix may be computed by expanding the matrix along any row or column.
(b) In evaluating the determinant of a matrix, it is wise to expand along a row or column containing the largest number of zero entries.
(c) If two rows or columns of A are identical, then det(A) = 0.
(d) If B is a matrix obtained by interchanging two rows or two columns of A, then det(B) = det(A).
(e) If B is a matrix obtained by multiplying each entry of some row or column of A by a scalar, then det(B) = det(A).
(f) If B is a matrix obtained from A by adding a multiple of some row to a different row (or a multiple of some column to a different column), then det(B) = -det(A).
(g) The determinant of an upper triangular n x n matrix is the product of its diagonal entries.
(h) For every A ∈ M_{n x n}(F), det(A^t) = -det(A).
(i) If A, B ∈ M_{n x n}(F), then det(AB) = det(A) · det(B).
(j) If Q is an invertible matrix, then det(Q^{-1}) = [det(Q)]^{-1}.
(k) A matrix Q is invertible if and only if det(Q) ≠ 0.
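The identities underlying several of the statements above can be spot-checked numerically; the sketch below uses arbitrarily chosen matrices (it does not answer the true/false items, which concern the statements exactly as written).

```python
def det(a):
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] *
               det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# arbitrary sample matrices, not taken from the exercises
A = [[1, 2, 0], [3, -1, 4], [2, 2, 1]]
B = [[0, 1, 1], [1, 0, 2], [3, 1, 1]]

A_t = [list(col) for col in zip(*A)]
print(det(A_t) == det(A))                      # det(A^t) = det(A)
print(det(matmul(A, B)) == det(A) * det(B))    # det(AB) = det(A) det(B)
print(det(A) != 0)                             # this A is invertible
```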

2. Evaluate the determinant of each of the following 2 x 2 matrices.

3. Evaluate the determinant of each of the following matrices in the manner indicated.
(a) Expand along the first row.
(b) Expand along the second column.
(c) Expand along the third row.
(d) Expand along the first column.
(e) Expand along the fourth column.

4. Evaluate the determinant of each of the following matrices by any legitimate method.

5.† Work Exercise 9 of Section 4.3.

6. Let A ∈ M_{n x n}(F), and consider the system of linear equations AX = B, where X is the column of unknowns and B ∈ F^n. For each k (1 ≤ k ≤ n), let X_k denote the matrix obtained from the n x n identity matrix by replacing its kth column by X, and let M_k denote the matrix obtained from A by replacing its kth column by B.
(a) Show that AX_k = M_k.
(b) Compute det(X_k).
(c) Prove Cramer's rule: If det(A) ≠ 0, then the solution of the system AX = B satisfies

    x_k = [det(A)]^{-1} · det(M_k)   for each k (1 ≤ k ≤ n).

INDEX OF DEFINITIONS FOR CHAPTER 4

Alternating n-linear function 174
Angle between two vectors 184
Classical adjoint 205
Cofactor 187
Cramer's rule 200
Determinant of a 2 x 2 matrix 171
Determinant on M_{n x n}(F) 186
Expansion along a column 187
Lower triangular matrix 198
n-linear function 182
Orientation 175
Orthogonal matrix 203
Parallelogram determined by two vectors 176
Unitary matrix 203
Vandermonde matrix 204
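Cramer's rule from Exercise 6(c) can be sketched in Python (an illustrative implementation; the 2 x 2 system solved below is a made-up sample, not one from the exercises).

```python
from fractions import Fraction

def det(a):
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] *
               det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def cramer(a, b):
    """Solve AX = B via x_k = det(M_k) / det(A), where M_k is A with its
    kth column replaced by B (Exercise 6(c)); requires det(A) != 0."""
    d = det(a)
    if d == 0:
        raise ValueError("Cramer's rule requires det(A) != 0")
    solution = []
    for k in range(len(a)):
        m_k = [row[:k] + [b[i]] + row[k + 1:] for i, row in enumerate(a)]
        solution.append(Fraction(det(m_k), d))
    return solution

print(cramer([[2, 1], [1, 3]], [3, 4]))   # x1 = 1, x2 = 1
```

Note that this is a derivation aid rather than a practical solver: for large n, Gaussian elimination is far cheaper than the n + 1 determinants Cramer's rule requires.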


5
Diagonalization

This chapter is concerned with the so-called "diagonalization problem." Given a linear transformation T: V -> V, where V is a finite-dimensional vector space, we seek answers to the following questions:

1. Does there exist an ordered basis β for V such that [T]_β is a diagonal matrix?
2. If such a basis exists, how can it be found?

Since computations involving diagonal matrices are simple, an affirmative answer to question 1 leads us to a clearer understanding of how the transformation T acts on V, and an answer to question 2 enables us to obtain easy solutions to many practical problems that can be formulated in a linear algebra context. We will consider some of these problems and their solutions in this chapter; see, for example, Section 5.3.

A solution of the diagonalization problem leads naturally to the concepts of "eigenvalue" and "eigenvector." Aside from the important role that these concepts play in the diagonalization problem, they will also prove to be useful tools in the study of many nondiagonalizable transformations, as we will see in Chapter 7.

5.1 EIGENVALUES AND EIGENVECTORS

Since the diagonalization problem involves the study of a linear transformation that maps a vector space into itself, it is useful to name such a transformation. Accordingly, we call a linear transformation T: V -> V on a vector space V a linear operator on V.

For a given linear operator T on a finite-dimensional vector space V, we are concerned with the matrices that represent T relative to various ordered bases for V. Throughout this chapter we usually omit the adjective "ordered" from the expression "ordered basis."

Consider a linear operator T on a finite-dimensional vector space V and any two bases β and β' for V. Recall from Theorem 2.24 that the matrices [T]_β and [T]_{β'} are related by [T]_{β'} = Q^{-1}[T]_β Q, where Q is the change of coordinate matrix changing β'-coordinates into β-coordinates. In Section 2.5 we defined matrices related in this way to be similar. A useful special case of this relationship is proved in the following theorem.

Theorem 5.1. Let A ∈ M_{n x n}(F), let β be the standard basis for F^n, and let γ = {x1, x2, ..., xn} be any basis for F^n. If Q is the n x n matrix whose jth column is xj (j = 1, 2, ..., n), then

    [L_A]_γ = Q^{-1} A Q.

Proof. It is easily seen that Q is the change of coordinate matrix changing γ-coordinates into β-coordinates. Hence [L_A]_γ = Q^{-1}[L_A]_β Q = Q^{-1} A Q by Theorem 2.24.

Example 1
To illustrate Theorem 5.1, take a 2 x 2 matrix A and a basis γ = {x1, x2} for R^2, and let Q be the matrix whose columns are x1 and x2. It is a simple matter to check by direct computation that [L_A]_γ = Q^{-1} A Q.

As mentioned above, different bases may give different matrices that represent the same linear operator. We now establish the converse of this result: similar matrices represent the same linear operator relative to suitably chosen bases.

Theorem 5.2. Let T be a linear operator on an n-dimensional vector space V, and let β be a basis for V. If B is any n x n matrix similar to [T]_β, then there exists a basis β' for V such that B = [T]_{β'}.

Proof. If B is similar to [T]_β, then there exists an invertible matrix Q such that B = Q^{-1}[T]_β Q. Suppose that β = {x1, x2, ..., xn}, and define

    x'_j = Σ (i = 1 to n) Q_ij x_i   for 1 ≤ j ≤ n.

Then β' = {x'_1, x'_2, ..., x'_n} is a basis for V such that Q is the change of coordinate matrix changing β'-coordinates into β-coordinates (Exercise 12 of Section 2.5). Hence B = [T]_{β'} by Theorem 2.24.
The concept of similarity is useful in studying the diagonalization problem, since it can be used to reformulate the problem in the context of matrices. We now introduce the definitions of diagonalizability.

Definitions. A linear operator T on a finite-dimensional vector space is said to be diagonalizable if there exists a basis β for V such that [T]_β is a diagonal matrix. A square matrix A is said to be diagonalizable if A is similar to a diagonal matrix.

The following theorem relates these two concepts and leads to a reformulation of the diagonalization problem in the context of matrices.

Theorem 5.3. Let T be a linear operator on a finite-dimensional vector space V, and let β be a basis for V. Then T is diagonalizable if and only if [T]_β is a diagonalizable matrix.

Proof. If T is diagonalizable, then there exists a basis β' for V such that [T]_{β'} is a diagonal matrix. By Theorem 2.24, [T]_β and [T]_{β'} are similar. Therefore [T]_β is diagonalizable.

Now suppose that [T]_β is diagonalizable. Then [T]_β is similar to a diagonal matrix B. By Theorem 5.2, there exists a basis β' for V such that [T]_{β'} = B, a diagonal matrix. Therefore T is diagonalizable.

Corollary. A square matrix A is diagonalizable if and only if L_A is diagonalizable.

Because of Theorem 5.3 and its corollary, we can reformulate the diagonalization problem in the context of matrices as follows:

1. Is A diagonalizable?
2. If A is diagonalizable, how can an invertible matrix Q be determined so that Q^{-1}AQ is a diagonal matrix?

We now present the first of several results leading to a solution of the diagonalization problem.
Theorem 5.4. A linear operator T on a finite-dimensional vector space V is diagonalizable if and only if there exist a basis β = {x1, x2, ..., xn} for V and scalars λ1, ..., λn (not necessarily distinct) such that T(xj) = λj xj for 1 ≤ j ≤ n. Under these circumstances

    [T]_β = ( λ1  0  ...  0  )
            (  0  λ2 ...  0  )
            (  .   .       . )
            (  0   0 ...  λn )

Proof. Suppose that T is diagonalizable. Then there is a basis β = {x1, ..., xn} for V such that [T]_β = D is a diagonal matrix. For each j,

    T(xj) = Σ (i = 1 to n) D_ij x_i = D_jj x_j,

so that T(xj) = λj xj with λj = D_jj.

Conversely, suppose there exist a basis {x1, ..., xn} for V and scalars λ1, ..., λn such that T(xj) = λj xj. Then clearly [T]_β is the diagonal matrix displayed above.

Theorem 5.4 motivates the following definitions.

Definitions. Let T be a linear operator on a vector space V over a field F. A nonzero element x ∈ V is called an eigenvector of T if there exists a scalar λ such that T(x) = λx. The scalar λ is called the eigenvalue of T corresponding to the eigenvector x. Similarly, if A is an n x n matrix over a field F, a nonzero element x ∈ F^n is called an eigenvector of A if x is an eigenvector of L_A; that is, Ax = λx for some scalar λ. The scalar λ is called the eigenvalue of A corresponding to x.

The words characteristic vector and proper vector are often used in place of eigenvector. The corresponding terms for an eigenvalue are characteristic value and proper value.

In this terminology we see that in Theorem 5.4 the basis β consists of eigenvectors of T and the diagonal entries of [T]_β are eigenvalues of T. Thus Theorem 5.4 can be restated as follows: A linear operator T on a finite-dimensional vector space V is diagonalizable if and only if there exists a basis for V consisting of eigenvectors of T. Furthermore, if T is diagonalizable, if β = {x1, x2, ..., xn} is a basis of eigenvectors of T, and if D = [T]_β, then D is a diagonal matrix and D_ii is the eigenvalue corresponding to x_i (i = 1, 2, ..., n).

Before continuing our examination of the diagonalization problem, we consider two examples involving eigenvectors and eigenvalues.
eigenvalues.

Example

Let

the

of

examination

our

to X;

corresponding

orders.

(Thus
the

functions,

subspace

exponential
vector

\"the

of

space

1.2, -Define T.

Section

all functions f:R~*R having


of
all
all polynomial functions, the sine and cosine
functions,
etc.) It is easy to see that C^jR) is a
R as defined
to
in
^(R, R) of all functions from

the set of
includes
C^R)

denote

C^jR)

derivative of y.

It is

derivatives

Cm{R)-*Cm{R)

T is

that

verified

easily

T(y)

by

where

y\\

y'

a linear operator

on

the

denotes
C\302\260(R).

We

of T.
eigenvectors
If A is arf-eigenvalue
of T, then there is an eigenvector y 6 C^jR)
such
that
=
=
This
is a first-order
differential
equation whose solutions are of
y'
T(j;)
Ay.
the form y(t) = ceu for some constant c. Consequently every realnumber
A is
an
of T, and the corresponding eigenvectors are of the
form
ceM
for
eigenvalue
=
c # 0. (Note
that if A
the
are
the nonzero
constant
0,
eigenvectors
the

determine,

and

eigenvalues

:
functions.) I

Example 3
Let

    A = ( 1  3 ),   x1 = (  1 ),   and   x2 = ( 3 ).
        ( 4  2 )         ( -1 )             ( 4 )

Since

    A x1 = ( 1  3 )(  1 ) = ( -2 ) = -2 x1,
           ( 4  2 )( -1 )   (  2 )

x1 is an eigenvector of L_A (and hence of A), and λ1 = -2 is the eigenvalue associated to x1. Moreover,

    A x2 = ( 1  3 )( 3 ) = ( 15 ) = 5 x2,
           ( 4  2 )( 4 )   ( 20 )

so x2 is an eigenvector of L_A (and of A) with associated eigenvalue λ2 = 5. Note that β = {x1, x2} is a basis for R^2 consisting of eigenvectors of A; hence, by Theorem 5.4, L_A is diagonalizable and

    [L_A]_β = ( -2  0 ).
              (  0  5 )

Finally, if

    Q = (  1  3 ),
        ( -1  4 )

the matrix whose columns are the vectors of β, then [L_A]_β = Q^{-1} A Q by Theorem 5.1.

Example 3 demonstrates a technique for diagonalizing an n x n matrix A: If β = {x1, ..., xn} is a basis for F^n consisting of eigenvectors of A, and Q is the n x n matrix whose jth column is xj (j = 1, 2, ..., n), then Q^{-1}AQ is a diagonal matrix. Thus the diagonalization problem for a matrix can be solved once its eigenvectors are determined. In order to use this procedure, we need a method for determining the eigenvectors of a matrix or operator. As we will see, the eigenvectors are easily determined once the eigenvalues are known. For this reason we begin by discussing a method for computing eigenvalues. As an aid in this computation we introduce the "determinant" of a linear operator.

Theorem 5.5. Let T be a linear operator on a finite-dimensional vector space V, and let β and β' be any two bases for V. Then det([T]_β) = det([T]_{β'}).

Proof. Let A = [T]_β and B = [T]_{β'}. Since A and B are similar, there exists an invertible matrix Q such that B = Q^{-1}AQ. Thus

    det(B) = det(Q^{-1}AQ) = det(Q^{-1}) · det(A) · det(Q)
           = [det(Q)]^{-1} · det(Q) · det(A) = det(A).
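Both the eigenvector computations of Example 3 and the invariance just proved in Theorem 5.5 can be checked directly (an illustrative sketch using the matrix of Example 3):

```python
from fractions import Fraction

A = [[1, 3], [4, 2]]

def apply(a, x):
    return [sum(entry * xi for entry, xi in zip(row, x)) for row in a]

x1, x2 = [1, -1], [3, 4]
print(apply(A, x1) == [-2 * v for v in x1])   # A x1 = -2 x1
print(apply(A, x2) == [5 * v for v in x2])    # A x2 = 5 x2

# Q has x1 and x2 as its columns; Q^{-1} A Q is diagonal, and the
# determinant is unchanged by the similarity (Theorem 5.5).
Q = [[1, 3], [-1, 4]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

d = Fraction(det2(Q))
Q_inv = [[Q[1][1] / d, -Q[0][1] / d],
         [-Q[1][0] / d, Q[0][0] / d]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

D = matmul(matmul(Q_inv, A), Q)
print(D)                        # diagonal, with entries -2 and 5
print(det2(D) == det2(A))       # True: similar matrices share their determinant
```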
This result motivates the following definition.

Definition. Let T be a linear operator on a finite-dimensional vector space V. We define the determinant of T, denoted det(T), as follows: Choose any basis β for V, and define det(T) = det([T]_β).

Note that Theorem 5.5 shows that det(T) is well-defined, that is, independent of the choice of the basis β.

Example 4
Let T: P2(R) -> P2(R) be defined by T(f) = f', the derivative of f. To compute det(T), let β = {1, x, x^2}. Then β is a basis for P2(R) and

    [T]_β = ( 0  1  0 )
            ( 0  0  2 )
            ( 0  0  0 )

Thus det(T) = det([T]_β) = 0.
Our next result establishes some important properties of the determinant of a linear operator. Note the similarity of these properties to those proved for the determinant of a matrix in Chapter 4.

Theorem 5.6. Let T be a linear operator on a finite-dimensional vector space V. Then
(a) T is invertible if and only if det(T) ≠ 0.
(b) If T is invertible, then det(T^{-1}) = [det(T)]^{-1}.
(c) If U: V -> V is linear, then det(TU) = det(T) · det(U).
(d) If λ is any scalar and β is a basis for V, then det(T - λI_V) = det([T]_β - λI).

Proof. The proofs of (a), (b), and (c) are exercises. To prove (d), suppose that β is a basis for V and A = [T]_β. Then [I_V]_β = I, and hence [T - λI_V]_β = A - λI. Thus, by definition,

    det(T - λI_V) = det([T - λI_V]_β) = det(A - λI).

The following theorem provides us with a method for computing eigenvalues.
Theorem 5.7. Let T be a linear operator on a finite-dimensional vector space V over a field F. Then a scalar λ ∈ F is an eigenvalue of T if and only if det(T - λI) = 0.

Proof. A scalar λ is an eigenvalue of T if and only if there exists a nonzero vector x in V such that T(x) = λx, or (T - λI)(x) = 0. By Theorem 2.5, this is true if and only if T - λI is not invertible. However, by Theorem 5.6, the statement that T - λI is not invertible is equivalent to the statement that det(T - λI) = 0.

Corollary 1. Let A be an n x n matrix over a field F. Then a scalar λ ∈ F is an eigenvalue of A if and only if det(A - λI) = 0.

Proof. Exercise.
Example 5
Let

    A = ( 1  1 ) ∈ M_{2 x 2}(R).
        ( 4  1 )

Since

    det(A - λI) = det( 1-λ   1  ) = λ^2 - 2λ - 3 = (λ - 3)(λ + 1),
                     (  4   1-λ )

the only eigenvalues of A are 3 and -1.
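The characteristic polynomial computed in Example 5 can be checked with a short script (an illustrative sketch): for a 2 x 2 matrix, det(A - λI) = λ^2 - tr(A)λ + det(A).

```python
A = [[1, 1], [4, 1]]

trace = A[0][0] + A[1][1]                       # 2
det_a = A[0][0] * A[1][1] - A[0][1] * A[1][0]   # -3

def f(lam):
    """det(A - lam*I) = lam^2 - tr(A)*lam + det(A) for a 2 x 2 matrix."""
    return lam ** 2 - trace * lam + det_a

print([f(lam) for lam in (3, -1)])    # [0, 0]: the eigenvalues are 3 and -1
print([f(lam) for lam in (0, 1, 2)])  # nonzero at non-eigenvalues
```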

Example 6
Let T: P2(R) -> P2(R) be the linear operator defined by T(f(x)) = f(x) + xf'(x) + f'(x), and let β be the standard basis for P2(R). Then

    [T]_β = ( 1  1  0 )
            ( 0  2  2 )
            ( 0  0  3 )

Since

    det(T - λI) = det([T]_β - λI) = det( 1-λ   1    0  ) = (1 - λ)(2 - λ)(3 - λ)
                                       (  0   2-λ   2  )
                                       (  0    0   3-λ )
                                     = -(λ - 1)(λ - 2)(λ - 3),

λ is an eigenvalue of T if and only if λ = 1, 2, or 3.

Example 6 makes use of the following obvious consequence of Theorem 5.6.

Corollary 2. Let T be a
V.
space V, and let P be a basis
for

eigenvalue

of

5 and 6 the reader

matrix, then det(A

\342\200\224

The

Definition.
is

indeterminate

XIn)

X is

called

have

in

If As Mnxn(F)3 the

basis

of

that

observed

n with

degree

only

ifX

an

is

is an n x n
leading coefficient
if A

zeros of this polynomial.

the

Thus

the

characteristic

polynomial

polynomial

permits

with

of T if and

is appropriate.

similar

are simply the

It is easily shown that


matrices
(see Exercise 12). This fact
polynomial
Definition,

vector

a finite-dimensional

an eigenvalue

may

a polynomial

is

of A

eigenvalues
definition

following

Then

on

operator

[T]^.

In Examples
1)\".
(\342\200\224

linear

\342\200\224

the

of A.f
the

have
the

characteristic

same

definition.

following

a linear operator on a
p. We define the characteristic polynomial
Let T be

in

tln)

det(A

vector

finite-dimensional

of

f(t)

to

space

be the

are
have
noticed
that the entries of the matrix
tThe observant reader
tln
field
not elements
of the field F. They are, however,elementsof another
(The
F(t).
It is usually studied in abstract
F(t) is the field of quotients of the
F[t],
ring
in Chapter 4
determinants
the
results
about
proved
algebracourses.)
A

may

field

polynomial

Consequently,

remain

true

in

this

context.

\342\200\224

Chap. 5

222

characteristic polynomialof

[T]^;

that

is,

- tl).

f(t) = det(A
remark

The
independentof

an

the

T by det(T
tl).
next result confirms

operator

The

of

Theorem

f(t)

[i.e., if and

A has at

characteristic

the
for

two

onlyifX is a

determining

a linear

be

Let

f(t).

above

corollaries

the

only

if x

eigenvectors

Let

vector

'# 0 and x

6 N(T -

characteristic

the

polynomial

is

a zero

a method

us with

vector space

of the

for determining all

Our next result gives us a


to a given eigenvalue.
corresponding

operator.

be

a linear

6 V isan

procedure

operator

on a vector

space and let X be


of T corresponding to if and

eigenvector

V,

X\\),

To find all

the eigenvectorsof

matrix

the

'-(i

in Example 5, recall

that

has

polynomial

Exercise.

Proof

Example

of T.

of

on an n-dimensional

operator
Then

provide

or an

a matrix

Theorem 5.9.
an eigenvalue

zero

the

f(X)

polynomial

of

eigenvalues

be

f(t)

of T if and only if X
(a) A scalar X is an eigenvalue
f(t) [i.e., if and only if f(X) = 0].
(b) T has at most n distinct eigenvalues.

The

(see also

0].
only if
most n distinct eigenvalues.

Corollary2.
with

of A if and

an eigenvalue

X is

scalar

is a

Mnxn(F)

5.8 are immediate

n matrix,and let

of A. Then

polynomial

can

E,l),

Let A be any n x

1.

Corollary

As

It

l)n.
(\342\200\224

of Theorem

consequences

following

Corollary

coefficient

leading

Examples 5 and 6.

of

polynomial

with

degree

of

(b)

of

polynomial

The characteristic

Theorem 5.8.

(a)

often denote the characteristic

our observations about


induction argument.

a straightforward

by

proved

polynomial

is

\342\200\224

The
be

basis /?. We

of the

choice

that this definition

shows

definition

the

preceding

Diagonalization

two

0
eigenvalues

=
Xx

3 and

X2

\342\200\224

1.

We

begin

223

Eigenvalues and Eigenvectors

Sec. 5.1

by findingall

the

3. Let

Xx

-\\

HI

0-C.

\342\200\224*H

to

corresponding

eigenvectors

Then

is

an

that

is, x #

0 and

-2
4

the set of

Clearly

l\\(x1\\= (-2x1
-2AW

all solutions to

the

suppose

that

x is an

* = '-V

above

is

and

3 if

Xx

to

_i;

if

only

corresponding

0\\

.4 i;

/2

1'

.4

X2

then^

x=

if and

only

if

is

a solution

C;)eN(L,,

to the system

2xx 4- x2 = 0

4xx +

=
2x2

0.

Hence

tei?J>.
N(LB)={t(j):

Thus x is

an eigenvectorcorresponding
to
x = t(

=
X2

e N(LB%

0.

/-1

1\\

for some

'1

and

x2\\(tf

equation

eigenvector of

#0

4x1-2xJ

x = 11 I
Now

if and only if x

eigenvector corresponding to

x is an

Hence

to Xx = 3

corresponding

eigenvector

\342\200\224

for some t #

if

and

0.

only

if

\342\200\224

1.

Let

224
Observe

hence

A,

of eigenvectorsof A. Thus,
is diagonalizable. In fact, if
=

LA,

and

-2

Q~XAQ =

0 -1
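The null-space computations above can be sketched in Python (an illustrative check; for a singular nonzero 2 x 2 matrix B, the direction (q, -p) taken from a nonzero row (p, q) spans N(L_B)):

```python
A = [[1, 1], [4, 1]]

def null_direction(b):
    """A nonzero solution of Bx = 0 for a singular, nonzero 2 x 2 matrix B."""
    p, q = b[0] if any(b[0]) else b[1]
    return [q, -p]

for lam in (3, -1):
    B = [[A[0][0] - lam, A[0][1]], [A[1][0], A[1][1] - lam]]
    x = null_direction(B)
    check = [sum(r * xi for r, xi in zip(row, x)) for row in B]
    print(lam, x, check)   # 3 [1, 2] [0, 0]   then   -1 [1, -2] [0, 0]
```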

To find the eigenvectors of the linear operator T on P2(R) defined in Example 6, first recall that T has eigenvalues 1, 2, and 3. Now consider the diagram shown in Figure 5.1, which is a special case of the diagram of Figure 2.2 of Section 2.4 as applied to T. Here V = W = P2(R), β = γ, and

    A = [T]_β = ( 1  1  0 )
                ( 0  2  2 )
                ( 0  0  3 )

              T
      P2(R) -----> P2(R)
        |            |
      φ_β|            |φ_β
        v            v
       R^3 -------> R^3
             L_A

          Figure 5.1

We will show that if v is an eigenvector of T corresponding to λ, then φ_β(v) is an eigenvector of A corresponding to λ. (This argument is valid for any linear operator on a finite-dimensional vector space.) If T(v) = λv, then

    L_A φ_β(v) = φ_β T(v) = φ_β(λv) = λ φ_β(v).

Now φ_β(v) ≠ 0 since φ_β is an isomorphism. Thus φ_β(v) is an eigenvector of L_A (and hence of A) corresponding to λ. Since the argument above is reversible, we can establish similarly that if φ_β(v) is an eigenvector of A corresponding to λ, then v is an eigenvector of T corresponding to λ (see Exercise 13).

An equivalent formulation of the result proved in the preceding paragraph is that for any eigenvalue λ of A (and hence of T), a vector y ∈ R^3 is an eigenvector of A corresponding to λ if and only if φ_β^{-1}(y) is an eigenvector of T corresponding to λ. This fact allows us to compute the eigenvectors of T, as we do in Example 7.

Example 7
Let λ1 = 1, and define

    B = A - λ1 I = ( 0  1  0 )
                   ( 0  1  2 )
                   ( 0  0  2 )

It is easily shown that

    N(L_B) = { a( 1 ) : a ∈ R }.
                ( 0 )
                ( 0 )

Thus the eigenvectors of A corresponding to λ1 = 1 are of the form a e1 for any a ≠ 0. Consequently, the eigenvectors of T corresponding to λ1 = 1 are of the form

    φ_β^{-1}(a e1) = a φ_β^{-1}(e1) = a · 1 = a

for any a ≠ 0. Hence the eigenvectors of T corresponding to λ1 = 1 are the nonzero constant polynomials.

Next let λ2 = 2, and define

    B = A - λ2 I = ( -1  1  0 )
                   (  0  0  2 )
                   (  0  0  1 )

Again it is easily verified that

    N(L_B) = { a( 1 ) : a ∈ R }.
                ( 1 )
                ( 0 )

Thus the eigenvectors of T corresponding to λ2 = 2 are of the form

    φ_β^{-1}(a(e1 + e2)) = a φ_β^{-1}(e1 + e2) = a(1 + x) = a + ax

for any a ≠ 0.

Finally, consider λ3 = 3, and define

    B = A - λ3 I = ( -2  1  0 )
                   (  0 -1  2 )
                   (  0  0  0 )

Since

    N(L_B) = { a( 1 ) : a ∈ R },
                ( 2 )
                ( 1 )

any eigenvector of T corresponding to λ3 = 3 is of the form

    φ_β^{-1}(a(e1 + 2e2 + e3)) = a φ_β^{-1}(e1 + 2e2 + e3) = a(1 + 2x + x^2) = a + 2ax + ax^2

for any a ≠ 0.

Note also that γ = {1, 1 + x, 1 + 2x + x^2} is a basis for P2(R) consisting of eigenvectors of T. Thus T is diagonalizable and

    [T]_γ = ( 1  0  0 )
            ( 0  2  0 )
            ( 0  0  3 )
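The three eigenpairs found above can be verified at the level of coordinate vectors (an illustrative check using A = [T]_β):

```python
A = [[1, 1, 0],
     [0, 2, 2],
     [0, 0, 3]]

def apply(a, x):
    return [sum(entry * xi for entry, xi in zip(row, x)) for row in a]

# coordinate vectors of 1, 1 + x, and 1 + 2x + x^2 relative to {1, x, x^2}
eigenpairs = {1: [1, 0, 0], 2: [1, 1, 0], 3: [1, 2, 1]}
for lam, v in eigenpairs.items():
    print(lam, apply(A, v) == [lam * vi for vi in v])   # True for each pair
```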

We close this section by analyzing eigenvectors and eigenvalues from a geometric viewpoint. If x is an eigenvector of a linear operator T on V, then T(x) = λx for some scalar λ. Let W = span({x}) be the one-dimensional subspace of V spanned by x. If y is any element of W, then y = cx for some scalar c, so

    T(y) = T(cx) = cT(x) = cλx = λy ∈ W.

Thus T maps W into itself. If V is a vector space over the field of real numbers, then W can be regarded as a line passing through the origin (i.e., through 0). The operator T acts on the elements of W by multiplying each element by the scalar λ. There are several possibilities for the action of T on W depending on the value of λ (see Figure 5.2).

Case 1. If λ > 1, then T moves elements of W farther from 0 by a factor of λ.
Case 2. If λ = 1, then T acts as the identity transformation on W.
Case 3. If 0 < λ < 1, then T moves elements of W closer to 0 by a factor of λ.
Case 4. If λ = 0, then T acts as the zero transformation on W.
Case 5. If λ < 0, then T reverses the orientation of W; that is, T moves points of W from one side of 0 to the other.

Figure 5.2  The action of T on W = span({x}) when x is an eigenvector of T (Case 1: λ > 1; Case 2: λ = 1; Case 3: 0 < λ < 1; Case 4: λ = 0; Case 5: λ < 0).

To illustrate these ideas, consider the linear operators introduced in Section 2.1. Recall that the transformation T: R^2 -> R^2 defined by T(x1, x2) = (x1, -x2) is the reflection about the x-axis. It is easily seen that e1 and e2 are eigenvectors of T corresponding to the eigenvalues 1 and -1, respectively: T acts as the identity on the x-axis and reverses the orientation of the y-axis.

Next consider the projection on the x-axis, U: R^2 -> R^2, defined by U(x1, x2) = (x1, 0). Again it is geometrically clear that e1 and e2 are eigenvectors of U corresponding to the eigenvalues 1 and 0, respectively: U acts as the identity on the x-axis and as the zero transformation on the y-axis.

Finally, recall that the rotation through the angle θ is the operator T_θ: R^2 -> R^2 defined by

    T_θ(x1, x2) = (x1 cos θ - x2 sin θ, x1 sin θ + x2 cos θ).

If 0 < θ < π, it is geometrically clear that T_θ maps no one-dimensional subspace of R^2 into itself. This observation implies that T_θ has no eigenvectors (and hence no eigenvalues). To confirm this conclusion using Corollary 2 of Theorem 5.8, we note that the characteristic polynomial of T_θ is

    det(T_θ - tI) = det( cos θ - t    -sin θ   ) = t^2 - (2 cos θ)t + 1,
                       (  sin θ     cos θ - t )

which has no real zeros, since the discriminant 4 cos^2 θ - 4 is negative for 0 < θ < π. Thus T_θ has no eigenvalues or eigenvectors.

Of course, there exist operators and
Of course, such operators and

\342\200\224

with

matrices)

(and

not

are

matrices

Diagonalization

is

for

negative

no

eigenvalues

diagonalizable.
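The computation above can be checked numerically. The following sketch uses Python with numpy, which is an assumption here and not part of the text: the discriminant 4 cos^2 θ - 4 is negative for 0 < θ < π, and the computed eigenvalues of the rotation matrix are non-real.

```python
import numpy as np

# Rotation of R^2 through the angle theta, as in the discussion above.
def rotation(theta):
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

theta = np.pi / 3          # any angle with 0 < theta < pi
A = rotation(theta)

# Discriminant of the characteristic polynomial t^2 - (2 cos theta)t + 1:
disc = 4 * np.cos(theta) ** 2 - 4
eigs = np.linalg.eigvals(A)

print(disc < 0)                            # True: no real zeros
print(np.all(np.abs(eigs.imag) > 1e-12))   # True: the eigenvalues are non-real
```

For θ = 0 or θ = π the discriminant vanishes and the rotation reduces to ±I, which of course does have eigenvalues.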

EXERCISES

1. Label the following statements as being true or false.
(a) Every linear operator on an n-dimensional vector space has n distinct eigenvalues.
(b) If a real matrix has one eigenvector, then it has an infinite number of eigenvectors.
(c) There exists a square matrix with no eigenvectors.
(d) Eigenvalues must be nonzero scalars.
(e) Any two eigenvectors are linearly independent.
(f) The sum of two eigenvalues of a linear operator T is also an eigenvalue of T.
(g) Linear operators on infinite-dimensional vector spaces never have eigenvalues.
(h) An n × n matrix A with entries from a field F is similar to a diagonal matrix if and only if there is a basis for F^n consisting of eigenvectors of A.
(i) Similar matrices always have the same eigenvalues.
(j) Similar matrices always have the same eigenvectors.
(k) The sum of two eigenvectors of an operator T is always an eigenvector of T.

2. For each of the following matrices A and ordered bases β, find [L_A]_β. Also find an invertible matrix P and a diagonal matrix D such that D = P^{-1}AP.
(a) …   (b) …   (c) …   (d) …

3. For each of the following matrices A ∈ M_{n×n}(F):
(1) Determine all the eigenvalues of A.
(2) For each eigenvalue λ of A, find the set of eigenvectors corresponding to λ.
(3) If possible, find a basis for F^n consisting of eigenvectors of A.
(4) If successful in finding such a basis, determine an invertible matrix Q and a diagonal matrix D such that Q^{-1}AQ = D.
(a) A = … for F = …   (b) A = … for F = …   (c) A = … for F = …   (d) A = …

4. Let T: P_2(R) → P_2(R) be defined by T(f(x)) = f(x) + xf'(x). Find the eigenvalues of T, and find a basis β for P_2(R) such that [T]_β is a diagonal matrix.

5. Prove Corollaries 1 and 2 of Theorem 5.6.

6. Prove parts (a), (b), and (c) of Theorem 5.7.

7. Prove Theorem 5.9.

8. (a) Prove that a linear operator T on a finite-dimensional vector space is invertible if and only if zero is not an eigenvalue of T.
(b) Let T be an invertible linear operator. Prove that a scalar λ is an eigenvalue of T if and only if λ^{-1} is an eigenvalue of T^{-1}.
(c) State and prove results analogous to (a) and (b) for matrices.

9. Prove that the eigenvalues of an upper triangular matrix M are the diagonal entries of M.

10. Let V be a finite-dimensional vector space, and let λ be any scalar.
(a) For any ordered basis β for V, prove that [λI_V]_β = λI.
(b) Compute the characteristic polynomial of λI_V.
(c) Show that λI_V is diagonalizable and has only one eigenvalue.

11. A scalar matrix is a square matrix of the form λI for some scalar λ; i.e., a scalar matrix is a diagonal matrix in which all the diagonal entries are equal.
(a) Prove that if a square matrix A is similar to a scalar matrix λI, then A = λI.
(b) Show that a diagonalizable matrix having only one eigenvalue is a scalar matrix.
(c) Conclude that the matrix

    ( 1  1 )
    ( 0  1 )

is not diagonalizable.

12. (a) Prove that similar matrices have the same characteristic polynomial.
(b) Show that the definition of the characteristic polynomial of a linear operator on a finite-dimensional vector space V is independent of the choice of basis for V.

13. Prove the following assertions made in Example 8.
(a) If v is an eigenvector of T corresponding to the eigenvalue λ, then φ_β(v) is an eigenvector of A corresponding to λ.
(b) If λ is an eigenvalue of A (and hence of T), then a vector y ∈ R^3 is an eigenvector of A corresponding to λ if and only if φ_β^{-1}(y) is an eigenvector of T corresponding to λ.

14.* For any square matrix A, prove that A and A^t have the same characteristic polynomial (and hence the same eigenvalues).

15.* (a) Let T be a linear operator on a vector space V, and let x be an eigenvector of T corresponding to the eigenvalue λ. For any positive integer m, prove that x is an eigenvector of T^m corresponding to the eigenvalue λ^m.
(b) State and prove the analogous result for matrices.

16. (a) Prove that similar matrices have the same trace. Hint: Use Exercise 12 of Section 2.3.
(b) How would you define the trace of a linear operator on a finite-dimensional vector space? Justify that your definition is well-defined.

17. Let T: M_{n×n}(F) → M_{n×n}(F) be the mapping defined by T(A) = A^t, the transpose of A.
(a) Verify that T is a linear operator on M_{n×n}(F).
(b) Show that ±1 are the only eigenvalues of T.
(c) Describe the matrices that are eigenvectors of T corresponding to the eigenvalues 1 and -1, respectively.

18. Let A and B be similar n × n matrices. Prove that there exist an n-dimensional vector space V, a linear operator T on V, and ordered bases β and γ for V such that A = [T]_β and B = [T]_γ. Hint: Use Exercise 13 of Section 2.5.

19.† Let A, B ∈ M_{n×n}(C) be such that B is invertible. Prove that there exists a scalar c ∈ C such that A + cB is not invertible. Hint: Examine det(A + cB).

20. Let A be an n × n matrix with characteristic polynomial

    f(t) = (-1)^n t^n + a_{n-1} t^{n-1} + ⋯ + a_1 t + a_0.

Sec. 5.2    Diagonalizability                                231

Prove that f(0) = a_0 = det(A). Deduce that A is invertible if and only if a_0 ≠ 0.

21. Let A and f(t) be as in Exercise 20.
(a) Prove that f(t) = (A_{11} - t)(A_{22} - t)⋯(A_{nn} - t) + g(t), where g(t) is a polynomial in t of degree at most n - 2.
(b) Show that tr(A) = (-1)^{n-1} a_{n-1}.

22.† Let T be a linear operator on a finite-dimensional vector space V over the field F. Prove that if g(t) ∈ P(F) and x is an eigenvector of T corresponding to the eigenvalue λ, then g(T)(x) = g(λ)x.

23. Use Exercise 22 to prove that if f(t) is the characteristic polynomial of a diagonalizable linear operator T, then f(T) = T_0, the zero operator. (As we will see in Section 5.4, this result does not depend on the diagonalizability of T.)

24. Prove Theorem 5.8.

25. Find the number of characteristic polynomials for matrices in M_{2×2}(Z_2).

5.2  DIAGONALIZABILITY

In Section 5.1 we presented the diagonalization problem and observed that not all linear operators or matrices are diagonalizable. Although we were able to diagonalize certain operators and matrices, and even obtained a necessary and sufficient condition for diagonalizability (Theorem 5.4), we have not solved the diagonalization problem. What is still needed is a simple test to determine if an operator or a matrix can be diagonalized and, if so, an algorithm for obtaining a basis of eigenvectors. In this section we will develop such a test and an algorithm.

In Example 8 of Section 5.1 we obtained a basis of eigenvectors by choosing one eigenvector corresponding to each eigenvalue. In general such a procedure will not yield a basis, but the following theorem shows that any set constructed in this manner must be linearly independent.

Theorem 5.10. Let T be a linear operator on a vector space V, and let λ_1, λ_2, …, λ_k be distinct eigenvalues of T. If x_1, x_2, …, x_k are eigenvectors of T such that λ_j corresponds to x_j (1 ≤ j ≤ k), then {x_1, …, x_k} is linearly independent.

Proof. The proof is by mathematical induction on k. Suppose that k = 1. Then x_1 ≠ 0 since x_1 is an eigenvector, and hence {x_1} is linearly independent. Now assume that the theorem holds for k - 1 eigenvectors, where k - 1 ≥ 1, and that we have k eigenvectors x_1, …, x_k corresponding to the distinct eigenvalues λ_1, …, λ_k. We wish to show that {x_1, …, x_k} is linearly independent. Suppose that a_1, …, a_k are scalars such that

    a_1 x_1 + a_2 x_2 + ⋯ + a_k x_k = 0.    (1)

Applying T - λ_k I to both sides of (1), we obtain

    a_1(λ_1 - λ_k)x_1 + ⋯ + a_{k-1}(λ_{k-1} - λ_k)x_{k-1} = 0.

By the induction hypothesis {x_1, …, x_{k-1}} is linearly independent, and hence a_i(λ_i - λ_k) = 0 for 1 ≤ i ≤ k - 1. Since λ_1, …, λ_k are distinct, λ_i - λ_k ≠ 0 for 1 ≤ i ≤ k - 1. So a_i = 0 for 1 ≤ i ≤ k - 1, and hence (1) reduces to a_k x_k = 0. But x_k ≠ 0, so a_k = 0. Therefore a_1 = ⋯ = a_k = 0, and it follows that {x_1, …, x_k} is linearly independent.

Corollary. Let T be a linear operator on a finite-dimensional vector space V of dimension n. If T has n distinct eigenvalues, then T is diagonalizable.

Proof. Let T have n distinct eigenvalues λ_1, …, λ_n, and let x_1, …, x_n be eigenvectors of T such that x_j corresponds to λ_j (1 ≤ j ≤ n). By Theorem 5.10 {x_1, …, x_n} is linearly independent; since dim(V) = n, this set is a basis for V. Thus, by Theorem 5.4, T is diagonalizable.
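The corollary can be illustrated numerically. The sketch below uses Python with numpy (an assumption here; the matrix is chosen for illustration and is not from the text): a matrix with n distinct eigenvalues yields n linearly independent eigenvectors, and conjugating by the matrix of eigenvectors diagonalizes it.

```python
import numpy as np

# An upper triangular matrix with three distinct eigenvalues 1, 2, 3.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 2.0, 2.0],
              [0.0, 0.0, 3.0]])

eigenvalues, Q = np.linalg.eig(A)   # columns of Q are eigenvectors
assert len(set(np.round(eigenvalues, 8))) == 3   # the eigenvalues are distinct

# Theorem 5.10: eigenvectors for distinct eigenvalues are independent,
# so the matrix Q of eigenvectors is invertible ...
assert abs(np.linalg.det(Q)) > 1e-9
# ... and, as in the corollary, Q^{-1} A Q is diagonal.
D = np.linalg.inv(Q) @ A @ Q
assert np.allclose(D, np.diag(eigenvalues), atol=1e-9)
```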

Example 1

Let

    A = ( 1  1 )
        ( 1  1 )  ∈ M_{2×2}(R).

The characteristic polynomial of A (and hence of L_A) is

    det(A - tI) = t(t - 2),

and thus the eigenvalues of L_A are 0 and 2. Since L_A is a linear operator on the two-dimensional vector space R^2, we conclude from the corollary above that L_A (and hence A) is diagonalizable.

Although the corollary to Theorem 5.10 provides a sufficient condition for diagonalizability, this condition is not necessary. For example, the identity operator is diagonalizable but has only one eigenvalue, namely λ = 1.

We have seen that diagonalizability requires the existence of eigenvalues. Actually, diagonalizability imposes a much stronger condition on the characteristic polynomial.

Definition. A polynomial f(x) in P(F) splits (in F) if there are scalars a_0, a_1, …, a_n (not necessarily distinct) in F such that

    f(x) = a_0(x - a_1)(x - a_2)⋯(x - a_n).

For example, x^2 - 1 = (x + 1)(x - 1) splits in R, but (x^2 + 1)(x - 2) does not split in R because x^2 + 1 cannot be factored into linear factors. However, (x^2 + 1)(x - 2) does split in C because it factors into the product (x + i)(x - i)(x - 2). If f(t) is the characteristic polynomial of a linear operator or a matrix, then the field under consideration will be the field associated with the operator or the matrix.

Theorem 5.11. The characteristic polynomial of any diagonalizable linear operator T splits.

Proof. Suppose that T is diagonalizable. Then there exists a basis β for V such that [T]_β = D is a diagonal matrix. If

    D = ( λ_1                )
        (      λ_2           )
        (           ⋱        )
        (               λ_n  )

and f(t) is the characteristic polynomial of T, then

    f(t) = det(D - tI) = (λ_1 - t)(λ_2 - t)⋯(λ_n - t) = (-1)^n (t - λ_1)(t - λ_2)⋯(t - λ_n).

From this theorem it is clear that if T is a diagonalizable linear operator on an n-dimensional vector space that fails to have n distinct eigenvalues, then the characteristic polynomial of T must have repeated zeros. This observation leads us to the following definition.

Definition. The (algebraic) multiplicity of an eigenvalue λ of a linear operator or matrix with characteristic polynomial f(t) is the largest positive integer k for which (t - λ)^k is a factor of f(t).

Example 2

If f(t) = -(t - 1)^2(t - 2) is the characteristic polynomial of A, then λ = 1 is an eigenvalue of A with multiplicity 2, and λ = 2 is an eigenvalue of A with multiplicity 1.

If T is a diagonalizable linear operator on a finite-dimensional vector space V, then there is a basis β for V consisting of eigenvectors of T. We know from Theorem 5.4 that [T]_β is a diagonal matrix in which the diagonal entries are the eigenvalues of T. Since the characteristic polynomial of T is det([T]_β - tI), it is easily seen that each eigenvalue of T must occur as a diagonal entry of [T]_β exactly as many times as its multiplicity. Hence β contains as many (linearly independent) eigenvectors corresponding to an eigenvalue as the multiplicity of that eigenvalue. Thus we see that the number of linearly independent eigenvectors corresponding to a given eigenvalue is of great interest in determining when an operator can be diagonalized. Recalling from Theorem 5.9 that the eigenvectors of T corresponding to the eigenvalue λ are the nonzero vectors in the null space of T - λI, we are led naturally to the study of this set.

Definition. Let T be a linear operator on a vector space V, and let λ be an eigenvalue of T. Define

    E_λ = {x ∈ V: T(x) = λx} = N(T - λI_V).

The set E_λ is called the eigenspace of T corresponding to the eigenvalue λ. As expected, by the eigenspace of a matrix A corresponding to an eigenvalue λ, we mean the eigenspace E_λ of the operator L_A.

Clearly, E_λ is a subspace of V consisting of the zero vector and the eigenvectors of T corresponding to the eigenvalue λ. The number of linearly independent eigenvectors corresponding to the eigenvalue λ is therefore the dimension of E_λ. Our next result relates this dimension to the multiplicity of λ.

Theorem 5.12. Let T be a linear operator on a finite-dimensional vector space V, and let λ be an eigenvalue of T having multiplicity m. Then 1 ≤ dim(E_λ) ≤ m.

Proof. Pick a basis {x_1, …, x_p} for E_λ, extend it to a basis β = {x_1, …, x_p, x_{p+1}, …, x_n} for V, and let A = [T]_β. Observe that x_i (1 ≤ i ≤ p) is an eigenvector of T corresponding to λ, and therefore A may be written in the form

    A = ( λI_p  B )
        ( O     C ).

By Exercise 9 of Section 4.3, the characteristic polynomial of T is

    f(t) = det(A - tI_n) = det((λ - t)I_p) det(C - tI_{n-p}) = (λ - t)^p g(t),

where g(t) is a polynomial. Hence (λ - t)^p is a factor of f(t), and the multiplicity of λ is at least p. But dim(E_λ) = p; so dim(E_λ) ≤ m.

Example 3

Let T: P_2(R) → P_2(R) be the linear operator defined by T(f) = f', the derivative of f. The matrix of T with respect to the standard basis β for P_2(R) is

    [T]_β = ( 0  1  0 )
            ( 0  0  2 )
            ( 0  0  0 ).

Consequently the characteristic polynomial of T is

    det([T]_β - tI) = det ( -t   1   0 )
                          (  0  -t   2 )
                          (  0   0  -t ) = -t^3.

Thus T has only one eigenvalue (λ = 0), and this eigenvalue has multiplicity 3. Hence

    E_λ = N(T - 0I) = N(T).

So E_λ is the subspace of P_2(R) consisting of the constant polynomials, and in this case {1} is a basis for E_λ, so dim(E_λ) = 1. Consequently there is no basis for P_2(R) consisting of eigenvectors of T, so that T is not diagonalizable.

Example 4

Let T be the linear operator on R^3 defined by

    T ( a_1 )   ( 4a_1 + a_3         )
      ( a_2 ) = ( 2a_1 + 3a_2 + 2a_3 )
      ( a_3 )   ( a_1 + 4a_3         ).

We will determine the eigenspace of T corresponding to each eigenvalue. If β is the standard basis for R^3, then

    [T]_β = ( 4  0  1 )
            ( 2  3  2 )
            ( 1  0  4 ),

and hence the characteristic polynomial of T is

    det([T]_β - tI) = det ( 4-t   0    1  )
                          ( 2    3-t   2  )
                          ( 1     0   4-t ) = -(t - 5)(t - 3)^2.

So the eigenvalues of T are λ_1 = 5 and λ_2 = 3 with multiplicities 1 and 2, respectively.

Since E_{λ_1} = N(T - λ_1 I), E_{λ_1} is the solution space of the system of equations

    -x_1        +  x_3 = 0
    2x_1 - 2x_2 + 2x_3 = 0
     x_1        -  x_3 = 0.

It is easily seen (using the techniques of Chapter 3) that

    { (1, 2, 1) }

is a basis for E_{λ_1}. Hence dim(E_{λ_1}) = 1.

Similarly, E_{λ_2} = N(T - λ_2 I) is the solution space of the system

     x_1 +  x_3 = 0
    2x_1 + 2x_3 = 0
     x_1 +  x_3 = 0.

So

    { (-1, 0, 1), (0, 1, 0) }

is a basis for E_{λ_2}, and dim(E_{λ_2}) = 2.

In this case the multiplicity of each eigenvalue λ_i equals the dimension of the corresponding eigenspace E_{λ_i}. Observe that the union of the two bases above is a basis for R^3 consisting of eigenvectors of T, and hence T is diagonalizable.
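These computations can be verified numerically. The sketch below uses Python with numpy (an assumption; the text itself computes by hand): the eigenspace dimensions are 3 - rank([T]_β - λI), and the basis vectors found above are indeed eigenvectors.

```python
import numpy as np

A = np.array([[4.0, 0.0, 1.0],     # the matrix [T]_beta from Example 4
              [2.0, 3.0, 2.0],
              [1.0, 0.0, 4.0]])
I = np.eye(3)

dim_E5 = 3 - np.linalg.matrix_rank(A - 5 * I)   # dimension of E_{lambda_1}
dim_E3 = 3 - np.linalg.matrix_rank(A - 3 * I)   # dimension of E_{lambda_2}
assert dim_E5 == 1 and dim_E3 == 2   # dimensions match the multiplicities

# The basis vectors computed above are eigenvectors:
for lam, x in [(5, (1, 2, 1)), (3, (-1, 0, 1)), (3, (0, 1, 0))]:
    x = np.array(x, dtype=float)
    assert np.allclose(A @ x, lam * x)
```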

Examples 3 and 4 suggest that, for a linear operator T whose characteristic polynomial splits, diagonalizability of T is equivalent to equality of the dimension of the eigenspace and the multiplicity of the eigenvalue for each distinct eigenvalue of T. This is indeed true, as we now show. We begin with the following lemma, which is a slight variation of Theorem 5.10.

Lemma. Let T be a linear operator on a vector space V, and let λ_1, λ_2, …, λ_k be distinct eigenvalues of T. For each i = 1, 2, …, k, let x_i ∈ E_{λ_i}, the eigenspace corresponding to λ_i. If

    x_1 + x_2 + ⋯ + x_k = 0,

then x_i = 0 for all i.

Proof. Suppose otherwise. By renumbering if necessary, suppose that (for some m with 1 ≤ m ≤ k) we have x_i ≠ 0 for 1 ≤ i ≤ m, and x_i = 0 otherwise. Then for each i ≤ m, x_i is an eigenvector of T corresponding to λ_i, and

    x_1 + x_2 + ⋯ + x_m = 0.

But this contradicts Theorem 5.10, which states that these eigenvectors form a linearly independent set. We therefore conclude that x_i = 0 for all i.

Theorem 5.13. Let T be a linear operator on a vector space V, and let λ_1, λ_2, …, λ_k be distinct eigenvalues of T. For each i = 1, 2, …, k, let S_i be a finite linearly independent subset of the eigenspace E_{λ_i}. Then S = S_1 ∪ S_2 ∪ ⋯ ∪ S_k is a linearly independent subset of V.

Proof. Suppose that

    S_i = {x_{i1}, x_{i2}, …, x_{in_i}}

for each i. Then S = {x_{ij}: 1 ≤ j ≤ n_i, 1 ≤ i ≤ k}. Consider any scalars {a_{ij}} such that

    Σ_{i=1}^{k} Σ_{j=1}^{n_i} a_{ij} x_{ij} = 0.

For each i let

    y_i = Σ_{j=1}^{n_i} a_{ij} x_{ij}.

Then y_i ∈ E_{λ_i} for each i, and y_1 + ⋯ + y_k = 0. Therefore, by the lemma, y_i = 0 for all i. But S_i is linearly independent for each i; thus, since y_i = 0 for each i, it follows that a_{ij} = 0 for all j. We conclude that S is linearly independent.

Theorem 5.13 tells us how to build up a linearly independent subset of eigenvectors, namely by collecting bases for the individual eigenspaces. The next theorem tells us when the result will be a basis for V.

Theorem 5.14. Let T be a linear operator on a finite-dimensional vector space V such that the characteristic polynomial of T splits. Let λ_1, λ_2, …, λ_k be the distinct eigenvalues of T. Then
(a) T is diagonalizable if and only if the multiplicity of λ_i is equal to dim(E_{λ_i}) for all i.
(b) If T is diagonalizable and S_i is a basis for E_{λ_i} for each i, then β = S_1 ∪ S_2 ∪ ⋯ ∪ S_k is a basis for V consisting of eigenvectors of T.

Proof. For each i, let m_i denote the multiplicity of λ_i, let d_i = dim(E_{λ_i}), and let n = dim(V).

First, suppose that T is diagonalizable. Let β be a basis for V consisting of eigenvectors of T. For each i let β_i = β ∩ E_{λ_i}, the set of vectors in β that are eigenvectors corresponding to λ_i, and let n_i denote the number of vectors in β_i. Then n_i ≤ d_i for each i because β_i is a linearly independent subset of a space of dimension d_i, and d_i ≤ m_i for all i by Theorem 5.12. The n_i's sum to n because β contains n elements. The m_i's sum to n because the degree of the characteristic polynomial of T is equal to the sum of the multiplicities of the eigenvalues. Thus

    n = Σ_{i=1}^{k} n_i ≤ Σ_{i=1}^{k} d_i ≤ Σ_{i=1}^{k} m_i = n.

From this we conclude that

    Σ_{i=1}^{k} (m_i - d_i) = 0.

Since (m_i - d_i) ≥ 0 for all i, we also have that m_i = d_i for all i.

Conversely, suppose that m_i = d_i for all i. We will show that T is diagonalizable and prove (b). For each i, let S_i be a basis for E_{λ_i}, and let β = S_1 ∪ S_2 ∪ ⋯ ∪ S_k. By Theorem 5.13, β is a linearly independent subset of V. Since d_i = m_i for each i, β contains

    Σ_{i=1}^{k} d_i = Σ_{i=1}^{k} m_i = n

elements. Therefore β is a basis for V consisting of eigenvectors of T, and we conclude that T is diagonalizable.

This theorem completes our study of the diagonalization problem. We summarize some of our previous results in the following test and algorithm.

A Test for Diagonalizability

Let T be a linear operator on an n-dimensional vector space. Then T is diagonalizable if and only if both of the following conditions hold.

1. The characteristic polynomial of T splits.
2. The multiplicity of each eigenvalue λ equals n - rank(T - λI).

Observe that condition 2 makes use of the fact that the dimension of the eigenspace E_λ is n - rank(T - λI). Also observe that condition 2 is automatically satisfied for eigenvalues having multiplicity 1 (Theorem 5.12). Thus condition 2 need only be checked for those eigenvalues having multiplicity greater than 1.

Since diagonalizability of a matrix is equivalent to diagonalizability of the operator L_A, a similar test holds for matrices.

An Algorithm for Diagonalization

Let T be a diagonalizable linear operator on a finite-dimensional vector space V, and let λ_1, …, λ_k denote the distinct eigenvalues of T. For each j, let β_j be a basis for E_{λ_j} = N(T - λ_j I), and let β = β_1 ∪ β_2 ∪ ⋯ ∪ β_k. Then β is a basis for V, and [T]_β is a diagonal matrix.
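The test above can be turned into a rough computational procedure for real matrices. The following sketch (Python with numpy assumed, neither being part of the text) checks condition 1 by asking whether the computed eigenvalues are real, and condition 2 by comparing each multiplicity with n - rank(A - λI); because of floating-point tolerances this is a heuristic rather than exact arithmetic.

```python
import numpy as np

def is_diagonalizable(A, tol=1e-8):
    n = A.shape[0]
    eigenvalues = np.linalg.eigvals(A)
    # Condition 1: the characteristic polynomial splits over R.
    if np.any(np.abs(eigenvalues.imag) > tol):
        return False
    # Condition 2: each multiplicity equals n - rank(A - lambda I).
    values, counts = np.unique(np.round(eigenvalues.real, 6),
                               return_counts=True)
    for lam, mult in zip(values, counts):
        if n - np.linalg.matrix_rank(A - lam * np.eye(n)) != mult:
            return False
    return True

A5 = np.array([[3., 1., 0.],
               [0., 3., 0.],
               [0., 0., 4.]])
print(is_diagonalizable(A5))   # False
```

An exact version of this test would use symbolic computation (for instance a computer algebra system) in place of floating-point eigenvalues and ranks.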

Example 5

We will test the matrix

    A = ( 3  1  0 )
        ( 0  3  0 )
        ( 0  0  4 )  ∈ M_{3×3}(R)

for diagonalizability. The characteristic polynomial of A is det(A - tI) = -(t - 4)(t - 3)^2. Hence A has eigenvalues λ_1 = 4 and λ_2 = 3 with multiplicities 1 and 2, respectively. Clearly condition 1 of the test for diagonalizability is satisfied. Since λ_1 has multiplicity 1, condition 2 is satisfied for λ_1. Thus we need only check condition 2 for λ_2. Since

    B = A - λ_2 I = ( 0  1  0 )
                    ( 0  0  0 )
                    ( 0  0  1 )

has rank 2, we see that 3 - rank(B) = 1, which is not the multiplicity of λ_2. Thus condition 2 of the test fails for λ_2, and consequently A is not diagonalizable.

Example 6

Let T: R^3 → R^3 be defined by

    T ( a )   ( -2b - 3c    )
      ( b ) = ( a + 3b + 3c )
      ( c )   ( c           ).

We will test T for diagonalizability. Letting γ denote the standard basis for R^3, we have

    [T]_γ = ( 0  -2  -3 )
            ( 1   3   3 )
            ( 0   0   1 ).

The characteristic polynomial of T is -(t - 1)^2(t - 2). Thus T has two eigenvalues: λ_1 = 1 with multiplicity 2 and λ_2 = 2 with multiplicity 1. Thus condition 1 of the test for diagonalizability is satisfied, and we now consider condition 2. For λ_1, we have

    rank(T - λ_1 I) = rank ( -1  -2  -3 )
                           (  1   2   3 )
                           (  0   0   0 ) = 1.

Note that 3 - rank(T - λ_1 I) = 3 - 1 = 2, the multiplicity of λ_1. Since λ_2 has multiplicity 1, the dimension of E_{λ_2} is automatically equal to the multiplicity of λ_2. Thus condition 2 is satisfied, and T is diagonalizable.

We now find a basis β for R^3 such that [T]_β is a diagonal matrix. E_{λ_1} is the solution set of the system of equations

    -x_1 - 2x_2 - 3x_3 = 0
     x_1 + 2x_2 + 3x_3 = 0,

which has

    β_1 = { (-2, 1, 0), (-3, 0, 1) }

as a basis. Also, E_{λ_2} = N(T - λ_2 I) is the solution set of the system

    -2x_1 - 2x_2 - 3x_3 = 0
      x_1 +  x_2 + 3x_3 = 0
                   -x_3 = 0,

which has

    β_2 = { (-1, 1, 0) }

as a basis. Let β = β_1 ∪ β_2; then β is a basis for R^3, and

    [T]_β = ( 1  0  0 )
            ( 0  1  0 )
            ( 0  0  2 ).

Our next example is an application of diagonalization that will be of interest in Section 5.3.

Example 7

Let

    A = ( 0  -2 )
        ( 1   3 )  ∈ M_{2×2}(R).

We will show that A is diagonalizable and find a 2 × 2 matrix Q such that Q^{-1}AQ is a diagonal matrix. This information will then be used to compute A^n for any positive integer n.

Recall that A is diagonalizable if and only if L_A is diagonalizable. Now the characteristic polynomial of L_A is (t - 1)(t - 2). Thus L_A has two distinct eigenvalues, λ_1 = 1 and λ_2 = 2, and so is diagonalizable. To find a basis β for R^2 such that [L_A]_β is a diagonal matrix, we find a basis for each eigenspace. It is easily seen that

    { (-2, 1) }  is a basis for E_{λ_1}   and   { (-1, 1) }  is a basis for E_{λ_2}.

Thus β = {(-2, 1), (-1, 1)} is a basis for R^2 consisting of eigenvectors of L_A, and we have

    [L_A]_β = ( 1  0 )
              ( 0  2 ).

Moreover, if

    Q = ( -2  -1 )
        (  1   1 ),

the matrix whose columns are the vectors of β, then

    Q^{-1}AQ = ( 1  0 )
               ( 0  2 )

by Theorem 5.1. Finally, since

    A = Q ( 1  0 ) Q^{-1},
          ( 0  2 )

we have

    A^n = Q ( 1  0 )^n Q^{-1} = Q ( 1   0   ) Q^{-1}
            ( 0  2 )              ( 0   2^n )

        = ( -2  -1 ) ( 1   0   ) ( -1  -1 )
          (  1   1 ) ( 0   2^n ) (  1   2 )

        = (  2 - 2^n    2 - 2^{n+1} )
          ( -1 + 2^n   -1 + 2^{n+1} ).
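The closed form just obtained can be checked numerically; the sketch below uses Python with numpy (an assumption, not part of the text) to compare Q diag(1, 2^n) Q^{-1} with both the closed form and direct matrix powers.

```python
import numpy as np

A = np.array([[0, -2],
              [1,  3]])
Q = np.array([[-2, -1],
              [ 1,  1]])      # columns are the eigenvectors found above

def A_power(n):
    # A^n = Q diag(1, 2^n) Q^{-1}
    return Q @ np.diag([1.0, 2.0 ** n]) @ np.linalg.inv(Q)

for n in range(1, 8):
    closed_form = np.array([[ 2 - 2 ** n,   2 - 2 ** (n + 1)],
                            [-1 + 2 ** n,  -1 + 2 ** (n + 1)]])
    assert np.allclose(A_power(n), closed_form)
    assert np.allclose(np.linalg.matrix_power(A, n), closed_form)
```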

We will now discuss an application that uses diagonalization to solve a system of differential equations.

Systems of Differential Equations

Consider the system of differential equations

    x_1' = 3x_1 +  x_2 +  x_3
    x_2' = 2x_1 + 4x_2 + 2x_3
    x_3' = -x_1 -  x_2 +  x_3,

where, for each i, x_i = x_i(t) is a differentiable real-valued function of the real variable t. Clearly, this system has a solution, namely the solution in which each x_i(t) is the zero function. We will determine all the solutions of this system.

Let X: R → R^3 be the function defined by

    X(t) = ( x_1(t) )
           ( x_2(t) )
           ( x_3(t) ).

The derivative of X is defined as the function X', where

    X'(t) = ( x_1'(t) )
            ( x_2'(t) )
            ( x_3'(t) ).

Letting

    A = (  3   1  1 )
        (  2   4  2 )
        ( -1  -1  1 )

be the coefficient matrix of the given system, we can rewrite the system in the matrix form X' = AX, where AX is the product of the matrix A and the vector X. The reader should verify that A is diagonalizable and that if

    Q = (  1   0   1 )
        (  0   1   2 )
        ( -1  -1  -1 ),

then

    Q^{-1}AQ = D = ( 2  0  0 )
                   ( 0  2  0 )
                   ( 0  0  4 ).

Set A = QDQ^{-1}, and substitute into X' = AX to find X' = QDQ^{-1}X or, equivalently, Q^{-1}X' = DQ^{-1}X. Define Y: R → R^3 by Y(t) = Q^{-1}X(t). It can be shown that Y is a differentiable function and, in fact, Y' = Q^{-1}X'. Hence the original system can be written as Y' = DY.

Since D is a diagonal matrix, the system Y' = DY is easy to solve. For if

    Y(t) = ( y_1(t) )
           ( y_2(t) )
           ( y_3(t) ),

then Y' = DY can be written

    y_1'(t) = 2y_1(t)
    y_2'(t) = 2y_2(t)
    y_3'(t) = 4y_3(t).

The three equations are independent of each other and thus can be solved individually. It is easily seen (as in Example 2 of Section 5.1) that the general solution of these equations is y_1(t) = c_1 e^{2t}, y_2(t) = c_2 e^{2t}, and y_3(t) = c_3 e^{4t}, where c_1, c_2, and c_3 are arbitrary scalars. Finally,

    X(t) = QY(t) = (  1   0   1 ) ( c_1 e^{2t} )   (  c_1 e^{2t} + c_3 e^{4t}              )
                   (  0   1   2 ) ( c_2 e^{2t} ) = (  c_2 e^{2t} + 2c_3 e^{4t}             )
                   ( -1  -1  -1 ) ( c_3 e^{4t} )   ( -c_1 e^{2t} - c_2 e^{2t} - c_3 e^{4t} )

yields the general solution of the original system. Note that this solution can be written as

    X(t) = e^{2t} [ c_1 (1, 0, -1) + c_2 (0, 1, -1) ] + e^{4t} [ c_3 (1, 2, -1) ].

The expressions in brackets are arbitrary elements of E_{λ_1} and E_{λ_2}, respectively, where λ_1 = 2 and λ_2 = 4. Thus the general solution of the original system is

    X(t) = e^{2t} z_1 + e^{4t} z_2,

where z_1 ∈ E_{λ_1} and z_2 ∈ E_{λ_2}.
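As a check on the general solution just obtained, one can verify numerically that X(t) = e^{2t}z_1 + e^{4t}z_2 satisfies X' = AX; the sketch below (Python with numpy assumed, not part of the text) approximates the derivative by a central difference.

```python
import numpy as np

A = np.array([[ 3.,  1., 1.],
              [ 2.,  4., 2.],
              [-1., -1., 1.]])      # coefficient matrix of the system
c1, c2, c3 = 1.0, -2.0, 0.5        # arbitrary scalars

def X(t):
    z1 = c1 * np.array([1., 0., -1.]) + c2 * np.array([0., 1., -1.])  # in E_2
    z2 = c3 * np.array([1., 2., -1.])                                 # in E_4
    return np.exp(2 * t) * z1 + np.exp(4 * t) * z2

def X_prime(t, h=1e-6):
    # central-difference approximation of the derivative of X
    return (X(t + h) - X(t - h)) / (2 * h)

for t in (0.0, 0.3, 1.0):
    assert np.allclose(X_prime(t), A @ X(t), rtol=1e-5)
```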

Sums*

Let T be a linearoperator
on

a way of decomposing

into

simple

vector

finite-dimensional

that

subspaces

behavior of T. Thisapproachis
nondiagonalizable linear operators. In
simple subspaces are the eigenspacesof

in

useful

especially

case

the

the

of
operator.

space

offers some
Chapter

a diagonalizable

7,

V. There is

insight into the


where we study
operator

the

5.2

Sec.

245

Diagonalizability

Let

Definition.

--- + Wk= X

W1 + W2 +

let Wls W2s.

and

space

of the subspaces,Wi +

define the sum

of V. We

vector

be

It is a simple

x2 +

{x1

4-

--.+xk:

k
=

Wk

subspaces

by

W,

]T

XieW.Ui^k}.

of subspaces

of a vector spaceis

let W2 denote

the yz-plane. Then

the sum

that

show

to

exercise

=
Wi

\342\226\240
\342\200\242
*

W2

be

\342\226\240.,
Wk

a subspace.
8

Example

R3

= R3S let

Let

Wi

the

denote

\\NX

c) = (a,

(a, b,
and

0, 0)

(a,

6 Wls and

in Wx and

not

is

W2

(0, b, c) e

in Example

that

Notice

(a, b, c) in

for any vector

because

W2

and

xy-planes

unique.

0, 0) +

(0,

b,

that

have

we

R3,

c)

W2.

8 the representation of (a, b, c)


For example,
(a, b, c) = (a, b, 0)

which
representation. We are interested in sums
We now impose a constraint on sums to assure
The definition
of direct sum that followsis a
given in the exercises of Section 1.3.
for

(0,0,

of

this

of

write

Wx

\342\200\242
\342\200\242
\342\200\242
\302\251
\302\251Wk

\302\251W2

call

and

v=

and

be subspaces
of
V the direct sum

Wk

W2s...s

Wls

another

are

unique.

outcome.

generalization

Let

c) is

representations

us

Definition.

of vectors

a sum

as

definition

the

a vector space V.
of Wls W2s... Wk

We

if

w,

\\
n

Wi

=
wj)

f\302\260r

(\302\260)

each

i < k).

(K

Example 9

Let

=
W3

d): d

{(0,0,0,
(a,

b, c,

a, b e R},

Wx = {(a,b,0,0):

let

and

R4,

e R). For

d) = (a,

any element (a,b,c,

of

d)

b, 0S0) + (0,0,

0)

c,

{(0,0,c,0)

W2

R}, and

V,

0, d) e Wt

(0, 0,

c e

+ W2 +

W3.

Thus

t*=i

To show that
iWt

But

(W2

these

W3)

equalities

is

the

direct

sum

w,.

of Wls W2s

{0}y W2 n (Wt +
are obvious; so V

=
W3)

\\Nt

and W3s we must

{0},
\302\251W2

and

W3 n

\302\251VV3.

(Wx +
1

prove that
W2)

{0}.

Our next result contains several conditions that are equivalent to the definition of a direct sum.

Theorem 5.15. Let W1, W2, ..., Wk be subspaces of a finite-dimensional vector space V. The following conditions are equivalent:

(a) V = W1 ⊕ W2 ⊕ ... ⊕ Wk.
(b) V = Σ(i=1..k) Wi and, for any vectors x1, x2, ..., xk such that xi ∈ Wi (i = 1, 2, ..., k), if x1 + x2 + ... + xk = 0, then xi = 0 for each i.
(c) Each vector v in V can be uniquely written in the form v = x1 + x2 + ... + xk, where xi ∈ Wi (i = 1, 2, ..., k).
(d) If γi is an ordered basis for Wi (i = 1, 2, ..., k), then γ1 ∪ γ2 ∪ ... ∪ γk is an ordered basis for V.
(e) For each i = 1, 2, ..., k there exists an ordered basis γi for Wi such that γ1 ∪ γ2 ∪ ... ∪ γk is an ordered basis for V.

(We regard γ1 ∪ γ2 ∪ ... ∪ γk as an ordered basis in the natural way: the vectors in γ1 are listed first, in the same order as in γ1, then the vectors in γ2, in the same order as in γ2, etc.)

Proof. If (a) is true, then V = Σ(i=1..k) Wi by definition. Suppose that x1 + x2 + ... + xk = 0, where xi ∈ Wi (i = 1, 2, ..., k). Then for any i,

    −xi = Σ(j≠i) xj ∈ Σ(j≠i) Wj.

But also xi ∈ Wi, so xi ∈ Wi ∩ Σ(j≠i) Wj = {0}. Hence xi = 0 for each i, proving (b).

We next prove that (b) implies (c). Since V = Σ(i=1..k) Wi by (b), any vector v ∈ V can be represented in the form v = x1 + x2 + ... + xk for some elements xi ∈ Wi. We must show that this representation is unique. Suppose also that v = y1 + y2 + ... + yk, where yi ∈ Wi (i = 1, 2, ..., k). Then

    (x1 − y1) + (x2 − y2) + ... + (xk − yk) = 0,

where xi − yi ∈ Wi for each i. Thus (b) implies that xi − yi = 0, that is, xi = yi, for each i, proving the uniqueness of the representation.

To show that (c) implies (d), let γi be an ordered basis for Wi (i = 1, 2, ..., k). Since V = Σ(i=1..k) Wi by (c), it is clear that γ1 ∪ γ2 ∪ ... ∪ γk generates V. Suppose that there are vectors xij ∈ γi (j = 1, 2, ..., mi) and scalars aij such that Σ(i,j) aij xij = 0. Set

    yi = Σ(j=1..mi) aij xij;

then yi ∈ span(γi) = Wi and y1 + y2 + ... + yk = 0. Since 0 ∈ Wi for each i and 0 + 0 + ... + 0 = 0, condition (c) implies that yi = 0 for each i. Thus 0 = yi = Σ(j) aij xij for each i. But each γi is linearly independent, and hence aij = 0 for each i and j. Consequently γ1 ∪ γ2 ∪ ... ∪ γk is linearly independent and therefore is a basis for V.

It is immediate that (d) implies (e).

Finally, we shall show that (e) implies (a). For each i, let γi be an ordered basis for Wi such that γ1 ∪ γ2 ∪ ... ∪ γk is an ordered basis for V. Then

    V = span(γ1 ∪ γ2 ∪ ... ∪ γk) = span(γ1) + span(γ2) + ... + span(γk) = Σ(i=1..k) Wi

by repeated applications of Exercise 12 of Section 1.4. Fix an index i, and suppose that 0 ≠ v ∈ Wi ∩ Σ(j≠i) Wj. Then v ∈ Wi = span(γi) and v ∈ Σ(j≠i) Wj = span(∪(j≠i) γj). Hence v is a nontrivial linear combination of both γi and ∪(j≠i) γj, so that v can be expressed as a linear combination of γ1 ∪ γ2 ∪ ... ∪ γk in more than one way. But these representations contradict Theorem 1.7; so we conclude that Wi ∩ Σ(j≠i) Wj = {0}, proving (a).

With the aid of Theorem 5.15 we are now able to characterize diagonalizability in terms of direct sums.
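Conditions (d) and (e) of Theorem 5.15 give a practical test for directness: form the union of bases and check linear independence. A sketch, using our helper `is_independent` and the spanning sets of Examples 8 and 9 (the sum in Example 8 is not direct; the sum in Example 9 is):

```python
from fractions import Fraction

def is_independent(vectors):
    """Exact linear-independence test by row reduction over Q."""
    m = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for c in range(len(m[0])):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(r + 1, len(m)):
            if m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r == len(vectors)

# Example 8: bases of the xy-plane and the yz-plane in R^3.  Their union
# is 4 vectors in R^3, hence dependent, so R^3 = W1 + W2 is not direct.
ex8 = [(1, 0, 0), (0, 1, 0)] + [(0, 1, 0), (0, 0, 1)]
assert not is_independent(ex8)

# Example 9: the union of bases of W1, W2, W3 is a basis of R^4,
# so by condition (e) the sum is direct.
ex9 = [(1, 0, 0, 0), (0, 1, 0, 0)] + [(0, 0, 1, 0)] + [(0, 0, 0, 1)]
assert is_independent(ex9)
print("conditions (d)/(e) of Theorem 5.15 checked on Examples 8 and 9")
```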

Theorem 5.16. Let T be a linear operator on a finite-dimensional vector space V. Then T is diagonalizable if and only if V is the direct sum of the eigenspaces of T.

Proof. First, suppose that T is diagonalizable. Then the characteristic polynomial of T splits. Let λ1, λ2, ..., λk be the distinct eigenvalues of T. For each i, choose an ordered basis Si for the eigenspace E_{λi}. It follows from Theorem 5.14 that S1 ∪ S2 ∪ ... ∪ Sk is a basis for V. Therefore, by Theorem 5.15 we conclude that V is the direct sum of the eigenspaces of T.

Conversely, suppose that V is the direct sum of the eigenspaces of T: V = E_{λ1} ⊕ E_{λ2} ⊕ ... ⊕ E_{λk}. For each i, choose an ordered basis Si for E_{λi}. By Theorem 5.15 we have that S1 ∪ S2 ∪ ... ∪ Sk is a basis for V. Since this basis consists of eigenvectors of T, we conclude that T is diagonalizable.
Example 10

Let T: R⁴ → R⁴ be defined by

    T(a, b, c, d) = (a, b, 2c, 3d).

Then it is easily seen that T is diagonalizable with eigenvalues λ1 = 1, λ2 = 2, and λ3 = 3. Furthermore, the corresponding eigenspaces coincide with the subspaces W1, W2, and W3 of Example 9. Thus Theorem 5.16 provides us with another proof that R⁴ = W1 ⊕ W2 ⊕ W3.
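The diagonalizability criterion behind Example 10 can be phrased computationally: the eigenspace dimensions must add up to dim V. A sketch with our helpers `nullity` and `eigenspace_dim` (the candidate eigenvalues are supplied by hand; this is not the text's algorithm, just an exact rank computation):

```python
from fractions import Fraction

def nullity(A):
    """Dimension of the null space of a square matrix, by exact elimination."""
    n = len(A)
    m = [[Fraction(x) for x in row] for row in A]
    rank = 0
    for c in range(n):
        piv = next((i for i in range(rank, n) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[rank], m[piv] = m[piv], m[rank]
        for i in range(n):
            if i != rank and m[i][c] != 0:
                f = m[i][c] / m[rank][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[rank])]
        rank += 1
    return n - rank

def eigenspace_dim(A, lam):
    """dim E_lam = nullity(A - lam*I)."""
    n = len(A)
    shifted = [[A[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    return nullity(shifted)

# Example 10: [T] = diag(1, 1, 2, 3).  The eigenspace dimensions sum to 4,
# so R^4 is the direct sum of the eigenspaces and T is diagonalizable.
T = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 2, 0], [0, 0, 0, 3]]
dims = [eigenspace_dim(T, lam) for lam in (1, 2, 3)]
assert dims == [2, 1, 1] and sum(dims) == 4

# A non-diagonalizable contrast: for B = [[1, 1], [0, 1]] the eigenvalue 1
# has multiplicity 2, but dim E_1 = 1, so the eigenspaces do not fill R^2.
B = [[1, 1], [0, 1]]
assert eigenspace_dim(B, 1) == 1
print("eigenspace dimensions for Example 10:", dims)
```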

EXERCISES

1. Label the following statements as being true or false.
(a) Any linear operator on an n-dimensional vector space that has fewer than n distinct eigenvalues is not diagonalizable.
(b) Eigenvectors corresponding to the same eigenvalue are always linearly dependent.
(c) If λ is an eigenvalue of a linear operator T, then each element of E_λ is an eigenvector of T.
(d) If λ1 and λ2 are distinct eigenvalues of a linear operator T, then E_{λ1} ∩ E_{λ2} = {0}.
(e) Let A ∈ M_{n×n}(F) and β = {x1, x2, ..., xn} be an ordered basis for Fⁿ consisting of eigenvectors of A. If Q is the n × n matrix whose ith column is xi (i = 1, 2, ..., n), then Q⁻¹AQ is a diagonal matrix.
(f) A linear operator T on a finite-dimensional vector space is diagonalizable if and only if the multiplicity of each eigenvalue λ equals the dimension of E_λ.
(g) Every diagonalizable linear operator on a nonzero vector space has at least one eigenvalue.
(h) If a vector space is the direct sum of subspaces W1, W2, ..., Wk, then Wi ∩ Wj = {0} for i ≠ j.
(i) If V = Σ(i=1..k) Wi and Wi ∩ Wj = {0} for i ≠ j, then V = W1 ⊕ W2 ⊕ ... ⊕ Wk.

2. For each of the following matrices A ∈ M_{n×n}(R), test A for diagonalizability, and if A is diagonalizable, find a matrix Q such that Q⁻¹AQ is a diagonal matrix.

3. For each of the following linear operators T on a vector space V, test T for diagonalizability, and if T is diagonalizable, find a basis β for V such that [T]_β is a diagonal matrix.
(a) T: P3(R) → P3(R) defined by T(f) = f′ + f″, where f′ and f″ denote the first and second derivatives of f, respectively.
(b) T: P2(R) → P2(R) defined by T(ax² + bx + c) = cx² + bx + a.
(c) T: R³ → R³.
(d) T: P2(R) → P2(R) defined by T(f)(x) = f(0) + f(1)(x + x²).
(e) T: C² → C² defined by T(z, w) = (z + iw, iz + w).
(f) T: M2×2(R) → M2×2(R) defined by T(A) = Aᵗ.

4. Prove the matrix version of the corollary to Theorem 5.10: If A ∈ M_{n×n}(F) has n distinct eigenvalues, then A is diagonalizable.

5. State and prove the matrix version of Theorem 5.11.

6. (a) Justify the test for diagonalizability and the algorithm for diagonalization stated in this section.
(b) Formulate the results in part (a) for matrices.

7. For

    A = ( 1  4 )
        ( 2  3 )

in M2×2(R), find Aⁿ for any positive integer n.

8. Suppose that A ∈ M_{n×n}(F) has two distinct eigenvalues, λ1 and λ2, and that dim(E_{λ1}) = n − 1. Prove that A is diagonalizable.

9. Let T be a linear operator on a finite-dimensional vector space V, and suppose that the distinct eigenvalues of T are λ1, λ2, ..., λk with multiplicities m1, m2, ..., mk, respectively. Suppose that β is a basis for V for which [T]_β is an upper triangular matrix. Prove that the diagonal entries of [T]_β are λ1, λ2, ..., λk and that each λj occurs mj times (j = 1, 2, ..., k).

10. Let A be an n × n matrix whose characteristic polynomial splits, and let λ1, λ2, ..., λk be the distinct eigenvalues of A. For each j, let mj denote the multiplicity of λj. Prove that
(a) tr(A) = Σ(j=1..k) mj λj;
(b) det(A) = (λ1)^{m1} (λ2)^{m2} ... (λk)^{mk}.
(You may assume that A is upper triangular; the result is true in general, however.)

11. Let T be an invertible linear operator on a finite-dimensional vector space. Prove that T is diagonalizable if and only if T⁻¹ is diagonalizable.

12. Let A ∈ M_{n×n}(F). Show that A is diagonalizable if and only if Aᵗ is diagonalizable.

13. Find the general solution of each system of differential equations.
(a) x′ = x + y
    y′ = 3x − y
(b) x′1 = 8x1 + 10x2
    x′2 = −5x1 − 7x2
(c) x′1 = x1 + x3
    x′2 = x2 + x3
    x′3 = 2x3

14. Let

    A = ( a11  a12  ...  a1n )
        ( a21  a22  ...  a2n )
        (  .    .         .  )
        ( an1  an2  ...  ann )

be the coefficient matrix of the system of differential equations

    x′1 = a11 x1 + a12 x2 + ... + a1n xn
    x′2 = a21 x1 + a22 x2 + ... + a2n xn
     .
    x′n = an1 x1 + an2 x2 + ... + ann xn.

Suppose that A is diagonalizable and that the distinct eigenvalues of A are λ1, λ2, ..., λk. Prove that a differentiable function X: R → Rⁿ is a solution to the system if and only if X is of the form

    X(t) = e^{λ1 t} z1 + e^{λ2 t} z2 + ... + e^{λk t} zk,

where zi ∈ E_{λi} for i = 1, 2, ..., k. Conclude that the set of solutions to the system is an n-dimensional real vector space.

Exercises 15 through 17 are concerned with simultaneous diagonalization.

Definitions. Two linear operators T and U on the same finite-dimensional vector space V are called simultaneously diagonalizable if there exists a basis β for V such that both [T]_β and [U]_β are diagonal matrices. Similarly, A, B ∈ M_{n×n}(F) are called simultaneously diagonalizable if there exists an invertible matrix Q ∈ M_{n×n}(F) such that both Q⁻¹AQ and Q⁻¹BQ are diagonal matrices.

15. (a) Prove that if T and U are simultaneously diagonalizable linear operators on a finite-dimensional vector space V, then the matrices [T]_β and [U]_β are simultaneously diagonalizable for any ordered basis β.
(b) Show that if A and B are simultaneously diagonalizable matrices, then L_A and L_B are simultaneously diagonalizable operators.

16. (a) Show that if T and U are simultaneously diagonalizable operators, then T and U commute (i.e., TU = UT).
(b) Show that if A and B are simultaneously diagonalizable matrices, then A and B commute.
The converses of (a) and (b) will be established in Exercise 25 of Section 5.4.

17. Let T be a diagonalizable linear operator on a finite-dimensional vector space, and let m be any positive integer. Prove that T and T^m are simultaneously diagonalizable.

Exercises 18 through 21 are concerned with direct sums.

18. Let W1, W2, ..., Wk be subspaces of a finite-dimensional vector space V such that

    Σ(i=1..k) Wi = V.

Prove that V is the direct sum of W1, W2, ..., Wk if and only if

    dim(V) = Σ(i=1..k) dim(Wi).

19. Let V be a finite-dimensional vector space with an ordered basis β, and let β1, β2, ..., βk be a partition of β (i.e., β1, β2, ..., βk are subsets of β such that β = β1 ∪ β2 ∪ ... ∪ βk and βi ∩ βj = ∅ if i ≠ j). Prove that V = span(β1) ⊕ span(β2) ⊕ ... ⊕ span(βk).

20. Let T be a linear operator on a finite-dimensional vector space V, and suppose that the distinct eigenvalues of T are λ1, λ2, ..., λk. Prove that

    span({x ∈ V: x is an eigenvector of T}) = E_{λ1} ⊕ E_{λ2} ⊕ ... ⊕ E_{λk}.

21. Let W1, W2, K1, K2, ..., Kp, M1, M2, ..., Mq be subspaces of a vector space V such that W1 = K1 ⊕ K2 ⊕ ... ⊕ Kp and W2 = M1 ⊕ M2 ⊕ ... ⊕ Mq. Prove that if W1 ∩ W2 = {0}, then

    W1 + W2 = W1 ⊕ W2 = K1 ⊕ K2 ⊕ ... ⊕ Kp ⊕ M1 ⊕ M2 ⊕ ... ⊕ Mq.
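Exercise 7 can be solved by diagonalization. Assuming the entries of A are [[1, 4], [2, 3]] as read above (the scan is damaged here, so treat the entries as our reading), the eigenvalues are 5 and −1, and the resulting closed form for Aⁿ can be checked against repeated multiplication:

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def mat_pow(A, n):
    """A^n by repeated multiplication, starting from the identity."""
    R = [[Fraction(int(i == j)) for j in range(len(A))] for i in range(len(A))]
    for _ in range(n):
        R = matmul(R, A)
    return R

A = [[Fraction(1), Fraction(4)], [Fraction(2), Fraction(3)]]

def A_power(n):
    """Closed form A^n = Q diag(5^n, (-1)^n) Q^{-1}, with Q = [[1, 2], [1, -1]]."""
    a, b = Fraction(5) ** n, Fraction(-1) ** n
    t = Fraction(1, 3)
    return [[t * (a + 2 * b), t * (2 * a - 2 * b)],
            [t * (a - b), t * (2 * a + b)]]

for n in range(6):
    assert A_power(n) == mat_pow(A, n)
print("closed form for A^n agrees with repeated multiplication")
```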

5.3* MATRIX LIMITS AND MARKOV CHAINS

If A is a square matrix having complex entries, then for any positive integer m, A^m is a square matrix of the same size that also has complex entries. In many sciences there are important practical applications that require determining the "limit" (if one exists) of the sequence of matrices A, A², A³, .... In this section we consider such limits and examine one important situation in which this type of limit exists. We assume familiarity with limits of sequences of real numbers. The limit of a sequence of complex numbers {zm: m = 1, 2, ...} can be defined in terms of the limits of the sequences of the real and imaginary parts: If zm = rm + i·sm, where rm and sm are real numbers, then

    lim(m→∞) zm = lim(m→∞) rm + i·lim(m→∞) sm,

provided that the limits of the sequences rm and sm exist.

Definition. Let A1, A2, A3, ... be n × p matrices having complex entries. The sequence A1, A2, A3, ... is said to converge to the n × p matrix L, called the limit of the sequence, if

    lim(m→∞) (Am)ij = Lij

for all i and j (1 ≤ i ≤ n, 1 ≤ j ≤ p). If the sequence converges to L, we write lim(m→∞) Am = L.

Example 1

(The entries of this example are partly illegible in this copy; the following reconstruction has the limit stated in the text.) If

    Am = ( (3m² + 1)/(m² + 1)   (2m + i)/m  )
         (        1/m           (1 + 1/m)^m ),

then

    lim(m→∞) Am = ( 3  2 )
                  ( 0  e ),

where e is the base of the natural logarithm.

A simple but important property of matrix limits is contained in the next theorem. Note the analogy with the familiar property of limits of sequences of real numbers that asserts that if lim(m→∞) am exists, then lim(m→∞) c·am = c·lim(m→∞) am.

Theorem 5.17. Let A1, A2, A3, ... be a sequence of n × p matrices with complex entries such that lim(m→∞) Am = L. Then for any P ∈ M_{r×n}(C) and Q ∈ M_{p×s}(C),

    lim(m→∞) P·Am = P·L  and  lim(m→∞) Am·Q = L·Q.

Proof. For any i (1 ≤ i ≤ r) and j (1 ≤ j ≤ p),

    lim(m→∞) (P·Am)ij = lim(m→∞) Σ(k=1..n) Pik (Am)kj
                      = Σ(k=1..n) Pik lim(m→∞) (Am)kj
                      = Σ(k=1..n) Pik Lkj = (P·L)ij.

Hence lim(m→∞) P·Am = P·L. The proof that lim(m→∞) Am·Q = L·Q is similar.

Corollary. Let A ∈ M_{n×n}(C) be such that lim(m→∞) A^m = L. Then for any invertible matrix Q ∈ M_{n×n}(C),

    lim(m→∞) (Q·A·Q⁻¹)^m = Q·L·Q⁻¹.

Proof. Since

    (Q·A·Q⁻¹)^m = (Q·A·Q⁻¹)(Q·A·Q⁻¹)...(Q·A·Q⁻¹) = Q·A^m·Q⁻¹,

we have

    lim(m→∞) (Q·A·Q⁻¹)^m = lim(m→∞) (Q·A^m·Q⁻¹) = Q·(lim(m→∞) A^m)·Q⁻¹ = Q·L·Q⁻¹

by applying Theorem 5.17 twice.

In the discussion that follows, we frequently encounter the set

    S = {λ ∈ C: |λ| < 1 or λ = 1}.

Geometrically, this set consists of the complex number 1 together with the interior of the unit disk (the disk of radius 1 centered at the origin). This set is of interest because, if λ is a complex number, then lim(m→∞) λ^m exists if and only if λ ∈ S. (This fact, which is obviously true for real numbers, can be shown to be true for complex numbers also.)

The following important result gives necessary and sufficient conditions for the existence of the type of limit under consideration.
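The role of the set S can be illustrated numerically; a minimal sketch (the sample values and helper names are ours):

```python
def in_S(lam, tol=1e-12):
    """Membership in S = {z in C : |z| < 1 or z = 1}."""
    return abs(lam) < 1 or abs(lam - 1) < tol

def power_limit(lam, m=200):
    """lam^m for a large m; this settles down exactly when lam is in S."""
    return lam ** m

assert in_S(0.5) and in_S(1) and in_S(0.5j)
assert not in_S(-1) and not in_S(1.1)

# Inside S the powers converge (to 0, or to 1 when lam = 1) ...
assert abs(power_limit(0.5)) < 1e-12
assert power_limit(1.0) == 1.0
assert abs(power_limit(0.5j)) < 1e-12
# ... outside S they oscillate or blow up.
assert abs(power_limit(-1.0, 200) - power_limit(-1.0, 201)) == 2.0
assert abs(power_limit(1.1)) > 1e8
print("lam^m converges exactly for lam in S")
```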

Theorem 5.18. Let A be a square matrix having complex entries. Then lim(m→∞) A^m exists if and only if the following conditions hold:

(a) If λ is an eigenvalue of A, then λ ∈ S.
(b) If 1 is an eigenvalue of A, then the dimension of the eigenspace corresponding to 1 equals the multiplicity of 1 as an eigenvalue of A.

Unfortunately, it will not be possible to prove this theorem until we study the Jordan canonical form in Section 7.2; the proof is deferred until Exercise 18 of that section. The necessity of condition (a) is easy to see, however. For suppose that λ is an eigenvalue of A such that λ ∉ S. Let x be an eigenvector of A corresponding to λ. Regarding x as an n × 1 matrix, we see by Theorem 5.17 that if lim(m→∞) A^m exists, then

    lim(m→∞) (A^m x) = (lim(m→∞) A^m) x = Lx,  where L = lim(m→∞) A^m.

But A^m x = λ^m x, so lim(m→∞) (A^m x) = lim(m→∞) λ^m x diverges, since λ ∉ S. Hence if lim(m→∞) A^m exists, condition (a) of Theorem 5.18 must hold.

Although we are unable to prove the necessity of condition (b) at this time, let us consider an example for which this condition fails. Observe that for the matrix

    B = ( 1  1 )
        ( 0  1 ),

the eigenvalue 1 has multiplicity 2, whereas dim(E1) = 1. But

    B^m = ( 1  m )
          ( 0  1 )

by an easy induction, and hence lim(m→∞) B^m does not exist. (We will see later that if B is a matrix for which condition (b) fails, then the Jordan canonical form of B can be chosen so that its upper left 2 × 2 submatrix is precisely this matrix.)

In most of the applications involving this type of limit, however, the matrix A is diagonalizable. When condition (b) of Theorem 5.18 is replaced by the stronger condition that A is diagonalizable (see Theorem 5.14), the existence of the limit is easily shown.

Theorem 5.19. Let A ∈ M_{n×n}(C) be such that the following conditions hold:

(a) If λ is an eigenvalue of A, then λ ∈ S.
(b) A is diagonalizable.

Then lim(m→∞) A^m exists.

Proof. Since A is diagonalizable, there exists an invertible matrix Q such that Q⁻¹AQ = D, a diagonal matrix. Let the diagonal entries of D be λ1, λ2, ..., λn. Because λ1, λ2, ..., λn are the eigenvalues of A, condition (a) shows that, for each i, either λi = 1 or |λi| < 1. Thus

    lim(m→∞) (λi)^m = 1 if λi = 1, and 0 otherwise;

so the sequence D, D², D³, ... converges to a limit L. Hence lim(m→∞) A^m = lim(m→∞) (QDQ⁻¹)^m = Q·L·Q⁻¹ by the corollary to Theorem 5.17.
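The proof of Theorem 5.19 is constructive. A sketch with a hypothetical 2 × 2 diagonalizable matrix whose eigenvalues 1 and 22/25 both lie in S; the matrices A, Q, and Qinv below are our choices, with Q⁻¹AQ diagonal by construction:

```python
from fractions import Fraction as F

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[F(9, 10), F(1, 50)], [F(1, 10), F(49, 50)]]
Q = [[F(1), F(1)], [F(5), F(-1)]]           # columns: eigenvectors for 1, 22/25
Qinv = [[F(1, 6), F(1, 6)], [F(5, 6), F(-1, 6)]]

# D = Q^{-1} A Q is diagonal; lim D^m replaces each diagonal entry by its limit.
D = matmul(matmul(Qinv, A), Q)
assert D == [[F(1), F(0)], [F(0), F(22, 25)]]
limD = [[F(1), F(0)], [F(0), F(0)]]          # 1^m -> 1, (22/25)^m -> 0

L = matmul(matmul(Q, limD), Qinv)
assert L == [[F(1, 6), F(1, 6)], [F(5, 6), F(5, 6)]]

# A^m really approaches L: the error decays like (22/25)^m.
P = [row[:] for row in A]
for _ in range(99):
    P = matmul(P, A)                          # P = A^100
err = max(abs(P[i][j] - L[i][j]) for i in range(2) for j in range(2))
assert err < F(22, 25) ** 90
print("lim A^m =", L)
```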

The technique for computing lim(m→∞) A^m that is used in the proof of Theorem 5.19 is quite useful: if Q is an invertible matrix for which Q⁻¹AQ = D is a diagonal matrix, then

    lim(m→∞) A^m = lim(m→∞) (QDQ⁻¹)^m = lim(m→∞) (Q D^m Q⁻¹) = Q (lim(m→∞) D^m) Q⁻¹,

and lim(m→∞) D^m is computed entry by entry along the diagonal. (The worked numerical example that appears here in the original is illegible in this copy; the method is exactly the one just displayed.)

Let us now consider a simple example in which the limit of powers of a matrix occurs. Suppose that the population of a certain metropolitan area remains constant but that there is a continual movement of people between the city and the suburbs. Specifically, let the entries of the matrix A below represent the probabilities that someone living in the city or in the suburbs on January 1 will be living in each region on January 1 of the next year.

                                Presently      Presently
                                living in      living in
                                the city       the suburbs

    Living next year
    in the city                   0.90           0.02
                                                           = A
    Living next year
    in the suburbs                0.10           0.98

For instance, the probability that someone living in the city (on January 1) will be living in the suburbs next year (on January 1) is 0.10. Notice that since the entries of A are probabilities, the entries of A are nonnegative. Moreover, the assumption of a constant population in the metropolitan area requires that the sum of the entries of each column of A be 1.

Any square matrix having these two properties (that the entries are nonnegative and that the sum of the entries in each column is 1) is called a transition matrix (or a stochastic matrix). For an arbitrary n × n transition matrix M, the rows and columns correspond to n states, and the entry Mij represents the probability of moving from state j to state i in one stage. In our example, there are two states (residing in the city and residing in the suburbs), and the entry A21 = 0.10 represents the probability of moving from the city to the suburbs in one stage (year).

Let us now determine the probability that a city resident will be living in the suburbs after 2 years. Observe first that there are two different ways in which such a move can be made: by remaining in the city for 1 year and then moving to the suburbs, or by moving to the suburbs during the first year and remaining there the second year (see Figure 5.3). The probability that a city dweller remains in the city during the next year is 0.90, and the probability that a city dweller moves to the suburbs during the following year is 0.10. Hence the probability that a city resident stays in the city for 1 year and then moves to the suburbs is 0.90(0.10). Likewise the probability that a city dweller moves to the suburbs during the first year and remains there the next year is 0.10(0.98). Thus the probability that a city resident will be living in the suburbs after 2 years is

    0.90(0.10) + 0.10(0.98) = 0.188.

Observe that this calculation is the same as the one that produces the entry (A²)21.

    [Figure 5.3: the two two-stage paths from the city to the suburbs: city to city to suburbs, and city to suburbs to suburbs.]
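The two-path computation above is exactly a matrix product; a minimal check with exact rational arithmetic (the helper `matmul` is ours):

```python
from fractions import Fraction as F

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[F(90, 100), F(2, 100)],
     [F(10, 100), F(98, 100)]]   # the city-suburb transition matrix

A2 = matmul(A, A)
# (A^2)_{21} is exactly the two-path sum 0.90*0.10 + 0.10*0.98 = 0.188.
assert A2[1][0] == F(188, 1000)
# Columns of A^2 still sum to 1, so A^2 is again a transition matrix.
assert all(A2[0][j] + A2[1][j] == 1 for j in range(2))
print("(A^2)_21 =", float(A2[1][0]))
```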

Hence (A²)21 represents the probability that a city dweller will be residing in the suburbs after 2 years. In general, for any transition matrix M, the entry (M^m)ij represents the probability of moving from state j to state i in m stages.

Suppose additionally that 70% of the 1970 population of the metropolitan area lived in the city and 30% lived in the suburbs. Let us record this data as a column vector:

    P = ( 0.70 )   <- proportion of city dwellers
        ( 0.30 )   <- proportion of suburb residents

Notice that the rows of P correspond to the states of residing in the city and residing in the suburbs, respectively, in the same order as the states are listed in the transition matrix A. Observe also that P is a column vector containing nonnegative entries whose sum is 1; such a vector is called a probability vector. In this terminology, each column of a transition matrix is a probability vector.

Let us now consider the significance of the vector AP. The first coordinate of this vector is formed by the calculation 0.90(0.70) + 0.02(0.30). The term 0.90(0.70) represents the proportion of the 1970 metropolitan population that remained in the city during the next year, and the term 0.02(0.30) represents the proportion of the 1970 population that moved into the city during the next year. Hence the first coordinate of AP represents the proportion of the metropolitan population that was living in the city in 1971. Similarly, the second coordinate of AP represents the proportion of the metropolitan population that was living in the suburbs in 1971. This argument can be easily extended to show that the coordinates of

    A²P = A(AP) = ( 0.57968 )
                  ( 0.42032 )

represent the proportions of the metropolitan population that were living in each location in 1972. In general, the coordinates of A^m P represent the proportions of the metropolitan population that will be living in the city and the suburbs, respectively, after m stages (m years after 1970).

Will the city eventually be depleted if this trend continues? In view of the preceding discussion, it is natural to define the eventual proportions of city dwellers and suburbanites to be the first and second coordinates, respectively, of lim(m→∞) A^m P. Let us now compute this limit. We can construct Q and D as in earlier computations to obtain
    L = lim(m→∞) A^m = lim(m→∞) (Q D^m Q⁻¹) = Q (lim(m→∞) D^m) Q⁻¹ = ( 1/6  1/6 )
                                                                      ( 5/6  5/6 ).

Hence

    lim(m→∞) A^m P = LP = ( 1/6 )
                          ( 5/6 ),

so eventually 1/6 of the population will live in the city and 5/6 will live in the suburbs. It is easy to show that LP is this same vector for any probability vector P. Hence the eventual proportions of city dwellers and suburbanites are independent of the initial proportions (as given by the vector P)!

In analyzing the city-suburb problem, we gave probabilistic interpretations of A² and AP, showing that A² is a transition matrix and AP is a probability vector. Analogous arguments can be used to show that any power of a transition matrix is a transition matrix and that the product of a transition matrix and a probability vector is a probability vector. These results can also be based on the following theorem, which characterizes transition matrices and probability vectors in terms of matrix products.

Theorem 5.20. Let M be an n × n matrix having real nonnegative entries, let v be a column vector in Rⁿ having nonnegative coordinates, and let u ∈ Rⁿ be the column vector in which each coordinate equals 1. Then

(a) M is a transition matrix if and only if Mᵗu = u;
(b) v is a probability vector if and only if uᵗv = (1).

Proof. Exercise.

Corollary. (a) The product of two n × n transition matrices is an n × n transition matrix. In particular, any power of a transition matrix is a transition matrix. (b) The product of a transition matrix and a probability vector is a probability vector.

Proof. Exercise.

A stochastic process is concerned with predicting the state of an object that changes states in some random manner. Normally, at any given time the object is constrained to be in exactly one of a number of possible states.
The probability that the object is in some particular state at a given time will depend on such factors as

1. The state in question
2. The time in question
3. Some or all of the previous states in which the object has been
4. The states that other objects are in or have been in

For instance, the object could be an American voter, and the states could be his or her preference of political party; or the object could be a molecule of H2O, and the states could be the three physical states in which H2O can exist (the solid, liquid, and gaseous states).

If the probability that an object in one state will change to a different state depends only on the two states (and not on the time, earlier states, or other factors), then the stochastic process is called a Markov process. If, in addition, the number of possible states is finite, then the Markov process is called a Markov chain. The preceding example of the movement of population between the city and the suburbs is a two-state Markov chain.

Let us consider another Markov chain. A certain junior college would like to obtain information about the likelihood that students in various categories of presently enrolled students will graduate. The school classifies a student as a sophomore or a freshman depending on the number of credits that the student has earned. Data from the school indicate that, from one fall semester to the next, 40% of the sophomores will graduate, 30% will remain sophomores, and 30% will quit permanently. For freshmen, the data show that 10% will graduate by next fall, 50% will become sophomores, 20% will remain freshmen, and 20% will quit permanently. During the present year, 50% of the students at the school are sophomores and 50% are freshmen. Assuming that the trend indicated by the data continues indefinitely, the school would like to know

1. The percentage of the present students who will graduate, the percentage who will be sophomores, the percentage who will be freshmen, and the percentage who will quit school permanently by next fall
2. The same percentages as in item 1 for the fall semester 2 years hence
3. The percentage of its present students who will eventually graduate

The preceding paragraph describes a four-state Markov chain with the states

1. Having graduated
2. Being a sophomore
3. Being a freshman
4. Having quit permanently
The data cited above provide us with the transition matrix

    A = ( 1   0.4  0.1  0 )
        ( 0   0.3  0.5  0 )
        ( 0   0    0.2  0 )
        ( 0   0.3  0.2  1 )

of the Markov chain. (Notice that students who have graduated or have quit permanently are assumed to remain indefinitely in those respective states. Thus a freshman who quits the school and returns during a later semester is not regarded as having changed states; the student is assumed to have remained in the state of being a freshman during the time he or she was not enrolled.) Moreover, we are told that the present distribution of students is half in each of states 2 and 3 and none in states 1 and 4. The vector

    P = ( 0    )
        ( 0.50 )
        ( 0.50 )
        ( 0    )

that describes the initial probability of being in each state is called the initial probability vector for the Markov chain.

To answer question 1, we must determine the probability that a present student will be in each state by next fall. As we have seen, these probabilities are the coordinates of the vector

    AP = ( 0.25 )
         ( 0.40 )
         ( 0.10 )
         ( 0.25 ).

Hence by next fall, 25% of the present students will graduate, 40% will be sophomores, 10% will be freshmen, and 25% will quit the school permanently.

Similarly,

    A²P = A(AP) = ( 0.42 )
                  ( 0.17 )
                  ( 0.02 )
                  ( 0.39 )

provides the information needed to answer question 2: within 2 years 42% of the present students will graduate, 17% will be sophomores, 2% will be freshmen, and 39% will quit the school.

Finally, the answer to question 3 is provided by the vector LP, where L = lim(m→∞) A^m.
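The answers to questions 1 and 2 can be reproduced with exact arithmetic (the helper `apply` is ours); iterating further also approximates the answer to question 3, where the graduation share tends to 59/112:

```python
from fractions import Fraction as F

A = [[F(1), F(4, 10), F(1, 10), F(0)],
     [F(0), F(3, 10), F(5, 10), F(0)],
     [F(0), F(0),     F(2, 10), F(0)],
     [F(0), F(3, 10), F(2, 10), F(1)]]
P = [F(0), F(1, 2), F(1, 2), F(0)]       # initial probability vector

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]

AP = apply(A, P)
assert AP == [F(25, 100), F(40, 100), F(10, 100), F(25, 100)]   # question 1

A2P = apply(A, AP)
assert A2P == [F(42, 100), F(17, 100), F(2, 100), F(39, 100)]   # question 2

# Question 3: iterate; the non-unit eigenvalues 0.3 and 0.2 damp out quickly.
v = P
for _ in range(200):
    v = apply(A, v)
assert abs(v[0] - F(59, 112)) < F(1, 10) ** 10
print("AP =", [float(x) for x in AP], " A^2 P =", [float(x) for x in A2P])
```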

The reader should verify that if

    Q = ( 1  −4   19  0 )
        ( 0   7  −40  0 )
        ( 0   0    8  0 )
        ( 0  −3   13  1 ),

then

    D = Q⁻¹AQ = ( 1  0    0    0 )
                ( 0  0.3  0    0 )
                ( 0  0    0.2  0 )
                ( 0  0    0    1 ).

Thus

    L = lim(m→∞) A^m = Q (lim(m→∞) D^m) Q⁻¹ = Q ( 1  0  0  0 ) Q⁻¹ = ( 1  4/7  27/56  0 )
                                                ( 0  0  0  0 )       ( 0  0    0      0 )
                                                ( 0  0  0  0 )       ( 0  0    0      0 )
                                                ( 0  0  0  1 )       ( 0  3/7  29/56  1 ).

So

    LP = ( 59/112 )
         ( 0      )
         ( 0      )
         ( 53/112 ),

and hence the probability that one of the present students will eventually graduate is 59/112.

In the two preceding examples we have seen that lim(m→∞) A^m P, where A is the transition matrix and P is the initial probability vector of the Markov chain, gives the eventual proportions in each state. In general, however, the limit of powers of a transition matrix need not exist. For example, if

    M = ( 0  1 )
        ( 1  0 ),

then lim(m→∞) M^m clearly does not exist.
263

Markov Chains

Limits and

Matrix

5.3

Sec.

powers

(Odd

of M equal

M and

even

m-*co

of

powers

shown (see Exercise20


Am does

lim

that

hold for M

5.18 does not

of Theorem

of

Section

But

computation
a
is

such

(a)

of
be

may

of the transition matrix exists, the


difficult. (The reader is encouraged to

powers
quite

the truth of the last sentence.)


appreciate
Fortunately,there
and
class of transition matrices for-which this limit
important
is the class of \"regular\" transition matrices.
computed\342\200\224this
6 to

Exercise

work

limit

the

of

condition

which

for

it can be

fact,

matrices

transition

(a)

hold.

limit

the

if

even

only

condition

that

In

eigenvalue).

precisely those matrices

not exist are

5.18 fails to

the

that

fails to existis

an

is

(\342\200\2241

7.2)

m-*co

of Theorem

that the limit

The reason

equal/.)

is

exists

large
easily

Definition. A transition matrix is called regular if some power of the matrix
contains only positive entries.

Example 2

The transition matrix

    [0.90  0.02]
    [0.10  0.98]

of the Markov chain describing the movement of population between the city
and its suburbs is clearly regular, since each entry is positive. On the
other hand, the transition matrix A of the Markov chain describing junior
college enrollments is not regular. [It is easy to show that the first column
of A^m is

    [1]
    [0]
    [0]
    [0]

for any m; hence, for instance, (A^m)_{41} is never positive.] Observe that a
regular transition matrix may contain zero entries; for example,

    M = [0.9  0.5  0  ]
        [0.1  0    0.4]
        [0    0.5  0.6]

is regular since every entry of M² is positive.
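The regularity test of the definition is mechanical: raise the matrix to successive powers and look for a power whose entries are all positive. A minimal sketch (plain Python; the cutoff of 50 powers is my own choice, and the 3 × 3 matrix is the regular matrix M from Example 2):

```python
def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_regular(A, max_power=50):
    """Return True if some power A^m (m <= max_power) has all positive entries."""
    P = A
    for _ in range(max_power):
        if all(entry > 0 for row in P for entry in row):
            return True
        P = mat_mul(P, A)
    return False

M = [[0.9, 0.5, 0.0],
     [0.1, 0.0, 0.4],
     [0.0, 0.5, 0.6]]

regular_M = is_regular(M)          # True: every entry of M^2 is positive
flip = [[0.0, 1.0], [1.0, 0.0]]
regular_flip = is_regular(flip)    # False: powers alternate between flip and I
```

The cutoff is harmless in practice: for an n × n regular transition matrix, a positive power appears well before n² iterations.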

264                                              Chap. 5  Diagonalization

In the remainder of this section we are concerned primarily with regular
transition matrices. We will show that if A is a regular transition matrix,
then L = lim_{m→∞} A^m exists and the columns of L are identical. (Recall the
appearance of L in the city-suburb problem.) From this fact it is easy to
compute the limit. In the course of proving this result, we obtain some
interesting bounds for the magnitudes of the eigenvalues of any square
matrix. These bounds are given in terms of the sums of the absolute values of
the rows and columns of the matrix. The necessary terminology is introduced
in the definitions below.

Definitions. Let A ∈ M_{n×n}(C). Define ρ_i(A) to be the sum of the absolute
values of the entries of row i of A, and ν_j(A) to be the sum of the absolute
values of the entries of column j of A. Thus

    ρ_i(A) = Σ_{j=1}^{n} |A_{ij}|   for i = 1, 2, ..., n

and

    ν_j(A) = Σ_{i=1}^{n} |A_{ij}|   for j = 1, 2, ..., n.

The row sum of A, denoted ρ(A), and the column sum of A, denoted ν(A), are
defined as

    ρ(A) = max{ρ_i(A): 1 ≤ i ≤ n}   and   ν(A) = max{ν_j(A): 1 ≤ j ≤ n}.

Example 3

For the 3 × 3 matrix A of this example, the row sums give ρ(A) = 10, while
the column sums are ν_1(A) = 8, ν_2(A) = 3, and ν_3(A) = 12; hence
ν(A) = 12.

Our next results show that the smaller of ρ(A) and ν(A) is an upper bound for
the absolute values of the eigenvalues of A. In the preceding example, for
instance, A has no eigenvalue with absolute value greater than 10.

Theorem 5.21 (Gerschgorin's Disk Theorem). Let A ∈ M_{n×n}(C). For
1 ≤ i ≤ n, define r_i = ρ_i(A) − |A_{ii}|, and let C_i denote the disk in the
complex plane centered at A_{ii} of radius r_i. Then each eigenvalue of A
lies in some disk C_i.

Proof. Let λ be an eigenvalue of A with corresponding eigenvector

    x = (x_1, x_2, ..., x_n).

Then x satisfies the matrix equation Ax = λx, which can be written as

    Σ_{j=1}^{n} A_{ij} x_j = λ x_i    (i = 1, 2, ..., n).              (4)

Suppose that x_k is the coordinate of x having the largest absolute value,
and note that x_k ≠ 0 because x is an eigenvector of A.

We will show that λ lies in C_k, that is, |λ − A_{kk}| ≤ r_k. It follows from
equation (4) that

    |λx_k − A_{kk}x_k| = | Σ_{j≠k} A_{kj}x_j |
                       ≤ Σ_{j≠k} |A_{kj}| |x_j|
                       ≤ Σ_{j≠k} |A_{kj}| |x_k| = |x_k| r_k.

Thus

    |x_k| |λ − A_{kk}| ≤ |x_k| r_k;

so |λ − A_{kk}| ≤ r_k because |x_k| > 0.  ∎
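Gerschgorin's theorem is easy to check numerically. The sketch below (plain Python) computes the row disks; the 2 × 2 symmetric matrix and its eigenvalues 1 and 3 are my own illustration, not from the text:

```python
def gerschgorin_disks(A):
    """Return the (center, radius) pair for each row of A."""
    disks = []
    for i, row in enumerate(A):
        radius = sum(abs(a) for j, a in enumerate(row) if j != i)
        disks.append((row[i], radius))
    return disks

# A = [[2, 1], [1, 2]] has eigenvalues 1 and 3 (eigenvectors (1, -1) and (1, 1)).
A = [[2.0, 1.0], [1.0, 2.0]]
disks = gerschgorin_disks(A)       # both disks: center 2, radius 1

eigenvalues = [1.0, 3.0]
covered = all(any(abs(lam - c) <= r for (c, r) in disks) for lam in eigenvalues)
```

Both eigenvalues lie on the boundary of the disk centered at 2 of radius 1, so the bound of the theorem is sharp here.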

Corollary 1. Let λ be any eigenvalue of A ∈ M_{n×n}(C). Then |λ| ≤ ρ(A).

Proof. By Gerschgorin's disk theorem, |λ − A_{kk}| ≤ r_k for some k. Hence

    |λ| = |(λ − A_{kk}) + A_{kk}| ≤ |λ − A_{kk}| + |A_{kk}|
        ≤ r_k + |A_{kk}| = ρ_k(A) ≤ ρ(A).  ∎

Corollary 2. Let λ be any eigenvalue of A ∈ M_{n×n}(C). Then

    |λ| ≤ min{ρ(A), ν(A)}.

Proof. Since |λ| ≤ ρ(A) by Corollary 1, it suffices to show that
|λ| ≤ ν(A). But Exercise 14 of Section 5.1 shows that λ is an eigenvalue of
A^t. The rows of A^t are the columns of A; thus ρ(A^t) = ν(A). So by
Corollary 1, |λ| ≤ ρ(A^t) = ν(A).  ∎

The following conclusion is immediate from Corollary 2.

Corollary 3. If λ is an eigenvalue of a transition matrix, then |λ| ≤ 1.

The next result shows that the upper bound in Corollary 3 is attained.

Theorem 5.22. Every transition matrix has 1 as an eigenvalue.

Proof. Let A be an n × n transition matrix, and let u ∈ Rⁿ be the column
vector in which each coordinate is 1. Then A^t u = u by Theorem 5.20, and
hence u is an eigenvector of A^t corresponding to the eigenvalue 1. But since
A and A^t have the same eigenvalues, it follows that 1 is also an eigenvalue
of A.  ∎

Suppose now that A is a transition matrix for which some eigenvector
corresponding to the eigenvalue 1 has only nonnegative coordinates. Then some
multiple of this vector will be a probability vector P as well as an
eigenvector of A corresponding to the eigenvalue 1. It is interesting to
observe that if P is the initial probability vector of a Markov chain having
A as its transition matrix, then the Markov chain is completely static in
this situation: A^m P = P for every positive integer m, and hence the
probability of being in each state never changes. Consider, for instance, the
city-suburb problem with such an initial probability vector.
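The proof of Theorem 5.22 rests on the fact that each column of a transition matrix sums to 1, so the all-ones vector u satisfies A^t u = u. A quick numerical check (plain Python; the matrix is the city-suburb matrix of Example 2):

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# City-suburb transition matrix: each column sums to 1.
A = [[0.90, 0.02],
     [0.10, 0.98]]

u = [1.0, 1.0]
At_u = mat_vec(transpose(A), u)   # equals u, so 1 is an eigenvalue of A^t
check = all(abs(x - 1.0) < 1e-12 for x in At_u)
```

Since A and A^t have the same characteristic polynomial, 1 is therefore also an eigenvalue of A, exactly as in the proof.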

Theorem 5.23. Let A ∈ M_{n×n}(C) be a matrix in which each entry is positive,
and let λ be an eigenvalue of A such that |λ| = ρ(A). Then λ = ρ(A), and
{u} is a basis for E_λ, where

    u = (1, 1, ..., 1).

Proof. Let x be an eigenvector of A corresponding to λ, let x_k be the
coordinate of x having the largest absolute value, and let b = |x_k|. Then

    |λ|b = |λx_k| = | Σ_{j=1}^{n} A_{kj}x_j | ≤ Σ_{j=1}^{n} |A_{kj}| |x_j|
         ≤ Σ_{j=1}^{n} |A_{kj}| b = ρ_k(A) b ≤ ρ(A) b.               (5)

Since |λ| = ρ(A), the three inequalities in (5) are actually equalities;
that is,

(a) | Σ_j A_{kj}x_j | = Σ_j |A_{kj}x_j|,
(b) Σ_j |A_{kj}| |x_j| = Σ_j |A_{kj}| b, and
(c) ρ_k(A) = ρ(A).

We will see in Exercise 15(b) of Section 6.1 that (a) holds if and only if
all the terms A_{kj}x_j (j = 1, 2, ..., n) are nonnegative multiples of some
nonzero complex number z. Without loss of generality we assume that |z| = 1.
Thus there exist nonnegative real numbers c_1, ..., c_n such that

    A_{kj}x_j = c_j z.                                               (6)

Moreover, (b) holds if and only if |x_j| = b for each j; for since each entry
of A is assumed to be positive and b − |x_j| ≥ 0 for every j, the equation

    Σ_j |A_{kj}| (b − |x_j|) = 0

forces

    |x_j| = b   for j = 1, 2, ..., n.                                (7)

From (6), each x_j = (c_j / A_{kj}) z is a nonnegative multiple of z, and
hence by (7)

    x_j = bz   for j = 1, 2, ..., n.

So

    x = bz (1, 1, ..., 1) = bz·u;

that is, every eigenvector of A corresponding to λ is a multiple of u. In
particular, u itself is such an eigenvector, and therefore {u} is a basis for
E_λ. Finally, comparing the kth coordinates of Au = λu gives

    λ = Σ_{j=1}^{n} A_{kj} = ρ_k(A) = ρ(A)

by (c). Thus λ = ρ(A), and the preceding paragraph shows that any eigenvector
of A corresponding to this eigenvalue is a multiple of u.  ∎

Corollary 1. Let A ∈ M_{n×n}(C) be a matrix in which each entry is positive,
and let λ be an eigenvalue of A such that |λ| = ν(A). Then λ = ν(A), and the
dimension of E_λ is 1.

Proof. Exercise.  ∎

Corollary 2. Let A ∈ M_{n×n}(C) be a transition matrix in which each entry is
positive, and let λ be any eigenvalue of A other than 1. Then |λ| < 1.
Moreover, the dimension of the eigenspace corresponding to the eigenvalue 1
is 1.

Proof. Exercise.  ∎

Our next result extends Corollary 2 to regular transition matrices and thus
shows that regular transition matrices satisfy condition (a) of Theorems 5.18
and 5.19.

Theorem 5.24. Let A be a regular transition matrix, and let λ be an
eigenvalue of A. Then
(a) |λ| ≤ 1.
(b) If |λ| = 1, then λ = 1, and dim(E_λ) = 1.

Proof. Statement (a) was proved as Corollary 3 of Theorem 5.21.
(b) Since A is regular, there exists a positive integer s such that A^s
contains only positive entries. Because A is a transition matrix and the
entries of A are nonnegative, the entries of A^{s+1} = A^s A are also
positive. Suppose that |λ| = 1. Then λ^s and λ^{s+1} are eigenvalues of A^s
and A^{s+1}, respectively, having absolute value 1. So by Corollary 2 of
Theorem 5.23,

    λ^s = λ^{s+1} = 1,

and thus λ = 1. Let E_λ and E′_λ denote the eigenspaces of A and A^s,
respectively, corresponding to λ = 1. Then E_λ ⊆ E′_λ, but E′_λ has
dimension 1 by Corollary 2 of Theorem 5.23. Hence dim(E_λ) = 1.  ∎

Corollary. Let A be a regular transition matrix that is diagonalizable. Then
lim_{m→∞} A^m exists.

The preceding corollary, which follows immediately from Theorems 5.24 and
5.19, is not the best possible result. In fact, it can be shown that if A is
a regular transition matrix, then the multiplicity of 1 as an eigenvalue of A
is 1. Thus, by Theorem 5.12, condition (b) of Theorem 5.18 is also satisfied.
So if A is a regular transition matrix, lim_{m→∞} A^m exists whether A is
diagonalizable or not. The fact that the multiplicity of 1 as an eigenvalue
of A is 1 cannot be proved at this time, however. Nevertheless, we state this
stronger result here (leaving the proof until Exercise 20 of Section 7.2) and
deduce further facts about lim_{m→∞} A^m when A is a regular transition
matrix.

Theorem 5.25. Let A be an n × n regular transition matrix. Then
(a) The multiplicity of 1 as an eigenvalue of A is 1.
(b) lim_{m→∞} A^m exists.
(c) L = lim_{m→∞} A^m is a transition matrix.
(d) AL = LA = L.
(e) The columns of L are identical. In fact, each column of L is equal to the
    unique probability vector v that is also an eigenvector of A
    corresponding to the eigenvalue 1.
(f) For any probability vector x, lim_{m→∞} (A^m x) = v.

Proof. (a) See Exercise 20 of Section 7.2.
(b) This follows from part (a) and Theorems 5.24 and 5.18.
(c) Since A^m is a transition matrix for each m by the corollary to Theorem
5.20, each entry of A^m is nonnegative; hence

    (A^m)_{ij} ≥ 0   for 1 ≤ i, j ≤ n   (m = 1, 2, 3, ...),

and so L_{ij} = lim_{m→∞} (A^m)_{ij} ≥ 0. Moreover,

    Σ_{i=1}^{n} L_{ij} = Σ_{i=1}^{n} lim_{m→∞} (A^m)_{ij}
                       = lim_{m→∞} Σ_{i=1}^{n} (A^m)_{ij}
                       = lim_{m→∞} (1) = 1   for 1 ≤ j ≤ n.

Thus L is a transition matrix.
(d) By Theorem 5.17,

    AL = A ( lim_{m→∞} A^m ) = lim_{m→∞} (A·A^m) = lim_{m→∞} A^{m+1} = L.

Similarly, LA = L.
(e) Since AL = L by part (d), each column of L is an eigenvector of A
corresponding to the eigenvalue 1. Moreover, by part (c), each column of L is
a probability vector. Because the eigenspace corresponding to the eigenvalue
1 is one-dimensional by part (a), it contains exactly one probability vector;
thus each column of L is equal to the unique probability vector v
corresponding to the eigenvalue 1 of A.
(f) Let x be any probability vector, and set

    y = Lx = ( lim_{m→∞} A^m ) x = lim_{m→∞} (A^m x).

Then y is a probability vector (corollary to Theorem 5.20), and also

    Ay = ALx = Lx = y

by part (d). Hence y is an eigenvector of A corresponding to the eigenvalue 1
of A. So y = v by part (e).  ∎

Definition. The vector v in Theorem 5.25(e) is called the fixed probability
vector (or stationary vector) of the regular transition matrix A.

Theorem 5.25 can be used to deduce information about the eventual percentage
in each state of a Markov chain having a regular transition matrix.
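Theorem 5.25(f) also gives a practical way to approximate the fixed probability vector: start from any probability vector and apply A repeatedly. A minimal sketch (plain Python; the tolerance, iteration cap, and the example matrix — the regular matrix M of Example 2 — are my own choices):

```python
def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def fixed_probability_vector(A, tol=1e-12, max_iter=10_000):
    """Approximate the fixed probability vector of a regular transition
    matrix A by iterating x -> Ax (Theorem 5.25(f))."""
    n = len(A)
    x = [1.0 / n] * n                  # any probability vector will do
    for _ in range(max_iter):
        y = mat_vec(A, x)
        if max(abs(a - b) for a, b in zip(x, y)) < tol:
            return y
        x = y
    return x

M = [[0.9, 0.5, 0.0],
     [0.1, 0.0, 0.4],
     [0.0, 0.5, 0.6]]

v = fixed_probability_vector(M)
row_sum = sum(v)                       # ~1, since v is a probability vector
```

For this M the exact fixed vector is (20/29, 4/29, 5/29), which the iteration reproduces to machine precision.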

Example 4

A survey in ancient Persia showed that on a particular day 50% of the
Persians preferred a loaf of bread, 30% preferred a jug of wine, and 20%
preferred thou beside me in the wilderness. A subsequent survey 1 month later
yielded the following data: Of those who preferred a loaf of bread on the
first survey, 40% continued to prefer a loaf of bread, 10% now preferred a
jug of wine, and 50% now preferred thou; of those who preferred a jug of wine
on the first survey, 20% now preferred a loaf of bread, 70% continued to
prefer a jug of wine, and 10% now preferred thou; of those who preferred thou
on the first survey, 20% now preferred a loaf of bread, 20% now preferred a
jug of wine, and 60% continued to prefer thou. Assuming that this trend
continues each month, we can predict the percentage of Persians in each state
in each month following the original survey.

The situation described in the preceding paragraph is a three-state Markov
chain in which the states are the three possible preferences. Letting the
first, second, and third states correspond to the preferences for bread,
wine, and thou, respectively, we see that the vector that gives the initial
probability of being in each state is

    P = [0.50]
        [0.30]
        [0.20]

and the transition matrix is

    A = [0.40  0.20  0.20]
        [0.10  0.70  0.20]
        [0.50  0.10  0.60].

The probabilities of being in each state after m months are the coordinates
of the vector A^m P. The reader may check that

    AP = [0.30]     A²P = A(AP) = [0.26]     A³P = A(A²P) = [0.252]
         [0.30]                   [0.32]                    [0.334]
         [0.40],                  [0.42],                   [0.414],

and

    A⁴P = A(A³P) = [0.2504]
                   [0.3418]
                   [0.4078].

Note the seeming convergence of A^m P.

Since A is regular, the long-range prediction concerning the Persians'
preferences can be found by computing the fixed probability vector for A.
This vector is the unique probability vector v such that (A − I)v = 0.
Letting

    v = [v_1]
        [v_2]
        [v_3],

the matrix equation (A − I)v = 0 yields the following system of linear
equations:

    −0.60v_1 + 0.20v_2 + 0.20v_3 = 0
     0.10v_1 − 0.30v_2 + 0.20v_3 = 0
     0.50v_1 + 0.10v_2 − 0.40v_3 = 0.

It is easily shown that

    [5]
    [7]
    [8]

is a basis for the solution space of this system. Hence the unique fixed
probability vector for A is

    [5/(5 + 7 + 8)]   [0.25]
    [7/(5 + 7 + 8)] = [0.35]
    [8/(5 + 7 + 8)]   [0.40].

Thus, in the long run, 25% of the Persians will prefer a loaf of bread, 35%
will prefer a jug of wine, and 40% will prefer thou beside me in the
wilderness.

Note that if

    Q = [5   0   3]
        [7   1   1]
        [8  −1  −4],

then

    Q⁻¹AQ = [1  0    0  ]
            [0  0.5  0  ]
            [0  0    0.2].

So

    lim_{m→∞} A^m = lim_{m→∞} Q [1  0      0    ] Q⁻¹ = Q [1  0  0] Q⁻¹
                                [0  0.5^m  0    ]         [0  0  0]
                                [0  0      0.2^m]         [0  0  0]

                  = [0.25  0.25  0.25]
                    [0.35  0.35  0.35]
                    [0.40  0.40  0.40].
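The convergence of A^m P toward the fixed probability vector (0.25, 0.35, 0.40) in Example 4 is easy to reproduce; a short sketch in plain Python:

```python
def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[0.40, 0.20, 0.20],
     [0.10, 0.70, 0.20],
     [0.50, 0.10, 0.60]]
P = [0.50, 0.30, 0.20]

x = P
history = []
for m in range(1, 5):
    x = mat_vec(A, x)
    history.append([round(c, 4) for c in x])
# history[0] is AP and history[3] is A^4 P, as in the text.

for _ in range(100):                   # iterate further toward the limit
    x = mat_vec(A, x)
limit = [round(c, 4) for c in x]       # approaches [0.25, 0.35, 0.40]
```

The first four iterates agree with the vectors AP through A⁴P displayed above, and a hundred more applications of A land on the fixed probability vector to four decimal places.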


Example 5

Farmers in Lamron plant one crop per year—either corn, soybeans, or wheat.
Because they believe in the necessity of rotating their crops, these farmers
will not plant the same crop in successive years. In fact, of the total
acreage on which a particular crop is planted, exactly half will be planted
with each of the other two crops during the succeeding year. This year 300
acres of corn were planted, 200 acres of soybeans were planted, and 100 acres
of wheat were planted.

The situation described in the preceding paragraph is another three-state
Markov chain in which the three states correspond to the planting of corn,
soybeans, and wheat, respectively. In this problem, however, the amount of
land devoted to each crop, rather than the percentage of the total acreage
(600 acres), was given. By converting these amounts into fractions of the
total acreage, we see that the transition matrix A and the initial
probability vector P of the Markov chain are

    A = [0    1/2  1/2]            [300/600]   [1/2]
        [1/2  0    1/2]    and P = [200/600] = [1/3]
        [1/2  1/2  0  ]            [100/600]   [1/6].

The fraction of the total acreage devoted to each crop after m years is given
by the coordinates of A^m P, and the eventual proportions of the total
acreage devoted to each crop are the coordinates of lim_{m→∞} A^m P. Thus the
eventual amounts of land devoted to each crop are found by multiplying this
limit by the total acreage; i.e., the eventual amounts of land used for each
crop are the coordinates of 600( lim_{m→∞} A^m P ).

Since A is a regular transition matrix, Theorem 5.25 shows that lim_{m→∞} A^m
is a matrix L in which each column equals the unique fixed probability vector
for A. It is easily seen that the fixed probability vector for A is

    [1/3]
    [1/3]
    [1/3].

Hence

    600 ( lim_{m→∞} A^m P ) = 600LP = [200]
                                      [200]
                                      [200].

Thus, in the long run, we expect 200 acres of each crop to be planted each
year. (For a direct computation of 600( lim_{m→∞} A^m P ), see Exercise 14.)
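Iterating the Lamron chain shows the acreages settling at 200 acres per crop, in agreement with the closed form asked for in Exercise 14. A quick sketch in plain Python (the 60-iteration cutoff is my own choice):

```python
def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5],
     [0.5, 0.5, 0.0]]

acres = [300.0, 200.0, 100.0]          # 600 * P
for m in range(60):
    acres = mat_vec(A, acres)

eventual = [round(a) for a in acres]   # settles at 200 acres per crop
```

After the first year the acreages are (150, 200, 250); the deviation from 200 is then halved in magnitude (and flipped in sign) each year, so 60 iterations leave it far below rounding error.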

In this section we have concentrated primarily on the theory of regular
transition matrices. There is another interesting class of transition
matrices—those that can be represented in the form

    [I  B]
    [O  C],

where I is an identity matrix and O is a zero matrix. (Such transition
matrices are not regular, since the lower left block remains O in any power
of the matrix.) The states corresponding to the identity submatrix are called
absorbing states because such a state is never left once it is entered. A
Markov chain is called an absorbing Markov chain if it is possible to go from
an arbitrary state into an absorbing state in a finite number of stages.
Observe that the Markov chain that describes the enrollment pattern in a
junior college is an absorbing Markov chain with states 1 and 4 as its
absorbing states. Readers interested in learning more about absorbing Markov
chains are referred to Introduction to Finite Mathematics (third edition) by
J. Kemeny, J. Snell, and G. Thompson (Prentice-Hall, Inc., Englewood Cliffs,
N.J., 1974) or Discrete Mathematical Models by Fred S. Roberts
(Prentice-Hall, Inc., Englewood Cliffs, N.J., 1976).

An Application

In species that reproduce sexually, the characteristics of an offspring with
respect to a particular genetic trait are determined by a pair of genes, one
inherited from each parent. The genes for a particular trait are of two
types, which we denote by G and g. The gene G represents the dominant
characteristic, and g represents the recessive characteristic. Offspring with
genotypes GG or Gg exhibit the dominant characteristic, whereas offspring
with genotype gg exhibit the recessive characteristic. For example, in
humans, brown eyes are a dominant characteristic and blue eyes are the
corresponding recessive characteristic; thus offspring with genotypes GG or
Gg are brown-eyed, whereas those of type gg will be blue-eyed.

Let us consider the probability of offspring of each genotype for a male
parent of genotype Gg. (We assume that the population under consideration is
large, that mating is random with respect to genotype, and that the
distribution of each genotype within the population is independent of sex and
life expectancy.) Let p, q, and r denote the proportions of the adult
population with genotypes GG, Gg, and gg, respectively, at the start of the
experiment. This experiment describes a three-state Markov chain with
transition matrix

                            Genotype of female parent
                                 GG    Gg    gg
                           GG  [ 1/2   1/4   0  ]
    B =   Genotype of      Gg  [ 1/2   1/2   1/2]
          offspring        gg  [ 0     1/4   1/2].

It is easily checked that B² contains only positive entries; so B is regular.
Thus, by permitting only males of genotype Gg to reproduce, the proportion of
offspring in the population having a certain genotype would stabilize at the
fixed probability vector for B.

Now suppose that similar experiments are to be performed with males of
genotypes GG and gg. As above, these experiments describe three-state Markov
chains with transition matrices

    A = [1  1/2  0]            [0  0    0]
        [0  1/2  1]    and C = [1  1/2  0]
        [0  0    0]            [0  1/2  1],

respectively. In order to consider the case where all males are permitted to
reproduce, we must form the transition matrix M = pA + qB + rC, which is the
linear combination of A, B, and C weighted by the proportions of males having
the respective genotypes. Thus

    M = [p + q/2    p/2 + q/4          0      ]
        [q/2 + r    p/2 + q/2 + r/2    p + q/2]
        [0          q/4 + r/2          q/2 + r].

To simplify the notation, let a = p + q/2 and b = q/2 + r, where a and b
represent the proportions of G and g genes, respectively, in the population.
(The numbers a and b satisfy a + b = p + q + r = 1.) Then

    M = [a  a/2  0]
        [b  1/2  a]
        [0  b/2  b].

Let P = (p, q, r) denote the initial probability vector. Then the proportions
of the first-generation offspring having the three genotypes GG, Gg, and gg,
respectively, are the coordinates of

    MP = [ap + aq/2    ]   [a(p + q/2)]   [a² ]
         [bp + q/2 + ar] = [    ·     ] = [2ab]
         [bq/2 + br    ]   [b(q/2 + r)]   [b² ].

(The middle coordinate reduces to 2ab because p + q + r = 1.)

In order to consider the effects of unrestricted matings among the
first-generation offspring, a new transition matrix M′ must be determined
based upon the distribution of first-generation genotypes. As before, we find
that

    M′ = [a′  a′/2  0 ]
         [b′  1/2   a′]
         [0   b′/2  b′],

where

    a′ = p′ + q′/2   and   b′ = q′/2 + r′

with p′ = a², q′ = 2ab, and r′ = b². But

    a′ = a² + (1/2)(2ab) = a(a + b) = a
and
    b′ = (1/2)(2ab) + b² = b(a + b) = b.

Thus M′ = M; so the distribution of second-generation offspring among the
three genotypes is

    M²P = M(MP) = [a³ + a²b      ]   [a²(a + b)    ]   [a² ]
                  [a²b + ab + ab²] = [ab(a + 1 + b)] = [2ab] = MP,
                  [ab² + b³      ]   [b²(a + b)    ]   [b² ]

the same as the distribution of the first-generation offspring. In other
words, MP is the fixed probability vector for M, and genetic equilibrium is
achieved in the population after only one generation. (This result is called
the Hardy-Weinberg law.) Notice that in the important special case where
a = b (or, equivalently, where p = r), the distribution at equilibrium is

    MP = [1/4]
         [1/2]
         [1/4].
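The one-generation equilibrium is easy to confirm numerically: for any starting proportions p, q, r, applying M once gives (a², 2ab, b²), and applying it again changes nothing. A sketch in plain Python using exact rational arithmetic (the sample proportions p = 1/2, q = 1/3, r = 1/6 are my own choice):

```python
from fractions import Fraction as F

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

p, q, r = F(1, 2), F(1, 3), F(1, 6)    # any proportions with p + q + r = 1
a = p + q / 2                           # frequency of gene G
b = q / 2 + r                           # frequency of gene g

M = [[a, a / 2, 0],
     [b, F(1, 2), a],
     [0, b / 2, b]]

P = [p, q, r]
MP = mat_vec(M, P)
M2P = mat_vec(M, MP)

# Hardy-Weinberg: MP = (a^2, 2ab, b^2), and M(MP) = MP (equilibrium).
hardy_weinberg = (MP == [a * a, 2 * a * b, b * b]) and (M2P == MP)
```

Using `Fraction` avoids floating-point noise, so the equality M²P = MP is checked exactly rather than approximately.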

EXERCISES

1. Label the following statements as being true or false.
   (a) If A ∈ M_{n×n}(C) and lim_{m→∞} A^m = L, then, for any invertible
       matrix Q ∈ M_{n×n}(C), lim_{m→∞} QA^mQ⁻¹ = QLQ⁻¹.
   (b) If 2 is an eigenvalue of A ∈ M_{n×n}(C), then lim_{m→∞} A^m does not
       exist.
   (c) Any vector (x_1, ..., x_n) ∈ Rⁿ such that x_1 + ··· + x_n = 1 is a
       probability vector.
   (d) The sum of the entries in each row of a transition matrix equals 1.
   (e) The product of a transition matrix and a probability vector is a
       probability vector.
   (f) Every transition matrix has 1 as an eigenvalue.
   (g) No transition matrix can have 3 as an eigenvalue.
   (h) No transition matrix can have −1 as an eigenvalue.
   (i) If A is a transition matrix, then lim_{m→∞} A^m exists.
   (j) If A is a regular transition matrix, then lim_{m→∞} A^m exists and
       has rank 1.

2. Determine whether or not lim_{m→∞} A^m exists for each of the given
   matrices A. If the limit exists, compute it.

3. Prove that if A_1, A_2, ... is a sequence of n × p matrices with complex
   entries such that lim_{m→∞} A_m = L, then lim_{m→∞} A_m^t = L^t.

4. Prove that if A ∈ M_{n×n}(C) is diagonalizable and L = lim_{m→∞} A^m
   exists, then either L = I_n or rank(L) < n.

5. Find 2 × 2 matrices A and B having real entries such that lim_{m→∞} A^m,
   lim_{m→∞} B^m, and lim_{m→∞} (AB)^m all exist, but

       lim_{m→∞} (AB)^m ≠ ( lim_{m→∞} A^m )( lim_{m→∞} B^m ).

6. A hospital trauma unit has determined that 30% of its patients are
   ambulatory and 70% are bedridden at the time of arrival at the hospital.
   A month after arrival, 60% of the ambulatory patients have recovered, 20%
   remain ambulatory, and 20% have become bedridden. After the same amount of
   time, 10% of the bedridden patients have recovered, 20% have become
   ambulatory, 50% remain bedridden, and 20% have died. Determine the
   percentages of patients who have recovered, are ambulatory, are bedridden,
   and have died 1 month after arrival. Also determine the eventual
   percentages of patients of each type.

7. A player begins a game of chance by placing a marker in the box marked
   Start (see Figure 5.4). A die is rolled, and the marker is moved one
   square to the left if a 1 or 2 is rolled and one square to the right if a
   3, 4, 5, or 6 is rolled. This process continues until the marker lands in
   square 1 (in which case the player loses the game) or in square 4 (in
   which case the player wins the game). What is the probability of winning
   this game?

       Figure 5.4: a row of four squares numbered 1 through 4; square 1 is
       marked Lose, square 2 is marked Start, and square 4 is marked Win.

8. Which of the given transition matrices are regular?

9. Compute lim_{m→∞} A^m, if it exists, for each of the matrices A in
   Exercise 8.

10. Each of the given matrices is a regular transition matrix for a
    three-state Markov chain, and an initial probability vector is given in
    each case. For each transition matrix, compute the proportion of objects
    in each state after two stages and the eventual proportion of objects in
    each state by determining the fixed probability vector.

11. In 1940 a county land-use survey showed that 10% of the county land was
    urban, 50% was unused, and 40% was agricultural. Five years later a
    follow-up survey revealed that 70% of the urban land had remained urban,
    10% had become unused, and 20% had become agricultural. Likewise, 20% of
    the unused land had become urban, 60% had remained unused, and 20% had
    become agricultural. Finally, the 1945 survey showed that 20% of the
    agricultural land had become unused, while 80% remained agricultural.
    Assuming that the trends indicated by the 1945 survey continue, compute
    the percentages of urban, unused, and agricultural land in the county in
    1950 and the corresponding eventual percentages.

12. A diaper liner is placed in each diaper worn by a baby. If, after a
    diaper change, the liner is soiled, then it is discarded. Otherwise the
    liner is washed with the diapers and reused, except that each liner is
    discarded after its third use (even if it has never been soiled). The
    probability that the baby will soil any diaper liner is one-third. If
    there are only new diaper liners at first, eventually what proportion of
    the diaper liners being used will be new, once-used, and twice-used?

13. In 1975 the automobile industry determined that 40% of American car
    owners drove large cars, 20% drove intermediate-sized cars, and 40%
    drove small cars. A second survey in 1985 showed that 70% of the
    large-car owners in 1975 still owned large cars in 1985, but 30% had
    changed to an intermediate-sized car. Of those who owned
    intermediate-sized cars in 1975, 10% had switched to large cars, 70%
    continued to drive intermediate-sized cars, and 20% had changed to small
    cars in 1985. Finally, of the small-car owners in 1975, 10% owned
    intermediate-sized cars and 90% owned small cars in 1985. Assuming that
    these trends continue, determine the percentages of Americans who will
    own cars of each size in 1995 and the corresponding eventual percentages.

14. Show that if A and P are as in Example 5, then the diagonal entries of
    A^m equal

        (1/3) [ 1 + (−1)^m / 2^{m−1} ]

    and the off-diagonal entries of A^m equal

        (1/3) [ 1 + (−1)^{m+1} / 2^m ].

    Deduce that

        600(A^m P) = [200 + ((−1)^m / 2^m)(100)    ]
                     [200                          ]
                     [200 + ((−1)^{m+1} / 2^m)(100)].

15. Prove Theorem 5.20 and its corollary.

16. Prove the two corollaries of Theorem 5.23.

17. Prove the corollary to Theorem 5.24.

18. Definition. If A ∈ M_{n×n}(C), define e^A = lim_{m→∞} B_m, where

        B_m = I + A + A²/2! + ··· + A^m/m!

    (see Exercise 20). Thus e^A is the sum of the infinite series

        I + A + A²/2! + A³/3! + ···,

    and B_m is the mth partial sum of this series. (Note the analogy with
    the power series

        e^a = 1 + a + a²/2! + a³/3! + ···,

    which is valid for all complex numbers a.) Compute e^O and e^I, where O
    and I denote the n × n zero and identity matrices, respectively.

19. Let A ∈ M_{n×n}(C) be diagonalizable, and suppose that P⁻¹AP is a
    diagonal matrix D. Prove that e^A = Pe^D P⁻¹.

20. Let A ∈ M_{n×n}(C) be diagonalizable. Use the result of Exercise 19 to
    show that e^A exists. (Exercise 21 of Section 7.2 will show that e^B
    exists for each B ∈ M_{n×n}(C).)

21. Find A, B ∈ M_{2×2}(R) such that e^A e^B ≠ e^{A+B}.

22. Prove that a differentiable function X: R → Rⁿ is a solution to the
    system of differential equations defined in Exercise 14 of Section 5.2
    if and only if X(t) = e^{tA}v for some v ∈ Rⁿ, where A is defined as in
    that exercise.
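The matrix exponential of Exercise 18 can be approximated directly from its partial sums B_m. A minimal sketch in plain Python (the cutoff m = 20 is my own choice; for the zero and identity matrices the answers e^O = I and e^I = eI are exact):

```python
import math

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, terms=20):
    """Approximate e^A by the partial sum I + A + A^2/2! + ... + A^m/m!."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]    # I
    power = [row[:] for row in result]                                # A^0
    for m in range(1, terms + 1):
        power = mat_mul(power, A)
        result = [[result[i][j] + power[i][j] / math.factorial(m)
                   for j in range(n)] for i in range(n)]
    return result

O = [[0.0, 0.0], [0.0, 0.0]]
I2 = [[1.0, 0.0], [0.0, 1.0]]

exp_O = mat_exp(O)            # the identity matrix
exp_I = mat_exp(I2)           # e times the identity matrix
```

Since the terms A^m/m! shrink factorially, twenty terms already put the truncation error far below machine precision for matrices of modest norm.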

5.4 INVARIANT

SUBSPACES

CAYLEY-HAMILTOW

THE

AMD

THEOREM

In Section

5.1 we observed

that if x is

T maps the span of {x}into


are of great importance in
the

itself.

study

an

that

Subspaces

of

of

eigenvector

linear

are

a linear

mapped

operators

T, then
operator
into themselves

26 of

(see, e.g., Exercise

Section 2.1).

Definition.
of

V is

all

Let

be

called a T-invariant

e W.

a linear

on a vector

operator

subspace of

if

T(W)

c W,

space V. subspace
that is, if T(x) e

for

Sec. 5.4

Invariant

The

and

Subspaces

281

Theorem

Cayley-Hamilton

Example

Suppose that T is a linearoperatoron a


subspaces of V are T-invariant.

vector

space

the

V.

Then

as

exercises.

following

(a) {0}
V

(b)

The

(c)

R(T)

(d)

N(T)

(e)

EA) for any

of
T
eigenvalue
that these subspaces are T-invariant are

proofs

Let T: R3 -*

R3 be denned

by

b,

T(a,

are

the

xj>-plane

T-invariant

Let T be

element of

That

x.

W (see the

contain

vector

of

subspaces

play

of

by x.

V generated

x be a

nonzero

It is a

simple matter to

is,

show

of

exercises).Cyclicsubspaces

We

uses.

various

have

apply

them

In
establish the Cayley-Hamilton
the
we
exercises
for using cyclic subspaces to compute
characteristic
a linear
without resorting to determinants. Cyclic
operator
an
role
in Chapter
important
7, where we study matrix
theorem.

the

operators.

Example

T: R3

-\342\226\272
R3

denned

be

by

T(a,
will

the

determine

T(fil) = T(l, 0, 0) =
-eu

let

is the \"smallest\"
T-invariantsubspace V
x must
also
any T-invariant subspace of
containing

linear

We

and

T(x), T2(x),...})

span({x,

representationsof nondiagonalizable

Let

V,

space

in this section to
a method
outline
polynbmial

R}

In fact, W

T-invariant.

containing

{(x, 0, 0): x e

x-axis

the

and

R}

c, 0).

subspace

subspace

is

a linearoperatoron a

is called the J-cyclic


W

b, b +

= (a +

x,

{(x, y, 0):
of R3.
subspaces

The

V,

c)

that

Example

Then

left

=
0)

(~b + c,a

and

e2

T{e& T2^),

+c, 3c).

W generated

subspace

T-cyclic

(0, 1,

= span({ei)

Wei

b, c) =

T2(ex)

...}) =

=
by
(1, 0, 0). Since
=
= T(T(fil)) = T(e2) = (-1, 0,
{(s, t, 0): s, t e R}.
span^,
ex

0)

e2})

Example
4

Let

be

the

linear

operator

the T-cyclicsubspacegenerated

by

on
x2

is

P(jR)

span({x25

denned

by

T(/)

2x, 2}) = P2(R).

= /'.

Then

282

Chap.5

linear

new

T-invariant

is a

domain

linear

is

certain

two

the

how

V, and let

polynomial of

Extend a basis

Proof
xk,...,x\342\200\236}

an

is

where

T and

g(t)

(n

k)

th

=
tln)

9 ^of

Exercise

4.3. Thus g(t)

Section

det^1
by

to

basis

ft

{xl5

...,

Exercise10,

If f(t) is the characteristic polynomialof


of Tw, then
polynomial
~

/(t).= det04 -

xk} for

matrix.

k zero

characteristic

the

is

\342\200\224

of T.

polynomial

...,

{xl9

the characteristic

V. Then

of

= [Tw]r. Then, by
[T]^ and 5X

A =

Let

V.

for

=
y

vector

a finite-dimensional

on

subspace

characteristic

the

divides

Tw

Tw

operator

T-invariant

be

arguethat

As a linear operator inherits


The following result illustrates

linear

Let T be a

5.26.

W (see

linked.

are

operators

Theorem
space

(see the exercises).


its parent operator T.

from

simple matter to

it is a

and

W,

T to

of

Tw

on W

operator

properties

to

from

the restriction

of V, then

subspace

Appendix B) is a mapping
Tw

subspace provides the opportunity to define


is the subspace. If T is a linearoperatoron

T-invariant

whose

operator

and

of a

existence

The

Diagonalization

\302\243/\342\200\236_,)
ff(t);det(B3

divides f(t).

\302\273

Example 5

Let T: R4 -> R4 be defined by

    T(a, b, c, d) = (a + b + 2c - d, b + d, 2c - d, c + d),

and let W = {(t, s, 0, 0): t, s in R}. Observe that W is a T-invariant subspace of R4, for

    T(a, b, 0, 0) = (a + b, b, 0, 0) in W.

Let gamma = {e1, e2}, and note that gamma is a basis for W. Extend gamma to the standard basis beta for R4. Then

    B1 = [T_W]_gamma = ( 1  1 )
                       ( 0  1 )

and

    A = [T]_beta = ( 1  1  2  -1 )
                   ( 0  1  0   1 )
                   ( 0  0  2  -1 )
                   ( 0  0  1   1 )

in the notation of Theorem 5.26. Thus if f(t) is the characteristic polynomial of T and g(t) is the characteristic polynomial of T_W, then

    f(t) = det(A - tI_4) = det ( 1-t  1    2    -1  )
                               ( 0    1-t  0     1  )
                               ( 0    0    2-t  -1  )
                               ( 0    0    1    1-t )

         = det ( 1-t  1   ) * det ( 2-t  -1  )  =  g(t) * det ( 2-t  -1  ).
               ( 0    1-t )       ( 1    1-t )                ( 1    1-t )

Sec. 5.4   Invariant Subspaces and the Cayley-Hamilton Theorem   283

In view of Theorem 5.26 we may use the characteristic polynomial of T_W to gain information about the characteristic polynomial of T itself. In this regard, cyclic subspaces are useful because the characteristic polynomial of the restriction of T to a cyclic subspace is readily computable.
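The divisibility in Theorem 5.26 can be spot-checked numerically for the block upper triangular matrix of the example above (entries as reconstructed here); det and minus_tI are illustrative helpers, and evaluating at a few values of t is a sanity check, not a proof.

```python
# Spot-check det(A - tI) = det(B1 - tI) * det(B3 - tI) for the block
# upper triangular matrix A of the example.
def det(M):
    # cofactor expansion along the first row
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def minus_tI(M, t):
    return [[M[i][j] - (t if i == j else 0) for j in range(len(M))]
            for i in range(len(M))]

A  = [[1, 1, 2, -1], [0, 1, 0, 1], [0, 0, 2, -1], [0, 0, 1, 1]]
B1 = [[1, 1], [0, 1]]      # [T_W] on the invariant subspace
B3 = [[2, -1], [1, 1]]     # lower right block
for t in (0, 1, -2, 3):
    assert det(minus_tI(A, t)) == det(minus_tI(B1, t)) * det(minus_tI(B3, t))
```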
Theorem 5.27. Let T be a linear operator on a finite-dimensional vector space V, and let W denote the T-cyclic subspace of V generated by x in V. Suppose that dim(W) = k >= 1 (and hence x != 0). Then

(a) {x, T(x), T2(x), ..., T^{k-1}(x)} is a basis for W.
(b) If T^k(x) = -a0 x - a1 T(x) - ... - a_{k-1} T^{k-1}(x), then the characteristic polynomial of T_W is

    f(t) = (-1)^k (a0 + a1 t + ... + a_{k-1} t^{k-1} + t^k).

Proof. (a) Let j be the smallest positive integer for which {x, T(x), ..., T^j(x)} is linearly dependent. (Such a j must exist since W is finite-dimensional.) Since x != 0, j >= 1. Thus T^j(x) lies in span({x, T(x), ..., T^{j-1}(x)}). We will show by mathematical induction that T^s(x) lies in this span for any nonnegative integer s. This is clear for s <= j. Suppose that it is true for some m >= j. Then there exist scalars b0, b1, ..., b_{j-1} such that

    T^m(x) = b0 x + b1 T(x) + ... + b_{j-1} T^{j-1}(x).

Applying T to both sides of the preceding equality, we obtain

    T^{m+1}(x) = b0 T(x) + b1 T2(x) + ... + b_{j-1} T^j(x).

But each of T(x), T2(x), ..., T^j(x) lies in span({x, T(x), ..., T^{j-1}(x)}). So T^{m+1}(x) is a linear combination of elements of this span and hence lies in the span, completing the induction. Hence, by Theorem 1.8,

    W = span({x, T(x), T2(x), ...}) is contained in span({x, T(x), ..., T^{j-1}(x)}).

Since the reverse inclusion is clearly also true, W = span({x, T(x), ..., T^{j-1}(x)}). Because the set {x, T(x), ..., T^{j-1}(x)} is also linearly independent, it is a basis for W. But dim(W) = k; so this set must contain k elements. Thus j = k, and {x, T(x), ..., T^{k-1}(x)} is a basis for W, proving (a).

(b) To prove part (b), let beta be the basis of part (a), and let a0, a1, ..., a_{k-1} be the scalars such that

    T^k(x) = -a0 x - a1 T(x) - ... - a_{k-1} T^{k-1}(x).

Observe that

    [T_W]_beta = ( 0  0  ...  0  -a0      )
                 ( 1  0  ...  0  -a1      )
                 ( 0  1  ...  0  -a2      )
                 ( .  .       .   .       )
                 ( 0  0  ...  1  -a_{k-1} ),

which has characteristic polynomial

    f(t) = (-1)^k (a0 + a1 t + ... + a_{k-1} t^{k-1} + t^k)

by Exercise 19. Thus f(t) is the characteristic polynomial of T_W, proving (b).

Example 6

Let T be the linear operator of Example 3, and let W = span({e1, e2}), the T-cyclic subspace of R3 generated by e1. We can compute the characteristic polynomial f(t) of T_W in two ways: by means of Theorem 5.27 and by means of determinants.

(a) By means of Theorem 5.27. From Example 3 we have that {e1, e2} is a basis for W and that T2(e1) = -e1. Therefore f(t) = t2 + 1 by part (b) of Theorem 5.27.

(b) By means of determinants. Since T(e1) = e2 and T(e2) = -e1, we have

    [T_W]_gamma = ( 0  -1 )
                  ( 1   0 ),

where gamma = {e1, e2} is a basis for W; therefore,

    f(t) = det ( -t  -1 ) = t2 + 1.
               (  1  -t )
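Part (b) of Theorem 5.27 is easy to mechanize: from the scalars a0, ..., a_{k-1} one reads off the characteristic polynomial directly. The sketch below (helper names are illustrative, not from the text) encodes the relation T2(e1) = -e1 of Example 6, so a0 = 1 and a1 = 0, and cross-checks the result against the 2 x 2 determinant of part (b).

```python
# Theorem 5.27(b): from T^k(x) = -a0 x - ... - a_{k-1} T^{k-1}(x),
# the characteristic polynomial of T_W is (-1)^k (a0 + a1 t + ... + t^k).
def char_poly_from_relation(a):
    # a = [a0, ..., a_{k-1}]; returns coefficients of f(t), lowest degree first
    k = len(a)
    return [(-1) ** k * c for c in a + [1]]

# Example 6: T^2(e1) = -e1, so a0 = 1, a1 = 0
f = char_poly_from_relation([1, 0])     # [1, 0, 1], i.e. f(t) = t^2 + 1

# cross-check against det([T_W] - tI) for [T_W] = [[0, -1], [1, 0]]
def f_at(t):
    return (0 - t) * (0 - t) - (-1) * 1  # 2x2 determinant, equals t^2 + 1

assert all(f_at(t) == f[0] + f[1] * t + f[2] * t * t for t in range(-3, 4))
```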
The Cayley-Hamilton Theorem

As an illustration of the usefulness of Theorem 5.27, we prove a well-known result that will be useful in Chapter 7. The reader should refer to Appendix E for the definition of f(T), where T is a linear operator and f(x) is a polynomial.

Theorem 5.28 (Cayley-Hamilton). Let T be a linear operator on a finite-dimensional vector space V, and let f(t) be the characteristic polynomial of T. Then f(T) = T0 (the zero transformation); that is, T satisfies its characteristic polynomial.

Proof. We must show that f(T)(x) = 0 for all x in V. If x = 0, then f(T)(x) = 0 since f(T) is a linear transformation. Suppose that x != 0, and let W denote the T-cyclic subspace of V generated by x. If dim(W) = k, then by Theorem 5.27 there exist scalars a0, a1, ..., a_{k-1} such that

    T^k(x) = -a0 x - a1 T(x) - ... - a_{k-1} T^{k-1}(x).

This implies that

    (-1)^k (a0 I + a1 T + ... + a_{k-1} T^{k-1} + T^k)(x) = 0.

Hence g(T)(x) = 0, where

    g(t) = (-1)^k (a0 + a1 t + ... + a_{k-1} t^{k-1} + t^k)

is the characteristic polynomial of T_W. By Theorem 5.26, g(t) divides f(t); hence there exists a polynomial q(t) such that f(t) = q(t)g(t). So

    f(T)(x) = q(T)g(T)(x) = q(T)(g(T)(x)) = q(T)(0) = 0.

Example 7

Define T: R2 -> R2 by T(a, b) = (a + 2b, -2a + b), and let beta = {e1, e2}. Then

    A = [T]_beta = (  1  2 )
                   ( -2  1 ).

The characteristic polynomial of T is, therefore,

    f(t) = det(A - tI) = det ( 1-t   2  ) = t2 - 2t + 5.
                             ( -2   1-t )

It is easily verified that T0 = f(T) = T2 - 2T + 5I. Similarly, f(A) = O.

Example 7 suggests the following.

Corollary (Cayley-Hamilton Theorem for Matrices). Let A be an n x n matrix, and let f(t) be the characteristic polynomial of A. Then f(A) = O, the n x n zero matrix.

Proof. Exercise.
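The conclusion f(A) = O of the corollary can be verified directly for the matrix A of Example 7, where f(t) = t^2 - 2t + 5; matmul is an illustrative helper, not from the text.

```python
# Verify the matrix form of the Cayley-Hamilton theorem for
# A = [[1, 2], [-2, 1]]: f(t) = t^2 - 2t + 5, so A^2 - 2A + 5I = O.
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2], [-2, 1]]
A2 = matmul(A, A)                       # [[-3, 4], [-4, -3]]
I = [[1, 0], [0, 1]]
fA = [[A2[i][j] - 2 * A[i][j] + 5 * I[i][j] for j in range(2)] for i in range(2)]
assert fA == [[0, 0], [0, 0]]
```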
Invariant Subspaces and Direct Sums*

It is useful to decompose a finite-dimensional vector space V into a direct sum of as many T-invariant subspaces as possible because the behavior of T on V can be inferred from its behavior on the direct summands. For example, T is diagonalizable if and only if V can be decomposed into a direct sum of one-dimensional T-invariant subspaces (see Exercise 34). In Chapter 7 we will consider alternative ways of decomposing V into direct sums of T-invariant subspaces if T is not diagonalizable. We now proceed to gather a few facts about direct sums of T-invariant subspaces for use in Section 7.4. The first of these facts is about characteristic polynomials.

Theorem 5.29. Let T be a linear operator on a finite-dimensional vector space V, and suppose that V = W1 (+) W2 (+) ... (+) Wk, where Wi is a T-invariant subspace of V for each i (1 <= i <= k). If f(t) is the characteristic polynomial of T and fi(t) denotes the characteristic polynomial of T_{Wi} (1 <= i <= k), then

    f(t) = f1(t) f2(t) ... fk(t).

Proof. The proof is by mathematical induction on k. Suppose first that k = 2. Let beta1 be a basis for W1, let beta2 be a basis for W2, and let beta = beta1 U beta2. Then beta is a basis for V by Theorem 5.15(d). Let B1 = [T_{W1}]_{beta1}, B2 = [T_{W2}]_{beta2}, and A = [T]_beta. It is easily verified (see Exercise 32) that

    A = ( B1  O  )
        ( O'  B2 ),

where O and O' are zero matrices. Thus

    f(t) = det(A - tI) = det(B1 - tI) * det(B2 - tI) = f1(t) f2(t)

by Exercise 9 of Section 4.3, proving the result if k = 2.

Now assume that the theorem is true for k - 1 summands, where k - 1 is an integer greater than or equal to 1, and suppose that V is a direct sum of k summands, V = W1 (+) W2 (+) ... (+) Wk. Define W = W1 (+) W2 (+) ... (+) W_{k-1}. Then W is a T-invariant subspace of V, and V = W (+) Wk. Hence, by the case k = 2, f(t) = g(t) fk(t), where g(t) is the characteristic polynomial of T_W. By the induction hypothesis we have g(t) = f1(t) f2(t) ... f_{k-1}(t). Thus

    f(t) = g(t) fk(t) = f1(t) f2(t) ... f_{k-1}(t) fk(t).

As an illustration of this result, suppose that T is a diagonalizable linear operator on a finite-dimensional vector space V with distinct eigenvalues lambda1, lambda2, ..., lambdak. By Theorem 5.16 we have that V is a direct sum of the eigenspaces of T. Since each eigenspace is T-invariant, we may view this situation in the context of Theorem 5.29. For any eigenvalue lambda_i, the restriction of T to E_{lambda_i} has characteristic polynomial (lambda_i - t)^{m_i}, where m_i is the dimension of E_{lambda_i}. By Theorem 5.29 the characteristic polynomial f(t) of T is the product

    f(t) = (lambda1 - t)^{m1} (lambda2 - t)^{m2} ... (lambdak - t)^{mk}.

It follows that the multiplicity of each eigenvalue is equal to the dimension of the corresponding eigenspace, as is expected.
Example 8

Let T: R4 -> R4 be defined by

    T(a, b, c, d) = (2a - b, a + b, c - d, c + d),

and let W1 = {(s, t, 0, 0): s, t in R} and W2 = {(0, 0, s, t): s, t in R}. Notice that W1 and W2 are each T-invariant and that R4 = W1 (+) W2. Let beta1 = {e1, e2}, beta2 = {e3, e4}, and beta = beta1 U beta2 = {e1, e2, e3, e4}; then beta1 is a basis for W1, beta2 is a basis for W2, and beta is a basis for R4. If B1 = [T_{W1}]_{beta1}, B2 = [T_{W2}]_{beta2}, and A = [T]_beta, then

    B1 = ( 2  -1 ),    B2 = ( 1  -1 ),
         ( 1   1 )          ( 1   1 )

and

    A = ( 2  -1  0   0 )
        ( 1   1  0   0 )
        ( 0   0  1  -1 )
        ( 0   0  1   1 ).

Moreover, if f(t) denotes the characteristic polynomial of T, f1(t) the characteristic polynomial of T_{W1}, and f2(t) the characteristic polynomial of T_{W2}, then

    f(t) = det(A - tI) = det(B1 - tI) * det(B2 - tI) = f1(t) f2(t).

The matrix A defined in Example 8 can be obtained by joining the matrices B1 and B2 in the manner explained in the following definition.
Definition. Let B1 and B2 be square matrices (not necessarily of the same size) having entries from the same field. If B1 is an m x m matrix and B2 is an n x n matrix, then the direct sum of B1 and B2, denoted B1 (+) B2, is the (m + n) x (m + n) matrix A such that

    A_{ij} = (B1)_{ij}             for 1 <= i, j <= m,
    A_{ij} = (B2)_{(i-m),(j-m)}    for m + 1 <= i, j <= n + m,

and A_{ij} = 0 otherwise.

If B1, B2, ..., Bk are square matrices with entries from the same field, then we define the direct sum of B1, B2, ..., Bk recursively by

    B1 (+) B2 (+) ... (+) Bk = (B1 (+) B2 (+) ... (+) B_{k-1}) (+) Bk.

If A = B1 (+) B2 (+) ... (+) Bk, then we often write

    A = ( B1  O   ...  O  )
        ( O   B2  ...  O  )
        ( .   .        .  )
        ( O   O   ...  Bk ).
Example 9

Let

    B1 = ( 1  2 ),    B2 = ( 3 ),    B3 = ( 1  1 ).
         ( 2  2 )                         ( 0  1 )

Then

    B1 (+) B2 (+) B3 = ( 1  2  0  0  0 )
                       ( 2  2  0  0  0 )
                       ( 0  0  3  0  0 )
                       ( 0  0  0  1  1 )
                       ( 0  0  0  0  1 ).

The final result of this section relates direct sums of matrices to direct sums of invariant subspaces. It states the general case of the relationship among the matrices A, B1, and B2 in Example 8.

Theorem 5.30. Let T be a linear operator on a finite-dimensional vector space V, and let W1, W2, ..., Wk be T-invariant subspaces of V such that V = W1 (+) W2 (+) ... (+) Wk. For each i, let beta_i be a basis for Wi, and let beta = beta1 U beta2 U ... U betak. If A = [T]_beta and Ai = [T_{Wi}]_{beta_i} for i = 1, 2, ..., k, then A = A1 (+) A2 (+) ... (+) Ak.

Proof. Exercise.
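A minimal sketch of the direct-sum construction just defined (the function name and the sample entries are illustrative, not from the text):

```python
# Direct sum of two square matrices: B1 and B2 on the diagonal of an
# (m + n) x (m + n) matrix, zeros elsewhere.
def direct_sum(B1, B2):
    m, n = len(B1), len(B2)
    A = [[0] * (m + n) for _ in range(m + n)]
    for i in range(m):
        for j in range(m):
            A[i][j] = B1[i][j]
    for i in range(n):
        for j in range(n):
            A[m + i][m + j] = B2[i][j]
    return A

A = direct_sum([[1, 2], [3, 4]], [[5]])
# A == [[1, 2, 0], [3, 4, 0], [0, 0, 5]]
```

By Exercise 38 below, the characteristic polynomial of the resulting block-diagonal matrix is the product of the characteristic polynomials of the blocks.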
EXERCISES

1. Label the following statements as being true or false.
(a) There exists a linear operator T with no T-invariant subspace.
(b) If T is a linear operator on a finite-dimensional vector space V, and W is a T-invariant subspace of V, then the characteristic polynomial of T_W divides the characteristic polynomial of T.
(c) Let T be a linear operator on a finite-dimensional vector space V, and let x and y be elements of V. If W is the T-cyclic subspace generated by x, W' is the T-cyclic subspace generated by y, and W = W', then x = y.
(d) If T is a linear operator on a finite-dimensional vector space V, then for any x in V the T-cyclic subspace generated by x is the same as the T-cyclic subspace generated by T(x).
(e) Let T be a linear operator on an n-dimensional vector space. Then there exists a polynomial g(t) of degree n such that g(T) = T0.
(f) Any polynomial of the form (-1)^n (a0 + a1 t + ... + a_{n-1} t^{n-1} + t^n) is the characteristic polynomial of some linear operator.
(g) If T is a linear operator on a finite-dimensional vector space V, and if V is a direct sum of k T-invariant subspaces, then there is a basis beta for V such that [T]_beta is a direct sum of k matrices.

2. For each of the following linear operators T, determine if the given subspace W is a T-invariant subspace of V.
(a) V = P3(R), T(f) = f', and W = P2(R)
(b) V = P(R), T(f)(x) = x f(x), and W = P2(R)
(c) V = R3, T(a, b, c) = (a + b + c, a + b + c, a + b + c), and W = {(t, t, t): t in R}
(d) V = C([0, 1]), T(f)(t) = [the integral from 0 to 1 of f(x) dx] t, and W = {f in V: f(t) = at + b for some a and b}
(e) V = M2x2(R), T(A) = ( 0 1 ; 1 0 ) A, and W = {A in V: A^t = A}

3. Let T be a linear operator on a finite-dimensional vector space V. Prove that the following subspaces are T-invariant.
(a) {0} and V
(b) N(T) and R(T)
(c) E_lambda, for any eigenvalue lambda of T

4. Let T be a linear operator on a vector space V, and let W be a T-invariant subspace of V. Prove that W is g(T)-invariant for any polynomial g(t).

5. Let T be a linear operator on a vector space V. Prove that the intersection of any collection of T-invariant subspaces of V is a T-invariant subspace of V.

6. For each linear operator T on the vector space V, find a basis for the T-cyclic subspace generated by the vector z.
(a) V = R4, T(a, b, c, d) = (a + b, b - c, a + c, a + d), and z = e1
(b) V = P3(R), T(f) = f'', and z = x3
(c) V = M2x2(R), T(A) = A^t, and z = ( 0 1 ; 1 0 )
(d) V = M2x2(R), T(A) = ( 1 1 ; 2 2 ) A, and z = ( 0 1 ; 1 0 )

7. Prove that the restriction of a linear operator T to a T-invariant subspace is a linear operator on that subspace.

8. Let T be a linear operator on a vector space with a T-invariant subspace W. Prove that if y is an eigenvector of T_W with corresponding eigenvalue lambda, then the same is true for T.

9. For each linear operator T and cyclic subspace W in Exercise 6, compute the characteristic polynomial of T_W in two ways, as in Example 6.

10. Verify that the matrix A in the proof of Theorem 5.26 can be written in the stated form.

11. Let T be a linear operator on a vector space V, let x be a nonzero element of V, and let W be the T-cyclic subspace of V generated by x. Prove:
(a) W is T-invariant.
(b) Any T-invariant subspace of V containing x also contains W.

12. For each linear operator of Exercise 6, find the characteristic polynomial f(t) of T, and verify that the characteristic polynomial of T_W (computed in Exercise 9) divides f(t).

13. Let T be a linear operator on a vector space V, let x be a nonzero element of V, and let W be the T-cyclic subspace of V generated by x. For any y in V, prove that y in W if and only if there exists a polynomial g(t) such that y = g(T)(x).

14. Prove that the polynomial g(t) of Exercise 13 can always be chosen so that its degree is less than or equal to dim(W).

15. Use the Cayley-Hamilton theorem (Theorem 5.28) to prove its corollary for matrices.

16. Let T be a linear operator on a finite-dimensional vector space V.
(a) Prove that if the characteristic polynomial of T splits, then so does the characteristic polynomial of the restriction of T to any T-invariant subspace of V.
(b) Deduce that if the characteristic polynomial of T splits, then any nontrivial T-invariant subspace of V contains an eigenvector of T.

17. Let A be an n x n matrix. Prove that dim(span({I_n, A, A2, ...})) <= n.

18. Let A be an n x n matrix with characteristic polynomial

    f(t) = (-1)^n t^n + a_{n-1} t^{n-1} + ... + a1 t + a0.

(a) Prove that A is invertible if and only if a0 != 0.
(b) Prove that if A is invertible, then

    A^{-1} = (-1/a0)[(-1)^n A^{n-1} + a_{n-1} A^{n-2} + ... + a1 I_n].

(c) Use part (b) to compute A^{-1} for

    A = ( 1  2   1 )
        ( 0  2   3 )
        ( 0  0  -1 ).

19. Let A denote the k x k matrix

    ( 0  0  ...  0  -a0      )
    ( 1  0  ...  0  -a1      )
    ( 0  1  ...  0  -a2      )
    ( .  .       .   .       )
    ( 0  0  ...  1  -a_{k-1} ),

where a0, a1, ..., a_{k-1} are arbitrary scalars. Prove that the characteristic polynomial of A is

    (-1)^k (a0 + a1 t + ... + a_{k-1} t^{k-1} + t^k).

Hint: Use mathematical induction on k, expanding the determinant along the first row.

20. Let T be a linear operator on a vector space V, and suppose that V is a T-cyclic subspace of itself. Prove that if U is a linear operator on V, then UT = TU if and only if U = g(T) for some polynomial g(t). Hint: Suppose that V is generated by x. Choose g(t) according to Exercise 13 so that g(T)(x) = U(x).

21. Let T be a linear operator on a two-dimensional vector space V. Prove that either V is a T-cyclic subspace of itself or T = cI for some scalar c.

22. Let T be a linear operator on a two-dimensional vector space V, and suppose that T != cI for any scalar c. Show that if U is any linear operator on V such that UT = TU, then U = g(T) for some polynomial g(t).

23. Let T be a linear operator on a finite-dimensional vector space V, and let W be a T-invariant subspace of V. Suppose that x1, x2, ..., xk are eigenvectors of T corresponding to distinct eigenvalues. Prove that if x1 + x2 + ... + xk is in W, then xi in W for all i. Hint: Use mathematical induction on k.

24. Prove that the restriction of a diagonalizable linear operator T to any nontrivial T-invariant subspace is also diagonalizable. Hint: Use the result of Exercise 23.

25. (a) Prove the converse of Exercise 16(a) of Section 5.2: If T and U are diagonalizable linear operators on a finite-dimensional vector space V such that UT = TU, then T and U are simultaneously diagonalizable. (See Exercise 25 of Section 5.2.) Hint: For any eigenvalue lambda of T, show that E_lambda is U-invariant, and apply Exercise 24 to obtain a basis for E_lambda of eigenvectors of U.
(b) State and prove a matrix version of (a).

Exercises 26 through 30 require familiarity with Exercise 29 of Section 2.1. It is also advisable to review Exercise 22 of Section 1.3 and Exercise 26 of Section 1.6.
26. Let T be a linear operator on a vector space V, and let W be a T-invariant subspace of V. Define T~: V/W -> V/W by

    T~(v + W) = T(v) + W

for any v + W in V/W.
(a) Show that T~ is well-defined. That is, show that T~(v + W) = T~(v' + W) whenever v + W = v' + W.
(b) Prove that T~ is a linear operator on V/W.
(c) Let eta: V -> V/W be the linear transformation defined in Section 2.1 by eta(v) = v + W. Show that the diagram of Figure 5.5 commutes; that is, prove that eta T = T~ eta. (Figure 5.5 is the square diagram formed by T: V -> V across the top, T~: V/W -> V/W across the bottom, and eta: V -> V/W down each side.)

In Exercises 27 through 30, T is a linear operator on a finite-dimensional vector space V, W is a nontrivial T-invariant subspace of V, and T~ is the linear operator on V/W defined in Exercise 26.

27. Let f(t), g(t), and h(t) be the characteristic polynomials of T, T_W, and T~, respectively. Prove that f(t) = g(t)h(t). Hint: Extend an ordered basis gamma = {x1, x2, ..., xk} for W to an ordered basis beta = {x1, x2, ..., xn} for V. Then show that alpha = {x_{k+1} + W, ..., xn + W} is a basis for V/W, and prove that

    [T]_beta = ( B1  B2 )
               ( O   B3 ),

where B1 = [T_W]_gamma and B3 = [T~]_alpha.

28. Use the hint in Exercise 27 to prove that if T is diagonalizable, then so is T~.

29. Prove the following converse to Exercises 24 and 28: if both T_W and T~ are diagonalizable and have no common eigenvalues, then T is diagonalizable.

The results of Theorem 5.27 and Exercise 27 are useful in devising algorithms for computing characteristic polynomials without the use of determinants. This is illustrated by the next exercise.

30. Let

    A = ( 1  1  -3 )
        ( 2  3   4 )
        ( 1  2   1 ),

let T = L_A, and let W be the cyclic subspace of R3 generated by e1.
(a) Use Theorem 5.27 to compute the characteristic polynomial of T_W.
(b) Show that {e2 + W} is a basis for R3/W, and use that fact to compute the characteristic polynomial of T~.
(c) Use the results of parts (a) and (b) to find the characteristic polynomial of A.

Exercises 31 through 38 are concerned with direct sums.

31. Let T be a linear operator on a vector space V, and let W1, W2, ..., Wk be T-invariant subspaces of V. Prove that the sum W1 + W2 + ... + Wk is also a T-invariant subspace of V.

32. Verify that A has the form stated in the proof of Theorem 5.29.

33. Prove Theorem 5.30.

34. Let T be a linear operator on a finite-dimensional vector space V. Prove that T is diagonalizable if and only if V is the direct sum of one-dimensional T-invariant subspaces.

35. Let T be a linear operator on a finite-dimensional vector space V, and suppose that W1, W2, ..., Wk are nontrivial T-invariant subspaces of V such that V = W1 (+) W2 (+) ... (+) Wk. Prove that

    det(T) = det(T_{W1}) det(T_{W2}) ... det(T_{Wk}).

36. Let T be a linear operator on a finite-dimensional vector space V, and suppose that W1, W2, ..., Wk are nontrivial T-invariant subspaces of V such that V = W1 (+) W2 (+) ... (+) Wk. Prove that T is diagonalizable if and only if T_{Wi} is diagonalizable for all i.

37. Prove that a collection C of diagonalizable linear operators on a finite-dimensional vector space V is simultaneously diagonalizable (see Exercise 25) if and only if UT = TU for all T and U in C. Hint: In the case that UT = TU for all T and U in C, first establish the result if each operator in C has only one eigenvalue. Then establish the general result by mathematical induction on dim(V), using the fact that V is the direct sum of the eigenspaces of some operator in C.

38. Let B1, B2, ..., Bk be square matrices with entries in the same field, and let A = B1 (+) B2 (+) ... (+) Bk. Prove that the characteristic polynomial of A is the product of the characteristic polynomials of the Bi's.
INDEX OF DEFINITIONS FOR CHAPTER 5

Absorbing Markov chain 273
Absorbing state 273
Characteristic polynomial of a linear operator or matrix 221-22
Column sum of a matrix 264
Convergence of a sequence of matrices 252
Cyclic subspace 281
Determinant of a linear operator 269
Diagonalizable linear operator or matrix 216
Direct sum of matrices 287
Direct sum of subspaces 245
Eigenspace of a linear operator or matrix 234
Eigenvalue of a linear operator or matrix 217
Eigenvector of a linear operator or matrix 219
Fixed probability vector 261
Generator of a cyclic subspace 281
Initial probability vector for a Markov chain 261
Invariant subspace 280
Limit of a sequence of matrices 252
Markov chain 260
Markov process 260
Multiplicity of an eigenvalue 233
Probability vector 258
Regular transition matrix 263
Row sum of a matrix 264
Scalar matrix 229
Simultaneously diagonalizable linear operators or matrices 252
Splits 232
Stochastic process 259
Sum of subspaces 245
Transition matrix 257
Most applications of mathematics are involved with the concept of measurement and hence of the magnitude or relative size of various quantities. So it is not surprising that the fields of real and complex numbers, which have a built-in notion of distance, should play a special role. Except for Section 6.7 we assume that all vector spaces are over the field F, where F denotes either R or C.

We will introduce the idea of distance or length into vector spaces, obtaining a much richer structure, the so-called "inner product space" structure. This added structure will provide applications to geometry (Sections 6.5 and 6.8), conditioning in systems of equations (Section 6.9), least squares applications (Section 6.3), quadratic forms (Section 6.7), and physics (Section 6.10).
6.1 INNER PRODUCTS AND NORMS

Many geometric notions such as angle, length, and perpendicularity in R2 and R3 may be extended to more general real and complex vector spaces. All of these ideas are related to the concept of inner product.

Definition. Let V be a vector space over F. An inner product on V is a function that assigns, to every ordered pair of vectors x and y in V, a scalar in F, denoted <x, y>, such that for all x, y, and z in V and all c in F, we have
(a) <x + z, y> = <x, y> + <z, y>.
(b) <cx, y> = c<x, y>.
(c) <x, y> = the conjugate of <y, x>, where the bar denotes complex conjugation.
(d) <x, x> > 0 if x != 0.

Note that (c) reduces to <x, y> = <y, x> if F = R. Conditions (a) and (b) simply require that the inner product be linear in the first component.

It is easily shown that if a1, ..., an in F and y, x1, x2, ..., xn in V, then

    < a1 x1 + ... + an xn, y > = a1 <x1, y> + ... + an <xn, y>.

Example 1

Let V = F^n. For x = (a1, ..., an) and y = (b1, ..., bn) in V, define

    <x, y> = the sum over i of ai bi*,

where bi* denotes the complex conjugate of bi. The verification that this satisfies conditions (a) through (d) is easy. For example, if z = (c1, ..., cn), we have

    <x + z, y> = sum of (ai + ci) bi* = sum of ai bi* + sum of ci bi* = <x, y> + <z, y>.

Thus <x, y> is an inner product on F^n, called the standard inner product on F^n. (In elementary courses in linear algebra this inner product is called the dot product.) For x = (1 + i, 4) and y = (2 - 3i, 4 + 5i) in C2 we have

    <x, y> = (1 + i)(2 + 3i) + 4(4 - 5i) = 15 - 15i.

Example 2

If <x, y> is any inner product on a vector space V, we may define another inner product by the rule <x, y>' = r<x, y>, where r > 0. If r < 0, then (d) would not hold.

Example 3

Let V = C([0, 1]), the vector space of real-valued continuous functions on [0, 1]. For f, g in V, define

    <f, g> = the integral from 0 to 1 of f(t)g(t) dt.

Since the integral above is linear in f, (a) and (b) are immediate, and (c) is trivial. If f != 0, then f2 is bounded away from zero on some subinterval of [0, 1] (continuity is used here), and hence

    <f, f> = the integral from 0 to 1 of [f(t)]2 dt > 0.

Definition. Let A be an m x n matrix with entries from F. We define the conjugate transpose (or adjoint) of A to be the n x m matrix A* such that (A*)_{ij} is the complex conjugate of A_{ji} for all i, j.

Example 4

Let

    A = ( i       1 + 2i )
        ( 2       3 + 4i ).

Then

    A* = ( -i      2      )
         ( 1 - 2i  3 - 4i ).

The conjugate transpose of a matrix plays a very important role in the remainder of this chapter. Note that if A has real entries, then A* is simply the transpose of A.
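The standard inner product on C^n and the conjugate transpose can be checked against the two computations above; inner and conj_transpose are illustrative helper names, and Python's built-in complex arithmetic supplies the conjugation.

```python
# Standard inner product <x, y> = sum a_i * conj(b_i) on C^n, applied to
# the computation of Example 1, plus the conjugate transpose of Example 4.
def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

x = (1 + 1j, 4)
y = (2 - 3j, 4 + 5j)
assert inner(x, y) == 15 - 15j

def conj_transpose(A):
    # (A*)_{ij} is the conjugate of A_{ji}
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

A = [[1j, 1 + 2j], [2, 3 + 4j]]
assert conj_transpose(A) == [[-1j, 2], [1 - 2j, 3 - 4j]]
```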
Example 5

Let V = M_{n x n}(F), and define <A, B> = tr(B*A) for A, B in V. (Recall that the trace of a matrix A is defined by tr(A) = the sum of the diagonal entries A_{ii}.) We will verify that (a) and (d) of the definition of inner product hold and leave (b) and (c) to the reader. For this purpose, let A, B, C in V. Then (using Exercise 6 of Section 1.3)

    <A + B, C> = tr(C*(A + B)) = tr(C*A + C*B) = tr(C*A) + tr(C*B) = <A, C> + <B, C>.

Also

    <A, A> = tr(A*A) = sum over k of (A*A)_{kk} = sum over k and i of (A*)_{ki} A_{ik}
           = sum over k and i of |A_{ik}|^2.

Now if A != 0, then A_{ik} != 0 for some k and i. So <A, A> > 0.

A vector space V over F endowed with a specific inner product is called an inner product space. If F = C, we call V a complex inner product space, whereas if F = R, we call V a real inner product space. Thus Examples 1, 3, and 5 also provide examples of inner product spaces. For the remainder of this chapter F^n will denote the inner product space with the standard inner product given in Example 1. The reader should be cautioned that two distinct inner products on a given vector space yield two distinct inner product spaces.

A very important inner product space that resembles C([0, 1]) is the space H of continuous complex-valued functions defined on the interval [0, 2 pi] with the inner product

    <f, g> = (1/(2 pi)) times the integral from 0 to 2 pi of f(t) g(t)* dt,

where g(t)* denotes the complex conjugate of g(t). The reason for the constant 1/(2 pi) will become evident later. This inner product space, which arises often in the context of physical situations, will be examined more closely in later sections.

At this point we mention a few facts about integration of complex-valued functions. First, the imaginary number i can be treated as a constant under the integration sign. Second, every complex-valued function f may be written as f = f1 + i f2, where f1 and f2 are real-valued functions. Thus we have

    the integral of f = the integral of f1 + i times the integral of f2.

From these properties, as well as the assumption of continuity, it follows that H is an inner product space [see Exercise 16(a)].

Some properties that follow easily from the definition of an inner product are contained in the next theorem.

Theorem 6.1. Let V be an inner product space. Then for x, y, z in V and c in F, the following are true.
(a) <x, y + z> = <x, y> + <x, z>.
(b) <x, cy> = c* <x, y>, where c* is the conjugate of c.
(c) <x, x> = 0 if and only if x = 0.
(d) If <x, y> = <x, z> for all x in V, then y = z.

Proof. (a) We have

    <x, y + z> = (<y + z, x>)* = (<y, x> + <z, x>)* = (<y, x>)* + (<z, x>)* = <x, y> + <x, z>.

The proofs of (b), (c), and (d) are left as exercises.

The reader should observe that (a) and (b) of Theorem 6.1 show that the inner product is conjugate linear in the second component.

In order to generalize the notion of length in R3 to arbitrary inner product spaces, we need only observe that the length of x = (a, b, c) in R3 is given by the square root of (a2 + b2 + c2), which equals the square root of <x, x>. Hence we make the following definition.

Definition. Let V be an inner product space. For x in V we define the norm (or length) of x by ||x|| = the square root of <x, x>.

Example 6

Let V = F^n. Then

    ||(a1, ..., an)|| = [ |a1|^2 + ... + |an|^2 ]^{1/2}

is the Euclidean definition of length. Note that if n = 1, we have ||a|| = |a|.

As we might expect, the well-known properties of length in R3 hold in general, as shown below.
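The Frobenius inner product of Example 5 and the Euclidean norm of Example 6 are both one-liners to check numerically (the helper names below are illustrative, not from the text):

```python
# Example 5's inner product <A, B> = tr(B* A): expanding the trace gives
# <A, A> = sum of |A_ik|^2 over all entries.
def frob_inner(A, B):
    n = len(A)
    return sum(B[i][k].conjugate() * A[i][k] for i in range(n) for k in range(n))

A = [[1j, 2], [0, 1 - 1j]]
assert frob_inner(A, A) == 7      # |i|^2 + |2|^2 + 0 + |1-i|^2 = 1 + 4 + 0 + 2

# Example 6's Euclidean norm on F^n
def norm(x):
    return sum(abs(a) ** 2 for a in x) ** 0.5

assert norm((3, 4)) == 5.0
```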
Theorem 6.2. Let V be an inner product space. Then for all x, y in V and c in F, we have
(a) ||cx|| = |c| ||x||.
(b) ||x|| = 0 if and only if x = 0. In any case, ||x|| >= 0.
(c) (Cauchy-Schwarz Inequality) |<x, y>| <= ||x|| ||y||.
(d) (Triangle Inequality) ||x + y|| <= ||x|| + ||y||.

Proof. We leave the proofs of (a) and (b) as exercises.
(c) If y = 0, then the result is immediate. So assume that y != 0. For any c in F, we have

    0 <= ||x - cy||^2 = <x - cy, x - cy> = <x, x - cy> - c<y, x - cy>
       = <x, x> - c* <x, y> - c<y, x> + c c* <y, y>.

Setting

    c = <x, y> / <y, y>,

the inequality above becomes

    0 <= <x, x> - |<x, y>|^2 / <y, y>,

from which (c) follows.
(d) We have

    ||x + y||^2 = <x + y, x + y> = <x, x> + <y, x> + <x, y> + <y, y>
                = ||x||^2 + 2 Re<x, y> + ||y||^2
               <= ||x||^2 + 2|<x, y>| + ||y||^2
               <= ||x||^2 + 2 ||x|| ||y|| + ||y||^2
                = (||x|| + ||y||)^2,

where Re<x, y> denotes the real part of the complex number <x, y>. Note that we used (c) to prove (d).

The case when equality results in (c) and (d) is considered in Exercise 15.

Example 7

For V = F^n we may apply (c) and (d) of Theorem 6.2 to the standard inner product to obtain the following well-known inequalities:

    | sum of ai bi* |  <=  [ sum of |ai|^2 ]^{1/2} [ sum of |bi|^2 ]^{1/2}

and

    [ sum of |ai + bi|^2 ]^{1/2}  <=  [ sum of |ai|^2 ]^{1/2} + [ sum of |bi|^2 ]^{1/2}.
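Both inequalities of Theorem 6.2 can be sanity-checked on sample vectors in R^3 with the standard inner product; this is a check on examples, not a proof, and the helper names are illustrative.

```python
# Numeric check of Cauchy-Schwarz and the triangle inequality on R^3.
def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return inner(x, x) ** 0.5

x, y = (1.0, 2.0, -1.0), (3.0, 0.0, 4.0)
assert abs(inner(x, y)) <= norm(x) * norm(y)          # |<x,y>| <= ||x|| ||y||
s = tuple(a + b for a, b in zip(x, y))
assert norm(s) <= norm(x) + norm(y) + 1e-12           # ||x+y|| <= ||x|| + ||y||
```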
The reader may recall from earlier courses that for V = R² or R³ we have

<x, y> = ||x|| · ||y|| cos θ,

where θ (0 ≤ θ ≤ π) denotes the angle between the vectors x and y. This equation implies (c) immediately, since |cos θ| ≤ 1. Notice also that x and y are perpendicular if and only if cos θ = 0, that is, if and only if <x, y> = 0.

We are now at the point where we can generalize the notion of perpendicularity to arbitrary inner product spaces.

Definitions. Let V be an inner product space. Vectors x and y in V are orthogonal (perpendicular) if <x, y> = 0. A subset S of V is orthogonal if any two distinct elements of S are orthogonal. A vector x is a unit vector if ||x|| = 1. Finally, a subset S of V is orthonormal if S is orthogonal and consists entirely of unit vectors.

Note that if S = {x_1, x_2, ...}, then S is orthonormal if and only if <x_i, x_j> = δ_{ij}, where δ_{ij} denotes the Kronecker delta. Also observe that for any nonzero vector x, (1/||x||)x is a unit vector.
Example 8

In F³, the set {(1, 1, 0), (1, −1, 1), (−1, 1, 2)} is orthogonal, but not orthonormal; however, if we divide each vector by its length, we obtain the orthonormal set

{ (1/√2)(1, 1, 0), (1/√3)(1, −1, 1), (1/√6)(−1, 1, 2) }.
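The normalization in Example 8 can be checked mechanically. A sketch (ours, not the text's): divide each vector by its length and confirm that the result satisfies <x_i, x_j> = δ_{ij}.

```python
import math

# Normalize the orthogonal set of Example 8 and verify orthonormality.
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

S = [(1, 1, 0), (1, -1, 1), (-1, 1, 2)]
unit = [tuple(a / math.sqrt(dot(v, v)) for a in v) for v in S]

for i, u in enumerate(unit):
    for j, w in enumerate(unit):
        expected = 1.0 if i == j else 0.0   # the Kronecker delta
        assert abs(dot(u, w) - expected) < 1e-12
```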

Example 9

Recall the inner product space H (defined on page 297). We will produce an important example of an orthonormal subset of H. Define S = {e^{ijx} : j is an integer}, where i is the imaginary number √−1. (Recall that e^{ijx} = cos jx + i sin jx.) Clearly, S is a subset of H. Using the property that the conjugate of e^{it} is e^{−it} for every real number t, we have for j ≠ k that

<e^{ijx}, e^{ikx}> = (1/2π) ∫_0^{2π} e^{ijt} e^{−ikt} dt = (1/2π) ∫_0^{2π} e^{i(j−k)t} dt
  = [ 1 / (2πi(j − k)) ] e^{i(j−k)t} evaluated from 0 to 2π, which equals 0.

Also,

<e^{ijx}, e^{ijx}> = (1/2π) ∫_0^{2π} e^{i(j−j)t} dt = 1.

In other words, <e^{ijx}, e^{ikx}> = δ_{jk}, so S is an orthonormal set. We will use the orthonormal set of Example 9 in later examples.

Sec. 6.1 Inner Products and Norms
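The orthonormality computed in Example 9 can be approximated numerically. The sketch below (ours) replaces the integral by a Riemann sum over a uniform grid on [0, 2π]; the helper name `inner_H` is our own.

```python
import cmath, math

# Approximate <e^{ijx}, e^{ikx}> = (1/2pi) * integral_0^{2pi}
# e^{ijt} * conj(e^{ikt}) dt by a left Riemann sum, and check delta_jk.
def inner_H(j, k, steps=4096):
    h = 2 * math.pi / steps
    total = sum(cmath.exp(1j * j * (m * h)) * cmath.exp(1j * k * (m * h)).conjugate()
                for m in range(steps))
    return total * h / (2 * math.pi)

assert abs(inner_H(3, 3) - 1) < 1e-6   # j = k gives 1
assert abs(inner_H(3, 5)) < 1e-6       # j != k gives 0
```

For integers j ≠ k the Riemann sum is a finite geometric series that vanishes exactly, so the approximation is essentially exact here.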

EXERCISES

1. Label the following statements as true or false.
(a) An inner product is a scalar-valued function on the set of ordered pairs of vectors.
(b) An inner product space must be over the field of real or complex numbers.
(c) An inner product is linear in both components.
(d) There is exactly one inner product on the vector space Rⁿ.
(e) The triangle inequality only holds in finite-dimensional inner product spaces.
(f) Only square matrices have a conjugate-transpose.
(g) If <x, y> = 0 for all x in an inner product space, then y = 0.

2. Let V = C³ with the standard inner product. Let x = (2, 1 + i, i) and y = (2 − i, 2, 1 + 2i). Compute <x, y>, ||x||, ||y||, and ||x + y||². Then verify both Cauchy's inequality and the triangle inequality.

3. In C([0, 1]), let f(t) = t and g(t) = e^t. Compute <f, g> (as defined in Example 3), as well as ||f||, ||g||, and ||f + g||. Then verify both Cauchy's inequality and the triangle inequality.

4. Let V = M_{n×n}(F), and define <A, B> = tr(B*A). Complete the proof that <·,·> is an inner product on V.

5. On C², show that <x, y> = xAy* is an inner product, where

A = ( 1  i )
    ( −i 2 ).

Compute <x, y> for x = (1 − i, 2 + 3i) and y = (2 + i, 3 − 2i).

6. Provide reasons why each of the following is not an inner product on the given vector spaces:
(a) <(a, b), (c, d)> = ac − bd on R².
(b) <A, B> = tr(A + B) on M_{2×2}(R).
(c) <f, g> = ∫_0^1 f′(t)g(t) dt on P(R), where ′ denotes differentiation.

7. Complete the proof of Theorem 6.1.

8. Prove parts (a) and (b) of Theorem 6.2.

9. Let β be a basis for a finite-dimensional inner product space. Prove that if <x, z> = 0 for all x ∈ β, then z = 0.

10.* Let V be an inner product space, and suppose that x and y are orthogonal elements of V. Prove that ||x + y||² = ||x||² + ||y||². Deduce the Pythagorean theorem in R².

11. Prove the parallelogram law on an inner product space V; that is, show that

||x + y||² + ||x − y||² = 2||x||² + 2||y||²  for all x, y ∈ V.

What does this equation state about parallelograms in R²?

12.* Let {x_1, ..., x_k} be an orthogonal set in V, and let a_1, ..., a_k ∈ F. Prove that

|| Σ_{i=1}^k a_i x_i ||² = Σ_{i=1}^k |a_i|² ||x_i||².

13. Suppose that <·,·>_1 and <·,·>_2 are two inner products on a vector space V. Prove that <·,·> = <·,·>_1 + <·,·>_2 is another inner product on V.

14. Let A and B be n × n matrices, and let c ∈ F. Prove that (A + cB)* = A* + c̄B*.

15. (a) Prove that if x and y are vectors of an inner product space, then |<x, y>| = ||x|| · ||y|| if and only if one of the vectors is a multiple of the other. Hint: If y ≠ 0, let a = <x, y>/||y||² and z = x − ay, so that x = ay + z and <y, z> = 0. From the assumed equality obtain ||z|| = 0.
(b) Derive a similar result for the equality ||x + y|| = ||x|| + ||y||, and generalize it to the case of n vectors.

16. (a) Show that the vector space H defined in this section is an inner product space.
(b) Let V = C([0, 1]), and define

<f, g> = ∫_0^{1/2} f(t)g(t) dt.

Is this an inner product on V?

17. Let T: V → V be linear, where V is an inner product space, and suppose that ||T(x)|| = ||x|| for all x. Prove that T is one-to-one.

18. Let V be a vector space over F, where F = R or F = C, and let W be an inner product space over F with inner product <·,·>. If T: V → W is linear, prove that <x, y>′ = <T(x), T(y)> defines an inner product on V if and only if T is one-to-one.

19. Let V be an inner product space. Prove the following:
(a) ||x ± y||² = ||x||² ± 2 Re<x, y> + ||y||² for all x, y ∈ V, where Re<x, y> denotes the real part of the complex number <x, y>.
(b) | ||x|| − ||y|| | ≤ ||x − y|| for all x, y ∈ V.

20. Let V be an inner product space over F. Verify the polar identities: for all x, y ∈ V,
(a) <x, y> = (1/4)||x + y||² − (1/4)||x − y||² if F = R;
(b) <x, y> = (1/4) Σ_{k=1}^4 i^k ||x + i^k y||² if F = C, where i² = −1.

21. Let A be an n × n matrix. Define

A_1 = (1/2)(A + A*)  and  A_2 = (1/2i)(A − A*).

(a) Prove that A_1* = A_1, A_2* = A_2, and A = A_1 + iA_2. Would it be reasonable to define A_1 and A_2 to be the real and imaginary parts, respectively, of the matrix A?
(b) Prove that the representation in (a) is unique. That is, prove that if A = B_1 + iB_2, where B_1* = B_1 and B_2* = B_2, then B_1 = A_1 and B_2 = A_2.

22. Let V be a vector space over F, where F is either R or C. Whether or not V is an inner product space, we may still define a "norm" || · || as a real-valued function on V satisfying the following conditions for all x, y ∈ V and a ∈ F:
(1) ||x|| ≥ 0, and ||x|| = 0 if and only if x = 0.
(2) ||ax|| = |a| · ||x||.
(3) ||x + y|| ≤ ||x|| + ||y||.
Prove that the following are norms on the given vector spaces V:
(a) V = M_{m×n}(F); ||A|| = max_{i,j} |A_{ij}| for all A ∈ V.
(b) V = C([0, 1]); ||f|| = max_{t ∈ [0,1]} |f(t)| for all f ∈ V.
(c) V = C([0, 1]); ||f|| = ∫_0^1 |f(t)| dt for all f ∈ V.
(d) V = R²; ||(a, b)|| = max{|a|, |b|} for all (a, b) ∈ V.
Use Exercise 20 to show that there is no inner product <·,·> on R² such that ||x||² = <x, x> for all x ∈ R², where || · || is defined as in (d).

23. Let V be an inner product space, and define, for each ordered pair of vectors, the scalar d(x, y) = ||x − y||, called the distance between x and y. Prove the following for all x, y, z ∈ V:
(a) d(x, y) ≥ 0.
(b) d(x, y) = d(y, x).
(c) d(x, y) ≤ d(x, z) + d(z, y).
(d) d(x, x) = 0.
(e) d(x, y) ≠ 0 if x ≠ y.

24. Let V be a real or complex vector space (possibly infinite-dimensional), and let β be a basis for V. For x, y ∈ V there exist x_1, ..., x_n ∈ β such that

x = Σ_{i=1}^n a_i x_i  and  y = Σ_{i=1}^n b_i x_i.

Define

<x, y> = Σ_{i=1}^n a_i b̄_i.

(a) Prove that <·,·> is an inner product on V and that β is an orthonormal basis. Thus every real or complex vector space may be regarded as an inner product space.
(b) Prove that if V = Rⁿ or Cⁿ and β is the standard ordered basis, then the inner product defined above is the standard inner product.

25. Let || · || be a norm (as defined in Exercise 22) on a real vector space V satisfying the parallelogram law given in Exercise 11. Define

<x, y> = (1/4)[ ||x + y||² − ||x − y||² ].

Prove that <·,·> defines an inner product on V such that ||x||² = <x, x> for all x ∈ V.

26. Let || · || be a norm (as defined in Exercise 22) on a complex vector space V satisfying

Σ_{k=1}^4 ||x + i^k y||² = 4[ ||x||² + ||y||² ].

Define

<x, y> = (1/4) Σ_{k=1}^4 i^k ||x + i^k y||².

Prove that <·,·> defines an inner product on V such that ||x||² = <x, x> for all x ∈ V.
6.2 THE GRAM–SCHMIDT ORTHOGONALIZATION PROCESS AND ORTHOGONAL COMPLEMENTS

In previous chapters we have seen the special role of the standard ordered bases for the vector spaces Fⁿ. The special properties of these bases stem from the fact that the basis vectors form an orthonormal set. Just as bases are the building blocks of vector spaces, orthonormal bases are the building blocks of inner product spaces. We now name such bases.

Definition. Let V be an inner product space. A subset of V is an orthonormal basis for V if it is an ordered basis that is orthonormal.

Example 1

If V = Fⁿ, then the standard ordered basis is an orthonormal basis for V.
Example 2

The set { (1/√5)(1, 2), (1/√5)(2, −1) } is an orthonormal basis for R².

The next theorem and its corollaries illustrate why orthonormal sets and, in particular, orthonormal bases are so important.

Theorem 6.3. Let V be an inner product space, and let S = {x_1, ..., x_k} be an orthogonal set of nonzero vectors. If

y = Σ_{i=1}^k a_i x_i,

then a_j = <y, x_j> / ||x_j||² for all 1 ≤ j ≤ k.

Proof. For 1 ≤ j ≤ k, we have

<y, x_j> = < Σ_{i=1}^k a_i x_i, x_j > = Σ_{i=1}^k a_i <x_i, x_j> = a_j <x_j, x_j> = a_j ||x_j||². ∎

Corollary 1. If, in addition to the hypotheses of Theorem 6.3, S is orthonormal, then a_j = <y, x_j>.

The corollary follows immediately from Theorem 6.3.

Corollary 2. Let V be an inner product space, and let S be an orthogonal set of nonzero vectors. Then S is linearly independent.

Proof. Suppose that x_1, ..., x_k ∈ S and Σ_{i=1}^k a_i x_i = 0. By Theorem 6.3, a_i = <0, x_i> / ||x_i||² = 0 for all i. So S is linearly independent. ∎
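Corollary 1 is easy to test numerically. The sketch below (ours, not part of the text) uses the orthonormal basis obtained from Example 8 of Section 6.1 and checks that the inner products <y, x_j> really do reconstruct y; the helper name `dot` is our own.

```python
import math

# For an orthonormal basis, the coefficient of y along x_j is just <y, x_j>.
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

basis = [
    tuple(a / math.sqrt(2) for a in (1, 1, 0)),
    tuple(a / math.sqrt(3) for a in (1, -1, 1)),
    tuple(a / math.sqrt(6) for a in (-1, 1, 2)),
]
y = (2, 1, 3)
coeffs = [dot(y, x) for x in basis]        # a_j = <y, x_j>

# Reconstruct y as the sum of a_j x_j and compare componentwise.
recon = [sum(a * x[i] for a, x in zip(coeffs, basis)) for i in range(3)]
assert all(abs(r - t) < 1e-12 for r, t in zip(recon, y))
```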

If V possesses a finite orthonormal basis, then Corollary 1 allows us to compute the coefficients in a linear combination very easily (see Example 3).

Example 3

By Theorem 6.3 and Corollary 2, the orthogonal set obtained in Example 8 of Section 6.1 yields the orthonormal basis

β = { (1/√2)(1, 1, 0), (1/√3)(1, −1, 1), (1/√6)(−1, 1, 2) }

for R³. Let x = (2, 1, 3). The "coefficients" given by Theorem 6.3 that express x as a linear combination of the basis vectors are

a_1 = (1/√2)(2 + 1) = 3/√2,  a_2 = (1/√3)(2 − 1 + 3) = 4/√3,

and

a_3 = (1/√6)(−2 + 1 + 6) = 5/√6.

As a check, we have

(2, 1, 3) = (3/2)(1, 1, 0) + (4/3)(1, −1, 1) + (5/6)(−1, 1, 2).

Of course, we have not yet shown that every finite-dimensional inner product space possesses an orthonormal basis. Note, for example, that the vector space H in Example 9 of Section 6.1 contains an infinite linearly independent set and hence is not a finite-dimensional vector space. The next theorem takes us most of the way in obtaining this result. It tells us how to construct an orthogonal set from a linearly independent set of vectors in such a way that both sets generate the same subspace.
Before stating this theorem, let us consider a simple case. Suppose that {y_1, y_2} is a linearly independent subset of an inner product space (and hence a basis for some two-dimensional subspace). We would like to construct an orthogonal set from {y_1, y_2} that spans the same subspace. Figure 6.1 suggests that the set {x_1, x_2}, where x_1 = y_1 and x_2 = y_2 − cy_1, will work if c is properly chosen. To find c, we need only solve the following equation:

0 = <x_2, y_1> = <y_2 − cy_1, y_1> = <y_2, y_1> − c<y_1, y_1>.

So c = <y_2, y_1> / ||y_1||². Thus

x_2 = y_2 − ( <y_2, y_1> / ||y_1||² ) y_1.

[Figure 6.1]

This process can be extended to any finite linearly independent subset.

Theorem 6.4. Let V be an inner product space, and let S = {y_1, ..., y_n} be a linearly independent subset of V. Define S′ = {x_1, ..., x_n}, where x_1 = y_1 and

x_k = y_k − Σ_{j=1}^{k−1} ( <y_k, x_j> / ||x_j||² ) x_j  for 2 ≤ k ≤ n.  (1)

Then S′ is an orthogonal set of nonzero vectors such that span(S′) = span(S).

Proof. The proof is by induction on n. Let S_n = {y_1, ..., y_n}. If n = 1, then the theorem is proved by taking S′_1 = S_1, i.e., x_1 = y_1 ≠ 0. Assume then that the set S′_k = {x_1, ..., x_k} with the desired properties has been constructed by the use of (1). We show that the set S′_{k+1} = {x_1, ..., x_k, x_{k+1}} also has the desired properties, where x_{k+1} is obtained from S′_k by the use of (1):

x_{k+1} = y_{k+1} − Σ_{j=1}^{k} ( <y_{k+1}, x_j> / ||x_j||² ) x_j.  (2)

If x_{k+1} = 0, then (2) would imply that y_{k+1} ∈ span(S′_k) = span(S_k), which contradicts the assumption that S_{k+1} is linearly independent. For 1 ≤ i ≤ k we have from (2) that

<x_{k+1}, x_i> = <y_{k+1}, x_i> − Σ_{j=1}^{k} ( <y_{k+1}, x_j> / ||x_j||² ) <x_j, x_i>
  = <y_{k+1}, x_i> − ( <y_{k+1}, x_i> / ||x_i||² ) ||x_i||² = 0,

since <x_j, x_i> = 0 if i ≠ j by the inductive assumption that S′_k is orthogonal. Hence S′_{k+1} is orthogonal. Now by (2) we have that span(S′_{k+1}) ⊆ span(S_{k+1}). But by Corollary 2 of Theorem 6.3, S′_{k+1} is linearly independent; so dim(span(S′_{k+1})) = k + 1 = dim(span(S_{k+1})). Hence span(S′_{k+1}) = span(S_{k+1}). ∎

The construction of {x_1, ..., x_n} by the use of (1) is called the Gram–Schmidt orthogonalization process.

Example 4
Let V = R³, and let y_1 = (1, 1, 0), y_2 = (2, 0, 1), and y_3 = (2, 2, 1). Then {y_1, y_2, y_3} is linearly independent. We shall use (1) above to compute the orthogonal vectors x_1, x_2, and x_3.

Take x_1 = y_1 = (1, 1, 0). Then ||x_1||² = 2, so

x_2 = y_2 − ( <y_2, x_1> / ||x_1||² ) x_1 = (2, 0, 1) − (2/2)(1, 1, 0) = (1, −1, 1).

Finally,

x_3 = y_3 − ( <y_3, x_1> / ||x_1||² ) x_1 − ( <y_3, x_2> / ||x_2||² ) x_2
    = (2, 2, 1) − (4/2)(1, 1, 0) − (1/3)(1, −1, 1)
    = (−1/3, 1/3, 2/3).
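Formula (1) translates directly into code. The sketch below (ours, not part of the text) implements the Gram–Schmidt process and reproduces the vectors of Example 4; the function name `gram_schmidt` is our own.

```python
# Gram-Schmidt as in Theorem 6.4, formula (1):
# x_k = y_k - sum_{j<k} (<y_k, x_j>/||x_j||^2) x_j.
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors):
    ortho = []
    for y in vectors:
        x = list(y)
        for xj in ortho:
            c = dot(y, xj) / dot(xj, xj)
            x = [a - c * b for a, b in zip(x, xj)]
        ortho.append(x)
    return ortho

S = [(1, 1, 0), (2, 0, 1), (2, 2, 1)]      # the vectors of Example 4
x1, x2, x3 = gram_schmidt(S)
assert x2 == [1.0, -1.0, 1.0]
assert all(abs(a - b) < 1e-12 for a, b in zip(x3, (-1/3, 1/3, 2/3)))
# The resulting vectors are pairwise orthogonal, as the theorem promises.
assert abs(dot(x1, x2)) < 1e-12 and abs(dot(x1, x3)) < 1e-12 and abs(dot(x2, x3)) < 1e-12
```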

Theorem 6.5. Let V be a finite-dimensional inner product space. Then V has an orthonormal basis β. Furthermore, if β = {x_1, x_2, ..., x_n} and x ∈ V, then

x = Σ_{i=1}^n <x, x_i> x_i.

Proof. Let β_0 be an ordered basis for V. Apply Theorem 6.4 to obtain an orthogonal set β′ of nonzero vectors with span(β′) = span(β_0) = V. By dividing each vector in β′ by its length, we obtain an orthonormal set β that therefore generates V. By Corollary 2 of Theorem 6.3, β is linearly independent, and hence β is an orthonormal basis for V. The remainder of the theorem follows from Corollary 1 to Theorem 6.3. ∎

We now have an easy method for computing the matrix representation of a linear operator.

Corollary. Let V be a finite-dimensional inner product space with an orthonormal basis β = {x_1, ..., x_n}. Let T be a linear operator on V, and let A = [T]_β. Then A_{ij} = <T(x_j), x_i>.

Proof. From Theorem 6.5 we have

T(x_j) = Σ_{i=1}^n <T(x_j), x_i> x_i.

Hence A_{ij} = <T(x_j), x_i>. ∎

The scalars <x, x_i> associated with an orthonormal basis β have been studied extensively for special vector spaces. Although the vectors x_1, ..., x_n above were chosen from an orthonormal basis, we will consider the more general case in which they are chosen from an orthonormal subset (possibly infinite) of an inner product space.

Definition. Let β be an orthonormal subset (possibly infinite) of an inner product space V, and let x ∈ V. We define the Fourier coefficients of x relative to β to be the scalars <x, y>, where y ∈ β.

In the nineteenth century the French mathematician
Jean Baptiste Fourier studied the coefficients

∫_0^{2π} f(t) sin nt dt  and  ∫_0^{2π} f(t) cos nt dt,

or, more generally,

c_n = (1/2π) ∫_0^{2π} f(t) e^{−int} dt,

of a function f. In the context of Example 9 of Section 6.1, c_n is the nth Fourier coefficient of a continuous function f ∈ H relative to S; that is, c_n = <f, e^{inx}>. These are the "classical" Fourier coefficients of a function, and the literature concerning the behavior of these coefficients is extensive. We will learn more about these Fourier coefficients in the remainder of this chapter.

Example 5

We will compute the Fourier coefficients of the function f(x) = x relative to the orthonormal set S of Example 9 of Section 6.1. Using integration by parts, we have, for n ≠ 0,

<f, e^{inx}> = (1/2π) ∫_0^{2π} t e^{−int} dt = i/n.

And, for n = 0,

<f, 1> = (1/2π) ∫_0^{2π} t(1) dt = π.

Now by Exercise 14 we have that, for every k,

||f||² ≥ |<f, 1>|² + Σ_{n=1}^k ( |<f, e^{inx}>|² + |<f, e^{−inx}>|² ).

Using the fact that

||f||² = (1/2π) ∫_0^{2π} t² dt = 4π²/3,

we have

4π²/3 ≥ π² + 2 Σ_{n=1}^k 1/n².

Since this inequality is true for all k, we have by the appropriate use of limits that

Σ_{n=1}^∞ 1/n² ≤ π²/6.

Other results may be obtained similarly by using other functions.

We are now ready to proceed with the concept of an "orthogonal complement."

Definition. Let V be an inner product space, and let S be a subset of V. We define S⊥ (read "S perp") to be the set of all those vectors in V that are orthogonal to every vector in S; that is,

S⊥ = { x ∈ V : <x, y> = 0 for all y ∈ S }.

S⊥ is called the orthogonal complement of S.

It is easy to show that S⊥ is a subspace of V for any subset S of V.

Example 6

The reader should verify that {0}⊥ = V and V⊥ = {0}.

Example 7

If V = R³ and S = {x}, then S⊥ is simply the set of all vectors that are perpendicular to x (see Exercise 5). Exercise 16 provides an interesting example of an orthogonal complement in an infinite-dimensional inner product space.

Proposition 6.6. Let W be a finite-dimensional subspace of an inner product space V. If {x_1, ..., x_k} is an orthonormal basis of W and y ∈ V, then

y = u + z,  where  u = Σ_{i=1}^k <y, x_i> x_i ∈ W  and  z ∈ W⊥.

Furthermore, this representation is unique. That is, if also y = u′ + z′, where u′ ∈ W and z′ ∈ W⊥, then u′ = u and z′ = z.

Proof. For y ∈ V, let

u = Σ_{i=1}^k <y, x_i> x_i  and  z = y − u.

To show that z ∈ W⊥, it suffices to show that z is orthogonal to each x_j. We have

<z, x_j> = < y − Σ_{i=1}^k <y, x_i> x_i, x_j > = <y, x_j> − Σ_{i=1}^k <y, x_i> <x_i, x_j>
  = <y, x_j> − Σ_{i=1}^k <y, x_i> δ_{ij} = <y, x_j> − <y, x_j> = 0.

To show uniqueness, suppose that y = x_1 + x_2 = x_1′ + x_2′, where x_1, x_1′ ∈ W and x_2, x_2′ ∈ W⊥. Then

x_1 − x_1′ = x_2′ − x_2 ∈ W ∩ W⊥ = {0}.

Therefore x_1 = x_1′ and x_2 = x_2′. ∎
Corollary. In the notation of Proposition 6.6, the vector

u = Σ_{i=1}^k <y, x_i> x_i

is the unique vector in W that is "closest" to y; that is, if u′ ∈ W, then ||y − u′|| ≥ ||y − u||, and this inequality is an equality if and only if u′ = u.

Proof. As in Proposition 6.6, we have that y = u + z, where z ∈ W⊥. Let u′ ∈ W. By Exercise 10 of Section 6.1 we have

||y − u′||² = ||(u − u′) + z||² = ||u − u′||² + ||z||² ≥ ||z||² = ||y − u||².

Now suppose that ||y − u′|| = ||y − u||. Then the inequality above becomes an equality, so ||u − u′||² = 0. That is, u′ = u. Clearly, y − u = z ∈ W⊥. ∎

The vector u in the corollary is called the orthogonal projection of y on W. We will see the importance of orthogonal projections of vectors in the application to least squares in Section 6.3.
Example 8

Let V = P₃(R) with the inner product

<f, g> = ∫_0^1 f(t)g(t) dt  for all f, g ∈ V.

Let W = span({1, x}), and let f(x) = x². We will compute the orthogonal projection f_1 of f on W. It is easy to apply the Gram–Schmidt process to {1, x} and obtain the orthonormal basis {g_1, g_2} of W, where

g_1(x) = 1  and  g_2(x) = 2√3 (x − 1/2).

We compute <f, g_1> = 1/3 and <f, g_2> = √3/6. Therefore,

f_1(x) = <f, g_1> g_1(x) + <f, g_2> g_2(x) = 1/3 + (√3/6) · 2√3 (x − 1/2) = −1/6 + x.
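The projection in Example 8 can be reproduced with exact rational arithmetic. A sketch (ours, not the text's): polynomials are coefficient lists, and the helper names `inner`, `scale`, and `sub` are our own.

```python
from fractions import Fraction

# Project f(x) = x^2 onto W = span({1, x}) in P_3(R) with
# <p, q> = integral_0^1 p(t)q(t) dt = sum_{i,j} p_i q_j / (i + j + 1).
def inner(p, q):
    return sum(pi * qj * Fraction(1, i + j + 1)
               for i, pi in enumerate(p) for j, qj in enumerate(q))

def scale(c, p):
    return [c * a for a in p]

def sub(p, q):
    return [a - b for a, b in zip(p, q)]

# Gram-Schmidt on {1, x}; we keep the vectors orthogonal (not normalized)
# and divide by <g_j, g_j> when projecting, which gives the same projection.
g1 = [Fraction(1), Fraction(0), Fraction(0)]
x_poly = [Fraction(0), Fraction(1), Fraction(0)]
g2 = sub(x_poly, scale(inner(x_poly, g1) / inner(g1, g1), g1))   # x - 1/2

f = [Fraction(0), Fraction(0), Fraction(1)]                      # f(x) = x^2
proj = [a + b for a, b in zip(scale(inner(f, g1) / inner(g1, g1), g1),
                              scale(inner(f, g2) / inner(g2, g2), g2))]
assert proj == [Fraction(-1, 6), Fraction(1), Fraction(0)]       # -1/6 + x
```

As a further check, f − proj is orthogonal to both g_1 and g_2, exactly as Proposition 6.6 requires.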

It was shown (Corollary 5 to Theorem 1.10) that any linearly independent set in a finite-dimensional vector space can be extended to a basis. The next theorem provides an interesting analog for an orthonormal subset of a finite-dimensional inner product space.
Theorem 6.7. Suppose that S = {x_1, ..., x_k} is an orthonormal set in an n-dimensional inner product space V. Then

(a) S can be extended to an orthonormal basis {x_1, ..., x_k, x_{k+1}, ..., x_n} for V.
(b) If W = span(S), then S′ = {x_{k+1}, ..., x_n} is an orthonormal basis for W⊥ (in the notation above).
(c) If W is any subspace of V, then dim(V) = dim(W) + dim(W⊥).

Proof. (a) By Corollary 5 to Theorem 1.10, S can be extended to a basis {x_1, ..., x_k, y_{k+1}, ..., y_n} for V. Now apply the Gram–Schmidt process to this basis and use Exercise 7 to obtain an orthogonal set whose first k vectors are x_1, ..., x_k. Normalizing the last n − k vectors, we obtain an orthonormal basis. The result now follows.

(b) Since S′ is orthonormal, it is linearly independent by Corollary 2 to Theorem 6.3. Since S′ is clearly a subset of W⊥, we need only show that it spans W⊥. Note that for any x in V, we have

x = Σ_{i=1}^n <x, x_i> x_i.

Now if x ∈ W⊥, then <x, x_i> = 0 for 1 ≤ i ≤ k. So

x = Σ_{i=k+1}^n <x, x_i> x_i ∈ span(S′).

(c) Let W be any subspace of V, with an orthonormal basis {x_1, ..., x_k}. By parts (a) and (b), we have

dim(V) = n = k + (n − k) = dim(W) + dim(W⊥). ∎

Example 9

Let V = F³ and W = span({e_1, e_2}). Then x = (a, b, c) ∈ W⊥ if and only if 0 = <x, e_1> = a and 0 = <x, e_2> = b. So x = (0, 0, c), and therefore W⊥ = span({e_3}). One can deduce the same result by noting that e_3 ∈ W⊥ and, from part (c) above, that dim(W⊥) = 3 − 2 = 1.

EXERCISES

1. Label the following statements as being true or false.
(a) The Gram–Schmidt orthogonalization process allows us to construct an orthonormal set from an arbitrary set of vectors.
(b) Every finite-dimensional inner product space possesses an orthonormal basis.
(c) The orthogonal complement of any set is a subspace.
(d) If {x_1, ..., x_n} is a basis for an inner product space V, then for any x ∈ V the scalars <x, x_i> are the Fourier coefficients of x.
(e) An orthonormal basis must be an ordered basis.
(f) Every orthogonal set is linearly independent.
(g) Every orthonormal set is linearly independent.

2. In each of the following parts, apply the Gram–Schmidt process to the given subset S of the inner product space V. Then find an orthonormal basis β for span(S). Finally, compute the Fourier coefficients of the given vector relative to β, and use Theorem 6.5 to verify your result.
(a) V = R³, S = {(1, 0, 1), (0, 1, 1), (1, 3, 3)}, and x = (1, 1, 2).
(b) V = R³, S = {(1, 1, 1), (0, 1, 1), (0, 0, 1)}, and x = (1, 0, 1).
(c) V = P₂(R) with the inner product <f, g> = ∫_0^1 f(t)g(t) dt, S = {1, x, x²}, and f(x) = 1 + x.
(d) V = C³, S = {(1, i, 0), (1 − i, 2, 4i)}, and x = (3 + i, 4i, −4).

3. In R², let

β = { (1/√2)(1, 1), (1/√2)(1, −1) }.

Find the Fourier coefficients of (3, 4) relative to β.

4. Let S = {(1, 0, i), (1, 2, 1)} in C³. Compute S⊥.

5. Let S_0 = {x_0}, where x_0 ≠ 0. Describe S_0⊥ geometrically in R³. Now suppose that S = {x_1, x_2} is a linearly independent subset of R³. Describe S⊥ geometrically.

6. Let V be a finite-dimensional inner product space, and let W be a subspace of V. Prove that there exists a projection T on W such that N(T) = W⊥. In addition, prove that ||T(x)|| ≤ ||x|| for all x ∈ V. Hint: Use Proposition 6.6. (Projections are defined in the exercises of Section 2.1.)

7. Prove that if {y_1, ..., y_n} is an orthogonal set of nonzero vectors, then the vectors x_1, ..., x_n derived from the Gram–Schmidt process satisfy x_i = y_i for i = 1, ..., n. Hint: Use induction.

8. Let V = C³ and W = span({(i, 0, 1)}). Find orthonormal bases for W and W⊥.

9. Let V be an inner product space, and let W be a finite-dimensional subspace of V. If x ∉ W, prove that there exists y ∈ V such that y ∈ W⊥ but <x, y> ≠ 0. Hint: Use Proposition 6.6.

10. Let A be an n × n matrix with complex entries. Prove that AA* = I if and only if the rows of A form an orthonormal set.

11. Let W_1 and W_2 be subspaces of a finite-dimensional inner product space. Prove that

(W_1 + W_2)⊥ = W_1⊥ ∩ W_2⊥  and  (W_1 ∩ W_2)⊥ = W_1⊥ + W_2⊥.

12. Let V be an inner product space, S and S_0 be subsets of V, and W be a finite-dimensional subspace of V. Prove the following:
(a) S_0 ⊆ S implies that S⊥ ⊆ S_0⊥.
(b) S ⊆ (S⊥)⊥; so span(S) ⊆ (S⊥)⊥.
(c) W = (W⊥)⊥. Hint: Use Exercise 6.
(d) V = W ⊕ W⊥ (see the exercises of Section 1.3).

13. (a) Parseval's Identity. Let {x_1, ..., x_n} be an orthonormal basis for V. For any x, y ∈ V prove that

<x, y> = Σ_{i=1}^n <x, x_i> · the conjugate of <y, x_i>.

(b) Use part (a) to prove that if β is an orthonormal basis for a finite-dimensional inner product space V with inner product <·,·>, then for any x, y ∈ V,

<[x]_β, [y]_β>′ = <x, y>,

where <·,·>′ is the standard inner product on Fⁿ.

14. Bessel's Inequality. Let V be an inner product space, and let S = {x_1, ..., x_n} be an orthonormal subset of V. Prove that for any x ∈ V we have

||x||² ≥ Σ_{i=1}^n |<x, x_i>|².

Hint: Apply Proposition 6.6 to x ∈ V and W = span(S). Then use Exercise 10 of Section 6.1.

15. Let T be a linear operator on a finite-dimensional inner product space V. If <T(x), y> = 0 for all x, y ∈ V, prove that T = T_0. In fact, prove this result if the equality holds for all x and y in some basis for V.

16. Let V = C([−1, 1]). Suppose that W_e and W_o denote the subspaces of V consisting of the even and odd functions, respectively. Prove that W_e⊥ = W_o, where the inner product on V is defined by

<f, g> = ∫_{−1}^1 f(t)g(t) dt.

17. In each of the following parts, find the orthogonal projection of the given vector on the given subspace W of the inner product space V.
(a) V = R², y = (2, 6), and W = {(x, y): y = 4x}.
(b) V = R³, y = (2, 1, 3), and W = {(x, y, z): x + 3y − 2z = 0}.
(c) V = P(R) with the inner product <f, g> = ∫_0^1 f(t)g(t) dt, h(x) = 4 + 3x − 2x², and W = P₁(R).

18. In each part of Exercise 17, find the distance from the given vector to the subspace W.
6.3 THE ADJOINT

the

with
W

In Exercise

18.

= (2, 6),

- 2x2,

4 +3x

subspace

given

R2,

find

parts,

following

f(t)g(t)

P^R)

distance from

vector

the^given

to

the

subspace

W.

OF A LINEAROPERATOR

\"adjoint\"

the

conjugate

product
of T,

space

whose

transpose

A* of a

matrix A. For a linear

define a relatedlinearoperator
matrix representation with respect to
V, we now

any

The

6.3

Sec.

orthonormal basis
and

numbers

adjoints

<x,

all

for

j>>

e V is clearly

linear. More interestingis

from

fact

the

this

is of

into

g: V-*

function

The

V.

every linear transformation

is finite-dimensional,

if V

space, and let

an inner product

V be

defined by g(x) =
that

analogy between conjugation of complex


will become apparent. We first needa

however.

result,

preliminary

Let

[T] jf. The


of linear
operators
V is

of

/?

315

Linear Operator

of a

Adjoint

form.

Let

Theorem 6.8.

let g:

-*\342\200\242
F

a unique vector

V.

basis for

an orthonormal

/? be

Let

Proof.

over F3 and

space

product

Then there exists

transformation.

<x, y> for all x e

g (x) =

that

such

a linear

be

inner

a finite-dimensional

be

y=

/?

say

V,

let

and

{xi3...,

x\342\200\236}3

9(xi)*f-

E
\302\243-1

If

1 <j

by h(x) =

h: V -*F

define

we

< n we have

<x, j>>, then

is

for

Now

linear.

clearly

h(xy)

E 9(Xi)xi) = ^gW^Xj)

=
<xy, j>>
( xp

(= i

Sinceg and

/?3 we

on

agree

y is unique,

that

show

To

<x, j>>

both

for all x, so

= <x3 />

have g = h

9(^)-

by the corollaryto

suppose that g(x) =

Theorem

by

9(^-1

we

6.1(d)

y = y'.

have

x-

a^

f\302\260r

JO

<x3

2.6.

Theorem

Then

Example 1
Define
transformation.Let

proof

R2~+R

g:

6.8.

linear

<T(x)3y>

\342\200\224

Let

Proof.

first show that g

j;

all

for

T*(y)>

<x,

e V.

Define

is linear.

Let

g(cxx +

=
x2)

Henceg is
We

gCeJei. + 9(e2)e2=

be a finite-dimensional

on V. Then

operator

+ a2; clearly g
2et

is a
+

linear

e2

Then g(al3 a2) = ([aua2)9(2,1)>= 2ai

LetV

6.9.

Theorem
be a

and lety =

{ei,e2},

of Theorem

a2) = 2ax

g(al3

by

(\302\2761)

inner product space,and

unique

xl3

<T(cx!
c<T(x1)3

x2

e V and

+ x2),
y)

by g(x) =
c e F.

x e

V. We

+ T(x2),

j>> = cg(xi)

j>>

+ g(x2).

linear.

now

may

apply

Theorem

6.8 to obtain

-* V such that

all

for

let

Then

y) = <cT(xJ

+ <T(x2),

y)

<T(x)3

the

in

a2.

there exists a
T*: V
function
x, y e V. Moreover, T* is linear.
g: V -* F

as

a unique vector/eV

such

that

316

g(x) = <*,/>;
=

J*(y)

i.e.,

y', we have

To

Defining T*:V-+V

for all xeV.

/>

<x,

<T(x), y) = <x, T*(j;)>.


let yl} y2 e

is linear,

T*

that

show

y)

<T(x),

by

x e V,

Then for any

e F.

and

Inner Product Spaces

Chap.

we

have

+ y2)} =

T*^

<x,

<T(x), c^ +

+ <T(x), y2)

^)

c<T(x),

y2)

= c<x,T*^)) +
=

Since x is

have

we

arbitrary,

+ 1^)).

0,^(^)
+ j>2) =

T^cj^

T*(^)>

... = ⟨x, cT*(y₁) + T*(y₂)⟩, so T*(cy₁ + y₂) = cT*(y₁) + T*(y₂) by Theorem 6.1(d).

Finally, we need to show that T* is unique. Suppose that U: V → V is a linear operator satisfying ⟨T(x), y⟩ = ⟨x, U(y)⟩ for all x, y ∈ V. Then ⟨x, T*(y)⟩ = ⟨x, U(y)⟩ for all x, y ∈ V, so T* = U.

The linear operator T* described in Theorem 6.9 is called the adjoint of the operator T. The symbol T* is read "T star."

Thus T* is the unique operator on V satisfying ⟨T(x), y⟩ = ⟨x, T*(y)⟩ for all x, y ∈ V. Note that we also have

⟨x, T(y)⟩ = ⟨T(y), x⟩¯ = ⟨y, T*(x)⟩¯ = ⟨T*(x), y⟩;

so ⟨x, T(y)⟩ = ⟨T*(x), y⟩ for all x, y ∈ V. We may view these equations symbolically as adding a * to T when we shift its position inside the inner product symbol.

In the infinite-dimensional case the adjoint of a linear operator T may be defined to be the function T* such that ⟨T(x), y⟩ = ⟨x, T*(y)⟩ for all x, y ∈ V. The uniqueness and linearity of T* will follow as before. However, the existence of the adjoint is not guaranteed. The reader should observe the necessity of the hypothesis of finite-dimensionality in the proof of Theorem 6.8. Many of the theorems we prove about adjoints, nevertheless, are independent of the dimension of V. Thus for the remainder of this chapter we adopt the convention for the exercises that a reference to the adjoint of a linear operator on an infinite-dimensional inner product space assumes its existence.

A useful result for computing adjoints is Theorem 6.10 below.

Theorem 6.10. Let V be a finite-dimensional inner product space, and let β = {x₁, ..., xₙ} be an orthonormal basis for V. If T is a linear operator on V, then [T*]_β = ([T]_β)*.

Proof. Let A = [T]_β and B = [T*]_β. Then from the corollary to Theorem 6.5 we have

B_ij = ⟨T*(x_j), x_i⟩ = ⟨x_i, T*(x_j)⟩¯ = ⟨T(x_i), x_j⟩¯ = (A_ji)¯ = (A*)_ij.

Hence B = A*.

Sec. 6.3  The Adjoint of a Linear Operator  317

Corollary. Let A be an n × n matrix. Then L_{A*} = (L_A)*.

Proof. If β is the standard ordered basis for Fⁿ, then by Theorem 2.16 we have [L_A]_β = A. Hence [(L_A)*]_β = ([L_A]_β)* = A* = [L_{A*}]_β, and so (L_A)* = L_{A*}.

As an application of Theorem 6.10, we will compute the adjoint of a specific linear operator.

Example 2

Define T: C² → C² by T(a₁, a₂) = (2i a₁ + 3a₂, a₁ − a₂). If β is the standard ordered basis for C², then

[T]_β = [ 2i   3 ]
        [  1  -1 ].

So

[T*]_β = ([T]_β)* = [ -2i   1 ]
                    [   3  -1 ].

Hence T*(a₁, a₂) = (−2i a₁ + a₂, 3a₁ − a₂).
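Theorem 6.10 and Example 2 can be checked numerically: with respect to an orthonormal basis, the adjoint is represented by the conjugate transpose. A minimal sketch, assuming numpy is available; the test vectors are arbitrary and not from the text:

```python
# Numerical check of Theorem 6.10 / Example 2: [T*] is the conjugate
# transpose of [T], and <T(x), y> = <x, T*(y)> for the standard inner
# product <u, v> = sum_i u_i * conj(v_i).
import numpy as np

A = np.array([[2j, 3.],
              [1., -1.]])          # [T] for T(a1, a2) = (2i*a1 + 3*a2, a1 - a2)
A_star = A.conj().T                # [T*] = conjugate transpose of [T]

x = np.array([1. + 2j, -3. + 1j])  # arbitrary test vectors (not from the text)
y = np.array([2. - 1j, 5j])

lhs = np.vdot(y, A @ x)            # <T(x), y>; np.vdot conjugates its first argument
rhs = np.vdot(A_star @ y, x)       # <x, T*(y)>
print(np.isclose(lhs, rhs))        # the two inner products agree
```

Note that `np.vdot` conjugates its *first* argument, while the book's inner product conjugates the second slot, so ⟨u, v⟩ corresponds to `np.vdot(v, u)`.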

The following theorem demonstrates the analogy between the conjugates of complex numbers and the adjoints of linear operators.

Theorem 6.11. Let V be a finite-dimensional inner product space, and let T and U be linear operators on V. Then
(a) (T + U)* = T* + U*.
(b) (cT)* = c̄T* for any c ∈ F.
(c) (TU)* = U*T*.
(d) T** = T.
(e) I* = I.

Proof. We will prove parts (a) and (d); the rest are proved similarly. Let x, y ∈ V.
(a) Since

⟨x, (T + U)*(y)⟩ = ⟨(T + U)(x), y⟩ = ⟨T(x) + U(x), y⟩
                 = ⟨T(x), y⟩ + ⟨U(x), y⟩ = ⟨x, T*(y)⟩ + ⟨x, U*(y)⟩
                 = ⟨x, T*(y) + U*(y)⟩ = ⟨x, (T* + U*)(y)⟩,

(a) follows.
(d) Similarly, since ⟨x, T(y)⟩ = ⟨T*(x), y⟩ = ⟨x, T**(y)⟩, (d) follows.

The same proof works in the infinite-dimensional case provided that the existence of T* and U* is assumed.

Corollary. Let A and B be n × n matrices. Then
(a) (A + B)* = A* + B*.
(b) (cA)* = c̄A* for all c ∈ F.
(c) (AB)* = B*A*.
(d) A** = A.
(e) I* = I.

Proof. We will prove only (c); the remaining parts can be proved similarly. Since

L_{(AB)*} = (L_{AB})* = (L_A L_B)* = (L_B)*(L_A)* = L_{B*} L_{A*} = L_{B*A*},

we have (AB)* = B*A*.

In the proof above we relied on the corollary to Theorem 6.10. An alternative proof can be given by appealing directly to the definition of the conjugate transposes of the matrices A and B (see Exercise 5).

Least Squares Approximation

Consider the following problem: An experimenter collects data by taking measurements y₁, y₂, ..., y_m at times t₁, t₂, ..., t_m, respectively. For example, he or she may be measuring unemployment at various times during some period. Suppose that the data (t₁, y₁), (t₂, y₂), ..., (t_m, y_m) are plotted as points in the plane (see Figure 6.2). From this distribution the experimenter feels that there exists an essentially linear relationship between y and t, say, y = ct + d, and would like to find the constants c and d so that the line y = ct + d represents the best possible "fit" to the data collected. One such estimate of fit is the error E that represents the sum of the squares of the vertical distances from the points to the line; i.e.,

E = Σ_{i=1}^{m} (yᵢ − ctᵢ − d)².

[Figure 6.2: the data points (tᵢ, yᵢ) plotted with the least squares line y = ct + d.]

Thus the problem is to find the constants c and d that minimize E. (For this reason the line y = ct + d is called the least squares line.) If we let

A = [ t₁  1 ]        x = [ c ]        y = [ y₁ ]
    [ t₂  1 ]            [ d ],           [ y₂ ]
    [ ⋮   ⋮ ]                             [ ⋮  ]
    [ t_m 1 ],                            [ y_m ],

then it follows that E = ||y − Ax||².

We now develop a general method for finding an explicit vector x₀ ∈ Fⁿ that minimizes E; that is, given an m × n matrix A, we will find x₀ ∈ Fⁿ such that ||y − Ax₀|| ≤ ||y − Ax|| for all vectors x ∈ Fⁿ. This method will not only allow us to find the linear function which best fits the data but also any polynomial of a fixed degree which best fits the data.

We first need some notation and two simple lemmas. For x, y ∈ Fⁿ, we let ⟨x, y⟩ₙ denote the standard inner product of x and y in Fⁿ. Notice that if x and y are regarded as column vectors, then ⟨x, y⟩ₙ = y*x.

Lemma 1. Let A be an m × n matrix over F, x ∈ Fⁿ, and y ∈ Fᵐ. Then

⟨Ax, y⟩ₘ = ⟨x, A*y⟩ₙ.

Proof. ⟨Ax, y⟩ₘ = y*(Ax) = (y*A)x = (A*y)*x = ⟨x, A*y⟩ₙ.

Lemma 2. Let A be an m × n matrix over F. Then rank(A*A) = rank(A).

Proof. By the dimension theorem we need only show that, for x ∈ Fⁿ, A*Ax = 0 if and only if Ax = 0. Clearly, Ax = 0 implies that A*Ax = 0. So assume that A*Ax = 0. Then

0 = ⟨A*Ax, x⟩ₙ = ⟨Ax, Ax⟩ₘ,

so that Ax = 0.

Corollary. If A is an m × n matrix such that rank(A) = n (i.e., A has "full rank"), then A*A is invertible.
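Lemma 2 is easy to confirm numerically. A small sketch, assuming numpy is available; the 4 × 2 matrix below is an arbitrary illustration, not taken from the text:

```python
# Numerical check of Lemma 2: rank(A*A) = rank(A).
import numpy as np

A = np.array([[1., 1.],
              [2., 1.],
              [3., 1.],
              [4., 1.]])
gram = A.conj().T @ A              # A*A (here A is real, so A* is just A^t)
rank_A = np.linalg.matrix_rank(A)
rank_gram = np.linalg.matrix_rank(gram)
print(rank_A, rank_gram)           # both equal 2, so A*A is invertible
```

Since rank(A) = n = 2 here, the corollary guarantees that the 2 × 2 matrix A*A is invertible.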

Now let A be an m × n matrix and y ∈ Fᵐ. Define W = {Ax : x ∈ Fⁿ}; that is, W = R(L_A). By the corollary to Proposition 6.6 there exists a unique vector in W, say Ax₀ where x₀ ∈ Fⁿ, that is closest to y. So ||Ax₀ − y|| ≤ ||Ax − y|| for all x ∈ Fⁿ.

To develop a practical method for finding such an x₀, we note from the corollary to Proposition 6.6 that Ax₀ − y ∈ W⊥, so ⟨Ax, Ax₀ − y⟩ₘ = 0 for all x ∈ Fⁿ. Thus by Lemma 1 we have ⟨x, A*(Ax₀ − y)⟩ₙ = 0 for all x ∈ Fⁿ; that is, A*(Ax₀ − y) = 0. So we have A*Ax₀ = A*y.

To find x₀ we need only find a solution to the system A*AX = A*y. If, in addition, we assume that rank(A) = n, then by Lemma 2 we have x₀ = (A*A)⁻¹A*y. We may summarize this discussion in the following theorem.

Theorem 6.12. Let A ∈ M_{m×n}(F) and y ∈ Fᵐ. Then there exists x₀ ∈ Fⁿ such that (A*A)x₀ = A*y and ||Ax₀ − y|| ≤ ||Ax − y|| for all x ∈ Fⁿ. Furthermore, if rank(A) = n, then x₀ = (A*A)⁻¹A*y.

To return to our experimenter, let us suppose that the data collected is (1, 2), (2, 3), (3, 5), and (4, 7). Then

A = [ 1  1 ]        y = [ 2 ]
    [ 2  1 ]            [ 3 ]
    [ 3  1 ]            [ 5 ]
    [ 4  1 ],           [ 7 ];

hence

A*A = [ 1 2 3 4 ] [ 1  1 ]   = [ 30  10 ]
      [ 1 1 1 1 ] [ 2  1 ]     [ 10   4 ],
                  [ 3  1 ]
                  [ 4  1 ]

so

(A*A)⁻¹ = (1/20) [  4  -10 ]
                 [ -10  30 ].

Therefore

[ c ] = x₀ = (A*A)⁻¹A*y = (1/20) [  4  -10 ] [ 51 ] = [ 1.7 ]
[ d ]                            [ -10  30 ] [ 17 ]   [ 0   ].

Thus the line y = 1.7t is the least squares line. The error E may be computed directly as E = ||Ax₀ − y||² = 0.3.
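The worked example can be reproduced in a few lines. A minimal sketch, assuming numpy is available:

```python
# Least squares line for the data (1,2), (2,3), (3,5), (4,7), computed
# from the normal equations A*A x = A*y of Theorem 6.12.
import numpy as np

t = np.array([1., 2., 3., 4.])
y = np.array([2., 3., 5., 7.])
A = np.column_stack([t, np.ones_like(t)])   # columns: t_i and 1
x0 = np.linalg.solve(A.T @ A, A.T @ y)      # solve the normal equations
c, d = x0
E = np.sum((y - A @ x0) ** 2)               # error E = ||y - A x0||^2
print(c, d, E)                              # c ~ 1.7, d ~ 0, E ~ 0.3
```

This agrees with the hand computation above: the least squares line is y = 1.7t with error E = 0.3.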

The method above may also be applied if the experimenter wants to fit a parabola y = ct² + dt + e to the data. In this case, he or she would use as the matrix A

A = [ t₁²   t₁  1 ]
    [ t₂²   t₂  1 ]
    [  ⋮    ⋮   ⋮ ]
    [ t_m² t_m  1 ].

Finally, suppose in the linear case that the experimenter chose the times tᵢ to satisfy

Σ_{i=1}^{m} tᵢ = 0.

Then the two columns of A would be orthogonal, so A*A would be a diagonal matrix (see Exercise 17). This, of course, would greatly simplify the computations.
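The parabola fit changes only the design matrix, whose columns are tᵢ², tᵢ, and 1. A brief sketch, assuming numpy is available; the data points are illustrative only:

```python
# Least squares parabola y = c t^2 + d t + e: same normal equations as the
# linear case, but with a three-column design matrix.
import numpy as np

t = np.array([-3., -2., 0., 1.])            # illustrative sample times
y = np.array([9., 6., 2., 1.])
A = np.column_stack([t**2, t, np.ones_like(t)])
coeffs = np.linalg.solve(A.T @ A, A.T @ y)  # (c, d, e) minimizing ||y - Ax||^2
print(coeffs)
```

The returned coefficients satisfy the normal equations A*A x₀ = A*y exactly as in Theorem 6.12.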

Minimal Solutions to Systems of Linear Equations

Consider the system of linear equations AX = b, where A is an m × n matrix and b ∈ Fᵐ, and suppose that the system has at least one solution. If rank(A) = n, then the system has exactly one solution; if rank(A) < n, there will be infinitely many solutions. In the latter case it is often desirable to find a solution of minimal norm. A solution s of AX = b is called a minimal solution if ||s|| ≤ ||u|| for all other solutions u. We prove in what follows that every consistent system has exactly one minimal solution, and we give a method for finding it.

Theorem 6.13. Let A ∈ M_{m×n}(F) and b ∈ Fᵐ. Suppose that AX = b has at least one solution. Then the following are true:
(a) There exists exactly one minimal solution s of AX = b, and s ∈ R(L_{A*}).
(b) The vector s is the only solution of AX = b that lies in R(L_{A*}); that is, if u satisfies (AA*)u = b, then s = A*u.

Proof. (a) For simplicity of notation, we let W = R(L_{A*}) and W′ = N(L_A). Let x be any solution of AX = b. By Proposition 6.6 we can write x = s + y, where s ∈ W and y ∈ W⊥. But W⊥ = W′ by Exercise 12, so Ay = 0. Then

b = Ax = As + Ay = As,

so s is a solution of AX = b that lies in W. If v is any solution of AX = b, then A(v − s) = 0, so v − s ∈ W′ = W⊥; hence we can write v = s + u, where u ∈ W⊥. Since s ∈ W, we have

||v||² = ||s + u||² = ||s||² + ||u||² ≥ ||s||².

Thus s is a minimal solution. We can also see from the calculation above that if ||v|| = ||s||, then u = 0 and v = s; hence s is the unique minimal solution of AX = b, proving (a).

(b) Assume that v is also a solution of AX = b that lies in W. Then

v − s ∈ W ∩ W′ = W ∩ W⊥ = {0},

so v = s.

Finally, suppose that (AA*)u = b, and let s = A*u. Then s ∈ W and As = AA*u = b. Therefore s is a solution of AX = b that lies in W, and so, by the argument above, s is the minimal solution of AX = b.

Example

Consider the system

x + 2y +  z =   4
x −  y + 2z = −11
x + 5y      =  19.

Let

A = [ 1  2  1 ]        b = [   4 ]
    [ 1 -1  2 ]            [ -11 ]
    [ 1  5  0 ],           [  19 ].

To find the minimal solution of this system, we must first find some solution u of AA*X = b. Now

AA* = [  6   1  11 ]
      [  1   6  -4 ]
      [ 11  -4  26 ];

so we consider the system

 6x +  y + 11z =   4
  x + 6y −  4z = −11
11x − 4y + 26z =  19,

for which one solution is

u = [  1 ]
    [ -2 ]
    [  0 ].

(Any solution will suffice.) Hence

s = A*u = [ -1 ]
          [  4 ]
          [ -3 ]

is the minimal solution of the given system.
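The minimal-solution recipe of Theorem 6.13(b) translates directly into code. A short sketch, assuming numpy is available (since AA* is singular here, a least-squares solver is used to find one solution u of the consistent system AA*X = b):

```python
# Minimal solution of AX = b via Theorem 6.13(b): find any u with (AA*)u = b,
# then s = A*u is the unique minimal solution.
import numpy as np

A = np.array([[1., 2., 1.],
              [1., -1., 2.],
              [1., 5., 0.]])
b = np.array([4., -11., 19.])

# AA* is singular (rank 2), so use lstsq to pick one solution u; any solution
# of the consistent system gives the same s = A*u.
u = np.linalg.lstsq(A @ A.T, b, rcond=None)[0]
s = A.T @ u                                  # the minimal solution
print(s)                                     # approximately (-1, 4, -3)
```

This reproduces the hand computation: s = (−1, 4, −3), and one can check that As = b.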

EXERCISES

1. Label the following statements as being true or false. Assume that the underlying inner product spaces are finite-dimensional.
(a) Every linear operator has an adjoint.
(b) Every linear operator on V has the form x → ⟨x, y⟩ for some y ∈ V.
(c) For every linear operator T on V and every ordered basis β of V, we have [T*]_β = ([T]_β)*.
(d) The adjoint of a linear operator is always unique.
(e) For any linear operators T and U and scalars a and b, (aT + bU)* = aT* + bU*.
(f) For any n × n matrix A, (L_A)* = L_{A*}.
(g) For any linear operator T, (T*)* = T.

2. For each of the following inner product spaces V (over F) and linear transformations g: V → F, find a vector y such that g(x) = ⟨x, y⟩ for all x ∈ V.
(a) V = R³, g(a₁, a₂, a₃) = a₁ − 2a₂ + 4a₃
(b) V = C², g(z₁, z₂) = z₁ − 2z₂
(c) V = P₂(R) with ⟨f, h⟩ = ∫₀¹ f(t)h(t) dt, g(f) = f(0) + f′(1)

3. For each of the following inner product spaces V and linear operators T on V, evaluate T* at the given element of V.
(a) V = R², T(a, b) = (2a + b, a − 3b), x = (3, 5)
(b) V = C², T(z₁, z₂) = (2z₁ + iz₂, (1 − i)z₁), x = (3 − i, 1 + 2i)
(c) V = P₂(R) with ⟨f, g⟩ = ∫₀¹ f(t)g(t) dt, T(f) = f′ + 3f, f(x) = 4 − x + 3x²

4. Complete the proof of Theorem 6.11.

5. Complete the proof of the corollary to Theorem 6.11 in two ways: first by using Theorem 6.11 as in the proof of (c), and second by appealing directly to the definition of the conjugate transpose A*.

6. Let T be a linear operator on an inner product space V. Let U₁ = T + T* and U₂ = TT*. Prove that U₁ = U₁* and U₂ = U₂*.

7. Give an example of a linear operator T on an inner product space such that N(T) ≠ N(T*).

8. Let V be a finite-dimensional inner product space, and let T be a linear operator on V. Prove that if T is invertible, then T* is invertible and (T*)⁻¹ = (T⁻¹)*.

9. Prove that if V = W ⊕ W⊥ and T is the projection on W with N(T) = W⊥, then T = T*. (For definitions, see the exercises of Sections 1.3 and 2.1.)

10. Let T be a linear operator on an inner product space V. Prove that ||T(x)|| = ||x|| for all x ∈ V if and only if ⟨T(x), T(y)⟩ = ⟨x, y⟩ for all x, y ∈ V. Hint: Use Exercise 20 of Section 6.1.

11. For a linear operator T on an inner product space V, prove that T*T = T₀ implies T = T₀. Is the same result true if we assume that TT* = T₀?

12. Let T be a linear operator on a finite-dimensional inner product space V. Prove that R(T*)⊥ = N(T). Hint: Use Exercise 12(c) of Section 6.2.

13. Let T be a linear operator on a finite-dimensional inner product space V. Prove the following.
(a) N(T*T) = N(T). Deduce that rank(T*T) = rank(T).
(b) rank(T) = rank(T*). Deduce from (a) that rank(TT*) = rank(T).
(c) For any n × n matrix A, rank(A*A) = rank(AA*) = rank(A).

14. Let V be an inner product space, and let y, z ∈ V. Define T: V → V by T(x) = ⟨x, y⟩z for all x ∈ V. First prove that T is linear. Then show that T* exists, and define it explicitly.

15. Let T: V → W be a linear transformation, where V and W are finite-dimensional inner product spaces.
(a) Prove that there exists a unique linear transformation T*: W → V such that ⟨T(x), y⟩ = ⟨x, T*(y)⟩ for all x ∈ V and y ∈ W.
(b) Let β and γ be orthonormal bases for V and W, respectively. Prove that [T*]_γ^β = ([T]_β^γ)*.

16. Let A be an n × n matrix. Prove that det(A*) = det(A)¯.

17. Prove that if A is an m × n matrix whose columns are orthogonal, then A*A is a diagonal matrix.

18. For the data (−3, 9), (−2, 6), (0, 2), and (1, 1), find the least squares line and the least squares parabola. Compute the error E in both cases.

19. In physics, Hooke's law states that (within certain limits) there is a linear relationship between the length x of a spring and the force y applied to (or exerted by) the spring. That is, y = cx + d, where c is called the spring constant. Use the following data to estimate the spring constant. (The length is given in inches and the force is given in pounds.)

    Length x | Force y
    ---------|--------
    3.5      | 1.0
    4.0      | 2.2
    4.5      | 2.8
    5.0      | 4.3

20. Find the minimal solution of the system

 x + 2y −  z = 1
2x + 3y +  z = 2
4x + 7y −  z = 4.

21. For the least squares line y = ct + d corresponding to the m observations (t₁, y₁), ..., (t_m, y_m), use Theorem 6.12 to derive the normal equations:

( Σ_{i=1}^{m} tᵢ² )c + ( Σ_{i=1}^{m} tᵢ )d = Σ_{i=1}^{m} tᵢyᵢ

and

( Σ_{i=1}^{m} tᵢ )c + md = Σ_{i=1}^{m} yᵢ.

These equations may also be obtained from the error E by setting each of the partial derivatives of E to zero.

6.4 NORMAL AND SELF-ADJOINT OPERATORS

We have seen the importance of diagonalizable operators in Chapter 5. For these operators it is necessary and sufficient for the vector space to possess a basis of eigenvectors. As V is an inner product space in this chapter, it is reasonable to seek conditions that will guarantee that V has an orthonormal basis of eigenvectors. A very important result that will help us achieve our goal is Schur's theorem (Theorem 6.14). The formulation below is in terms of linear operators; the next section will contain the more conventional matrix form. We begin with a lemma.

Recall (Exercise 16 of Section 2.1) that a subspace W of V is T-invariant if T(W) is contained in W. If W is T-invariant, we may define the restriction T_W: W → W by T_W(x) = T(x) for all x in W. It is clear that T_W is a linear operator on W. Recall also from Section 5.2 that a polynomial splits if it factors into linear polynomials.

Lemma. Let T be a linear operator on a finite-dimensional inner product space V. If T has an eigenvector, then so does T*.

Proof. Let β be an orthonormal basis of V, and let A = [T]_β. If λ is an eigenvalue of T, then det(A − λI) = 0. By Exercise 16 of Section 6.3 we also have

det(A* − λ̄I) = det(A − λI)¯ = 0.

So λ̄ is an eigenvalue of A* = [T*]_β, and hence of T*. In particular, T* has an eigenvector.

Theorem 6.14 (Schur). Let T be a linear operator on a finite-dimensional inner product space V. Suppose that the characteristic polynomial of T splits. Then there exists an orthonormal basis β such that the matrix [T]_β is upper triangular.

Proof. The proof will be by induction on the dimension n of V. The result is immediate if n = 1. So suppose that the result is true for linear operators on (n − 1)-dimensional inner product spaces whose characteristic polynomials split. By the lemma we can assume that T* has a unit eigenvector z. Suppose that T*(z) = λz and that W = span({z}). We now show that W⊥ is T-invariant. If y ∈ W⊥ and x = cz ∈ W, then

⟨T(y), x⟩ = ⟨T(y), cz⟩ = ⟨y, T*(cz)⟩ = ⟨y, cT*(z)⟩ = ⟨y, cλz⟩ = c̄λ̄⟨y, z⟩ = c̄λ̄(0) = 0.

So T(y) ∈ W⊥. Now let T₁ be the restriction of T to W⊥. It is easy to show (see Theorem 5.26) that the characteristic polynomial of T₁ divides the characteristic polynomial of T and hence splits. By Theorem 6.7(c), dim(W⊥) = n − 1, so we may apply the induction hypothesis to T₁ and obtain an orthonormal basis γ of W⊥ such that [T₁]_γ is upper triangular. Clearly, β = γ ∪ {z} is an orthonormal basis of V such that [T]_β is upper triangular.

We now return to our original goal of finding an orthonormal basis of eigenvectors of a linear operator T on a finite-dimensional inner product space V. Note that if such an orthonormal basis β exists, then [T]_β is a diagonal matrix, and hence [T*]_β = ([T]_β)* is also a diagonal matrix. Because diagonal matrices commute, we have that T and T* commute. Thus if T possesses an orthonormal basis of eigenvectors, then TT* = T*T.

Definitions. Let V be an inner product space, and let T be a linear operator on V. We say that T is normal if TT* = T*T. An n × n matrix A is normal if AA* = A*A.

It follows immediately that T is normal if and only if [T]_β is normal, where β is an orthonormal basis.

Example 1

Let T: R² → R² be the rotation by θ, where 0 < θ < π. The matrix representation of T in the standard ordered basis is given by

A = [ cos θ  -sin θ ]
    [ sin θ   cos θ ].

Note that AA* = I = A*A; so A, and hence T, is normal.
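The claim of Example 1 is easy to verify numerically. A minimal sketch, assuming numpy is available; the angle is arbitrary:

```python
# Example 1 check: a rotation matrix A satisfies AA* = A*A = I, so the
# rotation is normal (its matrix even commutes with its adjoint trivially,
# since both products equal the identity).
import numpy as np

theta = 0.7                      # any angle with 0 < theta < pi
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
print(np.allclose(A @ A.T, np.eye(2)),
      np.allclose(A.T @ A, np.eye(2)))   # both True
```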

Clearly, the operator T in Example 1 does not even possess one eigenvector. So in the case of a real inner product space, we see that normality is not sufficient to guarantee an orthonormal basis of eigenvectors. All is not lost, however. We will show that normality suffices if V is a complex inner product space.

Before we prove the promised result for normal operators, we need some general properties of normal operators.

Theorem 6.15. Let V be an inner product space, and let T be a normal operator on V. Then:
(a) ||T(x)|| = ||T*(x)|| for all x ∈ V.
(b) T − cI is normal for every c ∈ F.
(c) If x is an eigenvector of T, then x is also an eigenvector of T*. In fact, if T(x) = λx, then T*(x) = λ̄x.
(d) If λ₁ and λ₂ are distinct eigenvalues of T with corresponding eigenvectors x₁ and x₂, then x₁ and x₂ are orthogonal.

Proof. (a) For any x ∈ V, we have

||T(x)||² = ⟨T(x), T(x)⟩ = ⟨T*T(x), x⟩ = ⟨TT*(x), x⟩ = ⟨T*(x), T*(x)⟩ = ||T*(x)||².

(b) The proof of (b) is left as an exercise.
(c) Suppose that T(x) = λx for some x ∈ V. Let U = T − λI. Then U(x) = 0, and by (b) U is normal. Thus (a) implies that

0 = ||U(x)|| = ||U*(x)|| = ||(T* − λ̄I)(x)|| = ||T*(x) − λ̄x||.

Hence T*(x) = λ̄x. So x is an eigenvector of T*.
(d) Let x₁ and x₂ be eigenvectors of T with corresponding distinct eigenvalues λ₁ and λ₂. Then, using (c), we have

λ₁⟨x₁, x₂⟩ = ⟨λ₁x₁, x₂⟩ = ⟨T(x₁), x₂⟩ = ⟨x₁, T*(x₂)⟩ = ⟨x₁, λ̄₂x₂⟩ = λ₂⟨x₁, x₂⟩.

Since λ₁ ≠ λ₂, we conclude that ⟨x₁, x₂⟩ = 0.

Theorem 6.16. Let T be a linear operator on a finite-dimensional complex inner product space V. Then T is normal if and only if there exists an orthonormal basis of eigenvectors of T.

Proof. Suppose that T is normal. By the fundamental theorem of algebra (Theorem D.4) the characteristic polynomial of T splits. So we may apply Schur's theorem to obtain an orthonormal basis β = {x₁, ..., xₙ} of V such that [T]_β = A is upper triangular. We know that x₁ is an eigenvector of T because A is upper triangular. Assume that x₁, ..., x_{k−1} are eigenvectors of T. We will show that x_k is also an eigenvector of T; it will then follow by mathematical induction on k that all the vectors of β are eigenvectors of T.

Because A is upper triangular,

T(x_k) = A₁ₖx₁ + A₂ₖx₂ + ⋯ + Aₖₖxₖ,

so to show that x_k is an eigenvector of T, we need only show that A_jk = 0 for j < k. By Theorem 6.15(c), x₁, ..., x_{k−1} are also eigenvectors of T*; in fact, if T(x_j) = λ_j x_j, then T*(x_j) = λ̄_j x_j. Hence, for j < k,

A_jk = ⟨T(x_k), x_j⟩ = ⟨x_k, T*(x_j)⟩ = ⟨x_k, λ̄_j x_j⟩ = λ_j⟨x_k, x_j⟩ = 0.

Therefore T(x_k) = Aₖₖxₖ, and so x_k is an eigenvector of T. The converse was already proved on page 326.

Interestingly, as the next example shows, Theorem 6.16 does not extend to infinite-dimensional complex inner product spaces.

Example 2

Consider the inner product space H defined earlier, and let x_k ∈ H be defined by x_k(t) = e^{ikt}. Suppose that V = span({x_k : k is an integer}). Clearly, β = {x_k : k is an integer} is an orthonormal basis of V. Now let T and U be the linear operators on V such that

T(x_k) = x_{k+1}   and   U(x_k) = x_{k−1}

for all integers k. Then

⟨T(xᵢ), x_j⟩ = ⟨x_{i+1}, x_j⟩ = δ_{(i+1)j} = δ_{i(j−1)} = ⟨xᵢ, x_{j−1}⟩ = ⟨xᵢ, U(x_j)⟩.

It follows that U = T*. Furthermore, TT* = I = T*T; so T is normal.

We will show that T has no eigenvectors. Suppose that x is an eigenvector of T. We may write x = Σ_{i=n}^{m} aᵢxᵢ, where a_m ≠ 0. Since T(x) = λx for some λ, we have

Σ_{i=n}^{m} aᵢx_{i+1} = λ Σ_{i=n}^{m} aᵢxᵢ.

Since a_m ≠ 0, we can write x_{m+1} as a linear combination of x_n, ..., x_m. But this is a contradiction because β is linearly independent.

Example 1 illustrates that normality is not sufficient to guarantee the existence of an orthonormal basis of eigenvectors for real inner product spaces. For real inner product spaces we must replace normality by the stronger condition that T = T*.

Definitions. Let V be an inner product space, and let T be a linear operator on V. We say that T is self-adjoint if T = T*. An n × n matrix A is self-adjoint if A = A*.

It follows immediately that T is self-adjoint if and only if [T]_β is self-adjoint, where β is an orthonormal basis. For real matrices, this condition reduces to the requirement that A be symmetric.

Before we state our main result for self-adjoint operators, we need some preliminary work.

Lemma. Let T be a self-adjoint operator on a finite-dimensional inner product space V. Then
(a) Every eigenvalue of T is real.
(b) Suppose that V is a real inner product space. Then the characteristic polynomial of T splits.

Proof. (a) Suppose that T(x) = λx for x ≠ 0. Because a self-adjoint operator is also normal, we can apply Theorem 6.15(c) to obtain

λx = T(x) = T*(x) = λ̄x.

So λ = λ̄; that is, λ is real.

(b) Let n = dim(V), let β be an orthonormal basis for V, and let A = [T]_β. Then A is self-adjoint. Define T_A: Cⁿ → Cⁿ by T_A = L_A. T_A is self-adjoint because [T_A]_γ = A, where γ is the standard ordered (orthonormal) basis of Cⁿ. So by (a) the eigenvalues of T_A are real. By the fundamental theorem of algebra, the characteristic polynomial of T_A splits into factors of the form t − λ. Since each λ is real, the characteristic polynomial splits over R. But T_A and T have the same characteristic polynomial. Therefore the characteristic polynomial of T splits.

We are now ready to establish one of the major results of this chapter.

Theorem 6.17. Let T be a linear operator on a finite-dimensional real inner product space V. Then T is self-adjoint if and only if there exists an orthonormal basis β of eigenvectors of T.

Proof. Suppose that T is self-adjoint. By the lemma we may apply Schur's theorem to obtain an orthonormal basis β for V such that the matrix A = [T]_β is upper triangular. But

A* = [T*]_β = [T]_β = A.

So A and A* are both upper triangular, and therefore A is a diagonal matrix. Thus β must consist of eigenvectors of T.

The converse is left as an exercise.

Theorem 6.17 is used extensively in many areas of mathematics and statistics. We will restate this theorem in matrix form in the next section.
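In matrix form, Theorem 6.17 says that a real symmetric matrix can be orthogonally diagonalized. A short numerical sketch, assuming numpy is available; the symmetric matrix below is an arbitrary illustration:

```python
# Theorem 6.17 in matrix form: a real symmetric matrix has an orthonormal
# basis of eigenvectors. numpy's eigh returns exactly such a basis.
import numpy as np

A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])       # an arbitrary real symmetric matrix
w, Q = np.linalg.eigh(A)           # eigenvalues w; orthonormal eigenvectors as columns of Q
print(np.allclose(Q.T @ Q, np.eye(3)))       # the columns are orthonormal
print(np.allclose(Q @ np.diag(w) @ Q.T, A))  # A = Q D Q^t
```

`numpy.linalg.eigh` is designed for symmetric/Hermitian input, so the eigenvector matrix Q it returns is orthogonal (unitary in the complex case), matching the theorem exactly.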

Example 3

As we noted earlier, real self-adjoint matrices are symmetric, and self-adjoint matrices are normal. However, the matrix A below is complex and symmetric, but neither self-adjoint nor normal.

A = [ i  1 ]        A* = [ -i  1 ]
    [ 1  1 ],            [  1  1 ].

A is not normal because (AA*)₁₂ = 1 + i, but (A*A)₁₂ = 1 − i.
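A quick check of the computation in Example 3 (using the matrix as reconstructed above), assuming numpy is available:

```python
# Example 3 check: a complex symmetric matrix need not be normal.
import numpy as np

A = np.array([[1j, 1.],
              [1., 1.]])           # complex and symmetric, but A != A*
A_star = A.conj().T
print((A @ A_star)[0, 1], (A_star @ A)[0, 1])  # (1+1j) versus (1-1j)
```

Since the (1, 2) entries of AA* and A*A differ, AA* ≠ A*A, confirming that A is not normal.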

EXERCISES

1. Label the following statements as being true or false. Assume that the underlying inner product spaces are finite-dimensional.
(a) Every self-adjoint operator is normal.
(b) Operators and their adjoints have the same eigenvectors.
(c) If T is an operator on an inner product space V, then T is normal if and only if [T]_β is normal, where β is any ordered basis for V.
(d) A real or complex matrix A is normal if and only if L_A is normal.
(e) The eigenvalues of a self-adjoint operator must all be real.
(f) The identity and zero operators are self-adjoint.
(g) Every normal operator is diagonalizable.
(h) Every self-adjoint operator is diagonalizable.

2. For each of the linear operators below, determine whether it is normal, self-adjoint, or neither.
(a) T: R² → R² defined by T(a, b) = (2a − 2b, −2a + 5b)
(b) T: C² → C² defined by T(a, b) = (2a + ib, a + 2b)
(c) T: P₂(R) → P₂(R) defined by T(f) = f′, where ⟨f, g⟩ = ∫₀¹ f(t)g(t) dt

3. Let T and U be self-adjoint operators on an inner product space. Prove that TU is self-adjoint if and only if TU = UT.

4. Prove (b) of Theorem 6.15. Then, for the operator of Exercise 2(a), find an orthonormal basis for R² consisting of eigenvectors of T.

5. Let V be a complex inner product space, and let T be a linear operator on V. Define

T₁ = (1/2)(T + T*)   and   T₂ = (1/2i)(T − T*).

(a) Prove that T₁ and T₂ are self-adjoint and that T = T₁ + iT₂.
(b) Suppose that T = U₁ + iU₂, where U₁ and U₂ are self-adjoint. Prove that U₁ = T₁ and U₂ = T₂.
(c) Prove that T is normal if and only if T₁T₂ = T₂T₁.

6. Let T be a linear operator on an inner product space V, and let W be a T-invariant subspace of V. Prove the following.
(a) If T is self-adjoint, then T_W is self-adjoint.
(b) W⊥ is T*-invariant.
(c) If W is both T- and T*-invariant, then (T_W)* = (T*)_W.
(d) If W is both T- and T*-invariant and T is normal, then T_W is normal.

7. Let T be a normal operator on a finite-dimensional complex inner product space V, and let W be a subspace of V. Prove that if W is T-invariant, then W is also T*-invariant. Hint: Use Exercise 24 of Section 5.4.

8. Let T be a normal operator on a finite-dimensional inner product space V. Prove that N(T) = N(T*) and R(T) = R(T*). Hint: Use Theorem 6.15 and Exercise 12 of Section 6.3.

9. Let T be a self-adjoint operator on a finite-dimensional inner product space V. Prove that for all x ∈ V,

||T(x) ± ix||² = ||T(x)||² + ||x||².

Deduce that (T − iI) is invertible and that [(T − iI)⁻¹]* = (T + iI)⁻¹.

10. Assume that T is a linear operator on a complex (not necessarily finite-dimensional) inner product space V with an adjoint T*. Prove the following.
(a) If T is self-adjoint, then ⟨T(x), x⟩ is real for all x ∈ V.
(b) If T satisfies ⟨T(x), x⟩ = 0 for all x ∈ V, then T = T₀. Hint: Replace x by x + y and then by x + iy, and expand the resulting inner products.
(c) If ⟨T(x), x⟩ is real for all x ∈ V, then T = T*.

11. Let T be a normal operator on a finite-dimensional real inner product space V whose characteristic polynomial splits. Prove that V has an orthonormal basis of eigenvectors of T. Hence prove that T is self-adjoint.

12. A real (square) matrix A is said to be a Gramian matrix if there exists a real (square) matrix B such that A = BᵗB. Prove that A is a Gramian matrix if and only if A is symmetric and all of its eigenvalues are nonnegative. Hint: Apply Theorem 6.17 to T = L_A to obtain an orthonormal basis {x₁, ..., xₙ} of eigenvectors with the associated eigenvalues λ₁, ..., λₙ. Define the linear operator U by U(xᵢ) = √λᵢ xᵢ, and complete the proof.

13. Let T be a self-adjoint operator on an n-dimensional inner product space V, and let A = [T]_β, where β is an orthonormal basis for V. T is said to be positive definite [positive semidefinite] if ⟨T(x), x⟩ > 0 for all x ≠ 0 [⟨T(x), x⟩ ≥ 0 for all x]. Prove the following.
(a) T is positive definite [semidefinite] if and only if all of its eigenvalues are positive [nonnegative].
(b) T is positive definite [semidefinite] if and only if L_A is.
(c) T is positive definite [semidefinite] if and only if Σ_{i,j} A_ij a_j āᵢ > 0 for all nonzero n-tuples (a₁, ..., aₙ) [≥ 0 for all n-tuples (a₁, ..., aₙ)].
(d) T is positive semidefinite if and only if A is a Gramian matrix (as defined in Exercise 12).
(e) Is the composition of positive definite operators positive definite?

14. Simultaneous Diagonalization.
(a) Let V be a finite-dimensional real inner product space, and let U and T be self-adjoint operators on V such that UT = TU. Prove that there exists an orthonormal basis for V consisting of vectors that are eigenvectors of both U and T. (The complex version of this result appears as Exercise 10 of Section 6.6.) Hint: For any eigenvalue λ of T, the eigenspace W = E_λ is both T- and U-invariant. By Exercise 6 we have that W⊥ is both T- and U-invariant. Apply Theorem 6.17 and Proposition 6.6.
(b) State and prove the analogous result about commuting symmetric (real) matrices.

15. Prove the Cayley–Hamilton theorem for a complex n × n matrix A; that is, if f(t) is the characteristic polynomial of A, show that f(A) = 0. Hint: By Schur's theorem you may assume that A is upper triangular, in which case

f(L_A) = (A₁₁I − L_A)(A₂₂I − L_A) ⋯ (AₙₙI − L_A).

Now show that (A_jjI − L_A)(x_j) ∈ span({x₁, ..., x_{j−1}}) for j ≥ 2, where {x₁, ..., xₙ} is the standard ordered basis of Cⁿ. (The general case is proved in Section 5.4.)

For Exercises 16 through 20, use the definitions found in Exercise 13.

16. Let T and U be positive definite operators on an inner product space V. Prove that:
(a) T + U is positive definite.
(b) If c > 0, then cT is positive definite.
(c) T⁻¹ is positive definite.

17. Let V be an inner product space with inner product ⟨·,·⟩, and let T be a positive definite linear operator on V. Prove that ⟨x, y⟩′ = ⟨T(x), y⟩ defines another inner product on V.

18. Let V be a finite-dimensional inner product space, and let T and U be self-adjoint operators on V such that U is positive definite. Prove that both TU and UT are diagonalizable linear operators that have only real eigenvalues. Hint: Show that TU is self-adjoint with respect to the inner product ⟨x, y⟩′ = ⟨U(x), y⟩. To show that UT is self-adjoint, repeat the argument with U⁻¹ in place of U.

19. Prove the converse of Exercise 17: Let V be a finite-dimensional inner product space with inner product ⟨·,·⟩, and let ⟨·,·⟩′ be any other inner product on V.
(a) Prove that there exists a unique linear operator T on V such that ⟨x, y⟩′ = ⟨T(x), y⟩ for all x and y in V. Hint: Let β = {x₁, ..., xₙ} be an orthonormal basis for V with respect to ⟨·,·⟩, and define the matrix A by A_ij = ⟨x_j, xᵢ⟩′ for all i and j. Let T be the unique linear operator on V such that [T]_β = A.
(b) Prove that the operator T of part (a) is positive definite with respect to both inner products.

20. Let U be a diagonalizable linear operator on a finite-dimensional inner product space V such that all of the eigenvalues of U are real. Prove that there exist positive definite linear operators T₁ and T₁′ and self-adjoint linear operators T₂ and T₂′ such that U = T₂T₁ = T₁′T₂′. Hint: Let ⟨·,·⟩ be the inner product associated with V, β a basis of eigenvectors of U, ⟨·,·⟩′ the inner product on V with respect to which β is orthonormal [see Exercise 24(a) of Section 6.1], and T₁ the positive definite operator of Exercise 19 associated with ⟨·,·⟩ and ⟨·,·⟩′. Show that U is self-adjoint with respect to ⟨·,·⟩′ and that U = T₁⁻¹U*T₁ (the adjoint is with respect to ⟨·,·⟩). Let T₂ = T₁⁻¹U*.

AMD ORTHOGONAL OPERATORS

6.5 UNITARY

MATRICES

THEIR

AMD

linear

matrix

an

inner

inner product on with respect


24(a) of Section 6.1],and
Exercise 19. Show that
U

and define the

<*, *>
the

xn} be

{xl5...,

linear operator on a finite-dimensional


all of the eigenvalues of
are
Prove
that
real.

exist positive definite

operators
inner

T of

operator

be

ft

products.

space

product

the

a diagonalizable

U be

Let

to

that

T on V such

operator

in V. Hint: Let

basis for with respect


<Xj, xt-> for all i and/ Let T
that [T]^ = A.

orthonormal
Atj

linear

unique

all x and y

j>> for

j>>'

333

Operators and Their Matrices

Orthogonal

In this section we continue our analogy between the complex numbers and linear operators. Recall that the adjoint of a linear operator acts similarly to the conjugate of a complex number (see, for example, Theorem 6.11). A complex number z has length 1 if zz̄ = 1. In this section we study those linear operators T on a vector space such that TT* = T*T = I. We will see that these are precisely the linear operators that "preserve length" in the sense that ‖T(x)‖ = ‖x‖ for all x ∈ V. As another characterization, we will prove that on a finite-dimensional complex inner product space these are the normal operators whose eigenvalues all have absolute value 1.

In past chapters we were interested in studying those functions that preserve the structure of the underlying space. In particular, linear operators preserve the operations of vector addition and scalar multiplication, and isomorphisms preserve all the vector space structure. It is now natural to consider those linear operators T on an inner product space that preserve length. We will see that this condition guarantees, in fact, that T preserves the inner product.
Definitions. Let V be an inner product space (over F), and let T be a linear operator on V. If ‖T(x)‖ = ‖x‖ for all x ∈ V, we call T a unitary operator if F = C and an orthogonal operator if F = R.

Clearly, any rotation or reflection in R² preserves length and hence is an orthogonal operator. We will study these operators in much more detail in Section 6.10.
Example 1

Let V = H, and let h ∈ V satisfy |h(x)| = 1 for all x. Define T: V → V by T(f) = hf. Then

‖T(f)‖² = ‖hf‖² = (1/2π) ∫₀^{2π} |h(t)f(t)|² dt = ‖f‖²

since |h(t)|² = 1 for all t. So T is a unitary operator.
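As an editorial aside (not part of the original text), the norm-preserving property in Example 1 can be checked numerically on a discretized version of H; the particular choices of h and f below are arbitrary illustrations.

```python
import numpy as np

# Discretized check of Example 1: when |h(t)| = 1 for all t, the
# multiplication operator T(f) = h·f preserves the norm
#   ||f||^2 = (1/2π) ∫_0^{2π} |f(t)|^2 dt,
# approximated here by a Riemann sum.
t = np.linspace(0.0, 2.0 * np.pi, 10_000, endpoint=False)
dt = t[1] - t[0]

h = np.exp(1j * 3 * t)               # |h(t)| = 1 for every t
f = np.cos(t) + 1j * np.sin(2 * t)   # an arbitrary test function

def norm_sq(g):
    # Riemann-sum approximation of (1/2π) ∫ |g|^2 dt
    return (np.abs(g) ** 2).sum() * dt / (2 * np.pi)

assert abs(norm_sq(h * f) - norm_sq(f)) < 1e-12
```

The agreement is exact up to floating-point error, because |h(t)f(t)|² = |f(t)|² pointwise.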

Theorem 6.18. Let V be a finite-dimensional inner product space, and let T be a linear operator on V. Then the following are equivalent:
(a) TT* = T*T = I.
(b) ⟨T(x), T(y)⟩ = ⟨x, y⟩ for all x, y ∈ V.
(c) If β is an orthonormal basis for V, then T(β) is an orthonormal basis for V.
(d) There exists an orthonormal basis β for V such that T(β) is an orthonormal basis for V.
(e) ‖T(x)‖ = ‖x‖ for all x ∈ V.

Thus all the conditions above are equivalent to the definition of a unitary or orthogonal operator. From (a) it follows that unitary or orthogonal operators are normal.
Before proving the theorem, we first prove the following lemma. Compare this lemma to Exercise 10(b) of Section 6.4.

Lemma. Let U be a self-adjoint operator on a finite-dimensional inner product space V. If ⟨x, U(x)⟩ = 0 for all x ∈ V, then U = T₀.

Proof. By either Theorem 6.16 or 6.17 we may choose an orthonormal basis β for V consisting of eigenvectors of U. If x ∈ β, then U(x) = λx for some λ; then

0 = ⟨x, U(x)⟩ = ⟨x, λx⟩ = λ̄⟨x, x⟩,

so λ = 0. Hence U(x) = 0 for all x ∈ β, and thus U = T₀. ∎

Proof of Theorem 6.18. First we prove that (a) implies (b). Let x, y ∈ V. Then

⟨x, y⟩ = ⟨(T*T)(x), y⟩ = ⟨T(x), T(y)⟩.

Second, we prove that (b) implies (c). Let β = {x₁, ..., xₙ} be an orthonormal basis for V, so that T(β) = {T(x₁), ..., T(xₙ)}. Then

⟨T(xᵢ), T(xⱼ)⟩ = ⟨xᵢ, xⱼ⟩ = δᵢⱼ,

so T(β) is an orthonormal basis for V.

That (c) implies (d) is obvious.

Now we prove that (d) implies (e). Let x ∈ V, and let β = {x₁, ..., xₙ} be an orthonormal basis for V such that T(β) is an orthonormal basis for V.
Then x = Σᵢ aᵢxᵢ for some scalars aᵢ, and so

‖x‖² = ⟨Σᵢ aᵢxᵢ, Σⱼ aⱼxⱼ⟩ = Σᵢ Σⱼ aᵢāⱼδᵢⱼ = Σᵢ |aᵢ|²

since β is orthonormal. Applying the same manipulations to

T(x) = Σᵢ aᵢT(xᵢ)

and using the fact that T(β) is also orthonormal, we obtain

‖T(x)‖² = Σᵢ |aᵢ|².

Hence ‖T(x)‖ = ‖x‖.
Finally, we prove that (e) implies (a). For any x ∈ V we have

⟨x, x⟩ = ‖x‖² = ‖T(x)‖² = ⟨T(x), T(x)⟩ = ⟨x, (T*T)(x)⟩.

So ⟨x, (I − T*T)(x)⟩ = 0 for all x ∈ V. Since I − T*T is self-adjoint, the lemma applies, and I − T*T = T₀; that is, T*T = I. Because V is finite-dimensional, it follows that TT* = I as well, and hence TT* = T*T = I. ∎

It follows immediately from the definition that every eigenvalue of a unitary or orthogonal operator has absolute value 1. In fact, even more is true.

Corollary 1. Let T be a linear operator on a finite-dimensional real inner product space V. Then V has an orthonormal basis of eigenvectors of T with corresponding eigenvalues of absolute value 1 if and only if T is both self-adjoint and orthogonal.

Proof. Suppose that V has an orthonormal basis {x₁, ..., xₙ} such that T(xᵢ) = λᵢxᵢ and |λᵢ| = 1 for all i. By Theorem 6.17, T is self-adjoint. Thus each λᵢ = ±1, and so

(TT*)(xᵢ) = T(λᵢxᵢ) = λᵢT(xᵢ) = λᵢ²xᵢ = xᵢ

for every i. Hence TT* = I, and T is orthogonal.

Conversely, if T is self-adjoint, then by Theorem 6.17 V possesses an orthonormal basis {x₁, ..., xₙ} such that T(xᵢ) = λᵢxᵢ for all i. If T is also orthogonal, then for each i we have

|λᵢ|·‖xᵢ‖ = ‖λᵢxᵢ‖ = ‖T(xᵢ)‖ = ‖xᵢ‖,

so |λᵢ| = 1. ∎
Corollary 2. Let T be a linear operator on a finite-dimensional complex inner product space V. Then V has an orthonormal basis of eigenvectors of T with corresponding eigenvalues of absolute value 1 if and only if T is unitary.

Proof. The proof is similar to the proof of Corollary 1. ∎
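As a numerical illustration of Theorem 6.18 and Corollary 2 (an editorial addition, not part of the original text), a unitary matrix preserves inner products and norms, and all of its eigenvalues lie on the unit circle. The matrix below is obtained as the unitary Q factor of a random complex matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a unitary matrix as the Q factor of a QR factorization
# of a random complex matrix.
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Q, _ = np.linalg.qr(M)

assert np.allclose(Q.conj().T @ Q, np.eye(4))        # Q*Q = I

x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
y = rng.standard_normal(4) + 1j * rng.standard_normal(4)

# <Qx, Qy> = <x, y>  and  ||Qx|| = ||x||
assert np.isclose(np.vdot(Q @ x, Q @ y), np.vdot(x, y))
assert np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x))

# Every eigenvalue of Q has absolute value 1.
assert np.allclose(np.abs(np.linalg.eigvals(Q)), 1.0)
```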

Example 2

Let T: R² → R² be a rotation by θ, where 0 < θ < π. It is clear geometrically that T "preserves length," i.e., that ‖T(x)‖ = ‖x‖ for all x ∈ R². The fact that rotations preserve perpendicularity not only can be seen geometrically but also now follows from (b) of Theorem 6.18. Perhaps the fact that such a transformation preserves the inner product is not so obvious geometrically; however, we obtain this fact from (b) also. Finally, an inspection of the matrix

( cos θ   −sin θ )
( sin θ    cos θ )

reveals that T is not self-adjoint for the given restriction on θ. As we mentioned earlier, this fact also can be seen easily from the geometric observation that T has no eigenvectors and from Theorem 6.15. It follows from the matrix above that T* is a rotation by −θ.

We now examine the matrices that represent unitary and orthogonal transformations.

Definitions. Let A be an n × n matrix that satisfies AA* = A*A = I. We call A a unitary matrix if A has complex entries, and we call A an orthogonal matrix if A has real entries.
Note that the condition AA* = I is equivalent to the statement that the rows A₍₁₎, ..., A₍ₙ₎ of A form an orthonormal basis for Fⁿ, because

δᵢⱼ = Iᵢⱼ = (AA*)ᵢⱼ = Σₖ Aᵢₖ(A*)ₖⱼ = Σₖ AᵢₖĀⱼₖ = ⟨A₍ᵢ₎, A₍ⱼ₎⟩.

A similar remark can be made about the columns of A and the condition A*A = I.

It also follows from the definition above that if T is a linear operator on an inner product space V, then T is unitary [orthogonal] if and only if [T]_β is unitary [orthogonal] for some orthonormal basis β of V.
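The computation above can be checked numerically (an editorial sketch, not part of the original text), here for the rotation matrix of Example 2: the condition AA* = I is exactly the statement that the rows of A are orthonormal.

```python
import numpy as np

theta = 0.7
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# AA* = I ...
assert np.allclose(A @ A.conj().T, np.eye(2))

# ... which says precisely that <A_(i), A_(j)> = δ_ij for the rows of A.
gram = np.array([[np.dot(A[i], A[j].conj()) for j in range(2)]
                 for i in range(2)])
assert np.allclose(gram, np.eye(2))
```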

Example 3

From Example 2 the matrix

( cos θ   −sin θ )
( sin θ    cos θ )

is clearly orthogonal. One can easily see that the set of rows (and also the set of columns) of this matrix is an orthonormal basis for R².

We know that, for a complex normal [real symmetric] matrix A, there exists an orthonormal basis β for Fⁿ consisting of eigenvectors of A. Hence A is similar to a diagonal matrix D. By Theorem 5.1 the matrix Q whose columns are the vectors in β is such that D = Q⁻¹AQ. But since the columns of Q form an orthonormal basis for Fⁿ, it follows that Q is unitary [orthogonal]. In this case we say that A is unitarily equivalent [orthogonally equivalent] to D. It is easily seen (see Exercise 17) that this relation is an equivalence relation on Mₙₓₙ(C) [Mₙₓₙ(R)]. More generally, A and B are unitarily equivalent [orthogonally equivalent] if and only if there exists a unitary [orthogonal] matrix P such that A = P*BP.

The preceding paragraph has proved half of each of the following two theorems.

Theorem 6.19. Let A be a complex n × n matrix. Then A is normal if and only if A is unitarily equivalent to a diagonal matrix.

Proof. By the remarks above we need only prove that if A is unitarily equivalent to a diagonal matrix, then A is normal. Suppose that A = P*DP, where P is a unitary matrix and D is a diagonal matrix. Then

AA* = (P*DP)(P*DP)* = (P*DP)(P*D*P) = P*DID*P = P*DD*P.

Similarly, A*A = P*D*DP. Since D is a diagonal matrix, however, we have DD* = D*D. Thus AA* = A*A. ∎

Theorem 6.20. Let A be a real n × n matrix. Then A is symmetric if and only if A is orthogonally equivalent to a real diagonal matrix.

Proof. The proof is similar to the proof of Theorem 6.19 and is left as an exercise. ∎

Example 4

Let

A = ( 4  2  2 )
    ( 2  4  2 )
    ( 2  2  4 ).

Since A is symmetric, Theorem 6.20 tells us that A is orthogonally equivalent to a diagonal matrix. We will show that it is easy to find an orthogonal matrix P and a diagonal matrix D such that PᵗAP = D.

To find P we must first obtain an orthonormal basis of eigenvectors of A. It is easy to show that the eigenvalues of A are 2 and 8. Eigenvectors corresponding to 2 are (−1, 1, 0) and (−1, 0, 1), and an eigenvector corresponding to 8 is (1, 1, 1). Notice that (1, 1, 1) is orthogonal to the preceding two vectors; this observation confirms Theorem 6.15(d). Because the eigenvectors (−1, 1, 0) and (−1, 0, 1) are not orthogonal, we apply the Gram–Schmidt process to these vectors to obtain orthogonal eigenvectors corresponding to 2. Therefore, an orthonormal basis of eigenvectors is

{ (1/√2)(−1, 1, 0), (1/√6)(1, 1, −2), (1/√3)(1, 1, 1) }.

Thus we have that

P = ( −1/√2   1/√6   1/√3 )
    (  1/√2   1/√6   1/√3 )
    (    0   −2/√6   1/√3 )

and

D = ( 2  0  0 )
    ( 0  2  0 )
    ( 0  0  8 ).
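The matrices P and D in Example 4 (as reconstructed above) can be verified numerically; this is an editorial sketch rather than part of the original text.

```python
import numpy as np

A = np.array([[4.0, 2.0, 2.0],
              [2.0, 4.0, 2.0],
              [2.0, 2.0, 4.0]])

# Columns of P are the orthonormal eigenvectors from Example 4.
P = np.column_stack([
    np.array([-1.0, 1.0,  0.0]) / np.sqrt(2),   # eigenvalue 2
    np.array([ 1.0, 1.0, -2.0]) / np.sqrt(6),   # eigenvalue 2
    np.array([ 1.0, 1.0,  1.0]) / np.sqrt(3),   # eigenvalue 8
])

assert np.allclose(P.T @ P, np.eye(3))                     # P is orthogonal
assert np.allclose(P.T @ A @ P, np.diag([2.0, 2.0, 8.0]))  # P^t A P = D
```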

Because the next result is the matrix form of Schur's theorem (Theorem 6.14), it is immediate, and we refer to it also as Schur's theorem.

Theorem 6.21 (Schur). Let A be an n × n matrix with entries from F whose characteristic polynomial splits over F.
(a) If F = C, then A is unitarily equivalent to a complex upper triangular matrix.
(b) If F = R, then A is orthogonally equivalent to a real upper triangular matrix.

Rigid Motions in the Plane

The purpose of this application is to characterize the so-called "rigid motions" of R². One may think intuitively of such a motion as a transformation that does not affect the shape of a figure under its action, hence the name "rigid." For example, reflections, rotations, and translations (x → x + x₀) are examples of rigid motions. In fact, we will prove that every rigid motion is a composition of these three types of transformations. The situation in general Rⁿ will be handled in Section 6.10 and will use the results obtained here.

Definition. Let V be a real inner product space. A function f: V → V is a rigid motion if

‖f(x) − f(y)‖ = ‖x − y‖

for all x, y ∈ V.

Although our main result is set in R², we first prove a number of general results about rigid motions. Throughout, we will assume that f is a rigid motion on a real inner product space V and that T: V → V is defined by

T(x) = f(x) − f(0)  for all x ∈ V.

Theorem 6.22. Every rigid motion in R² is one of two types: a rotation (about the origin) followed by a translation, or a reflection (about the x-axis) followed by a rotation (about the origin) followed by a translation.

Lemma 1. For all x, y ∈ V and a ∈ R, we have:
(a) ‖T(x)‖ = ‖x‖;
(b) ‖T(x) − T(y)‖ = ‖x − y‖;
(c) ⟨T(x), T(y)⟩ = ⟨x, y⟩;
(d) ‖T(x + ay) − T(x) − aT(y)‖ = 0.
Hence T is an orthogonal linear operator.

Proof. (a) Because f is a rigid motion, we have

‖T(x)‖ = ‖f(x) − f(0)‖ = ‖x − 0‖ = ‖x‖

for all x ∈ V.

(b) For all x, y ∈ V we have

‖T(x) − T(y)‖ = ‖[f(x) − f(0)] − [f(y) − f(0)]‖ = ‖f(x) − f(y)‖ = ‖x − y‖.

(c) By parts (b) and (a), for all x, y ∈ V we have

‖x‖² − 2⟨x, y⟩ + ‖y‖² = ‖x − y‖² = ‖T(x) − T(y)‖²
  = ‖T(x)‖² − 2⟨T(x), T(y)⟩ + ‖T(y)‖²
  = ‖x‖² − 2⟨T(x), T(y)⟩ + ‖y‖².

Part (c) follows from these two equations.

(d) For all x, y ∈ V and a ∈ R, we have by parts (b), (a), and (c):

‖T(x + ay) − T(x) − aT(y)‖² = ‖[T(x + ay) − T(x)] − aT(y)‖²
  = ‖T(x + ay) − T(x)‖² − 2a⟨T(x + ay) − T(x), T(y)⟩ + a²‖T(y)‖²
  = ‖(x + ay) − x‖² − 2a[⟨T(x + ay), T(y)⟩ − ⟨T(x), T(y)⟩] + a²‖y‖²
  = a²‖y‖² − 2a[⟨x + ay, y⟩ − ⟨x, y⟩] + a²‖y‖²
  = 2a²‖y‖² − 2a(a‖y‖²)
  = 0.

Hence T(x + ay) = T(x) + aT(y), so T is linear; and by part (a), T is an orthogonal operator. ∎

Lemma 2. Every rigid motion is an orthogonal operator followed by a translation.

Proof. By Lemma 1, T is an orthogonal operator. If we define U: V → V by U(x) = x + f(0) for all x ∈ V, then U is a translation. Also,

(UT)(x) = U(T(x)) = T(x) + f(0) = f(x). ∎

Lemma 3. If V is finite-dimensional, then det(T) = ±1.

Proof. Let β be an orthonormal basis for V. Then by Theorem 6.10 and Exercise 16 of Section 6.3, we have

det(T*) = det([T*]_β) = det([T]_β) = det(T).

Because T is orthogonal, we have T*T = I by Theorem 6.18(a). So

1 = det(I) = det(T*T) = det(T*)·det(T) = det(T)²,

and hence det(T) = ±1. ∎
Lemma 4. Suppose that T is an orthogonal operator on R² and that β is the standard ordered basis for R². Then there exists an angle θ (0 ≤ θ < 2π) such that

[T]_β = ( cos θ   −sin θ )
        ( sin θ    cos θ )    if det(T) = 1,

and

[T]_β = ( cos θ    sin θ )
        ( sin θ   −cos θ )    if det(T) = −1.

Proof. Let A = [T]_β. Because T is an orthogonal operator, we conclude from Theorem 6.18(c) that T(β) = {T(e₁), T(e₂)} is an orthonormal basis of R². Because T(e₁) is a unit vector, there exists an angle θ (0 ≤ θ < 2π) such that T(e₁) = (cos θ, sin θ). Since T(e₂) is orthogonal to T(e₁), there are only two possible choices for T(e₂): either

T(e₂) = (−sin θ, cos θ)   or   T(e₂) = (sin θ, −cos θ).

If det(T) = 1, we must have the first case; and if det(T) = −1, we must have the second case. ∎
Proof of Theorem 6.22. By Lemma 2 we need only analyze the orthogonal operator T. By Lemma 3, det(T) = ±1. By Lemma 4, if det(T) = 1, we see that T is a rotation by θ. If det(T) = −1, then using Lemma 4 we have

( cos θ    sin θ )   ( cos θ   −sin θ ) ( 1    0 )
( sin θ   −cos θ ) = ( sin θ    cos θ ) ( 0   −1 ).

Thus T is a reflection about the x-axis followed by a rotation. ∎
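The classification given by Lemma 4 and Theorem 6.22 can be sketched computationally (an editorial illustration; the function name `classify` is ours, not the book's): recover θ from the first column of the matrix, then compare against the two cases according to the determinant.

```python
import numpy as np

def classify(T):
    """Classify a 2x2 orthogonal matrix per Lemma 4 / Theorem 6.22."""
    assert np.allclose(T.T @ T, np.eye(2))     # T must be orthogonal
    theta = np.arctan2(T[1, 0], T[0, 0])       # T(e1) = (cos θ, sin θ)
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    F = np.diag([1.0, -1.0])                   # reflection about the x-axis
    if round(float(np.linalg.det(T))) == 1:
        assert np.allclose(T, R)
        return "rotation"
    assert np.allclose(T, R @ F)
    return "reflection then rotation"

th = 0.9
rot = np.array([[np.cos(th), -np.sin(th)],
                [np.sin(th),  np.cos(th)]])
assert classify(rot) == "rotation"
assert classify(rot @ np.diag([1.0, -1.0])) == "reflection then rotation"
```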

Conic Sections

As an application of Theorem 6.20, we consider the quadratic equation

ax² + 2bxy + cy² + dx + ey + f = 0.    (3)

For special choices of the coefficients in (3), we obtain the various conic sections. For example, if a = c = 1, b = d = e = 0, and f = −1, we obtain the circle x² + y² = 1 with center at the origin. The remaining conic sections, namely the ellipse, the parabola, and the hyperbola, are obtained by other choices of the coefficients. The absence of the xy-term allows easy graphing of these conics by the method of completing the square. For example, the equation x² + 2x + y² + 4y + 2 = 0 may be rewritten as (x + 1)² + (y + 2)² = 3, a circle with center at (−1, −2) in the x,y-coordinate system and radius √3. If we consider the transformation of coordinates (x, y) → (x′, y′), where x′ = x + 1 and y′ = y + 2, then our equation simplifies to (x′)² + (y′)² = 3. This change of variable allows us to eliminate the x- and y-terms.

We now concentrate solely on the elimination of the xy-term. To accomplish this, we consider the expression

ax² + 2bxy + cy²,    (4)

which is called the quadratic form associated with (3). Quadratic forms will be studied in more generality in Section 6.7.

If we let

A = ( a  b )          X = ( x )
    ( b  c )   and        ( y ),

then (4) may be rewritten as XᵗAX = ⟨AX, X⟩. For example, the quadratic form 3x² + 4xy + 6y² may be written as XᵗAX, where

A = ( 3  2 )
    ( 2  6 ).

The fact that A is symmetric is crucial in our discussion. For, by Theorem 6.20, we may choose an orthogonal matrix P and a diagonal matrix D with real diagonal entries λ₁ and λ₂ such that PᵗAP = D. Now define X′ by X′ = PᵗX or, equivalently, by X = PX′. Then

XᵗAX = (PX′)ᵗA(PX′) = X′ᵗ(PᵗAP)X′ = X′ᵗDX′ = λ₁(x′)² + λ₂(y′)².

Thus the transformation (x, y) → (x′, y′) allows us to eliminate the xy-term in (4), and hence in (3).

Furthermore, since P is orthogonal, we have det(P) = ±1 (Lemma 3 to Theorem 6.22). If det(P) = −1, we may interchange the columns of P to obtain a matrix Q. Because the columns of P form an orthonormal basis of eigenvectors of A, the same is true of the columns of Q; and det(Q) = −det(P) = 1. Hence we may as well assume that det(P) = 1. By Lemma 4 to Theorem 6.22, it follows that the matrix P represents a rotation.

In summary, the xy-term in (3) may be eliminated by a rotation of the x-axis and y-axis to new axes x′ and y′ given by X = PX′, where P is an orthogonal matrix with det(P) = 1. Furthermore, the coefficients of (x′)² and (y′)² are the eigenvalues of

A = ( a  b )
    ( b  c ).

This result is a restatement of a result known as the principal axis theorem for R². The arguments above, of course, are easily extended to quadratic equations in n variables. For example, in the case n = 3, by special choices of the coefficients we obtain the quadric surfaces: the elliptic cone, the ellipsoid, the hyperbolic paraboloid, etc.

As an example, consider the quadratic equation

2x² − 4xy + 5y² − 36 = 0,

for which the associated quadratic form is 2x² − 4xy + 5y². In the notation above,

A = (  2  −2 )
    ( −2   5 ),

so that the eigenvalues of A are 1 and 6 with associated eigenvectors

( 2 )         ( −1 )
( 1 )   and   (  2 ).

As expected from Theorem 6.15(d), these vectors are orthogonal. The corresponding orthonormal basis of eigenvectors,

{ (1/√5)(2, 1), (1/√5)(−1, 2) },

determines the new axes x′ and y′ as in Figure 6.3. Hence if

P = (1/√5) ( 2  −1 )
           ( 1   2 ),

then

PᵗAP = ( 1  0 )
       ( 0  6 ).

Under the transformation X = PX′, or

x = (2/√5)x′ − (1/√5)y′
y = (1/√5)x′ + (2/√5)y′,

we have the new quadratic form (x′)² + 6(y′)². Thus the original equation 2x² − 4xy + 5y² = 36 may be written in the form (x′)² + 6(y′)² = 36 relative to a new coordinate system with the x′- and y′-axes in the directions of the eigenvectors above, respectively. It is clear that this equation represents an ellipse (see Figure 6.3).

[Figure 6.3: the ellipse (x′)² + 6(y′)² = 36 with the rotated axes x′ and y′]
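The worked example above can be verified numerically (an editorial sketch, not part of the original text): P is a rotation, PᵗAP is diagonal, and the quadratic form agrees in both coordinate systems.

```python
import numpy as np

A = np.array([[ 2.0, -2.0],
              [-2.0,  5.0]])
P = np.array([[2.0, -1.0],
              [1.0,  2.0]]) / np.sqrt(5)

assert np.allclose(P.T @ P, np.eye(2))                # P is orthogonal
assert np.isclose(np.linalg.det(P), 1.0)              # and a rotation
assert np.allclose(P.T @ A @ P, np.diag([1.0, 6.0]))  # P^t A P = D

# Spot-check the quadratic form at a random point:
# 2x^2 - 4xy + 5y^2 equals (x')^2 + 6(y')^2 under X = PX'.
rng = np.random.default_rng(1)
Xp = rng.standard_normal(2)   # coordinates (x', y')
X = P @ Xp                    # coordinates (x, y)
lhs = 2 * X[0]**2 - 4 * X[0] * X[1] + 5 * X[1]**2
rhs = Xp[0]**2 + 6 * Xp[1]**2
assert np.isclose(lhs, rhs)
```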

EXERCISES

1. Label the following statements as being true or false. Assume that the underlying inner product spaces are finite-dimensional.
(a) Every unitary operator is normal.
(b) Every orthogonal operator is diagonalizable.
(c) A matrix is unitary if and only if it is invertible.
(d) If two matrices are unitarily equivalent, then they are also similar.
(e) The sum of unitary matrices is unitary.
(f) The adjoint of a unitary operator is unitary.
(g) If T is an orthogonal operator on V, then [T]_β is an orthogonal matrix for any ordered basis β for V.
(h) If all the eigenvalues of an operator are 1, then the operator must be unitary or orthogonal.
(i) An operator may preserve the norm but not the inner product.

2. For each of the following matrices A, find an orthogonal or unitary matrix P and a diagonal matrix D such that P*AP = D.

A = [the matrix entries for this exercise are illegible in the source]

3. Prove that the composition of unitary [orthogonal] operators is unitary [orthogonal].

4. For z ∈ C define T_z: C → C by T_z(u) = zu. Characterize those z for which T_z is normal, self-adjoint, or unitary.

5. Which of the following pairs of matrices are unitarily equivalent? [The pairs of matrices, (a) through (e), are illegible in the source.]

6. Let V be the inner product space of complex-valued continuous functions on [0, 1] with the inner product

⟨f, g⟩ = ∫₀¹ f(t)ḡ(t) dt.

Let h ∈ V, and define T: V → V by T(f) = hf. Prove that T is a unitary operator if and only if |h(t)| = 1 for 0 ≤ t ≤ 1.

7. Prove that if T is a unitary operator on a finite-dimensional inner product space V, then T has a unitary "square root"; that is, there exists a unitary operator U such that T = U².

8. Let V be an inner product space, and let T: V → V be self-adjoint. If U = (T + iI)(T − iI)⁻¹, prove, using Exercise 9 of Section 6.4, that U is unitary.
9. Let U be a linear operator on a finite-dimensional inner product space V. If ‖U(x)‖ = ‖x‖ for all x in some orthonormal basis for V, must U be unitary? Prove or give a counterexample.

10. Let A be a complex normal or real symmetric n × n matrix with eigenvalues λ₁, ..., λₙ (not necessarily distinct). Prove that

tr(A) = Σᵢ λᵢ   and   tr(A*A) = Σᵢ |λᵢ|².

11. Find an orthogonal matrix whose first row is (1/3, 2/3, 2/3).

12. Let A be an n × n real symmetric or complex normal matrix. Prove that

det(A) = ∏ᵢ λᵢ,

where the λᵢ's are the (not necessarily distinct) eigenvalues of A.

13. Suppose that A and B are diagonalizable matrices. Prove or disprove that A is similar to B if and only if A and B are unitarily equivalent.
14. Let U be a unitary operator on a finite-dimensional inner product space V, and let W be a U-invariant subspace of V. Prove that
(a) U(W) = W;
(b) W⊥ is U-invariant.
Contrast part (b) with Exercise 15.

15. Find an example of a unitary operator U on an inner product space and a U-invariant subspace W such that W⊥ is not U-invariant.

16. Prove that a matrix that is both unitary and upper triangular must be a diagonal matrix.

17. Show that "is unitarily equivalent to" is an equivalence relation on Mₙₓₙ(C).

18. Let W be a finite-dimensional subspace of an inner product space V. By Theorem 6.7 and the exercises of Section 1.3, V = W ⊕ W⊥. Define U: V → V by U(x₁ + x₂) = x₁ − x₂, where x₁ ∈ W and x₂ ∈ W⊥. Prove that U is a self-adjoint unitary operator.

19. Let V be a finite-dimensional inner product space. A linear operator U on V is called a partial isometry if there exists a subspace W of V such that ‖U(x)‖ = ‖x‖ for all x ∈ W and U(x) = 0 for all x ∈ W⊥. Observe that W need not be U-invariant. Suppose that U is such an operator and that {x₁, ..., x_k} is an orthonormal basis of W. Prove the following.
(a) ⟨U(x), U(y)⟩ = ⟨x, y⟩ for all x, y ∈ W. Hint: Use Exercise 20 of Section 6.1.
(b) {U(x₁), ..., U(x_k)} is an orthonormal basis for R(U).
(c) There exists an orthonormal basis γ for V such that the first k columns of [U]_γ form an orthonormal set and the remaining columns are zero.
(d) Let {y₁, ..., y_j} be an orthonormal basis for R(U)⊥, and let β = {U(x₁), ..., U(x_k), y₁, ..., y_j}. Then β is an orthonormal basis for V.
(e) Define T to be the linear operator on V that satisfies T(U(xᵢ)) = xᵢ (1 ≤ i ≤ k) and T(yᵢ) = 0 (1 ≤ i ≤ j). Prove that T is well-defined and that T = U*. Hint: Show that ⟨U(x), y⟩ = ⟨x, T(y)⟩ for all x, y ∈ β. There are four cases.
(f) Prove that U* is a partial isometry.
This exercise is continued in Section 6.6.
20. Let A and B be n × n matrices that are unitarily equivalent.
(a) Prove that tr(A*A) = tr(B*B).
(b) Use part (a) to prove that

Σᵢ,ⱼ |Aᵢⱼ|² = Σᵢ,ⱼ |Bᵢⱼ|².

21. Find new coordinates x′, y′ so that the following quadratic forms can be written as λ₁(x′)² + λ₂(y′)².
(a) x² + 4xy + y²
(b) 2x² + 2xy + 2y²
(c) x² + 12xy + 4y²
(d) 3x² + 2xy + 3y²
(e) x² + 2xy + y²
22. Consider the expression XᵗAX, where Xᵗ = (x, y, z) and A is as defined in Exercise 2(e). Find a change of coordinates x′, y′, z′ so that the expression above can be written in the form λ₁(x′)² + λ₂(y′)² + λ₃(z′)².

23. Let y₁, ..., yₙ be linearly independent vectors in Fⁿ, and let x₁, ..., xₙ be the orthogonal vectors obtained from y₁, ..., yₙ by the Gram–Schmidt orthogonalization process. Let z₁, ..., zₙ be the orthonormal basis obtained by defining

z_k = x_k / ‖x_k‖    (1 ≤ k ≤ n).
(a) By solving (1) in Section 6.2 for y_k in terms of z_k, show that

y_k = ‖x_k‖ z_k + Σ_{j=1}^{k−1} ⟨y_k, z_j⟩ z_j    (1 ≤ k ≤ n).

(b) Let A and Q denote the n × n matrices in which the kth columns are y_k and z_k, respectively. Define R ∈ Mₙₓₙ(F) by

R_jk = ‖x_j‖ if j = k,   R_jk = ⟨y_k, z_j⟩ if j < k,   and   R_jk = 0 if j > k.

Prove that A = QR.

(c) Compute Q and R as in (b) for the 3 × 3 matrix whose columns are the vectors y₁, y₂, y₃, respectively, in Example 4 of Section 6.2.

(d) Since Q is unitary [orthogonal] and R is upper triangular in part (b), we have shown that every invertible matrix is the product of a unitary [orthogonal] matrix and an invertible upper triangular matrix.

(e) Suppose that A = Q₁R₁ = Q₂R₂, where Q₁, Q₂ ∈ Mₙₓₙ(F) are unitary and R₁, R₂ ∈ Mₙₓₙ(F) are upper triangular. Prove that D = R₂R₁⁻¹ is a unitary diagonal matrix. Hint: Use Exercise 16.

(f) The QR factorization described in part (b) provides an orthogonalization method for solving a linear system AX = B when A is invertible: Decompose A into QR, by the Gram–Schmidt process or other means, where Q is unitary [orthogonal] and R is upper triangular. Then QRX = B, and hence RX = Q*B. This last system can be easily solved since R is upper triangular.
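The factorization of part (b) and the solution scheme of part (f) can be sketched numerically (an editorial illustration; the coefficient matrix below is the system of this exercise as reconstructed from the damaged text).

```python
import numpy as np

def gram_schmidt_qr(A):
    """Classical Gram-Schmidt QR factorization, as in part (b)."""
    n = A.shape[1]
    Q = np.zeros_like(A, dtype=float)
    R = np.zeros((n, n))
    for k in range(n):
        v = A[:, k].astype(float)
        for j in range(k):
            R[j, k] = Q[:, j] @ A[:, k]   # R_jk = <y_k, z_j> for j < k
            v -= R[j, k] * Q[:, j]
        R[k, k] = np.linalg.norm(v)       # R_kk = ||x_k||
        Q[:, k] = v / R[k, k]
    return Q, R

A = np.array([[1.0, 2.0, 2.0],
              [1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0]])
B = np.array([1.0, 11.0, -1.0])

Q, R = gram_schmidt_qr(A)
assert np.allclose(Q @ R, A)              # A = QR
assert np.allclose(Q.T @ Q, np.eye(3))    # Q is orthogonal

# Part (f): solve the triangular system RX = Q*B.
X = np.linalg.solve(R, Q.T @ B)
assert np.allclose(A @ X, B)
```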

(At one time, because of its great stability, this method for solving large systems of linear equations with a computer was being advocated as better than Gaussian elimination, even though it requires about three times as much work. Later, however, J. H. Wilkinson showed that if Gaussian elimination is done properly, then it is nearly as stable as the orthogonalization method.)

Use the orthogonalization method and part (c) to solve the system

x₁ + 2x₂ + 2x₃ = 1
x₁ + 2x₃ = 11
x₂ + x₃ = −1.

24. Suppose that β and γ are ordered orthonormal bases for an n-dimensional real (complex) inner product space V. Prove that if Q is the matrix that changes β-coordinates into γ-coordinates, then Q is an orthogonal (unitary) n × n matrix.

6.6 ORTHOGONAL PROJECTIONS AND THE SPECTRAL THEOREM

In this section

we rely heavilyon

to develop an elegant
T on a finite-dimensional
6.17

and

6.16

Theorems

operator
representation of a normalor
that
in
the
inner product space. We will prove
suchan operatorcanbe
form
where
+ XkTk,
Xu...,
Xk are the distinct eigenvalues of T and
AiTx +
self-adjoint

written

\342\200\242
\342\200\242
\342\200\242

Tl9...,Tfc

are

must

first

We

projections.\"

\"orthogonal

about these specialprojections..


5.2. The special case where

of Section

end

the

at

developed

with the results

is familiar

readef

the

that

assume

We

projection on a
V

= Wi

subspace

\\Nl

T(x) = xx.

and

22

Exercises

By

x2, where

23 of Section

R(T) = Wi = {x:

we

of

refer

simply

may

Section

Wi

not

uniquely

\302\251W2

Wi
determine

determined

uniquely

to T

N(T)-1-

We

R(T).

say

does

T.

operator

V\\ll

T on V

is a

such

that

2.1 we have
= W2.

N(T)

projection if and

on its range, and so

shown

if

only

(see

T=T2.

Ts

14

Exercise

Because

that W2 = W3) we see that Wt

an orthcgonal projection

For

of

W2) we have

x2 e

and

In fact, it canbe

not imply

W2 of

a subspace

is a projection

projection

as a projection.

\302\251W3

does

however,

is

by its range.

Definition. Let
projection.

every

T is a

that

2.3)

V =

Thus

\302\251N(T).

xx

linear

and

x}

T(x)

So V = R(T)

exists

if there

= xx +

for

and

\302\251W2,

of

sum

direct

the

sums

1.3.

Section
a

about direct
is

of
two subspaces is considered in the exercises
Recall from the exercises of Section2.1 that

some results

develop

that

be

an

inner

product

is an orthogonal

space,

and

be a

fet.T:V->V

projection if R(T)X =

N(T)

and

OrthogonalProjectionsandthe Spectral

Sec. 6.6

349

Theorem

that

Note

by

6.2 if

12(c) of Section

Exercise

is

above holds.

that one of the conditions


only assume
= N(T), then R(T) = H(T)\302\261A- = H(T)\\
R(T)-1that W is a finite-dimensional
Now
assume

need

on

W.

say even

can

We

W. For if T

and

are

exists

more\342\200\224there

if

example,

inner

one

exactly

R(T) =

= N(U), and since all


null space, we have that

on

projection

orthogonal

W, then

on

projections

orthogonal

= R(T)X = R(U)X
Hence
N(T)
determined
by their range and

For

subspace of an
product
that there exists an orthogonalprojection

6.6 guarantees

V. Proposition

space

we

finite-dimensional,

are

projections

R(U).

uniquely

call
T = U.
T the
between
an
orthogonal
projection on W. To understand the geometric
=
on W and the orthogonal projection on
V
let
R2 and
arbitrary
projection
W = span{(l, 1)}.Define and T as in Figure 6.4, where T(v) is the foot of a
from
v on
the line y = x and U(alJa2) = (^1^1)T is the
on W, and U is a projectionon
that
is not
orthogonal
projection
orthogonal.
We

difference

W,

Then

perpendicular

Note

that v

From

Figure

T(v)

eW1,

whereas

6.4 we see

N\\J\\

U(v)

\"bestapproximation

that T(v) is the

then \\\\w
is, if weW,
property characterizes T. These
6.6.
Proposition

\342\200\224

that

v\\\\

in

\342\200\224

||T(w)

In

v\\\\.

this

fact,

results follow

from

immediately

the

product

the

degree

or

a_n

/eH.

[0,27u]
to

be

by

is nonzero.

that
trigonometric polynomial of degreeless
whose coefficients are the Fourier
set {etJ'x:jis an integer}..
Let

to

corollary

analysis,

an

v\";

approximation

the
inner
recall
As an application of thisresultto Fourier
H of continuous (complex-valued) functions on
interval
space
in Section
6.1. Define a trigonometric polynomialof
introduced
function
g e H of the form

where

to

We will show

the

best
or

than

of

coefficients

Figure

to

approximation
equal

6.4

relative

to

the polynomial
to the orthonormal

n is

350

let

For this result,

projection on W. The

span({ey*:
to

corollary

\\j\\

t
J=

best

the

us that

6.6 tells

</,eV
~n

H.

to f in

approximation

Spaces

be the orthogonal

let T

and

n}),

Proposition

T(/)=
is

<

Inner Product

Chap.

An algebraic characterization of orthogonal projections follows in the next theorem.

Theorem 6.23. Let V be an inner product space, and let T be a linear operator on V. Then T is an orthogonal projection if and only if T has an adjoint and T^2 = T = T*.

Proof. Suppose that T is an orthogonal projection. Since T^2 = T because T is a projection, we need only show that T* exists and T = T*. Now V = R(T) ⊕ N(T) and R(T)⊥ = N(T). If x, y are in V, then x = x_1 + x_2 and y = y_1 + y_2, where x_1, y_1 are in R(T) and x_2, y_2 are in N(T). Hence

    <x, T(y)> = <x_1 + x_2, y_1> = <x_1, y_1> + <x_2, y_1> = <x_1, y_1>

and

    <T(x), y> = <x_1, y_1 + y_2> = <x_1, y_1> + <x_1, y_2> = <x_1, y_1>.

So <x, T(y)> = <T(x), y> for all x, y in V; thus T* exists and T = T*.

Now suppose that T^2 = T = T*. Then T is a projection (by Exercise 14 of Section 2.3), and hence we must show that R(T) = N(T)⊥ and R(T)⊥ = N(T). Let x be in R(T) and y in N(T). Then x = T(x) = T*(x), and so

    <x, y> = <T*(x), y> = <x, T(y)> = <x, 0> = 0.

Therefore x is in N(T)⊥, from which it follows that R(T) is contained in N(T)⊥.

Let y be in N(T)⊥. We must show that y is in R(T), that is, that T(y) = y. Now

    ||y - T(y)||^2 = <y - T(y), y - T(y)> = <y, y - T(y)> - <T(y), y - T(y)>.

Since y - T(y) is in N(T) (for T(y - T(y)) = T(y) - T^2(y) = 0) and y is in N(T)⊥, the first term is zero. But also

    <T(y), y - T(y)> = <y, T*(y - T(y))> = <y, T(y - T(y))> = <y, 0> = 0.

Thus y - T(y) = 0; that is, y = T(y) is in R(T). Hence R(T) = N(T)⊥.

Using the results above, we have that R(T)⊥ = N(T)⊥⊥, which contains N(T) (by Exercise 12(b) of Section 6.2). If x is in R(T)⊥, then for all y in V

    <T(x), y> = <x, T*(y)> = <x, T(y)> = 0.

So T(x) = 0, and thus x is in N(T). We conclude that R(T)⊥ = N(T). ∎
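In matrix terms, Theorem 6.23 says that a projection matrix represents an orthogonal projection exactly when it is self-adjoint. A small numpy check (an illustration under the assumption that numpy is available; P and U are the two projections of Figure 6.4):

```python
import numpy as np

P = np.array([[0.5, 0.5], [0.5, 0.5]])   # orthogonal projection on span{(1,1)}
U = np.array([[1.0, 0.0], [1.0, 0.0]])   # projection on span{(1,1)}, not orthogonal

for M in (P, U):
    assert np.allclose(M @ M, M)         # both satisfy T^2 = T
sym_P = bool(np.allclose(P.T, P))        # True:  P = P*
sym_U = bool(np.allclose(U.T, U))        # False: U != U*

# For P, range and null space are orthogonal complements:
r, n = np.array([1.0, 1.0]), np.array([1.0, -1.0])
assert np.allclose(P @ r, r) and np.allclose(P @ n, 0)
assert abs(r @ n) < 1e-12
print(sym_P, sym_U)   # True False
```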
Sec. 6.6  Orthogonal Projections and the Spectral Theorem  351

Let V be a finite-dimensional inner product space, W be a subspace of V, and T be the orthogonal projection on W. We may choose an orthonormal basis β = {x_1, ..., x_n} for V such that {x_1, ..., x_k} is a basis for W. Then [T]_β is a diagonal matrix with ones as the first k diagonal entries and zeros elsewhere. In fact, [T]_β has the form

    ( I_k  O )
    ( O    O ).

If U is any projection on W, we may choose a basis γ for V such that [U]_γ has the form above; however, γ will not necessarily be orthonormal.

We are now ready for the principal theorem of this section.

Theorem 6.24 (The Spectral Theorem). Suppose that T is a linear operator on a finite-dimensional inner product space V over F. Assume that T is normal if F = C and that T is self-adjoint if F = R. If λ_1, ..., λ_k are the distinct eigenvalues of T, let W_i be the eigenspace of T corresponding to the eigenvalue λ_i (1 <= i <= k), and let T_i be the orthogonal projection on W_i (1 <= i <= k). Then

(a) V = W_1 ⊕ ... ⊕ W_k.
(b) If W_i' denotes the direct sum of the subspaces W_j, j ≠ i, then W_i⊥ = W_i'.
(c) T_i T_j = δ_ij T_i for 1 <= i, j <= k.
(d) I = T_1 + ... + T_k.
(e) T = λ_1 T_1 + ... + λ_k T_k.

Proof. (a) By Theorems 6.16 and 6.17, T is diagonalizable; so V = W_1 ⊕ ... ⊕ W_k by Theorem 5.16.

(b) If x is in W_i and y is in W_j for some i ≠ j, then <x, y> = 0. It follows easily from this that W_i' is contained in W_i⊥. From (a) we have that

    dim(W_i') = Σ (over j ≠ i) dim(W_j) = dim(V) - dim(W_i).

On the other hand, dim(W_i⊥) = dim(V) - dim(W_i) by Theorem 6.7(c). Hence W_i' = W_i⊥, proving (b).

(c) The proof of (c) is left as an exercise.

(d) Since T_i is the orthogonal projection on W_i, it follows from (b) that N(T_i) = R(T_i)⊥ = W_i⊥ = W_i'. Hence for x in V we may write x = x_1 + ... + x_k, where x_j is in W_j, and T_i(x) = x_i. Thus

    (T_1 + ... + T_k)(x) = x_1 + ... + x_k = x,

proving (d).

(e) For x in V, write x = x_1 + ... + x_k, where x_j is in W_j (1 <= j <= k). Then

    T(x) = T(x_1) + ... + T(x_k) = λ_1 x_1 + ... + λ_k x_k = (λ_1 T_1 + ... + λ_k T_k)(x).  ∎

The set {λ_1, ..., λ_k} of distinct eigenvalues of T is called the spectrum of T, the sum I = T_1 + ... + T_k in (d) is called the resolution of the identity operator induced by T, and the sum T = λ_1 T_1 + ... + λ_k T_k in (e) is called the spectral decomposition of T. Since the distinct eigenvalues of T are uniquely determined (up to order) by T, and hence so are the subspaces W_i (and the orthogonal projections T_i), the spectral decomposition of T is unique.
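Conclusions (c), (d), and (e) of the spectral theorem are easy to verify numerically for a small self-adjoint matrix; the following numpy sketch is an illustration, not part of the text.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # self-adjoint; eigenvalues 1 and 3
lams, V = np.linalg.eigh(A)              # orthonormal eigenvectors as columns

# T_i = orthogonal projection on the eigenspace of lam_i (each rank 1 here):
T1 = np.outer(V[:, 0], V[:, 0])
T2 = np.outer(V[:, 1], V[:, 1])

assert np.allclose(T1 @ T2, 0)                       # (c): T_i T_j = 0, i != j
assert np.allclose(T1 + T2, np.eye(2))               # (d): resolution of identity
assert np.allclose(lams[0]*T1 + lams[1]*T2, A)       # (e): spectral decomposition

# g(A) = g(lam_1) T_1 + g(lam_2) T_2 for a polynomial g, e.g. g(t) = t^2 + 1:
gA = (lams[0]**2 + 1) * T1 + (lams[1]**2 + 1) * T2
assert np.allclose(gA, A @ A + np.eye(2))
```

The polynomial identity checked at the end is the fact stated just below for the spectral decomposition.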
With the notation above, let β be the union of orthonormal bases of the W_i's and let m_i = dim(W_i). (Thus m_i is the multiplicity of λ_i.) Then [T]_β has the form

    ( λ_1 I_{m_1}                       O        )
    (            λ_2 I_{m_2}                     )
    (                        ...                 )
    ( O                           λ_k I_{m_k} );

that is, [T]_β is a diagonal matrix in which the diagonal entries are the eigenvalues λ_i of T, and each λ_i is repeated m_i times.

If T = λ_1 T_1 + ... + λ_k T_k is the spectral decomposition of T, then it follows (from Exercise 7) that g(T) = g(λ_1) T_1 + ... + g(λ_k) T_k for any polynomial g. This fact will be used below.

We now list several interesting corollaries of the spectral theorem; many more results are found in the exercises. In what follows, we assume that T is a linear operator on a finite-dimensional inner product space V over F.

Corollary 1. If F = C, then T is normal if and only if T* = g(T) for some polynomial g.

Proof. Suppose first that T is normal. Let T = λ_1 T_1 + ... + λ_k T_k be the spectral decomposition of T. Taking the adjoint of both sides of the equation above, we have T* = λ̄_1 T_1 + ... + λ̄_k T_k, since each T_i is self-adjoint. Using the Lagrange interpolation formula (see Section 1.6), we may choose a polynomial g such that g(λ_i) = λ̄_i for 1 <= i <= k. Then

    g(T) = g(λ_1) T_1 + ... + g(λ_k) T_k = λ̄_1 T_1 + ... + λ̄_k T_k = T*.

Conversely, if T* = g(T) for some polynomial g, then T commutes with T*, since T commutes with every polynomial in T. Hence T is normal. ∎
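The interpolation argument in the proof of Corollary 1 can be carried out numerically. In the sketch below (a numpy illustration, not from the text), A is a normal real matrix with eigenvalues i and -i, and the degree-one Lagrange polynomial sending each eigenvalue to its conjugate is evaluated at A.

```python
import numpy as np

A = np.array([[0, -1], [1, 0]], dtype=complex)   # a rotation: normal, eigenvalues i, -i
assert np.allclose(A @ A.conj().T, A.conj().T @ A)   # A is normal

l1, l2 = np.linalg.eigvals(A)
I2 = np.eye(2)
# Lagrange polynomial g with g(l1) = conj(l1) and g(l2) = conj(l2),
# evaluated at the matrix A:
gA = (np.conj(l1) * (A - l2 * I2) / (l1 - l2)
      + np.conj(l2) * (A - l1 * I2) / (l2 - l1))

assert np.allclose(gA, A.conj().T)   # A* is a polynomial in A
```

Here g works out to g(t) = -t, so A* = -A, as one sees directly for this skew-symmetric matrix.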
Corollary 2. If F = C, then T is unitary if and only if T is normal and |λ| = 1 for every eigenvalue λ of T.

Proof. Suppose first that T is unitary. Then T is normal, and if T(x) = λx for x ≠ 0, we have ||λx|| = ||T(x)|| = ||x||; hence |λ|·||x|| = ||x||, and |λ| = 1.

Now suppose that T is normal and |λ| = 1 for every eigenvalue λ of T, and let T = λ_1 T_1 + ... + λ_k T_k be the spectral decomposition of T. Then by (c) of the spectral theorem

    TT* = (λ_1 T_1 + ... + λ_k T_k)(λ̄_1 T_1 + ... + λ̄_k T_k)
        = |λ_1|^2 T_1 + ... + |λ_k|^2 T_k = T_1 + ... + T_k = I.

Hence T is unitary. ∎

Corollary 3. If F = C and T is normal, then T is self-adjoint if and only if every eigenvalue of T is real.

Proof. Let T = λ_1 T_1 + ... + λ_k T_k be the spectral decomposition of T. Suppose that every eigenvalue of T is real. Then

    T* = λ̄_1 T_1 + ... + λ̄_k T_k = λ_1 T_1 + ... + λ_k T_k = T.

The converse has been proved in the lemma to Theorem 6.17. ∎

Corollary 4. Let T be as in the spectral theorem with spectral decomposition T = λ_1 T_1 + ... + λ_k T_k. Then each T_j is a polynomial in T.

Proof. Choose a polynomial g_j (1 <= j <= k) such that g_j(λ_i) = δ_ij. Then

    g_j(T) = g_j(λ_1) T_1 + ... + g_j(λ_k) T_k = δ_1j T_1 + ... + δ_kj T_k = T_j.  ∎
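Corollary 2 can likewise be checked on matrices (a numpy illustration, not from the text): a rotation matrix is normal with eigenvalues of absolute value 1 and is unitary, while a normal diagonal matrix with an eigenvalue 2 is not.

```python
import numpy as np

theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # rotation: normal and unitary
N = np.diag([2.0, 1.0])                           # normal, but not unitary

def unimodular_spectrum(M):
    return bool(np.allclose(np.abs(np.linalg.eigvals(M)), 1.0))

def is_unitary(M):
    return bool(np.allclose(M @ M.conj().T, np.eye(M.shape[0])))

print(unimodular_spectrum(R), is_unitary(R))   # True True
print(unimodular_spectrum(N), is_unitary(N))   # False False
```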
EXERCISES

1. Label the following statements as being true or false. Assume that the underlying inner product spaces are finite-dimensional.
   (a) All projections are self-adjoint.
   (b) An orthogonal projection is uniquely determined by its range.
   (c) Every self-adjoint operator is a linear combination of orthogonal projections.
   (d) If an operator possesses a spectral decomposition, then so does its adjoint.
   (e) If T is a projection on W, then T(x) is the vector in W that is closest to x.
   (f) Every orthogonal projection is a unitary operator.

2. Let V = R^2, W = span({(1, 2)}), and β be the standard ordered basis for V. Compute [T]_β, where T is the orthogonal projection on W. Do the same for V = R^3 and W = span({(1, 0, 1)}).

3. For each of the matrices in Exercise 2 of Section 6.5:
   (1) verify that L_A possesses a spectral decomposition;
   (2) explicitly define the orthogonal projections on the eigenspaces of L_A;
   (3) verify your results using the spectral theorem.

4. Let W be a finite-dimensional subspace of an inner product space V. Show that if T is the orthogonal projection on W, then I - T is the orthogonal projection on W⊥.

5. Let T be a projection on a finite-dimensional inner product space V.
   (a) If T is an orthogonal projection, prove that ||T(x)|| <= ||x|| for all x in V. Give an example of a projection T for which this inequality does not hold. If equality holds for all x in V, what can be concluded about T?
   (b) If ||T(x)|| <= ||x|| for all x in V, prove that T must be an orthogonal projection.

6. Let T be a projection on a finite-dimensional complex inner product space V. Prove that if T is also normal, then T is an orthogonal projection.

7. Let T be a normal operator on a finite-dimensional complex inner product space V, and let U be a linear operator on V. Use the spectral decomposition λ_1 T_1 + ... + λ_k T_k of T to prove the following.
   (a) If g is a polynomial, then g(T) = Σ (from i = 1 to k) g(λ_i) T_i.
   (b) If T^n = T_0 for some n, then T = T_0.
   (c) U commutes with T if and only if U commutes with each T_i.
   (d) If U is normal and commutes with T, then U = μ_1 T_1 + ... + μ_k T_k, where μ_1, ..., μ_k are the (not necessarily distinct) eigenvalues of U. Hint: Show that the eigenspaces of T are invariant under U.
   (e) There exists a normal operator U on V such that U^2 = T.
   (f) T is invertible if and only if λ_i ≠ 0 for 1 <= i <= k.
   (g) T is a projection if and only if every eigenvalue λ_i of T is 1 or 0.
   (h) T = -T* (such a T is called skew-symmetric) if and only if every λ_i is an imaginary number.

8. Use Corollary 1 of the spectral theorem to show that if T is a normal operator on a complex finite-dimensional inner product space and U is a linear operator that commutes with T, then U commutes with T*.

9. Referring to Exercise 19 of Section 6.5, prove the following facts about U.
   (a) U*U is an orthogonal projection on W.
   (b) UU*U = U.

10. Simultaneous Diagonalization. Let V be a finite-dimensional complex inner product space, and let U, T: V -> V be normal operators such that TU = UT. Prove that there exists an orthonormal basis for V consisting of vectors that are eigenvectors of both T and U. Hint: Use the hint of Exercise 14 of Section 6.4 along with Exercise 8.

11. Prove part (c) of the spectral theorem.

Sec. 6.7  Bilinear and Quadratic Forms

6.7 BILINEAR AND QUADRATIC FORMS

There is a certain class of scalar-valued functions of two variables defined on a vector space that is often considered in the study of such diverse subjects as geometry and multivariable calculus. This is the class of "bilinear forms." We will now study the basic properties of this class with a special emphasis on symmetric bilinear forms and consider some of its applications to quadratic surfaces and multivariable calculus.

Throughout this section all bases should be regarded as ordered bases.

Bilinear Forms

Definition. Let V be a vector space over a field F. A function H from the set V x V of ordered pairs of vectors in V to F is called a bilinear form on V if H is linear in each variable when the other variable is held fixed; that is, if

(a) H(a x_1 + x_2, y) = a H(x_1, y) + H(x_2, y) for all x_1, x_2, y in V and a in F;
(b) H(x, a y_1 + y_2) = a H(x, y_1) + H(x, y_2) for all x, y_1, y_2 in V and a in F.

We denote the set of all bilinear forms on V by ℬ(V). Observe that an inner product on a real vector space is a bilinear form.

Example 1

Define a function H: R^2 x R^2 -> R by

    H((a_1, a_2), (b_1, b_2)) = 2 a_1 b_1 + 3 a_1 b_2 + 4 a_2 b_1 - a_2 b_2.

We may verify directly that H is a bilinear form on R^2. It will prove more enlightening and less tedious, however, to observe that if

    A = ( 2   3 )        x = ( a_1 )        y = ( b_1 )
        ( 4  -1 ),           ( a_2 ),  and      ( b_2 ),

then H(x, y) = x^t A y. The bilinearity of H now follows directly from the distributive property of matrix multiplication over matrix addition. ∎

Example 2

The bilinear form of Example 1 is a special case of the following more general situation. Let V = F^n, where the elements of V are considered as column vectors. For any n x n matrix A with entries from F, define H: V x V -> F by

    H(x, y) = x^t A y   for x, y in V.

Notice that since x and y are n x 1 matrices and A is an n x n matrix, H(x, y) is a 1 x 1 matrix. We identify this matrix with its single entry. As in Example 1, the bilinearity of H follows from the distributive property of matrix multiplication over matrix addition. For example, if a is in F and x_1, x_2, y are in V, then

    H(a x_1 + x_2, y) = (a x_1 + x_2)^t A y = (a x_1^t + x_2^t) A y
                      = a x_1^t A y + x_2^t A y = a H(x_1, y) + H(x_2, y).  ∎
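Example 2 is easy to test numerically; the following numpy sketch (an illustration, not part of the text) checks both linearity conditions for a randomly chosen matrix A.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(3, 3)).astype(float)
H = lambda x, y: x @ A @ y               # H(x, y) = x^t A y

x1, x2, y1, y2 = rng.standard_normal((4, 3))
a = 2.5
# linear in the first variable:
assert np.isclose(H(a * x1 + x2, y1), a * H(x1, y1) + H(x2, y1))
# linear in the second variable:
assert np.isclose(H(x1, a * y1 + y2), a * H(x1, y1) + H(x1, y2))
print("bilinearity holds")
```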
We now list several properties possessed by all bilinear forms. Their proofs are left to the reader (see Exercise 2). For any bilinear form H on a vector space V over a field F:

1. If, for any x in V, the functions L_x, R_x: V -> F are defined by L_x(y) = H(x, y) and R_x(y) = H(y, x) for all y in V, then L_x and R_x are linear.
2. H(0, x) = H(x, 0) = 0 for all x in V.
3. For all x, y, z, w in V,

       H(x + y, z + w) = H(x, z) + H(x, w) + H(y, z) + H(y, w).

4. If J: V x V -> F is defined by J(x, y) = H(y, x), then J is a bilinear form.

For a vector space V over a field F, bilinear forms H_1, H_2 in ℬ(V), and any scalar a, we define the sum H_1 + H_2 and the product a H_1 by the equations

    (H_1 + H_2)(x, y) = H_1(x, y) + H_2(x, y)   and   (a H_1)(x, y) = a(H_1(x, y))

for all x, y in V. It is a simple exercise to verify that H_1 + H_2 and a H_1 are again bilinear forms. It is not surprising, then, that ℬ(V) is a vector space with respect to these operations.

Theorem 6.25. For any vector space V, ℬ(V) is a vector space with respect to the definitions of sum and scalar product above.

Proof. Exercise. ∎
Let V be an n-dimensional vector space with basis β = {x_1, x_2, ..., x_n}. For any bilinear form H in ℬ(V) we can associate with H an n x n matrix A whose entry in row i, column j is defined by

    A_ij = H(x_i, x_j)   for all i, j = 1, 2, ..., n.

Definition. The matrix A above is called the matrix representation of H with respect to the basis β.

We can therefore define a mapping ψ_β from ℬ(V) to M_{n x n}(F), where F is the field of scalars for V, such that ψ_β(H) = A, where A is the matrix representation of H with respect to β.

Example 3

Consider the bilinear form H of Example 1, and let β = {(1, 1), (1, -1)} and B = ψ_β(H). Then

    B_11 = H((1, 1), (1, 1))   = 2 + 3 + 4 - 1 = 8,
    B_12 = H((1, 1), (1, -1))  = 2 - 3 + 4 + 1 = 4,
    B_21 = H((1, -1), (1, 1))  = 2 + 3 - 4 + 1 = 2,
    B_22 = H((1, -1), (1, -1)) = 2 - 3 - 4 - 1 = -6.

So

    ψ_β(H) = ( 8   4 )
             ( 2  -6 ).

If γ is the standard ordered basis for R^2, the reader can verify that

    ψ_γ(H) = ( 2   3 )
             ( 4  -1 ).  ∎
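The computation of Example 3 can be reproduced mechanically from the definition B_ij = H(x_i, x_j); the numpy sketch below is an illustration, not part of the text.

```python
import numpy as np

# H(x, y) = x^t A y with A from Example 1; beta = {(1, 1), (1, -1)}.
A = np.array([[2.0, 3.0], [4.0, -1.0]])
beta = [np.array([1.0, 1.0]), np.array([1.0, -1.0])]

B = np.array([[xi @ A @ xj for xj in beta] for xi in beta])
print(B)   # B equals [[8, 4], [2, -6]], as in Example 3
```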
Theorem 6.26. For any n-dimensional vector space V over a field F and any basis β for V, ψ_β is an isomorphism from ℬ(V) onto M_{n x n}(F); i.e., ψ_β is a linear transformation from ℬ(V) to M_{n x n}(F) that is one-to-one and onto.

Proof. We leave it to the reader to verify that ψ_β is a linear transformation, and show that ψ_β is one-to-one and onto.

To show that ψ_β is one-to-one, suppose that H is in ℬ(V) and ψ_β(H) is the zero matrix. We wish to show that H is trivial, i.e., the zero bilinear form. Fix an x_i in β, and recall the function L_{x_i}: V -> F defined by L_{x_i}(x) = H(x_i, x) for all x in V. By property 1 on p. 356, L_{x_i} is linear; and by hypothesis, L_{x_i}(x_j) = H(x_i, x_j) = 0 for all x_j in β. Hence L_{x_i} is the zero function from V to F. So

    H(x_i, x) = L_{x_i}(x) = 0   for all x in V and all x_i in β.   (5)

Next fix an arbitrary y in V, and recall the mapping R_y: V -> F defined by R_y(x) = H(x, y) for all x in V. Again R_y is linear, and by (5), R_y(x_i) = H(x_i, y) = 0 for any x_i in β. Thus R_y is trivial, and we conclude that H(x, y) = R_y(x) = 0 for all x, y in V. So H is trivial, and ψ_β is therefore one-to-one.

To show that ψ_β is onto, let A be in M_{n x n}(F). Recall the mapping φ_β: V -> F^n as defined in Section 2.4; we will view φ_β(x) in F^n as a column vector. Define a mapping H: V x V -> F by

    H(x, y) = [φ_β(x)]^t A [φ_β(y)]   for all x, y in V.

By Example 2 and the linearity of φ_β, H is in ℬ(V). For any i and j, φ_β(x_i) = e_i and φ_β(x_j) = e_j. Consequently,

    H(x_i, x_j) = [φ_β(x_i)]^t A [φ_β(x_j)] = e_i^t A e_j = A_ij.

We conclude that ψ_β(H) = A, and thus ψ_β is onto. ∎

Corollary 1. For any n-dimensional vector space V, ℬ(V) is of dimension n^2.

Proof. Exercise. ∎
The following corollary is easily established by reviewing the proof of Theorem 6.26.

Corollary 2. Let V be an n-dimensional vector space over the field F with basis β. If H is in ℬ(V) and A is in M_{n x n}(F), then ψ_β(H) = A if and only if

    H(x, y) = [φ_β(x)]^t A [φ_β(y)]   for all x, y in V.

The following corollary is now an immediate consequence of Corollary 2.

Corollary 3. For any field F, positive integer n, and H in ℬ(F^n), there exists a unique matrix A in M_{n x n}(F), namely A = ψ_β(H), such that

    H(x, y) = x^t A y   for all x, y in F^n,

where β is the standard basis for F^n.

There appears to be an analogy between bilinear forms and linear operators in that each is associated with a unique square matrix, and also in that this correspondence depends on the choice of a basis for the vector space. As in the case of operators, one can pose the question: How is the matrix corresponding to a fixed bilinear form modified when the basis is changed? As we have seen, when this question was posed for linear operators, it led to the study of the similarity relation on square matrices. In the case of bilinear forms we are led to the study of another relation on square matrices, the "congruence" relation.

Definition. Two matrices A, B in M_{n x n}(F) are said to be congruent if there exists an invertible matrix Q in M_{n x n}(F) such that Q^t A Q = B.

It is easily seen that congruence is an equivalence relation (see Exercise 11). The following theorem relates congruence to the matrix representation of a bilinear form.

Theorem 6.27. Let V be a finite-dimensional vector space with bases β = {x_1, x_2, ..., x_n} and γ = {y_1, y_2, ..., y_n}, and let Q be the change of coordinate matrix changing γ-coordinates to β-coordinates. Then, for any H in ℬ(V), ψ_γ(H) = Q^t ψ_β(H) Q. In particular, ψ_γ(H) and ψ_β(H) are congruent.

Proof. There are essentially two proofs of this theorem. One involves a direct computation, while the other follows immediately from a certain clever observation. We present the former proof here and leave the latter as an exercise (see Exercise 12).

Suppose that A = ψ_β(H) and B = ψ_γ(H). Then for 1 <= i, j <= n,

    y_i = Σ (from k = 1 to n) Q_ki x_k   and   y_j = Σ (from r = 1 to n) Q_rj x_r.

Thus

    B_ij = H(y_i, y_j) = H( Σ_k Q_ki x_k, y_j ) = Σ_k Q_ki H(x_k, y_j)
         = Σ_k Q_ki H( x_k, Σ_r Q_rj x_r ) = Σ_k Q_ki Σ_r Q_rj H(x_k, x_r)
         = Σ_k Q_ki Σ_r A_kr Q_rj = Σ_k Q_ki (AQ)_kj
         = Σ_k (Q^t)_ik (AQ)_kj = (Q^t A Q)_ij.

Thus B = Q^t A Q. ∎
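Theorem 6.27 can be checked against Example 3: taking β to be the standard basis and γ = {(1, 1), (1, -1)}, the change of coordinate matrix Q has the vectors of γ as its columns, and Q^t A Q reproduces the matrix found there. (A numpy illustration, not part of the text.)

```python
import numpy as np

A = np.array([[2.0, 3.0], [4.0, -1.0]])    # psi_beta(H), beta the standard basis
Q = np.array([[1.0, 1.0], [1.0, -1.0]])    # columns: the gamma vectors (1,1), (1,-1)

B = Q.T @ A @ Q
print(B)   # B equals [[8, 4], [2, -6]], the matrix of Example 3
```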
Corollary. Let V be an n-dimensional vector space with basis β = {x_1, ..., x_n}, and let H be a bilinear form on V. Suppose that B is an n x n matrix. If B is congruent to ψ_β(H), then there exists a basis γ for V such that ψ_γ(H) = B. In fact, if B = Q^t ψ_β(H) Q for an invertible matrix Q, then Q changes γ-coordinates into β-coordinates.

Proof. Suppose that B = Q^t ψ_β(H) Q for some invertible matrix Q. Let γ = {y_1, ..., y_n} be defined by

    y_j = Σ (from i = 1 to n) Q_ij x_i   for 1 <= j <= n.

Since Q is invertible, γ is an ordered basis for V, and Q is the change of coordinate matrix that changes γ-coordinates to β-coordinates. Therefore, by Theorem 6.27,

    ψ_γ(H) = Q^t ψ_β(H) Q = B.  ∎

the

Symmetric Bilinear Forms

Like the diagonalization problem for linear operators, there is an analogous diagonalization problem for bilinear forms, namely, the problem of determining those bilinear forms for which there are diagonal matrix representations. As we will see, the "diagonalizable" bilinear forms are those that are "symmetric."

Definition. A bilinear form H on a vector space V is called symmetric if H(x, y) = H(y, x) for all x, y in V.

As the name suggests, symmetric bilinear forms correspond to symmetric matrices.

the following

V be a

Let

6.28.

Theorem

Theorem 6.28. Let V be a finite-dimensional vector space. For H in ℬ(V) the following are equivalent:

(a) H is symmetric.
(b) For any basis γ for V, ψ_γ(H) is a symmetric matrix.
(c) There exists a basis β for V such that ψ_β(H) is a symmetric matrix.

Proof. First we prove that (a) implies (b). Suppose that H is symmetric. Let γ = {y_1, y_2, ..., y_n} be a basis for V, and let B = ψ_γ(H). Then, for any i and j,

    B_ij = H(y_i, y_j) = H(y_j, y_i) = B_ji.

Thus B is a symmetric matrix, proving (b).

Clearly (b) implies (c).

Finally, we prove that (c) implies (a). Suppose that, for some basis β = {x_1, x_2, ..., x_n} for V, ψ_β(H) = A is a symmetric matrix. Define J: V x V -> F, where F is the field of scalars of V, by J(x, y) = H(y, x) for all x, y in V. By property 4 on p. 356, J is in ℬ(V). Let C = ψ_β(J). Then for any i and j,

    C_ij = J(x_i, x_j) = H(x_j, x_i) = A_ji = A_ij.

Thus C = A = ψ_β(H). Since ψ_β is one-to-one, we conclude that J = H. Hence

    H(y, x) = J(x, y) = H(x, y)   for all x, y in V,

so H is symmetric, proving (a). ∎

Definition. A bilinear form H on a finite-dimensional vector space V is called diagonalizable if there exists a basis β for V such that ψ_β(H) is a diagonal matrix.

Corollary. Let V be a finite-dimensional vector space. For any H in ℬ(V), if H is diagonalizable, then H is symmetric.

Proof. Suppose that H is diagonalizable. Then there exists a basis β for V such that ψ_β(H) = D, a diagonal matrix. Trivially, D is a symmetric matrix. So, by Theorem 6.28, H is symmetric. ∎

Unfortunately, the converse is not true, as is illustrated by the following example.
Example 4

Let F = Z_2 (see Appendix C), and let V = F^2. Define H: V x V -> F by

    H((a_1, a_2), (b_1, b_2)) = a_1 b_2 + a_2 b_1.

Clearly H is symmetric. In fact, if β is the standard basis for F^2, then

    A = ψ_β(H) = ( 0  1 )
                 ( 1  0 ),

a symmetric matrix. We will assume that H is diagonalizable and obtain a contradiction. Suppose that γ is a basis for F^2 such that B = ψ_γ(H) is a diagonal matrix. Then by Theorem 6.27 there exists an invertible matrix Q such that B = Q^t A Q. Since Q is invertible, rank(B) = rank(A) = 2. So B is a diagonal matrix whose diagonal entries are nonzero. Since the only nonzero element of F is 1,

    B = ( 1  0 )
        ( 0  1 ).

Letting

    Q = ( a  b )
        ( c  d ),

we have

    B = Q^t A Q = ( a  c )( 0  1 )( a  b ) = ( ac + ca   ad + cb )
                  ( b  d )( 1  0 )( c  d )   ( bc + da   bd + db ).

But p + p = 0 for all p in F, so ac + ca = 0. Thus, comparing the row 1, column 1 entries of the matrices in the equation above, we conclude that 1 = 0, a contradiction. Consequently, H is not diagonalizable. ∎

The failure of the bilinear form of Example 4 to be diagonalizable stems from the fact that the scalar field Z_2 is of characteristic two. If F is not of characteristic two, then 1 + 1 ≠ 0; under these circumstances we denote the scalar 1 + 1 by "2" and its multiplicative inverse by "1/2." Prior to proving the converse of the corollary to Theorem 6.28 for scalar fields of characteristic other than two, we must establish the following lemma.

Lemma. Let H be a nontrivial symmetric bilinear form on a vector space V over a field F not of characteristic two. Then there exists an element x in V such that H(x, x) ≠ 0.

Proof. Since H is nontrivial, there exist v, w in V such that H(v, w) ≠ 0. If H(v, v) ≠ 0 or H(w, w) ≠ 0, there is nothing to prove. Otherwise, suppose that H(v, v) = H(w, w) = 0. Setting x = v + w, we have

    H(x, x) = H(v, v) + H(v, w) + H(w, v) + H(w, w) = 2 H(v, w) ≠ 0,

since 2 ≠ 0 and H(v, w) ≠ 0. ∎

Theorem 6.29. Let V be a finite-dimensional vector space over a field F not of characteristic two. Then every symmetric bilinear form on V is diagonalizable.

Proof. We use mathematical induction on n = dim(V). If n = 1, then every member of ℬ(V) is certainly diagonalizable. Suppose that the theorem is valid for vector spaces of dimension less than n for some fixed integer n > 1. If H is the trivial bilinear form, then H is diagonalizable. Suppose then that H is a nontrivial symmetric bilinear form on a vector space of dimension n. By the lemma there exists an element x in V (necessarily nonzero) such that H(x, x) ≠ 0. Define L: V -> F by L(z) = H(x, z) for all z in V. Then L is linear, and since L(x) = H(x, x) ≠ 0, L is nontrivial. Consequently rank(L) = 1, and hence dim(N(L)) = n - 1. The restriction of H to N(L) is obviously a symmetric bilinear form on a vector space of dimension n - 1. Thus by the induction hypothesis there exists a basis {x_1, x_2, ..., x_{n-1}} for N(L) such that H(x_i, x_j) = 0 for i ≠ j (1 <= i, j <= n - 1). Set x_n = x. Since x_n is not in N(L), {x_1, x_2, ..., x_n} is a basis for V. In addition, H(x_n, x_i) = H(x_i, x_n) = 0 for i = 1, 2, ..., n - 1. Hence ψ_β(H) is a diagonal matrix for β = {x_1, x_2, ..., x_n}, so H is diagonalizable. ∎

Corollary. Let A in M_{n x n}(F) be a symmetric matrix, where F is a field not of characteristic two. Then A is congruent to a diagonal matrix.

Proof. Exercise. ∎

Let A be a symmetric n x n matrix with entries from a field F not of characteristic two. By the corollary to Theorem 6.29, A is congruent to a diagonal matrix. We now show how to find a diagonal matrix D and an invertible matrix Q such that Q^t A Q = D. The reader may wish to review the relationship between elementary matrices and elementary row and column operations in Section 3.1. Recall that if E is an elementary n x n matrix, then AE is obtained from A by means of an elementary column operation, and E^t A is obtained from A by means of the same operation performed on the rows rather than the columns of A. Thus E^t A E is obtained from A by performing an elementary column operation on A and then performing the same operation on the rows of AE. (Note that the order of the operations can be reversed.)

Now suppose that A can be transformed into a diagonal matrix D by means of several elementary column operations and the corresponding row operations. If the elementary matrices corresponding to the column operations (indexed in the order performed) are E_1, E_2, ..., E_k, then

    D = E_k^t ... E_2^t E_1^t A E_1 E_2 ... E_k.

By Corollary 3 to Theorem 3.6, the matrix Q = E_1 E_2 ... E_k is invertible, and we conclude that Q^t A Q = D.

The statement above provides the key for finding D and Q for a given A.

Example 5

Suppose that

    A = ( 1  2 )
        ( 2  3 ).

We begin by using an elementary column operation to insert a zero in the row 1, column 2 position; in this case we must subtract twice the first column of A from the second column. The corresponding row operation is to subtract twice the first row from the second row. Let E_1 be the elementary matrix corresponding to the elementary column operation above. Then

    E_1 = ( 1  -2 )                            E_1^t A E_1 = ( 1   0 )
          ( 0   1 ),   and consequently                      ( 0  -1 ).

Observe that since the column operation produced a zero in the row 1, column 2 position, the corresponding row operation produced a zero in the row 2, column 1 position. Since E_1^t A E_1 is a diagonal matrix, we may take D = E_1^t A E_1 and Q = E_1; then Q^t A Q = D. ∎

For a larger matrix the same steps are repeated. If, for instance, a symmetric 3 x 3 matrix A has a zero in the row 1, column 1 position, it is desirable first to obtain a nonzero (preferably +1) entry there so that it can be used to eliminate the other entries in the first row and first column. This can be arranged by interchanging the first and second columns of A and then interchanging the first and second rows of AE_1, where E_1 is the elementary matrix corresponding to the interchange; the resulting matrix is E_1^t A E_1. One then produces zeros in the remaining entries of the first row and first column by appropriate column operations, each followed by the corresponding row operation (for example, adding three times the first column to the third column and then adding three times the first row to the third row), and continues in the same manner with the smaller symmetric matrix that remains. Note that the column operations can be performed in succession prior to performing the corresponding row operations. If the elementary matrices corresponding to the column operations used are E_1, E_2, E_3, E_4, then

    D = E_4^t E_3^t E_2^t E_1^t A E_1 E_2 E_3 E_4,

so that with Q = E_1 E_2 E_3 E_4 we have Q^t A Q = D.

The reader should justify the following method for computing Q^t (and hence Q) without recording each elementary matrix separately; it is similar to the method introduced in Section 3.2 for computing the inverse of a matrix. Use a sequence of elementary column operations, each followed by the corresponding elementary row operation, to change the augmented matrix (A | I) into the form (D | B), where D is a diagonal matrix. Then B = Q^t and Q^t A Q = D.
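The procedure just described is easy to mechanize. The following Python sketch is an illustration only: it handles the common case in which a nonzero diagonal pivot can always be found by a swap, and it does not implement the x = v + w device of the lemma for an all-zero diagonal. It performs paired column and row operations while accumulating Q^t, exactly as in the (A | I) method.

```python
import numpy as np

def congruent_diagonalize(A):
    """Return (D, Qt) with Qt @ A @ Qt.T = D diagonal, for symmetric real A
    whose pivots can always be repaired by a diagonal swap (a sketch)."""
    A = A.astype(float).copy()
    n = A.shape[0]
    Qt = np.eye(n)                         # accumulates E_k^t ... E_1^t = Q^t
    for i in range(n):
        if A[i, i] == 0:                   # bring a nonzero entry to (i, i)
            for j in range(i + 1, n):
                if A[j, j] != 0:
                    A[:, [i, j]] = A[:, [j, i]]   # column swap ...
                    A[[i, j]] = A[[j, i]]         # ... and the same row swap
                    Qt[[i, j]] = Qt[[j, i]]
                    break
        if A[i, i] == 0:
            continue                       # limitation: all-zero diagonal block
        for j in range(i + 1, n):
            m = A[i, j] / A[i, i]
            A[:, j] -= m * A[:, i]         # elementary column operation
            A[j] -= m * A[i]               # corresponding row operation
            Qt[j] -= m * Qt[i]             # same row operation on the I block
    return A, Qt

A = np.array([[1.0, 2.0], [2.0, 3.0]])     # the matrix of Example 5
D, Qt = congruent_diagonalize(A)
Q = Qt.T
assert np.allclose(Q.T @ A @ Q, D)
print(np.diag(D))   # [ 1. -1.], agreeing with Example 5
```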
Quadratic Forms

Associated with symmetric bilinear forms are functions called "quadratic forms."

Definition. Let V be a vector space over a field F. A function K: V -> F is called a quadratic form if there exists a symmetric bilinear form H in ℬ(V) such that

    K(x) = H(x, x)   for all x in V.   (6)

If the field F is not of characteristic two, there is a one-to-one correspondence between symmetric bilinear forms and quadratic forms given by (6). In fact, if K is a quadratic form on a vector space V over a field F not of characteristic two, and K(x) = H(x, x) for some symmetric bilinear form H on V, then

    H(x, y) = (1/2)[K(x + y) - K(x) - K(y)]   (7)

(see Exercise 15 for details).

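Identity (7) can be tested numerically. In the sketch below (a numpy illustration, not from the text), H(x, y) = x^t S y for a symmetric real matrix S, and the right side of (7) recovers H from K.

```python
import numpy as np

S = np.array([[2.0, 3.0, 0.0],
              [3.0, -1.0, -2.0],
              [0.0, -2.0, 0.0]])           # symmetric
H = lambda x, y: x @ S @ y                  # symmetric bilinear form
K = lambda x: H(x, x)                       # its quadratic form, as in (6)

rng = np.random.default_rng(1)
x, y = rng.standard_normal((2, 3))
# polarization identity (7):
assert np.isclose((K(x + y) - K(x) - K(y)) / 2, H(x, y))
print("polarization identity holds")
```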
Example 6

The classical example of a quadratic form is the homogeneous second-degree polynomial in several variables. Given the variables t_1, t_2, ..., t_n, which take values in a field F not of characteristic two, and given (not necessarily distinct) scalars a_ij (1 <= i <= j <= n), define the polynomial

    f(t_1, t_2, ..., t_n) = Σ (over i <= j) a_ij t_i t_j.

Any polynomial of the form above is called a homogeneous polynomial of the second degree in n variables. Let K: F^n -> F be the quadratic form defined by

    K(c_1, c_2, ..., c_n) = f(c_1, c_2, ..., c_n).

If β is the standard basis for F^n, then the symmetric bilinear form corresponding to the quadratic form above is H, where ψ_β(H) = A with A_ii = a_ii and A_ij = A_ji = a_ij/2 for i < j. To see this, simply apply (7) to the quadratic form K, and verify that A_ij = H(e_i, e_j) is computable from the polynomial f by the means of (7). In particular, a given homogeneous second-degree polynomial f is obtainable from the quadratic form K by means of (6). ∎

Example 7

For the homogeneous polynomial of the second degree with real coefficients

    f(t_1, t_2, t_3) = 2 t_1^2 - t_2^2 + 6 t_1 t_2 - 4 t_2 t_3,

let

    A = ( 2   3   0 )
        ( 3  -1  -2 )
        ( 0  -2   0 ).

Setting H(x, y) = x^t A y for all x, y in R^3, we see that

    f(t_1, t_2, t_3) = (t_1, t_2, t_3) A (t_1, t_2, t_3)^t

for all (t_1, t_2, t_3)^t in R^3. ∎
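The passage from f to A in Example 7 (diagonal coefficients kept, off-diagonal coefficients halved) can be verified numerically (a numpy illustration, not from the text):

```python
import numpy as np

A = np.array([[2.0, 3.0, 0.0],
              [3.0, -1.0, -2.0],
              [0.0, -2.0, 0.0]])
f = lambda t1, t2, t3: 2*t1**2 - t2**2 + 6*t1*t2 - 4*t2*t3

rng = np.random.default_rng(2)
for _ in range(5):
    t = rng.standard_normal(3)
    assert np.isclose(t @ A @ t, f(*t))     # x^t A x reproduces f
print("f(t) = t^t A t for all tested t")
```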
Quadratic Forms over the


Since

matrices

symmetric

R are \"orthogonally
diagonalizable\" (see
bilinear
forms and quadratic forms on
symmetric
over
R is especially
nice. The following theorem
over

of
Theorem 6.20), theory
finib-dimensional vectorspaces
and its corollary are
among
bilinear and quadratic
the

the

certainly

results

useful

most

of

in the theory

forms.

let

for

bilinear

a symmetric

Hbe

V such
Proof

A = ^y(H).

that */^(H) is
Choose
A

is

such

a finite-dimensional

form

on V. Then

real inner product space,and

there exists an

basis

orthonormal

a diagonal matrix.
there

symmetric,
that

for
V,
y = {*i, \342\200\242\342\226\240.,*\342\200\236}
an orthogonal
matrix

basis

orthonormal

any

Since

diagonalmatrix

V be

Let

6.30.

Theorem

exists

= QlAQ by

Theorem 6.20. Let

=
ft

{yu

'

/?

let

and

Q and a
...,

j>n}

be defined by
n

yj=

By Theorem 6.27, $P(H)=


orthonormal,

ft is

orthonormal

D.

Qijxt

Furthermore,

for

Ki^since

by Exercise 24

is

orthogonal

of Section 6.5.

and

is
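In numerical work, the orthogonal matrix Q of Theorem 6.20, and hence the basis beta of Theorem 6.30, is produced by a spectral decomposition. A sketch, with an arbitrary symmetric matrix standing in for psi_gamma(H):

```python
import numpy as np

# For a symmetric A = psi_gamma(H), np.linalg.eigh returns an orthogonal
# Q whose columns are the coordinate vectors of the new basis beta, and
# Q^t A Q is the diagonal matrix psi_beta(H).
A = np.array([[1.0, 2.0],
              [2.0, 1.0]])          # an arbitrary symmetric example
lam, Q = np.linalg.eigh(A)

assert np.allclose(Q.T @ Q, np.eye(2))            # beta is orthonormal
assert np.allclose(Q.T @ A @ Q, np.diag(lam))     # psi_beta(H) is diagonal
```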

Corollary. Let K be a quadratic form on a finite-dimensional real inner product space V. There exists an orthonormal basis beta = {x1, x2, ..., xn} for V and (not necessarily distinct) scalars lambda_1, lambda_2, ..., lambda_n such that if x is in V and

    x = sum from i = 1 to n of s_i x_i,  s_i in R,

then

    K(x) = sum from i = 1 to n of lambda_i s_i^2.

In fact, if H is the symmetric bilinear form determined by K, then beta can be chosen to be any orthonormal basis for V for which psi_beta(H) is a diagonal matrix.

Proof. Let H be the symmetric bilinear form for which K(x) = H(x, x) for all x in V. By Theorem 6.30, there exists an orthonormal basis beta = {x1, x2, ..., xn} for V for which psi_beta(H) = D is a diagonal matrix. Let lambda_1, lambda_2, ..., lambda_n denote the diagonal entries of D, and suppose that x = sum of s_i x_i. Then

    K(x) = H(x, x) = [phi_beta(x)]^t D [phi_beta(x)] = (s1, ..., sn) D (s1; ...; sn) = sum from i = 1 to n of lambda_i s_i^2.
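In coordinates, the corollary says that every quadratic form on R^n is a weighted sum of squares in a suitable orthonormal basis. Numerically (with an arbitrary symmetric matrix as illustration):

```python
import numpy as np

# Writing x in the orthonormal eigenbasis, x = sum_i s_i x_i, the
# quadratic form x^t A x collapses to sum_i lambda_i s_i^2.
A = np.array([[3.0, 1.0],
              [1.0, 3.0]])          # an illustrative symmetric matrix
lam, Q = np.linalg.eigh(A)

x = np.array([1.0, 2.0])
s = Q.T @ x                         # coordinates of x relative to beta
assert np.isclose(x @ A @ x, np.sum(lam * s**2))
```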

Example 7

For the homogeneous real polynomial of degree 2

    f(t1, t2) = 5t1^2 + 2t2^2 + 4t1t2,    (8)

we will find an orthonormal basis beta = {x1, x2} for R^2 and scalars lambda_1 and lambda_2 such that if

    (t1; t2) = s1 x1 + s2 x2,  s1, s2 in R,

then f(t1, t2) = lambda_1 s1^2 + lambda_2 s2^2. We may think of s1 and s2 as the coordinates of (t1; t2) relative to beta. Thus the polynomial f(t1, t2), an expression involving the coordinates of a point with respect to the standard basis of R^2, is transformed into a new polynomial g(s1, s2) = lambda_1 s1^2 + lambda_2 s2^2, an expression involving the coordinates of the point relative to the new basis beta.

Let H denote the symmetric bilinear form corresponding to the quadratic form K defined by (8). If gamma is the standard ordered basis for R^2, then

    A = psi_gamma(H) = ( 5  2
                         2  2 ).

We begin by computing an orthogonal matrix Q for which Q^t A Q is a diagonal matrix; for this we find an orthonormal basis of eigenvectors of L_A. The characteristic polynomial of A is

    h(t) = det( 5-t   2
                 2   2-t ) = (t - 6)(t - 1).

Thus the eigenvalues of A are lambda_1 = 6 and lambda_2 = 1, and each has multiplicity 1. A simple computation yields corresponding eigenvectors of norm one,

    x1 = (1/sqrt(5))(2; 1)  and  x2 = (1/sqrt(5))(-1; 2).

Since x1 and x2 are orthogonal, beta = {x1, x2} is an orthonormal basis for R^2. Setting

    Q = (1/sqrt(5)) ( 2  -1
                      1   2 ),

we see that Q is an orthogonal matrix and

    Q^t A Q = ( 6  0
                0  1 ).

Clearly Q is also a change of coordinate matrix. Thus, by the corollary to Theorem 6.30,

    K(x) = 6s1^2 + s2^2

for any x = s1 x1 + s2 x2 in R^2. Consequently, f(t1, t2) = 6s1^2 + s2^2.
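Example 7 can be confirmed numerically; the sketch below recomputes the eigenvalues 6 and 1 and checks the identity f(t1, t2) = 6s1^2 + s2^2 at an arbitrary point:

```python
import numpy as np

# A = [[5,2],[2,2]] has eigenvalues 6 and 1, and in eigenvector
# coordinates f(t1,t2) = 5t1^2 + 2t2^2 + 4t1t2 becomes a sum of
# weighted squares. (np.linalg.eigh sorts eigenvalues ascending,
# so lam = [1, 6] here.)
A = np.array([[5.0, 2.0],
              [2.0, 2.0]])
lam, Q = np.linalg.eigh(A)
assert np.allclose(lam, [1.0, 6.0])

t = np.array([3.0, -1.0])           # any point (t1, t2)
s = Q.T @ t                         # its coordinates relative to beta
f = 5*t[0]**2 + 2*t[1]**2 + 4*t[0]*t[1]
assert np.isclose(f, lam[0]*s[0]**2 + lam[1]*s[1]**2)
```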

The following example illustrates how the theory of quadratic forms can be applied to the problem of describing quadratic surfaces in R^3.

Example 8

Consider the surface S in R^3 defined by the equation

    2t1^2 + 6t1t2 + 5t2^2 - 2t2t3 + 2t3^2 + 3t1 - 2t2 - t3 + 14 = 0;    (9)

i.e., S is the set of all points of R^3 whose coordinates t1, t2, and t3 relative to the standard basis satisfy (9). If beta is any orthonormal basis for R^3, then S also consists of all points whose coordinates relative to beta satisfy the equation describing S relative to beta. We would like to select a new orthonormal basis beta such that the equation describing S relative to beta is considerably simpler than (9).

We begin with the observation that the terms of second degree on the left-hand side of (9) add to form a quadratic form K on R^3:

    K(t1; t2; t3) = 2t1^2 + 6t1t2 + 5t2^2 - 2t2t3 + 2t3^2.

Next we diagonalize K. If H is the symmetric bilinear form corresponding to K and A = psi_gamma(H), where gamma is the standard ordered basis for R^3, then

    A = ( 2   3   0
          3   5  -1
          0  -1   2 ).

Now the characteristic polynomial of A is

    h(t) = det( 2-t   3    0
                 3   5-t  -1
                 0   -1   2-t ) = -t(t - 2)(t - 7),

and consequently A has eigenvalues lambda_1 = 2, lambda_2 = 7, and lambda_3 = 0. A simple calculation yields eigenvectors of norm 1 corresponding to the respective eigenvalues:

    x1 = (1/sqrt(10))(1; 0; 3),  x2 = (1/sqrt(35))(3; 5; -1),  and  x3 = (1/sqrt(14))(-3; 2; 1).

Now set beta = {x1, x2, x3} and

    Q = ( 1/sqrt(10)   3/sqrt(35)  -3/sqrt(14)
              0        5/sqrt(35)   2/sqrt(14)
          3/sqrt(10)  -1/sqrt(35)   1/sqrt(14) ).

As in Example 7, Q is the change of coordinate matrix changing beta-coordinates to gamma-coordinates, and

    psi_beta(H) = Q^t psi_gamma(H) Q = Q^t A Q = ( 2  0  0
                                                   0  7  0
                                                   0  0  0 ).

By the corollary to Theorem 6.30, if x = s1 x1 + s2 x2 + s3 x3, then

    K(x) = 2s1^2 + 7s2^2.    (10)

We are now ready to transform (9) into an equation involving coordinates relative to beta. If x = (t1; t2; t3) = s1 x1 + s2 x2 + s3 x3, we have

    (t1; t2; t3) = Q (s1; s2; s3).

Thus

    t1 = s1/sqrt(10) + 3s2/sqrt(35) - 3s3/sqrt(14),
    t2 = 5s2/sqrt(35) + 2s3/sqrt(14),
    t3 = 3s1/sqrt(10) - s2/sqrt(35) + s3/sqrt(14).

Therefore,

    3t1 - 2t2 - t3 = -sqrt(14) s3.

Combining (9), (10), and the equation above, we conclude that x = s1 x1 + s2 x2 + s3 x3 lies on S if and only if

    2s1^2 + 7s2^2 - sqrt(14) s3 + 14 = 0,

or

    s3 = (2/sqrt(14)) s1^2 + (7/sqrt(14)) s2^2 + sqrt(14).

Consequently, if we set x' = s1, y' = s2, and z' = s3 - sqrt(14), the equation above can be rewritten as

    z' = (2/sqrt(14))(x')^2 + (7/sqrt(14))(y')^2.

Thus if we draw new axes x', y', and z' in the directions of x1, x2, and x3, respectively, the graph of this equation will coincide with the surface S, and we recognize S to be an elliptic paraboloid (see Figure 6.5).

Figure 6.5
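The computations of Example 8 can likewise be confirmed numerically. The sketch below checks the eigenvalues of the matrix of the quadratic part of (9) and verifies that the linear part 3t1 - 2t2 - t3 points entirely along the eigenvalue-0 direction, with magnitude sqrt(14):

```python
import numpy as np

# Matrix of the quadratic part of equation (9).
A = np.array([[2.0, 3.0, 0.0],
              [3.0, 5.0, -1.0],
              [0.0, -1.0, 2.0]])
lam, Q = np.linalg.eigh(A)
assert np.allclose(np.sort(lam), [0.0, 2.0, 7.0])

c = np.array([3.0, -2.0, -1.0])     # coefficients of the linear part
# In s-coordinates the linear part has coefficient vector Q^t c; it is
# nonzero only in the eigenvalue-0 direction, with magnitude sqrt(14).
coeffs = Q.T @ c
zero_dir = np.argmin(np.abs(lam))
assert np.isclose(abs(coeffs[zero_dir]), np.sqrt(14.0))
```

This is why no completion of squares is needed in the s1 and s2 directions: the linear part contributes only a multiple of s3.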

The Second Derivative Test for Functions of Several Variables

We now consider an application of the theory of quadratic forms to multivariable calculus: the derivation of the second derivative test for local extrema of a function of several variables. We assume an acquaintance with the calculus of functions of several variables to the extent of Taylor's theorem. The reader is undoubtedly familiar with the one-variable version of Taylor's theorem; for a statement and proof of the multivariable version, consult, for example, Introduction to Analysis, by Maxwell Rosenlicht (Dover, New York, 1986).

Let z = f(t1, t2, ..., tn) be a fixed real-valued function of n real variables for which all third-order partial derivatives exist and are continuous. The function f is said to have a local maximum at a point p in R^n if there exists a positive number delta such that f(p) >= f(x) whenever ||x - p|| < delta. Likewise, f is said to have a local minimum at p in R^n if, for some delta > 0, f(p) <= f(x) whenever ||x - p|| < delta. If f has either a local maximum or a local minimum at p, we say that f has a local extremum at p. A point p in R^n is called a critical point of f if

    df(p)/dt_i = 0  for i = 1, 2, ..., n,

where df/dt_i denotes the partial derivative of f with respect to the ith variable. It is a well-known fact of calculus that if f has a local extremum at a point p in R^n, then p is a critical point of f. For if f has a local extremum at p = (p1, p2, ..., pn), then for any i we may define a real-valued function phi_i of one variable t by

    phi_i(t) = f(p1, p2, ..., p_{i-1}, t, p_{i+1}, ..., pn).

Then phi_i has a local extremum at t = p_i, so by ordinary one-variable calculus

    d(phi_i)(p_i)/dt = df(p)/dt_i = 0

for each i. Obviously, p is a critical point of f. Unfortunately, critical points are not necessarily local extrema. The second derivative test gives us additional conditions under which critical points are local extrema.
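The definition of a critical point translates directly into a numerical test; in the following sketch both the sample function f and the finite-difference step size are illustrative choices, not part of the text:

```python
import numpy as np

# p is a critical point when every partial derivative of f vanishes at
# p. For f(t1,t2) = t1^3 - 3*t1 + t2^2 the critical points are (1, 0)
# and (-1, 0).
def grad(f, p, h=1e-6):
    """Central-difference gradient of f at p (a numerical sketch)."""
    p = np.asarray(p, dtype=float)
    g = np.zeros_like(p)
    for i in range(len(p)):
        e = np.zeros_like(p)
        e[i] = h
        g[i] = (f(p + e) - f(p - e)) / (2 * h)
    return g

f = lambda t: t[0]**3 - 3*t[0] + t[1]**2
assert np.allclose(grad(f, [1.0, 0.0]), 0.0, atol=1e-6)
assert np.allclose(grad(f, [-1.0, 0.0]), 0.0, atol=1e-6)
```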

Theorem 6.31 (The Second Derivative Test). Let f(t1, t2, ..., tn) be a real-valued function of n real variables for which all third-order partial derivatives exist and are continuous. Let p be a critical point of f, and let A denote the n x n matrix whose entries are given by

    A_ij = d^2 f(p) / (dt_i dt_j).

(Note that A is a symmetric matrix and therefore has real eigenvalues.)

(a) If all the eigenvalues of A are positive, then f has a local minimum at p.
(b) If all the eigenvalues of A are negative, then f has a local maximum at p.
(c) If A has at least one positive and at least one negative eigenvalue, then f has no local extremum at p (i.e., p is a saddle-point of f).
(d) If rank(A) < n and A does not have both positive and negative eigenvalues, then the second derivative test is inconclusive.
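In practice the test is applied by computing the matrix A of second partials (the Hessian) and examining the signs of its eigenvalues. In the sketch below the function and the finite-difference step size are illustrative choices; the critical point (1, 0) of f(t1, t2) = t1^3 - 3t1 + t2^2 is classified as a local minimum:

```python
import numpy as np

def hessian_at(p, f, h=1e-5):
    """Symmetric finite-difference Hessian of f at p (a numerical sketch)."""
    n = len(p)
    A = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i, e_j = np.eye(n)[i] * h, np.eye(n)[j] * h
            A[i, j] = (f(p + e_i + e_j) - f(p + e_i - e_j)
                       - f(p - e_i + e_j) + f(p - e_i - e_j)) / (4 * h * h)
    return A

f = lambda t: t[0]**3 - 3*t[0] + t[1]**2
p = np.array([1.0, 0.0])            # a critical point: grad f(p) = 0
lam = np.linalg.eigvalsh(hessian_at(p, f))
assert np.all(lam > 0)              # all eigenvalues positive: local minimum
```

At p = (1, 0) the exact Hessian is diag(6, 2), so part (a) of the theorem applies.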

Proof. If p != 0, we may define a function g: R^n -> R by

    g(t1, t2, ..., tn) = f(t1 + p1, t2 + p2, ..., tn + pn),

where p = (p1, p2, ..., pn). The following observations are easily verified:

1. The function f has a local maximum [minimum] at p if and only if g has a local maximum [minimum] at 0 = (0, 0, ..., 0).
2. The partial derivatives of g at 0 coincide with the corresponding partial derivatives of f at p.
3. 0 is a critical point of g.
4. g(0) = f(p).

In view of the observations above, we may suppose without loss of generality that p = 0 and f(p) = 0. We next apply Taylor's theorem to f at 0 and conclude that there exists a real-valued function S on R^n such that

    lim as x -> 0 of S(x)/||x||^2 = 0    (11)

and

    f(t1, ..., tn) = f(0) + sum over i of [df(0)/dt_i] t_i + (1/2) sum over i, j of [d^2 f(0)/(dt_i dt_j)] t_i t_j + S(t1, ..., tn).    (12)

Under the hypotheses that 0 is a critical point of f and f(0) = 0, (12) reduces to

    f(t1, ..., tn) = (1/2) sum over i, j of [d^2 f(0)/(dt_i dt_j)] t_i t_j + S(t1, ..., tn).    (13)

Let us define a quadratic form K: R^n -> R by

    K(t1; ...; tn) = (1/2) sum over i, j of [d^2 f(0)/(dt_i dt_j)] t_i t_j.    (14)

Let H be the symmetric bilinear form corresponding to K, and let gamma be the standard ordered basis for R^n. It is a simple matter to verify that psi_gamma(H) = (1/2)A. Since A is symmetric (L_A is self-adjoint), Theorem 6.20 shows that there exists an orthogonal matrix Q such that Q^t[(1/2)A]Q = (1/2)Q^t A Q is a diagonal matrix whose ith diagonal entry is (1/2)lambda_i, where lambda_1, lambda_2, ..., lambda_n are the (real) eigenvalues of A. Let beta = {x1, x2, ..., xn} be the orthonormal basis for R^n whose ith member is the ith column of Q. Then Q is the change of coordinate matrix changing beta-coordinates into standard coordinates, and by Theorem 6.27,

    psi_beta(H) = Q^t psi_gamma(H) Q = (1/2) Q^t A Q.

Suppose that A is not the zero matrix; then A has nonzero eigenvalues. Pick a positive number epsilon such that epsilon < |lambda_i|/2 for every nonzero eigenvalue lambda_i. By (11) there exists a positive number delta such that if 0 < ||x|| < delta, then |S(x)| < epsilon ||x||^2. Now pick any x in R^n for which 0 < ||x|| < delta. Then |f(x) - K(x)| = |S(x)| < epsilon ||x||^2, or

    K(x) - epsilon ||x||^2 < f(x) < K(x) + epsilon ||x||^2.    (15)

If x = sum of s_i x_i, then ||x||^2 = sum of s_i^2 and, by the corollary to Theorem 6.30,

    K(x) = sum over i of (1/2)lambda_i s_i^2.    (16)

Thus by (15) and (16),

    sum over i of [(1/2)lambda_i - epsilon] s_i^2 < f(x) < sum over i of [(1/2)lambda_i + epsilon] s_i^2.    (17)

Now suppose that all the eigenvalues of A are positive. Then (1/2)lambda_i - epsilon > 0 for all i, and hence by the left inequality in (17),

    f(0) = 0 < sum over i of [(1/2)lambda_i - epsilon] s_i^2 < f(x)  for 0 < ||x|| < delta.

We conclude that f has a local minimum at 0. Similarly, by an argument involving the right inequality in (17), we conclude that if all the eigenvalues of A are negative, then f has a local maximum at 0. This establishes parts (a) and (b) of the theorem.

Next suppose that A has both a positive and a negative eigenvalue, say lambda_i > 0 and lambda_j < 0 for some i and j. Then (1/2)lambda_i - epsilon > 0 and (1/2)lambda_j + epsilon < 0. Let s be any nonzero real number such that ||s x_i|| = ||s x_j|| = |s| < delta. Then by (17),

    f(0) = 0 < [(1/2)lambda_i - epsilon] s^2 < f(s x_i)

and

    f(s x_j) < [(1/2)lambda_j + epsilon] s^2 < 0 = f(0).

Since |s| attains values arbitrarily close to 0, f has neither a local maximum nor a local minimum at 0. This establishes (c).

To illustrate that the second derivative test is inconclusive under the conditions stated in (d) of the theorem, consider the functions

    f(t1, t2) = t1^2 + t2^4  and  f(t1, t2) = t1^2 - t2^4.

Both functions have a critical point at 0 and the same matrix

    A = ( 2  0
          0  0 ),

for which rank(A) = 1 < 2 and which has no negative eigenvalues; but in the former case f has a local minimum at 0, while in the latter case f does not have a local extremum at 0.

Sylvester's Law of Inertia

Any two matrix representations of a bilinear form have the same rank because rank is preserved under congruence. We can therefore define the rank of a bilinear form to be the rank of any one of its matrix representations. If a matrix representation is a diagonal matrix, then the rank is equal to the number of nonzero diagonal entries of the matrix.

We confine our analysis to symmetric bilinear forms on finite-dimensional real vector spaces. Each such form has a diagonal representation, which may have positive and negative as well as zero diagonal entries. Although these entries are not unique, we will show that the number of entries that are positive and the number of entries that are negative are unique. That is, they are independent of the choice of diagonal representation. This result is called Sylvester's law of inertia. We will prove this law and apply it to describe the equivalence classes of congruent real symmetric matrices.

Sylvester's Law of Inertia. Let H be a symmetric bilinear form on a finite-dimensional real vector space V. Then the number of positive diagonal entries and the number of negative diagonal entries of any diagonal representation of H are both independent of the diagonal representation.

Proof. Suppose that beta and gamma are ordered bases for V that determine diagonal representations of H. Without loss of generality, we may assume that both beta and gamma are ordered so that on each diagonal the positive entries precede the negative entries, which in turn precede the zero entries. It suffices to show that both representations have the same number of positive entries, because the number of negative entries is equal to the difference between the rank and the number of positive entries. Let p and q be the number of positive diagonal entries of the diagonal representations of H with respect to beta and gamma, respectively. We suppose that p != q and arrive at a contradiction. Without loss of generality, assume that p < q. Write

    beta = {x1, ..., xp, ..., xr, ..., xn}  and  gamma = {y1, ..., yq, ..., yr, ..., yn},

where r is the rank of H and p < q <= r <= n. Let L: V -> R^(p+r-q) be the mapping defined by

    L(x) = (H(x, x1), ..., H(x, xp), H(x, y_{q+1}), ..., H(x, y_r)).

It is easy to verify that L is linear and that rank(L) <= p + r - q < r. Hence

    nullity(L) = dim(V) - rank(L) > n - r,

so that N(L) is not contained in the span of {x_{r+1}, ..., xn}, and we can choose a nonzero vector x0 in N(L) such that x0 is not in the span of {x_{r+1}, ..., xn}. Write

    x0 = sum over j of a_j x_j  and  x0 = sum over j of b_j y_j.

For any i <= p we have H(x0, x_i) = 0, and

    0 = H(x0, x_i) = H(sum of a_j x_j, x_i) = sum over j of a_j H(x_j, x_i) = a_i H(x_i, x_i),

where H(x_i, x_i) > 0; therefore a_i = 0 whenever i <= p. Similarly, H(x0, y_i) = 0 for q < i <= r, and H(y_i, y_i) < 0 for these i, so b_i = 0 whenever q < i <= r. Since x0 is not in the span of {x_{r+1}, ..., xn}, a_i != 0 for some p < i <= r. Thus

    H(x0, x0) = H(sum of a_j x_j, sum of a_j x_j) = sum from j = p+1 to r of (a_j)^2 H(x_j, x_j) < 0.

Furthermore,

    H(x0, x0) = H(sum of b_j y_j, sum of b_j y_j) = sum from j = 1 to q of (b_j)^2 H(y_j, y_j) >= 0.

So H(x0, x0) < 0 and H(x0, x0) >= 0, which is a contradiction. We conclude that p = q.

Definitions. The number of positive diagonal entries of a diagonal representation of a symmetric bilinear form on a real vector space is called the index of the form. The difference between the number of positive and the number of negative diagonal entries of a diagonal representation is called the signature of the form. The three terms "rank," "index," and "signature" are called invariants of the bilinear form because they are invariant with respect to matrix representations. These same terms apply to the associated quadratic form. Notice that the values of any two of these invariants determine the value of the third.
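Because an orthogonal diagonalization is in particular a congruence (Q^t = Q^{-1}), the invariants of a real symmetric matrix can be read off from the signs of its eigenvalues. A sketch:

```python
import numpy as np

# Rank, index, and signature of a real symmetric matrix, computed from
# a diagonal representation supplied by the spectral decomposition.
def invariants(A, tol=1e-10):
    lam = np.linalg.eigvalsh(A)
    pos = int(np.sum(lam > tol))
    neg = int(np.sum(lam < -tol))
    return {"rank": pos + neg, "index": pos, "signature": pos - neg}

A = np.array([[2.0, 3.0, 0.0],
              [3.0, 5.0, -1.0],
              [0.0, -1.0, 2.0]])    # the matrix of Example 8
assert invariants(A) == {"rank": 2, "index": 2, "signature": 2}
```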

Example 9

The bilinear form corresponding to the quadratic form K of Example 8 has a 3 x 3 diagonal matrix representation with diagonal entries 2, 7, and 0. Therefore, the rank, index, and signature of K are each 2.

Example 10

The bilinear form corresponding to the quadratic form K(x, y) = x^2 - y^2 on R^2 has as its matrix representation with respect to the standard ordered basis the diagonal matrix with diagonal entries 1 and -1. Therefore, the rank of K is 2, the index of K is 1, and the signature of K is 0.

Since the congruence relation is intimately associated with the theory of bilinear forms, we can apply Sylvester's law of inertia to study this relation on the set of real symmetric matrices. Let A be an n x n real symmetric matrix, and suppose that D and E are each diagonal matrices congruent to A. By Corollary 3 to Theorem 6.26, A is the matrix representation of the bilinear form H on R^n defined by H(x, y) = x^t A y with respect to the standard ordered basis for R^n. By Theorem 6.27, D and E are also matrix representations of H, and so by Sylvester's law of inertia, D and E have the same number of positive as well as negative diagonal entries. We can formulate this fact as the matrix version of Sylvester's law.

Corollary 1 (Sylvester's Law of Inertia for Matrices). Let A be a real symmetric matrix. Then the number of positive diagonal entries and the number of negative diagonal entries of any diagonal matrix congruent to A is independent of the choice of the diagonal matrix.

Definitions. Let A be a real symmetric matrix, and let D be a diagonal matrix congruent to A. The number of positive diagonal entries of D is called the index of A. The difference between the number of positive diagonal entries and the number of negative diagonal entries of D is called the signature of A. As before, "rank," "index," and "signature" are called invariants of the matrix, and the values of any two of these invariants determine the value of the third.

The invariants can be used to determine when two real symmetric matrices are congruent.

Corollary 2. Two real symmetric n x n matrices are congruent if and only if they have the same invariants.

Proof. If two real symmetric n x n matrices are congruent, then they are congruent to the same diagonal matrix, and it follows that they have the same invariants.

Conversely, suppose that A and B are real symmetric n x n matrices with the same invariants. Let D and E be diagonal matrices congruent to A and B, respectively. By Exercise 22 we may choose D and E so that on each diagonal the entries appear in the order of positive, negative, and zero. Since A and B have the same invariants, so do D and E. Let p and r denote the index and the rank, respectively, of both D and E. Let d_i denote the ith diagonal entry of D, and let Q be the n x n diagonal matrix whose ith diagonal entry q_i is given by

    q_i = 1/sqrt(d_i)   if 1 <= i <= p,
    q_i = 1/sqrt(-d_i)  if p < i <= r,
    q_i = 1             if r < i <= n.

Then

    Q^t D Q = J_pr = ( I_p      O       O
                        O   -I_{r-p}    O
                        O       O       O ).

It follows that A is congruent to J_pr. Similarly, B is congruent to J_pr, and hence A is congruent to B.

The matrix J_pr in the proof of Corollary 2 acts as a canonical form for the theory of congruent real symmetric matrices. The next corollary, whose proof is contained in the proof of Corollary 2, describes the equivalence classes of this relation.

Corollary 3. A real symmetric n x n matrix A has index p and rank r if and only if A is congruent to J_pr (as defined above).
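The proof of Corollary 2 is constructive, and the reduction to J_pr can be carried out numerically; in the sketch below the input matrix is an arbitrary example, and the canonical diagonal entries are produced up to their order along the diagonal:

```python
import numpy as np

# Reduce a real symmetric A to the canonical form J_pr, i.e. a diagonal
# matrix of 1's, -1's, and 0's, by rescaling an orthogonal
# eigen-decomposition (an orthogonal diagonalization is a congruence).
def canonical_form(A, tol=1e-10):
    lam, P = np.linalg.eigh(A)
    scale = np.ones_like(lam)
    nonzero = np.abs(lam) > tol
    scale[nonzero] = 1.0 / np.sqrt(np.abs(lam[nonzero]))
    Q = P @ np.diag(scale)          # Q is invertible, so this is a congruence
    return np.round(Q.T @ A @ Q, 8)

A = np.array([[0.0, 1.0],
              [1.0, 0.0]])          # eigenvalues 1 and -1: index 1, rank 2
assert sorted(np.diag(canonical_form(A))) == [-1.0, 1.0]
```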

Example 11

We apply Corollary 2 to determine which of three real symmetric 3 x 3 matrices A, B, and C are congruent, where A is the matrix of Example 5. In Example 5 it is shown that A is congruent to the 3 x 3 diagonal matrix with diagonal entries 1, -1, and -12 (it is not necessary to compute Q). Therefore, A has rank 3 and index 1. Using the methods of Example 5, it can be shown that both B and C are congruent to diagonal matrices D_B and D_C, respectively, each of which has two positive and one negative diagonal entries, so that both B and C have rank 3 and index 2. We conclude that B and C are congruent, and that A is congruent to neither B nor C.

EXERCISES

1. Label the following statements as being true or false.
(a) Every quadratic form is a bilinear form.
(b) If two matrices are congruent, they have the same eigenvalues.
(c) Symmetric bilinear forms have symmetric matrix representations.
(d) Any symmetric matrix is congruent to a diagonal matrix.
(e) The sum of two symmetric bilinear forms is a symmetric bilinear form.
(f) Two symmetric matrices with the same characteristic polynomial are matrix representations of the same bilinear form.
(g) There exists a bilinear form H such that H(x, y) != 0 for all x and y.
(h) If V is a vector space of dimension n, then dim(B(V)) = 2n.
(i) Let H be a bilinear form on a finite-dimensional vector space V. For any x in V there exists y in V such that y != 0 but H(x, y) = 0.
(j) If H is any bilinear form on a finite-dimensional real vector space V, then there exists a basis beta for V such that psi_beta(H) is a diagonal matrix.

2. Prove properties 1, 2, 3, and 4 on page 356.

3. (a) Verify that the sum of two bilinear forms is a bilinear form.
(b) Verify that the product of a scalar and a bilinear form is a bilinear form.
(c) Prove that B(V), being closed under these operations, is a vector space.

4. Determine which of the following are bilinear forms.
(a) Let V = C[0, 1] be the space of continuous real-valued functions on the interval [0, 1]. For f, g in V, define

    H(f, g) = integral from 0 to 1 of f(t)g(t) dt.

(b) Let V be a vector space over a field F, and let J in B(V) be nontrivial. Define H: V x V -> F by

    H(x, y) = [J(x, y)]^2 for all x, y in V.

(c) Define H: R x R -> R by H(t1, t2) = t1 + 2t2.
(d) Consider the members of R^2 as column vectors, and define H: R^2 x R^2 -> R by H(x, y) = det(x, y), where det(x, y) denotes the determinant of the 2 x 2 matrix with x as its first column and y as its second column.
(e) Let V be a real inner product space. Define H: V x V -> R by H(x, y) = <x, y> for x, y in V.
(f) Let V be a complex inner product space. Define H: V x V -> C by H(x, y) = <x, y> for x, y in V.

5. Verify that each of the given mappings is a bilinear form. Then compute the matrix representation of H with respect to the given basis.
(a) H: R^3 x R^3 -> R, where

    H((a1; a2; a3), (b1; b2; b3)) = a1b1 - 2a1b2 + a2b1 - a3b3,

with respect to the standard ordered basis for R^3.
(b) Let V = M_{2x2}(R), and define H: V x V -> R by H(A, B) = tr(A) tr(B), with respect to the basis consisting of the four matrices having 1 in one position and 0 elsewhere.
(c) Let V = span({cos t, sin t, cos 2t, sin 2t}), a subspace of the space of continuous functions, with basis beta = {cos t, sin t, cos 2t, sin 2t}. Define H: V x V -> R by H(f, g) = f'(0) * g'(0).

6. Let V and W be vector spaces over the same field, and let T: V -> W be a linear transformation. For any H in B(W), define T~(H): V x V -> F by

    T~(H)(x, y) = H(T(x), T(y)) for all x, y in V.

Prove the following.
(a) If H is in B(W), then T~(H) is in B(V).
(b) T~: B(W) -> B(V) is a linear transformation.
(c) If T is an isomorphism, then so is T~.

7. (a) Prove that for any basis beta, psi_beta is linear.
(b) Let V be an n-dimensional vector space over a field F with basis beta, let phi_beta: V -> F^n be the standard representation of V with respect to beta, and let A be in M_{nxn}(F). Define H: V x V -> F by

    H(x, y) = [phi_beta(x)]^t A [phi_beta(y)].

Prove that H is in B(V). Can you establish this as a corollary to Exercise 6?
(c) Prove the converse of part (b): Let H be a bilinear form on V. If A = psi_beta(H), then H(x, y) = [phi_beta(x)]^t A [phi_beta(y)] for all x and y in V.

8. (a) Prove Corollary 1 to Theorem 6.26.
(b) For a finite-dimensional vector space V, describe a method for finding a basis for B(V).

9. Prove Corollary 2 to Theorem 6.26.

10. Prove Corollary 3 to Theorem 6.26.

11. Prove that the relation of congruence is an equivalence relation.

12. The following outline provides an alternate proof of Theorem 6.27.
(a) If beta and gamma are bases for a finite-dimensional vector space V and Q is the change of coordinate matrix changing gamma-coordinates into beta-coordinates, prove that phi_beta = L_Q phi_gamma, where phi_beta and phi_gamma are the standard representations of V with respect to beta and gamma, respectively.
(b) Apply Corollary 2 to Theorem 6.26 to part (a) to obtain an alternate proof of Theorem 6.27.

13. Let V be a finite-dimensional vector space and H in B(V). Prove that, for any bases beta and gamma of V, rank(psi_beta(H)) = rank(psi_gamma(H)).

14. Prove the following results.
(a) Any square diagonal matrix is symmetric.
(b) Any matrix congruent to a diagonal matrix is symmetric.
(c) The corollary to Theorem 6.29.

15. Let V be a vector space over a field F not of characteristic two, and let H be a symmetric bilinear form on V. Prove that if K(x) = H(x, x) is the quadratic form associated with H, then

    H(x, y) = (1/2)[K(x + y) - K(x) - K(y)] for all x, y in V.

16. For each of the following quadratic forms K on a real inner product space V, find a symmetric bilinear form H such that K(x) = H(x, x) for all x in V. Then find an orthonormal basis beta for V such that psi_beta(H) is a diagonal matrix.
(a) K: R^2 -> R defined by K(t1; t2) = -2t1^2 + 4t1t2 + t2^2
(b) K: R^2 -> R defined by K(t1; t2) = 7t1^2 - 8t1t2 + t2^2
(c) K: R^3 -> R defined by K(t1; t2; t3) = 3t1^2 + 3t2^2 + 3t3^2 - 2t1t3

17. Let S be the set of all (t1; t2; t3) in R^3 for which

    3t1^2 + 3t2^2 + 3t3^2 - 2t1t3 + 2 sqrt(2)(t1 + t3) + 1 = 0.

Find an orthonormal basis beta for R^3 for which the equation relating the coordinates of points of S relative to beta is simpler. Describe S geometrically.

18. Prove the following refinement of Theorem 6.31(d).
(a) If 0 < rank(A) < n and A has no negative eigenvalues, then f has no local maximum at p.
(b) If 0 < rank(A) < n and A has no positive eigenvalues, then f has no local minimum at p.

19. Prove the following variation of the second derivative test for the case n = 2. Define

    D = [d^2 f(p)/(dt1)^2][d^2 f(p)/(dt2)^2] - [d^2 f(p)/(dt1 dt2)]^2.

(a) If D > 0 and d^2 f(p)/(dt1)^2 > 0, then f has a local minimum at p.
(b) If D > 0 and d^2 f(p)/(dt1)^2 < 0, then f has a local maximum at p.
(c) If D < 0, then f has no local extremum at p.
(d) If D = 0, then the test is inconclusive.
Hint: Observe that, as in Theorem 6.31, D = det(A) = lambda_1 lambda_2, where lambda_1 and lambda_2 are the eigenvalues of A.

20. Let A be an n x n matrix over a field F, and let E be an n x n elementary matrix over F. In Section 3.1 it was shown that AE can be obtained from A by means of an elementary column operation. Prove that E^t A can be obtained from A by means of the same elementary operation performed on the rows rather than on the columns of A. Hint: Note that E^t A = (A^t E)^t.

21. For each of the given matrices A with entries from the field of rational numbers, find a diagonal matrix D and an invertible matrix Q such that Q^t A Q = D. Hint: Use an elementary operation other than that of interchanging columns.

22. Prove that if the diagonal entries of a diagonal matrix are permuted, then the resulting diagonal matrix is congruent to the original one.

23. Let T be a linear operator on a real inner product space V, and define

    H(x, y) = <x, T(y)> for all x, y in V.

(a) Prove that H is a bilinear form on V.
(b) Prove that H is symmetric if and only if T is self-adjoint.
(c) What properties must T have in order for H to be an inner product on V?
(d) Explain why H may fail to be a bilinear form if V is a complex inner product space.

24. Prove the converse to Exercise 23: Let V be a finite-dimensional real inner product space, and let H be a bilinear form on V. Then there exists a unique linear operator T on V such that H(x, y) = <x, T(y)> for all x and y in V. Hint: Choose an orthonormal basis beta for V, let A = psi_beta(H), and let T be the linear operator on V such that [T]_beta = A. Apply Exercise 7(c) of this section and Exercise 13 of Section 6.2.

25. Prove that the total number of distinct equivalence classes of congruent n x n real symmetric matrices is given by

    (n + 1)(n + 2)/2.
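The count in Exercise 25 follows from Corollary 3: a congruence class is determined by the pair (p, r) with 0 <= p <= r <= n (index p, rank r), and the pairs can be enumerated directly:

```python
# Congruence classes of n x n real symmetric matrices correspond to
# pairs (p, r) with 0 <= p <= r <= n, so there are (n+1)(n+2)/2 of them.
def classes(n):
    return sum(1 for r in range(n + 1) for p in range(r + 1))

for n in range(1, 8):
    assert classes(n) == (n + 1) * (n + 2) // 2
```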

6.8* EINSTEIN'S SPECIAL THEORY OF RELATIVITY

As a result of physical experiments performed in the latter half of the nineteenth century (most notably the Michelson-Morley experiment of 1887), physicists concluded that the results obtained in measuring the speed of light are independent of the velocity of the instrument used to measure the speed of light. For example, suppose that an experimenter on Earth measures the speed of light emitted from the sun and finds it to be 186,000 miles per second. Now suppose that the experimenter places the measuring equipment in a spaceship that leaves the Earth traveling at 100,000 miles per second in a direction away from the sun. A repetition of the same experiment from the spaceship would yield the same result: Light is traveling at 186,000 miles per second relative to the spaceship, rather than 86,000 miles per second as one might expect!

This revelation led to a new way of relating coordinate systems used to locate events in space-time. The result was Albert Einstein's special theory of relativity. We will develop via a linear algebra viewpoint the essence of Einstein's theory.

The basic problem is to compare two different inertial (nonaccelerating) coordinate systems S and S' in three-space (R^3) that are in motion relative to each other, under the assumption that the speed of light is the same when measured in either system. Suppose that S' moves at a constant velocity in relation to S as measured from S (see Figure 6.6). To simplify matters, let us suppose the following:

1. The corresponding axes of S and S' (x and x', y and y', z and z') are parallel, and the origin of S' moves in the positive direction of the x-axis of S at a constant velocity v > 0 relative to S.

2. Two clocks C and C' are given. The first clock C is stationary relative to the coordinate system S, and the second clock C' is stationary relative to S'. These clocks are designed to give as readings real numbers in units of seconds, and they are calibrated so that at the instant the origins of S and S' coincide, both clocks give the reading zero.

3. Our unit of length is the light second (the distance light travels in 1 second), and our unit of time is the second. Note that with respect to these units the speed of light is 1 light second per second.

Figure 6.6

Given any event (something whose position and time of occurrence can be described), we may assign to it a set of "space-time coordinates." For example, if p is an event that occurs at position (x; y; z) relative to S and at time t as read on clock C, we can assign to p the set of coordinates (x; y; z; t). This ordered 4-tuple is called the space-time coordinates of p relative to S and C. Likewise, p has a set of space-time coordinates (x'; y'; z'; t') relative to S' and C'.

We can define a mapping T_v: R^4 -> R^4 such that T_v applied to any set of space-time coordinates of an event relative to S and C is the set of space-time coordinates of the same event relative to S' and C'. The mapping T_v (which depends on the velocity v) is, intuitively, one-to-one and onto. Einstein made certain assumptions about T_v that led to his special theory of relativity. We will formulate an equivalent set of assumptions.

Axioms of the Special Theory of Relativity

(R1) The speed of any light beam, when measured in either coordinate system using a clock stationary relative to that coordinate system, is 1.

(R2) The mapping T_v: R^4 -> R^4 is an isomorphism.

(R3) If T_v(x; y; z; t) = (x'; y'; z'; t'), then x' and t' are independent of y and z; that is, if

    T_v(x; y; z; t) = (x'; y'; z'; t')  and  T_v(x; y1; z1; t) = (x''; y1'; z1'; t''),

then x'' = x' and t'' = t'.

(R4) If T_v(x; y; z; t) = (x'; y'; z'; t'), then y' = y and z' = z.

(R5) The origin of S


constant

As we will see,
use

in directionx.

it to study the curiousphenomenon


of

(a) Tv(ej) =

The

to

intend

We

operator

T\342\200\236.

compute

T0

contraction.

time

On R4

6.32.

Theorem

e, for i = 2, 3.
is

e3})

span({e2,

(b)

characterize

completely

the Lorentz transformation

called

the

from S'.

measured

axioms

five

these

and

<

of S' at

of the x'-axis

direction

negative

0 as

\342\200\224

velocity

the

in

moves

of

T0 is

Inner Product Spaces

Chap, 6

388

J ^-invariant.

(c) span({e1,'e4}) is J ^-invariant.


Both

(d)

(e) T*(eO= e, for i =

and

axiom

By

(a)-

Proof

hence by axiom

and

e3})

span({e2,

2, 3.

e4}) are T*-invariant

span^e^

(R2)

(R4) the first and

of

coordinates

fourth

'
\342\200\242

are both zero

for

.
The

of

proofs

(c), and

(b),

Suppose

that

emitted

from

their

at

<T*(\302\2532),

T0(e2)>

<e2,

e2,

=
<e2, e2> = <e2,T0(e2)>
Similarly T*(e3) = e3. 1
the

common

(d) are left


=

2,

of

multiple

by axiom (R3)

any

is

R. Thus

(e) For
; #
=
J 2, <T*(e2),e,>=
T*(e2)

be

a,

any

\302\253,>

i.e., that
e2>

<T*(e2),

instant

the

as exercises.
=

<e2,

T\342\200\236(\302\253j)>

T*(e2) =
=

<Ae2,

origins

Xe2

f\302\260r

e2> =

of S and

origin. The event

by

(a)

(c); for

and

(a). We conclude that

= 1 by
<e2, e2>
A,

some

Xe

and

hence

R.

Thus

= e2.

T*(e2)

S' coincide a light

flash

of the light flash

when

1 =

measured

is

either

Special Theory of

Einstein's

6.8

Sec.

to S

relative

389

Relativity

relative to S'

and C or

and C

has

coordinates

space-time

the set of all

Let P be

to S and

relative

whose

events

coordinates

space-time

C are such that

is

flash

the

the

from

observable

point

with

coordinates

(as measured relativeto S)at

in

of

terms

x,

y, z,

flash is observablefrom
t*

at the origin.

with center

to the

the points that

are precisely

L These

distance

whose

point

any

on S) is

speed of light is 1,at

t. Since the

and

The coordinates

to

(relative

S)

y2

satisfy

the

x2

equation

characterizeP
similarly: event

in

An

+ y2
of

terms

lies

in

y'

+ z2

the

P if

\342\200\224

t2

0. By

f\\2

equation

(x')2

+ (/)2

the light

origin of S (as measured


the

of

sphere

of such

if and

virtue

coordinates

space-time

and only if

(tf

the

> 0

points

only

if

radius

satisfy the
relative

to

(t>0)

+ (z')2

of axiom (Rl) we can


relative to S' and C

relative to S'

coordinates

satisfy

time

any

lie on

an event lies in P
+ z2 = t2. Hence
equation x2 +
and C its space-time coordinates

us characterize P

on C). Let

measured

t (as

time

the

> 0)

- (t')2 = 0.

and C

its

space-time

390

Chap.

Inner

Product

Spaces

Let

0 0

1 0
0 0 1
0 0 0 -1
0

A =

For

633.

Theorem

<LA(w), w> =

weR4if

any

0
0
0

0, then

<T*LATv(w),w>= 0.
Let

Proof.

w=

and

that (LA(w), w> =

suppose

Case 1. t > 0.

Since

0.

<L^(w),

of an event

coordinates

eR4,

= x2

w>

+ y2 + z2

in P relativeto S

set

the

is

of

Because

C.

and

- t2,

and

coordinates

the space-time

are

discussion

Theorem

preceding

of the same eventrelativeto S'

0,

Case
We

2.
now

T\302\273>

proof follows

t < 0. The

and

case

'[ =

w2

+ (z')2

- (t')2 =

{w1? w2}

is

T*L^T0-invariant.

is

\342\200\224

w.

\\
\302\260

3,

1 to

e4})

(j/)2

about Ty. Let

\342\226\240

01

span({eX)

(xf)2

applying

by

to deduce information

proceed

w,

Exercise

= 0.

(tf)2

(zf)2

(LATv(w),

1\\

By

the

follows.

conclusion

the

and

w>

<T*L/T\302\273S

C,

6.33 yields

(x')2 + (j/)2 +
Thus

and

an

\\

basis for spand^,^}),


result tells us even more.

orthogonal
The

next

and

Sec.6.8

Einstein's

(a)

391

Relativity

There exist nonzero scalars a

6.34,

Theorem

of

Theory

Special

and b

that

such

TfLJ^w^awj.

(b)T*LATv(w2)=bw1.

(a) Because <Ljl(w1),Wi>=0;

Proof,

Since
6.33. Thus TJL/T^wJis orthogonal to
is T*L^T0-invariant, Tv^-a^v(wi) must lie in
basis for this subspace,and so
orthogonal

T*L/T0. Thus a

# 0,

Corollary.

(a).

proving

Let

Bv

Bf

ABV

T0

of (b) is

proof

/? is t/ie

w/iere

[Tv]fl,

Then

(a)

The

Since

similar.

span({w1)-w2})

a multiple

A are

and

by Theorem

{wl,w2}

be

must

TJL/T^Wi)

= aw2 for some scalar a.

But

set.

this

e4})

span({el9

wx.

ThusT^L^T^wJ

= 0

Wi>

<T*LjlTw(w1X

invertible,

an

is

of w2.

so is

standard ordered basisfor

= A.

(b)T?LATv=LA.

We leave the proof of

Now consider

coincidedas
a

x-axis.at

and

measured

velocity

the

by

v as

the

clock

measured

C. Since the

in S, its

For

exercise.

the

after

1 second

situation

the

an

as

corollary

4.

hints, see Exercise

origins

of S and

Sf have

origin of S' is movingalong

the

space-time coordinates

relative

to

C are

\\

Similarly, the space-time coordinates

for

the

origin

of

Sf relative

to S' and

must be

for

some

t'

> 0. Thus

we have

for some t' > 0.

(18)

Chap.

392

Inner Product

Spaces

to Theorem6.34,

By the corollary

T*LAT
v

*-A

'

v2

1.

(19)

,1

But also,

T*L/T
v

*-A

Combining

(19) and (20),

we

v2-l

Thus,

from

(18)

and

(21),

conclude

that

= -(t')\\

or

t' = y/l-

-(f)

f\\2

(20)

(21)

we obtain

\\

(22)

v^1/

Next

axis of S'

recall that the

at

the

constant

origin of S

in

moves

\342\200\224v<

velocity

axiom (R5).]Consequently,1 second


as measured on clock there exists a time
after

C,

as

0
the

the

negative

from

measured
origins

t\" > 0 as

of S

of the x'-

direction

Sf. [This

and S' have

fact is

coincided

measured on clock C' such

that
o\\

/-vf\\

T.

f
From

(23)

it follows

in a manner

(23)

/\342\226\240

similar to the derivation


of

(22)

that

Einstein's Special Theoryof Relativity

Sec. 6.8

and

from

hence

393

(23) and (24)


\342\200\224

0
(25)

\\7rb/

The following result is

Theorem 6.35.

now

Let

Bv

[TVL

R 4. Then

yr=^

0
0

6.32.

-v

VT=^
=

(25) and Theorem

basis for

ordered

standard

the

be

ft

(22) and

using

proved

easily

\342\200\224

0 0

*A=

v^l

Time Contraction
A

theoryof
a sp^ce

contraction.

time

vehicle

system.

It

follows if we accept Einstein's


an astronaut
that
leaves our solar system in
Suppose
at a fixed
v as measured relative to our solar
velocity
on
Einstein's
theory that at the end of timet as measured

and

curious

most

traveling

follows

from

paradoxical

conclusion

on the space
v2.
To
only
t^/l
establish
this result, consider the coordinate systems S and S'
clocks
C and
of S' coincides
with the space
C that we studied above. Suppose that
origin
a point in the solar system (stationary
vehicle and the origin S coincides
with
so that
the origin of S and Sf coincideand clocks and
t'
relativeto
sun)
read zero at the moment the astronaut embarkson
trip.
the
of
the
vehicleat
time
As viewed
from
coordinates
S3
space-time
t > 0 as measured
by C are
Earth

the

time

that

will have passed

\342\200\224

is

vehicle

and

the

of

the

the

any

whereasas

viewed

from

S'

the

space-time

coordinates

of the vehicle at

any time

Product

Spaces

coordinates

of space-time

sets

two

if

Inner

C are

as measured by

t' > 0

But

Chap.

394

/0

fvt\\

and

are

same event, it

to describe'the

must follow that


/0

'vt\\

T\342\200\236

n
Thus

\342\200\224\\

.y

yr^

l-v2

\342\200\224

0 0

l-v2
From

the

above

equation

-v2t

yn^

Thisis
A
contracted

desired

the

dramatic
along

the

\\

yr^W
it

follows

that

\302\243'

t'

or

yr^;

tj\\

- v2.

(26)

result.

of time contraction
consequence
line
of motion
(see Exercise 9).

is that distancesare

Let us make one additionalpoint.


units
that
we
consider
of
distance and time more commonly used than
such
and second,
second
light
as the mile and the hour or the kilometer
the
Let c denote the speed
second.
of light relativeto our chosen
of distance
and
time. It is easily seen that if
an object
v relative
to a set of units, then it is
a
at
a velocity
at
Suppose

the

and

units

travels

traveling

Einstein's Special Theoryof Relativity

Sec. 6.8

vjc

velocity

of distance

of light seconds

in units

395

per second.Thus

an

for

arbitrary

and time, (26) becomes

t' = t /1-

'

.2
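The Lorentz transformation is concrete enough to check numerically. The sketch below (plain Python, not part of the text; the helper names are ours) builds the matrix B_v of Theorem 6.35 for a sample velocity, verifies the identity B_v* A B_v = A from the corollary to Theorem 6.34, and confirms the time-contraction formula (26) by applying B_v to the event (vt, 0, 0, t):

```python
import math

def lorentz_matrix(v):
    """B_v from Theorem 6.35 (units: light seconds and seconds, so c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)   # the factor 1/sqrt(1 - v^2)
    return [[g,      0.0, 0.0, -v * g],
            [0.0,    1.0, 0.0,  0.0],
            [0.0,    0.0, 1.0,  0.0],
            [-v * g, 0.0, 0.0,  g]]

def mat_mul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def mat_vec(M, w):
    return [sum(M[i][k] * w[k] for k in range(4)) for i in range(4)]

A = [[1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, -1.0]]
v, t = 0.6, 10.0
B = lorentz_matrix(v)

# Corollary to Theorem 6.34: B_v* A B_v = A (B_v is real, so B_v* = B_v transpose).
Bt = [list(row) for row in zip(*B)]
prod = mat_mul(mat_mul(Bt, A), B)
assert all(abs(prod[i][j] - A[i][j]) < 1e-9 for i in range(4) for j in range(4))

# Time contraction (26): the event (vt, 0, 0, t) maps to (0, 0, 0, t*sqrt(1 - v^2)).
image = mat_vec(B, [v * t, 0.0, 0.0, t])
assert abs(image[0]) < 1e-9
assert abs(image[3] - t * math.sqrt(1 - v * v)) < 1e-9
print(image[3])  # ~8.0: ten Earth-seconds pass as eight ship-seconds at v = 0.6
```

The assertions mirror the two facts the derivation rests on: B_v preserves the form x^2 + y^2 + z^2 - t^2, and the ship's clock runs slow by the factor √(1 - v^2).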

EXERCISES

1. Prove (b), (c), and (d) of Theorem 6.32.

2. Complete the proof of Theorem 6.33 for the case t < 0.

3. For w_1 = e_1 + e_4 and w_2 = e_1 - e_4, show that
(a) {w_1, w_2} is an orthogonal basis for span({e_1, e_4});
(b) span({e_1, e_4}) is T_v* L_A T_v-invariant.

4. Prove the corollary to Theorem 6.34. Hints:
(a) Prove that

B_v* A B_v =
[ p   0  0   q ]
[ 0   1  0   0 ]
[ 0   0  1   0 ]
[ -q  0  0  -p ],

where p = (a + b)/2 and q = (a - b)/2.
(b) Show that q = 0 by using the fact that B_v* A B_v is self-adjoint.
(c) Apply Theorem 6.33 to a suitable vector w with <L_A(w), w> = 0 (for instance, w = e_1 + e_2 + √2 e_4) to show that p = 1.

5. Derive (24) and (25). Hint: Use a technique similar to the derivation of (22).

6. Consider three coordinate systems S, S', and S'' in three-space with corresponding axes (x, x', and x''; y, y', and y''; and z, z', and z'') parallel and such that the x-, x'-, and x''-axes coincide. Suppose that S' is moving past S at a velocity v_1 > 0 (as measured on S), S'' is moving past S' at a velocity v_2 > 0 (as measured on S'), and S'' is moving past S at a velocity v_3 > 0 (as measured on S), and that there are three clocks C, C', and C'' such that C is stationary relative to S, C' is stationary relative to S', and C'' is stationary relative to S''. Suppose that when measured on any of the three clocks, all the origins of S, S', and S'' coincide at time 0. Assuming that T_{v_3} = T_{v_2} T_{v_1} (i.e., B_{v_3} = B_{v_2} B_{v_1}), prove that

v_3 = (v_1 + v_2)/(1 + v_1 v_2).

Note that substituting v_2 = 1 in the equation above yields v_3 = 1. This tells us that the speed of light as measured in S or S' is the same. Why would we be surprised if this were not the case?

7. Compute (B_v)^{-1}. Show that (B_v)^{-1} = B_{(-v)}. Conclude that if S' moves at a negative velocity v relative to S, then [T_v]_β = B_v, where B_v is given by Theorem 6.35 and β is the standard ordered basis for R^4.

8. Suppose that an astronaut left Earth in the year 1776 and traveled to a star 99 light years away from Earth at 99% of the speed of light, and that upon reaching the star immediately turned around and returned to Earth at the same speed. Assuming Einstein's special theory of relativity, show that if the astronaut was 20 years old at the time of departure, then he or she would return to Earth at age 48.2 in the year 1976. Explain the use of the phenomenon of time contraction in solving this problem.

9. Recall the moving space vehicle considered in the study of time contraction. Suppose that the vehicle is moving toward a fixed star located on the x-axis of S at a distance b units from the origin of S. If the space vehicle moves toward the star at velocity v, earthlings (who remain "almost" stationary relative to S) compute the time it takes for the vehicle to reach the star as t = b/v. Due to the phenomenon of time contraction, the astronaut will perceive a time of t' = t√(1 - v^2) = (b/v)√(1 - v^2). A paradox appears in that the astronaut perceives a time span inconsistent with a distance of b and a velocity of v. The paradox is resolved by observing that the distance from the solar system to the star as measured by the astronaut is less than b.

Assume that the coordinate systems S and S' and the clocks C and C' are as in the discussion of time contraction.
(a) Argue that at time t (as measured on C) the space-time coordinates of the star relative to S and C are (b, 0, 0, t).
(b) Show that at time t (as measured on C) the space-time coordinates of the star relative to S' and C' are ( (b - tv)/√(1 - v^2), 0, 0, (t - bv)/√(1 - v^2) ).
(c) Setting x' = (b - tv)/√(1 - v^2) and t' = (t - bv)/√(1 - v^2), show that x' = b√(1 - v^2) - t'v.
(d) Conclude from the equation in (c) that
 (1) the speed of the space vehicle relative to the star, as measured by the astronaut, is v;
 (2) the distance from Earth to the star, as measured by the astronaut, is b√(1 - v^2).

This result may be interpreted to mean that at time t' (as measured by the astronaut) the distance from the astronaut to the star, as measured by the astronaut (see Figure 6.7), is b√(1 - v^2) - t'v. Thus distances along the line of motion of the space vehicle appear to the astronaut to be contracted by a factor of √(1 - v^2).

Figure 6.7  [the star, located at (x', 0, 0) relative to S' and at (b, 0, 0) relative to S]
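The velocity-addition law of Exercise 6 can likewise be tested by multiplying two of the matrices B_v from Theorem 6.35. In the sketch below (plain Python, not part of the text; helper names are ours), the composition of boosts at v_1 and v_2 is compared entrywise with the single boost at v_3 = (v_1 + v_2)/(1 + v_1 v_2):

```python
import math

def lorentz_matrix(v):
    """B_v from Theorem 6.35 (speed of light = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g,      0.0, 0.0, -v * g],
            [0.0,    1.0, 0.0,  0.0],
            [0.0,    0.0, 1.0,  0.0],
            [-v * g, 0.0, 0.0,  g]]

def mat_mul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

v1, v2 = 0.5, 0.75
v3 = (v1 + v2) / (1 + v1 * v2)          # the addition law of Exercise 6
composed = mat_mul(lorentz_matrix(v2), lorentz_matrix(v1))
direct = lorentz_matrix(v3)

# B_{v2} B_{v1} = B_{v3}: composing boosts never exceeds the speed of light.
assert all(abs(composed[i][j] - direct[i][j]) < 1e-9
           for i in range(4) for j in range(4))
print(v3)  # ~0.909, still below 1 even though v1 + v2 = 1.25
```

Note that the naive sum v_1 + v_2 = 1.25 would exceed the speed of light; the composed boost stays below it, exactly as the exercise's remark about v_2 = 1 suggests.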

6.9* CONDITIONING AND THE RAYLEIGH QUOTIENT

In Section 3.4 we studied specific techniques that allow us to solve systems of linear equations in the form Ax = b, where A is an m x n matrix and b is an m x 1 vector. Such systems often arise through applications to the real world, and in many cases the coefficients are obtained from experimental data; frequently, both m and n are so large that a computer must be used in the calculation of the solution. Thus two types of errors must be considered. First, experimental errors arise in the collection of data, since no instruments can provide completely accurate measurements. Second, computers will introduce round-off errors. One might intuitively feel that small relative changes in the coefficients of the system will cause small relative errors in the solution. A system that has this property is called well-conditioned; otherwise, the system is called ill-conditioned.

We now consider several examples of these types of errors, concentrating primarily on changes in b rather than changes in the entries of A. In addition, we assume that A is a square, complex (or real), invertible matrix, since this is the case most frequently encountered in applications.

Example 1

Consider the system

x_1 + x_2 = 5
x_1 - x_2 = 1.

The solution of this system is (3, 2).

Now suppose that we change the system somewhat and consider the new system

x_1 + x_2 = 5
x_1 - x_2 = 1.0001.

This modified system has the solution (3.00005, 1.99995).

We see that a change of 10^-4 in one coefficient has caused a change of less than 10^-4 in each coordinate of the new solution. More generally, the system

x_1 + x_2 = 5
x_1 - x_2 = 1 + δ

has the solution (3 + δ/2, 2 - δ/2). Hence small changes in b introduce small changes in the solution. Of course, we are really interested in "relative changes," since a change in the solution of, say, 10 is considered large if the original solution is of the order 10^-2, but small if the original solution is of the order 10^6.

We will use the notation δb to denote the vector b' - b, where b is the vector in the original system and b' is the vector in the modified system. Thus, for this example, we have δb = (0, δ).

We now define the relative change in b to be the scalar ||δb||/||b||, where || || denotes the standard norm on C^n (or R^n); i.e., ||b|| = √<b, b>. Most of what follows, however, is true for any norm. Similar definitions hold for the relative change in x.

For this example,

||δb||/||b|| = |δ|/√26  and  ||δx||/||x|| = ||(δ/2, -δ/2)|| / ||(3, 2)|| = |δ|/√26.

Thus the relative change in x equals, coincidently, the relative change in b, so the system is well-conditioned.
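The arithmetic behind Example 1 is easily reproduced. A minimal sketch (plain Python; the tiny Cramer's-rule solver is ours, added only for illustration) perturbs the right-hand side and checks that the relative change in the solution matches the relative change in b:

```python
import math

def solve2(a11, a12, a21, a22, b1, b2):
    """Solve a 2x2 system by Cramer's rule (adequate for this illustration)."""
    det = a11 * a22 - a12 * a21
    return ((b1 * a22 - b2 * a12) / det, (a11 * b2 - a21 * b1) / det)

def norm(u):
    return math.sqrt(sum(t * t for t in u))

delta = 1e-4
x = solve2(1, 1, 1, -1, 5, 1)               # original system: solution (3, 2)
x_new = solve2(1, 1, 1, -1, 5, 1 + delta)   # perturbed right-hand side

rel_b = delta / norm((5, 1))                                   # |delta| / sqrt(26)
rel_x = norm((x_new[0] - x[0], x_new[1] - x[1])) / norm(x)     # |delta| / sqrt(26)

assert abs(x[0] - 3) < 1e-12 and abs(x[1] - 2) < 1e-12
assert abs(rel_b - rel_x) < 1e-9   # the two relative changes coincide
```

Running the same experiment on the nearly coincident lines of the next example shows the opposite behavior: the relative change in x comes out roughly 10^4 times larger than the relative change in b.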

Example 2

Consider the system

x_1 + x_2 = 3
x_1 + 1.00001 x_2 = 3.00001,

which has (2, 1) as its solution. The solution for the related system

x_1 + x_2 = 3
x_1 + 1.00001 x_2 = 3.00001 + δ

is (2 - (10^5)δ, 1 + (10^5)δ). Hence

||δx||/||x|| = (10^5)|δ|√2 / √5 ≥ 10^4 |δ|,  while  ||δb||/||b|| = |δ| / √(9 + (3.00001)^2) ≤ |δ|.

Thus the relative change in x is at least 10^4 times the relative change in b! This system is very ill-conditioned. Observe that the lines defined by the two equations in this system are nearly coincident, so a small change in either line could greatly alter the point of intersection, that is, the solution of the system.

To apply the full strength of the theory of self-adjoint matrices to the study of conditioning, we need the notion of the norm of a matrix (see Exercise 22 of Section 6.1 for further results about norms).

Definition. Let A be a complex (or real) n x n matrix. Define the (Euclidean) norm of A by

||A|| = max_{x ≠ 0} ||Ax|| / ||x||,

where x is in C^n or R^n.

We see intuitively that ||A|| represents the maximum "magnification" of a vector by the matrix A. The question of whether or not this maximum exists, as well as the problem of how to compute it, can be answered by use of the so-called "Rayleigh quotient."

Definition. Let B be an n x n self-adjoint matrix. The Rayleigh quotient for x ≠ 0 is defined to be the scalar R(x) = <Bx, x> / ||x||^2.

Theorem 6.36. For a self-adjoint matrix B, max_{x ≠ 0} R(x) is the largest eigenvalue of B and min_{x ≠ 0} R(x) is the smallest eigenvalue of B.

Proof. By Theorems 6.19 and 6.20 we may choose an orthonormal basis {x_1, ..., x_n} of eigenvectors of B such that Bx_i = λ_i x_i (1 ≤ i ≤ n), where λ_1 ≥ λ_2 ≥ ... ≥ λ_n. (Recall that by the lemma to Theorem 6.17 the eigenvalues of B are real.) Now, for any x in C^n (or R^n), there exist scalars a_1, ..., a_n such that x = Σ a_i x_i; hence

R(x) = <B(Σ a_i x_i), Σ a_j x_j> / ||x||^2 = (Σ λ_i |a_i|^2) / ||x||^2 ≤ λ_1 (Σ |a_i|^2) / ||x||^2 = λ_1.

It is easy to see that R(x_1) = λ_1, so the first half of the theorem is demonstrated. The second half is proved similarly.

Corollary 1. For any square matrix A, ||A|| is finite and, in fact, equals √λ, where λ is the largest eigenvalue of A*A.

Proof. Let B be the self-adjoint matrix A*A, and let λ be the largest eigenvalue of B. Since, for x ≠ 0,

0 ≤ ||Ax||^2 / ||x||^2 = <Ax, Ax> / ||x||^2 = <A*Ax, x> / ||x||^2 = <Bx, x> / ||x||^2 = R(x),

we have from Theorem 6.36 that ||A||^2 = λ.

Observe that the proof of Corollary 1 shows that all the eigenvalues of A*A are nonnegative. For the next corollary, we need the following lemma.

Lemma. For any square matrix A, λ is an eigenvalue of A*A if and only if λ is an eigenvalue of AA*.

Proof. Let λ be an eigenvalue of A*A. If λ = 0, then A*A is not invertible. Hence A (and A*) is not invertible, so that AA* is also not invertible. Therefore, 0 is an eigenvalue of AA*. Now suppose that λ ≠ 0. Then there exists x ≠ 0 such that A*Ax = λx. Apply A to both sides to obtain (AA*)(Ax) = λ(Ax). Since Ax ≠ 0 (lest λx = 0), λ is an eigenvalue of AA*. The proof of the converse is similar.

Corollary 2. Let A be an invertible matrix. Then ||A^{-1}|| = 1/√λ_n, where λ_n is the smallest eigenvalue of A*A.

Proof. Recall that λ is an eigenvalue of an invertible matrix if and only if λ^{-1} is an eigenvalue of its inverse. Let λ_1 ≥ λ_2 ≥ ... ≥ λ_n be the eigenvalues of A*A, which by the lemma are also the eigenvalues of AA*. Then ||A^{-1}||^2 equals the largest eigenvalue of (A^{-1})*A^{-1} = (AA*)^{-1}, which is 1/λ_n.

For many applications it is only the largest and smallest eigenvalues that are of interest. For example, in the case of vibration problems the smallest eigenvalue represents the lowest frequency at which vibrations can occur. We will see the role of both of these eigenvalues in our study of conditioning.

Example 3

Let A be a matrix for which

B = A*A =
[  2  -1   1 ]
[ -1   2   1 ]
[  1   1   2 ].

For any vector x = (a, b, c) in R^3, we may compute the Rayleigh quotient for B as

R(x) = <Bx, x> / ||x||^2 = 2(a^2 + b^2 + c^2 - ab + ac + bc) / (a^2 + b^2 + c^2).

The eigenvalues of B are 3, 3, and 0, so that 3 ≥ R(x) ≥ 0 for all a, b, c in R. Therefore, ||A|| = √3.
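Corollary 1 reduces computing ||A|| to an eigenvalue problem for A*A. The sketch below (plain Python; the particular 3 x 3 matrix is our illustrative choice, picked so that A^T A has eigenvalues 3, 3, and 0 as in the example above) samples the quotient ||Ax|| / ||x|| at many random vectors and checks that it stays below, yet approaches, the predicted maximum √3:

```python
import math
import random

# An illustrative real 3x3 matrix; A^T A works out to [[2,-1,1],[-1,2,1],[1,1,2]],
# whose eigenvalues are 3, 3, and 0 -- so by Corollary 1, ||A|| = sqrt(3).
A = [[1, -1, 0],
     [1,  0, 1],
     [0,  1, 1]]

def mat_vec(M, x):
    return [sum(M[i][j] * x[j] for j in range(3)) for i in range(3)]

def norm(u):
    return math.sqrt(sum(t * t for t in u))

# ||A|| = max ||Ax|| / ||x||: sample the quotient at random vectors.
random.seed(0)
best = 0.0
for _ in range(20000):
    x = [random.uniform(-1, 1) for _ in range(3)]
    if norm(x) > 1e-12:
        best = max(best, norm(mat_vec(A, x)) / norm(x))

assert best <= math.sqrt(3) + 1e-9   # the quotient never exceeds sqrt(3)
assert best > math.sqrt(3) - 1e-2    # and random sampling comes close to it
```

Random sampling only estimates the maximum; the point of Theorem 6.36 and its corollaries is that the exact value is available directly from the eigenvalues of A*A.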

For any

Sec. 6.9
we may

the

and

Conditioning

matrix

R(x) for the

compute

as

2(a2 + b2

x)

(Bx,

403

Quotient

Rayleigh

a2 +

3>m-~MT~

for all a,b,

know

we

the inequality||i4x||< ||i4|| ||x||,


be

3x

let

3b,

A(3x) = 3b, and so

3x

vector

the

is

matrix, we can makeuseof


b

invertible,

satisfies

that

x.

every

and

# 0,
+

A(x

3x)

Ax = b. For
b + 3b. Then

Hence

A~*l(3b).

and

Mil'11*11

ll&IIHI^K

for

holds

which

follows that

Assume in what
given

b2 + c2

for every square

exists

||i4||

\342\200\242

be)

ceR. I
that

Now

+ ac +

+ c2-ab

\\\\3x\\\\

\\\\A^(Sb)\\\\

\\\\A'l\\\\'\\\\3b\\\\.

Thus

~m

<

w\\

11*11

\"

\\\\\\b\\\\

Similarly,

(\\\\&H\\

MII'M\"1!!

number

The

|| A\\\\

*
11*11

\\\\\\b\\\\J

the condition

called

l\\\\ is

\\\\A~

\\\\dx\\\\

number of

and

is denoted

be noted that the definitionof cond(,4)dependson


we
of A. There are many reasonable
of defining
define
the norm
the norm
of a
we
used
to establish
the inequalities
matrix. In fact, the only
above was
that ||i4x|| < ||i4|| ||x|| for all x. We summarize these results in the following
It should

cond(A).

how

ways

property

\342\226\240

theorem.

\\

Theorem

For the system AX

6.37.

= b, whereA is

and

invertible

# 0,

we

have the following two results:

(a)
(b)

^m
cond(A)

eigenvalues,

Euclidean

ibf
=

~~,

defined

Xt

prove

and

A*A.

\342\200\242

Xn

(In

the

are

this

part,

and smallest

largest

we assume that

||

\342\226\240

||

is

the

in this section.)
the

from

follows

6.36.

Theorem

previous

clear from Theorem

only

if

and (b) follows

inequalities,

6,37 that cond(,4)^


=
1 if and
A
that cond(,4)
is a scalar

It is

any norm\"\"l0-

of

Proof Statement (a)


from Corollaries 1 and 2 to

{for

cond(A)

where

respectively,

norm

<

lw

1.

It

is

left

multiple

as

an

exercise

of a unitary

to

or

Chap.6

404
in (a)

can be obtained

that equality

work

some

can be

6.5. Moreover, it

in Section

as defined

matrix

orthogonal

an

by

Product

Inner

Spaces

with

shown

and

of b

choice

appropriate

5b.

We can see immediately

from

surethat a

the relative error in

is large, or

in b

error

error in b

is small!

In

short,

relative

the

large

even

indicates

the

be

may

merely

cond(,4)

error in x. If cond(,4)is

be small eventhough

in x may

error

relative

the

then

however,

large,

a small relative

forces

in b

error

relative

small

1, then we are

is close to

if cond(,4)

that

(a)

the relative

though

for large

potential

relative errors.

We
5A

For

complicated.

can

example,

assumptions

appropriate

be given

b. If there isan

in the vector

errors

only

error

of the system AX = b, the situationis more


A + 5A may fail to be invertible. But
under
it can be shown that a bound for the relative
in
error

matrix

coefficient

the

in

considered

far

so

have

cond(,4).For example,

in terms of

Forsythe and Moler (G.

if

C. B,

and

Forsythe

Moler,

Inc.,
AlgebraicSystems,Prentice-Hall,

Solution of Linear

Computer

show

p. 23)

1976,

N.J.,

Cliffs,

Englewood

then

is invertible,

5A

that

<^ cond(i4)
v '

M||

\\\\x+5x\\\\

mentioned that, in practice,


for it would be an unnecessary waste of
determine
its norm. In fact, if a computer is
It should be

one

time

used

of

inverse

will

circle! There are,

approximation

of

error

in

x is

based

to

find

A~l9

cond(J4),

computed

A'1, and the error

of cond(,4).So

we

are

the

Label

the

(a) If &X = b is

as

being
then

well-conditioned,

(b) If cond(i4)

is large, then AX =

(c)

is

If

cond(,4)

(d) The

norm

of

(e) The norm of

2. Compute

the

(a) /4

0\\

1 3/

norms

A
A

equals
is

the

of

the

quotient.

to

the largest
matrices.

following

(b) (

5 3'

1-3

is small.

is well-conditioned.

Rayleigh
equal

always

cond(,4)

b is ill-conditioned.

AX = b

then

small,

true or false.

eigenvalue

of A.

the

in

which

statements

following

in

caught

EXERCISES

1.

to

merely

the

a usable
however, some situationsin
can be found.
Thus, in most cases, the estimateof
on an estimate of cond(i4).

vicious

cond(,4)

by the size

A\"1

compute

approximate

only

be affected

will

inverse

computed

likelihood

all

in

to

knows

never

almost

relative

Sec.

6.9

Conditioning

(c) /

and

\342\200\242\\

-2

3. Prove that if B is

then

symmetric,

be as

A~l

and

||B||

of B.

eigenvalue

largest

-ll\\

29 -38

is the

follows:

13
A

405

Quotient

-2

4. Let

the Rayleigh

and

50

The

of

eigenvalues

are

and

0.0588.

Exercise
||^4||, M\"1!!,and cond(i4).
3.)
that
we have vectors x and x such that
Suppose
Ax
Use part (a) to determine upper
||
|| ^ 0,001.
A~lb\\\\ (the absolute error) and p

(a) Approximate
(b)

0,2007,

84,74,

approximately

(Note

Ax

ft

\342\200\224

||5c

A'^bW/WA'^W

and

bounds
(the

for

relative

error).
=
actual solution of
b and
that
a computer
=
arrives at an approximate solution x.
100,
cond(,4)
||b|| = 1, and
=
||ft
0.1s obtain
upper and lower bounds for ||x ic||/||x||.

5, Suppose

x is the

that

AX

If

\342\200\224

\342\200\224

i4Jc||

6-

Let

\\

B =

Compute

7.

Let

B be a

eigenvalue

\\\\B\\l

and

cond(5).

symmetric matrix. Prove that

min

R(x)

equals

the

smallest

of B.

an
of AA*, then
is
8. Prove that if is an eigenvalue
completes the proof of the lemmato Corollary
6,37.
9. Prove the left inequality of (a) in
X

Theorem

eigenvalue
to

Theorem

of A*A.
6,36.

This

406

10. Prove that cond(,4)=

11. (a)

in

(b)

a finite-dimensional

be

that are unitarily equivalent as


= ||B||.
inner
space, and let T be a linear
product

square matrices
6.5. Prove
that ||yl||

Section

Let

or

unitary

in Section 6.5.

B be

and

Let

multiple of a

is a scalar

if A

only

as defined

matrix

orthogonal

and

if

Inner Product Spaces

Chap,

operatoron

defined

Define

T.

ITI\342\200\224IM-

x*0
that

Prove

(c) Let

basis

orthonormal

{xl9

Theorem6.22

an

of

composition

Let T be

x2i...}.

that

establishes

the

inner

rigid

any

geometry

ft

is the

the

fkmiliar

is

inner

product

and reflections. This material assumesthat


end
the results about direct sums developed at
of

with

the

a linear operator a finite-dimensional


V. The operator T is calleda rotation T is the
Let T be

Definitions.
product

space

number

of

subspace

all

for

Wx

Rotations

were

this context

yeWx./n

is called the
defined

in

an

V,

that
=

T(x2)

and T(y) =

on V or if
identity
orthonormal
basis

if

two-dimensionalsubspace
9 such
{xl9 x2} for W, anda real
= xx cos 9 + x2 sin 9,
T(x1)

inner

real

on

there exists a

The

real

finite-dimensional

5.2.

Section

P =

on a real

of rotations

composition
is

reader

exist.

not

does

(b)]

that

such

inner product space is


followed by a translation. Thus to
we must analyze the
thoroughly,
the
aim
of this section. As we will

operator

orthogonal

Such

space

the linear operator on

motion

understand
of
motions
rigid
structure of orthogonaloperators.
discover, an orthogonaloperator on a
the

with an

space

product

of V.

basis

orthonormal

any

OF ORTHOGONAL OPERATORS

THE GEOMETRY

6.10*

is

/?

that ||T|| [defined in part

= kxk. Prove

T(xk)

where

='||[T]-fl||,
infinite-dimensional

an

be

||T||

11\302\276
||

sin

\342\200\224x2

9 +

x2 cos

T is called a rotation of

9,
Wx,

about

axis of rotation.
Section

2.1 for the

special case that

R2.

real
linear operator on a finite-dimensional
exists
a
V. The
T is called a reflection if
operator
product
space
x for
all x e W and T(y) =
such that T(x) =
dimensional
subspace Wo/V
Wx
V about
of
all
In this context T is calleda
y e Wx
Definitions.

Let

T be a

inner

one-

there

\342\200\224

reflection

for

The Geometryof Orthogonal

Sec. 6.10
It
are

407

Operators

and reflections (or compositions


operators
2). The principal aim of this sectionis to
(see Exercise
the
is also true, that iss that any orthogonaloperatorona
converse
of rotations and
real
inner
is the composition
product
space

orthogonal
that

establish

that rotations

be noted

should

finite-dimensional

thereof)

reflections.
Example1
Characterization

Real Inner

on a One-Dimensional

Operators

Orthogonal

of

Product Space

space

an

be

Let

Choose

V.

T(x) = Xx for
T, X = \302\2611. If
X

\342\200\224

1,

Vx

some
X

then

T is the

det(T) = 1, and

in

case

and

orthogonal

identity on

so

and

of

in

that

If

a rotation.
about

first case

the

=-1.

det(T)

T is

Note

product

eigenvalue

reflection of

T is a

reflection.

an

is

hence

and

V,

hence

and

eV,

a rotation or a

second

the

all

\342\200\224x
for

T is either

{0}, Thus

T is

Since

R.

1, then

T(x)

vector

nonzero

any

on a one-dimensional inner
x in V. Then V = span({x}),

operator

orthogonal

Example 2
Some

Reflections

Typical

(a)

then

Let

T(x) =

T;R2->R2
\342\200\224

all

for

of R2 about W-1=

be defined by T(a,b) =
x e W and T(j>) = y for all y

span({e2})s
the

span({e3})s then T(x)=

spand^i,

e2}),

the

\\pxample

\342\200\224

for

Hence

xy-plane.

all

characterizes

span^}),

a reflection

T is

Thus

Wx.

j;-axis.

defined

be

T:R3->R3

(b).Let

(-a,b). If

by T(a,
and

xe\\N

all

of R3 about

operators

orthogonal

W =

If

-c).

b9

='(a,

T(y) ~ y for

a reflection

T is

c)

b9

all j/eWx =

W-1. 1

on a one-dimensional

characterizes
all orthogonal
real inner product space.
theorem
following
of
this
on a two-dimensional real inner product space.
proof
operators
result
x-axis
follows
easily from Theorem 6.22 since a reflectionabout
the origin
with
followed
line
through
by 9 is a reflectionabout
by a rotation
The

The

the

the

slope tan ^0.

Theorem 6.38. Let T be an orthogonal operator on a two-dimensional real inner product space V. Then T is either a rotation or a reflection. Furthermore, T is a rotation if and only if det(T) = 1, and T is a reflection if and only if det(T) = −1.

It is immediate from the definition that any reflection T on R2 has eigenvalues 1 and −1, and any two eigenvectors of T corresponding to these eigenvalues are orthogonal. Moreover, the eigenspace of T corresponding to the eigenvalue 1 is one-dimensional, and hence can be described as a line passing through the origin. Geometrically, T reflects points in R2 about this line (see Figure 6.8).
408 Chap. 6 Inner Product Spaces

Figure 6.8

For example, if A is a 2×2 orthogonal matrix such that det(A) = −1, then it is clear by Theorem 6.38 that L_A is a reflection, since det(L_A) = det(A) = −1. To find the axis in R2 about which L_A reflects, it suffices to find an eigenvector of L_A corresponding to the eigenvalue 1. Consequently, the subspace about which L_A reflects is the line {tx : t ∈ R}, where x is such an eigenvector.
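This procedure for locating the axis of a reflection can be sketched numerically. The code below is an added illustration assuming NumPy; the matrix used is an arbitrary example of a reflection matrix, not the specific matrix discussed in the text.

```python
import numpy as np

def classify_orthogonal_2x2(A):
    """Classify a 2x2 orthogonal matrix as a rotation or a reflection
    (Theorem 6.38); for a reflection, also return a unit vector spanning
    the axis, i.e. an eigenvector for the eigenvalue 1."""
    assert np.allclose(A.T @ A, np.eye(2)), "A must be orthogonal"
    if round(np.linalg.det(A)) == 1:
        return "rotation", None
    # det(A) = -1: L_A is a reflection about the eigenspace for eigenvalue 1.
    w, v = np.linalg.eig(A)
    i = int(np.argmin(np.abs(w - 1.0)))
    axis = np.real(v[:, i])
    return "reflection", axis / np.linalg.norm(axis)

phi = np.pi / 4
A = np.array([[np.cos(phi),  np.sin(phi)],
              [np.sin(phi), -np.cos(phi)]])   # orthogonal, det(A) = -1
kind, axis = classify_orthogonal_2x2(A)
print(kind)             # reflection
print(A @ axis - axis)  # approximately zero: the axis is fixed by L_A
```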

Corollary. Let V be a two-dimensional real inner product space. The composition of a rotation and a reflection on V is a reflection on V.

Proof. If T1 is a rotation on V and T2 is a reflection on V, then by Theorem 6.38, det(T1) = 1 and det(T2) = −1. Let T = T2T1 be the composition. Since T1 and T2 are orthogonal, so is T. Moreover, det(T) = det(T2)·det(T1) = −1. Thus, by Theorem 6.38, T is a reflection. The proof for T1T2 is similar. ∎

We now study orthogonal operators on spaces of higher dimension.
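The determinant argument in this proof is easy to check numerically. An added sketch assuming NumPy:

```python
import numpy as np

def rot(theta):                       # a rotation of R^2, det = 1
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def refl(phi):                        # a reflection of R^2, det = -1
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, s], [s, -c]])

T1, T2 = rot(0.7), refl(1.3)
T = T2 @ T1                           # composition of a rotation and a reflection
assert np.allclose(T.T @ T, np.eye(2))           # T is orthogonal
print(round(np.linalg.det(T)))        # -1: T is a reflection by Theorem 6.38
print(round(np.linalg.det(T1 @ T2)))  # -1: the other order is also a reflection
```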

Sec. 6.10 The Geometry of Orthogonal Operators 409

Lemma. If T is a linear operator on a nonzero finite-dimensional real inner product space V, then there exists a T-invariant subspace W of V such that 1 ≤ dim(W) ≤ 2.
Proof. Fix an ordered basis β = {y1, y2, ..., yn} for V, and let A = [T]_β. Let φ_β: V → R^n be the linear transformation defined by φ_β(yi) = ei for i = 1, 2, ..., n. Then φ_β is an isomorphism, and, as we have seen in Section 2.4, the diagram in Figure 6.9 is commutative; that is, φ_β T = L_A φ_β. It suffices to show that there exists an L_A-invariant subspace Z of R^n such that 1 ≤ dim(Z) ≤ 2. For if we then define W = φ_β⁻¹(Z), it will follow that W is T-invariant and satisfies the conclusion of the theorem (see Exercise 12).

Figure 6.9
The matrix A can be considered as an n×n matrix over C and as such can be used to define a linear operator U on C^n by U(x) = Ax for all column vectors x in C^n. Since U is a linear operator on a finite-dimensional vector space over C, it has an eigenvalue λ ∈ C. Let x ∈ C^n be an eigenvector corresponding to λ. We may write λ = λ1 + iλ2, where λ1 and λ2 are real, and x = x1 + ix2, where x1 and x2 are n-tuples with real entries; note that x1 ≠ 0 or x2 ≠ 0 since x ≠ 0. Hence

U(x) = Ax = A(x1 + ix2) = Ax1 + iAx2.

Similarly,

U(x) = λx = (λ1 + iλ2)(x1 + ix2) = (λ1x1 − λ2x2) + i(λ2x1 + λ1x2).

Comparing the real and imaginary parts of these two expressions for U(x), we conclude that

Ax1 = λ1x1 − λ2x2  and  Ax2 = λ2x1 + λ1x2.

Finally, let Z = span({x1, x2}), the span being taken as a subspace of R^n. Since x1 ≠ 0 or x2 ≠ 0, Z is nonzero. Thus 1 ≤ dim(Z) ≤ 2, and the preceding pair of equations shows that Z is L_A-invariant. ∎

Theorem 6.39. Let T be an orthogonal operator on a nonzero finite-dimensional real inner product space V. Then there exists a collection {W1, W2, ..., Wm} of pairwise orthogonal T-invariant subspaces of V such that

(a) 1 ≤ dim(Wi) ≤ 2 for i = 1, 2, ..., m;
(b) V = W1 ⊕ W2 ⊕ ··· ⊕ Wm.
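The splitting of a complex eigenpair into real and imaginary parts, as in the proof of the lemma, can be reproduced numerically. An added sketch assuming NumPy:

```python
import numpy as np

# A real matrix with no real eigenvalues: rotation of R^2 by 90 degrees.
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])

# Over C, A has an eigenvalue lam = lam1 + i*lam2 with eigenvector x = x1 + i*x2.
w, v = np.linalg.eig(A)
lam, x = w[0], v[:, 0]
lam1, lam2 = lam.real, lam.imag
x1, x2 = x.real, x.imag

# The pair of equations derived in the proof:
assert np.allclose(A @ x1, lam1 * x1 - lam2 * x2)
assert np.allclose(A @ x2, lam2 * x1 + lam1 * x2)

# Hence Z = span({x1, x2}) is a nonzero L_A-invariant subspace, dim(Z) <= 2.
print(lam1, lam2)
```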
Proof. The proof is by induction on dim(V). If dim(V) = 1, the result is obvious. So assume that the result is true whenever dim(V) < n for some fixed integer n > 1, and suppose that dim(V) = n. By the lemma, there is a T-invariant subspace W1 of V such that 1 ≤ dim(W1) ≤ 2. If W1 = V, the result is established. Otherwise, W1⊥ ≠ {0}. By Exercise 13, W1⊥ is T-invariant and the restriction of T to W1⊥ is orthogonal. Since dim(W1⊥) < n, we may apply the induction hypothesis to the restriction of T to W1⊥ and conclude that there exists a collection {W2, W3, ..., Wm} of pairwise orthogonal T-invariant subspaces of W1⊥ such that 1 ≤ dim(Wi) ≤ 2 for i = 2, 3, ..., m and W1⊥ = W2 ⊕ W3 ⊕ ··· ⊕ Wm. Thus {W1, W2, ..., Wm} is a pairwise orthogonal collection of T-invariant subspaces of V, and by Exercise 12(d) of Section 6.2,

V = W1 ⊕ W1⊥ = W1 ⊕ W2 ⊕ ··· ⊕ Wm. ∎

Applying Example 1 and Theorem 6.38 in the context of Theorem 6.39, we can conclude that the restriction of T to Wi is either a rotation or a reflection for each i = 1, 2, ..., m, since dim(Wi) ≤ 2. Thus in some sense T is made up of rotations and reflections. Unfortunately, very little can be said about the uniqueness of the decomposition of V in Theorem 6.39. For example, the number of Wi's is not unique, and whether T_Wi is a rotation or a reflection for a given Wi in the decomposition is not unique. However, whether the number of Wi's for which T_Wi is a reflection is even or odd is an intrinsic property of T. Moreover, we can always decompose V so that T_Wi is a reflection for at most one Wi. These facts are established in the following result.

Theorem 6.40. Let T, V, and W1, W2, ..., Wm be as in Theorem 6.39.

(a) The number of i's for which T_Wi is a reflection is even or odd according to whether det(T) = 1 or det(T) = −1.
(b) It is always possible to decompose V as in Theorem 6.39 so that the number of i's for which T_Wi is a reflection is zero or one according to whether det(T) = 1 or det(T) = −1. Furthermore, if T_Wi is a reflection, then dim(Wi) = 1.

Proof. (a) Let r denote the number of i's for which T_Wi is a reflection. By Exercise 14,

det(T) = det(T_W1)·det(T_W2) ··· det(T_Wm) = (−1)^r,

since det(T_Wi) = −1 if T_Wi is a reflection and det(T_Wi) = 1 otherwise. So r is even or odd according to whether det(T) = 1 or det(T) = −1.

(b) Let E = {x ∈ V : T(x) = −x}; then E is a T-invariant subspace of V, and so is W = E⊥. Applying Theorem 6.39 to the restriction of T to W, we obtain a collection {W1, W2, ..., Wk} of pairwise orthogonal T-invariant subspaces such that 1 ≤ dim(Wi) ≤ 2 for 1 ≤ i ≤ k and W = W1 ⊕ W2 ⊕ ··· ⊕ Wk. Observe that no T_Wi is a reflection; for if some T_Wi were a reflection, then there would exist a nonzero x ∈ Wi for which T(x) = −x, and hence x ∈ Wi ∩ E ⊆ E⊥ ∩ E = {0}, a contradiction. If E = {0}, the result follows. Otherwise, choose an orthonormal basis β for E containing p elements (p > 0). It is possible to decompose β into a pairwise disjoint union β = β1 ∪ β2 ∪ ··· ∪ βr such that each βi contains exactly two elements for i < r, and βr contains two elements if p is even and one element otherwise. For each i = 1, 2, ..., r, let W_{k+i} = span(βi). Then {W1, W2, ..., Wk, ..., W_{k+r}} is a pairwise orthogonal collection of T-invariant subspaces, and

V = W1 ⊕ W2 ⊕ ··· ⊕ Wk ⊕ ··· ⊕ W_{k+r}.

If βi contains two elements, then det(T_{W_{k+i}}) = det([T_{W_{k+i}}]_{βi}) = det(−I) = 1, and hence T_{W_{k+i}} is a rotation. Otherwise dim(W_{k+r}) = 1 and det(T_{W_{k+r}}) = det(−1) = −1, so that T_{W_{k+r}} is a reflection. Thus this decomposition satisfies the conditions of Theorem 6.39, and T_Wi is a reflection for at most one i; by (a), the number of reflections is zero or one according to whether det(T) = 1 or det(T) = −1. This concludes the proof of part (b). ∎

Sec. 6.10 The Geometry of Orthogonal Operators 411

As a consequence of the preceding theorem, an orthogonal operator can be factored as a product of rotations and reflections.
Corollary. Let T be an orthogonal operator on a finite-dimensional real inner product space V. Then there exists a collection {T1, T2, ..., Tm} of orthogonal operators on V such that the following conditions hold.

(a) For each i, Ti is either a rotation or a reflection.
(b) For at most one i, Ti is a reflection.
(c) TiTj = TjTi for all i and j.
(d) T = T1T2 ··· Tm.
(e) det(T) = 1 if each Ti is a rotation, and det(T) = −1 otherwise.

Proof. As in the proof of part (b) of Theorem 6.40, write V = W1 ⊕ W2 ⊕ ··· ⊕ Wm, where each T_Wi is either a rotation or a reflection and T_Wi is a reflection for at most one i. For each x ∈ V, we can write x = x1 + x2 + ··· + xm, where xj ∈ Wj for all j. For each i = 1, 2, ..., m, define Ti: V → V by

Ti(x1 + ··· + xm) = x1 + ··· + x_{i−1} + T(xi) + x_{i+1} + ··· + xm.

It is easily shown that each Ti is an orthogonal operator on V; in fact, Ti is a rotation or a reflection according to whether T_Wi is a rotation or a reflection. This establishes (a) and (b). The proofs of (c), (d), and (e) are left as exercises (see Exercise 15). ∎
Example
Orthogonal Operators on a Three-Dimensional Real Inner Product Space

Let T be an orthogonal operator on a three-dimensional real inner product space V. We will show that T can be decomposed into the composition of a rotation and at most one reflection. Let V = W1 ⊕ W2 ⊕ ··· ⊕ Wm be a decomposition as in Theorem 6.40(b), and define T1, T2, ..., Tm as in the proof of the corollary to Theorem 6.40, so that T = T1T2 ··· Tm, where each Ti is a rotation or a reflection and Ti is a reflection for at most one i.

If m = 2, then we may suppose without loss of generality that dim(W1) = 1 and dim(W2) = 2, so that V = W1 ⊕ W2. If T_W1 is not a reflection, then T1 is the identity on V and T = T2, which is a rotation or the identity on V (a rotation). Otherwise T = T1T2 is the composition of a rotation and one reflection. (Note that if T_W2 were a reflection, then by Theorem 6.40(b) we would have dim(W2) = 1, a contradiction; hence T_W2 is a rotation.)

If m = 3, then dim(Wi) = 1 for all i, and V = W1 ⊕ W2 ⊕ W3. For each i, T_Wi is either a single reflection or the identity on Wi, and T_Wi is a reflection for at most one i. Hence at most one Ti is a reflection, and the composition of the remaining Ti's is a rotation or the identity on V. Thus, in every case, T is the composition of a rotation and at most one reflection. ∎
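The decomposition described in this example can be exhibited concretely for a matrix operator. An added sketch assuming NumPy: a 3×3 orthogonal matrix built from a rotation on a two-dimensional subspace W1 and the reflection x ↦ −x on a one-dimensional subspace W2.

```python
import numpy as np

def rot(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# V = W1 ⊕ W2 with dim(W1) = 2, dim(W2) = 1: a rotation on W1 and a
# reflection (the 1x1 block [-1]) on W2, as in Theorem 6.40(b).
T = np.zeros((3, 3))
T[:2, :2] = rot(0.9)
T[2, 2] = -1.0

assert np.allclose(T.T @ T, np.eye(3))   # T is orthogonal
# Exactly one reflection block, so det(T) = (-1)^1 = -1 (Theorem 6.40(a)).
print(round(np.linalg.det(T)))           # -1
```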

EXERCISES

1. Label the following statements as being true or false. Assume that the underlying vector spaces are finite-dimensional real inner product spaces.
(a) Any orthogonal operator is either a rotation or a reflection.
(b) The composition of any two rotations on a two-dimensional space is a rotation.
(c) The composition of any two rotations on a three-dimensional space is a rotation.
(d) The composition of any two rotations on a four-dimensional space is a rotation.
(e) The identity operator is a rotation.
(f) The composition of two reflections is a reflection.
(g) Any orthogonal operator is a composition of rotations.
(h) For any orthogonal operator T, if det(T) = −1, then T is a reflection.
(i) Reflections always have eigenvalues.
(j) Rotations always have eigenvalues.

2. Prove that rotations, reflections, and compositions of rotations and reflections are orthogonal operators.
3. Let A be a 2×2 orthogonal matrix with det(A) = −1.
(a) Prove that L_A: R2 → R2 is a reflection.
(b) Find the axis in R2 about which L_A reflects, that is, the subspace of R2 on which L_A acts as the identity.

4. For any real number φ, define T_φ = L_A, where

    A = [ cos φ  −sin φ ]
        [ sin φ   cos φ ].

(a) Prove that any rotation on R2 is of the form T_φ for some φ.
(b) Prove that T_φ T_ψ = T_(φ+ψ) for any φ, ψ ∈ R.
(c) Deduce that any two rotations on R2 commute.

5. For any real numbers φ and ψ, define

    A = [ cos φ   sin φ ]          B = [ cos ψ   sin ψ ]
        [ sin φ  −cos φ ]    and       [ sin ψ  −cos ψ ].

(a) Prove that L_A and L_B are reflections.
(b) Find the axes in R2 about which L_A and L_B reflect.
(c) Prove that L_AB and L_BA are rotations.

6. Prove that the composition of any two rotations on R3 is a rotation on R3.

7. Given real numbers φ and ψ, define matrices

    A = [ cos φ  −sin φ  0 ]          B = [ 1    0        0    ]
        [ sin φ   cos φ  0 ]    and       [ 0  cos ψ   −sin ψ  ]
        [  0       0     1 ]              [ 0  sin ψ    cos ψ  ].

(a) Prove that L_A and L_B are rotations.
(b) Prove that L_AB is a rotation.
(c) Find the axis of rotation of L_AB.

8. Prove that no orthogonal operator can be both a rotation and a reflection.

9. Prove that if V is a two- or three-dimensional real inner product space, then the composition of two reflections on V is a rotation of V.

10. Give an example of an orthogonal operator that is neither a reflection nor a rotation.

11. Let V be a finite-dimensional real inner product space. Define T: V → V by T(x) = −x. Prove that T is a product of rotations if and only if dim(V) is even.

12. Complete the proof of the lemma to Theorem 6.39 by showing that W = φ_β⁻¹(Z) satisfies the required conditions.

13. Let T be an orthogonal [unitary] operator on a finite-dimensional real [complex] inner product space V. If W is a T-invariant subspace of V, prove the following statements.
(a) T_W is an orthogonal [unitary] operator on W.
(b) W⊥ is a T-invariant subspace of V. Hint: Use the fact that T_W is one-to-one and onto to conclude that, for any y ∈ W, T*(y) = T⁻¹(y) ∈ W.
(c) T_{W⊥} is an orthogonal [unitary] operator on W⊥.

14. Let T be a linear operator on a finite-dimensional vector space V, where V is a direct sum of T-invariant subspaces, say V = W1 ⊕ W2 ⊕ ··· ⊕ Wk. Prove that det(T) = det(T_W1)·det(T_W2) ··· det(T_Wk).

15. Complete the proof of the corollary to Theorem 6.40.

16. Let T be an orthogonal operator on an n-dimensional real inner product space V, and suppose that T is not the identity. Prove the following statements.
(a) If n is odd, then T can be expressed as the composition of at most ½(n − 1) rotations and at most one reflection.
(b) If n is even, then T can be expressed as the composition of at most ½n rotations or as the composition of one reflection and at most ½(n − 2) rotations.

17. Let V be a real inner product space of dimension 2. For any x, y ∈ V such that x ≠ y and ‖x‖ = ‖y‖ = 1, show that there exists a unique rotation T on V such that T(x) = y.
Chap. 6 Index of Definitions 415

INDEX OF DEFINITIONS FOR CHAPTER 6

Adjoint of a linear operator 316
Axis of rotation 406
Bessel's inequality 314
Bilinear form 355
Complex inner product space 300
Condition number 403
Congruent matrices 359
Conjugate transpose of a matrix 296
Critical point 373
Diagonalizable bilinear form 361
Distance 303
Fourier coefficients relative to an orthonormal set 309
Gram-Schmidt orthogonalization process 307
Index of a bilinear form 378-79
Inner product 295
Inner product space 297
Least squares line 360
Local extremum 373
Local maximum 373
Local minimum 373
Lorentz transformation 351
Matrix representation of a bilinear form 357
Minimal solution of a system of equations 321
Norm of a vector 302
Normal matrix or operator 329
Orthogonal complement of a subspace 326
Orthogonal matrix 333
Orthogonal operator 333
Orthogonal projection 348
Orthogonal projection on a subspace 337
Orthogonal subset of an inner product space 300
Orthogonal vectors 300
Orthogonally equivalent matrices 336
Orthonormal basis 304
Orthonormal subset 300
Parallelogram law 302
Parseval's identity 310
Polar identities 303
Positive definite matrix 376
Positive semidefinite matrix 376
Quadratic form 366
Rank of a symmetric bilinear form 378
Rayleigh quotient 401
Real inner product space 297
Reflection 406
Resolution of the identity induced by a linear operator 346
Rigid motion 338
Rotation 406
Self-adjoint matrix or operator 331
Signature 378-79
Simultaneous diagonalization 332
Space-time coordinates 348
Spectral decomposition of a linear operator 386
Spectrum 386
Standard inner product 296
Symmetric bilinear form 360
Translation 338
Trigonometric polynomial 298
Unit vector 300
Unitarily equivalent matrices 349
Unitary matrix 336
Unitary operator 333
Canonical Forms

As we learned in Chapter 5, the advantage of a diagonalizable linear operator lies in the simplicity of its description. Such an operator has a diagonal matrix representation, or, equivalently, there is a basis for the underlying vector space consisting of eigenvectors of the operator. However, not every linear operator is diagonalizable, even if its characteristic polynomial splits; Example 3 of Section 5.2 describes such a linear operator. It is the purpose of this chapter to consider alternative matrix representations for nondiagonalizable operators. These representations are called canonical forms. There are different kinds of canonical forms, and their advantages and disadvantages depend on how they are applied. The choice of a canonical form is determined by the appropriate choice of an ordered basis. Naturally, the canonical forms of a linear operator are not diagonal matrices if the linear operator is not diagonalizable.

In this chapter we treat the two most popular canonical forms. The first of these, the Jordan canonical form, requires that the characteristic polynomial of the operator splits. This form is always available if the underlying field is algebraically closed, that is, if every polynomial with coefficients from the field splits. The first two sections deal with this form. The rational canonical form, treated in Section 7.4, does not require such factorization.
7.1 GENERALIZED EIGENVECTORS

In the first two sections of this chapter we consider linear operators on finite-dimensional vector spaces for which the characteristic polynomials split. Such operators have at least one eigenvalue. If λ1, λ2, ..., λn are the (not necessarily distinct) eigenvalues of T: V → V, recall from Theorem 5.4 that T is diagonalizable if and only if there is an ordered basis consisting of eigenvectors of T. If β = {x1, x2, ..., xn} is such a basis in which xj is an eigenvector corresponding to the eigenvalue λj, then

    [T]_β = [ λ1  0  ···  0  ]
            [  0 λ2  ···  0  ]
            [  .  .       .  ]
            [  0  0  ··· λn  ].

Sec. 7.1 Generalized Eigenvectors 417

Although not every linear operator is diagonalizable, we will prove that for any linear operator T on V whose characteristic polynomial splits, there exists an ordered basis β for V such that [T]_β is a square matrix of the form

    [ J1  O  ···  O  ]
    [  O J2  ···  O  ]
    [  .  .       .  ]
    [  O  O  ··· Jk  ],

where each Ji is a square matrix of the form (λj) or

    [ λj  1  0  ···  0 ]
    [  0 λj  1  ···  0 ]
    [  .             . ]
    [  0  0  ··· λj  1 ]
    [  0  0  ···  0 λj ]

for some eigenvalue λj of T. Such a matrix Ji is called a Jordan block corresponding to λj, and the matrix [T]_β is called a Jordan canonical form of T. We also say that the ordered basis β is a Jordan canonical basis for T. Observe that each Jordan block Ji is "almost" a diagonal matrix; in fact, [T]_β is a diagonal matrix if and only if each Ji is of the form (λj).
Example 1

The 8×8 matrix

    J = [ 2 1 0 0 0 0 0 0 ]
        [ 0 2 1 0 0 0 0 0 ]
        [ 0 0 2 0 0 0 0 0 ]
        [ 0 0 0 2 0 0 0 0 ]
        [ 0 0 0 0 3 1 0 0 ]
        [ 0 0 0 0 0 3 0 0 ]
        [ 0 0 0 0 0 0 0 1 ]
        [ 0 0 0 0 0 0 0 0 ]

is a Jordan canonical form of a linear operator T: C8 → C8; that is, there exists a Jordan canonical basis β = {x1, x2, ..., x8} for C8 such that [T]_β = J.
418 Chap. 7 Canonical Forms

Notice that the characteristic polynomial of T (and of J) is det(J − tI) = (t − 2)⁴(t − 3)²t², so that the multiplicity of each eigenvalue is the number of times that the eigenvalue appears on the diagonal of J. Also observe that of the basis vectors x1, x2, ..., x8, only x1, x4, x5, and x7 (the vectors corresponding to the first column of each of the Jordan blocks) are eigenvectors of T.
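The claims about J in Example 1 can be verified numerically. An added sketch assuming NumPy:

```python
import numpy as np

# The 8x8 matrix J of Example 1: Jordan blocks of sizes 3 and 1 for the
# eigenvalue 2, size 2 for the eigenvalue 3, and size 2 for the eigenvalue 0.
J = np.zeros((8, 8))
np.fill_diagonal(J[:4, :4], 2.0)
J[4, 4] = J[5, 5] = 3.0
J[0, 1] = J[1, 2] = J[4, 5] = J[6, 7] = 1.0

# Eigenvalue multiplicities match the diagonal of J: (t-2)^4 (t-3)^2 t^2.
vals = sorted(np.linalg.eigvals(J).real)
assert np.allclose(vals, [0, 0, 2, 2, 2, 2, 3, 3], atol=1e-4)

# Only x1, x4, x5, x7 (the first column of each block) are eigenvectors of J.
e = np.eye(8)
eigvec_indices = [i for i in range(8)
                  if np.allclose(J @ e[:, i], J[i, i] * e[:, i])]
print(eigvec_indices)   # [0, 3, 4, 6] in zero-based indexing
```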

It will be proved that every linear operator whose characteristic polynomial splits has a Jordan canonical form that is unique up to the order of the Jordan blocks. Nevertheless, it is not the case that the Jordan canonical form is completely determined by the characteristic polynomial of the transformation. For example, the characteristic polynomial of the matrix

    J' = [ 2 1 0 0 0 0 0 0 ]
         [ 0 2 0 0 0 0 0 0 ]
         [ 0 0 2 1 0 0 0 0 ]
         [ 0 0 0 2 0 0 0 0 ]
         [ 0 0 0 0 3 0 0 0 ]
         [ 0 0 0 0 0 3 0 0 ]
         [ 0 0 0 0 0 0 0 1 ]
         [ 0 0 0 0 0 0 0 0 ]

is also (t − 2)⁴(t − 3)²t².
Consider again the matrix J and the basis β of Example 1. Notice that T(x2) = x1 + 2x2, and therefore (T − 2I)(x2) = x1. Similarly, (T − 2I)(x3) = x2, and hence (T − 2I)²(x3) = x1. Thus the first three vectors of β can be written as

{x1, x2, x3} = {(T − 2I)²(x3), (T − 2I)(x3), x3},

and of these, only (T − 2I)²(x3) is an eigenvector. This pattern is repeated for the vectors of β corresponding to the other Jordan blocks. For example,

{x5, x6} = {(T − 3I)(x6), x6},

and only the first vector of this pair is an eigenvector. Because of the very nature of a Jordan canonical form, this pattern must occur for the vectors of any Jordan canonical basis of any linear operator. This fact gives us information about a Jordan canonical basis: if λ is a diagonal entry of a Jordan block, then each basis vector x corresponding to that block satisfies (T − λI)^p(x) = 0 for a sufficiently large p. For example, the vectors x2 and x3 in the basis above satisfy (T − 2I)²(x2) = 0 and (T − 2I)³(x3) = 0. Eigenvectors always satisfy this condition for the case p = 1. It seems appropriate to identify the nonzero vectors that satisfy this more general condition.
Definition. Let T be a linear operator on a finite-dimensional vector space V. A nonzero vector x in V is called a generalized eigenvector of T if there exists a scalar λ such that (T − λI)^p(x) = 0 for some positive integer p.

If x is a generalized eigenvector of T and p is the smallest positive integer for which (T − λI)^p(x) = 0, then (T − λI)^(p−1)(x) is an eigenvector of T corresponding to λ. Therefore, λ is an eigenvalue of T, and we say that x is a generalized eigenvector of T corresponding to the eigenvalue λ.

From the remarks made above, the vectors of a Jordan canonical basis that correspond to a single Jordan block are sequences of generalized eigenvectors. The following definitions will be useful.

Definitions. Let T be a linear operator on a vector space V, and let x be a generalized eigenvector of T corresponding to the eigenvalue λ. If p is the smallest positive integer such that (T − λI)^p(x) = 0, then the ordered set

{(T − λI)^(p−1)(x), (T − λI)^(p−2)(x), ..., (T − λI)(x), x}

is called a cycle of generalized eigenvectors of T corresponding to λ. The vectors (T − λI)^(p−1)(x) and x are called the initial vector and the end vector of the cycle, respectively, and we say that p is the length of the cycle.

For the example that we have been considering, β1 = {x1, x2, x3}, β2 = {x4}, β3 = {x5, x6}, and β4 = {x7, x8} are the cycles of generalized eigenvectors of T that occur in β. Notice that β is a disjoint union of these cycles and that the initial vector of each cycle is an eigenvector of T. Theorem 7.1 summarizes these observations and asserts, among other things, that cycles of generalized eigenvectors are always linearly independent.
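A cycle of generalized eigenvectors can be generated mechanically from its end vector. An added sketch assuming NumPy, using the matrix J of Example 1:

```python
import numpy as np

J = np.zeros((8, 8))                       # the matrix J of Example 1
np.fill_diagonal(J[:4, :4], 2.0)
J[4, 4] = J[5, 5] = 3.0
J[0, 1] = J[1, 2] = J[4, 5] = J[6, 7] = 1.0

def cycle(A, lam, end_vector):
    """Return the cycle {(A - lam I)^(p-1) x, ..., (A - lam I) x, x}
    determined by the end vector x (assumed to be a generalized
    eigenvector of A corresponding to lam)."""
    N = A - lam * np.eye(A.shape[0])
    vecs = [end_vector]
    while not np.allclose(N @ vecs[0], 0):
        vecs.insert(0, N @ vecs[0])        # prepend the next vector of the cycle
    return vecs

e = np.eye(8)
beta1 = cycle(J, 2.0, e[:, 2])             # end vector x3 yields {x1, x2, x3}
print(len(beta1))                          # 3: the cycle has length 3

# The initial vector of the cycle is an eigenvector (Theorem 7.1(a)).
assert np.allclose(J @ beta1[0], 2.0 * beta1[0])
```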

Theorem 7.1. Let T be a linear operator on a vector space V, and let γ be a cycle of generalized eigenvectors of T corresponding to the eigenvalue λ.

(a) The initial vector of γ is an eigenvector of T corresponding to the eigenvalue λ, and no other member of γ is an eigenvector of T.
(b) γ is linearly independent.
(c) Let β be an ordered basis for V. Then β is a Jordan canonical basis for T if and only if β is a disjoint union of cycles of generalized eigenvectors of T.
Proof. We prove only (b); the proofs of (a) and (c) are left as exercises. The proof is by induction on the length of the cycle γ. If γ has length 1, then γ = {x1} is linearly independent, since x1, being a generalized eigenvector, is a nonzero vector. Now assume that cycles of length k − 1 are linearly independent for some integer k − 1 ≥ 1, and suppose that γ = {x1, x2, ..., xk} is a cycle of generalized eigenvectors corresponding to the eigenvalue λ such that

a1x1 + a2x2 + ··· + akxk = 0

for some scalars a1, a2, ..., ak. Applying T − λI to the above equation gives

a2x1 + a3x2 + ··· + akx_{k−1} = 0.

But the sum in the preceding equality is a linear combination of the elements of the cycle {x1, x2, ..., x_{k−1}} of length k − 1, which is linearly independent by the induction hypothesis. Hence ai = 0 for i = 2, 3, ..., k, and the original equation reduces to a1x1 = 0. So a1 = 0 since x1 ≠ 0. This completes the proof that γ is linearly independent. ∎

The generalized eigenspaces play a key role in our study of this extension of the diagonalization problem. Recall that the eigenspace of T corresponding to an eigenvalue λ is spanned by the set of all eigenvectors corresponding to λ. For our purposes it is useful to consider the space spanned by the set of all generalized eigenvectors corresponding to a fixed eigenvalue; Jordan canonical bases are obtained by selecting appropriate bases from these spaces. The following definition is one way of defining such a space.

Definition. Let λ be an eigenvalue of a linear operator T on a vector space V. The generalized eigenspace of T corresponding to λ, denoted K_λ, is the subset

K_λ = {x ∈ V : (T − λI)^p(x) = 0 for some positive integer p}.

Note that K_λ consists of the zero vector and all the generalized eigenvectors of T corresponding to λ. Recall also that a subspace W of V is T-invariant if T(W) ⊆ W.

Theorem 7.2. Let T be a linear operator on a vector space V, and let λ be an eigenvalue of T. Then K_λ is a T-invariant subspace of V containing E_λ (the eigenspace of T corresponding to λ).
Proof. Clearly, 0 ∈ K_λ. Suppose that x and y are in K_λ. Then there exist positive integers p and q such that

(T − λI)^p(x) = 0  and  (T − λI)^q(y) = 0.

Therefore,

(T − λI)^(p+q)(x + y) = (T − λI)^(p+q)(x) + (T − λI)^(p+q)(y) = (T − λI)^q(0) + (T − λI)^p(0) = 0,

and hence x + y ∈ K_λ. By means of a simple calculation, it can be shown that cx ∈ K_λ for any scalar c. Hence K_λ is a subspace of V.

To show that K_λ is T-invariant, consider any x ∈ K_λ, and choose a positive integer p such that (T − λI)^p(x) = 0. Then

(T − λI)^p T(x) = T(T − λI)^p(x) = T(0) = 0.

Therefore, T(x) ∈ K_λ. Finally, it is a simple observation that E_λ is contained in K_λ. ∎

We will produce Jordan canonical bases by collecting cycles of generalized eigenvectors taken from the generalized eigenspaces, but we must take care that the resulting collections are linearly independent. The next lemma and the accompanying Theorem 7.3 localize the problem of guaranteeing linear independence to the individual generalized eigenspaces. Compare these results to Theorem 5.13 and the lemma directly preceding it in Section 5.2.
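The generalized eigenspace of a matrix can be computed directly from its definition. An added sketch assuming NumPy; it uses the standard fact that the chain of null spaces of (A − λI)^p stabilizes once p reaches n = dim V.

```python
import numpy as np

def generalized_eigenspace_dim(A, lam):
    """dim K_lam, computed as the nullity of (A - lam I)^n with n = dim V."""
    n = A.shape[0]
    N = np.linalg.matrix_power(A - lam * np.eye(n), n)
    return n - np.linalg.matrix_rank(N)

J = np.zeros((8, 8))                       # the matrix J of Example 1
np.fill_diagonal(J[:4, :4], 2.0)
J[4, 4] = J[5, 5] = 3.0
J[0, 1] = J[1, 2] = J[4, 5] = J[6, 7] = 1.0

# dim K_lam equals the multiplicity of lam in (t-2)^4 (t-3)^2 t^2.
print(generalized_eigenspace_dim(J, 2.0),   # 4
      generalized_eigenspace_dim(J, 3.0),   # 2
      generalized_eigenspace_dim(J, 0.0))   # 2
```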
Lemma. Let T be a linear operator on a finite-dimensional vector space V, and let λ1, λ2, ..., λk be distinct eigenvalues of T. For each i = 1, 2, ..., k, let xi ∈ K_λi, the generalized eigenspace of T corresponding to λi. If

x1 + x2 + ··· + xk = 0,   (1)

then xi = 0 for all i.

Proof. We prove the lemma by mathematical induction on k, the number of distinct eigenvalues. The result is trivial for k = 1. Assume that the lemma holds for k − 1 distinct eigenvalues, where k − 1 ≥ 1, and suppose that we have k distinct eigenvalues λ1, λ2, ..., λk with vectors xi ∈ K_λi satisfying (1). First, suppose that x1 ≠ 0. For each i, 1 ≤ i ≤ k, let pi be the smallest positive integer for which (T − λiI)^pi(xi) = 0. Let z = (T − λ1I)^(p1−1)(x1). Then z is an eigenvector of T corresponding to λ1. Let f(t) be the polynomial defined by

f(t) = (t − λ2)^p2 ··· (t − λk)^pk.

Then f(T) is the linear operator given by

f(T) = (T − λ2I)^p2 ··· (T − λkI)^pk.

Clearly, f(T)(xi) = 0 for all i > 1. With the aid of Exercise 22 of Section 5.1, we have

f(T)(z) = f(λ1)z = (λ1 − λ2)^p2 ··· (λ1 − λk)^pk z ≠ 0.

Furthermore,

f(T)(z) = f(T)(T − λ1I)^(p1−1)(x1) = (T − λ1I)^(p1−1) f(T)(x1) = −(T − λ1I)^(p1−1) f(T)(x2 + ··· + xk)   [by (1)]
        = 0.

This is a contradiction. So it must be that x1 = 0. Therefore, (1) reduces to x2 + ··· + xk = 0, and by the induction hypothesis, xi = 0 for all i. This establishes the lemma. ∎
Theorem 7.3. Let T be a linear operator on a finite-dimensional vector space V with distinct eigenvalues λ1, λ2, ..., λk. For each i = 1, 2, ..., k, let Si be a linearly independent subset of K_λi. Then Si ∩ Sj = ∅ for i ≠ j, and S = S1 ∪ S2 ∪ ··· ∪ Sk is a linearly independent subset of V.

Proof. Suppose that x ∈ Si ∩ Sj for some i ≠ j. Then x ≠ 0, and x lies in both K_λi and K_λj. Let y = −x ∈ K_λj. Then x + y = 0, and by the lemma x = 0, contrary to the fact that x ≠ 0. Thus Si ∩ Sj = ∅ for i ≠ j.

Now suppose that, for each i, Si = {xi1, xi2, ..., xi,ni}. Then S = {xij : 1 ≤ j ≤ ni, 1 ≤ i ≤ k}. Consider any scalars {aij} such that

Σ (over i = 1, ..., k and j = 1, ..., ni) aij xij = 0.

For each i, let

yi = Σ (over j = 1, ..., ni) aij xij.

Then yi ∈ K_λi for each i, and y1 + y2 + ··· + yk = 0. Therefore, by the lemma, yi = 0 for all i. But each Si is linearly independent, so it follows that aij = 0 for each i and j. We conclude that S is linearly independent. ∎

The next step in our program to produce Jordan canonical bases is to describe a method for choosing cycles of generalized eigenvectors from each generalized eigenspace so that the union of the cycles is linearly independent. Notice that the initial vector of each cycle is an eigenvector. Trivially, if the union of distinct cycles is linearly independent, then the eigenvectors with which the cycles begin form a linearly independent set. This observation provides us with a key to the solution.
Theorem 7.4
space and for each

Let

i (1

the

of

is linearly

independentsubset.

V,

yk

= 0.

by the lemma,

Therefore,

cycles

of^generalized

union

of

each

observation

be

< i <

the

then
provides

linear

cycles

is an

cycle

independent,

Jordan canonical bases is

to produce

program

vector

the

for
each
S( is linearly independent for all i. Thus,
f, it follows
j. We conclude that 5 is linearlyindependent. 1

generalized eigenspace so
Notice that the initial

union of

\342\226\240
\342\200\242
\342\226\240

cycle

from

is linearly

eigenvector.

independent.
Trivially, if the

the eigenvectors form a linearly


a key to the solution.
us with
on a finite~dimensional

operator

q) let Zt be a

eigenvectors

to

of

generalized

eigenvectors

vector
of T

to
corresponding

and

set

distinct

cycles

the

423

GeneralizedEigenvectors

Sec. 7.1

the

X and

eigenvalue

independent, then the Zj'sare disjoint


common) and

y2,...,
yq} is linearly
have any elements in

{yl3

the Vj'sare distinct

vector y;. If

initial

having

two

(no

z = U z,

is

independent.

linearly

\342\226\240

Proof. That the Zᵢ's are disjoint follows by Exercise 5 and the fact that the yⱼ's are distinct.

The proof that Z is linearly independent is by mathematical induction on the number of vectors in Z. If this number is 1, then the result is trivial. Assume that for some integer n > 1 the result is valid whenever Z has fewer than n vectors, and suppose that Z has exactly n vectors. Let W be the subspace of V generated by Z; clearly, dim(W) ≤ n. The subspace W is (T − λI)-invariant; let U denote the restriction of T − λI to W. For each i let Z′ᵢ denote the cycle obtained from Zᵢ by deleting the end vector of Zᵢ, and let Z′ = ∪ᵢ Z′ᵢ. Then Z′ consists of n − q vectors. Each vector of Z′ᵢ is the image under U of a vector of Zᵢ, and conversely every nonzero image under U of a vector of Z lies in Z′; hence Z′ generates R(U). Furthermore, the initial vectors of the Z′ᵢ's are also initial vectors of the Zᵢ's. Since Z′ has fewer than n vectors, we may apply the induction hypothesis to conclude that Z′ is linearly independent. Therefore Z′ is a basis for R(U), and hence dim(R(U)) = n − q. Since {y₁, y₂, ..., y_q} is a linearly independent subset of N(U), we have that dim(N(U)) ≥ q. From these inequalities and the dimension theorem we obtain

    dim(W) = dim(R(U)) + dim(N(U)) ≥ (n − q) + q = n.

We conclude that dim(W) = n. Since Z generates W and consists of n vectors, it must be a basis for W, and hence Z is linearly independent.

We are now prepared for the principal theorem of this section.
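As a small numerical check of the machinery used in this proof, one can build a cycle of generalized eigenvectors explicitly and test its independence. The sketch below uses Python with SymPy; the 3 × 3 matrix is an assumed example of our own, not one from the text.

```python
# Build the cycle {(A - 2I)^2 x, (A - 2I) x, x} (initial vector first) for
# an assumed matrix A with eigenvalue 2, and verify that the cycle is
# linearly independent, as Theorems 7.1 and 7.4 guarantee.
from sympy import Matrix, eye

A = Matrix([[2, 1, 0],
            [0, 2, 1],
            [0, 0, 2]])        # a single Jordan block: K_2 is all of C^3
B = A - 2 * eye(3)
x = Matrix([0, 0, 1])          # end vector of a cycle of length 3
cycle = [B**2 * x, B * x, x]   # initial vector first, end vector last
M = Matrix.hstack(*cycle)      # cycle vectors as columns
print(M.rank())                # 3: the cycle is linearly independent
```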

Theorem 7.5. Let T be a linear operator on an n-dimensional vector space V such that the characteristic polynomial of T splits. Then there exists a Jordan canonical basis for T; that is, there exists an ordered basis β for V that is a disjoint union of cycles of generalized eigenvectors of T.

424   Chap. 7   Canonical Forms

Proof. The proof is by induction on n. Clearly, the result is true for n = 1. Assume that the conclusion is valid for any vector space of dimension less than a fixed integer n > 1, and suppose that dim(V) = n. Choose an eigenvalue λ₁ of T, and let r = rank(T − λ₁I). Then R(T − λ₁I) is an r-dimensional T-invariant subspace of V, and r < n. Let U be the restriction of T to R(T − λ₁I). The characteristic polynomial of U divides the characteristic polynomial of T (Theorem 5.26) and hence also splits. Thus we may apply the induction hypothesis to obtain a Jordan canonical basis α for U. We wish to extend α to a Jordan canonical basis for T.

For each i let Sᵢ consist of the vectors of α that lie in cycles of generalized eigenvectors of U corresponding to λᵢ. Then Sᵢ is a linearly independent set that is a disjoint union of cycles of generalized eigenvectors corresponding to λᵢ. Let Z₁, Z₂, ..., Z_p be the disjoint cycles whose union is S₁. (It is possible that p = 0.) For each cycle Zᵢ, let Z′ᵢ = Zᵢ ∪ {yᵢ}, where yᵢ is a vector in V such that (T − λ₁I)(yᵢ) is the end vector of Zᵢ. Such a vector always exists because the end vector of Zᵢ lies in R(T − λ₁I). Then Z′ᵢ is also a cycle of generalized eigenvectors of T corresponding to λ₁. For each i let zᵢ denote the initial vector of Z′ᵢ. Then {z₁, z₂, ..., z_p} is a linearly independent subset of N(T − λ₁I), and this set can be extended to a basis

    {z₁, z₂, ..., z_p, z_{p+1}, ..., z_{n−r}}

for N(T − λ₁I). If p < i ≤ n − r, let Z′ᵢ = {zᵢ}; then Z′₁, Z′₂, ..., Z′_{n−r} is a collection of disjoint cycles of generalized eigenvectors of T corresponding to λ₁. Let S′₁ denote the union of this collection. Since the initial vectors of these cycles form a linearly independent set, we have from Theorem 7.4 that S′₁ is linearly independent. Notice that S′₁ is obtained from S₁ by adjoining n − r vectors.

Let β be defined by β = S′₁ ∪ S₂ ∪ ⋯ ∪ S_k. Then β is a disjoint union of cycles of generalized eigenvectors of T. Since α consists of r vectors and β is obtained from α by adjoining n − r vectors, β consists of n vectors. Furthermore, β is linearly independent by Theorem 7.3. We conclude that β is a basis for V, and hence a Jordan canonical basis for T.

In the proof of Theorem 7.5 the cycles in β that correspond to the eigenvalue λ₁ were constructed so that the set of their initial vectors is a basis for N(T − λ₁I) = E_λ₁; thus the number of cycles corresponding to λ₁ equals dim(E_λ₁). In the next section we will show (Theorem 7.8) that these relations are true for any Jordan canonical basis.

Having established the existence of a Jordan canonical form, we can now investigate the connection between the generalized eigenspaces and the characteristic polynomial of an operator. It is useful to compare this next theorem to Theorem 5.14 in Section 5.2.

Theorem 7.6. Let T be a linear operator on a finite-dimensional vector space V such that the characteristic polynomial of T splits. Suppose that λ₁, λ₂, ..., λ_k are the distinct eigenvalues of T with corresponding multiplicities m₁, m₂, ..., m_k. Then:

(a) dim(K_λᵢ) = mᵢ for all i.
(b) If, for each i, Sᵢ is a basis for K_λᵢ, then the Sᵢ's are pairwise disjoint, and their union S = S₁ ∪ S₂ ∪ ⋯ ∪ S_k is a basis for V.
(c) If β is a Jordan canonical basis for T and, for each i, βᵢ consists of the vectors of β lying in K_λᵢ, then βᵢ is a basis for K_λᵢ for all i.
(d) K_λᵢ = N((T − λᵢI)^mᵢ) for all i.
(e) T is diagonalizable if and only if E_λᵢ = K_λᵢ for all i.

Proof. (a), (b), and (c). We prove the first three parts simultaneously. Let β be a Jordan canonical basis for T, and let J = [T]_β. For each i let Sᵢ and βᵢ be as in (b) and (c), dᵢ = dim(K_λᵢ), and n = dim(V). For each i, the vectors in βᵢ are in one-to-one correspondence with the columns of J that contain λᵢ as the diagonal entry. Since J is an upper triangular matrix, the number of occurrences of λᵢ on the diagonal is mᵢ. Therefore βᵢ consists of mᵢ vectors, and n = Σ mᵢ. Since βᵢ is a linearly independent subset of K_λᵢ, it follows that mᵢ ≤ dᵢ for all i. By Theorem 7.3 the sets Sᵢ are pairwise disjoint and S is linearly independent. Thus

    n = Σ mᵢ ≤ Σ dᵢ ≤ n,

and since S is linearly independent, mᵢ = dᵢ for all i. The last equality tells us that S contains n vectors. Therefore, since S is linearly independent, it is a basis for V. Because mᵢ = dᵢ, we have that each βᵢ is a basis for K_λᵢ. Thus we have established (a), (b), and (c).

(d) Clearly, N((T − λᵢI)^mᵢ) ⊆ K_λᵢ. Suppose that x ∈ K_λᵢ. Then the cycle Z of generalized eigenvectors with end vector x is a linearly independent subset of K_λᵢ by Theorem 7.1. Since dim(K_λᵢ) = mᵢ by (a), it follows that the length of Z cannot exceed mᵢ; that is, (T − λᵢI)^p(x) = 0 for p ≥ mᵢ. Therefore x ∈ N((T − λᵢI)^mᵢ), proving that N((T − λᵢI)^mᵢ) = K_λᵢ.

(e) If T is diagonalizable, then dim(E_λᵢ) = mᵢ for all i by Theorem 5.14. Since E_λᵢ is a subspace of K_λᵢ and dim(K_λᵢ) = mᵢ by part (a), we have that the two spaces are equal: E_λᵢ = K_λᵢ for all i. Conversely, if E_λᵢ = K_λᵢ for all i, then dim(E_λᵢ) = mᵢ for all i by part (a), and T is diagonalizable by Theorem 5.14.
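Parts (d) and (e) of Theorem 7.6 lend themselves to a direct numerical check: the generalized eigenspace K_λ is the null space of (A − λI)^m, and the operator is diagonalizable exactly when this null space coincides with the ordinary eigenspace. The sketch below uses SymPy; the 3 × 3 matrix is an assumed example of our own, not one from the text.

```python
# Theorem 7.6(d)-(e) in computation: K_lambda = N((A - lambda I)^m),
# where m is the multiplicity of lambda; A is diagonalizable at lambda
# iff dim E_lambda = dim K_lambda.
from sympy import Matrix, eye

A = Matrix([[2, 1, 0],
            [0, 2, 0],
            [0, 0, 3]])      # characteristic polynomial (t - 2)^2 (t - 3)

lam, m = 2, 2                                  # eigenvalue 2, multiplicity 2
E = (A - lam * eye(3)).nullspace()             # basis of eigenspace E_2
K = ((A - lam * eye(3)) ** m).nullspace()      # basis of K_2 = N((A - 2I)^2)
print(len(E), len(K))  # dim E_2 = 1 < dim K_2 = 2 = m, so A is not diagonalizable
```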

Example 2

Let T: C³ → C³ be defined by T(x) = Ax for a 3 × 3 matrix A whose characteristic polynomial is

    det(A − tI) = −(t − 3)(t − 2)².

We will find a basis for each eigenspace and each generalized eigenspace of T. Thus T has two distinct eigenvalues, λ₁ = 3 and λ₂ = 2, having multiplicities 1 and 2, respectively. By Theorem 7.6, K_λ₁ = N(T − 3I) has dimension 1, and K_λ₂ = N((T − 2I)²) has dimension 2. Since λ₁ has multiplicity 1, E_λ₁ = K_λ₁; hence any basis for the eigenspace E_λ₁ is also a basis for the generalized eigenspace K_λ₁.

Now E_λ₂ = N(T − 2I), and it is easily verified that dim(E_λ₂) = 1. A basis for K_λ₂ either is a union of two cycles of length one or consists of a single cycle of length 2. The former is impossible, because the resulting basis would consist of eigenvectors—contradicting the fact that dim(E_λ₂) = 1. Hence the desired basis for K_λ₂ is a single cycle of length 2. A vector v is the end vector of such a cycle if and only if (A − 2I)v ≠ 0 but (A − 2I)²v = 0. A basis for the solution space of the homogeneous system (A − 2I)X = 0 and a basis for the solution space of the homogeneous system (A − 2I)²X = 0 can be computed directly; any vector in the second solution space but not the first is an acceptable candidate for v. Notice that the vector (A − 2I)v then lies in E_λ₂, and the cycle {(A − 2I)v, v} is the desired basis for K_λ₂.

It follows that the union of the three basis vectors so chosen is a basis for C³, and

    [T]_β =
    [3 0 0]
    [0 2 1]
    [0 0 2]

is a Jordan canonical form for T.

Example 3

Let T: P₂(C) → P₂(C) be defined by T(f) = −f − f′. We will find a basis for each eigenspace and generalized eigenspace of T. If β = {1, z, z²} is the ordered basis for P₂(C), then

    [T]_β =
    [−1 −1  0]
    [ 0 −1 −2]
    [ 0  0 −1]

Thus the characteristic polynomial of T is det(A − tI) = −(t + 1)³. So λ = −1 is the only eigenvalue of T, and hence K_λ = P₂(C) by Theorem 7.6. So any basis for P₂(C), for example β, is a basis for K_λ.

Now E_λ = N(T − λI). If f is a polynomial in P₂(C), then f ∈ E_λ if and only if 0 = T(f) + f = −f − f′ + f = −f′, and f′ = 0 if and only if f is a constant. Consequently, {1} is a basis for E_λ, and dim(E_λ) = 1.

Since K_λ = P₂(C) is three-dimensional, a Jordan canonical basis for T must consist of a single cycle of length 3. The initial vector of any such cycle is an eigenvector of T corresponding to λ. Therefore, if γ is such a cycle, then

    [T]_γ =
    [−1  1  0]
    [ 0 −1  1]
    [ 0  0 −1]

is a Jordan canonical form for T. The set {2, −2z, z²} is an example of such a cycle.

In the next section we will develop a more direct approach for finding a Jordan canonical basis for a linear operator.

Direct Sums*

Let T be a linear operator on a finite-dimensional vector space V, and suppose that the characteristic polynomial of T splits. By Theorem 5.16, T is diagonalizable if and only if V is a direct sum of the eigenspaces of T. If T is diagonalizable, the eigenspaces and the generalized eigenspaces coincide. The following result generalizes Theorem 5.16 to the nondiagonalizable case.

Theorem 7.7. Let T be a linear operator on a finite-dimensional vector space V for which the characteristic polynomial of T splits. Then V is the direct sum of the generalized eigenspaces of T.

Proof. Exercise.

EXERCISES

1. Label the following statements as being true or false.
   (a) Eigenvectors of a linear operator T are also generalized eigenvectors of T.
   (b) It is possible for a generalized eigenvector of a linear operator T to be associated with a scalar that is not an eigenvalue of T.
   (c) Any linear operator on a finite-dimensional vector space has a Jordan canonical form.
   (d) Cycles of generalized eigenvectors are linearly independent.
   (e) There exists exactly one cycle of generalized eigenvectors of a linear operator on a finite-dimensional vector space corresponding to each eigenvalue of the operator.
   (f) Let T be a linear operator on a finite-dimensional vector space whose characteristic polynomial factors into polynomials of degree 1, and let λ₁, λ₂, ..., λ_k be the distinct eigenvalues of T. If, for each i, βᵢ is any basis for K_λᵢ, then β₁ ∪ β₂ ∪ ⋯ ∪ β_k is a Jordan canonical basis for T.
   (g) For any Jordan block J, the operator L_J has Jordan canonical form J.
   (h) Let T be a linear operator on an n-dimensional vector space whose characteristic polynomial splits. Then, for any eigenvalue λ of T, K_λ = N((T − λI)ⁿ).

2. For each of the following linear operators T, find the Jordan canonical form and a basis for each generalized eigenspace.
   (a) T = L_A, where A = …
   (b) T = …
   (c) T = …

3. Let T: P₂(C) → P₂(C) be defined by T(f) = 2f − f′. Find a basis for each generalized eigenspace of T.

4.* Let Z be a cycle of generalized eigenvectors of a linear operator T on V that corresponds to the eigenvalue λ. Prove that span(Z) is a T-invariant subspace of V.

5. Let Z₁, Z₂, ..., Z_p be cycles of generalized eigenvectors of a linear operator T corresponding to an eigenvalue λ. Prove that if the initial eigenvectors are distinct, then the cycles are disjoint.

6. Let T: V → W be a linear transformation. Prove the following:
   (a) N(T) = N(−T).
   (b) N(T^k) = N((−T)^k) for any positive integer k.
   (c) If V = W (so that T is a linear operator on V) and λ is an eigenvalue of T, then for any positive integer k,
       N((T − λI_V)^k) = N((λI_V − T)^k).

7. Let U be a linear operator on a finite-dimensional vector space V. Prove the following:
   (a) N(U) ⊆ N(U²) ⊆ ⋯ ⊆ N(U^k) ⊆ N(U^(k+1)) ⊆ ⋯.
   (b) If rank(U^m) = rank(U^(m+1)) for some positive integer m, then rank(U^m) = rank(U^k) for any positive integer k ≥ m.
   (c) If rank(U^m) = rank(U^(m+1)) for some positive integer m, then N(U^m) = N(U^k) for any positive integer k ≥ m.
   (d) Let T be a linear operator, and let λ be an eigenvalue of T. Prove that if rank((T − λI)^m) = rank((T − λI)^(m+1)) for some integer m, then K_λ = N((T − λI)^m).
   (e) Second Test for Diagonalizability. Let T be a linear operator whose characteristic polynomial splits. Suppose that λ₁, λ₂, ..., λ_k are the distinct eigenvalues of T. Then T is diagonalizable if and only if rank(T − λᵢI) = rank((T − λᵢI)²) for 1 ≤ i ≤ k.
   (f) Use part (e) to obtain a simpler proof of Exercise 24 of Section 5.4: If T is a diagonalizable linear operator on a finite-dimensional vector space V and W is a T-invariant subspace of V, then T_W is diagonalizable.

8. Let T be a linear operator on a finite-dimensional vector space V such that the characteristic polynomial f(t) of T splits. Prove that T satisfies its characteristic polynomial; i.e., f(T) = T₀. (This is a special case of the Cayley–Hamilton theorem.) Hint: Show that if β is a Jordan canonical basis for T, then f(T)(x) = 0 for each x ∈ β.

9. Let β be a Jordan canonical basis for a linear operator T on a finite-dimensional vector space V, and let J = [T]_β. Fix an eigenvalue λ of T, and let m denote the number of Jordan blocks of J having λ in the diagonal positions. Prove that 1 ≤ m ≤ dim(E_λ). [We will see in the next section that m = dim(E_λ).]

10. Prove parts (a) and (c) of Theorem 7.1.

Exercises 11 and 12 will be concerned with direct sums.

11. Prove Theorem 7.7.

12. Let T be a linear operator on a finite-dimensional vector space such that the characteristic polynomial of T splits, and let λ₁, λ₂, ..., λ_k be the distinct eigenvalues of T. Prove that …
7.2  JORDAN CANONICAL FORM

For the purposes of this section we fix a linear operator T on an n-dimensional vector space V such that the characteristic polynomial of T splits. Let λ₁, λ₂, ..., λ_k be the distinct eigenvalues of T. Theorem 7.5 assures us of the existence of a Jordan canonical basis β for T. By Theorem 7.6(c) the cycles of β corresponding to the eigenvalue λᵢ form a Jordan canonical basis βᵢ for the restriction Tᵢ of T to K_λᵢ. We can reverse this process. Since each generalized eigenspace K_λᵢ is T-invariant, there is a Jordan canonical basis βᵢ for each restriction Tᵢ; Theorem 7.6(b) now applies, and β = β₁ ∪ β₂ ∪ ⋯ ∪ β_k is a Jordan canonical basis for V.

Sec. 7.2   Jordan Canonical Form   431

For any Jordan canonical basis β of T, the Jordan canonical forms for the restrictions Tᵢ can be combined to obtain a Jordan canonical form J for T. That is, if Aᵢ = [Tᵢ]_βᵢ for all i, then

    J = [T]_β =
    [A₁  O   ⋯  O ]
    [O   A₂  ⋯  O ]
    [⋮            ⋮]
    [O   O   ⋯  A_k]

where each O is a zero matrix of the appropriate size. Conversely, J₁ ⊕ J₂ ⊕ ⋯ ⊕ J_k is the Jordan canonical form for T if and only if each Jᵢ is a Jordan canonical form for the restriction of T to K_λᵢ.

In this section we are developing a method for finding J, and in the course of computing J it will become evident that we also compute the matrices Aᵢ and the bases βᵢ, and thereby β. While the matrices Aᵢ are in some sense unique, what we mean by "in some sense" will become clear as we proceed.

To aid in formulating a uniqueness theorem for J, we adopt the following convention: The basis βᵢ for K_λᵢ will henceforth be ordered in such a way that the cycles appear in order of decreasing length. That is, if βᵢ is a disjoint union of cycles Z₁, Z₂, ..., Z_{kᵢ}, and if the length of the cycle Zⱼ is pⱼ, we index the cycles so that p₁ ≥ p₂ ≥ ⋯ ≥ p_{kᵢ}. This ordering of the cycles determines an ordering of βᵢ and hence determines the matrix Aᵢ. It is in this sense that Aᵢ is unique. It then follows that the Jordan canonical form for T is unique up to an ordering of the eigenvalues of T. As we will also see, there is no comparable uniqueness theorem for the bases βᵢ or for β. Specifically, what will be shown is that the number kᵢ of cycles that form βᵢ and the length pⱼ (j = 1, 2, ..., kᵢ) of each cycle is completely determined by T.
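The direct-sum assembly J = A₁ ⊕ A₂ ⊕ ⋯ ⊕ A_k described above can be sketched with SymPy's `diag`, which builds a block-diagonal matrix; the two blocks below are assumed examples of our own, not ones from the text.

```python
# Assemble J from per-eigenvalue Jordan pieces as a block-diagonal matrix.
from sympy import Matrix, diag

A1 = Matrix([[2, 1],
             [0, 2]])    # assumed Jordan form of T restricted to K_2
A2 = Matrix([[3]])       # assumed Jordan form of T restricted to K_3
J = diag(A1, A2)         # 3 x 3, with A1 and A2 down the diagonal
print(J)
```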

Example 1

To illustrate, suppose that kᵢ = 4 (i.e., there are four cycles), with p₁ = 3, p₂ = 3, p₃ = 2, and p₄ = 1. Then Aᵢ is entirely determined by the numbers p₁, p₂, ..., p_{kᵢ}; it is the 9 × 9 matrix

    Aᵢ =
    [λᵢ 1  0 | 0  0  0 | 0  0 | 0 ]
    [0  λᵢ 1 | 0  0  0 | 0  0 | 0 ]
    [0  0  λᵢ| 0  0  0 | 0  0 | 0 ]
    [0  0  0 | λᵢ 1  0 | 0  0 | 0 ]
    [0  0  0 | 0  λᵢ 1 | 0  0 | 0 ]
    [0  0  0 | 0  0  λᵢ| 0  0 | 0 ]
    [0  0  0 | 0  0  0 | λᵢ 1 | 0 ]
    [0  0  0 | 0  0  0 | 0  λᵢ| 0 ]
    [0  0  0 | 0  0  0 | 0  0 | λᵢ]

consisting of Jordan blocks of sizes 3, 3, 2, and 1 down the diagonal.

432   Chap. 7   Canonical Forms

As an aid in computing the matrices Aᵢ, we introduce an array of dots, called a dot diagram, to help us visualize each βᵢ. Suppose as above that βᵢ is a disjoint union of the cycles Z₁, Z₂, ..., Z_{kᵢ} with lengths p₁ ≥ p₂ ≥ ⋯ ≥ p_{kᵢ}, respectively. The dot diagram of βᵢ contains one dot for each member of βᵢ, and the dots are constructed according to the following rules:

1. The array consists of kᵢ columns of dots (one column for each cycle).
2. Counting from left to right, the jth column consists of pⱼ dots that correspond to the members of Zⱼ in the following manner: if xⱼ is the end vector of Zⱼ, then the top dot corresponds to (T − λᵢI)^(pⱼ−1)(xⱼ), the second dot of the column corresponds to (T − λᵢI)^(pⱼ−2)(xⱼ), etc., and the final (lowermost) dot corresponds to xⱼ.

Hence the dot diagram associated with βᵢ may be depicted as

    • (T − λᵢI)^(p₁−1)(x₁)   • (T − λᵢI)^(p₂−1)(x₂)   ⋯   • (T − λᵢI)^(p_{kᵢ}−1)(x_{kᵢ})
    • (T − λᵢI)^(p₁−2)(x₁)   • (T − λᵢI)^(p₂−2)(x₂)
      ⋮                        ⋮
    • (T − λᵢI)(x₁)          • (T − λᵢI)(x₂)
    • x₁                     • x₂                      • x_{kᵢ}

In the diagram above we have labeled each dot with the member of βᵢ to which it corresponds. Notice that the dot diagram of βᵢ has p₁ rows. Observe also that the columns of the dot diagram become shorter (or at least not longer) as we move from left to right, since p₁ ≥ p₂ ≥ ⋯ ≥ p_{kᵢ}. You might also observe that the array is combinatorial in nature: if rⱼ denotes the number of dots in the jth row of the dot diagram, then r₁ ≥ r₂ ≥ ⋯ ≥ r_{p₁}. Since the proof of this fact is combinatorial, it will be left to the exercises (see Exercise 7).

Returning to Example 1, where kᵢ = 4 (one column for each cycle), p₁ = 3, p₂ = 3, p₃ = 2, and p₄ = 1, we see that the dot diagram for βᵢ is

    •  •  •  •
    •  •  •
    •  •

We will devise a method for computing the dot diagram for βᵢ in terms of T; hence the dot diagram is uniquely determined by T. It is important to understand, however, that when we say that the dot diagram is uniquely determined by T, we are making no assertions about the uniqueness of the basis βᵢ. Indeed, as we will see, βᵢ is not unique. By the uniqueness of the dot diagram we mean that if βᵢ and γᵢ are two Jordan canonical bases for K_λᵢ, then the dot diagrams for βᵢ and γᵢ are identical. Thus, if γᵢ is a disjoint union of k′ᵢ cycles of lengths p′₁ ≥ p′₂ ≥ ⋯ ≥ p′_{k′ᵢ}, then k′ᵢ = kᵢ and p′₁ = p₁, p′₂ = p₂, ..., p′_{kᵢ} = p_{kᵢ}.

To establish this uniqueness result, we will use the following combinatorial fact: Any dot diagram is completely determined by the number of its rows and the number of dots in each row (see Exercise 7). Thus, if these numbers could be computed from properties intrinsic to the transformation T (for example, as the ranks of (T − λᵢI)^j for various values of j), the dot diagram could be reconstructed, and the uniqueness of the numbers kᵢ, p₁, p₂, ..., p_{kᵢ} would be proved.

By definition the dots in the dot diagram are associated with the vectors of βᵢ, which lie in the generalized eigenspace K_λᵢ. Since K_λᵢ is invariant under (T − λᵢI)^r, we have N((T − λᵢI)^r) ∩ K_λᵢ = N(U^r), where U denotes the restriction of T − λᵢI to K_λᵢ. For this reason it suffices to consider U.
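The combinatorial fact just cited can be phrased as code: the column lengths p₁ ≥ p₂ ≥ ⋯ (the cycle lengths) are the conjugate partition of the row counts r₁ ≥ r₂ ≥ ⋯, so either list reconstructs the whole diagram. The helper name below is our own, not the text's.

```python
# Reconstruct a dot diagram's column lengths from its row counts:
# column j (1-indexed) has one dot for every row with at least j dots.
def column_lengths(rows):
    """rows = [r_1, r_2, ...] with r_1 >= r_2 >= ...; returns [p_1, p_2, ...]."""
    return [sum(1 for r in rows if r >= j + 1) for j in range(rows[0])]

print(column_lengths([4, 3, 2]))   # [3, 3, 2, 1] -- Example 1's cycle lengths
```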

The following results provide the desired method for computing the numbers kᵢ, p₁, p₂, ..., p_{kᵢ}.

Theorem 7.8. For any positive integer r, the basis vectors in βᵢ that correspond to the dots in the first r rows of a dot diagram for βᵢ form a basis for N((T − λᵢI)^r). Hence the number of dots in the first r rows of a dot diagram for βᵢ equals nullity((T − λᵢI)^r).

Proof. Let S₁ = {x ∈ βᵢ : U^r(x) = 0} and S₂ = {x ∈ βᵢ : U^r(x) ≠ 0}, and let a and b denote the number of vectors in S₁ and S₂, respectively. Then a + b = mᵢ, where mᵢ = dim(K_λᵢ). For any x in βᵢ, x ∈ S₁ if and only if x is one of the first r vectors of a cycle, and this is true if and only if x corresponds to a dot in the first r rows of the dot diagram of βᵢ. For any x in S₂, the effect of applying U^r to x is to move the dot corresponding to x exactly r places up its column of the dot diagram. It follows that U^r maps S₂ in a one-to-one fashion into βᵢ. Thus {U^r(x) : x ∈ S₂} is a basis for R(U^r) consisting of b vectors, and hence rank(U^r) = b. Therefore nullity(U^r) = mᵢ − b = a. But S₁ is a linearly independent subset of N(U^r) consisting of a vectors, and S₁ is therefore a basis for N(U^r).

In the case r = 1, the vectors corresponding to the dots in the first row are the initial vectors of the kᵢ cycles, so Theorem 7.8 yields the following corollary.

Corollary. Let βᵢ be a Jordan canonical basis for the restriction of T to K_λᵢ, and suppose that βᵢ is the disjoint union of kᵢ cycles of generalized eigenvectors. Then kᵢ equals the dimension of E_λᵢ. Hence in a Jordan canonical form of T the number of Jordan blocks corresponding to the eigenvalue λᵢ equals the dimension of E_λᵢ.

We are now able to formulate the procedure for computing the dot diagram for βᵢ directly from T.
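The corollary gives a one-line computation: the number of Jordan blocks belonging to an eigenvalue λ is dim(E_λ) = n − rank(A − λI). A SymPy sketch, with an assumed 3 × 3 matrix of our own:

```python
# Count the Jordan blocks for lambda = 4: this matrix has one 2-block and
# one 1-block, so the count should be 2.
from sympy import Matrix, eye

A = Matrix([[4, 1, 0],
            [0, 4, 0],
            [0, 0, 4]])
n_blocks = 3 - (A - 4 * eye(3)).rank()   # dim E_4 = n - rank(A - 4I)
print(n_blocks)   # 2
```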

Theorem 7.9. Let rⱼ denote the number of dots in the jth row of a dot diagram for βᵢ. Then:

(a) r₁ = dim(V) − rank(T − λᵢI).
(b) rⱼ = rank((T − λᵢI)^(j−1)) − rank((T − λᵢI)^j) if j > 1.

Proof. By Theorem 7.8,

    r₁ + r₂ + ⋯ + rⱼ = nullity((T − λᵢI)^j) = dim(V) − rank((T − λᵢI)^j)

for any j ≥ 1. Hence

    r₁ = dim(V) − rank(T − λᵢI),

and for j > 1,

    rⱼ = (r₁ + r₂ + ⋯ + rⱼ) − (r₁ + r₂ + ⋯ + r_{j−1})
       = (dim(V) − rank((T − λᵢI)^j)) − (dim(V) − rank((T − λᵢI)^(j−1)))
       = rank((T − λᵢI)^(j−1)) − rank((T − λᵢI)^j).

This theorem shows that a dot diagram for βᵢ is completely determined by T. Hence we have proved the following uniqueness result.

Corollary. For any eigenvalue λᵢ of T, the dot diagram for βᵢ is unique. Thus, subject to the convention that the cycles of λᵢ are listed in order of decreasing length, the Jordan canonical form of a linear operator is unique up to the ordering of its eigenvalues.
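Theorem 7.9 translates directly into a computation: the row counts rⱼ fall out of the ranks of powers of (A − λI). The sketch below (SymPy; the helper names are our own) recovers the row counts [4, 3, 2] for a block-diagonal matrix with the block sizes 3, 3, 2, 1 of Example 1.

```python
# Compute the dot-diagram row counts r_1, r_2, ... of Theorem 7.9.
from sympy import Matrix, eye, diag

def jordan_block_matrix(lam, k):
    # k x k Jordan block with eigenvalue lam (helper of our own)
    J = lam * eye(k)
    for i in range(k - 1):
        J[i, i + 1] = 1
    return J

def dot_diagram_rows(A, lam):
    """r_j = rank((A - lam I)^(j-1)) - rank((A - lam I)^j), until it is 0."""
    n = A.shape[0]
    B = A - lam * eye(n)
    rows, prev, Bj = [], n, eye(n)   # rank(B^0) = rank(I) = n
    while True:
        Bj = Bj * B
        r = Bj.rank()
        if prev == r:                # ranks stabilized: no further rows
            break
        rows.append(prev - r)
        prev = r
    return rows

# blocks of sizes 3, 3, 2, 1, as in Example 1
J = diag(jordan_block_matrix(5, 3), jordan_block_matrix(5, 3),
         jordan_block_matrix(5, 2), jordan_block_matrix(5, 1))
print(dot_diagram_rows(J, 5))   # [4, 3, 2]
```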

Before giving some examples of the use of Theorem 7.9, we define the Jordan canonical form of a matrix in the obvious manner.

Definition. Let A be an n × n matrix with entries from F such that the characteristic polynomial of A (and hence of L_A) splits. Then we define the Jordan canonical form of A to be the Jordan canonical form of the linear operator L_A on Fⁿ.

Observe that if J is the Jordan canonical form of a matrix A, then A and J are similar. In fact, if β = {z₁, z₂, ..., z_n} is a Jordan canonical basis for L_A and Q is the n × n matrix having zⱼ as its jth column, then J = Q⁻¹AQ by Theorem 5.1. In the three examples that follow we compute the Jordan canonical forms of two matrices and a linear operator.
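For machine-checking a hand computation, SymPy's `Matrix.jordan_form` returns a pair (P, J) with J = P⁻¹AP, so P plays the role of the matrix Q above. The matrix below is an assumed example of our own, not one from the text.

```python
# Check that A is similar to its Jordan canonical form via the
# change-of-basis matrix returned by SymPy.
from sympy import Matrix

A = Matrix([[2, 1, 1],
            [0, 2, 0],
            [0, 0, 3]])          # assumed example; eigenvalues 2, 2, 3
P, J = A.jordan_form()           # J = P^{-1} A P
assert A == P * J * P.inv()      # similarity holds exactly (rational arithmetic)
print(sorted(J.diagonal()))      # [2, 2, 3]: the eigenvalues of A
```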

Example 2

Let A be a 4 × 4 matrix whose characteristic polynomial is

    det(A − tI) = (t − 2)³(t − 3).

We will find the Jordan canonical form of A and a Jordan canonical basis for the linear transformation L_A. Thus A has two distinct eigenvalues, λ₁ = 2 and λ₂ = 3, with multiplicities 3 and 1, respectively.

Let β₁ be a Jordan canonical basis for the restriction of L_A to K_λ₁. Since λ₁ has multiplicity 3, dim(K_λ₁) = 3 by Theorem 7.6; thus the dot diagram for β₁ contains 3 dots. As above, let rⱼ denote the number of dots in the jth row of this dot diagram. Applying Theorem 7.9, we have

    r₁ = 4 − rank(A − 2I) = 4 − 2 = 2

and

    r₂ = rank(A − 2I) − rank((A − 2I)²) = 2 − 1 = 1.

(Actually, the computation of r₂ is unnecessary in this case: since the dot diagram contains only 3 dots and r₁ = 2, we could deduce from these facts that r₂ = 1.) So the dot diagram associated with β₁ is

    •  •
    •

Hence

    A₁ = [T₁]_β₁ =
    [2 1 0]
    [0 2 0]
    [0 0 2]

Since dim(K_λ₂) = 1, the dot diagram of any Jordan canonical basis β₂ for K_λ₂ consists of a single dot, corresponding to an eigenvector of L_A with eigenvalue λ₂ = 3. Thus

    A₂ = [T₂]_β₂ = (3).

Setting β = β₁ ∪ β₂, we have

    J = [L_A]_β =
    [2 1 0 0]
    [0 2 0 0]
    [0 0 2 0]
    [0 0 0 3]

so J is the Jordan canonical form of A.

We now seek a Jordan canonical basis for L_A. First we must find a Jordan canonical basis β₁ for T₁. We know from the preceding computations that the dot diagram corresponding to β₁ must be

    • (T − λ₁I)(x₁)    • x₂
    • x₁

From this diagram we see that we must choose x₁ so that x₁ ∈ N((T − λ₁I)²) but x₁ ∉ N((T − λ₁I)¹). It is easily seen that a basis for N((T − 2I)²) = K_λ₁ can be computed directly; of these basis vectors, we may select x₁ to be any one, say x, not satisfying the condition of belonging to N((T − λ₁I)¹). Then

    (T − λ₁I)(x₁) = (A − 2I)x₁

is the initial vector of the cycle {(T − λ₁I)(x₁), x₁}. Now simply choose x₂ to be an element of E_λ₁ that is linearly independent of (T − λ₁I)(x₁). Thus we have associated the Jordan canonical basis β₁ with the dot diagram in the manner above.

The reader might be concerned that the linear independence of β₁ was not verified. Be assured, however, that this verification is not necessary, because of Theorem 7.4: since x₂ was chosen to be linearly independent of the initial vector (T − λ₁I)(x₁) of the cycle {(T − λ₁I)(x₁), x₁}, it follows from this theorem that β₁ is linearly independent.

Any eigenvector of L_A corresponding to the eigenvalue λ₂ = 3 will form the desired basis β₂ for K_λ₂. Thus β = β₁ ∪ β₂ is a Jordan canonical basis for L_A. Notice that if Q is the matrix whose columns are the vectors of β, then J = Q⁻¹AQ.

Example 3

Again we will find the Jordan canonical form J of a 4 × 4 matrix A, a Jordan canonical basis for L_A, and a matrix Q such that J = Q⁻¹AQ. The characteristic polynomial of A is

    det(A − tI) = (t − 2)²(t − 4)².

Let λ₁ = 2 and λ₂ = 4, and let Tᵢ be the restriction of L_A to K_λᵢ for i = 1, 2.

We begin by computing the dot diagram for β₁. Let r₁ denote the number of dots in the first row of this dot diagram; then

    r₁ = 4 − rank(A − 2I) = 4 − 2 = 2.

Thus the dot diagram for β₁ is

    •  •

so β₁ consists of two independent eigenvectors, and

    A₁ = [T₁]_β₁ =
    [2 0]
    [0 2]

Next we compute the dot diagram for β₂. Since rank(A − 4I) = 3, there is only 4 − 3 = 1 dot in the first row of the diagram. Since K_λ₂ has dimension 2 (Theorem 7.6), the dot diagram for β₂ must be

    •
    •

Thus

    A₂ = [T₂]_β₂ =
    [4 1]
    [0 4]

So if β = β₁ ∪ β₂, then the Jordan canonical form of L_A is

    J = [L_A]_β =
    [2 0 0 0]
    [0 2 0 0]
    [0 0 4 1]
    [0 0 0 4]

In order to find a matrix Q such that Q⁻¹AQ = J, we must first find a Jordan canonical basis β for L_A. The dot diagram for β₁ indicates that β₁ can be any linearly independent set of two eigenvectors of L_A corresponding to λ₁ = 2; any basis for N(L_A − 2I) will suffice. For β₂ we must find a vector x₁ ∈ K_λ₂ = N((L_A − λ₂I)²) such that x₁ ∉ N((L_A − λ₂I)¹). One way of finding such a vector was used in Example 2. In this example we illustrate another method. A simple calculation shows a basis vector v for the null space of L_A − 4I, and we choose x₁ to be any preimage of v under L_A − 4I; that is,

    (A − 4I)x₁ = v.

To do this, we must find a solution to the matrix equation (A − 4I)X = v. It is easily verified that a solution exists; select x₁ to be one. Thus

    β₂ = {(L_A − λ₂I)(x₁), x₁} = {v, x₁}.

Hence β = β₁ ∪ β₂ is a Jordan canonical basis for L_A. So if Q is the matrix whose columns are the vectors of β, then J = Q⁻¹AQ.

Example 4

Let

and

of

vector

example,

-\342\226\272
V

2. (A basis

if f(x,

denned

for

is

2x2

\342\200\224

+ y,

3xy

T(/)==^/(Xs3;)=1+4x\"33;-

We will find

{1, x, y3

by

=
y)

functions over R in two variablesx

of polynomial

space

at most

degree

mappingT:

For

the

denote

a Jordan canonicalbasis

for

T.

then

x2, y2, xy}.)

Consider the

Chap.

442

if

First, observe that

/0100

0\\

'0002

Canonical

Forms

then

[T]\302\253,

0 0 0 0 0
A

0 0 0 0 0

0 0 0 0 0 0
\\ 0 0 0 0 0 0

Thus

characteristic

the

of

polynomial

T is

\\
det(4

- tl) = det

\302\253'

:/

T has

Hence

only one eigenvalue

canonical basis for T.If


for

diagram

/?,

then

the

denotes

rj

(A

rank(,4)

h
Q

A2 =

=
r2

are

rank(A)

six

dots

\342\200\224

rank(,42)

in the dot

We conclude

that

the

of dots
3

Let

p denote

in the jth

any Jordan

row of the dot

3. Since

0\\

0 0 0 0 0 0

0 0 0 0 0 0
0 0 0 0 0 0
\\ 0 0 0 0 0 0
=

2. Thus

\342\200\224

form

canonical

1 0
0 0

/o
J =

\342\200\224

diagram, it followsthat

Jordan

V.

KA

number
=

\342\200\224

rt=6

and

0)s

because=

1.

r3

J of T

o\"

o\\

01

0 1;0

0 0 0

,0

_0

jo/

rr = 3,
So the

is

r2 = 2,

and

dot diagram

there

for p is

Sec.

Canonical

Jordan

7.2

443

Form

We now seek a Jordan canonical basis for T. Since the first column of the dot diagram consists of three dots, we must find a vector x_1 such that T²(x_1) ≠ 0. Examining the basis α = {1, x, y, x², y², xy} for K_λ, we see that x² is a candidate for x_1. Letting x_1 = x², we find that

    (T - λI)(x_1) = T(x_1) = ∂(x²)/∂x = 2x

and

    (T - λI)²(x_1) = T²(x_1) = ∂²(x²)/∂x² = 2.

Likewise, since the second column of the dot diagram consists of two dots, we must find a vector x_2 such that (T - λI)(x_2) = T(x_2) ≠ 0. Examining the basis α with 1, x, and x² eliminated from consideration (because they lie in the span of the cycle {2, 2x, x²}), we may select x_2 = xy. Then

    (T - λI)(x_2) = T(x_2) = ∂(xy)/∂x = y.

Finally, choose y² for the single dot in the third column of the dot diagram. Thus we have identified the following basis, with its vectors placed in the dot diagram:

    • 2     • y     • y²
    • 2x    • xy
    • x²

Thus {2, 2x, x², y, xy, y²} is a Jordan canonical basis for T.

In the preceding examples we relied upon our ingenuity and the context of the problem to find a Jordan canonical basis. The reader will be able to do the same in the exercises because the dimensions of the generalized eigenspaces under consideration are small. We will not attempt, however, to develop a general algorithm for computing Jordan canonical bases, although one could be formulated by following the steps in the proof of the existence of such a basis (Theorem 7.5).

The following result may be thought of as a corollary to Theorem 7.9.
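The basis just found can be verified by direct computation: if Q has the coordinate vectors of {2, 2x, x², y, xy, y²} (relative to α) as its columns, then Q⁻¹[T]_αQ should equal the Jordan canonical form J found above. A sketch assuming NumPy:

```python
import numpy as np

A = np.zeros((6, 6))
A[0, 1], A[1, 3], A[2, 5] = 1, 2, 1   # [T]_alpha for T = ∂/∂x

# Columns: coordinates of 2, 2x, x^2, y, xy, y^2 in alpha = {1, x, y, x^2, y^2, xy}
Q = np.column_stack([
    [2, 0, 0, 0, 0, 0],   # 2
    [0, 2, 0, 0, 0, 0],   # 2x
    [0, 0, 0, 1, 0, 0],   # x^2
    [0, 0, 1, 0, 0, 0],   # y
    [0, 0, 0, 0, 0, 1],   # xy
    [0, 0, 0, 0, 1, 0],   # y^2
]).astype(float)

J = np.linalg.inv(Q) @ A @ Q
print(np.round(J).astype(int))   # Jordan blocks of sizes 3, 2, 1 for eigenvalue 0
```

The printed matrix has 1's exactly in positions (1,2), (2,3), and (4,5), i.e., blocks of sizes 3, 2, and 1.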

Theorem 7.10. Let A and B be two square matrices of the same size, each having a Jordan canonical form computed according to the conventions of this section. Then A and B are similar if and only if they have (up to a permutation of their eigenvalues) the same Jordan canonical form.

Proof. If A and B have the same Jordan canonical form J, then A and B are each similar to J and hence are similar to each other.

Conversely, suppose that A and B are similar. Then A and B must have the same eigenvalues. Let J_A and J_B denote the Jordan canonical forms of A and B, respectively, for some fixed ordering of their eigenvalues. By Exercise 19 of Section 5.1 there exists a linear operator T on a finite-dimensional vector space V and ordered bases β and γ for V such that [T]_β = J_A and [T]_γ = J_B. Thus J_A and J_B are Jordan canonical forms of the same linear operator. Hence, since the eigenvalues of J_A and J_B are the same with the same multiplicities and are ordered in the same way, J_A = J_B. ∎

Example 5

We will determine which of the matrices A, B, C, and D of this example are similar. Observe that A, B, and C have the same characteristic polynomial, whereas the characteristic polynomial of D is different. Thus, because similar matrices have the same characteristic polynomial, D cannot be similar to A, B, or C. Now each of A, B, and C has the same eigenvalues, λ_1 = 1 and λ_2 = 2, with multiplicities 1 and 2, respectively. If J_A, J_B, and J_C denote the Jordan canonical forms of A, B, and C, respectively, with respect to this ordering of their eigenvalues, then

    J_A = J_C =
    [1 0 0]
    [0 2 0]
    [0 0 2],

whereas J_B contains a 2 × 2 Jordan block corresponding to λ_2 = 2. Since J_A = J_C, A is similar to C, while B is similar to neither A nor C.

The reader should observe that any diagonal matrix is a Jordan canonical form (of itself). Thus T is diagonalizable if and only if its Jordan canonical form is a diagonal matrix. Hence, if T is a diagonalizable operator on V, then any basis for V consisting of eigenvectors of T is a Jordan canonical basis for T.
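Theorem 7.10 yields a mechanical similarity test: since a Jordan canonical form is determined by the eigenvalues together with the ranks of the powers (A - λI)^k (which fix the dot diagrams), two matrices whose characteristic polynomials split are similar exactly when these invariants agree. A sketch assuming NumPy, using small illustrative matrices of our own choosing (not the matrices of Example 5):

```python
import numpy as np

def similar(A, B, tol=1e-8):
    """Similarity test via Theorem 7.10: same eigenvalues, and for each
    eigenvalue lam and each power k, rank((A-lam I)^k) == rank((B-lam I)^k);
    these ranks determine the dot diagrams and hence the Jordan forms."""
    n = A.shape[0]
    ea = np.sort_complex(np.linalg.eigvals(A))
    eb = np.sort_complex(np.linalg.eigvals(B))
    if not np.allclose(ea, eb, atol=1e-6):
        return False
    for lam in ea:
        Ma, Mb = A - lam * np.eye(n), B - lam * np.eye(n)
        Pa, Pb = np.eye(n), np.eye(n)
        for _ in range(n):
            Pa, Pb = Pa @ Ma, Pb @ Mb
            if np.linalg.matrix_rank(Pa, tol) != np.linalg.matrix_rank(Pb, tol):
                return False
    return True

A = np.array([[2.0, 1.0], [0.0, 2.0]])   # one 2x2 Jordan block for 2
B = np.array([[2.0, 5.0], [0.0, 2.0]])   # also one 2x2 block (the 5 is irrelevant)
C = 2.0 * np.eye(2)                      # diagonal: two 1x1 blocks
print(similar(A, B), similar(A, C))      # True False
```

Here A and B share the Jordan form with a single 2 × 2 block, while C is already diagonal, so A is similar to B but not to C.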

EXERCISES

1. Label the following statements as being true or false.
   (a) The Jordan canonical form of a diagonal matrix is the matrix itself.
   (b) Let T be a linear operator on a finite-dimensional vector space V that has a Jordan canonical form J. If β is any basis for V, then the Jordan canonical form of [T]_β is J.
   (c) Linear operators having the same characteristic polynomial are similar.
   (d) Matrices having the same Jordan canonical form are similar.
   (e) Every matrix is similar to its Jordan canonical form.
   (f) Let T be a linear operator on a finite-dimensional vector space with characteristic polynomial (-1)ⁿ(t - λ)ⁿ. Subject to the convention that the Jordan blocks are ordered by decreasing size, T has a unique Jordan canonical form.
   (g) If an operator has a Jordan canonical form, then there is a unique Jordan canonical basis for it.
   (h) The dot diagram of any linear operator having a Jordan canonical form is unique.

2. Let T be a linear operator on a finite-dimensional vector space V such that the characteristic polynomial of T splits. Let λ_1 = 2, λ_2 = 4, and λ_3 = -3 be the distinct eigenvalues of T, and suppose that the dot diagrams for the restriction of T to K_{λ_i} (i = 1, 2, 3) are as given. Find the Jordan canonical form J of T.

3. Let T be a linear operator on a finite-dimensional vector space such that the Jordan canonical form of T is

       [2 1 0 0 0 0 0]
       [0 2 1 0 0 0 0]
       [0 0 2 0 0 0 0]
       [0 0 0 2 1 0 0]
       [0 0 0 0 2 0 0]
       [0 0 0 0 0 3 0]
       [0 0 0 0 0 0 3].

   (a) Find the characteristic polynomial of T.
   (b) Find the dot diagram corresponding to each eigenvalue of T.
   (c) For which eigenvalues λ_i, if any, does E_{λ_i} = K_{λ_i}?
   (d) For each eigenvalue λ_i, find the smallest positive integer p_i for which K_{λ_i} = N((T - λ_i I)^{p_i}).
   (e) Let U_i denote the restriction of T - λ_i I to K_{λ_i} for each i. Compute the following for each i:
       (1) rank(U_i)   (2) rank(U_i²)   (3) nullity(U_i)   (4) nullity(U_i²)

4. For each of the following matrices A, find a Jordan canonical form J and an invertible matrix Q such that J = Q⁻¹AQ. Notice that the matrices in parts (a), (b), and (c) are the matrices used in Example 5.

5. Let A be an n × n matrix whose characteristic polynomial splits. Prove that A and Aᵗ have the same Jordan canonical form, and conclude that A and Aᵗ are similar. Hint: For any eigenvalue λ of A and Aᵗ and any positive integer r, show that rank((A - λI)ʳ) = rank((Aᵗ - λI)ʳ).

6. Let V be the vector space of functions that are linear combinations of eˣ, xeˣ, x²eˣ, and e²ˣ. Define T: V → V by T(f) = f′ (the derivative of f). Find both a Jordan canonical form and a Jordan canonical basis for T.

7. Suppose that an array of dots (such as a dot diagram) contains k columns and m rows, that the jth column of the array contains p_j dots, and that the ith row contains r_i dots. If p_1 ≥ p_2 ≥ ··· ≥ p_k, prove the following:
   (a) m = p_1 and r_1 = k.
   (b) p_j = max {i: r_i ≥ j} for 1 ≤ j ≤ k, and r_i = max {j: p_j ≥ i} for 1 ≤ i ≤ m. Hint: Use induction on m.
   (c) r_1 ≥ r_2 ≥ ··· ≥ r_m.
   (d) Conclude that the number of dots in each column of a dot diagram is completely determined if the number of dots in each row is known.

Definition. A linear operator T on V is called nilpotent if T^p = T_0 for some positive integer p.

8. Prove that if T is a nilpotent operator on an n-dimensional vector space V, then the characteristic polynomial of T is (-1)ⁿtⁿ. Hence the characteristic polynomial of T splits, and T has only one eigenvalue (zero), with multiplicity n. Hint: Use induction on n. In the general step, assume that the conclusion is true for all vector spaces of dimension less than n, and follow the steps below.
   (a) Prove that T has at least one eigenvector, corresponding to the eigenvalue zero. Thus dim(R(T)) < dim(V) = n.
   (b) Apply the induction hypothesis to the T-invariant subspace R(T).
   (c) Extend a basis {x_1, x_2, ..., x_k} for R(T) to a basis β = {x_1, x_2, ..., x_n} for V.
   (d) Show that the characteristic polynomials of the two diagonal blocks of [T]_β are (-1)^k t^k and (-1)^{n-k} t^{n-k}, respectively.
   (e) Deduce that det(T - tI) = (-1)ⁿtⁿ.

9. Prove the converse of Exercise 8: If T is a linear operator on an n-dimensional vector space V whose characteristic polynomial is (-1)ⁿtⁿ, then T is nilpotent.

10. Give an example of a linear operator T on a finite-dimensional vector space such that T is not nilpotent but zero is the only eigenvalue of T. Characterize all such operators.

Definition. An n × n matrix A is called nilpotent if A^p = O, the n × n zero matrix, for some positive integer p.

11. Let A ∈ M_{n×n}(F). Prove that A is nilpotent if and only if L_A is nilpotent. Note that (L_A)^p = L_{A^p} for any positive integer p. Conclude that any square upper triangular matrix having each diagonal entry equal to zero is nilpotent.

12. Prove that if T is a nilpotent operator on an n-dimensional vector space V, then Tⁿ = T_0.

13. Let T be a linear operator on a finite-dimensional vector space V such that the characteristic polynomial of T splits, and let λ_1, λ_2, ..., λ_k be the distinct eigenvalues of T.
    (a) Prove that for any x ∈ V there exist unique vectors x_1, x_2, ..., x_k such that x_i ∈ K_{λ_i} for each i and x = x_1 + x_2 + ··· + x_k. Hint: Use Theorem 7.6.
    (b) Define S: V → V by S(x) = λ_1 x_1 + λ_2 x_2 + ··· + λ_k x_k, where each x ∈ V is represented as in part (a). Prove that S is a diagonalizable linear operator. Next, define U: V → V by U = T - S. Prove that U is nilpotent and that SU = US.
    (c) Prove the converse of part (b): if S and U are linear operators on V such that T = S + U, SU = US, S is diagonalizable, and U is nilpotent, then S and U are the operators defined in part (b). Hint: First prove that if x is an eigenvector of S corresponding to the eigenvalue λ, then x is a generalized eigenvector of T corresponding to λ; note also that T and S have the same characteristic polynomial.

14. Let T, S, and U be as defined in Exercise 13. Suppose that β is a Jordan canonical basis for T, and let J = [T]_β denote the Jordan canonical form of T. Prove the following:
    (a) [S]_β is a diagonal matrix whose diagonal entries are identical to the diagonal entries of J; that is, if D = [S]_β, then D_ii = J_ii.
    (b) If M = [U]_β, then M_ij = 1 if j = i + 1 and J_ij = 1, and M_ij = 0 otherwise.
    (c) J = D + M.
    (d) MD = DM.
    (e) As a consequence of parts (c) and (d), there is a binomial expansion: with p the smallest positive integer for which M^p equals the zero matrix,

        J^r = D^r + rD^{r-1}M + [r(r-1)/2!]D^{r-2}M² + ··· + rDM^{r-1} + M^r   if r < p,

        and

        J^r = D^r + rD^{r-1}M + [r(r-1)/2!]D^{r-2}M² + ··· + [r(r-1)···(r-p+2)/(p-1)!]D^{r-p+1}M^{p-1}   if r ≥ p.

    (f) If T = L_A, then there exists an invertible matrix Q such that A = QJQ⁻¹.
    (g) For the matrix Q of part (f) and any positive integer r, A^r = QJ^rQ⁻¹.

15. Let T be a nilpotent linear operator on a finite-dimensional vector space V. Recall from Exercise 8 that λ = 0 is the only eigenvalue of T; hence V = K_λ. Let β be a Jordan canonical basis for T. Prove that for any positive integer i, if we delete from β the vectors corresponding to the last i dots in each column of a dot diagram for β, the resulting set is a basis for R(T^i). (If a column of the dot diagram contains fewer than i dots, all the vectors associated with that column are removed from β.)

16. Find a linear operator on a finite-dimensional vector space having two distinct Jordan canonical bases.

17. Let T be a linear operator on a finite-dimensional vector space whose characteristic polynomial splits, and let λ be an eigenvalue of T.
    (a) Prove that dim(K_λ) is the sum of the lengths of all the Jordan blocks corresponding to λ in the Jordan canonical form of T.
    (b) Deduce that E_λ = K_λ if and only if all the Jordan blocks corresponding to λ are 1 × 1 matrices.

18. (a) Let J be the m × m Jordan block

            [λ 1 0 ··· 0]
            [0 λ 1 ··· 0]
            [       ⋱  1]
            [0 0 0 ··· λ],

        and let N = J - λI. Prove that N^m is the zero matrix, and hence that N^r is the zero matrix for any r ≥ m.
    (b) Observe, as in Exercise 14, that for r ≥ m,

            J^r = λ^r I + rλ^{r-1}N + [r(r-1)/2!]λ^{r-2}N² + ··· + [r(r-1)···(r-m+2)/(m-1)!]λ^{r-m+1}N^{m-1}.

        Prove that lim_{r→∞} J^r exists if and only if one of the following holds:
            (1) |λ| < 1;
            (2) λ = 1 and m = 1.
        Furthermore, show that lim_{r→∞} J^r is the zero matrix if condition (1) holds and is the 1 × 1 identity matrix if condition (2) holds.
    (c) Prove Theorem 5.18.

19. For any A ∈ M_{n×n}(C), define ||A|| = max {|A_ij|: 1 ≤ i, j ≤ n}. Prove the following results for arbitrary A, B ∈ M_{n×n}(C) and c ∈ C.
    (a) ||A|| ≥ 0, and ||A|| = 0 if and only if A is the zero matrix.
    (b) ||cA|| = |c|·||A||.
    (c) ||A + B|| ≤ ||A|| + ||B||.
    (d) ||AB|| ≤ n||A||·||B||.

20. Let A ∈ M_{n×n}(C) be a transition matrix (see Section 5.3), let || · || be as defined in Exercise 19, and let P⁻¹AP = J be the Jordan canonical form of A for some invertible matrix P.
    (a) Show that ||A^m|| ≤ 1 for every positive integer m.
    (b) Deduce that {||J^m||: m = 1, 2, ...} is bounded.
    (c) Using part (b) and Exercise 18(b), prove that each Jordan block of J corresponding to the eigenvalue λ = 1 of A is 1 × 1.
    (d) Use part (c), Exercise 18(b), and Theorem 5.18 to show that lim_{m→∞} A^m exists if and only if A has the property that whenever λ is an eigenvalue of A with |λ| = 1, then λ = 1.
    (e) Prove Theorem 5.25(a) using part (c) and Theorem 5.24.

21. (This exercise requires knowledge of absolutely convergent series.) Recall from page 280 that if A ∈ M_{n×n}(C), then e^A is defined as lim_{m→∞} B_m, where

        B_m = I + A + A²/2! + ··· + A^m/m!.

    Use Exercise 19(d) to show that e^A exists for every A ∈ M_{n×n}(C).

22. Let X′ = AX be a system of n linear differential equations, where X is an n-tuple of differentiable functions x_1(t), x_2(t), ..., x_n(t) of the real variable t and A is an n × n coefficient matrix, as in Exercise 14 of Section 5.2. In contrast to that exercise, however, suppose that A is not diagonalizable but that the characteristic polynomial of A splits. Let λ_1, λ_2, ..., λ_k be the distinct eigenvalues of A.
    (a) Prove that if u is the end vector of a cycle of generalized eigenvectors of L_A of length p and u corresponds to the eigenvalue λ_i, then for any polynomial f(t) of degree less than p the function

            e^{λ_i t}[f(t)(A - λ_i I)^{p-1} + f′(t)(A - λ_i I)^{p-2} + ··· + f^{(p-1)}(t)I]u

        is a solution to the system X′ = AX.
    (b) Prove that the general solution to X′ = AX is a sum of functions of the form given in part (a), where the vectors u are the end vectors of the distinct cycles that constitute a fixed Jordan canonical basis for L_A.

23. Use Exercise 22(b) to find the general solution to each of the following systems of differential equations, where x, y, and z are unknown real-valued differentiable functions of the real variable t.
    (a) x′ = 2x + y
        y′ = 2y
    (b) x′ = 2x + y
        y′ = 2y + 2z
        z′ = 2z
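The claim in Exercise 11 — that a square upper triangular matrix with every diagonal entry zero is nilpotent — is easy to spot-check numerically: by Exercise 12, A is nilpotent precisely when Aⁿ = O, where n is the size of A. A sketch assuming NumPy; the helper name is ours:

```python
import numpy as np

def is_nilpotent(A, tol=1e-9):
    """A is nilpotent iff A^n is the zero matrix, n = size of A."""
    n = A.shape[0]
    return bool(np.all(np.abs(np.linalg.matrix_power(A, n)) < tol))

# A strictly upper triangular matrix (zero diagonal) is nilpotent;
# adding I introduces the eigenvalue 1, destroying nilpotency.
N = np.triu(np.arange(1.0, 17.0).reshape(4, 4), k=1)
print(is_nilpotent(N), is_nilpotent(N + np.eye(4)))  # True False
```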

Sec. 7.3  The Minimal Polynomial

7.3 THE MINIMAL POLYNOMIAL

For a given linear operator T on a finite-dimensional vector space, the Cayley–Hamilton theorem shows that there is a polynomial f(t) for which f(T) = T_0, namely the characteristic polynomial of T. There are many other polynomials having this property. One of the most important of these, the minimal polynomial, provides another means for studying linear operators.

Definition. Let T be a linear operator on a vector space V. A polynomial p(t) is called a minimal polynomial for T if p(t) is a monic polynomial of least positive degree for which p(T) = T_0. (Recall from Appendix E that a monic polynomial is one in which the leading coefficient is 1.)

It is easy to see that any linear operator T on an n-dimensional vector space has a minimal polynomial of degree at most n. By the Cayley–Hamilton theorem, the characteristic polynomial f(t) of T, which is of degree n, satisfies f(T) = T_0. Choose a polynomial g(t) of least positive degree for which g(T) = T_0, and let p(t) be the result of dividing g(t) by its leading coefficient. Then p(t) is a minimal polynomial for T, and its degree is at most n. The next result shows that the requirement that p(t) be monic guarantees that it is unique.

Theorem 7.11. Let p(t) be a minimal polynomial for a linear operator T on a finite-dimensional vector space V.
(a) If g(t) is any polynomial for which g(T) = T_0, then p(t) divides g(t). In particular, p(t) divides the characteristic polynomial of T.
(b) There is only one minimal polynomial for T; i.e., the minimal polynomial is unique.

Proof. (a) Let g(t) be any polynomial for which g(T) = T_0. The division algorithm for polynomials (see Appendix E) implies that there exist polynomials q(t) and r(t) such that

    g(t) = q(t)p(t) + r(t),    (2)

where r(t) has degree less than the degree of p(t). Substituting T into (2) and using that g(T) = p(T) = T_0, we have r(T) = T_0. Since r(t) has degree less than that of p(t), r(t) must be the zero polynomial. Thus (2) simplifies to g(t) = q(t)p(t), proving (a).

(b) Suppose that p_1(t) and p_2(t) are each minimal polynomials for T. Then p_1(t) divides p_2(t) by part (a). Since p_1(t) and p_2(t) have the same degree, we have p_2(t) = cp_1(t) for some nonzero scalar c. Moreover, since p_1(t) and p_2(t) are monic, c = 1. Thus p_1(t) = p_2(t). ∎

The preceding definition and theorem, stated for an operator, may also be formulated for a matrix.

Definition. The minimal polynomial p(t) of A ∈ M_{n×n}(F) is the monic polynomial of least positive degree for which p(A) equals the zero matrix.

Theorem 7.12. Let T be a linear operator on a finite-dimensional vector space V, and let β be a basis for V. Then the minimal polynomial for T is the same as the minimal polynomial for [T]_β.

Proof. Exercise. ∎

Corollary. For any A ∈ M_{n×n}(F), the minimal polynomial for A is the same as the minimal polynomial for L_A.

Proof. Exercise. ∎

Throughout this book, statements about linear transformations have been translated into statements about matrices and vice versa. As a consequence of the preceding theorem and its corollary, the subsequent theorems of this section that are stated for operators are also true for matrices.

In the remainder of this section we study primarily minimal polynomials for operators whose characteristic polynomials split. A more general treatment of minimal polynomials will be given in Section 7.4.

Theorem 7.13. Let T be a linear operator on a finite-dimensional vector space V, and let p(t) be the minimal polynomial for T. A scalar λ is an eigenvalue of T if and only if p(λ) = 0. Hence the characteristic polynomial and the minimal polynomial for T have the same zeros.

Proof. Let f(t) be the characteristic polynomial of T. Since p(t) divides f(t), we have f(t) = q(t)p(t) for some polynomial q(t). Let λ be a zero of p(t). Then

    f(λ) = q(λ)p(λ) = q(λ)·0 = 0.

So λ is a zero of f(t); that is, λ is an eigenvalue of T.

Conversely, suppose that λ is an eigenvalue of T, and let x ∈ V be an eigenvector corresponding to λ. Then by Exercise 22 of Section 5.1 we have

    T_0(x) = p(T)(x) = p(λ)x.

So p(λ)x = 0. Since x ≠ 0, p(λ) = 0. ∎

As an immediate consequence of the preceding result, we have the following corollary.

Corollary. Let T be a linear operator on a finite-dimensional vector space with minimal polynomial p(t) and characteristic polynomial f(t). Suppose that f(t) factors as

    f(t) = (λ_1 - t)^{n_1}(λ_2 - t)^{n_2} ··· (λ_k - t)^{n_k},

where λ_1, λ_2, ..., λ_k are the distinct eigenvalues of T. Then there exist integers m_1, m_2, ..., m_k such that 1 ≤ m_i ≤ n_i for all i and

    p(t) = (t - λ_1)^{m_1}(t - λ_2)^{m_2} ··· (t - λ_k)^{m_k}.

Example 1

We compute the minimal polynomial for the matrix

    A = [3 -1 0]
        [0  2 0]
        [1 -1 2].

Since A has characteristic polynomial

    f(t) = det [3-t  -1   0 ]
               [0    2-t  0 ]
               [1   -1   2-t]  = -(t - 2)²(t - 3),

the corollary to Theorem 7.13 shows that the minimal polynomial for A must be either (t - 2)(t - 3) or (t - 2)²(t - 3). Substituting A into (t - 2)(t - 3) yields the zero matrix; thus p(t) = (t - 2)(t - 3) is the minimal polynomial for A.

Example 2

Let T: R² → R² be defined by

    T(a, b) = (2a + 5b, 6a + b).

If β is the standard basis for R², then

    [T]_β = [2 5]
            [6 1].

So the characteristic polynomial of [T]_β, and hence of T, is

    f(t) = (t - 7)(t + 4).

Thus, by Theorem 7.13, the minimal polynomial for T has both 7 and -4 as zeros; hence the minimal polynomial for T is (t - 7)(t + 4) also.

Example 3

Let D: P_2(R) → P_2(R) be the differentiation operator defined by D(f) = f′. We compute the minimal polynomial for D. For the standard basis β we have

    [D]_β = [0 1 0]
            [0 0 2]
            [0 0 0].

Hence the characteristic polynomial of D is -t³. So the corollary to Theorem 7.13 shows that the minimal polynomial for D is t, t², or t³. Since D²(x²) = 2 ≠ 0, D² ≠ T_0. Thus the minimal polynomial for D must be t³.

In Example 3 it is easy to verify that P_2(R) is a D-cyclic subspace (of itself). In this example we saw that the minimal and characteristic polynomials are of the same degree. This is no coincidence.

Theorem 7.14. Let T be a linear operator on an n-dimensional vector space V such that V is a T-cyclic subspace of itself. Then the characteristic polynomial f(t) and the minimal polynomial p(t) for T are of the same degree, and hence f(t) = (-1)ⁿp(t).

Proof. If V is a T-cyclic subspace, then there exists an element x ∈ V such that

    β = {x, T(x), ..., T^{n-1}(x)}

is a basis for V (Theorem 5.27). Let

    g(t) = a_0 + a_1 t + ··· + a_k t^k

be a polynomial of degree k, where a_k ≠ 0 and 0 ≤ k < n. Then

    g(T)(x) = a_0 x + a_1 T(x) + ··· + a_k T^k(x)

is a linear combination of elements of β having at least one nonzero coefficient, namely a_k. Since β is linearly independent, g(T)(x) ≠ 0, and hence g(T) ≠ T_0. Therefore the minimal polynomial for T is of degree n, which is also the degree of the characteristic polynomial of T. ∎

Theorem 7.14 states a condition under which the degree of the minimal polynomial for an operator is as large as possible. We now investigate when the degree of the minimal polynomial is as small as possible. It follows from Theorem 7.13 that if the characteristic polynomial of an operator splits and the operator has k distinct eigenvalues, then the minimal polynomial must be of degree at least k. The next theorem shows that the operators for which the degree of the minimal polynomial is as small as possible are precisely the diagonalizable operators.

Theorem 7.15. Let T be a linear operator on a finite-dimensional vector space V. Then T is diagonalizable if and only if the minimal polynomial for T is of the form

    p(t) = (t - λ_1)(t - λ_2) ··· (t - λ_k),

where λ_1, λ_2, ..., λ_k are distinct scalars. (Note that in this case the λ_i's are necessarily the distinct eigenvalues of T.)

Proof. Suppose that T is diagonalizable. Let λ_1, λ_2, ..., λ_k be the distinct eigenvalues of T, and define

    p(t) = (t - λ_1)(t - λ_2) ··· (t - λ_k).

By Theorem 7.13, p(t) divides the minimal polynomial for T. Let β = {x_1, x_2, ..., x_n} be a basis for V consisting of eigenvectors of T, and consider any x_j in β. Then (T - λ_j I)(x_j) = 0 for the eigenvalue λ_j to which x_j corresponds. Since (t - λ_j) divides p(t), there is a polynomial q_j(t) such that p(t) = q_j(t)(t - λ_j); hence

    p(T)(x_j) = q_j(T)(T - λ_j I)(x_j) = 0.

It follows that p(T) = T_0, since p(T) takes each member of a basis of V into the zero vector. Therefore p(t) is the minimal polynomial for T.

Conversely, suppose that there are distinct scalars λ_1, λ_2, ..., λ_k (necessarily the eigenvalues of T) such that the minimal polynomial p(t) for T factors as

    p(t) = (t - λ_1)(t - λ_2) ··· (t - λ_k).

We apply mathematical induction to n = dim(V). If n = 1, then T is clearly diagonalizable. Now suppose that T is diagonalizable whenever dim(V) < n for some n > 1, and suppose that dim(V) = n. Let W = R(T - λ_k I). Clearly W ≠ V, because λ_k is an eigenvalue of T. If W = {0}, then T = λ_k I, which is diagonalizable. So suppose that 0 < dim(W) < n. Then W is T-invariant, and for any x ∈ W,

    (T - λ_1 I)(T - λ_2 I) ··· (T - λ_{k-1} I)(x) = 0.

It follows that the minimal polynomial for T_W divides (t - λ_1)(t - λ_2) ··· (t - λ_{k-1}), and hence, by the induction hypothesis, T_W is diagonalizable. Furthermore, λ_k is not an eigenvalue of T_W, by Theorem 7.13. Let {x_1, x_2, ..., x_m} be a basis for W consisting of eigenvectors of T_W (and hence of T), and let {x_{m+1}, ..., x_n} be a basis for N(T - λ_k I), the eigenspace of T corresponding to λ_k. The two sets are disjoint, because otherwise λ_k would be an eigenvalue of T_W, contrary to the remark made above. We show that the union of the two sets is linearly independent. Consider scalars a_1, ..., a_n such that

    Σ_{i=1}^{m} a_i x_i + Σ_{i=m+1}^{n} a_i x_i = 0.    (3)

For each i ≤ m, x_i is an eigenvector of T corresponding to an eigenvalue distinct from λ_k, and therefore there is a scalar c_i ≠ 0 such that (T - λ_k I)(x_i) = c_i x_i. For each i > m, (T - λ_k I)(x_i) = 0. Therefore, if we apply T - λ_k I to both sides of (3), we obtain

    Σ_{i=1}^{m} a_i c_i x_i = 0,

from which it follows that a_i c_i = 0, and hence a_i = 0, for 1 ≤ i ≤ m (since c_i ≠ 0 for all such i). Thus (3) reduces to Σ_{i=m+1}^{n} a_i x_i = 0, and hence a_i = 0 for m < i ≤ n. It follows that β = {x_1, ..., x_m, x_{m+1}, ..., x_n} is a linearly independent subset of V consisting of n eigenvectors of T, hence a basis for V consisting of eigenvectors of T; therefore T is diagonalizable. ∎

Example 5

We will determine all matrices A ∈ M_{2×2}(R) for which g(A) = O, where O is the 2 × 2 zero matrix and g(t) = t² - 3t + 2 = (t - 1)(t - 2). Since g(A) = O, the minimal polynomial p(t) for A divides g(t). Hence the possible candidates for p(t) are t - 1, t - 2, and (t - 1)(t - 2). If p(t) = t - 1, then A = I; if p(t) = t - 2, then A = 2I. If p(t) = (t - 1)(t - 2), then A is diagonalizable by Theorem 7.15, with eigenvalues 1 and 2, and hence A is similar to

    [1 0]
    [0 2].

Note that in all of these cases A is diagonalizable.

Example 6

We will prove that if A is a real n × n matrix such that A³ = A, then A is diagonalizable. Since A³ = A, we have g(A) = O, where O is the n × n zero matrix and g(t) = t³ - t = t(t + 1)(t - 1). Hence the minimal polynomial p(t) for A divides g(t). Since g(t) has no repeated factors, neither does p(t). Thus A is diagonalizable by Theorem 7.15.

In Example 3 we saw that the minimal polynomial for the differentiation operator D: P_2(R) → P_2(R) is t³; hence, by Theorem 7.15, D is not diagonalizable.
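The corollary to Theorem 7.13 reduces finding a minimal polynomial to finitely many candidates: try the exponent tuples (m_1, ..., m_k) with 1 ≤ m_i ≤ n_i in order of total degree, and return the first product that annihilates the matrix. A sketch assuming NumPy, applied to a matrix whose characteristic polynomial is -(t - 2)²(t - 3), as in Example 1; the helper name and the search strategy are ours:

```python
import itertools
import numpy as np

def minimal_polynomial_exponents(A, eigs, max_mult, tol=1e-8):
    """Find exponents (m_1, ..., m_k) of the minimal polynomial
    p(t) = prod (t - lam_i)^{m_i}: by the corollary to Theorem 7.13,
    1 <= m_i <= n_i, so test candidates in order of total degree."""
    n = A.shape[0]
    candidates = sorted(itertools.product(*[range(1, m + 1) for m in max_mult]),
                        key=sum)
    for ms in candidates:
        P = np.eye(n)
        for lam, m in zip(eigs, ms):
            P = P @ np.linalg.matrix_power(A - lam * np.eye(n), m)
        if np.all(np.abs(P) < tol):   # p(A) is the zero matrix
            return ms
    return None

A = np.array([[3.0, -1.0, 0.0],
              [0.0,  2.0, 0.0],
              [1.0, -1.0, 2.0]])     # f(t) = -(t-2)^2 (t-3)
ms = minimal_polynomial_exponents(A, [2.0, 3.0], [2, 1])
print(ms)  # (1, 1)
```

Since the returned exponents are all 1, p(t) = (t - 2)(t - 3) is a product of distinct linear factors, so this matrix is diagonalizable by Theorem 7.15.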
EXERCISES

1. Label the following statements as being true or false. Assume that all vector spaces are finite-dimensional.
   (a) Every linear operator T has a polynomial p(t) of largest degree for which p(T) = T_0.
   (b) Every linear operator has a unique minimal polynomial.
   (c) The characteristic polynomial of a linear operator divides the minimal polynomial of that operator.
   (d) The minimal and characteristic polynomials of any diagonalizable operator are identical.
   (e) Let T be a linear operator on an n-dimensional vector space V, p(t) be the minimal polynomial for T, and f(t) be the characteristic polynomial of T. If f(t) splits, then f(t) divides [p(t)]ⁿ.
   (f) The minimal polynomial for a linear operator always has the same degree as the characteristic polynomial of the operator.
   (g) A linear operator is diagonalizable if its minimal polynomial splits.
   (h) Let T be a linear operator on a vector space V. If V is a T-cyclic subspace, then the degree of the minimal polynomial for T equals dim(V).

2. Compute the minimal polynomials of the following matrices.

3. Compute the minimal polynomial for each of the following linear operators.
   (a) T: R² → R², where T(a, b) = (a + b, a - b).
   (b) T: P_2(R) → P_2(R), where T(f) = f′ + 2f.
   (c) T: P_2(R) → P_2(R), where T(f) = -xf″ + f′ + 2f.
   (d) T: M_{n×n}(R) → M_{n×n}(R), where T(A) = Aᵗ. Hint: Note that T² = I.

4. Determine which of the matrices and operators in Exercises 2 and 3 are diagonalizable.

5. Describe all linear operators T on R² such that T is diagonalizable and T³ - 2T² + T = T_0.

6. Prove Theorem 7.12 and its corollary.

7. Prove the corollary to Theorem 7.13.

8. Let T be a linear operator on a finite-dimensional vector space, and let g(t) be the minimal polynomial of T. Prove the following:
   (a) T is invertible if and only if g(0) ≠ 0.
   (b) If T is invertible and g(t) = tⁿ + a_{n-1}t^{n-1} + ··· + a_1 t + a_0, then

       T⁻¹ = -(1/a_0)[T^{n-1} + a_{n-1}T^{n-2} + ··· + a_1 I].

9. Let T be a diagonalizable linear operator on a finite-dimensional vector space V. Prove that V is a T-cyclic subspace if and only if each of the eigenspaces of T is one-dimensional.

10. Let g(t) be the auxiliary polynomial of a homogeneous linear differential equation with constant coefficients (as defined in Section 2.7), and let V denote the solution space of this differential equation. Show that:
    (a) V is a D-invariant subspace, where D: C^∞ → C^∞ is the differentiation operator.
    (b) The minimal polynomial for D_V (the restriction of D to V) is g(t).
    (c) If the degree of g(t) is n, then the characteristic polynomial of D_V is (-1)ⁿg(t).
    Hint: For parts (b) and (c), use Theorem 2.33.

11. Let D: P(R) → P(R) be the differentiation operator. Prove that there exists no polynomial g(t) for which g(D) = T_0. Hence D has no minimal polynomial.

12. Let T be a linear operator on a finite-dimensional vector space, and suppose that the characteristic polynomial of T splits. Let λ_1, λ_2, ..., λ_k be the distinct eigenvalues of T, and for each i let p_i be the order of the largest Jordan block corresponding to λ_i in a Jordan canonical form of T. Prove that the minimal polynomial of T is

    (t - λ_1)^{p_1}(t - λ_2)^{p_2} ··· (t - λ_k)^{p_k}.

13. Let T be a linear operator on a finite-dimensional vector space V, and suppose that W is a T-invariant subspace of V. Prove that the minimal polynomial of T_W divides the minimal polynomial of T.

14. (The following exercise requires knowledge of direct sums; see Section 5.2.) Let V be a finite-dimensional vector space and T a linear operator on V. Let W_1 and W_2 be T-invariant subspaces of V such that V = W_1 ⊕ W_2, and let p_1(t) and p_2(t) be the minimal polynomials for T_{W_1} and T_{W_2}, respectively. Prove or disprove that p_1(t)p_2(t) is the minimal polynomial for T.

15. Definition. Let T be a linear operator on a finite-dimensional vector space V, and let x be a nonzero vector in V. The polynomial p(t) is called a T-annihilator of x if p(t) is a monic polynomial of least degree for which p(T)(x) = 0.
    Let T be a linear operator on a finite-dimensional vector space V, and let x be a nonzero vector of V. Prove:
    (a) The vector x has a unique T-annihilator.
    (b) The T-annihilator of x divides any polynomial f(t) for which f(T) = T_0.
    (c) If p(t) is the T-annihilator of x and W is the T-cyclic subspace generated by x, then p(t) is the minimal polynomial for T_W, and dim(W) equals the degree of p(t).
    (d) The degree of the T-annihilator of x is 1 if and only if x is an eigenvector of T.

16. Let T be a linear operator on a finite-dimensional vector space V, and let W_1 be a T-invariant subspace of V. If x ∈ V and x ∉ W_1, prove the following.
    (a) There exists a unique monic polynomial g_1(t) of least positive degree such that g_1(T)(x) ∈ W_1.
    (b) If h(t) is a polynomial for which h(T)(x) ∈ W_1, then g_1(t) divides h(t).
    (c) Let W_2 be a T-invariant subspace of V such that W_2 ⊆ W_1. If g_2(t) is the unique monic polynomial of least positive degree such that g_2(T)(x) ∈ W_2, then g_1(t) divides g_2(t).
    (d) Deduce that g_1(t) divides the minimal and characteristic polynomials of T.
7.4* RATIONAL CANONICAL FORM

Until now we have used eigenvalues, eigenvectors, and generalized eigenvectors in our analysis of linear operators with characteristic polynomials that split. In general, however, the characteristic polynomial of a linear operator need not split, and, indeed, operators need not have eigenvalues! However, the unique factorization theorem for polynomials (Appendix E) guarantees that the characteristic polynomial f(t) of any linear operator on an n-dimensional vector space factors uniquely as

    f(t) = (-1)^n (φ1(t))^(n1) (φ2(t))^(n2) ··· (φk(t))^(nk),

where the φi(t)'s (1 ≤ i ≤ k) are distinct irreducible monic polynomials and the ni's are positive integers. In the case that f(t) splits, each irreducible monic polynomial factor is of the form φi(t) = t - λi, where λi is an eigenvalue of T, and there is a one-to-one correspondence between the eigenvalues of T and the irreducible monic factors of the characteristic polynomial. In general, eigenvalues need not exist, but the irreducible monic factors always do exist. In this section we establish structure theorems based on the irreducible monic factors of the characteristic polynomial instead of eigenvalues.

In this context, the following definition is the appropriate replacement for eigenspace and generalized eigenspace.

Definition. Let T be a linear operator on a finite-dimensional vector space V with characteristic polynomial

    f(t) = (-1)^n (φ1(t))^(n1) (φ2(t))^(n2) ··· (φk(t))^(nk),

where the φi(t)'s (1 ≤ i ≤ k) are distinct irreducible monic polynomials and the ni's are positive integers. For each i (1 ≤ i ≤ k), we define K_φi to be the set

    K_φi = {x ∈ V : (φi(T))^p(x) = 0 for some positive integer p}.

We will see that each K_φi is a nontrivial T-invariant subspace of V. If φ(t) = t - λ is of degree one, then K_φ is the generalized eigenspace of T corresponding to the eigenvalue λ.

Having suitable extensions of the concepts of eigenvalue and eigenspace, the next task is to describe a canonical form of a linear operator appropriate to this context. The one that we will study is called the rational canonical form. Since a canonical form is a description of a matrix representation of an operator, it can be specified by describing the kinds of ordered bases chosen for the representations.
These bases arise naturally from the generators of certain cyclic subspaces. For this reason the reader should recall the definition of a T-cyclic subspace generated by a vector and Theorem 5.27 of Section 5.4. We briefly review this concept and introduce some new notation and terminology.

Let T be a linear operator on a finite-dimensional vector space V, and let x be a nonzero vector in V. We use the notation C_x(T) for the T-cyclic subspace generated by x. Recall (Theorem 5.27) that if dim(C_x(T)) = k, then the set

    {x, T(x), T^2(x), ..., T^(k-1)(x)}

is an ordered basis for C_x(T). To distinguish this basis from all other ordered bases for C_x(T), we call it the T-cyclic basis generated by x and denote it by B_x(T). Let A be the matrix representation of the restriction of T to C_x(T) relative to the ordered basis B_x(T). Recall from the proof of Theorem 5.27 that

    A = ( 0 0 ··· 0  -a_0      )
        ( 1 0 ··· 0  -a_1      )
        ( 0 1 ··· 0  -a_2      )
        ( ⋮ ⋮     ⋮   ⋮        )
        ( 0 0 ··· 1  -a_(k-1)  ),

where T^k(x) = -(a_0 x + a_1 T(x) + ··· + a_(k-1) T^(k-1)(x)). Furthermore, the characteristic polynomial of A is given by

    det(A - tI) = (-1)^k (a_0 + a_1 t + ··· + a_(k-1) t^(k-1) + t^k).

The matrix A is called the companion matrix of the monic polynomial h(t) = a_0 + a_1 t + ··· + a_(k-1) t^(k-1) + t^k. Every monic polynomial has a companion matrix, and the characteristic polynomial of the companion matrix of any monic polynomial g(t) of degree k is (-1)^k g(t) (see Exercise 19 of Section 5.4). Since A is the matrix representation [T]_(B_x(T)) of the restriction of T to C_x(T), h(t) is also the characteristic polynomial of this restriction, and by Theorem 7.14 it is the minimal polynomial for this restriction. By Exercise 15 of Section 7.3, h(t) is also the T-annihilator of x.

It is the object of this section to prove that for every linear operator T on a finite-dimensional vector space V there exists an ordered basis β for V such that the matrix representation [T]_β is of the block-diagonal form

    [T]_β = diag(C_1, C_2, ..., C_r),

where each C_i is the companion matrix of some polynomial (φ(t))^m, where φ(t) is a monic irreducible divisor of the characteristic polynomial of T and m is a positive integer. Such a matrix representation is called a rational canonical form of T. We will call the accompanying basis a rational canonical basis for T.

Definition. Let T be a linear operator on a finite-dimensional vector space V. An ordered basis β for V is a rational canonical basis of T if β is the disjoint union of T-cyclic bases B_(x_i)(T), where each x_i lies in K_φ for some irreducible monic divisor φ(t) of the characteristic polynomial of T.

The following theorem is a simple consequence of the accompanying lemma.

Lemma. Let T be a linear operator on a finite-dimensional vector space V, let x be a nonzero vector in V, and suppose that the T-annihilator of x is of the form (φ(t))^p for some irreducible monic polynomial φ(t). Then φ(t) divides the minimal polynomial of T, and x ∈ K_φ.

Proof. The T-annihilator (φ(t))^p of x divides the minimal polynomial of T by Exercise 15 of Section 7.3; hence φ(t) divides the minimal polynomial of T. Since (φ(T))^p(x) = 0, we also have x ∈ K_φ.

Theorem 7.16. Let T be a linear operator on a finite-dimensional vector space V, and let β be an ordered basis for V. Then β is a rational canonical basis of T if and only if β is the disjoint union of T-cyclic bases B_(x_i)(T), where the T-annihilator of each x_i is of the form (φ(t))^p for some irreducible monic divisor φ(t) of the characteristic polynomial of T.

Proof. Exercise.
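The companion-matrix facts reviewed above are easy to check computationally. The sketch below (plain Python with exact integer arithmetic; the helper names `companion` and `poly_at_matrix` are mine, not the text's) builds the companion matrix of a monic polynomial h(t) and verifies that h evaluated at that matrix is the zero matrix, which is the property making h(t) the annihilator of the cyclic generator.

```python
def companion(coeffs):
    """Companion matrix of the monic polynomial
    h(t) = coeffs[0] + coeffs[1]*t + ... + coeffs[k-1]*t^(k-1) + t^k."""
    k = len(coeffs)
    A = [[0] * k for _ in range(k)]
    for i in range(1, k):
        A[i][i - 1] = 1              # subdiagonal of 1's
    for i in range(k):
        A[i][k - 1] = -coeffs[i]     # last column holds -a_i
    return A

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][r] * B[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at_matrix(coeffs, A):
    """Evaluate a0*I + a1*A + ... + a_{k-1}*A^(k-1) + A^k."""
    n = len(A)
    result = [[coeffs[0] * (i == j) for j in range(n)] for i in range(n)]
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    for c in list(coeffs[1:]) + [1]:   # remaining coefficients, then leading 1
        P = mat_mul(P, A)
        result = [[result[i][j] + c * P[i][j] for j in range(n)]
                  for i in range(n)]
    return result

# h(t) = t^2 - t + 3, the divisor phi1(t) of Example 1: coeffs [a0, a1] = [3, -1]
A = companion([3, -1])
assert A == [[0, -3], [1, 1]]
assert poly_at_matrix([3, -1], A) == [[0, 0], [0, 0]]   # h(A) = O
```

The same check works for any monic polynomial, e.g. `companion([1, 0])` gives the companion matrix of t^2 + 1 used in Example 1.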
Example 1

The 8 × 8 matrix C given by the block-diagonal form C = diag(C_1, C_2, C_3), where

    C_1 = ( 0 -3 )     C_2 = ( 0 0 0 -1 )     C_3 = ( 0 -1 )
          ( 1  1 ),          ( 1 0 0  0 )           ( 1  0 ),
                             ( 0 1 0 -2 )
                             ( 0 0 1  0 ),

is a rational canonical form of the linear operator T: R^8 → R^8 defined by T = L_C, and the standard ordered basis for R^8 is the accompanying rational canonical basis of T. In this case the submatrices C_1, C_2, and C_3 are the companion matrices of the polynomials φ1(t), (φ2(t))^2, and φ2(t), respectively, where φ1(t) = t^2 - t + 3 and φ2(t) = t^2 + 1.
In the context of Theorem 7.16, the standard ordered basis β is the disjoint union of the T-cyclic bases

    β = B_(e1)(T) ∪ B_(e3)(T) ∪ B_(e7)(T),

where B_(e1)(T) = {e1, e2}, B_(e3)(T) = {e3, e4, e5, e6}, and B_(e7)(T) = {e7, e8}. By Exercise 38 of Section 5.4 the characteristic polynomial of C, and hence of T, is the product of the characteristic polynomials of the companion matrices, and therefore is given by

    f(t) = φ1(t)(φ2(t))^2 φ2(t) = φ1(t)(φ2(t))^3.

This example suggests that if a linear operator has a rational canonical form, then the companion matrix of some power of each irreducible monic divisor of its characteristic polynomial appears at least once as a submatrix of the rational canonical form. This follows easily from Exercise 38 of Section 5.4. The irreducible monic divisors of the characteristic polynomial of a linear operator are also divisors of the minimal polynomial (Theorem 7.11), and it is easy to show that, in this case, each of the subspaces K_φ is not trivial.

We begin with a result that lists several properties of the irreducible divisors of the characteristic polynomial. The main goal of this section is the existence theorem for the rational canonical form; however, the proof rests upon the nontriviality of the subspaces K_φ. Since the irreducible monic divisors of the characteristic polynomial are also divisors of the minimal polynomial, we use the minimal polynomial in place of the characteristic polynomial in the next theorem. The reader is advised to review the definition of T-annihilator and Exercise 15 of Section 7.3.

Theorem 7.17. Let T be a linear operator on a finite-dimensional vector space V, and let

    p(t) = (φ1(t))^(m1) (φ2(t))^(m2) ··· (φk(t))^(mk)

be the minimal polynomial of T, where the φi(t)'s are the distinct irreducible monic factors of p(t) and the mi's are positive integers. Then the following statements are true.

(a) For each i, K_φi is a nontrivial T-invariant subspace of V.
(b) For i ≠ j, K_φi ∩ K_φj = {0}.
(c) For i ≠ j, K_φi is invariant under φj(T), and the restriction of φj(T) to K_φi is one-to-one.
(d) For each i, K_φi = N((φi(T))^(mi)).

Proof. (a) We first observe that K_φi is nontrivial. Let f_i(t) be the polynomial obtained from p(t) by omitting the factor (φi(t))^(mi). Then f_i(t) is a proper divisor of p(t), and so there is a vector z in V for which x = f_i(T)(z) ≠ 0. However,

    (φi(T))^(mi)(x) = (φi(T))^(mi) f_i(T)(z) = p(T)(z) = 0,

and therefore x ∈ K_φi. Thus K_φi is nontrivial. The proof that K_φi is a T-invariant subspace of V is left as an exercise.

(b) Let x ∈ K_φi ∩ K_φj, and suppose that x ≠ 0. Then there exist positive integers p_i and p_j such that (φi(T))^(p_i)(x) = (φj(T))^(p_j)(x) = 0. Let g(t) be the T-annihilator of x. Then g(t) divides both (φi(t))^(p_i) and (φj(t))^(p_j). But this is impossible because these two polynomials are relatively prime (see Appendix E). We conclude that x = 0.

(c) Since K_φi is T-invariant, it is also φj(T)-invariant. Suppose that φj(T)(x) = 0 for some x ∈ K_φi. Then x ∈ K_φi ∩ K_φj, and hence x = 0 by (b). We conclude that the restriction of φj(T) to K_φi is one-to-one.

(d) Consider any i. Clearly N((φi(T))^(mi)) ⊆ K_φi. Let f_i(t) be as in the proof of (a). Then K_φi is invariant under f_i(T), and f_i(t) is a product of polynomials of the form (φj(t))^(mj) for j ≠ i; therefore, by (c), the restriction of f_i(T) to K_φi is one-to-one, and hence this restriction is also onto. Let x ∈ K_φi. Then there exists y ∈ K_φi such that f_i(T)(y) = x. Therefore

    (φi(T))^(mi)(x) = (φi(T))^(mi) f_i(T)(y) = p(T)(y) = 0,

and hence x ∈ N((φi(T))^(mi)). Thus K_φi = N((φi(T))^(mi)).

Since rational canonical bases of an operator are formed by taking unions of T-cyclic bases, we must take care that the resulting unions are linearly independent. Theorem 7.18 reduces this problem to the linear independence of subsets of K_φ, where φ(t) is an irreducible monic divisor of the minimal polynomial of T.
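Parts (b) and (d) of Theorem 7.17 can be checked on the matrix C of Example 1 with exact rational arithmetic. The sketch below is only illustrative (the helper names are mine): since the minimal polynomial of C is φ1(t)(φ2(t))^2, it verifies that N(φ1(C)) and N((φ2(C))^2) have dimensions 2 and 6, which sum to 8 = dim(R^8), consistent with R^8 decomposing across the two subspaces K_φi.

```python
from fractions import Fraction

def mul(A, B):
    n = len(A)
    return [[sum(A[i][r] * B[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]

def nullity(M):
    """dim N(M) = size - rank, by exact Gauss-Jordan elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    n, rank = len(M), 0
    for c in range(n):
        piv = next((i for i in range(rank, n) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[rank], M[piv] = M[piv], M[rank]
        for i in range(n):
            if i != rank and M[i][c] != 0:
                f = M[i][c] / M[rank][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[rank])]
        rank += 1
    return n - rank

def poly_of(M, coeffs):
    """coeffs [c0, c1, ..., ck] -> c0*I + c1*M + ... + ck*M^k."""
    n = len(M)
    out = [[0] * n for _ in range(n)]
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    for c in coeffs:
        out = [[out[i][j] + c * P[i][j] for j in range(n)] for i in range(n)]
        P = mul(P, M)
    return out

# Block-diagonal C of Example 1: companion matrices of phi1, (phi2)^2, phi2,
# where phi1(t) = t^2 - t + 3 and phi2(t) = t^2 + 1.
blocks = [[[0, -3], [1, 1]],
          [[0, 0, 0, -1], [1, 0, 0, 0], [0, 1, 0, -2], [0, 0, 1, 0]],
          [[0, -1], [1, 0]]]
C = [[0] * 8 for _ in range(8)]
off = 0
for B in blocks:
    for i in range(len(B)):
        for j in range(len(B)):
            C[off + i][off + j] = B[i][j]
    off += len(B)

# By Theorem 7.17(d): K_phi1 = N(phi1(C)) and K_phi2 = N((phi2(C))^2).
dim_K1 = nullity(poly_of(C, [3, -1, 1]))                    # phi1(C)
dim_K2 = nullity(mul(poly_of(C, [1, 0, 1]),
                     poly_of(C, [1, 0, 1])))                # (phi2(C))^2
assert (dim_K1, dim_K2) == (2, 6) and dim_K1 + dim_K2 == 8
```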
Lemma. Let T be a linear operator on a finite-dimensional vector space V, and let φ1, φ2, ..., φk be the distinct irreducible monic divisors of the minimal polynomial of T. For each i (1 ≤ i ≤ k) let x_i ∈ K_φi be such that

    x_1 + x_2 + ··· + x_k = 0.   (4)

Then x_i = 0 for all i.

Proof. The proof is by mathematical induction on k, the number of distinct divisors. The result is trivial if k = 1. Assume that the lemma holds whenever there are fewer than k distinct divisors, and suppose that there are k distinct divisors and vectors x_i ∈ K_φi (1 ≤ i ≤ k) satisfying (4). Since x_k ∈ K_φk, there is a positive integer p such that (φk(T))^p(x_k) = 0. We apply (φk(T))^p to both sides of the equation (4) to obtain

    (φk(T))^p(x_1) + (φk(T))^p(x_2) + ··· + (φk(T))^p(x_(k-1)) = 0.

For each i < k, (φk(T))^p(x_i) ∈ K_φi; hence, by the induction hypothesis, (φk(T))^p(x_i) = 0 for all i < k. Thus, by part (c) of Theorem 7.17, x_i = 0 for all i < k, and (4) simplifies to x_k = 0.

Theorem 7.18. Let T be a linear operator on a finite-dimensional vector space V, and let φ1, φ2, ..., φk be distinct irreducible monic divisors of the minimal polynomial of T. For each i (1 ≤ i ≤ k) let S_i be a linearly independent subset of K_φi. Then S_i ∩ S_j = ∅ for i ≠ j, and S_1 ∪ S_2 ∪ ··· ∪ S_k is a linearly independent subset of V.

Proof. That S_i ∩ S_j = ∅ for i ≠ j is a trivial consequence of part (b) of Theorem 7.17. The rest of the proof is formally identical to the proof of Theorem 7.1, except that subspaces of the form K_φ are used in place of generalized eigenspaces.

We now focus our attention on linearly independent subsets of V of the form B_(x_1)(T) ∪ ··· ∪ B_(x_k)(T), where the x_i's are distinct vectors in K_φ. These subsets are used in building up the rational canonical bases of an operator. The next several results give us ways of constructing such sets. They serve the dual purposes of leading to the existence theorem for the rational canonical form and of providing methods for constructing rational canonical bases.

For Theorems 7.19 and 7.20 we fix a linear operator T on a finite-dimensional vector space V and an irreducible monic divisor φ(t) of the minimal polynomial of T.
Theorem 7.19. Let x_1, x_2, ..., x_k be distinct vectors in K_φ such that

    S_1 = B_(x_1)(T) ∪ ··· ∪ B_(x_k)(T)

is linearly independent. For each i let z_i be a vector in V such that φ(T)(z_i) = x_i. Then

    S_2 = B_(z_1)(T) ∪ ··· ∪ B_(z_k)(T)

is also linearly independent.

Proof. Consider any linear combination of the vectors of S_2 that sums to zero, say,

    Σ_i Σ_j a_ij T^j(z_i) = 0.   (5)

For each i let f_i(t) be the polynomial defined by f_i(t) = Σ_j a_ij t^j. Then (5) can be rewritten as

    Σ_i f_i(T)(z_i) = 0.   (6)

Now apply φ(T) to both sides of (6) to obtain

    Σ_i f_i(T)φ(T)(z_i) = Σ_i f_i(T)(x_i) = 0.

The preceding sum can be rewritten as a linear combination of the vectors in S_1. Since S_1 is linearly independent, it follows that

    f_i(T)(x_i) = 0   for all i.

Therefore the T-annihilator of x_i divides f_i(t) for all i (see Exercise 15 of Section 7.3). Since φ(t) divides the T-annihilator of x_i, it follows that φ(t) divides f_i(t) for all i. Thus for each i there exists a polynomial g_i(t) such that f_i(t) = g_i(t)φ(t). Hence (6) becomes

    0 = Σ_i f_i(T)(z_i) = Σ_i g_i(T)φ(T)(z_i) = Σ_i g_i(T)(x_i).

Again, the linear independence of S_1 requires that g_i(T)(x_i) = 0 for all i, and hence f_i(T)(z_i) = g_i(T)(x_i) = 0 for all i. But the terms of the linear combination in (5) that arise in f_i(T)(z_i) come from the linearly independent set B_(z_i)(T). We conclude that a_ij = 0 for all i and j. Therefore S_2 is linearly independent.

We now show that K_φ has a basis consisting of a union of T-cyclic bases.
Lemma. Let W be a T-invariant subspace of K_φ, and let β be a basis for W. Then the following statements are true.

(a) For any x ∈ N(φ(T)), if x ∉ W, then β ∪ B_x(T) is linearly independent.
(b) For some vectors z_1, z_2, ..., z_s in N(φ(T)), β can be extended to a linearly independent set

    β' = β ∪ B_(z_1)(T) ∪ ··· ∪ B_(z_s)(T)

whose span contains N(φ(T)).

Proof. (a) Let β = {x_1, x_2, ..., x_k}, and suppose that

    Σ_i a_i x_i + Σ_j b_j T^j(x) = 0,

where z = Σ_j b_j T^j(x). Then z ∈ C_x(T) and z = -Σ_i a_i x_i ∈ W. Suppose that, to the contrary, z ≠ 0. Since x ∈ N(φ(T)) and x ≠ 0, the T-annihilator of x is φ(t), and therefore dim(C_x(T)) = d, the degree of φ(t). Similarly, the T-annihilator of z is φ(t), and hence

    d = dim(C_z(T)) ≤ dim(C_x(T) ∩ W) ≤ dim(C_x(T)) = d.

It follows that C_x(T) ∩ W = C_x(T), and hence x ∈ W, contrary to the hypothesis. Therefore z = Σ_j b_j T^j(x) = 0, from which it follows that b_j = 0 for all j. Consequently Σ_i a_i x_i = 0, and hence a_i = 0 for all i. We conclude that β ∪ B_x(T) is linearly independent.

(b) If the span of β does not contain N(φ(T)), choose a vector z_1 in N(φ(T)) that is not in W. By (a), β_1 = β ∪ B_(z_1)(T) is linearly independent. Let W_1 denote the span of β_1; then W_1 is T-invariant. If W_1 does not contain N(φ(T)), choose a vector z_2 in N(φ(T)) not in W_1, and proceed as above to conclude that the set β_2 = β_1 ∪ B_(z_2)(T) is linearly independent. Continue this process eventually to obtain vectors z_1, z_2, ..., z_s in N(φ(T)) such that

    β' = β ∪ B_(z_1)(T) ∪ ··· ∪ B_(z_s)(T)

is a linearly independent set whose span contains N(φ(T)).

Theorem 7.20. If the minimal polynomial of T is of the form p(t) = (φ(t))^m, then T has a rational canonical basis.

Proof. The proof is by mathematical induction on m. Suppose that m = 1. Apply (b) of the lemma to β = ∅ and W = {0} to obtain a linearly independent subset of V of the form B_(x_1)(T) ∪ ··· ∪ B_(x_s)(T) whose span contains N(φ(T)). Since V = N(φ(T)), this set is a rational canonical basis for V.

Now suppose that for some integer m > 1 the result is valid whenever the minimal polynomial is of the form (φ(t))^k, where k < m, and suppose that the minimal polynomial of T is p(t) = (φ(t))^m. Then R(φ(T)) is a T-invariant subspace of V, and the restriction of T to this subspace has minimal polynomial (φ(t))^(m-1). Therefore we may apply the induction hypothesis to obtain a rational canonical basis for the restriction of T to R(φ(T)). Suppose that x_1, x_2, ..., x_k are the generating vectors of the T-cyclic bases that constitute this rational canonical basis. For each i choose z_i in V such that x_i = φ(T)(z_i). By Theorem 7.19 the union

    β = B_(z_1)(T) ∪ ··· ∪ B_(z_k)(T)

is linearly independent. Let W denote the space generated by β; then W contains R(φ(T)). Now apply (b) of the lemma to adjoin additional T-cyclic bases B_(z_(k+1))(T), ..., B_(z_s)(T) (if necessary), where z_i is in N(φ(T)) for i > k, to obtain a linearly independent set

    β' = B_(z_1)(T) ∪ ··· ∪ B_(z_k)(T) ∪ B_(z_(k+1))(T) ∪ ··· ∪ B_(z_s)(T)

whose span W' contains both R(φ(T)) and N(φ(T)). We argue that W' = V. Since W' is T-invariant, we may let U denote the restriction of φ(T) to W'. By the way in which W' was obtained from β', R(U) = R(φ(T)) and N(U) = N(φ(T)). Therefore

    dim(W') = rank(U) + nullity(U) = rank(φ(T)) + nullity(φ(T)) = dim(V).

Thus W' = V, and hence β' is a rational canonical basis of T.

Corollary. In general, K_φ has a basis consisting of a union of T-cyclic bases.

Proof. Apply Theorem 7.20 to the restriction of T to K_φ.

We now extend Theorem 7.20 to the general case.
Theorem 7.21. Every linear operator on a finite-dimensional vector space has a rational canonical basis and, hence, a rational canonical form.

Proof. Let T be a linear operator on the finite-dimensional vector space V, and let

    p(t) = (φ1(t))^(m1) (φ2(t))^(m2) ··· (φk(t))^(mk)

be the minimal polynomial of T, where the φi(t)'s are the distinct irreducible monic factors of p(t) and m_i > 0 for all i. The proof is by mathematical induction on k. The case k = 1 is proved in Theorem 7.20.

Now suppose that the result is valid whenever the minimal polynomial contains fewer than k distinct irreducible factors for some k > 1, and suppose that p(t) contains k distinct factors. Let W = R((φk(T))^(mk)), a T-invariant subspace of V, let U be the restriction of T to W, and let q(t) be the minimal polynomial of U. Then q(t) divides p(t) by Exercise 13 of Section 7.3. Furthermore, φk(t) does not divide q(t). For otherwise, there would exist a nonzero vector x in W such that φk(U)(x) = 0, and a vector y in V such that x = (φk(T))^(mk)(y). It follows that

    (φk(T))^(mk+1)(y) = φk(T)(x) = 0;

hence y ∈ K_φk, and so x = (φk(T))^(mk)(y) = 0 by Theorem 7.17(d), a contradiction. Thus q(t) contains fewer than k distinct irreducible divisors.

By the induction hypothesis, U has a rational canonical basis S_1 consisting of a union of U-cyclic bases (and hence of T-cyclic bases) of vectors from some of the subspaces K_φi, 1 ≤ i ≤ k - 1. By the corollary to Theorem 7.20, K_φk has a basis S_2 consisting of a union of T-cyclic bases. By Theorem 7.18, S_1 and S_2 are disjoint, and S = S_1 ∪ S_2 is linearly independent. Let s denote the number of elements in S. Then, by Theorem 7.17(d),

    s = dim(R((φk(T))^(mk))) + dim(K_φk)
      = rank((φk(T))^(mk)) + nullity((φk(T))^(mk))
      = dim(V).

We conclude that S is a basis for V. Therefore S is a rational canonical basis, and T has a rational canonical form.
The following theorem relates the rational canonical form to the characteristic polynomial.

Theorem 7.22. Let T be a linear operator on an n-dimensional vector space V with characteristic polynomial

    f(t) = (-1)^n (φ1(t))^(n1) (φ2(t))^(n2) ··· (φk(t))^(nk),

where the φi(t)'s (1 ≤ i ≤ k) are distinct irreducible monic polynomials and the ni's are positive integers. Then the following statements are true.

(a) For each i, φi(t) divides the minimal polynomial of T.
(b) For each i, dim(K_φi) = d_i n_i, where d_i denotes the degree of φi(t).
(c) If β is a rational canonical basis for T, then for each i, β_i = β ∩ K_φi is a basis for K_φi.
(d) If S_i is a basis for K_φi for each i, then S = S_1 ∪ ··· ∪ S_k is a basis for V. In particular, if each S_i is a disjoint union of T-cyclic bases, then S is a rational canonical basis for T.

Proof. (a) By Theorem 7.21, T has a rational canonical basis β; let C = [T]_β, a rational canonical form of T. By Exercise 38 of Section 5.4 the characteristic polynomial of C, and hence of T, is the product of the characteristic polynomials of the companion matrices that compose C. Therefore each φi(t) divides the characteristic polynomial of at least one of the companion matrices, which is of the form ±(φ(t))^p for some irreducible monic divisor φ(t) of f(t) and some positive integer p; consequently φi(t) = φ(t) for at least one of the companion matrices. But (φ(t))^p is the T-annihilator of a nonzero vector of V and hence divides the minimal polynomial of T. We conclude that φi(t) divides the minimal polynomial of T.

(b), (c), and (d) Let C = [T]_β. For each i, the companion matrices of C that arise from T-cyclic bases in K_φi contribute the factor (φi(t))^(ni) to the characteristic polynomial; hence the sum of their orders is d_i n_i, and this number is equal to the number of elements in β_i = β ∩ K_φi, which is a linearly independent subset of K_φi. It follows that d_i n_i ≤ dim(K_φi) for all i. Furthermore, the sum of the d_i n_i's is the degree of f(t), namely n. For each i let S_i be any basis of K_φi, and let S = S_1 ∪ ··· ∪ S_k. By Theorem 7.18, S is linearly independent and the bases S_1, ..., S_k are disjoint. Therefore

    n = Σ_i d_i n_i ≤ Σ_i dim(K_φi) ≤ n,

and it follows that dim(K_φi) = d_i n_i for all i. Moreover, S contains n elements, and hence S is a basis for V; this establishes (b) and (d). Finally, β_i is a linearly independent subset of K_φi containing d_i n_i = dim(K_φi) elements, and hence is a basis for K_φi, establishing (c).
Uniqueness of the Rational Canonical Form

Having shown that every linear operator has a rational canonical form, we are now in a position to ask about the extent to which it is unique. Certainly the rational canonical form of an operator can be modified by permuting the T-cyclic bases that constitute the corresponding rational canonical basis. This has the effect of permuting the companion matrices that make up the rational canonical form. As in the case of the Jordan canonical form, we will show that except for these permutations, the rational canonical form is unique, although the rational canonical bases are not.

To simplify this task we adopt the convention of ordering every rational canonical basis so that all the T-cyclic bases associated with the same irreducible monic divisor of the characteristic polynomial are grouped together. Furthermore, within each such grouping we will always arrange the T-cyclic bases in order of decreasing size. Our task is to show that, subject to this order, the rational canonical form of a linear operator is unique up to the arrangement of the irreducible monic divisors.

As in the case of the Jordan canonical form of Section 7.2, for each irreducible monic divisor φ of the characteristic polynomial of T we introduce a diagram of dots to describe that part of the rational canonical basis contained in K_φ. We prove that the diagram is completely determined by a property intrinsic to operators, namely rank.

For what follows, T is a linear operator with a rational canonical basis β, φ(t) is an irreducible monic divisor of its characteristic polynomial of degree d, and B_(x_1)(T), B_(x_2)(T), ..., B_(x_k)(T) are the T-cyclic bases of β that are contained in K_φ. For each j let (φ(t))^(p_j) be the T-annihilator of x_j. This polynomial has degree d p_j, and therefore, by Exercise 15 of Section 7.3, B_(x_j)(T) contains d p_j elements. Furthermore, p_1 ≥ p_2 ≥ ··· ≥ p_k, since the T-cyclic bases are arranged in decreasing order of size.

The dot diagram associated with this configuration consists of k columns of dots with p_j dots in the jth column, arranged so that the jth column begins at the top and terminates after p_j dots. For example, if k = 3, p_1 = 4, p_2 = 2, and p_3 = 2, the dot diagram is given below.

    • • •
    • • •
    •
    •

In contrast to the Jordan canonical form, the dots of the dot diagram for a rational canonical form do not correspond to individual members of the rational canonical basis.

For each i, let r_i be the number of dots in the ith row of a dot diagram. In the diagram above, r_1 = r_2 = 3 and r_3 = r_4 = 1. According to Exercise 7 of Section 7.2, the r_i's determine the p_j's, and vice versa.
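The duality just mentioned between column heights p_j and row counts r_i (Exercise 7 of Section 7.2) is conjugation of partitions, and a small sketch makes it concrete (the function names are mine, not the text's):

```python
def rows_from_columns(p):
    """p = [p1, p2, ...]: column heights in decreasing order.
    Row i of the dot diagram meets every column of height >= i."""
    return [sum(1 for pj in p if pj >= i) for i in range(1, max(p) + 1)]

def columns_from_rows(r):
    """r = [r1, r2, ...]: row lengths in decreasing order.
    Column j meets every row of length >= j."""
    return [sum(1 for ri in r if ri >= j) for j in range(1, max(r) + 1)]

# The k = 3, (p1, p2, p3) = (4, 2, 2) diagram displayed in the text:
assert rows_from_columns([4, 2, 2]) == [3, 3, 1, 1]
assert columns_from_rows([3, 3, 1, 1]) == [4, 2, 2]
```

Applying either function twice returns the original list, which is exactly the statement that the r_i's determine the p_j's and vice versa.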
Example 2

Recall the rational canonical form C of Example 1. Since there are two irreducible monic divisors of the characteristic polynomial of C, φ1(t) = t^2 - t + 3 and φ2(t) = t^2 + 1, there are two dot diagrams to consider, one for φ1(t) and the other for φ2(t). Since e1 has T-annihilator φ1(t) and B_(e1)(T) is a basis for K_φ1, the dot diagram for φ1(t) consists of a single dot. The other two T-cyclic bases, B_(e3)(T) and B_(e7)(T), lie in K_φ2. Since e3 has T-annihilator (φ2(t))^2 and e7 has T-annihilator φ2(t), we have p_1 = 2 and p_2 = 1 for the dot diagram of φ2(t). These diagrams are displayed below.

    Dot diagram for φ1(t):   •

    Dot diagram for φ2(t):   • •
                             •

The next theorem tells us that the r_i's are expressible in terms of the ranks of powers of φ(T), and thus are independent of the choice of a rational canonical basis. The p_j's are therefore also independent of the choice of basis. Hence, subject to the conventions we have discussed, the rational canonical form of an operator is unique.
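Anticipating the rank formula of the next theorem, the dot diagrams of Example 2 can be recomputed from ranks alone. This is only a sketch (helper names are mine); it rebuilds C from its companion blocks and checks that the row counts r_i obtained from ranks of powers of φ_i(C) match the diagrams above.

```python
from fractions import Fraction

def mul(A, B):
    n = len(A)
    return [[sum(A[i][r] * B[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]

def rank(M):
    """Rank over the rationals by exact Gauss-Jordan elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    n, r = len(M), 0
    for c in range(n):
        piv = next((i for i in range(r, n) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(n):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def poly_of(M, coeffs):
    """coeffs [c0, c1, ..., ck] -> c0*I + c1*M + ... + ck*M^k."""
    n = len(M)
    out = [[0] * n for _ in range(n)]
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    for c in coeffs:
        out = [[out[i][j] + c * P[i][j] for j in range(n)] for i in range(n)]
        P = mul(P, M)
    return out

# C of Example 1: block diagonal with companion matrices of phi1, (phi2)^2, phi2.
blocks = [[[0, -3], [1, 1]],
          [[0, 0, 0, -1], [1, 0, 0, 0], [0, 1, 0, -2], [0, 0, 1, 0]],
          [[0, -1], [1, 0]]]
C = [[0] * 8 for _ in range(8)]
off = 0
for B in blocks:
    for i in range(len(B)):
        for j in range(len(B)):
            C[off + i][off + j] = B[i][j]
    off += len(B)

d = 2                               # both phi1 and phi2 have degree 2
phi1C = poly_of(C, [3, -1, 1])      # phi1(C)
phi2C = poly_of(C, [1, 0, 1])       # phi2(C)

r1_phi1 = (8 - rank(phi1C)) // d
r1_phi2 = (8 - rank(phi2C)) // d
r2_phi2 = (rank(phi2C) - rank(mul(phi2C, phi2C))) // d

assert r1_phi1 == 1                  # single dot for phi1
assert (r1_phi2, r2_phi2) == (2, 1)  # rows 2, 1 -> columns p1 = 2, p2 = 1
```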
Theorem 7.23. Let T be a linear operator on a finite-dimensional vector space V, let φ(t) be an irreducible monic divisor of the characteristic polynomial of T of degree d, and let r_i and p_j be as defined above for the dot diagram for φ(t) with respect to a rational canonical basis of T. Then

    r_1 = (1/d)[dim(V) - rank(φ(T))]

and

    r_i = (1/d)[rank((φ(T))^(i-1)) - rank((φ(T))^i)]   for i > 1.

The proof of this theorem rests on a key lemma, which we now state. An outline of the proof of the lemma is provided, with many of the details left to the exercises.

Lemma. Let α be the total number of dots in the dot diagram for φ(t). Then dim(K_φ) = dα. Furthermore, for any i ≤ p_1,

    nullity((φ(T))^i) = d(r_1 + r_2 + ··· + r_i).

Outline of proof. For each j, B_(x_j)(T) contains d p_j vectors. By Theorem 7.22, the union of the T-cyclic bases in that part of the rational canonical basis lying in K_φ is a basis for K_φ. Therefore

    dim(K_φ) = Σ_j d p_j = dα.

Let U be the restriction of T to K_φ. Then N((φ(T))^i) = N((φ(U))^i), and it suffices to prove the second part of the lemma for nullity((φ(U))^i). The range of (φ(U))^i is generated by the images of the vectors in B_(x_1)(T) ∪ ··· ∪ B_(x_k)(T) under (φ(U))^i. For each j such that p_j > i, let γ_j = B_(y_j)(T), where y_j = (φ(U))^i(x_j). Then R((φ(U))^i) is generated by the union of the γ_j's with p_j > i, and this set is linearly independent. Furthermore, each y_j has (φ(t))^(p_j - i) as its T-annihilator, and hence γ_j consists of d(p_j - i) elements. Therefore

    rank((φ(U))^i) = Σ_(p_j > i) d(p_j - i).

Since p_j - i is the number of dots in the jth column of the diagram after the first i rows have been removed, we have

    α - Σ_(p_j > i) (p_j - i) = r_1 + ··· + r_i.

Since dim(K_φ) = dα,

    nullity((φ(T))^i) = nullity((φ(U))^i)
                      = dα - rank((φ(U))^i)
                      = dα - Σ_(p_j > i) d(p_j - i)
                      = d(r_1 + ··· + r_i).

Proof of Theorem 7.23. For i = 1, the lemma gives

    d r_1 = nullity(φ(T)) = dim(V) - rank(φ(T)).

For i > 1,

    d r_i = d(r_1 + ··· + r_i) - d(r_1 + ··· + r_(i-1))
          = nullity((φ(T))^i) - nullity((φ(T))^(i-1))
          = [dim(V) - rank((φ(T))^i)] - [dim(V) - rank((φ(T))^(i-1))]
          = rank((φ(T))^(i-1)) - rank((φ(T))^i),

and the result follows.

Corollary. Under the convention described earlier, the rational canonical form of a linear operator is unique up to the arrangement of the irreducible monic divisors of the characteristic polynomial.
Example 3

Let β = {e^x cos 2x, e^x sin 2x, x e^x cos 2x, x e^x sin 2x}, and let W be the span of β in C^∞ (see Section 2.7). Then W is a four-dimensional subspace of C^∞, and β is an ordered basis for W. Let D: W → W be the linear operator on W defined by D(f) = f′, the derivative of f. We will find the rational canonical form and a rational canonical basis of D. Let A = [D]_β. Then

    A = (  1   2   1   0 )
        ( −2   1   0   1 )
        (  0   0   1   2 )
        (  0   0  −2   1 ),

and the characteristic polynomial of D, and hence of A, is

    f(t) = (t² − 2t + 5)².

472    Chap. 7    Canonical Forms

Since φ(t) = t² − 2t + 5 is the only irreducible monic divisor of f(t), φ(t) has degree 2, and W is four-dimensional, the dot diagram for φ(t) contains only two dots. Therefore, the diagram is determined by r_1, the number of dots in the first row. Because ranks are preserved under matrix representations, we may use A in place of D in the formula given in Theorem 7.23. Thus

    r_1 = ½[4 − rank(φ(A))] = ½[4 − 2] = 1.

It follows that the second dot lies in the second row, and the dot diagram is given by

    •
    •

Therefore, W is a cyclic space, generated by a single function whose D-annihilator is (φ(t))². Furthermore,

    (φ(t))² = t⁴ − 4t³ + 14t² − 20t + 25,

and the rational canonical form of D is the companion matrix of (φ(t))²:

    ( 0   0   0  −25 )
    ( 1   0   0   20 )
    ( 0   1   0  −14 )
    ( 0   0   1    4 )

To find a rational canonical basis of D, it suffices to find a function g in W for which φ(D)(g) ≠ 0; any such g can be chosen as the cyclic generator. Since φ(A)e_3 ≠ 0, it follows that φ(D)(x e^x cos 2x) ≠ 0, and therefore h(x) = x e^x cos 2x can be chosen as the cyclic generator. Therefore,

    B_h(D) = {x e^x cos 2x, D(x e^x cos 2x), D²(x e^x cos 2x), D³(x e^x cos 2x)}

is a rational canonical basis of D.
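As a check on this example, the matrix [D]_β can be recomputed by differentiating each basis function, and its characteristic polynomial compared with that of the companion matrix above. A small SymPy sketch (SymPy is assumed available; it is not part of the text):

```python
from sympy import Matrix, symbols, expand

t = symbols('t')

# [D]_beta, obtained by differentiating each function in
# beta = {e^x cos 2x, e^x sin 2x, x e^x cos 2x, x e^x sin 2x}.
A = Matrix([[1, 2, 1, 0],
            [-2, 1, 0, 1],
            [0, 0, 1, 2],
            [0, 0, -2, 1]])

# Companion matrix of (t^2 - 2t + 5)^2 = t^4 - 4t^3 + 14t^2 - 20t + 25.
C = Matrix([[0, 0, 0, -25],
            [1, 0, 0, 20],
            [0, 1, 0, -14],
            [0, 0, 1, 4]])

f = expand((t**2 - 2*t + 5)**2)
assert A.charpoly(t).as_expr() == f    # characteristic polynomial of D
assert C.charpoly(t).as_expr() == f    # the companion matrix has the same one
print(f)   # t**4 - 4*t**3 + 14*t**2 - 20*t + 25
```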

We now define the rational canonical form of a matrix in the natural way.

Definition. The rational canonical form of a matrix A in M_{n×n}(F) is defined to be the rational canonical form of the linear operator L_A: F^n → F^n. The basis accompanying the rational canonical form of L_A is called a rational canonical basis of A.

Notice that the corollary above shows that the rational canonical form of a matrix is unique.

Sec. 7.4    Rational Canonical Form    473
Example 4

Find a rational canonical form and a rational canonical basis of the real 5 × 5 matrix A whose characteristic polynomial is

    f(t) = det(A − tI) = −(t² + 2)²(t − 2).

The distinct irreducible monic divisors of f(t) are φ_1(t) = t² + 2 and φ_2(t) = t − 2. By Theorem 7.22, dim(K_{φ_1}) = 4 and dim(K_{φ_2}) = 1. Since the degree of φ_1(t) is 2, the total number of dots in the dot diagram for φ_1(t) is 2, and r_1, the number of dots in the first row, is given by

    r_1 = ½[dim(R⁵) − rank(φ_1(A))] = ½[5 − rank(A² + 2I)] = ½(5 − 1) = 2.

Thus the dot diagram for φ_1(t) is

    • •

and each column of this diagram contributes the companion matrix of φ_1(t) = t² + 2,

    ( 0  −2 )
    ( 1   0 ),

to the rational canonical form. Since dim(K_{φ_2}) = 1, the dot diagram of φ_2(t) = t − 2 consists of a single dot, which contributes the 1 × 1 matrix (2). Therefore, the rational canonical form of A is

    ( 0  −2 |  0   0 | 0 )
    ( 1   0 |  0   0 | 0 )
    ( 0   0 |  0  −2 | 0 )
    ( 0   0 |  1   0 | 0 )
    ( 0   0 |  0   0 | 2 )

If β is a rational canonical basis of A, then β ∩ K_{φ_1} is the union of two cyclic bases B_{x_1}(L_A) and B_{x_2}(L_A), where x_1 and x_2 each have annihilator φ_1(t). It follows that x_1 and x_2 each lie in N(φ_1(L_A)), and it can be verified that N(φ_1(L_A)) is four-dimensional. Choosing x_1 to be any nonzero vector of N(φ_1(L_A)), we obtain B_{x_1}(L_A) = {x_1, Ax_1}. Next, choose x_2 to be a vector in N(φ_1(L_A)) that is not in the span of B_{x_1}(L_A). Then it can be seen that

    B_{x_1}(L_A) ∪ B_{x_2}(L_A) = {x_1, Ax_1, x_2, Ax_2}

is linearly independent and hence is a basis for K_{φ_1}.

Sec. 7.4    Rational Canonical Form    475

Since the dot diagram of φ_2(t) = t − 2 consists of a single dot, any nonzero eigenvector x_3 of A corresponding to the eigenvalue 2 completes the basis. By Theorem 7.22,

    β = {x_1, Ax_1, x_2, Ax_2, x_3}

is a rational canonical basis of A.
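Because a rational canonical form is a block-diagonal matrix of companion blocks, one per column of the dot diagrams, it can be assembled mechanically. A hedged SymPy sketch (the helper `companion` is defined here, not taken from the text) rebuilds the form found above and confirms that its characteristic polynomial agrees with f(t) up to sign:

```python
from sympy import Matrix, symbols, zeros, expand

def companion(coeffs):
    """Companion matrix of the monic polynomial
    t^k + c_{k-1} t^{k-1} + ... + c_0, given coeffs = [c_0, ..., c_{k-1}]."""
    k = len(coeffs)
    C = zeros(k, k)
    for i in range(1, k):
        C[i, i - 1] = 1            # subdiagonal of ones
    for i, c in enumerate(coeffs):
        C[i, k - 1] = -c           # last column holds -c_0, ..., -c_{k-1}
    return C

# Two blocks for t^2 + 2 (coeffs [2, 0]) and one for t - 2 (coeffs [-2]).
C = Matrix.diag(companion([2, 0]), companion([2, 0]), companion([-2]))

t = symbols('t')
assert C.charpoly(t).as_expr() == expand((t**2 + 2)**2 * (t - 2))
```

Note that det(tI − C) is monic, so it equals (t² + 2)²(t − 2) = −f(t).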

Example 5

For the matrix A of Example 4, we find an invertible matrix Q such that Q⁻¹AQ is the rational canonical form of A. Let Q be the matrix whose columns are the vectors of the rational canonical basis β of Example 4, in the same order; that is, the columns of Q are x_1, Ax_1, x_2, Ax_2, and x_3. Then Q⁻¹AQ is the rational canonical form of A by Theorem 5.1.
Example 6

Find the rational canonical form and a rational canonical basis of the matrix

    A = ( 2  1  0  0 )
        ( 0  2  1  0 )
        ( 0  0  2  0 )
        ( 0  0  0  2 )

The characteristic polynomial of A is f(t) = (t − 2)⁴, and therefore φ(t) = t − 2 is the only irreducible monic divisor of f(t). Since φ(t) has degree 1, K_φ = R⁴. We apply Theorem 7.23 to compute the dot diagram. In this case,

    r_1 = dim(R⁴) − rank(φ(A)) = 4 − 2 = 2,
    r_2 = rank(φ(A)) − rank((φ(A))²) = 2 − 1 = 1,

and

    r_3 = rank((φ(A))²) − rank((φ(A))³) = 1 − 0 = 1.

Since there are dim(R⁴) = 4 dots in the diagram, we may terminate the computation with r_3. Thus the dot diagram for φ(t) is

    • •
    •
    •

476    Chap. 7    Canonical Forms

Since (t − 2)³ = t³ − 6t² + 12t − 8 has the companion matrix

    ( 0  0    8 )
    ( 1  0  −12 )
    ( 0  1    6 )

and (t − 2) has the companion matrix (2), the rational canonical form of A is given by

    ( 0  0    8 | 0 )
    ( 1  0  −12 | 0 )
    ( 0  1    6 | 0 )
    ( 0  0    0 | 2 )

We now find a rational canonical basis of A. The dot diagram above indicates that there are vectors x_1 and x_2 in R⁴ with annihilators (φ(t))³ and φ(t), respectively, such that

    β = B_{x_1}(L_A) ∪ B_{x_2}(L_A) = {x_1, Ax_1, A²x_1, x_2}

is a rational canonical basis of A. Moreover, x_1 lies in N((L_A − 2I)³) but not in N((L_A − 2I)²), and x_2 lies in N(L_A − 2I). It can easily be shown that

    N(L_A − 2I) = span({e_1, e_4})    and    N((L_A − 2I)²) = span({e_1, e_2, e_4}).

The standard vector e_3 meets the criteria for x_1, so we set x_1 = e_3. It follows that

    Ax_1 = (0, 1, 2, 0)ᵀ    and    A²x_1 = (1, 4, 4, 0)ᵀ.

Next, we choose a vector x_2 in N(L_A − 2I) not in the span of {x_1, Ax_1, A²x_1}; e_4 satisfies this condition. Thus

    β = {e_3, Ae_3, A²e_3, e_4}

is a rational canonical basis of A.

Sec. 7.4    Rational Canonical Form    477
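The computations of this example are easy to verify directly: with Q having columns e_3, Ae_3, A²e_3, e_4, the product Q⁻¹AQ should equal the rational canonical form found above. A SymPy check (SymPy is assumed available):

```python
from sympy import Matrix, eye

A = Matrix([[2, 1, 0, 0],
            [0, 2, 1, 0],
            [0, 0, 2, 0],
            [0, 0, 0, 2]])

# Ranks used in the dot-diagram computation: 2, 1, 0.
N = A - 2 * eye(4)
assert [N.rank(), (N**2).rank(), (N**3).rank()] == [2, 1, 0]

e3 = Matrix([0, 0, 1, 0])
e4 = Matrix([0, 0, 0, 1])
Q = Matrix.hstack(e3, A * e3, A * A * e3, e4)   # rational canonical basis

assert Q.inv() * A * Q == Matrix([[0, 0, 8, 0],
                                  [1, 0, -12, 0],
                                  [0, 1, 6, 0],
                                  [0, 0, 0, 2]])
```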

Direct Sums*

The following theorem is a simple consequence of Theorem 7.22.

Theorem 7.24. Let T be a linear operator on an n-dimensional vector space V with characteristic polynomial

    f(t) = (−1)ⁿ(φ_1(t))^{n_1}(φ_2(t))^{n_2} ··· (φ_k(t))^{n_k},

where the φ_i(t) (1 ≤ i ≤ k) are distinct irreducible monic polynomials and the n_i are positive integers. Then
(a) V = K_{φ_1} ⊕ K_{φ_2} ⊕ ··· ⊕ K_{φ_k}.
(b) If T_i is the restriction of T to K_{φ_i} and C_i is the rational canonical form of T_i, then C_1 ⊕ C_2 ⊕ ··· ⊕ C_k is the rational canonical form of T.

Proof. Exercise.
The following theorem is a simple consequence of Theorem 7.16.

Theorem 7.25. Let T be a linear operator on a finite-dimensional vector space V. Then V is a direct sum of T-cyclic subspaces C_{x_i}(T), where each x_i lies in K_φ for some irreducible monic divisor φ(t) of the characteristic polynomial of T.

Proof. Exercise.
EXERCISES

1. Label the following statements as being true or false.
(a) Every rational canonical basis of a linear operator T is the union of T-cyclic bases.
(b) If a basis β is the union of T-cyclic bases of a linear operator T, then β is a rational canonical basis of T.
(c) There exist square matrices having no rational canonical form.
(d) A square matrix is similar to its rational canonical form.
(e) For any linear operator T on a finite-dimensional complex vector space, the Jordan canonical form and the rational canonical form of T are the same.
(f) For any linear operator T on a finite-dimensional vector space, any irreducible monic factor of the characteristic polynomial of T divides the minimal polynomial of T.
(g) Let φ(t) be an irreducible monic divisor of the characteristic polynomial of a linear operator T. The dots in the dot diagram used to compute the rational canonical form of T are in one-to-one correspondence with the vectors in a rational canonical basis for K_φ.

2. For each of the following real and complex matrices A, find the rational canonical form of A and a rational canonical basis of A.

3. Prove that if T is a linear operator on a finite-dimensional vector space V with minimal polynomial (φ(t))^m for some positive integer m and irreducible monic polynomial φ(t), then N(φ(T)^{m−1}) is a proper T-invariant subspace of V.

4. Let T be a linear operator on a finite-dimensional vector space with minimal polynomial (φ(t))^m for some positive integer m and irreducible monic polynomial φ(t). Prove that the restriction of T to R(φ(T)) has minimal polynomial (φ(t))^{m−1}.

5. Let T be a linear operator on a finite-dimensional vector space. Prove that the rational canonical form of T is a diagonal matrix if and only if T is diagonalizable.

6. Let T be a linear operator on a finite-dimensional vector space V with characteristic polynomial f(t) = (−1)ⁿφ_1(t)φ_2(t), where φ_1(t) and φ_2(t) are distinct irreducible monic polynomials and n = dim(V).
(a) Prove that there exist vectors x_1 and x_2 in V such that x_1 has T-annihilator φ_1(t), x_2 has T-annihilator φ_2(t), and B_{x_1}(T) ∪ B_{x_2}(T) is a basis for V.
(b) Prove that there exists a vector x_3 in V with T-annihilator φ_1(t)φ_2(t) such that B_{x_3}(T) is a basis for V.
(c) Describe the difference between the matrix representation of T with respect to B_{x_1}(T) ∪ B_{x_2}(T) and the matrix representation of T with respect to B_{x_3}(T).
Thus, to assure the uniqueness of the rational canonical form, we require that the generators of the T-cyclic bases that constitute a rational canonical basis have T-annihilators equal to powers of irreducible monic factors of the characteristic polynomial of T.

7. Let T be a linear operator on a finite-dimensional vector space with minimal polynomial

    p(t) = (φ_1(t))^{m_1}(φ_2(t))^{m_2} ··· (φ_k(t))^{m_k},

where the φ_i(t)'s are the distinct irreducible monic factors of the characteristic polynomial f(t). Prove that for each i, m_i is equal to the number of entries in the first column of the dot diagram for φ_i(t).

8. Let T be a linear operator on a finite-dimensional vector space V. Prove that for any irreducible polynomial φ(t), if φ(T) is not one-to-one, then φ(t) divides the characteristic polynomial of T. Hint: Apply Exercise 15 of Section 7.3.

9. Fill in the details of the proof of the lemma contained within the proof of Theorem 7.23.

10. Let T be a linear operator on a finite-dimensional vector space, and suppose that φ(t) is an irreducible monic factor of the characteristic polynomial of T. Prove that if φ(t) is the T-annihilator of the vectors x and y, then x ∈ C_y(T) if and only if y ∈ C_x(T).

Exercises 11 and 12 are concerned with direct sums.

11. Prove Theorem 7.24.

12. Prove Theorem 7.25.
480    Chap. 7    Canonical Forms

INDEX OF DEFINITIONS FOR CHAPTER 7

Companion matrix  460
Cycle of generalized eigenvectors  419
Cyclic basis  434
Dot diagram for Jordan canonical form  432
Dot diagram for rational canonical form  469
End vector of a cycle  419
Generalized eigenspace  417
Generalized eigenvector  418
Generator of a cycle  419
Initial vector of a cycle  419
Jordan block  417
Jordan canonical basis and Jordan canonical form of a linear operator  417
Jordan canonical form of a matrix  420
Length of a cycle  419
Minimal polynomial of a linear operator  446
Minimal polynomial of a matrix  447
Nilpotent linear operator  451
Nilpotent matrix  451
Rational canonical basis and rational canonical form of a linear operator  461
Rational canonical form and basis of a matrix  472
T-annihilator of a vector  458

Appendices
APPENDIX A    SETS

A set is a collection of objects, called elements or members of the set. If x is an element of the set A, then we write x ∈ A; if x is not an element of A, we write x ∉ A. For example, if Z is the set of integers, then 3 ∈ Z and ½ ∉ Z.

Two sets A and B are called equal, denoted A = B, if they contain exactly the same elements. Sets may be described in one of two ways:

1. By listing the elements of the set between braces { }.
2. By describing the elements of the set in terms of some characteristic property.

For example, the set consisting of the elements 1, 2, 3, and 4 can be written as {1, 2, 3, 4} or as {x: x is a positive integer less than 5}. Note that the order in which the elements of a set are listed is immaterial; hence {1, 2, 3, 4} = {3, 1, 2, 4} = {1, 3, 1, 4, 2}.

Example 1

Let A denote the set of real numbers between 1 and 2. Then A may be written as

    A = {x: x is a real number and 1 < x < 2}

or, if R denotes the set of real numbers, as A = {x ∈ R: 1 < x < 2}.

A set B is said to be a subset of a set A, written B ⊆ A, if every element of B is an element of A. For example, {1, 2, 6} ⊆ {2, 8, 7, 6, 1}. Observe that A = B if and only if A ⊆ B and B ⊆ A, a fact that is often used to prove that two sets are equal.

The empty set, denoted by ∅, is the set containing no elements. The empty set is a subset of every set.

Sets may be combined to form other sets in two basic ways. The union of two sets A and B, denoted A ∪ B, is the set of elements that are in A, or B, or both; that is,

    A ∪ B = {x: x ∈ A or x ∈ B}.

The intersection of two sets A and B, denoted A ∩ B, is the set of elements that are in both A and B; that is,

    A ∩ B = {x: x ∈ A and x ∈ B}.

Two sets are called disjoint if their intersection is the empty set.

Example 2

Let A = {1, 3, 5} and B = {1, 5, 7, 8}. Then

    A ∪ B = {1, 3, 5, 7, 8}    and    A ∩ B = {1, 5}.

Likewise, if X = {1, 2, 8} and Y = {3, 4, 5}, then X ∪ Y = {1, 2, 3, 4, 5, 8} and X ∩ Y = ∅. Thus X and Y are disjoint sets.

The union and intersection of more than two sets can be defined analogously. Specifically, if A_1, A_2, ..., A_n are sets, then the union and intersection of these sets are defined as

    ∪_{i=1}^{n} A_i = {x: x ∈ A_i for some i = 1, 2, ..., n}

and

    ∩_{i=1}^{n} A_i = {x: x ∈ A_i for all i = 1, 2, ..., n}.

Similarly, if Λ is an index set and {A_α: α ∈ Λ} is a collection of sets, then the union and intersection of these sets are defined by

    ∪_{α∈Λ} A_α = {x: x ∈ A_α for some α ∈ Λ}

and

    ∩_{α∈Λ} A_α = {x: x ∈ A_α for all α ∈ Λ}.

Example 3

Let Λ = {α ∈ R: α > 1}, and for each α ∈ Λ let

    A_α = {x ∈ R: (1/α) − 1 < x ≤ 1 + α},

where R denotes the set of real numbers. Then

    ∪_{α∈Λ} A_α = {x ∈ R: x > −1}    and    ∩_{α∈Λ} A_α = {x ∈ R: 0 ≤ x ≤ 2}.

By a relation on a set A, we mean a rule for determining whether or not, for any elements x and y in A, x stands in a given relationship to y. More precisely, a relation on A is a set S of ordered pairs of elements of A such that (x, y) ∈ S if and only if x stands in the given relationship to y. On the set of real numbers, for instance, "is equal to," "is less than," and "is greater than or equal to" are familiar relations. If S is a relation on a set A, we often write x ~ y in place of (x, y) ∈ S.

A relation S on a set A is called an equivalence relation on A if these three conditions hold:

1. For each x ∈ A, (x, x) ∈ S (reflexivity).
2. If (x, y) ∈ S, then (y, x) ∈ S (symmetry).
3. If (x, y) ∈ S and (y, z) ∈ S, then (x, z) ∈ S (transitivity).

For example, if we define x ~ y to mean that x − y is divisible by a fixed integer n, then ~ is an equivalence relation on the set of integers.

APPENDIX B    FUNCTIONS

If A and B are sets, then a function f from A to B, written f: A → B, is a rule that associates to each element x in A a unique element denoted f(x) in B. The element f(x) is called the image of x (under f), and x is called a preimage of f(x) (under f). If f: A → B, then A is called the domain of f, and the set {f(x): x ∈ A} of all images of elements of A is called the range of f. If S ⊆ A, we denote by f(S) the set {f(x): x ∈ S} of all images of elements of S. Likewise, if T ⊆ B, we denote by f⁻¹(T) the set {x ∈ A: f(x) ∈ T} of all preimages of elements in T. Finally, two functions f: A → B and g: A → B are called equal, written f = g, if f(x) = g(x) for all x ∈ A.

Example 1

Suppose that A = [−10, 10] and B = R, the set of real numbers. Let f: A → B be the function that assigns to each element x in A the element x² + 1 in B; that is, f is defined by f(x) = x² + 1. Then A is the domain of f, and [1, 101] is the range of f. Since f(2) = 5, the image of 2 is 5, and 2 is a preimage of 5. Notice that −2 is another preimage of 5. Moreover, if S = [1, 2] and T = [82, 101], then f(S) = [2, 5] and f⁻¹(T) = [−10, −9] ∪ [9, 10].

As the preceding example shows, the preimage of an element in the range need not be unique. Functions such that each element of the range has a unique preimage are called one-to-one; that is, f: A → B is one-to-one if f(x) = f(y) implies x = y or, equivalently, if x ≠ y implies f(x) ≠ f(y). If f: A → B is a function with range B, that is, if f(A) = B, then f is called onto.

Suppose that f: A → B is a function and S ⊆ A. Then a function f_S: S → B, called the restriction of f to S, can be formed by defining f_S(x) = f(x) for each x ∈ S. The following example illustrates these concepts.

Example 2

Let f: [−1, 1] → [0, 1] be defined by f(x) = x². This function is onto but not one-to-one since f(−1) = f(1) = 1. Note that if S = [0, 1], then f_S is both onto and one-to-one. Finally, if T = [½, 1], then f_T is one-to-one but not onto.

Let A, B, and C be sets, and let f: A → B and g: B → C be functions. By following f with g, we obtain a function g∘f: A → C called the composite of g and f. Thus (g∘f)(x) = g(f(x)) for all x ∈ A. For example, let A = B = C = R (the set of real numbers), f(x) = sin x, and g(x) = x² + 3. Then

    (g∘f)(x) = g(f(x)) = sin²x + 3,

whereas

    (f∘g)(x) = f(g(x)) = sin(x² + 3).

Hence g∘f ≠ f∘g. Functional composition is associative, however; that is, if h: C → D is another function, then h∘(g∘f) = (h∘g)∘f.

A function f: A → B is said to be invertible if there exists a function g: B → A such that (f∘g)(y) = y for all y ∈ B and (g∘f)(x) = x for all x ∈ A. If such a function g exists, then it is unique; it is called the inverse of f, and we denote it by f⁻¹. It can be shown that f is invertible if and only if f is both one-to-one and onto.

Example 3

The function f: R → R defined by f(x) = 3x + 1 is one-to-one and onto; hence f is invertible. The inverse of f is the function f⁻¹: R → R defined by f⁻¹(x) = (x − 1)/3.

The following facts about invertible functions are easily proved.

1. If f: A → B is invertible, then f⁻¹ is invertible and (f⁻¹)⁻¹ = f.
2. If f: A → B and g: B → C are invertible, then g∘f is invertible and (g∘f)⁻¹ = f⁻¹∘g⁻¹.
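The composition and inverse facts above translate directly into code; a minimal Python sketch (the helper `compose` and the sample functions are illustrative only):

```python
import math

def compose(g, f):
    """Return the composite g∘f, defined by (g∘f)(x) = g(f(x))."""
    return lambda x: g(f(x))

f = math.sin
g = lambda x: x**2 + 3

gf = compose(g, f)      # (g∘f)(x) = sin(x)^2 + 3
fg = compose(f, g)      # (f∘g)(x) = sin(x^2 + 3)
assert gf(1.0) == math.sin(1.0)**2 + 3
assert gf(1.0) != fg(1.0)                 # composition is not commutative

# Example 3: h(x) = 3x + 1 is one-to-one and onto, with inverse (x - 1)/3.
h = lambda x: 3 * x + 1
h_inv = lambda x: (x - 1) / 3
assert h_inv(h(2.5)) == 2.5 and h(h_inv(2.5)) == 2.5
```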

C FIELDS

The set of real numbersis

an

example

Basically, a fieldisa in which four


subtraction, and division)can be
set

defined

of

an

algebraic

(called

operations
so

that,

with

structure

addition,
the

called a \"field.\"

multiplication,
of division
exception

the

zero,

by

sum,

More precisely, a

A field

Definitions.

F is a

and

addition
elements

multiplication,
y in F there

x,

a + b

(F2) (a +

b)

all

elements

and

of

and

addition

0 + a

each

For
elements

= a

a + c= 0
(existence

(F 5) a

\342\200\242

(b

c)

of

and

The

y.

called

identity
c and

elements

multiplicative

and

y are

called

0 (read \"zero\") and

elements

that

such

in

\342\200\242

addition and multiplication).


nonzeroelementb in F
exist
there

b'd

and

and multiplication).

over

multiplication

and 1

of inverses for addition


\342\200\242
= a \342\200\242
c
b
+ a

(distributivity

The elements x

F:

multiplication).

of identity elements for


a in F and each
element
c and d in F such that

(existence

which the

F for

in

x*y

of

b *a

\342\200\242

exist distinct elements0 and

(F 3) There

4)

c in

b,

(called

that for each pair

are defined so
and

\342\226\240

and

operations

set

the

follows.

as

defined

Wo

which

a,

of any two elementsin

of addition and multiplication).


= a +
(b + c) and (a-b)-c = a-(b-c)

(associativity

(F

fieldis

are unique elements x +

= b+a

(commutativity

set on

respectively)

following conditions holdfor


(Fl)

and quotient

difference,

product,

of the set.

element

an

is

485

Fields

AppendixC

addition).

the sum and

product, respectively,of

1 (read \"one\")

in

mentioned

for addition and multiplication,


d referred to in (F 4) are called additive
inverse
inverse for b, respectively.
elements

3) are
the

and

respectively,

an

(F

for

a and

Example1
The

set

is a

field,

real

numbers

which

will

of

the usual definitions


denoted
1
by R.

with
be

of addition and multiplication


'

Example 2

The set of rational


multiplication is a field. 1
numbers

the

of

definitions

usual

addition

and

Example

The

with

set

numbers,

numbers of the form a + b^/l,


with addition and multiplication as in is

of all real

where

jR

field.

b are

and

rational

Appendices

486

Example

The fieldZ2 consists

two

multiplication

by

of

defined

the

0 +

0 = 0,

0-0

= 0,

operations of addition and

1 with the

0 and

elements

equations

0+1 = 1 + 0=1,
0*1 = 1*0 =

0,

1*1

and

1=0,

1.

Example 5

Neither the set of positive


definitions of additionand
not hold. I

nor

integers

+b

(a) If a

(b) If a

Proof
b*d=l.

\342\200\224
c

The

(c

b)

of

proofs

of the

consequence

following
c

and

be

by (F 3)

theorem.
elements

arbitrary

of

side of the

the right

(a

a * (b

4)

are

Suppose

= a for each a e

b)'

= (c

\342\226\240
\342\200\242

b)

mentioned

and

2)

3) we

(F

a.
Thus

c.

3) and

in(F

c and

the elements

unique.

that
F3

we

0' ef
have

satisfies 0' + a = a
each
0' + a = 0 + a for eachaeF.

Since

aeF.

for

Thus

0'

0 by

C.l.

The proofs of the remainingparts are


Thus

obtain

to

by

this equality: by (F
=

d)

c*b

that

F such

in

element

equality a*b =

equality reducesto c.

The elements 0 and 1

Corollary.

Theorem

so only (b) will be proved.

the existenceof an

left side of

the

Consider

Proof

(b) are similar;

and

(a)

both sides of the

d.

d mentioned in (F

a = c.

b-# 0, then

(a' b)
Similarly,

a = c.

then

(F 4) guarantees

Multiply
-

b,

and

c* b

0, then

If

0 + a

is guaranteed

existence

F.

afield

(a b)
have

field,

with the usual


integers
for in either
case (F 4) does

of

(Cancellation Laws). Let a,b,

Theorem'CI

is

this

whose

elements

inverse

set

is

multiplication

The identityand
and (F 4) are unique;

the

each

element

b in a

similar.

field has a unique

inverse

additive

in
the
unique multiplicative inverse. (It willbe
that 0 has no multiplicative
inverse.)The additive
inverse
of b are denoted by ~-b and b'1,
= 6.
and that (b~^-1
shown

inverse

respectively.

Subtraction and division can

be

defined

in

corollary
and
Notice

terms

if b

# 0, a

to Theorem

C.2

and,

the
that

of

multiplicative
\342\200\224
\342\200\224

addition

b)

and

multiplication
be

defined

Division
zero
difference,and
Many of
in any field,as

\342\200\224

of

the

two

any

familiar

the

but

properties

defined

a'b'1.

exception the sum, product,


elements
of a field are defined.
of multiplication
of real numbers are true
shows.

b be arbitrary

Let a. and

C.2.

# 0 is

this

with

theorem

following

afb ~

and

(~b)

undefined,

quotient

Theorem

b~a

is

by

Specifically,

that is,

b~x;

by

multiplication

addition

be

to

inverses.
multiplicative
of ~-b and division by b

and

additive

the

using

by

subtraction of b is
to

487

Fields

Appendix C

elements

each of the

Then

a field.

of

following are true.

(a) a-0 = 0.
=

(~a)-b

(b)

-(a-b).

a-(-b)=

(c) (~a)-(~b)^arb.

Proof, (a) Since6.+ 0=

= a*0 +

a*0

that

shows

5)

0) = a-0

+ a'0.
and cancellation of a-0

= a-(0 +

a-0
Thus

(F

0,

a-0,

C.l

Theorem

by

gives

0 = a-0.

(b) By definition
=

+ (

a'b

that

So

0.

[\342\200\224(a*by]

a)'b

0. But

\342\200\224a
is

and

(F5)

\342\200\224(a\\'b)

(c)

is

such that a
=

twice

By

field

arbitrary

which a sum

it may

of p l's equals0 is

integer exists, then F is saidto

In a

field

0; so

that

a'(-b)

inverse.

that a sum

happen

jR

and

p
having

j\302\243
0,

has
then
finite

called

have

characteristic
x +

x +

zero.

characteristic

zero.
\342\226\240
\342\226\240
\342\226\240

characteristic

unnatural problems arise. For this


spaces stated in this book require that

the

field

over

the
which

Z2
p for

positive

Z2 has

equals
characteristic

of

field

the

if no such

(p summands)

some

---

integer

that if F

Observe

(especially

reason

Thus

in

positive

of F;

characteristic

the

1+

example,

the

characteristic

has no multiplicative

of a field

identity

equals 0 for some positiveintegerp. For


in Example 4),.1.+ 1 = 0. In this case
smallest

two,

a)

-\302\243-(a-b)l=a-b.

(p summands)

characteristic

\342\200\224

find

= -[fl-(-ft)].=
additive

The

an

part (b), we

applying

Corollary.

(defined

+ (

similar.

(-a)'(-b)

In

of F

show

to

suffices

it

\342\200\224(a'b)

(-fl)]-6 = 0-6 =-6-0


(-a)-b = -(a'b). The proof

(a). Thus

part

=
(\342\200\224a)'b

element

the

that a'b +

of F such

element

unique

= [fl +

fl-6 + (-fl)-6
by

the

to prove that

in order

\342\200\224

is

\342\200\224(a'b)

results
the

is a

field

0 for all
two),
about
vector

of

x e F.
many
vector

space is

488

Appendices

defined be of characteristic
zero

at

(or,

other than

characteristic

of some

least,

two).

Finally, note that in


is
elements a and b in a
D

APPENDIX

For the purposes of algebra the fieldof real numbers


are polynomials of nonzero degree with real number
zeros in the field of real numbers(for
x2
example,
field.

the

in

zero

real

the

called

numbers

respectively.
and

sum

of Wo

product

(where a, b, c, andd

no

is often desirable to
with coefficients from

1). It

degree

is an expressionof

A complex number

where a and b are-teal


The

have

that

coefficients

there

for

sufficient,

we shall \"enlarge\"the fieldofreal

this reason

For

not

a field.

such

Definitions.

b.

is

of nonzero

polynomial

any

obtain

to

numbers

which

of two

product

\342\200\242

NUMBERS

COMPLEX

have a fieldin
that fieldhas a

than

rather

ab

denoted

field

the

book

this

of

sections

other

real

are

are

numbers)

(a + bi) + (c +

z + w=

part

a + bi

bi,

part of z,
w

and

= a +

c + di

as follows:

defined

(b + d)i

c) +

(a

di)

form

and the imaginary

numbers z =

complex

the

and

(a

+ di) =

bi)(c

(ac

\342\200\224

bd)

+ ad)L

(be

Example

The

zw

sum

and

z +

of z =

product
w

(3

\342\200\224

5i

5f) + (9 +

and

70 =

(3

w
+

= 9 +
9)

li are

12+2f

+ K-5) + 7]f=

and

zw =

= 62

Any

with
sums

the
and

real

50(9 + 70 =

(3-

be regarded

c may

that

+ (d+

Of)

called

(c

=*

In particular,

for i =

d) +

of the
The

imaginary.

(bi)(di)

this

associating

by

preserves

correspondence

is,

Any complex number

number,is

3 -7]f

[(-5)-9

as a complex number

c + Of. Observe that

number

products;

(c + 00

- 24f. 1

number

complex

[3*9- (-5)-7]

(0

bi)(0

Of,

and , (c + Oi)(d+
bi

form

+ di) =

we

have

i'i=

bi, where

(0 - bd) +

-bd.

0 + If,

cd +

Of.

nonzero real
imaginary numbers is real since

0 +

of two

product

Of)

\342\200\2241.

\342\200\242
(b

b is a

\342\200\242

d)i

Appendix

i2

that

observation

The

489

Numbers

Complex

\342\200\224

i-i

\342\200\224

complex
1.

to remember

way

easy

of multiplication of complex
simply
numbers
as you would any two algebraicexpressions

the definition
\342\200\224

an

provides
numbers:

and

this

2 illustrates

Example

two

multiply

i2

replace

by

technique.

Example 2

The product of

2i

\342\200\2245
+

and

-5(1 - -5 +

- 3i) =

+ 2f)(l

(-5

\342\200\224
3\302\243
is

30

number

real

(a + 60 +

0=

(a

= a+
the

Likewise

number

real

identity

=a+
each

\342\200\224

a)

b)i.

inverse.In

In

view

of the

as a complex

(b + 0)i

0) +

Of)

number, is a

6-0)

(a\342\200\242

+ (6-1

namely

the field of
The

number

bi.

We

with the operations of addition

numbers

complex numbers

by

C.

of a complex number a + bi is the


conjugate
of the complex number z by z.

conjugate
the

denote

Example3
of

\342\200\2243
+

-3+2f

2f,

is not surprising.

is afield.

(complex)

\342\200\224

the following result

of complex

above

defined

multiplication

conjugates

- a-0)i

has an additive inverse,


number
except 0 has a multiplicative

a + 6f

statements

set

The

Definition.

The

multiplicative

numbers since

complex

preceding

D.l.

We denote

complex

(a

fact,

Theorem

and

=
<M)

bi.

each

also

But

to)(l

number

complex

\342\200\224

identity

bu

set of complex

(a + to)-1= (a+

Clearly

(0 +

bi)

1, regarded

the

for

element

number, is an additive

since

numbers

of complex

set

the

for

element

as a complex

0, regarded

30

15f + 2f-6(-1)

-5+

= 1 + Hi.
The

2f-6f2

15f

21(1

\342\200\224

=-3-

7f,

and

2i,

6 are

as follows:

4-7f = 4 +

7f,

Appendices

490

and

6 =

The

D.2.

Theorem

For

complex

any

z= a+

Observe that zz =

an

provides
c

(a +

zz

the

a real number

bi, zz

is real and

number.

of a complex
value

absolute

the

denote

We

z.

nonnegative, for

complex

modulus)

(or

if and only if z =

b2.

absolute value of a

+ b2.

^/a2

z is

- bi) = a2 +

bi)(a

number

of z by

|z|.

number and its conjugateis


the quotient of two complexnumbers

of a complex

for detennining

a+ bi c

real

+ bd

ac

\342\200\224

di

(ac

be

c2 + d2

+ (be

bd)

c- di~

c + di

di~
=

\342\200\224

ad

ad)i

d2

c2

\342\200\224

c2 + d2

Example

will
+

the

of

definition

the

0, then

di #

?c +

(1

value

product

method

easy

\342\200\224

absolute

a + bi

We

6.

|z|2.

that

fact

The

number

real

the

is

bi

number

The

Definition:

if

Of

consequenceof

number

complex

to define the

be used

can

fact

for

conjugate.

complex

This

Of

is an easy

result

following

6 +

4f)/(3

- 2f):

1+

4f

1 +

4i\\

\"3-2*3

3^2f

The absolute
absolutevalueof a

|zw|

w|< |z| +
=

|z|

numbers.

Let

computing

the

_5_

14,

~T3+l3*'

has the familiar

the

following

denote

any.two

quotient

result

complex

properties of the

shows.
numbers.

Then

|w|.

|w|.

(c) |z| - |w| < |z +


Proof

14i_

number

as

number,

Let z and

Theorem D.3.
(b)

real

_ -5 +
+ 2\"
9+4

3 + 2t

The absolute value of a complex number has the familiar properties of the absolute value of a real number, as the following result shows.

Theorem D.3. Let z and w denote any two complex numbers. Then
(a) |z + w| ≤ |z| + |w|.
(b) |zw| = |z|·|w|.
(c) |z| − |w| ≤ |z + w|.

Proof. Let z = a + bi and w = c + di, where a, b, c, and d are real numbers.
(a) Observe first that

    0 ≤ (ad − bc)^2 = a^2 d^2 − 2abcd + b^2 c^2,

so 2abcd ≤ a^2 d^2 + b^2 c^2. Adding a^2 c^2 + b^2 d^2 to both sides of the inequality gives

    (ac + bd)^2 = a^2 c^2 + 2abcd + b^2 d^2 ≤ a^2 c^2 + a^2 d^2 + b^2 c^2 + b^2 d^2 = (a^2 + b^2)(c^2 + d^2).

By taking square roots, we obtain

    ac + bd ≤ √(a^2 + b^2)·√(c^2 + d^2).

Now

    |z + w|^2 = (a + c)^2 + (b + d)^2 = a^2 + b^2 + c^2 + d^2 + 2(ac + bd)
              ≤ a^2 + b^2 + c^2 + d^2 + 2√(a^2 + b^2)·√(c^2 + d^2)
              = (√(a^2 + b^2) + √(c^2 + d^2))^2 = (|z| + |w|)^2.

By taking square roots, we obtain (a).
(b) From the definition of absolute value we see that

    |zw| = |(ac − bd) + (bc + ad)i| = √((ac − bd)^2 + (bc + ad)^2)
         = √(a^2 c^2 + b^2 d^2 + b^2 c^2 + a^2 d^2) = √(a^2 + b^2)·√(c^2 + d^2) = |z|·|w|.

(c) From (a) and (b) it follows that

    |z| = |(z + w) + (−w)| ≤ |z + w| + |−w| = |z + w| + |w|.

So |z| − |w| ≤ |z + w|.

Our motivation for enlarging the set of real numbers to the set of complex numbers was to obtain a field such that every polynomial of nonzero degree with coefficients from that field has a zero. Our next result guarantees that the field of complex numbers has this property.

Theorem D.4 (The Fundamental Theorem of Algebra). Let a_0, ..., a_n be complex numbers with n ≥ 1 and a_n ≠ 0. Then the polynomial

    a_n z^n + a_{n−1} z^{n−1} + ··· + a_1 z + a_0

has a zero in the field of complex numbers.

For a proof, see Walter Rudin's Principles of Mathematical Analysis (McGraw-Hill, New York, 1964).

The following important corollary follows from Theorem D.4 and the division algorithm for polynomials (Theorem E.1).

Corollary. If p(z) = a_n z^n + ··· + a_1 z + a_0 is a polynomial of degree n ≥ 1 with complex coefficients, then there exist (not necessarily distinct) complex numbers c_1, ..., c_n such that

    p(z) = a_n (z − c_1)(z − c_2)···(z − c_n).

A field is called algebraically closed if it has the property that every polynomial of positive degree with coefficients from that field factors as a product of polynomials of degree 1. Thus the corollary above shows that the field of complex numbers is algebraically closed.
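Theorem D.3 can be spot-checked numerically with Python's built-in complex type. This is an illustration only (the theorem is proved above for all complex numbers); the sample pairs are our own.

```python
# Check (a), (b), and (c) of Theorem D.3 on a few sample pairs.
pairs = [(1 + 4j, 3 - 2j), (2j, -5 + 1j), (-1 - 1j, 4 + 0j)]
for z, w in pairs:
    assert abs(abs(z * w) - abs(z) * abs(w)) < 1e-12   # (b) |zw| = |z||w|
    assert abs(z + w) <= abs(z) + abs(w) + 1e-12       # (a) triangle inequality
    assert abs(z) - abs(w) <= abs(z + w) + 1e-12       # (c) reverse form
print("Theorem D.3 holds on all sample pairs")
```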

APPENDIX E  POLYNOMIALS

In this appendix we discuss some useful properties of polynomials with coefficients from an arbitrary field. For the definition of a polynomial, refer to Section 1.2.

Definition. A polynomial f(x) divides a polynomial g(x) if there exists a polynomial q(x) such that g(x) = f(x)q(x).

Our first result shows that the familiar long division process for polynomials with real coefficients is valid for polynomials with coefficients from an arbitrary field.

Theorem E.1 (The Division Algorithm for Polynomials). Let f1(x) be a polynomial of degree n, and let f2(x) be a polynomial of degree m ≥ 0. Then there exist unique polynomials q(x) and r(x) such that
(a) the degree of r(x) is less than m, and
(b) f1(x) = q(x)f2(x) + r(x).

Proof. We begin by establishing the existence of polynomials q(x) and r(x) that satisfy conditions (a) and (b). If n < m, we can take q(x) = 0 and r(x) = f1(x) to satisfy (a) and (b). Assume, therefore, that n ≥ m; we establish the existence of q(x) and r(x) by mathematical induction on n. Suppose first that n = 0; then m = 0, and f1(x) and f2(x) are nonzero constants. Hence we may take q(x) = f1(x)[f2(x)]^(−1) and r(x) = 0 to satisfy (a) and (b).

Now suppose that the theorem is true whenever f1(x) has degree less than n, for some n > 0, and suppose that f1(x) has degree n. Let

    f1(x) = a_n x^n + a_{n−1} x^{n−1} + ··· + a_1 x + a_0   and   f2(x) = b_m x^m + b_{m−1} x^{m−1} + ··· + b_1 x + b_0,

where b_m ≠ 0. Define a polynomial h(x) by

    h(x) = f1(x) − a_n b_m^(−1) x^(n−m) f2(x)                                    (1)
         = (a_{n−1} − a_n b_m^(−1) b_{m−1}) x^{n−1} + (a_{n−2} − a_n b_m^(−1) b_{m−2}) x^{n−2} + ··· .

Then h(x) is a polynomial of degree less than n.

Case 1. h(x) has degree less than m. In this case, let q(x) = a_n b_m^(−1) x^(n−m) and r(x) = h(x). Then r(x) has degree less than m, and by (1) we obtain f1(x) = q(x)f2(x) + r(x).

Case 2. h(x) has degree at least m. Since h(x) has degree less than n, we may apply the induction hypothesis to obtain polynomials q1(x) and r(x) such that r(x) has degree less than m and

    h(x) = q1(x)f2(x) + r(x).                                                    (2)

Combining (1) and (2) and solving for f1(x), we have

    f1(x) = [a_n b_m^(−1) x^(n−m) + q1(x)]f2(x) + r(x).

In this case, let q(x) = a_n b_m^(−1) x^(n−m) + q1(x). Then f1(x) = q(x)f2(x) + r(x), where r(x) has degree less than m. This proves the existence of q(x) and r(x).

We now show the uniqueness of q and r. Suppose that q1(x), q2(x), r1(x), and r2(x) are polynomials such that r1(x) and r2(x) each has degree less than m and

    f1(x) = q1(x)f2(x) + r1(x) = q2(x)f2(x) + r2(x).

Then

    [q1(x) − q2(x)]f2(x) = r2(x) − r1(x).                                        (3)

The right side of (3) is a polynomial of degree less than m. Since f2(x) has degree m, it must follow that q1(x) − q2(x) is the zero polynomial. Hence q1(x) = q2(x); thus r1(x) = r2(x) by (3).

In the context of Theorem E.1 we call q(x) and r(x) the quotient and the remainder, respectively, for the division of f1(x) by f2(x). The quotient and remainder for a division of polynomials with complex coefficients are found by the same long division process used for polynomials with real coefficients.
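The construction in the proof of Theorem E.1 can be sketched directly in code. The helper below is our own (not from the text) and works over the rationals, using exact arithmetic so the quotient and remainder come out exactly; polynomials are coefficient lists in increasing degree.

```python
from fractions import Fraction

def poly_divmod(f1, f2):
    """Return (q, r) with f1 = q*f2 + r and deg r < deg f2, over Q."""
    f1 = [Fraction(c) for c in f1]
    q = [Fraction(0)] * max(len(f1) - len(f2) + 1, 1)
    while len(f1) >= len(f2) and any(f1):
        # strip leading zeros so len(f1) - 1 is the true degree
        while len(f1) > 1 and f1[-1] == 0:
            f1.pop()
        if len(f1) < len(f2):
            break
        shift = len(f1) - len(f2)
        coef = f1[-1] / Fraction(f2[-1])   # a_n * b_m^(-1), as in the proof
        q[shift] = coef
        for i, b in enumerate(f2):         # subtract coef * x^shift * f2(x)
            f1[shift + i] -= coef * b
    return q, f1[:max(len(f2) - 1, 1)]

# Divide f1(x) = x^3 - x^2 + 1 by f2(x) = (x - 1)^2 = x^2 - 2x + 1.
q, r = poly_divmod([1, 0, -1, 1], [1, -2, 1])
print(q, r)   # quotient x + 1, remainder x
```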

Corollary 1. Let f(x) be a polynomial of degree n ≥ 1 with coefficients from a field F, and let a ∈ F. Then f(a) = 0 if and only if x − a divides f(x).

Proof. Suppose that x − a divides f(x). Then there exists a polynomial q(x) such that f(x) = (x − a)q(x). Thus f(a) = (a − a)·q(a) = 0·q(a) = 0.

Conversely, suppose that f(a) = 0. By the division algorithm there exist polynomials q(x) and r(x) such that r(x) has degree less than one and

    f(x) = q(x)(x − a) + r(x).

Thus r(x) must be a constant polynomial. Substituting a for x in the equation above, we obtain r(a) = 0. Since r(x) is a constant polynomial, r(x) = 0. Thus f(x) = (x − a)q(x), and therefore x − a divides f(x).

For any polynomial f(x) with coefficients from a field F, an element a ∈ F is called a zero of f(x) if f(a) = 0. With this terminology, the preceding corollary states that a is a zero of f(x) if and only if x − a divides f(x).

Corollary 2. Any polynomial of degree n ≥ 1 has at most n distinct zeros.

Proof. The proof is by induction on n. The result is obvious if n = 1. Suppose that the result is true for some positive integer n, and let f(x) be a polynomial of degree n + 1. If f(x) has no zeros, then there is nothing to prove. Otherwise, if a is a zero of f(x), then by Corollary 1 we may write f(x) = (x − a)q(x) for some polynomial q(x). Note that q(x) must be of degree n; therefore, by the induction hypothesis, q(x) can have at most n distinct zeros. Since any zero of f(x) distinct from a is also a zero of q(x), f(x) can have at most n + 1 distinct zeros.

Polynomials having no common divisors arise naturally in the study of canonical forms.
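Corollary 1 has a convenient computational form: dividing f(x) by x − a with Horner's scheme produces exactly the remainder f(a), so f(a) = 0 precisely when x − a divides f(x). The sketch below is illustrative only; the helper name is ours.

```python
def horner_divide(coeffs, a):
    """coeffs = [an, ..., a1, a0] (highest degree first); return (q, remainder)."""
    q = []
    acc = 0
    for c in coeffs:
        acc = acc * a + c
        q.append(acc)
    return q[:-1], q[-1]          # last accumulated value equals f(a)

# f(x) = x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
q, r = horner_divide([1, -6, 11, -6], 2)
print(q, r)   # remainder 0, so x - 2 divides f(x); quotient is x^2 - 4x + 3
```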

Definition. Two nonzero polynomials are called relatively prime if no polynomial of positive degree divides each of them.

For example, the polynomials with real coefficients f(x) = x^2(x − 1) and g(x) = (x − 1)(x − 2) are not relatively prime because x − 1 divides each of them. On the other hand, f(x) and h(x) = (x − 2)(x − 3) are relatively prime because they have no common factors of positive degree.

Proposition E.2. If f1(x) and f2(x) are relatively prime polynomials, there exist polynomials q1(x) and q2(x) such that

    q1(x)f1(x) + q2(x)f2(x) = 1,

where 1 denotes the constant polynomial with value 1.

Proof. Without loss of generality, assume that the degree of f1(x) is greater than or equal to the degree of f2(x). The proof will be by mathematical induction on the degree of f2(x). If f2(x) has degree 0, then f2(x) is a nonzero constant c, and we can take q1(x) = 0 and q2(x) = c^(−1), which clearly satisfy the condition.

Now suppose that the theorem holds whenever the polynomial of lesser degree has degree less than n for some n ≥ 1, and suppose that f2(x) has degree n. By the division algorithm, there exist polynomials q(x) and r(x) such that r(x) has degree less than n and

    f1(x) = q(x)f2(x) + r(x).                                                    (4)

Since f1(x) and f2(x) are relatively prime, r(x) is not the zero polynomial. If r(x) is a nonzero constant c, we obtain the conclusion as before. Suppose, then, that r(x) has degree greater than zero. Since r(x) has degree less than n, we may apply the induction hypothesis to f2(x) and r(x), provided that we can show these polynomials to be relatively prime. Suppose otherwise; then there exists a polynomial g(x) of positive degree that divides both f2(x) and r(x), so there exist polynomials h1(x) and h2(x) such that

    r(x) = g(x)h1(x)   and   f2(x) = g(x)h2(x).                                  (5)

Combining (4) and (5), we obtain

    f1(x) = q(x)g(x)h2(x) + g(x)h1(x) = g(x)[q(x)h2(x) + h1(x)],

and so g(x) divides f1(x), contradicting the fact that f1(x) and f2(x) are relatively prime. Thus f2(x) and r(x) are relatively prime. Hence by the induction hypothesis there exist polynomials g1(x) and g2(x) such that

    g1(x)f2(x) + g2(x)r(x) = 1.                                                  (6)

Combining (4) and (6), we obtain

    g1(x)f2(x) + g2(x)[f1(x) − q(x)f2(x)] = 1.

Thus

    g2(x)f1(x) + [g1(x) − g2(x)q(x)]f2(x) = 1.

Setting q1(x) = g2(x) and q2(x) = g1(x) − g2(x)q(x), we obtain the desired conclusion.
Example 1

Let f1(x) = x^3 − x^2 + 1 and f2(x) = (x − 1)^2. As polynomials with real coefficients, f1(x) and f2(x) are relatively prime. It is easily verified that the polynomials q1(x) = −x + 2 and q2(x) = x^2 − x − 1 satisfy

    q1(x)f1(x) + q2(x)f2(x) = 1,

and hence these polynomials satisfy the conclusion of Proposition E.2.
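Example 1 can be verified mechanically. The sketch below uses our own helpers (not the text's), representing each polynomial as a list of coefficients in increasing degree.

```python
def mul(p, q):
    """Product of two polynomials given as coefficient lists, low degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def add(p, q):
    """Sum of two polynomials given as coefficient lists, low degree first."""
    n = max(len(p), len(q))
    p, q = p + [0] * (n - len(p)), q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

f1 = [1, 0, -1, 1]        # x^3 - x^2 + 1
f2 = [1, -2, 1]           # (x - 1)^2
q1 = [2, -1]              # -x + 2
q2 = [-1, -1, 1]          # x^2 - x - 1
total = add(mul(q1, f1), mul(q2, f2))
print(total)              # [1, 0, 0, 0, 0]: the constant polynomial 1
```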

Throughout Chapters 5, 6, and 7 we consider linear operators that are polynomials in a particular operator T and matrices that are polynomials in a particular matrix A. For these operators and matrices the following notation is convenient.

Definitions. Let

    f(x) = a_0 + a_1 x + ··· + a_n x^n

be a polynomial with coefficients from a field F. If T is a linear operator on a vector space V over F, we define f(T) by

    f(T) = a_0 I + a_1 T + ··· + a_n T^n.

Similarly, if A is an n × n matrix with entries from F, we define f(A) by

    f(A) = a_0 I + a_1 A + ··· + a_n A^n.

Example 2

Let T be the linear operator on R^2 defined by T(a, b) = (2a + b, a − b), and let f(x) = x^2 + 2x − 3. It is easily checked that T^2(a, b) = (5a + b, a + 2b); so

    f(T)(a, b) = (T^2 + 2T − 3I)(a, b) = (5a + b, a + 2b) + (4a + 2b, 2a − 2b) − 3(a, b) = (6a + 3b, 3a − 3b).

Similarly, if

    A = ( 2   1 )
        ( 1  −1 ),

then

    f(A) = A^2 + 2A − 3I = ( 6   3 )
                           ( 3  −3 ).

The next three results utilize this notation.

Proposition E.3. Let f(x) be a polynomial with coefficients from a field F, and let T be a linear operator on a vector space V over F. Then
(a) f(T) is a linear operator on V.
(b) If β is a finite ordered basis for V and A = [T]_β, then [f(T)]_β = f(A).

Proof. Exercise.

Proposition E.4. Let T be a linear operator on a vector space V over a field F, and let A be a square matrix with entries from F. Then for any polynomials f1(x) and f2(x) with coefficients from F,
(a) f1(T)f2(T) = f2(T)f1(T).
(b) f1(A)f2(A) = f2(A)f1(A).

Proof. Exercise.

Proposition E.5. Let T be a linear operator on a vector space over a field F, and let A be an n × n matrix with entries from F. If f1(x) and f2(x) are relatively prime polynomials with coefficients from F, then there exist polynomials q1(x) and q2(x) with coefficients from F such that
(a) q1(T)f1(T) + q2(T)f2(T) = I.
(b) q1(A)f1(A) + q2(A)f2(A) = I.

Proof. Exercise.
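The computation of f(A) in Example 2, and with it an instance of Proposition E.3(b), can be checked numerically. This is an illustrative sketch; the helper `matmul` is ours.

```python
def matmul(A, B):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[2, 1], [1, -1]]      # the matrix of T from Example 2
A2 = matmul(A, A)
I = [[1, 0], [0, 1]]
# f(A) = A^2 + 2A - 3I for f(x) = x^2 + 2x - 3
fA = [[A2[i][j] + 2 * A[i][j] - 3 * I[i][j] for j in range(2)] for i in range(2)]
print(fA)   # [[6, 3], [3, -3]], matching f(T)(a, b) = (6a + 3b, 3a - 3b)
```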

In Chapters 5 and 7 we are concerned with determining when a linear operator T on a finite-dimensional vector space can be "diagonalized" and with finding a simple (canonical) representation of T. Both of these problems are affected by the factorization of a certain polynomial determined by T (the "characteristic polynomial" of T). In this setting certain types of polynomials play an important role.

Definitions. A polynomial f(x) with coefficients from a field F is called monic if its leading coefficient is 1. If f(x) has positive degree and cannot be expressed as a product of polynomials with coefficients from F each having positive degree, then f(x) is called irreducible.
Observe that whether or not a polynomial is irreducible depends on the field from which its coefficients come. For example, f(x) = x^2 + 1 is irreducible over the field of real numbers but not over the field of complex numbers, since x^2 + 1 = (x + i)(x − i).

Clearly, any polynomial of degree 1 is irreducible. Moreover, for polynomials with coefficients from an algebraically closed field, the polynomials of degree 1 are the only irreducible polynomials.

The following facts about irreducible polynomials are easily established.

Proposition E.6. Let φ(x) and f(x) be polynomials with coefficients from a field F. If φ(x) is irreducible and φ(x) does not divide f(x), then φ(x) and f(x) are relatively prime.

Proof. Exercise.

Proposition E.7. Any two distinct irreducible monic polynomials are relatively prime.

Proof. Exercise.

Proposition E.8. Let f(x), g(x), and φ(x) be polynomials with coefficients from the same field. If φ(x) is irreducible and divides the product f(x)g(x), then φ(x) divides f(x) or φ(x) divides g(x).

Proof. Suppose that φ(x) does not divide f(x). Then φ(x) and f(x) are relatively prime by Proposition E.6, and so there exist polynomials q1(x) and q2(x) such that

    1 = q1(x)φ(x) + q2(x)f(x).

Multiplying both sides of this equation by g(x) yields

    g(x) = q1(x)φ(x)g(x) + q2(x)f(x)g(x).                                        (7)

Since φ(x) divides f(x)g(x), there exists a polynomial h(x) such that f(x)g(x) = φ(x)h(x). Thus (7) becomes

    g(x) = q1(x)φ(x)g(x) + q2(x)φ(x)h(x) = φ(x)[q1(x)g(x) + q2(x)h(x)].

Thus φ(x) divides g(x).

Corollary. Let φ(x), φ1(x), φ2(x), ..., φn(x) be irreducible monic polynomials. If φ(x) divides the product φ1(x)φ2(x)···φn(x), then φ(x) = φi(x) for some i (i = 1, 2, ..., n).

Proof. We prove the corollary by induction on n. For n = 1 the result is an immediate consequence of Proposition E.7. Suppose that the corollary is true for any n − 1 irreducible monic polynomials, and that φ(x) divides the product of n irreducible monic polynomials φ1(x), φ2(x), ..., φn(x). Since

    φ1(x)φ2(x)···φn(x) = [φ1(x)φ2(x)···φ_{n−1}(x)]φn(x),

φ(x) divides the product φ1(x)φ2(x)···φ_{n−1}(x) or φ(x) divides φn(x) by Proposition E.8. In the first case, φ(x) = φi(x) for some i (i = 1, 2, ..., n − 1) by the induction hypothesis; in the second case, φ(x) = φn(x) by Proposition E.7.

We are now able to establish the unique factorization theorem, which is used throughout Chapters 5 and 7. This result states that every polynomial of positive degree is uniquely expressible as a constant times a product of irreducible monic polynomials.

Theorem E.9 (Unique Factorization Theorem for Polynomials). For any polynomial f(x) of positive degree, there exist a unique constant c, unique distinct irreducible monic polynomials φ1(x), φ2(x), ..., φk(x), and unique positive integers n1, n2, ..., nk such that

    f(x) = c[φ1(x)]^{n1}[φ2(x)]^{n2}···[φk(x)]^{nk}.

Proof. We begin by showing the existence of such a factorization, using induction on the degree of f(x). If f(x) is of degree 1, then f(x) = ax + b for some constants a and b with a ≠ 0. Setting φ(x) = x + b/a, we have f(x) = aφ(x). Since φ(x) is an irreducible monic polynomial, the result is proved in this case. Now suppose that the conclusion is true for any polynomial of positive degree less than some integer n > 1, and let f(x) = a_n x^n + ··· + a_1 x + a_0 be a polynomial of degree n. If f(x) is irreducible, then

    f(x) = a_n (x^n + (a_{n−1}/a_n) x^{n−1} + ··· + (a_1/a_n) x + (a_0/a_n))

is a representation of f(x) as a product of a constant and an irreducible monic polynomial. If f(x) is not irreducible, then f(x) = g(x)h(x) for some polynomials g(x) and h(x), each of positive degree less than n. The induction hypothesis guarantees that both g(x) and h(x) factor as products of a constant and powers of distinct irreducible monic polynomials. Consequently f(x) = g(x)h(x) also factors in this way. Thus, in either case, f(x) can be factored as a product of a constant and powers of distinct irreducible monic polynomials.

It remains to establish the uniqueness of such a factorization. Suppose that

    f(x) = c[φ1(x)]^{n1}[φ2(x)]^{n2}···[φk(x)]^{nk} = d[ψ1(x)]^{m1}[ψ2(x)]^{m2}···[ψr(x)]^{mr},    (8)

where c and d are constants, φi(x) and ψj(x) are irreducible monic polynomials, and ni and mj are positive integers for i = 1, 2, ..., k and j = 1, 2, ..., r. Clearly both c and d must equal the leading coefficient of f(x); hence c = d. Dividing by c, we find that (8) becomes

    [φ1(x)]^{n1}[φ2(x)]^{n2}···[φk(x)]^{nk} = [ψ1(x)]^{m1}[ψ2(x)]^{m2}···[ψr(x)]^{mr}.    (9)

So φi(x) divides the right side of (9) for i = 1, 2, ..., k. Consequently, by the corollary to Proposition E.8, each φi(x) equals some ψj(x), and similarly each ψj(x) equals some φi(x). We conclude that r = k and that, by renumbering if necessary, φi(x) = ψi(x) for i = 1, 2, ..., k. So (9) becomes

    [φ1(x)]^{n1}[φ2(x)]^{n2}···[φk(x)]^{nk} = [φ1(x)]^{m1}[φ2(x)]^{m2}···[φk(x)]^{mk}.    (10)

Suppose that ni ≠ mi for some i. Without loss of generality we may suppose that i = 1 and n1 > m1. Then by canceling [φ1(x)]^{m1} from both sides of (10), we obtain

    [φ1(x)]^{n1−m1}[φ2(x)]^{n2}···[φk(x)]^{nk} = [φ2(x)]^{m2}···[φk(x)]^{mk}.    (11)

Since n1 − m1 > 0, φ1(x) divides the left side of (11) and hence divides the right side also. So φ1(x) = φi(x) for some i = 2, ..., k by the corollary to Proposition E.8. But this contradicts the fact that φ1(x), φ2(x), ..., φk(x) are distinct. Hence the factorizations of f(x) in (8) are the same.

It is often useful to regard a polynomial

    f(x) = a_n x^n + ··· + a_1 x + a_0

with coefficients from a field F as a function f: F → F. In this case the value of f at c ∈ F is f(c) = a_n c^n + ··· + a_1 c + a_0. Unfortunately, for arbitrary fields F, there is not a one-to-one correspondence between polynomials and polynomial functions. For example, if f(x) = x^2 and g(x) = x are two polynomials over the field Z2 (as defined in Example 4 of Appendix C), then f(x) and g(x) have different degrees and hence are not equal as polynomials. But f(a) = g(a) for all a ∈ Z2, so that f and g are equal polynomial functions. Our final result shows that this anomaly cannot occur between polynomials over an infinite field.

Theorem E.10. Let f(x) and g(x) be polynomials with coefficients from an infinite field F. If f(a) = g(a) for all a ∈ F, then f(x) and g(x) are equal.

Proof. Suppose that f(a) = g(a) for all a ∈ F. Define h(x) = f(x) − g(x), and suppose that h(x) is a polynomial of degree n ≥ 1. It follows from Corollary 2 that h(x) can have at most n zeros. But h(a) = f(a) − g(a) = 0 for every a ∈ F, contradicting the assumption that h(x) has positive degree. Thus h(x) is a constant polynomial, and since h(a) = 0 for each a ∈ F, it follows that h(x) is the zero polynomial. Hence f(x) = g(x).
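The Z2 anomaly preceding Theorem E.10 is easy to exhibit directly. The sketch below is illustrative only, modeling Z2 arithmetic with Python integers mod 2.

```python
# Over Z2 = {0, 1}, f(x) = x^2 and g(x) = x are distinct polynomials
# (different degrees) yet equal as functions on the field.
def f(a):
    return (a * a) % 2

def g(a):
    return a % 2

assert all(f(a) == g(a) for a in (0, 1))   # equal at every point of Z2
print("f and g agree on all of Z2, although deg f = 2 and deg g = 1")
```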

Answers to Selected Exercises

SECTION 1.1

1.

in parts

the pairs

Only

and

(b)

Z(a)
(3, -2, 4) +
9,-3)
+
0,
7,
(c) (3,
2)
-10)
3. (a) (2, -5, -1) + t,(-2,
(c) (-8, 2, 0) +
1, 0)

are

(c)
=

\302\243(-8,

parallel.

= x

\302\243(0,

9,1)+ t2(-5,

\302\243:(9,

12,

2)

-7, 0) = x

t2(K

SECTION 1.2

1 (a) T
Xg)

3. M13= 3,
4. (a) /

(b)

(c)

(d)

(e) T

(h)

(i)

(j) T

(k) T

14.

M22

2x4 + x3

(g)

10x7

287

10

2x

15*

fails.

4)

(VS

-12\\

\\4

+
+ 2x2 - 30x4 + 40x2-

(e)

- 5
20

(c)/8

3 9/

V_4

No,

4, and

2\\

13.

M21

(f) F

Yes.

15. No.

20.

2mn

1.3

SECTION

5\\

,
: the

-1/

(d) F

(c) T

(b) F

1. (a) F
/-4
2. (a)

trace

is

ff
\342\200\224

5.

(e) T

(f)


(c) /-3

(e) No.

8. (a) Yes.
11.

No.

(f)

not closed under

set is

the

No,

(e)

1/

-2

6\\

addition.

14. Yes.

SECTION1.4

1.

2.

(a)

(a)

{x2(l,

0) +

1, 0,

0, -2,

x4(-3,

1)

4-

(5,

(a)

Yes.

No.

(c)

(-4,

(e)

(c) F

(b) T

1. (a) F

(e) T

(d) F

(f)

(f)

1.6

SECTION

1. (a) F

(b) T

(c) F

(g)

(h) T

(i) F

(a)

Yes.

Yes.

(c)

4. No. .

(e) T

(d) F

(k) T

(j) T

(e)

No.

(e) No.

(c) No.

3. (a) No.

5.

0, 5): x3, x4 6 K}

No.

2.

3, 0,

1.5

SECTION

8.

x4 eR}

No.

(e)

(c) Yes.

4. (a) Yes.

No.

8.

{Xj,

X3,

9. (flls a2,

14. n2 - 1

16.

{n(n

21.

4,

0)

(e)

0): x23

0,

(c) There are no solutions.


-3,
1, 0, 0) + x4(-3, 2, 0, 1,
{x3(10,

3.

(f) F

(e) T

(d) F

(c) T

(b)~

dimfWJ

SECTION

1. (a) F

X5, x7j

a3, a4) =

axxx

(a2

- ax)x2 +

(a3 -

1)

- 3,

dim(W2)

2,

+ W2)

dim^

a2)x3

- 4, and

(a4

dim(W,

1.7

(b) F

(c)

(d)

(e)

(f) T

a3)x4

nWJ


SECTION 2.1

1,

(a)

2.

The

(b)

(c)

is 1, and

nullity

the rank is Z

4. The nullityis 4, and the rank is 2.


is 3. T
5. The nullity is 0, and the
T

rank

10. T(2,3) = (5,11).


T

SECTION

1. (a) T

is

not

is

neither

(g)

nor

one-to-one

(c)

(d)

No.

(e)

(e)

(f) F

(c) (2 1

(g) (1 o

SECTION 2.3

1.

(a)

(g) F

2. (a)

(b)

(h) F

(c)

3C)

(d)

(i) T

'20
A(2B

(f)

(J) T

-9

10

18\\
8

and

A(BD)

onto.

but not onto.

is one-to-one

2.2

(b) T

is onto.

but

one-to-one

is

12.

one-to-one.

(f) F

(e) F

(d) T

29

\\

26.

GO F


-\302\273-*\"\302\273

w*-e\"\302\273)

- [U]J -

and

I,

[UT]J

\302\253

-6,

(6) No.

SECTION 2.4

1. OOF

-(b)

(g) T

17.

(b)

0 10
or

(e)T

(d)

(e)

(f) F

10

,0

01

(d)

(i) T

(h) T

/10

SECTION

(c)F

0
1,

2.5

1. (a) F

(b) T

6. T(x, y) =
1

+mz

(c) T

=-((1-

m2)x

2my,

2mx

+ (m2

l)y)

505

Exercises

to Selected

Answers

SECTION 2.6

1. (a)

2.

j(x,

y,

5. The basis

for

7.

{pt(x%

+
\342\200\224\302\243

(b) T

2. (a) F

(b) F

(a)

bx)= -3a -

x.

Ab

2.7

1. (a) T

{e~\\

SECTION

(c)

\302\243-JW)

(f)

(c) T

(b) F

(e)

te1}

{1,

el

{e~\\

cos

(e)

(f) F

2 transforms

column

1 to

column

into

\\

3.2

(c) 2

4.

/l

D =

(h)

\\0

5. (a) The

(c) T

(b) F

1. (a) F
(g) T
2. (a) 2
(a)

2t,

g\"4*, e~21}

(d)

(i)

The rank is

(e)

The

rank

(f)

(g) 1
; the

is

rank

2.

0/

rank is 2, and

(c)

0\\

1 0 0
0

(e)

(e) 3
0

(d)

the

is t

inverse

2: so no inverse

is 3,

(i) T

(h) F
\342\200\2242
times

SECTION

(g)

3.1

1. (a) T
(g) T
Adding

{e~\\ te~\\ e\\

(c)

(e) T
(e)

(d) F
(d) T

(c) F
(c) T

te\"1}

4. (a) {e^fW\\

2.

functionals.
(e), and (f) are linear
f
2(x,
y9 z)
\\yt and f3(x, y, z) = - x + z
= 2 - 2x and p2(x)
=
p2(*)}> where pt(x)

(a) T'(f) = g, where g(a +

SECTION

3.

- {y,

is

(h) F

(a), (c),

parts

= x
z)

(g) T

(f) T

(e) F

(d) T

(c)
in

functions

The

3. (a)

(b)

exists.

and the inverse is

\\

\\

B.

el sin

It}

(g) The rank is 4, and the

6.

(a)

+ bx

J~l(ax2

is

inverse

-ax2 -

+ c) -

(4a

- (10a +

b)x

(c) l~\\a,Kc)^^a~^b + kc^~\\^~^a +


6, c) = &z - 6 + ic)x2+ (-ja +ic)x
(e) T~l(o,
1.

/l

10

0\\ /l

0\\ /l

\\0

o'

o\\

/l

o\\

\\c)

/l

l/\\0

1/

\\0

1. (a)

10

0-200100

110
1/

^b

26 + c)

1/

\\0

-1

SECTION 3.3
(OF

(g)T

2.

to Selected

Answers

506

(a)

r, s,te

Exercises

Selected

to

Answers

Exercises

4. (b)

(3)

6.

T-l{(Ul\302\273

teR

\302\243

solutions.
7. The systems in parts (b), (c), and (d) have solutions.
11. The farmer, tailor, and carpenter must have incomes in the proportions …
13. There must be 7.8 units of the first commodity and 9.5 units of the second.

SECTION 3.4

(c)T

(b)T

l.(a)F
2. (a)

4\\

(c)

2>

j
(e)

/4\\

/V

/4\\

4(a)

vO/

(c)

There

\\

are no

2/

solutions.

(d)T.

(e)

(f)

4:3

(g)T

Answers to SelectedExercises

508

SECTION 4.1

1.

(a)

2.

(a)

30

4.

(f) F

14

(c)

4.2

SECTION

(c)
(b) T
(c) T49

1. (a) T
3. (a) -34

4.

(f) F

(c) -24

15/

19

(a)

(e) T

-8

(c)

-10+

3. (a)

(d) F

(c) T

(b)

in

functions

The

(c), (d),

parts

(d)

(e)

and (g) are

3-linear.

SECTION 4.3

1.

(a)

(h) T
2. (a) 90
3.

(a)

(b)T

(i) T
\"

16. /%
(a) xi

(d) T

(j)

(k)

(e) F

(m) F

(I)

(f) T

(c)

(g) -102 +

(e) 86

(c)~ 0

100

i\302\243

(c)

76/

Mil-Mil

M22-M12

SECTION 4.4

1.

(a)

(b)

(g)

(h)

2. (a)

22

3. (a)

-12

4. (a)

88

SECTION

1. (a) F
(g) F

(c)

(c)

(e) F
(k) T

(d) F
(j) T

(c) T
(i) T

(f) T

- 4/
22

(e)

-3

(e) 17-

(c) -6

3/

+ 24i

24

(g)

5.1

(b) T
(h)

(c) T
(i)

(d)

(e)

(j)

(k) F

(f)

(g) F

2. (a)

Exercises

Selected

to

Answers

[LJ

1 2

1 1 l\\

(c)

e
3. (a) The

2/

eigenvaluesare 4 and

The

basis

1,

are 1 and

eigenvalues

of eigenvectors

\342\200\224

1,

basis

is

'4

and

3)'
(c)

\342\200\224

of eigenvectors

is

and

ff

\342\200\242

,-

1-r

i-i

i-\302\253\\

eigenvalues are 1, 2, and 3, and a basis

4. The

of

eigenvectors

25. 4

SECTION 5.2

L (a)
(g) T
F

2.

(a)

(b)

(h) T

Not

(c)

(i)

(e) T

(d)
(c)

diagonalizable.

(e) Not diagonalizable.

(g)

\\

3, (a)

Not

(d) p

(e)

7.

= {x - x2,1 -

Not

(c)

diagonalizable.

x2, x

ir
5\"

2(-1)\"

_2(5\")

2(-1)\"

(-1)\"

2(5\")

(-1)\"

_+

A\"
5n

13. (b)

X{t) = cle3gl

(c)

X(t)

e'

A + c2e

2i

diagonalizable.

+ x2}

(f) T

is {1,

x, x2}>


SECTION 5.3
L

(b) T

(a)

(g)T

(h)

0\\

2. (a) /0
0

(g)

(c)
o

6. One month

41%are
7/.

(j)

/\302\276

-M

&)

arrival

after

and

bedricfden,

25%

of

14%

have

limit

No

(e)

(i) No

-1\\

(f) T

(e) T

exists.

limit exists

the

have recovered, 20% are ambulatory,


die.
and
died; eventually f\302\247recover
J\302\243
patients

the matrices in parts (a) and (b) are regulartransitionmatrices.

8. Only
9.

(d) F

(i)

O)

/__

(c) F

(a)

(e)

/0

(c) No limit exists.

i\\

10. (a) /0.225\\

0^

\\

\\

/0

(g)

0\\

/0.20\\

;;

0.441

'

two

after

0.60

and

stages

^0.334/

eventually.

\\0,20/

/0.50\\

(c) /0.372\\

0.225

two

after

0,20

and

stages

eventually.

\\0.30/

0.403/

(e) ^0.329\\

A\\

two

after

0.334

and

stages

eventually.

vO.337,

12.

13.

In

and

once-used,
-\302\276

new,
-\302\276

twice-used.

-\302\276

1995 24% will own large cars, 34%will

own small cars; the corresponding

eventual

18.

=
e\302\260

I and

SECTION

1. (a)
2. The
F

subspaces

intermediate-sized

own

are

proportions

will
0.30, and 0.60.

cars,
0.10,

el = eL

5.4
(b)

(c)
in

parts

F
(a), (c),

(d)

(e) T

and (d) are

(f) T

T-invariant.

(g) T

and 42%

to

Answers

Exercises

Selected

511

6- (a)

a:

9.

12.

-t(t2-3t

(a)

(a)

t(t

+ 3)

(c)

- 3t +.3)
/2 -2

- l)(t2

18. (c)
i

30. (a) 6-6t +

(t

(c)

l)3(t +

1)

-4N

,0

-2

(c)

A~l^\\\\

1\342\200\224t

t2

6.1

SECTION

(b) T

1. (a) T

2. <x,

>>>

8 +

5h

\\\\x\\\\

(d)

77,

\\\\y\\\\

(e)

^14,

(f)

and ||x +

y\\\\2

(g) T

F
-

37,

;2-l
3.

A/11

</,0>-l,

and ||/ +

16.

(b)

^1(-.

Ilf/ll

11 + 3e2

No.

SECTION 6.2

1.
2.

(a)

(b)

The

(b)

is

basis

orthonormal

J6

3
1,
3~(1,

The

Fourier

are

coefficients

1,

1),^-(-2,
2^/3/3,

(c) The orthonormalbasisis {1,

-^/6/6,

The

Fourier

5.

51

is the

coefficients

^),

26\\

/2S>\\

17
14

18. (b)

1/./14

\\

1)|.

and ^/2/2.
\342\200\224

6%/5(x2

4-

\302\243)}.

^/3/6, and 0.
to

to

17V104,

1),^-(0,

origin that is perpendicular x0;

is perpendicular the
(b)

-1,

1)}

plane through the

the origin that


17. (a) J_/

are 3/2,

J2

\342\200\224

2y3(x

4 51 = span{ft-i(l + 0,

(g) T

(f) F

(e) T

(d) F

(c) T

plane

containing

xt

Sq

is the

and x2-

line through


SECTION 6.3

1.

2.

(a)

(a)

3. (a)

(b)

- (1,

T*(x)

210x2
-2, 4)
(c) y
(11, -12)
(c) T*(/(x))

14. T*(x) = <x, z)y


18. The line is y =
19.

=
\302\243

0.

The

spring

20. x = f,

1. (a) T

(b) F

(g) F

(c)

the

self-adjoint;

(b)

T is

(c)

9x

is

the parabola

1, and

\302\2432/3

4\302\243/3

2 with

2.1.

(e)

basis

is

(e)

(d)

orthonormal

normal but not

(f) T

{\342\200\224-(1,

1)

-2),-=(2,

self-adjoint.

normal.

not

is

6.5

SECTION

1. (a) T
F

(g)

33

= 69x2-

(h)T
is

with

(g)T

6.4

SECTION

2. (a)

It + f

constant is approximately
= ^
f, 2

- 204x +

(f) T

(e) F

(d) T

(c)

(b).F

(c)

(h).F

(i) F

(d)

(OT

J)

and

4. Tz

if
5.

is normal for allz e


=

\\z\\

Only

21.

(a)

Tz

is

if and

self-adjoint

only if

z e K; Tz

1.

the pair of

matrices in part (d) is unitarilyequivalent.

\342\200\224=

The

\342\200\224=
x'

x'

form

2
H

==

is 3(x')

V2
(/)2.

x'+-==/

V13
form

is 5x2

y'

\342\200\224

and

V13
quadratic

x'

\342\200\224p

V2

quadratic

13
new

72

new

x-

The

and

y'

V2

3
(c)

C;

\342\200\224

8/2.

13

is

unitary

if and

only

Answers to SelectedExercises

23. (c)

J_

/J_

__

7SV

J*

Lfi

./2

\\

2./2
\\

and

\342\226\240A

1
0

'3

(e) xj =

3, x2 = -5,

3. (2) (a)
(d)

2)}),

span({(l,

1,(0, b) =
b3 c)

lx(a,

T2(a,6,

c)

%a

[T],
6, a

+ b)

2?

(f)

5/

and T2(a, b)

c, -a +

+ 6 +

6+0,0

(e)

'15
,5

i(2a - b -

=|(a

(d)

2b

Ha-

c, -a

0,0 + 6 +

6, -a +
- b + 2c) and
b)

0)

6.7

SECTION

(g) F

4(a)

(c) T

(b) T

1. (a) F

1. (a)

6.6

SECTION

2. For

x3

Yes.

(b)

(c)

(i) T

(h) F
(b)

No,

(c)

(d)

(e)T

(j) F
No.

(d)

Yes.

(0
(b)

-4

0>

-8 0,
16. (a)

Same as part

iwv

(a).

514
17.

to

Answers

Exercises

Selected

as Exercise 16(c).

Same

21. (a)

and

,0

(b)

'2
D

and

(c)
D

and

SECTION 6.9

1. (a)
2. (a) ^/l8
F

(d) F

approximately

234

(b)

(c)

(c)

4 (a) \\\\A\\\\ * 84.74,


(b) ||jc-i4*-!i>||<

\342\200\242

P~^IK

0.001

14.41

\\\\b\\\\

\\\\b\\\\

*'UlO

*g\"*,

6.

P
J5

\342\200\2242
*=

11^11

7*

1. (a) F
(g) F
3. (b)

(b)

\302\276

and

cond(B)=2

6.10

SECTION

4.

and

\302\253
0.17

-b\\\\

_J|*>->1*||
< cond(X)

M-^ir5.

H^jc

l\\\\

\\\\A~

cond(yl)\302\2531441

and

\302\253
17.01,

\\\\A'l\\\\

(e) F

\\

(c) T

(b) T

(i) T

(h) T

(d)

<j)

u/y

teR

\\n

teR>

\"

J x /'cos
<t\\

and

ifc\302\243=0

(1)

+
(\302\243

l'
t

sin (^

7. (c) There

(2)

(f) F

(e)

Any

are six possibilities:


line

the

through

origin

if

<j)

= 0.

ij/

/0'
t|0

teR}

|:

3S

l/f

H-

if

0 and

ij/

jr.

l\\

(ei?>

if(\302\243

7i and

ij/

^n.

6 K > if

&
c\302\243

Answers

Exercises

to Selected

(4)

cos

sin

\\

(5)

l\\

if

teR}

i/r

rc and

<j)

/sin

\342\200\224

sin

\\sin

sin

i/r(cos

7r.

1)1

^+

c\302\243
(cos

if^-^

teR}

}:

\302\243

i/r

\302\243
e

|:

otherwise.

)\342\200\242

\302\242+1)

SECTION 7.1
1. (a) … (b) F (c) F (d) T (h) T
2. (a) For λ = 2, {…} is a basis for both Eλ and Kλ.
(b) For λ = −1, {…} is a basis for Eλ and Kλ; for λ = 2, {…} is a basis for Eλ and Kλ.
(c) For λ = 2, {…} is a basis for Eλ, and R² is a basis for Kλ.
3. (a) {…} is a basis for Eλ, and {…} is a basis for Kλ. (b) … (c) …

SECTION 7.2
1. (a) T (d) T (e) T (f) F (g) F (h) T
2. …
3. (a) …
(b) For A₂: …
(d) p₁ = (t − 2)⁵(t − 3)² and p₂ = …
(e) (1) rank(U₁) = 3 (2) rank(U₂) = 0 (3) nullity(U₁) = … (4) nullity(U₂) = …
4. (a) … and Q = …
6. The Jordan canonical form is … and a Jordan canonical basis is {2eˣ, 2xeˣ, x²eˣ, e²ˣ}.
23. (a) (c₁ + c₂t)e^… + c₃e^… (b) (c₁ + c₂t + c₃t²)e^…

1- (a) F

(b)T

(g) F

(h)T

(a)

(t

(a)

2c3

t2

(d) (t
For

- 3)

(d) F

(c) F

(c)

(t

(e) T

(f)

- 2)

l)2(t

2)2

(d)(\302\243

3.

2c3t)

7.3

SECTION

2.

+ (c2 +

c3t2)

(2),

(c) (t -2)2

m +1)
For

(a);

(3), (a)

5. The operators are T0,

I,

and (d).
and

those

operators

both 0 and

having

1 as eigenvalues.

SECTION 7.4
1. (a) … (d) T (e) F (f) T (g) F
2. (a) … (b) … (c) … (e) …
A⁻¹ page 86
A* page 296
𝓑(V) page 355
C([0, 1]) page 296
det(A) page 193
det(T) page 219
dim(V) page 40
δᵢⱼ page 75
e^A page 280
f(A) page 234
Fⁿ page 8
𝓛(V, W) page 69
N(T) page 57
nullity(T) page 59
P(F) page 9
Pₙ(F) page 16
R page 485
R(T) page 57
rank(A) page 132
rank(T) page 59
S₁ + S₂ page 19
span(S) page 27
T₀ page 55
T⁻¹ page 85
T* page 315
T_W page 282
[T]_β page 67
V* page 102
V/W page 20
W₁ ⊕ W₂ page 19
Z₂ page 486
β* page 102
⟨x, y⟩ page 295

Index of Theorems

Proposition 1.1 page 10
Proposition 1.2 page 11
Theorem 1.3 page 15
Theorem 1.4 page 17
Theorem 1.5 page 27
Theorem 1.6 page 34
Theorem 1.7 page 36
Theorem 1.8 page 37
Theorem 1.9 page 37
Theorem 1.10 page 38
Theorem 1.11 page 44
Theorem 1.12 page 50
Theorem 1.13 page 51
Theorem 2.1 page 57
Theorem 2.2 page 58
Theorem 2.3 page 59
Theorem 2.4 page 60
Theorem 2.5 page 60
Theorem 2.6 page 61
Theorem 2.7 page 68
Theorem 2.8 page 69
Theorem 2.9 page 73
Theorem 2.10 page 73
Theorem 2.11 page 74
Theorem 2.12 page 75
Theorem 2.13 page 76
Theorem 2.14 page 77
Theorem 2.15 page 77
Theorem 2.16 page 79
Theorem 2.17 page 80
Theorem 2.18 page 85
Theorem 2.19 page 86
Theorem 2.20 page 88
Theorem 2.21 page 88
Theorem 2.22 page 89
Theorem 2.23 page 95
Theorem 2.24 page 96
Theorem 2.25 page 102
Theorem 2.26 page 103
Theorem 2.27 page 104
Theorem 2.28 page 111
Theorem 2.29 page 112
Theorem 2.30 page 113
Theorem 2.31 page 114
Theorem 2.32 page 115
Theorem 2.33 page 115
Theorem 2.34 page 118
Theorem 2.35 page 119
Theorem 2.36 page 120
Theorem 3.1 page 130
Theorem 3.2 page 130
Theorem 3.3 page 133
Theorem 3.4 page 133
Theorem 3.5 page 135
Theorem 3.6 page 139
Theorem 3.7 page …
Theorem 3.8 page 149
Theorem 3.9 page 151
Theorem 3.10 page 152
Theorem 3.11 page 153
Theorem 3.12 page …
Theorem 3.13 page 156
Theorem 3.14 page 160
Theorem 3.15 page 165
Theorem 4.1 page 172
Theorem 4.2 page 173
Proposition 4.3 page 183
Proposition 4.4 page 184
Proposition 4.5 page 186
Theorem 4.6 page 190
Theorem 4.7 page 191
Theorem 4.8 page 192
Theorem 4.9 page 193
Theorem 4.10 page 197
Theorem 4.11 page 198
Theorem 4.12 page 200
Theorem 5.1 page 215
Theorem 5.2 page 215
Theorem 5.3 page 216
Theorem 5.4 page 217
Theorem 5.5 page 219
Theorem 5.6 page 220
Theorem 5.7 page 220
Theorem 5.8 page 222
Theorem 5.9 page 222
Theorem 5.10 page 231
Theorem 5.11 page 233
Theorem 5.12 page 234
Theorem 5.13 page 237
Theorem 5.14 page 238
Theorem 5.15 page 246
Theorem 5.16 page 248
Theorem 5.17 page 253
Theorem 5.18 page 254
Theorem 5.19 page 255
Theorem 5.20 page 259
Theorem 5.21 page 264
Theorem 5.22 page 266
Theorem 5.23 page 266
Theorem 5.24 page 268
Theorem 5.25 page 269
Theorem 5.26 page 282
Theorem 5.27 page 283
Theorem 5.28 page 284
Theorem 5.29 page 286
Theorem 5.30 page 288
Theorem 6.1 page 298
Theorem 6.2 page 299
Theorem 6.3 page 305
Theorem 6.4 page 307
Theorem 6.5 page 308
Theorem 6.6 page 310
Theorem 6.7 page 312
Theorem 6.8 page 315
Theorem 6.9 page 315
Theorem 6.10 page 316
Theorem 6.11 page 317
Theorem 6.12 page 320
Theorem 6.13 page 321
Theorem 6.14 page 326
Theorem 6.15 page 327
Theorem 6.16 page 327
Theorem 6.17 page 329
Theorem 6.18 page 334
Theorem 6.19 page 337
Theorem 6.20 page 337
Theorem 6.21 page 338
Theorem 6.22 page 339
Theorem 6.23 page 350
Theorem 6.24 page 351
Theorem 6.25 page 356
Theorem 6.26 page 357
Theorem 6.27 page 359
Theorem 6.28 page 360
Theorem 6.29 page 362
Theorem 6.30 page 367
Theorem 6.31 page 373
Theorem 6.32 page 388
Theorem 6.33 page 390
Theorem 6.34 page 391
Theorem 6.35 page 393
Theorem 6.36 page 401
Theorem 6.37 page 403
Theorem 6.38 page 407
Theorem 6.39 page 410
Theorem 6.40 page 410
Theorem 7.1 page 419
Theorem 7.2 page 420
Theorem 7.3 page 422
Theorem 7.4 page 422
Theorem 7.5 page 423
Theorem 7.6 page 425
Theorem 7.7 page 428
Theorem 7.8 page 433
Theorem 7.9 page 434
Theorem 7.10 page 443
Theorem 7.11 page 451
Theorem 7.12 page 452
Theorem 7.13 page 452
Theorem 7.14 page 454
Theorem 7.15 page 454
Theorem 7.16 page 461
Theorem 7.17 page 462
Theorem 7.18 page 463
Theorem 7.19 page 464
Theorem 7.20 page 466
Theorem 7.21 page 467
Theorem 7.22 page 467
Theorem 7.23 page 470
Theorem 7.24 page 477
Theorem 7.25 page 477
Theorem C.1 page 486
Theorem C.2 page 487
Theorem D.1 page 489
Theorem D.2 page 490
Theorem D.3 page 490
Theorem D.4 page 491
Theorem E.1 page 492
Proposition E.2 page 494
Proposition E.3 page 496
Proposition E.4 page 497
Proposition E.5 page 497
Proposition E.6 page 497
Proposition E.7 page 497
Proposition E.8 page 498
Proposition E.9 page 498
Proposition E.10 page 500
Index

Absolute value, 490-91
Absorbing Markov chain, 273
Absorbing state, 273
Addition of vectors, 1-2
Additive inverse of a vector, 6, 10-11
Adjoint: of a linear operator, 316-18, 388-92; of a matrix, 296, 318-22
Algebraic multiplicity, 233
Algebraically closed field, 492
Alternating n-linear function, 184-87
Angle between two vectors, 174, 300
Annihilator of a subset, 108
Approximation property of an orthogonal projection, 349
Augmented matrix, 141, 153
Auxiliary polynomial, 112, 114-15, 118-20
Axis of rotation, 406
Back substitution, 164
Backward pass, 164
Basis, 35-41, 50-52; cyclic, 460; dual, 102; Jordan canonical, 417; ordered, 66; orthonormal, 304, 308, 327-29; rational canonical, 461; standard basis for Fn, 35; standard basis for Pn(F), 36; standard ordered basis for Fn, 66; standard ordered basis for Pn(F), 66
Bessel's inequality, 314
Bilinear form, 355-66; index, 378; invariants, 378; matrix representation, 357-60; product with a scalar, 356; rank, 376; signature, 378; sum, 356; symmetric, 360-63, 367-68; vector space of, 355
Cancellation law for vector addition, 10
Canonical form: Jordan, 416-50; rational, 459-79
Cauchy-Schwarz inequality, 298
Cayley-Hamilton theorem: for a linear operator, 284; for a matrix, 285, 332
Chain of sets, 49
Change of coordinate matrix, 95-98
Characteristic of a field, 487-88
Characteristic polynomial, 221-22, 329
Characteristic value (see Eigenvalue)
Characteristic vector (see Eigenvector)
Classical adjoint: of a 2 × 2 matrix, 181; of an n × n matrix, 205; properties, 209-11; uniqueness, 193
Clique, 81
Closed model of a simple economy, 154-56
Coefficient matrix of a system of linear equations, 147, 152
Coefficients: of a differential equation, 109-10; of a polynomial, 8-9
Cofactor, 187-88
Column sum of matrices, 264
Column vector, 7
Companion matrix, 460
Complex number, 488-92; absolute value, 490-91; conjugate, 489-90; imaginary part, 488; real part, 488
Composition of linear transformations, 72-75
Condition number, 403-4
Conditioning, 398
Congruent matrices, 359-60, 379-80, 384-85
Conic sections, 341-44
Conjugate of a complex number, 489-90
Conjugate transpose of a matrix, 296, 318-22
Consumption matrix, 155
Convergence of matrices, 252-56
Coordinate function, 102-3
Coordinate vector, 66, 77-78, 95
Corresponding homogeneous system of linear equations, 151
Coset, 19
Cramer's rule, 200-201, 204
Critical point, 373
Cycle of generalized eigenvectors, 419; end vector, 419; initial vector, 419; length, 419
Cyclic basis, 460
Cyclic subspace, 281, 283
Degree of a polynomial, 8, 112
Demand vector, 156
Determinant, 171-213; evaluation, 193-97, 206-9; of a linear operator, 219-20, 407, 410; of a 2 × 2 matrix, 181; of an n × n matrix, 205; properties, 209-11; uniqueness, 193
Diagonal matrix, 16
Diagonal of a matrix, 16
Diagonalizable linear operator, 216, 238-39
Diagonalizable matrix, 216
Diagonalization: algorithm, 239; simultaneous, 251, 291, 293, 332, 354; tests, 238-39, 431
Differential equation, 109-20, 122-23; auxiliary polynomial, 112, 114-15; homogeneous, 110-15, 118-20, 457; nonhomogeneous, 122-23; order, 110; solutions, 110-15; solution space, 113-14, 118-20
Differential operator, 112, 115-19; order, 112, 115, 117-18
Differentiation, 112, 121
Dimension, 40-41, 44-46, 88-89, 102, 358
Dimension theorem, 59
Direct sum: of matrices, 287-88; of subspaces, 19, 48-49, 64, 84, 245-48, 286, 323, 346, 348, 351, 410-12, 414, 428, 458, 477
Distance, 303
Division algorithm, 492-94
Dominance relation, 81-82
Dot diagram: for Jordan canonical form, 432-34; for rational canonical form, 469-71
Dot product (see Inner product)
Double dual, 102, 104-5
Dual basis, 102
Dual space, 102-5
Echelon form (see Row echelon form)
Economics (see Leontief, Wassily)
Eigenspace, 234-38, 351; generalized, 420-22
Eigenvalue of a linear operator or matrix, 217-18, 222, 224-26, 264-69, 327, 373-76, 401-3
Eigenvector, 217-18, 327-29; generalized, 418
Einstein, Albert (see Special theory of relativity)
Elementary column operation, 128-32
Elementary matrix, 129, 191-92
Elementary operation, 128-32
Elementary row operation, 128-32
Ellipse (see Conic sections)
End vector (of a cycle), 419
Equal functions, 483
Equal matrices, 8
Equal polynomials, 9
Equilibrium condition for a simple economy, 155
Equivalence relation, 91, 346, 382, 384-85, 483
Equivalent systems of linear equations, 147
Euclidean norm, 400-404
Even function, 13, 19, 314
Exponential function, 113-20
Exponential of a matrix, 280
Extremum (see Local extremum)
Field, 484-88; algebraically closed, 492; cancellation laws, 486; characteristic, 487-88; of complex numbers, 488-92; product of elements, 485; sum of elements, 485
Finite-dimensional vector space, 40, 44
Fixed probability vector, 269
Forward pass, 164
Fourier, Jean Baptiste, 309
Fourier coefficient, 102, 309, 349
Function, 483-84; domain, 483; image, 483; invertible, 484; one-to-one, 484; onto, 484; preimage, 483; range, 483; restriction, 484
Fundamental theorem of algebra, 491
Gaussian elimination, 164-65, 347
Generalized eigenspace, 420-22
Generalized eigenvector, 418
Generates, 28, 41, 50
Generator of a cyclic subspace, 281
Genetics, 273-76
Geometry, 338-44, 406-12
Gerschgorin's disk theorem, 264
Gram-Schmidt orthogonalization process, 307, 347
Gramian matrix, 331-32
Hardy-Weinberg law, 276
Hermitian operator or matrix (see Self-adjoint linear operator or matrix)
Homogeneous linear differential equation, 110-15, 118-20, 457
Homogeneous polynomial of degree two, 367-69
Homogeneous system of linear equations, 149-50
Hooke's law, 109, 324
Identity matrix, 75-76, 79
Identity transformation, 55
Ill-conditioned system, 398
Image (see Range)
Imaginary part of a function, 110
Incidence matrix, 80-82
Index: of a bilinear form, 378; of a matrix, 379
Infinite-dimensional vector space, 40
Initial probability vector, 261
Initial vector (of a cycle), 419
Inner product, 295-300; standard, 296
Inner product space, 297; complex, 297; H, 297, 300, 306, 309, 325, 349-50; real, 297
Input-output matrix, 155-56
Invariant subspace, 64-65, 280-83
Invariants: of a bilinear form, 378; of a matrix, 379
Inverse: of a function, 484; of a linear transformation, 85-87, 144; of a matrix, 86-87, 91, 142-43
Invertible function, 484
Invertible linear transformation, 85-87
Invertible matrix, 86-87, 95, 192, 403
Isometry, 333-35, 338-40
Isomorphic vector spaces, 87-89
Isomorphism, 87-89, 104-5, 357-58
Jordan block, 417
Jordan canonical basis, 417
Jordan canonical form: of a linear operator, 419-50; of a matrix, 434
Kernel (see Null space)
Kronecker delta, 75, 300
Lagrange interpolation formula, 42-44
Lagrange polynomials, 43-44, 93, 107
Least squares approximation, 318
Least squares line, 318-21
Left-multiplication transformation, 78-80
Length of a vector (see Norm)
Leontief, Wassily, 154; closed model, 154-56; open model, 156-57
Light second, 385, 414
Limit of a sequence of matrices, 252-56
Linear combination, 21, 27, 36, 183-84
Linear dependence, 31-34, 37
Linear functional, 101
Linear independence, 33-34, 39, 305
Linear operator, 214 (see also Linear transformation); adjoint, 316-18, 388-92; characteristic polynomial, 221-22; determinant, 219-20; diagonalizable, 216, 238-39; differential, 112, 115-19; eigenspace, 234-38, 351; eigenvalue, 217-18, 327; eigenvector, 217-18, 327; invariant subspace, 64-65, 280-83; isometry, 338-40; Jordan canonical form, 417-50; minimal polynomial, 451-56; nilpotent, 446; normal, 326-28, 352-53; orthogonal, 333-40, 406-12; partial isometry, 346; positive definite, 331-33; positive semidefinite, 331-32; rational canonical form, 459-79; reflection, 56, 97-98, 339, 406-12; rotation, 56, 336-44, 406-12; self-adjoint, 329; simultaneous diagonalization, 251, 291, 293, 354; skew-symmetric, 354; spectral decomposition, 352; spectrum, 351; unitary, 333-38, 352-53
Linear space (see Vector space)
Linear transformation, 54-55; composition, 72-75; identity, 55; image, 57-60; inverse, 85-87, 144; invertible, 85-87; isomorphism, 87-89, 104-5, 357-58; kernel, 57-60, 114-18; left-multiplication, 78-80; matrix representation, 67-70, 74-76, 95-98; null space, 57-60, 114-18; nullity, 59-60; one-to-one, 60; onto, 60; product with a scalar, 68-69; range, 57-60; rank, 59-60, 139; sum, 68-69; transpose, 103-4, 107-8; vector space of, 68-69, 88-89; zero, 55, 69
Local extremum, 373, 384
Local maximum, 373, 384
Local minimum, 373, 384
Lorentz transformation, 388-95
Lower triangular matrix, 198
Markov chain, 260
Markov process, 260
Matrix, 7; augmented, 141, 153; change of coordinate, 95-98; classical adjoint, 181, 205; coefficient, 147, 152; companion, 460; condition number, 403-4; congruent, 359-60, 379-80, 384-85; conjugate transpose, 296, 318-22; consumption, 155; convergence, 252-56; determinant, 171-213; diagonal, 16; diagonalizable, 216; direct sum, 287-88; eigenspace, 234; eigenvalue, 217, 373-76, 401-3; eigenvector, 217; elementary, 129, 191-92; equality, 8; exponential, 280; Gramian, 331-32; identity, 75-76, 79; incidence, 80-82; index, 379; input-output, 155-56; invariants, 379; inverse, 86-87, 91, 142-43; invertible, 86-87, 95, 192, 403; Jordan canonical form, 434; limit, 252-56; lower triangular, 198; nilpotent, 447; nonnegative, 155; norm, 303-4, 400-404, 449-50; orthogonal, 203, 336-44; positive, 155; product, 73-80; rank, 132-39, 191-92, 374-76; rational canonical form, 472; regular transition, 263; representation of a bilinear form, 357-60; representation of a linear transformation, 67-70, 74-76; row echelon form, 163; row sum, 264; scalar, 229; self-adjoint, 329, 401; signature, 379; similarity, 98-100, 230, 443; skew-symmetric, 20, 203; square, 8; stochastic, 257; sum, 8; symmetric, 16, 20, 329-30, 337, 341; trace, 16, 18, 83, 100, 230-31, 250, 297, 345-46; transition, 257-59, 450; transpose, 15-16, 18, 56; unitary, 203, 336-38; unitary equivalence, 337, 346, 406; upper triangular, 18, 196-98, 210-11, 229, 326, 338, 347, 447; Vandermonde, 203-4; vector space of, 8, 297, 357; zero, 8
Maximal element of a family of sets, 49
Maximal linearly independent subset, 50-51
Maximal principle, 49-50
Michelson-Morley experiment, 385
Minimal polynomial of a linear operator or matrix, 451-56
Minimal solution of a system of linear equations, 321-22
Multiplicity of an eigenvalue, 233
n-linear function, 182-88
Nilpotent linear operator, 446
Nilpotent matrix, 447
Nonhomogeneous differential equation, 122-23
Nonhomogeneous system of linear equations, 149-50
Nonnegative matrix, 155
Norm, 298-300, 303; Euclidean, 400-404; of a matrix, 303-4, 400-404, 449-50; of a vector, 298-300, 303
Normal operator or matrix, 326-28, 352-53
Null space, 57-60, 114-18
Nullity, 59-60
Numerical methods: conditioning, 398; QR factorization, 347-48
Odd function, 19, 314
Open model of a simple economy, 156-57
Order: of a differential equation, 110; of a differential operator, 112, 115, 117-18
Ordered basis, 66
Orientation, 175-80
Orthogonal complement, 310, 312-13, 346, 348-51, 410
Orthogonal equivalence of matrices, 337-38
Orthogonal matrix, 203, 336-44
Orthogonal operator, 333-40, 406-12
Orthogonal projection, 311, 323, 348-52, 354
Orthogonal subset, 300
Orthogonal vectors, 300
Orthonormal basis, 304, 308, 327-29
Orthonormal subset, 300
Parallelogram: area, 174-80; determined by vectors, 176
Parallelogram law, 1, 302
Parseval's identity, 314
Partial isometry, 346
Pendular motion, 123-24
Periodic motion of a spring, 124-25
Perpendicular vectors, 300
Physics: Hooke's law, 109, 324; pendular motion, 123-24; periodic motion of a spring, 124-25; special theory of relativity, 385-95; spring constant, 324
Polar identities, 303
Polynomial, 8; auxiliary, 112, 114-15, 118-20; characteristic, 221, 329; coefficients, 8-9; degree, 8; division algorithm, 492-94; equal, 9; function, 9, 499-500; homogeneous of degree two, 367-69; irreducible, 497-99; minimal, 451-56; monic, 497-99; quotient in division, 493-94; relatively prime, 494-97; remainder in division, 493-94; splits, 232, 325, 329; sum, 8; T-annihilator, 458; trigonometric, 349; unique factorization, 498-99; vector space of, 9; zero, 8; zero of, 115, 494
Positive definite operator, 331-33
Positive matrix, 155
Positive semidefinite operator, 331-32
Primary decomposition theorem, 477
Principal axis theorem, 342
Probability (see Markov chain)
Probability vector, 258; fixed, 269; initial, 261
Product: of bilinear forms and scalars, 356; of complex numbers, 488-89; of elements of a field, 485; of linear transformations and scalars, 68-69; of matrices, 73-80; of vectors and scalars, 6
Projection, 64, 72, 84, 348-52, 354; on the x-axis, 56; orthogonal, 311, 323, 348-52, 354
Proper value (see Eigenvalue)
Proper vector (see Eigenvector)
Pythagorean theorem, 301
QR factorization, 347-48
Quadratic form, 366-72
Quotient space, 20, 49, 65, 93
Range, 57-60
Rank: of a bilinear form, 376; of a linear transformation, 59-60, 139; of a matrix, 132-39, 191-92, 374-76
Rational canonical basis, 461
Rational canonical form: of a linear operator, 459-79; of a matrix, 472
Rayleigh quotient, 401
Real part of a function, 110
Reflection, 56, 97-98, 339, 406-12
Regular transition matrix, 263
Relative change in a vector, 399
Relatively prime polynomials, 494-97
Replacement theorem, 38-39
Resolution of the identity operator, 351
Rigid motion, 338-40
Rotation, 56, 336-44, 406-12
Row echelon form, 163
Row operation, 128-32
Row sum of matrices, 264
Saddle point, 373
Scalar, 7
Scalar matrix, 229
Schur's theorem, 326
Second derivative test, 373-76, 383-84
Self-adjoint linear operator or matrix, 329, 401
Sequence, 9
Set, 481-83; disjoint, 482; element of, 481; empty, 482; equality of, 481; intersection, 482-83; proper subset, 482; subset, 481; union, 482-83
Signature: of a bilinear form, 378; of a matrix, 379
Similar matrices, 98-100, 230, 443
Simpson's rule, 107
Simultaneous diagonalization, 251, 291, 293, 332, 354
Skew-symmetric matrix, 20, 203
Skew-symmetric operator, 354
Solution of a system of linear equations, 148; minimal, 321-22; trivial, 149
Solution space of a differential equation, 113-14, 118-20
Space-time coordinates, 386-87
Span, 27-28, 30-31, 37, 307
Special theory of relativity, 385-95; axioms, 387-88; Lorentz transformation, 388-95; space-time coordinates, 386-87; time contraction, 393-95
Spectral decomposition, 352-53
Spectral theorem, 351
Spectrum, 351
Splits, 232, 325, 329
Spring, periodic motion of, 108-9, 124-25
Spring constant, 324
Square matrix, 8
Square root of a matrix, 383-84
Standard basis for Fn, 35
Standard basis for Pn(F), 36
Standard inner product on Fn, 296
Standard ordered basis for Fn, 66
Standard ordered basis for Pn(F), 66
Standard representation of a vector space, 89-90
States: absorbing, 273; of a transition matrix, 257
Stationary vector (see Fixed probability vector)
Statistics (see Least squares approximation)
Stochastic matrix (see Transition matrix)
Stochastic process, 259
Subset, 481; linearly dependent, 31-34, 37; linearly independent, 33-34, 39; orthogonal, 300; orthonormal, 300; span of, 27-28, 30-31, 37, 307; sum of subsets, 19, 48-49
Subspace, 14-17, 44-45; cyclic, 281-83; direct sum, 19, 48-49, 84, 245-48, 286, 323, 348, 351, 410-12, 414, 428, 458, 477; invariant, 64, 280-83; orthogonal complement, 310, 312-13, 346, 348-51, 410; T-cyclic, 281, 283; T-invariant, 64-65, 280-83; zero, 14
Sum: of bilinear forms, 356; of complex numbers, 488; of elements of a field, 485; of linear transformations, 68-69; of matrices, 8; of polynomials, 8; of subsets, 19, 48-49; of vectors, 6
Sylvester's law of inertia: for a bilinear form, 377; for a matrix, 379
Symmetric bilinear form, 360-63, 367-68
Symmetric matrix, 16, 20, 329-30, 337, 341
System of differential equations, 242-44, 450
System of linear equations, 23-25, 147; augmented matrix, 153; coefficient matrix, 147, 152; corresponding homogeneous system, 151; equivalent, 147; homogeneous, 149-50; ill-conditioned, 398; minimal solution, 321-22; nonhomogeneous, 149-50; solution, 148; well-conditioned, 398
T-annihilator of a vector, 458
Taylor's theorem, 373-74
T-cyclic basis, 460
T-cyclic subspace, 281, 283
Test for diagonalizability, 238-39
Time contraction, 393-95
T-invariant subspace, 64-65, 280-83
Trace of a matrix, 16, 18, 83, 100, 230-31, 250, 297, 345-46
Transition matrix, 257-59, 450; regular, 263; states, 257
Translation, 338-41
Transpose: of a linear transformation, 103-4, 107-8; of a matrix, 15-16, 18, 56
Trapezoidal rule, 107
Triangle inequality, 298
Trigonometric polynomial, 349
Trivial representation of zero, 33
Trivial solution, 149
Unique factorization of polynomials, 498-99
Unit vector, 300
Unitary equivalence of matrices, 337, 346, 406
Unitary matrix, 203, 336-38
Unitary operator, 333-38, 352-53
Upper triangular matrix, 18, 196-98, 210-11, 229, 326, 338, 347, 447
Vandermonde matrix, 203-4
Vector, 6; additive inverse, 6, 10-11; column, 7; coordinate, 66, 77-78, 95; demand, 156; fixed probability, 269; Fourier coefficients of, 102, 309, 349; initial probability, 261; norm, 298-300, 303; orthogonal, 300; probability, 258; product with a scalar, 2-3, 6; Rayleigh quotient, 401; row, 7; sum, 6; unit, 300; zero, 6, 10-11, 21, 33
Vector space, 6; basis, 35-41, 50-52; of bilinear forms, 356-58; of continuous functions, 16, 55, 101, 296, 314; dimension, 40-41, 44-46, 88-89, 102, 358; dual, 102-5; finite-dimensional, 40, 44; of functions from a set into a field, 9; infinite-dimensional, 40; of infinitely differentiable functions, 111-13, 218, 457; isomorphism, 87-89, 104-5, 357-58; of linear transformations, 68-69, 88-89; of matrices, 8, 297, 357; of n-tuples, 7; of polynomials, 9; quotient, 20, 49, 65, 93; of sequences, 9; subspace, 14-17, 44-45; zero, 13
Volume of a parallelepiped, 199
Well-conditioned system, 398
Wilkinson, J. H., 347
Z2, 14, 34, 361-62, 486
Zero matrix, 8
Zero of a polynomial, 115, 494
Zero polynomial, 8
Zero subspace, 14
Zero transformation, 55, 69
Zero vector, 6, 10-11, 21, 33
Zero vector space, 13