MA209


Introduction

0.1 What a differential equation is

In any subject, it is natural and logical to begin with an explanation of what the subject matter
is. Often it’s rather difficult, too. Our subject matter is differential equations, and the first order
of business is to define a differential equation. The easiest way out, and maybe the clearest, is to
list a few examples, and hope that although we do not know how to define one, we certainly know
one when we see it. But that is shirking the job. Here is one definition: an ordinary differential
equation (commonly abbreviated as 'ODE') is an equation involving known and unknown functions
of a single variable and their derivatives. (Ordinarily only one of the functions is unknown.) Some examples:

1. d^2x/dt^2 - 7t dx/dt + 8x sin t = (cos t)^4.
2. (dx/dt)^3 + d^2x/dt^2 = 1.
3. (dx/dt)(d^2x/dt^2) + sin x = t.
4. dx/dt + 2x = 3.

Incidentally, the word “ordinary” is meant to indicate not that the equations are run-of-the-mill,
but simply to distinguish them from partial differential equations (which involve functions of
several variables and partial derivatives). We shall also deal with systems of ordinary differential
equations, in which several unknown functions and their derivatives are linked by a system of
equations. An example:
    dx1/dt = 2 x1 x2 + x2,
    dx2/dt = x1 - t^2 x2.
A solution to a differential equation is, naturally enough, a function which satisfies the equation.
It’s possible that a differential equation has no solutions. For instance,
    (dx/dt)^2 + x^2 + t^2 = -1
has none. But in general, differential equations have lots of solutions. For example, the equation
    dx/dt + 2x = 3

is satisfied by

    x = 3/2,   x = 3/2 + e^(-2t),   x = 3/2 + 17 e^(-2t),

and more generally by

    x(t) = 3/2 + c e^(-2t),
where c is any real number. However, in applications where these differential equations model
certain phenomena, the equations often come equipped with initial conditions. Thus one may
demand a solution of the above equation satisfying x = 4 when t = 0. This condition lets one
solve for the constant c.
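To illustrate with the equation above (a quick sketch in Python rather than Maple; not part of the original notes): imposing x(0) = 4 on x(t) = 3/2 + c e^(-2t) forces 4 = 3/2 + c, i.e. c = 5/2, and one can check numerically that the resulting function satisfies dx/dt + 2x = 3.

```python
import math

# General solution of dx/dt + 2x = 3 is x(t) = 3/2 + c*exp(-2t).
def x(t, c):
    return 1.5 + c * math.exp(-2 * t)

# Impose the initial condition x(0) = 4: 4 = 3/2 + c, so c = 5/2.
c = 4 - 1.5  # c = 2.5

# The initial condition holds.
assert abs(x(0, c) - 4) < 1e-12

# Check dx/dt + 2x = 3 numerically at t = 1 via a central difference.
h = 1e-6
t = 1.0
dxdt = (x(t + h, c) - x(t - h, c)) / (2 * h)
print(dxdt + 2 * x(t, c))  # ≈ 3
```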

Why study differential equations? The answer is that they arise naturally in applications.

Let us start by giving an example from physics since historically that’s where differential
equations started. Consider a weight on a spring bouncing up and down. A physicist wants to
know where the weight is at different times. To find that out, one needs to know where the weight
is at some time and what its velocity is at that time. Call the position x; then the velocity is dx/dt.
Now the rate of change of velocity, d^2x/dt^2, is proportional to the force on the weight, which is
proportional to the amount the spring is stretched. Thus d^2x/dt^2 is proportional to x. And so
we get a differential equation.
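To make the spring example concrete (a hedged sketch; the notes introduce no symbols for the mass or the spring constant, so m and k below are assumptions): Newton's law gives m d^2x/dt^2 = -k x, and one can verify numerically that x(t) = cos(sqrt(k/m) t) satisfies this equation.

```python
import math

m, k = 7.0, 3.0           # illustrative mass and spring constant (not from the notes)
omega = math.sqrt(k / m)  # angular frequency of the oscillation

def x(t):
    # Candidate solution of m*x'' = -k*x with x(0) = 1, x'(0) = 0.
    return math.cos(omega * t)

# Approximate the second derivative by a central second difference.
h = 1e-5
t = 2.0
xpp = (x(t + h) - 2 * x(t) + x(t - h)) / h**2

print(m * xpp + k * x(t))  # ≈ 0, confirming m x'' + k x = 0 at t = 2
```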

Things are pretty much the same in other fields where differential equations are used, such
as biology, economics, chemistry, and so on. Consider economics for instance. Economic models
can be divided into two main classes: static ones and dynamic ones. In static models, everything
is presumed to stay the same; in dynamic ones, various important quantities change with time.
And the rate of change can sometimes be expressed as a function of the other quantities involved.
Which means that the dynamic models are described by differential equations.

How to get the equations is the subject matter of economics (or physics or biology or whatever).
What to do with them is the subject matter of these notes.

0.2 What these notes are about

Given a differential equation (or a system of differential equations), the obvious thing to do with it
is to solve it. Nonetheless, most of these notes will be taken up with other matters. The purpose
of this section is to try to convince the student that all those other matters are really worth
discussing.

To begin with, let’s consider a question which probably seems silly: what does it mean to solve
a differential equation? The answer seems obvious: it means that one finds all functions satisfying
the equation. But it's worth looking more closely at this answer. A function is, roughly speaking,
a rule which associates to each t a value x(t), and the solution will presumably specify this rule.
That is, solving a differential equation like

    x'(t) + t^2 x(t) = e^t,   x(0) = 1

should mean that if I choose a value of t, say π, I end up with a procedure for determining x
there.

There are two problems with this notion. The first is that it doesn't really conform to what
one wants as a solution to a differential equation. Most of the time, a function means (intuitively,
at least) a formula into which you plug t to get x. Well, for the average differential equation, this
formula doesn’t exist–at least, we have no way of finding it. And even when it does exist, it is
often unsatisfactory. For example, an equation like

    x''(t) + t x'(t) + t^2 x(t) = 0,   x(0) = 1,   x(1) = 0

has a solution which can be written as a power series:


    x(t) = 1 - (1/12) t^4 + (1/90) t^6 + (1/3360) t^8 + ...
And this, at least at first, doesn’t seem too helpful. Why not? That leads to the second problem:
the notion of a function given above doesn’t really tell us what we want to know. Consider
for instance, a typical use of a differential equation in physics, like determining the motion of
a vibrating spring. One makes various plausible assumptions, uses them to derive a differential
equation, and (with luck) solves it. Suppose that the procedure works brilliantly and that the
solutions to the equation describe the motion of the spring. Then we can use the solutions to
answer questions like

“If the mass of the weight at the end of the spring is 7 grams, if the spring constant is 3 (in
appropriate units), and if I start the spring off at some given position with some given velocity
where will the mass be 3 seconds later?”

But there are also more qualitative questions we can ask. For instance,

“Is it true that the spring will oscillate forever? Will the oscillations get bigger and bigger,
will they die out, or will they stay roughly constant in size?”

If we only know the solution as a power series, it may not be easy to answer these questions.
(Try telling, for instance, whether or not the function above gets large as t → ∞.) But questions
like this are obviously interesting and important if one wants to know what the physical system
will do as time goes on.
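One thing that can be done mechanically is to check the series itself: substituting x(t) = Σ c_n t^n into x'' + t x' + t^2 x = 0 gives the recurrence (n+2)(n+1) c_{n+2} + n c_n + c_{n-2} = 0. A sketch in Python (not part of the original notes; taking c1 = 0, so only the even-power part shown above is generated) reproduces the printed coefficients:

```python
from fractions import Fraction

# Coefficients c[n] of x(t) = sum of c[n] t^n, with c[0] = 1 and c[1] = 0,
# generated from the recurrence obtained by substituting the series into
# x'' + t x' + t^2 x = 0:
#   (n+2)(n+1) c[n+2] + n c[n] + c[n-2] = 0.
N = 10
c = [Fraction(0)] * (N + 1)
c[0] = Fraction(1)
for n in range(N - 1):
    prev = c[n - 2] if n >= 2 else Fraction(0)
    c[n + 2] = -(n * c[n] + prev) / ((n + 2) * (n + 1))

print(c[4], c[6], c[8])  # -1/12 1/90 1/3360
```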

For applications, another matter arises. Perhaps the most spectacular way of putting it is
that every differential equation used in applications is wrong! First of all, the problem being
considered is usually a simplification of real life, and that introduces errors. Next, there are errors
in measuring the parameters used in the problem. Result: the equation is all wrong. Of course,
these errors are slight (one hopes, anyway), and presumably the solutions to the equation bear
some resemblance to what happens in the world. So the qualitative behaviour of solutions is very
useful.

Another question is whether solutions exist and how many do. Since there is in general no
formula for solving a differential equation, we have no guarantee that there are solutions, and it
would be frustrating to spend a long time searching for a solution that doesn’t exist. It is also
very important, in many cases, to know that exactly one solution exists.

What all of this means is that these notes will be discussing these sorts of matters about
differential equations. First, how to solve the simplest ones. Second, how to get qualitative
information about the solutions. And third, theorems about existence and uniqueness of solutions
and the like. In all three, there will be theoretical material, but we will also see examples.
Contents

0.1 What a differential equation is . . . . . . . . . . . . . . . . . . . . . . . . . . . . . i

0.2 What these notes are about . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii

1 Linear equations 1

1.1 Objects of study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Using Maple to investigate differential equations . . . . . . . . . . . . . . . . . . . 3

1.2.1 Getting started . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.2.2 Differential equations in Maple . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.3 High order ODE to a first order ODE. State vector. . . . . . . . . . . . . . . . . . 7

1.4 The simplest example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

1.5 The matrix exponential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

1.6 Computation of etA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

1.7 Stability considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

2 Phase plane analysis 25

2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

2.2 Concepts of phase plane analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

2.2.1 Phase portraits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

2.2.2 Singular points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

2.3 Constructing phase portraits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

2.3.1 Analytic method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

2.3.2 The method of isoclines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

2.3.3 Phase portraits using Maple . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

2.4 Phase plane analysis of linear systems . . . . . . . . . . . . . . . . . . . . . . . . . 36

2.4.1 Complex eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36


2.4.2 Diagonal case with real eigenvalues . . . . . . . . . . . . . . . . . . . . . . . 36

2.4.3 Nondiagonal case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

2.5 Phase plane analysis of nonlinear systems . . . . . . . . . . . . . . . . . . . . . . . 39

2.5.1 Local behaviour of nonlinear systems . . . . . . . . . . . . . . . . . . . . . . 40

2.5.2 Limit cycles and the Poincaré-Bendixson theorem . . . . . . . . . . . . . . 44

3 Stability theory 51

3.1 Equilibrium point . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

3.2 Stability and instability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

3.3 Asymptotic and exponential stability . . . . . . . . . . . . . . . . . . . . . . . . . . 54

3.4 Stability of linear systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

3.5 Lyapunov’s method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

3.6 Lyapunov functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

3.7 Sufficient condition for stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

4 Existence and uniqueness 63

4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

4.2 Analytic preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

4.3 Proof of Theorem 4.1.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

4.3.1 Existence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

4.3.2 Uniqueness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

4.4 The general case. Lipschitz condition. . . . . . . . . . . . . . . . . . . . . . . . . . 70

4.5 Existence of solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

4.6 Continuous dependence on initial conditions . . . . . . . . . . . . . . . . . . . . . . 72

5 Underdetermined ODEs 75

5.1 Control theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

5.2 Solutions to the linear control system . . . . . . . . . . . . . . . . . . . . . . . . . . 77

5.3 Controllability of linear control systems . . . . . . . . . . . . . . . . . . . . . . . . 78

Bibliography 85

Index 87
Chapter 1

Linear equations

1.1 Objects of study

Many problems in economics, biology, physics and engineering involve rates of change that depend
on the interaction of the basic elements (assets, populations, charges, forces, etc.) with each other.
This interaction is frequently expressed as a system of ordinary differential equations, a system of
the form
    x1'(t) = f1(t, x1(t), x2(t), ..., xn(t)),   (1.1)
    x2'(t) = f2(t, x1(t), x2(t), ..., xn(t)),   (1.2)
    ...
    xn'(t) = fn(t, x1(t), x2(t), ..., xn(t)).   (1.3)

Here the (known) functions (τ, ξ1, ..., ξn) ↦ fi(τ, ξ1, ..., ξn) take values in R (the real
numbers) and are defined on a set in R^(n+1) (R × R × ... × R, n + 1 times).

We seek a set of n unknown functions x1 , . . . , xn defined on a real interval I such that when
the values of these functions are inserted into the equations above, the equality holds for every
t ∈ I.

Introducing the vector notation (all column vectors)

    x(t) := (x1(t), ..., xn(t))^T,   x'(t) := (x1'(t), ..., xn'(t))^T,
    ξ := (ξ1, ..., ξn)^T,   f(τ, ξ) := (f1(τ, ξ1, ..., ξn), ..., fn(τ, ξ1, ..., ξn))^T,

the system of differential equations can be abbreviated simply as

    x'(t) = f(t, x(t)).

Definition. A function x : [t0, t1] → R^n is said to be a solution of (1.1)-(1.3) if x is differentiable
on [t0, t1] and it satisfies (1.1)-(1.3) for each t ∈ [t0, t1].

In addition, an initial condition may also need to be satisfied: x(t0) = x0 ∈ R^n, and a
corresponding solution is said to satisfy the initial value problem

    x'(t) = f(t, x(t)),   x(t0) = x0.
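Even without a solution formula, an initial value problem x'(t) = f(t, x(t)), x(t0) = x0 can be approximated numerically. A minimal sketch of the idea for the scalar case, using Euler's method (which these notes do not cover; this is an illustration, not the notes' own machinery):

```python
def euler(f, t0, x0, t1, steps=100000):
    """Approximate the solution of x'(t) = f(t, x(t)), x(t0) = x0, at t = t1
    by repeatedly stepping along the direction prescribed by f."""
    h = (t1 - t0) / steps
    t, x = t0, x0
    for _ in range(steps):
        x += h * f(t, x)
        t += h
    return x

# Example with a known solution: x' = x, x(0) = 1, whose solution is x(t) = e^t.
approx = euler(lambda t, x: x, 0.0, 1.0, 1.0)
print(approx)  # ≈ e ≈ 2.71828 (Euler's method is only first-order accurate)
```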


In this course we will mainly consider the case when the functions f1 , . . . , fn do not depend
on t (that is, they take the same value for all t).

In most of this course, we will consider autonomous systems, which are defined as follows.

Definition. If f does not depend on t, that is, it is simply a function defined on some subset of
R^n, taking values in R^n, the differential equation

    x'(t) = f(x(t))

is called autonomous.

But we begin our study with an even simpler case, namely when these functions are linear,
that is,

    f(ξ) = Aξ,

where A = (aij) is an n × n real matrix (A ∈ R^(n×n)).
Then we obtain the 'vector' differential equation

    x'(t) = Ax(t),

which is really the system of scalar differential equations given by

    x1' = a11 x1 + ... + a1n xn,   (1.4)
    ...
    xn' = an1 x1 + ... + ann xn.   (1.5)

In many applications, the equations occur naturally in this form, or the linear system may be an
approximation to a nonlinear one.

Exercises.

1. Classify the following differential equations as autonomous/nonautonomous. In each
autonomous case, also identify if the system is linear or nonlinear.

(a) x'(t) = e^t.
(b) x'(t) = e^(x(t)).
(c) x'(t) = e^t y(t),  y'(t) = x(t) + y(t).
(d) x'(t) = y(t),  y'(t) = x(t) y(t).
(e) x'(t) = y(t),  y'(t) = x(t) + y(t).

2. Verify that the differential equation has the given function or functions as solutions.

(a) x'(t) = e^(sin x(t)) + cos(x(t));  x(t) ≡ π.
(b) x'(t) = ax(t), x(0) = x0;  x(t) = e^(ta) x0.
(c) Let p be a polynomial function:

    p(λ) := a_n λ^n + a_{n-1} λ^{n-1} + ... + a_1 λ + a_0.

Let λ0 be a root of this polynomial, that is, λ0 is a number satisfying p(λ0) = 0. The
differential equation under consideration is:

    a_n d^n x/dt^n (t) + a_{n-1} d^{n-1} x/dt^{n-1} (t) + ... + a_1 dx/dt (t) + a_0 x(t) = 0.

The claimed solution is x(t) = C e^(λ0 t), where C is a constant.
(d) x1'(t) = 2 x2(t),  x2'(t) = -2 x1(t);  x1(t) = sin(2t),  x2(t) = cos(2t).
(e) x'(t) = 2t (x(t))^2;  x1(t) = 1/(1 - t^2) for t ∈ (-1, 1),  x2(t) ≡ 0.
(f) x'(t) = 3 (x(t))^(2/3), x(0) = 0;  x1(t) = 0 and x2(t) = t^3 for t ≥ 0.
(g) ∂x/∂t (s, t) - ∂^2 x/∂s^2 (s, t) = 0;  x(s, t) = C e^(t τ0 + s σ0), where C is a constant
and τ0, σ0 are fixed numbers satisfying τ0 - σ0^2 = 0.

3. Find value(s) of m such that x(t) = t^m is a solution to 2t x'(t) = x(t) for t ≥ 1.

4. Show that every solution of x'(t) = (x(t))^2 + 1 is an increasing function.

1.2 Using Maple to investigate differential equations

1.2.1 Getting started

Maple should be installed on all public school computers, with the newest version being "Maple
2016". (Some computers will also have "Maple 17", which, confusingly, is an older version than
"Maple 2016".) You can obtain a free copy of Maple 2016 from the IMT Walk In Centre on the
1st floor of the LSE Library.

Background material about Maple can be found at:

1. the MapleSoft Student Help Center ;

2. Maple’s own tutorials and help, which can be found under Maple Help in Maple itself;

3. MA100 Maple tutorials, which can be found on the MA209 Moodle page.

1.2.2 Differential equations in Maple

Here we describe some main Maple commands related to differential equations.



1. Defining differential equations. For instance, to define the differential equation x' = x + t,
we give the following command.

[> ode1 := diff(x(t),t) = t + x(t);

Here, ode1 is the label or name given to the equation, diff(x(t),t) means that the function
t ↦ x(t) is differentiated with respect to t, and the last semicolon indicates that we want
Maple to display the answer upon execution of the command. Indeed, on hitting the enter
key, we obtain the following output.

    ode1 := (d/dt) x(t) = t + x(t)
The differentiation of x can also be expressed in another equivalent manner as shown below.

[> ode1 := D(x)(t) = t + x(t);

A second order differential equation, for instance x'' = x' + x + sin t, can be specified by

[> ode2 := diff(x(t),t,t) = diff(x(t),t) + x(t) + sin(t);

or equivalently by the following command.

[> ode2 := D(D(x))(t) = D(x)(t) + x(t) + sin(t);

A system of ODEs can be specified in a similar manner. For example, if we have the system

    x1' = x2,
    x2' = -x1,

then we can specify this as follows:

[> ode3a := diff(x1(t),t) = x2(t); ode3b := diff(x2(t),t) = -x1(t);

2. Solving differential equations. To solve say the equation ode1 from above, we give the
command

[> dsolve(ode1);

which gives the following output:

    x(t) = -t - 1 + e^t _C1

The strange "_C1" is a constant. The strange name is used by Maple as an indication that
the constant is generated by Maple (and has not been introduced by the user).
To solve the equation with a given initial value, say with x(0) = 1, we use the command:

[> dsolve({ode1, x(0)=1});

If our initial condition is itself a parameter α, then we can write

[> dsolve({ode1, x(0)=alpha});

which gives:

    x(t) = -t - 1 + e^t (1 + α)
We can also give a name to the equation specifying the initial condition as follows

[> ic1 := x(0)=2;

and then solve the initial value problem by writing:

[> dsolve({ode1, ic1});

Systems of differential equations can be handled similarly. For example, the ODE system
ode3a, ode3b can be solved by

[> dsolve({ode3a, ode3b});

and if we have the initial conditions x1 (0) = 1, x2 (0) = 1, then we give the following
command:

[> dsolve({ode3a, ode3b, x1(0)=1, x2(0)=1});

3. Plotting solutions of differential equations. The tools one needs are in the package DEtools,
and so one has to load this at the outset using the command:
[> with(DEtools):

Once this is done, one can use for instance the command DEplot, which can be used to plot
solutions. This command is quite complicated with many options, but one can get help from
Maple by using:

[> ?DEplot;

For the equation ode1 above, the command

[> DEplot(ode1, x(t), t=-2..2, [[x(0)=0]]);

will give a nice picture of a solution to the associated initial value problem, but it contains
some other information as well. The various elements in the above command are: ode1 is
the label specifying which differential equation we are solving, x(t) indicates the dependent
variable, t=-2..2 indicates the independent variable and its range, and [[x(0)=0]] gives
the initial value.
One can also give more than one initial value, for instance:

[> DEplot(ode1, x(t), t=-2..2, [[x(0)=-1], [x(0)=0], [x(0)=1]]);

The colour of the plot can also be changed:

[> DEplot(ode1, x(t), t=-2..2, [[x(0)=-1], [x(0)=0], [x(0)=1]],
    linecolour=blue);

The arrows one sees in the picture show the direction field, a concept we will discuss in
Chapter 2. One can hide these arrows:

[> DEplot(ode1, x(t), t=-2..2, [[x(0)=-1], [x(0)=0], [x(0)=1]],
    arrows=NONE);

To make plots for higher order ODEs, one must give the right number of initial values. We
consider an example for ode2 below:

[> DEplot(ode2, x(t), t=-2..2, [[x(0)=0, D(x)(0)=0], [x(0)=0, D(x)(0)=2]]);



One can also handle systems of ODEs using DEplot, and we give an example below.

[> DEplot({ode3a,ode3b}, {x1(t),x2(t)}, t=0..10, [[x1(0)=1, x2(0)=0]],
    scene=[t,x1(t)]);

If one wants x1 and x2 to be displayed in the same plot, then we can use the command
display as demonstrated in the following example.

[> with(plots):
[> plot1 := DEplot({ode3a,ode3b}, {x1(t),x2(t)}, t=0..10,
[[x1(0)=1, x2(0)=0]], scene=[t,x1(t)]):
[> plot2 := DEplot({ode3a,ode3b}, {x1(t),x2(t)}, t=0..10,
[[x1(0)=1, x2(0)=0]], scene=[t,x2(t)], linecolour=red):
[> display(plot1,plot2);

In Chapter 2, we will learn about ‘phase portraits’ which are plots in which we plot one
solution against the other (with time giving this parametric representation) when one has a
2D system. We will revisit this subsection in order to learn how we can make phase portraits
using Maple.

Exercises.

1. In each of the following initial-value problems, find a solution using Maple. Verify that the
solution exists for some t ∈ I, where I is an interval containing 0.

(a) x' = x + x^3 with x(0) = 1.
(b) x'' + x = (1/2) cos t with x(0) = 1 and x'(0) = 1.
(c) x1' = -x1 + x2,  x2' = x1 + x2 + t  with x1(0) = 0 and x2(0) = 0.

2. In forestry, there is interest in the evolution of the population x of a pest called ‘spruce
budworm’, which is modelled by the following equation:
    x' = x (2 - x/5 - 5x/(2 + x^2)).   (1.6)

The solutions of this differential equation show radically different behaviour depending on
what initial condition x(0) = x0 one has in the range 0 ≤ x0 ≤ 10.

(a) Use Maple to plot solutions for several initial values in the range [0, 10].
(b) Use the plots to describe the different types of behaviour, and also give an interval
for the initial value in which the behaviour occurs. (For instance: For x0 ∈ [0, 8), the
solutions x(t) go to 0 as t increases. For x0 ∈ [8, 10] the solutions x(t) go to infinity as
t increases.)
(c) Use Maple to plot the function

    f(x) = x (2 - x/5 - 5x/(2 + x^2))

in the range x ∈ [0, 10]. Can the differential equation plots be explained theoretically?
Hint: See Figure 1.2.

Figure 1.1: Population evolution of the budworm for various initial conditions.
Figure 1.2: Graph of the function f.

1.3 High order ODE to a first order ODE. State vector.

Note that the system of equations (1.1)-(1.3) is first order, in that the derivatives occurring are
of order at most 1. However, in applications, one may end up with a model described by a set of
high order equations. So why restrict our study only to first order systems? In this section we
learn that such high order equations can be expressed as a system of first order equations, by
introducing a 'state vector'. So throughout the sequel we will consider only systems of first order
equations.

Let us consider the second order differential equation

    y''(t) + a(t)y'(t) + b(t)y(t) = u(t).   (1.7)

If we introduce the new functions x1, x2 defined by

    x1 = y  and  x2 = y',

then we observe that

    x1'(t) = y'(t) = x2(t),
    x2'(t) = y''(t) = -a(t)y'(t) - b(t)y(t) + u(t) = -a(t)x2(t) - b(t)x1(t) + u(t),

and so we obtain the system of first order equations

    x1'(t) = x2(t),   (1.8)
    x2'(t) = -a(t)x2(t) - b(t)x1(t) + u(t),   (1.9)

which is of the form (1.1)-(1.3).

Solving (1.7) is equivalent to solving the system (1.8)-(1.9). To see the equivalence, suppose
that (x1, x2) satisfies the system (1.8)-(1.9). Then x1 is a solution to (1.7), since

    (x1'(t))' = x2'(t) = -b(t)x1(t) - a(t)x1'(t) + u(t),

which is (1.7). On the other hand, if y is a solution to (1.7), then define x1 = y and x2 = y', and
proceeding as in the preceding paragraph, this yields a solution of (1.8)-(1.9).
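The reduction can be exercised numerically; a sketch in Python (not part of the original notes) steps the system (1.8)-(1.9) forward with Euler's method, using the illustrative constant coefficients a ≡ 0, b ≡ 1, u ≡ 0, so that (1.7) becomes y'' + y = 0, whose solution with y(0) = 1, y'(0) = 0 is y = cos t:

```python
# State-vector reduction of y'' + a(t)y' + b(t)y = u(t):
#   x1 = y, x2 = y', so x1' = x2 and x2' = -b(t)x1 - a(t)x2 + u(t).
def step(t, x1, x2, h, a, b, u):
    # One Euler step for the first-order system (1.8)-(1.9).
    dx1 = x2
    dx2 = -b(t) * x1 - a(t) * x2 + u(t)
    return x1 + h * dx1, x2 + h * dx2

# Illustrative constant-coefficient case (an assumption, not from the notes):
# a = 0, b = 1, u = 0, i.e. y'' + y = 0 with y(0) = 1, y'(0) = 0.
a = lambda t: 0.0
b = lambda t: 1.0
u = lambda t: 0.0

t, x1, x2, h = 0.0, 1.0, 0.0, 1e-4
while t < 1.0:
    x1, x2 = step(t, x1, x2, h, a, b, u)
    t += h

print(x1)  # ≈ cos(1) ≈ 0.5403
```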

More generally, if we have an nth order scalar equation

    y^(n) + a_{n-1}(t) y^(n-1) + ... + a_1(t) y'(t) + a_0(t) y(t) = u(t),

then by introducing the vector of functions

    (x1, x2, ..., xn) := (y, y', ..., y^(n-1)),   (1.10)

we arrive at the equivalent first order system of equations

    x1'(t) = x2(t),
    x2'(t) = x3(t),
    ...
    x_{n-1}'(t) = xn(t),
    xn'(t) = -a_0(t)x1(t) - a_1(t)x2(t) - ... - a_{n-1}(t)xn(t) + u(t).

The auxiliary vector in (1.10), comprising the successive derivatives of the unknown function in
the high order differential equation, is called a state, and the resulting system of first order
differential equations is called a state equation.

Exercises. By introducing appropriate state variables, write a state equation for the following
(systems of) differential equations:

1. x'' + ω^2 x = 0.

2. x'' + x = 0,  y'' + y' + y = 0.

3. x'' + t sin x = 0.

1.4 The simplest example

The differential equation

    x'(t) = ax(t)   (1.11)

is the simplest differential equation. It is also one of the most important. First, what does it
mean? Here x : R → R is an unknown real-valued function (of a real variable t), and x'(t) is its
derivative at t. The equation (1.11) holds for every value of t, and a denotes a constant.

The solutions to (1.11) are obtained from calculus: if C is any constant, then the function f
given by f(t) = C e^(ta) is a solution, since

    f'(t) = C a e^(ta) = a (C e^(ta)) = a f(t).

Moreover, there are no other solutions. To see this, let u be any solution and compute the
derivative of v given by v(t) = e^(-ta) u(t):

    v'(t) = -a e^(-ta) u(t) + e^(-ta) u'(t)
          = -a e^(-ta) u(t) + e^(-ta) a u(t)   (since u'(t) = a u(t))
          = 0.

Therefore by the fundamental theorem of calculus,

    v(t) - v(0) = ∫_0^t v'(s) ds = ∫_0^t 0 ds = 0,

and so v(t) = v(0) for all t, that is, e^(-ta) u(t) = u(0). Consequently u(t) = e^(ta) u(0) for all t.

So we see that the initial value problem

    x'(t) = ax(t),   x(0) = x0

has the unique solution

    x(t) = e^(ta) x0,   t ∈ R.
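A numerical sanity check of this claim (a Python sketch, not part of the notes; the values of a and x0 below are arbitrary illustrative choices):

```python
import math

a, x0 = -0.7, 3.0  # illustrative values, not from the notes

def x(t):
    # The claimed unique solution of x'(t) = a x(t), x(0) = x0.
    return math.exp(t * a) * x0

print(x(0))  # 3.0, so the initial condition holds

# Check x'(t) = a x(t) at a few sample points via central differences.
h = 1e-6
for t in (0.0, 0.5, 2.0):
    dxdt = (x(t + h) - x(t - h)) / (2 * h)
    print(abs(dxdt - a * x(t)) < 1e-6)  # True
```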
As the constant a changes, the nature of the solutions changes. Can we describe qualitatively the
way the solutions change? We have the following cases:

1◦ a < 0. In this case,

    lim_(t→∞) x(t) = lim_(t→∞) e^(ta) x(0) = 0 · x(0) = 0.

Thus the solutions all converge to zero, and moreover they converge to zero exponentially,
that is, there exist constants M > 0 and ε > 0 such that the solutions satisfy an
inequality of the type |x(t)| ≤ M e^(-εt) for all t ≥ 0. (Note that not every decaying solution
of an ODE has to converge exponentially fast; see the example on page 56.) See Figure 1.3.

Figure 1.3: Exponential solutions e^(ta) x0 (three panels: a < 0, a = 0, a > 0).

2◦ a = 0. In this case, x(t) = e^(t·0) x(0) = 1 · x(0) = x(0) for all t ≥ 0. Thus the solutions
are constants, the constant value being the initial value. See Figure 1.3.

3◦ a > 0. In this case, if the initial condition is zero, the solution is the constant function
taking value 0 everywhere. If the initial condition is nonzero, then the solutions 'blow up'.
See Figure 1.3.

We would like to have a similar idea about the qualitative behaviour of solutions, but when
we have a system of linear differential equations. It turns out that for the system

    x'(t) = Ax(t),

the behaviour of the solutions depends on the eigenvalues of the matrix A. In order to find out
why this is so, we first give an expression for the solution of such a linear ODE in the next two
sections. We find that the solution is notationally the same as in the scalar case discussed in this
section: x(t) = e^(tA) x(0), with the little 'a' now replaced by the matrix 'A'! But what do we
mean by the exponential of a matrix, e^(tA)? We first introduce this concept in the next section,
and subsequently, we will show how it enables us to solve the system x' = Ax.

1.5 The matrix exponential

In this section we introduce the exponential of a square matrix A, which is useful for obtaining
explicit solutions to the linear system x'(t) = Ax(t). We begin with a few preliminaries concerning
vector-valued functions.

A vector-valued function t ↦ x(t) is a vector whose entries x1(t), ..., xn(t) are functions of t.
Similarly, a matrix-valued function t ↦ A(t) is a matrix whose entries aij(t), 1 ≤ i ≤ m,
1 ≤ j ≤ n, are functions of t.

The calculus operations of taking limits, differentiating, and so on are extended to vector-valued
and matrix-valued functions by performing the operations on each entry separately. Thus by
definition, lim_(t→t0) x(t) is the vector whose ith entry is lim_(t→t0) xi(t). So this limit exists
iff lim_(t→t0) xi(t) exists for all i ∈ {1, ..., n}.
valued or matrix-valued function is the function obtained by differentiating each entry separately:
 "   " 
x1 (t) a11 (t) . . . a"1n (t)
dx dA
(t) =  ...  , .. ..
   
(t) =  . . ,
dt "
dt " "
xn (t) am1 (t) . . . amn (t)

where x"i (t) is the derivative of xi (t), and so on. So dx


dt is defined iff each of the functions xi (t) is
differentiable. The derivative can also be described in vector notation, as

dx x(t + h) − x(t)
(t) = lim . (1.12)
dt h→0 h

Here x(t + h) − x(t) is computed by vector addition and the h in the denominator stands for scalar
multiplication by h−1 . The limit is obtained by evaluating the limit of each entry separately,
as above. So the entries of (1.12) are the derivatives xi (t). The same is true for matrix-valued
functions.

Suppose that, analogous to

e^a = 1 + a + a²/2! + a³/3! + … ,  a ∈ R,

we define

e^A = I + A + (1/2!)A² + (1/3!)A³ + … ,  A ∈ R^{n×n}.   (1.13)

In this section, we will study this matrix exponential, and show that the matrix-valued function

e^{tA} = I + tA + (t²/2!)A² + (t³/3!)A³ + …

(where t is a variable scalar) can be used to solve the system x′(t) = Ax(t), x(0) = x0: indeed,
the solution is given by x(t) = e^{tA} x0.

We begin by stating the following result, which shows that the series in (1.13) converges for
any given square matrix A.

Theorem 1.5.1 The series (1.13) converges for any given square matrix A.

We have collected the proofs together at the end of this section in order to not break up the
discussion.

Since matrix multiplication is relatively complicated, it isn’t easy to write down the matrix
entries of eA directly. In particular, the entries of eA are usually not obtained by exponentiating
the entries of A. However, one case in which the exponential is easily computed, is when A is
a diagonal matrix, say with diagonal entries λi . Inspection of the series shows that eA is also
diagonal in this case and that its diagonal entries are eλi .

The exponential of a matrix A can also be determined when A is diagonalizable, that is,
whenever we know an invertible matrix P such that P⁻¹AP is a diagonal matrix D. Then A = PDP⁻¹, and
using (PDP⁻¹)^k = PD^kP⁻¹, we obtain

e^A = I + A + (1/2!)A² + (1/3!)A³ + …
    = I + PDP⁻¹ + (1/2!)PD²P⁻¹ + (1/3!)PD³P⁻¹ + …
    = PIP⁻¹ + PDP⁻¹ + (1/2!)PD²P⁻¹ + (1/3!)PD³P⁻¹ + …
    = P ( I + D + (1/2!)D² + (1/3!)D³ + … ) P⁻¹
    = P e^D P⁻¹
    = P diag( e^{λ1}, …, e^{λn} ) P⁻¹,

where λ1, …, λn denote the eigenvalues of A.
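For a concrete check, here is a small numerical sketch (using NumPy, which is not part of these notes): it computes e^A once by truncating the defining series (1.13), and once via the diagonalization P e^D P⁻¹, and confirms that the two agree. The test matrix and the truncation length are arbitrary choices.

```python
import numpy as np

def expm_series(A, terms=30):
    """Matrix exponential via the defining power series (1.13), truncated."""
    E = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k          # term is now A^k / k!
        E = E + term
    return E

# A symmetric (hence diagonalizable) test matrix, chosen arbitrarily.
A = np.array([[4.0, 3.0],
              [3.0, 4.0]])
lam, P = np.linalg.eig(A)            # A = P D P^{-1} with D = diag(lam)
E_diag = P @ np.diag(np.exp(lam)) @ np.linalg.inv(P)   # P e^D P^{-1}

assert np.allclose(expm_series(A), E_diag)
```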

Exercise. (∗∗) The set of diagonalizable n × n complex matrices is dense in the set of all n × n
complex matrices, that is, given any A ∈ Cn×n , there exists a B ∈ Cn×n arbitrarily close to A
(meaning that |bij − aij | can be made arbitrarily small for all i, j ∈ {1, . . . , n}) such that B has n
distinct eigenvalues.

Hint: Use the fact that every complex n× n matrix A can be ‘upper-triangularised’: that is, there
exists an invertible complex matrix P such that P AP −1 is upper triangular. Clearly the diagonal
entries of this new upper triangular matrix are the eigenvalues of A.

In order to use the matrix exponential to solve systems of differential equations, we need to
extend some of the properties of the ordinary exponential to it. The most fundamental property
is e^{a+b} = e^a e^b. This property can be expressed as a formal identity between the two infinite series
which are obtained by expanding

e^{a+b} = 1 + (a+b)/1! + (a+b)²/2! + …  and
e^a e^b = ( 1 + a/1! + a²/2! + … )( 1 + b/1! + b²/2! + … ).   (1.14)

We cannot substitute matrices into this identity because the commutative law is needed to obtain
equality of the two series. For instance, the quadratic terms of (1.14), computed without the
commutative law, are ½(a² + ab + ba + b²) and ½a² + ab + ½b². They are not equal unless ab = ba.
So there is no reason to expect e^{A+B} to equal e^A e^B in general. However, if two matrices A and
B happen to commute, the formal identity can be applied.

Theorem 1.5.2 If A, B ∈ Rn×n commute (that is AB = BA), then eA+B = eA eB .

The proof is at the end of this section. Note that the above implies that e^A is always invertible,
and in fact its inverse is e^{−A}: indeed, e^0 = I from the definition, and I = e^{A−A} = e^A e^{−A}.
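The role of the commutativity hypothesis can be probed numerically; the following sketch (NumPy, with an arbitrary truncation of the series — not part of the notes) checks that e^{A+B} = e^A e^B holds for a commuting pair and fails for a non-commuting one.

```python
import numpy as np

def expm_series(M, terms=40):
    # Truncated power series for the matrix exponential (1.13).
    E, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        E = E + term
    return E

# A = 2I commutes with every matrix, so the identity holds.
A = 2.0 * np.eye(2)
B = np.array([[0.0, 3.0], [0.0, 0.0]])
assert np.allclose(expm_series(A + B), expm_series(A) @ expm_series(B))

# A non-commuting pair: the identity fails.
C = np.array([[0.0, 1.0], [0.0, 0.0]])
D = np.array([[0.0, 0.0], [1.0, 0.0]])
assert not np.allclose(expm_series(C + D), expm_series(C) @ expm_series(D))
```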

Exercises.

1. Give an example of 2 × 2 matrices A and B such that e^{A+B} ≠ e^A e^B.

2. Compute e^A, where A = [ 2 3 ; 0 2 ].

   Hint: A = 2I + [ 0 3 ; 0 0 ].

We now come to the main result relating the matrix exponential to differential equations.
Given an n × n matrix A, we consider the exponential e^{tA}, t being a variable scalar, as a matrix-
valued function:

e^{tA} = I + tA + (t²/2!)A² + (t³/3!)A³ + … .

Theorem 1.5.3 etA is a differentiable matrix-valued function of t, and its derivative is AetA .

The proof is at the end of the section.

Theorem 1.5.4 (Product rule.) Let A(t) and B(t) be differentiable matrix-valued functions of t,
of suitable sizes so that their product is defined. Then the matrix product A(t)B(t) is differentiable,
and its derivative is

d/dt ( A(t)B(t) ) = (dA/dt)(t) B(t) + A(t) (dB/dt)(t).

The proof is left as an exercise.



Theorem 1.5.5 The first-order linear differential equation

dx/dt (t) = Ax(t),  t ∈ R,  x(0) = x0   (1.15)

has the unique solution x(t) = e^{tA} x0.

Proof We have

d/dt ( e^{tA} x0 ) = A e^{tA} x0,

and so t ↦ e^{tA} x0 solves dx/dt (t) = Ax(t). Furthermore, x(0) = e^{0·A} x0 = I x0 = x0.

Finally we show that the solution is unique. Let x be a solution to (1.15). Using the product
rule, we differentiate the matrix product e^{−tA} x(t):

d/dt ( e^{−tA} x(t) ) = −A e^{−tA} x(t) + e^{−tA} A x(t).

From the definition of the exponential, it can be seen that A and e^{−tA} commute, and so the
derivative of e^{−tA} x(t) is zero. Therefore, e^{−tA} x(t) is a constant column vector, say C, and x(t) =
e^{tA} C. As x(0) = x0, we obtain x0 = e^{0·A} C, that is, C = x0. Consequently, x(t) = e^{tA} x0.

Thus the matrix exponential enables us to solve the differential equation (1.15). Since direct
computation of the exponential can be quite difficult, the above theorem may not be easy to apply
in a concrete situation. But if A is a diagonalizable matrix, then the exponential can be computed:
eA = P eD P −1 . To compute the exponential explicitly in all cases requires putting the matrix into
Jordan form. But in the next section, we will learn yet another way of computing etA by using
Laplace transforms.
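To see Theorem 1.5.5 at work numerically, the sketch below (NumPy; the step size and time horizon are arbitrary, and this is not part of the notes) integrates x′ = Ax by the forward Euler method and compares the result with e^{TA} x0 obtained from the truncated series.

```python
import numpy as np

def expm_series(M, terms=40):
    # Truncated power series for the matrix exponential (1.13).
    E, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        E = E + term
    return E

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
x0 = np.array([1.0, 0.0])
T, steps = 1.0, 20000
h = T / steps

# Forward Euler: x_{k+1} = x_k + h A x_k.
x = x0.copy()
for _ in range(steps):
    x = x + h * (A @ x)

exact = expm_series(T * A) @ x0      # x(T) = e^{TA} x0
assert np.allclose(x, exact, atol=1e-3)
```

For this A the true solution is a rotation, x(t) = (cos t, −sin t), which the matrix exponential reproduces.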

We now go back to prove Theorems 1.5.1, 1.5.2, and 1.5.3.

For want of a more compact notation, we will denote the i, j-entry of a matrix A by Aij here.
So (AB)ij will stand for the entry of the matrix product matrix AB, and (Ak )ij for the entry of
Ak . With this notation, the i, j-entry of eA is the sum of the series
(e^A)_{ij} = I_{ij} + A_{ij} + (1/2!)(A²)_{ij} + (1/3!)(A³)_{ij} + … .
2! 3!
In order to prove that the series for the exponential converges, we need to show that the entries of
the powers A^k of a given matrix do not grow too fast, so that the series of absolute values of the
i,j-entries is dominated by a convergent series (and hence converges). Consider the following
function ‖·‖ on R^{n×n}:

‖A‖ := max{ |A_{ij}| : 1 ≤ i, j ≤ n }.   (1.16)

Thus |A_{ij}| ≤ ‖A‖ for all i, j. This is one of several possible “norms” on R^{n×n}, and it has the
following property.

Lemma 1.5.6 If A, B ∈ R^{n×n}, then ‖AB‖ ≤ n‖A‖‖B‖, and for all k ∈ N, ‖A^k‖ ≤ n^{k−1}‖A‖^k.

Proof We estimate the size of the i,j-entry of AB:

|(AB)_{ij}| = | Σ_{k=1}^{n} A_{ik} B_{kj} | ≤ Σ_{k=1}^{n} |A_{ik}| |B_{kj}| ≤ n‖A‖‖B‖.

Thus ‖AB‖ ≤ n‖A‖‖B‖. The second inequality follows from the first by induction.

Proof (of Theorem 1.5.1:) To prove that the matrix exponential converges, we show that the
series

I_{ij} + A_{ij} + (1/2!)(A²)_{ij} + (1/3!)(A³)_{ij} + …

is absolutely convergent, and hence convergent. Let a = n‖A‖. Then

|I_{ij}| + |A_{ij}| + (1/2!)|(A²)_{ij}| + (1/3!)|(A³)_{ij}| + …
  ≤ 1 + ‖A‖ + (1/2!) n‖A‖² + (1/3!) n²‖A‖³ + …
  = 1 + (1/n)( a + (1/2!)a² + (1/3!)a³ + … ) = 1 + (e^a − 1)/n.

Proof (of Theorem 1.5.2:) The terms of degree k in the expansions of (1.14) are

(1/k!)(A + B)^k = (1/k!) Σ_{r+s=k} (k choose r) A^r B^s   and   Σ_{r+s=k} (1/(r! s!)) A^r B^s

(the binomial expansion of (A + B)^k is valid here since AB = BA). These terms are equal since
for all k, and all r, s such that r + s = k,

(1/k!) (k choose r) = 1/(r! s!).

Define

S_n(A) = I + (1/1!)A + (1/2!)A² + ⋯ + (1/n!)A^n.

Then

S_n(A) S_n(B) = ( I + (1/1!)A + (1/2!)A² + ⋯ + (1/n!)A^n )( I + (1/1!)B + (1/2!)B² + ⋯ + (1/n!)B^n )
             = Σ_{r,s=0}^{n} (1/(r! s!)) A^r B^s,

while

S_n(A + B) = I + (1/1!)(A + B) + (1/2!)(A + B)² + ⋯ + (1/n!)(A + B)^n
           = Σ_{k=0}^{n} Σ_{r+s=k} (1/k!) (k choose r) A^r B^s = Σ_{k=0}^{n} Σ_{r+s=k} (1/(r! s!)) A^r B^s.

Comparing terms, we find that the expansion of the partial sum Sn (A + B) consists of the terms
in Sn (A)Sn (B) such that r + s ≤ n. We must show that the sum of the remaining terms tends to
zero as k tends to ∞.

Lemma 1.5.7 The series

Σ_k Σ_{r+s=k} | ( (1/(r! s!)) A^r B^s )_{ij} |

converges for all i, j.



Proof Let a = n‖A‖ and b = n‖B‖. We estimate the terms in the sum using Lemma 1.5.6:

|(A^r B^s)_{ij}| ≤ ‖A^r B^s‖ ≤ n‖A^r‖‖B^s‖ ≤ n (n^{r−1}‖A‖^r)(n^{s−1}‖B‖^s) ≤ a^r b^s.

Therefore

Σ_k Σ_{r+s=k} | ( (1/(r! s!)) A^r B^s )_{ij} | ≤ Σ_k Σ_{r+s=k} (a^r b^s)/(r! s!) = e^{a+b}.

The theorem follows from this lemma because, on the one hand, the i,j-entry of S_k(A)S_k(B) −
S_k(A + B) is bounded by

Σ_{r+s>k} | ( (1/(r! s!)) A^r B^s )_{ij} |.

According to the lemma, this sum tends to 0 as k tends to ∞. And on the other hand, S_k(A)S_k(B) −
S_k(A + B) tends to e^A e^B − e^{A+B}.

This completes the proof of Theorem 1.5.2.



Proof (of Theorem 1.5.3:) By definition,

d/dt e^{tA} = lim_{h→0} (1/h)( e^{(t+h)A} − e^{tA} ).

Since the matrices tA and hA commute, we have

(1/h)( e^{(t+h)A} − e^{tA} ) = ( (1/h)( e^{hA} − I ) ) e^{tA}.

So our theorem follows from this lemma:

Lemma 1.5.8 lim_{h→0} (1/h)( e^{hA} − I ) = A.

Proof The series expansion for the exponential shows that

(1/h)( e^{hA} − I ) − A = (h/2!)A² + (h²/3!)A³ + … .   (1.17)

We estimate this series. Let a = |h| n‖A‖. Then

| ( (h/2!)A² + (h²/3!)A³ + … )_{ij} | ≤ (|h|/2!) |(A²)_{ij}| + (|h|²/3!) |(A³)_{ij}| + …
  ≤ (1/2!) |h| n‖A‖² + (1/3!) |h|² n²‖A‖³ + …
  = ‖A‖ ( (1/2!) a + (1/3!) a² + … ) = (‖A‖/a)( e^a − 1 − a ) = ‖A‖ ( (e^a − 1)/a − 1 ).

Note that a → 0 as h → 0. Since the derivative of e^x is e^x,

lim_{a→0} (e^a − 1)/a = (d/dx) e^x |_{x=0} = e^0 = 1.

So (1.17) tends to 0 as h → 0.

This completes the proof of Theorem 1.5.3.
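Theorem 1.5.3 can also be sanity-checked with a central finite difference; the sketch below (NumPy, with an arbitrary test matrix and step size — not part of the notes) compares ( e^{(t+h)A} − e^{(t−h)A} )/(2h) against A e^{tA}.

```python
import numpy as np

def expm_series(M, terms=40):
    # Truncated power series for the matrix exponential (1.13).
    E, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        E = E + term
    return E

A = np.array([[1.0, 2.0],
              [0.0, -1.0]])
t, h = 0.7, 1e-6

# Central difference approximation of (d/dt) e^{tA}.
deriv = (expm_series((t + h) * A) - expm_series((t - h) * A)) / (2 * h)
assert np.allclose(deriv, A @ expm_series(t * A), atol=1e-6)
```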

Exercises.

1. (∗) If A ∈ R^{n×n}, then show that ‖e^A‖ ≤ e^{n‖A‖}. (In particular, for all t ≥ 0, ‖e^{tA}‖ ≤ e^{tn‖A‖}.)

2. (a) Let n ∈ N. Show that there exists a constant C (depending only on n) such that if
       A ∈ R^{n×n}, then for all v ∈ R^n, ‖Av‖ ≤ C‖A‖‖v‖. (Throughout this course, by ‖v‖,
       where v is a vector in R^n, we mean its Euclidean norm, that is, the square root of the
       sum of the squares of the n components of v.)

   (b) Show that if λ is an eigenvalue of A, then |λ| ≤ n‖A‖.

3. (a) (∗) Show that if λ is an eigenvalue of A and v is a corresponding eigenvector, then v is
       also an eigenvector of e^A, corresponding to the eigenvalue e^λ of e^A.

   (b) Solve x′(t) = [ 4 3 ; 3 4 ] x(t), x(0) = x0, when

       (i) x0 = (1, 1)ᵀ,  (ii) x0 = (1, −1)ᵀ,  and  (iii) x0 = (2, 0)ᵀ.

       Hint: In parts (i) and (ii), observe that the initial condition is an eigenvector of the
       square matrix in question, and in part (iii), express the initial condition as a combination
       of the initial conditions from the previous two parts.

4. (∗) Prove that e^{tAᵀ} = (e^{tA})ᵀ. (Here Mᵀ denotes the transpose of the matrix M.)

5. (∗) Let A ∈ R^{n×n}, and let S = { x : R → R^n | ∀t ∈ R, x′(t) = Ax(t) }. In this exercise we
   will show that S is a finite dimensional vector space with dimension n.

   (a) Let C(R; R^n) denote the vector space of all functions f : R → R^n with pointwise
       addition and scalar multiplication. Show that S is a subspace of C(R; R^n).

   (b) Let e1, …, en denote the standard basis vectors in R^n. By Theorem 1.5.5, we know
       that for each k ∈ {1, …, n}, there exists a unique solution to the initial value problem
       x′(t) = Ax(t), t ∈ R, x(0) = ek. Denote this unique solution by fk. Thus we obtain
       the set of functions f1, …, fn ∈ S. Prove that {f1, …, fn} is linearly independent.
       Hint: Set t = 0 in α1 f1 + ⋯ + αn fn = 0.

(c) Show that S = span{f1 , . . . , fn }, and conclude that S is a finite dimensional vector
space of dimension n.

1.6 Computation of etA

In the previous section, we saw that the computation of e^{tA} is easy if the matrix A is diagonalizable.
However, not all matrices are diagonalizable. For example, consider the matrix

A = [ 0 1 ; 0 0 ].

Both eigenvalues of this matrix are 0, and so if it were diagonalizable, its diagonal form would
be the zero matrix; but if there existed an invertible P such that P⁻¹AP were this zero matrix,
then A itself would have to be zero, which it is not!

In general, however, every matrix has what is called a Jordan canonical form, that is, there
exists an invertible P such that P⁻¹AP = D + N, where D is diagonal, N is nilpotent (that is,
there exists an n ≥ 0 such that N^{n+1} = 0), and D and N commute. Then one can compute the
exponential of A:

e^{tA} = P e^{tD} ( I + tN + (1/2!) t² N² + ⋯ + (1/n!) t^n N^n ) P⁻¹.

However, the algorithm for computing the P taking A to the Jordan form requires some sophis-
ticated linear algebra. So we give a different procedure for calculating e^{tA} below, using Laplace
transforms. First we will prove the following theorem.

Theorem 1.6.1 For large enough s,  ∫₀^∞ e^{−st} e^{tA} dt = (sI − A)⁻¹.

Proof First choose an s0 large enough so that s0 > n‖A‖. Then for all s > s0, we have

∫₀^∞ e^{−ts} e^{tA} dt = ∫₀^∞ e^{−t(sI−A)} dt
                     = (sI − A)⁻¹ ∫₀^∞ (sI − A) e^{−t(sI−A)} dt
                     = (sI − A)⁻¹ ∫₀^∞ ( −d/dt e^{−t(sI−A)} ) dt
                     = (sI − A)⁻¹ [ −e^{−ts} e^{tA} ]_{t=0}^{t=∞}
                     = (sI − A)⁻¹ (0 + I)
                     = (sI − A)⁻¹.

(In the above, we used Exercise 1 on page 16, which gives ‖e^{tA}‖ ≤ e^{tn‖A‖} ≤ e^{ts0},
and so ‖e^{−ts} e^{tA}‖ ≤ e^{t(s0−s)} → 0 as t → ∞. Also, we have used Exercise 2b, which gives
invertibility of sI − A.)

If s is not an eigenvalue of A, then sI − A is invertible, and Cramer’s rule¹ says that

(sI − A)⁻¹ = (1/det(sI − A)) adj(sI − A).

Here adj(sI − A) denotes the classical adjoint (adjugate) of the matrix sI − A, which is defined as
follows: its (i, j)th entry is obtained by multiplying (−1)^{i+j} and the determinant of the matrix
obtained by deleting the jth row and ith column of sI − A. Thus we see that each entry of
adj(sI − A) is a polynomial in s whose degree is at most n − 1. (Here n denotes the size of A; that
is, A is an n × n matrix.)

Consequently, each entry m_{ij} of (sI − A)⁻¹ is a rational function; in other words, it is a ratio
of two polynomials (in s), p_{ij} and q := det(sI − A):

m_{ij} = p_{ij}(s) / q(s).

Also from the above, we see that deg(p_{ij}) ≤ deg(q) − 1. From the fundamental theorem of algebra,
we know that the monic polynomial q can be factored as

q(s) = (s − λ1)^{m1} ⋯ (s − λk)^{mk},

where λ1, …, λk are the distinct eigenvalues of A (the roots of q(s) = det(sI − A)), with algebraic
multiplicities m1, …, mk.

By the “partial fraction expansion” one learns in calculus, it follows that one can find suitable
coefficients C_{l,r} for a decomposition of each rational entry of (sI − A)⁻¹ as follows:

m_{ij} = Σ_{l=1}^{k} Σ_{r=1}^{m_l} C_{l,r} / (s − λ_l)^r.

Thus if fij (t) denotes the (i, j)th entry of etA , then its Laplace transform will be an expression of
the type mij given above. Now it turns out that this determines the fij , and this is the content
of the following result.
1 For a proof, see for instance Artin [3].

Theorem 1.6.2 Let a ∈ C and n ∈ N. If f is a continuous function defined on [0, ∞), and if
there exists an s0 such that for all s > s0,

F(s) := ∫₀^∞ e^{−st} f(t) dt = 1/(s − a)^n,

then

f(t) = (1/(n−1)!) t^{n−1} e^{ta}  for all t ≥ 0.

Proof The proof is beyond the scope of this course, but we refer the interested reader to Exercise
11.38 on page 342 of Apostol [1].

So we have a procedure for computing e^{tA}: form the matrix sI − A, compute its inverse (as
a matrix of rational functions), perform a partial fraction expansion of each of its entries, and take
the inverse Laplace transform of each elementary fraction. Sometimes the partial fraction expansion
may be avoided by making use of the following corollary (which can be obtained from Theorem 1.6.2,
by a partial fraction expansion!).

Corollary 1.6.3 Let f be a continuous function defined on [0, ∞), and let there exist an s0 such
that for all s > s0, F, defined by

F(s) := ∫₀^∞ e^{−st} f(t) dt,

is one of the functions given in the first column below. Then f is given by the corresponding entry
in the second column.

F(s)                          f(t)
b / ((s − a)² + b²)           e^{ta} sin(bt)
(s − a) / ((s − a)² + b²)     e^{ta} cos(bt)
b / ((s − a)² − b²)           e^{ta} sinh(bt)
(s − a) / ((s − a)² − b²)     e^{ta} cosh(bt)

Example. If A = [ 0 1 ; 0 0 ], then sI − A = [ s −1 ; 0 s ], and so

(sI − A)⁻¹ = (1/s²) [ s 1 ; 0 s ] = [ 1/s  1/s² ; 0  1/s ].

By using Theorem 1.6.2 (‘taking the inverse Laplace transform’), we obtain e^{tA} = [ 1 t ; 0 1 ]. ♦
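Because the A in this example is nilpotent (A² = 0), the exponential series terminates after two terms, and the answer can also be confirmed directly; a minimal NumPy sketch (arbitrary t, not part of the notes):

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
t = 2.5

assert np.allclose(A @ A, 0)         # A is nilpotent: A^2 = 0

# The series e^{tA} = I + tA + (t^2/2!)A^2 + ... therefore terminates.
E = np.eye(2) + t * A
assert np.allclose(E, [[1.0, t], [0.0, 1.0]])
```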

Exercises.

1. Compute e^{tA}, when A = [ 3 −1 ; 1 1 ].

2. Compute e^{tA}, for the ‘Jordan block’ A = [ λ 1 0 ; 0 λ 1 ; 0 0 λ ].

   Remark: In general, if A is the n × n Jordan block with λ on the diagonal and 1 on the
   superdiagonal, then e^{tA} is e^{λt} times the upper triangular matrix whose (i, j)th entry,
   for j ≥ i, is t^{j−i}/(j−i)!:

   e^{tA} = e^{λt} [ 1  t  t²/2!  …  t^{n−1}/(n−1)! ;
                     0  1  t      …  t^{n−2}/(n−2)! ;
                     ⋮             ⋱               ⋮ ;
                     0  0  0      …  1 ].

3. (a) Compute e^{tA}, when A = [ a b ; −b a ].

   (b) Find the solution to

       x″ + kx = 0,  x(0) = 1,  x′(0) = 0.

       (Here k is a fixed positive constant.)

       Hint: Introduce the state variables x1 = √k x and x2 = x′.

       Suppose that k = 1, and find (x(t))² + (x′(t))². What do you observe? If one identifies
       (x(t), x′(t)) with a point in the plane at time t, then how does this point move with
       time?

4. Suppose that A is a 2 × 2 matrix such that

   e^{tA} = [ cosh t  sinh t ; sinh t  cosh t ],  t ∈ R.

   Find A.
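The closed form for the exponential of a Jordan block stated in the remark to Exercise 2 can be verified numerically; the following sketch (NumPy, with an arbitrary λ and t — not part of the notes) checks the 3 × 3 case against a truncated series.

```python
import numpy as np
from math import factorial

def expm_series(M, terms=60):
    # Truncated power series for the matrix exponential (1.13).
    E, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        E = E + term
    return E

lam, t = -0.5, 1.3
J = np.array([[lam, 1.0, 0.0],
              [0.0, lam, 1.0],
              [0.0, 0.0, lam]])

# Claimed closed form: e^{tJ} = e^{lam*t} times powers of t above the diagonal.
closed = np.exp(lam * t) * np.array([[1.0, t, t**2 / factorial(2)],
                                     [0.0, 1.0, t],
                                     [0.0, 0.0, 1.0]])
assert np.allclose(expm_series(t * J), closed)
```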

1.7 Stability considerations

Just as in the scalar example x′ = ax, where we saw that the sign of (the real part of) a determines
the behaviour of the solution as t → ∞, it turns out that by looking at the real parts
of the eigenvalues of the matrix A one can say similar things in the case of the system x′ = Ax.
We will study this in this section.

We begin by proving the following result.

Lemma 1.7.1 Suppose that λ ∈ C and k is a nonnegative integer. For every ω > Re(λ), there
exists an Mω > 0 such that for all t ≥ 0, |t^k e^{λt}| ≤ Mω e^{ωt}.

Proof We have

e^{(ω−Re(λ))t} = Σ_{n=0}^{∞} ( (ω − Re(λ))^n t^n ) / n! ≥ ( (ω − Re(λ))^k t^k ) / k!,

and so t^k e^{(Re(λ)−ω)t} ≤ Mω, where

Mω := k! / (ω − Re(λ))^k > 0.

Consequently, for t ≥ 0, |t^k e^{λt}| = t^k e^{Re(λ)t} = t^k e^{(Re(λ)−ω)t} e^{ωt} ≤ Mω e^{ωt}.

In the sequel, we denote the set of eigenvalues of A by σ(A), sometimes referred to as the
spectrum of A.

Theorem 1.7.2 Let A ∈ R^{n×n}.

1. Every solution of x′ = Ax tends to zero as t → ∞ iff for all λ ∈ σ(A), Re(λ) < 0. Moreover,
   in this case, the solutions converge exponentially to 0: there exist ε > 0 and M > 0 such
   that for all t ≥ 0, ‖x(t)‖ ≤ M e^{−εt} ‖x(0)‖.

2. If there exists a λ ∈ σ(A) such that Re(λ) > 0, then for every δ > 0, there exists an x0 ∈ R^n
   with ‖x0‖ < δ, such that the unique solution to x′ = Ax with initial condition x(0) = x0
   satisfies ‖x(t)‖ → ∞ as t → ∞.

Proof 1. We use Theorem 1.6.1. From Cramer’s rule, it follows that each entry in (sI − A)⁻¹
is a rational function with denominator equal to the characteristic polynomial of A, and then
by using a partial fraction expansion and Theorem 1.6.2, it follows that each entry in e^{tA} is a
linear combination of terms of the form t^k e^{λt}, where k is a nonnegative integer and λ ∈ σ(A). By
Lemma 1.7.1, we conclude that if each eigenvalue of A has real part < 0, then there exist positive
constants M and ε such that for all t ≥ 0, ‖e^{tA}‖ < M e^{−εt}.

On the other hand, if each solution tends to 0 as t → ∞, then in particular, if v ∈ R^n
is an eigenvector² corresponding to the eigenvalue λ, then with initial condition x(0) = v, we have
x(t) = e^{tA} v = e^{λt} v, and so ‖x(t)‖ = e^{Re(λ)t} ‖v‖ → 0 as t → ∞, and so it must be the case
that Re(λ) < 0.

2. Let λ ∈ σ(A) be such that Re(λ) > 0, and let v ∈ R^n be a corresponding eigenvector³. Given
δ > 0, define x0 = ( δ/(2‖v‖) ) v ∈ R^n. Then ‖x0‖ = δ/2 < δ, and the unique solution x to x′ = Ax
with initial condition x(0) = x0 satisfies ‖x(t)‖ = (δ/2) e^{Re(λ)t} → ∞ as t → ∞.

In the case when we have eigenvalues with real parts equal to zero, then a more careful analysis
is required and the boundedness of solutions depends on the algebraic/geometric multiplicity of the
eigenvalues with zero real parts. We will not give a detailed analysis, but consider two examples
which demonstrate that the solutions may or may not remain bounded.

2 With a complex eigenvalue, this vector is not in Rn ! But the proof can be modified so as to still yield the

desired conclusion.
3 See the previous footnote!

Examples. Consider the system x′ = Ax, where

A = [ 0 0 ; 0 0 ].

Then the system trajectories are constants x(t) ≡ x(0), and so they are bounded.

On the other hand, if

A = [ 0 1 ; 0 0 ],

then

e^{tA} = [ 1 t ; 0 1 ],

and so with the initial condition x(0) = (0, δ)ᵀ, we have ‖x(t)‖ = δ √(1 + t²) → ∞ as t → ∞ for
all δ > 0. So even if one starts arbitrarily close to the origin, the solution can become unbounded.
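Both regimes — exponential decay when all eigenvalues have negative real part, and unbounded growth in the nilpotent example despite zero eigenvalues — can be observed numerically; a sketch (NumPy, arbitrary matrices and times, not part of the notes):

```python
import numpy as np

def expm_series(M, terms=60):
    # Truncated power series for the matrix exponential (1.13).
    E, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        E = E + term
    return E

# Eigenvalues -1 and -2: every solution decays.
A = np.array([[-1.0, 3.0],
              [0.0, -2.0]])
x0 = np.array([1.0, 1.0])
norms = [np.linalg.norm(expm_series(t * A) @ x0) for t in (0.0, 1.0, 2.0)]
assert norms[2] < norms[1] < norms[0]

# The nilpotent example: both eigenvalues are 0, yet the solution starting
# at (0, delta) grows without bound, since e^{tB} = I + tB exactly.
B = np.array([[0.0, 1.0],
              [0.0, 0.0]])
delta = 1e-3
x_large = (np.eye(2) + 10000.0 * B) @ np.array([0.0, delta])
assert np.linalg.norm(x_large) > 1.0
```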

Exercises.

1. Determine if all solutions of x′ = Ax are bounded, and if so, if all solutions tend to 0 as
   t → ∞.

   (a) A = [ 1 2 ; 3 4 ]

   (b) A = [ 1 0 ; 0 −1 ]

   (c) A = [ 1 1 0 ; 0 −2 −1 ; 0 0 −1 ]

   (d) A = [ 1 1 1 ; 1 0 1 ; 0 0 −2 ]

   (e) A = [ −1 0 0 ; 0 −2 0 ; 1 0 −1 ]

   (f) A = [ −1 0 −1 ; 0 −2 0 ; 1 0 0 ].

2. For what values of α ∈ R can we conclude that all solutions of the system x′ = Ax will be
   bounded for t ≥ 0, if A = [ α 1+α ; −(1+α) α ]?
Chapter 2

Phase plane analysis

2.1 Introduction

In the preceding chapter, we learnt how one can solve a system of linear differential equations.
However, the equations that arise in most practical situations are inherently nonlinear, and typi-
cally it is impossible to solve these explicitly. Nevertheless, sometimes it is possible to obtain an
idea of what its solutions look like (the “qualitative behaviour”), and we learn one such method
in this chapter, called phase plane analysis.

Phase plane analysis is a graphical method for studying 2D autonomous systems. This method
was introduced by mathematicians (among others, Henri Poincaré) in the 1890s.

The basic idea of the method is to generate in the state space motion trajectories corresponding
to various initial conditions, and then to examine the qualitative features of the trajectories. As a
graphical method, it allows us to visualise what goes on in a nonlinear system starting from various
initial conditions, without having to solve the nonlinear equations analytically. Thus, information
concerning stability and other motion patterns of the system can be obtained. In this chapter, we
learn the basic tools of the phase plane analysis.

2.2 Concepts of phase plane analysis

2.2.1 Phase portraits

The phase plane method is concerned with the graphical study of 2-dimensional autonomous
systems:

x1′(t) = f1(x1(t), x2(t)),
x2′(t) = f2(x1(t), x2(t)),   (2.1)

where x1 and x2 are the states of the system, and f1 and f2 are nonlinear functions from R² to
R. Geometrically, the state space is a plane, and we call this plane the phase plane.

Given a set of initial conditions x(0) = x0 , we denote by x the solution to the equation (2.1).
(We assume throughout this chapter that given an initial condition there exists a unique solution
for all t ≥ 0: this is guaranteed under mild assumptions on f1 , f2 , and we will learn more about
this in Chapter 4.) With time t varied from 0 to ∞, the solution t $→ x(t) can be represented


geometrically as a curve in the phase plane. Such a curve is called a (phase plane) trajectory.
A family of phase plane trajectories corresponding to various initial conditions is called a phase
portrait of the system. From the assumption about the existence of solution, we know that from
each point in the phase plane there passes a curve, and from the uniqueness, we know that there
can be only one such curve. Thus no two trajectories in the phase plane can intersect, for if they
did intersect at a point, then with that point as the initial condition, we would have two solutions1 ,
which is a contradiction!

To illustrate the concept of a phase portrait, let us consider the following simple system.

Example. Consider the system

x1′ = x2,
x2′ = −x1.

Thus the system is a linear ODE x′ = Ax with A = [ 0 1 ; −1 0 ]. Then

e^{tA} = [ cos t  sin t ; −sin t  cos t ],

and so if the initial condition expressed in polar coordinates is

(x10, x20) = (r0 cos θ0, r0 sin θ0),

then it can be seen that the solution is

(x1(t), x2(t)) = (r0 cos(θ0 − t), r0 sin(θ0 − t)),  t ≥ 0.   (2.2)

We note that

(x1(t))² + (x2(t))² = r0²,

which represents a circle in the phase plane. Corresponding to different initial conditions, circles
of different radii are obtained, and from (2.2) it is easy to see that the motion is clockwise.
Plotting these circles on the phase plane, we obtain a phase portrait as shown in Figure 2.1.


Figure 2.1: Phase portrait.

We see that the trajectories neither converge to the origin nor diverge to infinity. They simply
circle around the origin. ♦
1 Here we are really running the differential equation backwards in time, but then we can make the change of
variables τ = −t.
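Figure 2.1 can be reproduced with a plotting library such as matplotlib; the minimal sketch below (NumPy only, not part of the notes) instead just verifies the two qualitative claims of the example: the trajectories are circles, and the motion is clockwise.

```python
import numpy as np

# Solution from (2.2): x(t) = (r0 cos(theta0 - t), r0 sin(theta0 - t)).
r0, theta0 = 2.0, 0.3
ts = np.linspace(0.0, 2 * np.pi, 200)
x1 = r0 * np.cos(theta0 - ts)
x2 = r0 * np.sin(theta0 - ts)

# Every point of the trajectory lies on the circle of radius r0.
assert np.allclose(x1**2 + x2**2, r0**2)

# Clockwise motion: starting on the positive x1-axis (theta0 = 0),
# x2 becomes negative just after t = 0.
assert np.sin(0.0 - 0.01) < 0
```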

2.2.2 Singular points

An important concept in phase plane analysis is that of a singular point.

Definition. A singular point of the system

x1′ = f1(x1, x2),
x2′ = f2(x1, x2),

is a point (x1∗, x2∗) in the phase plane such that f1(x1∗, x2∗) = 0 and f2(x1∗, x2∗) = 0.

Such a point is also sometimes called an equilibrium point (see Chapter 3), that is, a point
where the system states can stay forever: if we start with this initial condition, then the unique
solution is x1 (t) = x1∗ and x2 (t) = x2∗ . So through that point in the phase plane, only the ‘trivial
curve’ comprising just that point passes.

For a linear system x" = Ax, if A is invertible, then the only singular point is (0, 0), and if
A is not invertible, then all the points from the kernel of A are singular points. So in the case
of linear systems, either there is only one equilibrium point, or infinitely many singular points,
none of which is then isolated. But in the case of nonlinear systems, there can be more than one
isolated singular point, as demonstrated in the following example.

Example. Consider the system

x1′ = x2,
x2′ = −(1/2) x2 − 2x1 − x1²,

whose phase portrait is shown in Figure 2.2. The system has two singular points, one at (0, 0),


Figure 2.2: Phase portrait.

and the other at (−2, 0). The motion patterns of the system trajectories starting in the vicinity
of the two singular points have different natures. The trajectories move towards the point (0, 0),
while they move away from (−2, 0). ♦

One may wonder why an equilibrium point of a 2D system is called a singular point. To answer
this, let us examine the slope of the phase trajectories. The slope of the phase trajectory at time
t is given by

dx2/dx1 = (dx2/dt) / (dx1/dt) = f2(x1, x2) / f1(x1, x2).

When both f1 and f2 are zero at a point, this slope is undetermined, and this accounts for the
adjective ‘singular’.

Singular points are important features in the phase plane, since they reveal some information
about the system. For nonlinear systems, besides singular points, there may be more complex
features, such as limit cycles. These will be discussed later in this chapter.

Note that although the phase plane method is developed primarily for 2D systems, it can be
also applied to the analysis of nD systems in general, but the graphical study of higher order
systems is computationally and geometrically complex. On the other hand with 1D systems, the
phase “plane” is reduced to the real line. We consider an example of a 1D system below.

Example. Consider the system x′ = −x + x³.

Figure 2.3: Phase portrait of the system x′ = −x + x³ (the real line, with the singular points −1, 0, 1 marked).

The singular points are determined by the equation

−x + x³ = 0,

which has three real solutions, namely −1, 0 and 1. The phase portrait is shown in Figure 2.3.
Indeed, for example, if we consider the solution to

x′(t) = −x(t) + (x(t))³,  t ≥ t0,  x(t0) = x0,

with 0 < x0 < 1, then we observe that

x′(t0) = −x0 + x0³ = −x0 (1 − x0²) < 0,

since x0 > 0 and 1 − x0² > 0, and this means that t ↦ x(t) is decreasing, and so the “motion” starting from x0 is towards the
left. This explains the direction of the arrow for the region 0 < x < 1 in Figure 2.3. ♦
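The arrows on the phase line come directly from the sign of f(x) = −x + x³ in each interval between singular points, which is easy to tabulate; a minimal NumPy sketch (not part of the notes):

```python
import numpy as np

def f(x):
    return -x + x**3

# The singular points are the real roots of f.
roots = np.array([-1.0, 0.0, 1.0])
assert np.allclose(f(roots), 0.0)

# The sign of f gives the direction of motion on the phase line.
assert f(0.5) < 0      # 0 < x < 1: motion to the left, towards 0
assert f(1.5) > 0      # x > 1: motion to the right, away from 1
assert f(-0.5) > 0     # -1 < x < 0: motion to the right, towards 0
assert f(-1.5) < 0     # x < -1: motion to the left, away from -1
```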

Exercises.

1. Locate the singular points of the following systems.

   (a) x1′ = x2,
       x2′ = sin x1.

   (b) x1′ = x1 − x2,
       x2′ = x2² − x1.

   (c) x1′ = x1² (x2 − 1),
       x2′ = x1 x2.

   (d) x1′ = x1² (x2 − 1),
       x2′ = x1² − 2x1x2 − x2².

   (e) x1′ = sin x2,
       x2′ = cos x1.
2. Sketch the following parameterised curves in the phase plane.
   (a) (x1, x2) = (a cos t, b sin t), where a > 0, b > 0.
   (b) (x1, x2) = (a e^t, b e^{−2t}), where a > 0, b > 0.
3. Draw phase portraits of the following 1D systems.
   (a) x′ = x².
   (b) x′ = e^x.
   (c) x′ = cosh x.
   (d) x′ = sin x.
   (e) x′ = cos x − 1.
   (f) x′ = sin(2x).

4. Consider a 2D autonomous system for which there exists a unique solution for every initial
   condition in R², for all t ∈ R.

   (a) Show that if (x1, x2) is a solution, then for any T ∈ R, the shifted functions (y1, y2)
       given by

       y1(t) = x1(t + T),
       y2(t) = x2(t + T),

       (t ∈ R) also form a solution.

   (b) (∗) Can

       x(t) = ( 2 cos t / (1 + (sin t)²), sin(2t) / (1 + (sin t)²) ),  t ∈ R,

       be the solution of such a 2D autonomous system?

       Hint: By the first part, we know that t ↦ y1(t) := x(t + π/2) and t ↦ y2(t) :=
       x(t + 3π/2) are also solutions. Check that y1(0) = y2(0). Is y1 ≡ y2? What does this
       say about uniqueness of solutions starting from a given initial condition?

   (c) Using Maple, sketch the curve

       t ↦ ( 2 cos t / (1 + (sin t)²), sin(2t) / (1 + (sin t)²) ).

       (This curve is called the lemniscate.)


5. Consider the ODE (1.6) from Exercise 2 on page 6.
(a) Using Maple, find the singular points (approximately).
(b) Draw a phase portrait in the region x ≥ 0.

6. A simple model for a national economy is given by

   I′ = I − αC,
   C′ = β(I − C − G),

   where

   I denotes the national income,
   C denotes the rate of consumer spending, and
   G denotes the rate of government expenditure.

   The model is restricted to I, C, G ≥ 0, and the constants α, β satisfy α > 1, β ≥ 1.

   (a) Suppose that the government expenditure is related to the national income according
       to G = G0 + kI, where G0 and k are positive constants. Find the range of positive k’s
       for which there exists an equilibrium point such that I, C, G are nonnegative.

   (b) Let k = 0, and let (I0, C0) denote the equilibrium point. Introduce the new variables
       I1 = I − I0 and C1 = C − C0. Show that (I1, C1) satisfies a linear system of equations:

       ( I1′ ; C1′ ) = [ 1 −α ; β −β ] ( I1 ; C1 ).

       If β = 1 and α = 2, then conclude that in fact the economy oscillates.

2.3 Constructing phase portraits

Phase portraits can be routinely generated using computers, and this has spurred many advances
in the study of complex nonlinear dynamic behaviour. Nevertheless, in this section, we learn a few
techniques in order to be able to roughly sketch the phase portraits. This is useful for instance
in order to verify the plausibility of computer generated outputs. We describe two methods: one
involves the analytic solution of differential equations. If an analytic solution is not available, the
other tool, called the method of isoclines, is useful.

2.3.1 Analytic method

There are two techniques for generating phase portraits analytically. One is to first solve for x1
and x2 explicitly as functions of t, and then to eliminate t, as we had done in the example on page
26.

The other analytic method does not involve an explicit computation of the solutions as functions
of time; instead, one solves the differential equation

dx2/dx1 = f2(x1, x2) / f1(x1, x2).

Thus given a trajectory t ↦ (x1(t), x2(t)), we eliminate t by setting up a differential equation
for the derivative of the second function ‘with respect to the first one’, not involving t, and by
solving this differential equation. We illustrate this in the same example.

Example. Consider the system

x1' = x2,
x2' = −x1.

We have dx2/dx1 = −x1/x2, and so x2 dx2/dx1 = −x1. Thus

d/dx1 (x2²/2) = x2 dx2/dx1 = −x1.

Integrating with respect to x1, and using the fundamental theorem of calculus, we obtain x1² + x2² = C. This equation describes a circle in the (x1, x2)-plane. Thus the trajectories satisfy

(x1(t))² + (x2(t))² = C = (x1(0))² + (x2(0))², t ≥ 0,

and they are circles. We note that when x1(0) lies in the right half plane, then x2'(0) = −x1(0) < 0, and so t ↦ x2(t) is initially decreasing. Thus we see that the motion is clockwise, as shown in Figure 2.1. ♦
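The conservation law x1² + x2² = C can also be checked numerically. The following is a minimal sketch; the Runge-Kutta helper, step size, and integration horizon are our own choices for illustration, not part of the notes.

```python
# Check that numerical trajectories of x1' = x2, x2' = -x1 stay on the
# circle x1^2 + x2^2 = C found by the analytic method.
import math

def f(x):
    # the system x1' = x2, x2' = -x1
    return [x[1], -x[0]]

def rk4_step(x, h):
    # one classical fourth-order Runge-Kutta step
    k1 = f(x)
    k2 = f([x[i] + 0.5 * h * k1[i] for i in range(2)])
    k3 = f([x[i] + 0.5 * h * k2[i] for i in range(2)])
    k4 = f([x[i] + h * k3[i] for i in range(2)])
    return [x[i] + h / 6.0 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
            for i in range(2)]

def radius_after(x0, t_end=6.283, h=0.001):
    # integrate for roughly one full turn and return the final radius
    x = list(x0)
    for _ in range(int(t_end / h)):
        x = rk4_step(x, h)
    return math.hypot(x[0], x[1])
```

Since the exact radius is conserved, `radius_after` should return (very nearly) the initial radius for any starting point.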

Exercises.

1. Sketch the phase portrait of the system x1' = x2, x2' = x1.

2. Sketch the phase portrait of the system x1' = −2x2, x2' = x1.

3. (a) Sketch the curve y(x) = x(A + B log |x|), where A, B are constants and B > 0.
   (b) (∗) Sketch the phase portrait of the system x1' = x1 + x2, x2' = x2.
       Hint: Solve the system, and try eliminating t.

2.3.2 The method of isoclines

At a point (x1, x2) in the phase plane, the slope of the tangent to the trajectory is dx2/dx1 = f2(x1, x2)/f1(x1, x2). An isocline is a curve in R² defined by f2(x1, x2)/f1(x1, x2) = α, where α is a real number. This means that if we look at all the trajectories that pass through various points on the same isocline, then all of these trajectories have the same slope (equal to α) at the points of this isocline. To obtain trajectories from the isoclines, we assume that the tangent slopes are locally constant. The method of constructing the phase portrait using isoclines is thus the following:

Step 1. For various values of α, construct the corresponding isoclines. Along an isocline, draw small line segments with slope α. In this manner a field of directions is obtained.

Step 2. Since the tangent slopes are locally constant, we can construct a phase plane trajectory by connecting a sequence of line segments.
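As a quick sanity check of the defining property, consider the system x1' = x2, x2' = −x1 (treated in the example that follows): every point on the isocline x1 + αx2 = 0 should give trajectory slope exactly α. The helper names in this sketch are illustrative only.

```python
# For x1' = x2, x2' = -x1 the trajectory slope is dx2/dx1 = -x1/x2, and the
# isocline of slope alpha is the line x1 = -alpha*x2.
def trajectory_slope(x1, x2):
    # dx2/dx1 = f2/f1 = -x1/x2 (defined away from x2 = 0)
    return -x1 / x2

def isocline_points(alpha, x2_values):
    # sample points on the isocline x1 + alpha*x2 = 0
    return [(-alpha * x2, x2) for x2 in x2_values]

def slopes_on_isocline(alpha, x2_values=(0.5, 1.0, 2.0, -3.0)):
    # the slopes measured at those points; all should equal alpha
    return [trajectory_slope(x1, x2) for (x1, x2) in isocline_points(alpha, x2_values)]
```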

We illustrate the method by means of two examples.

Example. Consider the system x1' = x2, x2' = −x1. The slope is given by dx2/dx1 = −x1/x2, and so the isocline corresponding to slope α is x1 + αx2 = 0; these points lie on a straight line. By taking different values for α, a set of isoclines can be drawn, and in this manner a field of directions of tangents to the trajectories is generated, as shown in Figure 2.4; the trajectories in the phase portrait are circles. If x1(0) > 0, then x2'(0) = −x1(0) < 0, and so the motion is clockwise. ♦

Figure 2.4: Method of isoclines (isoclines shown for α = −1, 0, 1 and α = ∞).

Let us now use the method of isoclines to study a nonlinear equation.

Example. (van der Pol equation) Consider the differential equation

y'' + µ(y² − 1)y' + y = 0.   (2.3)

By introducing the variables x1 = y and x2 = y', we obtain the following 2D system:

x1' = x2,
x2' = −µ(x1² − 1)x2 − x1.

An isocline of slope α is defined by (−µ(x1² − 1)x2 − x1)/x2 = α; that is, the points on the curve

x2 = x1/((µ − µx1²) − α)

all correspond to the same slope α of tangents to trajectories.

We take the value µ = 1/2. By taking different values for α, different isoclines can be obtained, and short line segments can be drawn on the isoclines to generate a field of directions, as shown in Figure 2.5. The phase portrait can then be obtained, as shown.

It is interesting to note that from the phase portrait, one can guess that there exists a closed curve in the phase portrait, and the trajectories starting from both outside and inside seem to converge to this curve.² ♦
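The isocline formula above can be verified directly: on the curve x2 = x1/((µ − µx1²) − α), the slope (−µ(x1² − 1)x2 − x1)/x2 should come out exactly α. A small sketch with µ = 1/2; the sample points and helper names are ad hoc choices.

```python
# Verify the van der Pol isocline formula for mu = 1/2.
MU = 0.5

def slope(x1, x2, mu=MU):
    # trajectory slope dx2/dx1 = (-mu*(x1^2 - 1)*x2 - x1)/x2
    return (-mu * (x1**2 - 1.0) * x2 - x1) / x2

def isocline_x2(x1, alpha, mu=MU):
    # the curve of points whose trajectory slope is alpha
    return x1 / ((mu - mu * x1**2) - alpha)

def max_slope_error(alpha, xs=(0.3, 0.9, 1.7, -2.2)):
    # largest deviation of the measured slope from alpha on sample points
    return max(abs(slope(x1, isocline_x2(x1, alpha)) - alpha) for x1 in xs)
```

A short algebraic check explains why this is exact: with D = µ(1 − x1²) − α and x2 = x1/D, the slope is µ(1 − x1²) − x1/x2 = µ(1 − x1²) − D = α.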

Exercise. Using the method of isoclines, sketch a phase portrait of the system x1' = x2, x2' = x1.

2.3.3 Phase portraits using Maple

We consider a few examples in order to illustrate how one can make phase portraits using Maple.
² This is also expected on physical grounds: the van der Pol equation arises from electric circuits containing vacuum tubes, where for small oscillations energy is fed into the system, while for large oscillations energy is taken out of the system. In other words, large oscillations are damped, while small oscillations experience 'negative damping' (that is, energy is fed into the system). So one can expect that such a system will approach some periodic behaviour, which appears as a closed curve in the phase portrait.

Figure 2.5: Method of isoclines.

Consider the ODE system x1'(t) = x2(t) and x2'(t) = −x1(t). We can plot x1 against x2 by using DEplot. Consider for example:

> with(DEtools):
> ode3a := diff(x1(t), t) = x2(t); ode3b := diff(x2(t), t) = -x1(t);
> DEplot({ode3a, ode3b}, {x1(t), x2(t)}, t = 0..10, x1 = -2..2, x2 = -2..2,
    [[x1(0) = 1, x2(0) = 0]], stepsize = 0.01, linecolour = black);

The resulting plot is shown in Figure 2.6. The arrows show the direction field.

Figure 2.6: Phase portrait for the ODE system x1' = x2 and x2' = −x1.

By including some more trajectories, we can construct a phase portrait in a given region, as
shown in the following example.

Example. Consider the system

x1' = −x2 + x1(1 − x1² − x2²),
x2' = x1 + x1(1 − x1² − x2²).

Using the following Maple commands, we can obtain the phase portrait shown in Figure 2.7.

> with(DEtools):
> ode1 := diff(x1(t), t) = -x2(t) + x1(t)*(1 - x1(t)^2 - x2(t)^2);
> ode2 := diff(x2(t), t) = x1(t) + x1(t)*(1 - x1(t)^2 - x2(t)^2);
> initvalues := seq(seq([x1(0) = i + 1/2, x2(0) = j + 1/2], i = -2..1), j = -2..1):
> DEplot({ode1, ode2}, [x1(t), x2(t)], t = -4..4, x1 = -2..2, x2 = -2..2, [initvalues],
    stepsize = 0.05, arrows = MEDIUM, colour = black, linecolour = red);

Figure 2.7: Phase portrait.


Exercises.

1. Using Maple, construct phase portraits of the following systems:

   (a) x1' = x2, x2' = x1.
   (b) x1' = −2x2, x2' = x1.
   (c) x1' = x1 + x2, x2' = x2.
2. Suppose a lake contains two species of fish, which we simply call 'big fish' and 'small fish'. In the absence of big fish, the small fish population xs evolves according to the law xs' = axs, where a > 0 is a constant. Indeed, the more the small fish, the more they reproduce. But big fish eat small fish, and so taking this into account, we have

   xs' = axs − bxs xb,

   where b > 0 is a constant. The last term accounts for how often the big fish encounter the small fish: the more the small fish, the easier it becomes for the big fish to catch them, and the faster the population of the small fish decreases.

   On the other hand, the big fish population evolves according to

   xb' = −cxb + dxs xb,


   where c, d > 0 are constants. The first term has a negative sign, which comes from the competition between these predators: the more the big fish, the fiercer the competition for survival. The second term accounts for the fact that the larger the number of small fish, the greater the growth in the numbers of the big fish.

Figure 2.8: Big fish and small fish.

(a) Singular points. Show that the (xs, xb) ODE system has two singular points, (0, 0) and (c/d, a/b). The point (0, 0) corresponds to the extinction of both species: if both populations are 0, then they continue to remain so. The point (c/d, a/b) corresponds to population levels at which both species sustain their current nonzero numbers indefinitely.
(b) Solution to the ODE system. Use Maple to plot the population levels of the two species
on the same plot, with the following data: a = 2, b = 0.002, c = 0.5, d = 0.0002,
xs (0) = 9000, xb (0) = 1000, t = 0 to t = 100.

Figure 2.9: Periodic variation of the population levels.

Your plot should show that the population levels vary periodically, and that the population of the big fish lags behind that of the small fish. This is expected, since the big fish thrive when the small fish are plentiful, but ultimately outstrip their food supply and decline. Once the big fish population is low, the small fish numbers increase again. So there is a cycle of growth and decline.
(c) Phase portrait. With the same constants as before, plot a phase portrait in the region xs = 0 to xs = 10000 and xb = 0 to xb = 4000. Also plot the solution curves in the same phase portrait. What do you observe?
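For readers without Maple, the same simulation can be sketched in Python. The integrator below is a hypothetical stand-in for DEplot, and the quantity H = d·xs − c·ln(xs) + b·xb − a·ln(xb), which is conserved along exact Lotka-Volterra orbits, gives a built-in accuracy check.

```python
# Lotka-Volterra simulation with the constants from the exercise.
import math

a, b, c, d = 2.0, 0.002, 0.5, 0.0002

def f(x):
    xs, xb = x
    return (a * xs - b * xs * xb, -c * xb + d * xs * xb)

def rk4(x, h, steps):
    # classical fourth-order Runge-Kutta integration
    for _ in range(steps):
        k1 = f(x)
        k2 = f((x[0] + 0.5 * h * k1[0], x[1] + 0.5 * h * k1[1]))
        k3 = f((x[0] + 0.5 * h * k2[0], x[1] + 0.5 * h * k2[1]))
        k4 = f((x[0] + h * k3[0], x[1] + h * k3[1]))
        x = (x[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             x[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
    return x

def invariant(x):
    # H = d*xs - c*ln(xs) + b*xb - a*ln(xb), conserved along exact orbits
    return d * x[0] - c * math.log(x[0]) + b * x[1] - a * math.log(x[1])
```

Plotting the two components of `rk4` over time (with any plotting library) should reproduce the lagged periodic variation described above.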

Figure 2.10: Phase portrait for the Lotka-Volterra ODE system.

2.4 Phase plane analysis of linear systems

In this section, we describe the phase plane analysis of linear systems. Besides allowing us to
visually observe the motion patterns of linear systems, this will also help the development of
nonlinear system analysis in the next section, since similar motion patterns can be observed in the
local behaviour of nonlinear systems as well.

We will analyse three simple types of matrices. It turns out that it is enough to consider these three types, since every other matrix can be reduced to one of them by an appropriate change of basis (in the phase portrait, this corresponds to replacing the usual axes by new ones, which may not be orthogonal). However, in this elementary first course, we will omit this part of the theory.

2.4.1 Complex eigenvalues

Consider the system x' = Ax, where

A = [ a   b ]
    [ −b  a ].

Then

e^{tA} = e^{ta} [ cos(bt)   sin(bt) ]
                [ −sin(bt)  cos(bt) ],

and so if the initial condition x(0) has polar coordinates (r0, θ0), then the solution is given by

x1(t) = e^{ta} r0 cos(θ0 − bt) and x2(t) = e^{ta} r0 sin(θ0 − bt), t ≥ 0,

so that the trajectories are spirals if a is nonzero, moving towards the origin if a < 0, and outwards if a > 0. If a = 0, the trajectories are circles. See Figure 2.11.

2.4.2 Diagonal case with real eigenvalues

Consider the system x' = Ax, where

A = [ λ1  0 ]
    [ 0   λ2 ],

where λ1 and λ2 are real numbers. The trajectory starting from the initial condition (x10, x20) is given by x1(t) = e^{λ1 t} x10, x2(t) = e^{λ2 t} x20. Eliminating t, we also see that A x1^{λ2} = B x2^{λ1} with appropriate values
Figure 2.11: Case of complex eigenvalues. The last figure shows the phase plane trajectory as a
projection of the curve (t, x1 (t), x2 (t)) in R3 : case when a = 0, and b > 0.

for the constants A and B. See the topmost figure in Figure 2.12 for the case when λ1 , λ2 are
both negative. In general, we obtain the phase portraits shown in Figure 2.12, depending on the
signs of λ1 and λ2 .

2.4.3 Nondiagonal case

Consider the system x' = Ax, where

A = [ λ  1 ]
    [ 0  λ ],

where λ is a real number. It is easy to see that

e^{tA} = [ e^{λt}  t e^{λt} ]
         [ 0       e^{λt}  ],

so that the solution starting from the initial condition (x10, x20) is given by

x1(t) = e^{λt}(x10 + t x20), x2(t) = e^{λt} x20.
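As in the complex-eigenvalue case, the claimed form of e^{tA} and the resulting solution formula can be checked against a truncated matrix-exponential series; the helper and sample values below are illustrative only.

```python
# Check the Jordan-block solution x1(t) = e^{lam t}(x10 + t x20),
# x2(t) = e^{lam t} x20 against a power-series matrix exponential.
import numpy as np

def expm_series(M, terms=60):
    # naive power-series evaluation of e^M
    out = np.eye(2)
    term = np.eye(2)
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

def jordan_solution(lam, t, x10, x20):
    # the claimed closed-form solution for A = [[lam, 1], [0, lam]]
    return np.array([np.exp(lam * t) * (x10 + t * x20),
                     np.exp(lam * t) * x20])
```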

Figure 2.13 shows the phase portraits for the three cases when λ < 0, λ = 0 and λ > 0.

Figure 2.12: The topmost figure shows the phase plane trajectory as a projection of the curve (t, x1(t), x2(t)) in R³: diagonal case when both eigenvalues are negative and equal. The other figures are phase portraits in the case when A is diagonal with real eigenvalues, for the sign patterns λ1 < λ2 < 0, λ1 < 0 < λ2, λ2 < 0 = λ1, λ1 = λ2 < 0, and λ1 = λ2 = 0. When the eigenvalues have opposite signs, the singular point is called a saddle point.

Figure 2.13: Nondiagonal case.



We note that the angle that a point on the trajectory makes with the x1-axis is given by

arctan(x2(t)/x1(t)) = arctan(x20/(x10 + t x20)),

which tends to 0 or π as t → ∞.

Exercises.

1. Draw the phase portrait for the system x1' = x1 − 3x2, x2' = −2x2, using the following procedure:

Step 1. Find the eigenvectors and eigenvalues: Show that

A := [ 1  −3 ]
     [ 0  −2 ]

has eigenvalues 1, −2, with eigenvectors v1 := (1, 0) and v2 := (1, 1), respectively.

Step 2. Set up the coordinate system in terms of the eigenvectors: Since v1, v2 form a basis for R², the solution vector x(t) can be expressed as a linear combination of v1, v2: x(t) = α(t)v1 + β(t)v2. Note that α(t) and β(t) are the 'coordinates' of the point x(t) in the directions v1 and v2, respectively. In other words, they are the 'projections' of the point x(t) in the directions v1 and v2, as shown in Figure 2.14.

Step 3. Eliminate t: Show that (α(t))² β(t) = (α(0))² β(0), and using the 'distorted' coordinate system, draw a phase portrait for the system x'(t) = Ax(t).

Figure 2.14: The distorted coordinate system.
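The invariant in Step 3 can be verified numerically; the sketch below uses a truncated matrix-exponential series to produce the (essentially exact) flow. The helper is our own, not part of the exercise.

```python
# Verify (alpha(t))^2 * beta(t) = (alpha(0))^2 * beta(0) for x' = Ax,
# A = [[1, -3], [0, -2]], with basis v1 = (1, 0), v2 = (1, 1).
import numpy as np

A = np.array([[1.0, -3.0], [0.0, -2.0]])

def expm_series(M, terms=80):
    # power-series matrix exponential, accurate for these small matrices
    out = np.eye(2)
    term = np.eye(2)
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

def alpha_beta(x):
    # x = alpha*v1 + beta*v2 with v1 = (1, 0), v2 = (1, 1)
    # gives beta = x2 and alpha = x1 - x2
    return x[0] - x[1], x[1]

def invariant(x):
    a, b = alpha_beta(x)
    return a**2 * b
```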

2. (∗) Let A ∈ R^{2×2} have eigenvalues a + ib and a − ib, where a, b ∈ R, and b ≠ 0.

(a) If v1 := u + iv (u, v ∈ R²) is an eigenvector corresponding to the eigenvalue a + ib, then show that v2 := u − iv is an eigenvector corresponding to a − ib.
(b) Using the fact that b ≠ 0, conclude that v1, v2 are linearly independent in C².
(c) Prove that u, v are linearly independent as vectors in R². Conclude that the matrix P with columns u and v is invertible.
(d) Verify that

P^{−1}AP = [ a   b ]
           [ −b  a ].

2.5 Phase plane analysis of nonlinear systems

With the phase plane analysis of nonlinear systems, we should keep two things in mind. One is
that the phase plane analysis is related to that of linear systems, because the local behaviour of

a nonlinear system can be approximated by a linear system. And the second is that, despite this
similarity with linear systems, nonlinear systems can display much more complicated patterns in
the phase plane, such as multiple singular points and limit cycles. In this section, we will discuss
these aspects. We consider the system

x1' = f1(x1, x2),
x2' = f2(x1, x2),   (2.4)

where we assume that f1 and f2 have continuous partial derivatives, and this assumption will be
continued for the remainder of this chapter. We will learn later in Chapter 4 that a consequence of
this assumption is that for the initial value problem of the system above, there will exist a unique
solution. Moreover, we will also make the assumption that solutions exist for all times in R.

2.5.1 Local behaviour of nonlinear systems

In order to see the similarity with linear systems, we decompose the nonlinear system into a linear part and an 'error' part (which is small close to a singular point), using Taylor's theorem, as follows.

Let (x10 , x20 ) be an isolated singular point of (2.4). Thus f1 (x10 , x20 ) = 0 and f2 (x10 , x20 ) = 0.
Then by Taylor’s theorem, we have
x1' = (∂f1/∂x1)(x10, x20)·(x1 − x10) + (∂f1/∂x2)(x10, x20)·(x2 − x20) + e1(x1 − x10, x2 − x20),   (2.5)

x2' = (∂f2/∂x1)(x10, x20)·(x1 − x10) + (∂f2/∂x2)(x10, x20)·(x2 − x20) + e2(x1 − x10, x2 − x20),   (2.6)

where e1 and e2 are such that e1(0, 0) = e2(0, 0) = 0. We translate the singular point (x10, x20) to the origin by introducing the new variables y1 = x1 − x10 and y2 = x2 − x20. With

a := (∂f1/∂x1)(x10, x20),  b := (∂f1/∂x2)(x10, x20),
c := (∂f2/∂x1)(x10, x20),  d := (∂f2/∂x2)(x10, x20),
we can rewrite (2.5)-(2.6) as follows:

y1' = ay1 + by2 + e1(y1, y2),
y2' = cy1 + dy2 + e2(y1, y2).   (2.7)

We note that this new system has (0, 0) as a singular point. We will elaborate on the similarity between the phase portrait of the system (2.4) and the phase portrait of its linear part, that is, the system

z1' = az1 + bz2,
z2' = cz1 + dz2.   (2.8)

Before clarifying the relationship between (2.4) and (2.8), we pause to note some important differences. The system (2.4) may have many singular points; one of them has been selected and moved to the origin. If a different singular point had been chosen, then the constants a, b, c, d in (2.8) would have been different. The important point is that any statement relating (2.4) and (2.8) is local in nature, in that it applies 'near' the singular point under consideration. By 'near', we mean in a sufficiently small neighbourhood or ball around the singular point. Totally different kinds of behaviour may occur in a neighbourhood of the other singular points. The

transformation above must be made, and the corresponding linear part must be analysed, for each
isolated singular point of the nonlinear system.

We now give the main theorem in this section about the local relationship between the nature
of phase portraits of (2.4) and (2.8), but we will not prove this theorem.

Theorem 2.5.1 Let (x10, x20) be an isolated singular point of (2.4), and let

A := [ (∂f1/∂x1)(x10, x20)  (∂f1/∂x2)(x10, x20) ]
     [ (∂f2/∂x1)(x10, x20)  (∂f2/∂x2)(x10, x20) ].

Then we have the following:

1. If every eigenvalue of A has a negative real part, then all solutions of (2.4) starting in a
small enough ball with centre (x10 , x20 ) converge to (x10 , x20 ) as t → ∞. (This situation
is abbreviated by saying that the equilibrium point (x10 , x20 ) is ‘asymptotically stable’; see
§3.3.)
2. If the matrix A has an eigenvalue with a positive real part, then there exists a ball B such that for every ball B′ of positive radius around (x10, x20), there exists a point in B′ such that a solution x of (2.4) starting from that point leaves the ball B. (This situation is abbreviated by saying that the equilibrium point (x10, x20) is 'unstable'; see §3.2.)

We illustrate the theorem with the following example.

Example. Consider the system

x1' = −x1 + x2 − x1(x2 − x1),
x2' = −x1 − x2 + 2x1²x2.

This nonlinear system has the singular points (−1, −1), (1, 1) and (0, 0). If we linearise around the singular point (0, 0), we obtain the matrix of partial derivatives of the right-hand sides, evaluated at (0, 0):

[ −1   1 ]
[ −1  −1 ],

which has eigenvalues −1 + i and −1 − i. Thus by Theorem 2.5.1, it follows that for the above
nonlinear system, if we start close to (0, 0), then the solutions converge to (0, 0) as t → ∞.

However, not all solutions of this nonlinear system converge to (0, 0). For example, we know
that (1, 1) is also a singular point, and so if we start from there, then we stay there. ♦
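Theorem 2.5.1's prediction for this example can be probed numerically: a finite-difference Jacobian at (0, 0) should reproduce the matrix above, its eigenvalues should have negative real parts, and a trajectory started nearby should decay to the origin. A hedged sketch; the step sizes and the starting point are arbitrary choices.

```python
# Numerical check of the linearisation at (0, 0) for the example system.
import numpy as np

def f(x):
    x1, x2 = x
    return np.array([-x1 + x2 - x1 * (x2 - x1),
                     -x1 - x2 + 2 * x1**2 * x2])

def jacobian_fd(x, h=1e-6):
    # central finite differences, column by column
    J = np.zeros((2, 2))
    for j in range(2):
        e = np.zeros(2)
        e[j] = h
        J[:, j] = (f(x + e) - f(x - e)) / (2 * h)
    return J

def flow(x, t_end=10.0, h=0.001):
    # classical RK4 integration of the nonlinear system
    for _ in range(int(t_end / h)):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x = x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return x
```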

The above example highlights the local nature of Theorem 2.5.1. How close is sufficiently close
is generally a difficult question to answer.

Actually, more than just similarity of convergence to the singular point can be said. It turns out that if the real parts of the eigenvalues are not equal to zero, then the 'qualitative' structure is also preserved. Roughly speaking, this means that there is a map T mapping a region Ω1 around (0, 0) to a region Ω2 around (x10, x20) such that

1. T is one-to-one and onto;


2. both T and T −1 are continuous;
3. if two points of Ω1 lie on the same trajectory of (2.8), then their images under T lie on the
same trajectory of (2.4);
4. if two points of Ω2 lie on the same trajectory of (2.4), then their images under T −1 lie on
the same trajectory of (2.8).

The mapping is shown schematically in Figure 2.15.


Figure 2.15: The mapping T.

The actual construction of such a mapping is not easy, but we demonstrate the plausibility of
its existence by considering an example.

Example. Consider the system

x1' = x2,
x2' = x1 − x2 + x1(x1 − 2x2).

The singular points are solutions to

x2 = 0 and x1 − x2 + x1² − 2x1x2 = 0,

and so they are (0, 0) and (−1, 0). Furthermore,

∂f1/∂x1 = 0,  ∂f1/∂x2 = 1,
∂f2/∂x1 = 1 + 2x1 − 2x2,  ∂f2/∂x2 = −1 − 2x1.

At (0, 0), the matrix of the linear part is

[ 0   1 ]
[ 1  −1 ],

whose eigenvalues satisfy λ2 + λ − 1 = 0. The roots are real and of opposite signs, and so the
origin is a saddle point.

At (−1, 0), the matrix of the linear part is

[ 0   1 ]
[ −1  1 ],

whose eigenvalues satisfy λ² − λ + 1 = 0. These eigenvalues are complex with positive real part, and so the trajectories near (−1, 0) are outward spirals.

Figure 2.16: Phase portrait.

Figure 2.16 shows the phase portrait of the system. As expected we see that around the points
(0, 0) and (−1, 0), the local picture is similar to the corresponding linearisations. ♦

Finally, we discuss the case when the eigenvalues of the linearisation have real part equal to
0. It turns out that in this case, the behaviour of the linearisation gives no information about the
behaviour of the nonlinear system. For example, circles in the phase portrait may be converted
into spirals. We illustrate this in the following example.

Example. Consider the system

x1' = −x2 − x1(x1² + x2²),
x2' = x1 − x2(x1² + x2²).

The linearisation about the singular point (0, 0) gives rise to the matrix

[ 0  −1 ]
[ 1   0 ],

which has eigenvalues −i and i. Thus the phase portrait of the linear part comprises circles.

If we introduce the polar coordinates r := √(x1² + x2²) and θ := arctan(x2/x1), then we have that

r' = −r³,
θ' = 1.

Thus we see that the trajectories approach the origin in spirals! ♦
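The polar form can be confirmed numerically: r' = −r³ integrates to r(t) = (r0⁻² + 2t)^{−1/2}, and a direct Cartesian integration of the nonlinear system should reproduce this radius. A sketch under our own choice of integrator and step size.

```python
# Compare the radius of a Cartesian RK4 trajectory with the exact
# solution of the radial equation r' = -r^3.
import math

def f(x):
    x1, x2 = x
    r2 = x1 * x1 + x2 * x2
    return (-x2 - x1 * r2, x1 - x2 * r2)

def rk4(x, t_end, h=0.001):
    for _ in range(int(t_end / h)):
        k1 = f(x)
        k2 = f((x[0] + 0.5 * h * k1[0], x[1] + 0.5 * h * k1[1]))
        k3 = f((x[0] + 0.5 * h * k2[0], x[1] + 0.5 * h * k2[1]))
        k4 = f((x[0] + h * k3[0], x[1] + h * k3[1]))
        x = (x[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             x[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
    return x

def radius_exact(r0, t):
    # solution of r' = -r^3 with r(0) = r0
    return (r0**-2 + 2.0 * t) ** -0.5
```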

Exercises. Use the linearisation theorem (Theorem 2.5.1), where possible, to describe the behaviour close to the equilibrium points of the following systems:

1. x1' = e^{x1+x2} − x2, x2' = −x1 + x1x2.

2. x1' = x1 + 4x2 + e^{x1} − 1, x2' = −x2 − x2 e^{x1}.

3. x1' = x2, x2' = −x1³.

4. x1' = sin(x1 + x2), x2' = x2.

5. x1' = sin(x1 + x2), x2' = −x2.

2.5.2 Limit cycles and the Poincaré-Bendixson theorem

In the phase portrait of the van der Pol equation shown in Figure 2.5, we suspected that the system has a closed curve in the phase portrait; moreover, trajectories starting inside that curve, as well as trajectories starting outside it, all tend towards this curve, while a motion starting on the curve stays on it forever, circling periodically around the origin. Such a curve is called a 'limit cycle', and we will give the exact definition later in this section. Limit cycles are a feature that can occur only in a nonlinear system. Although in the phase portrait in the middle of the top row of figures in Figure 2.11, we saw that if the real part of the eigenvalues is zero, we have periodic trajectories, these are not limit cycles, since no matter how close we start to such a periodic orbit, we can never approach it. We reserve the name 'limit cycle' for closed curves with the property that all trajectories starting close to them converge to them, either as time increases to +∞ or as it decreases to −∞. In order to explain this further, we consider the following example.

Examples. Consider the system

x1' = x2 − x1(x1² + x2² − 1),
x2' = −x1 − x2(x1² + x2² − 1).

By introducing polar coordinates r = √(x1² + x2²) and θ = arctan(x2/x1), the equations are transformed into

r' = −r(r² − 1),
θ' = −1.

When we are on the unit circle, we note that r' = 0, and so we stay there. Thus the unit circle is a periodic trajectory. When r > 1, then r' < 0, and so we see that if we start outside the unit circle, we tend towards the unit circle from the outside. On the other hand, if r < 1, then r' > 0, and so if we start inside the unit circle, we tend towards it from the inside. This can be made rigorous by examining the analytical solution, given by

r(t) = (1 + (1/r0² − 1)e^{−2t})^{−1/2},  θ(t) = θ0 − t,

where (r0, θ0) denotes the initial condition. So we have that all trajectories in the vicinity of the unit circle converge to the unit circle as t → ∞.
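The analytical solution above can be confirmed numerically; in this sketch we integrate the Cartesian system with a homemade RK4 step and compare radii, starting both inside and outside the unit circle.

```python
# Compare RK4 radii with the exact r(t) = (1 + (1/r0^2 - 1)e^{-2t})^{-1/2}.
import math

def f(x):
    x1, x2 = x
    s = x1 * x1 + x2 * x2 - 1.0
    return (x2 - x1 * s, -x1 - x2 * s)

def rk4(x, t_end, h=0.001):
    for _ in range(int(t_end / h)):
        k1 = f(x)
        k2 = f((x[0] + 0.5 * h * k1[0], x[1] + 0.5 * h * k1[1]))
        k3 = f((x[0] + 0.5 * h * k2[0], x[1] + 0.5 * h * k2[1]))
        k4 = f((x[0] + h * k3[0], x[1] + h * k3[1]))
        x = (x[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             x[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
    return x

def r_exact(r0, t):
    # the analytical radius from the notes
    return (1.0 + (1.0 / r0**2 - 1.0) * math.exp(-2.0 * t)) ** -0.5
```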

Now consider the system

x1' = x2 + x1(x1² + x2² − 1),
x2' = −x1 + x2(x1² + x2² − 1).

Again by introducing polar coordinates (r, θ) as before, we now obtain

r' = r(r² − 1),
θ' = −1.

When we are on the unit circle, we again have r' = 0, and so we stay there. Thus the unit circle is a periodic trajectory. But now when r > 1, then r' > 0, and so we see that if we start outside the unit circle, we move away from it. On the other hand, if r < 1, then r' < 0, and so if we start inside the unit circle, we again move away from the unit circle. However, if we start with an initial condition (r0, θ0) with r0 < 1, and go backwards in time, then we can show that the solution is given by

r(t) = (1 + (1/r0² − 1)e^{2t})^{−1/2},  θ(t) = θ0 − t,

and so we have that all trajectories in the vicinity of the unit circle from inside converge to the unit circle as t → −∞. ♦

In each of the examples considered above, we would like to call the unit circle a 'limit cycle'. This motivates the following definitions of ω- and α-limit points of a trajectory, and we will define limit cycles using these notions.

Definition. Let x be a solution of x' = f(x). A point x∗ is called an ω-limit point of x if there is a sequence of real numbers (tn)n∈N such that lim_{n→∞} tn = +∞ and lim_{n→∞} x(tn) = x∗. The set of all ω-limit points of x is denoted by Lω(x).

A point x∗ is called an α-limit point of x if there is a sequence of real numbers (tn)n∈N such that lim_{n→∞} tn = −∞ and lim_{n→∞} x(tn) = x∗. The set of all α-limit points of x is denoted by Lα(x).

For example, if x∗ is a singular point, then clearly Lω(x∗) = Lα(x∗) = {x∗}. We consider a few more examples below.

Examples. Consider the system x1' = −x1, x2' = x2, which has the origin as a saddle singular point. For any trajectory x starting on the x1-axis (but not at the origin), we have Lω(x) = {(0, 0)}, while Lα(x) = ∅. On the other hand, for any trajectory x starting on the x2-axis (but not at the origin), Lω(x) = ∅ and Lα(x) = {(0, 0)}. Finally, for any trajectory x that does not start on the x1- or the x2-axis, the sets Lω(x) and Lα(x) are both empty.

Now consider the system x1' = x2, x2' = −x1, for which all trajectories are periodic and are circles in the phase portrait. For any trajectory x starting from a point P, the sets Lω(x), Lα(x)

are both equal to the circle passing through P . ♦

We are now ready to define a limit cycle.

Definitions. A periodic trajectory is a nonconstant solution x for which there exists a T > 0 such that x(t) = x(t + T) for all t ∈ R.

A limit cycle is a periodic trajectory that is contained in Lω(x) or Lα(x) for some other trajectory x.

Limit cycles represent an important phenomenon in nonlinear systems, and they occur frequently in engineering and nature. For example, aircraft wing flutter is a frequently encountered, and sometimes dangerous, instance of a limit cycle. In an ecological system where two species share a common resource, the existence of a limit cycle would mean that neither species becomes extinct. As one can see, limit cycles can be desirable in some cases, and undesirable in others. In any case, whether or not limit cycles exist is an important question, and we now study a few results concerning it. In particular, we will study an important result, called the Poincaré-Bendixson theorem, for which we will need a few topological preliminaries, listed below.

Definitions. Two nonempty sets A and B in the plane R² are said to be separated if there is no sequence of points (pn)n∈N contained in A such that lim_{n→∞} pn ∈ B, and there is no sequence (qn)n∈N contained in B such that lim_{n→∞} qn ∈ A.

A set that is not the union of two separated sets is said to be connected.

A set O is open if for every x ∈ O, there exists an ε > 0 such that B(x, ε) ⊂ O.

A set Ω is called a region if it is an open, connected set.

A set A is said to be bounded if there exists an R > 0 large enough so that A ⊂ B(0, R).

For example, two disjoint circles in the plane are separated, while the quadrant {(x, y) ∈ R² | x > 0, y > 0} is not separated from the x-axis.

Although the definition of a connected set seems technical, it turns out that sets we would intuitively think of as connected in a nontechnical sense are indeed connected. For instance, the annulus {(x, y) ∈ R² | 1 < x² + y² < 2} is connected, while the set Z² is not.

Roughly speaking, an open set can be thought of as a set without its 'boundary'. For example, the unit disk {(x, y) ∈ R² | x² + y² < 1} is open.

The principal result is the following.

Theorem 2.5.2 (Poincaré-Bendixson) Let (x1, x2) be a solution to

x1' = f1(x1, x2),
x2' = f2(x1, x2),

such that for all t ≥ t0, the solution lies in a bounded region of the plane containing no singular points. Then either the solution is a periodic trajectory, or its ω-limit set is a periodic trajectory.

The proof requires advanced mathematical techniques and will be omitted. The Poincaré-Bendixson theorem is false for systems of dimension 3 or more. In the case of 2D

systems, the proof depends heavily on a deep mathematical result, known as the Jordan curve
theorem, which is valid only in the plane. Although the theorem sounds obvious, its proof is
difficult. We state this theorem below, but first we should specify what we mean by a curve.

Definitions. A curve is a continuous function f : [a, b] → R². If for every t1, t2 ∈ (a, b) such that t1 ≠ t2, there holds that f(t1) ≠ f(t2), then the curve is called simple. A curve is called closed if f(a) = f(b).

Theorem 2.5.3 (Jordan curve theorem) A simple closed curve divides the plane into two regions,
one of which is bounded, and the other is unbounded.

Figure 2.17: Jordan curve theorem.

Now that the inside of a curve is defined, the following result helps to clarify the type of region
to seek in order to apply the Poincaré-Bendixson theorem.

Theorem 2.5.4 Every periodic trajectory of

x1' = f1(x1, x2),
x2' = f2(x1, x2)

contains a singular point in its interior.

This theorem tells us that in order to apply the Poincaré-Bendixson theorem, the singular-point-free region where the trajectory lies must have at least one hole in it (for the singular point).

We consider a typical application of the Poincaré-Bendixson theorem.

Example. (∗) Consider the system

x1' = x1 + x2 − x1(x1² + x2²)[cos(x1² + x2²)]²,
x2' = −x1 + x2 − x2(x1² + x2²)[cos(x1² + x2²)]².

In polar coordinates, the equations are transformed into

r' = r[1 − r²(cos r²)²],
θ' = −1.

Consider a circle of radius r0 < 1 about the origin. If we start on it, then all trajectories move outward, since

r'(t0) = r0[1 − r0²(cos r0²)²] ≥ r0[1 − r0²] > 0.


Also, if r(t0) = √π, then

r'(t0) = √π[1 − π] < 0,

and so trajectories starting on the circle with radius √π (or close to it) move inwards. Then it can be shown that any trajectory starting inside the annulus r0 < r < √π stays there.³ But it is clear that there are no singular points inside the annulus r0 < r < √π. So this region must contain a periodic trajectory. Moreover, it is also possible to prove that a trajectory starting inside the unit circle is not a periodic trajectory (since it never returns to this circle), and hence its ω-limit set is a limit cycle. ♦
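The sign conditions that set up the trapping annulus can be checked directly from the radial equation; the forward-Euler flow below is our own rough-and-ready choice, adequate for a qualitative check.

```python
# Check the signs of r' = r[1 - r^2 (cos r^2)^2] on the bounding circles,
# and watch a radius starting inside the annulus stay trapped.
import math

def r_dot(r):
    # the radial equation in polar coordinates
    return r * (1.0 - r**2 * math.cos(r**2)**2)

def r_flow(r, t_end=8.0, h=0.001):
    # forward Euler in the radius alone; enough for a qualitative check
    for _ in range(int(t_end / h)):
        r = r + h * r_dot(r)
    return r
```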

It is also important to know when there are no periodic trajectories. The following theorem provides a sufficient condition for the non-existence of periodic trajectories; to state it, we need the following definition.

Definition. A region Ω is said to be simply connected if for any simple closed curve C lying
entirely within Ω, all points inside C are points of Ω.

For example, the annulus 1 < r < 2 is not simply connected, while the unit disk r < 1 is.

Theorem 2.5.5 Let Ω be a simply connected region in R². If the function ∂f1/∂x1 + ∂f2/∂x2 is not identically zero in Ω and does not change sign in Ω, then there are no periodic trajectories of

x1' = f1(x1, x2),
x2' = f2(x1, x2)

in the region Ω.

Proof (Sketch.) Assume that a periodic trajectory exists with period T, and denote the curve by C. Using Green's theorem, we have

∬_{int C} (∂f1/∂x1 + ∂f2/∂x2) dx1 dx2 = ∮_C (f2 dx1 − f1 dx2).

But, since x1' = f1 and x2' = f2 along the trajectory,

∮_C (f2 dx1 − f1 dx2) = ∫₀ᵀ (f2(x1(t), x2(t)) x1'(t) − f1(x1(t), x2(t)) x2'(t)) dt = ∫₀ᵀ (f2 f1 − f1 f2) dt = 0,

a contradiction, since the left-hand side is nonzero (the integrand is of one sign and not identically zero in Ω).

Example. Consider the system

x1' = x2 + x1x2²,
x2' = −x1 + x1²x2.

Since

∂f1/∂x1 + ∂f2/∂x2 = x2² + x1²,

which is always strictly positive (except at the origin), the system does not have any periodic trajectories in the phase plane. ♦

³Indeed, suppose for example that the trajectory leaves the annulus and reaches a point r∗ < r0 inside the inner circle at time t∗. Going "backwards" along this trajectory, find the last time t1 < t∗ such that r(t1) = r0. On (t1, t∗) we have r < r0 < 1 and hence r' > 0, and we arrive at the contradiction that r∗ − r0 = r(t∗) − r(t1) = ∫_{t1}^{t∗} r'(t) dt > 0, even though r∗ − r0 < 0. The case of the trajectory leaving the outer circle of the annulus can be handled similarly.
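The divergence computed above can be cross-checked numerically; the following sketch estimates ∂f1/∂x1 + ∂f2/∂x2 by central differences at a few arbitrary sample points (the points themselves are illustrative assumptions):

```python
# Numerical cross-check of the Bendixson-criterion divergence for the
# example system; sample points are arbitrary illustrative choices.

def f1(x1, x2):
    return x2 + x1 * x2 ** 2

def f2(x1, x2):
    return -x1 + x1 ** 2 * x2

def divergence(x1, x2, h=1e-5):
    # central-difference estimate of df1/dx1 + df2/dx2
    d1 = (f1(x1 + h, x2) - f1(x1 - h, x2)) / (2 * h)
    d2 = (f2(x1, x2 + h) - f2(x1, x2 - h)) / (2 * h)
    return d1 + d2
```

At each sample point the estimate agrees with x1² + x2² and is positive, as the criterion requires.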

Exercises.

1. Show that the sets A = {(x, y) ∈ R2 | xy = 1} and B = {(x, y) ∈ R2 | xy = 0} are separated.


2. Show that the system

x1' = 1 − x1x2
x2' = x1

has no limit cycles.

Hint: Are there any singular points?
3. Show that the system

x1' = x2
x2' = −x1 − (1 + x1²)x2

has no periodic trajectories in the phase plane.


4. Prove that if the system

x1' = −x2 + x1(1 − x1² − x2²)
x2' = x1 + x2(1 − x1² − x2²) + 3/4

has a periodic trajectory starting inside the circle C: x1² + x2² = 1/2, then it must intersect C.

Hint: Consider the simply connected region x1² + x2² < 1/2.

5. Show that Lω(x) = Lα(x) = {x(t) | t ∈ R} = {x(t) | t ∈ [0, T]} for a periodic trajectory x with period T.
Chapter 3

Stability theory

Given a differential equation, an important question is that of stability. In everyday language, we say that "something is unstable" if a small deviation from the present state produces a major change in the state. A familiar example is that of a pendulum balanced vertically upwards. A small change in the position produces a major change: the pendulum falls. If, on the other hand, the pendulum is at its lowest position, then for small changes in the position the resulting motion keeps the pendulum around the down position. In this chapter we will make these, and related, ideas precise in the context of solutions of systems of differential equations. Roughly speaking, an equilibrium point is described as stable if whenever we start somewhere near this equilibrium point, the resulting trajectory of the solution stays around the equilibrium point ever after. In the case of the motions of a pendulum, the equilibrium points are the vertical up and down positions, and these are examples of unstable and stable points, respectively.

Having defined the notions of stable and unstable equilibrium points, we will begin with some
elementary stability considerations, namely the stability of linear systems x" = Ax, which can be
characterised in terms of the signs of the real parts of the eigenvalues of the matrix A.

In the case of general nonlinear systems, such a neat characterisation is not possible. But
a useful approach for studying stability was introduced in the late 19th century by the Russian
mathematician Alexander Lyapunov, in which stability is determined by constructing a scalar
“energy-like” function for the differential equation, and examining its time variation. In this
chapter we will also study the basics of Lyapunov theory.

3.1 Equilibrium point

It is possible for a solution to correspond to only a single point in the phase portrait. Such a point
is called an equilibrium point. Stability will be formulated with respect to an equilibrium point,
that is, it is a particular equilibrium point of a differential equation for which the adjective stable,
unstable etc. applies.

Definition. A point xe ∈ Rn is called an equilibrium point of the differential equation

x" (t) = f (x(t)),

if f (xe ) = 0.


Recall that in the context of phase plane analysis of 2D autonomous systems, we had referred
to equilibrium points also as ‘singular points’.

Note that x(t) = xe for all t ≥ 0 is a solution to the equation

x'(t) = f(x(t)), x(0) = xe,

that is, if we start from xe, then we stay there for all future time; hence the name 'equilibrium' (which usually means 'state of balance' in ordinary language).

Example. (Linear system) If A ∈ Rn×n and f : Rn → Rn is given by f (x) = Ax, x ∈ Rn , then


there are two possible cases:

1◦ A is invertible. The only equilibrium point in this case is the zero vector, that is, xe = 0.
2◦ A is not invertible. In this case, there are infinitely many equilibrium points. Indeed, the
set of equilibrium points is precisely the set of points in the kernel of A. ♦

A nonlinear system can have finitely many or infinitely many isolated equilibrium points. In the following example we consider the familiar physical system of the pendulum.

Example. (The pendulum) Consider the pendulum shown in Figure 3.1, whose dynamics is given by the following nonlinear equation:

ml²θ'' + kθ' + mgl sin θ = 0,

where k is a friction coefficient, m is the mass of the bob, l is the length of the pendulum, and g is the acceleration due to gravity.

Figure 3.1: The pendulum.

Defining x1 = θ and x2 = θ', we obtain the state equations

x1' = x2
x2' = −(k/(ml²)) x2 − (g/l) sin x1.

Therefore the equilibrium points are given by

x2 = 0, sin x1 = 0,

which leads to the points (nπ, 0), n ∈ Z. Physically, these points correspond to the pendulum resting exactly at the vertically up (n odd) and down (n even) positions. ♦
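The computation above can be confirmed by evaluating the right-hand side of the state equations at the candidate points (a sketch; the parameter values below are assumptions, since the text keeps m, l, k, g symbolic):

```python
import math

# Check that (n*pi, 0) are equilibrium points of the pendulum state
# equations; m, l, k, g are illustrative values, not from the notes.
m, l, k, g = 1.0, 1.0, 0.5, 9.81

def f(x1, x2):
    return (x2, -(k / (m * l ** 2)) * x2 - (g / l) * math.sin(x1))

# residual of f at (n*pi, 0) for a few integers n; all should vanish
residuals = [max(abs(c) for c in f(n * math.pi, 0.0)) for n in range(-3, 4)]
```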

Exercise. (∗) Determine the number of equilibrium points of the system x' = x − cos x.

3.2 Stability and instability

Let us introduce the basic concepts of stability and instability.

Definitions. An equilibrium point xe is said to be stable if for any R > 0, there exists an r > 0 such that if ‖x(0) − xe‖ ≤ r, then for all t ≥ 0, ‖x(t) − xe‖ ≤ R.

Otherwise, the equilibrium point xe is called unstable.

Essentially, stability means that the trajectories can be kept arbitrarily close to the equilibrium point by starting sufficiently close to it. It does not mean that if we start close to the equilibrium point, then the solution approaches the equilibrium point, and Figure 3.2 emphasises this point. The stronger version of stability, in which trajectories do tend to the equilibrium point, is called asymptotic stability, which we will define in the next section.

Figure 3.2: Stable equilibrium.

The definition of a stable equilibrium point says that no matter what small ball B(xe , R) is
specified around the point xe , we can always find a somewhat smaller ball with radius r (which
might depend on the radius R of the big ball), such that if we start from within the small ball
B(xe , r) we are guaranteed to stay in the big ball for all future times.

On the other hand, an equilibrium point is unstable if there exists at least one ball B(xe , R)
such that for every r > 0, no matter how small, it is always possible to start from somewhere
within the small ball B(xe , r) and eventually leave the ball B(xe , R).

It is important to point out the qualitative difference between instability and the intuitive
notion of “blowing up”. In the latter, one expects the trajectories close to the equilibrium to move
further and further away. In linear systems, instability is indeed equivalent to blowing up, because
eigenvalues in the right half plane always lead to growth of the system states in some direction.
However, for nonlinear systems, blowing up is only one way of instability. The following example
illustrates this point.

Example. (van der Pol oscillator) This example shows that


Unstable is not the same as “blowing up”.

The van der Pol oscillator is described by

x1' = x2,
x2' = −x1 + (1 − x1²)x2.

The system has an equilibrium point at the origin.

Figure 3.3: Unstable equilibrium of the van der Pol equation.

The solution trajectories starting from any non-zero initial states all asymptotically approach
a limit cycle. Furthermore, the ball B(0, 1) can be shown to be within the phase-plane region
enclosed by the limit cycle. Therefore, solution trajectories starting from an arbitrarily small ball
B(0, r) will eventually get out of the ball B(0, 1) to approach the limit cycle; see Figure 3.3. This
implies that the origin is an unstable equilibrium point.

Thus, even though the solutions starting from points near the equilibrium point do not blow
up (in fact they do remain close to the equilibrium point), they do not remain arbitrarily close to
it. This is the fundamental distinction between stability and instability. ♦
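This behaviour can be observed numerically. The sketch below (not from the notes) integrates the van der Pol system with a fixed-step RK4 scheme: a trajectory launched from a tiny ball around the origin leaves B(0, 1) yet remains bounded, illustrating instability without blow-up.

```python
import math

# Fixed-step RK4 integration of the van der Pol oscillator; step size,
# horizon, and initial point are illustrative choices.

def f(x):
    x1, x2 = x
    return (x2, -x1 + (1.0 - x1 * x1) * x2)

def rk4_step(x, dt):
    k1 = f(x)
    k2 = f((x[0] + 0.5 * dt * k1[0], x[1] + 0.5 * dt * k1[1]))
    k3 = f((x[0] + 0.5 * dt * k2[0], x[1] + 0.5 * dt * k2[1]))
    k4 = f((x[0] + dt * k3[0], x[1] + dt * k3[1]))
    return (x[0] + dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            x[1] + dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))

x = (0.01, 0.0)           # start well inside B(0, 1)
max_radius = 0.0
for _ in range(4000):      # integrate up to t = 40 with dt = 0.01
    x = rk4_step(x, 0.01)
    max_radius = max(max_radius, math.hypot(x[0], x[1]))
```

The maximal radius exceeds 1 (the trajectory escapes B(0, 1)) but stays bounded near the limit cycle, rather than blowing up.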

Exercise. For what values of a is 0 a stable equilibrium point for the system x' = ax?

3.3 Asymptotic and exponential stability

Sometimes it can happen that trajectories actually approach the equilibrium point. This motivates
the following definition.

Definition. An equilibrium point xe is called asymptotically stable if it is stable and there exists an r > 0 such that if ‖x(0) − xe‖ ≤ r, then lim_{t→∞} x(t) = xe.

Asymptotic stability means that the equilibrium is stable, and that in addition, if we start
close to xe , the solution actually converges to xe as the time tends to ∞; see Figure 3.4.

One may question the need for the explicit stability requirement in the definition above.
One might think: Doesn’t convergence to the equilibrium already imply that the equilibrium is
stable? The answer is no. It is possible to build examples showing that solution convergence to
an equilibrium point does not imply stability. For example, a simple system studied by Vinograd
has trajectories of the form shown in Figure 3.5. All the trajectories from nonzero initial points
within the unit disk first reach the curve C before converging to the origin. Thus the origin is

Figure 3.4: Asymptotically stable equilibrium.

unstable, despite the convergence. Calling such an equilibrium point unstable is quite reasonable,
since a curve such as C might be outside the region where the model is valid.

Figure 3.5: Trajectories that converge to the origin, which is nevertheless unstable (Vinograd's example).

In some applications, it is not enough to know that solutions converge to the equilibrium point,
but there is also a need to estimate how fast this happens. This motivates the following concept.

Definition. An equilibrium point xe is called exponentially stable if there exist positive numbers M, ε and r such that whenever ‖x(0) − xe‖ ≤ r, we have

for all t ≥ 0, ‖x(t) − xe‖ ≤ M e^{−εt} ‖x(0) − xe‖. (3.1)

In other words, the solutions converge to xe at least as fast as an exponential function, and (3.1) provides an explicit bound on the solution.

Example. The point 0 is an exponentially stable equilibrium point for the system

x' = −(1 + (sin x)²)x.

Indeed it can be seen that the solution satisfies

x(t) = x(0) exp(−∫₀ᵗ [1 + (sin(x(τ)))²] dτ)

(check this by differentiation!). Therefore, |x(t)| ≤ |x(0)|e^{−t}. ♦

Note that exponential stability implies asymptotic stability. However, asymptotic stability
does not guarantee exponential stability, as demonstrated by the following example.

Example. Consider the equation x' = −x³. It can be shown that this has the solution

x(t) = 1/√(2t + 1/(x(0))²)     if x(0) > 0,
x(t) = 0                       if x(0) = 0,
x(t) = −1/√(2t + 1/(x(0))²)    if x(0) < 0,

for t ≥ 0. It can be seen that 0 is an asymptotically stable equilibrium point. But we now prove that it is not exponentially stable. Suppose, on the contrary, that there exist positive M, ε and r such that for all |x(0)| ≤ r, the corresponding solution x is such that for all t ≥ 0, |x(t)| ≤ M e^{−εt}|x(0)|. Then with x(0) = r, we would have that

∀t ≥ 0, 1/√(2t + 1/r²) ≤ M e^{−εt} r. (3.2)

Since for t ≥ 0 we have e^{εt} = 1 + εt + (εt)²/2 + · · · ≥ εt, it follows that e^{−εt} ≤ 1/(εt). From (3.2), it follows that

∀t > 0, 1 ≤ (M²r²/ε²)(2/t + 1/(r²t²)).

Passing to the limit as t → ∞, we obtain 1 ≤ 0, a contradiction. ♦
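The failure of exponential stability can also be seen numerically. The sketch below (illustrative, not from the notes) takes x(0) = 1, so that x(t) = 1/√(2t + 1), and compares it with an arbitrary exponential envelope e^{−0.1t}: the ratio grows without bound.

```python
import math

# Closed-form solution of x' = -x^3 with x(0) = 1, compared against a
# fixed exponential envelope; the rate 0.1 is an arbitrary choice.

def x_exact(t):
    return 1.0 / math.sqrt(2.0 * t + 1.0)

# ratio of the true solution to the exponential envelope at a few times
ratios = [x_exact(t) / math.exp(-0.1 * t) for t in (10.0, 50.0, 100.0)]
```

The ratios increase and eventually exceed any constant M, so no bound of the form (3.1) can hold.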

Exercise. For what values of a is 0 an asymptotically stable equilibrium point for the system x' = ax? When is it exponentially stable?

3.4 Stability of linear systems

In the scalar example x' = ax discussed in Section 1.4, we observe that if Re(a) < 0, then the equilibrium point 0 is exponentially stable, while if Re(a) > 0, then the equilibrium point is unstable. Analogously, from Theorem 1.7.2, we obtain the following corollary.

Corollary 3.4.1 The equilibrium point 0 of the system x' = Ax is exponentially stable if for all λ ∈ σ(A), Re(λ) < 0. The equilibrium point 0 of the system x' = Ax is unstable if there exists a λ ∈ σ(A) such that Re(λ) > 0.

Using Theorem 2.5.1, we can sometimes deduce the stability of an equilibrium point of a 2D nonlinear system by linearisation.

As illustrated in the Examples on page 22, when the real part of some eigenvalues is zero, the stability depends on the number of independent eigenvectors associated with these eigenvalues.

Definition. Let A ∈ Rn×n, and let λ ∈ σ(A). The number dim(ker(λI − A)) is called the geometric multiplicity of the eigenvalue λ. The multiplicity of λ as a root of the characteristic polynomial of A is called the algebraic multiplicity of the eigenvalue λ.

Example. If A = [ 0 1 ; 0 0 ] (rows separated by semicolons), then the geometric multiplicity of the eigenvalue 0 is equal to 1, while its algebraic multiplicity is 2.

If A = [ 0 0 ; 0 0 ], then the geometric multiplicity and the algebraic multiplicity of the eigenvalue 0 are both equal to 2. ♦
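For these 2×2 examples the geometric multiplicity of the eigenvalue 0 is dim ker(A) = 2 − rank(A), and a tiny determinant-based rank computation suffices (a sketch, not a general-purpose routine):

```python
# Rank of a 2x2 matrix [[a, b], [c, d]] via the determinant; enough for
# the two examples above, where the eigenvalue in question is 0.

def rank2(a, b, c, d, eps=1e-12):
    if abs(a * d - b * c) > eps:
        return 2
    if max(abs(a), abs(b), abs(c), abs(d)) > eps:
        return 1
    return 0

geo_mult_jordan_block = 2 - rank2(0, 1, 0, 0)  # A = [0 1; 0 0]
geo_mult_zero_matrix = 2 - rank2(0, 0, 0, 0)   # A = [0 0; 0 0]
```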

We state the following criterion for stability of 0 for the system x' = Ax without proof.

Theorem 3.4.2 Suppose that all the eigenvalues of A have nonpositive real parts. The equilibrium point 0 of the system x' = Ax is stable iff all eigenvalues with zero real part have their algebraic multiplicity equal to the geometric multiplicity.

Exercises.

1. Determine the stability of the system x' = Ax, where A is

(a) [ 1 0 ; 0 −1 ]

(b) [ 1 1 0 ; 0 −2 −1 ; 0 0 −1 ]

(c) [ 1 1 1 ; 1 0 1 ; 0 0 −2 ]

(d) [ −1 0 0 ; 0 −2 0 ; 1 0 −1 ]

(e) [ 0 1 0 ; 0 0 1 ; 0 0 0 ]

(f) [ 0 1 ; −1 0 ].

2. For the following systems, find the equilibrium points, and determine their stability. Indicate whether the stability is asymptotic.

(a) x1' = x1 + x2
    x2' = −x2 + x1³.

(b) x1' = −x1 + x2
    x2' = −x2 + x1³.

3.5 Lyapunov’s method

The basic idea behind Lyapunov’s method is the mathematical extension of a fundamental physical
observation:
If the energy of a system is continuously dissipated,
then the system eventually settles down to an equilibrium point.

Thus, we may conclude the stability of a system by examining the variation of its energy, which
is a scalar function.

In order to illustrate this, consider for example a nonlinear spring-mass system with a damper, whose dynamic equation is

mx'' + bx'|x'| + k0x + k1x³ = 0,

where

x denotes the displacement,
bx'|x'| is the term corresponding to nonlinear damping,
k0x + k1x³ is the nonlinear spring force.

Assume that the mass is pulled away by some distance and then released. Will the resulting motion be stable? It is very difficult to answer this question using the definitions of stability, since the general solution of the nonlinear equation is unavailable. However, examination of the energy of the system can tell us something about the motion pattern. Let E denote the mechanical energy, given by

E = (1/2)m(x')² + ∫₀ˣ (k0ξ + k1ξ³) dξ = (1/2)m(x')² + (1/2)k0x² + (1/4)k1x⁴.

Note that zero energy corresponds to the equilibrium point (x = 0 and x' = 0):

E = 0 iff [x = 0 and x' = 0].

Asymptotic stability implies the convergence of E to 0. Thus we see that the magnitude of a scalar quantity, namely the mechanical energy, indirectly says something about the magnitude of the state vector. In fact we have

E' = mx'x'' + (k0x + k1x³)x' = x'(−bx'|x'|) = −b|x'|³,

which shows that the energy of the system is continuously decreasing from its initial value. If it has a nonzero limit, then we note that x' must tend to 0, and then from physical considerations, it follows that x must also tend to zero, for the mass is subjected to a nonzero spring force at any position other than the natural length.
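The dissipation E' = −b|x'|³ can be checked in simulation. The sketch below (parameter values and initial condition are illustrative assumptions; the text keeps them symbolic) uses semi-implicit Euler and confirms that the energy decays substantially:

```python
# Semi-implicit Euler simulation of m x'' + b x'|x'| + k0 x + k1 x^3 = 0;
# the parameter values are assumptions chosen for illustration.
m, b, k0, k1 = 1.0, 0.2, 1.0, 0.5

def accel(x, v):
    return -(b * v * abs(v) + k0 * x + k1 * x ** 3) / m

def energy(x, v):
    return 0.5 * m * v * v + 0.5 * k0 * x * x + 0.25 * k1 * x ** 4

x, v, dt = 1.0, 0.0, 1e-3   # mass pulled out to x = 1 and released
E0 = energy(x, v)
for _ in range(20000):       # integrate up to t = 20
    v += dt * accel(x, v)
    x += dt * v
E_final = energy(x, v)
```

Most of the initial mechanical energy is dissipated by the damper over the simulated interval, consistent with E' = −b|x'|³ ≤ 0.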

Lyapunov’s method is a generalisation of the method used in the above spring-mass system to
more complex systems. Faced with a set of nonlinear differential equations, the basic procedure of
Lyapunov’s method is to generate a scalar ‘energy-like’ function for the system, and examine the
time-variation of that scalar function. In this way, conclusions may be drawn about the stability of
the set of differential equations without using the difficult stability definitions or requiring explicit
knowledge of solutions.

3.6 Lyapunov functions

We begin this section with a remark: in the following analysis, for notational simplicity, we assume that the system has been transformed so that the equilibrium point under consideration is the origin. We can do this as follows. Suppose that xe is the specific equilibrium point of the system x' = f(x) under consideration. Then we introduce the new variable y = x − xe, and we obtain the new set of equations in y given by

y' = f(y + xe).

Note that this new system has 0 as an equilibrium point. As there is a one-to-one correspondence
between the solutions of the two systems, we develop the following stability analysis assuming
that the equilibrium point of interest is 0.

The energy function has two properties. The first is a property of the function itself: it is
strictly positive unless the state variables are zero. The second property is associated with the
dynamics of the system: the function is decreasing when we substitute the state variables into the
function, and view the overall function as a function of time. We give the precise definition below.

Definition. A function V : Rn → R is called a Lyapunov function for the differential equation

x' = f(x) (3.3)

if there exists an R > 0 such that in the ball B(0, R), V has the following properties:

L1. (Local positive definiteness)
    For all x ∈ B(0, R), V(x) ≥ 0.
    If x ∈ B(0, R) and V(x) = 0, then x = 0.
    V(0) = 0.

L2. V is continuously differentiable, and for all x ∈ B(0, R), ∇V(x) · f(x) ≤ 0.

If in addition to L1 and L2, the function satisfies

L3. ∇V(x) · f(x) = 0 iff x = 0,

then V is called a strong Lyapunov function. Here ∇V := [ ∂V/∂x1 · · · ∂V/∂xn ].

For example, the function V given by

V(x1, x2) = (1/2)ml²x2² + mgl(1 − cos x1), (3.4)

which is the mechanical energy of the pendulum (see the Example on page 52), is locally positive definite, that is, it satisfies L1 above.

If the R in L1 can be chosen to be arbitrarily large, then the function is said to be globally positive definite. For instance, the mechanical energy of the spring-mass system considered in the previous section is globally positive definite. Note that for this system, the kinetic energy

(1/2)mx2² = (1/2)m(x')²

is not positive definite by itself, because it can be equal to zero for nonzero values of the position x1 = x.

Let us describe the geometrical meaning of locally positive definite functions on R2 . If we plot
V (x1 , x2 ) versus x1 , x2 in 3-dimensional space, it typically corresponds to a surface looking like
an upward cup, and the lowest point of the cup is located at the origin. See Figure 3.6.

We now observe that L2 implies that the 'energy' decreases with time. Define v : [0, ∞) → R by v(t) = V(x(t)), t ≥ 0, where x is a solution to (3.3). Now we compute the derivative with respect to time of v (that is, of the map t ↦ V(x(t))). By the chain rule we have

v'(t) = (d/dt)(V(x(t))) = ∇V(x(t)) · x'(t) = ∇V(x(t)) · f(x(t)).

Since L2 holds, we know that if x(t) lies within B(0, R), then v'(t) ≤ 0. In Figure 3.6, we see that corresponding to a point x(t) in the phase plane, we have the value v(t) = V(x(t)) on the cup-shaped surface. As time progresses, the point moves along the surface of the cup. Since v(t) is nonincreasing (while the trajectory stays in B(0, R)), the point on the surface can only move downhill; when V is a strong Lyapunov function, it is driven all the way down to the origin. In this manner we can prove stability and asymptotic stability. We do this in the next section.

Figure 3.6: Lyapunov function.

Exercise. Verify that (3.4) is a Lyapunov function for the system in the Example on page 52.

3.7 Sufficient condition for stability

Theorem 3.7.1 Let the system x' = f(x) have an equilibrium point at 0.

1. If there exists a Lyapunov function V for this system, then the equilibrium point 0 is stable.
2. If there exists a strong Lyapunov function, then 0 is asymptotically stable.

Proof (Sketch) 1. To show stability we must show that given any R > 0, there exists an r > 0 such that any trajectory starting inside B(0, r) remains inside the ball B(0, R) for all future time.

Let m be the minimum of V on the sphere S(0, R) = {x ∈ Rn | ‖x‖ = R}. Since V is continuous (and S(0, R) is compact), the minimum exists, and as V is positive definite, m is positive. Furthermore, since V(0) = 0 and V is continuous, there exists an r > 0 such that for all x ∈ B(0, r), V(x) < m; see Figure 3.7.

Consider now a trajectory whose initial point x(0) is within the ball B(0, r). Since t ↦ V(x(t)) is nonincreasing, V(x(t)) remains strictly smaller than m, and therefore the trajectory cannot possibly cross the sphere S(0, R). Thus, any trajectory starting inside the ball B(0, r) remains inside the ball B(0, R), and therefore stability of 0 is guaranteed.

Figure 3.7: Proof of stability.

2. Let us now assume that x ↦ ∇V(x) · f(x) is negative definite, and show asymptotic stability by contradiction. Consider a trajectory starting in some ball B(0, r) as constructed above, corresponding to some R where the negative definiteness holds. Then the trajectory will remain in the ball B(0, R) for all future time. Since V is bounded below and t ↦ V(x(t)) decreases continually, V(x(t)) tends to a limit L, such that for all t ≥ 0, V(x(t)) ≥ L. Assume that this limit is not 0, that is, L > 0. Then since V is continuous and V(0) = 0, there exists a ball B(0, r0) that the solution never enters; see Figure 3.8.

Figure 3.8: Proof of asymptotic stability.

But then, since x ↦ −∇V(x) · f(x) is continuous and positive definite, and since the set Ω = {x ∈ Rn | r0 ≤ ‖x‖ ≤ R} is compact, x ↦ −∇V(x) · f(x) : Ω → R has some minimum value L1 > 0.

This is a contradiction, because it would imply that v(t) := V(x(t)) decreases from its initial value v(0) = V(x(0)) to a value strictly smaller than L in a finite time T0 > (v(0) − L)/L1 ≥ 0: indeed, by the fundamental theorem of calculus, we have

v(T0) − v(0) = ∫₀^{T0} v'(t) dt = ∫₀^{T0} ∇V(x(t)) · f(x(t)) dt ≤ ∫₀^{T0} (−L1) dt = −L1T0 < L − v(0),

and so v(T0) < L. Hence all trajectories starting in B(0, r) asymptotically converge to the origin.

Example. A simple pendulum with viscous damping is described by

θ'' + θ' + sin θ = 0.

Defining x1 = θ and x2 = θ', we obtain the state equations

x1' = x2 (3.5)
x2' = −x2 − sin x1. (3.6)

Consider the function V : R² → R given by

V(x1, x2) = (1 − cos x1) + (1/2)x2².

This function is locally positive definite (Why?). As a matter of fact, this function represents the total energy of the pendulum, composed of the sum of the potential energy and the kinetic energy. We observe that

∇V(x) · f(x) = [ sin x1  x2 ] · (x2, −x2 − sin x1) = x2 sin x1 + x2(−x2 − sin x1) = −x2² ≤ 0.

(This is expected, since the damping term absorbs energy.) Thus by invoking the above theorem, one can conclude that the origin is a stable equilibrium point. However, with this Lyapunov function, one cannot draw any conclusion about whether the equilibrium point is asymptotically stable, since V is not a strong Lyapunov function. But by considering yet another Lyapunov function, one can show that 0 is an asymptotically stable equilibrium point, and we do this in Exercise 1 below. ♦
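The monotone decay of V along solutions of (3.5)-(3.6) can be checked numerically. The sketch below (initial condition, step size and horizon are illustrative choices) integrates the damped pendulum with fixed-step RK4 and records V along the trajectory:

```python
import math

# RK4 integration of x1' = x2, x2' = -x2 - sin x1, recording
# V(x1, x2) = (1 - cos x1) + x2^2/2 along the solution.

def f(x):
    x1, x2 = x
    return (x2, -x2 - math.sin(x1))

def V(x):
    return (1.0 - math.cos(x[0])) + 0.5 * x[1] ** 2

def rk4_step(x, dt):
    k1 = f(x)
    k2 = f((x[0] + 0.5 * dt * k1[0], x[1] + 0.5 * dt * k1[1]))
    k3 = f((x[0] + 0.5 * dt * k2[0], x[1] + 0.5 * dt * k2[1]))
    k4 = f((x[0] + dt * k3[0], x[1] + dt * k3[1]))
    return (x[0] + dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            x[1] + dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))

x = (1.0, 0.0)             # released from rest at 1 radian
values = [V(x)]
for _ in range(2000):       # t from 0 to 20 with dt = 0.01
    x = rk4_step(x, 0.01)
    values.append(V(x))
```

The recorded values are nonincreasing (up to integration error) and decay towards 0, matching ∇V · f = −x2² ≤ 0.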

Exercises.

1. Prove that the function V : R² → R given by

V(x1, x2) = (1/2)x2² + (1/2)(x1 + x2)² + 2(1 − cos x1)

is also a Lyapunov function for the system (3.5)-(3.6) (although it has no obvious physical meaning). Prove that this function is in fact a strong Lyapunov function. Conclude that the origin is an asymptotically stable equilibrium point for the system.
2. Consider the system

x1' = x2 − x1(x1² + x2²) (3.7)
x2' = −x1 − x2(x1² + x2²). (3.8)

Let V : R² → R be given by V(x1, x2) = x1² + x2².

(a) Verify that V is a Lyapunov function. Is it a strong Lyapunov function?
(b) Prove that the origin is an asymptotically stable equilibrium point for the system (3.7)-(3.8).
Chapter 4

Existence and uniqueness

4.1 Introduction

A major theoretical question in the study of ordinary differential equations is: When do solutions
exist? In this chapter, we study this question, and also the question of uniqueness of solutions.

To begin with, note that in Chapter 1, we have shown the existence and uniqueness of the solution to

x' = Ax, x(t0) = x0.

Indeed, the unique solution is given by x(t) = e^{(t−t0)A} x0.

It is too much to expect that one can show existence by actually giving a formula for the solution in the general case:

x' = f(x, t), x(0) = x0.

Instead one can prove a theorem that asserts the existence of a unique solution if the function f is not 'too bad'. We will prove the following "baby version" of such a result.

Theorem 4.1.1 Consider the differential equation

x' = f(x, t), x(t0) = x0, (4.1)

where f : R² → R is such that there exists a constant L such that for all x1, x2 ∈ R, and all t ≥ t0,

|f(x1, t) − f(x2, t)| ≤ L|x1 − x2|.

Then there exist a t1 > t0 and an x ∈ C¹[t0, t1] such that x(t0) = x0 and x'(t) = f(x(t), t) for all t ∈ [t0, t1].

We will state a more general version of the above theorem later on (which is for nonscalar f ,
that is, for a system of equations). The proof of this more general theorem is very similar to the
proof of Theorem 4.1.1.

The next few sections of this chapter will be spent in proving Theorem 4.1.1. Before we get
down to gory detail, though, we should discuss the method of proof.

Let us start with existence. The simplest way to prove that an equation has a solution is to
write down a solution. Unfortunately, this seems impossible in our case. So we try a variation.


We write down functions which, while not solutions, are very good approximations to solutions:
they miss solving the differential equations by less and less. Then we try to take the limit of these
approximations and show that it is an actual solution.

Since all those words may not be much help, let's try an example. Suppose that we want to solve x² − 2 = 0. Here's a method: pick a number x0 > 0, and let

x1 = (1/2)(x0 + 2/x0),
x2 = (1/2)(x1 + 2/x1),
x3 = (1/2)(x2 + 2/x2),

and so on. We get a sequence of numbers whose squares get close to 2 rather rapidly. For instance, if x0 = 1, then the sequence goes

1, 3/2, 17/12, 577/408, 665857/470832, . . . ,

and the squares are

1, 2 1/4, 2 1/144, 2 1/166464, 2 1/221682772224, . . . .

Presumably the xn's approach a limit, and this limit is √2. Proving this has two parts. First, let's see that the numbers approach a limit. Notice that

xn² − 2 = (1/4)(xn−1 + 2/xn−1)² − 2 = (1/4)(xn−1 − 2/xn−1)² ≥ 0,

and so xn² ≥ 2 for all n ≥ 1. Clearly xn > 0 for all n (recall that x0 > 0). But then

xn − xn−1 = (1/2)(xn−1 + 2/xn−1) − xn−1 = (1/(2xn−1))(2 − xn−1²) < 0.

Hence (xn)n≥1 is decreasing, but is also bounded below (by 0), and therefore it has a limit.

Now we need to show that the limit is √2. This is easy. Suppose the limit is L. Then as

xn+1 = (1/2)(xn + 2/xn),

taking limits¹, we get L = (1/2)(L + 2/L), and so L² = 2. So L does indeed equal √2, that is, we have constructed a solution to the equation x² − 2 = 0.

The iterative rule xn+1 = (xn + 2/xn)/2 may seem mysterious, and in part, it is constructed by running the last part of the proof backwards. Suppose x² − 2 = 0. Then x = 2/x, or 2x = x + 2/x, or x = (x + 2/x)/2. Now think of this equation not as an equation, but as a formula for producing a sequence, and we are done.
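The iteration above can be reproduced in exact rational arithmetic (a quick check, not part of the notes); it recovers precisely the fractions listed in the text.

```python
from fractions import Fraction

# The rule x_{n+1} = (x_n + 2/x_n)/2 carried out with exact fractions,
# starting from x_0 = 1.
x = Fraction(1)
iterates = [x]
for _ in range(4):
    x = (x + 2 / x) / 2
    iterates.append(x)
```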

Analogously, in order to prove Theorem 4.1.1, we will proceed in 3 steps:

1. Take the equation, manipulate it cleverly, and turn it into a rule for producing a sequence
of approximate solutions to the equation.
2. Show that the sequence converges.
3. Show that the limit solves the equation.
¹To be precise, this is justified (using the algebra of limits) provided that lim_{n→∞} xn ≠ 0. But since xn² > 2, we have xn > 1 for all n, so that surely L ≥ 1.
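A toy instance of step 1 for differential equations (a sketch anticipating the iteration constructed later in the chapter): for x' = x, x(0) = 1, define x_{n+1}(t) = 1 + integral from 0 to t of x_n(s) ds, and iterate on polynomial coefficient lists. The iterates are the Taylor partial sums of e^t, the actual solution.

```python
from fractions import Fraction

# Iterating x_{n+1}(t) = 1 + integral_0^t x_n(s) ds on coefficient lists,
# where x[k] is the coefficient of t^k.

def integrate(coeffs):
    # antiderivative with zero constant term: c_k t^k -> c_k/(k+1) t^(k+1)
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]

def picard_step(coeffs):
    out = integrate(coeffs)
    out[0] += 1
    return out

x = [Fraction(1)]          # x_0(t) = 1
for _ in range(5):
    x = picard_step(x)
# x is now 1 + t + t^2/2! + ... + t^5/5!
```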

4.2 Analytic preliminaries

In order to prove Theorem 4.1.1, we need to develop a few facts about calculus in the vector space C[a, b]. In particular, we need to know something about when two vectors from C[a, b] (which are really two functions!) are "close". This is needed, since only then can we talk about a convergent sequence of approximate solutions, and carry out the plan mentioned in the previous section.

It turns out that, just as in Chapter 1, where in order to prove convergence of a sequence of matrices we used the matrix norm given by (1.16), we introduce the following "norm" in the vector space C[a, b]: it is simply the function ‖ · ‖ : C[a, b] → R defined by

‖f‖ = sup_{t∈[a,b]} |f(t)|,

for f ∈ C[a, b]. (By the Extreme Value Theorem, the "sup" above can be replaced by "max", since f is continuous on [a, b].) With the help of the above norm, we can discuss distances in C[a, b]. We think of ‖f − g‖ as the distance between the functions f and g in C[a, b]. So we can also talk about convergence:


Definition. Let (fk)k∈N be a sequence in C[a, b]. The series Σ_{k=1}^∞ fk is said to be convergent if there exists an f ∈ C[a, b] such that for every ε > 0, there exists an N ∈ N such that for all n > N, there holds

‖ Σ_{k=1}^n fk − f ‖ < ε.

The series Σ_{k=1}^∞ fk is said to be absolutely convergent if the real series Σ_{k=1}^∞ ‖fk‖ converges, that is, there exists an S ∈ R such that for every ε > 0, there exists an N ∈ N such that for all n > N, there holds

| Σ_{k=1}^n ‖fk‖ − S | < ε.

Now we prove the following remarkable result, which we will use in the next section to prove our existence theorem about differential equations.

Theorem 4.2.1 Absolutely convergent series in C[a, b] converge, that is, if Σ_{k=1}^∞ ‖fk‖ converges in R, then Σ_{k=1}^∞ fk converges in C[a, b].

Note that the above theorem gives a lot for very little. Just by having convergence of a real series, we get a much richer convergence, namely that of a sequence of functions in ‖ · ‖, which in particular gives pointwise convergence in [a, b]; that is, we get convergence of an infinite family of sequences! Indeed this remarkable proof works since it is based on the notion of "uniform" convergence of the functions: convergence in the norm gives a uniform rate of convergence at all points t, which is stronger than simply saying that at each t the sequence of partial sums converges.

Proof Let t ∈ [a, b]. Then |fk(t)| ≤ ‖fk‖. So by the Comparison Test, the real series
∑_{k=1}^∞ |fk(t)| converges, and hence the series ∑_{k=1}^∞ fk(t) also converges; let
f(t) = ∑_{k=1}^∞ fk(t). So we obtain a function t ↦ f(t) from [a, b] to R. We will show that f is
continuous on [a, b] and that ∑_{k=1}^∞ fk converges to f in C[a, b].

The real content of the theorem is to prove that f is a continuous function on [a, b]. First
let us see what we have to prove. To prove that f is continuous on [a, b], we must show that it
is continuous at each point c ∈ [a, b]. To prove that f is continuous at c, we must show that for
every ε > 0, there exists a δ > 0 such that if t ∈ [a, b] satisfies |t − c| < δ, then |f(t) − f(c)| < ε.
We prove this by finding a continuous function near f. Assume that c and ε have been picked.
Since ∑_{k=1}^∞ ‖fk‖ is finite, we can choose an N ∈ N such that ∑_{k=N+1}^∞ ‖fk‖ < ε/3.
Let sN(t) = ∑_{k=1}^N fk(t). Then sN is continuous, since it is the sum of finitely many
continuous functions, and

    |sN(t) − f(t)| = | ∑_{k=N+1}^∞ fk(t) | ≤ ∑_{k=N+1}^∞ |fk(t)| ≤ ∑_{k=N+1}^∞ ‖fk‖ < ε/3,

regardless of t. (This last part is the crux of the proof.) Since sN is continuous, we can pick δ > 0
so that if |t − c| < δ, then |sN(t) − sN(c)| < ε/3. This is the delta we want: if |t − c| < δ, then

    |f(t) − f(c)| ≤ |f(t) − sN(t)| + |sN(t) − sN(c)| + |sN(c) − f(c)| < ε/3 + ε/3 + ε/3 = ε.
This proves the continuity of f .

The rest of the proof is straightforward. We have

    | ∑_{k=1}^n fk(t) − f(t) | = | ∑_{k=n+1}^∞ fk(t) | ≤ ∑_{k=n+1}^∞ |fk(t)| ≤ ∑_{k=n+1}^∞ ‖fk‖ → 0 as n → ∞.

This shows that ∑_{k=1}^∞ fk converges to f in C[a, b].


Theorem 4.2.2 Suppose ∑_{k=1}^∞ fk converges absolutely to f in C[a, b]. Then

    ∑_{k=1}^∞ ∫_a^b fk(t) dt = ∫_a^b f(t) dt.

Proof We need to show that for any ε > 0, there exists an N ∈ N such that for all n > N,

    | ∑_{k=1}^n ∫_a^b fk(t) dt − ∫_a^b f(t) dt | < ε.

Choose N such that if n ≥ N, then ∑_{k=n+1}^∞ ‖fk‖ < ε/(b − a). Then

    | ∑_{k=1}^n fk(t) − f(t) | = | ∑_{k=n+1}^∞ fk(t) | ≤ ∑_{k=n+1}^∞ |fk(t)| ≤ ∑_{k=n+1}^∞ ‖fk‖ < ε/(b − a).

But then

    | ∑_{k=1}^n ∫_a^b fk(t) dt − ∫_a^b f(t) dt | = | ∫_a^b ( ∑_{k=1}^n fk(t) − f(t) ) dt |
        ≤ ∫_a^b | ∑_{k=1}^n fk(t) − f(t) | dt ≤ ∫_a^b ε/(b − a) dt = ε.
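Theorem 4.2.2 is easy to illustrate numerically (this sketch is not part of the notes; the series used here is a made-up example). The series ∑ cos(kt)/k² is absolutely convergent in C[0, 1] since ‖cos(k·)/k²‖ = 1/k², and each term integrates exactly to sin(k)/k³, so the sum of the integrals should agree with the integral of a partial sum up to quadrature error:

```python
import math

def partial_sum(t, n):
    # s_n(t) = sum_{k=1}^n cos(k t)/k^2, a partial sum of an absolutely
    # convergent series in C[0, 1]
    return sum(math.cos(k * t) / k**2 for k in range(1, n + 1))

def integral_of_sum(n, steps=10_000):
    # trapezoidal approximation of the integral of s_n over [0, 1]
    h = 1.0 / steps
    total = 0.5 * (partial_sum(0.0, n) + partial_sum(1.0, n))
    total += sum(partial_sum(i * h, n) for i in range(1, steps))
    return total * h

def sum_of_integrals(n):
    # integral of cos(k t)/k^2 over [0, 1] is sin(k)/k^3, summed term by term
    return sum(math.sin(k) / k**3 for k in range(1, n + 1))

diff = abs(integral_of_sum(50) - sum_of_integrals(50))
```

The two quantities agree to quadrature accuracy, exactly as the theorem predicts for the limit n → ∞.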

4.3 Proof of Theorem 4.1.1

4.3.1 Existence

Step 1. The first step in the proof is to change the equation into one which we can use for
creating a sequence. The principle for doing this is a useful and important one. We want to
arrange matters so that successive terms are close together. For this, integration is much better
than differentiation. Two functions that are close together have their integrals close together²,
but their derivatives can be far apart³.

So we should change (4.1) into something involving integrals. The easiest way to do this is to
integrate both sides. Taking into account the initial condition, we see that we get

    x(t) − C = ∫_{t0}^t f(x(t), t) dt,

that is,

    x(t) = ∫_{t0}^t f(x(t), t) dt + C.
Now we can construct our sequence.

We begin with

    x0(t) = C,

and define x1, x2, x3, . . . inductively:

    x1(t) = ∫_{t0}^t f(x0(t), t) dt + C,
    x2(t) = ∫_{t0}^t f(x1(t), t) dt + C,
    ...
    x_{k+1}(t) = ∫_{t0}^t f(x_k(t), t) dt + C,

and so on. All the functions x0, x1, . . . are continuous functions. This is the sequence we will
work with.
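This iteration (Picard iteration) can be tried out numerically. The sketch below is an illustration, not part of the notes: it runs the iteration for x′ = x, x(0) = 1 on [0, 1], where the iterates are exactly the Taylor partial sums of e^t, approximating each integral by the trapezoid rule.

```python
import math

def picard_iterates(f, C, t0, t, n, steps=1000):
    # x_0(t) = C;  x_{k+1}(t) = C + integral from t0 to t of f(x_k(s), s) ds,
    # with the integral approximated by the trapezoid rule on a fixed grid
    h = (t - t0) / steps
    grid = [t0 + i * h for i in range(steps + 1)]
    x = [C] * (steps + 1)                    # x_0 is the constant function C
    for _ in range(n):
        vals = [f(x[i], grid[i]) for i in range(steps + 1)]
        new = [C]
        acc = 0.0
        for i in range(1, steps + 1):
            acc += 0.5 * (vals[i - 1] + vals[i]) * h
            new.append(C + acc)
        x = new
    return x[-1]                             # value of x_n at the endpoint t

# For x' = x, x(0) = 1, the iterates are the Taylor partial sums of e^t,
# so twenty iterations at t = 1 should be very close to e.
approx = picard_iterates(lambda x, t: x, 1.0, 0.0, 1.0, 20)
```

Twenty iterations already reproduce e ≈ 2.71828 to several digits, matching the geometric convergence established in Step 2 below.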

Step 2. We want to show that the sequence of functions x0 , x1 , x2 , . . . converges. This sequence
is the sequence of partial sums of the series
x0 + (x1 − x0 ) + (x2 − x1 ) + . . . .
2 At least for a while. As the interval of integration gets large, the functions drift apart.
3 For instance, let f(x) = (1/10^10) sin(10^100 x), a function close to 0. If we integrate, we get −(1/10^110) cos(10^100 x),
which is really tiny; if we differentiate, we get 10^90 cos(10^100 x), which can get very large.

So we will prove that this series, namely

    x0 + ∑_{k=0}^∞ (x_{k+1} − x_k),

converges. In order to do this, we show that it converges absolutely, and then by Theorem 4.2.1,
we would be done.

Thus we need to look at x_{k+1} − x_k:

    x_{k+1}(t) − x_k(t) = ∫_{t0}^t [f(x_k(t), t) − f(x_{k−1}(t), t)] dt.

We know that

    |f(x_k(t), t) − f(x_{k−1}(t), t)| ≤ L |x_k(t) − x_{k−1}(t)|,

where L is a number not depending on k or t. Then

    |x_{k+1}(t) − x_k(t)| ≤ ∫_{t0}^t |f(x_k(t), t) − f(x_{k−1}(t), t)| dt
                          ≤ ∫_{t0}^t L |x_k(t) − x_{k−1}(t)| dt
                          ≤ ∫_{t0}^t L ‖x_k − x_{k−1}‖ dt
                          = L (t − t0) ‖x_k − x_{k−1}‖.
So if we work in the interval t0 ≤ t ≤ t0 + 1/(2L), we have

    ‖x_{k+1} − x_k‖ ≤ L · (1/(2L)) · ‖x_k − x_{k−1}‖ = (1/2) ‖x_k − x_{k−1}‖.
Then, as one may check using induction,

    ‖x_{k+1} − x_k‖ ≤ (1/2^k) ‖x1 − x0‖                      (4.2)

for all k. Thus

    ‖x0‖ + ∑_{k=0}^∞ ‖x_{k+1} − x_k‖ ≤ ‖x0‖ + ∑_{k=0}^∞ (1/2^k) ‖x1 − x0‖ < ∞,

and the series x0 + ∑_{k=0}^∞ (x_{k+1} − x_k) converges absolutely. By Theorem 4.2.1, it converges
and has a limit, which we denote by x.

Step 3. Now we need to know that x satisfies (4.1). We begin by taking limits in

    x_{k+1}(t) = ∫_{t0}^t f(x_k(t), t) dt + C.

By Theorem 4.2.2, taking limits inside the integral can be justified (see the Exercise on page 70
below), and so we get

    x(t) = ∫_{t0}^t f(x(t), t) dt + C.                       (4.3)

We see that x(t0) = C because the integral from t0 to t0 is 0. Also, by the Fundamental Theorem
of Calculus, x can be differentiated (since it is given as an integral), and

    x′(t) = f(x(t), t).

This proves the existence.

4.3.2 Uniqueness

Finally, we prove uniqueness. Let x1 and x2 be two solutions. Then x1 and x2 satisfy the
“integrated” equation (4.3) as well.

Let t∗ = max{t ∈ [0, T] | x1(τ) = x2(τ) for all τ ≤ t}. In other words, t∗ is the last time
instant up to which x1 and x2 agree, after which they start to differ. See Figure 4.1.

Figure 4.1: Definition of t∗ (the curves x1 and x2 coincide on [0, t∗] and separate afterwards).

We have

    x1(t) − x1(t∗) = ∫_{t∗}^t x1′(τ) dτ = ∫_{t∗}^t f(x1(τ), τ) dτ,
    x2(t) − x2(t∗) = ∫_{t∗}^t x2′(τ) dτ = ∫_{t∗}^t f(x2(τ), τ) dτ,

and so, since x1(t∗) = x2(t∗),

    x1(t) − x2(t) = ∫_{t∗}^t [f(x1(τ), τ) − f(x2(τ), τ)] dτ.

Let N ∈ N be such that

    N > max{ 1, 1/L, 1/(L(T − t∗)) },

and let

    M := max_{τ ∈ [t∗, t∗ + 1/(LN)]} |x1(τ) − x2(τ)|.

(Since N > 1/(L(T − t∗)), we know that t∗ + 1/(LN) < T.) Then for all t ∈ [t∗, t∗ + 1/(LN)], we have
    |x1(t) − x2(t)| = | ∫_{t∗}^t (f(x1(τ), τ) − f(x2(τ), τ)) dτ |
                    ≤ ∫_{t∗}^t |f(x1(τ), τ) − f(x2(τ), τ)| dτ
                    ≤ ∫_{t∗}^t L |x1(τ) − x2(τ)| dτ
                    ≤ ∫_{t∗}^t L M dτ = L M (t − t∗)
                    ≤ L M (t∗ + 1/(LN) − t∗) = M/N.
Taking the maximum over t ∈ [t∗, t∗ + 1/(LN)], we obtain M ≤ M/N. By the definition of t∗ we
have M > 0, so this gives N ≤ 1, which is a contradiction to our choice of N (which satisfied N > 1).

This completes the proof of the theorem.

Exercise. (∗) Justify taking limits inside the integral in Step 3 of the proof.

4.4 The general case. Lipschitz condition.

We begin by introducing the class of Lipschitz functions.

Definition. A function f : Rn → Rn is called locally Lipschitz if for every r > 0 there exists a
constant L such that for all x, y ∈ B(0, r),

    ‖f(x) − f(y)‖ ≤ L ‖x − y‖.                               (4.4)

If there exists a constant L such that (4.4) holds for all x, y ∈ Rn, then f is said to be globally
Lipschitz.

For f : R → R, we observe that the following implications hold:

f is continuously differentiable ⇒ f is locally Lipschitz ⇒ f is continuous.

That the inclusions of these three classes are strict can be seen by observing that f(x) = |x| is
globally Lipschitz, but not differentiable at 0, and that the function f(x) = √|x| is continuous,
but not locally Lipschitz. (See Exercise 1c below.)

Just like Theorem 4.1.1, the following theorem can be proved; it gives a sufficient condition
for the unique existence of a solution to the initial value problem of an ODE.

Theorem 4.4.1 If there exists an r > 0 and a constant L such that the function f satisfies

    ‖f(x, t) − f(y, t)‖ ≤ L ‖x − y‖ for all x, y ∈ B(0, r) and all t ≥ t0,      (4.5)

then there exists a t1 > t0 such that the differential equation

    x′(t) = f(x(t), t), x(t0) = x0 ∈ B(0, r),

has a unique solution for all t ∈ [t0, t1].

If the condition (4.5) holds, then f is said to be locally Lipschitz in x uniformly with respect
to t.

The existence theorem above is of a local character, in the sense that the existence of a solution
x∗ is guaranteed only in a small interval [t0 , t1 ]. We could, of course, take this solution and examine
the new initial value problem

    x′(t) = f(x(t), t), t ≥ t1, x(t1) = x∗(t1).

The existence theorem then guarantees a solution in a further small neighbourhood, so that the
solution can be “extended”. The process can then be repeated. However, it might happen that
the lengths of the intervals get smaller and smaller, so that we cannot really say that such an
extension will yield a solution for all times t ≥ 0. We illustrate this by considering the following
example.

Example. Consider the initial value problem

    x′ = 1 + x², t ≥ 0, x(0) = 0.

Then it can be shown that f(x) = 1 + x² is locally Lipschitz in x (trivially uniformly in t). So a
unique solution exists in a small time interval. In fact, we can explicitly solve the above equation
and find that the solution is given by

    x(t) = tan t,  t ∈ [0, π/2).

The solution cannot be extended to an interval larger than [0, π/2). ♦
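The blow-up is easy to see numerically. The following sketch (an illustration, not part of the notes) integrates x′ = 1 + x² with the forward Euler method: at t = 1 the numerical solution still tracks tan t closely, while near t = π/2 it grows without bound.

```python
import math

def euler(x0, t0, t1, steps):
    # forward Euler for x' = 1 + x^2 starting from x(t0) = x0
    h = (t1 - t0) / steps
    x = x0
    for _ in range(steps):
        x += h * (1 + x * x)
    return x

x_at_1 = euler(0.0, 0.0, 1.0, 100_000)          # should be close to tan(1)
x_near_blowup = euler(0.0, 0.0, 1.5, 400_000)   # tan(1.5) is already about 14.1
```

No numerical scheme can carry the solution past π/2, because the exact solution itself ceases to exist there.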

Exercises.

1. (a) Prove that every continuously differentiable function f : R → R is locally Lipschitz.


Hint: Mean value theorem.
(b) Prove that every locally Lipschitz function f : R → R is continuous.
(c) (∗) Show that f(x) = √|x| is not locally Lipschitz.

2. (∗) Find a Lipschitz constant for the following functions:

(a) sin x on R.
(b) 1/(1 + x²) on R.
(c) e^{−|x|} on R.
(d) arctan(x) on (−π, π).

3. (∗) Show that if a function f : R → R satisfies the inequality

    |f(x) − f(y)| ≤ L |x − y|² for all x, y ∈ R,

then f is continuously differentiable on R.

4.5 Existence of solutions

Here is an example of a differential equation with more than one solution for a given initial
condition.

Example. An equation with multiple solutions. Consider the equation

    x′ = 3 x^{2/3}

with the initial condition x(0) = 0. Two of its solutions are x(t) ≡ 0 and x(t) = t³. ♦
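Both formulas can be checked directly: for x(t) = t³ we have x′(t) = 3t² = 3(t³)^{2/3}, and the zero function trivially satisfies the equation. A small numerical confirmation (illustrative only, not part of the notes):

```python
def rhs(x):
    # right-hand side f(x) = 3 x^(2/3), evaluated for x >= 0
    return 3.0 * x ** (2.0 / 3.0)

ts = [i / 10 for i in range(11)]   # sample points in [0, 1]

# residual |x'(t) - f(x(t))| for each candidate solution:
# for x(t) = 0 the derivative is 0, and for x(t) = t^3 it is 3 t^2
res_zero = max(abs(0.0 - rhs(0.0)) for t in ts)
res_cube = max(abs(3 * t ** 2 - rhs(t ** 3)) for t in ts)
```

Both residuals vanish (up to floating-point rounding), so both candidates solve the same initial value problem, which is exactly the failure of uniqueness.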

In light of this example, one can hope that there may also be theorems applying to more
general situations than Theorem 4.4.1, which state that solutions exist (but say nothing about
uniqueness). And there are. The basic one is:

Theorem 4.5.1 Consider the differential equation

    x′ = f(x, t),

with initial condition x(t0) = x0. If the function f is continuous (but not necessarily Lipschitz
in x uniformly in t), then there exists a t1 > t0 and an x ∈ C¹[t0, t1] such that x(t0) = x0 and
x′(t) = f(x(t), t) for all t ∈ [t0, t1].

The above theorem says that the continuity of f is sufficient for the local existence of solutions.
However, it does not guarantee the uniqueness of the solution. We will not give a proof of this
theorem.

4.6 Continuous dependence on initial conditions

In this section, we will consider the following question: If we change the initial condition, then
how does the solution change? The initial condition is often a measured quantity (for instance
the estimated initial population of a species of fish in a lake at a certain starting time), and we
would like our differential equation to be such that the solution varies ‘continuously’ as the initial
condition changes. Otherwise we cannot be sure if the solution we have obtained is close to the real
situation at hand (since we might have incurred some measurement error in the initial condition).
We prove the following:

Theorem 4.6.1 Let f be globally Lipschitz in x (with constant L) uniformly in t. Let x1, x2
be solutions to the equation x′ = f(x, t), for t ∈ [t0, t1], with initial conditions x0,1 and x0,2,
respectively. Then for all t ∈ [t0, t1],

    ‖x1(t) − x2(t)‖ ≤ e^{L(t−t0)} ‖x0,1 − x0,2‖.

Proof Let g(t) := ‖x1(t) − x2(t)‖², t ∈ [t0, t1] (we write g to avoid a clash with the right-hand
side f of the equation). If ⟨·, ·⟩ denotes the standard inner product in Rn, we have

    g′(t) = 2 ⟨x1′(t) − x2′(t), x1(t) − x2(t)⟩ = 2 ⟨f(x1, t) − f(x2, t), x1(t) − x2(t)⟩
          ≤ 2 ‖f(x1, t) − f(x2, t)‖ ‖x1(t) − x2(t)‖   (by the Cauchy-Schwarz inequality)
          ≤ 2 L ‖x1(t) − x2(t)‖² = 2 L g(t).

In other words,

    d/dt ( e^{−2Lt} g(t) ) = e^{−2Lt} g′(t) − 2L e^{−2Lt} g(t) = e^{−2Lt} ( g′(t) − 2L g(t) ) ≤ 0.

Integrating from t0 to t ∈ [t0, t1] yields

    e^{−2Lt} g(t) − e^{−2Lt0} g(t0) = ∫_{t0}^t d/dτ ( e^{−2Lτ} g(τ) ) dτ ≤ 0,

that is, g(t) ≤ e^{2L(t−t0)} g(t0). Taking square roots, we obtain ‖x1(t) − x2(t)‖ ≤ e^{L(t−t0)} ‖x0,1 − x0,2‖.
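For the scalar equation x′ = x (globally Lipschitz with L = 1), the bound of Theorem 4.6.1 holds with equality, since x(t) = e^t x(0). A quick numerical confirmation (illustrative only; the initial values are made up):

```python
import math

def solve_linear(a, t):
    # exact solution of x' = x, x(0) = a (Lipschitz constant L = 1)
    return a * math.exp(t)

L = 1.0
a1, a2 = 2.0, 2.001   # two nearby initial conditions

# check |x1(t) - x2(t)| <= e^{L t} |a1 - a2| on a grid of times
# (a tiny slack absorbs floating-point rounding in the comparison)
ok = all(
    abs(solve_linear(a1, t) - solve_linear(a2, t))
    <= math.exp(L * t) * abs(a1 - a2) + 1e-12
    for t in [0.1 * i for i in range(31)]
)
```

The bound grows exponentially in t, so continuous dependence is a statement about finite time horizons: over long horizons even tiny measurement errors can be amplified enormously.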

Exercises.

1. Prove the Cauchy-Schwarz inequality: if x, y ∈ Rn, then |⟨x, y⟩| ≤ ‖x‖ ‖y‖.

Hint: If α ∈ R and x, y ∈ Rn, then we have 0 ≤ ⟨x + αy, x + αy⟩ = ⟨x, x⟩ + 2α⟨x, y⟩ + α²⟨y, y⟩,
and so the discriminant of this quadratic expression in α is ≤ 0, which gives the desired
inequality.
2. (∗) This exercise could have come earlier, but it’s really meant as practice for the next one.
We know that the solution to

    x′(t) = x(t), x(0) = a,

is x(t) = e^t a. Now suppose that we knew about differential equations, but not about
exponentials. We would then know that the above equation has a unique solution, but we would
be hampered in our ability to solve it by the fact that we have never come across the
function e^t! So we declare E to be the function defined as the unique solution to

    E′(t) = E(t), E(0) = 1.

(a) Let τ ∈ R. Show that t ↦ E(t + τ) solves the same differential equation as E (but with
    a different initial condition). Deduce from this that for all t ∈ R, E(t + τ) = E(t)E(τ).
(b) Show that for all t ∈ R, E(t)E(−t) = 1.
(c) Show that E(t) is never 0.
3. (∗∗) This is similar to the previous exercise, but requires more work. This time, imagine that
we know about existence and uniqueness of solutions for second order differential equations
of the type

    x″(t) + x(t) = 0, x(0) = a, x′(0) = b,

but nothing about trigonometric functions. (For example, from Theorem 1.5.5 we know that
this equation has a unique solution, as we can see by introducing the state vector
comprising x and x′.) We define the functions S and C as the unique solutions, respectively, to

    S″(t) + S(t) = 0, S(0) = 0, S′(0) = 1;
    C″(t) + C(t) = 0, C(0) = 1, C′(0) = 0.

(Privately, we know that S(t) = sin t and C(t) = cos t.) Now show that for all t ∈ R,
(a) S′(t) = C(t), C′(t) = −S(t).
(b) (S(t))² + (C(t))² = 1.
Hint: What is the derivative?

(c) S(t + τ) = S(t)C(τ) + C(t)S(τ) and C(t + τ) = C(t)C(τ) − S(t)S(τ), for all τ ∈ R.
(d) S(−t) = −S(t), C(−t) = C(t).
(e) There is a number α > 0 such that C(α) = 0. (That is, C(t) is not always positive.
    If we call the smallest such number π/2, we have a definition of π from differential
    equations.)
Chapter 5

Underdetermined ODEs

5.1 Control theory

The basic objects of study in control theory are underdetermined differential equations. This
means that there is some freedom in the variables satisfying the differential equation. An example
of an underdetermined algebraic equation is x + u = 10, where x, u are positive integers. There
is freedom in choosing, say, u, and once u is chosen, x is uniquely determined. In the same
manner, consider the differential equation

dx
(t) = f (x(t), u(t)), x(0) = x0 , t ≥ 0, (5.1)
dt
where x(t) ∈ Rn and u(t) ∈ Rm. So if written out, equation (5.1) is the set of equations

dx1
(t) = f1 (x1 (t), . . . , xn (t), u1 (t), . . . , um (t)), x1 (0) = x0,1
dt
..
.
dxn
(t) = fn (x1 (t), . . . , xn (t), u1 (t), . . . , um (t)), xn (0) = x0,n ,
dt
where f1 , . . . , fn denote the components of f . In (5.1), u is the free variable, called the input,
which is assumed to be continuous.


A control system is an equation of the type (5.1), with input u and state x. Once the input
u and the initial state x(0) = x0 are specified, the state x is determined. So one can think of a
control system as a box, which given the input u and initial state x(0) = x0 , manufactures the
state according to the law (5.1); see Figure 5.1.

Figure 5.1: A control system: the box x′(t) = f(x(t), u(t)), x(0) = x0, takes the input u and
produces the state x.

Example. Suppose the population x(t) at time t of fish in a lake evolves according to the
differential equation

    x′(t) = f(x(t)),

where f is some complicated function which is known to model the situation reasonably accurately.
A typical example is the Verhulst model, where

    f(x) = r x (1 − x/M).

(This model makes sense: first of all, the rate of increase in the population should increase with
the number of fish, since the more fish there are, the more they reproduce, and the larger the
population grows. However, if there are too many fish, there is competition for the limited food
resource, and then the population starts declining, which is captured by the factor 1 − x/M.)

Now suppose that we harvest the fish at a harvesting rate h. Then the population evolution
is described by

    x′(t) = f(x(t)) − h(t).

But the harvesting rate depends on the harvesting effort u:

    h(t) = x(t) u(t).

(The harvesting effort can be thought of in terms of the amount of time used for fishing, or the
number of fishing nets used, and so on. Then the above equation makes sense, as the harvesting
rate is clearly proportional to the number of fish: the more fish in the lake, the better the catch.)

Hence we arrive at the underdetermined differential equation

    x′(t) = f(x(t)) − x(t) u(t).

This equation is underdetermined, since u can be decided by the fisherman. This is the input,
and once it has been chosen, the population evolution is determined by the above equation,
given some initial population level x0 of the fish. ♦
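To get a feel for this control system, one can simulate it. The sketch below is illustrative (the parameter values are made up): it applies a constant effort u with 0 < u < r to the Verhulst model, and the population settles at the equilibrium x∗ = M(1 − u/r), obtained by setting x′ = 0 and solving r x (1 − x/M) − u x = 0.

```python
def simulate_harvest(r, M, u, x0, T, steps=100_000):
    # forward Euler for x' = r x (1 - x/M) - u x  (Verhulst model with
    # constant harvesting effort u)
    h = T / steps
    x = x0
    for _ in range(steps):
        x += h * (r * x * (1 - x / M) - u * x)
    return x

# with 0 < u < r the population approaches x* = M (1 - u/r)
r, M, u = 1.0, 100.0, 0.4
final = simulate_harvest(r, M, u, x0=50.0, T=60.0)   # expect about 60
```

Choosing a different effort u moves the long-run population level, which is precisely the “freedom” that makes this an underdetermined equation.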

If the function f is linear, that is, if

f (ξ, υ) = Aξ + Bυ

for some A ∈ Rn×n and B ∈ Rn×m , then the control system is said to be linear. We will study
this important class of systems in the rest of this chapter.

5.2 Solutions to the linear control system

In this section, we give a formula for the solution of a linear control system.

Theorem 5.2.1 Let A ∈ Rn×n and B ∈ Rn×m. If u ∈ (C[0, T])m, then the differential equation

    dx/dt (t) = A x(t) + B u(t), x(0) = x0, t ≥ 0            (5.2)

has the unique solution x on [0, +∞) given by

    x(t) = e^{tA} x0 + ∫_0^t e^{(t−τ)A} B u(τ) dτ.           (5.3)

Proof We have

    d/dt ( e^{tA} x0 + ∫_0^t e^{(t−τ)A} B u(τ) dτ ) = d/dt ( e^{tA} x0 + e^{tA} ∫_0^t e^{−τA} B u(τ) dτ )
        = A e^{tA} x0 + A e^{tA} ∫_0^t e^{−τA} B u(τ) dτ + e^{tA} e^{−tA} B u(t)
        = A ( e^{tA} x0 + e^{tA} ∫_0^t e^{−τA} B u(τ) dτ ) + e^{tA−tA} B u(t)
        = A ( e^{tA} x0 + ∫_0^t e^{(t−τ)A} B u(τ) dτ ) + B u(t),

and so it follows that x(·) given by (5.3) satisfies x′(t) = A x(t) + B u(t). Furthermore,

    e^{0A} x0 + ∫_0^0 e^{(0−τ)A} B u(τ) dτ = I x0 + 0 = x0.

Finally, we show uniqueness. If x1, x2 are both solutions to (5.2), then x := x1 − x2 satisfies
x′(t) = A x(t), x(0) = 0, and so from Theorem 1.5.5 it follows that x(t) = 0 for all t ≥ 0, that is,
x1 = x2.

Exercises.

1. Suppose that p ∈ C¹[0, T] is such that for all t ∈ [0, T], p(t) + α ≠ 0, and it satisfies the
scalar Riccati equation

    p′(t) = γ (p(t) + α)(p(t) + β).

Prove that q given by

    q(t) := 1/(p(t) + α), t ∈ [0, T],

satisfies

    q′(t) = γ (α − β) q(t) − γ, t ∈ [0, T].

2. Find p ∈ C¹[0, 1] such that p′(t) = (p(t))² − 1, t ∈ [0, 1], p(1) = 0.

3. It is useful in control theory to be able to make estimates on the size of the solution without
computing it precisely. Show that if for all t ≥ 0, |u(t)| ≤ M, then the solution x to
x′ = a x + b u (a ≠ 0), x(0) = x0, satisfies

    |x(t) − e^{ta} x0| ≤ (M |b| / a) [e^{ta} − 1].

That is, the solution differs from that of x′ = a x, x(0) = x0, by at most (M |b| / a)[e^{ta} − 1].
What happens to the bound as a → 0?
4. Newton’s law of cooling says that the rate of change of temperature is proportional to the
difference between the temperature of the object and the environmental temperature:

    Θ′ = κ (Θ − Θe),

where κ denotes the proportionality constant and Θe denotes the environmental temperature.
In an episode of the TV series CSI, the body of a murder victim was discovered at 11:00
a.m. The medical examiner arrived at 11:30 a.m., and found that the temperature of the body
was 94.6°F. The temperature of the room was 70°F. One hour later, in the same room, she
took the body temperature again and found that it was 93.4°F. Estimate the time of death,
using the fact that the body temperature of any living human being is around 98.6°F.

5.3 Controllability of linear control systems

A characteristic of underdetermined equations is that one can choose the free variable in such
a way that some desirable effect is produced on the other, dependent variable.

For example, if with our underdetermined algebraic equation x + u = 10 we wish to make
x < 5, then we can achieve this by choosing the free variable u to be strictly larger than 5.

Control theory is all about doing similar things with differential equations of the type (5.1).
The state variables x comprise the ‘to-be-controlled’ variables, which depend on the free variables
u, the inputs. For example, in the case of an aircraft, the speed, altitude and so on are the to-be-
controlled variables, while the angle of the wing flaps, the speed of the propeller and so on, which
the pilot can specify, are the inputs. So one of the basic questions in control theory is then the
following:

How do we choose the control inputs to achieve regulation of the state variables?

For instance, one may wish to drive the state to zero or some other desired value of the state
at some time instant T . This brings us naturally to the notion of controllability which, roughly
speaking, means that any state can be driven to any other state using an appropriate control.

For the sake of simplicity, we restrict ourselves to linear systems: x′(t) = Ax(t) + Bu(t), t ≥ 0,
where A ∈ Rn×n and B ∈ Rn×m . We first give the definition of controllability for such a linear
control system.

Example. Suppose a lake contains two species of fish, which we simply call ‘big fish’ and ‘small
fish’, and which form a predator-prey pair. Suppose that the evolution of their populations xb
and xs is reasonably accurately modelled by

    [ xb′(t) ]   [ a11  a12 ] [ xb(t) ]
    [ xs′(t) ] = [ a21  a22 ] [ xs(t) ].

Now suppose that one is harvesting these fish at harvesting rates hb and hs (which are inputs,
since they can be decided by the fisherman). The model describing the evolution of the
populations then becomes

    [ xb′(t) ]   [ a11  a12 ] [ xb(t) ]   [ hb(t) ]
    [ xs′(t) ] = [ a21  a22 ] [ xs(t) ] − [ hs(t) ].

The goal is to harvest the species of fish over some time period [0, T] in such a manner that,
starting from the initial population levels

    [ xb(0) ]   [ xb,i ]
    [ xs(0) ] = [ xs,i ],

we are left with the desired population levels

    [ xb(T) ]   [ xb,f ]
    [ xs(T) ] = [ xs,f ].

For example, if one of the species of fish is nearing extinction, it might be important to maintain
some critical levels of the populations of the predator versus the prey. Thus we see that
controllability problems arise quite naturally from applications. ♦

Definition. The system

    dx/dt (t) = A x(t) + B u(t), t ∈ [0, T]                  (5.4)

is said to be controllable at time T if for every pair of vectors x0, x1 in Rn, there exists a control
u ∈ (C[0, T])m such that the solution x of (5.4) with x(0) = x0 satisfies x(T) = x1.

Examples.

1. (A controllable system) Consider the system

    x′(t) = u(t), t ∈ [0, T],

so that A = 0, B = 1. Given x0, x1 ∈ R, define u ∈ C[0, T] to be the constant function

    u(t) = (x1 − x0)/T, t ∈ [0, T].

By the fundamental theorem of calculus,

    x(T) = x(0) + ∫_0^T x′(τ) dτ = x0 + ∫_0^T u(τ) dτ = x0 + ((x1 − x0)/T)(T − 0) = x1.

2. (An uncontrollable system) Consider the system

    x1′(t) = x1(t) + u(t),                                   (5.5)
    x2′(t) = x2(t),                                          (5.6)

so that

    A = [ 1  0 ],    B = [ 1 ]
        [ 0  1 ]         [ 0 ].

The equation (5.6) implies that x2(t) = e^t x2(0), and so if x2(0) > 0, then x2(t) > 0 for all
t ≥ 0. So a final state with the x2-component negative is never reachable by any control.


We would like to characterise the property of controllability in terms of the matrices A and
B. For this purpose, we introduce the notion of reachable space at time T :

Definition. The reachable space of (5.4) at time T, denoted by RT, is defined as the set of all
x ∈ Rn for which there exists a control u ∈ (C[0, T])m such that

    x = ∫_0^T e^{(T−τ)A} B u(τ) dτ.                          (5.7)

Note that the above simply says that if we run the differential equation (5.4) with input u and
with initial condition x(0) = 0, then RT is the set of all points in the state space that are
‘reachable’ at time T starting from 0 by means of some input u.

We now prove that RT is a subspace of Rn .

Lemma 5.3.1 RT is a subspace of Rn .

Proof We verify that RT is nonempty, and closed under addition and scalar multiplication.

S1 If we take u = 0, then

    ∫_0^T e^{(T−τ)A} B u(τ) dτ = 0,

and so 0 ∈ RT.

S2 If x1, x2 ∈ RT, then there exist u1, u2 in (C[0, T])m such that

    x1 = ∫_0^T e^{(T−τ)A} B u1(τ) dτ and x2 = ∫_0^T e^{(T−τ)A} B u2(τ) dτ.

Thus u := u1 + u2 ∈ (C[0, T])m and

    ∫_0^T e^{(T−τ)A} B u(τ) dτ = ∫_0^T e^{(T−τ)A} B u1(τ) dτ + ∫_0^T e^{(T−τ)A} B u2(τ) dτ = x1 + x2.

Consequently x1 + x2 ∈ RT.

S3 If x ∈ RT, then there exists a u ∈ (C[0, T])m such that

    x = ∫_0^T e^{(T−τ)A} B u(τ) dτ.

If α ∈ R, then α·u ∈ (C[0, T])m and

    ∫_0^T e^{(T−τ)A} B (αu)(τ) dτ = α ∫_0^T e^{(T−τ)A} B u(τ) dτ = αx.

Consequently αx ∈ RT.

Thus RT is a subspace of Rn.

We now prove Theorem 5.3.3, which will yield Corollary 5.3.4 below on the characterisation of
the property of controllability. In order to prove Theorem 5.3.3, we will use the Cayley-Hamilton
theorem, and for the sake of completeness, we have included a sketch of its proof here.

Theorem 5.3.2 (Cayley-Hamilton) If A ∈ Cn×n and p(t) = t^n + c_{n−1} t^{n−1} + · · · + c1 t + c0 is its
characteristic polynomial, then p(A) = A^n + c_{n−1} A^{n−1} + · · · + c1 A + c0 I = 0.

Proof (Sketch) This is easy to see if A is diagonal, since

    p( diag(λ1, . . . , λn) ) = diag( p(λ1), . . . , p(λn) ) = 0.

It is also easy to see if A is diagonalisable, since if A = P D P^{−1}, then

    p(A) = p(P D P^{−1}) = P p(D) P^{−1} = P 0 P^{−1} = 0.

As det : Cn×n → C is a continuous function, it follows that the coefficients of the characteristic
polynomial are continuous functions of the matrix entries. Using the fact that the set of
diagonalisable matrices is dense in Cn×n, we see that the result extends to all complex matrices
by continuity.

Theorem 5.3.3 RT = Rn iff rank [ B  AB  . . .  A^{n−1}B ] = n.

Proof If: If RT ≠ Rn, then there exists an x0 ≠ 0 in Rn such that for all x ∈ RT, x0ᵀ x = 0.
Consequently,

    x0ᵀ ∫_0^T e^{(T−τ)A} B u(τ) dτ = 0 for all u ∈ (C[0, T])m.

In particular, u0 defined by u0(t) = Bᵀ e^{(T−t)Aᵀ} x0, t ∈ [0, T], belongs to (C[0, T])m, and so

    ∫_0^T x0ᵀ e^{(T−τ)A} B Bᵀ e^{(T−τ)Aᵀ} x0 dτ = 0,

and so it can be seen that

    x0ᵀ e^{(T−t)A} B = 0, t ∈ [0, T].                        (5.8)

(Why?) With t = T, we obtain x0ᵀ B = 0. Differentiating (5.8) with respect to t, we obtain
x0ᵀ e^{(T−t)A} A B = 0, t ∈ [0, T], and so with t = T, we have x0ᵀ A B = 0. Proceeding in this
manner (that is, successively differentiating (5.8) and setting t = T), we see that x0ᵀ A^k B = 0
for all k ∈ N, and so in particular,

    x0ᵀ [ B  AB  . . .  A^{n−1}B ] = 0.

As x0 ≠ 0, we obtain rank [ B  AB  . . .  A^{n−1}B ] < n.
Only if: Suppose rank [ B  AB  . . .  A^{n−1}B ] < n. Then there exists a nonzero x0 ∈ Rn
such that

    x0ᵀ [ B  AB  . . .  A^{n−1}B ] = 0.                      (5.9)

By the Cayley-Hamilton theorem (with αj := −cj), it follows that

    x0ᵀ A^n B = x0ᵀ ( α0 I + α1 A + · · · + α_{n−1} A^{n−1} ) B = 0.

By induction,

    x0ᵀ A^k B = 0 for all k ≥ n.                             (5.10)

From (5.9) and (5.10), we obtain x0ᵀ A^k B = 0 for all k ≥ 0, and so x0ᵀ e^{tA} B = 0 for all
t ∈ [0, T]. But this implies that x0 ∉ RT, since otherwise, for some u ∈ (C[0, T])m,

    x0 = ∫_0^T e^{(T−τ)A} B u(τ) dτ,

and then

    x0ᵀ x0 = x0ᵀ ∫_0^T e^{(T−τ)A} B u(τ) dτ = ∫_0^T 0 · u(τ) dτ = 0,

which yields x0 = 0, a contradiction.

The following result gives an important characterisation of controllability.

Corollary 5.3.4 Let T > 0. The system (5.4) is controllable at T iff

    rank [ B  AB  . . .  A^{n−1}B ] = n,

where n denotes the dimension of the state space.

Proof Only if: Let x′(t) = A x(t) + B u(t) be controllable at time T. Then with x0 = 0, all the
states x1 ∈ Rn can be reached at time T. So RT = Rn. Hence by Theorem 5.3.3,
rank [ B  AB  . . .  A^{n−1}B ] = n.

If: Let rank [ B  AB  . . .  A^{n−1}B ] = n. Then by Theorem 5.3.3, RT = Rn. Given x0, x1 ∈ Rn,
we have x1 − e^{TA} x0 ∈ Rn = RT, and so there exists a u ∈ (C[0, T])m such that

    x1 − e^{TA} x0 = ∫_0^T e^{(T−τ)A} B u(τ) dτ, that is, x1 = e^{TA} x0 + ∫_0^T e^{(T−τ)A} B u(τ) dτ.

In other words, x(T) = x1, where x(·) denotes the unique solution to x′(t) = A x(t) + B u(t),
t ∈ [0, T], x(0) = x0.

We remark that the test

    rank [ B  AB  . . .  A^{n−1}B ] = n

is independent of T, and so it follows that if T1, T2 > 0, then the system x′(t) = A x(t) + B u(t) is
controllable at T1 iff it is controllable at T2. So for the system x′(t) = A x(t) + B u(t), we usually
talk about ‘controllability’ instead of ‘controllability at T > 0’.

Examples. Consider the two examples on page 79.

1. (Controllable system) In the first example, n = 1 and

    rank [ B  AB  . . .  A^{n−1}B ] = rank [ B ] = rank [ 1 ] = 1 = n,

the dimension of the state space (R).

2. (Uncontrollable system) In the second example, n = 2 and

    rank [ B  AB  . . .  A^{n−1}B ] = rank [ B  AB ] = rank [ 1  1 ]
                                                            [ 0  0 ] = 1 ≠ 2 = n,

the dimension of the state space (R²). ♦
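The rank test is straightforward to implement. The sketch below is illustrative (a minimal pure-Python implementation, not course code): it builds the controllability matrix [B AB] for the 2 × 2 uncontrollable example and computes ranks by Gaussian elimination.

```python
def rank(M, tol=1e-9):
    # numerical rank via Gaussian elimination with partial pivoting
    M = [row[:] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        pivot = max(range(r, rows), key=lambda i: abs(M[i][c]), default=None)
        if pivot is None or abs(M[pivot][c]) < tol:
            continue                      # no usable pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [M[i][j] - factor * M[r][j] for j in range(cols)]
        r += 1
    return r

def kalman_matrix(A, B):
    # [B  AB] for 2x2 A and 2x1 B (n = 2, so powers of A up to A^{n-1})
    AB = [sum(A[i][k] * B[k][0] for k in range(2)) for i in range(2)]
    return [[B[0][0], AB[0]], [B[1][0], AB[1]]]

# Example 1: x' = u (n = 1), controllability matrix [1], rank 1 = n.
rank1 = rank([[1.0]])
# Example 2: A = I, B = (1, 0)^T gives [B AB] with rows (1 1), (0 0): rank 1 < 2.
rank2 = rank(kalman_matrix([[1.0, 0.0], [0.0, 1.0]], [[1.0], [0.0]]))
```

In floating point a tolerance is needed when deciding whether a pivot is zero, which is why numerical linear algebra libraries compute rank via the SVD rather than raw elimination.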



Exercises.

1. For what values of α is the system (5.4) controllable, if

    A = [ 2  1 ],    B = [ 1 ]
        [ 0  1 ]         [ α ] ?

2. (∗) Let A ∈ Rn×n and B ∈ Rn×1. Prove that if the system (5.4) is controllable, then every
matrix commuting with A is a polynomial in A.

3. Let

    A = [ 1  0 ]  and  B = [ 1 ]
        [ 0  1 ]           [ 0 ].

Show that the reachable subspace RT at time T of x′(t) = A x(t) + B u(t), t ≥ 0, is equal to
span{ (1, 0)ᵀ }, that is, the set { (α, 0)ᵀ : α ∈ R }.

4. A nonzero vector v ∈ R1×n is called a left eigenvector of A ∈ Rn×n if there exists a λ ∈ R
such that vA = λv.
Show that if the system described by x′(t) = A x(t) + B u(t), t ≥ 0, is controllable, then for
every left eigenvector v of A, there holds that vB ≠ 0.
Hint: Observe that if vA = λv, then vA^k = λ^k v for all k ∈ N.
Differential Equations 2020/21
MA 209

Extra Notes 1
Calculating the exponential of a matrix
Eigenvalues and eigenvectors
Classification of 2 × 2 matrices
Decoupled systems
Higher dimensional linear systems
1.1 Calculating the exponential of a matrix


As described in the Lecture Notes, if A is an n × n real matrix, then we define the exponential e^A by

    e^A = I + A + (1/2!) A² + (1/3!) A³ + · · · = ∑_{k=0}^∞ (1/k!) A^k.

But actually calculating this exponential is usually far from trivial. In these extra notes we have
a closer look at one method that might be useful for this.

• Suppose we can write A = P M P^{−1}, for some invertible matrix P and some other matrix M.
Then we have that

    A^k = (P M P^{−1})^k = (P M P^{−1})(P M P^{−1}) · · · (P M P^{−1})
        = P M P^{−1} P M P^{−1} · · · P M P^{−1} = P (M M · · · M) P^{−1} = P M^k P^{−1},

and hence

    e^A = ∑_{k=0}^∞ (1/k!) A^k = ∑_{k=0}^∞ (1/k!) P M^k P^{−1} = P ( ∑_{k=0}^∞ (1/k!) M^k ) P^{−1} = P e^M P^{−1}.

(In fact, we also find that e^{tA} = P e^{tM} P^{−1}; check for yourself.)

So if we can find a matrix M such that A = P M P^{−1} for some invertible matrix P, and e^M is
“easy” to determine, then we can also determine e^A.
• One easy type of matrix is a diagonal matrix D = diag(λ1, λ2, . . . , λn). Then we find for the
powers that D^k = diag(λ1^k, λ2^k, . . . , λn^k), and hence

    e^D = ∑_{k=0}^∞ (1/k!) D^k = diag( ∑_{k=0}^∞ λ1^k/k!, . . . , ∑_{k=0}^∞ λn^k/k! )
        = diag( e^{λ1}, . . . , e^{λn} ).

© London School of Economics, 2019

• In the remainder of these extra notes we will show that every real matrix A has some standard
accompanying real matrix J such that A = PJP^{-1} for some real invertible matrix P and such that e^J
is not too hard to determine. This special matrix is called the Jordan form of A.
We prove in quite some detail that each 2 × 2 matrix has a Jordan form. For larger dimensions
we give only the outcomes and some idea of where they come from.
Before we can start looking at the Jordan form, we take a closer look at eigenvalues and eigenvectors of real matrices.

1.2 Eigenvalues
A real or complex number λ is an eigenvalue of an n × n square matrix A if there exists a non-zero
vector v so that Av = λv.
Since this definition is equivalent to asking for (A − λI)v = 0, which only has a non-zero solution
if A − λI is singular (i.e. has no inverse), we find all eigenvalues by looking for solutions of the
equation det(A − λI) = 0.
• Here is something you should know:
Theorem 1.1
An n × n matrix A has n eigenvalues λ_1, ..., λ_n, which can be complex numbers, and where the
same number can appear more than once.

"Proof" As mentioned above, the eigenvalues of A are just the solutions to det(A − λI) = 0. If
we define the function f(λ) = det(A − λI) for λ ∈ C, then it's straightforward to show that f(λ)
is a polynomial of degree n. So the eigenvalues of A are exactly the roots of this degree-n
polynomial f(λ).
So we need that a polynomial of degree n has n roots. This is fairly hard to prove completely, so
we just rely on the Fundamental Theorem of Algebra (FToA):

* Fundamental Theorem of Algebra
Let p(x) = a_n x^n + a_{n-1} x^{n-1} + ··· + a_1 x + a_0 be a polynomial over the complex numbers of
degree n ≥ 1 (hence a_n ≠ 0). Then p(x) has n roots r_1, ..., r_n in the complex numbers (and
can be written as p(x) = a_n (x − r_1)(x − r_2)···(x − r_n)).

Translating this back to our polynomial f(λ) = det(A − λI), we get that f(λ) has exactly n roots,
hence A has exactly n eigenvalues.
• Note that the eigenvalues can be complex numbers. Also note that the same eigenvalue can appear
more than once in the list λ_1, ..., λ_n. The number of times the same number appears is called its
multiplicity.
We sometimes emphasise this by listing only the different eigenvalues λ_1, ..., λ_k (where we must
have k ≤ n) with their multiplicities m_1, ..., m_k. These multiplicities m_i are positive integers
with m_1 + ··· + m_k = n.
• Theorem 1.1 doesn't say much about what kind of numbers we can expect the eigenvalues to be.
And in general, there isn't much that can be said. But in the case we are interested in for this
course we can do a little more:
Theorem 1.2
Suppose A is an n × n matrix in which all entries are real numbers.
(a) If A has a complex eigenvalue λ = α + βi, with α, β ∈ R, β ≠ 0, then the conjugate of λ,
λ̄ = α − βi, is also an eigenvalue of A.
(b) If n is odd, then at least one of the eigenvalues of A is a real number.

"Proof" We again look at the polynomial f(λ) = det(A − λI) of degree n, realising that the
eigenvalues are exactly the roots of this polynomial. Suppose f(λ) = a_n λ^n + a_{n-1} λ^{n-1} + ··· +
a_1 λ + a_0. If all entries of A are real, then so are those of A − λI, and hence when writing out the
determinant det(A − λI) we also only see real numbers. So we know that all coefficients a_n, ..., a_0
of the polynomial f(λ) are real numbers.
Now it's a matter of writing out in full the expressions for both f(α + βi) and f(α − βi), and in
particular looking at their real and imaginary parts (using that the coefficients a_n, ..., a_0 of the
polynomial f(λ) are real). If you do this correctly, you will find that Re(f(α + βi)) = Re(f(α − βi)),
while Im(f(α + βi)) = −Im(f(α − βi)). Since α + βi is assumed to be an eigenvalue of A, we
have f(α + βi) = 0, hence Re(f(α + βi)) = 0 and Im(f(α + βi)) = 0. This means that also
Re(f(α − βi)) = 0 and Im(f(α − βi)) = 0, and hence f(α − βi) = 0.
It follows that α − βi is also an eigenvalue of A. This proves part (a).

We prove part (b) by induction on the degree n of the polynomial f(λ) = det(A − λI). If n = 1,
then we have f(λ) = a_1 λ + a_0, where a_1, a_0 are real numbers and a_1 ≠ 0. This polynomial has
the obvious real root r = −a_0/a_1.
So now suppose f(λ) has odd degree n ≥ 3. If it has no complex root, then it has n real roots,
and we are done. So suppose f(λ) has a complex root z = α + βi. Above we've seen that then
also the conjugate z̄ = α − βi is a root. That means we can factor out the factors λ − z and λ − z̄
and write f(λ) = (λ − z)(λ − z̄)·q(λ), where q(λ) is a polynomial of degree n − 2. If we multiply
out we get (use that i² = −1)

$$(\lambda - z)(\lambda - \bar{z}) = (\lambda - \alpha - \beta i)(\lambda - \alpha + \beta i) = \lambda^2 - 2\alpha\lambda + \beta i\lambda - \beta i\lambda - \alpha\beta i + \alpha\beta i + \alpha^2 + \beta^2 = \lambda^2 - 2\alpha\lambda + \alpha^2 + \beta^2.$$

So in fact we have f(λ) = (λ² − 2αλ + α² + β²)·q(λ). Since all coefficients of f(λ) and all of
−2α, α², β² are real numbers, all coefficients of q(λ) are also real numbers. And since q(λ) has odd
degree n − 2, by induction we know it has a real root. But any root of q(λ) is also a root of f(λ).
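Theorem 1.2 is easy to illustrate numerically. In the sketch below (the 3 × 3 matrix is my own sample choice, so n is odd), the eigenvalues turn out to be i, −i and 2: the non-real eigenvalues come as a conjugate pair, and at least one eigenvalue is real.

```python
import numpy as np

# A real 3x3 matrix: a 2x2 rotation-like block plus one real direction.
A = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 2.0]])

eigvals = np.linalg.eig(A)[0]

# Part (a): the non-real eigenvalues occur as a conjugate pair.
complex_eigs = [z for z in eigvals if abs(z.imag) > 1e-9]
# Part (b): since n = 3 is odd, at least one eigenvalue is real.
real_eigs = [z for z in eigvals if abs(z.imag) <= 1e-9]
```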

1.3 Eigenvectors
If λ is an eigenvalue of an n × n square matrix A, then an eigenvector is a non-zero vector v so
that Av = λv.
Theorem 1.3
Let λ be an eigenvalue of multiplicity m. Then the number of linearly independent eigenvectors
of λ is between 1 and m.

"Proof" We only show that there is at least one eigenvector of λ. The proof that there can't
be more than the multiplicity would involve things that go beyond what we want to know in this
course. Recall that the fact that λ is an eigenvalue means that det(A − λI) = 0. Let c_1, c_2, ..., c_n be
the columns of the matrix A − λI. The fact that the determinant of A − λI is zero means that
the columns form a dependent set. Hence there exist numbers a_1, ..., a_n, not all equal to 0, so
that a_1 c_1 + a_2 c_2 + ··· + a_n c_n = 0. Now let v be the vector that has the numbers a_1, ..., a_n as
entries: v = (a_1, ..., a_n)^T. Since at least one of a_1, ..., a_n is not 0, we have v ≠ 0. Writing out
the product, we get (A − λI)v = a_1 c_1 + a_2 c_2 + ··· + a_n c_n = 0. This is the same as Av − λv = 0.
Hence Av = λv, and we have found an eigenvector for λ.

Theorem 1.4
Let v_1 be an eigenvector of an eigenvalue λ_1 and v_2 an eigenvector of an eigenvalue λ_2, where
λ_1 ≠ λ_2. Then v_1, v_2 are linearly independent vectors.

Proof Exercise.
• Since we allow eigenvalues to be complex numbers, it is also possible that entries of eigenvectors
are complex numbers. If this is the case, then we often split the entries into their real and
imaginary parts. Hence we write v = v_R + iv_I, where all entries in both v_R and v_I are real
numbers.
As the following result shows, if a real matrix has a complex eigenvalue, then we always have a
complex eigenvector with additional properties.

Theorem 1.5
Let A be an n × n matrix in which all entries are real numbers. Suppose A has a complex eigenvalue
λ = α + βi, with α, β ∈ R, β ≠ 0, with eigenvector v.
(a) The eigenvector v contains complex entries; i.e. if we write v = v_R + iv_I, then v_I ≠ 0.
(b) In fact, if we write v = v_R + iv_I, then the two parts v_R, v_I are two linearly independent
vectors.
(c) The parts v_R, v_I satisfy Av_R = αv_R − βv_I and Av_I = βv_R + αv_I.

Proof For (a), suppose there is an eigenvector v of λ = α + βi in which all entries are real
numbers. Then in the expression Av we have real numbers only. On the other hand, the expression
λv = (α + βi)v has numbers with non-zero imaginary parts as well. But since we must have Av = λv, this
can't be correct. Hence v cannot contain real entries only.

Part (b) is again an exercise.

For part (c), first use the linearity of matrix multiplication to write Av = A(v_R + iv_I) = Av_R + iAv_I.
Here everything in Av_R and Av_I is a real number. Next write λv = (α + βi)(v_R + iv_I) =
(αv_R − βv_I) + i(αv_I + βv_R). Since we have Av = λv and both real and imaginary parts must be
equal, we have Av_R = αv_R − βv_I and Av_I = αv_I + βv_R. So we are done.
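Part (c) of Theorem 1.5 can be checked numerically; in this sketch (the matrix is my own illustrative choice) A has eigenvalues 1 ± 2i:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [-2.0, 1.0]])

# Take a complex eigenvalue lambda = alpha + beta*i and an eigenvector v.
eigvals, eigvecs = np.linalg.eig(A)
lam = eigvals[0]
v = eigvecs[:, 0]
alpha, beta = lam.real, lam.imag
vR, vI = v.real, v.imag

# Theorem 1.5(c): A vR = alpha*vR - beta*vI  and  A vI = beta*vR + alpha*vI.
check_R = np.allclose(A @ vR, alpha * vR - beta * vI)
check_I = np.allclose(A @ vI, beta * vR + alpha * vI)
```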

1.4 Classification of 2-dimensional matrices

With the knowledge from above we are now ready to describe all different possibilities for the
eigenvalues and eigenvectors of a real-valued 2 × 2 matrix A:
(i) A has two different real eigenvalues λ_1, λ_2, λ_1 > λ_2, each with an eigenvector v_1, v_2 (since
the two λs are different, we can always number them so that λ_1 > λ_2);
(ii) A has a double real eigenvalue λ, with two linearly independent eigenvectors v_1, v_2;
(iii) A has a double real eigenvalue λ, with only one eigenvector v;
(iv) A has a complex eigenvalue λ = α + βi, β > 0, with an eigenvector v = v_R + iv_I.
(Note that by Theorem 1.2 we know that if λ = α + βi is a complex eigenvalue, then so
is λ̄ = α − βi. Since one of β, −β is positive, we lose nothing by assuming β > 0.)
We will consider each of these four cases separately below.
• (i) Let P be the 2 × 2 matrix formed by using the eigenvectors v_1, v_2 as columns; so we can write
P = [v_1 | v_2]. This means that AP = A[v_1 | v_2] = [λ_1 v_1 | λ_2 v_2]. This answer can also be written
as $[v_1 \mid v_2]\begin{pmatrix}\lambda_1 & 0\\ 0 & \lambda_2\end{pmatrix}$, which is the same as $P\begin{pmatrix}\lambda_1 & 0\\ 0 & \lambda_2\end{pmatrix}$. So we find that $AP = P\begin{pmatrix}\lambda_1 & 0\\ 0 & \lambda_2\end{pmatrix}$.
Since v_1, v_2 are linearly independent, the matrix P = [v_1 | v_2] is invertible. Multiplying from
the right by P^{-1} we get the following:

* The invertible real matrix P = [v_1 | v_2] has the property that $A = P\begin{pmatrix}\lambda_1 & 0\\ 0 & \lambda_2\end{pmatrix}P^{-1}$.

• (ii) In the case that we have a double eigenvalue λ with two linearly independent eigenvectors
v_1, v_2 we can do exactly as above. The conclusion will be:

* The invertible real matrix P = [v_1 | v_2] has the property that $A = P\begin{pmatrix}\lambda & 0\\ 0 & \lambda\end{pmatrix}P^{-1}$.
• (iii) This case is in a sense the most complicated case, since we only have one vector yet. To get
a second one, we would need to go deeper into the theory of eigenvalues; deeper than we like
to do at the moment. The eigenvector v of A for eigenvalue λ is the only solution for x (up
to scalar multiplication) of the equation (A − λI)x = 0. Since λ is a double eigenvalue, it
can be shown that the equation (A − λI)²x = 0 has two linearly independent solutions. Since
(A − λI)²v = (A − λI)(A − λI)v = (A − λI)0 = 0, we can take v as one of these solutions.
Suppose w* is a second one, where v, w* are linearly independent. Since (A − λI)²w* =
(A − λI)(A − λI)w* = 0, it follows that the vector u = (A − λI)w* has the property that
(A − λI)u = 0. But the only linearly independent vector for which (A − λI)x = 0 was v,
so u must be a multiple of v. Say we have u = kv. Since k ≠ 0 (otherwise u = 0 and
(A − λI)w* = u = 0 would make w* another solution to (A − λI)x = 0), we can write w = (1/k)w*.
So we have that (A − λI)w = (A − λI)(1/k)w* = (1/k)(A − λI)w* = (1/k)u = v. This is the same
as Aw = v + λw. Since v, w* are linearly independent and w = (1/k)w*, also v, w are linearly
independent.
Now let P be the 2 × 2 matrix formed by using the vectors v, w as columns: P = [v | w]. This
means that AP = A[v | w] = [λv | v + λw]. This answer can also be written as $[v \mid w]\begin{pmatrix}\lambda & 1\\ 0 & \lambda\end{pmatrix}$,
which is the same as $P\begin{pmatrix}\lambda & 1\\ 0 & \lambda\end{pmatrix}$. So we find that $AP = P\begin{pmatrix}\lambda & 1\\ 0 & \lambda\end{pmatrix}$. Since v, w are linearly
independent, the matrix P = [v | w] is invertible. Multiplying from the right by P^{-1} we get:

* The invertible real matrix P = [v | w] has the property that $A = P\begin{pmatrix}\lambda & 1\\ 0 & \lambda\end{pmatrix}P^{-1}$.
• (iv) We actually could do the case of two complex eigenvalues (which must be different, because
one of them is the conjugate of the other) just as Case (i). But that would mean that we start
looking at matrices with complex entries. And when taking the exponential it's not clear what
would happen with those complex numbers. Moreover, once we translate this whole business to
solutions of systems of linear equations, we really don't want to end up with complex numbers
in our answers. So we treat this case differently from Case (i).
We use that v_R, v_I are two linearly independent vectors that satisfy Av_R = αv_R − βv_I and
Av_I = βv_R + αv_I.
Let P be the 2 × 2 matrix formed by using the vectors v_R, v_I as columns: P = [v_R | v_I]. Then
we easily find AP = A[v_R | v_I] = [αv_R − βv_I | βv_R + αv_I]. This answer can also be written as
$[v_R \mid v_I]\begin{pmatrix}\alpha & \beta\\ -\beta & \alpha\end{pmatrix}$, which is the same as $P\begin{pmatrix}\alpha & \beta\\ -\beta & \alpha\end{pmatrix}$. Since v_R, v_I are two linearly independent
vectors, the matrix P = [v_R | v_I] is invertible. Multiplying from the right by P^{-1} we get the
following:

* The invertible real matrix P = [v_R | v_I] has the property that $A = P\begin{pmatrix}\alpha & \beta\\ -\beta & \alpha\end{pmatrix}P^{-1}$.

   
• The special forms $\begin{pmatrix}\lambda_1 & 0\\ 0 & \lambda_2\end{pmatrix}$ (with λ_1 > λ_2), $\begin{pmatrix}\lambda & 0\\ 0 & \lambda\end{pmatrix}$, $\begin{pmatrix}\lambda & 1\\ 0 & \lambda\end{pmatrix}$, and $\begin{pmatrix}\alpha & \beta\\ -\beta & \alpha\end{pmatrix}$ (with β > 0) from
above are called the Jordan form of A.
• Two matrices A, B are called similar if there is an invertible matrix Q so that A = Q^{-1}BQ. It is
easy to show (exercise) that if two matrices are similar, then they have the same
Jordan form.
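The case (iii) construction can be followed step by step on a concrete matrix. In this sketch (the matrix is my own illustrative choice) A has the double eigenvalue λ = 2 with only one eigenvector; we solve (A − λI)w = v for a second column and rebuild A = PJP^{-1}:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [-1.0, 1.0]])
lam = 2.0  # double eigenvalue of A

# An eigenvector: (A - lam*I) v = 0.
v = np.array([1.0, -1.0])

# A generalised second vector w with (A - lam*I) w = v.  The matrix A - lam*I
# is singular, so we use a least-squares solve (the system is consistent).
w = np.linalg.lstsq(A - lam * np.eye(2), v, rcond=None)[0]

# P = [v | w] and the Jordan form J; then A = P J P^{-1}.
P = np.column_stack([v, w])
J = np.array([[lam, 1.0],
              [0.0, lam]])
A_rebuilt = P @ J @ np.linalg.inv(P)
```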

1.5 Exponentials of Jordan forms

For each of the Jordan forms J in the previous part, it is not so hard to find the exponentials e^J
and e^{tJ}.

• (i) If $J = \begin{pmatrix}\lambda_1 & 0\\ 0 & \lambda_2\end{pmatrix}$, then for k = 0, 1, 2, ... we have $J^k = \begin{pmatrix}\lambda_1^k & 0\\ 0 & \lambda_2^k\end{pmatrix}$, and hence

$$e^J = \sum_{k=0}^{\infty}\frac{1}{k!}J^k = \begin{pmatrix}\sum_{k=0}^{\infty}\frac{1}{k!}\lambda_1^k & 0\\ 0 & \sum_{k=0}^{\infty}\frac{1}{k!}\lambda_2^k\end{pmatrix} = \begin{pmatrix}e^{\lambda_1} & 0\\ 0 & e^{\lambda_2}\end{pmatrix}.$$

A similar argument gives that $e^{tJ} = \begin{pmatrix}e^{t\lambda_1} & 0\\ 0 & e^{t\lambda_2}\end{pmatrix}$.

• (ii) If $J = \begin{pmatrix}\lambda & 0\\ 0 & \lambda\end{pmatrix}$, then we get from the previous case that $e^J = \begin{pmatrix}e^{\lambda} & 0\\ 0 & e^{\lambda}\end{pmatrix}$ and $e^{tJ} = \begin{pmatrix}e^{t\lambda} & 0\\ 0 & e^{t\lambda}\end{pmatrix}$.
• (iii) For the Jordan form $J = \begin{pmatrix}\lambda & 1\\ 0 & \lambda\end{pmatrix}$, we immediately calculate e^{tJ}. We have that $tJ = \begin{pmatrix}t\lambda & t\\ 0 & t\lambda\end{pmatrix}$.
So we obtain $(tJ)^0 = \begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix}$; while for the powers k ≥ 1 we get $(tJ)^k = \begin{pmatrix}(t\lambda)^k & kt(t\lambda)^{k-1}\\ 0 & (t\lambda)^k\end{pmatrix}$
(this can be easily proved using induction on k).

This gives

$$e^{tJ} = \sum_{k=0}^{\infty}\frac{1}{k!}(tJ)^k = \begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix} + \sum_{k=1}^{\infty}\frac{1}{k!}\begin{pmatrix}(t\lambda)^k & kt(t\lambda)^{k-1}\\ 0 & (t\lambda)^k\end{pmatrix} = \begin{pmatrix}\sum_{k=0}^{\infty}\frac{1}{k!}(t\lambda)^k & t\sum_{k=1}^{\infty}\frac{1}{(k-1)!}(t\lambda)^{k-1}\\ 0 & \sum_{k=0}^{\infty}\frac{1}{k!}(t\lambda)^k\end{pmatrix} = \begin{pmatrix}\sum_{k=0}^{\infty}\frac{1}{k!}(t\lambda)^k & t\sum_{\ell=0}^{\infty}\frac{1}{\ell!}(t\lambda)^{\ell}\\ 0 & \sum_{k=0}^{\infty}\frac{1}{k!}(t\lambda)^k\end{pmatrix} = \begin{pmatrix}e^{t\lambda} & te^{t\lambda}\\ 0 & e^{t\lambda}\end{pmatrix}.$$

By taking t = 1, this means $e^J = \begin{pmatrix}e^{\lambda} & e^{\lambda}\\ 0 & e^{\lambda}\end{pmatrix}$.
• (iv) Finally, the Jordan form $J = \begin{pmatrix}\alpha & \beta\\ -\beta & \alpha\end{pmatrix}$. Writing $J_\alpha = \begin{pmatrix}\alpha & 0\\ 0 & \alpha\end{pmatrix}$ and $J_\beta = \begin{pmatrix}0 & \beta\\ -\beta & 0\end{pmatrix}$ we have
that J = J_α + J_β. It is easy to check that J_α and J_β commute (J_α J_β = J_β J_α), and so by
Theorem 1.5.2 from the Lecture Notes we have that $e^J = e^{J_\alpha}e^{J_\beta}$.
For $J_\alpha = \begin{pmatrix}\alpha & 0\\ 0 & \alpha\end{pmatrix}$ we can argue as in Case (i) or (ii) to get $e^{J_\alpha} = \begin{pmatrix}e^{\alpha} & 0\\ 0 & e^{\alpha}\end{pmatrix}$.

For $J_\beta = \begin{pmatrix}0 & \beta\\ -\beta & 0\end{pmatrix}$, life is slightly more involved. Using induction on k it is not so hard to
find the following expressions for the powers of J_β, for k = 0, 1, 2, ....
If k = 2m is even, then $J_\beta^k = J_\beta^{2m} = \begin{pmatrix}(-1)^m\beta^k & 0\\ 0 & (-1)^m\beta^k\end{pmatrix} = \begin{pmatrix}(-1)^{k/2}\beta^k & 0\\ 0 & (-1)^{k/2}\beta^k\end{pmatrix}$;
if k = 2m + 1 is odd, then
$J_\beta^k = J_\beta^{2m+1} = \begin{pmatrix}0 & (-1)^m\beta^k\\ -(-1)^m\beta^k & 0\end{pmatrix} = \begin{pmatrix}0 & (-1)^{(k-1)/2}\beta^k\\ -(-1)^{(k-1)/2}\beta^k & 0\end{pmatrix}$.

So we find that $e^{J_\beta} = \begin{pmatrix}p(\beta) & q(\beta)\\ -q(\beta) & p(\beta)\end{pmatrix}$, where

$$p(\beta) = \sum_{k=0,\ k\ \mathrm{even}}^{\infty}\frac{1}{k!}(-1)^{k/2}\beta^k = 1 - \frac{1}{2!}\beta^2 + \frac{1}{4!}\beta^4 - \frac{1}{6!}\beta^6 + \cdots = \cos(\beta),$$
$$q(\beta) = \sum_{k=0,\ k\ \mathrm{odd}}^{\infty}\frac{1}{k!}(-1)^{(k-1)/2}\beta^k = \beta - \frac{1}{3!}\beta^3 + \frac{1}{5!}\beta^5 - \frac{1}{7!}\beta^7 + \cdots = \sin(\beta).$$

This leads to

$$e^J = e^{J_\alpha}\cdot e^{J_\beta} = \begin{pmatrix}e^{\alpha} & 0\\ 0 & e^{\alpha}\end{pmatrix}\cdot\begin{pmatrix}\cos(\beta) & \sin(\beta)\\ -\sin(\beta) & \cos(\beta)\end{pmatrix} = \begin{pmatrix}e^{\alpha}\cos(\beta) & e^{\alpha}\sin(\beta)\\ -e^{\alpha}\sin(\beta) & e^{\alpha}\cos(\beta)\end{pmatrix}.$$

In a similar way we can find that $e^{tJ} = \begin{pmatrix}e^{t\alpha}\cos(t\beta) & e^{t\alpha}\sin(t\beta)\\ -e^{t\alpha}\sin(t\beta) & e^{t\alpha}\cos(t\beta)\end{pmatrix}$.
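These closed forms can be compared against the defining power series directly. The sketch below checks the case (iii) and case (iv) formulas for e^{tJ}; the sample values of λ, α, β and t are my own illustrative choices:

```python
import numpy as np

def expm_series(M, terms=40):
    """Matrix exponential via the truncated power series sum_k M^k / k!."""
    result = np.zeros_like(M)
    term = np.eye(M.shape[0])
    for k in range(terms):
        result = result + term   # add M^k / k!
        term = term @ M / (k + 1)
    return result

t = 1.3

# Case (iii): J = [[lam, 1], [0, lam]] gives e^{tJ} = e^{t*lam} [[1, t], [0, 1]].
lam = -0.7
J3 = np.array([[lam, 1.0], [0.0, lam]])
closed3 = np.exp(t * lam) * np.array([[1.0, t], [0.0, 1.0]])

# Case (iv): J = [[a, b], [-b, a]] gives
# e^{tJ} = e^{t*a} [[cos(t*b), sin(t*b)], [-sin(t*b), cos(t*b)]].
a, b = 0.4, 2.0
J4 = np.array([[a, b], [-b, a]])
closed4 = np.exp(t * a) * np.array([[np.cos(t * b), np.sin(t * b)],
                                    [-np.sin(t * b), np.cos(t * b)]])

ok3 = np.allclose(expm_series(t * J3), closed3)
ok4 = np.allclose(expm_series(t * J4), closed4)
```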
• So how do we use all of this knowledge? Well, let's give one example. Suppose we are given the
2-dimensional system

x_1′ = −2x_1,
x_2′ = x_1 − 2x_2,

with initial values x_1(0) = 1, x_2(0) = −3. This is a linear system x′ = Ax with $A = \begin{pmatrix}-2 & 0\\ 1 & -2\end{pmatrix}$
and initial vector $x_0 = \begin{pmatrix}1\\ -3\end{pmatrix}$.

First we want to find the eigenvalues of A. We have $\det(A - \lambda I) = \det\begin{pmatrix}-2-\lambda & 0\\ 1 & -2-\lambda\end{pmatrix} = (-2-\lambda)(-2-\lambda) - 1\cdot 0 = (\lambda+2)^2$, so A has a double eigenvalue λ = −2. So we are in case (ii)
or (iii) of Section 1.4 of these notes. To find out in which of the two cases we are, we need to find
out how many linearly independent eigenvectors there are. Eigenvectors are found by looking for
vectors v such that Av = λv. Taking $v = \begin{pmatrix}v_1\\ v_2\end{pmatrix}$, we get the two equations

−2v_1 = −2v_1,
v_1 − 2v_2 = −2v_2.

This system gives as only information v_1 = 0, while v_2 can be anything. So all eigenvectors have
the form $v = \begin{pmatrix}0\\ v_2\end{pmatrix}$, and hence there is only one linearly independent eigenvector. We conclude
that we are in case (iii).

According to Case (iii), this means that A has the Jordan form $J = \begin{pmatrix}-2 & 1\\ 0 & -2\end{pmatrix}$ and that there
is an invertible real matrix P so that A = PJP^{-1}. To find P we can follow the recipe using
eigenvectors from Section 1.4 above. But we can also use the knowledge that P exists; i.e. let's
just try to find a P so that A = PJP^{-1}. This equation can be rewritten as AP = PJ. Filling in
the entries for A and J, and setting $P = \begin{pmatrix}a & b\\ c & d\end{pmatrix}$, the equation AP = PJ corresponds to the matrix
equation

$$\begin{pmatrix}-2a & -2b\\ a-2c & b-2d\end{pmatrix} = \begin{pmatrix}-2a & a-2b\\ -2c & c-2d\end{pmatrix}.$$

This matrix equation gives us four equations for the four unknowns a, b, c, d:

−2a = −2a,  −2b = a − 2b,  a − 2c = −2c,  b − 2d = c − 2d.

But there is in fact very little information we really get from these equations: a = 0 and c = b;
for all others we have a free choice. (This is in general the case, because the matrix P is never
unique.) To keep life simple (remember, we also need to determine P^{-1}), we take a = 0, b = 1,
c = 1, and d = 0, so $P = \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}$. Then we have $P^{-1} = \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}$.
Now we have all the knowledge we need to write out the solution of the system of ODEs with
initial values.
Since $J = \begin{pmatrix}-2 & 1\\ 0 & -2\end{pmatrix}$, from Case (iii) in this section we learn that $e^{tJ} = \begin{pmatrix}e^{-2t} & te^{-2t}\\ 0 & e^{-2t}\end{pmatrix}$. And then
we can use the observations from Section 1.1 of these extra notes to deduce

$$e^{tA} = Pe^{tJ}P^{-1} = \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}\begin{pmatrix}e^{-2t} & te^{-2t}\\ 0 & e^{-2t}\end{pmatrix}\begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix} = \begin{pmatrix}e^{-2t} & 0\\ te^{-2t} & e^{-2t}\end{pmatrix}.$$

Finally we use Theorem 1.5.5 from the Lecture Notes to conclude that the 2-dimensional system
x′ = Ax with initial value x(0) = x_0 has the unique solution

$$x(t) = e^{tA}x_0 = \begin{pmatrix}e^{-2t} & 0\\ te^{-2t} & e^{-2t}\end{pmatrix}\begin{pmatrix}1\\ -3\end{pmatrix} = \begin{pmatrix}e^{-2t}\\ -3e^{-2t}+te^{-2t}\end{pmatrix}.$$

In other words, the solution is x_1(t) = e^{-2t}, x_2(t) = −3e^{-2t} + te^{-2t}.
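As a sanity check on this example, we can plug the claimed solution back into the system numerically, using a central finite-difference approximation of the derivative:

```python
import numpy as np

A = np.array([[-2.0, 0.0],
              [1.0, -2.0]])

def x_claimed(t):
    # The solution found above: x1 = e^{-2t}, x2 = -3 e^{-2t} + t e^{-2t}.
    return np.array([np.exp(-2 * t), -3 * np.exp(-2 * t) + t * np.exp(-2 * t)])

# Check x'(t) = A x(t) at a few sample times via central differences.
h = 1e-6
ode_ok = all(
    np.allclose((x_claimed(t + h) - x_claimed(t - h)) / (2 * h),
                A @ x_claimed(t), atol=1e-4)
    for t in [0.0, 0.5, 1.7]
)
initial_ok = np.allclose(x_claimed(0.0), [1.0, -3.0])
```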

1.6 Decoupled systems

An n-dimensional system of differential equations x′ = f(t, x) is called decoupled if we can partition
the coordinates {1, ..., n} into two parts J_1 and J_2 (so J_1 ∪ J_2 = {1, ..., n} and J_1 ∩ J_2 = ∅), so
that if i ∈ J_1 then the derivative x_i′ depends on t and on x_j with j ∈ J_1 only, while if i ∈ J_2, then
the derivative x_i′ depends on t and those x_j with j ∈ J_2 only.

In other words, in a decoupled system we can divide the system of differential equations into two
smaller systems that are completely independent from one another. In particular, the solutions and
qualitative behaviour of the big system are completely determined by the solutions and qualitative
behaviour of the two parts, which don't interact with one another.

• An example of a decoupled system is the system x′ = Ax with $A = \begin{pmatrix}\lambda_1 & 0\\ 0 & \lambda_2\end{pmatrix}$, which can be
decoupled into the two equations x_1′ = λ_1 x_1 and x_2′ = λ_2 x_2.
• More explicitly, suppose we are given an n-dimensional system x′ = f(t, x), with x = (x_1, ..., x_n),
which can be decoupled as

x_1′ = f_1(t, x_1, ..., x_m),
x_2′ = f_2(t, x_1, ..., x_m),
  ⋮
x_m′ = f_m(t, x_1, ..., x_m),
x_{m+1}′ = f_{m+1}(t, x_{m+1}, ..., x_n),
x_{m+2}′ = f_{m+2}(t, x_{m+1}, ..., x_n),
  ⋮
x_n′ = f_n(t, x_{m+1}, ..., x_n).

Then we can consider the smaller systems

x_1′ = f_1(t, x_1, ..., x_m), ..., x_m′ = f_m(t, x_1, ..., x_m)

and

x_{m+1}′ = f_{m+1}(t, x_{m+1}, ..., x_n), ..., x_n′ = f_n(t, x_{m+1}, ..., x_n)

separately. From the first system we obtain the solutions for x_1(t), ..., x_m(t), and from the second
system the solutions for x_{m+1}(t), ..., x_n(t). Then the full list x(t) = (x_1(t), ..., x_m(t), x_{m+1}(t), ..., x_n(t)) is
a solution for the original large system.
• The partition into two parts is not always unique. For instance, the system

x_1′ = x_1,  x_2′ = 2x_2,  x_3′ = x_3

can be decoupled into {x_1′ = x_1, x_2′ = 2x_2} and {x_3′ = x_3}; but also into {x_1′ = x_1} and
{x_2′ = 2x_2, x_3′ = x_3}.

1.7 Higher dimensional linear systems

The analysis of linear systems of the form x′ = Ax of dimension n > 2 can be continued and works
very similarly to the 2-dimensional case. We won't do much about proving this, but just give the
results. In order to do so, it makes sense to start with the 1-dimensional case.
• Property 1.6
The only form of a 1-dimensional linear differential equation is x′ = Ax, where A = (λ) for some
λ ∈ R.
The solution for this system is x(t) = e^{λt}x_0 for a constant x_0 ∈ R.

• We now formulate the results obtained in Section 1.4 of these extra notes slightly differently.
Property 1.7
For every 2-dimensional linear system of differential equations x′ = Ax there exists an invertible
real matrix P such that A = PJP^{-1} where J has one of the following forms:
(a) $J = \begin{pmatrix}B & 0'\\ 0'' & C\end{pmatrix}$, where B, C are real matrices of dimension smaller than 2 and 0′, 0″ are blocks
of zeros of appropriate size;
(b) $J = \begin{pmatrix}\lambda & 1\\ 0 & \lambda\end{pmatrix}$, where λ ∈ R;
(c) $J = \begin{pmatrix}\alpha & \beta\\ -\beta & \alpha\end{pmatrix}$, where α, β ∈ R, β > 0.
The formulation with the sub-matrices in case (a) is a little overkill here, since B, C, 0′, 0″ are
just 1-dimensional matrices, hence just real numbers. But it prepares for the higher dimensional
results below. And it also clearly indicates that we can consider case (a) as a decoupled system
consisting of two smaller systems of lower dimension.
• For 3-dimensional systems we get the following:
Property 1.8
For every 3-dimensional linear system of differential equations x′ = Ax there exists an invertible
real matrix P such that A = PJP^{-1} where J has one of the following forms:
(a) $J = \begin{pmatrix}B & 0'\\ 0'' & C\end{pmatrix}$, where B, C are real matrices of dimension smaller than 3 and 0′, 0″ are blocks
of zeros of appropriate size;
(b) $J = \begin{pmatrix}\lambda & 1 & 0\\ 0 & \lambda & 1\\ 0 & 0 & \lambda\end{pmatrix}$, where λ ∈ R.

If you followed what the relation was between eigenvalues and the Jordan forms of 2-dimensional
matrices, then you should have some idea when the Jordan form in (b) appears: that is, if A has
one triple eigenvalue λ with only one corresponding eigenvector.
In case (a) above, we can assume that C is 1-dimensional, hence just a real number, and B is one
of the 2-dimensional cases from the previous property. In particular, in (a) we have a decoupled
system, whereas the system in (b) cannot be decoupled. More explicitly, if we fill in the different
possibilities for B and C in case (a) we get the following options:
(i) $J = \begin{pmatrix}\lambda_1 & 0 & 0\\ 0 & \lambda_2 & 0\\ 0 & 0 & \lambda_3\end{pmatrix}$, where λ_1, λ_2, λ_3 ∈ R;
(ii) $J = \begin{pmatrix}\lambda_1 & 1 & 0\\ 0 & \lambda_1 & 0\\ 0 & 0 & \lambda_2\end{pmatrix}$, where λ_1, λ_2 ∈ R;
(iii) $J = \begin{pmatrix}\alpha & \beta & 0\\ -\beta & \alpha & 0\\ 0 & 0 & \lambda\end{pmatrix}$, where α, β, λ ∈ R, β > 0.
Together with the special form in (b) this means we have four different Jordan forms for a real
3 × 3 matrix.
• For 4-dimensional systems, a new type appears, as formulated in the following result.
Property 1.9
For every 4-dimensional linear system of differential equations x′ = Ax there exists an invertible
real matrix P such that A = PJP^{-1} where J has one of the following forms:
(a) $J = \begin{pmatrix}B & 0'\\ 0'' & C\end{pmatrix}$, where B, C are real matrices of dimension smaller than 4 and 0′, 0″ are blocks
of zeros of appropriate size;
(b) $J = \begin{pmatrix}\lambda & 1 & 0 & 0\\ 0 & \lambda & 1 & 0\\ 0 & 0 & \lambda & 1\\ 0 & 0 & 0 & \lambda\end{pmatrix}$, where λ ∈ R;
(c) $J = \begin{pmatrix}\alpha & \beta & 1 & 0\\ -\beta & \alpha & 0 & 1\\ 0 & 0 & \alpha & \beta\\ 0 & 0 & -\beta & \alpha\end{pmatrix}$, where α, β ∈ R, β > 0.

Again, the special Jordan form in (b) appears if A has a 4-fold eigenvalue λ with only one corresponding eigenvector.
The form in (c) is new. That form corresponds to the following special matrices A: A has a double
complex eigenvalue α + βi with only one corresponding eigenvector. Then also the conjugate
complex number α − βi is a double eigenvalue with only one corresponding eigenvector. Because
of the discrepancy between the algebraic multiplicity two of the eigenvalues and the fact that they
have only one eigenvector, we get that extra block with 0s and 1s in the top right corner of the
Jordan form.
The smaller matrices B, C in part (a) can either both be 2-dimensional, each then of one of the forms
from the property for 2-dimensional systems, or C is just 1-dimensional (just a real number) and B is
3-dimensional and one of the special types for that dimension.
• When we go to even higher dimensions, no new forms appear.
Property 1.10
For every n-dimensional linear system of differential equations x′ = Ax there exists an invertible
real matrix P such that A = PJP^{-1} where J has one of the following forms:
(a) $J = \begin{pmatrix}B & 0'\\ 0'' & C\end{pmatrix}$, where B, C are real matrices of dimension smaller than n and 0′, 0″ are blocks
of zeros of appropriate size;
(b) $J = \begin{pmatrix}\lambda & 1 & 0 & \cdots & \cdots & 0\\ 0 & \lambda & 1 & 0 & \cdots & 0\\ \vdots & \ddots & \ddots & \ddots & \ddots & \vdots\\ \vdots & & \ddots & \ddots & 1 & 0\\ 0 & \cdots & \cdots & 0 & \lambda & 1\\ 0 & \cdots & \cdots & \cdots & 0 & \lambda\end{pmatrix}$, where λ ∈ R, and we must have dimension n ≥ 2;
(c) $J = \begin{pmatrix}C_{\alpha\beta} & I_2 & 0 & \cdots & 0\\ 0 & C_{\alpha\beta} & I_2 & \ddots & \vdots\\ \vdots & \ddots & \ddots & \ddots & 0\\ \vdots & & \ddots & C_{\alpha\beta} & I_2\\ 0 & \cdots & \cdots & 0 & C_{\alpha\beta}\end{pmatrix}$, written in 2 × 2 blocks, where $C_{\alpha\beta} = \begin{pmatrix}\alpha & \beta\\ -\beta & \alpha\end{pmatrix}$, $I_2 = \begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix}$,
α, β ∈ R, β > 0, and we must have that the dimension n is even.

It is possible to give explicit solutions for (b) and (c), and for (a) a solution is obtained by combining
the solutions for the two decoupled parts. But since we're not really that much interested in the
solutions themselves, but more in the qualitative behaviour, we won't discuss these.

Exercises

1 For each matrix A given below, find its Jordan form J and find the matrix P so that A = PJP^{-1}.
(a) $A = \begin{pmatrix}0 & 1\\ -2 & 3\end{pmatrix}$;  (b) $A = \begin{pmatrix}41 & -29\\ 58 & -41\end{pmatrix}$;  (c) $A = \begin{pmatrix}9 & 4\\ -9 & -3\end{pmatrix}$.

2 For each matrix A in question 1 above, calculate e^{tA}.

3 Show that the two statements in Theorem 1.2 of these notes are not true in general if
we allow the matrix A to have complex entries.

4 Prove Theorem 1.4 of these notes; i.e. show that the eigenvectors of different eigenvalues
are linearly independent.

5 Prove part (b) of Theorem 1.5.

6 Prove that if A is an n × n matrix, and P is an n × n invertible matrix, then A and P^{-1}AP have
the same eigenvalues.

7 Prove the final statement of Section 1.4 of these notes: if 2 × 2 matrices A, B are
similar, then there exist invertible matrices P_A, P_B so that the Jordan forms J_A = P_A^{-1}AP_A and
J_B = P_B^{-1}BP_B are the same.
Differential Equations 2020/21
MA 209

Extra Notes 2
Continuity of Functions
Existence and Uniqueness of Solutions of ODEs
The Lipschitz Conditions
These extra notes provide some extra information related to Chapter 4 of the Lecture Notes.

2.1 Functions and their limits

Some important general definitions for functions are:
* If f : S → T, where S and T are non-empty sets, then S is called the domain of f and T
is the range of f.
* For R ⊆ S, the set of attainable values of f on R, or the image of R under f, is

f(R) = { y ∈ T | f(x) = y for some x ∈ R } = { f(x) | x ∈ R }.

• Most of you will have a notion of what it means for a function from R to R to have a
limit. Definitions for these concepts usually include notions such as "x approaches a" or "x
approaches a from above". But if we are dealing with functions f : R^n → R^m, we must be
careful what we mean if we say "x approaches a", since x and a are points in some higher
dimensional space. We will use the following definition:
Definition
Given a function f : S → R^m where S ⊆ R^n. Then we say that f(x) → y as x → a,
where y ∈ R^m and a ∈ R^n, if for every ε > 0 there is a δ > 0 such that for all x ∈ S with
0 < ‖x − a‖ < δ we have ‖f(x) − y‖ < ε.

Here ‖·‖ is the usual norm in R^n: if x = (x_1, ..., x_n)^T, then ‖x‖ = √(x_1^2 + ··· + x_n^2).
• For one-dimensional functions we have an additional definition:
Definition
Given a function f : R → R^m. Then we say that f(t) → y as t → ∞, where y ∈ R^m, if
for all ε > 0 there exists an M ∈ R such that ‖f(t) − y‖ < ε for all t > M.

• For any x ∈ R^n and real number r > 0, the open ball B(x, r) with centre x and radius r is
the set B(x, r) = { y ∈ R^n | ‖x − y‖ < r }.
If D ⊆ R^n, then a point x ∈ D is an interior point of D if there is an r > 0 so that B(x, r) ⊆ D.

2.2 Continuous functions

We use the following definitions for a function to be continuous:
* A function f : S → R^m, where S ⊆ R^n, is said to be continuous at x_0 ∈ S if f(x) →
f(x_0) as x → x_0.
* And f is continuous on S if f is continuous at every point in S.
Using the definition from the previous subsection, a more extended definition would be:
* A function f : S → R^m, where S ⊆ R^n, is said to be continuous at x_0 ∈ S if for all ε > 0
there is a δ > 0 such that for all x ∈ S with 0 < ‖x − x_0‖ < δ we have ‖f(x) − f(x_0)‖ < ε.

• If f : S → R^m, then we can think of f as being defined by m functions, f(x) = (f_1(x), ..., f_m(x))^T,
one for each coordinate. It can be shown that this means that:
* f is continuous at x_0 ∈ S if and only if each f_i is continuous at x_0 ∈ S.
2.3 Existence of Solutions

For the remainder of these notes we suppose that we are given a function f : D → R^n
for some D ⊆ R^{n+1} and a point (t_0, x_0) ∈ D. And we are looking for solutions x(t) to the
following initial-value problem

(1) x′ = f(t, x) with x(t_0) = x_0.

(The reason to allow f to be defined on a subset D of R^{n+1} is so that we can also consider
equations like x′ = x/t for t > 0.)
• Following the definitions and observations from the previous subsection, we can write

f(t, x) = (f_1(t, x_1, x_2, ..., x_n), f_2(t, x_1, x_2, ..., x_n), ..., f_n(t, x_1, x_2, ..., x_n))^T,

and f is continuous on D if and only if each f_i(t, x_1, ..., x_n) is continuous on D.

• The following theorem guarantees that most initial-value problems have a solution.

Theorem 2.1
If f (t, x) is continuous on D and (t0 , x0 ) is an interior point of D, then there exist ts , te with
ts < t0 < te and a function x : (ts , te ) ! Rn that solves the initial-value problem (1) for all
t 2 (ts , te ).

The above is equivalent to Theorem 4.5.1 in the Lecture Notes (although formulated a little
di↵erent). It is a lot more general than Theorem 4.1.1 in the Lecture Notes.
The proof of Theorem 2.1 is fairly tricky. So we only give a sketch of a possible way to prove
the theorem. This in fact describes a way to find an approximate solution. The construction
is known as the Cauchy-Euler construction.
Sketch of proof We will only show that there is a solution on the interval [t0 , te ) for some
te > t0 . The same ideas can be used to show that there is a solution for t below t0 as well.
We will assume that all points considered below are in D. This can be achieved by choosing
the value of ↵ below appropriately.
Fix a positive number ↵ and set t↵ = t0 + ↵. We will construct an approximate solution on
[t0 , t↵ ]. Next, for a positive integer N , divide the interval between t0 and t↵ into N equal
parts. So write t = ↵/N and define N + 1 time points tr = t0 + r t for r = 0, 1, . . . , N . We
now form the corresponding sequence of points xr , r = 0, 1, . . . , N , defined by

xr = xr 1 + (tr tr 1 ) f (tr 1 , xr 1 ) for r = 1, . . . , N .

Finally, for a time t 2 [t0 , t↵ ) we know that t 2 [tr 1 , tr ), for exactly one r 2 {1, 2, . . . , N }.
So for all t 2 [t0 , t↵ ) we can define the approximate solution xN (t) as follows:

xN (t) = xr 1 + (t tr 1 ) f (tr 1 , xr 1 ) where r is the integer so that t 2 [tr 1 , tr ).

In order to understand what is happening, make the following observations about xN :


– For t ∈ [t0, t1), x_N(t) is nothing more than the line starting in x0 going in the direction f(t0, x0). Note that the solution of the ODE is a function x(t) with x(t0) = x0 and x′(t0) = f(t0, x0). So for t ∈ [t0, t1), x_N(t) is the linear approximation of the solution x we are looking for.
– The linear approximation from the first step goes from t = t0 until t = t1. Then we are at the point x1 and we start using a new direction f(t1, x1). So for t ∈ [t1, t2), x_N(t) is the linear approximation of a solution x that would start with x(t1) = x1.
– The process continues; between times t_{r-1} and t_r we follow a straight line starting at x_{r-1} and with direction f(t_{r-1}, x_{r-1}).
So the function x_N(t) consists of a sequence of linear pieces, each piece chosen to give a reasonable approximation of a possible solution of the ODE at the starting point of the piece. This approach is often used in computer software to find approximations of solutions, or for instance to draw the graphs of solutions when the solution itself is not explicitly known.
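The construction above is exactly the forward Euler scheme used in such software. A minimal sketch in Python (the right-hand side, interval length and step count below are illustrative choices, not anything fixed by the notes):

```python
# Sketch of the Cauchy-Euler construction for x' = f(t, x), x(t0) = x0.
import math

def euler_approx(f, t0, x0, alpha, N):
    """Return time points t_r and values x_r of the piecewise-linear
    approximation x_N on [t0, t0 + alpha]."""
    dt = alpha / N                   # the step size Δt = α/N
    ts, xs = [t0], [x0]
    for r in range(1, N + 1):
        # x_r = x_{r-1} + Δt * f(t_{r-1}, x_{r-1})
        xs.append(xs[-1] + dt * f(ts[-1], xs[-1]))
        ts.append(t0 + r * dt)
    return ts, xs

# Example: x' = x, x(0) = 1, whose exact solution is e^t.
ts, xs = euler_approx(lambda t, x: x, 0.0, 1.0, 1.0, 1000)
err = abs(xs[-1] - math.e)           # small for large N
```

Letting N grow (so Δt ↓ 0) shrinks the error, which is the convergence step the proof sketch appeals to.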
From this point, we should continue the proof by showing that for α small enough, if we let N → ∞ (which is the same as Δt ↓ 0), then x_N converges to some differentiable function x
MA 209 Differential Equations Extra Notes 2 — Page 4

which is the solution of (1). This part of the proof involves analysis of uniform convergence of x_N(t), etc. We will skip that and just accept that by doing the construction above using smaller and smaller steps Δt we eventually get a solution.
• Note that Theorem 2.1 only guarantees a solution over a certain interval I = (t_s, t_e) with t0 ∈ I. This interval very much depends on the exact form of the ODE. For instance the one-dimensional ODE

x′ = x² − 1,   x(0) = 0,

has the solution x(t) = (1 − e^{2t})/(1 + e^{2t}) for all t ∈ ℝ (so we can take I = ℝ). But the very similar looking ODE

y′ = y² + 1,   y(0) = 0,

has the solution y(t) = tan(t), for −½π < t < ½π. So here the solution is only valid on the interval I = (−½π, ½π).
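The first claimed solution can at least be sanity-checked numerically (a spot check, not a proof): compare a central difference quotient of x(t) = (1 − e^{2t})/(1 + e^{2t}) with x² − 1 at a few sample times.

```python
# Numerical check that x(t) = (1 - e^{2t})/(1 + e^{2t}) solves x' = x^2 - 1.
import math

def x(t):
    return (1 - math.exp(2 * t)) / (1 + math.exp(2 * t))

h = 1e-6
max_gap = 0.0
for t in (-2.0, 0.0, 1.5):
    lhs = (x(t + h) - x(t - h)) / (2 * h)   # central difference for x'(t)
    rhs = x(t) ** 2 - 1
    max_gap = max(max_gap, abs(lhs - rhs))
```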
• Theorem 2.1 guarantees that most ODEs have a solution. But that doesn't mean the solution has to be unique. For instance, the ODE

x′ = (3/2) x^{1/3},   x(0) = 0,

has the solution x(t) = 0 for all t ∈ ℝ, but also the solution given by x(t) = t^{3/2} for t ≥ 0 and x(t) = 0 for t < 0, and many others. So in order to make sure that there is a unique solution, we must put some extra conditions on the ODE, in particular on the expression f(t, x).
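Non-uniqueness can be probed numerically as well. A sketch, assuming the reconstructed form x′ = (3/2)x^{1/3}, x(0) = 0: both the zero function and the branch that follows t^{3/2} for t > 0 satisfy the equation.

```python
# Two distinct solutions of x' = (3/2) x^(1/3), x(0) = 0 (non-uniqueness).
def f(x):
    return 1.5 * x ** (1.0 / 3.0)    # only evaluated at x >= 0 here

def x2(t):
    return t ** 1.5 if t > 0 else 0.0

h = 1e-6
gap = max(abs((x2(t + h) - x2(t - h)) / (2 * h) - f(x2(t)))
          for t in (0.5, 1.0, 2.0))
# The other solution, x1(t) = 0, works trivially because f(0) = 0.
```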

2.4 Uniqueness of Solutions - The Lipschitz Condition


The different forms of the Lipschitz Condition (locally Lipschitz, globally Lipschitz, locally Lipschitz in x uniformly with respect to t) are defined in Section 4.4 of the Lecture Notes. The most important one is the following.
* Definition
Let D ⊆ ℝ^{n+1} be some domain in which (t0, x0) is an interior point. Then the function f(t, x) defined on D satisfies the Lipschitz Condition on D if there exists a constant L such that

‖f(t, x) − f(t, y)‖ ≤ L‖x − y‖   for all (t, x), (t, y) ∈ D.

With this condition, we can formulate the most important general result on the uniqueness of solutions of ODEs:

* Theorem 2.2
Let f(t, x) satisfy the Lipschitz Condition on some domain D in which (t0, x0) is an interior point. Then there exist t_s, t_e with t_s < t0 < t_e such that the differential equation x′ = f(t, x), with initial value x(t0) = x0, has a unique solution on (t_s, t_e).

To prove this theorem, we first would need to show that there is at least one solution. But that follows from the earlier results, since it can be shown that if a function satisfies a Lipschitz Condition, then it is continuous. So we only need to prove that that solution is unique.
The second part of the proof can be found in Section 4.3.2 of the Lecture Notes, but also at the end of these extra notes. It is quite some work, although nothing incredibly complicated is happening. Nevertheless, we won't spend much time on the proof, and hence it is not considered examinable material.
• Although the Lipschitz Condition is not too complicated, in practice it is quite hard to find out whether a function satisfies the Lipschitz Condition. The following condition is often useful.

* Theorem 2.3
Suppose f(t, x) is continuously differentiable on some open convex domain D ⊆ ℝ^{n+1} with (t0, x0) ∈ D and that there exists some constant K such that the partial derivatives with respect to the x-coordinates satisfy

|∂f_i(t, x)/∂x_j| ≤ K   for all i = 1, . . . , n, j = 1, . . . , n and (t, x) ∈ D.

Then f satisfies the Lipschitz Condition on D.
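As a one-dimensional illustration of Theorem 2.3 (the function and constant below are illustrative choices): f(t, x) = sin(x) has |∂f/∂x| = |cos(x)| ≤ 1 everywhere, so L = 1 should work in the Lipschitz Condition. A random spot check agrees.

```python
# Probe the Lipschitz Condition for f(t, x) = sin(x) with random pairs.
import math
import random

random.seed(0)
L = 1.0                      # candidate constant, from |cos(x)| <= 1
worst = 0.0
for _ in range(1000):
    x = random.uniform(-10.0, 10.0)
    y = random.uniform(-10.0, 10.0)
    if x != y:
        ratio = abs(math.sin(x) - math.sin(y)) / abs(x - y)
        worst = max(worst, ratio)
# worst should never exceed L (up to rounding)
```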

2.5 Proof of Uniqueness under the Lipschitz Condition


Before really starting with the proof, we first rewrite the standard ODE from (1) in a somewhat different form. To find this, suppose we have a solution x(t) of (1). Integrating both sides of the differential equation from t0 to t we get

(2)   ∫_{t0}^{t} x′(τ) dτ = ∫_{t0}^{t} f(τ, x(τ)) dτ.

You must realise that the functions x and f are in fact multi-dimensional. So in reality we have x(s) = [x1(s), . . . , xn(s)], and hence we should read

∫_{t0}^{t} x′(τ) dτ = ∫_{t0}^{t} [x1′(τ), . . . , xn′(τ)] dτ = [ ∫_{t0}^{t} x1′(τ) dτ, . . . , ∫_{t0}^{t} xn′(τ) dτ ],

where each of the integrals ∫_{t0}^{t} xi′(τ) dτ is a normal, one-dimensional integral.
Now recall that for a differentiable function f : ℝ → ℝ we have ∫_{t0}^{t} f′(τ) dτ = f(t) − f(t0). Then we find that the equation in (2) is equivalent to x(t) − x(t0) = ∫_{t0}^{t} f(τ, x(τ)) dτ. Entering the initial value x(t0) = x0, we get the so-called Volterra Integral Equation:

(3)   x(t) = x0 + ∫_{t0}^{t} f(τ, x(τ)) dτ.

In fact, we have shown that solving the initial-value differential equation in (1) is equivalent to solving the integral equation in (3).
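Equation (3) also suggests an approximation scheme of its own: substitute a guess into the right-hand side and repeat (Picard iteration, which these notes do not develop further). A grid-based sketch for the scalar example x′ = x, x(0) = 1, whose exact solution is e^t (grid size and iteration count are illustrative choices):

```python
# Picard iteration on Volterra's Integral Equation, for x' = x, x(0) = 1.
import math

n = 2001
ts = [i / (n - 1) for i in range(n)]   # grid on [0, 1]
x = [1.0] * n                          # starting guess x^(0)(t) = x0 = 1

for _ in range(25):                    # x^(k+1)(t) = x0 + ∫_0^t x^(k)(τ) dτ
    new, acc = [1.0], 0.0
    for i in range(1, n):
        acc += 0.5 * (x[i] + x[i - 1]) * (ts[i] - ts[i - 1])  # trapezoid rule
        new.append(1.0 + acc)
    x = new

picard_err = abs(x[-1] - math.e)       # iterates converge to e^t on [0, 1]
```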

• We next need a preliminary lemma.

Lemma 2.4
Let φ : [t0, te) → ℝ, where te > t0, be continuous on [t0, te) and satisfy φ(t) ≥ 0 for all t ∈ [t0, te). Suppose there is some constant K ≥ 0 so that

0 ≤ φ(t) ≤ K ∫_{t0}^{t} φ(τ) dτ   for all t ∈ [t0, te).

Then φ(t) = 0 for all t ∈ [t0, te).

Proof If K = 0, then we immediately get 0 ≤ φ(t) ≤ 0, hence φ(t) = 0 for all t ∈ [t0, te). So from now on we assume K > 0.
For t ∈ [t0, te) write Φ(t) = ∫_{t0}^{t} φ(τ) dτ. Since φ(τ) ≥ 0 for all τ ∈ [t0, te), we also have Φ(t) ≥ 0. Also, Φ(t0) = 0 and Φ(t) is continuously differentiable with Φ′(t) = φ(t). Hence the inequality in the lemma can be written as

0 ≤ Φ′(t) ≤ K · Φ(t)   for all t ∈ [t0, te).

The second part is the same as Φ′(t) − K Φ(t) ≤ 0. After multiplying by the positive value e^{−Kt} we get e^{−Kt} Φ′(t) − e^{−Kt} K Φ(t) ≤ 0, which is the same as (d/dt)[e^{−Kt} Φ(t)] ≤ 0. Now take the integral from t0 to t on both sides to get:

e^{−Kt} Φ(t) − e^{−Kt0} Φ(t0) = ∫_{t0}^{t} (d/dτ)[e^{−Kτ} Φ(τ)] dτ ≤ ∫_{t0}^{t} 0 dτ = 0.

(Recall that for a differentiable function Ψ we have ∫_{a}^{b} Ψ′(τ) dτ = Ψ(b) − Ψ(a).) But since Φ(t0) = 0 we must conclude e^{−Kt} Φ(t) ≤ 0. Since e^{−Kt} is positive, it must be the case that Φ(t) ≤ 0. Together with the inequality 0 ≤ φ(t) ≤ K Φ(t), hence 0 ≤ Φ(t) (since K > 0), we must conclude Φ(t) = 0 for all t. But then also φ(t) = Φ′(t) = 0 for all t ∈ [t0, te).
• Proof of Theorem 2.2 We only consider the interval [t0, te). The interval (ts, t0] can be done similarly, but we must take care of the signs of the integrals when t < t0.
Suppose there are two solutions x(t) and y(t) of (1) valid on [t0, te) for some te > t0. We will show that if f(t, x) satisfies the Lipschitz Condition, then we must have x(t) = y(t) for all t ∈ [t0, te).
Let L be the constant corresponding to the Lipschitz Condition of f(t, x). From the integral equation formulation in (3) we find

x(t) = x0 + ∫_{t0}^{t} f(τ, x(τ)) dτ   and   y(t) = x0 + ∫_{t0}^{t} f(τ, y(τ)) dτ.

Subtracting, we see that

x(t) − y(t) = ∫_{t0}^{t} [f(τ, x(τ)) − f(τ, y(τ))] dτ.

Taking the norm of both sides we get

0 ≤ ‖x(t) − y(t)‖ = ‖ ∫_{t0}^{t} [f(τ, x(τ)) − f(τ, y(τ))] dτ ‖.

Now we use that for integrable functions a : ℝ → ℝⁿ we have ‖ ∫_{t0}^{t} a(τ) dτ ‖ ≤ ∫_{t0}^{t} ‖a(τ)‖ dτ. This should require a proof, but if you recall that the integral is the limit of a large sum, and use the triangle inequality for norms of sums, I hope you will believe it. Anyway, applying this inequality and the Lipschitz Condition we find

0 ≤ ‖x(t) − y(t)‖ = ‖ ∫_{t0}^{t} [f(τ, x(τ)) − f(τ, y(τ))] dτ ‖
                  ≤ ∫_{t0}^{t} ‖f(τ, x(τ)) − f(τ, y(τ))‖ dτ
                  ≤ ∫_{t0}^{t} L‖x(τ) − y(τ)‖ dτ = L ∫_{t0}^{t} ‖x(τ) − y(τ)‖ dτ.

Now use Lemma 2.4 with K = L and φ(t) = ‖x(t) − y(t)‖, and we find ‖x(t) − y(t)‖ = 0 for all t ∈ [t0, te), hence x(t) = y(t), as required.

Exercises
"#
x 1 + 1
1 Consider the function f : R3 ! R2 given by f (t, x) = 2 .
x 2 + t2
(a) Prove that f is locally Lipschitz.
(b) Show that f is not globally Lipschitz on R3 .

2 Suppose the function f(t, x), where f : ℝ^{n+1} → ℝⁿ, satisfies the Lipschitz Condition on a certain domain D ⊆ ℝ^{n+1}, and let g : ℝ → ℝⁿ be any function.
Prove that h defined by h(t, x) = f(t, x) + g(t) also satisfies the Lipschitz Condition on D.

3 (a) Suppose the function f : ℝ² → ℝ satisfies the Lipschitz Condition on the whole space ℝ². Give an example that shows that this does not guarantee that f is continuous on ℝ².
(b) Suppose that the function f(t, x) is globally Lipschitz on ℝ², i.e. there is a constant L such that

|f(t, x) − f(s, y)| ≤ L‖(t, x) − (s, y)‖   for all (t, x), (s, y) ∈ ℝ².

Prove that this means that f is continuous on ℝ².


Differential Equations 2020/21
MA 209

Extra Notes 3
Phase Portraits
Linearisation of Non-linear Systems

These extra notes contain some extra information related to Chapter 2 of Dr Sasane's Lecture Notes.

3.1 Preliminaries
From now on we will only consider autonomous systems x′(t) = f(x(t)). We also assume that f is continuous and satisfies a local Lipschitz Condition. This guarantees that a solution exists for all initial values x(0) = x0, and that all solutions are unique.
This means in particular that if we have two solutions x(t) and y(t) such that x(t1) = y(t1) for some time t1, then x(t) = y(t) for all t for which the solutions exist.
Moreover, the fact that the system is autonomous means that if we have two solutions x(t) and y(t) such that x(t1) = y(t2) for some times t1, t2, then x(t1 + t) = y(t2 + t) for all t for which the solutions at t1 + t and t2 + t exist. We can also write this as x(t) = y(t + (t2 − t1)), for all t for which the solutions exist.
In other words, if two solutions of an autonomous system go through the same point at some time, then these solutions are essentially the same, only differing by a shift of time.

3.2 Phase portraits


• A phase portrait of an n-dimensional autonomous system x′(t) = f(x(t)) is a graphical representation of the states in x-space. Hence we ignore the time axis, and just indicate the qualitative behaviour the solutions would take if at a certain time they were to go through a certain point. So for "each" point x in n-dimensional space, we indicate what the direction of f(x) is, but don't worry how large f(x) is. The only points x we treat specially are those where f(x) = 0, since at those points there is no direction.
For practical reasons, we can only draw phase portraits for dimensions 1 and 2. (Maple has
some tools to draw and play with 3-dimensional phase portraits.) The Lecture Notes go
extensively into aspects of 2-dimensional phase portraits, and a little about 1-dimensional.
We’ll do a little bit more about 1-dimensional phase portraits in these additional notes, before
moving on to 2-dimensional ones.
© London School of Economics, 2019

• For 1-dimensional autonomous systems x′ = f(x), the phase portrait lives on the x-line. We only have three kinds of points: those with f(x) = 0, f(x) > 0, or f(x) < 0.
The points with f(x) = 0 are called singular points or fixed points or stationary points or equilibrium points. These are exactly the points for which we have a constant solution.
If a and b are fixed points, a < b, and f(x) ≠ 0 for x ∈ (a, b), then, because f is continuous, f(x) has the same sign for all x ∈ (a, b). So we can label the interval (a, b) of the x-line with an arrow showing the direction in which x is changing.

• As an example, consider the system

x′(t) = x ln|x|  if x ≠ 0,   x′(t) = 0  if x = 0.

If x ≠ 0, then x ln|x| = 0 whenever ln|x| = 0, hence whenever |x| = 1. This gives x = 1 or x = −1. By the definition, x = 0 is also a fixed point for this differential equation. Filling in values for x between the fixed points, we get the following phase portrait:

  ←———  ●  ———→  ●  ←———  ●  ———→
       −1        0        1

• It is obvious that we have only four possibilities for the behaviour around a fixed point:

(a) ——→ ● ——→   (b) ←—— ● ——→   (c) ——→ ● ←——   (d) ←—— ● ←——

The fixed point in (b) is called a repellor, the one in (c) an attractor, while those in (a) and (d) are both called a shunt.
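The sign test behind these four pictures is easy to mechanise. A sketch (the probe distance eps is an illustrative choice, assumed smaller than the gap between neighbouring fixed points):

```python
# Classify a 1-dimensional fixed point by the sign of f on either side of it.
import math

def classify(f, x_star, eps=1e-3):
    left, right = f(x_star - eps), f(x_star + eps)
    if left > 0 and right < 0:
        return "attractor"   # both arrows point towards the fixed point
    if left < 0 and right > 0:
        return "repellor"    # both arrows point away from it
    return "shunt"           # arrows point the same way on both sides

# The example x' = x ln|x| (with f(0) = 0) has fixed points -1, 0, 1.
f = lambda x: x * math.log(abs(x)) if x != 0 else 0.0
kinds = [classify(f, s) for s in (-1.0, 0.0, 1.0)]
```

For this example the test reports repellors at ±1 and an attractor at 0, matching the portrait above.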
• For 2-dimensional phase portraits, and how to construct them in general, we refer to Sec-
tions 2.1 – 2.3 of the Lecture Notes.

3.3 Phase portraits of 2-dimensional linear systems


• We have seen earlier that if x′ = Ax is a 2-dimensional linear system, then we get a lot of information by writing A = P J P⁻¹, where P is an invertible matrix and J is a so-called Jordan form. The phase portraits of systems y′ = Jy, where J is a Jordan form, are given in Subsections 2.4.1 – 2.4.3 of the Lecture Notes. But how can we use these to get phase portraits of general systems? A little bit about that is done in the exercises of Section 2.4, but we will do it below in somewhat greater detail.
• As indicated above, we assume we can find the phase portrait of any system y′ = Jy if J is a Jordan form.
Now suppose we are given a system x′ = Ax, with A = P J P⁻¹. Take y(t) = P⁻¹ x(t), hence x(t) = P y(t). Then we have that

y′ = (d/dt) y = (d/dt) (P⁻¹ x) = P⁻¹ (d/dt) x = P⁻¹ x′ = P⁻¹ A x = P⁻¹ A P y = J y.

So the system y′ = Jy is exactly the system we get under the transformation y = P⁻¹ x. So if we know the phase portrait for the y-system, then we can use the inverse transformation x = P y to find the phase portrait of the x-system.

To get an idea of what it means to apply the transformation x = P y, we now take a closer look.
• In this section we look at mappings x ↦ Mx from ℝⁿ to ℝⁿ, where M is an n × n invertible real-valued matrix. The relevant question is: "How does the image of ℝⁿ look after applying the transformation x ↦ Mx?"
Theorem 3.1
Let M be a real-valued invertible n × n matrix. Then the mapping ℝⁿ → ℝⁿ given by x ↦ Mx has the following properties:
(a) The origin gets mapped to the origin.
(b) A closed curve gets mapped to a closed curve.
(c) A line gets mapped to a line.
(d) A half-line with its end in the origin gets mapped to a half-line with its end in the origin. Moreover, if we impose one of the two possible directions along the original half-line (away from the origin or towards the origin), then the direction is the same in the image.
(e) Parallel lines get mapped to parallel lines.

Proof (a) is trivial since M0 = 0.

For (b) we really should define more carefully what we mean by a "closed curve". Let's just say that (b) follows since the mapping x ↦ Mx is a continuous function, and hence we can't have a point on a curve which is "cut open" under that mapping.
Recall that a line in ℝⁿ can be given as the set of points { λa + b | λ ∈ ℝ }, where a, b are fixed vectors, a ≠ 0. The image of these points is the set { M(λa + b) | λ ∈ ℝ } = { λMa + Mb | λ ∈ ℝ }. This is a new line where the defining vectors are now Ma and Mb. Since M is invertible, we also have Ma ≠ 0. This proves (c).
For (d) we use that a half-line with its end in the origin is the set of points { λa | λ ≥ 0 }, where a is some fixed vector, a ≠ 0. The image of such a set is { M(λa) | λ ≥ 0 } = { λMa | λ ≥ 0 }, the same kind of set. Moreover, moving along such a half-line away from the origin is the same as increasing the value of λ. This clearly has the same effect in both the original half-line and the image.
For (e) we only need the observation that parallel lines have the property that we can take their direction vectors the same. So line 1 contains all points of the form a1 + λb, λ ∈ ℝ, for some vectors a1, b; while line 2 contains all points of the form a2 + λb, λ ∈ ℝ, for some vector a2 and the same b. Multiplying these points by M, we get the points Ma1 + λMb, λ ∈ ℝ, and Ma2 + λMb, λ ∈ ℝ. These points form parallel lines.

The observations above give us a good idea of what happens if we have some figure (like a phase portrait) sketched in the plane, and we want to know the image of that sketch under a transformation x ↦ Mx.

• In particular, we now have a way to find a phase portrait of a linear system x′ = Ax, with A = P J P⁻¹. Take the phase portrait of the system y′ = Jy, and transform that phase portrait with the transformation y ↦ P y. And we have seen earlier that if x(t) = P y(t) for some solution y(t) of the system y′ = Jy, then x(t) is a solution of the original system x′ = Ax. So the trajectories of the transformed system correspond exactly to trajectories of the original system.
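This correspondence can be checked on a concrete example. A sketch with illustrative matrices P and J (J diagonal, so y′ = Jy has an explicit solution); we verify numerically that x(t) = P y(t) solves x′ = Ax for A = P J P⁻¹.

```python
# Check that x = P y maps solutions of y' = Jy to solutions of x' = Ax.
import math

P    = [[1.0, 1.0], [0.0, 1.0]]
Pinv = [[1.0, -1.0], [0.0, 1.0]]          # inverse of P
J    = [[1.0, 0.0], [0.0, 2.0]]           # a diagonal Jordan form

def matvec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = matmul(matmul(P, J), Pinv)            # A = P J P^{-1}

def y(t):                                 # solution of y' = Jy, y(0) = (1, 1)
    return [math.exp(t), math.exp(2 * t)]

def x(t):                                 # transformed trajectory x = P y
    return matvec(P, y(t))

# numerically verify x' = A x at a sample time
h, t = 1e-6, 0.7
xp = [(x(t + h)[i] - x(t - h)[i]) / (2 * h) for i in range(2)]
Ax = matvec(A, x(t))
gap = max(abs(xp[i] - Ax[i]) for i in range(2))
```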

3.4 Differentiability and Linear Taylor Approximation

The remaining sections of these Extra Notes will give some extra information on what is happening in Section 2.5.1 of the Lecture Notes.
• Firstly, the theory of linearisation is a theory about "local" behaviour. To give that a more solid footing, we need some definitions.
• A set N ⊆ ℝⁿ is a neighbourhood of a point x0 ∈ ℝⁿ if there is an r > 0 so that B(x0, r) ⊆ N (where, as before, B(x0, r) is the open ball around x0 with radius r).
An equivalent way to define neighbourhood would be: the set N is a neighbourhood of a point x0 ∈ ℝⁿ if x0 is an interior point of N (see page 2 of Extra Notes 2).
• Recall that a function f : S → ℝᵐ, where S ⊆ ℝⁿ, is differentiable at x0 ∈ S, where x0 must be in the interior of S, if there exists an m × n matrix A so that

( f(x) − f(x0) − A(x − x0) ) / ‖x − x0‖ → 0   as x → x0.

(Since the vector 0 is the only vector with norm 0, it is completely equivalent to replace the numerator by the norm ‖f(x) − f(x0) − A(x − x0)‖, and hence require the expression to approach 0 as x → x0.)
The matrix A is called the derivative of f at x0 and is denoted by Df(x0).
If f is differentiable on a set S, then the derivative Df can in its turn be seen as a function Df : S → ℝ^{m×n}. (Here we should assume that S is open to make sure that all x0 ∈ S are interior points of S.) If this function Df is continuous, then f is said to be continuously differentiable.
• For practical use, we can assume the following:
If f : S → ℝᵐ, where S ⊆ ℝⁿ, then we can think of f as being defined by m coordinate functions f(x) = (f1(x), . . . , fm(x)), with each fi a function ℝⁿ → ℝ. Then the derivative exists for all x ∈ S and is continuous, if and only if the j-th partial derivative of every coordinate function fi(x) exists and is continuous for all i = 1, . . . , m, j = 1, . . . , n and x ∈ S. Note that this j-th partial derivative of fi is ∂fi(x)/∂xj.

In that case we also have

Df(x) = [ ∂f1(x)/∂x1   ∂f1(x)/∂x2   · · ·   ∂f1(x)/∂xn ]
        [ ∂f2(x)/∂x1   ∂f2(x)/∂x2   · · ·   ∂f2(x)/∂xn ]
        [      ⋮             ⋮        ⋱          ⋮     ]
        [ ∂fm(x)/∂x1   ∂fm(x)/∂x2   · · ·   ∂fm(x)/∂xn ]

• Theorem 3.2 (First Order / Linear Taylor Approximation for functions ℝⁿ → ℝᵐ)

Let f : S → ℝᵐ be a function, where S ⊆ ℝⁿ, and let x0 ∈ S be an interior point of S. If f is differentiable at x0, then for any x ∈ S we can write

f(x) = f(x0) + Df(x0)(x − x0) + R(x)‖x − x0‖.

Here the function R : S → ℝᵐ in the remainder term depends on x0, and has the properties that R(x0) = 0 and R(x) → 0 as x → x0.

Proof Although the theorem looks kind of scary, it's actually very easy to prove. Easy, of course, if you understand what the definition of differentiability involves.
The definition of f being differentiable at x0 means that

( f(x) − f(x0) − Df(x0)(x − x0) ) / ‖x − x0‖ → 0   as x → x0.

Now define R(x) = ( f(x) − f(x0) − Df(x0)(x − x0) ) / ‖x − x0‖ for x ∈ S \ {x0}, and set R(x0) = 0. Then we immediately get that R(x) → 0 = R(x0) as x → x0.
Also, by rearranging the definition we immediately find the main formula

f(x) = f(x0) + Df(x0)(x − x0) + R(x)‖x − x0‖.

And we saw already that R has the right properties.

An alternative way to write the formula in the Taylor Approximation is by writing every x as x = x0 + y (so y is just the difference of x and x0). Then we get

f(x0 + y) = f(x0) + Df(x0)y + R(x0 + y)‖y‖.

This is actually the form we will use most from now on.

3.5 Linearisation at a fixed point

• Now we start looking at an autonomous system x′ = f(x). The Taylor Approximation at x0 of the function f(x), writing x = x0 + y, is

f(x0 + y) = f(x0) + Df(x0)y + R(x0 + y)‖y‖.

Moreover, since we are given x′ = f(x) = f(x0 + y), and y = x − x0, so y′ = x′ (since x0 is some fixed constant vector), we get a new differential equation

y′ = f(x0) + Df(x0)y + R(x0 + y)‖y‖.

The linearisation of x′ = f(x) at x0 is just this formula with the remainder term removed:

y′ = f(x0) + Df(x0)y.

In general, we are only interested in the case that x0 is a fixed point of the system x′ = f(x), so when f(x0) = 0. Then the linearisation looks like

y′ = Df(x0)y.

• Remember that Df(x0) is the matrix of partial derivatives. Since f : ℝⁿ → ℝⁿ, we get the n × n matrix

Df(x0) = [ ∂f1(x0)/∂x1   ∂f1(x0)/∂x2   · · ·   ∂f1(x0)/∂xn ]
         [ ∂f2(x0)/∂x1   ∂f2(x0)/∂x2   · · ·   ∂f2(x0)/∂xn ]
         [      ⋮              ⋮         ⋱          ⋮      ]
         [ ∂fn(x0)/∂x1   ∂fn(x0)/∂x2   · · ·   ∂fn(x0)/∂xn ]

And in particular, for a 2-dimensional system

[ x1′ ]   [ f1(x1, x2) ]
[ x2′ ] = [ f2(x1, x2) ]

we find that the linearisation at a fixed point x0 is

[ y1′ ]   [ ∂f1(x0)/∂x1   ∂f1(x0)/∂x2 ] [ y1 ]
[ y2′ ] = [ ∂f2(x0)/∂x1   ∂f2(x0)/∂x2 ] [ y2 ]
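The linearisation can also be computed numerically by approximating the matrix of partial derivatives with central differences. A sketch, with an illustrative example system x1′ = x2, x2′ = −sin(x1), whose exact Jacobian at the fixed point (0, 0) is [[0, 1], [−1, 0]]:

```python
# Numerical Jacobian Df(x0) at a fixed point, giving the linearisation y' = Df(x0) y.
import math

def f(x):
    return [x[1], -math.sin(x[0])]   # illustrative system with fixed point (0, 0)

def jacobian(f, x0, h=1e-6):
    n = len(x0)
    J = []
    for i in range(n):
        row = []
        for j in range(n):
            xp, xm = list(x0), list(x0)
            xp[j] += h
            xm[j] -= h
            row.append((f(xp)[i] - f(xm)[i]) / (2 * h))   # ∂f_i/∂x_j
        J.append(row)
    return J

Df = jacobian(f, [0.0, 0.0])   # approximately [[0, 1], [-1, 0]]
```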

• The following is a more qualitative version of Theorem 2.5.1 in the Lecture Notes. We won't define "qualitatively equivalent" exactly; let's just say it means "looks qualitatively the same".
Theorem 3.3 (Linearisation Theorem)
Suppose the non-linear system x′ = f(x) has a fixed point x0 with linearised system y′ = Df(x0)y. Then there is a neighbourhood N of x0 so that the phase portrait of the original system for x is qualitatively equivalent to the phase portrait of the linearised system at the origin, provided none of the eigenvalues of Df(x0) has a real part equal to zero.

In the lectures we will discuss the ideas behind the proof of this theorem, using the Linear Taylor Approximation.
Note that Theorem 3.3 does not say anything if Df(x0) has an eigenvalue whose real part is equal to zero. In fact, there are two essentially different possibilities for that to happen:
• Df(x0) is not invertible, and hence it has an eigenvalue equal to zero;
• Df(x0) is invertible, but it has an eigenvalue that is purely imaginary.
• Fixed points x0 of systems x′ = f(x) for which none of the eigenvalues of Df(x0) have zero real part are called hyperbolic. So hyperbolic fixed points are exactly those fixed points for which the Linearisation Theorem provides information.

Exercises

1 Consider the parameter-dependent 1-dimensional differential equation

x′ = (x − λ)(x² − λ),

where λ is some real number.

Find all possible phase portraits that can occur for this equation, together with the intervals of λ in which they occur.

2 (a) For a 1-dimensional autonomous di↵erential equation, how many distinct types of phase
portraits can occur on the phase line if it has three fixed points?
(b) And what is the number of distinct types of phase portraits that can occur if there are n
fixed points?

3 Use the method of isoclines to sketch the phase portraits of the following systems:
⇢ 0
x1 = x1 ,
(a)
x02 = x1 x2 ;
⇢ 0
x1 = x1 x2 ,
(b)
x02 = x22 ;
⇢ 0
x1 = ln(x1 ),
(c) for x1 > 0.
x02 = x2 ,

4 Below are given four invertible matrices, corresponding to linear transformations x ↦ Ax, . . . , x ↦ Dx.
" # " # " # " #
1 0 1 2 7 3 1 2
(a) A = ; (b) B = ; (c) C = ; (d) D = .
0 2 2 1 8 3 1 1

(i) Indicate the effect of each of the linear transformations of the x1, x2-plane by shading the image of the square S = { (x1, x2) | 0 ≤ x1, x2 ≤ 1 }.
(ii) Sketch the image of the circle x1² + x2² = 1.
(iii) Sketch the image of the curve x1 x2 = 1, x1, x2 > 0.

" # " #
x01 x1
5 Consider the linear system 0 = A , where A is one of the matrices below.
x2 x2
" # " # " # " # " #
1 0 1 0 2 1 0 2 3 0
(a) ; (b) ; (c) ; (d) ; (e) .
0 2 0 2 0 2 2 0 0 0

(i) Sketch the phase portraits of the linear systems x′ = Ax.


" # " #" #
y1 2 1 x1
(ii) Indicate the e↵ect of the linear transformation = on each of the systems
y2 1 1 x2
by sketching each phase portrait in the y1 , y2 -plane.
(iii) For each of the phase portraits in (i), determine the matrix B so that y′ = By is the equation for the system in the y1, y2-plane.

6 Find the stationary points of the following systems and determine the linearisation at each
of the stationary points.
⇢ 0
x1 = x22 3x1 + 2,
(a)
x02 = x21 x22 ;
⇢ 0
x1 = sin(x1 + x2 ),
(b)
x02 = x2 ;
⇢ 0
x1 = x1 x2 + x1 x2 ,
(c)
x02 = x1 x2 x22 .

7 For each of the systems in the previous questions, describe the behaviour (stable, unstable,
other observation) close to each of their stationary points.
Differential Equations 2020/21
MA 209

Extra Notes 4
Lyapunov Theory
First Integrals

These extra notes contain some bits, pieces and extra material related to Sections 3.5 to 3.7 of Dr Sasane's Lecture Notes.

4.1 Definitions
In these Extra Notes (and in the relevant sections from the Lecture Notes) we assume that we are given a system of ODEs x′ = f(x) and that the origin is a fixed point of that system. If we want to study a fixed point other than the origin, we first modify the system to a new system which has the origin as a fixed point. See the start of Section 3.6 of the Lecture Notes.
• Let V : ℝⁿ → ℝ be a function and N a neighbourhood of the origin (i.e. there is an r > 0 so that B(0, r) ⊆ N).
* We call V positive definite on N if V(0) = 0 and V(x) > 0 for all x ∈ N \ {0}.
If we replace V(x) > 0 by V(x) < 0, we get the definition of negative definite on N.
* We call V positive semi-definite on N if V(0) = 0 and V(x) ≥ 0 for all x ∈ N.
And if we require V(x) ≤ 0, we get the definition of negative semi-definite on N.
• Let V : ℝⁿ → ℝ be a function, and x′(t) = f(x(t)) an n-dimensional system of ODEs. Then the directional derivative of V along the curve x(t) is the derivative of V(x(t)) with respect to t. Note that V(x(·)) is a function from ℝ to ℝ. Hence, using the Chain Rule, for the directional derivative of V along x(t) we find:

[V(x(t))]′ = (d/dt) V(x(t)) = ∇V(x) · x′(t) = ∇V(x) · f(x),

where ∇V(x) = [ ∂V(x)/∂x1, ∂V(x)/∂x2, . . . , ∂V(x)/∂xn ].

In other words, the directional derivative is

[V(x(t))]′ = (∂V(x)/∂x1) f1(x) + (∂V(x)/∂x2) f2(x) + · · · + (∂V(x)/∂xn) fn(x)

(the inner product of the two vectors ∇V(x) and f(x)).
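This formula is straightforward to evaluate in code. A sketch for an illustrative system and V(x) = x1² + x2²; here ∇V · f works out to −2(x1² + x2²), so this V is in fact a strong Lyapunov function for the chosen system:

```python
# Directional derivative [V(x(t))]' = ∇V(x) · f(x) for an illustrative system
# x1' = -x1 + x2, x2' = -x1 - x2 and V(x1, x2) = x1^2 + x2^2.
def f(x):
    return [-x[0] + x[1], -x[0] - x[1]]

def gradV(x):
    return [2 * x[0], 2 * x[1]]

def v_dot(x):
    """Directional derivative of V along the flow at the point x."""
    return sum(g * w for g, w in zip(gradV(x), f(x)))

samples = [[1.0, 0.0], [0.3, -0.4], [-2.0, 2.0]]
values = [v_dot(x) for x in samples]   # all negative away from the origin
```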

• * A continuously differentiable function V : ℝⁿ → ℝ is a (weak) Lyapunov function of the system x′ = f(x) if there exists a neighbourhood N of the origin so that
– V(x) is positive definite on N, and
– the directional derivative [V(x(t))]′ is negative semi-definite on N.
* A continuously differentiable function V : ℝⁿ → ℝ is a strong Lyapunov function of the system x′ = f(x) if there exists a neighbourhood N of the origin so that
– V(x) is positive definite on N, and
– the directional derivative [V(x(t))]′ is negative definite on N.
The Lecture Notes only consider neighbourhoods of the form B(0, R) for some R > 0. There are good reasons to consider more general neighbourhoods, as is done in the definitions above, in particular when we come to determining domains of stability below.

4.2 Results
The crucial result on the relation between Lyapunov functions and stability is Theorem 3.7.1 in the Lecture Notes. Here it is again:
Theorem 3.7.1
Suppose the system x′ = f(x) has a fixed point in the origin.
• If there is a weak Lyapunov function for the system, then the origin is a stable fixed point.
• If there is a strong Lyapunov function for the system, then the origin is an asymptotically stable fixed point.
• It is possible to formulate stronger results, such as the following:
Extra Theorem 4.1
Suppose there is a weak Lyapunov function V for the system x′ = f(x) with a fixed point in the origin, on some neighbourhood N of the origin. And suppose that for every x* ∈ N \ {0} with [V(x*)]′ = 0 we have that f(x*) is directed towards points x ∈ N for which [V(x)]′ ≠ 0. Then the origin is an asymptotically stable fixed point.
• The problem with the Lyapunov theorems is that they give no indication of how to find an appropriate Lyapunov function. And in fact, there are no golden rules for this. Most of them are found by trial and error, using some standard forms that are known to give useful positive definite functions. Examples will be discussed in the lectures and you will be asked to do some of this in the exercises as well.
• Once a Lyapunov function is found for a system, it can also be used to find a domain of stability of the system. This is a neighbourhood of the origin such that every trajectory starting in that neighbourhood stays in it, or (in the case of asymptotic stability) every trajectory starting in it will converge to the origin.
A condition that is required for a set D to be a domain of stability is that the directional derivative [V(x(t))]′ is negative definite in D. But we also need to make sure that D contains whole level sets V(x) = C. The reason for this follows from the proof of Theorem 3.7.1: we can only guarantee that trajectories stay inside sets bounded by level sets of the Lyapunov function.
Note that the fact that we can use Lyapunov functions to find domains of stability makes them a very useful tool in stability analysis. In the exercises you will find examples of systems for which we can show that the origin is (asymptotically) stable quite easily using the Linearisation Theorem. But that theorem gives us no information about the size of the neighbourhood which can be used in the definition of stability. In those cases an appropriate Lyapunov function might provide a lot more information.

4.3 First Integrals

A first integral for a system x′ = f(x) is a function that has some properties related to Lyapunov functions. The following two definitions are equivalent:
* A continuously differentiable function F : ℝⁿ → ℝ, which is not constant, is a first integral of the system x′ = f(x) on a region S ⊆ ℝⁿ if for all x ∈ S we have

[F(x(t))]′ = (∂F(x)/∂x1) f1(x) + · · · + (∂F(x)/∂xn) fn(x) = 0.

– A continuously differentiable function F : ℝⁿ → ℝ, which is not constant, is a first integral of the system x′ = f(x) on a region S ⊆ ℝⁿ if for every solution x*(t) of this system we have that if t1, t2 are times such that x*(t1), x*(t2) ∈ S, then F(x*(t1)) = F(x*(t2)).
The first definition above gives a straightforward method to check whether a certain F is a first integral of a system. The second definition gives an idea why we should be interested in first integrals. That definition indicates that the value of F is constant at different points of a trajectory. So if we have a first integral on the whole space ℝⁿ, then trajectories are curves that lie completely on a level set F(x) = C for some constant C.
• A system that has a first integral on the whole space ℝⁿ is called a conservative system. The idea behind that name is that the first integral is constant on every trajectory. So when the system develops over time from some kind of initial point x0 (i.e. moves along the trajectory through x0 away from x0), the first integral remains constant throughout the dynamic development. So the first integral indicates some kind of property that is "conserved" as the system changes over time.
In physical systems, the first integral is often something like the total energy of the system. For an economic system, it can be something like the total amount of wealth in the system.
• As with Lyapunov functions, it is easier to check that a given function is a first integral
than to find one in the first place. But here there are some general techniques that can be used.
" # " #
x01 f1 (x1 , x2 )
For 2-dimensional systems 0 = , the standard method relies on trying to find
x2 f2 (x1 , x2 )
solutions for the di↵erential equation
dx2 f2 (x1 , x2 )
= .
dx1 f1 (x1 , x2 )
MA 209 Differential Equations Extra Notes 4 — Page 4

Examples will be discussed in the lectures. Note that this method only provides a suggestion
for a first integral. Once a candidate function has been obtained, you should still use the
definition to show that it really is a first integral.
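To illustrate how the method works, here is a sketch (not from the original notes) for the Volterra-Lotka equations of Question 8, with the assumed choice a = b = c = d = 1:

```latex
% For  x_1' = x_1 (1 - x_2),  x_2' = x_2 (x_1 - 1)  the method gives
\[
  \frac{dx_2}{dx_1} = \frac{x_2 (x_1 - 1)}{x_1 (1 - x_2)},
\]
% which separates (for x_1, x_2 > 0, away from x_2 = 1) as
\[
  \frac{1 - x_2}{x_2}\, dx_2 = \frac{x_1 - 1}{x_1}\, dx_1 .
\]
% Integrating both sides gives  \ln x_2 - x_2 = x_1 - \ln x_1 + C,
% suggesting the candidate first integral
\[
  F(x_1, x_2) = x_1 - \ln x_1 + x_2 - \ln x_2 .
\]
% One should still verify via the definition that this F really is
% a first integral on the region x_1, x_2 > 0  (it is).
```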

Exercises

1 Show that V(x₁, x₂) = x₁² + x₂² is a strong Lyapunov function for the following systems:
(a)
\[
  \begin{cases}
    x_1' = -x_1^3 + x_2 \sin(x_1), \\
    x_2' = -x_2 - x_1^2 x_2 - x_1 \sin(x_1);
  \end{cases}
\]
(b)
\[
  \begin{cases}
    x_1' = -x_1 - 2x_2^2, \\
    x_2' = 2x_1 x_2 - x_2^3;
  \end{cases}
\]
(c)
\[
  \begin{cases}
    x_1' = -x_1 \sin^2(x_1), \\
    x_2' = -x_2 - x_2^5;
  \end{cases}
\]
(d)
\[
  \begin{cases}
    x_1' = -(1 - x_2)\, x_1, \\
    x_2' = -(1 - x_1)\, x_2.
  \end{cases}
\]

2 Find domains of stability for each of the systems in Question 1 above.

3 Show that V(x₁, x₂) = x₁² + x₂² is a weak Lyapunov function for the following systems:
(a)
\[
  \begin{cases}
    x_1' = x_2, \\
    x_2' = -x_1 - x_2^3 (1 - x_1^2)^2;
  \end{cases}
\]
(b)
\[
  \begin{cases}
    x_1' = -x_1 + x_2^2, \\
    x_2' = -x_1 x_2 - x_1^2;
  \end{cases}
\]
(c)
\[
  \begin{cases}
    x_1' = -x_1^3, \\
    x_2' = -x_1^2 x_2.
  \end{cases}
\]
Which of the systems above are asymptotically stable and which ones are not?

4 Prove that if the system x' = f(x) satisfies f(0) = 0, and the system has a strong Lyapunov
function on some neighbourhood of the origin, then the system x' = −f(x) has an unstable
fixed point at the origin.
What can you say if there is only a weak Lyapunov function for x' = f(x)?

5 Find an appropriate Lyapunov function and a domain of stability for the following system:
\[
  \begin{cases}
    x_1' = x_2, \\
    x_2' = -x_2 + x_2^3 - x_1^5.
  \end{cases}
\]

6 Find first integrals of the following systems, together with the regions on which the first
integrals are defined:
(a)
\[
  \begin{cases}
    x_1' = x_2, \\
    x_2' = x_1^2 + 1;
  \end{cases}
\]
(b)
\[
  \begin{cases}
    x_1' = x_1 (x_2 + 1), \\
    x_2' = x_2 (x_1 + 1);
  \end{cases}
\]
(c)
\[
  \begin{cases}
    x_1' = \dfrac{1}{\cos(x_1)}, \\
    x_2' = x_2^2,
  \end{cases}
  \quad \text{for } |x_1| < \tfrac{1}{2}\pi;
\]
(d)
\[
  \begin{cases}
    x_1' = x_1 x_2, \\
    x_2' = \ln(x_1),
  \end{cases}
  \quad \text{for } x_1 > 1.
\]


7 Consider the following 2-dimensional non-linear system:
\[
  \begin{cases}
    x_1' = x_1 (1 - x_1^2), \\
    x_2' = -x_2.
  \end{cases}
\]
(a) Find the fixed points and use the Linearisation Theorem to determine the phase portraits
around each of the fixed points.
(b) Determine some isoclines and use these, together with the information from (a), to sketch
a global phase portrait.

8 Several times in the lectures we have looked at the Predator-Prey model described by the
so-called Volterra-Lotka equations:
\[
  \begin{cases}
    x_1' = x_1 (a - b\, x_2), \\
    x_2' = x_2 (d\, x_1 - c),
  \end{cases}
\]
where a, b, c, d are positive real constants.
A more complicated Predator-Prey model is given by the following equations:
\[
  \begin{cases}
    x_1' = x_1 (a - b\, x_2 - \alpha\, x_1), \\
    x_2' = x_2 (d\, x_1 - c - \beta\, x_2),
  \end{cases}
\]
where a, b, c, d, α, β are positive constants with α c < a d.
For simplicity, let's take a = b = c = d = 1, and α = β = 1/3.
(a) Show that there are three fixed points in the quadrant x₁, x₂ ≥ 0.
(b) Obtain the linearisations at the fixed points and classify the linear systems. Which of
the fixed points can be characterised using the linearisations?
(c) Determine the isoclines with horizontal and vertical directions.
(d) Use the information obtained to make a sketch of the phase portrait of the system for
x₁, x₂ ≥ 0.
