Notes On MATH 441
Lecture notes.
M.McIntyre
Contents
5 Tangency Of Maps
5.1 Tangency and Affine Maps
6 Concept of Derivative
7 Differentiation of Composites
10 Inverse Maps
10.1 Some techniques
10.2 Existence of local inverses
11 Implicit functions
11.1 An Application of the Implicit Function Theorem
These notes were compiled by Dr. McIntyre. Much material has been adapted
from lecture notes written by Arthur Jones and Alistair Gray of La Trobe
University, Australia.
The text by Lang, S. “Analysis 1” is an appropriate supplement. In this book,
the presentation and the order in which ideas are introduced differ from those
of the lecturer; nevertheless the student will benefit from reading this text.
Differential calculus for functions mapping from R to R has been studied in
calculus. Such a function is differentiable at a ∈ R if
lim_{h→0} (f(a + h) − f(a))/h
exists.
The derivative of f at a is then defined to be this number and is denoted f′(a).
Unfortunately however this idea does not carry over to higher dimensions.
For example division is not defined for an element of Rm , which is a vector
rather than a number.
The way out of this dilemma was discovered by the French mathematician
Fréchet in 1911. His idea was to think of differentiation as a process of
approximating the function f near a by a linear map. This linear map is called
the Fréchet derivative of f at a. The main aim of this course is to give an
understanding of this notion of differentiation.
1. At first we briefly consider functions f : R → R and define Fréchet
differentiation in this familiar context. The Fréchet derivative at a point a ∈ R
is just a linear map from R to R, i.e. it is a constant (in fact the ordinary
derivative at a) times the identity map.
2. We define linear and affine maps between arbitrary vector spaces, our
scalars always being from the field R and we examine the geometry of
such maps when the vector spaces are Euclidean. We review the matrix of
a linear map from Linear Algebra and prove the result that an affine map
between arbitrary vector spaces has an inverse if and only if the same is
true of its linear part.
3. Having reviewed linear and affine maps, the other prerequisite for under-
standing the Fréchet derivative of maps between normed vector spaces is
the study of limits for such maps. We recall from 353 and 356, some topol-
ogy of normed vector spaces. A normed vector space is a metric space,
the metric being determined by the norm. Thus as basis for a topology,
we have the set of all open balls. The study of limits is of course closely
iv DETAILS OF THE COURSE
4. Next we study the concept of tangency for maps between normed vec-
tor spaces. This is our main application of the idea of limits developed
previously.
Then we study the derivative for functions which map one normed vector
space to another. The basic idea is very simple. A function is differentiable
at a point if there is an affine map which approximates the function very
closely near this point. The derivative of the function is then the linear
part of that affine map.
equations. We use the Inverse Map Theorem to prove the Implicit Function
Theorem. As an application we could examine the notion of charts for
manifolds.
Chapter 1
The Fréchet Derivative for Maps f : R → R
so
0 < ||x − λy||^2 = Σ_{i=1}^n (x_i − λy_i)^2      (definition of || || on R^n)
                 = Σ_{i=1}^n x_i^2 − 2λ Σ_{i=1}^n x_i y_i + λ^2 Σ_{i=1}^n y_i^2;
lim_{h→0} (f(a_1, a_2, . . . , a_i + h, . . . , a_n) − f(a_1, . . . , a_n))/h
exists. The value of the limit is the ith partial derivative ∂f/∂x_i of f at a.
Each partial derivative ∂f/∂x_i determines a function from R → R, so partial
differentiation reduces the problem in (ii) to that of (i). Next, to consider
differentiability of functions f : R^n → R^m: since f(x) ∈ R^m, f determines m
component functions; i.e. f(x) = (f_1(x), . . . , f_m(x)), each of which is a
function from R^n to R. Again the question of differentiability becomes a question
of existence of partial derivatives of real-valued functions. This led to the
definition of the Jacobian matrix (of partial derivatives), to which we will return later.
Considering our definition of f′(a) for f : R → R, i.e.
f′(a) = lim_{h→0} (f(a + h) − f(a))/h,
it is obvious that this cannot apply for functions
f : R^n → R because division by h ∈ R^n is not defined.
Geometrically the graphs of linear and affine maps from R to R are lines.
[Figure: the graphs of an affine map and a linear map from R to R.]
Notation: L(R, R) will denote the space of linear maps from R to R. It is closed
under the vector space operations of pointwise addition and scalar multiplica-
tion. Moreover it is possible to define a norm on this vector space. For example:
|| || : L(R, R) → R defined by ||mId|| = |m| (i.e. the absolute value of the slope
of the line) is a norm on L(R, R).
We will say more about linear and affine maps when we consider more general
functions.
Notice that Df (a) ∈ L(R, R), the space of linear maps from R to R.
Example (1): f = Id^2 (i.e. f(x) = x^2 for each x ∈ R).
We have for a ∈ R, f(a) = a^2, so f′(a) = 2a and Df(a) = 2a Id.
The notation suggests an underlying map Df . Suppose f is differentiable at
each a ∈ R, then
Df : R → L(R, R)
is the map which assigns to each a ∈ R, the linear function Df (a) ∈ L(R, R).
Returning to the example, we had Df (a) = 2aId i.e. Df (a)(h) = (2aId)(h) =
2ah.
Df : R → L(R, R)
a 7→ Df (a)
For f = Id^2, the map Df is 2Id Id; since to each a ∈ R, Df assigns the map
Df(a), which is 2a Id.
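The distinction between the number f′(a), the linear map Df(a) and the map Df can be made concrete with a small sketch. The helper names `linear_map` and `D_square` below are illustrative assumptions, not notation from these notes.

```python
def linear_map(m):
    """Return the linear map m*Id : R -> R, i.e. h |-> m*h."""
    return lambda h: m * h


def D_square(a):
    """Frechet derivative of f = Id^2 at a: the linear map 2a*Id."""
    return linear_map(2 * a)


# D_square plays the role of Df : R -> L(R, R); its values are maps, not numbers.
Df_at_3 = D_square(3.0)   # the linear map h |-> 6h
value = Df_at_3(0.5)      # Df(3)(0.5) = 2 * 3 * 0.5
```

Here `D_square(3.0)` is itself a function, which mirrors the statement that Df(a) belongs to L(R, R).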
1.2. THE FRÉCHET DERIVATIVE FOR MAPS F : R → R. 5
Example (2): Let f : R → R be given by f = 3Id^2 · cos (i.e. f(x) = 3x^2 cos(x)).
f is differentiable at each a ∈ R; by elementary calculus we have
f′ = 6Id · cos − 3Id^2 · sin (i.e. at each a ∈ R, f′(a) = 6a cos(a) − 3a^2 sin(a))
Df(a) = f′(a) Id = (6a cos(a) − 3a^2 sin(a)) Id
Df(a)(h) = 6ah cos(a) − 3a^2 h sin(a)
Df = (6Id · cos ∘ Id − 3Id^2 · sin ∘ Id) Id, where conventionally the composite cos ∘ Id =
cos and sin ∘ Id = sin, so Df = (6Id · cos − 3Id^2 · sin) Id.
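As a rough numerical sanity check of Example (2), the sketch below evaluates the quotient |f(a + h) − f(a) − Df(a)(h)| / |h| for shrinking h. The function name `tangency_ratio` and the sample point a = 1 are assumptions made for illustration only.

```python
import math


def f(x):
    return 3 * x**2 * math.cos(x)


def Df(a):
    """Linear map h |-> f'(a) * h with f'(a) = 6a cos(a) - 3a^2 sin(a)."""
    slope = 6 * a * math.cos(a) - 3 * a**2 * math.sin(a)
    return lambda h: slope * h


def tangency_ratio(a, h):
    """|f(a + h) - f(a) - Df(a)(h)| / |h|; should tend to 0 as h -> 0."""
    return abs(f(a + h) - f(a) - Df(a)(h)) / abs(h)


ratios = [tangency_ratio(1.0, 10.0 ** (-k)) for k in range(1, 6)]
```

The ratios shrink roughly in proportion to h, which is the behaviour one expects when Df(a) is the correct linear approximation.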
5. Let f, g ∈ L(R, R). Show that f ◦ g ∈ L(R, R) and express its slope in terms
of the slopes of f and g.
(a) What does the chain rule of elementary calculus tell you about
(f ∘ g)′(a)?
(b) Hence prove that
This is the form the chain rule takes for Fréchet derivatives.
10.
Next we generalize the notion of linear (and affine) maps to arbitrary vector
spaces; a preliminary step to generalizing the Fréchet derivative. For simplicity
all of our vector spaces will be over the field R.
There are two aspects to linearity, the algebraic and the geometric. First we
consider the algebraic: each vector space has two operations; addition and scalar
multiplication, so a linear map should preserve these operations.
Definition 2.0.2 Let V and W be vector spaces over the same field of scalars.
By saying that a map L : V → W is linear, we mean that for all vectors x, y ∈ V
and all scalars λ ∈ R:
(i) L(x + y) = L(x) + L(y) and
(ii) L(λx) = λL(x)
Notice: x + y ∈ V , L(x) + L(y) ∈ W
λx ∈ V , L(λx) ∈ W and L(x) ∈ W .
(†) Verify that in the case V = W = R, this definition is equivalent to Definition
(1.1.1): i.e. prove that if L = mId then L satisfies Definition (2.0.2) and
conversely prove that if L is a map which satisfies Definition (2.0.2) then L = mId
for some choice m ∈ R.
Indeed we can show the following three statements are true:
(1) If V = W = R, it can be shown that
a map L is linear ⇔ it has the form L = mId for some m ∈ R
(2) If V = R^2 and W = R, it can be shown that
a map L : R^2 → R is linear ⇔ it has the form L(x, y) = ax + by for some
a, b ∈ R
(3) If V = R^2 and W = R^2, it can be shown that
a map L : R^2 → R^2 is linear ⇔ it has the form L(x, y) = (ax + by, cx + dy)
for some a, b, c, d ∈ R
8 CHAPTER 2. LINEAR AND AFFINE MAPS
(a) L maps R2 onto R2 : in this case it can be shown that each “grid” (equally
spaced parallel lines) maps onto a grid.
(*) x = x1 e1 + · · · + xm em ,
2.1. THE MATRIX OF A LINEAR MAP 9
Definition 2.1.1 The matrix with ijth entry L_i(e_j) is called the matrix of the
linear map L (relative to the usual basis). We denote this matrix [L].
Example: A linear map L : R^2 → R^2 sends points e_1 and e_2 to the points (2, 2)
and (1, 3). Find [L] (relative to the usual basis) and use it to calculate L((3, 1)).
We have
L(e_1) = (2, 2) = (L_1(e_1), L_2(e_1)) and
L(e_2) = (1, 3) = (L_1(e_2), L_2(e_2)) so
[L] = [ 2  1 ]
      [ 2  3 ].
Hence
L((3, 1))^T = [ 2  1 ] [ 3 ]  =  [ 7 ]
              [ 2  3 ] [ 1 ]     [ 9 ].
Indeed in this example, for x = (x, y),
L(x) = (L(x)^T)^T = (2x + y, 2x + 3y).
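The recipe in this example (the jth column of [L] is L(e_j)) can be sketched in a few lines. The helper names `matrix_from_images` and `apply` are hypothetical and chosen only for this illustration.

```python
def matrix_from_images(images):
    """Build [L] as a list of rows; column j is the image L(e_j).

    `images` lists L(e_1), L(e_2), ... as tuples.
    """
    n_rows = len(images[0])
    return [[img[i] for img in images] for i in range(n_rows)]


def apply(matrix, x):
    """Multiply the matrix by the vector x (given as a tuple)."""
    return tuple(sum(row[j] * x[j] for j in range(len(x))) for row in matrix)


L_matrix = matrix_from_images([(2, 2), (1, 3)])  # L(e_1) = (2, 2), L(e_2) = (1, 3)
image = apply(L_matrix, (3, 1))                  # L((3, 1))
```

With the data of the example this reproduces the matrix with rows (2, 1) and (2, 3) and the image (7, 9).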
The map L is onto when it has full rank.
To assist with the proof of the statement: “ If V = W = R2 , it can be shown
that a map L is linear ⇔ it has the form L(x, y) = (ax + by, cx + dy) for some
a, b, c, d ∈ R”: let L : R2 → R2 be linear and suppose that L maps e1 and e2 to
the points (1, 1) and (−1, 1). Now find a formula for L(x) valid for each x ∈ R2 ,
i.e. let x = (x, y) so x = xe1 + ye2 then
L(x) = L(xe1 ) + L(ye2 )
= x(1, 1) + y(−1, 1) = (x − y, x + y)
which tells you what to choose for a, b, c and d.
Returning now to matrices and linear maps:
L(x)T = AxT .
The map L is then linear and its matrix (relative to the usual basis) is given by
[L] = A.
Proof shortly. In the meantime a few more examples to assist with the exercises.
There is an equivalent formulation of the notion of linearity given in Definition
(2.0.2).
Definition (2.0.2)’: A map L : V → W is linear ⇔ L(x+λy) = L(x)+λL(y),
for all x, y ∈ V and λ ∈ R (a field).
You will find both definitions useful to address different problems.
Let’s see that a map L : R2 → R2 which reflects each point across the vertical
axis is linear.
[Figure: the reflection L sends the points x, y and x + y to L(x), L(y) and L(x + y).]
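A spot check of Definition (2.0.2)' for the reflection across the vertical axis might look as follows. This is only a sketch on a few integer sample points and is no substitute for the proof; the names `reflect`, `shift` and `satisfies_linearity` are assumptions.

```python
def reflect(p):
    """Reflection across the vertical axis: (x, y) |-> (-x, y)."""
    x, y = p
    return (-x, y)


def add(p, q):
    return (p[0] + q[0], p[1] + q[1])


def scale(lam, p):
    return (lam * p[0], lam * p[1])


def satisfies_linearity(L, x, y, lam):
    """Check L(x + lam*y) == L(x) + lam*L(y) for one choice of x, y, lam."""
    return L(add(x, scale(lam, y))) == add(L(x), scale(lam, L(y)))


samples = [((1, 2), (3, -1), 2), ((0, 0), (5, 4), -3), ((-2, 7), (1, 1), 0)]
all_ok = all(satisfies_linearity(reflect, x, y, lam) for x, y, lam in samples)


def shift(p):
    """A translation, included as a map that is affine but not linear."""
    return (p[0] + 1, p[1])
```

The same test applied to `shift` fails, which is consistent with translations not being linear.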
by the linearity of L. So the image of the two points in l is two points on the
line through L(u) parallel to L(v). Thus L(l) is a line in R2 .
Proof of Theorem (2.1.3): Suppose L : Rm → Rn , M : Rn → Rp are linear.
Let x, y ∈ Rm , λ ∈ R, so x + λy ∈ Rm (it’s a vector space). Now
So M ◦ L is linear.
Let [L] (resp. [M ]) be the matrix for L (resp. M ). Then
Also
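The statement of Theorem (2.1.3) is not reproduced in this part of the notes; assuming it asserts that the matrix of M ∘ L is the product [M][L], a small numerical illustration could look like this. The matrix `M_mat` is a hypothetical example (rotation by a quarter turn), while `L_mat` is the matrix from the worked example in Section 2.1.

```python
def apply(matrix, x):
    """Multiply the matrix (list of rows) by the vector x."""
    return tuple(sum(row[j] * x[j] for j in range(len(x))) for row in matrix)


def matmul(M, L):
    """Matrix product M L, with both matrices given as lists of rows."""
    return [[sum(M[i][k] * L[k][j] for k in range(len(L)))
             for j in range(len(L[0]))] for i in range(len(M))]


L_mat = [[2, 1], [2, 3]]    # matrix of L(x, y) = (2x + y, 2x + 3y)
M_mat = [[0, -1], [1, 0]]   # hypothetical second map: rotation by 90 degrees

x = (3, 1)
via_composition = apply(M_mat, apply(L_mat, x))   # M(L(x))
via_product = apply(matmul(M_mat, L_mat), x)      # ([M][L]) x
```

Both routes give the same point, as the assumed form of the theorem predicts.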
Every linear map L : V → W sends the zero vector in V to the zero vector in
W . Thus c = A(0), where 0 ∈ V .
In fact A(x) − A(0) = L(x) i.e. the linear map is uniquely determined by the
affine map.
Examples:
[Figure: the images A((1, 1)), A((−1, 1)), A((1, −1)) and A((−1, −1)) of four points under an affine map A.]
1. Prove (from the definition) that every linear map L : V → W maps the zero
element of V to the zero element of W .
2.
(a) Generalise the statements (1), (2) and (3) on page 7 to give the form
of a linear map L : Rm → R with respect to the usual basis for Rm .
Show that your choice of L is indeed linear.
(b) Now generalise to give the form of a linear map L : Rm → Rn with
respect to the usual basis for Rm . Again show that your choice of L
is linear.
2.2. AFFINE MAPS BETWEEN ARBITRARY VECTOR SPACES 13
3. In each case state whether the map L with domain R2 is linear. If not, give
a counterexample. If so, give a proof.
(a) L(x) = L(x, y) = (2y, 4x − y),
(b) L(x) = L(x, y) = x + 4y + 1,
(c) L(x) = L(x, y) = x^2 + 2x − y,
(d) L(x) = L(x, y) = (x + y, −3xy),
(e) L(x) = L(x, y) = (x^2 − y^2)/(x + y) if x + y ≠ 0, and 2x otherwise
(f) L : Mn×n (R) → Mn×n (R) with L(A) = AT + BA, where B is a fixed
n × n matrix.
4. Sketch the graph of the linear map L : R2 → R with L(x, y) = −3x + 2y.
5. Let L : R2 → R2 be a linear map and let {e1 , e2 } be the usual orthonormal
basis for R2 . Suppose that L(e1 ) = (4, 1) and L(e2 ) = (1, 4).
(a) On an arrow diagram, show the effect of L on a unit grid in the
domain.
(b) Find a formula giving the value of L at an arbitrary point x in R2 .
6. Prove that if L : V → W is linear, then its kernel {x ∈ V : L(x) = 0W } is
a vector subspace of V .
7. A linear map L : R^2 → R^2 has its matrix relative to the usual basis given as
[L] = [ 2  6 ]
      [ 1  4 ].
Find the point to which this map sends (1, −1).
Now use theorem (2.1.3) to find two different expressions for its matrix
and thence derive the elementary addition formulae for cos and sin.
11. Verify that the formula A(x) = −1 + 5x defines an affine map A : R → R
which is not linear.
15. Recall that a map has an inverse if and only if the map is 1-1 and onto.
Prove that an affine map A : V → W has an inverse if and only if the
same is true of its linear part L. [Hint: Express A^{−1} in terms of L^{−1}.]
Chapter 3
For linear maps, the study of continuity assumes a particularly simple form: if a
linear map is continuous at the origin (the zero vector) then it is continuous at
every point of its domain. Moreover, all linear maps between Euclidean spaces
L : Rn → Rm are continuous.
Definition 3.0.2 An open ball in a normed vector space (V, || ||) is a set of
the form
{x ∈ V : ||x − a|| < r}
for some a ∈ V and real r > 0.
a is the centre of the ball, r its radius.
{x ∈ V : ||x − a|| < r} is denoted B_r(a) (the open ball)
{x ∈ V : ||x − a|| ≤ r} is denoted B̄_r(a) (the closed ball)
Proposition 3.0.3 (a) Each open ball contains its centre.
(b) Each open ball in V is a subset of V.
(c) For each a ∈ V and r > 0, B_r(a) ⊆ B̄_r(a).
(d) If 0 < r ≤ s then B_r(a) ⊆ B_s(a).
16 CHAPTER 3. CONTINUITY AND LIMITS
Theorem 3.0.6 (1) the union of an arbitrary collection of open sets is open
and
(2) the intersection of finitely many open sets is open
our definition of open set determines a topology on V .
[Figure: f sends a point x of a δ-ball about a to f(x) in an ε-ball about l; for every ε there exists such a δ.]
Thus f(x) lies on the unit circle when x ≠ 0; i.e. f(R^2\{0}) is the unit circle
in R^2.
3.1. CONTINUITY AND LIMITS. 17
(a) Notice that f(B_{1/2}((0, 0))) is the unit circle S^1 together with the point 0.
Proof: Let y ∈ f(B_{1/2}((0, 0))), i.e.
y = f(x) for some x with ||x|| < 1/2, i.e.
y = 0 or y = (1/||x||) x for some x with ||x|| < 1/2.
In the first case y ∈ {0} and in the second case ||y|| = 1, and so y ∈
S^1 ∪ {0}.
Conversely, suppose y ∈ S^1 ∪ {0}, i.e. y = 0 or ||y|| = 1.
In the first case choose x = 0 = (0, 0) ∈ B_{1/2}((0, 0)) (in the domain).
Then f(x) = y and so there's an x ∈ B_{1/2}((0, 0)) such that f(x) = y; i.e.
y ∈ f(B_{1/2}((0, 0))).
In the second case choose x = (1/4) y. Then x ∈ B_{1/2}((0, 0)), since ||x|| =
||(1/4) y|| = 1/4 < 1/2. And so y = (1/||x||) x = f(x) for this choice of x ∈ B_{1/2}((0, 0)).
Thus f(B_{1/2}((0, 0))) = S^1 ∪ {0}.
(b) Also f(B_a((a, 0))), a > 0, is the right half of the unit circle; i.e. {(y_1, y_2) ∈
R^2 : ||(y_1, y_2)|| = 1 and y_1 > 0}.
Proof: Let y ∈ f(B_a((a, 0))); i.e. y = f(x) for some x with ||(x_1 −
a, x_2)|| < a. Now x ≠ 0 and x_1 > 0, since |x_1 − a| < a.
Thus by the definition of f, y = (1/||x||) x, which gives ||y|| = 1, and since
x_1 > 0 we have y_1 > 0; i.e. y ∈ {(y_1, y_2) ∈ R^2 : ||(y_1, y_2)|| = 1 and y_1 > 0}.
Conversely suppose y ∈ {(y_1, y_2) ∈ R^2 : ||(y_1, y_2)|| = 1 and y_1 > 0}.
Choose x = a y_1 (y_1, y_2); then
||x − (a, 0)||^2 = a^2 (y_1^2 − 1)^2 + a^2 y_1^2 y_2^2 = a^2 y_2^2 (y_2^2 + y_1^2) = a^2 y_2^2 < a^2,
since y_1 > 0 and ||y|| = 1 force y_2^2 < 1.
We have x = a y_1 (y_1, y_2) ∈ B_a((a, 0)). Now f(x) ∈ f(B_a((a, 0))) and
f(x) = x/||x||. So
f(a y_1 (y_1, y_2)) = a y_1 (y_1, y_2) / (a y_1 ||(y_1, y_2)||) = (y_1, y_2) = y
(since a > 0, y_1 > 0 and ||(y_1, y_2)|| = 1), i.e. y ∈ f(B_a((a, 0))).
Thus {(y_1, y_2) ∈ R^2 : ||(y_1, y_2)|| = 1 and y_1 > 0} = f(B_a((a, 0))) as
required.
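The construction in part (b) can be checked numerically at a sample point. The sketch below assumes, as the proofs above indicate, that f(x) = x/||x|| for x ≠ 0 and f(0) = 0; the values a = 2 and θ = 1 and the helper name `preimage_in_ball` are arbitrary choices for illustration.

```python
import math


def f(x):
    """Radial projection: x/||x|| for x != 0, and 0 at the origin."""
    n = math.hypot(x[0], x[1])
    return (0.0, 0.0) if n == 0 else (x[0] / n, x[1] / n)


def preimage_in_ball(y, a):
    """The point x = a*y1*(y1, y2) from part (b); assumes ||y|| = 1 and y1 > 0."""
    return (a * y[0] * y[0], a * y[0] * y[1])


a = 2.0
theta = 1.0                                # any angle in (-pi/2, pi/2) gives y1 > 0
y = (math.cos(theta), math.sin(theta))     # a point on the right half of the circle
x = preimage_in_ball(y, a)
in_ball = math.hypot(x[0] - a, x[1]) < a   # is x in B_a((a, 0))?
fx = f(x)
```

For this sample, x lies in the ball and f(x) agrees with y up to rounding.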
To express the statement that a function f : S ⊆ V → W is continuous at a we
simply replace l by f (a) in Definition (3.1.1). We have
Theorem 3.1.3 If V ≠ {0} and S ∪ {a} is an open set in V then the function
f : S ⊂ V → W has at most one limit l at a.
Lemma (A): If V ≠ {0} and S ∪ {a} is an open set in V then every open ball
B_r(a) in V contains some point different from a.
Lemma (B): Every open ball Br (a) of centre a in V contains a point of S
different from a.
The proof of Lemma (A) is an exercise.
Proof of Lemma (B): Let Br (a) be an open ball of centre a in V .
We are to show that Br (a) ∩ S contains a point other than a, where S =
domf .
[Figure: the ball of radius r about a, together with a ball of radius δ_1 about a contained in S ∪ {a}.]
From the diagram we see that choosing δ = min{r, δ1 } where δ1 is the radius
of the open ball centred on a and contained in S ∪ {a} (which exists because
S ∪ {a} is open) will work.
Let B_{δ_1}(a) be an open ball centred on a with B_{δ_1}(a) ⊆ S ∪ {a} (open).
Let δ = min{r, δ_1}; then B_δ(a) ⊆ B_r(a) and B_δ(a) ⊆ B_{δ_1}(a) ⊆ S ∪ {a}.
Thus B_δ(a) ⊆ B_r(a) ∩ (S ∪ {a}).
By Lemma (A), B_δ(a) contains a point of V other than a. Let y ≠ a be in
B_δ(a).
Then y ∈ S ∪ {a}; indeed y ∈ S, since y ≠ a.
Thus every open ball of centre a contains a point of S other than a.
Proof of Theorem (3.1.3): Suppose V ≠ {0} and S ∪ {a} is an open set in
V. Let f : S ⊆ V → W and suppose l_1 and l_2 ∈ W are both limits of f at a.
Let ε > 0.
Let δ_1 > 0 be the number for which
(∀x ∈ S ∪ {a}), x ∈ B_{δ_1}(a) ⇒ ||f(x) − l_1|| < ε/2    (i)
Let δ_2 > 0 be the number for which
(∀x ∈ S ∪ {a}), x ∈ B_{δ_2}(a) ⇒ ||f(x) − l_2|| < ε/2    (ii)
Choose δ = min{δ_1, δ_2}; so B_δ(a) = B_{δ_1}(a) ∩ B_{δ_2}(a). By Lemma (B), B_δ(a)
contains a point of S other than a.
Let y ∈ S, y ≠ a and y ∈ B_δ(a), so that y ∈ B_{δ_1}(a) and y ∈ B_{δ_2}(a).
So by (i) ||f(y) − l_1|| < ε/2 and by (ii) ||f(y) − l_2|| < ε/2. Thus ||l_1 − l_2|| ≤
||f(y) − l_1|| + ||f(y) − l_2|| < ε.
But since this holds for all ε > 0, we have ||l_1 − l_2|| = 0, i.e. l_1 = l_2 as required.
Theorem 3.1.4 Properties of limits (and continuity).
(1) Translation of the origin:
(a) (in the domain)
Choose δ > 0 for which for all x ∈ S with ||x − a|| < δ we have ||f(x) − l|| <
ε. Thus g(x) < ε and so ||g(x)|| < ε, since g ≥ 0; i.e. lim_{x→a} g(x) = 0, i.e.
lim_{x→a} ||f(x) − l|| = 0 as required.
Conversely suppose that
Let ε > 0 and choose δ > 0 for which for all x ∈ S with ||x − a|| < δ we
have ||f(x) − l|| < ε
(i.e. || ||f(x) − l|| − 0 || < ε). That is lim_{x→a} f(x) = l.
(2) Suppose that ||g(x)|| ≤ ||f(x)|| for x ∈ dom g ⊆ dom f. And suppose that
lim_{x→a} f(x) = 0.
In fact one can replace Rn and Rm with any Banach space of finite dimension.
Proof of 3.2.1: We have L : V → W, a ∈ V. Suppose that L is continuous at
a ∈ V; i.e.
∀ε > 0 ∃δ > 0 such that ∀x ∈ V with ||x − a|| < δ, ||L(x) − L(a)|| < ε.
So if ||L(x)|| < 1 when ||x|| < δ, then for x ≠ 0, the point δx/(2||x||) ∈ B_δ(0)
and ||L(δx/(2||x||))|| = |δ/2| ||L(x/||x||)|| < 1; and if x = 0, then L(x) = 0; so choose
c = 2/δ.
Let δ > 0 be the radius of an open ball centred on 0 such that ∀x ∈ B_δ(0)
we have L(x) ∈ B_1(0). (This is OK because L is assumed continuous at 0.)
Choose c = 2/δ. Now if x = 0, then L(0) = 0 and so 0 = ||L(x)|| ≤ (2/δ)||x|| = 0.
Suppose x ≠ 0. Then δx/(2||x||) ∈ V, since it's a vector space. And ||δx/(2||x||)|| =
|δ/(2||x||)| ||x|| ≤ (δ/(2||x||)) ||x|| < δ. And so ||L(δx/(2||x||))|| < 1, i.e. (δ/(2||x||)) ||L(x)|| < 1, i.e.
||L(x)|| ≤ c||x|| as required.
Conversely suppose that there exists c ∈ R such that ∀x ∈ V, ||L(x)|| ≤ c||x||.
Let B_ε(0) be an open ball centred on 0 for some ε > 0. Choose δ = ε/c, then
x ∈ B_{ε/c}(0) ⇒ ||x|| < ε/c ⇒ c||x|| < ε ⇒ ||L(x)|| < ε and so L(x) ∈ B_ε(0).
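The choice c = 2/δ in the proof can be tried out on a concrete map. The sketch below uses the linear map L(x, y) = (2x + y, 2x + 3y) from Chapter 2 and obtains a suitable δ from the Frobenius norm of its matrix; that step is an assumption made here for illustration and is not part of the proof in these notes.

```python
import math

A = [[2.0, 1.0], [2.0, 3.0]]  # matrix of L(x, y) = (2x + y, 2x + 3y)


def L(x):
    return tuple(sum(row[j] * x[j] for j in range(2)) for row in A)


def norm(v):
    return math.hypot(v[0], v[1])


# Assumption: the Frobenius norm F satisfies ||L(x)|| <= F * ||x||
# (Cauchy-Schwarz on each row), so delta = 1/F sends B_delta(0) into B_1(0).
F = math.sqrt(sum(entry**2 for row in A for entry in row))
delta = 1.0 / F
c = 2.0 / delta  # the constant chosen in the proof

points = [(1.0, 0.0), (0.0, 1.0), (3.0, -4.0), (-0.5, 0.25), (1e-3, 2e-3)]
bound_holds = all(norm(L(p)) <= c * norm(p) for p in points)
```

The bound ||L(x)|| ≤ c||x|| holds at every sample point, although c = 2/δ is far from the smallest constant that works.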
1. Sketch each of the following sets and decide if they are open sets. Give brief
reasons for your decision.
(a) The open interval (−1, 1) in the normed vector space (R, | |).
(b) The closed interval [−1, 1] in R.
(c) The interval [0, ∞) in R.
(d) The line {(2x, x) : x ∈ R} in R2 .
(e) The square (0, 1] × [0, 1) in R2 .
3.2. CONTINUITY FOR LINEAR MAPS: 25
2. Prove that each of the following sets is open in the appropriate vector space
with usual norm.
6. Find the derivative of the function Id^2 at the point a ∈ R from first
principles. What is the domain S of the Newton quotient (Id^2(x) − Id^2(a))/(x − a)?
Is S ∪ {a} an open subset of R?
7. Verify that each of the following functions has the stated limit.
8. Let L : R^2 → R be given by
L(x) = L(x, y) = (x^2 − y^2)/(x + y) if x + y ≠ 0, and 2x otherwise
9. Assume that S ∪ {a} is an open subset of V and that V ≠ {0}. Use the
fact that V must contain some point different from 0, to prove Lemma A:
that every open ball B_r(a) in V contains some point different from a.
10. (i) In theorem (3.1.4), prove: translation of origin in the domain, (ii)
complete the proof of components and (iii) prove the statement on limit
of a composite.
(iv) Extend the statement on limit of a composite to: Let f : U → V ,
g : V → W and suppose that f is continuous. Then g is continuous if and
only if limx→a g ◦ f (x) = g(limx→a f (x)). (continuity via limits) Prove it.
13. Once you get away from vector spaces of finite dimension, it is no longer
true that every linear map is automatically continuous. For example, let
C_00 = {f ∈ F(N, R) : f(n) = 0 for all sufficiently large n}, where F(N, R)
is the vector space of all functions from N to R with the usual pointwise
operations. Another way of expressing the condition for f to belong to
C_00 is that {n : f(n) ≠ 0} is finite.
Then define L : C_00 → R by L(f) = Σ_{n=1}^∞ n f(n).
Notice that the sum on the right hand side is in fact a finite sum. Now
(a) Check that L is linear.
(b) Show that {|L(f )| : ||f || ≤ 1} is not bounded above. Then deduce
that L is not continuous, from the following theorem:
Theorem 4.1.1 L(V, W ) is a vector space with respect to the usual operations.
28 CHAPTER 4. SPACES OF LINEAR MAPS
Let x, y ∈ V , µ ∈ R. Then
Hence L + M is linear.
Similarly
(λL)(x + µy) = λ(L(x + µy))
= λ(L(x) + µL(y))
= λL(x) + λµL(y)
and so λL is linear.
Finally dom(L + M ) = V = dom(λL) and codom(L + M ) = W = codom(λL).
Thus L + M and λL are in L(V, W ).
From 3.2.3 we have that every linear map from Rm to Rn is continuous. Hence
L(Rm , Rn ) = {L : Rm → Rn : L is linear}.
{(1, 0)Id1 , (0, 1)Id1 , (1, 0)Id2 , (0, 1)Id2 } = {(Id1 , 0), (0, Id1 ), (Id2 , 0), (0, Id2 )}
Theorem 4.1.2 For each continuous linear map L : V → W , the set {||L(x)|| :
||x|| ≤ 1} is bounded above.
||L(x)||W = ||L(0)||W = ||(0)||W since a linear map sends the zero to zero
= 0 by positivity
The maximum distance of L(x) from the origin occurs at a vertex of the ellipse,
i.e. ||L|| := sup_{||x||≤1} ||L(x)|| = maximum value of ||L(x)|| as x moves around
the unit circle = length of the semi-major axis of the ellipse.
More generally ||L|| is the radius of the image L(B1 (0)) of the unit ball of V i.e.
||L|| is the radius of the smallest closed ball in W with centre 0, which contains
the image of the closed unit ball in V .
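For a linear map of R^2, the description of ||L|| as the largest value of ||L(x)|| on the unit circle suggests a crude numerical estimate: sample the circle and take the maximum. The sketch below does this for the hypothetical example L(x, y) = (2x + y, 2x + 3y); the sample count N is an arbitrary choice.

```python
import math

A = [[2.0, 1.0], [2.0, 3.0]]  # hypothetical example: L(x, y) = (2x + y, 2x + 3y)


def image_norm(theta):
    """||L(x)|| for the unit vector x = (cos(theta), sin(theta))."""
    x, y = math.cos(theta), math.sin(theta)
    return math.hypot(A[0][0] * x + A[0][1] * y, A[1][0] * x + A[1][1] * y)


N = 100000  # number of sample points on the unit circle
estimate = max(image_norm(2 * math.pi * k / N) for k in range(N))
```

The estimate approaches ||L|| from below as N grows, since the sampled maximum can never exceed the supremum.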
Exercise 4
9. By a product of V × W → U we mean a map ∗ : V × W → U, (v, w) ↦ v ∗ w,
satisfying the following:
P1. If v, v′ ∈ V and w ∈ W, then (v + v′) ∗ w = v ∗ w + v′ ∗ w and
if v ∈ V and w, w′ ∈ W, then v ∗ (w + w′) = v ∗ w + v ∗ w′.
P2. If c ∈ R, then (cv) ∗ w = c(v ∗ w) = v ∗ (cw).
P3. For all v, w we have ||v ∗ w||_U ≤ ||v||_V ||w||_W.
Let V, W be normed vector spaces. Show that the evaluation map
is a product.
This says that L satisfies a Lipschitz inequality with Lipschitz constant c.]
Chapter 5
Tangency Of Maps
The next step is to understand tangency of maps between normed vector spaces.
This is our main application of the notion of limit.
In elementary calculus, a function f : R → R is said to be differentiable at a
point a of its domain, if there is a line which is tangent to the graph of the
function at the point (a, f (a)). This line is the graph of some affine function
A : R → R.
Recall linear functions L : R → R, x ↦ mx, x ∈ R and
affine functions A : R → R, x ↦ mx + c, m, c ∈ R.
For a pair of functions f : R → R and g : R → R, if g is tangent to f at a then
the ratio |f(x) − g(x)| / |x − a| is very small as x approaches a.
[Figure: graphs of f and g near a, showing the values f(x) and g(x) at a nearby point x.]
Examples.
A(x) = c_A + L_A(x)
B(x) = c_B + L_B(x).
lim_{x→a} ||A(x) − B(x)|| / ||x − a|| = 0.
Now, since A(a) = B(a) gives c_B − c_A = (L_A − L_B)(a),
||A(x) − B(x)|| = ||L_A(x) − L_B(x) − (c_B − c_A)||
               = ||(L_A − L_B)(x) − (L_A − L_B)(a)||
               = ||L_A(x − a) − L_B(x − a)||.
Thus
0 = lim_{x→a} ||A(x) − B(x)|| / ||x − a||
  = lim_{x→a} ||L_A(x − a) − L_B(x − a)|| / ||x − a||
  = lim_{x→a} ||(L_A − L_B)(x − a)|| / ||x − a||.
Id^4 : R → R : x ↦ x^4 and
−Id^2 : R → R : x ↦ −x^2
are tangent at 0 ∈ R.
Verify that these examples do indeed satisfy the definition of tangency.
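Before attempting the verification by hand, it can help to look at the tangency quotient numerically. The sketch below evaluates |f(x) − g(x)| / |x − a| for f = Id^4, g = −Id^2 and a = 0 at points approaching 0; the name `tangency_quotient` is an assumption.

```python
def f(x):
    return x**4


def g(x):
    return -x**2


def tangency_quotient(x, a=0.0):
    """|f(x) - g(x)| / |x - a|, which should tend to 0 as x -> a."""
    return abs(f(x) - g(x)) / abs(x - a)


quotients = [tangency_quotient(10.0 ** (-k)) for k in range(1, 6)]
```

Here the quotient equals |x^3 + x|, so it shrinks like |x|. This only suggests the limit; the exercise still asks for a proof from the definition.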
3. Suppose that f : R → R and g : R → R are related by the formula
f(x) = g(x) + (x − a)^n. For which values of n > 0 (n an integer) is f
tangent to g at a?
4. Verify that the relation “. . . is tangent to . . . at a” is an equivalence relation
on the functions which map neighbourhoods of the point a in V into W .
(Recall that an equivalence relation “∼” on a set S is a relation such that
for all x, y and z in S
(i) x ∼ x (ii) x ∼ y ⇒ y ∼ x
(iii) x ∼ y and y ∼ z ⇒ x ∼ z.
5. Suppose that a continuous map g : U → W is tangent at a to a map
f : U → W . Prove that f is also continuous at a.
6. Let f : R → R be the absolute value function and let g : R → R be an affine
map for which g(0) = 0. Show that it is not possible for g to be tangent
to f at 0.
7. Prove lemma 5.1.2. (Hint: Given a non-zero vector h, there is a non-zero
vector v such that h = tv for some scalar t. Furthermore h → 0 as t → 0.)
Chapter 6
Concept of Derivative
Notice in the following example; there does not exist a plane which is tangent
to f at 0.
since
A(a + h) = c + L(a + h) = A(a) + L(h)
= f (a) + L(h).
And so we have an alternate definition of differentiability
f′(a) = lim_{h→0} (f(a + h) − f(a))/h
(it’s a real number) and
Finding Df (a).
Example 1.
Let f : V → W be a continuous linear map. We will show that this map is
differentiable at each point a ∈ V and we will find Df (a).
Let a ∈ V .
Choose La = f .
Then (i) La : V → W is a continuous linear map (by assumption) and (ii)
f(a + h) − f(a) = f(a + h, b + k) − f(a, b)
= (a + h)^2 + (b + k)^2 − a^2 − b^2
= 2ah + 2bk + h^2 + k^2,
where 2ah + 2bk is the part which is linear in h and k. So choose L(h) =
2ah + 2bk.
Choose La (h) = 2ah + 2bk. Now you should check that this is a linear map.
Then by theorem 3.2.3, it is continuous.
Also
lim_{h→0} ||f(a + h) − f(a) − L_a(h)|| / ||h|| = lim_{h→0} |h^2 + k^2| / ||(h, k)|| = lim_{h→0} √(h^2 + k^2) = 0.
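The computation above says that, for f(x, y) = x^2 + y^2 and L_a(h, k) = 2ah + 2bk, the quotient is exactly the length of (h, k). A short numerical sketch, with an arbitrarily chosen base point and increment, illustrates this.

```python
import math


def f(x, y):
    return x**2 + y**2


def La(a, b):
    """Candidate derivative at (a, b): the linear map (h, k) |-> 2ah + 2bk."""
    return lambda h, k: 2 * a * h + 2 * b * k


def quotient(a, b, h, k):
    """||f(a+h) - f(a) - La(h)|| / ||h|| for the point (a, b) and increment (h, k)."""
    return abs(f(a + h, b + k) - f(a, b) - La(a, b)(h, k)) / math.hypot(h, k)


r = quotient(1.0, -2.0, 3e-3, 4e-3)  # increment of length 5e-3
```

Up to rounding, `r` equals the length 5e-3 of the increment, in agreement with the formula √(h^2 + k^2).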
Example 3.
Let f : R^2 → R^2 be given by f(x, y) = (x^2 + y^2 + 1, xy + 1). Let U be an open
rectangle in R^2, a = (a_1, a_2) a point in U, h = (h_1, h_2) ∈ R^2. Then
1. Let V and W be normed vector spaces. In each case show that the map
is differentiable at each point a of its domain, by finding an affine map
which is tangent to the given map at a:
(a) a constant map c : V → W
(b) a continuous affine map B : V → W and
(c) a continuous linear map L : V → W.
2. Deduce from question 1 and other results that every linear map and every
affine map from Rm to Rn is differentiable at every point of its domain.
For each of the given maps in question 1, find its Fréchet derivative at a.
3. Let f : R^2 → R with f(x, y) = (x − 1)^2 + (y − 2)^2 + 3. Sketch the graph
of f and hence guess an affine map which is tangent to f at (1, 2). Hence
prove that f is differentiable at the point (1, 2).
Differentiation of
Composites
g◦f
and remember that by saying that two maps are tangent at a point we mean
(roughly) that one map is a good approximation to the other map near that
point.
Proof: That the composite of two affine maps is an affine map follows from the
result that the composite of two linear maps is a linear map.
Since g(f (a)) = B(f (a)) (B is tangent to g at f (a))
and B(f (a)) = B(A(a)) (A is tangent to f at a)
we have g(f (a)) = B(A(a)). It remains to prove that
The strategy is to find a quotient not less than the given one and use the
squeeze principle. The steps are:
(1) define two functions ε and η (satisfying (2)) and show that they are
continuous;
and by (3) and (4) the limit of both summands on the right is zero, thus
(1) Define ε : U → V, η : V → W by
ε(h) = (f(a + h) − A(a + h)) (1/||h||) if h ≠ 0, and ε(h) = 0 if h = 0
for each k ∈ V. The functions ε and η are clearly continuous away from 0
so we just need to show that ε : U → V is continuous at 0 ∈ U and
η : V → W is continuous at f(a) (i.e. k = 0).
Proof: Because f is tangent to A at a we know that
lim_{h→0} ε(h) = 0;
i.e. let ϵ > 0; then B_ϵ(0) is an open ball in V centred on 0, and we can
choose δ > 0 to be the radius of the open ball in U centred on 0 for
which for all h ∈ B_δ(0) we have ε(h) ∈ B_ϵ(0). In addition if h = 0 then
ε(h) = ε(0) = 0 and so again ε(h) ∈ B_ϵ(0).
By a similar argument we find η is continuous at f (a) ∈ V .
(2) We show that ε and η defined in (1) satisfy the given equation.
Let
k = LA (h) + ε(h)||h||
= LA (h) + f (a + h) − A(a + h) by definition of ε
= f (a + h) − f (a) since A(a + h) = f (a) + LA (h).
lim_{k→0} ||η(k)|| = 0.
as required.
Exercise 7
1. Simplify the chain rule D(g ◦ f )(a) = Dg(f (a)) ◦ Df (a) in each of the
following cases:
(a) g is continuous linear (b) f is continuous linear.
(c) g is a constant map (d) f is a constant map.
2. Let f : R2 → R3 and g : R3 → R be differentiable at the points a ∈ R2 and
b = f (a) ∈ R3 respectively. Suppose the matrices of the respective linear
maps are given by
[Dg(b)] = [ 2  4  3 ]
[Df(a)] = [ 2  3 ]
          [ 1  2 ]
          [ 5  6 ].
Find the matrix of the linear map D(g ◦ f )(a) and hence find the number
D(g ◦ f )(a)(h) where h = (2, 3).
3. Find the value at the point (x, y) of the function
Differentiability of
real-valued functions
Since we won’t always want to have to calculate derivatives from first principles,
just as in the elementary calculus, we will develop some rules for differentiation.
The rules only apply for real-valued functions; i.e. functions f : V → R. In
general this is still not an efficient way to calculate derivatives and so we will
in the next chapter examine the connection between the derivative Df (a), at a
point a, of a function f : V → W and the matrix of partial derivatives of f .
The usual algebraic operations on R induce corresponding pointwise operations
on real-valued functions. Let S and T be arbitrary sets and consider functions
f : S → R and g : T → R. We have
(f + g)(x) = f (x) + g(x); (cf )(x) = cf (x) (c ∈ R);
f.g(x) = f (x).g(x) and (f /g)(x) = f (x)/g(x).
Each of these functions will have S ∩ T as their domain except the last which
has S ∩ T \{x|g(x) = 0} as domain.
We can use the projection functions Id_j : R^m → R, x = (x_1, x_2, . . . , x_m) ↦ x_j,
1 ≤ j ≤ m, to build all of the polynomial and rational functions using the four
operations described in the previous paragraph. For example:
And
f : R^m → R described by f(x) = x_1^2 + x_2^2
is just the function Id_1^2 + Id_2^2.
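The way polynomial functions are assembled from projections by pointwise operations can be mimicked directly with functions as values. In the sketch below the helper names `Id`, `add` and `mul` are assumptions standing in for the projection Id_j and the pointwise sum and product.

```python
def Id(j):
    """Projection Id_j : R^m -> R, x |-> x_j (1-based index, as in the notes)."""
    return lambda x: x[j - 1]


def add(f, g):
    """Pointwise sum (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)


def mul(f, g):
    """Pointwise product (f.g)(x) = f(x).g(x)."""
    return lambda x: f(x) * g(x)


Id1, Id2 = Id(1), Id(2)
f = add(mul(Id1, Id1), mul(Id2, Id2))   # the function Id_1^2 + Id_2^2
value = f((3, 4, 7))                    # here m = 3; only x_1 and x_2 are used
```

Evaluating at (3, 4, 7) gives 3^2 + 4^2, independent of the third coordinate.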
The rules of differentiation for real-valued functions are similar to those of ele-
mentary calculus. And the following lemma is useful in proving the rules:
Lemma 8.0.8 Let f, g : R^m → R be differentiable at the point a ∈ R^m. Then
the function
(f, g) : R^m → R^2, x ↦ (f(x), g(x))
Now
0 ≤ √((f(a + h) − f(a) − Df(a)(h))^2 + (g(a + h) − g(a) − Dg(a)(h))^2) / ||h||
  ≤ ||f(a + h) − f(a) − Df(a)(h)|| / ||h|| + ||g(a + h) − g(a) − Dg(a)(h)|| / ||h||
and by assumption the summands on the right both have limit zero as h → 0.
So by the squeeze (sandwich) principle for limits, we have
lim_{h→0} ||(f, g)(a + h) − (f, g)(a) − (Df(a), Dg(a))(h)|| / ||h|| = 0.
Thus (Df (a), Dg(a)) is the derivative of (f, g) at a as required.
Lemma 8.0.9 (a) Id1 Id2 : R2 → R is differentiable everywhere in R2 and
We will prove some parts of the lemma and theorem; the rest are to be done as
exercises.
Proof of lemma 8.0.9(a):
Let a = (a1 , a2 ) ∈ R2 , h = (h1 , h2 ) ∈ R2 . Notice that
as required.
The proofs of the other parts of theorem 8.0.10 work similarly: express the
given function as a composite, then use lemmas 8.0.8 and 8.0.9 as appropriate.
And so to find the derivative of a given function at a point you now have 3
methods:
(a) find a continuous affine map which is tangent to the given map at the given
point, then find the linear part of that affine map,
(b) find a continuous linear map satisfying the limit condition
(c) express the given function in an appropriate form and use the results lemma
8.0.8, 8.0.9 and theorem 8.0.10.
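\[
D\arctan(a) = \arctan'(a)\,\mathrm{Id} = \frac{1}{1+a^2}\,\mathrm{Id}.
\]
By the chain rule we have
\[
\begin{aligned}
D(\arctan\circ(2\mathrm{Id}_1^2+\mathrm{Id}_2))(a)
&= D\arctan((2\mathrm{Id}_1^2+\mathrm{Id}_2)(a)) \circ D(2\mathrm{Id}_1^2+\mathrm{Id}_2)(a)\\
&= D\arctan(2a_1^2+a_2) \circ D(2\mathrm{Id}_1^2+\mathrm{Id}_2)(a) \quad\text{where } a=(a_1,a_2)\\
&= \frac{1}{1+(2a_1^2+a_2)^2}\,\mathrm{Id} \circ D(2\mathrm{Id}_1^2+\mathrm{Id}_2)(a).
\end{aligned}
\]
Now
\[
D(2\mathrm{Id}_1^2+\mathrm{Id}_2)(a) = 2D\mathrm{Id}_1^2(a) + D\mathrm{Id}_2(a) = 2\cdot 2a_1\mathrm{Id}_1 + \mathrm{Id}_2
\]
thus
\[
D(\arctan\circ(2\mathrm{Id}_1^2+\mathrm{Id}_2))(a) = \frac{1}{1+(2a_1^2+a_2)^2}\,(4a_1\mathrm{Id}_1 + \mathrm{Id}_2).
\]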
And so
So choose $L_a(h) = 2\sum_{i=1}^{n} a_i h_i$.
Then La is linear, since it is a linear combination of projection functions with real
scalars, and by theorem 3.2.3, La is continuous (being a map between Euclidean
spaces).
In addition
\[
\lim_{h\to 0}\frac{\|f(a+h)-f(a)-L_a(h)\|}{\|h\|}
= \lim_{h\to 0}\frac{\sum_{i=1}^{n} h_i^2}{\|h\|}
= \lim_{h\to 0}\sqrt{\sum_{i=1}^{n} h_i^2} = 0.
\]
Thus f is differentiable at a ∈ Rn. The map $L_a(h) = 2\sum_{i=1}^{n} a_i h_i$ satisfies the definition of
derivative and so $Df(a)(h) = 2\sum_{i=1}^{n} a_i h_i$; i.e. $Df(a) = 2\sum_{i=1}^{n} a_i \mathrm{Id}_i$.
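The limit computation above can be checked numerically. The sketch below assumes that the function of this example is f(x) = x_1^2 + ... + x_n^2 (this is what the remainder term suggests, but the statement of the example is not reproduced here); the point a and the direction of h are arbitrary choices made for illustration.

```python
import math

def f(x):
    # Assumed function of the example: sum of squares of the coordinates.
    return sum(xi * xi for xi in x)

def L_a(a, h):
    # Candidate derivative at a: L_a(h) = 2 * sum(a_i * h_i).
    return 2.0 * sum(ai * hi for ai, hi in zip(a, h))

def remainder_ratio(a, h):
    # |f(a+h) - f(a) - L_a(h)| / ||h||, the quantity whose limit must be 0.
    a_plus_h = [ai + hi for ai, hi in zip(a, h)]
    norm_h = math.sqrt(sum(hi * hi for hi in h))
    return abs(f(a_plus_h) - f(a) - L_a(a, h)) / norm_h

a = [1.0, -2.0, 0.5]            # arbitrary base point
direction = [3.0, 4.0, 12.0]    # arbitrary direction of norm 13
ratios = []
for k in range(1, 6):
    t = 10.0 ** (-k)
    h = [t * d for d in direction]
    ratios.append(remainder_ratio(a, h))
```

For this f the ratio equals ||h|| exactly, so the list `ratios` should shrink by about a factor of ten at each step.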
More generally the map
\[
\langle\,,\rangle : R^{2n} = R^n \times R^n \to R \quad\text{given by}\quad \langle x, y\rangle = \sum_{i=1}^{n} x_i y_i
\]
(a) From first principles show that < , > is differentiable and find the
derivative of this map at the point (a, b) = (a1 , a2 , . . . , an , b1 , . . . , bn ) ∈
R2n .
2Id1 + 3Id22 .
Our main use for partial derivatives will be to provide another (easier) method
of computing derivatives. Let f : U ⊆ Rm → R, U an open subset of Rm . By
allowing one of the xi in x = (x1 , x2 , . . . , xm ) to roam free while holding all of
the others fixed, we get a function xi 7→ f (x1 , . . . , xi , . . . , xm ); a function from
R to R. So its ordinary derivative is
Definition 9.0.11
\[
\lim_{h\to 0}\frac{f(x+he_i)-f(x)}{h} := f^{/i}(x),
\]
where ei = (0, 0, . . . , 1, 0, . . . , 0), the 1 being in the ith place.
(More conventionally $f^{/i}(x)$ is written as $\frac{\partial f}{\partial x_i}$, but in this course we will reserve
the ∂ notation for another sort of partial derivative.)
Example
Let $f(x_1, x_2) = x_1^2 + x_2^3$. Fix x2 ∈ R and consider the function which maps x1
to f (x1 , x2 ); i.e.
\[
x_1 \mapsto x_1^2 + x_2^3.
\]
Its derivative (with respect to x1) is 2Id. So $f^{/1}(x_1, x_2) = 2\mathrm{Id}(x_1) = 2x_1$.
Now fix x1 and consider the function which maps x2 to f (x1 , x2 ); i.e.
\[
x_2 \mapsto x_1^2 + x_2^3.
\]
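A difference quotient gives a quick numerical check of partial derivatives such as these. The sketch below is only an illustration: the helper name `partial` and the step size are assumptions, and it uses a symmetric difference quotient rather than the one-sided quotient of Definition 9.0.11 because it is more accurate in floating point.

```python
def f(x1, x2):
    # The function of the example: f(x1, x2) = x1^2 + x2^3.
    return x1 ** 2 + x2 ** 3

def partial(func, point, i, h=1e-6):
    # Symmetric difference quotient in the i-th coordinate (0-based),
    # holding all other coordinates fixed.
    up = list(point)
    down = list(point)
    up[i] += h
    down[i] -= h
    return (func(*up) - func(*down)) / (2 * h)

point = (1.5, -2.0)
d1 = partial(f, point, 0)   # compare with 2 * x1
d2 = partial(f, point, 1)   # compare with 3 * x2^2 (elementary calculus)
```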
56 CHAPTER 9. PARTIAL DERIVATIVES AND JACOBIAN MATRICES
y 3
f (x, y) = x2 + y 2
2
x
3
From the sketch it is clear that there does not exist an affine map which is
tangent to f at (0, 0). But f /1 (0, 0) = 0 and f /2 (0, 0) = 0. In fact one can show
that f /1 : R2 → R |(x, y) 7→ f /1 (x, y) (and by symmetry f /2 ) is not continuous
at (0, 0); to do this we have to find B = Bε (0) such that for every A = Bδ (0, 0)
9.1. FRÉCHET DIFFERENTIABILITY AND PARTIAL DERIVATIVES. 57
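and so
\[
\left| f^{/1}\!\left(\tfrac{\delta}{2}, \tfrac{\delta}{2}\right) - f^{/1}(0,0) \right|
= \left| f^{/1}\!\left(\tfrac{\delta}{2}, \tfrac{\delta}{2}\right) - 0 \right| \ge 1 > \tfrac{1}{2}.
\]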
f /i (x) = Df (x)(ei ).
Proof: Since f is differentiable at x, there exists the linear map Df (x) such
that
\[
\lim_{h\to 0}\frac{\|f(x+h)-f(x)-Df(x)(h)\|}{\|h\|} = 0.
\]
Let i be an integer: 1 ≤ i ≤ n. By the properties of limits (restriction of domain)
we may restrict the variable h and replace it with tei , t ∈ R. Rewriting the
limit condition we have
\[
0 = \lim_{t\to 0}\frac{\|f(x+te_i)-f(x)-Df(x)(te_i)\|}{\|te_i\|}
= \lim_{t\to 0}\left\| \frac{f(x+te_i)-f(x)}{t} - Df(x)(e_i) \right\|
\]
Now
∂1 f ((x, y), z) : R2 → R
(r, s) 7→ Df ((x, y), z)(re1 ) + Df ((x, y), z)(se2 ).
So
And
∂2 f ((x, y), z) : R → R
t 7→ Df ((x, y), z)(t) = tf /3 ((x, y), z) = tx.
So
Df ((x, y), z)((r, s), t) = r(2x + z) + s(2y) + tx.
Df (x)((p, q), (r, s)) = Df (x)((p, q), (0, 0)) + Df (x)((0, 0), (r, s))
= ∂1 f (x)((p, q)) + ∂2 f (x)((r, s)).
Now
∂1 f (x) : R2 → R
(p, q) 7→ Df (x)(pe1 ) + Df (x)(qe2 ).
So
And
∂2 f (x) : R2 → R
(r, s) 7→ Df (x)(r, s) = Df (x)(re3 ) + Df (x)(se4 )
So
∂2 f (x)(r, s) = r(wx)2 + s cos(xw).
Thus
(We have f /1 (x) = −zx sin(xw) + 2x2 yw; f /2 (x) = −2w sin(xw) + 2w2 yx ;
f /3 (x) = (xw)2 and f /4 (x) = cos(xw).)
which we will denote f'(x). This is consistent with our notation for functions
f : R → R, where we had Df (a)(h) = f'(a).h
Example
Let $f : R^2 \to R^2 \mid (x, y) \mapsto (x^2 + y^2, xy)$.
So f = (f1 , f2 ) where $f_1(x, y) = x^2 + y^2$ and $f_2(x, y) = xy$.
Thus $f_1^{/1}(x, y) = 2x$; $f_1^{/2}(x, y) = 2y$; $f_2^{/1}(x, y) = y$ and $f_2^{/2}(x, y) = x$. I.e.
\[
f'(x) = \begin{pmatrix} 2x & 2y \\ y & x \end{pmatrix}.
\]
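As a sanity check on this example, the Jacobian matrix can be approximated column by column with difference quotients and compared with the matrix above. The helper `jacobian_numeric` and the test point are hypothetical choices made for this sketch.

```python
def f(x, y):
    # The map of the example: (x, y) -> (x^2 + y^2, x*y).
    return (x * x + y * y, x * y)

def jacobian_exact(x, y):
    # Matrix of partial derivatives computed in the example.
    return [[2 * x, 2 * y], [y, x]]

def jacobian_numeric(func, x, y, h=1e-6):
    # Column 0: vary x; column 1: vary y (symmetric difference quotients).
    fx_up, fx_dn = func(x + h, y), func(x - h, y)
    fy_up, fy_dn = func(x, y + h), func(x, y - h)
    return [
        [(fx_up[i] - fx_dn[i]) / (2 * h), (fy_up[i] - fy_dn[i]) / (2 * h)]
        for i in range(2)
    ]

exact = jacobian_exact(1.0, 3.0)
approx = jacobian_numeric(f, 1.0, 3.0)
max_error = max(abs(exact[i][j] - approx[i][j]) for i in range(2) for j in range(2))
```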
The next theorem shows that the matrix of the linear map Df (x) is just the
Jacobian matrix f'(x). The result will be useful in calculating derivatives.
Theorem 9.3.2 Let f : U → Rn be differentiable at x ∈ U (an open subset of
Rm ). The Jacobian matrix f'(x) then exists and the Fréchet derivative
Df (x) : Rm → Rn has its values given by
\[
(Df(x)(h))^T = f'(x).h^T
\]
where . means matrix multiplication.
The significance of the theorem is that it provides another way of calculating
derivatives, but you still need to establish the existence of the derivative either
from first principles or from the rules (Lemmas 8.0.8,8.0.9 and theorem 8.0.10).
In the case of a map f : R → R, the Jacobian matrix is a 1 × 1 matrix and
hence may be regarded as a real number and identified with f'(x). In this case
$(Df(x)(h))^T = f'(x).h^T$ reduces to Df (x)(h) = f'(x).h where . is multiplication
of real numbers.
Proof of 9.3.2: Let f : U → Rn be differentiable at x ∈ U (an open subset
of Rm ). We need to show that the matrix (Df (x)) for the linear map Df (x) is
f'(x); the result then follows from theorem 2.1.2.
f : U → Rn ; f = (f1 , . . . , fn )
Df (x) : Rm → Rn ; Df (x) = (Df1 (x), . . . , Dfn (x)) by the generalisation of
lemma 8.0.8.
9.3. JACOBIAN MATRIX 61
Now the ijth element of (Df (x)) is $(Df(x))_i(e_j) = Df_i(x)(e_j)$.
And by theorem 9.1.1, $Df_i(x)(e_j) = f_i^{/j}(x)$. Thus the ijth element of (Df (x))
is $f_i^{/j}(x)$; i.e.
\[
[Df(x)] =
\begin{pmatrix}
f_1^{/1}(x) & \dots & f_1^{/m}(x) \\
\vdots & & \vdots \\
f_i^{/1}(x) & \dots & f_i^{/m}(x) \\
\vdots & & \vdots \\
f_n^{/1}(x) & \dots & f_n^{/m}(x)
\end{pmatrix}
:= f'(x)
\]
Many theorems of elementary calculus rely on continuity of f 0 , the analogue
here will be theorems which depend on the continuity of Df . This notion uses
the norm in L(V, W ), the space of continuous linear maps with supremum norm.
It is different to discussing the continuity of Df (x) which is by definition a con-
tinuous linear map. The next theorem gives necessary and sufficient conditions
for Df to be continuous.
We can now state the generalisation of theorems 9.1.1, 9.1.2 and 9.3.2.
\[
f_j^{/i}(x) = Df_j(x)(e_i)
\]
Proof: The details are left as an exercise. Let a ∈ Rm . Then for each i =
1, 2, . . . , m
Say why f is continuously differentiable and write down Df (x, y, z)(h, k, l),
where (h, k, l) ∈ R3 .
9. Let f : R2 × R3 → R with
(a) Assume that the function is differentiable and calculate its Jacobian
matrix at (x, y) ∈ R2 .
(b) Use the results, 8.0.8, 8.0.9, 8.0.10 and the chain rule to show that f
is differentiable at (x, y). (First express f in variable free notation.)
(c) Use theorem 9.3.5 to find the Fréchet derivative of f at (x, y).
(d) Now calculate f directly using the expression you wrote for f in (b)
and results numbered 8.0.8, 8.0.9, 8.0.10 and the chain rule.
(iii) $f : R^2 \to R^3$ with $f(x, y) = (e^{x^2+y^2}, e^{xy}, x)$.
(iv) f : R2 → R with f (x, y) = sin(x3 + 2y 4 ).
Inverse Maps
∀x ∈ A ∀y ∈ B y = f (x) ⇐⇒ x = f ← (y).
• The matrix of the inverse of a linear map, is the inverse of the matrix of
that linear map.
66 CHAPTER 10. INVERSE MAPS
(a) The circles of centre (0, 0) and radius r = 1, 2, 3 respectively in the domain
have image:
(b) Moving around the circles of centre (0, 0) and radius r = 1, 2, 3 respectively
in the domain, when restricted say to
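\[
\{(r\cos(\theta), r\sin(\theta)) \mid 0 \le \theta \le \tfrac{\pi}{2}\}
\]
we see that f is not injective; for example
\[
f\!\left(\tfrac{\sqrt{3}}{2}, \tfrac{1}{2}\right) = \left(1, \tfrac{\sqrt{3}}{2}\right) = f\!\left(\tfrac{1}{2}, \tfrac{\sqrt{3}}{2}\right)
\]
but $(\tfrac{1}{2}, \tfrac{\sqrt{3}}{2}) \ne (\tfrac{\sqrt{3}}{2}, \tfrac{1}{2})$.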
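The failure of injectivity can be confirmed by direct evaluation. The sketch below takes f = (Id1^2 + Id2^2, 2 Id1 Id2), the form of f given in part (f) of this example; the variable names are illustrative only.

```python
import math

def f(x, y):
    # f = (Id1^2 + Id2^2, 2*Id1*Id2), as written in part (f) of the example.
    return (x * x + y * y, 2 * x * y)

p = (math.sqrt(3) / 2, 0.5)
q = (0.5, math.sqrt(3) / 2)

image_p = f(*p)
image_q = f(*q)
# Two distinct points whose images agree up to rounding error.
same_image = all(abs(u - v) < 1e-12 for u, v in zip(image_p, image_q))
```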
(f) Finally suppose we want to find the set of all points (x, y) such that the
continuous linear map Df (x) = Df ((x, y)) : R2 → R2 has a continuous
linear inverse.
We have f = (Id21 + Id22 , 2Id1 Id2 ) and since each component involves
sums/products of projection functions, it is differentiable on R2 . For
x = (x, y) ∈ R2 we have
\[
f'((x, y)) = \begin{pmatrix} 2x & 2y \\ 2y & 2x \end{pmatrix}.
\]
are
(1) [Id] = [Df ← (f (x, y))][Df (x, y)]
(2) [Id] = [Df (f ← (u, v))][Df ← (u, v)]
and replacing (u, v) with f (x, y)
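(2)' [Id] = [Df (x, y)][Df ← (f (x, y))] so
\[
(1)\quad \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = (f^{\leftarrow})'(f(x, y)).f'(x, y)
\]
\[
(2)\quad \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = f'(x, y).(f^{\leftarrow})'(f(x, y))
\]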
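which says that $(f^{\leftarrow})'(f(x, y))$ is the inverse matrix of $f'(x, y)$; i.e. the Jacobian
matrix for f ← is the inverse of the Jacobian for f .
Here we have $(f^{\leftarrow})'(f(x, y)) = (f^{\leftarrow})'(x^2 + y^2, 2xy)$. Now
\[
f^{\leftarrow}(u, v) = \tfrac{1}{2}\left(\sqrt{u+v} + \sqrt{u-v},\ \sqrt{u+v} - \sqrt{u-v}\right)
\]
so
\[
(f^{\leftarrow})'(u, v) = \frac{1}{4}
\begin{pmatrix}
\frac{1}{\sqrt{u+v}} + \frac{1}{\sqrt{u-v}} & \frac{1}{\sqrt{u+v}} - \frac{1}{\sqrt{u-v}} \\[4pt]
\frac{1}{\sqrt{u+v}} - \frac{1}{\sqrt{u-v}} & \frac{1}{\sqrt{u+v}} + \frac{1}{\sqrt{u-v}}
\end{pmatrix}.
\]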
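The claim that the two Jacobian matrices are inverse to each other can be tested at a sample point. This is a sketch under an assumption: the point (x, y) is taken with x > y > 0, so that the square roots in the formula for the inverse recover x + y and x − y; the helper `matmul2` is hypothetical.

```python
import math

def f(x, y):
    return (x * x + y * y, 2 * x * y)

def jac_f(x, y):
    # Jacobian of f computed above.
    return [[2 * x, 2 * y], [2 * y, 2 * x]]

def jac_f_inverse(u, v):
    # Jacobian of the inverse map, from the formula above (needs u > |v|).
    p = 1.0 / math.sqrt(u + v)
    m = 1.0 / math.sqrt(u - v)
    return [[0.25 * (p + m), 0.25 * (p - m)],
            [0.25 * (p - m), 0.25 * (p + m)]]

def matmul2(A, B):
    # Product of two 2x2 matrices stored as nested lists.
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

x, y = 2.0, 0.5                      # sample point with x > y > 0
u, v = f(x, y)
product = matmul2(jac_f_inverse(u, v), jac_f(x, y))
```

The matrix `product` should agree with the 2 × 2 identity up to rounding error.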
\[
(f^{\leftarrow})'(f(a)) = (f'(a))^{-1};
\]
i.e. the Jacobian of the inverse at f (a) equals the inverse matrix of the Jacobian
of f at a.
The inverse map theorem thus gives a sufficient condition for local invertibility
of a function f : A ⊆ V → W .
Remarks
• A Banach space is a complete normed vector space; i.e. a normed vector
space in which every Cauchy sequence converges to a point in the space.
For a sequence $\{f_n\}_{n=1}^{\infty}$ of points (maps) in V (a Banach space) we have:
if $\{f_n\}_{n=1}^{\infty}$ is a Cauchy sequence then $\exists\, l \in V$, $\lim_{n\to\infty} f_n = l$.
We will show that Df (0) has a continuous inverse, but that f fails to be injec-
tive on every open interval containing 0 and so cannot have an inverse. This
demonstrates that existence of the inverse of the derivative is not sufficient by
itself. We will also demonstrate that Df is not continuous at 0.
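Since f : R → R we have Df (x) = f'(x).Id. Now for x ∈ R\{0},
\[
f'(x) = \frac{1}{2} + 2x\sin\frac{1}{x} - \cos\frac{1}{x}
\]
and $f'(0) = \frac{1}{2}$. Since $f'(0) := \lim_{h\to 0}\frac{f(h)-f(0)}{h}$ we have
\[
f'(0) = \lim_{h\to 0}\frac{\frac{h}{2} + h^2\sin\frac{1}{h}}{h}.
\]
Now
\[
\frac{\left|\frac{h}{2} + h^2\sin\frac{1}{h}\right|}{|h|} \le \frac{\left|\frac{h}{2}\right| + |h^2|}{|h|} = \frac{1}{2} + |h|
\]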
thus
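\[
Df(x) =
\begin{cases}
\left(\frac{1}{2} + 2x\sin\frac{1}{x} - \cos\frac{1}{x}\right).\mathrm{Id} & \text{if } x \ne 0 \\[4pt]
\frac{1}{2}.\mathrm{Id} & \text{if } x = 0.
\end{cases}
\]
And so $Df(0) = \frac{1}{2}.\mathrm{Id}$, which clearly has continuous inverse $(Df(0))^{-1} = 2.\mathrm{Id}$.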
is not continuous at 0.
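Consider $\varepsilon = \frac{1}{2}$, let δ > 0 and since $\{\frac{1}{2n\pi}\}_{n=1}^{\infty}$ is a decreasing sequence we can
choose large n so that $\frac{1}{2n\pi} < \delta$. Notice that $Df(x) - Df(0) = (2x\sin\frac{1}{x} - \cos\frac{1}{x}).\mathrm{Id}$
so we can choose x such that $\sin\frac{1}{x} = 0$ and $\cos\frac{1}{x} = 1$; consider $x = \frac{1}{2n\pi}$. Now
|x| = x < δ
and
\[
\begin{aligned}
\|Df(x) - Df(0)\| &= \left\|\left(\tfrac{1}{2} - \cos 2n\pi\right)\mathrm{Id} - \tfrac{1}{2}.\mathrm{Id}\right\| \\
&= \|\cos 2n\pi.\mathrm{Id}\| \\
&= \|\mathrm{Id}\| = 1 > \varepsilon.
\end{aligned}
\]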
Thus Df is not continuous at 0.
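The jump in Df at 0 shows up clearly in a numerical experiment. The sketch below uses the formula for f' given above and evaluates it at the points x = 1/(2nπ); the particular values of n are arbitrary.

```python
import math

def f_prime(x):
    # Derivative from the text: 1/2 at 0, and
    # 1/2 + 2x sin(1/x) - cos(1/x) away from 0.
    if x == 0.0:
        return 0.5
    return 0.5 + 2 * x * math.sin(1 / x) - math.cos(1 / x)

# Points x_n = 1/(2 n pi) tend to 0, yet |f'(x_n) - f'(0)| stays close to 1.
gaps = []
for n in (1, 10, 100, 1000):
    x_n = 1.0 / (2 * n * math.pi)
    gaps.append(abs(f_prime(x_n) - f_prime(0.0)))
```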
Finally we show that f fails to be injective on every open set containing 0; and
so it cannot have an inverse.
First notice that
• $|f(x) - \frac{x}{2}| = |x^2\sin\frac{1}{x}| = |x^2||\sin\frac{1}{x}| \le x^2$. So $\frac{x}{2} - x^2 \le f(x) \le \frac{x}{2} + x^2$.
10.2. EXISTENCE OF LOCAL INVERSES. 71
• If $x \in (0, \frac{1}{4})$ then $\frac{x}{2} - x^2$ is increasing and $(\frac{\mathrm{Id}}{2} - \mathrm{Id}^2)'(x) = \frac{1}{2} - 2x$. (Which
is zero when $x = \frac{1}{4}$ and positive when $0 < x < \frac{1}{4}$.)
• There is a decreasing sequence of points $s_n = \frac{2}{(4n-1)\pi}$ such that $f(s_n) = \frac{s_n}{2} - s_n^2$,
for all n and $s_1 = \frac{2}{3\pi} < \frac{1}{4}$. This means that ∀n > 1, $f(s_n) < f(s_1)$ and since
$f(s_n)$ is a local minimum for each n; ∃r ∈ R such that
$s_n < r < s_{n-1}$ and $f(r) = f(s_{n-1})$.
Now I want to prove that ∀δ > 0, f is not injective on Bδ (0).
Let δ > 0, Bδ (0) is an open ball in R.
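If $\delta \ge \frac{1}{4}$, then choose $x_1 = \frac{2}{3\pi}$ and $x_2$ the point between $\frac{2}{7\pi}$ and $\frac{2}{3\pi}$ such
that
\[
f(x_2) = \frac{1}{3\pi} - \left(\frac{2}{3\pi}\right)^2.
\]
Then
\[
f(x_1) = \frac{1}{3\pi} + \left(\frac{2}{3\pi}\right)^2 \sin\frac{3\pi}{2} = \frac{1}{3\pi} - \left(\frac{2}{3\pi}\right)^2 = f(x_2).
\]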
And if $\delta < \frac{1}{4}$, then choose for $x_1$ the first real number $s_\nu$ say, for which
$f(s_\nu) = \frac{s_\nu}{2} - (s_\nu)^2$ $(= f(x_1))$. Then $f(s_{\nu+1}) < f(s_\nu)$ and both points are
local minima so $\exists x_2$ such that $s_{\nu+1} < x_2 < s_\nu$ for which $f(x_2) = \frac{s_\nu}{2} - (s_\nu)^2$.
So again f is not injective on Bδ (0).
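The second point x2 in the case δ ≥ 1/4 can be located numerically. The sketch below assumes f(x) = x/2 + x^2 sin(1/x) with f(0) = 0 (the form implied by the estimates above) and uses bisection on the interval between 2/(7π) and 2/(5π); the choice of 2/(5π), where sin(1/x) = 1, as right endpoint is an assumption made so that the sign change is guaranteed.

```python
import math

def f(x):
    # Assumed form of the function under discussion.
    return 0.0 if x == 0.0 else x / 2 + x * x * math.sin(1 / x)

def bisect(func, lo, hi, steps=200):
    # Plain bisection; assumes func(lo) < 0 < func(hi).
    for _ in range(steps):
        mid = (lo + hi) / 2
        if func(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x1 = 2 / (3 * math.pi)
target = f(x1)

left = 2 / (7 * math.pi)    # f(left) < f(x1)
right = 2 / (5 * math.pi)   # f(right) > f(x1)
x2 = bisect(lambda x: f(x) - target, left, right)
```

Here x2 lies strictly between 2/(7π) and 2/(3π) and takes the same value under f as x1, which exhibits the failure of injectivity.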
So
f = ((exp ◦Id1 ) cos ◦Id2 , (exp ◦Id1 ) sin ◦Id2 )
which is componentwise a product of differentiable functions and so is differen-
tiable. We have that the Jacobian matrix at (x, y) is
\[
f'(x, y) = \begin{pmatrix} e^x\cos y & -e^x\sin y \\ e^x\sin y & e^x\cos y \end{pmatrix}.
\]
Since all of the partial derivatives are continuous, by Theorem 9.1.2 f is con-
tinuously differentiable, which by definition means that Df is continuous.
Also $\det(f'(x, y)) = e^{2x}$ which is not zero. And so the inverse of the linear map
Df (x, y) exists everywhere.
On the other hand, we have
Theorem 10.2.2 Let f map an open subset A of Rn into Rn and let f and all
of its partial derivatives $f_j^{/i}$ be continuous on A.
If $\det(f'(a)) \ne 0$ then
∃ open X ⊆ Rn ; a ∈ X
∃ open Y ⊆ Rn ; f (a) ∈ Y
such that f : X → Y has a continuous inverse f ← : Y → X.
Furthermore the inverse map is differentiable at f (a) and
Proof: We assume Theorem 10.2.1. Recall that f'(a) is the matrix of Df (a),
so the assumption that det(f'(a)) ≠ 0 means that (Df (a))← exists and it is
linear, being the inverse of a linear map.
By assumption all partial derivatives exist and are continuous and so Df is
continuous.
Thus the hypothesis of Theorem 10.2.1 is satisfied and so Theorem 10.2.2 follows.
Now to prove theorem 10.2.1 we will need a generalisation to arbitrary normed
vector spaces, of the mean-value theorem. Recall that for functions f : R → R
continuous on [a, b], a closed real interval, and differentiable on (a, b)
\[
\exists x \in (a, b) \mid f'(x) = \frac{f(b) - f(a)}{b - a}.
\]
Corollary 10.2.3 Let f be as in the mean-value theorem for real-valued functions
with real domain; then $|f(b) - f(a)| \le \sup_{t\in[a,b]} \{|f'(t)|(b - a)\}$.
Lemma 10.2.7 Let ϕ map an open set on the real line containing the interval
[0, 1] into W . If ϕ has a derivative which exists and is bounded on [0, 1], then
Proof: First we show that for any ε > 0, for each t ∈ [0, 1], ∃δ > 0 such that
for all h ∈ R
Let ε > 0.
Choose δ > 0 such that
\[
|h| < \delta \Rightarrow \frac{\|\varphi(t+h) - \varphi(t) - D\varphi(t)(h)\|}{|h|} < \varepsilon
\]
\[
\|\varphi(1) - \varphi(0)\| \le \sup_{t\in[0,1]} \|D\varphi(t)\|
\]
which is what we need to prove the special case.
Thus S ≠ ∅ and it is bounded above (since S ⊆ [0, 1]) so it has a least upper
bound.
Let A = sup S . I will show that A ∈ S and then that A = 1.
Let $\{s_n\}_{n=1}^{\infty}$ be a sequence of real numbers in S such that $\lim_{n\to\infty} s_n = A$.
Since ϕ is continuous at A (by assumption differentiable on [0, 1]) we have
by the theorem “continuity via sequences”, $\lim_{n\to\infty} \varphi(s_n) = \varphi(A)$.
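And so
\[
\lim_{n\to\infty} \varphi(s_n) - \varphi(0) = \varphi(A) - \varphi(0)
\]
\[
\Rightarrow \lim_{n\to\infty} (\varphi(s_n) - \varphi(0)) = \varphi(A) - \varphi(0) \quad\text{since } \varphi(0) \text{ is independent of } n \text{ and so}
\]
\[
\left\|\lim_{n\to\infty} (\varphi(s_n) - \varphi(0))\right\| = \|\varphi(A) - \varphi(0)\|
\]
\[
\text{i.e. } \|\varphi(A) - \varphi(0)\| = \lim_{n\to\infty} \|\varphi(s_n) - \varphi(0)\| \quad\text{since } \|\ \| \text{ is continuous} \quad (*)
\]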
Hence A ∈ S .
Now by definition of A it is less than or equal to 1. If A = 1 then we are
done. So suppose that A < 1.
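Choose 0 < δ ≤ 1 − A, so if |h| < δ by the first (*) we have
\[
\|\varphi(A + h) - \varphi(A)\| \le (\|D\varphi(A)\| + \varepsilon)|h| \quad\text{so}
\]
\[
\|\varphi(A + \tfrac{1}{2}\delta) - \varphi(A)\| \le (\|D\varphi(A)\| + \varepsilon)\tfrac{1}{2}\delta
\]
Now
\[
\begin{aligned}
\|\varphi(A + \tfrac{1}{2}\delta) - \varphi(0)\| &= \|\varphi(A + \tfrac{1}{2}\delta) - \varphi(A) + \varphi(A) - \varphi(0)\| \\
&\le \|\varphi(A + \tfrac{1}{2}\delta) - \varphi(A)\| + \|\varphi(A) - \varphi(0)\| \quad\text{by the triangle inequality} \\
&\le (\|D\varphi(A)\| + \varepsilon)\tfrac{1}{2}\delta + \sup_{t\in[0,A]} \{(\|D\varphi(t)\| + \varepsilon)|A|\} \\
&\le \sup_{t\in[0,A+\frac{1}{2}\delta]} \{(\|D\varphi(t)\| + \varepsilon)|A + \tfrac{1}{2}\delta|\}
\end{aligned}
\]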
Proof of Theorem 10.2.6: Let f map an open set containing the segment
[a, b] in the normed vector space V into W . Suppose f has a derivative which
is defined and bounded on [a, b].
Put ϕ(t) = f (a + t(b − a)) = (f ◦ (a + Id(b − a)))(t). Then ϕ(1) = f (b),
ϕ(0) = f (a) and a + t(b − a) ∈ [a, b], for all t ∈ [0, 1]. So ϕ maps an open set
containing [0, 1] into W . Since f is differentiable at a+t(b−a) and a+Id(b−a)
is differentiable (being an affine map) the composite ϕ is defined on [0, 1] and
for any t ∈ [0, 1],
Dϕ(t) = D(f ◦ (a + Id(b − a)))(t)
= Df (a + Id(b − a)(t)) ◦ D(a + Id(b − a))(t)
= Df (a + t(b − a)) ◦ (b − a)Id.
i.e. ||Df (a + t(b − a))||||(b − a)|| is an upper bound for {||Dϕ(t)(s)|| : |s| ≤ 1}.
But ||Dϕ(t)|| is the least upper bound of {||Dϕ(t)(s)|| : |s| ≤ 1}. So
Now
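\[
\begin{aligned}
\|f(b) - f(a)\| &= \|\varphi(1) - \varphi(0)\| \\
&\le \sup_{t\in[0,1]} \|D\varphi(t)\| \quad\text{by the special case} \\
&\le \sup_{t\in[0,1]} \|Df(a + t(b - a))\|\,\|b - a\| \\
&\le \sup_{x\in[a,b]} \|Df(x)\|\,\|b - a\|
\end{aligned}
\]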
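The inequality just derived can be illustrated on a concrete map. The sketch below borrows f(x, y) = (x^2 + y^2, 2xy) from the earlier example purely as an assumption; its Jacobian is symmetric with eigenvalues 2x + 2y and 2x − 2y, so the operator norm of Df(x, y) for the Euclidean norm is the larger of their absolute values. The endpoints a and b are arbitrary.

```python
import math

def f(x, y):
    return (x * x + y * y, 2 * x * y)

def op_norm_Df(x, y):
    # Df(x, y) has matrix [[2x, 2y], [2y, 2x]], a symmetric matrix with
    # eigenvalues 2x + 2y and 2x - 2y; its operator norm is the larger modulus.
    return max(abs(2 * x + 2 * y), abs(2 * x - 2 * y))

def norm(v):
    return math.hypot(v[0], v[1])

a = (0.0, 0.0)
b = (1.0, 2.0)

# Sample the segment [a, b] (endpoints included) to estimate sup ||Df(x)||.
samples = 1001
sup_norm = max(
    op_norm_Df(a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
    for t in (k / (samples - 1) for k in range(samples))
)

fa, fb = f(*a), f(*b)
lhs = norm((fb[0] - fa[0], fb[1] - fa[1]))            # ||f(b) - f(a)||
rhs = sup_norm * norm((b[0] - a[0], b[1] - a[1]))     # sup ||Df|| * ||b - a||
```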
A. Show that the distance between successive iterates drops down by a factor
of k or more at each step.
C. Show that the limit x is a fixed point of g, and that it is the only one.
xn = g(xn−1 ) ∀n ≥ 2
and so xn+1 = g(xn )
so that xn+1 − xn = g(xn ) − g(xn−1 ).
Since (†) holds for all n ≥ 2, by repeatedly applying (†) we may drop the value
of n on the right hand side, back to 2; i.e.
and so we see that the distance between successive iterates is dropping down at
least as rapidly as the terms of a geometric series with ratio k.
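The geometric decay of the distances between successive iterates can be observed directly. In the sketch below the contraction g = cos on [0, 1], with constant k = sin(1), is a hypothetical example chosen for illustration (it is not the g of the notes); by the mean value theorem |cos x − cos y| ≤ sin(1)|x − y| on [0, 1], and cos maps [0, 1] into itself.

```python
import math

def g(x):
    # Hypothetical contraction on [0, 1].
    return math.cos(x)

k = math.sin(1.0)      # contraction constant for cos on [0, 1]

xs = [0.5]             # x_1, an arbitrary starting point in [0, 1]
for _ in range(60):
    xs.append(g(xs[-1]))

# Distances between successive iterates.
gaps = [abs(xs[n + 1] - xs[n]) for n in range(len(xs) - 1)]
# Each gap is at most k times the previous one.
shrinks = all(gaps[n + 1] <= k * gaps[n] for n in range(len(gaps) - 1))
residual = abs(xs[-1] - g(xs[-1]))   # how far the last iterate is from a fixed point
```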
B. To show $\{x_n\}_{n=1}^{\infty}$ is a Cauchy sequence.
Let m < n. Note that
First we check that the hypotheses of the inverse map theorem apply; i.e.
\[
f_1^{/1}(a, b) = -e^{-a}\cos(b) \qquad f_1^{/2}(a, b) = -e^{-a}\sin(b)
\]
\[
f_2^{/1}(a, b) = -e^{-a}\sin(b) \qquad f_2^{/2}(a, b) = e^{-a}\cos(b)
\]
so $Df(a) = (-e^{-a}\cos(b)\mathrm{Id}_1 - e^{-a}\sin(b)\mathrm{Id}_2,\ -e^{-a}\sin(b)\mathrm{Id}_1 + e^{-a}\cos(b)\mathrm{Id}_2)$ which
conforms with the form of a linear map from R2 to R2 . The linear map
Df (a) : R2 → R2 has an inverse ⇐⇒ its matrix f'(a) has non-zero determinant.
Now
\[
f'(a, b) = \begin{pmatrix} -e^{-a}\cos(b) & -e^{-a}\sin(b) \\ -e^{-a}\sin(b) & e^{-a}\cos(b) \end{pmatrix}
\]
and so the determinant $\det(f'(a)) = -e^{-2a} \ne 0$ for any a ∈ R. Thus f satisfies
the hypothesis of the theorem on R2 and so on A.
Thus the conclusion to the theorem applies; i.e.
• there is an open set X ⊆ A containing a and an open set Y ⊆ W containing
f (a) such that the restricted map f : X → Y has a continuous inverse
f ← : Y → X.
• The inverse map is Fréchet differentiable at f (a) and
\[
(f^{\leftarrow})'(f(a)) = (f'(a))^{-1}.
\]
But before we choose the sets X and Y we can assume, by the inverse map
theorem that an inverse exists on some open set and so we will find the inverse
and then try to choose sets X and Y so that f and f ← are bijective.
For $f(x, y) = (e^{-x}\cos(y), e^{-x}\sin(y)) = (u, v)$ find that $x = \ln((u^2 + v^2)^{-\frac{1}{2}})$ and
$y = \arctan\frac{v}{u}$. And so
\[
f^{\leftarrow}(u, v) = \left(\ln((u^2 + v^2)^{-\frac{1}{2}}),\ \arctan\frac{v}{u}\right).
\]
Try to find sets X and Y = f (X) which are suitable in this example.
The Jacobian for f ← (u, v) is the matrix
\[
\begin{pmatrix}
\frac{-u}{u^2+v^2} & \frac{-v}{u^2+v^2} \\[4pt]
\frac{-v}{u^2+v^2} & \frac{u}{u^2+v^2}
\end{pmatrix}
\]
Check that the matrix product f'(x, y).(f ← )'(f (x, y)) gives the identity 2 × 2
matrix; thus the composite Df ((x, y)) ◦ Df ← (f (x, y)) is the identity map on
Y and that the matrix product (f ← )'(f (x, y)).f'(x, y) is also the identity 2 × 2
matrix, so that the composite Df ← (f (x, y)) ◦ Df (x, y) is the identity map on
X. It is also easy to check that $(f^{\leftarrow})'(f(a)) = (f'(a))^{-1}$ as asserted by the
theorem.
Exercise 10
(a) Sketch the images of the hyperbolae of centre (0, 0) and distance to
a vertex of 1,2,3 respectively.
(b) Is f 1-1? Give reasons.
(c) Find a set X such that the restricted map f : X → R2 is 1-1.
(d) Now find the set Y such that the map f : X → Y is onto.
(e) Find a formula for the inverse f ← : Y → X of the restricted map.
(f) Now find the set of points for which the linear map Df (x, y) : R2 → R2
has an inverse. Is the inverse of Df (x, y) continuous?
3. Use the mean value theorem (for real valued functions) to prove that the
restriction of the function $1 - \cos : [0, \frac{\pi}{4}] \to [0, \frac{\pi}{4}]$ is a contraction mapping.
(a) Prove that for each b ∈ V there is a unique x ∈ V such that L(x) = b,
thereby proving that L has an inverse L← : V → V .
(b) Use the equation Id = Id − L + L to prove that there is a constant
k such that for each x ∈ V ||x|| ≤ k||L(x)||. Deduce that L← is
continuous.
(i) By the following lemma (“no-jump” lemma) for some δ > 0 we have
f (y) > 0 for all y ∈ (c − δ, c + δ).
No-jump Lemma: Let D ⊂ R. If f : D → R is continuous and
l ∈ D is such that f (l) > 0 then there exists δ > 0 such that, for all
x∈D
|x − l| ≤ δ ⇒ f (x) > 0.
The lemma tells us that if f (l) > 0 then f (x) > 0 for all x ∈ D
sufficiently near l. The function cannot suddenly jump to negative
values arbitrarily close to l. (A similar result holds for f (l) < 0).
Prove the following statement and deduce the no-jump lemma from
it.
(ii) Furthermore from the continuity of φ we have for some ν > 0 φ(x) ∈
(c − δ, c + δ) for all x ∈ (−ν, ν).
(iii) Thus ∀x ∈ (−ν, ν), (f ◦ φ)(x) > 0 and so we can divide both sides
of the differential equation by f ◦ φ and get
Using the substitution rule on the left hand side and the initial con-
dition φ(0) = c gives
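\[
\int_0^x (\mathrm{Id}^{-1}\circ f\circ\phi).\phi' = \int_c^{\phi(x)} \mathrm{Id}^{-1}\circ f = \left(\int_c \mathrm{Id}^{-1}\circ f\right)(\phi(x))
\]
and so
\[
\left(\int_c \mathrm{Id}^{-1}\circ f\right)\circ\phi = \int_0 g \quad\text{on } (-\nu, \nu)
\]
So if the function $\int_c \mathrm{Id}^{-1}\circ f$ has an inverse $\left(\int_c \mathrm{Id}^{-1}\circ f\right)^{\leftarrow}$ on
(c − δ, c + δ) we could conclude that
\[
\phi = \left(\int_c \mathrm{Id}^{-1}\circ f\right)^{\leftarrow} \circ \int_0 g
\]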
Implicit functions
Our last major result for this course is the Implicit Function Theorem. It is a
more general theorem than the inverse map theorem. To motivate the theorem,
consider the question; given a function F : R2 → R does there exist a set of
points
{(x, y) ∈ R2 : F (x, y) = 0}
which is the graph of some function; i.e. does there exist a function φ say, such
that
{(x, y) ∈ R2 : F (x, y) = 0} = {(x, φ(x)) : F (x, φ(x)) = 0}?
In general the answer to the question is no. The purpose of the Implicit Function
Theorem is to establish sufficient conditions for such to exist, possibly on a
subset of the domain of F .
Example
Let F : R2 → R be given by F (x, y) = x2 + y 2 − 1.
[Figure: the unit circle F (x, y) = 0, showing a point (a, b) on the circle and the points (−1, 0) and (1, 0).]
84 CHAPTER 11. IMPLICIT FUNCTIONS
In this example the answer is yes; provided that (a, b) ≠ (±1, 0).
Notice that in this example $\partial_2 F(a, b) = F^{/2}(a, b)\mathrm{Id} = 2b\,\mathrm{Id} = 0$ at each of the
points (±1, 0). We have
Theorem 11.0.10 Let F : U ⊆ V × W → W where U is open and let F be
Fréchet differentiable on U . Suppose that there is a point (a, b) in U such that
F (a, b) = 0, DF is continuous at (a, b) and that (∂2 F )(a, b) ∈ L(W, W ) has a
continuous inverse (so that (∂2 F )(a, b) is a homeomorphism.)
Then there is an open set A ⊆ V containing a, an open set X ⊆ V × W
containing (a, b) and a unique function ϕ : A → W with ϕ(a) = b such that for
all x ∈ A
(a) (x, ϕ(x)) ∈ X
(b) F (x, ϕ(x)) = 0
(c) X ∩ F −1 ({0}) = {(x, ϕ(x)) : x ∈ A}.
(d) The function ϕ is differentiable at a and
Dϕ(a) = −(∂2 F (a, b))← ◦ (∂1 F )(a, b).
Remarks:
• In the special case V = W = R (i.e. F : U ⊆ R2 → R) we have that
∂2 F (a, b) = F /2 (a, b)Id ∈ L(R, R).
But remember that L(R, R) = {mId : m ∈ R}, so
(a) mId has an inverse ⇐⇒ m ≠ 0; (its inverse is $\frac{1}{m}\mathrm{Id}$.)
(b) ∂2 F (a, b) = F /2 (a, b)Id and
(c) every linear map from R to R is continuous.
So by (a) and (c) ∂2 F (a, b) has a continuous inverse ⇐⇒ ∂2 F (a, b) ≠ 0
and by (b) this is equivalent to $F^{/2}(a, b)\mathrm{Id} \ne 0$, which is in turn equivalent
to $F^{/2}(a, b) \ne 0$. Thus we can replace the condition “∂2 F (a, b) has a
continuous inverse” with $F^{/2}(a, b) \ne 0$.
• Notice also that in this case (V = W = R) we have $\partial_1 F(a, b) = F^{/1}(a, b)\mathrm{Id}$
so that
\[
(\partial_2 F(a, b))^{\leftarrow} \circ \partial_1 F(a, b) = \frac{F^{/1}(a, b)}{F^{/2}(a, b)}\,\mathrm{Id}.
\]
• The next simplest case is V = Rm−n , W = Rn with m > n. Identify
V × W with Rm (= Rm−n × Rn ). In this case the conditions that
F be differentiable on U and
DF be continuous at a = (a, b) are usually met by checking that all partial
/j
derivatives Fi (1 ≤ i ≤ n, 1 ≤ j ≤ m) exist and are continuous at a.
The condition that ∂2 F (a, b) be a homeomorphism reduces to the invertibility
of its matrix; i.e. we need the matrix for the linear map ∂2 F (a, b)
to have non-zero determinant.
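For the special case V = W = R in the remarks above, conclusion (d) of the theorem reads φ'(a) = −F^{/1}(a, b)/F^{/2}(a, b). The sketch below checks this on the circle example F(x, y) = x^2 + y^2 − 1; the point (0.6, 0.8) and the explicit upper-semicircle formula for φ are assumptions made for the illustration.

```python
import math

def F(x, y):
    return x * x + y * y - 1.0

def F1(x, y):
    # Partial derivative of F in the first variable.
    return 2 * x

def F2(x, y):
    # Partial derivative of F in the second variable.
    return 2 * y

def phi(x):
    # Explicit solution of F(x, phi(x)) = 0 near a point with b > 0.
    return math.sqrt(1 - x * x)

a, b = 0.6, 0.8
formula = -F1(a, b) / F2(a, b)                   # derivative predicted by (d)
h = 1e-6
numeric = (phi(a + h) - phi(a - h)) / (2 * h)    # difference quotient of phi at a
```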
11.1. AN APPLICATION OF THE IMPLICIT FUNCTION THEOREM 85
Since F and its partial derivatives are continuous on an open set containing
$((0, \frac{1}{\sqrt{2}}), -\frac{1}{\sqrt{2}})$, DF is continuous at $((0, \frac{1}{\sqrt{2}}), -\frac{1}{\sqrt{2}})$.
Finally $((0, \frac{1}{\sqrt{2}}), -\frac{1}{\sqrt{2}})$ is a point in R2 × R such that $F((0, \frac{1}{\sqrt{2}}), -\frac{1}{\sqrt{2}}) = 0$. Thus
F satisfies the hypotheses of the Implicit Function Theorem.
The conclusion to the Implicit Function Theorem asserts the existence of
• open A ⊆ R2
It is the proof of the theorem which will guide us to understand how to explicitly
describe the conclusion and so we defer the description until we have examined
a proof of the theorem.
Exercise 11
$F : R^2 \to R$ with $F(x, y) = x^9 - y^3$.
4. Let F : R × R → R with $F(x, y) = x^2 + y^2 - 4$. Show that in a neighbourhood
of $(1, \sqrt{3})$ the set {(x, y)| F (x, y) = 0} forms the graph of a function φ
with F (x, φ(x)) = 0.
5. Let F : R × R → R with $F(x, y) = x^2\cos(y) + \sin(x^2)$.
Show that in a neighbourhood of $(\sqrt{\pi}, \frac{\pi}{2})$ the set {(x, y)| F (x, y) = 0}
forms the graph of a function φ with F (x, φ(x)) = 0.
Chapter 12
\[
(f^{\leftarrow})'(f(a)) = (f'(a))^{-1}
\]
i.e. the Jacobian matrix of the inverse is the inverse of the Jacobian matrix.
The theorem reduces to the following special case in which V = W and Df (a) =
IdV .
First we assume that the special case is true and deduce the Inverse Map The-
orem.
Suppose that f maps an open subset A of a Banach space V to W , f is Fréchet
differentiable on A, Df is continuous at a ∈ A and the linear map Df (a) :
V → W has a continuous inverse (Df (a))← (i.e. assume the hypotheses of the
inverse map theorem).
Now set $\bar f = (Df(a))^{\leftarrow} \circ f$.
90 CHAPTER 12. PROOF OF THE THEOREMS
[Diagram: f : A ⊆ V → W followed by (Df (a))← : W → V gives the composite $\bar f$ : A → V.]
We want to show that $\bar f$ satisfies the hypotheses of the special case. We have
By (i), (ii) and (iii) $\bar f = (Df(a))^{\leftarrow} \circ f$ satisfies the hypotheses of the special
case.
The conclusion is that there is an open set X ⊆ A containing a and an open set
$\bar Y \subseteq V$ containing $\bar f(a)$ such that the restricted map $\bar f : X \to \bar Y$ has a continuous
inverse $\bar f^{\leftarrow} : \bar Y \to X$. The inverse map $\bar f^{\leftarrow}$ is differentiable at $\bar b = \bar f(a)$ with
$D\bar f^{\leftarrow}(\bar b) = \mathrm{Id}$.
It remains to deduce the conclusion to the Inverse Map Theorem.
Let X ⊆ A be an open set and let Y be the subset $Df(a)(\bar Y) \subseteq W$, so that
f : X → Y ⊆ W is given by
Since each of Df (a) and $\bar f$ has a continuous inverse, f has a continuous
inverse
\[
f^{\leftarrow} = (Df(a) \circ \bar f)^{\leftarrow} = \bar f^{\leftarrow} \circ (Df(a))^{\leftarrow}.
\]
And so f ← : Y → X, and Y is open, since $\bar Y$ is open and (Df (a))← is continuous.
Thus the conclusion to the Inverse Map Theorem holds:
there is an open set X ⊆ A containing a and an open set Y ⊆ W containing f (a)
such that the restricted map f : X → Y has a continuous inverse f ← : Y → X
which is Fréchet differentiable at f (a) and
\[
(f^{\leftarrow})'(f(a)) = (f'(a))^{-1}.
\]
And so we have to prove the special case of the inverse map theorem. This is
really the main step in the proof of the Inverse Map Theorem because here we
have to prove existence of the sets X and Y .
Proof of Lemma (12.0.1):
We assume that f is a function satisfying the hypotheses of the special case,
i.e. we have f maps an open set A of a Banach space V into V , f is Fréchet
differentiable, Df continuous and Df (a) = IdV .
Now given a y close to b = f (a) ∈ V , we want to solve the equation y = f (x)
to get a unique point x near to a in A.
For this we can use the contraction map theorem.
• Apply the contraction map theorem to find that gy has a unique fixed
point (one for each y) and
We define
\[
g_y : \bar B_r(a) \to V \quad\text{by}\quad g_y = \mathrm{Id}_V - f + y.
\]
We’ll show that $g_y$ satisfies the contraction condition on $\bar B_r(a)$ and that it maps
the closed ball $\bar B_r(a)$, of radius r > 0 centred on a, into itself. Since $\bar B_r(a)$ is a
closed subset of a complete space it is complete and so for each y, $g_y$ satisfies
the hypotheses of the contraction map theorem.
||Dgy (x1 ) − Dgy (x2 )|| = ||IdV − Df (x1 ) − IdV + Df (x2 )||
= ||Df (x1 ) − Df (x2 )||.
Thus for each y, $g_y$ satisfies the contraction condition. Next we show that for
$y \in B_{r/2}(b)$, $g_y(x) \in \bar B_r(a)$ for all $x \in \bar B_r(a)$.
Let $y \in B_{r/2}(b)$, i.e. $\|y - b\| < \frac{r}{2}$.
Let $x \in \bar B_r(a)$ ⇒ ||x − a|| ≤ r.
Now by (†) we have for each fixed y and each $x_1, x_2 \in \bar B_r(a)$
\[
\|g_y(x_1) - g_y(x_2)\| < \tfrac{1}{2}\|x_1 - x_2\|.
\]
So in particular we have
\[
\|g_y(x) - g_y(a)\| < \tfrac{1}{2}\|x - a\| \le \tfrac{r}{2},
\]
whenever $x \in \bar B_r(a)$.
Now gy = IdV − f + y and so
Furthermore
||gy (x) − a|| = ||gy (x) − gy (a) + gy (a) − a||
≤ ||gy (x) − gy (a)|| + ||gy (a) − a||
= ||gy (x) − gy (a)|| + ||y − b|| < r ≤ r.
which is what we wanted to prove, since along with the contraction condition
(†) it gives that gy satisfies the contraction map theorem for B r (a).
By the Contraction Map Theorem gy has a unique fixed point for each y.
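The fixed-point construction can be tried out in one dimension. The sketch below uses a hypothetical map f(x) = x + x^2 on V = R with a = 0, so that b = f(a) = 0 and Df(0) = Id; on the closed ball of radius r = 1/4 the map g_y = Id − f + y has derivative of modulus at most 1/2. The function name `solve` and the chosen y are assumptions made for this illustration.

```python
def f(x):
    # Hypothetical map with Df(0) = Id.
    return x + x * x

def solve(y, a=0.0, steps=100):
    # Iterate g_y(x) = x - f(x) + y starting from a; a fixed point x of g_y
    # satisfies f(x) = y.
    x = a
    for _ in range(steps):
        x = x - f(x) + y
    return x

r = 0.25
y = 0.1            # satisfies |y - b| < r/2 with b = f(0) = 0
x = solve(y)
```

The returned x stays inside the closed ball of radius r about a and solves y = f(x) up to rounding error.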
And so
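\[
\begin{aligned}
L_b^{\leftarrow}(f(a + h) - f(a)) &= h + L_b^{\leftarrow}(O(h)) \qquad (1) \\
&= (a + h - a) + L_b^{\leftarrow}(O(a + h - a)).
\end{aligned}
\]
\[
L_b^{\leftarrow}(b + k - b) = f^{\leftarrow}(b + k) - f^{\leftarrow}(b) + L_b^{\leftarrow}(O(f^{\leftarrow}(b + k) - f^{\leftarrow}(b)))
\]
\[
\text{i.e. } f^{\leftarrow}(b + k) = f^{\leftarrow}(b) + L_b^{\leftarrow}(k) - L_b^{\leftarrow}(O(f^{\leftarrow}(b + k) - f^{\leftarrow}(b))).
\]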
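To see that $L_b^{\leftarrow}$ is the Fréchet derivative of $f^{\leftarrow}$ at f (a) we have to show that
\[
\lim_{k\to 0}\frac{\|L_b^{\leftarrow}(O(f^{\leftarrow}(b + k) - f^{\leftarrow}(b)))\|}{\|k\|} = 0
\]
which is the sufficient condition for $L_b^{\leftarrow} = Df^{\leftarrow}(b)$.
But $L_b^{\leftarrow}$ is a linear transformation and so $\|L_b^{\leftarrow}(x)\| \le \|L_b^{\leftarrow}\|\,\|x\|$ and so we have
to show that
\[
\lim_{k\to 0}\frac{\|O(f^{\leftarrow}(b + k) - f^{\leftarrow}(b))\|}{\|k\|} = 0.
\]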
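Now
\[
\frac{\|O(f^{\leftarrow}(b+k) - f^{\leftarrow}(b))\|}{\|k\|}
= \frac{\|O(f^{\leftarrow}(b+k) - f^{\leftarrow}(b))\|}{\|f^{\leftarrow}(b+k) - f^{\leftarrow}(b)\|} \cdot \frac{\|f^{\leftarrow}(b+k) - f^{\leftarrow}(b)\|}{\|k\|} \qquad (2)
\]
and by continuity of $f^{\leftarrow}$ we have $f^{\leftarrow}(b + k) \to f^{\leftarrow}(b)$ as b + k → b, i.e. k → 0.
And by the definition of O
\[
\frac{\|O(f^{\leftarrow}(b + k) - f^{\leftarrow}(b))\|}{\|f^{\leftarrow}(b + k) - f^{\leftarrow}(b)\|} \to 0
\]
as $f^{\leftarrow}(b + k) \to f^{\leftarrow}(b)$.
Also, by the contraction condition (†)
\[
\frac{\|f^{\leftarrow}(b + k) - f^{\leftarrow}(b)\|}{\|k\|} \to \frac{1}{2}.
\]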
So since both limits on the right of (2) exist and one of them is zero we
have the desired result. Thus we have deduced the conclusion to the special
case of the Inverse Map Theorem: we have $X = f^{\leftarrow}(Y) \cap B_r(a)$, $Y = B_{r/2}(b)$ and
f (X) = Y . In addition f : X → Y is injective and so its inverse f ← : Y → X
exists. We have also shown that f ← is continuous and differentiable.
Recall the Implicit Function Theorem 11.1
Let F : U ⊆ V × W → W where U is open and let F be Fréchet
differentiable on U . Suppose that there is a point (a, b) in U such
that F (a, b) = 0, DF is continuous at (a, b) and that (∂2 F )(a, b) ∈
L(W, W ) has a continuous inverse (so that (∂2 F )(a, b) is a homeo-
morphism.)
Then there is an open set A ⊆ V containing a, an open set X ⊆
V × W containing (a, b) and a unique function ϕ : A → W with
ϕ(a) = b such that for all x ∈ A
We use the Inverse Map Theorem to prove the Implicit Function Theorem.
The Implicit Function Theorem involves a function F : V × W → W so to use
the Inverse Map Theorem, I need a function H : V ×W → V ×W which satisfies
the hypotheses of the Inverse Map Theorem.
Define a function H : V × W → V × W by
as required.
(3) Next we are to prove that DH(a, b) has an inverse and that the inverse is
continuous.
DH(a, b)(x, y) = (v, w) ⇔ (x, y) = (v, ∂2 F (a, b)← (w−∂1 F (a, b)(v))).
i.e. by (1) (x, DF (a, b)(x, y))) = (v, w) and so x = v and DF (a, b)(x, y) =
w. But
(∂2 F (a, b))← (∂2 F (a, b)(y)) = (∂2 F (a, b))← (w − ∂1 F (a, b)(x))
i.e. (DH(a, b))← (v, w) = (v, (∂2 F (a, b))← (w − ∂1 F (a, b)(v)))
i.e. (DH(a, b))← = (Id1 , (∂2 F (a, b))← ◦ (Id2 − ∂1 F (a, b) ◦ Id1 )).
Conversely suppose that
Now
DH(a, b)(x, y) = DH(a, b)(v, (∂2 F (a, b))← (w − ∂1 F (a, b)(v)))
.
= (v, DF (a, b)(v, (∂2 F (a, b))← (w − ∂1 F (a, b)(v))))
And
DF (a, b)(v, ∂2 F (a, b)← (w − ∂1 F (a, b)(v)))
= ∂1 F (a, b)(v) + ∂2 F (a, b)(∂2 F (a, b)← (w − ∂1 F (a, b)(v)))
= ∂1 F (a, b)(v) + w − ∂1 F (a, b)(v)
=w
Hence
DH(a, b)(x, y) = (v, w).
(ii) Next we want to see that (DH(a, b))← is continuous.
||(DH(a, b))← (v, w)|| = ||(v, (∂2 F (a, b))← (w − ∂1 F (a, b)(v))||
so
||(DH(a, b))← (v, w)||2 = ||v||2 +||(∂2 F (a, b))← (w−∂1 F (a, b)(v))||2
i.e.
\[
\|(DH(a, b))^{\leftarrow}(v, w)\| \le \varepsilon\left(1 + \|(\partial_2 F(a, b))^{\leftarrow}\|^2 (1 + \|\partial_1 F(a, b)\|)^2\right)^{\frac{1}{2}}
\]
But Id2 and H ← are differentiable and f is continuous linear and hence
differentiable. And so ϕ is the composite of differentiable functions.
Furthermore
Dϕ(a) = D(Id2 ◦ H ← ◦ f )(a)
= DId2 (H ← (f (a))) ◦ D(H ← ◦ f )(a)
= Id2 ◦ DH ← (f (a)) ◦ Df (a)
= Id2 ◦ DH ← (a, 0) ◦ f since f is continuous linear
But
DH ← (a, 0) = (DH(a, b))←
= (Id1 , (∂2 F (a, b))← ◦ (Id2 − ∂1 F (a, b) ◦ Id1 )).
And so
Dϕ(a)(h) = ((∂2 F (a, b))← ◦ (Id2 − ∂1 F (a, b) ◦ Id1 ))(f (h))
= ((∂2 F (a, b))← ◦ (Id2 − ∂1 F (a, b) ◦ Id1 ))((h, 0))
= ((∂2 F (a, b))← ◦ −∂1 F (a, b))(h) hence
Dϕ(a) = (∂2 F (a, b))← ◦ −∂1 F (a, b)
as required.
This completes the proof of the Implicit Function Theorem.
To finish we return to the example of Chapter 11 and find explicitly the sets X
and A and function φ : A → W .
Everything depended upon the function H; here we have H : R2 × R → R2 × R
which maps ((x, y), z) to ((x, y), F ((x, y), z)).
X was to be a set on which H is bijective, so that $H^{-1} : Y \to X$ exists; also X
should contain the point $((0, \tfrac{1}{\sqrt{2}}), -\tfrac{1}{\sqrt{2}})$. Consider the set
$F^{-1}(\{0\}) = \{((x, y), z) \in \mathbb{R}^3 : \tfrac{x^2}{4} + y^2 + z^2 = 1\}.$
Now $\tfrac{x^2}{4} + y^2 + z^2 = 1 \iff z = \pm\sqrt{1 - (\tfrac{x^2}{4} + y^2)}$. And z can only be defined
in this way if we have $\tfrac{x^2}{4} + y^2 \le 1$.
Also it is clear that we should take $z = -\sqrt{1 - (\tfrac{x^2}{4} + y^2)}$ (†)
since we want $((0, \tfrac{1}{\sqrt{2}}), -\tfrac{1}{\sqrt{2}}) \in X$.
Now if H is restricted to
$\{((x, y), z) : \tfrac{x^2}{4} + y^2 \le 1 \text{ and } z \le 0\}$
then it is injective.
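Proof: Suppose that H((x, y), z) = H((u, v), w); this is equivalent to saying
((x, y), F ((x, y), z)) = ((u, v), F ((u, v), w)). Hence (x, y) = (u, v) and z² = w², and since z ≤ 0 and w ≤ 0 this gives z = w.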
But
$\{((x, y), z) : \tfrac{x^2}{4} + y^2 \le 1 \text{ and } z \le 0\}$
is not open, so for X we take instead the open set
$\{((x, y), z) : \tfrac{x^2}{4} + y^2 < 1 \text{ and } z < 0\}.$
Now to find $A \subseteq \mathbb{R}^2$: if we take $A = \{(x, y) : \tfrac{x^2}{4} + y^2 < 1\}$ and for φ : A → R
the function defined by
$\varphi(x, y) = -\sqrt{1 - (\tfrac{x^2}{4} + y^2)}$
then $\varphi(0, \tfrac{1}{\sqrt{2}}) = -\tfrac{1}{\sqrt{2}}$ and so φ satisfies the first condition. In addition:
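Let (x, y) ∈ A; then $((x, y), -\sqrt{1 - (\tfrac{x^2}{4} + y^2)}) \in \mathbb{R}^2 \times \mathbb{R}$; so (a) is satisfied.
$F((x, y), \varphi(x, y)) = F\bigl((x, y), -\sqrt{1 - (\tfrac{x^2}{4} + y^2)}\bigr)$
$= \tfrac{x^2}{4} + y^2 + \bigl(-\sqrt{1 - (\tfrac{x^2}{4} + y^2)}\bigr)^2 - 1 = 0$
so (b) is satisfied.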
And
$X \cap F^{-1}(\{0\}) = \{((x, y), z) : \tfrac{x^2}{4} + y^2 < 1 \text{ and } z < 0\} \cap \{((x, y), \varphi(x, y)) : (x, y) \in A\}$
$= \{((x, y), z) : \tfrac{x^2}{4} + y^2 < 1 \text{ and } z < 0\} \cap \{((x, y), -\sqrt{1 - (\tfrac{x^2}{4} + y^2)}) : \tfrac{x^2}{4} + y^2 < 1\}$
$= \{((x, y), \varphi(x, y)) : (x, y) \in A\}$
since φ(x, y) < 0
whenever (x, y) ∈ A.
Hence (c) is satisfied.
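Conditions (b) and (c) can be spot-checked numerically. The sketch below assumes nothing beyond the formulas for F and φ in this example; the sample points of A are chosen arbitrarily for illustration, and the tolerance 1e-12 is an assumption made to absorb rounding error.

```python
import math

def F(x, y, z):
    # F((x, y), z) = x^2/4 + y^2 + z^2 - 1, the function of the example
    return x * x / 4 + y * y + z * z - 1

def phi(x, y):
    # phi(x, y) = -sqrt(1 - (x^2/4 + y^2)), defined for (x, y) in A
    return -math.sqrt(1 - (x * x / 4 + y * y))

# Arbitrarily chosen sample points of A = {(x, y) : x^2/4 + y^2 < 1}
samples = [(0.0, 1 / math.sqrt(2)), (0.0, 0.0), (1.0, 0.5), (-1.5, 0.2)]

for (x, y) in samples:
    z = phi(x, y)
    # (b): F vanishes on the graph of phi; (c): the graph lies in X (z < 0)
    print((x, y), abs(F(x, y, z)) < 1e-12, z < 0)
```

A finite set of samples is of course no substitute for the argument above; it only guards against slips in the formulas.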
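Finally $\varphi(x, y) = -\sqrt{1 - (\tfrac{x^2}{4} + y^2)}$ means that
$\varphi^{/1}(x, y) = -\tfrac{1}{2}\bigl(1 - (\tfrac{x^2}{4} + y^2)\bigr)^{-1/2} \cdot (-\tfrac{2x}{4}) = \dfrac{x}{4\sqrt{1 - (\tfrac{x^2}{4} + y^2)}}$
and
$\varphi^{/2}(x, y) = -\tfrac{1}{2}\bigl(1 - (\tfrac{x^2}{4} + y^2)\bigr)^{-1/2} \cdot (-2y) = \dfrac{y}{\sqrt{1 - (\tfrac{x^2}{4} + y^2)}}$
so
$D\varphi(x, y) = \dfrac{x}{4\sqrt{1 - (\tfrac{x^2}{4} + y^2)}}\,\mathrm{Id}_1 + \dfrac{y}{\sqrt{1 - (\tfrac{x^2}{4} + y^2)}}\,\mathrm{Id}_2.$
In particular
$D\varphi(0, \tfrac{1}{\sqrt{2}}) = \mathrm{Id}_2.$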
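As a cross-check on this computation, the derivative of φ can be obtained in three ways: from the explicit partial derivatives, from the formula Dϕ(a) = (∂2 F (a, b))← ◦ −∂1 F (a, b) of the Implicit Function Theorem, and from difference quotients. The sketch below assumes ∂1 F ((x, y), z) = (x/2, 2y) and ∂2 F ((x, y), z) = 2z, which follow from F ((x, y), z) = x²/4 + y² + z² − 1. The step size h and the second test point are arbitrary choices.

```python
import math

def phi(x, y):
    return -math.sqrt(1 - (x * x / 4 + y * y))

def dphi_formula(x, y):
    # Explicit partial derivatives: x / (4*sqrt(...)) and y / sqrt(...)
    r = math.sqrt(1 - (x * x / 4 + y * y))
    return (x / (4 * r), y / r)

def dphi_implicit(x, y):
    # -(d2F)^{-1} o d1F with d1F = (x/2, 2y), d2F = 2z, taken at z = phi(x, y)
    z = phi(x, y)
    return (-(x / 2) / (2 * z), -(2 * y) / (2 * z))

def dphi_numeric(x, y, h=1e-6):
    # Central difference quotients, used only as an independent sanity check
    return ((phi(x + h, y) - phi(x - h, y)) / (2 * h),
            (phi(x, y + h) - phi(x, y - h)) / (2 * h))

a = (0.0, 1 / math.sqrt(2))
print(dphi_formula(*a), dphi_implicit(*a), dphi_numeric(*a))
```

At the point a all three agree, up to rounding, with the coefficients (0, 1) of Id2.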
Note: it is easy to check that there is at most one element 0 in V which satisfies
(A3)–this element is called the zero element of V .
Question: where did the 1 come from in (M1)?
A vector space is defined as a set V over R (in our case). In fact for R one
can substitute any field. Examples of fields are: R, C, Q with respect to the
operations of addition and multiplication.
Vector Subspace
Given a vector space V equipped with a pair of operations, + and multiplication
by an element of the field of scalars, a subset S of the set V is said to be a vector
subspace of V if and only if it is non-empty and closed under + and the scalar
multiplication.
104 CHAPTER 13. APPENDIX A- VECTOR SPACES
Field axioms:
(i) ∀x, y ∈ F, x + y = y + x
(ii) ∀x, y, z ∈ F, x + (y + z) = (x + y) + z
(iii) ∃0 ∈ F such that ∀x ∈ F, x + 0 = x
Appendix B- Complete Spaces
Hence (|sn |) → l. But (sn ) is Cauchy and so (sn ) → l or (sn ) → −l. Hence
every Cauchy sequence in R is convergent; i.e. R is complete.
Now suppose that $x_r = (x_r^1, x_r^2, \ldots, x_r^n)$, $r \in \mathbb{N}$, is a Cauchy sequence in $\mathbb{R}^n$, with
$x_r^i$ the ith component of $x_r$. Then for each i, $(x_r^i)$ is a Cauchy sequence in $\mathbb{R}$,
since $|x_r^i - x_s^i| \le \sqrt{\sum_{k=1}^{n} (x_r^k - x_s^k)^2} = d(x_r, x_s)$, where d is the usual metric on $\mathbb{R}^n$.
Proof: Let $\{f_n\}_{n=1}^{\infty}$ be a Cauchy sequence in B[a, b]. We are to prove that
there is a function f in B[a, b] which is the limit of this sequence.
The first step is to find some way of constructing a function f which has a
reasonable chance of being the required limit. We now show how to define the
value of such a function f at each point x in [a, b].
To this end, let x ∈ [a, b]. Since for all m, n ∈ N
But the absolute value function is continuous. So by the theorem on continuity via
sequences we may take lim inside:
||fm − f || ≤ ε.
From this it would seem that we have reached our goal, except that we have
“≤ ε” rather than “< ε”. We can fix this by using the ε/2 trick at line (∗).
More seriously we have not yet shown that f ∈ B[a, b]. So we do this next.
Let x ∈ [a, b], and let n ∈ N. By the triangle inequality for R,
|f (x)| ≤ |f (x) − fn (x)| + |fn (x)| ≤ ||f − fn || + ||fn ||,
where by (∗∗), ||f −fn || is certainly defined for some n and where ||fn || is defined
since fn ∈ B[a, b]. The right hand side is thus an upper bound for the set whose
typical member appears on the left. Hence f ∈ B[a, b], as required.
Chapter 15
APPENDIX C-Higher Order Fréchet Derivatives
Idj .wi (x + λy) = Idj (x1 + λy1 , . . . , xn + λyn )wi (x1 + λy1 , . . . , xn + λyn )
= (xj + λyj )wi
= xj wi + λyj wi
= Idj .wi (x) + λIdj .wi (y).
(b) To prove that Idj .wi is continuous we express it as the composite Id1 .Id2 ◦
(Idj , wi ) which we will show is continuous being the composite of contin-
uous functions.
(i) To prove that (Idj , wi ) : Rn → R × F is continuous we will show that
it is differentiable at any a = (a1 , . . . , an ) ∈ Rn .
Let a ∈ Rn , h ∈ Rn then
(Idj , wi )(a + h) − (Idj , wi )(a) = (aj + hj , wi ) − (aj , wi )
= (hj , 0).
Now choosing D(Idj , wi )(a)(h) = (hj , 0); i.e. D(Idj , wi )(a) = (Idj , 0),
it is clear that the map is linear and continuous and satisfies the re-
quired limit condition. Thus (Idj , wi ) is continuous at any point
a ∈ Rn , being differentiable everywhere.
(ii) Here we prove that
Id1 .Id2 : R × F → F
(a, â) 7→ aâ
is continuous.
Let a = (a, â) ∈ R × F.
Let ε > 0.
Choose $\delta = \min\{1, \frac{\varepsilon}{1 + \|\hat{a}\|_F + |a|_R}\}$; so δ > 0 and it is in R.
Let x = (x, x̂) ∈ R × F and suppose that ||x − a|| < δ;
then
||x − a|| < 1 and $\|x - a\| < \frac{\varepsilon}{1 + \|\hat{a}\|_F + |a|_R}$ thus
||x − a||(1 + ||â||F + |a|R ) < ε and so
||x − a||(||x − a|| + ||â||F + |a|R ) < ε, since ||x − a|| < 1 thus
||x − a||2 + ||x − a||(||â||F ) + ||x − a||(|a|R ) < ε but since
|x − a| ≤ ||x − a|| and ||x̂ − â|| ≤ ||x − a|| we have
|x − a|(||x̂ − â||) + |x − a|(||â||) + |a|(||x̂ − â||) < ε this gives
||(x − a)(x̂ − â) + (x − a)â + a(x̂ − â)|| < ε
by homogeneity and the triangle inequality i.e.
||xx̂ − aâ|| < ε.
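The choice of δ can be tried out numerically. The following is a rough sketch, assuming F = R with the absolute value as its norm and the Euclidean norm on R × F; neither assumption is required by the proof. The helper names delta_for and check are hypothetical. Points are sampled on a circle of radius just under δ around (a, â), and the largest value of |xx̂ − aâ| found is returned.

```python
import math

def delta_for(a, a_hat, eps):
    # delta = min{1, eps / (1 + |a_hat| + |a|)}, as chosen in the text
    return min(1.0, eps / (1 + abs(a_hat) + abs(a)))

def check(a, a_hat, eps, steps=50):
    # Sample (x, x_hat) at Euclidean distance 0.999 * delta from (a, a_hat)
    # and record the worst value of |x * x_hat - a * a_hat|
    r = 0.999 * delta_for(a, a_hat, eps)
    worst = 0.0
    for k in range(steps):
        t = 2 * math.pi * k / steps
        x, x_hat = a + r * math.cos(t), a_hat + r * math.sin(t)
        worst = max(worst, abs(x * x_hat - a * a_hat))
    return worst

print(check(2.0, -3.0, 0.1) < 0.1)
```

In each case tried the worst value stays below ε, as the estimate predicts; the bound is usually far from sharp.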
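since L is linear.
Also $L(e_j) \in F$ and so $L(e_j) = \sum_{i=1}^{m} \alpha_{ij} w_i$ since the $w_i$ are elements of a
basis for F. Thus
$L(x) = \sum_{j=1}^{n} x_j L(e_j)$
$= \sum_{j=1}^{n} x_j \sum_{i=1}^{m} \alpha_{ij} w_i$
$= \sum_{j=1}^{n} \sum_{i=1}^{m} \alpha_{ij} x_j w_i$
$= \sum_{j=1}^{n} \sum_{i=1}^{m} \alpha_{ij}\, \mathrm{Id}_j . w_i (x).$
Thus ∃λij ∈ R such that
$L = \sum_{\substack{1 \le j \le n \\ 1 \le i \le m}} \lambda_{ij}\, \mathrm{Id}_j w_i.$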
Lemma 15.0.10 If F has a Euclidean dominated norm then so too does $L(\mathbb{R}^n, F)$.
That is, if
$L = \sum_{j=1}^{n} \sum_{i=1}^{m} \lambda_{ij}\, \mathrm{Id}_j w_i \in L(\mathbb{R}^n, F)$
then $\|L\| \le \sqrt{\sum_{j=1}^{n} \sum_{i=1}^{m} \lambda_{ij}^2}$.
Proof: Suppose $L = \sum_{j=1}^{n} \sum_{i=1}^{m} \lambda_{ij}\, \mathrm{Id}_j w_i$. We have
and so on.
Returning to the map $D^2 f : E \to L(E, L(E, F))$, how do we get back to elements
of F?
For x ∈ E we have $D^2 f(x) \in L(E, L(E, F))$,
for u ∈ E we have $D^2 f(x)(u) \in L(E, F)$,
and for w ∈ E we have $D^2 f(x)(u)(w) \in F$.
In the case E = Rm , F = Rn (i.e. finite dimensional Euclidean spaces)
1. Standard methods applying the Chain rule and the lemmas 8.0.1,8.0.2 and
theorem 8.0.3.
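where $\alpha_{ij}(v) \in \mathbb{R}$. Then applying theorem 9.3.2 we can show that $\alpha_{ij}(v) = f_i^{/j}(v)$,
since for each $v \in \mathbb{R}^n$, $\alpha_{ij}(v) \in \mathbb{R}$ and so we may regard $\alpha_{ij}$ as a
function from $\mathbb{R}^n$ to $\mathbb{R}$.
In this sense
$Df = \sum_{j=1}^{n} \sum_{i=1}^{m} \alpha_{ij}\, \mathrm{Id}_j . e_i = \sum_{j=1}^{n} \sum_{i=1}^{m} f_i^{/j}\, \mathrm{Id}_j . e_i.$
And so we have
$D(Df)(v) = \sum_{j=1}^{n} \sum_{i=1}^{m} D\alpha_{ij}(v)\, \mathrm{Id}_j . e_i,$
$D(Df)(v) = \sum_{k=1}^{n} \sum_{j=1}^{n} \sum_{i=1}^{m} \alpha_{ijk}(v)\, \mathrm{Id}_k \mathrm{Id}_j . e_i = \sum_{k=1}^{n} \sum_{j=1}^{n} \sum_{i=1}^{m} f_i^{/j/k}(v)\, \mathrm{Id}_k \mathrm{Id}_j . e_i.$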
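For example:
$f : \mathbb{R}^2 \to \mathbb{R}$ with $f(x, y) = x^3 + 4xy^2$. We have $f = \mathrm{Id}_1^3 + 4\,\mathrm{Id}_1 . \mathrm{Id}_2^2$ and so
f is differentiable at each point $x = (x, y) \in \mathbb{R}^2$.
We have $\alpha_{11} = f^{/1/1} = 6\,\mathrm{Id}_1$, $\alpha_{12} = f^{/1/2} = 8\,\mathrm{Id}_2$, $\alpha_{21} = f^{/2/1} = 8\,\mathrm{Id}_2$ and
$\alpha_{22} = f^{/2/2} = 8\,\mathrm{Id}_1$, all of which are differentiable; this gives
$D^3 f(x) = \sum_{k=1}^{2} \sum_{j=1}^{2} \sum_{i=1}^{2} f^{/i/j/k}(x)\, \mathrm{Id}_k \mathrm{Id}_j . \mathrm{Id}_i$
$= 6\,\mathrm{Id}_1 \mathrm{Id}_1 \mathrm{Id}_1 + 8\,\mathrm{Id}_2 \mathrm{Id}_2 \mathrm{Id}_1 + 8\,\mathrm{Id}_2 \mathrm{Id}_1 \mathrm{Id}_2 + 8\,\mathrm{Id}_1 \mathrm{Id}_2 \mathrm{Id}_2.$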
In fact it can be shown that the existence of the derivative D(Df )(x)
implies the existence of each of the derivatives Dαij (x).
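Returning to the example f (x, y) = x³ + 4xy², the expression obtained for the third derivative can be checked against a difference quotient. The sketch below reads each product of three identity maps as a trilinear form applied to vectors u, v, w. This reading is an assumption about the notation; since the whole expression is symmetric, the order in which the vectors are fed in does not matter. The base point p, the direction u and the helper names are arbitrary illustrative choices.

```python
def f(x, y):
    # The example from the text: f(x, y) = x^3 + 4 x y^2
    return x**3 + 4 * x * y**2

def D3f(u, v, w):
    # 6 Id1 Id1 Id1 + 8 Id2 Id2 Id1 + 8 Id2 Id1 Id2 + 8 Id1 Id2 Id2,
    # evaluated on three vectors of R^2
    return (6 * u[0] * v[0] * w[0]
            + 8 * (u[1] * v[1] * w[0] + u[1] * v[0] * w[1] + u[0] * v[1] * w[1]))

def third_directional(p, u, h=1.0):
    # Third derivative of t -> f(p + t u) at t = 0 by a central difference;
    # for a cubic polynomial this quotient is exact up to rounding
    g = lambda t: f(p[0] + t * u[0], p[1] + t * u[1])
    return (g(2 * h) - 2 * g(h) + 2 * g(-h) - g(-2 * h)) / (2 * h**3)

p, u = (1.0, 2.0), (0.5, -1.0)
print(D3f(u, u, u), third_directional(p, u))
```

The two printed values agree, and they do not depend on the base point p, consistent with the coefficients of the third derivative being constants.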