FALecture Notes

The document outlines the syllabus and content for the MATH 36202/M6202 Functional Analysis course at the University of Bristol, taught by Dr. Thomas Bothner in Spring 2024. It includes topics such as normed linear spaces, Hilbert spaces, Banach spaces, and bounded operators, along with preliminary concepts in metric spaces and relevant mathematical theorems. The material is intended for educational purposes and is based on established texts in the field of functional analysis.
University of Bristol, MATH 36202 and M6202

Functional Analysis

Instructor: Dr. Thomas Bothner

Written by Thomas Bothner

Spring 2024
Contents

1 Preliminaries
1.1 Background material
1.2 Normed linear spaces

2 Hilbert spaces
2.1 The geometry of Hilbert spaces
2.2 The Riesz representation theorem
2.3 Orthonormal bases

3 Banach spaces
3.1 Definitions and examples
3.2 The Hahn-Banach theorem
3.3 Duals and double duals
3.4 The Baire category theorem and its consequences
3.5 Weak convergence

4 Bounded operators
4.1 Topologies on bounded operators
4.2 Adjoints
4.3 The spectrum
4.4 Compact operators

Subject Index

These notes were written for the spring 2024 module MATH 36202/M6202 Functional Analysis
at the University of Bristol and the content follows closely
• M. Reed and B. Simon, Methods of Modern Mathematical Physics, I: Functional Analysis, Academic Press, New York-London 1972, xvii+325.
• B. Simon, Real Analysis, A Comprehensive Course in Analysis, Part 1, American Mathematical Society, Providence, RI, 2015, xx+789.
• B. Simon, Operator Theory, A Comprehensive Course in Analysis, Part 4, American Mathematical Society, Providence, RI, 2015, xviii+749.
This material is provided exclusively for educational purposes at the University of Bristol and
is to be downloaded or copied for your private study only. The artwork on the front cover,
“Painting for Saints”, was created by the British street artist Banksy in 2020.

1 Preliminaries

1.1 Background material


Throughout history, properties of individual numbers were of great importance, just think about
√2 being irrational and 𝜋 transcendental. With the development of calculus in the 17th century,
the focus shifted gradually from numbers to functions. Those wrap up individual numbers into
some rule of assignment which now matters more than the numbers themselves. Still, late in the
19th century, the interest of mathematicians shifted yet again from studying individual functions
(their differentiability or integrability) to function spaces. One can argue that a function space
wraps up functions of interest into one geometric object. In turn, the geometry of the function
space reflects important properties of functions. This led to the development of the field of
functional analysis, a truly remarkable creation of 20th century mathematics.

However, we don’t start from scratch but rely on results typically covered in Linear Algebra,
Analysis and Metric Space modules - but we won’t use measure or integration theory. Here is a
short summary, see Reed-Simon, Chapter I or Simon Part 1, Chapters 1 and 2 for further details.

Notation and Terminology


The basic rings and fields in this module are

R = real numbers, Q = rationals, Z = integers,

C = complex numbers = {𝑥 + i𝑦 : 𝑥, 𝑦 ∈ R}

with their sums and products. For 𝑧 = 𝑥 + i𝑦 ∈ C, we write ℜ𝑧 = 𝑥, ℑ𝑧 = 𝑦, |𝑧| = √(𝑥² + 𝑦²). We
also use
N = natural numbers = {1, 2, 3, . . .}.

Orders and Zorn’s Lemma


Definition 1. Given a set 𝑆, a relation, 𝑅, is a subset of the Cartesian product 𝑆 × 𝑆. For 𝑥, 𝑦 ∈ 𝑆,
we write 𝑥 ∼𝑅 𝑦 if (𝑥, 𝑦) ∈ 𝑅. A relation is called an equivalence relation if it satisfies
(i) ∀𝑥 ∈ 𝑆 we have 𝑥 ∼𝑅 𝑥 (reflexive)
(ii) ∀𝑥, 𝑦 ∈ 𝑆 such that 𝑥 ∼𝑅 𝑦, we have 𝑦 ∼𝑅 𝑥 (symmetric)
(iii) ∀𝑥, 𝑦, 𝑧 ∈ 𝑆 such that 𝑥 ∼𝑅 𝑦 and 𝑦 ∼𝑅 𝑧, we have 𝑥 ∼𝑅 𝑧 (transitive)

If 𝑅 is an equivalence relation on 𝑆, the equivalence class, [𝑥], of 𝑥 ∈ 𝑆 is

[𝑥] := {𝑦 ∈ 𝑆 : 𝑥 ∼𝑅 𝑦},

and by the properties of equivalence relations we have for all 𝑥, 𝑦 ∈ 𝑆 that either [𝑥] = [𝑦] or
[𝑥] ∩ [𝑦] = ∅, in short, each 𝑥 ∈ 𝑆 belongs to a unique equivalence class.

Next, we discuss Zorn’s lemma.

Definition 2. A relation, 𝑅, on 𝑆 is called antisymmetric if
(iv) ∀𝑥, 𝑦 ∈ 𝑆 such that 𝑥 ∼𝑅 𝑦 and 𝑦 ∼𝑅 𝑥, we have 𝑦 = 𝑥
A partial order is a relation that is reflexive, antisymmetric and transitive. If 𝑅 is a partial order,
we often write 𝑥 𝑅 𝑦 instead of 𝑥 ∼𝑅 𝑦 and a set with a distinguished partial order is called a
partially ordered set.
We use the word “partial” in the last definition since two elements 𝑥, 𝑦 ∈ 𝑆 need not obey 𝑥 𝑅 𝑦
or 𝑦 𝑅 𝑥.

Definition 3. A partially ordered set, 𝑆, is said to be totally ordered if for all 𝑥 and 𝑦 in 𝑆, either
𝑥 𝑅 𝑦 or 𝑦 𝑅 𝑥. A totally ordered subset of 𝑆 is called a chain.

Some final pieces of terminology:

Definition 4. If 𝑆 is a partially ordered set and 𝑇 ⊂ 𝑆, we say 𝑥 is an upper bound (respectively,
lower bound) for 𝑇 if
• 𝑥 ∈ 𝑆 (not necessarily in 𝑇 ) and
• ∀𝑦 ∈ 𝑇 we have 𝑦 𝑅 𝑥 (respectively, 𝑥 𝑅 𝑦)
An element 𝑥 ∈ 𝑆 is called maximal if for all 𝑦 ∈ 𝑆 we have that 𝑥 𝑅 𝑦 implies 𝑥 = 𝑦 (but 𝑥
may not be an upper bound for 𝑆 since it may not be comparable to every 𝑦 ∈ 𝑆).
In this lecture we will make use of the following consequence of the axiom of choice which is
known, in turn, to imply the axiom of choice (if one also assumes the “usual” axioms).

Theorem 1.1: Zorn’s lemma

Let 𝑋 be a nonempty partially ordered set in which every chain has an upper bound. Then
𝑋 has at least one maximal element.

Metric spaces
We will often need a way of measuring the distance between objects in sets.

Definition 5. A metric space is a set 𝑋 and a real-valued function 𝜌 : 𝑋 × 𝑋 → [0, ∞) that
obeys
(i) 𝜌 (𝑥, 𝑦) = 0 if and only if 𝑥 = 𝑦 (strong zero property)
(ii) ∀𝑥, 𝑦 ∈ 𝑋 we have 𝜌 (𝑥, 𝑦) = 𝜌 (𝑦, 𝑥) (symmetry)
(iii) ∀𝑥, 𝑦, 𝑧 ∈ 𝑋 we have 𝜌 (𝑥, 𝑧) ≤ 𝜌 (𝑥, 𝑦) + 𝜌 (𝑦, 𝑧) (triangle inequality)
The function 𝜌 is called a metric on 𝑋 .

Definition 6. Given a metric space, $(X,\rho)$, we define the open ball, $B_r(x)$ for $r>0$, $x\in X$, by

$$B_r(x) := \{y\in X:\ \rho(x,y)<r\}. \tag{1.1}$$

The closed ball, $\overline{B}_r(x)$, is defined (for $r\ge 0$) with $\rho(x,y)<r$ in (1.1) replaced by $\rho(x,y)\le r$. A
set $A\subset X$ in a metric space $(X,\rho)$ is called open if and only if for all $x\in A$, there exists $r>0$ so
that $B_r(x)\subset A$. $A$ is called closed if $X\setminus A$ is open.
If $A\subset X$ is arbitrary, then the closure of $A$, denoted $\overline{A}$, is the smallest closed set containing
$A$. Clearly, $x\in\overline{A}$ if and only if for all $\epsilon>0$ we have $B_\epsilon(x)\cap A\neq\emptyset$. The interior $A^{\mathrm{int}}$ is the
largest open set contained in $A$. Furthermore, if $A\subset B$ with $B\subset\overline{A}$, we say $A$ is dense in $B$. In
particular, if $B=X$, we speak of a dense subset of $X$. Finally, a metric space, $(X,\rho)$, is called
separable if and only if $X$ has a countable dense subset.

Definition 7. Given a sequence $(x_n)_{n=1}^{\infty}$ in a metric space $(X,\rho)$, we say $x_\infty$ is a limit point
of $(x_n)_{n=1}^{\infty}$ (respectively, $(x_n)_{n=1}^{\infty}$ converges to $x_\infty$, written $x_n\to x_\infty$) if and only if for all $\epsilon>0$, $B_\epsilon(x_\infty)$
contains infinitely many $x_n$ (respectively, all but finitely many $x_n$).
Convergent sequences are related to closed sets by the following result.

Proposition 1. Let $(X,\rho)$ be a metric space and $A\subset X$. Then
(a) $A$ is closed if and only if $(x_n)_{n=1}^{\infty}\subset A$ and $x_n\to x_\infty$ implies $x_\infty\in A$.
(b) The closure $\overline{A}$ is the set of limit points of sequences in $A$.


If $(X,\rho_X)$ and $(Y,\rho_Y)$ are metric spaces, then a function $f:X\to Y$ is said to be continuous
at $x_0\in X$, if

$$\forall\,\epsilon>0\ \ \exists\,\delta=\delta(\epsilon,x_0)>0:\quad \rho_Y\big(f(x),f(x_0)\big)<\epsilon\ \text{ whenever }\ \rho_X(x,x_0)<\delta.$$

If $f$ is continuous at every $x_0\in X$, we say $f$ is a continuous function.

Proposition 2. Let $(X,\rho_X)$ and $(Y,\rho_Y)$ be metric spaces and $f:X\to Y$ a function. Then the
following are equivalent:
(i) $f$ is continuous.
(ii) For all open $B\subset Y$, $f^{-1}[B] := \{x\in X:\ f(x)\in B\}$ is an open subset of $X$.
(iii) If $x_n\to x_\infty$ in $X$, then $f(x_n)\to f(x_\infty)$ in $Y$.
Finally, the important notions of complete and completion.

Definition 8. A sequence $(x_n)_{n=1}^{\infty}$ of points in a metric space, $(X,\rho)$, is called Cauchy if and
only if

$$\forall\,\epsilon>0\ \ \exists\,N=N(\epsilon)\in\mathbb{N}:\quad \rho(x_n,x_m)<\epsilon\ \text{ whenever }\ n,m>N.$$

Any convergent sequence is a Cauchy sequence by the triangle inequality, however the converse
may not be true: if $X=(0,1]\subset\mathbb{R}$ with the usual metric $\rho(x,y)=|x-y|$, then $x_n=\frac{1}{n}$, $n\in\mathbb{N}$, is
Cauchy but does not converge to a point in $X$. For us it will be important to single out spaces
where this cannot happen.

Definition 9. A metric space, $(X,\rho)$, is called complete if and only if every Cauchy sequence in
$X$ converges to some $x_\infty\in X$.
Recall that a subset $Y$ in a complete metric space $(X,\rho)$ is itself complete if and only if $Y$ is
closed. Moreover, any metric space can be embedded as a dense subset of a complete metric
space:

Theorem 1. Given any metric space $(X,\rho)$, there is a complete metric space $(\hat{X},\hat{\rho})$ and a map
$\pi:X\to\hat{X}$ so that
(i) $\hat{\rho}(\pi(x),\pi(y))=\rho(x,y)$ for all $x,y\in X$ (isometry property)
(ii) $\mathrm{Ran}(\pi)$ is dense in $\hat{X}$

Compactness
Definition 10. A metric space, $(X,\rho)$, is called compact if and only if every open cover of $X$ has
a finite subcover. We say $X$ is totally bounded if and only if for all $\epsilon>0$, there is a finite set of
points $x_1,\dots,x_n\in X$ so that $X\subset\bigcup_{j=1}^{n}B_\epsilon(x_j)$. Furthermore, $X$ is called sequentially compact
if and only if every sequence $(x_n)_{n=1}^{\infty}\subset X$ has a convergent subsequence $(x_{n_k})_{k=1}^{\infty}\subset\{x_n\}_{n=1}^{\infty}$ so
that $x_{n_k}\to x_\infty\in X$.

Theorem 2. Let (𝑋, 𝜌) be a metric space. Then the following are equivalent:
(i) 𝑋 is compact.
(ii) 𝑋 is sequentially compact.
(iii) 𝑋 is complete and totally bounded.

Here is the fundamental property of compact subsets of $\mathbb{R}^n$:

Theorem 3 (Heine-Borel). A subset $A\subset\mathbb{F}^n$ with $\mathbb{F}=\mathbb{R}$ or $\mathbb{F}=\mathbb{C}$ is compact (in the usual metric
$\rho(x,y)=|x-y|$) if and only if $A$ is closed and bounded.
Recall that if $(X,\rho_X)$ and $(Y,\rho_Y)$ are metric spaces, we say a function $f:X\to Y$ is uniformly
continuous if and only if

$$\forall\,\epsilon>0\ \ \exists\,\delta=\delta(\epsilon)>0:\quad \rho_Y\big(f(x),f(w)\big)<\epsilon\ \text{ whenever }\ \rho_X(x,w)<\delta\ \text{ for any } x,w\in X.$$

Theorem 4 (Dirichlet-Heine). Let (𝑋, 𝜌𝑋 ) be a compact metric space and (𝑌 , 𝜌𝑌 ) a metric space.
Then any continuous function 𝑓 : 𝑋 → 𝑌 is uniformly continuous.
Here are a few other nice properties of compact metric spaces:

Theorem 5. Let $(X,\rho_X)$ be a compact metric space.
(a) If $(Y,\rho_Y)$ is a metric space and $f:X\to Y$ continuous, then $f[X] := \mathrm{Ran}(f)$ is compact.
(b) If $f:X\to\mathbb{R}$ is continuous, then $f$ is bounded and there exist $x_\pm\in X$ with

$$f(x_+)=\sup_{y\in X}f(y),\qquad f(x_-)=\inf_{y\in X}f(y).$$

Equicontinuity
Compact sets in function spaces are especially important.

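Definition 11. Let $(X,\rho_X)$ and $(Y,\rho_Y)$ be metric spaces. A family $\mathcal{F}$ of functions from $X$ to $Y$ is
called equicontinuous at $x_0\in X$ if and only if

$$\forall\,\epsilon>0\ \ \exists\,\delta=\delta(\epsilon,x_0)>0:\quad \rho_Y\big(f(x),f(x_0)\big)<\epsilon\ \text{ whenever }\ \rho_X(x,x_0)<\delta\ \text{ for any } f\in\mathcal{F}.$$

If $\mathcal{F}$ is equicontinuous at every $x_0\in X$, we say $\mathcal{F}$ is equicontinuous. Furthermore, we say $\mathcal{F}$ is
uniformly equicontinuous if and only if

$$\forall\,\epsilon>0\ \ \exists\,\delta=\delta(\epsilon)>0:\quad \rho_Y\big(f(x),f(w)\big)<\epsilon\ \text{ when }\ \rho_X(x,w)<\delta\ \text{ for any } x,w\in X,\ f\in\mathcal{F}.$$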

Equicontinuity allows one to turn weak information about a limit approach into stronger
information. Here are the most important results about equicontinuous function families:

Theorem 6. Let $(f_n)_{n=1}^{\infty}$ be a sequence of functions from one metric space $(X,\rho_X)$ to another
$(Y,\rho_Y)$ with the property that $\mathcal{F}=\{f_n\}_{n=1}^{\infty}$ is equicontinuous. Suppose $f_n(x)\to f(x)$ as $n\to\infty$
pointwise for each $x\in X$. Then $f$ is continuous.

Theorem 7. Let $\mathcal{F}=\{f_n:[0,1]\to\mathbb{C}\}_{n=1}^{\infty}$ be a uniformly equicontinuous family of functions
on $[0,1]\subset\mathbb{R}$. Suppose that $f_n(x)\to f(x)$ as $n\to\infty$ pointwise for each $x\in[0,1]$. Then
$f_n(x)\to f(x)$ uniformly in $x\in[0,1]$.

Theorem 1.2: Arzelà-Ascoli

Let $\mathcal{F}=\{f_n:[0,1]\to\mathbb{C}\}_{n=1}^{\infty}$ be a family of uniformly bounded equicontinuous functions
on $[0,1]\subset\mathbb{R}$. Then some subsequence $(f_{n_k})_{k=1}^{\infty}\subset\mathcal{F}$ converges uniformly on $[0,1]$.

Some Linear Algebra


Here and later, we will use $\mathbb{F}$ to indicate either $\mathbb{R}$ or $\mathbb{C}$. A vector space is a set, $X$, with two
operations: addition $(x,y)\mapsto x+y$, mapping $X\times X$ to $X$, and scalar multiplication $(\lambda,x)\mapsto\lambda x$,
mapping $\mathbb{F}\times X$ to $X$, with the usual properties. We say a subset, $Y$, of $X$ spans $X$ if for any
$x\in X$, there are $\alpha_1,\dots,\alpha_n\in\mathbb{F}$ and $y_1,\dots,y_n\in Y$ so that

$$x=\sum_{j=1}^{n}\alpha_j y_j.$$

We say $X$ is finite-dimensional if it has a finite spanning set. Moreover, $\{y_j\}_{j=1}^{n}\subset X$ are
independent if and only if for $\{\alpha_j\}_{j=1}^{n}\subset\mathbb{F}$,

$$\sum_{j=1}^{n}\alpha_j y_j=0\in X\ \Rightarrow\ \alpha_1=\alpha_2=\dots=\alpha_n=0\in\mathbb{F}.$$

If $\{y_j\}_{j=1}^{n}$ are not independent, they are called dependent. A basis for a finite-dimensional
space is an independent spanning set. If $\{y_j\}_{j=1}^{n}\subset X$ is a basis, the map

$$(\mu_1,\dots,\mu_n)\mapsto\sum_{j=1}^{n}\mu_j y_j \tag{1.2}$$

sets up a linear bijection of $\mathbb{F}^n$ and $X$ with linear inverse. But since $\mathbb{F}^n$ and $\mathbb{F}^m$ are not linearly
isomorphic for $n\neq m$, the number $n$ of elements of a basis is independent of the choice of basis; we
call it the dimension, $\dim(X)$, of $X$.

Theorem 8. Every finite-dimensional vector space, $X$, has a dimension, $n\in\mathbb{Z}_{\ge 0}$, which is the
number of elements in any basis.
Finally, a subset $Y\subset X$ is called a subspace if it is non-empty and closed under addition and
scalar multiplication.

1.2 Normed linear spaces


Definition 12. A norm on a vector space, $X$, over $\mathbb{F}=\mathbb{C}$ or $\mathbb{F}=\mathbb{R}$ is a function $\|\cdot\|:X\to[0,\infty)$
which satisfies
(i) $\|x\|=0\ \Leftrightarrow\ x=0$
(ii) $\|\lambda x\|=|\lambda|\,\|x\|$ for all $\lambda\in\mathbb{F}$ and all $x\in X$.
(iii) $\|x+y\|\le\|x\|+\|y\|$ for all $x,y\in X$.
If condition (i) is dropped, $\|\cdot\|$ is called a seminorm. A vector space with a distinguished norm is
called a normed linear space (NLS).

In any normed linear space $(X,\|\cdot\|)$,

$$\rho(x,y)=\|x-y\|$$

defines a metric, and so $X$ inherits all the features of a metric space.

Definition 13. A bounded linear transformation from a normed linear space $(X,\|\cdot\|_X)$ to
a normed linear space $(Y,\|\cdot\|_Y)$ is a function $T:X\to Y$ which satisfies
(i) $T(\alpha x+\beta y)=\alpha T(x)+\beta T(y)$ for all $\alpha,\beta\in\mathbb{F}$ ($=\mathbb{R}$ or $\mathbb{C}$) and $x,y\in X$.
(ii) There exists $c>0$ such that $\|Tx\|_Y\le c\,\|x\|_X$ for all $x\in X$.
We write $\mathcal{L}(X,Y)$ for the set of all bounded linear transformations from $X$ to $Y$, and $\mathcal{L}(X) :=
\mathcal{L}(X,X)$ for the set of all bounded linear operators on $X$.

As it happens the elements of L (𝑋, 𝑌 ) are exactly the continuous linear maps:

Theorem 9. Let $T:X\to Y$ be a linear transformation between two normed linear spaces
$(X,\|\cdot\|_X)$ and $(Y,\|\cdot\|_Y)$. Then the following are equivalent:
(1) $T$ is continuous at one point $x_0\in X$.
(2) $T$ is continuous at $x_0=0$.
(3) $T$ is continuous at all points.
(4) $T$ is bounded on the closed ball $\overline{B}_1(0)=\{x\in X:\ \|x\|_X\le 1\}$.
(5) $T$ is bounded.

Proof. Clearly (3) ⇒ (2) ⇒ (1) and (5) ⇒ (4) ⇒ (2) (since, for $x_0=0$ and $x\neq 0$, (4) yields $\|Tx-Tx_0\|_Y\le
\|T(x/\|x\|_X)\|_Y\,\|x-x_0\|_X\le C\,\|x-x_0\|_X$ given $T(x_0)=0$, i.e. Lipschitz continuity at $x_0=0$). We thus
show, say, (4) ⇒ (5): For any $x\in X\setminus\{0\}$ we have

$$Tx=\|x\|_X\,T\bigg(\underbrace{\frac{x}{\|x\|_X}}_{\le 1\ \text{in}\ \|\cdot\|_X}\bigg),$$

and thus $\|Tx\|_Y\le\|T(x/\|x\|_X)\|_Y\,\|x\|_X\le c\,\|x\|_X$ by assumption (4). But also $T(0)=0$, which
shows boundedness at $x=0$. Moving on, let’s show (1) ⇒ (3): Suppose $T$ is continuous at
$x_0\in X$ and $x\in X$ is arbitrary. If $(x_n)_{n=1}^{\infty}\subset X$ is such that $x_n\to x$, then $x_n-x+x_0\to x_0$ as
$n\to\infty$, so from (1) and Proposition 2,

$$T(x_0)=\lim_{n\to\infty}T(x_n-x+x_0)=\lim_{n\to\infty}\big(T(x_n)-T(x)+T(x_0)\big)=T(x_0)-T(x)+\lim_{n\to\infty}T(x_n),$$

and thus $T(x_n)\to T(x)$, i.e. continuity at $x\in X$ by Proposition 2. Next we address (2) ⇒ (5):
From continuity at $x_0=0$ we have by Proposition 2 that $T^{-1}[B_1(0)]$ contains some open disk
$B_\delta(0)\subset X$, i.e.

$$\|x\|_X<\delta\ \Rightarrow\ \|Tx\|_Y<1.$$

But if $x\in X$ is arbitrary and $\epsilon>0$, then

$$\bigg\|\frac{\delta x}{\epsilon+\|x\|_X}\bigg\|_X<\delta,$$

and therefore by linearity of $T$,

$$1>\bigg\|T\bigg(\frac{\delta x}{\epsilon+\|x\|_X}\bigg)\bigg\|_Y=\frac{\delta}{\epsilon+\|x\|_X}\,\|Tx\|_Y\ \ \Leftrightarrow\ \ \|Tx\|_Y<\frac{1}{\delta}\big(\epsilon+\|x\|_X\big).$$

Letting $\epsilon\downarrow 0$, we see that $\|Tx\|_Y\le c\,\|x\|_X$ with $c=\frac{1}{\delta}>0$ which is (5). This completes our
proof of the Theorem. □

Observe that condition (4) in Theorem 9 need not hold for arbitrary (not necessarily linear) continuous
maps, since the closed ball $\overline{B}_1(0)$ is not necessarily compact; compare the following result:

Theorem 1.3: Equivalent norms

A normed linear space $(X,\|\cdot\|)$ is finite dimensional if and only if every closed and
bounded subset of it is compact. Moreover, if $(X,\|\cdot\|)$ is finite dimensional, then
(i) $(X,\|\cdot\|)$ is complete.
(ii) Any two norms $\|\cdot\|_1$ and $\|\cdot\|_2$ on $X$ are equivalent in the sense that there are
constants $c,d>0$ so that, for any $x\in X$,

$$c\,\|x\|_1\le\|x\|_2\le d\,\|x\|_1.$$

Proof. If $X$ is finite-dimensional with $n=\dim(X)$ and basis $\{x_j\}_{j=1}^{n}\subset X$, then for any $x\in X$
we have $x=\sum_{j=1}^{n}\alpha_j x_j$ for some $\{\alpha_j\}_{j=1}^{n}\subset\mathbb{F}$. In turn, by the triangle inequality on $X$ and the
ordinary Cauchy-Schwarz inequality on $\mathbb{R}^n$,

$$\|x\|\le\sum_{j=1}^{n}|\alpha_j|\,\|x_j\|\le C\sum_{j=1}^{n}|\alpha_j|\le C\sqrt{n}\,\sqrt{\sum_{j=1}^{n}|\alpha_j|^2},\qquad C:=\max\{\|x_1\|,\dots,\|x_n\|\}<\infty, \tag{1.3}$$

so the map (1.2), i.e. the linear, bijective map $T:(\alpha_1,\dots,\alpha_n)\mapsto\sum_{j=1}^{n}\alpha_j x_j$ from $\mathbb{F}^n$ to $X$, is
bounded by (1.3) provided $(\mathbb{F}^n,\|\cdot\|_e)$ is equipped with the standard euclidean norm

$$\|(\alpha_1,\dots,\alpha_n)\|_e:=\sqrt{\sum_{j=1}^{n}|\alpha_j|^2}.$$

But $f:X\to[0,\infty)$ with $f(x)=\|x\|$, $x\in X$, is (Lipschitz) continuous since by the triangle
inequality on $X$,

$$\big|\,\|x\|-\|y\|\,\big|\le\|x-y\|,\qquad x,y\in X,$$

so $f\circ T:\mathbb{F}^n\to[0,\infty)$ is continuous on the sphere $\partial B_1(0):=\{\alpha=(\alpha_1,\dots,\alpha_n)\in\mathbb{F}^n:\ \|\alpha\|_e=1\}$
by Theorem 9 and this closed and bounded sphere is compact by Theorem 3. Thus, by Theorem
5, there exists $\beta\in\partial B_1(0)\subset\mathbb{F}^n$ with $f(T\alpha)\ge f(T\beta)$ for all $\alpha\in\partial B_1(0)$ and $c:=\|T\beta\|>0$ by
linear independence of $\{x_j\}_{j=1}^{n}$. Thus

$$\|T\alpha\|\ge c\,\|\alpha\|_e>0\qquad\forall\,\alpha\in\partial B_1(0)\subset\mathbb{F}^n$$

which shows that, by linearity of the inverse $T^{-1}:X\to\mathbb{F}^n$ and with $d:=1/c$,

$$\|T^{-1}x\|_e\le d\,\|x\|\qquad\forall\,x\in X, \tag{1.4}$$

in other words, the inverse is also bounded. With the above at hand, let $(y_k)_{k=1}^{\infty}\subset X$ be a Cauchy
sequence with coordinates $(\alpha_k)_{k=1}^{\infty}\subset\mathbb{F}^n$, i.e. $\alpha_k=(\alpha_{k1},\dots,\alpha_{kn})\in\mathbb{F}^n$ with $y_k=\sum_{j=1}^{n}\alpha_{kj}x_j$,
then by (1.4), for every $m,k\in\mathbb{N}$,

$$\|\alpha_m-\alpha_k\|_e=\|T^{-1}(y_m-y_k)\|_e\le d\,\|y_m-y_k\|,$$

which tells us that $(\alpha_k)_{k=1}^{\infty}\subset\mathbb{F}^n$ is Cauchy and thus convergent (recall $\mathbb{F}=\mathbb{R}$ or $\mathbb{F}=\mathbb{C}$) to some
$\alpha^*=(\alpha_1^*,\dots,\alpha_n^*)\in\mathbb{F}^n$. In turn, by (1.3),

$$\bigg\|y_k-\sum_{j=1}^{n}\alpha_j^*x_j\bigg\|\le C\sqrt{n}\,\|\alpha_k-\alpha^*\|_e\to 0\quad\text{as } k\to\infty,$$

which shows that $(X,\|\cdot\|)$ is complete. Next, since (1.3) and (1.4) hold true for any two norms
$\|\cdot\|_1$ and $\|\cdot\|_2$ on $X$ (with constants $C$ and $d$ chosen large enough to serve both norms), we find at once, for every $x\in X$,

$$\|x\|_1\overset{(1.3)}{\le}C\sqrt{n}\,\|\alpha\|_e=C\sqrt{n}\,\|T^{-1}x\|_e\overset{(1.4)}{\le}C\sqrt{n}\,d\,\|x\|_2\overset{(1.3)}{\le}C^2 n\,d\,\|T^{-1}x\|_e\overset{(1.4)}{\le}\big(C\sqrt{n}\,d\big)^2\|x\|_1$$
and thus, after division by $C\sqrt{n}\,d>0$, the desired equivalence of norms. Next, let $K\subset X$ be
closed and bounded and $X$ finite dimensional. Then, $K=T(T^{-1}(K))$ is compact by the above
discussion and Theorems 3 and 5. Conversely, if every closed and bounded set in $X$ is compact,
then $\overline{B}_1(0)\subset X$ is necessarily compact, so by Theorem 2 totally bounded. But this means we
can find finitely many $x_1,\dots,x_n\in\overline{B}_1(0)$ such that

$$\overline{B}_1(0)\subset\bigcup_{j=1}^{n}B_{\frac{1}{2}}(x_j). \tag{1.5}$$

Set $U:=\mathrm{span}\{x_1,\dots,x_n\}$ (which is finite dimensional, thus complete and closed) and now pick
an arbitrary $x\in\overline{B}_1(0)$. By (1.5) we can find $u_1\in U$ and $x_1^*\in B_1(0)$ so that $x=u_1+x_1^*/2$. But
using (1.5) again, for the given $x_1^*$ we can then find $u_2\in U$ and $x_2^*\in B_1(0)$ so that $x_1^*=u_2+x_2^*/2$,
i.e. together

$$x=u_1+\frac{1}{2}x_1^*=\underbrace{u_1+\frac{1}{2}u_2}_{\in U}+\frac{1}{4}x_2^*.$$

Continuing in this fashion we see that for arbitrary $x\in\overline{B}_1(0)$ and $n\in\mathbb{N}$ we can find $u_n\in U$
and $x_n^*\in B_1(0)$ so that $x=u_n+x_n^*/2^n$. But then

$$\|x-u_n\|\le 2^{-n}\to 0\quad\text{as } n\to\infty,$$

so $(u_n)_{n=1}^{\infty}\subset U$ converges to $x$. But $U$ is closed so we must have $x\in U$ and thus $\overline{B}_1(0)\subset U$. Given
that the radii of the involved balls are irrelevant, we thus obtain $X=U$ so $\dim(X)\le n<\infty$.
This concludes our proof. □
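
As a numerical illustration of part (ii) of Theorem 1.3, here is a minimal Python sketch that compares two concrete norms on $\mathbb{R}^n$, the maximum norm and the euclidean norm. The constants $c=1$ and $d=\sqrt{n}$ used below are the standard ones for this particular pair of norms and are an assumption of the sketch, not the constants produced by the proof; all function names are hypothetical.

```python
import math
import random

def norm_max(x):
    """Maximum norm of a finite real vector."""
    return max(abs(t) for t in x)

def norm_euclid(x):
    """Euclidean norm of a finite real vector."""
    return math.sqrt(sum(t * t for t in x))

def check_equivalence(n, trials=1000, seed=0):
    """Test norm_max(x) <= norm_euclid(x) <= sqrt(n) * norm_max(x) on random samples."""
    rng = random.Random(seed)
    tol = 1e-12  # small tolerance for floating point rounding
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        lower = norm_max(x)
        middle = norm_euclid(x)
        upper = math.sqrt(n) * norm_max(x)
        if not (lower <= middle + tol and middle <= upper + tol):
            return False
    return True

print(check_equivalence(5))
```

A random test of this kind cannot prove the inequality, it only fails to find a counterexample; the theorem guarantees that suitable constants exist for every pair of norms on a finite-dimensional space.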

Example 1. Let $X=C[0,1]$ denote the vector space of continuous functions defined on $[0,1]\subset\mathbb{R}$
with values in $\mathbb{F}$. We can turn $X$ into a normed linear space by either considering

$$\|f\|_\infty:=\max_{x\in[0,1]}|f(x)|, \tag{1.6}$$

or by introducing

$$\|f\|_p:=\bigg(\int_0^1|f(x)|^p\,dx\bigg)^{\frac{1}{p}},\qquad 1\le p<\infty. \tag{1.7}$$

It is easy to check that (1.6) satisfies all norm properties and the same is true for (1.7), except that
the triangle inequality requires more effort for $p\notin\{1,2\}$. The details are below.

Theorem 1.4: Continuous Hölder inequality

Let $1\le p,q,r\le\infty$ with $\frac{1}{p}+\frac{1}{q}=\frac{1}{r}$. If $f,g\in C[0,1]$, then $fg\in C[0,1]$ and

$$\|fg\|_r\le\|f\|_p\,\|g\|_q.$$

Proof. The case $p=q=r=\infty$ is obvious, so we may assume that $r<\infty$ (and that neither $f$ nor $g$ vanishes identically, since otherwise both sides are zero). By considering

$$\hat{f}:=\frac{|f|^r}{\|f\|_p^r},\qquad\hat{g}:=\frac{|g|^r}{\|g\|_q^r}$$

we have with $\hat{p}:=\frac{p}{r}$ and $\hat{q}:=\frac{q}{r}$ that $\|\hat{f}\|_{\hat{p}}=\|\hat{g}\|_{\hat{q}}=1$ and $\frac{1}{\hat{p}}+\frac{1}{\hat{q}}=1$. We are thus reduced to
the special case where

$$\|f\|_p=\|g\|_q=1,\qquad\frac{1}{p}+\frac{1}{q}=1.$$

But Young’s inequality states $xy\le\frac{x^p}{p}+\frac{y^q}{q}$ whenever $x,y\ge 0$ and $1<p,q<\infty$ with $\frac{1}{p}+\frac{1}{q}=1$,
so we find for all $x\in[0,1]$,

$$|f(x)g(x)|\le\frac{1}{p}|f(x)|^p+\frac{1}{q}|g(x)|^q,$$

and thus after integration $\|fg\|_1\le\frac{1}{p}+\frac{1}{q}=1$. The proof is completed. □

Theorem 1.5: Continuous Minkowski inequality

Let $1\le p\le\infty$ and $f,g\in C[0,1]$. Then $f+g\in C[0,1]$ and

$$\|f+g\|_p\le\|f\|_p+\|g\|_p.$$

Proof. We may assume that $1<p<\infty$ and $\|f+g\|_p\neq 0$. Since

$$\begin{aligned}
\|f+g\|_p^p&=\int_0^1|f(x)+g(x)|^p\,dx=\int_0^1|f(x)+g(x)|^{p-1}\,|f(x)+g(x)|\,dx\\
&\le\int_0^1|f(x)+g(x)|^{p-1}\,|f(x)|\,dx+\int_0^1|f(x)+g(x)|^{p-1}\,|g(x)|\,dx\\
&\le\big(\|f\|_p+\|g\|_p\big)\bigg(\int_0^1|f(x)+g(x)|^{q(p-1)}\,dx\bigg)^{\frac{1}{q}}
\end{aligned}$$

by Theorem 1.4 with $\frac{1}{p}+\frac{1}{q}=1$ in the second inequality, we find with $q(p-1)=p$ that the
above yields

$$\|f\|_p+\|g\|_p\ge\|f+g\|_p^{p}\,\|f+g\|_p^{-p/q}=\|f+g\|_p,$$

which is the desired result. □

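Example 2. Let $x=(x_n)_{n=1}^{\infty}\subset\mathbb{F}$ denote a sequence of numbers. We introduce the vector space of
all bounded sequences

$$\ell_\infty(\mathbb{N}):=\Big\{(x_n)_{n=1}^{\infty}:\ \sup_{n\in\mathbb{N}}|x_n|<\infty\Big\},$$

and the vector space of all $p$-summable sequences

$$\ell_p(\mathbb{N}):=\bigg\{(x_n)_{n=1}^{\infty}:\ \sum_{n=1}^{\infty}|x_n|^p<\infty\bigg\},\qquad 1\le p<\infty.$$

Both spaces are normed linear spaces if we equip them with the following norms

$$\ell_\infty(\mathbb{N}):\ \|x\|_\infty:=\sup_{n\in\mathbb{N}}|x_n|,\qquad\text{respectively}\qquad\ell_p(\mathbb{N}):\ \|x\|_p:=\bigg(\sum_{n=1}^{\infty}|x_n|^p\bigg)^{\frac{1}{p}},\quad 1\le p<\infty.$$

Just as in Example 1 it is easy to show that $\|\cdot\|_\infty$ is a norm on $\ell_\infty(\mathbb{N})$, but we require more work
for $\|\cdot\|_p$ on $\ell_p(\mathbb{N})$, see below.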

Theorem 1.6: Discrete Hölder and Minkowski inequality

Let $1\le p,q,r\le\infty$ with $\frac{1}{p}+\frac{1}{q}=\frac{1}{r}$. If $x\in\ell_p(\mathbb{N})$, $y\in\ell_q(\mathbb{N})$, then $xy\in\ell_r(\mathbb{N})$ and

$$\|xy\|_r\le\|x\|_p\,\|y\|_q.$$

Furthermore, if $1\le p\le\infty$ and $x,y\in\ell_p(\mathbb{N})$, then $x+y\in\ell_p(\mathbb{N})$ and

$$\|x+y\|_p\le\|x\|_p+\|y\|_p.$$

Proof. Exactly as in the proofs of Theorems 1.4 and 1.5, except that we formally replace $f(x)\mapsto
x_n$, $g(x)\mapsto y_n$ and integration over $x$ with summation over $n$, first truncated then in a limit. □
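
The two inequalities of Theorem 1.6 can be checked numerically on finitely supported sequences. The following Python sketch is only an illustration under that assumption (finite lists stand in for sequences with all remaining entries equal to zero); the function names are hypothetical.

```python
def p_norm(x, p):
    """l_p norm of a finite sequence for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def holder_gap(x, y, p, q):
    """Return ||x||_p * ||y||_q - ||xy||_r with 1/r = 1/p + 1/q."""
    r = 1.0 / (1.0 / p + 1.0 / q)
    xy = [a * b for a, b in zip(x, y)]  # entrywise product
    return p_norm(x, p) * p_norm(y, q) - p_norm(xy, r)

def minkowski_gap(x, y, p):
    """Return ||x||_p + ||y||_p - ||x + y||_p."""
    s = [a + b for a, b in zip(x, y)]  # entrywise sum
    return p_norm(x, p) + p_norm(y, p) - p_norm(s, p)

x = [1.0, -2.0, 0.5, 3.0]
y = [0.25, 4.0, -1.0, 2.0]

# Both gaps should be nonnegative if the inequalities hold.
print(holder_gap(x, y, 2, 2) >= 0)
print(holder_gap(x, y, 3, 1.5) >= 0)
print(minkowski_gap(x, y, 3) >= 0)
```

The exponent pairs (2, 2) and (3, 1.5) both give $r=1$; other admissible pairs can be tried as long as $r\ge 1$.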

In conclusion of this section and Chapter 1 we introduce some further terminology. First, every
$T\in\mathcal{L}(X,Y)$ satisfies, with $C>0$,

$$\|Tx\|_Y\le C\,\|x\|_X$$

for all $x\in X$, see Definition 13. The smallest such constant $C$ plays a very important role.

Definition 14. Given $T\in\mathcal{L}(X,Y)$, a bounded linear transformation between two normed linear
spaces $(X,\|\cdot\|_X)$ and $(Y,\|\cdot\|_Y)$, we introduce

$$\|T\|:=\sup_{x\in X\setminus\{0\}}\frac{\|Tx\|_Y}{\|x\|_X}=\sup_{\|x\|_X=1}\|Tx\|_Y,$$

and call it the operator norm of $T$.

Here is an important class of bounded linear transformations which will be studied extensively
in Chapter 3:

Definition 15. The dual space of a normed linear space, 𝑋 , is the vector space 𝑋 ∗ = L (𝑋, F) of
continuous linear transformations of 𝑋 to F. The elements of 𝑋 ∗ are called continuous linear
functionals.

Theorem 10. The operator norm is a norm on $\mathcal{L}(X,Y)$ and $\mathcal{L}(X,Y)$ is complete if $(Y,\|\cdot\|_Y)$ is
complete.

Proof. Homogeneity and definiteness of the operator norm follow from the properties of $\|\cdot\|_X$, $\|\cdot\|_Y$
and the linearity of $T$. As for the triangle inequality, we note that for $x\in\partial B_1(0)$ and $S,T\in
\mathcal{L}(X,Y)$,

$$\|(S+T)x\|_Y\le\|Sx\|_Y+\|Tx\|_Y\le\|S\|+\|T\|,$$

so taking the supremum over $\partial B_1(0)$, we find the desired inequality for $\|S+T\|$. On the other
hand, if $(T_n)_{n=1}^{\infty}\subset\mathcal{L}(X,Y)$ is Cauchy, then for any $x\in X$,

$$\|T_nx-T_mx\|_Y=\|(T_n-T_m)x\|_Y\le\|T_n-T_m\|\,\|x\|_X,$$

so $(T_nx)_{n=1}^{\infty}\subset Y$ is Cauchy for every $x\in X$. Since $Y$ is complete, it is therefore convergent with
limit $y\in Y$. Now define $T:X\to Y$ via $Tx:=y$. This map is linear since for any $x_1,x_2\in X$ and
$\alpha,\beta\in\mathbb{F}$,

$$\alpha Tx_1+\beta Tx_2=\lim_{n\to\infty}\big(\alpha T_nx_1+\beta T_nx_2\big)=\lim_{n\to\infty}T_n(\alpha x_1+\beta x_2)=T(\alpha x_1+\beta x_2),$$

and bounded: for any $x\in\partial B_1(0)$,

$$\|Tx\|_Y=\Big\|\lim_{n\to\infty}(T_nx)\Big\|_Y=\lim_{n\to\infty}\|T_nx\|_Y\le\lim_{n\to\infty}\|T_n\|\,\|x\|_X\le c\,\|x\|_X,$$

where we used that $x\mapsto\|x\|$ is Lipschitz continuous and $(\|T_n\|)_{n=1}^{\infty}\subset\mathbb{R}$ a Cauchy sequence, so
uniformly bounded. Finally, $T_n\to T$ in operator norm: given $\epsilon>0$ choose $N$ with $\|T_n-T_m\|<\epsilon$ for $n,m>N$; then for $x\in\partial B_1(0)$ and $n>N$ we have $\|T_nx-Tx\|_Y=\lim_{m\to\infty}\|T_nx-T_mx\|_Y\le\epsilon$, hence $\|T_n-T\|\le\epsilon$. This completes our proof of the theorem. □

Corollary 1. For any $S\in\mathcal{L}(X,Y)$, $T\in\mathcal{L}(Y,Z)$ we have $T\circ S\equiv TS\in\mathcal{L}(X,Z)$ and

$$\|TS\|\le\|T\|\,\|S\|.$$

Proof. By linearity of $S$ and $T$, for any $x,y\in X$ and $\alpha,\beta\in\mathbb{F}$,

$$TS(\alpha x+\beta y)=T\big(\alpha Sx+\beta Sy\big)=\alpha T(Sx)+\beta T(Sy),$$

i.e. the linearity of $TS:X\to Z$ is established. However, compositions of continuous functions
on metric spaces are continuous, so by Theorem 9, we have $TS\in\mathcal{L}(X,Z)$. Finally, for any
$x\in X$,

$$\|TSx\|_Z=\|T(Sx)\|_Z\le\|T\|\,\|Sx\|_Y\le\|T\|\,\|S\|\,\|x\|_X$$

using $T\in\mathcal{L}(Y,Z)$ in the first and $S\in\mathcal{L}(X,Y)$ in the second inequality. All together, $\|TS\|\le
\|T\|\,\|S\|$, as claimed. □

Before discussing a few concrete bounded linear transformations, we record the final definition
of this Section.

Definition 16. Let 𝑇 ∈ L (𝑋, 𝑌 ) be a bounded linear transformation between two normed linear
spaces 𝑋 and 𝑌 . We call
Ker(𝑇 ) = {𝑥 ∈ 𝑋 : 𝑇 𝑥 = 0}, respectively Ran(𝑇 ) := {𝑇 𝑥 ∈ 𝑌 : 𝑥 ∈ 𝑋 } = 𝑇 [𝑋 ],
the kernel, respectively range, of 𝑇 . Note that Ker(𝑇 ) is always a closed subspace of 𝑋 , but
Ran(𝑇 ) is in general not a closed subspace of 𝑌 .

Example 3. Let $(X,\|\cdot\|_X)$ and $(Y,\|\cdot\|_Y)$ be normed linear spaces with $n=\dim(X)<\infty$. Then
any linear map $T:X\to Y$ is a bounded linear transformation. Indeed, if $\{x_j\}_{j=1}^{n}$ is a basis for $X$,
then for any $x=\sum_{j=1}^{n}\alpha_jx_j$ with $\alpha_j\in\mathbb{F}$,

$$\|Tx\|_Y=\bigg\|\sum_{j=1}^{n}\alpha_jTx_j\bigg\|_Y\le\sum_{j=1}^{n}|\alpha_j|\,\|Tx_j\|_Y\le\underbrace{\sqrt{\sum_{j=1}^{n}\|Tx_j\|_Y^2}}_{=:c<\infty}\,\|\alpha\|_e=c\,\|\alpha\|_e$$

by linearity of $T$, the triangle inequality on $Y$ and the Cauchy-Schwarz inequality on $\mathbb{R}^n$. However,
by (1.4) we have $\|\alpha\|_e\le d\,\|x\|_X$ and thus altogether, for any $x\in X$,

$$\|Tx\|_Y\le\hat{c}\,\|x\|_X,\qquad\hat{c}>0.$$

This shows $T\in\mathcal{L}(X,Y)$.

Example 4. Let $(t_n)_{n=1}^{\infty}\in\ell_\infty(\mathbb{N})$ be a bounded sequence of real or complex numbers. Now define
$T:\ell_p(\mathbb{N})\to\ell_p(\mathbb{N})$ for $1\le p\le\infty$ as the multiplication operator

$$T:(x_1,x_2,x_3,\dots)\mapsto(t_1x_1,t_2x_2,t_3x_3,\dots).$$

Clearly, $T$ is a linear map on $(\ell_p(\mathbb{N}),\|\cdot\|_p)$ with

$$1\le p<\infty:\qquad\|Tx\|_p^p=\sum_{n=1}^{\infty}|t_nx_n|^p\le\sum_{n=1}^{\infty}\Big(\sup_{m\in\mathbb{N}}|t_m|^p\Big)|x_n|^p=\|t\|_\infty^p\sum_{n=1}^{\infty}|x_n|^p,$$

and

$$p=\infty:\qquad\|Tx\|_\infty=\sup_{n\in\mathbb{N}}|t_nx_n|\le\|t\|_\infty\sup_{n\in\mathbb{N}}|x_n|.$$

Hence, $\|Tx\|_p\le\|t\|_\infty\|x\|_p$ for any $1\le p\le\infty$ and all $x\in\ell_p(\mathbb{N})$ which shows that $T\in\mathcal{L}(\ell_p(\mathbb{N}))$
is a bounded linear operator on $\ell_p(\mathbb{N})$. Moreover, the last inequality on $\|Tx\|_p$ yields

$$\|T\|\le\|t\|_\infty \tag{1.8}$$

by Definition 14. We will now show that the operator norm $\|T\|$ is in fact equal to $\|t\|_\infty$, i.e. we
show that (1.8) holds with equality. To this end, by definition of $\|t\|_\infty$, given any $\epsilon>0$ there exists
$N=N(\epsilon)\in\mathbb{N}$ so that $|t_N|>\|t\|_\infty-\epsilon$. Now consider the sequence

$$y=(y_1,y_2,y_3,\dots)\quad\text{with}\quad y_n=\begin{cases}1,&n=N\\0,&n\neq N\end{cases},$$

which satisfies $\|y\|_p=1$ and $\|Ty\|_p=|t_N|>\|t\|_\infty-\epsilon$. Thus, by Definition 14,

$$\|T\|=\sup_{\|x\|_p=1}\|Tx\|_p\ge\|Ty\|_p>\|t\|_\infty-\epsilon,$$

which yields the reversed inequality to (1.8) since $\epsilon>0$ was arbitrary. Put together, as promised,
we have $\|T\|=\|t\|_\infty$.
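
The argument of Example 4 can be traced on a finite truncation. The Python sketch below assumes a finitely supported multiplier $t$, so that the supremum is attained and $\epsilon$ may be taken to be zero; this simplification and all function names are assumptions made for illustration only.

```python
def p_norm(x, p):
    """l_p norm of a finite sequence for 1 <= p < infinity."""
    return sum(abs(a) ** p for a in x) ** (1.0 / p)

def multiply(t, x):
    """Apply the multiplication operator T: x -> (t_1 x_1, t_2 x_2, ...)."""
    return [a * b for a, b in zip(t, x)]

def attained_ratio(t, p):
    """Return ||T y||_p / ||y||_p for the unit vector y supported at an index of largest |t_n|."""
    N = max(range(len(t)), key=lambda n: abs(t[n]))
    y = [1.0 if n == N else 0.0 for n in range(len(t))]
    return p_norm(multiply(t, y), p) / p_norm(y, p)

t = [0.5, -3.0, 2.0, 1.0]
sup_t = max(abs(a) for a in t)  # plays the role of the sup norm of t
x = [1.0, 1.0, -2.0, 0.5]

# Upper bound as in (1.8): ||T x||_p <= ||t||_inf * ||x||_p
print(p_norm(multiply(t, x), 3) <= sup_t * p_norm(x, 3))
# The bound is attained on the unit vector y, up to rounding
print(abs(attained_ratio(t, 3) - sup_t) < 1e-12)
```

For a genuinely infinite bounded sequence the supremum need not be attained, which is why the example works with an arbitrary $\epsilon>0$.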

Example 5. Consider $(\ell_p(\mathbb{N}),\|\cdot\|_p)$ over $\mathbb{F}=\mathbb{R}$ or $\mathbb{F}=\mathbb{C}$ and let $R:\ell_p(\mathbb{N})\to\ell_p(\mathbb{N})$ with
$1\le p\le\infty$ denote the right shift operator

$$R:(x_1,x_2,x_3,x_4,\dots)\mapsto(0,x_1,x_2,x_3,\dots). \tag{1.9}$$

Clearly $R\in\mathcal{L}(\ell_p(\mathbb{N}))$ with $\|R\|=1$ since $\|Rx\|_p=\|x\|_p$ for every $1\le p\le\infty$ and all $x\in\ell_p(\mathbb{N})$.
Next let $L:\ell_p(\mathbb{N})\to\ell_p(\mathbb{N})$ with $1\le p\le\infty$ denote the left shift operator

$$L:(x_1,x_2,x_3,x_4,\dots)\mapsto(x_2,x_3,x_4,x_5,\dots). \tag{1.10}$$

Then $L\in\mathcal{L}(\ell_p(\mathbb{N}))$ with $\|L\|\le 1$ since $\|Lx\|_p\le\|x\|_p$ for every $1\le p\le\infty$ and all $x\in\ell_p(\mathbb{N})$.
However, with $y=(0,1,0,0,0,\dots)\in\ell_p(\mathbb{N})$ we find $\|y\|_p=1$ and $\|Ly\|_p=1$, so all together we
have $\|L\|=1$.
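
The norm statements of Example 5 can be observed on finitely supported sequences. In the following Python sketch a finite list stands for a sequence whose remaining entries are all zero; this representation and the function names are assumptions for illustration.

```python
def p_norm(x, p):
    """l_p norm of a finite sequence for 1 <= p < infinity."""
    return sum(abs(a) ** p for a in x) ** (1.0 / p)

def right_shift(x):
    """R: (x_1, x_2, ...) -> (0, x_1, x_2, ...)."""
    return [0.0] + list(x)

def left_shift(x):
    """L: (x_1, x_2, x_3, ...) -> (x_2, x_3, ...); the first entry is dropped."""
    return list(x[1:])

x = [3.0, -1.0, 2.0]
y = [0.0, 1.0, 0.0]  # the sequence y = (0, 1, 0, 0, ...) from the example
p = 2

print(abs(p_norm(right_shift(x), p) - p_norm(x, p)) < 1e-12)  # R preserves the norm
print(p_norm(left_shift(x), p) <= p_norm(x, p))               # L does not increase it
print(p_norm(left_shift(y), p) == p_norm(y, p) == 1.0)        # equality for y
```

The vector x shows that the left shift can strictly decrease the norm, since its first entry is lost, while y shows that the bound 1 is attained.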

Example 6. Consider $(C[0,1],\|\cdot\|_\infty)$ and let $K:C[0,1]\to C[0,1]$ denote the integral operator

$$(Kf)(x):=\int_0^1k(x,y)f(y)\,dy,$$

where the kernel function $k:[0,1]\times[0,1]\to\mathbb{C}$ is continuous on the square $[0,1]\times[0,1]\subset
\mathbb{R}^2$. Note that $K$ is well-defined, linear and by the triangle inequality for integrals, $\|Kf\|_\infty\le
\|f\|_\infty\max\{|k(x,y)|:\ x,y\in[0,1]\}$. Hence $K\in\mathcal{L}(C[0,1])$ with $\|K\|\le\max\{|k(x,y)|:\ x,y\in
[0,1]\}$.
The last four examples conclude Chapter 1.

2 Hilbert spaces

2.1 The geometry of Hilbert spaces


In this Section we study vector spaces over F = C that have an inner product, a generalization
of the usual dot product on C𝑛 . The geometric properties of these spaces follow from the notion
of angle which is implicit in the definition of an inner product.

Definition 17. Let $V$ be a vector space over $\mathbb{F}=\mathbb{C}$. A map $\langle\cdot,\cdot\rangle:V\times V\to\mathbb{C}$ is called an inner
product if and only if the following three conditions are satisfied:
(i) $\langle x,x\rangle\ge 0$ for all $x\in V$ and $\langle x,x\rangle=0$ if and only if $x=0\in V$ (positive definiteness)
(ii) $\langle x,y\rangle=\overline{\langle y,x\rangle}$ for all $x,y\in V$ (conjugate symmetry)
(iii) $\langle\alpha x+\beta y,z\rangle=\alpha\langle x,z\rangle+\beta\langle y,z\rangle$ for all $x,y,z\in V$ and $\alpha,\beta\in\mathbb{C}$ (linearity in first vector)
A vector space with a distinguished inner product is called an inner product space.

Remark. A real inner product space $V$ is a vector space over $\mathbb{F}=\mathbb{R}$ with a distinguished map
$\langle\cdot,\cdot\rangle:V\times V\to\mathbb{R}$ that satisfies (i) and (iii) above, but (ii) gets replaced by $\langle x,y\rangle=\langle y,x\rangle$.

Example 7. Let $\mathbb{C}^n$ denote the vector space of all $n$-tuples of complex numbers. For any $x = (x_1, \ldots, x_n) \in \mathbb{C}^n$ and $y = (y_1, \ldots, y_n) \in \mathbb{C}^n$ we define

\[ \langle x, y\rangle := \sum_{j=1}^{n} x_j \overline{y_j}. \tag{2.1} \]

Then $(\mathbb{C}^n, \langle\cdot,\cdot\rangle)$ is an inner product space.

Example 8. Let $C[0,1]$ denote the complex-valued continuous functions on $[0,1] \subset \mathbb{R}$. For $f, g \in C[0,1]$ define

\[ \langle f, g\rangle := \int_0^1 f(x)\overline{g(x)}\,dx, \tag{2.2} \]

then $(C[0,1], \langle\cdot,\cdot\rangle)$ is an inner product space.


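Example 9. Let $\ell_2(\mathbb{N})$ denote the vector space of all square summable sequences of complex numbers. For $x, y \in \ell_2(\mathbb{N})$ define

\[ \langle x, y\rangle := \sum_{n=1}^{\infty} x_n \overline{y_n}. \tag{2.3} \]

Then $\langle\cdot,\cdot\rangle$ is well-defined by Theorem 1.6 and $(\ell_2(\mathbb{N}), \langle\cdot,\cdot\rangle)$ is an inner product space.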

We now develop the geometrical notions that extend to arbitrary inner product spaces.

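Definition 18. Two vectors $x$ and $y$ in an inner product space $(V, \langle\cdot,\cdot\rangle)$ are said to be orthogonal if and only if $\langle x, y\rangle = 0$. A collection $\{x_\alpha\}_{\alpha \in I}$ of vectors in $V$ is called orthonormal if and only if $\langle x_\alpha, x_\beta\rangle = 0$ for $\alpha \ne \beta$ and $\langle x_\alpha, x_\alpha\rangle = 1$ for all $\alpha \in I$.

We will use the shorthand $\|x\| = \sqrt{\langle x, x\rangle} \ge 0$ and shortly realize that $\|\cdot\|$ is in fact a norm on $V$. First, for complex inner product spaces, one can recover $\{\langle x, y\rangle\}_{x,y \in V}$ from $\{\langle x, x\rangle\}_{x \in V}$ by the polarization identity

\[ \langle x, y\rangle = \frac{1}{4}\|x + y\|^2 - \frac{1}{4}\|x - y\|^2 + \frac{\mathrm{i}}{4}\|x + \mathrm{i}y\|^2 - \frac{\mathrm{i}}{4}\|x - \mathrm{i}y\|^2 \]

which is an easy consequence of the properties of $\langle\cdot,\cdot\rangle$.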
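As a sanity check, the polarization identity can be tested numerically on $\mathbb{C}^5$ with the inner product (2.1). This is a minimal sketch; the helper names `inner`, `norm_sq`, and `polarization` and the random test vectors are assumptions made for illustration.

```python
import random

def inner(x, y):
    """Inner product (2.1) on C^n: linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm_sq(x):
    """Squared norm <x, x>, which is real and nonnegative."""
    return inner(x, x).real

def polarization(x, y):
    """Right-hand side of the polarization identity, built from norms only."""
    def combo(c):
        return [a + c * b for a, b in zip(x, y)]
    return (norm_sq(combo(1)) - norm_sq(combo(-1))
            + 1j * norm_sq(combo(1j)) - 1j * norm_sq(combo(-1j))) / 4

random.seed(0)
x = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(5)]
y = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(5)]

gap = abs(inner(x, y) - polarization(x, y))
print(gap)  # should be at the level of floating-point rounding
```

Note that the sign pattern depends on the convention that the inner product is linear in its first argument; with linearity in the second argument the two imaginary terms swap signs.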

Theorem 11 (Parallelogram identity). Let $(V, \langle\cdot,\cdot\rangle)$ be an inner product space with $\|\cdot\| = \sqrt{\langle\cdot,\cdot\rangle}$. Then for any $x, y \in V$, we have

\[ \|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2. \tag{2.4} \]

Proof. Since

\[ \|x \pm y\|^2 = \|x\|^2 + \|y\|^2 \pm 2\,\Re\langle x, y\rangle \]

by the properties of an inner product, the claim follows by adding the two identities. $\square$

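Theorem 12 (Pythagorean theorem). Let $\{x_j\}_{j=1}^{n}$ be a finite orthonormal set in an inner product space $(V, \langle\cdot,\cdot\rangle)$ with $\|\cdot\| = \sqrt{\langle\cdot,\cdot\rangle}$. Then for any $y \in V$,

\[ \sum_{j=1}^{n} \big|\langle y, x_j\rangle\big|^2 + \Big\| y - \sum_{j=1}^{n} \langle y, x_j\rangle x_j \Big\|^2 = \|y\|^2. \tag{2.5} \]

Proof. We have, with $\hat{y} := \sum_{j=1}^{n} \langle y, x_j\rangle x_j \in V$, that

\[ \langle y, \hat{y}\rangle = \sum_{j=1}^{n} \big|\langle y, x_j\rangle\big|^2, \qquad \langle \hat{y}, \hat{y}\rangle = \sum_{j=1}^{n} \big|\langle y, x_j\rangle\big|^2 \]

and therefore

\[ \|y - \hat{y}\|^2 = \|y\|^2 + \|\hat{y}\|^2 - 2\,\Re\langle y, \hat{y}\rangle = \|y\|^2 - \sum_{j=1}^{n} \big|\langle y, x_j\rangle\big|^2 \]

which proves (2.5). $\square$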


Corollary 2 (Bessel's inequality). Let $\{x_j\}_{j=1}^{n}$ be a finite orthonormal set in an inner product space $(V, \langle\cdot,\cdot\rangle)$ with $\|\cdot\| = \sqrt{\langle\cdot,\cdot\rangle}$. Then for any $y \in V$,

\[ \sum_{j=1}^{n} \big|\langle y, x_j\rangle\big|^2 \le \|y\|^2. \tag{2.6} \]

Proof. Simply drop the second nonnegative term in the left hand side of (2.5). $\square$

Corollary 3 (Cauchy-Schwarz inequality). Let $(V, \langle\cdot,\cdot\rangle)$ be an inner product space with $\|\cdot\| = \sqrt{\langle\cdot,\cdot\rangle}$. Then for any $x, y \in V$,

\[ |\langle x, y\rangle| \le \|x\|\,\|y\|. \tag{2.7} \]

Proof. If $x = 0 \in V$, both sides in the inequality (2.7) equal zero, so the inequality holds. Otherwise, if $x \ne 0$, set $x_1 = x/\|x\|$, so $\{x_1\}$ is a finite orthonormal set and thus by Bessel's inequality

\[ |\langle y, x_1\rangle| \le \|y\|. \]

Using now the properties of the inner product, we find the desired inequality (2.7). $\square$
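For completeness, the last step of the proof can be written out as follows; this only spells out the conjugate symmetry and homogeneity used implicitly there, with $x_1 = x/\|x\|$ for $x \ne 0$:

```latex
\[
  |\langle x, y\rangle|
  = \big|\overline{\langle y, x\rangle}\big|
  = \big|\langle y, \|x\|\,x_1\rangle\big|
  = \|x\|\,|\langle y, x_1\rangle|
  \le \|x\|\,\|y\| .
\]
```

Here $\|x\| > 0$ is real, so it can be pulled out of the second slot without conjugation.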

Corollary 4. Every inner product space $(V, \langle\cdot,\cdot\rangle)$ is a normed linear space with norm

\[ \|x\| = \sqrt{\langle x, x\rangle} \ge 0. \]

Proof. Since $V$ is a vector space, we need only verify that $\|\cdot\|$ has all the properties of a norm. But all of these properties, except the triangle inequality, follow immediately from the properties of $\langle\cdot,\cdot\rangle$. Hence suppose $x, y \in V$, then

\[ \|x + y\|^2 = \|x\|^2 + \|y\|^2 + 2\,\Re\langle x, y\rangle \le \|x\|^2 + \|y\|^2 + 2|\langle x, y\rangle| \overset{(2.7)}{\le} \|x\|^2 + \|y\|^2 + 2\|x\|\,\|y\| = \big(\|x\| + \|y\|\big)^2, \]

which gives $\|x + y\| \le \|x\| + \|y\|$, i.e. the outstanding triangle inequality. $\square$

One can ask when a norm comes from an inner product. Here is the simple answer which will
be proven in the exercises.

Theorem 2.1: Jordan-von Neumann

Let $(X, \|\cdot\|)$ be a normed linear space over $\mathbb{F} = \mathbb{C}$. Then $\|\cdot\|$ comes from an inner product if and only if $\|\cdot\|$ obeys the parallelogram identity (2.4).
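To see the criterion at work, one can evaluate the defect in the parallelogram identity for the $p$-norms on $\mathbb{R}^2$. The sketch below is illustrative only: the helper names and the choice of the two unit vectors `e1`, `e2` are assumptions, and a single pair of vectors can of course only disprove the identity, not prove it.

```python
def p_norm(x, p):
    """The p-norm of a finite real vector; p may be float('inf')."""
    if p == float("inf"):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def parallelogram_defect(x, y, p):
    """||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2 in the p-norm."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    lhs = p_norm(s, p) ** 2 + p_norm(d, p) ** 2
    rhs = 2 * p_norm(x, p) ** 2 + 2 * p_norm(y, p) ** 2
    return lhs - rhs

e1, e2 = [1.0, 0.0], [0.0, 1.0]
defects = {p: parallelogram_defect(e1, e2, p) for p in (1, 2, float("inf"))}
print(defects)
```

The defect vanishes (up to rounding) for $p = 2$ and is nonzero for $p = 1$ and $p = \infty$, so by the theorem neither of the latter two norms on $\mathbb{R}^2$ comes from an inner product.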

The above results state that every inner product space is a normed linear space and thus in particular a metric space. We thus have the notions of convergence, completeness, and density. In particular, we can always complete $(V, \langle\cdot,\cdot\rangle)$ to a normed linear space in which $V$ is isometrically embedded as a dense subset, see Theorem 1. This feature makes us single out complete inner product spaces:


Definition 19. A Hilbert space, H , is an inner product space which is complete in the induced
metric.

Remark. Inner product spaces are sometimes called pre-Hilbert spaces. A real Hilbert space
is a real inner product space which is complete in the induced metric.

Example 10. On $\mathbb{C}^n$, (2.1) defines an inner product and since $\mathbb{C}^n$ is finite dimensional, $(\mathbb{C}^n, \langle\cdot,\cdot\rangle)$ is a Hilbert space by Theorem 1.3.

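Example 11. On $C[0,1]$, (2.2) defines an inner product, however $(C[0,1], \langle\cdot,\cdot\rangle)$ is not a Hilbert space: consider for $n \in \mathbb{Z}_{\ge 2}$ the continuous, piecewise linear functions

\[ f_n(x) := \begin{cases} 0, & 0 \le x \le \frac{1}{2} - \frac{1}{n} \\ n\big(x - \frac{1}{2} + \frac{1}{n}\big), & \frac{1}{2} - \frac{1}{n} < x \le \frac{1}{2} \\ 1, & \frac{1}{2} < x \le 1 \end{cases}. \]

Then, by construction, for any $n, m \in \mathbb{Z}_{\ge 2}$,

\[ \int_0^1 |f_n(x) - f_m(x)|^2\,dx \le \int_{\min\{\frac{1}{2} - \frac{1}{n},\, \frac{1}{2} - \frac{1}{m}\}}^{\frac{1}{2}} (1 + 1)^2\,dx = 2\left(\frac{1}{n} + \frac{1}{m} + \left|\frac{1}{n} - \frac{1}{m}\right|\right) = 4\,\frac{\max\{n, m\}}{nm} \]

showing that $(f_n)_{n=2}^{\infty}$ is a Cauchy sequence. However, if for some $f_\infty \in C[0,1]$, $\|f_n - f_\infty\|_2 \to 0$ as $n \to \infty$, then by the uniqueness of the limit $f_\infty(x) = 1$ for $x \in [\frac{1}{2}, 1]$ and thus in turn by construction of $(f_n)_{n=2}^{\infty}$,

\[ \lim_{n\to\infty} \|f_n - f_\infty\|_2^2 = \lim_{n\to\infty} \int_0^1 |f_n(x) - f_\infty(x)|^2\,dx = 0 = \lim_{n\to\infty} \int_0^{\frac{1}{2}} |f_n(x) - f_\infty(x)|^2\,dx. \tag{2.8} \]

On the other hand, there exists $\delta > 0$ so that $f_\infty(x) > \frac{1}{2}$ for all $x \in [\frac{1}{2} - \delta, \frac{1}{2}]$ since $f_\infty \in C[0,1]$ and since $f_\infty(x) = 1$ for $x \in [\frac{1}{2}, 1]$. Now let $N = N(\delta) \in \mathbb{N}$ be such that $N > \frac{2}{\delta}$. Then for all $n \ge N$ we have that $f_n(x) = 0$ for $x \in [\frac{1}{2} - \delta, \frac{1}{2} - \frac{\delta}{2}]$ and consequently, when $n \ge N$,

\[ \int_0^{\frac{1}{2}} |f_n(x) - f_\infty(x)|^2\,dx \ge \int_{\frac{1}{2} - \delta}^{\frac{1}{2} - \frac{\delta}{2}} |f_\infty(x)|^2\,dx > \frac{\delta}{8}. \]

This contradicts (2.8) and hence, $(C[0,1], \langle\cdot,\cdot\rangle)$ is not a Hilbert space.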
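The Cauchy estimate in this example can be illustrated numerically. The sketch below assumes a simple midpoint rule with a hypothetical resolution `steps`; the function names are made up for illustration and the output only approximates the integrals.

```python
def f(n, x):
    """The ramp f_n of Example 11: 0, then linear with slope n, then 1."""
    if x <= 0.5 - 1.0 / n:
        return 0.0
    if x <= 0.5:
        return n * (x - 0.5 + 1.0 / n)
    return 1.0

def dist_sq(n, m, steps=20000):
    """Midpoint-rule approximation of the squared L2 distance of f_n and f_m."""
    h = 1.0 / steps
    return h * sum((f(n, (j + 0.5) * h) - f(m, (j + 0.5) * h)) ** 2
                   for j in range(steps))

pairs = [(2, 3), (5, 50), (100, 200), (400, 1000)]
rows = [(n, m, dist_sq(n, m), 4.0 * max(n, m) / (n * m)) for n, m in pairs]
for n, m, d, bound in rows:
    print(n, m, d, bound)
```

In each row the approximate squared distance stays below the bound $4\max\{n,m\}/(nm)$, and both shrink as $\min\{n, m\}$ grows, in line with the Cauchy property.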

Remark. The inner product (2.2) is not all bad. In fact, $C[0,1]$ is dense in the space $L^2[0,1]$ of Lebesgue square integrable functions on $[0,1]$. This space with (2.2) is a Hilbert space, compare a module on measure and integration theory.

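Example 12. On $\ell_2(\mathbb{N})$, (2.3) defines an inner product and $(\ell_2(\mathbb{N}), \langle\cdot,\cdot\rangle)$ is a Hilbert space: let $(x_n)_{n=1}^{\infty} \subset \ell_2(\mathbb{N})$ be a Cauchy sequence, noting that each term $x_n \in \ell_2(\mathbb{N})$ is itself a square-summable sequence of complex numbers which we denote as

\[ x_n = (x_{n1}, x_{n2}, x_{n3}, \ldots) \in \ell_2(\mathbb{N}). \]

Now fix $k \in \mathbb{N}$ and consider the sequence $(x_{nk})_{n=1}^{\infty} \subset \mathbb{C}$ of complex numbers. For any $n, m \in \mathbb{N}$,

\[ |x_{nk} - x_{mk}| = \big(|x_{nk} - x_{mk}|^2\big)^{\frac{1}{2}} \le \left(\sum_{j=1}^{\infty} |x_{nj} - x_{mj}|^2\right)^{\frac{1}{2}} = \|x_n - x_m\|_2, \]

so $(x_{nk})_{n=1}^{\infty}$ is Cauchy and therefore convergent in $\mathbb{C}$, $x_{nk} \to y_k$ as $n \to \infty$, say, and we define $y = (y_k)_{k=1}^{\infty}$. Next, given that $(x_n)_{n=1}^{\infty} \subset \ell_2(\mathbb{N})$ is Cauchy, for any $\epsilon > 0$ there is $N = N(\epsilon) > 0$ so that $\|x_n - x_m\|_2 < \epsilon$ whenever $n, m \ge N$. Hence for arbitrary $M, n \in \mathbb{N}$ by Theorem 1.6,

\[ \left(\sum_{k=1}^{M} |y_k|^2\right)^{\frac{1}{2}} = \left(\sum_{k=1}^{M} |y_k - x_{nk} + x_{nk} - x_{Nk} + x_{Nk}|^2\right)^{\frac{1}{2}} \le \left(\sum_{k=1}^{M} |y_k - x_{nk}|^2\right)^{\frac{1}{2}} + \|x_n - x_N\|_2 + \|x_N\|_2 \]

where $\|x_n - x_N\|_2 < \epsilon$ for $n \ge N$ and, by the convergence $x_{nk} \to y_k$ as $n \to \infty$,

\[ \lim_{n\to\infty} \left(\sum_{k=1}^{M} |y_k - x_{nk}|^2\right)^{\frac{1}{2}} = 0. \]

Hence, for every $M \in \mathbb{N}$,

\[ \left(\sum_{k=1}^{M} |y_k|^2\right)^{\frac{1}{2}} \le \epsilon + \|x_N\|_2 < \infty, \]

which proves $y \in \ell_2(\mathbb{N})$. It now remains to show that $\|x_n - y\|_2 \to 0$ as $n \to \infty$. To this end, like before, let $\epsilon > 0$ be arbitrary and choose $N = N(\epsilon) > 0$ so that $\|x_n - x_m\|_2 < \epsilon$ whenever $n, m \ge N$. Then for arbitrary $M, n, m \in \mathbb{N}$ with $n, m \ge N$, by Theorem 1.6,

\[ \left(\sum_{k=1}^{M} |y_k - x_{nk}|^2\right)^{\frac{1}{2}} \le \left(\sum_{k=1}^{M} |y_k - x_{mk}|^2\right)^{\frac{1}{2}} + \|x_m - x_n\|_2 < \epsilon + \left(\sum_{k=1}^{M} |y_k - x_{mk}|^2\right)^{\frac{1}{2}} \]

where

\[ \lim_{m\to\infty} \left(\sum_{k=1}^{M} |y_k - x_{mk}|^2\right)^{\frac{1}{2}} = 0, \]

since $x_{mk} \to y_k$ as $m \to \infty$. Hence, for every $M \in \mathbb{N}$, when $n \ge N$,

\[ \left(\sum_{k=1}^{M} |y_k - x_{nk}|^2\right)^{\frac{1}{2}} \le \epsilon, \]

so $\|y - x_n\|_2 \le \epsilon$ for $n \ge N$ and thus $\|y - x_n\|_2 \to 0$ as $n \to \infty$.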


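Example 13. One can start with two arbitrary Hilbert spaces $(\mathcal{H}_1, \langle\cdot,\cdot\rangle_1)$ and $(\mathcal{H}_2, \langle\cdot,\cdot\rangle_2)$ and generate a new Hilbert space $\mathcal{H}_1 \oplus \mathcal{H}_2$, their direct sum. In detail, we define the direct sum as the Cartesian product $\mathcal{H}_1 \times \mathcal{H}_2$ with component-wise vector space structure (i.e. $(\phi_1, \phi_2) \oplus (\psi_1, \psi_2) = (\phi_1 + \psi_1, \phi_2 + \psi_2)$) and inner product

\[ \langle(\phi_1, \phi_2), (\psi_1, \psi_2)\rangle := \langle\phi_1, \psi_1\rangle_1 + \langle\phi_2, \psi_2\rangle_2, \qquad \phi_1, \psi_1 \in \mathcal{H}_1,\ \phi_2, \psi_2 \in \mathcal{H}_2. \]

We will study the direct sum in more detail in the exercises.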

2.2 The Riesz representation theorem


Example 13 in the last Section showed one way of constructing new Hilbert spaces from old ones. Another way to do this is to restrict one's attention to a closed subspace $S$ of a given Hilbert space $\mathcal{H}$. Note that $S$ is itself a Hilbert space under the natural inner product that it inherits as a subspace of $\mathcal{H}$ (since closed subsets of complete metric spaces are themselves complete). Our goal is now to construct orthogonal complements of $S$, that is, vectors orthogonal to $S$. We do this through a variational principle.

Definition 20. A subset, 𝑆, of a (real or complex) vector space 𝑋 , is called convex if and only if
for all 𝑥, 𝑦 ∈ 𝑆, 𝜃 ∈ [0, 1], one has
𝜃𝑥 + (1 − 𝜃 )𝑦 ∈ 𝑆.
The vector 𝜃𝑥 + (1 − 𝜃 )𝑦 is called a convex combination of 𝑥 and 𝑦.

Theorem 13. Let $S$ be a closed convex set in a Hilbert space $\mathcal{H}$ and $x \in \mathcal{H}$ arbitrary. Then there is a unique $y \in S$, the best approximation of $x$ by vectors in $S$, such that

\[ \|x - y\| = \inf\big\{\|x - z\| : z \in S\big\}. \]

Proof. By the parallelogram identity (2.4), for $a = x - w_1$, $b = x - w_2$,

\[ \Big\|x - \frac{1}{2}(w_1 + w_2)\Big\|^2 + \frac{1}{4}\|w_1 - w_2\|^2 = \frac{1}{2}\|x - w_1\|^2 + \frac{1}{2}\|x - w_2\|^2. \tag{2.9} \]

Let $c := \inf\{\|x - z\| : z \in S\}$. If there are $w_1, w_2 \in S$ with $\|x - w_1\| = \|x - w_2\| = c$, then since $\frac{1}{2}(w_1 + w_2) \in S$ by convexity, we have $\|x - \frac{1}{2}(w_1 + w_2)\|^2 \ge c^2$, so by (2.9)

\[ c^2 + \frac{1}{4}\|w_1 - w_2\|^2 \le \frac{1}{2}c^2 + \frac{1}{2}c^2 = c^2, \]

which implies $\|w_1 - w_2\| = 0$, that is, $w_1 = w_2$. We have thus proven there is at most one minimizer. Next, by definition of $c$, there exists $(y_n)_{n=1}^{\infty} \subset S$ so that

\[ c^2 \le \|x - y_n\|^2 \le c^2 + \frac{1}{n} \quad \forall\, n \in \mathbb{N}. \tag{2.10} \]

We now show that $(y_n)_{n=1}^{\infty}$ is a Cauchy sequence. Indeed,

\[ \|y_n - y_m\|^2 = \|(x - y_m) - (x - y_n)\|^2 \overset{(2.4)}{=} 2\|x - y_m\|^2 + 2\|x - y_n\|^2 - 4\Big\|x - \frac{1}{2}(y_n + y_m)\Big\|^2 \le 2\left(c^2 + \frac{1}{m} + c^2 + \frac{1}{n}\right) - 4c^2 = 2\left(\frac{1}{n} + \frac{1}{m}\right), \]

where we used that $\frac{1}{2}(y_n + y_m) \in S$ by convexity. Hence $y_n \to y_\infty \in S$ since $S$ is a closed subset of $\mathcal{H}$ and thus itself complete. Finally, by (2.10) and continuity of the norm, $\|x - y_\infty\| = c$, proving existence of the minimizer. This concludes our proof. $\square$
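A concrete finite-dimensional instance of the best approximation: in $\mathbb{R}^n$ with the Euclidean norm, the closed convex box $[\mathrm{lo}, \mathrm{hi}]^n$ has an explicit projection given by coordinate-wise clipping. The sketch below assumes this setting; the point `x`, the box bounds, and the random comparison points are illustrative choices, and the sampling check is only a plausibility test, not a proof.

```python
import random

def project_onto_box(x, lo, hi):
    """Best approximation of x in the closed convex box [lo, hi]^n (Euclidean norm)."""
    return [min(max(t, lo), hi) for t in x]

def dist(u, v):
    """Euclidean distance in R^n."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

random.seed(1)
x = [2.5, -0.3, 0.4, -1.7]
y = project_onto_box(x, 0.0, 1.0)
best = dist(x, y)

# No sampled point of the box should be closer to x than the projection y.
samples = [[random.uniform(0.0, 1.0) for _ in x] for _ in range(10000)]
no_better = all(dist(x, z) >= best - 1e-12 for z in samples)
print(y, best, no_better)
```

Clipping works here because the squared distance splits into a sum over coordinates, each of which is minimised separately; for a general closed convex set no such closed form is available and Theorem 13 only guarantees existence and uniqueness.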

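Definition 21. For any subset, $S$, in a Hilbert space $(\mathcal{H}, \langle\cdot,\cdot\rangle)$, define its orthogonal complement by

\[ S^\perp := \{y \in \mathcal{H} : \langle y, x\rangle = 0 \ \forall\, x \in S\}. \]

Observe that $S^\perp$ is always a closed subspace of $\mathcal{H}$.

Proposition 3. Let $S$ be a closed subspace in a Hilbert space $\mathcal{H}$. Let $x \in \mathcal{H}$ and $y \in S$ be the best approximation of $x$ by vectors in $S$. Then $x - y \in S^\perp$.

Proof. Since $\|x - y\| = \inf\{\|x - z\| : z \in S\}$ and $y + \lambda z \in S$, we have for any $\lambda \in \mathbb{C}$ and $z \in S$ that

\[ \|x - y\| \le \|x - y - \lambda z\|. \]

Equivalently, for all such $\lambda$ and $z$,

\[ -2\,\Re\langle x - y, \lambda z\rangle + |\lambda|^2\|z\|^2 \ge 0, \]

so with $\lambda = |\lambda|\mathrm{e}^{\mathrm{i}\theta}$, for fixed $\theta \in [0, 2\pi)$, after dividing by $|\lambda|$ and taking then $|\lambda| \downarrow 0$,

\[ -2\,\Re\big\langle x - y, \mathrm{e}^{\mathrm{i}\theta} z\big\rangle \ge 0 \quad \forall\, \theta \in [0, 2\pi),\ z \in S. \]

This readily shows that $\langle x - y, z\rangle = 0$ for all $z \in S$, that is, $x - y \in S^\perp$. $\square$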

Theorem 2.2: Projection lemma

Let $\mathcal{H}$ be a Hilbert space and $S \subset \mathcal{H}$ a closed subspace. Then every $x \in \mathcal{H}$ can be uniquely written $x = y + z$ where $y \in S$ and $z \in S^\perp$.

Proof. Given $x \in \mathcal{H}$, let $y \in S$ be the best approximation of $x$ by vectors in $S$. Set $z := x - y$, then by Proposition 3, $z \in S^\perp$ and clearly $x = y + (x - y) = y + z \in S + S^\perp$. For uniqueness, suppose that $x = y_1 + z_1 = y_2 + z_2$ with $y_1, y_2 \in S$ and $z_1, z_2 \in S^\perp$. Then

\[ y_1 - y_2 = z_2 - z_1 \in S \cap S^\perp = \{0\}, \]

and so $y_1 = y_2$, $z_1 = z_2$. $\square$


Corollary 5. If $S$ is a proper closed subspace in a Hilbert space $\mathcal{H}$, that is $S \ne \mathcal{H}$, then $S^\perp \ne \{0\}$.

Proof. Let $x \notin S$. Then in Theorem 2.2, $z \ne 0$. Since $z \in S^\perp$ we have then $S^\perp \ne \{0\}$. $\square$

Moreover, the following will be used later on and is proven in the exercises.

Corollary 6. For any subspace (not necessarily closed) $T$ of a Hilbert space, $\mathcal{H}$, we have $(T^\perp)^\perp = \overline{T}$, the closure of $T$ in $\mathcal{H}$.

At this point we are prepared to state and prove the central result of this section.

Theorem 2.3: Riesz representation

Let $\mathcal{H}$ be a Hilbert space and $f \in \mathcal{H}^*$ a continuous linear functional. Then there exists a unique $y = y(f) \in \mathcal{H}$ such that

\[ f(x) = \langle x, y\rangle \quad \forall\, x \in \mathcal{H}. \]

In addition $\|y\| = \|f\|$.

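Proof. Set $S := \mathrm{Ker}(f)$ and recall that $S \subset \mathcal{H}$ is a closed subspace, see Definition 16. If $S = \mathcal{H}$, then $f(x) = 0 = \langle x, 0\rangle$ for all $x \in \mathcal{H}$ and we are finished. Hence assume $S$ is a proper closed subspace in $\mathcal{H}$, so by Corollary 5 there exists $x_0 \in S^\perp \setminus \{0\}$. Define

\[ y := \overline{f(x_0)}\,\frac{x_0}{\|x_0\|^2} \in S^\perp \]

so that $f(x) = 0 = \langle x, y\rangle$ for all $x \in S$. Further, if $x = \alpha x_0$, then

\[ f(x) = f(\alpha x_0) = \alpha f(x_0) = \Big\langle \alpha x_0, \overline{f(x_0)}\,\frac{x_0}{\|x_0\|^2}\Big\rangle = \langle \alpha x_0, y\rangle = \langle x, y\rangle. \]

In summary, the maps $x \mapsto f(x)$ and $x \mapsto \langle x, y\rangle$ are linear and they agree on $S$ and $\mathrm{span}\{x_0\}$. Note that $f(x_0) \ne 0$ because $x_0 \notin S$. But since for all $x \in \mathcal{H}$,

\[ x = \underbrace{\left(x - \frac{f(x)}{f(x_0)}\,x_0\right)}_{\in S} + \underbrace{\frac{f(x)}{f(x_0)}\,x_0}_{\in \mathrm{span}\{x_0\}}, \]

we obtain $f(x) = \langle x, y\rangle$ for all $x \in \mathcal{H}$. In order to establish uniqueness, we assume that $f(x) = \langle x, y\rangle = \langle x, y'\rangle$ for all $x \in \mathcal{H}$. But then

\[ \|y' - y\|^2 = \langle y' - y, y'\rangle - \langle y' - y, y\rangle = f(y' - y) - f(y' - y) = 0, \]

so $y' = y$, proving uniqueness. In order to prove that $\|y\| = \|f\|$ we observe that by Definition 14,

\[ \|f\| = \sup_{\|x\|=1} |f(x)| = \sup_{\|x\|=1} |\langle x, y\rangle| \overset{(2.7)}{\le} \sup_{\|x\|=1} \|x\|\,\|y\| = \|y\|, \]

and, since $y \ne 0$,

\[ \|f\| = \sup_{\|x\|=1} |f(x)| \ge \left|f\!\left(\frac{y}{\|y\|}\right)\right| = \left|\Big\langle \frac{y}{\|y\|}, y\Big\rangle\right| = \|y\|. \]

This completes our proof of the Theorem. $\square$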

We note that the Cauchy-Schwarz inequality (2.7) shows that the converse of the Riesz representation theorem is true. Namely, each $y \in \mathcal{H}$ defines a continuous linear functional $f_y$ on $\mathcal{H}$ by $f_y(x) := \langle x, y\rangle$.
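In the finite-dimensional Hilbert space $\mathbb{C}^3$ with the inner product (2.1), the Riesz representative can be written down explicitly: for $f(x) = \sum_j a_j x_j$ one takes $y = (\overline{a_j})_j$. The sketch below checks this; the coefficient vector `a` and the test vector `x` are hypothetical values chosen for illustration.

```python
def inner(x, y):
    """Inner product (2.1) on C^n: linear in the first argument."""
    return sum(s * t.conjugate() for s, t in zip(x, y))

def norm(x):
    return inner(x, x).real ** 0.5

a = [2 - 1j, 0.5j, -3 + 0j]              # hypothetical coefficients of f

def f(x):
    """The linear functional f(x) = sum_j a_j x_j."""
    return sum(c * t for c, t in zip(a, x))

y = [c.conjugate() for c in a]           # Riesz representative of f

x = [1 + 1j, -2 + 0j, 0.25 - 4j]
gap = abs(f(x) - inner(x, y))            # f(x) = <x, y>

unit = [t / norm(y) for t in y]          # the unit vector y / ||y||
attained = abs(f(unit))                  # equals ||y||, so ||f|| = ||y||
print(gap, attained, norm(y))
```

The conjugation in `y` mirrors the complex conjugate of $f(x_0)$ appearing in the definition of $y$ in the proof.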

2.3 Orthonormal bases


We have already defined what it means for a set of vectors to be orthonormal, see Definition 18.
In this section we develop this idea further; in particular we will extend the idea of a “basis”, so
useful for finite-dimensional vector spaces, to Hilbert spaces.

Definition 22. If $S$ is an orthonormal set in a Hilbert space $\mathcal{H}$ and no other orthonormal set contains $S$ as a proper subset, then $S$ is called a complete orthonormal system or an orthonormal basis for $\mathcal{H}$.
Any complete orthonormal system $S = \{x_\alpha\}_{\alpha \in I} \subset \mathcal{H}$ has the property that if $x \in \mathcal{H}$ and $\langle x, x_\alpha\rangle = 0$ for all $\alpha \in I$, then $x = 0 \in \mathcal{H}$. It is in this sense that $S$ is complete. We now state and prove the standard existence result for orthonormal bases in Hilbert spaces.

Theorem 14. A Hilbert space H (having a non-zero vector) has at least one complete orthonormal
system.

Proof. This is a typical application of Zorn's lemma, that is Theorem 1.1: Let $S$ be an orthonormal set in $\mathcal{H}$. Such a set surely exists; for instance, if $x \in \mathcal{H} \setminus \{0\}$, then $S = \{x/\|x\|\}$ will do. Now consider $\mathcal{C} := \{S : S \subset \mathcal{H} \text{ is an orthonormal set}\}$, the collection of orthonormal sets in $\mathcal{H}$. Note that $\mathcal{C}$ is partially ordered by set inclusion; that is, we say $S_1 \preceq S_2$ if $S_1 \subseteq S_2$. Next, pick any chain $\{S_\alpha\}_{\alpha \in I} \subset \mathcal{C}$, i.e. any totally ordered subset of $\mathcal{C}$. Then $\bigcup_{\alpha \in I} S_\alpha \in \mathcal{C}$ is again an orthonormal set (any two of its elements lie in a common $S_\alpha$) which contains each $S_\alpha$, so $\bigcup_{\alpha \in I} S_\alpha$ is in fact an upper bound for the chain $\{S_\alpha\}_{\alpha \in I}$, compare Definition 4. Thus, by Theorem 1.1, there exists a maximal element of $\mathcal{C}$; that is an orthonormal system not properly contained in any other orthonormal system. This concludes the proof. $\square$

In the next theorem we show that as in the finite-dimensional case every vector in a Hilbert
space can be expressed as a linear combination (possibly infinite) of basis elements. First, the
following preparation:

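Proposition 4. If $S = \{x_\alpha\}_{\alpha \in I}$ is an orthonormal set in a Hilbert space $\mathcal{H}$ and $x \in \mathcal{H}$, then $\langle x, x_\alpha\rangle \ne 0$ for at most a countable number of vectors $x_\alpha \in S$.

Proof. For each $n \in \mathbb{N}$ let $S_n := \{x_\alpha \in S : |\langle x, x_\alpha\rangle| \ge \frac{1}{n}\}$. By Bessel's inequality (2.6), for fixed $x \in \mathcal{H}$,

\[ C_x := \sum_{\alpha \in I} |\langle x, x_\alpha\rangle|^2 = \sup_{\substack{F \subset I \\ F \text{ finite}}} \left(\sum_{\alpha \in F} \big|\langle x, x_\alpha\rangle\big|^2\right) < \infty, \]

which shows that $S_n$ is finite for any $n \in \mathbb{N}$, more precisely $|S_n| \le C_x n^2$. But $\bigcup_{n=1}^{\infty} S_n = \{x_\alpha \in S : \langle x, x_\alpha\rangle \ne 0\}$, and a countable union of finite sets is countable. This completes our proof. $\square$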

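Theorem 2.4: Fourier expansion

Let $S = \{x_\alpha\}_{\alpha \in I}$ be an orthonormal basis for a Hilbert space $\mathcal{H}$. Then for each $x \in \mathcal{H}$,

\[ x = \sum_{\alpha \in I} \langle x, x_\alpha\rangle x_\alpha, \tag{2.11} \]

where the equality (2.11) means that the sum on the right-hand side converges independent of order to $x \in \mathcal{H}$. Moreover, we have the Parseval relation

\[ \|x\|^2 = \sum_{\alpha \in I} \big|\langle x, x_\alpha\rangle\big|^2. \]

Conversely, if $\sum_{\alpha \in I} |c_\alpha|^2 < \infty$, $c_\alpha \in \mathbb{C}$, then $\sum_{\alpha \in I} c_\alpha x_\alpha$ converges to an element of $\mathcal{H}$.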

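Proof. By Proposition 4 at most a countable number of $x_\alpha \in S$ contribute to the sum and those we order in some way $x_{\alpha_1}, x_{\alpha_2}, x_{\alpha_3}, \ldots$. Furthermore, since $\sum_{j=1}^{n} |\langle x, x_{\alpha_j}\rangle|^2$ is monotone increasing and bounded by (2.6), it converges to a finite limit as $n \to \infty$. Let $y_n := \sum_{j=1}^{n} \langle x, x_{\alpha_j}\rangle x_{\alpha_j}$. Then for $n > m$,

\[ \|y_n - y_m\|^2 = \Big\|\sum_{j=m+1}^{n} \langle x, x_{\alpha_j}\rangle x_{\alpha_j}\Big\|^2 = \sum_{j=m+1}^{n} |\langle x, x_{\alpha_j}\rangle|^2 \]

by orthonormality. Therefore $(y_n)_{n=1}^{\infty} \subset \mathcal{H}$ is Cauchy and thus convergent to some $y \in \mathcal{H}$. Observe that by continuity of the inner product,

\[ \langle x - y, x_{\alpha_k}\rangle = \lim_{n\to\infty} \Big\langle x - \sum_{j=1}^{n} \langle x, x_{\alpha_j}\rangle x_{\alpha_j},\, x_{\alpha_k}\Big\rangle = \langle x, x_{\alpha_k}\rangle - \langle x, x_{\alpha_k}\rangle = 0, \quad k \in \mathbb{N}, \]

and if $\alpha \ne \alpha_k$ for all $k \in \mathbb{N}$, then also by Proposition 4

\[ \langle x - y, x_\alpha\rangle = \lim_{n\to\infty} \Big\langle x - \sum_{j=1}^{n} \langle x, x_{\alpha_j}\rangle x_{\alpha_j},\, x_\alpha\Big\rangle = \langle x, x_\alpha\rangle - 0 = 0. \]

This shows that $x - y$ is orthogonal to all $x_\alpha \in S$ and since $S$ is a complete orthonormal system we must have $x - y = 0 \in \mathcal{H}$. Thus

\[ x = \lim_{n\to\infty} \sum_{j=1}^{n} \langle x, x_{\alpha_j}\rangle x_{\alpha_j} \]

which proves (2.11) through Proposition 4. Furthermore, by continuity of the norm and orthonormality,

\[ 0 = \lim_{n\to\infty} \Big\|x - \sum_{j=1}^{n} \langle x, x_{\alpha_j}\rangle x_{\alpha_j}\Big\|^2 = \lim_{n\to\infty} \left(\|x\|^2 - \sum_{j=1}^{n} |\langle x, x_{\alpha_j}\rangle|^2\right) = \|x\|^2 - \sum_{\alpha \in I} |\langle x, x_\alpha\rangle|^2, \]

so Parseval's relation holds as well. Finally, if $\sum_{\alpha \in I} |c_\alpha|^2 < \infty$, then

\[ \infty > M := \sum_{\alpha \in I} |c_\alpha|^2 \ge \sum_{c_\alpha \in S_n} |c_\alpha|^2 \ge \frac{1}{n^2}|S_n|, \qquad S_n := \Big\{c_\alpha : |c_\alpha| \ge \frac{1}{n}\Big\}, \]

so $\{\alpha \in I : c_\alpha \ne 0\}$ is countable, compare the proof of Proposition 4. Ordering the non-zero $c_\alpha$ as $c_{\alpha_1}, c_{\alpha_2}, \ldots$ we then conclude, as before, that $\sum_{j=1}^{n} |c_{\alpha_j}|^2$ has a finite limit as $n \to \infty$ and thus $y_n := \sum_{j=1}^{n} c_{\alpha_j} x_{\alpha_j}$ is Cauchy with a limit in $\mathcal{H}$. This completes our proof. $\square$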

Our reasoning in Theorems 14 and 2.4 was not constructive at all. As it turns out, we can do
better provided we work with separable Hilbert spaces, that is spaces with a countable dense
subset, compare Chapter 1.

Proposition 5. In any separable Hilbert space $\mathcal{H}$, there exist countable independent spanning sets, that is, sets $\{y_j\}_{j=1}^{N} \subset \mathcal{H}$ with $N$ finite or countably infinite, that obey
(i) For any $n \in \{1, \ldots, N\}$, if $\sum_{j=1}^{n} \alpha_j y_j = 0 \in \mathcal{H}$, then $\alpha_1 = \ldots = \alpha_n = 0 \in \mathbb{C}$.
(ii) $\mathrm{span}\{y_j : j = 1, \ldots, N\} = \{\sum_{j=1}^{n} \alpha_j y_j : \alpha_j \in \mathbb{C},\ n \in \{1, \ldots, N\}\}$ is dense in $\mathcal{H}$.

Proof. Pick a countable dense set $\{z_1, \ldots, z_n, \ldots\} \subset \mathcal{H}$ with all $z_j \ne 0$. Set $y_1 := z_1$. If $z_k$ is linearly dependent on $z_1, \ldots, z_{k-1}$, remove $z_k$. Otherwise, let it be the next $y$. In this fashion we clearly satisfy condition (i). Moreover, $\mathrm{span}\{y_j\} = \mathrm{span}\{z_k\}$ by construction and since $\{z_1, \ldots, z_n, \ldots\}$ is dense in $\mathcal{H}$ we also satisfy condition (ii). $\square$

Example 14. Let $\ell_2(\mathbb{N})$ denote the vector space of square summable sequences equipped with the inner product (2.3). Define

\[ \ell_2(\mathbb{N}) \ni e_n := (e_{nk})_{k=1}^{\infty} \quad \text{with} \quad e_{nk} = \begin{cases} 1, & k = n \\ 0, & k \ne n \end{cases}. \]

Then $\{e_n\}_{n=1}^{\infty}$ is a countable independent spanning set for $\ell_2(\mathbb{N})$, i.e. $\ell_2(\mathbb{N})$ is in particular separable. As it turns out, see Theorem 16 below, $\ell_2(\mathbb{N})$ is one of two prototypical separable Hilbert spaces.

Theorem 15 (Gram-Schmidt procedure). Let $\{y_j\}_{j=1}^{N}$ be a countable (either finite or countably infinite) independent spanning set of a separable Hilbert space, $\mathcal{H}$. Define $x_j$ inductively by

\[ x_j := \frac{y_j - \sum_{k=1}^{j-1} \langle y_j, x_k\rangle x_k}{\big\|y_j - \sum_{k=1}^{j-1} \langle y_j, x_k\rangle x_k\big\|}. \tag{2.12} \]

Then $\{x_j\}_{j=1}^{N}$ is an orthonormal basis.

Proof. By condition (i) of independent spanning sets, the vector in the numerator of (2.12) is nonzero, so we may divide by its norm. A direct computation verifies that $\{x_j\}_{j=1}^{N}$ is an orthonormal set. Moreover since

\[ y_j = \Big\|y_j - \sum_{k=1}^{j-1} \langle y_j, x_k\rangle x_k\Big\|\,x_j + \sum_{k=1}^{j-1} \langle y_j, x_k\rangle x_k \]

we have that, for each $n$,

\[ \Big\{\sum_{j=1}^{n} \alpha_j y_j : \alpha_j \in \mathbb{C}\Big\} = \Big\{\sum_{j=1}^{n} \beta_j x_j : \beta_j \in \mathbb{C}\Big\}, \]

and thus the union over $n$ of the right-side sets is dense in $\mathcal{H}$. If $w \in (\{x_j\}_{j=1}^{N})^\perp$, then $w$ is orthogonal to that union, and so to all of $\mathcal{H}$. Hence $\|w\|^2 = \langle w, w\rangle = 0$. This shows that $\{x_j\}_{j=1}^{N}$ is a complete orthonormal system and thus an orthonormal basis for $\mathcal{H}$. $\square$
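The recursion (2.12) translates directly into code for finitely many vectors in $\mathbb{C}^n$. The following is a minimal sketch under that assumption; the function name `gram_schmidt`, the dependence tolerance, and the three input vectors are illustrative and not taken from the notes.

```python
def inner(x, y):
    """Inner product (2.1) on C^n: linear in the first argument."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(x, y))

def norm(x):
    return inner(x, x).real ** 0.5

def gram_schmidt(ys, tol=1e-12):
    """Orthonormalise linearly independent vectors following (2.12)."""
    xs = []
    for y in ys:
        v = [complex(t) for t in y]
        for x in xs:
            c = inner(y, x)                       # coefficient <y_j, x_k>
            v = [vi - c * xi for vi, xi in zip(v, x)]
        n = norm(v)
        if n < tol:
            raise ValueError("input vectors are linearly dependent")
        xs.append([vi / n for vi in v])
    return xs

ys = [[1, 1j, 0], [1, 0, 1], [0, 1, 1]]           # hypothetical independent set in C^3
xs = gram_schmidt(ys)

# Largest deviation of the Gram matrix (<x_i, x_j>) from the identity.
gram_error = max(abs(inner(xs[i], xs[j]) - (1 if i == j else 0))
                 for i in range(3) for j in range(3))

# Parseval's relation for a test vector w, since xs is a basis of C^3.
w = [2 - 1j, 0.5, -3j]
parseval_gap = abs(sum(abs(inner(w, x)) ** 2 for x in xs) - norm(w) ** 2)
print(gram_error, parseval_gap)
```

This is the classical form of the procedure, which matches (2.12) term by term; in floating-point practice the modified variant (projecting the running remainder `v` instead of `y`) is usually preferred for numerical stability.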

Most Hilbert spaces that arise in practice are separable. Theorem 16, the last theorem in this
section and chapter, characterizes them up to isomorphism.

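Definition 23. Two Hilbert spaces $(\mathcal{H}_1, \langle\cdot,\cdot\rangle_1)$ and $(\mathcal{H}_2, \langle\cdot,\cdot\rangle_2)$ are said to be isomorphic if there is a linear bijection $U : \mathcal{H}_1 \to \mathcal{H}_2$ that obeys

\[ \langle Ux, Uy\rangle_2 = \langle x, y\rangle_1 \quad \forall\, x, y \in \mathcal{H}_1. \tag{2.13} \]

Such an operator $U$ is called unitary.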

Theorem 16. A Hilbert space $\mathcal{H}$ is separable if and only if it has a countable orthonormal basis $S$.
(1) If there are $N < \infty$ elements in $S$, then $\mathcal{H}$ is isomorphic to $\mathbb{C}^N$, see Example 7.
(2) If there are countably infinitely many elements in $S$, then $\mathcal{H}$ is isomorphic to $\ell_2(\mathbb{N})$, see Example 9.

Proof. If $\mathcal{H}$ is separable then it has a countable orthonormal basis by the Gram-Schmidt procedure, i.e. by Theorem 15. Conversely, if $\{x_n\}_{n=1}^{N}$ is a countable complete orthonormal system for $\mathcal{H}$, then by Theorem 2.4 the set $\{\sum_{j=1}^{n} \alpha_j x_j : \alpha_j \in \mathbb{Q} + \mathrm{i}\mathbb{Q},\ n \in \{1, \ldots, N\}\}$ is dense in $\mathcal{H}$. Since this set is countable, $\mathcal{H}$ is separable. Now pick an orthonormal basis $\{x_j\}_{j=1}^{N}$ for $\mathcal{H}$ and define the linear map $U : \mathcal{H} \to \ell_2(\mathbb{N})$ by

\[ U : x \mapsto \big(\langle x, x_j\rangle\big)_{j=1}^{N}. \]

By Theorem 2.4, if $N = \infty$, $Ux \in \ell_2(\mathbb{N})$, so $U$ maps into $\mathbb{C}^N$ if $N$ is finite or $\ell_2(\mathbb{N})$ if $N$ is infinite. But for any orthonormal basis, by polarization of Parseval's relation in Theorem 2.4, we have

\[ \langle x, y\rangle = \sum_{j=1}^{N} \langle x, x_j\rangle\langle x_j, y\rangle = \sum_{j=1}^{N} \langle x, x_j\rangle\overline{\langle y, x_j\rangle} = \langle Ux, Uy\rangle_2, \]

so $U$ obeys (2.13). Finally $U^{-1}y := \sum_{j=1}^{N} y_j x_j$ for any $y = (y_n)_{n=1}^{\infty} \in \ell_2(\mathbb{N})$ is a two-sided inverse to $U$ by Theorem 2.4, proving that $U$ is indeed a bijection. $\square$


The last theorem concludes the content of this chapter. We will encounter Hilbert spaces again
in Chapters 3 and 4.


3 Banach spaces

3.1 Definitions and examples


The concept of a Banach space is a generalization of Hilbert space. A Banach space assumes
that there is a norm on the space relative to which the space is complete, but it is not assumed
that the norm is defined in terms of an inner product. There are many examples of Banach
spaces that are not Hilbert spaces, so that the generalization is quite useful.

Definition 24. A Banach space is a normed linear space $(X, \|\cdot\|)$ over $\mathbb{F} = \mathbb{C}$ or $\mathbb{F} = \mathbb{R}$ which is complete in the induced metric.

Example 15. The sequence spaces $\ell_p(\mathbb{N})$ with norms $\|\cdot\|_p$ in Example 2 for $1 \le p \le \infty$ are Banach spaces: first, for $1 \le p < \infty$, we use the logic of Example 12 and show that any Cauchy sequence $(x_n)_{n=1}^{\infty} \subset \ell_p(\mathbb{N})$ is convergent in $\ell_p(\mathbb{N})$. The completeness workings in Example 12 did not use the $\ell_2(\mathbb{N})$ inner product, only the $\ell_2(\mathbb{N})$ norm and Minkowski's inequality for it. This means we can replace $p = 2$ by general $p \in [1, \infty)$ and the argument is still valid, see Theorem 1.6. Second, for $p = \infty$, we pick a Cauchy sequence $(x_n)_{n=1}^{\infty} \subset \ell_\infty(\mathbb{N})$ and write again $x_n = (x_{nk})_{k=1}^{\infty} \in \ell_\infty(\mathbb{N})$. For fixed $k \in \mathbb{N}$ the sequence $(x_{nk})_{n=1}^{\infty} \subset \mathbb{F}$ is Cauchy since

\[ |x_{nk} - x_{mk}| \le \sup_{k \in \mathbb{N}} |x_{nk} - x_{mk}| = \|x_n - x_m\|_\infty, \]

and thus $x_{nk} \to y_k \in \mathbb{F}$ exists as $n \to \infty$. Now construct $y = (y_k)_{k=1}^{\infty}$ and note that for given $\epsilon > 0$ there exists $N = N(\epsilon) > 0$ such that $\|x_n - x_m\|_\infty < \epsilon$ when $n, m \ge N$. Hence, for arbitrary $k, n \in \mathbb{N}$, by the ordinary triangle inequality in $\mathbb{F}$,

\[ |y_k| = |y_k - x_{nk} + x_{nk} - x_{Nk} + x_{Nk}| \le |y_k - x_{nk}| + \|x_n - x_N\|_\infty + \|x_N\|_\infty, \]

with $\|x_n - x_N\|_\infty < \epsilon$ for $n \ge N$ and $|y_k - x_{nk}| \to 0$ as $n \to \infty$. Hence, for every $k \in \mathbb{N}$,

\[ |y_k| \le \epsilon + \|x_N\|_\infty, \]

and thus indeed $y \in \ell_\infty(\mathbb{N})$. Finally, for any $\epsilon > 0$, choose $N = N(\epsilon) > 0$ so that $\|x_n - x_m\|_\infty < \epsilon$ for $n, m \ge N$. Then for arbitrary $k, n, m \in \mathbb{N}$ with $n, m \ge N$,

\[ |y_k - x_{nk}| \le |y_k - x_{mk}| + \|x_m - x_n\|_\infty < \epsilon + |y_k - x_{mk}| \]

where $|y_k - x_{mk}| \to 0$ as $m \to \infty$. This shows that for every $k \in \mathbb{N}$,

\[ |y_k - x_{nk}| \le \epsilon, \]

and so $\|y - x_n\|_\infty \le \epsilon$ for $n \ge N$, equivalently $\|y - x_n\|_\infty \to 0$ as $n \to \infty$.


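Example 16. Let $c_0(\mathbb{N})$ denote the vector space of sequences of numbers in $\mathbb{F}$ that converge to zero,

\[ c_0(\mathbb{N}) := \Big\{(x_n)_{n=1}^{\infty} \subset \mathbb{F} : \lim_{n\to\infty} x_n = 0\Big\}. \]

Equip $c_0(\mathbb{N})$ with the norm $\|\cdot\|_\infty$ of Example 2, then $(c_0(\mathbb{N}), \|\cdot\|_\infty)$ is a Banach space: First off, $c_0(\mathbb{N}) \subset \ell_\infty(\mathbb{N})$ is a subspace of $\ell_\infty(\mathbb{N})$. Second, if $(x_n)_{n=1}^{\infty} \subset c_0(\mathbb{N})$ is a convergent sequence in $\ell_\infty(\mathbb{N})$, say $\|x_n - y\|_\infty \to 0$ as $n \to \infty$, then $y \in c_0(\mathbb{N})$: indeed, write $y = (y_k)_{k=1}^{\infty}$ and $x_n = (x_{nk})_{k=1}^{\infty}$, then

\[ |y_k| \le |y_k - x_{nk}| + |x_{nk}| \le \|y - x_n\|_\infty + |x_{nk}| \]

for any $n, k \in \mathbb{N}$. But given $\epsilon > 0$, there is $N = N(\epsilon)$ such that $\|y - x_n\|_\infty < \epsilon$ for $n \ge N$ and thus, since $x_N \in c_0(\mathbb{N})$, we get $\limsup_{k\to\infty} |y_k| \le \epsilon$, that is $y_k \to 0$ as $k \to \infty$. In short, $(c_0(\mathbb{N}), \|\cdot\|_\infty)$ is a closed subspace (by Proposition 1) of the Banach space $(\ell_\infty(\mathbb{N}), \|\cdot\|_\infty)$ and thus itself a Banach space.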

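Example 17. Let $c_{00}(\mathbb{N})$ denote the vector space of sequences of numbers in $\mathbb{F}$ that have all but finitely many terms equal to zero,

\[ c_{00}(\mathbb{N}) := \Big\{(x_n)_{n=1}^{\infty} \subset \mathbb{F} : x_n = 0 \text{ for all but a finite number of } n\Big\}. \]

Equip $c_{00}(\mathbb{N})$ with the norm $\|\cdot\|_\infty$ of Example 2, then $(c_{00}(\mathbb{N}), \|\cdot\|_\infty)$ is not a Banach space: clearly $c_{00}(\mathbb{N}) \subset c_0(\mathbb{N}) \subset \ell_\infty(\mathbb{N})$ is a subspace of $c_0(\mathbb{N})$, so if $(x_n)_{n=1}^{\infty} \subset c_{00}(\mathbb{N})$ is convergent in $\ell_\infty(\mathbb{N})$, say $\|x_n - y\|_\infty \to 0$ as $n \to \infty$, then $y = (y_n)_{n=1}^{\infty} \in c_0(\mathbb{N})$ by Example 16. But, in general, $y \notin c_{00}(\mathbb{N})$, for if $x_n := (1, \frac{1}{2}, \frac{1}{3}, \ldots, \frac{1}{n}, 0, 0, 0, \ldots) \in c_{00}(\mathbb{N})$, then $x_n \to x := (1, \frac{1}{2}, \frac{1}{3}, \ldots) \in c_0(\mathbb{N})$ in $\|\cdot\|_\infty$ since $\|x_n - x\|_\infty = \frac{1}{n+1} \to 0$ as $n \to \infty$. However, $x \in c_0(\mathbb{N}) \setminus c_{00}(\mathbb{N})$, i.e. $c_{00}(\mathbb{N})$ is not a closed subspace of the Banach space $(c_0(\mathbb{N}), \|\cdot\|_\infty)$ and thus itself not a Banach space.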

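Example 18. Note that $c_{00}(\mathbb{N})$ equipped with the norm $\|\cdot\|_p$ of Example 2 for $1 \le p < \infty$ is also not a Banach space: look at

\[ x_n = (x_{n1}, x_{n2}, x_{n3}, \ldots) := \left(\frac{1}{1^{2/p}}, \frac{1}{2^{2/p}}, \frac{1}{3^{2/p}}, \ldots, \frac{1}{n^{2/p}}, 0, 0, 0, 0, \ldots\right) \in c_{00}(\mathbb{N}), \]

then for any $n > m$,

\[ \|x_n - x_m\|_p^p = \sum_{k=1}^{\infty} |x_{nk} - x_{mk}|^p = \sum_{k=m+1}^{n} \frac{1}{k^2} \le \sum_{k=m+1}^{\infty} \frac{1}{k^2} \to 0 \quad \text{as } m \to \infty, \]

so $(x_n)_{n=1}^{\infty}$ is Cauchy in $(c_{00}(\mathbb{N}), \|\cdot\|_p)$. But $x_n \to x := (1, \frac{1}{4^{1/p}}, \frac{1}{9^{1/p}}, \frac{1}{16^{1/p}}, \ldots)$ in $\|\cdot\|_p$ since

\[ \|x_n - x\|_p^p = \sum_{k=n+1}^{\infty} \frac{1}{k^2} \to 0 \quad \text{as } n \to \infty, \]

however $x \notin c_{00}(\mathbb{N})$. Still, $(c_{00}(\mathbb{N}), \|\cdot\|_p)$ is dense in the Banach space $(\ell_p(\mathbb{N}), \|\cdot\|_p)$ for $1 \le p < \infty$ and $(c_{00}(\mathbb{N}), \|\cdot\|_\infty)$ is dense in $(c_0(\mathbb{N}), \|\cdot\|_\infty)$, see the exercises. This implies, in particular, that $(\ell_p(\mathbb{N}), \|\cdot\|_p)$ is separable for $1 \le p < \infty$ and so is $(c_0(\mathbb{N}), \|\cdot\|_\infty)$.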


Example 19. Let $C[0,1]$ denote the vector space of $\mathbb{F}$-valued continuous functions on $[0,1] \subset \mathbb{R}$ equipped with the norm $\|\cdot\|_p$ of Example 1 for $1 \le p \le \infty$. Then $(C[0,1], \|\cdot\|_p)$ is a Banach space only for $p = \infty$: if $1 \le p < \infty$, repeat the workings of Example 11, i.e. note that, with $f$ the pointwise limit of $(f_n)$, that is $f = 0$ on $[0, \frac{1}{2}]$ and $f = 1$ on $(\frac{1}{2}, 1]$,

\[ \int_0^1 |f_n(x) - f(x)|^p\,dx = \int_{\frac{1}{2} - \frac{1}{n}}^{\frac{1}{2}} n^p\Big(x - \frac{1}{2} + \frac{1}{n}\Big)^p\,dx = n^p \int_0^{\frac{1}{n}} u^p\,du = \frac{1}{n}\,\frac{1}{p+1} \to 0, \]

as $n \to \infty$ and for any $n, m \in \mathbb{Z}_{\ge 2}$,

\[ \int_0^1 |f_n(x) - f_m(x)|^p\,dx \le \int_{\min\{\frac{1}{2} - \frac{1}{n},\, \frac{1}{2} - \frac{1}{m}\}}^{\frac{1}{2}} (1 + 1)^p\,dx = 2^{p-1}\left(\frac{1}{n} + \frac{1}{m} + \left|\frac{1}{n} - \frac{1}{m}\right|\right) = 2^p\,\frac{\max\{n, m\}}{nm}. \]

Hence, $(f_n)_{n=2}^{\infty}$ is a Cauchy sequence in $(C[0,1], \|\cdot\|_p)$ for any $1 \le p < \infty$. But, if for some $f_\infty \in C[0,1]$ we have $\|f_n - f_\infty\|_p \to 0$ as $n \to \infty$, then by the reasoning in Example 11, there exists $\delta > 0$ and $N = N(\delta) \in \mathbb{N}$ so that

\[ \int_0^{\frac{1}{2}} |f_n(x) - f_\infty(x)|^p\,dx > \frac{\delta}{2^{p+1}} \]

whenever $n \ge N$. This is a contradiction to the fact that

\[ \lim_{n\to\infty} \int_0^{\frac{1}{2}} |f_n(x) - f_\infty(x)|^p\,dx = 0 \]

and hence, $(C[0,1], \|\cdot\|_p)$ is not a Banach space for $1 \le p < \infty$. The situation is different for $p = \infty$, simply because uniform limits of continuous functions are continuous: let $(f_n)_{n=1}^{\infty} \subset C[0,1]$ be Cauchy, then for any fixed $x \in [0,1]$ and all $n, m \in \mathbb{N}$,

\[ |f_n(x) - f_m(x)| \le \max_{x \in [0,1]} |f_n(x) - f_m(x)| = \|f_n - f_m\|_\infty, \]

so $(f_n(x))_{n=1}^{\infty} \subset \mathbb{F}$ is a Cauchy sequence and thus pointwise convergent, say $f_n(x) \to f(x)$, $x \in [0,1]$ as $n \to \infty$. Next, for arbitrary $x, y \in [0,1]$ and $\epsilon > 0$ we can choose $N = N(\epsilon) > 0$ so that $\|f_n - f_m\|_\infty \le \frac{\epsilon}{3}$ for all $n, m \ge N$ and by Theorem 4 also $\delta = \delta(\epsilon) > 0$ so that $|f_N(x) - f_N(y)| < \frac{\epsilon}{3}$ whenever $|x - y| < \delta$. In turn, for any $n, m \ge N$ and $|x - y| < \delta$,

\[ |f(x) - f(y)| \le |f(x) - f_n(x)| + |f_n(x) - f_N(x)| + |f_N(x) - f_N(y)| + |f_N(y) - f_m(y)| + |f_m(y) - f(y)| \le |f(x) - f_n(x)| + \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} + |f_m(y) - f(y)| \]

so in the limit $n, m \to \infty$ with $f_n(x) \to f(x)$,

\[ |f(x) - f(y)| \le \epsilon \quad \text{whenever } |x - y| < \delta. \]

This shows that $f \in C[0,1]$ and we are now left to prove $\|f_n - f\|_\infty \to 0$ as $n \to \infty$. To this end let $\epsilon > 0$ be arbitrary and $N = N(\epsilon) > 0$ such that $\|f_n - f_m\|_\infty < \epsilon$ whenever $n, m \ge N$. Then, for all $x \in [0,1]$ and any $n, m \ge N$,

\[ |f_n(x) - f(x)| \le |f_n(x) - f_m(x)| + |f_m(x) - f(x)| \le \|f_n - f_m\|_\infty + |f_m(x) - f(x)| < \epsilon + |f_m(x) - f(x)|, \]

so after the limit $m \to \infty$, $|f_n(x) - f(x)| \le \epsilon$ for all $x \in [0,1]$, as desired.


Example 20. Let $(X, \|\cdot\|_X)$ be a normed linear space and $(Y, \|\cdot\|_Y)$ a Banach space. Then $(\mathcal{L}(X, Y), \|\cdot\|)$, the space of bounded linear transformations between $X$ and $Y$ equipped with the operator norm, is a Banach space, see Theorem 10. In particular, the dual space $X^* = \mathcal{L}(X, \mathbb{F})$ with $\mathbb{F} = \mathbb{R}$ or $\mathbb{F} = \mathbb{C}$ of any normed linear space $(X, \|\cdot\|_X)$ is a Banach space when equipped with the operator norm.
In applications it is important to have criteria to determine whether a given normed linear space
is complete. Such a criterion is given by the following result:

Definition 25. A sequence $(x_n)_{n=1}^{\infty} \subset X$ of elements in a normed linear space $(X, \|\cdot\|)$ is called absolutely summable if $\sum_{n=1}^{\infty} \|x_n\| < \infty$. It is called summable if $\sum_{n=1}^{N} x_n$ converges as $N \to \infty$ to an $x \in X$.

Theorem 17. A normed linear space $(X, \|\cdot\|)$ over $\mathbb{F} = \mathbb{R}$ or $\mathbb{F} = \mathbb{C}$ is a Banach space if and only if every absolutely summable sequence is summable.

Proof. If $(X, \|\cdot\|)$ is a Banach space and $(x_n)_{n=1}^{\infty} \subset X$ an arbitrary absolutely summable sequence then with $y_N := \sum_{n=1}^{N} x_n$, for $N > M$ and arbitrary $\epsilon > 0$,

\[ \|y_N - y_M\| = \Big\|\sum_{n=M+1}^{N} x_n\Big\| \le \sum_{n=M+1}^{\infty} \|x_n\| < \epsilon \]

provided $M$ is sufficiently large. Consequently, $(y_N)_{N=1}^{\infty}$ is Cauchy and thus convergent in $X$, i.e. $(x_n)_{n=1}^{\infty}$ is summable. Conversely, suppose every absolutely summable sequence is summable and pick an arbitrary Cauchy sequence $(x_n)_{n=1}^{\infty} \subset X$. For every $j \in \mathbb{N}$, we can then choose $n_j \in \mathbb{N}$ such that $\|x_m - x_{n_j}\| \le j^{-2}$ whenever $m \ge n_j$ and we may assume $n_j < n_{j+1}$. In turn, for every $j \in \mathbb{N}$,

\[ \|x_{n_{j+1}} - x_{n_j}\| \le j^{-2}, \]

so the sequence $(x_{n_{j+1}} - x_{n_j})_{j=1}^{\infty}$ is absolutely summable, thus

\[ x_{n_1} + \sum_{j=1}^{N} (x_{n_{j+1}} - x_{n_j}) = x_{n_{N+1}} \]

converges in $X$ as $N \to \infty$ by assumption. In summary, the arbitrary Cauchy sequence $(x_n)_{n=1}^{\infty} \subset X$ has a convergent subsequence $(x_{n_j})_{j=1}^{\infty} \subset \{x_n\}_{n=1}^{\infty}$ and is thus itself convergent. This shows that $X$ is a Banach space. $\square$

Moving ahead we introduce the following important class of bounded linear transformations:

Definition 26. Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be two normed linear spaces and $T \in \mathcal{L}(X, Y)$ a bounded linear transformation. We say that $T$ is invertible or an isomorphism if there exists a bounded linear bijection $T^{-1} \in \mathcal{L}(Y, X)$ such that

\[ TT^{-1} = I_Y \quad \text{and} \quad T^{-1}T = I_X. \]

Here, $I_X$, resp. $I_Y$, denotes the identity map on $X$, resp. $Y$. We say $X$ and $Y$ are isomorphic if there exists an isomorphism $T \in \mathcal{L}(X, Y)$. Furthermore, a linear transformation $T \in \mathcal{L}(X, Y)$ is called an isometry, if $\|Tx\|_Y = \|x\|_X$ for all $x \in X$.
We proved in Theorem 16 that all separable, infinite-dimensional Hilbert spaces are isometric to $\ell_2(\mathbb{N})$. Two Banach spaces which are isometric can be regarded as the same as far as their Banach space properties are concerned.

Remark. An invertible $T \in \mathcal{L}(X, Y)$ is always bijective, however not every bijective $T \in \mathcal{L}(X, Y)$ is invertible since $T^{-1}$ might not be bounded. Indeed, see Examples 17, 18, consider the normed linear spaces $X \equiv (c_{00}(\mathbb{N}), \|\cdot\|_1)$, $Y \equiv (c_{00}(\mathbb{N}), \|\cdot\|_\infty)$ and let $T : X \to Y$ denote the map $Tx := x$, $x \in X$. Note that $T$ is well-defined since $c_{00}(\mathbb{N}) \subset \ell_1(\mathbb{N}) \cap \ell_\infty(\mathbb{N})$; moreover $\|x\|_\infty \le \|x\|_1$ gives $\sup\{\|Tx\|_\infty : \|x\|_1 = 1\} \le 1 < \infty$, so $T \in \mathcal{L}(X, Y)$. Also, if $Tx = 0 \in Y$ then by definition of $T$, $x = 0 \in X$, so $T$ is injective, and since any $x \in c_{00}(\mathbb{N})$ is of the form
\[ x = (x_1, x_2, \ldots, x_{n_0}, 0, 0, 0, 0, \ldots) \in \ell_1(\mathbb{N}) \cap \ell_\infty(\mathbb{N}) \]
for some $n_0 \in \mathbb{N}$, we can use $Tx = x$ and conclude that $T$ is surjective. All together, $T \in \mathcal{L}(X, Y)$ is a bijection and its inverse $S : Y \to X$ is given by $Sx := x$, $x \in Y$. But $S \notin \mathcal{L}(Y, X)$, so $T$ is not invertible according to Definition 26, for if
\[ x_n := \Big( \frac{n}{n}, \frac{n-1}{n}, \frac{n-2}{n}, \ldots, \frac{1}{n}, 0, 0, 0, 0, \ldots \Big), \quad n \in \mathbb{N}, \]
then $x_n \in Y$ and $\|x_n\|_\infty = 1$ for all $n \in \mathbb{N}$. However,
\[ \|Sx_n\|_1 = \sum_{k=1}^{n} \frac{k}{n} = \frac{1}{2}(n + 1) \]
is unbounded. We will soon understand in full generality, see Theorem 3.7 below, when a given bijective $T \in \mathcal{L}(X, Y)$ is invertible.
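
The growth of $\|Sx_n\|_1$ against the constant $\|x_n\|_\infty$ can also be checked by hand or by machine. The following is a minimal Python sketch, where the helper names `x_n`, `sup_norm` and `one_norm` are our own illustrative choices and only the finitely many nonzero entries of $x_n$ are stored; exact rational arithmetic avoids rounding.

```python
from fractions import Fraction

def x_n(n):
    """Nonzero entries of x_n = (n/n, (n-1)/n, ..., 1/n, 0, 0, ...)."""
    return [Fraction(n - k, n) for k in range(n)]

def sup_norm(x):
    """Sup norm of a finitely supported sequence."""
    return max(abs(t) for t in x)

def one_norm(x):
    """1-norm of a finitely supported sequence."""
    return sum(abs(t) for t in x)

for n in (1, 10, 100, 1000):
    v = x_n(n)
    # the sup norm stays equal to 1, the 1-norm equals (n + 1)/2
    print(n, sup_norm(v), one_norm(v))
```

The printed 1-norms grow linearly in $n$ while the sup norms do not, so no constant $C$ with $\|Sx\|_1 \le C\|x\|_\infty$ can exist.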

At times, one can use the following simple result to compute the inverse of a linear operator
𝑇 ∈ L (𝑋 ) on a Banach space.

Theorem 18 (Neumann series). Let $(X, \|\cdot\|)$ be a Banach space and $T \in \mathcal{L}(X)$ a bounded linear operator with $\|T\| < 1$. Then $I - T$ is invertible and
\[ \mathcal{L}(X) \ni (I - T)^{-1} = I + \sum_{n=1}^{\infty} T^n, \qquad T^n := \underbrace{T \circ \ldots \circ T}_{n \text{ times}}, \]
with
\[ \|(I - T)^{-1}\| \le \frac{1}{1 - \|T\|}. \]


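Proof. Given that the operator norm is submultiplicative, see Corollary 1, we find
\[ \|T^n\| = \|T^{n-1} T\| \le \|T^{n-1}\| \|T\| \le \ldots \le \|T\|^n \quad \forall\, n \in \mathbb{N}. \]
Hence the sequence $(T^n)_{n=1}^\infty \subset \mathcal{L}(X)$ is absolutely summable, its norms being dominated by a convergent geometric series, and thus summable by Theorems 17 and 10. With the convention $T^0 := I$ we use $S := \sum_{n=0}^\infty T^n \in \mathcal{L}(X)$ to denote $I$ plus its sum. Since, in operator norm on $\mathcal{L}(X)$,
\[ (I - T) S = (I - T) \lim_{N \to \infty} \sum_{n=0}^{N} T^n = \lim_{N \to \infty} (I - T^{N+1}) = I, \]
and similarly $S(I - T) = I$, we conclude $S = (I - T)^{-1} \in \mathcal{L}(X)$. Finally, by the triangle inequality and continuity of the norm,
\[ \|(I - T)^{-1}\| = \lim_{N \to \infty} \Big\| \sum_{n=0}^{N} T^n \Big\| \le \lim_{N \to \infty} \sum_{n=0}^{N} \|T^n\| \le \lim_{N \to \infty} \sum_{n=0}^{N} \|T\|^n = \frac{1}{1 - \|T\|}. \]
This completes our proof. □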
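
To see the Neumann series at work in the simplest setting, one can take $X = \mathbb{R}^2$ with the sup norm, for which the operator norm of a matrix is its maximal absolute row sum. The sketch below is only an illustration under these assumptions: the matrix `T`, the number of terms and all helper names are our own choices.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def inf_norm(A):
    """Operator norm induced by the sup norm: maximal absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def neumann_inverse(T, terms=200):
    """Partial sum I + T + ... + T^terms approximating (I - T)^{-1}."""
    n = len(T)
    S = identity(n)
    P = identity(n)
    for _ in range(terms):
        P = matmul(P, T)   # P now holds the next power of T
        S = add(S, P)
    return S

T = [[0.2, 0.3], [0.1, 0.4]]            # inf_norm(T) = 0.5 < 1
S = neumann_inverse(T)
I_minus_T = [[0.8, -0.3], [-0.1, 0.6]]
residual = matmul(I_minus_T, S)         # should be close to the identity
print(residual, inf_norm(S), 1.0 / (1.0 - inf_norm(T)))
```

Up to rounding, `residual` is the identity matrix and `inf_norm(S)` does not exceed the bound $1/(1 - \|T\|)$ of Theorem 18.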

Our last result in this section is useful in applications when one considers linear transformations
defined on dense subspaces.

Definition 27. Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be normed linear spaces. We say a linear transformation $T \in \mathcal{L}(X_0, Y)$ is densely defined in $X$ if $X_0 \subset X$ is a dense subspace of $X$.

Theorem 3.1: BLT

Let $(X, \|\cdot\|_X)$ be a normed linear space, $(Y, \|\cdot\|_Y)$ a Banach space and $T \in \mathcal{L}(X_0, Y)$ densely defined in $X$. Then there exists a unique $S \in \mathcal{L}(X, Y)$ with $Sx = Tx$ for all $x \in X_0$ and $\|S\| = \|T\|$.

Proof. If $X_0 \subset X$ is a dense subspace, then for every $x \in X$ we can find a sequence $(x_n)_{n=1}^\infty \subset X_0$ such that $x_n \to x$ as $n \to \infty$. Given that $\|Tx_n - Tx_m\|_Y = \|T(x_n - x_m)\|_Y \le \|T\| \|x_n - x_m\|_X$, we conclude that $(Tx_n)_{n=1}^\infty \subset Y$ is Cauchy and thus convergent, say $Tx_n \to y$ as $n \to \infty$. Note that $y$ is independent of the sequence $(x_n)_{n=1}^\infty$, for if there is $(\hat{x}_n)_{n=1}^\infty \subset X_0$ such that $\hat{x}_n \to x$ as $n \to \infty$, then $\|T\hat{x}_n - Tx_n\|_Y \le \|T\| \|\hat{x}_n - x_n\|_X \to 0$ as $n \to \infty$. For this reason $S : X \to Y$ with $Sx := y$ is well-defined, it is linear by linearity of $T$ and if $x \in X_0 \subset X$, then by the above
\[ Tx = T \lim_{n \to \infty} x_n = \lim_{n \to \infty} Tx_n = y = Sx, \]
since $T \in \mathcal{L}(X_0, Y)$ is continuous, compare Theorem 9. Next, by continuity of the norm,
\[ \|Sx\|_Y = \lim_{n \to \infty} \|Tx_n\|_Y \le \lim_{n \to \infty} \|T\| \|x_n\|_X = \|T\| \|x\|_X \quad \forall\, x \in X, \]
so $S \in \mathcal{L}(X, Y)$ and $\|S\| \le \|T\|$, but since $S = T$ on $X_0$, actually $\|S\| = \|T\|$. Finally if there are $S, S' \in \mathcal{L}(X, Y)$ with $Sx = S'x = Tx$ for all $x \in X_0$ and $\|S\| = \|T\| = \|S'\|$, then for every $x \in X$ we can again choose $(x_n)_{n=1}^\infty \subset X_0$ with $x_n \to x$ as $n \to \infty$ and conclude
\[ \|Sx - S'x\|_Y = \lim_{n \to \infty} \|Sx_n - S'x_n\|_Y = \lim_{n \to \infty} \|Tx_n - Tx_n\|_Y = 0. \]
This shows $\|S - S'\| = 0$ in operator norm and therefore $S = S'$, i.e. uniqueness follows. □

3.2 The Hahn-Banach theorem


In dealing with Banach spaces, one often needs to construct linear functionals with certain
properties. This is usually done in two steps: first, one defines the linear functional on a subspace
of the Banach space where it is easy to verify the desired properties; second, one appeals to (or
derives) a general theorem which says that any such functional can be extended to the whole
space while retaining the desired properties. Observe that Theorem 3.1 has a similar flavor,
however it was formulated for linear transformations rather than functionals. We begin with
the following continuation to Definition 20:

Definition 28. Let $X$ be a (real or complex) vector space and $S \subset X$ a convex subset. A function $f : S \to \mathbb{R}$ is called convex if and only if for all $x, y \in S$, $\theta \in [0, 1]$, we have
\[ f\big(\theta x + (1 - \theta) y\big) \le \theta f(x) + (1 - \theta) f(y). \]

Theorem 3.2: real Hahn-Banach

Let 𝑋 be a real vector space and 𝑝 : 𝑋 → R a convex function. Let 𝑌 ⊂ 𝑋 be a subspace


and 𝑓 : 𝑌 → R a linear functional obeying 𝑓 (𝑦) ≤ 𝑝 (𝑦) for all 𝑦 ∈ 𝑌 . Then there exists
a linear functional 𝐹 : 𝑋 → R on 𝑋 , satisfying 𝐹 (𝑥) ≤ 𝑝 (𝑥) for all 𝑥 ∈ 𝑋 such that
𝐹 (𝑦) = 𝑓 (𝑦) for all 𝑦 ∈ 𝑌 .

Remark. Note that 𝑌 ⊂ 𝑋 need not be closed and 𝑋 need not be a normed linear space. These are
signals that analytic tools will not play a significant role in the upcoming proof. In fact we will
again rely on Zorn’s lemma.
The key to the proof of Theorem 3.2 will be to add a single vector 𝑥 ∉ 𝑌 to the domain of 𝑓 and
then use “induction”. To do this we require the following two auxiliary results.

Proposition 6. Let $X, Y, f, p$ be as in the assumptions of Theorem 3.2. Then for every $y_1, y_2 \in Y$, $x \in X \setminus Y$, and $\alpha, \beta \in (0, \infty)$, we have
\[ \frac{1}{\beta}\big[f(y_1) - p(y_1 - \beta x)\big] \le \frac{1}{\alpha}\big[p(y_2 + \alpha x) - f(y_2)\big]. \]


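Proof. Set $\theta := \frac{\beta}{\alpha + \beta} \in (0, 1)$. Then
\[ f\big((1 - \theta) y_1 + \theta y_2\big) \le p\big((1 - \theta) y_1 + \theta y_2\big) = p\big((1 - \theta)(y_1 - \beta x) + \theta (y_2 + \alpha x)\big) \le (1 - \theta)\, p(y_1 - \beta x) + \theta\, p(y_2 + \alpha x), \tag{3.1} \]
where the first inequality follows from $f(y) \le p(y)$ for $y \in Y$, the equality from $(1 - \theta)\beta = \theta\alpha$ and the second inequality from convexity of $p$. Multiplying (3.1) by $(\alpha + \beta)$ yields through linearity of $f$,
\[ \alpha f(y_1) + \beta f(y_2) \le \alpha\, p(y_1 - \beta x) + \beta\, p(y_2 + \alpha x) \]
and thus the claimed inequality after rearrangement and division by $\alpha\beta$. □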

Proposition 7. Given $X, Y, f, p$ as in the assumptions of Theorem 3.2 and $x \in X \setminus Y$, there exists a linear functional $F_\circ : Y_\circ := Y + \{\alpha x : \alpha \in \mathbb{R}\} \to \mathbb{R}$ so that $F_\circ(z) \le p(z)$ for all $z \in Y_\circ$ and $F_\circ(y) = f(y)$ for all $y \in Y$.

Proof. Given that every $z \in Y_\circ$ is of the form $z = y + \alpha x$ for some unique $y \in Y$ and $\alpha \in \mathbb{R}$, we will define the linear extension $F_\circ : Y_\circ \to \mathbb{R}$ as
\[ F_\circ(z) := f(y) + \alpha F_\circ(x), \tag{3.2} \]
and thus need to make sense of $F_\circ(x)$ for $x \in X \setminus Y$. But with (3.2) in place the desired inequality $F_\circ(z) \le p(z)$ on $Y_\circ$ is equivalent to
\[ f(y_1) + \alpha F_\circ(x) \le p(y_1 + \alpha x), \qquad f(y_2) - \beta F_\circ(x) \le p(y_2 - \beta x) \]
for all $y_1, y_2 \in Y$ and $\alpha, \beta \in (0, \infty)$, the case of a vanishing coefficient being the assumption $f \le p$ on $Y$. This is equivalent to
\[ \frac{1}{\beta}\big[f(y_2) - p(y_2 - \beta x)\big] \le F_\circ(x) \le \frac{1}{\alpha}\big[p(y_1 + \alpha x) - f(y_1)\big], \tag{3.3} \]
so any value for $F_\circ(x)$ with
\[ \sup_{\beta > 0,\ y_2 \in Y} \frac{1}{\beta}\big[f(y_2) - p(y_2 - \beta x)\big] \le F_\circ(x) \le \inf_{\alpha > 0,\ y_1 \in Y} \frac{1}{\alpha}\big[p(y_1 + \alpha x) - f(y_1)\big] \]
will do, noting that both extreme values are finite and that such a value exists by Proposition 6. With this choice for $F_\circ(x)$ we satisfy (3.3) and have thus completed the proof. □
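
The admissible interval for $F_\circ(x)$ can be made concrete in a toy case. The sketch below assumes $X = \mathbb{R}^2$, $p$ the 1-norm, $Y = \{(t, 0) : t \in \mathbb{R}\}$, $f(t, 0) = t$ and $x = (0, 1)$; these choices and the sampling grids are ours, and a finite grid in general only approximates the supremum and infimum (here it happens to attain them).

```python
from fractions import Fraction

def p(v):
    """Convex majorant: the 1-norm on R^2."""
    return abs(v[0]) + abs(v[1])

def f(t):
    """Functional on Y = {(t, 0)}: f(t, 0) = t, so f <= p on Y."""
    return t

grid = [Fraction(k, 4) for k in range(-20, 21)]    # sample points t for y = (t, 0)
scales = [Fraction(k, 4) for k in range(1, 21)]    # sample values of alpha, beta > 0

# left-hand side of the sup/inf condition: (1/beta) [f(y) - p(y - beta x)]
lower = max((f(t) - p((t, -b))) / b for t in grid for b in scales)
# right-hand side: (1/alpha) [p(y + alpha x) - f(y)]
upper = min((p((t, a)) - f(t)) / a for t in grid for a in scales)

print(lower, upper)
```

In this toy case every value $c \in [-1, 1]$ is admissible, and indeed $F_\circ(t, s) = t + cs$ satisfies $F_\circ(t, s) \le |t| + |s|$ exactly for such $c$; the extension is in general far from unique.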

Proposition 7 shows that 𝑓 can be extended “one dimension at a time”. We will now use Zorn’s
lemma to show that this process can be continued to extend 𝑓 to the whole space 𝑋 .

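Proof of Theorem 3.2. Let $\mathcal{P} := \{(Y_\circ, F_\circ)\}$ denote the set of all pairs consisting of a subspace $Y_\circ$ with $Y \subset Y_\circ \subset X$ and a linear functional $F_\circ : Y_\circ \to \mathbb{R}$ obeying $F_\circ(z) \le p(z)$ for all $z \in Y_\circ$ with $F_\circ(y) = f(y)$ when $y \in Y$. This set is nonempty, as it contains $(Y, f)$. Next, $\mathcal{P}$ is partially ordered by extension; that is, we say $(Y_1, F_1) \preceq (Y_2, F_2)$ if and only if $Y_1 \subset Y_2$ and $F_2(y) = F_1(y)$ when $y \in Y_1$. Moving ahead, pick any chain $\{(Y_\alpha, F_\alpha)\}_{\alpha \in I} \subset \mathcal{P}$, i.e. any totally ordered subset of $\mathcal{P}$. Then $\big(\bigcup_{\alpha \in I} Y_\alpha, F_*\big)$ with $F_*(y) := F_\alpha(y)$ for $y \in Y_\alpha$ is an upper bound for the chosen chain: first, $\bigcup_{\alpha \in I} Y_\alpha$ is a subspace of $X$, for if $x, y \in \bigcup_{\alpha \in I} Y_\alpha$ then, the chain being totally ordered, $x, y \in Y_\beta \subset X$ for some $\beta \in I$, consequently $\operatorname{span}\{x, y\} \subset Y_\beta \subset \bigcup_{\alpha \in I} Y_\alpha$. Second, $F_*$ is well-defined (any two members of the chain agree on the smaller of their domains) and clearly a linear functional with the desired properties by construction, so together $\big(\bigcup_{\alpha \in I} Y_\alpha, F_*\big) \in \mathcal{P}$. Third and last, every $(Y_\gamma, F_\gamma)$ in the chain satisfies $(Y_\gamma, F_\gamma) \preceq \big(\bigcup_{\alpha \in I} Y_\alpha, F_*\big)$, again by construction. Moving ahead, by Theorem 1.1, there exists a maximal element of $\mathcal{P}$, say $(Y_\infty, F_\infty)$. If $Y_\infty \ne X$, by Proposition 7, we can find $x \in X \setminus Y_\infty$ and a linear functional $G_\circ$ on $Y_{\infty,\circ} := Y_\infty + \{\alpha x : \alpha \in \mathbb{R}\}$ so that $G_\circ \le p$ on $Y_{\infty,\circ}$ and $G_\circ(y) = F_\infty(y)$ when $y \in Y_\infty$. In turn, $(Y_\infty, F_\infty)$ would not be maximal since $Y_{\infty,\circ} \ne Y_\infty$. We conclude that $Y_\infty = X$ and $F := F_\infty$ satisfies $F(x) \le p(x)$ for all $x \in X$ with $F(y) = f(y)$ when $y \in Y$. This completes the proof of Theorem 3.2. □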

Remark. If 𝑋 is a separable space and 𝑝 (𝑥) = k𝑥 k, i.e. 𝑋 is a separable normed linear space, then
one can avoid Zorn’s lemma and use ordinary induction instead.
We now turn to the complex version of Theorem 3.2.

Definition 29. Let $X$ be a complex vector space. A function $p : X \to \mathbb{R}$ is called symmetric if for all $x \in X$ and $\lambda \in \mathbb{C}$ with $|\lambda| = 1$,
\[ p(\lambda x) = p(x). \]
Notice that the norm in a normed linear space $(X, \|\cdot\|)$ is symmetric (and convex).

Theorem 3.3: complex Hahn-Banach

Let 𝑋 be a complex vector space, 𝑝 : 𝑋 → R a symmetric convex function, 𝑌 ⊂ 𝑋 a


complex subspace, and 𝑓 : 𝑌 → C a linear functional obeying |𝑓 (𝑦)| ≤ 𝑝 (𝑦) for all 𝑦 ∈ 𝑌 .
Then there exists a linear functional 𝐹 : 𝑋 → C on 𝑋 , satisfying |𝐹 (𝑥)| ≤ 𝑝 (𝑥) for all
𝑥 ∈ 𝑋 such that 𝐹 (𝑦) = 𝑓 (𝑦) for all 𝑦 ∈ 𝑌 .

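Proof. Note that $\Im(f(y)) = \Re(-\mathrm{i} f(y))$ for all $y \in Y$ and so by linearity of $f$, for all $y \in Y$,
\[ f(y) = \Re(f(y)) + \mathrm{i}\, \Re(f(-\mathrm{i} y)). \tag{3.4} \]
Since $|f(y)| \le p(y)$ on $Y$ we have in particular $\Re(f(y)) \le p(y)$ on $Y$ and so by Theorem 3.2, applied to $X$ viewed as a real vector space, there is a real-linear functional $F_r : X \to \mathbb{R}$ with $F_r(x) \le p(x)$ for all $x \in X$ and $F_r(y) = \Re(f(y))$ when $y \in Y$. But by symmetry of $p$, we have $p(-x) = p(x)$, $x \in X$, and thus $|F_r(x)| \le p(x)$ on $X$ by linearity of $F_r$ and the fact that $F_r(x) \le p(x)$ holds for all $x \in X$. Define
\[ F(x) := F_r(x) + \mathrm{i}\, F_r(-\mathrm{i} x), \quad x \in X, \]
then $F(\mathrm{i} x) = F_r(\mathrm{i} x) + \mathrm{i} F_r(x) = \mathrm{i}\big(F_r(x) + \mathrm{i} F_r(-\mathrm{i} x)\big)$ since $F_r(\mathrm{i} x) = -F_r(-\mathrm{i} x) \in \mathbb{R}$, so $F$ is complex linear and by (3.4), $F(y) = f(y)$ when $y \in Y$. It remains to prove the inequality for $F$: given $x \in X$, pick $\theta \in [0, 2\pi)$ so that $F(x) = |F(x)| \mathrm{e}^{\mathrm{i}\theta}$. Then
\[ |F(x)| = \underbrace{F\big(\mathrm{e}^{-\mathrm{i}\theta} x\big)}_{\in \mathbb{R}} = F_r\big(\mathrm{e}^{-\mathrm{i}\theta} x\big) \le \big|F_r\big(\mathrm{e}^{-\mathrm{i}\theta} x\big)\big| \le p\big(\mathrm{e}^{-\mathrm{i}\theta} x\big) = p(x) \quad \forall\, x \in X. \]
The proof is completed. □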
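
The formula $F(x) = F_r(x) + \mathrm{i} F_r(-\mathrm{i} x)$, which rebuilds a complex-linear functional from its real part, is easy to test in finite dimensions. The sketch below assumes $X = \mathbb{C}^2$ and an arbitrarily chosen functional `f`; all names and coefficients are illustrative only.

```python
def f(z):
    """A complex-linear functional on C^2 (coefficients chosen arbitrarily)."""
    return (2 - 1j) * z[0] + (0.5 + 3j) * z[1]

def F_r(z):
    """Real part of f: a real-linear functional on C^2."""
    return f(z).real

def F(z):
    """Rebuild the complex functional via F(x) = F_r(x) + i F_r(-i x)."""
    minus_i_z = tuple(-1j * t for t in z)
    return F_r(z) + 1j * F_r(minus_i_z)

z = (1.5 - 2j, -0.25 + 4j)
print(F(z), f(z))   # the two values agree up to rounding
```

The agreement reflects identity (3.4): the imaginary part of $f(x)$ is the real part of $f(-\mathrm{i} x)$, so the real part alone determines $f$.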

The Hahn-Banach theorems will imply that there are lots of continuous linear functionals on a
given normed linear space, 𝑋 , enough so that for any nonzero 𝑥 ∈ 𝑋 , there is an 𝑓 ∈ 𝑋 ∗ with 𝑓 (𝑥) ≠ 0.
This and other consequences are summarized in the following:

Corollary 7. Let $(X, \|\cdot\|)$ be a (real or complex) normed linear space and $Y \subset X$ a subspace. Then, given any bounded linear functional $f \in Y^*$, there is $F \in X^*$ with $F(y) = f(y)$ for $y \in Y$ and $\|F\| = \|f\|$.

Proof. This is immediate from Theorems 3.2 and 3.3 with $p(x) := \|f\| \|x\|$, for the extension $F : X \to \mathbb{F}$ satisfies $|F(x)| \le \|f\| \|x\|$ and thus $\|F\| \le \|f\|$. But $F(y) = f(y)$ on $Y$, so in fact $\|F\| = \|f\|$. This completes the proof. □

Corollary 8. Let $(X, \|\cdot\|)$ be a (real or complex) normed linear space and $X^*$ its dual space. For any $x_0 \in X$, there exists $f \in X^*$ so that
\[ f(x_0) = \|x_0\|, \qquad \|f\| = 1. \]
In particular, if $x_0 \ne 0$, there exists $f \in X^*$ with $f(x_0) \ne 0$.

Proof. Let $x_0 \ne 0$ (if $x_0 = 0$ and $X \ne \{0\}$, apply the argument to any nonzero vector; the resulting $f$ satisfies $f(0) = 0$ and $\|f\| = 1$). Set $Y := \{\alpha x_0 : \alpha \in \mathbb{F}\} = \operatorname{span}\{x_0\}$ and define $g : Y \to \mathbb{F}$ by $g(\alpha x_0) := \alpha \|x_0\|$. Clearly $g \in Y^*$ with $|g(y)| = \|y\|$ for all $y \in Y$, so $\|g\| = 1$, and by Corollary 7 there exists $f \in X^*$ with $f(y) = g(y)$ for $y \in Y$, so $f(x_0) = g(x_0) = \|x_0\|$, and $\|f\| = \|g\| = 1$. □

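Corollary 9. Let $(X, \|\cdot\|)$ be a (real or complex) normed linear space, $Y \subset X$ a closed subspace and $x_0 \in X \setminus Y$. Then there exists $f \in X^*$ so that
\[ f(y) = 0 \quad \forall\, y \in Y, \qquad f(x_0) \ne 0. \]

Proof. We first show that there exists a constant $c > 0$ so that $\|y + \alpha x_0\| \ge c|\alpha|$ for all $\alpha \in \mathbb{F}$ and $y \in Y$: if not, then for every $n \in \mathbb{N}$ we can find $y_n \in Y$, $\alpha_n \in \mathbb{F}$ so that $\|y_n + \alpha_n x_0\| < \frac{1}{n}|\alpha_n|$. Evidently, $\alpha_n \ne 0$, so setting $v_n := -y_n/\alpha_n \in Y$, we have $\|v_n - x_0\| < \frac{1}{n}$ which tells us that $v_n \to x_0$ as $n \to \infty$. But this is impossible for $Y$ is closed and $x_0 \in X \setminus Y$. This verifies our initial claim. Moving ahead, we introduce $h : Y_\circ \to \mathbb{F}$ on the subspace $Y_\circ := Y + \{\alpha x_0 : \alpha \in \mathbb{F}\}$ via
\[ h(y + \alpha x_0) := c\alpha \]
and note that $h$ is well-defined and linear (the decomposition $y + \alpha x_0$ is unique since $x_0 \notin Y$), that $h(y) = 0$ for all $y \in Y$, as well as $|h(y + \alpha x_0)| = c|\alpha| \le \|y + \alpha x_0\|$ by our initial claim. In turn Corollary 7 yields existence of a linear $f : X \to \mathbb{F}$ with $\|f\| = \|h\| \le 1$, $f(y) = h(y) = 0$ on $Y$ and $f(x_0) = h(x_0) = c \ne 0$. The proof is complete. □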


Corollary 10. Let $(X, \|\cdot\|)$ be a (real or complex) normed linear space and $Y \subset X$ a nonempty subspace of $X$. Then for every $x \in X$ with $\operatorname{dist}(x, Y) = \inf\{\|x - y\| : y \in Y\} = \delta > 0$ there exists $f \in X^*$ with $\|f\| \le 1$, $f(x) = \delta$ and $f(y) = 0$ for all $y \in Y$.

Proof. As in the proof of Corollary 9 we start with the subspace $Y_\circ := Y + \{\alpha x : \alpha \in \mathbb{F}\}$ but now define $g : Y_\circ \to \mathbb{F}$ by
\[ g(y + \alpha x) := \alpha\delta. \]
Again, $g$ is well-defined, linear, we have $g(y) = 0$ for all $y \in Y$, $g(x) = \delta$ and $g$ is bounded: for $\alpha \ne 0$,
\[ |g(y + \alpha x)| = |\alpha|\delta = |\alpha| \inf\{\|x - y'\| : y' \in Y\} = \inf\{\|\alpha x - y'\| : y' \in Y\} \le \|\alpha x + y\|, \]
so $\|g\| \le 1$. Hence, by Corollary 7, there is $f \in X^*$ with $f(y) = g(y) = 0$ on $Y$, $f(x) = g(x) = \delta$ (since $f(z) = g(z)$ for all $z \in Y_\circ$) and $\|f\| = \|g\| \le 1$. This completes our proof. □

Corollary 11 (Separation property). For every two vectors $x \ne y$ in a (real or complex) normed linear space $X$, there exists $f \in X^*$ such that $f(x) \ne f(y)$.

Proof. Set $x_0 := x - y \ne 0$, then by Corollary 8 we can find $f \in X^*$ with $f(x) - f(y) = f(x_0) = \|x_0\| \ne 0$. The proof is complete. □

In order to show how useful the above corollaries are we prove the following general theorem
which will close this Section while preparing the next one.

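Theorem 19. Let $X$ be a Banach space over $\mathbb{F} = \mathbb{R}$ or $\mathbb{F} = \mathbb{C}$. If its dual space $X^*$ is separable, then $X$ is separable.

Proof. Let $\{f_n\}_{n=1}^\infty \subset X^*$ be a countable dense subset in $X^*$. Then for every $n \in \mathbb{N}$ we can choose $x_n \in X \setminus \{0\}$ so that $|f_n(x_n)| \ge \frac{1}{2}\|f_n\| \|x_n\|$ (for otherwise $\|f_{n_0}\| \le \frac{1}{2}\|f_{n_0}\|$ for some $n_0 \in \mathbb{N}$ with $f_{n_0} \ne 0$). Let
\[ \mathcal{D} := \Big\{ \sum_{j=1}^{n} \alpha_j x_j : \alpha_j \in \mathbb{Q},\ n \in \mathbb{N} \Big\} \subset X \]
denote the set of all finite linear combinations of the $\{x_n\}_{n=1}^\infty$ with rational coefficients (with $\mathbb{Q}$ replaced by $\mathbb{Q} + \mathrm{i}\mathbb{Q}$ when $\mathbb{F} = \mathbb{C}$). Since $\mathcal{D}$ is countable, it is now sufficient to show that $\mathcal{D}$ is dense in $X$: if it were not, then there is $y \in X \setminus \overline{\mathcal{D}}$ and $f \in X^*$ with $f(y) \ne 0$, but $f(x) = 0$ for all $x \in \mathcal{D}$, see Corollary 10 applied to the subspace spanned by the $x_n$. Let $(f_{n_k})_{k=1}^\infty \subset \{f_n\}_{n=1}^\infty \subset X^*$ be a subsequence which converges to $f$ as $k \to \infty$ (recall that $\{f_n\}_{n=1}^\infty$ is dense in $X^*$). Then, using $f(x_{n_k}) = 0$,
\[ \|f - f_{n_k}\| \ge \frac{|(f - f_{n_k})(x_{n_k})|}{\|x_{n_k}\|} = \frac{|f_{n_k}(x_{n_k})|}{\|x_{n_k}\|} \ge \frac{1}{2}\|f_{n_k}\| \]
which shows that $\|f_{n_k}\| \to 0$ as $k \to \infty$, and so $f$ is the zero functional on $X$. But this contradicts $f(y) \ne 0$, so $\mathcal{D}$ is dense and thus $X$ separable. □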


3.3 Duals and double duals


In a normed linear space $(X, \|\cdot\|)$ over $\mathbb{F} = \mathbb{R}$ or $\mathbb{F} = \mathbb{C}$, the dual space $X^* = \mathcal{L}(X, \mathbb{F})$ is a Banach space by Theorem 10. By Corollary 8 there are lots of elements in $X^*$, so it is natural to expect that $X^*$ is sufficiently interesting. In fact, once $X^*$ is in place, we also have the double dual $X^{**} = (X^*)^*$, the dual transformation and the notion of a reflexive space.

Theorem 20. Let $A \in \mathcal{L}(X, Y)$ be a bounded linear transformation between two normed linear spaces $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$. Then there exists a unique $A^t \in \mathcal{L}(Y^*, X^*)$, the dual, so that
\[ (A^t f)(x) = f(Ax) \quad \forall\, x \in X,\ \forall\, f \in Y^*. \tag{3.5} \]
The map $A \mapsto A^t$ is a linear isometry.

Proof. For every $A \in \mathcal{L}(X, Y)$ and $f \in Y^*$ we have $f \circ A \in X^*$ as composition of linear maps and since $\|f \circ A\| \le \|f\| \|A\| < \infty$ by Corollary 1. This shows that (3.5) indeed defines a bounded linear transformation $A^t \in \mathcal{L}(Y^*, X^*)$. Next, use Definition 14 and compute
\[ \|A^t\| = \sup_{\|f\|=1} \|A^t f\| = \sup_{\|f\|=1}\Big[\sup_{\|x\|_X=1} |(A^t f)(x)|\Big] \overset{(3.5)}{=} \sup_{\|f\|=1}\Big[\sup_{\|x\|_X=1} |f(Ax)|\Big] \le \sup_{\|f\|=1}\Big[\sup_{\|x\|_X=1} \|f \circ A\| \|x\|_X\Big] \le \sup_{\|f\|=1} \|f\| \|A\| = \|A\|. \]
On the other hand, by Corollary 8, given $x \in X$, there exists $g \in Y^*$ with $\|g\| = 1$ and $g(Ax) = \|Ax\|_Y$. Hence, by (3.5),
\[ \|Ax\|_Y = |g(Ax)| = |(A^t g)(x)| \le \|A^t g\| \|x\|_X \le \|A^t\| \|x\|_X, \]
which yields $\|A\| \le \|A^t\|$ and thus together with the previous part $\|A\| = \|A^t\|$. This implies in particular that $A \mapsto A^t$ is an isometry and since for two $A, B \in \mathcal{L}(X, Y)$, $\alpha, \beta \in \mathbb{F}$,
\[ f\big((\alpha A + \beta B)x\big) = \alpha f(Ax) + \beta f(Bx) \overset{(3.5)}{=} \alpha (A^t f)(x) + \beta (B^t f)(x) = \big((\alpha A^t + \beta B^t) f\big)(x), \]
whenever $x \in X$ and $f \in Y^*$, it follows that $(\alpha A + \beta B)^t = \alpha A^t + \beta B^t$, i.e. $A \mapsto A^t$ is linear, provided $A^t$ is uniquely defined by (3.5). But this is easy to see, for if there were $A^t_1, A^t_2 \in \mathcal{L}(Y^*, X^*)$ and both satisfy (3.5), then
\[ \|A^t_1 - A^t_2\| = \sup_{\|f\|=1}\Big[\sup_{\|x\|_X=1} \big|((A^t_1 - A^t_2) f)(x)\big|\Big] = \sup_{\|f\|=1}\Big[\sup_{\|x\|_X=1} \big|(A^t_1 f)(x) - (A^t_2 f)(x)\big|\Big] = \sup_{\|f\|=1}\Big[\sup_{\|x\|_X=1} \big|f(Ax) - f(Ax)\big|\Big] = 0, \]
so $A^t_1 = A^t_2$, i.e. uniqueness is established (and thus $A \mapsto A^t$ well-defined). □
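
In finite dimensions the dual transformation is the transposed matrix, and the identity $\|A\| = \|A^t\|$ can be checked directly. The sketch below assumes $X = (\mathbb{R}^3, \|\cdot\|_1)$ and $Y = (\mathbb{R}^2, \|\cdot\|_1)$, so that the duals carry the sup norm; functionals are identified with vectors through the dot product. The matrix `A` and all helper names are our own choices. Both operator norms are computed by brute force over the extreme points of the respective unit balls, where a convex function attains its maximum.

```python
from itertools import product

A = [[1, -2, 0], [3, 4, -1]]   # a linear map R^3 -> R^2

def apply(M, v):
    """Matrix-vector product."""
    return [sum(a * t for a, t in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm_1_to_1(M):
    """Operator norm for the 1-norm: maximise over the unit vectors e_j."""
    n = len(M[0])
    units = [[1 if i == j else 0 for i in range(n)] for j in range(n)]
    return max(sum(abs(t) for t in apply(M, e)) for e in units)

def norm_inf_to_inf(M):
    """Operator norm for the sup norm: maximise over all sign vectors."""
    n = len(M[0])
    return max(max(abs(t) for t in apply(M, s))
               for s in product((-1, 1), repeat=n))

At = transpose(A)
f, x = [2, -5], [1, 0, 3]
print(dot(apply(At, f), x), dot(f, apply(A, x)))   # (A^t f)(x) and f(Ax)
print(norm_1_to_1(A), norm_inf_to_inf(At))          # ||A|| and ||A^t||
```

Both printed pairs agree, in line with (3.5) and with the isometry statement of Theorem 20.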


Below, we separately summarize the key properties of the map 𝐴 ↦→ 𝐴𝑡 . These have either been
proven in Theorem 20 or will be proven in the exercises.

Theorem 21. Let $A \in \mathcal{L}(X, Y)$, $B \in \mathcal{L}(Y, Z)$ be two bounded linear transformations between three normed linear spaces $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$ and $(Z, \|\cdot\|_Z)$ and let $A^t \in \mathcal{L}(Y^*, X^*)$, $B^t \in \mathcal{L}(Z^*, Y^*)$ denote their duals. Then
(1) $A \mapsto A^t$ is a linear isometry.
(2) $(B \circ A)^t = A^t \circ B^t$.
(3) $A^t$ is injective provided $A$ is surjective. If $A \in \mathcal{L}(X, Y)$ is invertible, then so is $A^t \in \mathcal{L}(Y^*, X^*)$ and we have $(A^{-1})^t = (A^t)^{-1}$.

Remark. The dual transformation introduced in Theorem 20 and further studied in Theorem 21
also goes under the name Banach space adjoint (or transpose) if the underlying normed linear
spaces are Banach spaces. This is because of the similarities between the dual’s properties listed in
Theorem 21 and the properties of the Hilbert space adjoint studied in Chapter 4.
Moving ahead, one important consequence of the existence of a large 𝑋 ∗ is a large (𝑋 ∗ ) ∗ . We
now proceed in studying those linear functionals on 𝑋 ∗ .

Definition 30. Let $X$ be a normed linear space. The dual space $X^{**} = (X^*)^*$ of the dual space $X^*$ is called the second dual, the bidual, or the double dual of $X$.
Given $x \in X$, the rule
\[ \forall\, f \in X^* : \quad j_x(f) := f(x), \tag{3.6} \]
defines an element $j_x \in X^{**}$ and we thus have a canonical map $J : X \to X^{**}$ in setting $x \mapsto j_x$.

Theorem 22. Let $(X, \|\cdot\|_X)$ be a normed linear space. Then $J \in \mathcal{L}(X, X^{**})$ is an isometry. In particular, $J$ is injective and, provided $X$ is a Banach space, $\operatorname{Ran}(J)$ is a closed subspace of $X^{**}$.

Proof. By (3.6),
\[ j_{\alpha x + \beta y}(f) = f(\alpha x + \beta y) = \alpha f(x) + \beta f(y) = (\alpha j_x + \beta j_y)(f) \quad \forall\, x, y \in X,\ \forall\, \alpha, \beta \in \mathbb{F},\ \forall\, f \in X^*, \]
so $J : X \to X^{**}$ is surely linear, and since
\[ \|J\| = \sup_{\|x\|_X=1} \|Jx\| = \sup_{\|x\|_X=1} \|j_x\| = \sup_{\|x\|_X=1}\Big[\sup_{\|f\|=1} |j_x(f)|\Big] \overset{(3.6)}{\le} \sup_{\|x\|_X=1}\Big[\sup_{\|f\|=1} \|f\| \|x\|_X\Big] \le 1, \]
we have $J \in \mathcal{L}(X, X^{**})$. But from the last estimate, we deduce $\|Jx\| \le \|x\|_X$ and on the other hand, given $x_0 \in X$, there exists $g \in X^*$ so that $g(x_0) = \|x_0\|_X$ and $\|g\| = 1$, see Corollary 8. So,
\[ \|x_0\|_X = |g(x_0)| \overset{(3.6)}{=} |j_{x_0}(g)| \quad \Rightarrow \quad \|Jx_0\| = \sup_{\|f\|=1} |j_{x_0}(f)| \ge |j_{x_0}(g)| = \|x_0\|_X, \]
and thus together $\|Jx\| = \|x\|_X$ for all $x \in X$, i.e. $J$ is an isometry and thus, in particular, injective. Now assume $(X, \|\cdot\|_X)$ is a Banach space and $(j_{x_n})_{n=1}^\infty \subset \operatorname{Ran}(J)$ with $(x_n)_{n=1}^\infty \subset X$ a convergent sequence, say $j_{x_n} \to k \in X^{**}$ as $n \to \infty$. Then, for all $f \in X^*$ and any $n, m \in \mathbb{N}$,
\[ |f(x_n - x_m)| = |f(x_n) - f(x_m)| = |j_{x_n}(f) - j_{x_m}(f)| = |(j_{x_n} - j_{x_m})(f)| \le \|j_{x_n} - j_{x_m}\| \|f\|, \]
and thus, compare the exercises, $\|x_n - x_m\|_X \le \|j_{x_n} - j_{x_m}\|$, which shows that $(x_n)_{n=1}^\infty \subset X$ is Cauchy and hence convergent, $x_n \to x_\infty$, say. However, for any $n \in \mathbb{N}$, since $J$ is an isometry,
\[ \|j_{x_\infty} - k\| \le \|j_{x_\infty} - j_{x_n}\| + \|j_{x_n} - k\| = \|x_\infty - x_n\|_X + \|j_{x_n} - k\| \]
and therefore, after sending $n$ to infinity, $k = j_{x_\infty} \in \operatorname{Ran}(J)$, i.e. $\operatorname{Ran}(J)$ is closed. This completes our proof. □

Definition 31. A Banach space $X$ is called reflexive if and only if $\operatorname{Ran}(J) = X^{**}$.

Before analyzing a few concrete dual spaces, we list two further results about reflexive spaces.

Proposition 8. A closed subspace $Y \subset X$ of a reflexive Banach space $X$ is reflexive and so is any Banach space isomorphic to $X$.

Proof. Since $Y$ is a Banach space we pick $k \in Y^{**}$ and now need to show that there exists $y \in Y$ so that $k = j_y$ with $j_y(f) = f(y)$ for every $f \in Y^*$. To this end construct the lift
\[ \ell(f) := k(f|_Y) \quad \forall\, f \in X^*, \]
where $f|_Y : Y \to \mathbb{F}$ denotes the restriction of $f : X \to \mathbb{F}$ to the subspace $Y$. Clearly, $\ell \in X^{**}$ and since $X$ is reflexive there exists $x \in X$ so that $\ell(f) = j_x(f) = f(x)$ for every $f \in X^*$. But, in fact, $x \in Y$, for if $x \in X \setminus Y$, then, by Corollary 9, we can find $g \in X^*$ so that $g|_Y = 0$ and $g(x) \ne 0$ which contradicts $0 \ne g(x) = j_x(g) = \ell(g) = k(g|_Y) = 0$. So, $y := x \in Y$. Given $f \in Y^*$, Corollary 7 provides an extension $\tilde{f} \in X^*$ with $\tilde{f}|_Y = f$, hence $j_y(f) = f(y) = \tilde{f}(y) = \ell(\tilde{f}) = k(f)$ for all $f \in Y^*$, i.e. $J : Y \to Y^{**}$ with $y \mapsto j_y$ is surjective. Moving ahead, if $A \in \mathcal{L}(X, Z)$ is an isomorphism, then so is its dual $A^t \in \mathcal{L}(Z^*, X^*)$ by Theorem 21, and likewise its dual again, i.e. $(A^t)^t \in \mathcal{L}(X^{**}, Z^{**})$. Let $\ell \in Z^{**}$ and set
\[ k := \big((A^t)^t\big)^{-1} \ell \in X^{**}. \]
Since $X$ is reflexive there exists $x \in X$ such that $k = j_x$ with $j_x(f) = f(x)$ for all $f \in X^*$ which, in turn, allows us to define $z := Ax \in Z$. Now check that for every $g = (A^t)^{-1} f \in Z^*$,
\[ k(A^t g) = k(f) = j_x(f) = f(x) = (A^t g)(x) \overset{(3.5)}{=} g(Ax) = g(z), \]
and
\[ \ell(g) = \big((A^t)^t k\big)(g) \overset{(3.5)}{=} k(A^t g), \]
so we have $\ell(g) = g(z)$ and thus $\ell = j_{Ax} = j_z$. This shows that $J : Z \to Z^{**}$ is surjective, and thus $Z$ reflexive by Definition 31. □


Theorem 23. A Banach space 𝑋 is reflexive if and only if its dual space 𝑋 ∗ is reflexive.

Proof. Reflexivity of $X^*$ is equivalent to the surjectivity of the map $J : X^* \to X^{***}$ from the dual space $X^*$ to its bidual $X^{***} = (X^*)^{**}$ that acts as $X^* \ni f \mapsto j_f \in X^{***}$ via the rule
\[ \forall\, F \in X^{**} : \quad j_f(F) = F(f). \]
In short, $J$ is surjective if and only if for every $k \in X^{***}$ there exists $f \in X^*$ so that $k = j_f$ with $j_f(F) = F(f)$ for every $F \in X^{**}$. Now suppose that $X$ is reflexive and let $k \in X^{***}$ be arbitrary. Using the map $I : X \to X^{**}$ with $x \mapsto i_x$ where $i_x(f) = f(x)$ for all $f \in X^*$, we define $g := k \circ I : X \to \mathbb{F}$ and note that by Theorem 22, $g \in X^*$, and $g(x) = k(i_x)$ for every $x \in X$. But since $X$ is reflexive, every $G \in X^{**}$ is of the form $G = i_x$ for some $x \in X$, so together
\[ k(G) = k(i_x) = g(x) = i_x(g) = G(g) = j_g(G) \quad \forall\, G \in X^{**}. \]
This proves that $J : X^* \to X^{***}$ is surjective and thus $X^*$ reflexive. Conversely, if $X^*$ is reflexive then so is $X^{**}$ by the first part of the current proof. But $\operatorname{Ran}(J)$ with $J : X \to X^{**}$ is a closed subspace of $X^{**}$ by Theorem 22 (so itself a Banach space) and thus reflexive by Proposition 8. Hence, given that $X$ and $\operatorname{Ran}(J)$ are isomorphic (since $J : X \to \operatorname{Ran}(J)$ is bijective with bounded inverse $J^{-1} : \operatorname{Ran}(J) \to X$) it follows that $X$ is reflexive by Proposition 8. □

We now discuss several dual spaces in detail.

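Example 21. Let $(\mathcal{H}, \langle\cdot,\cdot\rangle)$ be an arbitrary Hilbert space. Then $\mathcal{H}$ is isomorphic to its own dual space $\mathcal{H}^*$ and $\mathcal{H}$ is reflexive: By Theorem 2.3, any $f \in \mathcal{H}^*$ is of the form $f(x) = \langle x, y\rangle$, $x \in \mathcal{H}$, for a unique $y = y_f \in \mathcal{H}$ with $\|f\| = \|y_f\|$. Thus $T : \mathcal{H}^* \to \mathcal{H}$ with $Tf = y_f$ is a bijective isometry (linear if $\mathbb{F} = \mathbb{R}$, conjugate-linear if $\mathbb{F} = \mathbb{C}$) with bounded inverse $T^{-1} : \mathcal{H} \to \mathcal{H}^*$ given by $y \mapsto \langle\cdot, y\rangle \in \mathcal{H}^*$. This shows that $\mathcal{H}$ and $\mathcal{H}^*$ are isomorphic. Next, let $k \in \mathcal{H}^{**}$ be arbitrary. Since $\mathcal{H}^*$ is a Hilbert space with the inner product
\[ \langle f, g\rangle_{\mathcal{H}^*} := \langle y_g, y_f\rangle, \qquad f = \langle\cdot, y_f\rangle, \quad g = \langle\cdot, y_g\rangle, \]
we can use Theorem 2.3 again and thus find a unique $f_k \in \mathcal{H}^*$ so that $k(f) = \langle f, f_k\rangle_{\mathcal{H}^*}$ for all $f \in \mathcal{H}^*$. Consequently,
\[ k(f) = \langle f, f_k\rangle_{\mathcal{H}^*} = \langle y_{f_k}, y_f\rangle = f(y_{f_k}), \]
which means there exists $y := y_{f_k} \in \mathcal{H}$ so that $k = j_y$ with $j_y(f) = f(y)$ for all $f \in \mathcal{H}^*$. In short, $\mathcal{H}$ is reflexive.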

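Example 22. Consider $X = (\ell_1(\mathbb{N}), \|\cdot\|_1)$ and $Y = (\ell_\infty(\mathbb{N}), \|\cdot\|_\infty)$ as in Example 2. Then $X^*$ is isomorphic to $Y$ but $X$ and $Y$ are not reflexive: for any $y = (y_n)_{n=1}^\infty \in \ell_\infty(\mathbb{N})$, the map $f_y : \ell_1(\mathbb{N}) \to \mathbb{F}$ given by
\[ f_y(x) := \sum_{n=1}^{\infty} x_n y_n \tag{3.7} \]
defines a bounded linear functional on $\ell_1(\mathbb{N})$ with $\|f_y\| \le \|y\|_\infty$. In turn, $T : \ell_\infty(\mathbb{N}) \to (\ell_1(\mathbb{N}))^*$ with $y \mapsto f_y$ is a bounded linear transformation that satisfies $\|T\| \le 1$. However, if $y \in \ell_\infty(\mathbb{N})$, then for any $\epsilon > 0$ there exists $N = N(\epsilon) \in \mathbb{N}$ so that $|y_N| \ge \|y\|_\infty - \epsilon$. Hence, with
\[ e_n := (e_{n1}, e_{n2}, e_{n3}, e_{n4}, \ldots) \in \ell_1(\mathbb{N}) \cap \ell_\infty(\mathbb{N}), \quad n \in \mathbb{N}, \quad \text{where} \quad e_{nk} := \begin{cases} 1, & n = k \\ 0, & n \ne k \end{cases}, \]
we find $|f_y(e_N)| = |y_N| \ge \|y\|_\infty - \epsilon$ with $\|e_N\|_1 = 1$ and thus
\[ \|f_y\| = \sup_{\|x\|_1 = 1} |f_y(x)| \ge |f_y(e_N)| \ge \|y\|_\infty - \epsilon. \]
Together, $\|f_y\| = \|y\|_\infty$, i.e. $\|Ty\| = \|y\|_\infty$, showing that $T$ is an isometry and hence injective. Now let $f \in (\ell_1(\mathbb{N}))^*$ and set $z_n := f(e_n)$ so that $z = (z_n)_{n=1}^\infty \in \ell_\infty(\mathbb{N})$, indeed $|z_n| \le \|f\| \|e_n\|_1 = \|f\|$. Noting that $g := f - f_z \in (\ell_1(\mathbb{N}))^*$ with $f_z$ as in (3.7) satisfies $g(e_n) = 0$ we conclude that $g$ is the zero functional on the subspace $(c_{00}(\mathbb{N}), \|\cdot\|_1)$ discussed in Example 18. But $(c_{00}(\mathbb{N}), \|\cdot\|_1)$ is dense in $(\ell_1(\mathbb{N}), \|\cdot\|_1)$, see the exercises, so we get by continuity of $g$ that $g = 0 \in (\ell_1(\mathbb{N}))^*$ and thus $f = f_z = Tz$, i.e. $T$ is surjective. In summary, there exists $T^{-1} : (\ell_1(\mathbb{N}))^* \to \ell_\infty(\mathbb{N})$ so that
\[ T T^{-1} = I_{(\ell_1(\mathbb{N}))^*} \quad \text{and} \quad T^{-1} T = I_{\ell_\infty(\mathbb{N})}, \]
and $\|Ty\| = \|y\|_\infty$ for all $y \in \ell_\infty(\mathbb{N})$ yields $\|T^{-1} f\|_\infty = \|f\|$, i.e. $T^{-1} \in \mathcal{L}((\ell_1(\mathbb{N}))^*, \ell_\infty(\mathbb{N}))$, i.e. $X^*$ is isomorphic to $Y$. We now prove that $X$ is not reflexive which in turn proves that $Y$ is not reflexive (if it were, then so is $X^*$ by Proposition 8 and hence also $X$ by Theorem 23): recall that $(c_0(\mathbb{N}), \|\cdot\|_\infty)$ in Example 16 is a Banach space, hence a closed subspace of $\ell_\infty(\mathbb{N})$. Clearly $x_0 := (1, 1, 1, 1, \ldots) \in \ell_\infty(\mathbb{N}) \setminus c_0(\mathbb{N})$, so by Corollary 9 there exists $g \in (\ell_\infty(\mathbb{N}))^*$ so that $g = 0$ on $c_0(\mathbb{N})$, in particular on $c_{00}(\mathbb{N})$, and $g(x_0) \ne 0$. Using the isomorphism $T : \ell_\infty(\mathbb{N}) \to (\ell_1(\mathbb{N}))^*$ we have the dual isomorphism $T^t : (\ell_1(\mathbb{N}))^{**} \to (\ell_\infty(\mathbb{N}))^*$ from Theorem 21 and now show that $(T^t)^{-1} g \in (\ell_1(\mathbb{N}))^{**}$ is not in $\operatorname{Ran}(J)$ of the canonical map $J : \ell_1(\mathbb{N}) \to (\ell_1(\mathbb{N}))^{**}$. Assuming the contrary, there exists $x \in \ell_1(\mathbb{N})$ so that $j_x = Jx = (T^t)^{-1} g$ and thus for every $f \in (\ell_1(\mathbb{N}))^*$,
\[ f(x) = j_x(f) = \big((T^t)^{-1} g\big)(f) = \big((T^{-1})^t g\big)(f) \overset{(3.5)}{=} g(T^{-1} f). \]
So in particular for $f = f_{e_n}$ from (3.7), with $x = (x_n)_{n=1}^\infty \in \ell_1(\mathbb{N})$,
\[ x_n = f_{e_n}(x) = g(T^{-1} f_{e_n}) = g(e_n) = 0 \quad \forall\, n \in \mathbb{N} \]
since $e_n \in c_{00}(\mathbb{N})$. In turn, $x = 0$ and so $g(T^{-1} f) = f(x) = 0$ for all $f \in (\ell_1(\mathbb{N}))^*$. But $x_0 \in \ell_\infty(\mathbb{N})$, hence $x_0 = T^{-1} h$ for some $h \in (\ell_1(\mathbb{N}))^*$ since $T$ is an isomorphism. Consequently $g(T^{-1} h) = g(x_0) = 0$ contradicting the previous $g(x_0) \ne 0$. In summary, $X$ is not reflexive.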

Example 23. Consider $X = (c_0(\mathbb{N}), \|\cdot\|_\infty)$ as in Example 16 and $Y = (\ell_1(\mathbb{N}), \|\cdot\|_1)$ as in Example 2. Then $X^*$ is isomorphic to $Y$ but $X$ is not reflexive: for any $y = (y_n)_{n=1}^\infty \in \ell_1(\mathbb{N})$, the map $f_y : c_0(\mathbb{N}) \to \mathbb{F}$ given by
\[ f_y(x) := \sum_{n=1}^{\infty} x_n y_n \tag{3.8} \]
defines a bounded linear functional on $c_0(\mathbb{N})$ with $\|f_y\| \le \|y\|_1$. Consider $d_n = (d_{nk})_{k=1}^\infty \in c_0(\mathbb{N})$ for arbitrary $n \in \mathbb{N}$ with
\[ d_{nk} := \begin{cases} \hat{d}_{nk}, & 1 \le k \le n \\ 0, & k > n \end{cases} \quad \text{where} \quad \hat{d}_{nk} := \begin{cases} \frac{|y_k|}{y_k}, & y_k \ne 0 \\ 0, & y_k = 0 \end{cases} \quad \text{when } 1 \le k \le n. \]
Since $\|d_n\|_\infty \le 1$ we find
\[ \|f_y\| = \sup_{\|x\|_\infty = 1} |f_y(x)| \ge |f_y(d_n)| = \Big| \sum_{k=1}^{\infty} d_{nk} y_k \Big| = \sum_{k=1}^{n} \hat{d}_{nk} y_k = \sum_{k=1}^{n} |y_k| \xrightarrow{n \to \infty} \|y\|_1 \]
and thus together $\|f_y\| = \|y\|_1$, i.e. $T : \ell_1(\mathbb{N}) \to (c_0(\mathbb{N}))^*$ with $y \mapsto f_y$ is a bounded linear isometry. In order to see that all continuous linear functionals on $c_0(\mathbb{N})$ arise like (3.8) we argue as follows: let $f \in (c_0(\mathbb{N}))^*$ be arbitrary, define $f_n := f(e_n)$ with $e_n \in c_0(\mathbb{N})$ used in Example 22 and consider
\[ x_n := \sum_{k=1}^{n} \Big( \frac{|f_k|}{f_k} \Big) e_k \in c_0(\mathbb{N}), \quad n \in \mathbb{N}, \]
omitting those terms from the sum for which $f_k = 0$. Clearly $\|x_n\|_\infty \le 1$ and since
\[ f(x_n) = \sum_{k=1}^{n} |f_k|, \qquad |f(x_n)| \le \|f\| \|x_n\|_\infty \le \|f\|, \]
we have $\sum_{k=1}^{n} |f_k| \le \|f\|$ for all $n \in \mathbb{N}$. So, $\sum_{k=1}^{\infty} |f_k| < \infty$, i.e. $(f_n)_{n=1}^\infty \in \ell_1(\mathbb{N})$, and thus
\[ F(x) := \sum_{k=1}^{\infty} x_k f_k \]
is a well-defined bounded linear functional on $c_0(\mathbb{N})$. However $F(e_n) = f_n = f(e_n)$, so $F$ and $f$ agree on all finite linear combinations of the $e_n$ and since such linear combinations are dense in $c_0(\mathbb{N})$, see the exercises, we can conclude by continuity that $F = f$ on all of $c_0(\mathbb{N})$. In turn, $T$ is also surjective and thus, just as in Example 22, we conclude $T^{-1} \in \mathcal{L}((c_0(\mathbb{N}))^*, \ell_1(\mathbb{N}))$, i.e. $X^*$ is isomorphic to $Y$. Finally, Example 22 showed that $Y$ is not reflexive, so $X^*$ is not reflexive by Proposition 8 and hence neither is $X$ by Theorem 23.

Example 24. Consider $X = (\ell_p(\mathbb{N}), \|\cdot\|_p)$ and $Y = (\ell_q(\mathbb{N}), \|\cdot\|_q)$ as in Example 2 with $1 < p, q < \infty$ and $\frac{1}{p} + \frac{1}{q} = 1$. Then $X^*$ is isomorphic to $Y$ and $X, Y$ are reflexive: as in the last two examples, for any $y = (y_n)_{n=1}^\infty \in \ell_q(\mathbb{N})$, the map $f_y : \ell_p(\mathbb{N}) \to \mathbb{F}$ given by
\[ f_y(x) := \sum_{n=1}^{\infty} x_n y_n \tag{3.9} \]
defines a bounded linear functional on $\ell_p(\mathbb{N})$ by Theorem 1.6 that satisfies $\|f_y\| \le \|y\|_q$. Hence $T : \ell_q(\mathbb{N}) \to (\ell_p(\mathbb{N}))^*$ with $y \mapsto f_y$ is a bounded linear transformation with $\|T\| \le 1$. Even better, for $g = (g_n)_{n=1}^\infty$ with
\[ g_n := \begin{cases} \overline{y_n}\, |y_n|^{q-2}, & y_n \ne 0 \\ 0, & y_n = 0 \end{cases}, \tag{3.10} \]
where the complex conjugate is void if $\mathbb{F} = \mathbb{R}$, we find
\[ \sum_{n=1}^{\infty} |g_n|^p = \sum_{n=1}^{\infty} |y_n|^{p(q-1)} = \sum_{n=1}^{\infty} |y_n|^q = \|y\|_q^q, \]
so $g \in \ell_p(\mathbb{N})$, and $f_y(g) = \sum_{n=1}^{\infty} |y_n|^q$. Thus, implicitly assuming $y \in \ell_q(\mathbb{N}) \setminus \{0\}$,
\[ \|f_y\| = \sup_{x \in X \setminus \{0\}} \frac{|f_y(x)|}{\|x\|_p} \ge \frac{|f_y(g)|}{\|g\|_p} = \|y\|_q^{q(1 - \frac{1}{p})} = \|y\|_q, \]
showing that together $\|f_y\| = \|y\|_q$, i.e. $T \in \mathcal{L}(\ell_q(\mathbb{N}), (\ell_p(\mathbb{N}))^*)$ above is an isometry and so in particular injective. Next, let $f \in (\ell_p(\mathbb{N}))^*$ be arbitrary, set $f_n := f(e_n)$ with $e_n \in \ell_p(\mathbb{N})$ as used in Example 22 and let
\[ w_n := \sum_{k=1}^{n} \overline{f_k}\, |f_k|^{q-2} e_k \in \ell_p(\mathbb{N}), \]
omitting those terms from the sum for which $f_k = 0$. Clearly $\|w_n\|_p^p = \sum_{m=1}^{n} |f_m|^q$ and since
\[ f(w_n) = \sum_{k=1}^{n} |f_k|^q, \qquad |f(w_n)| \le \|f\| \|w_n\|_p = \|f\| \Big( \sum_{m=1}^{n} |f_m|^q \Big)^{\frac{1}{p}}, \]
we have $\sum_{k=1}^{n} |f_k|^q \le \|f\|^q$ for all $n \in \mathbb{N}$. So $(f_n)_{n=1}^\infty \in \ell_q(\mathbb{N})$, and thus
\[ G(x) := \sum_{k=1}^{\infty} x_k f_k \]
is a well-defined bounded linear functional on $\ell_p(\mathbb{N})$. But $G(e_n) = f_n = f(e_n)$, so $G$ and $f$ agree on all finite linear combinations of the $e_n$ and since such linear combinations are dense in $\ell_p(\mathbb{N})$, see the exercises, we conclude $G = f$ on all of $\ell_p(\mathbb{N})$ by continuity. Together, $T$ is also surjective and thus as in Examples 22 and 23, we have $T^{-1} \in \mathcal{L}((\ell_p(\mathbb{N}))^*, \ell_q(\mathbb{N}))$, i.e. $X^*$ and $Y$ are isomorphic. It remains to show that $X$, say, is reflexive (once done, $X^*$ is reflexive by Theorem 23 and so $Y$ by Proposition 8): the first part of this example showed that there exist isomorphisms $T : \ell_q(\mathbb{N}) \to (\ell_p(\mathbb{N}))^*$ and $S : \ell_p(\mathbb{N}) \to (\ell_q(\mathbb{N}))^*$, exploiting now the perfect symmetry between $p$ and $q$. In turn the composition
\[ (T^t)^{-1} S : \ell_p(\mathbb{N}) \to (\ell_p(\mathbb{N}))^{**} \]
is an isomorphism by Theorem 21, i.e. for every $k \in X^{**}$ we can find a unique $x = x(k) \in \ell_p(\mathbb{N})$ so that $k = (T^{-1})^t S x$. Setting $j_x := (T^{-1})^t S x \in X^{**}$, we check that
\[ \forall\, f \overset{(3.9)}{=} f_y \in X^* \text{ with } y \in \ell_q(\mathbb{N}) : \quad j_x(f) = \big((T^{-1})^t S x\big)(f) \overset{(3.5)}{=} (Sx)(T^{-1} f) \overset{(3.9)}{=} (Sx)(y), \]
where $(Sx)(y) = \sum_{n=1}^{\infty} y_n (Sx)(e_n)$ by the first part of this example. But $(Sx)(e_n) = f_x(e_n) = x_n$, so all together
\[ j_x(f) = (Sx)(y) = \sum_{n=1}^{\infty} y_n x_n \overset{(3.9)}{=} f(x) \quad \forall\, f \in X^*. \]
This verifies that $J : X \to X^{**}$ with $x \mapsto j_x$ is surjective, i.e. $X$ is reflexive.
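
The norm identity $\|f_y\| = \|y\|_q$ from the first part of Example 24 rests on the extremal sequence $g$ in (3.10), and this is straightforward to test numerically on a finitely supported $y$. The sketch below is a minimal illustration; the exponents, the vector `y` and the helper names are our own choices, and the conjugate in `extremal` is an assumption that only matters for complex entries.

```python
def p_norm(v, p):
    """l_p norm of a finitely supported sequence."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def extremal(y, q):
    """g_n = conj(y_n) |y_n|^(q-2) for y_n != 0, and 0 otherwise."""
    return [complex(t).conjugate() * abs(t) ** (q - 2) if t != 0 else 0
            for t in y]

p, q = 3.0, 1.5                     # conjugate exponents: 1/p + 1/q = 1
y = [1.0, -2.0, 0.0, 0.5 + 1j]      # finitely supported, so certainly in l_q
g = extremal(y, q)

pairing = sum(a * b for a, b in zip(g, y))   # f_y(g) = sum_n g_n y_n
ratio = abs(pairing) / p_norm(g, p)          # |f_y(g)| / ||g||_p

print(ratio, p_norm(y, q))   # both approximate ||y||_q
```

The quotient $|f_y(g)| / \|g\|_p$ reproduces $\|y\|_q$ up to rounding, so the upper bound $\|f_y\| \le \|y\|_q$ from Theorem 1.6 is attained.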
The above examples conclude our content on duals and double duals.


3.4 The Baire category theorem and its consequences


Next to the Hahn-Banach theorems 3.2 and 3.3 there are at least four other “big” theorems
in functional analysis: the principle of uniform boundedness, the open and inverse mapping
theorem and the closed graph theorem. These and their consequences will be the subject of this
section. As a first taste of the things to come, we note that many questions in Banach space
theory involve proving that sets have nonempty interiors. For instance:

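Proposition 9. Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be two normed linear spaces. Then a linear map $T : X \to Y$ is bounded if and only if
\[ T^{-1}\{y \in Y : \|y\|_Y \le 1\} = \{x \in X : Tx \in \overline{B_1(0)}\} \]
has a nonempty interior.

Proof. If the preimage of the closed unit ball has nonempty interior, then there exist $x_0 \in X$ and $\epsilon > 0$ so that
\[ B_\epsilon(x_0) := \{x \in X : \|x - x_0\|_X < \epsilon\} \subset T^{-1}\{y \in Y : \|y\|_Y \le 1\}. \]
In turn, for every $x \in B_\epsilon(0)$, $\|Tx\|_Y \le \|T(x + x_0)\|_Y + \|Tx_0\|_Y \le 1 + \|Tx_0\|_Y$ since $x + x_0 \in B_\epsilon(x_0)$. Thus, for all $x \in X$, by linearity of $T$ (rescale $x \ne 0$ to $c\epsilon x / \|x\|_X \in B_\epsilon(0)$ and let $c \uparrow 1$),
\[ \|Tx\|_Y \le \frac{1}{\epsilon}\big(1 + \|Tx_0\|_Y\big) \|x\|_X, \]
which shows that $T \in \mathcal{L}(X, Y)$. Conversely, if $T \in \mathcal{L}(X, Y)$ has $0 < \|T\| < \infty$, then for all $x \in X \setminus \{0\}$,
\[ \Big\| T\Big( \frac{x}{\|T\| \|x\|_X} \Big) \Big\|_Y = \frac{1}{\|T\|} \Big\| T\Big( \frac{x}{\|x\|_X} \Big) \Big\|_Y \le \frac{1}{\|T\|} \sup_{\|x\|_X = 1} \|Tx\|_Y = 1, \]
so all $B_{c/\|T\|}(0)$ for $0 < c \le 1$ are contained in the sought after preimage, which therefore has a nonempty interior (for $T = 0$ the preimage is all of $X$). □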

A little bit more sophisticated, in that it addresses the completeness of the underlying space, is
the following theorem, known from a Metric space module:

Theorem 24 (Cantor). Let $(X, \rho)$ be a complete metric space and $\{A_n\}_{n=1}^{\infty}$ a family of closed, nonempty subsets of $X$ satisfying $A_{n+1} \subset A_n$ for all $n \in \mathbb{N}$. If

$$\operatorname{diam}(A_n) := \sup_{x, y \in A_n} \rho(x, y) \to 0 \quad \text{as } n \to \infty,$$

then $\bigcap_{n=1}^{\infty} A_n$ consists of one point $x_0 \in X$.


Proof. Select $x_n \in A_n$ for every $n \in \mathbb{N}$. Then $x_{n+p} \in A_n$ for all $n, p \in \mathbb{N}$ by the imposed nesting and thus $\rho(x_{n+p}, x_n) \le \operatorname{diam}(A_n)$, which shows that $(x_n)_{n=1}^{\infty} \subset X$ is Cauchy and hence convergent, $x_n \to x_0$, say. But all $A_n$ are closed, so $x_0 \in A_n$ for every $n \in \mathbb{N}$, in turn $x_0 \in \bigcap_{n=1}^{\infty} A_n$. Lastly, if there is another $y \in \bigcap_{n=1}^{\infty} A_n$, then $\rho(x_0, y) \le \operatorname{diam}(A_n)$ for all $n \in \mathbb{N}$ and so $x_0 = y$, i.e. the intersection contains exactly one point. □

A variation of the proof argument in Theorem 24 yields the following extraordinary result.

Theorem 3.4: Baire Category

A countable intersection of dense open subsets of a complete metric space is dense.

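Proof. Let $(X, \rho)$ be a complete metric space and $\{A_n\}_{n=1}^{\infty}$ a family of dense open subsets of $X$. Let $x \in X$ and $\epsilon > 0$ be given. We will now construct $x_\infty \in \bigcap_{n=1}^{\infty} A_n$ with $\rho(x, x_\infty) < \epsilon$, proving the desired density of the countable intersection $\bigcap_{n=1}^{\infty} A_n$ in $X$. Since $A_1$ is dense, we can find $x_1 \in A_1$ with $\rho(x, x_1) < \epsilon/3$. Since $A_1$ is open, we can find $\delta_1 > 0$ with $2\delta_1 < \epsilon/3$ so that $B_{2\delta_1}(x_1) \subset A_1$. We now pick $x_2, \delta_2, x_3, \delta_3, \ldots$ inductively so that $x_n \in A_n$ with $\rho(x_{n-1}, x_n) < \delta_{n-1}$ and $\delta_n$ with $2\delta_n < \epsilon/3^n$ so that $B_{2\delta_n}(x_n) \subset A_n \cap B_{\delta_{n-1}}(x_{n-1})$. This is possible: since $A_n$ is dense, we can pick $x_n$ with the required property, and since $x_n \in A_n \cap B_{\delta_{n-1}}(x_{n-1})$, which is open, we can pick $\delta_n$. By construction, $\rho(x_{n-1}, x_n) < \frac{\epsilon}{2}\, 3^{-(n-1)}$, so for $k \in \mathbb{N}$,

$$\rho(x_n, x_{n+k}) < \frac{\epsilon}{2}\sum_{j=0}^{k-1}\left(\frac{1}{3}\right)^{n+j} < \frac{\epsilon}{2}\sum_{j=0}^{\infty}\left(\frac{1}{3}\right)^{n+j} = \frac{3\epsilon}{4}\left(\frac{1}{3}\right)^{n},$$

which shows that $(x_n)_{n=1}^{\infty} \subset X$ is Cauchy and so convergent, $x_n \to x_\infty$ as $n \to \infty$, say. Also, by construction,

$$\overline{B_{\delta_n}(x_n)} \subset B_{2\delta_n}(x_n) \subset A_n \cap B_{\delta_{n-1}}(x_{n-1}),$$

and so $x_n, x_{n+1}, \ldots, x_{n+k}, \ldots \in B_{\delta_n}(x_n)$, so $x_\infty \in \overline{B_{\delta_n}(x_n)} \subset A_n$ for all $n \in \mathbb{N}$ and so $x_\infty \in \bigcap_{n=1}^{\infty} A_n$. Finally, by construction,

$$\rho(x, x_\infty) \le \rho(x, x_1) + \rho(x_1, x_\infty) < \frac{\epsilon}{3} + \frac{\epsilon}{2}\sum_{j=1}^{\infty}\left(\frac{1}{3}\right)^{j} = \frac{\epsilon}{3} + \frac{\epsilon}{4} < \epsilon.$$

This completes the proof. □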

In practice, one rarely uses the Baire category theorem directly but rather one of its consequences
that follow from the below corollary to Theorem 3.4.

Corollary 12. Let $X$ be a complete metric space and $\{C_n\}_{n=1}^{\infty}$ a family of closed sets with $\bigcup_{n=1}^{\infty} C_n = X$. Then some $C_n$ has nonempty interior.


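Proof. If not, with $A_n := X \setminus C_n$, each $A_n$ is open and dense since $C_n^{\mathrm{int}} = \emptyset$. Thus $\bigcap_{n=1}^{\infty} A_n$ is dense by Theorem 3.4 and so, not empty. Thus $\bigcup_{n=1}^{\infty} C_n = X \setminus \bigcap_{n=1}^{\infty} A_n$ is not all of $X$. This contradiction shows that some $C_n$ has $C_n^{\mathrm{int}} \neq \emptyset$. □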

Indeed, here is the first consequence of Corollary 12:

Theorem 3.5: Principle of uniform boundedness (PUB)

Let F be a subset of bounded linear maps from one Banach space, (𝑋, k · k𝑋 ), to another
one, (𝑌 , k · k𝑌 ). Suppose that F is pointwise bounded, i.e. for each 𝑥 ∈ 𝑋 ,

sup{k𝑇 𝑥 k𝑌 : 𝑇 ∈ F } < ∞,

then F is uniformly bounded, i.e.

sup{k𝑇 k : 𝑇 ∈ F } < ∞.

Remark. Theorem 3.5 is also known as the Banach-Steinhaus theorem and in many applications we have $Y = \mathbb{F}$, so one is dealing with $\mathcal{F} \subset X^*$. Note that completeness is essential, for if $X = (C[0,1], \|\cdot\|_1)$ as discussed in Example 19 and $T_n \in X^*$ is given by

$$T_n f := n\int_0^{1/n} f(t)\,dt = \int_0^1 f\Big(\frac{u}{n}\Big)\,du, \quad n \in \mathbb{N},$$

then for each $f \in X$, there exists $c = c(f)$ by Theorem 5 so that $|T_n f| \le c$ for every $n \in \mathbb{N}$. In short,

$$\forall\, f \in X, \quad \sup\{|T_n f| : n \in \mathbb{N}\} < \infty,$$

so $\mathcal{F} := \{T_n\}_{n=1}^{\infty} \subset X^*$ is pointwise bounded. But

$$C[0,1] \ni g_n(x) := \begin{cases} 2\big(\tfrac{1}{n} - x\big)n^2, & x \in [0, \tfrac{1}{n}] \\ 0, & x \in [\tfrac{1}{n}, 1] \end{cases}, \quad n \in \mathbb{N}$$

satisfies $\|g_n\|_1 = 1$ and $T_n g_n = n$, so $\|T_n\| \ge |T_n g_n| = n$ becomes unbounded for large $n$.
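
The two claims $\|g_n\|_1 = 1$ and $T_n g_n = n$ in the remark can be checked with a short computation. The following is only an illustrative sketch: the helper names `g`, `integrate`, `l1_norm` and `T` are hypothetical, and the midpoint rule is used because it happens to be exact for the piecewise linear $g_n$ on $[0, 1/n]$.

```python
from fractions import Fraction

def g(n, x):
    """Tent function g_n of the remark: 2 n^2 (1/n - x) on [0, 1/n], zero afterwards."""
    x = Fraction(x)
    return 2 * n * n * (Fraction(1, n) - x) if x <= Fraction(1, n) else Fraction(0)

def integrate(f, a, b, pieces=8):
    """Midpoint rule; exact here because the integrand is linear on [a, b]."""
    a, b = Fraction(a), Fraction(b)
    h = (b - a) / pieces
    return sum(f(a + (i + Fraction(1, 2)) * h) for i in range(pieces)) * h

def l1_norm(n):
    # g_n vanishes on [1/n, 1], so integrating over [0, 1/n] gives the full L^1 norm
    return integrate(lambda x: abs(g(n, x)), 0, Fraction(1, n))

def T(n):
    # T_n g_n = n * (integral of g_n over [0, 1/n])
    return n * integrate(lambda x: g(n, x), 0, Fraction(1, n))

for n in (1, 2, 5, 50):
    print(n, l1_norm(n), T(n))
```

The norm stays fixed while $T_n g_n$ grows linearly in $n$, which is exactly the failure of uniform boundedness on the incomplete space $(C[0,1], \|\cdot\|_1)$.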

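Proof of Theorem 3.5. Let

$$C_n := \big\{x \in X : \sup\{\|Tx\|_Y : T \in \mathcal{F}\} \le n\big\}, \quad n \in \mathbb{N}.$$

If $x_m \to x$ for some $(x_m)_{m=1}^{\infty} \subset C_n$, then for any $T \in \mathcal{F}$, by continuity of the norm,

$$\|Tx\|_Y = \lim_{m \to \infty}\|Tx_m\|_Y \le n,$$

so $x \in C_n$, that is, $C_n$ is closed. Given that $\mathcal{F}$ is pointwise bounded, $\bigcup_{n=1}^{\infty} C_n = X$. Hence, by Corollary 12, for some $n_0$, $C_{n_0}^{\mathrm{int}} \neq \emptyset$, that is, there exist $x_0$, $\delta$, and $n_0$ so that $B_\delta(x_0) \subset C_{n_0}$. But this means

$$\|x - x_0\|_X < \delta \text{ and } T \in \mathcal{F} \;\Rightarrow\; \|Tx\|_Y \le n_0.$$

Letting $y := x - x_0$, we see

$$\|y\|_X < \delta \text{ and } T \in \mathcal{F} \;\Rightarrow\; \|Ty\|_Y \le n_0 + \sup\{\|Tx_0\|_Y : T \in \mathcal{F}\} =: C < \infty.$$

Replacing $y$ by $y/(1 + \epsilon)$ and taking $\epsilon \downarrow 0$, we see that we can change $\|y\| < \delta$ to $\|y\| \le \delta$. Thus

$$T \in \mathcal{F} \;\Rightarrow\; \|T\| = \sup_{\|y\|_X = 1}\|Ty\|_Y = \frac{1}{\delta}\sup_{\|y\|_X = \delta}\|Ty\|_Y \le \delta^{-1}C,$$

proving the uniform boundedness of $\mathcal{F}$. □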

The second consequence of Corollary 12 will lead to Theorem 3.7 below which will be extremely
useful in Chapter 4:

Theorem 3.6: Open Mapping

Let 𝑇 ∈ L (𝑋, 𝑌 ) be a bounded linear transformation between two Banach spaces. If


Ran(𝑇 ) = 𝑌 , then 𝑇 is open, that is, if 𝐴 ⊂ 𝑋 is open in 𝑋 , then 𝑇 [𝐴] = {𝑇 𝑥 : 𝑥 ∈ 𝐴} is
open in 𝑌 .

Proof. We will use open balls $B_r^X(x_0)$, resp. $B_r^Y(y_0)$, in $X$, resp. $Y$, and prove that for some $r > 0$, $T[B_r^X(0)]$ has a nonempty interior. Begin with

$$C_n := \overline{T[B_n^X(0)]}, \quad n \in \mathbb{N}.$$

Each $C_n$ is closed, and since $\bigcup_{n=1}^{\infty} T[B_n^X(0)] = Y$, we have $\bigcup_{n=1}^{\infty} C_n = Y$. Thus, by Corollary 12, some $C_{n_0}$ has nonempty interior, that is, for some $n_0$, some $y_0 \in Y$, and $\rho > 0$,

$$B_\rho^Y(y_0) \subset \overline{T[B_{n_0}^X(0)]}. \tag{3.11}$$

We now need to show that some $T[B_r^X(0)]$ has nonempty interior: By (3.11), given $y \in B_\rho^Y(0)$, we can find $(x_m)_{m=1}^{\infty} \subset B_{n_0}^X(0)$ so that $T(x_m) \to y_0 + y$ as $m \to \infty$. Additionally, by (3.11) again, we can find $(z_m)_{m=1}^{\infty} \subset B_{n_0}^X(0)$ so that $T(z_m) \to y_0$. Then $T(x_m - z_m) \to y$ as $m \to \infty$, proving that

$$B_\rho^Y(0) \subset \overline{T[B_{2n_0}^X(0)]}. \tag{3.12}$$

Since $T[B_{\lambda\alpha}^X(0)] = \lambda T[B_\alpha^X(0)]$ for $\lambda, \alpha > 0$ by linearity of $T$, we conclude that (3.12) yields existence of $\epsilon > 0$ so that

$$B_\epsilon^Y(0) \subset \overline{T[B_{1/2}^X(0)]}. \tag{3.13}$$

Then, by the aforementioned scaling invariance, for each $n \in \mathbb{Z}_{\ge 0}$,

$$B_{\epsilon/2^n}^Y(0) \subset \overline{T[B_{1/2^{n+1}}^X(0)]}. \tag{3.14}$$

We will use this inclusion to prove that $B_\epsilon^Y(0) \subset T[B_1^X(0)]$. Given $y \in B_\epsilon^Y(0)$, pick $x_0, x_1, \ldots$ inductively so that

$$x_j \in B_{1/2^{j+1}}^X(0) \quad \text{and} \quad \|y - T(x_0 + \ldots + x_j)\|_Y < \frac{\epsilon}{2^{j+1}}.$$

Indeed, we can first pick $x_0 \in B_{1/2}^X(0)$ with $\|y - Tx_0\|_Y < \epsilon/2$ by (3.13) and then $x_j$ inductively since $y - T(x_0 + \ldots + x_{j-1}) \in B_{\epsilon/2^j}^Y(0)$ and since we have (3.14). In this way $\big(\sum_{j=0}^{n} x_j\big)_{n=1}^{\infty} \subset X$ is Cauchy, so has a limit $\sum_{j=0}^{n} x_j \to x \in B_1^X(0)$ as $n \to \infty$ and $Tx = y$. This proves $B_\epsilon^Y(0) \subset T[B_1^X(0)]$ and we can now complete our proof: Let $U \subset X$ be open and $y \in T[U]$, i.e. there is $x \in U$ with $Tx = y$. Find $\delta > 0$ so that $B_\delta^X(x) \subset U$. By $B_\epsilon^Y(0) \subset T[B_1^X(0)]$ and the aforementioned scaling invariance,

$$B_{\epsilon\delta}^Y(0) \subset T[B_\delta^X(0)],$$

which implies that

$$B_{\epsilon\delta}^Y(y) \subset T[B_\delta^X(x)] \subset T[U],$$

so $T[U]$ is open. The proof is complete. □

Having proven the two main direct consequences of the Baire category theorem in Theorems
3.5 and 3.6, we now turn to their applications and list a sequence of more specialized corollaries.

Theorem 25. Let 𝑇 : 𝑋 → 𝑌 be an injective bounded linear map between two Banach spaces
(𝑋, k · k𝑋 ) and (𝑌 , k · k𝑌 ). Suppose Ran(𝑇 ) is closed in 𝑌 . Then there exists 𝜖 > 0 so that for all
𝑥 ∈ 𝑋,
k𝑇 𝑥 k𝑌 ≥ 𝜖 k𝑥 k𝑋 .

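Proof. If $\widehat{Y} := \operatorname{Ran}(T)$, then $(\widehat{Y}, \|\cdot\|_Y)$ is a Banach space and $T \in \mathcal{L}(X, \widehat{Y})$ is onto. Hence, by Theorem 3.6, there exists $\epsilon > 0$ so that

$$B_\epsilon^{\widehat{Y}}(0) \subset T[B_1^X(0)].$$

Thus, for any $0 < \delta < 1$,

$$\left\{\frac{\epsilon(1 - \delta)}{\|Tx\|_Y}\,Tx : x \in X \setminus \{0\}\right\} \subset B_\epsilon^{\widehat{Y}}(0) \subset T[B_1^X(0)],$$

and since $T$ is injective therefore, for all $x \neq 0$,

$$\left\|\frac{\epsilon(1 - \delta)}{\|Tx\|_Y}\,x\right\|_X < 1 \quad \Leftrightarrow \quad \|Tx\|_Y > \epsilon(1 - \delta)\|x\|_X.$$

Letting $\delta \downarrow 0$ verifies the claim. □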


Theorem 3.7: Inverse mapping

Let 𝑇 ∈ L (𝑋, 𝑌 ) be a continuous linear bijection between two Banach spaces (𝑋, k · k𝑋 )
and (𝑌 , k · k𝑌 ). Then 𝑇 −1 : 𝑌 → 𝑋 is continuous.

Proof. By Theorem 25 for 𝑥 = 𝑇 −1𝑦, k𝑇 −1𝑦 k𝑋 ≤ 𝜖 −1 k𝑦 k𝑌 which shows 𝑇 −1 ∈ L (𝑌 , 𝑋 ). 

Corollary 13 (Norm equivalence). Let 𝑋 be a vector space and k · k 1, k · k 2 two norms in which
𝑋 is a Banach space. If there is 𝐶 1 > 0 so that for all 𝑥 ∈ 𝑋, k𝑥 k 1 ≤ 𝐶 1 k𝑥 k 2 then there is 𝐶 2 > 0 so
that for all 𝑥 ∈ 𝑋, k𝑥 k 2 ≤ 𝐶 2 k𝑥 k 1 .

Proof. The identity map $I : (X, \|\cdot\|_2) \to (X, \|\cdot\|_1)$ is a continuous linear bijection by assumption, so its inverse $I^{-1} : (X, \|\cdot\|_1) \to (X, \|\cdot\|_2)$ is bounded by Theorem 3.7. This shows that for all $x \in X$, $\|x\|_2 = \|I^{-1}x\|_2 \le C_2\|x\|_1$ with some $C_2 > 0$. □

In order to state and prove the final two consequences of Corollary 12 we require the following
additional terminology:

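Definition 32. Let $I$ be a countable index set, and suppose that for each $\alpha \in I$, $(X_\alpha, \|\cdot\|_\alpha)$ is a normed linear space. Let

$$X := \left\{(x_\alpha)_{\alpha \in I} \in \prod_{\alpha \in I} X_\alpha : x_\alpha \in X_\alpha,\ \sum_{\alpha \in I}\|x_\alpha\|_\alpha < \infty\right\}.$$

Then $X$ with the norm

$$\|(x_\alpha)_{\alpha \in I}\|_\times := \sum_{\alpha \in I}\|x_\alpha\|_\alpha$$

is a normed linear space, called the direct sum of the spaces $X_\alpha$, written $X = \bigoplus_{\alpha \in I} X_\alpha$.

If each $X_\alpha$ is a Banach space in Definition 32, then so is $\bigoplus_{\alpha \in I} X_\alpha$. This will be used in Corollary 14 below.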

Definition 33. Let $X$ be any vector space with subspaces $X_1, X_2 \subset X$. We call $X_1, X_2$ complementary if and only if

$$X_1 + X_2 = X, \quad X_1 \cap X_2 = \{0\}.$$

Equivalently, any $x \in X$ can be uniquely written $x = x_1 + x_2$ with $x_j \in X_j$. We also say $X_2$ is a complementary subspace or complement to $X_1$, and vice versa.


Corollary 14. Let $(X, \|\cdot\|)$ be a Banach space and $X_1, X_2$ complementary subspaces, each of which is closed. Then for some $\delta > 0$ and all $x_j \in X_j$, we have

$$\delta\big(\|x_1\| + \|x_2\|\big) \le \|x_1 + x_2\| \le \|x_1\| + \|x_2\|.$$

Equivalently, the direct sum norm $\|\cdot\|_\times$ on $X_1 \oplus X_2$ is equivalent to $\|\cdot\|$.

Proof. If $T : X_1 \oplus X_2 \to X$ is defined by $T((x_1, x_2)) := x_1 + x_2$, then $T$ is linear and $\|T((x_1, x_2))\| = \|x_1 + x_2\| \le \|x_1\| + \|x_2\| = \|(x_1, x_2)\|_\times$, so $T \in \mathcal{L}(X_1 \oplus X_2, X)$. Clearly, $T$ is injective and surjective since the $X_j$ are complementary, thus $T$ is invertible. Since the $X_j$ are closed, $X_1 \oplus X_2$ is a Banach space, so by Theorem 3.7, $T^{-1} \in \mathcal{L}(X, X_1 \oplus X_2)$. Hence $\|T^{-1}x\|_\times \le C\|x\|$ for all $x \in X$, and since the $X_j$ are complementary therefore $\|x_1\| + \|x_2\| = \|(x_1, x_2)\|_\times \le C\|x_1 + x_2\|$ for all $x_j \in X_j$, which is the claim with $\delta = 1/C$. This concludes the proof. □

Definition 34. Let $X, Y$ be two normed linear spaces and $T : X \to Y$ a linear map. The graph, $\Gamma(T)$, is a subset of $X \oplus Y$ given by

$$\Gamma(T) = \{(x, Tx) : x \in X\}.$$

It is always a subspace of $X \oplus Y$ even if $T$ is a priori not assumed continuous.

Theorem 3.8: Closed graph

Let 𝑇 : 𝑋 → 𝑌 be a linear map from a Banach space 𝑋 to a Banach space 𝑌 . If the graph
of 𝑇 is closed in 𝑋 ⊕ 𝑌 , then 𝑇 ∈ L (𝑋, 𝑌 ).

Proof. By definition of the direct sum, $\pi_1 : X \oplus Y \to X$ and $\pi_2 : X \oplus Y \to Y$ with $\pi_1((x, y)) = x$ and $\pi_2((x, y)) = y$ are both continuous. Clearly, the restriction $\pi_1|_{\Gamma(T)}$ is a bijection between $\Gamma(T)$ and $X$. But $\Gamma(T)$ is closed, so it is itself a Banach space, and so by Theorem 3.7, $\pi_1^{-1} : X \to \Gamma(T)$ with $\pi_1^{-1}(x) = (x, Tx)$ is continuous. Since $T = \pi_2 \circ \pi_1^{-1} : X \to Y$, $T$ is continuous. □

Remark. To understand the depth of Theorem 3.8, consider the below three statements for convergent sequences $(x_n)_{n=1}^{\infty} \subset X$ and a linear $T : X \to Y$ between two normed linear spaces $X$ and $Y$:
(a) $x_n \to x \in X$ as $n \to \infty$.
(b) $Tx_n \to y \in Y$ as $n \to \infty$.
(c) $y = Tx$.
Continuity of $T$ says (a) ⇒ (b) + (c) for every such convergent sequence $(x_n)_{n=1}^{\infty}$. $\Gamma(T)$ closed says (a) + (b) ⇒ (c). Theorem 3.8 says that one can assume (a) + (b) and completeness of $X, Y$ in trying to prove (c) to get continuity of $T$.
As a final consequence of Corollary 12, and as the last result of this Section, we record the
following:


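Theorem 26 (Hellinger-Toeplitz). Let $T : \mathcal{H} \to \mathcal{H}$ be a linear map from a Hilbert space $(\mathcal{H}, \langle\cdot,\cdot\rangle)$ to itself. Suppose for all $x, y \in \mathcal{H}$, we have

$$\langle Tx, y\rangle = \langle x, Ty\rangle.$$

Then $T \in \mathcal{L}(\mathcal{H})$.

Proof. If $x_n \to x$ and $Tx_n \to y$, then for all $z \in \mathcal{H}$, by continuity of the inner product,

$$\langle z, Tx\rangle = \langle Tz, x\rangle = \lim_{n \to \infty}\langle Tz, x_n\rangle = \lim_{n \to \infty}\langle z, Tx_n\rangle = \langle z, y\rangle.$$

Choosing $z = Tx - y \in \mathcal{H}$, we obtain $\|Tx - y\| = 0$, so $y = Tx$. Hence, $\Gamma(T)$ is closed and so, by Theorem 3.8, $T$ is continuous. □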


3.5 Weak convergence


In this last section of Chapter 3 we touch upon a particular notion of convergence that will be useful in Chapter 4. This notion is weaker than the norm convergence that we have used throughout. For instance, in the Hilbert space $\mathcal{H} = (\ell_2(\mathbb{N}), \langle\cdot,\cdot\rangle)$ discussed in Example 12 the elements $e_n = (e_{nk})_{k=1}^{\infty}$ with

$$e_{nk} := \begin{cases} 1, & k = n \\ 0, & k \neq n \end{cases}$$

obey $\|e_n - e_m\|_2 = \sqrt{2} > 0$ for all $n \neq m$. In turn $(e_n)_{n=1}^{\infty}$ is not Cauchy and hence does not converge in $\mathcal{H}$. Nevertheless, by Theorem 2.3 and (2.6), for all $f \in \mathcal{H}^*$, with $y = y(f) \in \mathcal{H}$ the vector representing $f$,

$$f(e_n) = \langle e_n, y\rangle \to 0 \quad \text{as } n \to \infty.$$

This observation lies at the heart of the following notion:

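Definition 35. A sequence $(x_n)_{n=1}^{\infty} \subset X$ in a normed linear space $X$ over $\mathbb{F} = \mathbb{R}$ or $\mathbb{F} = \mathbb{C}$ converges weakly to $x \in X$ if and only if for every $f \in X^*$ we have $f(x_n) \to f(x)$ as $n \to \infty$. We write

$$x_n \rightharpoonup x \quad \text{as } n \to \infty.$$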
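
The motivating example, the unit vectors in $\ell_2(\mathbb{N})$, can be made concrete numerically. The sketch below is illustrative only and rests on two assumptions that are not part of the notes: all sequences are truncated to their first `DIM` coordinates, and the test vector is the particular square summable choice $y_k = 1/k$.

```python
import math

def e(n, dim):
    """First `dim` coordinates of the n-th unit vector e_n (1-based index)."""
    return [1.0 if k == n else 0.0 for k in range(1, dim + 1)]

def inner(x, y):
    # real case, so no complex conjugation is needed
    return sum(a * b for a, b in zip(x, y))

def norm2(x):
    return math.sqrt(inner(x, x))

DIM = 200  # truncation level (assumption): everything lives in the first 200 coordinates
y = [1.0 / k for k in range(1, DIM + 1)]  # y_k = 1/k is square summable

# <e_n, y> = y_n tends to zero although the e_n stay at distance sqrt(2) from each other
pairings = [inner(e(n, DIM), y) for n in (1, 10, 100)]
distance = norm2([a - b for a, b in zip(e(3, DIM), e(7, DIM))])
print(pairings)
print(distance)
```

The pairings shrink like $1/n$ while the mutual distance stays at $\sqrt{2}$, so the sequence converges weakly to zero without converging in norm.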

Below we list the basic properties of weakly convergent sequences.

Theorem 27. Let $X$ be a normed linear space and $(x_n)_{n=1}^{\infty} \subset X$. If $x_n \to x$ as $n \to \infty$, then $x_n \rightharpoonup x$ as $n \to \infty$. Furthermore, weak limits are unique and if $(x_n)_{n=1}^{\infty}$ converges weakly, then there exists $c > 0$ so that $\|x_n\| \le c$ for all $n \in \mathbb{N}$.


Proof. Since $|f(x_n) - f(x)| = |f(x_n - x)| \le \|f\|\,\|x_n - x\|$ for all $f \in X^*$, we see that $x_n \to x$ implies $x_n \rightharpoonup x$. Next, if $x_n \rightharpoonup x$ and $x_n \rightharpoonup y$ as $n \to \infty$, then $f(x) \leftarrow f(x_n) \to f(y)$ as $n \to \infty$ for all $f \in X^*$. In turn, by Corollary 11, $x = y$. Finally, using the canonical map $J : X \to X^{**}$ of (3.6), i.e. $x \mapsto j_x$ where $j_x(f) = f(x)$ for all $f \in X^*$, we know from Theorem 22 that $\|j_x\| = \|x\|$. But for all $f \in X^*$ there exists $c = c(f) > 0$ so that $|f(x_n)| \le c$ for all $n \in \mathbb{N}$. This shows that for each $f \in X^*$,

$$\sup\{|j_{x_n}(f)| : n \in \mathbb{N}\} < \infty,$$

i.e. the family $\mathcal{F} := \{j_{x_n}\}_{n=1}^{\infty} \subset X^{**}$ is pointwise bounded. Given that $X^*$ is a Banach space, Theorem 3.5 asserts that

$$\sup\{\|j_{x_n}\| = \|x_n\| : n \in \mathbb{N}\} < \infty,$$

and thus there exists $c > 0$ so that $\|x_n\| \le c$ for all $n \in \mathbb{N}$. This completes our proof. □

The next result provides a convenient criterion for weak convergence.

Theorem 28. Let $(X, \|\cdot\|_X)$ be a normed linear space and $(x_n)_{n=1}^{\infty} \subset X$. Then $x_n$ converges weakly to $x \in X$ provided
(1) There exists $c > 0$ so that $\|x_n\|_X \le c$ for all $n \in \mathbb{N}$.
(2) There exists a dense subset $M \subset X^*$ so that $f(x_n) \to f(x)$ for all $f \in M$.

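Proof. Pick $f \in X^*$ and $(f_m)_{m=1}^{\infty} \subset M$ so that $f_m \to f$ in $X^*$ as $m \to \infty$. Note that for all $n, m \in \mathbb{N}$,

$$\begin{aligned} |f(x_n) - f(x)| &\le |f(x_n) - f_m(x_n)| + |f_m(x_n) - f_m(x)| + |f_m(x) - f(x)| \\ &\le \|f - f_m\|\,\|x_n\|_X + |f_m(x_n) - f_m(x)| + \|f_m - f\|\,\|x\|_X, \end{aligned}$$

with $\|x_n\|_X \le c$ and, after enlarging $c$ if necessary, $\|x\|_X \le c$. Let $\epsilon > 0$ be arbitrary. By $f_m \to f$, we can fix $m \in \mathbb{N}$ so that $\|f_m - f\| < \frac{\epsilon}{3c}$, and for this $m$ there exists, by (2), some $N = N(\epsilon) > 0$ so that $|f_m(x_n) - f_m(x)| < \frac{\epsilon}{3}$ for all $n \ge N$. Hence, for all $n \ge N$,

$$|f(x_n) - f(x)| \le \frac{\epsilon}{3c}\,c + \frac{\epsilon}{3} + \frac{\epsilon}{3c}\,c = \epsilon,$$

which shows that $x_n \rightharpoonup x$ as $n \to \infty$ since $f \in X^*$ was arbitrary. □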

Remark. In a Hilbert space, $x_n \rightharpoonup x \in \mathcal{H}$ as $n \to \infty$ if and only if $\langle x_n, y\rangle \to \langle x, y\rangle$ for all $y \in \mathcal{H}$. If $\mathcal{H}$ is in addition separable and $(x_n)_{n=1}^{\infty}$ is bounded, then by Proposition 5 and Theorem 28 one needs to check $\langle x_n, y_j\rangle \to \langle x, y_j\rangle$ on an independent spanning set $\{y_j\}_{j=1}^{N}$ only in order to deduce $x_n \rightharpoonup x$.

Our next Definition pushes the notion of weak convergence to the dual space, using that the canonical map $J : X \to \operatorname{Ran}(J) \subset X^{**}$ allows us to identify certain elements in $X^{**}$ with elements in $X$.

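Definition 36. A sequence $(f_n)_{n=1}^{\infty} \subset X^*$ in the dual space of a normed linear space $X$ over $\mathbb{F} = \mathbb{R}$ or $\mathbb{F} = \mathbb{C}$ converges weak-∗ly to $f \in X^*$ if and only if for every $x \in X$ we have $f_n(x) \to f(x)$ as $n \to \infty$. We write

$$f_n \overset{*}{\rightharpoonup} f \quad \text{as } n \to \infty.$$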


Given that $X^*$ is also a normed linear space we have three different convergence types in $X^*$: a sequence $(f_n)_{n=1}^{\infty} \subset X^*$ converges
(a) to $f$ (in norm) if and only if $\|f_n - f\| \to 0$ as $n \to \infty$.
(b) to $f$ weakly if and only if for every $k \in X^{**}$, $|k(f_n) - k(f)| \to 0$ as $n \to \infty$.
(c) to $f$ weak-∗ly if and only if for every $x \in X$, $|f_n(x) - f(x)| \to 0$ as $n \to \infty$.
Theorem 27 applied to the dual space 𝑋 ∗ says that norm convergence in 𝑋 ∗ implies weak
convergence in 𝑋 ∗ . Moreover, using again the canonical map 𝐽 : 𝑋 → 𝑋 ∗∗ , we see that weak
convergence in 𝑋 ∗ implies weak-∗ convergence in 𝑋 ∗ . However, the converse implications are
in general false.

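Example 25. Consider the normed linear space $X = (C[-1,1], \|\cdot\|_\infty)$ and the function family

$$C[-1,1] \ni \rho_n(x) := \begin{cases} n - n^2|x|, & |x| \le \tfrac{1}{n} \\ 0, & |x| > \tfrac{1}{n} \end{cases}, \quad n \in \mathbb{N}; \qquad \int_{-1}^{1}\rho_n(x)\,dx = 1.$$

Define $f_n \in X^*$ by

$$f_n(g) := \int_{-1}^{1} g(x)\rho_n(x)\,dx, \quad g \in C[-1,1],$$

and obtain, for every $g \in C[-1,1]$,

$$|f_n(g) - g(0)| = \left|\int_{-1}^{1}\big(g(x) - g(0)\big)\rho_n(x)\,dx\right| \le \int_{-1/n}^{1/n}|g(x) - g(0)|\,\rho_n(x)\,dx \le \max_{|x| \le \frac{1}{n}}|g(x) - g(0)| \to 0 \quad \text{as } n \to \infty.$$

In short, $f_n \overset{*}{\rightharpoonup} f \in X^*$ where $f(g) := g(0)$. However, $(f_n)_{n=1}^{\infty}$ does not converge weakly to $f$.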
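
The weak-∗ convergence $f_n(g) \to g(0)$ of Example 25 can be observed numerically. The following sketch is a hedged illustration, not part of the notes: the integral is approximated by a midpoint rule with a hypothetical step count, and $g = \cos$ is an arbitrary continuous test function.

```python
import math

def rho(n, x):
    """Hat function of Example 25: n - n^2 |x| on [-1/n, 1/n], zero elsewhere."""
    return n - n * n * abs(x) if abs(x) <= 1.0 / n else 0.0

def f(n, g, steps=20000):
    """Midpoint-rule approximation of f_n(g), the integral of g * rho_n over [-1/n, 1/n]."""
    a, b = -1.0 / n, 1.0 / n
    h = (b - a) / steps
    return sum(g(a + (i + 0.5) * h) * rho(n, a + (i + 0.5) * h) for i in range(steps)) * h

# distance between f_n(cos) and cos(0) = 1 for growing n
errors = [abs(f(n, math.cos) - math.cos(0.0)) for n in (1, 10, 100)]
print(errors)
```

For this particular $g$ the errors decay as $n$ grows, in line with the bound $\max_{|x| \le 1/n}|g(x) - g(0)|$ from the example.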

Example 26. Consider the normed linear space $X = (c_0(\mathbb{N}), \|\cdot\|_\infty)$. By Example 23 there exists an isomorphism $T : \ell_1(\mathbb{N}) \to (c_0(\mathbb{N}))^*$ and every $f \in (c_0(\mathbb{N}))^*$ is of the form

$$f(x) = f_y(x) = \sum_{n=1}^{\infty} x_n y_n, \quad x = (x_n)_{n=1}^{\infty} \in c_0(\mathbb{N})$$

with a unique $y = y(f) = (y_n)_{n=1}^{\infty} \in \ell_1(\mathbb{N})$. Consider the sequence $(f_{e_n})_{n=1}^{\infty} \subset (c_0(\mathbb{N}))^*$. Then $f_{e_n}(x) = x_n \to 0$ as $n \to \infty$ for every $x = (x_n)_{n=1}^{\infty} \in c_0(\mathbb{N})$ by definition of $c_0(\mathbb{N})$, see Example 16, i.e. $(f_{e_n})_{n=1}^{\infty}$ converges weak-∗ly to the zero functional on $c_0(\mathbb{N})$. However, $(f_{e_n})_{n=1}^{\infty}$ does not converge weakly to the same limit: by Example 22 there exists an isomorphism $S : \ell_\infty(\mathbb{N}) \to (\ell_1(\mathbb{N}))^*$, so $(T^t)^{-1}S : \ell_\infty(\mathbb{N}) \to (c_0(\mathbb{N}))^{**}$ is an isomorphism by Theorem 21. Hence, for any $k \in X^{**}$ there exists a unique $y = (y_n)_{n=1}^{\infty} \in \ell_\infty(\mathbb{N})$ so that $k = (T^t)^{-1}Sy$, which implies

$$k(f_{e_n}) = \big((T^t)^{-1}Sy\big)(f_{e_n}) \overset{(3.5)}{=} (Sy)(T^{-1}f_{e_n}) = (Sy)(e_n) = y_n,$$

and which, in general, does not converge to zero as $n \to \infty$.


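Remark. In any reflexive Banach space $X$, $\operatorname{Ran}(J) = X^{**}$, so the canonical map $J : X \to X^{**}$ is an isomorphism. In turn, for any $(f_n)_{n=1}^{\infty} \subset X^*$,

$$f_n \rightharpoonup f \quad \Leftrightarrow \quad f_n \overset{*}{\rightharpoonup} f \quad \text{as } n \to \infty.$$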

Theorem 29. Let $X$ be a normed linear space, $X^*$ its dual space and $(f_n)_{n=1}^{\infty} \subset X^*$. If $(f_n)_{n=1}^{\infty}$ converges weakly to $f$, then $(f_n)_{n=1}^{\infty}$ converges weak-∗ly to $f$. Furthermore, weak-∗ limits are unique and if $X$ is complete and $(f_n)_{n=1}^{\infty}$ weak-∗ly convergent, then there exists $c > 0$ so that $\|f_n\| \le c$ for all $n \in \mathbb{N}$.

Proof. We have already discussed that weak convergence in $X^*$ implies weak-∗ convergence. Moreover, weak-∗ limits are clearly unique. Finally, if $(f_n)_{n=1}^{\infty}$ is weak-∗ convergent, then for all $x \in X$, there exists $c = c(x)$ so that $|f_n(x)| \le c$ for all $n \in \mathbb{N}$. In turn, for every $x \in X$,

$$\sup\{|f_n(x)| : n \in \mathbb{N}\} < \infty,$$

i.e. the family $\mathcal{F} := \{f_n\}_{n=1}^{\infty} \subset X^*$ is pointwise bounded. Given that $X$ is complete, Theorem 3.5 asserts that

$$\sup\{\|f_n\| : n \in \mathbb{N}\} < \infty,$$

and so $\|f_n\| \le c$ for all $n \in \mathbb{N}$ with some $c > 0$. This completes our proof. □

Our next result provides a convenient criterion for weak-∗ convergence.

Theorem 30. Let $(X, \|\cdot\|_X)$ be a normed linear space with dual space $X^*$ and $(f_n)_{n=1}^{\infty} \subset X^*$. Then $f_n$ converges weak-∗ly to $f \in X^*$ provided
(1) There exists $c > 0$ so that $\|f_n\| \le c$ for all $n \in \mathbb{N}$.
(2) There exists a dense subset $M \subset X$ so that $f_n(x) \to f(x)$ for all $x \in M$.

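Proof. Choose $x \in X$ and $(x_m)_{m=1}^{\infty} \subset M$ so that $x_m \to x$ as $m \to \infty$. For any $n, m \in \mathbb{N}$,

$$\begin{aligned} |f_n(x) - f(x)| &\le |f_n(x) - f_n(x_m)| + |f_n(x_m) - f(x_m)| + |f(x_m) - f(x)| \\ &\le \|f_n\|\,\|x - x_m\|_X + |f_n(x_m) - f(x_m)| + \|f\|\,\|x_m - x\|_X. \end{aligned}$$

We have $\|f_n\| \le c$ for all $n \in \mathbb{N}$ and, after enlarging $c$ if necessary, $\|f\| \le c$. Let $\epsilon > 0$ be arbitrary. By $x_m \to x$, we can fix $m \in \mathbb{N}$ so that $\|x_m - x\|_X < \frac{\epsilon}{3c}$, and for this $m$ there exists, by (2), some $N = N(\epsilon) > 0$ so that $|f_n(x_m) - f(x_m)| < \frac{\epsilon}{3}$ for all $n \ge N$. Hence, for all $n \ge N$,

$$|f_n(x) - f(x)| \le c\,\frac{\epsilon}{3c} + \frac{\epsilon}{3} + c\,\frac{\epsilon}{3c} = \epsilon,$$

so $f_n$ converges weak-∗ly to $f$ as $n \to \infty$ since $x \in X$ was arbitrary. □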


The last two theorems of this section and chapter are generalizations of the classical Bolzano-Weierstrass theorem, i.e. the fact that each bounded sequence in $\mathbb{F}^n$ with $\mathbb{F} = \mathbb{R}$ or $\mathbb{F} = \mathbb{C}$ has a convergent subsequence.

Theorem 31 (Helly's theorem). Let $X$ be a separable normed linear space. Then every bounded sequence $(f_n)_{n=1}^{\infty} \subset X^*$ has a weak-∗ly convergent subsequence.

Proof. Let $M = \{x_k\}_{k=1}^{N} \subset X$ be a countable dense subset of $X$ and $(f_n)_{n=1}^{\infty} \subset X^*$ an arbitrary bounded sequence. In turn, $(f_n(x))_{n=1}^{\infty} \subset \mathbb{F}$ is bounded for every $x \in X$, so in particular $(f_n(x_1))_{n=1}^{\infty}$ is bounded. Thus, by the Bolzano-Weierstrass theorem, there exists a subsequence $(f_{n_1(k)})_{k=1}^{\infty} \subset \{f_n\}_{n=1}^{\infty} \subset X^*$ with $f_{n_1(k)}(x_1) \to f_\infty(x_1)$ as $k \to \infty$. Now consider the bounded sequence $(f_{n_1(k)}(x_2))_{k=1}^{\infty}$. By the Bolzano-Weierstrass theorem we can find a subsequence $(f_{n_2(k)})_{k=1}^{\infty} \subset \{f_{n_1(k)}\}_{k=1}^{\infty}$ for which $f_{n_2(k)}(x_2) \to f_\infty(x_2)$ as $k \to \infty$. Continuing in this fashion inductively, we can find successive subsequences $(f_{n_i(k)})_{k=1}^{\infty} \subset X^*$ so that

(a) $(f_{n_{i+1}(k)})_{k=1}^{\infty} \subset \{f_{n_i(k)}\}_{k=1}^{\infty} \subset X^*$ for $i \in \mathbb{N}$,

(b) $f_{n_i(k)}(x_j) \to f_\infty(x_j)$ as $k \to \infty$ for all $j = 1, 2, \ldots, i$.

To get a subsequence that converges at each $x_j$, $j = 1, \ldots, N$, we consider the diagonal sequence $(f_{n_k(k)})_{k=1}^{\infty} \subset X^*$. By construction, $f_{n_k(k)}(x_j) \to f_\infty(x_j)$ as $k \to \infty$ for all $j = 1, \ldots, N$, so Theorem 30 yields the weak-∗ convergence of $(f_{n_k(k)})_{k=1}^{\infty} \subset \{f_n\}_{n=1}^{\infty}$. This completes our proof. □

Theorem 32. Let $X$ be a reflexive Banach space. Then every bounded sequence $(x_n)_{n=1}^{\infty} \subset X$ has a weakly convergent subsequence.

Proof. Let $(x_n)_{n=1}^{\infty} \subset X$ be a bounded sequence and set

$$Y := \overline{\left\{\sum_{j=1}^{N}\alpha_j x_j : \alpha_j \in \mathbb{F},\ N \in \mathbb{N}\right\}}.$$

Since $Y$ is a closed subspace of a reflexive Banach space, Proposition 8 asserts that $Y$ is reflexive and moreover separable (the linear combinations $\sum_{j=1}^{N}\alpha_j x_j$ with $\alpha_j \in \mathbb{Q} + \mathrm{i}\mathbb{Q}$ or $\alpha_j \in \mathbb{Q}$ are dense in $Y$). In turn, $Y^{**}$ is isomorphic to $Y$ and since the isomorphism $J : Y \to Y^{**}$ in question is an isometry, see Theorem 22, we know that $Y^{**}$ is separable as well and thus also $Y^*$ by Theorem 19. Now use Theorem 31 for $Y^*$: $(j_{x_n})_{n=1}^{\infty} \subset Y^{**}$, which is bounded by Theorem 22, has a weak-∗ly convergent subsequence $(j_{x_{n_m}})_{m=1}^{\infty} \subset \{j_{x_n}\}_{n=1}^{\infty}$, say, with weak-∗ limit $k \in Y^{**}$. But $Y$ is reflexive, so $k = j_{x_\infty}$ for some $x_\infty \in Y \subset X$, and thus for every $f \in Y^*$,

$$f(x_{n_m}) = j_{x_{n_m}}(f) \to k(f) = j_{x_\infty}(f) = f(x_\infty), \quad m \to \infty. \tag{3.15}$$

But the convergence (3.15) extends to all $f \in X^*$, since the restriction of any $f \in X^*$ to $Y$ belongs to $Y^*$, so indeed $x_{n_m} \rightharpoonup x_\infty$ as $m \to \infty$, which concludes our proof. □


4 Bounded operators

4.1 Topologies on bounded operators


We have already introduced L (𝑋, 𝑌 ), the normed linear space of bounded linear transformations
from one normed linear space to another. In this section we will study L (𝑋, 𝑌 ) more closely.

Definition 37. Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be normed linear spaces, $(T_n)_{n=1}^{\infty} \subset \mathcal{L}(X, Y)$ and $T \in \mathcal{L}(X, Y)$ bounded linear transformations. We say
(i) $(T_n)_{n=1}^{\infty}$ converges in norm (or uniformly) to $T$, if $\|T_n - T\| \to 0$ as $n \to \infty$.
(ii) $(T_n)_{n=1}^{\infty}$ converges strongly to $T$, if for all $x \in X$, $\|T_n x - Tx\|_Y \to 0$ as $n \to \infty$.
(iii) $(T_n)_{n=1}^{\infty}$ converges weakly to $T$, if for all $x \in X$, $f \in Y^*$, $|f(T_n x) - f(Tx)| \to 0$ as $n \to \infty$.

Since, by linearity of $f$ and $T_n, T$,

$$|f(T_n x) - f(Tx)| = |f(T_n x - Tx)| \le \|f\|\,\|T_n x - Tx\|_Y = \|f\|\,\|(T_n - T)x\|_Y \le \|f\|\,\|T_n - T\|\,\|x\|_X,$$
we deduce that norm convergence implies strong convergence and strong convergence implies
weak convergence. The converse implications are in general false.

Example 27. Consider $X = (\ell_p(\mathbb{N}), \|\cdot\|_p)$ as discussed in Example 2 for $1 \le p \le \infty$. First, if $T_n \in \mathcal{L}(X)$ is given by

$$T_n : (x_1, x_2, x_3, \ldots) \mapsto \frac{1}{n}(x_1, x_2, x_3, \ldots),$$

then $T_n \to 0$ uniformly since $\|T_n - 0\| = \frac{1}{n} \to 0$ as $n \to \infty$. Second, if $L$ and $R$ denote the shift operators in Example 5, then $S_n \in \mathcal{L}(X)$ with

$$S_n := L^n = \underbrace{L \circ \ldots \circ L}_{n \text{ times}} : (x_1, x_2, x_3, \ldots) \mapsto (x_{n+1}, x_{n+2}, x_{n+3}, \ldots),$$

satisfies $\|S_n x\|_p^p = \sum_{i=n+1}^{\infty}|x_i|^p \to 0$ as $n \to \infty$ for $1 \le p < \infty$ and all $x \in \ell_p(\mathbb{N})$, so $S_n \to 0$ strongly, but not uniformly since $\|S_n\| = 1$ for all $n \in \mathbb{N}$, compare the workings in Example 5. Finally, let $A_n \in \mathcal{L}(X)$ denote the operator

$$A_n := R^n = \underbrace{R \circ \ldots \circ R}_{n \text{ times}} : (x_1, x_2, x_3, \ldots) \mapsto (\underbrace{0, \ldots, 0}_{n}, x_1, x_2, x_3, \ldots),$$

then, for every $f = f_y \in X^*$,

$$f(A_n x) = \sum_{k=1}^{\infty}(A_n x)_k\,y_k = \sum_{k=n+1}^{\infty} x_{k-n}\,y_k \to 0 \quad \text{as } n \to \infty$$

for $1 < p < \infty$, where $y = (y_k)_{k=1}^{\infty} \in \ell_q(\mathbb{N})$ with $1 < q < \infty$ : $\frac{1}{p} + \frac{1}{q} = 1$ by Example 24, so $A_n \to 0$ weakly. But $A_n \to 0$ neither strongly nor uniformly since $\|A_n\| = 1$ and $\|A_n x\|_p = \|x\|_p$ for any $n \in \mathbb{N}$ and $x \in \ell_p(\mathbb{N})$.
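
The different behaviour of the left and right shifts in Example 27 can be seen on finitely supported sequences. The sketch below is illustrative and makes assumptions beyond the notes: $p = q = 2$, the vector $x_k = 1/k$ is cut off after 2000 entries, and the functional is the hypothetical choice $f_y$ with $y_k = 1/k$.

```python
def left_shift(x, n):
    """S_n = L^n: drop the first n entries."""
    return x[n:]

def right_shift(x, n):
    """A_n = R^n: prepend n zeros."""
    return [0.0] * n + x

def norm(x, p=2):
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def pair(x, y_entry):
    """f_y(x) = sum_k x_k y_k for a finitely supported x and y given entrywise (1-based)."""
    return sum(t * y_entry(k) for k, t in enumerate(x, start=1))

x = [1.0 / k for k in range(1, 2001)]  # finitely supported stand-in for an l_2 vector
y_entry = lambda k: 1.0 / k            # y_k = 1/k, also square summable

for n in (1, 10, 100, 1000):
    # columns: n, ||S_n x||, ||A_n x||, f_y(A_n x)
    print(n, norm(left_shift(x, n)), norm(right_shift(x, n)), pair(right_shift(x, n), y_entry))
```

The second column decays (strong convergence of $S_n$), the third stays constant (no strong convergence of $A_n$), and the last decays (weak convergence of $A_n$ tested against one functional).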


In closing of this short section we record the following partial analogue of Theorem 27:

Theorem 33. Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be Banach spaces. If $(T_n)_{n=1}^{\infty} \subset \mathcal{L}(X, Y)$ converges weakly, then there exists $c > 0$ so that $\|T_n\| \le c$ for all $n \in \mathbb{N}$.

Proof. If for all $x \in X$ and all $f \in Y^*$, $f(T_n x) \to f(Tx)$, then the sequence $(T_n x)_{n=1}^{\infty} \subset Y$ converges weakly to $Tx$, i.e.

$$T_n x \rightharpoonup Tx \quad \text{as } n \to \infty,$$

and, by Theorem 27, there exists $c = c(x) > 0$ so that $\|T_n x\|_Y \le c$ for all $n \in \mathbb{N}$. Hence, for every $x \in X$,

$$\sup\{\|T_n x\|_Y : n \in \mathbb{N}\} < \infty,$$

which says that $\mathcal{F} := \{T_n\}_{n=1}^{\infty} \subset \mathcal{L}(X, Y)$ is pointwise bounded. Given that $X$ and $Y$ are complete, Theorem 3.5 asserts that

$$\sup\{\|T_n\| : n \in \mathbb{N}\} < \infty,$$

and so $\|T_n\| \le c$ for all $n \in \mathbb{N}$. This completes our proof. □

4.2 Adjoints
In this section we extend the dual or Banach space adjoint in (3.5) to general Hilbert spaces, emphasizing at the outset that the Hilbert space adjoint is not equal to the Banach space adjoint, although the two are closely related.

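Theorem 34. Let $(\mathcal{H}_1, \langle\cdot,\cdot\rangle_1)$ and $(\mathcal{H}_2, \langle\cdot,\cdot\rangle_2)$ be two Hilbert spaces and $T \in \mathcal{L}(\mathcal{H}_1, \mathcal{H}_2)$ a bounded linear transformation. There exists a unique transformation $T^* \in \mathcal{L}(\mathcal{H}_2, \mathcal{H}_1)$, the Hilbert space adjoint, so that

$$\forall\, x \in \mathcal{H}_1,\ y \in \mathcal{H}_2 : \quad \langle Tx, y\rangle_2 = \langle x, T^*y\rangle_1.$$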

Proof. Start by recalling that each Hilbert space is isomorphic to its own dual, see Example 21 or Theorem 2.3. Let $C_i : \mathcal{H}_i \to \mathcal{H}_i^*$ denote the map that assigns to each $y \in \mathcal{H}_i$ the bounded linear functional $\langle\cdot, y\rangle_i$ in $\mathcal{H}_i^*$. Note that $C_i$ is a conjugate linear isometry which is bijective by Theorem 2.3. Now define the map $T^* : \mathcal{H}_2 \to \mathcal{H}_1$ by

$$T^* := C_1^{-1}T^tC_2, \tag{4.1}$$

using the dual $T^t \in \mathcal{L}(\mathcal{H}_2^*, \mathcal{H}_1^*)$. Note that $T^*$ satisfies

$$\langle Tx, y\rangle_2 = (C_2y)(Tx) \overset{(3.5)}{=} (T^tC_2y)(x) = (C_1T^*y)(x) = \langle x, T^*y\rangle_1$$

for every $x \in \mathcal{H}_1$ and $y \in \mathcal{H}_2$. Clearly $T^*$ in (4.1) is linear and $\|T^*\| \le \|C_1^{-1}\|\,\|T^t\|\,\|C_2\| < \infty$ by Corollary 1 given that the $C_i$ are isometries and since $\|T^t\| = \|T\|$ by Theorem 21. Finally, if there were $T_1^*, T_2^* \in \mathcal{L}(\mathcal{H}_2, \mathcal{H}_1)$ such that $\langle x, T_1^*y\rangle_1 = \langle Tx, y\rangle_2 = \langle x, T_2^*y\rangle_1$ for all $x \in \mathcal{H}_1, y \in \mathcal{H}_2$, then insert $x = T_1^*y - T_2^*y \in \mathcal{H}_1$ and conclude $T_1^*y = T_2^*y$ for all $y \in \mathcal{H}_2$, i.e. uniqueness follows. □

We now summarize the key properties of the map 𝑇 ↦→ 𝑇 ∗ , thus extending Theorem 21 to the
Hilbert space adjoint.

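Theorem 35. Let $T \in \mathcal{L}(\mathcal{H}_1, \mathcal{H}_2)$, $S \in \mathcal{L}(\mathcal{H}_2, \mathcal{H}_3)$ be two bounded linear transformations between three Hilbert spaces $(\mathcal{H}_1, \langle\cdot,\cdot\rangle_1)$, $(\mathcal{H}_2, \langle\cdot,\cdot\rangle_2)$ and $(\mathcal{H}_3, \langle\cdot,\cdot\rangle_3)$ and let $T^* \in \mathcal{L}(\mathcal{H}_2, \mathcal{H}_1)$, $S^* \in \mathcal{L}(\mathcal{H}_3, \mathcal{H}_2)$ denote their adjoints. Then
(1) $T \mapsto T^*$ is a conjugate linear isometric isomorphism.
(2) $(S \circ T)^* = T^* \circ S^*$.
(3) If $T \in \mathcal{L}(\mathcal{H}_1, \mathcal{H}_2)$ is invertible, then so is $T^* \in \mathcal{L}(\mathcal{H}_2, \mathcal{H}_1)$ and we have $(T^{-1})^* = (T^*)^{-1}$.
(4) $(T^*)^* = T$.
(5) The map $T \mapsto T^*$ is continuous in the weak and uniform operator topologies, but not in general in the strong operator topology.
(6) $\|T^*T\| = \|TT^*\| = \|T\|^2$.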

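Proof. Begin with (4) and compute for any $x \in \mathcal{H}_1, y \in \mathcal{H}_2$,

$$\langle (T^*)^*x, y\rangle_2 = \overline{\langle y, (T^*)^*x\rangle_2} = \overline{\langle T^*y, x\rangle_1} = \langle x, T^*y\rangle_1 = \langle Tx, y\rangle_2,$$

so $(T^*)^* = T$. Moving to (1), we use that $C_i : \mathcal{H}_i \to \mathcal{H}_i^*$ is conjugate linear and $T^t : \mathcal{H}_2^* \to \mathcal{H}_1^*$ linear, so $T \mapsto T^*$ is conjugate linear by (4.1), i.e.

$$(\alpha T_1 + \beta T_2)^* = \overline{\alpha}\,T_1^* + \overline{\beta}\,T_2^*$$

for any $T_j \in \mathcal{L}(\mathcal{H}_1, \mathcal{H}_2)$ and $\alpha, \beta \in \mathbb{F}$. Moreover, since the $C_j$ are isometries and since $\|T^t\| = \|T\|$ by Theorem 21 we conclude at once that $\|T^*\| = \|T\|$, i.e. $T \mapsto T^*$ is an isometry and thus injective. However, $(T^*)^* = T$ says that $T \mapsto T^*$ is surjective, so $T \mapsto T^*$ is invertible by Theorems 10 and 3.7 and this yields (1). Moving ahead, property (2) is immediate since

$$\langle T^*S^*x, y\rangle_1 = \overline{\langle y, T^*S^*x\rangle_1} = \overline{\langle Ty, S^*x\rangle_2} = \overline{\langle STy, x\rangle_3} = \langle x, STy\rangle_3 \overset{(4)}{=} \langle x, ((ST)^*)^*y\rangle_3$$

for all $x \in \mathcal{H}_3, y \in \mathcal{H}_1$ and thus, by uniqueness of the adjoint, $(ST)^* = T^*S^*$. Furthermore, given $T^{-1}T = I_{\mathcal{H}_1}$ and $TT^{-1} = I_{\mathcal{H}_2}$, recall Definition 26, we use that $T^*$ is invertible by (4.1) and Theorem 21, moreover

$$T^*(T^{-1})^* \overset{(2)}{=} (T^{-1}T)^* = I_{\mathcal{H}_1}^* = I_{\mathcal{H}_1}, \qquad (T^{-1})^*T^* \overset{(2)}{=} (TT^{-1})^* = I_{\mathcal{H}_2}^* = I_{\mathcal{H}_2},$$

which proves the identity $(T^{-1})^* = (T^*)^{-1}$ because inverses are unique. With (1) − (4) proven we now pick a sequence $(T_n)_{n=1}^{\infty} \subset \mathcal{L}(\mathcal{H}_1, \mathcal{H}_2)$ so that $T_n \to T \in \mathcal{L}(\mathcal{H}_1, \mathcal{H}_2)$ as $n \to \infty$ uniformly, respectively weakly. In the uniform case,

$$\|T_n^* - T^*\| \overset{(1)}{=} \|(T_n - T)^*\| \overset{(1)}{=} \|T_n - T\| \to 0 \quad \text{as } n \to \infty,$$

so $T_n^* \to T^*$ uniformly, and in the weak case, for every $x \in \mathcal{H}_2$,

$$f(T_n^*x) = \langle T_n^*x, y_f\rangle_1 = \langle x, T_ny_f\rangle_2 = \overline{\langle T_ny_f, x\rangle_2} = \overline{g_x(T_ny_f)} \to \overline{g_x(Ty_f)} = \overline{\langle Ty_f, x\rangle_2} = \langle x, Ty_f\rangle_2 = \langle T^*x, y_f\rangle_1 = f(T^*x)$$

for all $f = \langle\cdot, y_f\rangle_1 \in \mathcal{H}_1^*$ by Theorem 2.3 with the linear functional $g_x \in \mathcal{H}_2^*$ given by $g_x(y) := \langle y, x\rangle_2$. In turn, $T_n^* \to T^*$ weakly and we will discuss the strong topology in Example 28 below. Moving to (6) we know from Corollary 1 and (1) that

$$\|T^*T\| \le \|T^*\|\,\|T\| = \|T\|^2, \qquad \|TT^*\| \le \|T\|\,\|T^*\| = \|T\|^2.$$

But also

$$\|T\|^2 = \sup_{\|x\|_1 = 1}\|Tx\|_2^2 = \sup_{\|x\|_1 = 1}\langle Tx, Tx\rangle_2 = \sup_{\|x\|_1 = 1}\langle x, T^*Tx\rangle_1 \le \sup_{\|x\|_1 = 1}\|x\|_1\,\|T^*Tx\|_1 \le \sup_{\|x\|_1 = 1}\big(\|x\|_1^2\,\|T^*T\|\big) = \|T^*T\|,$$

so together $\|T^*T\| = \|T\|^2$. Since this equality holds for any $T \in \mathcal{L}(\mathcal{H}_1, \mathcal{H}_2)$, it holds in particular for $T^* \in \mathcal{L}(\mathcal{H}_2, \mathcal{H}_1)$ and we obtain therefore $\|TT^*\| = \|(T^*)^*T^*\| = \|T^*\|^2 = \|T\|^2$ by (1) and (4). This completes our proof of the theorem. □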

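Example 28. Consider $\mathcal{H} = \ell_2(\mathbb{N})$ over $\mathbb{F} = \mathbb{C}$ and $A_n \in \mathcal{L}(\mathcal{H})$ as in Example 27,

$$A_n : (x_1, x_2, x_3, \ldots) \mapsto (\underbrace{0, \ldots, 0}_{n}, x_1, x_2, x_3, \ldots).$$

Then for any $x, y \in \mathcal{H}$,

$$\langle A_nx, y\rangle = \sum_{k=1}^{\infty}(A_nx)_k\,\overline{y_k} = \sum_{k=n+1}^{\infty} x_{k-n}\,\overline{y_k} = \sum_{k=1}^{\infty} x_k\,\overline{y_{k+n}} = \langle x, S_ny\rangle,$$

which shows that $A_n^* = S_n$ for all $n \in \mathbb{N}$. Example 27 established $S_n \to 0$ strongly as $n \to \infty$, but $S_n^* = A_n$ does not converge strongly to zero. Hence, $T \mapsto T^*$ is in general not continuous in the strong operator topology.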
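
The identity $\langle A_n x, y\rangle = \langle x, S_n y\rangle$ behind $A_n^* = S_n$ can be verified on finitely supported complex sequences. This is a minimal sketch with hypothetical sample vectors; the inner product is taken linear in the first argument, matching the convention $\langle\cdot, y\rangle$ used for functionals in these notes.

```python
def inner(u, v):
    """<u, v> = sum_k u_k * conj(v_k), linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def right_shift(x, n):
    """A_n: prepend n zeros."""
    return [0j] * n + x

def left_shift(y, n):
    """S_n: drop the first n entries."""
    return y[n:]

n = 3
x = [1 + 2j, -1j, 0.5 + 0j, 2 - 1j]
y = [1j, 2 + 0j, -1 + 1j, 3 + 0j, 1 - 1j, -2j, 0.25 + 0j]  # length len(x) + n

lhs = inner(right_shift(x, n), y)   # <A_n x, y>
rhs = inner(x, left_shift(y, n))    # <x, S_n y>
print(lhs, rhs)
```

Both sides reduce to $\sum_k x_k\,\overline{y_{k+n}}$, which is the index shift carried out in the example.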

Definition 38. A bounded linear operator 𝑇 ∈ L (H ) on a Hilbert space H is called normal,


if 𝑇𝑇 ∗ = 𝑇 ∗𝑇 , and self-adjoint, if 𝑇 ∗ = 𝑇 . Moreover, if H1 and H2 are two Hilbert spaces, we
call a linear transformation 𝑈 ∈ L (H1, H2 ) unitary (compare Definition 23), if 𝑈 ∗𝑈 = 𝐼 H1 and
𝑈𝑈 ∗ = 𝐼 H2 .


Self-adjoint operators play a major role in functional analysis and mathematical physics and
much of our remaining time is devoted to studying them.

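Example 29. Let $T \in \mathcal{L}(\ell_2(\mathbb{N}))$ denote the multiplication operator of Example 4, i.e. given a sequence of complex numbers $(t_n)_{n=1}^{\infty} \in \ell_\infty(\mathbb{N})$ we have

$$T : (x_1, x_2, x_3, \ldots) \mapsto (t_1x_1, t_2x_2, t_3x_3, \ldots).$$

Then for any $x, y \in \ell_2(\mathbb{N})$,

$$\langle Tx, y\rangle = \sum_{n=1}^{\infty}(Tx)_n\,\overline{y_n} = \sum_{n=1}^{\infty} t_nx_n\,\overline{y_n} = \sum_{n=1}^{\infty} x_n\,\overline{\overline{t_n}\,y_n},$$

so $T^*$ is multiplication by $(\overline{t_n})_{n=1}^{\infty}$. Hence $T$ is self-adjoint if and only if $t_n \in \mathbb{R}$ for all $n \in \mathbb{N}$ and unitary if and only if $|t_n| = 1$ for all $n \in \mathbb{N}$. Note that $T$ is normal for any $(t_n)_{n=1}^{\infty} \in \ell_\infty(\mathbb{N})$.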
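
A finite truncation makes the three statements of Example 29 easy to test. The sketch below is only an illustration under assumptions: sequences are cut to three entries, and the symbol `t` and the vectors `x`, `y` are hypothetical sample data.

```python
def multiply(t, x):
    """Multiplication operator: (Tx)_n = t_n x_n."""
    return [a * b for a, b in zip(t, x)]

def adjoint_symbol(t):
    """T* is multiplication by the complex conjugates of the t_n."""
    return [a.conjugate() for a in t]

def inner(u, v):
    """<u, v> = sum_k u_k * conj(v_k), linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

t = [1 + 1j, 2 + 0j, -1j]
x = [1 + 0j, 1j, 2 - 1j]
y = [3 + 0j, 1 + 1j, -1j]

lhs = inner(multiply(t, x), y)                  # <Tx, y>
rhs = inner(x, multiply(adjoint_symbol(t), y))  # <x, T*y>

is_self_adjoint = all(a.imag == 0 for a in t)          # t_n real for all n
is_unitary = all(abs(abs(a) - 1) < 1e-12 for a in t)   # |t_n| = 1 for all n
# T T* and T* T both multiply by |t_n|^2, so T is normal for every symbol t
tt_star = multiply(t, adjoint_symbol(t))
t_star_t = multiply(adjoint_symbol(t), t)
print(lhs, rhs, is_self_adjoint, is_unitary, tt_star == t_star_t)
```

With this sample symbol the operator is neither self-adjoint nor unitary, yet $TT^*$ and $T^*T$ coincide.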

Given two self-adjoint operators 𝑇 , 𝑆 ∈ L (H ) on a Hilbert space H the combination 𝛼𝑇 + 𝛽𝑆


is again self-adjoint if 𝛼, 𝛽 ∈ R, compare Theorem 35. Moreover, the composition 𝑇 ◦ 𝑆 is
self-adjoint if and only if $T$ and $S$ commute. We now state a necessary and sufficient criterion for self-adjointness of operators on complex Hilbert spaces.

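Lemma 36. Let $T \in \mathcal{L}(\mathcal{H})$ be a bounded linear operator on a complex Hilbert space $\mathcal{H}$. Then $T$ is self-adjoint if and only if $\langle Tx, x\rangle \in \mathbb{R}$ for all $x \in \mathcal{H}$.

Proof. Self-adjointness of $T$ implies

$$\langle Tx, x\rangle = \langle x, T^*x\rangle = \langle x, Tx\rangle = \overline{\langle Tx, x\rangle} \quad \forall\, x \in \mathcal{H},$$

and so $\langle Tx, x\rangle \in \mathbb{R}$. Conversely, for any $x, y \in \mathcal{H}$,

$$\underbrace{\langle T(x + y), x + y\rangle}_{\in \mathbb{R}} = \underbrace{\langle Tx, x\rangle}_{\in \mathbb{R}} + \langle Tx, y\rangle + \langle Ty, x\rangle + \underbrace{\langle Ty, y\rangle}_{\in \mathbb{R}},$$

i.e. $\langle Tx, y\rangle + \langle Ty, x\rangle \in \mathbb{R}$ for all $x, y \in \mathcal{H}$. Likewise,

$$\underbrace{\langle T(x + \mathrm{i}y), x + \mathrm{i}y\rangle}_{\in \mathbb{R}} = \underbrace{\langle Tx, x\rangle}_{\in \mathbb{R}} - \mathrm{i}\langle Tx, y\rangle + \mathrm{i}\langle Ty, x\rangle + \underbrace{\langle Ty, y\rangle}_{\in \mathbb{R}},$$

so $\mathrm{i}\langle Ty, x\rangle - \mathrm{i}\langle Tx, y\rangle \in \mathbb{R}$ for all $x, y \in \mathcal{H}$. In summary,

$$\forall\, x, y \in \mathcal{H} : \quad \begin{cases} \langle Tx, y\rangle + \langle Ty, x\rangle = \overline{\langle Tx, y\rangle} + \overline{\langle Ty, x\rangle} \\ \mathrm{i}\langle Tx, y\rangle - \mathrm{i}\langle Ty, x\rangle = -\mathrm{i}\,\overline{\langle Tx, y\rangle} + \mathrm{i}\,\overline{\langle Ty, x\rangle} \end{cases},$$

and this gives $\langle Tx, y\rangle = \overline{\langle Ty, x\rangle}$, thus $\langle y, Tx\rangle = \langle Ty, x\rangle$ for all $x, y \in \mathcal{H}$, and so $T = T^*$. □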


And a sufficient criterion for unitary transformations:

Lemma 37. A linear transformation $U \in \mathcal{L}(\mathcal{H}_1, \mathcal{H}_2)$ between two Hilbert spaces $(\mathcal{H}_1, \langle\cdot,\cdot\rangle_1)$ and $(\mathcal{H}_2, \langle\cdot,\cdot\rangle_2)$ obeys $U^*U = I_{\mathcal{H}_1}$ if and only if $\|Ux\|_2 = \|x\|_1$ for all $x \in \mathcal{H}_1$. Such a $U$ is unitary if and only if $\mathrm{Ran}(U) = \mathcal{H}_2$.

Proof. By polarization, say in $\mathbb{F} = \mathbb{C}$,

\[ \langle x, y\rangle = \frac{1}{4}\|x+y\|^2 - \frac{1}{4}\|x-y\|^2 + \frac{\mathrm{i}}{4}\|x+\mathrm{i}y\|^2 - \frac{\mathrm{i}}{4}\|x-\mathrm{i}y\|^2, \]

we obtain $U^*U = I_{\mathcal{H}_1}$ if and only if $\langle x, U^*Ux\rangle_1 = \|x\|_1^2$ for all $x \in \mathcal{H}_1$. But $\langle x, U^*Ux\rangle_1 = \langle Ux, Ux\rangle_2 = \|Ux\|_2^2$, so the first equivalence follows. Finally, any isometry is injective; if it is also surjective, then it has an inverse $S \in \mathcal{L}(\mathcal{H}_2, \mathcal{H}_1)$ by Theorem 3.7 and so $SU = I_{\mathcal{H}_1}$ and $US = I_{\mathcal{H}_2}$. But then $U^* = U^*(US) = (U^*U)S = S$, so $UU^* = I_{\mathcal{H}_2}$, i.e. $U$ is unitary. Conversely, if $UU^* = I_{\mathcal{H}_2}$, then for any $x \in \mathcal{H}_2$ we have $x = U(U^*x)$ and so $x \in \mathrm{Ran}(U)$. This concludes the proof. □

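We now collect a few subspace equalities that will be useful in the upcoming section. Recall the orthogonal complement back in Definition 21:

Proposition 10. Let $T \in \mathcal{L}(\mathcal{H}_1, \mathcal{H}_2)$ be a bounded linear transformation between two Hilbert spaces $(\mathcal{H}_1, \langle\cdot,\cdot\rangle_1)$ and $(\mathcal{H}_2, \langle\cdot,\cdot\rangle_2)$. Then

\[ \mathrm{Ker}(T^*) = \big(\mathrm{Ran}(T)\big)^{\perp}, \qquad \mathrm{Ker}(T) = \big(\mathrm{Ran}(T^*)\big)^{\perp}. \]

Proof. For an arbitrary $x \in \mathcal{H}_2$,

\[ T^*x = 0 \ \Leftrightarrow\ 0 = \langle y, T^*x\rangle_1 = \langle Ty, x\rangle_2 \ \ \forall\, y \in \mathcal{H}_1 \ \Leftrightarrow\ x \in \big(\mathrm{Ran}(T)\big)^{\perp}. \]

Hence, $\mathrm{Ker}(T^*) = (\mathrm{Ran}(T))^{\perp}$ which yields, with $(T^*)^* = T$, also the second claim. □

Note that, by Theorem 2.2 and the last Proposition 10, for a given $T \in \mathcal{L}(\mathcal{H})$ on a Hilbert space $\mathcal{H}$ any element $x \in \mathcal{H}$ admits a unique representation of the form

\[ x = y + z: \quad y \in \mathrm{Ker}(T^*), \quad z \in \big(\mathrm{Ker}(T^*)\big)^{\perp} = \Big(\big(\mathrm{Ran}(T)\big)^{\perp}\Big)^{\perp} = \overline{\mathrm{Ran}(T)}, \tag{4.2} \]

where we used Corollary 6. In summary, with the terminology of Definition 33:

Corollary 15. Given a Hilbert space $\mathcal{H}$ and $T \in \mathcal{L}(\mathcal{H})$, then $\mathrm{Ker}(T^*)$ and $\overline{\mathrm{Ran}(T)}$ are complementary subspaces.

An important class of operators intimately related to (4.2) is that of the projections.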


Definition 39. A linear operator $P \in \mathcal{L}(X)$ on a normed linear space $(X, \|\cdot\|)$ is called a projection if $P^2 = P \circ P = P$. If in addition $X = \mathcal{H}$ is a Hilbert space and $P^* = P$, then $P$ is called an orthogonal projection.

Here is the basic result about projections on normed linear spaces:

Theorem 38. Let 𝑃 ∈ L (𝑋 ) be a projection on a normed linear space 𝑋 . Then


(1) 𝑄 = 𝐼 − 𝑃 ∈ L (𝑋 ) is also a projection with 𝑃𝑄 = 𝑄𝑃 = 0.
(2) Ran(𝑃) = Ker(𝑄) and Ran(𝑄) = Ker(𝑃), i.e. Ran(𝑃) is closed.
(3) Ker(𝑃) and Ran(𝑃) are complementary subspaces of 𝑋 .

Proof. Clearly $Q \in \mathcal{L}(X)$ and since $Q^2 = (I-P)(I-P) = I - P - P + P^2 = I - P = Q$, $Q$ is a projection. Also, $PQ = P(I-P) = P - P^2 = 0 = (I-P)P = QP$, so (1) holds. Moving ahead, let $y \in \mathrm{Ran}(P)$, i.e. $y = Px$ for some $x \in X$ and thus $Qy = (QP)x = 0$ by (1), so $y \in \mathrm{Ker}(Q)$. Conversely, if $x \in \mathrm{Ker}(Q)$, then $x = Px$, so $x \in \mathrm{Ran}(P)$. Thus $\mathrm{Ran}(P) = \mathrm{Ker}(Q)$ and, by formally replacing $P$ with $I - P$, also $\mathrm{Ran}(Q) = \mathrm{Ker}(P)$, so (2) holds. Finally, $x = Px + (I-P)x$ which shows that any $x \in X$ is a sum of $y := (I-P)x \in \mathrm{Ran}(Q) = \mathrm{Ker}(P)$ and $z := Px \in \mathrm{Ran}(P)$. Moreover, this representation is unique since if $y_1 + z_1 = x = y_2 + z_2$ with $y_i \in \mathrm{Ker}(P)$, $z_i \in \mathrm{Ran}(P) = \mathrm{Ker}(Q)$, then

\[ y_1 - y_2 = z_2 - z_1 \in \mathrm{Ker}(P) \cap \mathrm{Ker}(Q) = \{0\}, \]

so necessarily $y_1 - y_2 = z_1 - z_2 = 0$ which completes our proof. □
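
A small finite-dimensional instance may help to see Theorem 38 at work. The sketch below uses an oblique (non-orthogonal) projection on $\mathbb{R}^2$; the matrix `P` and the helpers `matmul` and `apply` are our own illustrative choices, not taken from the notes.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply(m, v):
    """Apply a 2x2 matrix to a vector of length 2."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


I2 = [[1, 0], [0, 1]]
P = [[1, 1], [0, 0]]   # projection onto span{(1, 0)} along span{(1, -1)}
Q = [[I2[i][j] - P[i][j] for j in range(2)] for i in range(2)]
ZERO = [[0, 0], [0, 0]]

x = [3, 4]
z = apply(P, x)   # component in Ran(P)
y = apply(Q, x)   # component in Ran(Q) = Ker(P)

print(matmul(P, P) == P, matmul(Q, Q) == Q)           # both idempotent
print(matmul(P, Q) == ZERO, matmul(Q, P) == ZERO)     # PQ = QP = 0
print(y, z, [y[0] + z[0], y[1] + z[1]] == x)          # x = y + z
```

Since `P` is not symmetric, it is a projection in the sense of Definition 39 without being an orthogonal projection; the decomposition $x = y + z$ is still unique.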

The last part of Theorem 38 asserts that projections are always associated with complementary subspaces and provided we are working in a Hilbert space, the converse is also true by Theorem 2.2 (and so its name justified): if any $x \in \mathcal{H}$ can be uniquely written as $x = y + z$ for $y \in S$ and $z \in S^{\perp}$ where $S \subset \mathcal{H}$ is a closed subspace, then

\[ P : \mathcal{H} \to S, \quad x \mapsto y \]

defines an orthogonal projection, for if $x' = y' + z'$ with $y' \in S$ and $z' \in S^{\perp}$, then $\langle Px, x'\rangle = \langle y, y' + z'\rangle = \langle y, y'\rangle = \langle y, Px'\rangle = \langle x, Px'\rangle$ and $\mathrm{Ran}(P) = S$. However, the last one-to-one correspondence between orthogonal projections and closed subspaces fails in general Banach or normed linear spaces. Still, some special cases do work out:

Theorem 39. Let $X$ be a Banach space and $S \subset X$ a finite-dimensional subspace. Then there exists a closed subspace $S' \subset X$ complementary to $S$.

Proof. Let $\{x_i\}_{i=1}^{n}$ be a basis for $S$ and $\{f_i\}_{i=1}^{n}$ the associated dual basis for $S^*$, uniquely determined by the constraints

\[ f_i(x_j) = \delta_{ij} := \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases}. \]


By Corollary 7, there exists an extension of $\{f_i\}_{i=1}^{n} \subset S^*$ to $\{F_i\}_{i=1}^{n} \subset X^*$ so that $F_i(x_j) = \delta_{ij}$. Now define

\[ Px := \sum_{k=1}^{n} F_k(x)\, x_k, \quad x \in X, \]

and note that $P \in \mathcal{L}(X)$ with $Px_i = x_i$ for $i = 1, \ldots, n$. Hence $P^2 = P$, so $P$ is a projection, and with $\mathrm{Ran}(P) = S$ the sought after complementary subspace is $S' := \mathrm{Ker}(P)$, see Theorem 38. □

4.3 The spectrum


If 𝑇 is a linear transformation on C𝑛 , then the eigenvalues of 𝑇 are the numbers 𝜆 ∈ C such
that the determinant of 𝜆𝐼 − 𝑇 is equal to zero. The set of such 𝜆 is called the spectrum of 𝑇 .
It can consist of at most $n$ points since $\det(\lambda I - T)$ is a polynomial of degree $n$. If $\lambda$ is not an
eigenvalue, then 𝜆𝐼 − 𝑇 has an inverse since det(𝜆𝐼 − 𝑇 ) ≠ 0. The spectral theory of operators
on infinite-dimensional spaces is much more interesting. We begin with two basic observations.

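Lemma 40. Let $T \in \mathcal{L}(\mathcal{H})$ be a normal operator on a Hilbert space $\mathcal{H}$ over $\mathbb{F} = \mathbb{R}$ or $\mathbb{F} = \mathbb{C}$ and $Tx = \lambda x$ for some $\lambda \in \mathbb{F}$, $x \in \mathcal{H}$. Then $T^*x = \overline{\lambda}x$.

Proof. Note that, for all $x \in \mathcal{H}$, $\lambda \in \mathbb{F}$,

\[ \begin{aligned} \|(T - \lambda I)x\|^2 &= \langle (T - \lambda I)x, (T - \lambda I)x\rangle = \big\langle x, (T^* - \overline{\lambda} I)(T - \lambda I)x\big\rangle = \big\langle x, (T - \lambda I)(T^* - \overline{\lambda} I)x\big\rangle \\ &= \big\langle (T^* - \overline{\lambda} I)x, (T^* - \overline{\lambda} I)x\big\rangle = \big\|(T^* - \overline{\lambda} I)x\big\|^2, \end{aligned} \]

where the normality of $T$ was used in the third equality. Hence, $Tx = \lambda x$ if and only if $T^*x = \overline{\lambda}x$. □

Corollary 16. If $T \in \mathcal{L}(\mathcal{H})$ is a normal operator on a Hilbert space $\mathcal{H}$ and $x, y \in \mathcal{H}$, $\lambda, \mu \in \mathbb{F}$ such that $Tx = \lambda x$, $Ty = \mu y$ with $\lambda \neq \mu$, then $\langle x, y\rangle = 0$.

Proof. If $Tx = \lambda x$ and $Ty = \mu y$, then by Lemma 40, $T^*y = \overline{\mu}y$ and thus

\[ \lambda\langle x, y\rangle = \langle Tx, y\rangle = \langle x, T^*y\rangle = \langle x, \overline{\mu}y\rangle = \mu\langle x, y\rangle, \]

which yields $\langle x, y\rangle = 0$ by assumption. □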

Here are the two central definitions of this section:

Definition 40. Let $X$ be a Banach space over $\mathbb{F}$ and $T \in \mathcal{L}(X)$ a bounded linear operator. A number $\lambda \in \mathbb{F}$ is said to be in the resolvent set $\rho(T)$ of $T$ if $\lambda I - T$ is invertible. The operator $R_\lambda(T) := (\lambda I - T)^{-1} \in \mathcal{L}(X)$ is called the resolvent of $T$ at $\lambda$. If $\lambda \notin \rho(T)$, then $\lambda$ is said to be in the spectrum $\sigma(T)$ of $T$.


If 𝜆 ∈ 𝜎 (𝑇 ), then 𝜆𝐼 − 𝑇 is not invertible and at least one of the below statements must be true,

Ker(𝜆𝐼 − 𝑇 ) ≠ {0} or Ran(𝜆𝐼 − 𝑇 ) ≠ 𝑋 .

In turn, we distinguish the following three cases:

Definition 41. Let $T \in \mathcal{L}(X)$ be a bounded linear operator on a Banach space $X$.

(i) An $x \in X \setminus \{0\}$ which satisfies $Tx = \lambda x$ is called an eigenvector of $T$ with eigenvalue $\lambda \in \mathbb{F}$. If $\lambda$ is an eigenvalue of $T$, then $\lambda I - T$ is not injective and thus $\lambda \in \sigma(T)$. The set of all eigenvalues is called the point spectrum of $T$, denoted $\sigma_p(T)$.
(ii) If $\lambda$ is not an eigenvalue, $\mathrm{Ran}(\lambda I - T) \neq X$ but $\mathrm{Ran}(\lambda I - T)$ is dense in $X$, then $\lambda$ is said to be in the continuous spectrum of $T$, denoted $\sigma_c(T)$.
(iii) If $\lambda$ is not an eigenvalue and $\overline{\mathrm{Ran}(\lambda I - T)} \neq X$, i.e. $\mathrm{Ran}(\lambda I - T)$ is not dense, then $\lambda$ is said to be in the residual spectrum of $T$, denoted $\sigma_r(T)$.

Observe that, by the last definition,

\[ \sigma(T) = \sigma_p(T) \sqcup \sigma_c(T) \sqcup \sigma_r(T) \quad \forall\, T \in \mathcal{L}(X) \]

as disjoint union.

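Example 30. Consider the Hilbert space $(\ell_2(\mathbb{N}), \|\cdot\|_2)$ over $\mathbb{F} = \mathbb{C}$ with the right shift operator $R \in \mathcal{L}(\ell_2(\mathbb{N}))$, see Example 5, that satisfies $\|R\| = 1$. First let $Rx = \lambda x$ with $x = (x_k)_{k=1}^{\infty} \in \ell_2(\mathbb{N})$, so entrywise $0 = \lambda x_1$ and $x_{k-1} = \lambda x_k$ for $k \in \mathbb{Z}_{\geq 2}$, which tells us $x = 0$. In short,

\[ \sigma_p(R) = \emptyset. \]

Next, with $\|R\| = 1$ and Theorem 18 we obtain that $\lambda I - R = \lambda(I - \lambda^{-1}R)$ is invertible for $\lambda \in \mathbb{C} : |\lambda| > 1$, so

\[ \sigma(R) \subset \mathbb{C} \setminus \{\lambda \in \mathbb{C} : |\lambda| > 1\} = \overline{\mathbb{D}_1(0)}, \qquad \mathbb{D}_r(z_0) := \{z \in \mathbb{C} : |z - z_0| < r\}. \]

But $\lambda I - R$ is in fact non-invertible on all of $\overline{\mathbb{D}_1(0)}$, for if $\lambda = 0$, then $\ell_2(\mathbb{N}) \ni (1, 0, 0, 0, \ldots) \notin \mathrm{Ran}(\lambda I - R)$, and if $0 < |\lambda| \leq 1$, then $(\lambda I - R)x = (1, 0, 0, 0, \ldots)$ has no solution $x \in \ell_2(\mathbb{N})$. Thus

\[ \sigma(R) = \overline{\mathbb{D}_1(0)}. \]

Next we use the adjoint $R^* = L$ computed in Example 28 which satisfies $\|R^*\| = 1$ and $\sigma_p(R^*) = \mathbb{D}_1(0)$: indeed, if $R^*x = \lambda x$ for $x \in \ell_2(\mathbb{N})$, then entrywise $x_{k+1} = \lambda x_k$ for $k \in \mathbb{N}$, which yields $x = x_1(1, \lambda, \lambda^2, \lambda^3, \ldots) \in \ell_2(\mathbb{N})$ with $x \neq 0$ if and only if $|\lambda| < 1$. Now apply Corollary 15 and Lemma 40,

\[ \ell_2(\mathbb{N}) = \mathrm{Ker}\big(\overline{\lambda} I - R^*\big) + \overline{\mathrm{Ran}(\lambda I - R)}, \qquad \mathrm{Ker}\big(\overline{\lambda} I - R^*\big) \cap \overline{\mathrm{Ran}(\lambda I - R)} = \{0\}, \tag{4.3} \]

and take $|\lambda| \leq 1$. Then (4.3) says $\mathrm{Ran}(\lambda I - R)$ is dense in $\ell_2(\mathbb{N})$ if and only if $\overline{\lambda} I - R^*$ is injective, so if and only if $\overline{\lambda} \notin \sigma_p(R^*) = \mathbb{D}_1(0)$ which is equivalent to $\lambda \notin \mathbb{D}_1(0)$. We conclude

\[ \sigma_c(R) = \partial\mathbb{D}_1(0) = \{\lambda \in \mathbb{C} : |\lambda| = 1\}, \]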


and since $\sigma(R)$ decomposes disjointly into its parts, also

\[ \sigma_r(R) = \mathbb{D}_1(0). \]
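
The eigenvector computation for the left shift $R^* = L$ in Example 30 can be explored numerically on truncations. This is only a sketch under our own conventions: `geometric_vector`, `eigen_residual` and `norm_squared` are hypothetical helper names, and a finite list stands in for the first entries of $(1, \lambda, \lambda^2, \ldots)$.

```python
def geometric_vector(lam, length):
    """First `length` entries of (1, lam, lam^2, ...)."""
    return [lam ** k for k in range(length)]


def left_shift(x):
    """L: drop the first entry."""
    return x[1:]


def eigen_residual(lam, length=60):
    """Largest entry of L x - lam * x on the truncation."""
    x = geometric_vector(lam, length)
    return max(abs(s - lam * xk) for s, xk in zip(left_shift(x), x))


def norm_squared(lam, length):
    """Partial sum of sum_k |lam|^(2k)."""
    return sum(abs(v) ** 2 for v in geometric_vector(lam, length))


inside = 0.5 + 0.3j    # |lam| < 1
on_circle = 1j         # |lam| = 1

print(eigen_residual(inside))
# Inside the disc the partial sums approach 1 / (1 - |lam|^2) ...
print(norm_squared(inside, 200), 1 / (1 - abs(inside) ** 2))
# ... while on the unit circle they grow linearly with the length.
print(norm_squared(on_circle, 50), norm_squared(on_circle, 200))
```

The entrywise relation $x_{k+1} = \lambda x_k$ holds for every $\lambda$, but the resulting sequence is square summable only for $|\lambda| < 1$, which is exactly why $\sigma_p(R^*)$ is the open unit disc.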

In order to simplify certain spectral problems, like the one in the last example, we develop
further theory.

Proposition 11. Let 𝑋 be a Banach space and 𝑇 ∈ L (𝑋 ). Then 𝜌 (𝑇 ) is an open subset of F, for
any 𝜆, 𝜇 ∈ 𝜌 (𝑇 ) the resolvents 𝑅𝜆 (𝑇 ), 𝑅 𝜇 (𝑇 ) commute and they satisfy the resolvent formula

𝑅𝜆 (𝑇 ) − 𝑅 𝜇 (𝑇 ) = (𝜇 − 𝜆)𝑅 𝜇 (𝑇 )𝑅𝜆 (𝑇 ).

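Proof. Convergence questions aside for the moment, we have for $\lambda_0 \in \rho(T)$ by Theorem 18,

\[ \begin{aligned} (\lambda I - T)^{-1} &= \big[(\lambda - \lambda_0)I + (\lambda_0 I - T)\big]^{-1} = (\lambda_0 I - T)^{-1}\big[I - (\lambda_0 - \lambda)(\lambda_0 I - T)^{-1}\big]^{-1} \\ &= (\lambda_0 I - T)^{-1}\left[I + \sum_{n=1}^{\infty} (\lambda_0 - \lambda)^n (\lambda_0 I - T)^{-n}\right]. \end{aligned} \]

For this reason we now define with $\lambda_0 \in \rho(T)$,

\[ S_\lambda(T) := R_{\lambda_0}(T)\left[I + \sum_{n=1}^{\infty} (\lambda_0 - \lambda)^n \big(R_{\lambda_0}(T)\big)^n\right], \tag{4.4} \]

and note that the infinite series converges in operator norm provided $|\lambda - \lambda_0| < \|R_{\lambda_0}(T)\|^{-1}$ since $\|(R_{\lambda_0}(T))^n\| \leq \|R_{\lambda_0}(T)\|^n$. For such $\lambda$, $S_\lambda(T)$ is well defined and we check that (as in the proof of Theorem 18)

\[ (\lambda I - T)S_\lambda(T) = S_\lambda(T)(\lambda I - T) = I. \]

Hence $\lambda \in \rho(T)$ if $|\lambda - \lambda_0| < \|R_{\lambda_0}(T)\|^{-1}$ and $\lambda_0 \in \rho(T)$, moreover $S_\lambda(T) = R_\lambda(T)$ for such $\lambda$. This shows that $\rho(T) \subset \mathbb{F}$ is open and since

\[ R_\lambda(T) - R_\mu(T) = R_\lambda(T)(\mu I - T)R_\mu(T) - R_\lambda(T)(\lambda I - T)R_\mu(T) = R_\lambda(T)(\mu - \lambda)R_\mu(T), \]

the second claim follows as well after interchanging $\lambda$ with $\mu$. □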
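
For a matrix the resolvent formula of Proposition 11 can be verified directly. The following sketch assumes a concrete $2 \times 2$ matrix `T` with eigenvalues 2 and 3 and two points `lam`, `mu` in its resolvent set; all names and values are illustrative choices of ours.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse(m):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def resolvent(t, lam):
    """R_lam(t) = (lam * I - t)^(-1)."""
    return inverse([[lam - t[0][0], -t[0][1]],
                    [-t[1][0], lam - t[1][1]]])


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


T = [[2, 1], [0, 3]]          # eigenvalues 2 and 3
lam, mu = 1j, -1 + 0j         # both lie in the resolvent set

R_lam, R_mu = resolvent(T, lam), resolvent(T, mu)
lhs = [[R_lam[i][j] - R_mu[i][j] for j in range(2)] for i in range(2)]
prod = matmul(R_mu, R_lam)
rhs = [[(mu - lam) * prod[i][j] for j in range(2)] for i in range(2)]

print(max_abs_diff(lhs, rhs))                                   # resolvent formula
print(max_abs_diff(matmul(R_lam, R_mu), matmul(R_mu, R_lam)))   # commutation
```

Both printed numbers should be at the level of floating point rounding.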

Theorem 41. Let 𝑋 be a complex Banach space and 𝑇 ∈ L (𝑋 ). Then the spectrum 𝜎 (𝑇 ) is not
empty.

Proof. Again formally, by Theorem 18,

\[ (\lambda I - T)^{-1} = \lambda^{-1}(I - \lambda^{-1}T)^{-1} = \frac{1}{\lambda}\left[I + \sum_{n=1}^{\infty} T^n \lambda^{-n}\right] \]


which motivates that we consider

\[ \widehat{R}_\lambda(T) := \frac{1}{\lambda}\left[I + \sum_{n=1}^{\infty} T^n \lambda^{-n}\right], \qquad |\lambda| > \|T\|. \tag{4.5} \]

The series converges in the operator norm for $|\lambda| > \|T\|$ and its sum is indeed $\lambda R_\lambda(T) - I$. Thus, as $|\lambda| \to \infty$, we conclude

\[ \|R_\lambda(T)\| \to 0. \]

Now if $\sigma(T)$ were empty, then (4.4) implies that $g(\lambda) := f(R_\lambda(T)x) : \rho(T) \to \mathbb{C}$ is an analytic function in $\lambda \in \rho(T) = \mathbb{C}$ for any $f \in X^*$ and $x \in X$. Moreover, since $\|R_\lambda(T)\| \to 0$ for large $|\lambda|$ we have $g(\lambda) \to 0$ as $\lambda \to \infty$. By Liouville's theorem therefore $g(\lambda) \equiv 0$ which implies $R_\lambda(T) = 0 \in \mathcal{L}(X)$, contradicting (4.5). This completes our proof. □

Corollary 17. Let $X$ be a Banach space and $T \in \mathcal{L}(X)$. Then $\sigma(T) \subset \mathbb{F}$ is compact and

\[ \sigma(T) \subset \big\{\lambda \in \mathbb{F} : |\lambda| \leq \|T\|\big\}. \]

Proof. The proof of Theorem 41 revealed

\[ \big\{\lambda \in \mathbb{F} : |\lambda| > \|T\|\big\} \subset \rho(T), \]

and so $\sigma(T) = \mathbb{F} \setminus \rho(T) \subset \{\lambda \in \mathbb{F} : |\lambda| \leq \|T\|\}$ as claimed. But this subset of $\mathbb{F} = \mathbb{R}$ or $\mathbb{F} = \mathbb{C}$ is bounded and since $\rho(T)$ is open by Proposition 11 we obtain from Theorem 3 that $\sigma(T) = \mathbb{F} \setminus \rho(T)$ is compact. □

Definition 42. Let $X$ be a Banach space and $T \in \mathcal{L}(X)$. We call

\[ r(T) := \sup_{\lambda \in \sigma(T)} |\lambda| \]

the spectral radius of $T$. The supremum taken over the empty set is interpreted as $-\infty$.

Theorem 4.1: Spectral radius formula

Let $X$ be a complex Banach space and $T \in \mathcal{L}(X)$. Then

\[ r(T) = \lim_{n \to \infty} \sqrt[n]{\|T^n\|}. \]

If $X = \mathcal{H}$ is a complex Hilbert space and $T \in \mathcal{L}(\mathcal{H})$ normal, then $r(T) = \|T\|$.

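Proof. Choose $\lambda \in \sigma(T)$ and note that for any $n \in \mathbb{N}$ by the ordinary geometric sum,

\[ \lambda^n I - T^n = (\lambda I - T)\left[\sum_{k=0}^{n-1} \lambda^{n-1-k} T^k\right] = \left[\sum_{k=0}^{n-1} \lambda^{n-1-k} T^k\right](\lambda I - T), \qquad T^0 := I. \]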


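But either $\mathrm{Ker}(\lambda I - T) \neq \{0\}$ or $\mathrm{Ran}(\lambda I - T) \neq X$ since $\lambda \in \sigma(T)$, so $\lambda^n \in \sigma(T^n)$. Hence,

\[ \big(r(T)\big)^n = \Big(\sup_{\lambda \in \sigma(T)} |\lambda|\Big)^n = \sup_{\lambda \in \sigma(T)} |\lambda|^n \leq \sup_{\mu \in \sigma(T^n)} |\mu| = r(T^n), \]

and by Corollary 17, $\sigma(T^n) \subset \{\lambda \in \mathbb{C} : |\lambda| \leq \|T^n\|\}$, so $r(T) \leq \sqrt[n]{r(T^n)} \leq \sqrt[n]{\|T^n\|}$ valid for any $n \in \mathbb{N}$. Consequently,

\[ r(T) \leq \liminf_{n \to \infty} \sqrt[n]{\|T^n\|} := \lim_{n \to \infty}\Big(\inf_{k \geq n} \sqrt[k]{\|T^k\|}\Big). \]

We now show that

\[ \limsup_{n \to \infty} \sqrt[n]{\|T^n\|} := \lim_{n \to \infty}\Big(\sup_{k \geq n} \sqrt[k]{\|T^k\|}\Big) \leq r(T), \]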

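and recall to this end the proof workings of Theorem 41: for $|\lambda| > \|T\|$,

\[ R_\lambda(T) = \frac{1}{\lambda}\left[I + \sum_{n=1}^{\infty} T^n \lambda^{-n}\right] = \sum_{n=0}^{\infty} T^n \lambda^{-n-1}, \]

so that, for all $f \in X^*$ and $x \in X$, the series

\[ h(\lambda) := \sum_{n=0}^{\infty} f(T^n x)\lambda^{-n-1} \]

is the Laurent expansion of $\lambda \mapsto f(R_\lambda(T)x)$, which is analytic on $\rho(T) \supset \{\lambda \in \mathbb{C} : |\lambda| > r(T)\}$ by (4.4). Hence the series converges for all $|\lambda| > r(T)$, and not only for $|\lambda| > \|T\| \geq r(T)$. Fix such a $\lambda$. Then the sequence $(\lambda^{-n}T^n x)_{n=1}^{\infty} \subset X$ must converge weakly to zero for all $x \in X$. But weakly convergent sequences are bounded by Theorem 27, so for all $x \in X$,

\[ \sup\big\{\|\lambda^{-n}T^n x\| : n \in \mathbb{N}\big\} < \infty, \]

which says that $\mathcal{F} := \{\lambda^{-n}T^n\}_{n=1}^{\infty}$ is pointwise bounded. Hence, by Theorem 3.5,

\[ \sup\big\{\|\lambda^{-n}T^n\| : n \in \mathbb{N}\big\} < \infty, \]

so there exists $c > 0$ so that $\|T^n\| \leq c|\lambda|^n$ for all $n \in \mathbb{N}$. Thus,

\[ \limsup_{n \to \infty} \sqrt[n]{\|T^n\|} \leq |\lambda|, \]

and since $|\lambda| > r(T)$ was arbitrary, the left hand side is at most $r(T)$. Combined together,

\[ r(T) \leq \liminf_{n \to \infty} \sqrt[n]{\|T^n\|} \leq \limsup_{n \to \infty} \sqrt[n]{\|T^n\|} \leq r(T), \]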

which gives the desired identity. Finally, if $T \in \mathcal{L}(\mathcal{H})$ is normal then $\|Tx\| = \|T^*x\|$ for all $x \in \mathcal{H}$, so in particular $\|T^2x\| = \|T^*Tx\|$, from which we deduce

\[ \|T^2\| = \sup_{\|x\|=1} \|T^2x\| = \sup_{\|x\|=1} \|T^*Tx\| = \|T^*T\| = \|T\|^2, \]

with Theorem 35 in the last equality. In turn, since powers of a normal operator are again normal, $\|T^{2^n}\| = \|T\|^{2^n}$ for all $n \in \mathbb{N}$, so by the first part of the current proof,

\[ r(T) = \lim_{n \to \infty} \sqrt[n]{\|T^n\|} = \lim_{k \to \infty} \sqrt[2^k]{\|T^{2^k}\|} = \|T\|. \]

This completes our proof of the Theorem. □
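
For matrices the spectral radius formula is easy to watch numerically. The sketch below is ours and makes one deliberate substitution: it uses the Frobenius norm instead of the operator norm, which is legitimate here only because all norms on a finite-dimensional space are equivalent, so the limit of the $n$-th roots is the same. The matrix `T` is a hypothetical non-normal example whose only eigenvalue is $0.5$, while $\|T\| \geq \|Te_2\| > 1$.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def frobenius(m):
    """Frobenius norm (assumption: used in place of the operator norm)."""
    return sum(abs(v) ** 2 for row in m for v in row) ** 0.5


def root_norms(t, n_max):
    """Return [||t^n||^(1/n) for n = 1..n_max]."""
    values, power = [], [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, n_max + 1):
        power = matmul(power, t)
        values.append(frobenius(power) ** (1.0 / n))
    return values


T = [[0.5, 1.0], [0.0, 0.5]]   # not normal; sigma(T) = {0.5}
values = root_norms(T, 400)

print(values[0])     # the norm of T itself, larger than 1
print(values[9])     # n = 10
print(values[-1])    # n = 400, slowly approaching r(T) = 0.5
```

The slow decay of the roots towards $0.5$ reflects the polynomial factor $n$ in the off-diagonal entry of $T^n$; for a normal matrix and the operator norm the sequence would be constant, in line with the second part of Theorem 4.1.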


The following result is sometimes useful in determining spectra.

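Lemma 42 (Phillips). Let $X$ be a Banach space and $A \in \mathcal{L}(X)$. Then

\[ \sigma(A) = \sigma(A^t), \qquad R_\lambda(A^t) = \big(R_\lambda(A)\big)^t. \]

If $\mathcal{H}$ is a Hilbert space and $T \in \mathcal{L}(\mathcal{H})$, then

\[ \sigma(T^*) = \big\{\lambda \in \mathbb{F} : \overline{\lambda} \in \sigma(T)\big\}, \qquad R_{\overline{\lambda}}(T^*) = \big(R_\lambda(T)\big)^*. \]

Proof. This follows immediately from Theorems 21 and 35. □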

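Example 31. Consider the Hilbert space $(\ell_2(\mathbb{N}), \langle\cdot,\cdot\rangle)$ over $\mathbb{F} = \mathbb{C}$ of Example 12 and let $T : \ell_2(\mathbb{N}) \to \ell_2(\mathbb{N})$ denote the multiplication operator as studied in Example 4, i.e. for a given sequence of complex numbers $t = (t_n)_{n=1}^{\infty} \in \ell_\infty(\mathbb{N}) \setminus \{0\}$,

\[ T : (x_1, x_2, x_3, \ldots) \mapsto (t_1 x_1, t_2 x_2, t_3 x_3, \ldots). \]

Recall that $\|T\| = \|t\|_\infty =: r > 0$, so by Corollary 17, $\sigma(T) \subset \overline{\mathbb{D}_r(0)}$. Now if $Tx = \lambda x$ then, for all $k \in \mathbb{N}$, $(t_k - \lambda)x_k = 0 \in \mathbb{C}$, which yields the eigenvector-eigenvalue pairs $\{x = e_n, \lambda = t_n\}_{n=1}^{\infty}$ with $e_n \in \ell_2(\mathbb{N})$ as in Example 22, so

\[ \sigma_p(T) = \{t_n\}_{n=1}^{\infty}. \]

Next, defining $(R_\lambda(T)x)_n := (\lambda - t_n)^{-1}x_n$ for all $\lambda \notin \overline{\{t_n\}_{n=1}^{\infty}}$ and $x = (x_n)_{n=1}^{\infty} \in \ell_2(\mathbb{N})$ we compute at once $(\lambda I - T)R_\lambda(T) = R_\lambda(T)(\lambda I - T) = I$ on $\ell_2(\mathbb{N})$ and estimate

\[ \|R_\lambda(T)\| \leq \frac{1}{\mathrm{dist}\big(\lambda, \{t_n\}_{n=1}^{\infty}\big)}. \]

Thus, since $\sigma(T)$ is closed and contains $\sigma_p(T)$, $\rho(T) = \mathbb{C} \setminus \overline{\{t_n\}_{n=1}^{\infty}}$ and so

\[ \sigma(T) = \overline{\{t_n\}_{n=1}^{\infty}}. \]

It now remains to decide whether the limit points of $\{t_n\}_{n=1}^{\infty}$ that are not eigenvalues are part of the continuous or the residual spectrum: since $T^* \in \mathcal{L}(\ell_2(\mathbb{N}))$ acts as

\[ T^* : (x_1, x_2, x_3, \ldots) \mapsto (\overline{t_1}\,x_1, \overline{t_2}\,x_2, \overline{t_3}\,x_3, \ldots), \]

we conclude that $T$ is normal and so by Lemma 40, $\mathrm{Ker}(\lambda I - T) = \mathrm{Ker}(\overline{\lambda} I - T^*)$. In turn, if $\lambda \in \sigma_r(T)$, then by definition $\mathrm{Ker}(\lambda I - T) = \{0\}$ and $\mathrm{Ran}(\lambda I - T)$ is not dense in $\mathcal{H}$, but this contradicts Corollary 15. So,

\[ \sigma_r(T) = \emptyset, \qquad \sigma_c(T) = \overline{\{t_n\}_{n=1}^{\infty}} \setminus \{t_n\}_{n=1}^{\infty}. \]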
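
As a concrete instance of Example 31, one may take the sequence $t_n = 1/n$ (our choice, not fixed in the notes). The worked computation below simply specializes the formulas above to this case.

```latex
% Worked instance of Example 31 with the assumed choice t_n = 1/n.
\[
  T : (x_1, x_2, x_3, \ldots) \mapsto \bigl(x_1, \tfrac{1}{2}x_2, \tfrac{1}{3}x_3, \ldots\bigr),
  \qquad \|T\| = \|t\|_\infty = 1 .
\]
\[
  \sigma_p(T) = \{1/n\}_{n=1}^{\infty}, \qquad
  \sigma(T) = \overline{\{1/n\}_{n=1}^{\infty}} = \{0\} \cup \{1/n\}_{n=1}^{\infty}, \qquad
  \sigma_r(T) = \emptyset, \qquad
  \sigma_c(T) = \{0\} .
\]
% 0 is not an eigenvalue (Tx = 0 forces x = 0), and Ran(T) is dense but not all of \ell_2:
\[
  y = (1/n)_{n=1}^{\infty} \in \ell_2(\mathbb{N}), \qquad
  Tx = y \ \Longrightarrow\ x = (1, 1, 1, \ldots) \notin \ell_2(\mathbb{N}) .
\]
```

Here $r(T) = 1 = \|T\|$, in agreement with the second part of Theorem 4.1 for normal operators.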

Our final result in this section concerns operators on Hilbert spaces with built in symmetries
(such as the one exploited in Example 31):


Theorem 43. Let H be a Hilbert space and 𝑇 ∈ L (H ). Then,


(1) if 𝑇 is normal, then 𝜎𝑟 (𝑇 ) = ∅.
(2) if 𝑇 is unitary, then 𝜎𝑟 (𝑇 ) = ∅ and 𝜎 (𝑇 ) ⊂ {𝜆 ∈ F : |𝜆| = 1}.
(3) if 𝑇 is self-adjoint, then 𝜎𝑟 (𝑇 ) = ∅ and 𝜎 (𝑇 ) ⊂ R.

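Proof. By Corollary 15,

\[ \mathcal{H} = \mathrm{Ker}\big(\overline{\lambda} I - T^*\big) + \overline{\mathrm{Ran}(\lambda I - T)}, \qquad \mathrm{Ker}\big(\overline{\lambda} I - T^*\big) \cap \overline{\mathrm{Ran}(\lambda I - T)} = \{0\}, \]

and for normal $T \in \mathcal{L}(\mathcal{H})$, $\mathrm{Ker}(\overline{\lambda} I - T^*) = \mathrm{Ker}(\lambda I - T)$, see Lemma 40. Consequently, by definition of $\sigma_r(T)$, we must have $\sigma_r(T) = \emptyset$. Since unitary and self-adjoint operators are also normal we have therefore proven (1) and parts of (2), (3). Moving ahead, if $T \in \mathcal{L}(\mathcal{H})$ is unitary, then $\|T\|^2 = \|T^*T\| = 1$, so $\|T\| = 1$ which shows $\sigma(T) \subset \{\lambda \in \mathbb{F} : |\lambda| \leq 1\}$ by Corollary 17. But any unitary operator is invertible, so $0 \in \rho(T)$ and if $0 \leq |\lambda| < 1$, then

\[ \lambda I - T = -T(I - \lambda T^*). \]

This shows that $\lambda I - T$ is invertible for $0 \leq |\lambda| < 1$ by Theorem 18. Thus, $\sigma(T) \subset \{\lambda \in \mathbb{F} : |\lambda| = 1\}$ which completes the proof of (2). For (3) we begin with the following standard computation for any self-adjoint $T \in \mathcal{L}(\mathcal{H})$: if $Tx = \lambda x$ with $x \in \mathcal{H} \setminus \{0\}$, then

\[ \lambda\langle x, x\rangle = \langle Tx, x\rangle = \langle x, T^*x\rangle = \langle x, Tx\rangle = \overline{\lambda}\langle x, x\rangle \ \Rightarrow\ \lambda = \overline{\lambda}, \]

so $\sigma_p(T) \subset \mathbb{R}$. In turn, for $\lambda \notin \mathbb{R}$ we have $\mathrm{Ker}(\lambda I - T) = \{0\}$ and so $\mathrm{Ran}(\lambda I - T)$ is dense in $\mathcal{H}$ for the same $\lambda \notin \mathbb{R}$ by Corollary 15. Additionally, with arbitrary $a, b \in \mathbb{R}$ and $x \in \mathcal{H}$,

\[ \big\|\big((a + \mathrm{i}b)I - T\big)x\big\|^2 = \Big\langle x, \big((a - \mathrm{i}b)I - T\big)\big((a + \mathrm{i}b)I - T\big)x\Big\rangle = \|(aI - T)x\|^2 + b^2\|x\|^2, \]

so that $\|((a + \mathrm{i}b)I - T)x\| \geq |b|\,\|x\|$, which implies that $\mathrm{Ran}(\lambda I - T)$ is closed for $\lambda \notin \mathbb{R}$. Indeed, if $(y_n)_{n=1}^{\infty} \subset \mathrm{Ran}(\lambda I - T)$ is convergent, $y_n \to y$, say, then for some $(x_n)_{n=1}^{\infty} \subset \mathcal{H}$,

\[ y_n = (\lambda I - T)x_n, \quad n \in \mathbb{N}, \]

and so, by the above applied to differences, $\|y_n - y_m\| \geq |\Im\lambda|\,\|x_n - x_m\|$. This says that $(x_n)_{n=1}^{\infty} \subset \mathcal{H}$ is Cauchy and thus convergent, $x_n \to x \in \mathcal{H}$, say. Hence, by continuity of $T \in \mathcal{L}(\mathcal{H})$,

\[ y_n = (\lambda I - T)x_n \to (\lambda I - T)x \in \mathrm{Ran}(\lambda I - T), \quad \text{as } n \to \infty, \]

and so by the uniqueness of the limit, $y \in \mathrm{Ran}(\lambda I - T)$. All together, $\mathrm{Ran}(\lambda I - T) = \mathcal{H}$ for $\lambda \notin \mathbb{R}$ by Corollary 15, i.e. Theorem 3.7 yields that $\lambda I - T$ is invertible for $\lambda \notin \mathbb{R}$ and so $\sigma(T) \subset \mathbb{R}$, as claimed. □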
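
The key identity in the proof of part (3) can be checked on a Hermitian matrix. In the sketch below the matrix `T`, the numbers `a`, `b` and the vector `x` are arbitrary illustrative choices of ours; only the structure of the identity comes from the proof.

```python
def apply(m, v):
    """Apply a 2x2 matrix to a vector of length 2."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def norm_sq(v):
    return sum(abs(c) ** 2 for c in v)


T = [[2 + 0j, 1 - 1j], [1 + 1j, -1 + 0j]]   # Hermitian: T equals its conjugate transpose
a, b = 0.5, -2.0
x = [1 + 1j, 2 - 3j]

Tx = apply(T, x)
# ||((a + ib) I - T) x||^2
lhs = norm_sq([(a + 1j * b) * x[k] - Tx[k] for k in range(2)])
# ||(a I - T) x||^2 + b^2 ||x||^2
rhs = norm_sq([a * x[k] - Tx[k] for k in range(2)]) + b ** 2 * norm_sq(x)

print(abs(lhs - rhs))               # the two sides agree up to rounding
print(lhs >= b ** 2 * norm_sq(x))   # lower bound used to show closed range
```

The lower bound $\|(\lambda I - T)x\| \geq |\Im\lambda|\,\|x\|$ is what forces the range to be closed and the operator to be injective away from the real axis.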


4.4 Compact operators


One of the central results in finite-dimensional operator theory is the spectral theorem for
Hermitian operators on inner product spaces, asserting existence of an orthonormal basis for
the underlying vector space consisting of eigenvectors of the operator. In this final section,
we will discuss a class of infinite-dimensional operators that are so close to finite-dimensional
operators that the spectral theorem extends.

Definition 43. Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be two Banach spaces. $\mathrm{Com}(X, Y)$, the compact linear transformations, is the set of all $T \in \mathcal{L}(X, Y)$ so that

\[ \overline{T[B_1(0)]} \]

is compact in the norm topology. Here, $B_1(0) = \{x \in X : \|x\|_X < 1\}$ and we write $\mathrm{Com}(X)$ for $\mathrm{Com}(X, X)$.

Definition 44. We say a linear transformation 𝑇 ∈ L (𝑋, 𝑌 ) from one Banach space to another is
of finite rank if dim Ran(𝑇 ) < ∞.

Proposition 12. Let $X, Y$ be two Banach spaces. A linear transformation $T \in \mathcal{L}(X, Y)$ is compact if and only if for every bounded sequence $(x_n)_{n=1}^{\infty} \subset X$ the image sequence $(Tx_n)_{n=1}^{\infty} \subset Y$ has a subsequence convergent in $Y$.

Proof. By linearity of $T$ and completeness of $Y$, we see that $T \in \mathrm{Com}(X, Y)$ if and only if $T$ maps bounded sets in $X$ to totally bounded ones in $Y$. But a set $M$ in a Banach space is totally bounded if and only if every sequence in $M$ has a convergent subsequence with limit in $\overline{M}$, see Theorem 2. □

Clearly, Com(𝑋, 𝑌 ) ⊂ L (𝑋, 𝑌 ) by definition and we will see later on that the compact linear
transformations form a proper subset of L (𝑋, 𝑌 ) for generic 𝑋, 𝑌 . For now, we summarize the
basic algebraic properties of compact linear transformations.

Theorem 44. Let (𝑋, k · k𝑋 ), (𝑌 , k · k𝑌 ) and (𝑍, k · k𝑍 ) be Banach spaces.


(1) If 𝑇 ∈ L (𝑋, 𝑌 ) is of finite rank, then 𝑇 ∈ Com(𝑋, 𝑌 ).
(2) Com(𝑋, 𝑌 ) is a norm-closed subspace of L (𝑋, 𝑌 ).
(3) If 𝑇 ∈ L (𝑋, 𝑌 ), 𝑆 ∈ Com(𝑌 , 𝑍 ), then 𝑆 ◦ 𝑇 ∈ Com(𝑋, 𝑍 ).
(4) If 𝑆 ∈ Com(𝑋, 𝑌 ),𝑇 ∈ L (𝑌 , 𝑍 ), then 𝑇 ◦ 𝑆 ∈ Com(𝑋, 𝑍 ).
(5) 𝑇 ∈ Com(𝑋, 𝑌 ) if and only if 𝑇 𝑡 ∈ Com(𝑌 ∗, 𝑋 ∗ ).


Proof. If $T \in \mathcal{L}(X, Y)$ is of finite rank, then $\overline{T[B_1(0)]} \subset \mathrm{Ran}(T)$ is a bounded and closed set in a finite-dimensional space, hence compact by Theorem 1.3 and Theorem 3 and so sequentially compact. In (2), let $S, T \in \mathrm{Com}(X, Y)$ and $\alpha, \beta \in \mathbb{F}$. If $(x_n)_{n=1}^{\infty} \subset X$ is an arbitrary bounded sequence, compactness of $S$ yields existence of a (bounded) subsequence $(x_{n_k})_{k=1}^{\infty} \subset \{x_n\}_{n=1}^{\infty}$ so that $(Sx_{n_k})_{k=1}^{\infty} \subset Y$ converges. But by compactness of $T$, we can pick a further (bounded) subsequence $(x_{n_{k_m}})_{m=1}^{\infty} \subset \{x_{n_k}\}_{k=1}^{\infty}$ such that $(Tx_{n_{k_m}})_{m=1}^{\infty} \subset Y$ converges. Together,

\[ \big(\alpha Sx_{n_{k_m}} + \beta Tx_{n_{k_m}}\big)_{m=1}^{\infty} \subset Y \]

is a convergent subsequence of $(\alpha Sx_n + \beta Tx_n)_{n=1}^{\infty}$, so $\alpha S + \beta T \in \mathrm{Com}(X, Y)$ by Proposition 12.

Next, if $T \in \overline{\mathrm{Com}(X, Y)}$, for any given $\epsilon > 0$ there exists $S \in \mathrm{Com}(X, Y)$ so that $\|S - T\| < \epsilon/2$. But $S[B_1(0)] \subset Y$ is totally bounded, so there are finitely many $y_1, \ldots, y_n \in Y$ so that

\[ S[B_1(0)] \subset \bigcup_{j=1}^{n} \big\{y \in Y : \|y - y_j\|_Y < \epsilon/2\big\}, \]

and from the triangle inequality, $\|Tx - y_j\|_Y \leq \frac{\epsilon}{2} + \|Sx - y_j\|_Y$ for any $x \in B_1(0)$. Thus $T[B_1(0)]$ is covered by $\bigcup_{j=1}^{n} \{y \in Y : \|y - y_j\|_Y < \epsilon\}$ and hence totally bounded. This shows $T \in \mathrm{Com}(X, Y)$ and concludes our proof of (2). Moving ahead, if $(x_n)_{n=1}^{\infty} \subset X$ is an arbitrary bounded sequence, then $(Tx_n)_{n=1}^{\infty} \subset Y$ is bounded for all $T \in \mathcal{L}(X, Y)$. Hence compactness of $S$ implies existence of a subsequence $(Tx_{n_k})_{k=1}^{\infty} \subset \{Tx_n\}_{n=1}^{\infty}$ such that $(STx_{n_k})_{k=1}^{\infty} \subset \{STx_n\}_{n=1}^{\infty}$ is convergent. Thus $ST \in \mathrm{Com}(X, Z)$ by Proposition 12. Moreover, if $S \in \mathrm{Com}(X, Y)$, then we can select a convergent subsequence $(Sx_{n_k})_{k=1}^{\infty} \subset \{Sx_n\}_{n=1}^{\infty}$ and since for any $T \in \mathcal{L}(Y, Z)$,

\[ \|TSx_{n_k} - TSx_{n_m}\|_Z \leq \|T\|\,\|Sx_{n_k} - Sx_{n_m}\|_Y, \]

we see that $(TSx_{n_k})_{k=1}^{\infty} \subset \{TSx_n\}_{n=1}^{\infty}$ is convergent in $Z$. Thus $TS \in \mathrm{Com}(X, Z)$ and we have therefore completed the proofs of (3) and (4). Finally, if

\[ f \in \big\{f \in Y^* : \|f\| \leq 1\big\}, \]

then on the compact metric space $\overline{T[B_1(0)]} \ni y$, we have that $|f(y)| \leq \|y\|_Y$ and so $|f(y_1) - f(y_2)| \leq \|y_1 - y_2\|_Y$ for all $y_1, y_2 \in \overline{T[B_1(0)]}$ and $f \in (\overline{T[B_1(0)]})^*$ with $\|f\| \leq 1$. Hence

\[ \mathcal{F} := \Big\{f \in \big(\overline{T[B_1(0)]}\big)^* : \|f\| \leq 1\Big\} \]

is a family of uniformly bounded equicontinuous functions on a compact space. By Theorem 1.2, given any sequence $(f_n)_{n=1}^{\infty} \subset \mathcal{F}$ there exists $(f_{n_k})_{k=1}^{\infty} \subset \{f_n\}_{n=1}^{\infty}$ which converges uniformly on $\overline{T[B_1(0)]}$. But $f(Tx) = (T^t f)(x)$ by (3.5), so $(T^t f_{n_k})_{k=1}^{\infty}$ is uniformly convergent on $B_1(0)$. Given that

\[ \|f\| = \sup_{\|x\|_X = 1} |f(x)| \]

we conclude that $(T^t f_{n_k})_{k=1}^{\infty} \subset X^*$ is convergent in the norm on $X^*$ and so by Proposition 12, $T^t \in \mathrm{Com}(Y^*, X^*)$. For the converse statement let $T^t \in \mathrm{Com}(Y^*, X^*)$. Then, by the just proven part, $(T^t)^t \in \mathrm{Com}(X^{**}, Y^{**})$. Using now the canonical map $J : X \to X^{**}$, $x \mapsto j_x$ we have


$(T^t)^t \circ J \in \mathrm{Com}(X, Y^{**})$ by part (3) and Theorem 22. However, if $K : Y \to Y^{**}$, $y \mapsto k_y$ denotes the canonical map on $Y$, then, for any $f \in Y^*$, $x \in X$,

\[ \big((T^t)^t(Jx)\big)(f) = \big((T^t)^t j_x\big)(f) \overset{(3.5)}{=} j_x(T^t f) \overset{(3.6)}{=} (T^t f)(x) \overset{(3.5)}{=} f(Tx), \]

and likewise

\[ \big(K(Tx)\big)(f) = k_{Tx}(f) \overset{(3.6)}{=} f(Tx). \]

So $(T^t)^t \circ J = K \circ T \in \mathrm{Com}(X, Y^{**})$ and since $K : Y \to Y^{**}$ is an isometry as well, therefore $T \in \mathrm{Com}(X, Y)$ by Proposition 12. This concludes our proof of the theorem. □

Part (5) in Theorem 44 is commonly known as Schauder’s theorem and it yields the following
special case:

Corollary 18. Let 𝑇 ∈ L (H1, H2 ) be a linear transformation from one Hilbert space to another.
Then 𝑇 ∈ Com(H1, H2 ) if and only if 𝑇 ∗ ∈ Com(H2, H1 ).

Proof. If 𝑇 ∈ Com(H1, H2 ), then (4.1) yields 𝑇 ∗ ∈ Com(H2, H1 ) by (3) and (4) in Theorem 44.
Conversely use (𝑇 ∗ ) ∗ = 𝑇 and the first part of this proof. 

We now discuss a prototypical compact operator.

Example 32. Return to Example 6, so consider $X = (C[0,1], \|\cdot\|_\infty)$ and $K \in \mathcal{L}(X)$ with

\[ (Kf)(x) := \int_0^1 k(x, y) f(y)\,dy \]

where $k : [0,1] \times [0,1] \to \mathbb{C}$ is continuous on the square $[0,1] \times [0,1] \subset \mathbb{R}^2$. We know

\[ \|Kf\|_\infty \leq \|f\|_\infty \max\big\{|k(x, y)| : x, y \in [0,1]\big\}, \]

and for an arbitrary bounded sequence $(f_n)_{n=1}^{\infty} \subset C[0,1]$ and any $x, x' \in [0,1]$,

\[ \big|(Kf_n)(x) - (Kf_n)(x')\big| \leq \|f_n\|_\infty \int_0^1 \big|k(x, y) - k(x', y)\big|\,dy. \]

But $k$ is continuous on the compact square $[0,1] \times [0,1]$, so uniformly continuous by Theorem 4 and thus, for any $\epsilon > 0$, we can find $\delta = \delta(\epsilon) > 0$ so that $|k(x, y) - k(x', y)| < \epsilon$ whenever $|x - x'| < \delta$ for all $y \in [0,1]$. Hence, whenever $|x - x'| < \delta$,

\[ \big|(Kf_n)(x) - (Kf_n)(x')\big| \leq \epsilon\,\|f_n\|_\infty. \]

This shows that $\mathcal{F} := \{Kf_n\}_{n=1}^{\infty} \subset C[0,1]$ is a family of equicontinuous functions. However, the same family is uniformly bounded by our first estimate since $\{f_n\}_{n=1}^{\infty}$ is bounded, so by Theorem 1.2 there exists a uniformly convergent subsequence $(Kf_{n_k})_{k=1}^{\infty} \subset \mathcal{F}$. This shows $K \in \mathrm{Com}(X)$ by Proposition 12.


An important property of compact operators is given by

Theorem 45. Let 𝑇 ∈ Com(𝑋, 𝑌 ) be a compact linear transformation from one Banach space to
another. Then 𝑇 maps weakly convergent sequences into norm convergent sequences.

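Proof. Let $T \in \mathrm{Com}(X, Y)$ and $(x_n)_{n=1}^{\infty} \subset X$ with $x_n \rightharpoonup x$ as $n \to \infty$. Set $y_n := Tx_n$, and observe that for any $f \in Y^*$,

\[ f(y_n) - f(Tx) \overset{(3.5)}{=} (T^t f)(x_n - x), \]

so $y_n \rightharpoonup y := Tx \in Y$ as $n \to \infty$. Now suppose $y_n$ does not converge to $y$ in norm, i.e. there exist $\epsilon > 0$ and a subsequence $(y_{n_k})_{k=1}^{\infty} \subset \{y_n\}_{n=1}^{\infty}$ so that $\|y_{n_k} - y\|_Y \geq \epsilon$ for all $k \in \mathbb{N}$. But $y_{n_k} = Tx_{n_k}$ and $(x_{n_k})_{k=1}^{\infty}$ is bounded by Theorem 27. Thus, by compactness of $T$, there exists a subsequence $(y_{n_{k_m}})_{m=1}^{\infty} \subset \{y_{n_k}\}_{k=1}^{\infty}$ such that

\[ y_{n_{k_m}} \to \widehat{y} \quad \text{as } m \to \infty. \]

But this is impossible since then $y_{n_{k_m}} \rightharpoonup \widehat{y}$ as $m \to \infty$ by Theorem 27 and so $\widehat{y} = y$ since weak limits are unique, in contradiction to $\|y_{n_{k_m}} - y\|_Y \geq \epsilon$. For this reason, $y_n \to y$ in norm which yields the claim. □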

Remark. The converse of Theorem 45 holds if $X$ is a reflexive Banach space, see Reed-Simon, Chapter VI for this deeper result.
We will now show that Com(𝑋, 𝑌 ) is in general a proper subset of L (𝑋, 𝑌 ):

Lemma 46 (Riesz). Let $(X, \|\cdot\|)$ be a normed linear space and $Y \subset X$ a closed, proper subspace. Then for any $\epsilon > 0$, there exists $x = x(\epsilon) \in X$ with

\[ \|x\| = 1, \qquad \mathrm{dist}(x, Y) = \inf\big\{\|x - y\| : y \in Y\big\} \geq 1 - \epsilon. \]

Proof. Pick $z \in X \setminus Y$ and note that $\mathrm{dist}(z, Y) > 0$: for if $\mathrm{dist}(z, Y) = 0$, then there exists $(y_n)_{n=1}^{\infty} \subset Y$ such that $y_n \to z$ as $n \to \infty$. But $Y$ is closed, so we would have $z \in Y$ in contradiction to our initial choice. Next, assuming without loss of generality that $\epsilon \in (0, 1)$, we can find $y_* = y_*(\epsilon) \in Y$ so that

\[ 0 < \|z - y_*\| < \frac{\mathrm{dist}(z, Y)}{1 - \epsilon}. \]

Setting $x = x(\epsilon) := \frac{z - y_*}{\|z - y_*\|}$ we have $\|x\| = 1$ and for all $y \in Y$,

\[ \|x - y\| = \frac{\big\|z - (y_* + \|z - y_*\|\,y)\big\|}{\|z - y_*\|} \geq \frac{\mathrm{dist}(z, Y)}{\|z - y_*\|} > 1 - \epsilon \]

since $Y$ is a subspace. Consequently, $\mathrm{dist}(x, Y) \geq 1 - \epsilon$, as claimed. □

Proposition 13. Let 𝑋 be an infinite-dimensional Banach space, then 𝐼𝑋 ∉ Com(𝑋 ).


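Proof. Suppose $X$ is infinite dimensional and pick $x_1 \in X$ with $\|x_1\| = 1$. Now set $X_1 := \mathrm{span}\{x_1\} \subset X$, which is a closed, proper, one-dimensional subspace of $X$. Thus, by Lemma 46 we can find $x_2 \in X \setminus X_1$ with $\|x_2\| = 1$ and

\[ \mathrm{dist}(x_2, X_1) \geq \frac{1}{2} \ \Rightarrow\ \|x_2 - x_1\| \geq \frac{1}{2}. \]

Given this, we proceed inductively, applying Lemma 46 to the closed, proper subspaces $X_n := \mathrm{span}\{x_1, \ldots, x_n\}$, and construct an infinite sequence $(x_n)_{n=1}^{\infty} \subset \overline{B_1(0)}$ so that $\|x_n - x_m\| \geq \frac{1}{2}$ for all $n \neq m$. Clearly, this sequence has no convergent subsequence, hence $\overline{B_1(0)} = \overline{I_X[B_1(0)]}$ is not sequentially compact and thus not compact, i.e. $I_X \notin \mathrm{Com}(X)$. □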

Corollary 19. Let 𝑋, 𝑌 be Banach spaces and at least one of them be infinite-dimensional. Then
𝑇 ∈ Com(𝑋, 𝑌 ) is not invertible.

Proof. If $T \in \mathrm{Com}(X, Y) \subset \mathcal{L}(X, Y)$ were invertible, then there exists $T^{-1} \in \mathcal{L}(Y, X)$ so that $T^{-1}T = I_X$ and $TT^{-1} = I_Y$. But $\mathrm{Com}(X, Y)$ is a two-sided ideal by Theorem 44, parts (3) and (4), hence we would have $I_Y \in \mathrm{Com}(Y)$ and $I_X \in \mathrm{Com}(X)$. But at least one of those implications is false by Proposition 13, thus $T$ is not invertible. □

Corollary 20. Let 𝑋 be an infinite-dimensional Banach space and 𝑇 ∈ Com(𝑋 ) a compact


operator. Then 0 ∈ 𝜎 (𝑇 ).

Proof. Clear from Corollary 19 and Definition 40. 

Before discussing the spectral theory of compact operators in detail, we provide one useful
compactness test:

Definition 45. Let 𝑋, 𝑌 be two Banach spaces. FA(𝑋, 𝑌 ), the finite approximable linear
transformations, is the norm closure in L (𝑋, 𝑌 ) of the finite rank linear transformations.
As before, we first summarize the standard algebraic properties of any finite approximable
linear transformation, compare Theorem 44:

Theorem 47. Let (𝑋, k · k𝑋 ), (𝑌 , k · k𝑌 ) and (𝑍, k · k𝑍 ) be Banach spaces.


(1) If 𝑇 ∈ FA(𝑋, 𝑌 ), then 𝑇 ∈ Com(𝑋, 𝑌 ).
(2) FA(𝑋, 𝑌 ) is a norm-closed subspace of L (𝑋, 𝑌 ).
(3) If 𝑇 ∈ L (𝑋, 𝑌 ), 𝑆 ∈ FA(𝑌 , 𝑍 ), then 𝑆 ◦ 𝑇 ∈ FA(𝑋, 𝑍 ).
(4) If 𝑆 ∈ FA(𝑋, 𝑌 ),𝑇 ∈ L (𝑌 , 𝑍 ), then 𝑇 ◦ 𝑆 ∈ FA(𝑋, 𝑍 ).
(5) If 𝑇 ∈ FA(𝑋, 𝑌 ), then 𝑇 𝑡 ∈ FA(𝑌 ∗, 𝑋 ∗ ).


Proof. Since any 𝑇 ∈ FA(𝑋, 𝑌 ), by definition, is either of finite rank or the norm limit of a
sequence of finite rank linear transformations, claim (1) follows at once from Theorem 44, parts
(1) and (2). Next, FA(𝑋, 𝑌 ) is norm-closed by Definition and if 𝑆,𝑇 ∈ FA(𝑋, 𝑌 ) are of finite rank,
then for any 𝛼, 𝛽 ∈ F,
Ran(𝛼𝑆 + 𝛽𝑇 ) ⊂ Ran(𝑆) + Ran(𝑇 ),
i.e. 𝛼𝑆 + 𝛽𝑇 is of finite rank and so 𝛼𝑆 + 𝛽𝑇 ∈ FA(𝑋, 𝑌 ). On the other hand, if at least one of
𝑆,𝑇 is the norm limit of a sequence of finite rank linear transformations, then clearly 𝛼𝑆 + 𝛽𝑇
is also the norm limit of a sequence of finite rank linear transformations, i.e. we have verified
claim (2). Next, if 𝑇 ∈ L (𝑋, 𝑌 ), 𝑆 ∈ FA(𝑌 , 𝑍 ), then 𝑆𝑇 ∈ FA(𝑋, 𝑍 ) since

Ran(𝑆𝑇 ) ⊂ Ran(𝑆),

and likewise for $S \in \mathrm{FA}(X, Y)$, $T \in \mathcal{L}(Y, Z)$ we have $TS \in \mathrm{FA}(X, Z)$ given that

\[ \mathrm{Ran}(TS) \subset T\big[\mathrm{Ran}(S)\big]. \]

This completes our proof of (3) and (4). Finally, if $(T_n)_{n=1}^{\infty}$ is a sequence of finite rank linear transformations with $\|T_n - T\| \to 0$ as $n \to \infty$, then $T_n^t \in \mathcal{L}(Y^*, X^*)$ has also finite rank for any $n \in \mathbb{N}$, in fact $\dim \mathrm{Ran}(T_n) = \dim \mathrm{Ran}(T_n^t)$, see the exercises. Hence, with Theorem 21,

\[ \|T_n^t - T^t\| = \|T_n - T\| \to 0 \quad \text{as } n \to \infty, \]

so $T^t \in \mathrm{FA}(Y^*, X^*)$, as claimed in (5). □

At present we have established the sequence of subspace inclusions,

FA(𝑋, 𝑌 ) ⊂ Com(𝑋, 𝑌 ) ⊂ L (𝑋, 𝑌 )

where Com(𝑋, 𝑌 ) ⊂ L (𝑋, 𝑌 ) is proper in general, see Proposition 13. As it happens FA(𝑋, 𝑌 ) ⊂
Com(𝑋, 𝑌 ) is also proper in general, see Simon Part 4, however in certain cases one can do
better. Here is the aforementioned compactness test:

Theorem 4.2: Approximation property

Let H be a Hilbert space and 𝑌 a Banach space. Then

FA(H, 𝑌 ) = Com(H, 𝑌 ).

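Proof. We will show that any $T \in \mathrm{Com}(\mathcal{H}, Y)$ is the norm limit of a sequence of finite rank linear transformations. Assuming $\mathcal{H}$ is infinite-dimensional (otherwise $\mathrm{FA}(\mathcal{H}, Y) = \mathrm{Com}(\mathcal{H}, Y)$ since then any $T \in \mathcal{L}(\mathcal{H}, Y)$ is of finite rank), we select $\{x_n\}_{n=1}^{\infty} \subset \mathcal{H}$ and orthogonal projections $\{P_n\}_{n=1}^{\infty} \subset \mathcal{L}(\mathcal{H})$ as follows: in the first step, pick $x_1 \in \mathcal{H}$ with $\|x_1\|_{\mathcal{H}} = 1$ and $\|Tx_1\|_Y \geq \frac{1}{2}\|T\|$. Now define $P_1 \in \mathcal{L}(\mathcal{H})$ as the orthogonal projection onto $\mathrm{span}\{x_1\}$, i.e.

\[ P_1 x := \langle x, x_1\rangle x_1, \quad x \in \mathcal{H}. \]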


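In the second step, pick $x_2 \in \mathcal{H}$ with

\[ \|x_2\|_{\mathcal{H}} = 1, \qquad P_1 x_2 = 0, \qquad \|Tx_2\|_Y \geq \frac{1}{2}\|T(I - P_1)\|. \]

Also this is possible since by Theorem 2.2 and Theorem 38,

\[ \|T(I - P_1)\| = \sup_{\|x\|_{\mathcal{H}} = 1} \|T(I - P_1)x\|_Y = \sup_{\substack{\|x\|_{\mathcal{H}} = 1 \\ x \in \mathrm{Ker}(P_1)}} \|T(I - P_1)x\|_Y = \sup_{\substack{\|x\|_{\mathcal{H}} = 1 \\ P_1 x = 0}} \|Tx\|_Y, \]

since $\mathcal{H}$ is infinite-dimensional and since we can orthogonalize countably many vectors by Theorem 15. Proceeding inductively, with $P_n \in \mathcal{L}(\mathcal{H})$ as the orthogonal projection onto $\mathrm{span}\{x_1, \ldots, x_n\}$, i.e.

\[ \mathrm{FA}(\mathcal{H}, \mathcal{H}) \ni P_n x := \sum_{j=1}^{n} \langle x, x_j\rangle x_j, \quad x \in \mathcal{H}, \]

we can choose $x_{n+1} \in \mathcal{H}$ so that

\[ \|x_{n+1}\|_{\mathcal{H}} = 1, \qquad P_n x_{n+1} = 0, \qquad \|Tx_{n+1}\|_Y \geq \frac{1}{2}\|T(I - P_n)\|. \]

By this construction, $\{x_n\}_{n=1}^{\infty} \subset \mathcal{H}$ is an orthonormal set and so by (2.6), $x_n \rightharpoonup 0$ as $n \to \infty$. In turn by Theorem 45, $Tx_n \to 0$ in norm and so all together

\[ \|T - T \circ P_n\| \leq 2\|Tx_{n+1}\|_Y \to 0 \quad \text{as } n \to \infty, \]

i.e. $T \in \mathrm{Com}(\mathcal{H}, Y)$ is the norm limit of the sequence $(T \circ P_n)_{n=1}^{\infty} \subset \mathrm{FA}(\mathcal{H}, Y)$. This completes our proof. □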
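
To see the construction in the proof of Theorem 4.2 in a concrete case, consider the diagonal operator with $t_n = 1/n$ on $\ell_2(\mathbb{N})$ (an illustrative choice of ours). For it the greedy choice of the proof can be taken to be the standard unit vectors, and the approximation error is explicit.

```latex
% Assumed example: T = diag(1, 1/2, 1/3, ...) on \ell_2(\mathbb{N}), x_n = e_n.
\[
  P_n x = \sum_{j=1}^{n} \langle x, e_j\rangle e_j , \qquad
  (T \circ P_n)\,x = \bigl(x_1, \tfrac{1}{2}x_2, \ldots, \tfrac{1}{n}x_n, 0, 0, \ldots\bigr),
\]
\[
  \|T - T \circ P_n\| = \|T(I - P_n)\| = \sup_{k > n} \frac{1}{k} = \frac{1}{n+1}
  = \|T e_{n+1}\| \longrightarrow 0 \quad \text{as } n \to \infty .
\]
% Each T \circ P_n has rank n, so T is finite approximable and hence compact
% by Theorem 47, part (1).
```

In this example the choice $x_{n+1} = e_{n+1}$ even attains $\|Tx_{n+1}\| = \|T(I - P_n)\|$, so the factor $\frac{1}{2}$ in the proof is not needed.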

We will now start to clarify the important role of compact operators. In a way, those operators
imitate best a series of results from Linear Algebra in finite-dimensional vector spaces. For
instance, given a matrix $A \in \mathbb{F}^{n \times n}$ and $y \in \mathbb{F}^n$, we know that

\[ x - Ax = y \]

is solvable (in $x \in \mathbb{F}^n$) for all $y$ if and only if the corresponding homogeneous equation

\[ x = Ax \]

has no non-trivial solutions. Equivalently, and these two possibilities are exclusive, either the
homogeneous equation has a non-trivial solution or the matrix $I - A$ is invertible. Here is a
far-reaching generalization of this alternative:

Theorem 4.3: Fredholm Alternative

Let $X$ be a Banach space and $T \in \mathrm{Com}(X)$ a compact operator. Then $I - T$ is injective if
and only if it is surjective.


Remark. Returning to equations, Theorem 4.3 says for a given $A \in \mathrm{Com}(X)$,

(1) either for every $y \in X$, there is a unique $x \in X$ obeying $x - Ax = y$,
(2) or there is $x \neq 0$ obeying $x = Ax$.

And these two are exclusive, that is, if $x = Ax$ has a solution $x \neq 0$, then there is $y \in X$ so that
$x - Ax = y$ has no solution.
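
In finite dimensions the two cases of the alternative can be checked by hand. The following Python
sketch is an illustration only, with hypothetical helper names and two $2 \times 2$ matrices chosen
for this example: `A1` does not have 1 as an eigenvalue, so $x - Ax = y$ is uniquely solvable, while
`A2` fixes $e_1$, so $I - A$ is singular.

```python
def solve_2x2(m, y, tol=1e-12):
    """Solve m x = y for a 2x2 matrix by Cramer's rule; return None if m is singular."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < tol:
        return None
    return ((y[0] * d - b * y[1]) / det, (a * y[1] - c * y[0]) / det)


def identity_minus(a):
    """Return the 2x2 matrix I - A."""
    return ((1 - a[0][0], -a[0][1]), (-a[1][0], 1 - a[1][1]))


# Case (1): 1 is not an eigenvalue of A1, so x - A1 x = y has a unique solution.
A1 = ((0.5, 0.0), (0.0, 0.25))
x = solve_2x2(identity_minus(A1), (1.0, 3.0))

# Case (2): A2 e_1 = e_1, so x = A2 x has the non-trivial solution e_1
# and I - A2 is singular.
A2 = ((1.0, 0.0), (0.0, 0.5))
singular = solve_2x2(identity_minus(A2), (1.0, 0.0))
```

In case (2) the first component of $(I - A_2)x$ vanishes for every $x$, so $y = e_1$ is indeed not in
the range of $I - A_2$.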
We now prepare the necessary tool for the proof of Theorem 4.3:

Proposition 14. Let $T \in \mathrm{Com}(X)$ be a compact operator on a Banach space $X$. Then $\operatorname{Ker}(I - T)$
is finite-dimensional and $\operatorname{Ran}(I - T)$ is closed.

Proof. Since $I - T \in \mathcal{L}(X)$, we know that $M := \operatorname{Ker}(I - T)$ is a closed subspace of $X$ and so itself
a Banach space. But $T|_M \in \mathrm{Com}(M)$ acts as identity on $M$, so $M$ has to be finite-dimensional
by Proposition 13. Moving ahead, by Theorem 39, there exists a closed subspace $M' \subset X$
complementary to $M$ so that $(I - T)|_{M'}$ is injective and $\operatorname{Ran}((I - T)|_{M'}) = \operatorname{Ran}(I - T)$. Now select
$(y_n)_{n=1}^{\infty} \subset \operatorname{Ran}(I - T)$ with $y_n \to y \in X$ as $n \to \infty$. Then, by the above, there exists a unique
sequence $(x_n)_{n=1}^{\infty} \subset M'$ so that $y_n = x_n - T x_n$ for all $n \in \mathbb{N}$. Suppose $(x_n)_{n=1}^{\infty} \subset M' \subset X$ is
unbounded, i.e. there exists a subsequence $(x_{n_k})_{k=1}^{\infty} \subset \{x_n\}_{n=1}^{\infty}$ with $\|x_{n_k}\| \to \infty$ as $k \to \infty$.
Setting $z_k := x_{n_k} / \|x_{n_k}\|$ we then deduce

\[ z_k - T z_k = \frac{1}{\|x_{n_k}\|}\bigl(x_{n_k} - T x_{n_k}\bigr) = \frac{y_{n_k}}{\|x_{n_k}\|} \to 0 \quad \text{as } k \to \infty. \tag{4.6} \]

But $T$ is compact and $(z_k)_{k=1}^{\infty} \subset M' \subset X$ bounded, so $(T z_k)_{k=1}^{\infty} \subset X$ has a convergent
subsequence $(T z_{k_m})_{m=1}^{\infty}$, $T z_{k_m} \to w$, say. In turn, from (4.6), $z_{k_m} \to w$ as $m \to \infty$ with $\|w\| = 1$ and
$(I - T)w = 0$, i.e. $w \in M$. However, $(z_{k_m})_{m=1}^{\infty} \subset M'$ and $M'$ is closed, so $w \in M'$ which
contradicts $w \in M$ given that $w \neq 0$. In conclusion, the above sequence $(x_n)_{n=1}^{\infty} \subset X$ must be bounded,
so by compactness of $T$ we can select a convergent subsequence $(T x_{n_m})_{m=1}^{\infty} \subset (T x_n)_{n=1}^{\infty}$ and
since $y_n = x_n - T x_n$ converges as $n \to \infty$, so does $x_{n_m} = y_{n_m} + T x_{n_m}$ as $m \to \infty$. Writing
$x := \lim_{m \to \infty} x_{n_m}$ we conclude

\[ y \leftarrow y_{n_m} = x_{n_m} - T x_{n_m} \to x - T x \quad \text{as } m \to \infty, \]

and so $y \in \operatorname{Ran}(I - T)$. This concludes our proof of the proposition. □

Proof of Theorem 4.3. Suppose first $K := I - T \in \mathcal{L}(X)$ is injective but not surjective. Since
$K^n = I - T_n$ with $T_n \in \mathrm{Com}(X)$ for all $n \in \mathbb{N}$ by Theorem 44, we have that $X_n := \operatorname{Ran}(K^n)$
is closed for all $n \in \mathbb{N}$ by Proposition 14. Note that $X_{n+1} \subset X_n$. Next, $K$ being not surjective,
there exists $x \in X$ so $x \neq Ky$ for all $y \in X$, and since $K$ is injective therefore $K^n x \neq K^{n+1} y$
for all $y \in X$, so $K^n x \in X_n \setminus X_{n+1}$, i.e. $X_{n+1}$ is a proper closed subspace of $X_n$. Using Lemma 46
we can now find vectors $x_n = K^n y_n \in X_n$ so that $\|x_n\| = 1$ and $\operatorname{dist}(x_n, X_{n+1}) \geq \frac{1}{2}$. For
$m > n \geq 1$ we then have $T x_m + K x_n \in X_{n+1}$ and so

\[ \|T x_m - T x_n\| = \|T x_m + K x_n - x_n\| \geq \operatorname{dist}(x_n, X_{n+1}) \geq \frac{1}{2}. \]

This shows that $(T x_n)_{n=1}^{\infty}$ does not have any convergent subsequences, contradicting Proposition
12. In summary, injectivity of $K$ implies its surjectivity. Conversely, if $K$ is surjective, then
$K^t = I - T^t$ is injective by Theorem 21 and $T^t \in \mathrm{Com}(X^*)$ by Theorem 44. Thus, by the first part
of the current proof, $K^t$ is surjective and so invertible by Theorem 3.7. In turn, using Theorem
21 again, we conclude that $(K^t)^t$ is invertible and so $K$ injective since $(K^t)^t \circ J = J \circ K$ with
the canonical isometry $J : X \to X^{**}$. This concludes our proof of Theorem 4.3. □

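The following example shows that the compactness assumption in Theorem 4.3 is necessary.

Example 33. Consider $X = (C[0,1], \|\cdot\|_{\infty})$ as in Example 1 over $\mathbb{F} = \mathbb{C}$ and let $T \in \mathcal{L}(X)$ be
given by

\[ (T f)(x) := x f(x). \]

Note that $T$ is not compact: set $f_n(x) := \sin(2^n \pi (2x - 1))$, then $(f_n)_{n=1}^{\infty} \subset C[0,1]$ is a bounded
sequence, however $(T f_n)_{n=1}^{\infty}$ has no convergent subsequence. Indeed, for all $n > m$,

\[ \|T f_n - T f_m\|_{\infty} = \max_{x \in [0,1]} \bigl| x \bigl(f_n(x) - f_m(x)\bigr) \bigr|
   \geq \frac{1}{2} \max_{x \in [\frac{1}{2}, 1]} \bigl| f_n(x) - f_m(x) \bigr| \geq \frac{1}{2}, \]

and the lower bound is achieved at $x_m = \frac{1}{2}(1 + 2^{-(m+1)}) \in [\frac{1}{2}, 1]$, where $f_m(x_m) = 1$ and
$f_n(x_m) = 0$. Moreover, the conclusion of Theorem 4.3 fails for $T$: $\sigma_p(T) = \emptyset$, so $I - T$ is injective
while at the same time, since $\sigma(T) = [0,1] \subset \mathbb{R}$, see the exercises, $\lambda I - T$ is not invertible for
$\lambda \in [0,1]$. In particular, for $\lambda = 1$, the injective operator $I - T$ is not surjective.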
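
The separation estimate in Example 33 can be checked numerically at the witness points $x_m$. The
following Python sketch is an illustration only; the function names are hypothetical and the ranges
of $n$ and $m$ are an arbitrary choice for this check.

```python
import math


def f(n, x):
    """f_n(x) = sin(2^n * pi * (2x - 1)) on [0, 1]."""
    return math.sin(2 ** n * math.pi * (2 * x - 1))


def gap_at_witness(n, m):
    """|(T f_n - T f_m)(x_m)| at x_m = (1 + 2^-(m+1)) / 2, where (T f)(x) = x f(x)."""
    x_m = 0.5 * (1 + 2.0 ** (-(m + 1)))
    return abs(x_m * (f(n, x_m) - f(m, x_m)))


# Evaluate the gap for a few pairs n > m.
gaps = {(n, m): gap_at_witness(n, m) for m in range(1, 6) for n in range(m + 1, 8)}
```

Each computed gap equals $x_m$ up to rounding, hence is at least $\frac{1}{2}$, in line with the
estimate above.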
At this point we are ready to state and prove a spectral theorem for compact operators. First,
the following important information on the spectrum of $T \in \mathrm{Com}(X)$.

Theorem 4.4: Riesz-Schauder

Let $T \in \mathrm{Com}(X)$ be a compact operator on a Banach space $X$. Then $\sigma(T) \setminus \{0\}$ is at most a
discrete subset of $\mathbb{F} \setminus \{0\}$. That is, if non-empty, it is a set $\{\lambda_n\}_{n=1}^{N}$ of points with $\lambda_n \to 0$ if $N$ is
infinite. Each $\lambda_n$ is an eigenvalue of finite geometric multiplicity, i.e. $\dim \operatorname{Ker}(\lambda_n I - T) < \infty$
for all $n$.

Proof. If $\lambda \neq 0$ is not an eigenvalue of $T$, then $I - \lambda^{-1} T = \lambda^{-1}(\lambda I - T)$ is invertible by Theorem
4.3 and thus $\lambda \notin \sigma(T)$. If on the other hand $\lambda \neq 0$ is an eigenvalue of $T$, then $\operatorname{Ker}(\lambda I - T) =
\operatorname{Ker}(I - \lambda^{-1} T)$, so by Proposition 14, $\dim \operatorname{Ker}(\lambda I - T) < \infty$. It now remains to verify the
discreteness of $\sigma(T) \setminus \{0\}$ and thus show that for any $\epsilon > 0$ there are at most finitely many
eigenvalues $\lambda$ of $T$ with $|\lambda| \geq \epsilon$. Assuming the contrary, there exists an infinite sequence
$(\lambda_n)_{n=1}^{\infty} \subset \mathbb{F}$ of eigenvalues with $|\lambda_n| \geq \epsilon$ and we may assume $\lambda_i \neq \lambda_j$ for $i \neq j$. Since $\sigma(T) \subset \mathbb{F}$
is compact by Corollary 17 there exists further a convergent subsequence $(\lambda_{n_k})_{k=1}^{\infty} \subset \{\lambda_n\}_{n=1}^{\infty}$,
$\lambda_{n_k} \to \lambda$, say with $|\lambda| \geq \epsilon$. But for each $\mu_k := \lambda_{n_k}$ there exists an eigenvector $x_k \in X \setminus \{0\}$ of
unit length and eigenvectors corresponding to distinct eigenvalues are linearly independent.
Consequently,

\[ X_{k-1} := \operatorname{span}\{x_1, \ldots, x_{k-1}\} \]

is a closed, proper subspace of $X_k$. Hence, by Lemma 46, we can select $x_k' \in X_k$ of unit length so
$\operatorname{dist}(x_k', X_{k-1}) \geq \frac{1}{2}$. However, using that $x_k' = \sum_{i=1}^{k} \alpha_i x_i$ with $\alpha_i \in \mathbb{F}$, we derive

\[ y_k := (\mu_k I - T) x_k' = \sum_{i=1}^{k} (\mu_k - \mu_i) \alpha_i x_i = \sum_{i=1}^{k-1} (\mu_k - \mu_i) \alpha_i x_i \in X_{k-1}. \]

So, for any $k > m \geq 2$,

\[ \|\mu_k^{-1} T x_k' - \mu_m^{-1} T x_m'\| = \|\mu_k^{-1} y_k + \mu_m^{-1} T x_m' - x_k'\| \geq \operatorname{dist}(x_k', X_{k-1}) \geq \frac{1}{2}, \]

since $T x_m' \in X_{k-1}$ for $k > m \geq 2$. In summary, $(\mu_k^{-1} T x_k')_{k=1}^{\infty}$ has no convergent subsequences
and since $\mu_k \to \lambda \neq 0$, the same is true for $(T x_k')_{k=1}^{\infty}$. But this contradicts Proposition 12 since
$(x_k')_{k=1}^{\infty}$ was bounded, so $\sigma_p(T) \setminus \{0\}$ must be discrete. □
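
The discreteness statement can be seen on a diagonal model. The Python sketch below is an
illustration only and assumes, purely for this example, the compact diagonal operator with
eigenvalues $\lambda_n = 1/n$; the function name and the chosen thresholds are hypothetical.

```python
def count_large_eigenvalues(eps, n_max):
    """Count the eigenvalues 1/n, 1 <= n <= n_max, with |1/n| >= eps."""
    return sum(1 for n in range(1, n_max + 1) if 1.0 / n >= eps)


# The count depends on eps but not on how far the diagonal is truncated.
counts = {
    eps: [count_large_eigenvalues(eps, n_max) for n_max in (10**3, 10**4)]
    for eps in (0.15, 0.015)
}
```

For each threshold only finitely many eigenvalues lie outside the disc of radius $\epsilon$, and the
only possible accumulation point is $0$.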

Equipped with Theorem 4.4 we now state the main two results of this section. First, we require
the following notion of positivity.

Definition 46. Let $(\mathcal{H}, \langle \cdot, \cdot \rangle)$ be a Hilbert space and $T \in \mathcal{L}(\mathcal{H})$ self-adjoint. We say $T$ is positive
(respectively, strictly positive), written $T \geq 0$ (respectively, $T > 0$), if and only if for all $x \in \mathcal{H}$,

\[ \langle T x, x \rangle \geq 0 \qquad (\text{respectively, } \langle T x, x \rangle > 0 \text{ if } x \neq 0). \]

Theorem 4.5: Hilbert-Schmidt

Let $T \in \mathrm{Com}(\mathcal{H})$ be a positive compact operator on a separable Hilbert space $\mathcal{H}$. Then
there is an orthonormal basis of eigenvectors whose eigenvalues accumulate only at 0;
more specifically, there exist two countable orthonormal sets, $\{x_n\}_{n=1}^{N}$ and $\{x_m'\}_{m=1}^{M}$, whose
union is an orthonormal basis so that

\[ T x_n = \lambda_n x_n, \qquad \lambda_1 \geq \lambda_2 \geq \ldots > 0; \qquad \lambda_n \downarrow 0 \text{ if } N = \infty; \qquad T x_m' = 0. \]

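Our proof of Theorem 4.5 relies on the following auxiliary result.

Proposition 15. Let $\mathcal{H}$ be a Hilbert space and $(x_n)_{n=1}^{\infty}, (y_n)_{n=1}^{\infty} \subset \mathcal{H}$ two sequences.

(1) If $x_n \rightharpoonup x$ weakly and $y_n \to y$ in norm, then $\langle x_n, y_n \rangle \to \langle x, y \rangle$.
(2) If $T \in \mathrm{Com}(\mathcal{H})$ and $x_n \rightharpoonup x$ weakly, then $\langle T x_n, x_n \rangle \to \langle T x, x \rangle$.

Proof. Note that

\[ \langle x_n, y_n \rangle - \langle x, y \rangle = \langle x_n, y_n - y \rangle + \langle x_n - x, y \rangle, \]

and thus with Corollary 3, Theorem 27 and $f_y(x) := \langle x, y \rangle \in \mathcal{H}^*$,

\[ \bigl| \langle x_n, y_n \rangle - \langle x, y \rangle \bigr| \leq c \, \|y_n - y\| + |f_y(x_n - x)| \to 0, \quad \text{as } n \to \infty, \]

by the assumed weak and norm convergence, where $c := \sup_{n} \|x_n\| < \infty$ since weakly convergent
sequences are bounded. Part (2) is a consequence of (1) and Theorem 45. □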

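Proof of Theorem 4.5. Set $X := \overline{B_1(0)} = \{x \in \mathcal{H} : \|x\| \leq 1\}$ and consider the map
$f : X \to [0, \infty) \subset \mathbb{R}$,

\[ f(x) := \langle T x, x \rangle \geq 0, \qquad x \in X, \]

defined on the closed ball $X$. Clearly, the image $W := f[X]$ is bounded and it is also closed: if
$(y_n)_{n=1}^{\infty} \subset W$ is convergent, $y_n \to y$, say, then $y_n = f(\tilde{x}_n)$ for some sequence $(\tilde{x}_n)_{n=1}^{\infty} \subset X$. But
$X$ is a bounded, closed and convex subset of the Hilbert space $\mathcal{H}$, and $\mathcal{H}$ is reflexive by Example
21 and Proposition 8. Hence, by Theorem 32, we can select a subsequence $(\tilde{x}_{n_k})_{k=1}^{\infty} \subset \{\tilde{x}_n\}_{n=1}^{\infty}$
such that $\tilde{x}_{n_k} \rightharpoonup x$ as $k \to \infty$, where $x \in X$ because $\|x\| \leq \liminf_{k \to \infty} \|\tilde{x}_{n_k}\| \leq 1$, and thus, by
Proposition 15,

\[ y \leftarrow y_{n_k} = f(\tilde{x}_{n_k}) = \langle T \tilde{x}_{n_k}, \tilde{x}_{n_k} \rangle \to \langle T x, x \rangle = f(x), \quad \text{as } k \to \infty. \]

Hence, $y \in W$, i.e. $W \subset \mathbb{R}_{\geq 0}$ is both bounded and closed, so compact by Theorem 3.
Consequently, there exists $x_1 \in X$ such that

\[ \langle T x_1, x_1 \rangle = \sup_{x \in X} \langle T x, x \rangle = \|T\|, \]

see the exercises for the second equality. Note that we may assume $\|x_1\| = 1$, for if $0 < \|x_1\| < 1$,
then simply use

\[ \langle T x_1, x_1 \rangle \leq \bigl\langle T (x_1 / \|x_1\|), x_1 / \|x_1\| \bigr\rangle, \]

and if $x_1 = 0$, then $T = 0$ and any unit vector will do. We now show that $x_1$ is an eigenvector of $T$
with corresponding eigenvalue $\lambda_1 := \|T\| \geq 0$. First, if $y \in \mathcal{H}$ is orthogonal to $x_1$, set

\[ z_t := \frac{1}{\|x_1 + t y\|} (x_1 + t y), \qquad t \in \mathbb{R}, \]

and note that $\|x_1 + t y\|^2 = 1 + t^2 \|y\|^2$, i.e.

\[ \forall\, t \in \mathbb{R} : \quad \langle T z_t, z_t \rangle = \frac{1}{1 + t^2 \|y\|^2} \Bigl[ \langle T x_1, x_1 \rangle + 2t \, \Re \langle T x_1, y \rangle + t^2 \langle T y, y \rangle \Bigr]. \]

But $\langle T z_t, z_t \rangle \leq \langle T x_1, x_1 \rangle$, so we must have $\Re \langle T x_1, y \rangle = 0$. But if $\langle x_1, y \rangle = 0$, then likewise
$\langle x_1, \mathrm{i} y \rangle = 0$ and repeating the last computation with $\mathrm{i} y$ instead of $y$ we also deduce $\Im \langle T x_1, y \rangle = 0$,
i.e. all together $\langle T x_1, y \rangle = 0$, i.e. by choice of $y$,

\[ T x_1 \in (\{x_1\}^{\perp})^{\perp} = \operatorname{span}\{x_1\} = \{\alpha x_1 : \alpha \in \mathbb{F}\}. \]

This means $T x_1 = \alpha x_1$ and since $\langle T x_1, x_1 \rangle = \|T\|$ we conclude $T x_1 = \lambda_1 x_1$ with $\lambda_1 := \|T\| \geq 0$.
Next, define

\[ \mathcal{H}_1 := (\operatorname{span}\{x_1\})^{\perp} = \{x \in \mathcal{H} : \langle x, x_1 \rangle = 0\}, \]

and note that for any $x \in \mathcal{H}_1$,

\[ \langle T x, x_1 \rangle = \langle x, T x_1 \rangle = \lambda_1 \langle x, x_1 \rangle = 0 \]

so $T[\mathcal{H}_1] \subset \mathcal{H}_1$, i.e. we have for $T_1 := T|_{\mathcal{H}_1}$ that $T_1 \in \mathrm{Com}(\mathcal{H}_1)$. Moreover, $T_1$ is also positive,
so as above, there is $x_2 \in \mathcal{H}_1$ of unit length and $\lambda_2 \geq 0$ such that $T x_2 = T_1 x_2 = \lambda_2 x_2$. Moreover
$\lambda_2 = \|T_1\| \leq \|T\| = \lambda_1$. Proceeding inductively, we find $x_1, \ldots, x_n$ of unit length so $T x_j = \lambda_j x_j$
with $\lambda_j = \|T_{j-1}\| \leq \|T_{j-2}\| = \lambda_{j-1}$, where $T_0 := T$, and then

\[ \mathcal{H}_j = \{x \in \mathcal{H} : \langle x, x_k \rangle = 0, \ k = 1, \ldots, j\}, \]

so that $x_{j+1} \in \mathcal{H}_j$ with $T x_{j+1} = \|T_j\| x_{j+1}$ where $T_j := T|_{\mathcal{H}_j}$ is compact and positive. By
construction, $\{x_n\}_{n=1}^{\infty}$ is an orthonormal set of eigenvectors, $\lambda_1 \geq \lambda_2 \geq \ldots \geq 0$ and
$\lambda_{j+1} = \|T_j\| = \|T|_{\mathcal{H}_j}\|$. Moving ahead, by (2.6), we have $x_n \rightharpoonup 0$ as $n \to \infty$, and so by Theorem 45,

\[ \lambda_j = \|\lambda_j x_j\| = \|T x_j\| \to 0 \quad \text{as } j \to \infty. \]

Now set $\mathcal{H}_{\infty} := \bigcap_{j=1}^{\infty} \mathcal{H}_j \subset \mathcal{H}$, recall the nesting $\mathcal{H}_{j+1} \subset \mathcal{H}_j$ and obtain in turn $\|T|_{\mathcal{H}_{\infty}}\| \leq \lambda_j$,
valid for all $j \in \mathbb{N}$. Hence, $T|_{\mathcal{H}_{\infty}}$ is the zero operator, that is $\mathcal{H}_{\infty} \subset \operatorname{Ker}(T)$. Now if $\mathcal{H}_{\infty} = \{0\}$,
then $\{x_n\}_{n=1}^{\infty}$ is already an orthonormal basis for $\mathcal{H}$. If not, then $\mathcal{H}_{\infty} \subset \mathcal{H}$, as a separable Hilbert
space, has an orthonormal basis $\{x_m'\}_{m=1}^{M}$ by Theorem 15 and we have

\[ \mathcal{H} = \overline{\operatorname{span}}\{x_m'\}_{m=1}^{M} + \mathcal{H}_{\infty}^{\perp}
   = \overline{\operatorname{span}}\{x_m'\}_{m=1}^{M} + \overline{\operatorname{span}}\{x_j\}_{j=1}^{\infty}, \qquad
   \overline{\operatorname{span}}\{x_m'\}_{m=1}^{M} \cap \overline{\operatorname{span}}\{x_j\}_{j=1}^{\infty} = \{0\}. \]

Hence the union of $\{x_j\}_{j=1}^{\infty}$ and $\{x_m'\}_{m=1}^{M}$ is an orthonormal basis for $\mathcal{H}$ and we almost have
the prescribed orthonormal sets: if $\lambda_n > 0$ for all $n \in \mathbb{N}$, we have exactly what was claimed, and if
$\lambda_N > 0$ and $\lambda_{N+1} = 0$, then we rename $\{x_j\}_{j=N+1}^{\infty}$ as vectors $x_m'$ with $m > M$, or interlace them with
the original $x_m'$ vectors if $M = \infty$. This concludes our proof. □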
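
The first two steps of the proof can be traced on a small example. The Python sketch below is an
illustration only: it assumes the positive symmetric matrix $T = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ on
$\mathbb{R}^2$ (eigenvalues 3 and 1), replaces the compactness argument by a grid search over the unit
circle, and uses hypothetical helper names.

```python
import math

T = ((2.0, 1.0), (1.0, 2.0))  # symmetric with eigenvalues 3 and 1, hence positive


def apply(t, v):
    """Matrix-vector product for a 2x2 matrix."""
    return (t[0][0] * v[0] + t[0][1] * v[1], t[1][0] * v[0] + t[1][1] * v[1])


def inner(u, v):
    """Real inner product on R^2."""
    return u[0] * v[0] + u[1] * v[1]


def rayleigh(t, a):
    """f(x) = <Tx, x> for the unit vector x = (cos a, sin a)."""
    x = (math.cos(a), math.sin(a))
    return inner(apply(t, x), x)


def maximize_rayleigh(t, steps=1000):
    """Grid search for a maximizer of <Tx, x> over unit vectors."""
    best = max((k * math.pi / steps for k in range(steps)), key=lambda a: rayleigh(t, a))
    return rayleigh(t, best), (math.cos(best), math.sin(best))


lam1, x1 = maximize_rayleigh(T)   # step 1: lam1 = sup <Tx, x> = ||T||
x2 = (-x1[1], x1[0])              # unit vector spanning H_1 = {x1}^perp in R^2
lam2 = inner(apply(T, x2), x2)    # step 2: lam2 = ||T restricted to H_1||
```

Here the maximizer is $x_1 = (1, 1)/\sqrt{2}$ with $\lambda_1 = 3 = \|T\|$, and restricting to the orthogonal
complement yields $\lambda_2 = 1 \leq \lambda_1$, as in the inductive construction.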

Theorem 4.6: Spectral decomposition

Let $T \in \mathrm{Com}(\mathcal{H})$ be a compact self-adjoint operator on a Hilbert space $\mathcal{H}$. Then there
exists a countable orthonormal set $\{x_n\}_{n=1}^{N}$ of eigenvectors of $T$ so that

\[ T x_n = \lambda_n x_n, \qquad |\lambda_1| \geq |\lambda_2| \geq \ldots > 0; \qquad \lambda_n \to 0 \text{ if } N = \infty, \]

and for all $x \in \mathcal{H}$,

\[ T x = \sum_{n=1}^{N} \lambda_n \langle x, x_n \rangle x_n, \tag{4.7} \]

where the series in the right hand side of (4.7) converges in norm.

Theorem 4.6 constitutes a spectral theorem for self-adjoint compact operators on any Hilbert
space H . It is the last result of this module and its proof relies on Theorem 4.5.

Proof of Theorem 4.6. Modify the function $f : X \to [0, \infty)$ in the proof of Theorem 4.5 to
$f(x) := |\langle T x, x \rangle| \geq 0$. Then repeat the same inductive argument and construct a countable
orthonormal set $\{x_n\}_{n=1}^{N}$ of eigenvectors of $T$ with associated eigenvalues $(\lambda_n)_{n=1}^{N}$ such that
$|\lambda_1| \geq |\lambda_2| \geq \ldots > 0$ and $|\lambda_n| \downarrow 0$ as $n \to \infty$. With $T_n := T|_{\mathcal{H}_n}$ as before we then have for any
$x \in \mathcal{H}$,

\[ T_n \Bigl( \underbrace{x - \sum_{k=1}^{n} \langle x, x_k \rangle x_k}_{\in \, \mathcal{H}_n} \Bigr) = T x - \sum_{k=1}^{n} \lambda_k \langle x, x_k \rangle x_k, \]

and thus, as $n \to \infty$,

\[ \Bigl\| T x - \sum_{k=1}^{n} \lambda_k \langle x, x_k \rangle x_k \Bigr\| \leq \|T_n\| \, \Bigl\| x - \sum_{k=1}^{n} \langle x, x_k \rangle x_k \Bigr\| \leq 2 |\lambda_n| \, \|x\| \to 0, \]

having used (2.6) in the last inequality. This verifies (4.7) and completes the proof of the
theorem. □
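
The expansion (4.7) can be verified directly in a finite-dimensional case. The Python sketch below
is an illustration only: it assumes the self-adjoint (but not positive) matrix
$T = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}$ on $\mathbb{R}^2$ with eigenvalues $3$ and $-1$, and the helper names and the test
vector are hypothetical.

```python
import math

T = ((1.0, 2.0), (2.0, 1.0))
s = 1.0 / math.sqrt(2.0)
# Orthonormal eigenvectors with T x_n = lambda_n x_n, ordered so that |3| >= |-1| > 0.
eigenpairs = [(3.0, (s, s)), (-1.0, (s, -s))]


def inner(u, v):
    """Real inner product on R^2."""
    return u[0] * v[0] + u[1] * v[1]


def apply(t, v):
    """Matrix-vector product for a 2x2 matrix."""
    return (t[0][0] * v[0] + t[0][1] * v[1], t[1][0] * v[0] + t[1][1] * v[1])


def spectral_sum(pairs, x):
    """Right-hand side of (4.7): sum over n of lambda_n <x, x_n> x_n."""
    out = [0.0, 0.0]
    for lam, v in pairs:
        c = lam * inner(x, v)
        out[0] += c * v[0]
        out[1] += c * v[1]
    return tuple(out)


x = (0.3, -1.7)
lhs = apply(T, x)                   # T x computed directly
rhs = spectral_sum(eigenpairs, x)   # T x computed via (4.7)
```

Both sides agree up to rounding. In infinite dimensions the same sum is a norm-convergent series,
which is the content of the theorem.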

Subject Index

$C[0,1]$-space, 18, 33
$\ell^p(\mathbb{N})$-space, 13
$c_0(\mathbb{N})$-space, 32
$c_{00}(\mathbb{N})$-space, 32

absolutely summable sequence, 34
antisymmetric relation, 4
Approximation property, 80
Arzelà-Ascoli theorem, 8

Baire category theorem, 50
Banach space, 31
Banach space adjoint, 43
Banach space isomorphism, 34
Banach-Steinhaus theorem, 51
basis, 8
Bessel's inequality, 20
best approximation, 23
bidual, 43
bounded linear operator, 9
bounded linear transformation, 9

canonical map, 43
Cantor theorem, 49
Cauchy sequence, 6
Cauchy-Schwarz inequality, 20
chain, 4
closed ball, 5
closed graph theorem, 55
closed set, 5
closure, 5
compact linear transformation, 75
compact space, 6
complementary subspace, 54
complete metric space, 6
complete orthonormal system, 26
continuous function, 5
continuous linear functional, 14
continuous spectrum, 69
convex combination, 23
convex function, 37
convex set, 23

dense subset, 5
densely defined linear transformation, 36
dependent, 8
dimension, 8
direct sum, 54
direct sum of Banach spaces, 54
direct sum of Hilbert spaces, 23
Dirichlet-Heine theorem, 7
dual linear transformation, 42
dual space, 14

eigenvalue, 69
eigenvector, 69
equicontinuous, 7
equivalent norms, 10

finite approximable linear transformations, 79
finite rank linear transformations, 75
finite-dimensional, 8
Fourier expansion, 27
Fredholm alternative, 81

Gram-Schmidt procedure, 28
graph, 55

Hölder inequality, 12, 14
Hahn-Banach theorem, 37, 39
Heine-Borel theorem, 7
Hellinger-Toeplitz theorem, 56

Helly's theorem, 60
Hilbert space, 21
Hilbert space adjoint, 43, 62
Hilbert space isomorphism, 29
Hilbert-Schmidt theorem, 84

independent, 8
independent spanning set, 28
inner product, 18
inner product space, 18
integral operator, 17, 77
interior, 5
inverse mapping theorem, 54
invertible linear transformation, 34
isometry, 35

Jordan-von Neumann theorem, 20

kernel, 15

left shift operator, 17
limit point, 5

metric space, 5
Minkowski inequality, 13, 14
multiplication operator, 16

Neumann series, 35
norm, 8
norm convergence, 61
norm equivalence, 54
normal operator, 64
normed linear space (NLS), 9

open ball, 5
open mapping theorem, 52
operator norm, 14
orthogonal, 19
orthogonal complement, 24
orthogonal projection, 67
orthonormal, 19
orthonormal basis, 26

parallelogram identity, 19
Parseval relation, 27
partial order, 4
partially ordered set, 4
point spectrum, 69
polarization, 19
positive definite, 18
positive operator, 84
pre-Hilbert space, 21
principle of uniform boundedness (PUB), 51
projection, 67
projection lemma, 24
Pythagorean theorem, 19

range, 15
real Hilbert space, 21
real inner product space, 18
reflexive Banach space, 44
reflexive relation, 3
relation, 3
residual spectrum, 69
resolvent, 68
resolvent formula, 70
resolvent set, 68
Riesz representation, 25
Riesz's lemma, 78
Riesz-Schauder theorem, 83
right shift operator, 16

Schauder's theorem, 77
second dual space, 43
self-adjoint operator, 64
seminorm, 9
separable space, 5, 28
separation property, 41
sequentially compact, 6
span, 8
spectral decomposition theorem, 86
spectral radius, 71
spectral radius formula, 71
spectrum, 68
strong convergence, 61
subspace, 8
summable sequence, 34
symmetric function, 39
symmetric relation, 3

totally bounded, 6

totally ordered, 4
transitive relation, 3
triangle inequality, 5

uniformly continuous, 7
uniformly equicontinuous, 7
unitary linear transformation, 64
unitary operator, 29
upper bound, 4

weak convergence, 56, 61
weak-∗ convergence, 57

Zorn's lemma, 4
