LECTURES ON FUNCTIONAL ANALYSIS
R. SHVYDKOY
Contents
1. Elements of topology
1.1. Metric spaces
1.2. General topological spaces. Nets
1.3. Ultrafilters and Tychonoff's compactness theorem
1.4. Exercises
2. Banach Spaces
2.1. Basic concepts and notation
2.2. Geometric objects
2.3. Banach Spaces
2.4. Classical examples
2.5. Subspaces, direct products, quotients
2.6. Norm comparison and equivalence
2.7. Compactness in finite dimensional spaces
2.8. Convex sets
2.9. Linear bounded operators
2.10. Invertibility
2.11. Complemented subspaces
2.12. Completion
2.13. Extensions, restrictions, reductions
2.14. Exercises
3. Fundamental Principles
3.1. The dual space
3.2. Structure of linear functionals
3.3. The Hahn-Banach extension theorem
3.4. Adjoint operators
3.5. Minkowski's functionals
3.6. Separation theorems
3.7. Baire Category Theorem
3.8. Open mapping and Closed graph theorem
3.9. Exercises
4. Hilbert Spaces
4.1. Inner product
4.2. Orthogonality
4.3. Orthogonal and Orthonormal sequences
4.4. Existence of a basis and Gram-Schmidt orthogonalization
4.5. Riesz Representation Theorem
4.6. Hilbert-adjoint operator
4.7. Exercises
5. Weak topologies
5.1. Weak topology
5.2. Weak∗ topology
5.3. Exercises
6. Compact sets in Banach spaces
6.1. Compactness in sequence spaces
6.2. Arzelà-Ascoli Theorem
6.3. Compactness in Lp(Ω)
6.4. Extreme points and Krein-Milman Theorem
6.5. Compact maps
6.6. Exercises
7. Fixed Point Theorems
7.1. Contraction Principles
7.2. Brouwer Fixed Point theorem and its relatives
8. Spectral theory of bounded operators
8.1. Spectral Mapping Theorem and the Gelfand formula
8.2. On the spectrum of self-adjoint operators
8.3. On the spectrum of unitary operators
8.4. Exercises
1. Elements of topology
We start our lectures with a crash course in elementary topology. We will not need the full
extent of this section till after we start discussing weak and weak-star topologies on Banach
spaces. However, everything related to metric spaces and compactness will become useful
right away.
Definition 1.1. A set X is called a topological space if it has a designated family of subsets τ,
called a topology, whose elements are called open sets and which satisfy the following axioms:
(i) ∅, X ∈ τ,
(ii) if Uα ∈ τ for α ∈ A is any collection of open sets, then ⋃_{α∈A} Uα ∈ τ,
(iii) if Ui, i = 1, . . . , n, is a finite collection of open sets, then ⋂_{i=1}^n Ui ∈ τ.
We call F ⊂ X a closed set if F c = X\F ∈ τ .
Closed sets verify a complementary set of axioms:
(i) ∅, X are closed,
(ii) if Fα for α ∈ A is any collection of closed sets, then ⋂_{α∈A} Fα is closed,
(iii) if Fi, i = 1, . . . , n, is a finite collection of closed sets, then ⋃_{i=1}^n Fi is closed.
For any subset A ⊂ X, we define the closure of A, denoted Ā, to be the minimal closed set
containing A. In other words, if F is a closed set containing A, then Ā ⊂ F. In view of
property (ii) of closed sets, we can define the closure equivalently by

Ā = ⋂_{F closed, A⊂F} F.
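As a quick side illustration of this formula (an example of ours, not from the text), closures in a small finite topological space can be computed directly by intersecting the closed sets containing A. A minimal Python sketch, assuming a hypothetical three-point space:

    # A hypothetical topology on the three-point set {0, 1, 2}.
    X = {0, 1, 2}
    tau = [set(), {0}, {0, 1}, X]          # the open sets; the axioms are easy to check by hand
    closed = [X - U for U in tau]          # closed sets are complements of open sets

    def closure(A):
        """Smallest closed set containing A: intersect all closed F with A contained in F."""
        result = set(X)
        for F in closed:
            if set(A) <= F:
                result &= F
        return result

    print(closure({1}))   # {1, 2}
    print(closure({0}))   # {0, 1, 2}: the point 0 is dense in this topology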
One of the most important uses of topological spaces is that one can define the notion of
a continuous map between them.
Definition 1.2. Let (X, τX) and (Y, τY) be two topological spaces. A map

f : X → Y

is called continuous if f⁻¹(U) ∈ τX for every U ∈ τY. In other words, the preimage of every
open set is an open set.
One can also define ”small sets” that in a sense possess properties of a finite dimensional
object.
Definition 1.3. A subset K ⊂ X is called compact if every open cover of K contains a finite
subcover; in other words, whenever K is covered by a collection of open sets,

K ⊂ ⋃_{α∈A} Uα,

one can find finitely many of them, Uα1, . . . , Uαn, which still cover the entire set:
K ⊂ ⋃_{i=1}^n Uαi.
Lemma 1.4. Let f : X → Y be a continuous map. If K ⊂ X is compact, then f (K) is
compact in Y .
Proof. Consider an open cover

f(K) ⊂ ⋃_{α∈A} Vα.

Then the sets Uα = f⁻¹(Vα) form an open cover of K. By the compactness of K there exists a finite
subcover K ⊂ ⋃_{i=1}^n Uαi. But then f(K) ⊂ ⋃_{i=1}^n Vαi.
One can define a limit of a sequence x = limn→∞ xn by declaring that for every open set U
containing x there exists an N ∈ N such that for all n > N , xn ∈ U . However, in general the
definitions of continuity and compactness are not equivalent to their sequential counterparts
as, say, in Rⁿ. Thus, continuity in the sense that "if xn → x, then f(xn) → f(x)" is
not equivalent to Definition 1.2. Likewise, sequential compactness in the sense that "every
sequence {xn} ⊂ K has a subsequence converging to an element of K" is not equivalent to
Definition 1.3. These sequential counterparts appeal to the situation when one can choose
a “sequence of balls” shrinking towards x, and respectively towards f (x) in Y , which may
not be possible if the topology does not have a “countable base”. Before we venture into
the general settings let us narrow our discussion down to the more special case of a metric
space where most topological concepts do in fact have sequential analogues.
1.1. Metric spaces. A metric on a set X, or distance, is a map d : X × X → R+ such that
(i) d(x, y) = 0 if and only if x = y,
(ii) d(x, y) = d(y, x),
(iii) d(x, z) ≤ d(x, y) + d(y, z).
For any point x there is a family of balls Br (x) = {y : d(x, y) < r} around x which has a
countable base, namely B1/n (x). We define metric topology on X by declaring that a set U
is open if for every x ∈ U there is a ball Br (x) ⊂ U .
Every metric space separates points, namely for every pair of distinct points x1, x2 ∈ X there are open
neighborhoods U(xi) such that xi ∈ U(xi) and U(x1) ∩ U(x2) = ∅. Indeed, pick U(xi) to be the
ball centered at xi of radius r = d(x1, x2)/3. Such topologies are called Hausdorff. We say
that x = lim_{n→∞} xn if d(x, xn) → 0. Because of the Hausdorff property all limits are unique.
We call a sequence Cauchy if ∀ε ∃N such that ∀n, m > N , d(xn , xm ) < ε.
Definition 1.5. The space X is called complete if every Cauchy sequence in X has a limit.
Proposition 1.6 (Sequential definitions). Let (X, d) be a metric space.
(a) A subset F ⊂ X of a metric space is closed iff the limit of a convergent sequence
xn ∈ F , belongs to F .
(b) A function f : X → Y , where Y is any topological space, is continuous iff for every
sequence xn → x one has f (xn ) → f (x) (sequentially continuous).
(c) A subset K ⊂ X is compact iff every sequence {xn }n ⊂ K contains a subsequence
which converges to a point in K (sequentially compact). Consequently, K is complete
as a metric space.
Proof. (a): Suppose F is closed. If x ∉ F, then there is a neighborhood U(x) with U(x) ∩ F = ∅.
By definition of the limit there exists n such that xn ∈ U(x), so xn ∉ F, a contradiction.
Conversely, if F is sequentially closed but F^c is not open, then there exists a point x ∈ F^c
such that B1/n(x) ∩ F ≠ ∅ for every n. Pick a sequence xn ∈ B1/n(x) ∩ F. Then xn → x and all xn ∈ F,
yet x ∉ F, a contradiction.
(b): Suppose f is continuous, but there exists xn → x such that f(xn) does not converge
to f(x). Then there exists an open neighborhood U of f(x) and a subsequence xnk such that
f(xnk) ∉ U. Since f⁻¹(U) is open and contains x, the elements xnk will eventually get
into f⁻¹(U), which means f(xnk) ∈ U, a contradiction.
Conversely, suppose f is sequentially continuous but for some open U ⊂ Y, f⁻¹(U) is not
open. Then there exists a point x ∈ f⁻¹(U) and a sequence xn ∈ B1/n(x)\f⁻¹(U). But then
xn → x and so f(xn) → f(x), which means that from some n on, f(xn) ∈ U, a contradiction.
(c): Let K be compact and let {xn} ⊂ K. We call a point y ∈ K a cluster point if every
ball Br(y) contains a subsequence {xnk} of the given sequence. We show that there exists
at least one cluster point. Suppose not; then for every y ∈ K we can find a ball Br(y)(y)
such that, starting from n > N(y), all xn ∉ Br(y)(y). This defines a cover, hence there
exists a finite subcover Br(y1)(y1), . . . , Br(ym)(ym). So, for n > maxi N(yi) the elements of
our sequence do not belong to K, a contradiction. So, let y be a cluster point. We pick a
converging subsequence as follows: pick n1 such that xn1 ∈ B1(y), then n2 > n1 such that
xn2 ∈ B1/2(y), then n3 > n2 such that xn3 ∈ B1/3(y), and so on. Clearly, xnm → y.
The converse statement is somewhat more involved and it highlights several additional
useful properties of compact sets in metric spaces. So, suppose K is sequentially compact.
And let {Uα }α∈A be an open cover of K.
First, we show that this cover has a certain "fatness" everywhere in K; this is the
Lebesgue number lemma. We claim that there exists an ε > 0 such that for every point
x ∈ K there exists α such that Bε(x) ⊂ Uα. Indeed, otherwise for every n we would have
found an xn ∈ K such that the ball B1/n(xn) is not contained entirely in any Uα. From
this sequence {xn} we can extract a converging subsequence xnk → x ∈ K. The limit x is
contained in some Uα, and with it there is a ball Br(x) ⊂ Uα. Since xnk → x, we can find
nk > 2/r such that d(xnk, x) < r/2. But then B1/nk(xnk) ⊂ Br(x) ⊂ Uα, a contradiction.
Next, we show that for every ε > 0 there exists an ε-net covering K; this means a set of points
x1, . . . , xn such that K ⊂ ⋃_i Bε(xi). Indeed, if for some ε > 0 the set K is not covered by finitely many
such balls, we pick a sequence as follows: take any x1 ∈ K, then pick x2 ∈ K\Bε(x1), then
pick x3 ∈ K\(Bε(x1) ∪ Bε(x2)), and so on. This selects a sequence such that d(xn, xk) ≥ ε
for all n > k, which cannot have any converging subsequence, a contradiction.
Finally, for our cover we pick a Lebesgue number ε and find an ε-net {xi}_{i=1}^n. Then each
ball Bε(xi) finds itself inside some Uαi. Clearly, K ⊂ ⋃_i Bε(xi) ⊂ ⋃_i Uαi. So, we have found a
finite subcover.
Corollary 1.7. Let K ⊂ X be compact and f : K → R be a continuous function. Then
maxK f and minK f are achieved.
Proof. First, supK f is finite. If not, then there is a sequence xn ∈ K such that f(xn) > n.
Choosing a converging subsequence xnk → x ∈ K we arrive at a contradiction, since f(xnk) → f(x) < ∞.
Now, similarly, pick a sequence such that f(xn) → supK f and select a convergent subsequence. If x is its limit, then clearly,
f(x) = maxK f. The case of the minimum is analogous.
Another important characterization of compact sets in metric spaces is provided in terms
of ε-nets, which momentarily appeared in the proof of Proposition 1.6 (c).
Definition 1.8. Let K ⊂ X. We say that {xγ }γ∈Γ ⊂ X is an ε-net for K if for every y ∈ K,
there exists γ ∈ Γ such that d(y, xγ ) < ε.
It is easy to show that if one can find an ε-net for K in X, then there exists in fact a
2ε-net consisting of elements of K itself. Indeed, let us consider balls Bε (xγ ). Consider the
subset Γ0 ⊂ Γ consisting of indexes γ such that Bε (xγ ) ∩ K 6= ∅. For any such γ ∈ Γ0 pick
a yγ ∈ Bε (xγ ) ∩ K. Then for any y ∈ K there exists γ ∈ Γ such that y ∈ Bε (xγ ). But then
γ ∈ Γ0 , and consequently, d(y, yγ ) ≤ d(y, xγ ) + d(xγ , yγ ) < 2ε. Hence, {yγ }γ∈Γ0 ⊂ K is a
2ε-net.
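Both constructions above are effective procedures. As an illustration (a sketch of ours, with a random finite point cloud standing in for K), the greedy selection used in the proof of Proposition 1.6(c) produces an ε-net consisting of points of K:

    import numpy as np

    def greedy_eps_net(points, eps):
        """Greedy ε-net: keep a point as a new center whenever it is at distance >= eps
        from all centers chosen so far; afterwards every point lies within eps of a center."""
        centers = []
        for p in points:
            if all(np.linalg.norm(p - c) >= eps for c in centers):
                centers.append(p)
        return centers

    K = np.random.rand(500, 2)     # a hypothetical "compact set": 500 points in the unit square
    net = greedy_eps_net(K, 0.2)
    print(len(net), "centers form a 0.2-net for these points")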
Definition 1.9. A subset K ⊂ X is called precompact if its closure K̄ is compact.
As we will see in the context of compact embeddings most of the sets we will deal with
are in fact precompact “out of the box”. So, it is a very convenient concept to use.
Lemma 1.10. Suppose X is complete. Then the following are equivalent:
(i) K is precompact.
(ii) Every sequence in K contains a subsequence converging to an element of X.
(iii) For every ε > 0 there exists an ε-net for K.
Proof. Let us assume (i). Then (ii) follows immediately from Proposition 1.6(c) due to K̄
being compact. Conversely, if (ii) holds, consider a sequence xn ∈ K̄. Then for any n, pick
a yn ∈ K such that d(yn , xn ) < 1/n. There exists a subsequence ynk → x. Then x ∈ K̄, and
at the same time xnk → x. Proposition 1.6(c) implies that K̄ is compact.
Let us show that (i) implies (iii). For a fixed ε > 0 consider an open cover of K̄ by
balls Bε (x), x ∈ K̄. Since K̄ is compact there exists a finite subcover Bε (xj ), j = 1, . . . , n.
Clearly, {xj } forms a finite ε-net for K.
Assume (iii). Clearly, K̄ shares the same property. Pick any sequence yn ∈ K, and fix an
arbitrary sequence εk → 0. For any k, find an εk-net {x^k_j}, j = 1, . . . , Jk. Now we start an
iteration. Since the balls Bε1(x^1_j) cover K, and there are finitely many of them, there exists
a subsequence {y^1_n} lying in one of them, so that d(y^1_n, y^1_m) < 2ε1 for all n, m, and with the
first element y^1_1 corresponding to some yp with p > 1. Next, pick a further subsequence {y^2_n}
which lies in a single ball of radius ε2, with the first element y^2_1 corresponding to some yp,
p > 2, and so on. The diagonal subsequence zn = y^n_1 is Cauchy: for any ε > 0 we can find 2εk < ε,
and all the elements zn with n > k lie in one ball of radius εk and hence are 2εk-close. Since X
is complete, this implies that zn → x ∈ X, which proves (ii).
Corollary 1.11. In a complete metric space X, a subset K ⊂ X is compact if and only if
it is closed and for every ε > 0 there exists an ε-net for K.
There is a separate term in topology for sets satisfying property (iii) of Lemma 1.10. They
are called totally bounded. We will rarely use this term because the distinction between totally
bounded and precompact sets may only appear in incomplete spaces, which will not be in
focus for most of our discussion.
1.2. General topological spaces. Nets. The subject of this section is an arbitrary, not
necessarily metric, topological space (X, τ ). It turns out that in the case when the topology
cannot be defined by a metric, many of the conventional sequential definitions stated in
Proposition 1.6 do not apply. Sequences are simply not descriptive enough to characterize
features of a topology that does not have a countable base of neighborhoods, see [?] for more
on this. A proper “upgrade” of a sequence to the general settings is given by the concept of
a net.
Definition 1.12. A subset {xα}α∈A ⊂ X is called a net (not to be confused with the ε-nets
of the previous section) if the index set A is partially ordered and directed, i.e. for every
pair α, β ∈ A there is γ ∈ A with γ ≥ α, γ ≥ β. A subnet is a net {yβ}β∈B together with a map
n : B → A such that yβ = x_{n(β)}, n is monotone, and for every α ∈ A there is β ∈ B with
n(β) ≥ α. A net {xα}α∈A is said to be convergent to x ∈ X, written

x = lim_{α∈A} xα = lim_α xα,

if for every open set U containing x there exists α0 ∈ A such that xα ∈ U for all α ≥ α0.
Conversely, suppose there is an open G ⊂ Y such that f⁻¹(G) is not open. Thus, there
is a point x ∈ f⁻¹(G) such that any open neighborhood U of x contains a point outside
f⁻¹(G). Let us fix one such point xU ∈ U \ f⁻¹(G) for every U. The index set A = {U ∈ τ : x ∈ U},
ordered by reverse inclusion, is directed. Clearly, xU → x, since for every open U containing x all
elements of the net starting from xU fall inside U. Yet, f(xU) ∉ G, and thus f(xU) does not converge to f(x).
(c): Suppose X is compact, and let {xα}α∈A ⊂ X be a net. First let us establish the
existence of a cluster point. A point y ∈ X is a cluster point of the net if for every U ∈ τ
containing y and every α0 there is α ≥ α0 such that xα ∈ U. Suppose that our net does
not have cluster points. Then for every y ∈ X there are Uy ∈ τ containing y and αy ∈ A such that xα ∉ Uy
for all α ≥ αy. Consider the open cover {Uy}y∈X. By compactness there is a finite subcover
Uy1, . . . , Uyn. Since A is directed, there is α ≥ αyi for all i = 1, . . . , n. Then xα lies in none of
the open sets above, which shows that they do not form a cover, a contradiction.
So, let y be a cluster point. Let B = {(U, α) : y ∈ U, U ∈ τ, xα ∈ U } be ordered by reverse
inclusion on the first component, and by the order of A on the second. For β = (U, α), let
yβ = xα , and let n(β) = α. It is routine to show that {yβ }β∈B is a subnet converging to y.
Conversely, suppose every net has a converging subnet, and yet, on the contrary, X is not
compact. This implies that there is an open cover U which has no finite subcover. Let us
define A = {α = (U1, . . . , Un) : Ui ∈ U, n ∈ N}, ordered by α ≥ β if β ⊂ α. Clearly A is
directed. By assumption, for any α = (U1, . . . , Un) there is xα ∉ ⋃_i Ui. The net {xα}α∈A
has a converging subnet {yβ}β∈B, and y = lim yβ. Since U is a cover, there is U ∈ U with
y ∈ U. Let α = (U). By the definition of a subnet, there is β′ ∈ B such that n(β′) ≥ α
and yβ′ = x_{n(β′)}, and there is another β″ ≥ β′ such that yβ″ ∈ U. By monotonicity of n,
n(β″) ≥ α, and yet x_{n(β″)} = yβ″ ∈ U, in contradiction with the construction.
1.3. Ultrafilters and Tychonoff's compactness theorem. Let X be a set. A family of
subsets F ⊂ 2^X is called a filter if
(1) ∅ ∉ F;
(2) if F1, . . . , Fn are elements of F, then ⋂_{j=1}^n Fj ∈ F;
1.4. Exercises.
Exercise 1.1. Verify all the axioms of open sets for this definition.
Exercise 1.2. Let X be an arbitrary topological space. Show that a subset F ⊂ X is closed
if and only if the limit of every convergent net inside F is contained in F .
Exercise 1.3. Show that Ā is the set of all limits of nets from within A. In a metric space,
Ā is the set of all limits of sequences from within A.
Exercise 1.4. A topology τ1 on X is said to be stronger than another topology τ2 on X if
for any point x ∈ X any open neighborhood of x in τ2 contains an open neighborhood of x
in τ1 . We denote it τ1 ≥ τ2 . If τ1 ≥ τ2 and τ2 ≥ τ1 , then the topologies are called equivalent.
Show that, in general, τ1 ≥ τ2 if and only if every net converging in τ1 also converges in τ2 to the same limit.
Exercise 1.5. Show that a net xα → x in the product topology if and only if πγ (xα ) → πγ (x)
for every γ ∈ Γ.
2. Banach Spaces
2.1. Basic concepts and notation. Let us consider a vector space X over the field K = R
or C. We say that X has finite dimension n if there is a system of n linearly independent
vectors {x1, . . . , xn} in X which spans X. We denote the linear span of a set S ⊂ X by [S].
If X has no finite dimension, X is called infinite dimensional. A function
k · k : X → R+
is said to define a norm on X if the following axioms hold:
(i) kxk = 0 iff x = 0,
(ii) kαxk = |α|kxk for all x ∈ X, α ∈ K,
(iii) kx + yk ≤ kxk + kyk (triangle inequality).
More relaxed versions of the concept of a norm are also studied in functional analysis. These
include a pseudo-norm, that is a function k · k satisfying only (ii) and (iii), and a quasi-norm
that is a function satisfying (i), (ii), and the triangle inequality with a constant coefficient
c > 1:
kx + yk ≤ c(kxk + kyk).
We write (X, k · k) to indicate that X is equipped with the norm k · k if it is not clear from
the context. We call (X, k · k) a normed space.
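As a quick numerical aside (an illustration of ours, not part of the text): for 0 < p < 1 the expression (Σ|xi|^p)^{1/p} fails the triangle inequality and gives only a quasi-norm, while p ≥ 1 gives a genuine norm. A short Python check:

    import numpy as np

    def lp(x, p):
        return np.sum(np.abs(x) ** p) ** (1.0 / p)

    x = np.array([1.0, 0.0])
    y = np.array([0.0, 1.0])
    for p in (0.5, 1.0, 2.0):
        lhs, rhs = lp(x + y, p), lp(x, p) + lp(y, p)
        print(f"p={p}: ||x+y|| = {lhs:.3f}, ||x|| + ||y|| = {rhs:.3f}, triangle holds: {lhs <= rhs}")
    # For p = 0.5 the left side is 4 while the right side is 2: only a quasi-norm
    # (one can take c = 2^{1/p - 1} in the relaxed triangle inequality).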
Let us notice that a norm generates a metric, called norm-metric, on the space X via
d(x, y) = kx − yk.
The corresponding topology is called norm-topology. Let us recall from Section 1.1 that
we identify open sets as sets U with the property that for any x ∈ U there is an open
ball Bε (x) ⊂ U . The norm-topology naturally gives rise to the concept of convergence and
continuity as described in Section 1.1: a sequence {xn}_{n=1}^∞ ⊂ X is said to converge to x in
the norm, or strongly, if ‖xn − x‖ → 0 as n → ∞. A function f : X → Y, where Y is a
topological space, is continuous if f(xn) → f(x) whenever xn → x, or equivalently, if f⁻¹(U)
is open for any open U ⊂ Y, see Proposition 1.6.
Lemma 2.1. The norm k · k : X → R is a continuous function on X.
The proof follows readily from (25).
2.2. Geometric objects. In any normed space we can define a line passing through x0 ∈ X
in the direction of v ∈ X as a set of vectors
{x0 + tv : t ∈ R}.
Similarly we define a plane spanned by a linearly independent couple v, w ∈ X:
{x0 + tv + sw : t, s ∈ R},
etc. We introduce the following notation for balls and spheres:
B(X) = {x ∈ X : ‖x‖ < 1}, the open unit ball,
B̄(X) = {x ∈ X : ‖x‖ ≤ 1}, the closed unit ball,
S(X) = {x ∈ X : ‖x‖ = 1}, the unit sphere,
Br(x0) = {x ∈ X : ‖x − x0‖ < r},
B̄r(x0) = {x ∈ X : ‖x − x0‖ ≤ r}.
Exercise 2.1. Show that B(X) is open, while B̄(X) and S(X) are closed subsets of X.
For two sets A, B ⊂ X we denote by A+B their algebraic sum {x+y : x ∈ A, y ∈ B}, and
constant multiple by αA = {αx : x ∈ A}. Thus, Br (x0 ) = x0 + rB(X), Sr (x0 ) = x0 + rS(X),
etc.
For a subset A ⊂ X its linear span is the set

span A = { Σ_{i=1}^k αi xi : xi ∈ A, αi ∈ R, k ∈ N }.

In particular, for any two vectors x, y ∈ X the convex hull conv{x, y} (defined in Section 2.8 below) is simply the segment
between them.
We can define the concept of a distance between two sets A, B ⊂ X:

dist{A, B} = inf_{a∈A, b∈B} ‖a − b‖.

Finally, we say that x ∈ X is a unit vector if ‖x‖ = 1. For any vector x ≠ 0, we can define
the unit vector pointing in the same direction:

x̄ = x / ‖x‖,    ‖x̄‖ = 1.
A subset A ⊂ X is called bounded if supa∈A kak < ∞. Note that a set is bounded if and
only if it is contained in a ball
A ⊂ RB(X),
for some R > 0.
2.4. Classical examples. The simplest example of a normed space is the Euclidean space
ℓ_2^n = (K^n, ‖·‖_2) with the norm given by

‖x‖_2 = ( Σ_{i=1}^n |xi|² )^{1/2}.

The corresponding n-dimensional analogue of ℓp is denoted ℓ_p^n. At this point, other than in the cases
p = 1, 2, ∞, it is not clear whether ℓp is a linear space and whether ‖·‖_p defines a norm on it. We will
show this next, and establish several very important inequalities along the way.
Lemma 2.4. `p is a Banach space for all 1 ≤ p ≤ ∞.
Proof. First, let p = ∞. The norm axioms in this case are trivial. To show that ℓ∞
is complete, let xn = {xn(j)}_{j=1}^∞ be a Cauchy sequence. Then every numerical sequence
{xn(j)}n is Cauchy, and hence xn(j) → x(j) as n → ∞. Since {xn} is Cauchy, given ε > 0 we
have ‖xn − xm‖∞ < ε for all n, m > N. Thus, |xn(j) − xm(j)| < ε for all j ∈ N as well. Let
us fix n and j and let m → ∞ in the last inequality. We obtain |xn(j) − x(j)| ≤ ε for all j,
and hence x ∈ ℓ∞ and ‖xn − x‖∞ ≤ ε for all n > N. We have shown that xn → x.
Now let p < ∞. Let us prove the triangle inequality first. By concavity of ln(x), we have
ln(λa + µb) ≥ λ ln(a) + µ ln(b),
for all λ + µ = 1, λ, µ ≥ 0, and a, b > 0. Exponentiating the above inequality we obtain
aλ bµ ≤ λa + µb.
Letting λ = 1/p, µ = 1/q, and replacing a → a^p and b → b^q, we obtain

(1)    ab ≤ a^p/p + b^q/q    (Young's inequality),

whenever 1/p + 1/q = 1, p ≥ 1. Next, consider finite sequences x = {xi}_{i=1}^n, y = {yi}_{i=1}^n and
observe, by (1),

Σ_{i=1}^n |xi yi| ≤ (1/p) Σ_{i=1}^n |xi|^p + (1/q) Σ_{i=1}^n |yi|^q = (1/p) ‖x‖_p^p + (1/q) ‖y‖_q^q.
So, if ‖x‖_p = ‖y‖_q = 1, then Σ_i |xi yi| ≤ 1. For general x ≠ 0 and y ≠ 0, we consider the
corresponding unit vectors x̄, ȳ, for which, according to the above,

Σ_{i=1}^n |xi yi| / (‖x‖_p ‖y‖_q) ≤ 1.

Hence, we obtain

(2)    | Σ_{i=1}^n xi yi | ≤ ‖x‖_p ‖y‖_q    (Hölder's inequality).
Finally,

Σ_{i=1}^n |xi + yi|^p ≤ Σ_{i=1}^n |xi + yi|^{p−1} |xi| + Σ_{i=1}^n |xi + yi|^{p−1} |yi|
    ≤ ( Σ_{i=1}^n |xi + yi|^{(p−1)q} )^{1/q} [ ‖x‖_p + ‖y‖_p ] = ‖x + y‖_p^{p/q} [ ‖x‖_p + ‖y‖_p ].

Thus,

‖x + y‖_p^p ≤ ‖x + y‖_p^{p/q} [ ‖x‖_p + ‖y‖_p ],

and this implies

‖x + y‖_p ≤ ‖x‖_p + ‖y‖_p    (Minkowski's inequality),
which is the desired triangle inequality, so far only for finite sequences. It remains to notice that
if x, y ∈ ℓp are arbitrary, then the above inequality shows that the partial sums of the p-series of
x + y are uniformly bounded, which in turn implies that x + y ∈ ℓp and that the triangle inequality (iii)
holds as desired.
To prove that ℓp is complete, let xn = {xn(j)}_{j=1}^∞ be Cauchy. Then, as before, we can pass
to the limit in every coordinate: xn(j) → x(j). For a fixed J ∈ N, ε > 0, and n, m large
enough, we have

Σ_{j=1}^J |xn(j) − xm(j)|^p < ε.

Letting m → ∞, we obtain

Σ_{j=1}^J |xn(j) − x(j)|^p ≤ ε.

In particular, this implies that all partial sums of the series Σ_j |x(j)|^p are bounded, and
hence x = {x(j)}_j ∈ ℓp. Now let J → ∞ in the estimate above. We obtain
‖xn − x‖_p ≤ ε^{1/p}, and thus xn → x.
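The three inequalities used in the proof are easy to sanity-check numerically. The following Python snippet (an illustration on random vectors, not part of the argument) verifies Young's, Hölder's and Minkowski's inequalities for one conjugate pair:

    import numpy as np

    rng = np.random.default_rng(0)
    p, q = 3.0, 1.5                                   # conjugate exponents: 1/p + 1/q = 1
    x, y = rng.standard_normal(10), rng.standard_normal(10)
    norm = lambda v, r: np.sum(np.abs(v) ** r) ** (1.0 / r)

    # Young, coordinate-wise: |ab| <= |a|^p/p + |b|^q/q
    assert np.all(np.abs(x * y) <= np.abs(x) ** p / p + np.abs(y) ** q / q + 1e-12)
    # Hoelder: |sum x_i y_i| <= ||x||_p ||y||_q
    assert abs(np.sum(x * y)) <= norm(x, p) * norm(y, q) + 1e-12
    # Minkowski: ||x + y||_p <= ||x||_p + ||y||_p
    assert norm(x + y, p) <= norm(x, p) + norm(y, p) + 1e-12
    print("Young, Hoelder and Minkowski hold on this sample.")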
Our next classical example is c0. This is the space of sequences {xj}_{j=1}^∞ such that
lim_{j→∞} xj = 0, endowed with the uniform norm ‖·‖∞.
Lemma 2.5. c0 is a Banach space.
Proof. Since c0 ⊂ `∞ , and c0 is clearly a linear space which shares the same norm with `∞ ,
it is sufficient to show that c0 is closed as a subset of `∞ , as any closed subset of a complete
metric space is also complete.
So, let xn ∈ c0 and xn → x ∈ ℓ∞. Then for any ε > 0 there exists n such that

‖xn − x‖∞ < ε/2.

At the same time, since the coordinates of xn tend to zero, xn(k) → 0 as k → ∞, there is K ∈ N
such that for k > K we have

|xn(k)| < ε/2.

In view of the two inequalities above,

|x(k)| ≤ |xn(k) − x(k)| + |xn(k)| < ε,

for all k > K. This shows that x ∈ c0.
Let us introduce the Lebesgue spaces (see also Math 533). Let (Ω, Σ, µ) be a measure space,
that is, a set Ω with a σ-algebra Σ of subsets and a positive σ-additive measure µ : Σ → R+
defined on it. For 1 ≤ p < ∞, we define

Lp(Ω, Σ, µ) = { [f] : ∫_Ω |f(x)|^p dµ(x) < ∞ },

where [f] is the equivalence class of functions g such that g = f almost everywhere. The
norm is defined by

‖f‖_p = ( ∫_Ω |f(x)|^p dµ(x) )^{1/p}.

Here f is any representative of the class.
For p = ∞ we define the space of essentially bounded functions

L∞(Ω, Σ, µ) = { [f] : ∃E ∈ Σ, µ(E) = 0, f|_{Ω\E} is bounded }.

We define the essential supremum norm by

‖f‖∞ = esssup_Ω |f| = inf_{E∈Σ: µ(E)=0} sup_{x∈Ω\E} |f(x)|.
For a compact topological space K we denote by C(K) the space of continuous scalar-valued
functions on K, equipped with the norm ‖f‖_{C(K)} = max_{x∈K} |f(x)|. Note that the maximum is
always achieved, by Corollary 1.7, and for continuous functions ‖f‖∞ = ‖f‖_{C(K)}.
Lemma 2.7. The space C(K) is Banach.
This follows the same proof as before: we first find a pointwise limit fn(x) → f(x), then
conclude that the limit is in fact uniform,
‖fn − f‖∞ → 0,
and invoke the classical theorem from Real Analysis saying that a uniform limit of continuous
functions is continuous.
Definition 2.8. A normed space X is called separable if it contains a countable dense subset,
i.e. if there is S ⊂ X, card S = ω0 such that for every x ∈ X and any ε > 0 there is y ∈ S
with kx − yk < ε.
2.5. Subspaces, direct products, quotients. We say that Y ⊂ X is a subspace if it is
closed under the linear operations. We say that Y is closed if it is closed in the norm-topology
of X. If in addition X is complete, then so is each of its closed subspaces. Thus, any closed
subspace of a Banach space is Banach. We say that Y is dense in X if for every x ∈ X and
every ball Bε(x) there is a point y ∈ Y ∩ Bε(x). In other words, Y is dense in the norm-topology.
Let us now fix a closed linear subspace Y ⊂ X and consider the equivalence relation
x1 ∼ x2 iff x1 − x2 ∈ Y . This defines a conjugacy class [x] = x + Y , for every x ∈ X. The
space of all conjugacy classes is called the quotient-space of X by Y , denoted X/Y , with the
natural linear operations inherited from X. We can endow X/Y with a norm too, called the
quotient-norm:
(3) k[x]k = inf{kx + yk : y ∈ Y } = dist{x, Y }.
Notice that we always have
k[x]kX/Y ≤ kxkX .
Lemma 2.9. The above defines a norm on X/Y . If X is complete, then X/Y is complete
as well in the quotient-norm.
Proof. We will use Exercise 3.2 for the proof. Let us suppose that we have an absolutely
convergent series in X/Y:

Σ_i ‖[xi]‖_{X/Y} < ∞.

We can assume that all ‖[xi]‖ > 0. Then for each i we can find yi ∈ Y such that

‖xi + yi‖_X ≤ 2 ‖[xi]‖_{X/Y}.

Then Σ_i ‖xi + yi‖_X < ∞. Since X is a Banach space, the series Σ_i (xi + yi) = x converges. Thus,

‖ Σ_{i=1}^N [xi] − [x] ‖_{X/Y} ≤ ‖ Σ_{i=1}^N (xi + yi) − x ‖_X → 0,

so the series Σ_i [xi] converges in X/Y. By Exercise 3.2, X/Y is complete.

Let now X and Y be two linear spaces, and consider their Cartesian product X × Y = {(x, y) : x ∈ X, y ∈ Y}
endowed with the coordinate-wise operation of addition and multiplication by a scalar. This
makes X × Y into a linear space. Identifying elements of the product (x, 0) with x , and
(0, y) with y arranges a natural embedding of X and Y into X × Y . We thus can write
x + y = (x, y). If both spaces are normed, (X, k · kX ) and (Y, k · kY ), there are many ways
one can define a norm on the product. For example, let 1 ≤ p < ∞. We can define a new
norm on X × Y by
kx + ykp = (kxkpX + kykpY )1/p .
The verification that this rule defines a norm is immediate from Minkowski's inequality
established above. The obtained normed space is called the ℓp-sum of X and Y and
denoted X ⊕p Y . For p = ∞ we naturally define X ⊕∞ Y equipped with the norm
kx + yk∞ = max{kxkX , kykY }.
Similarly, we define ℓp-sums of any number of spaces, and even of countably many spaces, by
requiring a member of X1 ⊕p X2 ⊕p · · · to be a sequence of vectors x = {x1, x2, . . .}, xj ∈ Xj, such that

‖x‖_p = ( Σ_{j=1}^∞ ‖xj‖_{Xj}^p )^{1/p} < ∞,

or bounded in the case p = ∞.
Let us notice that for any pair of vectors x ∈ S(X) and y ∈ S(Y ), the span{x, y} will be
identical to `2p in the `p -product of spaces. So, for example, the unit ball of the X ⊕1 R will
look like a symmetric tent with B(X) being the base and (0, 1) the top point. The ball of
X ⊕∞ R would be the cylinder with base B(X) and height 1.
2.6. Norm comparison and equivalence. Let (X, k · k) be a normed space and Y ⊂ X
is a subspace with another norm |||·|||. We say that the norm |||·||| is stronger than k · k if there
exists a constant C > 0 such that
(4) kyk ≤ C |||y||| , for all y ∈ Y.
The two norms are equivalent if there are c, C > 0 for which
(5) c |||y||| ≤ kyk ≤ C |||y||| , for all y ∈ Y.
Geometrically, (4) means that B |||·||| (Y ) ⊂ CB k·k (Y ), while (5) means that there is embedding
in both sides, cB k·k (Y ) ⊂ B |||·||| (Y ) ⊂ CB k·k (Y ). The stronger norm, therefore, defines a finer
topology on Y , while equivalent norms define the same topology.
Example 2.10. We have ℓp ⊂ ℓq for all 1 ≤ p ≤ q ≤ ∞, and

(6)    ‖x‖_q ≤ ‖x‖_p.

Indeed, if x = (x1, . . . ) ∈ S(ℓp), then all |xi| ≤ 1. Hence, |xi|^q ≤ |xi|^p,
and thus x ∈ ℓq. Moreover, ‖x‖_q ≤ 1. The general inequality (6) follows by homogeneity.
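A quick numerical confirmation of (6), with an arbitrary test vector (an illustration, not part of the text):

    import numpy as np

    lp = lambda v, r: np.sum(np.abs(v) ** r) ** (1.0 / r)
    x = np.array([3.0, -1.0, 0.5, 0.25])
    for p, q in [(1, 2), (1, 4), (2, 4)]:
        print(p, q, lp(x, q) <= lp(x, p))   # always True: ||x||_q <= ||x||_p when p <= q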
Example 2.11. We have the opposite embeddings for the Lebesgue spaces on a finite measure space:
Lq(dµ) ⊂ Lp(dµ) for all 1 ≤ p ≤ q ≤ ∞, and

(7)    ‖f‖_p ≤ µ(Ω)^{1/p − 1/q} ‖f‖_q,    for all f ∈ Lq(dµ),

provided µ(Ω) < ∞. This readily follows from the Hölder inequality:

∫_Ω |f|^p dµ ≤ ( ∫_Ω |f|^q dµ )^{p/q} µ(Ω)^{1−p/q}.
Then

y_{m_k} / y_{m_k}^{i_0} = x_{i_0} − z_k,    where    z_k = − Σ_{i≠i_0} (y_{m_k}^i / y_{m_k}^{i_0}) x_i.

Hence, z_k → x_{i_0}, from which we conclude that x_{i_0} ∈ Y, a contradiction with linear independence.
Now that we know that all the coordinates y_m^i are bounded, we use the Bolzano-Weierstrass Theorem to
pass to subsequences y_{m_k}^i → y^i. Then form the vector y = Σ_i y^i x_i. We have

‖y_{m_k} − y‖ ≤ Σ_i |y_{m_k}^i − y^i| ‖x_i‖ → 0.
Conversely, let us assume now that dim X = ∞ and show that the ball is not compact.
It suffices to construct a separated sequence of vectors x1 , x2 , ... so that all kxn k = 1 and
kxn − xm k ≥ 1/2. Indeed, any such sequence would not contain a convergent subsequence.
To this end, let us fix an arbitrary first vector x1 ∈ S(X). Consider the space Y1 = span x1 ,
and find x2 ∈ S(X) such that
‖[x2]‖_{X/Y1} = dist{x2, Y1} > 1/2.

We can do that as follows. Consider any [x] ∈ S(X/Y1). Note that there is a vector y1 ∈ Y1
such that ‖x + y1‖ < 2. Then letting x2 = (x + y1)/‖x + y1‖ ∈ S(X) we obtain

‖[x2]‖_{X/Y1} = inf_{y∈Y1} ‖ x/‖x + y1‖ + y1/‖x + y1‖ + y ‖ = ‖[x]‖_{X/Y1} / ‖x + y1‖ > 1/2.
Then consider Y2 = span{x1 , x2 } and find x3 ∈ S(X) with dist{x3 , Y2 } > 1/2, and so on.
The process will never terminate since X is not a span of finitely many vectors.
Now, if n > m, then xm ∈ Ym ⊂ Yn−1 , and so dist{xn , Yn−1 } > 1/2. Hence, kxn − xm k >
1/2.
Corollary 2.13. A subset K of a finite-dimensional space X is precompact if and only if it
is bounded.
Theorem 2.14. On a finite dimensional linear space X all norms are equivalent.
Proof. By transitivity, it suffices to show that every norm on X is equivalent to the norm of
ℓ_1^n. So, let ‖·‖ be a norm on X, and let {ei}_{i=1}^n be a unit basis in X. Then for x = Σ_i xi ei,

‖x‖ ≤ Σ_i |xi| ‖ei‖ ≤ Σ_i |xi| =: ‖x‖_1.
To establish an inequality from below, let us consider the norm-function N (x) = kxk. By
compactness of Sk·k1 (X) and continuity of N , N attains its minimum on Sk·k1 (X) at x0 ,
see Corollary 1.7. Then N (x0 ) = c > 0, since N never vanishes on a non-zero vector. So,
kxk ≥ c, for all x ∈ Sk·k1 (X), and hence kxk ≥ ckxk1 , by homogeneity.
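In a concrete finite-dimensional example the optimal constant c can be estimated directly. Below is a rough Python sketch (by random sampling rather than by the compactness argument, purely as an illustration) for the pair ‖·‖ = ‖·‖_2 and ‖·‖_1 on Rⁿ, where the exact constant is 1/√n:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 3
    samples = rng.standard_normal((100000, n))
    samples /= np.sum(np.abs(samples), axis=1, keepdims=True)    # project onto the l_1 unit sphere
    c_estimate = np.linalg.norm(samples, axis=1).min()           # minimum of ||x||_2 over the sample
    print(c_estimate, "compared with the exact constant", 1 / np.sqrt(n))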
2.8. Convex sets. We say that a set A ⊂ X is convex if x, y ∈ A implies λx + (1 − λ)y ∈ A
for all 0 < λ < 1, i.e. with every pair of points A contains the interval connecting them. A
direct consequence of homogeneity and triangle inequality of the norm is that any ball is a
convex set. Let us recall that for a set A ⊂ X we define the convex hull of A as the set of
all convex combinations of elements from A:
conv A = { Σ_{i=1}^N λi ai : ai ∈ A, Σ_{i=1}^N λi = 1, λi ≥ 0, N ∈ N }.
The topological closure of the convex hull conv A is the same as the smallest closed convex
set containing A, or the intersection of such sets.
Theorem 2.15 (Carathéodory). Let A ⊂ Rⁿ. Then every point a ∈ conv A can be represented
as a convex combination of at most n + 1 elements of A.
Proof. Suppose x = Σ_{i=1}^N λi ai, all λi > 0, Σ_i λi = 1, and N > n + 1. We will find a way
to introduce a correction into the convex combination above so as to reduce the number of
elements in the sum by 1. Then the proof follows by iteration.
First, let us observe that since N > n + 1, the number of elements in the family a2 − a1,
a3 − a1, . . . , aN − a1 is larger than the dimension, and hence they are not linearly independent.
So, we can find constants ti ∈ R, not all of which are zero, such that Σ_{i=2}^N ti (ai − a1) = 0.
Denoting t1 = − Σ_{i=2}^N ti, we can write

Σ_{i=1}^N ti ai = 0.
By reversing the sign of all the ti's if necessary, we can assume that at least one of them is
positive. We will now adjust the original convex combination by a constant multiple of the
zero-sum above, thus not changing x:

x = Σ_{i=1}^N λi ai − ε Σ_{i=1}^N ti ai = Σ_{i=1}^N (λi − ε ti) ai.

Letting ε = min_{ti>0} {λi/ti} ensures that µi = λi − ε ti ≥ 0 for all i, and that µ_{i0} = 0 for some i0.
Yet, clearly, Σ_i µi = 1. Thus, the new representation

x = Σ_{i=1}^N µi ai

is a convex combination involving at most N − 1 elements of A. Iterating, we arrive at a
representation with at most n + 1 elements.
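The reduction step in the proof is itself an algorithm. Below is a minimal Python sketch (an illustration of ours, using an SVD to find the affine dependence; all names are hypothetical) that reduces a convex representation in Rⁿ to at most n + 1 active points.

    import numpy as np

    def caratheodory(points, weights, tol=1e-12):
        """Reduce a convex combination sum(w_i * p_i) to at most n+1 active points,
        following the proof: find an affine dependence among the active points and
        shift the weights along it until one of them vanishes."""
        pts, w = np.asarray(points, float), np.asarray(weights, float)
        n = pts.shape[1]
        while np.count_nonzero(w > tol) > n + 1:
            idx = np.flatnonzero(w > tol)
            A = np.vstack([pts[idx].T, np.ones(len(idx))])   # encodes sum t_i a_i = 0 and sum t_i = 0
            t = np.linalg.svd(A)[2][-1]                      # a null vector of A (affine dependence)
            if t.max() <= 0:
                t = -t
            eps = np.min(w[idx][t > tol] / t[t > tol])       # largest step keeping all weights nonnegative
            w[idx] -= eps * t
            w[w < tol] = 0.0
        return w

    P = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5], [0.2, 0.7]], float)
    lam = np.full(6, 1 / 6)
    mu = caratheodory(P, lam)
    print(np.count_nonzero(mu), "active points;", "same point:", np.allclose(mu @ P, lam @ P))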
2.9. Linear bounded operators. A map T : X → Y between two linear spaces is called a
linear operator if T(αx + βy) = αT(x) + βT(y). We usually drop the parentheses, T(x) = Tx,
when a linear operator is in question. Suppose (X, ‖·‖X) and (Y, ‖·‖Y) are normed. A
linear operator T : X → Y is called bounded or continuous if there exists a constant C > 0
such that
(8) kT xkY ≤ CkxkX ,
holds for all x ∈ X. We denote the set of all linear bounded operators between X and Y by
L(X, Y ). If X = Y , we simply denote L(X, X) = L(X).
The following theorem justifies the terminology.
Theorem 2.17. Let T : X → Y be a linear operator. The following are equivalent:
(i) T ∈ L(X, Y );
(ii) T maps bounded sets into bounded sets;
(iii) T is continuous as a map between X and Y endowed with their norm topologies;
(iv) T is continuous at the origin.
Proof. The implication (i) ⇒ (ii) is clear from (8). Conversely, if (ii) holds then T, in particular,
is bounded on the unit ball of X, i.e. there exists a C > 0 such that ‖Tx‖Y ≤ C for all x ∈ B(X). If
x ∈ X is an arbitrary non-zero vector, then x̄ = x/‖x‖ ∈ S(X), and hence ‖T x̄‖Y ≤ C. So, by linearity, we
obtain (8).
The implication (i) ⇒ (iv) is also clear directly from (8). If (iv) holds, and x0 ∈ X is
arbitrary, then for y → 0, by linearity and continuity at the origin, we have
T(x0 + y) = Tx0 + Ty → Tx0,
showing that T is continuous at x0. Thus, (iii) holds. Finally, if (iv) holds, then there is a
δ > 0 such that ‖x‖X < δ implies ‖Tx‖Y < 1. So, if x ≠ 0 is arbitrary, consider x0 = δx/(2‖x‖).
Then ‖Tx0‖ ≤ 1 implies ‖Tx‖ ≤ 2‖x‖/δ, giving us (8).
We define the operator norm ‖T‖ as the smallest admissible constant in (8), i.e. ‖T‖ = inf{C > 0 : (8) holds}.
In particular, for any x ∈ X, ‖Tx‖ ≤ C‖x‖ holds for every C > 0 for which (8) holds. This
shows that the infimum itself provides the bound from above:
‖Tx‖Y ≤ ‖T‖ ‖x‖X,   ∀x ∈ X.
The set L(X, Y) of all bounded linear operators between X and Y clearly forms a linear
space, and the exercises below show that the operator norm endows it with a norm.
Theorem 2.18. If (X, k · kX ) is normed and (Y, k · kY ) is Banach, then the space L(X, Y )
is Banach in its operator norm.
Proof. Suppose {Tn} ⊂ L(X, Y) is Cauchy. Then, in particular, for any fixed x ∈ X the
sequence {Tn x} is Cauchy in Y, and since Y is complete we may define

T(x) = lim_{n→∞} Tn x.

By linearity of the Tn's the limit is a linear operator as well. To show that
it is bounded, observe that the original sequence of operators is bounded, thus ‖Tn‖ ≤ M
for some M and all n. So, ‖Tx‖Y ≤ M‖x‖X for all x ∈ X, which proves boundedness. Finally,
to show that Tn → T in operator norm, let us fix ε > 0 , then for all n, m large and all
x ∈ B(X) we have
kTn x − Tm xkY < ε.
Let us keep n fixed and let m → ∞. We already know that Tm x → T x, thus,
kTn x − T xkY ≤ ε
holds for all x ∈ B(X) and all n large. This gives kTn − T k ≤ ε for all n large, which
completes the proof.
Let us consider two important examples of operators on ℓp, the left shift and the right shift:

(11)    Tl x = (x2, x3, . . . ),    Tr x = (0, x1, x2, . . . ).
Clearly, Tl is not injective, while Tr is not surjective. So, neither of them is invertible. Yet,
Tl ∘ Tr = I is obviously invertible. This means that the converse to Lemma 2.21 does not hold:
invertibility of the product does not translate into invertibility of the individual factors. It turns
out that the converse does hold if we additionally assume that the operators commute. Note that this
is not the case in our example, since
Tr ∘ Tl x = (0, x2, x3, . . . ).
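These identities are easy to visualize on truncated sequences; a tiny Python illustration (finite lists standing in for elements of ℓp):

    def left_shift(x):
        return list(x[1:])            # T_l x = (x_2, x_3, ...)

    def right_shift(x):
        return [0] + list(x)          # T_r x = (0, x_1, x_2, ...)

    x = [1, 2, 3, 4]
    print(left_shift(right_shift(x)))   # [1, 2, 3, 4]   : T_l T_r = I
    print(right_shift(left_shift(x)))   # [0, 2, 3, 4]   : T_r T_l erases the first coordinate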
Lemma 2.22. Let T, S ∈ L(X). If ST = T S and ST is invertible, then both S and T are
invertible.
Proof. Denote by U = (ST )−1 . Let us show that U commutes with S. Indeed, we have
S(ST ) = (ST )S.
Let us multiply this identity by U from the left and from the right
U S(ST )U = U (ST )SU.
Using that U is inverse to ST , we obtain
U S = SU.
Now let us check that U S is the inverse to T . Indeed,
(U S)T = U (ST ) = I,
and
T (U S) = T (SU ) = (T S)U = I.
Similarly, U T = T U , and U T is the inverse of S.
Exercise 2.2. Generalize Lemma 2.22 to several operators: suppose T1, . . . , Tn ∈ L(X), Ti Tj = Tj Ti
for all i, j, and T1 ∘ · · · ∘ Tn is invertible. Then all the Ti's are invertible. Hint: use induction.
Invertible operators are sometimes called isomorphisms between two Banach spaces. If an
isomorphism exists between two spaces we call the spaces isomorphic, and denote X ≈ Y .
T is called an isometric isomorphism if T is invertible and T is an isometry, i.e. it preserves
the lengths of vectors
(12) kT xk = kxk, ∀x ∈ X.
We denote by X ∼ = Y isometrically isomorphic spaces. We generally don’t distinguish
between such spaces, and simply refer to them as equal, although sometimes specify the
identification rule, i.e. T : X → Y , between their elements.
We can extend this terminology to operators which are not necessarily surjective yet act
as isomorphisms onto their own images. We say that T : X → Y is an isomorphic embedding
if (9) holds.
Lemma 2.23. The range of any operator T bounded from below is a closed subspace of Y .
‖xn‖ − 1 ≤ ‖yn‖ ≤ ‖xn‖ + 1. Thus, ‖yn‖/‖xn‖ → 1. We have xn/‖xn‖ + yn/‖xn‖ → 0. On the other
hand,

xn/‖xn‖ + yn/‖xn‖ = xn/‖xn‖ + yn/‖yn‖ + (yn/‖yn‖) (‖yn‖/‖xn‖ − 1),

so that

‖ xn/‖xn‖ + yn/‖xn‖ ‖ ≥ ‖ xn/‖xn‖ + yn/‖yn‖ ‖ − | ‖yn‖/‖xn‖ − 1 |.

In view of all of the above, ‖ xn/‖xn‖ + yn/‖yn‖ ‖ → 0, in contradiction with our assumption.
Corollary 2.26. If Z = X + Y is a direct sum of closed subspaces, and dim X < ∞, then
the spaces X and Y are complemented.
Proof. Indeed, if the spheres of X and Y are not separated, then there exist sequences
xn ∈ S(X) and yn ∈ S(Y) such that ‖xn − yn‖ → 0. Since dim X < ∞, by compactness we can choose a
subsequence xnk → x ∈ S(X). Then ynk → x as well, which by the closedness of Y
implies x ∈ Y, so x is a non-zero vector of X ∩ Y, a contradiction.
In the situation described above, the subspace Y ⊂ Z is called finite co-dimensional.
2.12. Completion. For every incomplete normed space (X, ‖·‖) there is a way to complete it,
i.e. to embed it densely into a Banach space. To do that, consider the space X̃ of all Cauchy
sequences {xn}n in X endowed with the ℓ∞ norm. Note that for any Cauchy sequence limn ‖xn‖ exists.
So, X̃ is in fact a special subspace of X ⊕∞ X ⊕∞ · · ·. One can show that it is complete by a
diagonalization procedure. Next, we consider the subspace of vanishing sequences
X̃0 = {{xn}n : limn ‖xn‖ = 0}, and the quotient space X̄ = X̃/X̃0. One can show that the
quotient-norm of a conjugacy class is given by ‖[{xn}]‖ = limn ‖xn‖. Then one can embed
i : X → X̄ by assigning i(x) = (x, x, . . .). One can show that i is an isometric embedding of X
into X̄, that Rg i is dense in X̄, and of course X̄ is complete by construction. The space X̄ is
called a completion of X.
2.13. Extensions, restrictions, reductions. If T : X → Y is a bounded operator and
X0 = Ker T, we can construct a new operator T̃ : X/X0 → Y by the rule T̃ ∘ J = T, where
J : X → X/X0 is the quotient map. One can easily check that this definition is not ambiguous.
Moreover, one has ‖T‖ ≤ ‖T̃‖‖J‖ = ‖T̃‖. On the other hand, if ‖T̃[x]‖ ≥ ‖T̃‖ − ε and ‖[x]‖ < 1,
then for some x0 ∈ X0, ‖x + x0‖ < 1, and yet T̃[x] = T(x + x0). This shows the opposite
inequality ‖T̃‖ ≤ ‖T‖. Thus, ‖T̃‖ = ‖T‖. Notice that the reduced operator T̃ has trivial kernel.
If Y ⊂ X is a dense linear subspace of a Banach space X and T : Y → Z is a bounded
operator to another Banach space Z, then one can uniquely extend T to the entire X in a
linear fashion and preserving the norm of T . Indeed, if x ∈ X, then there exists a sequence
yn → x, yn ∈ Y . This implies in particular that {T yn } is a Cauchy sequence in Z. Since
Z is complete, it has a limit, which we call T̃ x. This limit is independent of the original
sequence yn , simply because another such sequence yn0 would satisfy kyn − yn0 k → 0, hence
kT yn − T yn0 k → 0. Linearity follows similarly. Also if x ∈ Y , then we can pick yn = x, so
T̃ = T on Y . Furthermore, one can show that kT̃ kX→Z = kT kY →Z . The operator T̃ is called
a bounded extension of T . Show that there exists only one such extension.
Let us note that if Y is not a dense subspace, and more specifically if Y ⊂ X is a proper
closed subspace, then an extension of T : Y → Z to X may not be possible unless we know
some specific information about the space Y itself. For instance, if Y is complemented,
Y ⊕ W = X, then one could extend T to X by declaring Tw = 0 for all w ∈ W. However,
this extension could increase the norm of T.
If T : X → Z, and Y ⊂ X, we define the restriction of T onto Y by T |Y (y) = T y. Note
that the norm of the restricted operator may decrease in general.
2.14. Exercises.
Exercise 2.3. Show that S(X) is compact for any finite-dimensional space X.
Exercise 2.4. Show that
‖T‖ = sup_{x∈B(X)} ‖Tx‖Y = sup_{x∈S(X)} ‖Tx‖Y.
These identities say that the norm of an operator is the measure of deformation of the
unit ball of X under T . In particular, if kT k ≤ 1, then T is called a contraction.
Exercise 2.5. Suppose that T, S ∈ L(X, Y ). Then
kT + Sk ≤ kT k + kSk.
Exercise 2.6. Suppose that T : X → Y and S : Y → Z are bounded. Prove that
kS ◦ T k ≤ kSkkT k.
Exercise 2.7. Let a = (a1 , a2 , ...) ∈ `∞ be a sequence. Define T : `p → `p by
(13) T x = (a1 x1 , a2 x2 , . . .).
Show that T is bounded and kT k = kak∞ .
Exercise 2.8. Prove that if T is invertible, then the sharp constant in (9) is in fact c =
kT −1 k−1 .
Exercise 2.9. Show that ℓ_1^2 ≅ ℓ_∞^2, but ℓ_1^n ≇ ℓ_∞^n for all n ≥ 3.
Exercise 2.10. Show that the norm of any projection operator is at least 1.
Exercise 2.11. Prove that a bounded operator P : X → X is a projection onto a subspace if
and only if it is idempotent, P 2 = P .
Exercise 2.12. Show that if Z = X ⊕ Y , then Z/X ≈ Y . Hint: consider the projector
P : Z → Y along X, and its factor by the kernel P̃ : Z/X → Y .
3. Fundamental Principles
3.1. The dual space. If the target space of a linear operator is R, or C, the field over which
X itself is defined, the operator is called a linear functional, real or complex respectively.
The space of all linear functionals on X is denoted X 0 , while the space of all linear bounded
functionals is denoted by X ∗ , and called the dual space. It is often possible to identify the
dual of a Banach space up to an isometry.
Example 3.1. c0* ≅ ℓ1, and ℓp* ≅ ℓq, where 1/p + 1/q = 1 and 1 ≤ p < ∞. The dual of ℓ∞ is a very
big space of finitely additive measures of bounded variation on N.
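As a numerical sanity check of the ℓp-ℓq duality (an illustration on truncated sequences, not part of the construction): the functional x ↦ Σ aᵢxᵢ induced by a ∈ ℓq has norm ‖a‖_q on ℓp, attained at the vector with coordinates proportional to sign(aᵢ)|aᵢ|^{q−1}.

    import numpy as np

    rng = np.random.default_rng(2)
    p, q = 4.0, 4.0 / 3.0                                 # conjugate exponents
    a = rng.standard_normal(8)                            # a (truncated) element of l_q
    lp = lambda v, r: np.sum(np.abs(v) ** r) ** (1.0 / r)

    x_star = np.sign(a) * np.abs(a) ** (q - 1)            # extremizer suggested by the Hoelder argument
    x_star /= lp(x_star, p)
    print(x_star @ a, "should equal ||a||_q =", lp(a, q))

    xs = rng.standard_normal((100000, 8))                 # random points on the unit sphere of l_p
    xs /= (np.sum(np.abs(xs) ** p, axis=1) ** (1.0 / p))[:, None]
    print(bool(np.max(xs @ a) <= x_star @ a + 1e-9))      # True: no sample beats the extremizer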
Let us carry out the construction for ℓp, p < ∞. First, we notice that the space has what is
called a Schauder basis.
Proof. The implications (a) ⇒ (b) ⇒ (c) are trivial. Suppose that Ker(f ) is not dense. Then
for some ball Br (x0 ) ∩ Ker(f ) = ∅. Let y ∈ S(X). Then g(t) = f (x0 + ty) = f (x0 ) + tf (y)
is a continuous non-vanishing function on (−r, r). This implies that the sign of it has to
agree with that of f (x0 ). Assuming f (x0 ) > 0 we then have f (x0 ) + tf (y) > 0, and so,
|f (y)| ≤ r−1 f (x0 ). This shows that f is bounded on the unit sphere and completes the
proof.
Geometrically, linear bounded functionals can be identified with affine hyperplanes. Thus,
if f ∈ X*, f ≠ 0, then H(f) = {x ∈ X : f(x) = 1} defines f uniquely. If f ∈ S(X*), then
the hyperplane is in some sense tangent to the unit sphere of X: it does not dip into
the interior of the unit ball, and it approaches arbitrarily close to S(X). Note that a
functional may not necessarily attain its norm, i.e. its supremum over the sphere. For
example, let X = ℓ1 and f = (1/2, 2/3, 3/4, . . .) ∈ S(ℓ∞). There is no sequence x ∈ S(ℓ1)
for which f(x) = 1. If x ∈ S(X) and f ∈ S(X*) with f(x) = 1, then f is called a supporting
functional of x. The existence of supporting functionals is not immediately obvious, and it brings
us to an even more fundamental question: does there exist at least one non-zero bounded
linear functional on a given normed space?
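The non-attainment in the example above is easy to see numerically: f(eₙ) = n/(n+1) approaches 1, but f(x) < 1 for every fixed x ∈ S(ℓ1). A three-line Python illustration on truncations:

    f = [k / (k + 1) for k in range(1, 200)]     # (1/2, 2/3, 3/4, ...), an element of S(l_infty)
    x = [0.0] * 199
    x[150] = 1.0                                 # a unit vector of l_1 concentrated far out
    print(sum(fi * xi for fi, xi in zip(f, x)))  # 151/152 < 1; no unit vector of l_1 reaches 1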
3.3. The Hahn-Banach extension theorem. The essence of the Hahn-Banach extension
theorem is to show that a given bounded functional defined on a linear subspace of X can
be extended boundedly to the whole space X retaining not only its boundedness but also
its norm. The boundedness can be expressed as the condition of domination by the norm-
function, i.e. if Y ⊂ X and f ∈ Y 0 then f ∈ Y ∗ if and only if
f (y) ≤ Ckyk,
for some C > 0. We will in fact need a more general extension result that will be useful
when establishing separation theorems later in Section 3.6. We thus consider a positively
homogeneous convex functional p : X → R ∪ {∞}, which means that p(tx) = tp(x) for all
x ∈ X and t ≥ 0, and p(λx + (1 − λ)y) ≤ λp(x) + (1 − λ)p(y), for all 0 < λ < 1, x, y ∈ X.
The latter is equivalent to the triangle inequality, p(x + y) ≤ p(x) + p(y). Note that a norm,
or a quasi-norm, is an example of such a functional. We say that p dominates f on Y if
f (y) ≤ p(y) for all y ∈ Y .
Theorem 3.4 (Hahn-Banach extension theorem). Suppose Y ⊂ X, and p is a positively
homogeneous convex functional defined on X. Then every linear functional f ∈ Y′ dominated
by p on Y can be extended to a linear functional f ∈ X′ dominated by the same p on all
of X.
In the core of the proof lies Zorn's Lemma, which we recall briefly. Let P be a partially
ordered set. A subset C of P is called a chain if every two of its elements are comparable, i.e.
∀a, b ∈ C either a ≤ b or b ≤ a. An upper bound for a set A ⊂ P is an element b ∈ P such
that b ≥ a, for all a ∈ A. A maximal element m is an element with the property that if
a ≥ m, then a = m. Generally, it may not be unique.
Lemma 3.5 (Zorn’s Lemma). If every chain of P has an upper bound, then P contains a
maximal element.
Zorn’s lemma is equivalent to the Axiom of Choice.
3.3.2. Second dual space. For a normed space X, one can consider the dual of the dual space,
X ∗∗ , called second dual. There is a canonical isometric embedding i : X ,→ X ∗∗ defined
as follows: i(x)(x∗ ) = x∗ (x). It is convenient to use parentheses to indicate action of a
functional: x∗ (x) = (x∗ , x). In this notation (i(x), x∗ ) = (x∗ , x) or simply, (x, x∗ ) = (x∗ , x).
To show that i is an isometry, notice that |(i(x), x∗ )| ≤ kx∗ kX ∗ kxkX , thus ki(x)kX ∗∗ ≤ kxkX .
On the other hand, let x* be a supporting functional for x. Then ‖x*‖X* = 1, and (i(x), x*) =
x*(x) = ‖x‖X. We will think of X as a subspace of X** with the natural identification of
elements described above.
Definition 3.6. If the embedding X ,→ X ∗∗ covers all of X ∗∗ , i.e. X = X ∗∗ , then X is
called reflexive.
We will return to a discussion of reflexive spaces later as they possess very important
compactness properties.
3.4. Adjoint operators. Let T : X → Y be a bounded operator. We can define the adjoint
or dual operator T ∗ : Y ∗ → X ∗ by the rule
(T ∗ y ∗ , x) = (y ∗ , T x).
Again, using the Hahn-Banach theorem we show that
kT k = kT ∗ k.
First, |(T ∗ y ∗ , x)| ≤ ky ∗ kkT xk ≤ ky ∗ kkT kkxk. This shows kT ∗ k ≤ kT k. Let now ε > 0 be
given. Find x ∈ S(X) such that kT xk ≥ kT k − ε. Then let y ∗ ∈ S(Y ∗ ) be a supporting
functional for T x. We have (T ∗ y ∗ , x) = (y ∗ , T x) = kT xk ≥ kT k − ε. This shows the opposite
inequality.
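In finite dimensions the adjoint with respect to the standard pairings is represented by the transposed matrix, and the equality ‖T‖ = ‖T*‖ can be observed numerically (a small illustration for the Euclidean operator norm, not part of the text):

    import numpy as np

    rng = np.random.default_rng(3)
    T = rng.standard_normal((5, 7))               # an operator from R^7 to R^5
    op_norm = lambda A: np.linalg.norm(A, 2)      # largest singular value = operator norm on l_2
    print(op_norm(T), op_norm(T.T))               # the two values coincide: ||T|| = ||T*||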
Lemma 3.7. The range of T is not dense in Y if and only if Ker T* is non-trivial.
Proof. Indeed, if the closure Z of Rg T is a proper subspace of Y, then we can find a point y0 ∉ Z. Define
a bounded linear functional on span{y0, Z} by y*(z) = 0 for z ∈ Z and y*(y0) = 1. Then extend it
to y* ∈ Y* by the Hahn-Banach theorem. We have (T*y*, x) = (y*, Tx) = 0, because y*
vanishes on the range of T. Hence, y* ∈ Ker T*.
Conversely, if y* ∈ Ker T* and y* ≠ 0, then Rg T ⊂ Ker y*, and hence the closure of Rg T is
contained in the proper closed subspace Ker y* of Y.
3.5. Minkowski's functionals. Let us recall from Section 3.3 that a function p : X →
R ∪ {∞} is called convex if for any x, y ∈ X one has p(λx + (1 − λ)y) ≤ λp(x) + (1 − λ)p(y)
for all 0 < λ < 1. A function q : X → R ∪ {−∞} is called concave if for any x, y ∈ X one
has q(λx + (1 − λ)y) ≥ λq(x) + (1 − λ)q(y) for all 0 < λ < 1. It is easy to see that if p is
convex then the sub-level sets {p ≤ p0} are convex, and if q is concave then the super-level
sets {q ≥ q0} are convex.
Suppose A ⊂ X is convex and 0 ∈ A. We associate to A a convex function, called
Minkowski’s functional, pA so that A is ”almost” given as a sub level set of pA . We define
pA (x) as follows. Suppose there is no t ≥ 0 for which x ∈ tA, then pA (x) = ∞. If x ∈ tA for
some t ≥ 0, we set
pA (x) = inf{t ≥ 0 : x ∈ tA}.
We list the basic properties of the Minkowski’s functional.
(a) pA is positively homogeneous and convex;
(b) {pA < 1} ⊂ A ⊂ {pA ≤ 1}.
For α > 0, x ∈ tA if and only if αx ∈ αtA. This readily implies (a). Notice that for
positively homogeneous functionals convexity is equivalent to the triangle inequality, pA(x + y) ≤
pA(x) + pA(y). So, let x, y ∈ X. If either pA(x) or pA(y) equals ∞, the inequality becomes
trivial. If both are finite, then for every ε > 0 we can find t1 < pA(x) + ε and t2 < pA(y) + ε
such that x ∈ t1 A and y ∈ t2 A. Then

x + y ∈ t1 A + t2 A = (t1 + t2) ( (t1/(t1 + t2)) A + (t2/(t1 + t2)) A ) ⊂ (t1 + t2) A.

This shows that pA(x + y) ≤ pA(x) + pA(y) + 2ε, for all ε > 0. Finally, (b) follows directly
from the definition and the fact that 0 ∈ A.
Suppose now B is another convex set which stays away from the origin, i.e.
there is δ > 0 such that δB(X) ∩ B = ∅. We can associate a similar, but now concave,
functional to B as follows. If there is no t ≥ 0 with x ∈ tB, then qB(x) = −∞. Otherwise, we define

qB(x) = sup{t ≥ 0 : x ∈ tB}.
Condition δB(X)∩B = ∅ warrants that the supremum is finite for any x ∈ X. The following
list of properties can be established in a similar fashion:
(a) qB is positively homogeneous and concave;
(b) {qB > 1} ⊂ B ⊂ {qB ≥ 1}.
Suppose now that we have two disjoint convex sets A and B satisfying all the assumptions
above, and let pA and qB be the corresponding Minkowski functionals. If pA(x) < ∞, let
t ≥ 0 be such that x ∈ tA. Since 0 ∈ A, the whole interval [0, x] lies in tA and therefore does not
meet tB. This in turn implies that x ∉ sB for any s ≥ t, for if such an s existed, then (t/s)x ∈ tB,
contradicting the previous observation. As a consequence, qB(x) ≤ t. We have shown that

(18)    qB(x) ≤ pA(x),    for all x ∈ X.
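For a concrete convex set the value pA(x) can be computed numerically straight from the definition. A short Python sketch (bisection in t, an illustration of ours, with the closed unit ball of ℓ1 in R² as the set A, so that pA recovers the ℓ1-norm):

    import numpy as np

    def minkowski(in_A, x, t_max=1e6, tol=1e-9):
        """p_A(x) = inf{t >= 0 : x in tA}, computed by bisection; assumes 0 in A, A convex,
        membership in A supplied as a predicate, and x contained in t_max * A."""
        lo, hi = 0.0, t_max
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if in_A(np.asarray(x) / mid):        # x in mid*A  <=>  x/mid in A  (for mid > 0)
                hi = mid
            else:
                lo = mid
        return hi

    ball_l1 = lambda y: np.sum(np.abs(y)) <= 1.0
    print(minkowski(ball_l1, [3.0, -1.0]))        # approximately 4.0 = ||(3, -1)||_1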
Proof. Suppose A ⊄ B; then there exists x0 ∈ A and δ > 0 such that Bδ(x0) ∩ B = ∅. By
moving x0 to the origin we satisfy all the conditions on A and B as above, which allows us
to define the Minkowski functionals, with qB ≤ pA. Thus, we have (19) for f = 0 and Y = {0}. By
Theorem 3.8, there exists an extension f for which (20) holds, and thus, in view of (18), f
separates A and B.
If A has a non-empty interior then clearly (i) holds. Let us assume that εB(X) ⊂ A and
let f be the functional constructed above. Since f(x) ≤ pA(x), we conclude that whenever
‖x‖ ≤ ε, then f(x) ≤ 1. This shows ‖f‖ ≤ ε⁻¹.
Finally, if A = {x0} and B is closed, we apply case (ii) to A = Bδ(x0) for small δ > 0. Since
f ≠ 0, there is y ∈ S(X) for which f(y) > 0. Thus, f(x0) < f(x0) + (δ/2)f(y) ≤ inf f(B).
The condition of (i) is not sufficient to guarantee that a bounded separator would exist.
Let us consider the following example. Let X = ℓ_{2,0} be the linear space of finitely supported sequences
endowed with the ℓ2-norm, A = conv{e_n}_n, and B = (1/2)A. These are convex disjoint sets.
Notice that for any b ∈ B, writing b = (1/2) Σ_i λi e_i with λi ≥ 0 and Σ_i λi = 1, we have

‖b‖_2² = (1/4) Σ_i |λi|² ≤ (1/4) Σ_i λi = 1/4.

Thus, B ⊂ (1/2)B̄(X). Since e_n ∉ (1/2)B̄(X) for any n, the conditions of Theorem 3.9 (i) are satisfied.
Next observe that 0 ∈ Ā ∩ B̄, as seen by considering the sequences xn = (e_1 + · · · + e_n)/n and
yn = (1/2)xn. Suppose f ∈ ℓ2 \ {0} separates the two sets, i.e. sup f(A) ≤ c ≤ inf f(B). Since 0
is in the closure of both sets, we conclude that c = 0. Then, as a sequence, f is coordinate-wise
non-positive (since f(e_n) ≤ 0) and at the same time coordinate-wise non-negative (since f(e_n/2) ≥ 0).
Thus, f = 0, a contradiction. It is easy to construct an unbounded separator though, by taking f = (1, 1, . . .).
Corollary 3.10. Let S ⊂ X be a subset of a normed space. Then the closed convex hull of S equals

⋂_{f∈X*} {x : f(x) ≤ sup f(S)}.

Indeed, the inclusion ⊂ is obvious. If, however, x0 does not belong to the closed convex hull of S, then
Theorem 3.9 (iii) provides a functional f such that f(x0) < inf f(S). Reversing the sign of f shows that
x0 is not in one of the sets in the intersection.
3.7. Baire Category Theorem. Let us consider for a moment a general complete metric
space X, without necessarily a given linear structure. Let us ask ourselves how ”big” such a
space can be. One might say that a singleton X = {x0 } is an obvious example of a ”small”
space. On the other hand, it is not that small compared to its own standards – that one
point is closed and open at the same time, and it is in fact a ball of any radius centered
around itself. It is therefore rather ”large”.
To make the terms "small" and "large" more precise let us agree on what we mean by a
"small" subset first. We say that F is nowhere dense in X if for any open set U there exists
an open subset V ⊂ U with no intersection with F. In other words, the closure of F has empty interior.
A subset F ⊂ X is called of 1st Baire category if F = ⋃_{i=1}^∞ Fi, where all the Fi's are nowhere
dense. A subset is of 2nd Baire category if it is not of the 1st category. Thus, in the example
above X itself is clearly of the 2nd category. This in fact holds in general.
Proof. Let us suppose, on the contrary, that X = ⋃_i Fi, and all the Fi's are nowhere dense. Then
there is a closed ball B̄ε1(x1) with ε1 < 1 disjoint from F1. Since F2 is also nowhere dense, there
is a closed ball B̄ε2(x2) ⊂ B̄ε1(x1), with ε2 < 1/2, disjoint from F2. Continuing in the same manner
we find a sequence of nested closed balls B̄εn(xn), with εn < 1/n. Clearly, d(xn, xm) < 1/m
for all n > m. Thus, the sequence {xn} is Cauchy. By completeness, there exists a limit x,
which belongs to all the balls, and hence to none of the Fj's, a contradiction.
There are many consequences of the Baire category theorem, some of which are given in
the exercises below.
Theorem 3.12 (Banach-Steinhaus uniform boundedness principle). Let F ⊂ L(X, Y)
be a family of bounded operators, where X is a Banach space. Suppose that for every x ∈ X,
sup_{T∈F} ‖Tx‖ < ∞. Then the family is uniformly bounded, i.e. sup_{T∈F} ‖T‖ < ∞.
Proof. Let Fn = {x ∈ X : sup_{T∈F} ‖Tx‖ ≤ n}. Note that each set Fn is closed, and by
assumption X = ⋃_n Fn. Hence, by the Baire Category Theorem, one of the Fn's contains a
ball Br(x0). This implies that for all x ∈ B(X) and all T ∈ F, ‖T(x0 + rx)‖ ≤ n. Thus,
‖Tx‖ ≤ r⁻¹(n + ‖Tx0‖) ≤ r⁻¹(n + sup_{T∈F} ‖Tx0‖), implying the desired result.
Corollary 3.13. Let S ⊂ X be a subset such that for every x∗ ∈ X ∗ , sup |x∗ (S)| < ∞.
Then S is bounded.
Indeed, if viewed as a subset of X**, S is a pointwise bounded family of operators on X*. Hence,
it is norm-bounded by the Banach-Steinhaus theorem.
3.8. Open mapping and Closed graph theorem.
Lemma 3.14. Suppose T : X → Y is bounded, and X is a Banach space. Suppose that
B(Y) is contained in the closure of T(B(X)); then (1/2)B(Y) ⊂ T(B(X)).
Proof. Let us note, by linearity of T, that

(23)    rB(Y) is contained in the closure of T(rB(X)), for any r > 0.

Let us fix y ∈ (1/2)B(Y) and a small ε > 0 to be specified later. By (23) we can find
x1 ∈ (1/2)B(X) such that ‖y − y1‖ < ε, where Tx1 = y1. Since y − y1 ∈ εB(Y),
one finds x2 ∈ εB(X) such that ‖y − y1 − y2‖ < ε/2, where Tx2 = y2. Continuing this way,
we construct a sequence {xn}_{n=1}^∞ with ‖xn‖ ≤ ε/2^{n−2} for n ≥ 2. Let x = Σ_n xn. Then by construction
Tx = y, and ‖x‖ ≤ 1/2 + 2ε < 1, provided ε is small.
Theorem 3.15 (Open mapping theorem). Suppose T ∈ L(X, Y ), and both spaces are Ba-
nach. If, in addition, T is surjective, then T is an open mapping, i.e. T (U ) is open for every
open U .
Proof. Suppose U is open. Let x0 ∈ U , and let Bε (x0 ) ⊂ U . We prove the theorem if we show
that T (Bε (x0 )) contains an open neighborhood of T x0 . Since T (Bε (x0 )) = T x0 + εT (B(X)),
it amounts to showing that T (B(X)) contains a ball centered at the origin. Since T is
surjective, we have Y = ∪_n nT (B(X)). By the Baire Category Theorem, one of the sets
nT (B(X)) is dense in some ball Bδ (y0 ), and hence, by linearity, so is T (B(X)). Since
the closure of T (B(X)) is convex and symmetric with respect to the origin,
δB(Y ) ⊂ conv{Bδ (y0 ), Bδ (−y0 )} ⊂ closure of T (B(X)).
Applying Lemma 3.14 to the operator (1/δ) T , we conclude (δ/2) B(Y ) ⊂ T (B(X)) and the proof is
finished.
and thus,
kT xk ≤ (C − 1)kxk.
This implies that T is bounded.
3.9. Exercises.
Exercise 3.1. Show that the Triangle Inequality is equivalent to
(25) |kxk − kyk| ≤ kx ± yk.
Exercise 3.2. The fact that an absolutely convergent series converges does not necessarily
hold in general normed spaces. Prove that a normed space (X, k · k) is Banach if and only
if every absolutely convergent series in X is convergent.
Exercise 3.3. Show that c0 and `p are separable spaces for all 1 ≤ p < ∞.
Exercise 3.4. Show that `∞ is not a separable space. Hint: consider the set of vectors
{xA }A⊂N , where xA is the characteristic function of A.
Exercise 3.5. Verify that the `p -sum of Banach spaces X1 ⊕p X2 ⊕p . . . is a Banach space.
Exercise 3.6. Show that if X = X1 ⊕p . . . ⊕p Xn , then X ∗ = X1∗ ⊕q . . . ⊕q Xn∗ , where p, q are
conjugates. Show that the same is true for infinite `p -sums if 1 ≤ p < ∞.
Exercise 3.7. If a strict inequality holds in (21), then A and B are called strictly separated.
Show that if A is compact and B closed convex disjoint sets, then they can be strictly
separated.
Exercise 3.8. Let T ∗∗ : X ∗∗ → Y ∗∗ be the second adjoint operator, i.e. T ∗∗ = (T ∗ )∗ . Show
that T ∗∗ |X = T .
Exercise 3.9. Show that X ∗ is always complemented in X ∗∗∗ . Hint: consider the adjoint
i∗ : X ∗∗∗ → X ∗ .
Exercise 3.10. Let T ∈ L(X, Y ), and S ∈ L(Y, Z). Show that (S ◦ T )∗ = T ∗ ◦ S ∗ .
Exercise 3.11. Show that in Examples 2.10, 2.11 the norms are not equivalent.
Exercise 3.12. Verify the inequality
kxkq ≤ kxkp ≤ n^{1/p − 1/q} kxkq ,
for vectors x ∈ Rn . Can you interpret the upper bound as a particular case of (7)?
Exercise 3.13. Show that if µ(Ω) = ∞, then the Lp -norms are not comparable on Lp ∩ Lq ,
i.e. neither is stronger than the other.
Exercise 3.14. Prove that T : X → Y is an isomorphism if and only if T ∗ : Y ∗ → X ∗ is.
Exercise 3.15. Show that X is reflexive if and only if X ∗ is reflexive.
Exercise 3.16. Let Y ⊂ X be a closed subspace, and X is Banach. Define Y ⊥ = {f ∈
X ∗ : f |Y = 0}. This is a closed subspace of X ∗ , called the annihilator of Y . Show that
Y ∗ ∼= X ∗ /Y ⊥ , and (X/Y )∗ ∼= Y ⊥ .
Exercise 3.17. Give an example of a vector y ∈ `p which does not belong to the range of the
operator defined in (24).
4. Hilbert Spaces
4.1. Inner product. Let H be a linear space over C. We say that a binary operation
h·, ·i : H × H → C
is an inner product on H if
(1) hαx + βy, zi = αhx, zi + βhy, zi,
(2) hx, yi = hy, xi,
(3) hx, xi ≥ 0 and hx, xi = 0 if and only if x = 0.
Note that in view of conjugate symmetry property (2), the inner product is not linear with
respect to the second variable, but it is conjugate-linear:
(1*) hz, αx + βyi = ᾱhz, xi + β̄hz, yi.
Products satisfying (1)-(1*) are called sesquilinear, i.e. 1½-linear.
In case H is defined over the reals R, we replace assumption (2) with the usual symmetry
hx, yi = hy, xi.
In this case the inner product is bi-linear, i.e. linear with respect to both positions.
Every inner product defines a norm (the inner-product norm) by
kxk = √(hx, xi).
Axioms (1)-(2) of the definition of a norm are easily verified. In order to demonstrate the
triangle inequality we prove the Cauchy-Schwarz inequality first.
Lemma 4.1 (Cauchy-Schwarz inequality). For any pair of vectors x, y ∈ H one has
|hx, yi| ≤ kxkkyk,
and the equality holds if and only if x, y are linearly dependent.
Proof. If y = 0, then the Cauchy-Schwarz inequality is trivial. Let us suppose y 6= 0.
Let us fix a λ ∈ C to be determined later and compute
0 ≤ kx + λyk2 = kxk2 + 2 Re(λ̄hx, yi) + |λ|2 kyk2 .
Setting λ = −hx, yi/kyk2 we obtain
kxk2 − 2 |hx, yi|2 /kyk2 + |hx, yi|2 /kyk2 ≥ 0.
This proves the inequality. If equality holds, then for the above choice of λ we obtain
kx + λyk2 = 0.
Hence, x = −λy.
With the help of the Cauchy-Schwarz inequality we compute
kx + yk2 = kxk2 + 2 Rehx, yi + kyk2 ≤ kxk2 + 2|hx, yi| + kyk2
≤ kxk2 + 2kxkkyk + kyk2 = (kxk + kyk)2 .
This proves the triangle inequality.
Definition 4.2. An inner-product linear space H which is complete in the metric of its
inner-product norm is called a Hilbert space.
Example 4.3. All classical examples are one way or another connected to L2 -spaces. This
is not a coincidence. We will make this more precise later. For now let us consider the space
`2 . Its norm comes from the inner product defined by
hx, yi = Σ_{i=1}^∞ xi ȳi .
Can we definitively say that all the other examples we encountered are not Hilbert spaces
in their canonical norms? To answer this question we first observe that all inner-product
norms satisfy a distinct identity, called the Parallelogram rule:
(26) kx + yk2 + kx − yk2 = 2(kxk2 + kyk2 ), ∀x, y ∈ H.
It is remarkable that this formula does not involve the underlying inner product explicitly,
yet it holds precisely because the norm originates from an inner product.
So, let us check that `p -norm, for p 6= 2, is not an inner-product norm. Simply pick
x = (1, 0, 0, ....), y = (0, 1, 0, ...),
and verify directly that the Parallelogram Rule fails.
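A minimal numerical sketch of this check (the vectors and the values of p below are arbitrary illustrative choices, not part of the text):

import numpy as np

# Parallelogram rule test for the l^p norm of x = (1,0), y = (0,1):
# an inner-product norm must satisfy ||x+y||^2 + ||x-y||^2 = 2(||x||^2 + ||y||^2).
x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])
for p in (1.0, 1.5, 2.0, 3.0):
    norm = lambda v: np.sum(np.abs(v) ** p) ** (1.0 / p)
    lhs = norm(x + y) ** 2 + norm(x - y) ** 2
    rhs = 2 * (norm(x) ** 2 + norm(y) ** 2)
    print(p, lhs, rhs)          # the two columns agree only at p = 2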
We note that using the Parallelogram Rule only proves that those spaces are not Hilbert
in their classical norms. But it is much harder to show that there is no other equivalent
norm for which those spaces are Hilbert. This shows that the property of being Hilbert is
not just isometric, it determines isomorphic properties of the space too.
Can one restore the inner product from the inner-product norm? In a real space this is
done via the following identity
(27) hx, yi = (1/4)(kx + yk2 − kx − yk2 ),
and in a complex space this is done via the Polarization Identities
(28) Rehx, yi = (1/4)(kx + yk2 − kx − yk2 ),
     Imhx, yi = (1/4)(kx + iyk2 − kx − iyk2 ).
As a consequence of the Cauchy-Schwarz inequality we deduce continuity of the inner
product.
Lemma 4.4. If xn → x, and yn → y, then hxn , yn i → hx, yi.
Proof. Note that the sequence yn is bounded, kyn k ≤ M . So,
|hxn , yn i − hx, yi| = |hxn , yn i − hx, yn i + hx, yn i − hx, yi| ≤ kxn − xkkyn k + kxkkyn − yk → 0.
4.2. Orthogonality. Next we will discuss the very important concept of orthogonality. We
say that x and y are orthogonal, denoted x ⊥ y, if hx, yi = 0. For any two orthogonal vectors
x ⊥ y one has the Pythagorean Theorem:
kx + yk2 = kxk2 + kyk2 .
Prove it as an exercise!
An important concept we learned in Calculus is the concept of an orthogonal projection
to a hyperplane. This concept can be extended to any Hilbert space, in fact. To do that we
have to recall that the orthogonal projection of x to Y ⊂ H can be viewed as a solution to
the variational problem:
Find y0 ∈ Y such that kx − y0 k = miny∈Y kx − yk := dist{x, Y }.
If this problem has a solution then we have a good candidate for an ”orthogonal projection”.
A solution to the minimization problem can be given for general closed convex sets.
Lemma 4.5. Let Y be any convex closed subset of H. Then for every x ∈ H there exists a
unique y0 ∈ Y such that
kx − y0 k = min kx − yk.
y∈Y
Denote α = |hx − y0 , y1 i|2 and let λ = −thx − y0 , y1 i, t ∈ R. Then continuing the above we
obtain
I 2 − αt + αt2 < I 2 ,
for t = 1/2, for example. So, we found another vector, namely y0 − λy1 ∈ Y for which the
distance to x is smaller than I. A contradiction.
Now let us assume that we have a vector y0 ∈ Y such that x−y0 ⊥ Y . Denote kx−y0 k = I.
Then for any other vector y ∈ Y , we have
kx − y0 − yk2 = I 2 + kyk2 ≥ I 2 .
This proves the last assertion.
The characterization of minimizer as an orthogonal projection is especially important. We
can show that the map
P : H → Y, P x = y0 ,
is linear. Indeed, if P x1 = y1 , P x2 = y2 , then x1 − y1 ⊥ Y , and x2 − y2 ⊥ Y . But then,
αx1 + βx2 − αy1 − βy2 ⊥ Y also. So, P (αx1 + βx2 ) = αy1 + βy2 . Furthermore, this mapping
defines a bounded operator with kP k = 1. Indeed,
kxk2 = kx − P x + P xk2 = kx − P xk2 + kP xk2 ≥ kP xk2 .
So,
kP xk ≤ kxk.
Moreover, P y = y for all y ∈ Y trivially. This establishes that P is idempotent, P 2 = P .
These properties tell us that P is a projection operator, called orthogonal projection onto Y .
We now study the kernel of P , which we denote Y ⊥ .
Lemma 4.7 (Orthogonal complement). We have
Y ⊥ = {z ∈ H : z ⊥ y for all y ∈ Y }.
Moreover, Y ⊥ is a closed linear subspace complementary to Y :
(29) H = Y ⊕ Y ⊥.
Proof. The latter assertion is in fact a consequence of the definition of Y ⊥ as a kernel of the
projection P with image Y .
Let us address the first claim now. If z ⊥ Y , then kz − yk2 ≥ kzk2 for any y ∈ Y . So, this
means that 0 is the minimizer in this case, and hence P z = 0. Conversely, if P z = 0, then
by Lemma 4.6, z − 0 = z ⊥ Y , and our lemma follows.
We can define an orthogonal complement to any set A in H:
A⊥ = {x ∈ H : x ⊥ a, ∀a ∈ A}.
It is easy to show that the orthogonal complement to A is also a complement to the entire
closed linear span of A, see Exercise 4.6. As a consequence of Exercise 4.3, we can see that
the double-complement returns not the set A itself in general, but its closed linear span:
(30) A⊥⊥ = span A.
As a consequence we obtain the following important characterization of dense sets.
Lemma 4.8. The linear span of a set A is dense in H if and only if A⊥ = {0}.
Proof. Suppose that the linear span is dense. This implies that the closure of span A is H. But then,
according to (30),
{0} = (span A)⊥ = A⊥⊥⊥ = A⊥ ,
where the latter follows from the fact that A⊥ is a closed linear space and hence Exercise 4.3
applies.
Conversely, if A⊥ = {0}, then according to the above, (span A)⊥ = {0}. This means
that the orthogonal complement to the closure of span A is trivial, hence according to decomposition (29),
the closure of span A equals H, which literally means that span A is dense in H.
Lemma 4.8 provides a valuable tool to show that a set is in some sense total – it suffices
to check that its orthogonal complement is trivial. It would be very hard to prove density directly, as it
is difficult to describe all the limits of linear combinations of a set. For future reference we give
this term a precise meaning.
Definition 4.9. A set A ⊂ H is called total if its linear span is dense in H, i.e. the closure of
span A equals H.
4.3. Orthogonal and Orthonormal sequences. We say that a sequence {xn }∞ n=1 is or-
thogonal if
hxn , xm i = 0, ∀n 6= m.
If in addition all kxn k = 1, then the sequence is called orthonormal.
Let us consider some classical examples.
In space `2 the sequence of unit basis vectors en is orthonormal.
In space L2 [0, 2π] the system of simple harmonics forms an orthogonal sequence:
un (θ) = einθ , n ∈ Z.
To make it orthonormal, we have to normalize it:
vn = einθ /√(2π) .
If one is interested in real functions one can consider the corresponding trigonometric sys-
tems:
an = cos(nt), bm = sin(mt),
which together form an orthogonal system. Both systems are referred to as a Fourier system.
If we have an orthonormal sequence {en } it is very easy to find a projection onto the closed
space Y spanned by it. We will discuss it next.
First, let us consider the finite dimensional spaces
Yn = span{e1 , . . . , en }.
If x ∈ H, and y = Pn x, where Pn is the orthogonal projection onto Yn , then
Pn x = Σ_{k=1}^n ak ek .
To find the coefficients let us recall that x − Pn x ⊥ Yn . So,
hx − Pn x, ek i = 0, k = 1, ..., n.
Since the system is orthonormal,
hPn x, ek i = ak .
So,
ak = hx, ek i.
Thus, we obtain the explicit formula
Pn x = Σ_{k=1}^n hx, ek iek .
Let us now consider the entire sequence and space Y . Let y = P x, and denote yn = Pn x.
In view of the inequality kPn xk ≤ kxk, we have
Σ_{k=1}^n |hx, ek i|2 ≤ kxk2 .
This inequality carries a special name: the Bessel inequality. Looking at the sequence of yn ’s we
obtain now
kyn − ym k2 = Σ_{k=n+1}^m |hx, ek i|2 .
This can be made < ε for n, m > N in view of the convergence of the series. Hence, {yn } is
Cauchy, which means that the vectorial series
ỹ = Σ_{k=1}^∞ hx, ek iek ,
converges. Let us show that y = ỹ. Indeed, by uniqueness of Lemma 4.6 it suffices to prove
that x − ỹ ⊥ Y . Since the set A = {en }n spans Y , it suffices to just show that x − ỹ ⊥ A.
For that we pick en , and by continuity of the inner product and convergence of the series
compute
hx − ỹ, en i = hx, en i − hΣ_{k=1}^∞ hx, ek iek , en i = hx, en i − hx, en i = 0.
We have arrived at the following conclusion.
Lemma 4.10. For any orthonormal sequence {en }n and any vector x ∈ H the orthogonal
projection of x onto the space Y = span{en : n ∈ N} is given by
(32) P x = Σ_{k=1}^∞ hx, ek iek .
This means that Y = H, and P = I, and according to Lemma 4.10, the expansion (32)
holds not just for the projection but for the vector x itself:
(33) x = Σ_{k=1}^∞ hx, ek iek .
For an orthonormal basis the Bessel inequality turns into the Parseval identity:
(34) kxk2 = Σ_{k=1}^∞ |hx, ek i|2 .
Indeed, by convergence of the series and continuity of the norm,
kxk2 = lim_{N →∞} kΣ_{k=1}^N hx, ek iek k2 = lim_{N →∞} Σ_{k=1}^N |hx, ek i|2 = Σ_{k=1}^∞ |hx, ek i|2 .
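A minimal numerical sketch of the Bessel inequality and the Parseval identity for the trigonometric system on L2 [0, 2π] (the test function, grid and truncation below are arbitrary illustrative choices):

import numpy as np

# Check Bessel/Parseval for the orthonormal system v_n = e^{in theta}/sqrt(2 pi)
# on L^2[0, 2 pi], with the integral approximated by a Riemann sum.
theta = np.linspace(0.0, 2 * np.pi, 4096, endpoint=False)
h = theta[1] - theta[0]
x = np.exp(np.cos(theta)) * np.sin(3 * theta)      # a smooth test function

def inner(f, g):                                   # <f, g> = int f * conj(g)
    return np.sum(f * np.conj(g)) * h

norm_sq = inner(x, x).real
partial = 0.0
for n in range(-50, 51):
    v_n = np.exp(1j * n * theta) / np.sqrt(2 * np.pi)
    partial += abs(inner(x, v_n)) ** 2             # Bessel: partial <= ||x||^2

print(partial, norm_sq)                            # nearly equal (Parseval)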
By construction, the new system {en } is orthonormal and it spans the same space as the
old one. So, if H is a separable Hilbert space, then we can start with an arbitrary dense
countable set (yn ). Then we can iteratively filter out vectors yn which are linearly dependent
on the previous ones y1 , ..., yn−1 , making a new linearly independent system (xn ) with the
same linear span as (yn ), so that span(xn ) is dense in H. Using Gram-Schmidt orthogonalization we
produce another orthonormal sequence (en ) with the same span, so span(en ) is dense in H. This system
is total, and hence forms a basis.
Theorem 4.12. Every separable Hilbert space has an orthonormal basis.
As an easy consequence of this theorem we obtain the following result.
Theorem 4.13. Every separable Hilbert space is isometrically isomorphic to `2 .
Proof. Indeed, let (en ) be an orthonormal basis in H. Define the operator
T : H → `2 , T x = (hx, en i)∞
n=1 .
and hence T x = a.
As easy as it is to show that a given system is orthonormal, it appears much harder to show
that it is a basis. This usually boils down to proving totality, which is an approximation
property of a given set. Proving this requires elements of approximation theory and harmonic
analysis. All the examples we mentioned before, however, are bases.
Example 4.14. We start with the system
x0 = 1, x1 = t, x2 = t2 , . . .
in the space L2 [a, b]. The linear span of this sequence consists of all polynomials; polynomials are dense in C[a, b],
while C[a, b] is dense in L2 [a, b]. Thus, polynomials are dense in L2 [a, b], and hence the system is total.
The Gram-Schmidt orthogonalization produces the system of Legendre polynomials. On the
interval [−1, 1] those are given by
en = √((2n + 1)/2) Pn (t),   Pn (t) = 1/(2^n n!) · d^n/dt^n [(t^2 − 1)^n ].
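As a numerical sketch (not part of the text; the grid and the number of polynomials are arbitrary), Gram-Schmidt applied to the monomials indeed reproduces the normalized Legendre polynomials:

import numpy as np
from numpy.polynomial import legendre

# Gram-Schmidt on 1, t, t^2, ... in L^2[-1, 1], with the integral approximated
# by a trapezoid rule; compare against sqrt((2n+1)/2) * P_n(t).
t = np.linspace(-1.0, 1.0, 4001)
w = np.full_like(t, t[1] - t[0]); w[0] *= 0.5; w[-1] *= 0.5   # trapezoid weights

def inner(f, g):
    return np.sum(f * g * w)

basis = []
for n in range(5):
    v = t ** n
    for e in basis:                       # remove components along previous e_k
        v = v - inner(v, e) * e
    basis.append(v / np.sqrt(inner(v, v)))

for n, e in enumerate(basis):
    Pn = legendre.Legendre.basis(n)(t)
    print(n, np.max(np.abs(e - np.sqrt((2 * n + 1) / 2) * Pn)))   # tiny errors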
Example 4.15. On the whole line L2 (R) we start with the moments of the Gaussian distribution
x0 = e^{−t²/2} , x1 = t x0 , x2 = t^2 x0 , . . .
The orthogonalization process results in the Hermite system
en = (2^n n! √π)^{−1/2} e^{−t²/2} Hn (t),
H0 = 1,   Hn (t) = (−1)^n e^{t²} d^n/dt^n (e^{−t²}).
Then kf k∗ = kf kH ∗ = kJf kH .
Another consequence of the Riesz Representation Theorem is an explicit formula for the
supporting functional: for every x 6= 0, set f ∼ x/kxk. Then kf k = 1, and f (x) = kxk.
In the Hilbert space setting the formula for the norm of an operator (17) reads
(35) kT k = sup_{kxk=1, kyk=1} hT x, yi.
Thus,
kT k = kT ∗ k.
Example 4.18. Consider H = Cn , and T ∼ A = (aij ). Then T ∗ ∼ A∗ = (āji ).
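A minimal numerical check of this example (the matrix and vectors below are arbitrary illustrative choices):

import numpy as np

# On H = C^n the adjoint of T ~ A is the conjugate transpose: <Ax, y> = <x, A*y>
# for the inner product <u, v> = sum_i u_i conj(v_i).
rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)
inner = lambda u, v: np.sum(u * np.conj(v))
print(abs(inner(A @ x, y) - inner(x, A.conj().T @ y)))    # ~ 0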
Definition 4.19. We say that T : H → H is self-adjoint if
T = T ∗.
We say that T : H → H is unitary if T is bijective and
T −1 = T ∗ .
We say that T : H → H is normal if
T T ∗ = T ∗ T.
So, T = T ∗ if and only if all coefficients ai are real. And T −1 = T ∗ if and only if |ai | = 1.
In some sense these properties define these classes of operators in general (this will be more
clear in the framework of spectral theory).
To proceed with the properties of these special classes of operators, let us establish one
simple uniqueness criterion.
Lemma 4.20. Suppose H is a complex Hilbert space and let
hT x, xi = hSx, xi, ∀x ∈ H.
Then T = S.
Proof. Let us consider arbitrary λ ∈ C, x, y ∈ H and compute for D = S − T ,
0 = hD(x + λy), (x + λy)i = λhDy, xi + λ̄hDx, yi.
Setting λ = 1 and λ = i we obtain the system
hDy, xi + hDx, yi = 0,
hDy, xi − hDx, yi = 0,
which proves that hDx, yi = 0, for all x, y ∈ H. Thus D = 0.
Lemma 4.21. An operator T : H → H is self-adjoint if and only if hT x, xi ∈ R for all
x ∈ H.
Proof. If T is self-adjoint we have
hT x, xi = hx, T xi, and the latter is the complex conjugate of hT x, xi.
So, hT x, xi ∈ R. Conversely, if hT x, xi ∈ R, then hT x, xi coincides with its conjugate hx, T xi, and
hx, T xi = hT ∗ x, xi.
By Lemma 4.20 we obtain T = T ∗ .
Lemma 4.22. The following are equivalent
(i) T : H → H is unitary.
(ii) T is surjective and T ∗ T = I.
(iii) T is injective and T T ∗ = I.
Proof. If T is unitary, then T is bijective hence it is both surjective and injective. Moreover
T ∗ T = T T ∗ = I. Hence (i) implies both (ii) and (iii).
If (ii) holds, then T ∗ T = I implies
kT xk2 = hT x, T xi = hT ∗ T x, xi = hx, xi = kxk2 .
Hence, T is injective and even is an isometry. Together with surjectivity it implies that T is
bijective. But then applying T −1 from the right we obtain
T ∗ = T ∗ T T −1 = IT −1 = T −1 .
This establishes that T is unitary.
Suppose (iii). For every y ∈ H we have T (T ∗ y) = y. So setting x = T ∗ y, we have T x = y.
This proves that T is surjective. Together with the fact that it is injective, this implies that T is
invertible. But then applying T −1 from the left we obtain
T ∗ = T −1 T T ∗ = T −1 I = T −1 .
This establishes that T is unitary.
The right shift operator on `2 is a good example which illustrates that not every isometry
is automatically unitary, i.e. the condition T ∗ T = I alone does not imply surjectivity.
4.7. Exercises.
Exercise 4.1. Prove identities (27) and (28).
Exercise 4.2. Show that Lp ([0, 1]), p 6= 2 is not a Hilbert space. Show that C[0, 1] is not a
Hilbert space either.
Exercise 4.3. Show that Y ⊥⊥ = Y .
Exercise 4.4. Prove that Y ⊥ ∼= H/Y .
Exercise 4.5. Show that A⊥ is a linear closed subspace of H.
Exercise 4.6. Show that A⊥ = (span A)⊥ .
Exercise 4.7. Show that if (33) holds for every x ∈ H, then the orthonormal system is total,
and hence is a basis.
Exercise 4.8. Write down the formula for vectors in the trigonometric orthonormal system.
Exercise 4.9. Show that
1. T ∗∗ = T .
2. (αT )∗ = ᾱT ∗ .
3. (S + T )∗ = S ∗ + T ∗ .
4. (ST )∗ = T ∗ S ∗ .
5. kT k2 = kT ∗ T k = kT T ∗ k.
Exercise 4.10. Show that the supporting functional is unique for every x ∈ H. Hint: recall
when the Cauchy-Schwarz inequality becomes an equality.
Exercise 4.11. Show that a projector P : H → H is orthogonal if and only if it is self-adjoint.
Exercise 4.12. Show that T : H → H is an isometry if and only if T ∗ T = I and if and only
if T preserves inner-products
hT x, T yi = hx, yi, ∀x, y ∈ H.
So, the only condition that separates every isometry from being unitary is the surjectivity
condition, see Lemma 4.22 (ii).
Exercise 4.13. Show that T ∗ T and T T ∗ are both self-adjoint operators for any T .
5. Weak topologies
5.1. Weak topology. Let X be a Banach space. We define the weak topology on X as
the topology with the following base of neighborhoods: for x ∈ X, ε1 , . . . , εn > 0 and
f1 , . . . , fn ∈ X ∗ , let
(36) U^{f1 ,...,fn}_{ε1 ,...,εn} (x) = {y ∈ X : |fi (y) − fi (x)| < εi , ∀i = 1, . . . , n}.
Note that on an infinite dimensional space X these are unbounded sets containing the entire
linear plane x + ∩ni=1 Ker fi .
We say that a set V ⊂ X is weakly open if for every point x ∈ V there exists some weak
neighborhood of x contained in V , i.e. U^{f1 ,...,fn}_{ε1 ,...,εn} (x) ⊂ V . The collection of all weakly
open sets forms a topology (exercise!), called the weak topology. We will use the term “strong”
open sets forms a topology (exercise!), called weak topology. We will use the term “strong”
with respect to anything related to the norm-topology, as opposed to “weak” that refers to
anything related to the weak topology. For example, “strongly compact set” v.s. “weakly
compact set”, or “strong convergence” v.s. “weak convergence”.
Examples of weakly open sets are basically anything that can be defined by a finite set of
functionals, e.g.
{f < a}, {f > b}, {a < f < b}, ...
The corresponding sets with ≤, ≥ are weakly closed.
It is clear that the weak topology is weaker than the norm-topology on any normed space.
In fact, on an infinite dimensional space it is strictly weaker. To see this, we show that any
neighborhood (36) is unbounded. Indeed, let Z = ∩i Ker fi . This is a nontrivial subspace:
otherwise the map x 7→ (f1 (x), . . . , fn (x)) would be injective, forcing dim X ≤ n. Then U^{f1 ,...,fn}_{ε1 ,...,εn} (x) contains all
of x + Z, which is unbounded.
Lemma 5.1. A sequence xn → x weakly if and only if f (xn ) → f (x), for all f ∈ X ∗ .
Proof. Suppose xn → x weakly. For any ε > 0 and f ∈ X ∗ one can find an N ∈ N such
that for all n > N , xn ∈ Uεf (x). So, |f (xn ) − f (x)| < ε. This proves the convergence
f (xn ) → f (x).
Conversely, for any ε1 , . . . , εn > 0 and f1 , . . . , fn ∈ X ∗ , one can find a common N ∈ N
such that for all n > N ,
|fi (xn ) − fi (x)| < εi , ∀i = 1, . . . , n.
This implies that xn ∈ U^{f1 ,...,fn}_{ε1 ,...,εn} (x).
One can weaken Lemma 5.1 considerably for bounded sequences by requiring convergence
to hold only on a total set of functionals.
Definition 5.2. We say that a family of functionals F ⊂ X ∗ is total if span F is dense in X ∗ .
In the Hilbert space settings where H ∼ = H ∗ , this concept is identical to that of Defi-
nition 4.9. In `p for 1 < p < ∞ and in c0 the most basic example of a total system is
provided by the set of coordinate basis vectors F = {en }_{n=1}^∞ . And just like in Hilbert spaces,
if f (x) = 0 for all f ∈ F, then x = 0.
Lemma 5.3. Let F ⊂ X ∗ be a total family of functionals. A sequence xn → x weakly if and
only if
(i) kxn k ≤ M , for all n ∈ N;
(ii) f (xn ) → f (x), for all f ∈ F.
Proof. Suppose xn → x weakly. Then (ii) follows directly from Lemma 5.1. Since {xn } is also a weakly
bounded set, (i) holds by the Banach-Steinhaus uniform boundedness principle.
Conversely, suppose (i)–(ii) hold. By linearity, f (xn ) → f (x) holds for all f ∈ span F,
which is dense in X ∗ . Let g ∈ X ∗ be arbitrary. For any ε > 0 find f ∈ span F such that
kf − gk < ε. Then find N ∈ N such that
|f (xn ) − f (x)| < ε, ∀n > N.
Then
|g(xn ) − g(x)| ≤ |g(xn ) − f (xn )| + |f (xn ) − f (x)| + |f (x) − g(x)|
≤ kf − gkkxn k + ε + kf − gkkxk < ε(M + 1 + kxk).
Exercise 5.1. Suppose dim X = ∞. Construct a net {xα }α∈A in X such that xα → 0 weakly,
yet for every α0 ∈ A and N > 0, there is α ≥ α0 such that kxα k ≥ N . Hint: let
A = {(f1 , . . . , fn ; N ) : fj ∈ X ∗ , n ∈ N, N > 0}.
Define a partial order on A, and for every α ∈ A pick an xα ∈ ∩j Ker fj , with kxα k > N .
Show that xα → 0 weakly and is (frequently) unbounded.
With the help of Lemma 5.3 we can give a simple characterization of weak convergence in
sequence spaces.
Lemma 5.4. Let {xn }∞ n=1 ⊂ X be a sequence in any of the spaces X = c0 , `p , for 1 < p < ∞.
Then xn → x weakly if and only if {xn } is norm bounded and converges to x pointwise, i.e.
xn (j) → x(j), for all j ∈ N.
Proof. This follows from Lemma 5.3 by choosing F to be the set of coordinate functionals. In all the
spaces in question such a system is total.
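A minimal numerical illustration of the standard example behind this lemma, the coordinate vectors en in `2 (the functional y below is an arbitrary element of `2 , truncated to finitely many coordinates):

import numpy as np

# In l^2 the vectors e_n are bounded (||e_n|| = 1) and converge to 0 pointwise,
# hence weakly; yet they do not converge in norm.
y = 1.0 / np.arange(1, 10001)          # y_j = 1/j, an element of l^2 ~ (l^2)*
for n in (10, 100, 1000, 10000):
    pairing = y[n - 1]                 # <e_n, y> = y_n -> 0
    print(n, pairing, 1.0)             # last column: ||e_n|| = 1 for every n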
Sometimes when xn → x weakly and kxn k → kxk the sequence has no other choice but
to converge strongly, kxn − xk → 0. Banach spaces in which this always happens are said to have the Radon-Riesz
property. Any Hilbert space belongs to the Radon-Riesz class. Indeed, if xn → x weakly,
then
kxn − xk2 = kxn k2 + kxk2 − hxn , xi − hx, xn i
According to our assumptions, the right hand side converges to
kxk2 + kxk2 − hx, xi − hx, xi = 0.
Lemma 5.7. The weak topology is not metrizable if dim X = ∞.
Proof. Suppose, on the contrary that there is a metric d(·, ·) that defines the weak topology.
Consider the sequence of balls {d(x, 0) < 1/n}. Each contains a weak neighborhood of the
origin. We have shown every weak neighborhood is unbounded. Thus, we can find a xn
within the nth ball with kxn k > n. So, on the one hand, xn → 0 weakly, and yet {xn } is
unbounded, in contradiction with Lemma 5.3.
Lemma 5.8. If X ∗ is separable, then the weak topology is metrizable on any bounded set.
Proof. Since X ∗ is separable, we can pick a sequence of unit functionals {ϕn } ⊂ S(X ∗ ) which is
dense in the sphere S(X ∗ ). Note that this family is automatically total. Now, define a
metric
d(x, y) = Σ_{n=1}^∞ |ϕn (x − y)|/2^n .
Note that since the functionals are unit vectors, the series converges. This metric is clearly symmetric,
and if d(x, y) = 0, then ϕn (x − y) = 0 for all n. This implies that x − y = 0. The triangle
inequality is also obvious.
So, now let A ⊂ X be a bounded set. We will show that the weak topology induced on
A, i.e. the topology given by the family of traces of weak neighborhoods
U^{f1 ,...,fn}_{ε1 ,...,εn} (x) ∩ A,
is equivalent to the topology produced by the metric d on A. For this purpose, we have to
show that a neighborhood of a point x ∈ A in one topology contains a neighborhood of the
same point x in the other topology.
Let us start with a metric neighborhood. So, fix x ∈ A, ε > 0, and consider the ball
Bε (x) = {y ∈ X : d(x, y) < ε}.
Since A is bounded, we have kyk ≤ M , for some M > 0, and all y ∈ A. Then we can pick
N ∈ N such that
Σ_{n=N +1}^∞ |ϕn (x − y)|/2^n < ε/2,
for all y ∈ A. Then, we consider U^{ϕ1 ,...,ϕN}_{ε/2,...,ε/2} (x) ∩ A. For any y in it we have
d(x, y) ≤ Σ_{n=1}^N ε/2^{n+1} + Σ_{n=N +1}^∞ |ϕn (x − y)|/2^n < ε.
So,
U^{ϕ1 ,...,ϕN}_{ε/2,...,ε/2} (x) ∩ A ⊂ Bε (x) ∩ A.
Proof. Notice that for any f ∈ B(X ∗ ) and x ∈ X, f (x) ∈ [−kxk, kxk]. This naturally
suggests considering B(X ∗ ) as a subset of the product space T = ∏_{x∈X} [−kxk, kxk]. By Ty-
chonoff’s theorem, this product space is compact in the product topology. It suffices to show
that B(X ∗ ) is closed in T , because convergence of nets in the product topology is equivalent
to pointwise convergence, which for elements of B(X ∗ ) amounts to weak∗ convergence.
To this end, let {fα }α∈A be a net in B(X ∗ ) with lim fα = f ∈ T . By linearity of fα ’s and
the ”pointwise” sense of the limit above, we conclude that
f (λx + µy) ← fα (λx + µy) = λfα (x) + µfα (y) → λf (x) + µf (y).
Thus, f is linear, and since |fα (x)| ≤ kxk, we also have |f (x)| ≤ kxk for all x ∈ X, which
identifies f as an element of B(X ∗ ).
As an immediate consequence we see that for a reflexive Banach space X, the unit ball is
weakly compact. This property actually characterizes reflexivity.
Theorem 5.11 (Kakutani). A Banach space X is reflexive if and only if its unit ball is
weakly compact.
Lemma 5.12 (Goldstein). The unit ball B(X) is weakly∗ dense in B(X ∗∗ ). In particular,
the weak∗ closure of B(X) in X ∗∗ equals B(X ∗∗ ).
Indeed, let B(X) be weakly compact. In view of Lemma 5.12, for any x∗∗ ∈ B(X ∗∗ ) we
find a net xα → x∗∗ in the weak∗ topology, xα ∈ B(X). By the weak compactness, there exists a subnet yβ → x weakly for
some x ∈ B(X), and yet that same subnet converges to x∗∗ in the weak∗ topology of X ∗∗ .
Thus, for every x∗ we have (x∗∗ , x∗ ) ← (yβ , x∗ ) = (x∗ , yβ ) → (x∗ , x), which identifies x∗∗ as
x. Thus, B(X) = B(X ∗∗ ), implying X = X ∗∗ .
In order to prove Lemma 5.12, we have to go back to the Separation Theorem 3.9 and
make an adaptation to the case of the weak∗ topology. First, we prove Helly’s Lemma.
Lemma 5.13 (Helly). Suppose that f, f1 , . . . , fn ∈ X ∗ , and ∩_{j=1}^n Ker fj ⊂ Ker f . Then
f ∈ span{f1 , . . . , fn }.
Proof. We will prove the lemma by induction. Suppose n = 1, and f1 6= 0 (otherwise the
statement is trivial). By the structure of the linear functionals discussed in Section 3.2, there
is x1 ∈ X with f1 (x1 ) = 1, such that for every x ∈ X we have x = λx1 + y, where y ∈ Ker f1 .
Since f (y) = 0 we have f (x) = λf (x1 ) = f (x1 )f1 (x), as desired.
Suppose the statement is true for n. Let us assume ∩_{j=1}^{n+1} Ker fj ⊂ Ker f . Consider
the space Y = Ker fn+1 . Then ∩_{j=1}^n Ker (fj |Y ) ⊂ Ker (f |Y ). By the induction hypothesis,
f |Y = Σ_{j=1}^n aj fj |Y . By the structure of fn+1 we have for any x ∈ X, x = λxn+1 + y for some
Lemma 5.14. If x∗∗ ∈ X ∗∗ is continuous in the weak∗ topology, then x∗∗ ∈ X.
Proof. By the assumption, (x∗∗ )−1 (−1, 1) contains a weak∗ neighborhood of the origin, say,
U^{x1 ,...,xn}_{ε1 ,...,εn} (0). In particular, |x∗∗ | < 1 on ∩j Ker xj . Since the latter is a linear space, x∗∗
must in fact vanish on it. Thus, by Lemma 5.13, x∗∗ ∈ span[x1 , . . . , xn ] ⊂ X.
Theorem 5.15 (Separation Theorem for weak∗ topology). Suppose B is a weakly∗ -closed
convex subset of X ∗ , and f 6∈ B. Then, there is x ∈ X such that
sup_{g∈B} g(x) < 1 < f (x).
Proof. Let us follow the proof of Theorem 3.9. Let us assume f = 0, and let A = U^{x1 ,...,xn}_{ε1 ,...,εn} (0)
be a weak∗ neighborhood of 0 disjoint from B. We then associate Minkowski’s functionals pA
and qB to A and B respectively. As a result of the Hahn-Banach Theorem, we find a separating
functional F so that qB ≤ F ≤ pA . Since f ∈ A implies pA (f ) ≤ 1, we see that F is bounded
from above on A. Like in the proof of Lemma 5.14 we conclude that F vanishes on the
intersection of the kernels of the x1 , . . . , xn , and hence, F ∈ X. By rescaling F if necessary
we can arrange the constant of separation to be 1.
Proof of Lemma 5.12. Let B be the weak∗ closure of B(X) in X ∗∗ . By Exercise 5.11, B ⊂ B(X ∗∗ ). Suppose there is
F ∈ B(X ∗∗ )\B. By Theorem 5.15, we find an x∗ ∈ X ∗ such that
sup_{G∈B} G(x∗ ) < 1 < F (x∗ ).
The first inequality holds, in particular, on B(X), which shows that kx∗ k ≤ 1. This runs
into contradiction with the second inequality.
Corollary 5.16. Let Y ⊂ X be a closed subspace of a reflexive space X. Then Y and X/Y
are reflexive.
Proof. Indeed, B(Y ) = Y ∩ B(X). Since this set is convex and closed, it is weakly closed
in B(X) and hence compact in the weak topology of X. However, by the Hahn-Banach
extension theorem the topology induced on Y by the weak topology of X is exactly the
weak topology of Y . Thus, B(Y ) is weakly compact in Y . Now, by Exercise 3.16 and by
the previous, (X/Y )∗ is a subspace of a reflexive space, which makes it reflexive. Then by
Exercise 3.15 (X/Y ) itself is reflexive.
5.3. Exercises.
Exercise 5.3. Show that xα → x weakly if and only if f (xα ) → f (x) for every functional
f ∈ X ∗.
Exercise 5.4. Show that, more generally, if a net xα → x weakly, then for any ε > 0 there is
α0 so that for all α ≥ α0 , kxα k ≥ kxk − ε.
Exercise 5.5. Show that if A is closed and B is strongly compact convex sets, then there are
two disjoint weakly open neighborhoods of A and B.
Exercise 5.6. Show that if xn → x weakly, then there is a sequence of convex combinations
made of xn ’s that converge to x strongly.
Exercise 5.7. Show that fn → f weakly∗ if and only if fn (x) → f (x) for every x ∈ X.
Exercise 5.8. Show that any weakly∗ convergent sequence in X ∗ is strongly bounded.
Exercise 5.9. Let F ⊂ X be a total set. Show that fn → f weakly∗ if and only if {fn } is bounded
and fn (x) → f (x) for every x ∈ F.
Exercise 5.10. If X is separable show that the weak∗ -topology is metrizable on any bounded
subset of X ∗ .
Exercise 5.11 (Weak∗ lower semi-continuity of norm). Show that if fn → f weakly∗ , then
lim inf_{n→∞} kfn k ≥ kf k.
More generally, if a net {fα }α∈A converges weakly∗ to f , then for any ε > 0 there is α0 ∈ A
so that for all α ≥ α0 , kfα k ≥ kf k − ε.
Exercise 5.12. Recall that (c0 )∗∗ = (`1 )∗ = `∞ . Show that B(c0 ) is weakly∗ sequentially
dense in B(`∞ ), i.e. for every F ∈ B(`∞ ) there is a sequence xn ∈ B(c0 ) converging weakly∗
to F .
times, in which case we pick a stationary subsequence xkl = yn0 , or if not, then xk → 0, in
which case 0 ∈ K is the limit.
So, compact sets may span infinite-dimensional spaces, but still are in some sense finite-
dimensional as is seen in the following lemma.
Lemma 6.1. A subset K ⊂ X is precompact if and only if K is bounded and for any ε > 0
there exists a finite-dimensional subspace Yε such that
k[x]kX/Yε < ε, ∀x ∈ K.
In other words, every element of K is no more than ε-away from the space Yε .
Note that in this lemma, as well as in some of the statements below, we choose to state
a criterion for precompactness rather than compactness. This is simply because such a
statement already contains the key element. In order to upgrade it to compactness one
simply has to add “being closed”.
Proof. Suppose K is precompact. Hence, it is clearly bounded. According to Lemma 1.10
for any ε > 0 we can find a finite ε-net {xj }Jj=1 . Consider Yε = span{xj }Jj=1 . Then for any
x ∈ K, there exists an element of Yε , namely the one from the ε-net xj such that kx−xj k < ε.
Hence, k[x]kX/Yε < ε.
Conversely, for any ε let us find a space Yε/2 . For any x ∈ K, find y(x) ∈ Yε/2 with
kx − y(x)kX < ε/2.
The set y(K) = {y(x)}x∈K is obviously bounded. And as a subset of a finite-dimensional
space, it is also precompact, see Corollary 2.13. Hence, according to Lemma 1.10 we can
find an ε/2-net for y(K). Clearly, this same set will be an ε-net for K.
An important example of a compact set in `2 is provided by the Hilbert cube.
Example 6.2. Let us fix a sequence a ∈ `2 with positive coordinates an > 0. A Hilbert Cube
is the subset of `2 defined by
H = {x ∈ `2 : |xn | ≤ an }.
So, in other words it is an infinite Cartesian product
[−a1 , a1 ] × [−a2 , a2 ] × · · · × [−an , an ] × . . .
Note that H is a bounded closed convex subset of `2 . It is also precompact, which along
with closedness makes it compact. Indeed, for any ε, let us pick N > 0 such that
Σ_{n≥N} a_n^2 < ε^2 .
Note that (39) holds true for any finite set. But a close examination of the argument we
presented for the Hilbert cube shows that this condition is exactly what is needed to prove
precompactness.
Lemma 6.5. A subset K ⊂ `p , for 1 ≤ p < ∞ is precompact if and only if it is uniformly
summable.
Proof. The sufficiency follows the line of the proof of compactness of the Hilbert cube line
by line. For any ε we find N such that
sup_{x∈K} Σ_{n>N} |xn |^p < ε^p .
Conversely, denote cN = sup_{x∈K} Σ_{n>N} |xn |^p . For any ε > 0 we can find an ε-net for K,
say y^1 , . . . , y^m . Since this is a finite set, we have, for some N ∈ N,
sup_{j=1,...,m} Σ_{n>N} |y_n^j |^p < ε.
For any x ∈ K, find j such that kx − y^j k_p < ε. Then their tails are also ε-close:
(Σ_{n>N} |xn − y_n^j |^p )^{1/p} < ε.
Proof. We have essentially proved necessity of (41) in the discussion above. Indeed, if we
apply the partition for an ε-net {xj }, then (41) follows with 2ε.
Conversely, let M be a global bound on the norms of elements in K, kxk∞ < M . For
any ε let us find a partition {Bp }Pp=1 as in (41). Let us further fix an ε-net of scalars
Λ = {λ1 , . . . , λQ } on the interval [−M, M ]. Let us form sequences with values in Λ on each
Bp :
N = {x ∈ `∞ : x|Bp ∈ Λ, p ≤ P }.
Thus, each element in N is constant on each Bp with value in Λ. Since the partition is finite
and Λ is finite, the set N is finite. Yet, in view of (41) for each x ∈ K and any p there
exists λ(p) ∈ Λ such that |λ(p) − xn | < 2ε for all n ∈ Bp . Then the corresponding y ∈ N with
y |Bp = λ(p) will fulfill ky − xk∞ < 2ε. Thus, N is a 2ε-net for K.
Proof. Let us start with the easier implication. If K is precompact, then it is bounded and
hence pointwise bounded. For any ε > 0 find an ε/3-net {fp }Pp=1 for K. Since every function
in that net is uniformly continuous we find a common δ > 0 such that for any d(x, y) < δ
we have
max |fp (x) − fp (y)| < ε/3.
p
For any other function f ∈ K we pick p such that kf − fp k < ε/3, and the result follows by
the standard application of the triangle inequality.
To start the converse implication we note that equicontinuity and pointwise boundedness
implies the usual boundedness, see Exercise 6.3. So, we can assume that K is equicontinuous
and bounded. Let
sup kf k = M < ∞.
f ∈K
Let us fix ε > 0 and partition the interval [−M, M ] into disjoint semi-closed intervals Ij ,
j = 1, . . . , J with |Ij | < ε/2, in the increasing order I1 < I2 < · · · < IJ . Next, let us find
a finite cover of Ω by balls B1 , . . . , BN such that for any x, y ∈ Bi we have
(42) sup_{f ∈K} |f (x) − f (y)| < ε/2.
Note that in view of (42), all values of a function f ∈ K on any ball Bn will belong to the
union of two adjacent intervals,
(43) f |Bn ⊂ I_{i_f (n)−1} ∪ I_{i_f (n)} ,
for some index i_f (n).
Consequently, if i_f = i_g , then on any ball |f − g| < ε, i.e. kf − gk < ε. However, since i_f is a
map between two finite sets, there are only finitely many such maps, i_{f_1} , . . . , i_{f_P} . Any other
map i_g , g ∈ K, has to coincide with one of them, i.e. there exists fp such that i_{f_p} = i_g . Hence
kg − fp k < ε. This means that {fp }_{p=1}^P is a finite ε-net for K.
If Ω is a bounded convex domain in Rn there is a very simple sufficient condition for
uniform equicontinuity of a family F ⊂ C(Ω): boundedness in C 1 (Ω). Indeed, if there exists
a constant M > 0 so that
sup_{f ∈F} k∇f k∞ ≤ M,
then for any x′ , x′′ ∈ Ω we have
|f (x′ ) − f (x′′ )| ≤ |∇f (y) · (x′ − x′′ )|,
for some y ∈ [x′ , x′′ ], and hence
|f (x′ ) − f (x′′ )| ≤ M |x′ − x′′ |.
So, the family is uniformly Lipschitz, and hence equicontinuous.
6.3. Compactness in Lp (Ω). In this section we will establish compactness criteria in Lp -
spaces over domains in Rn . Before we discuss the setup and results, let us first introduce an
important procedure called mollification2.
6.3.1. Mollification. We consider a non-negative C ∞ function ψ on Rn with supp ψ ⊂ {|x| ≤
1} and unit total weight:
∫_{R^n} ψ(x) dx = 1.
For any small parameter ε > 0 we consider the rescaled function
ψε (x) = ε^{−n} ψ(x/ε).
Note that just like ψ, ψε has total weight 1 as well. However, supp ψε ⊂ {|x| ≤ ε}. So, ψε
concentrates all its mass in a smaller ball.
For any function u ∈ Lp (Rn ), 1 ≤ p ≤ ∞ we define a convolution by
(44) u ∗ ψε (x) = uε (x) = ∫_{R^n} ψε (x − y)u(y) dy.
Since ψε has compact support, it is clear that the integral in y converges for all x, although
the convolution may no longer be compactly supported if u is not. However, it is clear that
(45) supp(u ∗ ψε ) ⊂ supp u + Bε (0).
The most important feature of mollification is that it makes functions smooth. Indeed,
since ψ ∈ C ∞ the usual convergence theorems can be applied to show that
∂_{x1}^{j1} . . . ∂_{xn}^{jn} (u ∗ ψε )(x) = ∫_{R^n} ∂_{x1}^{j1} . . . ∂_{xn}^{jn} ψε (x − y) u(y) dy.
2Other terms used in the literature include filtration, coarse-graining, regularization.
Since each time the derivative falls on ψ it produces a factor of 1/ε, it is clear that the norm
of any derivative of the convolution can grow out of control. However, in terms of its native
Lp -norm, the mollification is well behaved. To see this let us estimate, using 1/p + 1/q = 1,
ku ∗ ψε k_p^p = ∫_{R^n} | ∫_{R^n} ψε (x − y)u(y) dy |^p dx = ∫_{R^n} | ∫_{R^n} ψε^{1/q} (x − y) ψε^{1/p} (x − y) u(y) dy |^p dx
≤ ∫_{R^n} ( ∫_{R^n} ψε (x − y) dy )^{p/q} ∫_{R^n} ψε (x − y)|u(y)|^p dy dx
= ∫_{R^n} ∫_{R^n} ψε (x − y)|u(y)|^p dy dx = kuk_p^p .
Thus,
(46) ku ∗ ψε kp ≤ kukp .
The next important property of mollification is the approximation property:
(47) ku ∗ ψε − ukp → 0, as ε → 0.
Indeed, let us recall the following fact from Real Analysis: for any function u ∈ Lp (Rn ), 1 ≤ p < ∞, we
have
(48) ∫_{R^n} |u(x + h) − u(x)|^p dx → 0, as h → 0.
Noting that all |h| < ε in the support of ψε and in view of (48) we can see that the inner
integral tends to zero uniformly in h as ε → 0. This proves (47).
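A minimal one-dimensional numerical sketch of mollification and of the approximation property (47) (the bump ψ, the test function u and the grid below are arbitrary illustrative choices):

import numpy as np

# psi is the standard bump supported in [-1,1], normalized to unit mass on the
# grid; psi_eps(x) = psi(x/eps)/eps; u * psi_eps is computed by discrete
# convolution; the printed L^2 distance ||u * psi_eps - u|| shrinks with eps.
def psi(x):
    out = np.zeros_like(x)
    inside = np.abs(x) < 1
    out[inside] = np.exp(-1.0 / (1.0 - x[inside] ** 2))
    return out

x = np.linspace(-3.0, 3.0, 6001)
h = x[1] - x[0]
u = (np.abs(x) < 1).astype(float)              # indicator of [-1, 1]

for eps in (0.5, 0.1, 0.02):
    kernel = psi(x / eps) / eps
    kernel /= np.sum(kernel) * h               # enforce unit total weight
    u_eps = np.convolve(u, kernel, mode='same') * h
    err = np.sqrt(np.sum((u_eps - u) ** 2) * h)
    print(eps, err)                            # decreases as eps -> 0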
without further notice we always assume that functions in Lp (Ω) are extended trivially to
the entire Rn , i.e. u(x) = 0 for x 6∈ Ω.
Compactness in Lp (Ω) is very similar in flavor to the Arzelà-Ascoli Theorem, with the
equicontinuity replaced by the “equiintegrability” – the condition (48) holding uniformly
across a given set.
Theorem 6.11 (Compactness in Lp ). Let Ω ⊂ Rn be a bounded domain, and 1 ≤ p < ∞.
A subset K ⊂ Lp (Ω) is precompact if and only if K is bounded and
(49) sup_{u∈K} ∫_Ω |u(x + h) − u(x)|^p dx → 0, as h → 0.
Proof. If K is precompact, it is clearly bounded, and for every ε > 0 let us pick an ε-net
{ui }N
i=1 for it. Since (48) holds for each ui we have for some i,
So, for any δ let us pick ε so small that o(1) < δ. For that fixed ε the set Kε ⊂ C(Ω + Bε (0))
is equicontinuous. Indeed,
|uε (x′ ) − uε (x′′ )| ≤ ∫_Ω |ψε (x′ − y) − ψε (x′′ − y)| |u(y)| dy
≤ Cε |x′ − x′′ | kuk_{L1 (Ω)} ≤ C_{ε,p} |x′ − x′′ | kuk_{Lp (Ω)} ≤ C|x′ − x′′ |,
where the last step uses that K is assumed bounded. So, the family Kε is in fact uniformly Lipschitz.
By the Arzelà-Ascoli Theorem we conclude that Kε is precompact in C(Ω + Bε (0)). Let
{u_ε^j }_{j=1}^N be a δ-net in Kε . Then for any uε there exists u_ε^j with
6.5. Compact maps. Generally by a compact map between two metric spaces X and Y we
understand a map T : X → Y which maps bounded sets to precompact sets. If X and Y
are Banach spaces and T ∈ L(X, Y ), then it is sufficient to consider only the unit ball.
Definition 6.12. T ∈ L(X, Y ) is called a compact operator if T (B(X)) is precompact in Y .
By linearity this implies that T sends bounded sets to precompact sets. The following
lemma follows directly from Lemma 1.10.
Lemma 6.13. T ∈ L(X, Y ) is compact if and only if for any bounded sequence {xn }_{n=1}^∞ ⊂ X
there exists a subsequence {xnk } such that T xnk → y for some y ∈ Y .
Let us note some obvious properties of compact maps. First, if dim X < ∞, then any
bounded linear operator T : X → Y is compact, simply because any continuous map be-
tween topological spaces sends compact sets to compact sets, see Lemma 1.4. At the same
time if dim Y < ∞, then every bounded subset of Y is precompact by Corollary 2.13. And
thus, every T ∈ L(X, Y ) is compact also. If, however, dim X, dim Y = ∞, then no isomor-
phism or even isomorphic embedding between such spaces can be a compact operator. This
is because in this case T (B(X)) would have a non-empty interior in an infinite dimensional
subspace, which is impossible for a precompact set. So, compact operators are not invertible in this case.
6.6. Exercises.
Exercise 6.1. Show that the Hilbert cube with sides an = 1/√n is not a compact set. Hint:
construct a disjoint sequence of unit vectors in H with supports going off to infinity.
Exercise 6.2. Prove Lemma 6.6.
Exercise 6.3. Show that any uniformly equicontinuous and pointwise bounded set in C(Ω) is
automatically bounded. Give an example of a pointwise bounded set which is not bounded.
Why does the Uniform Boundedness principle not apply here?
Exercise 6.4. Give an example of a set K ⊂ Lp (Rn ) which satisfies (49) yet is not precompact.
Hint: consider shifts of a fixed compactly supported function.
Exercise 6.5. Formulate and prove an analogue of the compactness criterion in `∞ stated in
Lemma 6.8 in the space L∞ (Ω).
7. Fixed Point Theorems
7.1. Contraction Principles.
Definition 7.1. Let (X, d) be a metric space. We say that T : X → X is a contraction if
there exists 0 < θ < 1 such that
d(T x, T y) ≤ θd(x, y), ∀x, y ∈ X.
Theorem 7.2 (Contraction Principle). Let (X, d) be a complete metric space and let T : X → X
be a contraction. Then T has a unique fixed point.
Proof. Fix an arbitrary x0 ∈ X. Let us consider the sequence
xn = T n x0 , n ∈ N.
Let us estimate the distance between two consecutive elements:
d(xm+1 , xm ) = d(T xm , T xm−1 ) ≤ θd(xm , xm−1 ) ≤ · · · ≤ θm d(x1 , x0 ).
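As an aside, the iteration xn = T^n x0 is easy to run numerically; a minimal sketch with the (arbitrary, not from the text) contraction T (x) = cos x on [0, 1]:

import numpy as np

# Fixed point iteration for T(x) = cos(x) on [0, 1], where |T'(x)| <= sin(1) < 1.
x = 0.0
for n in range(200):
    x = np.cos(x)
print(x, np.cos(x) - x)      # x ~ 0.739085..., residual essentially 0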
where the series converges absolutely in the operator norm of L(X, X).
Proof. First of all, let us note that kT^n k ≤ kT k^n , and since kT k < 1, the series converges
absolutely. Denote
S = Σ_{j=0}^∞ T^j ,   S_N = Σ_{j=0}^N T^j .
Then
SN (I − T) = I + T + · · · + T N − T − · · · − T N +1 = I − T N +1 ,
hence, SN (I − T ) → I, and at the same time SN → S. So, S(I − T ) = I. Similarly,
(I − T )S = I. By Lemma 2.20, I − T is invertible, and (I − T )−1 = S, which finishes the
lemma.
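A minimal numerical sketch of the Neumann series for a matrix with kT k < 1 (the matrix, its scaling and the truncation below are arbitrary illustrative choices):

import numpy as np

# Compare the partial sums of sum_{j>=0} T^j with (I - T)^{-1} for a random
# 4x4 matrix rescaled so that its operator (spectral) norm equals 0.5.
rng = np.random.default_rng(0)
T = rng.standard_normal((4, 4))
T *= 0.5 / np.linalg.norm(T, 2)

S = np.zeros_like(T)
P = np.eye(4)                          # current power T^j, starting from T^0 = I
for j in range(200):
    S += P
    P = P @ T

print(np.linalg.norm(S - np.linalg.inv(np.eye(4) - T)))   # ~ machine precision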
A simple consequence of this lemma is that the spectrum of any operator is confined to
the ball of radius kT k. Indeed, if |λ| > kT k, then the operator (1/λ) T has norm < 1. Then
λI − T = λ(I − (1/λ) T )
is invertible, and according to (50),
(51) R(λ, T ) = −(λI − T )^{−1} = − Σ_{j=0}^∞ T^j /λ^{j+1} .
Next, we show that σ(T ) is always closed.
Lemma 8.3. The spectrum of any bounded linear operator is a closed subset of C.
Proof. We prove it by showing that the resolvent set ρ(T ) is open. Indeed, let λ0 ∈ ρ(T ).
Let us fix another λ such that
|λ − λ0 | < 1/kR(λ0 , T )k.
Then
T − λI = T − λ0 I − (λ − λ0 )I = (T − λ0 I)[I − (λ − λ0 )R(λ0 , T )].
Since k(λ−λ0 )R(λ0 , T )k < 1, the above is the product of two invertible operators. Hence, the
product is invertible. This establishes that λ ∈ ρ(T ), and hence the entire ball B_{1/kR(λ0 ,T )k} (λ0 )
is contained in ρ(T ).
The proof of the lemma provides an explicit formula for the resolvent in the form of a
power series expansion:
R(λ, T ) = [I − (λ − λ0 )R(λ0 , T )]−1 R(λ0 , T ),
so, according to (50),
(52) R(λ, T ) = Σ_{j=0}^∞ (λ − λ0 )^j R^{j+1} (λ0 , T ),   for |λ − λ0 | < 1/kR(λ0 , T )k.
There are several useful consequences of this formula. First, the following estimate shows
that the resolvent must blowup when nearing the spectrum.
Lemma 8.4. For any λ ∈ ρ(T ), we have
(53) kR(λ, T )k ≥ 1/dist{λ, σ(T )}.
Proof. Indeed, we can see from (52) that for any λ ∈ ρ(T ), the ball of radius 1/kR(λ, T )k
lies entirely in ρ(T ). So, the distance to the spectrum from λ will be greater than or equal
that radius:
dist{λ, σ(T )} ≥ 1/kR(λ, T )k.
This proves the result.
Next, formula (52) demonstrates the complex analytic structure of the resolvent.
Indeed, if we fix any x ∈ X, and x∗ ∈ X ∗ , then from (52), we have an expansion for the
function f (λ) = (x∗ , R(λ, T )x):
f (λ) = Σ_{j=0}^∞ cj (λ − λ0 )^j ,   cj = (x∗ , R^{j+1} (λ0 , T )x).
It proves that f is complex analytic in the resolvent set ρ(T ). This defines what is called a
weakly analytic map R : ρ(T ) → L(X, X). It turns out that every weakly analytic map is
automatically analytic with respect to the uniform topology, in other words the limit
R′ (λ, T ) = lim_{h→0} (R(λ + h, T ) − R(λ, T ))/h,   λ ∈ ρ(T )
exists in the operator norm of L(X). This general fact can be found in [?]. But in our case
we can prove it directly from (52) because manipulating with the convergent series can be
done in the same way as with complex variables.
Indeed, let λ0 ∈ ρ(T ); then
(R(λ0 + h, T ) − R(λ0 , T ))/h = (1/h) Σ_{j=1}^∞ h^j R^{j+1} (λ0 , T ) = Σ_{j=0}^∞ h^j R^{j+2} (λ0 , T ).
Then
k(R(λ0 + h, T ) − R(λ0 , T ))/h − R^2 (λ0 , T )k ≤ Σ_{j=1}^∞ |h|^j kR(λ0 , T )k^{j+2} = |h| kR(λ0 , T )k^3 / (1 − |h| kR(λ0 , T )k) → 0.
Hence,
(54) R′ (λ, T ) = R^2 (λ, T ).
This proves uniform analyticity and gives an explicit formula for the derivative of the resol-
vent.
Alternatively, the formula (54) can be derived from the Hilbert identities, which are useful
for various purposes. These state the following: for any λ, µ ∈ ρ(T ),
(55) R(µ, T ) − R(λ, T ) = (µ − λ)R(µ, T )R(λ, T )
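A quick numerical check of (55), in the convention of (51) where R(λ, T ) = (T − λI)^{−1}; the matrix and the points λ, µ below are arbitrary choices outside its spectrum:

import numpy as np

# Verify R(mu) - R(lambda) = (mu - lambda) R(mu) R(lambda) for a 4x4 matrix.
rng = np.random.default_rng(4)
T = rng.standard_normal((4, 4))
I = np.eye(4)
R = lambda lam: np.linalg.inv(T - lam * I)
lam, mu = 10.0, 7.0 + 2.0j
lhs = R(mu) - R(lam)
rhs = (mu - lam) * R(mu) @ R(lam)
print(np.max(np.abs(lhs - rhs)))     # ~ machine precision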
For any fixed x∗ ∈ X ∗ and x ∈ X, the analytic function f (λ) = x∗ (R(λ, T )x) is entire and
bounded. By Liouville’s theorem this implies that f (λ) is constant. So, in other words, for
any λ, µ ∈ C and any pair of x∗ ∈ X ∗ and x ∈ X
x∗ (R(λ, T )x) = x∗ (R(µ, T )x).
This implies that R(λ, T ) is independent of λ, and hence so is R−1 = T −λI. This is obviously
not true.
With the basic properties of the spectra developed so far, we can already describe spectra
of various operators.
Example 8.6. Let us consider the left-shift operator
T x = (x2 , x3 , . . . )
on `p , 1 ≤ p ≤ ∞. Clearly 0 ∈ σp (T ), and since kT k = 1 the whole spectrum is confined to
the unit disc. Now, if |λ| < 1 we might try to construct an eigenvector corresponding to λ
by considering the equation
T x = λx.
Read coordinate-wise, it results in an infinite system of linear equations
x2 = λx1 , x3 = λx2 , ...
So, if such a vector exists, it must be of the form
xn = λn−1 x1 .
Thus, the sequence
x = (1, λ, λ2 , . . . )
will be an eigenvector and it will belong to any `p class due to |λ| < 1 (in fact for p = ∞
this would also serve as an eigenvector for any |λ| = 1).
We have discovered that the open disc {|λ| < 1} is contained in the spectrum. According to
Lemma 8.3 the spectrum is a closed set, so the closed disc {|λ| ≤ 1} is also contained in the
spectrum. But since kT k = 1 the spectrum must be a subset of that disc. We conclude that
σ(T ) = {|λ| ≤ 1}.
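A small numerical sketch of the eigenvector construction above (the value of λ and the truncation length are arbitrary choices):

import numpy as np

# For the left shift, x = (1, lambda, lambda^2, ...) satisfies T x = lambda x.
lam = 0.3 + 0.4j                        # |lam| = 0.5 < 1
x = lam ** np.arange(50)
Tx = x[1:]                              # left shift drops the first coordinate
print(np.max(np.abs(Tx - lam * x[:-1])))    # 0 up to rounding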
Example 8.7. Consider now the multiplication operator on `p :
T x = (a1 x1 , a2 x2 , . . . ), a ∈ `∞ .
Clearly, an ∈ σp (T ) for the corresponding eigenvector is given by the basis vector en . Con-
sequently,
{an }_{n=1}^∞ ⊂ σ(T ), and, the spectrum being closed, so is the closure of this set.
Let now λ lie outside the closure of {an }_{n=1}^∞ . This means that there exists δ > 0 such that
|λ − an | > δ for all n. But then
8.1. Spectral Mapping Theorem and the Gelfand formula. Let us consider a poly-
nomial
p(λ) = an λn + an−1 λn−1 + · · · + a1 λ + a0 .
We can define the operator p(T ) by substituting T for λ:
p(T ) = an T n + an−1 T n−1 + · · · + a1 T + a0 I.
The natural question to ask is what is the spectrum of this operator? This will be answered
in the following Spectral Mapping Theorem.
Theorem 8.8. We have the following identity
σ(p(T )) = p(σ(T )).
Proof. First let us show the inclusion
σ(p(T )) ⊂ p(σ(T )).
Indeed, let us fix a µ ∈ C, and consider the polynomial p(λ) − µ. By the Fundamental
Theorem of Algebra, we have a factorization
p(λ) − µ = an (λ − λ1 )(λ − λ2 ) . . . (λ − λn ),
where λ1 , . . . , λn are the roots. Then
(56) p(T ) − µI = an (T − λ1 I)(T − λ2 I) . . . (T − λn I).
If all of the λi ∈ ρ(T ), then all the factors in the product on the right hand side are invertible,
and so is the product itself, p(T ) − µI, by Lemma 2.21. Hence, µ ∈ ρ(p(T )). So, if µ ∈
σ(p(T )), then one of λi ∈ σ(T ). But p(λi ) = µ, so µ ∈ p(σ(T )).
Let us prove the opposite inclusion
(57) p(σ(T )) ⊂ σ(p(T )).
We start from µ ∈ ρ(p(T )). Then p(T ) − µI is invertible. Note that all operators in the
product (56) commute. Then by Lemma 2.22, see also Exercise 2.2, each product term
T − λi I is invertible. Hence, all λi ∈ ρ(T ). If it were the case that µ ∈ p(σ(T )), then µ = p(λ)
for some λ ∈ σ(T ). But then λ would be a root of p(λ) − µ. We just
established that all such roots are in ρ(T ), which is a contradiction. So, µ is in the complement
of p(σ(T )). In other words,
ρ(p(T )) ⊂ C\p(σ(T )),
which is equivalent to (57).
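The theorem is easy to test numerically for matrices; a minimal sketch with an arbitrary 4×4 matrix and the polynomial p(t) = t² − 3t + 2 (both are illustrative choices only):

import numpy as np

# Compare the eigenvalues of p(A) with p applied to the eigenvalues of A.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))

def p_matrix(M):
    return M @ M - 3 * M + 2 * np.eye(M.shape[0])

def p_scalar(z):
    return z ** 2 - 3 * z + 2

eig_of_pA = np.sort_complex(np.linalg.eigvals(p_matrix(A)))
p_of_eigA = np.sort_complex(p_scalar(np.linalg.eigvals(A)))
print(np.max(np.abs(eig_of_pA - p_of_eigA)))       # ~ 0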
We will now define the spectral radius by
r(T ) = sup_{λ∈σ(T )} |λ|.
So far, we only know that r(T ) ≤ kT k. Since σ(T^n ) = σ(T )^n , we have r(T^n ) = r(T )^n . So,
according to the above, r(T )^n = r(T^n ) ≤ kT^n k, i.e.
r(T ) ≤ kT^n k^{1/n} ,
for all n ∈ N. Hence,
r(T ) ≤ lim inf_{n→∞} kT^n k^{1/n} ≤ lim sup_{n→∞} kT^n k^{1/n} .
Let us recall the power series representation of the resolvent (51), for |λ| > kT k, which we
rewrite in the variable ξ = 1/λ:
(60) R_ξ = −ξ Σ_{n=0}^∞ ξ^n T^n .
Then we have
kξ Σ_{n=0}^∞ ξ^n T^n k ≤ x Σ_{n=0}^∞ x^n kT^n k,   x = |ξ|.
From the calculus we recall that the power series above converges for all x < r0 , where
(61) 1/r0 = lim sup_{n→∞} kT^n k^{1/n} .
So, for all |ξ| < r0 , the power series converges absolutely. This power series therefore defines
an analytic function on the ball of radius r0 . At the same time Rξ defines an analytic
function on the ball of radius 1/r(T ). And both functions coincide on the ball of radius
1/kT k. This means that both functions coincide up to their common radius of analyticity
r1 = min{1/r(T ), r0 }. Assume now that 1/r(T ) > r0 . Then formula (60) holds for
|ξ| < r0 . If we consider the series itself,
f (ξ) = Σ_{n=0}^∞ ξ^n T^n ,
it shows that f (ξ) has an analytic extension beyond r0 , say up to the ball of radius r0 + 2ε,
where f remains bounded. Then we can compute the coefficients of the series by the Cauchy
formula
T^n = (1/(2πi)) ∫_{|ξ|=r0 −ε} f (ξ)/ξ^{n+1} dξ = (1/(2πi)) ∫_{|ξ|=r0 +ε} f (ξ)/ξ^{n+1} dξ,
due to the fact that the integrand is analytic in the annulus r0 − ε ≤ |ξ| ≤ r0 + ε. Given that
kf (ξ)k ≤ M , we obtain the estimate
kT^n k ≤ M/(r0 + ε)^n ,
for all n ∈ N. So,
lim sup_{n→∞} kT^n k^{1/n} ≤ 1/(r0 + ε) < 1/r0 ,
in contradiction with (61).
This proves that 1/r(T ) ≤ r0 , which is what we aimed for.
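A numerical sketch of the formula just proved, r(T ) = lim kT^n k^{1/n} (the matrix below is an arbitrary illustrative choice):

import numpy as np

# Compare ||A^n||^{1/n} with the spectral radius max|eig(A)| for a 3x3 matrix.
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.1, 0.0, 0.5]])
r = max(abs(np.linalg.eigvals(A)))
for n in (1, 5, 20, 100):
    print(n, np.linalg.norm(np.linalg.matrix_power(A, n), 2) ** (1.0 / n), r)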
8.2. On the spectrum of self-adjoint operators. The goal of this section will be to
establish special spectral properties for the classes of operators we defined on a Hilbert
space. So, we fix a Hilbert space H with an inner-product h·, ·i.
We start with a description of spectrum for self-adjoint operators.
Lemma 8.10. Let T ∈ L(H) be a self-adjoint operator, T = T ∗ . Then σ(T ) ⊂ R.
Proof. We start by proving that the spectrum is real.
First, let us make an observation known already from linear algebra: if λ ∈ σp (T ), then
λ ∈ R. Indeed, if
T x = λx,
scalar multiplying with x and using Lemma 4.21 we obtain
hT x, xi = λkxk2 ∈ R,
and hence, λ ∈ R. Next, we prove the following general claim.
Claim 8.11. Denote Tλ = T − λI. If Tλ is bounded from below
(62) kTλ xk ≥ c0 kxk,
then Tλ is invertible.
Proof. From Lemma 2.23 we learned that Rg Tλ = Y is closed. Suppose that Y 6= H. Then
consider a point x0 ∈ Y ⊥ . We have
0 = hx0 , Tλ xi = hTλ̄ x0 , xi, ∀x ∈ H.
So, Tλ̄ x0 = 0, which means that λ̄ is an eigenvalue, and hence λ ∈ R. But then Tλ x0 = 0
which contradicts the assumption.
Now let us fix λ = α + iβ with β 6= 0 and show that λ ∈ ρ(T ). Indeed, from Lemma 4.21,
hT x, xi ∈ R. Thus,
ImhTλ x, xi = −βkxk2 .
Then
|β|kxk2 = | ImhTλ x, xi| ≤ |hTλ x, xi| ≤ kTλ xkkxk,
which implies (62). Hence, Tλ is invertible, and so λ ∈ ρ(T ).
We now give more precise information about the spectrum.
Lemma 8.12. Let T ∈ L(H) be a self-adjoint operator. Then
(63) σ(T ) ⊂ closure of {hT x, xi : x ∈ S(H)}.
Proof. Let us fix λ outside the closure of {hT x, xi : x ∈ S(H)}. This means that there is a positive
distance between λ and the set of values hT x, xi:
δ = inf_{x∈S(H)} |λ − hT x, xi| > 0.
Next we prove that the end-points of the set of values hT x, xi lie in the spectrum.
Lemma 8.14. Let
s1 = inf_{kxk=1} hT x, xi,   s2 = sup_{kxk=1} hT x, xi.
Then s1 , s2 ∈ σ(T ).
Since according to (64)
kT k = max{|s1 |, |s2 |},
this proves that the spectral radius is in fact equal to the norm of the operator
kT k = r(T ).
Proof. We focus on proving that s2 ∈ σ(T ), with the argument for s1 being similar.
Let us pick a constant L > 0 large enough so that
h(T + LI)x, xi > 0, ∀x 6= 0.
If we can prove that s2 (T + LI) ∈ σ(T + LI), then by the Spectral Mapping Theorem 8.8,
we achieve the result. In other words, without loss of generality we can assume that s1 > 0.
In this case kT k = s2 . Let us pick a sequence of unit vectors xn ∈ S(H) so that
hT xn , xn i → kT k.
Then
kT xn − kT kxn k2 = kT xn k2 − 2kT khT xn , xn i + kT k2 ≤ 2kT k2 − 2kT khT xn , xn i → 0.
This means that the operator T − kT kI is not bounded from below, and as a consequence
cannot be invertible. Hence, kT k ∈ σ(T ).
Let us collect the obtained results in one spectral theorem for self-adjoint operators.
Theorem 8.15. Let T ∈ L(H) be a self-adjoint operator. Then
σ(T ) ⊂ closure of {hT x, xi : x ∈ S(H)},   and
inf_{kxk=1} hT x, xi, sup_{kxk=1} hT x, xi ∈ σ(T ).
Let us note that the spectrum of a self-adjoint operator is not in fact the same as
{hT x, xi : x ∈ S(H)}. This is clear even in the finite dimensional case. In this case we have an
orthonormal basis of eigenvectors
T xi = λi xi .
Thus, σ(T ) = {λi }i . Yet, for any unit vector x,
hT x, xi = Σ_i λi |hx, xi i|^2 ,
where Σ_i |hx, xi i|^2 = 1. In other words, the set of scalar products in this case is the convex
hull of the spectrum:
{hT x, xi : x ∈ S(H)} = [min σ(T ), max σ(T )].
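A minimal numerical sketch of this picture for a symmetric matrix (the matrix and the random sample of unit vectors below are arbitrary illustrative choices):

import numpy as np

# For a symmetric T, the Rayleigh quotients <Tx, x>, ||x|| = 1, fill the
# interval [min sigma(T), max sigma(T)].
rng = np.random.default_rng(2)
B = rng.standard_normal((5, 5))
T = (B + B.T) / 2
lam = np.linalg.eigvalsh(T)
x = rng.standard_normal((5, 100000))
x /= np.linalg.norm(x, axis=0)
q = np.einsum('ij,ik,kj->j', x, T, x)      # Rayleigh quotients <Tx, x>
print(lam.min(), q.min(), q.max(), lam.max())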