MAGIC010 Ergodic Theory Lecture 5

5. Recurrence and Ergodicity

§5.1 Introduction
The aim of this lecture is to state Birkhoff’s Ergodic Theorem and to study
some of its applications in the context of ergodic transformations of a prob-
ability space. In particular we will apply it to the doubling map and to
the continued fraction map and deduce some results of a number-theoretic
nature. Birkhoff’s Ergodic Theorem can be viewed as giving us information
about rates of recurrence. Indeed, one application that we shall study is
Kac’s Lemma: given a set A of measure µ(A), the orbit of almost every point
of A eventually returns to A and the expected time of the first recurrence
is 1/µ(A). We begin by discussing Poincaré’s theorem: this gives us infor-
mation about the recurrence properties of an arbitrary measure-preserving
transformation.

§5.2 Poincaré’s Recurrence Theorem

Theorem 5.1 (Poincaré’s Recurrence Theorem)
Let T : X → X be a measure-preserving transformation of (X, B, µ) and let
A ∈ B have µ(A) > 0. Then for µ-a.e. x ∈ A, the orbit $\{T^n x\}_{n=0}^{\infty}$ returns
to A infinitely often.
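Proof. Let
\[ E = \{x \in A \mid T^n x \in A \text{ for infinitely many } n\}, \]
then we have to show that µ(A\E) = 0.
If we write
\[ F = \{x \in A \mid T^n x \notin A \ \ \forall n \geq 1\} \]
then we have the identity
\[ A \setminus E = \bigcup_{k=0}^{\infty} (T^{-k}F \cap A). \]

Thus we have the estimate
\[
\mu(A \setminus E) = \mu\Big(\bigcup_{k=0}^{\infty} (T^{-k}F \cap A)\Big)
\leq \mu\Big(\bigcup_{k=0}^{\infty} T^{-k}F\Big)
\leq \sum_{k=0}^{\infty} \mu(T^{-k}F).
\]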


Since $\mu(T^{-k}F) = \mu(F)$ for all k ≥ 0 (because the measure is preserved), it suffices
to show that µ(F) = 0.
First suppose that n > m and that $T^{-m}F \cap T^{-n}F \neq \emptyset$. If y lies in
this intersection then $T^m y \in F$ and $T^{n-m}(T^m y) = T^n y \in F \subset A$, which
contradicts the definition of F. Thus $T^{-m}F$ and $T^{-n}F$ are disjoint.
Since $\{T^{-k}F\}_{k=0}^{\infty}$ is a disjoint family, we have
\[
\sum_{k=0}^{\infty} \mu(T^{-k}F) = \mu\Big(\bigcup_{k=0}^{\infty} T^{-k}F\Big) \leq \mu(X) = 1.
\]
Since the terms in the summation have the constant value µ(F), we must
have µ(F) = 0. □

Remark There are many ways of proving Poincaré’s Recurrence Theorem

(see Petersen’s book). The proof above avoids the more relaxed attitude
towards sets of measure zero in the proof presented in the lecture (although
that proof is fully rigorous); it also makes more explicit use of the fact that
µ is a probability measure.

§5.3 Ergodic Theorems

An ergodic theorem is a result that describes the limiting behaviour of
sequences of the form
\[ \frac{1}{n} \sum_{j=0}^{n-1} f \circ T^j \tag{5.1} \]
as n → ∞. The precise formulation of an ergodic theorem depends on
the class of function f (for example, one could assume that f is integrable,
$L^2$, or continuous), and the notion of convergence used (for example, the
convergence could be pointwise, $L^2$, or uniform). Here we discuss von
Neumann’s (Mean) Ergodic Theorem and Birkhoff’s Ergodic Theorem. Von
Neumann’s Ergodic Theorem is in the context of $f \in L^2$ and $L^2$-convergence
of the ergodic averages (5.1); Birkhoff’s Ergodic Theorem is in the context of
$f \in L^1$ and almost everywhere pointwise convergence of (5.1). Note that $L^2$
convergence neither implies nor is implied by almost everywhere pointwise
convergence.
Before stating these theorems, we first need to discuss conditional ex-
pectation.

§5.4 Conditional expectation

We will need the concepts of Radon-Nikodym derivatives and conditional
expectation.


Definition. Let µ be a measure on (X, B). We say that a measure ν
is absolutely continuous with respect to µ, and write ν ≪ µ, if ν(B) = 0
whenever µ(B) = 0, $B \in \mathcal{B}$.

Remark Thus ν is absolutely continuous with respect to µ if sets of µ-measure
zero also have ν-measure zero (but there may be more sets of ν-measure
zero). For example, let $f \in L^1(X, \mathcal{B}, \mu)$ be non-negative and define
a measure ν by
\[ \nu(B) = \int_B f \, d\mu. \]
Then ν ≪ µ.
The following theorem says that, essentially, all absolutely continuous
measures occur in this way.

Theorem 5.2 (Radon-Nikodym)

Let (X, B, µ) be a probability space. Let ν be a measure defined on B and
suppose that ν ≪ µ. Then there is a non-negative measurable function f
such that
\[ \nu(B) = \int_B f \, d\mu \quad \text{for all } B \in \mathcal{B}. \]
Moreover, f is unique in the sense that if g is a measurable function with
the same property then f = g µ-a.e.

Remark If ν ≪ µ then it is customary to write dν/dµ for the function
given by the Radon-Nikodym theorem, that is
\[ \nu(B) = \int_B \frac{d\nu}{d\mu} \, d\mu. \]

The following relations are all easy to prove, and indicate why the notation
was chosen in this way.

(i) If ν ≪ µ and f is a µ-integrable function then
\[ \int f \, d\nu = \int f \, \frac{d\nu}{d\mu} \, d\mu. \]

(ii) If $\nu_1, \nu_2 \ll \mu$ then
\[ \frac{d(\nu_1 + \nu_2)}{d\mu} = \frac{d\nu_1}{d\mu} + \frac{d\nu_2}{d\mu}. \]

(iii) If λ ≪ ν ≪ µ then
\[ \frac{d\lambda}{d\mu} = \frac{d\lambda}{d\nu} \, \frac{d\nu}{d\mu}. \]


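Let $\mathcal{A} \subset \mathcal{B}$ be a sub-σ-algebra. Note that µ defines a measure on $\mathcal{A}$ by
restriction. Let $f \in L^1(X, \mathcal{B}, \mu)$ be non-negative. Then we can define a measure ν on $\mathcal{A}$ by
setting
\[ \nu(A) = \int_A f \, d\mu \quad \text{for } A \in \mathcal{A}. \]
Note that $\nu \ll \mu|_{\mathcal{A}}$. Hence by the Radon-Nikodym theorem, there is a
unique $\mathcal{A}$-measurable function $E(f \mid \mathcal{A})$ such that
\[ \nu(A) = \int_A E(f \mid \mathcal{A}) \, d\mu. \]

We call $E(f \mid \mathcal{A})$ the conditional expectation of f with respect to the
σ-algebra $\mathcal{A}$.
So far, we have only defined $E(f \mid \mathcal{A})$ for non-negative f. To define
$E(f \mid \mathcal{A})$ for an arbitrary f, we split f into positive and negative parts
$f = f_+ - f_-$ where $f_+, f_- \geq 0$ and define
\[ E(f \mid \mathcal{A}) = E(f_+ \mid \mathcal{A}) - E(f_- \mid \mathcal{A}). \]

Thus we can view conditional expectation as an operator
\[ E(\cdot \mid \mathcal{A}) : L^1(X, \mathcal{B}, \mu) \to L^1(X, \mathcal{A}, \mu). \]

Note that $E(f \mid \mathcal{A})$ is uniquely determined (up to µ-a.e. equality) by the two requirements that

(i) $E(f \mid \mathcal{A})$ is $\mathcal{A}$-measurable, and

(ii) $\int_A f \, d\mu = \int_A E(f \mid \mathcal{A}) \, d\mu$ for all $A \in \mathcal{A}$.

Intuitively, one can think of $E(f \mid \mathcal{A})$ as the best approximation to f in the
smaller space of all $\mathcal{A}$-measurable functions.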
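As a concrete illustration, consider a finite probability space in which the sub-σ-algebra is generated by a partition. In that setting the conditional expectation is constant on each cell and equals the µ-weighted average of f over the cell. The Python sketch below checks requirement (ii) on such a toy example; the six-point space, the weights, and the names `cond_exp`, `integral`, `mu` and `cells` are illustrative assumptions and do not come from the lecture.

```python
from fractions import Fraction

def cond_exp(f, mu, cells):
    """Conditional expectation of f given the sigma-algebra generated by a
    finite partition `cells` of a finite probability space.

    f, mu: dicts mapping each point to a value / a probability.
    Returns a dict that is constant on every cell of the partition.
    """
    result = {}
    for cell in cells:
        mass = sum(mu[x] for x in cell)
        if mass == 0:
            avg = Fraction(0)  # any value works on a null cell
        else:
            avg = sum(f[x] * mu[x] for x in cell) / mass
        for x in cell:
            result[x] = avg
    return result

def integral(g, mu, subset):
    """Integral of g over `subset` with respect to mu."""
    return sum(g[x] * mu[x] for x in subset)

# Toy example (hypothetical data): six points with unequal weights.
points = range(6)
mu = {x: Fraction(w, 21) for x, w in zip(points, [1, 2, 3, 4, 5, 6])}
f = {x: Fraction(x * x - 3) for x in points}
cells = [{0, 1}, {2, 3, 4}, {5}]
Ef = cond_exp(f, mu, cells)
```

Here `integral(f, mu, cell)` and `integral(Ef, mu, cell)` agree for each cell, and hence for every union of cells, which is exactly requirement (ii) for the σ-algebra generated by the partition.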
To state von Neumann’s and Birkhoff’s Ergodic Theorems precisely, we
will need the sub-σ-algebra $\mathcal{I}$ of T-invariant subsets, namely:
\[ \mathcal{I} = \{B \in \mathcal{B} \mid T^{-1}B = B \text{ a.e.}\}. \]
It is straightforward to check that $\mathcal{I}$ is a σ-algebra. Note that if T is ergodic
then $\mathcal{I}$ is the trivial σ-algebra consisting of all sets in $\mathcal{B}$ of measure 0 or 1.

§5.5 Von Neumann’s Mean Ergodic Theorem

Von Neumann’s Ergodic Theorem deals with the $L^2$-limiting behaviour of
$\frac{1}{n} \sum_{j=0}^{n-1} f \circ T^j$ for $f \in L^2(X, \mathcal{B}, \mu)$.


Theorem 5.3 (Von Neumann’s Ergodic Theorem)

Let (X, B, µ) be a probability space and let T : X → X be a measure-preserving
transformation. Let $\mathcal{I}$ denote the σ-algebra of T-invariant sets.
Then for every $f \in L^2(X, \mathcal{B}, \mu)$, we have
\[ \frac{1}{n} \sum_{j=0}^{n-1} f \circ T^j \to E(f \mid \mathcal{I}) \]
in $L^2$.

Corollary 5.4
Let (X, B, µ) be a probability space and let T : X → X be an ergodic
measure-preserving transformation. Let $f \in L^2(X, \mathcal{B}, \mu)$. Then
\[ \frac{1}{n} \sum_{j=0}^{n-1} f \circ T^j \to \int f \, d\mu, \quad \text{as } n \to \infty, \]
in $L^2$.

Proof. If T is ergodic then $\mathcal{I}$ is the trivial σ-algebra $\mathcal{N}$ consisting of sets
of measure 0 or 1. If $f \in L^2(X, \mathcal{B}, \mu)$ then $E(f \mid \mathcal{N}) = \int f \, d\mu$. □

In order to prove von Neumann’s Ergodic Theorem, it is useful to recast

it in terms of spectral theory.

Theorem 5.5 (von Neumann’s Ergodic Theorem for Operators)

Let U be a unitary operator of a complex Hilbert space H. Let
$I = \{v \in H \mid Uv = v\}$ be the subspace of U-invariant vectors and let $P_I : H \to I$
be orthogonal projection onto I. Then for all v ∈ H we have
\[ \frac{1}{n} \sum_{j=0}^{n-1} U^j v \to P_I v \tag{5.2} \]
in the norm induced on H by the inner product.

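Proof of Theorem 5.5. First note that if v ∈ I then (5.2) holds, as
\[ \frac{1}{n} \sum_{j=0}^{n-1} U^j v = v = P_I v. \]
If v = Uw − w for some w ∈ H then the sum telescopes, so
\[ \Big\| \frac{1}{n} \sum_{j=0}^{n-1} U^j v \Big\| = \frac{1}{n} \|U^n w - w\| \leq \frac{1}{n} \, 2\|w\| \to 0. \]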


If we let B denote the norm-closure of the subspace {Uw − w | w ∈ H} then
it follows that
\[ \frac{1}{n} \sum_{j=0}^{n-1} U^j v \to 0 \]
for all v ∈ B, by approximation.

We claim that H = I ⊕ B, an orthogonal decomposition. Suppose that
v ⊥ B. Then $\langle v, Uw - w \rangle = 0$ for all w ∈ H. Hence $\langle U^* v, w \rangle = \langle v, w \rangle$ for all
w ∈ H. Hence $U^* v = v$. As U is unitary, we have that $U^* = U^{-1}$. Hence
v = Uv, so that v ∈ I. Reversing each implication we see that v ∈ I implies
v ⊥ B, and the claim follows. Since $P_I v \in I$ and $v - P_I v \in B$, the two cases
treated above give (5.2) for every v ∈ H. □
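The convergence in (5.2) can be observed numerically in the simplest possible setting. The sketch below assumes a diagonal unitary operator on $\mathbb{C}^3$ (an illustrative choice, not an example from the lecture), for which the invariant subspace is spanned by the coordinates whose eigenvalue is exactly 1. The names `ergodic_average`, `project_onto_fixed` and `error` are hypothetical helpers.

```python
import cmath

def ergodic_average(U_diag, v, n):
    """Return (1/n) * sum_{j=0}^{n-1} U^j v for a diagonal operator U
    whose diagonal entries are given by the list U_diag."""
    avg = [0j] * len(v)
    power = [1 + 0j] * len(v)   # diagonal of U^j, starting with j = 0
    for _ in range(n):
        for i in range(len(v)):
            avg[i] += power[i] * v[i]
            power[i] *= U_diag[i]
    return [a / n for a in avg]

def project_onto_fixed(U_diag, v):
    """Orthogonal projection onto {v : Uv = v} for a diagonal unitary U."""
    return [vi if ui == 1 else 0j for ui, vi in zip(U_diag, v)]

def norm(w):
    """Euclidean norm on C^d."""
    return sum(abs(z) ** 2 for z in w) ** 0.5

# Assumed example on C^3: one fixed direction and two rotating ones.
U_diag = [1 + 0j, cmath.exp(1j * 1.0), cmath.exp(1j * 2.5)]
v = [2 - 1j, 1 + 1j, -3 + 0.5j]

def error(n):
    """Distance between the n-th ergodic average and P_I v."""
    avg = ergodic_average(U_diag, v, n)
    proj = project_onto_fixed(U_diag, v)
    return norm([a - p for a, p in zip(avg, proj)])
```

In this example `error(n)` decays like a constant times 1/n, matching the telescoping estimate $\frac{1}{n} 2\|w\|$ in the proof; for vectors that lie in B but are not of the form Uw − w the convergence can be slower.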

Remark Note that an isometry of a Hilbert space H is a linear operator
U such that $\langle Uv, Uw \rangle = \langle v, w \rangle$ for all v, w ∈ H. We say that U is unitary if,
in addition, it is invertible. Equivalently, U is unitary if the adjoint operator
$U^*$ is the inverse of U: $U^* U = U U^* = \mathrm{id}$.
We can prove von Neumann’s Ergodic Theorem for an invertible measure-preserving
transformation T of a probability space (X, B, µ) as follows. Recall
that $L^2(X, \mathcal{B}, \mu)$ is a Hilbert space with respect to the inner product
\[ \langle f, g \rangle = \int f \bar{g} \, d\mu \]
and that T induces a linear operator $U : L^2 \to L^2$ by $Uf = f \circ T$. As T is
measure-preserving, we have that U is an isometry; if T is invertible then U
is unitary.
Hence, when T is invertible, Theorem 5.3 follows immediately from The-
orem 5.5.
One can deduce from Theorem 5.5 that the result continues to hold
when U is an isometry and is not assumed to be invertible (this makes a
good exercise). Instead, we give an argument based around the construction
of the natural extension of the dynamical system T .
Let T be a measure-preserving transformation of the probability space
(X, B, µ). Introduce a new space
\[ \hat{X} = \{(x_j)_{j=0}^{\infty} \mid T(x_j) = x_{j-1} \text{ for all } j \geq 1\} \]
(essentially, we are just extending the space X so that it contains all the possible pasts
for each point x ∈ X). Equip $\hat{X}$ with the smallest σ-algebra $\hat{\mathcal{B}}$ which contains
all sets of the form $\{(x_j)_{j=0}^{\infty} \mid x_0 \in B_0, \ldots, x_r \in B_r\}$ for each $B_j \in \mathcal{B}$ and
any r ∈ N. Define a measure $\hat{\mu}$ on such sets by
\[ \hat{\mu}\{(x_j)_{j=0}^{\infty} \mid x_0 \in B_0, \ldots, x_r \in B_r\} = \mu(T^{-r}B_0 \cap T^{-(r-1)}B_1 \cap \cdots \cap B_r) \]
and extend to $\hat{\mathcal{B}}$ by using the Kolmogorov extension theorem. Define a transformation
$\hat{T} : \hat{X} \to \hat{X}$ by $\hat{T}(x_0, x_1, x_2, \ldots) = (Tx_0, Tx_1, Tx_2, \ldots)$. Then $\hat{T}$
is invertible: $\hat{T}^{-1}(x_0, x_1, x_2, \ldots) = (x_1, x_2, \ldots)$.


One can check that µ̂ is a T̂ -invariant measure if and only if µ is a T -

invariant measure. Indeed, µ̂ is ergodic for T̂ if and only if µ is ergodic for
T.
The transformation T̂ is called the natural extension of T .
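Let $f \in L^2(X, \mathcal{B}, \mu)$. Then we obtain a function $\hat{f} \in L^2(\hat{X}, \hat{\mathcal{B}}, \hat{\mu})$ by
defining $\hat{f}(x_0, x_1, x_2, \ldots) = f(x_0)$. Then one can easily see that
\[ \frac{1}{n} \sum_{j=0}^{n-1} \hat{f} \circ \hat{T}^j (x_0, x_1, x_2, \ldots) = \frac{1}{n} \sum_{j=0}^{n-1} f \circ T^j (x_0). \]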

Hence von Neumann’s Ergodic Theorem for T follows from von Neumann’s
Ergodic Theorem for T̂ .

§5.6 Birkhoff’s Pointwise Ergodic Theorem

Birkhoff’s Ergodic Theorem deals with the behaviour of $\frac{1}{n} \sum_{j=0}^{n-1} f(T^j x)$ for
µ-a.e. x ∈ X, and for $f \in L^1(X, \mathcal{B}, \mu)$.

Theorem 5.6 (Birkhoff’s Ergodic Theorem)

Let (X, B, µ) be a probability space and let T : X → X be a measure-preserving
transformation. Let $\mathcal{I}$ denote the σ-algebra of T-invariant sets.
Then for every $f \in L^1(X, \mathcal{B}, \mu)$, we have
\[ \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x) \to E(f \mid \mathcal{I})(x) \]
for µ-a.e. x ∈ X.

Corollary 5.7
Let (X, B, µ) be a probability space and let T : X → X be an ergodic
measure-preserving transformation. Let $f \in L^1(X, \mathcal{B}, \mu)$. Then
\[ \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x) \to \int f \, d\mu, \quad \text{as } n \to \infty, \]
for µ-a.e. x ∈ X.
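To see Corollary 5.7 at work numerically, one can compute ergodic averages along a single orbit and compare them with the space average. The sketch below assumes an irrational circle rotation as the transformation; this map preserves Lebesgue measure and is ergodic, a standard fact that is not proved in this lecture. The rotation number, the two observables and the helper names are all illustrative assumptions.

```python
import math

def birkhoff_average(f, T, x, n):
    """Return (1/n) * sum_{j=0}^{n-1} f(T^j x)."""
    total = 0.0
    for _ in range(n):
        total += f(x)
        x = T(x)
    return total / n

ALPHA = (math.sqrt(5) - 1) / 2   # irrational rotation number (assumed example)

def rotation(x):
    """Circle rotation x -> x + ALPHA mod 1; preserves Lebesgue measure."""
    return (x + ALPHA) % 1.0

def indicator(x):
    """Indicator function of [0, 1/4); its integral over [0, 1] is 1/4."""
    return 1.0 if x < 0.25 else 0.0

def observable(x):
    """A smooth observable with integral 1/2 over [0, 1]."""
    return math.sin(math.pi * x) ** 2
```

For instance, `birkhoff_average(indicator, rotation, 0.1, 20000)` should be close to 1/4 and `birkhoff_average(observable, rotation, 0.3, 20000)` close to 1/2. A rotation is used rather than the doubling map because a floating-point orbit of the doubling map is not informative: every binary floating-point number is a dyadic rational, so its orbit under doubling reaches 0 after finitely many steps.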

§5.7 Applications of Birkhoff’s Ergodic Theorem

§5.7.1 Kac’s Lemma
Poincaré’s Recurrence Theorem tells us that, under a measure-preserving
transformation, almost every point of a subset A of positive measure will
return to A. However, it does not tell us how long we should have to wait for
this to happen. One would expect that return times to sets of large measure
are small, and that return times to sets of small measure are large. This is
indeed the case, and forms the content of Kac’s Lemma.
Let T : X → X be a measure-preserving transformation of a probability
space (X, B, µ) and let A ⊂ X be a measurable subset with µ(A) > 0. By
Poincaré’s Recurrence Theorem, the integer
\[ n_A(x) = \inf\{n \geq 1 \mid T^n(x) \in A\} \]
is defined for a.e. x ∈ A.

Theorem 5.8 (Kac’s Lemma)

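Let T be an ergodic measure-preserving transformation of the probability
space (X, B, µ). Let $A \in \mathcal{B}$ be such that µ(A) > 0. Then
\[ \int_A n_A \, d\mu = 1. \]
Equivalently, the mean of $n_A$ with respect to the normalised measure µ(· ∩ A)/µ(A) on A is 1/µ(A).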

Proof. Let
\[ A_n = A \cap T^{-1}A^c \cap \cdots \cap T^{-(n-1)}A^c \cap T^{-n}A. \]
Then $A_n$ consists of those points in A that return to A after exactly n
iterations of T, i.e. $A_n = \{x \in A \mid n_A(x) = n\}$.
Consider the illustration in Figure 5.1.

[Figure 5.1: The return times to A. The diagram shows A divided into the sets $A_1, A_2, A_3, \ldots, A_n$, with arrows labelled T.]

As T is ergodic, almost every point of X eventually enters A. Hence the diagram represents almost all
of X. Note that the column above $A_n$ in the diagram consists of n sets,
$A_{n,0}, \ldots, A_{n,n-1}$ say, with $A_{n,0} = A_n$. Note that $T^{-k}A_{n,k} = A_n$. As T is
measure-preserving, it follows that $\mu(A_{n,k}) = \mu(A_n)$ for k = 0, …, n − 1.
Hence
\begin{align*}
1 = \mu(X) &= \sum_{n=1}^{\infty} \sum_{k=0}^{n-1} \mu(A_{n,k}) \\
&= \sum_{n=1}^{\infty} n \mu(A_n) \\
&= \sum_{n=1}^{\infty} \int_{A_n} n_A \, d\mu \\
&= \int_A n_A \, d\mu.
\end{align*}
□
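Kac’s Lemma can be checked exactly on a finite system. The sketch below assumes the cyclic map T(x) = x + s mod N on {0, …, N − 1} with the uniform probability measure, which is ergodic when gcd(s, N) = 1 because T is then a single N-cycle. This finite model, the chosen set A and the helper names are illustrative assumptions rather than part of the lecture.

```python
from fractions import Fraction
from math import gcd

def first_return_time(T, x, A):
    """n_A(x) = inf{n >= 1 : T^n(x) in A} for a point x of A."""
    n, y = 1, T(x)
    while y not in A:
        n, y = n + 1, T(y)
    return n

def kac_integral(N, step, A):
    """Integral of n_A over A for T(x) = x + step mod N on {0,...,N-1}
    with the uniform probability measure (each point has mass 1/N)."""
    assert gcd(step, N) == 1, "T must be a single N-cycle to be ergodic"
    T = lambda x: (x + step) % N
    return sum(Fraction(first_return_time(T, x, A), N) for x in A)

# Assumed example: N = 30 points, step 7, and an arbitrary subset A.
A = {0, 1, 5, 11, 12, 29}
value = kac_integral(30, 7, A)
```

Here `value` equals 1 exactly: the return times of the points of A add up to N, because the arcs of the cycle between consecutive visits to A partition the whole space, just as the columns do in Figure 5.1.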

§5.7.2 Ehrenfests’ example

The following example, due to P. and T. Ehrenfest, demonstrates that the
return times in Poincaré’s Recurrence Theorem may be extremely large.
Consider two urns. One urn contains 100 balls, numbered 1 to 100, and
the other urn is empty. We also have a random number generator: this
could be a bag containing 100 slips of paper, numbered 1 to 100.
Each second, a slip of paper is drawn from the bag, the number is noted,
and the slip of paper is returned to the bag. The ball bearing that number
is then moved from whichever urn it is currently in to the other urn.
Naively, we would expect that the system will settle into an equilibrium
state in which there are 50 balls in each urn. Of course, there will continue
to be small random fluctuations about the 50-50 distribution. However,
it would appear highly unlikely for the system to return to the state in
which 100 balls are in the first urn. Nevertheless, the Poincaré Recurrence
Theorem tells us that this situation will occur almost surely (although we
will have to wait a long time for this to happen).
To see this, we represent the system as a shift of finite type on 101 symbols with
an appropriate measure. Regard $x_j \in \{0, \ldots, 100\}$ as being the number of
balls in the first urn after j seconds. Hence a sequence $(x_j)_{j=0}^{\infty}$ records the
number of balls in the first urn at each time. As the number of balls in the
first urn increases or decreases by 1 each second, such sequences determine
the shift of finite type
\[ \Sigma = \{(x_j)_{j=0}^{\infty} \mid x_j \in \{0, \ldots, 100\}, \ |x_j - x_{j+1}| = 1 \text{ for } j = 0, 1, 2, \ldots\} \]
(this corresponds to a transition matrix $A = (a_{ij})$ where $a_{i,i+1} = a_{i+1,i} = 1$
for i = 0, …, 99 and $a_{ij} = 0$ otherwise).
Let $p_i$ denote the probability of there being i balls in the first urn. This
probability is independent of time, and is equal to
\[ p_i = \frac{1}{2^{100}} \binom{100}{i}. \]


If we have i balls in the first urn then at the next stage we must have either
i − 1 or i + 1 balls in the first urn. The number of balls becomes i − 1 if
the random number chosen is equal to the number of one of the balls in
the first urn. As there are currently i such balls, the probability of this
happening is i/100. Hence the conditional probability $P_{i,i-1}$ that there are
i − 1 balls remaining given that we started with i balls in the first urn is
i/100. Similarly, the conditional probability $P_{i,i+1}$ that there are i + 1 balls
in the first urn given that we started with i balls is (100 − i)/100. This
defines a stochastic matrix:
\[
P = \begin{pmatrix}
0 & 1 & 0 & 0 & 0 & \cdots \\
\frac{1}{100} & 0 & \frac{99}{100} & 0 & 0 & \cdots \\
0 & \frac{2}{100} & 0 & \frac{98}{100} & 0 & \cdots \\
0 & 0 & \frac{3}{100} & 0 & \frac{97}{100} & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}.
\]

Note that $P_{i,j} \neq 0$ if and only if $a_{ij} = 1$ and so P and A are compatible. It is
straightforward to check that pP = p. Hence we have a Markov probability
measure $\mu_P$ defined on Σ. The matrix A is irreducible (but is not aperiodic);
this ensures that $\mu_P$ is ergodic.
Consider the cylinder A = [100] of length 1. This represents there being
100 balls in the first urn. By Poincaré’s Recurrence Theorem, if we start in
A then we return to A infinitely often. By Kac’s Lemma, the expected first
return time to A is
\[ \frac{1}{\mu_P(A)} = 2^{100} \text{ seconds}, \]
which is about $4 \times 10^{22}$ years, or about $3 \times 10^{12}$ times the length of time
that the Universe has so far existed!
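The two computations in this example, the stationarity relation pP = p and the size of the expected return time, can be verified with exact rational arithmetic. The following sketch is one way to do so; the helper names `P` and `apply_left` and the choice of a 365.25-day year are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

N = 100  # number of balls

# Stationary vector p_i = binom(N, i) / 2^N.
p = [Fraction(comb(N, i), 2 ** N) for i in range(N + 1)]

def P(i, j):
    """Transition probability from i balls to j balls in the first urn."""
    if j == i - 1:
        return Fraction(i, N)
    if j == i + 1:
        return Fraction(N - i, N)
    return Fraction(0)

def apply_left(vec):
    """Row vector times P: (vec P)_j = sum_i vec_i P(i, j)."""
    out = []
    for j in range(N + 1):
        total = Fraction(0)
        for i in (j - 1, j + 1):          # only neighbours contribute
            if 0 <= i <= N:
                total += vec[i] * P(i, j)
        out.append(total)
    return out

expected_return_seconds = 1 / p[N]        # Kac: 1 / mu_P([100])
SECONDS_PER_YEAR = 365.25 * 24 * 3600     # assumed convention for a year
expected_return_years = float(expected_return_seconds) / SECONDS_PER_YEAR
```

With these definitions `apply_left(p) == p` holds exactly, `expected_return_seconds` equals $2^{100}$, and `expected_return_years` is roughly $4 \times 10^{22}$, in agreement with the figure quoted above.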

Remark This suggests that returning to a small set is a ‘rare’ event, and
as such the return times are likely to have a Poisson distribution. This
is normally formalised as follows. Let T be an ergodic measure-preserving
transformation of a probability space (X, B, µ) and let x ∈ X. Let $A_n$ be a
decreasing sequence of subsets that decrease to x. Define the return time as
$\tau(x) = \lim_{n \to \infty} \mu(A_n) \tau_{A_n}(x)$. Then $\int \tau \, d\mu = 1$. One would normally expect
τ to have a Poisson distribution, but this is only known in particular cases;
usually one needs some hyperbolicity of the dynamics T (such as being a
shift of finite type). Additionally, there are normally restrictions on the sets
$A_n$ (such as requiring them to be cylinders).


§5.7.3 Normality of numbers

Let r ≥ 2. Recall that any number x ∈ [0, 1] can be written as a base r
‘decimal’, i.e. there exist digits $x_j \in \{0, 1, \ldots, r - 1\}$ for which
\[ x = \sum_{j=1}^{\infty} \frac{x_j}{r^j}. \]

This r-adic expansion is unique, unless the sequence (xj ) ends in either
infinitely repeated 0s or infinitely repeated (r − 1)s.

Definition. Fix r ≥ 2. A number x ∈ [0, 1] is said to be (simply) normal

(in base r) if it has a unique expansion as an r-adic expansion, and for each
k = 0, 1, . . . , r − 1, the frequency with which digit k occurs in its r-adic
expansion is equal to 1/r.

For each r ≥ 2, define the map $T_r : [0, 1] \to [0, 1]$ by $T_r(x) = rx \bmod 1$.
(The case r = 2 is the doubling map.) It is easy to see, by following the
arguments for the doubling map, that Lebesgue measure µ on [0, 1] is an
ergodic invariant measure for $T_r$.
The close connections between r-adic expansions and the map $T_r$ can be
used to prove the following result. A number is said to be normal if it is
simultaneously simply normal in every base r ≥ 2.

Proposition 5.9
Lebesgue almost every number in [0, 1] is normal.

Proof. Fix r ≥ 2. Then clearly all but a countable set of points have a
unique r-adic expansion. Fix k ∈ {0, 1, …, r − 1}. Then it is easy to see
that $x_j = k$ if and only if $T_r^{j-1} x \in [k/r, (k + 1)/r)$. Thus
\[ \frac{1}{n} \operatorname{card}\{1 \leq j \leq n \mid x_j = k\} = \frac{1}{n} \sum_{j=0}^{n-1} \chi_{[k/r,(k+1)/r)}(T_r^j x). \]
By Birkhoff’s Ergodic Theorem, for Lebesgue almost every point x the above
expression converges to $\int \chi_{[k/r,(k+1)/r)}(x) \, dx = 1/r$. Let $N_r$ denote the set
of points for which this holds for every k ∈ {0, 1, …, r − 1}.
As $N_r$ has measure 1 for each r ≥ 2, it follows that $N = \bigcap_{r=2}^{\infty} N_r$ has
measure 1. Hence Lebesgue almost every point is normal. □
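The correspondence between digits and the orbit of $T_r$ used in the proof can be followed exactly on a computer if the point is a rational number with denominator $r^N$. The sketch below takes that approach, using integer arithmetic only; the function names, the pseudo-random choice of the point, and the default seed are assumptions for illustration. Such a point is of course not itself normal (its expansion terminates), so the experiment only reflects the statistics of its first n digits.

```python
import random

def digit_counts(M, r, N, n):
    """Follow the orbit of x = M / r**N under T_r(x) = r*x mod 1 for n steps
    and count how often each digit k occurs among x_1, ..., x_n.

    The point T_r^j(x) is stored exactly as the integer numerator M_j of
    M_j / r**N, so T_r^j(x) lies in [k/r, (k+1)/r) exactly when
    (r * M_j) // r**N == k.
    """
    denom = r ** N
    counts = [0] * r
    for _ in range(n):
        k, M = divmod(r * M, denom)   # k = digit, new M = numerator of T_r(x)
        counts[k] += 1
    return counts

def sample_frequencies(r, n, seed=0):
    """Digit frequencies over the first n base-r digits of a pseudo-random
    point of [0, 1) specified to n digits (an illustrative choice)."""
    rng = random.Random(seed)
    M = rng.randrange(r ** n)
    counts = digit_counts(M, r, n, n)
    return [c / n for c in counts]
```

For example, `digit_counts(5, 2, 3, 3)` follows x = 5/8 = 0.101 in base 2 and returns `[1, 2]`. Calling `sample_frequencies(3, 6000)` gives three frequencies that should each be near 1/3, and the corresponding mean digit should be near (r − 1)/2 = 1.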

Remark Given r ≥ 2 it is easy to construct a number that is simply
normal in base r. However, it is much harder to exhibit a number that
is simultaneously normal in every base r ≥ 2: for instance, no familiar
constant such as π or e is known to be normal in even a single base.
One can easily use Birkhoff’s Ergodic Theorem to prove the following
result.


Proposition 5.10
For Lebesgue-almost every point x ∈ [0, 1], the arithmetic mean of the digits
occurring in the base r expansion of x is (r − 1)/2.
We leave this as an exercise.

§5.7.4 Continued fractions

We can prove similar results for the distribution of digits in the continued
fraction expansion of real numbers.

Proposition 5.11
For Lebesgue-almost every x ∈ [0, 1], the frequency with which the natural
number k occurs in the continued fraction expansion of x is
\[ \frac{1}{\log 2} \log \frac{(k + 1)^2}{k(k + 2)}. \]

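Proof. Let λ denote Lebesgue measure and let µ denote Gauss’ measure.
Then λ-a.e. and µ-a.e. x ∈ (0, 1) is irrational and has an infinite continued
fraction expansion
\[ x = \cfrac{1}{x_0 + \cfrac{1}{x_1 + \cfrac{1}{x_2 + \cfrac{1}{x_3 + \cdots}}}}. \]
Let T denote the continued fraction map. Then $x_n = [1/T^n x]$.

Fix k ∈ N. Then $x_n = k$ precisely when $[1/T^n x] = k$, i.e.
\[ k \leq \frac{1}{T^n x} < k + 1 \]
which is equivalent to requiring
\[ \frac{1}{k + 1} < T^n x \leq \frac{1}{k}. \]
Hence
\begin{align*}
\frac{1}{n} \operatorname{card}\{0 \leq j \leq n - 1 \mid x_j = k\}
&= \frac{1}{n} \sum_{j=0}^{n-1} \chi_{(1/(k+1),1/k]}(T^j x) \\
&\to \int \chi_{(1/(k+1),1/k]} \, d\mu \quad \text{for } \mu\text{-a.e. } x \\
&= \frac{1}{\log 2} \left( \log\left(1 + \frac{1}{k}\right) - \log\left(1 + \frac{1}{k + 1}\right) \right) \\
&= \frac{1}{\log 2} \log \frac{(k + 1)^2}{k(k + 2)}.
\end{align*}
As µ and λ are equivalent, this holds for Lebesgue almost every point. □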
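The limiting frequencies in Proposition 5.11 are easy to tabulate, and they can be compared with the digits of an explicit number. The sketch below computes the formula and, as a rough finite experiment, the continued fraction digits of a pseudo-random rational with a large denominator via the Euclidean algorithm. Using a rational as a stand-in for a typical point, and the names `gauss_kuzmin`, `cf_digits` and `empirical_frequency`, are assumptions made for illustration.

```python
import math
import random

def gauss_kuzmin(k):
    """Limiting frequency of the digit k from Proposition 5.11."""
    return math.log2((k + 1) ** 2 / (k * (k + 2)))

def cf_digits(p, q):
    """Continued fraction digits of the rational p/q in (0, 1),
    obtained by the Euclidean algorithm (integers only, hence exact)."""
    digits = []
    while p != 0:
        a, rem = divmod(q, p)   # 1/(p/q) = q/p = a + rem/p
        digits.append(a)
        p, q = rem, p           # continued fraction map: x -> 1/x - [1/x]
    return digits

def empirical_frequency(k, bits=4000, seed=1):
    """Frequency of digit k for a pseudo-random rational with a
    2**bits denominator (a finite stand-in for a 'typical' x)."""
    rng = random.Random(seed)
    q = 2 ** bits
    p = rng.randrange(1, q)
    digits = cf_digits(p, q)
    return digits.count(k) / len(digits)
```

For example, `cf_digits(5, 13)` returns `[2, 1, 1, 2]`, and `gauss_kuzmin(1)` is about 0.415, so roughly 41.5% of the digits of a typical number are equal to 1. The sum of `gauss_kuzmin(k)` over k = 1, …, K telescopes to $1 - \log_2\frac{K+2}{K+1}$, which tends to 1 as it must.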


Proposition 5.12
(i) For Lebesgue-almost every x ∈ [0, 1], the arithmetic mean of the digits
in the continued fraction expansion of x is infinite.

(ii) For Lebesgue-almost every x ∈ [0, 1], the geometric mean of the digits
in the continued fraction expansion of x is
\[ \prod_{k=1}^{\infty} \left(1 + \frac{1}{k^2 + 2k}\right)^{\log k / \log 2}. \]

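Proof. Writing
\[ x = \cfrac{1}{x_0 + \cfrac{1}{x_1 + \cfrac{1}{x_2 + \cfrac{1}{x_3 + \cdots}}}} \]
the proposition claims that
\[ \lim_{n \to \infty} \frac{1}{n} (x_0 + x_1 + \cdots + x_{n-1}) = \infty \tag{5.3} \]
almost everywhere, and
\[ \lim_{n \to \infty} (x_0 x_1 \cdots x_{n-1})^{1/n} = \prod_{k=1}^{\infty} \left(1 + \frac{1}{k^2 + 2k}\right)^{\log k / \log 2} \tag{5.4} \]
almost everywhere.
We leave (5.3) as an exercise.
We prove (5.4). Define f(x) = log k for x ∈ (1/(k + 1), 1/k]. Then
\begin{align*}
\frac{1}{n} (\log x_0 + \log x_1 + \cdots + \log x_{n-1})
&= \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x) \\
&\to \frac{1}{\log 2} \int_0^1 \frac{f(x)}{1 + x} \, dx \\
&= \frac{1}{\log 2} \sum_{k=1}^{\infty} \int_{1/(k+1)}^{1/k} \frac{\log k}{1 + x} \, dx \\
&= \sum_{k=1}^{\infty} \frac{\log k}{\log 2} \log\left(1 + \frac{1}{k^2 + 2k}\right),
\end{align*}
for Gauss-almost every, hence Lebesgue-almost every, point x ∈ [0, 1]. Taking
exponentials gives (5.4). □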
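The infinite product in (5.4) can be evaluated numerically by summing the series for its logarithm; the limit is the number usually called Khinchin’s constant, approximately 2.685. The sketch below truncates the series at a finite K, which is an assumption of the illustration: the terms decay only like $\log k / k^2$, so the truncated value slightly underestimates the limit. The function names are hypothetical.

```python
import math

def log_geometric_mean(K):
    """Partial sum over k = 1..K of (log k / log 2) * log(1 + 1/(k^2 + 2k)),
    i.e. the logarithm of the truncated product in Proposition 5.12(ii)."""
    total = 0.0
    for k in range(2, K + 1):          # the k = 1 term vanishes since log 1 = 0
        total += math.log(k) / math.log(2) * math.log1p(1.0 / (k * k + 2 * k))
    return total

def geometric_mean(K=200_000):
    """Truncated product; the neglected tail is of order log(K) / K."""
    return math.exp(log_geometric_mean(K))
```

Calling `geometric_mean()` with the default truncation should agree with the limiting value to about three decimal places, and increasing K moves the result upwards towards the limit.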

§5.8 Appendix: The proof of Birkhoff’s Ergodic Theorem

The proof is something of a tour de force of hard analysis. It is based on
the following inequality.


Theorem 5.13 (Maximal Inequality)

Let (X, B, µ) be a probability space, let T : X → X be a measure-preserving
transformation and let $f \in L^1(X, \mathcal{B}, \mu)$. Define $f_0 = 0$ and, for n ≥ 1,
\[ f_n = f + f \circ T + \cdots + f \circ T^{n-1}. \]
For n ≥ 1, set $F_n(x) = \max_{0 \leq j \leq n} f_j(x)$, so that $F_n(x) \geq 0$ because $f_0 = 0$. Then
\[ \int_{\{x \in X \mid F_n(x) > 0\}} f \, d\mu \geq 0. \]

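Proof. Clearly $F_n \in L^1(X, \mathcal{B}, \mu)$. For 0 ≤ j ≤ n, we have $F_n \geq f_j$, so
$F_n \circ T \geq f_j \circ T$. Hence
\[ F_n \circ T + f \geq f_j \circ T + f = f_{j+1} \]
and therefore
\[ F_n \circ T(x) + f(x) \geq \max_{1 \leq j \leq n} f_j(x). \]
If $F_n(x) > 0$ then
\[ \max_{1 \leq j \leq n} f_j(x) = \max_{0 \leq j \leq n} f_j(x) = F_n(x), \]
so we obtain that
\[ f \geq F_n - F_n \circ T \]
on the set $A = \{x \mid F_n(x) > 0\}$.
Hence
\begin{align*}
\int_A f \, d\mu &\geq \int_A F_n \, d\mu - \int_A F_n \circ T \, d\mu \\
&= \int_X F_n \, d\mu - \int_A F_n \circ T \, d\mu && \text{as } F_n = 0 \text{ on } X \setminus A \\
&\geq \int_X F_n \, d\mu - \int_X F_n \circ T \, d\mu && \text{as } F_n \circ T \geq 0 \\
&= 0 && \text{as } \mu \text{ is } T\text{-invariant.}
\end{align*}
□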
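The Maximal Inequality holds for every measure-preserving transformation, so it can be tested on a finite model. The sketch below assumes a permutation T of a finite set with the uniform measure (which is automatically measure-preserving) and an integer-valued f; the function names and the randomly generated test data are hypothetical.

```python
import random

def maximal_inequality_lhs(f, T, n):
    """Sum of f over {x : F_n(x) > 0} for a map T of the finite set
    {0, ..., len(f)-1} that preserves the uniform measure (a bijection).
    Dividing by len(f) gives the integral in the Maximal Inequality."""
    total = 0
    for x in range(len(f)):
        partial, best, y = 0, 0, x      # f_0 = 0, so F_n starts at 0
        for _ in range(n):
            partial += f[y]             # partial = f_j(x) after j additions
            best = max(best, partial)   # best = max over 0 <= j <= n of f_j(x)
            y = T[y]
        if best > 0:
            total += f[x]
    return total

def random_instance(size, seed):
    """Hypothetical test data: integer-valued f and a random permutation T."""
    rng = random.Random(seed)
    f = [rng.randint(-5, 5) for _ in range(size)]
    T = list(range(size))
    rng.shuffle(T)
    return f, T
```

For any instance produced by `random_instance`, the value `maximal_inequality_lhs(f, T, n)` is non-negative for every n ≥ 1, even when the total sum of f is negative.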

Corollary 5.14
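Let $g \in L^1(X, \mathcal{B}, \mu)$ and let
\[ M_\alpha = \Big\{ x \in X \;\Big|\; \sup_{n \geq 1} \frac{1}{n} \sum_{j=0}^{n-1} g(T^j x) > \alpha \Big\}. \]
Then for all $B \in \mathcal{B}$ with $T^{-1}B = B$ we have that
\[ \int_{M_\alpha \cap B} g \, d\mu \geq \alpha \mu(M_\alpha \cap B). \]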


Proof. Suppose first that B = X. Let f = g − α, then
\[
M_\alpha = \bigcup_{n=1}^{\infty} \Big\{ x \;\Big|\; \sum_{j=0}^{n-1} g(T^j x) > n\alpha \Big\}
= \bigcup_{n=1}^{\infty} \{x \mid f_n(x) > 0\}
= \bigcup_{n=1}^{\infty} \{x \mid F_n(x) > 0\}
\]
(since $f_n(x) > 0 \Rightarrow F_n(x) > 0$ and $F_n(x) > 0 \Rightarrow f_j(x) > 0$ for some 1 ≤ j ≤ n).
Write $C_n = \{x \mid F_n(x) > 0\}$ and observe that $C_n \subset C_{n+1}$. Thus $\chi_{C_n}$
converges to $\chi_{M_\alpha}$ and so $f \chi_{C_n}$ converges to $f \chi_{M_\alpha}$, as n → ∞. Furthermore,
$|f \chi_{C_n}| \leq |f|$. Hence, by the Dominated Convergence Theorem,
\[ \int_{C_n} f \, d\mu = \int_X f \chi_{C_n} \, d\mu \to \int_X f \chi_{M_\alpha} \, d\mu = \int_{M_\alpha} f \, d\mu, \quad \text{as } n \to \infty. \]
Applying the Maximal Inequality, we have, for all n ≥ 1, that
$\int_{C_n} f \, d\mu \geq 0$. Therefore $\int_{M_\alpha} f \, d\mu \geq 0$, i.e., $\int_{M_\alpha} g \, d\mu \geq \alpha \mu(M_\alpha)$.
For the general case, we work with the restriction of T to B, T : B → B,
and apply the Maximal Inequality on this subset to get
\[ \int_{M_\alpha \cap B} g \, d\mu \geq \alpha \mu(M_\alpha \cap B), \]
as required. □

We will also need the following convergence result.

Proposition 5.15 (Fatou’s Lemma)

Let (X, B, µ) be a probability space and suppose that $f_n : X \to \mathbb{R}$ are non-negative
measurable functions. Define $f(x) = \liminf_{n \to \infty} f_n(x)$. Then f is measurable
and
\[ \int f \, d\mu \leq \liminf_{n \to \infty} \int f_n \, d\mu \]
(one or both of these expressions may be infinite).

Proof of Birkhoff’s Ergodic Theorem. Let
\[
f^*(x) = \limsup_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x), \qquad
f_*(x) = \liminf_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x).
\]
These exist (but may be ±∞) at all points x ∈ X. Clearly
$f_*(x) \leq f^*(x)$.
Let
\[ a_n(x) = \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x). \]
Observe that
\[ \frac{n + 1}{n} a_{n+1}(x) = a_n(Tx) + \frac{1}{n} f(x). \]


As f is finite µ-a.e., we have that f(x)/n → 0 µ-a.e. as n → ∞. Hence,
taking the lim sup and lim inf as n → ∞, gives us that $f^* \circ T = f^*$ µ-a.e.
and $f_* \circ T = f_*$ µ-a.e.
We have to show

(i) $f^* = f_*$ µ-a.e.

(ii) $f^* \in L^1(X, \mathcal{B}, \mu)$

(iii) $\int f^* \, d\mu = \int f \, d\mu$.

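We prove (i). For α, β ∈ R, define
\[ E_{\alpha,\beta} = \{x \in X \mid f_*(x) < \beta \text{ and } f^*(x) > \alpha\}. \]
Note that
\[ \{x \in X \mid f_*(x) < f^*(x)\} = \bigcup_{\beta < \alpha, \ \alpha, \beta \in \mathbb{Q}} E_{\alpha,\beta} \]
(a countable union). Thus, to show that $f^* = f_*$ µ-a.e., it suffices to show
that $\mu(E_{\alpha,\beta}) = 0$ whenever β < α. Since $f_* \circ T = f_*$ and $f^* \circ T = f^*$, we
see that $T^{-1}E_{\alpha,\beta} = E_{\alpha,\beta}$. If we write
\[ M_\alpha = \Big\{ x \in X \;\Big|\; \sup_{n \geq 1} \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x) > \alpha \Big\} \]
then $E_{\alpha,\beta} \cap M_\alpha = E_{\alpha,\beta}$.

Applying Corollary 5.14 we have that
\[
\int_{E_{\alpha,\beta}} f \, d\mu = \int_{E_{\alpha,\beta} \cap M_\alpha} f \, d\mu
\geq \alpha \mu(E_{\alpha,\beta} \cap M_\alpha) = \alpha \mu(E_{\alpha,\beta}).
\]
Replacing f, α and β by −f, −β and −α and using the fact that $(-f)^* = -f_*$
and $(-f)_* = -f^*$, we also get
\[ \int_{E_{\alpha,\beta}} f \, d\mu \leq \beta \mu(E_{\alpha,\beta}). \]
Therefore
\[ \alpha \mu(E_{\alpha,\beta}) \leq \beta \mu(E_{\alpha,\beta}) \]
and since β < α this shows that $\mu(E_{\alpha,\beta}) = 0$. Thus $f^* = f_*$ µ-a.e. and
\[ \lim_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x) = f^*(x) \quad \mu\text{-a.e.} \]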


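We prove (ii). Let
\[ g_n(x) = \Big| \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x) \Big|. \]
Then $g_n \geq 0$ and
\[ \int g_n \, d\mu \leq \int |f| \, d\mu \]
so we can apply Fatou’s Lemma (Proposition 5.15) to conclude that $\lim_{n \to \infty} g_n = |f^*|$
is integrable, i.e., that $f^* \in L^1(X, \mathcal{B}, \mu)$.
We prove (iii). For n ∈ N and k ∈ Z, define
\[ D_k^n = \Big\{ x \in X \;\Big|\; \frac{k}{n} \leq f^*(x) < \frac{k + 1}{n} \Big\}. \]
For every ε > 0, we have that
\[ D_k^n \cap M_{\frac{k}{n} - \varepsilon} = D_k^n. \]
Since $T^{-1}D_k^n = D_k^n$, we can apply Corollary 5.14 again to obtain
\[ \int_{D_k^n} f \, d\mu \geq \Big( \frac{k}{n} - \varepsilon \Big) \mu(D_k^n). \]
Since ε > 0 is arbitrary, we have
\[ \int_{D_k^n} f \, d\mu \geq \frac{k}{n} \mu(D_k^n). \]
Thus
\[ \int_{D_k^n} f^* \, d\mu \leq \frac{k + 1}{n} \mu(D_k^n) \leq \frac{1}{n} \mu(D_k^n) + \int_{D_k^n} f \, d\mu \]
(where the first inequality follows from the definition of $D_k^n$). Since
\[ X = \bigcup_{k \in \mathbb{Z}} D_k^n \]
(a disjoint union), summing over k ∈ Z gives
\[ \int_X f^* \, d\mu \leq \frac{1}{n} \mu(X) + \int_X f \, d\mu = \frac{1}{n} + \int_X f \, d\mu. \]
Since this holds for all n ≥ 1, we obtain
\[ \int_X f^* \, d\mu \leq \int_X f \, d\mu. \]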


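Applying the same argument to −f gives
\[ \int (-f)^* \, d\mu \leq -\int f \, d\mu \]
so that, since $(-f)^* = -f_*$ and $f^* = f_*$ µ-a.e. by (i),
\[ \int f^* \, d\mu = \int f_* \, d\mu \geq \int f \, d\mu. \]
Therefore
\[ \int f^* \, d\mu = \int f \, d\mu, \]
as required.
Finally, we prove that $f^* = E(f \mid \mathcal{I})$. First note that as $f^*$ is T-invariant,
it is measurable with respect to $\mathcal{I}$. Moreover, if I is any T-invariant set then
\[ \int_I f \, d\mu = \int_I f^* \, d\mu. \]
Hence $f^* = E(f \mid \mathcal{I})$. □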

§5.9 References
Most of the material in this lecture is standard in ergodic theory. The presen-
tation of Ehrenfests’ example (originally an example in statistical mechanics)
is taken from

K. Petersen, Ergodic Theory, C.U.P., Cambridge, 1983.

Additional applications of the Ergodic Theorem to continued fractions can
be found in

I. P. Cornfeld, S. V. Fomin, and Ya. G. Sinai, Ergodic Theory, Springer, Berlin, 1982.

A. M. Rockett and P. Szusz, Continued Fractions, World Scientific, 1992.

§5.10 Exercises
Exercise 5.1
Construct an example to show that Poincaré’s recurrence theorem does not
hold on infinite measure spaces. (Recall that a measure space (X, B, µ) is
infinite if µ(X) = ∞.)

Exercise 5.2
Prove Proposition 5.10: For Lebesgue-almost every point x ∈ [0, 1], the
arithmetic mean of the digits occurring in the base r expansion of x is
(r − 1)/2.


Exercise 5.3
(i) Let T : X → X be an ergodic measure-preserving transformation of
a probability space (X, B, µ). Let f : X → R be measurable, and
suppose that $\int f \, d\mu = \infty$. Prove that
\[ \lim_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x) = \infty \]
for µ-almost every x.

(ii) Prove Proposition 5.12(i).

Exercise 5.4
Let B be a Banach space and U a bounded linear operator of B such that
$\sup_k \|U^k\| < \infty$. Prove that the following are equivalent:

(i) $\frac{1}{n} \sum_{j=0}^{n-1} U^j v$ converges in norm;

(ii) $\frac{1}{n} \sum_{j=0}^{n-1} U^j v$ has a limit point in the weak topology;

(iii) U has a fixed point in the weakly closed convex hull of $\{U^n v\}$ (i.e. the
smallest weakly closed convex set that contains all the $U^n v$).

Hence prove the $L^p$-Ergodic Theorem when T is an ergodic measure-preserving
transformation of a probability space (X, B, µ), namely that if
$f \in L^p(X, \mathcal{B}, \mu)$ then
\[ \frac{1}{n} \sum_{j=0}^{n-1} f \circ T^j \to \int f \, d\mu \]
in $L^p$.

280 pages
Spectral Theory and Differential Operators D.E. Edmunds 2024 scribd download
100% (10)
Spectral Theory and Differential Operators D.E. Edmunds 2024 scribd download
66 pages
Probability and Geometry On Groups Lecture Notes For A Graduate Course
No ratings yet
Probability and Geometry On Groups Lecture Notes For A Graduate Course
209 pages
Bok 3A978 3 319 29977 8
No ratings yet
Bok 3A978 3 319 29977 8
191 pages
Cvitanovic Et Al. Classical and Quantum Chaos Book (Web Version 9.2.3, 2002) (750s) - PNC
No ratings yet
Cvitanovic Et Al. Classical and Quantum Chaos Book (Web Version 9.2.3, 2002) (750s) - PNC
750 pages
The Analysis and Geometry of Hardy's Inequality
No ratings yet
The Analysis and Geometry of Hardy's Inequality
277 pages
Group Theory in Solid State Physics and Photonics: Problem Solving with Mathematica
From Everand
Group Theory in Solid State Physics and Photonics: Problem Solving with Mathematica
Wolfram Hergert
No ratings yet
Algebraic Topology I and II, Haynes Miller MIT 2021
100% (1)
Algebraic Topology I and II, Haynes Miller MIT 2021
307 pages
Computability and Incompleteness - Lecture Notes
100% (1)
Computability and Incompleteness - Lecture Notes
128 pages
Elements of - Category Theory
100% (1)
Elements of - Category Theory
606 pages
Hairer Geometric Numerical Integration
100% (3)
Hairer Geometric Numerical Integration
525 pages
Multiplicative Number Theory I-Montgomery
No ratings yet
Multiplicative Number Theory I-Montgomery
572 pages
(H.S. Carslaw) Introduction To The Theory of Fourier Series
No ratings yet
(H.S. Carslaw) Introduction To The Theory of Fourier Series
332 pages
Free Probability and Random Matrices
No ratings yet
Free Probability and Random Matrices
342 pages
Ergodic Theory Number Theory
No ratings yet
Ergodic Theory Number Theory
104 pages
Artin - Galois Title
No ratings yet
Artin - Galois Title
2 pages
Dimension Functions: Depth, Measuring Singularities: Pieter Belmans February 14, 2014
No ratings yet
Dimension Functions: Depth, Measuring Singularities: Pieter Belmans February 14, 2014
11 pages
Dales, Dashiell, Lau, . Strauss. Banach Spaces of Continuous Functions as Dual Spaces
No ratings yet
Dales, Dashiell, Lau, . Strauss. Banach Spaces of Continuous Functions as Dual Spaces
286 pages
(Translations of Mathematical Monographs) A. N. Andrianov and v. G. Zhuravlev - Modular Forms and Hecke Operators (1995, American Mathematical Society)
No ratings yet
(Translations of Mathematical Monographs) A. N. Andrianov and v. G. Zhuravlev - Modular Forms and Hecke Operators (1995, American Mathematical Society)
347 pages
Stochastic Calculus
No ratings yet
Stochastic Calculus
217 pages
Numerical Analysis Introduction
100% (1)
Numerical Analysis Introduction
24 pages
Lovasz Discrete and Continuous
No ratings yet
Lovasz Discrete and Continuous
23 pages
(Applied Mathematical Sciences 113) P. D. Hislop, I. M. Sigal (Auth.) - Introduction to Spectral Theory_ With Applications to Schrödinger Operators-Springer-Verlag New York (1996) (1)
No ratings yet
(Applied Mathematical Sciences 113) P. D. Hislop, I. M. Sigal (Auth.) - Introduction to Spectral Theory_ With Applications to Schrödinger Operators-Springer-Verlag New York (1996) (1)
351 pages
Wang F. Foundation of Probability Theory 2024
100% (2)
Wang F. Foundation of Probability Theory 2024
208 pages
Quant Session 1
No ratings yet
Quant Session 1
20 pages
MS-108 Linear Algebra CIS Updated-2
No ratings yet
MS-108 Linear Algebra CIS Updated-2
5 pages
Mathematics Form 3 - Chapter 1 Indices
No ratings yet
Mathematics Form 3 - Chapter 1 Indices
6 pages
Quadratic Equations: I. 2X Ii. Y
No ratings yet
Quadratic Equations: I. 2X Ii. Y
4 pages
Ece171b HW4
No ratings yet
Ece171b HW4
6 pages
EOT Exam Form 4
No ratings yet
EOT Exam Form 4
4 pages
Tensor 3
No ratings yet
Tensor 3
21 pages
matlab13
No ratings yet
matlab13
3 pages
Math 9 Q1 Week 1
No ratings yet
Math 9 Q1 Week 1
10 pages
4 Node Quad
No ratings yet
4 Node Quad
7 pages
Math9 Q1 Mod3 QuadraticEquation Version3
No ratings yet
Math9 Q1 Mod3 QuadraticEquation Version3
40 pages
Mount Litera Zee School, Roorkee: Periodic Assessment - I (2021-22)
No ratings yet
Mount Litera Zee School, Roorkee: Periodic Assessment - I (2021-22)
1 page
Pearson TVET Poster A3 Mathematics N4
No ratings yet
Pearson TVET Poster A3 Mathematics N4
1 page
Real (End1)
No ratings yet
Real (End1)
1 page
(Ebook) Foundations of Factor Analysis, Second Edition by Stanley A Mulaik ISBN 9781420099614, 1420099612 - The ebook in PDF/DOCX format is available for instant download
100% (2)
(Ebook) Foundations of Factor Analysis, Second Edition by Stanley A Mulaik ISBN 9781420099614, 1420099612 - The ebook in PDF/DOCX format is available for instant download
53 pages
M1 Lesson Plan
No ratings yet
M1 Lesson Plan
9 pages
Army Engineer Cartography I Map Mathematics
100% (2)
Army Engineer Cartography I Map Mathematics
183 pages
Ch3 Matrices
No ratings yet
Ch3 Matrices
33 pages
Higher Engineering Mathematics - B. S. Grewal Companion Text
80% (5)
Higher Engineering Mathematics - B. S. Grewal Companion Text
197 pages
Real Numbers PDF
100% (1)
Real Numbers PDF
4 pages
1867-Lista de Exercícios - Vetor Posição
No ratings yet
1867-Lista de Exercícios - Vetor Posição
24 pages
1-Solution of Linear Equationsjh
No ratings yet
1-Solution of Linear Equationsjh
28 pages
Topical Guidance of SPM Mathematics
No ratings yet
Topical Guidance of SPM Mathematics
39 pages
Three Node Triangle
No ratings yet
Three Node Triangle
19 pages
05 - Integration Trig PDF
No ratings yet
05 - Integration Trig PDF
2 pages
Radial Basis Function Interpolation: Wilna Du Toit
No ratings yet
Radial Basis Function Interpolation: Wilna Du Toit
58 pages
Quadratic Equation(Worksheet)
No ratings yet
Quadratic Equation(Worksheet)
2 pages
AI-ML Class-16
No ratings yet
AI-ML Class-16
12 pages
Determinants Part 1
No ratings yet
Determinants Part 1
25 pages
Numerical Analysis - MTH603 Handouts Lecture 3
No ratings yet
Numerical Analysis - MTH603 Handouts Lecture 3
7 pages

MAGIC010 Ergodic Theory Lecture 5

5. Recurrence and Ergodicity

§5.2 Poincaré’s Recurrence Theorem

Thus we have the estimate

Since µ(T^{-k}F) = µ(F) for all k ≥ 0 (because the measure is preserved), it suffices

Remark There are many ways of proving Poincaré’s Recurrence Theorem

§5.3 Ergodic Theorems

as n → ∞. The precise formulation of an ergodic theorem depends on

§5.4 Conditional expectation

Definition. Let µ be a measure on (X, B). We say that a measure ν

Remark Thus ν is absolutely continuous with respect to µ if sets of µ-

Theorem 5.2 (Radon-Nikodym)

Remark If ν ≪ µ then it is customary to write dν/dµ for the function

(i) If ν ≪ µ and f is a µ-integrable function then

Let A ⊂ B be a sub-σ-algebra. Note that µ defines a measure on A by

We call E(f | A) the conditional expectation of f with respect to the σ-

E(f | A) = E(f_+ | A) − E(f_− | A).

Thus we can view conditional expectation as an operator

E(· | A) : L^1(X, B, µ) → L^1(X, A, µ).

Note that E(f | A) is uniquely determined by the two requirements that

(i) E(f | A) is A-measurable, and

Intuitively, one can think of E(f | A) as the best approximation to f in the

It is straightforward to check that I is a σ-algebra. Note that if T is ergodic

§5.5 Von Neumann’s Mean Ergodic Theorem

Theorem 5.3 (Von Neumann’s Ergodic Theorem)

Proof. If T is ergodic then I is the trivial σ-algebra NR consisting of sets

In order to prove von Neumann’s Ergodic Theorem, it is useful to recast

Theorem 5.5 (von Neumann’s Ergodic Theorem for Operators)

in the norm induced on H by the inner product.

Proof of Theorem 5.5. First note that if v ∈ I then (5.2) holds, as

If v = U w − w for some w ∈ H then

If we let B denote the norm-closure of the subspace {U w − w | w ∈ H} then

for all v ∈ B, by approximation.

Remark Note that an isometry of a Hilbert space H is a linear operator

and that T induces a linear operator U : L^2 → L^2 by Uf = f ∘ T. As T is

and extend to B̂ by using the Kolmogorov extension theorem. Define a trans-

One can check that µ̂ is a T̂ -invariant measure if and only if µ is a T -

§5.6 Birkhoff’s Pointwise Ergodic Theorem

Theorem 5.6 (Birkhoff’s Ergodic Theorem)

§5.7 Applications of Birkhoff’s Ergodic Theorem

n_A(x) = inf{n ≥ 1 | T^n(x) ∈ A}

is defined for a.e. x ∈ A.
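
The first return time can be computed directly for a toy system. The sketch below is only an illustration under stated assumptions: the cyclic shift on Z_N with the uniform measure and the set A = {0, 1, 5} are hypothetical choices, not examples taken from these notes. For this system the average of the return time over A comes out to N/|A| = 1/µ(A), in line with the expected first recurrence time quoted in §5.1.

```python
from fractions import Fraction


def first_return_time(T, A, x, max_iter=10**6):
    """Return inf{n >= 1 : T^n(x) in A}, or None if no return within max_iter steps."""
    y = x
    for n in range(1, max_iter + 1):
        y = T(y)
        if y in A:
            return n
    return None


# Hypothetical toy system: the cyclic shift x -> x + 1 on Z_N.
# It preserves the uniform measure, which gives each point mass 1/N.
N = 12


def shift(x):
    return (x + 1) % N


A = {0, 1, 5}  # illustrative set with measure 3/12

# Return time of each point of A back to A.
times = {x: first_return_time(shift, A, x) for x in sorted(A)}

# Average return time over A, as an exact rational.
mean_return = Fraction(sum(times.values()), len(A))

print(times)        # {0: 1, 1: 4, 5: 7}
print(mean_return)  # 4, which equals N / |A| = 1 / mu(A)
```

The return times along one cycle of the shift add up to N, which is why the average over A is exactly N/|A| here.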

Theorem 5.8 (Kac’s Lemma)

Then A_n consists of those points in A that return to A after exactly n

Figure 5.1: The return times to A

point of X eventually enters A. Hence the diagram represents almost all

§5.7.2 Ehrenfests’ example

(this corresponds to a transition matrix A = (a_{ij}) where a_{i,i+1} = a_{i+1,i} = 1

§5.7.3 Normality of numbers

Definition. Fix r ≥ 2. A number x ∈ [0, 1] is said to be (simply) normal

For each r ≥ 2, define the map T_r : [0, 1] → [0, 1] by T_r(x) = rx mod 1.
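
The connection between this map and base-r expansions can be seen in a short computation: the j-th base-r digit of x is the integer part of r times the (j-1)-th iterate of x. The sketch below is a minimal illustration, assuming x is a rational in [0, 1) held as an exact fraction; the function names and the sample point x = 1/3 are hypothetical choices made for this example.

```python
import math
from collections import Counter
from fractions import Fraction


def T(r, x):
    """The map x -> r*x mod 1, computed exactly on rationals."""
    y = r * x
    return y - math.floor(y)


def base_r_digits(r, x, n):
    """First n base-r digits of x in [0, 1), read off the orbit of x."""
    digits = []
    for _ in range(n):
        digits.append(math.floor(r * x))  # current digit, a value in {0, ..., r-1}
        x = T(r, x)                       # shift the expansion one place left
    return digits


def digit_frequencies(r, x, n):
    """Relative frequency of each digit 0..r-1 among the first n digits."""
    counts = Counter(base_r_digits(r, x, n))
    return {k: Fraction(counts[k], n) for k in range(r)}


x = Fraction(1, 3)  # binary expansion 0.010101...
print(base_r_digits(2, x, 8))      # [0, 1, 0, 1, 0, 1, 0, 1]
print(digit_frequencies(2, x, 8))  # {0: Fraction(1, 2), 1: Fraction(1, 2)}
```

Exact rational arithmetic is used on purpose. A binary floating-point number is a dyadic rational, so iterating the doubling map on a float reaches 0 after a few dozen steps and the digit counts become meaningless.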

By Birkhoff’s Ergodic Theorem

Remark Given r ≥ 2 it is easy to construct a number that is simply

§5.7.4 Continued fractions

Let T denote the continued fraction map. Then x_n = [1/T^n x].
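
The digit formula above can be checked on rational inputs, where the expansion terminates. The sketch below is a minimal illustration under two assumptions: the continued fraction map is taken to be T(x) = 1/x mod 1 with T(0) = 0, and the digits are indexed from n = 0. The function names and sample points are hypothetical.

```python
import math
from fractions import Fraction


def gauss_map(x):
    """Continued fraction map T(x) = 1/x mod 1, with T(0) = 0 (assumed convention)."""
    if x == 0:
        return Fraction(0)
    y = 1 / x
    return y - math.floor(y)


def cf_digits(x, n_max):
    """Digits x_n = [1/T^n x] for n = 0, 1, ..., stopping once the orbit hits 0."""
    digits = []
    for _ in range(n_max):
        if x == 0:
            break  # a rational x has a finite expansion
        digits.append(math.floor(1 / x))
        x = gauss_map(x)
    return digits


print(cf_digits(Fraction(5, 8), 10))  # [1, 1, 1, 2]
print(cf_digits(Fraction(3, 7), 10))  # [2, 3]
```

For example, 3/7 = 1/(2 + 1/3), which matches the digits [2, 3].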

the proposition claims that

almost everywhere, and

for Gauss-almost every, hence Lebesgue-almost every, point x ∈ [0, 1]. □

§5.8 Appendix: The proof of Birkhoff’s Ergodic Theorem

Theorem 5.13 (Maximal Inequality)

Proof. Clearly F_n ∈ L^1(X, B, µ). For 0 ≤ j ≤ n, we have F_n ≥ f_j, so

If F_n(x) > 0 then

Then for all B ∈ B with T^{-1}B = B we have that

Proof. Suppose first that B = X. Let f = g − α, then

We will also need the following convergence result.

Proposition 5.15 (Fatou’s Lemma)

(one or both of these expressions may be infinite).

Proof of Birkhoff’s Ergodic Theorem. Let

These exist (but may be ±∞, respectively) at all points x ∈ X. Clearly

As f is finite µ-a.e., we have that f(x)/n → 0 µ-a.e. as n → ∞. Hence,

We prove (i). For α, β ∈ R, define

E_{α,β} = {x ∈ X | f_*(x) < β and f^*(x) > α}.

(a countable union). Thus, to show that f^* = f_* µ-a.e., it suffices to show

then E_{α,β} ∩ M_α = E_{α,β}.

Replacing f, α and β by −f, −β and −α and using the fact that (−f)∗ =

We prove (ii). Let
