
MAGIC010 Ergodic Theory Lecture 5

5. Recurrence and Ergodicity

§5.1 Introduction
The aim of this lecture is to state Birkhoff’s Ergodic Theorem and to study some of its applications in the context of ergodic transformations of a probability space. In particular we will apply it to the doubling map and to the continued fraction map and deduce some results of a number-theoretic nature. Birkhoff’s Ergodic Theorem can be viewed as giving us information about rates of recurrence. Indeed, one application that we shall study is Kac’s Lemma: given a set A of positive measure µ(A), the orbit of almost every point of A eventually returns to A and the expected time of the first recurrence is 1/µ(A). We begin by discussing Poincaré’s theorem: this gives us information about the recurrence properties of an arbitrary measure-preserving transformation.

§5.2 Poincaré’s Recurrence Theorem


Theorem 5.1 (Poincaré’s Recurrence Theorem)
Let T : X → X be a measure-preserving transformation of (X, B, µ) and let
A ∈ B have µ(A) > 0. Then for µ-a.e. x ∈ A, the orbit $\{T^n x\}_{n=0}^{\infty}$ returns to A infinitely often.
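Proof. Let
\[ E = \{x \in A \mid T^n x \in A \text{ for infinitely many } n\}; \]
then we have to show that $\mu(A \setminus E) = 0$.
If we write
\[ F = \{x \in A \mid T^n x \notin A \ \ \forall n \ge 1\} \]
then we have the identity
\[ A \setminus E = \bigcup_{k=0}^{\infty} (T^{-k} F \cap A). \]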

Thus we have the estimate
\[ \mu(A \setminus E) = \mu\Big(\bigcup_{k=0}^{\infty} (T^{-k} F \cap A)\Big) \le \mu\Big(\bigcup_{k=0}^{\infty} T^{-k} F\Big) \le \sum_{k=0}^{\infty} \mu(T^{-k} F). \]
Since $\mu(T^{-k} F) = \mu(F)$ for all $k \ge 0$ (because the measure is preserved), it suffices to show that $\mu(F) = 0$.
First suppose that $n > m$ and that $T^{-m} F \cap T^{-n} F \neq \emptyset$. If y lies in this intersection then $T^m y \in F$ and $T^{n-m}(T^m y) = T^n y \in F \subset A$, which contradicts the definition of F. Thus $T^{-m} F$ and $T^{-n} F$ are disjoint.
Since $\{T^{-k} F\}_{k=0}^{\infty}$ is a disjoint family, we have

\[ \sum_{k=0}^{\infty} \mu(T^{-k} F) = \mu\Big(\bigcup_{k=0}^{\infty} T^{-k} F\Big) \le \mu(X) = 1. \]
Since the terms in the summation have the constant value $\mu(F)$, we must have $\mu(F) = 0$. □

Remark There are many ways of proving Poincaré’s Recurrence Theorem


(see Petersen’s book). The proof above avoids the more relaxed attitude
towards sets of measure zero in the proof presented in the lecture (although
that proof is fully rigorous); it also makes more explicit use of the fact that
µ is a probability measure.

§5.3 Ergodic Theorems


An ergodic theorem is a result that describes the limiting behaviour of sequences of the form
\[ \frac{1}{n} \sum_{j=0}^{n-1} f \circ T^j \tag{5.1} \]
as $n \to \infty$. The precise formulation of an ergodic theorem depends on the class of function f (for example, one could assume that f is integrable, $L^2$, or continuous), and the notion of convergence used (for example, the convergence could be pointwise, $L^2$, or uniform). Here we discuss von Neumann’s (Mean) Ergodic Theorem and Birkhoff’s Ergodic Theorem. Von Neumann’s Ergodic Theorem is in the context of $f \in L^2$ and $L^2$-convergence of the ergodic averages (5.1); Birkhoff’s Ergodic Theorem is in the context of $f \in L^1$ and almost everywhere pointwise convergence of (5.1). Note that $L^2$ convergence neither implies nor is implied by almost everywhere pointwise convergence.
Before stating these theorems, we first need to discuss conditional expectation.

§5.4 Conditional expectation


We will need the concepts of Radon-Nikodym derivatives and conditional expectation.


Definition. Let µ be a measure on (X, B). We say that a measure ν is absolutely continuous with respect to µ and write $\nu \ll \mu$ if ν(B) = 0 whenever µ(B) = 0, B ∈ B.

Remark Thus ν is absolutely continuous with respect to µ if sets of µ-measure zero also have ν-measure zero (but there may be more sets of ν-measure zero). For example, let $f \in L^1(X, \mathcal{B}, \mu)$ be non-negative and define a measure ν by
\[ \nu(B) = \int_B f \, d\mu. \]
Then $\nu \ll \mu$.
The following theorem says that, essentially, all absolutely continuous measures occur in this way.

Theorem 5.2 (Radon-Nikodym)
Let (X, B, µ) be a probability space. Let ν be a measure defined on B and suppose that $\nu \ll \mu$. Then there is a non-negative measurable function f such that
\[ \nu(B) = \int_B f \, d\mu \quad \text{for all } B \in \mathcal{B}. \]
Moreover, f is unique in the sense that if g is a measurable function with the same property then f = g µ-a.e.

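Remark If $\nu \ll \mu$ then it is customary to write $d\nu/d\mu$ for the function given by the Radon-Nikodym theorem, that is
\[ \nu(B) = \int_B \frac{d\nu}{d\mu} \, d\mu. \]
The following relations are all easy to prove, and indicate why the notation was chosen in this way.

(i) If $\nu \ll \mu$ and f is a µ-integrable function then
\[ \int f \, d\nu = \int f \, \frac{d\nu}{d\mu} \, d\mu. \]

(ii) If $\nu_1, \nu_2 \ll \mu$ then
\[ \frac{d(\nu_1 + \nu_2)}{d\mu} = \frac{d\nu_1}{d\mu} + \frac{d\nu_2}{d\mu}. \]

(iii) If $\lambda \ll \nu \ll \mu$ then
\[ \frac{d\lambda}{d\mu} = \frac{d\lambda}{d\nu} \, \frac{d\nu}{d\mu}. \]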


Let $\mathcal{A} \subset \mathcal{B}$ be a sub-σ-algebra. Note that µ defines a measure on $\mathcal{A}$ by restriction. Let $f \in L^1(X, \mathcal{B}, \mu)$ be non-negative. Then we can define a measure ν on $\mathcal{A}$ by setting
\[ \nu(A) = \int_A f \, d\mu, \quad A \in \mathcal{A}. \]
Note that $\nu \ll \mu|_{\mathcal{A}}$. Hence by the Radon-Nikodym theorem, there is a unique $\mathcal{A}$-measurable function $E(f \mid \mathcal{A})$ such that
\[ \nu(A) = \int_A E(f \mid \mathcal{A}) \, d\mu. \]
We call $E(f \mid \mathcal{A})$ the conditional expectation of f with respect to the σ-algebra $\mathcal{A}$.
So far, we have only defined $E(f \mid \mathcal{A})$ for non-negative f. To define $E(f \mid \mathcal{A})$ for an arbitrary f, we split f into positive and negative parts $f = f_+ - f_-$ where $f_+, f_- \ge 0$ and define
\[ E(f \mid \mathcal{A}) = E(f_+ \mid \mathcal{A}) - E(f_- \mid \mathcal{A}). \]
Thus we can view conditional expectation as an operator
\[ E(\cdot \mid \mathcal{A}) : L^1(X, \mathcal{B}, \mu) \to L^1(X, \mathcal{A}, \mu). \]
Note that $E(f \mid \mathcal{A})$ is uniquely determined by the two requirements that

(i) $E(f \mid \mathcal{A})$ is $\mathcal{A}$-measurable, and

(ii) $\int_A f \, d\mu = \int_A E(f \mid \mathcal{A}) \, d\mu$ for all $A \in \mathcal{A}$.

Intuitively, one can think of $E(f \mid \mathcal{A})$ as the best approximation to f in the smaller space of all $\mathcal{A}$-measurable functions.
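As an illustration of the two defining requirements, here is a minimal Python sketch on a finite probability space. The space X = {0, ..., 5}, the uniform measure, the partition generating the sub-σ-algebra and the function f are all hypothetical choices made for this example; on such a space the conditional expectation is simply the average of f over each atom of the partition.

```python
from fractions import Fraction

# Hypothetical finite example: X = {0,...,5} with the uniform probability measure.
X = range(6)
mu = {x: Fraction(1, 6) for x in X}

# The sub-sigma-algebra is generated by this partition of X (its atoms).
atoms = [{0, 1}, {2, 3, 4}, {5}]

def f(x):
    return x * x  # an arbitrary integrable function

def cond_exp(f, atoms, mu):
    """Return E(f | A) as a dict x -> value; it is constant on each atom."""
    result = {}
    for atom in atoms:
        mass = sum(mu[x] for x in atom)
        average = sum(f(x) * mu[x] for x in atom) / mass
        for x in atom:
            result[x] = average
    return result

E = cond_exp(f, atoms, mu)

def integral(g, subset):
    return sum(g(x) * mu[x] for x in subset)

# Requirement (ii): the integrals of f and E(f | A) agree on every atom,
# and hence on every set of the sigma-algebra (a union of atoms).
for atom in atoms:
    assert integral(f, atom) == integral(lambda x: E[x], atom)
```

Requirement (i) holds by construction, since the returned function is constant on each atom.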
To state von Neumann’s and Birkhoff’s Ergodic Theorems precisely, we will need the sub-σ-algebra $\mathcal{I}$ of T-invariant subsets, namely:
\[ \mathcal{I} = \{B \in \mathcal{B} \mid T^{-1} B = B \text{ a.e.}\}. \]
It is straightforward to check that $\mathcal{I}$ is a σ-algebra. Note that if T is ergodic then $\mathcal{I}$ is the trivial σ-algebra consisting of all sets in B of measure 0 or 1.

§5.5 Von Neumann’s Mean Ergodic Theorem


Von Neumann’s Ergodic Theorem deals with the $L^2$-limiting behaviour of $\frac{1}{n} \sum_{j=0}^{n-1} f \circ T^j$ for $f \in L^2(X, \mathcal{B}, \mu)$.


Theorem 5.3 (Von Neumann’s Ergodic Theorem)
Let (X, B, µ) be a probability space and let T : X → X be a measure-preserving transformation. Let $\mathcal{I}$ denote the σ-algebra of T-invariant sets. Then for every $f \in L^2(X, \mathcal{B}, \mu)$, we have
\[ \frac{1}{n} \sum_{j=0}^{n-1} f \circ T^j \to E(f \mid \mathcal{I}) \]
in $L^2$.

Corollary 5.4
Let (X, B, µ) be a probability space and let T : X → X be an ergodic measure-preserving transformation. Let $f \in L^2(X, \mathcal{B}, \mu)$. Then
\[ \frac{1}{n} \sum_{j=0}^{n-1} f \circ T^j \to \int f \, d\mu, \quad \text{as } n \to \infty, \]
in $L^2$.

Proof. If T is ergodic then $\mathcal{I}$ is the trivial σ-algebra $\mathcal{N}$ consisting of sets of measure 0 and 1. If $f \in L^2(X, \mathcal{B}, \mu)$ then $E(f \mid \mathcal{N}) = \int f \, d\mu$. □

In order to prove von Neumann’s Ergodic Theorem, it is useful to recast


it in terms of spectral theory.

Theorem 5.5 (von Neumann’s Ergodic Theorem for Operators)


Let U be a unitary operator of a complex Hilbert space H. Let $I = \{v \in H \mid Uv = v\}$ be the subspace of U-invariant vectors and let $P_I : H \to I$ be orthogonal projection onto I. Then for all v ∈ H we have
\[ \frac{1}{n} \sum_{j=0}^{n-1} U^j v \to P_I v \tag{5.2} \]
in the norm induced on H by the inner product.

Proof of Theorem 5.5. First note that if v ∈ I then (5.2) holds, as
\[ \frac{1}{n} \sum_{j=0}^{n-1} U^j v = v = P_I v. \]
If v = Uw − w for some w ∈ H then
\[ \Big\| \frac{1}{n} \sum_{j=0}^{n-1} U^j v \Big\| = \frac{1}{n} \| U^n w - w \| \le \frac{1}{n} \, 2\|w\| \to 0. \]
If we let B denote the norm-closure of the subspace {Uw − w | w ∈ H} then it follows that
\[ \frac{1}{n} \sum_{j=0}^{n-1} U^j v \to 0 \]
for all v ∈ B, by approximation.


We claim that H = I ⊕ B, an orthogonal decomposition. Suppose that v ⊥ B. Then $\langle v, Uw - w \rangle = 0$ for all w ∈ H. Hence $\langle U^* v, w \rangle = \langle v, w \rangle$ for all w ∈ H. Hence $U^* v = v$. As U is unitary, we have that $U^* = U^{-1}$. Hence v = Uv, so that v ∈ I. Reversing each implication we see that v ∈ I implies v ⊥ B, and the claim follows. □
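The convergence in (5.2) can be observed numerically in the simplest possible setting. The following Python sketch assumes a hypothetical two-dimensional example, H = C^2 with the diagonal unitary operator U = diag(1, e^{iθ}) and θ = 1; these choices are illustrative and not taken from the lecture. Here I is spanned by (1, 0), and the second coordinate axis plays the role of B, so the averages of the second component decay like 2‖w‖/n, as in the proof.

```python
import cmath

# Hypothetical example on H = C^2: U = diag(1, e^{i*theta}) is unitary,
# its fixed space I is spanned by (1, 0), and P_I v = (v[0], 0).
theta = 1.0
U_diag = (1.0 + 0.0j, cmath.exp(1j * theta))

def ergodic_average(v, n):
    """Compute (1/n) * sum_{j=0}^{n-1} U^j v for the diagonal operator U."""
    total = [0j, 0j]
    power = [1 + 0j, 1 + 0j]              # diagonal entries of U^j, starting at j = 0
    for _ in range(n):
        total = [t + p * c for t, p, c in zip(total, power, v)]
        power = [p * u for p, u in zip(power, U_diag)]
    return [t / n for t in total]

v = (2.0 + 1.0j, 3.0 - 1.0j)
projection = (v[0], 0j)

def distance(n):
    """Norm of the difference between the n-th average and P_I v."""
    avg = ergodic_average(v, n)
    return sum(abs(a - b) ** 2 for a, b in zip(avg, projection)) ** 0.5

errors = {n: distance(n) for n in (10, 100, 1000, 10000)}
```

The entries of `errors` shrink roughly like 1/n, which is the rate given by the estimate for vectors of the form Uw − w.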

Remark Note that an isometry of a Hilbert space H is a linear operator U such that $\langle Uv, Uw \rangle = \langle v, w \rangle$ for all v, w ∈ H. We say that U is unitary if, in addition, it is invertible. Equivalently, U is unitary if the adjoint operator $U^*$ is the inverse of U: $U^* U = U U^* = \mathrm{id}$.
We can prove von Neumann’s Ergodic Theorem for an invertible measure-preserving transformation T of a probability space (X, B, µ) as follows. Recall that $L^2(X, \mathcal{B}, \mu)$ is a Hilbert space with respect to the inner product
\[ \langle f, g \rangle = \int f \bar{g} \, d\mu \]
and that T induces a linear operator $U : L^2 \to L^2$ by $Uf = f \circ T$. As T is measure-preserving, we have that U is an isometry; if T is invertible then U is unitary.
Hence, when T is invertible, Theorem 5.3 follows immediately from Theorem 5.5.
One can deduce from Theorem 5.5 that the result continues to hold
when U is an isometry and is not assumed to be invertible (this makes a
good exercise). Instead, we give an argument based around the construction
of the natural extension of the dynamical system T .
Let T be a measure-preserving transformation of the probability space (X, B, µ). Introduce a new space $\hat{X} = \{(x_j)_{j=0}^{\infty} \mid T(x_j) = x_{j-1} \text{ for all } j \ge 1\}$ (essentially, we are just extending the space X so that it contains all the possible pasts for each point x ∈ X). Equip $\hat{X}$ with the smallest σ-algebra $\hat{\mathcal{B}}$ which contains all sets of the form $\{(x_j)_{j=0}^{\infty} \mid x_0 \in B_0, \dots, x_r \in B_r\}$ for each $B_j \in \mathcal{B}$ and any r ∈ N. Define a measure $\hat{\mu}$ on such sets by
\[ \hat{\mu}\{(x_j)_{j=0}^{\infty} \mid x_0 \in B_0, \dots, x_r \in B_r\} = \mu(T^{-r} B_0 \cap T^{-(r-1)} B_1 \cap \cdots \cap B_r) \]
and extend to $\hat{\mathcal{B}}$ by using the Kolmogorov extension theorem. Define a transformation $\hat{T} : \hat{X} \to \hat{X}$ by $\hat{T}(x_0, x_1, x_2, \dots) = (Tx_0, Tx_1, Tx_2, \dots) = (Tx_0, x_0, x_1, \dots)$. Then $\hat{T}$ is invertible: $\hat{T}^{-1}(x_0, x_1, x_2, \dots) = (x_1, x_2, \dots)$.


One can check that $\hat{\mu}$ is a $\hat{T}$-invariant measure if and only if µ is a T-invariant measure. Indeed, $\hat{\mu}$ is ergodic for $\hat{T}$ if and only if µ is ergodic for T.
The transformation $\hat{T}$ is called the natural extension of T.
Let $f \in L^2(X, \mathcal{B}, \mu)$. Then we obtain a function $\hat{f} \in L^2(\hat{X}, \hat{\mathcal{B}}, \hat{\mu})$ by defining $\hat{f}(x_0, x_1, x_2, \dots) = f(x_0)$. Then one can easily see that
\[ \frac{1}{n} \sum_{j=0}^{n-1} \hat{f}(\hat{T}^j (x_0, x_1, x_2, \dots)) = \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x_0). \]
Hence von Neumann’s Ergodic Theorem for T follows from von Neumann’s Ergodic Theorem for $\hat{T}$.

§5.6 Birkhoff’s Pointwise Ergodic Theorem


Birkhoff’s Ergodic Theorem deals with the behaviour of $\frac{1}{n} \sum_{j=0}^{n-1} f(T^j x)$ for µ-a.e. x ∈ X, and for $f \in L^1(X, \mathcal{B}, \mu)$.

Theorem 5.6 (Birkhoff’s Ergodic Theorem)
Let (X, B, µ) be a probability space and let T : X → X be a measure-preserving transformation. Let $\mathcal{I}$ denote the σ-algebra of T-invariant sets. Then for every $f \in L^1(X, \mathcal{B}, \mu)$, we have
\[ \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x) \to E(f \mid \mathcal{I}) \]
for µ-a.e. x ∈ X.

Corollary 5.7
Let (X, B, µ) be a probability space and let T : X → X be an ergodic measure-preserving transformation. Let $f \in L^1(X, \mathcal{B}, \mu)$. Then
\[ \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x) \to \int f \, d\mu, \quad \text{as } n \to \infty, \]
for µ-a.e. x ∈ X.
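To see Corollary 5.7 at work numerically, the following Python sketch computes ergodic averages for a circle rotation x ↦ x + α mod 1, which is ergodic with respect to Lebesgue measure when α is irrational. The rotation, the value of α (the golden mean), the starting point and the observable (the indicator function of [0, 0.3)) are assumptions made for this illustration. A rotation is used rather than the doubling map because floating-point orbits of the doubling map collapse to 0 after about 53 iterations.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2   # rotation number (assumed example value)

def T(x):
    """Circle rotation x -> x + ALPHA mod 1."""
    return (x + ALPHA) % 1.0

def f(x):
    """Indicator function of the interval [0, 0.3); its integral is 0.3."""
    return 1.0 if x < 0.3 else 0.0

def birkhoff_average(f, T, x, n):
    """Return (1/n) * sum_{j=0}^{n-1} f(T^j x)."""
    total = 0.0
    for _ in range(n):
        total += f(x)
        x = T(x)
    return total / n

averages = {n: birkhoff_average(f, T, 0.1234, n) for n in (10, 1000, 100000)}
```

The averages approach 0.3, the Lebesgue measure of [0, 0.3), in line with the corollary.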

§5.7 Applications of Birkhoff’s Ergodic Theorem


§5.7.1 Kac’s Lemma
Poincaré’s Recurrence Theorem tells us that, under a measure-preserving
transformation, almost every point of a subset A of positive measure will
return to A. However, it does not tell us how long we should have to wait for
this to happen. One would expect that return times to sets of large measure


are small, and that return times to sets of small measure are large. This is
indeed the case, and forms the content of Kac’s Lemma.
Let T : X → X be a measure-preserving transformation of a probability
space (X, B, µ) and let A ⊂ X be a measurable subset with µ(A) > 0. By
Poincaré’s Recurrence Theorem, the integer

\[ n_A(x) = \inf\{n \ge 1 \mid T^n(x) \in A\} \]

is defined for a.e. x ∈ A.

Theorem 5.8 (Kac’s Lemma)


Let T be an ergodic measure-preserving transformation of the probability space (X, B, µ). Let A ∈ B be such that µ(A) > 0. Then
\[ \int_A n_A \, d\mu = 1. \]

Proof. Let
\[ A_n = A \cap T^{-1} A^c \cap \cdots \cap T^{-(n-1)} A^c \cap T^{-n} A. \]
Then $A_n$ consists of those points in A that return to A after exactly n iterations of T, i.e. $A_n = \{x \in A \mid n_A(x) = n\}$.
Consider the illustration in Figure 5.1.

[Figure 5.1: The return times to A. The base A is divided into the sets $A_1, A_2, A_3, \dots, A_n$, with arrows labelled T between the levels of the columns above them.]

As T is ergodic, almost every point of X eventually enters A. Hence the diagram represents almost all of X. Note that the column above $A_n$ in the diagram consists of n sets, $A_{n,0}, \dots, A_{n,n-1}$ say, with $A_{n,0} = A_n$. Note that $T^{-k} A_{n,k} = A_n$. As T is measure-preserving, it follows that $\mu(A_{n,k}) = \mu(A_n)$ for k = 0, ..., n − 1. Hence
\[ 1 = \mu(X) = \sum_{n=1}^{\infty} \sum_{k=0}^{n-1} \mu(A_{n,k}) = \sum_{n=1}^{\infty} n \mu(A_n) = \sum_{n=1}^{\infty} \int_{A_n} n_A \, d\mu = \int_A n_A \, d\mu. \]
□
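Kac’s Lemma can be checked numerically in a simple case. The sketch below assumes a hypothetical example that does not appear in the lecture: the circle rotation x ↦ x + α mod 1 with α the golden mean (ergodic for Lebesgue measure) and A = [0, 0.1). It approximates the integral of the first return time n_A over A by a midpoint Riemann sum, which should be close to 1; dividing by µ(A) gives the mean return time, close to 1/µ(A) = 10.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2   # assumed rotation number
A_LEN = 0.1                      # A = [0, 0.1), so mu(A) = 0.1

def T(x):
    """Circle rotation x -> x + ALPHA mod 1."""
    return (x + ALPHA) % 1.0

def return_time(x, max_steps=10_000):
    """First n >= 1 with T^n(x) in A = [0, A_LEN)."""
    for n in range(1, max_steps + 1):
        x = T(x)
        if x < A_LEN:
            return n
    raise RuntimeError("no return within max_steps")

# Midpoint Riemann sum for the integral of n_A over A.
N = 2000
h = A_LEN / N
integral = sum(return_time((i + 0.5) * h) * h for i in range(N))

# Average of n_A with respect to the normalised restriction of mu to A.
mean_return_time = integral / A_LEN
```

For this rotation the return time takes only a few distinct values on A, so the Riemann sum is accurate up to the grid spacing near the points where n_A jumps.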

§5.7.2 Ehrenfests’ example


The following example, due to P. and T. Ehrenfest, demonstrates that the return times in Poincaré’s Recurrence Theorem may be extremely large.
Consider two urns. One urn contains 100 balls, numbered 1 to 100, and
the other urn is empty. We also have a random number generator: this
could be a bag containing 100 slips of paper, numbered 1 to 100.
Each second, a slip of paper is drawn from the bag, the number is noted,
and the slip of paper is returned to the bag. The ball bearing that number
is then moved from whichever urn it is currently in to the other urn.
Naively, we would expect that the system will settle into an equilibrium
state in which there are 50 balls in each urn. Of course, there will continue
to be small random fluctuations about the 50-50 distribution. However,
it would appear highly unlikely for the system to return to the state in
which 100 balls are in the first urn. Nevertheless, the Poincaré Recurrence
Theorem tells us that this situation will occur almost surely (although we
will have to wait a long time for this to happen).
To see this, we represent the system as a subshift of the full shift on 101 symbols with an appropriate measure. Regard $x_j \in \{0, \dots, 100\}$ as being the number of balls in the first urn after j seconds. Hence a sequence $(x_j)_{j=0}^{\infty}$ records the number of balls in the first urn at each time. As the number of balls in the first urn increases or decreases by 1 each second, such sequences determine the shift of finite type
\[ \Sigma = \{(x_j)_{j=0}^{\infty} \mid x_j \in \{0, \dots, 100\},\ |x_j - x_{j+1}| = 1 \text{ for } j = 0, 1, 2, \dots\} \]
(this corresponds to a transition matrix $A = (a_{ij})$ where $a_{i,i+1} = a_{i+1,i} = 1$ for i = 0, ..., 99 and $a_{ij} = 0$ otherwise).
Let $p_i$ denote the probability of there being i balls in the first urn. This probability is independent of time, and is equal to
\[ p_i = \frac{1}{2^{100}} \binom{100}{i}. \]

If we have i balls in the first urn then at the next stage we must have either i − 1 or i + 1 balls in the first urn. The number of balls becomes i − 1 if the random number chosen is equal to the number of one of the balls in the first urn. As there are currently i such balls, the probability of this happening is i/100. Hence the conditional probability $P_{i,i-1}$ that there are i − 1 balls remaining given that we started with i balls in the first urn is i/100. Similarly, the conditional probability $P_{i,i+1}$ that there are i + 1 balls in the first urn given that we started with i balls is (100 − i)/100. This defines a stochastic matrix:
\[ P = \begin{pmatrix}
0 & 1 & 0 & 0 & 0 & \cdots \\
\frac{1}{100} & 0 & \frac{99}{100} & 0 & 0 & \cdots \\
0 & \frac{2}{100} & 0 & \frac{98}{100} & 0 & \cdots \\
0 & 0 & \frac{3}{100} & 0 & \frac{97}{100} & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}. \]

Note that $P_{i,j} \neq 0$ if and only if $a_{ij} = 1$ and so P and A are compatible. It is straightforward to check that pP = p. Hence we have a Markov probability measure $\mu_P$ defined on Σ. The matrix A is irreducible (but is not aperiodic); this ensures that $\mu_P$ is ergodic.
Consider the cylinder A = [100] of length 1. This represents there being 100 balls in the first urn. By Poincaré’s Recurrence Theorem, if we start in A then we return to A infinitely often. By Kac’s Lemma, the expected first return time to A is
\[ \frac{1}{\mu_P(A)} = 2^{100} \text{ seconds}, \]
which is about $4 \times 10^{22}$ years, or about $3 \times 10^{12}$ times the length of time that the Universe has so far existed!
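The claims pP = p and 1/µ_P(A) = 2^100 seconds can be verified exactly with rational arithmetic. The Python sketch below is one way to do this; the function name `P` and the choice of 365.25 days per year for the unit conversion are assumptions of the example.

```python
from fractions import Fraction
from math import comb

N = 100  # number of balls

# Stationary distribution p_i = C(N, i) / 2^N.
p = [Fraction(comb(N, i), 2 ** N) for i in range(N + 1)]

def P(i, j):
    """Transition probability from i balls to j balls in the first urn."""
    if j == i - 1:
        return Fraction(i, N)
    if j == i + 1:
        return Fraction(N - i, N)
    return Fraction(0)

# Check pP = p exactly; only the neighbours j - 1 and j + 1 contribute to entry j.
pP = [sum(p[i] * P(i, j) for i in (j - 1, j + 1) if 0 <= i <= N)
      for j in range(N + 1)]
is_stationary = (pP == p)

# Kac's Lemma: expected first return time to the state with all N balls
# in the first urn, measured in seconds (one step per second).
expected_return_seconds = 1 / p[N]
seconds_per_year = 365.25 * 24 * 3600          # assumed conversion factor
expected_return_years = float(expected_return_seconds) / seconds_per_year
```

With these conventions `expected_return_years` is roughly 4 × 10^22, matching the figure quoted above.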

Remark This suggests that returning to a small set is a ‘rare’ event, and as such the return times are likely to have a Poisson distribution. This is normally formalised as follows. Let T be an ergodic measure-preserving transformation of a probability space (X, B, µ) and let x ∈ X. Let $A_n$ be a decreasing sequence of subsets that decrease to x. Define the return time as $\tau(x) = \lim_{n \to \infty} \mu(A_n) \tau_{A_n}(x)$. Then $\int \tau \, d\mu = 1$. One would normally expect τ to have a Poisson distribution, but this is only known in particular cases; usually one needs some hyperbolicity of the dynamics T (such as being a shift of finite type). Additionally, there are normally restrictions on the sets $A_n$ (such as requiring them to be cylinders).


§5.7.3 Normality of numbers


Let r ≥ 2. Recall that any number x ∈ [0, 1] can be written as a base r ‘decimal’, i.e. there exist digits $x_j \in \{0, 1, \dots, r-1\}$ for which
\[ x = \sum_{j=1}^{\infty} \frac{x_j}{r^j}. \]
This r-adic expansion is unique, unless the sequence $(x_j)$ ends in either infinitely repeated 0s or infinitely repeated (r − 1)s.

Definition. Fix r ≥ 2. A number x ∈ [0, 1] is said to be (simply) normal (in base r) if it has a unique expansion as an r-adic expansion, and for each k = 0, 1, ..., r − 1, the frequency with which digit k occurs in its r-adic expansion is equal to 1/r.

For each r ≥ 2, define the map $T_r : [0, 1] \to [0, 1]$ by $T_r(x) = rx \bmod 1$. (The case r = 2 is the doubling map.) It is easy to see, by following the arguments for the doubling map, that Lebesgue measure µ on [0, 1] is an ergodic invariant measure for $T_r$.
The close connections between r-adic expansions and the map $T_r$ can be used to prove the following result. A number is said to be normal if it is simultaneously simply normal in every base r ≥ 2.

Proposition 5.9
Lebesgue almost every number in [0, 1] is normal.

Proof. Fix r ≥ 2. Then clearly all but a countable set of points has a unique r-adic expansion. Fix k ∈ {0, 1, ..., r − 1}. Then it is easy to see that $x_j = k$ if and only if $T_r^{j-1} x \in [k/r, (k+1)/r)$. Thus
\[ \frac{1}{n} \operatorname{card}\{1 \le j \le n \mid x_j = k\} = \frac{1}{n} \sum_{j=0}^{n-1} \chi_{[k/r,(k+1)/r)}(T_r^j x). \]
By Birkhoff’s Ergodic Theorem, for Lebesgue almost every point x the above expression converges to $\int \chi_{[k/r,(k+1)/r)}(x) \, dx = 1/r$. Let $N_r$ denote the set of points for which this holds for every k ∈ {0, 1, ..., r − 1}.
As $N_r$ has measure 1 for each r ≥ 2, it follows that $N = \bigcap_{r=2}^{\infty} N_r$ has measure 1. Hence Lebesgue almost every point is normal. □
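The correspondence between digits and the orbit under T_r, used in the proof, can be illustrated with exact integer arithmetic. The Python sketch below is a hypothetical experiment: it picks a pseudo-random dyadic rational x = num / 2^bits as a stand-in for a Lebesgue-typical point (the parameters `bits` and `seed` are assumptions), applies T_r exactly, and records the digit floor(r · T_r^{j-1} x) at each step. Floating-point numbers are avoided because they carry only about 53 binary digits.

```python
import random

def digit_frequencies(r, n, bits=20000, seed=0):
    """Frequencies of the first n base-r digits of a pseudo-random x in [0, 1).

    x = num / 2**bits is stored exactly through its integer numerator, and
    T_r(x) = r*x mod 1 is applied in exact integer arithmetic.
    """
    rng = random.Random(seed)
    num = rng.getrandbits(bits)
    mask = (1 << bits) - 1
    counts = [0] * r
    for _ in range(n):
        num *= r
        counts[num >> bits] += 1     # digit = floor(r * x)
        num &= mask                  # keep the fractional part, i.e. T_r(x)
    return [c / n for c in counts]

freqs = digit_frequencies(r=10, n=5000)
```

Each entry of `freqs` should be close to 1/10, with fluctuations of the size expected from 5000 samples; this illustrates the statement but of course proves nothing about any particular number.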

Remark Given r ≥ 2 it is easy to construct a number that is simply normal in base r. However, no simple explicit example is known of a number that is simultaneously normal in every base r ≥ 2; the known examples come from intricate constructions.
One can easily use Birkhoff’s Ergodic Theorem to prove the following
result.


Proposition 5.10
For Lebesgue-almost every point x ∈ [0, 1], the arithmetic mean of the digits
occurring in the base r expansion of x is (r − 1)/2.
We leave this as an exercise.

§5.7.4 Continued fractions


We can prove similar results for the distribution of digits in the continued
fraction expansion of real numbers.

Proposition 5.11
For Lebesgue-almost every x ∈ [0, 1], the frequency with which the natural number k occurs in the continued fraction expansion of x is
\[ \frac{1}{\log 2} \log \frac{(k+1)^2}{k(k+2)}. \]

Proof. Let λ denote Lebesgue measure and let µ denote Gauss’ measure. Then λ-a.e. and µ-a.e. x ∈ (0, 1) is irrational and has an infinite continued fraction expansion
\[ x = \cfrac{1}{x_0 + \cfrac{1}{x_1 + \cfrac{1}{x_2 + \cfrac{1}{x_3 + \cdots}}}}. \]
Let T denote the continued fraction map. Then $x_n = [1/T^n x]$.
Fix k ∈ N. Then $x_n = k$ precisely when $[1/T^n x] = k$, i.e.
\[ k \le \frac{1}{T^n x} < k + 1 \]
which is equivalent to requiring
\[ \frac{1}{k+1} < T^n x \le \frac{1}{k}. \]
Hence
\[ \begin{aligned}
\frac{1}{n} \operatorname{card}\{0 \le j \le n-1 \mid x_j = k\} &= \frac{1}{n} \sum_{j=0}^{n-1} \chi_{(1/(k+1),1/k]}(T^j x) \\
&\to \int \chi_{(1/(k+1),1/k]} \, d\mu \quad \text{for } \mu\text{-a.e. } x \\
&= \frac{1}{\log 2} \left( \log\left(1 + \frac{1}{k}\right) - \log\left(1 + \frac{1}{k+1}\right) \right) \\
&= \frac{1}{\log 2} \log \frac{(k+1)^2}{k(k+2)}.
\end{aligned} \]
As µ and λ are equivalent, this holds for Lebesgue almost every point. □
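The limiting frequencies in Proposition 5.11 can be compared with an experiment. The Python sketch below is a hypothetical illustration: it uses a pseudo-random rational p/q with a very large denominator as a proxy for a Lebesgue-typical point (a rational has only a finite expansion, so this is an approximation), and computes its continued fraction digits exactly with Euclid’s algorithm, which is the continued fraction map applied to rationals. The names `gauss_kuzmin` and `cf_digits` and the parameters are assumptions of the example.

```python
import math
import random

def gauss_kuzmin(k):
    """Limiting frequency of the digit k from Proposition 5.11."""
    return math.log((k + 1) ** 2 / (k * (k + 2))) / math.log(2)

def cf_digits(p, q):
    """Continued fraction digits of p/q in (0, 1) via Euclid's algorithm."""
    digits = []
    while p:
        a, rem = divmod(q, p)     # 1/x = a + rem/p, so the digit is a = [1/x]
        digits.append(a)
        q, p = p, rem             # the next point is T(x) = rem / p
    return digits

rng = random.Random(0)
bits = 20000
q = 1 << bits
p = rng.getrandbits(bits) | 1     # odd numerator, so 0 < p/q < 1
digits = cf_digits(p, q)

observed = {k: digits.count(k) / len(digits) for k in (1, 2, 3)}
expected = {k: gauss_kuzmin(k) for k in (1, 2, 3)}
```

The expected values are about 0.415, 0.170 and 0.093 for k = 1, 2, 3, and the observed frequencies should lie close to them. The formula also defines a probability distribution: the partial sums of `gauss_kuzmin(k)` telescope and tend to 1.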


Proposition 5.12
(i) For Lebesgue-almost every x ∈ [0, 1], the arithmetic mean of the digits in the continued fraction expansion of x is infinite.

(ii) For Lebesgue-almost every x ∈ [0, 1], the geometric mean of the digits in the continued fraction expansion of x is
\[ \prod_{k=1}^{\infty} \left(1 + \frac{1}{k^2 + 2k}\right)^{\log k / \log 2}. \]

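Proof. Writing
\[ x = \cfrac{1}{x_0 + \cfrac{1}{x_1 + \cfrac{1}{x_2 + \cfrac{1}{x_3 + \cdots}}}} \]
the proposition claims that
\[ \lim_{n \to \infty} \frac{1}{n} (x_0 + x_1 + \cdots + x_{n-1}) = \infty \tag{5.3} \]
almost everywhere, and
\[ \lim_{n \to \infty} (x_0 x_1 \cdots x_{n-1})^{1/n} = \prod_{k=1}^{\infty} \left(1 + \frac{1}{k^2 + 2k}\right)^{\log k / \log 2} \tag{5.4} \]
almost everywhere.
We leave (5.3) as an exercise.
We prove (5.4). Define f(x) = log k for x ∈ (1/(k + 1), 1/k], so that $f(T^j x) = \log x_j$. Then
\[ \begin{aligned}
\frac{1}{n} (\log x_0 + \log x_1 + \cdots + \log x_{n-1}) &= \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x) \\
&\to \frac{1}{\log 2} \int_0^1 \frac{f(x)}{1 + x} \, dx \\
&= \frac{1}{\log 2} \sum_{k=1}^{\infty} \int_{1/(k+1)}^{1/k} \frac{\log k}{1 + x} \, dx \\
&= \sum_{k=1}^{\infty} \frac{\log k}{\log 2} \log\left(1 + \frac{1}{k^2 + 2k}\right),
\end{aligned} \]
for Gauss-almost every, hence Lebesgue-almost every, point x ∈ [0, 1]. Exponentiating both sides gives (5.4). □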
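The infinite product in (5.4) converges slowly, but its value is easy to approximate by summing the series for its logarithm. The Python sketch below does this with a truncation at 200000 terms, a cut-off chosen for this example; the neglected tail of the series is of order (log K)/K for a truncation at K terms.

```python
import math

def log_geometric_mean(terms):
    """Partial sum of sum_k (log k / log 2) * log(1 + 1/(k^2 + 2k))."""
    return sum(math.log(k) / math.log(2) * math.log1p(1.0 / (k * k + 2 * k))
               for k in range(1, terms + 1))

# Exponentiate the truncated series to approximate the product in (5.4).
geometric_mean = math.exp(log_geometric_mean(200_000))
```

The result is approximately 2.685, so for almost every x the digits of the continued fraction expansion have a finite geometric mean even though their arithmetic mean is infinite.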

§5.8 Appendix: The proof of Birkhoff’s Ergodic Theorem


The proof is something of a tour de force of hard analysis. It is based on
the following inequality.


Theorem 5.13 (Maximal Inequality)
Let (X, B, µ) be a probability space, let T : X → X be a measure-preserving transformation and let $f \in L^1(X, \mathcal{B}, \mu)$. Define $f_0 = 0$ and, for n ≥ 1,
\[ f_n = f + f \circ T + \cdots + f \circ T^{n-1}. \]
For n ≥ 1, set $F_n(x) = \max_{0 \le j \le n} f_j(x)$, so that $F_n(x) \ge 0$. Then
\[ \int_{\{x \in X \mid F_n(x) > 0\}} f \, d\mu \ge 0. \]

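Proof. Clearly $F_n \in L^1(X, \mathcal{B}, \mu)$. For 0 ≤ j ≤ n, we have $F_n \ge f_j$, so $F_n \circ T \ge f_j \circ T$. Hence
\[ F_n \circ T + f \ge f_j \circ T + f = f_{j+1} \]
and therefore
\[ F_n \circ T(x) + f(x) \ge \max_{1 \le j \le n} f_j(x). \]
If $F_n(x) > 0$ then
\[ \max_{1 \le j \le n} f_j(x) = \max_{0 \le j \le n} f_j(x) = F_n(x), \]
so we obtain that
\[ f \ge F_n - F_n \circ T \]
on the set $A = \{x \mid F_n(x) > 0\}$.
Hence
\[ \begin{aligned}
\int_A f \, d\mu &\ge \int_A F_n \, d\mu - \int_A F_n \circ T \, d\mu \\
&= \int_X F_n \, d\mu - \int_A F_n \circ T \, d\mu && \text{as } F_n = 0 \text{ on } X \setminus A \\
&\ge \int_X F_n \, d\mu - \int_X F_n \circ T \, d\mu && \text{as } F_n \circ T \ge 0 \\
&= 0 && \text{as } \mu \text{ is } T\text{-invariant.}
\end{aligned} \]
□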
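The Maximal Inequality holds for every measure-preserving system, so it can be tested on a small finite one. The Python sketch below assumes a hypothetical example: X = {0, ..., N−1} with the uniform probability measure and the cyclic shift T(x) = x + 1 mod N, together with randomly chosen integer values of f. Integers are used so that the comparison F_n(x) > 0 is exact.

```python
import random

def maximal_inequality_integral(f_values, n):
    """Return the integral of f over {F_n > 0} for the cyclic shift.

    X = {0, ..., N-1} carries the uniform probability measure and
    T(x) = x + 1 mod N, which preserves that measure.
    """
    N = len(f_values)
    total = 0
    for x in range(N):
        partial, best = 0, 0                     # f_0 = 0, so F_n >= 0
        for j in range(n):
            partial += f_values[(x + j) % N]     # partial is now f_{j+1}(x)
            best = max(best, partial)
        if best > 0:                             # x lies in {F_n > 0}
            total += f_values[x]
    return total / N

rng = random.Random(1)
f_values = [rng.randint(-10, 5) for _ in range(50)]   # mostly negative f
integrals = [maximal_inequality_integral(f_values, n) for n in range(1, 60)]
```

Even though f is negative on average in this experiment, every entry of `integrals` is non-negative, as the theorem predicts.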

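Corollary 5.14
Let $g \in L^1(X, \mathcal{B}, \mu)$ and let
\[ M_\alpha = \Big\{ x \in X \ \Big|\ \sup_{n \ge 1} \frac{1}{n} \sum_{j=0}^{n-1} g(T^j x) > \alpha \Big\}. \]
Then for all B ∈ B with $T^{-1} B = B$ we have that
\[ \int_{M_\alpha \cap B} g \, d\mu \ge \alpha \mu(M_\alpha \cap B). \]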


Proof. Suppose first that B = X. Let f = g − α, then
\[ M_\alpha = \bigcup_{n=1}^{\infty} \Big\{ x \ \Big|\ \sum_{j=0}^{n-1} g(T^j x) > n\alpha \Big\} = \bigcup_{n=1}^{\infty} \{x \mid f_n(x) > 0\} = \bigcup_{n=1}^{\infty} \{x \mid F_n(x) > 0\} \]
(since $f_n(x) > 0 \Rightarrow F_n(x) > 0$ and $F_n(x) > 0 \Rightarrow f_j(x) > 0$ for some 1 ≤ j ≤ n). Write $C_n = \{x \mid F_n(x) > 0\}$ and observe that $C_n \subset C_{n+1}$. Thus $\chi_{C_n}$ converges to $\chi_{M_\alpha}$ and so $f \chi_{C_n}$ converges to $f \chi_{M_\alpha}$, as n → ∞. Furthermore, $|f \chi_{C_n}| \le |f|$. Hence, by the Dominated Convergence Theorem,
\[ \int_{C_n} f \, d\mu = \int_X f \chi_{C_n} \, d\mu \to \int_X f \chi_{M_\alpha} \, d\mu = \int_{M_\alpha} f \, d\mu, \quad \text{as } n \to \infty. \]
Applying the Maximal Inequality, we have that $\int_{C_n} f \, d\mu \ge 0$ for all n ≥ 1. Therefore $\int_{M_\alpha} f \, d\mu \ge 0$, i.e., $\int_{M_\alpha} g \, d\mu \ge \alpha \mu(M_\alpha)$.
For the general case, we work with the restriction of T to B, T : B → B, and apply the Maximal Inequality on this subset to get
\[ \int_{M_\alpha \cap B} g \, d\mu \ge \alpha \mu(M_\alpha \cap B), \]
as required. □

We will also need the following convergence result.

Proposition 5.15 (Fatou’s Lemma)
Let (X, B, µ) be a probability space and suppose that $f_n : X \to \mathbb{R}$ are non-negative measurable functions. Define $f(x) = \liminf_{n \to \infty} f_n(x)$. Then f is measurable and
\[ \int f \, d\mu \le \liminf_{n \to \infty} \int f_n \, d\mu \]
(one or both of these expressions may be infinite).

Proof of Birkhoff’s Ergodic Theorem. Let
\[ f^*(x) = \limsup_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x), \qquad f_*(x) = \liminf_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x). \]
These exist (but may be ±∞) at all points x ∈ X. Clearly $f_*(x) \le f^*(x)$.
Let
\[ a_n(x) = \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x). \]
Observe that
\[ \frac{n+1}{n} a_{n+1}(x) = a_n(Tx) + \frac{1}{n} f(x). \]
As f is finite µ-a.e., we have that f(x)/n → 0 µ-a.e. as n → ∞. Hence, taking the lim sup and lim inf as n → ∞, gives us that $f^* \circ T = f^*$ µ-a.e. and $f_* \circ T = f_*$ µ-a.e.
We have to show

(i) $f^* = f_*$ µ-a.e.

(ii) $f^* \in L^1(X, \mathcal{B}, \mu)$

(iii) $\int f^* \, d\mu = \int f \, d\mu$.

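We prove (i). For α, β ∈ R, define
\[ E_{\alpha,\beta} = \{x \in X \mid f_*(x) < \beta \text{ and } f^*(x) > \alpha\}. \]
Note that
\[ \{x \in X \mid f_*(x) < f^*(x)\} = \bigcup_{\beta < \alpha,\ \alpha, \beta \in \mathbb{Q}} E_{\alpha,\beta} \]
(a countable union). Thus, to show that $f^* = f_*$ µ-a.e., it suffices to show that $\mu(E_{\alpha,\beta}) = 0$ whenever β < α. Since $f_* \circ T = f_*$ and $f^* \circ T = f^*$, we see that $T^{-1} E_{\alpha,\beta} = E_{\alpha,\beta}$. If we write
\[ M_\alpha = \Big\{ x \in X \ \Big|\ \sup_{n \ge 1} \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x) > \alpha \Big\} \]
then $E_{\alpha,\beta} \cap M_\alpha = E_{\alpha,\beta}$.
Applying Corollary 5.14 we have that
\[ \int_{E_{\alpha,\beta}} f \, d\mu = \int_{E_{\alpha,\beta} \cap M_\alpha} f \, d\mu \ge \alpha \mu(E_{\alpha,\beta} \cap M_\alpha) = \alpha \mu(E_{\alpha,\beta}). \]
Replacing f, α and β by −f, −β and −α and using the fact that $(-f)^* = -f_*$ and $(-f)_* = -f^*$, we also get
\[ \int_{E_{\alpha,\beta}} f \, d\mu \le \beta \mu(E_{\alpha,\beta}). \]
Therefore
\[ \alpha \mu(E_{\alpha,\beta}) \le \beta \mu(E_{\alpha,\beta}) \]
and since β < α this shows that $\mu(E_{\alpha,\beta}) = 0$. Thus $f^* = f_*$ µ-a.e. and
\[ \lim_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x) = f^*(x) \quad \mu\text{-a.e.} \]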


We prove (ii). Let
\[ g_n(x) = \Big| \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x) \Big|. \]
Then $g_n \ge 0$ and
\[ \int g_n \, d\mu \le \int |f| \, d\mu \]
so we can apply Fatou’s Lemma (Proposition 5.15) to conclude that $\lim_{n \to \infty} g_n = |f^*|$ is integrable, i.e., that $f^* \in L^1(X, \mathcal{B}, \mu)$.
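We prove (iii). For n ∈ N and k ∈ Z, define
\[ D_k^n = \Big\{ x \in X \ \Big|\ \frac{k}{n} \le f^*(x) < \frac{k+1}{n} \Big\}. \]
For every ε > 0, we have that
\[ D_k^n \cap M_{\frac{k}{n} - \varepsilon} = D_k^n. \]
Since $T^{-1} D_k^n = D_k^n$, we can apply Corollary 5.14 again to obtain
\[ \int_{D_k^n} f \, d\mu \ge \Big( \frac{k}{n} - \varepsilon \Big) \mu(D_k^n). \]
Since ε > 0 is arbitrary, we have
\[ \int_{D_k^n} f \, d\mu \ge \frac{k}{n} \mu(D_k^n). \]
Thus
\[ \int_{D_k^n} f^* \, d\mu \le \frac{k+1}{n} \mu(D_k^n) \le \frac{1}{n} \mu(D_k^n) + \int_{D_k^n} f \, d\mu \]
(where the first inequality follows from the definition of $D_k^n$). Since
\[ X = \bigcup_{k \in \mathbb{Z}} D_k^n \]
(a disjoint union), summing over k ∈ Z gives
\[ \int_X f^* \, d\mu \le \frac{1}{n} \mu(X) + \int_X f \, d\mu = \frac{1}{n} + \int_X f \, d\mu. \]
Since this holds for all n ≥ 1, we obtain
\[ \int_X f^* \, d\mu \le \int_X f \, d\mu. \]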


Applying the same argument to −f gives
\[ \int (-f)^* \, d\mu \le \int -f \, d\mu \]
so that
\[ \int f^* \, d\mu = \int f_* \, d\mu \ge \int f \, d\mu. \]
Therefore
\[ \int f^* \, d\mu = \int f \, d\mu, \]
as required.
Finally, we prove that $f^* = E(f \mid \mathcal{I})$. First note that as $f^*$ is T-invariant, it is measurable with respect to $\mathcal{I}$. Moreover, if I is any T-invariant set then
\[ \int_I f \, d\mu = \int_I f^* \, d\mu. \]
Hence $f^* = E(f \mid \mathcal{I})$. □

§5.9 References
Most of the material in this lecture is standard in ergodic theory. The presentation of Ehrenfests’ example (originally an example in statistical mechanics) is taken from

K. Petersen, Ergodic Theory, C.U.P., Cambridge, 1983.

Additional applications of the Ergodic Theorem to continued fractions can be found in

I. P. Cornfeld, S. V. Fomin, and Ya. G. Sinai, Ergodic Theory, Springer, Berlin, 1982.
A. M. Rockett and P. Szusz, Continued Fractions, World Scientific, 1992.

§5.10 Exercises
Exercise 5.1
Construct an example to show that Poincaré’s recurrence theorem does not
hold on infinite measure spaces. (Recall that a measure space (X, B, µ) is
infinite if µ(X) = ∞.)

Exercise 5.2
Prove Proposition 5.10: For Lebesgue-almost every point x ∈ [0, 1], the
arithmetic mean of the digits occurring in the base r expansion of x is
(r − 1)/2.


Exercise 5.3
(i) Let T : X → X be an ergodic measure-preserving transformation of a probability space (X, B, µ). Let $f : X \to \mathbb{R}$ be measurable, and suppose that $\int f \, d\mu = \infty$. Prove that
\[ \lim_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x) = \infty \]
for µ-almost every x.

(ii) Prove Proposition 5.12(i).

Exercise 5.4
Let B be a Banach space and U a bounded linear operator of B such that $\sup_k \|U^k\| < \infty$. Prove that the following are equivalent:

(i) $\frac{1}{n} \sum_{j=0}^{n-1} U^j v$ converges in norm;

(ii) $\frac{1}{n} \sum_{j=0}^{n-1} U^j v$ has a limit point in the weak topology;

(iii) U has a fixed point in the weakly closed convex hull of $\{U^n v\}$ (i.e. the smallest weakly closed convex set that contains all the $U^n v$).

Hence prove the $L^p$-Ergodic Theorem when T is an ergodic measure-preserving transformation of a probability space (X, B, µ), namely that if $f \in L^p(X, \mathcal{B}, \mu)$ then
\[ \frac{1}{n} \sum_{j=0}^{n-1} f \circ T^j \to \int f \, d\mu \]
in $L^p$.
