Copyright © University of Bristol 2010 & 2014. This material is copyright of the University.
It is provided exclusively for educational purposes at the University
and is to be downloaded or copied for your private study only.
Chapter 2
Topological Dynamics
and Symbolic Dynamics
Examples of metric spaces and distances are the following. The first three are classical examples, while
the following two are useful in dynamical systems.
1. X = R with the distance d(x, y) = |x − y|.
2. X = R2 or X = [0, 1] × [0, 1] with the Euclidean distance: if x = (x1 , x2 ) and y = (y1 , y2 ) are points in R2 , their distance is
d(x, y) = √( (x1 − y1)^2 + (x2 − y2)^2 ).
5. Let (X, d) be any metric space and f : X → X. Then for each n ∈ N+ we can define a new distance, dn , given by
dn (x, y) = max_(k=0,...,n−1) d(f^k (x), f^k (y)).
Two points x, y are close in the dn metric if their orbits up to time n stay close. We will use this distance to define topological entropy in §2.4.
Exercise 2.1.1. Check that the distances in the previous Examples satisfy the properties of distance in Definition 2.1.1. For each, describe the ball of radius ε at a point.
In a metric space one can talk about convergence and continuity as in Rn . Let (X, d) be a metric space. Given x ∈ X and ε > 0, let Bd (x, ε) be the ball of radius ε around the point x defined using the distance d, that is
Bd (x, ε) = {y ∈ X such that d(x, y) < ε}.
If there is no ambiguity about the distance, we will often write simply B(x, ε), dropping the subscript d. We can use balls to define convergence:
Definition 2.1.2. A sequence (xn )n∈N ⊂ X converges to x, and we write limn→∞ xn = x, if for any ε > 0 there exists N > 0 such that xn ∈ Bd (x, ε) for all n ≥ N .
We can use the distance to define the notion of open and closed sets.
Definition 2.1.3. A set U ⊂ X of a metric space (X, d) is open if for any x ∈ U there exists an ε > 0 such that
Bd (x, ε) ⊂ U.
A set C ⊂ X is closed if its complement X\C is open.
Example 2.1.2. If X = R and d(x, y) = |x − y|, the intervals (a, b) are open sets and the intervals [a, b] are closed sets. Also intervals of the form (a, ∞) or (−∞, b) are open and intervals of the form [a, ∞) or (−∞, b] are closed. Intervals of the form [a, b) or (a, b] are neither open nor closed.
Exercise 2.1.2. Prove that a ball Bd (x, ε) is open (use the triangle inequality).
Open and closed sets in a metric space enjoy the following properties:1
Lemma 2.1.1. (1) Countable unions of open sets are open: if U1 , U2 , . . . , Un , . . . are open sets, then ∪k∈N Uk is an open set;
(2) Finite intersections of open sets are open: if U1 , U2 , . . . , UN are open sets, then ∩_(k=1)^N Uk is an open set.
Exercise 2.1.3. Prove the lemma using the Definition 2.1.3 above.
Exercise 2.1.4. Give an example in X = R of a countable collection of open sets whose intersection is not open.
By using De Morgan’s Laws, it follows that closed sets have the following properties (note that the role of
intersections and unions is reversed):
Corollary 2.1.1. (1) Countable intersections of closed sets are closed: if C1 , C2 , . . . , Cn , . . . are closed sets, then ∩k∈N Ck is a closed set;
(2) Finite unions of closed sets are closed: if C1 , C2 , . . . , CN are closed sets, then ∪_(k=1)^N Ck is a closed set.
1 Property (1) in the Lemma is taken as an axiom: a collection U of subsets of X such that ∅, X ∈ U and Property (1) is satisfied is called a topology. In this case, sets in U are called open sets and complements of sets in U are called closed sets. A topological space (X, U ) is a space X with a topology U , see Extra.
Definition 2.1.4. A subset Y ⊂ X is dense if for any non-empty open set U ⊂ X there is a point y ∈ Y such
that y ∈ U .
One can check that this definition of dense set reduces to the usual definition of dense set for a subset Y ⊂ R, that is: for each x ∈ R and ε > 0 there exists y ∈ Y such that |x − y| < ε.
Definition 2.1.5. A metric space (X, d) is called separable if it contains a countable dense subset.
Example 2.1.3. If X = Rn with the Euclidean distance, X is separable since the set Qn given by all points
(x1 , . . . , xn ) ∈ Rn whose coordinates xi are rational numbers is dense and it is countable.
Let (X, dX ) and (Y, dY ) be metric spaces. We will now consider properties of functions f : X → Y .
Definition 2.1.6. A function f : X → Y is an isometry if it preserves the distances, that is
dY (f (x), f (y)) = dX (x, y) for all x, y ∈ X.
The last metric space notion that we will use is the notion of compact sets. Let (X, d) be a metric space.
2 The following Lemma can be taken as definition of a continuous function when (X, U ) is a topological space, see Extra.
Definition 2.1.8. [Sequentially compact] A subset K ⊂ X is (sequentially) compact if for any sequence (xn )n∈N ⊂ K there exists a convergent subsequence (xnk )k∈N whose limit limk→∞ xnk = x belongs to K.
This property is called sequential compactness since the definition involves sequences. There are other notions of compactness (see compactness by covers below) which are equivalent in a metric space, so we will simply say that a set is compact and use the term sequentially compact only when we specifically want to use the above property of compact sets.
Example 2.1.5. Closed bounded intervals [a, b] ⊂ R are sequentially compact (this is known as the Heine-Borel theorem).
Conversely, in R, if a set is not bounded or not closed, it is not compact. The following two are non-examples,
that is examples of spaces that are not compact.
Example 2.1.6. The unbounded closed interval [0, ∞) is not sequentially compact: consider for example the
sequence (xn )n∈N given by xn = n. The sequence has no convergent subsequence.
The open interval (0, 1) is not sequentially compact: consider for example the sequence (xn )n∈N given by
xn = 1/n. We have limn→∞ xn = 0, but 0 ∉ (0, 1).
In addition to the definition of sequential compactness, there is another definition of compactness, compactness by open covers, which turns out to be equivalent in a metric space. Compactness by open covers is a more general definition of compactness and can be used as a definition of compactness in any topological space (see the Extra on topological spaces if interested).
Definition 2.1.9. An open cover of K ⊂ X is a collection {Uα }α of open sets of X such that
K ⊂ ∪α Uα
(this is why we say that they cover K). A finite subcover is a finite subset {Uα1 , Uα2 , . . . , UαN } ⊂ {Uα }α which still covers, that is such that K ⊂ ∪_(i=1)^N Uαi .
Definition 2.1.10. [Compact by covers] A subset K ⊂ X is compact by covers if for any open cover {Uα }α there exists a finite subcover {Uα1 , Uα2 , . . . , UαN } ⊂ {Uα }α , that is such that K ⊂ ∪_(i=1)^N Uαi .
Example 2.1.7. The open interval (0, 1) is not compact by covers: consider for example the collection
U = { (1/(n + 2), 1/n), n ∈ N }.
It is an open cover, but U does not admit a finite subcover. Indeed, a finite subset of intervals in U is of the form
{ (1/(n1 + 2), 1/n1 ), (1/(n2 + 2), 1/n2 ), . . . , (1/(nk + 2), 1/nk ) },
so that if n = max_(i=1,...,k) ni , no point in (0, 1/(n + 2)] is covered by the finite collection.
Theorem 2.1.1. In a metric space (X, d), a subset K ⊂ X is sequentially compact if and only if it is compact
by covers.
Since we will work only with metric spaces, we will simply say that a set is compact and use equivalently either Definition 2.1.8 or 2.1.10.
Remark 2.1.1. In Rn , any subset C ⊂ Rn which is closed and bounded, that is such that supx,y∈C d(x, y) < +∞, is compact.
Example 2.2.1. For example, the doubling map is topologically transitive, since we constructed a dense orbit
(see Theorem 1.3.2 in §1.3.). Another example is given by the rotation Rα by an irrational number α: as we
proved in Theorem 1.2.1 in §1.2, for example the orbit of x = 0 is dense.
Definition 2.2.3. A topological dynamical system is called minimal if all orbits are dense, that is, for all x ∈ X the set O_f^+(x) is dense.
Example 2.2.2. For example, the rotation Rα by an irrational number α is minimal, as we proved in
Theorem 1.2.1 in § 1.2. On the other hand, the doubling map is not minimal, since there are periodic points
which lead to non-dense orbits.
Remark 2.2.1. Minimality implies topological transitivity, since if all orbits are dense, there is in particular one dense orbit. On the other hand, we have just seen that the converse is not true, since there are systems that are topologically transitive but not minimal, such as the doubling map.
A useful alternative characterisation of topological transitivity is the following. We say a point x ∈ X is isolated if the singleton {x} is an open set in X, or, equivalently, if there is ε > 0 such that Bd (x, ε) = {x}.
Proposition 1. Let X be compact. A topological dynamical system f : X → X is topologically transitive if
for each pair U, V of non-empty open sets there exists n ∈ N such that
f^n (U ) ∩ V ≠ ∅.
The reverse implication holds under the additional assumption that X has no isolated points.
The reverse implication requires the following lemma:
3 More generally, it is enough to use a topological space, for which the notion of continuous map and of homeomorphism is
well defined.
4 If f is invertible and f −1 is continuous (that is, f is a homeomorphism), one can require that there exists x0 ∈ X such that the full orbit Of (x0 ) is dense. In some books, this stronger definition of topologically transitive is used for homeomorphisms. Definition 2.2.2 is sometimes referred to as forward topologically transitive. If there exists x0 ∈ X such that the full orbit Of (x0 ) is dense, one can prove that there exists x such that O_f^+(x) is dense, but x could be different from x0 .
Lemma 2.2.1. Assume that X has no isolated points. If O_f^+(x0 ) is dense, then for any n ∈ N the orbit O_f^+(f^n (x0 )) is also dense.
Proof of Lemma. Since X has no isolated points, every open non-empty set U ⊂ X contains infinitely many distinct open subsets Uk . [For x ∈ U and K sufficiently large, take for instance Uk = B(x, 1/k) \ B̄(x, 1/(k + 1)), k ≥ K, where B̄(x, ε) = {y ∈ X : d(x, y) ≤ ε} denotes the closed ball at x.] This implies that, since O_f^+(x0 ) is dense, there are infinitely many integers mk such that f^mk (x0 ) ∈ Uk ⊂ U . Given n ∈ N, choose k such that mk ≥ n. Then U ∋ f^mk (x0 ) = f^(mk −n) (f^n (x0 )) and thus there is an integer m (namely m = mk − n) such that f^m (f^n (x0 )) ∈ U .
Proof of Proposition 1. We first prove the reverse implication. Assume that f is topologically transitive and X has no isolated points. Let x0 be such that O_f^+(x0 ) is dense. Given U, V open sets, by density there exists n such that f^n (x0 ) ∈ U . Since by Lemma 2.2.1 also O_f^+(f^n (x0 )) is dense, there exists m such that f^m (f^n (x0 )) ∈ V . So
f^(m+n) (x0 ) ∈ f^m (U ) ∩ V,
which shows that f^m (U ) ∩ V ≠ ∅.
[For Level M:] Let us now prove the first implication. Assume that for each pair U, V of non-empty open sets there exists n ∈ N such that f^n (U ) ∩ V ≠ ∅. Since X is compact, one can prove that X has a countable dense subset, see the Exercise in §2.1 (for example, in the unit square [0, 1]2 , which is a compact set in R2 , the set Q2 ∩ [0, 1]2 of points whose coordinates are rational is a dense subset and is countable).
Let {xn }n∈N be the points of this countable dense subset. To show that the orbit of a point x ∈ X is dense, it is enough to show that for each k ∈ N and n ∈ N there exists an m such that f^m (x) ∈ B(xn , 1/k). This is because any open set U contains a point xn for some n (by density) and hence, since it is open, it contains the ball B(xn , 1/k) for some k ∈ N, so if f^m (x) ∈ B(xn , 1/k) ⊂ U , then O_f^+(x) ∩ U ≠ ∅.
Since the balls B(xn , 1/k) for n ∈ N and k ∈ N are countable, we can relabel them and enumerate them as U1 , U2 , . . . , Un , . . ..
Let us now use transitivity to construct an orbit which visits all these balls in the order in which we listed them5 . Let B0 = B(x, ε) be any ball and let B̄0 be the closed ball B̄0 = {y : d(x, y) ≤ ε}. By assumption, there exists N1 such that f^N1 (B0 ) ∩ U1 ≠ ∅. Thus, we can pick a ball inside the non-empty open set B0 ∩ f^(−N1) (U1 ). Up to reducing the radius, we can assume that there is a smaller ball, that we call B1 , such that the closed ball
B̄1 ⊂ B0 ∩ f^(−N1) (U1 ).
By assumption there exists N2 such that f^N2 (B1 ) ∩ U2 ≠ ∅. Thus, as before we can find a smaller ball B2 such that
B̄2 ⊂ B1 ∩ f^(−N2) (U2 ).
Repeating by induction, we can construct a sequence of balls Bn such that
B̄n ⊂ Bn−1 ∩ f^(−Nn) (Un ) for every n ≥ 1.      (2.1)
Since the closed balls are nested, that is B̄n+1 ⊂ Bn , and we are in a compact space, their intersection is non-empty6 . If x ∈ ∩n B̄n , then f^Nn (x) ∈ Un for any n by (2.1), thus the orbit of x is dense.
A dynamical property stronger than topological transitivity is the following, which is the first mathematical
definition of the intuitive idea of mixing (we will see in Chapter 4 another definition of mixing in the context
of ergodic theory).
Definition 2.2.4. A topological dynamical system f : X → X is called topologically mixing if for any pair
U, V of non-empty open sets there exists N ∈ N such that for all n ≥ N we have f^n (U ) ∩ V ≠ ∅.
5 Compare this with the proof that we gave that the doubling map has a dense orbit. For the doubling map, we used the collection of binary intervals of the form I(a0 , . . . , an ), which is countable and plays the same role as the Ui in this proof, and we then constructed an orbit which visits them all.
6 This is a consequence of compactness that you might have seen in Metric Spaces: in a compact set, a countable collection of nested non-empty closed sets has non-empty intersection.
Topological mixing conveys the idea that each set U , after iterations of f , becomes spread everywhere: for each V , for all n sufficiently large, f^n (U ) intersects V .
If f is topologically mixing, in particular it is topologically transitive. This follows from the characterization of topological transitivity in Proposition 1: if f^n (U ) ∩ V ≠ ∅ for all n ≥ N , in particular there is an n such that f^n (U ) ∩ V ≠ ∅. Topological mixing, though, requires that the sets intersect for all large enough n.
Let us start by giving a non-example, that is an example of a map which is topologically transitive but not
topologically mixing.
Example 2.2.3. Rotations Rα : S 1 → S 1 are not topologically mixing. For simplicity take α < 1/2. Take for example U, V to be two arcs, each of sufficiently small arc length. Then one can see that there are infinitely many k such that the image Rα^k (U ) does not intersect V : in every block of [1/α] consecutive iterates of Rα (here [1/α], the integer part of 1/α, is roughly the number of iterates needed to turn once around the circle), there is at most one iterate k such that Rα^k (U ) intersects V (drawing a picture of the iterates of Rα will help you understand it).
On the other hand, if α is irrational, Rα is minimal (by Theorem 1.2.1 in § 1.2) and in particular topolog-
ically transitive. So irrational rotations are topologically transitive but not topologically mixing.
Let us now give an example of a topologically mixing dynamical system. In the following we make [0, 1]2 a metric space by using the standard Euclidean metric (we could also use the maximum distance d(x, y) = max(|x1 − y1 |, |x2 − y2 |); see what this changes in the discussion below).
Proposition 2. The baker map F : [0, 1]2 → [0, 1]2 is topologically mixing.
Proof. Let U, V be any two non-empty open sets in X. Since U contains a small ball, it also contains a small dyadic square Q, that is, a square with sides of length 1/2^n which has the form
Q = [ i/2^n , (i + 1)/2^n ] × [ j/2^n , (j + 1)/2^n ],    0 ≤ i, j ≤ 2^n − 1.
Figure 2.1: Here n = 1, i = j = 0. The figures show the 2^k horizontal strips in F^(n+k) (Q), with spacing 1/2^k , for k = 1 and k = 2.
For each k = 1, 2, . . . , n, F acts on Q by doubling the horizontal width and halving the vertical height, so that F^k (Q) is a dyadic rectangle of width 2^k (1/2^n ) = 1/2^(n−k) and height (1/2)^k (1/2^n ) = 1/2^(n+k) . In particular, for k = n, F^n (Q) is a thin horizontal rectangle of full width equal to 1. The image by F of a full horizontal rectangle consists of two full horizontal rectangles, whose vertical distance is at most 1/2 (recall how F acts geometrically and draw a picture to understand it). Since each iterate of F splits a full horizontal rectangle into two, F^(k+n) (Q) (for k ∈ N) consists of 2^k horizontal rectangles of width 1, whose vertical spacing is no more than 1/2^k (where by vertical spacing we mean the distance between, for example, the centers of two consecutive strips), see for example Figure 2.1. Now let B(y, ε) be a ball contained in V , with ε > 0 sufficiently small. Then, for all k such that 1/2^k < ε, we have F^(n+k) (Q) ∩ B(y, ε) ≠ ∅ and hence F^(n+k) (U ) ∩ V ≠ ∅.
The doubling map and the cat map are also topologically mixing dynamical systems (see Exercises below) and provide examples in which one can prove that a small set (for example a dyadic rectangle for the baker map or a rectangle whose sides are in the eigenvector directions for the cat map) is spread under the dynamics.
Exercise 2.2.1. Let X = R/Z and let f (x) = 2x mod 1 be the doubling map. You can use that if d(x, y) <
1/4, then d(f (x), f (y)) = 2d(x, y).
(a) Let I be a dyadic interval, that is an interval of the form
I = [ i/2^N , (i + 1)/2^N ),    N ∈ N+ , 0 ≤ i ≤ 2^N − 1.
Describe the iterates f n (I) for n ∈ N: how many dyadic intervals do they consist of? of which size?
[Hint: You will need to consider separately what happens if n ≤ N and if n > N .]
(b) Show that for any non-empty open set U there exists N ∈ N such that f^N (U ) = X.
(c) Show that the doubling map is topologically mixing.
Exercise 2.2.2. Let fA : T2 → T2 be the cat map. Let λ1 > 1, λ2 < 1 be the eigenvalues of A and let
v1 = ( (1 + √5)/2 , 1 ),    v2 = ( (1 − √5)/2 , 1 ).
(a) Let Q be a small rectangle whose sides have direction v1 and v2 respectively. Describe what the iterates f_A^n (Q) look like.
(b) Show that the cat map is topologically mixing.
[Hint: In both Part (a) and (b) you can use that the directions of v1 and v2 are irrational and that if π(L) is the projection via π : R2 → T2 of a line L with irrational slope (see figure below), then π(L) is dense in T2 , that is for any non-empty open set U ⊂ T2 there is a point of π(L) inside U .]
Figure 2.2: The tent map g and the logistic map f are topologically conjugate.
The logistic map f and the tent map g are topologically conjugate. Let us show that the topological
conjugacy is the map ψ : [0, 1] → [0, 1] given by
ψ(x) = sin^2 (πx/2).      (2.2)
Let us first show that the following diagram commutes:
             g
   [0, 1] −−−−→ [0, 1]
     |ψ            |ψ
     ↓             ↓
   [0, 1] −−−−→ [0, 1]
             f
that is, ψ ◦ g = f ◦ ψ.
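The commutation relation can also be checked numerically. The following sketch (added for illustration, not part of the original notes) assumes the standard formulas f (x) = 4x(1 − x) for the logistic map and g(x) = 2x for x ≤ 1/2, g(x) = 2(1 − x) for x > 1/2 for the tent map, which are not repeated in this excerpt.

    # Sketch: numerical check that psi(g(x)) = f(psi(x)) on a grid of points.
    import math

    def g(x): return 2 * x if x <= 0.5 else 2 * (1 - x)   # tent map (assumed formula)
    def f(x): return 4 * x * (1 - x)                      # logistic map (assumed formula)
    def psi(x): return math.sin(math.pi * x / 2) ** 2     # the conjugacy (2.2)

    err = max(abs(psi(g(i / 1000)) - f(psi(i / 1000))) for i in range(1001))
    print(err)    # of the order of the floating-point rounding error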
Proof of Lemma 2.3.1. Assume that O_g^+(y) is dense and let us show that O_f^+(ψ(y)) is dense. For any non-empty open set U ⊂ X, ψ^(−1)(U ) is an open set in Y , since ψ is continuous (because ψ is a homeomorphism), and it is non-empty, since ψ is surjective. By density of O_g^+(y), there exists k ∈ N such that g^k (y) ∈ ψ^(−1)(U ), and hence f^k (ψ(y)) = ψ(g^k (y)) ∈ U , which shows that O_f^+(ψ(y)) intersects U .
Exercise 2.3.1. Check that if the topological dynamical systems f : X → X and g : Y → Y are topologically semi-conjugate, one can still prove that if the orbit O_g^+(y) of y ∈ Y is dense, then the orbit O_f^+(ψ(y)) of ψ(y) is dense in X.
Proof of Proposition 3. Parts (1) and (2) follow from the definitions as a consequence of Lemma 2.3.1, since dense orbits are mapped to dense orbits by the conjugacy. Indeed, if g is topologically transitive, there exists y ∈ Y so that O_g^+(y) is dense and by Lemma 2.3.1 O_f^+(ψ(y)) is dense in X, so also f is topologically transitive. Similarly, if g is minimal, for any x ∈ X, let y = ψ^(−1)(x) ∈ Y (which is well defined since ψ is invertible). By minimality of g, O_g^+(y) is dense and by Lemma 2.3.1 O_f^+(x) is dense in X, so also f is minimal. The converse implications follow by reversing the roles of f and g and noting that if ψ : Y → X is a topological conjugacy, also ψ^(−1) : X → Y is a topological conjugacy.
Let us now prove Part (3). Assume that g is topologically mixing and let us deduce that f is also topologically mixing. Given U, V open and non-empty, ψ^(−1)(U ) and ψ^(−1)(V ) are also open, since ψ is continuous, and non-empty, since ψ is surjective. Thus there exists N such that for any n ≥ N we have that g^n (ψ^(−1)(U )) ∩ ψ^(−1)(V ) ≠ ∅. Let y ∈ g^n (ψ^(−1)(U )) ∩ ψ^(−1)(V ). Recall that by definition of conjugacy, since ψ is invertible, we have ψ ◦ g ◦ ψ^(−1) = f and hence by induction ψ ◦ g^n ◦ ψ^(−1) = f^n . Thus, if we consider ψ(y) we have that
ψ(y) ∈ ψ( g^n (ψ^(−1)(U )) ) ∩ V = f^n (U ) ∩ V.
Hence for any n ≥ N we have f^n (U ) ∩ V ≠ ∅, which shows that f is topologically mixing. The other implication follows again by reversing the roles of f and g.
The next Exercise follows from Exercise 2.3.1 as Proposition 3 follows from Lemma 2.3.1:
Exercise 2.3.2. If f : X → X and g : Y → Y are topologically semi-conjugated by ψ : Y → X, then if g is
topologically transitive, f is topologically transitive and if g is minimal then f is minimal.
Exercise 2.3.3. Prove that if f : X → X and g : Y → Y are topologically semi-conjugated by ψ : Y → X
then if g is topologically mixing then f is topologically mixing.
Sensitive dependence means that arbitrarily close to each point of the space there are points whose future iterates will become ∆-apart from the iterates of the given point. This phenomenon causes high unpredictability: if for example one tries to use a computer to understand the dynamics of a system, one needs to approximate the initial condition by rounding it off, and if the system has sensitive dependence on initial conditions, this might cause a huge difference: the simulated orbit might be completely different from the real evolution.
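The following small sketch (added here for illustration, not part of the original notes) shows this round-off effect for the doubling map: two initial conditions differing by 10^(−12), such as an exact value and its rounded version, have essentially unrelated simulated orbits after about forty iterations.

    # Sketch: sensitive dependence and round-off for the doubling map.
    def doubling(x):
        return (2 * x) % 1.0

    def dist(x, y):                      # distance on the circle R/Z
        t = abs(x - y) % 1.0
        return min(t, 1.0 - t)

    x, y = 0.123456789, 0.123456789 + 1e-12
    for n in range(50):
        if n % 10 == 0:
            print(n, dist(x, y))         # the error roughly doubles at each step
        x, y = doubling(x), doubling(y)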
Example 2.3.2. Show that if ∆ > 0 is a sensitivity constant for f : X → X, any constant 0 < ∆0 < ∆ is
also a sensitivity constant.
7 A popular image for this phenomenon is the so-called butterfly effect: "a butterfly in China could cause a hurricane in Mexico". A small difference in initial conditions, like the presence of a butterfly flapping its wings, in a chaotic system could create a huge difference in the long-term evolution, such as the creation of a hurricane.
Note that we do not require that all points in B(x, δ) have different evolution, but only that in each ball
there is at least one point with that property. A stronger requirement, which implies sensitive dependence, is
the following:
Definition 2.3.5. A topological dynamical system f : X → X on a metric space (X, d) is called expansive if there exists ν > 0, called the expansive constant, such that for all x, y ∈ X with x ≠ y there exists n ∈ N such that
d(f^n (x), f^n (y)) ≥ ν.
If f is expansive, the orbit of any point near a given point x ∈ X has some iterate which eventually becomes far apart from the corresponding iterate of x. Thus, if f is expansive with constant ν, in particular it has sensitive dependence with constant ∆ = ν.
Example 2.3.3. The doubling map is expansive. We can take ν = 1/4. We have already seen in §1.3 that if d(x, y) < 1/4, then d(f (x), f (y)) = 2d(x, y). Let x ≠ y be distinct points. If d(x, y) ≥ 1/4, then the definition of expansive holds with n = 0 for x and y. Otherwise, by the above relation d(f (x), f (y)) = 2d(x, y). If d(f (x), f (y)) ≥ 1/4, we are again done since the definition holds with n = 1. Otherwise, d(f^2 (x), f^2 (y)) = 2d(f (x), f (y)) = 2^2 d(x, y). Thus, continuing in this way, if d(f^k (x), f^k (y)) < 1/4 for all 0 ≤ k ≤ n − 1, then d(f^n (x), f^n (y)) = 2^n d(x, y).
Note that d(x, y) ≠ 0 by the properties of a distance if x ≠ y. Thus, for given x ≠ y there exists n ∈ N such that 2^n d(x, y) ≥ 1/4 (namely any n ≥ log( 1/(4 d(x, y)) )/ log 2), and therefore d(f^n (x), f^n (y)) ≥ 1/4. We conclude that the doubling map is expansive with expansive constant ν = 1/4 (in particular, it has sensitive dependence on initial conditions with constant 1/4).
Exercise 2.3.4. The linear maps Em : R/Z → R/Z given by Em (x) = mx mod 1 (where m ∈ N, m > 1) are expansive with expansive constant ν = 1/(2m).
Let us now give an example of a map which has sensitive dependence but is not expansive.
Example 2.3.4. Let fA : T2 → T2 be the cat map. Let us first show that fA has sensitive dependence on initial conditions. Take for example ∆ = 1/2. For any x ∈ T2 and any δ > 0, consider all y ∈ Bd (x, δ) on the line through x in direction of the eigenvector v1 with eigenvalue λ1 > 1. Since fA expands points in direction v1 with factor λ1 , we have that
d(f_A^n x, f_A^n y) = λ1^n d(x, y),
at least for all n such that λ1^n d(x, y) ≤ 1/2. Since λ1 > 1, there exists n0 such that 1/(2λ1^n0 ) < δ. We can now choose y ∈ Bd (x, δ) so that d(y, x) = 1/(2λ1^n0 ). Then d(f_A^n0 x, f_A^n0 y) = 1/2. This shows that ∆ = 1/2 is a sensitivity constant.
Let us now show that fA is not expansive. Fix any ν ∈ (0, 1/2]. Then fix any x ∈ T2 and consider y ∈ Bd (x, ν) on the line through x in direction of the eigenvector v2 with eigenvalue λ2 < 1. Since fA contracts distances in direction v2 with factor λ2 , we have that
d(f_A^n x, f_A^n y) = λ2^n d(x, y) ≤ d(x, y) < ν
for all n ∈ N. We have thus shown that for any ν ∈ (0, 1/2] there exist x ≠ y such that d(f_A^n x, f_A^n y) < ν for all n ∈ N. This implies fA is not expansive.
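Numerically, the different behaviour in the two eigendirections can be seen as follows (a sketch added for illustration; it assumes the usual cat map matrix A = [2 1; 1 1], which is not repeated in this excerpt).

    # Sketch: perturbations of a point along the stable direction v2 stay small
    # for all n, while perturbations along the unstable direction v1 grow.
    import numpy as np

    A = np.array([[2, 1], [1, 1]])                  # assumed cat map matrix
    eigvals, eigvecs = np.linalg.eig(A)
    v1 = eigvecs[:, int(np.argmax(eigvals))]        # expanding direction
    v2 = eigvecs[:, int(np.argmin(eigvals))]        # contracting direction

    def torus_dist(p, q):
        d = np.abs((p - q) % 1.0)
        return np.linalg.norm(np.minimum(d, 1.0 - d))

    x = np.array([0.2, 0.7])
    y_stable = (x + 1e-3 * v2) % 1.0
    y_unstable = (x + 1e-3 * v1) % 1.0
    for n in range(0, 25, 6):
        An = np.linalg.matrix_power(A, n)
        print(n, torus_dist(An @ x % 1.0, An @ y_stable % 1.0),
                 torus_dist(An @ x % 1.0, An @ y_unstable % 1.0))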
Exercise 2.3.5. Show that any hyperbolic toral automorphism fA : T2 → T2 has sensitive dependence but is not expansive.
Exercise 2.3.6. Let F : [0, 1]2 → [0, 1]2 be the Baker map.
(a) Show that the baker map has sensitive dependence on initial conditions with ∆ = 1/4;
(b) Show that it is not expansive, that is, that there is no ν > 0 such that for every two points (x1 , y1 ), (x2 , y2 ) ∈ [0, 1]2 there is n such that d(F^n (x1 , y1 ), F^n (x2 , y2 )) ≥ ν.
[Hint: Look at nearby points on the same horizontal line for (a) and at nearby points on the same vertical
line for (b)]
Example 2.3.5. The doubling map is chaotic. We have already seen that it is expansive (Eg 2.3.3), that it
is topologically transitive (Theorem 1.3.1 in §1.3) and that periodic points are dense (Exercise 1.3.4 in §1.3).
Example 2.3.6. Rotations Rα of the circle are not chaotic according to this definition. If α is irrational,
there are no periodic points. If α is rational, there are no dense orbits. Moreover, for any α, Rα does not have
sensitive dependence on initial conditions since Rα is an isometry and the iterates of nearby points remain
close by.
Other maps that can be proved to be chaotic are the baker map, the cat map and the logistic map. For the latter, one can use the fact that it is topologically conjugate to the tent map. Indeed, let us remark that if f : X → X and g : Y → Y are topologically conjugate one can prove, similarly to what we did for the other properties, that f is chaotic if and only if g is chaotic.
* Exercise 2.3.7. Consider the linear twist T : T2 → T2 , that is the map given by
(b) Show that if y is rational then (x, y) is periodic. Conclude that P er(T ) is dense.
(c) Is T chaotic?
* Exercise 2.3.8. Consider the tent map f and the logistic map g defined above in this lecture.
• Show that the tent map f is topologically mixing;
8 The first to adopt this as the definition of chaos was Devaney, in An Introduction to Chaotic Dynamical Systems. Sometimes this definition is also called Devaney chaotic.
Thus, two points are ε-close with respect to the distance dn if their iterates under f stay ε-close until time n. Note that the definition of dn depends on the transformation f . Thus, the balls with respect to this metric,
B_dn (x, ε) = {y ∈ X such that d(f^k (x), f^k (y)) < ε for all 0 ≤ k < n},
consist of all points whose trajectories up to time n stay ε-close to the finite orbit segment {x, f (x), . . . , f^(n−1)(x)}. Another way to express it is
B_dn (x, ε) = ∩_(k=0)^(n−1) f^(−k) ( Bd (f^k (x), ε) ).
Indeed, the points y in the intersection are exactly the points such that f^k (y) ∈ Bd (f^k (x), ε) for all 0 ≤ k ≤ n − 1.
Definition 2.4.1. Let ε > 0, n ∈ N. A set S ⊂ X is (n, ε)-separated if for all distinct points x, y ∈ S, x ≠ y, we have dn (x, y) ≥ ε.
Points in an (n, ε)-separated set have trajectories that, with a finite scale resolution ε, can be recognized as different within time n. Let us give an example of an (n, ε)-separated set.
9 The definition of metric entropy, often called Kolmogorov-Sinai entropy, was introduced for the first time by Kolmogorov, one of the fathers of ergodic theory, in a paper in 1958 and was subsequently developed by Sinai, another crucial figure in ergodic theory, who at the time was his graduate student (the entry on entropy on Scholarpedia was actually written by Sinai himself). Entropy in information theory, usually called Shannon entropy, was introduced by Claude Shannon in his 1948 paper A Mathematical Theory of Communication.
10 In metric spaces this definition of entropy was introduced by Bowen in 1971 and independently by Dinaburg in 1970. The
definition of entropy via covers for any topological space already existed before. Equivalence between these two notions was
proved by Bowen in 1971.
11 Historically the definition of topological entropy via covers came first and was introduced in 1965 by Adler, Konheim and McAndrew. Their definition for topological dynamical systems is modelled on the definition of metric entropy by Kolmogorov and Sinai.
Example 2.4.1. [Separated sets for the doubling map] Let f : R/Z → R/Z be the doubling map, f (x) = 2x mod 1. Let ε > 0 and assume that ε < 1/4. Find k such that 1/2^(k+1) < ε ≤ 1/2^k . By the assumption on ε, k ≥ 2. Consider the set Sn of dyadic fractions with denominator 2^n , that is
Sn = { i/2^n , 0 ≤ i ≤ 2^n − 1 }.      (2.4)
Let us prove that the set S_(n−1+k) is (n, ε)-separated. Let x, y ∈ S_(n−1+k) , with x ≠ y. We need to show that dn (x, y) ≥ ε, that means that there exists 0 ≤ l ≤ n − 1 such that d(f^l (x), f^l (y)) ≥ ε. In an exercise, we saw that for any u, v ∈ R/Z
d(u, v) ≤ 1/4   ⇒   d(f (u), f (v)) = 2d(u, v).      (2.5)
If there exists 0 ≤ l ≤ n − 1 such that d(f^l (x), f^l (y)) ≥ 1/4, we are done (since k ≥ 2, 1/4 ≥ 1/2^k ≥ ε). Otherwise, we can apply (2.5) repeatedly n − 1 times and get
d(f^(n−1)(x), f^(n−1)(y)) = 2^(n−1) d(x, y).
Since x ≠ y and x, y ∈ S_(n−1+k) , we have d(x, y) ≥ 1/2^(n−1+k) , so that we get
d(f^(n−1)(x), f^(n−1)(y)) = 2^(n−1) d(x, y) ≥ 2^(n−1)/2^(n−1+k) = 1/2^k ≥ ε.
This proves that S_(n−1+k) is (n, ε)-separated. Note that the cardinality of S_(n−1+k) is 2^(n−1+k) .
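As a quick numerical sanity check (a sketch added here, not part of the original notes), one can verify directly that S_(n−1+k) is (n, ε)-separated for small values of n and k.

    # Sketch: brute-force check that S_{n-1+k} is (n, eps)-separated for the
    # doubling map, with k = 2 and eps = 1/2^k = 1/4 (the borderline value).
    def dist(x, y):
        t = abs(x - y) % 1.0
        return min(t, 1.0 - t)

    def d_n(x, y, n):
        m = 0.0
        for _ in range(n):
            m = max(m, dist(x, y))
            x, y = (2 * x) % 1.0, (2 * y) % 1.0
        return m

    n, k = 5, 2
    eps = 1 / 2 ** k
    S = [i / 2 ** (n - 1 + k) for i in range(2 ** (n - 1 + k))]
    print(min(d_n(x, y, n) for x in S for y in S if x != y) >= eps)   # True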
Remark 2.4.1. One can show, using the assumption that X is compact, that (n, ε)-separated sets exist and are finite (see Extra 1 in the Extras on Topological Entropy linked after the following class).
To determine how many different orbits can be recognized at resolution ε, one can maximize the size of an (n, ε)-separated set.
Definition 2.4.2. Let Sep(f, n, ε) (or simply Sep(n, ε) when there is no ambiguity on f ) be the maximal (that is, the largest) cardinality of an (n, ε)-separated set in X.
Clearly as n grows, the maximum number of (n, ε)-separated points will grow. In our Example 2.4.1 with the doubling map, the cardinality of the (n, ε)-separated set S_(n−1+k) that we exhibited grows as 2^(k+n−1), thus exponentially in n.
Note that if a quantity grows exponentially, for example if an = e^(κn), the exponential rate of growth, or the exponent κ, can be obtained as
κ = lim_(n→∞) log( e^(κn) )/n = lim_(n→∞) log( an )/n.
This is still true if the growth is not purely exponential, but there are other subexponential factors, for example if an = n^2 e^(κn):
lim_(n→∞) log( an )/n = lim_(n→∞) log( n^2 e^(κn) )/n = lim_(n→∞) 2 log(n)/n + lim_(n→∞) log( e^(κn) )/n = 0 + κ = κ.
More generally, if an = f (n) e^(κn) where f (n) is subexponential, that is, limn→∞ log f (n)/n = 0, the limit limn→∞ log an /n still gives κ.
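As a quick numerical check of this fact (a sketch added here, not part of the original notes), the quantity log(an )/n indeed approaches κ even in the presence of the subexponential factor n^2 considered above.

    # Sketch: log(a_n)/n tends to kappa for a_n = n^2 e^{kappa n}, kappa = log 2.
    import math

    kappa = math.log(2)
    for n in (10, 100, 500, 1000):
        a_n = n ** 2 * math.exp(kappa * n)
        print(n, math.log(a_n) / n)      # approaches kappa = 0.6931...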
Thus, to consider the exponential growth rate of Sep(f, n, ε), for any ε > 0, consider the quantity
h_top (f, ε) = lim sup_(n→∞) log( Sep(f, n, ε) )/n.
We need to consider the lim sup because we do not know a priori if the limit exists. In all the examples that we will compute, the limit actually exists. If we now change resolution, thus let ε → 0, it is again clear that Sep(f, n, ε) cannot decrease as ε tends to zero (as the resolution becomes finer and finer, one can distinguish at least the same orbits, and probably more). Moreover one can show that the growth rate of distinguishable orbits will stay the same or increase, thus also h_top (f, ε) is non-decreasing as ε decreases. Since a monotone function has a limit, the limit of h_top (f, ε) as ε tends to 0 exists.
Definition 2.4.3. [Topological entropy] The topological entropy h_top (f ) of a topological dynamical system f : X → X is given by
h_top (f ) = lim_(ε→0) h_top (f, ε) = lim_(ε→0) lim sup_(n→∞) log( Sep(f, n, ε) )/n.
Thus, h_top (f, ε) is the exponential growth rate of the maximum number of orbits of length n which are distinguishable with finite precision ε, and h_top (f ), which is the limit for arbitrarily small ε, is the exponential growth rate of the maximum number of orbits of length n which are distinguishable with finite but arbitrary precision.
Example 2.4.2. Let f be again the doubling map. We showed that if 1/2^(k+1) < ε ≤ 1/2^k then the set S_(n−1+k) is an (n, ε)-separated set. Thus the maximal cardinality Sep(n, ε) of an (n, ε)-separated set is at least the cardinality of S_(n−1+k) , which is 2^(n−1+k) . Thus, remarking that here ε, and hence k, are fixed and the limit is taken only in n, we get
h_top (f, ε) ≥ lim sup_(n→∞) log( 2^(n−1+k) )/n = log 2,
and hence h_top (f ) ≥ log 2.
Definition 2.4.4. Let ε > 0, n ∈ N. A set S ⊂ X is (n, ε)-spanning if for all x ∈ X there is a y ∈ S such that dn (x, y) < ε.
In other words, a set S is (n, ε)-spanning if any point of the space can be approximated by a point of S whose orbit is indistinguishable from it up to time n with finite resolution ε. Equivalently, S is (n, ε)-spanning if and only if
X ⊂ ∪_(y∈S) B_dn (y, ε),    where B_dn (y, ε) = {x ∈ X such that dn (x, y) < ε}.
One can show that in any compact metric space (X, d), for any ε > 0 and n ∈ N there exist (n, ε)-spanning sets (see Extra 1 in the Extras on Topological Entropy linked after the following class).
Example 2.4.3. [Spanning set for the doubling map] Let f be again the doubling map and Sk the set of dyadic fractions with denominator 2^k , see (2.4). Let 1/2^(k+1) < ε ≤ 1/2^k . Then S_(n+k) is (n, ε)-spanning. Indeed, if x ∈ [0, 1], there exists i such that
x ∈ [ i/2^(n+k) , (i + 1)/2^(n+k) ),    where 0 ≤ i ≤ 2^(n+k) − 1.
If y = i/2^(n+k) ∈ S_(n+k) , then
d(x, y) ≤ 1/2^(n+k)   ⇒   d(f^j (x), f^j (y)) ≤ 2^j/2^(n+k) ≤ 2^(n−1)/2^(n+k) = 1/2^(k+1) < ε,    for all 0 ≤ j < n.
Thus, dn (x, y) < ε.
Note that the cardinality of S_(n+k) is 2^(n+k) , so it grows exponentially in n.
This time it makes sense to see what is the smallest possible (n, ε)-spanning set.
Definition 2.4.5. Let Span(f, n, ε) (or Span(n, ε) if there is no ambiguity on f ) be the minimal (that is, the smallest) cardinality of an (n, ε)-spanning set in X.
In other words, Span(f, n, ε) is the minimum number of initial conditions needed to approximate with resolution ε all orbits in the space up to time n. We can consider the exponential growth rate in n of Span(f, n, ε) for fixed ε and then take the limit as ε tends to zero. It turns out that this gives the same result as using Sep(f, n, ε) and yields again the topological entropy:
Theorem 2.4.1. For any topological dynamical system f : X → X we have
h_top (f ) = lim_(ε→0) lim sup_(n→∞) log( Span(f, n, ε) )/n.
Note that we have already defined h_top (f ) in terms of (n, ε)-separated sets, so by definition
h_top (f ) = lim_(ε→0) lim sup_(n→∞) log( Sep(f, n, ε) )/n.
Thus, the content of the theorem is that the two limits, obtained using separated and spanning sets respectively, are the same, that is:
lim_(ε→0) lim sup_(n→∞) log( Sep(f, n, ε) )/n = lim_(ε→0) lim sup_(n→∞) log( Span(f, n, ε) )/n.
The proof of the theorem is given below. Let us first remark that the theorem states that to compute topological entropy, we can either use maximal separated sets or minimal spanning sets. While the expression with Sep(n, ε) is useful to give lower bounds on h_top (f ), the expression with Span(n, ε) is useful to give upper bounds on h_top (f ). Indeed, once we find an (n, ε)-separated set, we have a lower bound on the maximal cardinality of (n, ε)-separated sets, that is on Sep(n, ε). Conversely, once we find an (n, ε)-spanning set, we have an upper bound on the minimal cardinality of (n, ε)-spanning sets, that is on Span(n, ε). When the upper bound and the lower bound coincide, the common value is the topological entropy.
Let us show two applications of Theorem 2.4.1.
Example 2.4.4. [Entropy of the doubling map] Consider the doubling map f . Let ε < 1/4 and let k be such that 1/2^(k+1) < ε ≤ 1/2^k . Since S_(n+k) is an (n, ε)-spanning set of cardinality 2^(n+k) , the minimal cardinality Span(f, n, ε) of an (n, ε)-spanning set is at most 2^(n+k) , so
lim sup_(n→∞) log( Span(f, n, ε) )/n ≤ lim sup_(n→∞) log( 2^(n+k) )/n = log 2.
Note that k is fixed when taking the limit in n. Taking now the limit in ε of a quantity which is independent of ε, by Theorem 2.4.1 we have h_top (f ) ≤ log 2. Since we have already shown using (n, ε)-separated sets that h_top (f ) ≥ log 2, we conclude that h_top (f ) = log 2.
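One can also estimate this entropy numerically. The following sketch (added for illustration, not part of the original notes) greedily builds an (n, ε)-separated set for the doubling map out of a fine grid of candidate points; the quantity log(cardinality)/n slowly decreases towards log 2 ≈ 0.693 (the prefactor coming from ε disappears only in the limit n → ∞).

    # Sketch: crude numerical lower bound for h_top of the doubling map via a
    # greedily constructed (n, eps)-separated set taken from a grid of points.
    import math

    def dist(x, y):
        t = abs(x - y) % 1.0
        return min(t, 1.0 - t)

    def d_n(x, y, n):
        m = 0.0
        for _ in range(n):
            m = max(m, dist(x, y))
            x, y = (2 * x) % 1.0, (2 * y) % 1.0
        return m

    eps, grid = 0.2, [i / 2048 for i in range(2048)]
    for n in (2, 4, 6):
        S = []
        for x in grid:                                   # keep x only if it stays
            if all(d_n(x, y, n) >= eps for y in S):      # eps-separated from S
                S.append(x)
        print(n, len(S), math.log(len(S)) / n)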
Example 2.4.5. [Entropy of rotations] Let Rα : S 1 → S 1 be a rotation. Fix ε and N such that 1/N ≤ ε. Let us consider N equispaced points on S 1 :
S = { e^(2πi m/N) , m = 0, 1, . . . , N − 1 },
so that the arc length distance between two successive points in S is 1/N . Clearly S is (1, ε)-spanning, since for any z ∈ S 1 there exists z′ ∈ S such that d(z, z′ ) ≤ 1/N ≤ ε. Let us show that the same set S is also (n, ε)-spanning for any n ∈ N. It is enough to remark that for any j ∈ N, since Rα preserves the arc length distance,
d(Rα^j (z), Rα^j (z′ )) = d(z, z′ ) ≤ 1/N ≤ ε.      (2.6)
Thus, for any n ∈ N, dn (z, z′ ) ≤ ε. This shows that S is also (n, ε)-spanning for any n ∈ N.
Hence, the minimal cardinality Span(Rα , n, ε) is at most the cardinality of S, which is N , independently of n. Thus, for any ε, if we choose 1/N ≤ ε and keep N fixed as n → ∞, we get
lim sup_(n→∞) log( Span(Rα , n, ε) )/n ≤ lim sup_(n→∞) log( N )/n = 0.
Notice that the growth rate is non-negative, so since it is at most 0, it is indeed 0. By Theorem 2.4.1,
0 ≤ h_top (Rα ) = lim_(ε→0) lim sup_(n→∞) log( Span(Rα , n, ε) )/n = 0.
Let us now prove (2). Let S be an (n, ε)-spanning set of minimal cardinality, so that Span(n, ε) = Card(S). By definition of (n, ε)-spanning, we know that
X ⊂ ∪_(y∈S) B_dn (y, ε).
Let S′ be any (n, 2ε)-separated set. We claim that no two distinct points x1 ≠ x2 of S′ can belong to the same dn -ball. Indeed, if both x1 , x2 ∈ B_dn (y, ε) for some y ∈ S, by the triangle inequality
dn (x1 , x2 ) ≤ dn (x1 , y) + dn (y, x2 ) < 2ε,
which contradicts the fact that S′ is (n, 2ε)-separated. So the number of points in S′ cannot be more than the number of balls, that is the cardinality of S. Let us now choose S′ to be an (n, 2ε)-separated set of maximal cardinality. Then
Sep(n, 2ε) = Card(S′ ) ≤ Card(S) = Span(n, ε).
Proof of Theorem 2.4.1. By the Lemma, for any ε > 0 and n ∈ N, we have
Sep(n, 2ε) ≤ Span(n, ε) ≤ Sep(n, ε),
where the first inequality follows from (2) and the second inequality follows from (1). By the sandwich theorem (also called the pinching theorem or two policemen theorem), if we now take the limit as ε tends to zero, both the left and the right hand side converge by definition to h_top (f ), so
h_top (f ) ≤ lim_(ε→0) lim sup_(n→∞) log( Span(f, n, ε) )/n ≤ h_top (f ).
Thus, we conclude as desired that the limit as ε tends to zero of the exponential growth rate of Span(f, n, ε) is equal to h_top (f ).
1. If for any fixed ε > 0 you can construct sets Sn , for each n ∈ N, which are (n, ε)-separated and whose exponential growth rate is h1 , then, since Sep(n, ε) ≥ Card(Sn ), you can conclude that h_top (f ) ≥ h1 ;
2. If for any fixed ε > 0 you can construct sets Sn , for each n ∈ N, which are (n, ε)-spanning and whose exponential growth rate is h2 , then, since Span(n, ε) ≤ Card(Sn ), you can conclude that h_top (f ) ≤ h2 ;
3. If the exponential growth rates h1 and h2 in the two previous points are the same, then you can conclude that h_top (f ) = h1 = h2 .
We have already seen examples of this principle in the previous section. We will now compute the topological entropy of hyperbolic toral automorphisms using this strategy.
A v 1 = λ1 v 1 , A v 2 = λ2 v 2 ,
and let us assume that they are renormalized so that they have unit length: ||v 1 || = ||v 2 || = 1.
Proof. Let us first construct (n, ε)-spanning sets for fA . Fix ε > 0 and choose N ∈ N such that 1/N ≤ ε/2. Draw inside the unit square [0, 1) × [0, 1) segments of lines in the direction of v1 which cross the square fully and have spacing 1/N both on the horizontal and on the vertical side, as in Figure 2.3(a). Note that the distance between two successive lines is less than 1/N ≤ ε/2, since the distance between lines is the side of a right triangle whose hypotenuse is the spacing 1/N between lines.
On each line, consider points whose spacing is ε/(2λ1^(n−1)). The reason for this choice will be clear later. Let S ⊂ T2 be the set which consists of the union over the lines of these points (see Figure 2.3(b)). Let us prove that S is (n, ε)-spanning.
Figure 2.4: (a) the points x and y; (b) their images A^k (x) and A^k (y).
Let x ∈ T2 (note that a point on T2 has two coordinates (x1 , x2 ) and we can think of it as a vector, that is why we write x). Let y ∈ S be the closest point to x among points in S. We can write x − y as a sum of a vector proportional to v2 and a vector proportional to v1 (see Figure 2.4(a)). Remark that the distance between x and y along the v1 direction is less than the spacing of points on each line, which is ε/(2λ1^(n−1)) (see Figure 2.4(a)); the distance between x and y in direction v2 is less than the distance between x and a line, which is less than the distance ε/2 between lines. Thus we can write
x = y + a v1 + b v2 ,    where |a| ≤ ε/(2λ1^(n−1)),  |b| ≤ ε/2.      (2.7)
Let us now compute the distance between the orbits of x and y under fA . Recall that f_A^k is obtained by first acting linearly by the matrix A^k and then taking the result modulo 1 (which corresponds to cutting and pasting the affine image of [0, 1]2 under A^k to map it back again to a unit square). Since A acts linearly, for each iterate k ∈ N
A^k (x − y) = A^k (a v1 + b v2 ) = a A^k v1 + b A^k v2 = a λ1^k v1 + b λ2^k v2 ,
where in the latter equality we used that v1 , v2 are eigenvectors. Thus, since the operation of cutting and pasting does not increase the distances (actually, if the distance is less than 1, it is preserved when taking the result modulo Z2 ), by setting z = y + a v1 = x − b v2 (see again Figure 2.4(a)), we have by the triangle inequality that
d(f_A^k (x), f_A^k (y)) ≤ d(f_A^k (x), f_A^k (z)) + d(f_A^k (z), f_A^k (y)) ≤ |a| λ1^k + |b| λ2^k .
Using now the bounds on |a| and |b| from (2.7) and then recalling that k ≤ n − 1 and that λ2 < 1, we get that
|a| λ1^k + |b| λ2^k ≤ ( ε/(2λ1^(n−1)) ) λ1^k + (ε/2) λ2^k ≤ ε/2 + ε/2 = ε.
Thus d(f_A^k (x), f_A^k (y)) ≤ ε for each 0 ≤ k ≤ n − 1, which means that dn (x, y) ≤ ε and concludes the proof that S is (n, ε)-spanning.
Let us bound the cardinality of S. Let L be the length of the longest line segment. Then, we can bound the number of points in each line by L divided by the spacing, and since there are at most 2N lines, we get
Card(S) ≤ (number of lines)(points on each line) ≤ 2N ( L/( ε/(2λ1^(n−1)) ) ) = (4N L/ε) λ1^(n−1) .
Let us compute the exponential growth rate of Span(n, ε), recalling that N and ε are fixed and only n grows:
lim sup_(n→∞) log( Span(n, ε) )/n ≤ lim sup_(n→∞) log( (4N L/ε) λ1^(n−1) )/n = log λ1 .
Thus, using the Theorem which expresses the topological entropy using the growth rate of Span(n, ε), we have h_top (fA ) ≤ log λ1 .
Let us now construct (n, ε)-separated sets for fA . Fix an ε < 1/2. To construct an (n, ε)-separated set S, it is enough to consider now only one of the lines, for example the line of maximal length, whose length will be denoted by L, and let S ⊂ T2 be the set which consists of points on the line whose spacing is ε/λ1^(n−1), as in Figure 2.3(c). Let us prove that S is (n, ε)-separated, that is, that, given two distinct points x, y in S, there exists 0 ≤ l ≤ n − 1 such that d(f_A^l (x), f_A^l (y)) ≥ ε, i.e. the two points can be distinguished with resolution ε in time n. Let us check that the closest points, that is two consecutive points on the line, can be distinguished.
If x and y are consecutive points, they can be written as
y = x + ( ε/λ1^(n−1) ) v1 .
Since A^(n−1)(y − x) = ( ε/λ1^(n−1) ) λ1^(n−1) v1 = ε v1 , and the operation of cutting and pasting to consider the result mod Z2 does not change distances until the distance is at least 1, we have, for k = n − 1,
d(f_A^(n−1)(x), f_A^(n−1)(y)) = ( ε/λ1^(n−1) ) λ1^(n−1) = ε.
Thus, dn (x, y) ≥ ε. If one considers another pair of distinct points x′ , y′ on the line, d(x′ , y′ ) ≥ d(x, y) and moreover, since A^k acts linearly, d(A^k x′ , A^k y′ ) ≥ d(A^k x, A^k y) for any k. Hence there will also be a 0 ≤ k < n such that
d(f_A^k (x′ ), f_A^k (y′ )) = d(A^k (x′ ), A^k (y′ )) ≥ ε.
Thus we conclude that dn (x′ , y′ ) ≥ ε and that S is (n, ε)-separated.
The cardinality of S, which is the number of points on the line, is at least
Card(S) ≥ L/( ε/λ1^(n−1) ) = (L/ε) λ1^(n−1) .
Using this time the definition of h_top (fA ) with Sep(n, ε) we get
h_top (fA , ε) ≥ lim sup_(n→∞) log( (L/ε) λ1^(n−1) )/n = log λ1 .
Thus, h_top (fA ) ≥ log λ1 . Combining with h_top (fA ) ≤ log λ1 , we have proved that h_top (fA ) = log λ1 .
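For the standard cat map matrix A = [2 1; 1 1] (an assumed choice, since the matrix is not repeated in this excerpt), the value log λ1 can be computed numerically as follows (a minimal sketch, not part of the original notes).

    # Sketch: the largest eigenvalue of A and its logarithm, which equals
    # h_top(f_A) by the computation above.
    import math
    import numpy as np

    A = np.array([[2, 1], [1, 1]])
    lam1 = max(np.linalg.eigvals(A).real)
    print(lam1, math.log(lam1))   # (3 + sqrt(5))/2 = 2.618...,  log lam1 = 0.9624...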
Remark 2.5.1. If f and g are topologically conjugate, then h_top (f ) = h_top (g): topological entropy is an invariant of topological conjugacy.
For example, the ball Bd (x, ε) has diameter 2ε. Let dn be as usual the metric
dn (x, y) = max_(k=0,...,n−1) d(f^k (x), f^k (y)).
Let U = {U1 , U2 , . . . , UN } be a cover of X with open sets of dn -diameter at most ε, that is diam_dn (Ui ) ≤ ε for any 1 ≤ i ≤ N . Note that since X is compact we can assume that the cover consists of finitely many open sets (since from any open cover we can extract a finite subcover).
Example 2.5.1. If S is (n, ε/2)-spanning, then (see the definition of (n, ε/2)-spanning)
X ⊂ ∪_(y∈S) B_dn (y, ε/2),
so the collection {B_dn (y, ε/2), y ∈ S} is an open cover of X whose sets have dn -diameter at most ε and whose cardinality is Card(S).
This collection is a cover, since given any x ∈ X, since U is a cover, x ∈ U_(l0) for some 1 ≤ l0 ≤ N , f (x) ∈ U_(l1) for some 1 ≤ l1 ≤ N and so on up to f^(n−1)(x) ∈ U_(l_{n−1}) for some 1 ≤ l_{n−1} ≤ N , so that
x ∈ U_(l0) ∩ f^(−1)(U_(l1)) ∩ · · · ∩ f^(−(n−1))(U_(l_{n−1})).
(Note that here l0 , l1 , . . . , l_{n−1} , . . . is a symbolic coding of the itinerary of the orbit O_f^+(x) with respect to the cover {U1 , U2 , . . . , UN }.) Moreover, the dn -diameter of each set is at most ε, since if x, y ∈ U_(k0) ∩ f^(−1)(U_(k1)) ∩ . . . ∩ f^(−(n−1))(U_(k_{n−1})), then
x, y ∈ U_(k0) ,   f (x), f (y) ∈ U_(k1) ,   . . . ,   f^(n−1)(x), f^(n−1)(y) ∈ U_(k_{n−1}) ,
so that
d(x, y) ≤ ε,  d(f (x), f (y)) ≤ ε,  . . . ,  d(f^(n−1)(x), f^(n−1)(y)) ≤ ε   ⇒   dn (x, y) ≤ ε.
Definition 2.5.1. Let Cov(f, n, ε) (or simply Cov(n, ε) if the map f is clear from the context) be the minimal cardinality of an open cover U = {U1 , U2 , . . . , UN } by sets whose diameter in the dn metric satisfies diam_dn (Ui ) ≤ ε.
Remark 2.5.2. One can show that the exponential growth rate of Cov(n, ε), that is
lim_(n→∞) log( Cov(f, n, ε) )/n,
exists (see Extra), so there is no need to use lim inf to define it.
Theorem 2.5.2. Let (X, d) be a compact metric space and f : X → X a topological dynamical system. The topological entropy h_top (f ) can be computed using Cov(f, n, ε) and is given by
h_top (f ) = lim_(ε→0) lim_(n→∞) log( Cov(f, n, ε) )/n.
We will see an example of computation of topological entropy using this definition in the next sections.
Let us now give the proof of Theorem 2.5.2, which follows immediately from the following Lemma (which is
similar to Lemma 2.3.1).
Lemma 2.5.1. For any topological dynamical system f : X → X on a metric space (X, d) we have:
Span(n, ε) ≤ Cov(n, ε) ≤ Span(n, ε/2).
Proof. In Example 2.5.1, we showed that if S is (n, ε/2)-spanning, then it gives a cover with balls of dn -diameter at most ε and of cardinality Card(S). Thus, if S is such that Span(n, ε/2) = Card(S), this shows that the minimal cardinality Cov(n, ε) is at most the cardinality of S, thus Cov(n, ε) ≤ Span(n, ε/2), which is the second inequality.
For the other inequality, note that if U is a cover with open sets of dn -diameter at most ε, then each open set U is contained in a ball B_dn (x, ε) centered at any point x ∈ U . Indeed, if x ∈ U , any other point y ∈ U is such that dn (x, y) ≤ ε by definition of diameter. Thus, we can find a cover with dn -balls of radius ε of cardinality Card(U ), and the collection S of the centers of the balls gives an (n, ε)-spanning set of cardinality Card(U ). If we choose U such that Card(U ) = Cov(n, ε), this shows that the minimal cardinality Span(n, ε) is at most Card(S) = Card(U ). So we have Span(n, ε) ≤ Cov(n, ε).
Proof of Theorem 2.5.2. For each ε > 0 and each n ∈ N, by Lemma 2.5.1, we have that
log( Span(n, ε) )/n ≤ log( Cov(n, ε) )/n ≤ log( Span(n, ε/2) )/n.
Since, as ε → 0, by Theorem 2.4.1 in the previous section both the right and the left hand side tend to h_top (f ), we have, by the sandwich theorem, that
h_top (f ) = lim_(ε→0) lim_(n→∞) log( Cov(n, ε) )/n.
See the Extras for this section for the proof of the existence of the exponential growth rate of Cov(f, n, ε) and see Exercise .0.6 in the Extras for the proof of Remark 2.5.1 on topological entropy as a conjugacy invariant.
Σ+_N = {1, . . . , N }^∞ = { a = (ai )_(i=0)^∞ , 1 ≤ ai ≤ N },
Σ_N = {1, . . . , N }^Z = { a = (ai )_(i=−∞)^∞ , 1 ≤ ai ≤ N },
σ+ ( (ai )_(i=0)^(+∞) ) = (ai+1 )_(i=0)^(+∞) ,   or, when f is invertible,   σ( (ai )_(i=−∞)^(+∞) ) = (ai+1 )_(i=−∞)^(+∞) .
The maps
σ+ : Σ+_N → Σ+_N ,    σ : Σ_N → Σ_N ,
are known as the full (one-sided) shift on N symbols and the full (bi-sided) shift on N symbols, respectively.
If ψ : X → Σ+_N (or ψ : X → Σ_N in the invertible case) is the coding map which assigns to each point its itinerary, the previous relation shows that for all x ∈ X
ψ(f (x)) = σ+ (ψ(x)),   that is,   ψ ◦ f = σ+ ◦ ψ.
In order to give a conjugacy, though, the coding map ψ should be both injective and surjective. Thus, it is
natural to ask:
(Q1) Is the coding unique?
(Q2) Do all sequences in Σ+_N (or in Σ_N ) occur as possible itineraries?
The answer to both these questions is in general NO. In all the cases that we saw so far (doubling map, baker map, Gauss map), all possible finite12 sequences (in Σ+_2 for the doubling map, in Σ_2 for the baker map and in the countably many digits {1, 2, . . . , n, . . .} for the Gauss map) do occur, but as Example 2.6.1 below shows, this is often not the case.
12 Also in these examples there are countably many infinite sequences that do not occur as itineraries: for example, for the doubling map, if the coding partition is [0, 1/2) and [1/2, 1], all sequences which end with a tail of 1s do not appear as the itinerary of any point.
Example 2.6.1. Consider the map f : [0, 1) → [0, 1) given by
f (x) = 2x          if 0 ≤ x < 1/2,
f (x) = x − 1/2     if 1/2 ≤ x < 1,
whose graph is shown in Figure 2.5. Let I1 = [0, 1/2) and I2 = [1/2, 1]. It is clear that if x ∈ I2 , then f (x) ∈ I1 .
On the other hand, if x ∈ I1 , one could have either f (x) ∈ I1 (if x < 1/4) or f (x) ∈ I2 (if 1/4 ≤ x < 1/2). Thus, one will never see two consecutive digits 2, 2 in the itinerary, while all the combinations 1, 1 and 1, 2 and 2, 1 can occur.
Being able to describe the subset of the shift space consisting of itineraries of this form is one of the reasons to study subshifts of finite type of the following form.
Definition 2.6.1. An N × N matrix A is called a transition matrix (also called incidence matrix ) if all entries Aij , 1 ≤ i, j ≤ N , are either 0 or 1.
One can use a transition matrix A to encode the information of which pairs of consecutive digits can appear in an itinerary: the digit i can be followed by the digit j if and only if the entry Aij is equal to 1. More formally, we can consider the following subspaces Σ+_A ⊂ Σ+_N and Σ_A ⊂ Σ_N of sequences:
Σ+_A = { (ai )_(i=0)^(+∞) ∈ Σ+_N ,  A_(ai ai+1) = 1 for all i ∈ N },
Σ_A = { (ai )_(i=−∞)^(+∞) ∈ Σ_N ,  A_(ai ai+1) = 1 for all i ∈ Z }.
For example, if A is the 2 × 2 transition matrix with A11 = A12 = A21 = 1 and A22 = 0, then, since the only zero entry is A22 = 0, the digit 2 cannot be followed by another digit 2, while all the other pairs of successive digits 12, 11 and 21 are allowed. Thus the sequences in Σ+_A (respectively Σ_A ) are all the sequences (respectively all the bi-sided sequences) in the digits 1, 2 without any pair of consecutive digits 2.
If (ai )_(i=0)^(+∞) ∈ Σ+_A , also the shifted sequence σ+ ((ai )_(i=0)^(+∞)) belongs to Σ+_A , since if A_(ai ai+1) = 1 for all i ∈ N, clearly also A_(ai+1 ai+2) = 1 for all i ∈ N (in other words, if a pair of consecutive digits does not occur in a, it clearly does not occur in the shifted sequence either). The same is true for bi-sided sequences: if (ai )_(i=−∞)^(+∞) ∈ Σ_A , also the shifted sequence σ((ai )_(i=−∞)^(+∞)) ∈ Σ_A . Thus, the spaces Σ+_A and Σ_A are invariant under the shift and we can consider the restrictions of σ+ and σ to these subspaces.
Definition 2.6.3. The restrictions of the shift maps
σ+ : Σ+_A → Σ+_A ,    σ : Σ_A → Σ_A ,
are called topological Markov chains13 (or also subshifts of finite type) associated to the matrix A.
These are special examples of subshifts, that is restrictions of the shift to closed invariant subspaces of Σ+_N (or Σ_N ). In a topological Markov chain, the only type of restriction on the sequences is of the form "i cannot be followed by j", and thus depends only on the previous digit14 .
It is very convenient to visualize sequences in Σ_A as paths on a graph.
13 In probability, one studies Markov chains, which consist of a topological Markov chain together with a measure. We will define a measure which is invariant under the shift in one of the next lectures.
14 More generally, one can define invariant spaces where certain combinations of digits, also called words in the digits, are not allowed (for example the forbidden words could be 2212 and 111, so that there can be occurrences of 11, but not of three consecutive digits 1). A subshift can be equivalently defined in terms of countably many forbidden words: no sequence in the subshift contains a forbidden word and any sequence in the complement does contain a forbidden word. If only a finite number of words are forbidden, we have a subshift of finite type. If the maximal length of forbidden words is k + 1, the subshift is called a k-step subshift of finite type. Thus, topological Markov chains are 1-step subshifts of finite type.
Definition 2.6.4. The graph GA associated to the N × N transition matrix A is the graph with vertices v1 , . . . , vN , in which vi is connected to vj by an arrow from vi to vj if and only if Aij = 1.
Then the following fact is immediate:
Lemma 2.6.1. A sequence (ai )_(i=0)^(+∞) ∈ Σ+_N belongs to Σ+_A if and only if it describes an infinite path on GA . Similarly a sequence (ai )_(i=−∞)^(+∞) ∈ Σ_N belongs to Σ_A if and only if it describes a bi-infinite path on GA .
Drawing the associated graphs, one obtains the graphs GA , GB and GC in Figure 2.6. A path on GA can never go through v2 and then immediately through v2 again. Since there are no infinite paths on GB , we see that Σ_B = ∅.
To avoid trivial cases like the above, where Σ_B = ∅, but also to guarantee interesting dynamical properties, as we will see in the next lecture, one can impose conditions on the matrix such as the following.
Definition 2.6.5. A transition matrix A is called irreducible if for any 1 ≤ i, j ≤ N there exists an n ∈ N (possibly dependent on i, j) such that the entry (An )ij of the matrix An , obtained by multiplying A by itself n times, is positive ((An )ij > 0).
A transition matrix A is called aperiodic (or also, in some books, transitive) if there exists an n ∈ N such that for all 1 ≤ i, j ≤ N we have (An )ij > 0.
A matrix A such that all entries Aij > 0 is called positive (and we write A > 0). Thus, A is aperiodic if
there exists a power n ∈ N such that An is positive.
Example 2.6.4. Consider for example the matrices A = [ 1 1 ; 1 0 ] and D = [ 1 1 ; 0 1 ] (rows listed between semicolons), for which
A^2 = [ 2 1 ; 1 1 ],    D^n = [ 1 n ; 0 1 ],      (2.10)
so A is irreducible and aperiodic (with n = 2) since all entries of A^2 are positive, while D is not irreducible, since for any n the entry (D^n )21 = 0.
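The computation in (2.10) is easy to reproduce with a few lines of code (a sketch added for illustration, not part of the original notes).

    # Sketch: verifying (2.10) and the irreducibility/aperiodicity claims by
    # computing matrix powers directly.
    import numpy as np

    A = np.array([[1, 1], [1, 0]])
    D = np.array([[1, 1], [0, 1]])
    print(np.linalg.matrix_power(A, 2))               # [[2 1] [1 1]]: all entries positive
    print(np.linalg.matrix_power(D, 5))               # [[1 5] [0 1]]: (D^n)_{21} stays 0
    print((np.linalg.matrix_power(A, 2) > 0).all())   # True: A is aperiodic (hence irreducible)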
Remark 2.6.1. Irreducible and aperiodic matrices can be easily recognized using the associated graph:
(1) A is irreducible if and only if for any two vertices vi and vj on GA there exists a path connecting vi to
vj ;
(2) A is aperiodic if and only if there exists an n such that any two vertices vi and vj on GA can be connected
by a path of the same length n.
The proof of this remark will be given in the next section.
Example 2.6.5. The graph GA in Figure 2.6 shows that A is irreducible, since all vertices can be connected: for example, v2 can be connected to itself going through v1 . Moreover, A is aperiodic with n = 2, since the path from v2 back to v2 through v1 has length two, and one can get paths of length 2 from v1 to itself, from v2 to v1 and from v1 to v2 by repeating the loop around v1 . The graph GB in Figure 2.6 shows that B is not irreducible, since there are no paths connecting, for example, v1 to itself (nor paths connecting v2 to v1 or v2 to itself). The graph GC in Figure 2.6 shows that C is irreducible, since one sees immediately that all vertices can be connected to each other.
Draw the corresponding graphs GAi , i = 1, 2, associated to them. For each i = 1, 2, is Ai irreducible? Is Ai aperiodic?
d+_ρ (x, y) = Σ_(k=0)^∞ |xk − yk |/ρ^k ,    where x = (xk )_(k=0)^∞ , y = (yk )_(k=0)^∞ .      (2.13)
If two points belong to the same cylinder, they share a common central block of digits. Thus, it is clear that the distances in (2.12) and (2.13) are small. More is true: if ρ is chosen sufficiently large, then symmetric cylinders are exactly balls with respect to the distance dρ .
Lemma 2.7.1. If ρ > 2N − 1, then for any ε = 1/ρ^n we have
C_(−n,n) (x−n , . . . , xn ) = B_dρ ( x, 1/ρ^n ),
where x = (xi )_(i=−∞)^∞ ∈ Σ_N is any sequence which contains the central block x−n , . . . , xn .
Proof. Let C_(−n,n) (x−n , . . . , xn ) be a symmetric cylinder in Σ_N . Since x = (xi )_(i=−∞)^∞ ∈ Σ_N contains the central block x−n , . . . , xn , the point x ∈ C_(−n,n) (x−n , . . . , xn ).
If also y ∈ C_(−n,n) (x−n , . . . , xn ), then, since |xk − yk | = 0 for all k ∈ Z with |k| ≤ n, we have
dρ (x, y) = Σ_(k=−∞)^(−n−1) |xk − yk |/ρ^|k| + Σ_(k=n+1)^∞ |xk − yk |/ρ^|k| .
Note that, since both xi , yi ∈ {1, . . . , N }, for any i we have |xi − yi | ≤ N − 1. Thus, using also the formula for the sum of the geometric progression, which gives us
Σ_(j=0)^∞ 1/ρ^j = 1/(1 − 1/ρ) = ρ/(ρ − 1),
we get
dρ (x, y) ≤ 2 Σ_(k=n+1)^∞ (N − 1)/ρ^k ≤ ( 2(N − 1)/ρ^(n+1) ) Σ_(j=0)^∞ 1/ρ^j = ( 2(N − 1)/ρ^(n+1) ) ( ρ/(ρ − 1) ) = (1/ρ^n ) ( 2(N − 1)/(ρ − 1) ).
Thus, since
(1/ρ^n ) ( 2(N − 1)/(ρ − 1) ) < 1/ρ^n   ⇔   2(N − 1)/(ρ − 1) < 1   ⇔   ρ > 2N − 1,
if ρ > 2N − 1, we have
dρ (x, y) < 1/ρ^n ,   that is   y ∈ B_dρ ( x, 1/ρ^n ).
This proves that if ρ > 2N − 1 we have the inclusion
C_(−n,n) (x−n , . . . , xn ) ⊂ B_dρ ( x, 1/ρ^n ).
Let us check the reverse inclusion. Assume that y ∈ B_{d_ρ}(x, 1/ρ^n). If, by contradiction, y ∉ C_{-n,n}(x_{-n}, . . . , x_n),
there exists j ∈ Z with |j| ≤ n such that x_j ≠ y_j, so that |x_j − y_j| ≥ 1. But then
$$d_\rho(x, y) = \sum_{k=-\infty}^{\infty} \frac{|x_k - y_k|}{\rho^{|k|}} \ge \frac{|x_j - y_j|}{\rho^{|j|}} \ge \frac{1}{\rho^{|j|}} \ge \frac{1}{\rho^{n}}, \qquad \text{since } 0 \le |j| \le n.$$
Thus, y ∉ B_{d_ρ}(x, 1/ρ^n), which is a contradiction. Thus, we also have (without any additional assumption on
ρ) the opposite inclusion
$$B_{d_\rho}\left(x, \frac{1}{\rho^{n}}\right) \subset C_{-n,n}(x_{-n}, \dots, x_n).$$
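To get a concrete feeling for Lemma 2.7.1, one can approximate d_ρ numerically for two sequences which agree on a central block. In the sketch below (Python; the truncation length K and the particular sequences are chosen only for illustration) the two sequences share the digits with |k| ≤ n and differ as much as possible outside, and their distance still falls below 1/ρ^n as soon as ρ > 2N − 1:

    import random

    def d_rho(x, y, rho, K=60):
        """Truncated value of d_rho for two-sided sequences given as functions k -> digit."""
        return sum(abs(x(k) - y(k)) / rho ** abs(k) for k in range(-K, K + 1))

    N, n = 3, 4
    rho = 2 * N                                        # any rho > 2N - 1 = 5 will do
    random.seed(0)
    central = {k: random.randint(1, N) for k in range(-n, n + 1)}

    x = lambda k: central[k] if abs(k) <= n else 1     # same central block,
    y = lambda k: central[k] if abs(k) <= n else N     # maximally different tails

    print(d_rho(x, y, rho) < 1 / rho ** n)             # True: y is in the ball of radius 1/rho^n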
Exercise 2.7.1. Find a condition on ρ which guarantees that for any ε = 1/ρ^n we have
$$C_{0,n}(x_0, \dots, x_n) = B_{d^+_\rho}\left(x, \frac{1}{\rho^{n}}\right).$$
Consider now a subshift Σ_A ⊂ Σ_N determined by the transition matrix A (or the one-sided subshift
Σ_A^+ ⊂ Σ_N^+).
Remark 2.7.1. If A is irreducible, a cylinder is admissible if and only if it is not empty. Indeed, the condition
A_{a_i, a_{i+1}} = 1 for all −m ≤ i < n guarantees that there is a path on G_A described by a_{-m}, . . . , a_n (that is,
passing in order through the vertices v_{a_{-m}}, . . . , v_{a_n}) and, since A is irreducible, one can continue this path to
a bi-infinite path in G_A (adding any admissible forward tail starting from a_n and any admissible backward tail
before a_{-m}). This path belongs to the cylinder and shows that it is not empty.
where in the latter equation we simply used the definition of the (i, j) entry of the product matrix A^n A.
The Lemma has the following immediate Corollary on the number of periodic points of a topological
Markov chain.
Corollary 2.7.1. The cardinality of the set of periodic points of period n for σ : Σ_A → Σ_A is exactly the trace Tr(A^n).
(Recall that the trace of a matrix, Tr(A) = \sum_i A_{ii}, is the sum of the diagonal entries of A.)
Proof. If x is a periodic point of period n for σ : Σ_A → Σ_A, then σ^n(x) = x, which implies that the digits of
the sequence x = (x_i)_{i=-\infty}^{+\infty} have period n, that is x_{n+i} = x_i for all i ∈ Z. Thus, the path described by x on
G_A is a periodic path, which repeats periodically a path of length n starting from some v_i and coming back to the same
v_i. Since the number of paths of length n connecting v_i to v_i is A^n_{ii} by Lemma 2.7.2, summing over all possible starting vertices we have
$$\mathrm{Card}\{x \in \Sigma_A \ \text{such that}\ \sigma^n(x) = x\} = \sum_{i=1}^{N} A^n_{ii} = \mathrm{Tr}(A^n).$$
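Corollary 2.7.1 can be verified directly for small examples: periodic points of period n correspond to words of length n which are admissible when read cyclically, and their number should equal Tr(A^n). A brute-force sketch (Python with numpy; only sensible for very small n and N) for the matrix A of Example 2.6.4:

    import itertools
    import numpy as np

    A = np.array([[1, 1], [1, 0]])

    def count_periodic(A, n):
        """Number of words (x_0, ..., x_{n-1}) admissible when read cyclically,
        i.e. the periodic points of period n of the subshift on Sigma_A."""
        N = A.shape[0]
        return sum(
            all(A[w[i], w[(i + 1) % n]] == 1 for i in range(n))
            for w in itertools.product(range(N), repeat=n)
        )

    for n in range(1, 8):
        print(n, count_periodic(A, n), int(np.trace(np.linalg.matrix_power(A, n))))
    # the last two columns coincide: 1, 3, 4, 7, 11, 18, 29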
Lemma 2.7.3. If A is a transition matrix such that A^n > 0 for some n ∈ N, then A^m > 0 for all m ≥ n.
Proof. Remark first that if A^n > 0 for some n > 0, then for each j there exists a k_j such that A_{k_j j} = 1.
Otherwise, if A_{kj} = 0 for all 1 ≤ k ≤ N, then the vertex v_j cannot be reached from any other
vertex v_k, so there cannot exist any path of length n reaching v_j, in contradiction with the fact that A^n_{ij} > 0.
Let us now prove by induction on m that A^m > 0 for all m ≥ n. For m = n it is true by assumption. Assume now
that it holds for m and take any 1 ≤ i, j ≤ N. By the first remark, there exists k_j such that A_{k_j j} = 1;
moreover, for all the other k we have A_{kj} ≥ 0. Hence, we get
$$A^{m+1}_{ij} = \sum_{k=1}^{N} A^m_{ik} A_{kj} \ge A^m_{i k_j} A_{k_j j} = A^m_{i k_j},$$
and A^m_{i k_j} > 0 since A^m > 0 by the inductive assumption. This shows that A^{m+1} > 0 and concludes the proof.
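The content of this lemma is also easy to observe numerically (of course this is only a check, not a proof): once one power of A is positive, all higher powers remain positive. A small verification (Python with numpy) for the matrix A of Example 2.6.4, for which A^2 > 0:

    import numpy as np

    A = np.array([[1, 1], [1, 0]])
    P = np.linalg.matrix_power(A, 2)            # A^2 > 0
    for m in range(2, 30):
        assert (P > 0).all(), f"A^{m} has a zero entry"
        P = P @ A
    print("A^m > 0 for m = 2, ..., 29")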
Proof of Theorem 2.7.1. Let us prove (1). Assume that A is irreducible. We want to show that for each pair U, V of
non-empty open sets there exists M > 0 such that σ^M(U) ∩ V ≠ ∅. Fix ρ > 2N − 1. Since each open set
contains an open ball of the form B_{d_ρ}(x, ρ^{-k}) for some large k and, by Lemma 2.7.1, each
ball of this form is a symmetric cylinder, there exist two admissible symmetric cylinders
$$C_{-k,k}(a_{-k}, \dots, a_k) \subset U, \qquad C_{-l,l}(b_{-l}, \dots, b_l) \subset V. \qquad (2.14)$$
Let us now construct a point x which contains both blocks of digits a_{-k}, . . . , a_k and b_{-l}, . . . , b_l. By definition
of irreducibility, taking i = a_k and j = b_{-l}, there exists n > 0 such that A^n_{a_k, b_{-l}} > 0. This means that there
exists a path of length n which connects v_{a_k} to v_{b_{-l}}. Let us denote by
$$y_0 = a_k,\ y_1,\ y_2,\ \dots,\ y_{n-1},\ y_n = b_{-l},$$
the digits which describe this path. Clearly A_{y_i y_{i+1}} = 1 for all 0 ≤ i ≤ n − 1. Consider a point x ∈ Σ_A such
that
$$x = \dots\, a_{-k}, \dots, \underbrace{a_0}_{i=0}, \dots, a_k,\ y_1, \dots, y_{n-1},\ b_{-l}, \dots, b_l, \dots$$
(such a point exists since, by irreducibility, we can choose a backward and a forward tail, by choosing any path on
G_A which starts from b_l, for the forward tail, or ends in a_{-k}, for the backward tail). Clearly, since x contains
as central block of digits a_{-k}, . . . , a_k, we have x ∈ C_{-k,k}(a_{-k}, . . . , a_k) ⊂ U. Moreover, if we set M = n + k + l,
shifting the sequence k + n + l times to the left, since x_{k+n+l} = b_0, we get
$$\sigma^M(x) = \dots\, b_{-l}, \dots, \underbrace{b_0}_{i=0}, \dots, b_l, \dots,$$
so that σ^M(x) ∈ C_{-l,l}(b_{-l}, . . . , b_l) ⊂ V. Thus
$$x \in U \cap \sigma^{-M}(V) \neq \emptyset \iff \sigma^M(U) \cap V \neq \emptyset.$$
Conversely, assume that σ : Σ_A → Σ_A is topologically transitive and fix 1 ≤ i, j ≤ N. Take U and V to be the open cylinders of sequences in Σ_A whose 0th digit is i and j respectively. By transitivity there exist n > 0 and x ∈ U with σ^n(x) ∈ V. Since (σ^n(x))_0 = x_n, we have σ^n(x) ∈ V ⇔ x_n = j.
Thus, we found an element x ∈ Σ_A, which by definition describes a bi-infinite path on G_A, such that x_0 = i and
x_n = j. This gives a path of length n connecting v_i to v_j, showing that A^n_{ij} > 0. Thus A is irreducible.
Let us now prove (2). Assume that A^n > 0. We want to show that σ : Σ_A → Σ_A is topologically mixing.
Let U, V be non-empty open sets. We seek M_0 such that for any M ≥ M_0 we have σ^M(U) ∩ V ≠ ∅. We
can reason very similarly to part (1). Both U, V contain admissible symmetric cylinders of the form (2.14).
Let M_0 = n + k + l. If M ≥ M_0, then M = m + k + l with m ≥ n. Then also A^m > 0 by Lemma 2.7.3, so
A^m_{a_k, b_{-l}} > 0. Thus, there exists a path of length m from v_{a_k} to v_{b_{-l}}, so we can construct a point in Σ_A of
the form
$$x = \dots\, a_{-k}, \dots, \underbrace{a_0}_{i=0}, \dots, a_k,\ y_1, \dots, y_{m-1},\ b_{-l}, \dots, b_l, \dots$$
Reasoning as in part (1), x ∈ U ∩ σ^{-M}(V), so that σ^M(U) ∩ V ≠ ∅. This can be repeated for any M ≥ M_0,
showing that σ : Σ_A → Σ_A is topologically mixing.
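The constructive step used twice in this proof, producing the digits y_1, . . . , y_{m−1} of a path on G_A which joins two prescribed symbols, can be made completely explicit. A possible sketch (Python with numpy; the function name and the reconstruction strategy are our own, and other methods such as a breadth-first search would work equally well) finds the smallest n with (A^n)_{ij} > 0 and then recovers a path of that length digit by digit:

    import numpy as np

    def connecting_block(A, i, j, max_len=None):
        """Digits y_0 = i, y_1, ..., y_n = j of a path on G_A, or None if there is none.
        Uses that (A^n)_{ij} = sum_k A_{ik} (A^{n-1})_{kj} to rebuild the path greedily."""
        A = np.array(A)
        N = A.shape[0]
        if max_len is None:
            max_len = N                       # for an irreducible matrix this is enough
        powers = [np.eye(N, dtype=int), A]
        for _ in range(2, max_len + 1):
            powers.append(powers[-1] @ A)
        for n in range(1, max_len + 1):
            if powers[n][i, j] > 0:
                path, v = [i], i
                for step in range(n, 0, -1):
                    v = next(k for k in range(N)
                             if A[v, k] == 1 and powers[step - 1][k, j] > 0)
                    path.append(v)
                return path
        return None

    A = [[1, 1], [1, 0]]
    print(connecting_block(A, 1, 1))          # [1, 0, 1]: the loop v_2 -> v_1 -> v_2 (0-based labels)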
Theorem 2.7.2 (Entropy of topological Markov chains). Let A be an N × N aperiodic transition matrix
and σ : Σ_A → Σ_A be the associated topological Markov chain. The following limit exists and the topological
entropy is given by
$$h_{top}(\sigma) = \lim_{n\to\infty} \frac{\log \|A^n\|}{n} = \log |\lambda^{A}_{max}|, \qquad (2.15)$$
where ‖A^n‖ = \sum_{i,j} (A^n)_{ij} and |λ^{A}_{max}| is the maximum modulus of the eigenvalues of A.
Combining the Remark and the Theorem we have the following Corollary, which is more useful for the
computation of entropy:
Corollary 2.7.2. The topological entropy of σ : Σ_A → Σ_A, where A is irreducible, is given by
$$h_{top}(\sigma) = \log |\lambda^{A}_{max}|,$$
where |λ^{A}_{max}| is the maximum modulus of the eigenvalues of A.
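As a concrete illustration, both expressions can be evaluated numerically for the matrix A of Example 2.6.4, whose topological Markov chain is the golden mean shift. In the sketch below (Python with numpy; not part of the notes) ‖A^n‖ denotes, as in the proof below, the sum of all the entries of A^n:

    import numpy as np

    A = np.array([[1, 1], [1, 0]], dtype=float)

    for n in (10, 30, 60):
        norm = np.linalg.matrix_power(A, n).sum()       # ||A^n|| = sum of all entries
        print(n, np.log(norm) / n)                      # approaches log((1+sqrt(5))/2) ≈ 0.4812

    lam_max = max(abs(np.linalg.eigvals(A)))            # the golden mean (1+sqrt(5))/2
    print(np.log(lam_max))                              # 0.4812...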
In the proof of the Theorem, we will use the following Lemma, whose proof is omitted.
Lemma 2.7.4. Let A be a matrix with non-negative entries. Then, for any k > 0, the following limits exist
and we have
$$\lim_{n\to\infty} \frac{\log \|A^{n+k}\|}{n} = \lim_{n\to\infty} \frac{\log \|A^{n}\|}{n}.$$
Proof of Theorem 2.7.2. Let ρ > 2N − 1 and consider the distance d = d_ρ, so that by Lemma 2.7.1 balls of
radius 1/ρ^n in Σ_N are exactly symmetric cylinders of the form C_{-n,n}(a_{-n}, . . . , a_n). Let ε = 1/ρ^k. Consider
the collection U_n of cylinders which are admissible and of the following form:
$$C_{-k,\, n-1+k}(a_{-k}, \dots, a_{n-1+k}). \qquad (2.16)$$
Since (a_{-k}, . . . , a_{n-1+k}) varies over all possible admissible values, U_n is a cover: for any x ∈ Σ_A, x ∈
C_{-k,n-1+k}(x_{-k}, . . . , x_{n-1+k}) ∈ U_n. Moreover, since cylinders are open balls, U_n is an open cover. Let us
check that the d_n-diameter of each C ∈ U_n is less than ε.
Let C = C_{-k,n-1+k}(a_{-k}, . . . , a_{n-1+k}). If x, y ∈ C, by definition x_i = y_i = a_i for all −k ≤ i ≤ n − 1 + k. Thus,
for each 0 ≤ j ≤ n − 1, the shifted sequences σ^j(x) and σ^j(y) have the same digits in all positions −k ≤ i ≤ k,
so that both σ^j(x) and σ^j(y) belong to the same symmetric cylinder, which is a ball of radius 1/ρ^k. Hence
d(σ^j(x), σ^j(y)) < 1/ρ^k = ε for all 0 ≤ j ≤ n − 1, so that d_n(x, y) < ε. Thus Cov(n, ε) ≤ Card(U_n).
The cardinality of Un is the number of admissible cylinders of the form (2.16), thus the number of admissible
paths of length n + 2k from any vertex vi to any other vertex vj . Thus, by Lemma 2.7.2, we have
$$\mathrm{Card}(\mathcal{U}_n) = \sum_{i,j=1}^{N} \sharp\{\text{paths of length } n + 2k \text{ from } v_i \text{ to } v_j\} = \sum_{i,j=1}^{N} A^{n+2k}_{ij}.$$
Moreover, one can see that U_n is a minimal cover with sets of d_n-diameter less than ε. Note that all cylinders
in U_n are disjoint balls in the d_n-metric. A cover has in particular to cover a point in each cylinder C_i ∈ U_n
with some open set U_i, but since diam_{d_n}(U_i) ≤ ε, U_i cannot contain any point outside C_i. Thus, the
cardinality of any cover with open sets of d_n-diameter less than ε is at least the cardinality of U_n. Thus
$$\mathrm{Cov}(n, \varepsilon) = \mathrm{Card}(\mathcal{U}_n) = \sum_{i,j=1}^{N} A^{n+2k}_{ij} = \|A^{n+2k}\|.$$
Using the definition of entropy via covers and Lemma 2.7.4, we get
$$\lim_{n\to\infty} \frac{\log \mathrm{Cov}(n, \varepsilon)}{n} = \lim_{n\to\infty} \frac{\log \|A^{n+2k}\|}{n} = \lim_{n\to\infty} \frac{\log \|A^{n}\|}{n}.$$
Since the right hand side does not depend on ε = 1/ρ^k, letting ε → 0 gives h_{top}(σ) = lim_{n→∞} (log ‖A^n‖)/n, which proves the first equality in (2.15).
Let v_1 = ((1 + √5)/2, 1) and v_2 = ((1 − √5)/2, 1) be the corresponding eigenvectors, which are orthogonal since their scalar product is
$$\langle v_1, v_2 \rangle = \frac{1 + \sqrt{5}}{2} \cdot \frac{1 - \sqrt{5}}{2} + 1 \cdot 1 = \frac{1 - 5}{4} + 1 = -1 + 1 = 0.$$
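These computations are easy to confirm numerically. In the sketch below (Python with numpy) we take A = [[2, 1], [1, 1]], which we assume to be the cat map matrix used earlier in the notes; its eigenvectors, rescaled so that the second coordinate is 1, come out as ((1 ± √5)/2, 1) and are orthogonal:

    import numpy as np

    A = np.array([[2, 1], [1, 1]])              # assumed cat-map matrix

    eigvals, eigvecs = np.linalg.eig(A)
    print(eigvals)                              # (3 + sqrt(5))/2 ≈ 2.618 and (3 - sqrt(5))/2 ≈ 0.382

    v1 = eigvecs[:, 0] / eigvecs[1, 0]          # rescale so the second coordinate is 1
    v2 = eigvecs[:, 1] / eigvecs[1, 1]
    print(v1, v2)                               # first coordinates ≈ 1.618 and ≈ -0.618 (in some order)
    print(np.dot(v1, v2))                       # ≈ 0: the eigendirections are orthogonal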
In Figure 2.7 you can see a partition of T2 into two sets, R1 and R2 (R1 is the union of the white pieces
and R2 is the union of the dark pieces).
Each of them is a rectangle on the surface of T², whose sides are parallel to the orthogonal directions v_1
and v_2. In order to see that, it is convenient to cut and paste the triangles as in Figure 2.7 (sets filled with the
same shade are translations of each other). When you cut and paste a triangle by moving it by an integer
vector (k, l) ∈ Z², you get a different set in R², which represents the same set on the torus T² (recall that
T² = R²/Z² is the set of equivalence classes of points in R², where (x, y) ∼ (x′, y′) if (x′, y′) = (x, y) + (k, l)).
Thus, if instead of the standard copy which lies in the unit square [0, 1)², we use the cut and pasted copies
translated as in Figure 2.7, we can see that both R1 and R2 are rectangles.
[Instead of representing T² as a unit square with opposite sides identified, we could choose to represent
T² as the union of the two copies of the rectangles R1 and R2, where again opposite sides are identified.]
We could try to use the partition P = {R1 , R2 } of T2 to code fA . This partition has the nice property
that rectangles are mapped to rectangles. Indeed, since the sides of R1 and R2 are parallel to eigenvectors,
the images of R1 and R2 under A still have sides parallel to v_1 and v_2, that is, they are still rectangles.
Let us describe the images fA (R1 ), fA (R2 ) under the cat map, referring to Figure 2.8.
Recall that f_A is obtained by first acting linearly by A and then taking the result modulo one, which
corresponds to projecting R² to T² by the projection π : R² → T² given by
$$\pi(x, y) = (x \bmod 1,\ y \bmod 1).$$
Since the sides of R1 and R2 are parallel to the eigenvectors, the images of R1 and R2 are still rectangles, but
since all directions parallel to v_1 are expanded by a factor λ_1 and all directions parallel to v_2 are contracted
by a factor λ_2, the image rectangles under A are thinner and longer, as shown in Figure 2.8 (the image of R1
under A is the lighter shade rectangle, the image of R2 under A is the darker shade rectangle). The projection
π consists of cutting and pasting corresponding pieces of these rectangles back to the unit square, as shown
again in Figure 2.8 (sets with the same name are translated copies of each other).
Note that the images of R1 and R2 under A cross different parts of the translated copies of each original
rectangle R1, R2. In order to describe these intersections precisely, let us write
$$R_1 = P_1 \cup P_2 \cup P_3, \qquad R_2 = P_4 \cup P_5,$$
where P_1, . . . , P_5 are the sets in Figure 2.8 (note that we give the same name P_i to all the sets in R² that represent
a translated copy by an integer vector of P_i ⊂ [0, 1)², since they correspond to the same set on T²).
Looking at Figure 2.8, you can see that
$$f_A(R_1) = P_1 \cup P_3 \cup P_4, \qquad f_A(R_2) = P_2 \cup P_5. \qquad (2.17)$$
Since the image of R1 crosses R1 more than once, it is not a good idea to use the partition P = {R1, R2} of T² to code
f_A. Let us use instead the finer partition
$$\mathcal{P} = \{P_1, P_2, P_3, P_4, P_5\}, \qquad \mathbb{T}^2 = R_1 \cup R_2 = (P_1 \cup P_2 \cup P_3) \cup (P_4 \cup P_5) = \bigcup_{k=1}^{5} P_k.$$
It is clear that if we code the orbit of the point (x, y) using a sequence (a_i)_{i∈Z} ∈ Σ_5 = {1, . . . , 5}^Z in the usual
way, so that
$$f_A^k((x, y)) \in P_{a_k}, \qquad \text{for all } k \in \mathbb{Z},$$
not all sequences of Σ_5 will describe itineraries, because of the relations (2.17). For example, if a_k = 1, it
means that f_A^k((x, y)) ∈ P_1 ⊂ R_1. Thus, f_A^{k+1}((x, y)) ∈ f_A(R_1) = P_1 ∪ P_3 ∪ P_4, so a_{k+1} can only be 1, 3 or 4.
Reasoning in a similar way, since if x ∈ P_2 ⊂ R_1 or x ∈ P_3 ⊂ R_1 then f_A(x) ∈ f_A(R_1) = P_1 ∪ P_3 ∪ P_4, the digits
2, 3 can be followed only by 1, 3 or 4. If x ∈ P_4 ⊂ R_2 or x ∈ P_5 ⊂ R_2, then f_A(x) ∈ f_A(R_2) = P_2 ∪ P_5, so we see that
4, 5 can be followed only by 2 or 5.
Let us encode this information in a 5 × 5 transition matrix B, by setting B_{ij} = 1 if and only if the digit i can be
followed by the digit j, and 0 otherwise. We get
$$B = \begin{pmatrix} 1 & 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 1 \end{pmatrix}. \qquad (2.18)$$
These considerations lead to the following:
Remark 2.8.1. Full itineraries of orbits O^+_{f_A}(x) of the cat map f_A : T² → T² belong to the shift space Σ_B.
Conversely, all the transitions that we described can actually occur. Thus, finite sequences in Σ_B all
represent possible itineraries of orbits of the cat map with respect to the partition P. Moreover, by a general
property of the coding map, the image of a point under f_A is coded by the shifted sequence in Σ_B. Thus, one
can use the shift space Σ_B to construct a (semi-)conjugacy of the cat map with the topological Markov chain
σ : Σ_B → Σ_B. More details can be found in the references quoted in the Extra.
Finding a good partition for coding is not always easy (as in this case), but once one has constructed
a semi-conjugacy with a shift space, it is much easier to prove some dynamical properties, as for example
topological transitivity (which, for a subshift given by a transition matrix, follows simply by verifying that
the matrix is irreducible). See for example the exercise Ex. 2.8.1 below.
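For instance, for the matrix B of (2.18) irreducibility (and in fact aperiodicity) is a finite computation, which by the previous paragraph yields topological transitivity of σ : Σ_B → Σ_B; the largest eigenvalue then gives the entropy of this subshift as in Corollary 2.7.2. A self-contained sketch (Python with numpy; not part of the notes):

    import numpy as np

    B = np.array([[1, 0, 1, 1, 0],
                  [1, 0, 1, 1, 0],
                  [1, 0, 1, 1, 0],
                  [0, 1, 0, 0, 1],
                  [0, 1, 0, 0, 1]])

    N = B.shape[0]
    powers = [np.linalg.matrix_power(B, n) for n in range(1, N + 1)]
    print((sum(powers) > 0).all())                    # True: B is irreducible
    print((powers[1] > 0).all())                      # True: already B^2 > 0, so B is aperiodic

    print(np.log(max(abs(np.linalg.eigvals(B)))))     # ≈ 0.9624 = log((3 + sqrt(5))/2)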
Remark 2.8.2. The partition used to code the cat map is an example of a Markov partition. Markov partitions
can be constructed more generally for hyperbolic dynamical systems, that is, systems that have contracting
and expanding directions. Coding via a Markov partition allows one to reduce the study of the dynamical system
to a symbolic space. This is often a powerful technique to prove dynamical properties of the original system.
In particular, Markov partitions can be constructed for all hyperbolic toral automorphisms similarly to the
example we saw for the cat map. We give another example as an Exercise.
* Exercise 2.8.1. Check that the toral automorphism fA : T2 → T2 given by the matrix
$$A = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$$
is hyperbolic. Figure 2.9 shows a partition of T2 into rectangles and their images under fA .
(a) Using itineraries with respect to the partition {R1, R2, R3} to code a point, find a transition matrix A
such that the shift space Σ_A describes all possible itineraries coding orbits O^+_{f_A}((x, y)).
Exercise .0.2. If (X, d) is a metric space, the collection T of all sets which are open in the metric space
according to Definition 2.1.3, that is all the sets U ⊂ X such that for each x ∈ U there exists ε > 0 such
that B_d(x, ε) ⊂ U, forms a topology. Indeed, X and ∅ satisfy Definition 2.1.3 trivially and hence belong to
T, proving (T1). The second property (T2) follows from Lemma 2.1.1. Thus, the collection of open sets in a metric
space gives a topology to the metric space X.
Example .0.2. [Trivial topology] Consider a space X and let Ttr = {∅, X}. One can check that Ttr satisfies
(T 1), (T 2), (T 3). This topology is known as trivial topology. Thus, (X, Ttr ) is a topological space.
Example .0.3. [Point topology] Consider a space X and let Tpt = P(X) be the collection of all subsets of
X. One can check that also Tpt satisfies (T 1), (T 2), (T 3). This topology is known as point topology. Thus,
(X, Tpt ) is a topological space.
In a topological space one can define the notion of convergence or density in the same way we did with
metric spaces, just using open sets instead of balls:
Definition .0.3. A sequence {xn }n∈N ⊂ X converges to x and we write limn→∞ xn = x if for any open set
U containing x there exists N > 0 such that xn ∈ U for all n ≥ N .
Similarly, one can define what it means for a function to be continuous, taking as definition of continuity
the equivalent characterization given by Lemma 2.1.2.
Lemma .0.1. A function f : X → Y between two topological spaces (X, TX ) and (Y, TY ) is continuous if
and only if for each open set V ∈ TY the preimage f −1 (V ) is an open set of X, that is f −1 (V ) ∈ TX .
Finally, the notion of compactness via covers can be defined in any topological space:
15 The notation P(X) denotes the parts of X, that is the collection of all subsets of X.
Definition .0.5. Let (X, T_X) be a topological space. A subset K ⊂ X is compact by covers if for any open
cover {U_α}_α of K there exists a finite subcover {U_{α_1}, U_{α_2}, . . . , U_{α_N}} ⊂ {U_α}_α such that K ⊂ ∪_{i=1}^{N} U_{α_i}.
In the next sections we will define, in the context of metric spaces, dynamical properties such as topological
transitivity, topological minimality and topological mixing. All these properties can be defined more generally
for topological spaces. This is why they are called topological properties and why we talk of topological
dynamics.
Let U be a cover with open sets of d_n-diameter less than ε and Card(U) = Cov(n, ε), and let V be a cover
with open sets of d_m-diameter less than ε and Card(V) = Cov(m, ε). If U ∈ U and V ∈ V, then the set
$$U \cap f^{-n}(V)$$
is open and has d_{n+m}-diameter less than ε: if x, y ∈ U ∩ f^{-n}(V), then max_{0≤i≤n−1} d(f^i(x), f^i(y)) ≤ ε since x, y ∈ U, and, since f^n(x)
and f^n(y) are both in V, also
$$\max_{0 \le i \le m-1} d(f^{n+i}(x), f^{n+i}(y)) \le \varepsilon,$$
so that d_{n+m}(x, y) ≤ ε. The sets of this form cover X, which shows that
$$\mathrm{Cov}(n + m, \varepsilon) \le \mathrm{Cov}(n, \varepsilon)\, \mathrm{Cov}(m, \varepsilon). \qquad (19)$$
If we consider now the sequence a_n = log Cov(n, ε), property (19) becomes
$$a_{n+m} = \log \mathrm{Cov}(n + m, \varepsilon) \le \log \big(\mathrm{Cov}(n, \varepsilon)\, \mathrm{Cov}(m, \varepsilon)\big) = \log \mathrm{Cov}(n, \varepsilon) + \log \mathrm{Cov}(m, \varepsilon) = a_n + a_m$$
for any n, m ∈ N. Thus (a_n) is subadditive and, by Exercise .0.4, the limit
$$\lim_{n\to\infty} \frac{\log \mathrm{Cov}(n, \varepsilon)}{n}$$
exists.
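The subadditivity argument used here is Fekete's lemma (the content of Exercise .0.4): for a subadditive sequence (a_n), the quotients a_n/n converge to inf_n a_n/n. A tiny numerical illustration (Python; the sequence below is an arbitrary subadditive example, unrelated to any particular dynamical system):

    import math

    C, lam = 3.0, 1.618                      # a_n = log(C * lam^n) is subadditive since C >= 1

    def a(n):
        return math.log(C * lam ** n)

    assert all(a(n + m) <= a(n) + a(m) for n in range(1, 30) for m in range(1, 30))
    for n in (1, 10, 100, 1000):
        print(n, a(n) / n)                   # decreases towards log(lam) ≈ 0.4813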