MATH37012 Course Notes: Discrete Time: Dr Jonathan Bagley



Semester 2 - 2020-2021

§0 Introduction

A stochastic (random) process is a family of random variables, {Xt : t ∈ T}, where T is some set. The index t often represents time. For example,

a) T = R+ , giving {Xt : t ≥ 0}

b) T = Z + , giving {X0 , X1 , ...}

If T is countable, as in b), the process is called discrete time.

The union of the ranges of the Xt is called the state space, denoted by S.

The simplest example of a stochastic process is X0 , X1 , ..., where the random variables are
independent. However, this is not interesting or useful.

Generally, there is some sort of dependence of Xn on Xn−1 , Xn−2 , .... However, the more
complicated the dependence, the less amenable to analysis is the process.

§1 Discrete time Markov chains

1.1 Definition

A discrete time random process is a Markov chain (M.C.) if it has the Markov property
(Andrei Markov, Russia, 1905).
For all n ≥ 0 and for all xi ∈ S such that both sides are defined,

P (Xn+1 = xn+1 |Xn = xn ) = P (Xn+1 = xn+1 |Xn = xn , Xn−1 = xn−1 , ..., X0 = x0 )

This is equivalent to (see problem sheet 1), “Given the present, the future is independent of
the past.”

1.2 Definition

The M.C. is homogeneous if, for all n ≥ 0 and i, j ∈ S

P (Xn+1 = j|Xn = i) = P (X1 = j|X0 = i)

All our M.C.s will be homogeneous.

1.3 Definition

The (one-step) transition probabilities are

pi,j = P (X1 = j|X0 = i)

The |S| × |S| matrix, P = (pi,j ), is called the transition matrix. We may have |S| = ∞.

1.4 Examples

a) S = {0, 1, 2, 3, 4, 5, 6}

1. When in state 0, roll a die. If it shows j, move to j.

2. When in state i ∉ {0, 6}, toss a coin. If heads, move to i + 1; if tails, move to i − 1.

3. When in state 6, toss a coin. If heads, move to 5; if tails, remain in 6.

The independence of outcomes of successive rolls and tosses guarantees the Markov property.

The transition matrix is

        ( 0    1/6  1/6  1/6  1/6  1/6  1/6 )
        ( 1/2   0   1/2   0    0    0    0  )
        (  0   1/2   0   1/2   0    0    0  )
    P = (  0    0   1/2   0   1/2   0    0  )
        (  0    0    0   1/2   0   1/2   0  )
        (  0    0    0    0   1/2   0   1/2 )
        (  0    0    0    0    0   1/2  1/2 )
Note all rows sum to 1; so that P is a stochastic matrix.

A transition diagram, showing the paths of positive probability (with or without transition probabilities), is often useful.

Correction to diagram. The upper arrow from state 0 to state 1 is redundant.
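As a quick numerical check, the matrix above can be assembled directly from the verbal rules and its rows verified to sum to 1. A minimal sketch assuming NumPy; the state labelling 0–6 is as in the example:

```python
import numpy as np

# Transition matrix of Example 1.4 a), built from the verbal rules:
# state 0 rolls a die; states 1-5 toss a fair coin; state 6 moves to 5 or stays.
P = np.zeros((7, 7))
P[0, 1:7] = 1 / 6                      # die roll from state 0
for i in range(1, 6):                  # fair coin in the interior
    P[i, i - 1] = P[i, i + 1] = 0.5
P[6, 5] = P[6, 6] = 0.5                # coin toss at the barrier 6

row_sums = P.sum(axis=1)               # every row of a stochastic matrix sums to 1
```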

b) A random walk (R.W.) S = {..., −2, −1, 0, 1, 2, ...}

For all i, define pi,i+1 = pi , pi,i−1 = 1 − pi .

If pi ≡ p, we have a simple random walk.

If p = 1/2, we have a simple symmetric random walk.

The assumption of independence of steps in the random walk guarantees the Markov property.

We have

        (  .       .        .       .       .       .      .  )
        (  .       0       p−2     0       0       0      .  )
        (  .    1 − p−1     0     p−1      0       0      .  )
    P = (  .       0     1 − p0    0      p0       0      .  )
        (  .       0        0    1 − p1    0      p1      .  )
        (  .       0        0      0    1 − p2     0      .  )
        (  .       .        .       .       .       .      .  )

where the row for each state i has 1 − pi immediately below the diagonal and pi immediately above it.
c) The Ehrenfest model of diffusion (1907)

Containers A and B contain m balls in total. At each time epoch, a ball is chosen at random
from the m and placed in the opposite container.

Let Xn denote the number of balls in container A just after the nth epoch. We see that
{Xn : n ≥ 0} is a M.C. with S = {0, 1, ..., m − 1, m} and

pi,i+1 = (m − i)/m ,   pi,i−1 = i/m ,   0 ≤ i ≤ m.

This is an example of a R.W. with reflecting barriers at 0 and m. See problem sheet 1
for another diffusion model.
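The Ehrenfest transition probabilities can be assembled and sanity-checked numerically. A minimal sketch assuming NumPy, with the total ball count m = 10 chosen purely for illustration:

```python
import numpy as np

m = 10                                  # total number of balls (illustrative choice)
P = np.zeros((m + 1, m + 1))            # states 0..m = number of balls in container A
for i in range(m + 1):
    if i < m:
        P[i, i + 1] = (m - i) / m       # the chosen ball was in B, so A gains one
    if i > 0:
        P[i, i - 1] = i / m             # the chosen ball was in A, so A loses one

row_sums = P.sum(axis=1)
```

Note the reflecting behaviour at the ends: from state 0 the chain moves to 1 with probability 1, and from state m to m − 1 with probability 1.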

To proceed further, we need the following.

1.5 Definition

For n ≥ 1, the n-step transition matrix Pn = (pi,j (n)) is the matrix of n-step transition
probabilities; so that, for any m ≥ 0, we have, by homogeneity

pi,j (n) = P (Xm+n = j|Xm = i)

Note that P1 = P .

1.6 Theorem

The Chapman-Kolmogorov equations.

For all i, j ∈ S and m, n ≥ 1,   pi,j (m + n) = Σ_{k∈S} pi,k (m) pk,j (n).

Proof

We have

pi,j (m + n) = P (Xm+n = j|X0 = i)

which, using Q3 on problem sheet 1,

= Σ_{k∈S} P (Xm+n = j, Xm = k|X0 = i).

Now use (Q2 on problem sheet 1) P (A ∩ B|E) = P (A|B ∩ E)P (B|E) to get

= Σ_{k∈S} P (Xm+n = j|Xm = k, X0 = i) P (Xm = k|X0 = i)

which, by the Markov property,

= Σ_{k∈S} P (Xm+n = j|Xm = k) P (Xm = k|X0 = i)

= Σ_{k∈S} pk,j (n) pi,k (m)

= Σ_{k∈S} pi,k (m) pk,j (n).

In matrix notation, Pm+n = Pm Pn .

In particular, Pn+1 = P Pn , which implies Pn = P n for all n ≥ 1.

At least in theory, this provides us with the distribution of Xn for any n ≥ 1 and any initial
state.

Now suppose the initial state may be random. Let µi (n) = P (Xn = i) and µ(n) denote the
row vector of the µi (n) ; so that µ(0) represents the initial distribution.

1.7 Lemma

µ(m+n) = µ(m) P n and hence µ(n) = µ(0) P n

Proof

µj (m+n) = P (Xm+n = j) = Σ_{i∈S} P (Xm+n = j|Xm = i) P (Xm = i) = Σ_{i∈S} µi (m) pi,j (n).

In matrix notation,
µ(m+n) = µ(m) Pn = µ(m) P n
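Lemma 1.7 is easy to check numerically: propagating a distribution one step at a time must agree with a single matrix power. A sketch assuming NumPy; the 3-state matrix below is an arbitrary illustrative choice, not one from these notes:

```python
import numpy as np

P = np.array([[0.5, 0.5, 0.0],
              [0.25, 0.5, 0.25],
              [0.0, 0.5, 0.5]])              # an arbitrary stochastic matrix
mu = np.array([1.0, 0.0, 0.0])               # mu(0): start in state 0

n = 8
mu_stepwise = mu.copy()
for _ in range(n):                           # mu(k+1) = mu(k) P, repeated n times
    mu_stepwise = mu_stepwise @ P

mu_power = mu @ np.linalg.matrix_power(P, n)  # mu(n) = mu(0) P^n directly
```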

1.8 Examples

a) (Grimmett & Stirzaker) Three types of individuals: A×A, a×A, a×a. An individual
mates with itself. The offspring type is given by

A×A → A×A with probability 1

a×A → A×A with probability 1/4
    → a×A with probability 1/2
    → a×a with probability 1/4

a×a → a×a with probability 1

The sequence of individuals resulting from repeated self-mating constitutes a M.C. with
S = {AA, aA, aa} and

        ( 1    0    0  )
    P = ( 1/4  1/2  1/4 )
        ( 0    0    1  )

It can be verified that

         ( 1                     0         0                   )
    Pn = ( 1/2 − (1/2)^(n+1)    (1/2)^n    1/2 − (1/2)^(n+1)   )
         ( 0                     0         1                   )

b) Consider a M.C. with S = {0, 1} and transition matrix


 
    P = ( 1 − a     a   )     where 0 < a, b < 1.
        (   b     1 − b )

Using standard matrix theory (for example, Grimmett & Stirzaker, section 6.6), it can be shown that

    Pn (= P^n) = 1/(a + b) ( b  a ) + (1 − a − b)^n/(a + b) (  a  −a )
                           ( b  a )                         ( −b   b )
Note that |1 − a − b| < 1, so the second term tends exponentially fast to 0 as n → ∞.

We also observe that, for example, as n → ∞

p0,0 (n) → b/(a + b)   and   p1,0 (n) → b/(a + b).

The process “forgets” its initial state. More on this later.
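The closed form for P^n can be compared against a direct matrix power. A sketch assuming NumPy, with a = 0.3 and b = 0.2 chosen arbitrarily:

```python
import numpy as np

a, b = 0.3, 0.2                              # any 0 < a, b < 1
P = np.array([[1 - a, a],
              [b, 1 - b]])

def P_n_closed(n):
    """Closed form from Example 1.8 b)."""
    first = np.array([[b, a], [b, a]]) / (a + b)
    second = (1 - a - b) ** n * np.array([[a, -a], [-b, b]]) / (a + b)
    return first + second

n = 12
diff = np.abs(np.linalg.matrix_power(P, n) - P_n_closed(n)).max()
limit_row = np.array([b, a]) / (a + b)       # the limit "forgets" the initial state
```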

Note on 1.8b). Write (MATH10202)

P = V ΛV −1

where the columns of V are the right eigenvectors of P and the diagonal (the only non-zero)
entries of Λ are the eigenvalues. A stochastic matrix has largest eigenvalue 1 and the sum
of the two eigenvalues is the sum of the diagonal entries of P . Therefore the eigenvalues are
1 and 1 − a − b. Now observe that

P n = V ΛV −1 V ΛV −1 . . . V ΛV −1 = V Λn V −1

which leads to the expression in the example.

The probability of a M.C. ever returning to its initial state will prove important. Below, we
investigate.

1.9 Definition

A state i is called recurrent or persistent if

P (Xn = i for some n ≥ 1|X0 = i) = 1

That is, the probability of eventual return is 1.

If this probability is less than 1, the state is called transient.

Example

Assume the initial state is 0 and that all transition probabilities are equal to 1/2.

Let A denote the event, “never returns to 0” and Bn the event, “takes at least n steps round
the perimeter.”

Observe that, for all n ≥ 1, A ⊆ Bn . Consequently, referring to Q3 on problem sheet 2 for the right-hand equality,

0 ≤ P (A) ≤ P (Bn ) = (1/2)^n .
Letting n → ∞ gives P (A) = 0, which implies P (Ac ) = 1 and the recurrence of state 0.

The above M.C. has a very simple structure. For a general investigation, we need the following:

1.10 Definition

For n ≥ 1, fi,j (n) = P (Xn = j, Xk ≠ j ∀ 1 ≤ k ≤ n − 1 | X0 = i).

Therefore fi,j (n) is the probability that, with initial state i, the M.C. first visits state j at
the nth transition.

Now, partition the event, “ever visits j” according to the time of the first visit, to see
that

Σ_{n=1}^∞ fi,j (n), from here on denoted by fi,j ,

is the probability that, with initial state i, the M.C. ever visits j.

Importantly, fi,i = 1 if and only if i is recurrent.

Below we relate the fi,j to the pi,j (n).

1.11a) Theorem


State i is recurrent if and only if Σ_{n=1}^∞ pi,i (n) = ∞.

Consequently, i is transient if and only if the sum above is finite.

Proof

For all n ≥ 1,

pi,j (n) = P (Xn = j|X0 = i) = P (Xn = j, X0 = i) / P (X0 = i).

Now partition the event {Xn = j} ∩ {X0 = i} as to the time of the first visit to j, giving

pi,j (n) = Σ_{k=1}^n P (Xn = j, Xk = j, Xl ≠ j ∀ 1 ≤ l ≤ k − 1 | X0 = i)

which, using P (A ∩ B|E) = P (A|B ∩ E)P (B|E),

= Σ_{k=1}^n P (Xn = j | Xk = j, Xl ≠ j ∀ 1 ≤ l ≤ k−1, X0 = i) P (Xk = j, Xl ≠ j ∀ 1 ≤ l ≤ k−1 | X0 = i)

which, using the Markov property,

= Σ_{k=1}^n P (Xn = j|Xk = j) fi,j (k) = Σ_{k=1}^n pj,j (n − k) fi,j (k).

Summarising, we now have, for all n ≥ 1,

pi,j (n) = Σ_{k=1}^n pj,j (n − k) fi,j (k)    (∗)

where we have defined pj,j (0) = 1 (and, in future, pi,j (0) = 0 for i ≠ j). Now define

Fi,j (s) = Σ_{n=1}^∞ fi,j (n) s^n   and   Pi,j (s) = Σ_{n=0}^∞ pi,j (n) s^n ,

which, at least for |s| < 1, are both convergent.

Remark
Fi,j (s) is the p.g.f. of the time of first visit to j, given X0 = i. When there is a positive
probability of never visiting j, Fi,j (1) ( =fi,j ) will be strictly less than 1.

Now, from (*) we get

Pi,j (s) = Fi,j (s)Pj,j (s) (i)

and Pi,i (s) = 1 + Fi,i (s)Pi,i (s) (ii)

Verify these by checking the coefficients of sn in the expansions of both sides. (*) is called a
convolution and arises when two power series are multiplied together.

Now, from (ii) we have

Pi,i (s) = 1/(1 − Fi,i (s))   for |s| < 1.    (iii)

Letting s ↑ 1 gives


Fi,i (s) → Σ_{n=1}^∞ fi,i (n)   and   Pi,i (s) → Σ_{n=0}^∞ pi,i (n),

and it then follows from (iii) that

Σ_{n=0}^∞ pi,i (n) = ∞   if and only if   Σ_{n=1}^∞ fi,i (n) = 1 = fi,i ,

which completes the proof.

We can use the above result and proof to prove the following

1.11b) Theorem


(1) If j is recurrent, then Σ_{n=0}^∞ pi,j (n) = ∞ ∀ i such that fi,j > 0.

(2) If j is transient, then Σ_{n=0}^∞ pi,j (n) < ∞ ∀ i.

Proof


(1) j recurrent implies Σ_{n=0}^∞ pj,j (n) = ∞ which, using (i), gives Σ_{n=0}^∞ pi,j (n) = ∞ whenever Fi,j (1) = fi,j > 0.

(2) j transient implies Σ_{n=0}^∞ pj,j (n) < ∞ which, using (i), gives Σ_{n=0}^∞ pi,j (n) < ∞.

A further characterisation of recurrence is

1.11c) Theorem

i is recurrent if and only if the expected number of returns is infinite.

Proof

Given initial state i, let In = 1 if Xn = i, and In = 0 otherwise.

Observe that the number of returns to i is Σ_{n=1}^∞ In . Therefore, the expected number of returns is E[ Σ_{n=1}^∞ In ] = Σ_{n=1}^∞ E[In ], which, by definition, is equal to Σ_{n=1}^∞ P (Xn = i|X0 = i) = Σ_{n=1}^∞ pi,i (n); and this is infinite if and only if i is recurrent.

Remark
Interchanging the order of expectation and summation is always valid in the case of non-negative random variables.

1.12 Example

a) The simple symmetric random walk in one dimension. This is described by the diagram
below.

The steps are assumed independent. Observe that

p0,0 (n) = P (number of up steps = number of down steps) = P (Sn = n/2),

where Sn denotes the number of up steps. The above probability is clearly zero for odd n. For n even,

P (Sn = n/2) = (n choose n/2) (1/2)^(n/2) (1/2)^(n − n/2) = (n choose n/2) (1/2)^n .

To apply 1.11a), we investigate the convergence of Σ_{even n} p0,0 (n) = Σ_{even n} (n choose n/2) (1/2)^n , using the

famous and very useful

Stirling’s approximation

n! ∼ n^(n+1/2) e^(−n) √(2π)   as n → ∞,

where ∼ denotes LHS/RHS → 1.

If you are interested, a proof can be found in Feller, vol I.

Applying S.A. to the factorials in the binomial coefficients (you should work through this
yourself), we find just about everything cancels and we get, for n even

p0,0 (n) ∼ 1/√(nπ/2)   and hence, for all n > 0, p0,0 (2n) ∼ 1/√(nπ)   as n → ∞.

Therefore, using a comparison test for convergence of series,

Σ_{n=1}^∞ 1/√(nπ) = ∞   implies   Σ_{n=1}^∞ p0,0 (n) = ∞,

and by 1.11a), 0 is recurrent.

Note that by symmetry, all states are recurrent.
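The asymptotic p0,0 (2n) ∼ 1/√(nπ) can be checked numerically; a sketch using only the standard library, with the sample values of n arbitrary:

```python
import math

def p00(two_n):
    """Exact return probability p00(2n) = C(2n, n) / 4^n for the 1-d walk."""
    n = two_n // 2
    return math.comb(two_n, n) / 4 ** n

# p00(2n) * sqrt(n*pi) should approach 1 from below as n grows
ratios = [p00(2 * n) * math.sqrt(n * math.pi) for n in (10, 100, 1000)]
```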

b) The simple symmetric random walk in two dimensions. As you might expect, at each step there is probability 1/4 of moving in each of the N, S, E, W directions. It turns out that this M.C. is also recurrent, as S.A. leads to Σ_{n=1}^∞ 1/n, which famously diverges to infinity.
n=1

c) The simple symmetric random walk in three dimensions. Here we have probability 1/6 of moving in each of the N, S, E, W, up, down directions. The computation leads to Σ_{n=1}^∞ 1/n^(3/2), which is convergent; and hence 0 is transient. In fact (Feller vol I), numerical estimations lead to f0,0 ≈ 0.35. That is, the probability of ever returning to 0 is approximately 0.35.
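The value f0,0 ≈ 0.35 can be reproduced roughly by Monte Carlo. A sketch using only the standard library; the horizon and trial counts are arbitrary choices, and the finite horizon slightly underestimates the true return probability:

```python
import random

random.seed(0)
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def returns_to_origin(horizon=500):
    """Run one walk on Z^3 and report whether it revisits the origin."""
    x = y = z = 0
    for _ in range(horizon):
        dx, dy, dz = random.choice(STEPS)
        x, y, z = x + dx, y + dy, z + dz
        if x == y == z == 0:
            return True
    return False

trials = 4000
estimate = sum(returns_to_origin() for _ in range(trials)) / trials
```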

Classifying the states of a Markov chain

1.13 Definition

i) j is accessible from i, written i → j, if ∃n ≥ 0 such that pi,j (n) > 0.

That is, there is a path of positive probability from i to j.

ii) i and j communicate, written i ↔ j, if i → j and j → i.

Note that we earlier defined pi,i (0) = 1, and so i ↔ i.

In some literature, alternative terminology is used. For example, Grimmett and Stirzaker uses "communicates with" for "accessible from"; and "intercommunicate" for "communicate".

1.14 Definition

A set C ⊆ S of states is called:

i) closed if pi,j = 0 ∀ i ∈ C, j ∉ C.

ii) irreducible if i ↔ j ∀i, j ∈ C.

Note that S is closed. A state in a closed set containing only that state is called an absorbing state.

1.15 Example

S = {1, 2, 3, 4, 5, 6}

We observe that {1, 2} and {5, 6} are irreducible closed sets.

From 3 or 4 there is a positive probability of reaching 1, from which return is impossible,


which suggests that both 3 and 4 are transient states.

Proof that 3 is transient.

{Xn ≠ 3 ∀ n ≥ 1, X0 = 3} ⊇ {X1 = 1, X0 = 3}

implies

P (Xn ≠ 3 ∀ n ≥ 1, X0 = 3) / P (X0 = 3) ≥ P (X1 = 1, X0 = 3) / P (X0 = 3).

Hence

P (Xn ≠ 3 ∀ n ≥ 1 | X0 = 3) ≥ P (X1 = 1|X0 = 3), giving 1 − f3,3 ≥ p3,1 = 1/4.

Hence

f3,3 ≤ 3/4   and so 3 is transient.

In this simple example, it can be shown directly that 1, 2, 5, 6 are recurrent. The proof below
is for state 1. First observe that, for all n ≥ 1,

{Xm ≠ 1 ∀ m ≥ 1, X0 = 1} ⊆ {Xk = 2 ∀ 1 ≤ k ≤ n, X0 = 1},

leading, as argued above, to the second inequality in

0 ≤ 1 − f1,1 ≤ p1,2 p2,2^(n−1) = (1/2)(3/4)^(n−1) .
Letting n → ∞ gives us f1,1 = 1, which completes the proof.
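Closed irreducible sets can also be found mechanically from a transition matrix, by computing accessibility with Warshall's algorithm. A sketch assuming NumPy; since the diagram for Example 1.15 is not reproduced here, the matrix below is a hypothetical one consistent with the probabilities used above (p1,2 = 1/2, p2,2 = 3/4, p3,1 = 1/4; {1, 2} and {5, 6} closed; 3, 4 transient):

```python
import numpy as np

# Hypothetical 6-state matrix consistent with Example 1.15 (actual matrix
# comes from a diagram not reproduced in these notes).
P = np.array([
    [0.50, 0.50, 0.00, 0.00, 0.00, 0.00],   # state 1
    [0.25, 0.75, 0.00, 0.00, 0.00, 0.00],   # state 2
    [0.25, 0.00, 0.25, 0.25, 0.25, 0.00],   # state 3
    [0.00, 0.00, 0.50, 0.00, 0.00, 0.50],   # state 4
    [0.00, 0.00, 0.00, 0.00, 0.50, 0.50],   # state 5
    [0.00, 0.00, 0.00, 0.00, 0.50, 0.50],   # state 6
])

n = len(P)
reach = (P > 0) | np.eye(n, dtype=bool)      # accessibility in 0 or 1 steps
for k in range(n):                           # Warshall: close under composition
    reach |= reach[:, [k]] & reach[[k], :]

comm = reach & reach.T                       # i <-> j (communication)
classes = {frozenset(np.flatnonzero(comm[i]) + 1) for i in range(n)}
closed = {c for c in classes
          if all(set(np.flatnonzero(P[i - 1] > 0) + 1) <= c for i in c)}
```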

In general, we can appeal to the following result.

1.16 Theorem

If C is finite, closed and irreducible, then all states in C are recurrent.

Proof
We first show that at least one state is recurrent. Let |C| = N . Then, since C is closed, we
have, for each i ∈ C and n ≥ 1

pi,1 (1) + pi,2 (1) + ... + pi,N (1) = 1
pi,1 (2) + pi,2 (2) + ... + pi,N (2) = 1
. . .
pi,1 (n) + pi,2 (n) + ... + pi,N (n) = 1.

Summing down the columns gives

Σ_{k=1}^n pi,1 (k) + Σ_{k=1}^n pi,2 (k) + ... + Σ_{k=1}^n pi,N (k) = n.

Now let n → ∞. The RHS → ∞. The number of summations on the LHS is N < ∞; hence, for at least one j, we must have Σ_{k=1}^∞ pi,j (k) = ∞. The recurrence of j then follows from 1.11b) part (2).

We now show that, if at least one state is recurrent, then all are.

Suppose j is recurrent. If i ↔ j then ∃m, n such that α = pi,j (m)pj,i (n) > 0. Then, using
1.6, we have, for any r > 0

pi,i (m + r + n) ≥ pi,j (m)pj,i (r + n) ≥ pi,j (m)pj,j (r)pj,i (n) = αpj,j (r).

Consequently,

Σ_{r=0}^∞ pi,i (m + r + n) ≥ α Σ_{r=0}^∞ pj,j (r).

Now, by 1.11a),

j recurrent ⇒ Σ_{r=0}^∞ pj,j (r) = ∞.

Therefore

Σ_{r=0}^∞ pi,i (r) ≥ Σ_{r=0}^∞ pi,i (m + r + n) = ∞,

and so, by 1.11a), i is recurrent. That all states in C are recurrent now follows from the
irreducibility of C.

1.16a) Corollary

Given any M.C., i ↔ j implies { i recurrent ⇔ j recurrent }; so that, in an irreducible M.C.,


either all states are recurrent, or all are transient.

Proof
The result follows from the second part of the proof of 1.16, with C = S; noting that C is
not required to be finite.

1.17 Examples

a) Cyclical random walk
 
    P = (  0     p     0    1−p )
        ( 1−p    0     p     0  )
        (  0    1−p    0     p  )
        (  p     0    1−p    0  )

Draw the transition diagram to see how it gets its name. |S| is finite and the M.C. is irreducible. Hence all states are recurrent.

b) Simple symmetric 3-dimensional random walk

This is irreducible, yet all states are transient. As |S| = ∞, 1.16 does not apply.

c) Simple symmetric 1-dimensional random walk.

This is irreducible and all the states are recurrent; but, as |S| = ∞, this cannot be inferred
from 1.16.

We now introduce the concept of periodicity.

1.18 Definition

The period, d(i), of state i is

d(i) = g.c.d.{n ≥ 1 : pi,i (n) > 0}

That is, the greatest common divisor of epochs at which return is possible.
Note: If the set {n ≥ 1 : pi,i (n) > 0} is empty, the period is not defined.

A state with period 1 is called aperiodic. A M.C. in which all states are aperiodic is called an aperiodic M.C.

1.19 Examples

a) fig 1.19 a). With initial state 1, return is possible at epochs 2, 4, 6, .... Therefore d(1) = 2.

b) fig 1.19 b). Return to 1 is possible at epochs 4, 6, 8, 10, .... Therefore d(1) = 2.

c) Simple 1-dimensional random walk. Recall 1.12 where we saw that p0,0 (n) > 0 ⇔ n is
even. Consequently d(0) = 2.

d) Any irreducible M.C. with pi,i > 0 for some i ∈ S is aperiodic. Can you see why?

e) Can you find a non-trivial M.C. where all states have period 3? A trivial example is
 
    P = ( 0  1  0 )
        ( 0  0  1 )
        ( 1  0  0 )
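The period can be computed directly from its definition as a gcd. A numerical sketch assuming NumPy, scanning return epochs only up to a finite cutoff (a small threshold guards against floating-point noise):

```python
import math
import numpy as np

def period(P, i, max_n=60):
    """gcd of {n >= 1 : p_ii(n) > 0}, scanned up to max_n."""
    Pn = np.eye(len(P))
    epochs = []
    for n in range(1, max_n + 1):
        Pn = Pn @ P
        if Pn[i, i] > 1e-12:
            epochs.append(n)
    return math.gcd(*epochs) if epochs else None

P3 = np.array([[0., 1., 0.],
               [0., 0., 1.],
               [1., 0., 0.]])             # the three-state cycle above
d3 = period(P3, 0)

P2 = np.array([[0., 1.],
               [1., 0.]])                 # two-state flip: return only at even epochs
d2 = period(P2, 0)
```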

1.20 Theorem

If i ↔ j, then d(i) = d(j).

Proof
i ↔ j implies ∃v, w ≥ 1 such that pi,j (v) > 0 and pj,i (w) > 0.
Let n be any positive integer such that pj,j (n) > 0. As j → i → j, at least one such n exists.
Now (c.f. 1.16)
pi,i (v + n + w) ≥ pi,j (v)pj,j (n)pj,i (w) > 0

Hence d(i) | (v + n + w). Also

pi,i (v + w) ≥ pi,j (v)pj,i (w) > 0

Hence d(i) | (v + w). Consequently, d(i) | n. That is, d(i) is a common divisor of epochs at which return to j is possible. Therefore, by definition, d(i) ≤ d(j). Interchanging i and j throughout gives d(j) ≤ d(i); and so d(i) = d(j).

Periodicity is important when looking at limiting probabilities. For example, in 1.19a), we


see that
p1,1 (n) = 1 for even n; and p1,1 (n) = 0 for odd n.

Therefore the limit, as n → ∞, does not exist.

For a general investigation into the asymptotic behaviour of the n-step transition probabilities, we need the idea of a stationary distribution. Recall lemma 1.7, which includes

∀ n ≥ 1,   P (Xn = j) = Σ_{i∈S} pi,j (n) P (X0 = i).

Can we find a distribution for X0 such that X1 , X2 , ..., Xn , ... all have this distribution? That
is, can we find a row vector π, where πi = P (X0 = i), such that π = πPn for all n ≥ 1?

With this in mind, we make the following

1.21 Definition

The row vector π is a stationary distribution (s.d.) for a M.C. with transition matrix P if

i) πi ≥ 0 for all i;   ii) Σ_{i∈S} πi = 1;   iii) π = πP.

Note: π = πP ⇒ πP = πP 2 ⇒ π = πP 2 ⇒ ... ⇒ π = πP n = πPn .
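For a finite chain, a s.d. can be computed as a left eigenvector of P for eigenvalue 1. A sketch assuming NumPy and that the s.d. is unique, illustrated on the two-state chain of Example 1.8 b):

```python
import numpy as np

def stationary(P):
    """Left eigenvector of P for eigenvalue 1, normalised to a distribution.
    Assumes a unique stationary distribution exists."""
    vals, vecs = np.linalg.eig(P.T)      # left eigenvectors of P = right of P^T
    k = np.argmin(np.abs(vals - 1.0))    # pick the eigenvalue closest to 1
    pi = np.real(vecs[:, k])
    return pi / pi.sum()

a, b = 0.3, 0.2                          # arbitrary parameters for illustration
P = np.array([[1 - a, a], [b, 1 - b]])
pi = stationary(P)
```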

We now state a limit theorem for a class of finite state space Markov chains.

1.22 Theorem

Given an irreducible, aperiodic M.C. with |S| = N < ∞,

a) For each i, j, limn→∞ pi,j (n) exists, equal to πj , say.


b) π := (π1 , π2 , ..., πN ) is the unique s.d.

Note: Implicit in a) is that the limit is independent of i. The M.C. "forgets" its initial state.

There exist several proofs of this result. The following is a sketch of the original, due to Markov (A. A. Markov, Issledovanie zamechatel'nogo sluchaya zavisimyh ispytanij, Izvestiya Akademii Nauk, SPb, VI seriya, 1(93):61–80, 1907).

Proof of a). We want to show that, as n → ∞,
   
    ( p1,1 (n)  p1,2 (n)  . . .  p1,N (n) )        ( π1  π2  . . .  πN )
    ( p2,1 (n)  p2,2 (n)  . . .  p2,N (n) )        ( π1  π2  . . .  πN )
    (    .         .                .     )   →    (  .   .          . )
    (    .         .                .     )        (  .   .          . )
    ( pN,1 (n)  pN,2 (n)  . . .  pN,N (n) )        ( π1  π2  . . .  πN )

Let Mj (n) denote the maximum entry in column j and mj (n), the minimum. The proof
(not examinable) proceeds as follows.

i) Show that Mj (n) decreases with n and mj (n) increases.

ii) Observe that, since Mj (n) is bounded below by 0, its limit, Mj say, exists; and, since
mj (n) is bounded above by 1, its limit, mj say, exists.

iii) We now show that Mj = mj . To this end, define dj (n) = Mj (n) − mj (n).

iv) Observe that dj (n) decreases with n. Therefore, to show that dj (n) → 0, it suffices
to show (this is a result from analysis) that dj (nk ) → 0 as k → ∞, for some subsequence
n1 , n2 , ..., nk , ....

v) To get this subsequence, show ∃K such that pij (K) > 0 ∀i, j. This is a consequence of
finite, irreducible and aperiodic.

vi) Now set nk = kK. It can be shown that dj (kK) → 0 as k → ∞.

Complete (but slightly different) proofs, elucidating steps i) and vi), can be found in Tijms,
Stochastic Models, section 3.5.12, page 131; or Grinstead and Snell, page 448.

Note that some early statements of this result include ∃K such that pij (K) > 0 ∀i, j as
a hypothesis. This condition is called regularity and, for finite state space M.C.s, can be
shown to be equivalent to irreducible and aperiodic.

Proof of b). We first verify that π is a distribution.

Σ_{k=1}^N πk = Σ_{k=1}^N lim_{n→∞} pik (n) = lim_{n→∞} Σ_{k=1}^N pik (n) = lim_{n→∞} 1 = 1.

We now show that π is the unique s.d. Let λ be any distribution satisfying λ = λP . Then,
for all n ≥ 1, λ = λP ⇒ λP = λP 2 ⇒ λ = λP 2 ⇒ ... ⇒ λ = λP n = λPn .

Letting n → ∞ and using part a) gives


 
                         ( π1  π2  . . .  πN )
                         ( π1  π2  . . .  πN )
    (λ1 , λ2 , ..., λN ) (  .   .          . )  =  (λ1 , λ2 , ..., λN )
                         (  .   .          . )
                         ( π1  π2  . . .  πN )

giving, for each i,


λ1 πi + λ2 πi + ... + λN πi = λi .

Therefore (Σj λj ) πi = λi . But Σj λj = 1; hence, ∀ i, πi = λi , which completes the proof.

1.23 Corollary

Under the hypotheses of 1.22, for any initial distribution, lim_{n→∞} P (Xn = k) = πk .

Proof Let q denote the initial distribution. Then (c.f. 1.9)

P (Xn = k) = Σ_{i=1}^N pik (n) qi → Σ_{i=1}^N πk qi = πk Σ_{i=1}^N qi = πk .

Notes

a) In the periodic case, a s.d. may exist, but not a limit distribution. Examples are 1.4c and
1.19a.

b) The limit probabilities may exist, but depend upon the initial state. For example, a finite state space M.C. comprising two irreducible closed sets. Can you show this possesses an infinite number of stationary distributions? Another example is 1.15. Here, in addition to two irreducible closed sets, there are the transient states 3 and 4. To find, for example, lim_{n→∞} p31 (n), we would need to condition on the state first entered upon leaving {3, 4}.

c) Non-irreducible examples exist where there is a unique s.d. and the limit probabilities exist independent of the initial state. An example is 1.15 with states 5 and 6 removed and p44 = 1/2.

1.24 Examples

a) Ehrenfest diffusion (1.4c). (P. and T. Ehrenfest, "Über zwei bekannte Einwände gegen das Boltzmannsche H-Theorem," Physikalische Zeitschrift, vol. 8 (1907), pp. 311–314.) This is irreducible, all states have period 2 and |S| < ∞. A unique s.d. can be found using πP = π, but the lim_{n→∞} pij (n) do not exist.

b) The Simple R.W. with retaining barriers and S = {0, 1, 2, 3, 4, 5}.

Here, πP = π gives us
 
                         ( q  p  0  0  0  0 )
                         ( q  0  p  0  0  0 )
                         ( 0  q  0  p  0  0 )
    (π0 , π1 , ..., π5 ) ( 0  0  q  0  p  0 )  =  (π0 , π1 , ..., π5 )
                         ( 0  0  0  q  0  p )
                         ( 0  0  0  0  q  p )

giving

π0 q + π1 q = π0
π0 p + π2 q = π1
. . .
π3 p + π5 q = π4
π4 p + π5 p = π5

You should check for yourself that solving recursively gives πj = (p/q)^j π0 , 1 ≤ j ≤ 5. Then Σ_{j=0}^5 πj = 1 gives

π0 ( 1 + p/q + ... + (p/q)^5 ) = 1,
which, for p ≠ q, implies

π0 = (1 − p/q) / (1 − (p/q)^6 ).

Therefore, for p ≠ q,

πj = (p/q)^j · (1 − p/q) / (1 − (p/q)^6 ),   0 ≤ j ≤ 5.

For p = q = 1/2, we see, from the system of equations, that

πj = 1/6,   0 ≤ j ≤ 5.

We can now apply 1.22/23 to get that these πj are also the limiting probabilities.
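The formula πj = (p/q)^j π0 can be checked against the limit of P^n, which exists by 1.22 (the chain is finite, irreducible and aperiodic). A sketch assuming NumPy, with p = 0.3 an arbitrary choice:

```python
import numpy as np

p = 0.3                                   # any p with p != 1/2; q = 1 - p
q = 1 - p
P = np.zeros((6, 6))
P[0, 0], P[0, 1] = q, p                   # retaining barrier at 0
for i in range(1, 5):
    P[i, i - 1], P[i, i + 1] = q, p
P[5, 4], P[5, 5] = q, p                   # retaining barrier at 5

r = p / q
pi_formula = r ** np.arange(6) * (1 - r) / (1 - r ** 6)

# P^n converges row-wise to the stationary distribution
pi_limit = np.linalg.matrix_power(P, 1000)[0]
```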

c) Referring to 1.8b, solve π = πP with π0 + π1 = 1, to find

π0 = b/(a + b),   π1 = a/(a + b),

which are also the limiting probabilities as determined in example 1.8b.

Theorem 1.22 applies only when |S| < ∞. We now look at the general case of countable S, which may be infinite but includes the finite case. The crucial additional concept proves to be "positive recurrence".

1.25 Definition

For a recurrent state i, let Ti denote the recurrence time: the time of the first return to i, starting from i. Let µi = E[Ti ]. Recall from 1.10, µi = Σ_{n=1}^∞ n fii (n).

A recurrent state i is said to be positive recurrent if µi < ∞. Otherwise, when µi = ∞, it is null recurrent. In earlier literature the terms "non-null persistent" and "null persistent" are often used. When i is transient, define µi = ∞.

1.25 a) Example

Recall 1.12, the simple symmetric R.W. We showed 0 to be a recurrent state (by 1.16a, all
states are then recurrent). We have

 
p00 (n) = (n choose n/2) (1/2)^n   for n even;   p00 (n) = 0   for n odd.

The negative binomial expansion gives (see problem sheet 2)

P (s) = Σ_{n=0}^∞ p00 (n) s^n = ( 1 − 4 · (1/2) · (1/2) · s^2 )^(−1/2) = (1 − s^2 )^(−1/2) .

From 1.11, we have

P (s) = 1/(1 − F (s)),   where F (s) = Σ_{n=1}^∞ f00 (n) s^n .

Therefore

F (s) = 1 − 1/P (s) = 1 − (1 − s^2 )^(1/2) .

Now, recalling properties of p.g.f.s



µ0 = Σ_{n=1}^∞ n f00 (n) = F′(1−).

We have

F′(s) = −(1/2)(1 − s^2 )^(−1/2) (−2s)
      = s / (1 − s^2 )^(1/2)
      = ( s^2 / (1 − s^2) )^(1/2)
      = ( 1 / (1/s^2 − 1) )^(1/2) .

Therefore

µ0 = F′(1−) = lim_{s↑1} F′(s) = ∞,

and 0 is null recurrent; therefore, by symmetry, so are all the states. There are several approaches to proving a version of 1.22 for the case of countable S. The method here hinges on a result of Feller, Erdős and Pollard (1949). If you enjoy analysis, it can be found in Feller, vol 1, chapter 13, section II.
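The convolution (∗) can be run in reverse to recover the first-return probabilities f00 (n) from the p00 (n), giving a numerical picture of null recurrence: the partial sums of f00 approach 1 while the partial means keep growing. A sketch using only the standard library; the truncation horizon is arbitrary:

```python
import math

N = 2000                                  # truncation horizon
p = [0.0] * (N + 1)
p[0] = 1.0
for n in range(2, N + 1, 2):              # p00(n) = (n choose n/2) / 2^n, n even
    p[n] = math.comb(n, n // 2) / 2 ** n

f = [0.0] * (N + 1)
for n in range(1, N + 1):                 # f00(n) = p00(n) - sum_{k<n} f00(k) p00(n-k)
    f[n] = p[n] - sum(f[k] * p[n - k] for k in range(1, n))

total = sum(f)                            # partial sum of f00: approaches f00 = 1
partial_mean = sum(n * f[n] for n in range(1, N + 1))  # keeps growing: mu0 = infinity
```

The first values agree with the known first-return probabilities: f00 (2) = 1/2 and f00 (4) = 1/8.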

1.26 Theorem(F,E & P)

Let (fn ), n ≥ 1, be a non-negative sequence s.t. Σ fn = 1 and the g.c.d. of {n : fn > 0} is 1. Let u0 = 1 and

un = f1 un−1 + f2 un−2 + · · · + fn u0 ,   n ≥ 1;

then un → 1/µ as n → ∞, where µ = Σ_{n=1}^∞ n fn . Define 1/µ = 0 when µ = ∞.

Proof omitted.

We also need the following two lemmas.

1.27 Lemma

If j is recurrent and j → i, then fij = 1.

Proof

j → i ⇒ ∃ m s.t. pji (m) > 0. Let r be the smallest such m. This excludes the possibility of revisiting j before visiting i. Now

{X0 = j, Xr = i, Xk ≠ j ∀ k > r} ⊆ {X0 = j, Xk ≠ j ∀ k ≥ 1}

⇒ P (Xk ≠ j ∀ k > r, Xr = i | X0 = j) ≤ P (Xk ≠ j ∀ k ≥ 1 | X0 = j).

Then, using P (A ∩ B|E) = P (A|B ∩ E)P (B|E) on the LHS, we get

P (Xk ≠ j ∀ k > r | Xr = i, X0 = j) P (Xr = i|X0 = j) ≤ P (Xk ≠ j ∀ k ≥ 1 | X0 = j)

⇒ (1 − fij ) pji (r) ≤ 1 − fjj .

Now, j recurrent ⇒ fjj = 1 ⇒ RHS = 0. Therefore pji (r) > 0 ⇒ 1 − fij = 0 ⇒ fij = 1.

1.28 Lemma

Given a non-negative sequence (bn )n≥1 with Σ bn = b < ∞, and a double sequence (ank )n≥1,k≥1 bounded between 0 and 1 such that, for each fixed k, ank → a as n → ∞, we have

Σ_{k=1}^∞ ank bk → ab   as n → ∞.

Note: This is an example of a bounded convergence theorem. It allows us to get around


the problems posed by |S| = ∞.

Proof

∀ n, N ≥ 1,   Σ_{k=1}^N ank bk ≤ Σ_{k=1}^∞ ank bk ≤ Σ_{k=1}^N ank bk + Σ_{k=N+1}^∞ bk .

Let n → ∞ to get

Σ_{k=1}^N a bk ≤ lim inf_{n→∞} Σ_{k=1}^∞ ank bk ≤ lim sup_{n→∞} Σ_{k=1}^∞ ank bk ≤ Σ_{k=1}^N a bk + Σ_{k=N+1}^∞ bk .    (∗)

Further explanation of the above will be given in the lecture.

Continuing the proof, letting N → ∞, we have

i) Σ_{k=1}^N a bk = a Σ_{k=1}^N bk → ab;   ii) Σ_{k=N+1}^∞ bk → 0.

Note that ii) is a consequence of Σ_{k=1}^∞ bk < ∞.

Therefore, from (*)



lim inf_{n→∞} Σ_{k=1}^∞ ank bk = lim sup_{n→∞} Σ_{k=1}^∞ ank bk = ab,

and consequently, lim_{n→∞} Σ_{k=1}^∞ ank bk = ab.

Note on bounded convergence. To see the necessity of the boundedness condition, consider
the following. Define

bk = 1/k^2 ;   ank = 0 for n ≠ k,   ank = k^2 for n = k,

so that b = π^2/6 and ann → ∞ as n → ∞; hence the boundedness condition does not hold. However, for each fixed k, ank → 0 as n → ∞. Now observe that, for each fixed n,

Σ_{k=1}^∞ ank bk = ann bn = n^2/n^2 = 1,

so that

lim_{n→∞} Σ_{k=1}^∞ ank bk = 1 ≠ 0.

A complete analysis of the asymptotic behaviour for the countable case is given by the results

1.29 to 1.32 below.

1.29 Theorem

Let j be recurrent and aperiodic, then

a) pij (n) → fij /µj   as n → ∞.

Recall: µj denotes the mean recurrence time of state j and note that, for j recurrent, we have fjj = 1, and therefore

pjj (n) → 1/µj   as n → ∞.

If, in addition, j → i, then

b) pij (n) → 1/µj   as n → ∞.

Proof

a) For n ≥ 1, write (see earlier)

pjj (n) = Σ_{k=1}^n pjj (n − k) fjj (k).

Now apply 1.26 with un = pjj (n) and fn = fjj (n) to get

pjj (n) → 1/µj   as n → ∞.

Now recall

pij (n) = Σ_{k=1}^n pjj (n − k) fij (k).

Define pjj (s) = 0 for s < 0, so that

pij (n) = Σ_{k=1}^∞ pjj (n − k) fij (k).

Apply 1.28 with bk = fij (k) and ank = pjj (n − k) to get

pij (n) → ( Σ_{k=1}^∞ fij (k) ) / µj = fij /µj   as n → ∞.

b) By 1.27, fij = 1 and the result follows.

1.30 Theorem

Given any M.C., suppose that ∀ i, j, lim_{n→∞} pij (n) exists, equal to πj say (so independent of i). Then

a) Σ_i πi ≤ 1 and Σ_i πi pij = πj ;

b) either all the πj = 0, or Σ_i πi = 1;

c) if all the πj = 0, no s.d. exists; if Σ_i πi = 1, then (π1 , π2 , ...) is the unique s.d.

Proof


a) ∀ n ≥ 1, N ≥ 1, we have 1 = Σ_{j=1}^∞ pij (n) ≥ Σ_{j=1}^N pij (n). Letting n → ∞ gives 1 ≥ Σ_{j=1}^N πj , and letting N → ∞ gives 1 ≥ Σ_{j=1}^∞ πj .

For the second part, conditioning on the state following the nth transition, we have

pkj (n + 1) = Σ_{i=1}^∞ pki (n) pij ≥ Σ_{i=1}^N pki (n) pij .

Letting n → ∞ gives πj ≥ Σ_{i=1}^N πi pij and letting N → ∞ gives πj ≥ Σ_{i=1}^∞ πi pij .

To show equality, suppose that, for some r, we have πr > Σ_{i=1}^∞ πi pir . Then

Σ_j πj > Σ_j ( Σ_i πi pij ) = Σ_i Σ_j πi pij = Σ_i πi ( Σ_j pij ) = Σ_i πi ,

which is a contradiction. It follows that Σ_i πi pij = πj .

b) Recall from 1.21 that π = πP implies, for all n ≥ 1, that π = πPn . Consequently, it follows from Σ_i πi pij = πj in part a) that, for all j and n ≥ 1, Σ_i πi pij (n) = πj . Now apply 1.28 to the L.H.S., with ani = pij (n) and bi = πi , to get πj ( Σ_i πi ) = πj and hence πj (1 − Σ_i πi ) = 0 for all j. Therefore πj > 0 for some j implies 1 − Σ_i πi = 0 and hence Σ_i πi = 1, which completes the proof of b).

c) Suppose q is a non-negative vector such that Σ_i qi ≤ 1, which, ∀ j, satisfies Σ_i qi pij = qj , and consequently also satisfies Σ_i qi pij (n) = qj ∀ n.

Apply 1.28 to the L.H.S., with bi = qi and ani = pij (n), to get Σ_i qi πj = qj and therefore

πj Σ_i qi = qj .

If πj = 0 ∀ j, then qj = 0 ∀ j and there is no s.d.

If Σ_i πi = 1, the result Σ_i πi pij = πj of a) implies π is a s.d.

For uniqueness, suppose λ is a s.d. Then Σ_i λi pij (n) = λj ∀ j, n. Apply 1.28 to the L.H.S. to get Σ_i λi πj = λj and consequently πj = λj . Hence π is the unique s.d.

We now look at the transient case

1.31 Lemma

a) If j is transient, then, ∀i, pij (n) → 0 as n → ∞.

b) If all states are transient, then no s.d. exists.

Proof


a) 1.11b) implies Σ_{n=1}^∞ pij (n) < ∞, which implies pij (n) → 0 as n → ∞.

b) Suppose q is a non-negative vector such that Σ_i qi ≤ 1, satisfying, ∀ j and n ≥ 1, Σ_i qi pij (n) = qj . Using the result of part a), apply 1.28 as in 1.30c to get 0 · Σ_i qi = qj . Hence ∀ j, qj = 0, and no s.d. exists.

Given an irreducible, aperiodic M.C., the most useful consequences of the above results are

i) If no s.d. exists, then either all states are transient or (1.29b) all are null recurrent. In
both cases, pij (n) → 0 as n → ∞.

ii) If a s.d. is found, it is unique and, ∀i, j, pij (n) → πj as n → ∞, where π is the unique
s.d.

iii) The mean recurrence times are given by µi = 1/πi when π is the unique s.d., and µi = ∞ when no s.d. exists. This is also useful in the finite state space case.
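Point iii) can be checked by simulation on the two-state chain of Example 1.8 b): the average observed return time to state 0 should approach 1/π0 = (a + b)/b. A sketch using only the standard library; parameter values and the trial count are arbitrary:

```python
import random

random.seed(1)
a, b = 0.3, 0.2                          # two-state chain of Example 1.8 b)
pi0 = b / (a + b)                        # stationary probability of state 0
expected_mu0 = 1 / pi0                   # mean recurrence time predicted by iii)

def return_time():
    """Simulate one excursion from state 0 back to state 0."""
    state, t = 0, 0
    while True:
        u = random.random()
        state = (1 if u < a else 0) if state == 0 else (0 if u < b else 1)
        t += 1
        if state == 0:
            return t

trials = 20000
mu0_hat = sum(return_time() for _ in range(trials)) / trials
```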

One final question remains. Is it possible for the s.d. of an irreducible M.C. to have some
πi = 0? That is, can there be a mixture of null (µi = ∞) and positive (µi < ∞) recurrent
states? Note that 1.30b does not exclude this possibility. The result below answers this in
the negative.

1.32 Lemma

In an irreducible, recurrent M.C., either all states are positive recurrent, or all are null re-
current.

Proof (for the aperiodic case)

Recall the proof of 1.16. i ↔ j if and only if ∃r, s such that pji (r) > 0 and pij (s) > 0.

Also, i positive recurrent implies pii (n) → 1/µi > 0 as n → ∞.

We have, ∀ n ≥ 1 and some α > 0,

pjj (n + r + s) ≥ pji (r) pii (n) pij (s) = α pii (n).

Therefore

lim_{n→∞} pjj (n) = lim_{n→∞} pjj (n + r + s) ≥ α/µi > 0,

and so j is positive recurrent.

