LN SP2018
Dmitry Ioffe
CONTENTS
6.1 The setup: Probabilistic vs Electrostatic Interpretation. . . . 88
6.2 Necessary and sufficient criterion for transience. . . . . . . . . 90
6.3 Probabilistic interpretation of unit currents. . . . . . . . . . . 91
6.4 Variational description of effective conductances and effective
resistances. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.5 Rayleigh’s principle and Nash-Williams criterion. . . . . . . . 94
6.6 Simple random walk on Zd and Polya’s Theorem. . . . . . . . 95
6.7 Simple random walk on trees. . . . . . . . . . . . . . . . . . . 96
7 Renewal theory in continuous time . . . . . . . . . . . . . . . . . . . 97
7.1 Poisson Process. . . . . . . . . . . . . . . . . . . . . . . . . . . 97
7.2 The Setup and the Elementary Renewal Theorem. . . . . . . . 97
7.3 Renewal-reward theorem and applications. . . . . . . . . . . . 102
7.4 Excess life distribution and stationarity. . . . . . . . . . . . . 113
8 Continuous Time Markov Chains. . . . . . . . . . . . . . . . . . . . . 117
8.1 Finite state space. . . . . . . . . . . . . . . . . . . . . . . . . 117
8.2 Ergodic theorem for CTMC on a finite state space. . . . . . . 124
8.3 Countable state space. . . . . . . . . . . . . . . . . . . . . . . 125
8.4 Explosions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
8.5 Ergodic theorem for CTMC on countable state spaces. . . . . 129
8.6 Biased sampling and PASTA. . . . . . . . . . . . . . . . . . . 130
8.7 Reversibility. . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
List of Notations
1A Indicator of A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1 Basic Probability Theory.
1.1 Probability Spaces.
A Probability Space is a triple (Ω, A, P), where:
• The sample space Ω is the set of outcomes; points of Ω are denoted ω ∈ Ω.
• A is the collection (σ-algebra) of events - subsets A ⊆ Ω.
• P is a probability measure, which assigns to each A ∈ A its probability P(A) ∈ [0, 1].
A is a σ-algebra if:
1. Ω ∈ A.
2. If A ∈ A, then A^c ∈ A.
3. If A1, A2, · · · ∈ A, then ∪n An ∈ A.
(A2) P(Ω) = 1.
For instance, in order to check that (A3.3) ⇒ (A3), set A = ∪n An and define B_n = ∪_{ℓ=n+1}^∞ A_ℓ. Then by additivity

P(A) = Σ_{j=1}^n P(A_j) + P(B_n).
Example 1.1.1. Bernoulli trials. Let us flip a coin n times. The probability space is (Ω_n, F_n, P_n). The sample space is Ω_n = {0, 1}^n. That is, ω ∈ Ω_n is an n-dimensional vector with coordinates ω_i ∈ {0, 1}. We say that ω_i = 1 indicates success (e.g. Heads) in the i-th trial. F_n = 2^{Ω_n}. There is only one parameter, the probability of success p ∈ [0, 1], which specifies P_n. Alternatively, q = 1 − p is the probability of failure (in a single trial). The probabilities related to Bernoulli trials are given by

P_n(ω) = p^{Σ_1^n ω_i} (1 − p)^{Σ_1^n (1−ω_i)} = p^{Σ_1^n ω_i} q^{Σ_1^n (1−ω_i)}.    (1.1.3)
Of course, Σ_1^n ω_i is just the total number of successes in the outcome ω (of n trials). Important events related to Bernoulli trials are

A_k = {ω : Σ_1^n ω_i = k}.

Clearly,

P(A_k) = \binom{n}{k} p^k q^{n−k}.    (1.1.4)
Note that the minimal σ-algebra σ (A0 , A1 , . . . , An ) contains all possible unions of
A0 , . . . , An and it is strictly smaller than 2Ωn .
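Formula (1.1.4) lends itself to a direct numerical sanity check: summing the weights (1.1.3) over all outcomes with exactly k successes must reproduce the binomial probability. A minimal sketch (the values n = 5, p = 0.3 are arbitrary illustrative choices):

```python
from itertools import product
from math import comb

# Check (1.1.4) by brute-force enumeration of the sample space {0,1}^n:
# summing P_n(w) = p^{#successes} * q^{#failures} over outcomes w with
# exactly k successes reproduces binom(n,k) * p^k * q^(n-k).
n, p = 5, 0.3          # illustrative parameters
q = 1 - p

def weight(w):
    s = sum(w)
    return p ** s * q ** (n - s)

for k in range(n + 1):
    lhs = sum(weight(w) for w in product((0, 1), repeat=n) if sum(w) == k)
    rhs = comb(n, k) * p ** k * q ** (n - k)
    assert abs(lhs - rhs) < 1e-12
```

The enumeration also confirms that the weights (1.1.3) sum to one over all of Ω_n.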
for n = 1, 2, 3, . . . and a1 , . . . an = 0, 1.
Remark 1.1.3. Note that Ω in Example 1.1.3 is uncountable, which means that it is impossible to enumerate all elements of Ω as ω^{(1)}, ω^{(2)}, . . . . Indeed, assume that we manage to enumerate them, with ω^{(n)} = (a_1^{(n)}, a_2^{(n)}, . . . , a_n^{(n)}, . . . ). Then consider

ω* = (1 − a_1^{(1)}, 1 − a_2^{(2)}, . . . , 1 − a_n^{(n)}, . . . ).
P(A) = p^{Σ a_i} (1 − p)^{n − Σ a_i}.    (1.1.11)
Recall that A0 is the algebra of events which depend only on a finite number of trials. Using (1.1.11) we extend P to A0 by additivity. Actually, if A ∈ A depends only on the first n trials, then the computation of P(A) reduces to a computation on a finite probability space (Ω_n, A_n, P_n) discussed in Example 1.1.1 above. For instance, the event {Σ_1^n ω_i = k} belongs to A0 and its probability equals

\binom{n}{k} p^k (1 − p)^{n−k}.
Exercise 1.1.4. Show that if A1 ⊇ A2 ⊇ . . . is a non-increasing sequence of events from A0 which satisfies
∩An = ∅, then there exists N < ∞, such that An = ∅ for all n ≥ N . In particular, limn→∞ P(An ) = 0 and hence
the conditions of Caratheodory’s theorem are satisfied.
Solution: We shall check that under the conditions of Exercise 1.1.4 there exists a finite n0 < ∞ such that An = ∅ for any n ≥ n0. Equivalently, since {An} is a non-increasing sequence of events, there exists n such that An = ∅. This follows by contradiction. Let

Fn = σ([a1, . . . , an]; a1, . . . , an = 0, 1).

There is no loss of generality in assuming that An ∈ Fn. Note that for any n the collection {[a1, . . . , an]}_{a1,...,an=0,1} of 2^n events is a partition of Ω.
Assume now that An ≠ ∅ for every n. Since the An form a non-increasing sequence of events, this would mean that there exists a1 ∈ {0, 1} such that An ∩ [a1] ≠ ∅ for all n as well. Proceeding along these lines of reasoning we conclude that, under the assumption An ≠ ∅, there exists an infinite sequence a1, a2, . . . such that An ∩ [a1, . . . , ak] ≠ ∅ for any n and k. But An ∈ Fn. Therefore An ∩ [a1, . . . , an] ≠ ∅ simply means that [a1, . . . , an] ⊆ An for any n. Consider the point a = (a1, a2, a3, . . . ) ∈ Ω. Clearly a ∈ [a1, . . . , an] for any n. By the above, a ∈ An for any n. Hence a ∈ ∩An ≠ ∅. A contradiction.
1. BASIC PROBABILITY THEORY. 11
Construction of Uniform Distribution (Lebesgue measure) on [0, 1]. Consider now Example 1.1.4. Let B̃0 be the smallest algebra which contains B0. In other words, B̃0 contains finite unions of intervals (a, b), (a, b], [a, b), [a, b]. For 0 ≤ a ≤ b ≤ 1 define P((a, b)) = b − a.

Remark 1.1.4. One can use a Heine-Borel argument / compactness considerations in order to check that P above satisfies the conditions of Caratheodory's theorem.

On the other hand, define

P((a, b) ∩ Q) = b − a,

and extend it by additivity to B0. Then there does not exist a σ-additive extension to σ(B0). Indeed, let us enumerate all rational numbers in (0, 1) as q1, q2, q3, . . . . Consider the sets A_i = (q_i − 4^{−i}, q_i + 4^{−i}) ∩ Q. Clearly, P(A_i) ≤ 2 · 4^{−i}. On the other hand, Ω = ∪_i A_i. Hence, should there be a σ-additive extension of P, we would obtain

1 = P(Ω) ≤ Σ_i P(A_i) ≤ 2/3,

a contradiction.
Alternatively, should there be a σ-additive extension of P, then for any rational q ∈ (0, 1) it should hold that P({q}) = lim_{ε→0} P((q − ε, q + ε) ∩ Q) = 0, and hence P(Ω) = 0, again a contradiction.
One may object that B0 above is not an algebra. Here is a classical example of an additive but not σ-additive function defined on a σ-algebra of subsets: Let F = 2^N be the set of all subsets of N. Using the convention ∞ · 0 = 0, define

µ(A) = ∞ · 1_{A is infinite}.    (1.1.12)
for any a ∈ R_X.
CASE 2. In general, for any a ∈ R the set

{X ≤ a} ≜ {ω : X(ω) ≤ a} ∈ A.    (1.2.2)

That is, {X ≤ a} is an event for any a ∈ R, and hence the probabilities F_X(a) ≜ P(X ≤ a) are well defined.
Each event A ∈ A has its representative X = 1_A, called the indicator of A, in the world of random variables:

1_A(ω) = 1 if ω ∈ A, and 1_A(ω) = 0 if ω ∉ A.    (1.2.3)

F_X(x) = P(X ≤ x).    (1.2.4)

Let us say that x belongs to the support supp(X) of the distribution of X if F_X(x + ε) > F_X(x − ε) for any ε > 0. Equivalently, x ∈ supp(X) if P(X ∈ (x − ε, x + ε)) > 0 for any ε > 0.
Let us say that X has discrete distribution with probability function px if there is an at
most countable S ⊂ R such that P(X ∈ S) = 1, and P(X = s) = pX (s) for every s ∈ S.
Let us say that X has a continuous distribution with density function f_X if F_X(x) = ∫_{−∞}^x f_X(t) dt for every x ∈ R.
and that it has left limits lim_{y↑x} F_X(y) = P(X < x). Conclude that P(X = x) equals the size of the jump of F_X at x.
(b) Find an example of a discrete random variable X which has a continuous support,
e.g. supp(X) = R.
(c) Let X be a continuous random variable. Check that Y = FX (X) has a uniform
distribution on [0, 1], that is P(Y ≤ y) = y for every y ∈ [0, 1].
(d) Give an example of a random variable which is neither discrete, nor continuous.
For instance, here is a solution to (b): Enumerate all rational numbers Q = {q1, q2, q3, . . .} (this is possible since Q is countable). Define the probability distribution of X via P(X = q_i) = 2^{−i}.
• Bernoulli random variable X ∼ B(p) has only two possible values, 0 and 1, with p = P(X = 1). For any event A the indicator X = 1_A is a Bernoulli random variable. If (Ω, A, P_p) is the probability space of a finite or infinite sequence of independent Bernoulli trials as described above, then X_i(ω) = ω_i is a finite or infinite sequence of independent Bernoulli random variables.
• Poisson random variable with intensity λ: N ∼ Poi(λ) has range N0 and its probability function is given by

p_N(k) = (λ^k / k!) e^{−λ}.

f_S(s) = (λ^r s^{r−1} / Γ(r)) e^{−λs} 1_{[0,∞)}(s).    (1.2.5)

f_X(x) = (1/√(2πσ²)) e^{−(x−µ)²/(2σ²)}.    (1.2.8)

X(ω) = Σ_{s∈S} s 1_{X=s}(ω).
A general procedure for constructing expectations of not necessarily discrete random variables is described in Exercise 1.7.4, and it boils down to the construction of the so-called Lebesgue integral, which crucially differs from the usual Riemann integral in the following sense: the Lebesgue integral uses partition and approximation of the range rather than of the domain. If X is a continuous random variable with density function f_X, then

E(X) = ∫_{−∞}^∞ x f_X(x) dx, if the integral is absolutely convergent;
E(X) = ∞, if ∫_0^∞ x f_X(x) dx = ∞ but ∫_{−∞}^0 |x| f_X(x) dx < ∞;
E(X) = −∞, if ∫_0^∞ x f_X(x) dx < ∞ but ∫_{−∞}^0 |x| f_X(x) dx = ∞;
E(X) is not defined, if both ∫_0^∞ x f_X(x) dx = ∫_{−∞}^0 |x| f_X(x) dx = ∞.    (1.2.11)
Exercise 1.2.2. Let X be a random variable, and let F and G be two bounded
non-decreasing functions. Then
Tail formula for expectations. Let X be a non-negative random variable, and let F be its distribution function. Then

E(X) = ∫_0^∞ P(X > x) dx = ∫_0^∞ (1 − F(x)) dx.    (1.2.13)

More generally, let X be any random variable, and let ϕ be a smooth non-decreasing function with ϕ(−∞) ≜ lim_{x→−∞} ϕ(x) = 0. Then,

E(ϕ(X)) = ∫_{−∞}^∞ ϕ′(x) P(X > x) dx.    (1.2.14)
Sketch of the Proof of (1.2.14). Write ϕ(X) = ∫_{−∞}^X ϕ′(x) dx = ∫_{−∞}^∞ ϕ′(x) 1_{X>x} dx. Then, interchanging the expectation and the integral (which is justified by the so-called Tonelli Theorem),

E(ϕ(X)) = E ∫_{−∞}^∞ ϕ′(x) 1_{X>x} dx = ∫_{−∞}^∞ ϕ′(x) E(1_{X>x}) dx = ∫_{−∞}^∞ ϕ′(x) P(X > x) dx.
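The tail formula (1.2.13) is easy to check numerically for a concrete law; a minimal sketch for an Exponential(λ) variable, where P(X > x) = e^{−λx} and E(X) = 1/λ (the value of λ and the integration grid are illustrative choices):

```python
import math

# Numerical check of the tail formula (1.2.13) for an Exponential(lam)
# random variable: integrating the tail P(X > x) = exp(-lam*x) over
# [0, infinity) should give E(X) = 1/lam.
lam = 2.0
dx = 1e-4
xs = [i * dx for i in range(int(25 / dx))]   # integrate far into the tail
integral = sum(math.exp(-lam * x) * dx for x in xs)
expectation = 1 / lam
# integral should be close to E(X) = 0.5
```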
Exercise 1.2.3. Average waiting time for the best offer: Let X1, X2, . . . be continuous i.i.d. random variables, say offers by different clients for the car you wish to sell. Define

N = inf{n > 1 : X_n > X_1}.

That is, N describes the number of clients which will show up before there is a better offer than the one made by the very first client. Check that E(N) = ∞, but E(√N) < ∞.
Hint. Use tail formula (1.2.13).
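A simulation makes the heavy tail of N visible: by exchangeability P(N > n) = P(X_1 is the largest among X_1, . . . , X_n) = 1/n, so E(N) = Σ_n 1/n = ∞, while E(√N) = Σ √n (1/(n(n−1))) < ∞. A minimal sketch (sample size, cap and seed are illustrative choices):

```python
import random

# Simulation for Exercise 1.2.3.  P(N > n) = 1/n, so the empirical
# frequency of {N > 10} should be close to 1/10.  The loop is capped to
# keep the (heavy-tailed!) sampling time bounded; capped samples still
# count correctly as {N > 10}.
random.seed(0)

def sample_N(cap=10 ** 5):
    x1 = random.random()
    for n in range(2, cap):
        if random.random() > x1:
            return n
    return cap

samples = [sample_N() for _ in range(20000)]
frac_gt_10 = sum(1 for m in samples if m > 10) / len(samples)
# frac_gt_10 should be close to P(N > 10) = 0.1
```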
Markov. For a non-negative random variable X and any a > 0,

P(X ≥ a) ≤ E(X)/a.    (1.2.16)

In particular, for any random variable X and for any non-negative and strictly increasing (on supp(X)) function ϕ,

P(X ≥ a) ≤ E(ϕ(X))/ϕ(a).    (1.2.17)

Chebyshev. This is a particular instance of (1.2.17): If E(X) is defined and finite, then

P(|X − E(X)| ≥ a) ≤ Var(X)/a².    (1.2.18)
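Both bounds can be sanity-checked numerically; a minimal sketch for an Exponential(1) variable (an illustrative choice, with E(X) = Var(X) = 1 and P(X ≥ a) = e^{−a}):

```python
import math

# Numeric illustration of Markov (1.2.16) and Chebyshev (1.2.18) for an
# Exponential(1) random variable: E(X) = 1, Var(X) = 1, P(X >= a) = e^{-a}.
for a in (1.5, 2.0, 5.0):
    assert math.exp(-a) <= 1.0 / a             # Markov: P(X >= a) <= E(X)/a
    # For a > 1, P(|X - 1| >= a) = P(X >= 1 + a), since X >= 0
    assert math.exp(-(1 + a)) <= 1.0 / a ** 2  # Chebyshev bound
```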
Example 1.3.3.

X_n = (1/(σ√n)) Σ_1^n ξ_i,    (LLN)

where ξ1, ξ2, . . . are i.i.d. mean zero random variables with finite variance σ².
Example 1.3.4. Fix λ > 0, and let X_n be the number of successes in n Bernoulli trials with probability of success p_n = λ/n.
Example 1.3.5. Fix r > 0 and consider distributions of m = ⌊rn⌋ particles into n urns (or energy levels) labeled 1, . . . , n. Let X_n be the number of particles which land in urn number 1. Note that the random variables X_n have different properties depending on whether we consider the particles to be different (Maxwell-Boltzmann statistics) or identical (Bose-Einstein statistics).
lim_{n→∞} P(|X_n − X| ≤ ε) = 1.
Remark 1.3.1. Note that, unlike all other types of convergence defined above, convergence in distribution does not require that the random variables X1, X2, . . . , X be defined on the same probability space.
For instance, in order to check (a), pick a sequence ε_m ↓ 0. Then (recall (1.4.1)) lim_{n→∞} X_n = X P-a.s. is recorded as

P(∩_m ∪_N ∩_{n≥N} {|X_n − X| ≤ ε_m}) = 1.

Hence, for every m,

P(∪_N ∩_{n≥N} {|X_n − X| ≤ ε_m}) = 1.

Since P(∩_{n≥N} {|X_n − X| ≤ ε}) ≤ P(|X_N − X| ≤ ε), it follows that for any ε > 0,

lim_{N→∞} P(|X_N − X| ≤ ε) = 1.
Remark 1.3.3. Note that (a)-(d) of Remark 1.3.2 imply that weak convergence is the weakest one, and convergence in
probability is the next weakest one. Exercise 1.7.5 below implies that in general arrows in (a),(b) and (d) cannot be inverted.
On the other hand the following fact (which we shall not prove at this stage), called Skorohod convergence Theorem, holds:
If w − limn→∞ Xn = X, then one can construct a probability space and a sequence of random variables Y1 , Y2 , . . . , Y
on it, such that for any n random variable Yn has the same distribution as Xn , Y has the same distribution as X, and
limn→∞ Yn = Y P-a.s.
X_{4n} − X_n = (1/(2σ√n)) (−Σ_{i=1}^n ξ_i + Σ_{j=n+1}^{4n} ξ_j) ∼ N(0, 1),

P-a.s.  lim sup_{n→∞} X_n/√(log log n) = √2.    (1.3.1)
Proving (1.3.1) is beyond the scope of this course, although we shall come rather close
to it from the point of view of developing relevant techniques in Subsections 1.4-1.6.
which means that we are talking about the set of those ω ∈ Ω such that for any ε > 0, |X_n(ω) − X(ω)| ≤ ε as soon as n is sufficiently large.
Exercise 1.4.1. (a) Check that for any collection of events {A_n} it always holds that lim inf_{n→∞} A_n ⊆ lim sup_{n→∞} A_n. Construct examples where they are equal, and where they are different.
(b) Check that

{A_n i.o.}^c = {A_n^c a.b.f.}

(c) Check that for any sequence of events A_n, the following holds:

P(lim sup_{n→∞} A_n) ≥ lim sup_{n→∞} P(A_n)  and  P(lim inf_{n→∞} A_n) ≤ lim inf_{n→∞} P(A_n).    (1.4.3)

(e) Assume that A_{2n} = A and A_{2n+1} = B. Check that lim sup_{n→∞} A_n = A ∪ B, whereas lim inf_{n→∞} A_n = A ∩ B.
Lemma 1.4.1. Let {A_n} be a sequence of events (defined on the same probability space). If Σ_n P(A_n) < ∞, then

P(A_n i.o.) = 0.    (1.4.4)

Lemma 1.4.2. Let {A_n} be a sequence of independent events (defined on the same probability space). If Σ_n P(A_n) ≜ Σ_n p_n = ∞, then P(A_n i.o.) = 1.

Remark 1.4.1. Note that the two Borel-Cantelli lemmas, Lemma 1.4.1 and Lemma 1.4.2, imply: Let {A_n} be a sequence of independent events. Then P(A_n i.o.) = 1 (respectively 0) ⇔ Σ_n P(A_n) = ∞ (respectively < ∞).
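The dichotomy of Remark 1.4.1 is easy to observe in simulation; a minimal sketch with independent events A_n = {U_n ≤ 1/n²} (summable probabilities) versus B_n = {V_n ≤ 1/n} (non-summable), where horizon and seed are illustrative choices:

```python
import random

# Illustration of the two Borel-Cantelli lemmas with independent events.
# A_n = {U_n <= 1/n^2}: sum P(A_n) < infinity, so only finitely many occur.
# B_n = {V_n <= 1/n}:   sum P(B_n) = infinity, so infinitely many occur a.s.
random.seed(1)
horizon = 200000
count_A = sum(1 for n in range(1, horizon + 1) if random.random() <= 1.0 / n ** 2)
count_B = sum(1 for n in range(1, horizon + 1) if random.random() <= 1.0 / n)
# count_A stays small (its expectation is pi^2/6 ~ 1.64 over ALL n),
# while count_B keeps growing with the horizon (~ log(horizon) ~ 12.8 here).
```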
Proof of the Second Borel-Cantelli Lemma. Recall (Exercise 1.4.1) that {A_n i.o.}^c = {A_n^c a.b.f.}. But, by Exercise 1.1.2,

P(A_n^c a.b.f.) = lim_{N→∞} P(∩_{n=N}^∞ A_n^c) = lim_{N→∞} lim_{M→∞} P(∩_{n=N}^M A_n^c).

P(A_n^c a.b.f.) = 0.
(c) Let {X_n} be a sequence of i.i.d. non-negative random variables. Check that if E(X_i) < ∞, then lim_{n→∞} X_n/n = 0 P-a.s. On the other hand, check that if E(X_i) = ∞, then p-lim_{n→∞} X_n/n = 0, but

P(lim sup_{n→∞} X_n/n = ∞) = 1.

Hint. Use (1.2.13).
(d) Let ξ1, ξ2, . . . be a sequence of i.i.d. exponential random variables, that is, there exists λ > 0 such that P(ξ > t) = e^{−λt} for any t ≥ 0. Check that

P(lim sup_{n→∞} ξ_n/log n = 1/λ) = 1.

(e) Prove that for any random variable X and any sequence of positive numbers c_n satisfying lim_{n→∞} c_n = ∞, the following holds:

P(lim_{n→∞} X/c_n = 0) = 1.
Tail σ-algebra and Kolmogorov's 0-1 Law. Define T_n = σ(ξ_n, ξ_{n+1}, . . .). In other words, T_n is the σ-algebra generated by the random variables ξ_n, ξ_{n+1}, . . . . Then define the tail σ-algebra

T_∞ = ∩_n T_n.    (1.4.10)
Lemma 1.4.3. If {ξ_n} are independent, then T_∞ is trivial in the following sense: For any A ∈ T_∞, P(A) ∈ {0, 1}.

P(B|A) ≜ P(AB)/P(A)

P(B|A) = P(B) for any B ∈ ∪_n A_n.

Hence P(·|A) is σ-additive on the algebra ∪_n A_n, and by Caratheodory's theorem it should coincide with P on A_∞ = σ(ξ1, ξ2, . . .) ⊆ A. But T_∞ ⊆ A_∞, which means that P(A|A) = P(A); this is a contradiction to P(A) ∈ (0, 1).
Exercise 1.4.4. Let {ξ_n} be a sequence of i.i.d. random variables. Consider the normalized sums X_n defined in (LLN). Check that for any number a ∈ R ∪ {∞} the probability P(lim_n X_n = a) is either 0 or 1. The same regarding the probabilities P(lim sup_n X_n ≥ a) and P(lim sup_n X_n < a).
are always defined, albeit they may take values ±∞. Here is a necessary and suffi-
cient condition for P-a.s. existence of a finite limit X = limn→∞ Xn :
Proof. Assume that X = lim_{n→∞} X_n P-a.s. Recall Exercise 1.4.1, which states that this happens iff
Hence (1.5.3)
Theorem 1.5.1. Let {ξ_n} be a sequence of zero mean (Eξ_n = 0) independent random variables with Σ_n Var(ξ_n) < ∞. Then the random series X_n in (RS) converges P-a.s.

P(max_{1≤k≤n} |X_k| ≥ ε) ≤ E(X_n²)/ε² = Var(X_n)/ε².    (1.5.4)

In the last inequality we just used that |X_ℓ − X_m| ≤ |X_ℓ − X_n| + |X_m − X_n|. On the other hand (writing ℓ = n + k for ℓ > n),

P(sup_{ℓ≥n} |X_ℓ − X_n| > ε/2) = lim_{K→∞} P(max_{1≤k≤K} |X_{n+k} − X_n| > ε/2).
For each k = 1, . . . , n:

E(X_n² 1_{A_k}) = E((X_k + Σ_{i=k+1}^n ξ_i)² 1_{A_k})
= E(X_k² 1_{A_k}) + 2 E(X_k 1_{A_k} Σ_{i=k+1}^n ξ_i) + E(1_{A_k} (Σ_{i=k+1}^n ξ_i)²).

The last term above is non-negative. Since Σ_{i=k+1}^n ξ_i and X_k 1_{A_k} are independent, the second term above is zero. Finally, since by construction |X_k| ≥ ε on A_k, the first term above is bounded below by ε² E(1_{A_k}) = ε² P(A_k). We therefore conclude:

E(X_n²) ≥ Σ_{k=1}^n ε² P(A_k) = ε² P(A).

This is (1.5.4).
Exercise 1.5.1. (a) Let {ξ_n} be a sequence of independent random variables such that E(ξ_n) and Var(ξ_n) are defined and finite for every n. Check that if both Σ_n E(ξ_n) and Σ_n Var(ξ_n) converge, then the random series X_n in (RS) converges P-a.s.
(b) Find an example when X in (RS) is P-a.s. convergent, but Σ_n Var(ξ_n) = ∞.
(c) Wiener process on [0, 1]: Let ξ0, ξ1, . . . be i.i.d. standard normal N(0, 1) random variables. For each t ∈ [0, 1] set

W_t = ξ0 t + √2 Σ_{n=1}^∞ ξ_n sin(nπt)/(nπ).    (1.5.5)

Check that the random variables W_t are well defined for each t ∈ [0, 1].
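A deterministic check behind (1.5.5): since the ξ_n are independent N(0, 1), Var(W_t) = t² + 2 Σ_{n≥1} sin²(nπt)/(nπ)², and this series sums to t, the variance of a standard Wiener process at time t. A numerical sketch (the truncation level is an illustrative choice):

```python
import math

# Var(W_t) for the series (1.5.5): t^2 from the xi_0 * t term plus
# 2 * sum_{n>=1} sin^2(n*pi*t)/(n*pi)^2 from the independent sine terms.
# By Parseval's identity this equals t.
def var_Wt(t, nterms=20000):
    return t * t + 2 * sum(math.sin(n * math.pi * t) ** 2 / (n * math.pi) ** 2
                           for n in range(1, nterms + 1))

for t in (0.25, 0.5, 0.9):
    assert abs(var_Wt(t) - t) < 1e-3   # truncated series is close to t
```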
The following most general theorem about convergence of random series is given without proof:

Theorem 1.5.2. Let {ξ_n} be a sequence of independent random variables. A necessary and sufficient condition for the convergence of the random series X_n in (RS) is the convergence of each of the three (non-random) series below:

Σ_n P(|ξ_n| > r),  Σ_n E(ξ_n 1_{|ξ_n|≤r})  and  Σ_n Var(ξ_n 1_{|ξ_n|≤r})    (1.5.6)
P(lim X_n = µ) = 1.    (1.6.1)

for any ε > 0. Hence P(lim_{n→∞} (1/n) Σ_1^n ξ_i = 0) = 1.
CASE 2. E(ξ²) < ∞ (in particular E(|ξ|) < ∞ and hence µ ∈ R).
We proceed to assume that E(ξ_i) = 0. As in (1.6.2) one may check that

E((1/n Σ_{i=1}^n ξ_i)²) ≤ c/n.

Since, however, Σ_1^∞ 1/n = ∞, we cannot directly rely on the first Borel-Cantelli Lemma. In the sequel X_n = (1/n) Σ_1^n ξ_i.
Exercise 1.6.2. (a) Fix a number α > 1, and check that P(lim_{n→∞} X_{⌊n^α⌋} = 0) = 1.
(b) Check that for any ε > 0,

P(max_{⌊n^α⌋ ≤ k < ⌊(n+1)^α⌋} |X_k − X_{⌊n^α⌋}| ≥ ε  i.o.) = 0.

(c) Deduce from (a) and (b) above the statement of the LLN in CASE 2.
The variables ξ± are non-negative, and E(ξ) = ∞ means that E(ξ−) < ∞, whereas E(ξ+) = ∞. Thus X_n^− = (1/n) Σ_1^n ξ_i^− falls in the framework of CASE 3 above. We just need to show that

P(lim_{n→∞} (1/n) Σ ξ_i^+ = ∞) = 1.    (1.6.3)
Exercise 1.6.3. (a) Use the tail formula (1.2.13) to show the following: If Y is a non-negative random variable, then

Σ_{n=0}^∞ P(Y > n) ≥ E(Y) ≥ Σ_{n=1}^∞ P(Y > n).    (1.6.4)
(b) Show that if the ξ_i^+ are i.i.d. non-negative random variables with E(ξ_i^+) = ∞, then
(c) In the conditions of (b) above, let M > 0. Consider the (i.i.d.) random variables min{ξ_i^+, M} ≜ ξ_i^+ ∧ M ∈ [0, M]. Use the LLN to check that for any M,

P(lim inf_{n→∞} (1/n) Σ_{i=1}^n ξ_i^+ ≥ E(ξ^+ ∧ M)) = 1.

Use the lower bound in (1.6.4) to check that lim_{M→∞} E(ξ^+ ∧ M) = ∞, and complete the proof of the LLN in CASE 4.
Several classical examples related to laws of large numbers are listed below:
Exercise 1.6.4. Bernstein polynomials. Let f be a continuous function on [0, 1]. For any p ∈ [0, 1] consider ξ1, ξ2, . . . i.i.d. Bernoulli(p) random variables. Set S_n = Σ_1^n ξ_i. Let P_p be the corresponding probability. Consider

E_p f(S_n/n) = Σ_{k=0}^n f(k/n) \binom{n}{k} p^k (1 − p)^{n−k}.    (1.6.5)

The expression in (1.6.5) is a polynomial of order n (in the real variable p ∈ [0, 1]), the so-called Bernstein polynomial. Use the Chebyshev bound employed in the proof of the LLN to check that

lim_{n→∞} max_{p∈[0,1]} |f(p) − E_p f(S_n/n)| = 0.    (1.6.6)

Hint. Rely on the following property of continuous functions: If f is continuous on [0, 1], then there exists a (continuous) function δ_f with δ_f(0) = 0, called the modulus of continuity of f, such that for any t, s ∈ [0, 1],

|f(t) − f(s)| ≤ δ_f(|t − s|).
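The uniform approximation (1.6.6) is easy to observe numerically; a sketch with the illustrative choice f(p) = |p − 1/2| (grid and degrees are arbitrary):

```python
from math import comb

# Bernstein polynomial (1.6.5) of degree n, evaluated at p.
def bernstein(f, n, p):
    return sum(f(k / n) * comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(n + 1))

f = lambda p: abs(p - 0.5)       # continuous, with a kink at 1/2
grid = [i / 200 for i in range(201)]
err = lambda n: max(abs(f(p) - bernstein(f, n, p)) for p in grid)

assert err(200) < err(20) < err(5)   # uniform error decreases with n
assert err(200) < 0.05
```

For this f the error is largest near the kink at p = 1/2, where it decays like n^{−1/2}, consistent with the Chebyshev-based proof.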
Exercise 1.6.5. Monte-Carlo integration. Let f be a continuous function on [0, 1] and let U1, U2, . . . be i.i.d. Uni[0, 1] random variables. Then, P-a.s.,

lim_{n→∞} (1/n) Σ_{i=1}^n f(U_i) = ∫_0^1 f(t) dt.    (1.6.7)
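A minimal Monte-Carlo sketch of (1.6.7), with the illustrative choice f = sin (sample size and seed are arbitrary):

```python
import random, math

# Monte-Carlo integration (1.6.7): the empirical average of f(U_i) for
# U_i ~ Uni[0,1] approximates the integral of f over [0,1].
random.seed(2)
f = math.sin
n = 100000
estimate = sum(f(random.random()) for _ in range(n)) / n
exact = 1 - math.cos(1)   # integral of sin over [0,1], about 0.4597
```

The LLN guarantees convergence; the typical error here is of order 1/√n by the CLT.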
Exercise 1.6.6. Let U1, U2, . . . be i.i.d. Uni[0, 2] random variables. Define W_n = ∏_{i=1}^n U_i. Check that lim_{n→∞} W_n = 0 with probability one. Compute lim_{n→∞} (W_n)^{1/n}.
Exercise 1.6.7. Long term optimal investment problem (following Durrett). Assume that each $1 you invest in bonds in the beginning of any month yields a fixed sum $a in the end of this month. On the other hand, each $1 invested in stocks in the beginning of months 1, 2, 3, . . . yields $V1, $V2, $V3, . . . in the end of the corresponding months, where the V_n are i.i.d. positive random variables.
The problem is to choose the optimal proportion p ∈ [0, 1] of money to be invested into stocks. Once such p is chosen, each dollar invested in the beginning of the first month yields W_n = ∏_{i=1}^n (a(1 − p) + pV_i) in the end of month n.
Assume that there exists ε > 0 such that P(ε ≤ V_i ≤ ε^{−1}) = 1.
(a) Show that for any fixed p ∈ [0, 1] the limit

lim_{n→∞} (1/n) log W_n ≜ φ(p)

exists and is finite P-a.s.
(b) Check that if E(V) > a and E(V^{−1}) > a^{−1}, then there is a non-trivial optimal investment p* ∈ (0, 1).
Hint. Check that φ is a concave function: φ″(p) ≤ 0.
(c) Use the Jensen inequality to check that E(V^{−1}) ≤ a^{−1} implies that E(V) ≥ a.
(d) What is the optimal strategy if either E(V) ≤ a or E(V^{−1}) ≤ a^{−1}?
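For part (b), the LLN argument of part (a) gives φ(p) = E log(a(1 − p) + pV); for a two-valued V this can be computed exactly and maximized on a grid. A sketch (the values of a and the law of V are illustrative choices satisfying E(V) > a and E(V^{−1}) > a^{−1}):

```python
import math

# phi(p) = E log(a(1-p) + pV) for a two-point law of V.
a = 1.05
values, probs = (0.7, 1.6), (0.5, 0.5)   # E(V) = 1.15 > a, E(1/V) ~ 1.027 > 1/a

def phi(p):
    return sum(q * math.log(a * (1 - p) + p * v) for v, q in zip(values, probs))

grid = [i / 1000 for i in range(1001)]
p_star = max(grid, key=phi)
assert 0 < p_star < 1                       # non-trivial optimal proportion
assert phi(p_star) >= phi(0) and phi(p_star) >= phi(1)
```

Solving φ′(p) = 0 for these parameters gives p* ≈ 0.545, an interior optimum, as part (b) predicts.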
1.7 Further Exercises
Exercise 1.7.1. Let {Aα } be a family (not necessarily countable) of σ algebras of
subsets of Ω. Check that
∆
A = {A ⊆ Ω : A ∈ Aα for any α} = ∩α Aα
is a σ-algebra. Deduce that given a collection A0 the minimal σ-algebra A ⊃ A0 is
well defined via:
∆ ∆
\
A = σ(A0 ) = B,
B⊃A0
Hint for (c): Check that if both X and Y are discrete, then one can find a partition {C_i} of Ω such that

X(ω) = Σ_i x_i 1_{C_i}(ω)  and  Y(ω) = Σ_i y_i 1_{C_i}(ω).
Exercise 1.7.6. Use first Borel-Cantelli lemma to prove the following statement:
If limn→∞ Xn = X in probability, then there exists a sub-sequence nk such that
limk→∞ Xnk = X P-a.s.
Exercise 1.7.7. Check that a sequence of random variables {X_n} converges in probability iff for any ε > 0,

lim_{m,n→∞} P(|X_n − X_m| > ε) = 0.    (1.7.4)
Exercise 1.7.8. Let {ξ_i} be a sequence of i.i.d. ±1-valued random variables with P(ξ_n = 1) = P(ξ_n = −1) = 1/2. Let a1, a2, a3, . . . be a sequence of (non-random) numbers. Define

Y_n = Σ_{ℓ=1}^n a_ℓ ξ_ℓ.

Use Theorem 1.5.2 to show that the random series Y_n converges iff Σ_n a_n² < ∞.
Exercise 1.7.9. Prove the following general version of Exercise 1.5.1, which gives a general construction of Brownian motion on [0, 1]: Let ψ0, ψ1, . . . be a complete orthonormal basis of L2(0, 1), that is, any f ∈ L2(0, 1) can be represented as

f = Σ_{k=0}^∞ ⟨f, ψ_k⟩ ψ_k,  where for g, h ∈ L2(0, 1),  ⟨g, h⟩ ≜ ∫_0^1 g(t)h(t) dt.

Let ξ0, ξ1, . . . be i.i.d. standard normal N(0, 1) random variables. Then the series

B_t = Σ_{k=0}^∞ ξ_k ∫_0^t ψ_k(s) ds    (1.7.5)

converges P-a.s. for each t ∈ [0, 1]. Moreover, for any t, s ∈ [0, 1],

Cov(B_t, B_s) = min{s, t} ≜ s ∧ t.

Hint. Define I_t(s) = 1_{s≤t} and note that ∫_0^t ψ_k(s) ds = ⟨I_t, ψ_k⟩. Since {ψ_k} is a complete orthonormal basis, ∫_0^1 f(s)² ds = Σ_{k=0}^∞ ⟨f, ψ_k⟩² for any f ∈ L2(0, 1), in particular for f = I_t.
Ramsey's question is: Given M < N, does there exist a 2-coloring of K_N such that there are no monochromatic M-cliques?

Clearly, the smaller M is, the more difficult it is to color K_N without creating monochromatic M-cliques. For small M and large N this may even be impossible. Prove the following probabilistic lower bound: If

Hint. Color all the edges of K_N independently and estimate the probability that there exists a monochromatic M-clique. Think about what it means if this probability happens to be less than one.
2. RENEWAL THEORY IN DISCRETE TIME. 35
2 Renewal theory in discrete time.
2.1 The setup.
Let T1 , T2 , . . . be independent random variables. We assume that: T1 is N0 -valued,
and that T2 , T3 , . . . are identically distributed N ∪ {∞}-valued random variables. In
the sequel we use the following notation for probability functions f of T1 and p of
Ti ; i ≥ 2:
S_k = Σ_{i=1}^k T_i ∈ N_0,  V(n) = Σ_{k=1}^∞ 1_{S_k = n}  and  v(n) = E V(n) = P(∃ k : S_k = n).    (2.1.2)
where v^0(n) is the probability of an arrival and, respectively, m^0(n) is the expected number of arrivals for the un-delayed (T_1 = 0) renewal sequence.
In the defective case set
However, S_k^0 = S_k − T_1 is exactly the k-th arrival time of the zero-delayed renewal sequence. Therefore,

E(Σ_k Σ_{ℓ=0}^{n−1} 1_{T_1=ℓ} 1_{S_k−T_1=n−ℓ}) = Σ_{ℓ=0}^{n−1} f_ℓ v^0(n − ℓ).

In a similar fashion,

1_{S_{k−1}<n} 1_{S_k=n} = Σ_{ℓ=0}^{n−1} 1_{S_{k−1}=ℓ} 1_{T_k=n−ℓ}.

Hence

Σ_{k>1} E(1_{S_{k−1}=ℓ} 1_{T_k=n−ℓ}) = v(ℓ) p_{n−ℓ},
and the first of (2.2.1) follows. The second of (2.2.1) is an immediate consequence, since m(n) = Σ_{ℓ≤n} v(ℓ).
The proof of (2.2.3) goes along similar lines: Write

1_{ξ≤n} = 1_{T_1≤n} 1_{ξ≤n} = Σ_{ℓ=0}^n 1_{T_1=ℓ} 1_{ξ^0 ≤ n−ℓ},

where ξ^0 is the last renewal time of the zero-delayed renewal sequence S_k^0 = S_k − T_1, that is, S_1^0 = 0, S_2^0 = T_2, S_3^0 = T_2 + T_3, . . . . Taking expectations one gets (2.2.3).
Existence of solutions to (2.2.1) follows by recursion. Note that since v(0) =
m(0) = P(T1 = 0) are unambiguously defined, equations (2.2.1) always have unique
solutions. The renewal theorem describes the large n behaviour of these solutions.
Theorem 2.2.2. Define µ = E(T) ∈ (0, ∞]. Then,

P-a.s.  lim_{n→∞} M(n)/n = lim_{n→∞} m(n)/n = lim_{n→∞} v(n) = 1/µ.    (2.2.5)

We shall now prove only the first two limits in (2.2.5), which are usually referred to as the elementary renewal theorem. The full renewal theorem, lim_{n→∞} v(n) = 1/µ, will be explained later as an application of coupling methods.
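The first limit in (2.2.5) is easy to observe numerically; a minimal simulation sketch (the inter-arrival law, horizon and seed are illustrative choices):

```python
import random

# Elementary renewal theorem (2.2.5): M(n)/n -> 1/mu.  Here the
# inter-arrival times T_i are uniform on {1,...,5}, so mu = 3.
random.seed(3)
n = 200000
s, arrivals = 0, 0
while True:
    s += random.randint(1, 5)   # next inter-arrival time
    if s > n:
        break
    arrivals += 1               # M(n) counts arrivals S_k <= n
M_over_n = arrivals / n
# M_over_n should be close to 1/mu = 1/3
```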
Remark 2.2.1. Recall that in general the almost sure convergence lim_{n→∞} X_n = X P-a.s. does not imply that lim_{n→∞} E(X_n) = E(X), even if all the expectations are defined and finite. This means that the first limit in (2.2.5) does not automatically imply the second limit therein, and an additional argument is needed. Such an argument will be based on Wald's formula.
Note that the following events coincide:

{M(n) ≤ m} = {S_{m+1} > n}.    (2.2.6)

Indeed, since M(n) is non-decreasing in n, the limit exists. Now, for any m ∈ N,

P(lim_{n→∞} M(n) ≤ m) = lim_{n→∞} P(M(n) ≤ m).
However, by (2.2.6),

P(M(n) ≤ m) = P(S_{m+1} > n) = Σ_{ℓ=n+1}^∞ P(S_{m+1} = ℓ) = 1 − F_{S_{m+1}}(n).

Hence (2.2.7).
STEP 2. By STEP 1 and the SLLN for (1/n) S_n,

lim_{n→∞} S_{M(n)}/M(n) = µ  P-a.s.  ⇒  lim_{n→∞} M(n)/S_{M(n)} = 1/µ  P-a.s.    (2.2.8)

STEP 3. By definition S_{M(n)} ≤ n < S_{M(n)+1}. Hence,

M(n)/n ∈ [M(n)/S_{M(n)+1}, M(n)/S_{M(n)}].

By (2.2.8) the right end-points of the above intervals converge to 1/µ. As for the left end-points:

lim_{n→∞} M(n)/S_{M(n)+1} = lim_{n→∞} (M(n)+1)/S_{M(n)+1} · M(n)/(M(n)+1).

By STEP 1 we conclude that lim_{n→∞} M(n)/(M(n)+1) = 1 P-a.s. (simply because lim M(n) = ∞). On the other hand, lim_{n→∞} (M(n)+1)/S_{M(n)+1} = 1/µ by (2.2.8).
In order to prove the second limit in (2.2.5) we need to develop an additional
tool: Wald formula for stopping times.
Exercise 2.2.3. (a) Check that M is a stopping time with respect to {Ak } iff
{M = k} ∈ Ak for any k iff {M ≥ k} ∈ Ak−1 for any k .
(b) Show that if M1 and M2 are two stopping times with respect to a filtration {An },
then, N1 = min(M1 , M2 ) and N2 = max(M1 , M2 ) are both stopping times (with resp.
to the same filtration). In addition, show that a constant M = c is a stopping time
(with resp. to any filtration)
Consider now the following filtration associated to the non-negative i.i.d. random inter-arrival times {T_i}: A_k = σ(T_1, . . . , T_k). Note that for any n ∈ N the random variable M(n) + 1 is a stopping time with respect to {A_k}; however, M′ = M(n) is not, in general, a stopping time. Indeed, by (2.2.6), {M(n) ≤ k} = {S_{k+1} > n}, whereas

{M(n) + 1 ≤ k} = {S_k > n} ∈ A_k.
Now, since T_i^A ≤ T_i we obviously have M^A(n) ≥ M(n), and hence m^A(n) ≥ m(n). Therefore it is enough to check that for any A,

lim sup_{n→∞} m^A(n)/n ≤ 1/µ^A.    (2.2.13)

STEP 3. Since the T_i^A are bounded above by A,

1 ≥ S_{M^A(n)}/n ≥ (S_{M^A(n)+1} − A)/n.

Therefore, taking expectations and using Wald's formula again,

1 ≥ (µ^A (m^A(n) + 1) − A)/n

for any A > 0, and (2.2.13) follows.
(c) Note that in general T_i and R_i are dependent. What we require is independence of the pairs (T_i, R_i) for different i.
We continue to employ the notation M(n) for the number of arrivals by time n and S_k = Σ_{i=1}^k T_i for the time of the k-th arrival. Recall that S_{M(n)} ≤ n ≤ S_{M(n)+1}.
There are several ways to define the reward collected by time n:
There are several ways to define reward collected by time n:
Example 2.3.1. The T_i are inter-arrival times and the R_i are service times, that is, R_i is the time needed to serve the i-th customer. Then C_T(n) is the total time needed to serve all the customers who arrived before n.
lim_{n→∞} C_*(n)/n = r/µ  P-a.s.,    (2.3.4)

for * = I, T, P. Moreover,

lim_{n→∞} E(C_*(n))/n = r/µ.    (2.3.5)
Recall that we have already checked that a.s.-lim_{n→∞} M(n) = ∞, and that a.s.-lim_{n→∞} M(n)/n = 1/µ; the latter is just the elementary renewal theorem. On the other hand,

a.s.-lim_{M→∞} (1/M) Σ_{k=1}^M R_k = r,

by the strong LLN. The same logic applies for initial rewards:

C_I(n)/n = (M(n)+1)/n · (1/(M(n)+1)) Σ_{k=1}^{M(n)+1} R_k.
Remark 2.3.1. One can incorporate the case P(T_i = 0) = p_0 > 0 as follows: Consider N-valued (and hence positive) i.i.d. random variables T̃_i with P(T̃_i = ℓ) = p_ℓ/(1 − p_0) for ℓ = 1, 2, . . . , and an independent i.i.d. sequence N_k such that N_k ∼ Geo(1 − p_0). In this way the k-th (single) arrival for the T̃_i-process corresponds to a simultaneous arrival of N_k customers in the original process. Precisely, set S̃_k = Σ_{ℓ=1}^k T̃_ℓ. Then,

M(n) = Σ_k N_k 1_{S̃_k ≤ n}.    (2.3.6)

M(n) in (2.3.6) makes sense even if the i.i.d. random variables {N_ℓ} have a distribution different from geometric. In this case the picture falls within the general framework of the renewal-reward theorem.
Size bias. The size bias (or biased sampling) is the following statement
Theorem 2.4.1. Recall our notation p_ℓ = P(T_i = ℓ) and µ = E T_i. Then, for any ℓ ∈ N,

lim_{n→∞} (1/n) Σ_{k=1}^n 1_{S_{M(k)+1} − S_{M(k)} = ℓ} = ℓ p_ℓ / µ.    (2.4.1)
Formula (2.4.1) has the following interpretation: We choose a random time uniformly from {1, . . . , n} and ask for the probability that this random time falls into an inter-renewal interval of duration ℓ. Then (2.4.1) describes the asymptotics of this probability as n → ∞. Since the right hand side of (2.4.1) is different from p_ℓ, there is a (size) bias for sampled interval lengths as compared to the unbiased distribution {p_ℓ}.
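The size-biased limit (2.4.1) shows up clearly in simulation; a sketch with T_i uniform on {1, 2, 3}, so that p_ℓ = 1/3, µ = 2 and the limiting fractions are ℓ/6 (horizon and seed are illustrative choices):

```python
import random

# Size-biased sampling (2.4.1): the fraction of time points k in {1,...,n}
# whose covering inter-renewal interval has length l tends to l*p_l/mu,
# not to p_l.  Here T_i ~ Uni{1,2,3}: p_l = 1/3, mu = 2, limit = l/6.
random.seed(4)
n = 300000
counts = {1: 0, 2: 0, 3: 0}
s = 0
while s < n:
    t = random.randint(1, 3)
    covered = min(s + t, n) - s   # time points s+1, ..., min(s+t, n)
    counts[t] += covered
    s += t
fractions = {l: c / n for l, c in counts.items()}
# fractions[l] should be close to l/6, i.e. 1/6, 1/3, 1/2
```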
Proof of Theorem 2.4.1. We can rewrite the left hand side of (2.4.1) in terms of renewals with partial reward as follows: Set R_k = ℓ 1_{T_k=ℓ} and set
The exercises below are mostly borrowed from the books by Durrett and Grimmett-Stirzaker.
Exercise 2.4.1. Suppose the lifetime of a car is a random variable with probability function p. B always buys a new car on January 1, as soon as the old one breaks down during the preceding year or reaches age N years. Suppose a new car costs a NIS and that an additional cost of b NIS is incurred if the car breaks down before N. What is the long-run cost per unit time of B's car policy?
Calculate the cost (as a function of N) when the lifetime is uniformly distributed on {1, . . . , 10} (meaning that the probability that the car breaks down during the i-th year is 1/10 for i = 1, . . . , 10), a = 10, and b = 3.
Hint. Take U1, U2, U3, . . . i.i.d. Uni{1, . . . , 10} random variables and set inter-renewal times T_i = min{U_i, N} and rewards R_i = a + b 1_{T_i < N}.
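A numerical companion, computing the long-run cost rate E(R)/E(T) from the renewal-reward theorem; here we read the extra cost b as incurred when the car breaks down strictly before age N (T_i < N), which is our interpretation of the exercise:

```python
# Renewal-reward cost rate for Exercise 2.4.1: T = min(U, N) with U uniform
# on {1,...,10}, reward R = a + b*1{T < N} (extra cost b charged when the
# car breaks down strictly before age N -- our reading of the exercise),
# a = 10, b = 3 as in the exercise.
a, b = 10, 3

def cost_rate(N):
    ET = sum(i / 10 for i in range(1, N)) + N * (10 - N + 1) / 10
    ER = a + b * (N - 1) / 10        # P(T < N) = P(U <= N-1) = (N-1)/10
    return ER / ET

rates = {N: cost_rate(N) for N in range(1, 11)}
N_best = min(rates, key=rates.get)
# under this convention the rate is minimized at an interior N
```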
Exercise 2.4.3. A machine is working for a geometric time with parameter p1 > 0
and is being repaired for a geometric time with parameter p2 > 0. At what rate does
the machine break down? Calculate the probability that the machine is working at a
time point uniformly chosen from {0, . . . , n} as n → ∞.
Exercise 2.4.4. Bears arrive in a village at the instants of a renewal process with
ET1 = µ. They are captured and locked in a safe place which costs c NIS per unit
time per bear. When N bears have been captured, an expedition costing d NIS is
organized to remove and release them far away. What is the long-run average cost of this policy?
Exercise 2.4.5. The weather in a certain locale consists of alternating wet and dry
spells. Suppose that the number of days in each rainy spell follows a Poisson distribution with mean 2, and that the duration of a dry spell follows a Geometric distribution with mean 7. Assume that the successive durations of rainy and dry spells are independent.
Calculate the probability that it rains at a point uniformly chosen from {0, . . . , n} as
n → ∞.
describes the total number of passengers who departed before time n and whose arrival "fell" into an inter-arrival (buses) interval of length less than or equal to u. Note that A(S_{M(n)}) describes the total number of passengers who departed from the station by time n. By the renewal-reward theorem, the limiting ratio of passengers who fall into inter-arrival (buses) times of length ≤ u is still given by the size bias law (2.4.1).
Excess life distribution and stationarity. Define the excess life at time n by E^e(n) = S_{M(n)+1} − n − 1 (so that E^e(n) = 0 exactly when there is a renewal at time n + 1). Note that it might happen that P(E^e(n) = 0) > 0 despite our assumption that T_i ∈ N.
Exercise 2.4.6. Check that for any ℓ ∈ N_0 the following limit exists P-a.s.:
lim_{n→∞} (1/n) Σ_{k=1}^n 1{E^e(k) = ℓ} = (Σ_{j=ℓ+1}^∞ p_j)/(Σ_{j=1}^∞ j p_j) = P(T > ℓ)/µ ≜ p^e(ℓ). (2.4.4)
Exercise 2.4.7. Compute the discrete excess life distribution {p^e(ℓ)} for the T_i -s distributed according to the following laws:
(a) Poisson(λ).
(b) Geo(p).
(c) Bin(n, p).
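Case (b) can also be checked by simulation. The sketch below (Python; the helper is ours) compares the empirical excess-life frequencies of an un-delayed renewal process with Geo(p) inter-arrivals against the limit in (2.4.4), using E(n) = S_{M(n)+1} − n − 1 so that the value 0 is attainable.

```python
import random

def excess_life_freqs(p=0.4, n_max=200_000, seed=1):
    """Empirical frequencies of the excess life E(n) = S_{M(n)+1} - n - 1
    for an un-delayed renewal process with Geo(p) inter-arrival times."""
    rng = random.Random(seed)
    counts = {}
    s = 0                      # time of the current renewal
    while s < n_max:
        T = 1                  # sample T ~ Geo(p) on {1, 2, ...}
        while rng.random() > p:
            T += 1
        for n in range(s, min(s + T, n_max)):
            e = s + T - n - 1  # excess life at time n
            counts[e] = counts.get(e, 0) + 1
        s += T
    return {e: c / n_max for e, c in sorted(counts.items())}

freqs = excess_life_freqs()
# (2.4.4) predicts p^e(l) = P(T > l)/mu = p*(1-p)**l, a Geo law on N_0
```

For the geometric law the excess life distribution is again geometric (shifted to N_0), a reflection of the memoryless property.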
Stationarity. We shall use Pe , Ske , M e etc for the delayed renewal with the delay
T1e distributed according to the excess life distribution.
Exercise 2.4.8. For the delay T_1 = T_1^e distributed according to the limiting excess life distribution {p^e(ℓ)} in (2.4.4), check that v(n) ≡ 1/µ is the unique solution to (2.2.1), and that m^e(n) = (n + 1)/µ for any n ∈ N_0.
By Exercise 2.4.8 the P^e-probability v(n) of having a renewal does not change with time. This is an expression of stationarity. Let us elaborate on this notion.
In the language of the next section, {E e (n)} is a Markov chain on N0 with matrix of
transition probabilities P and stationary distribution pe .
By the above, under P^e the sequence of excess life times is stationary in the following sense:
P^e(E^e(n) = ℓ) = p^e(ℓ), (2.4.8)
for any n, ℓ ∈ N_0. In other words, under P^e the excess life time E^e(n) has the same distribution p^e for any time n ∈ N_0. In fact (2.4.8) implies that the renewal process M^e itself is stationary, that is for any k ∈ N the distribution of M^e(n + k) − M^e(n) does not depend on n. Indeed, first of all,
P^e(M^e(n + k) − M^e(n) = 0) = P^e(E^e(n) > k) = Σ_{j>k} p^e(j).
Pe (V ∈ A) = Pe (θn V ∈ A) , (2.4.11)
p_ℓ = P(T = ℓ) for ℓ = 1, 2, 3, . . . satisfy Σ_{ℓ=1}^∞ p_ℓ = 1 and Σ_{ℓ=1}^∞ ℓ p_ℓ = µ < ∞. (2.5.1)
In this case the limiting excess life distribution p^e(ℓ) is defined, see (2.4.4), and the corresponding renewal process M^e(n) is stationary. As before we use P^e and P^0 for the laws of the stationary and un-delayed renewals, and reserve P for a general delayed renewal.
Exercise 2.5.1. Check that any delayed renewal process M satisfies the following property: Define
Ak = {Renewal at time k} .
Then conditionally on Ak , the process M̃ (n) = M (k + n) − M (k) is distributed like (meaning has the same finite
dimensional distributions) the un-delayed renewal process. In other words for any n1 < n2 < · · · < nj and any
ℓ_1 ≤ ℓ_2 ≤ · · · ≤ ℓ_j,
P(M(k + n_1) − M(k) = ℓ_1, . . . , M(k + n_j) − M(k) = ℓ_j | A_k)
= P^0(M(n_1) = ℓ_1, . . . , M(n_j) = ℓ_j). (2.5.2)
We would like to argue that under the above assumptions on inter-arrival times (and under the tacit assumption P(T_1 < ∞) = 1), the delayed renewal M will converge to M^e. In order to make such a statement precise let us relabel M(·) as follows: Set
V_n = 1{Renewal at time n} = 1{∃k : S_k = n}.
Thus {V_n} is just a random sequence of 0-s and 1-s. Evidently, one can recover M(·) from {V_n} and vice versa. We reserve the notation {V_n^e} for the stationary renewal with T_1 distributed according to the limiting excess life distribution (2.4.4).
Definition. Let us say that M(·) converges to M^e(·) as n → ∞ if
lim_{n→∞} sup_A |P((V_n, V_{n+1}, . . . ) ∈ A) − P^e(V ∈ A)| = 0, (2.5.3)
where the supremum is over all cylindrical (that is, depending only on finitely many coordinates) subsets A ⊆ {0, 1}^{N_0}.
In fact it would be enough to consider only the particular cases of delayed renewal P^a with P(T_1 = a) = 1. Indeed, in general,
P(·) = Σ_{a=0}^∞ P(T_1 = a) P^a(·).
Since P^e(·) = Σ_{a=0}^∞ p^e(a) P^a(·), we conclude that (2.5.3) is equivalent to the following statement: For n ∈ N_0 and a sequence v = (v_0, v_1, . . . ) ∈ {0, 1}^{N_0} define the shift θ_n v = (v_n, v_{n+1}, . . . ). Then
lim_{n→∞} sup_A |P^a(θ_n V ∈ A) − P^e(V ∈ A)| = 0 for every a ∈ N_0. (2.5.4)
We shall assume aperiodicity:
g.c.d.{ℓ ≥ 1 : p_ℓ > 0} = 1, (2.5.5)
where g.c.d stands for the greatest common divisor. It is known that (2.5.5) implies that there exists n_0 < ∞ such that
v_0(n) ≜ E^0(V_n) = P^0(V_n = 1) > 0 (2.5.6)
for every n ≥ n_0.
We proceed to work under Assumption (2.5.6). The proof of (2.5.4) is split into several steps.
STEP 1. (Basic coupling construction). Let {Vn } and {Ṽn } be two renewal sequences which correspond to different
(but both P-a.s. finite) delays, for instance to 0-delay and to a-delay as in (2.5.4), but in the beginning it would
make sense to do things in general. Since we want to make a statement about the marginal laws P and P̃, there is
no loss to assume that {Vn } and {Ṽn } are independent. We use Q for the corresponding product law.
Exercise 2.5.2. (a) Check that if {V_n} and {Ṽ_n} are independent, then
U_n = V_n · Ṽ_n (2.5.7)
is again a delayed renewal sequence.
(b) Check the coupling bound |Q(θ_n V ∈ A) − Q(θ_n Ṽ ∈ A)| ≤ Q(τ > n), where τ ≜ inf{n > 0 : V_n · Ṽ_n = 1} is the first simultaneous renewal, for any n and for any cylindrical A ⊆ {0, 1}^{N_0}. In particular, if Q^a is the product measure for independent copies of the 0-delayed and a-delayed renewals, and if τ^a = inf{n > 0 : V_n^0 · V_n^a = 1}, then (2.5.4) would follow if we show that
Q^a(τ^a < ∞) = 1. (2.5.10)
STEP 2. At this stage let us consider two independent copies {V_n^e} and {Ṽ_n^e} of the stationary renewal process. Let Q^e be the corresponding product measure. As we know from Exercise 2.5.2(a), the process U_n^e = V_n^e · Ṽ_n^e is a delayed renewal sequence under Q^e. Let J_1^e and J_1, J_2, . . . be the corresponding independent delay and the i.i.d inter-arrival times. By stationarity and independence of the two copies,
Q^e(U_n^e = 1) = P^e(V_n^e = 1)² = P^e(E^e(n) = 0)² = 1/µ² > 0. (2.5.11)
lim_{n→∞} (1/n) Q^e(Σ_{k=1}^n U_k^e) = Q^e(J_1^e < ∞)/Q^e(J_1). (2.5.12)
We could write the conditional expectation since, by (2.5.6), Q^0(V_m · Ṽ_{m+a} = 1) = g(m) g(m + a) > 0 as soon as m ≥ n_0. But since we have 1 on the left-hand side of (2.5.14), it immediately follows that
Q^0(U_n = 1 i.o. | V_m · Ṽ_{m+a} = 1) = 1.
It remains to notice that under Q^0(· | V_m · Ṽ_{m+a} = 1) the shifted sequences W_k = V_{m+k} and W̃_k = 1{k ≥ a} Ṽ_{m+k} are independent and distributed according to the product measure Q^a. Hence,
Q^a(W_k = W̃_k = 1 i.o.) = 1 ⇒ (2.5.10).
Key renewal theorem in discrete case. Assume (2.5.1) and (2.5.5). Then, the last of (2.2.5) holds,
that is,
lim_{n→∞} v(n) = 1/µ. (2.5.15)
Moreover, let τ^e be the coupling time between the independent zero-delayed and stationary renewal sequences under the product measure Q^e. Then,
|v(n) − 1/µ| ≤ 2 Q^e(ϕ(τ^e))/ϕ(n),
for any non-negative increasing function ϕ.
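The limit (2.5.15) is easy to observe numerically: the un-delayed renewal probabilities solve the discrete renewal equation v(0) = 1, v(n) = Σ_{ℓ≤n} p_ℓ v(n − ℓ) (cf. (2.2.1)), which can be iterated exactly. A Python sketch (the aperiodic inter-arrival law is our own example):

```python
def renewal_probabilities(p, n_max):
    """v(n) = P^0(renewal at time n), computed from the renewal
    equation v(n) = sum_l p_l * v(n - l) with v(0) = 1."""
    v = [1.0]
    for n in range(1, n_max + 1):
        v.append(sum(q * v[n - l] for l, q in p.items() if l <= n))
    return v

p = {1: 0.5, 2: 0.3, 3: 0.2}           # aperiodic: g.c.d.{1, 2, 3} = 1
mu = sum(l * q for l, q in p.items())  # mu = 1.7
v = renewal_probabilities(p, 50)
# v(n) -> 1/mu, cf. (2.5.15); the convergence is geometrically fast here
```

For this law the error |v(n) − 1/µ| decays geometrically, consistent with the coupling bound above.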
3 Conditional expectations.
Let (Ω, A, P) be a probability space, and let B ⊆ A be a sub σ-algebra. A random
variable Y is said to be B-measurable, Y ∼ B, if
{ω : Y(ω) ≤ y} ∈ B for any y ∈ R.
Remark 3.1.1. In fact, it is enough to check (3.1.1) only for Y-s which are indicators of
events from B, that is for Y = 1B for some B ∈ B.
Remark 3.1.3. If (X, Y) is a continuous random vector with joint probability density fXY ,
and if B = σ(Y), then
E(X | B)(ω) = E(X | Y)(ω) ≜ ∫ x f_{X|Y}(x | Y(ω)) dx, (3.1.4)
where f_{X|Y}(x|y) = (f_{XY}(x, y)/f_Y(y)) 1{f_Y(y) > 0} is the conditional density.
Exercise 3.1.1. Let U ∼ Uni(0, 1) and n ∈ N. Define the conditional distribution
of Sn given U = u to be binomial Bin(n, u). That is
P(S_n = k) = ∫_0^1 P(S_n = k | U = u) du = ∫_0^1 \binom{n}{k} u^k (1 − u)^{n−k} du.
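The Beta integral gives the value in closed form: ∫_0^1 u^k (1 − u)^{n−k} du = k!(n − k)!/(n + 1)!, so P(S_n = k) = 1/(n + 1) for every k, i.e. S_n is uniform on {0, . . . , n}. A quick exact check (Python sketch; the helper `p_sn` is ours):

```python
from fractions import Fraction
from math import comb, factorial

def p_sn(n, k):
    """P(S_n = k) = C(n,k) * B(k+1, n-k+1), where the Beta integral
    B(a, b) equals (a-1)!(b-1)!/(a+b-1)! at integer arguments."""
    beta = Fraction(factorial(k) * factorial(n - k), factorial(n + 1))
    return comb(n, k) * beta

n = 7
dist = [p_sn(n, k) for k in range(n + 1)]
# every entry equals 1/(n+1): S_n is uniform on {0, ..., n}
```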
P4. (Projection/least squares property). If X has finite variance, E(X²) < ∞, then the conditional expectation Z = E(X | B) solves the following minimization problem:
P5. (Inequalities). The Cauchy-Schwarz, Hölder and Jensen inequalities hold for conditional expectations. For instance, if ϕ is convex and E(|ϕ(X)|) < ∞, then
E(ϕ(X) | B)(ω) ≥ ϕ(E(X | B)(ω)) P-a.s. (3.1.9)
Exercise 3.1.5. Let B be an atomic σ-algebra. Check that E(X|B) defined in (3.1.3)
indeed satisfies P1-P3 and P5-P6.
or as the left derivative at s = 1. The same regarding (d²/ds²) G_Z(1). In general G_Z is
1 − G_{Z_0}'(1 − α_∞) < 1,
if µ > 1.
Theorem 3.3.1. If a < 1, then the size |C| is of order 1. If a > 1, then, with positive probability, the size |C| is of order n: precisely, there exists δ > 0 such that
lim inf_{n→∞} P_n(|C| > δn) > 0. (3.3.3)
Sketch of the proof. We can think about C being constructed as follows: Set I_0 = {1}. Then iterate:
I_{t+1} = {j ∈ K_n \ I_t : ∃ i ∈ I_t with η(ij) = 1}. (3.3.4)
Upper bound. Consider the branching process Z_t^u with branching mechanism Bin(n, a/n). Then one can construct (coupling) Z_t and Z_t^u on the same probability space such that Z_t ≤ Z_t^u for all t = 1, . . . , n.
Lower bound. For ε ∈ (0, 1) consider the branching process Z_t^l with branching mechanism Bin(n(1 − ε), a/n). Then one can construct (coupling) Z_t and Z_t^l on the same probability space such that Z_t ≥ Z_t^l for all t = 1, . . . , n.
Given the above Upper and Lower bounds we are left with the following task:
Let Z_n be a branching process with branching mechanism Bin(n, b/n). Then, if b < 1,
lim_{K→∞} lim sup_{n→∞} P_n(Σ_{k=1}^∞ Z_k > K) = 0. (3.3.6)
On the other hand, if b > 1, then there exists δ > 0 such that
lim inf_{n→∞} P_n(Σ_{k=1}^∞ Z_k > δn) > 0. (3.3.7)
T = inf {k ≥ 1 : Sk = 0} . (3.3.8)
In order to derive (3.3.6) and (3.3.7) from (3.3.8) one needs to develop an elementary large deviation theory, and we shall return to these issues in the sequel.
4 Markov Chains.
4.1 The setup.
Let S be a finite or countable set. In the sequel we shall call it state space. Let
(Ω, F, P) be a probability space. Consider a sequence X = (X0 , X1 , . . .) of S-valued
random variables. Define the following filtration of sigma-algebras:
Fn = σ (X0 , . . . , Xn ) ; n = 0, 1, 2, . . . .
Markov property. A random sequence X is called a Markov chain with transition matrix P = (P(x, y)) if for any n and any bounded function f on S,
E(f(X_{n+1}) | F_n) = Pf(X_n) P-a.s., where Pf(x) ≜ Σ_{y∈S} P(x, y) f(y). (4.1.2)
Alternative definitions. The usual definition is: For any n and any x0 , . . . , xn , xn+1 ∈
S,
P (Xn+1 = xn+1 | Xn = xn , . . . , X0 = x0 ) = P(xn , xn+1 ). (4.1.3)
The relation (4.1.3) implies that finite dimensional distributions of X under Pµ are
given by
P_µ(X_0 = x_0, X_1 = x_1, . . . , X_n = x_n) = µ(x_0) P(x_0, x_1) P(x_1, x_2) · · · P(x_{n−1}, x_n). (4.1.4)
By Carathéodory's extension theorem (4.1.4) has a unique extension to (S^{N_0}, A_∞), where A_∞ is the σ-algebra generated by all cylindrical subsets of S^{N_0}; in other words (4.1.4) unambiguously defines the distribution of the infinite random sequence (process) X.
To formulate a more upscaled version of (4.1.2) let F be a bounded measurable1
function on SN0 . For instance, let F be a local function, which means that there
exists k < ∞ such that F depends only on first k + 1 coordinates of x = (x0 , x1 , . . . );
F (x) = F (x0 , . . . , xk ).
Example 4.1.1. Let y ∈ S, k ∈ N, and consider
F(x_0, x_1, x_2, . . . ) = Σ_{i=1}^k δ_y(x_i).
That is, F(X) is the number of visits to y by the Markov Chain X during the first k steps. Define
P[F](x) = E(F(X) | X_0 = x) = E_x F(X).
Recall our notation for shifts: θn (x0 , x1 , . . .) = (xn , xn+1 , . . .). In Example 4.1.1,
F (θn X) is the number of visits to y by X during the time interval {n + 1, n + 2, . . . , n + k}.
The upscaled version of the Markov property states that for any bounded measurable function F and for any n,
E(F(θ_n X) | F_n) = P[F](X_n) P-a.s. (4.1.5)
We shall take (4.1.5) for granted (see though Exercise 4.1.1 which is a first step
towards deriving (4.1.5) from (4.1.2)).
In order to see how (4.1.2) implies (4.1.3) consider the function f (x) = δxn+1 (x)
and the event B = {X0 = x0 , . . . , Xn = xn }. Of course B ∈ Fn . Note that for any
x ∈ S and for any function g on S,
Pf (x) = P(x, xn+1 ) and g(Xn )1B = g(xn )1B . (4.1.6)
Now,
P(X_{n+1} = x_{n+1}, X_n = x_n, . . . , X_0 = x_0) = E(f(X_{n+1}) 1_B) = E(1_B E(f(X_{n+1}) | F_n))
= E(1_B Pf(X_n)) = E(1_B P(x_n, x_{n+1})) = P(B) P(x_n, x_{n+1}),
where the third equality uses (4.1.6).
This is (4.1.3).
Exercise 4.1.1. (a) Check that (4.1.2) implies (4.1.3).
(b) Check that (4.1.2) (and hence (4.1.3)) imply the following: for any 0 ≤ n1 <
n2 · · · < nk < nk+1 < · · · < nk+` and for any x1 , x2 , . . . , xk+` ∈ S,
P(X_{n_{k+ℓ}} = x_{k+ℓ}, . . . , X_{n_{k+1}} = x_{k+1} | X_{n_k} = x_k, . . . , X_{n_1} = x_1)
= P(X_{n_{k+ℓ}} = x_{k+ℓ}, . . . , X_{n_{k+1}} = x_{k+1} | X_{n_k} = x_k). (4.1.7)
¹Measurable here means that F(X) is a random variable.
4.2 Examples.
Random walk on Z^d. Let ξ_1, ξ_2, . . . be a collection of i.i.d Z^d-valued random variables. We think about the ξ_ℓ-s as the steps of the random walk X_n. In its turn, the position of the walker at time n is given by: Fix a starting point x ∈ Z^d. Then,
X_0 = x and, for n ≥ 1, X_n = x + Σ_{ℓ=1}^n ξ_ℓ. (4.2.1)
In this example S = Z^d and P(x, y) = P(ξ = y − x).
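A minimal simulation of (4.2.1) (Python sketch; the function name is ours), with steps uniform over the 2d unit vectors ±e_i, i.e. the simple random walk:

```python
import random

def simple_random_walk(d=2, n_steps=10, x=None, seed=42):
    """One trajectory of (4.2.1): X_0 = x and X_n = x + xi_1 + ... + xi_n,
    with i.i.d. steps uniform over the 2d unit vectors of Z^d."""
    rng = random.Random(seed)
    pos = list(x) if x is not None else [0] * d
    path = [tuple(pos)]
    for _ in range(n_steps):
        i = rng.randrange(d)            # coordinate to move in
        pos[i] += rng.choice((-1, 1))   # step xi = +-e_i, prob 1/(2d) each
        path.append(tuple(pos))
    return path

path = simple_random_walk()
```

Each consecutive pair of positions differs by exactly one unit in one coordinate, as the transition kernel P(x, y) = P(ξ = y − x) requires.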
Repair shop. Let ξ_1, ξ_2, . . . be the i.i.d numbers of machines brought for repair to the repair shop on the mornings of days 1, 2, . . . . Assume that the shop is capable of repairing exactly one machine per day. Let X_0 be a (random) number of machines awaiting repair at the end of day 0, which we assume to be independent of the ξ_ℓ-s. Define X_n as the number of machines awaiting repair at the end of day n. Then,
For any n ≥ 0 Xn+1 = max {Xn + ξn+1 − 1, 0} . (4.2.4)
If we set p_ℓ = P(ξ = ℓ), then the matrix of transition probabilities is given by:
P(0, ℓ) = p_0 + p_1 if ℓ = 0 and P(0, ℓ) = p_{ℓ+1} if ℓ > 0; for k > 0, P(k, ℓ) = p_0 1{ℓ = k − 1} + p_{ℓ−k+1} 1{ℓ ≥ k}. (4.2.5)
The next two models are not explicitly based on a sequence of i.i.d random variables.
Fisher-Wright model (of fixed size haploid population). Let the population size
N be fixed. Each member of a certain generation is either of type A or of type
B. We assume that each member of (n + 1)-st generation picks a type at random
from one of the members of generation n, and independently from other members
of the (n + 1)-st generation. X_n is the random number of members of type A in generation n. The state space is finite, S = {0, . . . , N}, and the transition probabilities are:
P(k, ℓ) = δ_0(k) δ_0(ℓ) + δ_N(k) δ_N(ℓ) + 1{0 < k < N} \binom{N}{ℓ} (k/N)^ℓ (1 − k/N)^{N−ℓ}. (4.2.6)
Ehrenfest chain (particle exchange dynamics between two containers). There
are N particles at states A or B. At each step one chooses a particle at random and
changes its state. Let Xn be the number of particles in state A at time n.
As before the state space is finite; S = {0, . . . , N }. However, the matrix of
transition probabilities is given by:
P(ℓ, ℓ + 1) = ((N − ℓ)/N) 1{ℓ < N} and P(ℓ, ℓ − 1) = (ℓ/N) 1{ℓ > 0}. (4.2.7)
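The Ehrenfest matrix (4.2.7) is small enough to write down exactly. The sketch below (Python with exact rational arithmetic; it also uses the well-known fact, not stated above, that the invariant distribution of the Ehrenfest chain is Bin(N, 1/2)) verifies invariance πP = π directly:

```python
from fractions import Fraction
from math import comb

def ehrenfest_P(N):
    """Transition matrix (4.2.7) on S = {0, ..., N}."""
    P = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
    for l in range(N + 1):
        if l < N:
            P[l][l + 1] = Fraction(N - l, N)
        if l > 0:
            P[l][l - 1] = Fraction(l, N)
    return P

N = 6
P = ehrenfest_P(N)
pi = [Fraction(comb(N, k), 2 ** N) for k in range(N + 1)]   # Bin(N, 1/2)
piP = [sum(pi[x] * P[x][y] for x in range(N + 1)) for y in range(N + 1)]
# piP == pi: the binomial law is invariant for the Ehrenfest chain
```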
Birth and death processes in discrete time. The state space is S ⊆ N0 . The
jump probabilities P(`, k) are
Exercise 4.3.1. (a) Prove that in general the n-step transition probabilities of a Markov
chain are given by
P(X_n = y | X_0 = x) = P^n(x, y), (4.3.1)
where P^n(x, y) are the entries of the n-th power of the transition matrix P.
(b) Prove that the distribution πn (vector) of Xn could be recovered from the initial distri-
bution π0 via:
πn = π0 Pn . (4.3.2)
(c) Consider the two-state MC with transition matrix
P = [ 1 − α   α ; β   1 − β ].
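For part (c), iterating (4.3.2) is a one-liner; the sketch below (Python; the particular values of α, β are ours) shows π_n approaching the stationary vector (β/(α + β), α/(α + β)):

```python
def step(pi, P):
    """One step of (4.3.2): pi_{n+1} = pi_n P (row vector times matrix)."""
    return tuple(sum(pi[x] * P[x][y] for x in range(len(pi)))
                 for y in range(len(P[0])))

alpha, beta = 0.3, 0.1
P = [[1 - alpha, alpha],
     [beta, 1 - beta]]
pi = (1.0, 0.0)                          # initial distribution: start at state 0
for _ in range(200):
    pi = step(pi, P)
# pi is now very close to (beta/(alpha+beta), alpha/(alpha+beta)) = (0.25, 0.75)
```

The convergence rate is governed by the second eigenvalue 1 − α − β of P.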
Since
{H_x = n} = {X_0 ≠ x, X_1 ≠ x, . . . , X_{n−1} ≠ x, X_n = x} ∈ F_n,
the random variable H_x is a stopping time. The same regarding T_x. Note that H_x ≠ T_x only if the chain starts at x.
Exercise 4.4.1. Consider a simple random walk Xn on Z with jump probabilities
P (ξ = 1) = p = 1 − P (ξ = −1). Assume that p > 1/2 and that the walk starts at
zero.
(a) Check that P (limn→∞ Xn = ∞) = 1.
(b) For x > 0 consider the last visit time,
Lx = sup {n : Xn = x} .
Check that P (Lx < ∞) = 1 (that is sup is actually P-a.s. max). Check that Lx is
not a stopping time.
Let T be a stopping time.
Definition.
FT = {B ∈ F : B ∩ {T = n} ∈ Fn ∀n ∈ N} . (4.4.2)
N_y^x describes the random number of visits to y before the first hitting/return time of x. Consider for instance B = {N_y^x = 5}. Then, for any n ∈ N (since by construction T_x ≥ 1, there is no point considering n = 0),
B ∩ {T_x = n} = {Σ_{k=1}^n δ_y(X_k) = 5} ∩ {T_x = n} ∈ F_n.
Strong Markov Property. Let X be a Markov chain (on a finite or countable state
space S) and let T be a stopping time. Given a sequence x = (x0 , x1 , . . .) ∈ SN0 define the
random shift θT x = (xT , xT+1 , . . .). Then for any bounded measurable function F on SN0
(see the footnote after (4.1.3)),
E(1{T < ∞} F(θ_T X) | F_T) = 1{T < ∞} P[F](X_T), (4.4.4)
P-a.s. In particular,
E(F(θ_T X) | T < ∞) = E(P[F](X_T) | T < ∞). (4.4.5)
Informally, the Strong Markov Property means that the chain starts afresh (forgets about the past) at a stopping time.
Proof of (4.4.5).
E(F(θ_T X) 1{T < ∞}) = Σ_{n=0}^∞ E(F(θ_T X) 1{T = n}) = Σ_{n=0}^∞ E(1{T = n} E(F(θ_n X) | F_n))
= Σ_{n=0}^∞ E(1{T = n} P[F](X_n)) = E(1{T < ∞} P[F](X_T)),
where the second line follows from (4.1.5).
Relation (4.4.6) has the following implication: Consider a MC X which starts at some probability distribution µ and fix x ∈ S. Let T_1 = H_x = inf{n ≥ 0 : X_n = x} be the first hitting time of x. By definition, T_1 = H_x ∈ {0, 1, . . . , ∞}. On the event {T_1 ≜ S_1 < ∞} define
is distributed as MC X under Pµ .
The decomposition (4.4.7) is always true, but it is not always very informative.
For instance it might happen that the chain which starts at µ will never go to x, that
is it might happen that Pµ (Hx = ∞) > 0. Or it might happen that Ex (Tx ) = ∞.
Moreover, if we want to study long term behaviour, for instance convergence in
distribution of Xn , periodicity issues should matter. All these cases should be sorted
out before attempts to apply renewal theory.
For x, y ∈ S set:
(a) Show that all the states are positively recurrent; Ek (T` ) < ∞ for any k, ` ∈ N0 .
(b) Find a probability distribution µ on N0 such that Eµ (Tk ) = ∞ for any k ∈ N0 .
Exercise 4.5.3. Let X be a MC on S. Recall the definition of the period
d_x ≜ g.c.d.{n ≥ 1 : P^n(x, x) > 0} = g.c.d.{n_1, n_2, . . . }.
P_π-a.s. Moreover, if the chain is aperiodic (d_x = 1 for some, hence, by irreducibility, for any x ∈ S), then
lim_{n→∞} P_π(X_n = x) = 1/E_x(T_x), (4.6.2)
for any x ∈ S.
Proof. Fix x ∈ S and consider the induced regenerative structure as in (4.4.7). By the renewal-reward theorem,
lim_{n→∞} (1/n) Σ_{k=0}^n f(X_k) = Σ_y f(y) E_x(Σ_{k=1}^{T_x} δ_y(X_k)) / E_x(T_x),
and the ratio E_x(Σ_{k=1}^{T_x} δ_y(X_k)) / E_x(T_x) should be the same no matter which x we choose. Choosing x = y we infer that it equals 1/E_y(T_y).
Formula (4.6.2) is just the key renewal theorem (2.5.15) in the discrete case.
Exercise 4.6.1. (a) Check that any finite state irreducible MC is positively recurrent, and, furthermore, the distributions of first hitting/return times have exponential tails: ∃ α > 0 such that for any n ∈ N and any x, y ∈ S
(b) Prove that if the chain is positively recurrent, then for any initial distribution µ, for any k ∈ N and for any bounded function f on S^k,
lim_{n→∞} (1/n) Σ_{j=1}^n f(X_j, X_{j+1}, . . . , X_{j+k−1}) = Σ_{x_1,...,x_k ∈ S} (1/E_{x_1}(T_{x_1})) P(x_1, x_2) · · · P(x_{k−1}, x_k) f(x_1, . . . , x_k), (4.6.3)
P_µ-a.s.
Hint. Consider an auxiliary Markov chain Yi = (Xi , Xi+1 . . . . , Xi+k−1 ). Check that
it is irreducible and positive recurrent on the state space
Using the ergodic theorem and the bounded convergence theorem (BON) observe that for (x_1, . . . , x_k) ∈ S̃^k, the following limits exist and coincide P-a.s.:
lim_{n→∞} (1/n) Σ_{i=1}^n δ_{(x_1,...,x_k)}(X_i, . . . , X_{i+k−1}) = lim_{n→∞} (1/n) Σ_{i=1}^n P(X_i = x_1, . . . , X_{i+k−1} = x_k),
and finish the proof using P(X_i = x_1, . . . , X_{i+k−1} = x_k) = P(X_i = x_1) P(x_1, x_2) · · · P(x_{k−1}, x_k).
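The ergodic theorem is easy to watch in action: long-run occupation frequencies converge to 1/E_x(T_x). A Python sketch (the chain and the helper are ours; for this reversible birth-and-death chain the invariant law is (1/4, 1/2, 1/4)):

```python
import random

def empirical_frequencies(P, n_steps=200_000, seed=3):
    """Long-run fraction of time spent in each state of the chain P,
    which by the ergodic theorem converges to 1/E_x(T_x)."""
    rng = random.Random(seed)
    counts = [0] * len(P)
    x = 0
    for _ in range(n_steps):
        counts[x] += 1
        x = rng.choices(range(len(P)), P[x])[0]
    return [c / n_steps for c in counts]

# birth-and-death chain on {0, 1, 2}; its invariant law is (1/4, 1/2, 1/4)
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
freq = empirical_frequencies(P)
# freq is close to (0.25, 0.5, 0.25), i.e. to 1/E_x(T_x) for each x
```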
Invariant measures and invariant distributions. We continue to consider
irreducible chains on a countable S.
Definition. (Invariant measure and invariant distribution) A non-negative non-trivial function µ on S is called an invariant measure if
µ_y = Σ_{x∈S} µ_x P(x, y) (4.6.4)
for any y ∈ S.
Iterating (4.6.4) we check that ∀ y ∈ S and n ∈ N,
µ_y = Σ_{x∈S} µ_x P^n(x, y) ⇒ (by irreducibility) 0 < µ_y < ∞ ∀ y ∈ S. (4.6.5)
An invariant measure µ satisfying Σ_y µ_y = 1 is called an invariant distribution. Note that an invariant distribution exists iff there exists an invariant measure with Σ_y µ_y < ∞.
lim_{n→∞} P^n(x, y) = 0,
for any two states x and y. Taking n → ∞ limits in the right hand side of (4.6.6) and using BON - the bounded convergence theorem, still to be formulated - we conclude that µ_y = 0 for any y ∈ S. A contradiction.
Here is the connection between invariance and the cycle decomposition (4.4.7). Fix x ∈ S and define
µ_y = E_x(Σ_{n=1}^{T_x} δ_y(X_n)). (4.6.7)
µ_x/π_x = µ_y/π_y (4.6.8)
for any x, y ∈ S.
(c) The chain is positively recurrent iff there exists an invariant distribution, that is
a probabilistic solution to (4.6.4).
Proof. Since we are assuming recurrence, µx = Px (Tx < ∞) = 1. Let us check that
µy > 0 for any other y 6= x. Indeed, µy ≥ Px (Ty < Tx ), and by irreducibility there
exists n with 0 < P^n(x, y). However,
0 < P^n(x, y) = Σ_{k=0}^{n−1} P_x(X_k = x) P_x(X_{n−k} = y; T_y < T_x) ≤ P_x(T_y < T_x) Σ_{k=0}^{n−1} P_x(X_k = x) ≤ n P_x(T_y < T_x),
as follows from the last exit (from x) decomposition and, of course, the Markov property. Hence, µ_y > 0 as claimed.
In order to prove invariance, write (recall that µ_x = 1)
µ_y = Σ_{n=1}^∞ P_x(X_n = y; n ≤ T_x) = P(x, y) + Σ_{n=2}^∞ P_x(X_n = y; n ≤ T_x)
= µ_x P(x, y) + Σ_{n=2}^∞ Σ_{z≠x} P_x(X_{n−1} = z; X_n = y; n ≤ T_x),
and, furthermore,
Σ_{n=2}^∞ P_x(X_{n−1} = z; n − 1 ≤ T_x) = µ_z.
Definition 4.6.1. (Transition probabilities for the chain reversed with respect to π.) Set
P̂(y, z) = π_z P(z, y)/π_y. (4.6.9)
If P̂ = P, that is if
π_y P(y, z) = π_z P(z, y) for all y, z ∈ S, (4.6.11)
then the corresponding chain is said to be reversible with respect to π. Note that if (4.6.11) holds, then π is automatically invariant.
Σ_z P̂(y, z) = (1/π_y) Σ_z π_z P(z, y) = π_y/π_y = 1.
Furthermore,
P̂^n(y, z) = π_z P^n(z, y)/π_y. (4.6.12)
By definition this holds for n = 1. Indeed, assume that (4.6.12) holds for n ∈ N. Then,
P̂^{n+1}(y, z) = Σ_u P̂(y, u) P̂^n(u, z) = Σ_u P̂^n(u, z) P̂(y, u) = Σ_u π_z P^n(z, u) (1/π_u) · π_u P(u, y) (1/π_y) = (1/π_y) P^{n+1}(z, y) π_z,
and one can apply induction. Furthermore, π is clearly an invariant measure for Y:
Σ_y π_y P̂(y, z) = Σ_y π_y · (π_z P(z, y)/π_y) = π_z Σ_y P(z, y) = π_z. (4.6.13)
Let X be an irreducible and positively recurrent MC, and let µ be the corresponding invariant distribution,
µ_x = 1/E_x(T_x).
Exercise 4.6.4. Show that for any x ≠ y,
P_x(T_y < T_x) = 1/(µ_x (E_x(T_y) + E_y(T_x))) = E_x(T_x)/(E_x(T_y) + E_y(T_x)). (4.6.16)
Hint. Define T to be the first return time to y after a visit to x.
(a) Use the renewal-reward and ergodic theorems to check that
E_y(Σ_{n=1}^T δ_x(X_n)) = µ_x E_y(T).
Consider now independent iterations of F_ω, and, with some abuse of notation, let F_ω^ℓ be the composition of ℓ such iterations. Let us say that F_ω^ℓ is constant if the set F_ω^ℓ(S) is a singleton. Of course, if F_ω^ℓ is constant, then so is F_ω^k for any k > ℓ. In light of this observation define the random number of iterations
M = inf{ℓ ≥ 1 : F_ω^ℓ is constant}.
Proof. Let us first check that M < ∞ P-a.s. The fact that the state space S was assumed to be finite plays a role here. Indeed, since X is irreducible and aperiodic, and since S is finite, there exists L < ∞ such that P^L(i, j) > 0 for any i, j ∈ S. This means that P(M > L) < 1. However, by independence (of the iterates F_ω), P(M > kL) ≤ P(M > L)^k. Hence, (e.g. by Borel-Cantelli), P(M < ∞) = 1.
The rest of the proof clarifies why one talks about coupling from the past here. Let us think about F_ω^M as a map from S located at the negative random time −M, and let X be the stationary Markov chain run from −∞. We may couple X with independent random maps F_ω^t from S located at time t to S located at the subsequent time t + 1; t ∈ Z. In this way M < ∞ simply implies that X_0 = Z with probability one, where Z is the constant value of the composed map. Hence the conclusion.
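The construction in the proof is the Propp-Wilson "coupling from the past" algorithm. A compact Python sketch (all names ours): it composes independent random maps further and further into the past until the composition is constant, and the constant value is then an exact sample from the invariant distribution.

```python
import random

def coupling_from_the_past(P, seed=7):
    """Propp-Wilson sketch for a small finite chain P: keep adding
    independent random maps deeper into the past until the composition
    up to time 0 is constant; return that constant state."""
    rng = random.Random(seed)
    n = len(P)
    maps = []                                # maps[t] acts at time -(t+1)
    while True:
        maps.append([rng.choices(range(n), P[x])[0] for x in range(n)])
        image = list(range(n))               # start all states at time -M
        for g in reversed(maps):             # apply maps at -M, ..., -1
            image = [g[x] for x in image]
        if len(set(image)) == 1:             # composition became constant
            return image[0]

P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]                        # invariant law (1/4, 1/2, 1/4)
samples = [coupling_from_the_past(P, seed=s) for s in range(20_000)]
freq = [samples.count(x) / len(samples) for x in range(3)]
```

Crucially, previously drawn maps are reused when the construction is extended further into the past; only then is the output an exact (unbiased) sample.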
5 Martingales.
Throughout this section {Fn } is a filtration of σ-algebras.
Definition. A random sequence M = (M0 , M1 , . . .) is called a martingale, respectively
sub-martingale or super-martingale, if
5.1 Examples.
The first example explains the prefixes sub and super.
Example 5.1.1. Let X be a Markov chain with transition matrix P, and call a function h harmonic (respectively sub-harmonic or super-harmonic) if Ph = h (respectively Ph ≥ h or Ph ≤ h). Then, for any bounded harmonic (sub-, super-harmonic) function h, the sequence M_n = h(X_n) is a martingale (respectively sub-martingale or super-martingale) with respect to the filtration F_n = σ(X_0, . . . , X_n).
Exercise 5.1.1. Let X be a Markov Chain, and let f be a bounded function on S. Check that
M_n = f(X_n) − f(X_0) + Σ_{k=0}^{n−1} (f(X_k) − Pf(X_k)) (5.1.2)
is a martingale.
Hint. Rewrite M_n as M_n = Σ_{k=1}^n (f(X_k) − Pf(X_{k−1})) ≜ Σ_{k=1}^n η_k, and check that {η_k} is a martingale difference sequence, that is E(η_{k+1} | F_k) = 0.
Example 5.1.2. Let M_n = x + Σ_{i=1}^n ξ_i be a random walk starting at x, say on Z, with i.i.d steps ξ_i satisfying E(|ξ_i|) < ∞. Then M is a martingale, respectively sub-martingale or super-martingale, with respect to F_n = σ(ξ_1, . . . , ξ_n), if E(ξ) = 0, respectively E(ξ) ≥ 0 or E(ξ) ≤ 0.
Exercise 5.1.2. Fix any x ∈ Z and consider a non-trivial simple Random Walk X on Z starting at x, that is when 0 < P(ξ = 1) = p = 1 − q = 1 − P(ξ = −1) < 1. Prove that
M_n = (q/p)^{X_n}
is a martingale.
Example 5.1.3.
Think about a sequence of games with i.i.d outcomes ξ_1, ξ_2, . . . . The outcomes are ±1 - win or loss. One can describe a previsible strategy as follows: You decide to bet C_i(ξ_1, . . . , ξ_{i−1}) ≥ 0 on the i-th game. For instance, if you bet $1 until the first win (and then leave the casino), then C_i = 1{T > i − 1}, where T = inf{n > 0 : ξ_n = 1}. Then, the net gain/loss after n games is
M_n = Σ_{i=1}^n C_i ξ_i. (5.1.3)
Example 5.1.4.
Consider the Branching process X_n in (4.2.2). Let µ = E(ξ) and F_n = σ(X_0, . . . , X_n). Then,
E(X_{n+1} | F_n) = µ X_n.
This means that X_n is a martingale/sub-martingale/super-martingale if µ = 1 / µ ≥ 1 / µ ≤ 1. Furthermore,
Z_n ≜ µ^{−n} X_n (5.1.4)
is always a martingale.
Example 5.1.5.
In Polya's Urn Scheme one starts with w white and b black balls in the urn. At each step a ball is sampled at random, and then returned to the urn together with c additional balls of the same colour. Define M_n as the proportion of white balls in the urn after the n-th stage. Of course, M_0 = w/(w + b), but M_1, M_2, . . . are already random.
Lemma 5.1.1. The sequence {Mn } is a martingale with respect to the natural fil-
tration Fn = σ (M0 , . . . , Mn ).
Proof. Let m be in the range of M_n, and let us compute E(M_{n+1} | M_n = m). Note that after the n-th stage there are w + b + nc balls in the urn, and given {M_n = m}, exactly m(w + b + nc) of them are white. Note also that the conditional probability, given {M_n = m}, of sampling a white ball at the (n + 1)-st stage is m. Therefore,
E(M_{n+1} | M_n = m) = m · (m(w + b + nc) + c)/(w + b + (n + 1)c) + (1 − m) · m(w + b + nc)/(w + b + (n + 1)c)
= (m(m(w + b + nc) + c) + (1 − m) m(w + b + nc))/(w + b + (n + 1)c) = m.
Hence E(M_{n+1} | M_n) = M_n.
Exercise 5.1.3. Define the event B_n = {a black ball is sampled at the n-th stage}. Using Lemma 5.1.1 give a short proof that P(B_n) ≡ b/(w + b).
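The martingale property E(M_n) = M_0 = w/(w + b) from Lemma 5.1.1 can be checked by simulation (a Python sketch; the helper and parameters are ours):

```python
import random

def polya_proportion(w, b, c, n, rng):
    """Run n stages of the Polya urn; return the white proportion M_n."""
    white, total = w, w + b
    for _ in range(n):
        if rng.random() < white / total:   # a white ball is sampled
            white += c
        total += c
    return white / total

rng = random.Random(5)
w, b, c, n = 2, 3, 1, 50
mean = sum(polya_proportion(w, b, c, n, rng) for _ in range(50_000)) / 50_000
# E(M_n) = M_0 = w/(w+b) = 0.4 for every n, by the martingale property
```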
Example 5.1.6. Consider the following model of stock price evolution: Let X_0 be the initial price, and let ξ_1, ξ_2, . . . be i.i.d non-negative (and better positive) random variables, which are also independent of X_0. Define:
X_n = X_0 Π_{i=1}^n ξ_i. (5.1.5)
For instance, in a discrete version of the Black-Scholes model the random change factors are ξ_i = e^{η_i} with η_i ∼ N(µ, σ²).
Exercise 5.1.4. Find conditions under which {X_n} is a martingale/sub-martingale/super-martingale sequence. Quantify this in terms of µ and σ for the Black-Scholes model.
Remark 5.1.1. A typical question about Xn would be whether it hits level a before
hitting level b. This is a random walk question: Consider
n
X
Yn = ln Xn = ln X0 + ln ξi .
1
Yn is a random walk, and the question is whether it visits [ln a, ∞) before visiting
(−∞, ln b].
Example 5.1.7.
Let ξ1 , ξ2 , . . . be i.i.d with E(ξ) = 0 and finite variance Var(ξ) = σ 2 < ∞. Define
Fn = σ(ξ1 , . . . , ξn ) and set
M_n = Σ_{i=1}^n ξ_i and Y_n = M_n² − n σ².
Then Y_n is an F_n-martingale.
Variance Martingales.
Exercise 5.1.5. Let Mn be a martingale such that E (Mn2 ) < ∞ for all n = 0, 1, 2, . . . .
For n = 1, 2, . . . define
A_n = Σ_{k=1}^n E((M_k − M_{k−1})² | F_{k−1}). (5.1.6)
Example 5.1.8.
Consider a certain service system in discrete time. The state of the system is
described by a random variable Yk . Assume that supk E (|Yk |2 ) < ∞. Customers
arrive to the system according to the Bernoulli process ξ = (ξ1 , ξ2 , ξ3 , . . .). That is
ξi -s are independent, and P (ξi = 1) = p = 1 − q = P (ξi = 0). Assume, furthermore,
that for any k the random indicator ξk is independent of {Yj }j<k . Define Fn =
σ (ξ1 , . . . , ξn ; Yk , k < n). Then,
M_n = Σ_{k=1}^n Y_{k−1}(ξ_k − p) (5.1.7)
is an Fn -martingale.
Indeed, by our assumptions ξn+1 is independent of Fn , and hence
BASTA (Bernoulli arrivals see time averages). As we shall see below, under our condi-
tions the following version of the Martingale convergence theorem applies:
lim_{n→∞} (1/n) M_n = 0 P-a.s. (5.1.8)
Assume that
Ȳ = lim_{n→∞} (1/n) Σ_{k=0}^{n−1} Y_k (5.1.9)
also exists P-a.s. Then Ȳ is interpreted as an objective long range average of Y. Define S_n = Σ_{i=1}^n ξ_i - the number of customers which arrived to the system by time n. By (5.1.8) and (5.1.9),
p Ȳ = lim_{n→∞} (1/n) Σ_{k=1}^n Y_{k−1} ξ_k = p lim_{n→∞} (1/S_n) Σ_{k=1}^n Y_{k−1} ξ_k. (5.1.10)
However, lim_{n→∞} (1/S_n) Σ_{k=1}^n Y_{k−1} ξ_k is exactly the subjective average computed according to what customers see when they arrive.
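A toy illustration of BASTA (a Python sketch; the 0/1-valued process Y and all parameters are ours): the objective time average of Y and the average seen by Bernoulli arrivals agree.

```python
import random

rng = random.Random(11)
p = 0.3                                   # Bernoulli arrival rate
n = 400_000

# Y: a two-state (0/1) Markov chain evolving independently of the arrivals
y, Y = 0, []
for _ in range(n):
    Y.append(y)
    y = 1 - y if rng.random() < 0.2 else y

xi = [1 if rng.random() < p else 0 for _ in range(n)]   # arrival indicators

objective = sum(Y) / n                                  # \bar Y
arrivals = [Y[k - 1] for k in range(1, n) if xi[k] == 1]
subjective = sum(arrivals) / len(arrivals)              # what customers see
# BASTA: objective and subjective averages are close (both near 0.5 here)
```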
is a super-martingale.
If T is a stopping time, then {T > n − 1} ∈ F_{n−1}, and
M_n = Σ_{k=1}^n 1{T ≥ k} ξ_k = X_{n∧T}. (5.3.1)
Theorem 5.3.1. Let X be a super-martingale and T a stopping time, both with respect to the same filtration {F_n}. Then
Proof of Theorem 5.3.1. All three claims (a)-(c) follow from the fact that M_n in (5.3.1) is a super-martingale. In particular E(M_n) ≤ E(M_0) = E(X_0).
(a) and (b) are easy: Indeed, in order to see (b) just take n = N . Then MN =
XT∧N = XT and, consequently, E(XT ) = E(MN ) ≤ E(M0 ) = E(X0 ).
If T is P-a.s. finite, then XT = limn→∞ XT∧n , and (a) follows from Fatou Lemma.
Let us turn to (c). Note that
X_T − X_{T∧n} = (X_T − X_{T∧n}) 1{T > n} = Σ_{m=n}^∞ (X_{m+1} − X_m) 1{T > m}.
Consequently,
E|X_T − M_n| = E|X_T − X_{T∧n}| ≤ Σ_{m=n}^∞ E(|X_{m+1} − X_m| 1{T > m}) ≤ R Σ_{m=n}^∞ P(T > m),
where the first inequality follows by Fatou, and the second one follows by assumption (5.3.3). Since E(T) < ∞, it follows by the tail formula that lim_{n→∞} Σ_{m=n}^∞ P(T > m) = 0. Since E(M_n) ≤ E(X_0) and since E(X_T) ≤ E(M_n) + E|X_T − M_n|, we are home.
and hence
The rest of the proof is based on integral convergence theorems and on the following fact: If T < ∞ P-a.s., then lim_{n→∞} X_{n∧T} = X_T P-a.s.
(a) If X_n is non-negative, then by the Fatou Lemma,
E(X_T) = E(lim_{n→∞} X_{n∧T}) ≤ lim inf_{n→∞} E(X_{n∧T}) ≤ E(X_0).
Since by assumption E(T) < ∞, the dominated convergence theorem and then (5.3.6) imply that
E(X_T − X_0) = lim_{n→∞} E(X_{T∧n} − X_0) ≤ 0.
T = inf {n : Mn ≥ r} .
Next,
P(max_{1≤ℓ≤n} M_ℓ ≥ r) = P(T ≤ n) ≤ Σ_{ℓ=0}^n E(1{T = ℓ} M_ℓ/r) = (1/r) Σ_{ℓ=0}^n E(1{T = ℓ} M_ℓ). (5.4.2)
Since M is a sub-martingale, M_ℓ ≤ E(M_n | F_ℓ) for any ℓ ≤ n. Therefore,
E(1{T = ℓ} M_ℓ) ≤ E(1{T = ℓ} E(M_n | F_ℓ)) = E(E(1{T = ℓ} M_n | F_ℓ)) = E(1{T = ℓ} M_n).
E((M_{n+ℓ} − M_ℓ)²) ≤ Kn, (5.4.4)
prove that if M is a martingale, ϕ is convex and E(|ϕ(M_n)|) < ∞ for any n, then X_n = ϕ(M_n) is a sub-martingale. Then deduce (5.4.3).
The following three exercises are similar to Exercise 1.6.2, where the LLN was established under a second moment condition.
Exercise 5.4.2. Assume (5.4.4). Check, using the Borel-Cantelli lemma, that for any α > 1,
lim_{n→∞} (1/n^α) M_{⌊n^α⌋} = 0, (5.4.7)
P-a.s.
Exercise 5.4.3. Check, using (5.4.3) and the Borel-Cantelli lemma, that for any α > 1, P-a.s.
lim_{n→∞} max_{⌊n^α⌋ ≤ ℓ < (n+1)^α} |M_ℓ/ℓ − M_{⌊n^α⌋}/n^α| = 0. (5.4.8)
Exercise 5.4.4. Using (5.4.7) and (5.4.8) finish the proof of (5.4.5).
lim_{n→∞} M_n = M_∞. (5.5.3)
Branching Process. Recall Branching process in Example 5.1.4 and the related
non-negative martingale Zn defined in (5.1.4). As before let ξ be the offspring
variable, µ = E (ξ).
Exercise 5.5.1.
(a) Check that if µ < 1, then lim_{n→∞} X_n = lim_{n→∞} Z_n = 0 P-a.s. Conclude that {Z_n} is not UI, and explain your conclusion.
(b) Check that if µ = 1 and if P (ξ = 0) > 0, then limn→∞ Xn = 0 P-a.s, and
explain why this implies that {Xn } is not UI.
(c) Assume that µ > 1 and that σ² := Var(ξ) < ∞. Use the conditional variance formula — for any two random variables U and W with Var(W) < ∞,
Var(W) = E(Var(W | U)) + Var(E(W | U)) —
in order to check that {Var(Z_n)} is a bounded sequence. Conclude that {Z_n} is uniformly integrable, and that E(Z_∞) = 1.
(d) Assume that µ > 1 and that P(ξ = 0) > 0. Prove that there exists a unique x∗ ∈ (0, 1) such that
E(x∗^ξ) = x∗.
Check that with x∗ as above, M_n := x∗^{X_n} is a martingale. Prove that M_∞ = lim_{n→∞} M_n exists P-a.s.
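The fixed point x∗ of part (d) is straightforward to compute and to compare with a simulated extinction frequency. A sketch under an assumed offspring law (the law, the cap, and the horizon below are illustrative choices, not from the notes):

```python
import random

# Hypothetical offspring law: P(xi=0)=0.25, P(xi=1)=0.25, P(xi=2)=0.5,
# so mu = 1.25 > 1 and P(xi = 0) > 0.
vals, probs = [0, 1, 2], [0.25, 0.25, 0.5]

def g(x):
    # Generating function E(x^xi).
    return sum(pk * x**k for k, pk in zip(vals, probs))

# Bisection for the unique fixed point x* in (0, 1):
# g(x) > x strictly to the left of x*, g(x) < x between x* and 1.
lo, hi = 0.0, 0.999
for _ in range(60):
    mid = (lo + hi) / 2
    if g(mid) > mid:
        lo = mid
    else:
        hi = mid
x_star = (lo + hi) / 2

# Monte Carlo frequency of extinction of the branching process.
random.seed(1)
def extinct(max_gen=60, cap=200):
    z = 1
    for _ in range(max_gen):
        if z == 0:
            return True
        if z > cap:          # survival is then overwhelmingly likely
            return False
        z = sum(random.choices(vals, weights=probs, k=z))
    return z == 0

runs = 4000
est = sum(extinct() for _ in range(runs)) / runs
print(x_star, est)   # both close to 0.5 for this offspring law
```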
Polya’s Urn Scheme with c = 1. Recall Example 5.1.5. In this case exactly one
ball is added to the urn at each stage. By Lemma 5.1.1 the random proportion Mn
of white balls in the urn after n stages is a martingale. It is non-negative, and since
Mn ∈ [0, 1], it is automatically UI. Hence M∞ = limn→∞ Mn exists P-a.s.
Below we give a complete probabilistic characterization of M∞ for c = 1, that is
when exactly one ball is added to the urn at each stage.
Recall that a random variable Z has a β-distribution β(α, β) with parameters α, β > 0 if its density function f_{α,β} is given by
f_{α,β}(p) = p^{α−1}(1−p)^{β−1}/B(α, β) if p ∈ [0, 1], and f_{α,β}(p) = 0 otherwise. (5.5.6)
Exercise 5.5.2.
With w, b and c = 1 as above:
(a) Let P_{n,p} be the probability function of the binomial distribution Bin(n, p). Using (5.5.6) and (5.5.7) check that one can rewrite (5.5.8) as follows:
P(M_n = (w+k)/(w+b+n)) = ∫_0^1 f_{w,b}(p) P_{n,p}(k) dp. (5.5.9)
(c) Explain, using Proposition 1.3.2, why this implies that M∞ ∼ β(w, b).
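The conclusion M_∞ ∼ β(w, b) for c = 1 can be seen in a quick simulation (a sketch; the parameters w = 2, b = 3 and the run lengths are arbitrary choices):

```python
import random

def polya_fraction(w=2, b=3, steps=500):
    # Urn with w white and b black balls; at each stage the drawn
    # ball is returned together with c = 1 ball of the same colour.
    white, total = w, w + b
    for _ in range(steps):
        if random.random() < white / total:
            white += 1
        total += 1
    return white / total

random.seed(2)
samples = [polya_fraction() for _ in range(5000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
# Beta(2, 3): mean = 2/5 = 0.4, variance = 6/(25 * 6) = 0.04.
print(mean, var)
```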
Kolmogorov’s 0−1 law. Let ξ1 , ξ2 , ξ3 , . . . be independent random variables. Define
F_n := σ(ξ_1, . . . , ξ_n) and T_n := σ(ξ_n, ξ_{n+1}, . . . ). The tail σ-algebra is T_∞ = ∩_n T_n. Kolmogorov's 0−1 law claims that T_∞ is trivial: if A ∈ T_∞, then either P(A) = 1 or P(A) = 0. The proof is immediate if one uses the martingale convergence theorem: define M_n = P(A | F_n) = E(1_A | F_n). Then M is a uniformly integrable martingale, and lim_{n→∞} M_n exists and equals 1_A. On the other hand, since for every n, A and F_n are independent, E(1_A | F_n) = P(A). Consequently, P(A) almost surely equals 1_A, which is the zero-one law.
Backward martingales and LLN. A process M = {M_n ; n ∈ Z_−} is called a backward martingale with respect to a filtration F_{−1} ⊃ F_{−2} ⊃ . . . , if E(M_n | F_ℓ) = M_ℓ for any ℓ < n ≤ −1. Alternatively, M_ℓ = E(M_{−1} | F_ℓ) for any ℓ ∈ Z_−. Note that backward martingales are always uniformly integrable.
Let ξ_1, ξ_2, . . . be i.i.d. random variables, and assume that µ = E(ξ_i) is well defined and finite. Set
S_n = (1/n) Σ_{i=1}^n ξ_i and F_{−n} = σ(S_n, S_{n+1}, . . . ).
By symmetry,
E(ξ_1 | F_{−n}) = S_n.
So, S is a backward martingale, and the limit S_∞ = lim_{n→∞} S_n exists P-a.s. But S_∞ is measurable with respect to F_{−∞} = ∩_ℓ F_{−ℓ}. So by Kolmogorov's zero-one law S_∞ is P-a.s. a constant. Since, by uniform integrability, lim E(S_n) = E(S_∞), the LLN follows.
Exercise 5.6.1. Check that (5.6.1) implies that the chain is transient.
Let us check that h in (5.6.2) is (if the chain is transient, as we assume) a non-trivial super-harmonic function. By conditioning on the first step,
the last inequality holds since, for transient chains, 0 ≤ h(x) < 1. So h is indeed super-harmonic.
On the other hand, since X is transient, Σ_n P_n(x, x) < ∞. Consequently,
lim_{m→∞} P_x(θ_m T_x < ∞) ≤ lim_{m→∞} Σ_n P_{m+n}(x, x) = 0.
Since P_x(θ_n T_x < ∞) = Σ_z P_n(x, z) h(z), it follows that inf_z h(z) = 0 and, since by irreducibility h is positive, it follows that h is non-trivial.
Example 5.6.1. Consider a random walk X_n = Σ_{ℓ=1}^n ξ_ℓ such that
(i) E(e^{tξ}) < ∞ for any t ∈ R.
(ii) µ = E(ξ) ≠ 0.
(iii) P(ξ < 0) and P(ξ > 0) are both positive.
By the LLN, (ii) implies that X is transient. Let us see how this conclusion follows from the criterion above, namely from the existence of a positive non-trivial super-harmonic function.
Lemma 5.6.1. Under conditions (i)-(iii) above there exists t∗ ≠ 0 such that h(x) = e^{t∗x} is harmonic.
Exercise 5.6.2. Prove that ϕ(t) := log E(e^{tξ}) is finite and differentiable by (i), convex either by direct computation or by Hölder, strictly convex by (iii) and, again by (iii), satisfies lim_{t→±∞} ϕ(t) = ∞.
Furthermore, ϕ(0) = 0 and ϕ′(0) = µ ≠ 0. Since lim_{t→±∞} ϕ(t) = ∞, there exists a second root t∗ ≠ 0 of ϕ(t) = 0. That is, there exists t∗ ≠ 0 such that E(e^{t∗ξ}) = 1. But then,
Σ_y P(x, y) e^{t∗y} = E(e^{t∗(x+ξ)}) = e^{t∗x},
as claimed.
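The root t∗ is rarely available in closed form, but it is easy to find numerically from the convexity picture described above. A sketch for an assumed two-point step law (the law and the parameter p are illustrative choices, not from the notes):

```python
import math

# Hypothetical step law: P(xi = 1) = p, P(xi = -1) = 1 - p with
# p = 0.6, so mu = 2p - 1 > 0.
p = 0.6
phi = lambda t: math.log(p * math.exp(t) + (1 - p) * math.exp(-t))

# phi is strictly convex with phi(0) = 0 and phi'(0) = mu > 0, so
# the second root t* of phi(t) = 0 lies on the negative half-axis.
lo, hi = -50.0, -1e-9
for _ in range(200):
    mid = (lo + hi) / 2
    if phi(mid) > 0:     # phi > 0 to the left of t*
        lo = mid
    else:
        hi = mid
t_star = (lo + hi) / 2
# For this particular law e^{t*} = (1-p)/p in closed form.
print(t_star, math.log((1 - p) / p))
```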
Proof of Lemma 5.6.2. Let us start the chain from some x ∈ F, and let us define the following sequence of stopping times which record successive returns of X to F. Then
T_x = Σ_{n=1}^{H_x} S_n ⇒ E_x(T_x) = Σ_{n=1}^∞ E_x(S_n 1_{H_x > n−1}).
We assume that Var(ξ_i) < ∞ and that the chain is irreducible. Set µ = E(ξ). Find (with proofs, of course) conditions in terms of µ under which the chain is transient, recurrent and positively recurrent.
Hint. If µ > k, then use the LLN. If µ ≤ k, then consider F = {0, . . . , k} and h(x) = x.
Discrete birth and death process. Consider Markov chain Xn on N0 with tran-
sition probabilities
Exercise 5.6.6. (a) Check that for any n > 0 the function h satisfies:
h(n) = Σ_ℓ P(n, ℓ) h(ℓ).
(b) Prove that if lim_{n→∞} h(n) < ∞, then the chain X is transient.
(c) Prove that if lim_{n→∞} h(n) = ∞, then the chain X is recurrent.
(d) Prove that if q_k ≥ p_k for all k large enough, then the chain is recurrent.
(e) Prove that if there exists ε > 0 such that q_k ≥ p_k + ε for all large enough k, then the chain is positively recurrent.
Hint for (e) Use Foster’s theorem with h(x) = x.
(f ) Find an example when limk→∞ (qk − pk ) = 0, but the chain is still positively
recurrent.
(g) Find necessary and sufficient conditions for X to be positively recurrent, and write down an expression for the invariant distribution π_k ; k = 0, 1, 2, . . . .
Hint for (f ) and (g): Check that
µ_0 = 1 and µ_n = Π_{k=0}^{n−1} p_k / Π_{k=1}^n q_k for n > 0
is an invariant measure for X.
(h) Define T_0 = inf{n > 1 ; X_n = 0}. For a positively recurrent discrete birth and death process find a formula for E_k(T_0), that is, the expected value of T_0 for a chain starting at point k, for k = 0, 1, . . . .
Hint. Recall that in the positive recurrent (ergodic) case E_0(T_0) = π_0^{−1}, and that by first step conditioning E_0(T_0) = 1 + E_1(T_0). This gives E_1(T_0) and, by similar reasoning, E_{k+1}(T_k) for any k ∈ N. Finally note that by recurrence and the strong Markov property,
E_{k+1}(T_0) = E_{k+1}(T_k) + E_k(T_0).
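The product-formula invariant measure of the hint is easy to check numerically. A sketch, under the assumed transition rules P(n, n+1) = p_n, P(n, n−1) = q_n = 1 − p_n for n ≥ 1 and P(0, 1) = p_0, P(0, 0) = 1 − p_0, with hypothetical constant rates (not from the notes):

```python
# Product-formula invariant measure for a discrete birth and death chain.
N = 200                      # truncation level
p = [0.3] * (N + 1)          # hypothetical birth probabilities
q = [1 - pk for pk in p]     # death probabilities (q_0 unused)

mu = [1.0]
for n in range(1, N + 1):
    mu.append(mu[-1] * p[n - 1] / q[n])
Z = sum(mu)
pi = [m / Z for m in mu]

# Invariance pi P = pi away from the truncation boundary.
for n in range(1, N - 1):
    inflow = pi[n - 1] * p[n - 1] + pi[n + 1] * q[n + 1]
    assert abs(inflow - pi[n]) < 1e-12

# Ergodic-theorem check: E_0(T_0) = 1 / pi_0 (= 7/4 for these rates).
print(pi[0], 1 / pi[0])
```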
6 Reversible Random Walks and Electric Net-
works
The exposition in this section is inspired by the books by Doyle-Snell and Lyons-
Peres.
In this way we can interpret X as a random walk on G which makes random jumps across the edges of E, and the corresponding jump probabilities are given by
P(x, y) = c(x, y)/Σ_z c(x, z) = c(x, y)/π_x.
Remark 6.1.1. We shall always assume that G is connected and locally finite; the latter means that there are finitely many edges incident to any particular vertex v ∈ V.
Remark 6.1.2. In the sequel, unless mentioned otherwise, we shall assume that
V \ (A ∪ B) is finite and connected.
In this case there is an unambiguously defined (see Theorem 6.1.1 below) equi-
librium voltage distribution v on V, and there is an equilibrium current i from the
source A to the sink B across the edges of E. In the sequel we shall use symbol v(x)
for the voltage at x and i(x, y) for the current across e = (x, y) from x to y. Note
that i(x, y) may be positive or negative and that i(x, y) = −i(y, x). In the sequel
we shall use
~E = the set of oriented edges of G.
If ~e = {x, y}, then −~e = {y, x}. For ~e = {x, y} ∈ ~E we use x = s_−(~e) and y = s_+(~e) to stress the orientation of the edge. In this way the outflow from A and the inflow into B are given by
Out(A) = Σ_{~e : s_−(~e)∈A} i(~e) and In(B) = Σ_{~e : s_+(~e)∈B} i(~e). (6.1.2)
Electrostatics is run by two laws which regulate voltages and currents (and which,
in particular, imply that Out(A) = In(B)).
Ohm’s Law. For ~e = {x, y} the voltages v(x) and v(y) and the current i(~e) = i(x, y)
satisfy:
i(x, y) = c(x, y) (v(x) − v(y)) . (6.1.3)
Here is the relation between the random walk X and the unit equilibrium voltage
distribution vAB :
Theorem 6.1.2. Let TA and TB be the first hitting times by X of A and B. Then,
Proof of Theorem 6.1.1 and Theorem 6.1.2. By the combination of Ohm's and Kirchhoff's laws the unit equilibrium voltage is a harmonic function on V \ (A ∪ B) which
equals 1 on A and 0 on B. This means that v_AB satisfies the following equation for any x ∉ A ∪ B:
Σ_{y∼x} c(x, y)(v(x) − v(y)) = 0. (6.1.7)
satisfies (6.1.7). But this is straightforward by the first step decomposition: for x ∉ A ∪ B,
h(x) = E_x h(X_1), which by (6.1.1) gives 0 = Σ_y P(x, y)(h(x) − h(y)) = (1/π_x) Σ_y c(x, y)(h(x) − h(y)).
Since we imposed the unit voltage drop between A and B, the following definition
is natural in light of Ohm’s law:
Definition 6.1.1. The effective conductance c_eff(A, B) between A and B is defined via
c_eff(A, B) = Σ_{a∈A} π_a P_a(T_B < T_A). (6.1.9)
The effective resistance r_eff(A, B) is the reciprocal of c_eff(A, B).
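Effective conductances of small networks can be computed by enforcing the harmonic equation (6.1.7) at the interior vertices and then reading off the current out of A via Ohm's law. A minimal sketch on a toy "diamond" network (the network and conductances are illustrative choices, not from the notes):

```python
# Diamond network a - {x, y} - b, all conductances 1: relax interior
# voltages to the conductance-weighted average of their neighbours,
# then compute c_eff(a, b) as the total current leaving a under a
# unit voltage drop.
c = {('a', 'x'): 1.0, ('a', 'y'): 1.0, ('x', 'b'): 1.0, ('y', 'b'): 1.0}
nbr = {}
for (u, w), cv in c.items():
    nbr.setdefault(u, {})[w] = cv
    nbr.setdefault(w, {})[u] = cv

v = {'a': 1.0, 'b': 0.0, 'x': 0.0, 'y': 0.0}
for _ in range(1000):                       # relaxation sweeps
    for z in ('x', 'y'):                    # interior vertices
        v[z] = sum(cv * v[u] for u, cv in nbr[z].items()) / sum(nbr[z].values())

c_eff = sum(cv * (v['a'] - v[u]) for u, cv in nbr['a'].items())
# Two unit-resistance paths of length 2 in parallel: r_eff = 1.
print(v['x'], c_eff)
```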
Hence (6.3.4)
(c) It is non-negative:
0 ≤ Σ_{~e : s_−(~e)=a} θ(~e) = −Σ_{b∈B} Σ_{~e : s_−(~e)=b} θ(~e). (6.4.2)
Note that the equality in (6.4.2) follows from (a) and (b).
The flow θ is called a unit flow if the quantities on both sides of (6.4.2) equal one. It is called positive if the inequality in (6.4.2) is strict. Otherwise, if both sides equal zero, it is called sourceless.
Recall (6.2.1). An example of unit flow from a to B over oriented edges ~e = {x, y}
is given by the unit current iaB in (6.3.1).
In view of (6.3.1) it makes sense to define the energy of a flow θ as
E(θ) = (1/2) Σ_{~e} θ(~e)²/c(~e) = (1/2) Σ_{~e} r(~e) θ(~e)². (6.4.3)
In this way the energies of the equilibrium voltage v_aB and of the unit flow i_aB are related as follows:
E(v_aB) · E(i_aB) = 1, or E(i_aB) = 1/c_eff(a, B) := r_eff(a, B). (6.4.4)
In particular, one can reformulate Theorem 6.2.1 in terms of resistances as follows:
Theorem 6.4.1. An irreducible reversible Markov chain X on a countable state space V is transient if and only if for some (and hence for any) a ∈ V the effective resistance satisfies r_eff(a, ∞) < ∞.
In order to use either of the two formulations for sorting out the recurrence and transience issue for particular examples, one relies on the following variational principle:
Theorem 6.4.2. The equilibrium voltage v_aB is the unique solution of the minimization problem
min_{g(a)=1, g|_B=0} E(g). (6.4.5)
Hence
E(i_aB + η) = E(i_aB) + E(η), (6.4.7)
for any vortex η. It happens that vortices span sourceless flows in such a way that the orthogonality relation (6.4.7) holds for any sourceless flow η. Hence, an addition of sourceless flows can only increase energy, and (6.4.6) is indeed attained at i_aB as claimed.
Definition. A set of edges Λ is called a cutset (for a ∈ V) if any infinite path from
a to infinity includes at least one edge from Λ.
For instance, if V = Z^d and a = 0, then
Λ_n = {e = (x, y) : ‖x‖_∞ = n, ‖y‖_∞ = n + 1} (6.5.2)
is a cutset for every n ∈ N. Above, ‖x‖_∞ = max{|x_1|, . . . , |x_d|}.
Theorem 6.5.2. Let a ∈ V and let {Λ_n} be a countable family of disjoint finite cutsets for a. Then,
r_eff(a, ∞) ≥ Σ_n (Σ_{e∈Λ_n} c(e))^{−1}. (6.5.3)
Proof. Let θ be a unit flow from a to ∞. Recall the definition of the energy E(θ) in (6.4.3). Since θ(−~e) = −θ(~e), the absolute value |θ(e)| is unambiguously defined for any (un-oriented) edge e. If we show that
Σ_{e∈E} r(e) θ²(e) ≥ Σ_n (Σ_{e∈Λ_n} c(e))^{−1}, (6.5.4)
then the claim follows by Thomson's principle (6.4.6). Since θ is a unit flow and since, for any n, Λ_n is a cutset, it is intuitively clear and actually not difficult to check that
Σ_{e∈Λ_n} |θ(e)| ≥ 1.
However, by Cauchy-Schwarz,
1 ≤ Σ_{e∈Λ_n} |θ(e)| = Σ_{e∈Λ_n} |θ(e)| √(c(e) r(e)) ≤ √(Σ_{e∈Λ_n} θ²(e) r(e)) · √(Σ_{e∈Λ_n} c(e)).
Therefore,
Σ_{e∈Λ_n} θ²(e) r(e) ≥ (Σ_{e∈Λ_n} c(e))^{−1}.
Since the cutsets Λ_n are disjoint, summing over n yields (6.5.4).
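For the simple random walk on Z² (all conductances 1) with the cutsets Λ_n of (6.5.2), one can count the cutset edges directly and watch the Nash-Williams lower bound diverge. A small sketch (the truncation at n = 99 is an arbitrary choice):

```python
# |Lambda_n| grows linearly in n (it equals 8n + 4 in Z^2), so the
# Nash-Williams bound r_eff(0, infinity) >= sum_n 1/|Lambda_n|
# diverges logarithmically: the walk on Z^2 is recurrent.
def cutset_size(n):
    # edges (x, y) with ||x||_inf = n and ||y||_inf = n + 1
    count = 0
    for x0 in range(-n, n + 1):
        for x1 in range(-n, n + 1):
            if max(abs(x0), abs(x1)) != n:
                continue
            for d0, d1 in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                if max(abs(x0 + d0), abs(x1 + d1)) == n + 1:
                    count += 1
    return count

partial = sum(1.0 / cutset_size(n) for n in range(1, 100))
print(cutset_size(1), partial)   # 12, and a slowly growing partial sum
```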
Note that
θ(~e) = P (~e ∈ γ ω ) − P (−~e ∈ γ ω ) (6.6.2)
is a unit flow from the origin to ∞.
for every v ∈ Zd . Check that for such distribution P the flow θ in (6.6.2) has finite
energy.
Exercise 6.7.1. Consider the simple random walk X on T_k. Give two different proofs that X is transient for any k > 1:
(a) Consider Y_n = |X_n| and describe it as a random walk on N_0.
(b) Argue directly from Thomson's principle, that is, by constructing a finite energy unit flow.
7 Renewal theory in continuous time
7.1 Poisson Process.
Recall that N is a Poisson random variable with parameter λ > 0; N ∼ Poi(λ), if
P(N = k) = (λ^k/k!) e^{−λ}, k = 0, 1, 2, . . . (7.1.1)
A Poisson process of arrivals with intensity λ is a collection of random variables {N(t)}_{t∈[0,∞)}, where N(t) describes the number of arrivals by time t. In this way N(s, t] = N(t) − N(s) is the number of arrivals during the time interval (s, t]. The inter-arrival times are denoted T_1, T_2, . . . . The time of the k-th arrival is denoted
S_k = Σ_{i=1}^k T_i.
The Poisson process of arrivals has (and is characterized by) the following set of properties:
• For each t > 0, N(t) ∼ Poi(λt). More generally, for each s < t, N(s, t] ∼ Poi(λ(t − s)).
• For any k and any 0 < s_1 < t_1 ≤ s_2 < t_2 ≤ · · · ≤ s_k < t_k, the random variables N(s_1, t_1], . . . , N(s_k, t_k] are independent.
• For any k ≥ 1, the time S_k of the k-th arrival is distributed Γ(k, λ), that is, the density function f_{S_k}(s) is zero for s < 0, and
f_{S_k}(s) = λ^k s^{k−1} e^{−λs}/(k−1)! for s ≥ 0. (7.1.2)
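The first of these properties is immediate to check by simulation: generate i.i.d. Exp(λ) inter-arrival times and count the arrivals in [0, t]. A sketch (λ = 2 and t = 5 are arbitrary choices):

```python
import math, random

lam, t = 2.0, 5.0

def poisson_count():
    # Count arrivals in [0, t] generated by i.i.d. Exp(lam)
    # inter-arrival times.
    s, n = 0.0, 0
    while True:
        s += random.expovariate(lam)
        if s > t:
            return n
        n += 1

random.seed(3)
runs = 50_000
counts = [poisson_count() for _ in range(runs)]
mean = sum(counts) / runs
p10 = counts.count(10) / runs
# Poi(lam * t) = Poi(10): mean 10, and P(N(t) = 10) ~ 0.1251.
print(mean, p10, 10.0**10 * math.exp(-10.0) / math.factorial(10))
```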
Let S_k⁰, N⁰(t) and m⁰(t) be the corresponding quantities for the zero-delayed process, that is, when T_1 is distributed as the rest of the T_i-s (or, depending on the context, when T_1 is set to be zero).
Remark 7.2.1. (a) In the case of exponential distribution Ti ∼ Exp(λ), N (t) is the
familiar Poisson process of arrivals with intensity λ. However, our objective here is
to explore a much more general situation.
(b) It might happen that the T_i-s take only integer values; P(T_i ∈ N_0 := N ∪ {0}) = 1. Yet, even in such cases, we shall consider continuous time here; in particular both N(t) and m(t) will be functions of the continuous variable t ∈ [0, ∞).
Exercise 7.2.1. In the following two cases write explicit expressions for P(N 0 (t) =
n) (for any t ∈ [0, ∞)):
(a) Ti ∼ Γ(2, λ).
(b) Ti ∼ Poi(λ).
lim_{t→∞} N(t)/t = (1/µ) 1_{T_1<∞}, P-a.s. (B.1)
Furthermore,
lim_{t→∞} m(t)/t = (1/µ) P(T_1 < ∞). (B.2)
Finally, for any t ≥ 0,
m(t) = F(t) + ∫_0^t m⁰(t − s) dF(s) = F(t) + ∫_0^t m(t − s) dF⁰(s), (B.3)
where T is distributed as T2 , T3 , . . . .
Remark 7.2.3. As in the discrete case, almost sure convergence lim_{n→∞} X_n = X P-a.s. does not imply that lim_{n→∞} E(X_n) = E(X), even if all the expectations are defined and finite. This means that (B.1) does not automatically imply (B.2), and additional arguments, based on integral convergence theorems, are needed. The proof is identical to the proof in the discrete case.
Proof. We shall split the proof into several steps.
STEP 1. Let us start with the zero-delayed process N⁰. We claim that
lim_{t→∞} N⁰(t) = ∞ P-a.s. (7.2.3)
Indeed, since N⁰(t) is non-decreasing in t, the limit exists. Now, for any n ∈ N,
P(lim_{t→∞} N⁰(t) ≤ n) = lim_{t→∞} P(N⁰(t) ≤ n).
However, by (7.2.2),
P(N⁰(t) ≤ n) = P(S⁰_{n+1} > t) = 1 − F_{S⁰_{n+1}}(t),
which tends to zero as t → ∞. Hence (7.2.3).
STEP 2. By STEP 1 and the SLLN for (1/n) S_n⁰,
lim_{t→∞} S⁰_{N⁰(t)}/N⁰(t) = µ P-a.s. ⇒ lim_{t→∞} N⁰(t)/S⁰_{N⁰(t)} = 1/µ P-a.s. (7.2.4)
STEP 3. By definition S⁰_{N⁰(t)} ≤ t < S⁰_{N⁰(t)+1}. Hence,
N⁰(t)/t ∈ [N⁰(t)/S⁰_{N⁰(t)+1}, N⁰(t)/S⁰_{N⁰(t)}].
By (7.2.4) the right end-points of the above intervals converge to 1/µ. As for the left end-points: by STEP 1 we conclude that lim_{t→∞} N⁰(t)/(N⁰(t)+1) = 1 P-a.s. (simply because lim N⁰(t) = ∞). On the other hand, lim_{t→∞} (N⁰(t)+1)/S⁰_{N⁰(t)+1} = 1/µ by (7.2.4).
In order to prove (B.2) we shall, exactly as it was done in the discrete case, rely on the Wald formula (2.2.10) for stopping times.
STEP 4. Let us assume first that µ < ∞. Since N⁰(t) + 1 is a stopping time, by Wald's formula,
E(S⁰_{N⁰(t)+1}) = µ E(N⁰(t) + 1) = µ(m⁰(t) + 1).
However, t < S⁰_{N⁰(t)+1}. Therefore,
1 ≤ E(S⁰_{N⁰(t)+1})/t = µ(m⁰(t)/t + 1/t) ⇒ 1/µ ≤ lim inf_{t→∞} m⁰(t)/t. (7.2.5)
Now, since T_i^A ≤ T_i we obviously have that N^A(t) ≥ N⁰(t), and hence m^A(t) ≥ m⁰(t). Therefore it is enough to check that for any A,
lim sup_{t→∞} m^A(t)/t ≤ 1/µ^A. (7.2.6)
Since
1 ≥ S_{N^A(t)}/t ≥ (S_{N^A(t)+1} − A)/t,
taking expectations and applying Wald's formula once again,
1 ≥ (µ^A(m^A(t) + 1) − A)/t,
lim_{t→∞} N⁰(t − T_1)/t = 1/µ
P-a.s. on the event {T_1 < ∞}. This yields (B.1). Next, because of the independence of T_2, T_3, . . . from T_1,
lim sup_{t→∞} m(t)/t ≤ (1/µ) P(T_1 < ∞).
On the other hand, for every fixed A, 1_{T_1≤t} N⁰(t − T_1) ≥ 1_{T_1≤A} N⁰(t − A) whenever t ≥ A. Which means that
lim inf_{t→∞} m(t)/t ≥ (1/µ) P(T_1 ≤ A).
Since A is arbitrary (and since P(T_1 < ∞) = lim_{A→∞} P(T_1 ≤ A)), (B.2) follows as well.
where, as usual, S_k⁰ = T_1 + · · · + T_k with T_1 distributed according to F⁰, as the rest of the T_i-s. In this notation,
m(t) = Σ_{k=1}^∞ ϕ_k(t) and m⁰(t) = Σ_{k=1}^∞ ϕ⁰_k(t). (7.2.9)
Of course, ϕ_1(t) = P(T_1 ≤ t) = F(t). Let us consider now k = ℓ + 1 for ℓ ≥ 1. In this case,
m(t) = F(t) + Σ_{ℓ=1}^∞ E(1_{T_1≤t} ϕ⁰_ℓ(t − T_1)) = F(t) + E(1_{T_1≤t} m⁰(t − T_1)).
Similarly, if k = ℓ + 1, then
ϕ_{ℓ+1}(t) = P(T_1 + T_2 + · · · + T_{ℓ+1} ≤ t) = E(1_{T_{ℓ+1}≤t} P(S_ℓ ≤ t − T_{ℓ+1} | T_{ℓ+1})) = E(1_{T_{ℓ+1}≤t} ϕ_ℓ(t − T_{ℓ+1})).
This follows by the independence of T_1, T_2, . . . , T_ℓ from T_{ℓ+1}. As a result, since the T_i-s are identically distributed for i ≥ 2,
m(t) = F(t) + Σ_{ℓ=1}^∞ E(1_{T≤t} ϕ_ℓ(t − T)) = F(t) + E(1_{T≤t} m(t − T)).
Exercise 7.2.2. Solve Exercise 2.2.4 using (B.3) for inter-arrival times T_i ∼ Uni(0, 1).
Hint: Note that M = N(1) + 1. Also note that for t ≤ 1 the expectation m(t) = E(N(t)) satisfies the following ordinary differential equation: m(0) = 0 and
m(t) = t + ∫_0^t m(t − s) ds ⇒ (d/dt) m(t) = 1 + m(t).
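Solving the ODE in the hint gives m(t) = e^t − 1 on [0, 1], so E(M) = m(1) + 1 = e. This is quick to confirm by simulation (a sketch; the run count is an arbitrary choice):

```python
import math, random

def count_until_exceed_one():
    # M = N(1) + 1: number of Uni(0,1) summands needed for the
    # partial sums to exceed 1.
    s, m = 0.0, 0
    while s <= 1.0:
        s += random.random()
        m += 1
    return m

random.seed(4)
runs = 200_000
est = sum(count_until_exceed_one() for _ in range(runs)) / runs
# Solving m' = 1 + m with m(0) = 0 gives m(t) = e^t - 1 on [0, 1],
# hence E(M) = m(1) + 1 = e.
print(est, math.e)
```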
7.3 Renewal-reward theorem and applications.
The setup and the exposition is rather similar to the discrete case. In particular,
what happens during the initial delay T1 does not play a role, and we shall conve-
niently ignore it, as if T1 is distributed as the rest of Ti -s.
Consider a collection of i.i.d. couples (T_i, R_i), such that:
(c) Note that in general T_i and R_i are dependent. What we require is independence of the couples (T_i, R_i) for different i-s.
We continue to employ the notation N(t) for the number of arrivals by time t and S_k = Σ_{i=1}^k T_i for the time of the k-th arrival. Recall that S_{N(t)} ≤ t ≤ S_{N(t)+1}.
There are several ways to define reward collected by time t:
Example 7.3.1. T_i-s are inter-arrival times and R_i-s are service times, that is, R_i is the time needed to serve the i-th customer. Then C_T(t) is the total time needed to serve all the customers who arrived before t.
lim_{t→∞} C_∗(t)/t = r/µ P-a.s., (7.3.4)
for ∗ = I, T, P. Moreover,
lim_{t→∞} E(C_∗(t))/t = r/µ. (7.3.5)
Recall that we have already checked that a.s.-lim_{t→∞} N(t) = ∞, and that a.s.-lim_{t→∞} N(t)/t = 1/µ; the latter is just (B.1) of the elementary renewal theorem. On the other hand,
a.s.-lim_{N→∞} (1/N) Σ_{k=1}^N R_k = r,
by the strong LLN. The same logic applies for initial rewards:
C_I(t)/t = ((N(t)+1)/t) · (1/(N(t)+1)) Σ_{k=1}^{N(t)+1} R_k.
C_P(t)/t = (1/t) ∫_0^t 1_{S_{N(s)+1} − S_{N(s)} ≤ u} ds. (7.3.8)
This corresponds to the probability that a randomly sampled (uniformly from [0, t])
point falls into the inter-arrival interval of duration at most u. By the Renewal-
Reward theorem
lim_{t→∞} (1/t) ∫_0^t 1_{S_{N(s)+1} − S_{N(s)} ≤ u} ds = E(T 1_{T≤u})/E(T). (7.3.9)
Exercise 7.3.1. Check the following forms of the so-called size-biased distribution:
(a) If T is a continuous random variable with density function f, then the right hand side of (7.3.9) describes the law of a continuous random variable T̃ with density function f̃(t) = t f(t)/E(T).
(b) If T is a discrete random variable with probability function p, then the right hand side of (7.3.9) describes the law of a discrete random variable T̃ with probability function p̃(t) = t p(t)/E(T). What happens if P(T = 0) > 0?
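The size-biased phenomenon (the "inspection paradox") is easy to observe directly: sample a uniform point of (0, t) and record the length of the inter-arrival interval that covers it. A sketch with Exp(1) inter-arrival times (the horizon and run count are arbitrary choices):

```python
import random

def covering_interval(t=200.0):
    # Length of the inter-arrival interval covering a uniformly
    # sampled point of (0, t), for Exp(1) inter-arrival times.
    u = random.uniform(0.0, t)
    s = 0.0
    while True:
        ti = random.expovariate(1.0)
        if s + ti > u:
            return ti
        s += ti

random.seed(5)
runs = 20_000
mean = sum(covering_interval() for _ in range(runs)) / runs
# Size-biased law: E(T~) = E(T^2)/E(T) = 2, although E(T) = 1 --
# the sampled interval is longer on average.
print(mean)
```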
Further exercises (mostly borrowed from Durrett and Grimmett-Stirzaker; note the difference with their analogs in the discrete case):
Exercise 7.3.2. Suppose the lifetime of a car is a random variable with density function h. B buys a new car as soon as the old one breaks down or reaches T years. Suppose a new car costs a NIS and that an additional cost of b NIS is incurred if the car breaks down before T. What is the long-run cost per unit time of B's car policy?
Calculate the cost (as a function of T) when the lifetime is uniformly distributed on [0, 10], a = 10, and b = 3.
Exercise 7.3.3. Let T1 , T2 , . . . be i.i.d. inter-arrival times with T1 ∼ Poi(λ), λ > 0.
Calculate the probability that, as t → ∞, a point uniformly chosen from (0, t) will
fall on an interval of length at least u > 0.
Exercise 7.3.4. A machine is working for an exponential time with parameter λ1 >
0 and is being repaired for an exponential time with parameter λ2 > 0. At what rate
does the machine break down? Calculate the probability that the machine is working
at a time point uniformly chosen from (0, t) as t → ∞.
Exercise 7.3.5. The weather in a certain locale consists of alternating wet and dry spells. Suppose that the number of days in each rainy spell has a Poisson distribution with mean 2, and that the duration of a dry spell follows a Geometric distribution with mean 7. Assume that the successive durations of rainy and dry spells are independent. Calculate the probability that it rains at a point uniformly chosen from (0, t) as t → ∞.
describes the total number of passengers who departed before time t and whose arrival "fell" into an inter-arrival (buses) interval of length less than or equal to u. Note that A(S_{N(t)}) describes the total number of passengers who departed from the station by time t. By the renewal-reward theorem,
Now,
Therefore, the limiting ratio of passengers who fall into inter-arrival (buses) times of length ≤ u is still given by the size-biased law (7.3.9).
Little's laws. Let us return to Example 7.3.3. Recall that we are assuming that the server was idle before time zero, that a customer enters the service at time 0, and that the service time R_0 is independent of and identical in law to the R_i-s. For simplicity we shall assume that P(T_i = 0) = 0, that is, customers arrive one at a time. In particular, this means that the customer who arrived at time zero immediately entered the service.
Recall that we could not compute the limiting server load lim_{t→∞} M(t)/t because in general M(t) ≤ Σ_{k=0}^{N(t)} R_k, that is, because in general not all the customers who arrived before time t complete their service before t. Let X(t) be the total number of customers in the G|G|1 system (queue and server) at time t. For definiteness we take X to be right continuous. If X(t) = 0, then evidently M(t) = Σ_{k=0}^{N(t)} R_k. Define
σ̂_1 = inf{t > 0 : X(t) = 0}. (7.3.11)
That is σ̂1 is the first time when the server becomes idle.
Exercise 7.3.6. Consider the G|G|1 queue described in Example 7.3.3. Let µ be the mean inter-arrival time and r = E(R_1) the mean service time. Denote ρ = r/µ. Prove that if ρ < 1, then the queue will empty almost surely, or equivalently P(σ̂_1 < ∞) = 1; and that if ρ > 1 and there is a customer in the queue at time zero, then there is a positive probability that the queue will never become empty.
Hint. Recall that we are assuming that the 0-th customer enters the system exactly at time 0. We continue to use the notation S_k = Σ_{i=1}^k T_i for the arrival time of the k-th customer. Note that σ̂_1 > S_k if and only if all of the following happen:
T1 < R0
T1 + T2 < R0 + R1
(7.3.12)
...
T1 + · · · + Tk < R0 + · · · + Rk−1
Consider now Y_ℓ = Σ_{j=1}^ℓ (T_j − R_{j−1}) := Σ_{j=1}^ℓ ξ_j. In this way, {Y_ℓ}_{ℓ∈N} is a random walk on R with i.i.d. steps ξ_j. Note that E(ξ_j) = µ − r, which is positive if ρ < 1 and negative if ρ > 1. Hence, by the strong LLN, P-a.s.,
lim_{ℓ→∞} Y_ℓ = ∞ if ρ < 1 and lim_{ℓ→∞} Y_ℓ = −∞ if ρ > 1. (7.3.13)
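The random walk reformulation of the hint can be tested by simulation. A sketch with exponential inter-arrival and service times (hypothetical choices; with these laws the case ρ > 1 should empty with probability roughly 1/ρ, the M|M|1 busy-period value):

```python
import random

def p_empty(rho, runs=4000, horizon=400):
    # xi_j = T_j - R_{j-1} with T ~ Exp(1) (mu = 1) and R ~ Exp(1/rho)
    # (r = rho). By (7.3.12) the server first idles at the first l with
    # Y_l > 0; we test whether this happens within the horizon.
    hits = 0
    for _ in range(runs):
        y = 0.0
        for _ in range(horizon):
            y += random.expovariate(1.0) - random.expovariate(1.0 / rho)
            if y > 0:
                hits += 1
                break
    return hits / runs

random.seed(6)
r_light = p_empty(0.5)   # rho < 1: queue empties a.s.
r_heavy = p_empty(2.0)   # rho > 1: positive chance it never empties
print(r_light, r_heavy)  # ~1.0 versus roughly 1/rho = 0.5 here
```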
Therefore, assuming (7.3.15), the τ̂_i-s give rise to a new renewal process, and we set Ŝ_k = Σ_{i=1}^k τ̂_i and, accordingly, N̂(t) = Σ_k 1_{Ŝ_k ≤ t}. By construction (recall that C_T(t) = Σ_{k=1}^{N(t)} R_k),
M(Ŝ_k) = Σ_{j : S_j < Ŝ_k} R_j.
Hence, from
R_0 + C_T(Ŝ_{N̂(t)−1}) ≤ M(t) ≤ R_0 + C_T(Ŝ_{N̂(t)+1}), (7.3.16)
we conclude:
If lim_{k→∞} Ŝ_{k+1}/Ŝ_k = 1, then lim_{t→∞} M(t)/t = r/µ. (7.3.17)
Hint. For (a) remember that Ŝ_{N̂(t)} ≤ t ≤ Ŝ_{N̂(t)+1}, and deduce that
lim_{k→∞} Ŝ_{k+1}/Ŝ_k = 1 ⇒ lim_{t→∞} t/Ŝ_{N̂(t)±1} = 1.
Naturally, D(t) ≤ A(t). Finally, let W_k be the time which the k-th customer spends in the system, and let M(t) be the total number of customers in the system at time t. Little's laws describe situations when the limits
L = lim_{t→∞} (1/t) ∫_0^t M(s) ds, λ = lim_{t→∞} A(t)/t = lim_{t→∞} D(t)/t and W = lim_{N→∞} (1/N) Σ_{k=1}^N W_k (7.3.20)
exist, and, moreover, when the relation (7.3.19) holds. We shall justify Little's laws under the following set of assumptions on the system:
Assumption (Regenerative structure). There exists an infinite sequence of times 0 = S_0 < S_1 < S_2 < . . . such that:
A1. The system is empty just before time S_i, that is, M(S_i−) = 0.
A2. Define T_i = S_i − S_{i−1}, N_i = the number of customers who entered (and, by A1, left) the system during the time interval [S_{i−1}, S_i), and M[S_{i−1}, S_i) = the trajectory of the process M during the time interval [S_{i−1}, S_i). Then {T_i, N_i, M[S_{i−1}, S_i)} is an i.i.d. collection of random objects.
A3. The expectations E(T_1), E(N_1) and E(∫_0^{T_1} M(t) dt) are all finite.
Theorem (Little's law). Under assumptions A1-A3 the limits in (7.3.20) are well defined (in particular the two limits for A(t) and D(t) coincide), and, moreover, (7.3.19) holds.
By Assumptions A2 and A3, we know that R_1, R_2, . . . are i.i.d. and that E(R_k) < ∞. Hence, by the renewal-reward theorem,
lim_{t→∞} (1/t) ∫_0^t M(s) ds = E(R)/E(T) := L. (7.3.22)
Hence,
lim_{t→∞} D(t)/t = lim_{t→∞} A(t)/t = E(N)/E(T) := λ. (7.3.23)
Third application of Renewal-Reward. Consider the renewal process with (discrete) inter-arrival times N_k, and consider the rewards
G_k = Σ_{j=N_1+···+N_{k−1}+1}^{N_1+···+N_k} W_j = ∫_{S_{k−1}}^{S_k} M(t) dt = R_k.
The last equality is precisely (7.3.21). Then the following limit exists:
lim_{k→∞} (1/(N_1 + N_2 + · · · + N_k)) Σ_{j=1}^{N_1+N_2+···+N_k} W_j = E(R)/E(N) := W. (7.3.24)
Exercise 7.3.9. Consider a G|G|1 queue with mean inter-arrival time µ and average service time r such that µ > r. Set ρ = r/µ. Let L_Q be the average length of the queue and L the average number of customers in the system. Similarly, let W_Q be the average time a customer spends waiting in the queue and W the average time the customer spends in the system. Use the obvious relation between W and W_Q and Little's theorem to check that L_Q = L − ρ.
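Little's law L = λW can be observed directly in a simulated FIFO single-server queue: compute the time average of M(s) from the arrival/departure events, and compare with λ times the average sojourn time. A sketch with exponential laws (the rates are hypothetical choices, giving ρ = 0.5):

```python
import random

random.seed(7)
lam, srv = 1.0, 2.0          # arrival rate 1, service rate 2: rho = 0.5
n = 200_000

# Lindley recursion for a FIFO single-server queue.
t_arr, d_prev, waits, events = 0.0, 0.0, [], []
for _ in range(n):
    t_arr += random.expovariate(lam)
    start = max(t_arr, d_prev)           # service begins once server is free
    d_prev = start + random.expovariate(srv)
    waits.append(d_prev - t_arr)         # sojourn time W_k
    events.append((t_arr, 1))            # customer enters the system
    events.append((d_prev, -1))          # customer leaves the system

# Time average of M(s) over [0, horizon] from the event sequence.
horizon = t_arr
events.sort()
area, m, last = 0.0, 0, 0.0
for time, step in events:
    if time > horizon:
        break
    area += m * (time - last)
    m, last = m + step, time
L = area / horizon
W = sum(waits) / n
print(L, lam * W)   # Little's law: the two numbers agree (~1 here)
```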
The case of the G|G|1 queue. We shall prove A3 for the G|G|1 queue under the additional assumption that the i.i.d. service times W_0, W_1, W_2, . . . have finite exponential moments:
E(e^{aW_k}) < ∞ for some a > 0. (7.3.25)
Of course, since the W_k-s are non-negative, (7.3.25) always holds for any a ≤ 0. Also, for simplicity, we shall assume that customers arrive one at a time, that is P(T_k = 0) = 0, and that E(T) = µ < ∞.
Recall the definitions of σ̂_1 and τ̂_1 in (7.3.14). Also let η̂_1 be the total number of customers who arrived to the system during [0, τ̂_1), that is η̂_1 = N(σ̂_1). Note that
∫_0^{τ̂_1} M(t) dt ≤ σ̂_1 η̂_1.
Thus A3 will follow if we check that all three expectations E(τ̂_1), E(η̂_1) and E(σ̂_1 η̂_1) are finite. By the tail formula we need to show that all three integrals
∫_0^∞ P(τ̂_1 > t) dt, ∫_0^∞ P(η̂_1 > t) dt and ∫_0^∞ P(σ̂_1 η̂_1 > t) dt (7.3.26)
are finite.
STEP 1 (Bound on E(σ̂_1)). First of all,
P(σ̂_1 > t) ≤ P(N(t) > k − 1) + P(Σ_{j=0}^{k−1} W_j > t) = P(S_k ≤ t) + P(Σ_{j=0}^{k−1} W_j > t). (7.3.27)
Recall that we assume that r < µ. Fix a number ρ ∈ (r, µ), and let us consider t = kρ. Then one can rewrite the right-hand side above as (with µ = E(T) and r = E(W)),
P(Σ_{i=1}^k T_i ≤ kE(T) − k(µ − ρ)) + P(Σ_{j=0}^{k−1} W_j ≥ kE(W) + k(ρ − r)). (7.3.28)
Lemma 7.3.1. Under our assumptions there exists a constant c = c(ρ) > 0 such that the expression in (7.3.28) is bounded above by e^{−c(ρ)k} for all k ∈ N.
Lemma 7.3.1 follows from Cramér's Large Deviation upper bound, which will be formulated and explained below. It implies that
∫_0^∞ P(σ̂_1 > t) dt ≤ ρ Σ_{k=0}^∞ P(σ̂_1 > kρ) < ∞.
Generalized tail formula. Let X be a non-negative random variable and ϕ a non-decreasing, non-negative differentiable function on [0, ∞) with ϕ(0) = 0. Then
E(ϕ(X)) = ∫_0^∞ ϕ′(t) P(X > t) dt. (7.3.29)
In particular, E(e^{δσ̂_1}) < ∞ for some δ > 0.
STEP 2 (Bound on E(η̂_1)). Note that
{η̂_1 > k} = {σ̂_1 > S_k} ⊆ {Σ_{j=0}^{k−1} W_j > Σ_{j=1}^k T_j}.
However, the probability of the event on the right hand side above is bounded by (7.3.28), and hence
P(Σ_{j=0}^{k−1} W_j ≥ Σ_{j=1}^k T_j) ≤ e^{−c(ρ)k} (7.3.30)
for all k ∈ N. By Exercise 7.3.11 not only the expectation, but all the moments of η̂_1 are bounded.
STEP 3 (Bound on E(σ̂_1 η̂_1)). By Cauchy-Schwarz,
E(σ̂_1 η̂_1) ≤ √(E(σ̂_1²) E(η̂_1²)).
Since all the moments (the second moments in particular) of both σ̂_1 and η̂_1 are bounded, the expectation on the left hand side above is bounded as well.
STEP 4. It remains to show that E(τ̂_1) is bounded as well. Now,
P(τ̂_1 > 2t) ≤ P(σ̂_1 > t) + P(σ̂_1 ≤ t; τ̂_1 > 2t). (7.3.31)
It is tempting to conclude that P(σ̂_1 ≤ t; τ̂_1 > 2t) ≤ P(T > t), but we are dealing with dependent events and, therefore, should proceed with some care.
Clearly,
P(σ̂_1 ≤ t; τ̂_1 > 2t) = Σ_k P(N(σ̂_1) = k; σ̂_1 ≤ t; τ̂_1 > 2t). (7.3.32)
However,
{N(σ̂_1) = k} ⊆ {Σ_{i=1}^k T_i ≤ Σ_{j=0}^{k−1} W_j}.
Consequently,
{N(σ̂_1) = k; σ̂_1 ≤ t; τ̂_1 > 2t} ⊆ {Σ_{i=1}^k T_i ≤ Σ_{j=0}^{k−1} W_j} ∩ {T_{k+1} ≥ t}. (7.3.33)
On the right hand side of (7.3.33) there is already an intersection of two independent events. A substitution into (7.3.32) yields:
P(σ̂_1 ≤ t; τ̂_1 > 2t) ≤ P(T > t) Σ_k P(Σ_{i=1}^k T_i ≤ Σ_{j=0}^{k−1} W_j) ≤ P(T > t)/(1 − e^{−c(ρ)}), (7.3.34)
where the last inequality uses (7.3.30).
Therefore,
E(τ̂_1) = 2 ∫_0^∞ P(τ̂_1 > 2t) dt ≤ 2 (E(σ̂_1) + E(T)/(1 − e^{−c(ρ)})).
h(a) = log E(e^{aξ}) and I(x) = sup_a {ax − h(a)}. (7.3.35)
The following upper bound is called the exponential Chebyshev inequality: for any x > 0 and a > 0,
P(Σ_{i=1}^k ξ_i ≥ kx) = P(e^{a Σ_{i=1}^k ξ_i} ≥ e^{akx}) ≤ E(e^{a Σ_{i=1}^k ξ_i})/e^{akx} = exp{−k(ax − h(a))}. (7.3.36)
Similarly,
P(Σ_{i=1}^k ξ_i ≤ −kx) = P(e^{−a Σ_{i=1}^k ξ_i} ≥ e^{akx}) ≤ E(e^{−a Σ_{i=1}^k ξ_i})/e^{akx} = exp{−k(ax − h(−a))}. (7.3.37)
The function h in (7.3.35) is called the log-moment generating function. By the Hölder inequality it is convex: if λ ∈ (0, 1) and a < b, then
h(λa + (1−λ)b) = log E(e^{(λa+(1−λ)b)ξ}) ≤ log (E(e^{aξ}))^λ (E(e^{bξ}))^{1−λ} = λh(a) + (1−λ)h(b).
Since E(ξ) = 0, the Jensen inequality implies that h ≥ 0. Clearly, h(0) = 0. In general it might happen that h = ∞ on an open or closed semi-line not containing zero. It might even happen that h(a) = ∞ for all a ≠ 0.
The function I in (7.3.35) is called the Legendre-Fenchel transform of h. Clearly, I ≥ 0 and, since h is non-negative, I(0) = 0. Furthermore, for any x > 0,
I(x) = sup_{a≥0} {ax − h(a)} and I(−x) = sup_{a≥0} {ax − h(−a)}.
Cramér's Large Deviation upper bound. For any k ∈ N and for any x > 0,
P(Σ_{i=1}^k ξ_i ≥ kx) ≤ e^{−kI(x)} and P(Σ_{i=1}^k ξ_i ≤ −kx) ≤ e^{−kI(−x)}. (7.3.38)
The bound (7.3.38) is non-trivial if I(x) > 0, respectively if I(−x) > 0. Here is a necessary and sufficient
condition:
Lemma 7.3.2. I(x) > 0 for any x > 0 iff there exists a > 0 such that h(a) < ∞. Similarly, I(−x) > 0 iff there exists
a > 0 such that h(−a) < ∞.
The proof of Lemma 7.3.2 is easy. Clearly I(x) = 0 if h(a) = ∞ for every a > 0. On the other hand, if h(a) < ∞ for some a > 0 (and hence, by convexity, for all b ∈ [0, a]), then h is infinitely many times differentiable on (0, a), and for b ∈ (0, a),
h′(b) = E(ξ e^{bξ})/E(e^{bξ}). (7.3.39)
Moreover, lim_{b↓0} h′(b) = E(ξ) = 0. The latter statement and (7.3.39) follow from the dominated convergence theorem (DOM), which, as well as MON, will be formulated later. As a result, if x > 0, then there exists b ∈ (0, a] such that h′(b) ≤ x/2. But then,
I(x) ≥ bx − h(b) = ∫_0^b (x − h′(t)) dt ≥ xb/2 > 0.
If E(ξ) = µ and J is the rate function of the centered variables ξ_i − µ, then for x > µ,
\[
P\Big(\sum_{i=1}^k \xi_i \ge kx\Big) = P\Big(\sum_{i=1}^k (\xi_i - \mu) \ge k(x - \mu)\Big) \le e^{-k J(x-\mu)},
\]
and the same works for x < µ. The function I in (7.3.40) is called the large deviation rate function.
Exercise 7.3.12. Compute the rate function I for ξ ∼ Bernoulli(p), N(µ, σ²), Poisson(λ) and Exp(µ).
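For orientation, here is the Gaussian case worked out (a sketch of one item only; the remaining cases are left to the exercise): for ξ ∼ N(µ, σ²),

```latex
h(a) = \log E\, e^{a\xi} = a\mu + \frac{a^2 \sigma^2}{2},
\qquad
I(x) = \sup_a \Big\{ ax - a\mu - \frac{a^2 \sigma^2}{2} \Big\}
     = \frac{(x-\mu)^2}{2\sigma^2},
```

the supremum being attained at a = (x − µ)/σ².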
In words, E(t) is the time from t to the next arrival. Given u ≥ 0 let us compute
\[
F^e(u) = \lim_{t\to\infty} \frac{1}{t}\int_0^t 1_{\{E(s) \le u\}}\, ds. \tag{7.4.2}
\]
Proof. The statement is a version of biased sampling and it follows by the renewal-
reward argument. Indeed, we are dealing with rewards
\[
R_k^u = \int_0^{T_k} 1_{\{T_k - s \le u\}}\, ds = T_k 1_{\{T_k \le u\}} + u\, 1_{\{T_k > u\}}.
\]
Hence,
\[
E(R_k^u) = E\big( T 1_{\{T \le u\}} \big) + u P(T > u) \overset{\Delta}{=} E(Y^u) + u P(T > u).
\]
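The limit (7.4.2) is easy to probe by simulation. Below is a sketch (not from the notes; the function name is mine): for exponential inter-arrival times with mean µ the excess life distribution is F^e(u) = 1 − e^{−u/µ}, and the simulated time average matches it.

```python
import math
import random

def excess_time_average(u, mean_gap, horizon, rng):
    """Time average of 1{E(s) <= u} over [0, horizon] for a renewal
    process with Exp(1/mean_gap) inter-arrival times."""
    t, inside = 0.0, 0.0
    while t < horizon:
        gap = rng.expovariate(1.0 / mean_gap)  # next inter-arrival time
        # on [t, t + gap) the excess life is E(s) = t + gap - s, so
        # E(s) <= u exactly on the last min(u, gap) units of the interval
        inside += min(u, gap)
        t += gap
    return inside / t

mean_gap, u = 2.0, 1.0
avg = excess_time_average(u, mean_gap, horizon=200000.0, rng=random.Random(1))
exact = 1.0 - math.exp(-u / mean_gap)  # F^e(u) = F(u) in the memoryless case
assert abs(avg - exact) < 0.01
```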
\[
N(t, t+s] = N(t+s) - N(t)
\]
depends only on the length s of the interval (but not on the starting point t).
In the sequel we shall consider delayed renewals satisfying Assumption (7.4.4) below.
Note that if N(t) is stationary, then m(t) is linear, and hence m(t) = t/µ.
Theorem 7.4.1. The renewal process is stationary iff one of the following equivalent conditions holds:
(a) The distribution of excess life time E(t) does not depend on t.
(b) The distribution of T1 is given by (7.4.3).
Therefore,
\[
P(T_1 \le u) = \lim_{t\to\infty} \frac{1}{t}\, E\Big( \int_0^t 1_{\{E(\tau) \le u\}}\, d\tau \Big).
\]
Exercise 7.4.2. Prove that under our Assumption (7.4.4),
\[
\lim_{t\to\infty} \frac{1}{t}\int_0^t 1_{\{E(\tau) \le u\}}\, d\tau = F^e(u),
\]
see (7.2.8) where \(\varphi^0_\ell\) was defined. The expression on the right hand side above depends only on the law of E*(t) which, by assumption, is the same for all t.
Let us turn to the second implication in (7.4.5).
Exercise 7.4.3. Check that m(t) = t/µ solves
\[
m(t) = F^e(t) + \int_0^t m(t-s)\, dF^0(s) = F^e(t) + E\Big( \frac{t-T}{\mu}\, 1_{\{T \le t\}} \Big). \tag{7.4.6}
\]
We shall assume that m(t) = t/µ is the only solution to (7.4.6), which by (B.3) means that m^e(t) = t/µ.
Let N^e be the delayed renewal process with T_1 = T^e distributed according to F^e. We keep the notation E^e(t) for the excess life time of this process. Evidently,
\[
\begin{aligned}
P(E^e(t) > u) &= P(T^e > t+u) + P\big(N^e(t) \ge 1;\ N^e(t, t+u] = 0\big) \\
&= P(T^e > t+u) + \sum_{k \ge 1} P\big(N^e(t) = k;\ N^e(t, t+u] = 0\big).
\end{aligned} \tag{7.4.7}
\]
Exercise 7.4.4. Check that since T^e = S_1^e has a continuous distribution (with density \(f_1^e(t) = \frac{1}{\mu}(1 - F(t))\)), S_k^e is a continuous random variable for any k ≥ 1.
Using Exercise 7.4.4, let f_k^e be the density function of S_k^e. Then the right hand side of (7.4.8) reads as:
\[
\sum_{k \ge 1} \int_0^t f_k^e(\tau)\, \big(1 - F^0(t+u-\tau)\big)\, d\tau = \frac{1}{\mu}\int_0^t \big(1 - F^0(t+u-\tau)\big)\, d\tau,
\]
where we used that \(\sum_{k\ge 1} f_k^e(\tau) = (m^e)'(\tau) = 1/\mu\).
All together,
\[
P(E^e(t) > u) = \frac{1}{\mu}\int_{t+u}^{\infty} \big(1 - F^0(\tau)\big)\, d\tau + \frac{1}{\mu}\int_0^t \big(1 - F^0(t+u-\tau)\big)\, d\tau = \frac{1}{\mu}\int_u^{\infty} \big(1 - F^0(\tau)\big)\, d\tau,
\]
Another name for Q as above is the infinitesimal generator. It gives rise to a family of matrices {P_t}_{t∈R₊}:
\[
P_t = e^{tQ} \overset{\Delta}{=} \sum_{n=0}^{\infty} \frac{(tQ)^n}{n!}. \tag{8.1.2}
\]
Exercise 8.1.1. Let S be a finite set, Q be an infinitesimal generator, and let P_t be the associated semi-group.
(a) Check that
\[
P_{t+s} = P_t P_s \quad \forall\, t, s \in \mathbb{R}_+; \quad \text{in particular, } P_t = \big(P_{t/n}\big)^n \text{ for any } t \in \mathbb{R}_+ \text{ and } n \in \mathbb{N}. \tag{8.1.3}
\]
(b) Check that for any t the matrix P_t is stochastic, that is:
\[
\text{For any } x, y,\ P_t(x, y) \ge 0 \quad \text{and, for any } x,\ \sum_y P_t(x, y) = 1. \tag{8.1.4}
\]
Hint: For proving the first of (8.1.4), notice that P_δ has non-negative entries for δ sufficiently small, and then use (8.1.3). For proving the second of (8.1.4) use (8.1.2) and note that \(\sum_y Q(x, y) = 0\) for any x.
(c) Check that the matrix-valued function t ↦ P_t is the unique solution to both Kolmogorov's forward (8.1.5) and backward (8.1.6) differential equations below:
\[
\frac{d}{dt} P_t = P_t Q \quad \text{and} \quad P_0 = I, \tag{8.1.5}
\]
and
\[
\frac{d}{dt} P_t = Q P_t \quad \text{and} \quad P_0 = I. \tag{8.1.6}
\]
Equivalently, for any function f on S, for any x and for any t ≥ 0,
\[
P_t f(x) - f(x) = \int_0^t Q P_s f(x)\, ds = \int_0^t P_s Q f(x)\, ds. \tag{8.1.7}
\]
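To make (8.1.2)–(8.1.4) concrete, here is a small numerical sketch (an illustration, not from the notes; the 2-state generator is made up): P_t is computed by truncating the series (8.1.2), and the semi-group property (8.1.3) and stochasticity (8.1.4) are checked numerically.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_exp(Q, t, terms=60):
    """Truncation of e^{tQ} = sum_n (tQ)^n / n!  (formula (8.1.2))."""
    n = len(Q)
    P = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # identity
    term = [row[:] for row in P]
    for k in range(1, terms):
        # term <- term * (tQ)/k, so after k steps term = (tQ)^k / k!
        term = mat_mul(term, [[t * q / k for q in row] for row in Q])
        P = [[P[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return P

Q = [[-1.0, 1.0], [2.0, -2.0]]  # a 2-state Q-matrix: rows sum to zero
P1 = mat_exp(Q, 1.0)
P2 = mat_exp(Q, 2.0)

# (8.1.4): each P_t is a stochastic matrix
for row in P1:
    assert all(p >= 0 for p in row) and abs(sum(row) - 1.0) < 1e-9
# (8.1.3): P_{t+s} = P_t P_s
P11 = mat_mul(P1, P1)
assert all(abs(P2[i][j] - P11[i][j]) < 1e-9 for i in range(2) for j in range(2))
```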
Definition 8.1.2. (Cadlag stochastic process). A stochastic process {X_t} which takes values in a finite or countable state space S is called cadlag if, P-a.s., it is right continuous and has left limits.
(b) For all t ∈ R₊ and ω ∈ Ω there exists ε > 0 such that X has at most one jump on (t − ε, t + ε).
As we shall see below, on countable state spaces it is in principle possible that the process
escapes to ∞ in finite time, so (a) and (b) above will be refined to accommodate such a
possibility.
Definition 8.1.4. (Analytic). A cadlag process X on a finite state space S is called a CTMC with infinitesimal generator Q, and respectively with semi-group of transition probabilities P_t = e^{tQ}, if for any function f on S and for any t, s ≥ 0,
\[
E\big( f(X_{t+s}) \,\big|\, \mathcal{F}_t \big) = P_s f(X_t) \quad P\text{-a.s.} \tag{8.1.11}
\]
Exactly as in the case of discrete time Markov chains, (8.1.11) implies the following conventional definition of the Markov property: For any 0 ≤ t₁ < t₂ < · · · < t_n < t_{n+1} and for any x₁, x₂, . . . , x_n, x_{n+1} ∈ S,
\[
P\big( X_{t_{n+1}} = x_{n+1} \mid X_{t_n} = x_n, \ldots, X_{t_1} = x_1 \big) = P\big( X_{t_{n+1}} = x_{n+1} \mid X_{t_n} = x_n \big). \tag{8.1.12}
\]
Furthermore, for any s, t ≥ 0 and for any x, y ∈ S,
\[
P(X_{t+s} = y \mid X_t = x) = P_s(x, y), \tag{8.1.13}
\]
and finite dimensional distributions of X are given by:
\[
P(X_{t_n} = x_n, \ldots, X_{t_1} = x_1) = \sum_y P(X_0 = y)\, P_{t_1}(y, x_1) \cdots P_{t_n - t_{n-1}}(x_{n-1}, x_n). \tag{8.1.14}
\]
We do not have time for a complete workout of the above equivalences and conclusions. For instance, in order to see how (8.1.11) implies (8.1.13), just write:
\[
\begin{aligned}
P(X_{t+s} = y;\ X_t = x) &= E\{\delta_y(X_{t+s})\,\delta_x(X_t)\} = E\{\delta_x(X_t)\, E(\delta_y(X_{t+s}) \mid \mathcal{F}_t)\} \\
&\overset{(8.1.11)}{=} E\{\delta_x(X_t)\, P_s\delta_y(X_t)\} = P_s\delta_y(x)\, E\{\delta_x(X_t)\} = P_s(x, y)\, P(X_t = x).
\end{aligned}
\]
Let us sketch how the martingale characterization (8.1.10) implies (8.1.11): Let f be a function on S. Since M_t^f is a martingale,
\[
E\big( f(X_{t+s}) \mid \mathcal{F}_t \big) - f(X_t) = E\Big( \int_0^s Qf(X_{t+\tau})\, d\tau \,\Big|\, \mathcal{F}_t \Big) = \int_0^s Q\, E\big( f(X_{t+\tau}) \mid \mathcal{F}_t \big)\, d\tau, \tag{8.1.15}
\]
which means that \(\tilde P_s f(X_t) \overset{\Delta}{=} E\big( f(X_{t+s}) \mid \mathcal{F}_t \big)\) satisfies, for any f:
(a) By right continuity, \(\lim_{s\to 0} \tilde P_s f(X_t) = f(X_t)\).
(b) For any s > 0, \(\frac{d}{ds}\tilde P_s f(X_t) = Q \tilde P_s f(X_t)\).
(a) and (b) above give a randomized form of the backward equation (8.1.6). By the uniqueness of solutions to the latter (which is just a fact about systems of ODEs when we are on a finite state space),
\[
E\big( f(X_{t+s}) \mid \mathcal{F}_t \big) = \tilde P_s f(X_t) = P_s f(X_t),
\]
which is (8.1.11).
\[
\mathcal{F}_T = \big\{ B : B \cap \{T \le t\} \in \mathcal{F}_t \text{ for any } t \in \mathbb{R}_+ \big\}.
\]
Then for any n, any function F on S^n and any 0 ≤ t₁ < t₂ < · · · < t_n,
\[
E\big( 1_{\{T<\infty\}} F(X_{T+t_1}, \ldots, X_{T+t_n}) \,\big|\, \mathcal{F}_T \big) = 1_{\{T<\infty\}}\, E_{X_T}\big( F(X_{t_1}, \ldots, X_{t_n}) \big) \quad P\text{-a.s.} \tag{8.1.16}
\]
In particular, given {T < ∞; X_T = x}, the chains X[0, T] and X[T, ∞) are independent, and the chain X[T, ∞) is distributed like the usual chain which starts from x. We use P_x for the latter.
Sketch of the proof. It is easy to understand claim (a): By the usual Markov property,
\[
P_x\big( \tau_x > t+s \mid \tau_x > t \big) = P_x( \tau_x > s ). \tag{8.1.18}
\]
To make this rigorous, however, requires some work. Indeed, if what we call the usual Markov property is (8.1.12), then it involves only a finite number of times, whereas the events {τ_x > t + s} involve a continuum of times. One should pass to a limit, and right-continuity will play a role.
Assuming (8.1.18), we conclude that τ_x is memoryless. Hence it is Exp(λ) with some λ ≥ 0. The fact that λ = q_x follows from the intuitively obvious (and indeed easily justifiable in the finite state space case) fact: For t small,
\[
P_x(\tau_x > t) = P_x(X_t = x) + o(t) = P_t(x, x) + o(t).
\]
But then λ = q_x follows from (8.1.2).
An alternative clean and very short proof of (a) follows from the martingale characterization (8.1.10) and from the following fact, which we inherit without proof from the discrete time case:
\[
\text{If } M_t \text{ is a martingale and } T \text{ is a stopping time, then } M_{t \wedge T} \text{ is also a martingale.} \tag{8.1.19}
\]
Fix x and define f(y) = 1_{\{y \ne x\}}. Using (8.1.10) and (8.1.19) with T = τ_x, we infer that
\[
E_x\big( f(X_{\tau_x \wedge t}) \big) = E_x\Big( \int_0^{\tau_x \wedge t} Qf(X_s)\, ds \Big) \tag{8.1.20}
\]
for any t ≥ 0. Note that for s < τ_x the quantity Qf(X_s) = q_x. On the other hand, by right continuity, f(X_{\tau_x \wedge t}) = 1_{\{\tau_x \le t\}}. Therefore (8.1.20) reads as: For any t ≥ 0,
\[
P_x(\tau_x \le t) = q_x\, E_x(\tau_x \wedge t) = q_x \int_0^t P_x(\tau_x > s)\, ds. \tag{8.1.21}
\]
(8.1.21) is an integral equation for P_x(τ_x > t). It has the unique solution P_x(τ_x > t) = e^{−q_x t}, which is precisely the tail of Exp(q_x).
Claim (b) of the theorem follows easily if we are permitted to rely on optional stopping in the continuous case. The relevant input (without proof) follows:
Optional stopping for continuous time martingales. Let M_t be a martingale with bounded increments in the following sense: there exists R < ∞ such that \(|M_{s+t} - M_s|\, 1_{\{T \le s+t\}} \le tR\), P-a.s. for all t and s. Let T be a stopping time with finite expectation E(T) < ∞. Then E(M_T) = E(M_0).
Turning back to the proof of (b), pick y ≠ x. Then, by (8.1.10), the process
\[
M_t = \delta_y(X_t) - \int_0^t Q\delta_y(X_s)\, ds
\]
is a martingale. In view of (a), E_x(τ_x) = 1/q_x < ∞, and the conditions of the optional stopping theorem above are satisfied. Therefore, noting that Qδ_y(x) = q_{xy},
\[
P_x\big( X_{\tau_x} = y \big) = E_x\big( \delta_y(X_{\tau_x}) \big) = q_{xy}\, E_x(\tau_x) = \frac{q_{xy}}{q_x}.
\]
A short proof of (c) relies on the following generalization of the martingale property (8.1.10):
General martingales related to CTMC. Let X be a CTMC on a finite state space S, and let g(t, x) be a function on R₊ × S which is bounded and whose derivative \(\frac{\partial}{\partial t} g(t, x)\) is also bounded. Then
\[
M_t^g = g(t, X_t) - \int_0^t \Big( \frac{\partial}{\partial s} g(s, X_s) + Qg(s, X_s) \Big)\, ds \tag{8.1.23}
\]
is a martingale.
Let us turn to the proof of (c). Pick y ≠ x and λ > 0, and consider g(t, z) = e^{−λt} δ_y(z). Again, if q_x > 0, optional stopping applies to M_t^g and τ_x under P_x. Since under P_x, M_0^g = g(0, x) = 0,
\[
0 = E_x\big( M_{\tau_x}^g \big) = E_x\Big( e^{-\lambda \tau_x} 1_{\{X_{\tau_x} = y\}} \Big) - q_{xy}\, E_x\Big( \int_0^{\tau_x} e^{-\lambda s}\, ds \Big).
\]
is a martingale. This has the following far reaching generalization which we proceed to discuss:
Let X be a CTMC on a finite state space S with Q-matrix Q. Given a Poisson process N and a cadlag random function ψ which is adapted to the filtration F (meaning that for any t the random variable ψ(t) is \(\mathcal{F}_t\)-measurable), define the integral
\[
\int_0^t \psi(s-)\, dN(s) = \sum_{k=1}^{N_t} \psi(S_k-),
\]
where S₁, S₂, . . . are the arrival times of N.
Exercise 8.1.3.
(a) Check (at least on a heuristic level) that for any x ≠ y with q_{xy} > 0 and for any function g on S the process
\[
M_t^g = \int_0^t g(X_{s-})\, dN_{xy}(s) - q_{xy} \int_0^t g(X_s)\, ds \tag{8.1.27}
\]
is a martingale. Above, X_{s−} is the left limit of X at s, and, given a function ψ on [0, t] and a Poisson process N, the integral \(\int_0^t \psi(s)\, dN(s)\) is defined as above.
(b) Use (8.1.27) to derive the following: Let Π_t(x) be the total time spent by X at x during [0, t], and let J_{xy}(t) be the number of jumps from x to y during [0, t]. Then,
\[
M_t^{xy} = J_{xy}(t) - q_{xy}\, \Pi_t(x)
\]
is a martingale.
Hint: Consider g = δ_x.
(c) Again, at least on a heuristic level, check that
\[
E\big( (M_{t+s}^{xy} - M_t^{xy})^2 \big) \le q_{xy}\, s. \tag{8.1.28}
\]
Assuming that the LLN (5.4.5) holds for continuous time martingales (which it does), we infer from (8.1.28) that
\[
\lim_{t\to\infty} \frac{1}{t}\big( J_{xy}(t) - q_{xy}\, \Pi_t(x) \big) = 0.
\]
In particular, if \(\lim_{t\to\infty} \Pi_t(x)/t = \pi(x)\) exists, then
\[
\lim_{t\to\infty} \frac{J_{xy}(t)}{t} = \pi(x)\, q_{xy}.
\]
The latter conclusion is both an instance of PASTA and of the Ergodic Theorem for CTMC on a finite state space, which we proceed to discuss.
8.2 Ergodic theorem for CTMC on a finite state space.
Let us say that a CTMC X on a finite state space S is irreducible if its jump chain Y defined in (8.1.25) is. Using Graphical Construction 1, we can easily transfer all the conclusions from discrete MCs (on finite state spaces) to CTMCs. For instance, let T_y be the first time when X_t arrives/returns to y (that is, after making at least one jump). Let N_y be the number of steps needed for Y_n to reach/return to y. Then for x ≠ y,
\[
E_x(T_y) = E_x\Big( \sum_{\ell=0}^{N_y - 1} \frac{\xi_\ell}{q_{Y_\ell}} \Big).
\]
Since, for irreducible chains on a finite state space, \(\bar q \overset{\Delta}{=} \min_z q_z > 0\), we conclude that
\[
\max_{x \ne y} E_x(T_y) \le \frac{1}{\bar q}\, \max_{x \ne y} E_x(N_y) < \infty.
\]
In particular, irreducible CTMCs on finite state spaces are always positively recurrent.
Next, the cycle decomposition for X_t is inherited from the cycle decomposition of Y_n: Fix x ∈ S. Let Ñ₁, N₁, N₂, . . . be cycle lengths (integer inter-renewal times) for Y_n. Recall that N_i is distributed as the first return time N_x under P_x. Define independent T̃₁, T₁, T₂, . . . via
\[
\tilde T_1 = \sum_{\ell=0}^{\tilde N_1 - 1} \frac{\xi_\ell}{q_{Y_\ell}} \quad \text{and} \quad T_i \overset{P_x}{\sim} \sum_{\ell=0}^{N_i - 1} \frac{\xi_\ell}{q_{Y_\ell}}. \tag{8.2.1}
\]
Example 8.3.1. A Pure Birth Chain has generator Q with the following jump rates:
\[
0 \xrightarrow{\ q_0\ } 1 \xrightarrow{\ q_1\ } 2 \xrightarrow{\ q_2\ } 3 \xrightarrow{\ q_3\ } \cdots \tag{8.3.2}
\]
Recall that there were three ways to think about CTMC on finite state spaces: Graphical
representations, Analytic and via Martingale problems. Let us check whether and how
these approaches go through in the case of countable state spaces:
Graphical representation. Consider (8.1.26). Because of (8.3.1) everything is
well defined and the construction makes sense. However, it might be ambiguous.
Indeed, define
\[
J_n = \sum_{\ell=0}^{n} \tau_\ell \quad \text{and} \quad J_\infty = \lim_{n\to\infty} J_n = \sum_{\ell=0}^{\infty} \tau_\ell. \tag{8.3.3}
\]
Then, (8.1.26) tells how to construct X_t only for t < J_∞.
Definition (Explosion). The chain explodes if
\[
P(J_\infty < \infty) > 0. \tag{8.3.4}
\]
The construction does not tell what happens with the process on [J_∞, ∞). There is a freedom to postulate this. The simplest thing to do is to declare that the process is killed at J_∞, and either add a cemetery state ∂ where the process is absorbed at J_∞, or think in terms of sub-probability distributions. Or, for instance, in the pure birth process with \(\sum_\ell q_\ell^{-1} < \infty\) it is possible to declare X_{J_∞} = 0 or, more generally, to fix p ∈ [0, 1] and to sample X_{J_∞} from any probability distribution on N₀ with probability p, or to send it to the cemetery state ∂ with probability q = 1 − p.
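A numerical illustration (not from the notes): for the pure birth chain with rates q_ℓ = (ℓ + 1)² one has Σ_ℓ 1/q_ℓ = π²/6 < ∞, so E(J_∞) < ∞ and the chain explodes a.s. The sketch below checks the convergence of the series of expected holding times and simulates (truncated) explosion times.

```python
import math
import random

# q_l = (l + 1)^2: the expected holding times 1/q_l are summable
# (the series converges to pi^2/6), hence E(J_infty) < infty
expected_J = sum(1.0 / (l + 1) ** 2 for l in range(10 ** 5))
assert expected_J < math.pi ** 2 / 6 < expected_J + 1e-4

rng = random.Random(0)
def sample_J(levels=10 ** 4):
    # approximate J_infty by the time to traverse the first `levels` states
    return sum(rng.expovariate((l + 1) ** 2) for l in range(levels))

samples = [sample_J() for _ in range(500)]
mean_J = sum(samples) / len(samples)
assert abs(mean_J - math.pi ** 2 / 6) < 0.2   # E(J_infty) = pi^2/6 ~ 1.645
```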
Analytic approach. In the countable case it is not immediately clear how to make sense out of (8.1.2). However, (8.1.5) and (8.1.6) could be viewed as an infinite system of ODEs.
Theorem (Without proof). Pt defined in (8.3.6) is always a solution, and the minimal
one, to both (8.1.5) and (8.1.6). It is unique iff the process is non-explosive, that is if
P (J∞ = ∞) = 1.
Note that uniqueness in the non-explosive case follows from minimality. Indeed, if P̃_t is another solution, then by minimality P̃_t ≥ P_t. But in the non-explosive case \(\sum_y P_t(x, y) \equiv 1\). Since \(\sum_y \tilde P_t(x, y) \le 1\), the equality follows.
Martingale problem. Both M_t^f in (8.1.10) and M_t^g in (8.1.23) are ambiguous if the process is explosive. However, if P(J_∞ = ∞) = 1, then M_t^g is a martingale. Indeed, let S_n be an increasing sequence of finite subsets of S such that S = ∪S_n. Set
\[
R_n = \inf\{ t \ge 0 : X_t \notin S_n \}.
\]
Since we assume that the chain is non-explosive,
\[
\lim_{n\to\infty} R_n = \infty \quad P\text{-a.s.} \tag{8.3.7}
\]
On the other hand, X_{t∧R_n} could be viewed as a CTMC on the finite set S_n ∪ {∂}, where we kill X at time R_n: X_{R_n} = ∂.
For any bounded (and with bounded derivatives) function g consider
\[
g_n(t, x) = \begin{cases} g(t, x), & \text{if } x \in S_n, \\ 0, & \text{if } x = \partial. \end{cases}
\]
Then,
\[
M_t^{g,n} = g_n(t \wedge R_n, X_{t \wedge R_n}) - g_n(0, X_0) - \int_0^{t \wedge R_n} \Big( \frac{\partial g(s, X_s)}{\partial s} + Qg(s, X_s) \Big)\, ds
\]
is a martingale for any n ∈ N.
By (8.3.7) the limit
\[
M_t^g = \lim_{n\to\infty} M_t^{g,n} = g(t, X_t) - g(0, X_0) - \int_0^t \Big( \frac{\partial g(s, X_s)}{\partial s} + Qg(s, X_s) \Big)\, ds
\]
exists. Using (BON) for conditional expectations, one can conclude that M_t^g is a martingale as well.
8.4 Explosions.
We shall discuss irreducible chains here. Let Q be a Q-matrix. Pick λ > 0 and consider the following, in general infinite, system of linear equations: For any x ∈ S,
\[
(\lambda + q_x)\, g(x) = \sum_{y \ne x} q_{xy}\, g(y) \quad \text{or, in vector notation,} \quad \lambda g = Qg. \tag{8.4.1}
\]
Theorem (Reuter's criterion). An irreducible CTMC is explosive if and only if for some λ > 0 there exists a bounded and positive solution g_λ to (8.4.1). If such g_λ exists for some λ > 0, then it exists for all λ > 0.
Proof. Assume first that the chain is explosive. Then
\[
g_\lambda(x) = E_x\, e^{-\lambda J_\infty} \tag{8.4.2}
\]
is positive (and of course bounded) for any x ∈ S. We claim that g_λ solves (8.4.1). Indeed, by the Graphical construction, τ_x ∼ Exp(q_x) under P_x and it is independent of X_{τ_x}. By the same authority,
\[
P_x\big( X_{\tau_x} = y \big) = \frac{q_{xy}}{q_x}.
\]
Hence
\[
g_\lambda(x) = E_x\, e^{-\lambda J_\infty} = \sum_{y \ne x} \frac{q_{xy}}{q_x}\, E_x\big( e^{-\lambda \tau_x} \big)\, E_y\, e^{-\lambda J_\infty} = \frac{q_x}{\lambda + q_x} \sum_{y \ne x} \frac{q_{xy}}{q_x}\, g_\lambda(y),
\]
which is (8.4.1).
Assume now that g_λ is a positive and bounded solution of (8.4.1) for some λ > 0. There is no loss of generality in assuming that sup_y g_λ(y) ≤ 1. Pick any x ∈ S. Then,
\[
E_x\big( e^{-\lambda J_1}\, g_\lambda(X_{J_1}) \big) = E_x\big( e^{-\lambda J_1} \big) \sum_{y \ne x} \frac{q_{xy}}{q_x}\, g_\lambda(y) \overset{(8.4.1)}{=} g_\lambda(x).
\]
Exercise 8.4.3. Check that, in general, an irreducible CTMC X_t on a countable state space S is non-explosive if one of the following conditions holds:
(a) sup_x q_x < ∞.
(b) X_t is recurrent.
Exercise 8.4.4.
Find an example of a CTMC such that P (J∞ < ∞) ∈ (0, 1).
Hint: Think of two different pure birth processes or, more generally, of two
different transient birth and death processes.
8.5 Ergodic theorem for CTMC on countable state spaces.
In the sequel we assume that the CTMC X_t is recurrent (and hence, in particular, non-explosive).
Steady state equation. Let us say that a non-trivial and non-negative {π_x} is an invariant measure if
\[
\pi_x q_x = \sum_{y \ne x} \pi_y q_{yx} \tag{SE}
\]
for any x ∈ S. An invariant measure is called an invariant distribution if \(\sum_x \pi_x = 1\).
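A quick way to see (SE) at work numerically (a sketch, not from the notes; the 3-state generator is made up): the uniformization P = I + Q/Λ, with Λ ≥ max_x q_x, is a stochastic matrix with the same invariant distribution as Q, so power iteration on P produces a π that satisfies (SE).

```python
Q = [[-3.0, 2.0, 1.0],
     [1.0, -1.0, 0.0],
     [4.0, 2.0, -6.0]]   # rows sum to zero; irreducible
Lam = 10.0
P = [[(1.0 if i == j else 0.0) + Q[i][j] / Lam for j in range(3)] for i in range(3)]

pi = [1 / 3] * 3
for _ in range(5000):   # power iteration: pi <- pi P
    pi = [sum(pi[i] * P[i][j] for i in range(3)) for j in range(3)]

# check the steady state equation (SE): pi_x q_x = sum_{y != x} pi_y q_{yx}
for x in range(3):
    lhs = pi[x] * (-Q[x][x])
    rhs = sum(pi[y] * Q[y][x] for y in range(3) if y != x)
    assert abs(lhs - rhs) < 1e-9
assert abs(sum(pi) - 1.0) < 1e-9
```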
\[
\mu_y = E_x\Big( \sum_{n=0}^{N_x - 1} 1_{\{Y_n = y\}} \Big)
\]
(c) Let J_{xy}(t) be the number of jumps of X_t from x to y up to time t. Check that
\[
\lim_{t\to\infty} \frac{J_{xy}(t)}{t} = \pi_x\, q_{xy}. \tag{8.5.6}
\]
Let X_t be an ergodic CTMC on some state space S and let N(t) be a Poisson process of intensity λ, such that X[0, t) and N[t, ∞) are independent. Let g be a function on S. Then, since by definition X_t is cadlag and, in particular, has left limits, the expression
\[
\int_0^t g(X_{s-})\, dN(s)
\]
is well defined by means of (8.6.2). Consider
\[
M_t^g = \int_0^t g(X_{s-})\, dN(s) - \lambda \int_0^t g(X_s)\, ds.
\]
A generalization of (8.6.1) implies that M_t^g is a martingale. If, in addition, \(\sup_t E\, g^2(X_t) < \infty\), then M_t^g satisfies the LLN, that is,
\[
\lim_{t\to\infty} \frac{1}{t} M_t^g = 0 \quad P\text{-a.s.} \tag{8.6.3}
\]
However, by (8.5.5),
\[
\lim_{t\to\infty} \frac{1}{t} \int_0^t g(X_{s-})\, ds = \sum_x \pi_x\, g(x).
\]
The quantity \(\sum_x \pi_x g(x)\) is the objective long-range time average of g(X_t).
Since \(\lim_{t\to\infty} N(t)/t = \lambda\), using S₁, S₂, . . . to denote the arrival times of N, we conclude:
\[
\lambda \sum_x \pi_x\, g(x) = \lim_{t\to\infty} \frac{1}{t} \sum_{k=1}^{N(t)} g(X_{S_k-}) = \lambda \lim_{N\to\infty} \frac{1}{N} \sum_{k=1}^{N} g(X_{S_k-}).
\]
Exercise 8.6.1. Assume that buses arrive to a station according to Poisson process
of intensity µ and passengers arrive to this station according to Poisson process of
intensity λ, and independently of buses. Assume that all the passengers board the
bus when it arrives, and the whole procedure takes essentially zero time.
Define N_k to be the number of passengers already waiting at the station which passenger number k sees upon his/her arrival. Compute
\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} N_k.
\]
8.7 Reversibility.
Let us start by setting up appropriate notions in the context of irreducible discrete
time Markov chains Y = (Yn ) on finite or countable state spaces S.
In the latter case we shall say the Markov chain X is reversible with respect to µ.
\[
\mu_{x_1} P(x_1, x_2) P(x_2, x_3) \cdots P(x_{n-1}, x_n) = \mu_{x_n} P(x_n, x_{n-1}) P(x_{n-1}, x_{n-2}) \cdots P(x_2, x_1). \tag{8.7.2}
\]
\[
\mu_y = \frac{P(x_0, x_1) \cdots P(x_n, y)}{P(y, x_n) \cdots P(x_1, x_0)}. \tag{8.7.4}
\]
Exercise 8.7.1. Check that under (8.7.3) the measure µy in (8.7.4) is well defined
(that is the expression in the right hand side of (8.7.4) does not depend on the path
chosen), and that µ indeed satisfies the detailed balance condition with respect to P.
Let us turn to the case of continuous time Markov chains. Consider an irreducible CTMC X with Q-matrix Q as in (8.1.1). Assume that π is a (positive) invariant measure: πQ = 0. Note that we do not assume that π is a distribution, or even that \(\sum_x \pi_x < \infty\).
We shall denote the reversed chain by X̂. Note also that if Ŷ is the jump chain of X̂, then its matrix of transition probabilities R̂ satisfies
\[
\hat R(x, y) = \frac{\hat q_{xy}}{q_x} \overset{(8.7.5),(8.7.6)}{=} \frac{\pi_y q_{yx}}{\pi_x q_x} = \frac{1}{\mu_x}\, \mu_y R(y, x), \tag{8.7.7}
\]
where µ_x = π_x q_x is the invariant measure of the jump chain Y of the original CTMC X. In other words, Ŷ is the reversal of Y. In particular, (8.7.2) holds for the pair Y, Ŷ of discrete time MCs.
The following claim (without proof) is a generalization of (8.7.2) for CTMC: Direct and reversed chains satisfy: For any 0 < t₁ < t₂ < · · · < t_n and any x₀, x₁, . . . , x_n,
\[
\pi_{x_0} P_{x_0}\big( X_{t_1} = x_1, \ldots, X_{t_n} = x_n \big) = \pi_{x_n} \hat P_{x_n}\big( X_{t_n - t_{n-1}} = x_{n-1}, \ldots, X_{t_n - t_1} = x_1, X_{t_n} = x_0 \big). \tag{8.7.8}
\]
A proof of (8.7.8) is based on (8.7.2) for the associated jump chain Y, the relation µ_x = q_x π_x between the invariant measure π of X and the invariant measure µ of Y, and on the following.
In the latter case, that is, when π satisfies the detailed balance condition (8.7.11): π_x q_{xy} = π_y q_{yx} for all x ≠ y, we shall say the CTMC X is reversible with respect to π.
Note that if π satisfies (8.7.11), then it is invariant. Indeed,
\[
\sum_{y \ne x} \pi_y\, q_{yx} \overset{(8.7.11)}{=} \sum_{y \ne x} \pi_x\, q_{xy} = \pi_x\, q_x.
\]
Theorem 8.7.1. Assume that (an irreducible) CTMC X is reversible with respect to π. Then for any 0 < t₁ < t₂ < · · · < t_n and any x₀, x₁, . . . , x_n,
\[
\pi_{x_0} P_{x_0}\big( X_{t_1} = x_1, \ldots, X_{t_n} = x_n \big) = \pi_{x_n} P_{x_n}\big( X_{t_n - t_{n-1}} = x_{n-1}, \ldots, X_{t_n - t_1} = x_1, X_{t_n} = x_0 \big). \tag{8.7.12}
\]
If π is a probability distribution, then the right hand side of (8.7.12) defines finite dimensional distributions for a CTMC X = {X(t)}_{t∈(−∞,∞)} in equilibrium. In this case (8.7.12) implies that the time reversal X̂(t) = X(−t) of X has the same distribution as X.
Theorem 8.7.1 has dramatic implications for birth and death processes and re-
lated queuing systems and networks.
Example 8.7.1. Consider an M/M/1 queue with customer arrival rate λ and service rate µ. It is described by a CTMC X on N₀ with jump rates
\[
q_{km} = \lambda\, 1_{\{m = k+1\}} + \mu\, 1_{\{m = k-1\}}.
\]
Set ρ = λ/µ. Then X is reversible with respect to π_k = ρ^k. If ρ < 1, X is ergodic, and the invariant distribution is given by π_k = (1 − ρ)ρ^k, which is just Geo(1 − ρ).
In equilibrium X and its time reversal X̂ have the same distribution. But arrivals of X̂ are departures of X. Let D(t) be the process of departures of X. Theorem 8.7.1 implies:
\[
\text{$D$ is a Poisson process with intensity $\lambda$. Moreover, $D(-\infty, t]$ and $X(t)$ are independent.} \tag{8.7.13}
\]
The statement (8.7.13) above is called Burke's Theorem.
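The geometric equilibrium of Example 8.7.1 shows up readily in simulation (a sketch, not from the notes): with λ = 1, µ = 2 (so ρ = 1/2), the long-run fraction of time the queue spends in state k approaches (1 − ρ)ρ^k.

```python
import random

rng = random.Random(42)
lam, mu = 1.0, 2.0
rho = lam / mu   # = 1/2 < 1, so the queue is ergodic

k, t = 0, 0.0
time_at = {}
for _ in range(400000):
    rate = lam + (mu if k > 0 else 0.0)   # total jump rate at state k
    hold = rng.expovariate(rate)
    time_at[k] = time_at.get(k, 0.0) + hold
    t += hold
    # jump up with probability lam/rate, down otherwise (at k = 0 always up)
    k = k + 1 if rng.random() < lam / rate else k - 1

for j in range(3):
    empirical = time_at.get(j, 0.0) / t
    assert abs(empirical - (1 - rho) * rho ** j) < 0.03
```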
Consider now general birth and death processes X defined in Exercise 8.4.2.
Exercise 8.7.3. Recall that the state space of X_t is S = N₀. The jump rates are, for any k ∈ N₀, given by
\[
Q(k, k+1) = \lambda_k \quad \text{and} \quad Q(k+1, k) = \mu_{k+1}. \tag{8.7.14}
\]
(i) Check that X is reversible with respect to
\[
\pi_0 = 1 \quad \text{and, for } k > 0, \quad \pi_k = \frac{\lambda_0 \cdots \lambda_{k-1}}{\mu_1 \cdots \mu_k}, \tag{8.7.15}
\]
and find a necessary and sufficient condition (in terms of the λ_i-s and µ_j-s) for the ergodicity of X.
(ii) If λ_i ≡ λ (constant birth rates) and X is ergodic, check that the conclusion (8.7.13) of Burke's theorem still holds.
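Formula (8.7.15) is just the detailed balance relations π_k λ_k = π_{k+1} µ_{k+1} telescoped. A tiny sketch (not from the notes; the rate values are made up) checks this with exact rational arithmetic:

```python
from fractions import Fraction

# hypothetical birth and death rates, just for the check
lam = [Fraction(3), Fraction(1), Fraction(4), Fraction(2)]       # lambda_0..lambda_3
mu = [None, Fraction(2), Fraction(5), Fraction(1), Fraction(3)]  # mu_1..mu_4

# pi_k per (8.7.15): pi_0 = 1, pi_k = (lambda_0...lambda_{k-1})/(mu_1...mu_k)
pi = [Fraction(1)]
for k in range(1, 5):
    pi.append(pi[-1] * lam[k - 1] / mu[k])

# detailed balance: pi_k * lambda_k = pi_{k+1} * mu_{k+1}
for k in range(4):
    assert pi[k] * lam[k] == pi[k + 1] * mu[k + 1]
```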
Queuing Networks
Consider a tandem of two Markovian queues (birth and death processes) Q₁ and Q₂, for instance Q₁ = M/M/1 (gas station) and Q₂ = M/M/N/N (rest area parking lot next to the gas station). Assume that:
a. Customers arrive to Q₁ with rate λ₁. The service rate at Q₁ is µ₁.
b. Customers bypass Q₁ and arrive directly to Q₂ with rate λ₂. The service rate at Q₂ is µ₂.
c. Each client departing from Q₁ goes to Q₂ with probability p, and leaves the system with probability 1 − p, independently of all other clients.
a.–c. above define an a-cyclic service network, depicted schematically in Figure 1: arrivals of rate λ₁ feed the M/M/1 queue; each departure from it joins the M/M/N/N queue with probability p or leaves the system with probability 1 − p; independent arrivals of rate λ₂ feed the M/M/N/N queue directly.
Set ρ₁ = λ₁/µ₁. If ρ₁ < 1, then the M/M/1 queue on Figure 1 is ergodic. Let X₁ = {X₁(t)}_{t∈R} be its state (number of customers) in equilibrium. Then X₁(t) ∼ π₁ = Geo(1 − ρ₁) for any t ∈ R. Furthermore, by Burke's theorem the process of departures D₁ from Q₁ is Poisson with intensity λ₁. Arrivals to Q₂ come from two independent (Poissonian) sources: direct arrivals with intensity λ₂, and a thinning of the departures from Q₁, with intensity pλ₁. Hence the effective rate r₂ of arrivals to Q₂ equals r₂ = λ₂ + pλ₁. The process X₂ lives on a finite state space {0, 1, . . . , N}, and it is, therefore, ergodic. In equilibrium X₂ is distributed according to π₂, which is given by
\[
\pi_2(k) = \frac{1}{c}\, \frac{\rho_2^k}{k!}, \tag{8.7.16}
\]
where
\[
\rho_2 = \frac{r_2}{\mu_2} = \frac{\lambda_2 + p\lambda_1}{\mu_2}.
\]
By Burke's theorem, in equilibrium the process of departures D₁(−∞, t] from Q₁ is independent of X₁(t). Hence, in equilibrium, X₁(t) and X₂(t) are independent for each t ∈ R.
Exercise 8.7.4. Give an argument (even a heuristic one) which would imply that for t ≠ s the random variables X₁(t) and X₂(s) are, in general, dependent.
Another example of a more complicated a-cyclic queuing network is depicted on Figure 2.
[Figure 2: an a-cyclic network of four stations. Station 1 = M/M/2 with external arrivals of rate λ₁, routing probabilities p₁₂, p₁₃, p₁₄ and exit probability p₁J; Station 2 = M/M/4 with external arrivals of rate λ₂ and routing probabilities p₂₃, p₂₄; Station 3 = M/M/∞ with external arrivals of rate λ₃ and p₃₄ = 1; Station 4 = M/M/N.]
Exercise 8.7.5. Consider the network of four Markovian queues as on Figure 2, with
arrival rates: λ₁ = 6, λ₂ = 5, λ₃ = 3;
service rates (per server): µ₁ = 4, µ₂ = 2, µ₃ = 1, µ₄ = 3;
transition probabilities: p₁₂ = 1/3, p₁₃ = 1/6, p₁₄ = 1/6, p₁J = 1/3, p₂₃ = 2/7, p₂₄ = 5/7, p₃₄ = 1.
For instance, a customer after receiving service at Station 1 either leaves the system with probability p₁J = 1/3, or goes to Station 2 with probability p₁₂ = 1/3, or goes to Station 3 with probability p₁₃ = 1/6, or goes to Station 4 with probability p₁₄ = 1/6, and so on.
(i) For which values of N does the network have an invariant distribution?
(ii) Calculate the invariant distribution for N = ∞.
A solution to Exercise 8.7.5 should be based on the following principle for a-cyclic networks with constant arrival rates.
Definition 8.7.4. An a-cyclic network with n nodes is a graph with vertices {1, . . . , n} and with oriented edges e_ij; i < j. By construction it does not contain loops. There is a service station at each node i, and for each node/station i we specify:
a. The maximal service rate µ_i.
b. The rate λ_i of customers who arrive directly to station i.
Furthermore, for each two stations i < j we specify:
c. The probability p_ij that a customer after leaving station i goes directly to station j.
It is assumed that decisions of different customers and decisions of the same customer at different stations are independent.
If the network is ergodic, then Burke's theorem implies that in equilibrium the arrivals of customers to the stations i = 1, . . . , n are according to Poisson processes with effective rates r_i. The effective rates satisfy the following system of equations:
\[
r_j = \lambda_j + \sum_{i < j} r_i\, p_{ij}. \tag{8.7.17}
\]
The other way around, (8.7.17) gives a criterion for stability/ergodicity of the a-cyclic network: it is easy to see that the solution to (8.7.17) always exists and is always unique:
\[
r_1 = \lambda_1, \quad r_2 = \lambda_2 + r_1 p_{12}, \quad \ldots
\]
Then the a-cyclic network is ergodic iff
\[
\frac{r_i}{\mu_i} < 1 \quad \text{for } i = 1, \ldots, n. \tag{8.7.18}
\]
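Since the system (8.7.17) is triangular, it can be solved by a single forward pass. The sketch below (not from the notes; the dict-based encoding and station renumbering 0–3 are mine) runs it on the data of Exercise 8.7.5:

```python
from fractions import Fraction as F

def effective_rates(lam, p):
    """Solve r_j = lam_j + sum_{i<j} r_i p_ij by a forward pass (8.7.17)."""
    n = len(lam)
    r = list(lam)
    for j in range(n):
        for i in range(j):
            r[j] += r[i] * p.get((i, j), F(0))
    return r

# data of Exercise 8.7.5 (stations renumbered 0..3; station 3 has no external arrivals)
lam = [F(6), F(5), F(3), F(0)]
p = {(0, 1): F(1, 3), (0, 2): F(1, 6), (0, 3): F(1, 6),
     (1, 2): F(2, 7), (1, 3): F(5, 7), (2, 3): F(1)}
r = effective_rates(lam, p)
assert r == [F(6), F(7), F(6), F(12)]
```

With these effective rates, the stability condition (8.7.18) can then be checked station by station.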
Index
Modes of convergence, 18
Random walk
Exit probabilities, 76
Renewal
Defective, 35
Delayed, 35
Excess life distribution, 45, 113
Renewal Theorem
Delayed, 99
Elementary, 37
Renewal-reward, 41, 103
Reversibility
Detailed Balance CTMC, 133
Detailed Balance MC, 131
Kolmogorov’s criterion MC, 132
Sigma algebra, 6
Filtration, 38
Statistics
Bose-Einstein, 18
Maxwell-Boltzmann, 18
Stopping time, 38
Tail σ-algebra, 24
Kolmogorov’s 0 − 1 Law, 24
Theorem
Burke’s, 134
Wald’s formula, 39