
Advanced Probability

Perla Sousi
October 13, 2013

University of Cambridge, Cambridge, UK; p.sousi@statslab.cam.ac.uk

Contents

1 Conditional expectation
  1.1 Discrete case
  1.2 Existence and uniqueness
  1.3 Product measure and Fubini's theorem
  1.4 Examples of conditional expectation
    1.4.1 Gaussian case
    1.4.2 Conditional density functions

2 Discrete-time martingales
  2.1 Stopping times
  2.2 Optional stopping
  2.3 Gambler's ruin
  2.4 Martingale convergence theorem
  2.5 Doob's inequalities
  2.6 L^p convergence for p > 1
  2.7 Uniformly integrable martingales
  2.8 Backwards martingales
  2.9 Applications of martingales
    2.9.1 Martingale proof of the Radon-Nikodym theorem

3 Continuous-time random processes
  3.1 Definitions
  3.2 Martingale regularization theorem
  3.3 Convergence and Doob's inequalities in continuous time
  3.4 Kolmogorov's continuity criterion

4 Weak convergence
  4.1 Definitions
  4.2 Tightness
  4.3 Characteristic functions

5 Large deviations
  5.1 Introduction
  5.2 Cramér's theorem
  5.3 Examples

6 Brownian motion
  6.1 History and definition
  6.2 Wiener's theorem
  6.3 Invariance properties
  6.4 Strong Markov property
  6.5 Reflection principle
  6.6 Martingales for Brownian motion
  6.7 Recurrence and transience
  6.8 Brownian motion and the Dirichlet problem
  6.9 Donsker's invariance principle
  6.10 Zeros of Brownian motion

7 Poisson random measures
  7.1 Construction and basic properties
  7.2 Integrals with respect to a Poisson random measure
  7.3 Poisson Brownian motions

1 Conditional expectation

Let (Ω, F, P) be a probability space, i.e. Ω is a set, F is a σ-algebra on Ω and P is a
probability measure on (Ω, F).

Definition 1.1. F is a σ-algebra on Ω if it satisfies:
1. Ω ∈ F.
2. If A ∈ F, then also the complement is in F, i.e., A^c ∈ F.
3. If (A_n)_{n≥1} is a collection of sets in F, then ∪_{n=1}^∞ A_n ∈ F.

Definition 1.2. P is a probability measure on (Ω, F) if it satisfies:
1. P : F → [0, 1], i.e. it is a set function.
2. P(Ω) = 1 and P(∅) = 0.
3. If (A_n)_{n≥1} is a collection of pairwise disjoint sets in F, then
    P(∪_{n=1}^∞ A_n) = Σ_{n=1}^∞ P(A_n).

Let A, B ∈ F be two events with P(B) > 0. Then the conditional probability of A given the
event B is defined by
    P(A|B) = P(A ∩ B) / P(B).

Definition 1.3. The Borel σ-algebra, B(R), is the σ-algebra generated by the open sets of
R, i.e., it is the intersection of all σ-algebras containing the open sets of R. More formally,
let O be the collection of open sets of R; then
    B(R) = ∩ {E : E is a σ-algebra containing O}.
Informally speaking: take the open sets of R, perform all possible operations (unions,
intersections, complements), and keep the smallest σ-algebra that you get.
Definition 1.4. X is a random variable, i.e., a measurable function with respect to F,
if X : Ω → R is a function with the property that for all open sets V the inverse image
X^{-1}(V) ∈ F.

Remark 1.5. If X is a random variable, then the collection of sets
    {B ⊆ R : X^{-1}(B) ∈ F}
is a σ-algebra (check!) and hence it must contain B(R).

Definition 1.6. For a collection A of subsets of Ω we write σ(A) for the smallest σ-algebra
that contains A, i.e.
    σ(A) = ∩ {E : E is a σ-algebra containing A}.
Let (X_i)_{i∈I} be a collection of random variables. Then we define
    σ(X_i : i ∈ I) = σ({ω : X_i(ω) ∈ B} : i ∈ I, B ∈ B(R)),
i.e. this is the smallest σ-algebra that makes (X_i)_{i∈I} measurable.

Let A ∈ F. The indicator function 1(A) is defined via
    1(A)(x) = 1(x ∈ A) = 1, if x ∈ A; 0, otherwise.

Recall the definition of expectation. First, for positive simple random variables, i.e., linear
combinations of indicator random variables, we define
    E[Σ_{i=1}^n c_i 1(A_i)] := Σ_{i=1}^n c_i P(A_i),
where the c_i are positive constants and the A_i are measurable events. Next, let X be a
non-negative random variable. Then X is the increasing limit of positive simple variables.
For example,
    X_n(ω) = 2^{-n} ⌊2^n X(ω)⌋ ∧ n ↑ X(ω) as n → ∞.
So we define
    E[X] := lim_{n→∞} E[X_n].
Finally, for a general random variable X, we can write X = X^+ - X^-, where X^+ = max(X, 0)
and X^- = max(-X, 0), and we define
    E[X] := E[X^+] - E[X^-],
if at least one of E[X^+], E[X^-] is finite. We call the random variable X integrable if it
satisfies E[|X|] < ∞.

Let X be a random variable with E[|X|] < ∞. Let A be an event in F with P(A) > 0. Then
the conditional expectation of X given A is defined by
    E[X|A] = E[X 1(A)] / P(A).

Our goal is to extend the definition of conditional expectation to σ-algebras. So far we have
only defined it for events, and it was a number. Now the conditional expectation is going
to be a random variable, measurable with respect to the σ-algebra with respect to which we
are conditioning.
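As a quick numerical sanity check (a Python sketch, not part of the notes; the value `x` stands in for X(ω)), the staircase variables X_n = 2^{-n}⌊2^n X⌋ ∧ n are indeed non-decreasing and converge to X:

```python
import math

def dyadic_approx(x, n):
    # X_n(omega) = (2^{-n} * floor(2^n * X(omega))) ∧ n, for a non-negative value x
    return min(math.floor(2**n * x) / 2**n, n)

x = math.pi  # stand-in for X(omega) >= 0
approxs = [dyadic_approx(x, n) for n in range(1, 12)]
# the approximations are non-decreasing ...
assert all(a <= b for a, b in zip(approxs, approxs[1:]))
# ... and once n exceeds x, they are within 2^{-n} of x
assert all(x - dyadic_approx(x, n) <= 2**-n for n in range(4, 12))
```

By monotone convergence, E[X_n] then increases to E[X], which is exactly how the expectation is defined above.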

1.1 Discrete case

Let X be integrable, i.e., E[|X|] < ∞. Let us start with a σ-algebra which is generated by a
countable family of disjoint events (B_i)_{i∈I} with ∪_i B_i = Ω, i.e., G = σ(B_i, i ∈ I). It is easy
to check that G = {∪_{i∈J} B_i : J ⊆ I}.
The natural thing to do is to define a new random variable X′ = E[X|G] as follows:
    X′ = Σ_{i∈I} E[X|B_i] 1(B_i).
What does this mean? Let ω ∈ Ω. Then X′(ω) = Σ_{i∈I} E[X|B_i] 1(ω ∈ B_i). Note that we
use the convention that E[X|B_i] = 0 if P(B_i) = 0.
It is very easy to check that
    X′ is G-measurable                                                (1.1)
and integrable, since
    E[|X′|] ≤ Σ_{i∈I} E[|X 1(B_i)|] = E[|X|] < ∞.
Let G ∈ G. Then it is straightforward to check that
    E[X 1(G)] = E[X′ 1(G)].                                           (1.2)
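On a finite sample space the defining identity (1.2) can be verified directly, since E[X|G] is just a block average over the partition. The following Python sketch is purely illustrative (the uniform measure, the partition and the function X are arbitrary choices, not from the notes):

```python
# finite sample space with uniform measure
omega = list(range(6))               # Ω = {0,...,5}
P = {w: 1/6 for w in omega}
X = {w: w**2 for w in omega}         # an integrable random variable
blocks = [[0, 1], [2, 3, 4], [5]]    # partition (B_i) generating G

def cond_exp(w):
    # X'(ω) = E[X | B_i] on the block B_i containing ω
    B = next(b for b in blocks if w in b)
    pB = sum(P[v] for v in B)
    return sum(X[v] * P[v] for v in B) / pB

# defining property (1.2): E[X 1(G)] = E[X' 1(G)] for G a union of blocks
G = blocks[0] + blocks[2]
lhs = sum(X[w] * P[w] for w in G)
rhs = sum(cond_exp(w) * P[w] for w in G)
assert abs(lhs - rhs) < 1e-12
```

The same check fails for a set G that is not a union of blocks, which is why X′ is only required to match X against G-measurable events.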

1.2 Existence and uniqueness

Before stating the existence and uniqueness theorem on conditional expectation, let us
quickly recall the notion of an event happening almost surely (a.s.), the monotone
convergence theorem and L^p spaces.
Let A ∈ F. We will say that A happens a.s. if P(A) = 1.

Theorem 1.7. [Monotone convergence theorem] Let (X_n)_n be random variables such
that X_n ≥ 0 for all n and X_n ↑ X as n → ∞ a.s. Then
    E[X_n] ↑ E[X] as n → ∞.

Theorem 1.8. [Dominated convergence theorem] If X_n → X and |X_n| ≤ Y for all n
a.s., for some integrable random variable Y, then
    E[X_n] → E[X].

Let p ∈ [1, ∞) and f a measurable function on (Ω, F, P). We define the norm
    ‖f‖_p = (E[|f|^p])^{1/p}
and we denote by L^p = L^p(Ω, F, P) the set of measurable functions f with ‖f‖_p < ∞. For
p = ∞, we let
    ‖f‖_∞ = inf{λ : |f| ≤ λ a.e.}
and L^∞ the set of measurable functions with ‖f‖_∞ < ∞.
Formally, L^p is the collection of equivalence classes, where two functions are equivalent if
they are equal almost everywhere (a.e.). In practice, we will represent an element of L^p by
a function, but remember that equality in L^p means equality a.e..

Theorem 1.9. The space (L^2, ‖·‖_2) is a Hilbert space with ⟨f, g⟩ = E[fg]. If H is a closed
subspace, then for all f ∈ L^2 there exists a unique (in the sense of a.e.) g ∈ H such that
‖f - g‖_2 = inf_{h∈H} ‖f - h‖_2 and ⟨f - g, h⟩ = 0 for all h ∈ H.

Remark 1.10. We call g the orthogonal projection of f on H.

Theorem 1.11. Let X be an integrable random variable and let G ⊆ F be a σ-algebra. Then
there exists a random variable Y such that:
(a) Y is G-measurable;
(b) Y is integrable and E[X 1(A)] = E[Y 1(A)] for all A ∈ G.
Moreover, if Y′ also satisfies (a) and (b), then Y = Y′ a.s..
We call Y (a version of) the conditional expectation of X given G and write Y = E[X|G]
a.s.. In the case G = σ(G) for some random variable G, we also write Y = E[X|G] a.s..

Remark 1.12. We could replace (b) in the statement of the theorem by requiring that for
all bounded G-measurable random variables Z we have
    E[XZ] = E[Y Z].

Remark 1.13. In Section 1.4 we will show how to construct explicit versions of the
conditional expectation in certain simple cases. In general, we have to live with the indirect
approach provided by the theorem.
Proof of Theorem 1.11. (Uniqueness.) Suppose that both Y and Y′ satisfy (a) and (b).
Then, clearly, the event A = {Y > Y′} ∈ G, and by (b) we have
    E[(Y - Y′)1(A)] = E[X 1(A)] - E[X 1(A)] = 0,
hence we get that Y ≤ Y′ a.s. Similarly we get Y ≥ Y′ a.s.
(Existence.) We will prove existence in three steps.
1st step: Suppose that X ∈ L^2. The space L^2(Ω, F, P) with inner product defined by
⟨U, V⟩ = E[UV] is a Hilbert space by Theorem 1.9, and L^2(Ω, G, P) is a closed subspace.
(Remember that L^2 convergence implies convergence in probability, and convergence in
probability implies convergence a.s. along a subsequence (see for instance [2, A13.2]).)
Thus L^2(F) = L^2(G) + L^2(G)^⊥, and hence we can write X as X = Y + Z, where Y ∈ L^2(G)
and Z ∈ L^2(G)^⊥. If we now set Y = E[X|G], then (a) is clearly satisfied. Let A ∈ G. Then
    E[X 1(A)] = E[Y 1(A)] + E[Z 1(A)] = E[Y 1(A)],
since E[Z 1(A)] = 0.
Note that from the above definition of conditional expectation for random variables in L^2,
we get that
    if X ≥ 0, then Y = E[X|G] ≥ 0 a.s.,                               (1.3)
since {Y < 0} ∈ G and
    E[X 1(Y < 0)] = E[Y 1(Y < 0)].
Notice that the left hand side is non-negative, while the right hand side is non-positive,
implying that P(Y < 0) = 0.
2nd step: Suppose that X ≥ 0. For each n we define the random variables X_n = X ∧ n,
which are bounded, hence X_n ∈ L^2. Thus, from the first part of the existence proof, for
each n there exists a G-measurable random variable Y_n satisfying, for all A ∈ G,
    E[Y_n 1(A)] = E[(X ∧ n)1(A)].                                     (1.4)
Since the sequence (X_n)_n is increasing, from (1.3) we get that almost surely (Y_n)_n is
increasing. If we now set Y = lim sup_n Y_n, then clearly Y is G-measurable and almost surely
Y = lim_n Y_n. By the monotone convergence theorem in (1.4) we get, for all A ∈ G,
    E[Y 1(A)] = E[X 1(A)],                                            (1.5)
since X_n ↑ X as n → ∞.
In particular, if E[X] is finite, then E[Y] is also finite.
3rd step: Finally, for a general random variable X ∈ L^1 (not necessarily positive) we
can apply the above construction to X^+ = max(X, 0) and X^- = max(-X, 0), and then
E[X|G] = E[X^+|G] - E[X^-|G] satisfies (a) and (b).

Remark 1.14. Note that the 2nd step of the above proof gives that if X ≥ 0, then there
exists a G-measurable random variable Y such that
    for all A ∈ G, E[X 1(A)] = E[Y 1(A)],
i.e., all the conditions of Theorem 1.11 are satisfied except for the integrability one.
Definition 1.15. Sub-σ-algebras G_1, G_2, . . . of F are called independent if, whenever G_i ∈ G_i
(i ∈ N) and i_1, . . . , i_n are distinct, then
    P(G_{i_1} ∩ . . . ∩ G_{i_n}) = Π_{k=1}^n P(G_{i_k}).
When we say that a random variable X is independent of a σ-algebra G, it means that σ(X)
is independent of G.
The following properties are immediate consequences of Theorem 1.11 and its proof.

Proposition 1.16. Let X, Y ∈ L^1(Ω, F, P) and let G ⊆ F be a σ-algebra. Then
1. E[E[X|G]] = E[X].
2. If X is G-measurable, then E[X|G] = X a.s..
3. If X is independent of G, then E[X|G] = E[X] a.s..
4. If X ≥ 0 a.s., then E[X|G] ≥ 0 a.s..
5. For any α, β ∈ R we have E[αX + βY | G] = αE[X|G] + βE[Y|G] a.s..
6. |E[X|G]| ≤ E[|X| | G] a.s..
The basic convergence theorems for expectation have counterparts for conditional
expectation. We first recall the theorems for expectation.

Theorem 1.17. [Fatou's lemma] If X_n ≥ 0 for all n, then
    E[lim inf_n X_n] ≤ lim inf_n E[X_n].

Theorem 1.18. [Jensen's inequality] Let X be an integrable random variable and let
φ : R → R be a convex function. Then
    E[φ(X)] ≥ φ(E[X]).
Proposition 1.19. Let G ⊆ F be a σ-algebra.
1. Conditional monotone convergence theorem: If (X_n)_{n≥0} is an increasing
sequence of non-negative random variables with a.s. limit X, then
    E[X_n|G] ↑ E[X|G] as n → ∞, a.s..
2. Conditional Fatou's lemma: If X_n ≥ 0 for all n, then
    E[lim inf_n X_n | G] ≤ lim inf_n E[X_n|G] a.s..
3. Conditional dominated convergence theorem: If X_n → X and |X_n| ≤ Y for all
n a.s., for some integrable random variable Y, then
    lim_n E[X_n|G] = E[X|G] a.s..
4. Conditional Jensen's inequality: If X is an integrable random variable and φ :
R → (-∞, ∞] is a convex function such that either φ(X) is integrable or φ is non-negative,
then
    E[φ(X)|G] ≥ φ(E[X|G]) a.s..
In particular, for all 1 ≤ p < ∞,
    ‖E[X|G]‖_p ≤ ‖X‖_p.
Proof.
1. Let Y_n be a version of E[X_n|G]. Since 0 ≤ X_n ↑ X as n → ∞, we have
that almost surely (Y_n) is an increasing sequence and Y_n ≥ 0. Let Y = lim sup_n Y_n.
We want to show that Y = E[X|G] a.s.. Clearly Y is G-measurable, as the limsup of
G-measurable random variables. Also, by the monotone convergence theorem we have,
for all A ∈ G,
    E[X 1(A)] = lim_n E[X_n 1(A)] = lim_n E[Y_n 1(A)] = E[Y 1(A)].
2. The sequence inf_{k≥n} X_k is increasing in n and lim_n inf_{k≥n} X_k = lim inf_n X_n.
Thus, by the conditional monotone convergence theorem we get
    lim_n E[inf_{k≥n} X_k | G] = E[lim inf_n X_n | G].
Clearly, E[inf_{k≥n} X_k | G] ≤ inf_{k≥n} E[X_k|G]. Passing to the limit gives the desired
inequality.
3. Since X_n + Y and Y - X_n are positive random variables for all n, applying the
conditional Fatou lemma we get
    E[X + Y|G] = E[lim inf_n (X_n + Y)|G] ≤ lim inf_n E[X_n + Y|G] and
    E[Y - X|G] = E[lim inf_n (Y - X_n)|G] ≤ lim inf_n E[Y - X_n|G].
Hence we obtain that
    lim inf_n E[X_n|G] ≥ E[X|G] and lim sup_n E[X_n|G] ≤ E[X|G].
4. A convex function is the supremum of countably many affine functions (see for instance
[2, 6.6]):
    φ(x) = sup_i (a_i x + b_i), x ∈ R.
So for all i we have E[φ(X)|G] ≥ a_i E[X|G] + b_i a.s. Now, using the fact that the
supremum is over a countable set, we get that
    E[φ(X)|G] ≥ sup_i (a_i E[X|G] + b_i) = φ(E[X|G]) a.s.
In particular, for 1 ≤ p < ∞,
    ‖E[X|G]‖_p^p = E[|E[X|G]|^p] ≤ E[E[|X|^p | G]] = E[|X|^p] = ‖X‖_p^p.

Conditional expectation has the tower property:

Proposition 1.20. Let H ⊆ G be σ-algebras and X ∈ L^1(Ω, F, P). Then
    E[E[X|G]|H] = E[X|H] a.s..
Proof. Clearly, E[X|H] is H-measurable, and for all A ∈ H we have
    E[E[X|H]1(A)] = E[X 1(A)] = E[E[X|G]1(A)],
since A is also G-measurable.

We can always take out what is known:

Proposition 1.21. Let X ∈ L^1 and G a σ-algebra. If Y is bounded and G-measurable, then
    E[XY|G] = Y E[X|G] a.s..
Proof. Let A ∈ G. Then by Remark 1.12 we have
    E[Y E[X|G]1(A)] = E[E[X|G](1(A)Y)] = E[Y X 1(A)],
which implies that E[Y X|G] = Y E[X|G] a.s..
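The tower property can likewise be checked on a finite space, where conditioning on a partition is a block average and "H ⊆ G" means the H-blocks are unions of G-blocks. A hypothetical Python example (the partitions and X are arbitrary choices, not from the notes):

```python
# tower property E[E[X|G]|H] = E[X|H] on a finite space, H coarser than G
P = {w: 1/8 for w in range(8)}                 # uniform measure on Ω = {0,...,7}
X = {w: (w - 3) ** 2 for w in range(8)}
G_blocks = [[0, 1], [2, 3], [4, 5], [6, 7]]    # fine partition generating G
H_blocks = [[0, 1, 2, 3], [4, 5, 6, 7]]        # coarser partition, so H ⊆ G

def cond(Z, blocks, w):
    # conditional expectation of Z given the partition, evaluated at ω = w
    B = next(b for b in blocks if w in b)
    return sum(Z[v] * P[v] for v in B) / sum(P[v] for v in B)

EXG = {w: cond(X, G_blocks, w) for w in range(8)}      # E[X|G]
lhs = {w: cond(EXG, H_blocks, w) for w in range(8)}    # E[E[X|G]|H]
rhs = {w: cond(X, H_blocks, w) for w in range(8)}      # E[X|H]
assert all(abs(lhs[w] - rhs[w]) < 1e-12 for w in range(8))
```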
Before stating the next proposition, we quickly recall the definition of a π-system and the
uniqueness of extension theorem for probability measures agreeing on a π-system generating
a σ-algebra.

Definition 1.22. Let A be a set of subsets of Ω. We call A a π-system if for all A, B ∈ A
the intersection A ∩ B ∈ A.

Theorem 1.23. [Uniqueness of extension] Let μ_1, μ_2 be two measures on (E, E), where
E is a σ-algebra on E. Suppose that μ_1 = μ_2 on a π-system A generating E and that
μ_1(E) = μ_2(E) < ∞. Then μ_1 = μ_2 on E.
Proposition 1.24. Let X be integrable and let G, H ⊆ F be σ-algebras. If σ(X, G) is
independent of H, then
    E[X|σ(G, H)] = E[X|G] a.s..
Proof. We can assume that X ≥ 0. The general case then follows as in the proposition
above.
Let A ∈ G and B ∈ H. Then
    E[1(A ∩ B)E[X|σ(H, G)]] = E[1(A ∩ B)X] = E[X 1(A)]P(B)
                            = E[E[X|G]1(A)]P(B) = E[1(A ∩ B)E[X|G]],
where we used the independence assumption in the second and last equalities. Let Y =
E[X|σ(H, G)] a.s.; then Y ≥ 0 a.s.. We can now define the measures
    μ(F) = E[E[X|G]1(F)] and ν(F) = E[Y 1(F)], for all F ∈ F.
Then μ and ν agree on the π-system {A ∩ B : A ∈ G, B ∈ H}, which generates
σ(G, H). Also, by the integrability assumption, μ(Ω) = ν(Ω) < ∞. Hence they agree
everywhere on σ(G, H), and this finishes the proof.

Warning! If in the above proposition the independence assumption is weakened and we just
assume that σ(X) is independent of H and G is independent of H, then the conclusion does
not follow. See examples sheet!

1.3 Product measure and Fubini's theorem

A measure space (E, E, μ) is called σ-finite if there exists a collection of sets (S_n)_{n≥0} in E
such that ∪_n S_n = E and μ(S_n) < ∞ for all n.
Let (E_1, E_1, μ_1) and (E_2, E_2, μ_2) be two σ-finite measure spaces. The set
    A = {A_1 × A_2 : A_1 ∈ E_1, A_2 ∈ E_2}
is a π-system of subsets of E = E_1 × E_2. Define the product σ-algebra
    E_1 ⊗ E_2 = σ(A).
Set E = E_1 ⊗ E_2.

Theorem 1.25. [Product measure] Let (E_1, E_1, μ_1) and (E_2, E_2, μ_2) be two σ-finite
measure spaces. There exists a unique measure μ = μ_1 ⊗ μ_2 on E such that
    μ(A_1 × A_2) = μ_1(A_1) μ_2(A_2)
for all A_1 ∈ E_1 and A_2 ∈ E_2.

Theorem 1.26. [Fubini's theorem] Let (E_1, E_1, μ_1) and (E_2, E_2, μ_2) be two σ-finite
measure spaces. Let f be E-measurable and non-negative. Then
    μ(f) = ∫_{E_1} ( ∫_{E_2} f(x_1, x_2) μ_2(dx_2) ) μ_1(dx_1).          (1.6)
If f is integrable, then
1. x_2 ↦ f(x_1, x_2) is μ_2-integrable for μ_1-almost all x_1,
2. x_1 ↦ ∫_{E_2} f(x_1, x_2) μ_2(dx_2) is μ_1-integrable, and formula (1.6) for μ(f) holds.
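For finite measures, Fubini's theorem reduces to swapping the order of two finite sums. The following illustrative Python snippet (the weights and the function f are made-up examples) confirms the exchange exactly, using exact rational arithmetic:

```python
from fractions import Fraction

# two finite (hence sigma-finite) measures on small spaces E1, E2
mu1 = {0: Fraction(1, 2), 1: Fraction(3, 2)}                # not a probability measure
mu2 = {0: Fraction(2), 1: Fraction(1, 3), 2: Fraction(1)}

def f(x1, x2):
    # a non-negative measurable function on E1 x E2
    return x1 + 2 * x2

# iterate over E2 first, then E1 ...
first = sum(m1 * sum(f(x1, x2) * m2 for x2, m2 in mu2.items())
            for x1, m1 in mu1.items())
# ... or over E1 first, then E2: the two iterated integrals agree
second = sum(m2 * sum(f(x1, x2) * m1 for x1, m1 in mu1.items())
             for x2, m2 in mu2.items())
assert first == second
```

The general theorem is exactly this exchange, pushed through monotone convergence to arbitrary non-negative measurable f on σ-finite spaces.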

1.4 Examples of conditional expectation

Definition 1.27. A random vector (X_1, . . . , X_n) ∈ R^n is called a Gaussian random vector
iff for all a_1, . . . , a_n ∈ R the random variable Σ_{i=1}^n a_i X_i has a Gaussian distribution.
A real-valued process (X_t, t ≥ 0) is called a Gaussian process iff for every t_1 < t_2 < . . . < t_n
the random vector (X_{t_1}, . . . , X_{t_n}) is a Gaussian random vector.
1.4.1 Gaussian case

Let (X, Y) be a Gaussian random vector in R^2. Set G = σ(Y). In this example we are going
to compute X′ = E[X|G].
Since X′ must be G-measurable and G = σ(Y), by [2, A3.2] we have that X′ = f(Y) for
some Borel function f. Let us try X′ of the form X′ = aY + b, for a, b ∈ R to be
determined.
Since E[E[X|G]] = E[X], we must have
    aE[Y] + b = E[X].                                                 (1.7)
Also, we must have
    E[(X - X′)Y] = 0 ⟺ Cov(X - X′, Y) = 0 ⟺ Cov(X, Y) = a var(Y).     (1.8)
So, if a satisfies (1.8), then
    Cov(X - X′, Y) = 0,
and since (X - X′, Y) is Gaussian, we get that X - X′ and Y are independent. Hence, if Z
is σ(Y)-measurable, then using also (1.7) we get that
    E[(X - X′)Z] = 0.
Therefore we have proved that E[X|G] = aY + b, for a, b satisfying (1.7) and (1.8).
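A Monte Carlo sketch of this computation in Python (the specific model X = 2Y + W with Y ~ N(1, 1) and W ~ N(0, 1/4) independent is an assumed example, under which a = Cov(X, Y)/var(Y) = 2 and b = E[X] - aE[Y] = 0):

```python
import random

random.seed(0)
n = 100_000
Y = [random.gauss(1.0, 1.0) for _ in range(n)]
X = [2 * y + random.gauss(0.0, 0.5) for y in Y]   # X = 2Y + W, so E[X|Y] = 2Y

# estimate a = Cov(X,Y)/var(Y) and b = E[X] - a E[Y] from the samples
mx, my = sum(X) / n, sum(Y) / n
cov = sum((x - mx) * (y - my) for x, y in zip(X, Y)) / n
var = sum((y - my) ** 2 for y in Y) / n
a_hat = cov / var
b_hat = mx - a_hat * my
assert abs(a_hat - 2) < 0.05 and abs(b_hat) < 0.1
```

With a fixed seed the run is reproducible; the tolerances are loose enough to absorb Monte Carlo error at this sample size.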
1.4.2 Conditional density functions

Suppose that X and Y are random variables having a joint density function f_{X,Y}(x, y) on
R^2. Let h : R → R be a Borel function such that h(X) is integrable.
In this example we want to compute E[h(X)|Y] = E[h(X)|σ(Y)].
The random variable Y has a density function f_Y, given by
    f_Y(y) = ∫_R f_{X,Y}(x, y) dx.
Let g be bounded and measurable. Then we have
    E[h(X)g(Y)] = ∫_R ∫_R h(x) g(y) f_{X,Y}(x, y) dx dy
                = ∫_R ( ∫_R h(x) (f_{X,Y}(x, y) / f_Y(y)) dx ) g(y) f_Y(y) dy,
where we agree that 0/0 = 0. If we now set
    ψ(y) = ∫_R h(x) (f_{X,Y}(x, y) / f_Y(y)) dx, if f_Y(y) > 0,
and ψ(y) = 0 otherwise, then we get that
    E[h(X)|Y] = ψ(Y) a.s.
We interpret this result by saying that
    E[h(X)|Y] = ∫_R h(x) ν(Y, dx),
where ν(y, dx) = f_Y(y)^{-1} f_{X,Y}(x, y) 1(f_Y(y) > 0) dx = f_{X|Y}(x|y) dx. The measure
ν(y, dx) is called the conditional distribution of X given Y = y, and f_{X|Y}(x|y) is the
conditional density function of X given Y = y. Notice that this function of x, y is defined
only up to a set of measure zero.
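The identity E[h(X)g(Y)] = ∫ ψ(y) g(y) f_Y(y) dy can be checked by discretizing both integrals on the same grid, where it holds exactly by construction. In the Python sketch below the joint density f(x, y) = x + y on [0, 1]^2 and the functions h, g are illustrative choices, not from the notes:

```python
# midpoint grid on [0,1]
N = 200
dx = 1.0 / N
xs = [(i + 0.5) * dx for i in range(N)]

f = lambda x, y: x + y      # joint density on [0,1]^2 (integrates to 1)
h = lambda x: x * x
g = lambda y: 1 + y

# marginal density f_Y and conditional expectation psi(y) = E[h(X)|Y=y]
fY = {y: sum(f(x, y) for x in xs) * dx for y in xs}
psi = {y: sum(h(x) * f(x, y) for x in xs) * dx / fY[y] for y in xs}

lhs = sum(h(x) * g(y) * f(x, y) for x in xs for y in xs) * dx * dx  # E[h(X)g(Y)]
rhs = sum(psi[y] * g(y) * fY[y] for y in xs) * dx                   # ∫ psi g f_Y dy
assert abs(lhs - rhs) < 1e-9
```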

2 Discrete-time martingales

Let (Ω, F, P) be a probability space and (E, E) be a measurable space. (We will mostly
consider E = R, R^d, C. Unless otherwise indicated, it is to be understood from now on that
E = R.)
Let X = (X_n)_{n≥0} be a sequence of random variables taking values in E. We call X a
stochastic process in E.
A filtration (F_n)_n is an increasing family of sub-σ-algebras of F, i.e., F_n ⊆ F_{n+1} for all n.
We can think of F_n as the information available to us at time n. Every process has a natural
filtration (F_n^X)_n, given by
    F_n^X = σ(X_k, k ≤ n).
The process X is called adapted to the filtration (F_n)_n if X_n is F_n-measurable for all n. Of
course, every process is adapted to its natural filtration. We say that X is integrable if X_n
is integrable for all n.

Definition 2.1. Let (Ω, F, (F_n)_{n≥0}, P) be a filtered probability space. Let X = (X_n)_{n≥0} be
an adapted integrable process taking values in R.
X is a martingale if E[X_n|F_m] = X_m a.s., for all n ≥ m.
X is a supermartingale if E[X_n|F_m] ≤ X_m a.s., for all n ≥ m.
X is a submartingale if E[X_n|F_m] ≥ X_m a.s., for all n ≥ m.

Note that every process which is a martingale (resp. super-, sub-) with respect to the given
filtration is also a martingale (resp. super-, sub-) with respect to its natural filtration, by
the tower property of conditional expectation.

Example 2.2. Let (ξ_i)_{i≥1} be a sequence of i.i.d. random variables with E[ξ_1] = 0. Then it
is easy to check that X_n = Σ_{i=1}^n ξ_i is a martingale.

Example 2.3. Let (ξ_i)_{i≥1} be a sequence of i.i.d. random variables with E[ξ_1] = 1. Then
the product X_n = Π_{i=1}^n ξ_i is a martingale.

2.1 Stopping times

Definition 2.4. Let (Ω, F, (F_n)_n, P) be a filtered probability space. A stopping time T is
a random variable T : Ω → Z_+ ∪ {∞} such that {T ≤ n} ∈ F_n for all n.

Equivalently, T is a stopping time if {T = n} ∈ F_n for all n. Indeed,
    {T = n} = {T ≤ n} \ {T ≤ n - 1} ∈ F_n.
Conversely,
    {T ≤ n} = ∪_{k≤n} {T = k} ∈ F_n.

Example 2.5.
Constant times are trivial stopping times.
Let (X_n)_{n≥0} be an adapted process taking values in R. Let A ∈ B(R). The first
entrance time to A is
    T_A = inf{n ≥ 0 : X_n ∈ A},
with the convention that inf(∅) = ∞, so that T_A = ∞ if X never enters A. This is a
stopping time, since
    {T_A ≤ n} = ∪_{k≤n} {X_k ∈ A} ∈ F_n.
The last exit time, though, sup{n ≥ 0 : X_n ∈ A}, is not always a stopping time.
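A first entrance time is computable from the path one step at a time, which is the intuition behind the stopping-time property; a last exit time is not, since it depends on the whole future. A small illustrative Python helper (the name `first_entrance` is ours, not the notes'):

```python
def first_entrance(path, A):
    """T_A = inf{n >= 0 : X_n in A}; None stands for T_A = infinity."""
    for n, x in enumerate(path):
        if x in A:
            return n
    return None

path = [0, 1, 0, -1, -2, -1]
assert first_entrance(path, {-2}) == 4
assert first_entrance(path, {5}) is None
# {T_A <= n} is decided by X_0,...,X_n alone, as the stopping-time property requires:
assert first_entrance(path[:3], {1}) == first_entrance(path, {1}) == 1
```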
As an immediate consequence of the definition, one gets:

Proposition 2.6. Let S, T, (T_n)_n be stopping times on the filtered probability space
(Ω, F, (F_n), P). Then S ∧ T, S ∨ T, inf_n T_n, sup_n T_n, lim inf_n T_n, lim sup_n T_n are also
stopping times.
Proof. Note that in discrete time everything follows straight from the definitions. But when
one considers continuous-time processes, right-continuity of the filtration is needed to
ensure that the limits are indeed stopping times.

Definition 2.7. Let T be a stopping time on the filtered probability space (Ω, F, (F_n), P).
Define the σ-algebra F_T via
    F_T = {A ∈ F : A ∩ {T ≤ t} ∈ F_t, for all t}.
Intuitively, F_T is the information available at time T.
It is easy to check that if T = t, then T is a stopping time and F_T = F_t.
For a process X, we set X_T(ω) = X_{T(ω)}(ω), whenever T(ω) < ∞. We also define the
stopped process X^T by X_t^T = X_{T∧t}.

Proposition 2.8. Let S and T be stopping times and let X = (X_n)_{n≥0} be an adapted
process. Then
1. if S ≤ T, then F_S ⊆ F_T,
2. X_T 1(T < ∞) is an F_T-measurable random variable,
3. X^T is adapted,
4. if X is integrable, then X^T is integrable.
Proof.
1. Straightforward from the definition.
2. Let A ∈ E. Then
    {X_T 1(T < ∞) ∈ A} ∩ {T ≤ t} = ∪_{s=1}^{t} ({X_s ∈ A} ∩ {T = s}) ∈ F_t,
since X is adapted and {T = s} = {T ≤ s} \ ∪_{u<s} {T ≤ u} ∈ F_s.
3. For every t we have that X_{T∧t} is F_{T∧t}-measurable, hence by (1) F_t-measurable,
since T ∧ t ≤ t.
4. We have
    E[|X_{T∧t}|] = E[Σ_{s=0}^{t-1} |X_s| 1(T = s)] + E[|X_t| 1(T ≥ t)] ≤ Σ_{s=0}^{t} E[|X_s|] < ∞.

2.2 Optional stopping

Theorem 2.9. [Optional stopping] Let X = (X_n)_{n≥0} be a martingale.
1. If T is a stopping time, then X^T is also a martingale, so in particular E[X_{T∧t}] = E[X_0]
for all t.
2. If S ≤ T are bounded stopping times, then E[X_T|F_S] = X_S a.s..
3. If S ≤ T are bounded stopping times, then E[X_T] = E[X_S].
4. If there exists an integrable random variable Y such that |X_n| ≤ Y for all n, and T is
a stopping time which is finite a.s., then
    E[X_T] = E[X_0].
5. If X has bounded increments, i.e., ∃ M > 0 such that |X_{n+1} - X_n| ≤ M a.s. for all
n ≥ 0, and T is a stopping time with E[T] < ∞, then
    E[X_T] = E[X_0].
Proof.
1. Notice that by the tower property of conditional expectation, it suffices to
check that E[X_{T∧t}|F_{t-1}] = X_{T∧(t-1)} a.s.. We can write
    E[X_{T∧t}|F_{t-1}] = E[Σ_{s=0}^{t-1} X_s 1(T = s) | F_{t-1}] + E[X_t 1(T > t - 1)|F_{t-1}]
                      = X_T 1(T ≤ t - 1) + 1(T > t - 1) X_{t-1},
since {T > t - 1} ∈ F_{t-1} and E[X_t|F_{t-1}] = X_{t-1} a.s. by the martingale property.
2. Suppose that T ≤ n a.s.. Since S ≤ T, we can write
    X_T = (X_T - X_{T-1}) + (X_{T-1} - X_{T-2}) + . . . + (X_{S+1} - X_S) + X_S
        = X_S + Σ_{k=0}^{n} (X_{k+1} - X_k) 1(S ≤ k < T).
Let A ∈ F_S. Then
    E[X_T 1(A)] = E[X_S 1(A)] + Σ_{k=0}^{n} E[(X_{k+1} - X_k) 1(S ≤ k < T) 1(A)] = E[X_S 1(A)],
since {S ≤ k < T} ∩ A ∈ F_k for all k, and X is a martingale.
3. Taking expectations in 2 gives the equality in expectation.
4. See example sheet.
5. See example sheet.

Remark 2.10. Note that Theorem 2.9 remains true if X is a supermartingale or a
submartingale, with the respective inequalities in the statements.

Remark 2.11. Let (ξ_k)_k be i.i.d. random variables taking values ±1 with probability 1/2.
Then X_n = Σ_{k=1}^n ξ_k is a martingale. Let T = inf{n ≥ 0 : X_n = 1}. Then T is a
stopping time and P(T < ∞) = 1. However, although from Theorem 2.9 we have that
E[X_{T∧t}] = E[X_0] for all t, it holds that 1 = E[X_T] ≠ E[X_0] = 0.

For non-negative supermartingales, Fatou's lemma gives:

Proposition 2.12. Suppose that X is a non-negative supermartingale. Then for any
stopping time T which is finite a.s. we have
    E[X_T] ≤ E[X_0].
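Remark 2.11 can be made concrete with exact arithmetic: iterating the one-step transition of the stopped walk X^T shows E[X_{T∧t}] = 0 for every t, even though P(T ≤ t) is already large, and X_T = 1 on {T ≤ t}. An illustrative Python computation (the horizon t = 20 is an arbitrary choice):

```python
from fractions import Fraction

# exact distribution of the stopped walk X_{T∧t}, where T = inf{n : X_n = 1}
dist = {0: Fraction(1)}          # X_0 = 0
t = 20
for _ in range(t):
    new = {}
    for x, p in dist.items():
        if x == 1:               # the walk is already stopped at 1
            new[1] = new.get(1, Fraction(0)) + p
        else:                    # otherwise take a fair ±1 step
            for step in (-1, 1):
                new[x + step] = new.get(x + step, Fraction(0)) + p / 2
    dist = new

assert sum(x * p for x, p in dist.items()) == 0   # E[X_{T∧t}] = E[X_0] = 0 exactly
assert dist[1] > Fraction(1, 2)                   # yet P(T <= t) is already large
```

The mass at 1 grows toward 1 as t increases, while the mean stays pinned at 0 by an ever-heavier negative tail: exactly the tension resolved by the hypotheses in parts 4 and 5 of Theorem 2.9.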

2.3 Gambler's ruin

Let (ξ_i)_{i≥1} be an i.i.d. sequence of random variables taking values ±1 with probabilities
P(ξ_1 = +1) = P(ξ_1 = -1) = 1/2. Define X_n = Σ_{i=1}^n ξ_i, for n ≥ 1, and X_0 = 0. This is
called the simple symmetric random walk on Z. For c ∈ Z we write
    T_c = inf{n ≥ 0 : X_n = c},
i.e. T_c is the first hitting time of the state c, and hence is a stopping time. Let a, b > 0. We
will calculate the probability that the random walk hits -a before b, i.e. P(T_{-a} < T_b).
As mentioned earlier in this section, X is a martingale. Also, |X_{n+1} - X_n| ≤ 1 for all n. We
now write T = T_{-a} ∧ T_b. We will first show that E[T] < ∞.
It is easy to see that T is bounded from above by the first time that there are a + b
consecutive +1's. The probability that ξ_1, . . . , ξ_{a+b} are all equal to +1 is 2^{-(a+b)}. If the
first block of a + b variables fails to be all +1's, then we look at the next block of a + b, i.e.
ξ_{a+b+1}, . . . , ξ_{2(a+b)}. The probability that this block consists only of +1's is again 2^{-(a+b)},
and this event is independent of the previous one. Hence T can be bounded from above by
a + b times a geometric random variable with success probability 2^{-(a+b)}. Therefore we get
    E[T] ≤ (a + b) 2^{a+b}.
We thus have a martingale with bounded increments and a stopping time with finite
expectation. Hence, from the optional stopping theorem (Theorem 2.9, part 5), we deduce that
    E[X_T] = E[X_0] = 0.
We also have
    E[X_T] = -a P(T_{-a} < T_b) + b P(T_b < T_{-a}) and P(T_{-a} < T_b) + P(T_b < T_{-a}) = 1,
and hence we deduce that
    P(T_{-a} < T_b) = b / (a + b).
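The answer b/(a+b) can be cross-checked by solving the discrete harmonic equation h(x) = (h(x-1) + h(x+1))/2 for the hitting probability h(x) = P_x(T_{-a} < T_b), with boundary values h(-a) = 1, h(b) = 0. A Python sketch (iterating the equation to convergence is our choice of method, not the notes'):

```python
# hitting probability P(T_{-a} < T_b) for the walk started at 0
a, b = 2, 3
states = list(range(-a + 1, b))           # interior states -a < x < b
h = {x: 0.0 for x in states}
h[-a], h[b] = 1.0, 0.0                    # boundary conditions
for _ in range(500):                      # sweep until the fixed point is reached
    for x in states:
        h[x] = 0.5 * (h[x - 1] + h[x + 1])
assert abs(h[0] - b / (a + b)) < 1e-9     # matches b/(a+b) = 0.6
```

The exact solution of the recursion is linear, h(x) = (b - x)/(a + b), which at x = 0 is the martingale answer above.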

2.4 Martingale convergence theorem

Theorem 2.13. [A.s. martingale convergence theorem] Let X = (X_n)_n be a
supermartingale which is bounded in L^1, i.e., sup_n E[|X_n|] < ∞. Then X_n → X_∞ a.s. as
n → ∞, for some X_∞ ∈ L^1(F_∞), where F_∞ = σ(F_n, n ≥ 0).

Usually, when we want to prove convergence of a sequence, we have an idea of what the limit
should be. In the case of the martingale convergence theorem, though, we do not know the
limit; indeed, in most cases we only know that the limit exists. In order to show the
convergence in the theorem, we will employ a beautiful trick due to Doob, which counts
the number of upcrossings of every interval with rational endpoints.

Corollary 2.14. Let X = (X_n)_n be a non-negative supermartingale. Then X converges a.s.
towards an a.s. finite limit.
Proof. Since X is non-negative, we get that
    E[|X_n|] = E[X_n] ≤ E[X_0] < ∞,
hence X is bounded in L^1.

Let x = (x_n)_n be a sequence of real numbers and let a < b be two real numbers. We define
T_0(x) = 0 and, inductively for k ≥ 0,
    S_{k+1}(x) = inf{n ≥ T_k(x) : x_n ≤ a} and T_{k+1}(x) = inf{n ≥ S_{k+1}(x) : x_n ≥ b},   (2.1)
with the usual convention that inf ∅ = ∞.
We also define N_n([a, b], x) = sup{k ≥ 0 : T_k(x) ≤ n}, i.e., the number of upcrossings of
the interval [a, b] by the sequence x by time n. As n → ∞ we have
    N_n([a, b], x) ↑ N([a, b], x) = sup{k ≥ 0 : T_k(x) < ∞},
i.e., the total number of upcrossings of the interval [a, b].
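The times in (2.1) translate directly into a counting procedure: wait until the sequence drops to a or below (a time S_k), then wait until it rises to b or above (a time T_k), and count each completed pair. An illustrative Python implementation of N([a, b], x) for a finite sequence:

```python
def upcrossings(x, a, b):
    """Number of upcrossings N([a,b], x) of [a,b] by the finite sequence x,
    following the times S_k, T_k of (2.1)."""
    count, below = 0, False
    for v in x:
        if not below and v <= a:
            below = True            # a time S_{k+1}: the sequence is at or below a
        elif below and v >= b:
            below = False           # a time T_{k+1}: an upcrossing is completed
            count += 1
    return count

assert upcrossings([2, 0, 4, 1, 5], a=1, b=3) == 2   # two completed upcrossings
assert upcrossings([5, 4, 3], a=0, b=3) == 0         # never goes below a first
```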

Figure 1. Upcrossings (the times S_1, T_1, S_2 of (2.1) marked on a sample path).

Before stating and proving Doob's upcrossing inequality, we give an easy lemma that will
be used in the proof of Theorem 2.13.

Lemma 2.15. A sequence of real numbers x = (x_n)_n converges in R̄ = R ∪ {±∞} if and
only if N([a, b], x) < ∞ for all rationals a < b.
Proof. Suppose that x converges. If for some a < b we had N([a, b], x) = ∞, that would
imply lim inf_n x_n ≤ a < b ≤ lim sup_n x_n, which is a contradiction.
Next, suppose that x does not converge. Then lim inf_n x_n < lim sup_n x_n, and taking
rationals a < b between these two numbers gives N([a, b], x) = ∞.

Theorem 2.16. [Doob's upcrossing inequality] Let X be a supermartingale and let
a < b be two real numbers. Then for all n ≥ 0
    (b - a) E[N_n([a, b], X)] ≤ E[(X_n - a)^-].
Proof. We will omit the dependence on X from T_k and S_k, and we will write
N = N_n([a, b], X) to simplify notation. By the definition of the times (T_k) and (S_k), it is
clear that for all k
    X_{T_k} - X_{S_k} ≥ b - a.                                        (2.2)
We have
    Σ_{k=1}^{n} (X_{T_k∧n} - X_{S_k∧n})
        = Σ_{k=1}^{N} (X_{T_k} - X_{S_k}) + Σ_{k=N+1}^{n} (X_n - X_{S_k∧n}) 1(N < n)     (2.3)
        ≥ Σ_{k=1}^{N} (X_{T_k} - X_{S_k}) + (X_n - X_{S_{N+1}}) 1(S_{N+1} ≤ n),          (2.4)
since the only term contributing to the second sum on the right hand side of (2.3) is
k = N + 1, by the definition of N. Indeed, if S_{N+2} ≤ n, then T_{N+1} ≤ n, which would
contradict the definition of N.
Using induction on k, it is easy to see that (T_k)_k and (S_k)_k are sequences of stopping
times. Hence, for all n, S_k ∧ n ≤ T_k ∧ n are bounded stopping times, and thus by the
optional stopping theorem (Theorem 2.9, in its supermartingale form, cf. Remark 2.10) we
get E[X_{S_k∧n}] ≥ E[X_{T_k∧n}] for all k.
Therefore, taking expectations in (2.3) and (2.4) and using (2.2), we get
    0 ≥ E[Σ_{k=1}^{n} (X_{T_k∧n} - X_{S_k∧n})] ≥ (b - a)E[N] - E[(X_n - a)^-],
since (X_n - X_{S_{N+1}}) 1(S_{N+1} ≤ n) ≥ -(X_n - a)^-, as X_{S_{N+1}} ≤ a. Rearranging
gives the desired inequality.
Proof of Theorem 2.13. Let a < b ∈ Q. By Doob's upcrossing inequality (Theorem 2.16)
we get that
    E[N_n([a, b], X)] ≤ (b - a)^{-1} E[(X_n - a)^-] ≤ (b - a)^{-1} E[|X_n| + |a|].
By the monotone convergence theorem, since N_n([a, b], X) ↑ N([a, b], X) as n → ∞, we get
    E[N([a, b], X)] ≤ (b - a)^{-1} (sup_n E[|X_n|] + |a|) < ∞,
by the assumption that X is bounded in L^1. Therefore N([a, b], X) < ∞ a.s. for every
a < b ∈ Q. Hence,
    P( ∩_{a<b∈Q} {N([a, b], X) < ∞} ) = 1.
Writing Ω_0 = ∩_{a<b∈Q} {N([a, b], X) < ∞}, we have that P(Ω_0) = 1 and, by Lemma 2.15,
on Ω_0 the sequence X converges to a possibly infinite limit X_∞. So we can define
    X_∞ = lim_n X_n on Ω_0, and X_∞ = 0 on Ω \ Ω_0.
Then X_∞ is F_∞-measurable and, by Fatou's lemma and the assumption that X is bounded
in L^1, we get
    E[|X_∞|] = E[lim inf_n |X_n|] ≤ lim inf_n E[|X_n|] < ∞.
Hence X_∞ ∈ L^1 as required.

2.5 Doob's inequalities

Theorem 2.17. [Doob's maximal inequality] Let X = (X_n)_n be a non-negative submartingale. Writing X_n^* = sup_{0≤k≤n} X_k, we have for all λ ≥ 0

λ P(X_n^* ≥ λ) ≤ E[X_n 1(X_n^* ≥ λ)] ≤ E[X_n].

Proof. Let T = inf{k ≥ 0 : X_k ≥ λ}. Then T ∧ n is a bounded stopping time, hence by the Optional stopping theorem, Theorem 2.9, we have

E[X_n] ≥ E[X_{T∧n}] = E[X_T 1(T ≤ n)] + E[X_n 1(T > n)] ≥ λ P(T ≤ n) + E[X_n 1(T > n)].

It is clear that {T ≤ n} = {X_n^* ≥ λ}. Hence we get

λ P(X_n^* ≥ λ) ≤ E[X_n 1(T ≤ n)] = E[X_n 1(X_n^* ≥ λ)] ≤ E[X_n].
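The inequality is easy to sanity-check by simulation. The sketch below is our illustration (not from the notes): it takes the non-negative submartingale X_k = |S_k| for a simple random walk S and compares λ P(X_n^* ≥ λ) against E[X_n] by Monte Carlo.

```python
import random

random.seed(0)

def simulate(n_steps, n_trials, lam):
    """Estimate lam * P(X_n^* >= lam) and E[X_n] for the non-negative
    submartingale X_k = |S_k|, where S is a simple random walk."""
    hits = 0
    total = 0.0
    for _ in range(n_trials):
        s = 0
        running_max = 0
        for _ in range(n_steps):
            s += random.choice((-1, 1))
            running_max = max(running_max, abs(s))
        if running_max >= lam:
            hits += 1
        total += abs(s)
    return lam * hits / n_trials, total / n_trials

lhs, rhs = simulate(n_steps=100, n_trials=2000, lam=15)
print(lhs, rhs)  # lhs should not exceed rhs, up to Monte Carlo error
```

With these parameters the left-hand side is well below E[|S_100|], consistent with the theorem (the inequality is an exact statement; the simulation only illustrates it up to sampling error).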


Theorem 2.18. [Doob's L^p inequality] Let X be a martingale or a non-negative submartingale. Then for all p > 1, letting X_n^* = sup_{k≤n} |X_k|, we have

‖X_n^*‖_p ≤ (p/(p − 1)) ‖X_n‖_p.

Proof. If X is a martingale, then by Jensen's inequality |X| is a non-negative submartingale. So it suffices to consider the case where X is a non-negative submartingale.

Fix k < ∞. We now have

E[(X_n^* ∧ k)^p] = E[∫_0^{X_n^* ∧ k} p x^{p−1} dx] = ∫_0^k p x^{p−1} P(X_n^* ≥ x) dx
≤ ∫_0^k p x^{p−2} E[X_n 1(X_n^* ≥ x)] dx = (p/(p − 1)) E[X_n (X_n^* ∧ k)^{p−1}]
≤ (p/(p − 1)) ‖X_n‖_p ‖X_n^* ∧ k‖_p^{p−1},

where in the second and third equalities we used Fubini's theorem, for the first inequality we used Theorem 2.17, and for the last inequality we used Hölder's inequality. Rearranging, we get

‖X_n^* ∧ k‖_p ≤ (p/(p − 1)) ‖X_n‖_p.

Letting k → ∞ and using monotone convergence completes the proof.
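For p = 2 the constant is p/(p − 1) = 2, so the inequality reads E[(X_n^*)^2] ≤ 4 E[X_n^2]. A quick Monte Carlo check for a simple random walk (our illustration, not from the notes):

```python
import random

random.seed(3)

# Monte Carlo check of E[(X_n^*)^2] <= 4 * E[X_n^2] for the martingale
# X = simple random walk, with X_n^* = max_{k <= n} |X_k|.
n_steps, trials = 200, 3000
sum_max_sq = 0.0
sum_end_sq = 0.0
for _ in range(trials):
    s = 0
    running_max = 0
    for _ in range(n_steps):
        s += random.choice((-1, 1))
        running_max = max(running_max, abs(s))
    sum_max_sq += running_max ** 2
    sum_end_sq += s ** 2
print(sum_max_sq / trials, 4 * sum_end_sq / trials)
```

Here E[X_n^2] = n exactly, so the right-hand side estimate is close to 800, while the left-hand side stays well below it, in line with the theorem.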

2.6 L^p convergence for p > 1

Theorem 2.19. Let X be a martingale and p > 1. Then the following statements are equivalent:

1. X is bounded in L^p(Ω, F, P): sup_{n≥0} ‖X_n‖_p < ∞;

2. X converges a.s. and in L^p to a random variable X_∞;

3. There exists a random variable Z ∈ L^p(Ω, F, P) such that X_n = E[Z|F_n] a.s.

Proof. 1 ⇒ 2: Suppose that X is bounded in L^p. Then by Jensen's inequality, X is also bounded in L^1. Hence by Theorem 2.13 we have that X converges to a finite limit X_∞ a.s. By Fatou's lemma we have

E[|X_∞|^p] = E[lim inf_n |X_n|^p] ≤ lim inf_n E[|X_n|^p] ≤ sup_{n≥0} ‖X_n‖_p^p < ∞.

By Doob's L^p inequality, Theorem 2.18, we have that

‖X_n^*‖_p ≤ (p/(p − 1)) ‖X_n‖_p,

where recall that X_n^* = sup_{k≤n} |X_k|. If we now let n → ∞, then by monotone convergence we get that

‖X^*‖_p ≤ (p/(p − 1)) sup_{n≥0} ‖X_n‖_p, where X^* = sup_n |X_n|.

Therefore |X_n − X_∞| ≤ 2X^* ∈ L^p, and the dominated convergence theorem gives that X_n converges to X_∞ in L^p.

2 ⇒ 3: We set Z = X_∞. Clearly Z ∈ L^p. We will now show that X_n = E[Z|F_n] a.s. If m ≥ n, then by the martingale property we can write

‖X_n − E[X_∞|F_n]‖_p = ‖E[X_m − X_∞|F_n]‖_p ≤ ‖X_m − X_∞‖_p → 0 as m → ∞.    (2.5)

Hence X_n = E[X_∞|F_n] a.s.

3 ⇒ 1: This is immediate by the conditional Jensen inequality.
Remark 2.20. A martingale of the form E[Z|F_n] (it is a martingale by the tower property) with Z ∈ L^p is called a martingale closed in L^p.

Corollary 2.21. Let Z ∈ L^p and X_n = E[Z|F_n] a martingale closed in L^p. If F_∞ = σ(F_n, n ≥ 0), then we have

X_n → X_∞ = E[Z|F_∞] as n → ∞, a.s. and in L^p.

Proof. By the above theorem we have that X_n → X_∞ as n → ∞ a.s. and in L^p. It only remains to show that X_∞ = E[Z|F_∞] a.s. Clearly X_∞ is F_∞-measurable. Let A ∈ ∪_{n≥0} F_n. Then A ∈ F_N for some N, and for every such N

E[Z 1(A)] = E[E[Z|F_N] 1(A)] = E[X_N 1(A)] → E[X_∞ 1(A)] as N → ∞.

So this shows that for all A ∈ ∪_{n≥0} F_n we have

E[X_∞ 1(A)] = E[Z 1(A)] = E[E[Z|F_∞] 1(A)].

But ∪_{n≥0} F_n is a π-system generating F_∞, and hence we get the equality for all A ∈ F_∞.

2.7 Uniformly integrable martingales

Definition 2.22. A collection (X_i, i ∈ I) of random variables is called uniformly integrable (UI) if

sup_{i∈I} E[|X_i| 1(|X_i| > λ)] → 0 as λ → ∞.

Equivalently, (X_i) is UI if (X_i) is bounded in L^1 and

∀ε > 0 ∃δ > 0 : A ∈ F, P(A) < δ ⇒ sup_{i∈I} E[|X_i| 1(A)] < ε.

Remember that a UI family is bounded in L^1; the converse is not true. If a family is bounded in L^p for some p > 1, then it is UI.

Theorem 2.23. Let X ∈ L^1. Then the class

{E[X|G] : G a sub-σ-algebra of F}

is uniformly integrable.

Proof. Since X ∈ L^1, we have that for every ε > 0 there exists δ > 0 such that whenever P(A) ≤ δ, then

E[|X| 1(A)] ≤ ε.    (2.6)

We now choose λ < ∞ so that E[|X|] ≤ δλ. For any sub-σ-algebra G we have

E[|E[X|G]|] ≤ E[|X|].

Writing Y = E[X|G], we have by Markov's inequality P(|Y| ≥ λ) ≤ E[|X|]/λ ≤ δ. Finally, from (2.6) and the fact that {|Y| ≥ λ} ∈ G, we have

E[|Y| 1(|Y| ≥ λ)] ≤ E[|X| 1(|Y| ≥ λ)] ≤ ε.

Lemma 2.24. Let (X_n)_n, X ∈ L^1 and X_n → X a.s. as n → ∞. Then

X_n → X in L^1 as n → ∞ if and only if (X_n)_{n≥0} is UI.

Proof. See [2, Theorem 13.7].

Definition 2.25. A martingale (X_n)_{n≥0} is called a UI martingale if it is a martingale and the collection of random variables (X_n)_{n≥0} is a UI family.
Theorem 2.26. Let X be a martingale. The following statements are equivalent:

1. X is a uniformly integrable martingale;

2. X_n converges a.s. and in L^1(Ω, F, P) to a limit X_∞;

3. There exists Z ∈ L^1(Ω, F, P) so that X_n = E[Z|F_n] a.s. for all n ≥ 0.

Proof. 1 ⇒ 2: Since X is UI, it follows that it is bounded in L^1, and hence from Theorem 2.13 we get that X_n converges a.s. towards a finite limit X_∞ as n → ∞. Since X is UI, [2, Theorem 13.7] gives the L^1 convergence.

2 ⇒ 3: We set Z = X_∞. Clearly Z ∈ L^1. We will now show that X_n = E[Z|F_n] a.s. For all m ≥ n, by the martingale property we have

‖X_n − E[X_∞|F_n]‖_1 = ‖E[X_m − X_∞|F_n]‖_1 ≤ ‖X_m − X_∞‖_1 → 0 as m → ∞.

3 ⇒ 1: Notice that by the tower property of conditional expectation, (E[Z|F_n])_n is a martingale. The uniform integrability follows from Theorem 2.23.

Remark 2.27. As in Corollary 2.21, if X is a UI martingale with X_n = E[Z|F_n], then E[Z|F_∞] = X_∞, where F_∞ = σ(F_n, n ≥ 0).

Remark 2.28. If X is a UI supermartingale (resp. submartingale), then X_n converges a.s. and in L^1 to a limit X_∞, so that E[X_∞|F_n] ≤ X_n (resp. ≥) for every n.

Example 2.29. Let (X_i)_i be i.i.d. random variables with P(X_1 = 0) = P(X_1 = 2) = 1/2. Then Y_n = X_1 · · · X_n is a martingale bounded in L^1 and it converges to 0 a.s. as n → ∞. But E[Y_n] = 1 for all n, and hence it does not converge in L^1.
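A simulation makes the failure of L^1 convergence in Example 2.29 concrete (our illustration, not from the notes): every simulated path is eventually killed by a zero factor, even though each Y_n has mean exactly 1.

```python
import random

random.seed(1)

# Y_n = X_1 * ... * X_n with P(X_i = 0) = P(X_i = 2) = 1/2.  Exactly:
# Y_n = 2^n with probability 2^{-n} and 0 otherwise, so E[Y_n] = 1 for all n,
# while Y_n -> 0 a.s. -- so the convergence cannot be in L^1.
n, trials = 20, 5000
nonzero = 0
for _ in range(trials):
    y = 1
    for _ in range(n):
        y *= random.choice((0, 2))
    if y != 0:
        nonzero += 1
print(nonzero / trials)  # fraction of surviving paths, close to 2**-20
```

The rare surviving paths carry the huge value 2^n, which is exactly what prevents uniform integrability.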
If X is a UI martingale and T is a stopping time, which could also take the value ∞, then we can unambiguously define

X_T = Σ_{n=0}^∞ X_n 1(T = n) + X_∞ 1(T = ∞).

Theorem 2.30. [Optional stopping for UI martingales] Let X be a UI martingale and let S and T be stopping times with S ≤ T. Then

E[X_T | F_S] = X_S a.s.

Proof. We will first show that E[X_∞|F_T] = X_T a.s. for any stopping time T. We first check that X_T ∈ L^1. Since |X_n| ≤ E[|X_∞| | F_n], we have

E[|X_T|] = Σ_{n=0}^∞ E[|X_n| 1(T = n)] + E[|X_∞| 1(T = ∞)] ≤ Σ_{n∈Z_+∪{∞}} E[|X_∞| 1(T = n)] = E[|X_∞|].

Let B ∈ F_T. Then

E[1(B) X_T] = Σ_{n∈Z_+∪{∞}} E[1(B) 1(T = n) X_n] = Σ_{n∈Z_+∪{∞}} E[1(B) 1(T = n) X_∞] = E[1(B) X_∞],

where for the second equality we used that E[X_∞|F_n] = X_n a.s. and that B ∩ {T = n} ∈ F_n. Also, clearly X_T is F_T-measurable, and hence

E[X_∞|F_T] = X_T a.s.

Now using the tower property of conditional expectation, we get for stopping times S ≤ T, since F_S ⊆ F_T,

E[X_T|F_S] = E[E[X_∞|F_T]|F_S] = E[X_∞|F_S] = X_S a.s.

2.8 Backwards martingales

Let · · · ⊆ G_2 ⊆ G_1 ⊆ G_0 be a decreasing sequence of sub-σ-algebras. Given such a filtration, a process (X_n, n ≥ 0) is called a backwards martingale if it is adapted to the filtration (i.e. X_n is G_n-measurable for every n), X_0 ∈ L^1, and for all n ≥ 0 we have

E[X_n | G_{n+1}] = X_{n+1} a.s.

By the tower property of conditional expectation we get that for all n ≥ 0

E[X_0 | G_n] = X_n a.s.    (2.7)

Since X_0 ∈ L^1, from (2.7) and Theorem 2.23 we get that X is uniformly integrable. This is a nice property that backwards martingales have: they are automatically UI.

Theorem 2.31. Let X be a backwards martingale with X_0 ∈ L^p for some p ∈ [1, ∞). Then X_n converges a.s. and in L^p as n → ∞ to the random variable X_∞ = E[X_0 | G_∞], where G_∞ = ∩_{n≥0} G_n.
Proof. We will first adapt Doob's upcrossing inequality, Theorem 2.16, to this setting. Let a < b be real numbers and let N_n([a, b], X) be the number of upcrossings of the interval [a, b] by X between times n and 0, as defined at the beginning of Section 2.4.

If we write F_k = G_{n−k} for 0 ≤ k ≤ n, then (F_k) is an increasing filtration and the process (X_{n−k}, 0 ≤ k ≤ n) is an F-martingale. Then N_n([a, b], X) is the number of upcrossings of the interval [a, b] by (X_{n−k})_k between times 0 and n. Thus, applying Doob's upcrossing inequality to (X_{n−k})_k, we get that

(b − a) E[N_n([a, b], X)] ≤ E[(X_0 − a)^−].

Letting n → ∞, N_n([a, b], X) increases to the total number of upcrossings of X from a to b, and thus we deduce that

X_m → X_∞ as m → ∞ a.s.,

for some random variable X_∞, which is G_∞-measurable, since the σ-algebras G_n are decreasing.

Since X_0 ∈ L^p, it follows that X_n ∈ L^p for all n ≥ 0. Also, by Fatou's lemma, we get that X_∞ ∈ L^p. Now by the conditional Jensen inequality we obtain

|X_n − X_∞|^p = |E[X_0 − X_∞ | G_n]|^p ≤ E[|X_0 − X_∞|^p | G_n].

But the latter family of random variables, (E[|X_0 − X_∞|^p | G_n])_n, is UI, by Theorem 2.23 again. Hence (|X_n − X_∞|^p)_n is also UI, and thus by [2, Theorem 13.7] we conclude that X_n → X_∞ as n → ∞ in L^p.

In order to show that X_∞ = E[X_0 | G_∞] a.s., it only remains to show that if A ∈ G_∞, then

E[X_0 1(A)] = E[X_∞ 1(A)].

Since A ∈ G_n for all n ≥ 0, we have by the martingale property that

E[X_0 1(A)] = E[X_n 1(A)].

Letting n → ∞ in the above equality and using the L^1 convergence of X_n to X_∞ finishes the proof.


2.9 Applications of martingales

Theorem 2.32. [Kolmogorov's 0-1 law] Let (X_i)_{i≥1} be a sequence of i.i.d. random variables. Let F_n = σ(X_k, k ≥ n) and F_∞ = ∩_{n≥0} F_n. Then F_∞ is trivial, i.e. every A ∈ F_∞ has probability P(A) ∈ {0, 1}.

Proof. Let G_n = σ(X_k, k ≤ n) and A ∈ F_∞. Since G_n is independent of F_{n+1}, we have that

E[1(A) | G_n] = P(A) a.s.

Theorem 2.26 gives that E[1(A)|G_n] converges to E[1(A)|G_∞] a.s. as n → ∞, where G_∞ = σ(G_n, n ≥ 0). Hence we deduce that

E[1(A)|G_∞] = 1(A) = P(A) a.s.,

since F_∞ ⊆ G_∞. Therefore

P(A) ∈ {0, 1}.

Theorem 2.33. [Strong law of large numbers] Let (X_i)_{i≥1} be a sequence of i.i.d. random variables in L^1 with µ = E[X_1]. Let S_n = X_1 + . . . + X_n for n ≥ 1 and S_0 = 0. Then

S_n/n → µ as n → ∞, a.s. and in L^1.

Proof. Let G_n = σ(S_n, S_{n+1}, . . .) = σ(S_n, X_{n+1}, X_{n+2}, . . .). We will now show that (M_n)_{n≥1} = (S_n/n)_{n≥1} is a backwards martingale with respect to (G_n)_{n≥1}, i.e. that for all n ≥ 2

E[S_{n−1}/(n − 1) | G_n] = S_n/n a.s.    (2.8)

Since X_n is independent of X_{n+1}, X_{n+2}, . . ., we have E[X_n | G_n] = E[X_n | S_n], and we obtain

E[S_{n−1}/(n − 1) | G_n] = E[(S_n − X_n)/(n − 1) | G_n] = S_n/(n − 1) − E[X_n | S_n]/(n − 1).    (2.9)

By symmetry, notice that E[X_k | S_n] = E[X_1 | S_n] for all k ≤ n. Indeed, for any A ∈ B(R), the quantity E[X_k 1(S_n ∈ A)] does not depend on k ≤ n. Clearly

E[X_1 | S_n] + . . . + E[X_n | S_n] = E[S_n | S_n] = S_n,

and hence E[X_n | S_n] = S_n/n a.s. Finally, putting everything together we get

E[S_{n−1}/(n − 1) | G_n] = S_n/(n − 1) − S_n/(n(n − 1)) = S_n/n a.s.

Thus, by the backwards martingale convergence theorem, we deduce that S_n/n converges as n → ∞ a.s. and in L^1 to a random variable, say Y = lim S_n/n. Obviously, for all k,

Y = lim_{n→∞} (X_{k+1} + . . . + X_{k+n})/n,

and hence Y is T_k = σ(X_{k+1}, X_{k+2}, . . .)-measurable for all k, hence ∩_k T_k-measurable. By Kolmogorov's 0-1 law, Theorem 2.32, we conclude that there exists a constant c ∈ R̄ such that P(Y = c) = 1. But

c = E[Y] = lim E[S_n/n] = µ.
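The conclusion S_n/n → µ is easy to see numerically. The following is our illustration (not from the notes), using i.i.d. Uniform[0, 2] variables, for which µ = 1:

```python
import random

random.seed(2)

def running_mean(n):
    """Empirical mean S_n/n of n i.i.d. Uniform[0, 2] variables (mu = 1)."""
    s = 0.0
    for _ in range(n):
        s += random.uniform(0.0, 2.0)
    return s / n

# S_n/n should approach mu = 1 as n grows.
for n in (10, 1000, 100000):
    print(n, running_mean(n))
```

As n grows the printed values settle near 1, with fluctuations of order n^{−1/2}, which is what the strong law (and, quantitatively, the central limit theorem) predicts.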

Theorem 2.34. [Kakutani's product martingale theorem] Let (X_n)_{n≥1} be a sequence of independent non-negative random variables of mean 1. We set

M_0 = 1 and M_n = X_1 X_2 · · · X_n, n ∈ N.

Then (M_n)_{n≥0} is a non-negative martingale and M_n → M_∞ a.s. as n → ∞ for some random variable M_∞. We set a_n = E[√X_n]; then a_n ∈ (0, 1]. Moreover:

1. if Π_n a_n > 0, then M_n → M_∞ in L^1 and E[M_∞] = 1;

2. if Π_n a_n = 0, then M_∞ = 0 a.s.

Proof. Clearly (M_n)_n is a non-negative martingale and E[M_n] = 1 for all n, since the random variables (X_i) are independent and of mean 1. Hence, by the a.s. martingale convergence theorem, we get that M_n converges a.s. as n → ∞ to a finite random variable M_∞. By Cauchy-Schwarz, a_n ≤ 1 for all n.

We now define

N_n = (√X_1 · · · √X_n)/(a_1 · · · a_n), for n ≥ 1.

Then N_n is a non-negative martingale that is bounded in L^1, and hence it converges a.s. towards a finite limit N_∞ as n → ∞.

1. We have

sup_{n≥0} E[N_n^2] = sup_{n≥0} 1/(Π_{i=1}^n a_i)^2 = 1/(Π_n a_n)^2 < ∞,    (2.10)

under the assumption that Π_n a_n > 0. Since M_n = N_n^2 (Π_{i=1}^n a_i)^2 ≤ N_n^2 for all n, we get

E[sup_{k≤n} M_k] ≤ E[sup_{k≤n} N_k^2] ≤ 4 E[N_n^2],

where the last inequality follows by Doob's L^2 inequality, Theorem 2.18. Hence by monotone convergence and (2.10) we deduce

E[sup_n M_n] < ∞,

and since M_n ≤ sup_n M_n we conclude that (M_n) is UI, and hence it also converges in L^1 towards M_∞. Finally, since E[M_n] = 1 for all n, it follows that E[M_∞] = 1.

2. We have M_n = N_n^2 (Π_{i=1}^n a_i)^2 → 0 as n → ∞, since Π_n a_n = 0 and N_∞ exists and is finite a.s. by the a.s. martingale convergence theorem. Hence M_∞ = 0 a.s.
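The dichotomy is easy to see in a concrete family. Take X_n equal to 1 + c_n or 1 − c_n, each with probability 1/2 (mean 1). This is our illustrative choice, not from the notes:

```python
import math

def a(c):
    """a = E[sqrt(X)] for X taking values 1 + c and 1 - c, each w.p. 1/2."""
    return (math.sqrt(1 + c) + math.sqrt(1 - c)) / 2

# Fixed c in (0, 1]: a(c) < 1, so prod_n a_n = a(c)^n -> 0 and case 2 applies.
print(a(0.5))

# Decaying c_n = 1/n: since 1 - a(c) is of order c^2 and sum 1/n^2 < infty,
# the partial products stay bounded away from 0, so case 1 applies.
partial_product = 1.0
for n in range(1, 100000):
    partial_product *= a(1.0 / n)
print(partial_product)
```

The first print is strictly below 1, so with constant c the product martingale degenerates; the second stays bounded away from 0, the regime in which M_n remains UI.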

2.9.1 Martingale proof of the Radon-Nikodym theorem

Theorem 2.35. [Radon-Nikodym theorem] Let P and Q be two probability measures on the measurable space (Ω, F). Assume that F is countably generated, i.e. there exists a collection of sets (F_n : n ∈ N) such that

F = σ(F_n : n ∈ N).

Then the following statements are equivalent:

(a) P(A) = 0 implies that Q(A) = 0 for all A ∈ F (in this case we say that Q is absolutely continuous with respect to P and write Q ≪ P);

(b) ∀ε > 0 ∃δ > 0 such that for A ∈ F, P(A) ≤ δ implies Q(A) ≤ ε;

(c) There exists a non-negative random variable X such that

Q(A) = E[X 1(A)] for all A ∈ F.

Remark 2.36. The random variable X, which is unique P-a.s., is called (a version of) the Radon-Nikodym derivative of Q with respect to P. We write X = dQ/dP a.s. The theorem extends immediately to finite measures by scaling, then to σ-finite measures by breaking the space into pieces where the measures are finite. Also, we can lift the assumption that the σ-algebra F is countably generated; the details can be found in [2, Chapter 14].
Proof. We will first show that (a) implies (b). If (b) does not hold, then we can find ε > 0 such that for all n ≥ 1 there exists a set A_n with P(A_n) ≤ 1/n^2 and Q(A_n) ≥ ε. Since Σ_n 1/n^2 < ∞, by the Borel-Cantelli lemma we get that

P(A_n i.o.) = 0.

Therefore, from (a) we would get that Q(A_n i.o.) = 0. But

Q(A_n i.o.) = Q(∩_n ∪_{k≥n} A_k) = lim_n Q(∪_{k≥n} A_k) ≥ ε,

which is a contradiction, so (a) implies (b).

Next we will show that (b) implies (c). We consider the following filtration:

F_n = σ(F_k, k ≤ n).

If we write A_n = {H_1 ∩ . . . ∩ H_n : H_i = F_i or F_i^c}, then it is easy to see that

F_n = σ(A_n).

Note that the sets in A_n are disjoint. We now let X_n : Ω → [0, ∞) be the random variable defined as follows:

X_n(ω) = Σ_{A∈A_n : P(A)>0} (Q(A)/P(A)) 1(ω ∈ A).

Since the sets in A_n are disjoint and, by (b), Q(A) = 0 whenever P(A) = 0, we get that

Q(A) = E[X_n 1(A)] for all A ∈ F_n.

We will use the notation

X_n = dQ/dP on F_n.

It is easy to check that (X_n)_n is a non-negative martingale with respect to the filtered probability space (Ω, F, (F_n), P). Indeed, if A ∈ F_n, then

E[X_{n+1} 1(A)] = Q(A) = E[X_n 1(A)].

Also, (X_n) is bounded in L^1, since E[X_n] = Q(Ω) = 1. Hence, by the a.s. martingale convergence theorem, it converges a.s. towards a random variable X_∞ as n → ∞.

We will now show that (X_n) is a uniformly integrable martingale. Given ε > 0, let δ > 0 be as in (b) and set λ = 1/δ. Then by Markov's inequality

P(X_n ≥ λ) ≤ E[X_n]/λ = 1/λ = δ.

Therefore, by (b), since {X_n ≥ λ} ∈ F_n,

E[X_n 1(X_n ≥ λ)] = Q(X_n ≥ λ) ≤ ε,

which proves the uniform integrability. Thus, by the convergence theorem for UI martingales, Theorem 2.26, we get that X_n converges to X_∞ as n → ∞ in L^1 and E[X_∞] = 1. So for all A ∈ F_n we have

E[X_n 1(A)] = E[X_∞ 1(A)].

Hence, if we now define a new probability measure Q̃(A) = E[X_∞ 1(A)], then Q(A) = Q̃(A) for all A ∈ ∪_n F_n. But ∪_n F_n is a π-system that generates the σ-algebra F, and hence

Q = Q̃ on F,

which implies (c).

The implication (c) ⇒ (a) is straightforward.
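On a finite sample space the martingale X_n = dQ/dP on F_n can be computed explicitly, atom by atom. The sketch below is our illustration (the sample space and the measures P, Q are hypothetical choices, not from the notes): Ω = {0, ..., 7}, with F_n generated by the first n binary digits of ω, so the atoms of F_n are dyadic blocks.

```python
# Discrete sketch of X_n = dQ/dP on F_n.  Omega = {0,...,7}, F_n generated
# by the first n binary digits of omega, P uniform, Q an arbitrary
# probability vector (both hypothetical choices for illustration).
P = [1 / 8] * 8
Q = [0.05, 0.05, 0.1, 0.1, 0.2, 0.1, 0.2, 0.2]

def X(n, omega):
    """Value Q(A)/P(A) on the atom A of F_n containing omega (a dyadic block)."""
    block = 2 ** (3 - n)                 # atom size at level n
    start = (omega // block) * block
    atom = range(start, start + block)
    return sum(Q[w] for w in atom) / sum(P[w] for w in atom)

# X_n refines towards the pointwise density Q({omega})/P({omega}).
for n in range(4):
    print(n, X(n, 5))
```

Averaging X_{n+1} over an atom of F_n with the conditional P-weights recovers X_n, which is exactly the martingale property used in the proof.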

3 Continuous-time random processes

3.1 Definitions

Let (Ω, F, P) be a probability space. So far we have considered stochastic processes in discrete time only. In this section the time index set is going to be the whole positive real line, R_+. As in Section 2, we define a filtration (F_t)_t to be an increasing collection of sub-σ-algebras of F, i.e. F_t ⊆ F_{t′} if t ≤ t′. A collection of random variables (X_t : t ∈ R_+) is called a stochastic process. Usually, as in Section 2, X will take values in R or R^d. X is called adapted to the filtration (F_t) if X_t is F_t-measurable for all t. A stopping time T is a random variable taking values in [0, ∞] such that {T ≤ t} ∈ F_t for all t.

When we consider processes in discrete time, if we equip N with the σ-algebra P(N) that contains all the subsets of N, then the process

(ω, n) ↦ X_n(ω)

is clearly measurable with respect to the product σ-algebra F ⊗ P(N).

Back in continuous time, if we fix t ∈ R_+, then ω ↦ X_t(ω) is a random variable. But the mapping (ω, t) ↦ X_t(ω) has no reason to be measurable with respect to F ⊗ B(R_+) (B(R_+) is the Borel σ-algebra) unless some regularity conditions are imposed on X. Also, if A ⊆ R, then the first hitting time of A,

T_A = inf{t : X_t ∈ A},

is not in general a stopping time, as the set

{T_A ≤ t} = ∪_{0≤s≤t} {X_s ∈ A} ∉ F_t in general,

since this is an uncountable union.
A quite natural requirement is that for fixed ω the mapping t ↦ X_t(ω) is continuous in t. Then, indeed, the mapping (ω, t) ↦ X_t(ω) is measurable. More generally, we will consider processes that are right-continuous and admit left limits everywhere a.s., and we will call such processes càdlàg, from the French "continu à droite, limites à gauche". Continuous and càdlàg processes are determined by their values on a countable dense subset of R_+, for instance Q_+.

Note that if a process X = (X_t)_{t∈(0,1]} is continuous, then the mapping

(ω, t) ↦ X_t(ω)

is measurable with respect to F ⊗ B((0, 1]). To see this, note that by the continuity of X in t we can write

X_t(ω) = lim_{n→∞} Σ_{k=0}^{2^n − 1} 1(t ∈ (k2^{−n}, (k + 1)2^{−n}]) X_{k2^{−n}}(ω).

For each n it is easy to see that

(ω, t) ↦ Σ_{k=0}^{2^n − 1} 1(t ∈ (k2^{−n}, (k + 1)2^{−n}]) X_{k2^{−n}}(ω)

is F ⊗ B((0, 1])-measurable. Hence X_t(ω) is F ⊗ B((0, 1])-measurable, as a limit of measurable functions.

We let C(R_+, E) (resp. D(R_+, E)) be the space of continuous (resp. càdlàg) functions x : R_+ → E, endowed with the product σ-algebra that makes the projections π_t : x ↦ x_t measurable for every t. Note that E = R or R^d in this course.
For a stopping time T we define, as before,

F_T = {A ∈ F : A ∩ {T ≤ t} ∈ F_t for all t}.

For a càdlàg process X we set X_T(ω) = X_{T(ω)}(ω) whenever T(ω) < ∞, and again as before we define the stopped process X^T by X_t^T = X_{T∧t}.

Proposition 3.1. Let S and T be stopping times and X a càdlàg adapted process. Then

1. S ∧ T is a stopping time;

2. if S ≤ T, then F_S ⊆ F_T;

3. X_T 1(T < ∞) is an F_T-measurable random variable;

4. X^T is adapted.

Proof. Statements 1 and 2 follow directly from the definition, as in the discrete-time case. We will only show 3. Note that 4 follows from 3, since X_{T∧t} will then be F_{T∧t}-measurable, and hence F_t-measurable, since by 2, F_{T∧t} ⊆ F_t.

Note that a random variable Z is F_T-measurable if and only if Z 1(T ≤ t) is F_t-measurable for all t. One implication is direct: if Z 1(T ≤ t) is F_t-measurable for all t, then for any Borel set B not containing 0 we have {Z ∈ B} ∩ {T ≤ t} = {Z 1(T ≤ t) ∈ B} ∈ F_t, so {Z ∈ B} ∈ F_T (sets B containing 0 are handled by taking complements). For the other implication, note that if Z = c 1(A) with c > 0 and A ∈ F_T, then the claim is true, since Z 1(T ≤ t) = c 1(A ∩ {T ≤ t}) and A ∩ {T ≤ t} ∈ F_t. This extends to all finite linear combinations of indicators, since if Z = Σ_{i=1}^n c_i 1(A_i), where the constants c_i are positive, then we can write Z as a linear combination of indicators of disjoint sets and then the claim follows easily. Finally, any non-negative random variable Z can be approximated by Z_n = 2^{−n} ⌊2^n Z⌋ ∧ n ↑ Z as n → ∞. The claim holds for each Z_n, and the limit of F_T-measurable random variables is F_T-measurable.

So, in order to prove that X_T 1(T < ∞) is F_T-measurable, we will show that X_T 1(T ≤ t) is F_t-measurable for all t. We can write

X_T 1(T ≤ t) = X_T 1(T < t) + X_t 1(T = t).

Clearly, the random variable X_t 1(T = t) is F_t-measurable. It only remains to show that X_T 1(T < t) is F_t-measurable. If we let T_n = 2^{−n} ⌈2^n T⌉, then it is easy to see that T_n is a stopping time taking values in the set D_n = {k2^{−n} : k ∈ N}. Indeed,

{T_n ≤ t} = {⌈2^n T⌉ ≤ 2^n t} = {T ≤ 2^{−n} ⌊2^n t⌋} ∈ F_{2^{−n}⌊2^n t⌋} ⊆ F_t.

By the càdlàg property of X and the convergence T_n ↓ T, we get that

X_T 1(T < t) = lim_{n→∞} X_{T_n ∧ t} 1(T < t).

Since T_n takes only countably many values, we have

X_{T_n ∧ t} 1(T < t) = (Σ_{d∈D_n, d≤t} X_d 1(T_n = d) + X_t 1(T_n > t)) 1(T < t).

But T_n is a stopping time with respect to the filtration (F_t), and hence we see that X_{T_n ∧ t} 1(T < t) is F_t-measurable for all n, and this finishes the proof.


Example 3.2. Note that when the time index set is R_+, hitting times are not always stopping times. Let J be a random variable taking values +1 or −1, each with probability 1/2. Consider now the process

X_t = t if t ∈ [0, 1], and X_t = 1 + J(t − 1) if t > 1.

Let F_t = σ(X_s, s ≤ t) be the natural filtration of X. Then if A = (1, 2) and we consider T_A = inf{t ≥ 0 : X_t ∈ A}, clearly

{T_A ≤ 1} ∉ F_1.

If we impose some regularity conditions on the process or the filtration, though, then we do get stopping times, as in the next two propositions.
Proposition 3.3. Let A be a closed set and let X be a continuous adapted process. Then the first hitting time of A,

T_A = inf{t ≥ 0 : X_t ∈ A},

is a stopping time.

Proof. It suffices to show that

{T_A ≤ t} = {inf_{s∈Q, s≤t} d(X_s, A) = 0},    (3.1)

where d(x, A) stands for the distance of x from the set A. If T_A = s ≤ t, then there exists a sequence (s_n) of times such that X_{s_n} ∈ A and s_n → s as n → ∞. By continuity of X, we then deduce that X_{s_n} → X_s as n → ∞, and since A is closed, we must have that X_s ∈ A. Thus we showed that X_{T_A} ∈ A. We can now find a sequence of rationals q_n ≤ t such that q_n → T_A as n → ∞, and since d(X_{T_A}, A) = 0 we get that d(X_{q_n}, A) → 0 as n → ∞.

Suppose now that inf_{s∈Q, s≤t} d(X_s, A) = 0. Then there exists a sequence (s_n) with s_n ∈ Q and s_n ≤ t for all n, such that

d(X_{s_n}, A) → 0 as n → ∞.

We can extract a subsequence of (s_n) converging to some s ∈ [0, t], and by continuity of X we get (along this subsequence) that X_{s_n} → X_s as n → ∞. Since d(X_s, A) = 0 and A is a closed set, we conclude that X_s ∈ A, and hence T_A ≤ t.
Definition 3.4. Let (F_t)_{t∈R_+} be a filtration. For each t we define

F_{t+} = ∩_{s>t} F_s.

If F_{t+} = F_t for all t, then we call the filtration (F_t) right-continuous.

Proposition 3.5. Let A be an open set and X a continuous process. Then

T_A = inf{t ≥ 0 : X_t ∈ A}

is a stopping time with respect to the filtration (F_{t+}).

Proof. First we show that for all t, the event {T_A < t} ∈ F_t. Indeed, by the continuity of X and the fact that A is open, we get that

{T_A < t} = ∪_{q∈Q, q<t} {X_q ∈ A} ∈ F_t,

since it is a countable union.

Since we can write

{T_A ≤ t} = ∩_n {T_A < t + 1/n},

we get that {T_A ≤ t} ∈ F_{t+}.

3.2 Martingale regularization theorem

As we discussed at the beginning of the section, we can view a stochastic process indexed by R_+ as a random variable with values in the space of functions {f : R_+ → E}, endowed with the product σ-algebra that makes the projections f ↦ f(t) measurable. The law of the process X is the measure µ defined as

µ(A) = P(X ∈ A),

where A is in the product σ-algebra. However, the measure µ is not easy to work with. Instead we consider simpler objects, which we define below.

Given a probability measure µ on D(R_+, E), we consider the probability measures µ_J, where J ⊆ R_+ is a finite set, defined as the laws of (X_t, t ∈ J). The probability measures (µ_J) are called the finite-dimensional distributions of µ. By a π-system uniqueness argument, µ is uniquely determined by its finite-dimensional distributions. Indeed, the set

{∩_{s∈J} {X_s ∈ A_s} : J finite, A_s ∈ B(R)}

is a π-system generating the product σ-algebra. So, when we want to specify the law of a càdlàg process, it suffices to describe its finite-dimensional distributions. Of course, we have no a priori reason to believe there exists a càdlàg process whose finite-dimensional distributions coincide with a given family of measures (µ_J : J ⊆ R_+, J finite).

Even if we know the law of a process, this does not give us much information about the sample path properties of the process. Namely, there could be different processes with the same finite-dimensional distributions. This motivates the following definition:

Definition 3.6. Let X and X′ be two processes defined on the same probability space (Ω, F, P). We say that X′ is a version of X if X_t = X′_t a.s. for every t.

Remark 3.7. Note that two versions of the same process have the same finite-dimensional distributions. But they need not share the same sample path properties.
Example 3.8. Let X = (X_t)_{t∈[0,1]} be the process that is identically 0 for all t. Then obviously the finite-dimensional distributions are Dirac measures at 0. Now let U be a uniform random variable on [0, 1]. We define X′_t = 1(U = t). Then clearly the finite-dimensional distributions of X′ are Dirac measures at 0, and hence X′ is a version of X. However, X′ is not continuous, and furthermore

P(X′_t = 0 for all t ∈ [0, 1]) = 0.
In this section we are going to show two theorems that guarantee the existence of a continuous or càdlàg version of a process.

Let (Ω, F, (F_t), P) be a filtered probability space. Let N be the collection of sets in F of measure 0. We define the filtration

F̃_t = σ(F_{t+}, N).

Definition 3.9. If a filtration satisfies F̃_t = F_t for all t, then we say that (F_t) satisfies the usual conditions.
Before stating the next theorem, note that the definitions of martingales (resp. supermartingales and submartingales) are the same in continuous time as the ones given for discrete-time processes.

Theorem 3.10. [Martingale regularization theorem] Let (X_t)_{t≥0} be a martingale with respect to the filtration (F_t)_{t≥0}. Then there exists a càdlàg process X̃ which is a martingale with respect to (F̃_t) and satisfies

X_t = E[X̃_t | F_t] a.s.

for all t ≥ 0. If the filtration (F_t) satisfies the usual conditions, then X̃ is a càdlàg version of X.
Before proving the theorem, we state and prove an easy result about functions, analogous to Lemma 2.15, which was used in the proof of the a.s. martingale convergence theorem.

Lemma 3.11. Let f : Q_+ → R be a function defined on the positive rational numbers. Suppose that for all rationals a < b and all bounded I ⊆ Q_+, the function f is bounded on I and the number of upcrossings of the interval [a, b] by f during the time set I is finite, i.e. N([a, b], I, f) < ∞, where N([a, b], I, f) is defined as

sup{n ≥ 0 : ∃ 0 ≤ s_1 < t_1 < . . . < s_n < t_n, s_i, t_i ∈ I, f(s_i) < a, f(t_i) > b for 1 ≤ i ≤ n}.

Then for every t ∈ R_+ the right and left limits of f at t along Q_+ exist and are finite, i.e.

lim_{s↓t, s∈Q_+} f(s) and lim_{s↑t, s∈Q_+} f(s) exist and are finite.

Proof. First note that if (s_n) is a sequence of rationals decreasing to t, then by Lemma 2.15 the limit lim_n f(s_n) exists. Similarly, if (s′_n) is a sequence of rationals increasing to t, then the limit lim_n f(s′_n) exists. So far we have shown that along any monotone sequence converging to t from above (or below) the limit exists. It remains to show that the limit is the same along any two sequences decreasing to t. To see this, note that if (s_n) and (q_n) are two sequences decreasing to t with lim_n f(s_n) ≠ lim_n f(q_n), then we can interleave the two sequences to get a decreasing sequence (a_n) converging to t such that lim_n f(a_n) does not exist, which is a contradiction, since we already showed that along every decreasing sequence the limit exists. Finally, the limits from above and below are finite, which follows from the assumption that f is bounded on every bounded subset of Q_+.
Proof of Theorem 3.10. The goal is to define X̃ as

X̃_t = lim_{s↓t, s∈Q_+} X_s

on a set of measure 1, and 0 elsewhere.

So first we need to check that the limit above exists a.s. and is finite. In order to do so, we are going to use Lemma 3.11. Therefore, we first show that X is bounded on bounded subsets I of Q_+. Let I be such a subset. Consider J = {j_1, . . . , j_n} ⊆ I, where j_1 < j_2 < . . . < j_n. Then the process (X_j)_{j∈J} is a discrete-time martingale. By Doob's maximal inequality we obtain

λ P(max_{j∈J} |X_j| > λ) ≤ E[|X_{j_n}|] ≤ E[|X_K|],

where K > sup I. So, taking a monotone limit over finite subsets J of I with union the whole of I, we get that

λ P(sup_{t∈I} |X_t| > λ) ≤ E[|X_K|].

Therefore, by letting λ → ∞, this shows that

P(sup_{t∈I} |X_t| < ∞) = 1.

Let a < b be rational numbers. Then we have N([a, b], I, X) = sup_{J⊆I, J finite} N([a, b], J, X). Let J = {a_1, . . . , a_n} (in increasing order again) be a finite subset of I. Then (X_{a_i})_{i≤n} is a martingale, and Doob's upcrossing lemma gives that

(b − a) E[N([a, b], J, X)] ≤ E[(X_{a_n} − a)^−] ≤ E[(X_K − a)^−].    (3.2)

By monotone convergence again, if we let I_M = Q_+ ∩ [0, M], we then get that for all M

N([a, b], I_M, X) < ∞ a.s.

Thus, if we now let

Ω_0 = ∩_{M∈N} ∩_{a<b, a,b∈Q} ({N([a, b], I_M, X) < ∞} ∩ {sup_{t∈I_M} |X_t| < ∞}),

then we obtain that P(Ω_0) = 1. For ω ∈ Ω_0, by Lemma 3.11 the following limits exist in R:

X_{t+}(ω) = lim_{s↓t, s∈Q} X_s(ω), t ≥ 0,

X_{t−}(ω) = lim_{s↑t, s∈Q} X_s(ω), t > 0.

Hence we can now define, for t ≥ 0,

X̃_t = X_{t+} on Ω_0, and X̃_t = 0 otherwise.

Then clearly X̃ is (F̃_t)-adapted, since F̃_t also contains the events of probability 0.

Let (t_n) be a sequence in Q such that t_n ↓ t as n → ∞. Then

X̃_t = lim_n X_{t_n}.

Notice that the process (X_{t_n} : n ≥ 1) is a backwards martingale, and hence it converges a.s. and in L^1 as n → ∞. Therefore,

E[X_{t_n} | F_t] → E[X̃_t | F_t] in L^1.

But E[X_{t_n} | F_t] = X_t for every n. Therefore,

X_t = E[X̃_t | F_t] a.s.    (3.3)

It remains to show the martingale property of X̃. Let s < t and let (s_n) be a sequence in Q such that s_n ↓ s with s_n < t for all n. Then

X̃_s = lim_n X_{s_n} = lim_n E[X_t | F_{s_n}].

Now note that (E[X_t | F_{s_n}]) is a backwards martingale, and hence it converges a.s. and in L^1 to E[X_t | F_{s+}]. Therefore,

X̃_s = E[X_t | F_{s+}] a.s.    (3.4)

If s < t, then by the tower property and (3.4) and (3.3) we get that

E[X̃_t | F_{s+}] = X̃_s a.s.

Notice that if G is any σ-algebra and X is an integrable random variable, then

E[X | σ(G, N)] = E[X | G] a.s.

Finally, we get that E[X̃_t | F̃_s] = X̃_s a.s., which shows that X̃ is a martingale with respect to the filtration (F̃_t).
The only thing that remains to prove is the càdlàg property.

Suppose that for some ω ∈ Ω_0 the path X̃(ω) is not right-continuous at some t. Then there exists ε > 0 and a sequence (s_n) with s_n ↓ t as n → ∞ such that

|X̃_{s_n} − X̃_t| > ε.

By the definition of X̃, for ω ∈ Ω_0 there exists a sequence of rational numbers (s′_n) such that s′_n > s_n, s′_n ↓ t as n → ∞ and

|X̃_{s_n} − X_{s′_n}| ≤ ε/2.

Therefore, we get that

|X_{s′_n} − X̃_t| > ε/2,

which is a contradiction, since X_{s′_n} → X̃_t as n → ∞.

The proof that X̃ has left limits is left as an exercise (hint: use the finite upcrossing property of X on the rationals).
Example 3.12. Let ξ, η be independent random variables taking values +1 or −1 with equal probability. We now define

X_t = 0 if t < 1; X_t = ξ if t = 1; X_t = ξ + η if t > 1.

We also define F_t to be the natural filtration, i.e. F_t = σ(X_s, s ≤ t). Then clearly X is a martingale relative to the filtration (F_t), but it is not right-continuous at 1. Also, it is easy to see that F_1 = σ(ξ) but F_{1+} = σ(ξ, η). We now define

X̃_t = 0 if t < 1; X̃_t = ξ + η if t ≥ 1.

It is easy to check that X_t = E[X̃_t | F_t] a.s. for all t, and X̃ is a martingale with respect to the filtration (F_{t+}). It is obvious that X̃ is càdlàg. Note, though, that X̃ is not a version of X, since X_1 ≠ X̃_1 a.s.

From now on, when we work with martingales in continuous time, we will always consider their càdlàg version, provided that the filtration satisfies the usual conditions.

3.3 Convergence and Doob's inequalities in continuous time

In this section we will give the continuous-time analogues of Doob's inequalities and the convergence theorems for martingales.
Theorem 3.13. [A.s. martingale convergence] Let (X_t : t ≥ 0) be a càdlàg martingale which is bounded in L^1. Then X_t → X_∞ a.s. as t → ∞, for some X_∞ ∈ L^1(F_∞).

Proof. If N([a, b], I_M, X) stands for the number of upcrossings of the interval [a, b] as defined in Lemma 3.11, then from (3.2) in the proof of the martingale regularization theorem we get that

(b − a) E[N([a, b], I_M, X)] ≤ |a| + sup_{t≥0} E[|X_t|] < ∞,

since X is bounded in L^1. Hence, taking the limit as M → ∞, we get that

N([a, b], Q_+, X) < ∞ a.s.

Therefore, the set

Ω_0 = ∩_{a<b, a,b∈Q} {N([a, b], Q_+, X) < ∞}

has probability 1. On Ω_0 it is easy to see that X_q converges as q → ∞, q ∈ Q_+. Indeed, as in the proof of Lemma 2.15, if X_q did not converge, then lim sup X_q ≠ lim inf X_q, and this would contradict the finite number of upcrossings of any interval [a, b] with lim inf < a < b < lim sup. Thus (X_q)_{q∈Q_+} converges a.s. as q → ∞, q ∈ Q_+, to a limit X_∞. We will now use the càdlàg property of X to deduce that

X_t → X_∞ as t → ∞.

Since X_q → X_∞ as q → ∞, q ∈ Q_+, for each ε > 0 there exists q_0 such that

|X_q − X_∞| < ε/2 for all rationals q > q_0.

By right continuity, for every t > q_0 there exists a rational q > t such that

|X_t − X_q| < ε/2.

Hence we conclude that |X_t − X_∞| ≤ ε.

Theorem 3.14. [Doob's maximal inequality] Let (X_t : t ≥ 0) be a cadlag martingale
and X*_t = sup_{s≤t} |X_s|. Then, for all λ ≥ 0 and t ≥ 0,
    λ P(X*_t ≥ λ) ≤ E[|X_t|].

Proof. Notice that by the cadlag property we have
    sup_{s≤t} |X_s| = sup_{s ∈ {t} ∪ ([0,t] ∩ Q_+)} |X_s|.
The rest of the proof follows in the same way as the first part of the proof of Theorem 3.10.

Theorem 3.15. [Doob's L^p-inequality] Let (X_t : t ≥ 0) be a cadlag martingale. Setting
X*_t = sup_{s≤t} |X_s|, then for all p > 1 we have
    ‖X*_t‖_p ≤ (p/(p−1)) ‖X_t‖_p.

Theorem 3.16. [L^p martingale convergence theorem] Let X be a cadlag martingale
and p > 1. Then the following statements are equivalent:
1. X is bounded in L^p(Ω, F, P): sup_{t≥0} ‖X_t‖_p < ∞.
2. X converges a.s. and in L^p to a random variable X_∞.
3. There exists a random variable Z ∈ L^p(Ω, F, P) such that
    X_t = E[Z | F_t] a.s.
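Doob's maximal inequality can be illustrated by a discrete-time Monte Carlo experiment; the simple random walk and the parameters below are arbitrary choices, not from the notes:

```python
import random

def walk_stats(n_steps, n_paths, lam, seed=1):
    """Monte Carlo check of Doob's maximal inequality
        lam * P(max_{k<=n} |S_k| >= lam) <= E|S_n|
    for the simple random walk martingale S_k, a discrete stand-in
    for the cadlag martingale of Theorem 3.14."""
    rng = random.Random(seed)
    exceed, abs_end = 0, 0.0
    for _ in range(n_paths):
        s, running_max = 0, 0
        for _ in range(n_steps):
            s += rng.choice((-1, 1))
            running_max = max(running_max, abs(s))
        if running_max >= lam:
            exceed += 1
        abs_end += abs(s)
    lhs = lam * exceed / n_paths   # lam * P(X*_n >= lam)
    rhs = abs_end / n_paths        # E|X_n|
    return lhs, rhs

lhs, rhs = walk_stats(100, 2000, 15)
```

With these parameters the left-hand side comes out around 4 and the right-hand side around 8, so the inequality holds with visible slack.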

Theorem 3.17. [UI martingale convergence theorem] Let X be a cadlag martingale.
Then X is UI if and only if X converges a.s. and in L^1 to X_∞, and this holds if and only if X is
closed.

Theorem 3.18. [Optional stopping theorem] Let X be a cadlag UI martingale. Then
for all stopping times S ≤ T we have
    E[X_T | F_S] = X_S a.s.

Proof. Let A ∈ F_S. We need to show that
    E[X_T 1(A)] = E[X_S 1(A)].
Let T_n = 2^{−n} ⌈2^n T⌉ and S_n = 2^{−n} ⌈2^n S⌉. Then T_n ↓ T and S_n ↓ S as n → ∞, and by the right
continuity of X we get that
    X_{S_n} → X_S and X_{T_n} → X_T as n → ∞.
Also, from the discrete time optional stopping theorem we have that X_{T_n} = E[X_∞ | F_{T_n}], and
thus we see that (X_{T_n}) is UI. Hence it converges to X_T as n → ∞ also in L^1. By the discrete
time optional stopping theorem for UI martingales we have
    E[X_{T_n} | F_{S_n}] = X_{S_n} a.s.        (3.5)
Since A ∈ F_S, the definition of S_n implies that A ∈ F_{S_n}. Hence from (3.5) we obtain that
    E[X_{T_n} 1(A)] = E[X_{S_n} 1(A)].
Letting n → ∞ and using the L^1 convergence of X_{T_n} to X_T and of X_{S_n} to X_S we have
    E[X_T 1(A)] = E[X_S 1(A)].
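The dyadic approximation T_n = 2^{−n}⌈2^n T⌉ used in this proof is easy to compute; a small sketch:

```python
import math

def dyadic_approx(t, n):
    """T_n = 2^{-n} * ceil(2^n * t): the dyadic approximation of a
    stopping time used in the proof of Theorem 3.18. It takes values
    in the grid {k 2^{-n} : k in N} and decreases to t as n grows."""
    return math.ceil((2 ** n) * t) / (2 ** n)

t = 0.3
approxs = [dyadic_approx(t, n) for n in range(1, 20)]
```

The approximations are non-increasing in n and always ≥ t, which is what makes T_n a stopping time for the discrete filtration at scale 2^{−n}.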

3.4  Kolmogorov's continuity criterion

Let D_n = {k2^{−n} : 0 ≤ k ≤ 2^n} be the set of dyadic rationals of level n and D = ∪_{n≥0} D_n.

Theorem 3.19. [Kolmogorov's continuity criterion] Let (X_t)_{t∈D} be a stochastic process
with real values. Suppose there exist p > 0 and β > 0 such that
    E[|X_t − X_s|^p] ≤ c|t − s|^{1+β}, for all s, t ∈ D,
for some constant c < ∞. Then for every α ∈ (0, β/p), the process (X_t)_{t∈D} is α-Hölder
continuous, i.e. there exists a random variable K_α such that
    |X_t − X_s| ≤ K_α |s − t|^α, for all s, t ∈ D.

Proof. By Markov's inequality and the assumption we have
    P(|X_{k2^{−n}} − X_{(k+1)2^{−n}}| ≥ 2^{−nα}) ≤ c 2^{−n(1+β)} 2^{nαp}.
By the union bound we have
    P(max_{0≤k<2^n} |X_{k2^{−n}} − X_{(k+1)2^{−n}}| ≥ 2^{−nα}) ≤ c 2^{−n(β−αp)}.
By Borel–Cantelli, since α ∈ (0, β/p), we deduce that a.s.
    max_{0≤k<2^n} |X_{k2^{−n}} − X_{(k+1)2^{−n}}| ≤ 2^{−nα}, for all n sufficiently large.
Therefore, there exists a random variable M such that
    sup_{n≥0} max_{0≤k<2^n} |X_{k2^{−n}} − X_{(k+1)2^{−n}}| / 2^{−nα} ≤ M < ∞.        (3.6)
We will now show that there exists a random variable M' < ∞ a.s. such that for every s, t ∈ D
we have
    |X_t − X_s| ≤ M' |t − s|^α.
Let s, t ∈ D with s < t and let r be the unique integer such that
    2^{−(r+1)} < t − s ≤ 2^{−r}.
Then there exists k such that s ≤ k2^{−(r+1)} ≤ t. Set τ = k2^{−(r+1)}; then 0 ≤ t − τ < 2^{−r}. So
we can write
    t − τ = Σ_{j≥r+1} x_j 2^{−j},
where x_j ∈ {0, 1} for all j (in fact this is a finite sum because t is dyadic). Similarly we
can write
    τ − s = Σ_{j≥r+1} y_j 2^{−j},
where y_j ∈ {0, 1} for all j. Thus we see that we can write the interval [s, t) as a disjoint
union of dyadic intervals of length 2^{−n} for n ≥ r + 1, where at most 2 such intervals have
the same length. Therefore,
    |X_s − X_t| ≤ Σ_{d,n} |X_d − X_{d+2^{−n}}|,
where d, d + 2^{−n} in the summation above are the endpoints of the intervals in the decomposition of [s, t). Hence using (3.6) we obtain that for all s, t ∈ D
    |X_s − X_t| ≤ 2 Σ_{n≥r+1} M 2^{−nα} = 2M · 2^{−(r+1)α}/(1 − 2^{−α}).
Thus, if we set M' = 2M/(1 − 2^{−α}), then we get that for s, t ∈ D
    |X_s − X_t| ≤ M' 2^{−(r+1)α} ≤ M' |t − s|^α.
Therefore (X_t)_{t∈D} is α-Hölder continuous a.s.
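The interval decomposition used in the proof can be made concrete; below is a sketch (function names and the greedy peeling are my own phrasing of the argument, using exact rational arithmetic):

```python
from fractions import Fraction

def dyadic_decomposition(s, t):
    """Decompose [s, t) into disjoint dyadic intervals of lengths 2^{-n},
    n >= r+1, with at most two intervals of each length - the covering
    used in the proof of Theorem 3.19. Here 2^{-(r+1)} < t - s <= 2^{-r}
    and tau is the smallest multiple of 2^{-(r+1)} that is >= s."""
    s, t = Fraction(s), Fraction(t)
    r = 0
    while t - s <= Fraction(1, 2 ** (r + 1)):
        r += 1
    step = Fraction(1, 2 ** (r + 1))
    tau = -(-s // step) * step            # ceiling of s to the grid

    def peel(lo, hi):
        # greedily peel the largest allowed dyadic length from the left
        out, x = [], lo
        while x < hi:
            n = r + 1
            while Fraction(1, 2 ** n) > hi - x:
                n += 1
            out.append((x, x + Fraction(1, 2 ** n)))
            x += Fraction(1, 2 ** n)
        return out

    return peel(s, tau) + peel(tau, t)

ivs = dyadic_decomposition(Fraction(3, 16), Fraction(11, 16))
```

For s = 3/16 and t = 11/16 this yields contiguous intervals of lengths 1/16, 1/4, 1/8, 1/16, matching the "at most two per scale" bound in the proof.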

4  Weak convergence

4.1  Definitions

Let (M, d) be a metric space endowed with its Borel σ-algebra. All the measures that we
will consider in this section will be measures on such a measurable space.

Definition 4.1. Let (μ_n, n ≥ 0) be a sequence of probability measures on a metric space
(M, d). We say that μ_n converges weakly to μ, and write μ_n ⇒ μ, if μ_n(f) → μ(f) as n → ∞
for all bounded continuous functions f on M, where μ(f) = ∫_M f dμ.

Notice that by the definition μ is also a probability measure, since μ(1) = 1.

Example 4.2. Let (x_n)_{n≥0} be a sequence in a metric space M that converges to x as
n → ∞. Then δ_{x_n} converges weakly to δ_x as n → ∞, since if f is any continuous function,
then f(x_n) → f(x) as n → ∞.

Example 4.3. Let M = [0, 1] with the Euclidean metric and μ_n = n^{−1} Σ_{0≤k≤n−1} δ_{k/n}.
Then μ_n(f) is the Riemann sum n^{−1} Σ_{0≤k≤n−1} f(k/n) and it converges to ∫_0^1 f(x) dx if f is
continuous, which shows that μ_n converges weakly to Lebesgue measure on [0, 1].

Remark 4.4. Notice that if A is a Borel set, then it is not always true that μ_n(A) → μ(A)
as n → ∞, when μ_n ⇒ μ. Indeed, let x_n = 1/n and μ_n = δ_{x_n}. Then μ_n ⇒ δ_0, but μ_n(A) = 1
for all n, when A is the open set (0, 1), while δ_0(A) = 0.
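Example 4.3 can be illustrated numerically; a minimal sketch (the test functions x² and cos are arbitrary choices):

```python
import math

def mu_n(f, n):
    """mu_n(f) = n^{-1} * sum_{k=0}^{n-1} f(k/n): the integral of f
    against the discrete measure of Example 4.3 (a left Riemann sum)."""
    return sum(f(k / n) for k in range(n)) / n

# mu_n(f) should approach the Lebesgue integral of f over [0, 1]:
# for f(x) = x^2 the limit is 1/3, and for f = cos it is sin(1).
err1 = abs(mu_n(lambda x: x * x, 10_000) - 1 / 3)
err2 = abs(mu_n(math.cos, 10_000) - math.sin(1))
```

Both errors are of order 1/n, consistent with weak convergence tested against smooth functions.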
Theorem 4.5. Let (μ_n)_{n≥0} be a sequence of probability measures. The following are equivalent:
(a) μ_n ⇒ μ as n → ∞,
(b) lim inf_n μ_n(G) ≥ μ(G) for all open sets G,
(c) lim sup_n μ_n(A) ≤ μ(A) for all closed sets A,
(d) lim_n μ_n(A) = μ(A) for all Borel sets A with μ(∂A) = 0.

Proof. (a) ⇒ (b). Let G be an open set with non-empty complement G^c. For every
positive M we define
    f_M(x) = 1 ∧ (M d(x, G^c)).
Then f_M is a continuous and bounded function and for all M we have f_M(x) ≤ 1(x ∈ G).
Also f_M ↑ 1(G) as M → ∞, since G^c is a closed set. Since f_M is continuous and bounded
we have
    μ_n(f_M) → μ(f_M) as n → ∞.
Hence
    lim inf_n μ_n(G) ≥ lim inf_n μ_n(f_M) = μ(f_M).
Now using monotone convergence as M → ∞ we get
    lim inf_n μ_n(G) ≥ μ(G).

(b) ⇔ (c). This is obvious by taking complements.

(b),(c) ⇒ (d). Let Å and Ā denote the interior and the closure of the set A respectively.
Since μ(∂A) = μ(Ā \ Å) = 0, we get that μ(Å) = μ(A) = μ(Ā). Hence,
    μ(A) = μ(Å) ≤ lim inf_n μ_n(Å) ≤ lim sup_n μ_n(Ā) ≤ μ(Ā) = μ(A),
and since Å ⊆ A ⊆ Ā this gives the result.

(d) ⇒ (a). Let f : M → R_+ be a continuous bounded non-negative function. Using
Fubini's theorem we get
    ∫_M f(x) μ_n(dx) = ∫_M μ_n(dx) ∫_0^K 1(t ≤ f(x)) dt = ∫_0^K μ_n({f ≥ t}) dt,
where K is an upper bound for f. We will now show that for Lebesgue almost all t we have
    μ(∂{f ≥ t}) = 0.        (4.1)
Notice that ∂{f ≥ t} ⊆ {f = t}, since {f ≥ t} is a closed set by the continuity of f and
{f > t} is an open set contained in the interior. However, there can be at most a countable
set of numbers t such that μ({f = t}) > 0, because
    {t : μ({f = t}) > 0} = ∪_{n≥1} {t : μ({f = t}) ≥ n^{−1}}
and the n-th set on the right has at most n elements. Hence this proves (4.1).
Therefore by (d) and dominated convergence applied to ∫_0^K μ_n({f ≥ t}) dt we get that
    μ_n(f) → μ(f) as n → ∞.
The extension to the case of a function f not necessarily positive is immediate.

For a finite non-negative measure μ on R we define its distribution function
    F_μ(x) = μ((−∞, x]), x ∈ R.
As a consequence of the theorem above we will now prove the following:

Proposition 4.6. Let (μ_n)_n be a sequence of probability measures on R. The following are
equivalent:
(a) μ_n converges weakly to μ as n → ∞,
(b) for every x ∈ R such that F_μ is continuous at x, the distribution function F_{μ_n}(x)
converges to F_μ(x) as n → ∞.

Proof. (a) ⇒ (b). Let x be a continuity point of F_μ. Then
    μ(∂(−∞, x]) = μ({x}) = μ((−∞, x]) − lim_n μ((−∞, x − 1/n]) = F_μ(x) − lim_n F_μ(x − 1/n) = 0,
since x is a continuity point of F_μ. Thus we get that
    F_{μ_n}(x) = μ_n((−∞, x]) → μ((−∞, x]),
by the 4-th equivalence in Theorem 4.5.

(b) ⇒ (a). First of all note that a distribution function is increasing, and hence has only
countably many points of discontinuity.
Let G be an open set in R. Then we can write G = ∪_k (a_k, b_k), where the intervals (a_k, b_k)
are disjoint. We thus have that μ_n(G) = Σ_k μ_n((a_k, b_k)). For each interval (a, b) we have
    μ_n((a, b)) ≥ F_{μ_n}(b') − F_{μ_n}(a'),
where a', b' are continuity points of F_μ (remember there are only countably many points of
discontinuity, so the set of continuity points is dense) satisfying
    a < a' < b' < b.
Therefore
    lim inf_n μ_n((a, b)) ≥ F_μ(b') − F_μ(a') = μ((a', b']),
and hence if we let a' ↓ a and b' ↑ b along continuity points of F_μ, then
    lim inf_n μ_n((a, b)) ≥ μ((a, b)).        (4.2)
Finally we deduce
    lim inf_n μ_n(G) = lim inf_n Σ_k μ_n((a_k, b_k)) ≥ Σ_k lim inf_n μ_n((a_k, b_k)) ≥ Σ_k μ((a_k, b_k)) = μ(G),
where the first inequality follows from Fatou's lemma and the second one from (4.2).

Definition 4.7. Let (X_n)_n be a sequence of random variables taking values in a metric
space (M, d) but defined on possibly different probability spaces (Ω_n, F_n, P_n). We say that X_n
converges in distribution to a random variable X defined on the probability space (Ω, F, P) if
the law of X_n converges weakly to the law of X as n → ∞. Equivalently, if for all functions
f : M → R continuous and bounded
    E_{P_n}[f(X_n)] → E_P[f(X)] as n → ∞.

Proposition 4.8. (a) Let (X_n)_n be a sequence of random variables that converges to X in
probability as n → ∞. Then X_n converges to X in distribution as n → ∞.
(b) Let (X_n)_n be a sequence of random variables that converges to a constant c in distribution
as n → ∞. Then X_n converges to c in probability as n → ∞.

Proof. See example sheet.

Example 4.9. [Central limit theorem] Let (X_n)_n be a sequence of i.i.d. random variables
in L^2 with m = E[X_1] and σ^2 = var(X_1). We set S_n = X_1 + ... + X_n. Then the central
limit theorem states that the normalized sums (S_n − nm)/(σ√n) converge in distribution to
a Gaussian N(0, 1) random variable as n → ∞.
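A quick Monte Carlo illustration of Example 4.9, assuming Uniform[0,1] summands (for which m = 1/2 and σ² = 1/12 — this specific distribution is an arbitrary choice, not from the notes):

```python
import random, math

def normalized_sum(n, rng):
    """(S_n - n*m) / (sigma * sqrt(n)) for Uniform[0,1] summands,
    where m = 1/2 and sigma^2 = 1/12 (Example 4.9)."""
    s = sum(rng.random() for _ in range(n))
    return (s - n * 0.5) / math.sqrt(n / 12.0)

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

rng = random.Random(42)
samples = [normalized_sum(50, rng) for _ in range(4000)]
# compare the empirical CDF at a few points with the N(0,1) CDF
max_gap = max(abs(sum(z <= x for z in samples) / len(samples) - phi(x))
              for x in (-1.0, 0.0, 1.0))
```

By Proposition 4.6, convergence of the distribution functions at continuity points is exactly what weak convergence means here, so the small empirical gap is the expected behaviour.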

4.2  Tightness

Definition 4.10. A sequence of probability measures (μ_n)_n on a metric space M is said to
be tight if for every ε > 0, there exists a compact subset K ⊆ M such that
    sup_n μ_n(M \ K) ≤ ε.

Remark 4.11. Note that if a metric space M is compact, then every sequence of measures
is tight.

Theorem 4.12. [Prohorov's theorem] Let (μ_n)_n be a tight sequence of probability measures on a metric space M. Then there exists a subsequence (μ_{n_k}) and a probability measure
μ on M such that
    μ_{n_k} ⇒ μ.
Proof. We will prove the theorem in the case when M = R. Let F_n = F_{μ_n} be the distribution
function corresponding to the measure μ_n. We will first show that there exists a subsequence
(n_k) and a non-decreasing function F such that F_{n_k}(x) converges to F(x) for all x ∈ Q. To
prove that we will use a standard extraction argument.
Let (x_1, x_2, ...) be an enumeration of Q. Then (F_n(x_1))_n is a sequence in [0, 1], and hence
it has a converging subsequence. Let the converging subsequence be (F_{n_k^{(1)}}(x_1))_k and the limit
F(x_1). Then (F_{n_k^{(1)}}(x_2))_k is a sequence in [0, 1] and thus also has a converging subsequence.
If we continue in this way, we get for each i ≥ 1 a sequence (n_k^{(i)}) so that F_{n_k^{(i)}}(x_j) converges
to a limit F(x_j) for all j = 1, ..., i. Then the diagonal sequence m_k = n_k^{(k)} satisfies that
F_{m_k}(x) converges for all x ∈ Q to F(x) as k → ∞. Since the distribution functions F_n(x)
are non-decreasing in x, we get that F(x) is also non-decreasing in x ∈ Q.
By the monotonicity of F we can define for all x ∈ R
    F(x) = lim_{q↓x, q∈Q} F(q).
The definition of F gives that it is right continuous and the monotonicity property gives
that left limits exist; hence F is cadlag.
We will next show that if t is a point of continuity of F, i.e. F(t−) = F(t), then
    lim_k F_{m_k}(t) = F(t).
Let s_1 < t < s_2 with s_1, s_2 ∈ Q and such that |F(s_i) − F(t)| < ε/2 for i = 1, 2. Note that
such rational numbers s_1 and s_2 exist since t is a continuity point of F. Then using the
monotonicity of F_{m_k} we get that for k large enough
    F(t) − ε < F(s_1) − ε/2 < F_{m_k}(s_1) ≤ F_{m_k}(t) ≤ F_{m_k}(s_2) < F(s_2) + ε/2 < F(t) + ε.
By tightness, for every ε > 0 there exists N such that
    μ_n([−N, N]^c) ≤ ε for all n.
Note that we can choose N so that both N and −N are continuity points of F (F is monotone). Therefore it follows that
    F(−N) ≤ ε and 1 − F(N) ≤ ε.
Hence we see that
    lim_{x→−∞} F(x) = 0 and lim_{x→∞} F(x) = 1.
Finally we need to show that there exists a measure μ such that F = F_μ. To this end, we
define
    μ((a, b]) = F(b) − F(a).
Then μ can be extended to a Borel probability measure by Caratheodory's extension theorem
and F = F_μ. Another way to construct the measure μ is given in [2, Section 3.12].
Proposition 4.6 now finishes the proof.

4.3  Characteristic functions

Definition 4.13. Let X be a random variable taking values in R^d with law μ. We define
the characteristic function φ = φ_X by
    φ(u) = E[e^{i⟨u,X⟩}] = ∫_{R^d} e^{i⟨u,x⟩} μ(dx), u ∈ R^d.

Remark 4.14. The characteristic function of a random variable X is clearly a continuous
function on R^d and φ(0) = 1.
The characteristic function φ_X determines the law of a random variable X, in the sense
that if φ_X(u) = φ_Y(u) for all u, then L(X) = L(Y). To prove this see the Probability and
Measure notes by James Norris, Theorem 7.2.2.

Theorem 4.15. [Lévy's convergence theorem] Let (X_n)_{n≥0}, X be random variables in
R^d. Then
    L(X_n) → L(X) if and only if φ_{X_n}(ξ) → φ_X(ξ) for all ξ ∈ R^d.

We will prove the more general result:

Theorem 4.16. [Lévy] 1. If L(X_n) → L(X) as n → ∞, then φ_{X_n}(ξ) → φ_X(ξ) as n → ∞
for all ξ ∈ R^d.
2. If (X_n)_{n≥0} is a sequence of random variables in R^d such that there exists φ : R^d → C
continuous at 0 with φ(0) = 1 and such that φ_{X_n}(ξ) → φ(ξ) as n → ∞ for all ξ ∈ R^d, then
φ = φ_X for some random variable X, and L(X_n) → L(X) as n → ∞.

Before giving the proof of Lévy's theorem we state and prove a useful lemma:

Lemma 4.17. If X is a random variable in R^d, then for all K > 0
    P(‖X‖_∞ ≥ K) ≤ C (K/2)^d ∫_{[−K^{−1}, K^{−1}]^d} (1 − Re φ_X(u)) du,
where C = (1 − sin 1)^{−1}.

Proof. Let μ be the distribution of X. Then by Fubini's theorem we have, for λ > 0,
    ∫_{[−λ,λ]^d} Re φ_X(u) du = Re ∫_{[−λ,λ]^d} ∫ e^{i⟨u,x⟩} μ(dx) du
        = Re ∫ ∏_{j=1}^d ∫_{[−λ,λ]} e^{iu_j x_j} du_j μ(dx)
        = Re ∫ ∏_{j=1}^d (e^{iλx_j} − e^{−iλx_j})/(i x_j) μ(dx) = ∫ ∏_{j=1}^d (2 sin(λx_j)/x_j) μ(dx).
Therefore we have
    (2λ)^{−d} ∫_{[−λ,λ]^d} (1 − Re φ_X(u)) du = ∫ (1 − ∏_{j=1}^d sin(λx_j)/(λx_j)) μ(dx).        (4.3)
It is easy to check that if x ≥ 1, then
    |sin x| ≤ x sin 1,
and hence the function f : R^d → R given by f(u) = ∏_{j=1}^d sin(u_j)/u_j satisfies |f(u)| ≤ sin 1
when ‖u‖_∞ ≥ 1. Thus for C = (1 − sin 1)^{−1} we have
    1(‖u‖_∞ ≥ 1) ≤ C(1 − f(u)).
Hence, we have
    P(‖X‖_∞ ≥ K) = P(‖K^{−1}X‖_∞ ≥ 1) ≤ C E[1 − ∏_{j=1}^d sin(K^{−1}X_j)/(K^{−1}X_j)]
        = C ∫ (1 − ∏_{j=1}^d sin(K^{−1}x_j)/(K^{−1}x_j)) μ(dx).
Equation (4.3) with λ = K^{−1} now finishes the proof.
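Lemma 4.17 can be sanity-checked in closed form in d = 1; the choice below of a standard Cauchy variable, for which Re φ_X(u) = e^{−|u|} and P(|X| ≥ K) = 1 − (2/π) arctan K, is my own illustrative example, not from the notes:

```python
import math

# Closed-form check of Lemma 4.17 in d = 1 for a standard Cauchy X.
C = 1 / (1 - math.sin(1))   # the constant (1 - sin 1)^{-1} of the lemma

def rhs_bound(K):
    """C * (K/2) * integral over [-1/K, 1/K] of (1 - e^{-|u|}) du,
    computed exactly for the Cauchy characteristic function."""
    integral = 2 * (1 / K - (1 - math.exp(-1 / K)))
    return C * (K / 2) * integral

def cauchy_tail(K):
    """P(|X| >= K) for a standard Cauchy random variable."""
    return 1 - (2 / math.pi) * math.atan(K)

checks = [(cauchy_tail(K), rhs_bound(K)) for K in (1, 5, 10, 50)]
```

The point of the lemma — that smallness of 1 − Re φ near 0 controls the tails — is visible here: both sides decay together as K grows, with the bound always above the true tail.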


Proof of Theorem 4.16. 1. If X_n converges in distribution to X as n → ∞, then for all
f continuous and bounded, writing μ_n = L(X_n) and μ = L(X), we have
    μ_n(f) → μ(f) as n → ∞.
Take f(x) = e^{i⟨ξ,x⟩}. Then f is clearly continuous and bounded, and hence
    φ_{X_n}(ξ) = μ_n(e^{i⟨ξ,·⟩}) → μ(e^{i⟨ξ,·⟩}) = φ_X(ξ).
2. We will first show that the sequence (L(X_n)) is tight. From Lemma 4.17 we have that
for all K > 0
    P(‖X_n‖_∞ ≥ K) ≤ C (K/2)^d ∫_{[−K^{−1},K^{−1}]^d} (1 − Re φ_{X_n}(u)) du.
By the assumption and since |1 − Re φ_{X_n}(u)| ≤ 2 for all n, using the dominated convergence
theorem we have
    lim_n (K/2)^d ∫_{[−K^{−1},K^{−1}]^d} (1 − Re φ_{X_n}(u)) du = (K/2)^d ∫_{[−K^{−1},K^{−1}]^d} (1 − Re φ(u)) du.
Since φ is continuous at 0, if we take K large enough we can make this limit < ε/(2C), and
so for all n large enough
    P(‖X_n‖_∞ ≥ K) ≤ ε.
If we now take K even larger, then the above inequality holds for all n, showing the tightness
of the family (L(X_n)).
By Prohorov's theorem there exists a subsequence (X_{n_k}) that converges in distribution to
some random variable X. So φ_{X_{n_k}} converges pointwise to φ_X, and hence φ_X = φ, which
shows that φ is a characteristic function.
We will finally show that X_n converges in distribution to X. If not, then there would exist
a subsequence (m_k), a continuous and bounded function f and some ε > 0 such that for
all k
    |E[f(X_{m_k})] − E[f(X)]| > ε.        (4.4)
But since the laws of (X_{m_k}) are tight, we can extract a further subsequence (ℓ_k) along which
(X_{ℓ_k}) converges in distribution to some Y, which would imply that φ = φ_Y and thus Y would
have the same distribution as X, contradicting (4.4).

5  Large deviations

5.1  Introduction

Let (X_i) be a sequence of i.i.d. random variables with E[X_1] = x̄, and set S_n = Σ_{i=1}^n X_i.
By the central limit theorem (assuming var(X_i) = σ² < ∞) we have
    P(S_n ≥ nx̄ + aσ√n) → P(Z ≥ a) as n → ∞,
where Z ∼ N(0, 1).
Large deviations: what are the asymptotics of P(S_n ≥ an) as n → ∞, for a > x̄?

Example 5.1. Let (X_i) be i.i.d. distributed as N(0, 1). Then
    P(S_n ≥ an) = P(X_1 ≥ a√n) ∼ (1/(a√(2πn))) e^{−a²n/2},
where we write f(x) ∼ g(x) if f(x)/g(x) → 1 as x → ∞. So
    (1/n) log P(S_n ≥ an) → −I(a) = −a²/2 as n → ∞.
In general we have
    P(S_{n+m} ≥ a(n + m)) ≥ P(S_n ≥ an) P(S_m ≥ am),
so b_n = −log P(S_n ≥ an) satisfies
    b_{n+m} ≤ b_n + b_m,
and hence this implies the existence of the limit (exercise)
    lim_n b_n/n = −lim_n (1/n) log P(S_n ≥ an) = I(a).
Note that if P(X_1 ≤ a_0) = 1, then we will only consider a ≤ a_0, since clearly P(S_n ≥ na) = 0
for a > a_0.

5.2  Cramér's theorem

We will now obtain a bound for P(S_n ≥ na) using the moment generating function of X_1.
For λ ≥ 0 we set
    M(λ) = E[e^{λX_1}],
which could also be infinite. We define
    ψ(λ) = log M(λ).

Note that ψ(0) = 0 and by Markov's inequality, for λ ≥ 0,
    P(S_n ≥ na) = P(e^{λS_n} ≥ e^{λna}) ≤ e^{−λna} E[e^{λS_n}] = (e^{−λa} M(λ))^n = exp(−n(λa − ψ(λ))).   (5.1)
We now define the Legendre transform of ψ:
    ψ*(a) = sup_{λ≥0} (λa − ψ(λ)) ≥ 0·a − ψ(0) = 0.
Then (5.1) yields
    P(S_n ≥ an) ≤ e^{−nψ*(a)} for all n,
whence
    lim sup_n (1/n) log P(S_n ≥ an) ≤ −ψ*(a).        (5.2)

Theorem 5.2. [Cramér's theorem] Let (X_i) be i.i.d. random variables with E[X_1] = x̄
and S_n = Σ_{i=1}^n X_i. Then
    lim_n (1/n) log P(S_n ≥ na) = −ψ*(a) for a ≥ x̄.

Before proving the theorem we state and prove a preliminary lemma.


Lemma 5.3. The functions M(λ) and ψ(λ) are continuous on D = {λ : M(λ) < ∞} and
differentiable in the interior D̊, with
    M'(λ) = E[X_1 e^{λX_1}] and ψ'(λ) = M'(λ)/M(λ) for λ ∈ D̊.

Proof. Continuity follows immediately from the dominated convergence theorem.
Note that D is a (possibly infinite) interval: if λ_1 < λ < λ_2 and λ_1, λ_2 ∈ D, then also
λ ∈ D, since e^{λx} ≤ e^{λ_1 x} + e^{λ_2 x} for all x.
To show differentiability in D̊, note that
    (M(λ + h) − M(λ))/h = E[(e^{(λ+h)X_1} − e^{λX_1})/h],
and for 2|h| < min_{i=1,2} |λ − λ_i| =: 2δ we have, by the mean value theorem,
    |(e^{(λ+h)x} − e^{λx})/h| = |x| e^{θx} ≤ δ^{−1}(e^{λ_1 x} + e^{λ_2 x}),
where θ lies between λ and λ + h, so θ ± δ ∈ [λ_1, λ_2], and we used |x| ≤ δ^{−1} e^{δ|x|}. The
right-hand side is integrable, so the dominated convergence theorem gives the formula for
M'(λ); the formula for ψ' follows by the chain rule.

Proof of Theorem 5.2. The direction
    lim sup_n (1/n) log P(S_n ≥ na) ≤ −ψ*(a)

follows from (5.2).
Replacing X_i by X̃_i = X_i − a yields
    P(S_n ≥ na) = P(S̃_n ≥ 0),
and M̃(λ) = E[e^{λX̃_1}] = e^{−λa} M(λ), so
    ψ̃(λ) = log M̃(λ) = ψ(λ) − λa.
Thus we need to show that
    (1/n) log P(S̃_n ≥ 0) → −ψ̃*(0) = −sup_{λ≥0} [−ψ̃(λ)] = inf_{λ≥0} ψ̃(λ).
In view of (5.2), what remains is (dropping tildes)
    lim inf_n (1/n) log P(S_n ≥ 0) ≥ inf_{λ≥0} ψ(λ)        (5.3)
when x̄ < 0.
If P(X_1 ≤ 0) = 1, then P(S_n ≥ 0) = μ({0})^n and
    inf_{λ≥0} ψ(λ) ≤ lim_{λ→∞} ψ(λ) = log μ({0}),
where μ = L(X_1), so (5.3) holds in this case. Thus we may assume that P(X_1 > 0) > 0.
Next consider the case M(λ) < ∞ for all λ. Define a new law μ_θ by
    (dμ_θ/dμ)(x) = e^{θx}/M(θ), so E_θ[f(X_1)] = ∫ f(x) (e^{θx}/M(θ)) dμ(x).
More generally,
    E_θ[F(X_1, ..., X_n)] = ∫ F(x_1, ..., x_n) ∏_{i=1}^n e^{θx_i} dμ(x_1)...dμ(x_n) / M(θ)^n
holds when F(x_1, ..., x_n) = ∏_{i=1}^n f_i(x_i), and hence for all bounded measurable F.
The dominated convergence theorem gives that g(θ) = E_θ[X_1] is continuous, and g(0) = x̄ < 0, while
    lim_{θ→∞} g(θ) = lim_{θ→∞} ∫ x e^{θx} dμ(x) / ∫ e^{θx} dμ(x) > 0,
since μ(0, ∞) > 0. Thus we can find θ > 0 such that E_θ[X_1] = 0.
We now have, for any ε > 0,
    P(S_n ≥ 0) ≥ P(S_n ∈ [0, εn]) ≥ E[e^{θ(S_n − εn)} 1(S_n ∈ [0, εn])] = M(θ)^n P_θ(S_n ∈ [0, εn]) e^{−θεn}.
By the central limit theorem (under P_θ the X_i are centred) we have that P_θ(S_n ∈ [0, εn]) → 1/2 as n → ∞, so
    lim inf_n (1/n) log P(S_n ≥ 0) ≥ ψ(θ) − θε.

Letting ε ↓ 0 proves (5.3) in the case where M(λ) < ∞ for all λ.

Now we are going to prove the theorem in the general case. Let ν_n = L(S_n), let μ^K be the
law of X_1 conditioned on {|X_1| ≤ K}, and let ν_n^K be the law of S_n = Σ_{i=1}^n X_i conditioned on
the event ∩_{i=1}^n {|X_i| ≤ K}. Then we have that ν_n[0, ∞) ≥ ν_n^K[0, ∞) · μ[−K, K]^n. We write
ψ_K(λ) = log ∫_{−K}^K e^{λx} dμ(x) and observe that
    log ∫ e^{λx} dμ^K(x) = ψ_K(λ) − log μ[−K, K].
Therefore, applying the previous case to μ^K,
    lim inf_n (1/n) log ν_n[0, ∞) ≥ log μ[−K, K] + lim inf_n (1/n) log ν_n^K[0, ∞) ≥ inf_{λ≥0} ψ_K(λ) = J_K.
Note that ψ_K ↑ ψ as K → ∞, so J_K ↑ J as K → ∞, for some J, and
    lim inf_n (1/n) log ν_n[0, ∞) ≥ J.        (5.4)
Since J_K ≤ ψ_K(0) ≤ ψ(0) = 0, we have J ≤ 0.
For large K we have that μ[0, K] > 0, and hence J_K > −∞, whence J > −∞. By the
continuity of ψ_K (Lemma 5.3) the level sets {λ : ψ_K(λ) ≤ J} are non-empty compact
nested sets, so there exists
    λ_0 ∈ ∩_K {λ : ψ_K(λ) ≤ J}.
Therefore we obtain
    ψ(λ_0) = lim_K ψ_K(λ_0) ≤ J,
and hence by (5.4) we get
    lim inf_n (1/n) log ν_n[0, ∞) ≥ J ≥ ψ(λ_0) ≥ inf_{λ≥0} ψ(λ),
as claimed.

5.3  Examples

Example 5.4. Let X ∼ N(0, 1). Then
    M(λ) = (1/√(2π)) ∫ e^{λx − x²/2} dx = e^{λ²/2}, so ψ(λ) = λ²/2.
In order to maximize λa − ψ(λ) we need to solve a = ψ'(λ) = λ, and hence
    ψ*(a) = a² − a²/2 = a²/2.

Example 5.5. Let X ∼ Exp(1). If λ < 1, then
    M(λ) = ∫_0^∞ e^{λx − x} dx = 1/(1 − λ).
So for λ < 1 we have ψ(λ) = −log(1 − λ), and for λ ≥ 1 we have M(λ) = ∞, and thus
ψ(λ) = ∞. Solving a = ψ'(λ) gives a = 1/(1 − λ), or equivalently λ = 1 − a^{−1}, and hence
    ψ*(a) = a − 1 − log a.

Example 5.6. Let X ∼ Poisson(1). Then
    M(λ) = Σ_{k=0}^∞ e^{λk} e^{−1}/k! = e^{e^λ − 1},
so ψ(λ) = e^λ − 1. Solving a = ψ'(λ) gives a = e^λ, and hence
    ψ*(a) = a log a − a + 1.
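The Legendre transforms in these examples can be verified numerically; a crude grid-search sketch (grid size and the λ cutoffs are arbitrary choices of mine):

```python
import math

def psi_star(a, psi, lam_max=0.999, grid=200_000):
    """Numerically maximize lam*a - psi(lam) over lam in [0, lam_max]
    by grid search - a crude stand-in for the sup in the Legendre
    transform psi*(a) = sup_{lam >= 0} (lam*a - psi(lam))."""
    best = 0.0
    for i in range(grid + 1):
        lam = lam_max * i / grid
        best = max(best, lam * a - psi(lam))
    return best

# Example 5.5: X ~ Exp(1), psi(lam) = -log(1 - lam) for lam < 1,
# with closed form psi*(a) = a - 1 - log a.
a = 3.0
numeric = psi_star(a, lambda lam: -math.log(1 - lam))
exact = a - 1 - math.log(a)
```

The same routine recovers ψ*(a) = a²/2 for the Gaussian of Example 5.4 once lam_max is taken larger than the optimizer λ = a.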

6  Brownian motion

6.1  History and definition

Brownian motion is named after R. Brown, who observed in 1827 the erratic motion of small
particles in water. A physical model was developed by Einstein in 1905 and the mathematical
construction is due to N. Wiener in 1923. He used a random Fourier series to construct
Brownian motion. Our treatment follows later ideas of Lévy and Kolmogorov.

Definition 6.1. Let B = (B_t)_{t≥0} be a continuous process in R^d. We say that B is a Brownian
motion in R^d started from x ∈ R^d if
(i) B_0 = x a.s.,
(ii) B_t − B_s ∼ N(0, (t − s)I_d) for all s < t,
(iii) B has independent increments, independent of B_0.

Remark 6.2. We say that (B_t)_{t≥0} is a standard Brownian motion if x = 0.
Conditions (ii) and (iii) uniquely determine the law of a Brownian motion. In the next
section we will show that Brownian motion exists.

Example 6.3. Suppose that (B_t, t ≥ 0) is a standard Brownian motion and U is an independent random variable uniformly distributed on [0, 1]. Then the process (B̃_t, t ≥ 0) defined by
    B̃_t = B_t if t ≠ U, and B̃_t = 0 if t = U,
has the same finite-dimensional distributions as Brownian motion, but is discontinuous if
B(U) ≠ 0, which happens with probability one, and hence it is not a Brownian motion.

6.2  Wiener's theorem

Theorem 6.4. [Wiener's theorem] There exists a Brownian motion on some probability
space.

Proof. We will first prove the theorem in dimension d = 1: we will construct a process
(B_t, 0 ≤ t ≤ 1) and then extend it to the whole of R_+ and to higher dimensions.
Let D_0 = {0, 1} and D_n = {k2^{−n}, 0 ≤ k ≤ 2^n} for n ≥ 1, and let D = ∪_{n≥0} D_n be the set
of dyadic rational numbers in [0, 1]. Let (Z_d, d ∈ D) be a collection of independent random
variables distributed according to N(0, 1) on some probability space (Ω, F, P). We will first
construct (B_d, d ∈ D) inductively.
First set B_0 = 0 and B_1 = Z_1. Inductively, given that we have constructed (B_d, d ∈ D_{n−1})
satisfying the conditions of the definition, we build (B_d, d ∈ D_n) as follows:
Take d ∈ D_n \ D_{n−1} and let d_− = d − 2^{−n} and d_+ = d + 2^{−n}, so that d_−, d_+ are consecutive
dyadic numbers in D_{n−1}. We set
    B_d = (B_{d_−} + B_{d_+})/2 + Z_d/2^{(n+1)/2}.
Then we have
    B_d − B_{d_−} = (B_{d_+} − B_{d_−})/2 + Z_d/2^{(n+1)/2} and B_{d_+} − B_d = (B_{d_+} − B_{d_−})/2 − Z_d/2^{(n+1)/2}.   (6.1)
Setting N_d = (B_{d_+} − B_{d_−})/2 and N'_d = Z_d/2^{(n+1)/2}, we see by the induction hypothesis that N_d and
N'_d are independent centred Gaussian random variables with variance 2^{−n−1}. Therefore
    Cov(N_d + N'_d, N_d − N'_d) = var(N_d) − var(N'_d) = 0,
and hence the two new increments B_d − B_{d_−} and B_{d_+} − B_d, being Gaussian, are independent.
Indeed, all increments (B_d − B_{d−2^{−n}}) for d ∈ D_n are independent. To see this it suffices to
show that they are pairwise independent, as the vector of increments is Gaussian. Above
we showed that increments over consecutive intervals are independent. If they are defined
over intervals that are not consecutive, then notice that each increment is equal to half the
increment of the previous scale plus or minus an independent Gaussian random variable, by (6.1),
and hence this shows the claimed independence.
We have thus defined a process (B_d, d ∈ D) satisfying the properties of Brownian motion.
Let s ≤ t in D and notice that for every p > 0, since B_t − B_s ∼ N(0, t − s), we have
    E[|B_t − B_s|^p] = |t − s|^{p/2} E[|N|^p],
where N ∼ N(0, 1). Since N has moments of all orders, it follows by Kolmogorov's continuity
criterion, Theorem 3.19, that (B_d, d ∈ D) is α-Hölder continuous for all α < 1/2 a.s. Hence
in order to extend to the whole of [0, 1] we simply let, for t ∈ [0, 1],
    B_t = lim_{i→∞} B_{d_i},

where (d_i) is a sequence in D converging to t. It follows easily that (B_t, t ∈ [0, 1]) is α-Hölder
continuous for all α < 1/2 a.s.
Finally we will check that (B_t, t ∈ [0, 1]) has the properties of Brownian motion. We will
first prove the independence of the increments. Let 0 = t_0 < t_1 < ... < t_k and let
0 = t_0^n ≤ t_1^n ≤ ... ≤ t_k^n be dyadic rational numbers such that t_i^n → t_i as n → ∞ for each i.
By continuity (B_{t_1^n}, ..., B_{t_k^n}) converges a.s. to (B_{t_1}, ..., B_{t_k}) as n → ∞, while on the other
hand the increments (B_{t_j^n} − B_{t_{j−1}^n}, 1 ≤ j ≤ k) are independent Gaussian random variables
with variances (t_j^n − t_{j−1}^n, 1 ≤ j ≤ k). Then as n → ∞ we have
    E[exp(i Σ_{j=1}^k u_j (B_{t_j^n} − B_{t_{j−1}^n}))] = ∏_{j=1}^k e^{−(t_j^n − t_{j−1}^n) u_j²/2} → ∏_{j=1}^k e^{−(t_j − t_{j−1}) u_j²/2}.
By Lévy's convergence theorem we now see that the increments converge in distribution to
independent Gaussian random variables with respective variances t_j − t_{j−1}, which is thus the
distribution of (B_{t_j} − B_{t_{j−1}}, 1 ≤ j ≤ k), as desired.
To finish the proof we will construct Brownian motion indexed by R_+. To this end, take
a sequence (B_t^i, t ∈ [0, 1]) for i = 0, 1, ... of independent Brownian motions and glue them
together, more precisely by
    B_t = B_{t−⌊t⌋}^{⌊t⌋} + Σ_{i=0}^{⌊t⌋−1} B_1^i.
This defines a continuous random process B : [0, ∞) → R and it is easy to see from what
we have already shown that B satisfies the properties of a Brownian motion.
Finally, to construct Brownian motion in R^d we take d independent Brownian motions in 1
dimension, B^1, ..., B^d, and set B_t = (B_t^1, ..., B_t^d). Then it is straightforward to check that
B has the required properties.

Remark 6.5. The proof above gives that the Brownian paths are a.s. α-Hölder continuous
for all α < 1/2. However, a.s. there exists no interval [a, b] with a < b such that B is Hölder
continuous with exponent 1/2 on [a, b]. See example sheet for the last fact.
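The dyadic midpoint refinement in the proof translates directly into code; a sketch of Lévy's construction (variable names are mine):

```python
import random

def brownian_dyadic(levels, rng):
    """Levy's construction from the proof of Wiener's theorem: build B
    on the dyadic points of the given level in [0, 1] by midpoint
    refinement, B_d = (B_{d-} + B_{d+})/2 + Z_d / 2^{(n+1)/2}.
    Returns the list of values at k * 2^{-levels}, k = 0..2^levels."""
    b = [0.0, rng.gauss(0, 1)]          # values on D_0 = {0, 1}
    for n in range(1, levels + 1):
        new = []
        for i in range(len(b) - 1):
            mid = (b[i] + b[i + 1]) / 2 + rng.gauss(0, 1) / 2 ** ((n + 1) / 2)
            new += [b[i], mid]
        b = new + [b[-1]]               # b[k] is now the value at k * 2^{-n}
    return b

rng = random.Random(7)
path = brownian_dyadic(10, rng)         # values at k/1024, k = 0..1024
# the increment over [0, 1/2] should have variance 1/2
vals = [brownian_dyadic(5, rng)[16] for _ in range(3000)]  # samples of B_{1/2}
mean = sum(vals) / len(vals)
var = sum(v * v for v in vals) / len(vals) - mean * mean
```

The added noise at level n has variance 2^{−n−1}, exactly the quantity var(N'_d) appearing in the proof, which is what makes the refined marginals consistent.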

6.3  Invariance properties

The following invariance properties of Brownian motion will be used a lot.

Proposition 6.6. Let B be a standard Brownian motion in R^d.
1. If U is an orthogonal matrix, then UB = (U B_t, t ≥ 0) is again a standard Brownian
motion. In particular, −B is a standard Brownian motion.
2. If λ > 0, then (λ^{−1/2} B_{λt}, t ≥ 0) is a standard Brownian motion (scaling property).
3. For every s ≥ 0, the shifted process (B_{s+t} − B_s, t ≥ 0) is a standard Brownian motion
independent of F_s^B (simple Markov property).

Theorem 6.7. [Time inversion] Suppose that (B_t, t ≥ 0) is a standard Brownian motion.
Then the process (X_t, t ≥ 0) defined by
    X_0 = 0 and X_t = tB_{1/t} for t > 0
is also a standard Brownian motion.

Proof. The finite dimensional distributions (B_{t_1}, ..., B_{t_n}) of Brownian motion are Gaussian
random vectors and are therefore characterized by their means E[B_{t_i}] = 0 and covariances
Cov(B_{t_i}, B_{t_j}) = t_i for 0 ≤ t_i ≤ t_j.
So it suffices to show that the process X is a continuous Gaussian process with the same
means and covariances as Brownian motion. Clearly the vector (X_{t_1}, ..., X_{t_n}) is a centred
Gaussian vector. The covariances for s ≤ t are given by
    Cov(X_s, X_t) = st Cov(B_{1/s}, B_{1/t}) = st · (1/t) = s.
Hence X and B have the same finite dimensional distributions. The paths t ↦ X_t are clearly
continuous for t > 0, so it remains to show that they are also continuous at t = 0. First notice
that since X and B have the same finite dimensional distributions, (X_t, t ≥ 0, t ∈ Q) has
the same law as Brownian motion restricted to the rationals, and hence
    lim_{t↓0, t∈Q} X_t = 0 a.s.
Since Q_+ is dense and X is continuous for t > 0, we get that
    0 = lim_{t↓0, t∈Q} X_t = lim_{t↓0} X_t a.s.

Corollary 6.8. [Law of large numbers] Almost surely, lim_{t→∞} B_t/t = 0.

Proof. Let X_t be as defined in Theorem 6.7. Then
    lim_{t→∞} B_t/t = lim_{t→∞} X_{1/t} = X_0 = 0 a.s.

Remark 6.9. Of course one can show the above result directly using the strong law of large
numbers, i.e. lim_{n→∞} B_n/n = 0. Then one needs to show that B does not oscillate too much
between n and n + 1. See example sheet.

Definition 6.10. We define (F_t^B, t ≥ 0) to be the natural filtration of (B_t, t ≥ 0) and F_{s+}
the slightly augmented σ-algebra defined by
    F_{s+} = ∩_{t>s} F_t^B.
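The covariance computation Cov(X_s, X_t) = s in the proof of Theorem 6.7 can be checked by simulation; a sketch assuming s < t and sampling B at the times 1/t < 1/s from independent Gaussian increments (the parameters are arbitrary choices):

```python
import random, math

def sample_pair(s, t, rng):
    """Sample (X_s, X_t) = (s * B_{1/s}, t * B_{1/t}) for s < t by
    building B at times 1/t < 1/s from independent Gaussian increments."""
    b_inv_t = math.sqrt(1 / t) * rng.gauss(0, 1)
    b_inv_s = b_inv_t + math.sqrt(1 / s - 1 / t) * rng.gauss(0, 1)
    return s * b_inv_s, t * b_inv_t

rng = random.Random(3)
s, t, n = 0.5, 2.0, 50_000
pairs = [sample_pair(s, t, rng) for _ in range(n)]
cov = sum(x * y for x, y in pairs) / n     # should be close to s = min(s, t)
var_t = sum(y * y for _, y in pairs) / n   # should be close to t
```

The empirical covariance min(s, t) is exactly the Brownian covariance, which is the content of the finite-dimensional part of the proof.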

Remark 6.11. By the simple Markov property of Brownian motion, B_{t+s} − B_s is independent
of F_s^B. Clearly F_s^B ⊆ F_{s+} for all s, since in F_{s+} we allow an additional infinitesimal glance
into the future. But the next theorem shows that B_{t+s} − B_s is still independent of F_{s+}.

Theorem 6.12. For every s ≥ 0 the process (B_{t+s} − B_s, t ≥ 0) is independent of F_{s+}.

Proof. Let (s_n) be a strictly decreasing sequence converging to s as n → ∞. By continuity
    B_{t+s} − B_s = lim_n (B_{s_n+t} − B_{s_n}) a.s.
Let A ∈ F_{s+} and t_1, ..., t_m ≥ 0. For any F continuous and bounded on (R^d)^m we have, by
the dominated convergence theorem,
    E[F(B_{t_1+s} − B_s, ..., B_{t_m+s} − B_s) 1(A)] = lim_n E[F(B_{t_1+s_n} − B_{s_n}, ..., B_{t_m+s_n} − B_{s_n}) 1(A)].
Since A ∈ F_{s+}, we have that A ∈ F_{s_n}^B for all n, and hence by the simple Markov property we
obtain that for all n
    E[F(B_{t_1+s_n} − B_{s_n}, ..., B_{t_m+s_n} − B_{s_n}) 1(A)] = P(A) E[F(B_{t_1+s_n} − B_{s_n}, ..., B_{t_m+s_n} − B_{s_n})].
Therefore, taking the limit again we deduce that
    E[F(B_{t_1+s} − B_s, ..., B_{t_m+s} − B_s) 1(A)] = E[F(B_{t_1+s} − B_s, ..., B_{t_m+s} − B_s)] P(A),
proving the claimed independence.

Theorem 6.13. [Blumenthal's 0-1 law] The σ-algebra F_{0+} is trivial, i.e. if A ∈ F_{0+}, then
P(A) ∈ {0, 1}.

Proof. Let A ∈ F_{0+}. Then A ∈ σ(B_t, t ≥ 0), and hence by Theorem 6.12 we obtain that A
is independent of F_{0+}, i.e. it is independent of itself:
    P(A) = P(A ∩ A) = P(A)²,
which gives that P(A) ∈ {0, 1}.

Theorem 6.14. Suppose that (B_t)_{t≥0} is a standard Brownian motion in 1 dimension. Define
τ = inf{t > 0 : B_t > 0} and σ = inf{t > 0 : B_t = 0}. Then
    P(τ = 0) = P(σ = 0) = 1.

Proof. For all n we have
    {τ = 0} = ∩_{k≥n} {there exists 0 < ε < 1/k : B_ε > 0},
and thus {τ = 0} ∈ F_{1/n}^B for all n, and hence
    {τ = 0} ∈ F_{0+}.

Therefore, P(τ = 0) ∈ {0, 1}. It remains to show that it has positive probability. Clearly,
for all t > 0 we have
    P(τ ≤ t) ≥ P(B_t > 0) = 1/2.
Hence by letting t ↓ 0 we get that P(τ = 0) ≥ 1/2, and this finishes the proof. In exactly the
same way we get that
    inf{t > 0 : B_t < 0} = 0 a.s.
Since B is a continuous function, by the intermediate value theorem, we deduce that
    P(σ = 0) = 1.

Proposition 6.15. For d = 1 and t ≥ 0 let S_t = sup_{0≤s≤t} B_s and I_t = inf_{0≤s≤t} B_s.
1. For every ε > 0 we have
    S_ε > 0 and I_ε < 0 a.s.
In particular, a.s. there exists a zero of B in any interval of the form (0, ε), for all ε > 0.
2. A.s. we have
    sup_{t≥0} B_t = +∞ and inf_{t≥0} B_t = −∞.

Proof. 1. For all t > 0 we have that
    P(S_t > 0) ≥ P(B_t > 0) = 1/2.
Thus, if (t_n) is a sequence of real numbers decreasing to 0 as n → ∞, then by the reverse
Fatou lemma
    P(B_{t_n} > 0 i.o.) = P(lim sup_n {B_{t_n} > 0}) ≥ lim sup_n P(B_{t_n} > 0) = 1/2.
Clearly, the event {B_{t_n} > 0 i.o.} is in F_{0+}, since it is F_{t_k}^B-measurable for all k (notice that for
all k it does not depend on B_{t_1}, ..., B_{t_k}). By Blumenthal's 0-1 law we get that
    P(B_{t_n} > 0 i.o.) = 1,
and hence S_ε > 0 a.s. for all ε > 0.
The same is true for the infimum by considering −B, which is again a standard Brownian
motion.
2. By the scaling invariance of Brownian motion we get that
    S_∞ = sup_{t≥0} B_t = sup_{t≥0} λ^{−1/2} B_{λt} (d)= λ^{−1/2} sup_{t≥0} B_t = λ^{−1/2} S_∞.
Hence S_∞ (d)= λ^{−1/2} S_∞ for all λ > 0. Thus for all x > 0 the probability P(S_∞ ≥ x) is a constant
c, and hence
    P(S_∞ > 0) = c.
But we have already showed that P(S_∞ > 0) = 1. Therefore, for all x we have
    P(S_∞ ≥ x) = 1,
which gives that P(S_∞ = ∞) = 1. The claim for the infimum follows by applying this to −B.

Proposition 6.16. Let C be a cone in Rd with non-empty interior and origin at 0, i.e. a
set of the form {tu : t > 0, u A}, where A is a non-empty open subset of the unit sphere
of Rd . If
HC = inf{t > 0 : Bt C}
is the first hitting time of C, then HC = 0 a.s.
Proof. Since the cone C is invariant under multiplication by a positive scalar, by the scaling
invariance property of Brownian motion we get that for all t
P(Bt C) = P(B1 C).
Since C has non-empty interior, it is straightforward to check that
P(B1 C) > 0
and then we can finish the proof using Blumenthals 0-1 law as in the proposition above.

6.4 Strong Markov property

Let $(\mathcal F_t)_{t\ge 0}$ be a filtration. We say that a Brownian motion $B$ is an $(\mathcal F_t)$-Brownian motion if $B$ is adapted to $(\mathcal F_t)$ and $(B_{s+t} - B_s,\ t\ge 0)$ is independent of $\mathcal F_s$ for every $s\ge 0$.

In Proposition 3.3 we saw that the first hitting time of a closed set by a continuous process is always a stopping time. This is not true in general, though, for an open set. However, if we consider the right-continuous filtration $(\mathcal F_t^+)$, then we showed in Proposition 3.5 that the first hitting time of an open set by a continuous process is always an $(\mathcal F_t^+)$-stopping time. So, in what follows we will be working with the right-continuous filtration. As this filtration is larger, this choice produces more stopping times.
Theorem 6.17. [Strong Markov property] Let $T$ be an a.s. finite stopping time. Then the process
$$(B_{T+t} - B_T,\ t\ge 0)$$
is a standard Brownian motion independent of $\mathcal F_T^+$.

Proof. We will first prove the theorem for the stopping times $T_n = 2^{-n}\lceil 2^n T\rceil$ that discretely approximate $T$ from above. We write $B^{(k)}_t = B_{t+k2^{-n}} - B_{k2^{-n}}$, which is a Brownian motion, and $B^*$ for the process defined by
$$B^*(t) = B_{t+T_n} - B_{T_n}.$$
We will first show that $B^*$ is a Brownian motion independent of $\mathcal F_{T_n}^+$. Let $E \in \mathcal F_{T_n}^+$. For every event $\{B^* \in A\}$ we have
$$P(\{B^* \in A\}\cap E) = \sum_{k=0}^\infty P\big(\{B^{(k)} \in A\}\cap E \cap \{T_n = k2^{-n}\}\big) = \sum_{k=0}^\infty P(B^{(k)} \in A)\,P\big(E\cap\{T_n = k2^{-n}\}\big),$$
since by the simple Markov property $\{B^{(k)} \in A\}$ is independent of $\mathcal F_{k2^{-n}}^+$ and $E\cap\{T_n = k2^{-n}\} \in \mathcal F_{k2^{-n}}^+$. Since $B^{(k)}$ is a Brownian motion, $P(B^{(k)} \in A) = P(B \in A)$ does not depend on $k$, and hence
$$P(\{B^* \in A\}\cap E) = P(B \in A)\,P(E).$$
Taking $E$ to be the whole space gives that $B^*$ is a Brownian motion, and hence
$$P(\{B^* \in A\}\cap E) = P(B^* \in A)\,P(E)$$
for all $A$ and $E$, thus showing the claimed independence.
By the continuity of Brownian motion we get that
$$B_{t+s+T} - B_{s+T} = \lim_{n\to\infty}\big(B_{s+t+T_n} - B_{s+T_n}\big).$$
The increments $B_{t+s+T_n} - B_{s+T_n}$ are normally distributed with mean $0$ and variance equal to $t$. Thus for any $s\ge 0$ the increments $B_{t+s+T} - B_{s+T}$ are also normally distributed with mean $0$ and variance $t$. As the process $(B_{t+T} - B_T,\ t\ge 0)$ is a.s. continuous, it is a Brownian motion. It only remains to show that it is independent of $\mathcal F_T^+$.

Let $A \in \mathcal F_T^+$ and $t_1,\dots,t_k \ge 0$. We will show that for any continuous and bounded function $F: (\mathbb R^d)^k \to \mathbb R$ we have
$$E\big[1(A)\,F(B_{t_1+T}-B_T,\dots,B_{t_k+T}-B_T)\big] = P(A)\,E\big[F(B_{t_1+T}-B_T,\dots,B_{t_k+T}-B_T)\big].$$
Using the continuity again and the dominated convergence theorem, we get that
$$E\big[1(A)\,F(B_{t_1+T}-B_T,\dots,B_{t_k+T}-B_T)\big] = \lim_{n\to\infty} E\big[1(A)\,F(B_{t_1+T_n}-B_{T_n},\dots,B_{t_k+T_n}-B_{T_n})\big].$$
Since $T_n \ge T$, it follows that $A \in \mathcal F_{T_n}^+$. But we already showed that the process $(B_{t+T_n}-B_{T_n},\ t\ge 0)$ is independent of $\mathcal F_{T_n}^+$, hence using the continuity and dominated convergence one more time gives the claimed independence. $\Box$

Remark 6.18. Let $\tau = \inf\{t\ge 0 : B_t = \max_{0\le s\le 1} B_s\}$. It is intuitively clear that $\tau$ is not a stopping time. To prove this, first show that $\tau < 1$ a.s. If $\tau$ were a stopping time, the strong Markov property would make $(B_{\tau+t}-B_\tau,\ t\ge 0)$ a standard Brownian motion; but this increment is non-positive in a small neighbourhood of $0$, which contradicts the strong Markov property.

6.5 Reflection principle

Theorem 6.19. [Reflection principle] Let $T$ be an a.s. finite stopping time and $(B_t,\ t\ge 0)$ a standard Brownian motion. Then the process $(\widetilde B_t,\ t\ge 0)$ defined by
$$\widetilde B_t = B_t\,1(t\le T) + (2B_T - B_t)\,1(t>T)$$
is also a standard Brownian motion, and we call it the Brownian motion reflected at $T$.
Proof. By the strong Markov property, the process
$$B^{(T)} = (B_{T+t} - B_T,\ t\ge 0)$$
is a standard Brownian motion independent of $(B_t,\ 0\le t\le T)$. Also the process
$$-B^{(T)} = (B_T - B_{t+T},\ t\ge 0)$$
is a standard Brownian motion independent of $(B_t,\ 0\le t\le T)$. Therefore, the pair $\big((B_t,\ 0\le t\le T),\ B^{(T)}\big)$ has the same law as $\big((B_t,\ 0\le t\le T),\ -B^{(T)}\big)$.

We now define the concatenation operation at time $T$ between two continuous paths $X$ and $Y$ by
$$\Psi_T(X,Y)(t) = X_t\,1(t\le T) + (X_T + Y_{t-T})\,1(t>T).$$
Applying $\Psi_T$ to $B$ and $B^{(T)}$ gives us the Brownian motion $B$, while applying it to $B$ and $-B^{(T)}$ gives us the process $\widetilde B$.

Let $\mathcal A$ be the product $\sigma$-algebra on the space $C$ of continuous functions on $[0,\infty)$. It is easy to see that $\Psi_T$ is a measurable mapping from $(C\times C,\ \mathcal A\otimes\mathcal A)$ to $(C,\mathcal A)$ (by approximating $T$ by discrete stopping times). Hence $B$ and $\widetilde B$ have the same law. $\Box$

Corollary 6.20. [Reflection principle] Let $B$ be a standard Brownian motion in $1$ dimension, and let $b > 0$ and $a \le b$. Then, writing $S_t = \sup_{0\le s\le t} B_s$, we have that for every $t\ge 0$
$$P(S_t \ge b,\ B_t \le a) = P(B_t \ge 2b-a).$$

Proof. For any $x>0$ we define $T_x = \inf\{t\ge 0 : B_t = x\}$. Since $S_\infty = \sup_{t\ge 0} B_t = \infty$ a.s., we have that $T_x < \infty$ a.s. By the continuity of Brownian motion, we have that $B_{T_x} = x$ a.s. Clearly $\{S_t \ge b\} = \{T_b \le t\}$. By the reflection principle applied to $T_b$ we get
$$P(S_t \ge b,\ B_t \le a) = P(T_b \le t,\ 2b - B_t \ge 2b-a) = P(T_b \le t,\ \widetilde B_t \ge 2b-a),$$
since $\widetilde B_t = 2b - B_t$ when $t\ge T_b$. Since $a\le b$, the event $\{\widetilde B_t \ge 2b-a\}$ is contained in the event $\{T_b \le t\}$. Hence we get
$$P(S_t \ge b,\ B_t \le a) = P(\widetilde B_t \ge 2b-a) = P(B_t \ge 2b-a),$$
where the last equality follows again by the reflection principle ($\widetilde B$ is a standard Brownian motion). $\Box$
Corollary 6.21. For every $t\ge 0$ the variables $S_t$ and $|B_t|$ have the same law.

Proof. Let $a > 0$. Then by Corollary 6.20 (with $b=a$) we get that
$$P(S_t \ge a) = P(S_t \ge a,\ B_t \le a) + P(S_t \ge a,\ B_t > a) = P(B_t \ge a) + P(B_t > a) = 2P(B_t \ge a) = P(|B_t| \ge a),$$
since the event $\{B_t > a\}$ is contained in $\{S_t \ge a\}$. $\Box$

Exercise 6.22. Let $x > 0$ and $T_x = \inf\{t > 0 : B_t = x\}$. Then the random variable $T_x$ has the same law as $(x/B_1)^2$.
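The reflection principle of Corollary 6.20 holds exactly for the simple random walk: reflecting a path after its first visit to level $b$ gives a bijection between length-$n$ paths that hit $b$ and end at $a\le b$ and paths that end at $2b-a$. The following brute-force count (not part of the notes; the parameters $n=10$, $b=2$, $a=0$ are an arbitrary illustrative choice) verifies this.

```python
from itertools import product

def reflection_counts(n, b, a):
    """For length-n simple random walk paths from 0, count (i) paths whose
    running maximum reaches b and which end at a, and (ii) paths ending at 2b - a."""
    hit_and_end = 0
    end_reflected = 0
    for steps in product((-1, 1), repeat=n):
        pos, running_max = 0, 0
        for s in steps:
            pos += s
            running_max = max(running_max, pos)
        if running_max >= b and pos == a:
            hit_and_end += 1
        if pos == 2 * b - a:
            end_reflected += 1
    return hit_and_end, end_reflected

lhs, rhs = reflection_counts(10, 2, 0)
assert lhs == rhs  # discrete analogue of P(S_t >= b, B_t <= a) = P(B_t >= 2b - a)
```

Both counts equal $\binom{10}{7} = 120$, the number of $\pm1$ paths of length $10$ ending at $4$.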
6.6 Martingales for Brownian motion

Proposition 6.23. Let $(B_t,\ t\ge 0)$ be a standard Brownian motion in $1$ dimension. Then

(i) the process $(B_t,\ t\ge 0)$ is an $(\mathcal F_t^+)$-martingale,

(ii) the process $(B_t^2 - t,\ t\ge 0)$ is an $(\mathcal F_t^+)$-martingale.

Proof. (i) Let $s\le t$; then
$$E[B_t - B_s \mid \mathcal F_s^+] = E[B_t - B_s] = 0,$$
since the increment $B_t - B_s$ is independent of $\mathcal F_s^+$ by Theorem 6.12.

(ii) The process is adapted to the filtration $(\mathcal F_t^+)$ and, if $s\le t$, then
$$E[B_t^2 - t \mid \mathcal F_s^+] = E[(B_t-B_s)^2 \mid \mathcal F_s^+] + 2E[B_t B_s \mid \mathcal F_s^+] - E[B_s^2 \mid \mathcal F_s^+] - t = (t-s) + 2B_s^2 - B_s^2 - t = B_s^2 - s. \qquad\Box$$
Using the above proposition, one can show:

Proposition 6.24. Let $B$ be a standard Brownian motion in $1$ dimension and $x, y > 0$. Then
$$P(T_{-y} < T_x) = \frac{x}{x+y} \qquad\text{and}\qquad E[T_x \wedge T_{-y}] = xy.$$
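For the simple random walk (which is a Brownian motion observed at suitable hitting times) the same exit probability and expected exit time hold exactly, so Proposition 6.24 can be checked by simulation. This sketch is not part of the notes; the parameters $x=2$, $y=3$ and the sample size are arbitrary choices.

```python
import random

def exit_stats(x, y, trials, seed=0):
    """Simple random walk from 0, absorbed at x > 0 or -y < 0.
    Returns (fraction of trials hitting -y first, average absorption time)."""
    rng = random.Random(seed)
    hit_minus_y = 0
    total_time = 0
    for _ in range(trials):
        pos, t = 0, 0
        while -y < pos < x:
            pos += rng.choice((-1, 1))
            t += 1
        hit_minus_y += (pos == -y)
        total_time += t
    return hit_minus_y / trials, total_time / trials

p, m = exit_stats(2, 3, 20000)
# Proposition 6.24: P(T_{-y} < T_x) = x/(x+y) = 0.4 and E[T_x ∧ T_{-y}] = xy = 6
assert abs(p - 0.4) < 0.02 and abs(m - 6.0) < 0.3
```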
Proposition 6.25. Let $B$ be a standard Brownian motion in $d$ dimensions. Then for each $u = (u_1,\dots,u_d) \in \mathbb R^d$ the process
$$M_t^u = \exp\Big(\langle u, B_t\rangle - \frac{|u|^2 t}{2}\Big),\quad t\ge 0,$$
is an $(\mathcal F_t^+)$-martingale.

Proof. Integrability follows since $E[\exp(\langle u, B_t\rangle)] = \exp\big(t\sum_{i=1}^d u_i^2/2\big) < \infty$ for all $t\ge 0$. Let $s\le t$; then
$$E[M_t^u \mid \mathcal F_s^+] = e^{-|u|^2 t/2}\,E\big[\exp(\langle u, B_t - B_s + B_s\rangle)\mid\mathcal F_s^+\big] = e^{-|u|^2 t/2}\exp(\langle u, B_s\rangle)\,E\big[\exp(\langle u, B_t - B_s\rangle)\big],$$
where the last equality follows from Theorem 6.12. Since the increment $B_t - B_s$ is distributed according to $N(0, (t-s)I_d)$, we get that
$$E[M_t^u \mid \mathcal F_s^+] = M_s^u,$$
proving the martingale property. $\Box$
We saw above that if $f(x) = x^2$, then the right term to subtract from $f(B_t)$ in order to make it a martingale is $t$. More generally, we are now interested in finding what we need to subtract from $f(B_t)$ in order to obtain a martingale. Before stating the theorem for Brownian motion, let us look at a discrete-time analogue for a simple random walk on the integers. Let $(S_n)$ be the random walk. Then
$$E[f(S_{n+1}) \mid S_1,\dots,S_n] - f(S_n) = \frac12\big(f(S_n+1) - 2f(S_n) + f(S_n-1)\big) = \frac12\widetilde\Delta f(S_n),$$
where $\widetilde\Delta f(x) := f(x+1) - 2f(x) + f(x-1)$. Hence
$$f(S_n) - \frac12\sum_{k=0}^{n-1}\widetilde\Delta f(S_k)$$
defines a discrete-time martingale. In the Brownian motion case we expect a similar result with $\widetilde\Delta$ replaced by its continuous analogue, the Laplacian
$$\Delta f(x) = \sum_{i=1}^d \frac{\partial^2 f}{\partial x_i^2}.$$
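Since the expectation of the discrete martingale $M_n = f(S_n) - \frac12\sum_{k<n}\widetilde\Delta f(S_k)$ only involves the marginal laws of the walk, the identity $E[M_n] = f(0)$ can be verified exactly by propagating those laws. The sketch below (not part of the notes; the test function $f(x)=x^4$ is an arbitrary choice) does this for a simple random walk started at $0$.

```python
def srw_marginals(n_max):
    """Yield the law of the simple random walk S_n, n = 0..n_max,
    as dicts {position: probability}."""
    dist = {0: 1.0}
    yield dist
    for _ in range(n_max):
        new = {}
        for x, p in dist.items():
            new[x + 1] = new.get(x + 1, 0.0) + p / 2
            new[x - 1] = new.get(x - 1, 0.0) + p / 2
        dist = new
        yield dist

f = lambda x: x ** 4
lap = lambda x: f(x + 1) - 2 * f(x) + f(x - 1)   # discrete Laplacian of f
expect = lambda g, dist: sum(g(x) * p for x, p in dist.items())

dists = list(srw_marginals(6))
for n in range(7):
    # E[M_n] = E[f(S_n)] - (1/2) sum_{k<n} E[lap f(S_k)] should equal f(0) = 0
    m = expect(f, dists[n]) - 0.5 * sum(expect(lap, dists[k]) for k in range(n))
    assert abs(m) < 1e-9
```

For $f(x)=x^4$ this is a non-trivial check: $E[S_n^4] = 3n^2 - 2n$ grows, but is exactly compensated by the running sum of $\frac12\widetilde\Delta f(S_k) = 6S_k^2 + 1$.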

Theorem 6.26. Let $B$ be a Brownian motion in $\mathbb R^d$. Let $f: \mathbb R_+\times\mathbb R^d \to \mathbb R$ be continuously differentiable in the variable $t$ and twice continuously differentiable in the variable $x$. Suppose in addition that $f$ and its derivatives up to second order are bounded. Then the process
$$M_t = f(t, B_t) - f(0, B_0) - \int_0^t \Big(\frac{\partial}{\partial s} + \frac12\Delta\Big) f(s, B_s)\,ds,\quad t\ge 0,$$
is an $(\mathcal F_t^+)$-martingale.
Proof. Integrability follows trivially from the assumptions on the boundedness of $f$ and its derivatives.

We will now show the martingale property. Let $s, t\ge 0$. Then
$$M_{t+s} - M_s = f(t+s, B_{t+s}) - f(s, B_s) - \int_s^{s+t}\Big(\frac{\partial}{\partial r} + \frac12\Delta\Big)f(r, B_r)\,dr = f(t+s, B_{t+s}) - f(s, B_s) - \int_0^t\Big(\frac{\partial}{\partial r} + \frac12\Delta\Big)f(r+s, B_{r+s})\,dr.$$
Since $B_{t+s} - B_s$ is independent of $\mathcal F_s^+$ by Theorem 6.12 and $B_s$ is $\mathcal F_s^+$-measurable, writing $p_s(z,y) = (2\pi s)^{-d/2}e^{-|z-y|^2/(2s)}$ for the transition density in time $s$, we have (check!)
$$E[f(t+s, B_{t+s}) \mid \mathcal F_s^+] = E[f(t+s, B_{t+s}-B_s+B_s)\mid\mathcal F_s^+] = \int_{\mathbb R^d} f(t+s, B_s + x)\,p_t(0,x)\,dx.$$
Now notice that by the boundedness assumption on $f$ and all its derivatives,
$$E\Big[\int_0^t\Big(\frac{\partial}{\partial r}+\frac12\Delta\Big)f(r+s, B_{r+s})\,dr \,\Big|\, \mathcal F_s^+\Big] = \int_0^t E\Big[\Big(\frac{\partial}{\partial r}+\frac12\Delta\Big)f(r+s, B_{r+s})\,\Big|\,\mathcal F_s^+\Big]\,dr$$
(check!, using Fubini's theorem and the definition of conditional expectation). Using again the fact that $B_{r+s}-B_s$ is independent of $\mathcal F_s^+$, we get
$$E\Big[\Big(\frac{\partial}{\partial r}+\frac12\Delta\Big)f(r+s, B_{r+s}-B_s+B_s)\,\Big|\,\mathcal F_s^+\Big] = \int_{\mathbb R^d}\Big(\frac{\partial}{\partial r}+\frac12\Delta\Big)f(r+s, x+B_s)\,p_r(0,x)\,dx.$$
By the boundedness of $f$ and its derivatives, using the dominated convergence theorem we deduce
$$\int_0^t\int_{\mathbb R^d}\Big(\frac{\partial}{\partial r}+\frac12\Delta\Big)f(r+s, x+B_s)\,p_r(0,x)\,dx\,dr = \lim_{\varepsilon\to 0}\int_\varepsilon^t\int_{\mathbb R^d}\Big(\frac{\partial}{\partial r}+\frac12\Delta\Big)f(r+s, x+B_s)\,p_r(0,x)\,dx\,dr.$$
Using integration by parts (twice in the space variable, once in time) in this last integral, together with Fubini's theorem, it is equal to
$$-\int_{\mathbb R^d}\int_\varepsilon^t \frac{\partial}{\partial r}p_r(0,x)\,f(r+s, x+B_s)\,dr\,dx + \int_{\mathbb R^d}\big(f(t+s, B_s+x)p_t(0,x) - f(\varepsilon+s, x+B_s)p_\varepsilon(0,x)\big)\,dx + \int_\varepsilon^t\int_{\mathbb R^d}\frac12\Delta p_r(0,x)\,f(r+s, x+B_s)\,dx\,dr.$$
The transition density $p_r(0,x)$ satisfies the heat equation, i.e. $(\partial_r - \frac12\Delta)p = 0$, and hence this last expression is equal to
$$\int_{\mathbb R^d}\big(f(t+s, B_s+x)p_t(0,x) - f(\varepsilon+s, x+B_s)p_\varepsilon(0,x)\big)\,dx.$$
Now notice that as $\varepsilon\to 0$ we get
$$\lim_{\varepsilon\to 0}\int_{\mathbb R^d} f(\varepsilon+s, x+B_s)\,p_\varepsilon(0,x)\,dx = f(s, B_s),$$
since the limit above is equal to $\lim_{\varepsilon\to 0}E[f(s+\varepsilon, B_{s+\varepsilon})\mid\mathcal F_s^+]$, which by the continuity of the Brownian motion and of $f$ and by the conditional dominated convergence theorem is equal to $f(s, B_s)$.

Combining the displays above, we have shown that
$$E[M_{t+s} - M_s \mid \mathcal F_s^+] = 0 \quad\text{a.s.},$$
and this finishes the proof. $\Box$

6.7 Recurrence and transience

We note that if a Brownian motion starts from $x \in \mathbb R^d$, i.e. $B_0 = x$, then $B$ can be written as
$$B_t = x + \widetilde B_t,$$
where $\widetilde B$ is a standard Brownian motion. We will write $P_x$ to indicate that the Brownian motion starts from $x$, i.e. under $P_x$ the process $(B_t - x,\ t\ge 0)$ is a standard Brownian motion.
Theorem 6.27. Let $B$ be a Brownian motion in $d\ge 1$ dimensions.

(i) If $d = 1$, then $B$ is point-recurrent, in the sense that for all $x$, almost surely the set
$$\{t\ge 0 : B_t = x\}$$
is unbounded.

(ii) If $d = 2$, then $B$ is neighbourhood-recurrent, in the sense that for every $x, z$, $P_x$-a.s. the set
$$\{t\ge 0 : |B_t - z| \le \varepsilon\}$$
is unbounded for every $\varepsilon > 0$. However, $B$ does not hit points, i.e. for every $x \in \mathbb R^2$,
$$P_0(\exists\, t > 0 : B_t = x) = 0.$$

(iii) If $d\ge 3$, then $B$ is transient, in the sense that
$$|B_t| \to \infty \text{ as } t\to\infty,\quad P_0\text{-a.s.}$$

Proof. (i) This is a consequence of Proposition 6.15, since
$$\limsup_{t\to\infty} B_t = \infty = -\liminf_{t\to\infty} B_t.$$
(ii) Note that it suffices to show the claim for $z = 0$.

Let $\phi \in C_b^2(\mathbb R^2)$ be such that
$$\phi(y) = \log|y|, \quad\text{for } \varepsilon \le |y| \le R,$$
where $R > \varepsilon > 0$. Note that $\Delta\phi(y) = 0$ for $\varepsilon \le |y| \le R$. Let the Brownian motion start from $x$, i.e. $B_0 = x$ with $\varepsilon < |x| < R$.

By Theorem 6.26 the process
$$M_t = \phi(B_t) - \int_0^t \frac12\Delta\phi(B_s)\,ds$$
is a martingale.

We now set $S_\varepsilon = \inf\{t\ge 0 : |B_t| = \varepsilon\}$ and $T_R = \inf\{t\ge 0 : |B_t| = R\}$. Then $H = S_\varepsilon\wedge T_R$ is an a.s. finite stopping time and $(M_{t\wedge H})_{t\ge 0} = (\log|B_{t\wedge H}|,\ t\ge 0)$ is a bounded martingale. By the optional stopping theorem, since $H < \infty$ a.s., we thus obtain that
$$E_x[\log|B_H|] = \log|x|,$$
or equivalently
$$\log(\varepsilon)\,P_x(S_\varepsilon < T_R) + \log(R)\,P_x(T_R < S_\varepsilon) = \log|x|,$$
which gives that
$$P_x(S_\varepsilon < T_R) = \frac{\log R - \log|x|}{\log R - \log\varepsilon}. \qquad(6.2)$$
Letting $R\to\infty$ we have that $T_R\to\infty$ a.s., and hence $P_x(S_\varepsilon < \infty) = 1$, which shows that
$$P_x(|B_t| \le \varepsilon \text{ for some } t > 0) = 1.$$
Applying the Markov property at time $n$ we get
$$P_x(|B_t| \le \varepsilon \text{ for some } t > n) = P_x(|B_{t+n} - B_n + B_n| \le \varepsilon \text{ for some } t > 0) = \int_{\mathbb R^2} P_0(|B_t + y| \le \varepsilon \text{ for some } t > 0)\,P_x(B_n \in dy) = \int_{\mathbb R^2} P_y(|B_t| \le \varepsilon \text{ for some } t > 0)\,P_x(B_n \in dy),$$
where $P_x(B_n \in dy)$ is the law of $B_n$ under $P_x$. Since we showed above that for all $z$
$$P_z(|B_t| \le \varepsilon \text{ for some } t > 0) = 1,$$
we deduce that $P_x(|B_t| \le \varepsilon \text{ for some } t > n) = 1$ for all $x$. Therefore the set $\{t\ge 0 : |B_t| \le \varepsilon\}$ is unbounded $P_x$-a.s.

Letting $\varepsilon\to 0$ in (6.2) gives that the probability of hitting $0$ before hitting the boundary of the ball around $0$ of radius $R$ is $0$. Therefore, letting $R\to\infty$ gives that the probability of ever hitting $0$ is $0$, i.e. for all $x\ne 0$,
$$P_x(B_t = 0 \text{ for some } t > 0) = 0.$$
We only need to show now that $P_0(B_t = 0 \text{ for some } t > 0) = 0$. Applying again the Markov property at $a > 0$ we get
$$P_0(B_t = 0 \text{ for some } t\ge a) = \int_{\mathbb R^2} P_0(B_{t+a} - B_a + y = 0 \text{ for some } t\ge 0)\,P_0(B_a \in dy) = \int_{\mathbb R^2} P_y(B_t = 0 \text{ for some } t\ge 0)\,\frac{1}{2\pi a}e^{-|y|^2/(2a)}\,dy = 0,$$
since for all $y\ne 0$ we have already proved that $P_y(B_t = 0 \text{ for some } t > 0) = 0$ (and the point $y = 0$ has zero Lebesgue measure). Thus, since $P_0(B_t = 0 \text{ for some } t\ge a) = 0$ for all $a > 0$, letting $a\to 0$ we deduce that
$$P_0(B_t = 0 \text{ for some } t > 0) = 0.$$
(iii) Since the first three components of a Brownian motion in $\mathbb R^d$ form a Brownian motion in $\mathbb R^3$, it suffices to treat the case $d = 3$. As we did above, let $f \in C_b^2(\mathbb R^3)$ be such that
$$f(y) = \frac{1}{|y|}, \quad\text{for } \varepsilon \le |y| \le R.$$
Note that $\Delta f(y) = 0$ for $\varepsilon \le |y| \le R$. Let $B_0 = x$ with $\varepsilon \le |x| \le R$. If we define again $S_\varepsilon$ and $T_R$ as above, the same argument shows that
$$P_x(S_\varepsilon < T_R) = \frac{|x|^{-1} - R^{-1}}{\varepsilon^{-1} - R^{-1}}.$$
As $R\to\infty$ this converges to $\varepsilon/|x|$, which is the probability of ever visiting the ball centred at $0$ of radius $\varepsilon$ when starting from $x$ with $|x| \ge \varepsilon$.

We will now show that $P_0(|B_t|\to\infty \text{ as } t\to\infty) = 1$. Let $T_r = \inf\{t > 0 : |B_t| = r\}$ for $r > 0$. We define the events
$$A_n = \{|B_t| > n \text{ for all } t\ge T_{n^3}\}.$$
By the unboundedness of Brownian motion, it is clear that $P_0(T_{n^3} < \infty) = 1$. Applying the strong Markov property at the time $T_{n^3}$ we obtain
$$P_0(A_n^c) = P_0\big(|B_{t+T_{n^3}} - B_{T_{n^3}} + B_{T_{n^3}}| \le n \text{ for some } t\ge 0\big) = E_0\big[P_{B_{T_{n^3}}}(T_n < \infty)\big] = \frac{n}{n^3} = \frac{1}{n^2}.$$
Since the right-hand side is summable, by the Borel-Cantelli lemma only finitely many of the events $A_n^c$ occur, which implies that $|B_t|$ diverges to $\infty$ as $t\to\infty$. $\Box$
6.8 Brownian motion and the Dirichlet problem

Definition 6.28. We call a connected open subset $D$ of $\mathbb R^d$ a domain. We say that $D$ satisfies the Poincaré cone condition at $x \in \partial D$ (the boundary of $D$) if there exists a non-empty open cone $C$ with origin at $x$ such that $C \cap B(x,r) \subseteq D^c$ for some $r > 0$.

Theorem 6.29. [Dirichlet problem] Let $D$ be a bounded domain in $\mathbb R^d$ such that every boundary point satisfies the Poincaré cone condition, and suppose that $\varphi$ is a continuous function on $\partial D$. We let $\tau(\partial D) = \inf\{t\ge 0 : B_t \in \partial D\}$, which is an almost surely finite stopping time when starting in $D$. Then the function $u: \overline D \to \mathbb R$ given by
$$u(x) = E_x[\varphi(B_{\tau(\partial D)})], \quad\text{for } x \in \overline D,$$
is the unique continuous function satisfying
$$\Delta u = 0 \ \text{on } D, \qquad u(x) = \varphi(x) \ \text{for } x \in \partial D.$$

Before solving the Dirichlet problem we state a well-known result; for the proof we refer the reader to [1, Theorem 3.2].
Theorem 6.30. Let $D$ be a domain in $\mathbb R^d$ and $u: D\to\mathbb R$ measurable and locally bounded. The following conditions are equivalent:

(i) $u$ is twice continuously differentiable and $\Delta u = 0$;

(ii) for any ball $B(x,r) \subseteq D$ we have
$$u(x) = \frac{1}{\mathcal L(B(x,r))}\int_{B(x,r)} u(y)\,dy,$$
where $\mathcal L$ denotes Lebesgue measure;

(iii) for any ball $B(x,r) \subseteq D$ we have
$$u(x) = \frac{1}{\sigma_{x,r}(\partial B(x,r))}\int_{\partial B(x,r)} u(y)\,d\sigma_{x,r}(y),$$
where $\sigma_{x,r}$ is the surface area measure on $\partial B(x,r)$.

Definition 6.31. A function satisfying one of the equivalent conditions of Theorem 6.30 is called harmonic in $D$.

The next theorem and the corollary following it will be used in the uniqueness part of the proof of Theorem 6.29.
Theorem 6.32. [Maximum principle] Suppose that $u$ is a harmonic function on a domain $D \subseteq \mathbb R^d$.

(i) If $u$ attains its maximum in $D$, then $u$ is constant on $D$.

(ii) If $u$ is continuous on $\overline D$ and $D$ is bounded, then
$$\max_{x\in\overline D} u(x) = \max_{x\in\partial D} u(x).$$

Proof. (i) Let $M$ be the maximum. Then the set $V = \{x\in D : u(x) = M\}$ is relatively closed in $D$ (if $x_n$ is a sequence of points in $V$ converging to $x\in D$, then $x\in V$), since $u$ is continuous. Since $D$ is open, for any $x\in V$ there exists $r > 0$ such that $B(x,r)\subseteq D$. From Theorem 6.30 we have
$$M = u(x) = \frac{1}{\mathcal L(B(x,r))}\int_{B(x,r)} u(y)\,dy \le M.$$
We thus deduce that $u(y) = M$ for almost all $y\in B(x,r)$. But since $u$ is continuous, this gives that $u(y) = M$ for all $y\in B(x,r)$. Therefore $B(x,r)\subseteq V$, hence $V$ is also open, and by assumption non-empty. But since $D$ is connected, we must have $V = D$. Hence $u$ is constant on $D$.

(ii) Since $u$ is continuous and $\overline D$ is closed and bounded, $u$ attains a maximum on $\overline D$. By (i), the maximum has to be attained on $\partial D$. $\Box$

Corollary 6.33. Suppose that $u_1, u_2$ are functions harmonic on a bounded domain $D$ and continuous on $\overline D$. If $u_1$ and $u_2$ agree on $\partial D$, then they are identical.

Proof. By Theorem 6.32(ii) applied to $u_1 - u_2$ we obtain that
$$\max_{x\in\overline D}(u_1(x) - u_2(x)) = \max_{x\in\partial D}(u_1(x) - u_2(x)) = 0,$$
and hence $u_1(x)\le u_2(x)$ for all $x\in\overline D$. In the same way $u_2(x)\le u_1(x)$ for all $x\in\overline D$. Hence $u_1 = u_2$ on $\overline D$. $\Box$
Proof of Theorem 6.29. Since the domain $D$ is bounded, $u$ is bounded. We will first show that $\Delta u = 0$ on $D$, by showing that $u$ satisfies condition (iii) of Theorem 6.30. Let $x\in D$. Then there exists $\delta > 0$ such that $\overline{B(x,\delta)}\subseteq D$. Let $\tau_\delta = \inf\{t > 0 : B_t\notin B(x,\delta)\}$. This is an a.s. finite stopping time, and applying the strong Markov property at $\tau_\delta$ we get
$$u(x) = E_x[\varphi(B_{\tau(\partial D)})] = E_x\big[E_x[\varphi(B_{\tau(\partial D)})\mid\mathcal F_{\tau_\delta}]\big] = E_x\big[E_{B_{\tau_\delta}}[\varphi(B_{\tau(\partial D)})]\big] = E_x[u(B_{\tau_\delta})] = \frac{1}{\sigma_{x,\delta}(\partial B(x,\delta))}\int_{\partial B(x,\delta)} u(y)\,d\sigma_{x,\delta}(y),$$
since by rotational invariance $B_{\tau_\delta}$ is uniformly distributed on $\partial B(x,\delta)$. The uniqueness now follows from Corollary 6.33.

It remains to show that $u$ is continuous on $\overline D$. Clearly $u$ is continuous on $D$, so we only need to show that $u$ is continuous on $\partial D$. Let $z\in\partial D$. Since the domain $D$ satisfies the Poincaré cone condition, there exist $h > 0$ and a non-empty open cone $C_z$ with origin at $z$ such that $C_z\cap B(z,h)\subseteq D^c$.

Since $\varphi$ is continuous on $\partial D$, for every $\varepsilon > 0$ there exists $0 < \delta\le h$ such that if $|y - z|\le\delta$ and $y\in\partial D$, then $|\varphi(y) - \varphi(z)| < \varepsilon$. Let $x$ be such that $|x - z|\le 2^{-k}\delta$ for some $k > 0$. Then we have
$$|u(x) - u(z)| = \big|E_x[\varphi(B_{\tau(\partial D)})] - \varphi(z)\big| \le E_x\big[|\varphi(B_{\tau(\partial D)}) - \varphi(z)|\big] \le \varepsilon\,P_x\big(\tau(\partial D) < \tau_{\partial B(z,\delta)}\big) + 2\|\varphi\|_\infty\,P_x\big(\tau_{\partial B(z,\delta)} < \tau(\partial D)\big) \le \varepsilon + 2\|\varphi\|_\infty\,P_x\big(\tau_{\partial B(z,\delta)} < H_{C_z}\big).$$
Now we note that
$$P_x\big(\tau_{\partial B(z,\delta)} < H_{C_z}\big) \le a^k$$
for some $a < 1$, since by scaling the walk must cross each of the $k$ dyadic annuli between $|x-z|$ and $\delta$ without hitting the cone, and each crossing succeeds with probability at most $a$, uniformly. Thus by choosing $k$ large enough, we can make this last probability as small as we like, and hence this completes the proof of continuity. $\Box$
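The representation $u(x) = E_x[\varphi(B_{\tau(\partial D)})]$ has an exact discrete counterpart that is easy to test: for a simple random walk on a grid square, $E_x[\varphi(\text{exit point})]$ equals the value at $x$ of any function that is harmonic for the discrete Laplacian, and $\varphi(x,y) = x^2 - y^2$ is such a function. The sketch below (not part of the notes; the square half-width $5$, start $(1,2)$ and sample size are arbitrary choices) checks this by Monte Carlo.

```python
import random

def walk_estimate(start, half_width, phi, trials, seed=1):
    """Monte Carlo estimate of E_start[phi(exit point)] for simple random walk
    on the grid square {-half_width..half_width}^2, stopped on the boundary."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        x, y = start
        while abs(x) < half_width and abs(y) < half_width:
            dx, dy = rng.choice(((1, 0), (-1, 0), (0, 1), (0, -1)))
            x, y = x + dx, y + dy
        total += phi(x, y)
    return total / trials

phi = lambda x, y: x * x - y * y          # discretely harmonic in 2 dimensions
est = walk_estimate((1, 2), 5, phi, 40000)
assert abs(est - phi(1, 2)) < 0.5         # phi(1, 2) = -3
```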
We will now give an example where the domain does not satisfy the conditions of Theorem 6.29 and the function $u$ as defined there fails to solve the Dirichlet problem.

Example 6.34. Let $v$ be a solution of the Dirichlet problem on $B(0,1)$ with boundary condition $\varphi: \partial B(0,1)\to\mathbb R$. We now let $D = \{x\in\mathbb R^2 : 0 < |x| < 1\}$ be the punctured disc. We will show that the function $u(x) = E_x[\varphi(B_{\tau(\partial D)})]$ given by Theorem 6.29 fails to solve the problem on $D$ with boundary condition $\varphi: \partial B(0,1)\cup\{0\}\to\mathbb R$ if $\varphi(0)\ne v(0)$. Indeed, since planar Brownian motion does not hit points, the first hitting time of $\partial D = \partial B(0,1)\cup\{0\}$ is a.s. equal to the first hitting time of $\partial B(0,1)$. Therefore
$$u(0) = E_0[\varphi(B_{\tau(\partial D)})] = v(0)\ne\varphi(0).$$
6.9 Donsker's invariance principle

In this section we will show that Brownian motion is the scaling limit of random walks with steps of mean $0$ and finite variance. This can be seen as a generalization of the central limit theorem to processes.

For a function $f\in C([0,1],\mathbb R)$ we define its uniform norm $\|f\|_\infty = \sup_{t\in[0,1]}|f(t)|$. The uniform norm makes $C([0,1],\mathbb R)$ into a metric space, so we can consider weak convergence of probability measures on it. The associated Borel $\sigma$-algebra coincides with the $\sigma$-algebra generated by the coordinate functions.

Theorem 6.35. [Donsker's invariance principle] Let $(X_n,\ n\ge 1)$ be a sequence of $\mathbb R$-valued integrable independent random variables with common law $\mu$ such that
$$\int x\,d\mu(x) = 0 \quad\text{and}\quad \int x^2\,d\mu(x) = \sigma^2\in(0,\infty).$$
Let $S_0 = 0$ and $S_n = X_1 + \dots + X_n$, and define a continuous process that interpolates linearly between the values of $S$, namely
$$S_t = (1 - \{t\})S_{[t]} + \{t\}S_{[t]+1},\quad t\ge 0,$$
where $[t]$ denotes the integer part of $t$ and $\{t\} = t - [t]$. Then $S^{[N]} := \big((\sigma^2 N)^{-1/2}S_{Nt},\ 0\le t\le 1\big)$ converges in distribution to a standard Brownian motion between times $0$ and $1$, i.e. for every bounded continuous function $F: C([0,1],\mathbb R)\to\mathbb R$,
$$E[F(S^{[N]})]\to E[F(B)] \quad\text{as } N\to\infty.$$

Remark 6.36. Note that from Donsker's theorem we can infer that $(\sigma^2 N)^{-1/2}\sup_{0\le n\le N}S_n$ converges to $\sup_{0\le t\le 1}B_t$ in distribution as $N\to\infty$, since the function $f\mapsto\sup f$ is a continuous operation on $C([0,1],\mathbb R)$.
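Combining Remark 6.36 with Corollary 6.21 gives a numerical check: since $\sup_{0\le t\le 1}B_t$ has the law of $|B_1|$, the rescaled maximum of any centred finite-variance walk should have mean close to $E|B_1| = \sqrt{2/\pi}\approx 0.798$. The sketch below (not part of the notes; Gaussian steps and the sample sizes are arbitrary choices) illustrates this, up to a small discretization bias of order $N^{-1/2}$.

```python
import math
import random

def scaled_max_mean(n_steps, trials, seed=2):
    """Mean of N^{-1/2} * max_{0<=k<=N} S_k for a walk with N(0,1) steps."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        s = 0.0
        running_max = 0.0
        for _ in range(n_steps):
            s += rng.gauss(0.0, 1.0)
            running_max = max(running_max, s)
        total += running_max / math.sqrt(n_steps)
    return total / trials

m = scaled_max_mean(1000, 5000)
assert abs(m - math.sqrt(2 / math.pi)) < 0.06   # E[sup_{[0,1]} B] = sqrt(2/pi)
```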
The proof of Theorem 6.35 that we will give uses a coupling of the random walk with the Brownian motion, called the Skorokhod embedding theorem. It is, however, specific to dimension $d = 1$.

Theorem 6.37. [Skorokhod embedding for random walks] Let $\mu$ be a probability measure on $\mathbb R$ of mean $0$ and variance $\sigma^2 < \infty$. Then there exists a probability space $(\Omega,\mathcal F,P)$ with filtration $(\mathcal F_t)_{t\ge 0}$, on which are defined a Brownian motion $(B_t)_{t\ge 0}$ and a sequence of stopping times
$$0 = T_0\le T_1\le T_2\le\dots$$
such that, setting $S_n = B_{T_n}$,

(i) $(T_n)_{n\ge 0}$ is a random walk with steps of mean $\sigma^2$,

(ii) $(S_n)_{n\ge 0}$ is a random walk with step distribution $\mu$.
Proof. Define Borel measures $\mu^\pm$ on $[0,\infty)$ by
$$\mu^\pm(A) = \mu(\pm A),\quad A\in\mathcal B([0,\infty)).$$
There exists a probability space on which are defined a Brownian motion $(B_t)_{t\ge 0}$ and a sequence $((X_n,Y_n):n\in\mathbb N)$ of independent random variables in $\mathbb R^2$ with law given by
$$\nu(dx,dy) = C(x+y)\,\mu^-(dx)\,\mu^+(dy),$$
where $C$ is a suitable normalizing constant. Set $\mathcal F_0 = \sigma(X_n,Y_n : n\in\mathbb N)$ and $\mathcal F_t = \sigma(\mathcal F_0,\mathcal F_t^B)$. Set $T_0 = 0$ and define inductively for $n\ge 0$
$$T_{n+1} = \inf\{t\ge T_n : B_t - B_{T_n} = -X_{n+1} \text{ or } Y_{n+1}\}.$$
Then $T_n$ is a stopping time for all $n$. Note that, since $\mu$ has mean $0$, we must have
$$C\int_{[0,\infty)} x\,\mu^-(dx) = C\int_{[0,\infty)} y\,\mu^+(dy) = 1.$$
Write $T = T_1$, $X = X_1$ and $Y = Y_1$. By Proposition 6.24, conditional on $X = x$ and $Y = y$, we have $T < \infty$ a.s. and
$$P(B_T = Y\mid X, Y) = \frac{X}{X+Y} \quad\text{and}\quad E[T\mid X,Y] = XY.$$
So, for $A\in\mathcal B((0,\infty))$,
$$P(B_T\in A) = \int_0^\infty\!\!\int_0^\infty 1_A(y)\,\frac{x}{x+y}\,C(x+y)\,\mu^-(dx)\,\mu^+(dy) = \mu^+(A),$$
so $P(B_T\in A) = \mu(A)$. A similar argument shows that this identity holds also for $A\in\mathcal B((-\infty,0])$. Next,
$$E[T] = \int_0^\infty\!\!\int_0^\infty xy\,C(x+y)\,\mu^-(dx)\,\mu^+(dy) = \int_{[0,\infty)} x^2\,\mu^-(dx) + \int_{[0,\infty)} y^2\,\mu^+(dy) = \sigma^2.$$
Now by the strong Markov property, for each $n\ge 0$ the process $(B_{T_n+t}-B_{T_n})_{t\ge 0}$ is a Brownian motion independent of $\mathcal F_{T_n}^B$. So by the above argument $B_{T_{n+1}}-B_{T_n}$ has law $\mu$, $T_{n+1}-T_n$ has mean $\sigma^2$, and both are independent of $\mathcal F_{T_n}^B$. The result follows. $\Box$
Proof of Theorem 6.35. We assume for this proof that $\sigma = 1$; this is enough by scaling. Let $(B_t)_{t\ge 0}$ be a Brownian motion and $(T_n)_{n\ge 1}$ the sequence of stopping times constructed in Theorem 6.37. Then $(B_{T_n})$ is a random walk with the same distribution as $(S_n)$. Let $(S_t)_{t\ge 0}$ be the linear interpolation between the values of $(S_n)$.

For each $N\ge 1$ we set
$$B_t^{(N)} = \sqrt N\, B_{N^{-1}t},$$
which by the scaling invariance property of Brownian motion is again a Brownian motion. We now perform the Skorokhod embedding construction with $(B_t)_{t\ge 0}$ replaced by $(B_t^{(N)})_{t\ge 0}$ to obtain stopping times $T_n^{(N)}$. We then set $S_n^{(N)} = B^{(N)}_{T_n^{(N)}}$ and interpolate linearly to form $(S_t^{(N)})_{t\ge 0}$. Clearly, for all $N$ we have
$$\big((T_n^{(N)})_{n\ge 0},\ (S_t^{(N)})_{t\ge 0}\big) \overset{(d)}{=} \big((T_n)_{n\ge 0},\ (S_t)_{t\ge 0}\big).$$
Next we set $\widetilde T_n^{(N)} = N^{-1}T_n^{(N)}$ and $\widetilde S_t^{(N)} = N^{-1/2}S^{(N)}_{Nt}$. Then
$$(\widetilde S_t^{(N)})_{t\ge 0} \overset{(d)}{=} (S_t^{[N]})_{t\ge 0}$$
and $\widetilde S^{(N)}_{n/N} = B_{\widetilde T_n^{(N)}}$ for all $n$. We need to show that for all bounded continuous functions $F: C([0,1],\mathbb R)\to\mathbb R$, as $N\to\infty$,
$$E[F(S^{[N]})]\to E[F(B)].$$
In fact we will show that for all $\varepsilon > 0$ we have
$$P\Big(\sup_{0\le t\le 1}\big|\widetilde S_t^{(N)} - B_t\big| > \varepsilon\Big)\to 0.$$
Since $F$ is continuous, this implies that $F(\widetilde S^{(N)})\to F(B)$ in probability, which by bounded convergence is enough.

Since $(T_n)$ is a random walk with increments of mean $1$, by the strong law of large numbers we have that a.s.
$$\frac{T_n}{n}\to 1 \quad\text{as } n\to\infty.$$
So we have that a.s.
$$N^{-1}\sup_{n\le N}|T_n - n|\to 0 \quad\text{as } N\to\infty.$$
Hence for all $\delta > 0$ we have that, as $N\to\infty$,
$$P\Big(\sup_{n\le N}\big|\widetilde T_n^{(N)} - n/N\big| > \delta\Big)\to 0.$$
Since $\widetilde S^{(N)}_{n/N} = B_{\widetilde T_n^{(N)}}$ for all $n$, for every $t$ with $n/N\le t\le(n+1)/N$ there exists $u$ with $\widetilde T_n^{(N)}\le u\le\widetilde T_{n+1}^{(N)}$ such that $\widetilde S_t^{(N)} = B_u$. This follows by the intermediate value theorem and the fact that $(\widetilde S_t^{(N)})$ is the linear interpolation between the values of $(S_n^{(N)})$. Hence we have
$$\big\{|\widetilde S_t^{(N)} - B_t| > \varepsilon \text{ for some } t\in[0,1]\big\} \subseteq \big\{|\widetilde T_n^{(N)} - n/N| > \delta \text{ for some } n\le N\big\} \cup \big\{|B_u - B_t| > \varepsilon \text{ for some } t\in[0,1] \text{ and } |u-t|\le\delta + 1/N\big\} =: A_1\cup A_2.$$
The paths of $(B_t)_{t\ge 0}$ are uniformly continuous on $[0,1]$. So for any $\varepsilon > 0$ we can find $\delta > 0$ so that $P(A_2)\le\varepsilon/2$ whenever $N\ge 1/\delta$. Then by choosing $N$ even larger we can ensure that $P(A_1)\le\varepsilon/2$ also. Hence $\widetilde S^{(N)}\to B$ uniformly on $[0,1]$ in probability, as required. $\Box$

Remark 6.38. From the proof above we see that we can construct the Brownian motion and the random walk on the same space so that, as $N\to\infty$,
$$P\Big(\sup_{0\le t\le 1}\big|S_t^{[N]} - B_t\big| > \varepsilon\Big)\to 0.$$
6.10 Zeros of Brownian motion

Theorem 6.39. Let $(B_t)_{t\ge 0}$ be a one-dimensional Brownian motion and let
$$\text{Zeros} = \{t\ge 0 : B_t = 0\}$$
be its zero set. Then, almost surely, Zeros is a closed set with no isolated points.

Proof. Since Brownian motion is continuous almost surely, the zero set is closed a.s. To prove that no point is isolated, we do the following: for each rational $q\in[0,\infty)$ we consider the first zero after $q$, i.e.
$$\tau_q = \inf\{t\ge q : B_t = 0\}.$$
Note that $\tau_q$ is an almost surely finite stopping time. Since Zeros is a closed set, this infimum is almost surely a minimum. By the strong Markov property applied to $\tau_q$, we have that for each $q$, almost surely $\tau_q$ is not an isolated zero from the right. But since the rational numbers form a countable set, we get that almost surely, for all rational $q$, the zero $\tau_q$ is not isolated from the right.

The next thing to prove is that the remaining points of Zeros are not isolated from the left. We claim that any $t > 0$ in the zero set which is different from $\tau_q$ for all rational $q$ is not isolated from the left. Take a sequence $q_n\uparrow t$ with $q_n\in\mathbb Q$, and define $t_n = \tau_{q_n}$. Since $t$ is itself a zero larger than $q_n$, we have $t_n\le t$, and $t_n\ne t$ by assumption; thus $q_n\le t_n < t$, and so $t_n\to t$. As each $t_n$ is a zero, $t$ is not isolated from the left. $\Box$

Theorem 6.40. Fix $t\ge 0$. Then, almost surely, Brownian motion in one dimension is not differentiable at $t$.

Proof. Exercise.

In fact a much stronger statement is true, namely:

Theorem 6.41. [Paley, Wiener and Zygmund 1933] Almost surely, Brownian motion in one dimension is nowhere differentiable.

7 Poisson random measures

7.1 Construction and basic properties

For $\lambda\in(0,\infty)$ we say that a random variable $X$ in $\mathbb Z_+$ is Poisson of parameter $\lambda$, and write $X\sim P(\lambda)$, if
$$P(X = n) = e^{-\lambda}\lambda^n/n!.$$
We also write $X\sim P(0)$ to mean $X\equiv 0$ and $X\sim P(\infty)$ to mean $X\equiv\infty$.

Proposition 7.1. [Addition property] Let $N_k$, $k\in\mathbb N$, be independent random variables, with $N_k\sim P(\lambda_k)$ for all $k$. Then
$$\sum_k N_k \sim P\Big(\sum_k \lambda_k\Big).$$
Proposition 7.2. [Splitting property] Let $N$ and $Y_n$, $n\in\mathbb N$, be independent random variables, with $N\sim P(\lambda)$, $\lambda < \infty$, and $P(Y_n = j) = p_j$ for $j = 1,\dots,k$ and all $n$. Set
$$N_j = \sum_{n=1}^N 1(Y_n = j).$$
Then $N_1,\dots,N_k$ are independent random variables with $N_j\sim P(\lambda p_j)$ for all $j$.

Proof. Left as an exercise.
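The splitting property can be illustrated numerically: splitting $N\sim P(\lambda)$ points into two classes with probabilities $p$ and $1-p$ should produce counts with Poisson means $\lambda p$, $\lambda(1-p)$ and zero covariance. The following sketch is not part of the notes; the parameters $\lambda = 10$, $p = 0.3$ and sample size are arbitrary, and Poisson samples are drawn by counting exponential arrivals in $[0,1]$.

```python
import random

def poisson_sample(lam, rng):
    """Sample P(lam) by counting Exp(lam) arrival times falling in [0, 1]."""
    n, t = 0, rng.expovariate(lam)
    while t <= 1.0:
        n += 1
        t += rng.expovariate(lam)
    return n

def split_counts(lam, p, trials, seed=3):
    """Return (N1, N2) pairs, where each of N points is kept in class 1 w.p. p."""
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        n = poisson_sample(lam, rng)
        n1 = sum(rng.random() < p for _ in range(n))
        samples.append((n1, n - n1))
    return samples

lam, p = 10.0, 0.3
s = split_counts(lam, p, 20000)
m1 = sum(a for a, _ in s) / len(s)
m2 = sum(b for _, b in s) / len(s)
cov = sum((a - m1) * (b - m2) for a, b in s) / len(s)
assert abs(m1 - lam * p) < 0.1          # N1 ~ P(3)
assert abs(m2 - lam * (1 - p)) < 0.15   # N2 ~ P(7)
assert abs(cov) < 0.2                   # splitting makes N1, N2 uncorrelated
```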
Let $(E,\mathcal E,\mu)$ be a $\sigma$-finite measure space. A Poisson random measure with intensity $\mu$ is a map
$$M: \Omega\times\mathcal E\to\mathbb Z_+\cup\{\infty\}$$
satisfying, for all sequences $(A_k : k\in\mathbb N)$ of disjoint sets in $\mathcal E$:

(i) $M(\bigcup_k A_k) = \sum_k M(A_k)$,

(ii) $M(A_k)$, $k\in\mathbb N$, are independent random variables,

(iii) $M(A_k)\sim P(\mu(A_k))$ for all $k$.

Denote by $E^*$ the set of $\mathbb Z_+\cup\{\infty\}$-valued measures on $(E,\mathcal E)$ and define, for $A\in\mathcal E$,
$$X: E^*\times\mathcal E\to\mathbb Z_+\cup\{\infty\},\qquad X_A: E^*\to\mathbb Z_+\cup\{\infty\}$$
by $X(m,A) = X_A(m) = m(A)$. Set $\mathcal E^* = \sigma(X_A : A\in\mathcal E)$.

Theorem 7.3. There exists a unique probability measure $\mu^*$ on $(E^*,\mathcal E^*)$ such that under $\mu^*$, $X$ is a Poisson random measure with intensity $\mu$.
Proof. (Uniqueness.) For disjoint sets $A_1,\dots,A_k\in\mathcal E$ and $n_1,\dots,n_k\in\mathbb Z_+$, set
$$A^* = \{m\in E^* : m(A_1) = n_1,\dots,m(A_k) = n_k\}.$$
Then, for any measure $\mu^*$ making $X$ a Poisson random measure with intensity $\mu$,
$$\mu^*(A^*) = \prod_{j=1}^k e^{-\mu(A_j)}\mu(A_j)^{n_j}/n_j!.$$
Since the set of such sets $A^*$ is a $\pi$-system generating $\mathcal E^*$, this implies that $\mu^*$ is uniquely determined on $\mathcal E^*$.

(Existence.) Consider first the case where $\lambda = \mu(E) < \infty$. There exists a probability space $(\Omega,\mathcal F,P)$ on which are defined independent random variables $N$ and $Y_n$, $n\in\mathbb N$, with $N\sim P(\lambda)$ and $Y_n\sim\mu/\lambda$ for all $n$. Set
$$M(A) = \sum_{n=1}^N 1(Y_n\in A),\quad A\in\mathcal E. \qquad(7.1)$$
It is easy to check, by the Poisson splitting property, that $M$ is a Poisson random measure with intensity $\mu$. Indeed, for disjoint $A_1,\dots,A_k$ in $\mathcal E$ with finite measures, we let $X_n = j$ whenever $Y_n\in A_j$, so that $M(A_j)$, $1\le j\le k$, are independent $P(\mu(A_j))$ random variables.

More generally, if $(E,\mathcal E,\mu)$ is $\sigma$-finite, then there exist disjoint sets $E_k\in\mathcal E$, $k\in\mathbb N$, such that $\bigcup_k E_k = E$ and $\mu(E_k) < \infty$ for all $k$. We can construct, on some probability space, independent Poisson random measures $M_k$, $k\in\mathbb N$, with $M_k$ having intensity $\mu|_{E_k}$. Set
$$M(A) = \sum_{k\in\mathbb N} M_k(A\cap E_k),\quad A\in\mathcal E.$$
It is easy to check, by the Poisson addition property, that $M$ is a Poisson random measure with intensity $\mu$. The law of $M$ on $E^*$ is then a measure with the required properties. $\Box$
The above construction gives the following important property of Poisson random measures.

Proposition 7.4. Let $M$ be a Poisson random measure on $E$ with intensity $\mu$, and let $A\in\mathcal E$ be such that $\mu(A) < \infty$. Then $M(A)$ has law $P(\mu(A))$, and given $M(A) = k$, the restriction $M|_A$ has the same law as $\sum_{i=1}^k\delta_{X_i}$, where $(X_1,\dots,X_k)$ are independent with law $\mu(\cdot\cap A)/\mu(A)$. Moreover, if $A, B\in\mathcal E$ are disjoint, then the restrictions $M|_A$, $M|_B$ are independent.

Exercise 7.5. Let $E = \mathbb R_+$ and $\mu = \lambda\,1(t\ge 0)\,dt$. Let $M$ be a Poisson random measure on $\mathbb R_+$ with intensity measure $\mu$, and let $(T_n)_{n\ge 1}$, with $T_0 = 0$, be a sequence of random variables such that $(T_n - T_{n-1},\ n\ge 1)$ are independent exponential random variables with parameter $\lambda > 0$. Then
$$\Big(N_t = \sum_{n\ge 1}1(T_n\le t),\ t\ge 0\Big) \quad\text{and}\quad \big(N_t' = M([0,t]),\ t\ge 0\big)$$
have the same distribution.
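If Exercise 7.5 holds, the counting process built from exponential gaps must inherit the Poisson-random-measure properties: $N_1\sim P(\lambda)$ (mean equals variance) and the counts in disjoint intervals must be independent. The following sketch (not part of the notes; $\lambda = 2$ and the intervals $[0,1]$, $(1,2]$ are arbitrary choices) checks this.

```python
import random

def arrival_counts(lam, trials, seed=4):
    """For each trial, build arrival times T_n as sums of Exp(lam) gaps and
    return (#arrivals in [0,1], #arrivals in (1,2])."""
    rng = random.Random(seed)
    out = []
    for _ in range(trials):
        t, c1, c2 = 0.0, 0, 0
        while True:
            t += rng.expovariate(lam)
            if t > 2.0:
                break
            if t <= 1.0:
                c1 += 1
            else:
                c2 += 1
        out.append((c1, c2))
    return out

lam = 2.0
s = arrival_counts(lam, 20000)
m1 = sum(a for a, _ in s) / len(s)
v1 = sum((a - m1) ** 2 for a, _ in s) / len(s)
m2 = sum(b for _, b in s) / len(s)
cov = sum((a - m1) * (b - m2) for a, b in s) / len(s)
assert abs(m1 - 2.0) < 0.1 and abs(v1 - 2.0) < 0.2   # N_1 ~ P(lambda * 1)
assert abs(cov) < 0.1                                # disjoint intervals independent
```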

7.2 Integrals with respect to a Poisson random measure

Theorem 7.6. Let $M$ be a Poisson random measure on $E$ with intensity $\mu$. If $f\in L^1(\mu)$, then $M(f)$, defined by
$$M(f) = \int_E f(y)\,M(dy),$$
is a.s. well defined and integrable, and
$$E[M(f)] = \int f(y)\,\mu(dy), \qquad \mathrm{var}(M(f)) = \int f(y)^2\,\mu(dy)$$
(for the variance formula assume in addition $f\in L^2(\mu)$). Let $f: E\to\mathbb R_+$ be a measurable function. Then for $u > 0$
$$E\big[e^{-uM(f)}\big] = \exp\Big(-\int_E\big(1 - e^{-uf(y)}\big)\,\mu(dy)\Big).$$
Let $f: E\to\mathbb R$ be in $L^1(\mu)$. Then for any $u$
$$E\big[e^{iuM(f)}\big] = \exp\Big(\int_E\big(e^{iuf(y)} - 1\big)\,\mu(dy)\Big).$$
Proof. First assume that $f = 1(A)$ for $A\in\mathcal E$. Then $M(A)$ is a random variable by definition of $M$, and this extends to any finite linear combination of indicators. Since any measurable non-negative function is the increasing limit of finite linear combinations of such indicator functions, we obtain by monotone convergence that $M(f)$ is a random variable, as a limit of random variables.

Let $E_n$, $n\ge 0$, be a measurable partition of $E$ into sets of finite $\mu$-measure. A similar approximation argument shows that $M(f\,1(E_n))$, $n\ge 0$, are independent random variables.

Let $f\in L^1(\mu)$. We will first show the formulas for the expectation and the variance. If $f = 1(A)$, they are clear. They extend to finite linear combinations and to any non-negative measurable function by approximation. For a general $f$, we do the standard procedure, separating into $f = f^+ - f^-$ and using the fact that $M(f^+)$ and $M(f^-)$ are independent.

Since by Proposition 7.4, given $M(E_n) = k$, the restriction $M|_{E_n}$ has the same law as $\sum_{i=1}^k\delta_{X_i}$, where $(X_1,\dots,X_k)$ are independent with law $\mu(\cdot\cap E_n)/\mu(E_n)$, we get
$$E\big[\exp(-uM(f\,1(E_n)))\big] = \sum_{k=0}^\infty E\big[\exp(-uM(f\,1(E_n)))\mid M(E_n) = k\big]\,P(M(E_n) = k) = \sum_{k=0}^\infty e^{-\mu(E_n)}\frac{\mu(E_n)^k}{k!}\Big(\int_{E_n}e^{-uf(x)}\frac{\mu(dx)}{\mu(E_n)}\Big)^k = e^{-\mu(E_n)}\exp\Big(\int_{E_n}e^{-uf(x)}\,\mu(dx)\Big) = \exp\Big(-\int_{E_n}\big(1 - e^{-uf(x)}\big)\,\mu(dx)\Big).$$
Since the random variables $M(f\,1(E_n))$ are independent over $n\ge 0$, we can take products over $n\ge 0$, and by monotone convergence we obtain the wanted formula.

The formula in the case where $f\in L^1(\mu)$ follows by the same kind of argument. We first establish the formula with $f\,1(A_n)$ in place of $f$, where $A_n = E_0\cup\dots\cup E_n$. Then, to obtain the result, we must show that
$$\int_{A_n}\big(e^{iuf(x)} - 1\big)\,\mu(dx)\to\int_E\big(e^{iuf(x)} - 1\big)\,\mu(dx) \quad\text{as } n\to\infty.$$
But since $|e^{ix} - 1|\le|x|$ for all $x$, we have that
$$\big|e^{iuf(x)} - 1\big|\le|uf(x)|,$$
whence the function under consideration is integrable with respect to $\mu$, and dominated convergence gives the result. $\Box$
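The moment formulas of Theorem 7.6 can be checked by sampling $M(f)$ directly via the construction (7.1). The sketch below (not part of the notes; the choices $E = [0,1]$, intensity $\lambda\,dx$ with $\lambda = 5$, and $f(x) = x$ are arbitrary) tests $E[M(f)] = \lambda\int_0^1 x\,dx = \lambda/2$ and $\mathrm{var}(M(f)) = \lambda\int_0^1 x^2\,dx = \lambda/3$.

```python
import random

def mf_samples(lam, f, trials, seed=5):
    """Sample M(f) = sum_i f(Y_i) for a Poisson random measure on [0,1] of
    intensity lam*dx, built as in (7.1): N ~ P(lam) points, each uniform on [0,1]."""
    rng = random.Random(seed)
    out = []
    for _ in range(trials):
        # Poisson(lam) via counting exponential gaps in [0, 1]
        n, t = 0, rng.expovariate(lam)
        while t <= 1.0:
            n += 1
            t += rng.expovariate(lam)
        out.append(sum(f(rng.random()) for _ in range(n)))
    return out

lam = 5.0
s = mf_samples(lam, lambda x: x, 20000)
mean = sum(s) / len(s)
var = sum((v - mean) ** 2 for v in s) / len(s)
assert abs(mean - lam / 2) < 0.1   # E[M(f)] = lam * ∫ x dx = 2.5
assert abs(var - lam / 3) < 0.2    # var(M(f)) = lam * ∫ x^2 dx
```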

7.3 Poisson Brownian motions

In this section we are going to consider Poisson random measures in $\mathbb R^d$, $d\ge 1$, with intensity measure given by $\mu = \lambda\,dx$, i.e. multiples of the Lebesgue measure in $d$ dimensions.

Let $\Pi$ be a Poisson random measure in $\mathbb R^d$ of intensity $\lambda$ (this means $\lambda$ times Lebesgue measure). Note that the construction of Theorem 7.3 gives that $\Pi$ can be written as
$$\Pi = \sum_{i=1}^\infty \delta_{X_i},$$
where the $X_i$ are random variables, since the Lebesgue measure of the whole space is infinite. We will sometimes say Poisson point process to mean a Poisson random measure in $\mathbb R^d$.

Proposition 7.7. [Thinning property] Let $\Pi = \{X_i\}$ be a Poisson point process in $\mathbb R^d$ of intensity $\lambda$. For each point $X_i$ we perform an independent experiment: we keep it with probability $p(X_i)$ and remove it with the complementary probability, where $p: \mathbb R^d\to[0,1]$ is a measurable function. Thus we define a new process $\Pi'$ that contains the points $X_i$ that we kept. The process $\Pi'$ is a Poisson random measure in $\mathbb R^d$ with intensity $\nu(A) = \lambda\int_A p(x)\,dx$.
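The thinning property can be illustrated before proving it: thinning a Poisson point process on the unit square with a position-dependent retention probability should leave a Poisson count with mean (and variance) $\nu([0,1]^2)$. The sketch below is not part of the notes; the choices $\lambda = 20$ and $p(x_1,x_2) = x_1$ (so $\nu([0,1]^2) = \lambda/2$) are arbitrary.

```python
import random

def thinned_counts(lam, trials, seed=6):
    """Poisson point process on [0,1]^2 of intensity lam, thinned with
    p(x1, x2) = x1. Returns the kept-point counts over independent trials."""
    rng = random.Random(seed)
    out = []
    for _ in range(trials):
        # number of points in the unit square ~ P(lam * vol) = P(lam)
        n, t = 0, rng.expovariate(lam)
        while t <= 1.0:
            n += 1
            t += rng.expovariate(lam)
        kept = 0
        for _ in range(n):
            x1, x2 = rng.random(), rng.random()  # uniform point in the square
            if rng.random() < x1:                # keep with probability p = x1
                kept += 1
        out.append(kept)
    return out

lam = 20.0
s = thinned_counts(lam, 20000)
mean = sum(s) / len(s)
var = sum((v - mean) ** 2 for v in s) / len(s)
# Proposition 7.7: kept count ~ P(lam * ∫∫ x1 dx) = P(lam / 2) = P(10)
assert abs(mean - lam / 2) < 0.3 and abs(var - lam / 2) < 1.0
```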
Proof. The independence property follows easily from the independence properties of $\Pi$ and of the thinning experiments. We will now show that for any set $A$ with finite volume we have $\Pi'(A)\sim P(\nu(A))$, where $\nu$ is the intensity measure given in the statement. By Proposition 7.4 we have
$$P(\Pi'(A) = k) = \sum_{n\ge k}P\big(\Pi(A) = n,\ \Pi'(A) = k\big) = \sum_{n\ge k}e^{-\lambda\,\mathrm{vol}(A)}\frac{(\lambda\,\mathrm{vol}(A))^n}{n!}\binom{n}{k}\Big(\int_A p(x)\frac{dx}{\mathrm{vol}(A)}\Big)^k\Big(\int_A(1 - p(x))\frac{dx}{\mathrm{vol}(A)}\Big)^{n-k}$$
$$= e^{-\lambda\,\mathrm{vol}(A)}\frac{\lambda^k}{k!}\Big(\int_A p(x)\,dx\Big)^k\sum_{n\ge k}\frac{\lambda^{n-k}}{(n-k)!}\Big(\int_A(1 - p(x))\,dx\Big)^{n-k} = e^{-\lambda\,\mathrm{vol}(A)}\frac{\lambda^k}{k!}\Big(\int_A p(x)\,dx\Big)^k\exp\Big(\lambda\int_A(1 - p(x))\,dx\Big)$$
$$= \exp\Big(-\lambda\int_A p(x)\,dx\Big)\frac{\big(\lambda\int_A p(x)\,dx\big)^k}{k!}. \qquad\Box$$
Proposition 7.8. Let $\Pi = \{X_i\}$ be a Poisson point process in $\mathbb R^d$ of intensity $\lambda$. Let $(Y_i)$ be i.i.d. random variables with law $\mu$, independent of $\Pi$. Define the measure $\widetilde\Pi = \sum_i\delta_{X_i + Y_i}$. Then $\widetilde\Pi$ is again a Poisson point process of the same intensity as $\Pi$.
Proof. It suffices to check that for any $u > 0$ and $f : \mathbb{R}^d \to \mathbb{R}_+$ we have
\[
\mathbb{E}\left[e^{-u \widetilde{\Pi}(f)}\right] = \exp\left(\lambda \int_{\mathbb{R}^d} \big(e^{-uf(x)} - 1\big)\,dx\right).
\]
We can write
\[
\mathbb{E}\left[e^{-u \widetilde{\Pi}(f)}\right] = \mathbb{E}\left[e^{-u \sum_i f(X_i + Y_i)}\right],
\]
and conditioning on $\{X_i\}$ and using the independence of the $(Y_i)$'s we obtain
\begin{align*}
\mathbb{E}\left[e^{-u \widetilde{\Pi}(f)}\right] = \mathbb{E}\left[\mathbb{E}\left[e^{-u \sum_i f(X_i + Y_i)} \,\Big|\, \Pi\right]\right]
&= \mathbb{E}\left[\prod_i \int_{\mathbb{R}^d} e^{-u f(X_i + y)}\,\mu(dy)\right]\\
&= \mathbb{E}\left[\exp\left(\log \prod_i \int_{\mathbb{R}^d} e^{-u f(X_i + y)}\,\mu(dy)\right)\right]\\
&= \mathbb{E}\left[\exp\left(\sum_i \log \int_{\mathbb{R}^d} e^{-u f(X_i + y)}\,\mu(dy)\right)\right]\\
&= \mathbb{E}\left[\exp(\Pi(g))\right],
\end{align*}
where $g(x) = \log \int_{\mathbb{R}^d} e^{-u f(x + y)}\,\mu(dy)$. By Theorem 7.6 we have
\begin{align*}
\mathbb{E}\left[\exp(\Pi(g))\right] &= \exp\left(\lambda \int_{\mathbb{R}^d} \left(\exp\left(\log \int_{\mathbb{R}^d} e^{-u f(x + y)}\,\mu(dy)\right) - 1\right) dx\right)\\
&= \exp\left(\lambda \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} \left(e^{-u f(x + y)} - 1\right) \mu(dy)\,dx\right)\\
&= \exp\left(\lambda \int_{\mathbb{R}^d} \left(e^{-u f(x)} - 1\right) dx\right),
\end{align*}
where in the last step we used Fubini's theorem and the fact that $\mu$ is a probability measure on $\mathbb{R}^d$. $\square$
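Proposition 7.8 can be tested numerically on a circle of circumference $L$ rather than on all of $\mathbb{R}^d$, which avoids boundary effects: the same argument applies there, since Lebesgue measure on the torus is invariant under the displacement kernel. All parameters below (intensity, circumference, Gaussian displacement scale, test window) are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
lam, L, sigma = 3.0, 10.0, 2.0   # hypothetical: intensity, torus length, displacement scale

def displaced_counts(a=0.0, b=4.0, n_trials=20000):
    """Poisson points on the circle R/LZ, each displaced by an independent
    N(0, sigma^2) jump (taken mod L); return counts in [a, b) after displacement."""
    counts = np.empty(n_trials, dtype=int)
    for i in range(n_trials):
        n = rng.poisson(lam * L)
        x = rng.uniform(0, L, size=n)
        y = (x + sigma * rng.standard_normal(n)) % L   # X_i + Y_i on the torus
        counts[i] = np.count_nonzero((y >= a) & (y < b))
    return counts

counts = displaced_counts()
target = lam * 4.0   # Proposition 7.8: the displaced process is still Poisson of intensity lam
print(abs(counts.mean() - target) < 0.3)
print(abs(counts.var() - target) < 1.0)
```

The displaced counts in the window again have Poisson mean and variance $\lambda \cdot |b - a|$, as the proposition predicts.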
For the rest of this section we are going to consider the following model: let $\Pi(0)$ be a Poisson point process in $\mathbb{R}^d$ of intensity $\lambda$, say $\Pi(0) = \{X_i\}$. We now let each point of the Poisson process move independently according to a standard Brownian motion in $d$ dimensions; namely, the point $X_i$ moves according to the Brownian motion $(\xi_i(t))_{t \ge 0}$. This way at every time $t$ we obtain a new process $\Pi(t) = \{X_i + \xi_i(t)\}$, which by Proposition 7.8 is again a Poisson point process of intensity $\lambda$.

We can think of the points of the Poisson process $\Pi(0)$ as the users of a wireless network that can communicate with each other when they are at distance at most $r$ from each other. So it is natural to introduce mobility to the model, and this is why we let the points evolve in space.
We now fix a target particle which is at the origin of $\mathbb{R}^d$ and we are interested in the first time that one of the points of the Poisson process is within distance $r$ from it, i.e. we define
\[
T_{\mathrm{det}} = \inf\left\{t \ge 0 : 0 \in \bigcup_i B(X_i + \xi_i(t), r)\right\},
\]
where $B(x, r)$ stands for the ball centred at $x$ of radius $r$.

Theorem 7.9. [Stochastic geometry formula] Let $\xi$ be a standard Brownian motion in $d$ dimensions and let $W(t) = \bigcup_{s \le t} B(\xi(s), r)$ be the so-called Wiener sausage up to time $t$. Then, for any dimension $d \ge 1$, the detection probability satisfies
\[
\mathbb{P}(T_{\mathrm{det}} > t) = \exp\left(-\lambda\, \mathbb{E}[\mathrm{vol}(W(t))]\right).
\]
[Figure: Random walk sausage]

Proof. Let $\Psi$ be the set of points of $\Pi(0)$ that have detected 0 by time $t$, that is,
\[
\Psi = \{X_i \in \Pi(0) : \exists\, s \le t \text{ s.t. } 0 \in B(X_i + \xi_i(s), r)\}.
\]
Since the $\xi_i$'s are independent, we have by Proposition 7.7 that $\Psi$ is a thinned Poisson point process with intensity $\lambda p(x)\,dx$, where $p$ is given by
\[
p(x) = \mathbb{P}\left(x \in \bigcup_{s \le t} B(\xi(s), r)\right),
\]
for $\xi$ a standard Brownian motion. So for the probability that the detection time is greater than $t$ we have that
\[
\mathbb{P}(T_{\mathrm{det}} > t) = \mathbb{P}(\Psi(\mathbb{R}^d) = 0) = \exp\left(-\lambda \int_{\mathbb{R}^d} \mathbb{P}\left(x \in \bigcup_{s \le t} B(\xi(s), r)\right) dx\right) = \exp\left(-\lambda\, \mathbb{E}\left[\mathrm{vol}\left(\bigcup_{s \le t} B(\xi(s), r)\right)\right]\right) = \exp(-\lambda\, \mathbb{E}[\mathrm{vol}(W(t))]),
\]
where the third equality follows by Fubini's theorem. $\square$
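In $d = 1$ the formula of Theorem 7.9 can be tested end-to-end by Monte Carlo: first estimate $\mathbb{E}[\mathrm{vol}\,W(t)]$ from single Brownian paths (in one dimension the sausage is simply the interval $[\min_{s \le t} \xi(s) - r,\ \max_{s \le t} \xi(s) + r]$), then simulate the full Poisson system and compare the empirical survival frequency with $\exp(-\lambda\, \mathbb{E}[\mathrm{vol}\,W(t)])$. The discretisation step `dt`, the truncation window $[-M, M]$ and all parameters below are hypothetical; for the chosen $t$, points with $|x| > M$ essentially never reach the origin, so the truncation is harmless.

```python
import numpy as np

rng = np.random.default_rng(2)
lam, r, t, dt, M = 1.0, 0.25, 1.0, 1e-3, 6.0   # hypothetical parameters; [-M, M] truncates R
n_steps = int(t / dt)

def sausage_extent(n_paths):
    """Return (min, max) of discretised 1d Brownian paths on [0, t];
    the sausage of one path is the interval [min - r, max + r]."""
    steps = np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
    paths = np.cumsum(steps, axis=1)
    lo = np.minimum(paths.min(axis=1), 0.0)   # each path starts at 0
    hi = np.maximum(paths.max(axis=1), 0.0)
    return lo, hi

# Estimate E[vol W(t)] = E[max - min] + 2r with the same discretisation.
lo, hi = sausage_extent(10000)
vol_est = (hi - lo).mean() + 2 * r

# Estimate P(T_det > t): survival means no point's sausage covers the origin.
n_trials = 4000
survived = 0
for _ in range(n_trials):
    n = rng.poisson(lam * 2 * M)       # Pi([-M, M]) ~ Poisson(2 lam M)
    x = rng.uniform(-M, M, size=n)     # Poisson points of the network
    lo, hi = sausage_extent(n)
    detected = np.any((x + lo - r <= 0) & (x + hi + r >= 0))
    survived += int(not detected)
emp = survived / n_trials
print(abs(emp - np.exp(-lam * vol_est)) < 0.03)
```

Both sides use the same path discretisation, so the bias cancels and the agreement is tight.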
Theorem 7.10. The expected volume of the Wiener sausage $W(t) = \bigcup_{s \le t} B(\xi(s), r)$ satisfies, as $t \to \infty$,
\[
\mathbb{E}[\mathrm{vol}(W(t))] =
\begin{cases}
\sqrt{\dfrac{8t}{\pi}} + 2r & \text{for } d = 1,\\[2mm]
\dfrac{2\pi t}{\log t}\,(1 + o(1)) & \text{for } d = 2,\\[2mm]
\dfrac{2\pi^{d/2}\, r^{d-2}\, t}{\Gamma\left(\frac{d-2}{2}\right)}\,(1 + o(1)) & \text{for } d \ge 3.
\end{cases}
\]
Proof. Dimension d = 1 is left as an exercise.
For all $d$ we have
\[
\mathbb{E}[\mathrm{vol}(W_t)] = \int_{\mathbb{R}^d} \mathbb{P}\left(y \in \bigcup_{s \le t} B(\xi(s), r)\right) dy = \int_{\mathbb{R}^d} \mathbb{P}(\tau_{B(y,r)} \le t)\,dy = \mathrm{vol}(B(0,r)) + \int_{\mathbb{R}^d \setminus B(0,r)} \mathbb{P}(\tau_{B(y,r)} \le t)\,dy,
\]
where $\tau_A$ is the first hitting time of the set $A$ by the Brownian motion. Define
\[
Z_t^y = \int_0^t \mathbf{1}(\xi(s) \in B(y,r))\,ds, \tag{7.2}
\]
i.e. the time that the Brownian motion spends in the ball $B(y,r)$ before time $t$. It is clear by the continuity of the Brownian paths that $\{Z_t^y > 0\} = \{\tau_{B(y,r)} \le t\}$. We now have
\[
\mathbb{P}(Z_t^y > 0) = \frac{\mathbb{E}[Z_t^y]}{\mathbb{E}[Z_t^y \mid Z_t^y > 0]},
\]
and for the first moment we have
\[
\mathbb{E}[Z_t^y] = \int_0^t \mathbb{P}_0(\xi(s) \in B(y,r))\,ds = \int_0^t \int_{B(y,r)} \frac{1}{(2\pi s)^{d/2}}\, e^{-\frac{|z|^2}{2s}}\,dz\,ds = \int_0^t \int_{B(0,r)} \frac{1}{(2\pi s)^{d/2}}\, e^{-\frac{|z+y|^2}{2s}}\,dz\,ds,
\]

and for the conditional expectation $\mathbb{E}[Z_t^y \mid Z_t^y > 0]$: if we write $T$ for the first time that the Brownian motion hits the boundary of the ball $B(y,r)$, then we get that in 2 dimensions, for all $y \notin B(0,r)$,
\[
\mathbb{E}[Z_t^y \mid Z_t^y > 0] = \mathbb{E}\left[\int_T^t \mathbf{1}(\xi(s) \in B(y,r))\,ds \,\Big|\, Z_t^y > 0\right] \le \int_0^t \mathbb{P}_0(\xi(s) \in B(0,r))\,ds \le 1 + \int_1^t \int_{B(0,r)} \frac{1}{2\pi s}\, e^{-\frac{|z|^2}{2s}}\,dz\,ds \le 1 + \frac{r^2}{2}\log t.
\]
In dimensions $d \ge 3$ we have, for all $y \notin B(0,r)$,
\[
\mathbb{E}[Z_t^y \mid Z_t^y > 0] = \mathbb{E}\left[\int_T^t \mathbf{1}(\xi(s) \in B(y,r))\,ds \,\Big|\, Z_t^y > 0\right] \le \int_0^t \int_{B((0,r),r)} \frac{1}{(2\pi s)^{d/2}}\, e^{-\frac{|z|^2}{2s}}\,dz\,ds = \frac{1}{2\pi^{d/2}} \int_{B((0,r),r)} \frac{1}{|z|^{d-2}} \int_{\frac{|z|^2}{2t}}^{\infty} s^{d/2-2}\, e^{-s}\,ds\,dz,
\]
where $B((0,r),r)$ stands for the ball centred at $(0, \ldots, 0, r)$ and of radius $r$, and the last step follows by a change of variable. Now notice that
\[
\int_{\frac{|z|^2}{2t}}^{\infty} s^{d/2-2}\, e^{-s}\,ds \to \Gamma\left(\frac{d-2}{2}\right) \quad \text{as } t \to \infty,
\]
and by the mean value property for the harmonic function $1/|z|^{d-2}$ we get that
\[
\int_{B((0,r),r)} \frac{dz}{|z|^{d-2}} = \mathrm{vol}(B(0,1))\, r^2.
\]
So, putting all things together, we obtain that in 2 dimensions
\begin{align*}
\mathbb{E}[\mathrm{vol}(W_t)] &= \mathrm{vol}(B(0,r)) + \int_{\mathbb{R}^2 \setminus B(0,r)} \frac{\mathbb{E}[Z_t^y]}{\mathbb{E}[Z_t^y \mid Z_t^y > 0]}\,dy\\
&\ge \mathrm{vol}(B(0,r)) + \frac{\int_0^t \int_{B(0,r)} \int_{\mathbb{R}^2} \frac{1}{2\pi s}\, e^{-\frac{|z+y|^2}{2s}}\,dy\,dz\,ds - \int_{B(0,r)} \mathbb{E}[Z_t^y]\,dy}{1 + \frac{r^2}{2}\log t}\\
&= \mathrm{vol}(B(0,r)) + \frac{2\pi r^2 t}{2 + r^2 \log t} - \frac{2\int_{B(0,r)} \mathbb{E}[Z_t^y]\,dy}{2 + r^2 \log t}.
\end{align*}
It is easy to see that $\int_{B(0,r)} \mathbb{E}[Z_t^y]\,dy = O(\log t)$ and hence in 2 dimensions we get
\[
\liminf_{t \to \infty}\, \mathbb{E}[\mathrm{vol}(W_t)]\, \frac{\log t}{2\pi t} \ge 1.
\]

In $d \ge 3$ we obtain in the same way as above
\[
\liminf_{t \to \infty}\, \mathbb{E}[\mathrm{vol}(W_t)]\, \frac{\Gamma\left(\frac{d-2}{2}\right)}{2\pi^{d/2}\, r^{d-2}\, t} \ge 1,
\]
since $\int_{B(0,r)} \mathbb{E}[Z_t^y]\,dy = O(1)$. It remains to show that in 2 dimensions
\[
\limsup_{t \to \infty}\, \mathbb{E}[\mathrm{vol}(W_t)]\, \frac{\log t}{2\pi t} \le 1 \tag{7.3}
\]
and in $d \ge 3$ that
\[
\limsup_{t \to \infty}\, \mathbb{E}[\mathrm{vol}(W_t)]\, \frac{\Gamma\left(\frac{d-2}{2}\right)}{2\pi^{d/2}\, r^{d-2}\, t} \le 1. \tag{7.4}
\]
Let $\varepsilon > 0$. We define $\widetilde{Z}_t^y = \int_0^{t(1+\varepsilon)} \mathbf{1}(\xi(s) \in B(y,r))\,ds$ and use the obvious inequality
\[
\mathbb{P}(Z_t^y > 0) \le \frac{\mathbb{E}[\widetilde{Z}_t^y]}{\mathbb{E}[\widetilde{Z}_t^y \mid Z_t^y > 0]}.
\]

We can now lower bound the conditional expectation appearing in the denominator above as follows: on the event $\{Z_t^y > 0\}$ the Brownian motion hits $\partial B(y,r)$ at some time $T \le t$, and the occupation over $[T, T + \varepsilon t]$ is counted in $\widetilde{Z}_t^y$, so by the strong Markov property applied at $T$ we get, in $d = 2$,
\[
\mathbb{E}[\widetilde{Z}_t^y \mid Z_t^y > 0] \ge \int_0^{\varepsilon t} \int_{B((0,r),r)} \frac{1}{2\pi s}\, e^{-\frac{|z|^2}{2s}}\,dz\,ds \ge \int_{\log t}^{\varepsilon t} \int_{B((0,r),r)} \frac{1}{2\pi s}\, e^{-\frac{|z|^2}{2s}}\,dz\,ds \ge \frac{r^2}{2}\, e^{-\frac{2r^2}{\log t}} \left(\log(\varepsilon t) - \log\log t\right).
\]
For $d \ge 3$ we have
\[
\mathbb{E}[\widetilde{Z}_t^y \mid Z_t^y > 0] \ge \int_0^{\varepsilon t} \int_{B((0,r),r)} \frac{1}{(2\pi s)^{d/2}}\, e^{-\frac{|z|^2}{2s}}\,dz\,ds = \frac{1}{2\pi^{d/2}} \int_{B((0,r),r)} \frac{1}{|z|^{d-2}} \int_{\frac{|z|^2}{2\varepsilon t}}^{\infty} s^{d/2-2}\, e^{-s}\,ds\,dz.
\]
Similarly to the calculations leading to the lower bound we get that in $d = 2$
\[
\mathbb{E}[\mathrm{vol}(W_t)] \le \mathrm{vol}(B(0,r)) + \frac{2\pi t (1+\varepsilon)\, e^{\frac{2r^2}{\log t}}}{\log(\varepsilon t) - \log\log t},
\]
and hence for $d = 2$
\[
\limsup_{t \to \infty}\, \mathbb{E}[\mathrm{vol}(W_t)]\, \frac{\log t}{2\pi t} \le 1 + \varepsilon,
\]
for all $\varepsilon > 0$, and thus letting $\varepsilon$ go to 0 proves (7.3).

For $d \ge 3$ in the same way we obtain
\[
\limsup_{t \to \infty}\, \mathbb{E}[\mathrm{vol}(W_t)]\, \frac{\Gamma\left(\frac{d-2}{2}\right)}{2\pi^{d/2}\, r^{d-2}\, t} \le 1 + \varepsilon,
\]
for all $\varepsilon > 0$, and thus letting $\varepsilon$ go to 0 proves (7.4). $\square$
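The $d = 1$ case left as an exercise is exact: the one-dimensional sausage is the interval $[\min_{s \le t} \xi(s) - r,\ \max_{s \le t} \xi(s) + r]$, so its volume is $\max - \min + 2r$, and $\mathbb{E}[\max - \min] = 2\,\mathbb{E}[\max_{s \le t} \xi(s)] = \sqrt{8t/\pi}$ by linearity of expectation and symmetry. A quick numerical check of this formula (all discretisation parameters below are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)
t, r, dt = 1.0, 0.5, 2e-4          # hypothetical parameters
n_steps = int(t / dt)

# Estimate E[max - min] over [0, t] from discretised paths, in batches to limit memory.
ranges = []
for _ in range(4):                  # 4 batches of 1000 paths each
    steps = np.sqrt(dt) * rng.standard_normal((1000, n_steps))
    paths = np.cumsum(steps, axis=1)
    hi = np.maximum(paths.max(axis=1), 0.0)   # include the starting point 0
    lo = np.minimum(paths.min(axis=1), 0.0)
    ranges.append(hi - lo)
est = np.concatenate(ranges).mean() + 2 * r
exact = np.sqrt(8 * t / np.pi) + 2 * r
print(abs(est - exact) < 0.1)
```

The small residual discrepancy comes from discretising the path extremes, which shrinks like $\sqrt{dt}$.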

Now suppose that the target particle is moving according to a deterministic function $f : \mathbb{R}_+ \to \mathbb{R}^d$. We define the detection time
\[
T_{\mathrm{det}}^f = \inf\left\{t \ge 0 : f(t) \in \bigcup_i B(X_i + \xi_i(t), r)\right\}.
\]
Then we have the following theorem, which is non-examinable:

Theorem 7.11 (Peres-Sousi). For all times $t$ and all dimensions $d$ we have
\[
\mathbb{P}(T_{\mathrm{det}}^f > t) \le \mathbb{P}(T_{\mathrm{det}} > t).
\]
Using a straightforward generalization of Theorem 7.9 we get the equivalent statement:

Theorem 7.12 (Peres-Sousi). For all times $t$ and all dimensions $d$ we have
\[
\mathbb{E}[\mathrm{vol}(W_f(t))] \ge \mathbb{E}[\mathrm{vol}(W(t))],
\]
where $W_f(t) = \bigcup_{s \le t} B(\xi(s) + f(s), r)$.

We now conclude this course by stating an open question.

Question 7.13. Does the stochastic domination inequality
\[
\mathbb{P}\left(\mathrm{vol}\left(\bigcup_{s \le t} B(\xi(s) + f(s), r)\right) \ge \alpha\right) \ge \mathbb{P}\left(\mathrm{vol}\left(\bigcup_{s \le t} B(\xi(s), r)\right) \ge \alpha\right)
\]
also hold?
Acknowledgements

These notes are based in part on the lecture notes by James Norris and Gregory Miermont for the same course and on [1].

References

[1] Peter Mörters and Yuval Peres. Brownian motion. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, 2010. With an appendix by Oded Schramm and Wendelin Werner.

[2] David Williams. Probability with martingales. Cambridge Mathematical Textbooks. Cambridge University Press, Cambridge, 1991.
