Brownian Motion and Martingales
4.4.2 Explain the definition and basic properties of standard Brownian motion (or
Wiener process).
0 Introduction
Essentially, a stochastic process is a sequence of values of some quantity where the future values
cannot be predicted with certainty. This and the following chapter are concerned with
continuous-time stochastic processes that have applications in financial economics. These
chapters are of a very mathematical nature and you may find some of it hard-going. It is more
important that you gain a higher-level understanding than that you master the pure maths that
underlies it. For example, you may find it useful to learn Ito’s Lemma as a procedure rather than
trying to understand the pure mathematical concepts.
The most important process studied here is the Wiener process, also known as Brownian motion,
which is the subject of Section 1. These two terms will be used interchangeably. We define this
as a process with continuous sample paths and independent and normally distributed increments.
A Brownian motion is the continuous-time version of a random walk, as we will see. The graph in
Section 1.2 shows a typical sample path.
If security prices can be modelled in some way in terms of Brownian motion, this will be useful for
pricing certain types of options. This is discussed further in Parts 3 and 4 of the course.
Section 2 of this chapter introduces martingales. A martingale is a process whose current value is
the best estimate of its future values. We will see later that martingale theory has important
applications in relation to financial derivatives.
The notation used in financial economics generally is not standardised and similar notation can
refer to different quantities: readers should check the definitions provided in each section. In
particular, the value of a stochastic process can be written equivalently as $X_t$ or $X(t)$.
Furthermore, standard Brownian motion can be denoted by $B_t$ (as in the Tables), or by $W_t$ or $Z_t$, as
found throughout the Core Reading.
The Core Reading in this chapter is adapted from course notes written by Timothy Johnson.
1.1 Introduction
In 1895, Louis Bachelier embarked on a doctorate on the ‘Theory of Speculation’.
Bachelier’s approach was fairly conventional at the time; he would model an asset price as
a random walk. At the start of his thesis he argues that:
At a given instant the market believes neither in a rise nor in a fall of the true price.
His innovation was to consider the walk to be continuous, rather than a discrete-time
random walk. This is analogous to moving from the binomial to the normal distribution.
Bachelier was unable to mathematically define the paths he discussed. A little later, in
1903–1904, Einstein used a similar model to represent the motion of atoms/molecules in a
liquid. Einstein also failed to define the path he was working with. However, since his
paper was used as evidence that atoms existed it became important in physics that the
paths were rigorously defined. This was done by Norbert Wiener in 1921. Today physicists
will refer to the paths used by Bachelier as ‘Brownian motion’, a physical process, while
mathematicians refer to them as a ‘Wiener process’, which is a mathematical object.
The phenomenon of ‘Brownian motion’ is named after the nineteenth century botanist Robert
Brown who observed the random movement of pollen particles in water. The path of a
two-dimensional Brownian motion process bears a resemblance to the track of such pollen
particles.
(i) $W_0 = 0$
This means that the graph of $W_t$ as a function of $t$ doesn’t have any breaks in it.
This property shows that the increments are stationary, in that their statistical properties
depend only on the length of the interval, $t - s$. The concept of stationarity is discussed further in
Subject CS2.
The natural filtration $F_t^W$ represents the history of the process up to and including time $t$.
This concept is covered in more detail in the martingale section of the chapter. $F_t^W$ may
be written as $\sigma(W_u, u \le t)$ to denote that it is the filtration generated by the process $W_t$.
The fact that a Wiener Process has independent increments implies it is Markovian
(in fact it is ‘strong Markovian’).
Intuitively, a Markov process is one where, if we know the latest value of the process, we
have all the information required to determine the probabilities for the future values.
Knowing the historical values of the process as well would not make any difference.
Markov processes are also discussed in Subject CS2.
Property (iii) combined with property (i) gives us $W_t = W_t - W_0 \sim N(0, t)$, which results in
$E[W_t] = 0$.
[Figure: Brownian Motion. A sample path, with value $x$ on the vertical axis against time $t$ on the horizontal axis.]
Brownian motion can be viewed as the continuous version of a simple symmetric random walk.
The term Brownian motion refers to a process $\{Z_t, t \ge 0\}$ that satisfies criteria (ii) and (iv) above,
but with the distribution in criterion (iii) being replaced with $N(\mu(t-s), \sigma^2(t-s))$.
Here $\mu$ is the drift coefficient and $\sigma$ is known as the diffusion coefficient (or volatility).
It turns out that Brownian motion is the only process with stationary independent increments and
continuous sample paths. This is far from obvious and we won’t prove it here.
The relationship between standard Brownian motion and Brownian motion is the same as the
relationship between a standard normal distribution, $N(0,1)$, and a general $N(\mu, \sigma^2)$ distribution.
A Brownian motion with given diffusion and drift coefficients can be constructed out of a standard
Brownian motion $\{W_t, t \ge 0\}$ by setting:
$$Z_t = Z_0 + \sigma W_t + \mu t$$
Question
Solution
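The second property of a Brownian motion – that it has continuous sample paths – is met because
$Z_t$ is driven by only time $t$ and the continuous process $W_t$.
$$Z_t - Z_s = \sigma(W_t - W_s) + \mu(t - s)$$
$$Z_t - Z_s \sim N\left(\mu(t-s), \sigma^2(t-s)\right)$$
$$\begin{aligned}
Z_t - Z_s &= \sigma(W_t - W_s) + \mu(t-s) \\
&\sim \sigma N(0, t-s) + \mu(t-s) \\
&\sim N\left(0, \sigma^2(t-s)\right) + \mu(t-s) \\
&\sim N\left(\mu(t-s), \sigma^2(t-s)\right)
\end{aligned}$$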
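The distribution of the increments can be checked numerically. The sketch below is only an illustration: the function name `simulate_increments` and the parameter values are assumptions, not part of the Core Reading. It samples $Z_t - Z_s$ using the construction $Z_t = Z_0 + \sigma W_t + \mu t$ and compares the sample mean and variance with $\mu(t-s)$ and $\sigma^2(t-s)$.

```python
import math
import random
import statistics


def simulate_increments(mu, sigma, s, t, n_paths, seed=0):
    """Sample Z_t - Z_s for Z_u = Z_0 + sigma * W_u + mu * u (Z_0 cancels)."""
    rng = random.Random(seed)
    dt = t - s
    # W_t - W_s ~ N(0, t - s): scale by sigma, then shift by mu * (t - s).
    return [sigma * rng.gauss(0.0, math.sqrt(dt)) + mu * dt for _ in range(n_paths)]


# Illustrative parameter values (assumptions, not taken from the text).
mu, sigma, s, t = 0.5, 2.0, 1.0, 3.0
increments = simulate_increments(mu, sigma, s, t, n_paths=100_000)
print("mean:", statistics.fmean(increments), "theory:", mu * (t - s))
print("variance:", statistics.variance(increments), "theory:", sigma**2 * (t - s))
```

With these values the theoretical mean is 1 and the theoretical variance is 8; the sample figures should be close to these but will not match exactly.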
Question
How can a Brownian motion, $Z_t$, that has drift $\mu$, diffusion parameter $\sigma$ and a starting value
of $Z_0$ be converted into a standard Brownian motion?
Solution
$$W_t = \frac{Z_t - Z_0 - \mu t}{\sigma}$$
x
z
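$$X_n = \sum_{i=1}^{n} Z_i \quad \text{where} \quad Z_i = \begin{cases} +1 & \text{with probability } \tfrac12 \\ -1 & \text{with probability } \tfrac12 \end{cases}$$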
The value of the process increases or decreases randomly by 1 unit (= ‘simple’) with equal
probability (= ‘symmetric’).
If we reduce the step size progressively from 1 unit until it is infinitesimal (and rescale the X
values accordingly), the simple symmetric random walk becomes standard Brownian motion. An
important consequence of this is that a standard Brownian motion returns infinitely often to zero,
or indeed any other level.
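The limiting argument can be illustrated with a short simulation. This is a sketch under a stated assumption: the rescaling used here takes steps of size $\sqrt{t/n}$ at time intervals of $t/n$, which is the usual choice but is not spelled out in the text. The name `scaled_walk_endpoint` is hypothetical. The endpoint of the rescaled walk should then have mean close to 0 and variance close to $t$, as for $W_t$.

```python
import math
import random
import statistics


def scaled_walk_endpoint(n_steps, t, rng):
    """Value at time t of a symmetric walk with n_steps steps of size sqrt(t / n_steps)."""
    step = math.sqrt(t / n_steps)
    return sum(rng.choice((-step, step)) for _ in range(n_steps))


rng = random.Random(42)
t = 1.0
endpoints = [scaled_walk_endpoint(100, t, rng) for _ in range(4_000)]
print("mean:", statistics.fmean(endpoints), "(W_t has mean 0)")
print("variance:", statistics.variance(endpoints), "(W_t has variance t =", t, ")")
```

Increasing the number of steps makes the distribution of the endpoint closer to normal, while the variance stays at $t$ for every choice of the number of steps.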
Many of the properties of standard Brownian motion can be demonstrated using the following
decomposition. For $s < t$:
$$W_t = W_s + (W_t - W_s)$$
a decomposition in which the first term is known at time s and the second is independent of
everything up to and including time s .
In calculations involving Brownian motion, we often need to split up $W_t$ in this way, so that we
can work with independent increments.
This follows from the fact that $E[W_t] = E[W_s] = 0$, and then by applying the decomposition
$W_t = W_s + (W_t - W_s)$.
By independence of increments:
$$= \mathrm{Var}(W_s) + 0$$
This follows from the fact that $E[W_s^2] = \mathrm{Var}(W_s) + E^2[W_s] = s + 0$.
The importance of this result is that, in fact, if a stochastic process has the property that:
$$\mathrm{Cov}(X_s, X_t) = \min\{s, t\}$$
$$X_t = \sqrt{c}\, W_{t/c}$$
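The ‘clock’ of the process $X_t$ has been scaled by a factor $c$. For example, the process has
been slowed down and magnified if $c > 1$ (and speeded up and shrunk if $c < 1$).
$$\begin{aligned}
\mathrm{Cov}(X_{t+u}, X_t) &= \mathrm{Cov}\left(\sqrt{c}\, W_{(t+u)/c}, \sqrt{c}\, W_{t/c}\right) \\
&= c\, \mathrm{Cov}\left(W_{(t+u)/c}, W_{t/c}\right) \\
&= c \min\left(\frac{t+u}{c}, \frac{t}{c}\right) \\
&= c \cdot \frac{t}{c} \\
&= t
\end{aligned}$$
assuming $u > 0$.
$$X_t = \frac{1}{\sqrt{a}}\, W_{at}$$
with $a = \frac{1}{c}$.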
$$X_t = t\, W_{1/t}$$
The time-inverted Wiener process is itself a Wiener process, as can be shown by Lévy’s
Theorem.
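Let $u > 0$.
Then we have:
$$\begin{aligned}
\mathrm{Cov}(X_{t+u}, X_t) &= (t+u)t\, \mathrm{Cov}\left(W_{1/(t+u)}, W_{1/t}\right) \\
&= (t+u)t \min\left(\frac{1}{t+u}, \frac{1}{t}\right) \\
&= (t+u)t \cdot \frac{1}{t+u} \\
&= t
\end{aligned}$$
The time-inverted Wiener process is useful in proving limiting properties. For example,
since $X_t = tW_{1/t}$ is a Wiener process, then:
$$\lim_{t \to \infty} \frac{W_t}{t} = \lim_{t \to \infty} X_{1/t} = X_0 = 0$$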
$$Z_t = \rho W_t^a + \sqrt{1 - \rho^2}\, W_t^b$$
where $W_t^a$ and $W_t^b$ are independent Wiener processes and $-1 \le \rho \le 1$ defines the
correlation between $Z_t$ and $W_t^a$.
Question
Show that $E[Z_t] = 0$.
Solution
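The process $Z_t$ is simply a weighted sum of two Wiener processes, both of which have zero
expectation. Therefore we have:
$$\begin{aligned}
E[Z_t] &= E\left[\rho W_t^a + \sqrt{1-\rho^2}\, W_t^b\right] \\
&= E\left[\rho W_t^a\right] + E\left[\sqrt{1-\rho^2}\, W_t^b\right] \\
&= \rho E\left[W_t^a\right] + \sqrt{1-\rho^2}\, E\left[W_t^b\right] \\
&= \rho \times 0 + \sqrt{1-\rho^2} \times 0 = 0
\end{aligned}$$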
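$$\begin{aligned}
\mathrm{Var}(Z_t) &= \rho^2\, \mathrm{Var}\left(W_t^a\right) + (1-\rho^2)\, \mathrm{Var}\left(W_t^b\right) \\
&= \rho^2 t + (1 - \rho^2) t \\
&= t
\end{aligned}$$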
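The construction of a correlated process can be sketched numerically. The code below is an illustration only: the function names and the values of $\rho$ and $t$ are assumptions. It builds $Z_t = \rho W_t^a + \sqrt{1-\rho^2}\, W_t^b$ from two independent normal draws and estimates the mean and variance of $Z_t$ and its correlation with $W_t^a$.

```python
import math
import random


def correlated_pair(rho, t, rng):
    """Return (W_t^a, Z_t) with Z_t = rho * W_t^a + sqrt(1 - rho^2) * W_t^b."""
    wa = rng.gauss(0.0, math.sqrt(t))  # W_t^a ~ N(0, t)
    wb = rng.gauss(0.0, math.sqrt(t))  # W_t^b ~ N(0, t), independent of W_t^a
    return wa, rho * wa + math.sqrt(1.0 - rho**2) * wb


def sample_moments(rho, t, n, seed=0):
    """Estimate E[Z_t], Var(Z_t) and corr(W_t^a, Z_t) from n simulated pairs."""
    rng = random.Random(seed)
    pairs = [correlated_pair(rho, t, rng) for _ in range(n)]
    mean_a = sum(a for a, _ in pairs) / n
    mean_z = sum(z for _, z in pairs) / n
    var_a = sum(a * a for a, _ in pairs) / n - mean_a**2
    var_z = sum(z * z for _, z in pairs) / n - mean_z**2
    cov = sum(a * z for a, z in pairs) / n - mean_a * mean_z
    return mean_z, var_z, cov / math.sqrt(var_a * var_z)


# Illustrative values (assumptions): rho = 0.6, t = 2.
mean_z, var_z, corr = sample_moments(rho=0.6, t=2.0, n=100_000)
print("E[Z_t] ~", mean_z, " Var(Z_t) ~", var_z, " corr ~", corr)
```

The estimates should be close to 0, $t$ and $\rho$ respectively.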
Question
Solution
$$= \rho^2 (t + u - t) + (1 - \rho^2)(t + u - t) = u$$
$$\lim_{t-s \to 0} \frac{W_t - W_s}{t - s} = \frac{dW_t}{dt} \quad \text{or} \quad \lim_{s-t \to 0} \frac{W_s - W_t}{s - t} = \frac{dW_t}{dt}$$
The first limit assumes that $t > s$, and the second covers the case when $s > t$. These
statements come from the definition of a derivative as the limit of a function’s gradient:
$$f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$$
If such a derivative existed, then we could find an arbitrarily small $\varepsilon$ to measure the difference
between the derivative and the gradient.
$$\frac{W_t - W_s}{t - s} - \frac{dW_t}{dt} \sim N\left(-\frac{dW_t}{dt}, \frac{1}{t-s}\right)$$
which will have a positive probability of being greater than $\varepsilon$, and so the statement is
uncertain. In fact, since the variance increases as $t - s \to 0$, it never holds, almost surely.
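The growth in the variance of the gradient can be seen numerically. The sketch below (the function name `difference_quotients` and the sample sizes are assumptions) samples $(W_{s+h} - W_s)/h$ for shrinking $h$, using only the fact that $W_{s+h} - W_s \sim N(0, h)$, and compares the sample variance with $1/h$.

```python
import math
import random
import statistics


def difference_quotients(h, n, seed=0):
    """Sample (W_{s+h} - W_s) / h, using W_{s+h} - W_s ~ N(0, h)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, math.sqrt(h)) / h for _ in range(n)]


for h in (1.0, 0.1, 0.01, 0.001):
    quotients = difference_quotients(h, n=50_000)
    print("h =", h, " sample variance:", statistics.variance(quotients), " 1/h:", 1.0 / h)
```

The sample variance tracks $1/h$, so the gradient does not settle down to a limit as the interval shrinks.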
The fundamental theorem of calculus states that, given a derivative $f'(x)$, the integral $\int f'(x)\,dx$
is understood as:
$$f(b) = f(a) + \int_a^b f'(x)\,dx = f(a) + \int_a^b \frac{df(x)}{dx}\,dx = f(a) + \int_a^b df(x)$$
However, since the idea of $\frac{dW_t}{dt}$ is meaningless, the stochastic integral $\int_a^b dW_t$ cannot be understood in the same way.
In other words, because a Wiener process isn’t differentiable anywhere, it’s not clear how
integrals should be handled when the variable of integration is $W_t$. Stochastic integrals will be
dealt with in the next chapter.
However successful the Brownian motion model may be for describing the movement of market
indices in the short run, it is useless in the long run, if only for the reason that a standard
Brownian motion is certain to become negative eventually. It could also be pointed out that the
Brownian motion model predicts that daily movements of size 100 or more would occur just as
frequently when the process is at level 100 as when it is at level 10,000.
$$S_t = e^{Z_t}$$
[Figure: Brownian Motion. A sample path, with value $x$ on the vertical axis against time $t$ on the horizontal axis.]
$S_t > 0$ for all $t$
Geometric Brownian motion features heavily in this course. For example, Black and Scholes’
Nobel prize-winning formula for pricing European options assumes that the price of the
underlying asset is a geometric Brownian motion.
The properties of $S_t$ are less helpful than those of Brownian motion. For example, $S_t$ has neither
independent increments nor stationary increments.
But this is not so important because $Z_t$ does possess these desirable properties. Analysis of path
properties of $S_t$ should involve first taking the logarithm of the observations, and then
performing the analysis using techniques appropriate to Brownian motion.
The log-return $\log\frac{S_t}{S_s}$ from time $s$ to time $t$ is given by $\log\frac{S_t}{S_s} = \log\frac{e^{Z_t}}{e^{Z_s}} = Z_t - Z_s$.
It follows by the independent increments property of Brownian motion that the log-returns, and
hence the returns themselves, are independent over disjoint time periods.
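The relationship between geometric Brownian motion and its log-returns can be sketched as follows. The function name `simulate_gbm`, the time grid and the parameter values are assumptions chosen for illustration. The code simulates $Z_t = Z_0 + \sigma W_t + \mu t$ on a grid, sets $S_t = e^{Z_t}$, and confirms that each log-return equals the corresponding increment of $Z$.

```python
import math
import random


def simulate_gbm(z0, mu, sigma, times, seed=0):
    """Return (Z, S) on the increasing grid `times`, where S_t = exp(Z_t)."""
    rng = random.Random(seed)
    z_path, w, prev = [], 0.0, 0.0
    for t in times:
        # Add an independent N(0, t - prev) increment to the Wiener process.
        w += rng.gauss(0.0, math.sqrt(t - prev))
        prev = t
        z_path.append(z0 + sigma * w + mu * t)
    s_path = [math.exp(z) for z in z_path]
    return z_path, s_path


# Illustrative values (assumptions): quarterly grid over one year.
times = [0.25, 0.5, 0.75, 1.0]
z_path, s_path = simulate_gbm(z0=0.0, mu=0.05, sigma=0.2, times=times)
for i in range(1, len(times)):
    log_return = math.log(s_path[i] / s_path[i - 1])
    print(times[i - 1], "->", times[i], log_return, z_path[i] - z_path[i - 1])
```

The two printed columns agree up to floating-point rounding, and every simulated value of $S_t$ is positive.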
2 Martingales
2.1 Introduction
In simple terms, a martingale is a stochastic process for which its current value is the best
estimate of its future value. So, the expected future value is the current value. Other ways of
thinking of a martingale are that the expected change in the process is zero or that the process
has ‘no drift’.
Note
Throughout this course we will be using the word ‘expected’ in its statistical sense, rather than in
the everyday sense.
Consider a person standing on a ‘never-ending’ ladder. Every minute they move up or down the
ladder one step, depending on whether a tossed coin comes up heads or tails.
In the everyday sense of the word, after the next toss of the coin, we ‘expect’ them to move up or
down (but we don’t know which way). However, in the statistical sense, because
$\tfrac12 \times (+1) + \tfrac12 \times (-1) = 0$, we ‘expect’ them to stay exactly where they are, even though there’s no
way that that can happen!
The idea of martingales is consistent with the original equestrian term ‘martingale’, meaning a
strap used to keep a horse ‘pointing straight ahead’.
Their importance for modern financial theory cannot be overstated. In fact, the whole theory of
pricing and hedging of financial derivatives is formulated in terms of martingales.
For this reason, it may be best to think of a martingale as being a random process that has ‘no
drift’ because the idea of drift is more consistent with the way we think about real financial
assets. We have already seen that it is possible to model the log of a share price, $\log S_t$, using
Brownian motion with a drift $\mu$. You can think of $\mu$ as being the rate of the long-term drift of
the log of the share price. It is the underlying non-random trend. It should not come as a great
surprise that when we remove this underlying non-random trend (or drift) and look at $\log S_t - \mu t$,
we obtain a martingale.
Conditional expectation
The features of martingales rely on the application of conditional expectations.
The filtration $F_t$ represents everything that can be known up to and including time $t$.
Some random variables will be known by time $t$. We say that $X_t$ is $F_t$-measurable if the value of
the process is known at time $t$, ie it belongs to $F_t$.
If $F_t$ is the filtration generated by $X_t$ (as opposed to any other process), then it is known as the
natural filtration of $X_t$ and is denoted here by $F_t^X$. The following results will be used extensively.
(i) $E\left[E[X \mid F_t^X]\right] = E[X]$
$X_t$ is adapted to $F_t$
$E[|X_t|] < \infty$ for all $t$
$E[X_t \mid F_s] = X_s$ for all $s \le t$
The first condition is just a technicality to ensure that the process value can be known with
certainty at time $t$, and the second is to guarantee that $X_t$ is integrable. In most questions we are
only concerned with the last condition and we’ll assume the first two hold.
Question
Solution
$E[X_t \mid F_s]$ means the expected value of the process at time $t$, given that we are at time $s$ and we
know the history of the process up to and including time $s$.
Of all the properties of martingales, the most useful is also the simplest: a martingale has constant
mean, ie $E[X_n] = E[X_0] = X_0$ for all $n$.
$$E[X_t \mid F_s] \le X_s \quad \text{(supermartingale)}$$
$$E[X_t \mid F_s] \ge X_s \quad \text{(submartingale)}$$
A supermartingale has either negative or zero drift, whereas a submartingale has either positive
or zero drift. So a process which is both a supermartingale and a submartingale must therefore
be a martingale.
Consider:
$$E[W_t \mid F_s^W] = W_s$$
The Wiener process is a martingale with respect to its natural filtration, noting that
$E[|W_t|] < \infty$ since $|W_t| < \infty$ almost surely.
Question
An asset’s value at time $t$ (in pence, with $t$ measured in years) is denoted by $A_t$ and fluctuates in
value from day to day. Within these random fluctuations, there appears to be an underlying
long-term trend, in that the asset’s value is increasing by 2 pence on average each week.
(i) Assuming that there are exactly 52 weeks in a year, suggest a process based on $A_t$ that
you think might be a martingale.
(ii) Suppose that the price increments have a continuous uniform distribution such that
$A_t - A_s \sim U[-52(t-s), 260(t-s)]$. Construct a martingale out of $A_t$.
Solution
(i) The value of the asset is increasing on average by 2 pence a week. Assuming that there
are exactly 52 weeks in a year, this means the asset is ‘drifting’ by 104 pence a year.
A martingale is a process without drift and so a good suggestion would be to remove this
drift and consider the process:
$$A_t - 104t$$
(ii) Using the formula for the expected value of a uniform distribution from page 13 of the
Tables, we have:
$$E[A_t \mid F_s] = E[A_s + (A_t - A_s) \mid F_s] = A_s + \frac{260(t-s) - 52(t-s)}{2} = A_s + 104(t-s)$$
$$E[A_t - 104t \mid F_s] = A_s - 104s$$
This now fits in with the definition given for a martingale in continuous time.
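$$\begin{aligned}
E\left[W_t^2 - t \mid F_s^W\right] &= E\left[(W_s + (W_t - W_s))^2 \mid F_s^W\right] - t \\
&= E\left[W_s^2 \mid F_s^W\right] + 2E\left[W_s(W_t - W_s) \mid F_s^W\right] + E\left[(W_t - W_s)^2 \mid F_s^W\right] - t \\
&= W_s^2 - s
\end{aligned}$$
$E[W_s^2 \mid F_s^W] = W_s^2$ since the value of $W_s$ is known with certainty at time $s$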
Question
Solution
Given that $W_t^2 - t$ is a martingale with zero drift, then $W_t^2$ must have positive drift. This means
that $W_t^2$ is a submartingale.
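The martingale property of $W_t^2 - t$ can be checked by simulation. This is a sketch under assumed values: the function name `conditional_mean` and the choices $W_s = 1.5$, $s = 1$, $t = 3$ are illustrative only. It fixes the value of $W_s$, adds an independent $N(0, t-s)$ increment to obtain $W_t$, and averages $W_t^2 - t$ over many draws. The result should be close to $W_s^2 - s$.

```python
import math
import random
import statistics


def conditional_mean(w_s, s, t, n, seed=0):
    """Estimate E[W_t^2 - t | W_s = w_s] using W_t = W_s + (W_t - W_s)."""
    rng = random.Random(seed)
    sd = math.sqrt(t - s)  # W_t - W_s ~ N(0, t - s), independent of the past
    draws = [(w_s + rng.gauss(0.0, sd)) ** 2 - t for _ in range(n)]
    return statistics.fmean(draws)


# Illustrative values (assumptions).
w_s, s, t = 1.5, 1.0, 3.0
print("estimate:", conditional_mean(w_s, s, t, n=200_000))
print("W_s^2 - s:", w_s**2 - s)
```

Repeating the calculation with $W_t^2$ in place of $W_t^2 - t$ would give a value close to $W_s^2 + (t - s)$, which exceeds $W_s^2$ and so illustrates the submartingale property.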
Finally, consider the stochastic process defined by $\exp\left(\lambda W_t - \tfrac12 \lambda^2 t\right)$.
The same approach is taken here as before; decompose the Wiener process, but also decompose
time $t$ into $s + (t - s)$.
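$$\begin{aligned}
E\left[\exp\left(\lambda W_t - \tfrac12 \lambda^2 t\right) \mid F_s^W\right] &= E\left[\exp\left(\lambda (W_s + (W_t - W_s)) - \tfrac12 \lambda^2 (s + (t - s))\right) \mid F_s^W\right] \\
&= \exp\left(\lambda W_s - \tfrac12 \lambda^2 s\right) E\left[\exp\left(\lambda (W_t - W_s) - \tfrac12 \lambda^2 (t - s)\right) \mid F_s^W\right]
\end{aligned}$$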
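To evaluate this expectation, consider a random variable $Z$ where $Z \sim N(\mu, \sigma^2)$; its moment
generating function is defined as $\mathrm{MGF}_Z(t) = E[\exp(Zt)] = e^{\mu t + \frac12 \sigma^2 t^2}$. Therefore when $t = 1$ we
have $E[\exp(Z)] = e^{\mu + \frac12 \sigma^2}$.
$$E[\exp(Z)] = \exp\left(\mu + \tfrac12 \sigma^2\right) = \exp\left(E[Z] + \tfrac12 \mathrm{Var}(Z)\right)$$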
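Since $Y_t = \lambda (W_t - W_s) - \tfrac12 \lambda^2 (t - s)$ is a normally distributed random variable, we
have that:
$$\begin{aligned}
E\left[\exp\left(\lambda (W_t - W_s) - \tfrac12 \lambda^2 (t - s)\right) \mid F_s^W\right] &= E[\exp(Y_t)] \\
&= \exp\left(-\tfrac12 \lambda^2 (t - s) + \tfrac12 \lambda^2 (t - s)\right) \\
&= 1
\end{aligned}$$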
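So:
$$E\left[\exp\left(\lambda W_t - \tfrac12 \lambda^2 t\right) \mid F_s^W\right] = \exp\left(\lambda W_s - \tfrac12 \lambda^2 s\right)$$
$$\exp\left(\lambda \int_0^T f(t)\,dW_t - \tfrac12 \lambda^2 \int_0^T (f(t))^2\,dt\right)$$
is a martingale.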
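A martingale has constant mean, so $E\left[\exp\left(\lambda W_t - \tfrac12 \lambda^2 t\right)\right]$ should equal its time-0 value of 1 for every $t$. The sketch below checks this by simulation; the function name `exp_martingale_mean` and the value $\lambda = 0.5$ are assumptions made for illustration.

```python
import math
import random
import statistics


def exp_martingale_mean(lam, t, n, seed=0):
    """Monte Carlo estimate of E[exp(lam * W_t - 0.5 * lam**2 * t)], W_t ~ N(0, t)."""
    rng = random.Random(seed)
    sd = math.sqrt(t)
    draws = [math.exp(lam * rng.gauss(0.0, sd) - 0.5 * lam**2 * t) for _ in range(n)]
    return statistics.fmean(draws)


lam = 0.5  # illustrative value (an assumption)
for t in (0.5, 1.0, 2.0):
    print("t =", t, " estimated mean:", exp_martingale_mean(lam, t, n=100_000))
```

Each estimate should be close to 1. Dropping the $-\tfrac12 \lambda^2 t$ term would instead give estimates close to $e^{\frac12 \lambda^2 t}$, which grows with $t$.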
Chapter 9 Summary
Wiener process (standard Brownian motion)
A Wiener process is a stochastic process with the defining properties:
$W_0 = 0$
$Z_t = Z_0 + \sigma W_t + \mu t$
$S_t = e^{Z_t}$
Martingales
A martingale is a stochastic process such that:
$X_t$ is adapted to $F_t$
$E[|X_t|] < \infty$ for all $t$
Martingales are processes with no drift. In fact, it can be shown that a martingale has
constant mean, ie:
$E[X_n] = E[X_0]$ for all $n$
Various martingales can be constructed from Wiener processes, for example, $W_t$, $W_t^2 - t$
and $e^{\lambda W_t - \frac12 \lambda^2 t}$.
9.1 (Exam style) Assume that the spot rate of interest at time $t$, $S(t)$, can be modelled by $S(t) = e^{-2\mu W(t)}$, where
$W(t)$ is a Brownian motion with drift coefficient $\mu$ and volatility coefficient 1 such that $W(0) = 0$.
(i) Write down an expression for $W(t)$ in terms of a standard Brownian motion, $B(t)$. [1]
9.2 ‘Brownian motion is the only process with stationary independent increments and continuous
sample paths.’
(ii) State the distribution of the increments for a standard Brownian motion.
9.3 (i) What is meant by saying that the process $\{Y_t\}$ is a martingale with respect to another
process $\{X_t\}$?
(ii) Show that $B_t$ and $B_t^2 - kt$ are both martingales with respect to $B_t$, for a suitably chosen
value of the constant $k$, which you should specify.
(iii) Show that there is a value of the constant $c$, which you should specify, such that
$(a + bB_t)^2 - ct$ is a martingale with respect to $B_t$, where $a$ and $b$ are constants.
(ii) What is the probability that $B_2$ takes a value in the interval $(-1, 1)$?
(iii) Show that the probability that $B_1$ and $B_2$ both take positive values is $\frac38$.
(iv) What is the probability that $B_t$ takes a negative value at some time between $t = 0$ and
$t = 2$?
9.5 Consider the statement: ‘If you want to find the variance of $X = B(s) + B(t)$, where $s < t$, for a
standard Brownian motion process, you can use the fact that $B(s)$ and $B(t)$ are independent to
get $\mathrm{Var}(X) = s + t$.’
(i) Explain why the statement is not correct, and find a correct expression for $\mathrm{Var}(X)$.
(ii) Hence show that the general formula for $\mathrm{Var}[B(t_1) + B(t_2)]$ when $t_1, t_2 \ge 0$ can be
expressed as $t_1 + t_2 + 2\min(t_1, t_2)$.
(ii) Hence find a general formula for the correlation coefficient $\rho(B_{t_1}, B_{t_2})$. [2]
[Total 5]
9.7 (Exam style) (i) Write down a formula for $E[e^{aX}]$ where $X \sim N(\mu, \sigma^2)$ and, by differentiating, or
otherwise, derive an expression for $E[Xe^{aX}]$. [2]
$$X_t = (B_t - at)\, e^{aB_t - \frac12 a^2 t}$$
Chapter 9 Solutions
9.1 (i) General Brownian motion
where $B(t)$ is standard Brownian motion, $\mu$ is the drift, $\sigma$ is the volatility coefficient and $W(0)$ is
the value of general Brownian motion at time 0.
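$$\begin{aligned}
E[S_t \mid F_s] &= E\left[e^{-2\mu(\mu t + B_t)} \mid F_s\right] \\
&= E\left[e^{-2\mu^2 t - 2\mu B_t} \mid F_s\right] \\
&= e^{-2\mu^2 t}\, E\left[e^{-2\mu B_t} \mid F_s\right] \\
&= e^{-2\mu^2 t}\, E\left[e^{-2\mu(B_t - B_s + B_s)} \mid F_s\right] \\
&= e^{-2\mu^2 t}\, e^{-2\mu B_s}\, E\left[e^{-2\mu(B_t - B_s)} \mid F_s\right] \\
&= e^{-2\mu^2 t}\, e^{-2\mu B_s}\, E\left[e^{-2\mu(B_t - B_s)}\right]
\end{aligned}$$ [2]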
The filtration $F_s$ can be left out because of the independent increments property. Now
$B_t - B_s \sim N(0, t-s)$ and so the expectation is of the form $E[e^{aX}]$ where $X \sim N(0, t-s)$ and
$a = -2\mu$. Therefore, we can use the MGF of a normal distribution calculated at point $-2\mu$ to
determine the expectation.
So, using the MGF formula, from page 11 of the Tables, we get:
$$\begin{aligned}
E[S_t \mid F_s] &= e^{-2\mu^2 t}\, e^{-2\mu B_s}\, e^{0(-2\mu) + \frac12 (t-s)(-2\mu)^2} \\
&= e^{-2\mu^2 s - 2\mu B_s} \\
&= S_s
\end{aligned}$$ [1]
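$$\begin{aligned}
E[S_t] &= E\left[e^{-2\mu^2 t - 2\mu B_t}\right] \\
&= e^{-2\mu^2 t}\, E\left[e^{-2\mu B_t}\right] \\
&= e^{-2\mu^2 t}\, M_{B_t}(-2\mu) \\
&= e^{-2\mu^2 t}\, e^{0(-2\mu) + \frac12 t (-2\mu)^2} = 1
\end{aligned}$$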
‘Continuous sample paths’ means that the function $t \mapsto B_t(\omega)$ for each particular realisation $\omega$ is
a continuous function of $t$.
Strictly, we should say that the process $\{Y_t\}$ is a martingale with respect to the filtration $\{F_t\}$ of
the process $\{X_t\}$, which means that:
$E[Y_t \mid F_s] = Y_s$ for all $s \le t$
and $E[|Y_t|] < \infty$
and $Y_t$ is adapted to $F_t$
If we use an $s$ subscript to denote the expected value with respect to the filtration at time $s$,
then we can write:
$$E_s[B_t] = E_s[(B_t - B_s) + B_s] = 0 + B_s = B_s$$
We have shown that the expected future value of $B_t$ is equal to its current value (at time $s$). We
also need to show that $E|B_t| < \infty$.
So: $E|B_t| \le E\left[1 + B_t^2\right] = 1 + \mathrm{var}(B_t) + [E(B_t)]^2 = 1 + t + 0^2 < \infty$
Similarly:
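$$\begin{aligned}
E_s\left[B_t^2\right] &= E_s\left[\{(B_t - B_s) + B_s\}^2\right] \\
&= (t - s) + 0^2 + 0 + B_s^2 \\
&= t - s + B_s^2
\end{aligned}$$
We can show that $E\left|B_t^2 - t\right| < \infty$ for any value of $t$, by first noting that:
$$\left|B_t^2 - t\right| \le B_t^2 + t$$
So: $E\left|B_t^2 - t\right| \le E\left[B_t^2 + t\right] = \mathrm{var}(B_t) + [E(B_t)]^2 + t = t + 0^2 + t = 2t < \infty$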
Since the expected future value of $B_t^2 - t$ is equal to its current value (at time $s$) and the expected
value of its modulus is finite, $B_t^2 - t$ is a martingale with respect to $B_t$, and the required constant
is $k = 1$.
So, if we use an $s$ subscript to denote the expected value with respect to the filtration at time $s$,
then:
$$E_s[B_t] = B_s$$
$$\begin{aligned}
E_s\left[(a + bB_t)^2\right] &= a^2 + 2abB_s + b^2\left(B_s^2 + t - s\right) \\
&= (a + bB_s)^2 + b^2 (t - s)
\end{aligned}$$
ie $(a + bB_t)^2 - b^2 t$ is a martingale with respect to $B_t$. So the required value of the constant is
$c = b^2$.
The technical condition $E\left|(a + bB_t)^2 - ct\right| < \infty$ will hold as in (ii).
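So: $P(B_2 > 0) = \tfrac12$
$B_2 = B_2 - B_0 \sim N(0, 2)$.
Standardising:
$$P(-1 < B_2 < 1) = \Phi\left(\tfrac{1}{\sqrt2}\right) - \Phi\left(-\tfrac{1}{\sqrt2}\right) = 0.760 - (1 - 0.760) = 0.520$$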
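The figure of 0.520 can be reproduced without the Tables. The sketch below is an illustration only (the function names are assumptions); it evaluates the standard normal distribution function through the error function in Python's `math` module and uses $B_t \sim N(0, t)$.

```python
import math


def std_normal_cdf(x):
    """Standard normal distribution function, written in terms of erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_in_interval(lo, hi, t):
    """P(lo < B_t < hi) for standard Brownian motion, using B_t ~ N(0, t)."""
    sd = math.sqrt(t)
    return std_normal_cdf(hi / sd) - std_normal_cdf(lo / sd)


print(round(prob_in_interval(-1.0, 1.0, 2.0), 4))  # 0.5205
```

The small difference from 0.520 arises because the solution above rounds the normal probability to three decimal places before subtracting.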
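$$p = P(X > 0, Y > -X)$$
Since the range of values of $Y$ depends on the value of $X$, we must use a double integral to
evaluate this:
$$p = \int_{x=0}^{\infty} \int_{y=-x}^{\infty} \phi(x)\, \phi(y)\, dy\, dx$$
where the joint density function is expressed as the product of the individual density functions by
independence.
So:
$$\begin{aligned}
p &= \int_{x=0}^{\infty} \phi(x) \int_{y=-x}^{\infty} \phi(y)\, dy\, dx \\
&= \int_{x=0}^{\infty} \phi(x) \Big[\Phi(y)\Big]_{-x}^{\infty}\, dx \\
&= \int_{x=0}^{\infty} \phi(x) \left[1 - \Phi(-x)\right] dx \\
&= \int_{x=0}^{\infty} \phi(x)\, \Phi(x)\, dx
\end{aligned}$$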
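Finally:
$$p = \left[\tfrac12 \Phi(x)^2\right]_0^{\infty} = \tfrac12 \left(1^2 - \left(\tfrac12\right)^2\right) = \tfrac38$$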
(iv) Probability that $B_t$ takes a negative value at some time between 0 and 2
The probability is 1 because $B_t$ will almost surely take a negative value at some point close to
$t = 0$.
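The value of $\frac38$ quoted in part (iii) of the question can be checked by simulation. This is a sketch only, with a hypothetical function name and sample size: it draws $B_1 \sim N(0,1)$, adds an independent $N(0,1)$ increment to obtain $B_2$, and counts how often both values are positive.

```python
import random


def prob_both_positive(n, seed=0):
    """Monte Carlo estimate of P(B_1 > 0 and B_2 > 0) for standard Brownian motion."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        b1 = rng.gauss(0.0, 1.0)       # B_1 ~ N(0, 1)
        b2 = b1 + rng.gauss(0.0, 1.0)  # B_2 = B_1 + independent N(0, 1) increment
        if b1 > 0 and b2 > 0:
            hits += 1
    return hits / n


print("estimate:", prob_both_positive(200_000), " exact: 3/8 =", 3 / 8)
```

The estimate should lie close to 0.375. It is noticeably larger than $\tfrac12 \times \tfrac12 = 0.25$, reflecting the positive correlation between $B_1$ and $B_2$.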
The statement is not correct because $B(s)$ and $B(t)$, which represent the value of the process at
two different times, are not independent. In fact, they are positively correlated.
It is actually the increments $B(s) - B(0)$ and $B(t) - B(s)$ that are independent. The correct
calculation can be done by expressing $X = B(s) + B(t)$ in terms of these increments:
$$\mathrm{Var}(X) = \mathrm{Var}\left[2B(0) + 2\{B(s) - B(0)\} + \{B(t) - B(s)\}\right] = 0 + 4s + (t - s) = 3s + t$$
If $s = t$:
$$\mathrm{Var}(X) = \mathrm{Var}[B(s) + B(t)] = \mathrm{Var}[2B(s)] = 4\,\mathrm{Var}[B(s)] = 4s \ (\text{or } 4t)$$
So a general formula would be $s + t + 2\min(s,t)$, or if we’re using $t_1$ and $t_2$ to denote the times,
$t_1 + t_2 + 2\min(t_1, t_2)$.
We can then calculate the covariance and correlation between these two values:
This formula only applies when $s < t$. We can generalise this to cover any positive times $t_1$ and $t_2$,
if we write it in the form:
$$\rho(B_{t_1}, B_{t_2}) = \sqrt{\frac{\min(t_1, t_2)}{\max(t_1, t_2)}}$$ [1]
[Total 2]
There are various alternative ways of writing this, eg $\rho(B_{t_1}, B_{t_2}) = \min\left(\sqrt{\frac{t_1}{t_2}}, \sqrt{\frac{t_2}{t_1}}\right)$.
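The correlation formula can be written as a small function. The sketch below is illustrative (the function name `bm_correlation` is an assumption); it computes the correlation directly as covariance over the product of standard deviations, using $\mathrm{Cov}(B_{t_1}, B_{t_2}) = \min(t_1, t_2)$ and $\mathrm{Var}(B_t) = t$, and compares the result with the square-root form above.

```python
import math


def bm_correlation(t1, t2):
    """Correlation of B_{t1} and B_{t2}: min(t1, t2) / sqrt(t1 * t2)."""
    if t1 <= 0 or t2 <= 0:
        raise ValueError("times must be positive")
    return min(t1, t2) / math.sqrt(t1 * t2)


for t1, t2 in [(1.0, 4.0), (4.0, 1.0), (2.0, 2.0), (1.0, 100.0)]:
    alternative = math.sqrt(min(t1, t2) / max(t1, t2))
    print(t1, t2, bm_correlation(t1, t2), alternative)
```

The two columns agree, the function is symmetric in its arguments, and the correlation falls towards zero as the two times move further apart.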
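$$E\left[e^{aX}\right] = e^{\mu a + \frac12 \sigma^2 a^2}$$ [½]
This is the MGF of a normal random variable and can be found in the Tables.
We therefore have:
$$E\left[Xe^{aX}\right] = \frac{d}{da} E\left[e^{aX}\right] = \frac{d}{da}\, e^{\mu a + \frac12 \sigma^2 a^2} = \left(\mu + \sigma^2 a\right) e^{\mu a + \frac12 \sigma^2 a^2}$$ [1½]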
[Total 2]
2
E Xt Fs E Bt at eaBt 0.5a t Fs
a Bs Bt Bs 0.5a2t
E Bs Bt Bs at e Fs
2 a B B 2 a B B
E Bs at eaBs 0.5a t e t s Fs E Bt Bs eaBs 0.5a t e t s Fs [2]
Now we can use the fact that we are conditioning on all the information known at time $s$. Any
terms involving $B_s$ can be taken outside the expectation, as can terms in $s$ and $t$ (which are
fixed, not random points in time). This gives:
$$(B_s - at)\, e^{aB_s - 0.5a^2 t}\, E\left[e^{a(B_t - B_s)} \mid F_s\right] + e^{aB_s - 0.5a^2 t}\, E\left[(B_t - B_s)\, e^{a(B_t - B_s)} \mid F_s\right]$$ [1]
We can then drop the conditions since the increments are independent of the past:
$$(B_s - at)\, e^{aB_s - 0.5a^2 t}\, E\left[e^{a(B_t - B_s)}\right] + e^{aB_s - 0.5a^2 t}\, E\left[(B_t - B_s)\, e^{a(B_t - B_s)}\right]$$ [1]
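Using part (i), and noting that $B_t - B_s \sim N(0, t-s)$ so that $\mu = 0$ and $\sigma^2 = t - s$, we therefore get:
$$\begin{aligned}
&(B_s - at)\, e^{aB_s - 0.5a^2 t}\, e^{0.5a^2 (t-s)} + e^{aB_s - 0.5a^2 t}\, a(t-s)\, e^{0.5a^2 (t-s)} \\
&= (B_s - as)\, e^{aB_s - 0.5a^2 s} \\
&= X_s
\end{aligned}$$ [1]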