1 HJB: The Stochastic Case: 1.1 Brownian Motion
It turns out that Zt cannot be well-defined, not even when we use the
Lebesgue integral instead of the Riemann integral. The problem is that
zt is not a measurable function of time. This means that if we want to
determine the size of certain sets of the type At = {t : zt ≤ α}, α ∈ R,
these sets can never be approximated well by taking countable unions and
intersections of intervals (why?), which is in the end the only thing that
we're sure how to deal with.1 But integrating always requires that we put
some simple functions that are constant on sets like At below the function
we want to integrate (in this case zt). If we cannot even measure sets of
the type At, the integral certainly cannot be well-defined.
The next idea is not to work with the shock process zt itself, but with its
integral Zt . If we can impose certain criteria on it to ensure that its incre-
ments behave like the white noise we have in mind and it is mathematically
1
Mathematicians in the 19th century spent a lot of effort on coming up with the
most general class of sets on the real line that we can assign a measure to, and it turned
out that measurable sets basically have to be the result of an algorithm as described above.
handier, then we should be happy to work with this kind of process. As you
can imagine, the Brownian Motion Bt will be the result of this effort.
But let’s first motivate the construction a bit more. Let’s say that we
want to create a process Wt , whose increments over a unit of time have
standardized variance:
$$\mathrm{Var}(W_{t+1} - W_t) = 1$$
Then definitely the increments of length 1/n have to have variance 1/n:

$$\mathrm{Var}(W_{t+1} - W_t) = \sum_{i=0}^{n-1} \mathrm{Var}\big(W_{t+\frac{i+1}{n}} - W_{t+\frac{i}{n}}\big) = n\,\mathrm{Var}\big(W_{t+\frac{1}{n}} - W_t\big),$$
where the covariance terms in the first step are zero because we definitely
want the increments to be independent, and where for the second step we
require that the variance of increments of the same size be equal.
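The scaling argument above is easy to check numerically. Below is a minimal sketch (standard library only; the values of n and the number of sample paths are our own choices): a unit-time increment built from n i.i.d. pieces of variance 1/n should itself have variance close to 1.

```python
import random
import statistics

# Build many unit-time increments, each as the sum of n i.i.d. pieces
# with variance 1/n, and estimate the variance of the total.
random.seed(0)
n, n_paths = 100, 20_000
totals = [
    sum(random.gauss(0.0, (1.0 / n) ** 0.5) for _ in range(n))
    for _ in range(n_paths)
]
var_hat = statistics.pvariance(totals)  # should be close to 1
```

Independence of the pieces is what makes the covariance terms vanish; with correlated pieces the variances would not simply add up.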
We can also impose another requirement on our process by making the
chopping of [t, t + 1] ever finer: since the increments on the chopped
pieces are i.i.d. with finite variance, the Central Limit Theorem applies
as the chopping number n grows, so Wt+1 − Wt has to be normally distributed.
Also, note that there is nothing peculiar about the size of our chosen
interval [t, t + 1]; our chopping argument is true for an interval of whatever
tiny length. This leads us to our next requirement on Bt , which is that its
increments should be normally distributed with variance equal to the size
of the interval under consideration. Of course, non-overlapping increments
should be independent.
It turns out that we can also require any realization of the process Bt to
be a continuous function of time without losing anything, so we definitely
want to do this to make our life easier. Note that for a continuous function,
the sets At = {t : Bt ≤ α}, α ∈ R, are well-behaved; mathematically speaking,
any continuous function is measurable, which will make it possible to work
with Bt.
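These requirements translate directly into a simulation recipe. A minimal sketch (the horizon T and the grid size are our own choices): draw independent N(0, ∆t) increments and take their running sum.

```python
import math
import random

# Discretized Brownian path on [0, T]: B_0 = 0, with independent
# N(0, dt) increments between neighboring grid points.
random.seed(1)
T, n_steps = 1.0, 1_000
dt = T / n_steps
B = [0.0]
for _ in range(n_steps):
    B.append(B[-1] + random.gauss(0.0, math.sqrt(dt)))
```

Between grid points the simulated path is of course only an approximation, but refining the grid brings it closer to the continuous object Bt.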
(1) is called a “stochastic differential equation”. But what does it actually
mean? The dBt takes the role of the εt in discrete time, s(Xt ) is a function
that specifies how volatile the process is at a certain point in the state space,
and a(Xt ) gives us a deterministic trend component depending on the state
Xt . Often we will determine functions a(·) and s(·) that are inspired by some
idea we have in discrete time, and then we will try to start working from that.
The problem is that we still don’t know how to actually compute the
value of Xt for t > 0. The goal should be that we can calculate Xt when
somebody gives us a realization Bt of the shock process over an interval [0, T ].
Since we know well how to think about discrete pieces of time, we will have
a look at the following approximation of what we mean by equation (1)2
$$X_{t+\Delta t} - X_t = a(X_t)\,\Delta t + s(X_t)\,\varepsilon_t, \qquad \varepsilon_t \sim \text{i.i.d. } N(0, \Delta t) \qquad (2)$$
This is meant recursively: We compute a(X0 ) and s(X0 ) with the X0 from
our initial condition. Then we read off B∆t from the realization of Bt that
we're given, and compute X∆t. With this value X∆t in hand, we can again
evaluate a and s, read off the next increment and compute X2∆t , and so
forth. Then we repeat the same, but this time for a smaller ∆t.
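This recursion is known as the Euler-Maruyama scheme. A minimal sketch (the particular drift and volatility functions in the example are illustrative choices, not taken from the notes):

```python
import math
import random

def euler_maruyama(a, s, x0, T, n, seed=0):
    """Approximate X_t on [0, T] from the recursion (2), drawing the
    Brownian increments eps ~ N(0, dt) as we go."""
    rng = random.Random(seed)
    dt = T / n
    x, path = x0, [x0]
    for _ in range(n):
        eps = rng.gauss(0.0, math.sqrt(dt))  # the increment of B we "read off"
        x = x + a(x) * dt + s(x) * eps       # one step of (2)
        path.append(x)
    return path

# Illustrative mean-reverting example (our choice of a and s).
path = euler_maruyama(a=lambda x: -0.5 * x, s=lambda x: 0.2, x0=1.0, T=1.0, n=1_000)
```

Passing different functions a and s reproduces the recursion (2) for any diffusion of this form.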
In the end, we will — under some technical assumptions — arrive at some
limit values Xt for all t ∈ [0, T ]; we’re lucky, since Prof. Ito has proved this
for us. This limit is called the Ito Integral, which involves integrating against
the Brownian Motion Bt and looks somewhat intimidating at first:
$$X_t = X_0 + \int_0^t a(X_u)\,du + \int_0^t s(X_u)\,dB_u$$
Everybody who says this is an easy equation is a fool: as we have seen,
the values of Xt have to be computed along the way as a function of the
shocks to Bt up to t, and it is not at all clear that there is something
like a closed-form solution for Xt given the previous evolution of Bt.
Actually, if we have something like this, we call it a “solution to a stochas-
tic differential equation”.3 Specifically, a solution is a function g(t, X0 , {Bs }s∈[0,t] )
that tells us how to compute Xt given the ingredients X0 and Bt up to t.
2
We have in mind that a(·) and s(·) are sufficiently smooth functions in order to do
this.
3
These solutions can be found using Ito's Lemma, which is something like a rule for
taking derivatives of stochastic processes: with this rule, we can take the derivative
of a function f (Xt ) of some stochastic process with respect to time, and then see if
the result resembles some stochastic differential equation that we have written down
before, having in mind some discrete-time model.
1.3 An example: Geometric Brownian Motion
An example is in order to illustrate what it means to obtain a solution for a
stochastic differential equation. Consider the following law of motion:
$$dX_t = \mu X_t\,dt + \sigma X_t\,dB_t \qquad (3)$$

One can verify with Ito's Lemma (derived below) that, starting from X0 = 1, this equation has the solution

$$X_t = g(t, B_t) = \exp\Big(\big(\mu - \tfrac{\sigma^2}{2}\big)\,t + \sigma B_t\Big). \qquad (4)$$
Notice that a solution like this makes our life a lot easier: Given Bt , we don’t
have to do the approximation procedure from equation (2) anymore, but we
can just substitute the single number Bt into (4) to obtain a 100%-correct
value for Xt . Also, with this solution in hand, we see immediately that Xt
is log-normally distributed; we are given the mean and variances for free, we
can easily compute quantiles, and even the joint distribution properties for
any collection {X1 , . . . , Xn } are relatively easy to obtain.
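To see the limit statement at work, we can run the Euler recursion from (2) on the law of motion (3) and compare the endpoint with the closed-form solution evaluated at the same realization of Bt; a minimal sketch, with parameter values that are our own choices:

```python
import math
import random

# Euler steps for dX = mu*X dt + sigma*X dB, tracking the same Brownian
# path, then compare with X_t = exp((mu - sigma^2/2) t + sigma B_t).
random.seed(2)
mu, sigma, T, n = 0.05, 0.2, 1.0, 100_000
dt = T / n
x, B = 1.0, 0.0
for _ in range(n):
    dB = random.gauss(0.0, math.sqrt(dt))
    x += mu * x * dt + sigma * x * dB  # Euler step as in (2)
    B += dB                            # accumulate the same realization of B_t
closed_form = math.exp((mu - sigma**2 / 2) * T + sigma * B)
rel_err = abs(x - closed_form) / closed_form
```

As the chopping gets finer the relative error shrinks, which is exactly why the closed form deserves the name "solution".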
Look again at equation (4). It is important to be surprised by the term
−σ²/2 that shows up in front of t: this is the famous Ito term, which has
to do with Jensen's inequality.
To see that we need this term, suppose that µ = 0 and that the term
−σ²/2 didn't appear, i.e. we suggest a different solution X̂t = ĝ(Bt ) =
exp(σBt ) in this particular case. We will now show that this solution X̂t
would give us values that are fundamentally at odds with some properties of
the stochastic differential equation we wrote down in (3).
If we take any discrete chopping of [0, t] and approximate Xt à la (2), we
see that all increments must have expectation zero:
$$\mathbb{E}_t\big[X_{t+\Delta t} - X_t\big] = \mathbb{E}_t\big[\sigma X_t\,\varepsilon_t\big] = 0.$$

The suggested solution X̂t = exp(σBt ), however, has mean E[X̂t ] = exp(σ²t/2),
which grows over time, so its increments cannot all have expectation zero.4
4
This is Jensen's inequality at work: the correct solution Xt = exp(−σ²t/2 + σBt )
corrects for this effect, since the factor exp(−σ²t/2) divides out the mean of the
log-normal variable exp(σBt ).
The following derivation is somewhat heuristic; see the book by Øksendal
for a more rigorous treatment.
The basic idea is to study how the function f (Xt ) changes over a small
time interval [t, t + r] by taking Taylor approximations. Before we start, first
observe that if we take r small enough, then X will not move far away from
its initial value Xt and we can approximate the stochastic process Xt by
$$X_{t+\tau} - X_t \approx \bar a\,\tau + \bar s\,(B_{t+\tau} - B_t)$$
for the interval [t, t + r], where ā ≡ a(Xt ) and s̄ ≡ s(Xt ). For this, it is
enough that the functions a and s are continuous.
Similarly, the function f (X) can be approximated well by a second-order
Taylor approximation around Xt as long as the state X does not move away
too far from its initial level (we will later see why we need to keep terms up
to second order):
$$f(X_{t+r}) = f(X_t) + \sum_{j=1}^{N} \bar f'\,\big(\bar a\,\Delta t + \bar s\,\Delta B_{t_j}\big) + \frac{1}{2}\sum_{j=1}^{N} \bar f''\,\Big(\bar s^{2}\,\Delta B_{t_j}^{2} + \underbrace{2\bar a\bar s\,\Delta t\,\Delta B_{t_j}}_{\to 0} + \underbrace{\bar a^{2}\,\Delta t^{2}}_{\to 0}\Big),$$

where we have chopped [t, t + r] into N pieces of length ∆t = r/N and ∆Btj
denotes the increment of B over the j-th piece.
Now, as we let N → ∞, we see that the last term is of order ∆t2 and vanishes
when summed up. Similarly, the second-to-last term is of order ∆t3/2 and
vanishes (see your homework). As for the remaining term multiplying f¯00 ,
5
Notice that actually we would have to take f′(Xτ ) = f̄′ + f̄″(Xτ − Xt ) when taking
the second-order approximation of f (X) seriously; follow the rest of the proof to see why it is
not necessary to be this precise once the terms Xτ − Xt become small!
the random variables ∆Btj² are i.i.d. (scaled χ²-distributed) with mean ∆t and finite
variance. By the law of large numbers, we will have $\sum_{j=1}^{N} \Delta B_{t_j}^2 \to N\Delta t = r$
as N → ∞ (see your homework again).
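The law-of-large-numbers step can be checked numerically as well. A minimal sketch (r = 1 and the grid sizes are our own choices): the sum of squared Brownian increments over [0, r] settles down near r as N grows.

```python
import math
import random

# Each squared increment (dB_j)^2 has mean dt, so the sum over the N
# pieces has mean N*dt = r; its fluctuations die out as N grows.
random.seed(3)
r = 1.0
for N in (10, 1_000, 100_000):
    dt = r / N
    qv = sum(random.gauss(0.0, math.sqrt(dt)) ** 2 for _ in range(N))
```

This "quadratic variation" being deterministic in the limit is what turns the f̄″ term into the ½f″s² dt Ito term.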
Replacing the bar-notation (ā,s̄,f¯0 ,f¯00 ) again by functions evaluated at Xt ,
we conclude that
$$f(X_{t+r}) - f(X_t) = f'(X_t)\big(a(X_t)\,r + s(X_t)(B_{t+r} - B_t)\big) + \tfrac{1}{2}\,f''(X_t)\,s(X_t)^{2}\,r. \qquad (5)$$
We see that the expected change in f (X) over the interval [t, t + r] is given
by
$$\mathbb{E}_t\big[f(X_{t+r})\big] - f(X_t) = f'(X_t)\,a(X_t)\,r + \tfrac{1}{2}\,f''(X_t)\,s(X_t)^{2}\,r.$$
So the expected change in f is partly due to the drift a in Xt (the first term),
and partly due to convexity/concavity of f in combination with shocks (the
second term; look again at the derivation of this term if the intuition is
not clear to you). In addition to the expected change in f , there is the
martingale term f 0 (Xt )s(Xt )(Bt+r − Bt ), which captures the effect of shocks
on f .
As r → 0, we re-write (5) as Ito’s Lemma (or the Ito Rule) in differential
notation:
$$df(X_t) = \underbrace{f'(X_t)\,a(X_t)\,dt}_{\text{drift-induced term}} + \underbrace{f'(X_t)\,s(X_t)\,dB_t}_{\text{martingale term}} + \underbrace{\tfrac{1}{2}\,f''(X_t)\,s(X_t)^{2}\,dt}_{\text{Ito term}} \qquad (6)$$
If you’re not sure about how to read this type of notation in any stochastic-
calculus problem, always go back to the discrete version (5) that we derived
to gain intuition.
The term f 0 (Xt )a(Xt )dt in (6) clearly captures the effect of the determin-
istic drift component, and f 0 (Xt )s(Xt )dBt represents random motions. The
Ito term 21 f 00 (Xt )s(Xt )2 dt captures the effects of Jensen’s inequality: If the
function f (·) is convex, then there is an upward movement of f (Xt ) in expec-
tation additional to the one induced by the drift a(Xt )dt; positive changes in
Bt will have a larger impact on f (Xt ) than negative movements of the same
size, since the slope of f (Xt ) is increasing in Xt .
It is a good exercise to apply Ito’s Rule to several important functions on
the process dXt = adt + σdBt . Think about the intuition for the resulting
stochastic differential equations and the meaning of the single terms in it.
If you start with exp(Xt ), you will obtain the example from the previous
section, for example.
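The expectation implied by Ito's Rule can also be checked by brute force. A minimal Monte Carlo sketch for f = exp and dXt = a dt + σdBt (the parameter values are our own choices): the simulated mean change of f(Xt) over a short interval r should match f′(Xt)a·r + ½f″(Xt)σ²·r.

```python
import math
import random

# For X_{t+r} = x0 + a*r + sigma*(B_{t+r} - B_t), Ito's Rule predicts
# E[exp(X_{t+r})] - exp(x0) ≈ exp(x0) * (a + sigma**2 / 2) * r for small r.
random.seed(4)
a, sigma, x0, r, draws = 0.1, 0.3, 0.0, 0.01, 200_000
mean_f = sum(
    math.exp(x0 + a * r + sigma * random.gauss(0.0, math.sqrt(r)))
    for _ in range(draws)
) / draws
mc_change = mean_f - math.exp(x0)
ito_change = math.exp(x0) * (a + sigma**2 / 2) * r
```

Dropping the σ²/2 piece would bias the prediction downward; that piece is precisely the Jensen-type Ito term.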
Finally, we can integrate up (6) over time (that is: for a given realiza-
tion Bt of Brownian motion) to obtain
$$f(X_t) = f(X_0) + \int_0^t f'(X_u)\,a(X_u)\,du + \frac{1}{2}\int_0^t f''(X_u)\,s(X_u)^{2}\,du + \int_0^t f'(X_u)\,s(X_u)\,dB_u.$$
We can now come back to a problem we faced before: How can we find closed-
form solutions to SDEs? It works as follows: We can try and apply Ito’s
Lemma to different functions f (·) and obtain stochastic differential equations
as in (6). Note that then, of course f (Xt ) is a solution to the stochastic
differential equation (6). If Xt is something we’re easily able to evaluate as
a function of Bt , as is the case for Xt = at + σBt and f (X) = exp(X), we
have found a solution to an SDE. This is analogous to taking derivatives
of conventional functions and using the result as a solution to an ordinary
differential equation (ODE): For g(X) = exp(X), we have g 0 (X) = exp(X) =
g(X), so g(X) = exp(X) solves the ODE g 0 (x) = g(x) with initial condition
g(0) = 1.
The law of motion we now consider is

$$dX_t = a[X_t, h(X_t)]\,dt + s[X_t, h(X_t)]\,dB_t$$
This means that the drift as well as the volatility of Xt may be influenced by
the control h(Xt ). Of course this general case encompasses the easier case
where the planner cannot choose his exposure to certain shocks, in which
case s[·] is only a function of Xt but not of h(Xt ).
Look at the following example to understand what this means exactly:
Suppose that an agent can either put his assets A into a safe investment that
increases his wealth at a certain rate r or invest them into a risky asset which
increases wealth at a stochastic rate, whose average is higher than r:
$$dA_t = \big[h(A_t)\,r + (1 - h(A_t))\,\mu\big]\,A_t\,dt + (1 - h(A_t))\,\sigma\,A_t\,dB_t, \qquad \mu > r, \qquad (7)$$
where At is the level of asset holdings at t and the function 0 ≤ h(At ) ≤ 1
specifies which fraction of his assets the agent wants to put into the riskless
bond at a certain level of At . Notice that when we fix a certain policy h(·),
equation (7) gives us a stochastic differential equation that determines the
behavior of At .
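Fixing a policy h(·) indeed pins down the behavior of At , which we can simulate with the same Euler steps as before. A minimal sketch (the symbols rf, mu, sigma and all numerical values are illustrative assumptions, not from the notes):

```python
import math
import random

def simulate_wealth(h, a0=1.0, rf=0.02, mu=0.06, sigma=0.2, T=1.0, n=10_000, seed=5):
    """Euler steps for wealth with fraction h in the safe rate rf and
    fraction 1-h in a risky return with mean mu > rf and volatility sigma."""
    rng = random.Random(seed)
    dt = T / n
    a = a0
    for _ in range(n):
        dB = rng.gauss(0.0, math.sqrt(dt))
        drift = (h * rf + (1 - h) * mu) * a
        vol = (1 - h) * sigma * a
        a += drift * dt + vol * dB  # one Euler step on the wealth dynamics
    return a

safe = simulate_wealth(h=1.0)   # all in the bond: no dB exposure, deterministic growth
mixed = simulate_wealth(h=0.5)  # half risky: higher mean growth, positive volatility
```

With h = 1 the dB-exposure vanishes and wealth grows deterministically at the safe rate; lowering h raises the mean growth at the price of volatility.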
Now we turn to the optimal choice of h(·): Choosing among all functions
h(·) in a certain class, we want to find the one that maximizes a certain
reward function u(·) discounted over time, i.e.
$$\max_{h(\cdot)\in H}\ \mathbb{E}_0 \int_0^\infty e^{-\beta t}\,u(X_t)\,dt \qquad (8)$$

$$\text{where}\quad X_t = X_0 + \int_0^t a[X_u, h(X_u)]\,du + \int_0^t s[X_u, h(X_u)]\,dB_u,$$
where H is some function space that I am too lazy to specify exactly.
We will take an approach as for the discrete-time HJB and ask which fixed
value h̄ we should choose given that we are in a certain state Xt . Again, we
will argue that in the problems we are interested in, the optimal policy should
be a continuous function of the state Xt , the process Xt shouldn’t make crazy
jumps etc. Then we can look — as in the discrete case — for an optimal h̄
to get the functional equation for the value function:
$$v(X_t) \simeq \max_{\bar h}\left\{ \int_t^{t+\varepsilon} e^{-\beta(s-t)}\,u(X_t, \bar h)\,ds + \mathbb{E}_t\Big[e^{-\beta\varepsilon}\,v(X_{t+\varepsilon})\Big] \right\} \qquad (9)$$
Note that I have fixed Xt in the first term pertaining to the instantaneous
reward u(·): As in the deterministic case, we argue that Xt will barely move
over a very short time and that we don't lose much accuracy by doing this.
Now we use what we know from the Ito rule (6) to calculate how v(Xt+ε )
evolves:

$$v(X_{t+\varepsilon}) = v(X_t) + \int_t^{t+\varepsilon} v'(X_\tau)\,a(X_\tau, \bar h)\,d\tau + \frac{1}{2}\int_t^{t+\varepsilon} v''(X_\tau)\,s(X_\tau, \bar h)^{2}\,d\tau + \int_t^{t+\varepsilon} v'(X_\tau)\,s(X_\tau, \bar h)\,dB_\tau$$

$$\approx v(X_t) + v'(X_t)\,a(X_t, \bar h)\,\varepsilon + \tfrac{1}{2}\,v''(X_t)\,s(X_t, \bar h)^{2}\,\varepsilon + v'(X_t)\,s(X_t, \bar h)\,(B_{t+\varepsilon} - B_t). \qquad (10)$$
For the approximation in the second step we use again that for tiny ε the
movements in the state Xt are (almost surely) so small that we get very close
to the truth by just putting Xt+s = Xt when we evaluate the functions a(·),
s(·), v 0 (·) and v 00 (·). Now we plug (10) into (9) and observe that Et [Bt+ε −
Bt ] = 0. As in the deterministic case, we can then take the term e−βε v(Xt )
to the left-hand side, divide everything by ε and take the limit ε → 0 to
obtain the stochastic Hamilton-Jacobi-Bellman (HJB) equation:
$$\beta\,v(X_t) = \max_{\bar h}\Big\{ u(X_t, \bar h) + v'(X_t)\,a(X_t, \bar h) + \tfrac{1}{2}\,v''(X_t)\,s(X_t, \bar h)^{2} \Big\}$$
Note that the only point where the stochastic nature of the problem enters
into the HJB is in the last term, which involves the second derivative of the
value function. The intuition for this term again has to do with Jensen’s
inequality: If the value function is concave, one will not like the effects from
random movements over the state space. If possible, the planner will try to
keep volatility low by choosing h̄ such that s(·, h̄) is small.
For the sake of completeness, we also have a look at the HJB for the
generalized problem where we are given a terminal value. Actually, one can
state a more general case than in the deterministic setting: the process
need not be terminated at a fixed time T for all possible histories; instead,
the planner could be stopped at different times for different histories of the
world. However, stating all this mathematically is somewhat tedious, so just
bear in mind the simple case where we have a “bequest function” vT (xT ) for
some fixed terminal point T in time.
Then, the (non-stationary) HJB is:
$$\beta\,v(t, X_t) = \max_{\bar h}\Big\{ u(X_t, \bar h) + \partial_t v(t, X_t) + \partial_X v(t, X_t)\,a(X_t, \bar h) + \tfrac{1}{2}\,\partial_{XX} v(t, X_t)\,s(X_t, \bar h)^{2} \Big\}, \qquad v(T, x) = v_T(x).$$