Fin 2 Mi QEFAll
at UNISG)
Paul Söderlind1
20 December 2010
1 University of St. Gallen. Address: s/bf-HSG, Rosenbergstrasse 52, CH-9000 St. Gallen,
Switzerland. E-mail: Paul.Soderlind@unisg.ch. Document name: Fin2MiQEFAll.TeX
Contents
13 Bond Portfolios
13.1 Duration: Definitions
13.2 Duration Matching
13.3 Yield Curve Models
13.4 Interest Rates and Macroeconomics
13.5 Forecasting Interest Rates
13.6 Risk Premia on Fixed Income Markets
14.6 Pricing Bounds and Convexity of Pricing Functions
12 Interest Rate Calculations
Main references: Elton, Gruber, Brown, and Goetzmann (2010) 21–22 and Hull (2006) 4
Additional references: McDonald (2006) 7; Fabozzi (2004); Blake (1990) 3–5; and
Campbell, Lo, and MacKinlay (1997) 10
Suppose we borrow one unit of currency (that is, the face value of the loan is 1) that
should be repaid with interest $m$ periods later. The payment in period $m$ is then the
face value (of 1) plus the interest, so the payment in $m$ is
\[
[1+Y(m)]^m = \exp[m\,y(m)] = 1 + mZ(m),
\]
where $Y(m)$ is the effective interest rate, $y(m)$ the continuously compounded interest rate,
and $Z(m)$ the simple interest rate.
Remark 12.1 (The transformation from one type of rate to the other) We have
\[
y(m) = \ln[1+Y(m)] = \ln[1+mZ(m)]/m.
\]
The different interest rates are typically very similar, except for very high rates. See
Figure 12.1 for an illustration.
Suppose a bond without dividends costs $B(m)$ in $t$ and gives one unit of account in $t+m$
(the trade date index $t$ is suppressed to simplify notation; in case of potential confusion,
we can write $B_t(m)$). The gross return (payoff divided by price) from investing in this
bond is $1/B(m)$, since the face value is normalized to unity. The relation between the bond
price and the effective (spot) interest rate is
\[
\frac{1}{B(m)} = [1+Y(m)]^m, \text{ or} \tag{12.5}
\]
\[
Y(m) = B(m)^{-1/m} - 1. \tag{12.6}
\]
Another way to think of this is that if we invest the amount $B(m)$ by buying one bond,
then after $m$ periods we get $B(m)$ times the gross interest, that is, $B(m)[1+Y(m)]^m = 1$.
In practice, bond quotes are typically expressed in percentages (like 97) of the face value,
whereas the discussion here effectively uses the fraction of the face value (like 0.97).
The relation between the rate and the price is clearly non-linear—and depends on the
time to maturity (m): short rates are more sensitive to bond price movements than long
rates. Conversely, prices on short bonds are less sensitive to interest rate changes than
prices on long bonds. See Figure 12.1 for an illustration.
In terms of the continuously compounded rate, we have
\[
\frac{1}{B(m)} = \exp[m\,y(m)], \text{ or} \tag{12.7}
\]
\[
y(m) = -\ln B(m)/m. \tag{12.8}
\]
Example 12.2 (Effective and continuously compounded rates) Let the period length be a
year (which is the most common convention for interest rates). Consider a six-month bill
so $m = 0.5$. Suppose $B(m) = 0.95$. From (12.5) and (12.8) we then have that
\[
\frac{1}{0.95} = [1+Y(0.5)]^{0.5}, \text{ so } Y(0.5) \approx 0.108 \text{ and } y(0.5) \approx 0.103.
\]
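The conversions between a zero coupon bond price and the three types of rates can be sketched in a few lines of Python. This is only an illustration: the function name `rates_from_price` is made up for this sketch and does not come from the notes.

```python
import math

def rates_from_price(price, m):
    """Return (effective, continuously compounded, simple) rates for a zero coupon bond.

    price is the bond price B(m) per unit of face value, m the time to maturity.
    """
    effective = price ** (-1.0 / m) - 1.0      # (12.6)
    continuous = -math.log(price) / m          # (12.8)
    simple = (1.0 / price - 1.0) / m           # (12.10)
    return effective, continuous, simple

# Six-month bill with price 0.95, as in Example 12.2
Y, y, Z = rates_from_price(0.95, 0.5)
print(round(Y, 3), round(y, 3), round(Z, 3))
```

The three rates are close to each other here, in line with the remark that they only differ much at very high rates.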
Example 12.3 (Bond price changes vs interest rate changes) Suppose that, over a split
second (so the time to maturity is virtually unchanged), the log bond price changes by
$\Delta \ln B(m)$. Then (12.8) says that the change in the interest rate is
\[
\Delta y(m) = -\Delta \ln B(m)/m.
\]
For instance, if the price of a 10-year bond decreases from 0.95 to 0.86 we get that the
interest rate increases by
\[
0.01 \approx -\ln(0.86/0.95)/10,
\]
that is, by about one percentage point.
[Figure 12.1: Bond prices and rates (two panels)]
Some fixed income instruments (in particular interbank loans, LIBOR/EURIBOR)
are quoted in terms of a simple interest rate. The “price” of a deposit that gives unity at
maturity is then
\[
B(m) = \frac{1}{1+mZ(m)}, \text{ or} \tag{12.9}
\]
\[
Z(m) = \frac{1/B(m) - 1}{m}. \tag{12.10}
\]
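[Figure 12.2: Timing convention of a forward contract. In $t$: write contract, agree on forward price. In $t+m$: pay forward price, get bond. In $t+n$: get 1.]
[Figure 12.3: Synthetic forward contract. In $t$: buy one $n$-bond, sell $B(n)/B(m)$ of $m$-bonds. In $t+m$: pay $B(n)/B(m)$ (the principal). In $t+n$: get 1.]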
A forward contract written in $t$ stipulates buying, at $t+m$, a discount bond that pays one
unit of account at time $t+n$ (see Figure 12.2 for an illustration). An arbitrage argument
(see Figure 12.3) shows that the forward price must satisfy
\[
\text{forward price} = B(n)/B(m). \tag{12.11}
\]
Proof. (of (12.11)) In period $t$, buy one bond maturing in $t+n$ at the cost of $B(n)$
and sell $B(n)/B(m)$ bonds maturing in $t+m$ at the value of $B(n)$: the net investment in
$t$ is zero. In $t+m$, pay the principal of the maturing bonds at the cost $B(n)/B(m)$: this
is the net investment in $t+m$. The payoff in $t+n$ is one. The forward contract has the
same payoff in $t+n$ and must therefore specify the same net investment in $t+m$, the
forward price: $B(n)/B(m)$.
Buying a forward contract is effectively an investment from $t+m$ to $t+n$, that
is, over $n-m$ periods. The gross return (which happens to be known already in $t$) is
$1/[B(n)/B(m)]$. We define a per period effective rate of return, a forward rate, $F(m,n)$,
analogous with (12.5)
\[
\frac{1}{B(n)/B(m)} = [1+F(m,n)]^{n-m}. \tag{12.12}
\]
Notice that $F(m,n)$ here denotes a forward rate, not a forward price. This is the rate
of return over $t+m$ to $t+n$ that can be guaranteed in $t$. By using the relation between
bond prices and yields (12.5) this expression can be written
\[
F(m,n) = \left[\frac{B(m)}{B(n)}\right]^{1/(n-m)} - 1
       = \frac{[1+Y(n)]^{n/(n-m)}}{[1+Y(m)]^{m/(n-m)}} - 1. \tag{12.13}
\]
[Figure 12.4: Spot rates (2Q and 3Q) and the implied forward rate; vertical axis: effective interest rate]
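[Figure 12.5: Time line from 0 to $n$ split into intervals of length $h$: 0, $h$, $2h$, $3h$, ..., $n$]
Split up the time until $n$ into $n/h$ intervals of length $h$ (see Figure 12.5). Then, the
$n$-period spot rate equals the geometric average of the $h$-period forward rates over $t$ to
$t+n$
\[
1+Y(n) = \prod_{s=0}^{n/h-1} \{1+F[sh,(s+1)h]\}^{h/n}. \tag{12.14}
\]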
This means that the forward rate can be seen as the “marginal cost” of making a loan
longer. See Figure 12.6 for an illustration.
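Proof. (of (12.14)) Let $n = 2m$ and use (12.12) for forward contracts from 0 to $m$
and from $m$ to $2m$
\[
\frac{1}{B(m)/B(0)} = [1+F(0,m)]^m \text{ and } \frac{1}{B(2m)/B(m)} = [1+F(m,2m)]^m.
\]
Multiply and simplify (using $B(0) = 1$) to get
\[
\frac{1}{B(n)} = [1+F(0,m)]^m [1+F(m,2m)]^m.
\]
Since $1/B(n) = [1+Y(n)]^n$ by (12.5), $1+Y(n)$ is the geometric average of the two gross forward rates.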
Example 12.4 (Forward rate) Let $m = 0.5$ (six months) and $n = 0.75$ (nine months),
and suppose that $Y(0.5) = 0.04$ and $Y(0.75) = 0.05$. Then (12.13) gives
\[
[1+F(0.5,0.75)]^{0.25} = \frac{(1+0.05)^{0.75}}{(1+0.04)^{0.5}},
\]
which gives $F(0.5,0.75) \approx 0.07$. See Figure 12.4 for an illustration.
[Figure 12.6: Spot and one-period forward rates; panels: Upward sloping yield curve, Flat yield curve, and a third panel whose title is not recoverable; axes: interest rate (%) against maturity]
Example 12.5 (Forward rate) Let the period length be a year. Let $m = 1$ (one year) and
$n = 2$ (two years), and suppose that $Y(1) = 0.04$ and $Y(2) = 0.05$. Then (12.13) gives
\[
F(1,2) = \frac{(1+0.05)^2}{(1+0.04)^1} - 1 \approx 0.06.
\]
Example 12.6 (Spot as average forward rate) In the previous example, (12.14) gives,
using $F(0,1) = Y(1)$,
\[
1.04^{1/2} \times 1.06^{1/2} \approx 1.05.
\]
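The forward rate examples can be reproduced with a short Python sketch. The function name `forward_rate` and its argument names are illustrative only. The last step checks that the two-year spot rate is the geometric average of the one-period forward rates.

```python
def forward_rate(y_m, m, y_n, n):
    """Effective forward rate F(m, n) from the spot rates Y(m) and Y(n), as in (12.13)."""
    growth = (1.0 + y_n) ** n / (1.0 + y_m) ** m   # equals B(m)/B(n)
    return growth ** (1.0 / (n - m)) - 1.0

f_12 = forward_rate(0.04, 1.0, 0.05, 2.0)    # numbers from Example 12.5
f_q = forward_rate(0.04, 0.5, 0.05, 0.75)    # numbers from Example 12.4

# Geometric average of F(0,1) = Y(1) and F(1,2), with h = 1 and n = 2
spot_2 = ((1.0 + 0.04) * (1.0 + f_12)) ** 0.5 - 1.0
print(round(f_12, 4), round(f_q, 4), round(spot_2, 4))
```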
12.3.2 Continuously Compounded and Simple Forward Rates
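Taking logs of $1+F(m,n)$ in (12.13) we get the continuously compounded forward rate
\[
f(m,n) = \frac{n\,y(n) - m\,y(m)}{n-m}. \tag{12.15}
\]
Analogous to (12.14), the $n$-period spot rate is the (arithmetic) average of the $h$-period
forward rates
\[
y(n) = \frac{h}{n}\sum_{s=0}^{n/h-1} f[sh,(s+1)h]. \tag{12.16}
\]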
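The instantaneous forward rate, $f(m)$, is defined as the limit when the maturity date of
the bond approaches the settlement date of the forward contract, $n \to m$. This can be
thought of as a forward “overnight” rate $m$ periods ahead in time. From (12.15) it is
\[
f(m) = \lim_{n \to m} f(m,n) = y(m) + m\frac{dy(m)}{dm}. \tag{12.20}
\]
Conversely, the spot rate is the average of the instantaneous forward rates
\[
y(n) = \frac{1}{n}\int_0^n f(s)\,ds. \tag{12.21}
\]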
Equations (12.20) and (12.21) show that the difference between the forward and spot
rates, $f(n) - y(n)$, is proportional to the slope of the yield curve.
[Figure 12.7: Timing convention of a coupon bond: coupons $c$ at $m_1$, $m_2$, $m_3$, ..., and $c+1$ at $m_K$]
Proof. (of (12.21)) Integrating the first term on the right hand side of (12.20) over
$[0,n]$ gives $\int_0^n y(s)\,ds$. Integrating (by parts) the second term on the right hand side of
(12.20) over $[0,n]$, $\int_0^n s\frac{dy(s)}{ds}\,ds$, gives $n\,y(n) - \int_0^n y(s)\,ds$. Adding the
two terms gives $n\,y(n)$.
Consider a bond which pays coupons, $c$, for $K$ periods ($t+m_1$, $t+m_2$, ...), and one unit
of account (the “face” or “par” value) in the last period $t+m_K$ (see Figure 12.7).
The coupon bond is, in fact, a portfolio of zero coupon bonds: $c$ maturing in $t+m_1$,
$c$ in $t+m_2$, ..., $c$ in $t+m_K$, and the face value of 1 in $t+m_K$. The price of the coupon
bond must therefore equal the price of the portfolio
\begin{align}
B^c(K,c) &= \sum_{k=1}^{K} B(m_k)c + B(m_K) \tag{12.22}\\
         &= \sum_{k=1}^{K} \frac{c}{[1+Y(m_k)]^{m_k}} + \frac{1}{[1+Y(m_K)]^{m_K}}, \tag{12.23}
\end{align}
where $B(m)$ is defined as in (12.5). The length of the time periods is typically a year, but
the expression is correct also for other conventions.
Example 12.7 (Coupon bond price) Suppose $B(1) = 0.95$ and $B(2) = 0.90$. The price
of a bond with a 6% annual coupon with two years to maturity is then
\[
0.95 \times 0.06 + 0.90 \times (0.06 + 1) \approx 1.01.
\]
Equivalently, the bond prices imply that $Y(1) \approx 5.3\%$ and $Y(2) \approx 5.4\%$ so
\[
1.01 \approx \frac{0.06}{1.053} + \frac{0.06+1}{1.054^2}.
\]
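Pricing a coupon bond as a portfolio of zero coupon bonds is easy to sketch in Python. The function name `coupon_bond_price` is illustrative, and the sketch assumes one discount factor per coupon date with the face value of 1 repaid on the last date.

```python
def coupon_bond_price(discount_factors, coupon):
    """Price per unit of face value, as in (12.22).

    discount_factors: zero coupon bond prices B(m_1), ..., B(m_K), one per coupon date.
    """
    coupons_value = sum(coupon * b for b in discount_factors)
    return coupons_value + discount_factors[-1]   # face value of 1 repaid at m_K

price = coupon_bond_price([0.95, 0.90], 0.06)     # numbers from Example 12.7
print(round(price, 3))
```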
Example 12.8 (Coupon bond price at par) Suppose $B(1) = 1/1.06$ and $B(2) = 1/1.091^2$.
The price of a bond with a 9% annual coupon with two years to maturity is then
\[
\frac{0.09}{1.06} + \frac{0.09}{1.091^2} + \frac{1}{1.091^2} \approx 1.
\]
This bond is (approximately) sold “at par”, that is, the bond price equals the face (or
par) value (which is 1 in this case).
If we knew all the spot interest rates, then it would be easy to calculate the correct
price of the coupon bond. However, the situation is typically the reverse: we know prices
on several coupon bonds (different maturities and coupons), and want to calculate the
spot interest rates that are compatible with them. This amounts to estimating the yield curve.
The implied zero coupon bond prices are often called the discount function.
The effective yield to maturity (also called redemption yield), $\theta$, on a coupon bond is the
internal rate of return which solves
\[
B^c(K,c) = \sum_{k=1}^{K} \frac{c}{(1+\theta)^{m_k}} + \frac{1}{(1+\theta)^{m_K}}, \tag{12.24}
\]
where the bond pays coupons, $c$, at $m_1, m_2, \ldots, m_K$ periods ahead. This equation can be
solved (numerically) for $\theta$. Quotes of bonds are typically the yield to maturity or the
price. For a par bond (the bond price equals the face value, here 1), the yield to maturity
equals the coupon rate. For a zero coupon bond, the yield to maturity equals the spot
interest rate.
Example 12.9 (Yield to maturity) Consider a 4% (annual coupon) bond with 2 years to maturity.
Suppose the price is 1.019. The yield to maturity is 3% since it solves
\[
1.019 \approx \frac{0.04}{1+0.03} + \frac{0.04}{(1+0.03)^2} + \frac{1}{(1+0.03)^2}.
\]
Example 12.10 (Yield to maturity of a par bond) Consider a 4% (annual coupon) par bond (price
of 1) with 2 years to maturity. The yield to maturity is 4% since
\[
\frac{0.04}{1+0.04} + \frac{0.04}{(1+0.04)^2} + \frac{1}{(1+0.04)^2} = 1.
\]
Example 12.11 (Yield to maturity of a portfolio) A 1-year discount bond with a ytm (effective
interest rate) of 7% has the price $1/1.07$ and a 3-year discount bond with a ytm of
10% has the price $1/1.1^3$. A portfolio with one of each bond has a ytm, $\theta$, that solves
\[
\frac{1}{1.07} + \frac{1}{1.1^3} = \frac{1}{1+\theta} + \frac{1}{(1+\theta)^3}, \text{ with } \theta \approx 0.091.
\]
This is clearly not the average ytm of the two bonds. It would be, however, if the yield
curve were flat.
Note that the yield to maturity is just a convention. In particular, it does not provide a
measure of the return to an investor who buys the bond and keeps it until maturity—unless
the yield curve is flat.
To calculate the buy-and-hold (until maturity) return of a coupon bond we need to specify
how the coupons are reinvested. One useful assumption about the reinvestment of the
coupons is that they are done by a forward contract. This means that the investor buys
the bond now and receives nothing until maturity—as if he/she had bought a zero-coupon
bond. Indeed, no-arbitrage arguments show that the return (from now to maturity)
equals the spot interest rate on a zero-coupon bond.
Proof. (Buy-and-hold return on a coupon bond, simple case) Consider a 3-period
coupon bond. From (12.22), the price of the bond is
\[
B^c(3,c) = B(1)c + B(2)c + B(3)(c+1).
\]
From (12.12), we know that the forward contract for the first coupon has the gross return
(until maturity) $1/[B(3)/B(1)]$ and that the forward contract for the second coupon has
the gross return (until maturity) $1/[B(3)/B(2)]$. The value of the reinvested coupons and
the face value at maturity is then
\[
\frac{B(1)}{B(3)}c + \frac{B(2)}{B(3)}c + c + 1.
\]
Dividing by the first equation (the investment) gives $1/B(3)$, so the return on buying and
holding (and reinvesting the coupons) this coupon bond is the same as the 3-period spot
interest rate. (The extension to more periods is straightforward.)
Example 12.12 (Yield to maturity versus return) Suppose that the spot (zero coupon)
interest rates are 4% for one year to maturity and 9% for 2 years to maturity. Notice that
the forward rate (between year 1 and 2) is 14.24%. A 3% coupon bond with 2 years to
maturity must have the price
\[
\frac{0.03}{1.04} + \frac{0.03+1}{1.09^2} \approx 0.8958.
\]
The yield to maturity is 8.91% since
\[
0.8958 \approx \frac{0.03}{1+0.0891} + \frac{0.03+1}{(1+0.0891)^2}.
\]
However, the value of the bond at maturity, if the coupon is reinvested by a forward
contract, is
\[
0.03 \times (1+0.1424) + 0.03 + 1 \approx 1.0643,
\]
so the gross return is approximately $1.0643/0.8958$. Annualized ($\sqrt{1.0643/0.8958}$) this
becomes 1.09, so the return is 9%, just like the 2-year spot rate.
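The claim that the buy-and-hold return equals the spot rate when coupons are reinvested through forward contracts can be checked numerically. The sketch below uses the numbers of Example 12.12; the variable names are illustrative.

```python
spot = {1: 0.04, 2: 0.09}      # spot rates for 1 and 2 years, from Example 12.12
coupon = 0.03

# Zero coupon bond prices B(m) implied by the spot rates, from (12.5)
discount = {m: (1.0 + y) ** (-m) for m, y in spot.items()}
price = coupon * discount[1] + (coupon + 1.0) * discount[2]

# Forward rate between year 1 and year 2, from (12.13) with n - m = 1
forward_12 = discount[1] / discount[2] - 1.0

# First coupon reinvested at the forward rate, plus the final coupon and the face value
value_at_maturity = coupon * (1.0 + forward_12) + coupon + 1.0

annual_return = (value_at_maturity / price) ** 0.5 - 1.0
print(round(price, 4), round(forward_12, 4), round(annual_return, 4))
```

The annualized return coincides with the 2-year spot rate (9%), not with the yield to maturity (8.91%).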
where all coefficients except one are positive. There is then only one positive real root,
$\theta_1$. Many software packages contain routines for finding roots of polynomials. Once that
is done, pick the only positive real root, $\theta_1$, and calculate the yield as $\theta = (1-\theta_1)/\theta_1$.
Remark 12.14 (Calculating $\theta$ in the simplest case) If the bond price, $B^c$, is unity, then
the bond is sold “at par.” If also $m_k$ in (12.24) is the integer $k$ (as in the previous remark),
then $\theta = c$.
Example 12.15 (Par bond) A 9% (annual coupons) bond with a yield to maturity
of 9% and exactly two years to maturity has the price
\[
\frac{0.09}{1+0.09} + \frac{0.09}{(1+0.09)^2} + \frac{1}{(1+0.09)^2} = 1.
\]
Remark 12.16 (Newton-Raphson algorithm for solving (12.24)) It is straightforward to
use a Newton-Raphson algorithm to solve (12.24). It is then useful to note that the
derivative is
\[
\frac{dB(\theta)}{d\theta} = -\sum_{k=1}^{K} \frac{m_k c}{(1+\theta)^{m_k+1}} - \frac{m_K}{(1+\theta)^{m_K+1}}.
\]
The Newton-Raphson algorithm is based on a first order Taylor expansion of the bond
price equation
\[
B(\theta_1) = B(\theta_0) + (\theta_1 - \theta_0)\frac{dB(\theta_0)}{d\theta}.
\]
Set the left hand side equal to the observed price, $B$, guess a value of $\theta$ and call it $\theta_0$;
then solve for $\theta_1$ as $\theta_1 = \theta_0 + [B - B(\theta_0)]/\frac{dB(\theta_0)}{d\theta}$. $\theta_1$ is probably a better
guess of $\theta$ than $\theta_0$. Improve by repeating this updating as
$\theta_2 = \theta_1 + [B - B(\theta_1)]/\frac{dB(\theta_1)}{d\theta}$, and so forth until $\theta_n$ converges.
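A minimal Python sketch of the Newton-Raphson iteration for the yield to maturity follows. The function names, the starting guess of 5%, and the stopping rule (step size below `tol`) are assumptions made for this illustration and are not taken from the notes.

```python
def bond_price(theta, coupon, times):
    """Coupon bond price at yield theta, as in (12.24); face value 1 is paid at times[-1]."""
    pv_coupons = sum(coupon / (1.0 + theta) ** m for m in times)
    return pv_coupons + 1.0 / (1.0 + theta) ** times[-1]

def bond_price_derivative(theta, coupon, times):
    """Derivative of the bond price with respect to theta."""
    d_coupons = sum(-m * coupon / (1.0 + theta) ** (m + 1) for m in times)
    return d_coupons - times[-1] / (1.0 + theta) ** (times[-1] + 1)

def ytm_newton(price, coupon, times, guess=0.05, tol=1e-10, max_iter=50):
    """Iterate theta <- theta + [B - B(theta)] / B'(theta) until the step is tiny."""
    theta = guess
    for _ in range(max_iter):
        slope = bond_price_derivative(theta, coupon, times)
        step = (price - bond_price(theta, coupon, times)) / slope
        theta += step
        if abs(step) < tol:
            return theta
    raise RuntimeError("Newton-Raphson did not converge")

theta = ytm_newton(1.019, 0.04, [1, 2])   # 2-year bond, 4% coupon, price 1.019
print(round(theta, 4))
```

For well-behaved bond pricing problems like this one the iteration typically converges in a handful of steps, but it can fail for a poor starting guess, which is why the sketch caps the number of iterations.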
Remark 12.17 (Bisection method for solving (12.24)) The bisection method is a very
simple (no derivatives are needed) and robust way to solve for the yield to maturity. First,
start with a lower ($\theta_L$) and higher ($\theta_H$) guess of the yield which are known to bracket the
true value, that is, $B(\theta_H) \le B \le B(\theta_L)$ where $B$ is the observed bond price. Second,
calculate the bond price at the average of the two guesses: $B[(\theta_L+\theta_H)/2]$. Third, replace
either $\theta_H$ or $\theta_L$ according to: if $B \le B[(\theta_L+\theta_H)/2]$ (so the midpoint $(\theta_L+\theta_H)/2$
is below the true yield) then replace $\theta_L$ by $(\theta_L+\theta_H)/2$ (a higher value), but if
$B > B[(\theta_L+\theta_H)/2]$ then replace $\theta_H$ by $(\theta_L+\theta_H)/2$. Fourth, iterate until $\theta_L \approx \theta_H$.
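The four steps of the bisection method can be sketched in Python as follows. The function names, the default bracket of 0% to 100%, and the tolerance are assumptions for this illustration.

```python
def bond_price(theta, coupon, times):
    """Coupon bond price at yield theta, as in (12.24); face value 1 is paid at times[-1]."""
    pv_coupons = sum(coupon / (1.0 + theta) ** m for m in times)
    return pv_coupons + 1.0 / (1.0 + theta) ** times[-1]

def ytm_bisection(price, coupon, times, lo=0.0, hi=1.0, tol=1e-10):
    """Solve (12.24) by bisection; lo and hi must bracket the yield."""
    if not bond_price(hi, coupon, times) <= price <= bond_price(lo, coupon, times):
        raise ValueError("initial guesses do not bracket the yield")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if price <= bond_price(mid, coupon, times):
            lo = mid      # midpoint is at or below the true yield: raise the lower bound
        else:
            hi = mid      # midpoint is above the true yield: lower the upper bound
    return 0.5 * (lo + hi)

theta = ytm_bisection(1.019, 0.04, [1, 2])   # 2-year bond, 4% coupon, price 1.019
print(round(theta, 4))
```

Each iteration halves the bracket, so the method needs more iterations than Newton-Raphson but cannot overshoot once the initial bracket is valid.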
Example 12.18 (Bisection method). The first couple of iterations for a 2-year bond with
a 4% coupon and a price of 1.019 are (see also Figure 12.8)
[Figure 12.8: Bisection (yield bounds) and Newton-Raphson (yield) against the iteration number, for a 2-year bond with a 4% coupon and price 1.019; convergence criterion: 1e-005]
A par yield is the coupon rate at which a bond would trade at par (that is, have a price
equal to the face value). Setting $B^c(K,c) = 1$ in (12.22) and solving for the implied
coupon rate gives
\begin{align}
c &= \frac{1}{\sum_{k=1}^{K} B(m_k)}[1 - B(m_K)], \text{ or} \tag{12.25}\\
  &= \frac{1}{\sum_{k=1}^{K} \frac{1}{[1+Y(m_k)]^{m_k}}}\left\{1 - \frac{1}{[1+Y(m_K)]^{m_K}}\right\}. \tag{12.26}
\end{align}
Example 12.19 Suppose $B(1) = 0.95$ and $B(2) = 0.90$. We then have
\[
1 = (0.95 + 0.9)c + 0.9, \text{ so } c = \frac{1}{0.95+0.9}(1 - 0.9) \approx 0.054.
\]
[Figure: Spot rates and par yields; panels: Upward sloping yield curve, Flat yield curve, and a third panel whose title is not recoverable; axes: interest rate (%) against maturity]
A swap contract involves a sequence of payments over the lifetime (maturity) of the
contract: for each tenor (that is, sub period, for instance a quarter) it pays the floating market
rate (say, the 3-month Libor) in return for a fixed swap rate. Split up the time until
maturity $n$ into $n/h$ intervals of length $h$ (see Figure 12.11). In period $sh$, the swap contract
pays
\[
h[Z_{(s-1)h}(h) - R], \tag{12.27}
\]
where $Z_{(s-1)h}(h)$ is the short (floating) simple $h$-period interest rate in $(s-1)h$ and $R$ is
the (fixed) swap rate determined in $t$ (as part of the swap contract).
The issuer can lock in the floating rate payments by a sequence of forward rate
agreements that pay the floating rate in return for the forward rate. In this way the swap contract
becomes riskfree so its present value must be zero. This implies that the swap rate must be
\[
R = \frac{1}{h}\,\frac{1 - B(n)}{\sum_{s=1}^{n/h} B(sh)}, \tag{12.28}
\]
which is proportional to the par yield in (12.25).
Example 12.20 (Swap rate) Consider a one-year swap contract with quarterly periods
($n = 1$, $h = 1/4$). Then (12.28) gives
\[
R = 4\,\frac{1 - B(1)}{B(1/4) + B(1/2) + B(3/4) + B(1)}.
\]
With the bond prices (0.99, 0.98, 0.97, 0.96) we have
\[
R = 4\,\frac{1 - 0.96}{0.99 + 0.98 + 0.97 + 0.96} \approx 4.1\%.
\]
[Figure: Interest rate swap between a company and a bank: 7% on EUR 75 (each year) in one direction, Libor on EUR 75 (each year) in the other]
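The swap rate formula, and the zero present value that it is derived from, can be checked with a small Python sketch. The function names are illustrative; the present value function assumes that the floating payments have been locked in at the simple forward rates implied by the bond prices.

```python
def swap_rate(discount_factors, h):
    """Swap rate R from (12.28); discount_factors are B(h), B(2h), ..., B(n)."""
    return (1.0 - discount_factors[-1]) / (h * sum(discount_factors))

def swap_present_value(discount_factors, h, rate):
    """PV of receiving the locked-in floating rate and paying the fixed rate each period."""
    previous = [1.0] + list(discount_factors[:-1])   # B(0) = 1, then B(h), ...
    # Each period contributes B(sh) * h * (forward rate - rate) = B((s-1)h) - B(sh) - h*rate*B(sh)
    return sum(b_prev - b - h * rate * b for b_prev, b in zip(previous, discount_factors))

bond_prices = [0.99, 0.98, 0.97, 0.96]   # quarterly bond prices from Example 12.20
r = swap_rate(bond_prices, 0.25)
print(round(r, 4), abs(swap_present_value(bond_prices, 0.25, r)) < 1e-12)
```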
Proof. (of (12.28)) Notice that a simple forward rate for an investment from $sh$ to
$(s+1)h$ is
\[
Z^f[sh,(s+1)h] = \frac{1}{h}\left\{\frac{B(sh)}{B[(s+1)h]} - 1\right\}.
\]
We can therefore write the present value of (12.27) as
\[
PV = \sum_{s=1}^{n/h} B(sh)\left\{\frac{B[(s-1)h]}{B(sh)} - 1 - hR\right\}.
\]
Since it is riskfree the PV should be zero (or else there are arbitrage opportunities), which
we rearrange as
\begin{align*}
hR\sum_{s=1}^{n/h} B(sh) &= \sum_{s=1}^{n/h} B(sh)\left\{\frac{B[(s-1)h]}{B(sh)} - 1\right\}\\
hR\sum_{s=1}^{n/h} B(sh) &= 1 - B(n),
\end{align*}
where we have used the fact that $B(0) = 1$. Finally, solve for $R$ to get (12.28).
[Figure 12.11: Timing convention of a swap contract: swap rate and floating rate payments at 0, $h$, $2h$, $3h$, $4h$. (The payments to the party receiving the floating rate are marked by + or −)]
The (zero coupon) spot rate curve is of particular interest: it helps us price any bond or
portfolio of bonds—and it has a clear economic meaning (“the price of time”).
In some cases, the spot rate curve is actually observable—for instance from swaps
and STRIPS. In other cases, the instruments traded on the market include some zero
coupon instruments (bills) for short maturities (up to a year or so), but only coupon bonds
for longer maturities. This means that the spot rate curve needs to be calculated (or
estimated). This section describes different methods for doing that.
We can sometimes calculate large portions of the yield curve directly from asset prices.
The idea is to calculate a short yield first (from a bill/bond with short time to maturity)
and then use this to calculate the yield for the next (longer) bond, and so on.
For instance, suppose we have a one-period coupon bond, which by (12.22) must have
the price
\[
B^c[1,c(1)] = B(1)[c(1) + 1], \tag{12.29}
\]
where we use $c(1)$ to indicate the coupon value of this particular bond. The equation
immediately gives the one-period discount function value, $B(1)$. Suppose we also have a
two-period coupon bond, which pays the coupon $c(2)$ in $t+1$ and $t+2$ as well as the
principal in $t+2$, with the price (see (12.22))
\[
B^c[2,c(2)] = B(1)c(2) + B(2)[c(2) + 1]. \tag{12.30}
\]
The two-period discount function value, $B(2)$, can be calculated from this equation since
it is the only unknown. We can then move on to the three-period bond,
\[
B^c[3,c(3)] = B(1)c(3) + B(2)c(3) + B(3)[c(3) + 1], \tag{12.31}
\]
to calculate $B(3)$, and so forth. Finally, we can use (12.5) to transform these zero coupon
bond prices to spot interest rates.
Example 12.22 (Bootstrapping) Suppose we know that $B(1) = 0.95$ and that the price
of a bond with a 6% annual coupon with two years to maturity is 1.01. Since the coupon
bond must be priced as
\[
1.01 = 0.95 \times 0.06 + B(2) \times (0.06 + 1),
\]
we can solve for the price of a two-period zero coupon bond as $B(2) \approx 0.90$. The spot
interest rates are then
\[
\frac{1}{0.95} = 1 + Y(1) \text{ or } Y(1) \approx 0.053
\]
\[
\frac{1}{0.90} = [1 + Y(2)]^2 \text{ or } Y(2) \approx 0.054.
\]
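The bootstrapping recursion can be sketched in Python as below. The sketch assumes annual coupons and exactly one bond for each maturity 1, 2, 3, ... (so no gaps and no duplicates); the function name `bootstrap` and the input layout are illustrative.

```python
def bootstrap(bonds):
    """Discount function values B(1), B(2), ... from annual-coupon bonds.

    bonds: list of (price, coupon) pairs; the k-th entry matures in k periods.
    """
    discount = []
    for price, coupon in bonds:
        # price = coupon * [B(1) + ... + B(k-1)] + (coupon + 1) * B(k)
        known = coupon * sum(discount)
        discount.append((price - known) / (coupon + 1.0))
    return discount

# A one-period zero coupon bond at 0.95 and the two-year 6% bond from Example 12.22
discount = bootstrap([(0.95, 0.0), (1.01, 0.06)])
spot = [b ** (-1.0 / (k + 1)) - 1.0 for k, b in enumerate(discount)]
print([round(b, 3) for b in discount], [round(y, 3) for y in spot])
```

The two-year spot rate differs slightly in the third decimal from the value in the example, which rounds $B(2)$ to 0.90 before computing the rate.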
Unfortunately, the bootstrap approach is tricky to use. First, there are typically gaps
between the available maturities. One way around that is to interpolate. Second (and
quite the opposite), there may be several bonds with the same maturity but with different
coupons/prices, so it is impossible to calculate a unique yield curve. This could be solved
by excluding some data.
If we attach some random error to the bond prices in (12.22), then that equation looks
very similar to a regression equation: the coupon bond price is the dependent variable;
the coupons are the regressors, and the discount function values are the coefficients to
estimate—perhaps with OLS. This is a way of overcoming the second problem discussed
above since multiple bonds with the same maturity, but different coupons, are just addi-
tional data points in the estimation.
The first problem mentioned above, gaps in the term structure of available bonds, is
harder to deal with. If there are more coupon dates than bonds, then we cannot estimate
all the necessary zero coupon bond prices from data (fewer data points than coefficients).
The way around this is to decrease the number of parameters that need to be estimated by
postulating that $B(m)$ is a linear combination of some $J$ predefined functions of maturity,
$g_1(m), \ldots, g_J(m)$,
\[
B(m) = 1 + \sum_{j=1}^{J} a_j g_j(m), \tag{12.32}
\]
where $g_j(0) = 0$ since $B(0) = 1$ (the price of a bond maturing today is one).
Once the $g_j(m)$ functions are specified, (12.32) is substituted into (12.22) and the
$J$ coefficients $a_1, \ldots, a_J$ are estimated by minimizing the squared pricing error (see, for
instance, Campbell, Lo, and MacKinlay (1997) 10).
One possible choice of $g_j(m)$ functions is a polynomial, $g_j(m) = m^j$. Another
common choice is to make the discount function a spline (see McCulloch (1975)).
Example 12.23 (Quadratic discount function) With a quadratic discount function
\[
B(m) = a_0 + a_1 m + a_2 m^2,
\]
we get
\begin{align*}
B^c(K,c) &= \sum_{k=1}^{K} \left(a_0 + a_1 m_k + a_2 m_k^2\right)c + a_0 + a_1 m_K + a_2 m_K^2\\
         &= a_0(Kc + 1) + a_1\left(c\sum_{k=1}^{K} m_k + m_K\right) + a_2\left(c\sum_{k=1}^{K} m_k^2 + m_K^2\right).
\end{align*}
The $a_0$, $a_1$, and $a_2$ can be estimated by OLS if we have data on at least three bonds. This
method can, however, lead to large errors in the fitted yields (if not the prices). See Figure
12.12 for an example.
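Example 12.24 (Cubic discount function) With a cubic discount function
\[
B(m) = a_0 + a_1 m + a_2 m^2 + a_3 m^3,
\]
we get
\[
B^c(K,c) = a_0(Kc + 1) + a_1\left(c\sum_{k=1}^{K} m_k + m_K\right) + a_2\left(c\sum_{k=1}^{K} m_k^2 + m_K^2\right) + a_3\left(c\sum_{k=1}^{K} m_k^3 + m_K^3\right).
\]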
Yet another approach to estimating the yield curve is to start by specifying a function for
the instantaneous forward rate curve, and then calculate what this implies for the discount
function. (These will typically be complicated and not satisfy the simple linear structure
in (12.32).)
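Let $f(m)$ denote the instantaneous forward rate with time to settlement $m$. The
extended Nelson and Siegel forward rate function (Svensson (1995)) is
\[
f(m;\beta) = \beta_0 + \beta_1\exp\left(-\frac{m}{\tau_1}\right) + \beta_2\frac{m}{\tau_1}\exp\left(-\frac{m}{\tau_1}\right) + \beta_3\frac{m}{\tau_2}\exp\left(-\frac{m}{\tau_2}\right). \tag{12.33}
\]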
[Figure 12.12: Spot rates from a cubic discount function and pricing errors (in %), actual and fitted values, plotted against years to maturity]
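The forward rate function (12.33) has the limits
\[
\lim_{m \to 0} f(m;\beta) = \beta_0 + \beta_1 \text{ and } \lim_{m \to \infty} f(m;\beta) = \beta_0,
\]
so $\beta_0 + \beta_1$ corresponds to the current very short spot interest rate (an overnight rate, say)
and $\beta_0$ to the forward rate with settlement very far in the future (the asymptote).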
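The spot rate implied by (12.33) is (integrate as in (12.21) to see that)
\[
y(m;\beta) = \beta_0 + \beta_1\frac{1 - \exp(-m/\tau_1)}{m/\tau_1}
 + \beta_2\left[\frac{1 - \exp(-m/\tau_1)}{m/\tau_1} - \exp\left(-\frac{m}{\tau_1}\right)\right]
 + \beta_3\left[\frac{1 - \exp(-m/\tau_2)}{m/\tau_2} - \exp\left(-\frac{m}{\tau_2}\right)\right]. \tag{12.34}
\]
[Figure: US yield curve on 1995-01-04 (Nelson-Siegel method), in %]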
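The link between the extended Nelson and Siegel forward rate function and the implied spot rate can be illustrated in Python. The sketch below codes (12.33) and the spot rate obtained by averaging the forward rates as in (12.21), and compares the closed form with a numerical integral. The function names and the parameter values are hypothetical and chosen only for illustration.

```python
import math

def svensson_forward(m, b0, b1, b2, b3, tau1, tau2):
    """Instantaneous forward rate f(m; beta), as in (12.33)."""
    x1, x2 = m / tau1, m / tau2
    return b0 + b1 * math.exp(-x1) + b2 * x1 * math.exp(-x1) + b3 * x2 * math.exp(-x2)

def svensson_spot(m, b0, b1, b2, b3, tau1, tau2):
    """Spot rate y(m; beta): the average of the forward rates over [0, m], for m > 0."""
    x1, x2 = m / tau1, m / tau2
    load1 = (1.0 - math.exp(-x1)) / x1
    load2 = (1.0 - math.exp(-x2)) / x2
    return b0 + b1 * load1 + b2 * (load1 - math.exp(-x1)) + b3 * (load2 - math.exp(-x2))

params = (0.06, -0.02, 0.01, 0.005, 2.0, 8.0)   # hypothetical beta_0..beta_3, tau_1, tau_2

# Compare the closed form with a midpoint-rule approximation of (1/m) * integral of f over [0, m]
m, steps = 10.0, 20000
ds = m / steps
numeric = sum(svensson_forward((i + 0.5) * ds, *params) for i in range(steps)) * ds / m
print(round(svensson_spot(m, *params), 6), round(numeric, 6))
```

In an estimation, a function like `svensson_spot` would supply the fitted spot rates that are then used to price the bonds.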
One way of estimating the parameters in (12.33) is to substitute (12.34) for the spot rate
in (12.7), and then minimize the sum of the squared price errors (differences between
actual and fitted prices), perhaps with $1/\sqrt{\text{duration}}$ as the weights (a practice used by
many central banks). Alternatively, one could minimize the sum of the squared yield
errors (differences between actual and fitted yield to maturity). See Figures 12.13–12.15
for illustrations.
When many bonds are traded at (approximately) par, the par yield curve (12.25) can be
obtained by just plotting the coupon rates. In practice, the yield to maturity is used instead
(to partly compensate for the fact that the bonds are only approximately at par)—and the
gaps (across maturities) are filled by interpolation. (Recall that for a par bond, the yield to
maturity equals the coupon rate.) This is basically the way the Constant Maturity Treasury
yield curve, published by the US Treasury, is constructed.
[Figure: US forward rates on 1994-01-05 and on 1995-01-04 (in %), plotted against years to maturity]
Suppose the interest rate $r$ is compounded 2 times per year. This means that the amount
$B$ invested at the beginning of the year gives $B(1 + r/2)$ after six months, which is
reinvested and therefore gives $B(1 + r/2)(1 + r/2)$ after another six months (at the end
of the year). To make this payoff equal to unity (as we have used as our convention) it
must be the case that the bond price $B = 1/(1 + r/2)^2$. By comparing with the definition
of the effective interest rate (with annual compounding) in (12.5) we have
\[
\frac{1}{B} = \left(1 + \frac{r}{2}\right)^2 = 1 + Y, \tag{12.35}
\]
where $Y$ is the annual effective interest rate.
[Figure: Spot rates from Nelson-Siegel and pricing errors (in %), actual and fitted values, plotted against years to maturity]
This shows how we can transform from semi-annual compounding to annual
compounding (and vice versa).
More generally, with compounding $n$ times per year, we have
\[
\frac{1}{B} = \left(1 + \frac{r}{n}\right)^n = 1 + Y. \tag{12.36}
\]
The convention for US Treasury notes and bonds (issued with maturities longer than one
year) is that coupons are paid semi-annually (as half the quoted coupon rate), and that
yields are semi-annual effective yields. (This also applies to most US corporate bonds
and to UK Treasury bonds.)
However, both are quoted on an annual basis by multiplying by two. The quoted yield
to maturity, $\theta$, solves
\[
B^c(K,c) = \sum_{k=1}^{K} \frac{c/2}{(1 + \theta/2)^{n_k}} + \frac{1}{(1 + \theta/2)^{n_K}}, \tag{12.37}
\]
where the bond pays coupons $c/2$ at $n_1, n_2, \ldots, n_K$ half-years ahead. By using (12.35),
the yield quoted, $\theta$, can be expressed in terms of an annual effective interest rate.
Example 12.25 A 9% US Treasury bond (the coupon rate is 9%, paid out as 4.5%
semi-annually) with a yield to maturity of 7%, and one year to maturity has the price
\[
\frac{0.09/2}{1 + 0.07/2} + \frac{0.09/2}{(1 + 0.07/2)^2} + \frac{1}{(1 + 0.07/2)^2} \approx 1.019.
\]
From (12.35), we get that the yield to maturity expressed as an annual effective
interest rate is $(1 + 0.035)^2 - 1 \approx 0.071$.
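The semi-annual quoting convention can be sketched in Python as follows. The sketch assumes that the coupon dates are exactly 1, 2, ..., half-years ahead (so there is no accrued interest); the function names are illustrative.

```python
def price_semiannual(quoted_yield, coupon_rate, n_half_years):
    """Price per unit of face value from (12.37), with coupon dates 1, 2, ..., n_half_years."""
    per_period = 1.0 + quoted_yield / 2.0
    pv_coupons = sum((coupon_rate / 2.0) / per_period ** k
                     for k in range(1, n_half_years + 1))
    return pv_coupons + 1.0 / per_period ** n_half_years

def annual_effective(rate, times_per_year):
    """Annual effective rate Y from a rate compounded times_per_year times, as in (12.36)."""
    return (1.0 + rate / times_per_year) ** times_per_year - 1.0

# Numbers from Example 12.25: 9% coupon, 7% quoted yield, one year to maturity
print(round(price_semiannual(0.07, 0.09, 2), 3), round(annual_effective(0.07, 2), 3))
```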
The quotes of bond prices (as opposed to yields) are not the full price (also called the
dirty price, invoice price, or cash price) the investor actually pays. Instead, it is the “clean
price” that is quoted, which is the full price less the accrued interest:
\[
\text{quoted (clean) price} = \text{full price} - \text{accrued interest}.
\]
The buyer of the bond (buying in $t$) will typically get the next coupon (trading is
“cum-dividend”). The accrued interest is the fraction of that next coupon that has been
accrued during the period the seller owned the bond. It is calculated as
\[
\text{accrued interest} = \text{next coupon} \times \text{fraction of the coupon period that has passed}.
\]
Discount Yield
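US Treasury bills have no coupons and are issued in 3, 6, 9, and 12 month maturities
(but the time to maturity does of course change over time). They are quoted in terms of the
(banker’s) discount yield, $Y_{db}(m)$, which satisfies
\[
B(m) = 1 - mY_{db}(m), \tag{12.39}
\]
where $m$ is the time to maturity, measured as days/360 by convention. The effective and
continuously compounded interest rates are then
\[
Y(m) = [1 - mY_{db}(m)]^{-1/m} - 1 \tag{12.40}
\]
\[
y(m) = -\ln[1 - mY_{db}(m)]/m. \tag{12.41}
\]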
Example 12.26 A T-bill with 44 days to maturity and a quoted discount yield of 6.21%
has the price $1 - (44/360) \times 0.0621 \approx 0.992$. The effective interest rate is
$[1 - (44/360) \times 0.0621]^{-360/44} - 1 \approx 6.43\%$.
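The T-bill calculation can be reproduced with a short Python sketch. It follows the example in measuring the time to maturity as days/360 in both the price and the effective rate; the function names are illustrative.

```python
def tbill_price(discount_yield, days):
    """T-bill price per unit of face value from the banker's discount yield."""
    m = days / 360.0                      # time to maturity with a 360-day year
    return 1.0 - m * discount_yield

def tbill_effective_rate(discount_yield, days):
    """Effective interest rate implied by the discount yield, as in (12.40)."""
    m = days / 360.0
    return tbill_price(discount_yield, days) ** (-1.0 / m) - 1.0

# 44 days to maturity and a discount yield of 6.21%, as in Example 12.26
print(round(tbill_price(0.0621, 44), 3), round(tbill_effective_rate(0.0621, 44), 4))
```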
The LIBOR (London Interbank Offered Rate) and the EURIBOR (Euro Interbank Offered
Rate) are simple interest rates on short term loans without coupons. They are quoted
as simple annual interest rates, using an “actual/360” day count, with the exception of
pounds which are quoted “actual/365.” This means that borrowing one dollar for 150 days
at a 6% LIBOR requires the payment of $0.06 \times 150/360$ dollars in interest at maturity.
Rescaling to make the payment at maturity equal to unity (which is the convention used
in these lecture notes), the loan must be $1/(1 + 0.06 \times 150/360)$, which is the “price”
of a deposit that gives unity in 150 days.
The major continental European bond markets (in particular, France and Germany)
typically have annual coupons and the accrued interest is calculated according to the
“actual/actual” convention, that is, as
\[
\text{accrued interest} = \text{next coupon} \times \text{days since last coupon}/365 \text{ (or 366)}.
\]
(The computation is slightly more complicated for the UK and the Scandinavian
countries, since they have ex-dividend periods.)
The lag factor $l$ is the indexation lag. There are two reasons for this lag. First, the
convention on many markets is that the bond price is quoted disregarding accrued interest
(clean price). The typical case is as follows. The next coupon payment is $m_1$ periods
ahead. The buyer of the bond in $t$ will get this coupon (trading is “cum-dividend”). The
full price the buyer pays to the seller in $t$ is therefore
\[
\text{full price} = \text{quoted price} + \text{accrued interest},
\]
where the accrued interest is typically the coupon payment times the fraction of this
coupon period that has already passed. To pay this accrued interest, we have to know
the next coupon payment, that is, $cP_{t+m_1-l}/P_{t-l}$: in $t$ we must know the price level in
$t+m_1-l$. This means that $l \ge m_1$ must always hold: the indexation lag must be at least
as long as the time between coupon payments (six months in the UK).
Second, it takes time to calculate and publish price indices. Suppose $P_s$ becomes
known in $s+k$. This means that the indexation lag must be an additional $k$ periods,
$l \ge m_1 + k$, so it uses a known price level. For instance, in the UK, the indexation lag is
8 months.
To simplify matters in the rest of this section, suppose the indexation lag is zero. Use
(12.22), modified to allow for different coupons, to price the inflation-indexed bond. To
further simplify, suppose that bonds do not have any risk premia (clearly a strong
assumption), so that the bond price equals the discounted expected payoffs
\[
B^c(K,c) = \sum_{k=1}^{K} \frac{c\,E_tP_{t+m_k}/P_t}{[1+Y(m_k)]^{m_k}} + \frac{E_tP_{t+m_K}/P_t}{[1+Y(m_K)]^{m_K}}. \tag{12.42}
\]
The Fisher equation is
\[
[1+Y(m)]^m = \frac{E_tP_{t+m}}{P_t}[1+R(m)]^m, \tag{12.43}
\]
where $R$ is the real interest rate. It splits up the gross nominal return on the bond into a
gross real return and gross inflation rate. Notice that the Fisher equation assumes that
there are no risk premia, which is a strong assumption.
[Figure: US 10-year interest rates (nominal and real) and implied 10-year US inflation expectations, 2000 to 2010]
The financial press typically quotes a bond equivalent yield for T-bills—in an attempt
to make the yields comparable. The bond equivalent yield is the coupon (and yield to
maturity) of a par bond that would give the same yield as the T-bill. For a T-bill with at
most half a year to maturity, this gives a simple interest rate, but for longer T-bills the
expression is more complicated.
We first analyze a T-bill with more than half a year to maturity. Consider a coupon
bond with face value $B$ (which equals the current price of the T-bill), semi-annual coupon
$c/2$ and the same yield to maturity. Since the coupon and the yield to maturity are the
same, the “clean price” of the bond (the price to pay if the seller gets to keep the accrued
interest on the first coupon payment) equals the face value (here $B$): it is traded at par.
Notice that the latter means that the buyer gets the following fraction of the next coupon
payment (which is $Bc/2$): the fraction of a half year until the next coupon payment (or
(days to next coupon) $\times\, 2/365$).
When the T-bill has more than half a year to maturity, then the bond has two coupon
payments left (including the maturity). At maturity, the owner will have the following: (i)
the principal plus final coupon, $B(1 + c/2)$; (ii) the part of the first coupon that belongs
to the current owner, $d = B \times 2n \times c/2$, where $n =$ (days to next coupon)/365; and (iii)
the interest on $d$ when reinvested at the semi-annual rate $c/2$ for half a year, $d \times c/2$.
To get the same return as on the T-bill, the owner of the coupon bond must get a value
of one at maturity (the return is then $1/B$), or
\[
1 = B(1 + c/2) + d(1 + c/2), \text{ with } d = B \times 2n \times c/2. \tag{A.1}
\]
We now apply the same logic to a T-bill with at most half a year to maturity. The
bond then only has the final coupon left (which is split with the previous owner), and the
face value (which is not split). In particular, there is no reinvestment. In this case, (A.1)
simplifies to
\[ 1 = B(1 + 2n \times c/2). \tag{A.3} \]
Solving for $c$ (and using the fact that $n = h =$ (days to maturity)$/365$) gives
\[ c = \frac{1/B - 1}{h} \quad \text{or} \tag{A.4} \]
\[ B = \frac{1}{1 + hc}. \tag{A.5} \]
Example A.3 A T-bill with 44 days to maturity and a quoted discount yield of 6.21% has
the price $1 - (44/360) \times 0.0621 \approx 0.992$. The bond equivalent yield is the $c$ such that
\[ 0.992 = \frac{1}{1 + \frac{44}{360} c}, \quad \text{or } c = 6.6\%. \]
Remark A.4 There are two other, but equivalent, expressions for the bond equivalent
yield for maturities of at most half a year (see, for instance, McDonald (2006) Appendix
7.A). The first is
\[ c_1 = \frac{1 - B}{B} \, \frac{1}{m}. \]
Substituting for $B$ using (A.5) shows that $c_1 = c$. The second is
\[ c_2 = \frac{365\, Y_{db}}{360 - Y_{db} \times \text{days}}. \]
Substituting for $Y_{db}$ using (12.39) shows that $c_2 = c_1 = c$.
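As a numerical cross-check of (A.4) and of the two expressions in Remark A.4, the following Python sketch computes the bond equivalent yield of a short T-bill in three ways. It assumes that (12.39) is the discount-yield pricing rule $B = 1 - Y_{db} \times \text{days}/360$ used in Example A.3, and that $m$ in the first expression equals $h$; the function names are illustrative only.

```python
def tbill_price(discount_yield, days):
    """Price per unit of face value from a quoted discount yield (360-day year)."""
    return 1.0 - discount_yield * days / 360.0


def bey_from_price(price, days):
    """Bond equivalent yield as in (A.4), with h = days/365 (at most half a year)."""
    h = days / 365.0
    return (1.0 / price - 1.0) / h


def bey_c1(price, days):
    """First alternative in Remark A.4, assuming m = days/365."""
    m = days / 365.0
    return (1.0 - price) / price / m


def bey_c2(discount_yield, days):
    """Second alternative in Remark A.4, directly from the discount yield."""
    return 365.0 * discount_yield / (360.0 - discount_yield * days)


price = tbill_price(0.0621, 44)
c = bey_from_price(price, 44)
c1 = bey_c1(price, 44)
c2 = bey_c2(0.0621, 44)
print(round(price, 4), round(c, 4), round(c1, 4), round(c2, 4))
```

With the unrounded price and a 365-day year, the three expressions agree at roughly 6.3%. The 6.6% in Example A.3 appears to come from using the rounded price 0.992 together with a 360-day year.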
Bibliography
Blake, D., 1990, Financial market analysis, McGraw-Hill, London.
Deacon, M., and A. Derry, 1998, Inflation-indexed securities, Prentice Hall Europe,
Hemel Hempstead.
Fabozzi, F. J., 2004, Bond markets, analysis, and strategies, Pearson Prentice Hall, 5th
edn.
Hull, J. C., 2006, Options, futures, and other derivatives, Prentice-Hall, Upper Saddle
River, NJ, 6th edn.
McCulloch, J., 1975, “The tax-adjusted yield curve,” Journal of Finance, 30, 811–830.
Svensson, L., 1995, “Estimating forward interest rates with the extended Nelson&Siegel
method,” Quarterly Review, Sveriges Riksbank, 1995:3, 13–26.
13 Bond Portfolios
Main references: Elton, Gruber, Brown, and Goetzmann (2010) 21–22 and Hull (2006) 4
Additional references: McDonald (2006) 7
The “duration” of a coupon bond is used to analyse how the bond price will change in
response to changes in the yield curve. This section gives the definitions of the most
commonly used duration measures.
Recall that the yield to maturity, $\theta$, of a coupon bond satisfies
\[ B^c(K,c) = \frac{c_1}{(1+\theta)^{m_1}} + \frac{c_2}{(1+\theta)^{m_2}} + \ldots + \frac{c_K}{(1+\theta)^{m_K}} = \sum_{k=1}^{K} \frac{c_k}{(1+\theta)^{m_k}}, \tag{13.1} \]
where the bond pays $c_k$ at $m_k$ periods from now. The principal is included in the last
“coupon” payment. We allow the payments to differ between periods—to simplify the
notation and to be able to treat a bond portfolio in the same way as an ordinary bond.
The derivative of a coupon bond price with respect to its yield to maturity is
\[ \frac{dB^c(K,c)}{d\theta} = -\frac{1}{1+\theta} \sum_{k=1}^{K} m_k \frac{c_k}{(1+\theta)^{m_k}}. \tag{13.2} \]
This measures the sensitivity of the bond price to a small change in the yield to maturity.
The dollar duration, $D^{\$}$, is typically defined as this derivative times minus one
\[ D^{\$} = -\frac{dB^c(K,c)}{d\theta} \tag{13.3} \]
\[ \phantom{D^{\$}} = \frac{1}{1+\theta} \sum_{k=1}^{K} m_k \frac{c_k}{(1+\theta)^{m_k}}. \tag{13.4} \]
The change of the bond price, $\Delta B^c(K,c)$, due to a small change in the yield, $\Delta\theta$, is
approximately
\[ \Delta B^c(K,c) \approx -D^{\$} \times \Delta\theta. \tag{13.5} \]
It is common to divide the duration by the bond price, $B^c(K,c)$, to get the adjusted
(or modified) duration, $D^a$,
\[ D^a = D^{\$} \times \frac{1}{B^c(K,c)}. \tag{13.6} \]
By dividing both sides of (13.5) by the bond price and using the definition of the adjusted
duration we see that the relative (percentage) change of the bond price due to a small
change in the yield is approximately
\[ \frac{\Delta B^c(K,c)}{B^c(K,c)} \approx -D^a \times \Delta\theta. \tag{13.7} \]
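Macaulay’s duration, $D^{mac}$, is the dollar duration multiplied by $(1+\theta)/B^c(K,c)$
\[ D^{mac} = D^{\$} \times \frac{1+\theta}{B^c(K,c)} \tag{13.8} \]
\[ \phantom{D^{mac}} = \sum_{k=1}^{K} m_k \frac{c_k}{(1+\theta)^{m_k} B^c(K,c)}. \tag{13.9} \]
By multiplying both sides of (13.5) by $(1+\theta)/B^c(K,c)$ and using the definition of
Macaulay’s duration we see that the relative (percentage) change of the bond price due to
a small relative (percentage) change in the yield is approximately
\[ \frac{\Delta B^c(K,c)}{B^c(K,c)} \approx -D^{mac} \times \frac{\Delta\theta}{1+\theta}. \tag{13.10} \]
The last term, $\Delta\theta/(1+\theta)$, is the relative change in the gross yield, since $\Delta\theta = \Delta(1+\theta)$.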
Notice that Macaulay’s duration is a weighted average of the time to the coupon (and
face) payments ($m_1, m_2, \ldots, m_K$). The weight of $m_k$ is $c_k/[(1+\theta)^{m_k} B^c(K,c)]$, so the
weights sum to unity and they are clearly the percentage of the bond price accounted for
by the respective coupon (or principal) payments. Macaulay’s duration is therefore an
average “time to payment” of the bond. For instance, for a zero coupon bond, Macaulay’s
duration is the time to maturity (set $c = 0$ in (13.9)). For bonds with coupons, Macaulay’s
duration is less than the time to maturity, and this effect is more pronounced at high
coupon rates and at high yields to maturity. This is illustrated in Figure 13.1.
Remark 13.1 (Duration of a zero coupon bond) For a zero-coupon bond with a face
value of unity and maturity of $K$, the price is $B = 1/(1+y)^K$, where $y$ is the yield to maturity.
[Figure: Macaulay’s duration against years to maturity for coupon rates c = 0%, 5% and 10%; panel titles: ytm = 0% and ytm = 5%.]
Example 13.2 (Duration) Consider a 4% (annual) coupon bond with 2 years to maturity.
Suppose the price is 1.019. Then the yield to maturity is 3% since it solves
\[ 1.019 \approx \frac{0.04}{1+0.03} + \frac{1.04}{(1+0.03)^2}. \]
The dollar duration is
\[ D^{\$} = \frac{1}{1.03} \left( 1 \times \frac{0.04}{1.03} + 2 \times \frac{1.04}{1.03^2} \right) \approx 1.94, \]
so the adjusted duration and Macaulay’s duration are
\[ D^a = 1.94 \times \frac{1}{1.019} \approx 1.90 \]
\[ D^{mac} = 1.94 \times \frac{1.03}{1.019} \approx 1.96. \]
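The numbers in Example 13.2 can be reproduced with a few lines of Python. This is a minimal sketch of (13.1), (13.4) and (13.6), with Macaulay’s duration computed as the dollar duration times $(1+\theta)/B$; the function names are illustrative and not part of the text.

```python
def bond_price(cash_flows, times, ytm):
    """Present value of the cash flows at a common yield to maturity, as in (13.1)."""
    return sum(c / (1.0 + ytm) ** m for c, m in zip(cash_flows, times))


def dollar_duration(cash_flows, times, ytm):
    """Minus the derivative of the price with respect to the ytm, as in (13.4)."""
    weighted = sum(m * c / (1.0 + ytm) ** m for c, m in zip(cash_flows, times))
    return weighted / (1.0 + ytm)


def adjusted_duration(cash_flows, times, ytm):
    """Dollar duration divided by the price, as in (13.6)."""
    return dollar_duration(cash_flows, times, ytm) / bond_price(cash_flows, times, ytm)


def macaulay_duration(cash_flows, times, ytm):
    """Dollar duration times (1 + ytm) divided by the price."""
    return adjusted_duration(cash_flows, times, ytm) * (1.0 + ytm)


# 4% annual coupon bond, 2 years to maturity, ytm of 3%
cash_flows = [0.04, 1.04]
times = [1, 2]
ytm = 0.03

price = bond_price(cash_flows, times, ytm)
d_dollar = dollar_duration(cash_flows, times, ytm)
d_adj = adjusted_duration(cash_flows, times, ytm)
d_mac = macaulay_duration(cash_flows, times, ytm)
print(round(price, 3), round(d_dollar, 2), round(d_adj, 2), round(d_mac, 2))
```

Passing the cash flows of a whole portfolio instead of a single bond gives the portfolio measures, since the formulas only require a list of payments and payment dates.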
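Example 13.3 (Duration of a zero coupon bond) A two-period zero coupon bond with
price 0.94 has a ytm equal to 0.03, since
\[ 0.94 \approx \frac{1}{1.03^2}. \]
The dollar duration is
\[ \frac{1}{1.03} \times 2 \times \frac{1}{1.03^2} \approx 1.83, \]
and Macaulay’s duration is
\[ \frac{1.03}{1/1.03^2} \times \frac{1}{1.03} \times 2 \times \frac{1}{1.03^2} = 2. \]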
Proposition 13.4 (Duration of a portfolio) A portfolio with bonds 1 and 2 has the dollar
duration $D^{\$}_1 + D^{\$}_2$ if the ytm of bonds 1 and 2 are the same, otherwise it is just an
approximation. If the dollar duration is additive, then Macaulay’s duration of the portfolio
is $B_1/(B_1 + B_2) D^{mac}_1 + B_2/(B_1 + B_2) D^{mac}_2$, that is, the value weighted average of
the different Macaulay’s durations.
Proof. (Duration of a portfolio) The first part is intuitive since the dollar duration
of a coupon bond is considered “correct”, and it uses the same ytm for all the coupons.
For the second part, multiply the dollar duration $D^{\$}_1 + D^{\$}_2$ by one plus the ytm, $1+\theta$, and divide by the
portfolio value $(B_1 + B_2)$. This is Macaulay’s duration of the portfolio. Now, rewrite by
using $D^{\$} = B D^{mac}/(1+\theta)$ to get the result in the proposition.
Example 13.5 (Duration of a portfolio, same ytm) A 1-year discount bond with a ytm
(effective interest rate) of 10% has the price $1/1.1$ and a 3-year discount bond with a ytm
of 10% has the price $1/1.1^3$. The dollar duration and Macaulay’s durations are
\[ \text{1-year bond: } D^{\$} = \frac{1}{1.1^2} \approx 0.83 \text{ and } D^{mac} = 1 \]
\[ \text{3-year bond: } D^{\$} = \frac{3}{1.1^4} \approx 2.05 \text{ and } D^{mac} = 3. \]
A portfolio with one of each bond has a price equal to $B_p = 1/1.1 + 1/1.1^3$ and a ytm that solves
\[ B_p = \frac{1}{1+\theta} + \frac{1}{(1+\theta)^3}, \text{ with } \theta = 0.1. \]
The dollar duration and Macaulay’s duration of the portfolio are then
\[ D^{\$} = \frac{1}{1.1} \left( \frac{1}{1.1} + 3 \times \frac{1}{1.1^3} \right) \approx 2.88, \]
\[ D^{mac} = D^{\$} \times \frac{1.1}{B_p} \approx 1.90. \]
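Compare with the sum of the two dollar durations, $0.83 + 2.05 \approx 2.88$, and the value-weighted average of the two Macaulay’s durations, $(1/1.1 \times 1 + 1/1.1^3 \times 3)/B_p \approx 1.90$: with the same ytm, both coincide with the portfolio values.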
Example 13.6 (Duration of a portfolio, different ytm) A 1-year discount bond with a ytm
(effective interest rate) of 7% has the price $1/1.07$ and a 3-year discount bond with a ytm
of 10% has the price $1/1.1^3$. The dollar duration and Macaulay’s durations are
\[ \text{1-year bond: } D^{\$} = \frac{1}{1.07^2} \approx 0.87 \text{ and } D^{mac} = 1 \]
\[ \text{3-year bond: } D^{\$} = \frac{3}{1.1^4} \approx 2.05 \text{ and } D^{mac} = 3. \]
A portfolio with one of each bond has a price $B_p = 1/1.07 + 1/1.1^3$ and a ytm that solves
\[ B_p = \frac{1}{1+\theta} + \frac{1}{(1+\theta)^3}, \text{ with } \theta \approx 0.091. \]
The dollar duration and Macaulay’s duration of the portfolio are then
\[ D^{\$} = \frac{1}{1.091} \left( \frac{1}{1.091} + 3 \times \frac{1}{1.091^3} \right) \approx 2.96, \]
\[ D^{mac} = D^{\$} \times \frac{1.091}{B_p} \approx 1.91. \]
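Compare with the sum of the two dollar durations, $0.87 + 2.05 \approx 2.92$, and the value-weighted average of the two Macaulay’s durations, approximately 1.89: with different ytm, the results in Proposition 13.4 hold only approximately.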
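The portfolio in Example 13.6 can be checked numerically. The sketch below solves for the portfolio ytm by bisection (an illustrative choice of solver, not something prescribed by the text) and compares the portfolio dollar duration with the sum of the individual dollar durations; all function names are hypothetical.

```python
def price(cash_flows, times, ytm):
    """Present value of the cash flows at a common yield to maturity."""
    return sum(c / (1.0 + ytm) ** m for c, m in zip(cash_flows, times))


def dollar_duration(cash_flows, times, ytm):
    """Minus the derivative of the price with respect to the ytm."""
    return sum(m * c / (1.0 + ytm) ** (m + 1) for c, m in zip(cash_flows, times))


def solve_ytm(cash_flows, times, target_price, lo=0.0, hi=1.0, tol=1e-12):
    """Bisection: the price is decreasing in the ytm for positive cash flows."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if price(cash_flows, times, mid) > target_price:
            lo = mid  # price too high, so the yield must be higher
        else:
            hi = mid
    return 0.5 * (lo + hi)


# One 1-year discount bond (ytm 7%) and one 3-year discount bond (ytm 10%)
b_p = 1.0 / 1.07 + 1.0 / 1.1 ** 3
theta = solve_ytm([1.0, 1.0], [1, 3], b_p)

d_portfolio = dollar_duration([1.0, 1.0], [1, 3], theta)
d_sum = dollar_duration([1.0], [1], 0.07) + dollar_duration([1.0], [3], 0.10)
mac_portfolio = d_portfolio * (1.0 + theta) / b_p

print(round(theta, 4), round(d_portfolio, 2), round(d_sum, 2), round(mac_portfolio, 2))
```

The two dollar durations differ slightly, which illustrates that additivity is only an approximation when the bonds have different yields to maturity.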
Suppose we want to hedge against price movements of a bond portfolio. (This is also
called immunisation.) The portfolio can be thought of as a coupon bond (with a possibly
complicated set of coupons), so the previous formulas apply. One way of doing that would
be to use a (potentially large) set of futures to match every cash flow of the bond, but
that may well be both difficult and costly (transaction costs). Duration matching is the
other extreme: finding a single instrument to use in the hedging.
Suppose we are short one unit of a bond (portfolio) with price $B^c_A$ and dollar duration
$D_A$. We will hedge this portfolio by buying $h$ units of a bond portfolio (the hedging
portfolio) with price $B^c_H$ and dollar duration $D_H$. The value of the overall position is then
\[ V = hB^c_H - B^c_A. \tag{13.11} \]
Using the approximate relation of the bond price change (13.5) we have that the
change of value of the overall position is
\[ \Delta V \approx -hD_H \times \Delta\theta_H + D_A \times \Delta\theta_A. \tag{13.12} \]
Duration matching means that we set $h$ such that the change in the value is (approximately) zero
\[ h = \frac{D_A}{D_H} \times \frac{\Delta\theta_A}{\Delta\theta_H}. \tag{13.13} \]
The most straightforward way of hedging is perhaps to let bond $H$ be a zero-coupon bond
with the same time to maturity (and therefore duration) as the duration of bond $A$. In this
case, if both yields to maturity move equally much ($\Delta\theta_A = \Delta\theta_H$) then the hedge ratio $h$
is 1. In general, it seems reasonable to use a similar duration, since then it is reasonable to
assume that the yields to maturity change in a similar way, so assuming $\Delta\theta_A/\Delta\theta_H = 1$
makes sense.
A common assumption is that both yields change equally much (parallel shift of the
yield curve), so the hedge ratio becomes
\[ h = \frac{D_A}{D_H} \quad \text{if } \Delta\theta_A = \Delta\theta_H. \tag{13.14} \]
We see that the hedge ratio is positive and that it is above one if bond $A$ has a longer
duration than bond $H$, and vice versa. The intuition is that the price of a long bond is
more sensitive to a yield curve shift than the price of a short bond. Therefore, to hedge a
long bond we need to buy more of the short bond.
In practice, the hedging portfolio also includes a small position in a short-term money
market account, so that the overall portfolio has a zero value (at least initially).
See Figures 13.2–13.3 for illustrations.
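To see how the hedge ratio in (13.14) works, the following Python sketch hedges a short position in a 5-year zero coupon bond with a 2-year zero coupon bond and applies a small parallel shift to a flat yield curve. The maturities, the 5% yield and the 10 basis point shift are made-up illustrative numbers, and the function names are hypothetical.

```python
def zero_price(maturity, ytm):
    """Price of a zero coupon bond with face value 1."""
    return (1.0 + ytm) ** (-maturity)


def zero_dollar_duration(maturity, ytm):
    """Dollar duration of a zero coupon bond: maturity / (1 + ytm)^(maturity + 1)."""
    return maturity * (1.0 + ytm) ** (-(maturity + 1))


ytm = 0.05       # flat yield curve (assumption)
shift = 0.001    # parallel shift of 10 basis points (assumption)
mat_a, mat_h = 5, 2

# Hedge ratio under a parallel shift, as in (13.14)
h = zero_dollar_duration(mat_a, ytm) / zero_dollar_duration(mat_h, ytm)

# Exact price changes after the shift
change_a = zero_price(mat_a, ytm + shift) - zero_price(mat_a, ytm)
change_h = zero_price(mat_h, ytm + shift) - zero_price(mat_h, ytm)

# Overall position: long h units of bond H, short one unit of bond A, as in (13.11)
change_v = h * change_h - change_a

print(round(h, 4), change_a, change_v)
```

The change in the hedged position is a small fraction of the unhedged price change, but it is not exactly zero since (13.5) is only a first-order approximation.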
Remark 13.7 (Effect of a yield curve shift with imperfect duration hedge) If the duration of
the hedge portfolio is too long, then the overall portfolio in (13.11) is likely to lose value
when interest rates go up (since long bond prices go down more), and vice versa. Another
way of thinking about this is that the overall portfolio then has a positive duration:
that is, we have lent at a fixed interest rate, so we lose if the floating rates (our refinancing
cost, say) go up.
Remark 13.8 (Overall portfolio value over several subperiods) Start by creating a portfolio
with a zero initial value
\[ 0 = h_t B^c_{H,t} - B^c_{A,t} + B_t, \quad \text{so } B_t = 0 - h_t B^c_{H,t} + B^c_{A,t}, \]
where $B_t$ is the amount held in a short-term (almost zero duration) bill with a continuously
compounded interest rate $y_t$. In $t+1$ (say, after a day), this portfolio is worth
\[ V_{t+1} = h_t (B^c_{H,t+1} + c_{H,t+1}) - (B^c_{A,t+1} + c_{A,t+1}) + B_t e^{y_t h}, \]
where $c_{H,t+1}$ and $c_{A,t+1}$ are coupon payments (or any other cash flows), the bond prices
are measured after coupons and $y_t h$ is the interest rate per day (the unsubscripted $h$ is here
the length of the period, not the hedge ratio). After rebalancing
in $t+1$, we need $h_{t+1}$ units of bond $H$ and we are still short one bond $A$, and the balance
is invested in the short-term bill
\[ B_{t+1} = V_{t+1} - h_{t+1} B^c_{H,t+1} + B^c_{A,t+1}. \]
This is indeed very similar to the expression for $B_t$ in the first equation. Clearly, the
value of the portfolio in $t+2$ is computed as in the second equation, but with subscripts
advanced one period.
[Figure: Bond prices and rates. Vertical axis: bond price/face value; series: price of a 5-year bond and price of a 10-year bond.]
The formula for the price change (13.5) is only exact for infinitesimal yield changes—and
the approximation error is likely to be large when the yield changes are.
[Figure: panels showing the (dollar) duration of the annuity and of bond H, the hedge ratio, and the value after paying liabilities, over 2009. Bond A: annuity paying 0.2 each 20 March until 2014.]
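The reason is that (13.5) is a first-order Taylor approximation
\[ \Delta B^c(K,c) \approx \frac{dB^c(K,c)}{d\theta} \Delta\theta. \tag{13.15} \]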
Obviously, a second-order Taylor approximation is more precise. It would be
\[ \Delta B^c(K,c) \approx \frac{dB^c(K,c)}{d\theta} \Delta\theta + \frac{1}{2} \frac{d^2 B^c(K,c)}{d\theta^2} (\Delta\theta)^2, \tag{13.16} \]
where the last term includes the second derivative of the bond price with respect to the
yield to maturity. The second derivative is easily calculated to be
\[ \frac{d^2 B^c(K,c)}{d\theta^2} = \sum_{k=1}^{K} m_k (m_k + 1) \frac{c_k}{(1+\theta)^{m_k+2}}. \tag{13.17} \]
Dividing (13.16) by the bond price and using (13.7) gives
\[ \frac{\Delta B^c(K,c)}{B^c(K,c)} \approx -D^a \times \Delta\theta + \frac{1}{2} C \times (\Delta\theta)^2, \tag{13.18} \]
where C (often called “convexity”) is the second derivative in (13.16) divided by the bond
price. It can be shown that the convexity is positive, but decreasing in the coupon rate —
for a given ytm and maturity. (The convexity is actually increasing in the coupon rate for
a given ytm and modified duration.) See Figure 13.2 for an illustration of the non-linear
effect.
Choosing the hedging bond (portfolio) so that it has a similar convexity to the bond
to be hedged may make the hedge more precise.
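The gain from the second-order term can be illustrated with the bond from Example 13.2 (a 4% coupon bond with 2 years to maturity and a ytm of 3%). The Python sketch below compares the first-order approximation (13.15) and the second-order approximation (13.16) with the exact price change for a yield increase of 2 percentage points; the size of the yield change and the function names are illustrative assumptions.

```python
def price(cash_flows, times, ytm):
    """Bond price as in (13.1)."""
    return sum(c / (1.0 + ytm) ** m for c, m in zip(cash_flows, times))


def first_derivative(cash_flows, times, ytm):
    """Derivative of the price with respect to the ytm, as in (13.2)."""
    return -sum(m * c / (1.0 + ytm) ** (m + 1) for c, m in zip(cash_flows, times))


def second_derivative(cash_flows, times, ytm):
    """Second derivative of the price with respect to the ytm, as in (13.17)."""
    return sum(m * (m + 1) * c / (1.0 + ytm) ** (m + 2)
               for c, m in zip(cash_flows, times))


cash_flows = [0.04, 1.04]
times = [1, 2]
ytm = 0.03
d_ytm = 0.02  # a fairly large yield change, chosen to make the error visible

exact = price(cash_flows, times, ytm + d_ytm) - price(cash_flows, times, ytm)
first_order = first_derivative(cash_flows, times, ytm) * d_ytm
second_order = first_order + 0.5 * second_derivative(cash_flows, times, ytm) * d_ytm ** 2
convexity = second_derivative(cash_flows, times, ytm) / price(cash_flows, times, ytm)

print(exact, first_order, second_order, convexity)
```

The first-order approximation overstates the price fall, while the second-order approximation is much closer to the exact change.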
The duration measures assume that the times when the coupons and the face value are
paid are unaffected by the yield change. That is true for many instruments (like most
government bonds), but not for callable bonds—and effectively not for bonds whose risk
premium depends on the interest rate level as most corporate bonds do (as the interest rate
level affects the default risk).
Probably the most important problem with using duration for hedging is that the hedge
ratio in (13.13) depends on the changes in the yields, and these are unknown when we
construct the hedging portfolio.
The ideal case for duration matching is when $\Delta\theta_A/\Delta\theta_H$ is always one, that is, when
the yields (to maturity) move in parallel. This will be the case, for instance, if the yield
curve is flat (across maturities) and the only movements are parallel shifts up and down.
In reality, most movements in the yield curve are parallel, but changes in slope and curvature
are not uncommon either. Often the short interest rates move more (in response to
news) than long rates.
Combine (13.12) and the hedge ratio (13.14) (which depends only on the relative
duration) to get that the change in the portfolio value is approximately
\[ \Delta V \approx D_A \times (\Delta\theta_A - \Delta\theta_H), \]
which shows that the change in value is negative if the yield of the hedging portfolio
increases more than the yield of portfolio $A$.
For instance, suppose the yield curve changes from being upward sloping to being
downward sloping. If the hedging portfolio has shorter duration than portfolio A, then the
overall position loses value. The reason is that the hedging portfolio falls more (the yield
increases more) in price than portfolio A. See Figure 13.4 for an illustration.
However, the relative frequencies of these movements seem to change over time (according
to business cycle conditions and monetary policy regime). This suggests that the
ability of duration matching (assuming $\Delta\theta_A/\Delta\theta_H = 1$) to provide a hedge is different in
different time periods and different markets.
Explicit models of how the entire yield curve moves in response to a small number
of factors have implications for $\Delta\theta_A/\Delta\theta_H$, which may vary across instruments and
time. It is still an open issue whether these models provide a better hedge than just assuming
$\Delta\theta_A/\Delta\theta_H = 1$.
Yield curves (in the US and most other developed countries) tend to have the following
features (see Figure 13.5 for some examples).
First, most of the time, the yield curve is upward sloping. This is only consistent with the
expectations hypothesis if short rates are expected to be higher in the future. This means
that short rates should (most of the time) be increasing over time, which contradicts
empirical evidence. It is more likely that long rates tend to be high because of risk premia.
[Figure: schematic of an old and a new yield curve (yield against maturity), with the maturities of the hedging portfolio and of portfolio A marked.]
Second, the yield curve changes over time. It is common to describe the movements in
terms of three “factors”: level, slope, and hump. To formalise this, consider the following
simple model for the effective spot interest rate in period $t$
\[ y_t(n) = \gamma_{0t} + \gamma_{1t} n + \gamma_{2t} n^2 + \varepsilon_{nt}. \]
(Alternatively, the forward rate is used as the dependent variable.) Clearly, the $\gamma_0$ coefficient
represents the general level of the yield curve, $\gamma_1$ the slope (as a function of time
to maturity), and $\gamma_2$ the curvature (capturing humps). Repeating the cross-sectional regression
for each trading date gives time series of coefficients. If the movements in the $\gamma_0$
coefficient dominate then parallel movements of the yield curve dominate, etc.
It is often more instructive to run a slightly different regression where the regressors
are uncorrelated. Let $\tilde{n}_1$ be the residual from regressing $n$ on a constant, and $\tilde{n}_2$ the
residual from regressing $n^2$ on a constant and $\tilde{n}_1$. The regression is then
\[ y_t(n) = \delta_{0t} + \delta_{1t} \tilde{n}_1 + \delta_{2t} \tilde{n}_2 + \varepsilon_{nt}. \]
The advantage of this approach is that the regressors are uncorrelated so it is straightforward
to decompose the variance of $y_t(n)$ into the sum of the variances: $\mathrm{Var}[y_t(n)] =
\mathrm{Var}(\delta_{0t}) + \mathrm{Var}(\delta_{1t} \tilde{n}_1) + \mathrm{Var}(\delta_{2t} \tilde{n}_2) + \mathrm{Var}(\varepsilon_{nt})$. See Figure 13.7 for an example.
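The construction of the uncorrelated regressors can be sketched in a few lines of Python. The maturities and the quadratic yield curve below are made-up numbers used only to show the mechanics for a single cross-section, and the function names are hypothetical. With orthogonal regressors, each coefficient is just a univariate projection.

```python
def mean(x):
    return sum(x) / len(x)


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def rotate_regressors(maturities):
    """Residual of n on a constant, and residual of n^2 on a constant and the former."""
    n_sq = [v * v for v in maturities]
    n1 = [v - mean(maturities) for v in maturities]
    centred_sq = [v - mean(n_sq) for v in n_sq]
    slope = dot(n1, centred_sq) / dot(n1, n1)
    n2 = [c - slope * a for c, a in zip(centred_sq, n1)]
    return n1, n2


def cross_section_ols(yields, n1, n2):
    """OLS on (1, n1, n2); orthogonality lets each coefficient be computed separately."""
    d0 = mean(yields)
    d1 = dot(n1, yields) / dot(n1, n1)
    d2 = dot(n2, yields) / dot(n2, n2)
    return d0, d1, d2


maturities = [0.25, 0.5, 1.0, 3.0, 5.0, 7.0, 10.0]               # years (hypothetical)
yields = [0.03 + 0.002 * n - 0.0001 * n * n for n in maturities]  # made-up curve

n1, n2 = rotate_regressors(maturities)
d0, d1, d2 = cross_section_ols(yields, n1, n2)
fitted = [d0 + d1 * a + d2 * b for a, b in zip(n1, n2)]
max_residual = max(abs(f - y) for f, y in zip(fitted, yields))
print(d0, d1, d2, max_residual)
```

Repeating the cross-sectional step for every date would give the time series of coefficients whose variances enter the decomposition.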
An alternative is to use principal component analysis. See Figure 13.6 for an example.
Most evidence on US data suggests (see, for instance, Cochrane (2001) 19) that changes
in the level dominate, perhaps accounting for 80–90% of the total variation in yields. The
slope comes second (perhaps accounting for 10%), and hump third (accounting for a few
percent). Similar results are found by principal component analysis.
Remark 13.10 (Principal component analysis) The first (sample) principal component
of the zero mean $N \times 1$ vector $z_t = Y_t - \bar{Y}$ is $w_1' z_t$ where $w_1$ is the eigenvector
associated with the largest eigenvalue of $\Sigma = \mathrm{Cov}(z_t)$. This value of $w_1$ solves the
problem $\max_w w' \Sigma w$ subject to the normalization $w'w = 1$. This eigenvalue equals
$\mathrm{Var}(w_1' z_t) = w_1' \Sigma w_1$. The $j$th principal component solves the same problem, but under
the additional restriction that $w_i' w_j = 0$ for all $i < j$. The solution is the eigenvector
associated with the $j$th largest eigenvalue (which equals $\mathrm{Var}(w_j' z_t) = w_j' \Sigma w_j$). This
means that the first $K$ principal components are those (normalized) linear combinations
that account for as much of the variability as possible, and that the principal components
are uncorrelated ($\mathrm{Cov}(w_i' z_t, w_j' z_t) = 0$). Dividing an eigenvalue by the sum of
eigenvalues gives a measure of the relative importance of that principal component (in
terms of variance). If the rank of $\Sigma$ is $K$, then only $K$ eigenvalues are non-zero.
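The first principal component can be computed with a simple power iteration, which is used here only to keep the sketch free of external libraries (a standard eigenvalue routine would normally be used). The covariance matrix of three yields below is a made-up example, and the function names are hypothetical.

```python
import math


def mat_vec(a, x):
    """Matrix times vector for a list-of-lists matrix."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def first_principal_component(cov, iterations=500):
    """Power iteration: converges to the eigenvector with the largest eigenvalue."""
    w = [1.0] * len(cov)
    for _ in range(iterations):
        v = mat_vec(cov, w)
        norm = math.sqrt(sum(x * x for x in v))
        w = [x / norm for x in v]  # keep the normalization w'w = 1
    eigenvalue = sum(wi * vi for wi, vi in zip(w, mat_vec(cov, w)))  # w' Sigma w
    return eigenvalue, w


# Hypothetical covariance matrix of three (demeaned) yields
cov = [[1.0, 0.9, 0.8],
       [0.9, 1.0, 0.9],
       [0.8, 0.9, 1.0]]

eigenvalue, w = first_principal_component(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))  # equals the sum of eigenvalues
share = eigenvalue / total_variance
print(round(eigenvalue, 4), [round(x, 3) for x in w], round(share, 3))
```

In this example all three weights have the same sign and similar size, so the first component behaves like a level factor and accounts for roughly 91% of the total variance.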
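Example 13.12 (Principal component analysis) With three yields, and the $i$th principal
component in the column vector $[w_{1i}, w_{2i}, w_{3i}]'$, we have
\[ \begin{bmatrix} pc_{1t} \\ pc_{2t} \\ pc_{3t} \end{bmatrix} = \begin{bmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{bmatrix}' \begin{bmatrix} Y_t(1) - \bar{Y}_t(1) \\ Y_t(3) - \bar{Y}_t(3) \\ \vdots \end{bmatrix}, \]
and, since the eigenvectors are orthonormal, the demeaned yields equal the $W$ matrix times the vector of principal components.
For instance, the second column of the $W$ matrix shows how each of the yields “react to”
the second principal component.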
[Figure: US spot rates (in %) against years to maturity on selected dates, including 1994-01-05 and 1995-01-04.]
The expectations hypothesis of interest rates says that long bonds have no, or possibly
constant, risk premia. In that case, forward interest rates can be interpreted as expected
future short interest rates. The evidence on the expectations hypothesis is mixed (see Sec-
tion 13.5.4), so it can only be thought of as a rough (although convenient) approximation.
To illustrate how the expectations hypothesis works, it is easiest to work with continuously
compounded interest rates. Recall that a continuously compounded interest rate,
$y_t(n)$, satisfies
\[ \frac{1}{B_t(n)} = \exp[n y_t(n)], \quad \text{or } y_t(n) = -\ln B_t(n)/n. \tag{13.22} \]
[Figure: US interest rates, 1970 to 2010. Panels: federal funds rate to 10-year rates; 3-month, 3-year and 10-year rates.]
average short interest rate. Split up the time until $n$ into $n/m$ intervals of length $m$. Then,
the expectations hypothesis says that the $n$-period spot rate equals the geometric average
of the $m$-period short rates over $t$ to $t+n$
\[ y_t(n) = a(n) + \frac{m}{n} \left[ y_t(m) + E_t y_{t+m}(m) + E_t y_{t+2m}(m) + \ldots \right]. \tag{13.23} \]
If $a(n) = 0$, then the pure expectations hypothesis is said to hold. Hence, the expectations
hypothesis (although not in its pure form) allows for constant risk premia.
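As a small numerical illustration of (13.23), the Python sketch below computes a long rate from the current short rate and a path of expected future short rates. The rates and the premium are made-up numbers and the function name is hypothetical.

```python
def long_rate(short_rates, m, premium=0.0):
    """Long rate as in (13.23).

    short_rates: current m-period rate followed by the expected future
                 m-period rates, one per interval of length m
    m:           length of each interval
    premium:     the constant a(n); zero under the pure expectations hypothesis
    """
    n = m * len(short_rates)
    return premium + (m / n) * sum(short_rates)


# Current 1-period rate of 3%, expected to be 4% and 5% in the next two periods
pure = long_rate([0.03, 0.04, 0.05], m=1)
with_premium = long_rate([0.03, 0.04, 0.05], m=1, premium=0.005)
print(pure, with_premium)
```

An upward sloping path of expected short rates gives a long rate above the current short rate, and a positive constant premium shifts the long rate up by the same amount in every period.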
There are several reasons for why bonds should have risk premia. First, the real return of
a long bond is very sensitive to inflation changes—probably more than equity. Bonds are
therefore likely to have inflation risk premia. Second, long bonds are risky for investors
who don’t intend to keep them until maturity, and will therefore have term premia. Third,
some bonds are not traded much (for instance, off-the-run bonds and many index-linked
bonds), so they are likely to have liquidity premia.
[Figure: regression coefficients $\delta_0$, $\delta_1$, $\delta_2$ over 1970 to 2010, and the regressors $(1, n_1, n_2)$ by maturity (1d to 10y). Decomposition of $y$: $y = \delta_0 + \delta_1 n_1 + \delta_2 n_2$, where $(1, n_1, n_2)$ are rotated (uncorrelated).]
The Vasicek model assumes that the short interest rate is an AR(1). The specification
typically involves shifting the mean of the process to allow for risk (term) premia. To
simplify, I will crudely assume that there are some unspecified constant premia (the
expectations hypothesis). (The more general formulation derives the risk premia in terms of
the mean reversion and volatility of the short rate.)
To simplify the notation, let $r_t$ be the short rate. It follows an AR(1)
\[ r_{t+1} - \mu = \rho (r_t - \mu) + \varepsilon_{t+1}. \tag{13.24} \]
[Figure: Federal funds rate, 1950 to 2010, with estimated coefficients in $y_t = a + \rho y_{t-1} + \varepsilon_t$.]
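The process can also be written in terms of changes as
\[ r_{t+1} - r_t = a(\mu - r_t) + \varepsilon_{t+1}, \]
which can be rearranged as
\[ r_{t+1} - \mu = (1 - a)(r_t - \mu) + \varepsilon_{t+1}, \]
which is of the same form as before. With $0 < a < 1$ (that is, with $0 < \rho < 1$) the process
is mean reverting.
Iterating on (13.24) gives the forecast of the short rate $s$ periods ahead
\[ E_t r_{t+s} = (1 - \rho^s)\mu + \rho^s r_t. \tag{13.25} \]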
We now assume that the expectations hypothesis of interest rates holds. Using this and $y_t(1) = r_t$
in (13.23) gives the long interest rate. For instance, the $n = 2$ rate is
\[ y_t(2) = a(2) + \frac{1}{2} \left[ r_t + (1 - \rho)\mu + \rho r_t \right] = a(2) + (1 - \rho)\mu/2 + r_t (1 + \rho)/2. \tag{13.26} \]
[Figure 13.9: Vasicek model, spot rates for different initial short rates. Interest rate (%) against years to maturity for short rates of 3%, 5% and 7%; $\rho = 0.9$ (quarterly), $\mu = 0.05$.]
In this model, all movements of the yield curve are driven by one variable (here the short
rate), so it is a one-factor model. However, the shifts of the yield curve are parallel only if
$\rho = 1$ (the random walk model) since then $C(s) = s$ in (13.27), so we get
\[ y_t(n) = a(n) + r_t. \tag{13.28} \]
Example 13.14 (Vasicek model) For $\rho = 0.9$ and $\mu = 0.05$, (13.27) gives (assuming no
risk premia)
\[ \begin{bmatrix} y_t(1) \\ y_t(2) \\ y_t(3) \\ y_t(4) \end{bmatrix} \approx \begin{bmatrix} 0 \\ 0.0025 \\ 0.0048 \\ 0.007 \end{bmatrix} + \begin{bmatrix} 1 \\ 0.95 \\ 0.90 \\ 0.86 \end{bmatrix} r_t. \]
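The numbers in Example 13.14 can be checked without a closed-form expression, by averaging the forecasts in (13.25) as prescribed by (13.23) with $m = 1$ and $a(n) = 0$. The Python sketch below does that; the function names are illustrative only.

```python
def expected_short_rate(r, s, rho, mu):
    """Forecast of the short rate s periods ahead, as in (13.25)."""
    return (1.0 - rho ** s) * mu + rho ** s * r


def vasicek_yield(r, n, rho, mu, premium=0.0):
    """n-period rate as the average of expected short rates, (13.23) with m = 1."""
    forecasts = [expected_short_rate(r, s, rho, mu) for s in range(n)]
    return premium + sum(forecasts) / n


def loadings(n, rho, mu):
    """Intercept and slope of the n-period rate as a linear function of the short rate."""
    intercept = vasicek_yield(0.0, n, rho, mu)
    slope = vasicek_yield(1.0, n, rho, mu) - intercept
    return intercept, slope


rho, mu = 0.9, 0.05
table = [loadings(n, rho, mu) for n in (1, 2, 3, 4)]
for n, (intercept, slope) in zip((1, 2, 3, 4), table):
    print(n, round(intercept, 4), round(slope, 2))
```

The slope coefficients decline with maturity because the short rate is expected to revert towards its mean, so long rates react less than one-for-one to the short rate.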
This model allows us to calculate (or rather estimate) the proper way of hedging a
bond. In particular, it can be used to calculate the ratio of the yield changes ($\Delta\theta_A/\Delta\theta_H$)
in (13.13).
To illustrate that, consider zero coupon bonds. Then (13.27) can be used directly: the
spot rate coincides with the yield to maturity (which enters the hedging formula).
The ratio of yield changes (corresponding to $\Delta\theta_A/\Delta\theta_H$ in (13.13)) for two bonds with
maturities of $s$ and $q$ periods is
\[ \frac{\Delta y_t(s)}{\Delta y_t(q)} = \frac{C(s)/s}{C(q)/q}. \tag{13.30} \]
For bonds with coupons (or other portfolios of zero coupon bonds), the ratio of yield
changes typically has to be calculated numerically, but that is straightforward.
This ratio is unity only in the random walk model (13.28). Since the duration equals
the maturity (for a zero coupon bond), the hedge ratio in (13.13) becomes
\[ h = \frac{s}{q} \times \frac{C(s)/s}{C(q)/q} = \frac{C(s)}{C(q)}. \tag{13.31} \]
Clearly, the hedge using (13.31) will still suffer from the approximation error (convexity).
Example 13.15 (Hedging in a Vasicek model) If we want to hedge the 4-period bond by
using the 2-period bond in Example 13.14, then we have the ratio of yields
\[ \frac{\Delta y_t(4)}{\Delta y_t(2)} = \frac{0.86}{0.95} \approx 0.91. \]
The hedge ratio is therefore
\[ h = \frac{4}{2} \times \frac{0.86}{0.95} \approx 1.81, \]
which is slightly lower than 2 (the ratio of durations), since longer yields are less sensitive
to the short rate changes in the Vasicek model (as long as $\rho < 1$).
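The hedge ratio in Example 13.15 can be computed directly from (13.30) and (13.31). The Python sketch below assumes that $C(s) = 1 + \rho + \ldots + \rho^{s-1}$, which is consistent with the loadings in Example 13.14 (for instance, $C(2)/2 = 0.95$ for $\rho = 0.9$); the function names are hypothetical.

```python
def c_sum(s, rho):
    """Assumed form of C(s): 1 + rho + ... + rho^(s-1)."""
    return sum(rho ** j for j in range(s))


def yield_change_ratio(s, q, rho):
    """Ratio of yield changes for maturities s and q, as in (13.30)."""
    return (c_sum(s, rho) / s) / (c_sum(q, rho) / q)


def hedge_ratio_zero_coupon(s, q, rho):
    """Hedge ratio for zero coupon bonds, as in (13.31)."""
    return (s / q) * yield_change_ratio(s, q, rho)


ratio = yield_change_ratio(4, 2, 0.9)
h = hedge_ratio_zero_coupon(4, 2, 0.9)
h_random_walk = hedge_ratio_zero_coupon(4, 2, 1.0)
print(round(ratio, 3), round(h, 2), h_random_walk)
```

With $\rho = 1$ the function returns the plain ratio of maturities, which is the parallel-shift case in (13.14).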
All one-factor models (not least the Vasicek model) imply that all yields are perfectly
correlated (there is a common single driving force) and only fairly limited yield curve
movements are possible. Multi-factor models overcome most of those limitations. For
instance, the model in Nelson and Siegel (1987) is a two-factor model.
13.4 Interest Rates and Macroeconomics
This section outlines three (not mutually exclusive) macroeconomic approaches to mod-
elling the yield curve.
Also, let $y^r_t(n)$ be the one period continuously compounded real interest rate (an interest
rate measured in goods).
The Fisher equation (here in the form of continuously compounded rates) says that the
nominal interest rate includes compensation for inflation expectations, $E_t \pi_{t+n}$, the
real interest rate, $y^r_t(n)$, and possibly a constant (across time) risk premium
\[ y_t(n) = E_t \pi_{t+n} + y^r_t(n) + \text{risk premium}. \]
Example 13.16 (Fisher equation) Suppose the nominal interest rate is $y(n) = 0.07$, the
real interest rate is $y^r(n) = 0.03$, and the nominal bond has no risk premium,
then the expected inflation is $E_t \pi_{t+n} = 0.04$.
The same type of relation holds for forward rates. The Fisher equation suggests a
framework for analysing nominal interest rates in terms of real interest rates and inflation
expectations. This is commonly used for long rates. Information about real interest rates
can be elicited from index-linked bonds, that is, bonds which give automatic compensation
for actual inflation.
Empirical results typically indicate that there are non-trivial movements in the real
interest rate and/or risk premia—especially for short forecasting horizons. This holds also
when inflation expectations, as measured by surveys, are used as the dependent variable.
Inflation expectations seem to vary by less than the interest rate. It is therefore not
straightforward to extract inflation expectations from nominal interest rates.
The Fisher equation could also be embedded in a macro model to construct a sophisticated
(and complicated) model of the yield curve. This involves using macro theory/empirics
to model how real interest rates and inflation expectations (for different maturities)
depend on the state of the economy.
[Figure: US inflation and T-bill rate, and US interest rate minus inflation, 1960 to 2010; two further panels cover 2000 to 2010.]
The expectations hypothesis of interest rates says that long interest rates equal an average
of expected future short rates, possibly with a constant (across time, not maturities) risk
premium as in (13.23). Alternatively, it says that forward rates equal expected future spot rates.
The expectations hypothesis is often used to calculate implied “forecasts” of future
short interest rates. For instance, suppose the central bank increases its policy rate (typically
a very short rate, at most a week or two). This is likely to affect also longer interest
rates, but how is another matter. Let us consider a few different cases. For simplicity we
assume that risk premia are unaffected by this move in the policy rate.
[Figure: scatter plots of inflation against the interest rate (levels) and of the change in inflation against the change in the interest rate (changes).]
First, one possibility is that only the very short interest rates change, and that all longer
interest rates stay unchanged. This would happen if the policy move was well anticipated.
Second, another possibility is that most long interest rates increase. Under the expec-
tations hypothesis of interest rates the interpretation is that the market now expects high
short interest rates also in the future. That is, that the central bank will not reverse its pol-
icy action in the foreseeable future. If we are willing to assume that the real interest rate
was not affected by the policy move, then one possible interpretation is that the central
bank has received information about a long-lasting inflation pressure.
Third, and finally, short rates may increase, but really long interest rates decrease. A
common interpretation of this scenario is that the central bank has become more inflation
averse. It therefore raises the policy rate to bring down inflation. If the market believes
that it will succeed, then it follows that it will eventually be possible to lower interest rates
(when inflation and inflation expectations are lower).
The expectations hypothesis has been tested many times, typically by an ex post linear
regression (realized interest rates regressed on lagged forward rates). The results often
reject the expectations hypothesis, but the results depend on how the test is done. It is not
clear, however, if the rejection is due to systematic risk premia or to fairly small samples
(compared to the long swings in interest rates). The expectations hypothesis gets more
support when survey data on interest rate expectations is used instead of realized interest
rates.
[Figure 13.13: US 12-month interest and average federal funds rate (next 12 months). Panels: expectations hypothesis in levels and in changes. Sample: US 12-month interest rates and next-year average federal funds rate, 1970:1 to 2010:7.]
Uncovered interest rate parity says that the difference between a domestic and foreign
interest rate equals the expected depreciation plus a constant (across time, not maturities)
risk premium
\[ y_t(n) - y^*_t(n) = E_t s_{t+n} - s_t + \varphi_n, \tag{13.34} \]
where $y^*_t(n)$ is a (continuously compounded) foreign interest rate, and $s_t$ is the logarithm
of the exchange rate (number of domestic currency units per foreign currency unit). If
this condition holds with a zero risk premium, then the expected return from investing in
foreign bonds and then buying domestic currency equals the known return from investing in
domestic bonds.
Empirical evidence suggests that there might be large movements in the risk premia
over time (or that there have been systematic surprises in historical samples).
Monetary policy is a crucial part of the macroeconomic picture these days, so it is impor-
tant to understand how monetary policy is formed. It has not always been this way: there
are long periods when many countries adopted a very simple (or so it seemed) mone-
tary policy by pegging the currency to another currency. Macroeconomic policy was then
synonymous with fiscal policy. Recently, the roles have changed.
[Figure: GBP/USD exchange rate in t+90 against the forward exchange rate in t, and depreciation (%) to t+90 against the forward premium in t. Sample: 1988:7 to 2010:9. Exchange rate level: regression of realized exchange rate on forward exchange rate gives coefficient = 0.81, R2 = 0.67. MSE of various methods (OLS, random walk, forward rate): 10.20, 9.94, 10.03.]
Modern macro models are often smaller than the older macroeconometric models and
they pay more attention to both the supply side of the economy and the role of expecta-
tions. These models try to capture the key elements in the way central banks (and most
other observers) reason about the interaction between inflation, output, and monetary pol-
icy.
In these models, inflation depends on expected future inflation (some prices are set
today for a long period and will therefore be affected by expectations about future
costs and competitors’ prices), lagged inflation, and a “Phillips effect” where an output
gap (output less trend output) affects price setting via demand pressure. For instance,
inflation ($\pi_t$) is often modelled as
\[ \pi_t = \alpha E_t \pi_{t+1} + \beta \pi_{t-1} + \gamma x_t + \varepsilon_t, \tag{13.35} \]
where $x_t$ is the output gap and $\varepsilon_t$ can be interpreted as “cost push” shocks (wage
demands, oil price shocks). This equation can be said to represent the supply side of the
economy and it is typically derived from a model where firms with some market power
want to equate marginal revenues and marginal costs, but choose to change prices only
gradually.
The demand side of the economy is modelled from consumers’ savings decision,
where the trade off between consumption today and tomorrow depends on the real
interest rates. Simplifying by setting consumption equal to output we get something like
the following equation for the output gap
\[ x_t = \lambda x_{t-1} - \sigma (i_t - E_t \pi_{t+1}) + u_t, \tag{13.36} \]
where $i_t$ is the nominal interest rate (set by the central bank) and $u_t$ is a shock to demand.
Note that the expected real interest rate affects demand (negatively).
In some cases, the real exchange rate is added to both (13.35) and (13.36), capturing
price increases on imported goods and foreign demand for exports, respectively. The
exchange rate is then linked to the rest of the model via an assumption of uncovered
interest rate parity (that is, expected exchange rate depreciation equals the interest rate
differential).
Some of the important features of this simple model are: (i) inflation expectations
matter for today’s inflation (think about wage inflation); (ii) the instrument for monetary
policy, the short interest rate $i_t$, can ultimately affect inflation only via the output gap; (iii)
it is the real, not the nominal, interest rate that matters for demand.
To make the model operational, two more things must be specified: the monetary policy
(the way the interest rate is set) and the expectations in (13.35)–(13.36).
It is common to assume that the central bank has some instrument rule like the famous
“Taylor rule”
$$i_t = 0 + 0.5 x_t + 1.5 \pi_t + v_t. \tag{13.37}$$
The residual $v_t$ is a “monetary policy shock,” which picks up factors left out of the model
(for instance, the central bank’s concern for the banking sector or simply changes in the
central bank’s objectives). This simple reaction function has been able to track US monetary
policy fairly well over the last decade or so. Another approach to finding a policy rule is
to assume that the central bank has some loss function that it minimizes by choosing a
policy rule. This loss function is often a weighted average of the variance of inflation and
the variance of the output gap. The policy rule is the solution of the minimization problem,
and can often look more complicated than the Taylor rule. However, there is one
interesting special case. Suppose the central bank wants to minimize the (unconditional)
variance of inflation. The formal optimization problem is then to choose the policy rule
that minimizes $\mathrm{Var}(\pi_t)$.
The solution is that the interest rate should be set so that actual inflation is zero (here
the mean) in every period. If the model is changed so there is a time lag between the
interest rate decision and its effect on inflation (for instance, by letting inflation in (13.35)
react to $x_{t-1}$ instead of $x_t$), then the interest rate should be set so that the conditional
expectation of next period’s inflation is zero (the mean), $E_t \pi_{t+1} = 0$. This type of “rule”
is used in much of the monetary policy debate.
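The Taylor rule (13.37) is simple enough to evaluate directly. The following is a minimal
Python sketch; the function name and the input values are hypothetical, and it is assumed
that the output gap, inflation and the interest rate are all measured in percent.

```python
def taylor_rule(output_gap, inflation, shock=0.0):
    """Interest rate implied by the Taylor rule (13.37).

    Assumption: all inputs and the result are in percent.
    """
    return 0.0 + 0.5 * output_gap + 1.5 * inflation + shock


# Hypothetical inputs: a 1% output gap, 2% inflation, no policy shock
print(taylor_rule(1.0, 2.0))
```

With these inputs the rule gives 0.5 × 1 + 1.5 × 2 = 3.5. Since the coefficient on inflation
exceeds one, a rise in inflation raises the nominal rate more than one for one.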
The expectations in (13.35)–(13.36) can be handled in many ways. Perhaps the most
straightforward way is to assume that the expectations about the future equal the current
value of the same variable (a “random walk”). A more satisfactory way is to use survey
data on inflation expectations. Finally, many model builders assume that expectations are
“rational” (or “model consistent”) in the sense that the expectation equals the best guess
we could make under the assumption that the model is correct. This latter approach typically
requires a sophisticated way of solving the model (as the model both generates the best
guesses and depends on them).
There is a two-way causality: inflation and the real economy (which depend on the real
interest rate) affect monetary policy, and monetary policy can surely affect inflation and
the real economy. This makes it difficult to analyse and forecast interest rates. However,
for short term forecasting, the emphasis is typically on forecasting the next monetary
policy move. Long run forecasting relies more on understanding the determinants of real
interest rates and inflation, which depend on the general business cycle prospects, but
also on the long run stance of monetary policy (“tough on inflation or not?”).
Kolb and Stekler (1996) use a semi-annual survey of (12 to 40) professional analysts’
interest rate forecasts published in the Wall Street Journal. The (6 months ahead) forecasts
are for the 6-month T-bill rate and the yield on 30-year government bonds. The paper
studies four questions, and I summarize the findings below.
1. Q. Is the distribution of the forecasts (across forecasters) at any point in time sym-
metric? (Analyzed by first testing if the sample distribution could be drawn from a
normal distribution; if not, then checking asymmetry (skewness).) A. Yes, in most
periods. (The authors argue why this makes the median forecast a meaningful
representation of a “consensus forecast.”)
2. Q. Are all forecasters equally good (in terms of ranking of (absolute?) forecast
error)? A. Yes for the 90-day T-bill rate; No for the long bond yield.
3. Q. Are some forecasters systematically better (in terms of absolute forecast error)?
(Analyzed by checking if the absolute forecast error is below the median more than
50% of the time) A. Yes.
4. Q. Do the forecasts predict the direction of change of the interest rate? (Analyzed
by checking if the forecast gets the sign of the change right more than 50% of the
time.) A. No.
Hartzmark (1991) has data on daily futures positions of large traders on eight different
markets, including futures on 90-day T-bills and on government bonds. He uses this data
to see if the traders changed their position in the right direction compared to realized
prices (in the future) and if they did so consistently over time.
The results indicate that these large investors in T-bills and bond futures did no better
than an uninformed guess of the direction of change of the bill and bond prices. He gets
essentially the same results if the size of the change in the position and in the price are
also taken into account.
There is of course a distribution of how well the different investors do, but it looks
much like one generated from random guesses (uninformed forecasts). The investors
change places in this distribution over time: there is very little evidence that successful
investors continue to be successful over long periods.
13.5.4 Long Rates as Forecasts of Future Short Rates: The Expectations Hypothesis
The expectations hypothesis has been tested many times, typically by an ex post linear
regression (realized interest rates regressed on lagged forward rates). The results typi-
cally reject the expectations hypothesis. It is not clear, however, if the rejection is due to
systematic risk premia or to fairly small samples (compared to the long swings in interest
rates). The expectations hypothesis gets more support when survey data on interest rate
expectations is used instead of realized interest rates.
There are many different types of risk premia on fixed income markets.
Nominal bonds are risky in real terms, and are therefore likely to carry inflation risk
premia. Long bonds are risky because their market values fluctuate over time, so they
probably have term premia. Corporate bonds and some government bonds (in particular,
from developing countries) have default risk premia, depending on the risk of default.
Interbank rates may be higher than T-bill rates of the same maturity for the same reason (see
the TED spread, the spread between 3-month Libor and T-bill rates) and illiquid bonds
may carry liquidity premia (see the spread between off-the-run and on-the-run bonds).
Figures 13.15–13.18 provide some examples.
[Figure: Long-term interest rates, 1970–2010: Baa (corporate), Aaa (corporate), and 10-year
Treasury yields.]
[Figure: TED spread (3-month LIBOR − T-bill), 1990–2010, and a close-up for 2007–2010.]
[Figure: Liquidity premium on the Treasury market, 1998–2008: spread between off-the-run and
on-the-run Treasury bonds.]
Bibliography
Cochrane, J. H., 2001, Asset pricing, Princeton University Press, Princeton, New Jersey.
Hartzmark, M. L., 1991, “Luck versus forecast ability: determinants of trader performance
in futures markets,” Journal of Business, 64, 49–74.
Hull, J. C., 2006, Options, futures, and other derivatives, Prentice-Hall, Upper Saddle
River, NJ, 6th edn.
Kolb, R. A., and H. O. Stekler, 1996, “How well do analysts forecast interest rates,”
Journal of Forecasting, 15, 385–394.
Nelson, C., and A. Siegel, 1987, “Parsimonious modeling of yield curves,” Journal of
Business, 60, 473–489.
14 Basic Option Pricing
Main References: Elton, Gruber, Brown, and Goetzmann (2010) 23–24 and Hull (2006)
5 and 8–10
Additional references: McDonald (2006) 9–12; Cochrane (2001) 17–18
A European call option contract traded in $t$ may stipulate that the buyer of the contract
has the right to buy one unit of the underlying asset from the issuer of the option on the
expiration date $t+m$ at the strike price $K$. See Figure 14.1 for the timing convention.
The payoff at exercise is zero or, if larger, the price of the underlying asset, $S_{t+m}$,
minus the strike price
$$\max(0, S_{t+m} - K).$$
Clearly, an owner of a call option benefits from an increase in the price of the underlying
asset (exercise the right to buy for $K$ and sell the asset at a higher price). The payoff of
the original seller of the option (the option writer who has a short option position) is the
mirror image of the buyer’s payoff: the buyer’s gain is the writer’s loss. See Figure 14.2
for an illustration.
[Figure 14.1: Timing convention. In $t$: buy option, agree on $K$, pay $C$. In $t+m$: if $S > K$,
pay $K$ and get asset; otherwise, do nothing.]
[Figure 14.2: Profit of options, Call(K) and Put(K), against the stock price.]
A put option instead gives the buyer of the contract the right to sell one unit of the
underlying asset. The put price is here denoted by $P$. An owner of a put option benefits
from a decrease in the price of the underlying asset (buy the asset cheaply and exercise
the right to sell for $K$). The payoff is
$$\max(0, K - S_{t+m}).$$
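The call and put payoffs at expiration, $\max(0, S_{t+m} - K)$ and $\max(0, K - S_{t+m})$, can be
tabulated for a few prices of the underlying asset. This is a minimal Python sketch; the
function names, the strike price and the asset prices are hypothetical.

```python
def call_payoff(s_end, k):
    """Payoff of a European call at expiration: max(0, S - K)."""
    return max(0.0, s_end - k)


def put_payoff(s_end, k):
    """Payoff of a European put at expiration: max(0, K - S)."""
    return max(0.0, k - s_end)


K = 40.0  # hypothetical strike price
for s_end in (30.0, 40.0, 50.0):
    # The writer's payoff is the negative of the buyer's payoff
    print(s_end, call_payoff(s_end, K), put_payoff(s_end, K))
```

At most one of the two payoffs is positive for a given price of the underlying asset: the
call pays off above the strike price, the put below it.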
[Figure: SMI call options (Sep 2009) with trade. Strike price (3000–7500) against trade date
(Jan–Oct).]
[Figure: SMI put options (Sep 2009) with trade. Strike price (3000–7500) against trade date
(Jan–Oct).]
[Figure: SMI call and put options (Sep 2009), trade volume. Number of contracts (0–10000)
against trade date (Jan–Oct).]
14.1.2 Financial Engineering
Replicating a Forward
Options markets are often very liquid, and are therefore useful for constructing replicating
portfolios. The portfolio Call($K$) - Put($K$) for $K = F$ (the forward price) replicates
a forward contract, so it is a synthetic forward. Clearly, we can then replicate a short
position in a forward contract by selling such a portfolio. See Figure 14.6.
Portfolio Insurance
A protective put is a combination of a put and a position in the underlying asset. This
allows the owner to capture the upside of the price movement (of the underlying), while
insuring against the downside. This is indeed very similar to just buying a
call option. See Figure 14.6.
A variation on the synthetic short forward is the collar: -Call($K_2$) + Put($K_1$) where
$K_1 < K_2$. It also looks like a short position in a forward contract, except that the payoff
is flat between the strike prices. Clearly, this is betting on a large price decrease. Selling
a collar (or reversal) is instead a bet on a large price increase.
A collar (reversal) can be used to hedge a long (short) position in the underlying asset,
except that there is no hedge between the strike prices. It provides insurance outside the
strike prices. See Figure 14.7.
[Figure 14.6: Payoff diagrams against the stock price. Panels: synthetic forward, C(K) − P(K);
protective put, P(K) + S, with K = S0; and a panel showing −Underlying, −(S−S0), Call(K) at the
money, and −covered call, with K = S0.]
To bet on a small increase in the price of the underlying asset we can use a bull spread:
Call($K_1$) - Call($K_2$) where $K_1 < K_2$. This portfolio has flat payoffs outside the strike
prices, but a payoff that increases with the underlying asset between them. Selling a
bull spread creates a bear spread, which is a bet on a small decrease of the underlying
price. (These spreads can also be constructed by combining puts.) See Figure 14.7.
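The payoffs of these portfolios at expiration follow from adding up call and put payoffs.
The Python sketch below is only an illustration: the function names, strike prices and
asset prices are hypothetical, option premia are ignored (payoffs, not profits), and the
collar is assumed to be long the put with strike $K_1$ and short the call with strike $K_2$,
so that it resembles a short forward.

```python
def call(s, k):
    return max(0.0, s - k)


def put(s, k):
    return max(0.0, k - s)


def synthetic_forward(s, k):
    return call(s, k) - put(s, k)      # equals S - K for every S


def protective_put(s, k):
    return put(s, k) + s               # equals max(S, K): a floor at K


def collar(s, k1, k2):
    return put(s, k1) - call(s, k2)    # zero between the strike prices


def bull_spread(s, k1, k2):
    return call(s, k1) - call(s, k2)   # between 0 and K2 - K1


K, K1, K2 = 42.0, 38.0, 46.0           # hypothetical strike prices
for s in (30.0, 38.0, 42.0, 46.0, 54.0):
    print(s, synthetic_forward(s, K), protective_put(s, K),
          collar(s, K1, K2), bull_spread(s, K1, K2))
```

The printed rows show the shapes described in the text: a straight line for the synthetic
forward, a floor for the protective put, a flat segment between the strike prices for the
collar, and flat payoffs outside the strike prices for the bull spread.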
[Figure 14.7: Payoff diagrams. Panels: straddle, C(K) + P(K); strangle, C(K2) + P(K1); collar,
Put(K1) − Call(K2); bull spread, Call(K1) − Call(K2).]
The buyer always stands the risk of getting a zero payoff, that is, a return of -100%. For
instance, the net return on a European call option is
$$\frac{\max(0, S_{t+m} - K)}{C} - 1, \tag{14.3}$$
where $C$ is the call option price. Whenever the option isn’t exercised, the whole investment
is lost (and the return is -100%).
It is then clear that option returns cannot be normally (or even lognormally) distributed:
the density function has a spike at -100% (whose probability is the same as the probability
of $S_{t+m} \le K$). This means that we cannot motivate “mean-variance” pricing of
options by referring to a normal distribution of the return. (This does not rule out mean-
variance pricing, which could be motivated by, for instance, mean-variance preferences.)
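The spike in the return distribution can be seen in a small simulation. This is a rough
Python sketch under stated assumptions: the future asset price is drawn from a lognormal
distribution, and all parameter values (current price, strike price, call price, volatility,
horizon) are hypothetical choices for illustration, not estimates.

```python
import math
import random


def call_net_return(s_end, k, c):
    """Net return on a European call: max(0, S - K)/C - 1."""
    return max(0.0, s_end - k) / c - 1.0


random.seed(1)
S0, K, C = 42.0, 38.0, 5.5          # hypothetical prices
sigma, m = 0.2, 0.5                 # hypothetical volatility and horizon

# Assumption: lognormal future price with zero drift in logs
prices = [S0 * math.exp(random.gauss(0.0, sigma * math.sqrt(m)))
          for _ in range(10_000)]
returns = [call_net_return(s, K, C) for s in prices]

share_total_loss = sum(r == -1.0 for r in returns) / len(returns)
share_not_exercised = sum(s <= K for s in prices) / len(prices)
print(share_total_loss, share_not_exercised)
```

The two printed shares coincide: every draw with a future price at or below the strike
price gives a return of exactly -100%, which is the point mass discussed above.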
Since options are exposed to risk factors, they can be used to hedge risk, that is, to
create an “insurance.” For instance, an owner of the underlying asset can hedge by buying
a put option. This guarantees that he/she always gets at least the strike price.
Similarly, an investor who has short-sold the underlying asset (borrowed the asset from
someone and then sold it) can hedge by buying a call option. This puts a limit (the strike
price) on how much he/she will have to pay for the asset when it is time to return it to
the lender.
[Figure 14.8: Option price as a function of the strike price (30–50), for a call and a put.]
Options prices depend on many things, but there are some fairly general results.
First, call option prices are decreasing in the strike price, while put option prices are
increasing in the strike price. See Figure 14.8 for an illustration.
The intuition is that a higher strike price means that an owner of a call option will
have to pay more in case of exercise, and there is also a lower chance of exercise. This
is illustrated in Figure 14.9.
Second, both call and put option prices are typically increasing in the dispersion of the
distribution of the future price of the underlying asset. The intuition for the second result
is that a wider dispersion increases the probability of a really high price of the underlying
asset (which is good). Of course, it also increases the probability of a really low asset
price, but that is of less concern since the call option payoff is bounded below at zero.
This is illustrated in Figure 14.10.
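Both results can be illustrated with expected payoffs. The Python sketch below is not a
pricing model: it assumes, purely for illustration, that the future asset price is normally
distributed and it ignores discounting and risk adjustment. The closed-form expressions are
the standard ones for the mean of a truncated normal variable; the function names and the
numbers (mean 42, standard deviations 6 and 9) are hypothetical.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def expected_call_payoff(mean, std, k):
    """E max(0, S - K) when S is N(mean, std^2)."""
    d = (mean - k) / std
    return (mean - k) * norm_cdf(d) + std * norm_pdf(d)


def expected_put_payoff(mean, std, k):
    """E max(0, K - S) when S is N(mean, std^2)."""
    d = (mean - k) / std
    return (k - mean) * norm_cdf(-d) + std * norm_pdf(d)


for k in (42.0, 47.0):
    for std in (6.0, 9.0):
        print(k, std,
              round(expected_call_payoff(42.0, std, k), 3),
              round(expected_put_payoff(42.0, std, k), 3))
```

In this setting, a higher strike price lowers the expected call payoff and raises the
expected put payoff, while a wider dispersion raises both.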
[Figures 14.9–14.10: Distribution of the future asset price (horizontal axis: asset price,
20–70). One pair of panels compares K = 42 and K = 47 (both with Std = 0.14); the other pair
compares Std = 0.14 and Std = 0.21 (both with K = 42).]
An American option is like a European option, except that it can be exercised on any day
before or on the expiration date. This means that an American option has more rights than
a European option and is therefore worth at least as much:
$$C_A \ge C_E \text{ and } P_A \ge P_E. \tag{14.4}$$
If there are no dividends, then it is never optimal to exercise an American call option
early (such a call option will have the same price as a European call option), but it can
still be optimal to exercise an American put option early. If there are dividends, then the
American call option should only be exercised just prior to the dividend payments, while
an American put should perhaps also be exercised at other times.
Forward prices play an important role in simplifying option analysis, so we first discuss
the forward-spot parity.
The present value of one unit paid $m$ periods into the future must be the price of a
bond, $B(m)$, maturing at the same time. We therefore have that the present value of $Z$ is
$$B_t(m) Z = [1 + Y_t(m)]^{-m} Z = e^{-m y_t(m)} Z,$$
where $Y_t(m)$ is the effective spot interest rate, and $y_t(m)$ is the continuously compounded
interest rate ($y_t(m) = \ln[1 + Y_t(m)]$).
Example 14.1 (Present value) With $y_t(m) = 0.05$ and $m = 3/4$ we have the present
value $e^{-0.05 \times 3/4} Z \approx 0.963 Z$.
A forward contract specifies (among other things) which asset should be delivered
at the expiration and what the price is then (the forward price). See Figure 14.11 for an
illustration.
[Figure 14.11: Timing convention of a forward contract. In $t$: write contract, agree on $F$.
In $t+m$: pay $F$, get asset.]
Proposition 14.2 (Forward-spot parity, no dividends) The forward price, $F_t(m)$, contracted
in $t$ (but to be paid in $t+m$) on an asset without dividends satisfies
$$e^{-m y_t(m)} F_t(m) = S_t. \tag{14.8}$$
The intuition is that the forward contract is like buying the underlying asset on credit:
$e^{-m y_t(m)} F_t(m)$ can be thought of as a prepaid forward contract.
Proof. (of Proposition 14.2) Portfolio A: enter a forward contract, with a present value
of $e^{-my}F$. Portfolio B: buy one unit of the asset at the price $S$. Both portfolios give one
asset at expiration, so they must have the same costs today.
Proposition 14.3 (Forward-spot parity, discrete dividends) Suppose the underlying asset
pays the dividend $D_i$ at $m_i$ ($i = 1, \ldots, n$) periods into the future (but before the expiration
date of the forward contract). The dividends must be known already in $t$. The forward
price then satisfies
$$e^{-m y_t(m)} F_t(m) = S_t - \sum_{i=1}^{n} e^{-m_i y_t(m_i)} D_i. \tag{14.9}$$
The last term is the sum of the present values of the dividend payments. The intuition
is that the forward contract does not give the right to these dividends so its value is the
underlying asset value stripped of the present value of the dividends.
Proof. (of Proposition 14.3) Portfolio A: enter a forward contract, with a present value
of $e^{-my}F$. Portfolio B: buy one unit of the asset at the price $S$ and sell the rights to the
known dividends at the present value of the dividends. Both portfolios give one asset at
expiration, so they must have the same costs today.
Proposition 14.4 (Forward-spot parity, continuous dividends) When the dividend is paid
continuously at the rate $\delta$ (of the price of the underlying asset), then
$$e^{-m y_t(m)} F_t(m) = S_t e^{-\delta m}. \tag{14.10}$$
Proof. (of Proposition 14.4) Portfolio A: enter a forward contract, with a present value
of $e^{-my}F$. Portfolio B: buy $e^{-\delta m}$ units of the asset at the price $e^{-\delta m} S$, and then collect
dividends and reinvest them in the asset. Both portfolios give one asset at expiration, so
they must have the same costs today.
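The three forward-spot parities (14.8)–(14.10) can be solved for the forward price. The
following Python sketch does that; the function names and all numerical inputs are
hypothetical, and discrete dividends are passed as (m_i, y_i, D_i) tuples, where y_i is the
continuously compounded rate for maturity m_i.

```python
import math


def forward_no_dividends(s, y, m):
    """Solve (14.8) for F: F = S * exp(m*y)."""
    return s * math.exp(m * y)


def forward_discrete_dividends(s, y, m, dividends):
    """Solve (14.9) for F; dividends is a list of (m_i, y_i, D_i)."""
    pv_dividends = sum(math.exp(-mi * yi) * di for mi, yi, di in dividends)
    return (s - pv_dividends) * math.exp(m * y)


def forward_continuous_dividends(s, y, m, delta):
    """Solve (14.10) for F: F = S * exp((y - delta)*m)."""
    return s * math.exp((y - delta) * m)


S, y, m = 42.0, 0.05, 0.5                      # hypothetical inputs
print(forward_no_dividends(S, y, m))
print(forward_discrete_dividends(S, y, m, [(4 / 12, 0.05, 0.95)]))
print(forward_continuous_dividends(S, y, m, 0.02))
```

In both dividend cases the forward price is lower than without dividends, since the holder
of the forward contract does not receive the dividends paid before expiration.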
14.2.2 Forwards versus Futures
A forward contract is typically a private contract between two investors, and can therefore
be tailor made. A futures contract is similar to a forward contract (write contract,
get something later), but is typically traded on an exchange, and is therefore standardized
(amount, maturity, settlement process). The settlement is either cash settlement or
physical settlement. The latter does not work for synthetic assets like equity indices.
Another important difference is that a forward contract is settled at expiration, whereas
a futures contract is settled daily (“marking-to-market”), which essentially means that
gains and losses (because of price changes) are transferred between issuer and owner
daily, but kept in an interest-bearing account at the exchange. If interest rates
change randomly over time (and they do), the rate at which these gains (losses) are reinvested
(refinanced) will therefore be different from the rate when the futures was issued.
This difference is embedded in the futures price. The proposition below shows that, if
(hypothetically) the interest rate path was non-stochastic, then the forward and futures
prices would be the same. In practice, the difference between forward and futures prices
is typically small.
Proposition 14.6 (Forward vs. futures prices, non-stochastic interest rates) The forward
and futures prices would be the same if the interest rate only changed in a non-stochastic
way.
Proof. (of Proposition 14.6) To simplify the notation, let $t = 0$ and $m = 2$. Also,
let $r_s$ be the continuously compounded one-day interest rate and $f_s$ be the futures price.
Strategy A: have $e^{r_0}$ long futures contracts on (the end of) day 0, increase it to $e^{r_0 + r_1}$ on
day 1. Provided we reinvest the settlements in one-day bills, the day-1 settlement
$e^{r_0}(f_1 - f_0)$ grows to $e^{r_0 + r_1}(f_1 - f_0)$ by day 2, when we also receive the day-2
settlement $e^{r_0 + r_1}(f_2 - f_1)$.
The end-value of strategy A is therefore $e^{r_0 + r_1}(f_2 - f_0)$, which equals $e^{r_0 + r_1}(S_2 - f_0)$
since the value at expiration is the value of the underlying asset. Strategy B: be long $e^{r_0 + r_1}$
forward contracts, which gives a payoff on day 2 of $e^{r_0 + r_1}(S_2 - F_0)$. Both strategies take
on exactly the same risk, so the prices must be the same: $f_0 = F_0$. (The proof relies on
knowing $r_1$ already on day 0.)
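The bookkeeping behind strategy A in the proof can be checked numerically. The Python
sketch below follows the two-day setup of the proof; the function name and the futures
prices and interest rates are hypothetical numbers chosen only to show that the end-value
equals $e^{r_0 + r_1}(f_2 - f_0)$ whatever the intermediate futures price $f_1$ is.

```python
import math


def futures_strategy_end_value(f0, f1, f2, r0, r1):
    """End-value (day 2) of strategy A with daily settlement."""
    # Day 1: settlement on exp(r0) contracts, reinvested at r1 for one day
    day1_settlement = math.exp(r0) * (f1 - f0)
    day1_reinvested = day1_settlement * math.exp(r1)
    # Day 2: settlement on exp(r0 + r1) contracts
    day2_settlement = math.exp(r0 + r1) * (f2 - f1)
    return day1_reinvested + day2_settlement


f0, f2 = 100.0, 103.0               # hypothetical futures prices
r0, r1 = 0.0002, 0.0003             # hypothetical one-day rates
for f1 in (95.0, 100.0, 110.0):
    end_value = futures_strategy_end_value(f0, f1, f2, r0, r1)
    print(f1, end_value, math.exp(r0 + r1) * (f2 - f0))
```

The two printed values agree for every $f_1$, which is why strategy A has the same payoff
as $e^{r_0 + r_1}$ forward contracts when $r_1$ is known on day 0.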
There is a tight link between European call and put prices. If you know one of them (and
the forward price), then you can easily calculate what the other must be. The following
proposition is more precise.
Proposition 14.7 (Put-call parity for European options) The put-call parity for European
options is
$$C - P = e^{-my}(F - K), \tag{14.11}$$
where $e^{-my}(F - K)$ is the present value of the forward price minus the strike price.
Time subscripts and indicators of maturity have been suppressed to make the notation
a bit easier. The parity holds irrespective of whether the underlying asset has dividends or
not (since the expression uses the forward price). Its practical importance is that it allows
us to use two of the assets to replicate the third asset. For instance, we can combine a
call option and a forward contract to replicate a put option, or buy a call and sell a put to
replicate a forward contract.
See Figure 14.6 for an illustration.
Proof. (of Proposition 14.7) Buy one call option and sell one put option, both with
the strike price $K$. This will with certainty give one asset at maturity at the price $K$. The
present value of the cost is $C - P + e^{-my}K$. The same is achieved by entering a forward
contract, where the present value of the cost is $e^{-my}F$.
This formula is very general, but a few special cases are of particular interest.
Combining (14.11) with the forward-spot parities (14.8)–(14.10) gives
$$C - P = S - e^{-my}K \quad \text{if no dividends,} \tag{14.12}$$
$$C - P = S - \sum_{i=1}^{n} e^{-m_i y_t(m_i)} D_i - e^{-my}K \quad \text{if dividends,} \tag{14.13}$$
$$C - P = S e^{-\delta m} - e^{-my}K \quad \text{if continuous dividend rate } \delta. \tag{14.14}$$
Example 14.8 (Put-call parity) $S = 42$; $m = 1/2$; $y = 5\%$; $K = 38$. If $C = 5.5$ for an
underlying asset without dividends, then (14.12) gives
$$5.5 - P = 42 - e^{-0.5 \times 0.05} 38 \quad \text{or} \quad P \approx 0.56.$$
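The calculation in Example 14.8 can be reproduced in a few lines. This Python sketch solves
the no-dividend parity (14.12) for the put price; the function name is hypothetical, while
the inputs are those of the example.

```python
import math


def put_from_call(c, s, k, y, m):
    """European put price from put-call parity without dividends (14.12)."""
    return c - s + math.exp(-m * y) * k


# Inputs from Example 14.8
P = put_from_call(c=5.5, s=42.0, k=38.0, y=0.05, m=0.5)
print(round(P, 2))   # prints 0.56
```

The same function can be turned around to back out a call price from a put price by
rearranging the parity.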
This section discusses early exercise of American options. There are some cases where
we can exclude early exercise, so the American option is priced as a European option. In
other cases, we cannot exclude early exercise—but we may still be able to say something
about when early exercise is likely. More precise answers will require building a model
for the pricing. Clearly, the answer is then model dependent.
American call options on an asset without dividends (until expiration of the option) are
not exercised early. The following proposition is more precise.
Proposition 14.9 (No early exercise, American call, no dividends) An American call op-
tion on an asset without dividends should never be exercised early—but perhaps sold. It
therefore has the same price as a European call option.
Suppose that you are pretty sure that the price of the underlying will drop tomorrow. The
above argument shows that you should still not exercise the call option, but it might be
sensible to sell the option today. If we exercised early, then we would effectively throw
away the put protection (against downside movements) inherent in the call option and be
left with the underlying asset (recall from the European put-call parity that the call option
can be thought of as a portfolio of the underlying, a put, and some cash) and also pay the
strike price now instead of later, neither of which is good (and which a potential buyer
of the call option would be willing to pay for).
Proof. (of Proposition 14.9) To avoid early exercise, selling (getting $C_A$) should be
more profitable than exercising (getting $S - K$), $C_A > S - K$. Put-call parity for European
options (14.12) says
$$C_E = S - K + (1 - e^{-my})K + P_E.$$
The sum of the last two terms is positive (before the expiration date), so $C_E > S - K$. Since
$C_A \ge C_E$ we have
$$C_A > S - K,$$
so selling the option is always more profitable than exercising early. The reason is that
early exercise throws away the put protection ($P_E$) and also the “rebate” due to later
payment of the exercise price (pay $K$ instead of the present value $e^{-my}K$).
[Figure 14.12: American put in the binomial option pricing model (BOPM). Left panel: BOPM
nodes, shaded area shows nodes used in the calculation. Right panel: early exercise, shaded
area shows nodes with early exercise. Axes: spot price (0–150) against time interval (0–0.5).]
American put options on an asset without dividends (until expiration of the option) may
be exercised early. The following proposition is more precise.
Proposition 14.10 (Early exercise, American put, no dividends) An American put option
on an asset without dividends could be exercised early. However, there is no early exercise
if the put option is deep out-of-the-money (high asset price/low strike price) and the interest
rate is low. In particular, there is no early exercise if the corresponding European call
option satisfies $C_E > (1 - e^{-my})K$. For instance, this is always the case if the interest
rate is zero.
[Figure 14.13: American put price from the BOPM against the stock price (32–52), together
with $C - S + K$ and $C - S + e^{-my}K = P_E$. Upper panel: $K$, $m$, $y$, and $\sigma$ are 42, 0.5, 0.05,
0.2; the binomial solution uses 100 time intervals. Lower panel: $y$ is 0.15.]
This means that the American put price is close to the European put price for high
asset price/low strike price and low interest rates, but is higher otherwise.
See Figures 14.12–14.13 for an illustration, based on a numerical solution for the
price of an American put option. The first figure shows in which nodes early exercise is
optimal. The second picture illustrates how the price is related to the European put price
and an upper boundary (to be discussed later).
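A numerical solution of this kind can be sketched with a binomial tree. The Python code
below is only an illustration and not necessarily the method behind the figures: it assumes
the Cox-Ross-Rubinstein parameterization of the up and down moves, and the function name is
hypothetical. The default inputs ($K = 42$, $m = 0.5$, $y = 0.05$, $\sigma = 0.2$, 100 time
intervals) are the ones reported in the figure.

```python
import math


def binomial_put(s, k=42.0, m=0.5, y=0.05, sigma=0.2, n=100, american=True):
    """Put price from an n-step binomial tree (CRR parameterization assumed)."""
    dt = m / n
    u = math.exp(sigma * math.sqrt(dt))        # up factor
    d = 1.0 / u                                # down factor
    disc = math.exp(-y * dt)                   # one-step discount factor
    p = (math.exp(y * dt) - d) / (u - d)       # risk-neutral up probability

    # Payoffs at expiration; j is the number of up moves
    values = [max(0.0, k - s * u**j * d**(n - j)) for j in range(n + 1)]

    # Step backwards through the tree
    for step in range(n - 1, -1, -1):
        for j in range(step + 1):
            value = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            if american:
                # Early exercise if it beats the continuation value
                value = max(value, k - s * u**j * d**(step - j))
            values[j] = value
    return values[0]


for s in (30.0, 42.0, 50.0):
    print(s, binomial_put(s), binomial_put(s, american=False))
```

The American price is never below the European price or the value of immediate exercise,
$K - S$, and the gap to the European price is largest when the stock price is low.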
Example 14.11 (Early exercise of American put option?) When the underlying asset
goes bankrupt, then $S = 0$ and it is known that it will stay at $S = 0$. Exercising the
American put option now gives $K$, whereas waiting until expiration has a present value
of $e^{-my}K$ (which is lower): early exercise is optimal.
Example 14.12 (Early exercise of American put option?) Using the same parameters as
in Example 14.8, we have that $C_E > (1 - e^{-my})K$ is satisfied since
$$5.5 > (1 - e^{-1/2 \times 0.05}) 38 = 0.94,$$
so there is no early exercise of the American put option. The reason is that the
put-call parity for European options (14.12) and the fact $P_A \ge P_E$ give
$$P_A \ge P_E = \underbrace{C_E}_{5.5} + K - S - \underbrace{(1 - e^{-my})K}_{0.94},$$
so selling the put option (getting $P_A$) gives the same as exercising ($K - S$) plus at least
$5.5 - 0.94$. If, for some reason, we instead have $y = 35\%$ (so $(1 - e^{-my})K =
(1 - e^{-1/2 \times 0.35})38 = 6.1$) but the same prices, then we would perhaps get early exercise.
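The sufficient condition $C_E > (1 - e^{-my})K$ is easy to check numerically. The Python sketch
below uses the numbers of Example 14.12; the function names are hypothetical. Note that a
failed check does not prove that early exercise happens, only that it cannot be ruled out.

```python
import math


def discount_rebate(k, y, m):
    """(1 - exp(-m*y)) * K: the gain from paying K later instead of now."""
    return (1.0 - math.exp(-m * y)) * k


def put_early_exercise_ruled_out(c_e, k, y, m):
    """True if C_E > (1 - exp(-m*y)) * K, so the put is not exercised early."""
    return c_e > discount_rebate(k, y, m)


for y in (0.05, 0.35):
    print(y, round(discount_rebate(38.0, y, 0.5), 2),
          put_early_exercise_ruled_out(5.5, 38.0, y, 0.5))
```

With $y = 5\%$ the threshold is about 0.94 and the condition holds; with $y = 35\%$ the
threshold is about 6.1, above the call price of 5.5, so the condition fails.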
Proof. (of Proposition 14.10) To avoid early exercise, selling (getting $P_A$) should be
more profitable than exercising (getting $K - S$), $P_A > K - S$. Put-call parity for European
options (14.12) says
$$P_E = C_E + K - S - (1 - e^{-my})K.$$
If
$$C_E > (1 - e^{-my})K,$$
then $P_A \ge P_E > K - S$ so selling is better than exercising. This means that there is
no early exercise if the European call price is high (high asset price compared to strike
price), the strike price is low, or if the discounting until expiration is low (low interest rate
or small time to expiration). For instance, with a zero interest rate, $P_A \ge C_E + K - S$, so
there is never early exercise as long as $C_E > 0$. If these conditions are not satisfied, we
cannot rule out early exercise.
American call and put options on an asset with dividends (until expiration of the option)
may be exercised early. The following propositions are more precise.
Proposition 14.13 (Early exercise, American call, dividends) An American call option
on an asset with dividends could be exercised early, especially just before a dividend
payment and when the option is deep in-the-money (low strike price/high asset price).
Conversely, there is no early exercise if $(1 - e^{-my})K > \sum_{i=1}^{n} e^{-m_i y_t(m_i)} D_i$, that is, with
a high strike price and low present value of the dividends.
Example 14.14 (Early exercise, American call, dividends?) Suppose there is one dividend
payment four months ahead: $D_1 = 0.95$ at $m_1 = 4/12$. If we use the same parameters
as in Example 14.8, we then have
$$(1 - e^{-1/2 \times 0.05}) 38 = 0.94 > e^{-4/12 \times 0.05} 0.95 = 0.93,$$
so we can rule out early exercise. However, if the dividend payment is at $m_1 = 1/12$, then
we cannot.
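The comparison in Example 14.14 can be checked in the same way. In this Python sketch the
function name is hypothetical, the dividends are passed as (m_i, y_i, D_i) tuples, and the
same 5% rate is assumed for all maturities, as in the example.

```python
import math


def call_early_exercise_ruled_out(k, y, m, dividends):
    """True if (1 - exp(-m*y))*K exceeds the present value of the dividends."""
    pv_dividends = sum(math.exp(-mi * yi) * di for mi, yi, di in dividends)
    return (1.0 - math.exp(-m * y)) * k > pv_dividends


K, y, m = 38.0, 0.05, 0.5                       # from Example 14.8
for m1 in (4 / 12, 1 / 12):
    print(m1, call_early_exercise_ruled_out(K, y, m, [(m1, y, 0.95)]))
```

The dividend paid after four months has a present value just below the threshold, so early
exercise is ruled out; the dividend paid after one month is discounted less, and the
condition fails.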
Proof. (of Proposition 14.13) To avoid early exercise, selling (getting $C_A$) should be
more profitable than exercising (getting $S - K$), $C_A > S - K$. Put-call parity for European
options (14.13) says
$$C_E = S - K - \sum_{i=1}^{n} e^{-m_i y_t(m_i)} D_i + (1 - e^{-my})K + P_E.$$
If
$$(1 - e^{-my})K > \sum_{i=1}^{n} e^{-m_i y_t(m_i)} D_i,$$
and $P_E \ge 0$ (always true), then $C_A \ge C_E > S - K$: selling is better than early exercise.
Hence, there is no early exercise if the present value of dividends is low, the strike price
is high or if the discounting until expiration is large (high interest rate or long time to
expiration). In the opposite case, we cannot rule out early exercise.
Proposition 14.15 (Early exercise, American put, dividends) Early exercise is possible...
There is no put-call parity for American options. However, pricing bounds can be derived.
For an asset without dividends,
$$\underbrace{C_A - S + e^{-my}K}_{P_E} \le P_A \le \underbrace{C_A}_{C_E} - S + K. \tag{14.15}$$
[Figure 14.14: Bounds for an American put, no dividends: $C - S + K$ and $C - S + e^{-my}K = P_E$
against the strike price (30–50). $S$, $m$, $y$, and $\sigma$: 42, 0.5, 0.05, 0.2. The call price is
from Black-Scholes.]
The lower boundary is the European put price from (14.12). The reason is that the
American and European call options have the same prices (the American call option on
an asset without dividends is never exercised early—see Section 14.4). The upper bound
is very similar, except that it involves the strike price, not its present value. Clearly,
when the interest rate is low, then the interval is narrow—and with a zero interest rate it
collapses to the put-call parity of European options. (The latter corresponds to the fact
that an American put option on an asset without dividends is never exercised early if the
interest rate is zero, see Section 14.4).
See Figures 14.13 and 14.14 for illustrations.
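Example 14.17 (Bounds for an American put option) Using the same parameters as in
Example 14.8, we get the following bounds for an American put option (no dividends)
$$5.5 - 42 + e^{-0.5 \times 0.05} 38 \le P_A \le 5.5 - 42 + 38, \quad \text{or} \quad 0.56 \le P_A \le 1.5.$$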
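The bounds in (14.15) are straightforward to compute. This Python sketch evaluates them for
the parameters of Example 14.8; the function name is hypothetical.

```python
import math


def american_put_bounds(c_a, s, k, y, m):
    """Lower and upper bounds on an American put price, no dividends (14.15)."""
    lower = c_a - s + math.exp(-m * y) * k   # the European put price
    upper = c_a - s + k
    return lower, upper


lower, upper = american_put_bounds(c_a=5.5, s=42.0, k=38.0, y=0.05, m=0.5)
print(round(lower, 2), round(upper, 2))   # prints 0.56 1.5
```

The width of the interval is $(1 - e^{-my})K$, so it shrinks to zero as the interest rate
goes to zero.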
Proof. (of Proposition 14.16) The lower boundary is the European put price (since
$C_A = C_E$ when there are no dividends) and it is always true that $P_A \ge P_E$.
The upper boundary follows from the following argument where we compare two
portfolios. Portfolio A: one call option with strike price $K$ plus a deposit of $K$. Portfolio
B: one put option plus one underlying asset. If the put option is held until expiration (the
call is not exercised early), then portfolio A will be worth $\max(0, S_m - K) + e^{my}K$ in
period $m$ (where $m$ is the date of expiration), and portfolio B will be worth
$\max(0, K - S_m) + S_m$, so portfolio A is worth (weakly) more. If, instead, the put is exercised
earlier ($l < m$), then portfolio A will be worth $C_{A,l} + e^{ly}K$ in period $l$, and portfolio B
will be worth $K - S_l + S_l = K$, so portfolio A is worth (weakly) more. In period 0
($0 \le l < m$) we don’t know when/if the early exercise of the put will happen, but we know
that in either case portfolio A will then be worth more than portfolio B: portfolio A must
therefore be worth (weakly) more than B already in 0: $C_{A,0} + K \ge P_{A,0} + S_0$, which is the
upper bound in (14.15).
Proposition 14.18 (Put-call, American option, dividends) With dividends, the upper boundary
in (14.15) is changed by adding the present value of the dividend stream
$$C_A - S + e^{-my}K \le P_A \le C_A - S + K + \sum_{i=1}^{n} e^{-m_i y_t(m_i)} D_i. \tag{14.16}$$
Notice that the lower boundary is not equal to the European put price anymore (since
$C_A \ge C_E$ and the present value of the dividends is not added). Together this means that
the interval is wider with dividends than without dividends.
Proof. (of Proposition 14.18) The lower boundary follows from the following argument.
Buy one call option, lend $e^{-my}K$, and sell one asset: the total value is
$C_A + e^{-my}K - S$, which is the left hand side of (14.16). If the call is exercised prior to expiry,
the payoff is $S - K + e^{-my}K - S = (e^{-my} - 1)K < 0$ which must be less than the value
of the put whose value is nonnegative. If no early exercise, then the payoff at expiration
is $\max(0, S - K) + K - S = \max(0, K - S)$ which is the same as the put payoff.
The upper boundary is a bit trickier, so we leave it for now.
[Figure: Price bounds on European call, plotted against the strike price. Bounds shown: C ≤ PV(F) ≤ S, 0 ≤ C, and PV(F−K) ≤ C. S, m, y: 42, 0.5, 0.05.]
The price of both American and European call options must satisfy the following restrictions
$$C \le e^{-my}F \le S \tag{14.17}$$
$$0 \le C \tag{14.18}$$
$$e^{-my}(F - K) \le C. \tag{14.19}$$
The motivations are basically as follows (the intuition is based on European options, but
the results extend to American options as well). First, a call option with a zero strike price
($K = 0$) would be the same as owning a prepaid forward contract (which is worth as
much as or less than the underlying asset). Whenever the strike price is higher, the call price
will be lower. Second, the call option gives rights, not obligations: its price cannot be
negative. Third, the lowest possible value of a put option is zero, so the put-call parity
(14.11) immediately gives that the call price must exceed the present value of $F - K$. See
Figures 14.15 and 14.16 for illustrations.
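As a rough numerical illustration of (14.17)–(14.19), the following Python sketch computes the bounds on the call price. It assumes an asset without dividends, so the forward price is $F = e^{my}S$. The function name `call_price_bounds` and the strike price of 38 are hypothetical choices made for this illustration; S, m, and y follow the figure legend (42, 0.5, 0.05).

```python
import math

def call_price_bounds(S, K, m, y):
    """Return (lower, upper) no-arbitrage bounds on a call price, see (14.17)-(14.19).

    Assumption: no dividends, so the forward price is F = exp(m*y)*S.
    """
    F = math.exp(m * y) * S                         # forward price (assumed parity)
    upper = math.exp(-m * y) * F                    # PV(F), which equals S here
    lower = max(0.0, math.exp(-m * y) * (F - K))    # the tighter of (14.18) and (14.19)
    return lower, upper

lower, upper = call_price_bounds(S=42, K=38, m=0.5, y=0.05)
print(f"{lower:.3f} <= C <= {upper:.3f}")
```

The lower bound switches from PV(F−K) to zero once the strike price exceeds the forward price.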
Example 14.19 (Pricing bounds for call option) Using the same parameters as in Exam-
[Figure: SMI call option prices, 2009-07-14, plotted against the strike price, together with the bounds C ≤ PV(F), 0 ≤ C, and PV(F−K) ≤ C.]
4 C 42:
Suppose we have American or European call options with different strike prices, $K_1 < K_2$.
We then have the following price relations
$$C(K_2) \le C(K_1) \tag{14.20}$$
$$C(K_1) - C(K_2) \le K_2 - K_1 \tag{14.21}$$
$$C[\lambda K_1 + (1-\lambda)K_2] \le \lambda C(K_1) + (1-\lambda)C(K_2) \text{ for } 0 \le \lambda \le 1. \tag{14.22}$$
The first relation says that the call option price is decreasing in the strike price. The
intuition is that a higher strike price means that an owner of a call option will have to
pay more in case of exercise, and there is also a lower chance of exercise. The second
relation says that the change in the call price is smaller than the change in the strike price.
The third relation says that the call price is a convex function of the strike price. If these
relations do not hold, then there are arbitrage opportunities (see the proofs below).
In other words, these three conditions say that we have the following partial derivatives
(if they exist) of the call option price function
$$\partial C/\partial K \le 0, \quad \partial C/\partial K \ge -1, \quad \text{and} \quad \partial^2 C/\partial K^2 \ge 0.$$
This means that the call option price is decreasing in the strike price, but more slowly than
the strike price itself increases, and that the curve flattens out at high strike prices.
See Figure 14.8 for an illustration.
Proof. (of (14.20)) If (14.20) was not true, so $C(K_2) > C(K_1)$, then a bull spread (buy
$C(K_1)$ and sell $C(K_2)$) would have a negative price ($C(K_1) - C(K_2) < 0$). However,
the payoff of a bull spread is
$$\max(0, S - K_1) - \max(0, S - K_2) = \begin{cases} 0 & \text{if } S \le K_1 \\ S - K_1 & \text{if } K_1 < S \le K_2 \\ K_2 - K_1 & \text{if } K_2 < S. \end{cases}$$
This would give a non-negative payoff for a negative asset price, which creates arbitrage
opportunities.
Proof. (of (14.21)) If (14.21) was not true, so $C(K_1) - C(K_2) > K_2 - K_1$, then we
can sell a bull spread (sell $C(K_1)$ and buy $C(K_2)$) and invest the proceeds in a T-bill (zero
investment). The payoff at expiration ($m$ periods later) is then
$$\underbrace{[C(K_1) - C(K_2)]e^{rm}}_{> K_2 - K_1} + \max(0, S - K_2) - \max(0, S - K_1) = [C(K_1) - C(K_2)]e^{rm} + \begin{cases} 0 & \text{if } S \le K_1 \\ -(S - K_1) & \text{if } K_1 < S \le K_2 \\ -(K_2 - K_1) & \text{if } K_2 < S. \end{cases}$$
In either case, there is a positive profit (recall that the initial investment is zero), which
creates arbitrage opportunities.
Proof. (of (14.22)) Let $\bar{K} = \lambda K_1 + (1-\lambda)K_2$. If (14.22) was not true, so
$C(\bar{K}) > \lambda C(K_1) + (1-\lambda)C(K_2)$, then we can sell $C(\bar{K})$ and buy
$\lambda C(K_1) + (1-\lambda)C(K_2)$ (which requires no net investment). The payoff at
expiration ($m$ periods later) is then
$$\lambda\max(0, S - K_1) - \max(0, S - \bar{K}) + (1-\lambda)\max(0, S - K_2) = \begin{cases} 0 & \text{if } S \le K_1 \\ \lambda(S - K_1) & \text{if } K_1 < S \le \bar{K} \\ \lambda(S - K_1) - (S - \bar{K}) = (1-\lambda)(K_2 - S) & \text{if } \bar{K} < S \le K_2 \\ \lambda(S - K_1) - (S - \bar{K}) + (1-\lambda)(S - K_2) = 0 & \text{if } K_2 < S, \end{cases}$$
where the second column uses the definition of $\bar{K}$. All payoffs are non-negative, and some
are positive. Since no initial investment is needed, this creates arbitrage opportunities.
Bibliography
Cochrane, J. H., 2001, Asset pricing, Princeton University Press, Princeton, New Jersey.
Hull, J. C., 2006, Options, futures, and other derivatives, Prentice-Hall, Upper Saddle
River, NJ, 6th edn.
15 The Binomial Option Pricing Model
Main references: Elton, Gruber, Brown, and Goetzmann (2010) 23 and Hull (2006) 11
and 17
Additional references: McDonald (2006) 9–12; Cochrane (2001) 17–18
There are basically two ways to model option prices: by some sort of factor model (like
CAPM) or by a no-arbitrage argument. The latter is clearly much more precise, so it
is typically preferred—when it works. These notes focus on a particularly simple case:
when the underlying asset follows a binomial process.
The binomial model (where the change of the price of the underlying asset can only take
two values) is very stylized, but it is useful for establishing the key ideas of option pricing.
It can also be transformed into a realistic model by cumulating many (short) subperiods.
In the limit (as the subperiods become very many/very short) it converges to the
well-known Black-Scholes model.
The binomial tree for the underlying asset starts at the price $S$ and has probability $q$ of
moving to $Su$ in the next period and a probability of $1 - q$ of moving to $Sd$. This is
illustrated in Figure 15.1. These probabilities are the true (“natural”) probabilities. If we
denote the price today by $S_t$ and in the next period by $S_{t+h}$, then we have
$$S_{t+h}/S_t = \begin{cases} u & \text{with probability } q \\ d & \text{with probability } 1 - q. \end{cases} \tag{15.1}$$
Remark 15.1 (Mean and variance of a binomial process) The mean of a (shifted)
binomial process like (15.1) is $qu + (1-q)d$ and the variance is $q(1-q)(u-d)^2$.
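[Figure 15.1: Binomial tree: S moves to Su with probability q and to Sd with probability 1 − q.]
[Figure 15.2: Binomial tree of the example: 10 moves to 11 with probability 0.6 and to 9.5 with probability 0.4.]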
Example 15.2 (Binomial process) Suppose $S = 10$, $u = 1.1$, $d = 0.95$, and $q = 0.6$.
Then, the process has a 60% probability of increasing from 10 to 11 and a 40% probability
of decreasing to 9.5. See Figure 15.2. This gives an expected relative change of
$0.6 \times 1.1 + 0.4 \times 0.95 = 1.04$ and a variance of the relative change of
$0.6 \times 0.4 \times (1.1 - 0.95)^2 = 0.0054$.
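The numbers in Example 15.2 can be reproduced with a few lines of Python. This is only a sketch of the formulas in Remark 15.1; the function name `binomial_moments` is a hypothetical choice.

```python
def binomial_moments(u, d, q):
    """Mean and variance of the relative price change S_{t+h}/S_t in (15.1)."""
    mean = q * u + (1 - q) * d
    variance = q * (1 - q) * (u - d) ** 2
    return mean, variance

# Parameters from Example 15.2
mean, variance = binomial_moments(u=1.1, d=0.95, q=0.6)
print(round(mean, 4), round(variance, 4))
```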
expiration, then
Alternatively, it could be a forward contract (and the next time period is the time of
expiration), so
$$f_u = Su - F \text{ and } f_d = Sd - F. \tag{15.3}$$
We next use a no-arbitrage argument to derive what today's price of the derivative
(denoted $f$) must be. In doing so, we take it for granted that
$$u > e^{yh} > d. \tag{15.4}$$
If this condition is not satisfied, then there are trivial arbitrage opportunities. For
instance, if $e^{yh} > u$, then we could short the stock and buy bonds: this would guarantee a
positive payoff for a zero investment (an arbitrage possibility).
Example 15.3 (European call option) With the parameters in Example 15.2, equation
(15.2) shows that a European call option with strike price of 10 has
same in both cases
$$\Delta Su - f_u = \Delta Sd - f_d, \text{ so}$$
$$\Delta = \frac{f_u - f_d}{S(u - d)}. \tag{15.6}$$
With this choice of $\Delta$ (also called the “delta hedge”) the portfolio is riskfree and must
therefore have the same return as the riskfree rate.
Step 2: Make the Return of the Portfolio Equal to the Riskfree Rate
The present value of our riskfree portfolio is $e^{-yh}(\Delta Su - f_u)$, where $y$ is the interest
rate per year and $h$ is the length of the time interval. (The present value is also equal
to $e^{-yh}(\Delta Sd - f_d)$, but that is the same, as discussed before.) Since the portfolio is
riskfree, this present value must be equal to the cost of the portfolio, $\Delta S - f$,
$$e^{-yh}(\Delta Su - f_u) = \Delta S - f. \tag{15.7}$$
Solve for the price of the derivative, $f$, and use the value of $\Delta$ from (15.6) that ensures
that the portfolio is riskfree
$$f = \Delta S\left(1 - e^{-yh}u\right) + e^{-yh}f_u \tag{15.8}$$
$$= \frac{f_u - f_d}{u - d}\left(1 - e^{-yh}u\right) + e^{-yh}f_u \tag{15.9}$$
$$= e^{-yh}\left[pf_u + (1-p)f_d\right] \text{ with } p = \frac{e^{yh} - d}{u - d}. \tag{15.10}$$
Equation (15.9) shows what the price of the derivative must be—and is written in terms
of the possible outcomes and the interest rate. Notice that neither probabilities (of the
different outcomes), nor risk preferences enter this expression—since we have used a no-
arbitrage argument to price this derivative. This works (that is, we can construct a riskfree
portfolio) because we have as many (relevant) assets (riskfree and underlying risky asset)
as there are possible outcomes (up or down).
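The equivalence of the hedging formula (15.8) and the risk neutral formula (15.10) can be checked numerically. The sketch below is a minimal illustration, not a general pricing routine; the function name `one_step_price` is hypothetical, and the call option (strike price 10, $y = 0$, one period) on the asset of Example 15.2 is used as input.

```python
import math

def one_step_price(S, u, d, y, h, f_u, f_d):
    """Price a derivative in a one-step binomial tree in two equivalent ways."""
    delta = (f_u - f_d) / (S * (u - d))                 # hedge ratio, (15.6)
    disc = math.exp(-y * h)                             # discount factor
    f_hedge = delta * S * (1 - disc * u) + disc * f_u   # (15.8)
    p = (math.exp(y * h) - d) / (u - d)                 # risk neutral "probability"
    f_rn = disc * (p * f_u + (1 - p) * f_d)             # (15.10)
    return delta, p, f_hedge, f_rn

# Call option with strike price 10 on the asset in Example 15.2, assuming y = 0
S, u, d, K = 10.0, 1.1, 0.95, 10.0
f_u, f_d = max(0.0, S * u - K), max(0.0, S * d - K)
delta, p, f_hedge, f_rn = one_step_price(S, u, d, y=0.0, h=1.0, f_u=f_u, f_d=f_d)
print(delta, p, f_hedge, f_rn)
```

Both prices coincide, and neither uses the natural probability $q$.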
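[Figure: One-step binomial tree for the underlying asset and the derivative: (S, f) moves to (Su, f_u) with probability p and to (Sd, f_d) with probability 1 − p.]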
Equations (15.9) and (15.10) are alternative ways to write the price of the derivative.
The latter shows that the current price of the derivative is the discount factor ($e^{-yh}$)
times what looks like an expectation of the payoff of the derivative. This expression is
quite useful since we can think of $p$ as a “risk neutral probability”, although it is not a
probability in the usual sense: it is just a convenient construction. Notice that $p$ does not
depend on which derivative asset (with the same underlying asset) we consider. Under
the restrictions in (15.4), $0 < p < 1$, as any “probability” should be.
Example 15.5 (European call option) Continuing Example 15.3 and assuming that $y = 0$,
equation (15.10) gives the price of a call option with strike price 10 as
$$f = e^{0}\left[p \times 1 + (1-p) \times 0\right] \text{ with } p = \frac{1 - 0.95}{1.1 - 0.95} = 1/3$$
$$= 1/3.$$
$$f = e^{0}\left[(1/3) \times 2 + (2/3) \times (1/2)\right] = 1.$$
$$S = e^{-yh}\left[pSu + (1-p)Sd\right]. \tag{15.11}$$
This looks (again) like a discounted expected future payoff.
Example 15.6 (The underlying asset itself) Continuing Example 15.5, equation (15.11)
gives
$$S = e^{0}\left[(1/3) \times 11 + (2/3) \times 9.5\right] = 10.$$
A forward contract has a zero current price (nothing is paid until expiry), and the
payoff at expiry is $f_u = Su - F$ in the up state (the value of the underlying asset minus
the forward price) and $f_d = Sd - F$ in the down state. Using this in (15.10) gives
$$0 = e^{-yh}\left[p(Su - F) + (1-p)(Sd - F)\right], \text{ so} \tag{15.12}$$
$$F = pSu + (1-p)Sd. \tag{15.13}$$
This shows that the mean of the risk neutral distribution equals the forward price.
Combining (15.11) and (15.13) clearly gives the spot-forward parity, $F = e^{yh}S$.
A riskfree asset can also be priced by this method. The only way an asset can be
riskfree in this setting is if $f_u = f_d$. We then get a zero hedge ratio ($\Delta = 0$) and (15.10) gives
$$f = e^{-yh}f_u, \tag{15.14}$$
$$f = e^{-yh}p. \tag{15.15}$$
The no-arbitrage argument in (15.6) was based on the fact that a portfolio of $\Delta$ of the
underlying asset and of $-1$ of the derivative replicated a bond.
This argument can be turned around to replicate the derivative by holding the following
portfolio
The payoff of this portfolio in the up state is $\Delta Su - (\Delta Su - f_u) = f_u$ and in the down
state it is $\Delta Sd - (\Delta Sd - f_d) = f_d$ (since $\Delta Su - f_u = \Delta Sd - f_d$). This replicates the
derivative's payoff. We can therefore hedge a short position in the derivative by holding
$\Delta$ of the underlying asset (“delta hedging”).
We have used a no-arbitrage method to price the derivative. It works since the derivative
is a redundant asset: it can be replicated by a portfolio of the underlying asset and a
riskfree asset—and therefore must have the same price as this portfolio. This does not
mean, however, that the option is in itself riskfree. In fact, options are typically very risky
and therefore carry large risk premia. It may seem as if the pricing formula (15.10) is
free from the preference parameters that would determine the risk premium. Not correct.
The pricing formula contains the current asset price (through fu and fd ) which is indeed
affected by preference parameters.
The easiest way to see this is perhaps to recall that we can replicate the derivative by
holding a portfolio of the underlying asset and bills, see (15.16). Clearly, this portfolio
will incorporate a risk premium, and so must the derivative.
The only case without a risk premium is when the derivative payoffs are unrelated to
the asset price—so the derivative is actually a safe asset as in (15.14).
Combine (15.13) and (15.17) to get
If the underlying asset has no risk premium, then the forward price equals the expected
future price (the left hand side is zero), so p must equal q. This motivates the name of p as
a “risk neutral probability”: it would be the probability (that is compatible with observed
price) if investors were risk neutral.
When there is a positive risk premium, then we know that $e^{yh}S_t = F < E_t S_{t+h}$. This
means that the expected capital gain is larger than motivated by the riskfree rate alone,
to compensate for the risk. Then, (15.18) shows that $p < q$ (since $u > d$), that is, the
risk neutral probability of the up state is lower than the true (natural) probability. One
interpretation is that a risk neutral investor would be happy with a lower probability of the
up state (and thus a lower expected return) than a risk averse investor.
Example 15.7 (Natural versus risk neutral probability) With the parameters in Example
15.2, equation (15.17) gives
In this case, there is a positive risk premium and p < q (1/3 and 0.6 respectively).
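The comparison of the two probabilities can be made concrete with the numbers from Example 15.2. The sketch below computes the forward price from (15.13) and the expected future price under the natural probability $q$; the zero interest rate and the one-period horizon are assumptions carried over from Example 15.5, and the variable names are hypothetical.

```python
import math

S, u, d, q = 10.0, 1.1, 0.95, 0.6   # Example 15.2
y, h = 0.0, 1.0                     # assumption: zero interest rate, one period

p = (math.exp(y * h) - d) / (u - d)        # risk neutral probability, see (15.10)
forward = p * S * u + (1 - p) * S * d      # forward price, (15.13)
expected = q * S * u + (1 - q) * S * d     # expected future price under q
risk_premium = expected - forward          # positive when q > p

print(round(p, 4), round(forward, 4), round(expected, 4), round(risk_premium, 4))
```

With these inputs the expected future price exceeds the forward price, so the underlying asset carries a positive risk premium and $p < q$.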
We now discuss how to construct a binomial tree with many small time steps—so that it
mimics the behaviour of the asset price process.
The binomial distribution converges to a normal distribution as we chop up a given
time to expiration into smaller and smaller time steps—and the normal distribution is
fully described by the mean and variance. It is therefore common practice to construct the
binomial tree to match the mean and variance of the underlying series.
Suppose the price of the underlying asset has a (continuously compounded) drift of
$\mu$ and a variance of $\sigma^2$ per period (most often a year). This means that for a horizon of
length $h$, we have
$$E(\ln S_{t+h} - \ln S_t) = \mu h, \tag{15.19}$$
$$Var(\ln S_{t+h} - \ln S_t) = \sigma^2 h. \tag{15.20}$$
(Notice that (15.1) says that $S_{t+h}/S_t = u$ with probability $q$. Just take logs to get the
results here.) The binomial process implies that the mean and variance of the asset price
change are therefore (see Remark 15.1)
$$E(\ln S_{t+h} - \ln S_t) = q\ln u + (1-q)\ln d, \tag{15.21}$$
$$Var(\ln S_{t+h} - \ln S_t) = q(1-q)(\ln u - \ln d)^2. \tag{15.22}$$
There are three parameters ($u$, $d$, and $q$) which can be chosen to match the two moments
(mean and variance) in (15.19), so we can make one arbitrary choice. The following is a
common approach.
First, for any $u$ and $d$ (not yet decided), pick $q$ to match the mean drift over a time
step of size $h$ (which is $\mu h$, see (15.19)), that is,
$$q\ln u + (1-q)\ln d = \mu h, \text{ so} \tag{15.23}$$
$$q = \frac{\mu h - \ln d}{\ln u - \ln d}. \tag{15.24}$$
Second, pick $u$ and $d$ to match the variance over a time step of size $h$ (which is $\sigma^2 h$,
see (15.19)), that is,
$$q(1-q)(\ln u - \ln d)^2 = \sigma^2 h. \tag{15.25}$$
There are several ways to proceed from here, but the most common is the approach of Cox,
Ross, and Rubinstein (1979) where
$$u = e^{\sigma\sqrt{h}} \text{ and } d = e^{-\sigma\sqrt{h}}. \tag{15.27}$$
This clearly does not fit the volatility exactly (compare with the right hand side of (15.26)),
but the approximation improves quickly as $h$ decreases (the second order term $h^2$ vanishes
fast). There are other ways to construct the binomial tree, but they have similar properties.
Notice that once we have the values of u and d , the pricing of derivatives does not use
the natural probability of the up state (q).
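A short numerical check may help to see how well the CRR choice fits the two moments. The sketch below picks $q$ from (15.24) and $u$, $d$ from (15.27), and then evaluates (15.21)–(15.22). The drift and volatility values and the function name `crr_parameters` are hypothetical. Working through the algebra with these choices gives a variance of $\sigma^2 h - \mu^2 h^2$, so the mean is matched exactly and the variance only up to a term of order $h^2$.

```python
import math

def crr_parameters(mu, sigma, h):
    """CRR up/down factors (15.27) and the natural probability q from (15.24)."""
    u = math.exp(sigma * math.sqrt(h))
    d = math.exp(-sigma * math.sqrt(h))
    q = (mu * h - math.log(d)) / (math.log(u) - math.log(d))
    return u, d, q

mu, sigma = 0.08, 0.2   # hypothetical annual drift and volatility
for h in (1 / 4, 1 / 52, 1 / 252):
    u, d, q = crr_parameters(mu, sigma, h)
    mean = q * math.log(u) + (1 - q) * math.log(d)          # (15.21)
    var = q * (1 - q) * (math.log(u) - math.log(d)) ** 2    # (15.22)
    # annualized mean, annualized variance, and the gap to sigma^2
    print(h, mean / h, var / h, sigma**2 - var / h)
```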
The binomial model is very useful for numerical calculations of the implied option price.
In such numerical applications, the time to expiry is divided into many small time steps,
and it is assumed that the price of the underlying asset can make an up or down movement
in each subinterval—and that the no-arbitrage portfolio is rebalanced every time step. Of
course, the size of the up and down movements (u and d in the previous analysis), as well
as the discounting, is scaled by the number of subintervals.
Let $m$ be the time to expiration of the derivative. With $n$ short time intervals, the
length of each interval is $h = m/n$. Perhaps the most common way to construct the tree
is that of Cox, Ross, and Rubinstein (1979). In short, it implies (from using (15.10) and
(15.27))
$$u = e^{\sigma\sqrt{h}}, \; d = e^{-\sigma\sqrt{h}}, \; p = (e^{yh} - d)/(u - d), \text{ and discounting by } e^{-yh}. \tag{15.29}$$
Figure 15.4: Two different time steps with same time to expiration m
[Figure 15.5: Binomial tree with two subintervals: S moves to Su or Sd, and then to Suu, Sud = Sdu, or Sdd.]
Notice that we must keep $h$ small enough so (15.4) holds (to rule out arbitrage opportunities),
that is,
$$e^{\sigma\sqrt{h}} > e^{yh} > e^{-\sigma\sqrt{h}}, \tag{15.30}$$
which requires $\sqrt{h} < \sigma/y$.
Figure 15.5 is an illustration of a binomial tree with two subintervals. This tree has
only three final nodes, since Sud D Sdu—it is “recombining,” which is very useful to
keep the number of nodes manageable (when we have many time steps). The correspond-
ing prices of the derivative are illustrated in Figure 15.6.
[Figure 15.6: Binomial tree for the derivative with two subintervals: f moves to f_u or f_d, and then to f_uu, f_ud = f_du, or f_dd.]
Example 15.8 (A European call option) For a European call option with strike price $K$
and three months (0.25 years) to expiration, the nodes for two steps ($n = 2$, so the length
of each time interval is $0.25/2 = 1/8$) in Figure 15.6 are
$$f = e^{-y/8}\left[pf_u + (1-p)f_d\right],$$
$$\begin{bmatrix} f_u = e^{-y/8}\left[pf_{uu} + (1-p)f_{ud}\right] \\ f_d = e^{-y/8}\left[pf_{du} + (1-p)f_{dd}\right] \end{bmatrix}, \text{ and } \begin{bmatrix} f_{uu} = \max(Suu - K, 0) \\ f_{ud} = \max(Sud - K, 0) \\ f_{dd} = \max(Sdd - K, 0) \end{bmatrix}$$
where $p = (e^{y/8} - d)/(u - d)$. Notice that the calculation begins at the end (right) and
works backwards towards the start of the tree (left).
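The backward calculation in Example 15.8 generalizes directly to $n$ steps. The following Python sketch prices a European call option in a CRR tree as in (15.29); it is a minimal illustration rather than an optimized routine, the function name `crr_european_call` is hypothetical, and the numerical inputs (S and K of 42, y of 0.05, σ of 0.2) are borrowed from the figure legends in these notes.

```python
import math

def crr_european_call(S, K, m, y, sigma, n):
    """European call price from an n-step CRR tree, by backward induction."""
    h = m / n
    u = math.exp(sigma * math.sqrt(h))
    d = 1 / u
    p = (math.exp(y * h) - d) / (u - d)    # risk neutral probability
    disc = math.exp(-y * h)                # one-step discount factor
    # Payoffs at expiration: node j has j up moves and n - j down moves
    f = [max(S * u**j * d**(n - j) - K, 0.0) for j in range(n + 1)]
    # Work backwards from the end of the tree to the start
    for step in range(n, 0, -1):
        f = [disc * (p * f[j + 1] + (1 - p) * f[j]) for j in range(step)]
    return f[0]

# Two steps and three months to expiration, as in Example 15.8
print(crr_european_call(S=42, K=42, m=0.25, y=0.05, sigma=0.2, n=2))
```

Since the tree is recombining, each step only needs a list with one entry per node, so the work grows with $n^2$ rather than $2^n$.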
The binomial tree we have used so far assumes that the derivative is “alive” until the end
of the period. This is not necessarily the case for American options, so the approach needs
to be modified to handle the possibility of early exercise.
The option value is then the maximum of the exercise value and the value if keeping
the option “alive.” The latter is defined in the same way as in (15.10). Together this gives
the price of the derivative as
where $p$ is defined as before (in (15.10)). For instance, for an American put option we
have
$$f = \max\left(K - S, \; e^{-yh}\left[pf_u + (1-p)f_d\right]\right), \text{ where} \tag{15.32}$$
$$f_u = \max(K - Su, 0) \text{ and } f_d = \max(K - Sd, 0). \tag{15.33}$$
Example 15.9 (An American put option) For an American put option with strike price
$K$ and six months (0.5 years) to expiration, and two steps ($n = 2$, so the length
of each time interval is $0.5/2 = 1/4$) as in Figure 15.6, we must account for the
possibility of an early exercise. At each node, the option value is the maximum of the
value if exercised ($K$ minus the asset price) and the value if kept “alive” (denoted $f^a$
below). The latter is the discounted risk-neutral expected value of the option value next
period, just like for a European option. We therefore have
where $p = (e^{y/4} - d)/(u - d)$. As always, the calculation begins at the end and works
backwards through the tree.
Figure 15.7 illustrates the solution for an American put option on an asset without
dividends. Notice that the American put price exceeds the European put price—and more
so at low asset prices and high interest rates, that is, when it is likely that the option will
be exercised early.
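The early exercise rule can be sketched in Python as follows. The code compares the exercise value with the “alive” value at every node of a CRR tree, as in (15.32); setting `american=False` gives the European put for comparison. The function name `crr_put` is hypothetical, and the parameter values are the ones listed for Figure 15.7 (K, m, y, and σ of 42, 0.5, 0.05, 0.2, with 100 time intervals).

```python
import math

def crr_put(S, K, m, y, sigma, n, american=True):
    """Put price from an n-step CRR tree; allows early exercise if american=True."""
    h = m / n
    u = math.exp(sigma * math.sqrt(h))
    d = 1 / u
    p = (math.exp(y * h) - d) / (u - d)
    disc = math.exp(-y * h)
    # Payoffs at expiration: node j has j up moves and n - j down moves
    f = [max(K - S * u**j * d**(n - j), 0.0) for j in range(n + 1)]
    for step in range(n - 1, -1, -1):
        for j in range(step + 1):
            alive = disc * (p * f[j + 1] + (1 - p) * f[j])   # value if kept "alive"
            if american:
                exercise = K - S * u**j * d**(step - j)      # value if exercised now
                f[j] = max(exercise, alive)
            else:
                f[j] = alive
    return f[0]

K, m, y, sigma, n = 42, 0.5, 0.05, 0.2, 100
for S in (36, 40, 44):
    pa = crr_put(S, K, m, y, sigma, n, american=True)
    pe = crr_put(S, K, m, y, sigma, n, american=False)
    print(S, round(pa, 3), round(pe, 3), round(pa - pe, 3))
```

The last column is the early exercise premium, which should be larger at low asset prices.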
[Figure 15.7: American put price, BOPM, plotted against the stock price, together with C − S + K and C − S + e^{−ym}K = P_E. K, m, y, and σ are 42, 0.5, 0.05, 0.2; the binomial solution uses 100 time intervals. A second panel uses y = 0.15.]
Figure 15.8 illustrates the calculations of the American put price for one current value
of the underlying asset. The shaded areas show the location of the nodes (future prices
of the underlying asset) that are used in the calculation, and at which nodes early
exercise will happen.
[Figure 15.8: Two panels plotting the spot price against the time interval (0 to 0.5). Left panel (American put, BOPM nodes): shaded area shows nodes used in calculation. Right panel (American put, early exercise): shaded area shows nodes with early exercise.]
$\Delta Se^{\delta h}d - f_d$ in the “down” state. The $e^{\delta h}$ factor comes from reinvestment. To make the
portfolio riskfree the delta must be
$$\Delta = \frac{f_u - f_d}{Se^{\delta h}(u - d)}. \tag{15.34}$$
Second, to make the return of the portfolio equal to the riskfree rate, we set the present
value of our riskfree portfolio equal to the cost of the portfolio
$$e^{-yh}\left(\Delta Se^{\delta h}u - f_u\right) = \Delta S - f. \tag{15.35}$$
chosen as before, for instance, as in (15.27).
Remark 15.10 (Risk neutral drift with continuous dividends) With continuous dividends,
the risk-neutral expected value is $E^p_t S_{t+h}/S_t = e^{(y-\delta)h}$, so the drift is $(y - \delta)h$ over the
short time interval $h$.
Bibliography
Cochrane, J. H., 2001, Asset pricing, Princeton University Press, Princeton, New Jersey.
Cox, J. C., S. A. Ross, and M. Rubinstein, 1979, “Option pricing: a simplified approach,”
Journal of Financial Economics, 7, 229–263.
Hull, J. C., 2006, Options, futures, and other derivatives, Prentice-Hall, Upper Saddle
River, NJ, 6th edn.
16 The Black-Scholes Model and the Distribution of Asset Prices
Main references: Elton, Gruber, Brown, and Goetzmann (2010) 23 and Hull (2006) 13–15
Additional references: McDonald (2006) 9–13; Cochrane (2001) 17–18; Cox, Ross, and
Rubinstein (1979)
Assume that the change over a short interval (between t and t C h) in the log asset price
is an iid process
This implies that, based on the information in period 0, the logarithm of the stock price in
period m, Sm , is normally distributed
where $S$ is the current asset price (the subscript is dropped to reduce clutter). For instance,
if there is no volatility ($\sigma^2 = 0$), then $S_m = e^{\mu m}S$. If we take the proper limit as the
time interval $h$ goes towards zero, then we have a Brownian motion for the log asset price
($d\ln S_t = \mu\,dt + \sigma\,dW_t$, where $dW_t$ are the increments to a Wiener process).
A hedging/no arbitrage argument similar to the binomial model then leads to the
Black-Scholes formula (for an asset without dividends) where the European call option
price is
$$C = S\Phi(d_1) - e^{-ym}K\Phi(d_2), \text{ where} \tag{16.3}$$
$$d_1 = \frac{\ln(S/K) + (y + \sigma^2/2)m}{\sigma\sqrt{m}} \text{ and } d_2 = d_1 - \sigma\sqrt{m}. \tag{16.4}$$
[Figure: Black-Scholes call option price in five panels (BS call, S varies; BS call, K varies; BS call, σ varies; expiration varies; interest rate varies). S, K, m, y, and σ: 42, 42, 0.5, 0.05, 0.2.]
In this formula, $\Phi(d)$ denotes the probability of $x \le d$ when $x$ has an $N(0,1)$ distribution
(that is, the distribution function value at $d$).
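The formula (16.3)–(16.4) is easy to code. The sketch below uses only the Python standard library, with the distribution function $\Phi$ computed from the error function; the function names `norm_cdf` and `bs_call` are hypothetical, and the inputs are the baseline values from the figure legend (S, K, m, y, and σ of 42, 42, 0.5, 0.05, 0.2).

```python
import math

def norm_cdf(x):
    """Standard normal distribution function, Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, y, m, sigma):
    """Black-Scholes price of a European call (no dividends), see (16.3)-(16.4)."""
    d1 = (math.log(S / K) + (y + sigma**2 / 2) * m) / (sigma * math.sqrt(m))
    d2 = d1 - sigma * math.sqrt(m)
    return S * norm_cdf(d1) - math.exp(-y * m) * K * norm_cdf(d2)

print(bs_call(S=42, K=42, y=0.05, m=0.5, sigma=0.2))
```

The argument order (S, K, y, m, σ) follows the one used in the practical hints of these notes.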
price), worth $e^{-ym}F$, play the role of the underlying asset in (16.1). This gives the BS
formula (16.3)–(16.4) but with $e^{-ym}F$ substituted for $S$
$$C = e^{-ym}F\Phi(d_1) - e^{-ym}K\Phi(d_2), \text{ where} \tag{16.5}$$
$$d_1 = \frac{\ln(F/K) + (\sigma^2/2)m}{\sigma\sqrt{m}} \text{ and } d_2 = d_1 - \sigma\sqrt{m}. \tag{16.6}$$
This is Black's model which has many applications.
Remark 16.1 (Approximation of option price) A Taylor approximation gives that the call
option price close to $F = K$ and $\sigma = 0$ is $C \approx e^{-ym}F\sigma\sqrt{m/(2\pi)}$.
Remark 16.2 (Practical hint: code for Black's model with a forward price) Suppose you
have a computer code for the BS model (16.3)–(16.4) which takes the inputs $(S, K, y, m, \sigma)$.
To use that code for Black's model (16.5)–(16.6), substitute $(F, 0)$ for $(S, y)$ and multiply
the results by $e^{-ym}$.
For instance, for an asset with a continuous dividend rate of $\delta$, the forward-spot parity
says $F = Se^{(y-\delta)m}$. In this case (16.5)–(16.6) can also be written
$$C = e^{-\delta m}S\Phi(d_1) - e^{-ym}K\Phi(d_2), \text{ where} \tag{16.7}$$
$$d_1 = \frac{\ln(S/K) + (y - \delta + \sigma^2/2)m}{\sigma\sqrt{m}} \text{ and } d_2 = d_1 - \sigma\sqrt{m}. \tag{16.8}$$
When the asset is a currency (read: foreign money market account) and $\delta$ is the foreign
interest rate, then this is the “Garman-Kohlhagen” formula.
Remark 16.4 (Practical hint: code for BS model with continuous dividends) Suppose
you have a computer code for the BS model (16.3)–(16.4) which takes the inputs $(S, K, y, m, \sigma)$.
To use that code for the model with continuous dividends (16.7)–(16.8), substitute $e^{-\delta m}S$ for $S$.
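The two practical hints (reusing a BS code for Black's model and for continuous dividends) can be sketched as thin wrappers around one BS function. All function names below are hypothetical, and the dividend rate of 0.03 is an assumed value used only for illustration. With the forward price taken from the forward-spot parity, the two wrappers should give the same price.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function, Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, y, m, sigma):
    """Black-Scholes call price without dividends, (16.3)-(16.4)."""
    d1 = (math.log(S / K) + (y + sigma**2 / 2) * m) / (sigma * math.sqrt(m))
    d2 = d1 - sigma * math.sqrt(m)
    return S * norm_cdf(d1) - math.exp(-y * m) * K * norm_cdf(d2)

def black_call(F, K, y, m, sigma):
    """Black's model (16.5)-(16.6): use (F, 0) for (S, y), then discount."""
    return math.exp(-y * m) * bs_call(F, K, 0.0, m, sigma)

def bs_call_dividends(S, K, y, m, sigma, div):
    """Continuous dividends (16.7)-(16.8): use exp(-div*m)*S for S."""
    return bs_call(math.exp(-div * m) * S, K, y, m, sigma)

S, K, y, m, sigma, div = 42, 42, 0.05, 0.5, 0.2, 0.03   # div is an assumed value
F = S * math.exp((y - div) * m)                          # forward-spot parity
print(black_call(F, K, y, m, sigma), bs_call_dividends(S, K, y, m, sigma, div))
```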
Remark 16.5 (Practical hint: finding the dividend rate) If you don't know what the
dividend rate is, use the forward-spot parity, $F = Se^{(y-\delta)m}$, to calculate it as
$\delta = y - \ln(F/S)/m$.
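Remark 16.6 (The “greeks”) The derivatives of the Black-Scholes formula for an asset with
continuous dividends (16.7)–(16.8) are
$$\Delta = \frac{\partial C}{\partial S} = e^{-\delta m}\Phi(d_1)$$
$$\Gamma = \frac{\partial^2 C}{\partial S^2} = \frac{e^{-\delta m}\phi(d_1)}{S\sigma\sqrt{m}}$$
$$\Theta = \frac{\partial C}{\partial t} = -\frac{\partial C}{\partial m} = \delta Se^{-\delta m}\Phi(d_1) - yKe^{-ym}\Phi(d_2) - \frac{1}{2\sqrt{m}}e^{-\delta m}S\phi(d_1)\sigma$$
$$\text{vega} = \frac{\partial C}{\partial \sigma} = Se^{-\delta m}\phi(d_1)\sqrt{m}$$
$$\rho = \frac{\partial C}{\partial y} = mKe^{-ym}\Phi(d_2),$$
where $\phi()$ is the standard normal probability density function (the derivative of $\Phi()$).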
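One way to gain confidence in expressions like these is to compare them with numerical derivatives of the pricing formula. The sketch below does this for delta, theta, vega, and rho using central differences; it is an illustration only. The function names, the step size `eps`, and the dividend rate of 0.03 (called `div` in the code to avoid a clash with the option delta) are assumptions.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def call_and_greeks(S, K, y, m, sigma, div):
    """Call price (16.7)-(16.8) and analytical greeks for a continuous dividend rate div."""
    sq = sigma * math.sqrt(m)
    d1 = (math.log(S / K) + (y - div + sigma**2 / 2) * m) / sq
    d2 = d1 - sq
    eS, eK = math.exp(-div * m) * S, math.exp(-y * m) * K
    price = eS * norm_cdf(d1) - eK * norm_cdf(d2)
    greeks = {
        "delta": math.exp(-div * m) * norm_cdf(d1),
        "gamma": math.exp(-div * m) * norm_pdf(d1) / (S * sq),
        "theta": div * eS * norm_cdf(d1) - y * eK * norm_cdf(d2)
                 - eS * norm_pdf(d1) * sigma / (2 * math.sqrt(m)),
        "vega": eS * norm_pdf(d1) * math.sqrt(m),
        "rho": m * eK * norm_cdf(d2),
    }
    return price, greeks

def price(S, K, y, m, sigma, div):
    return call_and_greeks(S, K, y, m, sigma, div)[0]

S, K, y, m, sigma, div = 42, 42, 0.05, 0.5, 0.2, 0.03   # div is an assumed value
_, greeks = call_and_greeks(S, K, y, m, sigma, div)
eps = 1e-4   # step for the central differences
numerical = {
    "delta": (price(S + eps, K, y, m, sigma, div) - price(S - eps, K, y, m, sigma, div)) / (2 * eps),
    "theta": -(price(S, K, y, m + eps, sigma, div) - price(S, K, y, m - eps, sigma, div)) / (2 * eps),
    "vega": (price(S, K, y, m, sigma + eps, div) - price(S, K, y, m, sigma - eps, div)) / (2 * eps),
    "rho": (price(S, K, y + eps, m, sigma, div) - price(S, K, y - eps, m, sigma, div)) / (2 * eps),
}
for name, value in numerical.items():
    print(name, greeks[name], value)
```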
The Black-Scholes formula contains only one unknown parameter: the variance $\sigma^2 m$ in
the distribution of $\ln S_m$ (see (16.2)). With data on the option price, spot and forward prices,
the interest rate, and the strike price, we can solve for the variance. The term $\sigma$ is often
called the implied volatility, and it is often used as an indicator of market uncertainty
about the future asset price, $S_m$. It can be thought of as an annualized (provided a period
is defined as a year) standard deviation. See Figure 16.2 for an example.
Note that we can solve for one implied volatility for each available strike price. If the
Black-Scholes formula is correct, that is, if the assumption in (16.1) is correct, then these
volatilities should be the same across strike prices. On currency markets, we often find
a volatility “smile” (volatility is a U-shaped function of the strike price). One possible
explanation is that the (perceived) distribution of the future asset price has relatively more
probability mass in the tails (“fat tails”) than a normal distribution has. On equity markets,
we often find a volatility “smirk” instead, where the volatility is very high for very low
strike prices. This is often interpreted as a sign that investors are willing to pay a lot for put
options that protect them from a dramatic fall in the stock price. One possible explanation
is thus that the distribution has more probability mass than a normal distribution at very
low stock prices (negative skewness). See Figure 16.3 for an example.
Figure 16.2: CBOE VIX, summary measure of implied volatilities (30 days) on US stock
markets, 1990–2010. Events marked in the figure: Kuwait, Stock market crash, LTCM/Russia, pharma, Higher rates, Lehman.
Remark 16.7 (Starting value for finding $\sigma$) From Remark 16.1 we get a starting guess
of $\sigma \approx C/[e^{-ym}F\sqrt{m/(2\pi)}]$. Alternatively, it is often recommended to use the starting
value $\sigma = \sqrt{|\ln(F/K)|\,2/m}$.
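A minimal sketch of how the implied volatility could be solved for numerically is given below. It uses bisection on Black's model (16.5)–(16.6), which is swapped in here as a simple and robust solver: it brackets the root instead of iterating from a starting value, so the guess from Remark 16.1 is only computed for comparison. The function names, the bracket `[lo, hi]`, and the tolerance are assumptions.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_call(F, K, y, m, sigma):
    """Black's model, (16.5)-(16.6)."""
    d1 = (math.log(F / K) + (sigma**2 / 2) * m) / (sigma * math.sqrt(m))
    d2 = d1 - sigma * math.sqrt(m)
    return math.exp(-y * m) * (F * norm_cdf(d1) - K * norm_cdf(d2))

def implied_vol(C, F, K, y, m, lo=1e-6, hi=5.0, tol=1e-10):
    """Solve black_call(sigma) = C by bisection (the price is increasing in sigma)."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if black_call(F, K, y, m, mid) > C:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2

# Generate a price with sigma = 0.2 for an option with K = F, then recover sigma
y, m = 0.05, 0.5
F = 42 * math.exp(y * m)
K = F
C = black_call(F, K, y, m, 0.2)
sigma_start = C / (math.exp(-y * m) * F * math.sqrt(m / (2 * math.pi)))   # guess from Remark 16.1
sigma_implied = implied_vol(C, F, K, y, m)
print(sigma_start, sigma_implied)
```

A Newton-type iteration that starts from the guess and uses vega would typically converge in fewer steps, but it needs safeguards when vega is close to zero.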
This section demonstrates that the option prices from the BOPM converge to the prices
from the Black-Scholes model. See Figure 16.5 for an illustration.
We know that the risk neutral pricing of a European call option is
$$C = e^{-ym}E^{*}\max(0, S_m - K), \tag{16.9}$$
[Figure: Implied volatility, SMI option, on 2009-07-14 and 2009-08-14, plotted against the strike price.]
For the Black-Scholes model, the normal distribution for the log asset price (16.2)
implies that the risk neutral distribution of $\ln S_m$ is
$$\ln S_m \sim N(\ln S + ym - \sigma^2 m/2, \; \sigma^2 m). \tag{16.10}$$
This gives
$$E^{*}S_m = Se^{ym} = F, \tag{16.11}$$
so the mean equals the forward price (the mean of a risk neutral distribution), while the
variance is the same as in the true (natural) distribution.
Proof. (that (16.11) has a mean equal to the forward price) Recall that $E[\exp(x)] =
\exp(\mu + s^2/2)$, if $x \sim N(\mu, s^2)$. Applying this to the distribution in (16.10) gives
$E^{*}[\exp(\ln S_m)] = \exp(\ln S + ym) = Se^{ym}$, which equals the forward price (by the forward-spot parity).
For the binomial option pricing model (BOPM) we have that, in the risk neutral
binomial tree, the movements of the log price of the underlying asset are
$$\ln S_{t+h}/S_t = \begin{cases} \ln u & \text{with probability } p \\ \ln d & \text{with probability } 1 - p. \end{cases} \tag{16.12}$$
(This follows since, in the risk neutral binomial tree, $S_{t+h} = S_t u$ with probability $p$ and
$S_t d$ otherwise.) The parameters $u$, $d$ and $p$ all depend on the time step length $h$ in such a way
that we match the mean and variance of the price series. In fact, they are chosen so that
the mean and variance of $\ln S_{t+h}/S_t$ are (at least in the limit) proportional to $h$. The risk
neutral distribution is clearly a binomial distribution.
[Figure: SMI implied volatility at the money (over time), and SMI, average iv − iv(atm), plotted against the distance of strike and forward price, %.]
I demonstrate the convergence in two steps: first, that the binomial distribution con-
verges to a normal distribution; and second that both distributions have the same mean
and variance in the limit.
If we can show that the risk neutral distribution implied by the binomial model converges
(as the number of time steps increase, keeping time to expiration constant) to a normal
distribution, then it is plausible that the Black-Scholes model can be thought of as the
[Figure 16.5: European call option price, binomial solution and Black-Scholes, plotted against n (no. steps in binomial model). S, K, m, y, and σ: 42, 42, 0.5, 0.05, 0.2.]
Proposition 16.8 If $u$, $d$ and $p$ in the binomial process (16.12) are such that the mean
and variance of $\ln S_{t+h}/S_t$ are proportional to $h$, then the distribution converges to a
normal distribution as the number of time steps $n$ increases, keeping the maturity $m$ constant
(so $h = m/n$).
Remark 16.9 (The Lindeberg-Lévy central limit theorem) If $x_i$ is independently and
identically distributed with $E\,x_i = \mu$ and $Var(x_i) = \sigma^2 < \infty$, then
$$\sqrt{n}\,\frac{1}{n}\sum_{i=1}^{n}(x_i - \mu) = \sum_{i=1}^{n}(x_i - \mu)/\sqrt{n} \overset{d}{\to} N(0, \sigma^2).$$
Proof. (of Proposition 16.8) The binomial model (16.12) means that we can write the
de-meaned log price change over a time step of length $h$ as $\varepsilon_i\sqrt{h}$, where $\varepsilon_i$ is an iid zero
[Figure: Pdf, normal and binomial, for n = 5, 10, 25, and 50, plotted against the log asset price.]
[Figure 16.7: Mean of log asset price changes, annualized, and a second panel, both plotted against n (no. steps in binomial model).]
This section demonstrates that the mean and variance of the binomial distribution
converge to the same values as in the risk neutral distribution of the Black-Scholes model
(16.10). This would be trivial if $u$, $d$ and $p$ in the binomial process (16.12) were
calibrated to always (for any $n$) give the same mean and variance of the log price changes.
In practice, most ways to calibrate the BOPM parameters only satisfy this in the limit. In
particular, that is the case for the CRR tree, which is the focus of this section.
See Figure 16.7 for an illustration.
Proposition 16.10 (Moments of CRR steps) In the Cox, Ross, and Rubinstein (1979) tree,
the parameters in (16.12) are
$$\ln u = \sigma\sqrt{h}, \; \ln d = -\sigma\sqrt{h} \text{ and } p = (e^{yh} - d)/(u - d).$$
As $n \to \infty$, but $h = m/n$, we have (since the price changes are independent) the following
results for the sum of them
This is the same as in the risk neutral distribution of the Black-Scholes model.
Proof. (of Proposition 16.10) Both the mean and the variance (of the sum) scale
linearly with the number of terms (since the terms are uncorrelated). The mean and variance
of $\ln S_{t+h}/S_t$ are $p\ln u + (1-p)\ln d$ and $p(1-p)(\ln u - \ln d)^2$. Substitute for $u$, $d$ and
$p$ and take the limits of $n\,E\ln S_{t+h}/S_t$ and $n\,Var(\ln S_{t+h}/S_t)$ as $n \to \infty$, but $h = m/n$.
(This is straightforward, but slightly messy, calculus.)
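Instead of doing the calculus, the limits can be inspected numerically. The sketch below computes the annualized risk neutral mean and variance of the log price change in a CRR tree for an increasing number of steps, and compares them with $y - \sigma^2/2$ and $\sigma^2$, which are the annualized moments implied by (16.10). The function name and the parameter values (y of 0.05, σ of 0.2, m of 0.5, taken from the figure legends) are choices made for this illustration.

```python
import math

def crr_risk_neutral_moments(y, sigma, m, n):
    """Annualized risk neutral mean and variance of ln(S_m/S) in an n-step CRR tree."""
    h = m / n
    ln_u, ln_d = sigma * math.sqrt(h), -sigma * math.sqrt(h)
    u, d = math.exp(ln_u), math.exp(ln_d)
    p = (math.exp(y * h) - d) / (u - d)
    # Mean and variance of a sum of n independent steps, divided by m to annualize
    mean = n * (p * ln_u + (1 - p) * ln_d) / m
    var = n * p * (1 - p) * (ln_u - ln_d) ** 2 / m
    return mean, var

y, sigma, m = 0.05, 0.2, 0.5
for n in (5, 50, 500, 5000):
    print(n, crr_risk_neutral_moments(y, sigma, m, n))
print("Black-Scholes:", y - sigma**2 / 2, sigma**2)
```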
The price of a European (call or put) option calculated by the binomial model converges
to the Black-Scholes price as the number of subintervals increases (keeping the time to
expiration constant, so the subintervals become shorter). This is illustrated in Figure 16.5.
Both the binomial option pricing model (BOPM) and the Black-Scholes model imply
that the call option price can be written as the discounted risk neutral expected payoff
(16.9), which we can write as
$$C = e^{-ym}\int_{K}^{\infty}(S_m - K)f(S_m)\,dS_m, \tag{16.13}$$
where $f(S_m)$ is the risk neutral density function of the asset price at expiration ($S_m$).
We can clearly rewrite this expression as
$$C = e^{-ym}E^{*}(S_m - K \mid S_m > K)\Pr^{*}(S_m > K) \tag{16.14}$$
$$= e^{-ym}E^{*}(S_m \mid S_m > K)\Pr^{*}(S_m > K) - e^{-ym}K\Pr^{*}(S_m > K). \tag{16.15}$$
The first term is (the present value of) the expected asset price conditional on exercise,
times the probability of exercise. The second term is (the present value of) the strike price
times the probability of exercise.
The discussion below demonstrates that these probabilities are the same (in the limit)
in the BOPM and the Black-Scholes models.
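[Figure: Binomial tree with two subintervals and the risk neutral probabilities of the final nodes: $p^2$, $2p(1-p)$, and $(1-p)^2$.]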
To understand the binomial model a bit better, consider a binomial tree with 2 subintervals
(n D 2) of length h as illustrated in Figures 16.8–16.9.
The price of the call option is the discounted risk neutral expected value of the value
in the next period
$$C = e^{-yh}\left[pC_u + (1-p)C_d\right],$$
$$\begin{bmatrix} C_u = e^{-yh}\left[pC_{uu} + (1-p)C_{ud}\right] \\ C_d = e^{-yh}\left[pC_{du} + (1-p)C_{dd}\right] \end{bmatrix}, \text{ and } \begin{bmatrix} C_{uu} = \max(Suu - K, 0) \\ C_{ud} = \max(Sud - K, 0) \\ C_{dd} = \max(Sdd - K, 0) \end{bmatrix} \tag{16.16}$$
where $p = (e^{yh} - d)/(u - d)$.
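Remark 16.11 (Probabilities for the final nodes) With two trials ($n = 2$), the probabilities of the final nodes are
$$\Pr(uu) = p^2, \quad \Pr(ud) = 2p(1-p), \quad \Pr(dd) = (1-p)^2.$$
[Figure: binomial tree of option values with nodes $C$; $C_u$, $C_d$; $C_{uu}$, $C_{ud}$, $C_{dd}$.]
Combining (16.16) with these probabilities (and using $m = 2h$) gives
$$C = e^{-ym}\left[p^2 \max(S_{uu} - K, 0) + 2p(1-p)\max(S_{ud} - K, 0) + (1-p)^2 \max(S_{dd} - K, 0)\right], \tag{16.17}$$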
which expresses the call option price as the discounted risk-neutral expectation of the
option payoff.
Suppose only $S_{uu} > K$, that is, it is only at the up and up branch, $uu$, that we exercise. Then
$$C = e^{-ym} p^2 (S_{uu} - K) = e^{-ym} \underbrace{S_{uu}}_{\text{E}_p(S_m | S_m > K)} \underbrace{p^2}_{\Pr_p(uu)} - e^{-ym} K \underbrace{p^2}_{\Pr_p(uu)}. \tag{16.18}$$
The first term is the (discounted value of) the risk-neutral expected value of the asset price,
conditional on being so high that we exercise the call option, times the risk neutral prob-
ability of that event. The second term is the (discounted value of) the strike price times
the risk neutral probability of exercise. This clearly has the same form as (16.15). This
extends to n steps, except that the expressions for the probabilities are more complicated.
Remark 16.12 (Bernoulli and binomial distributions) The random variable $X$ can only take two values: 1 or 0, with probability $p$ and $1-p$ respectively. This gives $\text{E}(X) = p$ and $\text{Var}(X) = p(1-p)$. After $n$ independent trials, the number of successes ($y$) has the binomial pdf, $n!/[y!(n-y)!]\, p^y (1-p)^{n-y}$ for $y = 0, 1, \ldots, n$. This gives $\text{E}(y) = np$ and $\text{Var}(y) = np(1-p)$. To find the probability of at least $z$ successes, sum the pdf over $y = z, z+1, z+2, \ldots$
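Remark 16.12 gives a direct way to compute the risk neutral probability of exercise in the binomial model: sum the binomial pdf over all numbers of up moves that leave $S_m > K$. The sketch below does that and compares with $\Phi(d_2)$ from the Black-Scholes model. It assumes no dividends and $u = e^{\sigma\sqrt{h}}$, $d = 1/u$; the pdf is evaluated in logs (via `math.lgamma`) only to avoid overflow for large $n$, and all names and parameter values are illustrative.

```python
import math


def norm_cdf(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def binomial_pmf(j, n, p):
    """Binomial pdf n!/[j!(n-j)!] p^j (1-p)^(n-j), evaluated in logs."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
               + j * math.log(p) + (n - j) * math.log(1 - p))
    return math.exp(log_pmf)


def binomial_prob_exercise(S, K, y, m, sigma, n):
    """Risk neutral Pr(S_m > K) in a binomial tree with n steps."""
    h = m / n
    step = sigma * math.sqrt(h)              # ln u = step, ln d = -step
    u, d = math.exp(step), math.exp(-step)
    p = (math.exp(y * h) - d) / (u - d)
    # with j up moves the final price is S*exp(step*(2j - n))
    return sum(binomial_pmf(j, n, p) for j in range(n + 1)
               if S * math.exp(step * (2 * j - n)) > K)


def black_scholes_prob_exercise(S, K, y, m, sigma):
    """Phi(d2), the risk neutral Pr(S_m > K) in the Black-Scholes model."""
    d2 = (math.log(S / K) + (y - sigma**2 / 2) * m) / (sigma * math.sqrt(m))
    return norm_cdf(d2)


# illustrative parameter values (assumptions)
S, K, y, m, sigma = 100.0, 105.0, 0.05, 0.5, 0.2
for n in (5, 50, 2000):
    print(n, binomial_prob_exercise(S, K, y, m, sigma, n),
          black_scholes_prob_exercise(S, K, y, m, sigma))
```

The binomial probability is a step function of $K$, so for small $n$ it can differ noticeably from $\Phi(d_2)$.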
Proposition 16.14 (Risk-neutral probability of $S_m > K$) The $\Phi(d_2)$ term in the Black-Scholes formula (16.3)–(16.4) is the risk-neutral probability that $S_m > K$.
Proposition 16.15 ($S\Phi(d_1)$ in Black-Scholes) The $S\Phi(d_1)$ term in the Black-Scholes formula (16.3)–(16.4) is (the present value of) the expected asset price conditional on exercise, times the probability of exercise, that is, the first term in (16.15).
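$$k_0 = \frac{\ln K - \overbrace{(\ln S + ym - \sigma^2 m/2)}^{\text{mean}}}{\underbrace{\sigma\sqrt{m}}_{\text{std}}}.$$
Clearly, $-k_0$ is then the same as the argument $d_2$ in (16.4)
$$d_2 = \frac{\ln(S/K) + (y - \sigma^2/2)m}{\sigma\sqrt{m}}.$$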
Proof. (of Proposition 16.15) First, the first term in (16.15) can be written
$$\text{FirstTerm} = e^{-ym} \exp(\mu + s^2/2)\, \Phi(s - k_0),$$
$$\mu + s^2/2 = \ln S + ym,$$
$$s - k_0 = \sigma\sqrt{m} - \frac{\ln K - (\ln S + ym - \sigma^2 m/2)}{\sigma\sqrt{m}} = d_1,$$
where the last line follows from comparing with (16.4). We can therefore write FirstTerm as $S\Phi(d_1)$, since the $e^{-ym}e^{ym}$ term cancels. This is the same as in the Black-Scholes formula.
This section discusses how we can hedge a European call option. The setting might be
that we have written such an option, but we do not want to carry the risk.
Consider a portfolio with $h_t$ of the underlying asset (the hedging portfolio) and short one call option. The value of the overall position is
$$V_t = h_t S_t - C_t. \tag{16.19}$$
Assume that only the price of the underlying asset can change (clearly not true, but
at least a starting point for the analysis). A first-order Taylor approximation of the call
option price is
$$C_{t+h} - C_t \approx \Delta_t (S_{t+h} - S_t), \text{ where } \Delta_t = \frac{\partial C_t}{\partial S}. \tag{16.20}$$
[Figure: probability of exercise in the binomial and Black-Scholes models, plotted against $n$ (no. steps in binomial model); a second panel is also plotted against $n$.]
Use (16.20) to approximate the change of the value of the overall portfolio as
$$V_{t+h} - V_t = h_t (S_{t+h} - S_t) - (C_{t+h} - C_t)$$
$$\approx h_t (S_{t+h} - S_t) - \Delta_t (S_{t+h} - S_t)$$
$$\approx 0 \text{ if } h_t = \Delta_t. \tag{16.21}$$
This is a delta hedge. Clearly, the delta is likely to change from period to period, so the
portfolio needs to be frequently rebalanced.
In the Black-Scholes model for an asset with dividends, the delta is
$$\Delta = \frac{\partial C}{\partial S} = e^{-\delta m} \Phi(d_1), \tag{16.22}$$
where $d_1$ is given by (16.8). Without dividends, just set $\delta = 0$. From the put-call parity,
$$\frac{\partial P}{\partial S} = e^{-\delta m}[\Phi(d_1) - 1] = -e^{-\delta m}\Phi(-d_1), \tag{16.23}$$
which is negative (the second equality follows from the symmetry of the normal distribution).
[Figure: the "Greeks" in the Black-Scholes model plotted against the asset price; panels include Delta, $\partial C/\partial S$; Gamma, $\partial^2 C/\partial S^2$; and Theta, $\partial C/\partial m$.]
Proof. (of (16.22)) From (16.7)–(16.8) we have
$$\frac{\partial C}{\partial S} = e^{-\delta m}\Phi(d_1) + e^{-\delta m} S \frac{\partial}{\partial S}\Phi(d_1) - e^{-ym} K \frac{\partial}{\partial S}\Phi(d_2),$$
but it is straightforward to show that the last two terms cancel. The key to that proof is to note that $\Phi'(d_1) = \Phi'(d_2) K/F$. To demonstrate that, recall that $\Phi'(d_1) = \frac{1}{\sqrt{2\pi}} \exp(-d_1^2/2)$.
See Figure 16.11 for an illustration of the "Greeks" in the B-S model and Figure 16.12 for an example of how a delta hedge works on real data.
Clearly, $0 \le \Delta \le 1$ and $\Delta$ is increasing in the price of the underlying asset. Intuitively, an option that is deep out of the money will not be very sensitive to the asset price, since the chance of exercising is so low. Conversely, an option that is deep in the money moves almost in tandem with the asset price, since it will almost surely be exercised.
[Figure: delta of an SMI call option (strike 5500) and the SMI futures, May to October.]
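The delta formulas can be checked against a numerical derivative of the option price. The sketch below assumes the standard Black-Scholes call price with a continuous dividend rate, $C = e^{-\delta m} S \Phi(d_1) - e^{-ym} K \Phi(d_2)$ with $d_1 = [\ln(S/K) + (y - \delta + \sigma^2/2)m]/(\sigma\sqrt{m})$, which is presumably what (16.7)–(16.8) state; the function names, the argument name `div` (for $\delta$) and the parameter values are illustrative.

```python
import math


def norm_cdf(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def d1_value(S, K, y, m, sigma, div=0.0):
    return (math.log(S / K) + (y - div + sigma**2 / 2) * m) / (sigma * math.sqrt(m))


def call_price(S, K, y, m, sigma, div=0.0):
    """Black-Scholes call price with continuous dividend rate div (assumed form)."""
    d1 = d1_value(S, K, y, m, sigma, div)
    d2 = d1 - sigma * math.sqrt(m)
    return math.exp(-div * m) * S * norm_cdf(d1) - math.exp(-y * m) * K * norm_cdf(d2)


def call_delta(S, K, y, m, sigma, div=0.0):
    """Delta of a call, as in (16.22)."""
    return math.exp(-div * m) * norm_cdf(d1_value(S, K, y, m, sigma, div))


def put_delta(S, K, y, m, sigma, div=0.0):
    """Delta of a put, as in (16.23)."""
    return -math.exp(-div * m) * norm_cdf(-d1_value(S, K, y, m, sigma, div))


# illustrative parameter values (assumptions)
K, y, m, sigma, div = 45.0, 0.05, 0.5, 0.2, 0.02
for S in (30.0, 45.0, 60.0):
    eps = 1e-4
    numerical = (call_price(S + eps, K, y, m, sigma, div)
                 - call_price(S - eps, K, y, m, sigma, div)) / (2 * eps)
    print(S, call_delta(S, K, y, m, sigma, div), numerical,
          put_delta(S, K, y, m, sigma, div))
```

The call delta rises from close to 0 (deep out of the money) towards $e^{-\delta m}$ (deep in the money), which is the pattern described above.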
In practice, the hedging portfolio also includes a small position in a short-term money market account, chosen to make the overall portfolio have a zero value (at least initially).
Example 16.16 (Overall portfolio value over several subperiods) Start by creating a hedge portfolio with a zero initial value
$$0 = \Delta_t S_t - C_t + B_t, \text{ so } B_t = 0 - \Delta_t S_t + C_t,$$
where $B_t$ is the amount held in the riskfree asset. In $t+h$ (say, after one day), this portfolio is worth (assuming no dividends)
$$V_{t+h} = \Delta_t S_{t+h} e^{\delta h} - C_{t+h} + B_t e^{y_t h},$$
$$B_{t+h} = V_{t+h} - \Delta_{t+h} S_{t+h} + C_{t+h},$$
which is very similar to the first equation. Clearly, the value of the portfolio in $t+2h$ is computed as in the second equation, but with subscripts advanced one period.
When the underlying is a forward contract as in Black's model (16.5)–(16.6), the sensitivity of the call option price to the forward price is
$$\Delta_t = \frac{\partial C}{\partial F} = e^{-ym}\Phi(d_1), \tag{16.24}$$
where $d_1$ is given by (16.6). The sensitivity of a put option is
$$\frac{\partial P}{\partial F} = e^{-ym}[\Phi(d_1) - 1] = -e^{-ym}\Phi(-d_1). \tag{16.25}$$
Proof. (of (16.24)) Similar to the proof of (16.22).
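Delta hedging can be imprecise if the price of the underlying asset changes much. A second-order Taylor approximation of the option price gives
$$C_{t+h} - C_t \approx \Delta_t (S_{t+h} - S_t) + \frac{1}{2}\Gamma_t (S_{t+h} - S_t)^2, \text{ where } \Delta_t = \frac{\partial C_t}{\partial S} \text{ and } \Gamma_t = \frac{\partial^2 C_t}{\partial S^2}. \tag{16.26}$$
This movement can be hedged by holding $v_t$ of the underlying asset and $w_t$ of another option. Let $\tilde{\Delta}_t$ and $\tilde{\Gamma}_t$ be the delta and gamma of this other option. A second-order Taylor approximation of the value of this portfolio (denoted $U_t$) is
$$U_{t+h} - U_t \approx v_t (S_{t+h} - S_t) + w_t \tilde{\Delta}_t (S_{t+h} - S_t) + \frac{1}{2} w_t \tilde{\Gamma}_t (S_{t+h} - S_t)^2. \tag{16.27}$$
Subtracting (16.26) from (16.27) gives
$$(U_{t+h} - U_t) - (C_{t+h} - C_t) \approx \underbrace{(v_t + w_t \tilde{\Delta}_t - \Delta_t)}_{A_t} (S_{t+h} - S_t) + \frac{1}{2}\underbrace{(w_t \tilde{\Gamma}_t - \Gamma_t)}_{B_t} (S_{t+h} - S_t)^2. \tag{16.28}$$
By first choosing $w_t$ to make the $B_t$ term zero and then $v_t$ to make the $A_t$ term zero, we get a hedge. This clearly gives
$$w_t = \Gamma_t/\tilde{\Gamma}_t, \text{ and} \tag{16.29}$$
$$v_t = \Delta_t - \tilde{\Delta}_t \Gamma_t/\tilde{\Gamma}_t. \tag{16.30}$$
In the Black-Scholes model, the gamma is
$$\Gamma = \frac{\partial^2 C}{\partial S^2} = e^{-\delta m} \frac{1}{S\sigma\sqrt{m}} \frac{\partial \Phi(d_1)}{\partial d_1},$$
where $d_1$ is given by (16.7). Without dividends, just set $\delta = 0$. Clearly, $\partial\Phi(d_1)/\partial d_1$ is the probability density function (at $d_1$) of a $N(0,1)$ variable.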
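To see what the second-order (delta-gamma) hedge adds, the following sketch compares the hedging error of a pure delta hedge with that of a hedge that also uses another option, for a sudden 5% move in the underlying. It is a sketch under stated assumptions: Black-Scholes prices without dividends, an instantaneous price move (no time decay), and a second option that differs only in its strike price. All names and numbers are illustrative.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def bs_call(S, K, y, m, sigma):
    """Return (price, delta, gamma) of a Black-Scholes call without dividends."""
    sd = sigma * math.sqrt(m)
    d1 = (math.log(S / K) + (y + sigma**2 / 2) * m) / sd
    d2 = d1 - sd
    price = S * norm_cdf(d1) - math.exp(-y * m) * K * norm_cdf(d2)
    return price, norm_cdf(d1), norm_pdf(d1) / (S * sd)


def delta_gamma_weights(delta, gamma, delta_other, gamma_other):
    """Holdings (v, w) of the underlying and the other option: gamma first, then delta."""
    w = gamma / gamma_other
    v = delta - delta_other * w
    return v, w


def hedge_error(S0, S1, v, w, K, K_other, y, m, sigma):
    """Change in the hedge portfolio minus change in the option being hedged."""
    dC = bs_call(S1, K, y, m, sigma)[0] - bs_call(S0, K, y, m, sigma)[0]
    dC_other = bs_call(S1, K_other, y, m, sigma)[0] - bs_call(S0, K_other, y, m, sigma)[0]
    return v * (S1 - S0) + w * dC_other - dC


# illustrative numbers (assumptions)
S0, S1, y, m, sigma = 100.0, 105.0, 0.03, 0.5, 0.2
K, K_other = 100.0, 105.0
_, delta, gamma = bs_call(S0, K, y, m, sigma)
_, delta_o, gamma_o = bs_call(S0, K_other, y, m, sigma)

v, w = delta_gamma_weights(delta, gamma, delta_o, gamma_o)
err_delta_only = hedge_error(S0, S1, delta, 0.0, K, K_other, y, m, sigma)
err_delta_gamma = hedge_error(S0, S1, v, w, K, K_other, y, m, sigma)
print("delta hedge error:      ", err_delta_only)
print("delta-gamma hedge error:", err_delta_gamma)
```

The remaining error of the delta-gamma hedge comes from third- and higher-order terms, which the Taylor approximation ignores.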
16.5 Options on Currencies and Interest Rates
Buying one currency entails selling another. It should therefore come as no surprise that a call option on a currency is also a put option on the other currency. To be precise, the option prices are related according to
$$C_d(K) = S_t K P_f(1/K). \tag{16.31}$$
On the left hand side, $C_d$ is the domestic price of a call option on the foreign currency, with the strike price ($K$) expressed in the domestic currency. On the right hand side, $S_t$ is the current exchange rate (domestic price of one unit of the foreign currency), and $P_f$ is the foreign price of a put option on the domestic currency, with the strike price $1/K$.
Example 16.19 Let $C_d = £0.01$ for an option on US dollars and let the strike price be £0.6 (to get one dollar). If the current exchange rate is £0.58 (per dollar), then the dollar price of a put option on GBP with a strike price of $1/0.6$ dollars per GBP is $0.01/(0.58 \times 0.6) = \$0.0287$.
Proof. (of (16.31)) The payoff of a call option (denominated in the domestic currency) on foreign currency with strike price $K$ is
$$\max(0, S_{t+m} - K),$$
where $K$ is the strike price and $S_{t+m}$ is the exchange rate at expiration, both expressed as the domestic price of one unit of foreign currency (for instance, GBP 0.6 per USD). The payoff is clearly expressed in the domestic currency. In contrast, a put option (denominated in the foreign currency) on the domestic currency (with strike price $1/K$) has the payoff
$$\max(0, 1/K - 1/S_{t+m}),$$
which is clearly expressed in the foreign currency. Notice that both options are exercised when $S_{t+m} > K$. In fact, these options are identical, except for a scaling factor and the currency denomination. To see that, consider buying $K$ of the foreign denominated options and then convert the payoff to the domestic currency (multiply by $S_{t+m}$)
$$K S_{t+m} \max(0, 1/K - 1/S_{t+m}) = \max(0, S_{t+m} - K),$$
which is clearly the same as for the first option. For that reason, buying $K$ of the foreign currency denominated put options should have the same price (when measured in domestic currency, that is, multiplied by $S_t$) as the domestically denominated call option.
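The arithmetic in Example 16.19 and the payoff identity used in the proof are easy to verify. In the sketch below, the function names and the grid of exchange rates at expiration are illustrative choices.

```python
def foreign_put_price(call_price_domestic, spot, strike):
    """Foreign-currency price of the put on the domestic currency: P_f = C_d/(S*K)."""
    return call_price_domestic / (spot * strike)


def call_payoff_domestic(S_T, K):
    """Payoff (domestic currency) of a call on the foreign currency."""
    return max(0.0, S_T - K)


def scaled_put_payoff_domestic(S_T, K):
    """Payoff of K foreign-denominated puts (strike 1/K), converted at S_T."""
    return K * S_T * max(0.0, 1.0 / K - 1.0 / S_T)


# Example 16.19: C_d = 0.01 GBP, K = 0.6 GBP per USD, spot 0.58 GBP per USD
print(foreign_put_price(0.01, 0.58, 0.6))

# the two payoffs coincide for any exchange rate at expiration (illustrative grid)
for S_T in (0.50, 0.58, 0.60, 0.65, 0.70):
    print(S_T, call_payoff_domestic(S_T, 0.6), scaled_put_payoff_domestic(S_T, 0.6))
```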
Options on the FX (exchange rate) markets are often sold (on the OTC market) as special
portfolios (consisting of straddles, risk-reversals and strangles) and quoted in terms of
the implied volatilities. Apart from these conventions, options on exchange rates are
no different from options on other assets (but, remember that currencies typically carry
“dividends” since holding a currency in practice means holding a money market account
in that currency).
A delta-neutral straddle (in terms of the forward contract) is a long position in a call and also in a put. To make it delta-neutral, we need
$$\frac{\partial C}{\partial F} + \frac{\partial P}{\partial F} = 0, \tag{16.32}$$
which from (16.22)–(16.23) and (16.5)–(16.6) gives (using $F = Se^{(y-\delta)m}$)
$$d_1 = 0, \text{ that is, } K_{atm} = F e^{\sigma^2 m/2}. \tag{16.33}$$
This straddle is typically quoted in terms of the implied volatility ($\sigma_{atm}$) of an option at $K_{atm}$. A higher value of the straddle indicates more overall uncertainty.
See Figure 16.13 for illustrations.
A 25-delta risk reversal is a portfolio of one call option with a strike price $K_2$ such that the delta is $0.25$ and short one put option with a strike price $K_1$ such that the delta is $-0.25$. Both options are out of the money so the strike price for the put is lower than the forward price, which in turn is lower than the strike price of the call ($K_1 < F < K_2$). The risk reversal is typically quoted as the difference of the two implied volatilities
$$rr = \sigma_2 - \sigma_1, \tag{16.34}$$
where $\sigma_2$ and $\sigma_1$ are the implied volatilities of the options with strike prices $K_2$ and $K_1$ respectively (notice that, by the put-call parity, a put and a call with the same strike price have the same implied volatility). A higher value of the risk reversal indicates beliefs of an increase in the underlying, so it captures skewness.
A 25-delta strangle has a long position in the 25-delta call and also in the 25-delta put. It is typically quoted as their average implied volatility minus the at-the-money volatility and is therefore also called a butterfly
$$bf = \frac{\sigma_2 + \sigma_1}{2} - \sigma_{atm}. \tag{16.35}$$
(A butterfly is a long position in a strangle and a short position in a straddle, so it looks similar to $bf$, which is just a convention for quoting a price of a strangle or sometimes a butterfly.) An increase in $bf$ signals a belief in fatter tails, so it captures kurtosis. Notice that a proportional increase of all volatilities does not change $bf$ (it is "vega" neutral).
With the quotes on the risk reversal (16.34) and the butterfly (16.35), we can solve for the implied volatilities $\sigma_1$ and $\sigma_2$ as
$$\sigma_1 = bf + \sigma_{atm} - rr/2$$
$$\sigma_2 = bf + \sigma_{atm} + rr/2. \tag{16.36}$$
It is straightforward to invert the formulas for the deltas to derive what the strike prices are. If we use the convention that the deltas are with respect to the spot price, then setting $\partial C/\partial S = 0.25$ in (16.22) and $\partial P/\partial S = -0.25$ in (16.23) gives the following strike prices (using $F = Se^{(y-\delta)m}$)
$$K_2 = F \exp[-\sigma_2 \sqrt{m}\, \Phi^{-1}(e^{\delta m} 0.25) + m\sigma_2^2/2]$$
$$K_1 = F \exp[\sigma_1 \sqrt{m}\, \Phi^{-1}(e^{\delta m} 0.25) + m\sigma_1^2/2], \tag{16.37}$$
where $\delta$ equals the foreign interest rate. Clearly, by changing 0.25 to $x$, we get the results for an $x$-delta risk reversal instead.
See Figure 16.14 for an empirical illustration.
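The steps from market quotes to volatilities and strike prices in (16.36)–(16.37) can be sketched as follows. The quotes, the forward price and the foreign interest rate used here are made-up numbers for illustration only, and the function names are hypothetical; `statistics.NormalDist` supplies $\Phi$ and $\Phi^{-1}$.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # N(0,1)


def vols_from_quotes(sigma_atm, rr, bf):
    """Implied volatilities (sigma1, sigma2) from the rr and bf quotes, as in (16.36)."""
    return bf + sigma_atm - rr / 2, bf + sigma_atm + rr / 2


def strikes_from_deltas(F, m, foreign_rate, sigma1, sigma2, x=0.25):
    """Strike prices (K1, K2) of the x-delta put and call, as in (16.37)."""
    q = STD_NORMAL.inv_cdf(math.exp(foreign_rate * m) * x)
    K2 = F * math.exp(-sigma2 * math.sqrt(m) * q + m * sigma2**2 / 2)
    K1 = F * math.exp(sigma1 * math.sqrt(m) * q + m * sigma1**2 / 2)
    return K1, K2


def spot_delta_call(F, K, m, foreign_rate, sigma):
    """Call delta with respect to the spot price: exp(-delta*m)*Phi(d1)."""
    d1 = (math.log(F / K) + sigma**2 * m / 2) / (sigma * math.sqrt(m))
    return math.exp(-foreign_rate * m) * STD_NORMAL.cdf(d1)


# made-up quotes (assumptions): 10% atm vol, rr = -1%, bf = 0.3%, one month
sigma_atm, rr, bf = 0.10, -0.01, 0.003
F, m, foreign_rate = 1.5, 1 / 12, 0.02

sigma1, sigma2 = vols_from_quotes(sigma_atm, rr, bf)
K1, K2 = strikes_from_deltas(F, m, foreign_rate, sigma1, sigma2)
print("sigma1, sigma2:", sigma1, sigma2)
print("K1, F, K2:", K1, F, K2)
print("delta of call at K2:", spot_delta_call(F, K2, m, foreign_rate, sigma2))
```

Feeding the strike $K_2$ back into the delta formula returns 0.25, which confirms that the inversion is consistent.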
Options on bonds are basically no different from options on equity, especially since bonds
typically pay coupons (“dividends”). For instance, a call option on a bond gives the right
to buy the bond (at the expiration of the option) at the strike price.
Options on interest rates are also very similar, but often have a more complicated structure. A caplet is a call option that protects against higher interest rates (typically a floating 3-month market rate or similar). Let $Z_{t+s}$ be the (annualized) market interest rate for a loan between $t+s$ and $t+s+m$ and let $Z_K$ be the (annualized) cap rate. The payoff in $t+s+m$ (notice: paid at the end of the borrowing period) is
$$\max[0, m(Z_{t+s} - Z_K)]. \tag{16.38}$$
The second term is the interest rate cost for a loan (with a face value of unity) between $t+s$ and $t+s+m$ according to the market rate minus the same cost according to the cap rate. Clearly, buying such an option is a way to make sure that the interest rate paid on a future loan will not exceed the cap rate. If settled at $t+s$ the payoff is just the discounted value
$$\frac{\max[0, m(Z_{t+s} - Z_K)]}{1 + mZ_{t+s}}. \tag{16.39}$$
[Figure: payoff profiles of option portfolios against the strike prices $K_1$, $K_{atm}$, $K_2$. Panels: straddle (atm), $C(K=F) + P(K=F)$; risk reversal, $C(K_2) - P(K_1)$; strangle; butterfly (strangle minus straddle).]
[Figure: atm straddle and 25-delta risk reversal, implied volatility quotes, DM/GBP options, 1992.]
The payoff in (16.39) can be rewritten as
$$(1 + mZ_K) \max\left[0, \frac{1}{1 + mZ_K} - B_{t+s}(m)\right]. \tag{16.40}$$
Notice that the $\max()$ term defines the payoff of a put option on an $m$-period bond in $t+s$ (whose value turns out to be $B_{t+s}(m) = 1/(1 + mZ_{t+s})$), with a strike price of $1/(1 + mZ_K)$. The caplet is therefore proportional to a put option on a bond.
Proof. (of (16.40)) Multiply and divide (16.39) by $(1 + mZ_K)$ and rearrange
$$(1 + mZ_K) \max\left[0, \frac{mZ_{t+s} - mZ_K}{(1 + mZ_{t+s})(1 + mZ_K)}\right] = (1 + mZ_K) \max\left[0, \frac{1}{1 + mZ_K} - \frac{1}{1 + mZ_{t+s}}\right].$$
Remark 16.20 (Simple interest rates) If $Z$ is a simple interest rate, then the price of a zero-coupon bond that gives unity at maturity is
$$B(m) = \frac{1}{1 + mZ(m)}, \text{ or } Z(m) = \frac{1/B(m) - 1}{m}.$$
A simple forward rate for the period $s$ to $s+m$ periods in the future is defined as
$$Z^f(s, s+m) = \frac{1}{m}\left[\frac{B(s)}{B(s+m)} - 1\right].$$
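$$\text{Caplet}(s, m; \sigma, Z_K) = m e^{-(s+m)y}[Z^f \Phi(d_1) - Z_K \Phi(d_2)], \text{ where} \tag{16.41}$$
$$d_1 = \frac{\ln(Z^f/Z_K) + (\sigma^2/2)s}{\sigma\sqrt{s}} \text{ and } d_2 = d_1 - \sigma\sqrt{s}, \tag{16.42}$$
where $\sigma$ is the (annualized) volatility of the log forward rate.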
An interest rate cap is a portfolio of different caplets which protects the owner over several tenors (subperiods) starting $S$ periods ahead and lasting until $S+n$. Typically, the first caplet is deleted (as there is no uncertainty about what the short rate is today) and the last payment is done on the maturity date $S+n$. Therefore, the tenors are $[S+m, S+2m]$, $[S+2m, S+3m]$ and so forth until the last one which is $[S+n-m, S+n]$, so there are $n/m - 1$ caplets. (The start/end of a tenor is called a reset/settlement date.) For instance, a 1-year cap, starting 6 months from now, on the 3-month Libor consists of 3 caplets. See Figure 16.15 for an illustration.
If we apply the same volatility to all caplets ("flat volatilities"), then we can price a cap (according to the Black-Scholes model) that starts in $S$ and ends in $S+n$, so the last caplet starts in $S+n-m$. The value of the cap is therefore
$$\text{Cap}(S, n, m; \sigma, Z_K) = \sum_{i=1}^{n/m-1} \text{Caplet}(S + im, m; \sigma, Z_K). \tag{16.43}$$
Caps are often quoted in terms of the implied volatility ($\sigma$) that solves this equation, meaning that there is one implied volatility per cap contract, but it may differ across cap rates ("strike prices") and maturities.
Example 16.21 (1-year Cap starting 6 months ahead, 3-month tenors) Let $n = 1$, $m = 1/4$ and $S = 1/2$. The payoffs are based on the difference between the 3-month Libor and the cap rate at the beginning of the tenors ($3/4, 1, 5/4$), but are paid one quarter ($1/4$) later. Equation (16.43) is therefore
$$\text{Cap} = \text{Caplet}(3/4, 1/4; \sigma, Z_K) + \text{Caplet}(1, 1/4; \sigma, Z_K) + \text{Caplet}(5/4, 1/4; \sigma, Z_K).$$
Clearly, the first caplet starts in $S + m = 2/4 + 1/4 = 3/4$ and the last starts in $S + (n/m - 1)m = 2/4 + 1 - 1/4 = 5/4$.
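The caplet formula (16.41)–(16.42) and the sum in (16.43) can be sketched as below. For simplicity the sketch assumes the same forward rate $Z^f$ and the same interest rate $y$ for all tenors, which is a strong simplification, and all numerical values and function names are illustrative.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def caplet(s, m, sigma, ZK, Zf, y):
    """Caplet value as in (16.41)-(16.42); Zf is the simple forward rate for [s, s+m]."""
    d1 = (math.log(Zf / ZK) + (sigma**2 / 2) * s) / (sigma * math.sqrt(s))
    d2 = d1 - sigma * math.sqrt(s)
    return m * math.exp(-(s + m) * y) * (Zf * norm_cdf(d1) - ZK * norm_cdf(d2))


def caplet_starts(S, n, m):
    """Start dates S + i*m for i = 1, ..., n/m - 1, as in (16.43)."""
    n_caplets = round(n / m) - 1
    return [S + i * m for i in range(1, n_caplets + 1)]


def cap(S, n, m, sigma, ZK, Zf, y):
    """Cap value with a flat volatility; assumes the same Zf and y for every tenor."""
    return sum(caplet(s, m, sigma, ZK, Zf, y) for s in caplet_starts(S, n, m))


# Example 16.21: 1-year cap starting 6 months ahead, 3-month tenors
starts = caplet_starts(0.5, 1.0, 0.25)
print("caplets start at:", starts)

# illustrative rates and volatility (assumptions)
sigma, ZK, Zf, y = 0.2, 0.05, 0.05, 0.04
print("cap value:", cap(0.5, 1.0, 0.25, sigma, ZK, Zf, y))
```

In practice each caplet would use its own forward rate, computed from bond prices as in Remark 16.20.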
Floorlets and floors are similar to caplets and caps, except that they pay off when the interest rate goes below the floor rate (the counterpart of the cap rate).
We have seen that the price of a derivative is a discounted risk-neutral expectation of the
derivative payoff, see (16.9).
In the Black-Scholes model, this risk-neutral distribution is that $\ln S_m$ is normally distributed as in (16.2) except that the mean is different (this is the difference between the natural and the risk-neutral distribution). However, risk neutral distributions can be derived from other assumptions than those in the Black-Scholes model, and (16.9) would still be valid. For instance, it holds in the binomial model, whose distribution is not normal (unless we make the time steps very many and small). Alternatively, we could construct a binomial tree where the time steps have different volatilities (this is often done to fit the yield curve), and even in the limit (with many and small time steps) the distribution would be non-normal. Once again, the Black-Scholes formula would not be exact, but (16.9) would still be true.
[Figure: interest rate cap; the market rate relative to the cap rate over the tenors $t, t+m, \ldots, t+6m$, with the start and end of the cap marked.]
Example 16.22 (Call prices, three states) Suppose that $S_m$ only can take three values: 90, 100, and 110; and that the risk neutral probabilities for these events are: 0.5, 0.4, and 0.1, respectively. We consider three European call option contracts with the strike prices 89, 99, and 109. From (16.9) their prices are (if $y = 0$)
$$C(K = 89) = 0.5(90 - 89) + 0.4(100 - 89) + 0.1(110 - 89) = 7.0$$
$$C(K = 99) = 0.5 \times 0 + 0.4(100 - 99) + 0.1(110 - 99) = 1.5$$
$$C(K = 109) = 0.5 \times 0 + 0.4 \times 0 + 0.1(110 - 109) = 0.1.$$
With prices on several options with different strike prices (but otherwise identical), it is possible to estimate the risk-neutral distribution.
Example 16.23 (Extracting probabilities) Suppose we observe the option prices in Example 16.22, and want to use these to recover the probabilities. We know the possible states, but not their probabilities. Let $\Pr(x)$ denote the probability that $S_m = x$. From Example 16.22, we have that the option price for $K = 109$ equals
$$C(K = 109) = 0.1 = \Pr(90) \times 0 + \Pr(100) \times 0 + \Pr(110)(110 - 109),$$
which we can solve as $\Pr(110) = 0.1$. We now use this in the expression for the option price for $K = 99$
$$C(K = 99) = 1.5 = \Pr(90) \times 0 + \Pr(100)(100 - 99) + 0.1(110 - 99),$$
which we can solve as $\Pr(100) = 0.4$. Since probabilities sum to one, it follows that $\Pr(90) = 0.5$.
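The recursion in Example 16.23 generalizes to any number of states, as long as there is one strike price just below each state. The sketch below first prices the calls as in Example 16.22 and then backs out the probabilities, starting from the highest strike price. Unlike the example, it uses the price of the lowest-strike option for the last probability rather than the adding-up restriction; the function names are illustrative.

```python
import math


def call_prices(states, probs, strikes, y=0.0, m=1.0):
    """Discounted risk neutral expected payoffs, one price per strike."""
    disc = math.exp(-y * m)
    return [disc * sum(p * max(s - K, 0.0) for s, p in zip(states, probs))
            for K in strikes]


def extract_probs(states, strikes, prices, y=0.0, m=1.0):
    """Back out state probabilities from call prices.

    Assumes states and strikes are sorted in increasing order and that
    strikes[k] lies between states[k-1] and states[k].
    """
    n = len(states)
    probs = [0.0] * n
    growth = math.exp(y * m)
    for k in range(n - 1, -1, -1):           # start with the highest strike
        known = sum(probs[j] * (states[j] - strikes[k]) for j in range(k + 1, n))
        probs[k] = (growth * prices[k] - known) / (states[k] - strikes[k])
    return probs


states = [90.0, 100.0, 110.0]
strikes = [89.0, 99.0, 109.0]
prices = call_prices(states, [0.5, 0.4, 0.1], strikes)
print("prices:", prices)
print("recovered probabilities:", extract_probs(states, strikes, prices))
```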
A common approach is to make an assumption about the form of the distribution, for instance, that it is a mixture of two normal distributions. The parameters of this distribution are then chosen (estimated) by minimizing the sum (across strike prices) of squared differences between observed and predicted prices. (This is like the minimization problem behind the least squares method in econometrics.) This makes it possible to pick up skewed (downside risk different from upside risk?) and even bimodal distributions.
Figure 16.16 shows some data and results (assuming a mixture of two normal distribu-
tions) for German bond options around the announcement of the very high money growth
rate on 2 March 1994.
[Figure: June-94 Bund option on 06-Apr-1994. Panels: implied volatility against the strike price (yield to maturity, %); pdf against the yield to maturity, %, for a normal distribution (N) and a mixture of normals (mix N).]
Figure 16.16: Bund options 23 February and 3 March 1994. Options expiring in June 1994.
[Figure: distribution of CHF/EUR, 1m, on 16-Sep-2008 and 16-Oct-2008.]
[Figure: CHF/EUR 80% conf band and forward, 1m.]
[Figure: CHF/EUR, distance from forward, 1m; percentile 10 and percentile 90.]
Bibliography
Cochrane, J. H., 2001, Asset pricing, Princeton University Press, Princeton, New Jersey.
Cox, J. C., S. A. Ross, and M. Rubinstein, 1979, “Option pricing: a simplified approach,” Journal of Financial Economics, 7, 229–263.
Hull, J. C., 2006, Options, futures, and other derivatives, Prentice-Hall, Upper Saddle River, NJ, 6th edn.
17 Trading Volatility
Reference: Gatheral (2006) and McDonald (2006)
More advanced material is denoted by a star (*). It is not required reading.
17.1 VIX
By using option portfolios (for instance, straddles) it is possible to create a position that
is a bet on volatility—and is (in principle) not sensitive to the direction of change of the
underlying. See Figure 17.1 for an illustration.
Volatility, as an asset class, has some interesting features. In particular, returns on the underlying asset and volatility are typically negatively correlated: very negative returns are typically accompanied by increases in future actual volatility as well as beliefs about higher future volatility (as priced into options). See Figure 17.2 for an illustration, where changes in the VIX are taken to proxy the one-day holding return on a straddle.
[Figure: payoffs of Call(K), Put(K) and the straddle against the stock price.]
[Figure: VIX (solid) and S&P 500 (dashed), 1990 to 2010.]
The VIX is an index of volatility, calculated from 1-month options on S&P 500. It
used to be calculated as an average of implied volatilities, but since 2003 the calculation
is more complicated (the old series is now called VXO). It can be shown (although it is
a bit tricky) that the VIX is a very good approximation to the square root of the variance
swap rate (see below) for a 30-day contract. There is also a futures contract on VIX with
payoff
$$\text{VIX futures payoff}_{t+m} = VIX_{t+m} - \text{futures price}_t. \tag{17.1}$$
Notice that $VIX_{t+m}$ is really a guess of what the volatility will be during the month after $t+m$, so the futures contract pays off when the expected volatility (in $t+m$) is higher than what was thought in $t$.
Remark 17.1 (Calculation of VIX) Let $F$ be the forward price, $\Delta K_i = (K_{i+1} - K_{i-1})/2$ and let $K_0$ denote the first strike price below $F$. Then, the VIX is calculated as
$$VIX^2 = \frac{2}{m} \exp(ym) \sum_{K_i \le K_0} \frac{\Delta K_i}{K_i^2} P(K_i) + \frac{2}{m} \exp(ym) \sum_{K_i > K_0} \frac{\Delta K_i}{K_i^2} C(K_i) - \frac{1}{m}(F/K_0 - 1)^2,$$
where $m$ is the time to expiration (around 1/12), $y$ the interest rate, $P()$ the put price and $C()$ the call price.
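One way to get a feeling for the formula in Remark 17.1 is to feed it option prices generated by Black's model with a known volatility: the calculated VIX should then be close to that volatility. The sketch below does this on an artificial, dense grid of strike prices. The grid, the one-sided $\Delta K_i$ at the two end points, and all parameter values are assumptions made for the illustration; real calculations use the traded strikes and handle several practical details that are ignored here.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_call(F, K, y, m, sigma):
    """Call price in Black's model (option on a forward)."""
    sd = sigma * math.sqrt(m)
    d1 = (math.log(F / K) + sd**2 / 2) / sd
    return math.exp(-y * m) * (F * norm_cdf(d1) - K * norm_cdf(d1 - sd))


def black_put(F, K, y, m, sigma):
    """Put price from the put-call parity."""
    return black_call(F, K, y, m, sigma) - math.exp(-y * m) * (F - K)


def vix_squared(strikes, puts, calls, F, y, m):
    """VIX^2 as in Remark 17.1; strikes sorted, puts/calls aligned with strikes."""
    K0 = max(K for K in strikes if K <= F)   # first strike price below F
    total = 0.0
    last = len(strikes) - 1
    for i, K in enumerate(strikes):
        if i == 0:
            dK = strikes[1] - strikes[0]     # one-sided at the end points (assumption)
        elif i == last:
            dK = strikes[last] - strikes[last - 1]
        else:
            dK = (strikes[i + 1] - strikes[i - 1]) / 2
        price = puts[i] if K <= K0 else calls[i]
        total += dK / K**2 * price
    return 2 / m * math.exp(y * m) * total - (F / K0 - 1) ** 2 / m


# artificial data (assumptions): flat 20% volatility, one month to expiration
F, y, m, sigma = 100.3, 0.03, 1 / 12, 0.2
strikes = [50 + 0.5 * i for i in range(301)]          # 50, 50.5, ..., 200
puts = [black_put(F, K, y, m, sigma) for K in strikes]
calls = [black_call(F, K, y, m, sigma) for K in strikes]

vix = math.sqrt(vix_squared(strikes, puts, calls, F, y, m))
print("volatility used:", sigma, " calculated VIX:", vix)
```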
A variance swap has the payoff
$$\text{Variance swap payoff}_{t+m} = \text{realized variance}_{t+m} - \text{variance swap rate}_t, \tag{17.2}$$
where the variance swap rate (also called the strike or forward price) is agreed on at inception ($t$) and the realized variance is just the sample variance for the swap period. Both rates are typically annualized, for instance, if data is daily and includes only trading days, then the variance is multiplied by 252 or so (as a proxy for the number of trading days per year).
A volatility swap is similar, except that the payoff is expressed as the difference between the standard deviations instead of the variances
$$\text{Volatility swap payoff}_{t+m} = \sqrt{\text{realized variance}_{t+m}} - \text{volatility swap rate}_t. \tag{17.3}$$
If we use daily data to calculate the realized variance from $t$ until the expiration ($RV_{t+m}$), then
$$RV_{t+m} = \frac{252}{m} \sum_{s=1}^{m} R_{t+s}^2, \tag{17.4}$$
where $R_{t+s}$ is the net return on day $t+s$. (This formula assumes that the mean return is zero, which is typically a good approximation for high frequency data. In some cases, the average is taken only over $m-1$ days.)
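As a small illustration of (17.4) and the swap payoffs, the sketch below computes the annualized realized variance from a short series of made-up daily returns. The swap rates are likewise made up, and the function names are illustrative.

```python
import math


def realized_variance(returns, days_per_year=252):
    """Annualized realized variance as in (17.4): (252/m) * sum of squared returns."""
    m = len(returns)
    return days_per_year / m * sum(r * r for r in returns)


def variance_swap_payoff(returns, variance_swap_rate):
    """Realized variance minus the variance swap rate agreed at inception."""
    return realized_variance(returns) - variance_swap_rate


def volatility_swap_payoff(returns, volatility_swap_rate):
    """Square root of realized variance minus the volatility swap rate."""
    return math.sqrt(realized_variance(returns)) - volatility_swap_rate


# made-up daily net returns (assumption): +/- 1% every day
daily_returns = [0.01, -0.01, 0.01, -0.01]
rv = realized_variance(daily_returns)
print("realized variance:", rv, " realized volatility:", math.sqrt(rv))
print("variance swap payoff:", variance_swap_payoff(daily_returns, 0.04))
print("volatility swap payoff:", volatility_swap_payoff(daily_returns, 0.20))
```

With daily moves of 1% the annualized volatility is $\sqrt{252} \times 0.01$, or roughly 16%, so both swaps pay a negative amount at the swap rates used here.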
Notice that both variance and volatility swaps pay off if actual (realized) volatility between $t$ and $t+m$ is higher than expected in $t$. In contrast, the futures on the VIX pays off when the expected volatility (in $t+m$) is higher than what was thought in $t$. In a way, we can think of the VIX futures as a futures on a volatility swap (between $t+m$ and a month later).
[Figure: VIX (solid) and realized volatility (dashed), 1990 to 2010.]
Since VIX$^2$ is a good approximation of the variance swap rate for a 30-day contract, the return can be approximated as
Figures 17.3 and 17.4 illustrate the properties for the VIX and realized volatility of
the S&P 500. It is clear that the mean return of a variance swap (with expiration of 30
days) would have been negative on average. (Notice: variance swaps were not traded
for the early part of the sample in the figure.) The excess return (over a riskfree rate)
would, of course, have been even more negative. This suggests that selling variance
swaps (which has been the speciality of some hedge funds) might be a good deal—except
that it will incur some occasional really large losses (the return distribution has positive
skewness). Presumably, buyers of the variance swaps think that this negative average
return is a reasonable price to pay for the “hedging” properties of the contracts—although
the data does not suggest a very strong negative correlation with S&P 500 returns.
[Figure: histogram of return on (synthetic) variance swaps.]
Bibliography
Gatheral, J., 2006, The volatility surface: a practitioner’s guide, Wiley.
18 Dynamic Portfolio Choice
More advanced material is denoted by a star (*). It is not required reading.
Suppose the investor wants to choose portfolio weights ($v_t$) to maximize expected utility, that is, to solve
$$\max_{v_t} \text{E}_t u(W_{t+q}), \tag{18.1}$$
where $\text{E}_t$ denotes the expectations formed today, $u()$ is a utility function and $W_{t+q}$ is the wealth (in real terms) at time $t+q$.
This is a standard (static) problem if the investor cannot (or it is too costly to) rebalance
the portfolio. (In some cases this leads to a mean-variance portfolio, in other cases not.)
If the distribution of assets returns is iid, then the portfolio choice is unchanged over
time—otherwise it changes. For instance, with mean-variance preferences, the tangency
portfolio changes as the expected returns and/or the covariance matrix do.
Instead, if the investor can rebalance the portfolio in every time period (t C 1; :::; t C
q 1), then this is a truly dynamic problem—which is typically more difficult to solve.
However, when the utility function has constant relative risk aversion (CRRA) and returns
are iid, then we know that the optimal portfolio weights are constant across time and
independent of the investment horizon (q). We can then solve this as a standard static
problem. The intuition for this result is straightforward: CRRA utility implies that the
portfolio weights are independent of the wealth of the investor and iid returns imply that
the outlook from today is the same as the outlook from yesterday, except that the investor
might have gotten richer or poorer. (The same result holds if the objective function instead
is to maximize the utility from stream of consumption, but with a CRRA utility function.)
With non-iid returns (predictability or time-varying volatility), the optimization is typ-
ically much more complicated. The next few sections present a few cases that we can
handle.
18.2 Optimal Portfolio Choice: Logarithmic Utility and Non-iid Returns
Let the objective in period $t$ be to maximize the expected log wealth in some future period
$$\max \text{E}_t \ln W_{t+q} = \max(\ln W_t + \text{E}_t r_{t+1} + \text{E}_t r_{t+2} + \ldots + \text{E}_t r_{t+q}), \tag{18.2}$$
where $r_t$ is the log return, $r_t = \ln(1 + R_t)$ where $R_t$ is a net return. The investor can rebalance the portfolio weights every period.
Since the returns in the different periods enter separably, the best an investor can do in period $t$ is to choose a portfolio that solves
$$\max \text{E}_t r_{t+1}. \tag{18.3}$$
That is, to choose the one-period growth-optimal portfolio. But, a short run investor who maximizes $\text{E}_t \ln[W_t(1 + R_{t+1})]$, that is, who solves $\max(\ln W_t + \text{E}_t r_{t+1})$, will choose the same portfolio, so there is no horizon effect. However, the portfolio choice may change over time, if the distribution of the returns does. (The same result holds if the objective function instead is to maximize the utility from a stream of consumption, but with a logarithmic utility function.)
In dynamic portfolio choice models it is often more convenient to work with logarithmic
portfolio returns (since they are additive across time). This has a drawback, however, on
the portfolio formation stage: the logarithmic portfolio return is not a linear function of the
logarithmic returns of the assets in the portfolio. Therefore, we will use an approximation
(which gets more and more precise as the length of the time interval decreases).
If there is only one risky asset and one riskfree asset, then $R_{pt} = vR_t + (1-v)R_{ft}$. Let $r_{it} = \ln(1 + R_{it})$ denote the log return. Campbell and Viceira (2002) approximate the log portfolio return by
$$r_{pt} \approx r_{ft} + v(r_t - r_{ft}) + v\sigma^2/2 - v^2\sigma^2/2, \tag{18.4}$$
where $\sigma^2$ is the conditional variance of $r_t$. (That is, $\sigma^2$ is the variance of $u_t$ in $r_t = \text{E}_{t-1} r_t + u_t$.) Instead, if we let $r_t$ denote an $n \times 1$ vector of risky log returns and $v$ the portfolio weights, then the multivariate version is
$$r_{pt} \approx r_{ft} + v'(r_t - r_{ft}) + v'\sigma^2/2 - v'\Sigma v/2, \tag{18.5}$$
where $\Sigma$ is the $n \times n$ covariance matrix of $r_t$ and $\sigma^2$ is the $n \times 1$ vector of the variances (that is, the diagonal elements of that covariance matrix). The portfolio weights, variances and covariances could be time-varying (and should then perhaps carry time subscripts).
Proof. (of (18.4)) The portfolio return $R_p = vR_1 + (1-v)R_f$ can be used to write
$$\frac{1 + R_p}{1 + R_f} = 1 + v\left(\frac{1 + R_1}{1 + R_f} - 1\right).$$
The logarithm is
$$r_p - r_f = \ln\{1 + v[\exp(r_1 - r_f) - 1]\}.$$
The function $f(x) = \ln\{1 + v[\exp(x) - 1]\}$ has the following derivatives (evaluated at $x = 0$): $df(x)/dx = v$ and $d^2 f(x)/dx^2 = v(1-v)$, and notice that $f(0) = 0$. A second order Taylor approximation of the log portfolio return around $r_1 - r_f = 0$ is then
$$r_p - r_f = v(r_1 - r_f) + \frac{1}{2} v(1-v)(r_1 - r_f)^2.$$
In a continuous time model, the square would equal its expectation, Var.r1 /, so this further
approximation is used to give (18.4). (The proof of (18.5) is just a multivariate extension
of this.)
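The quality of the second order Taylor approximation in the proof can be checked numerically by comparing it with the exact log portfolio return for a few excess returns. The portfolio weight and the grid of returns in this sketch are arbitrary illustrative values.

```python
import math


def exact_log_excess_return(v, x):
    """Exact r_p - r_f when the risky asset has log excess return x = r_1 - r_f."""
    return math.log(1.0 + v * (math.exp(x) - 1.0))


def taylor_log_excess_return(v, x):
    """Second order Taylor approximation around x = 0."""
    return v * x + 0.5 * v * (1.0 - v) * x**2


v = 0.6  # illustrative portfolio weight on the risky asset
for x in (-0.10, -0.02, 0.0, 0.02, 0.10):
    exact = exact_log_excess_return(v, x)
    approx = taylor_log_excess_return(v, x)
    print(x, exact, approx, exact - approx)
```

The error is of third order in $x$, so it is tiny for daily or weekly returns and grows for larger returns.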
The objective is to maximize the (conditional) expected value of the portfolio return as in (18.3). When there is one risky asset and a riskfree asset, then the portfolio return is given by the approximation (18.4). To simplify the notation a bit, let $\mu^e_{t+1}$ be the conditional expected excess return $\text{E}_t(r_{t+1} - r_{f,t+1})$ and let $\sigma^2_{t+1}$ be the conditional variance ($\text{Var}_t(r_{t+1})$). Notice that these moments are conditional on the information in $t$ (when the portfolio decision is made) but refer to the returns in $t+1$.
The optimization problem is then
$$\max_{v_t} v_t \mu^e_{t+1} + v_t \sigma^2_{t+1}/2 - v_t^2 \sigma^2_{t+1}/2, \text{ with the solution} \tag{18.6}$$
$$v_t = \frac{\mu^e_{t+1} + \sigma^2_{t+1}/2}{\sigma^2_{t+1}}, \tag{18.7}$$
which is very similar to a mean-variance portfolio choice. Clearly, the weight on the risky asset will change over time if the expected excess return and/or the volatility does. We could think of the portfolio with $v_t$ of the risky asset and $1 - v_t$ of the riskfree asset as a managed portfolio.
Example 18.1 (Portfolio weight, single risky asset) Suppose $\mu^e_{t+1} = 0.05$ and $\sigma^2_{t+1} = 0.15$, then we have $v_t = (0.05 + 0.15/2)/0.15 = 5/6 \approx 0.83$.
With many risky assets, the optimization problem is to maximize the expected value of (18.5). The optimal $n \times 1$ vector of portfolio weights is then
$$v_t = \Sigma_{t+1}^{-1}(\mu^e_{t+1} + \sigma^2_{t+1}/2), \tag{18.8}$$
where $\Sigma_{t+1}$ is the conditional covariance matrix ($\text{Cov}_t(r_{t+1})$) and $\sigma^2_{t+1}$ the $n \times 1$ vector of conditional variances. The weight on the riskfree asset is the remainder ($1 - \mathbf{1}'v_t$, where $\mathbf{1}$ is a vector of ones).
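The portfolio weights in Example 18.1 and in (18.8) can be computed as in the following sketch. To keep it self-contained, the multivariate case is restricted to two risky assets so that the covariance matrix can be inverted by hand; the moments used for the two-asset case are made-up numbers and the function names are illustrative.

```python
def weight_single_asset(mu_e, sigma2):
    """Weight on the risky asset with one risky asset: (mu_e + sigma2/2)/sigma2."""
    return (mu_e + sigma2 / 2) / sigma2


def weights_two_assets(mu_e, cov):
    """Weights from (18.8) for two risky assets; cov is a 2x2 nested list."""
    # right hand side: expected excess returns plus half the variances
    b = [mu_e[0] + cov[0][0] / 2, mu_e[1] + cov[1][1] / 2]
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    # explicit inverse of a 2x2 matrix applied to b
    v1 = (cov[1][1] * b[0] - cov[0][1] * b[1]) / det
    v2 = (-cov[1][0] * b[0] + cov[0][0] * b[1]) / det
    return [v1, v2]


# Example 18.1
print("single risky asset:", weight_single_asset(0.05, 0.15))

# two risky assets, made-up moments (assumptions)
mu_e = [0.05, 0.06]
cov = [[0.04, 0.01],
       [0.01, 0.09]]
v = weights_two_assets(mu_e, cov)
print("risky weights:", v, " riskfree weight:", 1 - sum(v))
```

A riskfree weight below zero means that the investor borrows to finance the positions in the risky assets.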
Proposition 18.2 If the log returns are normally distributed, then (18.8) gives a portfolio
on the mean-variance frontier of returns (not of log returns).
Figures 18.1–18.2 illustrate mean returns and standard deviations, estimated by exponentially weighted moving averages (as by RiskMetrics). Figures 18.3–18.4 show how the optimal portfolio weights change (assuming mean-variance preferences). It is clear that the portfolio weights change very dramatically, perhaps too much to be realistic. The portfolio weights seem to be particularly sensitive to movements in the average returns, which is potentially a problem since the averages are often considered to be more difficult to estimate (with good precision) than the covariance matrix.
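To make the estimation approach concrete, here is a minimal sketch of recursive exponentially weighted estimates of means and covariances, assuming NumPy. The function name `ewma_moments`, the initialisation, and the decay factor of 0.94 are assumptions for illustration; the notes do not state which settings were used for the figures.

```python
import numpy as np

def ewma_moments(x, lam=0.94):
    """Recursive EWMA estimates of the mean vector and covariance matrix.

    x   : (T, n) array of (excess) returns
    lam : decay factor (0.94 is an assumed value, not taken from the notes)
    Returns arrays of shape (T, n) and (T, n, n).
    """
    x = np.asarray(x, dtype=float)
    T, n = x.shape
    means = np.empty((T, n))
    covs = np.empty((T, n, n))
    m = x[0].copy()          # initialise the mean at the first observation
    S = np.zeros((n, n))     # initialise the covariance matrix at zero
    for t in range(T):
        dev = x[t] - m       # deviation from the previous mean estimate
        S = lam * S + (1 - lam) * np.outer(dev, dev)
        m = lam * m + (1 - lam) * x[t]
        means[t], covs[t] = m, S
    return means, covs
```

Feeding `means[t]` and `covs[t]` into the weight formula period by period gives dynamically updated portfolio weights; a lower `lam` makes the estimates, and hence the weights, react faster to new data.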
[Figure 18.1: Mean excess returns (annualized) over time for the Cnsmr, Manuf, HiTec, Hlth and Other industry portfolios.]
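Proof. (of (18.8)) The expected portfolio return is approximately
\[
\mathrm{E}\, r_p \approx r_f + v'\mu^e + v'\sigma^2/2 - v'\Sigma v/2,
\]
so the first order conditions are
\[
\mu^e + \sigma^2/2 - \Sigma v = 0_{n \times 1}.
\]
Solve for $v$.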
Proof. (of Proposition 18.2) First, notice that if the log return $r_t$ in (18.5) is normally distributed, then so is the log portfolio return ($r_{pt}$). Second, recall that if $\ln y \sim N(\mu, \sigma^2)$, then $\mathrm{E}\,y = \exp(\mu + \sigma^2/2)$ and $\mathrm{Std}(y)/\mathrm{E}\,y = \sqrt{\exp(\sigma^2) - 1}$, so that
\[
\mu = \ln \mathrm{E}\,y - \sigma^2/2 = \ln \mathrm{E}\,y - \ln[1 + \mathrm{Var}(y)/(\mathrm{E}\,y)^2]/2,
\]
which is increasing in $\mathrm{E}\,y$ and decreasing in $\mathrm{Var}(y)$. To prove the statement, notice that $y$ corresponds to the gross return and $\ln y$ to the log return, so $\mu$ corresponds to $\mathrm{E}_t r_{p,t+1}$. Clearly, $\mu$ is increasing in $\mathrm{E}\,y$ and decreasing in $\mathrm{Var}(y)$, so the solution will be on the MV frontier of the (gross and net) portfolio return.

[Figure 18.2: Standard deviations (annualized) over time for the Cnsmr, Manuf, HiTec, Hlth and Other industry portfolios.]
[Figure 18.3 plot: portfolio weights over time for Cnsmr and Manuf, each panel with a "fixed mean" and a "fixed cov" series.]
Figure 18.3: Dynamically updated portfolio weights, T-bill and 5 U.S. industries
18.2.4 A Simple Example with Time-Varying Expected Returns (Log Utility and
Non-iid Returns)
A particularly simple case is when the expected excess returns are linear functions of some information variables in the ($k \times 1$) vector $z_t$
\[
\mu^e_{t+1} = a + b z_t, \tag{18.9}
\]
at the same time as the variances and covariances are constant. In this expression, $a$ is an $n \times 1$ vector and $b$ is an $n \times k$ matrix. Assuming that the information variables have zero means turns out to be convenient later on, but it is not a restriction (since the means are
captured by a). The information variables could perhaps be the slope of the yield curve
[Figure 18.4 plot: portfolio weights over time for Other and for the riskfree asset, each panel with a "fixed mean" and a "fixed cov" series.]
Figure 18.4: Dynamically updated portfolio weights, T-bill and 5 U.S. industries
See Figure 18.5 for an illustration (based on Example 18.3). The figure shows the basic properties for the returns, the optimal portfolios and their location in a traditional mean-std figure. In this example, $z_t$ can only take on two different values with equal probability: $-1$ or $1$. The figure shows one mean-variance frontier for each state, and the portfolio is clearly on them. However, the portfolio is not on the unconditional mean-variance frontier (where the means and covariance matrix are calculated by using both states).
Example 18.3 (Dynamic portfolio weights when $z_t$ is a scalar that only takes on the values $-1$ and $1$, with equal probabilities) The expected excess returns are
\[
\mu^e_{t+1} =
\begin{cases}
a - b & \text{when } z_t = -1 \\
a + b & \text{when } z_t = 1.
\end{cases}
\]
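Example 18.4 (Numerical values for Example 18.3). Suppose we have three assets with
\[
\mathrm{Cov}\left(\begin{bmatrix} r_1 \\ r_2 \\ r_3 \end{bmatrix}\right) =
\begin{bmatrix} 1.19 & 0.32 & 0.24 \\ 0.32 & 0.81 & 0.02 \\ 0.24 & 0.02 & \cdots \end{bmatrix}/100,
\]
and
\[
\mu^e_{-1} = \begin{bmatrix} -0.41 \\ -0.29 \\ -0.07 \end{bmatrix}/100
\quad\text{and}\quad
\mu^e_{1} = \begin{bmatrix} 0.63 \\ 0.43 \\ 0.21 \end{bmatrix}/100.
\]
In this case, the portfolio weights are
\[
v_{-1} \approx \begin{bmatrix} 0.112 \\ 0.094 \\ 0.065 \end{bmatrix}
\quad\text{and}\quad
v_{1} \approx \begin{bmatrix} 0.709 \\ 0.736 \\ 0.610 \end{bmatrix}.
\]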
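Example 18.5 (Details on Figure 18.5) To transfer from the log returns to the mean and std of net returns, the following result is used: if the vector $x \sim N(\mu, \Sigma)$, where $\sigma_{ij}$ denotes element $(i,j)$ of $\Sigma$, and $y = \exp(x)$, then $\mathrm{E}\,y_i = \exp(\mu_i + \sigma_{ii}/2)$ and $\mathrm{Cov}(y_i, y_j) = \exp[\mu_i + \mu_j + (\sigma_{ii} + \sigma_{jj})/2]\,[\exp(\sigma_{ij}) - 1]$.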
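The lognormal result in Example 18.5 translates directly into code. This is a minimal sketch, assuming NumPy; the function name `lognormal_moments` is hypothetical. Since net returns equal gross returns minus one, the mean of the net return is `Ey - 1` while the covariance matrix is unchanged.

```python
import numpy as np

def lognormal_moments(mu, Sigma):
    """Mean vector and covariance matrix of y = exp(x), where x ~ N(mu, Sigma)."""
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    Ey = np.exp(mu + np.diag(Sigma) / 2)            # E y_i = exp(mu_i + sigma_ii/2)
    # exp(mu_i + mu_j + (sigma_ii + sigma_jj)/2) equals E y_i * E y_j
    Cov_y = np.outer(Ey, Ey) * (np.exp(Sigma) - 1)
    return Ey, Cov_y
```

The standard deviations of the net returns are then `np.sqrt(np.diag(Cov_y))`.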
[Figure: left panel "MV frontiers of basic assets in different states", right panel "MV frontier from unconditional moments"; axes: Std of net return, % (horizontal) and Mean of net return, % (vertical).]
An important feature of the portfolio choice based on the logarithmic utility function is
that it is myopic in the sense that it only depends on the distribution of next period’s return,
not on the distribution of returns further into the future. Hence, short-run and long-run
investors choose the same portfolios—as discussed before. This property is special to the
logarithmic utility function.
With a utility function with a constant relative risk aversion (CRRA) different from one, today’s portfolio choice would also depend on the distribution of returns in $t+2$ and onwards. In particular, it would depend on how the (random) returns in $t+1$ are correlated with changes (in $t+1$) of expected returns and volatilities of returns in $t+2$ and onwards. This is intertemporal hedging.
In this case, the optimization problem is tricky, so I will illustrate it by using a simple model. As in Campbell and Viceira (1999), suppose there is only one risky asset and let the (scalar) information variable be an AR(1)
\[
z_{t+1} = \phi z_t + \eta_{t+1}. \tag{18.14}
\]
In addition, I assume that the expected return follows (18.9) but with $b = 1$ (to simplify the algebra)
\[
\mu^e_{t+1} = a + z_t. \tag{18.15}
\]
Combine the time series processes (18.14) and (18.15) to get the following expression for the excess return
\[
r^e_{t+1} = r_{t+1} - r_f = a + z_t + u_{t+1}. \tag{18.16}
\]

[Figure 18.6 plot: impulse responses over future periods 0 to 9.]
Figure 18.6: Average impulse response of the return to changes in $u_0$, two different cases.
This shows that the unconditional autocovariance of the return can be considerable at the same time as the conditional autocovariance may be much smaller. It is the latter that matters for the portfolio choice. For instance, it is possible that the unconditional autocovariance is zero (in line with empirical evidence), while the conditional covariance is negative.
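A small numerical sketch of this point. It assumes the model $r^e_{t+1} = a + z_t + u_{t+1}$ with $z_{t+1} = \phi z_t + \eta_{t+1}$, where $|\phi| < 1$ and the shocks are uncorrelated over time, so that the unconditional first-order autocovariance of the excess return is $\phi\,\mathrm{Var}(z_t) + \mathrm{Cov}(u_{t+1}, \eta_{t+1})$ and the conditional one is $\mathrm{Cov}(u_{t+1}, \eta_{t+1})$. The function name and the parameter values are hypothetical.

```python
def return_autocovariances(phi, var_eta, cov_u_eta):
    """First-order autocovariance of the excess return in the AR(1) signal model.

    Returns (unconditional, conditional), using Var(z) = var_eta / (1 - phi**2),
    which requires |phi| < 1.
    """
    var_z = var_eta / (1 - phi**2)
    unconditional = phi * var_z + cov_u_eta
    conditional = cov_u_eta
    return unconditional, conditional

# Hypothetical values chosen so that the two effects cancel unconditionally
phi, var_eta = 0.9, 0.0001
cov_u_eta = -phi * var_eta / (1 - phi**2)
uncond, cond = return_autocovariances(phi, var_eta, cov_u_eta)
```

In this parameterisation the returns look serially uncorrelated in the data, although a positive return surprise still signals lower expected returns ahead.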
Figure 18.6 shows the impulse response function (the forecast based on current infor-
mation) of a shock to the temporary part of the return (u) under two different assumptions
about how this temporary part is correlated with the mean return for the next period re-
turn. When they are uncorrelated, then a shock to the temporary part of the return is just
a “blip.” In contrast, when today’s return surprise indicates poor future returns (a negative
covariance), then the impulse response function is positive (unity) in the initial period, but
then negative for a prolonged period (since the mean return is autocorrelated).
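Proof. (of (18.17)–(18.18)) The unconditional covariance is
\[
\mathrm{Cov}(r^e_{t+1}, r^e_{t+2}) = \mathrm{Cov}(z_t + u_{t+1}, \phi z_t + \eta_{t+1} + u_{t+2}) = \phi\,\mathrm{Var}(z_t) + \sigma_{u\eta},
\]
and the conditional covariance is
\[
\mathrm{Cov}_t(r^e_{t+1}, r^e_{t+2}) = \mathrm{Cov}_t(u_{t+1}, \eta_{t+1} + u_{t+2}) = \sigma_{u\eta},
\]
since $z_t$ is known in $t$.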
To solve the maximization problem, notice that if the log portfolio return, $r_p = \ln(1 + R_p)$, is normally distributed, then maximizing $\mathrm{E}(1 + R_p)^{1-\gamma}/(1-\gamma)$ is equivalent to maximizing
\[
\mathrm{E}\,r_p + (1-\gamma)\mathrm{Var}(r_p)/2, \tag{18.19}
\]
where $r_p$ is the log return of the portfolio (strategy) over the investment horizon (one or several periods, to be discussed below). (See lecture notes for Finance 1 for a proof.)
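A short sketch of why the equivalence holds, using only the standard result that $\mathrm{E}\exp(x) = \exp[\mathrm{E}\,x + \mathrm{Var}(x)/2]$ for a normally distributed $x$ (this is a reminder, not necessarily the proof given in the Finance 1 notes). Since $(1+R_p)^{1-\gamma} = \exp[(1-\gamma)r_p]$,
\[
\frac{\mathrm{E}(1+R_p)^{1-\gamma}}{1-\gamma}
= \frac{\exp\{(1-\gamma)[\mathrm{E}\,r_p + (1-\gamma)\mathrm{Var}(r_p)/2]\}}{1-\gamma}.
\]
For $\gamma < 1$, both the exponential and the denominator are increasing (positive) in the term in square brackets. For $\gamma > 1$, the exponential is decreasing in that term, but the denominator is negative. In both cases, expected utility is increasing in $\mathrm{E}\,r_p + (1-\gamma)\mathrm{Var}(r_p)/2$.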
\[
v_t = \frac{\mu^e_{t+1} + \sigma^2/2}{\gamma\sigma^2} = \frac{a + z_t + \sigma^2/2}{\gamma\sigma^2}, \tag{18.21}
\]
and the weight on the riskfree asset is $1 - v_t$. With $\gamma = 1$ (log utility), we get the same results as in (18.7). With a higher risk aversion, the weight on the risky asset is lower. Clearly, the portfolio choice depends positively on the (signal about the) expected returns. See Figure 18.7 for how the portfolio weight on the risky asset depends on the risk aversion.
[Figure 18.7 plot: weight on the risky asset against risk aversion ($\gamma$, from 1 to 5) for three cases: myopic, 2-period, and 2-period (no rebal). Legend: $\sigma, a, \sigma_{u\eta}, \sigma_\eta$ = 0.40, 0.05, −0.40, 2.00.]
Figure 18.7: Weight on risky asset, two-period investor with CRRA utility and the possibility to rebalance.
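Proof. (of (18.21)). Using the approximation (18.4), we have
\[
\mathrm{E}\,r_p = r_f + v\mu^e + v\sigma^2/2 - v^2\sigma^2/2
\]
\[
\mathrm{Var}(r_p) = v^2\sigma^2.
\]
The first order condition for maximizing (18.19) is
\[
\mu^e + \sigma^2/2 - v\sigma^2 + (1-\gamma)v\sigma^2 = 0.
\]
Solve for $v$.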
The solution (see Appendix) is
\[
v = \frac{a + \sigma^2/2 + (1+\phi)z_t/2}{\gamma\sigma^2 - (1-\gamma)[\mathrm{Var}(\eta_{t+1})/2 + \sigma_{u\eta}]}. \tag{18.23}
\]
Similar to the one-period investor, the weight is increasing in the signal of the average return ($z_t$), but there are also some interesting differences. Even if the utility function is logarithmic ($\gamma = 1$), we do not get the same portfolio choice as for the one-period investor. In particular, the reaction to the signal ($z_t$) is smaller (unless $\phi = 1$). The reason is that in this case, the investor commits to the same portfolio for two periods, and the movements in average returns are assumed to be mean-reverting.

There are also some important patterns on average (when $z_t = 0$). Then, $\gamma = 1$ actually gives the same portfolio choice as for the one-period investor. However, if $\gamma > 1$, and there are important shocks to the expected return, then the two-period investor puts a lower weight on the risky asset (the second term in the denominator tends to be positive). The reason is that the risky asset is more dangerous to the two-period investor: $r_{p,t+2}$ is more risky than $r_{p,t+1}$, since $r_{p,t+2}$ can be hit by more shocks, namely shocks to the expected return of $r_{p,t+2}$. In contrast, if data is iid then those shocks do not exist ($\mathrm{Var}(\eta_{t+1}) = 0$), so the two-period investor makes the same choice as the one-period investor.

One more thing is worth noticing: if $\sigma_{u\eta} < 0$, then the demand for the risky asset is higher than otherwise. This can be interpreted as a case where a temporary positive return leads to lower future (expected) returns. With this sort of mean-reversion in the price level (conditional negative autocorrelation), the risky asset is somewhat less risky to a long-run investor than otherwise. When extended to several risky assets, the result is that there is a higher demand for assets that tend to be negatively correlated with the future general investment outlook. See Figure 18.6 for an illustration of this effect and Figure 18.7 for how the portfolio weight on the risky asset depends on the risk aversion.
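The comparison between the one-period investor and the two-period investor without rebalancing can be checked numerically. The sketch below takes the no-rebalancing weight to be $v = [a + \sigma^2/2 + (1+\phi)z_t/2]/\{\gamma\sigma^2 - (1-\gamma)[\mathrm{Var}(\eta_{t+1})/2 + \sigma_{u\eta}]\}$ and the two-period objective to be the mean plus $(1-\gamma)/2$ times the variance of the two-period log return, as in the Appendix. All function names and parameter values are hypothetical.

```python
def myopic_weight(a, z, sigma2, gamma):
    """One-period CRRA weight: (a + z + sigma2/2) / (gamma * sigma2)."""
    return (a + z + sigma2 / 2) / (gamma * sigma2)

def no_rebalancing_weight(a, z, sigma2, gamma, phi, var_eta, cov_u_eta):
    """Two-period weight when the same portfolio is held in both periods."""
    num = a + sigma2 / 2 + (1 + phi) * z / 2
    den = gamma * sigma2 - (1 - gamma) * (var_eta / 2 + cov_u_eta)
    return num / den

def two_period_objective(v, a, z, sigma2, gamma, phi, var_eta, cov_u_eta):
    """E_t(r_p1 + r_p2) + (1 - gamma) * Var_t(r_p1 + r_p2) / 2, net of 2 * r_f."""
    mean = v * (2 * a + (1 + phi) * z) + v * sigma2 - v**2 * sigma2
    var = v**2 * (2 * sigma2 + var_eta + 2 * cov_u_eta)
    return mean + (1 - gamma) * var / 2

# Hypothetical parameter values, for illustration only
p = dict(a=0.05, z=0.0, sigma2=0.04, gamma=3.0, phi=0.9,
         var_eta=0.001, cov_u_eta=0.0)
v_two = no_rebalancing_weight(**p)
v_one = myopic_weight(p["a"], p["z"], p["sigma2"], p["gamma"])
```

With $\gamma > 1$ and shocks to the expected return, `v_two` is below `v_one`, and small perturbations of `v_two` lower the value of `two_period_objective`.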
It is more reasonable to assume that the two-period investor can rebalance in each period. Rewrite (18.22) as
\[
\mathrm{E}_t r_{p,t+1} + \mathrm{E}_t r_{p,t+2} + (1-\gamma)[\mathrm{Var}_t(r_{p,t+1}) + \mathrm{Var}_t(r_{p,t+2}) + 2\mathrm{Cov}_t(r_{p,t+1}, r_{p,t+2})]/2, \tag{18.24}
\]
and notice that the investor (in $t$) can affect only those terms that involve $r_{p,t+1}$ (as the portfolio will be rebalanced in $t+1$). He/she therefore maximizes
\[
\mathrm{E}_t r_{p,t+1} + (1-\gamma)[\mathrm{Var}_t(r_{p,t+1}) + 2\mathrm{Cov}_t(r_{p,t+1}, r_{p,t+2})]/2. \tag{18.25}
\]
The maximization problem is the same as for a one-period investor (18.20) if returns are iid (so the covariance is zero), or if $\gamma = 1$.
Otherwise, the covariance term will influence the portfolio choice in $t$. The difference to the no-rebalancing case is that the investor in $t$ takes into account that $r_{p,t+2}$ will be generated by a portfolio with the weights of a one-period investor
\[
v_{t+1} = \frac{a + z_{t+1} + \sigma^2/2}{\gamma\sigma^2}. \tag{18.26}
\]
(This is the same as (18.21) but with the time subscripts advanced one period.) This affects both how the signal about future average returns ($z_t$) and the risk are viewed. The solution is (a somewhat messy expression)
\[
v_t = \frac{\mu^e_{t+1} + \sigma^2/2}{\gamma\sigma^2}
+ \frac{1-\gamma}{\gamma}\left(2 - \frac{1}{\gamma}\right)\frac{a + \sigma^2/2 + \phi z_t}{\gamma\sigma^4}\,\sigma_{u\eta}. \tag{18.27}
\]
(The proof is in the Appendix.) See Figure 18.7 for how the portfolio weight on the risky asset depends on the risk aversion and for a comparison with the cases of myopic portfolio choice and no rebalancing.
As before, the portfolio choice depends positively on the expected return (as signalled by $z_t$). But, there are several other results. First, when $\gamma = 1$ (log utility), then the portfolio choice is the same as for the one-period investor (for any value of $z_t$). Second, when $\sigma_{u\eta} = \mathrm{Cov}_t(u_{t+1}, \eta_{t+1}) = 0$, then the second term drops out, so the two-period investor once again picks the same portfolio as the one-period investor does.

Third, $\gamma > 1$ combined with $\sigma_{u\eta} < 0$ increases (on average, $z_t = 0$) the weight on the risky asset, similar to the case without rebalancing. In this case, the second term of (18.27) is positive. That is, there is a positive extra demand (in $t$) for the risky asset: such an asset tends to pay off in $t+1$ (since $u_{t+1} > 0$, which only affects the return in $t+1$, not in subsequent periods) when the overall investment prospects for $t+2$ become worse ($\mu^e_{t+2}$ is low since $\eta_{t+1}$ and thus $z_{t+1}$ tend to be low when $u_{t+1}$ is high). In this case, the return in $t+1$, driven by the temporary shock $u_{t+1}$, partially hedges the investment outlook in $t+1$ (that is, the distribution of the portfolio returns in $t+2$). The key to getting intertemporal hedging is thus that the temporary movements in the return partially offset future movements in the investment outlook.
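The three properties above can be illustrated numerically. The sketch below assumes that the solution with rebalancing has the form $v_t = (\mu^e_{t+1} + \sigma^2/2)/(\gamma\sigma^2) + [(1-\gamma)/\gamma](2 - 1/\gamma)(a + \sigma^2/2 + \phi z_t)\sigma_{u\eta}/(\gamma\sigma^4)$ with $\mu^e_{t+1} = a + z_t$, which is what the first order condition in the Appendix implies. The function names and parameter values are hypothetical.

```python
def myopic_weight(a, z, sigma2, gamma):
    """One-period CRRA weight: (a + z + sigma2/2) / (gamma * sigma2)."""
    return (a + z + sigma2 / 2) / (gamma * sigma2)

def rebalancing_weight(a, z, sigma2, gamma, phi, cov_u_eta):
    """Weight in t for a two-period investor who rebalances in t+1."""
    hedging = ((1 - gamma) / gamma) * (2 - 1 / gamma) \
        * (a + sigma2 / 2 + phi * z) * cov_u_eta / (gamma * sigma2**2)
    return myopic_weight(a, z, sigma2, gamma) + hedging

# Hypothetical parameter values: gamma > 1 and a negative shock covariance
v_myopic = myopic_weight(a=0.05, z=0.0, sigma2=0.04, gamma=3.0)
v_hedge = rebalancing_weight(a=0.05, z=0.0, sigma2=0.04, gamma=3.0,
                             phi=0.9, cov_u_eta=-0.001)
```

The extra term is the intertemporal hedging demand: it is zero for log utility or uncorrelated shocks, and positive here since `v_hedge` exceeds `v_myopic`.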
While this simplified case only uses one risky asset, it is important to understand that this intertemporal hedging is not about a particular asset hedging the changes in its own return distribution. Indeed, if the outlook for a particular asset becomes worse, the investor could always switch out of it. Instead, the key effect depends on how a particular asset hedges the movements in tomorrow’s optimal portfolio, that is, tomorrow’s overall investment outlook.
Remark 18.6 (How to estimate (18.14) and (18.16)). First, regress the excess returns on some information variables $z^*_t$: $r_{t+1} - r_f = a + b z^*_t + u_{t+1}$. Second, define $z_t = b(z^*_t - \mathrm{E}\,z^*_t)$. Then, a regression of the return on $z_t$ gives a slope coefficient of one as in (18.16). Third, estimate an AR(1) on $z_t$ as in (18.14). Fourth and finally, estimate the covariance matrix of the residuals from the last two regressions.
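The four steps of the remark can be sketched as follows, assuming NumPy and ordinary least squares. The function names, the choice to fit the AR(1) without an intercept (the signal is demeaned), and the simulated data are all assumptions for illustration.

```python
import numpy as np

def ols(y, X):
    """OLS of y on a constant and X; returns (coefficients, residuals)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef, y - X1 @ coef

def estimate_return_model(re, x):
    """re: length-T excess returns; x: (T, k) raw information variables.

    The return in t+1 is explained by the information variables in t.
    """
    re = np.asarray(re, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(re), -1)
    y, x_lag = re[1:], x[:-1]
    coef, _ = ols(y, x_lag)                      # step 1: returns on raw variables
    z = (x - x.mean(axis=0)) @ coef[1:]          # step 2: scaled, demeaned signal
    coef_z, u = ols(y, z[:-1])                   # slope on z is one by construction
    phi = (z[:-1] @ z[1:]) / (z[:-1] @ z[:-1])   # step 3: AR(1) without intercept
    eta = z[1:] - phi * z[:-1]
    cov = np.cov(np.vstack([u, eta]))            # step 4: 2x2 residual covariance
    return dict(a=coef_z[0], slope=coef_z[1], phi=phi, cov=cov)

# Simulated data with hypothetical parameter values
rng = np.random.default_rng(1)
T = 2000
x = np.zeros(T)
for t in range(1, T):
    x[t] = 0.8 * x[t - 1] + rng.normal(scale=0.1)
re = np.empty(T)
re[0] = 0.0
re[1:] = 0.01 + 0.5 * x[:-1] + rng.normal(scale=0.2, size=T - 1)
est = estimate_return_model(re, x)
```

The diagonal of `est["cov"]` estimates the variances of $u$ and $\eta$, and the off-diagonal element estimates their covariance.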
However, without restrictions on $v(z)$ it is impossible to sort out what sort of strategies would be assigned neutral performance by a particular (multi-factor) model. Therefore, assume that $v(z)$ are linear in the $K$ information variables
\[
v(z_{t-1}) = \underbrace{d}_{N \times K}\,\underbrace{z_{t-1}}_{K \times 1} \tag{18.29}
\]
for any $N \times K$ matrix $d$. For instance, when the expected returns are driven by the information variables $z_t$ as in (18.9), then the optimal portfolio weights (for an investor with logarithmic preferences) are linear functions of the information variables as in (18.11) or (18.13).
It is clear that the portfolio return (18.28)–(18.29) can be written
\[
\begin{aligned}
R_{pt} &= R^{e\prime}_t v(z_{t-1}) + R_f \\
       &= R^{e\prime}_t d z_{t-1} + R_f \\
       &= (\mathrm{vec}\, d)'(z_{t-1} \otimes R^e_t) + R_f.
\end{aligned} \tag{18.30}
\]
Proof. (of (18.30)) Recall the rule that $\mathrm{vec}(ABC) = (C' \otimes A)\,\mathrm{vec}\,B$. Here, notice that $R^{e\prime} d z$ is a scalar, so we can use the rule to write $R^{e\prime} d z = (z' \otimes R^{e\prime})\,\mathrm{vec}\,d$. Transpose and recall the rule $(D \otimes E)' = D' \otimes E'$ to get $(\mathrm{vec}\,d)'(z \otimes R^e)$.
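The identity is easy to verify numerically. This is a minimal check, assuming NumPy and randomly drawn (hypothetical) values for $d$, $z$ and $R^e$. Note that vec stacks the columns of $d$, which corresponds to column-major (`order="F"`) flattening.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 4, 3                       # hypothetical numbers of assets and instruments
d = rng.normal(size=(N, K))       # weight coefficients
z = rng.normal(size=K)            # information variables
Re = rng.normal(size=N)           # excess returns

lhs = Re @ d @ z                  # Re' d z
vec_d = d.flatten(order="F")      # vec(d): columns stacked on top of each other
rhs = vec_d @ np.kron(z, Re)      # (vec d)' (z kron Re)
```

Row-major flattening (NumPy's default) would instead pair with `np.kron(Re, z)`, so the ordering matters when building the managed portfolio returns.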
This shows that the portfolio return can involve any linear combination of $z \otimes R^e$, so the new return space is defined by these new managed portfolios. We can therefore think of the returns
\[
\tilde{R}_t = (z_{t-1} \otimes R^e_t) + R_f \tag{18.31}
\]
as the returns on new assets, which can be used to define, for instance, mean-variance frontiers.
It is not self-evident how to measure the performance of a portfolio in this case. It could, for instance, be argued that the return of the dynamic part of the portfolio is to be considered non-neutral performance. After all, this part exploits the information in the information variables $z$, which is potentially better than keeping a fixed portfolio. In this case, the alpha from a traditional CAPM regression
\[
R^e_{pt} = \alpha + \beta R^e_{mt} + \varepsilon_{it} \tag{18.32}
\]
is a good measure of performance.
On the other hand, it may also be argued that a dynamic trading rule that investors can easily implement themselves should be assigned neutral performance. This can be done by changing the “benchmark” portfolio from being just the market portfolio to include managed portfolios. As an example, we could use the intercept from the following “dynamic CAPM” (or “conditional CAPM”) as a measurement of performance
\[
\begin{aligned}
R^e_{pt} &= \alpha + (\beta + \gamma z_{t-1}) R^e_{mt} + \varepsilon_t \\
         &= \alpha + \beta R^e_{mt} + \gamma z_{t-1} R^e_{mt} + \varepsilon_t,
\end{aligned} \tag{18.33}
\]
where the $\gamma z_{t-1} R^e_{mt}$ term is the dynamic benchmark that captures the effect of time-varying portfolio weights. In fact, (18.33) would assign neutral performance ($\alpha = 0$) to any pure “market timing” portfolio (constant relative weights in the sub portfolio of risky assets, but where the split between riskfree and risky assets changes).
\[
R^e_{pt} = \alpha + \beta f_t + \gamma(z_{t-1} \otimes f_t) + \varepsilon_t,
\]
where $f_t$ is a vector of factors (excess returns on some portfolios) and $\otimes$ is the Kronecker product.
To connect the performance evaluation in (18.32) and (18.33) to the optimal dynamic portfolio strategy (18.13), suppose the optimal strategy is a pure “market timing” portfolio. This happens when the expected returns (18.9) are modelled with
\[
b = c(a + \sigma^2/2), \tag{18.34}
\]
where $c$ is some scalar constant, while $a$ and $\sigma^2$ are vectors. This gives the portfolio weights (18.13)
\[
v_{t-1} = \theta + \underbrace{\theta c z_{t-1}}_{\omega_{t-1}} = \theta(1 + c z_{t-1}), \tag{18.35}
\]
where $\theta$ is defined in (18.13). There are constant relative weights in the sub portfolio of risky assets, but the split between the risky assets (the vector $v_{t-1}$) and riskfree (the scalar $1 - \mathbf{1}'v_{t-1}$) changes as $z_{t-1}$ does: market timing.

Proof. (of (18.35)) Use $b = c(a + \sigma^2/2)$ from (18.34) in (18.13)
\[
\theta = \Sigma^{-1}(a + \sigma^2/2)
\]
\[
\omega_t = \Sigma^{-1}(a + \sigma^2/2) c z_t = \theta c z_t.
\]

The excess return on the portfolio is then
\[
R^e_{pt} = \theta' R^e_t (1 + c z_{t-1}). \tag{18.36}
\]

First, consider using the intercept ($\alpha$) from the CAPM regression (18.32) as a measure of performance. If the market portfolio is the tangency portfolio (for instance, we could assume that the rest of the market does static MV optimization so the market equilibrium satisfies CAPM), then the static part of the return (18.36), $\theta' R^e_t$, will be assigned neutral performance. The dynamic part, $\theta' c z_{t-1} R^e_t$, is different: it is like the return on a new asset, which does not satisfy CAPM. It is therefore likely to be assigned a non-neutral performance.

Second, consider using the intercept from the dynamic CAPM regression (18.33) as a measure of performance. As before, the static part of the return should be assigned neutral performance (as the market/tangency portfolio is one of the regressors). In this case, also the dynamic part of the portfolio is likely to be assigned neutral performance (or close to it). This is certainly the case when the static portfolio weights, $\theta$, are proportional to the weights in the market portfolio. Then, the $z_{t-1} R^e_{mt}$ term in the dynamic CAPM regression (18.33) exactly matches the $\theta' R^e_t z_{t-1}$ part of the return of the dynamic strategy (18.36).
See Figure 18.5 for an illustration (based on Example 18.3). Since the portfolio is not on the unconditional mean-variance frontier, it does not have a zero alpha when regressed against the tangency (as a proxy for the “market”) portfolio. (All the basic assets do, by construction, have zero alphas.) However, it does have a zero alpha when regressed on ($R_m$, $zR_m$).
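A small simulation can illustrate the difference between the two regressions. The sketch below assumes NumPy, a single "market" excess return whose mean depends on a lagged signal that is $-1$ or $1$ with equal probability, and a pure market timing portfolio with excess return $(1 + c z_{t-1})R^e_{mt}$. All names and parameter values are hypothetical.

```python
import numpy as np

def ols_coefs(y, X):
    """OLS of y on a constant and X; returns the coefficient vector."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
T, a, b, c = 20000, 0.005, 0.01, 1.0           # hypothetical parameter values
z_lag = rng.choice([-1.0, 1.0], size=T)        # signal known one period earlier
Rm = a + b * z_lag + rng.normal(scale=0.04, size=T)   # market excess return
Rp = (1 + c * z_lag) * Rm                      # market timing portfolio

# Static CAPM regression: the timing gains show up as a positive alpha
alpha_static = ols_coefs(Rp, Rm)[0]

# Dynamic CAPM regression: constant, Rm and z*Rm as regressors
alpha_dyn, beta_dyn, gamma_dyn = ols_coefs(Rp, np.column_stack([Rm, z_lag * Rm]))
```

Since the timing portfolio is an exact linear combination of $R^e_{mt}$ and $z_{t-1}R^e_{mt}$, the dynamic regression fits perfectly and its alpha is zero up to rounding, whereas the static regression attributes the gains from timing to alpha.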
However, dynamic portfolio choices that are more complicated than the market timing strategy in (18.35) would not necessarily be assigned neutral performance in (18.33). Such strategies could also be assigned a neutral performance, though, if we augmented the number of benchmarks to properly capture the time-varying portfolio weights. In this case, this would require using $z_{t-1} \otimes R^e_t$ (where $R^e_t$ are the returns on the original assets) as the regressors
\[
R^e_{pt} = \alpha + \beta R^e_{mt} + \gamma(z_{t-1} \otimes R^e_t) + \varepsilon_t. \tag{18.37}
\]

[Figure 18.9 plot: left panel "MV frontiers of basic assets in different states", right panel "MV frontier from unconditional moments"; axes: Std of net return, % (horizontal) and Mean of net return, % (vertical).]
Figure 18.9: Portfolio choice, two different states where market timing is optimal
With those benchmarks all strategies where the portfolio weights on the original assets are linear in $z_{t-1}$ would be assigned neutral performance. In practice, evaluations of mutual funds typically define a small number (perhaps 5) of returns and even fewer instruments (perhaps 2–3). The instruments are typically inspired by the literature on return predictability and often include the slope of the yield curve, the dividend yield or lagged returns.

Figure 18.9 illustrates the case when the portfolio has a zero alpha against ($R_m$, $zR_m$), while Figure 18.10 shows a case when the portfolio does not.
[Figure 18.10 plot: left panel "MV frontiers of basic assets in different states", right panel "MV frontier from unconditional moments"; axes: Std of net return, % (horizontal) and Mean of net return, % (vertical).]
Figure 18.10: Portfolio choice, two different states where market timing is not fully optimal
A Some Proofs
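Proof. (of (18.23)) (This proof is a bit crude, but probably correct....) The objective is to maximize (18.24). Using (18.4) we have
\[
r_{p,t+1} \approx r_f + v r^e_{t+1} + v\sigma^2/2 - v^2\sigma^2/2
\]
\[
r_{p,t+2} \approx r_f + v r^e_{t+2} + v\sigma^2/2 - v^2\sigma^2/2,
\]
so
\[
r_{p,t+1} + r_{p,t+2} \approx 2r_f + v(r^e_{t+1} + r^e_{t+2}) + v\sigma^2 - v^2\sigma^2.
\]
The expected value is therefore $2r_f + v(\mu^e_{t+1} + \mathrm{E}_t\mu^e_{t+2}) + v\sigma^2 - v^2\sigma^2$, so the derivative with respect to $v$ is
\[
\frac{\partial\, \mathrm{E}_t(r_{p,t+1} + r_{p,t+2})}{\partial v} = \mu^e_{t+1} + \mathrm{E}_t\mu^e_{t+2} + \sigma^2 - 2v\sigma^2. \tag{foc1}
\]
The variance of the two-period return is
\[
\mathrm{Var}_t(r_{p,t+1} + r_{p,t+2}) = v^2\,\mathrm{Var}_t(r^e_{t+1} + r^e_{t+2}),
\]
so the derivative is
\[
\frac{\partial\, \mathrm{Var}_t(r_{p,t+1} + r_{p,t+2})}{\partial v} = 2v\,\mathrm{Var}_t(r^e_{t+1} + r^e_{t+2}). \tag{foc2}
\]
Combine (foc1) and (foc2) to get the first order condition
\[
\mu^e_{t+1} + \mathrm{E}_t\mu^e_{t+2} + \sigma^2 - 2v\sigma^2 + (1-\gamma)v\,\mathrm{Var}_t(r^e_{t+1} + r^e_{t+2}) = 0,
\]
which gives
\[
v = \frac{\mu^e_{t+1} + \mathrm{E}_t\mu^e_{t+2} + \sigma^2}{2\sigma^2 - (1-\gamma)\mathrm{Var}_t(r^e_{t+1} + r^e_{t+2})}.
\]
Recall that
\[
\mu^e_{t+1} = a + z_t
\]
\[
\mathrm{E}_t\mu^e_{t+2} = a + \mathrm{E}_t z_{t+1} = a + \phi z_t, \text{ so}
\]
\[
\mu^e_{t+1} + \mathrm{E}_t\mu^e_{t+2} = 2a + (1+\phi)z_t.
\]
The variance is
\[
\mathrm{Var}_t(r^e_{t+1} + r^e_{t+2}) = \mathrm{Var}_t(u_{t+1} + \eta_{t+1} + u_{t+2}) = 2\sigma^2 + \mathrm{Var}(\eta_{t+1}) + 2\sigma_{u\eta},
\]
since $\mathrm{Cov}(u_{t+1}, u_{t+2}) = \mathrm{Cov}(\eta_{t+1}, u_{t+2}) = 0$. Combining into the expression for $v$ gives
\[
\begin{aligned}
v &= \frac{2a + (1+\phi)z_t + \sigma^2}{2\sigma^2 - (1-\gamma)[2\sigma^2 + \mathrm{Var}(\eta_{t+1}) + 2\sigma_{u\eta}]} \\
  &= \frac{a + (1+\phi)z_t/2 + \sigma^2/2}{\sigma^2 - (1-\gamma)[\sigma^2 + \mathrm{Var}(\eta_{t+1})/2 + \sigma_{u\eta}]} \\
  &= \frac{a + (1+\phi)z_t/2 + \sigma^2/2}{\gamma\sigma^2 - (1-\gamma)[\mathrm{Var}(\eta_{t+1})/2 + \sigma_{u\eta}]}.
\end{aligned}
\]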
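Proof. (of (18.27)) (This proof is a bit crude, but probably correct....) The objective is to maximize
\[
\mathrm{E}_t r_{p,t+1} + (1-\gamma)[\mathrm{Var}_t(r_{p,t+1}) + 2\mathrm{Cov}_t(r_{p,t+1}, r_{p,t+2})]/2. \tag{obj}
\]
Using (18.4) we have
\[
r_{p,t+1} \approx r_f + v_t(r_{t+1} - r_f) + v_t\sigma^2/2 - v_t^2\sigma^2/2,
\]
so
\[
\frac{\partial\, \mathrm{E}_t r_{p,t+1}}{\partial v_t} = \mu^e_{t+1} + \sigma^2/2 - v_t\sigma^2. \tag{foc1}
\]
The variance term in (obj) is $(1-\gamma)\mathrm{Var}_t(r_{p,t+1})/2 = (1-\gamma)v_t^2\sigma^2/2$, with the derivative
\[
\frac{1-\gamma}{2}\,\frac{\partial\, \mathrm{Var}_t(r_{p,t+1})}{\partial v_t} = (1-\gamma)v_t\sigma^2. \tag{foc2}
\]
The covariance in (obj) is
\[
\begin{aligned}
\mathrm{Cov}_t(r_{p,t+1}, r_{p,t+2}) &= \mathrm{Cov}_t[v_t u_{t+1},\; v_{t+1}(r_{t+2} - r_f) + v_{t+1}\sigma^2/2 - v_{t+1}^2\sigma^2/2] \\
&= v_t\,\mathrm{Cov}_t[u_{t+1},\; \underbrace{v_{t+1}\mu^e_{t+2} + v_{t+1}\sigma^2/2 - v_{t+1}^2\sigma^2/2}_{B}],
\end{aligned} \tag{ff}
\]
where the second line uses the fact that $r_{t+2} - r_f = \mu^e_{t+2} + u_{t+2}$ and that $u_{t+2}$ is uncorrelated with $u_{t+1}$ and $v_{t+1}$. There are two channels for the covariance: $u_{t+1}$ might be correlated with the expected return, $\mu^e_{t+2}$, or with the portfolio weight, $v_{t+1}$. The portfolio weight from the one-period optimization (18.21), but for $t+1$, is
\[
v_{t+1} = \frac{\bar{a} + z_{t+1}}{\gamma\sigma^2},
\]
where $\bar{a} = a + \sigma^2/2$ (this notation is only used to make the subsequent equations shorter). The $B$ term in (ff) can then be written
\[
\begin{aligned}
B &= \frac{1}{\gamma\sigma^2}(\bar{a} + z_{t+1})(\bar{a} + z_{t+1})\left(1 - \frac{1}{2\gamma}\right) \\
  &= (2\bar{a}z_{t+1} + z_{t+1}^2)\frac{1}{\gamma\sigma^2}\left(1 - \frac{1}{2\gamma}\right) + \text{constants}.
\end{aligned}
\]
Since $\mathrm{Cov}_t(u_{t+1}, \eta_{t+1}^2) = 0$ (since they are jointly normally distributed), the covariance in (ff) is
\[
\mathrm{Cov}_t(r_{p,t+1}, r_{p,t+2}) = v_t(\bar{a} + \phi z_t)\sigma_{u\eta}\frac{1}{\gamma\sigma^2}\left(2 - \frac{1}{\gamma}\right).
\]
The derivative of the covariance part of (obj) is
\[
(1-\gamma)\frac{\partial\, \mathrm{Cov}_t(r_{p,t+1}, r_{p,t+2})}{\partial v_t} = (1-\gamma)\left(2 - \frac{1}{\gamma}\right)\frac{\bar{a} + \phi z_t}{\gamma\sigma^2}\sigma_{u\eta}. \tag{foc3}
\]
Combine the derivatives (foc1), (foc2) and (foc3) to the first order condition
\[
\begin{aligned}
0 &= \frac{\partial\, \mathrm{E}_t r_{p,t+1}}{\partial v_t} + (1-\gamma)\frac{\partial\, \mathrm{Var}_t(r_{p,t+1})/2}{\partial v_t} + (1-\gamma)\frac{\partial\, \mathrm{Cov}_t(r_{p,t+1}, r_{p,t+2})}{\partial v_t} \\
  &= (\mu^e_{t+1} + \sigma^2/2 - v_t\sigma^2) + (1-\gamma)v_t\sigma^2 + (1-\gamma)\left(2 - \frac{1}{\gamma}\right)\frac{\bar{a} + \phi z_t}{\gamma\sigma^2}\sigma_{u\eta} \\
  &= \mu^e_{t+1} + \sigma^2/2 - \gamma v_t\sigma^2 + (1-\gamma)\left(2 - \frac{1}{\gamma}\right)\frac{\bar{a} + \phi z_t}{\gamma\sigma^2}\sigma_{u\eta} \\
  &= \mu^e_{t+1} + \sigma^2/2 + (1-\gamma)\left(2 - \frac{1}{\gamma}\right)\frac{\bar{a} + \phi z_t}{\gamma\sigma^2}\sigma_{u\eta} - \gamma\sigma^2 v_t,
\end{aligned}
\]
which can be solved as (18.27).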
Bibliography
Campbell, J. Y., and L. M. Viceira, 1999, “Consumption and portfolio decisions when
expected returns are time varying,” Quarterly Journal of Economics, 114, 433–495.
Campbell, J. Y., and L. M. Viceira, 2002, Strategic asset allocation: portfolio choice of
long-term investors, Oxford University Press.
Dahlquist, M., and P. Söderlind, 1999, “Evaluating portfolio performance with stochastic
discount factors,” Journal of Business, 72, 347–383.
Ferson, W. E., and R. Schadt, 1996, “Measuring fund strategy and performance in chang-
ing economic conditions,” Journal of Finance, 51, 425–461.