Watcher_Asset_Allocation
1. INTRODUCTION
The study of portfolio allocation has played a central role in financial economics, from its
very beginnings as a discipline. This field of study has attracted (and continues to attract)
the attention that it does because it is both highly practical and amenable to the application
of sophisticated mathematics.
This study reviews the recent academic literature on asset allocation. Two important
simplifications are employed: First, the field has drawn a distinction between the study
of allocation to broad asset classes and allocation to individual assets within a class.
This article focuses on the former. In fact, the empirical applications in this article
assume an even more specific case, namely an investor who chooses between a broad
stock portfolio and a riskless asset. Second, the surveyed models assume, for the most
part, no financial frictions. That is, I assume that the investor does not face unhedgeable
labor income risk or barriers to trading in the assets, such as leverage or short-sale
Annu. Rev. Fin. Econ. 2010.2:175-206. Downloaded from www.annualreviews.org
constraints. This is not to deny the importance of other asset classes or of financial
frictions. Recent surveys on portfolio choice (encompassing portfolios of many assets)
by University of Pennsylvania on 11/26/10. For personal use only.
include Cochrane (1999), Brandt (2009), and Avramov & Zhou (2010). Campbell
(2006) and Curcuru et al. (2009) survey work on asset allocation under realistic frictions
faced by households.
I focus on two broad classes of models: static models (in which the investor looks one
period ahead) and dynamic models (in which the investor looks multiple periods ahead and
takes his future behavior into account when making decisions). For static models, the
solution where investors have full information about asset returns has been known for
some time (Markowitz 1952), so the focus is on incorporating uncertainty about the return
process. In contrast, much has been learned in recent years about dynamic models, even in
the full-information case. A barrier to considering dynamic models is often their complex-
ity: For this reason, I devote space to analytical results. These results, besides being inter-
esting in their own right, can serve as a starting point for understanding the behavior of
models that can be solved numerically only.
Finally, in both the static and dynamic sections, I consider in detail the model in which
excess returns on stocks over short-term Treasury bills are in part predictable. A substantial
empirical literature devotes itself to the question of whether returns are predictable; the
asset allocation consequences of such predictability are striking and have been well known,
at least in a qualitative sense, since Graham & Dodd (1934).
Ultimately, the goal of academic work on asset allocation is the conversion of the time
series of observable returns and other variables of interest into a single number: Given the
preferences and horizon of the investor, what fraction of her wealth should she put in
stock? The aim is to answer this question in a “scientific” way, namely by clearly specifying
the assumptions underlying the method and developing a consistent theory based on these
assumptions. The very specificity of the assumptions and the resulting advice can seem
dangerous, imputing more certainty to the models than the researcher can possibly possess.
Yet only by being so highly specific does the theory turn into something that can be clearly
debated and ultimately refuted in favor of an equally specific and hopefully better theory.
This development implies the use of mathematics to model the investment decision.
Throughout this article, the reader is encouraged to remember that the subject of the
modeling is an individual or household making a decision with significant consequences
for lifelong financial security.
176 Wachter
2. STATIC MODELS
where
$$W_T = W_{\hat{T}} \left( z \prod_{s=\hat{T}+1}^{T} R_s + (1 - z) \prod_{s=\hat{T}+1}^{T} R_{f,s} \right). \tag{2}$$
and
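where
$$\begin{bmatrix} u_{t+1} \\ v_{t+1} \end{bmatrix} \Bigm| y_t, \ldots, y_1, x_t, \ldots, x_0 \sim N(0, \Sigma), \tag{5}$$
and
$$\Sigma = \begin{bmatrix} \sigma_u^2 & \sigma_{uv} \\ \sigma_{uv} & \sigma_v^2 \end{bmatrix}. \tag{6}$$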
important differences.
I assume the investor does not know the parameters of the system above. Rather, he is a
Bayesian, meaning that he has prior beliefs on the parameters, and after viewing the data,
makes inferences using the laws of probability (Berger 1985). Let $\hat{\beta}$ be the ordinary least
squares (OLS) estimate of $\beta$ in the regression (Equation 3). Bayesian analysis turns the
standard frequentist analysis on its head: Instead of asking for the distribution of the test
statistic $\hat{\beta}$ (which depends on the data) as a function of the true parameter $\beta$, Bayesian
analysis asks for the distribution of the true parameter $\beta$ as a function of the data (which
often comes down to a function of sufficient statistics, such as $\hat{\beta}$).
For notational convenience, stack the coefficients from Equations 3 and 4 into a vector:
$$b = [\alpha, \beta, \theta, \rho]^\top.$$
The investor starts out with prior beliefs $p(b, \Sigma)$. Let $L(D \mid b, \Sigma)$ denote the likelihood
function, where $D$ is the data available up until and including time $\hat{T}$. It follows from
Bayes' rule that the posterior distribution is given by
$$p(b, \Sigma \mid D) = \frac{L(D \mid b, \Sigma)\, p(b, \Sigma)}{p(D)},$$
where $p(D)$ is an unconditional likelihood of the data in the sense that
$$p(D) = \int_{b, \Sigma} L(D \mid b, \Sigma)\, p(b, \Sigma)\, db\, d\Sigma.$$
It follows that
$$p(b, \Sigma \mid D) \propto L(D \mid b, \Sigma)\, p(b, \Sigma), \tag{7}$$
where $\propto$ denotes "proportional to" because $p(D)$ does not depend on $b$ or $\Sigma$. The likelihood
function, given $\hat{T}$ years of data, is equal to
$$L(D \mid \Sigma, b) = \prod_{t=0}^{\hat{T}-1} p_{t+1|t}(y_{t+1}, x_{t+1} \mid x_t, \Sigma, b)\; p_0(x_0 \mid b, \Sigma), \tag{8}$$
where $p_{t+1|t}(y_{t+1}, x_{t+1} \mid x_t, \Sigma, b)$ is given by a bivariate normal density function as described
in Equations 3–6 and $p_0(x_0 \mid b, \Sigma)$ gives the initial condition of the time series. Given the
posterior, the predictive density for returns from time $\hat{T}$ to $T$ is defined as
$$p(y_{\hat{T}+1}, \ldots, y_T \mid D) = \int p(y_{\hat{T}+1}, \ldots, y_T \mid b, \Sigma, x_{\hat{T}})\, p(b, \Sigma \mid D)\, db\, d\Sigma. \tag{9}$$
The predictive distribution (Equation 9) summarizes the agent’s beliefs about the return
distribution after viewing the data. The expectation in Equation 1 is taken with respect to
this distribution.
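As a rough numerical illustration of the integration in Equation 9, the sketch below uses a deliberately simplified setting that is not the predictive system of Equations 3–6: returns are iid normal with known volatility, and only the mean is uncertain, with a normal posterior. The function name and all numbers are hypothetical. Drawing a parameter from the posterior and then a return given that parameter yields draws from the predictive distribution.

```python
import numpy as np

def predictive_draws(post_mean, post_sd, sigma, n_draws, seed=0):
    """Sample the predictive distribution of a one-period return.

    Assumption (illustrative only): y | mu ~ N(mu, sigma^2) with sigma known,
    and the posterior for mu is N(post_mean, post_sd^2).
    """
    rng = np.random.default_rng(seed)
    mu = rng.normal(post_mean, post_sd, size=n_draws)  # parameter draw from the posterior
    return rng.normal(mu, sigma)                       # return draw given that parameter

draws = predictive_draws(post_mean=0.005, post_sd=0.02, sigma=0.04, n_draws=200_000)
predictive_var = draws.var()
known_parameter_var = 0.04 ** 2
```

In this simplified case the predictive variance should be close to the sum of the return variance and the posterior variance of the mean, so parameter uncertainty makes the asset look riskier than it would with known parameters.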
How might predictability influence an investor's optimal allocation? Kandel & Stambaugh
(1996) find that optimal allocation for the single-period case can be approximated by
$$z^* \approx \frac{1}{\gamma}\, \frac{E[y_T] + \frac{1}{2} \mathrm{Var}(y_T)}{\mathrm{Var}(y_T)}, \tag{10}$$
where the mean and the variance are taken under the investor’s subjective distribution of
returns (which is Equation 9 in the Bayesian case). Holding the variance constant, an upward
shift in the mean increases the allocation. This is not surprising given that the investor prefers
more, not less, wealth. This approximation is valid only for short horizons and small shocks.
However, it is useful as a first step to understanding the portfolio allocation.
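The approximation in Equation 10 can be evaluated directly. The following is a minimal sketch; the function name and the monthly inputs are hypothetical and chosen only to show how the allocation responds to a shift in the mean with the variance held fixed.

```python
def approx_allocation(mean_log_excess, var_log_excess, gamma):
    """Equation 10: z* ~ (1/gamma) * (E[y] + 0.5 * Var(y)) / Var(y)."""
    if var_log_excess <= 0 or gamma <= 0:
        raise ValueError("variance and risk aversion must be positive")
    return (mean_log_excess + 0.5 * var_log_excess) / (gamma * var_log_excess)

# Hypothetical monthly inputs: 0.5% mean log excess return, 4% volatility.
base = approx_allocation(0.005, 0.04 ** 2, gamma=5.0)
higher_mean = approx_allocation(0.006, 0.04 ** 2, gamma=5.0)
```

With these inputs the base allocation is 0.725, and raising the mean while holding the variance constant raises the allocation.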
the posterior distribution for all the parameters could then be obtained in closed form.
Indeed, conditional on $\Sigma$, $b$ would be normally distributed around its OLS estimate $\hat{b}$.
However, $p_0(x_0 \mid b, \Sigma)$ is there, and something must be done about it. One approach is to
let it stay and specify what it should be. I refer to the resulting set of assumptions and
results as the "exact Bayesian model" (described in Section 2.3). Another approach is to
assume $x_0$ conveys no prior information. Thus,
$$p(b, \Sigma \mid x_0) = p(b, \Sigma). \tag{12}$$
Because the prior is now conditional on $x_0$, the likelihood can condition on $x_0$ as well.
The posterior is, of course, conditional on $x_0$ because it is conditional on all the data. That
is, the assumption in Equation 12 allows Equation 7 to be replaced by
$$p(b, \Sigma \mid D) \propto L_c(D \mid b, \Sigma, x_0)\, p(b, \Sigma \mid x_0),$$
1
However, it still would not be a classical regression model. An assumption of classical regression is that the
independent variable (the regressor) is either nonstochastic or independent of the disturbance term $u_t$ at all leads and lags (Zellner
1971, ch. 3). As emphasized in Stambaugh (1999), the independence assumption fails in predictive regressions.
Under the assumptions of classical regression, Gelman et al. (1996) show that the likelihood function for the
regressor is irrelevant to the agent’s decision problem (and so, therefore, is the initial condition).
the long-run mean, the investor has all her wealth in stocks. When the dividend yield is at
one standard deviation below the long-run mean, the optimal allocation falls to 20%. The
optimal allocation is bounded above and below because the power utility investor would
never risk wealth below zero. Because the distribution for returns is unbounded from
above, this investor would never hold a negative position in stock. The investor would also
[Figure 1: panels a and b each plot the allocation to stocks (vertical axis, 0 to 1) against the horizon in years (horizontal axis, 0 to 10).]
Figure 1
Static allocation as a function of horizon assuming return predictability: (a) when there is no parameter uncertainty and
(b) incorporating parameter uncertainty. The solid line corresponds to the optimal (buy-and-hold) allocation when the dividend
yield is at its sample mean (3.75%). The dash-dotted lines correspond to the allocations when the dividend yield is one standard
deviation below or above its mean (2.91% and 4.59%, respectively). The dotted lines correspond to the allocations when the
dividend yield is two standard deviations below or above its mean (2.06% and 5.43%, respectively). The agent has power utility
over terminal wealth with relative risk aversion equal to 5. Some lines may lie on top of each other. The allocations weakly
increase as a function of the dividend yield except at very long horizons in panel b. The model is estimated over monthly data
from 1952 to 1995. This figure is adapted from Barberis (2000, figure 3).
never hold a levered position in stock, for this too implies a positive probability of negative
wealth, because returns could be as low as -100%. Note that these endogenous bounds
on the optimal portfolio illustrate errors in the approximation given in Equation 10, which
contains no such bounds.
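To see how the approximation in Equation 10 compares with a bounded one-period optimum, the sketch below maximizes expected power utility numerically. It rests on assumptions that are not taken from the source: the log excess return is normal, the gross riskless return is 1, the search is restricted to a grid on [0, 1] (which imposes the bounds discussed above rather than deriving them), and expectations use Gauss-Hermite quadrature. All function names and parameter values are hypothetical.

```python
import numpy as np

def exact_allocation(mu, sigma, gamma, n_nodes=40, grid_size=1001):
    """Grid-search the weight z in [0, 1] maximizing E[W^(1-gamma)/(1-gamma)].

    Assumptions (illustrative): the log excess return is N(mu, sigma^2), the
    gross riskless return is 1, initial wealth is 1, and gamma != 1.
    """
    nodes, weights = np.polynomial.hermite_e.hermegauss(n_nodes)
    probs = weights / weights.sum()              # quadrature weights as probabilities
    gross_excess = np.exp(mu + sigma * nodes)    # stock return relative to riskless
    z_grid = np.linspace(0.0, 1.0, grid_size)    # no short or levered positions
    wealth = 1.0 + np.outer(z_grid, gross_excess - 1.0)
    expected_utility = (wealth ** (1.0 - gamma) / (1.0 - gamma)) @ probs
    return z_grid[np.argmax(expected_utility)]

def approx_allocation(mu, sigma, gamma):
    """Equation 10 with E[y] = mu and Var(y) = sigma^2."""
    return (mu + 0.5 * sigma ** 2) / (gamma * sigma ** 2)

z_exact = exact_allocation(0.005, 0.04, gamma=5.0)
z_approx = approx_allocation(0.005, 0.04, gamma=5.0)
z_exact_high = exact_allocation(0.02, 0.04, gamma=5.0)    # pinned at the upper bound
z_approx_high = approx_allocation(0.02, 0.04, gamma=5.0)  # unbounded formula exceeds 1
```

For the moderate expected return the two answers are close. For the high expected return the approximation calls for a levered position, whereas the bounded problem stops at a weight of 1.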
What about the case where the investor incorporates estimation risk into her decision
making? One may think that estimation risk would make a substantial difference because,
as Barberis (2000) reports, the evidence for predictability at a monthly horizon is of only
borderline significance in the relevant sample. However, the results incorporating estima-
tion risk are quite similar to those that do not at short horizons. Indeed, differences start to
become noticeable only at buy-and-hold horizons of five years or more. (Although such
long buy-and-hold horizons may characterize the behavior of some investors, from the
normative perspective of this article, such infrequent trading seems extreme.) Thus, for the
statistical model for stock returns above, parameter uncertainty resulting from the regres-
case [see Stambaugh (1999) for an explanation of the reversed relation between holdings
and the dividend yield at the longest horizons]. Because innovations to the dividend yield
are negatively correlated with innovations to returns, stocks when measured at long hori-
zons are less risky than stocks measured at short horizons [mean reversion in stock returns
is pointed out in earlier work of Poterba & Summers (1988)]. The implications of mean-
reversion for long-horizon investors are also the subject of Siegel (1994).
2
Note that the effect of the dividend yield attenuates at longer horizons both when parameter uncertainty is taken
into account and when it is not. This occurs because the dividend yield is mean-reverting and because the investor
cannot rebalance as the dividend yield reverts to its mean. In the limit, an investor with an infinite horizon (who cares
about wealth at the end of the horizon) would care only about the unconditional distribution of returns and not
about the current value of the dividend yield. Because the dividend yield is so persistent, the effect attenuates very
slowly as a function of the horizon.
(Hamilton 1994, p. 53). The relevant likelihood function is therefore Equation 8, where p0
is the normal density given by Equation 14.
The use of the unconditional likelihood requires that $\rho$ be between $-1$ and $1$.
Stambaugh (1999) therefore modifies the assumption in Equation 11 as follows:
What is the rationale for Equation 16 or, for that matter, for Equations 15 or 11? The
prior of Equation 11 is standard in regression models. Its appeal is best understood by the
fact that it embodies three conditions: (a) $b$ and $\Sigma$ should be independent in the prior; (b)
for the elements of $b$, ignorance is best represented by a uniform distribution (which, in the
limit, becomes a constant as in Equation 11); and (c)
which generalizes the assumption that, for a single system, the log of the standard deviation
should have a flat distribution on $-\infty$ to $\infty$. Jeffreys (1961, p. 48) proposes these rules
for cases when there is no theoretical guidance on the values of the parameters. An
additional appeal of Equation 11 (discussed above) is that, when combined with the
likelihood (Equation 13), explicit expressions for the posterior distributions of the param-
eters can be obtained.
This discussion would seem to favor the prior given in Equation 15 (because theory now
requires a stationary process) in combination with the exact likelihood. However, applying
these rules does not constitute the only approach. Jeffreys (1961) proposes an alternative
means of defining ignorance: Inference should be invariant to one-to-one changes in the
parameter space. This criterion is appealing in the case of the predictive model (Equations
3–6) in which the particular parametrization appears arbitrary. The exact form of Jeffreys
prior depends on the sample size T and is derived by Uhlig (1994). Stambaugh (1999)
derives an approximate Jeffreys prior that becomes exact as the sample size approaches
infinity. This approximate Jeffreys prior is equal to Equation 16. Relative to the flat prior
for $\rho$ (Equation 15), more weight is placed on values of $\rho$ close to $-1$ and $1$.
Table 1 Posterior means of $\beta$ and $\rho$ under various combinations of the likelihood and the prior*
Specification | Posterior mean of $\beta$ | Posterior mean of $\rho$
Conditional likelihood; $p(b, \Sigma) \propto \lvert\Sigma\rvert^{-3/2}$, $\rho \in (-\infty, \infty)$ | 0.437 | 0.9800
Conditional likelihood; $p(b, \Sigma) \propto \lvert\Sigma\rvert^{-3/2}$, $\rho \in (-1, 1)$ | 0.441 | 0.9798
Exact likelihood; $p(b, \Sigma) \propto \lvert\Sigma\rvert^{-3/2}$, $\rho \in (-1, 1)$ | 0.375 | 0.9828
Exact likelihood; $p(b, \Sigma) \propto (1 - \rho^2)^{-1} \sigma_v^2 \lvert\Sigma\rvert^{-5/2}$, $\rho \in (-1, 1)$ | 0.276 | 0.9872
*Results are from Stambaugh (1999, figure 1). The predictor variable is the dividend-price ratio. Data are monthly
from 1952 to 1996. The conditional likelihood refers to Equation 13; the exact likelihood refers to Equation 8 with
initial condition given by Equation 14.
Table 1 shows the implications of these specification choices for the posterior mean of
the regression coefficient $\beta$ and the autocorrelation $\rho$.3 For the conditional likelihood
and prior (Equation 11), the posterior mean of $\beta$ equals the OLS regression coefficient
(which is biased upward). When values of $\rho$ are restricted to be between $-1$ and $1$,
the posterior mean of $\beta$ is slightly higher. By contrast, when the exact likelihood is used, the
posterior mean of $\beta$ is lower and the difference is substantial, regardless of whether the
uniform prior or the Jeffreys prior is used.
To understand these differences in posterior means, consider the following approximate
relation (Stambaugh 1999):
$$E[\beta \mid D] \approx \hat{\beta} + E\!\left[\frac{\sigma_{uv}}{\sigma_v^2} \,\Big|\, D\right] \left(E[\rho \mid D] - \hat{\rho}\right). \tag{18}$$
Because $\sigma_{uv}$ is negative, positive differences between the posterior mean of $\rho$ and $\hat{\rho}$ translate
into negative differences between $\beta$ and $\hat{\beta}$. Equation 18 is the Bayesian version of the
observation that the upward bias in $\hat{\beta}$ originates from the downward bias in $\hat{\rho}$. OLS
estimates the persistence to be lower than what it is in population: This bias arises from
the need to estimate both the sample mean and the regression coefficient at the same time;
the observations revert more quickly to the sample estimate of the mean than the true mean
(Andrews 1993). Because of the negative correlation, OLS also estimates the predictive
coefficient to be too high. Intuition for this result is as follows: If $\rho$ is above the OLS
estimate $\hat{\rho}$, then $\hat{\rho}$ is "too low," i.e., in the sample, shocks to the predictor variable tend to
be followed more often by shocks of a different sign than would be expected by chance.
Because shocks to the predictor and the return variables are negatively correlated, this
implies that shocks to the predictor variable tend to be followed by shocks to returns of
the same sign. This implies that $\hat{\beta}$ will be "too high."
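The direction of these two biases can be checked with a small Monte Carlo experiment. The sketch below assumes the predictive system takes the usual form $y_{t+1} = \alpha + \beta x_t + u_{t+1}$, $x_{t+1} = \theta + \rho x_t + v_{t+1}$ (the equations themselves are not reproduced in this passage), with zero intercepts, unit-variance shocks, and hypothetical parameter values. The function name is likewise an invention for illustration.

```python
import numpy as np

def simulate_ols_bias(beta=0.5, rho=0.95, corr_uv=-0.9, n_obs=100,
                      n_sims=2000, seed=0):
    """Average OLS errors (beta_hat - beta, rho_hat - rho) across simulated samples.

    Assumed system (hypothetical parameter values, zero intercepts):
        y[t+1] = beta * x[t] + u[t+1],   x[t+1] = rho * x[t] + v[t+1],
    with unit-variance normal shocks and corr(u, v) = corr_uv.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, corr_uv], [corr_uv, 1.0]])
    beta_err, rho_err = [], []
    for _ in range(n_sims):
        shocks = rng.multivariate_normal([0.0, 0.0], cov, size=n_obs)
        u, v = shocks[:, 0], shocks[:, 1]
        x = np.empty(n_obs + 1)
        x[0] = rng.normal(0.0, 1.0 / np.sqrt(1.0 - rho ** 2))  # stationary start
        for t in range(n_obs):
            x[t + 1] = rho * x[t] + v[t]
        y = beta * x[:-1] + u
        design = np.column_stack([np.ones(n_obs), x[:-1]])      # intercept + lagged x
        beta_hat = np.linalg.lstsq(design, y, rcond=None)[0][1]
        rho_hat = np.linalg.lstsq(design, x[1:], rcond=None)[0][1]
        beta_err.append(beta_hat - beta)
        rho_err.append(rho_hat - rho)
    return float(np.mean(beta_err)), float(np.mean(rho_err))

mean_beta_error, mean_rho_error = simulate_ols_bias()
```

Under these assumptions the average error in the persistence estimate should be negative and the average error in the predictive coefficient positive, matching the signs described in the text. Both regressions include an intercept because the bias arises from estimating the mean and the slope jointly.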
Compared with the uniform prior over $-\infty$ to $\infty$, the prior that restricts $\rho$ to be
between $-1$ and $1$ lowers (slightly) the posterior mean of $\rho$ because it rules out draws of $\rho$
that are greater than one. For this reason it raises (slightly) the posterior mean of $\beta$ even
3
The specifications involving the exact likelihood or the Jeffreys prior do not admit closed-form solutions for the
posterior distribution. Nonetheless, the posterior can be constructed using the Metropolis-Hastings algorithm (see
Chib & Greenberg 1995, section 5). See Johannes & Polson (2006) for further discussion of sampling methods for
solving Bayesian portfolio choice problems.
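For readers unfamiliar with the sampling method mentioned in the footnote, here is a minimal random-walk Metropolis-Hastings sketch. It is not the specification used in the cited papers: it samples only the autocorrelation of a zero-mean AR(1) with unit shock variance, under the exact likelihood (including the stationary density of the first observation) and a flat prior on (-1, 1). The data are simulated and every name and tuning value is hypothetical.

```python
import numpy as np

def exact_ar1_loglik(rho, x):
    """Exact Gaussian AR(1) log-likelihood (up to a constant), including the
    stationary density of x[0]. Assumes zero mean and unit shock variance."""
    if abs(rho) >= 1.0:
        return -np.inf                       # flat prior restricted to (-1, 1)
    resid = x[1:] - rho * x[:-1]
    initial = 0.5 * np.log(1.0 - rho ** 2) - 0.5 * (1.0 - rho ** 2) * x[0] ** 2
    return initial - 0.5 * np.sum(resid ** 2)

def metropolis_hastings(x, n_steps=5000, step=0.05, start=0.5, seed=0):
    """Random-walk Metropolis-Hastings draws of rho given the data x."""
    rng = np.random.default_rng(seed)
    rho, loglik = start, exact_ar1_loglik(start, x)
    draws = np.empty(n_steps)
    for i in range(n_steps):
        proposal = rho + step * rng.normal()
        proposal_loglik = exact_ar1_loglik(proposal, x)
        if np.log(rng.uniform()) < proposal_loglik - loglik:   # accept or reject
            rho, loglik = proposal, proposal_loglik
        draws[i] = rho
    return draws

# Hypothetical data: 200 observations from an AR(1) with rho = 0.9.
rng = np.random.default_rng(1)
x = np.empty(200)
x[0] = rng.normal(0.0, 1.0 / np.sqrt(1.0 - 0.9 ** 2))
for t in range(199):
    x[t + 1] = 0.9 * x[t] + rng.normal()
posterior_mean_rho = metropolis_hastings(x)[1000:].mean()      # drop burn-in draws
```

Proposals outside (-1, 1) receive a log-likelihood of negative infinity and are always rejected, which is how the stationarity restriction enters the sampler.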
these values at various levels of the dividend yield. Comparing the first and last rows of
each panel shows that Bayesian estimation with the conditional likelihood and prior
(Equation 11) has implications that are virtually identical to ignoring parameter uncertainty
and using the OLS estimates. For the exact likelihood and prior (Equation 15), both
the expected returns and allocations are less variable, as one would expect given the lower
posterior mean of $\beta$. Surprisingly, not only are the expected returns less variable, but they
are also substantially lower for both values of the dividend yield, leading to lower allocations
as well. In fact, the average excess stock return is different in the various cases
(Wachter & Warusawitharana 2009b). As explained in that paper, differences in estimates
of average excess stock returns arise from differences in estimates of the mean of the
Table 2 Expected returns and optimal allocations under various combinations of the likelihood and prior (monthly horizon)*
Specification | 3% | 4% | 5%
Exact likelihood; $p(b, \Sigma) \propto (1 - \rho^2)^{-1} \sigma_v^2 \lvert\Sigma\rvert^{-5/2}$, $\rho \in (-1, 1)$ | 1.9 | 5.2 | 8.5
*Results are from Stambaugh (1999, tables 3, 4). The predictor variable is the dividend-price ratio. Data are monthly from 1952 to 1996. The
conditional likelihood refers to Equation 13; the exact likelihood refers to Equation 8 with initial condition given by Equation 14. The table
assumes that the investor has a horizon of one month and has constant relative risk aversion equal to 7.
predictor variable. Over this sample, the conditional maximum likelihood estimate of the
mean of the dividend yield is below the exact maximum likelihood estimate. Therefore, shocks to the
predictor variable over the sample period must have been negative on average; it follows
that shocks to excess returns must have been positive on average. Accordingly, the poste-
rior mean of returns is below the sample mean.
that economic theory points toward low levels of the $R^2$, should predictability exist at all.
Let
$$\sigma_x^2 = \frac{\sigma_v^2}{1 - \rho^2}, \tag{19}$$
and note that Equation 19 is the unconditional variance of $x_t$. The population $R^2$ for the
regression given in Equation 3 is defined to be the ratio of the variance of the predictable
component of the return to the total variance. It follows from Equation 19 that the $R^2$ is
equal to
$$R^2 = \frac{\beta^2 \sigma_x^2}{\beta^2 \sigma_x^2 + \sigma_u^2}. \tag{20}$$
Wachter & Warusawitharana (2009a) consider a class of priors that translate into
distributions on the population $R^2$. Specifically, they define a "normalized" $\beta$:
$$\eta = \sigma_u^{-1} \sigma_x \beta.$$
$$R^2 = \frac{\eta^2}{\eta^2 + 1}. \tag{22}$$
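The link between Equations 19, 20, and 22 can be verified numerically. The sketch below is illustrative only: the function names are invented here and the parameter values are hypothetical rather than estimates from the source.

```python
import math

def population_r2(beta, rho, sigma_v, sigma_u):
    """Equations 19-20: R^2 from the predictive coefficient and shock volatilities."""
    var_x = sigma_v ** 2 / (1.0 - rho ** 2)          # Equation 19
    explained = beta ** 2 * var_x
    return explained / (explained + sigma_u ** 2)     # Equation 20

def r2_from_eta(eta):
    """Equation 22: R^2 as a function of the normalized coefficient eta."""
    return eta ** 2 / (eta ** 2 + 1.0)

# Hypothetical values chosen only to exercise the formulas.
beta, rho, sigma_v, sigma_u = 0.4, 0.98, 0.002, 0.04
sigma_x = sigma_v / math.sqrt(1.0 - rho ** 2)
eta = beta * sigma_x / sigma_u                        # normalized coefficient
r2_direct = population_r2(beta, rho, sigma_v, sigma_u)
r2_via_eta = r2_from_eta(eta)
```

Dividing the numerator and denominator of Equation 20 by the return shock variance gives Equation 22, so the two routes agree up to floating-point error.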
Because $\sigma_x$ is implicitly a function of $\rho$ and $\sigma_v$, the prior on $\beta$ is also a function of these
parameters. The approximate Jeffreys prior for the remaining parameters is given by
$$p(\alpha, \theta, \rho, \Sigma) \propto \sigma_x \sigma_u \lvert\Sigma\rvert^{-5/2}. \tag{24}$$
likely. The prior assigns a probability of nearly 100% to the $R^2$ exceeding any given value,
except for values that are an infinitesimal distance from one.
The literature has considered other specifications for informative priors. Kandel &
Stambaugh (1996), for example, construct priors assuming that the investor has seen, in
addition to the actual data, a hypothetical prior sample of the data such that the sample
means, variances, and covariances of returns and predictor variables are the same in the
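[Figure 2: $P(R^2 > k)$ (vertical axis, 0 to 1) plotted against $k$ (horizontal axis, 0 to 0.03, then a break, then 0.5 to 1), with one curve per value of $\sigma_\eta$; legend entries as extracted: 0, 0.04, 0.08, 5, 10, 100.]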
Figure 2
The prior probability that the $R^2$ exceeds a value $k$ implied by various prior beliefs. Prior beliefs are indexed by $\sigma_\eta$, the prior
standard deviation of the normalized coefficient on the predictor variable. The dogmatic prior is given by $\sigma_\eta = 0$; the diffuse prior
by $\sigma_\eta = \infty$. Intermediate priors express some skepticism over return predictability. Note that the left portion of the x-axis of the
graph is scaled differently from the right portion.
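The probabilities plotted in Figure 2 can be approximated in closed form if one assumes, as this sketch does, that the normalized coefficient has a zero-mean normal prior with standard deviation $\sigma_\eta$. The prior's exact form is not stated in this passage, so the normality is an assumption, and the function name is hypothetical.

```python
import math

def prob_r2_exceeds(k, sigma_eta):
    """P(R^2 > k) when eta ~ N(0, sigma_eta^2) and R^2 = eta^2 / (eta^2 + 1).

    The normal prior on eta is an assumption of this sketch.
    """
    if not 0.0 < k < 1.0:
        raise ValueError("k must lie strictly between 0 and 1")
    if sigma_eta == 0.0:
        return 0.0                                   # dogmatic prior: R^2 = 0 for sure
    if math.isinf(sigma_eta):
        return 1.0                                   # diffuse limit
    threshold = math.sqrt(k / (1.0 - k))             # R^2 > k  <=>  |eta| > threshold
    return math.erfc(threshold / (sigma_eta * math.sqrt(2.0)))

probs = {s: prob_r2_exceeds(0.01, s) for s in (0.0, 0.04, 0.08, math.inf)}
```

For a threshold of 1%, the probability rises from zero under the dogmatic prior to one in the diffuse limit, with the intermediate priors in between.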
hypothetical prior sample as in the actual sample. However, in the hypothetical sample, the
$R^2$ is exactly equal to zero (see also Avramov 2002, 2004). Cremers (2002) constructs
informative priors assuming the investor knows sample moments of the predictive variable.
These constructions raise the question of how the investor knows the sample moments of
returns and predictive variables (note that it is not sufficient for the investor to make a guess
that is close to the sample values). If it is by seeing the data, the prior and the posterior are
equal and the problem reduces to the full-information case. An alternative is to assume that
the investor has somehow intuited the correct values. According to this latter (somewhat
awkward) interpretation, to be consistent these moments would have to be treated as
constants (namely conditioned on) throughout the analysis, which they are not.
Figure 2 suggests that priors of the form (Equations 23 and 24) with small $\sigma_\eta$ have more
reasonable economic properties than uninformative priors. Wachter & Warusawitharana
(2009a) investigate the quantitative implications of these priors for portfolio allocation.
Not surprisingly, because the posterior mean of $\beta$ shrinks toward zero, the portfolio
allocation under these priors exhibits less dependence on the dividend yield.
Several papers critique the evidence in favor of predictability based on out-of-sample
performance: Bossaerts & Hillion (1999) find no evidence of out-of-sample return pre-
dictability using a number of predictors, whereas Goyal & Welch (2008) find that pre-
dictive regressions often perform worse than using the sample mean when it comes to
predicting returns. For the researcher, these studies raise the question of how the Bayesian
asset allocation strategies perform out of sample. Note, however, that from the point of
view of the Bayesian investor, such additional information is irrelevant. The predictive
distribution for returns, as generated from the likelihood and the prior, is the sole determi-
nant of the portfolio strategy.
Wachter & Warusawitharana (2009a) examine the out-of-sample performance implied by
various priors. They show that asset allocation, using the results of OLS regression without
taking parameter uncertainty into account, indeed delivers worse out-of-sample performance
than a strategy implied by a dogmatic belief in no predictability. Relative to the OLS benchmark,
the strategy implied by the uninformative Jeffreys prior (Equation 16) performs better, but still
worse than the no-predictability prior. Across various specifications, the best-performing prior
is an intermediate one, representing some weight on the data and some weight on an economically
reasonable view that, if predictability should exist, the $R^2$ should be relatively small.
Campbell & Thompson (2008) adopt a second approach to improving out-of-sample
performance. They show that the out-of-sample performance improves when weak eco-
nomic restrictions are imposed on the return forecasts, thereby requiring that the expected
excess return be positive and that the predictor variable has the theoretically expected sign.
The Campbell and Thompson paper is non-Bayesian, but it would not be difficult to
incorporate these prior views into a Bayesian setting.
where $u$, $v$, and $w$ are iid (across time) and jointly normally distributed. Here, $\mu$ (unobserved)
is the true expected excess return, and the agent learns about $\mu$ by observing $x$ and $y$. Under
this predictive system, one could still regress $y_{t+1}$ on the observable $x_t$. However, the error in
the regression would be correlated with time-$t$ variables. Pastor & Stambaugh find that this
distinction between $\mu_t$ and $x_t$, and particularly the fact that the autocorrelation of $x$ need not
equal the autocorrelation of $\mu$, has important consequences for investors.
One could expand the uncertainty faced by investors in other ways. Recent studies
(Avramov 2002, Cremers 2002, Wachter & Warusawitharana 2009b) explore the possibil-
ity that an investor assigns some prior probability to alternative models. Although this
represents a form of “model uncertainty,” the agent is still Bayesian in the sense that he
assigns probabilities. One could go further and assume that there are some forms of
uncertainty that investors simply cannot quantify. Gilboa & Schmeidler (1989) define a
set of axioms on preferences that distinguish between risk (in which the agent assigns
probabilities to states of nature) and uncertainty (in which probabilities are not assigned).
They show that aversion to uncertainty leads investors to maximize the minimum over the
set of priors that may be true. Uncertainty aversion, also called ambiguity aversion, has
been the subject of a fast-growing literature in recent years, much of which has focused on
asset allocation (see Chamberlain 1999, Chen & Epstein 2002, Chen et al. 2009, Garlappi
et al. 2007, Hansen 2007, Maenhout 2006).
This notion of additional uncertainty facing investors is likely to be a subject of contin-
ued active debate. As discussed above, there are a number of complementary approaches,
such as the predictive system, model uncertainty with probabilities over the models, and
model uncertainty such that the agent need not formulate probabilities over the models.
The contention of the previously discussed models is that periods of low valuation (e.g.,
when the dividend yield is high) represent, to some uncertain extent, a readily available
opportunity for the investor. However, another possibility is that the excess returns earned
by this market-timing strategy are a compensation for a type of risk that does not appear in
the sample, i.e., the risk of a rare event.4
4
Yet another possibility is that the excess returns represent compensation for greater volatility. Shanken & Tamayo
(2005) evaluate this claim directly in a Bayesian setting and find little support for it. A large literature debates the
extent to which changes in volatility are linked to changes in expected returns; based on available evidence, however,
it does not appear that the fluctuations in expected returns captured by the dividend yield correspond to changes in
volatility. See Campbell (2003) for a discussion of this literature.
In Wachter (2008), I show that predictability in excess returns can be captured by a
model with a representative investor with recursive preferences (see below), in which there
is a time-varying probability of a rare event. Times when this rare-event probability is
high correspond to times when the dividend yield is also high. Most of the time, the rare
event does not happen, implying higher than average realized returns. Occasionally, the
rare event does happen, in which case high dividend yields are followed by quite low
returns. The representative agent holds a constant weight in equities (as is required by
equilibrium) despite the fact that excess returns vary in a predictable fashion. Strategies
that attempt to time the market, according to this view, are risky, though this risk would be
difficult to detect in the available time series.
3. DYNAMIC MODELS
I now consider the investor who has a horizon beyond one period and, at each time point,
faces a consumption and portfolio choice decision. I start with a general specification that
allows for multiple risky assets and state variables. Let $C_t$ denote the investor's consumption
at time $t$, $z_t$ the $N \times 1$ vector of allocations to risky assets, and $W_t$ the investor's wealth.
Samuelson (1969) models this problem as
$$\max_{c,z}\; E \sum_{t=0}^{T} e^{-\beta t}\, \frac{C_t^{1-\gamma}}{1-\gamma} \tag{26}$$
subject to the budget constraint
$$W_{t+1} = (W_t - C_t) R_{f,t+1} + W_t z_t^\top (R_{t+1} - R_{f,t+1}) \tag{27}$$
and terminal condition $W_T \geq 0$. Here $e^{-\beta t} \frac{C_t^{1-\gamma}}{1-\gamma}$ represents period utility (for simplicity, I have
assumed that the investor does not have a bequest motive). An alternative is to consider the
problem without the utility flow from consumption, namely the investor maximizes $\frac{W_T^{1-\gamma}}{1-\gamma}$.
This is not as realistic, but it is sometimes a helpful simplification. Another helpful sim-
plification is to take the limit of Equation 26 as T goes to infinity.
The problem above can be solved by backward induction using the Bellman equation
(see Duffie 1996, ch. 3). Let $X_t$ denote an $n \times 1$ vector of state variables that determine the
distribution of returns. Let $\tau = T - t$ denote the horizon. Define the value function as the
remaining utility:
$$J(W_t, X_t, \tau) = \max_{c,z}\; E \sum_{s=0}^{\tau} e^{-\beta s} u(c_{t+s}).$$
with the boundary condition J(W, X, 0) ¼ u(W). See Brandt (2009) for further discussion
of the value function and its properties.
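Backward induction is easiest to see in a stripped-down case. The sketch below is not the general problem of Equation 26: it assumes power utility over terminal wealth only, iid stock returns taking two hypothetical values, a constant riskless rate, and weights restricted to a grid on [0, 1]. Under those assumptions the value function factors as $J(W, \tau) = a_\tau W^{1-\gamma}/(1-\gamma)$ with $a_0 = 1$, so each step only needs to update the scalar $a_\tau$.

```python
import numpy as np

def backward_induction(horizon, gamma, gross_stock, probs, gross_riskless,
                       grid_size=1001):
    """Optimal stock weight at each remaining horizon for terminal-wealth power utility.

    Assumptions (illustrative): iid stock returns taking finitely many values,
    a constant riskless rate, no intermediate consumption, gamma != 1, and
    weights restricted to a grid on [0, 1].
    """
    z_grid = np.linspace(0.0, 1.0, grid_size)
    portfolio = gross_riskless + np.outer(z_grid, gross_stock - gross_riskless)
    one_period = (portfolio ** (1.0 - gamma)) @ probs   # E[R_p^(1-gamma)] for each weight
    a = 1.0                                             # boundary condition J(W, 0) = u(W)
    weights = []
    for _ in range(horizon):
        objective = a * one_period / (1.0 - gamma)      # value per unit of W^(1-gamma)
        best = int(np.argmax(objective))
        weights.append(float(z_grid[best]))
        a = a * one_period[best]                        # coefficient for the next horizon
    return weights

weights = backward_induction(
    horizon=10, gamma=5.0,
    gross_stock=np.array([0.85, 1.25]), probs=np.array([0.5, 0.5]),
    gross_riskless=1.02,
)
```

Because the coefficient $a_\tau$ only rescales the one-period objective, the chosen weight is the same at every horizon in this iid setting. Horizon effects require state variables that shift the return distribution.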
Although Equation 28 reduces the multiperiod problem (Equation 26) to a series of one-
period problems, these one-period problems may look quite different from the problem con-
sidered in Section 2 because of the interaction between the state variables X and wealth W.
Indeed, when there is no X (so returns are iid), Samuelson (1969) shows that Equation 26
denote the $N \times d$ matrix of loadings on the Brownian motions. Assume that the price
process for asset $i$, $i = 1, \ldots, N$ is given by
$$\frac{dP_t^{(i)}}{P_t^{(i)}} = \left(\lambda_i(X_t) + r_f(X_t)\right) dt + \sigma_i(X_t)\, dB_t, \tag{29}$$
where $r_f = \log R_f$. I assume $X_t$ follows a Markov process:
Assumptions in Equations 29 and 30 imply that the current values of the state variables
at time $t$ fully determine the investment opportunities that are available to the investor.
That is, they determine the investment opportunity set.
Merton (1971) shows that under the assumptions above, wealth follows the process
$$dW_t = \left(W_t z_t^\top \lambda(X_t) + W_t r_f(X_t) - C_t\right) dt + W_t z_t^\top \sigma(X_t)\, dB_t. \tag{31}$$
Merton (1973) derives a partial differential equation characterizing the value function J.
Moreover, he shows that the first-order condition with respect to z leads to the following
characterization of z in terms of derivatives of J:
$$z = -\frac{J_W}{J_{WW}W}\,(\sigma\sigma^\top)^{-1}\lambda \;-\; \frac{1}{J_{WW}W}\,(\sigma\sigma^\top)^{-1}\sigma a^\top J_{XW}, \tag{32}$$
where $J_W$, $J_{WW}$, and $J_{XW}$ refer to first and second partial derivatives of J. Here and in
what follows, I eliminate time subscripts and function arguments when not required
for clarity. I show in the Appendix (section below) that the value function takes the
form
of which depends on the process for X. Note that in the discrete-time setting when one
period remains, the value function depends only on wealth, not on X. The same is true in
continuous time; in the limit, as the horizon approaches 0, the value function’s dependence
on X also approaches zero. Therefore, as the horizon approaches 0, only the first term
remains. As a result, Merton (1973) refers to this term as what the investor would choose
if he behaved myopically, namely if, similar to the discrete-time investor with one period
left, he took into account only the very immediate future and did not look beyond.
Given that myopic demand captures, in a limiting sense, the desired allocation of a one-
period investor, how does it compare with the results derived in Section 2? Consider for
simplicity the case of a single risky asset. In this case, $\lambda$ corresponds to the (instantaneous)
expected excess return on the asset and $\sigma\sigma^\top$ to the (instantaneous) variance. Indeed, Ito’s
Lemma implies that for an asset with price $P_t$,
$$d\log P_t = \left(\lambda + r_f - \tfrac{1}{2}\sigma\sigma^\top\right)dt + \sigma\,dB_t,$$
so that, assuming units are the same, $E_t[y_{t+1}] \approx \lambda - \tfrac{1}{2}\sigma\sigma^\top$ and $\mathrm{Var}_t[y_{t+1}] \approx \sigma\sigma^\top$. Myopic
demand therefore closely resembles Equation 10. The main difference is that Equation 10
is approximate, whereas Equation 34 is exact. Recall that in the setting of Section 2 (indeed
in any discrete-time setting) power preferences rule out levered positions or short positions
in the stock at any horizon. However, when trading is continuous, the agent can exit these
positions in time to avoid negative wealth. This property, which is not without controversy,
plays a key role in making the continuous-time model tractable.
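For a single risky asset the myopic term can be evaluated directly. The sketch below assumes that myopic demand is the first term of Equation 32 with $-J_W/(J_{WW}W) = 1/\gamma$, which is what a value function homogeneous in wealth delivers under power utility; the function names and the parameter values are hypothetical and chosen only for illustration.

```python
def myopic_share(lam, sigma, gamma):
    """Myopic allocation (1/gamma) * lam / sigma**2 for one risky asset."""
    return lam / (gamma * sigma ** 2)

def excess_log_return_moments(lam, sigma):
    """Approximate mean and variance of the excess log return per unit of time,
    read off the Ito expansion of log P (drift net of the riskless rate)."""
    return lam - 0.5 * sigma ** 2, sigma ** 2

# Illustrative annualized inputs: 6% expected excess return, 20% volatility.
share = myopic_share(lam=0.06, sigma=0.20, gamma=5.0)
mean_y, var_y = excess_log_return_moments(lam=0.06, sigma=0.20)
```

With these inputs the instantaneous excess return of 6% corresponds to an expected excess log return of about 4%, which is the Jensen adjustment the text refers to when comparing the continuous-time and discrete-time expressions.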
Myopic demand, then, is the continuous-time analog of the static portfolio choice
described in Section 2. In contrast, the second term in Equation 34 is completely new. As
Merton (1973) shows, this term represents the agent’s efforts to hedge future changes in the
investment opportunity set. There are two offsetting motives: On the one hand, the inves-
tor would like more wealth in states with superior investment opportunities, all the better
to take advantage of them. On the other hand, the investor would like more wealth in
states with poorer investment opportunities, so as to lessen the overall risk to long-term
wealth. The former is a substitution effect; the latter is an income effect.
To see how these motives are represented by Equation 34, consider the case with a
single state variable. Note that the sign of $J_X$ equals the sign of $I_X$. Define an increase in X
to indicate an improvement in investment opportunities if and only if it increases the
agent’s utility, namely if and only if $J_X > 0$. [Merton (1973) discusses hedging motives in
terms of the consumption-wealth ratio rather than the value function. I explore the link to
the consumption-wealth ratio in what follows.] If an asset positively covaries with the
stock, hedging demand is negative so long as $\gamma$ is greater than 1 and positive so long as $\gamma$ is
less than 1. In effect, the agent with $\gamma > 1$ reduces his investment to an asset that pays off in
states with superior investment opportunities (the income effect dominates), whereas the
agent with $\gamma < 1$ increases his investment to such an asset (the substitution effect dominates).
Logarithmic utility ($\gamma = 1$) corresponds to the knife-edge case when these effects
cancel each other out.
To go further, it is necessary to learn more about the function $I(X, \tau)$. This function
depends on the parameters in Equations 29 and 30, so it will embody an empirical statement
about the distribution of returns. Applying the theory above to estimated processes for
returns is one way the literature has built on the insights in Merton (1973). A second source
of innovation is in the type of utility function considered (see next section).
developed by Duffie & Epstein (1992a, 1992b). Let $V_t$ denote the remaining utility.
Following Duffie and Epstein, I use the notation V to denote the utility process and the
notation J to denote optimized utility as a function of wealth, the state variables, and the
horizon. At the optimum, $V_t = J(W_t, X_t, T - t)$. Duffie and Epstein specify $V_t$ as follows:
$$V_t = E_t \int_t^T f(C_s, V_s)\,ds, \tag{35}$$
where
$$f(C,V) = \begin{cases}
\dfrac{\beta}{1-\frac{1}{\psi}}\,\big((1-\gamma)V\big)\left[\left(\dfrac{C}{\big((1-\gamma)V\big)^{\frac{1}{1-\gamma}}}\right)^{1-\frac{1}{\psi}} - 1\right], & \psi \ne 1, \\[2ex]
\beta\,\big((1-\gamma)V\big)\left(\log C - \dfrac{1}{1-\gamma}\log\big((1-\gamma)V\big)\right), & \psi = 1.
\end{cases} \tag{36}$$
Duffie & Epstein (1992a) show that the parameter $\psi > 0$ can be interpreted as the
elasticity of intertemporal substitution (EIS) and $\gamma > 0$ can be interpreted as relative risk
aversion. When $\gamma = 1/\psi$, power preferences given in Equation 26 are recovered (note that
the resulting formulation of $V_t$ may not take the same form as Equation 26 but will imply
the same underlying preferences and therefore the same choices).
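The two branches of the aggregator in Equation 36 can be written as one function, which also makes it easy to check numerically that the $\psi = 1$ branch is the limit of the $\psi \ne 1$ branch. This is a minimal sketch: the function name `aggregator` and the input values are hypothetical, and the code assumes $\gamma \ne 1$ and $(1-\gamma)V > 0$ so that the powers and logarithms are defined.

```python
import math

def aggregator(C, V, beta, gamma, psi):
    """Duffie-Epstein aggregator f(C, V) in the form of Equation 36.

    Assumes gamma != 1, C > 0, and (1 - gamma) * V > 0.
    """
    y = (1.0 - gamma) * V
    if C <= 0 or y <= 0:
        raise ValueError("need C > 0 and (1 - gamma) * V > 0")
    if psi == 1.0:
        # Unit-EIS branch.
        return beta * y * (math.log(C) - math.log(y) / (1.0 - gamma))
    eps = 1.0 - 1.0 / psi
    ratio = C / y ** (1.0 / (1.0 - gamma))
    return beta / eps * y * (ratio ** eps - 1.0)

# Illustrative inputs: gamma > 1 implies V < 0, so (1 - gamma) * V is positive.
f_unit = aggregator(C=1.2, V=-0.5, beta=0.04, gamma=5.0, psi=1.0)
f_near = aggregator(C=1.2, V=-0.5, beta=0.04, gamma=5.0, psi=1.0 + 1e-6)
```

The two values agree closely because $(x^{\varepsilon} - 1)/\varepsilon \to \log x$ as $\varepsilon = 1 - 1/\psi \to 0$.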
Results in Duffie & Epstein (1992a) show that the first-order condition for portfolio
allocation (Equation 32) and the first-order condition for consumption $f_C = J_W$ derived by
Merton (1973) are valid in this more general setting. Below, I use these results to characterize
optimal consumption and investment behavior, considering the case of $\psi \ne 1$ and
$\psi = 1$ separately.⁶
⁵ Kihlstrom (2009) develops an alternative approach to separating the inverse of the elasticity of substitution and risk
aversion within an expected-utility framework.
⁶ Interesting questions of existence and uniqueness of solutions are beyond the scope of this study. Schroder &
Skiadas (1999) provide such results assuming bounded investment opportunities and a utility function that generalizes
the recursive utility case considered here. Wachter (2002) proves existence in the return predictability case (Section 3.3)
under power utility with risk aversion greater than 1. Dybvig & Huang (1988) and Dybvig et al. (1999) provide further
existence results under power utility.
3.2.1. Characterizing the solution when the EIS does not equal 1. As shown in the Appendix
(see below), so long as $\psi \ne 1$, the form of the value function (Equation 33), and
therefore the form of optimal allocation (Equation 34), still holds. Myopic demand takes
the same form as under power utility: It is determined by $\gamma$ alone. The parameter $\gamma$ also
determines whether the income or substitution effect dominates in the portfolio decision.
These results support the interpretation of $\gamma$ as risk aversion in this more general model.
It is also instructive to consider the consumption policy. Define a function H as follows:
$$\frac{W_t}{C_t} = H(X_t, T - t). \tag{38}$$
It follows from Equation 37 that
$$\frac{I_X}{I} = -\frac{1}{1-\psi}\,\frac{H_X}{H}. \tag{39}$$
Recall that the sign of $I_X$ equals the sign of $J_X$, the derivative of the value function
with respect to the state variables. As in the asset allocation decision, there are two
effects that changes in investment opportunities could have on consumption behavior.
On the one hand, an improvement could lead investors to consume less out of wealth, to
better take advantage of the opportunities (the substitution effect). On the other, an
improvement raises wealth in the long run, allowing the investor to consume more today
(the income effect). Equation 39 shows that, for investors who are relatively willing to
substitute intertemporally ($\psi > 1$), consumption falls relative to wealth when investment
opportunities rise (the substitution effect dominates). For investors who are relatively
unwilling to substitute intertemporally ($\psi < 1$), consumption rises (the income effect
dominates). These results support the interpretation of $\psi$ as the elasticity of intertemporal
substitution.
Substituting into the Bellman equation (Equation 53) leads to the following differential equation
for H:
with boundary condition $H(X, 0) = 0$. Equation 40 is useful in considering the special cases
below.
The solution is
$$H(\tau) = \frac{1}{k(1-\psi)}\left(1 - e^{-k(1-\psi)\tau}\right), \tag{42}$$
where
$$k = \frac{1}{2}\,\frac{1}{\gamma}\,\lambda^\top(\sigma\sigma^\top)^{-1}\lambda + r_f - \beta\left(1 - \frac{1}{\psi}\right)^{-1}.$$
The first two terms in k provide a measure of the quality of investment opportunities.
For $\psi > 1$, $H(\tau)$ is increasing in k. This follows from the fact that $H(0) = 0$ and that $H'(\tau)$ is
increasing in k for any (fixed) $\tau > 0$. As such, the greater the investment opportunities are,
the less the investor consumes out of wealth. Note that the discount rate enters k with a
negative sign: Whereas an increase in investment opportunities causes the investor to
consume less out of wealth, an increase in the discount rate causes the investor to consume
more. For $\psi < 1$, $H(\tau)$ is decreasing in k. The greater the investment opportunities are, the
more the investor consumes out of wealth. Also, an increase in investment opportunities
and an increase in the discount rate both lead the investor to consume more and save less as a
percentage of wealth.
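The comparative statics above can be checked numerically from Equation 42 and the definition of k. The sketch below specializes to a single risky asset, so that $\lambda^\top(\sigma\sigma^\top)^{-1}\lambda = \lambda^2/\sigma^2$; the function names and all parameter values are hypothetical.

```python
import math

def k_constant(lam, sigma, rf, beta, gamma, psi):
    """k for one risky asset: lam**2 / (2 * gamma * sigma**2) + rf - beta / (1 - 1/psi)."""
    return 0.5 * lam ** 2 / (gamma * sigma ** 2) + rf - beta / (1.0 - 1.0 / psi)

def wealth_consumption_ratio(tau, k, psi):
    """H(tau) = (1 - exp(-k (1 - psi) tau)) / (k (1 - psi)), as in Equation 42."""
    a = k * (1.0 - psi)
    if a == 0.0:
        return tau                 # limit of the expression as k (1 - psi) -> 0
    return (1.0 - math.exp(-a * tau)) / a

def H_at(lam, psi, tau=10.0):
    """Ten-year wealth-consumption ratio for illustrative parameter values."""
    k = k_constant(lam=lam, sigma=0.20, rf=0.02, beta=0.04, gamma=5.0, psi=psi)
    return wealth_consumption_ratio(tau, k, psi)

# Better opportunities (higher lam) raise W/C when psi > 1 and lower it when psi < 1.
high_eis = (H_at(0.04, psi=1.5), H_at(0.06, psi=1.5))
low_eis = (H_at(0.04, psi=0.5), H_at(0.06, psi=0.5))
```

For these inputs the ratio rises with the expected excess return when $\psi = 1.5$ and falls when $\psi = 0.5$, matching the substitution and income effects described in the text.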
Power utility and complete markets. In this setting without trading restrictions, markets
are complete if and only if the diffusion terms for asset prices span the diffusion terms
for X. The term $a\sigma^\top(\sigma\sigma^\top)^{-1}$ represents the projection of the diffusion terms for X on the
diffusion terms for $P^{(i)}$; therefore, markets are complete if and only if
$$a\sigma^\top(\sigma\sigma^\top)^{-1}\sigma = a,$$
namely if the projection recovers the diffusion terms on X. Further note that, because
tr(AB) = tr(BA) for conforming matrices,
with $F(X, 0) = 1$. To see this, note that it follows from integration by parts that
$$\int_0^\tau \frac{\partial F}{\partial s}\,ds = F(X,\tau) - F(X,0) = H_\tau - 1.$$
incomplete markets (He & Pearson 1991, Cuoco 1997) and to recursive utility (Duffie &
Skiadas 1994, Schroder & Skiadas 1999, Skiadas 2007), it is less straightforward in these
cases.
3.2.2. Characterizing the solution when the EIS equals 1. In the case of $\psi = 1$, the value
function takes the form
$$J(W,X,\tau) = \frac{W^{(1-\gamma)(1-e^{-\beta\tau})}\,G(X,\tau)^{1-\gamma}}{1-\gamma}.$$
The differential equation for G is given in the Appendix (see below). Equation 32 still holds
(see Duffie & Epstein 1992a), implying that the portfolio allocation is given by
$$z = \frac{1}{1-(1-\gamma)(1-e^{-\beta\tau})}\,(\sigma\sigma^\top)^{-1}\lambda + \frac{1-\gamma}{1-(1-\gamma)(1-e^{-\beta\tau})}\,(\sigma\sigma^\top)^{-1}\sigma a^\top\frac{G_X^\top}{G}. \tag{46}$$
As in the case of $\psi \ne 1$, the portfolio allocation separates into two terms, the first of which
can be interpreted as myopic demand (because it does not depend on future investment
opportunities) and the second as hedging demand.
Myopic demand is horizon dependent when $\psi = 1$. When the horizon is large (as $\tau \to \infty$),
myopic demand approaches the myopic demand in Equation 34, namely it is determined
by $\gamma$ only. However, for finite horizons, myopic demand is determined by a weighted
average of $\psi^{-1}$ (= 1) and $\gamma$, with the horizon determining the weights:
$$1 - (1-\gamma)(1-e^{-\beta\tau}) = e^{-\beta\tau} + \gamma(1-e^{-\beta\tau}).$$
The first-order condition $f_C = J_W$ applied to Equation 46 implies that the wealth-consumption
ratio is given by $W/C = (1-e^{-\beta\tau})/\beta$. Unlike in the $\psi \ne 1$ case, the wealth-consumption
ratio does not depend on investment opportunities. Unit EIS corresponds to the knife-edge
case where the substitution and income effects cancel each other out, as far as consumption
behavior is concerned. In the limiting case of an infinite horizon, the wealth-consumption
ratio is constant and equal to $\beta^{-1}$.
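A short numerical illustration of the two unit-EIS results: the horizon-dependent coefficient that scales myopic demand, and the wealth-consumption ratio. The function names and the values of $\beta$, $\gamma$, and $\tau$ are hypothetical; the formulas are the weighted-average identity and the ratio $(1-e^{-\beta\tau})/\beta$ stated in the text.

```python
import math

def effective_risk_aversion(gamma, beta, tau):
    """Weighted average of 1 (the inverse EIS) and gamma, with weights
    exp(-beta * tau) and 1 - exp(-beta * tau)."""
    w = math.exp(-beta * tau)
    return w * 1.0 + (1.0 - w) * gamma

def wealth_consumption_unit_eis(beta, tau):
    """W/C = (1 - exp(-beta * tau)) / beta when the EIS equals 1."""
    return (1.0 - math.exp(-beta * tau)) / beta

gamma, beta = 5.0, 0.04
for tau in (0.0, 1.0, 10.0, 1000.0):
    denom = 1.0 - (1.0 - gamma) * (1.0 - math.exp(-beta * tau))
    # Both sides of the identity in the text agree up to rounding error.
    assert abs(denom - effective_risk_aversion(gamma, beta, tau)) < 1e-12
```

At $\tau = 0$ the coefficient equals 1 and at long horizons it approaches $\gamma$, while the wealth-consumption ratio rises from 0 toward $1/\beta$ (25 for the $\beta$ used here).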
3.3.1. When are exact solutions available? More explicit solutions for the value function,
and therefore for portfolio and consumption choices, are available in two special
cases of the above analysis: (a) when $\psi$ is equal to 1 and (b) when power utility obtains
($\gamma = 1/\psi$) and markets are complete. Schroder & Skiadas (1999) (who also assume
complete markets) and Campbell et al. (2004) (who also assume an infinite horizon)
consider the first case. Here I further consider this case, allowing markets to be incomplete
and the horizon to be finite. The second case is the subject of Wachter (2002). In
a related contribution, Kim & Omberg (1996) show that one can also obtain closed-form
solutions for portfolio choice when the investor maximizes power utility over
terminal wealth.
Indeed, when $\psi = 1$, the value function is given by Equation 60, with G taking the
form
$$G(X,\tau) = \exp\left\{A_1^{(1)}(\tau)\frac{X^2}{2} + A_2^{(1)}(\tau)X + A_3^{(1)}(\tau)\right\} \tag{47}$$
and where $A_i^{(1)}$ satisfy a system of ordinary differential equations with boundary conditions
$A_i^{(1)}(0) = 0$. For power utility and complete markets, the wealth-consumption ratio $H(X,\tau)$
is given by Equation 44, where
$$F(X,\tau) = \exp\left\{A_1^{(2)}(\tau)\frac{X^2}{2} + A_2^{(2)}(\tau)X + A_3^{(2)}(\tau)\right\}. \tag{48}$$
Substituting into Equation 45 results in a set of ordinary differential equations for $A_i^{(2)}$ with
boundary conditions $A_i^{(2)}(0) = 0$.
$$e^{-h} \approx e^{E[-h]} + \left(-h - E[-h]\right)e^{E[-h]}. \tag{49}$$
Let $h_1 = e^{E[-h]}$ and $h_0 = h_1(1 - \log h_1)$. Then Equation 49 implies
$$\cdots - \frac{1}{1-\psi}\left(h_0 - h_1\log H\right) - \beta\left(1-\frac{1}{\psi}\right)^{-1} \approx 0.$$
Observe that this differential equation is similar in form to the one for the value
function in the $\psi = 1$ case (given by Equation 65). In fact, it is simpler in that there is no
time dependence. It follows that the approximation method can be implemented in any
setting where $\psi = 1$ yields an exact solution. Under the above assumptions on the asset-return
process and state-variable processes,
$$H(X,\tau) \approx \exp\left\{A_1^{(3)}\frac{X^2}{2} + A_2^{(3)}X + A_3^{(3)}\right\},$$
where $A_i^{(3)}$ can be determined by matching coefficients. Campbell & Viceira (1999) use this
approximation to show that, in the infinite-horizon problem, portfolio decisions are
driven, almost entirely, by risk aversion $\gamma$.
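The first-order expansion in Equation 49 replaces $H^{-1} = e^{-h}$ with the linear expression $h_0 - h_1 \log H$. The sketch below assumes $h = \log H$ (inferred from the way $h_0$ and $h_1$ enter the approximation) and uses a hypothetical mean log wealth-consumption ratio; the function names are illustrative.

```python
import math

def loglinear_coefficients(mean_h):
    """h1 = exp(E[-h]) and h0 = h1 * (1 - log h1), where mean_h stands for E[h]."""
    h1 = math.exp(-mean_h)
    h0 = h1 * (1.0 - math.log(h1))
    return h0, h1

def approx_inverse_H(H, h0, h1):
    """Linear-in-log approximation to 1/H around the mean of log H."""
    return h0 - h1 * math.log(H)

mean_h = math.log(20.0)             # hypothetical: W/C fluctuates around 20
h0, h1 = loglinear_coefficients(mean_h)

exact_at_mean = 1.0 / 20.0
approx_at_mean = approx_inverse_H(20.0, h0, h1)
error_nearby = abs(1.0 / 21.0 - approx_inverse_H(21.0, h0, h1))
```

The approximation is exact at the expansion point and its error is second order in the deviation of $\log H$ from its mean, which is why it works well when the wealth-consumption ratio is not too variable.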
3.3.3. Numerical results. When calibrated to reasonable values, what do these dynamic
considerations add to the asset allocation problem? In what follows, I present results from
Wachter (2002); the near-perfect negative correlation between the dividend yield and the
stock return makes it reasonable to assume that markets are complete. I calibrate this
model using the same parameters as used by Barberis (2000) and assume a risk aversion
of 5, so the results are quantitatively comparable to those discussed in Section 2.⁷ Similar
results are found using alternative specifications and methods (e.g., Brennan et al. 1997,
Brandt 1999, Balduzzi & Lynch 1999).
Figure 3 shows the optimal allocation as a function of horizon for various levels of the
dividend yield. As in the static case, there are substantial horizon and market timing
effects. However, in this case, rather than decreasing (slowly) in the horizon, the degree to
which the allocation varies with the dividend yield is even more marked for long-term
investors than for short-term investors. This greater dependence results from hedging
demand. Because the dividend yield (proportional to X) is negatively correlated with stock
returns, hedging demand leads the investor to allocate more money to stocks for $\gamma > 1$.
⁷ Wachter (2002) provides details on how the discrete-time results are used to calibrate the continuous-time model.
However, that paper calibrates the model mistakenly assuming that the process in Barberis (2000) applies to the net
return on equities rather than the excess return. These results correct that mistake.
[Figure 3 plot. Vertical axis: allocation to stocks. Horizontal axis: horizon (years).]
Figure 3
Dynamic allocation as a function of horizon assuming return predictability and that the investor can
trade continuously. The solid line corresponds to the optimal allocation when the dividend yield is at
its sample mean (3.75%). The dash-dotted lines correspond to the allocations when the dividend yield
is one standard deviation below or above its mean (2.91% and 4.59%, respectively). The dotted lines
correspond to the allocations when the dividend yield is two standard deviations below or above its
mean (2.06% and 5.43%, respectively). The agent has power utility over consumption (lines with
circles) or over terminal wealth (lines without circles) with risk aversion equal to five. Note that the
allocations increase as a function of the dividend yield. The model is estimated over monthly data from
1952 to 1995.
The greater the dividend yield is, the more the investor cares about this hedge, which is
why hedging demand makes market timing more extreme. Unlike the simpler horizon
effect in Section 2, this effect reverses for $\gamma < 1$. The long-horizon investor with $\gamma < 1$ holds
less in stock than does the short-horizon investor.
allocation. Given this limiting result in continuous time, it is perhaps not surprising that
estimation risk should have little effect at short horizons as shown in Section 2.
What does have an effect, and a large one, is learning. Hedging demand induced by
learning is negative and can be substantial (Brennan 1998). The reason is that the investor’s
estimate of the average return (effectively a state variable) is positively correlated with
realized returns. When a positive shock to prices occurs, the investor updates his beliefs
about the average return, estimating it to be higher than before. Thus, stocks are less
attractive to an investor with $\gamma > 1$ (see Equation 34).
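The mechanism behind learning-induced hedging demand, namely that the estimate of the mean return moves in the same direction as realized returns, can be illustrated with a textbook conjugate update. This is a discrete-time stand-in rather than the continuous-time filtering problem studied by Brennan (1998): it assumes a normal prior on the mean, a known return variance, and one observed return. The function name and numbers are hypothetical.

```python
def update_mean_belief(prior_mean, prior_var, realized_return, return_var):
    """Normal-normal Bayesian update of the belief about the mean return.

    The posterior mean moves toward the realized return by a gain that is
    larger when the prior is more diffuse relative to the return variance.
    """
    gain = prior_var / (prior_var + return_var)
    post_mean = prior_mean + gain * (realized_return - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var

# Prior: mean 6% with standard deviation 2%; return standard deviation 20%.
after_good_year, _ = update_mean_belief(0.06, 0.02 ** 2, 0.20, 0.20 ** 2)
after_bad_year, _ = update_mean_belief(0.06, 0.02 ** 2, -0.10, 0.20 ** 2)
```

A return above the prior mean raises the estimate and a return below it lowers the estimate, so the estimated mean is positively correlated with realized returns, as described in the text.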
Uncertainty about parameters other than the mean is harder to address because it does
not lend itself to closed-form solutions. Studies, therefore, have explored this question
using numerical methods. Xia (2001) allows the investor to be uncertain about the degree
of predictability (the coefficient $\beta$ in Equation 3) and assumes the other parameters are
known. She decomposes hedging demand into the component to hedge learning about $\beta$
and the component to hedge changes in $X_t$. Learning-induced hedging demand decreases in
the difference between the dividend yield and its mean. Moreover, it switches in sign:
It is positive when the dividend yield is below its long-run mean, zero when it is at the
long-run mean, and negative when it is above the long-run mean. As Xia (2001) shows,
these properties make the overall allocation less variable compared to the no-learning case.
However, the allocation is still more variable than implied by the myopic strategy.
Brandt et al. (2005) and Skoulakis (2007) undertake solving the asset allocation prob-
lem when there is uncertainty about the full set of parameters. The lack of closed-form
solutions and the high dimensionality of the problem make this a formidable technical
challenge. These studies show that, in addition to the effect noted by Xia (2001), uncer-
tainty about the mean (as in Brennan 1998) exerts an important influence, driving down
the average allocation relative to that discussed above. Although the net effect of hedging
demand is under dispute, the market-timing effect remains alive and well.
4. CONCLUDING REMARKS
In this study, I review the literature on static and dynamic asset allocation, with a focus on
the implications of return predictability for long-run investors. For both buy-and-hold and
dynamically trading investors, the optimal allocation to stocks is greater the longer the
horizon, given reasonable assumptions on preferences. This similarity should not obscure
some key differences. In the static case, the effect of any stationary variable on the alloca-
tion will diminish as the horizon grows. In the dynamic case, there is no reason for this to
happen, and indeed the opposite may be true. In effect, for investors who dynamically
trade, even short-term variables can have long-term implications.
This survey also highlights efforts to introduce parameter uncertainty into the agent’s
decision process. This, in theory, serves to pass on some of the uncertainty faced by the
econometrician to the agent; the agent now incorporates this estimation risk into his deci-
sions. Empirically, however, estimation risk appears to have very little effect, except at long
buy-and-hold horizons (at least for the specifications explored herein). This is not to say that
the perfect- and imperfect-information cases are identical. Indeed, learning can induce impor-
tant hedging demands in the dynamic setting. Furthermore, I show in the static setting that
the choice of prior and likelihood can have a large impact on the results. The notion of
uninformative priors is less than clear in a predictive regression setting. Moreover, economic
theory indicates a possible role for unapologetically informative priors that take this theory
$$f(C,V) = \begin{cases}
\dfrac{\beta}{1-\frac{1}{\psi}}\,\big((1-\gamma)V\big)\left[\left(\dfrac{C}{\big((1-\gamma)V\big)^{\frac{1}{1-\gamma}}}\right)^{1-\frac{1}{\psi}} - 1\right], & \psi \ne 1, \\[2ex]
\beta\left(x^{1-\gamma} + (1-\gamma)V\right)\left(\log C - \dfrac{1}{1-\gamma}\log\left(x^{1-\gamma} + (1-\gamma)V\right)\right), & \psi = 1.
\end{cases} \tag{52}$$
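Furthermore, note that
$$C J_W = \begin{cases}
\beta^{\psi} J_W^{1-\psi}\big((1-\gamma)J\big)^{\frac{1-\gamma\psi}{1-\gamma}}, & \psi \ne 1, \\[1ex]
\beta\left(x^{1-\gamma} + (1-\gamma)J\right), & \psi = 1.
\end{cases} \tag{56}$$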
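For $\psi \ne 1$, substituting into Equation 53 from Equations 55, 56, and 32 implies
$$\begin{aligned}
&-J_\tau + J_X b - \frac{1}{2}\frac{J_W^2}{J_{WW}}\,\lambda^\top(\sigma\sigma^\top)^{-1}\lambda - \frac{J_W}{J_{WW}}\,J_{XW}^\top a\sigma^\top(\sigma\sigma^\top)^{-1}\lambda \\
&\quad + J_W W r_f + \frac{1}{2}\mathrm{tr}\left(a^\top J_{XX}\,a\right) - \frac{1}{2}\frac{1}{J_{WW}}\,J_{XW}^\top a\sigma^\top(\sigma\sigma^\top)^{-1}\sigma a^\top J_{XW} \\
&\quad + \frac{1}{\psi-1}\,\beta^{\psi} J_W^{1-\psi}\big((1-\gamma)J\big)^{\frac{1-\gamma\psi}{1-\gamma}} - \beta\left(1-\frac{1}{\psi}\right)^{-1}(1-\gamma)J = 0.
\end{aligned} \tag{57}$$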
Guess
$$\frac{1}{1-\gamma}\log\left(x^{1-\gamma} + (1-\gamma)J(W,X,\tau)\right) = q(\tau)\log W + \log G(X,\tau). \tag{60}$$
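Derivatives of J can be found by implicitly differentiating on both sides of Equation 60:
$$\begin{aligned}
J_\tau &= \left(x^{1-\gamma} + (1-\gamma)J\right)\left(q'\log W + \frac{G_\tau}{G}\right) \\
J_W &= \left(x^{1-\gamma} + (1-\gamma)J\right)q\,\frac{1}{W} \\
J_X &= \left(x^{1-\gamma} + (1-\gamma)J\right)\frac{G_X}{G}.
\end{aligned} \tag{61}$$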
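Substituting Equations 60–62 into Equation 59 and dividing by $x^{1-\gamma} + (1-\gamma)J$ leads to the
following:
$$\begin{aligned}
&-q'\log W - \frac{G_\tau}{G} + \frac{G_X}{G}\,b + \frac{1}{2}\frac{q}{1-q(1-\gamma)}\,\lambda^\top(\sigma\sigma^\top)^{-1}\lambda
 + \frac{q(1-\gamma)}{1-q(1-\gamma)}\frac{G_X}{G}\,a\sigma^\top(\sigma\sigma^\top)^{-1}\lambda \\
&\quad + q r_f + \frac{1}{2}\mathrm{tr}\left(-\gamma\,\frac{a^\top G_X^\top G_X\,a}{G^2} + \frac{a^\top G_{XX}\,a}{G}\right)
 + \frac{1}{2}\frac{q(1-\gamma)^2}{1-q(1-\gamma)}\frac{G_X}{G}\,a\sigma^\top(\sigma\sigma^\top)^{-1}\sigma a^\top\frac{G_X^\top}{G} \\
&\quad - \beta - \beta\gamma\,(q\log W + \log G) + \beta\log(\beta q^{-1}) + \beta\log W - \beta(1-\gamma)(q\log W + \log G) = 0.
\end{aligned}$$
$$-q' + \beta - \beta q = 0,$$
$$\begin{aligned}
&-\frac{G_\tau}{G} + \frac{G_X}{G}\,b + \frac{1}{2}\frac{q}{1-q(1-\gamma)}\,\lambda^\top(\sigma\sigma^\top)^{-1}\lambda
 + \frac{q(1-\gamma)}{1-q(1-\gamma)}\frac{G_X}{G}\,a\sigma^\top(\sigma\sigma^\top)^{-1}\lambda \\
&\quad + q r_f + \frac{1}{2}\mathrm{tr}\left(-\gamma\,\frac{a^\top G_X^\top G_X\,a}{G^2} + \frac{a^\top G_{XX}\,a}{G}\right)
 + \frac{1}{2}\frac{q(1-\gamma)^2}{1-q(1-\gamma)}\frac{G_X}{G}\,a\sigma^\top(\sigma\sigma^\top)^{-1}\sigma a^\top\frac{G_X^\top}{G} \\
&\quad - \beta - \beta\gamma\log G + \beta\log(\beta q^{-1}) - \beta(1-\gamma)\log G = 0.
\end{aligned} \tag{65}$$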
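As a worked step, the condition $-q' + \beta - \beta q = 0$ is a linear ordinary differential equation that can be solved with an integrating factor. The boundary condition $q(0) = 0$ is an assumption here; it is chosen because it reproduces the exponent of W in the $\psi = 1$ value function of Section 3.2.2.

```latex
\begin{aligned}
q'(\tau) &= \beta\,\bigl(1 - q(\tau)\bigr), \qquad q(0) = 0, \\
\frac{d}{d\tau}\Bigl[e^{\beta\tau} q(\tau)\Bigr] &= e^{\beta\tau}\bigl(q'(\tau) + \beta q(\tau)\bigr) = \beta e^{\beta\tau}, \\
e^{\beta\tau} q(\tau) &= \int_0^{\tau} \beta e^{\beta s}\,ds = e^{\beta\tau} - 1, \\
q(\tau) &= 1 - e^{-\beta\tau}.
\end{aligned}
```

With this q, the term $q(1-\gamma)$ equals $(1-\gamma)(1-e^{-\beta\tau})$, and $1 - q(1-\gamma)$ is the horizon-dependent denominator that appears in Equation 46.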
DISCLOSURE STATEMENT
The author is not aware of any affiliations, memberships, funding, or financial holdings
that might be perceived as affecting the objectivity of this review.
ACKNOWLEDGMENTS
I thank Itamar Drechsler, Lubos Pastor, and Moto Yogo for helpful comments and Jerry
Tsai for excellent research assistance.
LITERATURE CITED
Ait-Sahalia Y, Hansen LP, eds. 2009. Handbook of Financial Econometrics: Volume 1—Tools and
Techniques. Amsterdam: North Holland. 808 pp.
Andrews DWK. 1993. Exactly median-unbiased estimation of first order autoregressive/unit root
models. Econometrica 61:139–65
Ang A, Bekaert G. 2007. Stock return predictability: Is it there? Rev. Financ. Stud. 20:651–707
Avramov D. 2002. Stock return predictability and model uncertainty. J. Financ. Econ. 64:423–58
Avramov D. 2004. Stock return predictability and asset pricing models. Rev. Financ. Stud. 17:699–738
Avramov D, Zhou G. 2010. Bayesian portfolio analysis. Annu. Rev. Financ. Econ. 2: In press
Balduzzi P, Lynch AW. 1999. Transaction costs and predictability: some utility cost calculations.
J. Financ. Econ. 52:47–78
Barberis N. 2000. Investing for the long run when returns are predictable. J. Financ. 55:225–64
Berger JO. 1985. Statistical Decision Theory and Bayesian Analysis. New York: Springer
Bossaerts P, Hillion P. 1999. Implementing statistical criteria to select return forecasting models: What
Duffie D, Epstein LG. 1992a. Asset pricing with stochastic differential utility. Rev. Financ. Stud.
5:411–36
Karatzas I, Lehoczky JP, Shreve SE. 1987. Optimal portfolio and consumption decisions for a small
investor on a finite horizon. SIAM J. Contr. Optim. 25:1557–86
Keim DB, Stambaugh RF. 1986. Predicting returns in the stock and bond markets. J. Financ. Econ.
17:357–90
Kihlstrom R. 2009. Risk aversion and the elasticity of substitution in general dynamic portfolio
theory: consistent planning by forward looking, expected utility maximizing investors. J. Math.
Econ. 45:634–63
Kim TS, Omberg E. 1996. Dynamic nonmyopic portfolio behavior. Rev. Financ. Stud. 9:141–61
Kothari SP, Shanken J. 1997. Book-to-market, dividend yield, and expected market returns: a time-
series analysis. J. Financ. Econ. 44:169–203
Kreps D, Porteus E. 1978. Temporal resolution of uncertainty and dynamic choice theory.
Econometrica 46:185–200
Lettau M, Ludvigson SC. 2001. Consumption, aggregate wealth and expected stock returns. J. Financ.
56:815–49
Lewellen J. 2004. Predicting returns with financial ratios. J. Financ. Econ. 74:209–35
Liu J. 2007. Portfolio selection in stochastic environments. Rev. Financ. Stud. 20:1–39
Maenhout P. 2006. Robust portfolio rules and detection-error probabilities for a mean-reverting risk