Identification, Estimation and Testing of Conditionally Heteroskedastic Factor Models
Gabriele Fiorentini
University of Alicante
Enrique Sentana
CEMFI
Abstract
We investigate several important inference issues for factor models with dynamic heteroskedasticity in the common factors. First, we show that such models are identified if we take into account the time-variation in the variances of the factors. Our results also apply to dynamic versions of the APT, dynamic factor models, and vector autoregressions. Secondly, we propose a consistent two-step estimation procedure which does not rely on knowledge of any factor estimates, and explain how to compute correct standard errors. Thirdly, we develop a simple preliminary LM test for the presence of ARCH effects in the common factors. Finally, we conduct a Monte Carlo analysis of the finite sample properties of the proposed estimators and hypothesis tests.
Introduction
In recent years, increasing attention has been paid to modelling the observed changes in the volatility of many economic and financial time series. By and large, though, most theoretical and applied research in this area has concentrated on univariate series. However, many issues in finance, such as tests of asset pricing restrictions, asset allocation, performance evaluation or risk management, can only be fully addressed within a multivariate framework. Unfortunately, the application of dynamic heteroskedasticity in a multivariate context has been hampered by the sheer number of parameters involved.
Given that there are many similarities between this problem and that of modelling the unconditional covariance matrix of a large number of asset returns, it is perhaps not surprising that one of the most popular approaches to multivariate dynamic heteroskedasticity is based on the same idea as traditional factor analysis. That is, in order to obtain a parsimonious representation of conditional second moments, it is assumed that each of several observed variables is a linear combination of a smaller number of common factors plus an idiosyncratic noise term, but allowing for dynamic heteroskedasticity-type effects in the underlying factors. The factor GARCH model of Engle (1987) and the conditionally heteroskedastic latent factor model introduced by Diebold and Nerlove (1989) and extended by King, Sentana and Wadhwani (1994) are the best known examples. Such models also have the advantage of being compatible with standard factor analysis based on unconditional covariance matrices. Furthermore, they are particularly appealing in finance, where there is a long tradition of factor or multi-index models (see e.g. the Arbitrage Pricing Theory of Ross (1976)).
Although many properties of these models have already been studied in detail, either for the general class or for some of its members (see e.g. Bollerslev and Engle (1993), Engle, Ng and Rothschild (1990), Gourieroux, Monfort and Renault (1991), Harvey, Ruiz and Sentana (1992), Kroner (1987), Lin (1992), or Nijman and Sentana (1996)), some very important inference issues have not been fully investigated yet. The purpose of the paper is to address four such remaining issues.
The first issue is in what sense, if any, the identification problems of traditional factor models are altered by the presence of dynamic heteroskedasticity in the factors. This has important implications for empirical work related to the Arbitrage Pricing Theory (APT), as in static factor models individual risk premia components are only identifiable up to an orthogonal transformation. Furthermore, it also has some bearing upon the interpretation of common trend and dynamic factor models, and on the identification of fundamental disturbances and their dynamic impact in vector autoregressions.
Another important aspect is the development of alternative estimation methods. Traditionally, the preferred method of estimation for such models has been full information maximum likelihood. Unfortunately, this involves a very time consuming procedure, which is disproportionately more so as the number of series considered increases. Although using the EM algorithm combined with derivative based methods significantly reduces the computational burden (see Demos and Sentana (1996b)), it would be interesting to have simpler estimation procedures, which are nevertheless based on firm statistical grounds.
It is also of some interest to have a simple preliminary test for the presence of ARCH effects in the common factors. Moreover, since the way in which standard errors are usually computed in static factor models is only valid under conditional homoskedasticity, it is convenient to have a model diagnostic to assess the validity of such a maintained assumption.
Finally, given that the justification of such estimators and hypothesis tests is asymptotic in nature, it is useful to investigate their finite sample properties by means of a detailed Monte Carlo analysis.
x_t = C f_t + w_t \quad (1)

\begin{pmatrix} f_t \\ w_t \end{pmatrix} \Bigg| X_{t-1} \sim N\left[\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \Lambda_t & 0 \\ 0 & \Gamma \end{pmatrix}\right] \quad (2)

Hence, the distribution of x_t conditional on X_{t-1} is normal with zero mean, and covariance matrix Σ_t = CΛ_tC' + Γ. For this reason, we shall refer to the data generation process specified by (1)-(2) as a multivariate conditionally heteroskedastic factor model. Note that the diagonality of Λ_t implies that the factors are conditionally orthogonal.
Such a formulation nests several models widely used in the empirical literature. In particular, it nests the conditionally heteroskedastic latent factor model introduced by Diebold and Nerlove (1989) and extended by King, Sentana and Wadhwani (1994), and the factor ARCH model of Engle (1987). These models typically assume that the unobserved factors follow univariate dynamic heteroskedastic processes, but differ in the exact parametrisation of Λ_t and Γ.
For instance, in the conditionally heteroskedastic latent factor model, the idiosyncratic covariance matrix is assumed diagonal, and the variances of the factors are parametrised as univariate ARCH models, but taking into account that the values of the factors are unobserved. In particular, for the GQARCH(1,1) formulation of Sentana (1995),

\lambda_{jj,t} = \varphi_{j0} + \varphi_{j1} f_{j,t-1|t-1} + \alpha_{j1}\left(f_{j,t-1|t-1}^{2} + \omega_{jj,t-1|t-1}\right) + \beta_{j1}\lambda_{jj,t-1} \quad (3)

where f_{t|t} = E(f_t|X_t) and Ω_{t|t} = V(f_t|X_t), which can be easily evaluated via the Kalman filter (see Harvey, Ruiz and Sentana (1992)). Note that the measurability of λ_{jj,t} with respect to X_{t-1} is achieved in this model by replacing the unobserved factors by their best (in the conditional mean square error sense) estimates, and including a correction in the standard ARCH terms which reflects the uncertainty in the factor estimates.
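Recursion (3), coupled with the Kalman filter updates for a single-factor model with diagonal Γ, can be sketched as follows (a minimal illustration, not the authors' code; names and start-up values are ours):

```python
import numpy as np

def gqarch_kalman_filter(x, c, gamma, phi0, phi1, alpha1, beta1):
    """Single-factor version of (3): lambda_t is built from the Kalman
    estimates f_{t-1|t-1} and omega_{t-1|t-1}, so that lambda_t is
    measurable with respect to X_{t-1}."""
    T, N = x.shape
    gi = 1.0 / gamma                           # Gamma assumed diagonal
    lam = np.empty(T)
    f_tt = np.empty(T)
    om_tt = np.empty(T)
    lam_bar = phi0 / (1.0 - alpha1 - beta1)    # unconditional factor variance
    f_prev, om_prev, lam_prev = 0.0, lam_bar, lam_bar
    for t in range(T):
        lam[t] = (phi0 + phi1 * f_prev
                  + alpha1 * (f_prev**2 + om_prev)   # ARCH term plus uncertainty correction
                  + beta1 * lam_prev)
        om_tt[t] = 1.0 / (1.0 / lam[t] + c @ (gi * c))   # omega_{t|t} = V(f_t | X_t)
        f_tt[t] = om_tt[t] * (c @ (gi * x[t]))           # f_{t|t} = E(f_t | X_t)
        f_prev, om_prev, lam_prev = f_tt[t], om_tt[t], lam[t]
    return lam, f_tt, om_tt
```

With φ_{j1} = α_{j1} = β_{j1} = 0 the recursion collapses to the constant variance φ_{j0}, i.e. the conditionally homoskedastic case.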
Similarly, the factor GARCH(p, q) model can also be written as a particular case of (1)-(2), with Γ non-diagonal, and the conditional variances of the factors given by:

\lambda_{jj,t} = \sum_{s=1}^{q} \alpha_{js}\tilde{x}_{j,t-s}^{2} + \sum_{r=1}^{p} \beta_{jr}\lambda_{jj,t-r} \quad (4)

where \tilde{x}_{jt} = d_j' x_t. Provided that f_t and w_t are covariance stationary, the unconditional covariance matrix of x_t inherits the factor structure

\Sigma = C\Lambda C' + \Gamma \quad (5)

where Λ = V(f_t) = E(Λ_t). This property makes the model considered here compatible with traditional factor analysis.
3.1
The most distinctive feature of factor models is that they provide a parsimonious specification of the (dynamic) cross-sectional dependence of a vector of observable random variables. In our case, the factor structure allows us to decompose the conditional covariance matrix Σ_t into two parts: one which is common but of reduced rank k, Σ_t^c = CΛ_tC', and one which is specific, Σ_t^s = Γ. Unfortunately, without further restrictions on Γ, or on the constant part of Λ_t, we cannot separately identify one from the other. The reason is twofold. On the one hand, we are not able to differentiate the contribution to the conditional variance of conditionally homoskedastic common factors (see Engle, Ng and Rothschild (1990)). On the other, we may be able to transfer unconditional variance from the idiosyncratic terms to the common factors. For instance, if Γ is non-singular, we can take Σ_t^c(Ψ) = C(Λ_t + Ψ)C', and Σ_t^s(Ψ) = Γ − CΨC', where Ψ is any k × k p.s.d. diagonal matrix such that the eigenvalues of ΨC'Γ⁻¹C are less than or equal to 1 (see Sentana (1997a)).
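The variance-transfer argument can be checked numerically; the construction below picks a scalar Ψ = ψI satisfying the eigenvalue bound (all numerical values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
N, k = 4, 2
C = rng.standard_normal((N, k))
Gamma = np.diag(rng.uniform(0.5, 1.5, N))     # non-singular idiosyncratic matrix
Lam_t = np.diag([1.3, 0.7])                   # conditional factor variances at some t

# any diagonal p.s.d. Psi with eigenvalues of Psi C' Gamma^{-1} C <= 1 works;
# here a scalar Psi = psi * I with psi at half the admissible maximum
M = C.T @ np.linalg.inv(Gamma) @ C
psi = 0.5 / np.linalg.eigvalsh(M).max()
Psi = psi * np.eye(k)

Sigma_t = C @ Lam_t @ C.T + Gamma
common = C @ (Lam_t + Psi) @ C.T              # Sigma_t^c(Psi)
specific = Gamma - C @ Psi @ C.T              # Sigma_t^s(Psi)

assert np.allclose(common + specific, Sigma_t)            # same conditional covariance
assert np.linalg.eigvalsh(specific).min() >= -1e-8        # alternative "Gamma" still p.s.d.
```

Both decompositions generate exactly the same Σ_t, which is the sense in which common and specific parts cannot be separated without further restrictions.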
The most common assumption made to differentiate common from idiosyncratic effects is that Γ is diagonal (see e.g. Diebold and Nerlove (1989) or King, Sentana and Wadhwani (1994)). In this case, we say that the conditional factor structure is exact. However, in some applications, diagonality of Γ may be thought to be too restrictive. For that reason, Chamberlain and Rothschild (1983) introduced the concept of approximate factor structures, in which the idiosyncratic terms may be mildly correlated. Their definition is asymptotic in N, and amounts to the largest eigenvalue of V[(w_1t, w_2t, ..., w_Nt)'] remaining bounded as N increases (as in band-diagonal matrices). In practice, the eigenvalues of Γ are always bounded as N is finite, and it is difficult to come up with realistic models that ensure such an asymptotic restriction.
An alternative way to differentiate common from idiosyncratic effects is to assume that Γ has reduced rank. In some cases, in fact, it may be necessary to assume that Γ is both diagonal and of reduced rank. As a trivial example, consider an exact conditionally homoskedastic single factor model with N = 2 and λ₁₁ = 1. Its covariance matrix can be written as

\Sigma = \begin{pmatrix} \tilde{c}_{11}^{2} + \tilde{\gamma}_{11} & \tilde{c}_{11}\tilde{c}_{21} \\ \tilde{c}_{11}\tilde{c}_{21} & \tilde{c}_{21}^{2} + \tilde{\gamma}_{22} \end{pmatrix}

with \tilde{c}_{11} = (c_{11}^{2} + \gamma_{11} - \tilde{\gamma}_{11})^{1/2}, \tilde{c}_{21} = c_{11}c_{21}/\tilde{c}_{11} and \tilde{\gamma}_{22} = c_{21}^{2} + \gamma_{22} - \tilde{c}_{21}^{2}, for any \tilde{\gamma}_{11} \in [0, \gamma_{11} + c_{11}^{2}\gamma_{22}/(c_{21}^{2} + \gamma_{22})]. Note that the extreme values of this range correspond to the two possible Heywood (i.e. singular) cases.
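The one-parameter family of observationally equivalent decompositions in this two-variable example can be traced out directly (the numbers are illustrative):

```python
import numpy as np

c11, c21 = 0.8, 0.6
g11, g22 = 0.4, 0.5                      # true idiosyncratic variances
Sigma = np.array([[c11**2 + g11, c11 * c21],
                  [c11 * c21, c21**2 + g22]])

# admissible range for the alternative first idiosyncratic variance
upper = g11 + c11**2 * g22 / (c21**2 + g22)

for g11_alt in [0.0, 0.5 * upper, upper]:   # the extremes are the two Heywood cases
    c11_alt = np.sqrt(Sigma[0, 0] - g11_alt)
    c21_alt = Sigma[0, 1] / c11_alt
    g22_alt = Sigma[1, 1] - c21_alt**2
    Sigma_alt = np.array([[c11_alt**2 + g11_alt, c11_alt * c21_alt],
                          [c11_alt * c21_alt, c21_alt**2 + g22_alt]])
    # every member of the family reproduces the same covariance matrix
    assert np.allclose(Sigma_alt, Sigma) and g22_alt >= -1e-9
```

At the lower extreme the first idiosyncratic variance is zero, at the upper extreme the second one is, which is exactly the singularity described in the text.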
3.2
But the most fundamental identification issue in factor models relates to the decomposition of Σ_t^c into C and Λ_t. Since the scaling of the factors is usually irrelevant, in the case of constant variances it is conventional to impose the assumption that the variance of each factor is unity, that is, Λ_t = I ∀t. By analogy, we may impose here the same scaling assumption on the factors' unconditional variances.

Suppose that we were to ignore the time-variation in the conditional variances and base our estimation on the unconditional covariance matrix of x_t in (5). As is well known from standard factor analysis theory, it would then be possible to generate an observationally equivalent (o.e.) model up to unconditional second moments as x_t = C* f_t* + w_t, where C* = CQ', f_t* = Q f_t, and Q is an arbitrary orthogonal k × k matrix, since the unconditional covariance matrix, Σ = C*C*' + Γ = CC' + Γ, remains unchanged.
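The rotation invariance of the unconditional moments is easy to verify numerically (a toy check, with arbitrary values):

```python
import numpy as np

rng = np.random.default_rng(2)
N, k = 5, 2
C = rng.standard_normal((N, k))
Gamma = np.diag(rng.uniform(0.2, 1.0, N))

theta = 0.7                                       # any rotation angle
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # orthogonal k x k matrix
C_star = C @ Q.T                                  # rotated loadings, with f*_t = Q f_t

Sigma = C @ C.T + Gamma
Sigma_star = C_star @ C_star.T + Gamma
assert np.allclose(Sigma, Sigma_star)   # unconditional moments cannot tell the models apart
```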
In this form it is clear that the lack of identifiability comes from the factors having common, rather than constant, variances.
Finally, note that the imposition of unnecessary restrictions on C by analogy with standard factor models may produce misleading results. An important implication of our results is that if such restrictions were nevertheless made, at least they could then be tested. However, the accuracy that can be achieved in estimating C depends on how much linearly independent variability there is in Λ_t, for if the elements of this matrix are essentially constant, identifiability problems will reappear.
3.3 Extensions
Proposition 1 can also be applied to other closely related models, and in particular to the model in Harvey, Ruiz and Sentana (1992). Theirs is a general state space formulation for x_t, with unrestricted mean dynamics, in which some unobservable components show dynamic conditional heteroskedasticity. In this section, we shall explicitly consider the application of Proposition 1 to some well-known special cases which are empirically relevant.
3.3.1
Several recent studies based on dynamic versions of the APT have estimated conditionally heteroskedastic factor models in which the variances of the common factors affect the mean of x_t (see e.g. Engle, Ng and Rothschild (1990), King, Sentana and Wadhwani (1994), or Ng, Engle and Rothschild (1992)). The models typically considered in those studies can be expressed as:

x_t = C\Lambda_t\tau + C f_t + w_t

where τ is a k × 1 vector of price of risk coefficients. Notice that if τ = 0, we return to the previous case. Since the proof of Proposition 1 is based on the diagonality of the conditional variance of f_t, it is straightforward to show that the columns of C, and the corresponding elements of τ, associated with factors that have linearly independent time-varying variances are identifiable (up to sign changes and permutations).
3.3.2

For simplicity, consider the following first order dynamic factor model:

x_t = C y_t + w_t, \qquad y_t = A y_{t-1} + f_t

When A = 0, this reduces to the traditional (i.e. static) factor model. On the other hand, when A = I we have the common trends model (see e.g. Harvey (1989) or Stock and Watson (1988)). If f_t is conditionally homoskedastic, it is well known that an o.e. model (up to unconditional second moments) is, for any orthogonal matrix Q, the model x_t = C* y_t* + w_t, y_t* = A* y_{t-1}* + f_t*, where C* = CQ', y_t* = Q y_t, A* = QAQ' and f_t* = Q f_t.
3.3.3

Our results also apply to models with N common factors, no idiosyncratic noise and linear mean dynamics, such as VARMA(r, s) models. Again, for simplicity consider the following VAR(1):

x_t = A x_{t-1} + u_t, \qquad u_t = C f_t

But suppose that some elements of f_t have time-varying conditional variances and this is explicitly recognized in estimation. Then Proposition 2 implies that the columns of C associated with those disturbances are identifiable.
In this context, we can perhaps shed more light on Proposition 1 by reinterpreting it as a uniqueness result for the disturbances, f_t. Given the way in which the model is defined, we know that there is a set of disturbances, conditionally uncorrelated with each other, that can be written as a (time-invariant) linear combination of the reduced form innovations, u_t.
So far we have assumed that the factors are conditionally orthogonal, since this has been a maintained assumption in all existing empirical applications. However, as the following proposition shows, it turns out that most of the identifiability is coming from the fact that the conditional covariances of conditionally orthogonal factors are (trivially) constant over time.

Proposition 3 Let Λ_t be a k × k positive definite matrix of (possibly) time-varying factor variances but constant conditional covariances, and let λ_t = vecd(Λ_t). If the stochastic processes in (λ_t', 1) are linearly independent, C is unique up to column permutations and sign changes.

Notice that the main difference with Proposition 1 is that identification problems reappear in oblique factor models when a single factor has constant conditional variance. The reason is that we can transfer unconditional variance from the conditionally homoskedastic factor to the others. This is not possible if the factors have to remain conditionally orthogonal.
Factor models with constant conditional covariances arise more commonly than it may appear. For instance, the factor ARCH model of Engle (1987) is o.e. to a whole family of oblique factor models with constant conditional covariances, whose limiting cases are the conditionally orthogonal factor model in (4), and a model with a singular idiosyncratic covariance matrix (see Sentana (1997a) for details). In fact, we can always express any conditionally heteroskedastic factor model as an oblique factor model with constant conditional covariances and a
\begin{pmatrix} f_t^G \\ w_t^G \end{pmatrix} \Bigg| X_{t-1} \sim N\left[\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \Lambda_t + (C'\Gamma^{-1}C)^{-1} & 0 \\ 0 & \Gamma - C(C'\Gamma^{-1}C)^{-1}C' \end{pmatrix}\right]

where f_t^G = (C'Γ⁻¹C)⁻¹C'Γ⁻¹x_t are the Generalized Least Squares (GLS) estimates of the common factors (see Gourieroux, Monfort and Renault (1991)). These factor scores are different from the minimum (conditional) mean square error estimates, f_{t|t}.
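The conditional moments of the GLS scores stated above can be verified numerically (a toy check; all matrices are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(3)
N, k = 5, 2
C = rng.standard_normal((N, k))
Gamma = np.diag(rng.uniform(0.3, 1.2, N))
Gi = np.linalg.inv(Gamma)
D = np.linalg.inv(C.T @ Gi @ C) @ C.T @ Gi     # GLS weights: f_t^G = D x_t

Lam_t = np.diag([1.5, 0.6])                    # factor variances at some t
Sigma_t = C @ Lam_t @ C.T + Gamma

V_fG = D @ Sigma_t @ D.T                       # conditional variance of f_t^G
assert np.allclose(V_fG, Lam_t + np.linalg.inv(C.T @ Gi @ C))

W = np.eye(N) - C @ D                          # w_t^G = x_t - C f_t^G
assert np.allclose(D @ Sigma_t @ W.T, 0)       # scores and residuals uncorrelated
```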
The static factor model parameters c and γ = vech(Γ) or vecd(Γ), and the dynamic variance parameters ψ, are usually estimated jointly from the log-likelihood function of the observed variables, x_t. Ignoring initial conditions, the log-likelihood function of a sample of size T takes the form L_T(θ) = \sum_{t=1}^{T} l_t(θ), where:

l_t(\theta) = -\frac{N}{2}\ln 2\pi - \frac{1}{2}\ln\left|C\Lambda_t C' + \Gamma\right| - \frac{1}{2} x_t'\left(C\Lambda_t C' + \Gamma\right)^{-1} x_t \quad (6)

and Λ_t = diag[λ_t(θ)], which allows the conditional variances of the factors to depend not only on ψ, but also on the static factor model parameters c and γ.
Since the first order conditions are particularly complicated in this case (see appendix B), a numerical approach is usually required. Unfortunately, the application of standard quasi-Newton optimisation routines results in a very time consuming procedure, which is disproportionately more so as the number of series considered increases. In this respect, Demos and Sentana (1996b) show that using the EM algorithm combined with derivative-based methods significantly reduces the computational burden. Nevertheless, it is still of some interest to have simpler alternative estimation procedures.
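A direct evaluation of the contribution (6) for one observation might look as follows (a sketch, not the authors' code):

```python
import numpy as np

def loglik_contribution(x_t, C, Lam_t, Gamma):
    """Gaussian log-likelihood contribution l_t in (6) for one observation x_t."""
    N = x_t.shape[0]
    Sigma_t = C @ Lam_t @ C.T + Gamma          # conditional covariance matrix
    _, logdet = np.linalg.slogdet(Sigma_t)     # stable log-determinant
    quad = x_t @ np.linalg.solve(Sigma_t, x_t) # quadratic form without explicit inverse
    return -0.5 * (N * np.log(2 * np.pi) + logdet + quad)
```

In a full likelihood evaluation, Λ_t would be updated recursively, e.g. via (3) or (4), before each call.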
4.1
Most empirical applications of the factor GARCH model have been carried out using a two-step univariate GARCH method under the assumption that the matrix D is known. First, univariate models are fitted to \tilde{x}_{jt} = d_j'x_t, j = 1, 2, ..., k. Then, the estimated conditional variances are taken as data in the estimation of N univariate models for each x_it, i = 1, 2, ..., N. However, such a procedure ignores cross-sectional correlations and parameter restrictions, and thus sacrifices efficiency. For that reason, Demos and Sentana (1996b) proposed an EM-based restricted maximum likelihood estimator which exploits those restrictions but maintains the assumption of known D. In the general case, an equivalent assumption would be that the matrix D' = (C'Γ⁻¹C)⁻¹C'Γ⁻¹ is known, which is tantamount to f_t^G being observed. Under such a maintained assumption, it is possible to prove that consistent estimates of C, Γ and ψ can be obtained by combining the estimates of the marginal model for f_t^G with the estimates from the OLS regression of each x_it on f_t^G (see Sentana (1997b) for details). Unfortunately, the consistency of such restricted ML estimators crucially depends on the correct specification of the factor scores (see Lin (1992) for the factor GARCH case).
Here, we shall develop a two-step consistent estimation procedure which does not rely on knowledge of f_t^G for those cases in which the idiosyncratic covariance matrix is diagonal. For clarity of exposition, we initially assume that the matrix C is identifiable even if we ignore the time-variation in Λ_t.
The rationale for our proposed two-step estimator is as follows. We saw in section 2 that if f_t and w_t are covariance stationary, the unconditional covariance matrix, Σ, inherits the factor structure (cf. (5)). As our first step, therefore, we can estimate the unconditional variance parameters c and γ by pseudo-maximum likelihood using a standard factor analytic routine. Note that such estimators satisfy (ĉ, γ̂) = arg max_{c,γ} L_T(c, γ, 0). It is easy to see that (ĉ, γ̂) are root-T consistent, as the expected value of the score of the estimated model evaluated at the true parameter values is 0 under our assumptions. However, since the first derivatives are proportional to vech(x_t x_t') (see appendix B), the score does not preserve the martingale difference property when there are ARCH effects in the common factors, and it is necessary to compute robust standard errors which take into account its serial correlation.
Having obtained consistent estimates of c and γ, we can then estimate the conditional variance parameters ψ by maximizing (6) with respect to ψ, keeping c and γ fixed at their pseudo-maximum likelihood estimates. That is, our second step estimator is ψ̂ = arg max_ψ L_T(ĉ, γ̂, ψ). On the basis of well-known results from Durbin (1970), it is clear that ψ̂ is also root-T consistent. However, since the asymptotic covariance matrix is not generally block-diagonal between static and dynamic variance parameters (see appendix B), standard errors will be underestimated by the usual expressions. Asymptotically correct standard errors can be computed from an estimate of the inverse information matrix corresponding to (6) evaluated at the two-step estimators ĉ, γ̂ and ψ̂ (see Lin (1992) for an analogous correction in the factor GARCH case).
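The two-step logic can be sketched end to end: an EM factor-analysis step for (c, γ) from the unconditional covariance matrix, followed by maximization of (6) over the dynamic parameters with (c, γ) fixed. The sketch below is ours, not the authors' implementation: it specialises to one factor, sets φ₁ = 0 with a unit unconditional factor variance, and replaces the numerical optimiser of the second step by a coarse grid.

```python
import numpy as np

def static_factor_em(S, n_iter=300):
    """First step: pseudo-ML estimates of (c, Gamma) in the unconditional
    one-factor structure Sigma = c c' + Gamma via the EM algorithm for
    standard factor analysis (factor variance normalised to 1)."""
    c = np.full(S.shape[0], 0.5)
    gamma = 0.5 * np.diag(S)
    for _ in range(n_iter):
        beta = np.linalg.solve(np.outer(c, c) + np.diag(gamma), c)
        Exx = 1.0 - beta @ c + beta @ S @ beta   # E(f_t^2 | data)
        Sxf = S @ beta                           # sample analogue of E(x_t f_t)
        c = Sxf / Exx
        gamma = np.maximum(np.diag(S) - c * Sxf, 1e-6)
    return c, gamma

def conditional_loglik(x, c, gamma, alpha, beta):
    """Log-likelihood (6) with the GQARCH(1,1) variance (3), taking
    phi_1 = 0 and phi_0 = 1 - alpha - beta (unit unconditional variance)."""
    T, N = x.shape
    gi = 1.0 / gamma
    f_prev, om_prev, lam_prev = 0.0, 1.0, 1.0
    ll = 0.0
    for t in range(T):
        lam = (1.0 - alpha - beta) + alpha * (f_prev**2 + om_prev) + beta * lam_prev
        Sigma = lam * np.outer(c, c) + np.diag(gamma)
        _, logdet = np.linalg.slogdet(Sigma)
        ll -= 0.5 * (N * np.log(2 * np.pi) + logdet + x[t] @ np.linalg.solve(Sigma, x[t]))
        om_prev = 1.0 / (1.0 / lam + c @ (gi * c))   # omega_{t|t}
        f_prev = om_prev * (c @ (gi * x[t]))         # f_{t|t}
        lam_prev = lam
    return ll

def two_step(x, grid=np.linspace(0.0, 0.45, 4)):
    """Second step: hold (c, gamma) at their first-step values and maximise
    over (alpha, beta); a coarse grid stands in for a numerical optimiser."""
    S = x.T @ x / x.shape[0]
    c_hat, gamma_hat = static_factor_em(S)
    _, a_hat, b_hat = max(
        (conditional_loglik(x, c_hat, gamma_hat, a, b), a, b)
        for a in grid for b in grid if a + b < 1.0)
    return c_hat, gamma_hat, a_hat, b_hat
```

As the text notes, the first-step standard errors would still need a robust (serial-correlation consistent) correction, which the sketch does not compute.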
When C is not identifiable from the unconditional covariance matrix, γ̂ remains consistent, but ĉ is only consistent up to an orthogonal transformation. As discussed in section 3.2, the reason is that by assuming unconditional normality in estimation, we are neglecting very valuable information in dynamic fourth order moments. One possibility would be to replace the Gaussian quasi-likelihood in the first step by an alternative objective function which took into account the autocorrelation in vech(x_t x_t'). Unfortunately, the evidence from univariate ARCH models suggests that the resulting estimators are likely to be rather inefficient. In any case, note that if we were to iterate our proposed two-step procedure and achieved convergence, we would obtain fully efficient maximum likelihood estimators.
4.2
In practice, the factors are generally unobserved. Nevertheless, we can derive similar tests using some factor estimates instead. Under conditional normality, f_{t|t}, the Kalman filter based estimates of the underlying factors, satisfy:

f_{t|t} = \left[\Lambda_t^{-1} + C'\Gamma^{-1}C\right]^{-1} C'\Gamma^{-1} x_t \quad (7)

The common factors are conditionally homoskedastic if and only if Λ_t is constant over time. Hence, had we data on f_{t|t}, we could test whether or not the moment condition cov(f²_{j,t|t}, f²_{j,t-1|t-1}) = 0 holds for each j. Under the alternative, linear combinations of x_t whose weights are not orthogonal to C will follow weak GARCH processes. Therefore, such moment tests will have non-trivial power, since under the alternative f_{j,t|t} will show serial correlation in the squares.
In practice, we must base the tests on f̂_{t|t} evaluated at the parameter estimates under the null. In particular, we will use

\hat{f}_{t|t} = \hat{\Omega}_{t|t}\hat{C}'\hat{\Gamma}^{-1}x_t

where

\hat{\Omega}_{t|t} = \left[I + \hat{C}'\hat{\Gamma}^{-1}\hat{C}\right]^{-1} \quad \forall t
It turns out that the presence of parameter estimates does not affect the asymptotic distribution of such tests, as the information matrix is block diagonal between ψ and (c, γ) under the null (see appendix B). Furthermore, we also prove in appendix B that our proposed moment test is precisely the standard LM test for conditional homoskedasticity in the common factors based on the score of (6) evaluated under H₀. Therefore, we can compute a two-sided χ²₁ test against ARCH(1) in each common factor as T times the uncentred R² from the regression of either 1 on (f̂²_{j,t|t} + ω̂_{jj,t|t} − 1) times (f̂²_{j,t-1|t-1} + ω̂_{jj,t-1|t-1} − 1) (outer-product version), or (f̂²_{j,t|t} + ω̂_{jj,t|t} − 1) on (f̂²_{j,t-1|t-1} + ω̂_{jj,t-1|t-1} − 1) (Hessian-based version). In fact, more powerful variants of these tests can be obtained by taking the one-sided nature of the alternative hypothesis into account through the sign of the relevant regression coefficient (see Demos and Sentana (1996a)).
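The outer-product version of the statistic reduces to a closed form, since T times the uncentred R² of regressing 1 on z_t is (Σ_t z_t)²/Σ_t z_t². A sketch (ours, not the authors' code):

```python
import numpy as np

def lm_arch1_test(f_tt, omega_tt):
    """Outer-product version of the LM test against ARCH(1) in a common
    factor: T times the uncentred R^2 of regressing 1 on z_t = s_t * s_{t-1},
    where s_t = f_{j,t|t}^2 + omega_{jj,t|t} - 1."""
    s = f_tt**2 + omega_tt - 1.0
    z = s[1:] * s[:-1]
    return z.sum()**2 / (z @ z)    # asymptotically chi^2_1 under the null
```

A one-sided variant, as suggested in the text, would additionally reject only when the regression coefficient (the sign of Σ_t z_t) is positive.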
The factor GARCH model of Engle (1987) has already been examined by means of a detailed Monte Carlo analysis. In this section, we shall conduct a similar exercise for the conditionally heteroskedastic latent factor model in (3). Unfortunately, given that the estimation
5.1
We first generated 8000 samples of 240 observations each (plus another 100 for initialization) of a trivariate single factor model using the NAG library G05DDF routine. Such a sample size corresponds roughly to twenty years of monthly data, five years of weekly data or one year of daily data. Since the performance of the different estimators depends on C and Γ mostly through the scalar quantity (C'Γ⁻¹C), the model considered is:

x_it = c_i f_t + w_it \quad (i = 1, 2, 3)
with c = (1, 1, 1)', λ_t = (1 − α − β) + α(f²_{t-1|t-1} + ω_{t-1|t-1}) + βλ_{t-1} and Γ = ψI. Two values of ψ have been selected, namely 2 or 1/2, corresponding to low and high signal to noise ratios, and three pairs of values for α and β, namely (0, 0), (.2, .6) and (.4, .4), which represent constant variances, persistent but smooth GARCH behaviour, and persistent but volatile conditional variances respectively. It is worth mentioning that the pair α = .2, β = .6 matches roughly what we tend to see in the empirical literature. In order to minimize experimental error, we use the same set of underlying random numbers in all designs. Maximization of the log-likelihood (6) with respect to c, γ, α and β was carried out using the NAG library E04JBF routine. Initial values of the parameters were obtained by means of the EM algorithm in Demos and Sentana (1996b).
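The design can be replicated with any normal generator in place of G05DDF; the sketch below (ours) draws the factor from its conditional distribution and updates the filtered moments that enter λ_t:

```python
import numpy as np

def simulate_chf_model(T, psi, alpha, beta, seed=0, burn=100):
    """Trivariate single-factor design: x_t = c f_t + w_t with c = (1,1,1)',
    Gamma = psi * I, and lambda_t driven by the Kalman-filter estimates
    f_{t-1|t-1} and omega_{t-1|t-1}, as in the text."""
    rng = np.random.default_rng(seed)
    c = np.ones(3)
    g = (c @ c) / psi                    # c' Gamma^{-1} c
    f_prev, om_prev, lam_prev = 0.0, 1.0, 1.0
    x = np.empty((T, 3))
    for t in range(-burn, T):            # burn-in replicates the extra 100 draws
        lam = (1.0 - alpha - beta) + alpha * (f_prev**2 + om_prev) + beta * lam_prev
        f = np.sqrt(lam) * rng.standard_normal()     # f_t | X_{t-1} ~ N(0, lambda_t)
        xt = c * f + np.sqrt(psi) * rng.standard_normal(3)
        om_prev = 1.0 / (1.0 / lam + g)              # omega_{t|t}
        f_prev = om_prev * (c @ xt) / psi            # f_{t|t}
        lam_prev = lam
        if t >= 0:
            x[t] = xt
    return x
```

Since the unconditional factor variance is normalised to one, each series has unconditional variance roughly 1 + ψ.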
For scaling purposes, we use c₁² + c₂² + c₃² = 1, and leave the constant part of the conditional variance free. In order to guarantee the positivity and stationarity restrictions 0 ≤ α, β and α + β ≤ 1, we use the re-parametrisation α = sin²(θ₁)
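The text truncates before the rest of the mapping. One completion consistent with α = sin²(θ₁) that enforces the constraints is the following (our assumption, not necessarily the authors' choice):

```python
import numpy as np

def to_alpha_beta(theta1, theta2):
    """Map unrestricted (theta1, theta2) into 0 <= alpha, beta, alpha + beta <= 1.
    alpha = sin^2(theta1) follows the text; the beta mapping is a hypothetical
    completion, since beta <= cos^2(theta1) = 1 - alpha."""
    alpha = np.sin(theta1)**2
    beta = np.cos(theta1)**2 * np.sin(theta2)**2
    return alpha, beta
```

Any such smooth surjection onto the admissible region lets an unconstrained optimiser respect the boundary automatically.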
We use the unconditional variance of the common factor to start up the recursions. But since this implies that β is not identified if α = 0, we set β = 0 whenever α = 0. In this respect, it is important to mention that when α and/or β are 0, the parameter values lie on the boundary of the admissible range. The distribution of the ML estimator and associated tests in those situations has been studied by Self and Liang (1987) and Wolak (1989). When β = 0, for instance, we could use the result in case 2, theorem 2 of Self and Liang (1987), to show that the asymptotic distribution of the ML estimators of (α, β, c', γ')' should be a (1/4, 1/2, 1/4) mixture of a)
Zero estimates of α and β occur more frequently than what the asymptotic distribution would suggest. This is particularly true when the signal-to-noise ratio is small. These results are confirmed in Table 3, which presents mean biases and standard deviations across replications for joint and two-step maximum likelihood estimates of α and β. In this respect, it is important to mention that since β is not identified when α = 0, the reported values for β̂ correspond to those cases in which α̂ is not estimated as 0. Note that the α̂'s obtained are rather more accurate than the β̂'s. Also note that the biases for the joint estimates of α are smaller than for the two-step ones, although the latter have smaller Monte Carlo variability. In contrast, the downward biases in β̂ are larger for joint ML estimates. To some extent, these biases reflect the larger proportion of zero β̂'s in Table 2.
5.2
We have also simulated the following six-variate model with two factors:

x_it = c_i1 f_1t + c_i2 f_2t + w_it

with λ_11,t = (1 − α − β) + α(f²_{1,t-1|t-1} + ω_{11,t-1|t-1}) + βλ_{11,t-1}, λ_22,t = 1 and Γ = ψI. The first design corresponds to two trivariate single factor models like the one considered in the previous subsection put together, while the second design introduces correlation in the columns of C.
For each value of c, two values of ψ have been selected, namely 2 and 1/2, corresponding to low and high signal to noise ratios. Then for each of the four combinations, we consider two pairs of values for α and β, namely (.4, .4) and (.2, .6), in order to obtain persistent but volatile conditional variances, and the more realistic persistent but smooth GARCH behaviour. Given that this model is four times as costly to estimate as the previous one, we only generated 2000 samples of 240 observations each. The remaining estimation details are the same as in section 5.1.

Table 4 presents mean biases and standard deviations across replications for joint and two-step maximum likelihood estimates, as well as a restricted ML estimator which imposes the same identifying restriction as the two-step estimator, namely c₆₂ = 0. Such an estimator is efficient when the overidentifying restriction is true, but becomes inconsistent when it is false. More precisely, if C is not unconditionally identifiable, restricted and two-step ML estimators of c are consistent for the orthogonal transformation of the true parameter values which zeroes c₆₂. For simplicity of exposition, only certain averages across equations are included (in particular, c_a1 = (c₁₁ + c₂₁ + c₃₁)/3, c_b1 = (c₄₁ + c₅₁ + c₆₁)/3, c_a2 = (c₁₂ + c₂₂ + c₃₂)/3, c_b2 = (c₄₂ + c₅₂)/2, and ψ̄ = (ψ₁ + ψ₂ + ψ₃ + ψ₄ + ψ₅ + ψ₆)/6).
The first panel of Table 4 contains the results for those designs in which c' = (0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0). Not surprisingly, the restricted ML estimator is clearly the best as far as estimates of the factor loadings are concerned. However, it turns out that the two-step estimator performs very similarly, except when there is significant variability in conditional variances, which is in line with the results for the single factor model. (One additional issue that arose during the simulations with two factor models was that, occasionally, some idiosyncratic variances were estimated as 0. The incidence of these so-called Heywood cases increased with the value of α, and especially c₆₂. Nevertheless, since at worst only 35 out of 2000 replications had this problem, we discarded them, and replaced them by new ones.) On the other hand, the joint ML estimator is the
worst performer when the signal to noise ratio and the variability in λ_11,t are low, but comes very close to the restricted ML in the opposite case. This behaviour is not unexpected, given that the identifiability of the joint ML estimator comes from the fact that λ_11,t changes over time, while the identifiability of the other two estimators is obtained from the restriction c₆₂ = 0. Nevertheless, it seems that the latter identifiability condition is more informative than the former, which should be borne in mind in empirical work.

In contrast, there are only minor differences between the different estimates of the idiosyncratic variance parameters, which are always identified. Obviously, their Monte Carlo standard deviations increase when ψ changes from 1/2 to 2, but the coefficients of variation remain approximately the same.
The second panel of Table 4 contains the results for those designs in which c' = (1/4, 1/4, 1/4, 1, 1, 1, 1, 1, 1, 1/4, 1/4, 1/4). Note that the different estimates of ψ_j are hardly affected.
Since the null and alternative hypotheses differ in more than one parameter, we have done the required implicit size-corrections in these plots using the closest match (cf. Davidson and MacKinnon (1996)). Not surprisingly, the absolute power of the test is small, as the Monte Carlo variability in the joint estimator of c₆₂ is large relative to the re-scaled value of this parameter (≈ .14) for the sample size considered (see Table 4). Nevertheless, it is clear that the power of the test increases with the signal-to-noise ratio, and especially, with the variability of the conditional variance of the factor. This confirms the crucial role that changes in λ_11,t play in the identifiability of the model, as stated in Proposition 1.
Table 5 presents the proportion of estimates of α and β which are at the boundary of the parameter space. In all cases, the proportions of α̂ = β̂ = 0 and α̂ ≠ 0, β̂ = 0 should be (0, 0) asymptotically. But as in the single factor model, the results show that α̂ = 0, and especially β̂ = 0, occur more frequently than what the asymptotic distribution would suggest. This is particularly true when the signal-to-noise ratio is small. These results are confirmed in Table 6, which presents mean biases and standard deviations across replications for joint, restricted and two-step maximum likelihood estimators of α and β. Once more, the α̂'s are estimated rather more accurately than the β̂'s, which reflects the larger proportion of zero β̂'s in Table 5. As in Table 4, though, there are significant differences between the first and second panel. While the performance of the joint ML estimator is by and large independent of whether or not c₆₂ = 0, the behaviour of the restricted and two-step estimators radically changes, and they clearly become inconsistent.
Conclusions
In this paper we investigate some important issues related to the identification, estimation and testing of conditionally heteroskedastic factor models.
and C_t = CΔ_t for some diagonal matrix Δ_t, so that the loadings of different variables on each conditionally homoskedastic factor change proportionately over time (see Engle, Ng and Rothschild (1990)). The motivation for such an assumption is twofold. First, it provides a parsimonious and plausible specification of the time variation in Σ_t, and for that reason has been the only one adopted so far in empirical applications. Second, it implies that the unconditional factor representation of x_t is well defined (provided unconditional variances are bounded), which makes it compatible with the standard approach based on Σ, and therefore empirically relevant. Notice that even if Γ_t is diagonal, the unconditional variance of a process characterized by a conditional factor representation may very well lack an unconditional factor structure for any k < N (see Hansen and Richard (1987)). Although the model is not identifiable if C_t is unspecified, the results in this paper suggest that the statistical properties of alternative plausible formulations of the general conditional factor model would certainly merit a close look.
References
Blanchard, O.J. and Quah, D. (1989): The dynamic effects of aggregate demand and supply disturbances, American Economic Review 79, 655-673.
Bollerslev, T. and Engle, R.F. (1993): Common persistence in conditional variances, Econometrica 61, 166-187.
Bollerslev, T. and Wooldridge, J.M. (1992): Quasi-maximum likelihood estimation and inference in dynamic models with time-varying variances, Econometric Reviews 11, 143-172.
Chamberlain, G. and Rothschild, M. (1983): Arbitrage, factor structure, and
mean-variance analysis on large asset markets, Econometrica 51, 1281-1304.
Davidson, R. and MacKinnon, J.G. (1996): Graphical methods for investigating the size and power of hypothesis tests, mimeo, GREQAM.
Demos, A. and Sentana, E. (1992): An EM-based algorithm for conditionally
heteroskedastic factor models, LSE FMG Discussion Paper 140.
Demos, A. and Sentana, E. (1996a): Testing for garch effects: A one-sided
approach, CEMFI Working Paper 9611.
Demos, A. and Sentana, E. (1996b): An EM algorithm for conditionally heteroskedastic factor models, CEMFI Working Paper 9615, forthcoming in Journal
of Business and Economic Statistics.
Diebold, F.X. and Nerlove, M. (1989): The dynamics of exchange rate volatility: A multivariate latent factor arch model, Journal of Applied Econometrics
4, 1-21.
Dunn, J.E. (1973): A note on a sufficiency condition for uniqueness of a
restricted factor matrix, Psychometrika 38, 141-143.
Durbin, J. (1970): Testing for serial correlation in least-squares regression
when some of the regressors are lagged dependent variables, Econometrica 38,
410-421.
Magnus, J.R. (1988): Linear Structures, Oxford University Press, New York.
Magnus, J.R. and Neudecker, H. (1988): Matrix differential calculus with
applications in Statistics and Econometrics, Wiley, Chichester.
Ng, V.M.; Engle, R.F. and Rothschild, M. (1992): A multi-dynamic factor
model for stock returns, Journal of Econometrics 52, 245-266.
Nijman, T. and Sentana, E. (1996): Marginalization and contemporaneous
aggregation of multivariate garch processes, Journal of Econometrics 71, 71-87.
Peña, D. and Box, G.E.P. (1987): Identifying a simplifying structure in time
series, Journal of the American Statistical Association 82, 836-843.
Ross, S. (1976): The arbitrage theory of capital asset pricing, Journal of
Economic Theory 13, 341-360.
Self, S.G. and Liang, K.Y. (1987): Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions, Journal
of the American Statistical Association 82, 605-610.
Sentana, E. (1992): Identification of multivariate conditionally heteroskedastic factor models, LSE FMG Discussion Paper 139.
Sentana, E. (1995): Quadratic arch models, Review of Economic Studies
62, 639-661.
Sentana, E. (1997a): The relation between conditionally heteroskedastic factor models and factor garch models, mimeo, CEMFI.
Sentana, E. (1997b): Risk and return in the Spanish stock market: Some
evidence from individual assets, CEMFI Working Paper 9702, forthcoming in
Investigaciones Económicas.
Stock, J.H. and Watson, M.W. (1988): Testing for common trends, Journal
of the American Statistical Association 83, 1097-1107.
Wolak, F.A. (1989): Local and global testing of linear and nonlinear inequality constraints in nonlinear econometric models, Econometric Theory 5, 1-35.
30
Appendices
A
Proofs
A.1
Proposition 1
Consider an orthogonal matrix $Q$ such that $Q'Q = QQ' = I_k$. Since the covariance matrix of the transformed
factors $f_t^* = Qf_t$ is $\Lambda_t^* = Q\Lambda_t Q'$, with typical element $[\Lambda_t^*]_{ij} = \sum_{l=1}^{k} q_{il}q_{jl}\lambda_{ll,t}$,
orthogonality requires

$$\sum_{l=1}^{k} q_{il}q_{jl}\lambda_{ll,t} = 0 \qquad (j > i) \text{ for all } t.$$

For a given $i, j$ ($j > i$), these restrictions can be expressed in matrix notation as:

$$\tilde{\Lambda}_T\,\tilde{q}_{ij} = 0_T, \tag{A1}$$

where

$$\tilde{\Lambda}_T = \begin{pmatrix}
\lambda_{11,1} & \lambda_{22,1} & \cdots & \lambda_{kk,1} \\
\lambda_{11,2} & \lambda_{22,2} & \cdots & \lambda_{kk,2} \\
\vdots & \vdots & & \vdots \\
\lambda_{11,T} & \lambda_{22,T} & \cdots & \lambda_{kk,T}
\end{pmatrix} = \begin{pmatrix} \lambda_1' \\ \lambda_2' \\ \vdots \\ \lambda_T' \end{pmatrix}$$

is a $T \times k$ matrix, $0_T$ a $T \times 1$ vector of zeros, and $\tilde{q}_{ij}$ a $k \times 1$ vector with
typical element $q_{il}q_{jl}$. We can show that $\mathrm{rank}(\tilde{\Lambda}_T) = k$ when the stochastic
processes in $\lambda_t$ are linearly independent, so the only solution to the above system
of equations is $\tilde{q}_{ij} = 0_k$, irrespectively of $i$ and $j$. That is, we must have that
for all $j > i$, $i = 1, 2, \ldots, k$, $q_{il}q_{jl} = 0$ for $l = 1, 2, \ldots, k$, together with the
normalization $[\Lambda_t^*]_{ij} = 1$ for $i = j$. $\square$
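The rank argument in the proof above can be checked numerically. The sketch below uses illustrative values (not from the paper): it builds the $T \times k$ matrix of factor variances, verifies it has full column rank, and confirms that the off-diagonal element of $Q\Lambda_t Q'$ equals $\lambda_t'\tilde{q}_{ij}$, which cannot vanish for every $t$ when $Q$ is a genuine rotation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Rows of Ltilde hold the diagonal factor variances (l_11,t, l_22,t).
T, k = 50, 2
Ltilde = rng.uniform(0.5, 2.0, size=(T, k))
print(np.linalg.matrix_rank(Ltilde))  # 2: columns linearly independent

# For an orthogonal Q that is not a signed permutation, the (1,2)
# element of Q Lambda_t Q' is Ltilde[t] @ qij, with qij the vector of
# products q_1l * q_2l, and it cannot be zero for all t.
th = 0.7
Q = np.array([[np.cos(th), -np.sin(th)],
              [np.sin(th),  np.cos(th)]])
offdiag = np.array([(Q @ np.diag(Ltilde[t]) @ Q.T)[0, 1] for t in range(T)])
qij = np.array([Q[0, 0] * Q[1, 0], Q[0, 1] * Q[1, 1]])
print(np.allclose(offdiag, Ltilde @ qij))  # True
print(np.allclose(offdiag, 0.0))           # False: diagonality fails
```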
A.2 Proposition 2

In this case (A1) also applies, but since $\lambda_t' = (\lambda_{1t}',\, \lambda_{kk,t}\,\iota_{k_2}')$, we can re-write it
as

$$\bar{\Lambda}_T\,\bar{q}_{ij} = 0_T, \tag{A2}$$

where

$$\bar{\Lambda}_T = \begin{pmatrix}
\lambda_{11,1} & \lambda_{22,1} & \cdots & \lambda_{k_1 k_1,1} & \lambda_{kk,1} \\
\lambda_{11,2} & \lambda_{22,2} & \cdots & \lambda_{k_1 k_1,2} & \lambda_{kk,2} \\
\vdots & \vdots & & \vdots & \vdots \\
\lambda_{11,T} & \lambda_{22,T} & \cdots & \lambda_{k_1 k_1,T} & \lambda_{kk,T}
\end{pmatrix}$$

is a $T \times (k_1 + 1)$ matrix and $\bar{q}_{ij}$ a $(k_1 + 1) \times 1$ vector, whose first $k_1$ elements
are $q_{il}q_{jl}$ and whose last element is $\sum_{l=k_1+1}^{k} q_{il}q_{jl}$. Given that
$\mathrm{rank}(\bar{\Lambda}_T) = k_1 + 1$ when the stochastic processes involved are linearly
independent, the only solution is $\bar{q}_{ij} = 0_{k_1+1}$, irrespectively of $i$ and
$j$. That is, for all $j > i$, $i = 1, 2, \ldots, k$ we must have $q_{il}q_{jl} = 0$ for $l = 1, 2, \ldots, k_1$
and also $\sum_{l=k_1+1}^{k} q_{il}q_{jl} = 0$. Hence, in each of the first $k_1$ columns of $Q$ there cannot
be two elements which are different from 0. Let's
partition $Q$ conformably as:

$$Q = \begin{pmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{pmatrix}.$$

Then it must be the case that $Q_{11} = I_{k_1}$, $Q_{21} = 0$, $Q_{12} = 0$ and $Q_{22}$ is orthogonal. $\square$
A.3 Proposition 3

First of all, note that $\Lambda_t^* = Q\Lambda_t Q' = Q\,\mathrm{dg}(\Lambda_t)Q' + Q[\Lambda_t - \mathrm{dg}(\Lambda_t)]Q'$. But
then the orthogonality restrictions become

$$\sum_{l=1}^{k} q_{il}q_{jl}\lambda_{ll,t} + c_{ij} = 0, \tag{A3}$$

where $c_{ij}$ denotes the time-invariant contribution of $Q[\Lambda_t - \mathrm{dg}(\Lambda_t)]Q'$ to element
$(i, j)$, so that the system now involves the augmented vector $(\tilde{q}_{ij}', c_{ij})'$. Given that
$\mathrm{rank}(\tilde{\Lambda}_T \,\vdots\, \iota_T) = k + 1$ when the stochastic processes in $(\lambda_t', 1)$ are
linearly independent, the only way the above system of equations can have a
solution is $\tilde{q}_{ij} = 0_k$ and $c_{ij} = 0$.
and $\gamma = \mathrm{vecd}(\Gamma)$. Bollerslev and Wooldridge (1992) and Kroner (1987) show that
the score function $s_t(\theta) = \partial l_t(\theta)/\partial\theta$ of any conditionally heteroskedastic multivariate model with zero conditional mean is given by the following expression:

$$s_t(\theta) = \frac{1}{2}\,\frac{\partial\,\mathrm{vec}'[\Sigma_t]}{\partial\theta}\left[\Sigma_t^{-1}\otimes\Sigma_t^{-1}\right]\mathrm{vec}\left[x_t x_t' - \Sigma_t\right],$$

where $E_n'$ denotes the matrix which transforms $\mathrm{vec}(A)$
into $\mathrm{vecd}(A)$ as $\mathrm{vecd}(A) = E_n'\,\mathrm{vec}(A)$, and $K_n$ is the square commutation matrix
(see Magnus (1988)).
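The score expression above can be verified against numerical differentiation in a toy example. The diagonal parameterization $\Sigma(\theta) = \mathrm{diag}(e^{\theta_1}, e^{\theta_2})$ below is mine, chosen only to keep the check short; $\mathrm{vec}(\cdot)$ stacks columns:

```python
import numpy as np

def Sigma(theta):
    # toy parameterization, not the paper's factor model
    return np.diag(np.exp(theta))

def loglik(theta, x):
    # zero-mean Gaussian log-likelihood (constant term dropped)
    S = Sigma(theta)
    _, logdet = np.linalg.slogdet(S)
    return -0.5 * logdet - 0.5 * x @ np.linalg.solve(S, x)

def score(theta, x):
    S = Sigma(theta)
    Si = np.linalg.inv(S)
    # rows of dvec are d vec(Sigma)'/d theta_i, column-major stacking
    dvec = np.array([np.diag([np.exp(theta[0]), 0.0]).ravel(order='F'),
                     np.diag([0.0, np.exp(theta[1])]).ravel(order='F')])
    return 0.5 * dvec @ np.kron(Si, Si) @ (np.outer(x, x) - S).ravel(order='F')

theta = np.array([0.3, -0.2])
x = np.array([1.5, -0.7])
eps = 1e-6
num_grad = np.array([(loglik(theta + eps * np.eye(2)[i], x)
                      - loglik(theta - eps * np.eye(2)[i], x)) / (2 * eps)
                     for i in range(2)])
print(np.allclose(score(theta, x), num_grad, atol=1e-6))  # True
```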
After some straightforward algebraic manipulations, we get

$$s_t(\theta) = \begin{bmatrix}
\mathrm{vec}\left[\Sigma_t^{-1}x_t x_t'\Sigma_t^{-1}C\Lambda_t - \Sigma_t^{-1}C\Lambda_t\right] \\[4pt]
\frac{1}{2}\,\mathrm{vecd}\left[\Sigma_t^{-1}x_t x_t'\Sigma_t^{-1} - \Sigma_t^{-1}\right] \\[4pt]
0
\end{bmatrix} + \frac{1}{2}\begin{bmatrix}
\partial\lambda_t'(\theta)/\partial c \\[4pt]
\partial\lambda_t'(\theta)/\partial\gamma \\[4pt]
\partial\lambda_t'(\theta)/\partial\psi
\end{bmatrix}\mathrm{vecd}\left[C'\Sigma_t^{-1}x_t x_t'\Sigma_t^{-1}C - C'\Sigma_t^{-1}C\right].$$

Assuming that $\mathrm{rank}(\Gamma) = N$, we can use the Woodbury formula to prove that

$$\Sigma_t^{-1}x_t x_t'\Sigma_t^{-1}C\Lambda_t - \Sigma_t^{-1}C\Lambda_t = \Gamma^{-1}E\left[(x_t - Cf_t)f_t' \mid X_T; \theta\right]$$

$$\Sigma_t^{-1}x_t x_t'\Sigma_t^{-1} - \Sigma_t^{-1} = \Gamma^{-1}E\left[(x_t - Cf_t)(x_t - Cf_t)' \mid X_T; \theta\right]\Gamma^{-1} - \Gamma^{-1}$$

$$C'\Sigma_t^{-1}x_t x_t'\Sigma_t^{-1}C - C'\Sigma_t^{-1}C = \Lambda_t^{-1}E\left[f_t f_t' - \Lambda_t \mid X_T; \theta\right]\Lambda_t^{-1},$$

where $E[\cdot \mid X_T; \theta]$ refers to expectations conditional on all observed $x_t$'s and the
parameter values $\theta$. Therefore, we can interpret the score of the log-likelihood
function for $x_t$ as the expected value given $X_T$ of the sum of the (unobservable) scores corresponding to the conditional log-likelihood function of $x_t$ given $f_t$,
and the marginal log-likelihood function of $f_t$ (cf. Demos and Sentana (1996b)).
Note that these expressions only involve $f_{t|T} = E[f_t \mid X_T; \theta] = f_{t|t}$ and $\Omega_{t|T} = \Omega_{t|t}$.
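The first of these identities is easy to confirm numerically, using the standard Gaussian conditional moments $E[f_t \mid x_t] = \Lambda_t C'\Sigma_t^{-1}x_t$ and $\mathrm{cov}(f_t \mid x_t) = \Lambda_t - \Lambda_t C'\Sigma_t^{-1}C\Lambda_t$. The sketch below uses arbitrary illustrative matrices:

```python
import numpy as np

rng = np.random.default_rng(1)

N, k = 4, 2
C = rng.normal(size=(N, k))
Lam = np.diag(rng.uniform(0.5, 2.0, size=k))   # conditional factor variances
Gam = np.diag(rng.uniform(0.2, 1.0, size=N))   # idiosyncratic variances
Sig = C @ Lam @ C.T + Gam
Si = np.linalg.inv(Sig)
x = rng.normal(size=N)

f_post = Lam @ C.T @ Si @ x                    # E[f | x]
Omega = Lam - Lam @ C.T @ Si @ C @ Lam         # cov(f | x)
# E[(x - C f) f' | x] = (x - C f_post) f_post' - C Omega
rhs = np.linalg.inv(Gam) @ (np.outer(x - C @ f_post, f_post) - C @ Omega)
lhs = Si @ np.outer(x, x) @ Si @ C @ Lam - Si @ C @ Lam
print(np.allclose(lhs, rhs))  # True
```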
$$\partial\lambda_{jj,t}(\theta_0)/\partial c = 0, \qquad \partial\lambda_{jj,t}(\theta_0)/\partial\gamma = 0, \qquad
\partial\lambda_{jj,t}(\theta_0)/\partial\psi_{j1} = f_{jt-1|t-1}^2 + \omega_{jj,t-1|t-1} - 1.$$

Since

$$E\left[f_{jt|t}^2 + \omega_{jj,t|t} - 1\right] = E\left\{E\left[f_{jt}^2 - 1 \mid X_t\right]\right\} = E\left[f_{jt}^2 - 1\right] = 0,$$

then the orthogonality conditions implicit in the last $k$ elements of the score are
simply $\mathrm{cov}\left(f_{jt|t}^2 + \omega_{jj,t|t},\; f_{jt-1|t-1}^2 + \omega_{jj,t-1|t-1}\right) = 0$.
"
i @vec [ ]
1 @vec0 [t ] h 1
t
t -1
t
0
2
@
@
@0t (0 ) 0 0 1
@ 2 lt (0 )
E
jXt1 =
Ek (C C - C0 1 C)
@@c0
@
"
@ 2 lt (0 )
1 @0t (0 ) 0 1
E
jXt1 =
(C C0 1 )
@@ 0
2 @
where we use the fact that the Hadamard (or element by element) product of two
m n matrices, R and S; can be written as R S = E0m (R - S)En (see Magnus
(1988)).
2
Since E [@jj;t (0 )=@j1 ] = E fjt1jt1 + jj;t1jt1 1 = 0, it is clear that
the information matrix is block diagonal between static and dynamic variance
parameters under the null of conditional homoskedasticity.
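The Magnus (1988) identity $R \odot S = E_m'(R \otimes S)E_n$ used in this step can be checked directly, taking $E_n$ to be the $n^2 \times n$ matrix whose $i$-th column is $e_i \otimes e_i$ (a small self-contained sketch):

```python
import numpy as np

rng = np.random.default_rng(2)

def E(n):
    # n^2 x n matrix whose i-th column is e_i kron e_i
    I = np.eye(n)
    return np.column_stack([np.kron(I[:, i], I[:, i]) for i in range(n)])

m, n = 3, 4
R = rng.normal(size=(m, n))
S = rng.normal(size=(m, n))
# Hadamard product as a "sandwich" of the Kronecker product:
print(np.allclose(E(m).T @ np.kron(R, S) @ E(n), R * S))  # True
```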
Finally, it is also worth noting that under conditional homoskedasticity

$$E\left[\frac{\partial^2 l_t(\theta_0)}{\partial c\,\partial c'} \,\Big|\, X_{t-1}\right] = -2\left(C'\Sigma^{-1}C \otimes \Sigma^{-1}\right)$$

$$E\left[\frac{\partial^2 l_t(\theta_0)}{\partial\gamma\,\partial c'} \,\Big|\, X_{t-1}\right] = -E_N'\left(\Sigma^{-1}C \otimes \Sigma^{-1}\right)$$

$$E\left[\frac{\partial^2 l_t(\theta_0)}{\partial\gamma\,\partial\gamma'} \,\Big|\, X_{t-1}\right] = -\frac{1}{2}\left(\Sigma^{-1}\odot\Sigma^{-1}\right).$$
[Table: mean biases and standard deviations across replications of the ML and two-step (2S) estimators, for $\gamma_0 = 2.0$ and $(\alpha_0, \beta_0) \in \{(0.0, 0.0), (0.2, 0.6), (0.4, 0.4)\}$ (entries omitted).]
[Table: Monte Carlo frequencies for the ML and 2S procedures under $(\alpha, \beta) = (0, 0)$ and $(\alpha \neq 0, \beta = 0)$, for $\gamma_0 \in \{0.5, 2.0\}$ and $(\alpha_0, \beta_0) \in \{(0.0, 0.0), (0.2, 0.6), (0.4, 0.4)\}$ (entries omitted).]
[Table: mean biases and standard deviations of the ML and 2S estimators, for $(\alpha_0, \beta_0) \in \{(0.2, 0.6), (0.4, 0.4)\}$ and $\gamma_0 = 2.0$ (entries omitted).]
[Table: mean biases and standard deviations of the joint (ML), restricted (R) and two-step (2S) estimators of selected elements of $c$ ($c_{a1}$, $c_{b1}$, $c_{a2}$, $c_{b2}$) and of $\gamma$, for $c_0 = (\frac{1}{4}, \frac{1}{4}, \frac{1}{4}, 1, 1, 1, 1, 1, 1, \frac{1}{4}, \frac{1}{4}, \frac{1}{4})'$, $(\alpha_0, \beta_0) \in \{(0.2, 0.6), (0.4, 0.4)\}$ and $\gamma_0 \in \{0.5, 2.0\}$ (entries omitted).]
[Table: Monte Carlo frequencies for the ML, R and 2S procedures under $(\alpha, \beta) = (0, 0)$ and $(\alpha \neq 0, \beta = 0)$, for $c_0 = (0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0)'$ and $c_0 = (\frac{1}{4}, \frac{1}{4}, \frac{1}{4}, 1, 1, 1, 1, 1, 1, \frac{1}{4}, \frac{1}{4}, \frac{1}{4})'$, with $(\alpha_0, \beta_0) \in \{(0.2, 0.6), (0.4, 0.4)\}$ and $\gamma_0 \in \{0.5, 2.0\}$ (entries omitted).]
[Table: mean biases and standard deviations of the ML, R and 2S estimators, for $(\alpha_0, \beta_0) \in \{(0.2, 0.6), (0.4, 0.4)\}$ and $\gamma_0 \in \{0.5, 2.0\}$ (entries omitted).]
[Table: mean biases and standard deviations of the ML, R and 2S estimators, for $c_0 = (\frac{1}{4}, \frac{1}{4}, \frac{1}{4}, 1, 1, 1, 1, 1, 1, \frac{1}{4}, \frac{1}{4}, \frac{1}{4})'$, $(\alpha_0, \beta_0) \in \{(0.2, 0.6), (0.4, 0.4)\}$ and $\gamma_0 \in \{0.5, 2.0\}$ (entries omitted).]
[Figure: p-value plots of the "op" and "hess" versions of the two-sided and one-sided tests, for $\gamma_0 = 0.5$ and $\gamma_0 = 2$; vertical axis $-4$ to $4$, horizontal axis $0$ to $15$ (plot points omitted).]
[Figure: size-power curves (in %) for $(\alpha_0, \beta_0) = (.2, .6)$ and $(\alpha_0, \beta_0) = (.4, .4)$; legend: $\gamma_0 = .5$ two-sided (+), $\gamma_0 = .5$ one-sided (o), $\gamma_0 = 2$ two-sided, $\gamma_0 = 2$ one-sided (plot points omitted).]
[Figure: p-value plots; vertical axis $-4$ to $4$, horizontal axis $0$ to $15$; legend: $\alpha_0 = .2$, $\beta_0 = .6$, $\gamma_0 = 2$; $\alpha_0 = .2$, $\beta_0 = .6$, $\gamma_0 = .5$; $\alpha_0 = .2$, $\beta_0 = .4$, $\gamma_0 = 4$ (o); $\alpha_0 = .4$, $\beta_0 = .4$, $\gamma_0 = .5$ (plot points omitted).]
[Figure: size-power curves (in %); legend: $\alpha_0 = .2$, $\beta_0 = .6$, $\gamma_0 = 2$; $\alpha_0 = .2$, $\beta_0 = .6$, $\gamma_0 = .5$; $\alpha_0 = .2$, $\beta_0 = .4$, $\gamma_0 = 4$ (o); $\alpha_0 = .4$, $\beta_0 = .4$, $\gamma_0 = .5$ (plot points omitted).]