
Identification, Estimation and Testing of Conditionally Heteroskedastic Factor Models

Gabriele Fiorentini
University of Alicante

Enrique Sentana
CEMFI

Working Paper No. 9709


July 1997

This paper is an extensively revised version of Sentana (1992), which concentrated on identification issues. Financial support for that version from the Spanish Ministry of Education and Science and the LSE Financial Markets Group, as part of the ESRC project "The Efficiency and Regulation of Financial Markets", is gratefully acknowledged. The authors are also grateful to Manuel Arellano, Antonis Demos, Lars Hansen, Andrew Harvey, Mervyn King, Jan Magnus, Danny Quah, Esther Ruiz, Mushtaq Shah, Neil Shephard and Sushil Wadhwani, as well as seminar participants at CEMFI, the European Meeting of the Econometric Society (Brussels, 1992) and the ESRC Econometric Study Group Conference (Bristol, July 1997), for very helpful comments and suggestions. Two anonymous referees have also helped us greatly improve the paper. Of course, the usual caveat applies. (E-mail addresses: ga@merlin.fae.ua.es, sentana@cem.es.)
CEMFI, Casado del Alisal 5, 28014 Madrid, Spain.
Tel: 341 4290551, fax: 341 4291056, http://www.cem.es.

Abstract

We investigate several important inference issues for factor models with dynamic heteroskedasticity in the common factors. First, we show that such models are identified if we take into account the time-variation in the variances of the factors. Our results also apply to dynamic versions of the APT, dynamic factor models, and vector autoregressions. Secondly, we propose a consistent two-step estimation procedure which does not rely on knowledge of any factor estimates, and explain how to compute correct standard errors. Thirdly, we develop a simple preliminary LM test for the presence of ARCH effects in the common factors. Finally, we conduct a Monte Carlo analysis of the finite sample properties of the proposed estimators and hypothesis tests.

1 Introduction
In recent years, increasing attention has been paid to modelling the observed changes in the volatility of many economic and financial time series. By and large, though, most theoretical and applied research in this area has concentrated on univariate series. However, many issues in finance, such as tests of asset pricing restrictions, asset allocation, performance evaluation or risk management, can only be fully addressed within a multivariate framework. Unfortunately, the application of dynamic heteroskedasticity in a multivariate context has been hampered by the sheer number of parameters involved.
Given that there are many similarities between this problem and that of modelling the unconditional covariance matrix of a large number of asset returns, it is perhaps not surprising that one of the most popular approaches to multivariate dynamic heteroskedasticity is based on the same idea as traditional factor analysis. That is, in order to obtain a parsimonious representation of conditional second moments, it is assumed that each of several observed variables is a linear combination of a smaller number of common factors plus an idiosyncratic noise term, but allowing for dynamic heteroskedasticity-type effects in the underlying factors. The factor GARCH model of Engle (1987) and the conditionally heteroskedastic latent factor model introduced by Diebold and Nerlove (1989) and extended by King, Sentana and Wadhwani (1994) are the best known examples. Such models also have the advantage of being compatible with standard factor analysis based on unconditional covariance matrices. Furthermore, they are particularly appealing in finance, where there is a long tradition of factor or multi-index models (see e.g. the Arbitrage Pricing Theory of Ross (1976)).
Although many properties of these models have already been studied in detail, either for the general class or for some of its members (see e.g. Bollerslev and Engle (1993), Engle, Ng and Rothschild (1990), Gourieroux, Monfort and Renault (1991), Harvey, Ruiz and Sentana (1992), Kroner (1987), Lin (1992), or Nijman and Sentana (1996)), some very important inference issues have not been fully investigated yet. The purpose of the paper is to address four such remaining issues.
The first issue is in what sense, if any, the identification problems of traditional factor models are altered by the presence of dynamic heteroskedasticity in the factors. This has important implications for empirical work related to the Arbitrage Pricing Theory (APT), as in static factor models individual risk premia components are only identifiable up to an orthogonal transformation. Furthermore, it also has some bearing upon the interpretation of common trend and dynamic factor models, and on the identification of fundamental disturbances and their dynamic impact in vector autoregressions.
Another important aspect is the development of alternative estimation methods. Traditionally, the preferred method of estimation for such models has been full information maximum likelihood. Unfortunately, this involves a very time consuming procedure, which is disproportionately more so as the number of series considered increases. Although using the EM algorithm combined with derivative based methods significantly reduces the computational burden (see Demos and Sentana (1996b)), it would be interesting to have simpler estimation procedures which are nevertheless based on firm statistical grounds.
It is also of some interest to have a simple preliminary test for the presence of ARCH effects in the common factors. Moreover, since the way in which standard errors are usually computed in static factor models is only valid under conditional homoskedasticity, it is convenient to have a model diagnostic to assess the validity of such a maintained assumption.
Finally, given that the justification of such estimators and hypothesis tests is asymptotic in nature, it is useful to investigate their finite sample properties by means of simulation methods.


The rest of the paper is organized as follows. We formally introduce the model in section 2, and relate it to the most common conditional variance parametrisations. Identification issues are discussed in detail in section 3. Then, in section 4.1, we propose a simple two-step consistent estimator. We also derive an LM test for ARCH in the common factors in section 4.2. Finally, we carry out a Monte Carlo analysis in section 5. Proofs and auxiliary results are gathered in appendices.

2 Conditionally Heteroskedastic Factor Models


Consider the following multivariate model:

$$
x_t = C f_t + w_t \qquad (1)
$$

$$
\left.\begin{pmatrix} f_t \\ w_t \end{pmatrix}\right| X_{t-1} \sim N\left[\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \Lambda_t & 0 \\ 0 & \Gamma \end{pmatrix}\right] \qquad (2)
$$

where $x_t$ is an $N \times 1$ vector of observable random variables, $f_t$ is a $k \times 1$ vector of unobserved common factors, $C$ is the $N \times k$ matrix of factor loadings, with $N > k$ and $\mathrm{rank}(C) = k$, $w_t$ is an $N \times 1$ vector of idiosyncratic noises, which are conditionally orthogonal to $f_t$, $\Gamma$ is an $N \times N$ positive semidefinite (p.s.d.) matrix of constant idiosyncratic variances, $\Lambda_t$ is a $k \times k$ diagonal positive definite (p.d.) matrix of (possibly) time-varying factor variances, which generally involve some extra parameters, $\psi$, and $X_{t-1}$ is an information set that contains the values of $x_t$ up to, and including, time $t-1$.

Our assumptions imply that the distribution of $x_t$ conditional on $X_{t-1}$ is normal with zero mean and covariance matrix $\Sigma_t = C \Lambda_t C' + \Gamma$. For this reason, we shall refer to the data generation process specified by (1)-(2) as a multivariate conditionally heteroskedastic factor model. Note that the diagonality of $\Lambda_t$ implies that the factors are conditionally orthogonal.
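To fix ideas, the data generation process (1)-(2) can be simulated directly. Everything below (dimensions, loadings, and the observed-factor ARCH(1) recursion used to create a time-varying $\Lambda_t$) is an illustrative sketch of ours, not a specification taken from the paper:

```python
# Illustrative simulation of model (1)-(2); all parameter values are ours.
import numpy as np

rng = np.random.default_rng(0)
N, k, T = 5, 2, 1000
C = rng.normal(size=(N, k))                  # N x k factor loadings
Gamma = np.diag(rng.uniform(0.5, 1.5, N))    # idiosyncratic variances

# Observed-factor ARCH(1) recursions generate the diagonal of Lambda_t
phi0, alpha = np.array([0.6, 0.8]), np.array([0.4, 0.2])
lam = np.empty((T, k)); f = np.empty((T, k)); x = np.empty((T, N))
lam[0] = phi0 / (1 - alpha)                  # unconditional factor variances
for t in range(T):
    if t > 0:
        lam[t] = phi0 + alpha * f[t - 1] ** 2
    f[t] = rng.normal(size=k) * np.sqrt(lam[t])          # f_t given X_{t-1}
    x[t] = C @ f[t] + rng.multivariate_normal(np.zeros(N), Gamma)

# Conditional covariance at any t: Sigma_t = C Lambda_t C' + Gamma
Sigma_T = C @ np.diag(lam[-1]) @ C.T + Gamma
```

Because $\Lambda_t$ is diagonal by construction, the simulated factors are conditionally orthogonal, as (2) requires.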

Such a formulation nests several models widely used in the empirical literature. In particular, it nests the conditionally heteroskedastic latent factor model introduced by Diebold and Nerlove (1989) and extended by King, Sentana and Wadhwani (1994), and the factor ARCH model of Engle (1987). These models typically assume that the unobserved factors follow univariate dynamic heteroskedastic processes, but differ in the exact parametrisation of $\Lambda_t$ and $\Gamma$. For instance, in the conditionally heteroskedastic latent factor model, the idiosyncratic covariance matrix is assumed diagonal, and the variances of the factors are parametrised as univariate ARCH models, but taking into account that the values of the factors are unobserved. In particular, for the GQARCH(1,1) formulation of Sentana (1995),
$$
\lambda_{jj,t} = \varphi_{j0} + \varphi_{j1} f_{j,t-1|t-1} + \alpha_{j1}\left(f_{j,t-1|t-1}^2 + \omega_{jj,t-1|t-1}\right) + \beta_{j1} \lambda_{jj,t-1} \qquad (3)
$$

where $f_{t|t} = E(f_t \mid X_t)$ and $\Omega_{t|t} = V(f_t \mid X_t)$, which can be easily evaluated via the Kalman filter (see Harvey, Ruiz and Sentana (1992)). Note that the measurability of $\lambda_{jj,t}$ with respect to $X_{t-1}$ is achieved in this model by replacing the unobserved factors by their best (in the conditional mean square error sense) estimates, and including a correction in the standard ARCH terms which reflects the uncertainty in the factor estimates.
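The recursion in (3) can be sketched jointly with the Kalman updates that deliver $f_{t|t}$ and $\Omega_{t|t}$. The function below is our own minimal implementation under assumed parameter values; it is fed arbitrary noise, so it only illustrates the mechanics of the filter, not an estimation exercise:

```python
# Minimal sketch of the GQARCH(1,1)-with-Kalman-filter recursion in (3).
# Function name and all parameter values are illustrative assumptions.
import numpy as np

def gqarch_filter(x, C, Gamma, phi0, phi1, alpha1, beta1):
    """Kalman updates for the static measurement equation (1) combined
    with the variance recursion (3); returns the path of lambda_{jj,t}."""
    T = x.shape[0]
    k = C.shape[1]
    lam = phi0 / (1 - alpha1 - beta1)          # start at unconditional level
    f_tt, Om_tt = np.zeros(k), np.diag(lam)
    out = np.empty((T, k))
    for t in range(T):
        if t > 0:                              # eq (3), evaluated at t-1|t-1
            lam = (phi0 + phi1 * f_tt
                   + alpha1 * (f_tt ** 2 + np.diag(Om_tt)) + beta1 * lam)
        Lam = np.diag(lam)
        Sigma = C @ Lam @ C.T + Gamma          # conditional covariance of x_t
        K = Lam @ C.T @ np.linalg.inv(Sigma)   # "Kalman gain"
        f_tt = K @ x[t]                        # f_{t|t} = E(f_t | X_t)
        Om_tt = Lam - K @ C @ Lam              # Omega_{t|t} = V(f_t | X_t)
        out[t] = lam
    return out

rng = np.random.default_rng(1)
T, N, k = 200, 4, 1
C = rng.normal(size=(N, k)); Gamma = 0.8 * np.eye(N)
x = rng.normal(size=(T, N))                    # arbitrary input series
lam_path = gqarch_filter(x, C, Gamma, np.array([0.5]), np.array([0.1]),
                         np.array([0.3]), np.array([0.4]))
```

Note how measurability with respect to $X_{t-1}$ holds: the update of `lam` at time $t$ only uses quantities filtered through $t-1$.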
Similarly, the factor GARCH($p,q$) model can also be written as a particular case of (1)-(2), with $\Gamma$ non-diagonal, and the conditional variances of the factors given by:

$$
\lambda_{jj,t} = \sum_{s=1}^{q} \alpha_{js}\, \dot{x}_{j,t-s}^2 + \sum_{r=1}^{p} \beta_{jr}\, \lambda_{jj,t-r} \qquad (4)
$$

where $\dot{x}_t = D' x_t$ and $D = (d_1 | \cdots | d_k)$ is an $N \times k$ matrix of full column rank satisfying $D'C = I_k$ (see Sentana (1997a)). Note that the measurability of $\lambda_{jj,t}$ with respect to $X_{t-1}$ is achieved here by making the time-variation in second moments a function of $k$ linear combinations of $x_t$.
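A sketch of (4) on placeholder data. The particular $D = \Gamma^{-1}C(C'\Gamma^{-1}C)^{-1}$ used below is just one convenient full-column-rank choice satisfying $D'C = I_k$, and the GARCH(1,1) coefficients are illustrative assumptions:

```python
# Factor GARCH(1,1) recursion (4) driven by the observable combinations
# xdot_t = D'x_t; all concrete values here are illustrative.
import numpy as np

rng = np.random.default_rng(2)
N, k, T = 6, 2, 300
C = rng.normal(size=(N, k))
Gamma = np.diag(rng.uniform(0.5, 1.0, N))
Gi = np.linalg.inv(Gamma)
D = Gi @ C @ np.linalg.inv(C.T @ Gi @ C)   # one admissible D with D'C = I_k

x = rng.normal(size=(T, N))                # placeholder data
xdot = x @ D                               # T x k observable factor proxies

omega, alpha, beta = 0.05, 0.15, 0.8       # common GARCH(1,1) coefficients
lam = np.empty((T, k))
lam[0] = omega / (1 - alpha - beta)        # unconditional start-up value
for t in range(1, T):
    lam[t] = omega + alpha * xdot[t - 1] ** 2 + beta * lam[t - 1]
```

Since $\lambda_{jj,t}$ depends only on lagged $\dot{x}_t$, it is $X_{t-1}$-measurable by construction, which is exactly the point made in the text.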

Finally, if $f_t$ is conditionally homoskedastic, which usually corresponds to $\psi = 0$, (1)-(2) reduces to the static orthogonal factor model (see e.g. Johnson and Wichern (1992)). But even if $f_t$ is conditionally heteroskedastic, provided that it is covariance stationary, the assumption of constant factor loadings implies an unconditionally orthogonal $k$-factor structure for $x_t$. That is, the unconditional covariance matrix of $x_t$, $\Sigma = E(\Sigma_t)$, can be written as:

$$
\Sigma = C \Lambda C' + \Gamma \qquad (5)
$$

where $\Lambda = V(f_t) = E(\Lambda_t)$. This property makes the model considered here compatible with traditional factor analysis.
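A quick Monte Carlo check of (5), with illustrative single-factor values: the sample covariance of $x_t$ should approach $\Sigma = C \Lambda C' + \Gamma$, where for an ARCH(1) factor $\Lambda = E(\Lambda_t) = \varphi_0/(1-\alpha)$ (here normalised to one):

```python
# Numerical check of (5) for a covariance-stationary ARCH(1) factor.
# Loadings and parameters are illustrative choices of ours.
import numpy as np

rng = np.random.default_rng(3)
N, T = 4, 100_000
C = np.array([[1.0], [0.8], [0.6], [0.4]])
Gamma = 0.5 * np.eye(N)
phi0, alpha = 0.7, 0.3                 # E(lambda_t) = phi0 / (1 - alpha) = 1
lam_t, f = phi0 / (1 - alpha), np.empty(T)
for t in range(T):
    f[t] = rng.normal() * np.sqrt(lam_t)
    lam_t = phi0 + alpha * f[t] ** 2
x = f[:, None] * C.T + rng.multivariate_normal(np.zeros(N), Gamma, size=T)

Sigma_hat = np.cov(x, rowvar=False)    # sample covariance of x_t
Sigma = (C @ C.T) + Gamma              # C Lambda C' + Gamma with Lambda = 1
```

Despite the conditional heteroskedasticity, the unconditional second moments retain the exact one-factor structure.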

3 The Effects of Modelling Conditional Heteroskedasticity on Identification

3.1 Identification of Idiosyncratic Factors

The most distinctive feature of factor models is that they provide a parsimonious specification of the (dynamic) cross-sectional dependence of a vector of observable random variables. In our case, the factor structure allows us to decompose the conditional covariance matrix $\Sigma_t$ into two parts: one which is common but of reduced rank $k$, $\Sigma_t^c = C \Lambda_t C'$, and one which is specific, $\Sigma_t^s = \Gamma$. Unfortunately, without further restrictions on $\Gamma$, or on the constant part of $\Lambda_t$, we cannot separately identify one from the other. The reason is twofold. On the one hand, we are not able to differentiate the contribution to the conditional variance of conditionally homoskedastic common factors from that of the idiosyncratic terms (see Engle, Ng and Rothschild (1990)). On the other, we may be able to transfer unconditional variance from the idiosyncratic terms to the common factors. For instance, if $\Gamma$ is non-singular, we can take $\Sigma_t^c(\Omega) = C(\Lambda_t + \Omega)C'$ and $\Sigma_t^s(\Omega) = \Gamma - C \Omega C'$, where $\Omega$ is any $k \times k$ p.s.d. diagonal matrix such that the eigenvalues of $\Omega C' \Gamma^{-1} C$ are less than or equal to 1 (see Sentana (1997a)).
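The transfer argument can be verified numerically; the matrices below are arbitrary illustrative choices:

```python
# With Omega diagonal p.s.d. and the eigenvalues of Omega C'Gamma^{-1}C
# at most one, Gamma - C Omega C' remains p.s.d., so the common/specific
# split of Sigma_t is not unique. Illustrative numbers throughout.
import numpy as np

C = np.array([[1.0, 0.0], [0.6, 0.8], [0.2, 0.5]])
Gamma = np.diag([1.0, 0.8, 1.2])
M = C.T @ np.linalg.inv(Gamma) @ C          # C' Gamma^{-1} C

# Omega = delta * I with delta chosen to satisfy the eigenvalue bound
delta = 0.9 / np.linalg.eigvalsh(M)[-1]
Omega = delta * np.eye(2)
Gamma_star = Gamma - C @ Omega @ C.T        # alternative "specific" part
```

Both decompositions reproduce the same $\Sigma_t$, which is exactly the identification failure described above.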

The most common assumption made to differentiate common from idiosyncratic effects is that $\Gamma$ is diagonal (see e.g. Diebold and Nerlove (1989) or King, Sentana and Wadhwani (1994)). In this case, we say that the conditional factor structure is exact. However, in some applications, diagonality of $\Gamma$ may be thought to be too restrictive. For that reason, Chamberlain and Rothschild (1983) introduced the concept of approximate factor structures, in which the idiosyncratic terms may be mildly correlated. Their definition is asymptotic in $N$, and amounts to the largest eigenvalue of $V[(w_{1t}, w_{2t}, \ldots, w_{Nt})']$ remaining bounded as $N$ increases (as in band-diagonal matrices).¹ In practice, the eigenvalues of $\Gamma$ are always bounded as $N$ is finite, and it is difficult to come up with realistic models that ensure such an asymptotic restriction.
An alternative way to differentiate common from idiosyncratic effects is to assume that $\Gamma$ has reduced rank.² In some cases, in fact, it may be necessary to assume that $\Gamma$ is both diagonal and of reduced rank. As a trivial example, consider an exact conditionally homoskedastic single factor model with $N = 2$ and $\lambda_{11} = 1$. Its covariance matrix can be written as

$$
\Sigma = \begin{pmatrix} c_{11}^2 + \gamma_{11} & c_{11} c_{21} \\ c_{11} c_{21} & c_{21}^2 + \gamma_{22} \end{pmatrix} = \begin{pmatrix} c_{11}^{*2} + \gamma_{11}^* & c_{11}^* c_{21}^* \\ c_{11}^* c_{21}^* & c_{21}^{*2} + \gamma_{22}^* \end{pmatrix}
$$

with $c_{11}^{*2} = c_{11}^2 + \gamma_{11} - \gamma_{11}^*$, $c_{21}^* = c_{21} c_{11} / c_{11}^*$ and $\gamma_{22}^* = \gamma_{22} + c_{21}^2 \left[1 - (c_{11}/c_{11}^*)^2\right]$ for any $\gamma_{11}^* \in \left[0,\ \gamma_{11} + c_{11}^2 \gamma_{22} / (c_{21}^2 + \gamma_{22})\right]$. Note that the extreme values of this range correspond to the two possible Heywood (i.e. singular) cases.
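The observational equivalence in this two-variable example is easy to verify numerically for several values of $\gamma_{11}^*$ in the stated interval (the parameter values below are illustrative):

```python
# Check of the observational-equivalence range in the N = 2 example:
# every gamma11* in the interval reproduces the same covariance matrix.
import numpy as np

c11, c21, g11, g22 = 1.0, 0.7, 0.4, 0.6
Sigma = np.array([[c11**2 + g11, c11 * c21],
                  [c11 * c21, c21**2 + g22]])
upper = g11 + c11**2 * g22 / (c21**2 + g22)
for g11s in np.linspace(0.0, upper, 5):
    c11s = np.sqrt(c11**2 + g11 - g11s)          # starred loadings
    c21s = c21 * c11 / c11s
    g22s = g22 + c21**2 * (1 - (c11 / c11s) ** 2)
    S = np.array([[c11s**2 + g11s, c11s * c21s],
                  [c11s * c21s, c21s**2 + g22s]])
    assert np.allclose(S, Sigma)                 # same Sigma every time
```

At the upper end of the interval $\gamma_{22}^*$ hits zero, which is one of the two singular Heywood cases mentioned in the text.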
¹ This suggests an intuitive interpretation by analogy with univariate time series: if $y_t$ is a covariance stationary and ergodic process (e.g. an MA model), then all the eigenvalues of the intertemporal covariance matrix $V[(y_1, y_2, \ldots, y_T)']$ remain bounded as $T \to \infty$. Unlike in a time series framework, though, there is generally no natural ordering for the variables in $x_t$.
² The rank of $\Gamma$ is related to the observability of the factors. If $\mathrm{rank}(\Gamma) = N - k$ the factors would be fully revealed by the $x_t$ variables; otherwise they are only partially revealed (see King, Sentana and Wadhwani (1994)).

3.2 Identification of Common Factors

But the most fundamental identification issue in factor models relates to the decomposition of $\Sigma_t^c$ into $C$ and $\Lambda_t$. Since the scaling of the factors is usually irrelevant, in the case of constant variances it is conventional to impose the assumption that the variance of each factor is unity, that is, $\Lambda_t = I\ \forall t$. By analogy, we may impose here the same scaling assumption on the factors' unconditional variances.³
Suppose that we were to ignore the time-variation in the conditional variances and base our estimation on the unconditional covariance matrix of $x_t$ in (5). As is well known from standard factor analysis theory, it would then be possible to generate an observationally equivalent (o.e.) model up to unconditional second moments as $x_t = C^* f_t^* + w_t$, where $C^* = CQ'$, $f_t^* = Q f_t$, and $Q$ is an arbitrary orthogonal $k \times k$ matrix, since the unconditional covariance matrix, $\Sigma = C^* C^{*\prime} + \Gamma = C C' + \Gamma$, remains unchanged.

Hence, some restrictions would be needed on $C$. One way to impose them would be to use Dunn's (1973) set of sufficiency identification conditions for the homoskedastic factor model with orthogonal factors. These conditions are zero-type restrictions that guarantee that $C$ is locally identifiable up to column sign changes. For instance, when $C$ is otherwise unrestricted, imposing $c_{ij} = 0$ for $j > i$, $i = 1, 2, \ldots, k$ (i.e. $C$ lower trapezoidal) ensures identification.⁴ Although such restrictions are often arbitrary, the factors can be orthogonally rotated to simplify their interpretation once the model has been estimated. In some other
³ If the unconditional variance is unbounded, as in integrated GARCH-type models, other scaling assumptions can be made. For instance, we can fix the constant part of the conditional variance of each factor, or the norm of each column of $C$.
⁴ Other alternative sets of sufficient local identifiability restrictions have been suggested. For example, Jennrich (1978) proves that when $C$ is otherwise unrestricted, fixing not necessarily to zero the $k(k-1)/2$ supra-diagonal coefficients of (a permutation of) $C$ also guarantees identifiability. From a computational point of view, though, the most convenient uniqueness condition in the unrestricted case is $C' \Gamma^{-1} C$ diagonal (see e.g. Johnson and Wichern (1992)).

cases, identifiability can be achieved by imposing plausible a priori restrictions. For example, if in a two-factor model it is believed that the second factor only affects a subset of the variables (say the first $N_1$, with $N_1 < N$, so that $c_{i2} = 0$ for $i = N_1 + 1, \ldots, N$), the non-zero elements of $C$ will always be identifiable.
However, when time variation in $\Lambda_t$ is explicitly recognized in estimation, the set of admissible $Q$ matrices is substantially reduced, as the conditional covariance matrix of the transformed factors $f_t^* = Q f_t$ has to remain diagonal $\forall t$. In this context, the following result can be stated:

Proposition 1 Let $\lambda_t = \mathrm{vecd}(\Lambda_t)$ denote the $k \times 1$ vector containing the diagonal elements of $\Lambda_t$. If the stochastic processes in $\lambda_t$ are linearly independent, in the sense that there is no vector $\pi \in \mathbb{R}^k$, $\pi \neq 0$, such that $\pi' \lambda_t = 0\ \forall t$, then $C$ is unique under orthogonal transformations other than column permutations and sign changes.
Notice the generality of Proposition 1, since it has been obtained without assuming any particular parametrisation for the dynamic heteroskedasticity; it relies only on the conditional orthogonality of the factors, the linearly independent time-variation of their variances, and the constancy of $C$. One possible way to gain some intuition on this result is to recall that parameter identifiability can be obtained in many econometric models by looking at higher order moments. Since conditional normality with changing variances is incompatible with unconditional normality, but at the same time implies autocorrelation in $\mathrm{vech}(x_t x_t')$, Proposition 1 provides an example in which identifiability comes from considering dynamic fourth-order, as opposed to second order, moments.
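A small numerical illustration of why time-variation restores uniqueness: any orthogonal $Q$ leaves $CC'$ (and hence the unconditional structure) unchanged, but a generic rotation destroys the diagonality of $Q \Lambda_t Q'$ as soon as the factor variances move. All numbers below are illustrative:

```python
# Rotation invariance of the unconditional structure versus failure of
# conditional orthogonality for the rotated factors (Proposition 1).
import numpy as np

theta = 0.7                                    # a generic rotation angle
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
C = np.array([[1.0, 0.0], [0.5, 1.0], [0.3, 0.7]])
C_star = C @ Q.T                               # observationally equivalent
same_uncond = np.allclose(C_star @ C_star.T, C @ C.T)

# Three dates with linearly independent factor-variance configurations
lam_paths = [np.diag([1.0, 1.0]), np.diag([2.0, 1.0]), np.diag([1.0, 3.0])]
offdiag = [abs((Q @ L @ Q.T)[0, 1]) for L in lam_paths]
```

The off-diagonal of $Q \Lambda_t Q'$ is non-zero at the dates where the variances differ, so the rotated factors violate the conditional orthogonality required by (2), ruling the rotation out.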
If the processes in $\lambda_t$ were linearly dependent, though, identification problems would re-appear. Given the parametrisations used in empirical work (see section 2.1), it is difficult to envisage situations in which this will be the case, unless two or more factor variances are constant. Nevertheless, consider as an example a model in which, for all time periods, a group of $k_2$ factors ($1 < k_2 < k$) is characterized by a scalar covariance matrix $\lambda_{kk,t} I_{k_2}$, while the others have an unrestricted diagonal covariance matrix $\Lambda_{1t}$. If we partition $C$ conformably as $C = (C_1 \,|\, C_2)$, where $C_1$ and $C_2$ are $N \times k_1$ and $N \times k_2$ respectively, with $k_1 + k_2 = k$, the following result can be stated:

Proposition 2 Let $\lambda_{1t} = \mathrm{vecd}(\Lambda_{1t})$. If the stochastic processes in $(\lambda_{1t}', \lambda_{kk,t})$ are linearly independent, then $C_1$ is unique under orthogonal transformations other than column permutations and sign changes.
For practical purposes, Proposition 2 could be re-stated so that it would refer only to the empirically relevant case in which $\lambda_{kk,t} = 1\ \forall t$. However, in its present form it makes it clear that the lack of identifiability comes from the factors having common, rather than constant, variances.
Finally, note that the imposition of unnecessary restrictions on $C$ by analogy with standard factor models may produce misleading results. An important implication of our results is that if such restrictions were nevertheless made, at least they could then be tested. However, the accuracy that can be achieved in estimating $C$ depends on how much linearly independent variability there is in $\Lambda_t$, for if the elements of this matrix are essentially constant, identifiability problems will reappear.

3.3 Extensions

Proposition 1 can also be applied to other closely related models, and in particular to the model in Harvey, Ruiz and Sentana (1992). Theirs is a general state space formulation for $x_t$, with unrestricted mean dynamics, in which some unobservable components show dynamic conditional heteroskedasticity. In this section, we shall explicitly consider the application of Proposition 1 to some well-known special cases which are empirically relevant.

3.3.1 Conditionally Heteroskedastic in Mean Factor Models

Several recent studies based on dynamic versions of the APT have estimated conditionally heteroskedastic factor models in which the variances of the common factors affect the mean of $x_t$ (see e.g. Engle, Ng and Rothschild (1990), King, Sentana and Wadhwani (1994), or Ng, Engle and Rothschild (1992)). The models typically considered in those studies can be expressed as:

$$
x_t = C \Lambda_t \tau + C f_t + w_t
$$

where $\tau$ is a $k \times 1$ vector of "price of risk" coefficients. Notice that if $\tau = 0$, we return to the previous case. Since the proof of Proposition 1 is based on the diagonality of the conditional variance of $f_t$, it is straightforward to show that the columns of $C$ and the elements of $\tau$ corresponding to factors with linearly independent time-varying variances are identifiable (up to sign changes and permutations).
3.3.2 Conditionally Heteroskedastic Dynamic Factor Models

The formulation considered in section 2 is also a special case of the so-called dynamic factor model, which constitutes a popular specification for multivariate time series applications because of its plausibility and parsimony (see e.g. Engle and Watson (1982) or Peña and Box (1987)). For simplicity, we shall just consider here the case in which the factor dynamics can be captured by a VAR(1) process. Specifically,

$$
x_t = C y_t + w_t, \qquad y_t = A y_{t-1} + f_t
$$

where $y_t$ is a $k \times 1$ vector of dynamic factors, $A$ is the matrix of VAR coefficients, and $f_t$ and $w_t$ are defined as in (1)-(2). If $A = 0$, we go back to the traditional (i.e. static) factor model. On the other hand, when $A = I$ we have the common trends model (see e.g. Harvey (1989) or Stock and Watson (1988)). If $f_t$ is conditionally homoskedastic, it is well known that an o.e. model (up to unconditional second moments) can be obtained by orthogonally rotating $y_t$. That is, for any orthogonal matrix $Q$, the model $x_t = C^* y_t^* + w_t$, $y_t^* = A^* y_{t-1}^* + f_t^*$, where $y_t^* = Q y_t$, $f_t^* = Q f_t$, $C^* = C Q'$ and $A^* = Q A Q'$, is o.e. Again, Proposition 1 implies that linearly independent time-variability in the conditional variances of $f_t$ will eliminate the nonidentifiability of the matrix $C$.
3.3.3 Vector Autoregressive Moving Average Models

Our results also apply to models with $N$ common factors, no idiosyncratic noise and linear mean dynamics, such as VARMA($r,s$) models. Again, for simplicity consider the following VAR(1):

$$
x_t = A x_{t-1} + u_t, \qquad u_t = C f_t
$$

where $f_t$, an $N \times 1$ vector defined as in (1)-(2), could perhaps be better understood in this context as conditionally orthogonal "fundamental" shocks affecting the process $x_t$. Given that $f_t$ is white noise, we can estimate this model without taking into account the time-variation in conditional variances. But then $C$ is not identifiable without extra restrictions. This problem is well known and has received substantial attention in macroeconometrics. To solve it, some authors impose short run restrictions such as $C$ lower triangular (cf. the discussion in section 3.2). More recently, Blanchard and Quah (1989) have achieved identifiability by means of restrictions on some elements of the long run multipliers $(I - A)^{-1} C$. But suppose that some elements of $f_t$ have time-varying conditional variances and this is explicitly recognized in estimation. Then Proposition 2 implies that the columns of $C$ associated with those disturbances are identifiable.
In this context, we can perhaps shed more light on Proposition 1 by reinterpreting it as a uniqueness result for the disturbances, $f_t$. Given the way in which the model is defined, we know that there is a set of disturbances, conditionally uncorrelated with each other, that can be written as a (time-invariant) linear combination of the innovations in $x_t$, namely, $f_t = C^{-1} u_t$. If $k_2 \leq 1$, Proposition 2 then says that there is only one such set.⁵
3.3.4 Oblique Factor Models with Constant Conditional Covariances

So far we have assumed that the factors are conditionally orthogonal, since this has been a maintained assumption in all existing empirical applications. However, as the following proposition shows, it turns out that most of the identifiability is coming from the fact that the conditional covariances of conditionally orthogonal factors are (trivially) constant over time.
Proposition 3 Let $\Lambda_t$ be a $k \times k$ positive definite matrix of (possibly) time-varying factor variances but constant conditional covariances, and let $\lambda_t = \mathrm{vecd}(\Lambda_t)$. If the stochastic processes in $(\lambda_t', 1)$ are linearly independent, then $C$ is unique under orthogonal transformations other than column permutations and sign changes.
Notice that the main difference with Proposition 1 is that identification problems reappear in oblique factor models when a single factor has a constant conditional variance. The reason is that we can transfer unconditional variance from the conditionally homoskedastic factor to the others. This is not possible if the factors have to remain conditionally orthogonal.
Factor models with constant conditional covariances arise more commonly than it may appear. For instance, the factor ARCH model of Engle (1987) is o.e. to a whole family of oblique factor models with constant conditional covariances, whose limiting cases are the conditionally orthogonal factor model in (4), and a model with a singular idiosyncratic covariance matrix (see Sentana (1997a) for details). In fact, we can always express any conditionally heteroskedastic factor model as an oblique factor model with constant conditional covariances and a singular idiosyncratic covariance matrix, since:

$$
x_t = C f_t^G + w_t^G
$$

$$
\left.\begin{pmatrix} f_t^G \\ w_t^G \end{pmatrix}\right| X_{t-1} \sim N\left[\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \Lambda_t + (C' \Gamma^{-1} C)^{-1} & 0 \\ 0 & \Gamma - C (C' \Gamma^{-1} C)^{-1} C' \end{pmatrix}\right]
$$

where $f_t^G = (C' \Gamma^{-1} C)^{-1} C' \Gamma^{-1} x_t$ are the Generalized Least Squares (GLS) estimates of the common factors (see Gourieroux, Monfort and Renault (1991)). These factor scores are different from the minimum (conditional) mean square error estimates, but closely related, as $f_t^G = \left[I + (C' \Gamma^{-1} C)^{-1} \Lambda_t^{-1}\right] f_{t|t}$.

⁵ However, it is important to emphasize that Proposition 1 is not an existence result, in that it does not say whether or not such disturbances exist to begin with. Rather, it takes them as given.
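The link between the GLS scores and the Kalman scores can be checked numerically; the dimensions and parameter values below are arbitrary illustrative choices:

```python
# Verify f^G_t = [I + (C'Gamma^-1 C)^-1 Lambda_t^-1] f_{t|t} on random
# inputs; all values are illustrative.
import numpy as np

rng = np.random.default_rng(4)
N, k = 5, 2
C = rng.normal(size=(N, k))
Gamma = np.diag(rng.uniform(0.5, 1.5, N))
Lam = np.diag([1.3, 0.6])                       # current factor variances
x = rng.normal(size=N)                          # one observation

Gi = np.linalg.inv(Gamma)
M = C.T @ Gi @ C                                # C' Gamma^{-1} C
fG = np.linalg.solve(M, C.T @ Gi @ x)           # GLS factor scores
Sigma = C @ Lam @ C.T + Gamma
ftt = Lam @ C.T @ np.linalg.solve(Sigma, x)     # Kalman scores E(f_t | X_t)
link = (np.eye(k) + np.linalg.inv(M) @ np.linalg.inv(Lam)) @ ftt
```

The identity holds exactly, confirming that the GLS scores are just a rescaled (and $\Lambda_t$-dependent) version of the minimum mean square error scores.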

4 Estimation and Testing


In model (1)-(2), the parameters of interest, $\theta' = (c', \gamma', \psi')$, where $c = \mathrm{vec}(C)$ and $\gamma = \mathrm{vech}(\Gamma)$ or $\mathrm{vecd}(\Gamma)$, are usually estimated jointly from the log-likelihood function of the observed variables, $x_t$. Ignoring initial conditions, the log-likelihood function of a sample of size $T$ takes the form $L_T(\theta) = \sum_{t=1}^{T} l_t(\theta)$, where:

$$
l_t(\theta) = -\frac{N}{2} \ln 2\pi - \frac{1}{2} \ln \left| C \Lambda_t C' + \Gamma \right| - \frac{1}{2} x_t' \left( C \Lambda_t C' + \Gamma \right)^{-1} x_t \qquad (6)
$$

and $\Lambda_t = \mathrm{diag}[\lambda_t(\theta)]$, which allows the conditional variances of the factors to depend not only on $\psi$, but also on the static factor model parameters $c$ and $\gamma$.
Since the first order conditions are particularly complicated in this case (see appendix B), a numerical approach is usually required. Unfortunately, the application of standard quasi-Newton optimisation routines results in a very time consuming procedure, which is disproportionately more so as the number of series considered increases. In this respect, Demos and Sentana (1996b) show that using the EM algorithm combined with derivative-based methods significantly reduces the computational burden. Nevertheless, it is still of some interest to have simpler alternative estimation procedures.

4.1 Two-step consistent estimation procedures

Most empirical applications of the factor GARCH model have been carried out using a two-step univariate GARCH method under the assumption that the matrix $D$ is known. First, univariate models are fitted to $\dot{x}_{jt} = d_j' x_t$, $j = 1, 2, \ldots, k$. Then, the estimated conditional variances are taken as data in the estimation of $N$ univariate models for each $x_{it}$, $i = 1, 2, \ldots, N$. However, such a procedure ignores cross-sectional correlations and parameter restrictions, and thus sacrifices efficiency. For that reason, Demos and Sentana (1996b) proposed an EM-based restricted maximum likelihood estimator which exploits those restrictions but maintains the assumption of known $D$. In the general case, an equivalent assumption would be that the matrix $D' = (C' \Gamma^{-1} C)^{-1} C' \Gamma^{-1}$ is known, which is tantamount to $f_t^G$ being observed. Under such a maintained assumption, it is possible to prove that consistent estimates of $C$, $\Gamma$ and $\psi$ can be obtained by combining the estimates of the marginal model for $f_t^G$ with the estimates from the OLS regression of each $x_{it}$ on $f_t^G$ (see Sentana (1997b) for details). Unfortunately, the consistency of such restricted ML estimators crucially depends on the correct specification of the factor scores (see Lin (1992) for the factor GARCH case).

Here, we shall develop a two-step consistent estimation procedure which does not rely on knowledge of $f_t^G$ for those cases in which the idiosyncratic covariance matrix is diagonal. For clarity of exposition, we initially assume that the matrix $C$ is identifiable even if we ignore the time-variation in $\Lambda_t$.
The rationale for our proposed two-step estimator is as follows. We saw in section 2 that if $f_t$ and $w_t$ are covariance stationary, the unconditional covariance matrix, $\Sigma$, inherits the factor structure (cf. (5)). As our first step, therefore, we can estimate the unconditional variance parameters $c$ and $\gamma$ by pseudo-maximum likelihood using a standard factor analytic routine. Note that such estimators satisfy $(\hat{c}, \hat{\gamma}) = \arg\max_{c,\gamma} L_T(c, \gamma, 0)$. It is easy to see that $(\hat{c}, \hat{\gamma})$ are root-$T$ consistent, as the expected value of the score of the estimated model evaluated at the true parameter values is 0 under our assumptions. However, since the first derivatives are proportional to $\mathrm{vech}(x_t x_t')$ (see appendix B), the score does not preserve the martingale difference property when there are ARCH effects in the common factors, and it is necessary to compute robust standard errors which take into account its serial correlation.
Having obtained consistent estimates of $c$ and $\gamma$, we can then estimate the conditional variance parameters by maximizing (6) with respect to $\psi$ keeping $c$ and $\gamma$ fixed at their pseudo-maximum likelihood estimates. That is, our second step estimator is $\hat{\psi} = \arg\max_{\psi} L_T(\hat{c}, \hat{\gamma}, \psi)$. On the basis of well-known results from Durbin (1970), it is clear that $\hat{\psi}$ is also root-$T$ consistent. However, since the asymptotic covariance matrix is not generally block-diagonal between static and dynamic variance parameters (see appendix B), standard errors will be underestimated by the usual expressions. Asymptotically correct standard errors can be computed from an estimate of the inverse information matrix corresponding to (6) evaluated at the two-step estimators $\hat{c}$, $\hat{\gamma}$ and $\hat{\psi}$ (see Lin (1992) for an analogous correction in the factor GARCH case).
When $C$ is not identifiable from the unconditional covariance matrix, $\hat\gamma$ remains
consistent, but $\hat c$ is only consistent up to an orthogonal transformation. As
discussed in section 3.2, the reason is that by assuming unconditional normality
in estimation, we are neglecting very valuable information in dynamic fourth order
moments. One possibility would be to replace the Gaussian quasi-likelihood
in the first step by an alternative objective function which took into account the
autocorrelation in $vech(x_t x_t')$. Unfortunately, the evidence from univariate ARCH
models suggests that the resulting estimators are likely to be rather inefficient.
In any case, note that if we were to iterate our proposed two-step procedure and
achieved convergence, we would obtain fully efficient maximum likelihood estimates
of all model parameters. Such an iterated estimation procedure is closely
related to the zig-zag estimation method suggested in Demos and Sentana (1992),
which combined the EM algorithm to estimate the static factor parameters conditional
on the values of the conditional variance parameters, followed by the direct
maximization of (6) with respect to $\psi$ holding $c$ and $\gamma$ fixed.
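To fix ideas, the two-step procedure can be sketched in a few lines for a single factor model. The code below is an illustrative sketch only, not the estimator used in the paper: in step 1 a principal-factor decomposition of the sample covariance matrix stands in for the full Gaussian pseudo-ML, and step 2 maximizes a conditional Gaussian log-likelihood over GARCH(1,1)-type parameters $(\alpha, \beta)$ with the static parameters held fixed. All function and variable names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def static_step(x):
    """Step 1 (sketch): static estimates (c, gamma) for a one-factor model.

    A principal-factor decomposition of the sample covariance stands in for
    the Gaussian pseudo-ML of the unconditional model S = c c' + Gamma.
    """
    S = np.cov(x, rowvar=False)
    vals, vecs = np.linalg.eigh(S)
    c = vecs[:, -1] * np.sqrt(vals[-1])              # loading on largest root
    gamma = np.clip(np.diag(S) - c**2, 1e-6, None)   # idiosyncratic variances
    return c, gamma

def conditional_loglik(psi, x, c, gamma):
    """Gaussian log-likelihood of x when the factor variance lambda_t follows
    GARCH(1,1)-type dynamics driven by the filtered factor, (c, gamma) fixed."""
    alpha, beta = psi
    if alpha < 0 or beta < 0 or alpha + beta >= 1:
        return -np.inf                               # outside admissible region
    lam, f, om = 1.0, 0.0, 1.0                       # start at unconditional values
    ll = 0.0
    for xt in x:
        lam = (1.0 - alpha - beta) + alpha * (f**2 + om) + beta * lam
        Sig = lam * np.outer(c, c) + np.diag(gamma)
        Sinv = np.linalg.inv(Sig)
        ll -= 0.5 * (np.log(np.linalg.det(Sig)) + xt @ Sinv @ xt)
        om = 1.0 / (1.0 / lam + c @ (c / gamma))     # omega_{t|t}
        f = om * (c / gamma) @ xt                    # f_{t|t}
    return ll

def two_step(x):
    """Step 2: maximize the conditional log-likelihood over psi = (alpha, beta)."""
    c, gamma = static_step(x)
    res = minimize(lambda p: -conditional_loglik(p, x, c, gamma),
                   x0=np.array([0.1, 0.5]), method="Nelder-Mead")
    return c, gamma, res.x
```

Iterating the two steps until convergence would mimic the zig-zag scheme described above.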

4.2 A simple LM test for ARCH in the common factors

Despite the simplicity of the two-step procedure, the numerical maximization
of (6) with respect to $\psi$ in models such as (3) still involves the use of the Kalman
filter to produce estimates of $f_{jt-1|t-1}$ and $\omega_{jj,t-1|t-1}$ once per parameter per
iteration. Therefore, it is of some interest to have a simple preliminary test for the
presence of ARCH effects in the common factors. Moreover, since the way in which
standard errors are usually computed in static factor models is only valid under
conditional homoskedasticity, it is convenient to have a model diagnostic to assess
the validity of such a maintained assumption.
If the factors were observable, we could easily carry out standard LM tests
for ARCH on each of them. For the ARCH(1) case, for instance, that would entail
regressing 1 on $(f_{jt}^2 - 1)(f_{jt-1}^2 - 1)$, or equivalently $f_{jt}^2 - 1$ on $f_{jt-1}^2 - 1$. Unfortunately,
the factors are generally unobserved. Nevertheless, we can derive similar tests
using some factor estimates instead. Under conditional normality, $f_{t|t}$, the Kalman
filter based estimates of the underlying factors, satisfy:

$$f_{t|t} \mid X_{t-1} \sim N\left(0, \Lambda_t - \Omega_{t|t}\right) \qquad (7)$$

where $\Omega_{t|t} = \left[\Lambda_t^{-1} + (C'\Gamma^{-1}C)\right]^{-1}$. As a result, $f_{t|t}$ will be conditionally homoskedastic
if and only if $\Lambda_t$ is constant over time. Hence, had we data on $f_{t|t}$, we
could test whether or not the moment condition $cov\left(f_{jt|t}^2, f_{jt-1|t-1}^2\right) = 0$ holds for
$j = 1, \ldots, k$. Importantly, the aggregation results in Nijman and Sentana (1996)
imply that linear combinations of multivariate factor models like (1-2), whose
weights are not orthogonal to $C$, will follow weak GARCH processes. Therefore,
such moment tests will have non-trivial power, since under the alternative $f_{jt|t}$ will
show serial correlation in the squares.
In practice, we must base the tests on $f_{t|t}$ evaluated at the parameter estimates
under the null. In particular, we will use

$$\hat f_{t|t} = \hat\Omega_{t|t} \hat C' \hat\Gamma^{-1} x_t$$

where

$$\hat\Omega_{t|t} = \left[I + (\hat C'\hat\Gamma^{-1}\hat C)\right]^{-1} \quad \forall t$$
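Under the null these quantities are straightforward to compute, since the matrix inverse no longer depends on $t$. A minimal sketch (hypothetical names; the diagonal of $\Gamma$ is stored as a vector):

```python
import numpy as np

def factor_estimates_null(x, C, gamma):
    """Filtered factor estimates under H0 (Lambda_t = I for all t).

    With constant factor variances, Omega = [I_k + C' Gamma^{-1} C]^{-1}
    is time-invariant and f_{t|t} = Omega C' Gamma^{-1} x_t.
    x: (T, N) data, C: (N, k) loadings, gamma: (N,) idiosyncratic variances.
    """
    GinvC = C / gamma[:, None]                        # Gamma^{-1} C
    Omega = np.linalg.inv(np.eye(C.shape[1]) + C.T @ GinvC)
    F = x @ GinvC @ Omega                             # rows are f_{t|t}' (Omega symmetric)
    return F, Omega
```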

It turns out that the presence of parameter estimates does not affect the
asymptotic distribution of such tests, as the information matrix is block diagonal
between $\psi$ and $(c', \gamma')'$ under the null (see appendix B). Furthermore, we also prove
in appendix B that our proposed moment test is precisely the standard LM test
for conditional homoskedasticity in the common factors based on the score of
(6) evaluated under $H_0$. Therefore, we can compute a two-sided $\chi^2_1$ test against
ARCH(1) in each common factor as $T$ times the uncentred $R^2$ from the regression
of either 1 on $(\hat f_{jt|t}^2 + \hat\omega_{jj,t|t} - 1)$ times $(\hat f_{jt-1|t-1}^2 + \hat\omega_{jj,t-1|t-1} - 1)$ (outer-product
version), or $(\hat f_{jt|t}^2 + \hat\omega_{jj,t|t} - 1)$ on $(\hat f_{jt-1|t-1}^2 + \hat\omega_{jj,t-1|t-1} - 1)$ (Hessian-based version). In
fact, more powerful variants of these tests can be obtained by taking the one-sided
nature of the alternative hypothesis into account through the sign of the relevant
regression coefficient (see Demos and Sentana (1996a)).
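As an illustration, the Hessian-based variant takes only a few lines to compute. This is a sketch with hypothetical names, where `fhat` and `omhat` denote the series $\hat f_{jt|t}$ and $\hat\omega_{jj,t|t}$ for one factor:

```python
import numpy as np

def lm_arch_test(fhat, omhat):
    """T times the uncentred R^2 from regressing (fhat_t^2 + omhat_t - 1)
    on its own lag; asymptotically chi-squared with 1 d.o.f. under H0.

    The one-sided variant rejects only when the slope b is positive.
    """
    u = fhat**2 + omhat - 1.0
    y, z = u[1:], u[:-1]
    b = (z @ y) / (z @ z)                 # no-intercept OLS slope
    r2 = (b * b) * (z @ z) / (y @ y)      # uncentred R^2
    return len(y) * r2, b
```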

5 Monte Carlo Evidence

In a recent paper, Lin (1992) analyzes different estimation methods for the
factor GARCH model of Engle (1987) by means of a detailed Monte Carlo analysis.
In this section, we shall conduct a similar exercise for the conditionally
heteroskedastic latent factor model in (3). Unfortunately, given that the estimation
of these models is computationally rather intensive, we are forced to consider here
a smaller number of series than in many empirical applications. Nevertheless, we
select the parameter values, and in particular the signal-to-noise ratio, so as to
reflect empirically relevant situations.

5.1 A single factor model

We first generated 8000 samples of 240 observations each (plus another 100 for
initialization) of a trivariate single factor model using the NAG library G05DDF
routine. Such a sample size corresponds roughly to twenty years of monthly data,
five years of weekly data or one year of daily data. Since the performance of
the different estimators depends on $C$ and $\Gamma$ mostly through the scalar quantity
$(C'\Gamma^{-1}C)$, the model considered is:

$$x_{it} = c_i f_t + w_{it} \qquad (i = 1, 2, 3)$$

with $c = (1, 1, 1)'$, $\lambda_t = (1-\alpha-\beta) + \alpha(f_{t-1|t-1}^2 + \omega_{t-1|t-1}) + \beta\lambda_{t-1}$ and $\Gamma = \psi I$. Two
values of $\psi$ have been selected, namely 2 or 1/2, corresponding to low and high
signal to noise ratios, and three pairs of values for $\alpha$ and $\beta$, namely $(0, 0)$, $(.2, .6)$
and $(.4, .4)$, which represent constant variances, persistent but smooth GARCH
behaviour, and persistent but volatile conditional variances respectively. It is
worth mentioning that the pair $\alpha = .2$, $\beta = .6$ matches roughly what we tend to
see in the empirical literature. In order to minimize experimental error, we use
the same set of underlying random numbers in all designs. Maximization of the
log-likelihood (6) with respect to $c$, $\gamma$, $\alpha$ and $\beta$ was carried out using the NAG
library E04JBF routine. Initial values of the parameters were obtained by means
of the EM algorithm in Demos and Sentana (1996b).
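The design above is easy to reproduce. The following sketch simulates the trivariate single factor data generating process; note that, for simplicity, the latent $f_{t-1}^2$ stands in for the filtered quantity $f_{t-1|t-1}^2 + \omega_{t-1|t-1}$ in the variance recursion, and the function name is hypothetical:

```python
import numpy as np

def simulate_single_factor(T, psi, alpha, beta, c=(1.0, 1.0, 1.0),
                           burn=100, seed=0):
    """Simulate the trivariate single factor design of section 5.1.

    lambda_t = (1 - alpha - beta) + alpha * f_{t-1}^2 + beta * lambda_{t-1},
    x_t = c * f_t + w_t with Gamma = psi * I.  (Sketch: the latent f_{t-1}^2
    replaces the filtered f^2 + omega term used in the paper's recursion.)
    """
    rng = np.random.default_rng(seed)
    c = np.asarray(c)
    lam, f = 1.0, 0.0                 # unconditional factor variance start-up
    xs = []
    for t in range(T + burn):
        lam = (1.0 - alpha - beta) + alpha * f**2 + beta * lam
        f = np.sqrt(lam) * rng.standard_normal()
        x = c * f + np.sqrt(psi) * rng.standard_normal(3)
        if t >= burn:
            xs.append(x)
    return np.vstack(xs)
```

With $\alpha = \beta = 0$ each series has unconditional variance $c_i^2 + \psi$, which provides a quick sanity check on the simulator.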
For scaling purposes, we use $c_1^2 + c_2^2 + c_3^2 = 1$, and leave the constant part of
the conditional variance free. In order to guarantee the positivity and stationarity
restrictions $0 \le \alpha$, $0 \le \beta$, $\alpha + \beta \le 1$, we use the re-parametrisation $\alpha = \sin^2(\theta_1)$
and $\beta = \sin^2(\theta_2)(1 - \alpha)$. Similarly, we used $\gamma_i = \theta_i^2$. We also set $\lambda_1$ to the
unconditional variance of the common factor to start up the recursions. But since
this implies that $\beta$ is not identified if $\alpha = 0$, we set $\beta = 0$ whenever $\alpha = 0$.
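The re-parametrisation maps an unconstrained optimization problem into the admissible region; a quick sketch (hypothetical names):

```python
import numpy as np

def to_alpha_beta(theta1, theta2):
    """Map unconstrained (theta1, theta2) into the admissible region
    0 <= alpha, 0 <= beta, alpha + beta <= 1."""
    alpha = np.sin(theta1) ** 2                   # alpha in [0, 1]
    beta = np.sin(theta2) ** 2 * (1.0 - alpha)    # beta in [0, 1 - alpha]
    return alpha, beta
```

The optimizer can then search freely over $(\theta_1, \theta_2)$ while the implied $(\alpha, \beta)$ always satisfy positivity and stationarity.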
In this respect, it is important to mention that when $\alpha$ and/or $\beta$ are 0, the
parameter values lie on the boundary of the admissible range. The distribution of
the ML estimator and associated tests in those situations has been studied by Self
and Liang (1987) and Wolak (1989). When $\alpha = 0$, for instance, we could use the
result in case 2, theorem 2 of Self and Liang (1987), to show that the asymptotic
distribution of the ML estimators of $(\alpha, \beta, c', \gamma')'$ should be a $(\frac14, \frac12, \frac14)$ mixture of a)
the usual asymptotic distribution, b) the asymptotic distribution of a restricted
ML estimator which sets $\alpha = \beta = 0$, and c) the asymptotic distribution of a
restricted ML estimator which only sets $\beta = 0$. We take into account these results
in order to compute standard errors.

It is also important to mention that joint estimates are always at least as
efficient as two-step estimates in this context, since the information matrix is
block-diagonal between unconditional and conditional variance parameters under
the null of no ARCH.
Table 1 presents mean biases and standard deviations across replications for
joint and two-step maximum likelihood estimates of the static factor model parameters
$c$ and $\gamma$. For simplicity of exposition, only averages across equations are
included (in particular, $\bar c = (c_1 + c_2 + c_3)/3$ and $\bar\gamma = (\gamma_1 + \gamma_2 + \gamma_3)/3$). Note that
all estimates are very mildly downward biased. At the same time, it seems that
the more variability there is in conditional variances, the better joint estimates
are relative to two-step estimates. Nevertheless, the differences are minor, at least
for the sample size used.
Given the large number of parameters involved, we summarize the performance
of the estimates of the asymptotic covariance matrix of these estimators by computing
the experimental distribution of some simple test statistics. In particular,
we test $c_1 = c_2 = c_3$ and $\gamma_1 = \gamma_2 = \gamma_3$. Both tests should have asymptotic $\chi^2_2$
distributions under the null. Standard errors for joint ML estimates are computed
from the Hessian. On the other hand, the usual sandwich estimator with a 4-lag
triangular window is employed for two-step estimates of the static factor parameters.
The results, which are not reported for conciseness, suggest that the size
distortions are not very large.
Our experimental design also allows us to analyze the performance of the
different LM tests for ARCH under the null, and under two alternatives. In order
to evaluate their size properties, we employ the p-value discrepancy plots
proposed by Davidson and MacKinnon (1996), which are plots of the difference
between actual and nominal test size versus nominal test size for all possible
test sizes. If the asymptotic distribution is correct, p-value discrepancy plots
should be close to the x axis. Figure 1 shows such plots for the one-sided and
two-sided versions of the outer-product and Hessian-based forms of the LM test.
As expected, the outer-product versions have much larger distortions than the
Hessian-based ones, whose sizes are fairly accurate.
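The p-value discrepancy plot is easy to reproduce from simulation output. A sketch with hypothetical names, where `cdf` is the asymptotic null c.d.f. of the test statistic:

```python
import numpy as np

def pvalue_discrepancy(stats, cdf, grid=None):
    """Return (nominal sizes, actual - nominal) for a Davidson-MacKinnon
    p-value discrepancy plot.

    For each nominal size a, the actual size is the fraction of simulated
    statistics whose asymptotic p-value 1 - cdf(stat) is at most a.
    """
    if grid is None:
        grid = np.linspace(0.001, 0.999, 199)
    pvals = 1.0 - cdf(np.asarray(stats))
    actual = np.array([(pvals <= a).mean() for a in grid])
    return grid, actual - grid
```

Plotting the second output against the first gives the discrepancy curve; a correct asymptotic approximation keeps it close to the horizontal axis.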
In order to display the simulation evidence on the power of the different tests,
we employ the size-power curves of Davidson and MacKinnon (1996), which
are plots of test power versus actual test size for all possible test sizes. The main
advantage of size-power plots is that they allow us to see immediately the effect
on power of different parameter values, as well as to compare the relative powers
of test statistics that have different null distributions. Figure 2 presents such plots
for the Hessian-based one-sided and two-sided tests. As can be seen, power is an
increasing function of both the value of $\alpha$ and the signal-to-noise ratio. Also, our
results confirm that one-sided versions are always more powerful than two-sided
ones, although not overwhelmingly so (cf. Demos and Sentana (1996a)).

Table 2 presents the proportion of estimates of $\alpha$ and $\beta$ which are at the
boundary of the parameter space. Asymptotically, the proportions of $\hat\alpha = \hat\beta = 0$
and $\hat\alpha \ne 0$, $\hat\beta = 0$ should be $(\frac12, \frac14)$ under the null of no ARCH, and $(0, 0)$ under
the alternative. However, the results show that $\hat\alpha = 0$ and, especially, $\hat\beta = 0$
occur more frequently than what the asymptotic distribution would suggest. This
is particularly true when the signal-to-noise ratio is small. These results are
confirmed in Table 3, which presents mean biases and standard deviations across
replications for joint and two-step maximum likelihood estimates of $\alpha$ and $\beta$. In
this respect, it is important to mention that since $\beta$ is not identified when $\alpha = 0$,
the reported values for $\beta$ correspond to those cases in which $\alpha$ is not estimated
as 0. Note that the $\hat\alpha$'s obtained are rather more accurate than the $\hat\beta$'s. Also
note that the biases for the joint estimates of $\alpha$ are smaller than for the two-step
ones, although the latter have smaller Monte Carlo variability. In contrast, the
downward biases in $\beta$ are larger for joint ML estimates. To some extent, these
biases reflect the larger proportion of zero $\hat\beta$'s in Table 2.

5.2 A two factor model

We have also simulated the following six-variate model with two factors:

$$x_{it} = c_{i1} f_{1t} + c_{i2} f_{2t} + w_{it}$$

with $\lambda_{11,t} = (1-\alpha-\beta) + \alpha(f_{1t-1|t-1}^2 + \omega_{11,t-1|t-1}) + \beta\lambda_{11,t-1}$, $\lambda_{22,t} = 1$ and $\Gamma = \psi I$.
Please note that, according to Proposition 1, the parameters in $C$ are identified
without further restrictions, provided that $\alpha \ne 0$ and we take into account the
time-variation in conditional second moments.
Two sets of values for $C$ have been selected, $c' = (0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0)$
and $c' = (\frac14, \frac14, \frac14, 1, 1, 1, 1, 1, 1, \frac14, \frac14, \frac14)$. The first design corresponds to two trivariate
single factor models like the one considered in the previous subsection put
together, while the second design introduces correlation in the columns of $C$.
For each value of $c$, two values of $\psi$ have been selected, namely 2 and 1/2,
corresponding to low and high signal to noise ratios. Then for each of the four
combinations, we consider two pairs of values for $\alpha$ and $\beta$, namely $(.4, .4)$ and
$(.2, .6)$, in order to obtain persistent but volatile conditional variances, and the
more realistic persistent but smooth GARCH behaviour. Given that this model
is four times as costly to estimate as the previous one, we only generated 2000
samples of 240 observations each. The remaining estimation details are the same
as in section 5.1.⁶
Table 4 presents mean biases and standard deviations across replications for
joint and two-step maximum likelihood estimates, as well as a restricted ML estimator
which imposes the same identifying restriction as the two-step estimator,
namely $c_{62} = 0$. Such an estimator is efficient when the overidentifying restriction
is true, but becomes inconsistent when it is false. More precisely, if $C$ is not unconditionally
identifiable, restricted and two-step ML estimators of $c$ are consistent
for the orthogonal transformation of the true parameter values which zeroes $c_{62}$.
For simplicity of exposition, only certain averages across equations are included (in
particular, $\bar c_{a1} = (c_{11}+c_{21}+c_{31})/3$, $\bar c_{b1} = (c_{41}+c_{51}+c_{61})/3$, $\bar c_{a2} = (c_{12}+c_{22}+c_{32})/3$,
$\bar c_{b2} = (c_{42}+c_{52})/2$, and $\bar\gamma = (\gamma_1+\gamma_2+\gamma_3+\gamma_4+\gamma_5+\gamma_6)/6$).
The first panel of Table 4 contains the results for those designs in which
$c' = (0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0)$. Not surprisingly, the restricted ML estimator is
clearly the best as far as estimates of the factor loadings are concerned. However,
it turns out that the two-step estimator performs very similarly, except when there
is significant variability in conditional variances, which is in line with the results
for the single factor model. On the other hand, the joint ML estimator is the
worst performer when the signal to noise ratio and the variability in $\lambda_{11,t}$ are low,
but comes very close to the restricted ML in the opposite case.⁷ This behaviour
is not unexpected, given that the identifiability of the joint ML estimator comes
from the fact that $\lambda_{11,t}$ changes over time, while the identifiability of the other two
estimators is obtained from the restriction $c_{62} = 0$. Nevertheless, it seems that the
latter identifiability condition is more informative than the former, which should
be borne in mind in empirical work.

⁶ One additional issue that arose during the simulations with two factor models was that,
occasionally, some idiosyncratic variances were estimated as 0. The incidence of these so-called
Heywood cases increased with the value of $\psi$, and especially $c_{62}$. Nevertheless, since at worst
only 35 out of 2000 replications had this problem, we discarded them, and replaced them by
new ones.
In contrast, there are only minor differences between the different estimates
of the idiosyncratic variance parameters, which are always identified. Obviously,
their Monte Carlo standard deviations increase when $\psi$ changes from 1/2 to 2,
but the coefficients of variation remain approximately the same.
The second panel of Table 4 contains the results for those designs in which
$c' = (\frac14, \frac14, \frac14, 1, 1, 1, 1, 1, 1, \frac14, \frac14, \frac14)$. Note that the different estimates of $\gamma_j$ are hardly
affected. As expected, though, the behaviour of both restricted and two-step
factor loading estimators radically changes, as they clearly become inconsistent.
In contrast, the performance of the joint estimates of $c$ is basically the same as in
the first panel.
In order to summarize the performance of the estimates of the asymptotic
covariance matrix of these estimators, we computed the experimental distribution
of some simple test statistics. In particular, we test $c_{11} = c_{21} = c_{31}$, $c_{41} = c_{51} = c_{61}$,
$c_{12} = c_{22} = c_{32}$, $\gamma_1 = \gamma_2 = \gamma_3$ and $\gamma_4 = \gamma_5 = \gamma_6$. Given our choice of parameter
values, the plims of all the estimators satisfy these restrictions even when the
assumption $c_{62} = 0$ is false. Therefore, all five tests should have asymptotic $\chi^2_2$
distributions. The results, not reported for conciseness, suggest that the size
distortions associated with the two-step estimator, for which the usual sandwich
expression with a 4-lag triangular window is employed, are small, but larger than
those for joint and restricted ML estimators.

⁷ Since the joint estimates of $c$ are not identified when $\alpha = 0$, the reported values correspond
to those cases in which $\alpha$ is not estimated as 0.


Our design also allows us to consider the nite sample distribution of the likelihood ratio test for the restriction c62 = 0, both under the null and under the
alternative. The p-value discrepancy plot presented in Figure 3 shows that nominal test sizes are fairly accurate at the 5% level, although less so when is small.
For very large signicance levels, however, the size distortions are higher, because
the LR test takes the value 0 when is estimated as 0. The distribution of this
test under the alternative, though, is far more interesting, as it provides a summary indicator of the determinants of the information content in our identiability
restrictions. Figure 4 present the size-power curves for the four experimental designs in which c62 6= 0. Although null and alternative experimental designs dier

in more than one parameter, we have done the required implicit size-corrections
in these plots using the closest match (cf. Davidson and MacKinnon (1996)). Not
surprisingly, the absolute power of the test is small, as the Monte Carlo variability
in the joint estimator of c62 is large relative to the re-scaled value of this parameter (' :14) for the sample size considered (see Table 4). Nevertheless, it is clear
that the power of the test increases with the signal-to-noise ratio, and especially,
with the variability of the conditional variance of the factor. This conrms the
crucial role that changes in 11;t play in the identiability of the model, as stated
in Proposition 1.
Table 5 presents the proportion of estimates of $\alpha$ and $\beta$ which are at the
boundary of the parameter space. In all cases, the proportions of $\hat\alpha = \hat\beta = 0$
and $\hat\alpha \ne 0$, $\hat\beta = 0$ should be $(0, 0)$ asymptotically. But as in the single factor
model, the results show that $\hat\alpha = 0$ and, especially, $\hat\beta = 0$ occur more frequently
than what the asymptotic distribution would suggest. This is particularly true
when the signal-to-noise ratio is small. These results are confirmed in Table 6,
which presents mean biases and standard deviations across replications for joint,
restricted and two-step maximum likelihood estimators of $\alpha$ and $\beta$. Once more, the
$\alpha$'s are estimated rather more accurately than the $\beta$'s, which reflects the larger
proportion of zero $\hat\beta$'s in Table 5. As in Table 4, though, there are significant
differences between the first and second panel. While the performance of the joint ML
estimator is by and large independent of whether or not $c_{62} = 0$, the behaviour of
the restricted and two-step estimators radically changes, and they clearly become
inconsistent.

6 Conclusions

In this paper we investigate some important issues related to the identification,
estimation and testing of multivariate conditionally heteroskedastic factor models.
We begin by re-examining the identification problems of traditional factor analysis.
It turns out that the model considered here only suffers from lack of identification
in as much as the variances of some of the common factors are constant. Thus,
there is a non-trivial advantage in explicitly recognizing the existence of dynamic
heteroskedasticity when estimating factor analytic models. Our results also apply
to other popular time series models, and in particular, to dynamic versions of
the APT in which the variances of the common factors affect the mean of $x_t$.
Importantly, our result could also be useful in the interpretation of common trend-dynamic
factor models, and in the identification of fundamental disturbances from
vector autoregressions.
Secondly, we propose a root-$T$ consistent two-step estimation procedure for
these models which does not rely on knowledge of (some consistent estimates
of) the factors. For those cases in which the idiosyncratic covariance matrix is
diagonal, and the factor loadings are identified even if we ignore the time-variation
in the factor variances, our procedure involves estimating the factor loadings and
idiosyncratic variances by pseudo-maximum likelihood based on the unconditional
covariance matrix. Then, the conditional variance parameters are estimated by
maximizing the log-likelihood function of the observed variables keeping the static
factor model parameters fixed at their pseudo-maximum likelihood estimates. In
this respect, we also explain how to compute correct standard errors.
Thirdly, we develop a simple preliminary moment test for the presence of ARCH
effects in the common factors, which can also be employed as a model diagnostic.
This is particularly relevant because the way in which standard errors are usually
computed in static factor models is only valid under conditional homoskedasticity.
Importantly, we prove that our proposed test is precisely the standard LM test
for conditional homoskedasticity in the common factors based on the score of the
joint model evaluated under the null. Not surprisingly, it can be computed as $T$
times the uncentred $R^2$ from an auxiliary regression involving squares of the best
estimates of the factors and their lags. In fact, more powerful versions of these
tests can be obtained by taking the one-sided nature of the alternative hypothesis
into account.
Finally, we investigate the finite sample properties of our proposed estimators
and hypothesis tests by simulation methods in order to assess the reliability of
their asymptotic distributions in practice. Our results suggest that: (i) the efficiency
of joint ML estimates of $c$ and $\gamma$ relative to two-step estimates increases
with the variability in conditional variances; (ii) standard errors of the estimates
are fairly accurate; (iii) size distortions of the LM test for ARCH are far smaller for
Hessian-based versions than for outer-product ones; (iv) the power of this test is
an increasing function of $\alpha$ and the signal-to-noise ratio, with one-sided versions
being preferred; (v) ARCH and GARCH parameters are estimated as 0 more frequently
than they should be, especially when the signal-to-noise ratio is small, which
results in significant downward biases; and (vi) although time-variation in factor
variances ensures identification in practice, traditional conditions on $C$ are more
informative, as long as they are correct.


The conditionally heteroskedastic factor model in (1-2) is a special case of
the general approximate conditional factor representation $\Sigma_t = C_t C_t' + \Gamma_t$, where
$C_t$ is an $N \times k$ matrix of measurable functions of the information set and $\Gamma_t$ is
such that its eigenvalues remain bounded as $N$ increases. In this framework, our
model can be written as $x_t = C_t f_t^{\#} + w_t$, where $V(w_t|X_{t-1}) = \Gamma$, $V(f_t^{\#}|X_{t-1}) = I$
and $C_t = C\Lambda_t^{1/2}$, so that the loadings of different variables on each conditionally
homoskedastic factor change proportionately over time (see Engle, Ng and
Rothschild (1990)). The motivation for such an assumption is twofold. First, it
provides a parsimonious and plausible specification of the time variation in $\Sigma_t$,
and for that reason has been the only one adopted so far in empirical applications.
Second, it implies that the unconditional factor representation of $x_t$ is well defined
(provided unconditional variances are bounded), which makes it compatible with
the standard approach based on $\Sigma$, and therefore empirically relevant. Notice
that even if $\Gamma_t$ is diagonal, the unconditional variance of a process characterized
by a conditional factor representation may very well lack an unconditional factor
structure for any $k < N$ (see Hansen and Richard (1987)). Although the model is
not identifiable if $C_t$ is unspecified, this paper shows that the statistical properties
of alternative plausible formulations of the general conditional factor model would
certainly merit a close look.

References

Blanchard, O.J. and Quah, D. (1989): "The dynamic effects of aggregate
demand and supply disturbances", American Economic Review 79, 655-673.

Bollerslev, T. and Engle, R.F. (1993): "Common persistence in conditional
variances", Econometrica 61, 166-187.

Bollerslev, T. and Wooldridge, J.M. (1992): "Quasi-maximum likelihood estimation
and inference in dynamic models with time-varying covariances", Econometric
Reviews 11, 143-172.

Chamberlain, G. and Rothschild, M. (1983): "Arbitrage, factor structure, and
mean-variance analysis on large asset markets", Econometrica 51, 1281-1304.

Davidson, R. and MacKinnon, J.G. (1996): "Graphical methods for investigating
the size and power of hypothesis tests", mimeo, GREQAM.

Demos, A. and Sentana, E. (1992): "An EM-based algorithm for conditionally
heteroskedastic factor models", LSE FMG Discussion Paper 140.

Demos, A. and Sentana, E. (1996a): "Testing for GARCH effects: A one-sided
approach", CEMFI Working Paper 9611.

Demos, A. and Sentana, E. (1996b): "An EM algorithm for conditionally heteroskedastic
factor models", CEMFI Working Paper 9615, forthcoming in Journal
of Business and Economic Statistics.

Diebold, F.X. and Nerlove, M. (1989): "The dynamics of exchange rate volatility:
A multivariate latent factor ARCH model", Journal of Applied Econometrics
4, 1-21.

Dunn, J.E. (1973): "A note on a sufficiency condition for uniqueness of a
restricted factor matrix", Psychometrika 38, 141-143.

Durbin, J. (1970): "Testing for serial correlation in least-squares regression
when some of the regressors are lagged dependent variables", Econometrica 38,
410-421.

Engle, R.F. (1987): "Multivariate ARCH with factor structures - cointegration
in variance", mimeo, University of California at San Diego.

Engle, R.F., Ng, V.M. and Rothschild, M. (1990): "Asset pricing with a factor
ARCH structure: Empirical estimates for Treasury Bills", Journal of Econometrics
45, 213-237.

Engle, R.F. and Watson, M. (1981): "A one-factor multivariate time series
model of metropolitan wage rates", Journal of the American Statistical Association
76, 774-781.

Gourieroux, C., Monfort, A. and Renault, E. (1991): "A general framework
for factor models", mimeo, INSEE.

Hansen, L.P. and Richard, S.F. (1987): "The role of conditioning information
in deducing testable restrictions implied by dynamic asset pricing models",
Econometrica 55, 587-613.

Harvey, A.C. (1989): Forecasting, Structural Time Series Models and the
Kalman Filter, Cambridge University Press, Cambridge.

Harvey, A.C., Ruiz, E. and Sentana, E. (1992): "Unobservable component time
series models with ARCH disturbances", Journal of Econometrics 52, 129-157.

Jennrich, R.I. (1978): "Rotational equivalence of factor loading matrices with
specified values", Psychometrika 43, 421-426.

Johnson, R.A. and Wichern, D.W. (1992): Applied Multivariate Statistical
Analysis, 3rd edition, Prentice-Hall.

King, M.A., E. Sentana, and S.B. Wadhwani (1994): "Volatility and links
between national stock markets", Econometrica 62, 901-933.

Kroner, K.F. (1987): "Estimating and testing for factor GARCH", mimeo,
University of California at San Diego.

Lin, W.L. (1992): "Alternative estimators for factor GARCH models: A Monte
Carlo comparison", Journal of Applied Econometrics 7, 259-279.

Magnus, J.R. (1988): Linear Structures, Oxford University Press, New York.

Magnus, J.R. and Neudecker, H. (1988): Matrix Differential Calculus with
Applications in Statistics and Econometrics, Wiley, Chichester.

Ng, V.M., Engle, R.F. and Rothschild, M. (1992): "A multi-dynamic factor
model for stock returns", Journal of Econometrics 52, 245-266.

Nijman, T. and Sentana, E. (1996): "Marginalization and contemporaneous
aggregation of multivariate GARCH processes", Journal of Econometrics 71, 71-87.

Peña, D. and Box, G.E.P. (1987): "Identifying a simplifying structure in time
series", Journal of the American Statistical Association 82, 836-843.

Ross, S. (1976): "The arbitrage theory of capital asset pricing", Journal of
Economic Theory 13, 341-360.

Self, S.G. and Liang, K.Y. (1987): "Asymptotic properties of maximum likelihood
estimators and likelihood ratio tests under nonstandard conditions", Journal
of the American Statistical Association 82, 605-610.

Sentana, E. (1992): "Identification of multivariate conditionally heteroskedastic
factor models", LSE FMG Discussion Paper 139.

Sentana, E. (1995): "Quadratic ARCH models", Review of Economic Studies
62, 639-661.

Sentana, E. (1997a): "The relation between conditionally heteroskedastic factor
models and factor GARCH models", mimeo, CEMFI.

Sentana, E. (1997b): "Risk and return in the Spanish stock market: Some
evidence from individual assets", CEMFI Working Paper 9702, forthcoming in
Investigaciones Económicas.

Stock, J.H. and Watson, M.W. (1988): "Testing for common trends", Journal
of the American Statistical Association 83, 1097-1107.

Wolak, F.A. (1989): "Local and global testing of linear and nonlinear inequality
constraints in nonlinear econometric models", Econometric Theory 5, 1-35.

Appendices

A Proofs

A.1 Proposition 1

Let $Q$ be an arbitrary $k \times k$ orthogonal matrix with typical element $[Q]_{ij} = q_{ij}$,
such that $Q'Q = QQ' = I_k$. Since the covariance matrix of the transformed factors
$f_t^* = Qf_t$ is $\Lambda_t^* = Q\Lambda_t Q'$, with typical element $[\Lambda_t^*]_{ij} = \sum_{l=1}^k \lambda_{ll,t}\, q_{il} q_{jl}$, conditional
orthogonality requires

$$\sum_{l=1}^k \lambda_{ll,t}\, q_{il} q_{jl} = 0 \quad \text{for } j > i;\ i = 1, 2, \ldots, k \text{ and } t = 1, 2, \ldots, T.$$

For a given $i, j$ ($j > i$), these restrictions can be expressed in matrix notation as:

$$\tilde\Lambda_T\, \tilde q_{ij} = 0_T \qquad (A1)$$

where

$$\tilde\Lambda_T = \begin{pmatrix} \lambda_{11,1} & \lambda_{22,1} & \cdots & \lambda_{kk,1} \\ \lambda_{11,2} & \lambda_{22,2} & \cdots & \lambda_{kk,2} \\ \vdots & \vdots & & \vdots \\ \lambda_{11,T} & \lambda_{22,T} & \cdots & \lambda_{kk,T} \end{pmatrix} = \begin{pmatrix} \lambda_1' \\ \lambda_2' \\ \vdots \\ \lambda_T' \end{pmatrix}$$

is a $T \times k$ matrix, $\iota_T$ a $T \times 1$
vector of ones, and $\tilde q_{ij} = \left(q_{i1}q_{j1},\ q_{i2}q_{j2},\ \ldots,\ q_{ik}q_{jk}\right)'$ a $k \times 1$ vector. We can
regard (A1) as a set of $T$ homogeneous linear equations in $k$ unknowns, $\tilde q_{ij}$. Given
that rank$(\tilde\Lambda_T) = k$ when the stochastic processes in $\lambda_t$ are linearly independent,
the only solution to the above system of equations is $\tilde q_{ij} = 0_k$, irrespective
of $i$ and $j$. That is, we must have that for all $j > i$, $i = 1, 2, \ldots, k$, $q_{il}q_{jl} = 0$ for
$l = 1, 2, \ldots, k$, which in turn requires $q_{il} = 0$ and/or $q_{jl} = 0$. Therefore, there
cannot be two elements in any column of $Q$ which are different from 0. Given that
$Q$ is an orthogonal matrix, the only admissible transformations are permutations
of Cholesky square roots of the unit matrix, $I_k^{1/2}$, where $[I_k^{1/2}]_{ij} = \pm 1$ for $i = j$
and 0 otherwise. $\square$

A.2 Proposition 2

In this case (A1) also applies, but since $\lambda_t' = (\lambda_{1t}',\ \lambda_{kk,t}\,\iota_{k_2}')$, we can re-write it
as

$$\bar\Lambda_T\, \bar q_{ij} = 0_T \qquad (A2)$$

where

$$\bar\Lambda_T = \begin{pmatrix} \lambda_{11,1} & \lambda_{22,1} & \cdots & \lambda_{k_1 k_1,1} & \lambda_{kk,1} \\ \lambda_{11,2} & \lambda_{22,2} & \cdots & \lambda_{k_1 k_1,2} & \lambda_{kk,2} \\ \vdots & \vdots & & \vdots & \vdots \\ \lambda_{11,T} & \lambda_{22,T} & \cdots & \lambda_{k_1 k_1,T} & \lambda_{kk,T} \end{pmatrix} = \begin{pmatrix} \lambda_{11}' & \lambda_{kk,1} \\ \lambda_{12}' & \lambda_{kk,2} \\ \vdots & \vdots \\ \lambda_{1T}' & \lambda_{kk,T} \end{pmatrix}$$

is a $T \times (k_1 + 1)$ matrix, and $\bar q_{ij} = \left(q_{i1}q_{j1},\ q_{i2}q_{j2},\ \ldots,\ q_{ik_1}q_{jk_1},\ \sum_{l=k_1+1}^k q_{il}q_{jl}\right)'$ a $(k_1 + 1) \times 1$ vector.

Since rank$(\bar\Lambda_T) = k_1 + 1$ by assumption, $\bar q_{ij} = 0_{k_1+1}$ irrespective of $i$ and
$j$. That is, for all $j > i$, $i = 1, 2, \ldots, k$ we must have $q_{il}q_{jl} = 0$ for $l = 1, 2, \ldots, k_1$,
and also $\sum_{l=k_1+1}^k q_{il}q_{jl} = 0$. The first set of restrictions implies that there cannot
be two elements in the first $k_1$ columns of $Q$ which are different from 0. Let us
partition $Q$ conformably as:

$$Q = \begin{pmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{pmatrix}$$

Then, given that $Q$ is orthogonal, if we exclude mere permutations of the factors,
it must be the case that $Q_{11} = I_{k_1}^{1/2}$, $Q_{21} = 0$, $Q_{12} = 0$ and $Q_{22}$ is orthogonal. $\square$

A.3 Proposition 3

First of all, note that $\Lambda_t^* = Q\Lambda_t Q' = Q\,dg(\Lambda_t)Q' + Q[\Lambda_t - dg(\Lambda_t)]Q'$. But
since $\Lambda_t - dg(\Lambda_t)$ is time-invariant by assumption, constant conditional covariances
for $\Lambda_t^*$ simply requires that $Q\,dg(\Lambda_t)Q'$ is also time-invariant. Given that the
$ij$-th element of $Q\,dg(\Lambda_t)Q'$ is $\sum_{l=1}^k \lambda_{ll,t}\, q_{il}q_{jl}$, this requires $\sum_{l=1}^k \lambda_{ll,t}\, q_{il}q_{jl} = \phi_{ij}$ for
$j > i$, $i = 1, 2, \ldots, k$ and $t = 1, 2, \ldots, T$. For a given $i, j$ ($j > i$), these restrictions
can be expressed in matrix notation as:

$$\tilde\Lambda_T\, \tilde q_{ij} = \phi_{ij}\,\iota_T \qquad (A3)$$

We can regard (A3) as a set of $T$ non-homogeneous linear equations in $k$ unknowns,
$\tilde q_{ij}$. Given that rank$(\tilde\Lambda_T \mid \iota_T) = k + 1$ when the stochastic processes in $(\lambda_t', 1)$ are
linearly independent, the only way the above system of equations can have a
solution is if $\phi_{ij} = 0$ for all $j > i$, $i = 1, 2, \ldots, k$. That is, if $Q\,dg(\Lambda_t)Q'$ remains
diagonal for $t = 1, 2, \ldots, T$. In that case, the proof of Proposition 1 applies.

The score and information matrix of a conditionally heteroskedastic factor model

Let $\theta'=(c',\gamma',\alpha')$ denote the vector of parameters of interest, with $c=\mathrm{vec}(C)$ and $\gamma=\mathrm{vecd}(\Gamma)$. Bollerslev and Wooldridge (1992) and Kroner (1987) show that the score function $s_t(\theta)=\partial l_t(\theta)/\partial\theta$ of any conditionally heteroskedastic multivariate model with zero conditional mean is given by the following expression:
$$
s_t(\theta)=\frac{1}{2}\,\frac{\partial\,\mathrm{vec}'(\Sigma_t)}{\partial\theta}\,\bigl[\Sigma_t^{-1}\otimes\Sigma_t^{-1}\bigr]\,\mathrm{vec}\bigl(x_tx_t'-\Sigma_t\bigr)
$$
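As a quick sanity check on this expression, the following sketch (with made-up toy values for $C$, $\Gamma$ and a single observation $x_t$) compares the analytical score for one idiosyncratic variance with a finite-difference derivative of the Gaussian log-likelihood:

```python
import numpy as np

# Toy check of the score formula for gamma_1 in Sigma = C C' + diag(gamma):
# the gamma_i element of the score is (1/2)[Sigma^{-1}(x x' - Sigma)Sigma^{-1}]_{ii}.
rng = np.random.default_rng(1)
N = 4
C = rng.normal(size=(N, 1))                 # assumed loadings
gamma = np.array([0.5, 0.8, 1.2, 2.0])      # assumed idiosyncratic variances
x = rng.normal(size=N)                      # one observation

def loglik(g0):
    gam = gamma.copy()
    gam[0] = g0
    Sigma = C @ C.T + np.diag(gam)
    _, logdet = np.linalg.slogdet(Sigma)
    return -0.5 * (N * np.log(2 * np.pi) + logdet + x @ np.linalg.solve(Sigma, x))

Sigma = C @ C.T + np.diag(gamma)
Si = np.linalg.inv(Sigma)
score = 0.5 * (Si @ (np.outer(x, x) - Sigma) @ Si)[0, 0]   # analytical score

h = 1e-6
fd = (loglik(gamma[0] + h) - loglik(gamma[0] - h)) / (2 * h)  # central difference
assert abs(score - fd) < 1e-6
```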

Then, since the differential of $\Sigma_t$ is
$$
d(C\Lambda_tC'+\Gamma)=(dC)\Lambda_tC'+C(d\Lambda_t)C'+C\Lambda_t(dC')+d\Gamma
$$
(cf. Magnus and Neudecker (1988)), we have that the three terms of the Jacobian corresponding to $c$, $\gamma$ and $\alpha$ will be:
$$
\frac{\partial\,\mathrm{vec}(\Sigma_t)}{\partial c'}=(I_{N^2}+K_N)(C\Lambda_t\otimes I_N)+(C\otimes C)E_k\frac{\partial\lambda_t(\theta)}{\partial c'}
$$
$$
\frac{\partial\,\mathrm{vec}(\Sigma_t)}{\partial\gamma'}=E_N+(C\otimes C)E_k\frac{\partial\lambda_t(\theta)}{\partial\gamma'}
$$
$$
\frac{\partial\,\mathrm{vec}(\Sigma_t)}{\partial\alpha'}=(C\otimes C)E_k\frac{\partial\lambda_t(\theta)}{\partial\alpha'}
$$
where $\lambda_t(\theta)=\mathrm{vecd}(\Lambda_t)$, $E_n$ is the unique $n^2\times n$ "diagonalization" matrix which transforms $\mathrm{vec}(A)$ into $\mathrm{vecd}(A)$ as $\mathrm{vecd}(A)=E_n'\,\mathrm{vec}(A)$, and $K_n$ is the square commutation matrix (see Magnus (1988)).
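Both matrices can be built explicitly for small $n$; the sketch below (illustrative only) verifies $\mathrm{vecd}(A)=E_n'\,\mathrm{vec}(A)$ and $K_n\,\mathrm{vec}(A)=\mathrm{vec}(A')$ numerically:

```python
import numpy as np

n = 3
I = np.eye(n)
# E_n is the n^2 x n matrix whose j-th column is e_j (kron) e_j
E = np.column_stack([np.kron(I[:, j], I[:, j]) for j in range(n)])
# K_n maps vec(A) to vec(A'); vec stacks columns, so entry (r,c) sits at c*n+r
K = np.zeros((n * n, n * n))
for r in range(n):
    for c in range(n):
        K[c * n + r, r * n + c] = 1.0

A = np.arange(1.0, 10.0).reshape(n, n)
vecA = A.reshape(-1, order="F")          # column-major vec
assert np.allclose(E.T @ vecA, np.diag(A))                 # vecd(A) = E' vec(A)
assert np.allclose(K @ vecA, A.T.reshape(-1, order="F"))   # K vec(A) = vec(A')
```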
After some straightforward algebraic manipulations, we get
$$
s_t(\theta)=\begin{pmatrix}
\mathrm{vec}\bigl(\Sigma_t^{-1}x_tx_t'\Sigma_t^{-1}C\Lambda_t-\Sigma_t^{-1}C\Lambda_t\bigr)\\[4pt]
\tfrac{1}{2}\,\mathrm{vecd}\bigl(\Sigma_t^{-1}x_tx_t'\Sigma_t^{-1}-\Sigma_t^{-1}\bigr)\\[4pt]
0
\end{pmatrix}
+\frac{1}{2}\begin{pmatrix}
\partial\lambda_t'(\theta)/\partial c\\[2pt]
\partial\lambda_t'(\theta)/\partial\gamma\\[2pt]
\partial\lambda_t'(\theta)/\partial\alpha
\end{pmatrix}
\mathrm{vecd}\bigl(C'\Sigma_t^{-1}x_tx_t'\Sigma_t^{-1}C-C'\Sigma_t^{-1}C\bigr)
$$

Assuming that $\operatorname{rank}(\Gamma)=N$, we can use the Woodbury formula to prove that
$$
\Sigma_t^{-1}x_tx_t'\Sigma_t^{-1}C\Lambda_t-\Sigma_t^{-1}C\Lambda_t=\Gamma^{-1}E\bigl[(x_t-Cf_t)f_t'\mid X_T,\theta\bigr]
$$
$$
\Sigma_t^{-1}x_tx_t'\Sigma_t^{-1}-\Sigma_t^{-1}=\Gamma^{-1}E\bigl[(x_t-Cf_t)(x_t-Cf_t)'-\Gamma\mid X_T,\theta\bigr]\Gamma^{-1}
$$
$$
C'\Sigma_t^{-1}x_tx_t'\Sigma_t^{-1}C-C'\Sigma_t^{-1}C=\Lambda_t^{-1}E\bigl[f_tf_t'-\Lambda_t\mid X_T,\theta\bigr]\Lambda_t^{-1}
$$
where $E[\cdot\mid X_T,\theta]$ refers to expectations conditional on all the observed $x_t$'s and the parameter values $\theta$. Therefore, we can interpret the score of the log-likelihood function for $x_t$ as the expected value, given $X_T$, of the sum of the (unobservable) scores corresponding to the conditional log-likelihood function of $x_t$ given $f_t$ and the marginal log-likelihood function of $f_t$ (cf. Demos and Sentana (1996b)). Note that these expressions only involve $f_{t|T}=E[f_t\mid X_T,\theta]=f_{t|t}$ and $\Omega_{t|T}=E[f_tf_t'\mid X_T,\theta]=\Omega_{t|t}$.
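The first of these identities can be checked numerically; the sketch below uses toy dimensions and the standard Gaussian updating formulas $f_{t|t}=\Lambda C'\Sigma^{-1}x_t$ and $\Omega_{t|t}=\Lambda-\Lambda C'\Sigma^{-1}C\Lambda$ (static case, one observation, all values illustrative):

```python
import numpy as np

# Verify: Si x x' Si C Lam - Si C Lam = Gamma^{-1} E[(x - C f) f' | x]
rng = np.random.default_rng(2)
N, k = 5, 2
C = rng.normal(size=(N, k))
Gamma = np.diag(rng.uniform(0.5, 2.0, size=N))
Lam = np.diag(rng.uniform(0.5, 1.5, size=k))
Sigma = C @ Lam @ C.T + Gamma
Si = np.linalg.inv(Sigma)
x = rng.normal(size=(N, 1))

fbar = Lam @ C.T @ Si @ x                  # E[f | x]
Omega = Lam - Lam @ C.T @ Si @ C @ Lam     # V[f | x]
Eff = Omega + fbar @ fbar.T                # E[f f' | x]

lhs = Si @ x @ x.T @ Si @ C @ Lam - Si @ C @ Lam
rhs = np.linalg.inv(Gamma) @ (x @ fbar.T - C @ Eff)
assert np.allclose(lhs, rhs)
```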

As a simple yet important example, consider the following ARCH(1)-type conditional variance specification
$$
\lambda_{jj,t}=(1-\alpha_{j1})+\alpha_{j1}\bigl(f_{jt-1|t-1}^2+\omega_{jj,t-1|t-1}\bigr)
$$
so that $\alpha'=(\alpha_{11},\alpha_{21},\ldots,\alpha_{k1})$. If the true parameter configuration corresponds to the case of conditional homoskedasticity, i.e. $\alpha_0=0$, so that $\Lambda_t=I\ \forall t$, then $\partial\lambda_{jj,t}(\theta_0)/\partial c=0$, $\partial\lambda_{jj,t}(\theta_0)/\partial\gamma=0$ and $\partial\lambda_{jj,t}(\theta_0)/\partial\alpha_{j1}=f_{jt-1|t-1}^2+\omega_{jj,t-1|t-1}-1$. Since $\lambda_{jj,t}=1$ under the null, and
$$
E\bigl[f_{jt|t}^2+\omega_{jj,t|t}-1\bigr]=E\bigl[E\bigl(f_{jt}^2-1\mid X_t\bigr)\bigr]=E\bigl[f_{jt}^2-1\bigr]=0,
$$
the orthogonality conditions implicit in the last $k$ elements of the score are simply $\mathrm{cov}\bigl(f_{jt|t}^2,\,f_{jt-1|t-1}^2\bigr)=0$.
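To illustrate, here is a minimal simulation sketch of this recursion for a one-factor model; all numerical values (loadings, idiosyncratic variances, $\alpha$) are illustrative assumptions:

```python
import numpy as np

# lambda_t = (1 - alpha) + alpha * (f_{t-1|t-1}^2 + omega_{t-1|t-1}),
# with f_{t|t} = lambda_t c' Sigma_t^{-1} x_t and
# omega_{t|t} = lambda_t - lambda_t^2 c' Sigma_t^{-1} c.
rng = np.random.default_rng(3)
N, T, alpha = 4, 500, 0.3
c = np.ones(N)                      # assumed factor loadings
gamma = np.full(N, 0.5)             # assumed idiosyncratic variances

f_tt, om_tt = 0.0, 1.0
lams = []
for t in range(T):
    lam = (1.0 - alpha) + alpha * (f_tt ** 2 + om_tt)
    f = np.sqrt(lam) * rng.normal()                       # latent factor draw
    xt = c * f + np.sqrt(gamma) * rng.normal(size=N)      # observed vector
    Sigma = lam * np.outer(c, c) + np.diag(gamma)
    f_tt = lam * (c @ np.linalg.solve(Sigma, xt))         # filtered factor
    om_tt = lam - lam ** 2 * (c @ np.linalg.solve(Sigma, c))  # filtered MSE
    lams.append(lam)

# The intercept (1 - alpha) makes E(lambda_t) = 1 in stationarity, and the
# recursion bounds lambda_t below by 1 - alpha, so variances stay positive.
assert min(lams) >= 1.0 - alpha
assert abs(np.mean(lams) - 1.0) < 0.25
```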

Let $H_t(\theta)=\partial^2l_t(\theta)/\partial\theta\,\partial\theta'$ denote the Hessian matrix of $l_t(\theta)$. Bollerslev and Wooldridge (1992) also prove that
$$
E[H_t(\theta_0)\mid X_{t-1}]=-\frac{1}{2}\,\frac{\partial\,\mathrm{vec}'(\Sigma_t)}{\partial\theta}\,\bigl[\Sigma_t^{-1}\otimes\Sigma_t^{-1}\bigr]\,\frac{\partial\,\mathrm{vec}(\Sigma_t)}{\partial\theta'}
$$
When $\alpha_0=0$,
$$
E\left[\frac{\partial^2l_t(\theta_0)}{\partial\alpha\,\partial c'}\,\middle|\,X_{t-1}\right]=-\frac{\partial\lambda_t'(\theta_0)}{\partial\alpha}\,E_k'\bigl(C'\Sigma^{-1}C\otimes C'\Sigma^{-1}\bigr)
$$
$$
E\left[\frac{\partial^2l_t(\theta_0)}{\partial\alpha\,\partial\gamma'}\,\middle|\,X_{t-1}\right]=-\frac{1}{2}\,\frac{\partial\lambda_t'(\theta_0)}{\partial\alpha}\,\bigl(C'\Sigma^{-1}\odot C'\Sigma^{-1}\bigr)
$$
where we use the fact that the Hadamard (or element-by-element) product of two $m\times n$ matrices $R$ and $S$ can be written as $R\odot S=E_m'(R\otimes S)E_n$ (see Magnus (1988)).
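This Hadamard-product identity is easy to verify numerically for small matrices; in the sketch below the diagonalization matrix $E_p$ is built column by column as $e_j\otimes e_j$:

```python
import numpy as np

# Check R (hadamard) S = E_m' (R kron S) E_n for random m x n matrices.
rng = np.random.default_rng(4)
m, n = 3, 4
R, S = rng.normal(size=(m, n)), rng.normal(size=(m, n))

def diagonalization(p):
    # p^2 x p matrix whose j-th column is e_j (kron) e_j
    I = np.eye(p)
    return np.column_stack([np.kron(I[:, j], I[:, j]) for j in range(p)])

Em, En = diagonalization(m), diagonalization(n)
assert np.allclose(R * S, Em.T @ np.kron(R, S) @ En)
```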

Since $E[\partial\lambda_{jj,t}(\theta_0)/\partial\alpha_{j1}]=E\bigl[f_{jt-1|t-1}^2+\omega_{jj,t-1|t-1}-1\bigr]=0$, it is clear that the information matrix is block-diagonal between the static and the dynamic variance parameters under the null of conditional homoskedasticity.
Finally, it is also worth noting that under conditional homoskedasticity
$$
E\left[\frac{\partial^2l_t(\theta_0)}{\partial c\,\partial c'}\,\middle|\,X_{t-1}\right]=-2\bigl(C'\Sigma^{-1}C\otimes\Sigma^{-1}\bigr)
$$
$$
E\left[\frac{\partial^2l_t(\theta_0)}{\partial\gamma\,\partial c'}\,\middle|\,X_{t-1}\right]=-E_N'\bigl(\Sigma^{-1}C\otimes\Sigma^{-1}\bigr)
$$
$$
E\left[\frac{\partial^2l_t(\theta_0)}{\partial\gamma\,\partial\gamma'}\,\middle|\,X_{t-1}\right]=-\frac{1}{2}\bigl(\Sigma^{-1}\odot\Sigma^{-1}\bigr)
$$

Table 1: One Factor Model
Mean biases and standard deviations for unconditional variance parameters

                         α₀=0.0, β₀=0.0     α₀=0.2, β₀=0.6     α₀=0.4, β₀=0.4
                          ML       2S        ML       2S        ML       2S
γ₀=0.5  c  bias         -.0006   -.0006    -.0006   -.0006    -.0006   -.0006
           std.dev.      .0265    .0265     .0269    .0270     .0277    .0282
        γ  bias         -.0036   -.0036    -.0036   -.0036    -.0035   -.0037
           std.dev.      .0729    .0730     .0720    .0729     .0700    .0729
γ₀=2.0  c  bias         -.0054   -.0051    -.0055   -.0054    -.0055   -.0058
           std.dev.      .0789    .0771     .0795    .0786     .0795    .0818
        γ  bias         -.0297   -.0287    -.0290   -.0292    -.0282   -.0300
           std.dev.      .3093    .3025     .3045    .3034     .2913    .3047

Table 2: One Factor Model
Proportion of estimates at the boundary of the parameter space

                           α₀=0.0, β₀=0.0    α₀=0.2, β₀=0.6    α₀=0.4, β₀=0.4
                            ML      2S        ML      2S        ML      2S
γ₀=0.5   α̂=0, β̂=0         .556    .557      .022    .027      .003    .005
         α̂≠0, β̂=0         .265    .264      .091    .086      .074    .070
γ₀=2.0   α̂=0, β̂=0         .552    .552      .118    .137      .049    .059
         α̂≠0, β̂=0         .286    .282      .198    .167      .218    .185

Table 3: One Factor Model
Mean biases and standard deviations for conditional variance parameters

                                  γ₀=0.5             γ₀=2.0
                                 ML      2S         ML      2S
α  α₀=0.2, β₀=0.6   bias        .007   -.002      -.106   -.103
                    std.dev.    .112    .104       .253    .250
   α₀=0.4, β₀=0.4   bias       -.004   -.030      -.043   -.039
                    std.dev.    .151    .134       .196    .195
β  α₀=0.2, β₀=0.6   bias        .019   -.007      -.183   -.162
                    std.dev.    .172    .149       .302    .299
   α₀=0.4, β₀=0.4   bias       -.015   -.065      -.081   -.058
                    std.dev.    .222    .190       .257    .257

Table 4: Two Factor Model
Mean biases and standard deviations for unconditional variance parameters

c₀ = (0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0)′

                     α₀=0.2, β₀=0.6                            α₀=0.4, β₀=0.4
                 γ₀=0.5               γ₀=2.0               γ₀=0.5               γ₀=2.0
               ML     R      2S     ML     R      2S     ML     R      2S     ML     R      2S
ca1 bias     .0014 -.0013 -.0012  .0215 -.0001 -.0004  .0026 -.0014 -.0014  .0136 -.0003 -.0004
    s.d.     .1349  .0554  .0556  .1912  .1006  .1003  .1018  .0570  .0577  .1603  .1005  .1034
cb1 bias    -.0120 -.0033 -.0034 -.0504 -.0147 -.0149 -.0011 -.0035 -.0037 -.0357 -.0145 -.0159
    s.d.     .0600  .0279  .0282  .1365  .0814  .0829  .0578  .0286  .0295  .1187  .0803  .0858
ca2 bias    -.0178 -.0016 -.0016 -.0549 -.0117 -.0117 -.0069 -.0015 -.0016 -.0347 -.0113 -.0117
    s.d.     .0611  .0269  .0269  .1513  .0810  .0809  .0306  .0271  .0271  .1208  .0808  .0807
cb2 bias     .0033 -.0005 -.0006  .0098 -.0026 -.0020 -.0004 -.0003 -.0005  .0036 -.0018 -.0012
    s.d.     .1285  .0405  .0407  .1934  .1010  .1011  .0835  .0396  .0408  .1556  .0974  .0974
γ   bias    -.0058 -.0057 -.0058 -.0437 -.0428 -.0441 -.0059 -.0058 -.0059 -.0429 -.0417 -.0444
    s.d.     .0724  .0723  .0730  .3141  .3100  .3136  .0712  .0712  .0730  .3048  .3018  .3142

c₀ = (¼, ¼, ¼, 1, 1, 1, 1, 1, 1, ¼, ¼, ¼)′

ca1 bias    -.0121  .0955  .1040 -.0038  .0920  .0987 -.0072  .0955  .1076 -.0045  .0841  .1007
    s.d.     .1296  .0424  .0417  .1820  .0815  .0809  .0939  .0473  .0455  .1526  .0841  .0838
cb1 bias    -.0158 -.0373 -.0394 -.0457 -.0437 -.0468 -.0079 -.0360 -.0415 -.0305 -.0406 -.0486
    s.d.     .0622  .0296  .0297  .1305  .0787  .0785  .0439  .0315  .0317  .1057  .0778  .0813
ca2 bias    -.0151  .0151  .0150 -.0562 -.0021 -.0035 -.0048  .0149  .0150 -.0337 -.0009 -.0038
    s.d.     .0611  .0309  .0312  .1655  .0997  .1012  .0324  .0306  .0313  .1326  .0977  .1018
cb2 bias    -.0142 -.1339 -.1418 -.0164 -.1433 -.1561 -.0075 -.1210 -.1417 -.0210 -.1260 -.1566
    s.d.     .1293  .0480  .0485  .1916  .1344  .1399  .0796  .0473  .0488  .1564  .1283  .1413
γ   bias    -.0060 -.0063 -.0060 -.0517 -.0519 -.0524 -.0060 -.0067 -.0061 -.0505 -.0525 -.0537
    s.d.     .0725  .0726  .0732  .3269  .3267  .3210  .0711  .0715  .0732  .3117  .3266  .3114

Table 5: Two Factor Model
Proportion of estimates at the boundary of the parameter space

c₀ = (0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0)′

                              γ₀=0.5                        γ₀=2.0
                      α̂=β̂=0        α̂≠0, β̂=0       α̂=β̂=0        α̂≠0, β̂=0
                    ML   R    2S    ML   R    2S    ML   R    2S    ML   R    2S
α₀=0.2, β₀=0.6    .034 .034 .038  .114 .088 .084  .146 .145 .153  .226 .190 .166
α₀=0.4, β₀=0.4    .004 .004 .004  .097 .077 .072  .064 .064 .072  .260 .222 .188

c₀ = (¼, ¼, ¼, 1, 1, 1, 1, 1, 1, ¼, ¼, ¼)′

α₀=0.2, β₀=0.6    .034 .048 .054  .095 .124 .117  .156 .167 .195  .201 .225 .183
α₀=0.4, β₀=0.4    .004 .011 .015  .095 .099 .093  .062 .095 .109  .297 .227 .186

Table 6: Two Factor Model
Mean biases and standard deviations for conditional variance parameters

c₀ = (0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0)′

                                     γ₀=0.5                     γ₀=2.0
                                ML      R      2S          ML      R      2S
α  α₀=0.2, β₀=0.6   bias       .025    .007   -.003       -.128   -.109   -.106
                    std.dev.   .115    .112    .103        .259    .248    .246
   α₀=0.4, β₀=0.4   bias       .010   -.003   -.032       -.057   -.047   -.041
                    std.dev.   .150    .150    .131        .197    .193    .193
β  α₀=0.2, β₀=0.6   bias       .062    .025   -.014       -.219   -.192   -.166
                    std.dev.   .192    .181    .146        .305    .301    .301
   α₀=0.4, β₀=0.4   bias       .017   -.012   -.081       -.108   -.089   -.055
                    std.dev.   .224    .226    .186        .258    .256    .258

c₀ = (¼, ¼, ¼, 1, 1, 1, 1, 1, 1, ¼, ¼, ¼)′

α  α₀=0.2, β₀=0.6   bias       .027   -.021   -.030       -.116   -.120   -.115
                    std.dev.   .120    .108    .100        .256    .273    .269
   α₀=0.4, β₀=0.4   bias       .012   -.061   -.088       -.055   -.027   -.019
                    std.dev.   .156    .150    .133        .202    .214    .215
β  α₀=0.2, β₀=0.6   bias       .064   -.013   -.047       -.201   -.207   -.175
                    std.dev.   .208    .164    .134        .306    .312    .311
   α₀=0.4, β₀=0.4   bias       .017   -.079   -.144       -.095   -.074   -.036
                    std.dev.   .234    .217    .176        .266    .271    .275

Figure 1: Test for ARCH in common factor
P-value discrepancy plots
[Figure: two panels (γ₀=0.5 and γ₀=2) plotting p-value discrepancies against nominal size (0-15%) for the OP and Hessian versions of the 1-sided and 2-sided tests.]

Figure 2: Test for ARCH in common factor
Size-Power curves
[Figure: two panels (α₀=.2, β₀=.6 and α₀=.4, β₀=.4) plotting rejection rates (0-100%) against nominal size (0-15%) for γ₀=.5 and γ₀=2, in 1-sided and 2-sided versions of the test.]

Figure 3: Likelihood Ratio Test for overidentifying restriction
P-value discrepancy plots
[Figure: p-value discrepancies against nominal size (0-15%) for four designs: (α₀=.2, β₀=.6, γ₀=2), (α₀=.2, β₀=.6, γ₀=.5), (α₀=.2, β₀=.4, γ₀=4) and (α₀=.4, β₀=.4, γ₀=.5).]

Figure 4: Likelihood Ratio Test for overidentifying restriction
Size-Power curves
[Figure: rejection rates (0-100%) against nominal size (0-15%) for the same four designs as in Figure 3: (α₀=.2, β₀=.6, γ₀=2), (α₀=.2, β₀=.6, γ₀=.5), (α₀=.2, β₀=.4, γ₀=4) and (α₀=.4, β₀=.4, γ₀=.5).]
