
Cointegration.

Overview and Development

Søren Johansen

Department of Applied Mathematics and Statistics, University of Copenhagen


sjo@math.ku.dk

Summary. This article presents a survey of the analysis of cointegration using the
vector autoregressive model. After a few illustrative economic examples, the three
model based approaches to the analysis of cointegration are discussed. The vector
autoregressive model is defined and the moving average representation of the solu-
tion, the Granger representation, is given. Next the interpretation of the model and
its parameters and likelihood based inference follows using reduced rank regression.
The asymptotic analysis includes the distribution of the Gaussian maximum likeli-
hood estimators, the rank test, and test for hypotheses on the cointegrating vectors.
Finally, some applications and extensions of the basic model are mentioned and the
survey concludes with some open problems.

1 Introduction
Granger [13] coined the term cointegration as a formulation of the phenom-
enon that nonstationary processes can have linear combinations that are sta-
tionary. It was his investigations of the relation between cointegration and
error correction that brought modeling of vector autoregressions with unit
roots and cointegration to the center of attention in applied and theoretical
econometrics; see Engle and Granger [10].
During the last 20 years, many have contributed to the development of
theory and applications of cointegration. The account given here focuses on
theory, more precisely on likelihood based theory for the vector autoregressive
model and its extensions; see [21]. By building a statistical model as a frame-
work for inference, one has to make explicit assumptions about the model
used and hence has a possibility of checking the assumptions made.

1.1 Two examples of cointegration

As a first simple economic example of the main idea in cointegration, consider
the exchange rate series, $e_t$, between Australian and US dollars and four time
series $p_t^{au}, p_t^{us}, i_t^{au}, i_t^{us}$: log consumer price and five year
treasury bond rates in Australia and US. If the quarterly series from 1972:1 to
1991:1 are plotted, they clearly show nonstationary behavior, and we discuss in
the following a method of modeling such nonstationary time series. As a simple
example of an economic hypothesis consider Purchasing Power Parity (PPP), which
asserts that $e_t = p_t^{us} - p_t^{au}$. This identity is not found in the data,
so a more realistic formulation is that $ppp_t = e_t - p_t^{us} + p_t^{au}$ is a
stationary process, possibly with mean zero. Thus we formulate the economic
relation, as expressed by PPP, as a stationary relation among nonstationary
processes. The purpose of modeling could be to test the null hypothesis that
$ppp_t$ is stationary, or in other words that
$(e_t, p_t^{us}, p_t^{au}, i_t^{au}, i_t^{us})'$ cointegrate with
$(1, -1, 1, 0, 0)'$ as a cointegration vector. If that is not found, an outcome
could be to suggest other cointegration relations, which have a better chance of
capturing co-movements of the five processes in the information set. For a
discussion of the finding that real exchange rate, $ppp_t$, and the spread,
$i_t^{au} - i_t^{us}$, are cointegrated I(1) processes so that a linear
combination $ppp_t - c\,(i_t^{au} - i_t^{us})$ is stationary, see Juselius and
MacDonald [31].
Another example is one of the first applications of the idea of cointegration
in finance; see Campbell and Shiller [8]. They considered a present value model
for the price of a stock $Y_t$ at the end of period $t$ and the dividend $y_t$
paid during period $t$. They assume that there is a vector autoregressive model
describing the data which contain $Y_t$ and $y_t$ and may contain values of other
financial assets. The expectations hypothesis is expressed as

$$Y_t = \theta(1-\delta)\sum_{i=0}^{\infty} \delta^{i} E_t y_{t+i} + c,$$

where $c$ and $\theta$ are positive constants and the discount factor $\delta$ is
between 0 and 1. The notation $E_t y_{t+i}$ means model based conditional
expectations of $y_{t+i}$ given information in the data at the end of period $t$.
By subtracting $\theta y_t$, the model is written as

$$Y_t - \theta y_t = \theta(1-\delta)\sum_{i=0}^{\infty} \delta^{i} E_t (y_{t+i} - y_t) + c.$$

It is seen that when the processes $y_t$ and $Y_t$ are nonstationary and their
differences stationary, the present value model implies that the right hand
side and hence the left hand side are stationary. Thus there is cointegration
between $Y_t$ and $y_t$ with a cointegration vector
$\beta' = (1, -\theta, 0, \ldots, 0)$; see section 6.1 for a discussion of
rational expectations and cointegration.
There are at present three different ways of modeling the cointegration
idea in a parametric statistical framework. To illustrate the ideas they are
formulated in the simplest possible case, leaving out deterministic terms.

1.2 Three ways of modeling cointegration

The regression formulation

The multivariate process $x_t = (x_{1t}', x_{2t}')'$ of dimension $p = p_1 + p_2$
is given by the regression equations

$$x_{1t} = \gamma' x_{2t} + u_{1t},$$

$$\Delta x_{2t} = u_{2t},$$

where $u_t = (u_{1t}', u_{2t}')'$ is a linear invertible process defined by i.i.d.
errors $\varepsilon_t$ with mean zero and finite variance. The assumptions behind
this model imply that $x_{2t}$ is nonstationary and not cointegrated, and hence
the cointegration rank, $p_1$, is known so that models for different ranks are
not nested. The first estimation method used in this model is least squares
regression, Engle and Granger [10], which is shown to give a superconsistent
estimator by Stock [46]. This estimation method gives rise to residual based
tests for cointegration. It was shown by Phillips and Hansen [42] that a
modification of the regression estimator, involving a correction using the
long-run variance of the process $u_t$, would give useful methods for inference
for coefficients of cointegration relations; see also Phillips [41].
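The least squares step of the regression formulation can be sketched on
simulated data. The following is a minimal illustration, not the procedure of
any of the cited papers: the scalar coefficient `gamma`, the sample size and
the Gaussian errors are assumptions chosen for the example, with $p_1 = p_2 = 1$.

```python
import numpy as np

# Hypothetical setup: p1 = p2 = 1, true coefficient gamma = 2 (assumed values).
rng = np.random.default_rng(0)
T, gamma = 2000, 2.0

u1 = rng.standard_normal(T)          # stationary error in the x1 equation
u2 = rng.standard_normal(T)          # increments of x2
x2 = np.cumsum(u2)                   # Delta x2_t = u2_t, so x2 is a random walk
x1 = gamma * x2 + u1                 # x1_t = gamma * x2_t + u1_t

# Least squares regression of x1 on x2 (no intercept, as in the equations above).
gamma_hat = (x2 @ x1) / (x2 @ x2)
residuals = x1 - gamma_hat * x2      # basis for residual based tests

print(f"gamma_hat = {gamma_hat:.4f}")
```

Because the regressor is a random walk, the estimation error is of order
$T^{-1}$ rather than $T^{-1/2}$, which is what superconsistency refers to.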

The autoregressive formulation

The autoregressive formulation is given by

$$\Delta x_t = \alpha\beta' x_{t-1} + \varepsilon_t,$$

where $\varepsilon_t$ are i.i.d. errors with mean zero and finite variance, and
$\alpha$ and $\beta$ are $p \times r$ matrices of rank $r$. Under the condition
that $\Delta x_t$ is stationary, the solution is

$$x_t = C\sum_{i=1}^{t}\varepsilon_i + \sum_{i=0}^{\infty} C_i^{*}\varepsilon_{t-i} + A, \tag{1}$$

where $C = \beta_\perp(\alpha_\perp'\beta_\perp)^{-1}\alpha_\perp'$ and
$\beta' A = 0$. Here $\beta_\perp$ is a full rank $p \times (p-r)$ matrix so that
$\beta'\beta_\perp = 0$. This formulation allows for modeling of both the long-run
relations, $\beta' x$, and the adjustment, or feedback $\alpha$, towards the
attractor set $\{x : \beta' x = 0\}$ defined by the long-run relations. Models for
different cointegration ranks are nested and the rank can be analyzed by
likelihood ratio tests. Thus the model allows for a more detailed description
of the data than the regression model. Methods usually applied for the analysis
are derived from the Gaussian likelihood function, which are discussed here;
see also [18, 21], and Ahn and Reinsel [1].

The unobserved component formulation

Let $x_t$ be given by

$$x_t = \xi\eta'\sum_{i=1}^{t}\varepsilon_i + v_t,$$

where $v_t$ is a linear process, typically independent of the process
$\varepsilon_t$, which is i.i.d. with mean zero and finite variance.
In this formulation too, hypotheses of different ranks are nested. The
parameters are linked to the autoregressive formulation by $\xi = \beta_\perp$
and $\eta = \alpha_\perp$, even though the linear process in (1) depends on the
random walk part, so the unobserved components model and the autoregressive
model are not the same. Thus both adjustment and cointegration can be discussed
in this formulation, and hypotheses on the rank can be tested. Rather than
testing for unit roots one tests for stationarity, which is sometimes a more
natural formulation. Estimation is usually performed by the Kalman filter, and
asymptotic theory of the rank tests has been worked out by Nyblom and Harvey [37].

1.3 The model analyzed in this article


In this article cointegration is modelled by the vector autoregressive model
for the p-dimensional process $x_t$

$$\Delta x_t = \alpha(\beta' x_{t-1} + \Upsilon D_t) + \sum_{i=1}^{k-1}\Gamma_i \Delta x_{t-i} + \Phi d_t + \varepsilon_t, \tag{2}$$

where $\varepsilon_t$ are i.i.d. with mean zero and variance $\Omega$, and $D_t$
and $d_t$ are deterministic terms, like constant, trend, seasonal- or
intervention dummies. The matrices $\alpha$ and $\beta$ are $p \times r$ where
$0 \le r \le p$. The parametrization of the deterministic term
$\alpha\Upsilon D_t + \Phi d_t$ is discussed in section 2.2. Under suitable
conditions, see again section 2.2, the processes $\beta' x_t$ and $\Delta x_t$
are stationary around their means, and (2) can be formulated as

$$\Delta x_t - E(\Delta x_t) = \alpha(\beta' x_{t-1} - E(\beta' x_{t-1})) + \sum_{i=1}^{k-1}\Gamma_i(\Delta x_{t-i} - E(\Delta x_{t-i})) + \varepsilon_t.$$

This shows how the change of the process reacts to feedback from disequilibrium
errors $\beta' x_{t-1} - E(\beta' x_{t-1})$ and
$\Delta x_{t-i} - E(\Delta x_{t-i})$, via the short-run adjustment coefficients
$\alpha$ and $\Gamma_i$. The equation $\beta' x_t - E(\beta' x_t) = 0$ defines the
long-run relations between the processes.
There are many surveys of the theory of cointegration; see for instance
Watson [48] or Johansen [26]. The topic has become part of most textbooks
in econometrics; see among others Banerjee, Dolado, Galbraith and Hendry
[4], Hamilton [14], Hendry [17] and Lütkepohl [34]. For a general account of
the methodology of the cointegrated vector autoregressive model, see Juselius
[32].

2 Integration, cointegration and Granger's Representation Theorem
The basic definitions of integration and cointegration are given together with
a moving average representation of the solution of the error correction model
(2). This solution reveals the stochastic properties of the solution. Finally the
interpretation of cointegration relations is discussed.

2.1 Definition of integration and cointegration

The vector autoregressive model for the p-dimensional process $x_t$ given by (2)
is a dynamic stochastic model for all components of $x_t$. By recursive
substitution, the equations define $x_t$ as function of initial values,
$x_0, \ldots, x_{-k+1}$, errors $\varepsilon_1, \ldots, \varepsilon_t$,
deterministic terms, and parameters. Properties of the solution of these
equations are studied through the characteristic polynomial

$$\Psi(z) = (1-z)I_p - \Pi z - (1-z)\sum_{i=1}^{k-1}\Gamma_i z^{i}, \qquad \Pi = \alpha\beta', \tag{3}$$

with determinant $|\Psi(z)|$. The function $C(z) = \Psi(z)^{-1}$ has poles at the
roots of the polynomial $|\Psi(z)|$ and the position of the poles determines the
stochastic properties of the solution of (2). First a well known result is
mentioned; see Anderson [3].

Theorem 1. If $|\Psi(z)| = 0$ implies that $|z| > 1$, then $\alpha$ and $\beta$
have full rank $p$, and the coefficients of
$\Psi^{-1}(z) = \sum_{i=0}^{\infty} C_i z^{i}$ are exponentially decreasing. Let
$\mu_t = \sum_{i=0}^{\infty} C_i(\alpha\Upsilon D_{t-i} + \Phi d_{t-i})$. Then the
distribution of the initial values of $x_t$ can be chosen so that $x_t - \mu_t$
is stationary. Moreover, $x_t$ has the moving average representation

$$x_t = \sum_{i=0}^{\infty} C_i\varepsilon_{t-i} + \mu_t. \tag{4}$$

Thus the exponentially decreasing coefficients are found by simply inverting
the characteristic polynomial if the roots are outside the unit disk. If
this condition fails, the equations generate nonstationary processes of various
types, and the coefficients are not exponentially decreasing. Still, the
coefficients of $C(z)$ determine the stochastic properties of the solution of
(2), as is discussed in section 2.2. A process of the form (4) is a linear
process and forms the basis for the definitions of integration and cointegration.
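For a model with one lag ($k = 1$) the characteristic polynomial in (3) reduces
to $I_p - (I_p + \Pi)z$, so the root condition of Theorem 1 and the moving
average coefficients can be checked directly. The sketch below uses a
hypothetical full rank matrix `Pi`; the numbers are assumptions made only for
illustration.

```python
import numpy as np

# Hypothetical full-rank Pi for a stationary VAR with one lag (assumed values).
Pi = np.array([[-0.5, 0.1],
               [0.2, -0.4]])
A = np.eye(2) + Pi                       # x_t = A x_{t-1} + eps_t

# |I - A z| = 0 has the roots z = 1 / lambda for the eigenvalues lambda of A.
eigvals = np.linalg.eigvals(A)
roots = 1.0 / eigvals
print("roots of the characteristic polynomial:", np.sort(np.abs(roots)))

# Coefficients of the inverse polynomial: C_i = A^i, exponentially decreasing.
C_coefs = [np.linalg.matrix_power(A, i) for i in range(200)]

# Numerical check that the truncated power series inverts the polynomial at z = 0.9.
z = 0.9
Psi_z = (1 - z) * np.eye(2) - Pi * z
series = sum(Ci * z**i for i, Ci in enumerate(C_coefs))
print(np.round(Psi_z @ series, 6))
```

Here both roots lie outside the unit disk, so the process has the moving
average representation (4) with coefficients that decay geometrically.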

Definition 1. The process $x_t$ is integrated of order 1, I(1), if
$\Delta x_t - E(\Delta x_t)$ is a linear process, with
$C(1) = \sum_{i=0}^{\infty} C_i \neq 0$. If there is a vector $\beta \neq 0$ so
that $\beta' x_t$ is stationary around its mean, then $x_t$ is cointegrated with
cointegration vector $\beta$. The number of linearly independent cointegration
vectors is the cointegration rank.

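Example 1. A bivariate process is given for $t = 1, \ldots, T$ by the equations

$$\Delta x_{1t} = \alpha_1(x_{1t-1} - x_{2t-1}) + \varepsilon_{1t},$$

$$\Delta x_{2t} = \alpha_2(x_{1t-1} - x_{2t-1}) + \varepsilon_{2t}.$$

Subtracting the equations, we find that the process $y_t = x_{1t} - x_{2t}$ is
autoregressive and stationary if $|1 + \alpha_1 - \alpha_2| < 1$ and the initial
value is given by its invariant distribution. Similarly we find that
$S_t = \alpha_2 x_{1t} - \alpha_1 x_{2t}$ is a random walk, so that

$$\begin{pmatrix} x_{1t} \\ x_{2t} \end{pmatrix}
= \frac{1}{\alpha_2 - \alpha_1}\begin{pmatrix} 1 \\ 1 \end{pmatrix} S_t
- \frac{1}{\alpha_2 - \alpha_1}\begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} y_t.$$

This shows that when $|1 + \alpha_1 - \alpha_2| < 1$, $x_t$ is I(1),
$x_{1t} - x_{2t}$ is stationary, and $\alpha_2 x_{1t} - \alpha_1 x_{2t}$ is a
random walk, so that $x_t$ is a cointegrated I(1) process with cointegration
vector $\beta' = (1, -1)$. We call $S_t$ a common stochastic trend and $\alpha$
the adjustment coefficients.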
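The decomposition in Example 1 can be verified numerically. The sketch below
simulates the bivariate process with assumed adjustment coefficients
$\alpha_1 = -0.25$ and $\alpha_2 = 0.25$ (so that $1 + \alpha_1 - \alpha_2 = 0.5$),
Gaussian errors and zero initial values; all of these are illustrative choices.

```python
import numpy as np

# Assumed parameter values for illustration: 1 + a1 - a2 = 0.5.
a1, a2, T = -0.25, 0.25, 500
rng = np.random.default_rng(1)
eps = rng.standard_normal((T + 1, 2))

x = np.zeros((T + 1, 2))                      # x_0 = 0
for t in range(1, T + 1):
    y_lag = x[t - 1, 0] - x[t - 1, 1]         # disequilibrium error x1 - x2
    x[t] = x[t - 1] + np.array([a1, a2]) * y_lag + eps[t]

y = x[:, 0] - x[:, 1]                         # stationary relation beta'x
S = a2 * x[:, 0] - a1 * x[:, 1]               # common stochastic trend
rho = 1 + a1 - a2                             # autoregressive coefficient of y

# Rebuild x from the random walk S and the stationary process y.
x_rebuilt = (np.outer(S, [1.0, 1.0]) - np.outer(y, [a1, a2])) / (a2 - a1)

print("max reconstruction error:", np.abs(x_rebuilt - x).max())
print("sample variance of y:", y.var(), " of S:", S.var())
```

The increments of `S` are $\alpha_2\varepsilon_{1t} - \alpha_1\varepsilon_{2t}$,
which contain no feedback from the level of the process, while `y` follows a
first order autoregression with coefficient `rho`.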

Example 1 presents a special case of the Granger Representation Theorem,


which gives the moving average representation of the solution of the error
correction model.

2.2 The Granger Representation Theorem

If the characteristic polynomial $\Psi(z)$ defined in (3) has a unit root, then
$\Psi(1) = -\Pi = -\alpha\beta'$ is singular, of rank $r < p$, and the process is
not stationary. Next the I(1) condition is formulated. It guarantees that the
solution of (2) is a cointegrated I(1) process. Let
$\Gamma = I_p - \sum_{i=1}^{k-1}\Gamma_i$ and denote for a $p \times m$ matrix $a$
by $a_\perp$ a $p \times (p-m)$ matrix of rank $p - m$ with $a' a_\perp = 0$.

Condition 1. (The I(1) condition) The I(1) condition is satisfied if
$|\Psi(z)| = 0$ implies that $|z| > 1$ or $z = 1$ and that

$$|\alpha_\perp'\Gamma\beta_\perp| \neq 0. \tag{5}$$

Condition (5) is needed to avoid solutions that are integrated of order 2
or higher; see section 6. For a process with one lag $\Gamma = I_p$ and

$$\beta' x_t = (I_r + \beta'\alpha)\beta' x_{t-1} + \beta'\varepsilon_t.$$

In this case the I(1) condition is equivalent to the condition that the absolute
values of the eigenvalues of $I_r + \beta'\alpha$ are less than one, and in
example 1 the condition is $|1 + \alpha_1 - \alpha_2| < 1$.

Theorem 2. (The Granger Representation Theorem) Let $\Psi(z)$ be defined by
(3). If $\Psi(z)$ has unit roots and the I(1) condition is satisfied, then

$$(1-z)\Psi(z)^{-1} = C(z) = \sum_{i=0}^{\infty} C_i z^{i} = C + (1-z)C^{*}(z) \tag{6}$$

converges for $|z| \le 1 + \delta$ for some $\delta > 0$ and

$$C = \beta_\perp(\alpha_\perp'\Gamma\beta_\perp)^{-1}\alpha_\perp'. \tag{7}$$

The solution $x_t$ of equation (2) has the moving average representation

$$x_t = C\sum_{i=1}^{t}(\varepsilon_i + \Phi d_i) + \sum_{i=0}^{\infty} C_i^{*}(\varepsilon_{t-i} + \Phi d_{t-i} + \alpha\Upsilon D_{t-i}) + A, \tag{8}$$

where $A$ depends on initial values, so that $\beta' A = 0$.

This result implies that $\Delta x_t$ and $\beta' x_t$ are stationary, so that
$x_t$ is a cointegrated I(1) process with $r$ cointegration vectors $\beta$ and
$p - r$ common stochastic trends $\alpha_\perp'\sum_{i=1}^{t}\varepsilon_i$. The
interpretation of this is that among $p$ nonstationary processes the model (2)
generates $r$ stationary or stable relations and $p - r$ stochastic trends or
driving trends, which create the nonstationarity.
The result (6) rests on the observation that the singularity of $\Psi(z)$ for
$z = 1$ implies that $\Psi(z)^{-1}$ has a pole at $z = 1$. Condition (5) is a
condition for this pole to be of order one. This is not proved here, see [27],
but it is shown how this result can be applied to prove the representation
result (8), which shows how coefficients of the inverse polynomial determine the
properties of $x_t$.
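We multiply $\Psi(L)x_t = \Phi d_t + \alpha\Upsilon D_t + \varepsilon_t$ by

$$(1-L)\Psi(L)^{-1} = C(L) = C + (1-L)C^{*}(L)$$

and find

$$\Delta x_t = (1-L)\Psi(L)^{-1}\Psi(L)x_t = (C + \Delta C^{*}(L))(\varepsilon_t + \alpha\Upsilon D_t + \Phi d_t).$$

Now define the stationary process $z_t = C^{*}(L)\varepsilon_t$ and the
deterministic function $\mu_t = C^{*}(L)(\alpha\Upsilon D_t + \Phi d_t)$, and note
that $C\alpha = 0$, so that

$$\Delta x_t = C(\varepsilon_t + \Phi d_t) + \Delta(z_t + \mu_t),$$

which cumulates to

$$x_t = C\sum_{i=1}^{t}(\varepsilon_i + \Phi d_i) + z_t + \mu_t + A,$$

where $A = x_0 - z_0 - \mu_0$. The distribution of $x_0$ is chosen so that
$\beta' x_0 = \beta'(z_0 + \mu_0)$, and hence $\beta' A = 0$. Then $x_t$ is I(1)
and $\beta' x_t = \beta' z_t + \beta'\mu_t$ is stationary around its mean
$E(\beta' x_t) = \beta'\mu_t$. Finally, $\Delta x_t$ is stationary around its
mean $E(\Delta x_t) = C\Phi d_t + \Delta\mu_t$.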
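The matrix $C$ in (7) and the identities used in the proof can be checked
numerically. The sketch below is only an illustration: the helper `perp` (an
orthogonal complement computed from the singular value decomposition) and the
parameter values for $p = 3$, $r = 1$, $k = 2$ are assumptions, and the code
checks condition (5) and the algebraic identities, not the location of the
remaining characteristic roots.

```python
import numpy as np

def perp(a):
    """Hypothetical helper: p x (p-m) basis a_perp with a' a_perp = 0."""
    u, _, _ = np.linalg.svd(a, full_matrices=True)
    return u[:, a.shape[1]:]

# Assumed parameter values for p = 3, r = 1, k = 2.
alpha = np.array([[-0.5], [0.2], [0.1]])
beta = np.array([[1.0], [-1.0], [0.5]])
Gamma1 = 0.2 * np.eye(3)
Gamma = np.eye(3) - Gamma1               # Gamma = I_p - sum_i Gamma_i

a_perp, b_perp = perp(alpha), perp(beta)
M = a_perp.T @ Gamma @ b_perp            # matrix appearing in condition (5)
print("det(alpha_perp' Gamma beta_perp) =", np.linalg.det(M))

# C = beta_perp (alpha_perp' Gamma beta_perp)^{-1} alpha_perp', see (7).
C = b_perp @ np.linalg.inv(M) @ a_perp.T
print(np.round(C, 3))
```

With these values $\beta' C = 0$ and $C\alpha = 0$, so the cointegration
relation removes the random walk part and the term $\alpha\Upsilon D_t$ does not
cumulate. The matrix $C$ has rank $p - r = 2$, the number of common trends.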
One of the useful applications of the representation (8) is to investigate
the role of the deterministic terms. Note that $\Phi d_t$ cumulates in the
process with a coefficient $C\Phi$, but that $\alpha\Upsilon D_t$ does not,
because $C\alpha = 0$. A leading special case is the model with $D_t = t$, and
$d_t = 1$, which ensures that any linear combination of the components of $x_t$
is allowed to have a linear trend. Note that if $D_t = t$ is not allowed in the
model, that is $\Upsilon = 0$, then $x_t$ has a trend given by $C\Phi t$, but the
cointegration relation $\beta' x_t$ has no trend because $\beta' C\Phi = 0$.

2.3 Interpretation of cointegrating coefficients

Consider first a usual regression

$$x_{1t} = \gamma_2 x_{2t} + \gamma_3 x_{3t} + \varepsilon_t, \tag{9}$$

with i.i.d. errors $\varepsilon_t$ which are independent of the processes
$x_{2t}$ and $x_{3t}$. The coefficient $\gamma_2$ is interpreted via a
counterfactual experiment, that is, the coefficient $\gamma_2$ is the effect on
$x_{1t}$ of a change in $x_{2t}$, keeping $x_{3t}$ constant.
The cointegration relations are long-run relations. This means that they
have been there all the time, and they influence the movement of the process
$x_t$ via the adjustment coefficients $\alpha$. The more the process $\beta' x_t$
deviates from $E\beta' x_t$, the more the adjustment coefficients pull the
process back towards its mean. Another interpretation is that they are relations
that would hold in the limit, provided all shocks in the model are set to zero
after a time $t$.
It is therefore natural that interpretation of cointegration coefficients
involves the notion of a long-run value. From the representation (8) in the
Granger Representation Theorem, applied to the model with no deterministic
terms, it can be proved, see [25], that

$$x_{\infty|t} = \lim_{h\to\infty} E(x_{t+h} \mid x_t, \ldots, x_{t-k+1})
= C\Big(x_t - \sum_{i=1}^{k-1}\Gamma_i x_{t-i}\Big)
= C\sum_{i=1}^{t}\varepsilon_i + x_{\infty|0}.$$

This limiting conditional expectation is a long-run value of the process.
Because $\beta' x_{\infty|t} = 0$, the point $x_{\infty|t}$ is in the attractor
set $\{x : \beta' x = 0\} = \mathrm{sp}\{\beta_\perp\}$, see Figure 1.
Thus if the current value, $x_t$, is shifted to $x_t + h$, then the long-run
value is shifted from $x_{\infty|t}$ to $x_{\infty|t} + Ch$, which is still a
point in the attractor set because $\beta' x_{\infty|t} + \beta' Ch = 0$. If a
given long-run change $k = C\xi$ in $x_{\infty|t}$ is needed, $\Gamma k$ is added
to the current value $x_t$. This gives the long-run value

$$x_{\infty|t} + C\Gamma k = x_{\infty|t} + C\Gamma C\xi = x_{\infty|t} + C\xi = x_{\infty|t} + k,$$

where the identity $C\Gamma C = C$ is applied; see (7). This idea is now used to
give an interpretation of a cointegration coefficient in the simple case of
$r = 1$, $p = 3$, and where the relation is normalized on $x_1$

$$x_1 = \gamma_2 x_2 + \gamma_3 x_3, \tag{10}$$

so that $\beta' = (1, -\gamma_2, -\gamma_3)$. In order to give the usual
interpretation as a regression coefficient (or elasticity if the measurements
are in logs), a long-run change with the properties that $x_2$ changes by one,
$x_1$ changes by $\gamma_2$, and $x_3$ is kept fixed, is needed. Thus the
long-run change is $k = (\gamma_2, 1, 0)'$, which satisfies $\beta' k = 0$, so
that $k = C\xi$ for some $\xi$, and this can be achieved by moving the current
value from $x_t$ to $x_t + \Gamma k$. In this sense, a coefficient in an
identified cointegration relation can be interpreted as the effect of a long-run
change to one variable on another, keeping all others fixed in the long
run. More details can be found in [25] and Proietti [43].

Fig. 1. (Plot with axes $x_{1t}$ and $x_{2t}$.) In the model
$\Delta x_t = \alpha\beta' x_{t-1} + \varepsilon_t$, the point
$x_t = (x_{1t}, x_{2t})$ is moved towards the long-run value $x_{\infty|t}$ on the
attractor set $\{x \mid \beta' x = 0\} = \mathrm{sp}(\beta_\perp)$ by the forces
$-\alpha$ or $+\alpha$, and pushed along the attractor set by the common trends
$\alpha_\perp'\sum_{i=1}^{t}\varepsilon_i$.
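The long-run change argument for the case $r = 1$, $p = 3$ can be illustrated
with numbers. In the sketch below the coefficients $\gamma_2 = 0.5$,
$\gamma_3 = 2$, the adjustment vector and the diagonal $\Gamma$ are hypothetical
values, and `perp` is an assumed helper based on the singular value
decomposition.

```python
import numpy as np

def perp(a):
    """Hypothetical helper: basis of the orthogonal complement of span(a)."""
    u, _, _ = np.linalg.svd(a, full_matrices=True)
    return u[:, a.shape[1]:]

gamma2, gamma3 = 0.5, 2.0                          # assumed long-run coefficients
beta = np.array([[1.0], [-gamma2], [-gamma3]])     # beta' = (1, -gamma2, -gamma3)
alpha = np.array([[-0.4], [0.1], [0.05]])          # assumed adjustment coefficients
Gamma = np.diag([0.8, 0.5, 1.0])                   # assumed Gamma = I - Gamma_1

a_perp, b_perp = perp(alpha), perp(beta)
C = b_perp @ np.linalg.inv(a_perp.T @ Gamma @ b_perp) @ a_perp.T

# Desired long-run change: x2 up by one, x1 up by gamma2, x3 fixed.
k = np.array([gamma2, 1.0, 0.0])

shift_if_Gamma_k_added = C @ Gamma @ k   # long-run shift when Gamma k is added to x_t
shift_if_k_added = C @ k                 # long-run shift when k itself is added

print("beta'k          =", (beta.T @ k).item())
print("C Gamma k       =", np.round(shift_if_Gamma_k_added, 6))
print("C k (for contrast) =", np.round(shift_if_k_added, 6))
```

Adding $\Gamma k$ to the current value moves the long-run value by exactly $k$,
whereas adding $k$ itself generally does not, because the short-run dynamics
redistribute the shock before the process settles on the attractor set.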

3 Interpretation of the I(1) model for cointegration


In this section model H(r) defined by (2) is discussed. The parameters in H(r)
are
$$(\alpha, \beta, \Gamma_1, \ldots, \Gamma_{k-1}, \Upsilon, \Phi, \Omega).$$
All parameters vary freely and $\alpha$ and $\beta$ are $p \times r$ matrices.
The normalization and identification of $\alpha$ and $\beta$ are discussed, and
some examples of hypotheses on $\alpha$ and $\beta$, which are of economic
interest, are given.

3.1 The models H(r)

The models H(r) are nested

$$H(0) \subset \cdots \subset H(r) \subset \cdots \subset H(p).$$

Here H(p) is the unrestricted vector autoregressive model, so that $\alpha$ and
$\beta$ are unrestricted $p \times p$ matrices. The model H(0) corresponds to the
restriction $\alpha = \beta = 0$, which is the vector autoregressive model for
the process in differences. Note that in order to have nested models, we allow
in H(r) for all processes with rank less than or equal to $r$.
The formulation allows us to derive likelihood ratio tests for the hypothesis
H(r) in the unrestricted model H(p). These tests can be applied to check if
one's prior knowledge of the number of cointegration relations is consistent
with the data, or alternatively to construct an estimator of the cointegration
rank.
Note that when the cointegration rank is $r$, the number of common trends
is $p - r$. Thus if one can interpret the presence of $r$ cointegration relations
one should also interpret the presence of $p - r$ independent stochastic trends
or $p - r$ driving forces in the data.

3.2 Normalization of parameters of the I(1) model


The parameters $\alpha$ and $\beta$ in (2) are not uniquely identified, because
given any choice of $\alpha$ and $\beta$ and any nonsingular $r \times r$ matrix
$\xi$, the choice $\alpha\xi$ and $\beta\xi'^{-1}$ gives the same matrix
$\Pi = \alpha\beta' = \alpha\xi(\beta\xi'^{-1})'$.
If $x_t = (x_{1t}', x_{2t}')'$ and $\beta = (\beta_1', \beta_2')'$, with
$|\beta_1| \neq 0$, we can solve the cointegration relations as
$$x_{1t} = \gamma' x_{2t} + u_t,$$
where $u_t$ is stationary and $\gamma' = -(\beta_1')^{-1}\beta_2'$. This
represents cointegration as a regression equation. A normalization of this type
is sometimes convenient for estimation and calculation of standard errors of the
estimate, see section 5.2, but many hypotheses are invariant with respect to a
normalization of $\beta$, and thus, in a discussion of a test of such a
hypothesis, $\beta$ does not require normalization. As seen in subsection 3.3,
many stable economic relations are expressed in terms of identifying
restrictions, for which the regression formulation is not convenient.
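The invariance of the product of the adjustment and cointegration matrices
under a change of normalization, and the regression form of the cointegration
relations, can be sketched as follows. The matrices are hypothetical values for
$p = 4$, $r = 2$, and the rotation is chosen as the transpose of the top
$r \times r$ block of the cointegration matrix so that this block becomes the
identity.

```python
import numpy as np

r = 2
# Assumed values of alpha and beta for p = 4, r = 2.
alpha = np.array([[-0.3, 0.0],
                  [0.1, -0.2],
                  [0.0, 0.1],
                  [0.2, 0.0]])
beta = np.array([[1.0, 0.5],
                 [0.2, 1.0],
                 [-1.0, 0.0],
                 [0.0, -1.0]])
Pi = alpha @ beta.T

beta1, beta2 = beta[:r], beta[r:]          # beta = (beta_1', beta_2')'
xi = beta1.T                               # nonsingular r x r rotation

alpha_c = alpha @ xi                       # alpha xi
beta_c = beta @ np.linalg.inv(xi.T)        # beta xi'^{-1}, top block is I_r

# Regression form x_1t = gamma' x_2t + u_t with gamma' = -(beta_1')^{-1} beta_2'.
gamma_T = -np.linalg.solve(beta1.T, beta2.T)

print("normalized beta:\n", np.round(beta_c, 4))
print("gamma':\n", np.round(gamma_T, 4))
```

The product `alpha_c @ beta_c.T` reproduces `Pi`, so the likelihood cannot
distinguish the two parametrizations; only the normalization changes.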
From the Granger Representation Theorem we see that the $p - r$ common
trends are the nonstationary random walks in $C\sum_{i=1}^{t}\varepsilon_i$,
that is, can be chosen as $\alpha_\perp'\sum_{i=1}^{t}\varepsilon_i$. For any
full rank $(p-r) \times (p-r)$ matrix $\eta$,
$\eta\alpha_\perp'\sum_{i=1}^{t}\varepsilon_i$ could also be used as common
trends because

$$C\sum_{i=1}^{t}\varepsilon_i
= \beta_\perp(\alpha_\perp'\Gamma\beta_\perp)^{-1}\Big(\alpha_\perp'\sum_{i=1}^{t}\varepsilon_i\Big)
= \beta_\perp(\eta\alpha_\perp'\Gamma\beta_\perp)^{-1}\Big(\eta\alpha_\perp'\sum_{i=1}^{t}\varepsilon_i\Big).$$

Thus identifying restrictions on the coefficients in $\alpha_\perp$ are needed to
find their estimates and standard errors.
In the cointegration model there are therefore three different identification
problems: one for the cointegration relations, one for the common trends, and
finally one for the short run dynamics, if the model has simultaneous effects.

3.3 Hypotheses on long-run coefficients


The purpose of modeling economic data is to test hypotheses on the
coefficients, thereby investigating whether the data supports an economic
hypothesis or rejects it. In the example with the series
$x_t = (e_t, p_t^{us}, p_t^{au}, i_t^{au}, i_t^{us})'$ the hypothesis of PPP is
formulated as the hypothesis that $(1, -1, 1, 0, 0)'$ is a cointegration
relation. Similarly, the hypothesis of price homogeneity is formulated as
$$R'\beta = (0, 1, 1, 0, 0)\beta = 0,$$
or equivalently as $\beta = R_\perp\varphi = H\varphi$, for some vector $\varphi$
and $H = R_\perp$. The hypothesis that the interest rates are stationary is
formulated as the hypothesis that the two vectors $(0, 0, 0, 1, 0)'$ and
$(0, 0, 0, 0, 1)'$ are cointegration vectors. A general formulation of
restrictions on each of $r$ cointegration vectors, including a normalization, is
$$\beta = (h_1 + H_1\varphi_1, \ldots, h_r + H_r\varphi_r). \tag{11}$$
Here $h_i$ is $p \times 1$ and orthogonal to $H_i$ which is $p \times (s_i - 1)$
of rank $s_i - 1$, so that $p - s_i$ restrictions are imposed on the vector
$\beta_i$. Let $R_i = (h_i, H_i)_\perp$; then $\beta_i$ satisfies the restrictions
$R_i'\beta_i = 0$, and the normalization $(h_i'h_i)^{-1}h_i'\beta_i = 1$.
Wald's identification criterion is that $\beta_i$ is identified if
$$\mathrm{rank}\big(R_i'(\beta_1, \ldots, \beta_r)\big) = r - 1.$$
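The rank criterion can be checked mechanically once restriction matrices and
cointegration vectors are given. The sketch below is a hedged illustration
with hypothetical matrices for $p = 4$, $r = 2$; the function name
`wald_identified` is an assumption made for this example.

```python
import numpy as np

def wald_identified(R_list, beta):
    """For each i, check rank(R_i' (beta_1, ..., beta_r)) == r - 1."""
    r = beta.shape[1]
    return [bool(np.linalg.matrix_rank(R.T @ beta) == r - 1) for R in R_list]

e = np.eye(4)

# Case A (assumed): beta_1 has zeros in rows 3-4, beta_2 has zeros in rows 1-2.
beta_a = np.array([[1.0, 0.0],
                   [-1.0, 0.0],
                   [0.0, 1.0],
                   [0.0, -1.0]])
R_a = [e[:, [2, 3]], e[:, [0, 1]]]          # R_i' beta_i = 0 for each i
result_a = wald_identified(R_a, beta_a)

# Case B (assumed): both vectors satisfy the same single zero restriction.
beta_b = np.array([[1.0, 0.0],
                   [-1.0, 0.0],
                   [0.0, 1.0],
                   [0.0, 0.0]])
R_b = [e[:, [3]], e[:, [3]]]
result_b = wald_identified(R_b, beta_b)

print("case A identified:", result_a)
print("case B identified:", result_b)
```

In case B the restriction cannot tell the two relations apart, since any linear
combination of them satisfies it as well, and the rank criterion fails.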

3.4 Hypotheses on adjustment coefficients

The coefficients in $\alpha$ measure how the process adjusts to disequilibrium
errors. The hypothesis of weak exogeneity is the hypothesis that some rows of
$\alpha$ are zero; see Engle, Hendry and Richard [11]. The process $x_t$ is
decomposed as $x_t = (x_{1t}', x_{2t}')'$ and the matrices are decomposed
similarly so that the model equations without deterministic terms become

$$\Delta x_{1t} = \alpha_1\beta' x_{t-1} + \sum_{i=1}^{k-1}\Gamma_{1i}\Delta x_{t-i} + \varepsilon_{1t},$$

$$\Delta x_{2t} = \alpha_2\beta' x_{t-1} + \sum_{i=1}^{k-1}\Gamma_{2i}\Delta x_{t-i} + \varepsilon_{2t}.$$

If $\alpha_2 = 0$, there is no levels feedback from $\beta' x_{t-1}$ to
$\Delta x_{2t}$, and if the errors are Gaussian, $\Delta x_{2t}$ is weakly
exogenous for $\alpha_1, \beta$. The conditional model for $\Delta x_{1t}$ given
$\Delta x_{2t}$ and the past is

$$\Delta x_{1t} = \omega\Delta x_{2t} + \alpha_1\beta' x_{t-1} + \sum_{i=1}^{k-1}(\Gamma_{1i} - \omega\Gamma_{2i})\Delta x_{t-i} + \varepsilon_{1t} - \omega\varepsilon_{2t}, \tag{12}$$

where $\omega = \Omega_{12}\Omega_{22}^{-1}$. Thus full maximum likelihood
inference on $\alpha_1$ and $\beta$ can be conducted in the conditional model (12).
An interpretation of the hypothesis of weak exogeneity is the following:
if $\alpha_2 = 0$ then $\alpha_\perp$ contains the columns of $(0, I_{p-r})'$, so
that $\sum_{i=1}^{t}\varepsilon_{2i}$ are common trends. Thus the errors in the
equations for $x_{2t}$ cumulate in the system and give rise to nonstationarity.

4 Likelihood analysis of the I(1) model

This section contains first some comments on what aspects are important
for checking for model misspecification, and then describes the calculation
of reduced rank regression, introduced by Anderson [2]. Then reduced rank
regression and modifications thereof are applied to estimate the parameters
of the I(1) model (2) and various submodels.

4.1 Checking the specifications of the model

In order to apply Gaussian maximum likelihood methods, the assumptions


behind the model have to be checked carefully, so that one is convinced that
the statistical model contains the density that describes the data. If this is
not the case, the asymptotic results available from the Gaussian analysis need
not hold. Methods for checking vector autoregressive models include choice
of lag length, test for normality of residuals, tests for autocorrelation, and
test for heteroscedasticity in errors. Asymptotic results for estimators and
tests derived from the Gaussian likelihood turn out to be robust to some
types of deviations from the above assumptions. Thus the limit results hold
for i.i.d. errors with finite variance, and not just for Gaussian errors, but
autocorrelated errors violate the asymptotic results, so autocorrelation has to
be checked carefully.
Finally and perhaps most importantly, the assumption of constant para-
meters is crucial. In practice it is important to model outliers by suitable
dummies, but it is also important to model breaks in the dynamics, breaks
in the cointegration properties, breaks in the stationarity properties, etc. The
papers by Seo [45] and Hansen and Johansen [16] contain some results on
recursive tests in the cointegration model.

4.2 Reduced rank regression

Let $u_t$, $w_t$, and $z_t$ be three multivariate time series of dimensions
$p_u$, $p_w$, and $p_z$ respectively. The algorithm of reduced rank regression,
see Anderson [2], can be described in the regression model

$$u_t = \alpha\beta' w_t + \Gamma z_t + \varepsilon_t, \tag{13}$$

where $\varepsilon_t$ are the errors with variance $\Omega$. The product moments
are

$$S_{uw} = T^{-1}\sum_{t=1}^{T} u_t w_t',$$

and the residuals, which we get by regressing $u_t$ on $w_t$, are

$$(u_t \mid w_t) = u_t - S_{uw}S_{ww}^{-1}w_t,$$

so that the conditional product moments are

$$S_{uw.z} = S_{uw} - S_{uz}S_{zz}^{-1}S_{zw} = T^{-1}\sum_{t=1}^{T}(u_t \mid z_t)(w_t \mid z_t)',$$

$$S_{uu.w,z} = T^{-1}\sum_{t=1}^{T}(u_t \mid w_t, z_t)(u_t \mid w_t, z_t)' = S_{uu.w} - S_{uz.w}S_{zz.w}^{-1}S_{zu.w}.$$

Let $\Pi = \alpha\beta'$. The unrestricted regression estimates are

$$\hat\Pi = S_{uw.z}S_{ww.z}^{-1}, \quad \hat\Gamma = S_{uz.w}S_{zz.w}^{-1}, \quad \text{and} \quad \hat\Omega = S_{uu.w,z}.$$

Reduced rank regression of $u_t$ on $w_t$ corrected for $z_t$ gives estimates of
$\alpha$, $\beta$ and $\Omega$ in (13). First the eigenvalue problem

$$|\lambda S_{ww.z} - S_{wu.z}S_{uu.z}^{-1}S_{uw.z}| = 0 \tag{14}$$

is solved. The eigenvalues are ordered
$\hat\lambda_1 \ge \cdots \ge \hat\lambda_{p_w}$, with corresponding eigenvectors
$\hat v_1, \ldots, \hat v_{p_w}$. The reduced rank estimates of $\beta$, $\alpha$,
$\Gamma$, and $\Omega$ are given by

$$\begin{aligned}
\hat\beta &= (\hat v_1, \ldots, \hat v_r),\\
\hat\alpha &= S_{uw.z}\hat\beta,\\
\hat\Gamma &= S_{uz.\hat\beta' w}S_{zz.\hat\beta' w}^{-1},\\
\hat\Omega &= S_{uu.z} - S_{uw.z}\hat\beta\hat\beta' S_{wu.z},\\
|\hat\Omega| &= |S_{uu.z}|\prod_{i=1}^{r}(1 - \hat\lambda_i).
\end{aligned} \tag{15}$$

The eigenvectors are orthogonal because $\hat v_i' S_{ww.z}\hat v_j = 0$ for
$i \neq j$, and are normalized by $\hat v_i' S_{ww.z}\hat v_i = 1$. The
calculations described here are called a reduced rank regression and are denoted
by $RRR(u_t, w_t \mid z_t)$.
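The reduced rank regression steps can be sketched in a few lines of NumPy. This
is an illustrative implementation under stated assumptions, not a reference
one: the function name `rrr`, the use of a Cholesky factor to turn the
generalized eigenvalue problem into a symmetric one, and the simulated data are
all choices made for the example. The estimate of the coefficient of the
correcting variables is omitted.

```python
import numpy as np

def rrr(u, w, z, r):
    """Sketch of RRR(u, w | z); rows of u, w, z are observations."""
    T = u.shape[0]
    # Residuals (u|z) and (w|z) from least squares regressions on z.
    ru = u - z @ np.linalg.lstsq(z, u, rcond=None)[0]
    rw = w - z @ np.linalg.lstsq(z, w, rcond=None)[0]
    Suu, Suw, Sww = ru.T @ ru / T, ru.T @ rw / T, rw.T @ rw / T
    # |lambda Sww - Swu Suu^{-1} Suw| = 0 as a symmetric eigenvalue problem.
    L_inv = np.linalg.inv(np.linalg.cholesky(Sww))
    M = L_inv @ Suw.T @ np.linalg.solve(Suu, Suw) @ L_inv.T
    lam, E = np.linalg.eigh((M + M.T) / 2)
    order = np.argsort(lam)[::-1]                 # largest eigenvalue first
    lam, V = lam[order], L_inv.T @ E[:, order]    # V' Sww V = I
    beta = V[:, :r]
    alpha = Suw @ beta
    Omega = Suu - alpha @ alpha.T
    return lam, beta, alpha, Omega, Suu, Sww

# Simulated data with a rank one coefficient matrix (assumed values).
rng = np.random.default_rng(3)
T, r = 400, 1
w = rng.standard_normal((T, 3))
z = rng.standard_normal((T, 2))
alpha0 = np.array([[1.0], [-0.5], [0.3]])
beta0 = np.array([[1.0], [1.0], [0.0]])
u = (w @ beta0 @ alpha0.T + z @ rng.standard_normal((2, 3))
     + 0.5 * rng.standard_normal((T, 3)))

lam, beta, alpha, Omega, Suu, Sww = rrr(u, w, z, r)
print("eigenvalues:", np.round(lam, 4))
print("estimated alpha beta':\n", np.round(alpha @ beta.T, 3))
```

The first eigenvalue is clearly separated from the others, which reflects the
rank one structure of the simulated coefficient matrix.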

4.3 Maximum likelihood estimation in the I(1) model and derivation of the rank test

Consider the I(1) model given by equation (2). Note that the multiplier
$\alpha\Upsilon$ of $D_t$ is restricted to be proportional to $\alpha$ so that,
by the Granger Representation Theorem, $D_t$ does not cumulate in the process.
It is assumed for the derivations of maximum likelihood estimators and
likelihood ratio tests that $\varepsilon_t$ is i.i.d. $N_p(0, \Omega)$, but for
asymptotic results the Gaussian assumption is not needed. The Gaussian
likelihood function shows that the maximum likelihood estimator can be found by
the reduced rank regression
$$RRR(\Delta x_t, (x_{t-1}', D_t')' \mid \Delta x_{t-1}, \ldots, \Delta x_{t-k+1}, d_t).$$
It is convenient to introduce the notation for residuals
$$R_{0t} = (\Delta x_t \mid \Delta x_{t-1}, \ldots, \Delta x_{t-k+1}, d_t),$$
$$R_{1t} = ((x_{t-1}', D_t')' \mid \Delta x_{t-1}, \ldots, \Delta x_{t-k+1}, d_t),$$
and product moments
$$S_{ij} = T^{-1}\sum_{t=1}^{T} R_{it}R_{jt}'.$$
The estimates are given by (15), and the maximized likelihood is, apart from
a constant, given by
$$L_{\max}^{-2/T} = |\hat\Omega| = |S_{00}|\prod_{i=1}^{r}(1 - \hat\lambda_i). \tag{16}$$

Note that all the models H(r), $r = 0, \ldots, p$, have been solved by the
same eigenvalue calculation. The maximized likelihood is given for each $r$
by (16) and by dividing the maximized likelihood function for $r$ with the
corresponding expression for $r = p$, the likelihood ratio test for cointegration
rank is obtained:
$$-2\log LR(H(r) \mid H(p)) = -T\sum_{i=r+1}^{p}\log(1 - \hat\lambda_i). \tag{17}$$

This statistic was considered by Bartlett [5] for testing canonical correlations.
The asymptotic distribution of this test statistic and the estimators are
discussed in section 5.
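The rank test statistic can be computed in a few lines for the simplest case
with one lag and no deterministic terms, where the residuals are the
differences and the lagged levels themselves. The data generating process
below (the bivariate process of Example 1 with assumed adjustment coefficients
$-0.25$ and $0.25$) and the sample size are illustrative assumptions.

```python
import numpy as np

# Simulate a bivariate cointegrated process with rank r = 1 (assumed parameters).
rng = np.random.default_rng(2)
T = 500
alpha = np.array([-0.25, 0.25])
beta = np.array([1.0, -1.0])
x = np.zeros((T + 1, 2))
for t in range(1, T + 1):
    x[t] = x[t - 1] + alpha * (beta @ x[t - 1]) + rng.standard_normal(2)

# With k = 1 and no deterministic terms: R0t = Delta x_t, R1t = x_{t-1}.
R0 = np.diff(x, axis=0)
R1 = x[:-1]
S00, S01, S11 = R0.T @ R0 / T, R0.T @ R1 / T, R1.T @ R1 / T

# Eigenvalues of |lambda S11 - S10 S00^{-1} S01| = 0, largest first.
lam = np.linalg.eigvals(np.linalg.solve(S11, S01.T) @ np.linalg.solve(S00, S01))
lam = np.sort(lam.real)[::-1]

# Likelihood ratio statistic for H(r) in H(p), for r = 0 and r = 1.
trace_stats = [-T * np.sum(np.log(1 - lam[r:])) for r in range(2)]
print("eigenvalues:", np.round(lam, 4))
print("rank test statistics for r = 0, 1:", np.round(trace_stats, 2))
```

The statistic for $r = 0$ is large because one eigenvalue is far from zero,
while the statistic for $r = 1$ only involves the small remaining eigenvalue.
Both have to be compared with the nonstandard limit distributions of section 5.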
The model obtained under the hypothesis $\beta = H\varphi$, is analyzed by
$$RRR(\Delta x_t, ((H' x_{t-1})', D_t')' \mid \Delta x_{t-1}, \ldots, \Delta x_{t-k+1}, d_t),$$
and a number of hypotheses of this type for $\beta$ and $\alpha$ can be solved in
the same way, but the more general hypothesis
$$\beta = (h_1 + H_1\varphi_1, \ldots, h_r + H_r\varphi_r),$$
cannot be solved by reduced rank regression. With
$\alpha = (\alpha_1, \ldots, \alpha_r)$ and
$\Upsilon = (\Upsilon_1', \ldots, \Upsilon_r')'$, equation (2) becomes
$$\Delta x_t = \sum_{j=1}^{r}\alpha_j\big((h_j + H_j\varphi_j)' x_{t-1} + \Upsilon_j D_t\big) + \sum_{i=1}^{k-1}\Gamma_i\Delta x_{t-i} + \Phi d_t + \varepsilon_t.$$

This is reduced rank regression, but there are $r$ reduced rank matrices
$\alpha_j(1, \varphi_j', \Upsilon_j)$ of rank one. The solution is not given by an
eigenvalue problem, but there is a simple modification of the reduced rank
algorithm, which is easy to implement and is quite often found to converge. The
algorithm has the property that the likelihood function is maximized in each
step. The algorithm switches between reduced rank regressions of $\Delta x_t$ on
$(x_{t-1}'(H_i, h_i), D_t')'$ corrected for
$$\big(((h_j + H_j\varphi_j)' x_{t-1} + \Upsilon_j D_t)_{j \neq i}, \Delta x_{t-1}, \ldots, \Delta x_{t-k+1}, d_t\big).$$
This result can immediately be applied to calculate likelihood ratio tests
for many different restrictions on the coefficients of the cointegration
relations. Thus, in particular, this can give a test of over-identifying
restrictions.

5 Asymptotic analysis
A discussion of the most important aspects of the asymptotic analysis of
the cointegration model is given. This includes the result that the rank test
requires a family of Dickey-Fuller type distributions, depending on the
specification of the deterministic terms of the model. The asymptotic
distribution of $\hat\beta$ is mixed Gaussian and that of the remaining
parameters is Gaussian, so that tests for hypotheses on the parameters are
asymptotically distributed as $\chi^2$. All results are taken from Johansen [21].

5.1 Asymptotic distribution of the rank test

The asymptotic distribution of the rank test is given in case the process has
a linear trend.
Theorem 3. Let $\varepsilon_t$ be i.i.d. $(0, \Omega)$ and assume that $D_t = t$
and $d_t = 1$, in model (2). Under the assumptions that the cointegration rank
is $r$, the asymptotic distribution of the likelihood ratio test statistic (17)
is
$$-2\log LR(H(r) \mid H(p)) \xrightarrow{d} \mathrm{tr}\Big\{\int_0^1 (dB)F'\Big(\int_0^1 FF'\,du\Big)^{-1}\int_0^1 F(dB)'\Big\}, \tag{18}$$

where $F$ is defined by
$$F(u) = \left(\begin{matrix} B(u) \\ u \end{matrix}\ \middle|\ 1\right),$$
that is, $(B(u)', u)'$ corrected for a constant, and $B(u)$ is the $p - r$
dimensional standard Brownian motion.
The limit distribution is tabulated by simulating the distribution of the
test of no cointegration in the model for a $p - r$ dimensional model with
one lag and the same deterministic terms. Note that the limit distribution
does not depend on the parameters $(\Gamma_1, \ldots, \Gamma_{k-1}, \Upsilon, \Phi, \Omega)$, but only on $p - r$,
the number of common trends, and the presence of the linear trend. For finite
samples, however, the dependence on the parameters can be quite pronounced.
A small sample correction for the test has been given in [24], and the bootstrap
has been investigated by Swensen [47].
In the model without deterministics the same result holds, but with $F(u) = B(u)$.
A special case of this, for $p = 1$, is the Dickey-Fuller test and the
distributions (18) are called the Dickey-Fuller distributions with $p - r$ degrees
of freedom; see [9].
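The remark that the limit distribution is tabulated by simulation can be illustrated for the case without deterministic terms, where $F(u) = B(u)$. The sketch below replaces the Brownian motion by a Gaussian random walk with $T$ steps; the function name, the defaults for $T$ and the number of replications, and the seed are assumptions chosen for illustration, not the settings used for any published table.

```python
import numpy as np

def simulate_trace_statistic(m, T=200, reps=2000, seed=0):
    """Simulate tr{S M^{-1} S'} for an m-dimensional Gaussian random walk.

    S = sum_t eps_t F_{t-1}' and M = sum_t F_{t-1} F_{t-1}', where F is the
    random walk generated by eps. The powers of T in the normalizations
    cancel, so this is a discretization of (18) with F(u) = B(u) and
    m = p - r.
    """
    rng = np.random.default_rng(seed)
    stats = np.empty(reps)
    for i in range(reps):
        eps = rng.standard_normal((T, m))
        walk = np.cumsum(eps, axis=0)
        F = np.vstack([np.zeros((1, m)), walk[:-1]])  # lagged levels F_{t-1}
        S = eps.T @ F                                  # m x m
        M = F.T @ F                                    # m x m
        stats[i] = np.trace(S @ np.linalg.solve(M, S.T))
    return stats

# Approximate upper quantiles for one common trend (m = 1).
stats = simulate_trace_statistic(m=1)
quantiles = np.quantile(stats, [0.90, 0.95, 0.99])
```

Larger $T$ and more replications bring the simulated quantiles closer to the limit distribution, at the cost of run time.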
The asymptotic distribution of the test statistic for rank depends on the
deterministic terms in the model. It follows from the Granger Representation
Theorem that the deterministic term $d_t$ is cumulated to $C\Phi\sum_{i=1}^{t} d_i$. In
deriving the asymptotics, $x_t$ is normalized by $T^{-1/2}$. If $\sum_{i=1}^{t} d_i$ is bounded,
this normalization implies that the limit distribution does not depend on the
precise form of $\sum_{i=1}^{t} d_i$. Thus, if $d_t$ is a centered seasonal dummy, or an
innovation dummy $d_t = 1_{\{t = t_0\}}$, it does not change the asymptotic distribution.
If, on the other hand, a step dummy $d_t = 1_{\{t \geq t_0\}}$ is included, then the
cumulation of this is a broken linear trend, and that influences the limit
distribution and requires special tables; see [29].
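The difference between bounded and unbounded cumulated dummies is easy to see numerically. The following sketch uses a hypothetical sample length, break date and quarterly seasonal pattern, all chosen only for illustration.

```python
import numpy as np

T, t0 = 100, 40                      # hypothetical sample length and break date
t = np.arange(1, T + 1)

impulse = (t == t0).astype(float)    # innovation dummy 1{t = t0}
step = (t >= t0).astype(float)       # step dummy 1{t >= t0}
seasonal = np.tile([0.75, -0.25, -0.25, -0.25], T // 4)  # centered quarterly dummy

cum_impulse = np.cumsum(impulse)     # bounded: a level shift of size one
cum_seasonal = np.cumsum(seasonal)   # bounded: returns to zero every fourth period
cum_step = np.cumsum(step)           # unbounded: a broken linear trend
```

After division by $T^{1/2}$ the first two cumulated series vanish as $T$ grows, whereas the cumulated step dummy grows linearly after the break date and therefore does not.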

5.2 Asymptotic distribution of the estimators

The main result here is that the estimator of $\beta$, suitably normalized, converges
to a mixed Gaussian distribution, even when estimated under continuously
differentiable restrictions [20]. This result implies that likelihood ratio tests
on $\beta$ are asymptotically $\chi^2$ distributed. Furthermore the estimators of the
adjustment parameters $\alpha$ and the short-run parameters $\Gamma_i$ are asymptotically
Gaussian and asymptotically independent of the estimator for $\beta$.
In order to illustrate these results, the asymptotic distribution of $\hat\beta$ for
$r = 2$ is given, when $\beta$ is identified by

$$\beta = (h_1 + H_1\varphi_1, h_2 + H_2\varphi_2). \tag{19}$$

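Theorem 4. In model (2) without deterministic terms and $\varepsilon_t$ i.i.d. $(0, \Omega)$,
the asymptotic distribution of $T\operatorname{vec}(\hat\beta - \beta)$ is given by
$$\begin{pmatrix} H_1 & 0 \\ 0 & H_2 \end{pmatrix} \begin{pmatrix} \rho_{11}H_1'\mathcal{G}H_1 & \rho_{12}H_1'\mathcal{G}H_2 \\ \rho_{21}H_2'\mathcal{G}H_1 & \rho_{22}H_2'\mathcal{G}H_2 \end{pmatrix}^{-1} \begin{pmatrix} H_1'\int_0^1 G\,(dV_1) \\ H_2'\int_0^1 G\,(dV_2) \end{pmatrix}, \tag{20}$$
where
$$T^{-1/2}x_{[Tu]} \xrightarrow{d} G = CW, \qquad T^{-1}S_{11} \xrightarrow{d} \mathcal{G} = C\int_0^1 WW'\,du\,C',$$
and
$$V = \alpha'\Omega^{-1}W = (V_1, V_2)', \qquad \rho_{ij} = \alpha_i'\Omega^{-1}\alpha_j.$$
The estimators of the remaining parameters are asymptotically Gaussian and
asymptotically independent of $\hat\beta$.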
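Note that $G$ and $V$ are independent Brownian motions so that the limit
distribution is mixed Gaussian and the asymptotic conditional distribution
given $G$ is Gaussian with asymptotic conditional variance
$$\begin{pmatrix} H_1 & 0 \\ 0 & H_2 \end{pmatrix} \begin{pmatrix} \rho_{11}H_1'\mathcal{G}H_1 & \rho_{12}H_1'\mathcal{G}H_2 \\ \rho_{21}H_2'\mathcal{G}H_1 & \rho_{22}H_2'\mathcal{G}H_2 \end{pmatrix}^{-1} \begin{pmatrix} H_1' & 0 \\ 0 & H_2' \end{pmatrix}.$$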

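A consistent estimator for the asymptotic conditional variance is

$$T \begin{pmatrix} H_1 & 0 \\ 0 & H_2 \end{pmatrix} \begin{pmatrix} \hat\rho_{11}H_1'S_{11}H_1 & \hat\rho_{12}H_1'S_{11}H_2 \\ \hat\rho_{21}H_2'S_{11}H_1 & \hat\rho_{22}H_2'S_{11}H_2 \end{pmatrix}^{-1} \begin{pmatrix} H_1' & 0 \\ 0 & H_2' \end{pmatrix}. \tag{21}$$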
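The estimator (21) is a plain matrix computation once its ingredients are available. The sketch below assumes that the restriction matrices, the $2 \times 2$ matrix of estimated $\rho_{ij}$ and the product moment matrix $S_{11}$ have already been obtained from the estimation step; the function and argument names are hypothetical.

```python
import numpy as np

def conditional_variance(H1, H2, rho_hat, S11, T):
    """Evaluate the block formula (21) for r = 2.

    H1 (p x m1) and H2 (p x m2) are the restriction matrices in (19),
    rho_hat is the 2 x 2 matrix of estimated rho_ij, S11 is the p x p
    product moment matrix of the levels, and T is the sample size.
    """
    H = [H1, H2]
    # Inner matrix with (i, j) block rho_ij * H_i' S11 H_j.
    inner = np.block([[rho_hat[i, j] * H[i].T @ S11 @ H[j] for j in range(2)]
                      for i in range(2)])
    p = H1.shape[0]
    m1, m2 = H1.shape[1], H2.shape[1]
    # Block diagonal matrix diag(H1, H2) of dimension 2p x (m1 + m2).
    Hd = np.zeros((2 * p, m1 + m2))
    Hd[:p, :m1] = H1
    Hd[p:, m1:] = H2
    return T * Hd @ np.linalg.solve(inner, Hd.T)
```

The result is a symmetric $2p \times 2p$ matrix; it is singular whenever the restrictions leave some directions of $\beta$ fixed, which is expected.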

In order to interpret these results, note that the observed information
about $\beta$ in the data (keeping other parameters fixed) is given by

$$J_T = T \begin{pmatrix} \hat\rho_{11}H_1'S_{11}H_1 & \hat\rho_{12}H_1'S_{11}H_2 \\ \hat\rho_{21}H_2'S_{11}H_1 & \hat\rho_{22}H_2'S_{11}H_2 \end{pmatrix},$$

which normalized by $T^2$ converges to the stochastic limit

$$J = \begin{pmatrix} \rho_{11}H_1'\mathcal{G}H_1 & \rho_{12}H_1'\mathcal{G}H_2 \\ \rho_{21}H_2'\mathcal{G}H_1 & \rho_{22}H_2'\mathcal{G}H_2 \end{pmatrix}.$$

Thus the result (20) states that, given the asymptotic information or
equivalently the limit of the common trends, $\alpha_\perp'W$, the limit distribution of
$T(\hat\beta - \beta)$ is Gaussian with a variance that is a function of the inverse
limit information. Hence the asymptotic distribution of
$$J_T^{1/2} \begin{pmatrix} \bar H_1'(\hat\beta_1 - \beta_1) \\ \bar H_2'(\hat\beta_2 - \beta_2) \end{pmatrix}$$
is a standard Gaussian distribution. Here $\bar H_i' = (H_i'H_i)^{-1}H_i'$. This implies
that Wald and therefore likelihood ratio tests on $\beta$ can be conducted using
the asymptotic $\chi^2$ distribution.
It is therefore possible to scale the deviations $\hat\beta - \beta$ in order to obtain an
asymptotic Gaussian distribution. Note that the scaling matrix $J_T^{1/2}$ is not
derived from an estimate of the asymptotic variance of $\hat\beta$, but from an estimate
of the asymptotic conditional variance given the information in the data. It is
therefore not the asymptotic distribution of $\hat\beta$ that is used for inference,
but the conditional distribution given the information; see Basawa and Scott [6]
or [19] for a discussion. Finally the result on the likelihood ratio test for the
restrictions given in (19) is formulated.

Theorem 5. Let $\varepsilon_t$ be i.i.d. $(0, \Omega)$. The asymptotic distribution of the
likelihood ratio test statistic for the restrictions (19) in model (2) with no
deterministic terms is $\chi^2$ with degrees of freedom given by $\sum_{i=1}^{r}(p - r - s_i + 1)$.
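The degrees of freedom in Theorem 5 are a simple sum over the cointegrating relations. As a small worked example, the helper below evaluates it for given $p$ and $(s_1, \ldots, s_r)$, with $s_i$ taken as defined for the restrictions earlier in the text; the function name and the numbers in the usage line are hypothetical.

```python
def lr_degrees_of_freedom(p, s):
    """Return sum_i (p - r - s_i + 1), where r = len(s)."""
    r = len(s)
    return sum(p - r - s_i + 1 for s_i in s)

# Hypothetical case: p = 5 variables, r = 2 relations, s = (2, 3),
# giving (5 - 2 - 2 + 1) + (5 - 2 - 3 + 1) = 2 + 1.
df = lr_degrees_of_freedom(5, [2, 3])
```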

This result is taken from [21], and a small sample correction for some tests
on $\beta$ has been developed in [23].

6 Further topics in the area of cointegration


It is mentioned here how the I(1) model can be applied to test hypotheses
implied by rational expectations. The basic model for I(1) processes can be
extended to other models of nonstationarity, in particular models for seasonal
roots, explosive roots, I(2) processes, fractionally integrated processes
and nonlinear cointegration. We discuss here models for I(2) processes, and
refer to the paper by Lange and Rahbek [33] for some models of nonlinear
cointegration.

6.1 Rational expectations

Many economic models operate with the concept of rational or model based
expectations; see Hansen and Sargent [15]. An example of such a formulation
is uncovered interest parity,

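$$\Delta^e e_{t+1} = i_t^1 - i_t^2, \tag{22}$$

which expresses a balance between interest rates in two countries and economic
expectations of exchange rate changes. If a vector autoregressive model

$$\Delta x_t = \alpha\beta'x_{t-1} + \Gamma_1\Delta x_{t-1} + \varepsilon_t, \tag{23}$$

fits the data $x_t = (e_t, i_t^1, i_t^2)'$, the assumption of model based expectations,
Muth [35], means that $\Delta^e e_{t+1}$ can be replaced by the conditional expectation
$E_t\Delta e_{t+1}$ based upon model (23). That is, with $\alpha_1$ and $\Gamma_{11}$ denoting the
first rows of $\alpha$ and $\Gamma_1$,

$$\Delta^e e_{t+1} = E_t\Delta e_{t+1} = \alpha_1\beta'x_t + \Gamma_{11}\Delta x_t.$$

Assumption (22) implies the identity

$$i_t^1 - i_t^2 = \alpha_1\beta'x_t + \Gamma_{11}\Delta x_t.$$

Hence the cointegration relation is

$$\beta'x_t = i_t^1 - i_t^2,$$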

and the other parameters are restricted by $\alpha_1 = 1$ and $\Gamma_{11} = 0$. Thus,
the hypothesis (22) implies a number of testable restrictions on the vector
autoregressive model. The implications of model based expectations for
the cointegrated vector autoregressive model are explored in [30], where it is
shown that, as in the example above, rational expectation restrictions imply
testable restrictions on cointegration relations and short-run adjustments. It
is demonstrated how estimation under rational expectation restrictions can
be performed by regression and reduced rank regression in certain cases.
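The restrictions in the uncovered interest parity example can be checked numerically. The sketch below simulates model (23) with $\beta = (0, 1, -1)'$, first entry of $\alpha$ equal to one and first row of $\Gamma_1$ equal to zero, and verifies that the model based expectation of the exchange rate change equals the interest differential. All remaining parameter values, the noise scale and the seed are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

beta = np.array([[0.0], [1.0], [-1.0]])     # beta' x_t = i1_t - i2_t
alpha = np.array([[1.0], [-0.3], [0.2]])    # first entry alpha_1 = 1
Gamma1 = np.array([[0.0, 0.0, 0.0],         # first row Gamma_11 = 0
                   [0.1, 0.2, 0.0],
                   [0.0, 0.1, 0.2]])

# Simulate model (23): dx_t = alpha beta' x_{t-1} + Gamma1 dx_{t-1} + eps_t,
# with x_t = (e_t, i1_t, i2_t)'.
T = 200
x = np.zeros((T, 3))
dx = np.zeros((T, 3))
for t in range(1, T):
    dx[t] = alpha @ beta.T @ x[t - 1] + Gamma1 @ dx[t - 1] + 0.1 * rng.standard_normal(3)
    x[t] = x[t - 1] + dx[t]

# Model based expectation E_t dx_{t+1} = alpha beta' x_t + Gamma1 dx_t at every date.
expected_dx = x @ beta @ alpha.T + dx @ Gamma1.T
expected_de = expected_dx[:, 0]             # first component: E_t de_{t+1}
interest_differential = x[:, 1] - x[:, 2]
```

Under these restrictions the two series coincide at every date, which is the identity implied by (22); with any other first row of $\alpha$ or $\Gamma_1$ they would differ.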

6.2 The I(2) model

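It is sometimes found that inflation rates are best described by I(1) processes
and then log prices are I(2). In such a case $\alpha_\perp'\Gamma\beta_\perp$ has reduced rank; see (5).
Under this condition model (2) can be parametrized as

$$\Delta^2 x_t = \alpha(\beta'x_{t-1} + \psi'\Delta x_{t-1}) + \Omega\alpha_\perp(\alpha_\perp'\Omega\alpha_\perp)^{-1}\kappa'\tau'\Delta x_{t-1} + \varepsilon_t, \tag{24}$$

where $\alpha$ and $\beta$ are $p \times r$ and $\tau$ is $p \times (r + s)$, or equivalently as

$$\Delta^2 x_t = \alpha \begin{pmatrix} \beta \\ \delta' \end{pmatrix}' \begin{pmatrix} x_{t-1} \\ \tau_\perp'\Delta x_{t-1} \end{pmatrix} + \zeta\tau'\Delta x_{t-1} + \varepsilon_t, \tag{25}$$

where
$$\delta = \psi'\bar\tau_\perp, \qquad \zeta = \alpha\psi'\bar\tau + \Omega\alpha_\perp(\alpha_\perp'\Omega\alpha_\perp)^{-1}\kappa';$$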
see [22] and Paruolo and Rahbek [39]. Under suitable conditions on the
parameters, the solution of equations (24) or (25) has the form
$$x_t = C_2\sum_{i=1}^{t}\sum_{j=1}^{i}\varepsilon_j + C_1\sum_{i=1}^{t}\varepsilon_i + A_1 + tA_2 + y_t,$$
where $y_t$ is stationary and $C_1$ and $C_2$ are functions of the model parameters.
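The structure of this representation can be illustrated by building a process directly from cumulated and doubly cumulated shocks. In the sketch below the matrices standing in for $C_1$ and $C_2$ are hypothetical and are not derived from any model parameters, and $A_1$, $A_2$ and $y_t$ are set to zero for simplicity.

```python
import numpy as np

rng = np.random.default_rng(2)
T, p = 300, 2
eps = rng.standard_normal((T, p))

C2 = np.array([[1.0, 0.0], [0.5, 0.0]])   # hypothetical loading of the I(2) trend
C1 = np.array([[0.0, 1.0], [0.0, 1.0]])   # hypothetical loading of the I(1) trend

random_walk = np.cumsum(eps, axis=0)           # sum_{i<=t} eps_i
double_sum = np.cumsum(random_walk, axis=0)    # sum_{i<=t} sum_{j<=i} eps_j
x = double_sum @ C2.T + random_walk @ C1.T     # A1, A2 and y_t set to zero

# Differencing twice removes both stochastic trends:
# Delta^2 x_t = C2 eps_t + C1 (eps_t - eps_{t-1}).
d2x = np.diff(x, n=2, axis=0)
implied = eps[2:] @ C2.T + (eps[2:] - eps[1:-1]) @ C1.T

# A vector b with b' C2 = 0 removes the I(2) trend, so b' x_t is only I(1).
b = np.array([1.0, -2.0])
reduced = x @ b
```

Here `reduced` equals minus the second component of the random walk, a simple instance of a linear combination that cointegrates from I(2) to I(1).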


One can prove that the processes $\Delta^2 x_t$, $\beta'x_t + \psi'\Delta x_t$, and $\tau'\Delta x_t$ are
stationary. Thus $\tau'x_t$ are cointegration relations from I(2) to I(1). The model
also allows for multicointegration, that is, cointegration between levels and
differences because $\beta'x_t + \psi'\Delta x_t$ is stationary; see Engle and Yoo [12].
Maximum likelihood estimation can be performed by a switching algorithm using
the two parametrizations given in (24) and (25). The same techniques can be
used for a number of hypotheses on the cointegration parameters $\beta$ and $\tau$.
The asymptotic theory of likelihood ratio tests and maximum likelihood
estimators is developed by Johansen [22, 28], Rahbek, Kongsted, and Jørgensen
[44], Paruolo [38, 40], Boswijk [7] and Nielsen and Rahbek [36]. It is shown
that the likelihood ratio test for rank involves not only Brownian motion,
but also integrated Brownian motion and hence some new Dickey-Fuller type
distributions that have to be simulated. The asymptotic distribution of the
maximum likelihood estimator is quite involved, as it is not mixed Gaussian,
but many hypotheses still allow asymptotic $\chi^2$ inference; see [28].

7 Concluding remarks
What has been developed for the cointegrated vector autoregressive model is a
set of useful tools for the analysis of macroeconomic and financial time series.
The theory is part of many textbooks, and software for the analysis of data
has been implemented in several packages, e.g. in CATS in RATS, Givewin,
Eviews, Microfit, Shazam, R, etc.
Many theoretical problems remain unsolved, however. We mention here
three important problems for future development.
1. The analysis of models for time series relies strongly on asymptotic
methods, and it is often a problem to obtain sufficiently long series in
economics which actually measure the same variables for the whole period.
Therefore periods which can be modelled by constant parameters are often rather
short, and it is therefore extremely important to develop methods for small
sample correction of the asymptotic results. Such methods can be analytic or
simulation based. When these become part of the software packages, and
are routinely applied, they will ensure more reliable inference.
2. A very interesting and promising development lies in the analysis of
cointegration in nonlinear time series, where the statistical theory is still in its
early stages. Many different types of nonlinearities are possible, and the theory
has to be developed in close contact with applications in order to ensure that
useful models and concepts are developed; see the overview [33].
3. Most important, however, is the development of an economic theory
which takes into account the findings of empirical analyses of nonstationary
economic data. For a long time, regression analysis and correlations have
been standard tools for quantitative analysis of relations between variables
in economics. Economic theory has incorporated these techniques in order to
learn from data. In the same way economic theory should be developed to
incorporate nonstationarity of data and develop theories consistent with the
findings of empirical cointegration analyses.

References
1. Ahn SK, Reinsel GC (1990) Estimation for partially nonstationary multivariate
autoregressive models. Journal of the American Statistical Association 85:813-823
2. Anderson TW (1951) Estimating linear restrictions on regression coefficients for
multivariate normal distributions. Annals of Mathematical Statistics 22:327-351
3. Anderson TW (1971) The statistical analysis of time series. Wiley, New York
4. Banerjee A, Dolado JJ, Galbraith JW, Hendry DF (1993) Co-integration
error-correction and the econometric analysis of nonstationary data. Oxford
University Press, Oxford
5. Bartlett M (1948) A note on the statistical estimation of the demand and supply
relations from time series. Econometrica 16:323-329
6. Basawa IV, Scott DJ (1983) Asymptotic optimal inference for non-ergodic
models. Springer, New York
7. Boswijk P (2000) Mixed normality and ancillarity in I(2) systems. Econometric
Theory 16:878-904
8. Campbell J, Shiller RJ (1987) Cointegration and tests of present value models.
Journal of Political Economy 95:1062-1088
9. Dickey DA, Fuller WA (1981) Likelihood ratio statistics for autoregressive time
series with a unit root. Econometrica 49:1057-1072
10. Engle RF, Granger CWJ (1987) Co-integration and error correction:
Representation, estimation and testing. Econometrica 55:251-276
11. Engle RF, Hendry DF, Richard J-F (1983) Exogeneity. Econometrica 51:277-304
12. Engle RF, Yoo BS (1991) Cointegrated economic time series: A survey with
new results. In: Granger CWJ, Engle RF (eds) Long-run economic relations.
Readings in cointegration. Oxford University Press, Oxford
13. Granger CWJ (1983) Cointegrated variables and error correction models. UCSD
Discussion paper 83-13a
14. Hamilton JD (1994) Time series analysis. Princeton University Press, Princeton,
New Jersey
15. Hansen LP, Sargent TJ (1991) Exact linear rational expectations models:
Specification and estimation. In: Hansen LP, Sargent TJ (eds) Rational expectations
econometrics. Westview Press, Boulder
16. Hansen H, Johansen S (1999) Some tests for parameter constancy in the
cointegrated VAR. The Econometrics Journal 2:306-333
17. Hendry DF (1995) Dynamic econometrics. Oxford University Press, Oxford
18. Johansen S (1988) Statistical analysis of cointegration vectors. Journal of
Economic Dynamics and Control 12:231-254
19. Johansen S (1995) The role of ancillarity in inference for nonstationary variables.
Economic Journal 13:302-320
20. Johansen S (1991) Estimation and hypothesis testing of cointegration vectors
in Gaussian vector autoregressive models. Econometrica 59:1551-1580
21. Johansen S (1996) Likelihood-based inference in cointegrated vector
autoregressive models. Oxford University Press, Oxford
22. Johansen S (1997) Likelihood analysis of the I(2) model. Scandinavian Journal
of Statistics 24:433-462
23. Johansen S (2000) A Bartlett correction factor for tests on the cointegration
relations. Econometric Theory 16:740-778
24. Johansen S (2002) A small sample correction of the test for cointegration rank
in the vector autoregressive model. Econometrica 70:1929-1961
25. Johansen S (2005) The interpretation of cointegration coefficients in the
cointegrated vector autoregressive model. Oxford Bulletin of Economics and Statistics
67:93-104
26. Johansen S (2006a) Cointegration: a survey. In: Mills TC, Patterson K (eds)
Palgrave handbook of econometrics: Volume 1, Econometric theory. Palgrave
Macmillan, Basingstoke
27. Johansen S (2006b) Representation of cointegrated autoregressive processes with
application to fractional processes. Forthcoming in Econometric Reviews
28. Johansen S (2006c) Statistical analysis of hypotheses on the cointegration
relations in the I(2) model. Journal of Econometrics 132:81-115
29. Johansen S, Mosconi R, Nielsen B (2000) Cointegration analysis in the presence
of structural breaks in the deterministic trend. The Econometrics Journal 3:1-34
30. Johansen S, Swensen AR (2004) More on testing exact rational expectations in
vector autoregressive models: Restricted drift term. The Econometrics Journal
7:389-397
31. Juselius K, MacDonald R (2004) The international parities between USA and
Japan. Japan and the World Economy 16:17-34
32. Juselius K (2006) The cointegrated VAR model: Econometric methodology and
macroeconomic applications. Oxford University Press, Oxford
33. Lange T, Rahbek A (2006) An introduction to regime switching time series
models. This Volume
34. Lütkepohl H (2006) Introduction to multiple time series analysis. Springer,
New York
35. Muth JF (1961) Rational expectations and the theory of price movements.
Econometrica 29:315-335
36. Nielsen HB, Rahbek A (2004) Likelihood ratio testing for cointegration ranks
in I(2) models. Forthcoming in Econometric Theory
37. Nyblom J, Harvey A (2000) Tests of common stochastic trends. Econometric
Theory 16:176-199
38. Paruolo P (1996) On the determination of integration indices in I(2) systems.
Journal of Econometrics 72:313-356
39. Paruolo P, Rahbek A (1999) Weak exogeneity in I(2) VAR systems. Journal of
Econometrics 93:281-308
40. Paruolo P (2000) Asymptotic efficiency of the two stage estimator in I(2)
systems. Econometric Theory 16:524-550
41. Phillips PCB (1991) Optimal inference in cointegrated systems. Econometrica
59:283-306
42. Phillips PCB, Hansen BE (1990) Statistical inference on instrumental variables
regression with I(1) processes. Review of Economic Studies 57:99-124
43. Proietti T (1997) Short-run dynamics in cointegrated systems. Oxford Bulletin
of Economics and Statistics 59:405-422
44. Rahbek A, Kongsted HC, Jørgensen C (1999) Trend-stationarity in the I(2)
cointegration model. Journal of Econometrics 90:265-289
45. Seo B (1998) Tests for structural change in cointegrated systems. Econometric
Theory 14:222-259
46. Stock JH (1987) Asymptotic properties of least squares estimates of
cointegration vectors. Econometrica 55:1035-1056
47. Swensen AR (2006) Bootstrap algorithms for testing and determining the
cointegrating rank in VAR models. Forthcoming in Econometric Theory
48. Watson M (1994) Vector autoregressions and cointegration. In: Engle RF,
McFadden D (eds) Handbook of econometrics Vol. 4. North Holland Publishing
Company, Netherlands
