Chapter 2 - Autocorrelation
$$E(u_i u_j) = 0 \quad (\text{for } i \neq j)$$
$$r_{e_t e_{t-1}} = \frac{\sum e_t e_{t-1}}{\sqrt{\sum e_t^2}\,\sqrt{\sum e_{t-1}^2}} = \hat{\rho}_{u_t u_{t-1}}$$
$r_{e_t e_{t-1}}$ is an estimate of the true autocorrelation coefficient $\rho_{u_t u_{t-1}}$, which measures the correlation of the true population of $u$'s.
If the value of $u$ in any particular period depends on its own value in the preceding period alone, we say that the $u$'s follow a first-order autoregressive scheme (or first-order Markov process). The relationship between the $u$'s is then of the form
$$u_t = f(u_{t-1})$$
If $u$ depends on the values of the two previous periods, that is $u_t = f(u_{t-1}, u_{t-2})$, the form of autocorrelation is called a second-order autoregressive scheme, and so on.
In most applied research it is assumed that, when autocorrelation is present, it is of the simple first-order form $u_t = f(u_{t-1})$, and more particularly
$$u_t = a_1 u_{t-1} + v_t$$
where $a_1$ is the coefficient of the autocorrelation relationship and $v$ is a random variable satisfying all the usual assumptions. Applying least squares to this relationship gives the estimate
$$\hat{a}_1 = \frac{\sum_{t=2}^{n} u_t u_{t-1}}{\sum_{t=2}^{n} u_{t-1}^2}$$
The autocorrelation coefficient of the $u$'s is
$$\rho_{u_t u_{t-1}} = \frac{\sum u_t u_{t-1}}{\sqrt{\sum u_t^2}\,\sqrt{\sum u_{t-1}^2}}$$
Given that for large samples $\sum u_t^2 \approx \sum u_{t-1}^2$, we may write
$$\rho \approx \frac{\sum u_t u_{t-1}}{\sum u_{t-1}^2}$$
Clearly $\rho \approx \hat{a}_1$ for large samples. This is why in most textbooks the simple first-order autoregressive model is given in the form
$$u_t = \rho u_{t-1} + v_t$$
where $\rho$ is the first-order autocorrelation coefficient. Clearly if $\rho = 0$, then $u_t = v_t$, that is, $u_t$ is not autocorrelated.
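The first-order scheme and the estimator $\hat{a}_1$ can be illustrated by simulation; a minimal sketch in Python, where the values of $a_1$ and $n$ are assumed purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
a1, n = 0.7, 5000  # assumed illustrative values

# v_t: a random variable satisfying all the usual assumptions
v = rng.normal(0.0, 1.0, n)

# u_t = a1 * u_(t-1) + v_t  (first-order autoregressive scheme)
u = np.zeros(n)
for t in range(1, n):
    u[t] = a1 * u[t - 1] + v[t]

# a1_hat = sum(u_t * u_(t-1)) / sum(u_(t-1)^2), t = 2, ..., n
a1_hat = np.sum(u[1:] * u[:-1]) / np.sum(u[:-1] ** 2)
print(a1_hat)  # close to the true a1 = 0.7
```

With a sample this large the estimate lands very close to the true coefficient, consistent with $\rho \approx \hat{a}_1$ for large samples.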
SOURCES OF AUTOCORRELATION
Autocorrelated values of the disturbance term u may be observed for many reasons.
i. Omitted explanatory variables.
ii. Misspecification of the mathematical form of the model.
iii. Interpolations of the statistical observations.
iv. Misspecification of the true random term u.
It should be noted that the source of autocorrelation has a strong bearing on the
solution which must be adopted for the correction of the incidence of serial
correlation.
We will next establish the mean, variance and covariance of this autocorrelated
disturbance variable.
I. The mean of the autocorrelated $u$'s

By successive substitution in $u_t = \rho u_{t-1} + v_t$ we obtain $u_t = \sum_{r=0}^{\infty} \rho^r v_{t-r}$. Hence
$$E(u_t) = E\left(\sum \rho^r v_{t-r}\right) = \sum \rho^r E(v_{t-r})$$
But by the assumptions of the distribution of $v$ we have $E(v_{t-r}) = 0$. Therefore $E(u_t) = 0$.

II. The variance of the autocorrelated $u$'s

Since the $v$'s are serially independent, the cross-product terms vanish in expectation and
$$E(u_t^2) = \sum (\rho^r)^2 \operatorname{var}(v_{t-r}) = \sum \rho^{2r}\sigma_v^2 = \sigma_v^2\,[1 + \rho^2 + \rho^4 + \rho^6 + \dots]$$
The expression in brackets is the sum of a geometric progression of infinite terms, whose first term is unity and common ratio is $\rho^2$. Since $|\rho| < 1$, taking the sum of the geometric progression we have
$$E(u_t^2) = \sigma_v^2\left[\frac{1}{1-\rho^2}\right] \quad (\text{for } |\rho| < 1)$$

III. The covariance of the autocorrelated $u$'s

$$E(u_t u_{t-1}) = \rho\left(\sigma_v^2 + \rho^2\sigma_v^2 + \rho^4\sigma_v^2 + \dots\right) = \rho\left(\sigma_v^2\,[1 + \rho^2 + \rho^4 + \rho^6 + \dots]\right) = \frac{\rho\,\sigma_v^2}{1-\rho^2} \quad (\text{for } |\rho| < 1)$$
In general, for a lag of $s$ periods,
$$\operatorname{cov}(u_t, u_{t-s}) = \rho^s\,\frac{\sigma_v^2}{1-\rho^2} = \rho^s\sigma_u^2 \quad (\text{for } s \neq 0)$$
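These moment formulas can be checked numerically; a minimal simulation sketch, with illustrative values of $\rho$ and $\sigma_v$ chosen for the example:

```python
import numpy as np

rng = np.random.default_rng(1)
rho, sigma_v, n = 0.6, 1.0, 100_000  # assumed illustrative values

v = rng.normal(0.0, sigma_v, n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = rho * u[t - 1] + v[t]
u = u[100:]  # drop a short burn-in so the process is effectively stationary

sigma_u2 = sigma_v**2 / (1 - rho**2)   # E(u_t^2) = sigma_v^2 / (1 - rho^2)
cov1 = rho * sigma_u2                  # cov(u_t, u_(t-1)) = rho * sigma_u^2
cov2 = rho**2 * sigma_u2               # cov(u_t, u_(t-2)) = rho^2 * sigma_u^2

print(np.mean(u))                      # near 0
print(np.var(u), sigma_u2)
print(np.mean(u[1:] * u[:-1]), cov1)
print(np.mean(u[2:] * u[:-2]), cov2)
```

The sample mean, variance, and lagged covariances of the simulated $u$'s match $0$, $\sigma_v^2/(1-\rho^2)$, and $\rho^s\sigma_u^2$ closely.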
CONSEQUENCES OF AUTOCORRELATION
When the disturbance term exhibits serial correlation, the values as well as the standard errors of the parameter estimates are affected. In particular:
I. The estimates of the parameters remain statistically unbiased. In other words, even when the residuals are serially correlated, the OLS parameter estimates are unbiased, in the sense that their expected value is equal to the true parameters.
II. With autocorrelated values of the disturbance term the variances of the OLS parameter estimates are likely to be larger than those of other econometric methods.
III. The variance of the random term $u$ may be seriously underestimated if the $u$'s are autocorrelated. In particular, the underestimation of the variance of $u$ will be more serious in the case of positive autocorrelation of the error term ($u_t$) and of positively autocorrelated values of $X$ (in successive time periods).
Note that if $X$ is not autocorrelated, but is approximately random, then, even if $u$ is autocorrelated, the bias in $\operatorname{var}(u)$ and $\operatorname{var}(\hat{b}_i)$ is not likely to be serious.
IV. Finally, if the values of $u$ are autocorrelated, predictions based on OLS estimates will be inefficient, in the sense that they have a larger variance as compared with predictions based on estimates obtained from other econometric techniques.
The von Neumann ratio is defined as
$$\frac{\delta^2}{s_X^2} = \frac{\sum_{t=2}^{n}(X_t - X_{t-1})^2/(n-1)}{\sum (X_t - \bar{X})^2/n}$$
This is the ratio of the variance of the first differences of any variable $X$ over the variance of $X$. This ratio is applicable for directly observed series and for variables which are random, that is, variables whose successive values are not autocorrelated. In the case of the $u$'s the ratio becomes
$$\frac{\delta^2}{s^2} = \frac{\sum_{t=2}^{n}(e_t - e_{t-1})^2/(n-1)}{\sum (e_t - \bar{e})^2/n}$$
For large samples ($n > 30$) one might think that the von Neumann ratio could be applied approximately to the residuals (with $\bar{e} = 0$ by definition). However, this test is not appropriate for testing the autocorrelation of the $u$'s, especially if the sample is small ($n < 30$).
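The ratio itself is straightforward to compute; a minimal sketch, applied to a random (non-autocorrelated) series generated for illustration:

```python
import numpy as np

def von_neumann_ratio(x):
    """Variance of first differences over variance of x: delta^2 / s^2."""
    n = len(x)
    num = np.sum(np.diff(x) ** 2) / (n - 1)   # sum_{t=2}^n (x_t - x_(t-1))^2 / (n-1)
    den = np.sum((x - np.mean(x)) ** 2) / n   # sum (x_t - xbar)^2 / n
    return num / den

rng = np.random.default_rng(2)
x = rng.normal(size=1000)   # a random, non-autocorrelated series
vn = von_neumann_ratio(x)
print(vn)                   # close to 2 for a random series
```

For a random series the ratio is close to 2; positively autocorrelated series push it below 2, negatively autocorrelated series above.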
The Durbin-Watson Test
Durbin and Watson have suggested a test which is applicable to small samples. However, the test is applicable only for the first-order autoregressive scheme ($u_t = \rho u_{t-1} + v_t$). The test may be outlined as follows.
The null hypothesis is
$$H_0: \rho = 0, \text{ that is, the } u\text{'s are not autocorrelated with a first-order scheme}$$
against the alternative hypothesis
$$H_1: \rho \neq 0, \text{ that is, the } u\text{'s are autocorrelated with a first-order scheme}$$
To test the null hypothesis we use the Durbin-Watson statistic
$$d = \frac{\sum_{t=2}^{n}(e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}$$
The values of $d$ lie between 0 and 4, and when $d = 2$ we have $\rho = 0$; therefore testing $H_0: \rho = 0$ is equivalent to testing $H_0: d = 2$.
Expanding the $d$ statistic we obtain
$$d = \frac{\sum_{t=2}^{n} e_t^2 + \sum_{t=2}^{n} e_{t-1}^2 - 2\sum_{t=2}^{n} e_t e_{t-1}}{\sum_{t=1}^{n} e_t^2}$$
But for large samples the terms $\sum_{t=2}^{n} e_t^2$, $\sum_{t=2}^{n} e_{t-1}^2$ and $\sum_{t=1}^{n} e_t^2$ are approximately equal.
Therefore we may write
$$d \approx \frac{2\sum e_{t-1}^2}{\sum e_{t-1}^2} - \frac{2\sum e_t e_{t-1}}{\sum e_{t-1}^2} = 2\left(1 - \frac{\sum e_t e_{t-1}}{\sum e_{t-1}^2}\right)$$
But
$$\frac{\sum e_t e_{t-1}}{\sum e_{t-1}^2} = \hat{\rho}$$
Therefore
$$d \approx 2(1 - \hat{\rho}) \quad \text{or} \quad \hat{\rho} = 1 - \frac{d}{2}$$
From this expression it is obvious that the values of $d$ lie between 0 and 4.
Firstly, if there is no autocorrelation ($\hat{\rho} = 0$), then $d = 2$. Thus if from the sample data we find $d^* \approx 2$, we accept that there is no autocorrelation in the function.
Secondly, if $\hat{\rho} = +1$, then $d = 0$ and we have perfect positive autocorrelation. Therefore, if $0 < d^* < 2$ there is some degree of positive autocorrelation, which is stronger the closer $d^*$ is to zero.
Thirdly, if $\hat{\rho} = -1$, then $d = 4$ and we have perfect negative autocorrelation. Therefore, if $2 < d^* < 4$ there is some degree of negative autocorrelation, which is stronger the higher the value of $d^*$.
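The statistic and the approximation $d \approx 2(1 - \hat{\rho})$ can be verified numerically; a minimal sketch with an assumed AR(1) residual series:

```python
import numpy as np

def durbin_watson(e):
    """d = sum_{t=2}^n (e_t - e_(t-1))^2 / sum_{t=1}^n e_t^2."""
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Residuals following a first-order scheme (rho assumed for illustration)
rng = np.random.default_rng(3)
rho, n = 0.5, 5000
v = rng.normal(size=n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = rho * e[t - 1] + v[t]

d = durbin_watson(e)
rho_hat = np.sum(e[1:] * e[:-1]) / np.sum(e[:-1] ** 2)
print(d, 2 * (1 - rho_hat))   # the two values nearly coincide
```

With $\rho = 0.5$ the statistic comes out near $2(1 - 0.5) = 1$, and the identity $d \approx 2(1 - \hat{\rho})$ holds to high accuracy in a sample of this size.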
It should be clear that in the Durbin-Watson test the null hypothesis of zero autocorrelation ($\rho = 0$) is tested indirectly, by testing the equivalent hypothesis $d = 2$.
The next step is to use the sample residuals (the $e_t$'s) and compute the empirical value of the Durbin-Watson statistic, $d^*$. Finally, the empirical $d^*$ must be compared with the theoretical values of $d$, that is, the values of $d$ which define the critical region of the test.
The problem with this test is that the exact distribution of $d$ is not known. However, Durbin and Watson have established upper ($d_U$) and lower ($d_L$) limits for the significance levels of $d$ which are appropriate to test the hypothesis of zero first-order autocorrelation against the alternative hypothesis of positive first-order autocorrelation.
The test itself compares the empirical $d^*$ value, calculated from the regression residuals, with the $d_L$ and $d_U$ values in the Durbin-Watson tables and with their transforms $(4 - d_L)$ and $(4 - d_U)$. The comparison using $d_L$ and $d_U$ investigates the possibility of positive autocorrelation, while the comparison with $(4 - d_L)$ and $(4 - d_U)$ investigates the possibility of negative autocorrelation:
I. If $d^* < d_L$ we reject the null hypothesis of no autocorrelation and accept that there is positive autocorrelation of the first order.
II. If $d^* > (4 - d_L)$ we reject the null hypothesis of no autocorrelation and accept that there is negative autocorrelation of the first order.
III. If $d_U < d^* < (4 - d_U)$ we accept the null hypothesis of no autocorrelation.
IV. If $d_L < d^* < d_U$ or $(4 - d_U) < d^* < (4 - d_L)$ the test is inconclusive.
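The four decision rules can be collected into a small helper function (a sketch, using the illustrative 5% limits $d_L = 1.20$, $d_U = 1.41$ for $n = 20$, $k = 1$):

```python
def dw_decision(d_star, d_L, d_U):
    """Apply the four Durbin-Watson decision rules to an empirical d*."""
    if d_star < d_L:
        return "reject H0: positive first-order autocorrelation"
    if d_star > 4 - d_L:
        return "reject H0: negative first-order autocorrelation"
    if d_U < d_star < 4 - d_U:
        return "accept H0: no autocorrelation"
    return "inconclusive"

# Tabulated limits at the 5% level for n = 20, k = 1: d_L = 1.20, d_U = 1.41
print(dw_decision(0.937, 1.20, 1.41))  # positive autocorrelation
print(dw_decision(2.00, 1.20, 1.41))   # no autocorrelation
print(dw_decision(1.30, 1.20, 1.41))   # inconclusive
```

A value such as 1.30 falls between $d_L$ and $d_U$, which is precisely the inconclusive region of the test.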
An alternative test for autocorrelation is to regress the residuals on their own lagged values:
$$e_t = \hat{\rho}_1 e_{t-1} + v_t, \qquad e_t = \hat{\rho}_1 e_{t-1} + \hat{\rho}_2 e_{t-2} + v_t, \qquad \text{and so on.}$$
Autocorrelation is judged in the light of the statistical significance of the $\hat{\rho}$'s and the overall fit of the above regressions. That is, we may carry out any one of the standard tests of statistical significance for the estimates ($\hat{\rho}$'s) of the autocorrelation relationship (e.g. $t = \hat{\rho}_i / s(\hat{\rho}_i)$) as well as the $F$-test for the overall significance of the regression.
Consider the model
$$Y_t = b_0 + b_1 X_t + u_t$$
where $u_t = \rho u_{t-1} + v_t$, with $v_t$ satisfying all the usual assumptions of a random variable. Assuming $\rho = 1$,
$$u_t = u_{t-1} + v_t \quad \text{or} \quad v_t = u_t - u_{t-1}$$
Now, lagging the original equation by one period and pre-multiplying by $\rho$ we obtain
$$\rho Y_{t-1} = \rho b_0 + \rho b_1 X_{t-1} + \rho u_{t-1}$$
Subtracting this from the original equation gives the transformed model
$$(Y_t - \rho Y_{t-1}) = b_0(1 - \rho) + b_1(X_t - \rho X_{t-1}) + v_t$$
whose disturbance $v_t$ is not autocorrelated. To apply this transformation we need an estimate of $\rho$. One such estimate is obtained from the Durbin-Watson statistic, since
$$d \approx 2(1 - \rho)$$
From the application of the Durbin-Watson test we obtain $d^*$, which we may substitute in the above expression and get
$$\hat{\rho} = 1 - \frac{1}{2} d^*$$
If the sample is small this estimate of $\rho$ will not be accurate, since the relationship $d \approx 2(1 - \rho)$ holds asymptotically (for large samples).
Step 1: Apply OLS to the original data and obtain estimates of the coefficients $b_0$ and $b_1$:
$$\hat{Y}_t = \hat{b}_0 + \hat{b}_1 X_t$$
From this regression compute the residuals $e_t = Y_t - \hat{b}_0 - \hat{b}_1 X_t$ and obtain a first estimate of $\rho$:
$$\hat{\rho} = \frac{\sum e_t e_{t-1}}{\sum e_{t-1}^2} \quad (t = 2, 3, \dots, n)$$
Step 2: Use $\hat{\rho}$ to transform the original data and apply OLS to the model
$$(Y_t - \hat{\rho} Y_{t-1}) = b_0(1 - \hat{\rho}) + b_1(X_t - \hat{\rho} X_{t-1}) + u_t^*$$
Denote the estimates of this 'second round' by $\hat{\hat{b}}_0$ and $\hat{\hat{b}}_1$ (where $\hat{\hat{b}}_0$ is an estimate of the intercept $b_0(1 - \hat{\rho})$).
Using $\hat{\hat{b}}_0$ and $\hat{\hat{b}}_1$, compute the 'second-round' residuals $\hat{\hat{e}}_t = Y_t - \hat{\hat{b}}_0 - \hat{\hat{b}}_1 X_t$ $(t = 1, 2, \dots, n)$, and from these obtain the 'second-round' estimate of $\rho$:
$$\hat{\hat{\rho}} = \frac{\sum \hat{\hat{e}}_t \hat{\hat{e}}_{t-1}}{\sum \hat{\hat{e}}_{t-1}^2} \quad (t = 2, 3, \dots, n)$$
Step 3: Use $\hat{\hat{\rho}}$ to transform the original variables and apply OLS to the model
$$(Y_t - \hat{\hat{\rho}} Y_{t-1}) = b_0(1 - \hat{\hat{\rho}}) + b_1(X_t - \hat{\hat{\rho}} X_{t-1}) + u_t^*$$
We obtain the 'third-round' estimates, which yield the 'third-round' residuals $(t = 1, 2, \dots, n)$; from these we obtain the 'third-round' estimate of $\rho$, computed from the new residuals in the same way:
$$\frac{\sum e_t e_{t-1}}{\sum e_{t-1}^2} \quad (t = 2, 3, \dots, n)$$
This iterative procedure is repeated until the value of the estimate of $\rho$ converges. Some researchers stop at the 'second-round' estimates, $\hat{\hat{b}}_0$ and $\hat{\hat{b}}_1$. This is then called the two-stage Cochrane-Orcutt method.
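The iterative procedure of Steps 1 to 3 may be sketched as follows, on simulated data with assumed parameters (true $b_0 = 2$, $b_1 = 0.5$, $\rho = 0.6$):

```python
import numpy as np

def ols(y, x):
    """Simple OLS of y on a constant and x; returns (intercept, slope)."""
    X = np.column_stack([np.ones_like(x), x])
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    return b[0], b[1]

def cochrane_orcutt(y, x, tol=1e-6, max_iter=100):
    """Iterate: OLS residuals -> rho_hat -> quasi-differenced OLS -> ..."""
    b0, b1 = ols(y, x)            # Step 1: OLS on the original data
    rho = 0.0
    for _ in range(max_iter):
        e = y - b0 - b1 * x
        rho_new = np.sum(e[1:] * e[:-1]) / np.sum(e[:-1] ** 2)
        # Step 2: (Y_t - rho*Y_(t-1)) = b0*(1-rho) + b1*(X_t - rho*X_(t-1)) + v_t
        y_star = y[1:] - rho_new * y[:-1]
        x_star = x[1:] - rho_new * x[:-1]
        a0, b1 = ols(y_star, x_star)
        b0 = a0 / (1 - rho_new)   # recover the original intercept
        if abs(rho_new - rho) < tol:
            break
        rho = rho_new
    return b0, b1, rho_new

# Simulated data with first-order autocorrelated disturbances
rng = np.random.default_rng(4)
n = 500
x = rng.normal(size=n).cumsum()
v = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.6 * u[t - 1] + v[t]
y = 2.0 + 0.5 * x + u

b0, b1, rho = cochrane_orcutt(y, x)
print(b0, b1, rho)   # close to 2.0, 0.5, 0.6
```

The iterations typically converge in a handful of rounds; stopping after the first pass through the loop corresponds to the two-stage variant described above.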
An alternative approach is to use at each step of the iteration (for the first-order autoregressive scheme) the Durbin-Watson $d$ statistic to test the residuals for autocorrelation. If they pass the test of zero autocorrelation, the iterations stop. If not, the iterations proceed until the hypothesis of zero autocorrelation is accepted. Tests of significance of the $\hat{b}$'s are conducted only at the final iteration.
Setting $b_0(1 - \rho) = a_0$, $b_1 = a_1$, $-b_1\rho = a_2$, etc., we may write the equation in the following form:
$$Y_t = a_0 + \rho Y_{t-1} + a_1 X_{1t} + a_2 X_{1(t-1)} + \dots + v_t$$
Applying least squares to this equation we obtain an estimate of $\rho$, $\hat{\rho}$, which is the coefficient of the lagged variable $Y_{t-1}$.
In the second step we use $\hat{\rho}$ to transform the original variables,
$$(Y_t - \hat{\rho} Y_{t-1}) = Y_t^*, \qquad (X_{kt} - \hat{\rho} X_{k(t-1)}) = X_{kt}^*$$
which we use in order to estimate the parameters of the original relationship
$$Y_t^* = b_0^* + b_1 X_1^* + b_2 X_2^* + \dots + b_k X_k^* + v_t$$
Durbin’s method provides estimates which have optimal asymptotic properties and
are more efficient for samples of all sizes.
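Durbin's two-step method for a single explanatory variable may be sketched as follows, again on simulated data with assumed parameters (true $b_1 = 0.5$, $\rho = 0.6$):

```python
import numpy as np

def durbin_two_step(y, x):
    """Step 1: OLS of Y_t on Y_(t-1), X_t, X_(t-1); the coefficient of
    Y_(t-1) estimates rho.  Step 2: OLS on the transformed variables."""
    # Step 1: Y_t = a0 + rho*Y_(t-1) + a1*X_t + a2*X_(t-1) + v_t
    Z = np.column_stack([np.ones(len(y) - 1), y[:-1], x[1:], x[:-1]])
    coef = np.linalg.lstsq(Z, y[1:], rcond=None)[0]
    rho = coef[1]
    # Step 2: regress (Y_t - rho*Y_(t-1)) on (X_t - rho*X_(t-1))
    y_star = y[1:] - rho * y[:-1]
    x_star = x[1:] - rho * x[:-1]
    X = np.column_stack([np.ones_like(x_star), x_star])
    a0, b1 = np.linalg.lstsq(X, y_star, rcond=None)[0]
    b0 = a0 / (1 - rho)   # intercept of the transformed model is b0*(1 - rho)
    return b0, b1, rho

rng = np.random.default_rng(5)
n = 500
x = rng.normal(size=n).cumsum()
v = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.6 * u[t - 1] + v[t]
y = 2.0 + 0.5 * x + u

b0_hat, b1_hat, rho_hat = durbin_two_step(y, x)
print(b0_hat, b1_hat, rho_hat)
```

Unlike Cochrane-Orcutt, no iteration is needed: $\hat{\rho}$ comes directly from the first regression and is used once to quasi-difference the data.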
Example: Table below includes data on imports and the gross national product of the UK.
Applying OLS to these observations we obtain the following imports function
$$\hat{z}_t = -2{,}461 + 0.2795\, X_t \qquad r^2 = 0.983$$
$$s_{(\hat{b}_i)}: \quad (250) \qquad (0.01)$$
$$d^* = \frac{\sum (e_t - e_{t-1})^2}{\sum e_t^2} = \frac{537220.22}{573071.44} = 0.937$$
From the Durbin-Watson table, at the 5% level of significance with $n = 20$ observations and $k = 1$ independent variable, the significance points are
$$d_L = 1.20 \qquad d_U = 1.41$$
Since $d^* < d_L$, we conclude that there is positive autocorrelation in the imports function. We will use two of the previously developed corrective procedures, namely an alternative test for autocorrelation and Durbin's two-step method.
Regressing the residuals on their own lagged values gives
$$\hat{e}_t = 0.53\, e_{t-1} \qquad \text{(s.e. } 0.26\text{)}$$
$$\hat{e}_t = 0.55\, e_{t-1} - 0.21\, e_{t-2} \qquad \text{(s.e. } 0.29, \ 0.29\text{)}$$
One could experiment with other forms of autocorrelation, but we limit our example to the first-order and second-order autoregressive schemes.
The above regressions indicate a first-order autoregressive scheme, since $\hat{\rho}_1$ is just significant but $\hat{\rho}_2$ is not significant at the 5% level.
Using the estimate $\hat{\rho} = 0.53$ we obtain the transformed variables, to which OLS is then applied.
Exercise:
The data in the following table are the OLS residuals of a certain relationship ($Y = b_0 + b_1 X + u$). Calculate $d$ and estimate $\rho$ with any two of the methods discussed. Do your results support the use of first differences for the estimation of $b_0$ and $b_1$?
Year   Residual e_i    Year   Residual e_i    Year   Residual e_i    Year   Residual e_i
1950    1.0            1955   -0.3            1960   -4.6            1965   -2.6
1951   -1.5            1956   -3.1            1961   -4.3            1966   -2.3
1952   -0.7            1957   -5.5            1962    1.9            1967   -0.9
1953   -1.3            1958   -4.7            1963    1.9            1968    1.4
1954   -4.6            1959   -1.3            1964    2.9            1969    3.7
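For checking one's work, the statistic $d = \sum(e_t - e_{t-1})^2 / \sum e_t^2$ and the two estimates of $\rho$ discussed in the chapter can be computed directly from the tabulated residuals; a minimal sketch:

```python
import numpy as np

# OLS residuals from the exercise table, in year order 1950-1969
e = np.array([ 1.0, -1.5, -0.7, -1.3, -4.6,
              -0.3, -3.1, -5.5, -4.7, -1.3,
              -4.6, -4.3,  1.9,  1.9,  2.9,
              -2.6, -2.3, -0.9,  1.4,  3.7])

d = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)            # Durbin-Watson d
rho_hat_dw = 1 - d / 2                                   # rho_hat = 1 - d/2
rho_hat_reg = np.sum(e[1:] * e[:-1]) / np.sum(e[:-1] ** 2)  # regression of e_t on e_(t-1)
print(d, rho_hat_dw, rho_hat_reg)
```

Both estimates of $\rho$ land well below unity, which bears on the exercise's question about whether first differences (the $\rho = 1$ transformation) are appropriate here.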