Univariate Time Series Modelling and Forecasting: Introductory Econometrics For Finance' © Chris Brooks 2002 1
• So if the process is covariance stationary, all the variances are the same and all the
covariances depend on the difference between t1 and t2. The moments
γ_s = E[(y_t − E(y_t))(y_{t-s} − E(y_{t-s}))], s = 0, 1, 2, ...
are known as the covariance function.
• The covariances, γ_s, are known as autocovariances.
• However, the value of the autocovariances depends on the units of measurement of y_t.
• It is thus more convenient to use the autocorrelations, which are the autocovariances
normalised by dividing by the variance:
τ_s = γ_s / γ_0 , s = 0, 1, 2, ...
• If we plot τ_s against s = 0, 1, 2, ... we obtain the autocorrelation function or
correlogram.
• We can also test the joint hypothesis that all m of the τ_k correlation coefficients are
simultaneously equal to zero using the Q-statistic developed by Box and Pierce:
Q = T Σ_{k=1}^{m} τ_k²
where T = sample size, m = maximum lag length
• The Q-statistic is asymptotically distributed as a χ²_m.
• However, the Box-Pierce test has poor small sample properties, so a variant
has been developed, called the Ljung-Box statistic:
Q* = T(T + 2) Σ_{k=1}^{m} τ_k² / (T − k) ~ χ²_m
• This statistic is very useful as a portmanteau (general) test of linear dependence in
time series.
• Question:
Suppose that a researcher had estimated the first 5 autocorrelation
coefficients using a series of length 100 observations, and found them to be
(from 1 to 5): 0.207, -0.013, 0.086, 0.005, -0.022.
Test each of the individual coefficients for significance, and use both the Box-
Pierce and Ljung-Box tests to establish whether they are jointly significant.
• Solution:
A coefficient would be significant if it lies outside (−0.196, +0.196) at the 5%
level (i.e. ±1.96/√100), so only the first autocorrelation coefficient is significant.
Q = 5.09 and Q* = 5.26
These are compared with a tabulated χ²(5) = 11.1 at the 5% level, so the 5
coefficients are jointly insignificant.
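The two statistics above can be checked with a short calculation; the numbers are those given in the question (T = 100 and the five sample autocorrelations):

```python
# Box-Pierce and Ljung-Box statistics for the worked example above.
tau = [0.207, -0.013, 0.086, 0.005, -0.022]
T = 100

# Box-Pierce: Q = T * sum of squared autocorrelations
Q = T * sum(t_k ** 2 for t_k in tau)

# Ljung-Box: Q* = T(T+2) * sum of tau_k^2 / (T - k)
Q_star = T * (T + 2) * sum(t_k ** 2 / (T - k)
                           for k, t_k in enumerate(tau, start=1))

print(round(Q, 2))       # 5.09
print(round(Q_star, 2))  # 5.26
```

Both statistics fall well short of the 5% critical value χ²(5) = 11.1, confirming joint insignificance.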
γ_1 = (θ1 + θ1θ2)σ²
γ_2 = E[X_t − E(X_t)][X_{t-2} − E(X_{t-2})]
= E[X_t X_{t-2}]
= E[(u_t + θ1 u_{t-1} + θ2 u_{t-2})(u_{t-2} + θ1 u_{t-3} + θ2 u_{t-4})]
= E[θ2 u_{t-2}²]
= θ2 σ²
γ_3 = E[X_t − E(X_t)][X_{t-3} − E(X_{t-3})]
= E[X_t X_{t-3}]
= E[(u_t + θ1 u_{t-1} + θ2 u_{t-2})(u_{t-3} + θ1 u_{t-4} + θ2 u_{t-5})]
= 0
So γ_s = 0 for s > 2.
(iii) For θ1 = −0.5 and θ2 = 0.25, substituting these into the formulae above
gives τ1 = −0.476, τ2 = 0.190.
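A quick check of these figures, using the standard MA(2) autocovariances (γ0 = (1 + θ1² + θ2²)σ², γ1 = (θ1 + θ1θ2)σ², γ2 = θ2σ²; σ² cancels in the ratios):

```python
# Theoretical acf of an MA(2): y_t = u_t + theta1*u_{t-1} + theta2*u_{t-2}.
theta1, theta2 = -0.5, 0.25

gamma0 = 1 + theta1 ** 2 + theta2 ** 2   # variance, divided through by sigma^2
gamma1 = theta1 + theta1 * theta2
gamma2 = theta2

tau1 = gamma1 / gamma0
tau2 = gamma2 / gamma0
print(round(tau1, 3), round(tau2, 3))  # -0.476 0.19
```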
[Figure: ACF plot of the autocorrelation function against lag s = 0 to 6]
y_t = μ + φ1 y_{t-1} + φ2 y_{t-2} + ... + φp y_{t-p} + u_t
• Or using the lag operator notation:
L y_t = y_{t-1}    L^i y_t = y_{t-i}
y_t = μ + Σ_{i=1}^{p} φ_i y_{t-i} + u_t
• or y_t = μ + Σ_{i=1}^{p} φ_i L^i y_t + u_t
• Wold's decomposition theorem states that any stationary series can be decomposed into the sum of two
unrelated processes, a purely deterministic part and a purely stochastic
part, which will be an MA(∞).
• For the AR(p) model, φ(L) y_t = u_t, ignoring the intercept, the Wold
decomposition is
y_t = ψ(L) u_t
where
ψ(L) = (1 − φ1 L − φ2 L² − ... − φp L^p)^{-1}
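The inversion of φ(L) can be carried out numerically by equating coefficients of powers of L, giving the recursion ψ_0 = 1, ψ_j = Σ_i φ_i ψ_{j-i}. A minimal sketch (the function name is our own):

```python
# Sketch: MA(infinity) weights psi_j of a stationary AR(p), obtained by
# recursively inverting phi(L) = 1 - phi_1 L - ... - phi_p L^p.
def wold_weights(phi, n):
    """Return psi_0, ..., psi_{n-1} with psi(L) = phi(L)^(-1)."""
    psi = [1.0]
    for j in range(1, n):
        psi.append(sum(phi[i - 1] * psi[j - i]
                       for i in range(1, min(j, len(phi)) + 1)))
    return psi

# For an AR(1) with phi_1 = 0.5, the weights are psi_j = 0.5**j:
print(wold_weights([0.5], 5))  # [1.0, 0.5, 0.25, 0.125, 0.0625]
```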
y_t = u_t + φ1 u_{t-1} + φ1² u_{t-2} + ...
γ_0 = E[y_t y_t]
= E[(u_t + φ1 u_{t-1} + φ1² u_{t-2} + ...)(u_t + φ1 u_{t-1} + φ1² u_{t-2} + ...)]
= E[u_t² + φ1² u_{t-1}² + φ1⁴ u_{t-2}² + ... + cross products]
= σ_u² + φ1² σ_u² + φ1⁴ σ_u² + ...
= σ_u² (1 + φ1² + φ1⁴ + ...)
= σ_u² / (1 − φ1²)
Solution (cont’d)
(iii) Turning now to calculating the acf, first calculate the autocovariances:
γ_1 = Cov(y_t, y_{t-1}) = E[y_t − E(y_t)][y_{t-1} − E(y_{t-1})]
Since a0 has been set to zero, E(y_t) = 0 and E(y_{t-1}) = 0, so
γ_1 = E[y_t y_{t-1}]
= E[(u_t + φ1 u_{t-1} + φ1² u_{t-2} + ...)(u_{t-1} + φ1 u_{t-2} + φ1² u_{t-3} + ...)]
= E[φ1 u_{t-1}² + φ1³ u_{t-2}² + φ1⁵ u_{t-3}² + ... + cross products]
= φ1 σ_u² + φ1³ σ_u² + φ1⁵ σ_u² + ...
= φ1 σ_u² (1 + φ1² + φ1⁴ + ...)
= φ1 σ_u² / (1 − φ1²)
γ_2 = E[y_t y_{t-2}]
= E[(u_t + φ1 u_{t-1} + φ1² u_{t-2} + ...)(u_{t-2} + φ1 u_{t-3} + φ1² u_{t-4} + ...)]
= E[φ1² u_{t-2}² + φ1⁴ u_{t-3}² + ... + cross products]
= φ1² σ_u² (1 + φ1² + φ1⁴ + ...)
= φ1² σ_u² / (1 − φ1²)
• If these steps were repeated for γ_3, the following expression would be
obtained
γ_3 = φ1³ σ_u² / (1 − φ1²)
and for any lag s, the autocovariance would be given by
γ_s = φ1^s σ_u² / (1 − φ1²)
τ_0 = γ_0 / γ_0 = 1
τ_1 = γ_1 / γ_0 = [φ1 σ_u² / (1 − φ1²)] ÷ [σ_u² / (1 − φ1²)] = φ1
τ_2 = γ_2 / γ_0 = [φ1² σ_u² / (1 − φ1²)] ÷ [σ_u² / (1 − φ1²)] = φ1²
τ_3 = φ1³
…
τ_s = φ1^s
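A quick numerical check of this result, with an illustrative φ1 = −0.5 and σ_u² = 1:

```python
# Check: for an AR(1), gamma_s = phi1**s * sigma2 / (1 - phi1**2),
# so tau_s = gamma_s / gamma_0 = phi1**s.
phi1, sigma2 = -0.5, 1.0

gamma = [phi1 ** s * sigma2 / (1 - phi1 ** 2) for s in range(4)]
taus = [g / gamma[0] for g in gamma]
# taus equals [phi1**s for s in range(4)] up to rounding
print([round(t, 3) for t in taus])  # [1.0, -0.5, 0.25, -0.125]
```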
• Measures the correlation between an observation k periods ago and the current
observation, after controlling for observations at intermediate lags (i.e. all lags
< k).
• So φ_kk measures the correlation between y_t and y_{t-k} after removing the effects of
y_{t-k+1}, y_{t-k+2}, …, y_{t-1}.
• At lag 1, the acf = pacf always
• The pacf is useful for telling the difference between an AR process and an
ARMA process.
• In the case of an AR(p), there are direct connections between y_t and y_{t-s} only
for s ≤ p.
• In the case of an MA(q), this can be written as an AR(∞), so there are direct
connections between y_t and all its previous values.
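Sample pacf values can be computed from the acf with the standard Durbin-Levinson recursion. The sketch below (function name is our own) confirms that an AR(1)-style acf of the form τ_s = φ^s produces a pacf that cuts off after lag 1:

```python
# Sketch: partial autocorrelations from a given acf via the
# Durbin-Levinson recursion.
def pacf_from_acf(tau):
    """tau = [tau_1, tau_2, ...]; returns [phi_11, phi_22, ...]."""
    pacf, phi_prev = [], []
    for k in range(1, len(tau) + 1):
        if k == 1:
            phi_kk = tau[0]
            phi_new = [phi_kk]
        else:
            num = tau[k - 1] - sum(phi_prev[j] * tau[k - 2 - j]
                                   for j in range(k - 1))
            den = 1 - sum(phi_prev[j] * tau[j] for j in range(k - 1))
            phi_kk = num / den
            # update the intermediate AR coefficients
            phi_new = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                       for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
        phi_prev = phi_new
    return pacf

# acf of an AR(1) with phi = 0.5: pacf is phi at lag 1, zero thereafter
print(pacf_from_acf([0.5, 0.25, 0.125]))  # [0.5, 0.0, 0.0]
```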
where φ(L) = 1 − φ1 L − φ2 L² − ... − φp L^p
or y_t = μ + φ1 y_{t-1} + φ2 y_{t-2} + ... + φp y_{t-p} + θ1 u_{t-1} + θ2 u_{t-2} + ... + θq u_{t-q} + u_t
with E(u_t) = 0; E(u_t²) = σ²; E(u_t u_s) = 0, t ≠ s
[Figures: seven sample acf and pacf plots against lags 1 to 10, each panel showing the acf and pacf of a different example process]
• Box and Jenkins (1970) were the first to approach the task of estimating an
ARMA model in a systematic manner. There are 3 steps to their approach:
1. Identification
2. Estimation
3. Model diagnostic checking
Step 1:
- Involves determining the order of the model
- Use of graphical procedures (e.g. plotting the acf and pacf)
- A better procedure is now available (information criteria)
Step 2:
- Estimation of the parameters
- Can be done using least squares or maximum likelihood, depending on the model
Step 3:
- Model checking
• Reasons for preferring a parsimonious model:
- the variance of estimators is inversely proportional to the number of degrees of
freedom.
- models which are profligate might be inclined to fit sample-specific features of
the data that would not be replicated out of sample.
• This gives motivation for using information criteria, which embody 2 factors
- a term which is a function of the RSS
- some penalty for adding extra parameters
• The object is to choose the number of parameters which minimises the
information criterion.
• The information criteria vary according to how stiff the penalty term is.
• The three most popular criteria are Akaike’s (1974) information criterion
(AIC), Schwarz’s (1978) Bayesian information criterion (SBIC), and the
Hannan-Quinn criterion (HQIC).
AIC = ln(σ̂²) + 2k/T
SBIC = ln(σ̂²) + (k/T) ln T
HQIC = ln(σ̂²) + (2k/T) ln(ln(T))
where k = p + q + 1, T = sample size. So we minimise the chosen IC subject to
p ≤ p̄, q ≤ q̄.
• SBIC embodies a stiffer penalty term than AIC.
• Which IC should be preferred if they suggest different model orders?
– SBIC is strongly consistent (but inefficient).
– AIC is not consistent, and will typically pick “bigger” models.
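A sketch of order selection by information criteria. The residual variances attached to the candidate (p, q) orders below are hypothetical numbers, chosen only to illustrate that AIC can favour a larger model than SBIC or HQIC:

```python
# Sketch: choosing ARMA order by minimising an information criterion.
import math

def aic(sigma2_hat, k, T):
    return math.log(sigma2_hat) + 2 * k / T

def sbic(sigma2_hat, k, T):
    return math.log(sigma2_hat) + k * math.log(T) / T

def hqic(sigma2_hat, k, T):
    return math.log(sigma2_hat) + 2 * k * math.log(math.log(T)) / T

T = 200
# (p, q): hypothetical residual variance from each fitted model
candidates = {(1, 0): 1.10, (1, 1): 1.02, (2, 2): 0.99}

for name, ic in [("AIC", aic), ("SBIC", sbic), ("HQIC", hqic)]:
    # k = p + q + 1 parameters (including the intercept)
    best = min(candidates,
               key=lambda pq: ic(candidates[pq], pq[0] + pq[1] + 1, T))
    print(name, best)
```

With these illustrative numbers AIC selects the "bigger" (2, 2) model, while the stiffer penalties of SBIC and HQIC select (1, 1).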
S_t = α Σ_{i=0}^{∞} (1 − α)^i y_{t-i}
• Since 0 < α < 1, the effect of each observation declines exponentially as we move
another observation forward in time.
• Forecasts are generated by
f_{t+s} = S_t
for all steps into the future s = 1, 2, ...
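A minimal sketch of this forecasting rule, using the equivalent recursive form of the smoother, S_t = αy_t + (1 − α)S_{t-1}; the data and α are illustrative:

```python
# Sketch of simple exponential smoothing; the forecast for every
# horizon s is the final smoothed value S_t.
def exp_smooth_forecast(y, alpha):
    s = y[0]                                # initialise with first observation
    for obs in y[1:]:
        s = alpha * obs + (1 - alpha) * s   # S_t = a*y_t + (1-a)*S_{t-1}
    return s                                # f_{t+s} = S_t for all s

y = [1.0, 2.0, 3.0, 4.0]
print(exp_smooth_forecast(y, alpha=0.5))  # 3.125
```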
• Forecasting = prediction.
• An important test of the adequacy of a model.
e.g.
- Forecasting tomorrow’s return on a particular share
- Forecasting the price of a house given its characteristics
- Forecasting the riskiness of a portfolio over the next year
- Forecasting the volatility of bond returns
• The distinction between the two types is somewhat blurred (e.g., VARs).
• A good test of the model since we have not used the information from
1999M1 onwards when we estimated the model parameters.
• Structural models
e.g. y = Xβ + u
y_t = β1 + β2 x_{2t} + ... + βk x_{kt} + u_t
• To forecast y, we require the conditional expectation of its future value:
E(y_t) = E(β1 + β2 x_{2t} + ... + βk x_{kt} + u_t)
= β1 + β2 E(x_{2t}) + ... + βk E(x_{kt})
• But what are E(x_{2t}) etc.? We could use x̄_2, so
E(y_t) = β1 + β2 x̄_2 + ... + βk x̄_k
= ȳ !!
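This last point can be verified numerically: for an OLS regression with an intercept, evaluating the fitted equation at the sample means of the regressors returns exactly the sample mean of y (the data below are arbitrary illustrative numbers):

```python
# Check: with an intercept, the OLS fitted line passes through the
# point of means, so forecasting at x-bar just gives y-bar.
import numpy as np

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=50), rng.normal(size=50)])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=50)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
x_bar = X.mean(axis=0)                          # [1, x2_bar, x3_bar]
print(np.isclose(x_bar @ beta_hat, y.mean()))   # True
```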
• Models include:
• simple unweighted averages
• exponentially weighted averages
• ARIMA models
• Non-linear models – e.g. threshold models, GARCH, bilinear models, etc.
• For example, say we predict that tomorrow's return on the FTSE will be 0.2, but
the outcome is actually -0.4. Is this accurate? Define ft,s as the forecast made at time t
for s steps ahead (i.e. the forecast made for time t+s), and yt+s as the realised value of y
at time t+s.
• Some of the most popular criteria for assessing the accuracy of time series
forecasting techniques are:
MSE = (1/N) Σ_{t=1}^{N} (y_{t+s} − f_{t,s})²
• MAE is given by
MAE = (1/N) Σ_{t=1}^{N} |y_{t+s} − f_{t,s}|
• Mean absolute percentage error:
MAPE = (100/N) Σ_{t=1}^{N} |(y_{t+s} − f_{t,s}) / y_{t+s}|
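A sketch of the three measures; the forecast and actual values below are illustrative, not taken from the text:

```python
# MSE, MAE and MAPE for a set of forecasts against realised values.
def mse(actual, forecast):
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def mae(actual, forecast):
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def mape(actual, forecast):
    return 100 * sum(abs((a - f) / a)
                     for a, f in zip(actual, forecast)) / len(actual)

actual   = [0.5, -0.2, 0.4]
forecast = [0.3, -0.4, 0.1]
print(round(mse(actual, forecast), 4))  # 0.0567
```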
How can we test whether a forecast is accurate or not?
(cont’d)
• It has, however, also recently been shown (Gerlow et al., 1993) that the
accuracy of forecasts according to traditional statistical criteria is not
related to trading profitability.
• A measure more closely correlated with profitability:
% correct sign predictions = (1/N) Σ_{t=1}^{N} z_{t+s}
where z_{t+s} = 1 if (y_{t+s} · f_{t,s}) > 0, and z_{t+s} = 0 otherwise.
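A sketch of this measure, following the usual convention that z_{t+s} = 1 when the forecast and the outcome share the same sign (data are illustrative):

```python
# Percentage of correct sign predictions.
def pct_correct_signs(actual, forecast):
    z = [1 if a * f > 0 else 0 for a, f in zip(actual, forecast)]
    return 100 * sum(z) / len(z)

actual   = [0.5, -0.2, 0.4, -0.1]
forecast = [0.3, 0.1, 0.2, -0.3]
print(pct_correct_signs(actual, forecast))  # 75.0
```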
• Given the following forecast and actual values, calculate the MSE, MAE
and percentage of correct sign predictions: