ARIMA: Autoregressive Integrated Moving Average
Introduction - ARMA
ARMA - Autoregressive Moving Average
Introduced by Box and Jenkins in 1976.
Also known as the Box-Jenkins model.
Used to develop a model that forecasts a
variable based on its own historical values.
For example, the exchange rate at time t can
be forecast from its values at time t-2
and time t-5 plus stochastic error terms.
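The exchange-rate example can be written out as an equation. The symbols below (y_t for the rate, c for a constant, phi and theta for coefficients, epsilon_t for the stochastic error) are notation assumed here for illustration and do not appear in the original slides. The first line is the example with lags 2 and 5 only; the second is the general ARMA(p, q) form, in which the moving average part adds lagged error terms.

```latex
y_t = c + \phi_2\, y_{t-2} + \phi_5\, y_{t-5} + \varepsilon_t

y_t = c + \sum_{i=1}^{p} \phi_i\, y_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j\, \varepsilon_{t-j}
```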
Two Common Processes
Autoregressive process.
Moving average process.
Assumptions
No Uncontrolled Autocorrelation
Autocorrelation means that the value of a
given datum is largely determined by the
value of the preceding datum in the series.
The assumption is tested with the Durbin-Watson
coefficient, which ranges from 0 to 4. A value
near 2 indicates no autocorrelation, a value
near 0 indicates positive autocorrelation,
and a value near 4 indicates negative autocorrelation.
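A minimal sketch of how the Durbin-Watson coefficient is computed from a list of residuals, using only the Python standard library. The two residual series (`alternating` and `smooth`) are made-up illustrations, not data from the slides.

```python
import math

def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    divided by the sum of squared residuals. Always lies in [0, 4]."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den

# Hypothetical residuals that flip sign every period (negative autocorrelation).
alternating = [1.0, -1.0] * 50
# Hypothetical residuals that drift slowly (positive autocorrelation).
smooth = [math.sin(t / 20) for t in range(200)]

dw_alternating = durbin_watson(alternating)  # close to 4
dw_smooth = durbin_watson(smooth)            # close to 0
print(dw_alternating, dw_smooth)
```

Values near 2 would be expected from residuals with no first-order autocorrelation.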
Non-Arbitrary Model Lag Order
The researcher must have a theoretical basis
to establish the face validity of the order of
the model.
No Outliers
As in other forms of regression, outliers may
affect conclusions strongly and misleadingly.
Randomly Distributed Shocks
If shocks are present in the time series, they
are assumed to be randomly distributed with
a mean of 0 and a constant variance.
Uncorrelated Random Errors
Residuals are randomly and normally
distributed, have non-significant
autocorrelations and partial autocorrelations,
and have a mean of 0 and homogeneity of
variance over time.
The Durbin-Watson test is the standard test
for correlated error.
Procedure
Test for the assumptions.
The Box-Jenkins Methodology
Identification.
Estimation.
Diagnostic Checking.
Forecasting.
Test for Stationarity
Visual plot.
Correlogram.
Unit root test.
Augmented Dickey-Fuller Test
H0: Series is non-stationary (has a unit root).
Ha: Series is stationary.
If the ADF statistic is more negative than the critical value
(larger in absolute value), reject H0.
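The decision rule can be sketched with a simplified version of the test. The code below is the plain Dickey-Fuller regression with a constant and no lagged difference terms, so it is not the full augmented test that statistical packages run. The function name `dickey_fuller_t`, the simulated series, and the use of -2.86 (the approximate large-sample 5% critical value for the constant-only case) are assumptions made for illustration.

```python
import math
import random

def dickey_fuller_t(y):
    """t-statistic on gamma in the regression dy_t = alpha + gamma * y_{t-1} + e_t."""
    x = y[:-1]                                   # lagged levels y_{t-1}
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]  # first differences
    n = len(dy)
    mean_x = sum(x) / n
    mean_dy = sum(dy) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (di - mean_dy) for xi, di in zip(x, dy))
    gamma = sxy / sxx                            # OLS slope
    alpha = mean_dy - gamma * mean_x             # OLS intercept
    rss = sum((di - alpha - gamma * xi) ** 2 for xi, di in zip(x, dy))
    se_gamma = math.sqrt(rss / (n - 2) / sxx)    # standard error of the slope
    return gamma / se_gamma

CRITICAL_5PCT = -2.86  # assumed approximate asymptotic 5% value, constant only

rng = random.Random(0)
noise = [rng.gauss(0, 1) for _ in range(200)]            # stationary white noise
trending = [0.5 * t + e for t, e in enumerate(noise)]    # same noise plus a trend

t_noise = dickey_fuller_t(noise)
t_trend = dickey_fuller_t(trending)
reject_noise = t_noise < CRITICAL_5PCT   # reject H0 only if more negative
reject_trend = t_trend < CRITICAL_5PCT
print(t_noise, reject_noise)
print(t_trend, reject_trend)
```

The white-noise series should produce a strongly negative statistic and a rejection of H0, while the trending series should not.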
Differencing
Differencing is a procedure which attempts to de-trend
the data in order to control autocorrelation and achieve
stationarity.
It does this by subtracting from each datum in a series its
predecessor.
The number of times a series needs to be differenced to
achieve stationarity is reflected in the d parameter.
In order to determine the necessary level of differencing,
one should examine the plot of the data and the
autocorrelogram.
Caution: Some time series may require little or no
differencing. An over-differenced series produces less
stable coefficient estimates.
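Differencing itself takes only a few lines. In this sketch the helper name `difference` and the sample series are illustrative assumptions; the argument `d` plays the role of the d parameter described above.

```python
def difference(series, d=1):
    """Apply first differencing d times: each pass replaces the series
    with y_t - y_{t-1}, shortening it by one observation."""
    for _ in range(d):
        series = [series[t] - series[t - 1] for t in range(1, len(series))]
    return series

# Hypothetical series with a quadratic trend (the squares 1, 4, 9, ...).
squares = [1, 4, 9, 16, 25]
once = difference(squares, d=1)    # [3, 5, 7, 9]: still trending
twice = difference(squares, d=2)   # [2, 2, 2]: constant, trend removed
print(once, twice)
```

Each pass costs one observation, which is one practical reason not to difference more than the data require.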
Identification
Major tools: ACF and PACF
One autoregressive (p) parameter: ACF - exponential
decay; PACF - spike at lag 1, no correlation for other lags.
Two autoregressive (p) parameters: ACF - a sine-wave
shape pattern or a set of exponential decays; PACF - spikes
at lags 1 and 2, no correlation for other lags.
One moving average (q) parameter: ACF - spike at lag 1, no
correlation for other lags; PACF - damps out exponentially.
Two moving average (q) parameters: ACF - spikes at lags 1
and 2, no correlation for other lags; PACF - a sine-wave
shape pattern or a set of exponential decays.
One autoregressive (p) and one moving average (q)
parameter: ACF - exponential decay starting at lag 1; PACF
- exponential decay starting at lag 1.
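The patterns listed above can be checked numerically. The sketch below computes the sample ACF directly and the sample PACF through the Durbin-Levinson recursion, then applies both to a simulated series with one autoregressive parameter. The coefficient 0.7, the series length, and the seed are arbitrary illustrative choices.

```python
import random

def acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

def pacf(series, max_lag):
    """Sample partial autocorrelations for lags 0..max_lag,
    computed with the Durbin-Levinson recursion."""
    rho = acf(series, max_lag)
    result = [1.0]
    phi_prev = []  # AR coefficients from the previous order
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = rho[1]
            phi = [phi_kk]
        else:
            num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
            phi_kk = num / den
            phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                   for j in range(k - 1)] + [phi_kk]
        result.append(phi_kk)
        phi_prev = phi
    return result

# Simulated AR(1) series: y_t = 0.7 * y_{t-1} + e_t (illustrative values).
rng = random.Random(42)
series = [0.0]
for _ in range(2999):
    series.append(0.7 * series[-1] + rng.gauss(0, 1))

r = acf(series, 5)
p = pacf(series, 5)
print([round(v, 2) for v in r])
print([round(v, 2) for v in p])
```

For this series the ACF should decay gradually from about 0.7, while the PACF should show one spike at lag 1 and values near zero afterwards, matching the first pattern in the list.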
Estimation
Approximate maximum likelihood method:
  the fastest method;
  should be used for very long time series (e.g., with more
  than 30,000 observations).
Approximate maximum likelihood method with backcasting:
  must use this method first to establish initial parameter
  estimates that are very close to the actual final values.
Exact maximum likelihood method:
  may be inefficient when used to estimate parameters for
  seasonal models with long seasonal lags (e.g., with yearly
  lags of 365 days).
Diagnostic Checking
Test for the significance of the parameter
estimates.
Use partial data to generate forecasts.
Analysis of residuals.
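The second check, forecasting from partial data, can be sketched as follows: fit on the first part of the series, forecast the held-out part one step at a time, and compare with the actual values. The AR(1) model, the least-squares fit, the helper name `fit_ar1`, and the simulated series are all assumptions chosen to keep the example short. They are not the estimation methods listed earlier.

```python
import math
import random

def fit_ar1(series):
    """Least-squares estimates (c, phi) for y_t = c + phi * y_{t-1} + e_t."""
    x = series[:-1]
    y = series[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    phi = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
           / sum((a - mean_x) ** 2 for a in x))
    c = mean_y - phi * mean_x
    return c, phi

# Hypothetical series simulated from y_t = 2 + 0.6 * y_{t-1} + e_t.
rng = random.Random(1)
series = [0.0]
for _ in range(499):
    series.append(2.0 + 0.6 * series[-1] + rng.gauss(0, 1))

split = 400                              # first 400 points for fitting
train, holdout = series[:split], series[split:]
c, phi = fit_ar1(train)

# One-step-ahead forecasts over the hold-out period.
preds = [c + phi * series[t - 1] for t in range(split, len(series))]
rmse = math.sqrt(sum((f - a) ** 2 for f, a in zip(preds, holdout)) / len(holdout))
print(round(phi, 3), round(rmse, 3))
```

A hold-out error close to the residual standard deviation from the fitting period suggests the model generalizes; a much larger error suggests it does not.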
Limitations
The ARIMA method is appropriate only for a
time series that is stationary, or that can be
made stationary by differencing.
At least 50 observations are recommended
for the input data.
It is also assumed that the values of the
estimated parameters are constant
throughout the series.
Illustration
Background: US GDP data from 1970 to 1996
Frequency: quarterly
To extract data from Excel:
File > Open > Foreign Data as Workfile > choose file > Open
Step 1: Check for Stationarity
Visual plot. To plot the data: Series Box > View > Graph > Line
ADF / unit root test: Series Box > View > Unit Root Test > Level
Step 2: Difference the Series
ADF test on the first difference: Data Box > View > Unit Root Test > 1st Difference
Step 3: Estimate the p and q
Series Box > View > Correlogram > 1st Difference
Correlogram: ACF and PACF
Possible AR and MA models
And so on…
Step 4: Estimate several models
Quick > Estimate Equation
Type equation: u_s_gdp c ar(1)
OK
Quick > Estimate Equation
Type equation: u_s_gdp c ar(1) ma(1)
OK
Step 5: Determine the model
Model          AIC value    SC value
AR(1)          10.04537     10.10206
AR(1) MA(1)    9.985905     10.07094
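A common rule for this step is to prefer the model with the lowest information criterion. The sketch below applies that rule to the two rows of the table. The dictionary layout and variable names are illustrative, and the "lowest value wins" rule is stated here as the usual convention, not something the slides spell out.

```python
# AIC and SC (Schwarz criterion) values copied from the table above.
criteria = {
    "AR(1)": {"AIC": 10.04537, "SC": 10.10206},
    "AR(1) MA(1)": {"AIC": 9.985905, "SC": 10.07094},
}

# Lower values indicate a better trade-off between fit and complexity.
best_by_aic = min(criteria, key=lambda name: criteria[name]["AIC"])
best_by_sc = min(criteria, key=lambda name: criteria[name]["SC"])
print(best_by_aic, best_by_sc)
```

Both criteria point to the same model here; when they disagree, SC penalizes extra parameters more heavily and tends to favor the smaller model.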
Step 6: Checking
Use the t-test to check for the
significance of the parameters.
Use the Durbin-Watson test to check for
the autocorrelation of the error terms.