
Lecture Notes

on

Time Series Analysis


STA-403-S3 (ELECTIVE II)

Unit-V
Time Series Models of Heteroscedasticity

Dr. Suresh, R
Assistant Professor
Department of Statistics
Bangalore University, Bengaluru-560 056

Contents

1 Time Series Models of Heteroscedasticity
  1.1 Introduction
  1.2 Some Common Features of Financial Time Series
  1.3 ARCH Model
    1.3.1 Introduction
    1.3.2 ARCH(m) Model
    1.3.3 ARCH(1) Model
  1.4 GARCH Model
    1.4.1 Introduction
    1.4.2 GARCH(1,1) Model
  1.5 Test for ARCH Effect
  1.6 Identifying an ARCH/GARCH Model in Practice
  1.7 Maximum Likelihood Estimation
    1.7.1 ML Estimation of ARCH Model
    1.7.2 ML Estimation of GARCH Model
Syllabus

Unit-V: 6 hrs - Time Series Models of Heteroscedasticity

1. Time Series Models of Heteroscedasticity: Some common features of
financial time series
2. ARCH and GARCH Models, Test for ARCH effect, Maximum Likelihood
Estimation

References
[1] Box, G. E. P., Jenkins, G. M., Reinsel, G. C. and Ljung, G. M., Time
Series Analysis: Forecasting and Control, 5/e, Wiley, 2016.
[2] Brockwell, P. J. and Davis, R. A., Introduction to Time Series and
Forecasting, 3/e, Springer, Switzerland, 2016.
[3] Chatfield, C. and Xing, H., The Analysis of Time Series: An Introduction
with R, 7/e, CRC Press, 2019.
[4] Cryer, J. D. and Chan, K. S., Time Series Analysis with Applications
in R, 2/e, Springer, New York, 2008.
[5] Enders, W., Applied Econometric Time Series, 4/e, Wiley, 2015.
[6] Kirchgassner, G., Wolters, J. and Hassler, U., Introduction to Modern
Time Series Analysis, 2/e, Springer, Berlin, 2013.
[7] Tsay, R. S., Analysis of Financial Time Series, 3/e, Wiley, New Jersey,
2010.
1 Time Series Models of Heteroscedasticity
1.1 Introduction
In the previous units, our focus was on time series with time-varying mean
processes. We were concerned with stationary and nonstationary variables.
The nonstationary nature of the variables (time series) implied that they had
means that change over time. All models discussed so far use the conditional
expectation to describe the mean development of one or more time series.
The optimal forecast, in the sense that the variance of the forecast errors will
be minimised, is given by the conditional mean of the underlying model.
Here, it is assumed that the residuals are not only uncorrelated but also
homoscedastic, i.e. that the unexplained fluctuations have no dependencies
in the second moments.
In this unit, we are concerned with stationary series, but with conditional
variances that change over time. The model we focus on is called the
autoregressive conditional heteroskedastic (ARCH) model, along with its
generalized version, the generalized autoregressive conditional heteroskedastic
(GARCH) model.
Nobel Prize winner Robert Engle's original work on ARCH was concerned
with the volatility of inflation (its tendency to change a lot over time).
However, it was applications of the ARCH model to financial time series
that established and consolidated the significance of his contribution. For
this reason, the examples used in this unit will be based on financial time
series.
Financial time series have characteristics that are well represented by
models with dynamic variances. The particular aims of this unit are to
discuss the modeling of dynamic variances using the ARCH class of models
of volatility (note: in statistics we use variance to measure volatility), and
the estimation of these models.

Note:
1. The importance of volatility models stems from the fact that the price
of an option crucially depends on the variance of the underlying se-
curity price. Thus with the surge of derivative markets in the last
decades the application of such models (models of volatility) has seen
a tremendous rise.
2. Another use of volatility models is to assess the risk of an investment.
In the computation of the so-called value at risk (VaR), these models

have become an indispensable tool. In the banking industry, due to
the regulations of the Basel accords, such assessments are in particular
relevant for the computation of the required equity capital backing up
assets of different risk categories.
3. In risk management, volatility models provide a simple approach to
calculating the value at risk of a financial position. Volatility also
plays an important role in asset allocation and portfolio optimization.
4. Volatility is an important factor in options trading. Here volatility
means the conditional standard deviation of the underlying asset re-
turn.
5. A special feature of stock volatility is that it is not directly observable.
Although volatility is not directly observable, it has some characteristics
(refer Section 1.2) that are commonly seen in asset returns.
6. Volatility evolves over time in a continuous manner —that is, volatility
jumps are rare.
7. Volatility does not diverge to infinity, that is, volatility varies within
some fixed range. Statistically speaking, this means that volatility is
often stationary.
8. An asset is risky if its return rt is volatile (changing a lot over time).
9. In statistics we use variance to measure volatility (dispersion), and
hence the risk.
10. Statistically, volatility clustering implies time-varying conditional vari-
ance: big volatility (variance) today may lead to big volatility tomor-
row.
11. We are more interested in the conditional variance, denoted by

V ar(rt | rt−1, rt−2, . . .) = E(r²t | rt−1, rt−2, . . .),

because we want to use the past history to forecast the variance. The
last equality holds if E(rt | rt−1, rt−2, . . .) = 0, which is true in most
cases.
12. The ARCH process has the property of time-varying conditional vari-
ance, and therefore can capture the volatility clustering.
13. ARCH and GARCH models are non-linear models.

Remark: It is pertinent to note that the first differences of most of the
financial time series often exhibit wide swings, or volatility, suggesting that
the variance of financial time series varies over time. We can think of
modeling such “varying variance” (modelling changes in variance or volatil-
ity). This is where the so-called autoregressive conditional heteroscedastic-
ity (ARCH) model originally developed by Engle comes in handy.
These models do not generally lead to better point forecasts of the mea-
sured variable, but may lead to better estimates of the (local) variance.
This, in turn, allows more reliable prediction intervals to be computed and
hence a better assessment of risk.

1.2 Some Common Features of Financial Time Series


Many financial time-series, such as stock returns, inflation rates, foreign
exchange rates, etc., have some common features, such as:
1. The values of these series change rapidly from period to period in an
apparently unpredictable manner; we say the series are volatile.
2. Furthermore, there are periods when large changes are followed by
further large changes and periods when small changes are followed
by further small changes. In this case the series are said to display
time-varying volatility as well as “clustering” (‘volatility clustering’ or
‘volatility pooling’) of changes. Within these periods (periods of large
changes and periods of small change) volatility seems to be positively
autocorrelated. Statistically, volatility clustering implies time-varying
conditional variance. (In the case of financial data, for example, large
and small errors tend to occur in clusters, i.e., large returns are followed
by more large returns, and small returns by more small returns)
3. The financial time series may seem serially uncorrelated, but it is
dependent. (ACFs of the financial time series may suggest no significant
serial correlations except for small ones at lags 1 or 2. However, the
sample ACFs of some function of the series, say absolute or squared
values, show strong dependence over all lags.)
4. These series display non-normal properties (we see more observations
around the mean and in the tails, i.e., the distribution is more peaked
around the mean and has relatively fat tails). This results in 'excess
kurtosis', i.e. values of the kurtosis above three. Distributions with
these properties are said to be leptokurtic.

5. The financial time series may also be asymmetric (skewed).
6. A feature of most of these financial time series is that in their level
form (i.e., mean) they are random walks; that is, they are nonstationary.
On the other hand, in the first difference form, they are generally stationary.

1.3 ARCH Model


1.3.1 Introduction

ARMA models were used to model the conditional mean of a process when
the conditional variance was constant. Using an AR(1) as an example, we
assumed
E(Xt | Xt−1, Xt−2, . . .) = φXt−1, and V ar(Xt | Xt−1, Xt−2, . . .) = V ar(at) = σa².  (1)
In many problems, however, the assumption of a constant conditional vari-
ance will be violated. Models such as the autoregressive conditionally
heteroscedastic, or ARCH, model, first introduced by Engle (1982), were
developed to model changes in volatility.
generalized ARCH, or GARCH models by Bollerslev(1986).
In these problems, we are concerned with modeling the return or growth
rate of a series. For example, if Xt is the value of an asset at time t, then
the return or relative gain, rt , of the asset at time t is
rt = (Xt − Xt−1)/Xt−1.  (2)

The above definition (2) implies Xt = (1 + rt)Xt−1. Thus, if the return
represents a small (in magnitude) percentage change, then

∇ log(Xt) ≈ rt.  (3)

Either value, ∇ log(Xt) or (Xt − Xt−1)/Xt−1, will be called the return and
will be denoted by rt. Typically, for financial series, the return rt does not
have a constant conditional variance.
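As a quick numerical illustration (hypothetical prices and NumPy, not part of the original notes), the two notions of return in (2) and (3) can be computed and compared:

```python
import numpy as np

# Hypothetical price series X_t (assumed values, for illustration only)
X = np.array([100.0, 101.0, 100.5, 102.0, 101.2])

simple_ret = np.diff(X) / X[:-1]   # r_t = (X_t - X_{t-1}) / X_{t-1}, Eq. (2)
log_ret = np.diff(np.log(X))       # difference of log prices, Eq. (3)

# For small percentage changes the two notions of return nearly coincide
print(np.max(np.abs(simple_ret - log_ret)))
```

The gap between the two returns is of second order in the percentage change, which is why either definition may be used in practice.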

1.3.2 ARCH(m) Model

The first model that provides a systematic framework for volatility modeling
is the ARCH model of Engle (1982). The basic idea of ARCH models is as
follows. Suppose rt = µt + at, where µt is the mean return conditional on
Ft−1, the information available through time (t − 1); then:

a) the shock/innovation at of an asset return is serially uncorrelated, but
dependent, and
b) the dependence of at can be described by a simple quadratic function
of its lagged values. Specifically, an ARCH(m) model assumes that
at = σt εt,  (4)
σt² = α0 + α1 a²t−1 + · · · + αm a²t−m,  (5)

where {εt} is a sequence of independent and identically distributed
(iid) random variables with mean zero and variance 1, α0 > 0, and
αi ≥ 0 for i > 0. A model for at satisfying Equations (4) and (5) is
called an autoregressive conditionally heteroscedastic model of order
m (ARCH(m)).
Note:
1. The coefficients αi must satisfy the regularity condition α1 + α2 +
· · · + αm < 1 to ensure that the unconditional variance of at is finite.
This constraint ensures that the at are covariance stationary with
finite unconditional variance σa².
2. In practice (for modeling purposes), εt is often assumed to follow the
standard normal, a standardized Student-t, or a generalized error
distribution.
3. The adjective ‘autoregressive’ arises because the value of σt2 depends
on past values of the derived series, albeit in squared form. Note that
Equation (5) does not include an ‘error’ term and so does not define a
stochastic process.
4. The conditional heteroscedastic models of this unit are concerned with
the evolution of σt² (the "conditional heteroscedasticity" of returns),
where σt² = V ar(rt | Ft−1) = V ar(at | Ft−1), and Ft−1 is the past
information, i.e., the set of observed data up to time t − 1.
5. {εt} is independent of at−j, j ≥ 1.
6. Here we use an exact function to govern the evolution of σt2 . (Equation
(5) will capture serial correlation in volatility)
7. As the name suggests, heteroscedasticity, or unequal variance, may
have an autoregressive structure in that heteroscedasticity observed
over different periods may be autocorrelated.
8. The ARCH process is nonlinear in variance but linear in mean.

1.3.3 ARCH(1) Model

Definition 1.1. The ARCH(1) model is

at = σt εt,  (6)
σt² = α0 + α1 a²t−1,  (7)

where at is the shock/innovation of an asset return and {εt} is a sequence
of independent and identically distributed (iid) random variables with mean
zero and variance 1, α0 > 0, and α1 > 0.
Properties of ARCH(1) Model: To understand the ARCH models, it
pays to carefully study the ARCH(1) model.
1. Unconditional mean of at : The unconditional mean of at remains
zero because

E(at) = E[E(at | Ft−1)] = E[σt E(εt)] = 0.  (8)

2. Unconditional variance of at : It can be obtained as

V ar(at) = E(a²t)                        [∵ E(at) = 0]
         = E[E(a²t | Ft−1)]
         = E(α0 + α1 a²t−1)
         = α0 + α1 E(a²t−1)
         = α0 + α1 V ar(at−1)            [∵ V ar(at−1) = E(a²t−1)]
         = α0 + α1 V ar(at),             [∵ V ar(at−1) = V ar(at)]

so that
σa² = α0 / (1 − α1).  (9)
Since the variance of at must be positive, we require 0 ≤ α1 < 1.
Note: Substituting α0 = σa²(1 − α1) from (9) into (7), we see that

σt2 = σa2 + α1 (a2t−1 − σa2 ) (10)

or, equivalently,
σt2 − σa2 = α1 (a2t−1 − σa2 ). (11)
Hence, the conditional variance of at will be above the unconditional
variance whenever a2t−1 is larger than the unconditional variance σa2 .
3. The at are serially uncorrelated: Since for j > 0,

E(at at−j ) = E[E(at at−j |Ft−1 )] = E[at−j E(at |Ft−1 )] = 0. (12)

But the at ’s are not mutually independent since they are interrelated
through their conditional variances. The lack of serial correlation is an
important property that makes the ARCH model suitable for model-
ing asset returns that are expected to be uncorrelated by the efficient
market hypothesis.
4. Unconditional Kurtosis of at: We can show that the unconditional
kurtosis of at is given by

κ = 3(1 − α1²) / (1 − 3α1²),  (13)

provided 1 − 3α1² > 0, i.e., α1² < 1/3.
This value exceeds 3, the kurtosis of the normal distribution. Hence,
the marginal distribution of at has heavier tails than those of the
normal distribution. Hence the innovation process at in a Gaussian
ARCH(1) model tends to generate more ‘outliers’ than a Gaussian
white noise process. This is in agreement with the empirical finding
that ‘outliers’ appear more often in asset returns than that implied
by an iid sequence of normal random variates. This is an additional
feature of the ARCH model that makes it useful for modeling financial
asset returns where heavy-tailed behavior is the norm.


5. Potentially another useful property: Another useful theoretical
property of the ARCH(1) model as written in equation (7) above is
the following: Let ut = a²t − σt², where the random variables ut have
zero mean and are serially uncorrelated. Adding ut to both sides of
equation (7) gives

a²t = α0 + α1 a²t−1 + ut.  (14)

This form reveals that the process of squared errors a²t can be viewed
as an AR(1) model with uncorrelated innovations ut.
Remark:
1. From properties 1-3, we can see that the ARCH(1) process is
stationary.
2. The innovations at in the weakly stationary ARCH(1) model are
unconditionally homoskedastic and conditionally heteroskedastic.
3. All the above properties also hold for general ARCH models of higher
order, but the arguments become more complicated.
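These properties are easy to check by simulation. The following sketch (assumed parameter values; NumPy used for illustration, not part of the original notes) generates an ARCH(1) series via equations (6)-(7) and compares the sample variance with the theoretical unconditional variance α0/(1 − α1) from (9):

```python
import numpy as np

rng = np.random.default_rng(42)
alpha0, alpha1 = 0.2, 0.5      # assumed parameters, with 0 <= alpha1 < 1
n = 100_000

a = np.zeros(n)
for t in range(1, n):
    sigma2_t = alpha0 + alpha1 * a[t - 1] ** 2        # Eq. (7)
    a[t] = np.sqrt(sigma2_t) * rng.standard_normal()  # Eq. (6), normal eps_t

print(a.var())                 # sample unconditional variance
print(alpha0 / (1 - alpha1))   # theoretical value from Eq. (9): 0.4
```

The simulated series also exhibits the volatility clustering and heavy tails discussed above, even though the εt are Gaussian.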

1.4 GARCH Model
1.4.1 Introduction

The ARCH model has a disadvantage in that it often requires a high lag
order m to adequately describe the evolution of volatility over time. An
extension of the ARCH model called the generalized ARCH, or GARCH,
model was introduced by Bollerslev (1986) to overcome this issue. The
generalized ARCH or GARCH model is a parsimonious alternative to an
ARCH(m) model.
Definition 1.2. For a log return series rt , let at = rt − µt be the innovation
at time t. Then at follows a GARCH(m,s) model if

at = σt εt,  (15)
σt² = α0 + Σ_{i=1}^{m} αi a²t−i + Σ_{j=1}^{s} βj σ²t−j,  (16)

where {εt} is a sequence of iid random variables with mean 0 and variance
1, α0 > 0, αi ≥ 0, βj ≥ 0, and Σ_{i=1}^{max(m,s)} (αi + βi) < 1.
Note:
1. The constraint on αi + βi implies that the unconditional variance of at
is finite, whereas its conditional variance σt2 evolves over time.
2. To carry out inference, εt is often assumed to follow a standard normal,
standardized Student-t, or generalized error distribution.
3. Equations (15) and (16) reduce to a pure ARCH(m) model if s = 0.
4. The αi and βj are referred to as the ARCH and GARCH parameters,
respectively.
5. {εt} is independent of at−j, j ≥ 1.
6. A GARCH (generalized autoregressive conditionally heteroscedastic)
model uses values of the past squared observations (ARCH terms )
and past variances (GARCH terms) to model the variance at time t.

1.4.2 GARCH(1,1) Model

The simplest and most widely used model in the class of GARCH models
is the GARCH(1, 1) model.

Definition 1.3. The GARCH(1, 1) model is

at = σt εt,  (17)
σt² = α0 + α1 a²t−1 + β1 σ²t−1,  (18)

where at is the shock/innovation of an asset return and {εt} is a sequence
of independent and identically distributed (iid) random variables with mean
zero and variance 1, α0 > 0, 0 ≤ α1, β1 ≤ 1 and (α1 + β1) < 1.

Properties of GARCH(1, 1) Model: To understand the GARCH models,
it pays to carefully study the GARCH(1, 1) model.
1. ARMA(1, 1) form of a²t: A very useful theoretical property of the
GARCH(1, 1) model as written in equation (18) above is the following:
Let ut = a²t − σt² (so that ut−1 = a²t−1 − σ²t−1, i.e., σ²t−1 = a²t−1 − ut−1),
where the random variables ut have zero mean and are serially
uncorrelated. Adding ut to both sides of equation (18), we obtain

a²t = α0 + α1 a²t−1 + β1 σ²t−1 + ut
    = α0 + α1 a²t−1 + β1 (a²t−1 − ut−1) + ut
    = α0 + (α1 + β1) a²t−1 + ut − β1 ut−1.  (19)

This form reveals that the process of squared errors a²t can be viewed
as an ARMA(1, 1) model with uncorrelated innovations (errors) ut.
Note: The ARMA representation of the GARCH model can be used
for identification of the order of a GARCH(m, s) model.
2. Unconditional mean of at : The unconditional mean of at remains
zero because
E(at) = E[E(at | Ft−1)] = E[σt E(εt | Ft−1)] = E[σt E(εt)] = 0.  (20)

3. Unconditional variance of at : It can be obtained as


V ar(at) = E(a²t)                                    [∵ E(at) = 0]
         = E(α0 + (α1 + β1) a²t−1 + ut − β1 ut−1)
         = α0 + (α1 + β1) E(a²t−1) + E(ut) − β1 E(ut−1)
         = α0 + (α1 + β1) V ar(at−1)                 [∵ E(at) = E(ut) = 0, ∀t]
         = α0 + (α1 + β1) V ar(at),                  [∵ V ar(at−1) = V ar(at)]

so that
σa² = α0 / (1 − (α1 + β1)).  (21)
Since the variance of at must be positive, we require 0 ≤ α1 + β1 < 1.

4. The at are serially uncorrelated: Since for j > 0,
E(at at−j ) = E[E(at at−j |Ft−1 )] = E[at−j E(at |Ft−1 )] = 0. (22)
But the at ’s are not mutually independent since they are interrelated
through their conditional variances.
5. Unconditional Kurtosis of at: We can show that the unconditional
kurtosis of at is given by

κ = 3[1 − (α1 + β1)²] / [1 − (α1 + β1)² − 2α1²],  (23)

provided the denominator is positive. This value exceeds 3, the kurtosis
of the normal distribution. Consequently, similar to ARCH models, the
tail distribution of a GARCH(1, 1) process is heavier than that of a
normal distribution.
Remark: These properties also hold for general GARCH models with
higher orders, but the arguments become more complicated.

Note:
1. The weakly stationary GARCH(1,1) model has innovations which are
conditionally heteroskedastic, with time-varying conditional variance
E(a²t | Ft−1) = σt², and unconditionally homoskedastic, with constant
unconditional variance σa² = E(a²t) = α0 / (1 − α1 − β1).
2. Since σt² can be written as a function of the squared innovations
a²t−1, a²t−2, a²t−3, . . ., we get

σt² = (1 + β1 + β1² + · · ·)α0 + α1 Σ_{i=1}^{∞} β1^{i−1} a²t−i
    = α0/(1 − β1) + α1 Σ_{i=1}^{∞} β1^{i−1} a²t−i,  (24)

which implies that the GARCH(1,1) model can be interpreted as an
infinite-order ARCH process with restricted parameters. This is a major
reason the GARCH model is typically empirically more relevant than the
ARCH(m) model.
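As with ARCH(1), the GARCH(1, 1) recursion (18) is straightforward to simulate. The sketch below (assumed parameter values, NumPy for illustration) checks the unconditional variance formula (21) against a simulated path:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha0, alpha1, beta1 = 0.1, 0.1, 0.8   # assumed parameters, alpha1 + beta1 < 1
n = 200_000

a = np.zeros(n)
s2 = alpha0 / (1 - alpha1 - beta1)       # start sigma^2 at the stationary variance
for t in range(1, n):
    s2 = alpha0 + alpha1 * a[t - 1] ** 2 + beta1 * s2   # Eq. (18)
    a[t] = np.sqrt(s2) * rng.standard_normal()          # Eq. (17), normal eps_t

print(a.var())                           # theoretical value, Eq. (21): 1.0
```

With α1 + β1 = 0.9 the simulated path shows long, persistent volatility episodes, which is the "parsimonious infinite ARCH" behavior described in note 2.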

1.5 Test for ARCH Effect


Before modeling the volatility of a time series it is advisable to test whether
heteroskedasticity is actually present in the data. For this purpose several
tests have been proposed in the literature, of which we shall examine two. For
both tests the null hypothesis is that there is no heteroskedasticity, i.e. that
there are no ARCH effects.

1. Autocorrelation of Quadratic Residuals (McLeod and Li (1983)):


This test is based on the autocorrelation function of squared residuals
from a preliminary regression. This preliminary regression or mean
regression produces a series {ât } which should be approximately white
noise if the equation is well specified. Then we can look at the ACF
of the squared residuals {â2t } and apply the Ljung-Box test. Thus the
test can be broken down into three steps.
(a) Estimate an ARMA model for {Xt} and retrieve the residuals {ât}
from this model. Compute {â²t}. These data can be used to estimate
σ̂a² as

σ̂a² = (1/n) Σ_{t=1}^{n} â²t.  (25)

Note that the ARMA model should be specified such that the
residuals are approximately white noise.
(b) Estimate the ACF for the squared residuals in the usual way:

ρ̂â²(k) = [Σ_{t=k+1}^{n} (â²t − σ̂a²)(â²t−k − σ̂a²)] / [Σ_{t=1}^{n} (â²t − σ̂a²)²].  (26)

(c) Use the Ljung-Box test statistic to test the hypothesis that all
correlation coefficients up to order K are simultaneously equal to
zero. McLeod and Li (1983) proposed the portmanteau statistic

Q̃â² = n(n + 2) Σ_{k=1}^{K} ρ̂²â²(k) / (n − k).  (27)
Under the null hypothesis this statistic is distributed as χ2 with
K degrees of freedom (McLeod and Li (1983) showed that the
statistic Q̃â2 has approximately the χ2 distribution with K degrees
of freedom under the assumption that the ARMA model alone is
adequate.). The decision rule is to reject the null hypothesis if
Q̃â² > χ²K(α), where χ²K(α) is the upper 100(1 − α)th percentile of
χ²K, or if the p-value of Q̃â² is less than α, the type-I error
(significance level).
2. Engle's Lagrange-Multiplier Test: Engle (1982) proposed a Lagrange-
Multiplier test. This test rests on an auxiliary regression of the squared
residuals against a constant and the lagged values â²t−1, â²t−2, . . . , â²t−m,
where {ât} is again obtained from a preliminary regression, i.e.,
ât = rt − µ̂t.
The auxiliary regression thus is
â2t = α0 + α1 â2t−1 + · · · + αm â2t−m + et , t = m + 1, ..., n. (28)
(i.e., Fit an AR(m) model to {â2t }, t = 1, 2, . . . , n ). Here et denotes the
error term of this auxiliary regression. Then the null hypothesis H0 :
α1 = α2 = · · · = αm = 0 is tested against the alternative hypothesis
H1 : αi ≠ 0 for at least one i. As a test statistic one can use the
coefficient of determination times n, i.e. nR2 . Therefore LM test
statistic is
LM = nR2 (29)
Under H0 , this test statistic is asymptotically distributed as χ2 with m
degrees of freedom. The decision rule is to reject the null hypothesis
if LM > χ²m(α), where χ²m(α) is the upper 100(1 − α)th percentile of
χ²m, or if the p-value of LM is less than α, the type-I error
(significance level).
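Both tests can be sketched in a few lines. The code below is illustrative only (function names, parameter choices, and simulated data are my own, not from the notes): it implements the McLeod-Li statistic (27) and Engle's nR² statistic (29), applied to a homoskedastic series and to a simulated ARCH(1) series.

```python
import numpy as np

def mcleod_li(resid, K=10):
    """Ljung-Box statistic on squared residuals, Eq. (27); ~ chi^2_K under H0."""
    a2 = resid ** 2
    n = len(a2)
    d = a2 - a2.mean()                    # a_t^2 - sigma_hat_a^2
    denom = np.sum(d ** 2)
    return n * (n + 2) * sum(
        (np.sum(d[k:] * d[:-k]) / denom) ** 2 / (n - k) for k in range(1, K + 1)
    )

def engle_lm(resid, m=5):
    """Engle's LM test: regress a_t^2 on m lags (Eq. 28), return n*R^2 (Eq. 29)."""
    a2 = resid ** 2
    n = len(a2) - m
    y = a2[m:]
    # design matrix: constant plus m lagged squared residuals
    X = np.column_stack([np.ones(n)] + [a2[m - i:len(a2) - i] for i in range(1, m + 1)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r2 = 1 - np.sum((y - X @ beta) ** 2) / np.sum((y - y.mean()) ** 2)
    return n * r2

rng = np.random.default_rng(1)
white = rng.standard_normal(3000)        # no ARCH effect
arch = np.zeros(3000)                    # ARCH(1) with assumed parameters
for t in range(1, 3000):
    arch[t] = np.sqrt(0.2 + 0.5 * arch[t - 1] ** 2) * rng.standard_normal()

print(mcleod_li(white), mcleod_li(arch))   # expected: small vs. large
print(engle_lm(white), engle_lm(arch))     # expected: small vs. large
```

For the white-noise series both statistics should stay near their χ² reference values, while for the ARCH(1) series they should be far in the rejection region.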
Note:
1. An ARCH model should only ever be applied to a series that has
already had an appropriate mean model fitted, sufficient to leave the
residuals looking like discrete white noise. Since we can only tell
whether ARCH is appropriate or not by squaring the residuals and
examining the ACF, we also need to ensure that the mean of the
residuals is zero.
2. These tests can also be useful in a conventional regression setting.

1.6 Identifying an ARCH/GARCH Model in Practice


The best identification tool may be a time series plot of the series. It’s
usually easy to spot periods of increased variation sprinkled through the
series. It can be fruitful to look at the ACF and PACF of both rt and rt2 .
For instance, if rt appears to be white noise and rt2 appears to be AR(1),
then an ARCH(1) model for the variance is suggested. If the PACF of the
rt2 suggests AR(m), then ARCH(m) may work. GARCH models may be
suggested by an ARMA type look to the ACF and PACF of rt2 .
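This diagnostic can be sketched as follows (simulated ARCH(1) data with assumed parameters; the `acf` helper and all numbers are illustrative, not from the notes). The series itself shows essentially no lag-1 autocorrelation, while its square does:

```python
import numpy as np

def acf(x, k):
    """Sample autocorrelation of x at lag k."""
    d = x - x.mean()
    return np.sum(d[k:] * d[:-k]) / np.sum(d ** 2)

rng = np.random.default_rng(7)
alpha0, alpha1, n = 0.2, 0.3, 50_000   # assumed ARCH(1) parameters
r = np.zeros(n)
for t in range(1, n):
    r[t] = np.sqrt(alpha0 + alpha1 * r[t - 1] ** 2) * rng.standard_normal()

print(acf(r, 1))        # near 0: r_t looks like white noise
print(acf(r ** 2, 1))   # clearly positive: r_t^2 looks AR-like
```

In practice one would plot the full ACF and PACF of both rt and rt² rather than a single lag, but the contrast at lag 1 already illustrates the identification idea.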

1.7 Maximum Likelihood Estimation


1.7.1 ML Estimation of ARCH Model

Several likelihood functions are commonly used in ARCH estimation,
depending on the distributional assumption of εt. Under the normality
assumption, the likelihood function of an ARCH(m) model is
f(a1, a2, . . . , an | α) = f(an | Fn−1) f(an−1 | Fn−2) · · · f(am+1 | Fm) f(a1, a2, . . . , am | α)
                        = [Π_{t=m+1}^{n} (1/√(2πσt²)) exp(−a²t / (2σt²))] × f(a1, a2, . . . , am | α),  (30)

where α = (α0, α1, . . . , αm)′ and f(a1, a2, . . . , am | α) is the joint probability
density function of a1, a2, . . . , am. Since the exact form of f(a1, a2, . . . , am | α)
is complicated, it is commonly dropped from the prior likelihood function,
especially when the sample size is sufficiently large. This results in using
the conditional-likelihood function
f(am+1, am+2, . . . , an | α, a1, a2, . . . , am) = Π_{t=m+1}^{n} (1/√(2πσt²)) exp(−a²t / (2σt²)),  (31)
where σt2 can be evaluated recursively. We refer to estimates obtained
by maximizing the prior likelihood function as the conditional maximum-
likelihood estimates (MLEs) under normality. Maximizing the conditional-
likelihood function is equivalent to maximizing its logarithm, which is easier
to handle. The conditional log-likelihood function is
l(am+1, am+2, . . . , an | α, a1, a2, . . . , am) = Σ_{t=m+1}^{n} [ −(1/2) ln(2π) − (1/2) ln(σt²) − (1/2) a²t/σt² ].  (32)
Since the first term ln(2π) does not involve any parameters, the log-likelihood
function becomes

l(am+1, am+2, . . . , an | α, a1, a2, . . . , am) = − Σ_{t=m+1}^{n} (1/2)[ ln(σt²) + a²t/σt² ],  (33)

where σt² = α0 + α1 a²t−1 + · · · + αm a²t−m can be evaluated recursively. There
is no closed-form solution for the maximum likelihood estimators of α, but
they can be computed by maximizing the log-likelihood function numerically.
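A crude numerical illustration (simulated data with assumed true parameters; a grid search stands in for a proper numerical optimizer, purely as a sketch): maximize the conditional log-likelihood (33) for an ARCH(1) model.

```python
import numpy as np

def negll_arch1(alpha0, alpha1, a):
    """Conditional negative log-likelihood of ARCH(1), as in Eq. (33)."""
    sig2 = alpha0 + alpha1 * a[:-1] ** 2     # sigma_t^2 for t = 2..n
    return 0.5 * np.sum(np.log(sig2) + a[1:] ** 2 / sig2)

# Simulate ARCH(1) data with assumed true parameters
rng = np.random.default_rng(3)
true_a0, true_a1, n = 0.2, 0.5, 20_000
a = np.zeros(n)
for t in range(1, n):
    a[t] = np.sqrt(true_a0 + true_a1 * a[t - 1] ** 2) * rng.standard_normal()

# Grid search in place of a numerical optimizer (sketch only)
grid0 = np.arange(0.05, 0.51, 0.01)
grid1 = np.arange(0.05, 0.96, 0.01)
est = min(((a0, a1) for a0 in grid0 for a1 in grid1),
          key=lambda p: negll_arch1(p[0], p[1], a))
print(est)   # expected to be close to (0.2, 0.5)
```

In real applications one would use a constrained quasi-Newton optimizer rather than a grid, but the objective function is exactly the one in (33).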

Remark: As discussed earlier, the conditional normal distribution results
in a leptokurtic unconditional distribution. Nevertheless, in financial
applications the normal distribution sometimes fails to capture the excess
kurtosis that is present in stock returns and other variables.
1. To overcome this drawback, Bollerslev (1987) suggested using a stan-
dardized Student t-distribution with ν > 2 degrees of freedom for the
estimation.

2. Nelson (1991) suggested using the generalized error distribution (GED)
for the estimation.

1.7.2 ML Estimation of GARCH Model

The likelihood function of a GARCH model can be readily derived for the
case of normal innovations. We illustrate the computation for the case of a
stationary GARCH(1, 1) model. Extension to the general case is straight-
forward. Given the parameters α0 , α1 , and β1 the conditional variances can
be computed recursively by the formula

σt² = α0 + α1 a²t−1 + β1 σ²t−1  (34)

for t ≥ 2, with the initial value σ²1 set, under the stationarity assumption,
to the stationary unconditional variance σa² = α0/(1 − α1 − β1). We use the
conditional pdf
f(at | at−1, at−2, . . . , a1) = (1/√(2πσt²)) exp(−a²t / (2σt²))  (35)
and the joint pdf

f (an , an−1 , . . . , a1 ) = f (an−1 , an−2 , . . . , a1 )f (an |an−1 , an−2 , . . . , a1 ). (36)

Iterating this last formula and taking logs gives the following formula for
the log-likelihood function:

l(α0, α1, β1) = −(n/2) ln(2π) − (1/2) Σ_{t=1}^{n} [ ln(σt²) + a²t/σt² ].  (37)

There is no closed-form solution for the maximum likelihood estimators of
α0, α1, and β1, but they can be computed by maximizing the log-likelihood
function numerically.
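The recursion (34) and the log-likelihood (37) can be sketched directly (illustrative code with assumed parameter values, not part of the original notes). At the true parameters, the log-likelihood of simulated data should exceed that at a distant parameter point:

```python
import numpy as np

def negll_garch11(alpha0, alpha1, beta1, a):
    """Negative log-likelihood of GARCH(1,1); sigma_1^2 initialized at the
    stationary variance, recursion as in Eq. (34), sum as in Eq. (37)
    (with the constant n/2 * ln(2*pi) term dropped)."""
    s2 = alpha0 / (1 - alpha1 - beta1)
    nll = 0.5 * (np.log(s2) + a[0] ** 2 / s2)
    for t in range(1, len(a)):
        s2 = alpha0 + alpha1 * a[t - 1] ** 2 + beta1 * s2   # Eq. (34)
        nll += 0.5 * (np.log(s2) + a[t] ** 2 / s2)
    return nll

# Simulate GARCH(1,1) data with assumed true parameters
rng = np.random.default_rng(5)
a0, a1, b1, n = 0.1, 0.1, 0.8, 10_000
a = np.zeros(n)
s2 = a0 / (1 - a1 - b1)
for t in range(1, n):
    s2 = a0 + a1 * a[t - 1] ** 2 + b1 * s2
    a[t] = np.sqrt(s2) * rng.standard_normal()

print(negll_garch11(a0, a1, b1, a))       # at the true parameters
print(negll_garch11(0.5, 0.05, 0.3, a))   # at a distant point; expected larger
```

Maximizing this objective numerically (e.g. with a constrained optimizer over α0 > 0, α1, β1 ≥ 0, α1 + β1 < 1) yields the conditional MLEs.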

Remark: As we discussed in the case of ARCH models, instead of the
normal distribution, we may use the standardized Student t-distribution
and/or the generalized error distribution (GED) to capture the excess
kurtosis that is noticed in financial time series.

***END***
