
Stochastic Processes

A stochastic process (aka a random process) is a collection of random variables ordered by time. This is the “population version” of a time series (which plays the role of a “sample” of a stochastic process).
Topics:
• Stationary Process
• Autocorrelation Function
• Partial Autocorrelation Function
• Purely Random Time Series (white noise)
• Random Walk
• Deterministic Trend
• Dickey-Fuller Test
• Real Statistics Time Series Testing Tools
• Correlogram
• Handling Missing Time Series Data

Stationary Process
Basic Concepts
A time series is stationary if the properties of the time series (i.e. the mean,
variance, etc.) are the same when measured from any two starting points in
time. Time series which exhibit a trend or seasonality are clearly not
stationary.
We can make this definition more precise by first laying down a statistical
framework for further discussion.
Definitions
Definition 1: A stochastic process (aka a random process) is a
collection of random variables ordered by time.
In economics, GDP and corporate profits (by year) can be modeled as
stochastic processes. In biology the number of elephants in the wild, in
meteorology the average temperature of the planet, in medicine the number
of Ebola cases, etc. can all be modeled as stochastic processes.
Thus, if we are interested in GDP from 2001 until 2015, we can define the
random variables yi = the GDP in 2000 + i, and so the series y1, y2, …, y15 is a
stochastic process.
Corresponding to the individual populations of the random variables in a
stochastic process are the samples for each random variable. Any such
realization of samples is called a time series. Note that the sample of each
random variable in a time series contains just one element.
Definition 2: A stochastic process is stationary if the mean, variance and autocovariance are all constant; i.e. there are constants μ, σ² and γk so that for all i, E[yi] = μ, var(yi) = E[(yi – μ)²] = σ², and for any lag k, cov(yi, yi+k) = E[(yi – μ)(yi+k – μ)] = γk.
A time series is stationary if the above properties hold for the time series
(in the same way as we extend properties of a population to its samples). We
will make this more precise shortly.
Observation: The above definition of stationary is what is usually
called weakly stationary, but fortunately it is sufficient for our purposes.
A stochastic process is truly stationary if not only are the mean, variance,
and autocovariances constant, but all the properties (i.e. moments) of its
distribution are time-invariant.
Example
Example 1: Determine whether the Dow Jones closing averages for the month of October 2015, as shown in columns A and B of Figure 1, form a stationary time series.

Figure 1 – Dow Jones Time Series
As you can see from Figure 1, there is an upward trend to the data. This is an
indication that the time series is not stationary.
We now take the first differences of the Dow Jones closing averages on
consecutive days, as shown in Figure 2.

Figure 2 – Differences of Lag 1
Here cell N5 contains the formula =M5-M4 and similarly for the other cells
in column N. This time the chart shows what looks like a random pattern.
This is indicative of a stationary time series.
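The differencing step is easy to reproduce outside of Excel. Below is a minimal numpy sketch of the same calculation; the closing values are illustrative placeholders, not the actual Dow Jones data.

import numpy as np

# Hypothetical closing values standing in for column M of Figure 2
y = np.array([17050.0, 17081.9, 16801.1, 16912.3, 17050.7])

# First differences z_i = y_i - y_{i-1}, the analogue of =M5-M4 copied down column N
z = np.diff(y)
print(z)  # one fewer element than y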

Autocorrelation Function
Definitions
Definition 1: The autocorrelation function (ACF) at lag k, denoted ρk, of a
stationary stochastic process, is defined as ρk = γk/γ0 where γk = cov(yi, yi+k) for any i.
Note that γ0 is the variance of the stochastic process.
Definition 2: The mean of a time series y1, …, yn is
ȳ = (1/n) Σi=1 to n yi
The autocovariance function at lag k, for k ≥ 0, of the time series is defined by
sk = (1/n) Σi=1 to n–k (yi – ȳ)(yi+k – ȳ)
The autocorrelation function (ACF) at lag k, for k ≥ 0, of the time series is defined by
rk = sk/s0
The variance of the time series is s0. A plot of rk against k is known as


a correlogram. See Correlogram for information about the standard error and
confidence intervals of the rk, as well as how to create a correlogram including the
confidence intervals.
Observation: The definition of autocovariance given above is a little different from the usual definition of covariance between {y1, …, yn–k} and {yk+1, …, yn} in two respects: (1) we divide by n instead of n–k, and (2) we subtract the overall mean instead of the means of {y1, …, yn–k} and {yk+1, …, yn}, respectively. For values of n which are large with respect to k, the difference will be small.
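The following Python sketch implements Definition 2 directly (dividing by n and using the overall mean throughout); it is a plain translation of the formulas above, not Real Statistics code.

import numpy as np

def acf(y, k):
    """ACF at lag k per Definition 2: r_k = s_k / s_0."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    ybar = y.mean()
    s_k = np.sum((y[:n - k] - ybar) * (y[k:] - ybar)) / n  # autocovariance s_k
    s_0 = np.sum((y - ybar) ** 2) / n                      # variance s_0
    return s_k / s_0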
Example
Example 1: Calculate s2 and r2 for the data in range B4:B19 of Figure 1.

Figure 1 – ACF at lag 2
The formulas for calculating s2 and r2 using the usual COVARIANCE.S and
CORREL functions are shown in cells G4 and G5.
The formulas for s0, s2, and r2 from Definition 2 are shown in cells G8, G11, and G12
(along with an alternative formula in G13). Note that the values for s2 in cells E4 and E11 are not too different, as are the values for r2 shown in cells E5 and E12; the larger the sample, the closer these values are likely to be.
Worksheet Functions
Real Statistics Functions: The Real Statistics Resource Pack supplies the following
functions:
ACF(R1, k) = the ACF value at lag k for the time series in range R1
ACVF(R1, k) = the autocovariance at lag k for the time series in range R1
Note that ACF(R1, k) is equivalent to
=SUMPRODUCT(OFFSET(R1,0,0,COUNT(R1)-k)-AVERAGE(R1),OFFSET(R1,k,0,COUNT(R1)-k)-AVERAGE(R1))/DEVSQ(R1)
Observations
There are theoretical advantages to using division by n instead of n–k in the definition of sk, namely that the covariance and correlation matrices will always be non-negative definite (see Positive Definite Matrices).
Even though the definition of autocorrelation is slightly different from that of
correlation, ρk (or rk) still takes a value between -1 and 1, as we see in Property 2.

Properties
Property 1: For any stationary process, γ0 ≥ |γi| for any i
Proof: Click here
Property 2: For any stationary process, |ρi| ≤ 1 (i.e. -1 ≤ ρi ≤ 1) for any i > 0
Proof: By Property 1, γ0 ≥ |γi| for any i. Since ρi = γi/γ0 and γ0 > 0 (we are assuming that ρi is well-defined), it follows that |ρi| = |γi|/γ0 ≤ 1.
Example with Correlogram


Example 2: Determine the ACF for lag = 1 to 10 for the Dow Jones closing averages
for the month of October 2015, as shown in columns A and B of Figure 2, and
construct the corresponding correlogram.
The results are shown in Figure 2. The values in column E are computed by placing
the formula =ACF(B$4:B$25, D5) in cell E5, highlighting range E5:E14 and
pressing Ctrl-D.

Figure 2 – ACF and Correlogram


As can be seen from the values in column E or the chart, the ACF values descend
slowly towards zero. This is typical of an autoregressive process.

Observation: A rule of thumb is to carry out the above process for lag = 1 to n/4 or n/3, which for the above data is 22/4 ≈ 6 or 22/3 ≈ 7. Our goal is to see whether by this time the ACF is significant (i.e. statistically different from zero). We can do this by using the following property.
Tests
Property 3 (Bartlett): In large samples, if a time series of size n is purely random, then for all k
rk ∼ N(0, 1/n)
approximately; i.e. the standard error of rk is about 1/√n.
Example 3: Determine whether the ACF at lag 7 is significant for the data from
Example 2.
As we can see from Figure 3, the critical value for the test in Property 3 is .417866.
Since r7 = .031258 < .417866, we conclude that ρ7 is not significantly different from
zero.

Figure 3 – Bartlett’s Test


Note that using this test, values of k up to 3 are significant and those higher than 3
are not significant (although here we haven’t taken experiment-wise error into
account).
Property 4 (Box-Pierce): In large samples, if ρk = 0 for all k ≤ m, then
Q = n Σk=1 to m rk² ∼ χ²(m)
A more statistically powerful version of Property 4, especially for smaller samples, is given by the next property.
Property 5 (Ljung-Box): If ρk = 0 for all k ≤ m, then
Q = n(n + 2) Σk=1 to m rk²/(n – k) ∼ χ²(m)
Example 4: Use the Box-Pierce and Ljung-Box statistics to determine whether the
ACF values in Example 2 are statistically equal to zero for all lags less than or equal
to 5 (the null hypothesis).
The results are shown in Figure 4.

Figure 4 – Box-Pierce and Ljung-Box Tests
We see from these tests that ACF(k) is significantly different from zero for at least
one k ≤ 5, which is consistent with the correlogram in Figure 2.
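A minimal Python sketch of both statistics, assuming numpy and scipy are available; it follows the formulas in Properties 4 and 5 rather than any Real Statistics internals.

import numpy as np
from scipy.stats import chi2

def box_pierce_ljung_box(y, m):
    """Return (Q, p-value) pairs for the Box-Pierce and Ljung-Box tests."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    d = y - y.mean()
    denom = np.sum(d ** 2)
    r = np.array([np.sum(d[:n - k] * d[k:]) / denom for k in range(1, m + 1)])
    q_bp = n * np.sum(r ** 2)                                        # Box-Pierce Q
    q_lb = n * (n + 2) * np.sum(r ** 2 / (n - np.arange(1, m + 1)))  # Ljung-Box Q
    return (q_bp, chi2.sf(q_bp, m)), (q_lb, chi2.sf(q_lb, m))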
Test worksheet functions
Real Statistics Functions: The Real Statistics Resource Pack provides the following
functions to perform the tests described by the above properties.
BARTEST(r, n, lag) = p-value of Bartlett’s test for correlation coefficient r based
on a time series of size n for the specified lag.
BARTEST(R1,, lag) = BARTEST(r, n, lag) where n = the number of elements in
range R1 and r = ACF(R1,lag)
PIERCE(R1,,lag) = Box-Pierce statistic Q for range R1 and the specified lag
BPTEST(R1,,lag) = p-value for the Box-Pierce test for range R1 and the
specified lag
LJUNG(R1,,lag) = Ljung-Box statistic Q for range R1 and the specified lag
LBTEST(R1,,lag) = p-value for the Ljung-Box test for range R1 and the
specified lag
In the above functions where the second argument is missing, the test is performed
using the autocorrelation coefficient (ACF). If the value assigned instead is 1 or
“pacf” then the test is performed using the partial autocorrelation coefficient (PACF)
as described in the next section. Actually, if the second argument takes any value
except 1 or “pacf”, then the ACF value is used.
For example, BARTEST(.303809,22,7) = .07708 for Example 3 and LBTEST(B4:B25,"acf",5) = 1.81E-06 for Example 4.
Autocorrelation Proof
Property 1: For any stationary process, γ0 ≥ |γi| for any i

Proof: For any stationary process yi with mean µ, define zi = yi – µ. Then it is easy to see that zi is a stationary process with mean zero. Also
cov(zi, zi+k) = E[zizi+k] = γk
(including the case where k = 0), which means that it is sufficient to prove the property in the case where the mean is zero.
Now note that E[(zi – czi+k)²] ≥ 0 for any real number c. Expanding, it follows that
E[zi²] – 2cE[zizi+k] + c²E[zi+k²] = γ0 – 2cγk + c²γ0 ≥ 0
and so
(1 + c²)γ0 ≥ 2cγk
If γk ≥ 0, let c = 1, while if γk < 0, let c = -1. Then
2γ0 ≥ 2|γk|, i.e. γ0 ≥ |γk|


Partial Autocorrelation Function
For the regression of y on x1, x2, x3, x4, the partial correlation between y and x1 is the correlation that remains after the linear effects of x2, x3, x4 have been removed. It can be calculated as the correlation between the residuals of the regression of y on x2, x3, x4 and the residuals of the regression of x1 on x2, x3, x4.
For a time series, the hth order partial autocorrelation is the partial correlation of yi with yi-h, conditional on yi-1, …, yi-h+1.

The first order partial autocorrelation is therefore the first-order autocorrelation.


The partial autocorrelations can be calculated as in the following alternative
definition.
Definition 1: For k > 0, the partial autocorrelation function (PACF) of order k, denoted πk, of a stochastic process, is defined as the kth element in the column vector
Γk−1δk
Here, Γk is the k × k autocovariance matrix Γk = [vij] where vij = γ|i-j| and δk is the k × 1 column vector δk = [γi]. We also define π0 = 1. We can also define πki to be the ith element in the vector Γk−1δk, and so πk = πkk.
Provided γ0 > 0, the partial autocorrelation function of order k is also equal to the kth element in the column vector
Σk−1δk
divided by γ0.

Here, Σk is the k × k autocorrelation matrix Σk = [ωij] where ωij = ρ|i-j| and τk is
the k × 1 column vector τk = [ρi].
Note that if γ0 > 0, then Σk and Γk are invertible for all k.
The partial autocorrelation function (PACF) of order k, denoted pk, of a time series, is defined in a similar manner as the last element in the column vector
Rk−1Ck
divided by r0.
Here Rk is the k × k matrix Rk = [sij] where sij = r|i-j| and Ck is the k × 1 column vector Ck = [ri].
We also define p0 = 1 and pki to be the ith element in the vector Rk−1Ck, and so pk = pkk.
These values can also be calculated from the autocovariance matrix of the time series
in the same manner as described above for stochastic processes.
Observation: Let Y be the k × 1 vector Y = [y1 y2 … yk]T and Z = [y1–µ y2–µ … yk–µ]T where µ = E[Y]. Then cov(Y) = E[ZZT], which is equal to the k × k autocovariance matrix Γk = [vij] where vij = γ|i-j|.
Example 1: Calculate PACF for lags 1 to 7 for Example 2 of Autocorrelation
Function.
The result is shown in Figure 1.

Figure 1 – PACF
We now show how to calculate PACF(4) in Figure 2. The calculation of the other PACF values is similar.

Figure 2 – Calculation of PACF(4)
First, we note that range R4:U7 of Figure 2 contains the autocovariance matrix with
lag 4. This is a symmetric matrix, all of whose values come from range E4:E6 of
Figure 1. The values on the main diagonal are s0, the values on the diagonal above
and below the main diagonal are s1. The values on the diagonal two units away
are s2 and finally, the values in the upper right and lower left corners of the matrix
are s3.
Range R9:R12 is identical to range E5:E8. The values in range T9:T12 can now be
calculated using the array formula
=MMULT(MINVERSE(R4:U7),R9:R12)
The values in range T9:T12 are the p4i values, and so we see that PACF(4) = p4 = -.06685 (cell T12).
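The matrix computation can be reproduced in Python; the sketch below builds the lag-k autocovariance matrix and solves the system, mirroring =MMULT(MINVERSE(R4:U7),R9:R12). It assumes only numpy.

import numpy as np

def pacf(y, k):
    """PACF at lag k: last element of Gamma_k^{-1} delta_k."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    d = y - y.mean()
    # sample autocovariances s_0, ..., s_k (divide by n, overall mean)
    s = np.array([np.sum(d[:n - h] * d[h:]) / n for h in range(k + 1)])
    Gamma = s[np.abs(np.subtract.outer(np.arange(k), np.arange(k)))]  # [s_|i-j|]
    delta = s[1:k + 1]
    return np.linalg.solve(Gamma, delta)[-1]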
Observation: See Correlogram for information about the standard error and
confidence intervals of the pk, as well as how to create a PACF correlogram that
includes the confidence intervals.
Observation: We see from the chart in Figure 1 that the PACF values for lags 2, 3, … are close to zero. As can be seen in Partial Autocorrelation for an AR(p) Process, this is typical for a time series derived from an autoregressive process. Note too that we can use Property 3 of Autocorrelation Function to test whether the PACF values for lags 2 and beyond are statistically equal to zero (see Figure 3).

Figure 3 – Bartlett’s test for PACF
The test shows that PACF(2) is not significantly different from zero. Note that, where applicable, we can also use Properties 4 and 5 of Autocorrelation Function to test PACF values.
Real Statistics Function: The Real Statistics Resource Pack supplies the following
function where R1 is a column range containing time series data:
PACF(R1, k) – the PACF value at lag k
The following array functions are also provided:
ACOV(R1, k) – the autocovariance matrix at lag k
ACORR(R1, k) – the autocorrelation matrix at lag k
Property 1: The autocovariance matrix is non-negative definite.
Proof: Let Y = [y1 y2 … yk]T and let Γ be the k × k autocovariance matrix Γ = [vij] where vij = γ|i-j|. As we observed previously, Γ = E[ZZT].
Now let X be any k × 1 vector. Then
XTΓX = XTE[ZZT]X = E[(XTZ)(ZTX)] = E[(XTZ)²] ≥ 0
which completes the proof.


Purely Random Time Series
A purely random time series y1, y2, …, yn (aka white noise) takes the form
yi = μ + εi
where the εi are independently and identically distributed with mean 0 and variance σ².
Clearly, E[yi] = μ, var(yi) = σ² and cov(yi, yj) = 0 for i ≠ j. Since these values are constants, this type of time series is stationary. Also note that ρh = 0 for all h > 0.

Example 1: Simulate 300 white noise data elements with mean zero.
Using the formula =NORM.S.INV(RAND()) we can generate a sample of 300
white noise elements, as displayed in Figure 1.

Figure 1 – White Noise Simulation


We see that there is a random pattern. Using the techniques described
in Autocorrelation Function and Partial Autocorrelation Function we can
also calculate ACF and PACF values, as shown in Figure 2.

Figure 2 – ACF and PACF for White Noise simulation


Although the theoretical ACF values are ρk = 0 for all k > 0, the sample values rk won't necessarily be exactly 0, as we can see from the left side of Figure 2. Based on Property 3 of Autocorrelation Function, rk is approximately normally distributed with mean 0 and standard error 1/√n.
Since n = 300, a 95% confidence interval for rk is
0 ± NORM.S.INV(.025)/SQRT(300) = ±0.11316

Figure 2 shows 40 values for rk. We would expect that about 40(.05) = 2 of these values would be outside the 95% confidence interval. In fact, two ACF values are outside this range, namely r9 = .11842 and r19 = .13366.
Using the Ljung-Box test, we see that none of the 40 ACF values is significantly different from zero:
p-value = CHISQ.DIST.RT(46.2803,40) = .229 > .05 = α


We can perform similar tests for the PACF values.
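A Python sketch of the same simulation and check, assuming numpy; the seed, and therefore which lags fall outside the band, will of course differ from the Excel run.

import numpy as np

rng = np.random.default_rng(0)
y = rng.standard_normal(300)      # analogue of =NORM.S.INV(RAND()) copied down

n = len(y)
d = y - y.mean()
r = np.array([np.sum(d[:n - k] * d[k:]) / np.sum(d ** 2) for k in range(1, 41)])

band = 1.959964 / np.sqrt(n)      # 95% band, about ±0.1132 for n = 300
print(np.flatnonzero(np.abs(r) > band) + 1)  # lags whose ACF falls outside the band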
Random Walk
Basic Concepts
A random walk time series y1, y2, …, yn takes the form
yi = δ + yi-1 + εi
where the εi are independently and identically distributed with mean 0 and variance σ².
If δ = 0, then the random walk is said to be without drift, while if δ ≠ 0, then the
random walk is with drift (i.e. with drift equal to δ).
It is easy to see that for i > 0
yi = y0 + δi + Σj=1 to i εj
It then follows that E[yi] = y0 + δi and var(yi) = σ²i. The variance values are not constants but vary with time i, and so this type of time series is not stationary. Also, the mean values are constant only for a random walk without drift.
Note too that since cov(εi, εj) = 0 for i ≠ j, it follows that
cov(yi, yj) = σ²·min(i, j)
Note that the first difference zi = yi – yi-1 of a random walk is stationary since it takes the form
zi = δ + εi
which is a purely random time series.

Plot
Example 1: Graph the random walk with drift yi = 1 + yi-1 + εi where the εi ∼ N(0, .5).
The graph is shown in Figure 1. All the cells in column B contain the formula
=NORM.INV(RAND(),0,.5), cell C4 contains the formula =1+B4 and cell C5
contains the formula =1+B5+C4.
As we can see, the graph shows a clear upward trend and the ACF shows a slow
descent.

Figure 1 – Random Walk


First differences are taken between the y values as shown in Figure 2. E.g. cell C5
contains the formula = B5-B4 (where column B replicates the values in column C
from Figure 1). We see from the chart that the trend has been eliminated. We also
see from the Ljung-Box test (cell F13) that the ACF values for the first 7 lags are
statistically equal to zero, consistent with a purely random process.
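A numpy sketch of the same experiment (simulate, then difference), with an assumed seed and drift δ = 1:

import numpy as np

rng = np.random.default_rng(1)
eps = rng.normal(0.0, 0.5, size=100)  # analogue of =NORM.INV(RAND(),0,.5)

y = np.cumsum(1.0 + eps)              # y_i = 1 + y_{i-1} + eps_i, with y_0 = 0

z = np.diff(y)                        # z_i = 1 + eps_i: drift plus white noise
print(y[:5], z[:5])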

Figure 2 – First differences of a random walk
Deterministic Trend
A time series with a (linear) deterministic trend can be modeled as
yi = μ + δi + εi
Now E[yi] = μ + δi and var(yi) = σ2, and so while the variance is a constant, the mean
varies with time i; consequently, this type of time series is also not stationary.
These types of time series can be transformed into a stationary time series
by detrending, i.e. by setting zi = yi – δi. In this case zi = μ + εi, which is a purely
random time series.
In a similar fashion, we can speak about a quadratic deterministic trend (yi = μ + δ1i + δ2i² + εi) or various other varieties of deterministic trends.
The types of random walks described previously are said to have a stochastic trend.
We can also have random walks with a deterministic trend, which combine a random walk with a trend term δi, where δ is a constant. These are not stationary and require both differencing and detrending to be transformed into a stationary time series.

Example 1: Graph the time series with deterministic trend yi = i + εi where the εi ∼ N(0,1).
The graph is shown in Figure 1. All the cells in column B contain the formula
=NORM.S.INV(RAND()) and cell C4 contains the formula =A4+B4 (and similarly
for the other cells in column C).
As we can see, once again the graph shows a clear upward trend and the ACF shows
a slow descent.

Figure 1 – Deterministic Trend


This time we get rid of the trend by detrending as shown in Figure 2. E.g. cell C4
contains the formula = B4-A4 (where column B replicates the values in column C
from Figure 1). We see from the chart that the trend has been eliminated. We also
see from the Ljung-Box test (cell F13) that the ACF values for the first 7 lags are
statistically equal to zero, consistent with a purely random process.

Figure 2 – Detrending
Dickey-Fuller Test
We consider the stochastic process of the form
yi = φyi-1 + εi
where |φ| ≤ 1 and εi is white noise. If |φ| = 1, we have what is called a unit root. In particular, if φ = 1, we have a random walk (without drift), which is not stationary.
In fact, if |φ| = 1, the process is not stationary, while if |φ| < 1, the process is
stationary. We won’t consider the case where |φ| > 1 further since in this case the
process is called explosive and increases over time.
This process is a first-order autoregressive process, AR(1), which we study in more
detail in Autoregressive Processes. We will also see why such processes without a
unit root are stationary and why the term “root” is used.
The Dickey-Fuller test is a way to determine whether the above process has a unit root. The approach used is quite straightforward. First calculate the first difference, i.e.
yi – yi-1 = (φ – 1)yi-1 + εi
If we use the delta operator, defined by Δyi = yi – yi-1, and set β = φ – 1, then the equation becomes the linear regression equation
Δyi = βyi-1 + εi
where β ≤ 0, and so the test for φ is transformed into a test that the slope parameter β = 0. Thus, we have a one-tailed test (since β can't be positive) where
H0: β = 0 (equivalent to φ = 1)
H1: β < 0 (equivalent to φ < 1)
If b is the ordinary least squares (OLS) estimate of β, then φ̂ = 1 + b is the OLS estimate of φ.
We can use the usual linear regression approach, except that when the null hypothesis holds, the t coefficient doesn't follow a normal distribution, and so we can't use the usual t-test. Instead, this coefficient follows a tau distribution, and so our test consists of determining whether the tau statistic τ (which is equivalent to the usual t statistic) is less than τcrit based on the table of critical tau statistic values shown in Dickey-Fuller Table.
If the calculated tau value is less than the critical value in the table, then we have a significant result (the time series is stationary); otherwise, we cannot reject the null hypothesis that there is a unit root and the time series is not stationary.
There are the following three versions of the Dickey-Fuller test:
Type 0 (no constant, no trend): Δyi = β1yi-1 + εi
Type 1 (constant, no trend): Δyi = β0 + β1yi-1 + εi
Type 2 (constant and trend): Δyi = β0 + β1yi-1 + β2i + εi

Each version of the test uses a different set of critical values, as shown in the Dickey-Fuller Table. It is important to select the correct version of the test for the time series being analyzed. Note that the type 2 test assumes there is a constant term (which may turn out not to be significantly different from zero).
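The mechanics of the test are just an OLS regression whose t statistic is read against the tau table. Here is a numpy sketch of the type 1 and type 2 regressions (an illustration, not the Real Statistics implementation):

import numpy as np

def df_tau(y, trend=False):
    """Tau statistic for the type 1 (or, with trend=True, type 2) DF regression."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                        # delta y_i = y_i - y_{i-1}
    n = len(dy)
    cols = [np.ones(n), y[:-1]]            # constant and lagged level y_{i-1}
    if trend:
        cols.append(np.arange(1.0, n + 1)) # deterministic trend term
    X = np.column_stack(cols)
    b, *_ = np.linalg.lstsq(X, dy, rcond=None)
    e = dy - X @ b
    s2 = e @ e / (n - X.shape[1])          # residual variance
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return b[1] / se                       # compare to the Dickey-Fuller tau table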
Example 1: The net daily earnings of a small-time gambler are listed in column B of Figure 1. Use the Dickey-Fuller test to determine whether the time series is stationary.

We start by assuming that the correct model is type 1, namely constant but no
trend.

Figure 1 – Regression on time-series data


Since we are using the regression model
Δyi = β0 + β1yi-1 + εi
(constant, no trend), we use the Real Statistics Linear Regression data analysis tool using range B4:B27 as the X data range and D5:D28 as the Y data range. Note that the values in column D are calculated by placing the formula =B5-B4 in cell D5, highlighting the range D5:D28 and pressing Ctrl-D.
The output from the regression analysis is shown on the right side of Figure 1. In particular, we see that the t statistic (cell I20) for the β1 coefficient is -1.91613. This is the tau statistic. We now look in the Dickey-Fuller Table and find that the tau critical value for a type 1 test is -2.986 when n = 25 and α = .05. Since τ = -1.91613 > -2.986 = τcrit, we cannot reject the null hypothesis that the time series is not stationary.
Note that the β1 coefficient (cell G20) is negative as expected. If instead, the
coefficient were positive, then we would know that this type of Dickey-Fuller test
was inappropriate since β1 = φ – 1 ≤ 0.
We now display in Figure 2 a plot of the time series values from Figure 1.

Figure 2 – Chart of Winnings by Day
We see that there is an apparent downward trend towards the end of the 25-day period, and so it is not surprising that the time series is not stationary. In fact, this leads us to choose the type 2 Dickey-Fuller test (with constant and trend). The result of this test is shown in Figure 3.

Figure 3 – Dickey-Fuller with trend

Since we are using the regression model
Δyi = β0 + β1i + β2yi-1 + εi
this time, we use A4:B27 from Figure 1 as the X data range and D5:D28 as the Y
data range. We see from Figure 3 that the t statistic (cell I21) for the β2 coefficient is -2.91345. We now look in the Dickey-Fuller Table and find that the tau critical value is -3.60269 for a type 2 test when n = 25 and α = .05. Since τ = -2.91345 > -3.60269 = τcrit, we again cannot reject the null hypothesis that the time series is not stationary.
Real Statistics Function: The Real Statistics Resource Pack provides the following
array function where R1 contains a column of time series data.
ADFTEST(R1, lab, , , type, alpha): returns a 3 × 1 range which contains the
following values: tau-statistic, tau-critical, yes/no (stationary or not)
If lab = TRUE (default is FALSE), the output consists of a 3 × 2 range whose first
column contains labels. type = the test type (0, 1, 2, default is 1). The default value
for alpha is .05.
Note that for the type 2 test for Example 1, the output from the array formula
=ADFTEST(R6:R30,TRUE,,,2,U9)
agrees with the results we obtained above, as displayed in Figure 4.

Figure 4 – Output from ADFTEST function


Note that the ADFTEST function can also be used to conduct the Augmented
Dickey-Fuller test (ADF). See Augmented Dickey-Fuller Test. In fact, the
ADFTEST function can take additional arguments and output other values, as
explained on that webpage.
Real Statistics Function: The Real Statistics Resource Pack provides the following
functions
ADFCRIT(n, alpha, type) = critical value, tau-crit, for the stated type of ADF test
at the stated alpha value, when the time series has n elements
ADFPROB(x, n, type) = estimated p-value (based on linear interpolation) for the
ADF test at x
Thus, for Example 1, ADFCRIT(25,.05,2) = -3.60269. Also, ADFPROB(-2.91345,25,2) = ">.1". ADFPROB returns values between .01 and .10; values greater than .1 are output as ">.1" and values less than .01 are output as "<.01". Note that in the constant-without-trend case, if the tau-stat were -2.91345, then p-value = ADFPROB(-2.91345,25,1) = .060127.

Note that the ADFCRIT function will return critical values for alpha = .01, .025, .05
and .10 and for values of n found in the Dickey-Fuller Table as well as for values
of alpha and n not included in the table.

Augmented Dickey-Fuller Table

If the calculated tau value is less than the critical value in the table above, then we have a significant result; otherwise, we cannot reject the null hypothesis that there is a unit root and the time series is not stationary.
The following is a more precise way of estimating these critical values:
crit = t + u/N + v/N² + w/N³
where the constants t, u, v and w depend on the test type and the value of alpha:

Augmented Dickey-Fuller Test


In Dickey-Fuller Test we describe the Dickey-Fuller test which determines
whether an AR(1) process has a unit root, i.e. whether it is stationary. We
now extend this test to AR(p) processes.
For the AR(1) process
yi = φyi-1 + εi
we take the first difference to obtain the equivalent form
Δyi = βyi-1 + εi
where Δyi = yi – yi-1 and β = φ – 1, and test the hypothesis
H0: β = 0 (equivalent to φ = 1)
H1: β < 0 (equivalent to φ < 1)
If |φ| = 1, we have what is called a unit root (i.e. the time series is not
stationary). We have three versions of the test.
Type 0 (no constant, no trend): Δyi = β1yi-1 + εi
Type 1 (constant, no trend): Δyi = β0 + β1yi-1 + εi
Type 2 (constant and trend): Δyi = β0 + β1yi-1 + β2i + εi
The extension to AR(p) processes adds lagged difference terms δ1Δyi-1 + ⋯ + δp-1Δyi-p+1 to each regression, giving the following three versions:
Type 0 (no constant, no trend): Δyi = β1yi-1 + Σj δjΔyi-j + εi
Type 1 (constant, no trend): Δyi = β0 + β1yi-1 + Σj δjΔyi-j + εi
Type 2 (constant and trend): Δyi = β0 + β1yi-1 + β2i + Σj δjΔyi-j + εi
where the sums run over j = 1 to p – 1.

Once you know how many lags to use, the augmented test is identical to the
simple Dickey-Fuller test. We can use the Akaike Information Criterion
(AIC) or Bayesian Information Criteria (BIC) to determine how many lags to
consider, as described in Comparing ARIMA Models.
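If the statsmodels package is available, its adfuller function runs the same augmented test with automatic lag selection; a short sketch on an illustrative random-walk series (argument names per recent statsmodels versions):

import numpy as np
from statsmodels.tsa.stattools import adfuller

y = np.cumsum(np.random.default_rng(2).standard_normal(50))  # illustrative series

# regression='c' is constant/no trend (type 1); 'ct' adds a trend (type 2)
tau, pvalue, usedlag, nobs, crit, icbest = adfuller(y, maxlag=8,
                                                    regression='c', autolag='AIC')
print(tau, crit['5%'], pvalue, usedlag)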
Thus we can now use the full version of the ADFTEST function which was
introduced in Dickey-Fuller Test.
Real Statistics Function: The Real Statistics Resource Pack provides the
following array function where R1 contains a column of time series data.
ADFTEST(R1, lab, lag, criteria, type, alpha): returns an 8 × 1 range which
contains the following values: tau-statistic, tau-critical, yes/no (stationary
or not), AIC value, BIC value, # of lags (p), the first-order autoregression
coefficient and estimated p-value.
If lab = TRUE (default is FALSE), the output consists of an 8 × 2 range whose first column contains labels. type = the test type (0, 1, 2, default is 1). The default value for alpha is .05.
The arguments lag and criteria, which were not used for the Dickey-Fuller
Test, are defined as follows:
• lag = the maximum number of lags to use in the test (default 0)
• criteria = “none” : no criteria is used, and so p is set to the value
of lag
• criteria = “aic” : the AIC is used to determine the number of
lags p (where p ≤ lag)
• criteria = “bic” : the BIC is used to determine the number of
lags p (where p ≤ lag)
To specify the criteria, you can use “AIC” or 1 instead of “aic”, you can use
“BIC” or 2 instead of “bic” and you can use “” or 0 instead of “none”.

If lag < 0 then lag will automatically be set to value
=Round(12*(n/100)^.25,0), as proposed by Schwert, where n = the number
of elements in the time series.
To specify the test type, you can use “” or “none” instead of 0, you can use
“drift” or “constant” instead of 1 and you can use “trend” or “both” instead of
2.
Example 1: Determine whether the data in column A of Figure 1 has a unit root, based on a model without trend, using the Schwert estimate for the maximum number of lags and the AIC criterion. Also, determine whether there is a unit root based on a model with trend and a maximum number of lags equal to 7, again using the AIC criterion.

Figure 1 – Time Series


Here range J4:K8 contains the array formula =DescStats(A3:A22,TRUE). We see that the mean value of the time series is 2.376, and so we conclude that the time series likely has a non-zero mean. We could confirm this by using a t-test to see whether the population mean is significantly different from zero.
We now use the array formula =ADFTEST(A3:A22,TRUE,-1) to show the
results of the ADF test without trend. The -1 means that we are using the
Schwert estimate for the maximum number of lags. We are also using the

default type = 1, which results in the test for constant without trend. As we
can see from range P4:P11 in Figure 2, since tau-stat > tau-crit, the time
series is not stationary.

Figure 2 – ADF Test


Note that the above formula effectively uses a maximum lag count of 8, which can be seen by using the formula =ROUND(12*(K4/100)^0.25,0) in cell K10 from Figure 1.
Looking at the chart in Figure 1, it appears that the time series has a trend,
and so we repeat the ADF Test with constant and trend to get the results
shown in range S4:T11 of Figure 2 using the array formula
=ADFTEST(A3:A22,TRUE,7,”aic”,2). Here type = 2 (constant and trend) and
maximum number of lags = 7. Note that we didn’t use 8 as the maximum
number of lags since that would produce error values (based on insufficient
degrees of freedom in the underlying regression analysis).
Real Statistics Data Analysis Tool: As explained in Time Series Testing
Tools, the Time Series Testing data analysis tool can be used to perform
the Dickey-Fuller Test. In fact, it can also be used to perform the Augmented
Dickey-Fuller Test.

Time Series Testing Tools
Real Statistics Data Analysis Tool: The Real Statistics Resource Pack
provides the Time Series Testing data analysis tool which consolidates many
of the capabilities described in this part of the website.
To use this tool for the data in Example 1 of Stationary Process (repeated in Figure 1), press Ctrl-m and choose the Testing option from the Time S tab (or from the Time Series option if using the original user interface) and click on the OK button. Now, fill in the dialog box that appears as shown in Figure 1.
In Figure 1 we have inserted the time series values in the Input Range field,
without column heading or date information. You can optionally request the
ACF, ACVF and/or PACF values by placing a positive integer value in the
corresponding field. In Figure 1 we have requested the ACF(k) values for
lags k = 1, 2, 3, 4, 5. We could have requested any combination of ACF, ACVF
or PACF values, or none at all.
Similarly, we can request any combination of the white noise tests (Bartlett’s,
Box-Pierce, Ljung-Box) or none at all by inserting a positive integer in the
corresponding field. In Figure 1 we have requested only the Ljung-Box test
on ACF for lags up to 5.
Finally, we can optionally request the ADF test by inserting a non-negative
integer value (including 0) in the # of Lags field or by leaving this field
empty and selecting the Schwert option. We can also request to use
the Drift or Trend options of the test. In Figure 1, we have requested the
test with drift based on the number of lags specified by the Schwert criterion.

Figure 1 – Time Series Testing data analysis dialog box
The Schwert criterion calculates the lag based on the Excel formula
=ROUND(12 * (n / 100) ^ (1 / 4), 0)
which in this case is ROUND(12*(22/100)^(1/4),0) = 8. The AIC criterion is then used with lags = 8.
The output for this analysis is shown in Figure 2 (only the first 15 of the 22
data elements are shown in columns D and E).

Figure 2 – Time Series data analysis
The first 5 ACF values are shown in column H. The Ljung-Box test gives a
significant result (cell M6), which means that at least one of the first 5
autocorrelations is significantly different from zero. The Augmented Dickey-
Fuller test shows that the time series is not stationary (cell P13).
Example 2: Repeat Example 1 on the first differences of the data in Example 1.
We fill in the dialog box shown in Figure 1 with two changes, namely we change the # of Diff field from the default value of zero to 1 and use the ADF test without drift. The result is shown in Figure 3 (only the first 15 of 21 data values are shown in columns D and E).

Figure 3 – Time Series data analysis after differencing
This time we see that the first five ACF values are not statistically different
from zero (cell M6) and that the data is stationary (cell P13).
Correlogram
Real Statistics Data Analysis Tool: The Real Statistics Resource Pack
provides the Correlogram data analysis tool which outputs an ACF or
PACF correlogram that includes the confidence intervals.
ACF Correlogram
Example 1: Construct an ACF Correlogram for the data in column A of
Figure 1 (only the first 18 of 56 data elements are visible).

Figure 1 – ACF Correlogram
Press Ctrl-m and choose the Time Series option (or the Time S tab if using the Multipage interface). Select the Correlogram option and click on the OK button. Now, fill in the dialog box that appears as shown in Figure 2. Since the # of Lags field was left blank, the default of 30 was used.

Figure 2 – Correlogram dialog box


After clicking on the OK button, the output shown on the right side of Figure
1 appears. Note that the alpha value in cell F3 is automatically set to .05. You
can change this to any value you like between 0 and 0.5 and all the cells as
well as the chart will change to reflect your choice for alpha.

Note that cell D7 contains the formula =ACF($A$4:$A$59,C7), cell E7
contains the formula =-F7 and cell F7 contains the formula
=NORM.S.INV(1-$F$3/2)/SQRT(COUNT($A$4:$A$59))
The remaining values in columns D and E (until row 36, corresponding to
lag 30) are calculated using similar formulas. Cell F8 contains the formula
=NORM.S.INV(1-$F$3/2)*SQRT((1+2*SUMSQ(D$7:D7))/COUNT($A$4:$A$59))
and similarly for the other cells in column F. This reflects the fact that the standard error of ACF(k) is
s.e.(rk) = √((1 + 2 Σj=1 to k-1 rj²)/n)
and the confidence interval is 0 ± zα/2 · s.e.(rk), where zα/2 = NORM.S.INV(1 – α/2).
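The same bands can be computed in Python; a small sketch assuming numpy and scipy (y stands for whatever series you are charting):

import numpy as np
from scipy.stats import norm

def acf_with_bands(y, nlags, alpha=0.05):
    """ACF values plus the widening confidence band used in the correlogram."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    d = y - y.mean()
    r = np.array([np.sum(d[:n - k] * d[k:]) / np.sum(d ** 2)
                  for k in range(1, nlags + 1)])
    z = norm.ppf(1 - alpha / 2)
    # se(r_k) = sqrt((1 + 2 * sum_{j<k} r_j^2) / n); the k=1 band is z/sqrt(n)
    cum = np.concatenate(([0.0], np.cumsum(r ** 2)[:-1]))
    band = z * np.sqrt((1 + 2 * cum) / n)
    return r, band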

PACF Correlogram
Example 2: Construct a PACF Correlogram for the data in column A of
Figure 1.
This time the PACF option from the dialog box in Figure 2 is selected. The output is shown in Figure 3 (only the first 15 of 30 lags are shown).

Figure 3 – PACF Correlogram


Cell Q7 contains the formula =PACF($A$4:$A$59,P7), cell R7 contains =-S7
and cell S7 contains =NORM.S.INV(1-$S$3/2)/SQRT(COUNT(A4:A59)).

This reflects the fact that the standard error of PACF(k) is approximately 1/√n, so the confidence interval is 0 ± NORM.S.INV(1 – α/2)/√n.
Handling Missing Time Series Data


When data is missing in a time series, we can use some form of imputation
or interpolation to impute a missing value. In particular, we consider the
approaches described in Figure 1.
Numeric label | Text label | Imputation type
0 | linear | linear interpolation
1 | spline | spline interpolation
2 | prior | use prior value
3 | next | use next value
-1 | sma | simple moving average
-2 | wma | weighted moving average
-3 | ema | exponential moving average

Figure 1 – Imputation Approaches
Example
Example 1: Apply each of these approaches for the time series with missing
entries in column E of Figure 2. The full time series is shown in column B.

Figure 2 – Imputation Examples
Linear interpolation
Each missing value is imputed by linear interpolation between the nearest non-missing values on either side. The missing values in cells E10, E15 and E18 are imputed in this way, as shown in cells G10, G15 and G18.
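With pandas available, the linear, prior and next approaches correspond directly to built-in Series methods; the values below are illustrative, not the Figure 2 data:

import numpy as np
import pandas as pd

s = pd.Series([10, 12, np.nan, 23, 25, np.nan, 31, 33], dtype=float)

linear = s.interpolate(method='linear')  # straight-line fill between neighbours
prior = s.ffill().bfill()                # previous value (first value if none before)
nxt = s.bfill().ffill()                  # next value (last value if none after)
print(pd.DataFrame({'linear': linear, 'prior': prior, 'next': nxt}))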

Spline interpolation
To create the spline interpolation for the four missing values, first, create the
table in range O3:P14 by removing all the missing values. This can be done
by placing the array formula =DELROWBLANK(D3:E18,TRUE) in range
O3:P14, as shown in Figure 3. Next place the array formula
=SPLINE(R4:R18,O4:O14,P4:P14) in range S4:S18 (or in range H4:H18 of
Figure 2).

Figure 3 – Spline interpolation
The chart of the spline curve is shown on the right side of Figure 3. The
imputed values are shown in red on the chart.
See Spline Fitting and Interpolation for additional information.
Prior/Next
For Next the next non-missing value is imputed (or the last non-missing
value if there is no next non-missing value), while for Prior the previous
non-missing value is imputed (or the first non-missing value if there is no
previous non-missing value).
The missing value in cell E9 is imputed as 23 (cell J9) when using Next and
12 (cell I9) when using Prior. The missing value in cell E18 is imputed as 75
(cell I18 or J18) when using Prior or Next.
Simple Moving Average
The imputed value depends on the span value k, which is a positive integer. To impute the missing values, we first use linear interpolation, as shown in column AE of Figure 4. For any missing values in the first or last k elements in the time series, we simply use the linear interpolation value. For the others, we use the mean of the 2k+1 linearly interpolated values centered on the missing value (see the sketch after this paragraph).
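A Python sketch of this simple-moving-average rule, assuming numpy and pandas:

import numpy as np
import pandas as pd

def sma_impute(s, k=3):
    """Impute NaNs: linear interpolation first, then a centered 2k+1 mean."""
    s = pd.Series(s, dtype=float)
    lin = s.interpolate(method='linear').ffill().bfill()  # step 1: linear fill
    out = s.copy()
    n = len(s)
    for i in np.flatnonzero(s.isna().to_numpy()):
        if i < k or i >= n - k:
            out.iloc[i] = lin.iloc[i]                       # edges: keep linear value
        else:
            out.iloc[i] = lin.iloc[i - k:i + k + 1].mean()  # mean of 2k+1 values
    return out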
In Figure 2 we use a span value of k = 3. To show how the values in column K of Figure 2 are calculated, we calculate the linearly interpolated values as shown in column AE of Figure 4. Next, we place the formula =IF(AD4="",AE4,AD4) in cell AF4, highlight range AF4:AF6 (i.e. a column range with k = 3 elements) and press Ctrl-D. Similarly, we copy the formula in cell AF4 into the last 3 cells in column AF.
Next, we place the formula =IF(AD7="",AVERAGE(AE4:AE10),AD7) in cell AF7, highlight the range AF7:AF15 (i.e. all the cells in column AF that haven't yet been filled in), and press Ctrl-D. The imputation shown in column K is identical to that shown in column AF.

Figure 4 – Moving Average Imputations


Weighted Moving Average
The approach is similar to the simple moving average approach, except that
now weights are used. For example, the first missing time series element
occurs at time t = 6. Thus, we weight the linear imputed values in column AE
of Figure 4 by 1 for t = 6, by 1/2 for t = 5 or 7, by 1/3 for t = 4 or 8, and by 1/4
for t = 3 or 9. The calculation of the imputed value at t = 6 is shown in Figure
5.

Figure 5 – WMA for t = 6
Here, cell AK4 contains the formula =1/(ABS(AJ$7-AJ4)+1), cell AL4 contains =AE6 and cell AM4 contains =AK4*AL4. We can now highlight the range AK4:AM10 and press Ctrl-D to fill in the other values. Then we sum the weights to obtain the value 3.16667 as shown in cell AK11 and sum the products to obtain the value 60.66667 as shown in cell AM11. The imputed value is thus 60.66667 divided by 3.16667, i.e. 19.15789, as shown in cell AM12.
This is the value shown in cell AG9 of Figure 4. In fact, we can fill in column AG of Figure 4 as follows. First, insert the worksheet formula =1+2*SUMPRODUCT(1/AC5:AC7) in cell AG20. Next, fill in the first 3 and last 3 values in column AG by using the values in column AE. Finally, insert the following formula in cell AG7, highlight range AG7:AG15, and press Ctrl-D.
=IF(AD7="",SUMPRODUCT(AE4:AE10,1/(ABS(AC4:AC10-AC7)+1))/AG$20,AD7)
Exponential (Weighted) Moving Average
The approach is identical to that of the weighted moving average except that the weights are powers of 1/2. Now, we weight the linearly interpolated values in column AE of Figure 4 by 1 for t = 6, by 1/2 for t = 5 or 7, by 1/4 for t = 4 or 8, and by 1/8 for t = 3 or 9.
The calculation of the imputed value at t = 6 is as shown in Figure 5, except that the formula used for cell AK4 is now =1/2^ABS(AJ$7-AJ4) (and similarly for the other cells in column AK). The result is as shown in column AH of Figure 4. This time, cell AH7 contains the formula
=IF(AD7="",SUMPRODUCT(AE4:AE10,1/2^ABS(AC4:AC10-AC7))/AH$20,AD7)
and cell AH20 contains the formula =1+2*SUMPRODUCT(1/2^AC4:AC6).
Worksheet Function
Real Statistics Function: For a time series represented as a column array
where any non-numeric values are treated as missing, the Real Statistics
Resource Pack supplies the following array function:
TSImputed(R1, itype, k): returns a column array of the same size as R1 where each missing element in R1 is imputed based on the imputation type itype, which is either a number or text string as shown in Figure 1 (default 0 or "linear"), and k = the span (default 2), which is only used with the three moving average imputation types.
For example, =TSImputed(E4:E18,"ema",3) returns the time series shown in range M4:M18 of Figure 2.
Seasonality
If the time series has a seasonal component, then we can combine one of the
imputation approaches described in Figure 1 with a seasonality imputation
approach as described in Handling Missing Seasonal Time Series Data.

Autoregressive Processes
A p-order autoregressive process, denoted AR(p), takes the form
yi = φ0 + φ1yi-1 + φ2yi-2 + ⋯ + φpyi-p + εi
Thinking of the subscripts i as representing time, we see that the value of y at time i is a linear function of y at earlier times plus a fixed constant and a random error term. Similar to the ordinary linear regression model, we assume that the error terms are independently distributed based on a normal distribution with zero mean and a constant variance σ², and that the error terms are independent of the y values.
Topics:
• Basic Concepts
• Characteristic Equation
• Partial Autocorrelation
• Finding Model Coefficients using ACF/PACF
• Finding Model Coefficients using Linear Regression
• Lag Function Representation
• Augmented Dickey-Fuller Test
• Other Unit Root Tests

Autoregressive Processes Basic Concepts


In a simple linear regression model, the predicted dependent variable is modeled as a linear function of the independent variable plus a random error term:
yi = β0 + β1xi + εi
A first-order autoregressive process, denoted AR(1), takes the form
yi = φ0 + φ1yi-1 + εi
Thinking of the subscripts i as representing time, we see that the value of y at any time is a linear function of y at the previous time plus a fixed constant and a random error term. Similar to the ordinary linear regression model, we assume that the error terms are independently distributed based on a normal distribution with zero mean and a constant variance σ², and that the error terms are independent of the y values. Thus
εi ∼ N(0, σ²)
It turns out that such a process is stationary when |φ1| < 1, and so we will make this assumption as well. Note that if |φ1| = 1 we have a random walk.
Similarly, a second-order autoregressive process, denoted AR(2), takes the form
yi = φ0 + φ1yi-1 + φ2yi-2 + εi
and a p-order autoregressive process, AR(p), takes the form
yi = φ0 + φ1yi-1 + ⋯ + φpyi-p + εi
Property 1: The mean of the yi in a stationary AR(p) process is
μ = φ0/(1 – φ1 – ⋯ – φp)
Proof: click here
Property 2: The variance of the yi in a stationary AR(1) process is
var(yi) = σ²/(1 – φ1²)
Proof: click here
Property 3: The lag h autocorrelation in a stationary AR(1) process is
ρh = φ1^h
Proof: click here
Example 1: Simulate a sample of 100 elements from the AR(1) process
yi = 5 + .4yi-1 + εi
where εi ∼ N(0,1), and calculate the ACF.
Thus φ0 = 5, φ1 = .4 and σ = 1. We simulate the independent εi by using the Excel formula =NORM.INV(RAND(),0,1) or =NORM.S.INV(RAND()) in column B of Figure 1 (only the first 20 of 100 values are displayed).
The value of y1 is calculated by placing the formula =5+0.4*0+B4 in cell C4 (i.e. we arbitrarily assign the value zero to y0). The other yi values are calculated by placing the formula =5+0.4*C4+B5 in cell C5, highlighting the range C5:C103 and pressing Ctrl-D.
By Properties 1 and 2, the theoretical values for the mean and variance are μ = φ0/(1–φ1) = 5/(1–.4) = 8.33 (cell F22) and σ²/(1–φ1²) = 1/(1–.4²) = 1.19 (cell F23). These compare to the actual time series values of ȳ = AVERAGE(C4:C103) = 8.23 (cell I22) and s² = VAR.S(C4:C103) = 1.70 (cell I23).
The time series ACF values are shown for lags 1 through 15 in column F. These are calculated from the y values as in Example 1 of Autocorrelation Function. Note that the ACF value at lag 1 is .394376. Based on Property 3, the population ACF value at lag 1 is ρ1 = φ1 = .4. Theoretically, the values ρh = φ1^h = .4^h should get smaller and smaller as h increases (as shown in column G of Figure 1).
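A numpy sketch of the same simulation (seed and resulting sample values are illustrative):

import numpy as np

rng = np.random.default_rng(3)
phi0, phi1, n = 5.0, 0.4, 100

y = np.empty(n)
prev = 0.0                                  # arbitrary starting value y_0 = 0
for i in range(n):
    y[i] = phi0 + phi1 * prev + rng.standard_normal()
    prev = y[i]

print(y.mean(), phi0 / (1 - phi1))          # sample mean vs mu = 8.33
print(y.var(ddof=1), 1 / (1 - phi1 ** 2))   # sample variance vs 1.19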

Figure 1 – Simulated AR(1) process
The graph of the y values is shown on the left of Figure 2. As you can see, no
particular pattern is visible. The graph of ACF for the first 15 lags is shown
on the right side of Figure 2. As you can see, the actual and theoretical values
for the first two lags agree, but after that, the ACF values are small but not
particularly consistent.

Figure 2 – Graphs of simulated AR(1) process and ACF
Observation: Based on Property 3, for 0 < φ1 < 1, the theoretical values of
ACF converge to 0. If φ1 is negative, -1 < φ1 < 0, then the theoretical values of
ACF also converge to 0, but alternate in sign between positive and negative.
Property 4: For any stationary AR(p) process, the autocovariance at lag k > 0 can be calculated as
γk = φ1γk-1 + φ2γk-2 + ⋯ + φpγk-p
Similarly, the autocorrelation at lag k > 0 can be calculated as
ρk = φ1ρk-1 + φ2ρk-2 + ⋯ + φpρk-p
Here we assume that γh = γ-h and ρh = ρ-h if h < 0, and ρ0 = 1. These are known as the Yule-Walker equations.
Proof: click here
Property 5: The Yule-Walker equations also hold when k = 0, provided we add a σ² term to the sum. This is equivalent to
γ0 = φ1γ1 + φ2γ2 + ⋯ + φpγp + σ²
Observation: In the AR(1) case, we have
γ1 = φ1γ0
and
γ0 = φ1γ1 + σ²
Solving for γ0 yields
γ0 = σ²/(1 – φ1²)
In the AR(2) case, we have
ρ1 = φ1 + φ2ρ1
Solving for ρ1 yields
ρ1 = φ1/(1 – φ2)
Also
ρ2 = φ1ρ1 + φ2
We can also calculate the variance as follows:
γ0 = φ1γ1 + φ2γ2 + σ² = φ1ρ1γ0 + φ2ρ2γ0 + σ²
Solving for γ0 yields
γ0 = σ²/(1 – φ1ρ1 – φ2ρ2)
This value can be re-expressed algebraically as described in Property 7 below.
Property 6: The following hold for a stationary AR(2) process:
ρ1 = φ1/(1 – φ2) and ρ2 = φ1ρ1 + φ2
Proof: Follows from Property 4, as shown above.
Property 7: The variance of the yi in a stationary AR(2) process is
γ0 = σ²(1 – φ2)/[(1 + φ2)((1 – φ2)² – φ1²)]
Proof: click here for an alternative proof.


Characteristic Equation for AR(p) Processes
Property 1: An AR(p) process is stationary provided all the roots of the following polynomial equation (called the characteristic equation) have an absolute value greater than 1:
1 – φ1z – φ2z² – ⋯ – φpz^p = 0
This is equivalent to saying that any z that satisfies the characteristic equation has |z| > 1.
In fact, setting w = 1/z, this is equivalent to saying that |w| < 1 for any w that satisfies the following equation (the reverse characteristic equation):
w^p – φ1w^(p-1) – ⋯ – φp = 0
By the Fundamental Theorem of Algebra, any pth degree polynomial has p roots; i.e. there are p values of z that satisfy the above equation. Unfortunately, not all of these roots need to be real; some can involve “imaginary” numbers such as √-1, which is usually abbreviated by the letter i. For example, the equation z² + 1 = 0 has the roots i and –i, as can be seen by substituting either of these values for z.
We now give three properties of imaginary numbers, which will help us avoid discussing imaginary numbers in any further detail:
• all values which involve imaginary numbers can be expressed in the form a + bi where a and b are real numbers
• if a + bi is a root of a pth degree polynomial, then so is a – bi
• if z = a + bi, then the absolute value of z is defined by |z| = √(a² + b²)
Since a and b are real numbers, not involving √-1, we only need to deal with real numbers.
Property 2: An AR(1) process is stationary provided |φ1| < 1
Property 3: An AR(2) process is stationary provided
|φ2| < 1 and |φ1| + φ2 < 1
Example 1: Determine whether the following AR(2) process is stationary:
yi = 2yi-1 – .5yi-2 + εi
The roots of the reverse characteristic equation w² – 2w + .5 = 0 are
w = 1 ± √.5
This process is not stationary since 1 + √.5 ≈ 1.71 ≥ 1. You get the same result via Property 3 since |φ1| + φ2 = 2 – .5 = 1.5 ≥ 1.
Example 2: Determine whether the following AR(2) process is stationary:
yi = 5 + .4yi-1 – .1yi-2 + εi
Since
φ1² + 4φ2 = .16 – .4 = –.24 < 0
the roots of the reverse characteristic equation are not real. In fact
w = (.4 ± √(–.24))/2 = .2 ± .2449i
Thus
|w| = √(.2² + .2449²) = √.1 ≈ .316 < 1
and so we see that this AR(2) process is stationary. We get the same result via Property 3 since
|φ2| = .1 < 1 and |φ1| + φ2 = .4 – .1 = .3 < 1
Observation: It turns out that by Property 4 of Basic AR Concepts, for any k, ρk can be expressed as a linear combination
ρk = b1w1^k + b2w2^k + ⋯ + bpwp^k
for some constants b1, …, bp, where w1, …, wp are the distinct roots of the reverse characteristic equation
w^p – φ1w^(p-1) – ⋯ – φp = 0
Real Statistics Function: The Real Statistics Resource Pack supplies the following array function where R1 is a p × 1 range containing the phi coefficients of the polynomial, where φp is in the first position and φ1 is in the last position.
ARRoots(R1): returns a p × 3 range where each row contains one root, and
where the first column consists of the real part of the roots, the second
column consists of the imaginary part of the roots and the third column
contains the absolute value of the roots
This function calls the ROOTS function described in Roots of a Polynomial.
Note that just like in the ROOTS functions, the ARRoots function can take
the following optional arguments:
ARRoots(R1, prec, iter, r, s)
prec = the precision of the result, i.e. how close to zero is acceptable. This value defaults to 0.00000001.
iter = the maximum number of iterations performed when using Bairstow's Method. The default is 50.
r, s = the initial seed values when using Bairstow's Method. These default to zero.
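Outside of Excel, numpy's polynomial root finder does the same job; here is a sketch of the stationarity check using the reverse characteristic equation (coefficients ordered φ1, …, φp here, unlike ARRoots):

import numpy as np

def ar_is_stationary(phi):
    """phi = [phi_1, ..., phi_p]; stationary iff every root w of
    w^p - phi_1 w^(p-1) - ... - phi_p = 0 satisfies |w| < 1."""
    phi = np.asarray(phi, dtype=float)
    coeffs = np.concatenate(([1.0], -phi))  # highest-degree coefficient first
    w = np.roots(coeffs)
    return bool(np.all(np.abs(w) < 1)), w

print(ar_is_stationary([2.0, -0.5]))   # Example 1: not stationary
print(ar_is_stationary([0.4, -0.1]))   # Example 2: stationary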
Partial Autocorrelation for AR(p) Process
Property 1: For an AR(p) process yi = φ0 + φ1yi-1 + ⋯ + φpyi-p + εi, PACF(p) = φp.
Moreover, for k > p, PACF(k) = 0.
Example 1: Chart PACF for the data in Example 1 from Basic Concepts for
Autoregressive Process
Using the PACF function and Property 1, we get the result shown in Figure
1.

Figure 1 – Graph of PACF for AR(1) process
Observation: We see from Figure 1 that the PACF values for lags > 1 are
close to zero, as is expected, although there is some random fluctuation from
zero.
Example 2: Repeat Example 1 for the AR(2) process
yi = 5 + .4yi-1 – .1yi-2 + εi
where εi ∼ N(0,1), and calculate the ACF and PACF.
From Example 2 of Characteristic Equation of AR(p) Process, we know that this process is stationary.

Figure 2 – Simulated AR(2) process
This time we place the formula =5+0.4*0-0.1*0+B4 in cell C4, =5+0.4*C4-0.1*0+B5 in cell C5 and =5+0.4*C5-0.1*C4+B6 in cell C6, highlight the range C6:C103 and press Ctrl-D.
The ACF and PACF are shown in Figure 3.

Figure 3 – ACF and PACF for AR(2) process
As you can see, there isn’t a perfect fit between the theoretical and actual ACF
and PACF values.
Finding AR(p) coefficients
Suppose that we believe that an AR(p) process is a fit for some time series. We now show how to calculate the process coefficients using the following techniques: (1) estimates based on ACF or PACF values, (2) using linear regression and (3) using Solver. We illustrate the first of these approaches on this webpage.
One approach is to use the Yule-Walker equations in reverse to calculate the φ0, φ1, …, φp, σ² coefficients based on the values of μ, γ0, …, γp (ACF values). Alternatively, we use the values μ, γ0, π1, …, πp (PACF values), which turn out to be equivalent.
Example 1: Use the statistics described above to find the coefficients of the
AR(1) process based on the data in Example 1 of Autoregressive Processes
Basic Concepts.
The first 8 of 100 data elements are shown in column B of Figure 1. We next
calculate the mean, variance and PACF(1) values. From these, we can
estimate the process coefficients as shown in cells G8:G10. This estimate of
the time series is the process yi = 4.983 + .394yi-1 + εi where σ2 = 1.421703.

Figure 1 – Estimation of AR(1) coefficients
As we can see, the process coefficients are pretty close to the original coefficients used to generate the data in column B (φ0 = 5, φ1 = .4 and σ² = 1), with the exception of σ², which is a little high.
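The same method-of-moments estimates in Python, assuming y holds the series from the earlier AR(1) simulation sketch:

import numpy as np

# assume y is the simulated AR(1) series from the earlier sketch
ybar, s2 = y.mean(), y.var(ddof=1)
d = y - ybar
r1 = np.sum(d[:-1] * d[1:]) / np.sum(d ** 2)  # ACF(1), which equals PACF(1)

phi1_hat = r1                          # rho_1 = phi_1 for an AR(1) process
phi0_hat = ybar * (1 - phi1_hat)       # from mu = phi_0 / (1 - phi_1)
sigma2_hat = s2 * (1 - phi1_hat ** 2)  # from var = sigma^2 / (1 - phi_1^2)
print(phi0_hat, phi1_hat, sigma2_hat)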
Observation: We can use this approach for AR(2) processes, by noting that
ρ1 = φ1/(1 – φ2) and ρ2 = φ1ρ1 + φ2
Thus
φ1 = ρ1(1 – ρ2)/(1 – ρ1²) and φ2 = (ρ2 – ρ1²)/(1 – ρ1²)
and so σ² = γ0(1 – φ1ρ1 – φ2ρ2).
Example 2: Use the statistics described above to find the coefficients of the AR(2) process based on the data in Example 1.
We show two versions in Figure 2. The lower version is based on the ACF using the formulas described in the above observation. The upper version is based on the PACF using Property 1 of Partial Autocorrelation of AR(p) Processes.

Figure 2 – Estimation of AR(2) coefficients
Finding AR(p) coefficients using Regression
We now show how to calculate the coefficients of an AR(p) process which
represents a time series by using ordinary least squares.
An AR(p) process can be expressed as
yi = φ0 + φ1yi-1 + ⋯ + φpyi-p + εi
which is equivalent to
εi = yi – φ0 – φ1yi-1 – ⋯ – φpyi-p
Our goal is to minimize
Σi=p+1 to n εi²
Let X be the (n–p) × (p+1) matrix whose ith row is [1 yi-1 yi-2 ⋯ yi-p], i.e. X = [xij] where xi1 = 1 for all i and xij = yi-j+1 for all j > 1. Let Y be the (n–p) × 1 column vector Y = [yp+1 yp+2 ⋯ yn]T, let φ be the (p+1) × 1 column vector φ = [φ0 φ1 ⋯ φp]T and let ε be the (n–p) × 1 column vector ε = [εp+1 εp+2 ⋯ εn]T. Then the AR(p) process can be represented by
Y = Xφ + ε
The least-squares solution φ = [φ0 φ1 ⋯ φp]T is then given by
φ = (XTX)−1XTY
We are given the values of y1, …, yn, but we also need to initialize values for y0, …, y1-p (i.e. the values with non-positive subscripts). We will simply initialize these values to zero, although alternatively, we can use
μ = φ0/(1 – φ1 – ⋯ – φp)
i.e. the mean of the AR(p) process from Property 1 of Autoregressive Processes Basic Concepts.
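A numpy sketch of this least-squares fit (dropping the first p observations rather than initializing presample values, which matches the X matrix described above):

import numpy as np

def fit_ar(y, p):
    """Least-squares AR(p) fit; returns [phi_0, phi_1, ..., phi_p]."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # row i holds [1, y_{i-1}, ..., y_{i-p}] for i = p+1, ..., n
    X = np.column_stack([np.ones(n - p)] +
                        [y[p - j - 1:n - j - 1] for j in range(p)])
    Y = y[p:]
    phi, *_ = np.linalg.lstsq(X, Y, rcond=None)  # phi = (X'X)^{-1} X'Y
    return phi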
Example 1: Use the least square method to find the coefficients of an AR(1)
process based on the data from Example 1 of Finding AR(p) Coefficients.
The first 14 of 100 data elements are shown in column B of Figure 1. We next
create the X and Y matrices as described above in ranges D5:E103 and
G5:G103.

Figure 1 – Finding AR(1) coefficients using least squares


The coefficient matrix (range I5:I6) is then calculated using the array
formula
=MMULT(MINVERSE(MMULT(TRANSPOSE(D5:E103),D5:E103)),
MMULT(TRANSPOSE(D5:E103),G5:G103))
The predicted value in cell L5 is then calculated by the formula =I$5+K4*I$6
and similarly for the other values in column L.
Example 2: Use the least square method to find the coefficients of an AR(2)
process based on the data from Example 2 of Finding AR(p) Coefficients.

Figure 2 – Finding AR(2) coefficients using least squares


The coefficient matrix (range J6:J8) is then calculated using the array
formula
=MMULT(MINVERSE(MMULT(TRANSPOSE(D6:F103),D6:F103)),
MMULT(TRANSPOSE(D6:F103),H6:H103))
The predicted value in cell M6 is then calculated by the formula
=J$6+L5*J$7+L4*J$8 and similarly for the other values in column M.
Of course, it is much easier to use the Real Statistics Linear Regression data
analysis as shown in Figure 3.

Figure 3 – Regression approach to finding AR(2) coefficients
Here the X values are shown in columns X and Y and the Y values are shown
in column Z. These values are obtained by placing the formulas =B5
(referencing Figure 2) in cell X4, =B4 in cell Y4 and =B6 in cell Z4,
highlighting the range X4:Z101 and pressing Ctrl-D. The predicted values
can now be calculated using the TREND array function.
Observation: The regression approach to calculating the AR(p) model
coefficients is more accurate than the ACF/PACF approach described
in Finding AR(p) Coefficients. Elsewhere we also show how to use Solver to
calculate these coefficients. The coefficients will be identical to those using
linear regression.
Real Statistics Function: The Real Statistics Resource Pack supplies the
following array function:
ARMap(R1, p) – takes the time series in the n × 1 range R1 and outputs an (n–p) × (p+1) range where the first p columns represent the X values in the linear regression and the last column represents the Y values.
If we had highlighted the range X4:Z101, entered the formula =ARMap(B4:B103,2) and pressed Ctrl-Shift-Enter, we would get the same values in range X4:Z101 as in Figure 3.

Lag Function
We now define the lag function
L(yi) = yi-1
We further assume that
L(cx + z) = cL(x) + L(z)
for any constant c and any variables x and z. We also use the following
notation for any variable z and non-negative integer n.
L^n(yi) = yi-n (with L^0(yi) = yi)
We can express the AR(p) process
yi = φ0 + φ1yi-1 + ⋯ + φpyi-p + εi
using the lag function as
yi = φ0 + φ1L(yi) + ⋯ + φpL^p(yi) + εi
or even
(1 – φ1L – ⋯ – φpL^p)yi = φ0 + εi
where 1 is the identity function and we use the notation (f+g)x to mean f(x) + g(x) for any functions f and g. This can also be expressed as

By Property 1 of Autoregressive Process Basic Concepts, this can also be expressed as

or
Note that

is a pth degree polynomial which is equivalent to the characteristic
polynomial of the AR(p) process, as described in Characteristic Equation for
Autoregressive Processes. This polynomial can be factored (by the
Fundamental Theorem of Algebra) as follows

where the values r1, r2, …, rp are the characteristic roots of the AR(p)
process.
Based on the vector φ = [φ1, …, φp] of coefficients, we can define the
operator φ(L)
φ(L) = 1 – φ1L – φ2L^2 – ⋯ – φpL^p
and so an autoregressive process can be expressed as
φ(L)yi = φ0 + εi
Observation: The lag function is also called the (back) shift operator and
so sometimes the symbol B is used in place of L.
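As a quick sanity check (not part of the original text), the following Python snippet verifies the identity (1 – φ1L – ⋯ – φpL^p)yi = φ0 + εi on a simulated AR(2) series with arbitrarily chosen coefficients:

```python
import numpy as np

rng = np.random.default_rng(0)
phi0, phi1, phi2 = 5.0, 0.4, -0.1
eps = rng.standard_normal(1000)
y = np.zeros(1000)
for i in range(2, 1000):
    y[i] = phi0 + phi1 * y[i-1] + phi2 * y[i-2] + eps[i]

# Applying the lag polynomial recovers the constant plus the error term
lhs = y[2:] - phi1 * y[1:-1] - phi2 * y[:-2]
print(np.allclose(lhs, phi0 + eps[2:]))   # True
```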
Augmented Dickey-Fuller Test
In Dickey-Fuller Test we describe the Dickey-Fuller test which determines
whether an AR(1) process has a unit root, i.e. whether it is stationary. We
now extend this test to AR(p) processes.
For the AR(1) process

we take the first difference to obtain the equivalent form

where Δyi = yi – yi-1 and β = φ – 1, and test the hypothesis


H0: β = 0 (equivalent to φ = 1)
H1: β < 0 (equivalent to φ < 1)
If |φ| = 1, we have what is called a unit root (i.e. the time series is not
stationary). We have three versions of the test.
Type 0 No constant, no trend Δyi = β1 yi-1 + εi

Type 1 Constant, no trend Δyi = β0 + β1 yi-1 + εi

Type 2 Constant and trend Δyi = β0 + β1 yi-1 + β2 i+ εi


The extension to AR(p) processes has the following three versions.

Type 0 No constant, no trend

Type 1 Constant, no trend

Type 2 Constant and trend

Once you know how many lags to use, the augmented test is identical to the
simple Dickey-Fuller test. We can use the Akaike Information Criterion
(AIC) or Bayesian Information Criteria (BIC) to determine how many lags to
consider, as described in Comparing ARIMA Models.
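If you want to cross-check these results outside of Excel, the augmented test is available in statsmodels; a sketch, assuming y is a 1-D array holding the time series (with maxlag=None and autolag='AIC', statsmodels also uses the Schwert rule for the maximum lag, as described below):

```python
from statsmodels.tsa.stattools import adfuller

# regression='c' ~ Type 1 (constant, no trend); use 'n' for Type 0, 'ct' for Type 2
stat, pvalue, usedlag, nobs, crit, icbest = adfuller(
    y, maxlag=None, regression='c', autolag='AIC')
print(f"tau-stat = {stat:.3f}, p-value = {pvalue:.4f}, lags used = {usedlag}")
print("critical values:", crit)
```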
Thus we can now use the full version of the ADFTEST function which was
introduced in Dickey-Fuller Test.
Real Statistics Function: The Real Statistics Resource Pack provides the
following array function where R1 contains a column of time series data.

ADFTEST(R1, lab, lag, criteria, type, alpha): returns an 8 × 1 range which
contains the following values: tau-statistic, tau-critical, yes/no (stationary
or not), AIC value, BIC value, # of lags (p), the first-order autoregression
coefficient and estimated p-value.
If lab = TRUE (default is FALSE), the output consists of an 8 × 2 range whose
first column contains labels. type = the test type (0, 1, 2, default is 1). The
default value for alpha is .05.
The arguments lag and criteria, which were not used for the Dickey-Fuller
Test, are defined as follows:
• lag = the maximum number of lags to use in the test (default 0)
• criteria = “none” : no criteria is used, and so p is set to the value
of lag
• criteria = “aic” : the AIC is used to determine the number of
lags p (where p ≤ lag)
• criteria = “bic” : the BIC is used to determine the number of
lags p (where p ≤ lag)
To specify the criteria, you can use “AIC” or 1 instead of “aic”, you can use
“BIC” or 2 instead of “bic” and you can use “” or 0 instead of “none”.
If lag < 0 then lag will automatically be set to value
=Round(12*(n/100)^.25,0), as proposed by Schwert, where n = the number
of elements in the time series.
To specify the test type, you can use “” or “none” instead of 0, you can use
“drift” or “constant” instead of 1 and you can use “trend” or “both” instead of
2.
Example 1: Determine whether the data in column A of Figure 1 has a unit
root based on a model without trend, using the Schwert estimate for the maximum number of lags and the AIC criteria. Also, determine whether
there is a unit root based on a model with trend and a maximum number of
lags equal to 7 using the AIC criteria.

Figure 1 – Time Series
Here range J4:K8 contains the array formula =DescStats(A3:A22,TRUE).
We see that the mean value of the time series is 2.376, and so we conclude
that the time series likely has a non-constant mean. We could confirm this
by using a t-test to see whether the population mean is significantly different
from zero.
We now use the array formula =ADFTEST(A3:A22,TRUE,-1) to show the
results of the ADF test without trend. The -1 means that we are using the
Schwert estimate for the maximum number of lags. We are also using the
default type = 1, which results in the test for constant without trend. As we
can see from range P4:P11 in Figure 2, since tau-stat > tau-crit, the time
series is not stationary.

Figure 2 – ADF Test
Note that the above formula effectively uses a maximum lag count of 8, which
can be seen by using the formula =ROUND(12*(K4/100)^0.25,0) in cell K10 of Figure 1.
Looking at the chart in Figure 1, it appears that the time series has a trend,
and so we repeat the ADF Test with constant and trend to get the results
shown in range S4:T11 of Figure 2 using the array formula
=ADFTEST(A3:A22,TRUE,7,"aic",2). Here type = 2 (constant and trend) and
maximum number of lags = 7. Note that we didn’t use 8 as the maximum
number of lags since that would produce error values (based on insufficient
degrees of freedom in the underlying regression analysis).
Real Statistics Data Analysis Tool: As explained in Time Series Testing
Tools, the Time Series Testing data analysis tool can be used to perform
the Dickey-Fuller Test. In fact, it can also be used to perform the Augmented
Dickey-Fuller Test.

Other Unit Root Tests


Two other unit root tests are commonly used, in addition to or instead of
the Augmented Dickey-Fuller Test, namely:
• Phillips-Perron (PP) test
• Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test
While the ADF test uses a parametric autoregression to estimate the errors, the PP
test uses a non-parametric approach.
The KPSS test uses yet a different approach. Unlike the other tests, the null
hypothesis for the KPSS test is that the time series is stationary, while the alternative
hypothesis is that there is a unit root.

Real Statistics Functions: The Real Statistics Resource Pack provides the following
array functions where R1 contains a column of time series data.
PPTEST(R1, lab, lags, type, alpha) – an array function that returns a column
range for the PP test consisting of tau-stat, tau-crit, stationary (yes/no), lags and the
autocorrelation coefficient and p-value.
KPSSTEST(R1, lab, lags, type, alpha) – an array function that returns a column
range for the KPSS test consisting of test-stat, crit-value, stationary (yes/no), lags
and p-value.
As usual, if lab = TRUE (default is FALSE), the output consists of two columns
whose first column contains labels. type = the test type (0, 1, 2, default is 1). The
default value for alpha is .05.
To specify the test type, you can use “” or “none” instead of 0, you can use “drift”
or “constant” instead of 1 and you can use “trend” or “both” instead of 2.
Note too that the KPSS test does not support the case where there is no constant and
no trend. Thus, type for KPSSTEST is restricted to 1 and 2. If type = 0 is used, then
it is assumed that type = 1.
You can either specify the number of lags to test or use the values “short” or “long”.
If “short” is specified then lags is calculated to be =Round(4*(n/100)^.25,0)
where n = the number of elements in the time series, while if lags = “long” then the
value =Round(12*(n/100)^.25,0) is used.
In Figure 1, we repeat the analysis for Example 1 of Augmented Dickey-Fuller
Test using the PP and KPSS tests, specifying lags = “short” (which is equivalent
to lags = 3).

Figure 1 – PP and KPSS Tests
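For comparison, both tests can also be run in Python; a sketch assuming y is a 1-D array of time series values, using statsmodels for KPSS and the third-party arch package for PP:

```python
from statsmodels.tsa.stattools import kpss
from arch.unitroot import PhillipsPerron  # assumes the 'arch' package is installed

# KPSS: the null hypothesis is stationarity (roles reversed vs. ADF/PP);
# regression='c' ~ constant only, 'ct' ~ constant and trend
stat, pvalue, nlags, crit = kpss(y, regression='c', nlags='auto')
print(f"KPSS stat = {stat:.3f}, p-value = {pvalue:.3f}, lags = {nlags}")

# Phillips-Perron: the null hypothesis is a unit root, as with ADF
pp = PhillipsPerron(y, trend='c')
print(pp.summary())
```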

Seasonality for Time Series


A time series yi with no trend has seasonality of period c if E[yi] = E[yi+c] for all i.
If we have a stationary time series yi and a deterministic time series si such
that si = si+c for all i (and so si = si+kc for all integers k), then zi = yi + si would be
a seasonal time series with period c. As shown in Regression with
Seasonality, the seasonality of such time series can be modeled by using c–1
dummy variables.
A second way to model seasonality is to assume that si = μm(i) + εi where εi is a
purely random time series and μ0, …, μc-1 are constants where m(i) =
MOD(i,c).
A third approach is to model seasonality as a sort of random walk,
i.e. si = μm(i) + si-c + εi . If μ0 = … = μc-1 = 0 then there is no drift; otherwise μ0,
…, μc-1 capture the seasonal drift.
Of course, seasonality can be modeled in many other ways.

Recall that for the lag function Lc(yi) = yi-c, and so (1–Lc)yi = yi – yi-c. This is the
principal way of expressing seasonality for SARIMA models.
Note too that if si is deterministic, then (1–Lc)si = si – si-c= 0.
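A small Python sketch of these differencing operations (seasonal_diff is our own helper; y is assumed to hold the raw series):

```python
import numpy as np

def seasonal_diff(y, c):
    """Apply the seasonal difference (1 - L^c), returning y_i - y_{i-c}."""
    y = np.asarray(y, dtype=float)
    return y[c:] - y[:-c]

# Remove trend with (1 - L), then seasonality with (1 - L^4), as is done
# for the quarterly revenue example later on this page:
z = seasonal_diff(np.diff(y), 4)
```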
SARIMA Models
As described in ARMA Models, an ARMA(p,q) model can be expressed as

If φ0 = 0 (i.e. the mean of the stochastic process is zero) then this can be
expressed using the lag operator as

where

Note too that an ARIMA(p, d, q) process can be expressed as above, but first yi must be replaced by the d-fold differenced series zi = (1–L)^d yi. Alternatively, we can express an
ARIMA(p, d, q) process yi without constant as

We can add the constant term back in as

A seasonal ARIMA model takes the same form, but now there are additional
terms that reflect the seasonality part of the model. Specifically, a
SARIMA(p,d,q) × (P,D,Q)m model without constant can be expressed as

where

Here, we have P seasonal autoregressive terms (with coefficients Φ1, …, ΦP), Q seasonal moving average terms (with coefficients Θ1, …, ΘQ) and D seasonal differences based on a seasonal period of m.
Let’s assume that the two types of differencing (corresponding to d and D)
have already been done. Then a SARIMA(1,0,1) × (1,0,1)12 model takes the
following form:
yi = φ0 + φ1yi-1 + Φ1yi-12 – φ1Φ1yi-13 + εi + θ1εi-1 + Θ1εi-12 + θ1Θ1εi-13
The residuals can therefore be expressed as:
εi = yi – φ0 – φ1yi-1 – Φ1yi-12 + φ1Φ1yi-13 – θ1εi-1 – Θ1εi-12 – θ1Θ1εi-13
The forecast can be expressed (where the coefficients should have a hat on
them):

Similarly, a SARMA(p,q) × (P,Q)m model without constant can be expressed as

or equivalently as

And so

which is equivalent to

In the case where there is a constant term φ0 this expression takes the form

This serves as the equation to estimate the forecast at time i (when the
final εi is set to zero). You can also solve for εi to obtain an expression that
can be used to estimate the residuals.
SARIMA Model Example
Example 1: Create a SARIMA(1,1,1) ⨯ (1,1,1)4 model for Amazon’s quarterly
revenues shown in Figure 1 and create a forecast based on this model for the four
quarters starting in Q3 2017.
Note that the range A3:B33 contains all the data, where the second half of the data
is repeated in columns D and E (so that it is easier to display in the figure).

Figure 1 – Amazon Revenues
We start by creating a plot of the time series data by highlighting the range B4:B33
and then selecting Insert > Charts|Line. After making a few modifications we
obtain the result shown in Figure 2.

Figure 2 – Plot of Amazon Revenues


We see from the chart that there is an upward trend and there is seasonality.
We next try to remove the trend and seasonality by differencing, as shown
on the left side of Figure 3.

Figure 3 – Ordinary and seasonal differencing
The original revenue data is repeated in range O3:O33 (with only the first 14 data
elements visible in the figure). Column P contains the detrended data where cell P5
contains the formula =O5-O4, and similarly for the other cells in column P. Column
Q removes the seasonality from the data in column P. This is done by inserting the
formula =P9-P5 in cell Q9, highlighting the range Q9:Q33 and pressing the key
sequence Ctrl-D.
We next plot the data in column Q as shown on the right side of Figure 3.
This time the plot looks like it comes from a stationary time series, although
we would need to perform a unit root test to confirm this.
Figure 4 shows how to calculate the residuals for the SARIMA model of this
time series in terms of the coefficients (only the first 8 of the time series
entries in AG3:AI28 are displayed).

Figure 4 – Calculation of residuals

The values in range AH4:AH28 are copied from Q9:Q33 of Figure 3. As we saw in SARIMA Models, the residuals of this time series can be calculated using the formula

To calculate εi we need to know the values of εi-1, εi-4, εi-5. Thus we arbitrarily set the
values of the first five residuals equal to zero and use the above formula to
calculate ε6 (cell AI9). This is done in Excel using the following worksheet formula
=AH9-AL$3-AL$4*AH8-AL$6*AH5+AL$4*AL$6*AH4-AL$5*AI8-AL$7*AI5-
AL$5*AL$7*AI4
Next, we highlight range AI9:AI28 and press Ctrl-D to fill in the values of all the
rest of the residuals. Since we initially set all the coefficients to zero, the residuals
initially all take the same value as the data.
Our goal is to find coefficients that minimize the sum of the squares of the residuals
(SSE). We accomplish this by using Excel’s Solver. The value of SSE is calculated
in cell AL9 using the formula =SUMSQ(AI9:AI28).
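A Python sketch of the same recursion (function name ours; it mirrors the worksheet formula, including the convention of zeroing the first per+1 residuals):

```python
import numpy as np

def sarma_residuals(z, c, phi1, theta1, Phi1, Theta1, per=4):
    """Residuals of a SARMA(1,1)x(1,1)_per model for the differenced series z."""
    z = np.asarray(z, dtype=float)
    eps = np.zeros(len(z))
    for i in range(per + 1, len(z)):
        eps[i] = (z[i] - c - phi1 * z[i-1] - Phi1 * z[i-per]
                  + phi1 * Phi1 * z[i-per-1]
                  - theta1 * eps[i-1] - Theta1 * eps[i-per]
                  - theta1 * Theta1 * eps[i-per-1])
    return eps

# The quantity Solver minimizes; with all coefficients zero, the residuals
# equal the data, just as in the worksheet's starting state.
sse = np.sum(sarma_residuals(z, 0, 0, 0, 0, 0) ** 2)
```

The coefficients could then be estimated by minimizing sse with, e.g., scipy.optimize.minimize in place of Solver.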
Select Data > Analysis|Solver and fill in the dialog box that appears as shown in
Figure 5.

Figure 5 – Solver dialog box

After clicking on the Solve button, the results shown in Figure 6 appear.

Figure 6 – Solver results


As you can see, the residual values shown in column AI change and the new SSE value of 4,244,634 is much less than the initial value shown in Figure 4.
SARIMA Forecast Example
In SARIMA Model Example we show how to create a SARIMA model for the
following example, step by step, in Excel.
Example 1: Create a SARIMA(1,1,1) ⨯ (1,1,1)4 model for Amazon’s quarterly
revenues shown in Figure 1 and create a forecast based on this model for the
four quarters starting in Q3 2017.
We now show how to use this model to create a forecast.
The coefficients estimated by this model (shown in AK3:Al7 of Figure 1) can
be used to create a forecast based on the following equation:

Since the last data element is y25, we want to determine the forecasted values
of y26, y27, y28 and y29. To do this, we use the above formula using data values of yi-1, yi-4 and yi-5 when available, and (previously obtained) forecasted values
when the real data values are not available. This is shown in Figure 1.

Figure 1 – Forecast for the differenced time series
This figure shows the data values and residuals for the later portion of the
time series (leaving out the middle) plus the forecasted values. E.g. the
forecast value in cell AH29 is calculated by the formula
=AL$3+AL$4*AH28+AL$6*AH25-AL$4*AL$6*AH24+AL$5*AI28
+AL$7*AI25+AL$5*AL$7*AI24
After entering this formula, you can highlight the range AH29:AH32 and press Ctrl-D to obtain the other three forecast values. Note that the residuals corresponding to the four forecast values are implicitly set to zero.
Now that we have the forecasted values for the time series shown in column
Q of Figure 3 of SARIMA Model Example, we need to translate these into
forecast values for the original time series (column O in Figure 3 of SARIMA
Model Example). To accomplish this, we need to undo the two types of
differencing.
We start by replicating the bottom of the data in Figure 3 of SARIMA Model
Example (i.e. the part that is not displayed) and then inserting the forecast
that we obtained in Figure 1. This is shown in Figure 2.

Figure 2 – Forecast (step 1)
We only need to go in the original time series far enough to produce at least
one value not forecasted in column AQ. Whereas differencing proceeds from
left to right, integrating (i.e. undoing differencing) proceeds from right to
left. If we know the values in cells AP5 and AQ9, we can obtain the value in
cell AP9 using the formula =AP5+AQ9. Similarly, if we know the value in
cells AO8 then we can calculate the value in cell AO9 using the formula
=AO8+AP9 (where the value in AP9 was calculated previously).
In a similar way, we can obtain the value in cell AP10, using the formula
=AP6+AQ10 and the value in cell AO10 using the formula =AO9+AP10. We
highlight the range AO10:AP13 and press Ctrl-D to obtain the other three
forecast values, as shown in Figure 3.

Figure 3 – Forecast (step 2)

We can now extend the plot shown in Figure 2 of SARIMA Model Example to
include the forecasted values, as shown in Figure 4.

Figure 4 – Revenue Forecast


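As an aside, the whole model-plus-forecast pipeline can be reproduced with a library fit; a sketch assuming revenue holds the quarterly revenue series. SARIMAX estimates by maximum likelihood rather than by minimizing SSE with Solver, so its coefficients will differ somewhat; its forecasts are returned on the original scale, so no manual undifferencing is needed:

```python
from statsmodels.tsa.statespace.sarimax import SARIMAX

# SARIMA(1,1,1)x(1,1,1)_4 fit of the quarterly revenues
model = SARIMAX(revenue, order=(1, 1, 1), seasonal_order=(1, 1, 1, 4))
result = model.fit(disp=False)
print(result.summary())
print(result.forecast(steps=4))  # the four quarters starting Q3 2017
```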
Real Statistics Support for SARIMA
We now show how to simplify the process of creating a SARIMA model by
using Real Statistics capabilities.
Real Statistics Functions: The Real Statistics Resource Pack supplies the
following array functions:
ADIFF(R1, d, D, per): returns a column array that corresponds to the time
series data in R1 after ordinary differencing d times and seasonal
differencing D times based on a seasonal period of per.
SARMA_RES(R1, Rar, Rma, Rsa, Rsm, per, cons): returns a column array
with the residuals that correspond to the time series data in the column
array R1 based on a SARMA model with AR coefficients in Rar, MA
coefficients in Rma, seasonal AR coefficients in Rsa, seasonal MA
coefficients in Rsm, the constant coefficient in cons and the seasonal
period per.
SARMA_PRED(R1, Rar, Rma, Rsa, Rsm, per, cons, f): returns a column
array with the predicted values that correspond to the time series data in
the column array R1 plus the next f forecast values based on a SARMA
model with AR coefficients in Rar, MA coefficients in Rma, seasonal AR
coefficients in Rsa, seasonal MA coefficients in Rsm, the constant
coefficient in cons and the seasonal period per. If f is omitted then the
highlighted (output) range is filled with forecasted values (i.e. f is set equal to the number of rows in the highlighted range minus the number of rows in R1).
SARIMA_PRED(R0, R1, d, D, per): returns a column array with the
forecasted values for the SARIMA(p, d, q) ⨯(P, D, Q)per model of the time
series data in R1 that correspond to the forecast values in R0 for the
SARMA(p, q) ⨯(P, Q)per model.
All the above arrays are column arrays. Any of the Rar, Rma, Rsa, Rsm arrays
may be omitted, although at least one of these can’t be omitted. per defaults
to 12 and cons defaults to 0 (i.e. no constant).
Note that the ADIFF function is an extension to the version described
in ARIMA Differencing.
Observation: Example 1 shows how to create a SARIMA(1, 1, 1) ⨯ (1, 1,
1)4 model and forecast. The above functions make it easier to create any
SARIMA model and forecast. To illustrate this, for Example 1 of SARIMA
Model Example, the following formulas could have been used:
=ADIFF(B4:B33,1,1,4) to create the time series in range AH4:AH28 of
Figure 4 of SARIMA Model Example
=SARMA_RES(AH4:AH28,AL4,AL5,AL6,AL7,4,AL3) to create the array of
residuals in range AI4:AI28 of Figure 4 of SARIMA Model Example
=SARMA_PRED(AH4:AH28,AL4,AL5,AL6,AL7,4,AL3) to create an array
of predicted values that take the values in the array AH4:AH32 – AI4:AI32
of Figure 4 of SARIMA Model Example. In particular, the last 4 of these
values are those found in range AH29:AH32.
=SARIMA_PRED(AH29:AH32,B4:B33,1,1,4) to create the array of forecast
values shown in range AO10:AO13 of Figure 3 of SARIMA Forecast
Example.
Real Statistics Data Analysis Tool: The Real Statistics Resource Pack
provides the Seasonal Arima (Sarima) data analysis tool which creates a
SARIMA model and forecast.
To perform the analysis for Example 1 of SARIMA Model Example,
press Ctrl-m and choose Seasonal Arima (Sarima) from the Time
S tab (or from the Time Series dialog box if using the original user
interface). Now fill in the dialog box that appears as shown in Figure 1.

Figure 1 – SARIMA dialog box
If you leave the # of Forecasts field blank, then its value defaults to the
value in the Seasonal Period field. If that field is blank then no seasonality
is used in the model and # of Forecasts defaults to 5.
After clicking on the OK button, the output shown in Figures 2 and 3 is
displayed (only the first 24 rows of the output in Figure 2 and the first 20
rows in Figure 3 are displayed).

Figure 2 – SARIMA output (part 1)

Figure 3 – SARIMA output (part 2)
Most of the values are produced using the Real Statistics functions described
above. The formulas used for the descriptive statistics in range J13:J24 and
coefficient roots in columns P, Q and R are similar to those used for the
corresponding values in the Arima data analysis tool.
The lower portion of the output, which contains the forecast, is shown in
Figure 4. The values in columns D, E, F and G are the continuation of these
columns from Figure 2 and the values in columns T and U are the
continuation of these columns from Figure 3.

Figure 4 – SARIMA forecast output
Range G29:G32 contains the four-quarter forecast for the differenced time
series, while range U34:U37 contains the corresponding four-quarter
forecast for the revenues for the period Q3 2017 through Q2 2018.
Observation: For this example, we chose to use the Solver approach to
estimating the SARIMA coefficients. The default is to use the Levenberg-
Marquardt approach. This is accomplished by leaving the Solver option
unchecked in Figure 1. In this case, the output is similar to that described
above, except that now the output in Figure 5 is included, which is useful in
that it provides the standard errors of the coefficients and the t-tests that
determine which coefficients are significantly different from zero.

Figure 5 – SARIMA coefficients


The output in range H27:L32 of Figure 5 is produced by the array formula
=SARIMA_PARAM(A1:A30,I4,I5,I6,J7,J4,J5,J6,J8)
Real Statistics Functions: The Real Statistics Resource Pack supplies the
following array functions:
SARIMA_COEFF(R1, ar, ma, diff, per, sar, sma, sdiff, con, lab): returns
an array with two columns, the first column of which contains the SARIMA
coefficients (in the order constant term, phi coefficients, theta coefficients,
Phi coefficients, Theta coefficients) and the second column contains the corresponding standard errors. If lab = TRUE (default FALSE) then a column of labels is appended to the output.
SARIMA_PARAM(R1, ar, ma, diff, per, sar, sma, sdiff, con): returns an
array with four columns, the first column of which contains the SARIMA
coefficients (in the order constant term, phi coefficients, theta coefficients,
Phi coefficients, Theta coefficients) and the remaining columns contain the
corresponding standard errors, t statistics and p-values.
Here, the parameters are ar = p, ma = q, diff = d, per = m, sar = P, sma = Q,
sdiff = D for a (p, d, q) × (P, D, Q)m SARIMA model. con = TRUE (default) if
a constant term is included.
Mann-Kendall Test
Basic Concepts
The Mann-Kendall Test is used to determine whether a time series has a
monotonic upward or downward trend. It does not require that the data be
normally distributed or linear. It does require that there is no
autocorrelation.
The null hypothesis for this test is that there is no trend, and the alternative
hypothesis is that there is a trend in the two-sided test or that there is an
upward trend (or downward trend) in the one-sided test. For the time series x1, …, xn, the MK Test uses the following statistic:
S = Σi<j sign(xj – xi)
Note that if S > 0 then later observations in the time series tend to be larger
than those that appear earlier in the time series, while the reverse is true
if S < 0.
The variance of S is given by
var = [n(n–1)(2n+5) – Σt ft(ft–1)(2ft+5)] / 18
where t varies over the set of tied ranks and ft is the number of times (i.e.
frequency) that the rank t appears.
The MK Test uses the following test statistic:
z = (S–1)/se if S > 0, z = 0 if S = 0, z = (S+1)/se if S < 0
where se = the square root of var. If there is no monotonic trend (the null
hypothesis), then for time series with more than 10 elements, z ∼ N(0, 1), i.e.
z has a standard normal distribution.
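A Python sketch of the test as just described (function name ours; the ties correction matches the variance formula above, and z uses the usual continuity correction):

```python
import numpy as np
from scipy.stats import norm

def mann_kendall(x):
    """Two-sided Mann-Kendall trend test."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
    _, counts = np.unique(x, return_counts=True)          # tie group sizes f_t
    ties = sum(f * (f - 1) * (2 * f + 5) for f in counts if f > 1)
    var = (n * (n - 1) * (2 * n + 5) - ties) / 18
    se = np.sqrt(var)
    z = (s - np.sign(s)) / se if s != 0 else 0.0
    p = 2 * (1 - norm.cdf(abs(z)))
    return s, se, z, p
```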

Examples
Example 1: Determine whether the time series in range A4:A15 of Figure 1
has a monotonic trend.

Figure 1 – Mann-Kendall Test (part 1)


We build a table in range D4:O15 so that the row and column headings
consist of the data elements in the time series and the cells in the lower
triangular part of the table consist of the values in S. In particular, we insert
the following formula in cell D4, highlight range D4:O15 and press Ctrl-D and Ctrl-R.
=IF(ROW(D4)-ROW(D$4)>COLUMN(D4)-COLUMN($D4),SIGN($C4-D$3),"")
S is now the sum of the elements in this table. In fact, the MK Test, based on
this table is shown in Figure 2.

Figure 2 – Mann-Kendall Test (part 2)
Note that S = -44 (cell R7), which indicates the potential for a downward
trend. This is consistent with the line chart of the time series data shown in
Figure 3.

Figure 3 – Downward trend


The analysis shown in Figure 2 confirms that there is significant evidence for
the claim that the data has a trend based on a two-sided test. Actually, if we
had conducted a one-sided test we would reject the null hypothesis that there
is either no trend or an upward trend, and conclude that there is a downward
trend (the p-value for this test is half of the value shown in cell R10).
Ties Correction
Note that the ties correction used in the formula in cell R8 is based on the
value in cell P19. This is calculated based on the values shown in range
D17:N19. To create the values in this range, first insert formula
=COUNTIF(E3:$O3,D3) in cell D17 and then highlight range D17:N17 and
press Ctrl-R. Cells in this row (labeled preliminary ties counts) that are not
zero correspond to time series values that are tied. The only problem is that
when the ties count is bigger than 1 (i.e. more than two values are tied) there is some double counting.
This double counting is eliminated in the next row (labeled ties counts). This
is accomplished by inserting the formula =D17 in cell D18, the formula
=IF(COUNTIF($D3:D3,E3)=0,E17,0) in cell E18, highlighting the range
E18:N18 and pressing Ctrl-R. This row shows that there are 2 data elements
with the value 5.5 (the column heading above cell G18) and 3 elements with
the value 4.5 (the column heading above cell I18). The values shown in cells G18 and I18 are one less than the number of ties.

The following row contains the ties corrections where cell P19 contains the
sum of these corrections. This is done by inserting the formula
=IF(D18=0,0,D18*(D18+1)*(2*D18+7)) in cell D19, highlighting the range
D19:N19, pressing Ctrl-R and, inserting the formula =SUM(D19:N19) in cell
P19.
Worksheet Functions
Real Statistics Function: The Real Statistics Resource Pack supplies the
following array function to automate the steps required to perform the
Mann-Kendall Test.
MKTEST(R1, lab, tails, alpha): returns a column array with the
values S, s.e., z-stat, p-value and trend.
R1 is a column array containing the time series values, if lab = TRUE then an
extra column of labels is appended to the output (default FALSE), tails = 1 or
2 (default) and alpha is the significance level (default .05).
trend takes the values “yes” or “no” in the two-tailed test, and “upward” or
“no” in the one-tailed case where S > 0 and “downward” or “no” in the one-
tailed case where S < 0.
For Example 1, =MKTEST(A4:A15,TRUE) outputs the results shown in range
Q7:R11 of Figure 2.
Data Analysis Tool
The Mann-Kendall Test can also be performed using the Mann-Kendall
and Sen’s Slope data analysis tool, as demonstrated in Sen’s Slope.
Reference
Gocic, M. and Trajkovic, S. (2012) Analysis of changes in meteorological
variables using Mann-Kendall and Sen’s slope estimator statistical tests in
Serbia. Elsevier
https://www.academia.edu/6955354/Trend_Analysis_MK_Sen_Slope
Sen’s Slope
The usual method for estimating the slope of a regression line that fits a set
of (x, y) data elements is based on a least-squares estimate. This approach is
not valid when the data elements don’t fit a straight line; it is also sensitive
to outliers.
Sen’s Slope Definition
We now describe an alternative, more robust, nonparametric estimate of the
slope, called Sen’s slope, for the set of pairs (i, xi) where xi is a time
series. Sen’s slope is defined as the median of the pairwise slopes (xj – xi)/(j – i) over all pairs with i < j.
A 1–α confidence interval for Sen’s slope can be calculated as (lower, upper) where
C = z1-α/2 · se, lower = the M1-th smallest of the N pairwise slopes and upper = the (M2+1)-th smallest, with M1 = (N–C)/2 and M2 = (N+C)/2.
Here, N = the number of pairs of time series elements (xi, xj) where i < j and se = the standard error for the Mann-Kendall Test.
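A Python sketch of the calculation (sen_slope is our own name and reuses the mann_kendall function sketched earlier; note that the exact rank convention for the interval endpoints varies slightly across implementations):

```python
import numpy as np
from scipy.stats import norm

def sen_slope(x, alpha=0.05):
    """Sen's slope with an approximate 1-alpha confidence interval."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    slopes = np.sort([(x[j] - x[i]) / (j - i)
                      for i in range(n - 1) for j in range(i + 1, n)])
    N = len(slopes)                     # n(n-1)/2 pairwise slopes
    _, se, _, _ = mann_kendall(x)       # Mann-Kendall standard error
    C = norm.ppf(1 - alpha / 2) * se
    lower = slopes[max(int(round((N - C) / 2)) - 1, 0)]
    upper = slopes[min(int(round((N + C) / 2)), N - 1)]
    return np.median(slopes), lower, upper
```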
Example
Example 1: Determine Sen’s slope for the time series in Example 1 in Mann-
Kendall Test.
The time series is shown in range A4:A15 of Figure 1. As for the Mann-
Kendall Test, we construct a table whose row and column headings consist
of the elements of the time series. This time, the elements in the table are
constructed by placing the following formula in cell D4, highlighting the
range D4:O15, and pressing Ctrl-D and Ctrl-R.
=IF(ROW(D4)-ROW(D$4)>COLUMN(D4)-COLUMN($D4),($C4-D$3)/((ROW(D4)-ROW(D$4))-(COLUMN(D4)-COLUMN($D4))),"")

Figure 1 – Sen’s Slope (step 1)


We can now calculate Sen’s slope to be -.2 as shown in cell R11 of Figure 2.
This figure also calculates the 95% confidence interval (-3.6667, -0.1) for the
slope (as shown in cells R12 and R13).
The value for the standard error in cell R6 is taken from cell R8 of Figure 2
of Mann-Kendall Test.

Figure 2 – Sen’s Slope (step 2)
Worksheet Function
Real Statistics Function: The Real Statistics Resource Pack supplies the
following array function to automate the steps required to calculate Sen’s
slope.
SEN_SLOPE(R1, lab, alpha): returns a column array with the values:
Sen’s slope along with the lower and upper limits of the 1–alpha confidence
interval.
R1 is a column array containing the time series values, if lab = TRUE then an
extra column of labels is appended to the output (default FALSE)
and alpha is the significance level (default .05).
For Example 1, =SEN_SLOPE(A4:A15,TRUE) outputs the results shown in
range Q11:R13 of Figure 2.
Data Analysis Tool
Real Statistics Data Analysis Tool: The Real Statistics Resource Pack
provides the Mann-Kendall and Sen’s Slope data analysis tool.
To use this data analysis tool for Example 1 (whose data is repeated on the
left side of Figure 4), press Ctrl-m and select the Mann-Kendall and
Sen’s Slope option from the Time S tab (or from the Time Series dialog
box if using the original user interface) and then fill in the dialog box that
appears as shown in Figure 3.

Figure 3 – MK and Sen’s Slope dialog box


Upon pressing the OK button the output shown in Figure 4 is displayed.

Figure 4 – Data analysis tool output


References
Dransfield R.D., Brightwell R. (2012) Sen’s estimator of slope. Avoiding
and detecting statistical malpractice: Design & Analysis for Biologists, with
R.
http://influentialpoints.com/Training/sens_estimator_of_slope.htm
Gocic, M. and Trajkovic, S. (2012) Analysis of changes in meteorological
variables using Mann-Kendall and Sen’s slope estimator statistical tests in
Serbia. Elsevier
https://www.academia.edu/6955354/Trend_Analysis_MK_Sen_Slope
Cox-Stuart Test
The Cox-Stuart test is a simple test that is used to determine whether a
time series has an increasing or decreasing trend.
Suppose the time series is x1, x2, …, xn. Let m = INT(n/2) and define s1,
…, sm where si = Sign(xm+i+1 – xi). We now perform the sign test of this series
(using the binomial distribution).
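A Python sketch of this procedure (function name ours; the pairing convention skips the middle element when n is odd, and zero differences are discarded as in the usual sign test):

```python
import numpy as np
from scipy.stats import binom

def cox_stuart(x, alternative="decreasing"):
    """One-sided Cox-Stuart trend test via the sign test."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    m = n // 2
    diffs = x[n - m:] - x[:m]        # pair first half with second half
    diffs = diffs[diffs != 0]
    pos, k = int(np.sum(diffs > 0)), len(diffs)
    if alternative == "decreasing":
        p = binom.cdf(pos, k, 0.5)           # few positive differences
    else:
        p = 1 - binom.cdf(pos - 1, k, 0.5)   # many positive differences
    return pos / k, p
```

For the data in Example 1 below, this should reproduce the worksheet values: ratio = .125 and p-value = .035156.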
Example
Example 1: Determine whether there is a decreasing trend for the time
series in column A of Figure 1.

Figure 1 – Cox-Stuart Test
The p-value from the test is shown in cell D9 using the binomial distribution
(or in cell D11 using the Slope Test). Since p-value = .035156 < .05 = α, we
have a significant result provided the time series is decreasing. Since the ratio of positive differences to total differences in column K is .125 (cell D8), which is less than .5, we can conclude that the time series is indeed decreasing. Note
that this is a one-tail test since we specified a decreasing trend.
If we had specified an increasing trend, then clearly we would have a non-
significant result (p-value = 1-.035156 = .964844). If we wanted to determine
whether there was a trend in either direction, then we would perform a two-
tailed test with p-value = 2*.035156 = .070312, which is not significant.

Worksheet Function
Real Statistics Function: The Real Statistics Resource Pack provides the
following array function.
COX_STUART(R1, tails): returns a column array with the p-value of the
Cox-Stuart test for the data in the column array R1 along with the ratio
between positive differences over total differences.
tails = 1 or 2 (default). If ratio < .5 then any trend is decreasing, while if ratio
> .5 then any trend is increasing.
Applying this function in range D13:D14, we get the same results as shown
above for Example 1.
Reference
Logos, T. (2009) Trend analysis with Cox-Stuart test in R. R-Blogger
https://www.r-bloggers.com/2009/08/trend-analysis-with-the-cox-stuart-
test-in-r/
Granger Causality
Granger Causality
As we have learned on many occasions, correlation doesn’t necessarily imply
causality, and while we can measure the degree of association between two
variables, i.e. correlation, it is harder to determine whether one variable
causes another variable.
Although generally, we don’t believe that a present or future event can cause
a past event, we do believe that it is possible that a past event can cause a
present or future event. This is the impetus for the Granger’s
Causality test on time-series data that gives evidence that variable x causes
y. Whether this test really demonstrates causality is open to debate, and so
we will use the phrase “x Granger-causes y” instead of “x causes y”.
As we will see, x Granger-causes y when the prediction of y is improved by
the inclusion of past values of x.
Granger Causality Test
The test is based on the following OLS regression model:

Here, the αj and βj are the regression coefficients and εi is the error term. The
test is based on the null hypothesis:
H0: β1 = β2 = … = βm = 0
We say that x Granger-causes y when the null hypothesis is rejected.

We use the usual F test described in Adding Extra Variables to a Regression
Model to determine whether there is a significant difference between the
regression model shown above (the full model) or the reduced model, based
on the null hypothesis, without the βj terms (i.e. where all the βj = 0).
There we demonstrate two equivalent forms of the test:

Here, all the terms are based on the full model with the exception of SSE′ and Rr2, which are based on the reduced model.
If the p-value for this test is less than the designated value of α, then we reject
the null hypothesis and conclude that x causes y (at least in the Granger
causality sense).
Assumptions
The Granger Causality test assumes that both the x and y time series are
stationary. If this is not the case, then differencing, de-trending or other
techniques must first be employed before using the Granger Causality test.
Note that the number of lags, i.e. the value of m, is critical, in that different
values of m may lead to different test results. One approach to selecting an
appropriate value for m is to choose the value that results in the full model
with the smallest AIC or BIC value.
It is possible that causation is only in one direction, or in both directions (x Granger-causes y and y Granger-causes x), or in neither direction.
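statsmodels bundles this procedure; a sketch assuming eggs and chickens are the (stationary) series as 1-D arrays. Note that grangercausalitytests tests whether the series in the second column Granger-causes the one in the first:

```python
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

data = np.column_stack([chickens, eggs])   # does eggs Granger-cause chickens?
res = grangercausalitytests(data, maxlag=4)
f_stat, p_value, _, _ = res[4][0]['ssr_ftest']   # F test at lag 4
print(f"F = {f_stat:.3f}, p-value = {p_value:.6f}")
```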
Examples
Example 1: Figure 1 shows the egg production and chicken population
(including only those birds related to egg production) for the years 1931 to
1970. Determine whether the amount of the egg production Granger-causes
the size of the chicken population or the chicken population Granger-causes
the amount of egg production, or both or neither. This example is a tongue-in-cheek exploration of the common question, “Which came first: the chicken or the egg?”

Figure 1 – Chicken and Egg production
A plot of both time series (see Figure 2) shows that neither series is
stationary.

Figure 2 – Time series plots

As a result, we will instead study the first differences of each time series. The
data and time series plots for these are shown in Figures 3 and 4.

Figure 3 – Differenced time series

Figure 4 – Plots for differenced time series


The plots suggest that the time series may be stationary. This result is
confirmed by using the ADF test (see Augmented Dickey-Fuller Test) as
shown in Figure 5.

Figure 5 – ADF tests
We now show how to determine whether Eggs Granger-cause Chickens for
lags = 4. To do this we perform regression on the X data in range E2:L37 of
Figure 6 and Y data in range M2:M37 (only the first 12 of 35 rows are shown).

Figure 6 – Setup for regression


We now calculate the p-value of the Granger Causality Test for this data, as
shown in Figure 7.

Figure 7 – Test for Granger Causality
Here we use the Real Statistics function RSquare on the full model (cell AP3)
as well as the reduced model (AP4), although we could have gotten all the
values in the figure by actually conducting the regression.
Since p-value = 0.003892 is small, we conclude that Eggs Granger-cause
Chickens for lags = 4. Alternatively, we could have calculated the p-value by
placing the Real Statistics formula =RSquareTest(E3:L37,E3:H37,M3:M37)
in cell AP9.
Worksheet Functions
Real Statistics Functions: The Real Statistics Resource Pack supports the
following two functions that make it easy to determine whether the time
series in the column array Rx Granger-causes the time series in the column
array Ry at the specified number of lags.
GRANGER(Rx, Ry, lags) = the F statistic of the test
GRANGER_TEST(Rx, Ry, lags) = p-value of the test
We can use the GRANGER_TEST function to determine whether Eggs
Granger-causes Chickens and vice versa at various numbers of lags, as shown
in Figure 8.

Figure 8 – Granger Causality Tests


For example, cell AV7 contains the formula
=GRANGER_TEST(C3:C41,B3:B41,AT7)
with references to the data in Figure 3, and produces the same results as in
Figure 7.
We see from Figure 8 that Eggs Granger-cause Chickens, but the reverse is
not true.

Reference
Thurman, W. N. and Fisher, M. E. (1988) Chickens, eggs, and causality, or
which came first? American Journal of Agricultural Economics. Vol. 70.
No. 2.
http://web.pdx.edu/~crkl/ec571/eggs.pdf

Time Series Analysis


We explore various methods for forecasting (i.e. predicting) the next value(s)
in a time series. A time series is a sequence of observations y1, …, yn. We
usually think of the subscripts as representing evenly spaced time intervals
(seconds, minutes, months, seasons, years, etc.).
Topics
• Forecasting Accuracy
• Forecast errors
• Diebold-Mariano test
• Pesaran-Timmermann test
• Real Statistics support
• Basic Forecasting Methods
• Simple Moving Average
• Weighted Moving Average
• Simple Exponential Smoothing
• Holt’s Linear Trend
• Holt-Winters Multiplicative Method
• Holt-Winters Additive Method
• Excel 2016 Forecasting Functions
• Real Statistics Forecasting Tools
• Stochastic Process
• Stationary Process
• Autocorrelation Function
• Partial Autocorrelation Function
• Purely Random Time Series (white noise)
• Random Walk
• Deterministic Trend
• Dickey-Fuller Test
• Real Statistics Time Series Testing Tools
• Correlogram

• Handling Missing Time Series Data
• Autoregressive Processes
• Basic Concepts
• Characteristic Equation
• Partial Autocorrelation
• Finding Model Coefficients using ACF/PACF
• Finding Model Coefficients using Linear Regression
• Lag Function Representation
• Augmented Dickey-Fuller Test
• Other Unit Root Tests
• Moving Average Processes
• Basic Concepts
• Infinite-order Moving Average
• Invertibility
• Finding Model Coefficients using ACF
• Finding Model Coefficients using Solver
• Autoregressive Moving Average Processes (ARMA)
• Basic Concepts
• ARMA(1, 1) processes
• ARMA(p, q) processes
• Calculating model coefficients using maximum likelihood
• Calculating model coefficients using Solver
• Evaluating the ARMA model
• Forecasting
• Real Statistics data analysis tool
• Real Statistics ARMA tool options
• Autoregressive Integrated Moving Average Processes (ARIMA)
• Differencing
• Identification
• Calculating model coefficients
• Comparing models
• Forecasting
• Seasonal ARIMA (SARIMA)
• Seasonality for Time Series
• SARIMA models
• Example of an ARIMA model
• SARIMA forecast example
• Real Statistics support
• Miscellaneous Topics
• Mann-Kendall Test

• Sen’s Slope
• Cox-Stuart Test
• Granger Causality
• Cointegration (Engle-Granger Test)
• Cross Correlations
• ARIMAX Model and Forecast
References
Greene, W. H. (2002) Econometric analysis. 5th Ed. Prentice-Hall
https://spu.fem.uniag.sk/cvicenia/ksov/obtulovic/Mana%C5%BE.%20%C5%A1tatistika%20a%20ekonometria/EconometricsGREENE.pdf
Gujarati, D. & Porter, D. (2009) Basic econometrics. 5th Ed. McGraw Hill
https://cbpbu.ac.in/userfiles/file/2020/STUDY_MAT/ECO/1.pdf
Hamilton, J. D. (1994) Time series analysis. Princeton University Press
http://www.ru.ac.bd/stat/wp-content/uploads/sites/25/2019/03/504_02_Hamilton_Time-Series-Analysis.pdf
Wooldridge, J. M. (2009) Introductory econometrics, a modern approach. 5th Ed. South-Western, Cengage Learning
https://economics.ut.ac.ir/documents/3030266/14100645/Jeffrey_M._Wooldridge_Introductory_Econometrics_A_Modern_Approach__2012.pdf
Autoregressive Processes
A p-order autoregressive process, denoted AR(p), takes the form
yi = φ0 + φ1yi-1 + ⋯ + φpyi-p + εi
Thinking of the subscripts i as representing time, we see that the value of y at time i is a linear function of y at earlier times plus a fixed constant and a random
error term. Similar to the ordinary linear regression model, we assume that the error
terms are independently distributed based on a normal distribution with zero mean
and a constant variance σ2 and that the error terms are independent of the y values.
Topics:
• Basic Concepts
• Characteristic Equation
• Partial Autocorrelation
• Finding Model Coefficients using ACF/PACF
• Finding Model Coefficients using Linear Regression
• Lag Function Representation
• Augmented Dickey-Fuller Test
• Other Unit Root Tests


Autoregressive Process Proofs
Property 1: The mean of the yi in a stationary AR(p) process is
μ = φ0 / (1 – φ1 – ⋯ – φp)
Proof: Since the process is stationary, for any k, E[yi] = E[yi-k], a value which we will denote μ.
Since E[εi] = 0, E[φ0] = φ0 and E[yi] = φ0 + φ1E[yi-1] + ⋯ + φpE[yi-p], it follows that
μ = φ0 + (φ1 + ⋯ + φp)μ
Solving for μ yields the desired result.


Property 2: The variance of the yi in a stationary AR(1) process is
var(yi) = σ2 / (1 – φ1^2)
Proof: Since the yi and εi are independent, by basic properties of variance, it follows that
var(yi) = φ1^2 var(yi-1) + σ2
Since the process is stationary, var(yi) = var(yi-1), and so
var(yi) = φ1^2 var(yi) + σ2
Solving for var(yi) yields the desired result.


Property 3: The lag h autocorrelation in a stationary AR(1) process is
ρh = φ1^h
Proof: First note that for any constant a, cov(a+x, a+y) = cov(x,y). Thus, cov(yi,yj) has the same
value even if we assume that φ0 = 0, and similarly for var(yi) = cov(yi,yi). Thus, it suffices to prove
the property when φ0 = 0. In this case, by Property 1, μ = 0, and so cov(yi,yj) = E[yiyj].
Thus

since by the stationary property, E[yi-1 yi-k] = γk-1. Now, by induction on k, it is easy to see that

Hence
Autoregressive Processes Basic Concepts
In a simple linear regression model, the predicted dependent variable is modeled as
a linear function of the independent variable plus a random error term.

A first-order autoregressive process, denoted AR(1), takes the form
yi = φ0 + φ1yi-1 + εi
Thinking of the subscripts i as representing time, we see that the value of y at
time i+1 is a linear function of y at time i plus a fixed constant and a random error
term. Similar to the ordinary linear regression model, we assume that the error terms
are independently distributed based on a normal distribution with zero mean and a
constant variance σ2 and that the error terms are independent of the y values. Thus

It turns out that such a process is stationary when |φ1| < 1, and so we will make this
assumption as well. Note that if |φ1| = 1 we have a random walk.
Similarly, a second-order autoregressive process, denoted AR(2), takes the form
yi = φ0 + φ1yi-1 + φ2yi-2 + εi
and a p-order autoregressive process, AR(p), takes the form
yi = φ0 + φ1yi-1 + ⋯ + φpyi-p + εi
Property 1: The mean of the yi in a stationary AR(p) process is
μ = φ0 / (1 – φ1 – ⋯ – φp)
Proof: click here


Property 2: The variance of the yi in a stationary AR(1) process is
var(yi) = σ2 / (1 – φ1^2)
Proof: click here


Property 3: The lag h autocorrelation in a stationary AR(1) process is ρh = φ1^h.
Proof: click here
Example 1: Simulate a sample of 100 elements from the AR(1) process
yi = 5 + .4yi-1 + εi
where εi ∼ N(0,1) and calculate ACF.


Thus φ0 = 5, φ1 = .4 and σ = 1. We simulate the independent εi by using the Excel formula =NORM.INV(RAND(),0,1) or =NORM.S.INV(RAND()) in column B of Figure 1 (only the first 20 of 100 values are displayed).
The value of y1 is calculated by placing the formula =5+0.4*0+B4 in cell C4 (i.e. we
arbitrarily assign the value zero to y0). The other yi values are calculated by placing
the formula =5+0.4*C4+B5 in cell C5, highlighting the range C5:C103 and pressing Ctrl-D.
By Properties 1 and 2, the theoretical values for the mean and variance are μ = φ0/(1–
φ1) = 5/(1–.4) = 8.33 (cell F22) and

(cell F23). These compare to the actual time series values of ȳ = AVERAGE(C4:C103) = 8.23 (cell I22) and s2 = VAR.S(C4:C103) = 1.70 (cell I23).
I23).

The time series ACF values are shown for lags 1 through 15 in column F. These are
calculated from the y values as in Example 1. Note that the ACF value at lag 1 is
.394376. Based on Property 3, the population ACF value at lag 1 is ρ1 = φ1 = .4.
Theoretically, the values for ρh = φ1^h = .4^h should get smaller and smaller as h increases (as shown in column G of Figure 1).
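The same simulation can be scripted; a sketch in Python (seed arbitrary) comparing the sample ACF with the theoretical values .4^h:

```python
import numpy as np

rng = np.random.default_rng(1)
phi0, phi1, n = 5.0, 0.4, 100
y = np.zeros(n)
y[0] = phi0 + rng.standard_normal()            # y_0 is taken to be 0
for i in range(1, n):
    y[i] = phi0 + phi1 * y[i-1] + rng.standard_normal()

ybar = y.mean()
acf = [np.sum((y[h:] - ybar) * (y[:-h] - ybar)) / np.sum((y - ybar) ** 2)
       for h in range(1, 16)]
print(np.round(acf, 3))                                 # sample ACF, lags 1..15
print(np.round([phi1 ** h for h in range(1, 16)], 3))   # theoretical ACF
```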

Figure 1 – Simulated AR(1) process


The graph of the y values is shown on the left of Figure 2. As you can see, no
particular pattern is visible. The graph of ACF for the first 15 lags is shown on the
right side of Figure 2. As you can see, the actual and theoretical values for the first
two lags agree, but after that, the ACF values are small but not particularly
consistent.

Figure 2 – Graphs of simulated AR(1) process and ACF
Observation: Based on Property 3, for 0 < φ1 < 1, the theoretical values of ACF
converge to 0. If φ1 is negative, -1 < φ1 < 0, then the theoretical values of ACF also
converge to 0, but alternate in sign between positive and negative.
Property 4: For any stationary AR(p) process, the autocovariance at lag k > 0 can be calculated as
γk = φ1γk-1 + φ2γk-2 + ⋯ + φpγk-p
Similarly the autocorrelation at lag k > 0 can be calculated as
ρk = φ1ρk-1 + φ2ρk-2 + ⋯ + φpρk-p
Here we assume that γh = γ-h and ρh = ρ-h if h < 0, and ρ0 = 1.


These are known as the Yule-Walker equations.
Proof: click here
Property 5: The Yule-Walker equations also hold where k = 0 provided we add
a σ2 term to the sum. This is equivalent to
γ0 = φ1γ1 + φ2γ2 + ⋯ + φpγp + σ2
Observation: In the AR(1) case, we have
γ1 = φ1γ0
and
γ0 = φ1γ1 + σ2
Solving for γ0 yields
γ0 = σ2 / (1 – φ1^2)
In the AR(2) case, we have
ρ1 = φ1 + φ2ρ1
Solving for ρ1 yields
ρ1 = φ1 / (1 – φ2)
Also
ρ2 = φ1ρ1 + φ2
We can also calculate the variance as follows:
γ0 = φ1γ1 + φ2γ2 + σ2
Solving for γ0 yields
γ0 = σ2 / (1 – φ1ρ1 – φ2ρ2)
This value can be re-expressed algebraically as described in Property 7 below.


Property 6: The following hold for a stationary AR(2) process

Proof: Follows from Property 4, as shown above.


Property 7: The variance of the yi in a stationary AR(2) process is

Proof: click here for an alternative proof.


Partial Autocorrelation for AR(p) Process
Property 1: For an AR(p) process yi = φ0 + φ1yi-1 + … + φpyi-p + εi, PACF(k) = φk (taking φk = 0 for k > p).
Thus, for k > p it follows that PACF(k) = 0.
Example 1: Chart the PACF for the data in Example 1 of Autoregressive Processes Basic Concepts.
Using the PACF function and Property 1, we get the result shown in Figure 1.

Figure 1 – Graph of PACF for AR(1) process
Observation: We see from Figure 1 that the PACF values for lags > 1 are close to
zero, as is expected, although there is some random fluctuation from zero.
Example 2: Repeat Example 1 for the AR(2) process
yi = 5 + .4yi-1 – .1yi-2 + εi
where εi ∼ N(0,1), and calculate ACF and PACF.


From Example 2 of Characteristic Equation of AR(p) Process, we know that this
process is stationary.

Figure 2 – Simulated AR(2) process
This time we place the formula =5+0.4*0-0.1*0+B4 in cell C4, =5+0.4*C4-0.1*0+B5 in cell C5 and =5+0.4*C5-0.1*C4+B6 in cell C6, highlight the range C6:C103 and press Ctrl-D.
The ACF and PACF are shown in Figure 3.

Figure 3 – ACF and PACF for AR(2) process
As you can see, there isn’t a perfect fit between the theoretical and actual ACF and
PACF values.
Characteristic Equation for AR(p) Processes
Property 1: An AR(p) process is stationary provided all the roots of the following
polynomial equation (called the characteristic equation) have an absolute value
greater than 1.
1 – φ1z – φ2z^2 – ⋯ – φpz^p = 0
This is equivalent to saying that if z satisfies the characteristic equation then |z| > 1.
In fact, setting w = 1/z, this is equivalent to saying that |w| < 1 for any w that satisfies
the following equation
w^p – φ1w^(p-1) – ⋯ – φp = 0
By the Fundamental Theorem of Algebra, any pth degree polynomial has p roots; i.e.
there are p values of z that satisfy the above equation. Unfortunately, not all of these
roots need to be real; some can involve “imaginary” numbers such as √-1, which is usually abbreviated by the letter i. For example, the equation z2 + 1 = 0 has the roots i and –i, as can be seen by substituting either of these values for z in the equation z2 + 1.
We now give three properties of imaginary numbers, which will help us avoid
discussing imaginary numbers in any further detail:
• all values which involve imaginary numbers can be expressed in the form a
+ bi where a and b are real numbers
• if a + bi is a root of a pth degree polynomial, then so is a – bi

• if z = a + bi then the absolute value of z is defined by |z| = √(a2 + b2)
Since a and b are real numbers, not involving √-1, we only need to deal with real numbers.
Property 2: An AR(1) process is stationary provided |φ1| < 1
Property 3: An AR(2) process is stationary provided
|φ2| < 1 and |φ1| + φ2 < 1
Example 1: Determine whether the following AR(2) process is stationary.
yi = φ0 + 2yi-1 – .5yi-2 + εi
The roots of w2 – 2w + .5 = 0 are

This process is not stationary since 1 + √.5 ≥ 1. You get the same result via Property 3 since |φ1| + φ2 = 2 – .5 = 1.5 ≥ 1.
Example 2: Determine whether the following AR(2) process is stationary.
yi = 5 + .4yi-1 – .1yi-2 + εi
Since

the roots of the reverse characteristic equation are not real. In fact

Thus

and so we see that this AR(2) process is stationary. We get the same result via
Property 3 since
|φ2| = .1 < 1 and |φ1| + φ2 = .4 – .1 = .3 < 1
Observation: It turns out that by Property 4 of Basic AR Concepts, for any k, ρk can be expressed as a linear combination

where w1, …, wp are the unique roots of the reverse characteristic equation

Real Statistics Function: The Real Statistics Resource Pack supplies the following
array function where R1 is a p × 1 range containing the phi coefficients of the
polynomial where φp is in the first position and φ1 is in the last position.
ARRoots(R1): returns a p × 3 range where each row contains one root, and where
the first column consists of the real part of the roots, the second column consists of
the imaginary part of the roots and the third column contains the absolute value of
the roots

This function calls the ROOTS function described in Roots of a Polynomial. Note that, just like the ROOTS function, the ARRoots function can take the following optional arguments:
ARRoots(R1, prec, iter, r, s)
prec = the precision of the result, i.e. how close to zero is acceptable. This value
defaults to 0.00000001.
iter = the maximum number of iterations performed when running Bairstow’s Method. The default is 50.
r, s = the initial seed values when using Bairstow’s Method. These default to zero.
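Property 1 can also be checked directly with numpy, avoiding Bairstow’s Method altogether (function name ours):

```python
import numpy as np

def ar_is_stationary(phi):
    """Stationarity check for an AR(p) process with coefficients
    phi = [phi_1, ..., phi_p]: all roots z of the characteristic
    polynomial 1 - phi_1 z - ... - phi_p z^p must satisfy |z| > 1."""
    coeffs = np.concatenate(([1.0], -np.asarray(phi, dtype=float)))[::-1]
    roots = np.roots(coeffs)         # np.roots wants the highest power first
    return bool(np.all(np.abs(roots) > 1)), roots

print(ar_is_stationary([2, -0.5]))    # Example 1: not stationary
print(ar_is_stationary([0.4, -0.1]))  # Example 2: stationary
```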
Other Unit Root Tests
Two other unit root tests are commonly used, in addition to or instead of
the Augmented Dickey-Fuller Test, namely:
• Phillips-Perron (PP) test

107
• Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test
While the ADF test uses a parametric autoregression to estimate the errors, the PP
test uses a non-parametric approach.
The KPSS test uses yet a different approach. Unlike the other tests, the null
hypothesis for the KPSS test is that the time series is stationary, while the alternative
hypothesis is that there is a unit root.
Real Statistics Functions: The Real Statistics Resource Pack provides the following
array functions where R1 contains a column of time series data.
PPTEST(R1, lab, lags, type, alpha) – an array function that returns a column
range for the PP test consisting of tau-stat, tau-crit, stationary (yes/no), lags and the
autocorrelation coefficient and p-value.
KPSSTEST(R1, lab, lags, type, alpha) – an array function that returns a column
range for the KPSS test consisting of test-stat, crit-value, stationary (yes/no), lags
and p-value.
As usual, if lab = TRUE (default is FALSE), the output consists of two columns
whose first column contains labels. type = the test type (0, 1, 2, default is 1). The
default value for alpha is .05.
To specify the test type, you can use “” or “none” instead of 0, you can use “drift”
or “constant” instead of 1 and you can use “trend” or “both” instead of 2.
Note too that the KPSS test does not support the case where there is no constant and
no trend. Thus, type for KPSSTEST is restricted to 1 and 2. If type = 0 is used, then
it is assumed that type = 1.
You can either specify the number of lags to test or use the values “short” or “long”.
If “short” is specified then lags is calculated to be =Round(4*(n/100)^.25,0)
where n = the number of elements in the time series, while if lags = “long” then the
value =Round(12*(n/100)^.25,0) is used.
In Figure 1, we repeat the analysis for Example 1 of Augmented Dickey-Fuller
Test using the PP and KPSS tests, specifying lags = “short” (which is equivalent
to lags = 3).

Figure 1 – PP and KPSS Tests
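For readers who want to cross-check these tests outside Excel, the following is a minimal Python sketch. It assumes the statsmodels package for the ADF and KPSS tests; the Phillips-Perron test is not in statsmodels but is available in the separate arch package (shown commented out):

import numpy as np
from statsmodels.tsa.stattools import adfuller, kpss

y = np.cumsum(np.random.normal(size=200))          # example data: a random walk

adf_stat, adf_p = adfuller(y, regression="c")[:2]  # "c" ~ type 1 (drift/constant)
kpss_stat, kpss_p = kpss(y, regression="c", nlags="auto")[:2]
print(adf_p)    # ADF: small p-value suggests stationarity (rejects unit root)
print(kpss_p)   # KPSS: small p-value suggests a unit root (rejects stationarity)

# from arch.unitroot import PhillipsPerron          # requires the arch package
# print(PhillipsPerron(y, trend="c"))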
Moving Average Processes
A q-order moving average process, denoted MA(q), takes the form
yi = μ + εi + θ1εi-1 + θ2εi-2 + ⋅⋅⋅ + θqεi-q
Thinking of the subscripts i as representing time, we see that the value of y at time i is a linear function of the current and past errors. We assume that the error terms are independently distributed with a normal distribution with mean zero and a constant variance σ2.
Topics:
• Basic Concepts
• Infinite-order Moving Average
• Invertibility
• Finding Model Coefficients using ACF
• Finding Model Coefficients using Solver
The mathematical proofs of some of the properties of Moving Average Processes are given in Moving Average Proofs.
MA(q) Process Basic Concepts
A q-order moving average process, denoted MA(q), takes the form
yi = μ + εi + θ1εi-1 + θ2εi-2 + ⋅⋅⋅ + θqεi-q
Thinking of the subscripts i as representing time, we see that the value of y at time i is a linear function of the current and past errors. We assume that the error terms are independently distributed with a normal distribution with mean zero and a constant variance σ2. Thus
εi ∼ N(0, σ2)

Observation: An MA(q) process can be expressed as
zi = εi + θ1εi-1 + θ2εi-2 + ⋅⋅⋅ + θqεi-q
where zi = yi – μ. Thus, we can often simplify our analyses by restricting ourselves to the case where the mean is zero.
Using the lag operator, we can express a zero mean MA(q) process as
yi = θ(L)εi
where
θ(L) = 1 + θ1L + θ2L2 + ⋅⋅⋅ + θqLq
Property 1: The mean of an MA(q) process is μ.
Property 2: The variance of an MA(q) process is
var(yi) = σ2(1 + θ12 + θ22 + ⋅⋅⋅ + θq2)
Property 3: The autocorrelation function of an MA(1) process is
ρ1 = θ1/(1 + θ12) and ρh = 0 for h > 1
Property 4: The autocorrelation function of an MA(2) process is
ρ1 = (θ1 + θ1θ2)/(1 + θ12 + θ22)     ρ2 = θ2/(1 + θ12 + θ22)     ρh = 0 for h > 2
Property 5: The autocorrelation function of an MA(q) process is
ρh = (θh + θ1θh+1 + θ2θh+2 + ⋅⋅⋅ + θq-hθq)/(1 + θ12 + θ22 + ⋅⋅⋅ + θq2)
for h ≤ q and ρh = 0 for h > q


Observation: The proofs of Properties 1–5 are given in Moving Average Proofs.
Property 6: The PACF of an MA(1) process is
πj = –(–θ1)j(1 – θ12)/(1 – θ12j+2)
where 1 ≤ j < n.
If the process is invertible (see Invertible MA(q) Processes), i.e. |θ1| < 1, then |πj| → 0 as j increases.
Example 1: Simulate a sample of size 199 from the MA(1) process yi = 4 + εi + .5εi-
1 where εi ∼ N(0,2).

Thus μ = 4, θ1 = .5 and σ = 2. We simulate the independent εi by using the Excel


formula =NORM.INV(RAND(),0,2) in column B of Figure 1 (only the first 20 of 199 values are shown). The yi values are calculated by placing the formula
=4+B5+0.5*B4 in cell C5, highlighting the range C5:C203 and pressing Ctrl-D. The
graph of the y values is shown on the right side of Figure 1. As you can see, no
particular pattern is visible.

Figure 1 – Simulated MA(1) data
By Properties 1 and 2, the theoretical values for the mean and variance are μ = 4 and var(yi) = σ2(1 + θ12) = 22(1 + .52) = 5. These compare to the actual time series values of y̅ = AVERAGE(C6:C204) = 4.358 and s2 = VAR.P(C6:C204) = 4.401.
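A quick way to sanity-check Properties 1 and 2 is to rerun the simulation outside Excel. A minimal sketch, assuming NumPy (the seed is arbitrary):

import numpy as np

rng = np.random.default_rng(1)
eps = rng.normal(0, 2, 200)           # sigma = 2, so sigma^2 = 4
y = 4 + eps[1:] + 0.5 * eps[:-1]      # y_i = 4 + eps_i + .5 eps_(i-1), 199 values
print(y.mean(), y.var())              # roughly mu = 4 and 4(1 + .25) = 5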
The ACF values are shown for lags 1 through 15 in Figure 2. These are calculated
from the y values as in Example 1 of AR(p) Process Basic Concepts. Note that the
ACF value at lag 1 is .301285. Based on Property 3, the population ACF value at lag
1 is
ρ1 = θ1/(1 + θ12) = .5/(1 + .25) = .4
The ACF values for lags h > 1 vary from about -.18 to .16, compared to the
theoretical value of ρh = 0 (per Property 3). As you can see from Figure 2, the sample
values can be quite different from the theoretical values.

Figure 2 – ACF for MA(1) process
Observation: By Property 3
ρ1 = θ1/(1 + θ12)
But note that
(1/θ1)/(1 + (1/θ1)2) = θ1/(1 + θ12)
and so the reciprocal of θ1 yields the same ACF. Thus, if we are seeking a
coefficient θ1 that yields a particular ρ1 value, we can always choose a coefficient
whose absolute value is at most 1. In fact, it turns out that there is always a unique
such θ1 whose absolute value is less than 1.
This generalizes to MA(q) processes as well, and so we will restrict our MA(q) processes to invertible ones (see Invertible MA(q) Processes); in the MA(1) case, invertibility is exactly the condition |θ1| < 1.
Example 2: Chart PACF for the data in Example 1.
The approach is as described in Example 1 of Partial Autocorrelation Function. The
chart is shown in Figure 3.

Figure 3 – Graph of PACF for MA(1) Process
The theoretical PACF values are calculated using Property 6. In particular, we insert
the following formula in cell G5, highlight the range G5:G19 and press Ctrl-D.
=-((-0.5)^E5*(1-0.5^2)/(1-0.5^(2*E5+2)))
Note that we couldn’t use the following formula
=-(-0.5)^E5*(1-0.5^2)/(1-0.5^(2*E5+2))
This is because Excel gives unary minus higher precedence than exponentiation, and so evaluates any expression of the form -a^n as if it were (-a)^n. Thus both –1^2 and –(–1)^2 are evaluated as 1 instead of -1.
Observation: We can see from Figure 3 that the absolute value of the PACF values
tends towards zero as the lag increases. This is generally true for an MA(q) model.
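The theoretical PACF of Property 6 can also be computed in Python, where exponentiation binds more tightly than unary minus (the opposite of Excel), so the precedence issue noted above does not arise if parentheses are used explicitly. A minimal sketch (the function name ma1_pacf is illustrative):

def ma1_pacf(theta1, j):
    # PACF(j) = -(-theta1)^j (1 - theta1^2) / (1 - theta1^(2j+2))
    return -((-theta1) ** j) * (1 - theta1 ** 2) / (1 - theta1 ** (2 * j + 2))

print([round(ma1_pacf(0.5, j), 4) for j in range(1, 6)])   # PACF(1) = .4, then decays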
Calculating MA Coefficients using Solver
We now show how to use Excel’s Solver to calculate the parameters that best fit an MA(q) process to some empirical time series data, based on the assumption that the data does indeed fit an MA(q) process for some specific value of q.
Example 1: Repeat Example 1 of Calculating MA Coefficients using ACF using Solver.
We created our 200 element time series by simulating the MA(1) process yi = εi – .4εi-1 with σ2 = .25. The values in the time series are shown in range C4:C203 of Figure 1.
Our goal is to fit this data to an MA(1) process of the form yi = μ + εi + θ1εi-1 (ignoring that the time series was derived from a simulation of an MA(1) process).

Figure 1 – Using Solver to fit an MA(1) process
As we have done elsewhere, we calculate the mean of the time series to provide our estimate of the mean of the process, namely, the estimate of μ = AVERAGE(C4:C203) = .03293, which, as noted previously, is not significantly different from zero.
We can now either subtract off this value for the mean from the y values in column
C or simply assume that the mean is zero, and proceed assuming that μ = 0, which
is what we will do here.
Since yi = εi + θ1εi-1, it follows that εi = yi – θ1εi-1. Thus, for any estimated value of θ1,
we can calculate the values of the εi for i > 1 based on the data values in the time
series and the assumption that the initial residual value is zero, i.e. ε0 = 0.
Thus, we place 0 in cell D4 of Figure 1. Next, we insert the formula =C5-$G$3*D4
in cell D5, highlight the range D5:D203 and press Ctrl-D.

By Property 2 of Moving Average Processes Basic Concepts
var(yi) = σ2(1 + θ12)
and so an estimate for σ2 can be calculated from the estimate for θ1 via σ2 = var(yi)/(1 + θ12), using the formula VAR.P(C5:C204) or VARP(C5:C204) as an estimate for var(yi). This is the formula in cell G4 of Figure 1.
We use as an initial guess for the value of θ1 the value calculated by using the ACF estimate from Example 1 of Calculating MA Coefficients using ACF, namely -0.28958, although we could simply use 0. As usual, we will use Solver to minimize the mean squared error (MSE); since n is fixed, this is equivalent to minimizing the sum of the squares of the εi values, as shown by the formula in cell G6 of Figure 1.
We now select Data > Analysis|Solver which brings up the dialog box shown on
the right side of Figure 1. We fill in the values shown to minimize MSE (cell G6) by
changing the value of θ1 (cell G3). Note that since we want to restrict θ1 to have an
absolute value less than 1, we add the constraints shown in the dialog box. We could
also add a constraint to ensure that σ2 > 0, although this is not necessary since the
formula in cell G4 already creates this constraint (provided all the data elements in
column C are not equal).
When we click on the Solve button, the results shown in Figure 2 appear.

Figure 2 – Solver output for MA(1) process
We see that MSE is minimized when θ1 = -0.35909, a value that is closer to the
original value than the result from Example 1 of Calculating MA Coefficients using
ACF.
We should also make sure that the residual values in column D are consistent with
white noise. We see this by using the Ljung-Box test or looking at the Correlogram,
as shown in Figure 3.

Figure 3 – Check that residuals are white noise
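The same minimization can be reproduced outside Excel. A minimal sketch using scipy in place of Solver (assuming the recursive residuals with ε0 = 0 described above; the seed and function names are illustrative):

import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(3)
e = rng.normal(0, 0.5, 201)
y = e[1:] - 0.4 * e[:-1]                     # simulate y_i = eps_i - .4 eps_(i-1)

def sse(theta1, y):
    eps_prev, total = 0.0, 0.0
    for yi in y:                             # eps_i = y_i - theta1 * eps_(i-1)
        eps_prev = yi - theta1 * eps_prev
        total += eps_prev * eps_prev
    return total

res = minimize_scalar(sse, bounds=(-0.99, 0.99), args=(y,), method="bounded")
print(res.x)                                 # close to theta1 = -0.4
print(np.var(y) / (1 + res.x ** 2))          # estimate of sigma^2 via Property 2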
Example 2: Repeat Example 1 trying to fit the data with an MA(2) process.
The approach is the same, except that this time, we calculate the residuals using the
formula εi = yi – θ1εi-1 – θ2εi-2, and assume that ε0 = ε-1 = 0. E.g., this is captured in
Excel by using the formula =C6-$G$3*D5-G$4*D4 in cell D6 and similarly for the
other cells in column D.
The setup for Solver is shown in Figure 4 using initial guesses of θ1 = θ2 = 0.

Figure 4 – Using Solver to fit an MA(2) process
The output is shown in Figure 5.

Figure 5 – Solver output for MA(2) process


This results in the MA(2) process
yi = εi – 0.31991εi-1 – 0.06596εi-2
with σ2 = 0.194861.

In Comparing ARIMA Models we discuss how to determine which model is a better
fit, the MA(1) process from Example 1 or the MA(2) process from Example 2. We
also show how to use this model for forecasting in ARIMA Forecasting.
Moving Average Proofs
Moving Average Basic Concepts
The following are proofs of properties found in Moving Averages Basic
Concepts
Property 1: The mean of an MA(q) process is μ.
Proof:
E[yi] = E[μ + εi + θ1εi-1 + ⋅⋅⋅ + θqεi-q] = μ + E[εi] + θ1E[εi-1] + ⋅⋅⋅ + θqE[εi-q] = μ
since E[εi-j] = 0 for all j.
Property 2: The variance of an MA(q) process is
var(yi) = σ2(1 + θ12 + θ22 + ⋅⋅⋅ + θq2)
Proof:
var(yi) = var(εi) + θ12var(εi-1) + ⋅⋅⋅ + θq2var(εi-q) = σ2(1 + θ12 + ⋅⋅⋅ + θq2)
since the εi-j are independent, each with variance σ2.
Property 3: The autocorrelation function of an MA(1) process is
ρ1 = θ1/(1 + θ12) and ρh = 0 for h > 1
Proof:
cov(yi, yi-h) = E[(εi + θ1εi-1)(εi-h + θ1εi-h-1)]
When h = 1
cov(yi, yi-1) = θ1E[εi-12] = θ1σ2
since E[εi-1] = 0 and the remaining cross terms vanish. When h > 1
cov(yi, yi-h) = 0
Thus for h = 1, by Property 2
ρ1 = θ1σ2/[σ2(1 + θ12)] = θ1/(1 + θ12)
and for h > 1
ρh = 0
Property 4: The autocorrelation function of an MA(2) process is
ρ1 = (θ1 + θ1θ2)/(1 + θ12 + θ22)     ρ2 = θ2/(1 + θ12 + θ22)     ρh = 0 for h > 2
Proof:
cov(yi, yi-h) = E[(εi + θ1εi-1 + θ2εi-2)(εi-h + θ1εi-h-1 + θ2εi-h-2)]
Thus, when h = 1
cov(yi, yi-1) = (θ1 + θ1θ2)σ2
and when h = 2
cov(yi, yi-2) = θ2σ2
and when h > 2
cov(yi, yi-h) = 0
It now follows that for h = 1, by Property 2
ρ1 = (θ1 + θ1θ2)/(1 + θ12 + θ22)
and for h = 2
ρ2 = θ2/(1 + θ12 + θ22)
and for h > 2
ρh = 0
Property 5: The autocorrelation function of an MA(q) process is
ρh = (θh + θ1θh+1 + ⋅⋅⋅ + θq-hθq)/(1 + θ12 + ⋅⋅⋅ + θq2)
for h ≤ q and ρh = 0 for h > q.
Proof: As in the proof of Property 3 of Autoregressive Processes, it is sufficient to prove the property in the case where the mean = 0 (and so the constant term is zero). Since E[εiεi] = σ2 and E[εiεi-j] = 0 when j ≠ 0, it follows that
cov(yi, yi-h) = σ2(θh + θ1θh+1 + ⋅⋅⋅ + θq-hθq)
for h ≤ q and cov(yi, yi-h) = 0 if h > q.
By Property 2 it now follows that
ρh = (θh + θ1θh+1 + ⋅⋅⋅ + θq-hθq)/(1 + θ12 + ⋅⋅⋅ + θq2)
for h ≤ q and ρh = 0 for h > q.


Infinite Moving Average Processes
An infinite-order moving average process, denoted MA(∞), takes the form

where the following infinite series is finite (i.e. converges to a


real value)

121
and

We can express a MA(∞) process as


where it is assumed that ψ0 = 1.
Observation: That |ψj|converges ensures that the yi take finite values and
that converges.
Example 1: Show that the AR(1) process from Example 1 of Autoregressive Processes Basic Concepts can be represented by an MA(∞) process.
The process is yi = 5 + .4yi-1 + εi. By Property 1 of Autoregressive Processes Basic Concepts
μ = φ0/(1 – φ1) = 5/(1 – .4) = 8.333
Now define zi = yi – μ. Then the original AR(1) process can be transformed into the process
zi = .4zi-1 + εi
But then
zi-1 = .4zi-2 + εi-1
and so
zi = .4(.4zi-2 + εi-1) + εi
which means that
zi = .42zi-2 + .4εi-1 + εi
Similarly
zi-2 = .4zi-3 + εi-2
which results in
zi = .43zi-3 + .42εi-2 + .4εi-1 + εi
Continuing in this way, we get
zi = εi + .4εi-1 + .42εi-2 + .43εi-3 + ⋅⋅⋅
and so
yi = 8.333 + εi + .4εi-1 + .42εi-2 + .43εi-3 + ⋅⋅⋅
is the desired MA(∞) process.
Property 1: Any stationary AR(1) process can be expressed as an MA(∞) process. In fact
yi = μ + εi + φ1εi-1 + φ12εi-2 + φ13εi-3 + ⋅⋅⋅
Proof: Using the same approach as in Example 1, we find that the AR(1) process
yi = φ0 + φ1yi-1 + εi
can be expressed as
zi = εi + φ1εi-1 + φ12εi-2 + ⋅⋅⋅
where zi = yi – μ and μ = φ0/(1 – φ1). Since the original process is a stationary AR(1), |φ1| < 1 and the εi have the desired properties.
Observation: Another way to see this is to use the lag operator, namely that an AR(1) process (with zero mean) can be expressed as
φ(L)yi = εi
where
φ(L) = 1 – φ1L
as well as
yi = ψ(L)εi
where
ψ(L) = 1 + ψ1L + ψ2L2 + ⋅⋅⋅
Substituting the first equation inside the second, we get
yi = ψ(L)φ(L)yi
i.e.
(1 + ψ1L + ψ2L2 + ⋅⋅⋅)(1 – φ1L) = 1
Here we recall that ψ0 = 1. Equating the coefficients, we see that for all j > 0
ψj – φ1ψj-1 = 0
Thus
ψ1 = φ1, ψ2 = φ1ψ1 = φ12, ψ3 = φ1ψ2 = φ13, etc.
and so we see that
ψj = φ1j
It follows that
yi = εi + φ1εi-1 + φ12εi-2 + ⋅⋅⋅
We also observed above that
ψ(L)φ(L) = 1
and so ψ(L) is the inverse of φ(L)

Property 2: Any stationary AR(p) process can be expressed as an MA(∞) process.
Proof: The proof is similar to that of Property 1.

Example 2: Show that the following AR(2) process can be represented by an MA(∞)
process.
By Property 1 of Autoregressive Processes Basic
Concepts, the mean is

Now define
Then the original AR(2) process can be transformed into the process
But then
and so

etc. Thus the first few terms of the MA(∞) process are

Property 3 (Wold’s Decomposition Theorem): Any stationary process can be represented as an MA(∞) process (possibly together with a deterministic component).
Property 4: The following are true for any MA(∞) process
E[yi] = μ     var(yi) = σ2(1 + ψ12 + ψ22 + ⋅⋅⋅)     cov(yi, yi+k) = σ2(ψk + ψ1ψk+1 + ψ2ψk+2 + ⋅⋅⋅)
Proof: See Moving Average Proofs


Real Statistics Function: The Real Statistics Resource Pack provides the following
array function where R1 is a column range consisting of phi coefficients and R2 is a
column range consisting of theta coefficients.
PSICoeff(R1, R2, k, rev): returns a k × 1 range containing the first k psi
coefficients (starting with ψ0 = 1) for the ARMA model with the coefficients in R1
and R2.
If k is omitted (default) then k is set equal to the number of rows in the highlighted range. If rev = TRUE (default), then the phi and theta coefficients are listed in reverse order, φp, φp-1, …, φ1 and θq, θq-1, …, θ1, respectively.
Since both phi and theta coefficients can be present, this function can also handle
ARMA processes as described in ARMA Processes.
We can use the PSICoeff function to find the psi coefficients for Example 2 as shown
in range I10:I14 of Figure 1.

Figure 1 – Convert AR(2) model into an MA(∞) model
Infinite Moving Average Proofs
The following is a proof of Property 4 in Infinite Moving Average Processes.
Property 4: The following are true for any MA(∞) process
E[yi] = μ     var(yi) = σ2(1 + ψ12 + ψ22 + ⋅⋅⋅)     cov(yi, yi+k) = σ2(ψk + ψ1ψk+1 + ψ2ψk+2 + ⋅⋅⋅)
Proof: The assumption that the infinite sum of the absolute values of the ψj terms is finite is needed to show that all the infinite series listed below converge. As usual, it is sufficient to demonstrate the above properties in the case where the mean is 0. In that case
cov(yi, yi+k) = E[(εi + ψ1εi-1 + ψ2εi-2 + ⋅⋅⋅)(εi+k + ψ1εi+k-1 + ψ2εi+k-2 + ⋅⋅⋅)] = σ2(ψk + ψ1ψk+1 + ψ2ψk+2 + ⋅⋅⋅)
since E[εi-jεi+k-h] = σ2 if h = j+k and zero otherwise. The other two properties follow from this last one.
Observation: Property 4 can be used to provide an alternative proof of various properties of AR(p) processes. For example, for an AR(1) process we have seen that ψj = φ1j, and so
var(yi) = σ2(1 + φ12 + φ14 + ⋅⋅⋅) = σ2/(1 – φ12)
This last equality results from the fact that |φ1| < 1, and so φ12 < 1, in which case we have a geometric series that converges as follows:
1 + φ12 + φ14 + ⋅⋅⋅ = 1/(1 – φ12)
See Geometric Series for a proof that the geometric series converges.
Invertibility of MA(q) Processes
Basic Concepts
Just as we can define an infinite-order moving average process, we can also define an infinite-order autoregressive process, AR(∞). It turns out that an MA(q) process can often be expressed as an AR(∞) process. E.g. suppose we have an MA(1) process with μ = 0.
yi = εi + θ1εi-1
Thus
εi = yi – θ1εi-1 = yi – θ1(yi-1 – θ1εi-2) = yi – θ1yi-1 + θ12εi-2
Continuing in this way, after n steps we have
εi = yi – θ1yi-1 + θ12yi-2 – ⋅⋅⋅ + (–θ1)nyi-n + (–θ1)n+1εi-n-1
As a result, we have
εi = yi – θ1yi-1 + θ12yi-2 – θ13yi-3 + ⋅⋅⋅
Or equivalently
yi = θ1yi-1 – θ12yi-2 + θ13yi-3 – ⋅⋅⋅ + εi
It turns out that if |θ1| < 1 then this infinite series converges to a finite value. Such MA(q) processes are called invertible.
Properties
Property 1: If |θ1| < 1 then the MA(1) process is invertible
Property 2: The MA(q) process yi = μ + εi + θ1εi-1 + ⋅⋅⋅ + θqεi-q is invertible provided the absolute values of all the roots of the characteristic polynomial 1 + θ1L + θ2L2 + ⋅⋅⋅ + θqLq = 0 are greater than 1.
Worksheet Function
Real Statistics Function: The Real Statistics Resource Pack supplies the following
array function where R1 is a q×1 range containing the theta coefficients of the
polynomial where θq is in the first position and θ1 is in the last position.
MARoots(R1): returns a q × 3 range where each row contains one root, and where
the first column consists of the real part of the roots, the second column consists of
the imaginary part of the roots and the third column contains the absolute value of
the roots
This function calls the ROOTS function described in Roots of a Polynomial. Note
that just like in the ROOTS functions, the MARoots function can take the following
optional arguments:
MARoots(R1, prec, iter, r, s)
prec = the precision of the result, i.e. how close to zero is acceptable. This value
defaults to 0.00000001.
iter = the maximum number of iterations performed when performing Bairstow’s
Method. The default is 50.
r, s = the initial seed values when using Bairstow’s Method. These default to zero.
Example
Example 1: Determine whether the following MA(3) process is invertible
yi = 4 + εi + .5εi-1 – .2εi-2 + .6εi–3
We insert the array formula =MARoots(B3:B5) in range D3:F5 to obtain the results
shown in Figure 1.

Figure 1 – Roots of an MA(3) process
We see that the three roots of the characteristic equation are -.605828–1.23715i, -
.605828+1.23715i, and -0.87832. Since the absolute value of the real root is less than
1, we conclude that the process is not invertible.
Example 2: Determine whether the following MA(2) process is invertible
yi = εi – .1εi-1 + .21εi-2
Using the same approach as for Example 1, we see that the roots of the characteristic
polynomial are 10/3 and 10/7, both of which are greater than one. Thus, we conclude
that this is an invertible process.
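The same check can be performed outside Excel with NumPy’s polynomial root finder. A minimal sketch (the function name is_invertible is illustrative, not the Real Statistics MARoots function):

import numpy as np

def is_invertible(thetas):                   # thetas = [theta1, ..., thetaq]
    coeffs = list(reversed(thetas)) + [1.0]  # np.roots wants the highest power first
    roots = np.roots(coeffs)                 # roots of 1 + theta1*z + ... + thetaq*z^q
    return bool(np.all(np.abs(roots) > 1)), roots

print(is_invertible([0.5, -0.2, 0.6]))       # the MA(3) process of Example 1: False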
Calculating MA Coefficients using ACF
If we know (or assume) that a time series can be fit by an MA(q) process, then we
need to figure out the value of the parameters μ, σ2, q, θ1, …, θq.
The initial approach to determining the value for q is to look at the ACF values for
the time series under consideration. Since we know that for an MA(q) process, ρk =
0 for all k > q, we seek the first value for q where ACF(q) is approximately zero. We
will refine this approach in Comparing ARIMA Models.
We next turn our attention to finding the other parameters that provide the best fit
for the data.
We start by looking at an MA(1) process yi = μ + εi + θ1εi-1. We know that
Property 1: The mean is μ.
Property 2: The variance is
var(yi) = σ2(1 + θ12)
Property 3: The autocorrelation function is
ρ1 = θ1/(1 + θ12) and ρh = 0 for h > 1

We start by using the mean of the time series as the estimate of μ. We then subtract this value from all the time series values to get a zero mean time series. We then calculate the variance s2 and r = ACF(1) of the time series. We can solve for θ1 using the equation
r = θ1/(1 + θ12)
which is equivalent to the quadratic equation
rθ12 – θ1 + r = 0
which has the solutions
θ1 = (1 ± √(1 – 4r2))/(2r)
Actually θ1 above is really the estimated value of θ1, which typically has a hat over it. These solutions are real provided |r| ≤ .5. It turns out that for large values of n, this estimate is approximately normally distributed with mean θ1, which allows us to construct an approximate confidence interval for θ1 (see the Observation below).
Also, note that
σ2 = s2/(1 + θ12)
Example 1: Assuming that the time series in range C4:C203 of Figure 1 fits an
MA(1) process (only the first 10 of 200 values are shown), find the values
of μ, σ2, θ1 for the MA(1) process.
We actually created the time series using the MA(1) process yi = εi – .4εi-1 with σ2 =
.25. Thus, we entered the formula =NORM.INV(RAND(),0,.5) in all the cells in
range B4:B203 and placed the formula =B4 in cell C4 and the formula =B5-.4*B4
in cell C5. We then highlighted range C5:C203 and pressed Ctrl-D.

Figure 1 – Calculating MA(1) parameters

We now calculate the mean (cell F4), variance (cell F5) and autocorrelation from the
time series as shown in the upper right-hand side of Figure 1. From these values, we
calculate two possible values for θ1, namely -0.28958 and -3.45331. Note that these
values are reciprocals of one another. Only the value θ1 = -0.28958 yields an
invertible MA(1) process since |θ1| < 1. In this case, we see that σ2 = 0.198967.
The result is an estimate of the MA(1) process, namely

with an estimate of 0.198967 for the variance of the εi. Using a one-sample t-test, we
can see that the mean is not significantly different from zero (t = .97, p-value = .33,
2 tailed test).
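A minimal Python sketch of this procedure (assuming NumPy; the function name ma1_from_acf is illustrative), which selects the invertible root and computes the σ2 estimate described above:

import numpy as np

def ma1_from_acf(y):
    z = np.asarray(y, dtype=float) - np.mean(y)      # zero mean series
    s2 = z.var()
    r = (z[1:] * z[:-1]).mean() / s2                 # ACF(1)
    if abs(r) > 0.5:
        raise ValueError("no real solution since |ACF(1)| > .5")
    theta1 = (1 - np.sqrt(1 - 4 * r * r)) / (2 * r)  # the root with |theta1| < 1
    sigma2 = s2 / (1 + theta1 ** 2)
    return theta1, sigma2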
Observation: We can compute a somewhat crude 95% confidence range
for θ1 based on the normal approximation, as shown in Figure 2.

Figure 2 – Confidence interval


If we knew the real value of θ1 the confidence interval would be as shown in column
AD of Figure 2. Since we don’t know the actual value of θ1 we have to make do with
the values estimated in Figure 1 when calculating the standard error (as shown in
column AC of Figure 2). Thus, we see that the real value lies in the interval (-.42349,
-.15566) with 95% confidence.
Observation: In the above example we used a sample with 200 elements. When we
repeated the same analysis with a sample of 1,000 elements we got the following
estimates, which are closer to the original MA(1) process parameters:
with σ2 = 0.23683. The 95% confidence interval
for θ1 also narrowed to (-.446, -.311).
We can obtain better estimates using other techniques, as shown in Calculating
MA(q) Coefficients using Solver.
ARMA Processes
An autoregressive moving average (ARMA) process consists of both
autoregressive and moving average terms. If the process has terms from both an

AR(p) and MA(q) process, then the process is called ARMA(p, q) and can be expressed as
yi = φ0 + φ1yi-1 + ⋅⋅⋅ + φpyi-p + εi + θ1εi-1 + ⋅⋅⋅ + θqεi-q
Topics
• Basic Concepts
• ARMA(1, 1) processes
• ARMA(p, q) processes
• Calculating model coefficients using maximum likelihood
• Calculating model coefficients using Solver
• Evaluating the model
• Forecasting
• Real Statistics data analysis tool
• Real Statistics ARMA tool options
For proofs of some of the properties, see ARMA Proofs.
ARMA(1,1) Processes
For an ARMA(1, 1) process
yi = φ1yi-1 + εi + θ1εi-1
Now let’s suppose that |φ1| < 1. We show how to create the MA(∞) representation as follows:
yi = φ1yi-1 + εi + θ1εi-1 = φ1(φ1yi-2 + εi-1 + θ1εi-2) + εi + θ1εi-1 = φ12yi-2 + εi + (φ1 + θ1)εi-1 + φ1θ1εi-2 = ⋅⋅⋅
Thus
yi = ψ0εi + ψ1εi-1 + ψ2εi-2 + ⋅⋅⋅
where ψ0 = 1 and for j > 0
ψj = φ1j-1(φ1 + θ1)
The MA(∞) representation is therefore
yi = εi + (φ1 + θ1)(εi-1 + φ1εi-2 + φ12εi-3 + ⋅⋅⋅)
If |φ1| < 1, then this ARMA(1,1) process is stationary. It also turns out that when |θ1| < 1, the process is invertible.
Example 1: Find the MA(∞) form of the ARMA(1, 1) process yi = .4yi-1 + εi – .2εi-1
Here ψ1 = φ1 + θ1 = .4 – .2 = .2, ψ2 = φ1ψ1 = .08, ψ3 = φ1ψ2 = .032,
etc.

We get the same result using the Real Statistics PSICoeff array function as shown in
Figure 1.

Figure 1 – Finding MA(∞) Coefficients
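The recursion behind these coefficients is short: ψ1 = φ1 + θ1 and ψj = φ1ψj-1 for j > 1. A minimal Python sketch (the function name arma11_psi is illustrative, not the Real Statistics PSICoeff function):

def arma11_psi(phi1, theta1, k):
    psi = [1.0]                                   # psi_0 = 1
    for j in range(1, k):
        psi.append(phi1 * psi[-1] + (theta1 if j == 1 else 0.0))
    return psi

print(arma11_psi(0.4, -0.2, 5))                   # [1.0, 0.2, 0.08, 0.032, 0.0128]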


Property 1: The following is true for an ARMA(1,1) process
γ0 = var(yi) = σ2(1 + 2φ1θ1 + θ12)/(1 – φ12)
Proof: See ARMA Proofs
Property 2: The following is true for an ARMA(1,1) process
γ1 = σ2(φ1 + θ1)(1 + φ1θ1)/(1 – φ12)
and for k > 1
γk = φ1γk-1
Proof: See ARMA Proofs
Observation: Since z1 = 1/φ1 is the root of the characteristic polynomial φ(z) = 1 – φ1z, we can express the second term in Property 2 as
γk = γ1(1/z1)k-1
It turns out that this can be generalized to other ARMA processes.


Property 3: The following is true for an ARMA(1, 1) process
ρ1 = (φ1 + θ1)(1 + φ1θ1)/(1 + 2φ1θ1 + θ12)
and for k > 1
ρk = φ1ρk-1
Proof: See ARMA Proofs
Observation: When θ1 = –φ1 for an ARMA(1, 1) process, we note that γ0 = σ2 and ρk = 0 for all k ≥ 1, which are the characteristics of white noise.
In fact, the white noise process with zero mean takes the form
yi = εi
We see that φ1yi-1 – φ1εi-1 = 0 for all i. Thus, the process also takes the form
yi = φ1yi-1 + εi – φ1εi-1
which is an ARMA(1,1) process with θ1 = –φ1.


Example 2: Simulate a sample of 105 elements from the ARMA(1,1) process
yi = .7yi-1 + εi – .2εi-1
where εi ∼ N(0, 1) and calculate the ACF.
This is done in Figure 2 by placing the formula =NORM.S.INV(RAND()) in cell B7 and the formula =0.7*C6+B7-0.2*B6 in cell C7, highlighting the range B7:C111 and pressing Ctrl-D (only the first 16 elements of the simulation are shown in Figure 2).

Figure 2 – Simulated ARMA(1,1) process


The ACF values are shown in Figure 3 where the theoretical values are based on
Property 3.

Figure 3 – ACF for ARMA(1,1) Process
Cell M6 contains the formula =ACF($C$12:$C$111,L6), and similarly for the other cells in column M. Cell N6 contains the formula =(Q5+Q6)*(1+Q5*Q6)/(1+2*Q5*Q6+Q5^2) and cell N7 contains the formula =N6*Q$5, and similarly for the rest of the cells in column N.
Since |φ1| = .7 < 1 and |θ1| = .2 < 1, this process is both stationary and invertible.
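To reproduce Figure 3 outside Excel, the following is a minimal Python sketch (assuming NumPy; the seed is arbitrary) that simulates the process and compares the sample ACF with the theoretical values from Property 3:

import numpy as np

rng = np.random.default_rng(2)
n, phi1, theta1 = 105, 0.7, -0.2
eps = rng.normal(0, 1, n + 1)
y = np.zeros(n + 1)
for i in range(1, n + 1):
    y[i] = phi1 * y[i - 1] + eps[i] + theta1 * eps[i - 1]
y = y[1:]

z = y - y.mean()
acf = [(z[k:] * z[:-k]).sum() / (z * z).sum() for k in range(1, 16)]
rho1 = (phi1 + theta1) * (1 + phi1 * theta1) / (1 + 2 * phi1 * theta1 + theta1 ** 2)
theo = [rho1 * phi1 ** (k - 1) for k in range(1, 16)]
print(acf[:3], theo[:3])     # sample vs theoretical ACF at lags 1-3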
Example 3: Simulate a sample of 105 elements from the ARMA(1,1) process
yi = 3 + .7yi-1 + εi – .2εi-1
where εi ∼ N(0, 1) and calculate the ACF.


The only difference between this process and the one in Example 2 is the constant
term. The approach is identical, except that this time you place the formula
=3+0.7*C6+B7-0.2*B6 in cell C7. The result is shown in Figure 4. Note that the
ACF values are identical to those in Figure 3.

Figure 4 – Simulated ARMA(1,1) Process with non-zero mean
ARMA(p,q) Processes
Property 1: An ARMA(p, q) process
yi = φ0 + φ1yi-1 + ⋅⋅⋅ + φpyi-p + εi + θ1εi-1 + ⋅⋅⋅ + θqεi-q
is stationary provided it is causal, i.e. the polynomial
φ(z) = 1 – φ1z – φ2z2 – ⋅⋅⋅ – φpzp ≠ 0
for any z such that |z| ≤ 1.


Observation: Actually, we will only consider stationary ARMA(p, q)
processes.
From Property 1, if z is a root of the polynomial 1 – φ1z – φ2z2 – ··· – φpzp, it
follows that |z| > 1. As in the AR(p) case, this is equivalent to the fact that |w|
< 1 for any w that satisfies the following equation
wp – φ1wp-1 – φ2wp-2 – ⋅⋅⋅ – φp = 0
The causal property implies that (and is equivalent to) the fact that there exist constants ψj such that ψ0 = 1, |ψ1| + |ψ2| + ⋅⋅⋅ < ∞ and
yi = μ + ψ0εi + ψ1εi-1 + ψ2εi-2 + ⋅⋅⋅
Thus all stationary ARMA processes can be expressed as an MA(∞) process.
In fact, the ψj coefficients can be determined as in Property 2.
Property 2: Let
φ(L)yi = θ(L)εi and yi = ψ(L)εi
Then
φ(L)ψ(L) = θ(L)
which in turn results in
ψj = θj + φ1ψj-1 + φ2ψj-2 + ⋅⋅⋅ + φpψj-p
where θ0 = 1, θj = 0 for j > q and ψj = 0 for j < 0.
Proof: See ARMA Proofs
Observation: We will also restrict our attention to invertible ARMA(p, q) processes, i.e. those for which if 1 + θ1z + θ2z2 + ⋅⋅⋅ + θqzq = 0 then |z| > 1.
Calculate ARMA(p,q) coefficients using
Solver
Example 1: Assuming that the time series data in Example 1 of ARMA(1,1)
Processes (duplicated in range F8:F112 of Figure 1) can be represented by an
ARMA(1,1) process, use Solver to find the φ1 and θ1 coefficients.
Since the time series data in Example 1 simulates the ARMA(1,1) process
yi = .7yi-1 + εi – .2εi-1
with σ2 = 1, it is not surprising that we can model the time series as an ARMA(1,1)
process. We now see how close the coefficients are to the coefficients of the original
ARMA(1,1) process.
Since we are assuming that we have an ARMA(1,1) process, we know that
yi = φ1yi-1 + εi + θ1εi-1
Solving for the residual, we have
εi = yi – φ1yi-1 – θ1εi-1
We will assume for the moment that φ0 = 0.


Place 0 in cell G8 and the worksheet formula =F9-SUMPRODUCT(F8,J$8)-
SUMPRODUCT(G8,K$8) in cell G9. Then highlight the range G9:G112 and
press Ctrl-D. Here J8 contains the initial guess for φ1, namely zero and K8 contains
the initial guess for θ1, also zero.

Actually, it is sufficient to use the formula =F9-F8*J$8-G8*K$8 in cell G9. We use
the more complicated formula shown above since it is applicable when we get to the
general ARMA(p,q) case.

Figure 1 – Using Solver to find the coefficients


As described in Calculating ARMA Coefficients using Maximum Likelihood, we need to find the values of φ1 and θ1 that minimize the sum of the squared errors (SSE) of the εi, which is computed using the formula =SUMSQ(G9:G112) in cell K11.
We now use Solver to minimize the value in cell K11.
Select Data > Analysis|Solver and fill in the dialog box that appears as shown on the right side of Figure 1. When you press the Solve button, the values in cells J8 and K8 will change to those that appear in Figure 2.

Figure 2 – Solver output
We see that the values φ1 = .6867541 and θ1 = -0.40305 reduce SSE from 125.98788
to 110.19884. These are the coefficients we are looking for. Note that the phi
coefficient is fairly similar to the original coefficient φ1 = .7, and the theta coefficient
is a little off from the original coefficient θ1 = -0.2.
We can use the Real Statistics T Test and Non-parametric Equivalents data
analysis tool, to determine whether the mean of the data values (in range F8:F112)
is significantly different from zero.

Figure 3 – Testing the constant term


As we see from Figure 3, the mean is not significantly different from zero, which
justifies the assumption that we made earlier to set φ0 = 0.
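The same coefficients can be recovered outside Excel by minimizing SSE numerically. A minimal sketch using scipy’s Nelder-Mead optimizer in place of Solver (the seed is arbitrary):

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
n = 105
e = rng.normal(0, 1, n)
y = np.zeros(n)
for i in range(1, n):
    y[i] = 0.7 * y[i - 1] + e[i] - 0.2 * e[i - 1]   # the simulated ARMA(1,1) process

def sse(params, y):
    phi1, theta1 = params
    eps = np.zeros(len(y))
    for i in range(1, len(y)):        # eps_i = y_i - phi1*y_(i-1) - theta1*eps_(i-1)
        eps[i] = y[i] - phi1 * y[i - 1] - theta1 * eps[i - 1]
    return np.sum(eps ** 2)

res = minimize(sse, x0=[0.0, 0.0], args=(y,), method="Nelder-Mead")
print(res.x)                          # estimates of (phi1, theta1)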
Example 2: Assuming that the time series data in Example 2 of ARMA(1,1)
Processes can be represented by an ARMA(1,1) process, use Solver to find
the φ1 and θ1 coefficients.
The time series data in Example 2 simulates the ARMA(1,1) process
yi = 3 + .7yi-1 + εi – .2εi-1
i.e. the same process as in Example 1, except that the constant term is non-zero.
Repeating the steps described in Example 1 (with a few differences, as described
below), we get the output from Solver shown in Figure 4 (only the first 10 data
elements are displayed).

Figure 4 – Solver output for ARMA(1,1) process with constant term
The data in columns B and C are the same as in Figure 2. Whereas column C contains the original time series data yi, column F contains the data for the time series zi = yi – µ. This is done by placing the formula =C6-K$7 in cell F6, highlighting the range F6:F110 and pressing Ctrl-D. Here cell K7 contains the estimate of the mean µ of the ARMA(1,1) process.
As in Example 1, now place 0 in cell G6 and the formula =F7-
SUMPRODUCT(F6,J$6)-SUMPRODUCT(G6,K$6) in cell G7. Then highlight the
range G7:G110 and press Ctrl-D. We also insert the formula =SUMSQ(G7:G110)
in cell K9 (SSE).
We now use Excel’s Solver to minimize the value of SSE. The Solver dialog box is
filled in as in Figure 1, except that this time we insert K9 in the Set Objective field
and the range J6:K7 in the By Changing Variable Cells field, where J6 contains the
initial guess for φ1, namely zero and K6 contains the initial guess for θ1, also zero,
K7 contains the initial guess for µ, namely 0, and J7 contains the value of φ0, which
is calculated by the formula =K7*(1-SUM(J6)) since, as we saw in defining ARMA(p,q) processes in ARMA Basic Concepts,
φ0 = µ(1 – φ1 – φ2 – ⋅⋅⋅ – φp)
Note that we could simply use the formula =K7*(1-J6) in cell J7 in the ARMA(1,1)
case, although the more complicated formula given above is applicable in the more
general ARMA(p,q) case.
After clicking on the Solve button, the output shown in Figure 4 is displayed. You
can see that the estimated values of φ0, φ1, θ1 are similar, but not exactly the same as
the original values of the simulated ARMA(1,1) process.

Evaluating the ARMA model
In Calculating ARMA(p,q) Coefficients using Solver we showed how to create
an ARMA model for time series data. We now present some statistics for
evaluating the fit of the model. All the statistics we present will be for the
ARMA(1,1) created in Example 2 of Calculating ARMA(p,q) Coefficients
using Solver.
Descriptive Statistics
We start with some descriptive statistics, as shown in Figure 1 (with reference
to the cells in Figure 4 of Calculating ARMA(p,q) Coefficients using Solver).

Figure 1 – Descriptive statistics


We see that the mean of the residuals (cell K11) is approximately zero, as we
expect. The values in cells K12 and K13 provide estimates of σ2. The values of
1.24 and 1.29 are reasonably close to the expected value of 1.
The value y̅ (cell K14) is expected to be
μ = φ0/(1 – φ1) = 3/(1 – .7) = 10
which is reasonably close to the calculated value of 9.855. The value of z̅ is expected to be zero (since zi = yi – µ).
Finally, note that since |φ1| = .751 < 1, the ARMA(1,1) process is causal. Also
since |θ1| = .486 < 1, the ARMA(1,1) process is invertible.
Comparison with other models
We created an ARMA(1,1) model for the data in Example 2 of Calculating
ARMA(p,q) Coefficients using Solver), but how do we know that some other
model, e.g. an ARMA(2,1) or ARMA(2,2) model, isn’t a better fit? Just as we
have done for logistic regression, we seek the model with the smallest value
for LL, but just as we have done for linear regression, we want to penalize
models that have more parameters, favoring those with the fewest

parameters, as long as the LL value (i.e. SSE value) is made as small as
possible.
Akaike’s Information Criterion (AIC) and the Bayesian Information Criterion (BIC) are two such measures. The latter is also called the Schwarz Bayesian Criterion (SBC) or the Schwarz Information Criterion (SIC).
AIC = n ln(SSE/n) + 2k     BIC = n ln(SSE/n) + k ln(n)
where k = the number of parameters in the model, which for a model without a constant term is k = p + q + 1 (including φ1, …, φp, θ1, …, θq, σ); in the case where there is a constant term, k = p + q + 2 (including φ0).
The value of -2LL is as described at the end of Calculating ARMA(p,q) Coefficients using Maximum Likelihood. Since any models of the same time series have the same size, the n(1 + LN(2π)) portion of -2LL, and therefore of AICaug and BICaug, is not relevant, and so can be left out of the definitions of AIC and BIC.
For the time series in Example 2 of Calculating ARMA(p,q) Coefficients
using Solver), the values of these statistics are shown in Figure 2.

Figure 2 – AIC and BIC
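A minimal Python sketch of these definitions (assuming n observations, sum of squared errors sse, and k parameters counted as described above):

import math

def aic_bic(n, sse, k):
    aic = n * math.log(sse / n) + 2 * k            # AIC = n ln(SSE/n) + 2k
    bic = n * math.log(sse / n) + k * math.log(n)  # BIC = n ln(SSE/n) + k ln(n)
    return aic, bic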


Significance of the coefficients
We now test whether each of the coefficients φ0, φ1, θ1 makes a significant
contribution to the value of LL. As we saw when exploring logistic
regression, if we have two models one with all the coefficients and another
where one coefficient is removed, then we can test whether the LL values of

the two models are significantly different by using the fact that 2(LL1 – LL0) ~ χ2(1) where LL1 is the LL value for the complete model and LL0 is the LL value for the reduced model.
For ARMA(p,q) models, this is equivalent to testing the statistic
n(ln SSE0 – ln SSE1) ~ χ2(1)
where SSE0 and SSE1 are the sums of the squared errors of the reduced and
complete models respectively. The results for Example 2 are shown in the
upper portion of Figure 3.

Figure 3 – Significance of model coefficients


The values in cells P6, P7 and P8 are equal to the values in cells J6, K6 and
J7 of Figure 4 of Calculating ARMA(p,q) Coefficients using Solver.
The SSE value in cell Q6 (i.e. the SSE value where the φ1 coefficient is
dropped from the model) is calculated using the Real Statistics formula
=ARMA_SSE(F6:F110,P6,P7,0,1). Similarly, cell Q7 contains
=ARMA_SSE(F6:F110,P6,P7,0,,1) and cell Q8 contains
=ARMA_SSE(F6:F110,P6,P7,K7).
Cell R6 contains the formula =K17*(LN(Q6)-LN(K9)) and cell S6 contains
the formula =CHISQ.DIST.RT(R6,1), and similarly for cells R7, S7, R8 and
S8. The p-values indicate that all these coefficients result in a significant
difference in LL values.
The test of the theoretical z-mean value of -0.2277 (cell P12) is the usual t-
test. The p-value of 0.084497 indicates that the theoretical z-mean is not
significantly different from zero.
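A minimal Python sketch of this chi-square comparison (assuming scipy; sse_reduced and sse_full are the SSE values of the reduced and complete models):

import math
from scipy.stats import chi2

def coeff_significance(n, sse_reduced, sse_full):
    stat = n * (math.log(sse_reduced) - math.log(sse_full))  # ~ chi2(1)
    return stat, chi2.sf(stat, df=1)                          # test stat and p-value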
Forecasting using an ARMA model
We now show how to create forecasts for a time series modelled by an
ARMA(p,q) process.

Example 1: Create a forecast for times 106 through 110 based on the
ARMA(1,1) model created in Example 1 of Calculating ARMA Coefficients
using Solver.
The result is shown in Figure 1, where we have omitted the data for times 5
through 102 to save space.

Figure 1 – Forecast for ARMA(1,1) process


Columns V, W and X are just copies of columns E, F and G from Figure
1 of Calculating ARMA Coefficients using Solver. The predicted values in Y
for the observed data in the time series (range Y8:Y112) are simply the data element minus the residual; e.g. cell Y8 contains the formula =W8-X8.
The predicted (or forecasted) value at time 106 (cell Y113) is based on the equation that defines the ARMA(1,1) process, namely
yi = φ1yi-1 + εi + θ1εi-1
Thus, the forecast value at time i = 106 is
ŷ106 = φ1y105 + θ1ε105
Note that since we don’t have an observed value for ε106, we use the theoretical
mean value, namely zero. The forecasted value at time i = 106 is calculated
in Figure 1 using the formula
=SUMPRODUCT(W112,J$8)+SUMPRODUCT(X112,K$8). For this model,
this formula can be simplified to =W112*J8+X112*K8, but the longer
formula will come in handy when we create forecasts using ARMA(p, q)
where p and/or q is larger than 1.
The forecast at time i = 107 is calculated by
ŷ107 = φ1ŷ106 + θ1ε̂106
This time, there are no observed values for ε106, ε107, or y106. As before, we
estimate ε106 and ε107 by zero, but we estimate y106 by the forecasted value ŷ106.
This is accomplished in Excel using the formula =SUMPRODUCT(Y113,J8).
The forecasted values at times 108, 109 and 110 are calculated in a similar
manner.
We next look at the standard error for the forecast values. For the observed times, the standard error is σ, which can be estimated by s = √(SSE/n) = √(110.19884/104) = 1.029371, where SSE = 110.19884 (see Figure 2 of Calculating ARMA Coefficients using Solver).
The standard error of the mth forecasted value (after the n observed values) is given by the formula
s.e.(ŷn+m) = σ√(ψ02 + ψ12 + ⋅⋅⋅ + ψm-12)
where
yi = μ + ψ0εi + ψ1εi-1 + ψ2εi-2 + ⋅⋅⋅
is the MA(∞) representation of the ARMA process. As before, we estimate σ by s = 1.029371.
The psi coefficients for the MA(∞) representation are as shown in Figure 2.

Figure 2 – Psi coefficients for MA(∞) representation


Here the formula =PSICoeff(J8,K8) is inserted in range N18:N22.
Thus, for example, the standard error for ŷ108 is
s.e. = s√(ψ02 + ψ12 + ψ22)
This value is calculated (cell Z115) by the formula =K$18*SQRT(SUMSQ(N$18:N20)).
The 95% confidence interval for ŷ108 is therefore
ŷ108 ± zcrit ∙ s.e. where zcrit = NORM.S.INV(.975)
which is (-1.79521, 2.472136), as shown on row 115 of Figure 1. For example, the formula in cell AB115 is =Y115+NORMSINV(1-Z$6/2)*Z115.
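The whole recipe, recursive point forecasts, ψ-based standard errors and normal confidence limits, can be written compactly outside Excel. A minimal Python sketch for the ARMA(1,1) case, assuming arrays y and eps holding the observed data and residuals, and estimates phi1, theta1 and s as above (all names illustrative):

import numpy as np
from scipy.stats import norm

def arma11_forecast(y, eps, phi1, theta1, s, m, alpha=0.05):
    psi = [1.0, phi1 + theta1]                   # MA(inf) coefficients psi_0, psi_1
    while len(psi) < m:
        psi.append(phi1 * psi[-1])               # psi_j = phi1 * psi_(j-1)
    preds, se = [], []
    y_prev, e_prev = y[-1], eps[-1]
    for j in range(m):
        pred = phi1 * y_prev + theta1 * e_prev   # future errors estimated by zero
        preds.append(pred)
        se.append(s * np.sqrt(np.sum(np.array(psi[: j + 1]) ** 2)))
        y_prev, e_prev = pred, 0.0
    zcrit = norm.ppf(1 - alpha / 2)
    lower = [p - zcrit * e for p, e in zip(preds, se)]
    upper = [p + zcrit * e for p, e in zip(preds, se)]
    return preds, se, lower, upper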

Example 2: Create a forecast for times 106 through 110 based on the
ARMA(1,1) model created in Example 2 of Calculating ARMA Coefficients
using Solver.
The process is identical to that shown in Example 1. The only difference is
that this time there is a constant term in the ARMA(1,1) model. The result is
shown in Figure 3.

Figure 3 – Forecast for ARMA(1,1) process with non-zero mean


As we discussed in Evaluating the ARMA Model, the left side of Figure 3
contains the forecast not for the original yi data, but for the zi data
where zi = yi – µ, where the estimate for µ is 10.147687 (cell K7 in Figure 4
of Calculating ARMA Coefficients using Solver). To create a forecast for the
yi time series, we need to add 10.147687 to the forecasted zi values. This is
done in column AE. E.g., cell AE115 contains the formula =Y115+K7, while cell AE6 contains the formula =W6+K7.
Real Statistics ARMA Tool
We now show how to create an ARMA model of a time series using the
ARIMA Real Statistics data analysis tool and to use this model to create a forecast.
The data analysis tool uses the Real Statistics ARIMA_Coeff function (described in
detail below) to calculate the ARMA coefficients along with their standard errors.
This function uses the Levenberg-Marquardt algorithm instead of Solver, resulting
in a more accurate model (i.e. one with a lower SSE value).
Real Statistics Data Analysis Tool: The Real Statistics Resource Pack provides
the ARIMA Model and Forecast data analysis tool which tries to fit an ARMA(p,
q) process to time series data. This tool can also be used to analyze an ARIMA
process as demonstrated in ARIMA Model Coefficients.
Example 1: Use the ARIMA Model and Forecast data analysis tool to build an
ARMA(2,1) model for the data in Example 2 of Calculating ARMA Coefficients
using Solver (the first 20 elements in the time series are repeated in Figure 1).
Start by pressing Ctrl-m and choosing the Time Series option. Select the ARIMA Model and Forecast option on the dialog box that appears and click on the OK button. Now, fill in the dialog box that appears as shown in Figure 1.

Figure 1 – ARIMA Model and Forecast dialog box


In Figure 1 we have inserted the time series values in the Input Range field, without
column heading or sequence numbers. We also insert the p value in the AR
order field and the q value in the MA order field. We also check the Constant
included in the model field. Since we want to forecast 5 additional elements, we
insert 5 in the # of Forecasts field. We will describe the other options later.
When we click on the OK button, we see the output shown in Figures 2, 3 and 4.

Figure 2 – ARMA(2,1) model – part 1
In Figure 2, we see that the best fit ARMA(2,1) process is given by
yi = 2.64 + .68yi-1 + .06yi-2 + εi – .41εi-1
The mean value 10.143026 (cell J6) has been subtracted from all the y values in
column B (shown in Figure 1) to obtain the z values in column E.
Note that the formula in cell F6 is
=E6-SUMPRODUCT(E4:E5,I$4:I$5)-SUMPRODUCT(F4:F5,J$4:J$5)
Similar formulas are used to calculate the other residual values shown in column F.
The formula used to calculate SSE (cell J8) is =SUMSQ(F6:F108). The other cells
are calculated as described in Evaluating the ARMA Model.
Note that AIC = 16.68 (cell J21). This compares with AIC = 13.03 for the
ARMA(1,1) model used to fit the same data as shown in Figure 2 of Evaluating the
ARMA Model. This gives evidence that the ARMA(1,1) model is a better fit for the
data than the ARMA(2,1) model. Similarly, BIC = 29.86 (cell J22) for the
ARMA(2,1) model is greater than BIC = 20.30 for the ARMA(1,1) model shown in

Figure 2 of Evaluating the ARMA Model, giving more evidence that the
ARMA(1,1) is the better, and certainly more parsimonious, fit for the data.

Figure 3 – ARMA(2,1) model – part 2


From Figure 3, we see that the ARMA(2,1) process is both stationary and invertible
(since the absolute values of all the roots are greater than 1).
The coefficient values and their standard errors are calculated by the Real Statistics
array formula =ARIMA_Coeff(B4:B108,2,1,0,TRUE) in range Q4:R7. Cell S4
contains the formula =Q4/R4 and cell S5 contains the formula
=T.DIST.2T(ABS(S4),J$18-J$10-J$11-1). The output in range P3:T7 uses a different approach from that employed in Figure 3 of Evaluating the ARMA Model.
We see that only the constant and phi 1 coefficients are making a significant
contribution to the LL value of the ARMA(2,1) model (since these are the only ones
whose p-value < .05 = α).
The psi coefficients in range M16:M20 are used to create the forecast values and are
calculated using the array formula =PSICoeff(I4:I5,J4:J5). Five psi coefficient
values are produced since we requested 5 forecast values.
More details about the values in Figure 3 are given in Evaluating the ARMA Model.
Finally, we show the forecast values in Figure 4.

Figure 4 – ARMA(2,1) model – part 3
To save space, we have not included the values for times 4 through 101. Note the
following formulas used in Figure 4.
Cell Entry Formula

X109 pred z106 =SUMPRODUCT(V107:V108,I$4:I$5)+SUMPRODUCT(W107:W108,J$4:J$

X110 pred z107 =SUMPRODUCT(V108,I4)+SUMPRODUCT(X109,I5)+SUMPRODUCT(W108

X111 pred z108 =SUMPRODUCT(X109:X110,I4:I5)

X112 pred z109 =SUMPRODUCT(X110:X111,I4:I5)

X113 pred z110 =SUMPRODUCT(X111:X112,I4:I5)

Y113 s.e. =J$15*SQRT(SUMSQ(M$16:M20))

AD113 pred y110 =X113+J$6


Figure 5 – Formulas from Figure 4
We show the time series plus 5 forecasted elements in Figure 6 based on the data in
range AD4:AD113 of Figure 4.

Figure 6 – Time series forecast
See ARMA Tool Options for a description of the following options that are
displayed in Figure 1:
• Make AR(p) agree with OLS
• Include sigma-sq in AIC/BIC
• Reformat for Linear Regression
• Use Solver
Real Statistics Function: The Real Statistics Resource Pack provides the following
array functions. In particular, the first function is used to calculate the ARIMA
coefficients and their standard errors.
ARIMA_Coeff(R1, p, q, d, con, lab) = a p+q+1 × 4 array, each row of which
contains the coefficient, standard error, t-stat and p-value (in order: constant, phi 1,
phi 2, …, theta 1, theta 2, …) of the ARIMA(p,q,d) model for the time series data
in column range R1; if lab = TRUE (default FALSE), then an extra row and
column are appended with labels; if con = TRUE (default) then a constant term is
used, otherwise it is not (i.e. it is set to zero).
Range Q4:R7 of Figure 3 contains the formula
=ARIMA_Coeff(B4:B108,2,1,0,TRUE), where only the first two columns of the
output are used (with no labels). The output from the array formula
=ARIMA_Coeff(B4:B108,2,1,0,TRUE,TRUE) is shown in range P9:T13 of Figure
7.

Figure 7 – Real Statistics ARIMA_Coeff and ARIMA_Stats functions
Note too that there are more options for the con argument than just TRUE and
FALSE. In fact, you can specify a column or row range with up to p+q+1 elements.
Each position in the range specifies the initial guess used for the corresponding
coefficient in the Levenberg-Marquardt algorithm. The initial guess for any
coefficients that are not explicitly specified is .2. Note too that if the element in the
range takes the form “c” followed by a numeric value, then that numeric value will
be fixed and the Levenberg-Marquardt algorithm will not change it.
E.g. for an ARIMA(2,1,0) model, the con range can have up to p+q+1 = 2+1+1 = 4
elements, the first for the constant, the second for phi 1, the third for phi 2 and the
fourth for theta 1. Thus, if cell D1 contains “c1.2” and cell D2 contains .4, then the
formula =ARIMA_Coeff(B4:B108,2,1,0,D1:D2) specifies that the constant term
will be fixed with the value 1.2 and phi 1 will be initialized to .4 (instead of .2)
although its final value will depend on the Levenberg-Marquardt algorithm. The
initial values of phi 2 and theta 1 will be .2 (the default) since these values have not
been specified in range D1:D2.
ARIMA_Stats(R1,R2, p, q, d, con, lab) = 7 × 1 column array containing the
values LL, SSE, MSE, AIC, BIC, AIC augmented and BIC augmented for the
ARIMA(p,q,d) model for the time series data in column range R1 based on the
coefficients in the p+q+1 × 1 column range R2; if lab = TRUE (default FALSE),
then an extra column of labels is appended to the output; if con = TRUE (default)
then a constant term is used, otherwise it is not (unlike ARIMA_Coeff, no other
values are acceptable).
The output from the array formula
=ARIMA_Stats(B4:B108,Q10:Q13,2,1,0,,TRUE) is shown in range P15:Q21 of
Figure 7.

ARMA Tool Options
We now describe the following options that are displayed in Figure 1 of Real
Statistics ARMA Tool.
• Make AR(p) agree with OLS
• Include sigma-sq in AIC/BIC
• Reformat for Linear Regression
• Use Solver
Also, note that the Differences field is described in ARIMA
Differencing and ARIMA Model Coefficients.
Use Solver: If this option is selected then the ARIMA_Coeff function is not used to
estimate the ARIMA coefficients, but instead Solver is used as described
in Calculating ARMA Coefficients using Solver.
Make AR(p) agree with OLS: The usual calculation for the standard error statistic (e.g. cell J15 of Figure 2) is to take the square root of SSE/n. If we are modelling an AR(p) process, then it is common to use an ordinary least squares (OLS) regression, in which case s.e. = √(SSE/(n–p–1)) if there is a constant term and s.e. = √(SSE/(n–p)) if there is no constant term.
If you check this option then the OLS regression approach is used for an AR(p) model; otherwise the value √(SSE/n) is used. This option does not affect the results for an ARMA(p, q) model where q ≠ 0.
Include sigma-sq in AIC/BIC: When calculating AIC and BIC, as shown
in Evaluating the ARMA Model, there is a term k which represents the number of
parameters in the model. The usual approach is to include σ as one of the model
coefficients, and so k = p + q + 2 for an ARMA(p, q) model with a constant term.
For the AR(p) model using OLS regression, however, the σ parameter is not used in
calculating AIC and BIC, and so k = p + 0 + 1 for an ARMA(p, 0) model with a
constant term. Actually, the σ parameter is not used in the Solver approach described
in Calculating ARMA Coefficients using Solver either.
If you check this option and either q > 0 or Make AR(p) agree with OLS is
unchecked, then σ is included as a parameter when calculating AIC and BIC;
otherwise, it is not.
Reformat for Linear Regression: In the case of a time series that is modeled as an
AR(p) process, you can choose to use ordinary linear regression using the Real
Statistics Linear Regression data analysis tool. To do this the data in the time series
must be reformatted as shown in the following example.

If you check this option then the time series data is reformatted so that it can be used as input to the Linear Regression tool; otherwise, no such reformatted data is output. In either case, the usual ARIMA calculations are made.
Example 1: Perform multiple linear regression for the AR(3) model of the time
series in range B4:B23 of Figure 1.
This time we choose the ARIMA Model and Forecast data analysis tool and insert
the range B4:B23 in the Input Range field, insert 3 in the AR order field, 0 in
the MA order field and check the Reformat for Linear Regression option.
When we click on the OK button, we see output similar to that in Figures 2, 3 and 4 of Calculating ARMA Coefficients using Solver along with the output in columns AF through AI of Figure 1.

Figure 1 – Linear regression for time series data


We can now use the Linear Regression data analysis tool using range AF5:AI22 as
input to obtain the linear regression output shown on the right side of Figure 1.
Note that if, when we ran the ARIMA Model and Forecast data analysis tool on the data range B4:B23 with 3 in the AR order field and 0 in the MA order field, we had also checked the Constant included in the model field, along with the Make AR(p) agree with OLS and Include sigma-sq in AIC/BIC fields, then the output would agree with the linear regression model shown in Figure 1.
Real Statistics Function: The Real Statistics Resource Pack provides the following
array function where R1 is an n × 1 column range containing time series data and p is
a positive integer
ARMap(R1, p): outputs an n–p × p+1 range which contains X and Y data
equivalent to the data in R1 in order to perform multiple linear regression
In fact, the ARIMA Model and Forecast data analysis uses this function to produce
the reformatted data range described above. In particular, range AF6:AI22 in Figure
1 contains the array formula =ARMap(B4:B23,3).
Seasonality for Time Series
A time-series yi with no trend has seasonality of period c if E[yi] = E[yi+c].
If we have a stationary time series yi and a deterministic time series si such
that si = si+c for all i (and so si = si+kc for all integers k), then zi = yi + si would be
a seasonal time series with period c. As shown in Regression with
Seasonality, the seasonality of such time series can be modeled by using c–1
dummy variables.
A second way to model seasonality is to assume that si = μm(i) + εi where εi is a
purely random time series and μ0, …, μc-1 are constants where m(i) =
MOD(i,c).
A third approach is to model seasonality as a sort of random walk,
i.e. si = μm(i) + si-c + εi . If μ0 = … = μc-1 = 0 then there is no drift; otherwise μ0,
…, μc-1 capture the seasonal drift.
Of course, seasonality can be modeled in many other ways.
Recall that for the lag function, Lcyi = yi-c, and so (1 – Lc)yi = yi – yi-c. This is the principal way of expressing seasonality for SARIMA models.
Note too that if si is deterministic with period c, then (1 – Lc)si = si – si-c = 0.
SARIMA Models
As described in ARMA Models, an ARMA(p,q) model can be expressed as
yi = φ0 + φ1yi-1 + ⋅⋅⋅ + φpyi-p + εi + θ1εi-1 + ⋅⋅⋅ + θqεi-q
If φ0 = 0 (i.e. the mean of the stochastic process is zero) then this can be expressed using the lag operator as
φ(L)yi = θ(L)εi
where
φ(L) = 1 – φ1L – ⋅⋅⋅ – φpLp     θ(L) = 1 + θ1L + ⋅⋅⋅ + θqLq
Note too that an ARIMA(p, d, q) process can be expressed as above, but first yi must be replaced by the differenced series zi = (1 – L)dyi. Alternatively, we can express an ARIMA(p, d, q) process yi without constant as
φ(L)(1 – L)dyi = θ(L)εi
We can add the constant term back in as
φ(L)(1 – L)dyi = φ0 + θ(L)εi
A seasonal ARIMA model takes the same form, but now there are additional terms that reflect the seasonality part of the model. Specifically, a SARIMA(p,d,q) × (P,D,Q)m model without constant can be expressed as
φ(L)Φ(Lm)(1 – L)d(1 – Lm)Dyi = θ(L)Θ(Lm)εi
where
Φ(Lm) = 1 – Φ1Lm – Φ2L2m – ⋅⋅⋅ – ΦPLPm     Θ(Lm) = 1 + Θ1Lm + Θ2L2m + ⋅⋅⋅ + ΘQLQm
Here, we have P seasonal autoregressive terms (with coefficients Φ1, …, ΦP), Q seasonal moving average terms (with coefficients Θ1, …, ΘQ) and D seasonal differencing based on m seasonal periods.
Let’s assume that the two types of differencing (corresponding to d and D) have already been done. Then a SARIMA(1,0,1) × (1,0,1)12 model takes the following form:
(1 – φ1L)(1 – Φ1L12)yi = (1 + θ1L)(1 + Θ1L12)εi
i.e.
yi = φ1yi-1 + Φ1yi-12 – φ1Φ1yi-13 + εi + θ1εi-1 + Θ1εi-12 + θ1Θ1εi-13
The residuals can therefore be expressed as:
εi = yi – φ1yi-1 – Φ1yi-12 + φ1Φ1yi-13 – θ1εi-1 – Θ1εi-12 – θ1Θ1εi-13
The forecast can be expressed (where the coefficients should have a hat on them):
ŷi = φ1yi-1 + Φ1yi-12 – φ1Φ1yi-13 + θ1εi-1 + Θ1εi-12 + θ1Θ1εi-13
Similarly, a SARMA(p,q) × (P,Q)m model without constant can be expressed as
φ(L)Φ(Lm)yi = θ(L)Θ(Lm)εi
or equivalently as
yi – (1 – φ(L)Φ(Lm))yi = θ(L)Θ(Lm)εi
And so
yi = (1 – φ(L)Φ(Lm))yi + θ(L)Θ(Lm)εi
which is equivalent to an equation for yi in terms of lagged y values and the current and lagged errors, since the constant coefficients of both φ(L)Φ(Lm) and θ(L)Θ(Lm) equal 1.
In the case where there is a constant term φ0 this expression takes the form
yi = φ0 + (1 – φ(L)Φ(Lm))yi + θ(L)Θ(Lm)εi
This serves as the equation to estimate the forecast at time i (when the final εi is set to zero). You can also solve for εi to obtain an expression that can be used to estimate the residuals.
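As a concrete illustration, the residual recursion for the SARMA(1,1) × (1,1)m case written out above can be sketched in Python as follows (assuming NumPy; the function name sarma_residuals is illustrative, not the Real Statistics SARMA_RES function):

import numpy as np

def sarma_residuals(y, phi0, phi1, Phi1, theta1, Theta1, m):
    eps = np.zeros(len(y))
    for i in range(m + 1, len(y)):               # need y_(i-1), y_(i-m), y_(i-m-1)
        pred = (phi0 + phi1 * y[i - 1] + Phi1 * y[i - m]
                - phi1 * Phi1 * y[i - m - 1]
                + theta1 * eps[i - 1] + Theta1 * eps[i - m]
                + theta1 * Theta1 * eps[i - m - 1])
        eps[i] = y[i] - pred
    return eps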
SARIMA Forecast Example
In SARIMA Model Example we show how to create a SARIMA model for the
following example, step by step, in Excel.
Example 1: Create a SARIMA(1,1,1) ⨯ (1,1,1)4 model for Amazon’s quarterly
revenues shown in Figure 1 and create a forecast based on this model for the
four quarters starting in Q3 2017.
We now show how to use this model to create a forecast.
The coefficients estimated by this model (shown in AK3:AL7 of Figure 1) can be used to create a forecast based on the following equation:
ŷi = φ0 + φ1yi-1 + Φ1yi-4 – φ1Φ1yi-5 + θ1εi-1 + Θ1εi-4 + θ1Θ1εi-5
Since the last data element is y25, we want to determine the forecasted values of y26, y27, y28 and y29. To do this, we use the above formula using data values of yi-1, yi-4 and yi-5 when available, and (previously obtained) forecasted values when the real data values are not available. This is shown in Figure 1.

Figure 1 – Forecast for the differenced time series
This figure shows the data values and residuals for the later portion of the time series
(leaving out the middle) plus the forecasted values. E.g. the forecast value in cell
AH29 is calculated by the formula
=AL$3+AL$4*AH28+AL$6*AH25-AL$4*AL$6*AH24+AL$5*AI28
+AL$7*AI25+AL$5*AL$7*AI24
After entering this formula, you can highlight the range AH29:AK32 and press Ctrl-
D to obtain the other three forecast values. Note that the residuals corresponding to
the four forecast values are implicitly set to zero.
Now that we have the forecasted values for the time series shown in column
Q of Figure 3 of SARIMA Model Example, we need to translate these into
forecast values for the original time series (column O in Figure 3 of SARIMA
Model Example). To accomplish this, we need to undo the two types of
differencing.
We start by replicating the bottom of the data in Figure 3 of SARIMA Model
Example (i.e. the part that is not displayed) and then inserting the forecast
that we obtained in Figure 1. This is shown in Figure 2.

Figure 2 – Forecast (step 1)
We only need to go back in the original time series far enough to produce at least
one value not forecasted in column AQ. Whereas differencing proceeds from
left to right, integrating (i.e. undoing differencing) proceeds from right to
left. If we know the values in cells AP5 and AQ9, we can obtain the value in
cell AP9 using the formula =AP5+AQ9. Similarly, if we know the value in
cell AO8, then we can calculate the value in cell AO9 using the formula
=AO8+AP9 (where the value in AP9 was calculated previously).
In a similar way, we can obtain the value in cell AP10, using the formula
=AP6+AQ10 and the value in cell AO10 using the formula =AO9+AP10. We
highlight the range AO10:AP13 and press Ctrl-D to obtain the other three
forecast values, as shown in Figure 3.
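The same right-to-left integration can be written in a few lines of Python. This is a sketch for the d = 1, D = 1 case only, and the function name integrate_forecast is made up.

import numpy as np

def integrate_forecast(y, z_fc, m):
    # y: original series; z_fc: forecasts of the doubly differenced series.
    # Undoes one ordinary and one seasonal difference (period m) and returns
    # the forecasts on the original scale.
    y = np.asarray(y, dtype=float)
    w = list(y[m:] - y[:-m])        # seasonally differenced series w_i = y_i - y_{i-m}
    for z in z_fc:                  # undo the ordinary difference
        w.append(w[-1] + z)
    y_ext = list(y)
    for w_fc in w[len(y) - m:]:     # undo the seasonal difference
        y_ext.append(y_ext[-m] + w_fc)
    return np.array(y_ext[len(y):])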

Figure 3 – Forecast (step 2)

We can now extend the plot shown in Figure 2 of SARIMA Model Example to
include the forecasted values, as shown in Figure 4.

Figure 4 – Revenue Forecast


Real Statistics Support for SARIMA
We now show how to simplify the process of creating a SARIMA model by
using Real Statistics capabilities.
Real Statistics Functions: The Real Statistics Resource Pack supplies the
following array functions:
ADIFF(R1, d, D, per): returns a column array that corresponds to the time
series data in R1 after ordinary differencing d times and seasonal
differencing D times based on a seasonal period of per.
SARMA_RES(R1, Rar, Rma, Rsa, Rsm, per, cons): returns a column array
with the residuals that correspond to the time series data in the column
array R1 based on a SARMA model with AR coefficients in Rar, MA
coefficients in Rma, seasonal AR coefficients in Rsa, seasonal MA
coefficients in Rsm, the constant coefficient in cons and the seasonal
period per.
SARMA_PRED(R1, Rar, Rma, Rsa, Rsm, per, cons, f): returns a column
array with the predicted values that correspond to the time series data in
the column array R1 plus the next f forecast values based on a SARMA
model with AR coefficients in Rar, MA coefficients in Rma, seasonal AR
coefficients in Rsa, seasonal MA coefficients in Rsm, the constant
coefficient in cons and the seasonal period per. If f is omitted then the
highlighted (output) range is filled with forecasted values (i.e. f is set equal
to the number of rows in the highlighted range minus the number of rows in R1).
SARIMA_PRED(R0, R1, d, D, per): returns a column array with the
forecasted values for the SARIMA(p, d, q) ⨯(P, D, Q)per model of the time
series data in R1 that correspond to the forecast values in R0 for the
SARMA(p, q) ⨯(P, Q)per model.
All the above arrays are column arrays. Any of the Rar, Rma, Rsa, Rsm arrays
may be omitted, although at least one of them must be supplied. per defaults
to 12 and cons defaults to 0 (i.e. no constant).
Note that the ADIFF function is an extension to the version described
in ARIMA Differencing.
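For intuition, the differencing performed by ADIFF can be sketched in Python as follows (the name adiff is just a label for this illustration, not the Real Statistics source code):

import numpy as np

def adiff(y, d, D, per):
    # Apply d ordinary differences, then D seasonal differences (period per).
    z = np.asarray(y, dtype=float)
    for _ in range(d):
        z = z[1:] - z[:-1]       # ordinary difference
    for _ in range(D):
        z = z[per:] - z[:-per]   # seasonal difference
    return z

For instance, adiff(revenues, 1, 1, 4) corresponds to =ADIFF(B4:B33,1,1,4).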
Observation: Example 1 shows how to create a SARIMA(1, 1, 1) ⨯ (1, 1,
1)4 model and forecast. The above functions make it easier to create any
SARIMA model and forecast. To illustrate this, for Example 1 of SARIMA
Model Example, the following formulas could have been used:
=ADIFF(B4:B33,1,1,4) to create the time series in range AH4:AH28 of
Figure 4 of SARIMA Model Example
=SARMA_RES(AH4:AH28,AL4,AL5,AL6,AL7,4,AL3) to create the array of
residuals in range AI4:AI28 of Figure 4 of SARIMA Model Example
=SARMA_PRED(AH4:AH28,AL4,AL5,AL6,AL7,4,AL3) to create an array
of predicted values that take the values in the array AH4:AH32 – AI4:AI32
of Figure 4 of SARIMA Model Example. In particular, the last 4 of these
values are those found in range AH29:AH32.
=SARIMA_PRED(AH29:AH32,B4:B33,1,1,4) to create the array of forecast
values shown in range AO10:AO13 of Figure 3 of SARIMA Forecast
Example.
Real Statistics Data Analysis Tool: The Real Statistics Resource Pack
provides the Seasonal Arima (Sarima) data analysis tool which creates a
SARIMA model and forecast.
To perform the analysis for Example 1 of SARIMA Model Example,
press Ctrl-m and choose Seasonal Arima (Sarima) from the Time
S tab (or from the Time Series dialog box if using the original user
interface). Now fill in the dialog box that appears as shown in Figure 1.

Figure 1 – SARIMA dialog box
If you leave the # of Forecasts field blank, then its value defaults to the
value in the Seasonal Period field. If that field is blank then no seasonality
is used in the model and # of Forecasts defaults to 5.
After clicking on the OK button, the output shown in Figures 2 and 3 is
displayed (only the first 24 rows of the output in Figure 2 and the first 20
rows in Figure 3 are displayed).

Figure 2 – SARIMA output (part 1)

Figure 3 – SARIMA output (part 2)
Most of the values are produced using the Real Statistics functions described
above. The formulas used for the descriptive statistics in range J13:J24 and
coefficient roots in columns P, Q and R are similar to those used for the
corresponding values in the Arima data analysis tool.
The lower portion of the output, which contains the forecast, is shown in
Figure 4. The values in columns D, E, F and G are the continuation of these
columns from Figure 2 and the values in columns T and U are the
continuation of these columns from Figure 3.

Figure 4 – SARIMA forecast output
Range G29:G32 contains the four-quarter forecast for the differenced time
series, while range U34:U37 contains the corresponding four-quarter
forecast for the revenues for the period Q3 2017 through Q2 2018.
Observation: For this example, we chose to use the Solver approach to
estimating the SARIMA coefficients. The default is to use the Levenberg-
Marquardt approach. This is accomplished by leaving the Solver option
unchecked in Figure 1. In this case, the output is similar to that described
above, except that now the output in Figure 5 is included, which is useful in
that it provides the standard errors of the coefficients and the t-tests that
determine which coefficients are significantly different from zero.

Figure 5 – SARIMA coefficients


The output in range H27:L32 of Figure 5 is produced by the array formula
=SARIMA_PARAM(A1:A30,I4,I5,I6,J7,J4,J5,J6,J8)
Real Statistics Functions: The Real Statistics Resource Pack supplies the
following array functions:
SARIMA_COEFF(R1, ar, ma, diff, per, sar, sma, sdiff, con, lab): returns
an array with two columns, the first column of which contains the SARIMA
coefficients (in the order constant term, phi coefficients, theta coefficients,
Phi coefficients, Theta coefficients) and the second column contains the
corresponding standard errors. If lab = TRUE (default FALSE) then a
column of labels is appended to the output.
SARIMA_PARAM(R1, ar, ma, diff, per, sar, sma, sdiff, con): returns an
array with four columns, the first column of which contains the SARIMA
coefficients (in the order constant term, phi coefficients, theta coefficients,
Phi coefficients, Theta coefficients) and the remaining columns contain the
corresponding standard errors, t statistics and p-values.
Here, the parameters are ar = p, ma = q, diff = d, per = m, sar = P, sma = Q,
sdiff = D for a (p, d, q) × (P, D, Q)m SARIMA model. con = TRUE (default) if
a constant term is included.
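If you want to sanity-check such a model outside Excel, the statsmodels Python library fits the same class of models. The series below is simulated stand-in data, and the coefficient estimates may differ somewhat from the Solver or Levenberg-Marquardt results because the underlying estimation algorithms differ.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
# simulated stand-in for a trending quarterly series
y = np.cumsum(rng.normal(size=120)) + 10 * np.tile([0, 1, 2, 1], 30)

# SARIMA(1,1,1) x (1,1,1)_4 with a constant, as in Example 1
model = sm.tsa.SARIMAX(y, order=(1, 1, 1), seasonal_order=(1, 1, 1, 4), trend='c')
result = model.fit(disp=False)
print(result.summary())          # coefficients, standard errors, significance tests
print(result.forecast(steps=4))  # four-quarter forecast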

Miscellaneous Time Series Topics


Topics:
• Mann-Kendall Test
• Sen’s Slope
• Cox-Stuart Test
• Granger Causality
• Cointegration (Engle-Granger Test)
• Cross Correlations
• ARIMAX Model and Forecast


Calculate ARMA(p,q) coefficients using maximum likelihood
We will assume an ARMA(p, q) process with zero mean
yi = φ1yi−1 + ⋯ + φpyi−p + εi + θ1εi−1 + ⋯ + θqεi−q
We will further assume that the random column vector Y = [y1 y2 ··· yn]T is normally
distributed with pdf f(Y; β, σ2) where β = [φ1 ··· φp θ1 ··· θq]T. For any time series y1,
y2, …, yn the likelihood function is
L(β, σ²) = (2π)^(−n/2) (det Γn)^(−1/2) exp(−½ YᵀΓn⁻¹Y)
where Γn is the autocovariance matrix. As usual, we treat y1, y2, …, yn as fixed and
seek estimates for β and σ² that maximize L, or equivalently the log of L, namely

LL = −(n/2) ln(2π) − ½ ln(det Γn) − ½ YᵀΓn⁻¹Y

This produces the maximum likelihood estimates (MLE) B, s² for the parameters β,
σ². Equivalently, our goal is to minimize

−2LL = n ln(2π) + ln(det Γn) + YᵀΓn⁻¹Y
Property 1: For large enough values of n, B − β is multivariate normally
distributed with mean 0 and covariance matrix equal to σ² times the inverse of
the information matrix.
Observation: From Property 1, we can conclude that for large enough n, the
following holds, where the parameter with a hat is the maximum likelihood estimate
of the corresponding parameter.
AR(1)

φ̂1 ∼ N(φ1, (1 − φ1²)/n)
AR(2)

(φ̂1, φ̂2) ∼ N((φ1, φ2), V/n) where V = [[1 − φ2², −φ1(1 + φ2)], [−φ1(1 + φ2), 1 − φ2²]]
MA(1)

θ̂1 ∼ N(θ1, (1 − θ1²)/n)
MA(2)

(θ̂1, θ̂2) ∼ N((θ1, θ2), V/n) where V = [[1 − θ2², θ1(1 − θ2)], [θ1(1 − θ2), 1 − θ2²]]
ARMA(1,1)

(φ̂1, θ̂1) ∼ N((φ1, θ1), V/n) where V = (1 + φ1θ1)/(φ1 + θ1)² × [[(1 − φ1²)(1 + φ1θ1), −(1 − φ1²)(1 − θ1²)], [−(1 − φ1²)(1 − θ1²), (1 − θ1²)(1 + φ1θ1)]]
It turns out that −2LL can be expressed as

−2LL = n ln(2πσ²) + (1/σ²) ∑ εi²

(the conditional form, which drops the determinant term). Thus to minimize −2LL,
we need to minimize the sum of squared residuals

∑ εi²
We show how to do this in Calculating ARMA Coefficients using Solver.
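As a rough Python illustration of the same idea, the sketch below minimizes the conditional sum of squared residuals for a zero-mean ARMA(1,1); it ignores the determinant term of the exact likelihood, and all names (arma11_sse and so on) are made up.

import numpy as np
from scipy.optimize import minimize

def arma11_sse(params, y):
    # Conditional sum of squared residuals for a zero-mean ARMA(1,1),
    # with pre-sample values treated as zero.
    phi1, theta1 = params
    eps = np.zeros(len(y))
    for i in range(len(y)):
        y_lag = y[i - 1] if i >= 1 else 0.0
        e_lag = eps[i - 1] if i >= 1 else 0.0
        eps[i] = y[i] - phi1 * y_lag - theta1 * e_lag
    return np.sum(eps ** 2)

# fit to a simulated ARMA(1,1) series with phi1 = 0.6, theta1 = 0.3
rng = np.random.default_rng(0)
e = rng.normal(size=500)
y = np.zeros(500)
for i in range(1, 500):
    y[i] = 0.6 * y[i - 1] + e[i] + 0.3 * e[i - 1]

result = minimize(arma11_sse, x0=[0.1, 0.1], args=(y,), method='Nelder-Mead')
phi_hat, theta_hat = result.x
s2_hat = arma11_sse(result.x, y) / len(y)   # estimate of the error variance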


Correlogram
Real Statistics Data Analysis Tool: The Real Statistics Resource Pack
provides the Correlogram data analysis tool which outputs an ACF or
PACF correlogram that includes the confidence intervals.
ACF Correlogram
Example 1: Construct an ACF Correlogram for the data in column A of
Figure 1 (only the first 18 of 56 data elements are visible).
Figure 1 – ACF Correlogram
Press Ctrl-m and choose the Time Series option (or the Time S tab if
using the Multipage interface). Select the Correlogram option and click on
the OK button. Now, fill in the dialog box that appears as shown in Figure 2.
Since the # of Lags field was left blank, the default of 30 was used.

Figure 2 – Correlogram dialog box


After clicking on the OK button, the output shown on the right side of Figure
1 appears. Note that the alpha value in cell F3 is automatically set to .05. You
can change this to any value you like between 0 and 0.5 and all the cells as
well as the chart will change to reflect your choice for alpha.

Note that cell D7 contains the formula =ACF($A$4:$A$59,C7), cell E7
contains the formula =-F7 and cell F7 contains the formula
=NORM.S.INV(1-$F$3/2)/SQRT(COUNT($A$4:$A$59))
The remaining values in columns D and E (until row 36, corresponding to
lag 30) are calculated using similar formulas. Cell F8 contains the formula
=NORM.S.INV(1-$F$3/2)*SQRT((1+2*SUMSQ(D$7:D7))/COUNT($A$4:$A$59))
and similarly, for the other cells in column F. This reflects the fact that the
standard error and confidence interval of ACF(k) are

s.e.(ACF(k)) = √((1 + 2(ACF(1)² + ⋯ + ACF(k−1)²))/n)   and   0 ± zcrit · s.e.(ACF(k))

where zcrit = NORM.S.INV(1 − α/2).
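The same calculation can be sketched in Python; the function names are made up and the ACF convention mirrors the one used above.

import numpy as np
from scipy.stats import norm

def acf(y, k):
    # lag-k sample autocorrelation
    y = np.asarray(y, dtype=float)
    ybar = y.mean()
    return np.sum((y[k:] - ybar) * (y[:-k] - ybar)) / np.sum((y - ybar) ** 2)

def acf_bands(y, nlags, alpha=0.05):
    # ACF values with Bartlett standard errors and confidence limits
    n = len(y)
    z = norm.ppf(1 - alpha / 2)
    r = np.array([acf(y, k) for k in range(1, nlags + 1)])
    cum = np.concatenate(([0.0], np.cumsum(r[:-1] ** 2)))   # sum of ACF(j)^2 for j < k
    se = np.sqrt((1 + 2 * cum) / n)
    return r, z * se   # correlations and half-widths of the confidence bands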
PACF Correlogram
Example 2: Construct a PACF Correlogram for the data in column A of
Figure 1.
This time the PACF option from the dialog box in Figure 2 is selected. The
output is shown in Figure 3 (only the first 15 of the 30 lags are shown).

Figure 3 – PACF Correlogram


Cell Q7 contains the formula =PACF($A$4:$A$59,P7), cell R7 contains =-S7
and cell S7 contains =NORM.S.INV(1-$S$3/2)/SQRT(COUNT(A4:A59)).

This reflects the fact that the standard error and confidence interval of
PACF(k) are

s.e.(PACF(k)) = 1/√n   and   0 ± zcrit/√n

where zcrit = NORM.S.INV(1 − α/2).
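As a cross-check, the statsmodels plotting helpers draw ACF and PACF correlograms with the same style of confidence bands. The random series below is only a stand-in for the column A data.

import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

y = np.random.default_rng(1).normal(size=56)   # stand-in for the 56-element series

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(y, lags=30, alpha=0.05, ax=ax1)    # Bartlett-style bands, as above
plot_pacf(y, lags=15, alpha=0.05, ax=ax2)   # PACF lags must be less than n/2
plt.show()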
Handling Missing Time Series Data
When data is missing in a time series, we can use some form of imputation
or interpolation to impute a missing value. In particular, we consider the
approaches described in Figure 1.
Numeric label   Text label   Imputation type
0               linear       linear interpolation
1               spline       spline interpolation
2               prior        use prior value
3               next         use next value
-1              sma          simple moving average
-2              wma          weighted moving average
-3              ema          exponential moving average
Figure 1 – Imputation Approaches
Example
Example 1: Apply each of these approaches for the time series with missing
entries in column E of Figure 2. The full time series is shown in column B.

Figure 2 – Imputation Examples
Linear interpolation
If yi is missing and ya and yb are the nearest non-missing values before and
after it (a < i < b), linear interpolation imputes

yi = ya + (i − a)(yb − ya)/(b − a)

(a missing value at the very start or end of the series is simply given the
nearest non-missing value). The missing values in cells E10, E15 and E18 are
imputed in this way, as shown in cells G10, G15 and G18.
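In Python, pandas performs the same linear interpolation (the data below is made up for illustration):

import numpy as np
import pandas as pd

s = pd.Series([12, 15, np.nan, 23, 20, np.nan, 31, np.nan])  # made-up series
linear = s.interpolate(method='linear', limit_direction='both')

The limit_direction='both' option handles missing values at the start or end of the series by extending the nearest non-missing value.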
Spline interpolation
To create the spline interpolation for the four missing values, first, create the
table in range O3:P14 by removing all the missing values. This can be done
by placing the array formula =DELROWBLANK(D3:E18,TRUE) in range
O3:P14, as shown in Figure 3. Next place the array formula
=SPLINE(R4:R18,O4:O14,P4:P14) in range S4:S18 (or in range H4:H18 of
Figure 2).

Figure 3 – Spline interpolation
The chart of the spline curve is shown on the right side of Figure 3. The
imputed values are shown in red on the chart.
See Spline Fitting and Interpolation for additional information.
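A comparable spline imputation can be sketched with scipy: fit a cubic spline through the non-missing points and evaluate it at the missing times (all values below are made up).

import numpy as np
from scipy.interpolate import CubicSpline

t_known = np.array([1, 2, 4, 5, 7, 8])             # times with non-missing values
y_known = np.array([12., 15., 23., 20., 31., 28.])
cs = CubicSpline(t_known, y_known)
imputed = cs(np.array([3, 6]))                     # spline values at the missing times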
Prior/Next
For Next the next non-missing value is imputed (or the last non-missing
value if there is no next non-missing value), while for Prior the previous
non-missing value is imputed (or the first non-missing value if there is no
previous non-missing value).
The missing value in cell E9 is imputed as 23 (cell J9) when using Next and
12 (cell I9) when using Prior. The missing value in cell E18 is imputed as 75
(cell I18 or J18) when using Prior or Next.
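In pandas these two rules are forward fill and backward fill, with a fallback at the boundary (made-up data again):

import numpy as np
import pandas as pd

s = pd.Series([np.nan, 12, np.nan, 23, np.nan, 75, np.nan])
prior = s.ffill().bfill()   # Prior: previous value, else the first non-missing
nxt = s.bfill().ffill()     # Next: next value, else the last non-missing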
Simple Moving Average
The imputed value depends on the span value k which is a positive integer.
To impute the missing values, we first use linear interpolation, as shown in
column AE of Figure 4. For any missing values in the first or last k elements
in the time series, we simply use the linear interpolation value. For the
others, we use the mean of the 2k+1 linearly interpolated values centered at
the missing value (i.e. the k values on either side plus the value itself).
In Figure 2 we use a span value of k = 3. To show how the values in column
K of Figure 2 are calculated, we calculate the linear interpolated values as
shown in column AE of Figure 4. Next, we place the formula
=IF(AD4="",AE4,AD4) in cell AF4, highlight range AF4:AF6 (i.e. a column
range with k = 3 elements) and press Ctrl-D. Similarly, we copy the formula
in cell AF4 into the last 3 cells in column AF.
Next, we place the formula =IF(AD7="",AVERAGE(AE4:AE10),AD7) in cell
AF7, highlight the range AF7:AF15 (i.e. all the cells in column AF that haven't
yet been filled in), and press Ctrl-D. The imputation shown in column K is
identical to that shown in column AF.

Figure 4 – Moving Average Imputations


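A minimal Python sketch of this simple moving average rule, using the linear interpolation fallback described above (the function name sma_impute is made up):

import pandas as pd

def sma_impute(y, k):
    # Impute by the mean of the 2k+1 linearly interpolated values centered
    # at each interior missing position; near the ends, keep the linearly
    # interpolated value.
    s = pd.Series(y, dtype=float)
    lin = s.interpolate(method='linear', limit_direction='both')
    out = lin.copy()
    for i in range(k, len(s) - k):
        if pd.isna(s.iloc[i]):
            out.iloc[i] = lin.iloc[i - k:i + k + 1].mean()
    return out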
Weighted Moving Average
The approach is similar to the simple moving average approach, except that
now weights are used. For example, the first missing time series element
occurs at time t = 6. Thus, we weight the linear imputed values in column AE
of Figure 4 by 1 for t = 6, by 1/2 for t = 5 or 7, by 1/3 for t = 4 or 8, and by 1/4
for t = 3 or 9. The calculation of the imputed value at t = 6 is shown in Figure
5.

Figure 5 – WMA for t = 6
Here, cell AK4 contains the formula =1/(ABS(AJ$7-AJ4)+1), cell AL4
contains =AE6 and cell AM4 contains =AK4*AL4. We can now highlight the
range AK4:AM10 and press Ctrl-D to fill in the other values. Then we sum the
weights to obtain the value 3.16667 as shown in cell AK11 and sum the
products to obtain the value 60.66667 as shown in cell AM11. The imputed
value is thus 60.66667 divided by 3.16667, i.e. 19.15789 as shown in cell
AM12.
This is the value shown in cell AG9 of Figure 4. In fact, we can fill in column
AG of Figure 4 as follows. First, insert the worksheet formula
=1+2*SUMPRODUCT(1/AC5:AC7) in cell AG20. Next, fill in the first 3 and
last 3 values in column AG by using the values in column AE. Finally, insert
the following formula in cell AG7, highlight range AG7:AG15, and press Ctrl-
D.
=IF(AD7="",SUMPRODUCT(AE4:AE10,1/(ABS(AC4:AC10-AC7)+1))/AG$20,AD7)
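The corresponding Python sketch uses 1/(distance + 1) weights; as before, the name wma_impute is made up and this is an illustration rather than the Real Statistics code.

import numpy as np
import pandas as pd

def wma_impute(y, k):
    # Weighted moving average imputation: the missing position gets weight 1,
    # its neighbors 1/2, then 1/3, etc., applied to the interpolated values.
    s = pd.Series(y, dtype=float)
    lin = s.interpolate(method='linear', limit_direction='both')
    out = lin.copy()
    w = 1 / (np.abs(np.arange(-k, k + 1)) + 1)
    for i in range(k, len(s) - k):
        if pd.isna(s.iloc[i]):
            window = lin.iloc[i - k:i + k + 1].to_numpy()
            out.iloc[i] = np.sum(w * window) / np.sum(w)
    return out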
Exponential (Weighted) Moving Average
The approach is identical to that of the weighted moving average except that
we use weights that are a power of 2. Now, we weight the linear imputed
values in column AE of Figure 4 by 1 for t = 6, by 1/2 for t = 5 or 7, by 1/4
for t = 4 or 8, and by 1/8 for t = 3 or 9.
The calculation of the imputed value at t = 6 is as shown in Figure 5, except
that the formula used for cell AK4 is now =1/2^ABS(AJ$7-AJ4) (and
similarly for the other cells in column AK). The result is as shown in column
AH of Figure 4. This time, cell AH7 contains the formula
=IF(AD7="",SUMPRODUCT(AE4:AE10,1/2^ABS(AC4:AC10-AC7))/AH$20,AD7)
and cell AH20 contains the formula =1+2*SUMPRODUCT(1/2^AC4:AC6).
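Relative to the weighted moving average sketch above, only the weight function changes; under the same assumptions:

import numpy as np
import pandas as pd

def ema_impute(y, k):
    # Exponential weights: 1, 1/2, 1/4, ... by distance from the missing value.
    s = pd.Series(y, dtype=float)
    lin = s.interpolate(method='linear', limit_direction='both')
    out = lin.copy()
    w = 1 / 2 ** np.abs(np.arange(-k, k + 1))
    for i in range(k, len(s) - k):
        if pd.isna(s.iloc[i]):
            out.iloc[i] = np.sum(w * lin.iloc[i - k:i + k + 1].to_numpy()) / np.sum(w)
    return out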
Worksheet Function
Real Statistics Function: For a time series represented as a column array
where any non-numeric values are treated as missing, the Real Statistics
Resource Pack supplies the following array function:
TSImputed(R1, itype, k): returns a column array of the same size as R1
where each missing element in R1 is imputed based on the imputation
type itype, which is either a number or text string as shown in Figure 1
(default 0 or "linear"), and k = the span (default 2), which is only used with
the three moving average imputation types.
For example, =TSImputed(E4:E18,"ema",3) returns the time series shown in
range M4:M18 of Figure 2.
Seasonality
If the time series has a seasonal component, then we can combine one of the
imputation approaches described in Figure 1 with a seasonality imputation
approach as described in Handling Missing Seasonal Time Series Data.
Handling Missing Seasonal Time Series Data
Seasonality
If a time series has a seasonal component, then we can combine one of the
imputation approaches described in Figure 1 of Handling Missing Time
Series Data with either deseasonalizing or split seasonal imputation (as
shown in Figure 1) based on the seasonality period (i.e. 4 for quarterly, 12 for
monthly, etc.).
Numeric label   Text label   Seasonality type
0               none         no seasonality
1               seas         deseasonalizing
2               split        split seasonality

Figure 1 – Seasonality Imputation
We now show how to perform imputation for the missing time series
elements in column G of Figure 2.
Split seasonality
This approach is straightforward. If there are say 4 seasons then the time
series is treated as 4 separate time series, one for each season. Imputation
for each of these separated time series is then performed based on the
selected imputation approach from Figure 1 of Handling Missing Time
Series Data.

Example 1: Apply the split seasonality approach to impute the missing
elements for the time series in column G of Figure 2.

Figure 2 – Split seasonality imputation


If we assume that we want to perform weighted moving average imputation
with span 2, then effectively the time series in column G of Figure 2 is split
into the four time series shown on the right side of Figure 2. E.g. the time
series corresponding to Q2 is shown in range L8:L11. The imputed values for
this series, shown in range M8:M11, can be calculated by the array formula
=TSImputed(L8:L11,"wma",2). Putting the four imputed time series back
together in the original order yields the imputed time series shown in column
H.
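A Python sketch of the split approach, which can reuse any of the imputation sketches above through its impute_fn argument (all names are made up):

import numpy as np
import pandas as pd

def split_seasonal_impute(y, per, impute_fn):
    # Treat each of the per seasons as a separate series, impute it,
    # and put the results back in the original order.
    s = pd.Series(y, dtype=float)
    out = s.copy()
    for season in range(per):
        idx = np.arange(season, len(s), per)
        out.iloc[idx] = np.asarray(impute_fn(s.iloc[idx].to_numpy()), dtype=float)
    return out

# e.g. split_seasonal_impute(y, 4, lambda a: wma_impute(a, 2))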
Deseasonalizing
In this approach, the data is deseasonalized using the seasonality period. E.g.
if the seasonality period is per = 4, then the time series y1, y2, …, yn is replaced
by z1, …, zn−4, where zi = yi+4 − yi. If either yi+4 or yi is missing, then zi is
considered missing. Any such missing zi is now imputed as described
previously using the imputation approach specified by itype (as described
in Handling Missing Time Series Data). The original time series yi is then
restored by reseasonalizing using the imputed zi values.
This approach requires that the first per elements of the original time series
are not missing (in order to reseasonalize). If one of these values is missing,
we use the split seasonality approach to impute any of the missing elements
among the first per elements in yi.
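The corresponding Python sketch, assuming (as the text requires) that the first per elements are already non-missing and that missing values are coded as NaN:

import numpy as np

def deseasonalize_impute(y, per, impute_fn):
    # Deseasonalize (z_i = y_{i+per} - y_i), impute the missing z values,
    # then reseasonalize from left to right.
    y = np.asarray(y, dtype=float)
    z = y[per:] - y[:-per]                 # NaN wherever either y value is missing
    z = np.asarray(impute_fn(z), dtype=float)
    out = y.copy()
    for i in range(per, len(out)):
        if np.isnan(out[i]):
            out[i] = out[i - per] + z[i - per]
    return out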

We illustrate this approach for the same time series shown in Figure 2. This
is repeated in column G of Figure 3.

Figure 3 – Deseasonalizing Approach


Explanations
Since one of the first four elements in the time series is missing (cell G5), we
impute this value using the split seasonality method, as shown in column H.
We place the array formula =TSImputed(G4:G19,"wma",2,"split",4) in range
H4:H7 (see below for a description of the TSImputed function) and the array
formula =IF(G8:G19="","",G8:G19) in range H8:H19.
We next insert the formula =IF(H4="","",IF(H8="","",H8-H4)) in cell I8,
highlight the range I8:I19 and press Ctrl-D to obtain the deseasonalized
time series. Now we impute the missing values in this time series by placing
the array formula =TSImputed(I8:I19,"wma",2) in range J8:J19.
Now, we place the array formula =H4:H7 in range K4:K7 and then
reseasonalize by placing the formula =IF(G8="",J8+K4,G8) in cell K8,
highlighting the range K8:K19 and pressing Ctrl-D. The result is the
imputed time series shown in column K.
Worksheet Function
Real Statistics Function: The Real Statistics function TSImputed,
described in Handling Missing Time Series Data, can be expanded to support
seasonal imputation. The function now takes the form:
TSImputed(R1, itype, k, stype, per): returns a column array of the same
size as R1 where each missing element in R1 is imputed based on the
imputation type itype and k, as described in Handling Missing Time Series
Data, plus stype which is either a number or text string as shown in Figure 1
(default 0 or “none”) and per is the seasonal period (default 4 for
quarterly), which is only used when stype is not “none”.
For example, =TSImputed(G4:G19,"wma",2,"split",4) returns the time series
shown in range H4:H19 of Figure 2. =TSImputed(G4:G19,"wma",2,"seas",4)
returns the time series shown in range K4:K19 of Figure 3.
