TSA With R
So far, the predictive modelling techniques we have seen involved predicting a target which depends on some other predictor variables. In this module, however, we'll see how to predict future values of variables which depend on themselves [their own past values].
We are going to focus on univariate time series models only.
Essentially every modelling task is about finding a pattern in the data. Before we go ahead and start finding these patterns and their equations, let's first see what kinds of patterns there can be in a variable which depends on its own past values.
The first kind of pattern is where there is seemingly no pattern. [Although later we'll see that this is the most general challenge to solve in time series analysis.]
Here you can see that the level of each time series seems to change, but there is no apparent trend in the data.
The next kind is where there is some persistent trend throughout the data, although it might change direction at times.
Building on the same, the next type of data can be where there is a persistent, constant cyclic pattern. This can be with or without a linear trend in the data.
We'll first look at a simpler method for modelling these kinds of data, also known as exponential smoothing.
The first kind of pattern that we saw essentially has no pattern; our forecast will simply be a constant value, the level. But that level has been updating itself depending on its past values. To know the current value of the level, we need to evaluate the updation model as well.
Forecast Model : ŷ_{t+1|t} = l_t + ε_t
Updation Model : l_t = α∗y_t + (1 − α)∗l_{t−1}
Here α in the updation model equation is the weightage given to the most recent observation, and the rest [(1 − α)] is given to the previous level value. If you expand this recurrence equation, it looks like this:

l_t = α∗y_t + α(1 − α)∗y_{t−1} + α(1 − α)^2∗y_{t−2} + α(1 − α)^3∗y_{t−3} + …

You can see here that the weightage for past values keeps decreasing exponentially; that is where the name exponential smoothing comes from. You can fit this equation into the forecasting model and get an estimate of α by minimizing the error sum of squares Σ_t ε_t^2.
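To make the recursion concrete, here is a minimal hand-rolled sketch of the level update (the function name ses_level and the sample numbers are ours, not from any package). HoltWinters estimates α and picks starting values differently, so its numbers won't match this toy exactly.

```r
# simple exponential smoothing by hand:
# l_t = alpha * y_t + (1 - alpha) * l_{t-1}
ses_level=function(y,alpha){
  l=numeric(length(y))
  l[1]=y[1]                 # initialise the level at the first observation
  for(t in 2:length(y)){
    l[t]=alpha*y[t]+(1-alpha)*l[t-1]
  }
  l
}

y=c(23,27,25,30,22,26)
l=ses_level(y,alpha=0.2)
l[length(l)]   # the last level is the constant forecast for every future period
```

Because the forecast model has no trend or seasonal term, every horizon gets this same last level as its point forecast.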
Let us see an example of this model in R. We'll look at 100 years of rainfall data starting from the year 1813.
rain=read.csv("rain.csv")
To be able to model this data with time series functions, we first need to convert it to a time series object.
raints=ts(rain,start=c(1813))
plot(raints)
[Plot: the raints series over time — values fluctuate between roughly 20 and 35 with no apparent trend]
You can see that this data doesn't have any linear trend, or seasonality for that matter. We are going to use the function HoltWinters to build this exponential smoothing model. HoltWinters has three parameters: alpha, beta and gamma, which correspond to level, linear trend and seasonality in the data. For this data we'll be setting beta and gamma to FALSE.
rainforecast=HoltWinters(raints,beta=F,gamma=F)
rainforecast
The value of alpha here tells how much weightage is given to the most recent values. The value of the coefficient a at the end is the constant forecast. You can plot the object rainforecast to see how the level gets updated over time.
plot(rainforecast)
[Plot: Holt−Winters filtering — observed vs fitted level for the rain series]
The value of alpha that you see in the model summary is the value estimated from the entire data. You can use your own preconceived value of alpha as well. This will change your forecast equation. Notice that the value of a at the end of the model summary has changed.
r2=HoltWinters(raints,alpha=0.2,beta=F,gamma=F)
r2
plot(r2)
[Plot: Holt−Winters filtering with alpha = 0.2 — observed vs fitted for the rain series]
Now, to get forecasts and confidence intervals, we'll be using the forecast.HoltWinters function from the library forecast. [In recent versions of the forecast package, the generic forecast() function can be called on the fitted object instead.]
library(forecast)
r2forecast=forecast.HoltWinters(rainforecast,h=10)
plot(r2forecast)
Forecast values and confidence band values can be accessed here.
r2forecast
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 1913 24.67819 19.17493 30.18145 16.26169 33.09470
## 1914 24.67819 19.17333 30.18305 16.25924 33.09715
## 1915 24.67819 19.17173 30.18465 16.25679 33.09960
## 1916 24.67819 19.17013 30.18625 16.25434 33.10204
## 1917 24.67819 19.16853 30.18785 16.25190 33.10449
## 1918 24.67819 19.16694 30.18945 16.24945 33.10694
## 1919 24.67819 19.16534 30.19105 16.24701 33.10938
## 1920 24.67819 19.16374 30.19265 16.24456 33.11182
## 1921 24.67819 19.16214 30.19425 16.24212 33.11427
## 1922 24.67819 19.16054 30.19584 16.23968 33.11671
Now, for the model to be adequate, our errors should be independent, and for our confidence bands to be correct, the errors should follow a normal distribution. We can check that by looking at the auto-correlation plot and normal density curves.
acf(r2forecast$residuals)
[Plot: ACF of r2forecast$residuals against lag]
None of the auto-correlation bars go above the blue significance line, which tells us that the errors are independent. Let's look at the normality curves to see whether we can rely on the confidence bands.
library(ggplot2)
d=data.frame(x=as.vector(r2forecast$residuals))
ggplot(d,aes(x))+
  geom_density(color="red")+
  # compare against a normal curve with the residuals' own mean and sd;
  # the default dnorm is standard normal and would never match the scale
  stat_function(fun=dnorm,args=list(mean=mean(d$x),sd=sd(d$x)),color="green")
[Plot: density of the residuals (red) overlaid with a normal curve (green)]
If the residual density clearly departs from the normal curve, you should be wary of the confidence bands, which are computed under that assumption.
Next we look at the model for when the data has a linear trend. The equations for the level model are updated to accommodate the linear trend as well. Here an additional parameter β comes into the picture, which is nothing but the weightage given to the most recent value of the linear trend.
Forecast Model : ŷ_{t+h|t} = l_t + h∗b_t + ε_t
Updation Model : l_t = α∗y_t + (1 − α)∗(l_{t−1} + b_{t−1})
                 b_t = β∗(l_t − l_{t−1}) + (1 − β)∗b_{t−1}
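A quick way to convince yourself that these update equations track a trend: on perfectly linear data, the recursion reproduces the line exactly, whatever α and β you pick. The sketch below is ours (the function name holt and the simple initialisation are illustrative choices, not what HoltWinters does internally).

```r
# Holt's linear-trend recursion by hand (illustrative sketch)
holt=function(y,alpha,beta){
  n=length(y)
  l=numeric(n); b=numeric(n)
  l[2]=y[2]; b[2]=y[2]-y[1]       # a common simple initialisation
  for(t in 3:n){
    l[t]=alpha*y[t]+(1-alpha)*(l[t-1]+b[t-1])
    b[t]=beta*(l[t]-l[t-1])+(1-beta)*b[t-1]
  }
  list(level=l[n],trend=b[n])     # h-step forecast is level + h*trend
}

y=2+3*(1:20)                      # perfectly linear data: y_t = 2 + 3t
f=holt(y,alpha=0.5,beta=0.3)
# on exactly linear data the recursion reproduces the line:
# level = y_20 = 62, trend = 3, so the 1-step-ahead forecast is 65
```

Real data is noisy, of course; α and β then control how quickly the level and trend estimates react to that noise.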
Let's look at an example of the same in R. We'll again be using the function HoltWinters. This time there is both a level and a linear trend component, hence we'll set only the seasonality parameter gamma to FALSE.
skirts=read.csv("skirts.csv")
skirts=ts(skirts,start=c(1866))
plot(skirts)
[Plot: the skirts series over time — values trend from about 600 up past 1000 and back down]
As you can see, this time series has a trend, and a simple “level” forecast will not be enough.
skirtforecast=HoltWinters(skirts,gamma=F)
plot(skirtforecast)
[Plot: Holt−Winters filtering — observed vs fitted for the skirts series]
skirtforecast
##
## Call:
## HoltWinters(x = skirts, gamma = F)
##
## Smoothing parameters:
## alpha: 0.8383481
## beta : 1
## gamma: FALSE
##
## Coefficients:
## [,1]
## a 529.308585
## b 5.690464
Your forecast equation here is nothing but a+b*h, which is: 529.308585+h*5.690464. Keep in mind that the trend has been changing a lot for this data; it would not be wise to use this forecast equation to make forecasts too far into the future.
skirtfuture=forecast.HoltWinters(skirtforecast,h=19)
plot(skirtfuture)
acf(skirtfuture$residuals,lag.max=20)
[Plot: ACF of skirtfuture$residuals up to lag 20]
This indicates that there might be some auto-correlation present. To see whether this correlation is significant or not, we'll do a Box-Ljung test.
Box.test(skirtfuture$residuals,lag=20,type=c("Ljung-Box"))
##
## Box-Ljung test
##
## data: skirtfuture$residuals
## X-squared = 19.731, df = 20, p-value = 0.4749
This tells you that the auto-correlation is not significant. Let's look at the density plots to see if the normality assumption holds.
d=data.frame(x=as.vector(skirtfuture$residuals))
ggplot(d,aes(x))+
  geom_density(color="red")+
  # again, match the normal curve to the residuals' mean and sd
  stat_function(fun=dnorm,args=list(mean=mean(d$x),sd=sd(d$x)),color="green")
[Plot: density of the skirts-model residuals (red) with a normal curve (green)]
If this again indicates that the assumption of normality doesn't hold, we should be wary of the confidence intervals provided.
Next we look at models where all three kinds of patterns are present.
Forecast Model : ŷ_{t+h|t} = l_t + h∗b_t + s_{t−m} + ε_t
Updation Model : l_t = α∗(y_t − s_{t−m}) + (1 − α)∗(l_{t−1} + b_{t−1})
                 b_t = β∗(l_t − l_{t−1}) + (1 − β)∗b_{t−1}
                 s_t = γ∗(y_t − l_{t−1} − b_{t−1}) + (1 − γ)∗s_{t−m}
When you have this kind of data, you have to specify the seasonality with the option frequency while creating the time series object from your data. Since the seasonality is 12 in this case, you also need to mention which month the first observation in the data belongs to in the option start.
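As a small sanity check of how frequency and start behave (the toy series x below is ours, built just to mirror the options used for the souvenir data):

```r
# a toy monthly series starting January 1987, two full years long
x=ts(1:24,frequency=12,start=c(1987,1))
frequency(x)   # 12: the seasonal period
cycle(x)[13]   # 1: observation 13 falls in January again
end(x)         # 1988 12: the series ends in December 1988
```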
souvenir=read.csv("souvenir.csv")
souvenirts = ts(souvenir, frequency=12, start=c(1987,1))
plot(souvenirts)
[Plot: the souvenir series — seasonal spikes whose size grows over time]
Here the trend is not linear; also, the effect of the seasonality is not additive. We can take care of both by log transforming our data.
souvenirts=log(souvenirts)
plot(souvenirts)
[Plot: log of the souvenir series — a roughly linear trend with stable seasonal swings]
This looks alright. Let's model this data with the function HoltWinters. Now that all three components are present, we don't have to set any parameter to FALSE.
souvenirforecast=HoltWinters(souvenirts)
souvenirforecast
plot(souvenirforecast)
[Plot: Holt−Winters filtering — observed vs fitted for the log souvenir series]
The value of beta is 0.00, indicating that the estimate of the slope b of the trend component is not updated
over the time series, and instead is set equal to its initial value.
This makes good intuitive sense, as the level changes quite a bit over the time series, but the slope b of the trend component remains roughly the same. Let's make forecasts now.
souvenirfuture=forecast.HoltWinters(souvenirforecast,h=48)
plot(souvenirfuture)
[Plot: Forecasts from HoltWinters — forecast fan extending the log souvenir series]
acf(souvenirfuture$residuals,lag.max=20)
[Plot: ACF of souvenirfuture$residuals against lag]
No significant auto-correlation; good. Let's look at the normality bit.
shapiro.test(souvenirfuture$residuals)
##
## Shapiro-Wilk normality test
##
## data: souvenirfuture$residuals
## W = 0.98618, p-value = 0.6187
Everything looks alright here too. Next we discuss more general time series models which do not assume that your data depends on all previous values.
AR or autoregressive models of order p assume that the current data value depends on its last p values only. This is how you write an AR(p) model:

y_t = ψ1∗y_{t−1} + ψ2∗y_{t−2} + … + ψp∗y_{t−p} + ε_t
You can write the same equation with the help of the backshift operator as well. The backshift operator shifts the value back by some time period. It is denoted by B.

B^k∗y_t = y_{t−k}

(1 − Bψ1 − B^2 ψ2 − … − B^p ψp)∗y_t = ε_t
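An AR process is easy to play with in R using arima.sim. A sketch, where the true coefficient 0.7 and the sample size are our arbitrary choices: simulate an AR(1) series and check that fitting it recovers something close to the coefficient we put in.

```r
# simulate y_t = 0.7*y_{t-1} + e_t and recover the coefficient
set.seed(1)
y=arima.sim(model=list(ar=0.7),n=500)
fit=arima(y,order=c(1,0,0))
coef(fit)["ar1"]   # should land reasonably close to 0.7
```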
MA or moving average models of order q assume that the current data value depends on the q previous error terms [or random shocks]. This is how we write this with the backshift operator:

y_t = (1 + Bθ1 + B^2 θ2 + … + B^q θq)∗ε_t
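A distinctive fingerprint of an MA(1) process is that its lag-1 autocorrelation equals θ/(1+θ^2) and all higher lags are zero. A quick simulation check, with θ = 0.6 chosen arbitrarily for illustration:

```r
# MA(1): y_t = e_t + theta*e_{t-1}; lag-1 autocorrelation is theta/(1+theta^2)
set.seed(2)
theta=0.6
y=arima.sim(model=list(ma=theta),n=2000)
acf(y,plot=FALSE)$acf[2]   # sample lag-1 autocorrelation
theta/(1+theta^2)          # theoretical value: 0.4411765
```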
AR(p) models are considered to be deterministic whereas MA(q) models are considered to be stochastic models. Any stationary time series can be expressed as the sum of a deterministic part and a moving-average part [Wold's theorem].
Many non-stationary series can be made stationary by differencing. Combining AR(p) and MA(q) models gives rise to ARIMA(p,d,q) models, where d is the order of differencing needed to achieve stationarity.
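The differencing idea can be checked directly in R with diff(); d in ARIMA(p,d,q) is simply how many times diff() would be applied. The classic example is a random walk, which is non-stationary but whose first difference is just the white noise that generated it:

```r
# a random walk is the running sum of white-noise shocks; one round of
# differencing recovers the shocks, i.e. makes the series stationary
set.seed(3)
e=rnorm(300)            # white-noise shocks
walk=cumsum(e)          # non-stationary random walk
d1=diff(walk)           # first difference
all.equal(as.numeric(d1),e[-1])   # TRUE: diff undoes cumsum
```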
A general ARIMA(p,d,q) model written with the backshift operator:

(1 − Bψ1 − B^2 ψ2 − … − B^p ψp)∗(1 − B)^d∗y_t = (1 + Bθ1 + B^2 θ2 + … + B^q θq)∗ε_t
An even more general ARIMA model is one with a seasonality component as well. This is called a SARIMA(p,d,q)(P,D,Q)[m] model, where (p,d,q) is the order of the non-seasonal ARIMA component, (P,D,Q) is the order of the seasonal ARIMA component, and m is the period of seasonality. A general SARIMA model is written as:
(1 − Bψ1 − B^2 ψ2 − … − B^p ψp)∗(1 − B^m Ψ1 − B^{2m} Ψ2 − … − B^{Pm} ΨP)∗(1 − B)^d∗(1 − B^m)^D∗y_t
=
(1 + Bθ1 + B^2 θ2 + … + B^q θq)∗(1 + B^m Θ1 + B^{2m} Θ2 + … + B^{Qm} ΘQ)∗ε_t
You don't need to worry about finding the order of an ARIMA or SARIMA model by hand. The function auto.arima in the library forecast gives you this automatically, so don't bother with long and vague articles on guessing orders by looking at acf and pacf plots.
auto.arima(souvenirts)
## Series: souvenirts
## ARIMA(2,0,0)(0,1,1)[12]
##
## Coefficients:
## ar1 ar2 sma1
## 0.4860 0.4894 -0.4922
## s.e. 0.1013 0.1028 0.1622
##
## sigma^2 estimated as 0.03099: log likelihood=20.3
## AIC=-32.6 AICc=-32 BIC=-23.49
This turns out to be a SARIMA model with p=2, d=0, q=0, P=0, D=1, Q=1 and m=12. The simplified equation for this will be:

(1 − Bψ1 − B^2 ψ2)∗(1 − B^12)∗y_t = (1 + B^12 Θ1)∗ε_t

where the values of ψ1, ψ2 and Θ1 are 0.486, 0.4894 and −0.4922 respectively. Once you have the order of the SARIMA model, you can also use the function arima from the base stats package.
arimafit=arima(souvenirts,order=c(2,0,0),seasonal=c(0,1,1))
arimafuture=forecast.Arima(arimafit,h=48)
plot(arimafuture)
[Plot: Forecasts from ARIMA(2,0,0)(0,1,1)[12] — forecast fan for the log souvenir series]
[Plot: ACF of arimafuture$residuals against lag]
shapiro.test(arimafuture$residuals)
##
## Shapiro-Wilk normality test
##
## data: arimafuture$residuals
## W = 0.98509, p-value = 0.4446