Homework 3 Solutions
Joe Neeman
September 22, 2010
1. Recall that the best linear predictor of Y given Z is the linear function
of Z (say, P(Y | Z) = aZ + b) that satisfies (by the projection theorem)
E[P(Y | Z)] = EY and E[Z(Y − P(Y | Z))] = 0. Therefore, aEZ + b = EY and
\begin{align*}
0 &= E[Z(Y - aZ - b)] \\
  &= E[YZ] - aE[Z^2] - b\,EZ \\
  &= E[YZ] - aE[Z^2] - (EY - a\,EZ)EZ \\
  &= E[YZ] - EY\,EZ + a(EZ)^2 - aE[Z^2] \\
  &= \mathrm{Cov}(Y, Z) - a\,\mathrm{Var}(Z).
\end{align*}
Hence
$$P(Y \mid Z) = \frac{\mathrm{Cov}(Y, Z)}{\mathrm{Var}(Z)}\, Z + EY - \frac{\mathrm{Cov}(Y, Z)}{\mathrm{Var}(Z)}\, EZ.$$
Thus,
\begin{align*}
P(\alpha_1 Y_1 + \alpha_2 Y_2 \mid Z)
&= \frac{\mathrm{Cov}(\alpha_1 Y_1 + \alpha_2 Y_2, Z)}{\mathrm{Var}(Z)}\, Z
   + E(\alpha_1 Y_1 + \alpha_2 Y_2)
   - \frac{\mathrm{Cov}(\alpha_1 Y_1 + \alpha_2 Y_2, Z)}{\mathrm{Var}(Z)}\, EZ \\
&= \frac{\alpha_1 \mathrm{Cov}(Y_1, Z) + \alpha_2 \mathrm{Cov}(Y_2, Z)}{\mathrm{Var}(Z)}\, Z
   + \alpha_1 EY_1 + \alpha_2 EY_2
   - \frac{\alpha_1 \mathrm{Cov}(Y_1, Z) + \alpha_2 \mathrm{Cov}(Y_2, Z)}{\mathrm{Var}(Z)}\, EZ \\
&= \alpha_1 P(Y_1 \mid Z) + \alpha_2 P(Y_2 \mid Z).
\end{align*}
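As a quick numerical check (not part of the original solution), the R sketch below compares this formula for P(Y | Z) with an ordinary least-squares fit on simulated data; the joint distribution of (Y, Z) used here is an arbitrary assumption.
set.seed(1)
Z <- rnorm(1e5)
Y <- 2 * Z + rnorm(1e5)                   # any joint distribution with finite variances would do
slope <- cov(Y, Z) / var(Z)               # Cov(Y, Z) / Var(Z)
intercept <- mean(Y) - slope * mean(Z)    # EY - slope * EZ
c(slope = slope, intercept = intercept)
coef(lm(Y ~ Z))                           # should agree closely with the line above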
2. We can assume that the mean of the process is zero and that W_t ∼ WN(0, 1). Then the best linear predictor of X_2 given X_1 and X_3 is a random variable of the form Y = aX_1 + bX_3 which satisfies E[X_1(X_2 − Y)] = 0 and E[X_3(X_2 − Y)] = 0. Let's write our AR(1) process as X_t = φX_{t−1} + W_t with |φ| < 1. Then the covariance function is γ(h) = φ^{|h|}/(1 − φ²) and so we can solve our conditions for the best linear predictor:
$$\gamma(1) = a\gamma(0) + b\gamma(2) \qquad\text{and}\qquad \gamma(1) = a\gamma(2) + b\gamma(0),$$
and so a = b = γ(1)/(γ(0) + γ(2)) = φ/(1 + φ²). That is, P(X_2 | X_1, X_3) = (φ/(1 + φ²))(X_1 + X_3).
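As a sanity check (again not part of the original solution), one can simulate a long AR(1) path and regress X_t on its two neighbours; the value φ = 0.6 is an arbitrary assumption.
set.seed(1)
phi <- 0.6
sim <- arima.sim(model = list(ar = phi), n = 1e5)      # AR(1) path with sigma_w = 1
N <- length(sim)
# regress X_t on X_{t-1} and X_{t+1}; no intercept since the process has mean zero
fit2 <- lm(sim[2:(N - 1)] ~ 0 + sim[1:(N - 2)] + sim[3:N])
coef(fit2)                                             # both coefficients close to ...
phi / (1 + phi^2)                                      # ... this value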
Next, consider the best linear predictor of X_{t+n} given X_1, …, X_t, and write it as X^t_{t+n} = α_1 X_t + α_2 X_{t−1} + ⋯ + α_t X_1. The prediction equations E[X_s(X_{t+n} − X^t_{t+n})] = 0 become, for s = 1, …, t,
$$\phi^{t+n-s} = \sum_{\tau=1}^{t} \alpha_{t-\tau+1}\, \phi^{|s-\tau|}.$$
If you stare at this system of equations for long enough, it becomes clear that one
solution is α_1 = φ^n and α_s = 0 for s > 1. That is, X^t_{t+n} = φ^n X_t solves
the prediction equations and so it is the best linear predictor of X_{t+n} given
X_1, …, X_t.
To compute its mean-squared error,
\begin{align*}
E(X_{t+n} - X^t_{t+n})^2 &= E(X_{t+n} - \phi^n X_t)^2 \\
&= \gamma(0) - 2\phi^n \gamma(n) + \phi^{2n}\gamma(0).
\end{align*}
Figure 1: Sample ACF and sample PACF for the sunspot data.
But γ(h) = σ_w² φ^{|h|}/(1 − φ²) for an AR(1) process, and so this is just
$$\frac{\sigma_w^2}{1 - \phi^2}\left(1 - 2\phi^{2n} + \phi^{2n}\right) = \frac{\sigma_w^2(1 - \phi^{2n})}{1 - \phi^2}.$$
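For illustration, the sketch below compares the empirical n-step-ahead prediction error of φ^n X_t with this closed form on a simulated path; the choices φ = 0.6, σ_w = 1 and n = 3 are arbitrary assumptions.
set.seed(2)
phi <- 0.6
nstep <- 3
path <- arima.sim(model = list(ar = phi), n = 2e5)     # sigma_w = 1 by default
tmax <- length(path) - nstep
emp <- mean((path[(1 + nstep):length(path)] - phi^nstep * path[1:tmax])^2)
theo <- (1 - phi^(2 * nstep)) / (1 - phi^2)            # the formula above with sigma_w = 1
c(empirical = emp, theoretical = theo)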
3. (a) The sample ACF and sample PACF are given in Figure 1. The R
code that computes and plots them is:
z <- read.table("sunspot.dat")$V1                  # read the sunspot series
x <- sqrt(z)                                       # work with the square root of the series
postscript(file="stat_153_solutions3_4a.eps")
par(mfcol=c(1,2))                                  # two panels side by side
a <- acf(x, ylab="Sample ACF")                     # sample ACF
pa <- acf(x, type="partial", ylab="Sample PACF")   # sample PACF
dev.off()
4. An MA(1) model would have a correlation function that was zero for lags
of 2 or more. Similarly, an MA(2) model would have a correlation function
that was zero for lags of 3 or more. Neither of these corresponds to the
sample ACF shown in Figure 1. An AR(1) model, on the other hand,
would show a PACF that was zero for lags of 2 or more and an AR(2)
model would have a PACF that was zero for lags of 3 or more. This last
model looks the most likely, because the PACF is fairly large for the first
two lags and then it drops off fairly substantially.
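The cutoff behaviour described above can also be seen in the theoretical ACF and PACF computed by R's ARMAacf; the coefficients below are arbitrary assumptions, chosen only to illustrate the pattern.
round(ARMAacf(ma = c(0.7, 0.4), lag.max = 6), 3)                # MA(2): ACF is zero beyond lag 2
round(ARMAacf(ar = c(1.4, -0.7), lag.max = 6, pacf = TRUE), 3)  # AR(2): PACF is zero beyond lag 2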
5. We estimated the parameters with the command ar.yw(x, order = 2),
which gave us the estimate Xt = 1.388Xt−1 − 0.678Xt−2 + Wt + 6.351,
where Wt ∼ WN(0, 2.082).
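For reference, a sketch of how the fitted quantities can be inspected (using order.max = 2 with aic = FALSE to force a second-order fit rather than letting AIC choose the order):
fit <- ar.yw(x, order.max = 2, aic = FALSE)   # Yule-Walker AR(2) fit to sqrt(sunspots)
fit$ar                                        # the two AR coefficients
fit$var.pred                                  # estimated white-noise variance
fit$x.mean                                    # sample mean removed before fitting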
6. We used the predict function to predict the next four values. After
we squared the results, the prediction intervals were about [20.3, 47.8],
[9.7, 52.4], [6.8, 59.3] and [7.6, 67.4].
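A sketch of how such intervals can be produced, reusing fit from above; the 1.96 multiplier (a roughly 95% normal interval) is an assumption about how the quoted intervals were obtained.
pred <- predict(fit, n.ahead = 4)             # predictions on the square-root scale
point <- pred$pred^2                          # square back to the original scale
lower <- (pred$pred - 1.96 * pred$se)^2       # assumes the lower endpoint is nonnegative
upper <- (pred$pred + 1.96 * pred$se)^2
cbind(point, lower, upper)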
7. The plots of the original time series, our predictions and their prediction
intervals are in Figure 2.
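A sketch of the first panel, reusing z and pred from above; taking the last observation to be for 1984 is inferred from the predictions being for 1985–1988, and the green curve of actual 1985–1988 values is omitted because those numbers are not in the data file read earlier.
sun <- ts(z, end = 1984)                                  # yearly series assumed to end in 1984
yrs <- 1985:1988
plot(sun, xlim = c(start(sun)[1], 1990),
     xlab = "year", ylab = "sunspots")                    # original series in black
lines(yrs, pred$pred^2, col = "red")                      # point predictions
lines(yrs, (pred$pred - 1.96 * pred$se)^2, col = "blue")  # prediction interval
lines(yrs, (pred$pred + 1.96 * pred$se)^2, col = "blue")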
Figure 2: In black, the original time series. Our predictions for 1985–1988 are
in red and the prediction intervals are in blue. The actual values for 1985–1988
are in green. The second graph zooms in on the years 1950–1988.