
Time Series Analysis

Dr. Maryam Habadi


STAT 454
1
3 Regression Analysis and Forecasting

• Regression analysis is a statistical technique for modeling and investigating the relationships between a dependent (outcome or response) variable and one or more independent (predictor, risk factor or regressor) variables.

• The result of a regression analysis study is often to generate a model that can be used to forecast or predict future values of the response variable given specified values of the predictor variables.
4 The simple linear regression model involves a single predictor variable and is written as:

y = β₀ + β₁ x + ε

• where y is the response and x is the predictor variable,

• β₀ and β₁ are unknown parameters: β₀ is the intercept and β₁ is the slope of the straight line, which measures the change in the mean of the response variable y for a unit change in the predictor variable x,

• ε is an error term that accounts for deviations of the actual data from the straight line specified by the model equation, with ε ~ iid N(0, σ²).
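To make the model and its error assumption concrete, here is a minimal simulation sketch (assuming Python with NumPy; the values β₀ = 2, β₁ = 0.5 and σ = 1 are purely illustrative) that generates data from the simple linear regression model and recovers the coefficients by least squares.

```python
import numpy as np

# Illustrative (assumed) parameter values for the simulation
rng = np.random.default_rng(0)
n, beta0, beta1, sigma = 50, 2.0, 0.5, 1.0

x = np.linspace(0, 10, n)                # predictor values
eps = rng.normal(0.0, sigma, size=n)     # iid N(0, sigma^2) errors
y = beta0 + beta1 * x + eps              # simple linear regression model

# Recover the coefficients by ordinary least squares
X = np.column_stack([np.ones(n), x])     # design matrix with columns 1 and x
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)                                 # estimates of (beta0, beta1)
```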
5
• These are called linear regression models because they are linear in the unknown parameters (the β's), not because they necessarily describe linear relationships between the response and the regressors. For example, the model

y = β₀ + β₁ x₁ + β₂ x₂ + ⋯ + βₖ xₖ + ε

is a multiple linear regression model with k predictors. Also, the model

y = β₀ + β₁ x + β₂ x² + ε

is a linear regression model because it is linear in the unknown parameters β₀, β₁ and β₂, although it describes a quadratic relationship between y and x.
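Both models on this slide are fitted in exactly the same way, precisely because each is linear in the β's. A minimal sketch (assuming Python with NumPy and purely illustrative data) for the multiple regression case with k = 2 predictors:

```python
import numpy as np

# Illustrative (assumed) data: two predictors and a response
rng = np.random.default_rng(1)
n = 60
x1 = rng.uniform(0, 10, size=n)
x2 = rng.uniform(0, 5, size=n)
y = 3.0 + 1.5 * x1 - 2.0 * x2 + rng.normal(0, 1.0, size=n)

# Multiple linear regression y = b0 + b1*x1 + b2*x2 + e:
# the design matrix has a column of ones plus one column per predictor.
X = np.column_stack([np.ones(n), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)   # estimates of (b0, b1, b2)

# The quadratic model y = b0 + b1*x + b2*x^2 + e is handled identically:
# use columns 1, x and x**2, since the model is still linear in the b's.
```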


6 Classical Regression Methods for Trend Effects

If the variable of interest is a time series, then naturally it is important to identify and fit any systematic time patterns that may be present. This approach estimates the trend equation as a regression of the time series on the time index t. Suppose that the observed series is yₜ, for t = 1, 2, …, n. The linear trend regression model for time series data is written as:

yₜ = β₀ + β₁ t + εₜ ;   t = 1, 2, …, n

where:

yₜ represents the time series observations,

t is the time index,

β₀ and β₁ are unknown parameters (regression coefficients),

and εₜ is an error term that accounts for deviations of the actual data from the straight line specified by the model equation, with ε ~ iid N(0, σ²).
7 Notes:
• The general form of the trend model is:

yₜ = f(t) + εₜ

where f(t) is a function that can take various forms, such as linear, quadratic, exponential, etc.

• For a linear trend, use t as a predictor variable in a regression model.

• For a non-linear trend, for example a quadratic trend, we might consider using both t and t² as predictors (see the short sketch below).
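As a small illustration of this note, the sketch below (assuming Python with NumPy and an artificial series invented for the example) fits a linear and a quadratic trend to the same series; np.polyfit simply carries out the corresponding least squares regressions on t, and on t and t².

```python
import numpy as np

# Artificial series with a curved trend (illustrative values only)
rng = np.random.default_rng(2)
n = 30
t = np.arange(1, n + 1)
y = 5 + 0.4 * t + 0.05 * t**2 + rng.normal(0, 1.0, size=n)

# Linear trend:    y_t = b0 + b1*t
lin = np.polyfit(t, y, deg=1)      # coefficients returned as [b1, b0]

# Quadratic trend: y_t = b0 + b1*t + b2*t^2  (uses both t and t^2)
quad = np.polyfit(t, y, deg=2)     # coefficients returned as [b2, b1, b0]

print(lin, quad)
```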
8 [Figure: positive linear trend and negative linear trend]

9 [Figure: quadratic trend and exponential growth]


10 Least Squares Estimation in Linear Regression Models
To estimate the coefficients we use the ordinary least squares (OLS) method. Solving the normal equations

Σ yₜ = n β₀ + β₁ Σ t

Σ t yₜ = β₀ Σ t + β₁ Σ t²

we get:

b₁ = (n Σ t yₜ − Σ t Σ yₜ) / (n Σ t² − (Σ t)²)

b₀ = ȳ − b₁ t̄

where:

ȳ = Σ yₜ / n   and   t̄ = Σ t / n
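A minimal sketch of these closed-form OLS formulas (assuming Python with NumPy; the function name fit_linear_trend is just for illustration), checked against the series used in Example (1) below:

```python
import numpy as np

def fit_linear_trend(y):
    """Fit y_t = b0 + b1*t for t = 1..n using the closed-form OLS formulas."""
    y = np.asarray(y, dtype=float)
    n = y.size
    t = np.arange(1, n + 1)

    # Slope and intercept from the normal equations
    b1 = (n * np.sum(t * y) - np.sum(t) * np.sum(y)) / (n * np.sum(t**2) - np.sum(t)**2)
    b0 = y.mean() - b1 * t.mean()
    return b0, b1

# Check with the series from Example (1) below
b0, b1 = fit_linear_trend([17, 25, 33, 41, 39, 48, 53])
print(round(b0, 3), round(b1, 3))   # roughly 13.714 and 5.714
```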
11 Notes:

• A positive b₁ means an increasing trend.

• A negative b₁ means a decreasing trend.

• A zero b₁ means no trend.

• By substituting a future value of t into the estimated trend equation, we can forecast the future trend value of the time series (T.S.).
12 Example (1)

• Estimate the trend equation of the following time series, predict the trend value in 2013, and plot the data. (Assume a linear trend.)

Year    2004   2005   2006   2007   2008   2009   2010
yₜ      17     25     33     41     39     48     53
13 Answer:

Year     yₜ     t     t·yₜ     t²
2004     17     1     17       1
2005     25     2     50       4
2006     33     3     99       9
2007     41     4     164      16
2008     39     5     195      25
2009     48     6     288      36
2010     53     7     371      49
Total    256    28    1184     140
14
b₁ = (n Σ t yₜ − Σ t Σ yₜ) / (n Σ t² − (Σ t)²) = (7(1184) − 28(256)) / (7(140) − (28)²) = 5.714   → positive trend

b₀ = ȳ − b₁ t̄ = 36.571 − 5.714(4) = 13.715

• The estimated linear trend model is:

ŷₜ = 13.715 + 5.714 t

• To predict the trend in 2013, use t = 10:

ŷ₂₀₁₃ = 13.715 + 5.714(10) = 70.855
15 Answer:

Year     yₜ     ŷₜ
2004     17     19.429
2005     25     25.143
2006     33     30.857
2007     41     36.571
2008     39     42.285
2009     48     47.999
2010     53     53.713
Total    256
16 [Figure: time series plot of yₜ against Year, 2004-2010, showing the increasing trend]
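A short sketch that reproduces the fitted values and a plot like the one above (assuming Python with NumPy and Matplotlib):

```python
import numpy as np
import matplotlib.pyplot as plt

years = np.arange(2004, 2011)
y = np.array([17, 25, 33, 41, 39, 48, 53])
t = np.arange(1, y.size + 1)

# Estimated trend from Example (1): y_hat = 13.715 + 5.714 t
y_hat = 13.715 + 5.714 * t

plt.plot(years, y, "o-", label="observed $y_t$")
plt.plot(years, y_hat, "--", label="fitted trend $\\hat{y}_t$")
plt.xlabel("Year")
plt.ylabel("$y_t$")
plt.legend()
plt.show()
```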
17 Regression Approach Using Matrix Notation

It is simpler to estimate the regression coefficients if the regression model is expressed in matrix notation. The regression model may be written in matrix notation as:

y = X β + ε
18

where:

y is an (n×1) vector of the observations,

X is an (n×2) matrix of the levels of the regressor variables (the elements of its first column are all equal to one, and the elements of its second column are the values of the time units 1, 2, 3, …, n),

β is a (2×1) vector of the regression coefficients (unknown parameters),

and ε is an (n×1) vector of the random errors.

• That is:

19
y = | y₁ |
    | y₂ |
    | ⋮  |
    | yₙ |   (n×1)

X = | 1  1 |
    | 1  2 |
    | ⋮  ⋮ |
    | 1  n |   (n×2)

β = | β₀ |
    | β₁ |   (2×1)

ε = | ε₁ |
    | ε₂ |
    | ⋮  |
    | εₙ |   (n×1)
20
To estimate the regression coefficients using the least squares method, we multiply the regression equation

y = X β

by Xᵀ (the transpose of X); then we get

Xᵀ y = Xᵀ X β     (1)

where

Xᵀ = | 1  1  …  1 |
     | 1  2  …  n |   (2×n)
21
Xᵀ X = | n     Σ t  |
       | Σ t   Σ t² |   (2×2)

and

Xᵀ y = | Σ yₜ   |
       | Σ t yₜ |   (2×1)

where the sums run over t = 1, 2, …, n.
Then equation (1) becomes:

22
| Σ yₜ   |   | n     Σ t  | | β₀ |
| Σ t yₜ | = | Σ t   Σ t² | | β₁ |

Then multiply both sides of Eq. (1) from the left by (Xᵀ X)⁻¹ (the inverse of Xᵀ X, which we assume exists), where

23
(Xᵀ X)⁻¹ = 1 / (n Σ t² − (Σ t)²)  ×  |  Σ t²   −Σ t |
                                     | −Σ t      n  |

and it is known that

A⁻¹ A = I

∴ (Xᵀ X)⁻¹ (Xᵀ X) = I

Thus the least squares estimator of β is

β̂ = (Xᵀ X)⁻¹ Xᵀ y
24 The fitted values of the response variable from the regression model are computed from

ŷ = X β̂

and the residuals (the differences between the actual observations yₜ and the corresponding fitted values) can be written as an (n×1) vector denoted by

e = y − ŷ = y − X β̂
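A minimal NumPy sketch of these matrix formulas for the linear trend model, using the Example (1) series as input (variable names are just illustrative):

```python
import numpy as np

y = np.array([17, 25, 33, 41, 39, 48, 53], dtype=float)   # Example (1) series
n = y.size
t = np.arange(1, n + 1)

X = np.column_stack([np.ones(n), t])          # n x 2 design matrix [1, t]

# Least squares estimator: beta_hat = (X'X)^(-1) X'y
beta_hat = np.linalg.inv(X.T @ X) @ (X.T @ y)

y_hat = X @ beta_hat                          # fitted values
e = y - y_hat                                 # residuals

print(beta_hat)                               # approx [13.714, 5.714]
```

In practice np.linalg.solve or np.linalg.lstsq is numerically preferable to forming the inverse explicitly; the explicit inverse is used here only to mirror the formula above.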
25 Prediction Methods for New Observations

• Point forecast: the estimate of the possible future value at the point t₀, computed by

ŷ(X₀) = X₀ᵀ β̂

where

X₀ = | 1  |   ;   X₀ᵀ = ( 1   t₀ )
     | t₀ |
26
• Prediction interval (PI): gives a range of future values that the random variable could take with relatively high probability. A 100(1 − α) percent prediction interval for the future observation at the point t₀ is given by

ŷ(X₀) − t_(α/2, n−2) σ̂_ŷ  ≤  y(X₀)  ≤  ŷ(X₀) + t_(α/2, n−2) σ̂_ŷ

where

S.E.(ŷ(X₀)) = σ̂_ŷ = S √( 1 + X₀ᵀ (Xᵀ X)⁻¹ X₀ )

and

S = √( Σ (yₜ − ŷₜ)² / (n − 2) )     (sum over t = 1, 2, …, n)
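A sketch of this prediction interval calculation (assuming Python with NumPy and SciPy; the function name trend_prediction_interval is just illustrative):

```python
import numpy as np
from scipy import stats

def trend_prediction_interval(y, t0, alpha=0.05):
    """100(1-alpha)% prediction interval for a new observation at time t0
    under the linear trend model y_t = b0 + b1*t + e, t = 1..n."""
    y = np.asarray(y, dtype=float)
    n = y.size
    t = np.arange(1, n + 1)

    X = np.column_stack([np.ones(n), t])
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ (X.T @ y)

    y_hat = X @ beta_hat
    S = np.sqrt(np.sum((y - y_hat) ** 2) / (n - 2))    # residual standard deviation

    x0 = np.array([1.0, t0])
    se = S * np.sqrt(1.0 + x0 @ XtX_inv @ x0)          # S.E. of the new observation
    y0_hat = x0 @ beta_hat                             # point forecast at t0
    tcrit = stats.t.ppf(1 - alpha / 2, df=n - 2)       # t critical value

    return y0_hat - tcrit * se, y0_hat, y0_hat + tcrit * se
```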
27 Notes:

• A confidence interval (CI) is an interval estimate of the mean of the response distribution at a specific point, while the prediction interval (PI) is an interval estimate of a single future observation from the response distribution at that point.

• The PI is wider than the corresponding CI at the same point.

28 Example (2):
The following time series data represent the number of births (in thousands) from 2010 to 2015:

Year           2010   2011   2012   2013   2014   2015
# of births    15     20     35     30     35     40

1. Find the trend equation using matrix notation.

2. Calculate the fitted values and the residuals.

3. Estimate the number of births in 2017.

4. Calculate the 95% prediction interval for the number of births in 2017.


29
Answer
1.
β̂ = (Xᵀ X)⁻¹ Xᵀ y

Xᵀ X = | 6    21 |   ;   ∴ (Xᵀ X)⁻¹ = (1/105) |  91   −21 |
       | 21   91 |                            | −21     6 |

Xᵀ y = | 175 |
       | 695 |

∴ β̂ = (1/105) |  91   −21 | | 175 |
              | −21     6 | | 695 |

     = (1/105) | 1330 |  =  | 12.67 |
               |  495 |     |  4.71 |
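A short NumPy check of these matrix computations for the births series (a sketch; variable names are illustrative):

```python
import numpy as np

y = np.array([15, 20, 35, 30, 35, 40], dtype=float)   # births (thousands), 2010-2015
t = np.arange(1, y.size + 1)
X = np.column_stack([np.ones_like(t, dtype=float), t])

XtX = X.T @ X                      # [[6, 21], [21, 91]]
XtX_inv = np.linalg.inv(XtX)       # (1/105) * [[91, -21], [-21, 6]]
Xty = X.T @ y                      # [175, 695]

beta_hat = XtX_inv @ Xty
print(beta_hat)                    # approx [12.667, 4.714]
```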
30
• Thus, the estimated trend regression model is:

ŷₜ = 12.67 + 4.71 t

3. To predict the number of births in 2017, use t₀ = 8:

ŷ₂₀₁₇ = 12.67 + 4.71(8) = 50.35 ≈ 50

4. Prediction interval:

X₀ = | 1 |   ;   X₀ᵀ = ( 1   8 )
     | 8 |
X₀ᵀ (Xᵀ X)⁻¹ = (1/105) ( −77   27 )

X₀ᵀ (Xᵀ X)⁻¹ X₀ = ( −77(1) + 27(8) ) / 105 = 139/105 ≈ 1.32
31
S = √(81.9059 / 4) = √20.476 = 4.525

S.E.(ŷ(X₀)) = σ̂_ŷ = 4.525 √(1 + 1.32) = 6.90

The critical value: t_(α/2, n−2) = t_(0.025, 4) = 2.776

Then,

ŷ(X₀) − t_(α/2, n−2) σ̂_ŷ  ≤  y(X₀)  ≤  ŷ(X₀) + t_(α/2, n−2) σ̂_ŷ

50.35 − (2.776 × 6.90)  ≤  y(X₀)  ≤  50.35 + (2.776 × 6.90)

31.20  ≤  y(X₀)  ≤  69.50

That is, with 95% confidence the number of births in 2017 is predicted to fall between about 31.2 and 69.5 thousand.
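A short numerical check of this interval (a sketch assuming Python with NumPy and SciPy; small differences from the hand calculation above come only from rounding):

```python
import numpy as np
from scipy import stats

y = np.array([15, 20, 35, 30, 35, 40], dtype=float)   # births (thousands), 2010-2015
n, t0, alpha = y.size, 8, 0.05                        # forecast year 2017 -> t0 = 8
t = np.arange(1, n + 1)
X = np.column_stack([np.ones(n), t])

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ (X.T @ y)
S = np.sqrt(np.sum((y - X @ beta_hat) ** 2) / (n - 2))   # approx 4.525

x0 = np.array([1.0, t0])
se = S * np.sqrt(1.0 + x0 @ XtX_inv @ x0)                # approx 6.90
y0 = x0 @ beta_hat                                        # approx 50.38
tcrit = stats.t.ppf(1 - alpha / 2, df=n - 2)              # approx 2.776

print(y0 - tcrit * se, y0 + tcrit * se)                   # roughly 31.2 and 69.5
```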
32

Year     yₜ      ŷₜ       e
2010     15      17.38    −2.38
2011     20      22.09    −2.09
2012     35      26.80     8.20
2013     30      31.51    −1.51
2014     35      36.22    −1.22
2015     40      40.93    −0.93
Total    175
