
STAT 3008 Applied Regression Analysis

Tutorial 1 | Term 2, 2019–20

ZHAN Zebang
22 January 2020

1 Regression Problems
• Regression studies dependence: does the response Y depend on the predictors (explanatory variables) X, and if so, how?

• A scatterplot helps identify the mean function, the variance function, and separated points.
Mean function: $E(Y \mid X = x) = f(x)$.
Variance function: $\mathrm{Var}(Y \mid X = x) = h(x)$.
Separated points: a leverage point is extreme in the horizontal ($x$) direction and has a larger impact on the fit; an outlier is extreme in the vertical ($y$) direction.

• A null plot is a scatterplot with a constant mean function, a constant variance function, and no separated points.

Example 1. (Nonnull plot and null plot)

x1=rt(100,6)        # 100 values from a t(6) distribution
y1=3*x1+rnorm(100)  # y1 depends on x1
x2=runif(100,2,8)   # 100 values from a Uniform(2,8) distribution
y2=rnorm(100,3,2)   # y2 comes from N(3,4) and does not depend on x2
par(mfrow=c(1,2))   # display the two plots side by side (1 row, 2 columns)
plot(x1,y1,main="A nonnull scatterplot")
plot(x2,y2,main="A null plot")

2 Linear Regression
• Model setting: $Y = X\beta + e$, where $E(e) = 0$, $\mathrm{Var}(e) = \sigma^2 I_n$.
Simple linear regression: $y_i = \beta_0 + \beta_1 x_i + e_i$;

Multiple linear regression ($p > 1$): $y_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_p x_{pi} + e_i$,
where $E(e_i) = 0$, $\mathrm{Var}(e_i) = \sigma^2$, and the $e_i$'s are uncorrelated.

• Ordinary least squares (OLS) method: the parameter estimates are obtained by minimizing the sum of squared vertical distances (residuals),
$g(\beta) = \sum_{i=1}^{n} [y_i - (\beta_0 + \beta_1 x_{1i} + \cdots + \beta_p x_{pi})]^2 \Rightarrow \hat{\beta} = \arg\min_{\beta} g(\beta)$.

Define the fitted value of $y_i$ as $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{1i} + \cdots + \hat{\beta}_p x_{pi}$,
the residual $\hat{e}_i = y_i - \hat{y}_i = y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_{1i} + \cdots + \hat{\beta}_p x_{pi})$,
and the residual sum of squares $RSS = \sum_{i=1}^{n} \hat{e}_i^2 = \sum_{i=1}^{n} [y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_{1i} + \cdots + \hat{\beta}_p x_{pi})]^2$.

$E(RSS) = (n-k)\sigma^2 \Rightarrow \hat{\sigma}^2 = \dfrac{RSS}{n-k}$ is an unbiased estimator of $\sigma^2$, where $k$ is the number of regression coefficients in the model.

• For simple linear regression ($p = 1$), denote the sample means and sums of squares by
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, $SXX = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2$,
$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$, $SYY = \sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} y_i^2 - n\bar{y}^2$,
$SXY = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}$. Then
$g(\beta_0, \beta_1) = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$,
$\dfrac{\partial g(\beta_0, \beta_1)}{\partial \beta_0} = -2\sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)$, $\dfrac{\partial g(\beta_0, \beta_1)}{\partial \beta_1} = -2\sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)x_i$.
Setting $\dfrac{\partial g(\beta_0, \beta_1)}{\partial \beta_0}\Big|_{\hat{\beta}_0, \hat{\beta}_1} = \dfrac{\partial g(\beta_0, \beta_1)}{\partial \beta_1}\Big|_{\hat{\beta}_0, \hat{\beta}_1} = 0$ gives
$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}}{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2} = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \dfrac{SXY}{SXX}$, $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$.
$E(RSS) = (n-2)\sigma^2 \Rightarrow \hat{\sigma}^2 = \dfrac{RSS}{n-2} = \cdots = \dfrac{1}{n-2}\left(SYY - \dfrac{SXY^2}{SXX}\right)$.
These closed-form estimates can be checked numerically in R; see the sketch below.
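
As a quick numerical check of the closed-form estimates above, the following R sketch (not part of the original handout; data are simulated as in Example 1) computes $\hat{\beta}_0$, $\hat{\beta}_1$ and $\hat{\sigma}^2$ from SXX, SYY and SXY and compares them with the output of lm().

x <- rt(100, 6)                            # simulated predictor, as in Example 1
y <- 3*x + rnorm(100)                      # simulated response
n <- length(x)
SXX <- sum((x - mean(x))^2)
SYY <- sum((y - mean(y))^2)
SXY <- sum((x - mean(x))*(y - mean(y)))
beta1.hat <- SXY/SXX                       # slope estimate SXY/SXX
beta0.hat <- mean(y) - beta1.hat*mean(x)   # intercept estimate ybar - beta1.hat*xbar
RSS <- sum((y - beta0.hat - beta1.hat*x)^2)
sigma2.hat <- RSS/(n - 2)                  # equals (SYY - SXY^2/SXX)/(n - 2)
fit <- lm(y ~ x)
coef(fit)                                  # compare with c(beta0.hat, beta1.hat)
summary(fit)$sigma^2                       # compare with sigma2.hat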

Example 2. (Alternative form of SLR) For a data set with observations $\{(x_i, y_i), i = 1, \ldots, n\}$, consider the regression model $y_i = \alpha_0 + \alpha_1 (x_i - \bar{x}) + e_i$, where $E(e_i) = 0$, $\mathrm{Var}(e_i) = \sigma^2$, and the $e_i$'s are uncorrelated. Find the least squares estimates of $\alpha_0$, $\alpha_1$ and $\sigma^2$.
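
One way to check a hand-derived answer numerically is to fit the centered model with lm() and compare it with the uncentered fit; the sketch below uses hypothetical simulated data and is not part of the original handout.

x <- runif(50, 2, 8)                      # hypothetical simulated data
y <- 1 + 2*x + rnorm(50)
fit.centered <- lm(y ~ I(x - mean(x)))    # y_i = alpha0 + alpha1*(x_i - xbar) + e_i
fit.plain    <- lm(y ~ x)                 # y_i = beta0 + beta1*x_i + e_i
coef(fit.centered)                        # compare with your hand-derived alpha estimates
coef(fit.plain)
c(summary(fit.centered)$sigma^2, summary(fit.plain)$sigma^2)  # both estimate sigma^2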

Example 3. (OLS method) For a data set with observations $\{(x_i, y_i), i = 1, \ldots, n\}$, consider the regression model $y_i = \beta_1 x_i^2 + e_i$, where $E(e_i) = 0$, $\mathrm{Var}(e_i) = \sigma^2$, and the $e_i$'s are uncorrelated. Find the least squares estimates of $\beta_1$ and $\sigma^2$.
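
A numerical check for this no-intercept model can be done in R; the sketch below uses hypothetical simulated data and is not part of the original handout.

x <- runif(50, -2, 2)              # hypothetical simulated data
y <- 1.5*x^2 + rnorm(50, sd = 0.5)
fit <- lm(y ~ 0 + I(x^2))          # single regressor x_i^2, no intercept
coef(fit)                          # compare with your hand-derived beta1.hat
sum(resid(fit)^2)/(length(x) - 1)  # sigma2.hat with k = 1 coefficient; equals summary(fit)$sigma^2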

Example 4. (Deterministic properties of residuals) For the simple linear regression model, consider the residuals $\hat{e}_i = y_i - \hat{y}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i$, $i = 1, \ldots, n$, where $\hat{\beta}_0$ and $\hat{\beta}_1$ are the least squares estimates of $\beta_0$ and $\beta_1$. Find
(1) $\sum_{i=1}^{n} \hat{e}_i$; (2) $\sum_{i=1}^{n} x_i \hat{e}_i$; (3) $\sum_{i=1}^{n} (x_i - \bar{x})(\hat{e}_i - \bar{\hat{e}})$, where $\bar{\hat{e}} = \frac{1}{n}\sum_{i=1}^{n} \hat{e}_i$.
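
These three quantities can also be computed numerically to check a hand-derived answer; the sketch below uses hypothetical simulated data and is not part of the original handout.

x <- runif(50, 2, 8)                         # hypothetical simulated data
y <- 1 + 2*x + rnorm(50)
fit <- lm(y ~ x)                             # simple linear regression fit
e.hat <- resid(fit)                          # residuals y_i - beta0.hat - beta1.hat*x_i
sum(e.hat)                                   # quantity (1)
sum(x * e.hat)                               # quantity (2)
sum((x - mean(x)) * (e.hat - mean(e.hat)))   # quantity (3)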
