Metrics 2019 Lec3


Simple OLS Regression: Estimation

Introduction to Econometrics, Fall 2019

Zhaopeng Qu

Nanjing University

9/26/2019

Zhaopeng Qu (Nanjing University) Simple OLS Regression:Estimation 9/26/2019 1 / 60


1 Review the last lecture

2 OLS Estimation: Simple Regression

3 The Least Squares Assumptions

4 Properties of the OLS Estimators




Section 1

Review the last lecture


Section 2

OLS Estimation: Simple Regression


Question: Class Size and Student’s Performance


Specific Question:
What is the effect on district test scores if we increase the district's
average class size by one student (one unit of the student-teacher
ratio)?
Technically, we would like to know the true value of a parameter β1:

    β1 = ΔTestScore / ΔClassSize

β1 is the slope of the straight line relating test scores and class size. Thus

    TestScore = β0 + β1 × ClassSize

where β0 is the intercept of the straight line.



Question: Class Size and Student’s Performance

BUT the average test score in district i does not depend only on the
average class size.
It also depends on other factors such as
Student background
Quality of the teachers
School facilities
Quality of textbooks …

So the equation describing the linear relation between test score and
class size is better written as

    TestScore_i = β0 + β1 × ClassSize_i + u_i

where u_i lumps together all other district characteristics that affect
average test scores.


Terminology for Simple Regression Model

The linear regression model with one regressor is denoted by

    Y_i = β0 + β1 X_i + u_i

where
Y_i is the dependent variable (test score)
X_i is the independent variable or regressor (class size or
student-teacher ratio)
β0 + β1 X_i is the population regression line or the population
regression function
This is the relationship that holds between Y and X on average over
the population. (Sound familiar? Recall the concept of the CEF.)


Terminology for Simple Regression Model

The intercept β0 and the slope β1 are the coefficients of the
population regression line, also known as the parameters of the
population regression line.
u_i is the error term, which contains all the other factors besides X
that determine the value of the dependent variable, Y, for a specific
observation, i.




How to find the “best” fitting line?


In general we don't know β0 and β1, the parameters of the
population regression function. We have to estimate them from a
sample of data.

So how do we find the line that fits the data best?



The Ordinary Least Squares Estimator (OLS)

The OLS estimator

Chooses the regression coefficients so that the estimated
regression line is as close as possible to the observed data, where
closeness is measured by the sum of the squared mistakes made in
predicting Y given X.
Let b0 and b1 be estimators of β0 and β1; thus b0 ≡ β̂0 and b1 ≡ β̂1.
The predicted value of Y_i given X_i using these estimators is
b0 + b1 X_i, or β̂0 + β̂1 X_i, formally denoted Ŷ_i.


The Ordinary Least Squares Estimator (OLS)

The OLS estimator


The prediction mistake is the difference between Y_i and Ŷ_i:

    û_i = Y_i − Ŷ_i = Y_i − (b0 + b1 X_i)

The estimators of the slope and intercept that minimize the sum of
the squares of û_i,

    arg min_{b0,b1} Σ_{i=1}^{n} û_i² = arg min_{b0,b1} Σ_{i=1}^{n} (Y_i − b0 − b1 X_i)²,

are called the ordinary least squares (OLS) estimators of β0 and β1.


The Ordinary Least Squares Estimator (OLS)

OLS minimizes the sum of squared prediction mistakes:

    min_{b0,b1} Σ_{i=1}^{n} û_i² = Σ_{i=1}^{n} (Y_i − b0 − b1 X_i)²

Solve the problem by the F.O.C. (first-order conditions):

Step 1, for β0:

    ∂/∂b0 Σ_{i=1}^{n} (Y_i − b0 − b1 X_i)² = 0

Step 2, for β1:

    ∂/∂b1 Σ_{i=1}^{n} (Y_i − b0 − b1 X_i)² = 0


Step 1: OLS estimator of 𝛽0

Optimization:

    ∂/∂b0 Σ_{i=1}^{n} û_i² = −2 Σ_{i=1}^{n} (Y_i − b0 − b1 X_i) = 0

    ⇒ Σ_{i=1}^{n} Y_i − Σ_{i=1}^{n} b0 − Σ_{i=1}^{n} b1 X_i = 0

    ⇒ (1/n) Σ_{i=1}^{n} Y_i − (1/n) Σ_{i=1}^{n} b0 − b1 (1/n) Σ_{i=1}^{n} X_i = 0

    ⇒ Ȳ − b0 − b1 X̄ = 0


Step 1: OLS estimator of 𝛽0

OLS estimator of β0:

    b0 = β̂0 = Ȳ − b1 X̄


Step 2: OLS estimator of 𝛽1

    ∂/∂b1 Σ_{i=1}^{n} û_i² = −2 Σ_{i=1}^{n} X_i (Y_i − b0 − b1 X_i) = 0

    ⇒ Σ_{i=1}^{n} X_i [Y_i − (Ȳ − b1 X̄) − b1 X_i] = 0

    ⇒ Σ_{i=1}^{n} X_i [(Y_i − Ȳ) − b1 (X_i − X̄)] = 0

    ⇒ Σ_{i=1}^{n} X_i (Y_i − Ȳ) − b1 Σ_{i=1}^{n} X_i (X_i − X̄) = 0


Step 2: OLS estimator of 𝛽1


Some algebraic facts:

    Σ_{i=1}^{n} (X_i − X̄)(Y_i − Ȳ)
        = Σ X_i Y_i − Σ X_i Ȳ − Σ X̄ Y_i + Σ X̄ Ȳ
        = Σ X_i Y_i − Σ X_i Ȳ − n X̄ ((1/n) Σ Y_i) + n X̄ Ȳ
        = Σ_{i=1}^{n} X_i (Y_i − Ȳ)

By similar reasoning, we obtain

    Σ_{i=1}^{n} (X_i − X̄)(X_i − X̄) = Σ_{i=1}^{n} X_i (X_i − X̄)


Step 2: OLS estimator of 𝛽1

Thus the first-order condition becomes

    Σ_{i=1}^{n} (X_i − X̄)(Y_i − Ȳ) − b1 Σ_{i=1}^{n} (X_i − X̄)² = 0

OLS estimator of β1:

    b1 = β̂1 = Σ_{i=1}^{n} (X_i − X̄)(Y_i − Ȳ) / Σ_{i=1}^{n} (X_i − X̄)²
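These closed-form formulas are easy to check numerically. A minimal sketch on simulated data (all numbers here are made up for illustration, not from the California test-score data), computing β̂0 and β̂1 directly from the formulas above:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(20, 2, size=n)    # e.g. hypothetical student-teacher ratios
u = rng.normal(0, 5, size=n)     # unobserved other factors
y = 700 - 2.3 * x + u            # "true" line: beta0 = 700, beta1 = -2.3

# b1 = sum((Xi - Xbar)(Yi - Ybar)) / sum((Xi - Xbar)^2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
# b0 = Ybar - b1 * Xbar
b0 = y.mean() - b1 * x.mean()
print(b0, b1)   # close to the true values 700 and -2.3
```

As a cross-check, `np.polyfit(x, y, 1)` returns the same slope and intercept, since it solves the same least-squares problem.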


Some Algebraic Properties of û_i


Recall the F.O.C.:

    ∂/∂b0 Σ_{i=1}^{n} (Y_i − b0 − b1 X_i)² = 0

    ∂/∂b1 Σ_{i=1}^{n} (Y_i − b0 − b1 X_i)² = 0

We obtain two intermediate formulas:

    Σ_{i=1}^{n} (Y_i − b0 − b1 X_i) = 0

    Σ_{i=1}^{n} X_i (Y_i − b0 − b1 X_i) = 0

Some Algebraic Properties of û_i


Recall that the OLS predicted values Ŷ_i and residuals û_i are:

    Ŷ_i = β̂0 + β̂1 X_i

    û_i = Y_i − Ŷ_i

Then we have

    Σ_{i=1}^{n} û_i = 0

    Σ_{i=1}^{n} û_i X_i = 0


Some Algebraic Properties of ū

Equivalently, the residuals have zero sample mean and zero sample
covariance with X_i:

    ū = (1/n) Σ_{i=1}^{n} û_i = 0

    (1/n) Σ_{i=1}^{n} û_i X_i = 0


The Estimated Regression Line

[Figure: the estimated regression line fitted to the sample data]


Measures of Fit: The 𝑅2

Decompose Y_i into the fitted value plus the residual: Y_i = Ŷ_i + û_i
The total sum of squares (TSS): TSS = Σ_{i=1}^{n} (Y_i − Ȳ)²
The explained sum of squares (ESS): ESS = Σ_{i=1}^{n} (Ŷ_i − Ȳ)²
The sum of squared residuals (SSR): SSR = Σ_{i=1}^{n} (Y_i − Ŷ_i)² = Σ_{i=1}^{n} û_i²
And

    TSS = ESS + SSR
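The decomposition TSS = ESS + SSR can be verified numerically. A sketch on simulated data (made-up numbers, not the lecture's dataset), fitting OLS with the closed-form formulas and computing the three sums:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 1.0 + 2.0 * x + rng.normal(size=100)

# OLS fit from the closed-form formulas
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x
u_hat = y - y_hat

tss = np.sum((y - y.mean()) ** 2)
ess = np.sum((y_hat - y.mean()) ** 2)
ssr = np.sum(u_hat ** 2)
r2 = ess / tss

print(tss, ess + ssr)   # equal up to floating-point error
```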


Measures of Fit: The 𝑅2

R², or the coefficient of determination, is the fraction of the sample
variance of Y_i explained (predicted) by X_i:

    R² = ESS/TSS = 1 − SSR/TSS

So 0 ≤ R² ≤ 1.
It may seem that the bigger the R², the better the regression.
But actually we DON'T care much about R² in causal inference.


The Standard Error of the Regression

The standard error of the regression (SER) is an estimator of the
standard deviation of the regression error u_i.
Because the regression errors u_i are unobserved, the SER is
computed using their sample counterparts, the OLS residuals û_i:

    SER = s_û = √(s²_û)

where s²_û = (1/(n − 2)) Σ_{i=1}^{n} û_i²
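A quick sketch of the SER on simulated data: the true error standard deviation is set to 2.0 (an arbitrary choice for illustration), so the SER should come out close to 2.0:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
x = rng.normal(size=n)
y = 3.0 + 1.5 * x + rng.normal(0, 2.0, size=n)   # true error sd = 2.0

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
u_hat = y - (b0 + b1 * x)

# SER divides by n - 2: two degrees of freedom are used up by b0 and b1
ser = np.sqrt(np.sum(u_hat ** 2) / (n - 2))
print(ser)   # close to the true error sd of 2.0
```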


Section 3

The Least Squares Assumptions


Assumptions of the Linear Regression Model

In order to investigate the statistical properties of OLS, we need to
make some statistical assumptions.

Linear Regression Model
The observations (Y_i, X_i) come from a random (i.i.d.) sample and satisfy
the linear regression equation

    Y_i = β0 + β1 X_i + u_i

with E[u_i | X_i] = 0.


Assumption 1: Conditional Mean is Zero

Assumption 1: Zero conditional mean of the errors given X

The error u_i has an expected value of 0 given any value of the independent
variable:

    E[u_i | X_i = x] = 0

A weaker condition is that u_i and X_i are uncorrelated:

    Cov(u_i, X_i) = E[u_i X_i] = 0

If they are correlated, then Assumption 1 is violated.
Equivalently, the population regression line is the conditional mean of
Y_i given X_i:

    E[Y_i | X_i] = β0 + β1 X_i
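What goes wrong when Assumption 1 fails can be seen in a small simulation (a sketch with a deliberately constructed violation, not a claim about any real dataset). Here u is built so that E[u | X] = 0.5 X, and the OLS slope concentrates around β1 + Cov(X, u)/Var(X) = 2.5 rather than the true β1 = 2:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
x = rng.normal(size=n)
u = 0.5 * x + rng.normal(size=n)   # E[u | X] = 0.5 X: Assumption 1 is violated
y = 1.0 + 2.0 * x + u

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
print(b1)   # near 2.5 = beta1 + Cov(X, u)/Var(X), not the true 2.0
```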




Assumption 2: Random Sample

Assumption 2: Random Sample

We have an i.i.d. random sample of size n, {(X_i, Y_i), i = 1, ..., n}, from the
population regression model above.

This is an implication of random sampling.
It generally won't hold for other data structures.
Violations: time series, cluster samples.


Assumption 3: Large outliers are unlikely

Assumption 3: Large outliers are unlikely

It states that observations with values of X_i, Y_i, or both that are far
outside the usual range of the data (outliers) are unlikely. Mathematically,
it assumes that X and Y have nonzero finite fourth moments.

Large outliers can make OLS regression results misleading.
One source of large outliers is data entry errors, such as a
typographical error or incorrectly using different units for different
observations.
Data entry errors aside, the assumption of finite kurtosis is a plausible
one in many applications with economic data.




Underlying assumptions of OLS

The OLS estimator is unbiased, consistent and has an asymptotically
normal sampling distribution if
1 Random sampling.
2 Large outliers are unlikely.
3 The conditional mean of u_i given X_i is zero.


Underlying assumptions of OLS

OLS is an estimator: it's a machine that we plug data into and
get estimates out of.
It has a sampling distribution, with a sampling variance/standard
error, etc., just like the sample mean, the sample difference in means,
or the sample variance.
Let's discuss these characteristics of OLS in the next section.


Section 4

Properties of the OLS Estimators


The OLS estimators

Question of interest: What is the effect of a change in X_i (class size)
on Y_i (test score)?

    Y_i = β0 + β1 X_i + u_i

We derived the OLS estimators of β0 and β1:

    β̂0 = Ȳ − β̂1 X̄

    β̂1 = Σ(X_i − X̄)(Y_i − Ȳ) / Σ(X_i − X̄)(X_i − X̄)


Least Squares Assumptions

1 Assumption 1: the conditional mean of u_i given X_i is zero.
2 Assumption 2: random sampling (i.i.d.).
3 Assumption 3: large outliers are unlikely.
If the 3 least squares assumptions hold, the OLS estimators will be
unbiased
consistent
normally distributed in large samples


Properties of the OLS estimator: unbiasedness

Recall:

    β̂0 = Ȳ − β̂1 X̄

Taking expectations,

    E[β̂0] = E[Ȳ] − E[β̂1 X̄]

Then we have: if β̂1 is unbiased, then β̂0 is also unbiased.


Properties of the OLS estimator: unbiasedness

Recall that we have

    Y_i = β0 + β1 X_i + u_i

    Ȳ = β0 + β1 X̄ + ū

So take the expectation of β̂1:

    E[β̂1] = E[ Σ(X_i − X̄)(Y_i − Ȳ) / Σ(X_i − X̄)(X_i − X̄) ]


Properties of the OLS estimator: unbiasedness

Continued:

    E[β̂1] = E[ Σ(X_i − X̄)(β0 + β1 X_i + u_i − (β0 + β1 X̄ + ū)) / Σ(X_i − X̄)(X_i − X̄) ]

          = E[ Σ(X_i − X̄)(β1 (X_i − X̄) + (u_i − ū)) / Σ(X_i − X̄)(X_i − X̄) ]

          = β1 + E[ Σ(X_i − X̄)(u_i − ū) / Σ(X_i − X̄)(X_i − X̄) ]


Properties of the OLS estimator: unbiasedness

Because Σ(X_i − X̄)(u_i − ū) = Σ(X_i − X̄)u_i, we have

    E[β̂1] = β1 + E[ Σ(X_i − X̄)u_i / Σ(X_i − X̄)(X_i − X̄) ]

          = β1 + E[ Σ(X_i − X̄)E(u_i | X_1, ..., X_n) / Σ(X_i − X̄)(X_i − X̄) ]

by the Law of Iterated Expectations (LIE):

    E(E(Y|X)) = E(Y)   and   E(E(g(X)Y|X)) = E(g(X)Y)


Properties of the OLS estimator: unbiasedness

Then we obtain

    E[β̂1] = β1   if   E[u_i | X_i] = 0

Thus both β̂0 and β̂1 are unbiased under Assumption 1.
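A Monte Carlo sketch of unbiasedness (simulated data with arbitrary parameter choices): across many samples drawn with E[u | X] = 0, the average of β̂1 is close to the true β1.

```python
import numpy as np

rng = np.random.default_rng(4)
beta0, beta1, n = 1.0, 2.0, 50
estimates = []
for _ in range(2000):
    x = rng.normal(size=n)
    y = beta0 + beta1 * x + rng.normal(size=n)   # E[u | X] = 0 holds here
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    estimates.append(b1)

print(np.mean(estimates))   # close to beta1 = 2.0
```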


Properties of the OLS estimator: Consistency

Notation: β̂1 →p β1, or plim β̂1 = β1. So

    plim β̂1 = plim[ Σ(X_i − X̄)(Y_i − Ȳ) / Σ(X_i − X̄)(X_i − X̄) ]

Dividing numerator and denominator by n − 1,

    plim β̂1 = plim[ ((1/(n−1)) Σ(X_i − X̄)(Y_i − Ȳ)) / ((1/(n−1)) Σ(X_i − X̄)(X_i − X̄)) ] = plim( s_xy / s²_x )

where s_xy and s²_x are the sample covariance and sample variance.


Properties of the OLS estimator: Consistency

Continuous Mapping Theorem: for every continuous function g(t)
and random variable X:

    plim(g(X)) = g(plim(X))

Examples:

    plim(X + Y) = plim(X) + plim(Y)

    plim(X/Y) = plim(X)/plim(Y)   if plim(Y) ≠ 0


Properties of the OLS estimator: Consistency

Based on the L.L.N. (law of large numbers) and random sampling (i.i.d.):

    s²_X →p σ²_X = Var(X)

    s_xy →p σ_XY = Cov(X, Y)

Combining these with the CMT, we obtain, as n → ∞,

    plim β̂1 = plim( s_xy / s²_x ) = Cov(X_i, Y_i) / Var(X_i)


Properties of the OLS estimator: Consistency

    plim β̂1 = Cov(X_i, Y_i) / Var(X_i)

            = Cov(X_i, β0 + β1 X_i + u_i) / Var(X_i)

            = (Cov(X_i, β0) + β1 Cov(X_i, X_i) + Cov(X_i, u_i)) / Var(X_i)

            = β1 + Cov(X_i, u_i) / Var(X_i)

Then we obtain

    plim β̂1 = β1   if   E[u_i | X_i] = 0

Both β̂0 and β̂1 are consistent under Assumption 1.



In A Summary: Unbiasedness vs Consistency

Unbiasedness and consistency both rely on E[u_i | X_i] = 0.
Unbiasedness implies that E[β̂1] = β1 for any given sample size
n ("small sample").
Consistency implies that the distribution of β̂1 becomes more and
more tightly concentrated around β1 as the sample size n becomes
larger and larger ("large sample").
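The contrast can be sketched in a simulation (made-up parameters): β̂1 is unbiased at both sample sizes, but its sampling spread shrinks as n grows, which is the consistency story.

```python
import numpy as np

rng = np.random.default_rng(5)
beta1 = 2.0

def sd_of_b1(n, reps=1000):
    """Simulated standard deviation of beta1_hat at sample size n."""
    out = []
    for _ in range(reps):
        x = rng.normal(size=n)
        y = 1.0 + beta1 * x + rng.normal(size=n)
        out.append(np.sum((x - x.mean()) * (y - y.mean()))
                   / np.sum((x - x.mean()) ** 2))
    return float(np.std(out))

sd_small, sd_large = sd_of_b1(25), sd_of_b1(400)
print(sd_small, sd_large)   # the spread at n = 400 is much smaller
```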


Sampling Distribution of 𝛽0̂ and 𝛽1̂

First, let's recall the sampling distribution of Ȳ.

Because Y_1, ..., Y_n are i.i.d., we have

    E(Ȳ) = μ_Y

By the Central Limit Theorem (C.L.T.), the sampling distribution in a
large sample is approximately normal:

    Ȳ ~ N(μ_Y, σ²_Y / n)

The OLS estimators β̂0 and β̂1 have similar sampling distributions
when the three least squares assumptions hold.


Sampling Distribution of 𝛽0̂ and 𝛽1̂

Unbiasedness of the OLS estimators implies that

    E[β̂1] = β1   and   E[β̂0] = β0

By the Central Limit Theorem (C.L.T.), the sampling distribution of β̂
in a large sample is approximately normal:

    β̂0 ~ N(β0, σ²_β̂0)

    β̂1 ~ N(β1, σ²_β̂1)


Sampling Distribution of β̂0 and β̂1 in Large Samples

Recall the sampling distribution of Ȳ: by the Central Limit Theorem
(C.L.T.), in a large sample it is approximately normal,

    Ȳ ~ N(μ_Y, σ²_Y / n)

So by the C.L.T. the sampling distribution of β̂1 in a large sample is
also approximately normal: β̂1 ~ N(β1, σ²_β̂1).

It can be shown that

    σ²_β̂1 = (1/n) · Var[(X_i − μ_x)u_i] / [Var(X_i)]²

    σ²_β̂0 = (1/n) · Var(H_i u_i) / (E[H_i²])²

where H_i = 1 − (μ_X / E[X_i²]) X_i.

Sampling Distribution of 𝛽1̂

Write β̂1 in terms of regressors and errors:

    β̂1 = [(1/n) Σ_{i=1}^{n} (X_i − X̄)(Y_i − Ȳ)] / [(1/n) Σ_{i=1}^{n} (X_i − X̄)(X_i − X̄)]

        = β1 + [(1/n) Σ_{i=1}^{n} (X_i − X̄)(u_i − ū)] / [(1/n) Σ_{i=1}^{n} (X_i − X̄)(X_i − X̄)]


Sampling Distribution of β̂1: the numerator

The numerator: (1/n) Σ_{i=1}^{n} (X_i − X̄)(u_i − ū)

Because X̄ is consistent, X̄ →p μ_x.
Because Σ_{i=1}^{n} (X_i − X̄) = 0, subtracting ū changes nothing:

    Σ_{i=1}^{n} (X_i − X̄)(u_i − ū) = Σ_{i=1}^{n} (X_i − X̄)u_i

Then we have

    (1/n) Σ_{i=1}^{n} (X_i − X̄)(u_i − ū) ≅ (1/n) Σ_{i=1}^{n} (X_i − μ_x)u_i


Sampling Distribution of β̂1: the numerator

Let v_i = (X_i − μ_x)u_i.

Based on Assumption 1, E(v_i) = 0.
Based on Assumption 2, σ²_v = Var[(X_i − μ_x)u_i].

Then

    (1/n) Σ_{i=1}^{n} (X_i − μ_x)u_i = (1/n) Σ_{i=1}^{n} v_i = v̄

so v̄ is the sample mean of the v_i, and by the C.L.T.,

    (v̄ − 0)/σ_v̄ →d N(0, 1)   or   v̄ →d N(0, σ²_v / n)


Sampling Distribution of β̂1: the denominator

The denominator:

    (1/n) Σ_{i=1}^{n} (X_i − X̄)(X_i − X̄)

This is a variant of the sample variance of X (it divides by n
rather than n − 1, which is inconsequential if n is large).
Because the sample variance is a consistent estimator of the
population variance,

    s²_X →p Var(X_i)


Sampling Distribution of 𝛽1̂

β̂1 in terms of regressors and errors:

    β̂1 = β1 + [(1/n) Σ_{i=1}^{n} (X_i − X̄)(u_i − ū)] / [(1/n) Σ_{i=1}^{n} (X_i − X̄)(X_i − X̄)]

Combining the two results above, we have that, in large samples,

    β̂1 − β1 ≅ v̄ / Var[X_i]


Sampling Distribution of 𝛽1̂


Because v̄ is approximately normal in large samples,

    v̄ →d N(0, σ²_v / n)

    ⇒ v̄ / Var[X_i] →d N(0, σ²_v / (n [Var(X_i)]²))

    ⇒ β̂1 − β1 →d N(0, σ²_v / (n [Var(X_i)]²))

Then the sampling distribution of β̂1 is

    β̂1 →d N(β1, σ²_β̂1)

where

    σ²_β̂1 = σ²_v / (n [Var(X_i)]²) = Var[(X_i − μ_x)u_i] / (n [Var(X_i)]²)

Sampling Distribution of β̂1 in Large Samples

We have shown that

    σ²_β̂1 = (1/n) · Var[(X_i − μ_x)u_i] / [Var(X_i)]²

An intuition: the variation of X is very important.
If Var(X_i) is small, it is difficult to obtain an accurate estimate of
the effect of X on Y, which implies that Var(β̂1) is large.
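The variance formula can be compared with a simulation (a sketch; the parameters are arbitrary, and X is drawn independently of u so that Var[(X − μ_x)u] factors into Var(X)·Var(u)):

```python
import numpy as np

rng = np.random.default_rng(6)
n, beta1 = 200, 2.0
sigma_x, sigma_u = 1.5, 1.0

# With X independent of u, Var[(X - mu_x) u] = Var(X) * Var(u),
# so the formula reduces to Var(u) / (n * Var(X))
var_formula = (sigma_x**2 * sigma_u**2) / (n * (sigma_x**2) ** 2)

ests = []
for _ in range(4000):
    x = rng.normal(0.0, sigma_x, size=n)
    y = 1.0 + beta1 * x + rng.normal(0.0, sigma_u, size=n)
    ests.append(np.sum((x - x.mean()) * (y - y.mean()))
                / np.sum((x - x.mean()) ** 2))

print(var_formula, np.var(ests))   # the two should be close
```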


Variation of X

When there is more variation in X, there is more information in the
data that you can use to fit the regression line.


In a Summary

Under the 3 least squares assumptions, the OLS estimators will be
unbiased
consistent
normally distributed in large samples
and more variation in X means more accurate estimation.

