As of Sep 16, 2021: Seppo Pynnönen, Econometrics I
Part III: Multiple Regression Analysis

Estimation
  Matrix form
Goodness-of-Fit
  R-square
  Adjusted R-square
Multicollinearity
y = β0 + β1 x1 + · · · + βk xk + u. (1)
Because
∂y/∂xj = βj   (2)
j = 1, . . . , k, the coefficient βj measures the marginal effect of variable xj: the amount y is expected to change when xj increases by one unit while the other variables are held constant (ceteris paribus).
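To make the ceteris paribus interpretation concrete, here is a minimal illustrative sketch (not part of the lecture) of fitting such a model by OLS in Python; the simulated data, coefficient values, and the use of the statsmodels package are assumptions for illustration only:

import numpy as np
import statsmodels.api as sm

# Hypothetical simulated data: y = 1 + 2*x1 - 0.5*x2 + u
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1 + 2 * x1 - 0.5 * x2 + rng.normal(size=n)

X = sm.add_constant(np.column_stack([x1, x2]))  # intercept column plus the regressors
res = sm.OLS(y, X).fit()
print(res.params)  # each slope estimates the ceteris paribus (marginal) effect of its regressor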
Example 1
Consider the consumption function C = f(Y), where Y is income. Suppose the assumption is that as income grows the marginal propensity to consume decreases. Modeling the marginal propensity as

β1 = β1l + β1q Y,

and substituting it into C = β0 + β1 Y + u gives the quadratic specification

C = β0 + β1l Y + β1q Y² + u,

where a decreasing marginal propensity to consume corresponds to β1q < 0.
Example 1 continues . . .
This simple example demonstrates that we can meaningfully enrich simple regression analysis (even though we have essentially only two variables, C and Y) and at the same time obtain a meaningful interpretation of the above polynomial model.

Technically, considering the simple regression

C = β0 + β1 Y + v,

the extension

C = β0 + β1l Y + β1q Y² + u

means that we have extracted the quadratic term Y² out of the error term v of the simple regression.
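A similar minimal sketch (hypothetical simulated income data; statsmodels assumed) of estimating the quadratic specification by simply adding Y² as a second regressor:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
Y = rng.uniform(10, 100, size=300)                       # income (hypothetical units)
C = 5 + 0.9 * Y - 0.003 * Y**2 + rng.normal(scale=2, size=300)

X = sm.add_constant(np.column_stack([Y, Y**2]))          # regressors: Y and Y^2
res = sm.OLS(C, X).fit()
print(res.params)  # a negative coefficient on Y^2 is consistent with a decreasing marginal propensity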
Estimation
The key assumption is again that the error term has zero conditional mean given the explanatory variables:

E[u|x1, . . . , xk] = 0.   (3)
Matrix form
Collect the sample values of the dependent variable into the vector y = (y1, . . . , yn)′.   (5)
Then we can present the whole set of regression equations for the
sample
y1 = β0 + β1 x11 + · · · + βk x1k + u1
y2 = β0 + β1 x21 + · · · + βk x2k + u2
⋮                                       (7)
yn = β0 + β1 xn1 + · · · + βk xnk + un

in the matrix form as

\[
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}
=
\begin{pmatrix}
1 & x_{11} & x_{12} & \cdots & x_{1k} \\
1 & x_{21} & x_{22} & \cdots & x_{2k} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{n1} & x_{n2} & \cdots & x_{nk}
\end{pmatrix}
\begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{pmatrix}
+
\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}
\qquad (8)
\]
or shortly
y = Xβ + u, (9)
where
β = (β0, β1, . . . , βk)′.
The OLS estimator β̂ solves the normal equations

X′X β̂ = X′y.   (10)
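A minimal numerical sketch of solving the normal equations (10) directly (simulated data and hypothetical coefficient values; NumPy assumed):

import numpy as np

rng = np.random.default_rng(2)
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])   # constant plus k regressors
beta = np.array([1.0, 2.0, -1.0, 0.5])                        # hypothetical true coefficients
y = X @ beta + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)                  # solves X'X b = X'y
print(beta_hat)
print(np.linalg.lstsq(X, y, rcond=None)[0])                   # same solution via least squares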
The fitted values are ŷi = β̂0 + β̂1 xi1 + · · · + β̂k xik, i = 1, . . . , n, and the residual for observation i is again defined as ûi = yi − ŷi.

Again the following hold:

ȳ = ŷ¯   (14)

and

ȳ = β̂0 + β̂1 x̄1 + · · · + β̂k x̄k,   (15)

where ȳ, ŷ¯, and x̄j, j = 1, . . . , k, are the sample averages of the variables.
If in the sample the average high school GPA is 3.4 and the average ACT
is 24.2, what is the average college GPA in the sample?
Remark 1
For example, if you fit the simple regression ỹ = β̃0 + β̃1 x1, where β̃0 and β̃1 are the OLS estimators, and fit a multiple regression ŷ = β̂0 + β̂1 x1 + β̂2 x2, then generally β̃1 ≠ β̂1 unless β̂2 = 0, or x1 and x2 are uncorrelated.
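A small simulated illustration of Remark 1 (hypothetical data in which x1 and x2 are correlated; statsmodels assumed):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 500
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)               # x2 correlated with x1
y = 1 + 1.0 * x1 + 2.0 * x2 + rng.normal(size=n)

simple = sm.OLS(y, sm.add_constant(x1)).fit()
multiple = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
print(simple.params[1])    # beta_tilde_1: absorbs part of x2's effect
print(multiple.params[1])  # beta_hat_1: close to the true value 1.0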
Example 2
Consider the hourly wage example and enhance the model as

log(w) = β0 + β1 x1 + β2 x2 + β3 x3,   (16)
Goodness-of-Fit
The total variation of y again decomposes into explained and residual parts,

Σ(yi − ȳ)² = Σ(ŷi − ȳ)² + Σ ûi²,

or

SST = SSE + SSR,   (18)

where

ŷi = β̂0 + β̂1 xi1 + · · · + β̂k xik.   (19)
R-square
R² = SSE / SST = 1 − SSR / SST.   (20)
Again, as in the case of the simple regression, R² can be shown to be the squared correlation coefficient between the actual yi and the fitted ŷi. This correlation is called the multiple correlation. (Recall that ŷ¯ = ȳ.)
Remark 2
R² never decreases when an explanatory variable is added to the model!
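A quick simulated check of Remark 2 (hypothetical data; the added regressor is pure noise; statsmodels assumed):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 200
x1 = rng.normal(size=n)
y = 1 + 2 * x1 + rng.normal(size=n)
noise = rng.normal(size=n)                        # an irrelevant regressor

r2_small = sm.OLS(y, sm.add_constant(x1)).fit().rsquared
r2_big = sm.OLS(y, sm.add_constant(np.column_stack([x1, noise]))).fit().rsquared
print(r2_small, r2_big)                           # r2_big >= r2_small, even though 'noise' is irrelevant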
Adjusted R-square
R̄² = 1 − su² / sy²,   (22)

where

su² = Σ(yi − ŷi)² / (n − k − 1) = Σ ûi² / (n − k − 1)   (23)

and sy² = Σ(yi − ȳ)² / (n − 1) is the sample variance of y.
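A minimal sketch of computing R̄² from (22)–(23) and checking it against a library value (hypothetical simulated data; statsmodels assumed):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n, k = 120, 2
X = rng.normal(size=(n, k))
y = 1 + X @ np.array([0.5, -0.3]) + rng.normal(size=n)

res = sm.OLS(y, sm.add_constant(X)).fit()
s2_u = np.sum(res.resid**2) / (n - k - 1)      # equation (23)
s2_y = np.sum((y - y.mean())**2) / (n - 1)     # sample variance of y
print(1 - s2_u / s2_y, res.rsquared_adj)       # the two values coincide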
Solving the normal equations (10) gives the OLS estimator β̂ = (X′X)⁻¹X′y, and substituting y = Xβ + u,

β̂ = (X′X)⁻¹X′y
  = (X′X)⁻¹X′(Xβ + u)
  = (X′X)⁻¹X′Xβ + (X′X)⁻¹X′u   (25)
  = β + (X′X)⁻¹X′u.
Remark 3
If z = (z1, . . . , zk)′ is a random vector, then E[z] = (E[z1], . . . , E[zk])′. That is, the expected value of a vector is the vector whose components are the individual expected values.
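Combining (25) with Remark 3 and the zero conditional mean assumption (equation (3), in matrix form E[u|X] = 0) gives the unbiasedness of OLS; written out as a one-line derivation:

\[
\mathrm{E}\bigl[\hat{\beta}\mid X\bigr]
  \;=\; \beta + (X'X)^{-1}X'\,\mathrm{E}[u\mid X]
  \;=\; \beta .
\]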
Consider the two models

y = β0 + β1 x1 + u   (27)

and

y = β0 + β1 x1 + β2 x2 + u.   (28)
Suppose the true model is

y = β0 + β1 x1 + β2 x2 + u,   (30)

but the simple regression

y = β0 + β1 x1 + v   (31)

is fitted. The OLS slope of the simple regression can be written as β̃1 = Σ ai yi, where

ai = (xi1 − x̄1) / Σ(xi1 − x̄1)².   (34)

Taking expectations under the true model (30),

E[β̃1] = β1 + β2 [Σ(xi1 − x̄1)xi2] / [Σ(xi1 − x̄1)²],   (36)

implying that β̃1 is biased for β1 unless x1 and x2 are uncorrelated (or β2 = 0). This is called the omitted variable bias.
Remark 4
The omitted variable bias is in fact due to the error term of the misspecified model being correlated with the explanatory variable. That is, if the true model is y = β0 + β1 x1 + β2 x2 + u but we estimate the model y = β0 + β1 x1 + v (so that v = β2 x2 + u), then E[v|x1] = E[β2 x2 + u|x1] = β2 E[x2|x1] + E[u|x1] = β2 E[x2|x1] ≠ 0 if x2 and x1 are correlated, i.e., the crucial assumption E[v|x1] = 0 is violated in the misspecified model. (Note: we assume that in the true model E[u|x1, x2] = 0, which also implies E[u|x1] = 0.)
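A small simulation sketch of the omitted variable bias and of the sample analogue of (36) (hypothetical data; statsmodels assumed):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 1000
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)           # correlated with x1, omitted below
y = 1 + 1.0 * x1 + 2.0 * x2 + rng.normal(size=n)

short = sm.OLS(y, sm.add_constant(x1)).fit()                          # omits x2
long = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()    # includes x2

# Sample analogue of the bias term in (36)
bias = long.params[2] * np.sum((x1 - x1.mean()) * x2) / np.sum((x1 - x1.mean())**2)
print(short.params[1], long.params[1] + bias)   # equal up to floating point error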
What are some factors contained in u? Do you think the key assumption E[u|prbconv, avgsen] = 0 is likely to hold?
Consider again the model in matrix form, y = Xβ + u.   (38)
Multicollinearity
Perfect collinearity is ruled out: the explanatory variables

x1, x2, . . . , xk

must be linearly independent, i.e.,

a1 x1 + · · · + ak xk = 0

holds only if

a1 = · · · = ak = 0.
From the variance equation (41), var[β̂j] = σu² / [(1 − Rj²) Σ(xij − x̄j)²], where Rj² is the R-square from regressing xj on the other explanatory variables, we see that var[β̂j] → ∞ as Rj² → 1.

That is, the more the explanatory variables are linearly dependent, the larger the variance becomes.

This implies that the coefficient estimates become increasingly unstable.

High (but not perfect) correlation between two or more explanatory variables is called multicollinearity.
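A minimal sketch of the factor 1/(1 − Rj²) by which near-collinearity inflates var[β̂j] (often called the variance inflation factor); hypothetical data with two nearly collinear regressors, statsmodels assumed:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 300
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)        # nearly collinear with x1
x3 = rng.normal(size=n)                     # unrelated to x1 and x2

X = np.column_stack([x1, x2, x3])
for j in range(X.shape[1]):
    others = np.delete(X, j, axis=1)
    r2_j = sm.OLS(X[:, j], sm.add_constant(others)).fit().rsquared   # R_j^2
    print(f"x{j + 1}: R_j^2 = {r2_j:.3f}, 1/(1 - R_j^2) = {1 / (1 - r2_j):.1f}")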
Symptoms of multicollinearity:
1 High correlations between explanatory variables.
2 R² is relatively high, but the coefficient estimates tend to be insignificant (see the section on hypothesis testing).
3 Some coefficients have the wrong sign while others are at the same time unreasonably large.
4 Coefficient estimates change markedly from one model alternative to another.
Example 3
Variable Et denotes the expenditure costs of a sample of Toyota Mark II cars at time point t, Mt the mileage, and At the age.
Model A: Et = α0 + α1 At + u1t
Model B: Et = β0 + β1 Mt + u2t
Model C: Et = γ0 + γ1 Mt + γ2 At + u3t
Example 3 continues . . .
Estimation results: (t-values in parentheses)
Findings:
A priori, the coefficients α1, β1, γ1, and γ2 should be positive. However, γ̂2 = −151.15 (!!?), while β̂1 = 53.45. The correlation rA,M = 0.996!
Remedies:
In the collinearity problem the issue is that there is not enough information in the sample to reliably identify each variable's contribution as an explanatory variable in the model.
Thus, in order to alleviate the problem:
1 Use non-sample information, if available, to impose restrictions between coefficients.
2 Increase the sample size if possible.
3 Drop the most collinear variables (on the basis of Rj²).
4 If a linear combination (usually a sum) of the most collinear variables is meaningful, replace the collinear variables by the linear combination.
Remark 5
Multicollinearity is not always harmful.
If an explanatory variable is not correlated with the rest of the explanatory variables, multicollinearity among those other variables (provided that it is not perfect) does not harm the variance of the slope coefficient estimator of the uncorrelated explanatory variable.
If xj is not correlated with any of the other predictors, Rj² = 0 and hence, as is seen from equation (41), the factor (1 − Rj²) drops out of its slope coefficient estimator's variance var[β̂j].
(i) Does it make sense to hold sleep, work, and leisure fixed while changing study?
(ii) Explain why this model violates Assumption 4.
(iii) How would you reformulate the model so that its parameters have a useful interpretation and it satisfies Assumption 4?
Consider the model

y = β0 + β1 x1 + β2 x2 + u,   (42)

and the fitted regressions

ŷ = β̂0 + β̂1 x1 + β̂2 x2   (43)

and

ỹ = β̃0 + β̃1 x1.   (44)

Then

var[β̂1] = σu² / [(1 − r12²) Σ(xi1 − x̄1)²]   (45)

and

var[β̃1] = σu² / Σ(xi1 − x̄1)²,   (46)

where r12 is the sample correlation between x1 and x2.
Thus var[β̃1] ≤ var[β̂1], and the inequality is strict if r12 ≠ 0.

In summary (assuming r12 ≠ 0):

1 If β2 ≠ 0, then β̃1 is biased, β̂1 is unbiased, and var[β̃1] < var[β̂1].
2 If β2 = 0, then both β̃1 and β̂1 are unbiased, but var[β̃1] < var[β̂1].
The error variance σu² is estimated by

σ̂u² = Σ ûi² / (n − k − 1),   (47)

where

ûi = yi − β̂0 − β̂1 xi1 − · · · − β̂k xik.   (48)

The term n − k − 1 in (47) is the degrees of freedom (df).

It can be shown that

E[σ̂u²] = σu²,   (49)

i.e., σ̂u² is an unbiased estimator of σu².

σ̂u is called the standard error of the regression.
Substituting σu by its estimate σ̂u = √(σ̂u²) gives the standard error of β̂j:

se(β̂j) = σ̂u / √[(1 − Rj²) Σ(xij − x̄j)²].   (51)
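A minimal sketch (hypothetical simulated data; statsmodels assumed) computing σ̂u² and se(β̂j) from (47) and (51) and comparing with the library's reported standard error:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
n, k = 200, 2
X = rng.normal(size=(n, k))
y = 1 + X @ np.array([1.5, -0.7]) + rng.normal(size=n)

res = sm.OLS(y, sm.add_constant(X)).fit()
sigma2_hat = np.sum(res.resid**2) / (n - k - 1)     # equation (47)

j = 0                                               # look at the first slope coefficient
xj = X[:, j]
r2_j = sm.OLS(xj, sm.add_constant(np.delete(X, j, axis=1))).fit().rsquared
se_j = np.sqrt(sigma2_hat) / np.sqrt((1 - r2_j) * np.sum((xj - xj.mean())**2))  # equation (51)
print(se_j, res.bse[j + 1])                          # matches the library's standard error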
Theorem 2 (Gauss-Markov)
Under the classical assumptions 1–5, β̂0, β̂1, . . . , β̂k are the best linear unbiased estimators (BLUEs) of β0, β1, . . . , βk, respectively.

BLUE:
Best: the variance of the OLS estimator is the smallest among all linear unbiased estimators of βj.
Linear: β̂j = Σ_{i=1}^{n} wij yi.
Unbiased: E[β̂j] = βj.
(i) Show that β̃1 is linear and unbiased. (Remember: because E[u|x] = 0, you can treat both xi and zi as nonrandom in your derivation.)

(ii) Show that

var[β̃1] = σ² Σ_{i=1}^{n} (zi − z̄)² / [Σ_{i=1}^{n} (zi − z̄)zi]².

(iii) Show that under the Gauss-Markov assumptions var[β̂1] ≤ var[β̃1], where β̂1 is the OLS estimator. (Hint: Cauchy-Schwarz inequality: (Σ ai bi)² ≤ (Σ ai²)(Σ bi²).)
P P