Multicol
Multicol
Multicol
Some ideas:
MC is a problem of exact or near exact linear relationship between the explanatory of feature
variables. Not a problem in the structure of the error terms.
Is this really a problem?
or in deviation(?) form:
It is known that:
σ2 σ2 1
var(β̂2 ) = P = × (5)
x22i (1 − r23
2 ) 2 )
x22i (1 − r23
P
σ2 σ2 1
var(β̂3 ) = P = × (6)
x23i (1 − r232 ) x23i (1 − r23
2 )
P
−r23 σ 2
cov(β̂2 , β̂3 ) = qP (7)
2 ) x22i
P 2
(1 − r23 x3i
ŷ = Xβ + û (8)
−1 0
β̂ = X 0 X Xy (9)
−1
cov(β̂) = σ 2 X 0 X (10)
1
STK 310 (Multicollinearity (MC))
3. Model specification
4. Overdetermined model
1. Although the OLS estimators are BLUE their variances are inflated
Intuition ??? Impact of inflated variances?
σ2
var(βˆj ) = P 2 V IFj (11)
xj
1
V IFj = (12)
1 − Rj2
where
β̂j = the estimated partial regression coefficient of variable Xj
Rj2 = R2 of the regression of Xj on the remaining (k − 2) explanatory variables
P 2
(Xj − X j )2
P
xj =
1
Sometimes we use tolerance: T OLj = V IFj .
3. t-ratios lower
5. OLS estimators and standard errors sensitive to small changes in the data.
2
STK 310 (Multicollinearity (MC))
Detection of MC
4. Auxiliary regressions
7. Plots
Remedial measures
2. Extending/expanding data
3. Removing variables
4. Transformation of variables