Econometrics Assignment 3
Question 1:
Answer
When a set of dummy variables spans all observations, the constant term has to be excluded. If a
constant term is included in the regression, one of the dummy variables must be excluded from
the regression model; the omitted category becomes the base category against which the others are
assessed. If all the dummy variables are included alongside the constant, their sum equals 1 for
every observation, resulting in perfect multicollinearity.
a) The researcher is making the mistake of introducing an extra dummy variable that is a
perfect linear function of the other dummy terms. In other words, he has fallen into the
dummy variable trap and hence has encountered the problem of perfect multicollinearity.
The appropriate course of action is to drop one of the dummy variables: either weekday
or weekend.
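The trap can be seen directly in the rank of the design matrix. A minimal numerical sketch (hypothetical weekday/weekend data, not from the assignment dataset):

```python
import numpy as np

# Hypothetical example: 5 observations, a dummy pair that covers
# every observation (weekday vs weekend).
weekday = np.array([1, 1, 1, 0, 0])
weekend = 1 - weekday                    # the two dummies sum to 1
const = np.ones(5)

# Constant AND both dummies -> perfect multicollinearity:
# const = weekday + weekend, so the columns are linearly dependent.
X_trap = np.column_stack([const, weekday, weekend])
print(np.linalg.matrix_rank(X_trap))     # rank 2, but 3 columns

# Dropping one dummy (the base category) restores full column rank.
X_ok = np.column_stack([const, weekday])
print(np.linalg.matrix_rank(X_ok))       # rank 2 = number of columns
```

Because X_trap is rank-deficient, (X'X) is singular and OLS has no unique solution.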
Question 2:
Answer
a. Breusch-Pagan Test
1. Run the original regression.
2. Collect the residuals.
3. Run the following regression:
û² = d0 + d1x1 + … + dkxk
4. Test
H0: d1 = d2 = … = dk = 0
H1: at least one dj ≠ 0
If we fail to reject the null hypothesis, we conclude there is no evidence of
heteroskedasticity.
If we reject the null hypothesis, there is a heteroskedasticity problem.
5. Compute F = [R²/k] / [(1 – R²)/(n – k – 1)] from the auxiliary regression and compare with the
critical F-value with degrees of freedom (k, n – k – 1).
b. White Test
1. Run the original regression
2. Collect the residuals
3. Run the following auxiliary regression (assuming only 3 regressors):
û² = d0 + d1x1 + d2x2 + d3x3 + d4x1² + d5x2² + d6x3² + d7x1x2 + d8x2x3 + d9x1x3
4. Test
H0: d1 = d2 = … = d9 = 0
H1: at least one dj ≠ 0
If we fail to reject the null hypothesis, we conclude there is no evidence of
heteroskedasticity.
If we reject the null hypothesis, there is a heteroskedasticity problem.
5. Compute F = [R²/k] / [(1 – R²)/(n – k – 1)] from the auxiliary regression and compare with the
critical F-value with degrees of freedom (k, n – k – 1).
Question 3:
Answer
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.312
R Square 0.098
Adjusted R Square 0.029
Standard Error 8.209
Observations 100.000
ANOVA
             df       SS        MS      F      Significance F
Regression    7.000   670.063   95.723  1.421  0.207
Residual     92.000  6199.604   67.387
Total        99.000  6869.667

              Coefficients  Standard Error  t Stat  P-value
Intercept            3.080          18.110   0.170    0.865
D1(J)               -3.881           3.208  -1.210    0.229
D2(P)                0.496           3.152   0.157    0.875
D3(A)                0.512           3.215   0.159    0.874
D4(G)               -0.064           3.322  -0.019    0.985
D5(V)                4.150           3.403   1.219    0.226
Popn                -0.006           0.005  -1.171    0.245
City Per Cap         0.001           0.003   0.495    0.622
Looking at the results, none of the variables is significant (all p-values are > 0.05), and
F = 1.421 < 2.11, the critical F-value with degrees of freedom (7, 92). So we fail to
reject the null hypothesis, which means there is no heteroskedasticity problem.
a) Answer
White's test becomes impractical when the number of regressors is large (more than about 3),
because the auxiliary regression must include all squares and cross-products. With 7 regressors
in this model, the test is not feasible here.
b) Answer
According to the BP Test, there is no heteroskedasticity problem in this dataset.
c) Answer
If heteroskedasticity were present and caused by a dummy variable, say D4, the remedy is to
deflate every variable by the estimated error standard deviation of its D4 group and rerun the
regression on the transformed (^) variables:
Int Rate^ = c0 + c1*D1^ + c2*D2^ + c3*D3^ + c4*D4^ + c5*D5^ + c6*Popn^ + c7*City Per Capita^
If instead the heteroskedasticity is caused by a non-dummy variable, say Popn, then divide each
variable (including the constant) by √(Popn i).
Question 4:
a) Answer
Int Rate = b0 + b1*D1 + b2*D2 + b3*D3 + b4*D4 + b5*D5 + b6*Popn + b7*City Per Capita + b8*D1*CPC + b9*D2*CPC + b10*D3*CPC + b11*D4*CPC + b12*D5*CPC
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.409
R Square 0.168
Adjusted R Square 0.053
Standard Error 8.107
Observations 100.000
ANOVA
             df       SS         MS      F      Significance F
Regression   12.000  1151.857   95.988  1.461  0.155
Residual     87.000  5717.810   65.722
Total        99.000  6869.667

              Coefficients  Standard Error  t Stat  P-value
Intercept          129.722          86.941   1.492    0.139
D1(J)             -146.026          94.538  -1.545    0.126
D2(P)             -174.977          93.584  -1.870    0.065
D3(A)             -148.481          94.372  -1.573    0.119
D4(G)              -83.979          98.186  -0.855    0.395
D5(V)              -71.871          95.966  -0.749    0.456
Popn                -0.006           0.005  -1.130    0.262
City Per Cap        -0.019           0.014  -1.360    0.177
D1(J)*CPC            0.023           0.015   1.504    0.136
D2(P)*CPC            0.028           0.015   1.882    0.063
D3(A)*CPC            0.024           0.015   1.580    0.118
D4(G)*CPC            0.013           0.016   0.844    0.401
D5(V)*CPC            0.012           0.015   0.791    0.431
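A regression with dummy-by-CPC interaction terms like the one above can be built with a formula interface; patsy's C() generates the dummies with one base category and ":" adds the interactions. The data below is a simulated stand-in, not the assignment dataset:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 100
df = pd.DataFrame({
    "IntRate": rng.normal(10, 3, n),
    "Popn": rng.uniform(1e4, 1e6, n),
    "CPC": rng.uniform(5e3, 5e4, n),
    # Six bank categories, repeated so every level appears in the sample.
    "Bank": np.tile(["J", "P", "A", "G", "V", "Base"], 17)[:n],
})

# C(Bank) -> 5 dummies (one level dropped as the base category);
# C(Bank):CPC -> the 5 dummy-by-CPC interaction terms Dj*CPC.
model = smf.ols("IntRate ~ C(Bank) + Popn + CPC + C(Bank):CPC", data=df).fit()
print(model.params.index.tolist())
```

This yields 13 parameters (intercept + 5 dummies + Popn + CPC + 5 interactions), matching the 12 regression degrees of freedom in the table.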
Question 5:
Answer
a) STEPS:
1) Run the normal regression.
2) Collect all the error terms.
3) Arrange your data and regress the residuals on their first lag (ût on ût-1).
4) Define D = Σ(ût − ût-1)² / Σût² = 1804.55 / 742.52
= 2.43
5) D ≈ 2(1 − ρ̂) = 2(1 − (−0.228)) = 2.46
6) Looking at the DW tables, we find dL and dU for n = 100 and k = 7:
dL = 1.40 and dU = 1.693. Since D = 2.43 exceeds 4 − dU = 2.307 (though it lies below
4 − dL = 2.60, so the tabulated test is strictly inconclusive), this, together with
ρ̂ = −0.228 < 0, points to negative autocorrelation.
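The statistic in step 4 can be computed directly; statsmodels implements exactly D = Σ(ût − ût-1)² / Σût² (residuals here are simulated white noise for illustration):

```python
import numpy as np
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(4)
u = rng.normal(size=100)          # hypothetical residual series

d = durbin_watson(u)
print(round(d, 2))

# Manual check against the formula in step 4 above.
d_manual = np.sum(np.diff(u) ** 2) / np.sum(u ** 2)
assert abs(d - d_manual) < 1e-12
```

Values near 2 indicate no autocorrelation; D > 2 suggests negative, D < 2 positive autocorrelation.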
b) For testing the presence of an AR(2) term, the following steps are followed:
STEPS:
1) Run the normal regression.
2) Collect all the error terms.
3) Arrange your data and regress the residuals on their first two lags (ût on ût-1 and ût-2).
4) Perform an F-test with
H0: ρ1 = ρ2 = 0 (to test whether ρ1 and ρ2 are jointly zero).
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.226
R Square            0.051
Adjusted R Square   0.041
Standard Error      2.681
Observations        99

ANOVA
             df   SS       MS      F
Regression    1   38.026   38.026  5.290
Residual     98  704.498    7.189
Total        99  742.524
ρ = - 0.228