Answers For Homework #2: 1 Theoretical Exercises
Zheng Tian
1 Theoretical Exercises
a. Substituting Height = (70, 65, 74) inches into the equation, the predicted weights
are (176.39, 156.69, 192.15) pounds, respectively.
b. The predicted change in weight is 3.94 × ∆Height = 3.94 × 1.5 = 5.91 pounds.
c. Let’s consider the general case. Suppose the original estimated
regression model is
\[ Y_i = \hat\beta_0 + \hat\beta_1 X_i + \hat u_i . \]
Now we have new data in different units, such that xi = aXi and yi = bYi. It is
easy to see that
\[ \bar x = a\bar X, \quad \bar y = b\bar Y, \quad \sum_i (x_i - \bar x) = a \sum_i (X_i - \bar X), \quad \text{and} \quad \sum_i (y_i - \bar y) = b \sum_i (Y_i - \bar Y). \]
Let β̃0 and β̃1 be the estimated coefficients and ũi the residuals in the new
regression equation,
\[ y_i = \tilde\beta_0 + \tilde\beta_1 x_i + \tilde u_i . \]
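Then, we have
\[
\begin{aligned}
\tilde\beta_1 &= \frac{\sum_i (x_i - \bar x)(y_i - \bar y)}{\sum_i (x_i - \bar x)^2} = \frac{ab \sum_i (X_i - \bar X)(Y_i - \bar Y)}{a^2 \sum_i (X_i - \bar X)^2} = \frac{b}{a}\hat\beta_1 \\
\tilde\beta_0 &= \bar y - \tilde\beta_1 \bar x = b\bar Y - \frac{b}{a}\hat\beta_1 (a\bar X) = b\hat\beta_0 \\
\tilde u_i &= y_i - \hat y_i = b(Y_i - \hat Y_i) = b\hat u_i \\
\tilde R^2 &= \frac{ESS}{TSS} = 1 - \frac{SSR}{TSS} = 1 - \frac{\sum_i \tilde u_i^2}{\sum_i (y_i - \bar y)^2} = 1 - \frac{b^2 \sum_i \hat u_i^2}{b^2 \sum_i (Y_i - \bar Y)^2} = R^2 \\
\widetilde{SER} &= \sqrt{\frac{1}{n-2}\sum_i \tilde u_i^2} = \sqrt{\frac{b^2}{n-2}\sum_i \hat u_i^2} = b \, SER
\end{aligned}
\]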
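Now let’s go back to the specific question. We know that 1 inch = 2.54 cm and
1 pound = 0.4536 kg, so that Weight_new = 0.4536 × Weight and Height_new =
2.54 × Height. Thus, using the results above with a = 2.54 and b = 0.4536, we obtain
a slope of (0.4536/2.54) × 3.94 ≈ 0.7036 kg per cm and an intercept of
0.4536 × (−99.41) ≈ −45.09 kg, while R² is unchanged and the SER is multiplied by 0.4536.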
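A numerical check of the scaling results in part (c) may help. The sketch below uses simulated heights and weights (made-up data, not the textbook sample), with a = 2.54 and b = 0.4536. The variable names, the sample size, and the noise level are illustrative assumptions.

```r
# Simulated data for illustration only (not the textbook sample)
set.seed(1)
n <- 200
height_in <- rnorm(n, mean = 68, sd = 3)                     # inches
weight_lb <- -99.41 + 3.94 * height_in + rnorm(n, sd = 10)   # pounds

a <- 2.54     # inches -> centimetres
b <- 0.4536   # pounds -> kilograms

# Regression in the original units and in the rescaled units
fit_old <- lm(weight_lb ~ height_in)
height_cm <- a * height_in
weight_kg <- b * weight_lb
fit_new <- lm(weight_kg ~ height_cm)

b_old <- unname(coef(fit_old))   # (intercept, slope) in old units
b_new <- unname(coef(fit_new))   # (intercept, slope) in new units

# Compare each statistic with the rule derived above
slope_ok     <- isTRUE(all.equal(b_new[2], (b / a) * b_old[2]))
intercept_ok <- isTRUE(all.equal(b_new[1], b * b_old[1]))
r2_ok  <- isTRUE(all.equal(summary(fit_new)$r.squared, summary(fit_old)$r.squared))
ser_ok <- isTRUE(all.equal(summary(fit_new)$sigma, b * summary(fit_old)$sigma))
print(c(slope = slope_ok, intercept = intercept_ok, r2 = r2_ok, ser = ser_ok))
```

Here summary(fit)$sigma is the residual standard error with n − 2 degrees of freedom, which corresponds to the SER.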
\[ \widehat{AWE} = 696.7 + 9.6 \times Age, \qquad R^2 = 0.023, \qquad SER = 624.1 \]
a. The coefficient 9.6 is the marginal effect of Age on AWE; that is, AWE is
expected to increase by $9.6 per week for each additional year of age. The intercept,
696.7, determines the overall level of the regression line.
b. SER is in the same units as the dependent variable (Y, or AWE in this example).
Thus SER is measured in dollars per week.
c. R² is unit free.
d. Plugging 25 and 45 into the regression equation,
• 696.7 + 9.6 × 25 = 936.7
• 696.7 + 9.6 × 45 = 1128.7
e. No. The oldest worker in the sample is 65 years old. 99 years is far outside the
range of the sample data.
f. No. The distribution of earnings is positively skewed and has kurtosis larger than
that of the normal distribution.
g. Ȳ = β̂0 + β̂1 X̄. Thus, the sample mean of AWE is 696.7 + 9.6 × 41.6 = 1096.06.
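The arithmetic in parts (d) and (g) can be reproduced with a few lines of R. This is a small sketch using the coefficients reported in the estimated regression; the function name predict_awe is a hypothetical helper, not part of the original answers.

```r
# Coefficients as reported in the estimated regression
b0 <- 696.7   # intercept, dollars per week
b1 <- 9.6     # slope, dollars per week per year of age

# Hypothetical helper: predicted AWE at a given age
predict_awe <- function(age) b0 + b1 * age

pred_25  <- predict_awe(25)     # part d
pred_45  <- predict_awe(45)     # part d
mean_awe <- predict_awe(41.6)   # part g: the OLS line passes through (mean Age, mean AWE)
print(c(pred_25 = pred_25, pred_45 = pred_45, mean_awe = mean_awe))
```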
4.5 a. ui represents factors other than time that influence the student’s performance on
the exam, including the amount of time spent studying, aptitude for the material, and
so forth. Some students will have studied more than average, others less; some
students will have higher than average aptitude for the subject, others lower,
and so forth.
b. Because of random assignment, ui is independent of Xi. Since ui represents
deviations from average, E(ui) = 0. Because ui and Xi are independent,
E(ui|Xi) = E(ui) = 0.
c. Assumption #2 is satisfied if this year’s class is typical of other classes, that is,
students in this year’s class can be viewed as random draws from the population
of students that enroll in the class. Assumption #3 is satisfied because both
X and Y are bounded.
d. • 70.6 for 90 minutes; 77.8 for 120 minutes; 85.0 for 150 minutes
• 2.4 points for 10 more minutes.
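The estimated regression line is not restated in these answers, but it can be backed out from the 120-minute and 150-minute predictions in part (d). The sketch below does that under the assumption that both predictions lie on the same straight line; the variable names are illustrative.

```r
# Two predictions taken from part (d)
minutes <- c(120, 150)
score   <- c(77.8, 85.0)

# Slope and intercept implied by two points on a straight line
slope     <- diff(score) / diff(minutes)      # points per extra minute
intercept <- score[1] - slope * minutes[1]

# Predicted gain from 10 additional minutes
ten_minute_gain <- 10 * slope
print(c(slope = slope, intercept = intercept, ten_minute_gain = ten_minute_gain))
```

The implied gain for 10 additional minutes agrees with the second bullet of part (d).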
4.10 a. Assumption #1 is satisfied since, whatever value Xi takes, we always have
E(ui|Xi) = 0. Assumption #2 is satisfied because (ui, Xi) is i.i.d. and Yi is a function of Xi
and ui. Xi is bounded and so has a finite fourth moment; the fourth moment is
non-zero because Pr(Xi = 0) and Pr(Xi = 1) are both non-zero, so that Xi has
finite, non-zero kurtosis. Following calculations like those in Exercise 2.13, ui also
has a non-zero finite fourth moment.
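b. var(Xi) = 0.2 × (1 − 0.2) = 0.16 and µX = 0.2. Also,
\[
\begin{aligned}
\mathrm{var}((X_i - \mu_X)u_i) &= E\left[((X_i - \mu_X)u_i)^2\right] = E\left[ E\left( ((X_i - \mu_X)u_i)^2 \mid X_i \right) \right] \\
&= E\left[((X_i - \mu_X)u_i)^2 \mid X_i = 0\right] \cdot \Pr(X_i = 0) + E\left[((X_i - \mu_X)u_i)^2 \mid X_i = 1\right] \cdot \Pr(X_i = 1) \\
&= E\left((0 - 0.2)^2 u_i^2 \mid X_i = 0\right) \times 0.8 + E\left((1 - 0.2)^2 u_i^2 \mid X_i = 1\right) \times 0.2 \\
&= 0.2^2 \times 1 \times 0.8 + 0.8^2 \times 4 \times 0.2 \\
&= 0.544
\end{aligned}
\]
Therefore,
\[ \sigma^2_{\hat\beta_1} = \frac{1}{n} \frac{\mathrm{var}((X_i - \mu_X)u_i)}{[\mathrm{var}(X_i)]^2} = \frac{1}{n} \frac{0.544}{0.16^2} = \frac{1}{n}\, 21.25 \]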
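The numbers in part (b) can be reproduced step by step. The sketch below uses the probabilities and conditional second moments that appear in the derivation (Pr(Xi = 1) = 0.2, E(ui² | Xi = 0) = 1, E(ui² | Xi = 1) = 4); the variable names are illustrative.

```r
p1 <- 0.2              # Pr(X = 1)
p0 <- 1 - p1           # Pr(X = 0)
mu_x  <- p1            # mean of a Bernoulli variable
var_x <- p1 * (1 - p1) # variance of a Bernoulli variable

# Conditional second moments of u used in the derivation
e_u2_x0 <- 1           # E(u^2 | X = 0)
e_u2_x1 <- 4           # E(u^2 | X = 1)

# Law of iterated expectations over the two values of X
var_xu <- (0 - mu_x)^2 * e_u2_x0 * p0 + (1 - mu_x)^2 * e_u2_x1 * p1

# The variance of the slope estimator is scale_factor / n
scale_factor <- var_xu / var_x^2
print(c(var_x = var_x, var_xu = var_xu, scale_factor = scale_factor))
```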
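4.12 a. Write
\[
\begin{aligned}
ESS &= \sum_i (\hat Y_i - \bar Y)^2 = \sum_i (\hat\beta_0 + \hat\beta_1 X_i - \bar Y)^2 = \sum_i \left[\hat\beta_1 (X_i - \bar X)\right]^2 \\
&= \hat\beta_1^2 \sum_i (X_i - \bar X)^2 = \frac{\left[\sum_i (X_i - \bar X)(Y_i - \bar Y)\right]^2}{\sum_i (X_i - \bar X)^2}
\end{aligned}
\]
This implies
\[
\begin{aligned}
R^2 &= \frac{ESS}{TSS} = \frac{\left[\sum_i (X_i - \bar X)(Y_i - \bar Y)\right]^2}{\sum_i (X_i - \bar X)^2 \sum_i (Y_i - \bar Y)^2} \\
&= \left[ \frac{\frac{1}{n-1}\sum_i (X_i - \bar X)(Y_i - \bar Y)}{\left[\frac{1}{n-1}\sum_i (X_i - \bar X)^2\right]^{1/2} \left[\frac{1}{n-1}\sum_i (Y_i - \bar Y)^2\right]^{1/2}} \right]^2 \\
&= \left( \frac{s_{XY}}{s_X s_Y} \right)^2 = r_{XY}^2
\end{aligned}
\]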
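The identity R² = r²_XY is easy to confirm numerically. The sketch below uses simulated data (made-up values chosen only for illustration) and compares ESS/TSS, the squared sample correlation, and the R² reported by lm().

```r
# Simulated data for illustration only
set.seed(42)
x <- rnorm(100)
y <- 1 + 0.5 * x + rnorm(100)

fit <- lm(y ~ x)

# R^2 built from the sums of squares used in the derivation
ess <- sum((fitted(fit) - mean(y))^2)
tss <- sum((y - mean(y))^2)
r2_from_ss <- ess / tss

# Squared sample correlation and the R^2 reported by lm()
r2_from_cor <- cor(x, y)^2
r2_from_lm  <- summary(fit)$r.squared

print(c(r2_from_ss = r2_from_ss, r2_from_cor = r2_from_cor, r2_from_lm = r2_from_lm))
```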
2 Empirical Exercise
This file includes answers and R code for completing Empirical Exercise 4.2 in Introduction
to Econometrics (3rd edition) by Stock and Watson.
The first step is to read the data file into R. The data files for this problem are TeachingRatings.dta
and TeachingRatings.xls, accompanied by a descriptive file TeachingRatings_Description.pdf.
• Read the STATA file
library(foreign)
teachingdata <- read.dta("TeachingRatings.dta")
• After reading the data, we can take a quick look at it.
– Use head or tail to look at the first or last few observations
head(teachingdata)
Summary Statistics
We get the summary statistics of the variables used in the analysis, which are course_eval
and beauty.
df <- teachingdata[c("course_eval", "beauty")]
sumdf <- summary(df); sumdf
course_eval beauty
Min. :2.100 Min. :-1.45049
1st Qu.:3.600 1st Qu.:-0.65627
Median :4.000 Median :-0.06801
Mean :3.998 Mean : 0.00000
3rd Qu.:4.400 3rd Qu.: 0.54560
Max. :5.000 Max. : 1.97002
We can create a table that looks professional using stargazer().
library(stargazer)
stargazer(df, type = "latex",
title = "Summary Statistics", label = "tab:sum-stats")
Scatterplot
Figure 1: The scatterplot of course evaluation on professors’ beauty
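The plotting code is not shown in this file. A minimal base-R sketch that would draw a comparable scatterplot with the fitted line is given below, assuming a data frame with columns course_eval and beauty. The helper name plot_eval_beauty and the stand-in data are hypothetical; the stand-in values are simulated so that the sketch runs without TeachingRatings.dta.

```r
# Hypothetical helper: scatterplot of course_eval on beauty with the OLS line
plot_eval_beauty <- function(dat) {
  plot(course_eval ~ beauty, data = dat,
       xlab = "Beauty", ylab = "Course evaluation",
       main = "Course evaluation and professors' beauty")
  fit <- lm(course_eval ~ beauty, data = dat)
  abline(fit, col = "red")   # overlay the fitted regression line
  invisible(fit)             # return the fit without printing it
}

# Stand-in data (simulated, not the TeachingRatings sample)
set.seed(123)
demo <- data.frame(beauty = rnorm(50))
demo$course_eval <- 4 + 0.1 * demo$beauty + rnorm(50, sd = 0.5)
demo_fit <- plot_eval_beauty(demo)
```

With the real data loaded as above, the call would be plot_eval_beauty(teachingdata).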
Regression
Now let’s estimate the regression model. The results are reported in Table 2.
# run a regression of course evaluation on professor's beauty
teaching.formula <- course_eval ~ beauty
teaching.ols <- lm(teaching.formula, data = teachingdata)
                        Dependent variable:
                        Course Evaluations
Beauty                  0.133∗∗∗ (0.032)
Constant                3.998∗∗∗ (0.025)
Observations            463
R²                      0.036
Residual Std. Error     0.545 (df = 461)
Note:                   ∗ p<0.1; ∗∗ p<0.05; ∗∗∗ p<0.01
b1 <- coef(teaching.ols)[["beauty"]]  # estimated slope on beauty
beauty.sd <- sd(teachingdata$beauty)
courseval.sd <- sd(teachingdata$course_eval)
delta.courseval <- b1 * beauty.sd
d. The standard deviation of course evaluations is 0.5549, and the standard deviation
of beauty is 0.7886. A one-standard-deviation increase in beauty is expected to
increase course evaluations by 0.1049, or 0.19 standard deviations of course
evaluations. The effect is small.
rsq <- summary(teaching.ols)$r.squared
e. The regression R² is 0.0357, so Beauty explains only 3.6 percent of the variance
in course evaluations.
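The figures in parts (d) and (e) are linked: in a regression with a single regressor, R² equals the squared sample correlation (Exercise 4.12), and that correlation equals the slope times sX/sY. The sketch below checks this using the rounded values reported above, so agreement holds only up to rounding; the variable names are illustrative.

```r
# Rounded values taken from the regression table and part (d)
b1_rounded <- 0.133    # slope on beauty
sd_beauty  <- 0.7886   # standard deviation of beauty
sd_eval    <- 0.5549   # standard deviation of course evaluations

delta_eval  <- b1_rounded * sd_beauty   # effect of a one-SD increase in beauty
delta_in_sd <- delta_eval / sd_eval     # same effect in SD units of course evaluations
r2_implied  <- delta_in_sd^2            # implied R^2, since R^2 = r_XY^2

print(round(c(delta_eval = delta_eval, delta_in_sd = delta_in_sd,
              r2_implied = r2_implied), 4))
```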