9.1 Multiple Choice: Chapter 9 Assessing Studies Based On Multiple Regression
3) Errors-in-variables bias
A) is present when the probability limit of the OLS estimator is given by $\hat{\beta}_1 \xrightarrow{p} \frac{\sigma_x^2}{\sigma_x^2 + \sigma_w^2}\,\beta_1$.
Copyright © 2011 Pearson Education, Inc.
6) The reliability of a study using multiple regression analysis depends on all of the following with the
exception of
A) omitted variable bias.
B) errors-in-variables.
C) presence of homoskedasticity in the error term.
D) external validity.
Answer: C
11) Misspecification of functional form of the regression function
A) is overcome by adding the squares of all explanatory variables.
B) is more serious in the case of homoskedasticity-only standard error.
C) results in a type of omitted variable bias.
D) requires alternative estimation methods such as maximum likelihood.
Answer: C
13) A survey of earnings contains an unusually high fraction of individuals who state their weekly
earnings in 100s, such as 300, 400, 500, etc. This is an example of
A) errors-in-variables bias.
B) sample selection bias.
C) simultaneous causality bias.
D) companies that typically bargain with workers in 100s of dollars.
Answer: A
14) In the case of a simple regression, where the independent variable is measured with i.i.d. error,
A) $\hat{\beta}_1 \xrightarrow{p} \frac{\sigma_X^2}{\sigma_X^2 + \sigma_w^2}\,\beta_1$.
B) $\hat{\beta}_1 \xrightarrow{p} \frac{\sigma_X^2}{(\sigma_X^2 + \sigma_w^2)\,\beta_1}$.
C) $\hat{\beta}_1 \xrightarrow{p} \frac{\sigma_w^2}{\sigma_X^2 + \sigma_w^2}\,\beta_1$.
D) $\hat{\beta}_1 \xrightarrow{p} \beta_1 + \frac{\sigma_X^2}{\sigma_X^2 + \sigma_w^2}$.
Answer: A
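The probability limit in option A can be checked numerically. The following is a minimal sketch, assuming normally distributed regressors and errors and illustrative parameter values (beta1 = 2, sigma_X = sigma_w = 1) that are not part of the question; the function names are hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


def simulate_attenuation(beta1=2.0, sigma_x=1.0, sigma_w=1.0, n=100_000, seed=0):
    """Return (OLS slope using the mismeasured regressor, theoretical limit)."""
    rng = random.Random(seed)
    x_true = [rng.gauss(0.0, sigma_x) for _ in range(n)]
    y = [beta1 * x + rng.gauss(0.0, 1.0) for x in x_true]
    # The researcher only observes X plus i.i.d. measurement error w.
    x_observed = [x + rng.gauss(0.0, sigma_w) for x in x_true]
    limit = beta1 * sigma_x**2 / (sigma_x**2 + sigma_w**2)
    return ols_slope(x_observed, y), limit


estimate, limit = simulate_attenuation()
print(f"OLS slope: {estimate:.3f}  theoretical limit: {limit:.3f}")
```

With equal variances for X and w, the limit is half of the true slope, so the simulated slope should land near 1 rather than near 2.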
16) Sample selection bias occurs when
A) the choice between two samples is made by the researcher.
B) data are collected from a population by simple random sampling.
C) samples are chosen to be small rather than large.
D) the availability of the data is influenced by a selection process that is related to the value of the
dependent variable.
Answer: D
19) Applying the analysis from the California test scores to another U.S. state is an example of looking for
A) simultaneous causality bias.
B) external validity.
C) sample selection bias.
D) internal validity.
Answer: B
20) Comparing the California test scores to test scores in Massachusetts is appropriate for external
validity if
A) Massachusetts also allowed beach walking to be an appropriate P.E. activity.
B) the two income distributions were very similar.
C) the student-to-teacher ratio did not differ by more than five on average.
D) the institutional settings in California and Massachusetts, such as organization in classroom
instruction and curriculum, were similar in the two states.
Answer: D
21) The guidelines for whether or not to include an additional variable include all of the following, with
the exception of
A) providing "full disclosure" representative tabulations of the results.
B) testing whether additional questionable variables have nonzero coefficients.
C) determining whether it can be measured in the population of interest.
D) being specific about the coefficient or coefficients of interest.
Answer: C
22) Possible solutions to omitted variable bias, when the omitted variable is not observed, include the
following with the exception of
A) panel data estimation.
B) nonlinear least squares estimation.
C) use of instrumental variables regressions.
D) use of randomized controlled experiments.
Answer: B
24) You try to explain the number of IBM shares traded in the stock market per day in 2005. As an
independent variable you choose the closing price of the share. This is an example of
A) simultaneous causality.
B) invalid inference due to a small sample size.
C) sample selection bias since you should analyze more than one stock.
D) a situation where homoskedasticity-only standard errors should be used since you only analyze one
company.
Answer: A
25) In the case of errors-in-variables bias, the precise size and direction of the bias depend on
A) the sample size in general.
B) the correlation between the measured variable and the measurement error.
C) the size of the regression R2.
D) whether the good in question is price elastic.
Answer: B
28) A definition of internal validity is
A) the estimator of the causal effect being unbiased and consistent
B) the estimator of the causal effect being efficient
C) inferences and conclusions being generalized from the population to other populations
D) OLS estimation being available in your statistical package
Answer: A
30) The true causal effect might not be the same in the population studied and the population of interest
because
A) of differences in characteristics of the population
B) of geographical differences
C) the study is out of date
D) all of the above
Answer: D
9.2 Essays and Longer Questions
1) Until about 10 years ago, most studies in labor economics found a small but significant negative
relationship between minimum wages and employment for teenagers. Two labor economists challenged
this perceived wisdom with a publication in 1992 by comparing employment changes of fast-food
restaurants in Texas, before and after a federal minimum wage increase.
(a) Explain how you would obtain external validity in this field of study.
(b) List the various threats to external validity and suggest how to address them in this case.
Answer:
(a) Obtaining external validity involves generalizing the results beyond the population and setting
under study, in this case Texas. Students familiar with the Card and Krueger literature on minimum
wages will point to the New Jersey/Pennsylvania study, or the high/low impact minimum wage paper by
Card. In general, studies of the effect of minimum wages on employment using data from other states
and/or countries will generate external validity.
(b) The main threats to external validity are the differences between the population and setting studied
versus the population and setting of interest. In particular, there may be geographic and/or time
differences, in that the study may be out of date. Being out of date is not a major concern here, since the
study was done relatively recently. Using data from Texas only could be of concern if you believed that
the Texas fast-food restaurants are different from those elsewhere, say in terms of monopsony power, the
type of teenager they attract, etc. (Students familiar with the literature may point out that no data was
obtained from McDonald's, but again, that does not pose a particular threat.) Generalizing from fast-food
restaurants to other sectors such as the garment industry, is an entirely different matter, as is generalizing
from teenagers to older workers, especially females. Some authors have established that increases in
minimum wages lead to lower school enrollment rates by whites, who then replace black fast-food
restaurant workers. These types of substitutions are not likely to occur with older workers. Comparisons
with other countries, where cultural differences may be larger than within the United States, are
potentially more problematic.
2) Your textbook used the California Standardized Testing and Reporting (STAR) data set on student test
performance in Chapters 4-7. One justification for putting second to twelfth graders through such an
exercise once a year is to make schools more accountable. The hope is that schools with low scores will
improve the following year and in the future. To test for the presence of such an effect, you collect data
from 1,000 L.A. County schools for grade 4 scores in 1998 and 1999, both for reading (Read) and
mathematics (Maths). Both are on a scale from zero to one hundred. The regression results are as follows
(homoskedasticity-only standard errors in parentheses):
(a) Interpret the results and indicate whether or not the coefficients are significantly different from zero.
Do the coefficients have the expected sign and magnitude?
(b) Discuss various threats to internal and external validity, and try to assess whether or not these are
likely to be present in your study.
(c) Changing the estimation method to allow for heteroskedasticity-robust standard errors produces four
new standard errors: (0.539), (0.015), (0.452), and (0.015) in the order of appearance in the two equations
above. Given these numbers, do any of your statements in (b) change? Do you think that the coefficients
themselves changed?
(d) If reading and maths scores were the same in 1999 as in 1998, on average, what coefficients would you
expect for the intercept and the slope? How would you test for the restrictions?
(e) The appropriate F-statistic in (d) is 138.27 for the maths scores, and 104.85 for the reading scores.
Comparing these values to the critical values in the F table, can you reject the null hypothesis in each
case?
(f) Your professor tells you that the analysis reminds her of "Galton's Fallacy." Sir Francis Galton
regressed the height of children on the average height of their parents. He found a positive intercept and
a slope between zero and one. Being concerned about the height of the English aristocracy, he interpreted
the results as "regression to mediocrity" (hence the name regression). Do you see the parallel?
Answer:
(a) Schools with high (low) reading and maths scores in 1998 tend to have high (low) reading and maths scores in
1999. The slope coefficients suggest a high degree of persistence. However, both regression lines cross the
45 degree line, thereby implausibly implying mean reversion. All coefficients are statistically significant,
and approximately 80 to 90 percent of the variation in the 1999 scores is explained by the 1998 scores.
(b) The biggest threat to internal validity stems from the errors-in-variables problem. Assume that the
test scores in maths in a given year are determined by a given set of factors, such as class size,
socioeconomic variables of the school district, quality of teachers, etc. Let the maths score in the second
year also be determined by the same factors, which are unlikely to change by much between the two
years. Then subtracting the earlier year from the more current year results in a population regression
function with a slope of one and an intercept of zero, and an error term which is correlated with the
previous year's score. Hence the OLS estimator will be biased downward from one and the intercept will
be biased upward from zero, giving the above result.
There are few threats to internal or external validity present through the other factors, although the L.A.
school district may not be typical when compared to a less urban setting.
(c) The coefficients are unaffected by the choice of standard error calculation. However, hypothesis tests
based on homoskedasticity-only standard errors no longer have the desired significance levels unless the errors are homoskedastic. There is no suggestion
from the institutional setting of the district that this should be the case here. (Indeed, homoskedasticity is
rejected for the above sample.)
(d) In that case the intercept would be zero, and the slope one. This is a simultaneous hypothesis, and
hence the F-test is appropriate here.
(e) The critical value is 4.61 at the 1% level, thereby comfortably rejecting the null hypothesis in each case.
(f) The situation is similar here. Instead of regressing the outcome in one period on determining factors, it
is regressed on the outcome in a previous period. In each case the outcome in the previous period is an
imperfect measure, or contains a measurement error, of the underlying determinants. This results in problems
with internal validity.
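The joint test in (d) and the critical value in (e) can be illustrated with a small simulation. This is a sketch under stated assumptions, not the data from the question: school quality is drawn from a normal distribution with mean 60 and standard deviation 10, each year's score adds independent noise with standard deviation 5, and the function names are hypothetical. The F-statistic is the homoskedasticity-only version, and the large-sample 1% critical value uses the fact that an F(2, infinity) variable is a chi-squared(2) variable divided by 2.

```python
import math
import random


def ols_fit(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope


def joint_f_statistic(x, y):
    """Homoskedasticity-only F-statistic for H0: intercept = 0 and slope = 1."""
    n = len(x)
    b0, b1 = ols_fit(x, y)
    ssr_unrestricted = sum((b - b0 - b1 * a) ** 2 for a, b in zip(x, y))
    # Under H0 the fitted value is simply last year's score.
    ssr_restricted = sum((b - a) ** 2 for a, b in zip(x, y))
    q = 2  # number of restrictions
    return ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / (n - 2))


rng = random.Random(1)
quality = [rng.gauss(60.0, 10.0) for _ in range(1000)]  # assumed persistent factor
score_1998 = [q + rng.gauss(0.0, 5.0) for q in quality]
score_1999 = [q + rng.gauss(0.0, 5.0) for q in quality]

intercept, slope = ols_fit(score_1998, score_1999)
f_stat = joint_f_statistic(score_1998, score_1999)
# P(chi2(2) > c) = exp(-c / 2), so the F(2, inf) critical value is -ln(alpha).
critical_1pct = -math.log(0.01)
print(f"intercept={intercept:.2f} slope={slope:.3f} F={f_stat:.1f} crit={critical_1pct:.2f}")
```

Even though the simulated schools do not change at all between the two years, the fitted line has a positive intercept and a slope below one, and the joint hypothesis is rejected.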
3) Keynes postulated that the marginal propensity to consume (MPC = $\partial C / \partial Y_{pd}$) is between zero and one.
He also hypothesized that the average propensity to consume (APC = $C / Y_{pd}$) would fall as personal
Can you reject the null hypothesis that the slope is less than one? Greater than zero? Test the hypothesis
that the intercept is zero. Should you be concerned about the sample size when conducting these tests?
What other threats to internal validity may be present here?
(c) Given the GDP identity for a closed economy,
$Y_t \equiv C_t + I_t + G_t$,
show why economists saw important policy implications in finding an APC that would decrease over
time.
(d) Simon Kuznets, who won the Nobel Prize in economics, collected data on consumption expenditures
and national income from 1869 to 1938 and found, using overlapping period averages, that the APC was
relatively constant over this period. To reconcile this finding with the regression results, Milton
Friedman, who also won the Nobel Prize, formulated the "permanent income" hypothesis. In essence,
Friedman hypothesized that both actual consumption and income are measured with error,
$\tilde{C}_t = C_t + v_t$ and $\tilde{Y}_t = Y_t + w_t$,
where $C_t$ and $Y_t$ were called "permanent" consumption and income, respectively, and $v_t$ and $w_t$, the
two measurement errors, were labeled transitory consumption and income. Friedman hypothesized that
the transitory components were purely random error terms, uncorrelated with the permanent parts.
$C_t = k\,Y_{pd,t} + u_t$
so that the APC and MPC are the same and constant over time. Furthermore, let both transitory and
permanent income be independent of the error term. Show that by regressing actual consumption on
actual income, the MPC will be downward biased, and the intercept will be greater than zero, even in
large samples (to simplify the analysis, assume that permanent income and all of the errors are i.i.d. and
mutually independent).
Answer:
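(a) $\hat{C}_i = \hat{\beta}_0 + \hat{\beta}_1 Y_{pd,i}$. Dividing both sides by personal disposable income results in
$APC = \frac{\hat{C}_i}{Y_{pd,i}} = \hat{\beta}_0 \frac{1}{Y_{pd,i}} + \hat{\beta}_1$. Hence the APC will fall with increases in personal disposable income, as long as $\hat{\beta}_0 > 0$.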
(b) Assuming that all assumptions required for proper inference are satisfied here, the t-statistic for an
MPC of one is –6.97, thereby rejecting the null hypothesis. You can also reject the null hypothesis that the
slope is zero (t-statistic = 26.32). The sample is very small here and certainly less than the number of
observations required to permit the use of the standard normal distribution. There may also be omitted
variables here, such as wealth, the real interest rate, the inflation rate, etc. The functional form may be
misspecified, and there may be errors in variables (permanent income). Perhaps most seriously, there is
simultaneous causality present, given the GDP identity.
(c) Dividing both sides of the identity by GDP results in $1 \equiv \frac{C_t}{Y_t} + \frac{I_t}{Y_t} + \frac{G_t}{Y_t}$. With the APC falling over time
as income increased, either the investment output ratio or the government output ratio would have to
make up for this fall. The likely candidate was the government-expenditure share.
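(d) This is the standard errors-in-variables problem discussed in the textbook. Following the derivation in
footnote 2 in the textbook, it is straightforward to show $\hat{\beta}_1 \xrightarrow{p} \frac{\sigma_X^2}{\sigma_X^2 + \sigma_w^2}\,\beta_1$, where $X$ is permanent
income, and $w$ is the measurement error in income. Hence the marginal propensity to consume will be
downward biased, or $\hat{\beta}_1 < k$. For the intercept we get
$\hat{\beta}_0 = \bar{\tilde{C}} - \hat{\beta}_1 \bar{\tilde{Y}} = \beta_0 + \beta_1 \bar{X} + \bar{u} - \hat{\beta}_1 \bar{\tilde{Y}}$, where $\bar{\tilde{C}}$ and $\bar{\tilde{Y}} = \bar{X} + \bar{w}$ are the sample means of actual
consumption and income and $\bar{u}$ is the sample mean of the error (including transitory consumption), and collecting terms results in
$\hat{\beta}_0 = \beta_0 - (\hat{\beta}_1 - \beta_1)\bar{X} + \bar{u} - \hat{\beta}_1 \bar{w}$. Therefore $\hat{\beta}_0 \xrightarrow{p} \beta_0 + \frac{\sigma_w^2}{\sigma_X^2 + \sigma_w^2}\,\beta_1 \mu_X$,
since $\hat{\beta}_1 - \beta_1 \xrightarrow{p} -\frac{\sigma_w^2}{\sigma_X^2 + \sigma_w^2}\,\beta_1$ and $\bar{u}$ and $\bar{w}$ converge in probability to zero.
Hence the intercept in the consumption function will be upward biased.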
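The two large-sample limits can be turned into numbers to see how a constant true APC produces a falling estimated APC. The sketch below assumes a true intercept of zero and a true slope of k, as in Friedman's setup; the parameter values (k = 0.9, mean permanent income 100, variances 400 and 100) and the function names are hypothetical and chosen only for illustration.

```python
def friedman_limits(k, mean_permanent, var_permanent, var_transitory):
    """Large-sample OLS intercept and slope when income is measured with error.

    Assumes the true relationship is C = k * (permanent income) + u.
    """
    reliability = var_permanent / (var_permanent + var_transitory)
    slope_limit = reliability * k
    intercept_limit = (1.0 - reliability) * k * mean_permanent
    return intercept_limit, slope_limit


def implied_apc(intercept, slope, income):
    """Average propensity to consume implied by a fitted linear consumption function."""
    return intercept / income + slope


intercept, slope = friedman_limits(
    k=0.9, mean_permanent=100.0, var_permanent=400.0, var_transitory=100.0
)
for income in (50.0, 100.0, 200.0):
    print(income, round(implied_apc(intercept, slope, income), 3))
```

The estimated MPC falls short of k and the intercept is positive, so the fitted APC declines with income even though the true APC is constant. At mean income the fitted APC equals k, because the OLS line passes through the sample means.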
4) The Phillips curve is a relationship in macroeconomics between the inflation rate (inf) and the
unemployment rate (ur). Estimating the Phillips curve using quarterly data for the United States from
1962:I to 1995:IV, you find
(d) The most obvious choice would be to estimate the Phillips curve for other countries. It is also possible
to estimate the Phillips curve for a cross-section of countries. Using state data is more problematic since
state unemployment rates vary, but inflation rates are very similar and only exist for certain cities (using
the CPI).
5) You have decided to analyze the year-to-year variation in temperature data. Specifically you want to
use this year's temperature to predict next year's temperature for certain cities. As a result, you collect the
daily high temperature (Temp) for 100 randomly selected days in a given year for three United States
cities: Boston, Chicago, and Los Angeles. You then repeat the exercise for the following year. The
regression results are as follows (heteroskedasticity-robust standard errors in parentheses):
(6.46) (0.10)
(3.98) (0.05)
(15.33) (0.22)
(a) What is the prediction of the above regression for Los Angeles if the temperature in the previous year
was 75 degrees? What would be the prediction for Boston?
(b) Assume that the previous year's temperature gives accurate predictions, on average, for this year's
temperature. What values would you expect in this case for the intercept and slope? Sketch how each of
the above regressions behaves compared to this line.
(c) After reflecting on the results a bit, you consider the following explanation for the above results. Daily
high temperatures on any given date are measured with error in the following sense: for any given day in
any of the three cities, say January 28, there is a true underlying seasonal temperature (X), but each year
there are different temporary weather patterns (v, w) which result in a temperature different from X.
For the two years in your data set, the situation can be described as follows:
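$Temp_t = X + v_t$ and $Temp_{t+1} = X + w_t$
Subtracting $Temp_t$ from $Temp_{t+1}$, you get $Temp_{t+1} = Temp_t + w_t - v_t$. Hence the population parameters for the intercept
and slope are zero and one, as expected. Show that the OLS estimator for the slope is inconsistent, where
$\hat{\beta}_1 \xrightarrow{p} 1 - \frac{\sigma_v^2}{\sigma_X^2 + \sigma_v^2}$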
(d) Use the formula above to explain the differences in the results for the three cities. Is your
mathematical explanation intuitively plausible?
Answer:
(a) The prediction for Los Angeles is 70.5 degrees, and for Boston 74.4 degrees.
(b) In that case, the intercept would be zero, and the slope one.
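(c) The derivation follows footnote 2 in the textbook with one modification: $\beta_1 = 1$.
(d) Rewriting $\hat{\beta}_1 \xrightarrow{p} 1 - \frac{\sigma_v^2}{\sigma_X^2 + \sigma_v^2}$ as $\hat{\beta}_1 \xrightarrow{p} \frac{1}{1 + \sigma_v^2 / \sigma_X^2}$ suggests that the slope in the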
temperature regression will be closer to one, the more variation there is in the underlying "true"
temperature. Temperatures in Los Angeles vary the least throughout the year, and you would therefore
expect the largest bias. The slope for Chicago suggests that temperatures there have the most variation.
The standard deviation for the Boston temperature is 19.5 and for Chicago 21.0. However, these are actual
temperature standard deviations. To calculate the variance of X in the above example, you could collect
data over a 100-year period on the same dates and form daily averages. It is the standard deviation of
these temperatures that would most resemble the standard deviation in X.
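The relationship between the slope and the two variances can also be run in reverse: given an estimated slope and the standard deviation of actual temperatures, one can back out how much of the variation is seasonal and how much is temporary weather. The sketch below uses the Boston and Chicago standard deviations quoted above, but the slope of 0.8 is a hypothetical value chosen for illustration, and the function names are assumptions. It also assumes the temporary component has the same variance in both years.

```python
import math


def slope_limit(sd_seasonal, sd_weather):
    """Large-sample OLS slope: 1 / (1 + sigma_v^2 / sigma_X^2)."""
    return 1.0 / (1.0 + (sd_weather / sd_seasonal) ** 2)


def split_observed_sd(observed_sd, slope):
    """Back out (sd of X, sd of v) from the observed sd and a slope in (0, 1).

    Uses slope = sigma_X^2 / (sigma_X^2 + sigma_v^2) and
    observed_sd^2 = sigma_X^2 + sigma_v^2.
    """
    sd_seasonal = observed_sd * math.sqrt(slope)
    sd_weather = observed_sd * math.sqrt(1.0 - slope)
    return sd_seasonal, sd_weather


hypothetical_slope = 0.8  # illustrative only, not an estimate from the question
for city, observed_sd in [("Boston", 19.5), ("Chicago", 21.0)]:
    sd_x, sd_v = split_observed_sd(observed_sd, hypothetical_slope)
    print(f"{city}: seasonal sd {sd_x:.1f}, weather sd {sd_v:.1f}")
```

Holding the weather component fixed, a city with little seasonal variation gets a slope limit closer to zero, which is the pattern the answer describes for Los Angeles.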
6) A study of United States and Canadian labor markets shows that aggregate unemployment rates
between the two countries behaved very similarly from 1920 to 1982, when a two percentage point gap
opened between the two countries, which has persisted over the last 20 years. To study the causes of this
phenomenon, you specify a regression of Canadian unemployment rates on demographic variables,
aggregate demand variables, and labor market characteristics.
(a) Assume that your analysis is internally valid. What would make it externally valid?
(b) If one of the determinants of Canadian unemployment is aggregate United States economic activity
(or perhaps shocks to it), what variable would you suggest as its replacement if you did a similar study
for the United States?
(c) Certain Canadian geographical areas, such as the prairies and British Columbia, seem particularly
sensitive to commodity price shocks (Edmonton's NHL team is called the Edmonton Oilers). Having
collected provincial data, you establish a relationship between provincial unemployment rates and
commodity price changes (shocks). How would you address external validity now?
Answer:
(a) Threats to external validity come from the difference between the population and settings studied
versus the population and settings of interest. Finding, for example, that the variables which characterize
the unemployment insurance system exert an influence on Canadian unemployment, does not
automatically imply that this holds universally. To obtain external validity, the exercise should be
repeated for other geographic units, such as countries or states. If the coefficients are similar, or differences
in coefficients can be explained, then the study is externally valid.
(b) Shocks to world aggregate demand, or the major trading partners for the United States, would be a
possibility.
(c) The task is to find geographical units that are also sensitive to commodity price changes. Texas,
Louisiana, and Oklahoma would be candidates for obtaining external validity.
7) Several authors have tried to measure the "persistence" in U.S. state unemployment rates by running
the following regression:
$ur_{i,t} = \beta_0 + \beta_1\, ur_{i,t-k} + z_{i,t}$
where ur is the state unemployment rate, i is the index for the i-th state, t indicates a time period, and
typically k ≥ 10.
(a) Explain why finding a slope estimate of one and an intercept of zero is typically interpreted as
evidence of "persistence."
(b) You collect data on the 48 contiguous U.S. states' unemployment rates and find the following
estimates:
Interpret the regression results.
(c) Analyze the accompanying figure and interpret the observations for Maryland and for Washington.
Do you find evidence of persistence? How would you test for it?
(d) One of your peers points out that this result makes little sense, since it implies that eventually all
states would have identical unemployment rates. Explain the argument.
(e) Imagine that state unemployment rates were determined by their natural rates and some transitory
shock. The natural rates themselves may be functions of the unemployment insurance benefits of the
state, unionization rates of its labor force, demographics, sectoral composition, etc. The transitory
components may include state-specific shocks to its terms of trade such as raw material movements and
demand shocks from the other states. You specify the i-th state unemployment rate accordingly as
follows for the two periods when you observe it,
so that actual unemployment rates are measured with error. You have also assumed that the natural rate
is the same for both periods. Subtracting the second period from the first then results in the following
population regression function:
It is not too hard to show that estimation of the observed unemployment rate in period t on the
unemployment rate in period (t-k) by OLS results in an estimator for the slope coefficient that is biased
towards zero. The formula is
$\hat{\beta}_1 \xrightarrow{p} 1 - \frac{\sigma_v^2}{\sigma_X^2 + \sigma_v^2}$.
Using this insight, explain over which periods you would expect the slope to be closer to one, and over
which period it should be closer to zero.
(f) Estimating the same regression for a different time period results in
If your above analysis is correct, what are the implications for this time period?
Answer:
(a) This result would imply that states with high (low) unemployment rates in the (t-k) period would
have high (low) unemployment rates in period t. Hence high (low) unemployment rates would persist.
(b) A state which had an unemployment rate of 3 percent in 1970 is predicted to have an unemployment
rate of approximately 4 percent in 1995. If the state had a 7 percent unemployment rate in 1970, then the
prediction becomes approximately 6.5 percent. There is no interpretation for the constant. The regression
explains 40 percent of the variation in state unemployment rates in 1995.
(c) Washington had the highest unemployment rate in 1970, namely above 9 percent. There are several
states in 1995 that have higher unemployment rates. Washington seems to have reverted towards the
mean unemployment rate of all states. Maryland had a relatively low unemployment rate in 1970 (about
3.5 percent), but has a relatively higher unemployment rate in 1995. It also has reverted towards the
mean.
(d) The positive intercept and the slope between zero and one imply that high (low) unemployment rate
states will have high (low) unemployment rates in the future, but that they will not be as high (low) as in
the base period. Hence there is mean reversion. The prediction would be that ultimately all states would
end up with identical unemployment rates. However, unemployment rate differences should persist if
there are differences in the natural rates of the state unemployment rates. These may be due to different
sectoral compositions, unemployment insurance benefits, tax rates, etc. Unless states were identical with
regard to these variables, then unemployment rates should differ.
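(e) Noting that $\hat{\beta}_1 \xrightarrow{p} 1 - \frac{\sigma_v^2}{\sigma_X^2 + \sigma_v^2}$ can be rewritten as $\hat{\beta}_1 \xrightarrow{p} \frac{1}{1 + \sigma_v^2 / \sigma_X^2}$, you would expect $\hat{\beta}_1$ to lie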
closer to one over time periods when natural rate variations dominate the transitory deviation of state
unemployment rates from their natural rates. Therefore if you attempted to predict the unemployment
rates in the mid 1980s from those in the mid 1970s, then the slope coefficient should be further away from
one. (There are several studies that have found virtually no persistence in state unemployment rates over
this period.)
(f) Following the previous argument, the result suggests that there were more transitory deviations from
the natural rate over this period. The large drop in oil prices, particularly in 1986, comes to mind.
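The argument in (d), that a positive intercept and a slope between zero and one would eventually push every state to the same unemployment rate, can be made concrete by iterating the fitted line. The sketch below backs out an intercept of 2.125 and a slope of 0.625 from the approximate predictions quoted in answer (b) (3 percent maps to about 4 percent, 7 percent to about 6.5 percent); these are illustrative values implied by rounded numbers, not the reported estimates, and the function names are hypothetical.

```python
def long_run_rate(intercept, slope):
    """Fixed point u* of u -> intercept + slope * u, valid for 0 <= slope < 1."""
    if not 0.0 <= slope < 1.0:
        raise ValueError("a unique stable fixed point requires 0 <= slope < 1")
    return intercept / (1.0 - slope)


def project(rate, intercept, slope, periods):
    """Apply the fitted line repeatedly and return the whole path."""
    path = [rate]
    for _ in range(periods):
        rate = intercept + slope * rate
        path.append(rate)
    return path


INTERCEPT, SLOPE = 2.125, 0.625  # implied by the rounded predictions in (b)
print("long-run rate:", round(long_run_rate(INTERCEPT, SLOPE), 2))
for start in (3.0, 7.0):
    print(start, [round(u, 2) for u in project(start, INTERCEPT, SLOPE, 5)])
```

Both the low and the high starting rate move toward the same fixed point, which is why taking the regression literally as a law of motion yields identical long-run unemployment rates across states.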
8) Sir Francis Galton (1822-1911), an anthropologist and cousin of Charles Darwin, created the term
regression. In his article "Regression towards Mediocrity in Hereditary Stature," Galton compared the
height of children to that of their parents, using a sample of 930 adult children and 205 couples. In
essence he found that tall (short) parents will have tall (short) offspring, but that the children will not be
quite as tall (short) as their parents, on average. Hence there is regression towards the mean, or as Galton
referred to it, mediocrity. This result is obviously a fallacy if you attempt to infer behavior over time
since, if true, the variance of height in humans would shrink over generations. This is not the case.
(a) To research this result, you collect data from 110 college students and estimate the following
relationship:
where Studenth is the height of students in inches and Midparh is the average of the parental heights.
Values in parentheses are heteroskedasticity-robust standard errors. Sketching this regression line
together with the 45 degree line, explain why the above results suggest "regression to the mean" or "mean
reversion."
(b) Researching the medical literature, you find that height depends, to a large extent, on one gene
("phog") and on environmental influences. Let us assume that parents and offspring have the same
invariant (over time) gene and that actual height is therefore measured with error in the following sense,
where $\tilde{X}$ is measured height, $X$ is the height given through the gene, v and w are environmental
influences, and the subscripts o and p stand for offspring and parents, respectively. Let the environmental
influences be independent from each other and from the gene.
Subtracting the measured height of offspring from the height of parents, what sort of population
regression function do you expect?
(c) How would you test for the two restrictions implicit in the population regression function in (b)? Can
you tell from the results in (a) whether or not the restrictions hold?
(d) Proceeding in a similar way to the proof in your textbook, you can show that
$\hat{\beta}_1 \xrightarrow{p} 1 - \frac{\sigma_v^2}{\sigma_X^2 + \sigma_v^2}$
for the situation in (b). Discuss under what conditions you will find a slope closer to one for the height
comparison. Under what conditions will you find a slope closer to zero?
(e) Can you think of other examples where Galton's Fallacy might apply?
Answer:
(a) As can be seen in the accompanying graph, the regression line crosses the 45 degree line. Tall (short)
parents will have tall (short) children, but on average, they will not be as tall (short) as their parents.
Hence they will regress to the mean, or mean revert.
(b)
(c) You would have to test simultaneously whether the intercept is zero and the slope is one. This
requires an F-test. Analyzing the t-statistics above suggests rejection of both hypotheses. However,
testing the hypotheses sequentially is not the same as testing them simultaneously.
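(d) The above expression can be rewritten as $\hat{\beta}_1 \xrightarrow{p} \frac{1}{1 + \sigma_v^2 / \sigma_X^2}$. The limit of $\hat{\beta}_1$ will equal unity if there is no
measurement error, and will be close to unity if the variance in the gene is large relative to the variance of the measurement error.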
(e) Answer will vary by student. There are many examples of Galton's Fallacy, some of which have been
used in the test bank (state unemployment rates in year t when compared to year t-k; temperatures in a
given city this year compared to the previous year; grade received in the final examination relative to the
midterm grade; mutual fund performance this year versus last year; convergence regressions, sports
performance this year compared to the previous year, etc.).
9) Macroeconomists who study the determinants of per capita income (the "wealth of nations") have been
particularly interested in finding evidence on conditional convergence in the countries of the world.
Finding such a result would imply that all countries would end up with the same per capita income once
other variables such as saving and population growth rates, education, government policies, etc., took on
the same value. Unconditional convergence, on the other hand, does not control for these additional
variables.
(a) The results of the regression for 104 countries were as follows,
where g6090 is the average annual growth rate of GDP per worker for the 1960-1990 sample period, and
RelProd60 is GDP per worker relative to the United States in 1960.
Interpret the results and point out the difference with regard to unconditional convergence.
(b) The "beta-convergence" regressions in (a) are of the following type,
$\frac{\Delta_t \ln Y_{i,t}}{T} = \beta_0 + \beta_1 \ln Y_{i,0} + u_{i,t}$,
where $\Delta_t \ln Y_{i,t} = \ln Y_{i,t} - \ln Y_{i,0}$, $t$ and $0$ refer to two time periods, and $i$ is the i-th country.
Explain why a significantly negative slope implies convergence (hence the name).
(c) The equation in (b) can be rewritten without any change in information as (ignoring the division by T)
ln Yt = β0 + γ1 ln Y0 + ut
In this form, how would you test for unconditional convergence? What would be the implication for
convergence if the slope coefficient were one?
(d) Let's write the equation in (c) as follows:
and assume that the "~" variables contain measurement errors of the following type,
where the "*" variables represent true, or permanent, per capita income components, while v and w are
temporary or transitory components. Subtraction of the initial period from the current period then results
in
Ignoring, without loss of generality, the constant in the above equation, and making standard
assumptions about the error term, one can show that by regressing current per capita income on a
constant and the initial period per capita income, the slope behaves as follows:
$\hat{\gamma}_1 \xrightarrow{p} 1 - \frac{\sigma_v^2}{\sigma_Y^2 + \sigma_v^2}$
Answer:
(a) There is evidence for unconditional convergence among the OECD countries, but not for the countries
of the world as a whole. Only for the OECD countries is the slope coefficient significantly different from
zero.
(b) A significantly negative slope coefficient implies that countries which were further behind initially,
grow faster subsequently. Hence these countries will eventually converge.
(c) Ignoring T above, $\gamma_1 = \beta_1 + 1$. Hence for convergence to occur, $\gamma_1$ has to be significantly less than
unity. If it were unity, then there would be no convergence or mean reversion.
(d) If Y is measured with error, perhaps due to a temporary difference resulting from a shock during the
initial year of measurement, then beta will be biased downward, i.e., the regression will indicate
convergence when there is none in truth.
10) One of the most frequently used summary statistics for the performance of a baseball hitter is the so-
called batting average. In essence, it calculates the percentage of hits in the number of opportunities to hit
(appearances "at the plate"). The management of a professional team has hired you to predict next
season's performance of a certain hitter who is up for a contract renegotiation after a particularly great
year. To analyze the situation, you search the literature and find a study which analyzed players who had
at least 50 at bats in 1998 and 1997. There were 379 such players.
(a) The reported regression line in the study is
and the intercept and slope are both statistically significant. What does the regression imply about the
relationship between past performance and present performance? What values would the slope and
intercept have to take on for the future performance to be as good as the past performance, on average?
(b) Being somewhat puzzled about the results, you call your econometrics professor and describe the
results to her. She says that she is not surprised at all, since this is an example of "Galton's Fallacy." She
explains that Sir Francis Galton regressed the height of offspring on the mid-height of their parents and
found a positive intercept and a slope between zero and one. He referred to this result as "regression
towards mediocrity." Why do you think econometricians refer to this result as a fallacy?
(c) Your professor continues by mentioning that this is an example of errors-in-variables bias. What does
she mean by that in general? In this case, why would batting averages be measured with error? Are
baseball statisticians sloppy?
(d) The top three performers in terms of highest batting averages in 1997 were Tony Gwynn (.372), Larry
Walker (.366), and Mike Piazza (.362). Given your answers for the previous questions, what would be
your predictions for the 1998 season?
Answer:
(a) The regression implies mean reversion: those players who had a high (low) average in 1997 will have
a high (low) average in 1998, but it will not be as high (low) as before. If the performance was as good or
bad as in the past, then the intercept would have to be zero and the slope one.
(b) If the result were true, then eventually everyone would be of the same height.
(c) Errors-in-variables bias refers to a situation where variables are not measured precisely, but contain a
measurement error. In this situation, the player may have had an extraordinarily good or bad year,
resulting, perhaps, from an injury, adjustments to a new league, a new city, etc. This results in a
measurement error of his underlying ability. It has nothing to do with not measuring the batting average
correctly.
(d) The forecast would be for Tony Gwynn to bat (.312), Larry Walker (.309), and Mike Piazza (.307).
11) Your textbook compares the results of a regression of test scores on the student-teacher ratio using a
sample of school districts from California and from Massachusetts. Before standardizing the test scores
for California, you get the following regression result:
$\widehat{TestScore} = 698.9 - 2.28 \times STR$
n = 420, $R^2$ = 0.051, SER = 18.6
In addition, you are given the following information: the sample mean of the student-teacher ratio is
19.64 with a standard deviation of 1.89, and the standard deviation of the test scores is 19.05.
a. After standardizing the test scores variable and running the regression again, what is the value of the
slope? What is the meaning of this new slope here (interpret the result)?
b. What will be the new intercept? Now that test scores have been standardized, should you interpret the
intercept?
c. Does the regression R2 change between the two regressions? What about the t-statistic for the slope
estimator?
Answer:
a. Standardization of a variable is a simple linear transformation,
$Y_i^* = \frac{Y_i - \bar{Y}}{s_Y} = -\frac{\bar{Y}}{s_Y} + \frac{1}{s_Y} Y_i = a + bY_i$.
The numerical value of the new slope is -2.28/19.05 ≈ -0.12. The interpretation is as follows: if you decrease the
student-teacher ratio by one, then test scores improve by 0.12 of a standard deviation of test scores, or
about 2.28 points.
b. The new intercept is $(\hat{\beta}_0 - \bar{Y})/s_Y$, where $\bar{Y} = 698.9 - 2.28 \times 19.64 \approx 654.1$,
or, in this case, 2.35. Mathematically speaking, the intercept continues to represent the (standardized) test
score when the student-teacher ratio is zero. This does not make sense and it is best not to interpret the
intercept.
c. Performing a linear transformation on the regressand (or the regressor for that matter) does not change
the regression R2. It is easy but tedious to show that it is unaffected. Intuitively this makes sense since
otherwise you could affect the goodness of fit by whim (changing the scale of the data). Similarly, logic
dictates that the t-statistic is unaffected.
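The arithmetic behind the standardized coefficients can be checked directly. The following is a minimal sketch that uses only the numbers reported in the question; the variable names are illustrative, and it relies on the fact that an OLS line with an intercept passes through the sample means.

```python
# Values reported in the question (California regression of test scores on STR).
b0, b1 = 698.9, -2.28   # original intercept and slope
mean_str = 19.64        # sample mean of the student-teacher ratio
sd_score = 19.05        # sample standard deviation of test scores

# OLS with an intercept passes through the sample means,
# so mean(Y) = b0 + b1 * mean(STR).
mean_score = b0 + b1 * mean_str

# Standardizing Y as (Y - mean_score) / sd_score rescales both coefficients.
slope_std = b1 / sd_score
intercept_std = (b0 - mean_score) / sd_score

print(round(mean_score, 2), round(slope_std, 3), round(intercept_std, 2))
```

Because the transformation only shifts and rescales the regressand, the fitted line is the same line expressed in different units, which is why the $R^2$ and the t-statistic on the slope are unchanged.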
12) Suppose that you have just read a review of the literature of the effect of beauty on earnings. You
were initially surprised to find a mild effect of beauty even on teaching evaluations at colleges. Intrigued
by this effect, you consider explanations as to why more attractive individuals receive higher salaries.
One of the possibilities you consider is that beauty may be a marker of performance/productivity. As a
result, you set out to test whether or not more attractive individuals receive higher grades (cumulative
GPA) at college. You happen to have access to individuals at two highly selective liberal arts colleges
nearby. One of these specializes in Economics and Government and incoming students have an average
SAT of 2,100; the other is known for its engineering program and has an incoming SAT average of 2,200.
Conducting a survey, where you offer students a small incentive to answer a few questions regarding
their academic performance, and taking a picture of these individuals, you establish that there is no
relationship between grades and beauty. Write a short essay using some of the concepts of internal and
external validity to determine if these results are likely to apply to universities in general.
Answer: Students will consider various points that pose a threat to internal and external validity.
Obviously there is a difference in populations (external validity) between highly selective liberal arts
colleges and universities in general. SAT scores at these colleges are much higher than for the average
university. In addition, the gender composition may be quite different, especially for engineering schools,
where males dominate in terms of student numbers. Even in economics, the ratio of female to male
students is typically 1:2. This is an example of sample selection bias (internal validity). Other potential
problems with this study may include errors-in-variables from students not reporting the correct GPA.
However, this may not be a severe problem since GPA is the dependent variable. There could be a
problem if students with lower GPAs systematically inflate the GPA they report. It is also not clear from the
setup how beauty was judged. If judges were chosen who are friends of the individuals, then their
judgments may be biased, which is more severe since beauty is an explanatory variable. The setup also
does not indicate what the control variables are. In the absence of controls, there will be omitted variable
bias (internal validity) since intelligence will clearly be a determining factor of cumulative GPAs.
1) Your textbook gives the following example of simultaneous causality bias of a two equation system:
Yi = β0 + β1Xi + ui
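$X_i = \gamma_0 + \gamma_1 Y_i + v_i$
In microeconomics, you studied the demand and supply of goods in a single market. Let the demand ($Q_i^D$) and supply ($Q_i^S$) for a certain good be determined as follows:
$Q_i^D = \beta_0 - \beta_1 P_i + u_i$,
$Q_i^S = \gamma_0 - \gamma_1 P_i + v_i$,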
where P is the price of the good. In addition, you typically assume that the market clears.
Explain how the simultaneous causality bias applies in this situation. The textbook explained a positive
correlation between Xi and ui for γ1 > 0 through an argument that started from "imagine that ui is
negative." Repeat this exercise here.
Answer: Although quantities appear on the left-hand side of both equations, this is a system of two
equations in two unknowns, where quantity and price are determined simultaneously by demand and
supply.
A negative ui, call it a "demand shock," decreases the quantity demanded. Since demand equals supply,
this results in a lower quantity traded, and hence a lower price. (At the old price level, there would now
be excess supply, and hence the price would fall.) The negative ui has therefore resulted in a lower price,
and hence the error term in the demand equation is positively correlated with the price in the same
equation.
2) The errors-in-variables model analyzed in the text results in
$\hat{\beta}_1 \xrightarrow{p} \frac{\sigma_X^2}{\sigma_X^2 + \sigma_w^2} \beta_1$
so that the OLS estimator is inconsistent. Give a condition involving the variances of X and w, under
which the bias towards zero becomes small.
Answer: $\hat{\beta}_1 \xrightarrow{p} \frac{\sigma_X^2}{\sigma_X^2 + \sigma_w^2} \beta_1 = \frac{1}{1 + \sigma_w^2 / \sigma_X^2} \beta_1$. Hence if the variance of X is large relative to that of w, so that
variations in the variable measured with error are dominated by the unobserved component, then the bias
disappears. Also, if there is no measurement error, then $\sigma_w^2 = 0$, and the bias disappears.
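A small simulation can illustrate how the attenuation shrinks as the variance of X grows relative to the variance of w. This is only a sketch under assumed values: the true intercept (2.0), the true slope (1.0), normally distributed variables, and the function names are all hypothetical choices, not part of the textbook example.

```python
import random

def ols_slope(x, y):
    """OLS slope from a regression of y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate_slope(sigma_x, sigma_w, beta1=1.0, n=20000, seed=0):
    """Regress Y on the mismeasured regressor X + w and return the slope."""
    rng = random.Random(seed)
    x_true = [rng.gauss(0, sigma_x) for _ in range(n)]
    y = [2.0 + beta1 * x + rng.gauss(0, 1) for x in x_true]
    x_obs = [x + rng.gauss(0, sigma_w) for x in x_true]
    return ols_slope(x_obs, y)

def plim_slope(sigma_x, sigma_w, beta1=1.0):
    """Large-sample limit: beta1 * var(X) / (var(X) + var(w))."""
    return beta1 * sigma_x ** 2 / (sigma_x ** 2 + sigma_w ** 2)

# Same var(X), large versus small measurement error variance.
for sigma_w in (1.0, 0.1):
    print(sigma_w, round(simulate_slope(1.0, sigma_w), 3),
          round(plim_slope(1.0, sigma_w), 3))
```

With equal variances the estimated slope should be close to half the true slope, while with a small measurement error variance it should be close to the true value.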
3) You have been hired as a consultant by a building contractor, who has been sued by the owners'
representatives of a large condominium project for shoddy construction work. In order to assess the
damages for the various units, the owners' association sent out a letter to owners and asked if people
were willing to make their units available for destructive testing. Destructive testing was conducted in
some of these units as a result of the responses. Based on the tests, the owners' association inferred the
damage over the entire condo complex. Do you think that the inference is valid in this case? Discuss how
proper sampling should proceed in this situation.
Answer: This is clearly a case of sample selection bias which leads to bias in the OLS estimator in general.
It should be clear that inference cannot be conducted properly, since owners who suspect that their unit is
faulty are much more likely to agree to destructive testing of their unit than those who have not
experienced any problems. The proportion of units assumed to be faulty in the population is bound to be
too large when derived through sampling of this type.
The proper sampling method would be to decide on the units to be tested through random sampling. A
random number generator should be used to determine the sampled units. The owners' association must
guarantee that the randomly selected units are available for destructive testing.
4) Assume that a simple economy could be described by the following system of equations,
Ct = β0 + β1Yt + ut
It = ,
where C is consumption, Y is income, and I is investment. (This may be a primitive island society which
does not trade with other islands. There is no government, and the only good consumed and invested
(saved) is sunflower seeds.)
Assume the presence of the GDP identity, Y = C + I. If you estimated the consumption function, what sort
of problem involving internal validity may be present?
Answer: There is simultaneous causality present in the system. Income causes consumption, which in
turn causes income (GDP). A negative consumption "shock," ut, causes consumption, and hence
aggregate demand, to fall. With lower aggregate demand, not all goods supplied are being sold in the
market, and hence income (Yt) falls. There is therefore a positive correlation between ut and Yt, i.e., the
error term and the regressor are correlated.
5) Your professor wants to measure the class's knowledge of econometrics twice during the semester,
once in a midterm and once in a final. Assume that your performance, and that of your peers, on the day
of your midterm exam only measure knowledge imperfectly and with an error,
where $\tilde{X}$ is your exam grade, X is underlying econometrics knowledge, and w is a random error with
mean zero and variance $\sigma_w^2$. w may depend on whether you have a headache that day, whether or not the
questions you had prepared for appeared on the exam, your mood, etc. A similar situation holds for the
final, which is exam two:
. What would happen if you ran a regression of grades received by students in the final
on midterm grades?
Answer: This is a typical errors-in-variables problem, which results in a downward biased estimator of
the slope. Subtracting the first equation from the second results in
If underlying econometrics knowledge at each exam did not change, then the regression should have a
slope of one and a zero intercept. (Alternatively, you can allow for an intercept.) The main point here is
that the performance during the first exam is only an imperfect measure of econometric ability, meaning
that there is measurement error. This results in a correlation between the error term and the regressor,
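and the OLS estimator will be inconsistent: $\hat{\beta}_1 \xrightarrow{p} \frac{\sigma_X^2}{\sigma_X^2 + \sigma_w^2} \beta_1 = \frac{\sigma_X^2}{\sigma_X^2 + \sigma_w^2} < 1$ when $\beta_1 = 1$, and so the regression will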
display mean reversion: students with high (low) midterm scores will most likely have high (low) scores
in the final, but they will not be quite as high (low) as in the midterm.
6) Consider the one-variable regression model, Yi = β0 + β1Xi + ui, where the usual assumptions from
Chapter 4 are satisfied. However, suppose that both Y and X are measured with error, $\tilde{Y}_i = Y_i + z_i$ and
$\tilde{X}_i = X_i + w_i$. Let both measurement errors be i.i.d. and independent of both Y and X respectively. If you
estimated the regression model $\tilde{Y}_i = \beta_0 + \beta_1 \tilde{X}_i + v_i$ using OLS, then show that the slope estimator is not
consistent.
Answer: The difference from the example used in section 7.2 of the text is that both the regressor and the
dependent variable are measured with error here. Proceeding along the lines in section 7.2, you can write
the population regression equation Yi = β0 + β1Xi + ui in terms of the imprecisely measured variables,
$\tilde{Y}_i = \beta_0 + \beta_1 \tilde{X}_i + v_i$,
where vi = zi - β1wi + ui. Hence the dependent variable being measured with error does not cause
additional problems relative to the case discussed in the textbook, but the error term continues to be correlated
with the regressor. As a matter of fact, it is easiest to combine this measurement error with the
population regression error term, i.e., to work with the composite error zi + ui, in which case the derivation shown in Chapter 7
footnote 2 of the textbook holds after making this small adjustment. Note that $\mathrm{cov}(\tilde{X}_i, z_i + u_i) = \mathrm{cov}(\tilde{X}_i, z_i) + \mathrm{cov}(\tilde{X}_i, u_i) = 0$, and hence $\mathrm{cov}(\tilde{X}_i, v_i) = -\beta_1 \sigma_w^2$
as before, and $\hat{\beta}_1 \xrightarrow{p} \frac{\sigma_X^2}{\sigma_X^2 + \sigma_w^2} \beta_1$.
7) In the simple, one-explanatory variable, errors-in-variables model, the OLS estimator for the slope is
inconsistent. The textbook derived the following result
$\hat{\beta}_1 \xrightarrow{p} \frac{\sigma_X^2}{\sigma_X^2 + \sigma_w^2} \beta_1$.
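Show that the OLS estimator for the intercept behaves as follows in large samples:
$\hat{\beta}_0 \xrightarrow{p} \beta_0 + \beta_1 \mu_X \frac{\sigma_w^2}{\sigma_X^2 + \sigma_w^2}$,
where $\bar{\tilde{X}} \xrightarrow{p} \mu_X$.
Answer: $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{\tilde{X}} = \beta_0 + \beta_1 \bar{\tilde{X}} + \bar{v} - \hat{\beta}_1 \bar{\tilde{X}}$, and, collecting terms, this results in $\hat{\beta}_0 = \beta_0 - (\hat{\beta}_1 - \beta_1) \bar{\tilde{X}} + \bar{v}$, where $v_i = u_i - \beta_1 w_i$. Since $\bar{v} \xrightarrow{p} 0$ and $\hat{\beta}_1 - \beta_1 \xrightarrow{p} -\beta_1 \frac{\sigma_w^2}{\sigma_X^2 + \sigma_w^2}$, the stated result follows.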
8) Assume that you had found correlation of the residuals across observations. This may happen because
the regressor is ordered by size. Your regression model could therefore be specified as follows:
Yi = β0 + β1Xi + ui
ui = ρui-1 + vi; |ρ| < 1.
Furthermore, assume that you had obtained consistent estimates for β0, β1, ρ. If asked to make a
prediction for Y, given a value of X (= Xj) and the lagged residual $\hat{u}_{j-1}$, how would you proceed? Would you use the
information on the lagged residual at all? Why or why not?
Answer: Given that the error term for j is related to the error term in j-1, it seems intuitive to use that
information in prediction, i.e., if $Y_{j-1}$ is larger than $\hat{\beta}_0 + \hat{\beta}_1 X_{j-1}$, then $Y_j$ will also be larger than $\hat{\beta}_0 + \hat{\beta}_1 X_j$, but not by
as much (given ρ > 0). Substitution of the second equation into the first equation results in Yi = β0 + β1Xi
+ ρui-1 + vi. Hence the predicted value should be calculated as
$\hat{Y}_j = \hat{\beta}_0 + \hat{\beta}_1 X_j + \hat{\rho} \hat{u}_{j-1}$.
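The prediction rule can be sketched in a few lines. The function name, argument names, and the numbers in the usage example are hypothetical; the sketch assumes that consistent estimates of the intercept, slope, and ρ are already available and that the previous observation is used to form the lagged residual.

```python
def predict_with_ar1(b0, b1, rho, x_j, x_prev, y_prev):
    """Forecast Y_j using the regression line plus the carried-over residual.

    b0, b1, rho: estimated intercept, slope, and autocorrelation coefficient.
    x_prev, y_prev: previous observation, used to compute the lagged residual.
    """
    # Lagged residual: how far the previous observation was from the fitted line.
    u_prev = y_prev - (b0 + b1 * x_prev)
    # Only the fraction rho of that deviation is expected to persist.
    return b0 + b1 * x_j + rho * u_prev

# Hypothetical numbers: the previous observation lay 2 units above the line.
forecast = predict_with_ar1(b0=1.0, b1=2.0, rho=0.5, x_j=4.0, x_prev=3.0, y_prev=9.0)
naive = 1.0 + 2.0 * 4.0  # forecast that ignores the lagged residual
print(forecast, naive)
```

The forecast exceeds the naive fitted value by ρ times the lagged residual, which is the sense in which the deviation carries over "but not by as much."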
9) Your textbook only analyzed the case of an errors-in-variables bias of the type $\tilde{X}_i = X_i + w_i$. What if the
error were generated in the simple regression model by entering data that always contained the same
typographical error, say $\tilde{X}_i = X_i + a$ or $\tilde{X}_i = bX_i$, where a and b are constants? What effect would this have on
your regression model?
Answer: This would have an effect similar to changing the units of measurement. The measurement
error is not random here, and the bias can be determined exactly.
For the case $\tilde{X}_i = X_i + a$, the slope will be unaffected and the usual properties for the OLS slope estimator
will hold. However, since $\bar{\tilde{X}} = \bar{X} + a$ and $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{\tilde{X}} = (\bar{Y} - \hat{\beta}_1 \bar{X}) - \hat{\beta}_1 a$, the intercept will be reduced by the
constant measurement error times the slope.
For the case $\tilde{X}_i = bX_i$, the intercept is unaffected, but the slope is divided by b, so the ratio of the estimated slope with measurement
error to the slope without measurement error is 1/b.
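Both cases are exact algebraic identities of OLS, so they can be verified on any small data set. The sketch below uses made-up data and hypothetical constants a = 3 and b = 2; none of these numbers come from the textbook.

```python
def ols(x, y):
    """Return (intercept, slope) from a regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

# Made-up data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = 3.0, 2.0  # hypothetical constant shift and constant scale factor

b0, b1 = ols(x, y)
b0_shift, b1_shift = ols([xi + a for xi in x], y)   # regressor entered as X + a
b0_scale, b1_scale = ols([b * xi for xi in x], y)   # regressor entered as b * X

print(b0, b1)
print(b0_shift, b1_shift)  # same slope, intercept lowered by b1 * a
print(b0_scale, b1_scale)  # same intercept, slope divided by b
```

Unlike random measurement error, neither distortion breaks the fit of the regression: the fitted values are identical in all three regressions, and only the coefficients are relabeled.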
10) Explain why the OLS estimator for the slope in the simple regression model is still unbiased, even if
there is correlation of the error term across observations.
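Answer: The proof for unbiasedness is presented in Appendix 4.3 of the textbook. There
$\hat{\beta}_1 = \beta_1 + \frac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X}) u_i}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2}$, and $E(\hat{\beta}_1) = \beta_1 + E\left[ \frac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X}) u_i}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2} \right]$.
Given the law of iterated expectations, this becomes
$E(\hat{\beta}_1) = \beta_1 + E\left[ \frac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X}) E(u_i \mid X_1, \ldots, X_n)}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2} \right]$,
and the second term vanishes due to the least squares assumptions of independence between the error
term and the regressor. The assumption of correlation of the error term across observations has not
entered into the proof. However, it will play a role in the derivation of standard errors.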
11) To analyze the situation of simultaneous causality bias, consider the following system of equations:
Yi = β0 + β1Xi + ui
$X_i = \gamma_0 + \gamma_1 Y_i + v_i$
Demonstrate the negative correlation between Xi and ui for γ1 < 0, either through mathematics or by
presenting an argument which starts as follows: "Imagine that ui is negative."
Answer: The mathematical derivation of the correlation is given in footnote 3 of Chapter 7 in the
textbook. Setting γ1 < 0 results in a negative correlation between Xi and ui. A negative shock to the first
equation yields a lower Y. This in turn increases X in the second equation. Hence there is a negative
correlation between Xi and ui.
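The sign of the correlation can also be checked numerically. Solving the two equations for X gives $X_i = (\gamma_0 + \gamma_1 \beta_0 + \gamma_1 u_i + v_i) / (1 - \gamma_1 \beta_1)$, so with uncorrelated u and v the covariance is $\gamma_1 \sigma_u^2 / (1 - \gamma_1 \beta_1)$. The sketch below simulates this under assumed parameter values (β1 = 0.5, γ1 = -0.8, unit-variance normal errors), all of which are hypothetical.

```python
import random

def simulate_cov_xu(beta1, gamma1, n=20000, seed=1, beta0=1.0, gamma0=2.0):
    """Sample covariance between X and u in the two-equation system."""
    rng = random.Random(seed)
    xs, us = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)
        v = rng.gauss(0, 1)
        # Reduced form for X, obtained by substituting the Y equation into the X equation.
        x = (gamma0 + gamma1 * beta0 + gamma1 * u + v) / (1 - gamma1 * beta1)
        xs.append(x)
        us.append(u)
    mx, mu = sum(xs) / n, sum(us) / n
    return sum((x - mx) * (u - mu) for x, u in zip(xs, us)) / (n - 1)

def theoretical_cov_xu(beta1, gamma1, sigma_u=1.0):
    """cov(X, u) = gamma1 * var(u) / (1 - gamma1 * beta1), assuming cov(u, v) = 0."""
    return gamma1 * sigma_u ** 2 / (1 - gamma1 * beta1)

print(simulate_cov_xu(0.5, -0.8), theoretical_cov_xu(0.5, -0.8))
```

With γ1 < 0 and 1 - γ1β1 > 0 the covariance is negative, matching the verbal argument.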
12) Think of three different economic examples where cross-sectional data could be collected. Indicate in
each of these cases how you would check if the analysis is externally valid.
Answer: Answers will differ by student. Using U.S. state data to analyze determinants of unemployment
or the effect of minimum wages on employment-population ratios, and using a sample of Canadian
provinces, or other subnational geographical units, may be mentioned. Similarly cross-country
comparisons to test convergence in per capita income could be compared to results within countries.
Given the textbook example, test scores in elementary schools within one state may be validated by using
data from another state.
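13) The textbook derived the following result: $\hat{\beta}_1 \xrightarrow{p} \frac{\sigma_X^2}{\sigma_X^2 + \sigma_w^2} \beta_1$. Show that this is the same as
$\hat{\beta}_1 \xrightarrow{p} \left( 1 - \frac{\sigma_w^2}{\sigma_w^2 + \sigma_X^2} \right) \beta_1$.
Answer: $\frac{\sigma_X^2}{\sigma_X^2 + \sigma_w^2} \beta_1 = \frac{\sigma_X^2 + \sigma_w^2 - \sigma_w^2}{\sigma_X^2 + \sigma_w^2} \beta_1 = \left( \frac{\sigma_X^2 + \sigma_w^2}{\sigma_X^2 + \sigma_w^2} - \frac{\sigma_w^2}{\sigma_X^2 + \sigma_w^2} \right) \beta_1 = \left( 1 - \frac{\sigma_w^2}{\sigma_X^2 + \sigma_w^2} \right) \beta_1$.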
14) Your textbook has analyzed simultaneous equation systems in the case of two equations,
Yi = β0 + β1Xi + ui
$X_i = \gamma_0 + \gamma_1 Y_i + v_i$,
where the first equation might be the labor demand equation (with capital stock and technology being
held constant), and the second the labor supply equation (X being the real wage, and the labor market
clears). What if you had a production function as the third equation
$Z_i = \delta_0 + \delta_1 Y_i + w_i$
where Z is output. If the error terms, u, v, and w, were pairwise uncorrelated, explain why there would be
no simultaneous causality bias when estimating the production function using OLS.
Answer: Although the above system represents three equations in three unknowns, it is "block-
recursive," meaning that X and Y (the real wage and employment) are completely determined by the first
two equations and independently of the production function (Z). Given the solution for employment (Y),
the third equation solely determines output (Z).
Put differently, if there was a positive shock to the production function, which would result in higher
output, then this would have no effect on employment (Y), and there would therefore be no feedback into
the production function. Hence the error term in the third equation is not correlated with the regressor.
15) A professor in your microeconomics lectures derived a labor demand curve in the lecture. Given some
reasonable assumptions, she showed that the demand for labor depends negatively on the real wage. You
want to put this hypothesis to the test ("show me") and collect data on employment and real wages for a
certain industry. You try to estimate the labor demand curve but find no relationship between the two
variables. Is economic theory wrong? Explain.
Answer: This is a case of simultaneous causality. Since there is a supply of labor as well, the real wage
depends on employment, which, in a market-clearing model, is determined by the intersection of supply
and demand. In a Keynesian world with wait unemployment, you would expect a negative relationship
between real wages and employment, given the capital stock and productivity.
16) Your textbook uses the following example of simultaneous causality bias of a two equation system:
Yi = β0 + β1Xi + ui
$X_i = \gamma_0 + \gamma_1 Y_i + v_i$
To be more specific, think of the first equation as a demand equation for a certain good, where Y is the
quantity demanded and X is the price. The second equation then represents the supply equation, with a
third equation establishing that demand equals supply. Sketch the market outcome over a few periods
and explain why it is impossible to identify the demand and supply curves in such a situation. Next
assume that an additional variable enters the demand equation: income. In a new graph, draw the initial
position of the demand and supply curves and label them D0 and S0. Now allow for income to take on
four different values and sketch what happens to the two curves. Is there a pattern that you see which
suggests that you might be able to identify one of the two equations with real-life data?
Answer: You only observe market outcomes (the intersection of the demand and supply curves). Fitting a
regression line through these points gives you neither the supply curve nor the demand curve,
and hence neither is identified.
The market outcome now generates five observations at the intersection of the two curves. Fitting a line
through the five points will give an estimate of the supply curve. Hence by shifting the demand curve in
this fashion, you can identify the supply curve.
17) Give at least three examples where you could envision errors-in-variables problems. For the case
where the measurement error occurs only for the explanatory variable in the simple regression case,
derive $\hat{\beta}_1 \xrightarrow{p} \frac{\sigma_X^2}{\sigma_X^2 + \sigma_w^2} \beta_1$.
Answer: Answers will vary by student. Consumption functions are frequently mentioned, where
permanent consumption is proportional to permanent income, both of which differ from actual measures
of consumption and income through transitory components. There are several examples in this chapter of
the test bank where the underlying measure of the regressor is proxied by previous outcomes
(unemployment rates, weather, height, etc.). Students may feel that responses to surveys result in
measurement error, e.g., when people respond to questions regarding their income, their SAT score, and
so forth.
18) Your textbook states that correlation of the error term across observations "will not happen if the data
are obtained by sampling at random from the population." However, in one famous study of the electric
utility industry, the observations were listed by the size of the output level, from smallest to largest. The
pattern of the residuals was as shown in the figure.
19) Consider a situation where Y is related to X in the following manner: $Y_i = \beta_0 \times X_i^{\beta_1} \times e^{u_i}$. Draw the
deterministic part of the above function. Next add, in the same graph, a hypothetical Y, X scatterplot of
the actual observations. Assume that you have misspecified the functional form of the regression function
and estimated the relationship between Y and X using a linear regression function. Add this linear
regression function to your graph. Separately, show what the plot of the residuals against the X variable
in your regression would look like.
Answer: See the accompanying graphs.
20) In macroeconomics, you studied the equilibrium in the goods and money market under the
assumption of prices being fixed in the very short run. The goods market equilibrium was described by
the so-called IS equation
Ri = β0 – β1Yi + ui
where R represented the nominal interest rate and Y was real GDP. β0 contained variables determined
outside the system, such as government expenditures, taxes, and inflationary expectations.
The money market equilibrium was described by the LM equation
$R_i = \gamma_0 + \gamma_1 Y_i + v_i$
and γ0 contained the real money supply and the intercept from the money demand equation.
$N^d = \beta_0 + \beta_1 (W/P) + u$
$N^s = \gamma_0 + \gamma_1 (W/P) + v$
$N^d = N^s = N$
where N is employment, (W/P) is the real wage in the labor market, and u and v are determinants other
than the real wage which affect labor demand and labor supply (respectively). Let
Assume that you had collected data on employment and the real wage from a random sample of
observations and estimated a regression of employment on the real wage (employment being the
regressand and the real wage being the regressor). It is easy but tedious to show that
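$\hat{\beta}_1 \xrightarrow{p} \beta_1 + (\gamma_1 - \beta_1) \frac{\sigma_u^2}{\sigma_u^2 + \sigma_v^2}$, where $(\gamma_1 - \beta_1) > 0$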
since the slope of the labor supply function is positive and the slope of the labor demand function is
negative. Hence, in general, you will not find the correct answer even in large samples.
a. What is this bias referred to as?
b. What would the relationship between the variance of the labor supply/demand shift variable have to
be for the bias to disappear?
c. Give an intuitive answer why the bias would disappear in that situation. Draw a graph to illustrate
your argument.
Answer:
a. Simultaneous equations bias
b. The variance of v, the shift variable of the labor supply curve, would have to be substantially larger
compared to the variance of the labor demand shift variable.
c. Take the extreme case where the labor demand curve hardly shifts at all, but there are large changes in
the labor supply curve caused by the shift variable v. In that case, the labor supply curve would "trace out"
the labor demand curve. Since in real life you only observe the intersection of the demand and supply
relationship, it becomes clear now why the simultaneous equation bias has been removed.
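The role of the two variances can be made concrete with a short calculation. The sketch assumes the large-sample limit of the OLS slope is $\beta_1 + (\gamma_1 - \beta_1) \sigma_u^2 / (\sigma_u^2 + \sigma_v^2)$, which is what solving the market-clearing system gives when u and v are uncorrelated. The slopes used below (demand slope -1, supply slope 1) and the function name are hypothetical.

```python
def plim_ols_slope(beta1, gamma1, var_u, var_v):
    """Large-sample OLS slope from regressing employment on the real wage.

    beta1: slope of labor demand, gamma1: slope of labor supply.
    var_u, var_v: variances of the demand and supply shift variables,
    assumed to be uncorrelated with each other.
    """
    return beta1 + (gamma1 - beta1) * var_u / (var_u + var_v)

beta1, gamma1 = -1.0, 1.0  # hypothetical demand and supply slopes
for var_v in (1.0, 10.0, 100.0, 10000.0):
    # As supply shocks dominate, the limit approaches the demand slope.
    print(var_v, plim_ols_slope(beta1, gamma1, var_u=1.0, var_v=var_v))
```

When the two variances are equal the limit is a simple average of the two slopes, and as the variance of v grows the limit moves toward the labor demand slope, which is the "tracing out" argument in numerical form.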
22) To compare the slope coefficient from the California School data set with that of the Massachusetts
School data set, you run the following two regressions:
$\widehat{TestScore}_{CA} = 2.35 - 0.123 \times STR_{CA}$
(0.54) (0.027)
n = 420, $R^2$ = 0.051, SER = 0.98
$\widehat{TestScore}_{MA} = 1.97 - 0.114 \times STR_{MA}$
(0.57) (0.033)
n = 220, $R^2$ = 0.067, SER = 0.97
Numbers in parenthesis are heteroskedasticity-robust standard errors, and the LHS variable has been
standardized.
Calculate a t-statistic to test whether or not the two coefficients are the same. State the alternative
hypothesis. Which level of significance did you choose?
Answer: H0: β1,CA = β1,MA; H1: β1,CA ≠ β1,MA; $|t| = \frac{|-0.123 - (-0.114)|}{\sqrt{0.027^2 + 0.033^2}} = 0.21$. Hence you cannot reject
the null hypothesis at any reasonable level of significance. The underlying assumption here is that the
two samples are independent, which seems reasonable.
23) You have read the analysis in chapter 9 and want to explore the relationship between poverty and test
scores. You decide to start your analysis by running a regression of test scores on the percent of students
who are eligible to receive a free/reduced price lunch both in California and in Massachusetts. The results
are as follows:
$\widehat{TestScore}_{CA} = 681.44 - 0.610 \times PctLch_{CA}$
(0.99) (0.018)
n = 420, $R^2$ = 0.75, SER = 9.45
$\widehat{TestScore}_{MA} = 731.89 - 0.788 \times PctLch_{MA}$
(0.95) (0.045)
n = 220, $R^2$ = 0.61, SER = 9.41
a. Calculate a t-statistic to test whether or not the two slope coefficients are the same.
b. Your textbook compares the slope coefficients for the student-teacher ratio instead of the percent
eligible for a free lunch. The authors remark: "Because the two standardized tests are different, the
coefficients themselves cannot be compared directly: One point on the Massachusetts test is not the same
as one point on the California test." What solution do they suggest?
Answer:
a. H0: β1,CA = β1,MA; H1: β1,CA ≠ β1,MA; $t = \frac{-0.610 - (-0.788)}{\sqrt{0.018^2 + 0.045^2}} = 3.67$. Hence you reject the null
hypothesis.
b. The authors suggest standardizing the test score variable in both states by subtracting the mean and by
dividing by the standard deviation.
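The t-statistics for comparing coefficients across the two states can be reproduced with a short helper. This is a sketch that assumes the California and Massachusetts samples are independent, so that the variance of the difference is the sum of the squared standard errors; the function name is illustrative.

```python
import math

def t_stat_difference(b_a, se_a, b_b, se_b):
    """t-statistic for H0: the two coefficients are equal.

    Assumes the two estimates come from independent samples, so the
    variance of their difference is the sum of the two variances.
    """
    return (b_a - b_b) / math.sqrt(se_a ** 2 + se_b ** 2)

# Student-teacher ratio slopes (standardized test scores), CA versus MA.
t_str = t_stat_difference(-0.123, 0.027, -0.114, 0.033)

# Percent eligible for subsidized lunch slopes, CA versus MA.
t_lunch = t_stat_difference(-0.610, 0.018, -0.788, 0.045)

print(round(t_str, 2), round(t_lunch, 2))
```

For a two-sided test only the absolute value matters: it is about 0.21 for the student-teacher ratio and about 3.67 for the lunch variable, the latter exceeding the 1% critical value of 2.58.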