Chapter 6 - Endogeneity

Endogeneity models, master's in economics, UP

Chapter 6: Endogeneity

Set of Exercises
Exercise 1. Regression towards the mean
In a famous exercise (“Regression towards Mediocrity in Hereditary Stature”, 1886), Galton
regressed the heights of a sample of sons on the heights of their fathers. He found a coefficient
strictly less than 1. He interpreted this as evidence of “regression towards the mean”.
Regression fallacy: Let us first consider the model:
$$y_{is} = \alpha + \beta y_{i,s-1} + u_{is}, \quad s = 1, \dots, S, \quad i = 1, \dots, N$$
where s indexes generations (grandfather, father, son) and i indexes individuals.
We assume that, for all s, $E(u_{is} \mid y_{1,s-1}, \dots, y_{N,s-1}) = 0$. Moreover, the variance of $u_{is}$ is constant and equal to $\sigma^2$.
Lastly, we denote $\bar{y}_s = \frac{1}{N} \sum_{i=1}^{N} y_{is}$.

1. Compute $E(y_{is} - \bar{y}_s \mid y_{1,s-1}, \dots, y_{N,s-1})$ as a function of $y_{1,s-1}, \dots, y_{N,s-1}$, and give a precise meaning to Galton's interpretation.

2. Compute $\mathrm{Var}(y_{is})$ as a function of $\mathrm{Var}(y_{i,s-1})$, for given s. What happens to the variance of heights over generations? Is Galton's interpretation necessarily correct?
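As a numerical companion to question 2, the sketch below simulates the model over several generations and tracks the cross-sectional variance of heights. Since $u_{is}$ is uncorrelated with $y_{i,s-1}$, the variance should follow $\mathrm{Var}(y_{is}) = \beta^2 \mathrm{Var}(y_{i,s-1}) + \sigma^2$. All numerical values (`BETA`, `ALPHA`, `SIGMA`, the initial distribution, the sample size) are illustrative assumptions, not taken from the exercise.

```python
import random
import statistics

# Illustrative values only (assumptions, not from the exercise).
BETA, ALPHA, SIGMA = 0.5, 85.0, 5.0


def simulate_variances(n=20000, generations=8, seed=0):
    """Cross-sectional variance of heights, generation by generation."""
    rng = random.Random(seed)
    # Generation 0: a deliberately dispersed, hypothetical starting population.
    heights = [rng.gauss(170.0, 20.0) for _ in range(n)]
    variances = [statistics.pvariance(heights)]
    for _ in range(generations):
        # y_is = alpha + beta * y_{i,s-1} + u_is
        heights = [ALPHA + BETA * h + rng.gauss(0.0, SIGMA) for h in heights]
        variances.append(statistics.pvariance(heights))
    return variances


variances = simulate_variances()
limit = SIGMA ** 2 / (1 - BETA ** 2)  # fixed point of V = beta^2 * V + sigma^2
for s, v in enumerate(variances):
    print(f"generation {s}: variance = {v:.2f}")
print(f"fixed point sigma^2 / (1 - beta^2) = {limit:.2f}")
```

In this sketch the variance settles near a strictly positive value rather than shrinking to zero, even though the slope is below 1.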

A model of intergenerational transmission: In the rest of this exercise, we shall consider the alternative model:
$$y_i = \delta_1 + z_i + \varepsilon_i$$
$$x_i = \delta_2 + z_i + \eta_i$$
In this model we focus on two generations: fathers (height $x_i$) and sons (height $y_i$).
The $z_i$ are iid with finite variance, and $\varepsilon_i$ and $\eta_i$ are iid$(0, \sigma^2)$ and uncorrelated with each other. Moreover:
$$E(\varepsilon_i z_i) = E(\eta_i z_i) = 0$$

1. Interpret the variable zi .

2. Compute the linear projection of yi on xi and a constant. Let β̃ denote the OLS estimator of its slope coefficient.

3. Show that:
$$0 \le \operatorname*{plim}_{N \to \infty} \tilde{\beta} \le 1$$

Under which conditions are these inequalities strict?
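The transmission model above can be checked by simulation. The sketch below draws fathers and sons from the model and compares the OLS slope of $y_i$ on $x_i$ with the ratio $\mathrm{Var}(z_i) / (\mathrm{Var}(z_i) + \sigma^2)$. The intercepts, variances, sample size, and the normality of all draws are assumptions made for illustration.

```python
import random


def ols_slope(x, y):
    """Slope of the regression of y on x and a constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_slope(var_z, sigma2, n=50000, seed=1):
    """Simulate y = d1 + z + eps, x = d2 + z + eta and return the OLS slope."""
    rng = random.Random(seed)
    sd_z, sd_e = var_z ** 0.5, sigma2 ** 0.5
    x, y = [], []
    for _ in range(n):
        z = rng.gauss(0.0, sd_z)                  # shared family component
        y.append(175.0 + z + rng.gauss(0.0, sd_e))  # son
        x.append(174.0 + z + rng.gauss(0.0, sd_e))  # father
    return ols_slope(x, y)


slope = simulate_slope(var_z=30.0, sigma2=10.0)
ratio = 30.0 / (30.0 + 10.0)
print(f"simulated slope = {slope:.3f}, Var(z) / (Var(z) + sigma^2) = {ratio:.3f}")
```

Changing `sigma2` to 0, or `var_z` to 0, is a quick way to explore the boundary cases asked about in question 3.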

A modern researcher estimates the following regression for a sample of N countries indexed
by i:
$$\Delta \ln GDP_{it} = \delta + \beta \ln GDP_i^{init} + u_i$$
where $\Delta \ln GDP_{it}$ is the average growth rate of GDP of country i over a period, and $\ln GDP_i^{init}$ is (log) GDP at the beginning of the period. She obtains a negative OLS estimate of β and concludes that this is evidence of convergence across countries. What is your opinion?
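One way to form an opinion is to run the growth regression on data where, by construction, there is no convergence. The sketch below reuses the transmission model of this exercise for countries: log GDP at each date equals a permanent component plus an independent transitory shock, so the cross-country variance is the same at both dates. This data-generating process and its parameter values are assumptions for illustration, not part of the exercise.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope of the regression of y on x and a constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(2)
initial, final = [], []
for _ in range(20000):
    z = rng.gauss(0.0, 1.0)                   # permanent component of log GDP
    initial.append(z + rng.gauss(0.0, 0.5))   # log GDP at the start of the period
    final.append(z + rng.gauss(0.0, 0.5))     # log GDP at the end of the period

growth = [f - i for f, i in zip(final, initial)]
beta_hat = ols_slope(initial, growth)
var_ratio = statistics.pvariance(final) / statistics.pvariance(initial)
print(f"slope of growth on initial log GDP = {beta_hat:.3f}")
print(f"Var(final) / Var(initial) = {var_ratio:.3f}")
```

In this sketch the estimated slope is negative even though the dispersion of log GDP does not change between the two dates.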

Exercise 2. Returns to schooling


Let yi be the logarithm of the (hourly) wage of individual i, and let xi be her education level.
We wish to compute the causal effect of xi on yi .
Endogeneity: We start by assuming the following model:
$$\begin{aligned}
y_i &= \delta + \alpha x_i + u_i \\
x_i &= \mu + \beta z_i + v_i \\
u_i &= z_i + \eta_i
\end{aligned} \qquad (1)$$
In this system, $z_i$ is unobserved and iid with variance $\sigma_z^2$. The errors $v_i$ and $\eta_i$ are iid, uncorrelated, and have zero mean and variances $\sigma_v^2$ and $\sigma_\eta^2$, respectively. Lastly, $E(v_i z_i) = E(\eta_i z_i) = 0$.
1. Interpret this structural model. What interpretation do you give to zi ?
2. Let $\hat{\alpha}$ be the coefficient of the linear projection of $y_i$ on $x_i$. Compute:
$$\operatorname*{plim}_{N \to \infty} \hat{\alpha} - \alpha$$

Do you expect this quantity to have a particular sign?


3. In the following list of variables, select the ones which are likely to be valid instruments:
IQ, Distance to college, Profession of the father, Monthly wage, Hours worked, Month
of birth.

4. Given one instrument, explain how to consistently estimate α.

A researcher does just that (with a valid instrument), and finds an estimate:
α̃ ≈ 1.2α̂
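The comparison between the OLS and instrumental-variable estimates can be sketched by simulation. Model (1) does not name an instrument, so the code below adds a hypothetical variable `w` that shifts schooling and is independent of $z_i$, $v_i$ and $\eta_i$. The coefficient on `w`, the value `ALPHA`, the unit variances, and the normal draws are all assumptions. The IV estimate is computed as the ratio of sample covariances $\widehat{\mathrm{Cov}}(w, y) / \widehat{\mathrm{Cov}}(w, x)$.

```python
import random


def cov(a, b):
    """Sample covariance of two equally long lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / n


ALPHA = 0.1  # illustrative causal effect (assumption)
rng = random.Random(3)
w, x, y = [], [], []
for _ in range(50000):
    z = rng.gauss(0.0, 1.0)    # unobserved component z_i
    wi = rng.gauss(0.0, 1.0)   # hypothetical instrument, not in model (1)
    xi = 12.0 + 1.0 * z + 1.0 * wi + rng.gauss(0.0, 1.0)
    ui = z + rng.gauss(0.0, 1.0)
    w.append(wi)
    x.append(xi)
    y.append(1.0 + ALPHA * xi + ui)

alpha_ols = cov(x, y) / cov(x, x)
alpha_iv = cov(w, y) / cov(w, x)
print(f"true alpha = {ALPHA}, OLS = {alpha_ols:.3f}, IV = {alpha_iv:.3f}")
```

Under these assumptions the OLS coefficient lies above `ALPHA` while the IV ratio stays close to it, which is the opposite ordering from the researcher's finding.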
Endogeneity and measurement error: To try to explain this result, we consider a second
(augmented) model:
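$$\begin{aligned}
y_i &= \delta + \alpha x_i + u_i \\
x_i &= \mu + \beta z_i + v_i \\
u_i &= z_i + \eta_i \\
\tilde{y}_i &= y_i + \varepsilon_i \\
\tilde{x}_i &= x_i + \nu_i
\end{aligned} \qquad (2)$$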

In addition to the assumptions in (1), it is assumed here that $\varepsilon_i$ and $\nu_i$ are iid with zero mean, uncorrelated, and that $E(\varepsilon_i z_i) = E(\varepsilon_i v_i) = E(\nu_i z_i) = E(\nu_i v_i) = E(\nu_i \eta_i) = 0$. In addition, $\nu_i$ has variance $\sigma_\nu^2$. Importantly, $\tilde{y}_i$ and $\tilde{x}_i$ are the only observed variables in this model.

1. Interpret this model.

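2. Let $\hat{\alpha}$ be the coefficient of the linear projection of the observed $\tilde{y}_i$ on the observed $\tilde{x}_i$. Compute:

$$\operatorname*{plim}_{N \to \infty} \hat{\alpha} - \alpha$$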

3. Does the strategy of the first paragraph yield a consistent estimator of α in this case?

4. In the dataset that is used in this exercise, it is thought that measurement error
accounts for around 10% of the variance of education (measured by the “number of
years of schooling” variable). Your opinion on the measurement error explanation?
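To gauge orders of magnitude for question 4, the sketch below simulates model (2) and reports the ratio of the IV estimate to the OLS estimate when measurement error accounts for 10% of the variance of observed education. As before, the instrument `w` is hypothetical and not part of the model, and every numerical value is an assumption (in particular `alpha=1.0` is chosen only to keep simulation noise small). With classical measurement error and no ability bias, OLS converges to $0.9\alpha$, so the ratio should be close to $1/0.9 \approx 1.11$.

```python
import random


def cov(a, b):
    """Sample covariance of two equally long lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / n


def iv_over_ols(ability_loading, noise_share, alpha=1.0, n=60000, seed=4):
    """Ratio IV / OLS in model (2), using only the mismeasured variables."""
    rng = random.Random(seed)
    var_x = ability_loading ** 2 + 2.0  # Var(beta*z + w + v) with unit variances
    # noise_share = Var(nu) / Var(x_tilde), hence Var(nu) = share/(1-share) * Var(x)
    sd_noise = (noise_share / (1.0 - noise_share) * var_x) ** 0.5
    w, x_obs, y_obs = [], [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        wi = rng.gauss(0.0, 1.0)  # hypothetical instrument (assumption)
        xi = ability_loading * z + wi + rng.gauss(0.0, 1.0)
        yi = alpha * xi + z + rng.gauss(0.0, 1.0)
        w.append(wi)
        x_obs.append(xi + rng.gauss(0.0, sd_noise))  # x_tilde = x + nu
        y_obs.append(yi + rng.gauss(0.0, 0.3))       # y_tilde = y + eps
    ols = cov(x_obs, y_obs) / cov(x_obs, x_obs)
    iv = cov(w, y_obs) / cov(w, x_obs)
    return iv / ols


ratio_no_ability = iv_over_ols(ability_loading=0.0, noise_share=0.10)
ratio_with_ability = iv_over_ols(ability_loading=0.5, noise_share=0.10)
print(f"IV / OLS, measurement error only:      {ratio_no_ability:.3f}")
print(f"IV / OLS, measurement error + ability: {ratio_with_ability:.3f}")
```

These ratios can be set against the reported value of 1.2 when discussing whether measurement error alone is a plausible explanation.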

Exercise 3. Weak instruments


Consider the following model:
yi = βxi + ui
where β and xi are scalar.
Let $z_i$ be a scalar such that
$$x_i = \gamma z_i + v_i$$
We assume that the data are iid, that $\mathrm{Var}(z_i) \neq 0$, that $E(v_i \mid z_i) = 0$ and that $E(u_i \mid z_i) = 0$.
Let β̂ be the 2SLS estimator of β in the regression of yi on xi , using zi as instrument.
Give the expression of β̂.
Strong instrument: We first assume that $\gamma \neq 0$ is a constant.

1. Prove that $E(z_i x_i) \neq 0$.

2. Prove that β̂ tends to β in probability when N tends to infinity.

Weak instrument: We now assume instead that $\gamma = 1/\sqrt{N}$.

1. Show that:
$$\operatorname*{plim}_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} z_i x_i = 0$$

2. Show that:
$$\frac{1}{\sqrt{N}} \sum_{i=1}^{N} z_i x_i \xrightarrow{d} N(\mu, V)$$

where you will specify the values of µ and V .

3. Show that:
$$\hat{\beta} - \beta \xrightarrow{d} (E(z_i^2) + a)^{-1} b$$

where a and b are jointly normally distributed with mean zero and covariance matrix:
$$S = E(g_i g_i'), \quad \text{where } g_i = \begin{pmatrix} z_i u_i \\ z_i v_i \end{pmatrix}$$

Hint: apply the CLT for a vector of random variables. You may also use that $z_n \xrightarrow{d} z$ implies $a(z_n) \xrightarrow{d} a(z)$ under suitable regularity conditions on the function a.

4. Is β̂ consistent?
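The contrast between the two cases can be explored with a small Monte Carlo sketch. It computes the just-identified IV estimator $\hat{\beta} = \sum_i z_i y_i / \sum_i z_i x_i$ (no constant, as in the model above) and reports the median absolute error over replications, once with $\gamma = 1$ and once with $\gamma = 1/\sqrt{N}$. The normal draws, the correlation between $u_i$ and $v_i$, the value of `beta`, and the sample sizes are assumptions made for illustration.

```python
import random
import statistics


def iv_estimate(n, gamma, rng, beta=1.0):
    """Just-identified IV estimate: sum(z*y) / sum(z*x)."""
    szy = szx = 0.0
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        v = rng.gauss(0.0, 1.0)
        u = 0.5 * v + rng.gauss(0.0, 1.0)  # u correlated with v: x is endogenous
        x = gamma * z + v
        y = beta * x + u
        szy += z * y
        szx += z * x
    return szy / szx


def median_abs_error(n, weak, reps=200, seed=0, beta=1.0):
    """Median of |beta_hat - beta| over independent replications."""
    rng = random.Random(seed)
    gamma = n ** -0.5 if weak else 1.0
    errors = [abs(iv_estimate(n, gamma, rng, beta) - beta) for _ in range(reps)]
    return statistics.median(errors)


results = {}
for n in (100, 900):
    results[n] = (median_abs_error(n, weak=False), median_abs_error(n, weak=True))
    print(f"N = {n}: strong = {results[n][0]:.3f}, weak = {results[n][1]:.3f}")
```

The median is used instead of the mean because the ratio can take extreme values when the denominator is close to zero.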

Exercise 4. Class size, test scores and Maimonides’ rule


Read the paper by Angrist and Lavy (QJE, 1999). Then answer the following questions.

1. What is the parameter of interest in the paper?

2. Why not simply regress test scores on class size? Explain.

3. Let yi be a test score of child i, ci be class size, ei be enrollment at the beginning of the
academic year, and let c̃i = f (ei ) be the class size as predicted by ei and Maimonides’
rule. Draw function f , up to ei = 121 children.

4. Why are ci and c̃i not perfectly correlated?

5. Are ci and c̃i strongly correlated? You may refer to the introduction and the Figures
in the paper.

6. I propose to use c̃i as an instrument in the regression:

$$y_i = x_i' \alpha + \beta c_i + u_i$$

where xi includes individual and school characteristics, but does not include ei . What
is the main assumption I am making? Comment.

7. Instead, Angrist and Lavy use c̃i as an instrument in:

$$y_i = x_i' \alpha + \gamma e_i + \beta c_i + u_i$$

Is the instrument relevant?

8. Do you see a potential identification problem with including ei in the regression (through
the term γei )? Show that the graphs suggest that the parameter of interest is identified.

Hint: You may want to read the last part of the introduction again.
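For question 3, the predicted class size can be tabulated before drawing it. The sketch below assumes the rule as it is usually stated for this paper: a class may not exceed 40 pupils, and enrollment is split evenly across the minimum number of classes, so that $f(e_i) = e_i / (\lfloor (e_i - 1)/40 \rfloor + 1)$. The function name `predicted_class_size` and the `cap` parameter are illustrative choices.

```python
def predicted_class_size(enrollment, cap=40):
    """Class size implied by a maximum of `cap` pupils per class."""
    if enrollment < 1:
        raise ValueError("enrollment must be a positive integer")
    n_classes = (enrollment - 1) // cap + 1  # smallest number of classes allowed
    return enrollment / n_classes


for e in (1, 40, 41, 80, 81, 120, 121):
    print(f"enrollment = {e:3d} -> predicted class size = {predicted_class_size(e):.2f}")
```

Evaluating the function at every integer from 1 to 121 gives the points needed for the plot, including the drops just after each multiple of 40.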
