Assignment 2
Assignment 2. Endogeneity: IV
M Shahin Sarwar
MBA 2024
Instructions
1) The assignment should be solved in groups of four students; you choose your own colleagues.
2) It is highly recommended that the exercises be solved using Stata.
3) Your reports should include how you achieved the results as well as an interpretation of the
results.
4) Use the format Times New Roman 12pt. and 1.5 spacing, and write a maximum of 15 pages
including the tables and the figures.
Instrumental Variable
Corporate social responsibility and firm performance
Use Csr.dta (data cannot be used outside the course!)
Variable definitions:
Alpha (dependent variable): Abnormal stock performance measured using daily CAPM data
within each quarter.
sum_csr (Main variable of interest): This is a firm level measure which presents the aggregate
number of CSR keywords mentioned in the media within each firm quarter.
Mean_sum_csr (Instrument): average sum_csr in the industry group in each period.
Control variables:
Leverage: The ratio of debt to total assets (book values) of the firm at year-end.
ROA: The return on total assets (book values) of the firm at year-end.
Liquidity_mv: The average trading volume of the firm’s equity divided by its market value at
quarter-end.
Mktcap: The natural logarithm of the market capitalization of the firm at quarter-end used as a
proxy for the firm size
Vol: The standard deviation of the firm’s daily equity returns during a quarter.
Tangibility: The ratio of the firm’s fixed assets to total assets (book values) at year-end.
Isin_id: Stock id.
Period: Data cover quarterly observations between 2010 and 2014. Period takes the value of
21 for the first quarter of 2010 and stretches until 36 for the last quarter of 2014 (The period is
a selection from a larger dataset).
Questions
1) Summarize the data and run a Pearson correlation test on the variables. Explain your results.
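One possible way to approach the summary and correlation step is sketched below. It assumes that Csr.dta sits in the current working directory and that the variable names are lowercase, as in the Stata code given later in this handout; adjust both if your copy of the data differs.

```stata
* Load the course data (assumption: Csr.dta is in the working directory)
use Csr.dta, clear

* Descriptive statistics: dependent variable, main regressor, instrument, controls
summarize alpha sum_csr mean_sum_csr leverage roa liquidity_mv mktcap vol tangibility

* Pairwise Pearson correlations; sig prints p-values, star(0.05) flags 5% significance
pwcorr alpha sum_csr mean_sum_csr leverage roa liquidity_mv mktcap vol tangibility, ///
    sig star(0.05)
```

Note that pwcorr uses all available observations for each pair of variables, so the number of observations can differ across cells if some variables have missing values.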
3) Using the ivreg2 command, estimate the model in Question 2 by two-stage least squares
(hereinafter 2SLS), with Mean_sum_csr as the instrument and robust standard errors. Describe
the results and contrast them with the OLS results from Question 2. (Note: you may have
to install the ivreg2 command.) The “first” option in ivreg2 displays the first
stage of the regression. Interpret the coefficient on Mean_sum_csr in the first stage. What are
the assumptions behind 2SLS? Explain.
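A minimal sketch of the 2SLS estimation described in Question 3 follows. It assumes lowercase variable names, as in the Stata code given later in this handout, and that ivreg2 and its companion package ranktest are installed from SSC.

```stata
* One-time installation of the user-written commands from SSC
ssc install ranktest, replace
ssc install ivreg2, replace

* 2SLS: sum_csr is instrumented by mean_sum_csr; the controls enter as
* exogenous regressors. robust requests heteroskedasticity-robust standard
* errors, and first displays the first-stage regression.
ivreg2 alpha leverage roa liquidity_mv mktcap vol tangibility ///
    (sum_csr = mean_sum_csr), robust first
```

The first-stage output reports the coefficient on mean_sum_csr together with the weak-identification diagnostics that ivreg2 prints by default.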
4) What is an over-identification test? Explain. Test the over-identifying restrictions associated
with using Mean_sum_csr as the instrument. Interpret the Hansen J statistic (the over-identification
test of the instrument) from Question 3.
5) Perform a Durbin-Wu-Hausman (DWH) test of the null that sum_csr is exogenous using
Mean_sum_csr as the instrument and using the “ivendog” command. What exactly is the null
of this test? Can you reject the null? Does this or does this not surprise you in light of your
answer to Question 3? Compare the DWH test from the “ivendog” command to that produced
by the “endog” option to ivreg2. Note that you need to drop the “robust” option to be able
to run a DWH test using the “ivendog” command. What is the difference between these two
endogeneity tests?
Stata code:
ivreg2 alpha leverage roa liquidity_mv mktcap vol tangibility (sum_csr = mean_sum_csr), ///
    endog(sum_csr)
ivendog sum_csr
6) Obtain the residuals from the first-stage regression that you estimated in Question 3
and call them resid1. Run a regression of Alpha on sum_csr, the control variables and resid1. Use
robust standard errors in this last regression. Describe and interpret your results.
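The two steps in Question 6 could be sketched as below. This is an illustration only: it reruns the first stage with regress rather than recovering it from ivreg2, and it assumes lowercase variable names as in the Stata code above.

```stata
* First stage: regress the endogenous variable on the instrument and all controls
regress sum_csr mean_sum_csr leverage roa liquidity_mv mktcap vol tangibility

* Save the first-stage residuals as resid1
predict resid1, residuals

* Augmented regression: the original regressors plus the first-stage residuals,
* with heteroskedasticity-robust standard errors
regress alpha sum_csr leverage roa liquidity_mv mktcap vol tangibility resid1, vce(robust)
```

The t-test on resid1 in the augmented regression is a regression-based test of whether sum_csr is exogenous. If alpha is missing for some observations, the first-stage sample here may differ from the one used by ivreg2, so it is worth comparing the observation counts.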
bp: blood pressure, the outcome variable. We want to estimate the treatment effect on
blood pressure.
patient: patient id.
after: a dummy variable where 1 indicates after the treatment and 0 indicates before the
treatment.
treatment: a dummy variable where 1 indicates that the patient received treatment and 0
indicates that the patient did not receive treatment.
age, sleep: control variables.
Questions:
Generate a new interaction variable.
generate treatafter = after*treatment
1) Use the regress command to estimate the treatment effect.
Stata command:
regress bp treatment after treatafter
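For reference, the regress command above estimates the standard two-group, two-period difference-in-differences specification. The subscripts i (patient) and t (period) and the coefficient labels are notation introduced here for illustration:

```latex
\[
bp_{it} = \beta_0 + \beta_1\,\mathit{treatment}_i + \beta_2\,\mathit{after}_t
        + \beta_3\,(\mathit{treatment}_i \times \mathit{after}_t) + \varepsilon_{it}
\]
```

Here \beta_3, the coefficient on treatafter, is the difference-in-differences estimate: the change in bp for treated patients minus the change for untreated patients.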
Use patient fixed effects and cluster the standard errors at the patient level. (reghdfe is a
user-written command; you may have to install it.)
Stata command:
reghdfe bp treatment after treatafter, a(patient) vce(cluster patient)
Explain the effect on bp for both models. What is the difference you observe between these
two models?
Stata command:
didregress (bp age sleep) (treatafter), group(treatment) time(after) aequations