Econometrics
Model A: 𝑌 = 𝛽1 + 𝛽2 𝑋2 + 𝛽3 𝑋3 + 𝑢
Model B: 𝑋2 = 𝛼1 + 𝛼2 𝑋3 + 𝑟
Model C: 𝑌 = 𝜆1 + 𝜆2 𝑟̂ + 𝑣
Here, 𝑟̂ denotes the residuals obtained from Model B.
Based on the above models, prove the following:
i. 𝛽̂1 = 𝜆̂1 − 𝜆̂2 𝛼̂1
ii. 𝛽̂2 = 𝜆̂2
iii. 𝛽̂3 = −𝜆̂2 𝛼̂2
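Result (ii) is an instance of the Frisch-Waugh-Lovell theorem. The sketch below checks that result numerically, and only that result, on simulated data. The sample size, the data-generating coefficients, and the helper `ols` are assumptions made for illustration and are not part of the problem.

```python
import random


def ols(columns, y):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for p in range(k):
        best = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[best] = A[best], A[p]
        A[p] = [v / A[p][p] for v in A[p]]
        for r in range(k):
            if r != p:
                factor = A[r][p]
                A[r] = [vr - factor * vp for vr, vp in zip(A[r], A[p])]
    return [A[r][k] for r in range(k)]


# Simulated data (hypothetical values, chosen only for the demonstration).
random.seed(0)
n = 200
x3 = [random.gauss(0, 1) for _ in range(n)]
x2 = [0.5 + 0.8 * a + random.gauss(0, 1) for a in x3]
y = [1.0 + 2.0 * a - 1.5 * b + random.gauss(0, 1) for a, b in zip(x2, x3)]

b1, b2, b3 = ols([x2, x3], y)                          # Model A
a1, a2 = ols([x3], x2)                                 # Model B
r_hat = [a - (a1 + a2 * b) for a, b in zip(x2, x3)]    # residuals of Model B
l1, l2 = ols([r_hat], y)                               # Model C

print("beta2_hat   =", b2)
print("lambda2_hat =", l2)
```

The two printed slopes agree up to floating-point error, which is what the proof of (ii) should establish in general.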
3. A researcher obtained the following regression results based on 64 observations (see example 7.1):
$\widehat{CM} = 263.6416 - 0.0056\,PGNP - 2.2316\,FLR$
$se = (11.5932) \quad (0.0019) \quad (0.2099)$
$R^2 = 0.7077$
where 𝐶𝑀 is child mortality per 1,000 live births, 𝑃𝐺𝑁𝑃 is per capita GNP, and 𝐹𝐿𝑅 is the
female literacy rate.
i. Do the regression coefficients have the signs expected a priori? Interpret the coefficients of the
regression using a suitable increment in each explanatory variable.
ii. Compute the impact of a unit increase in per capita GNP, as well as of a one percentage point
increase in the female literacy rate, on child mortality and interpret your findings.
iii. Test the significance of the individual coefficients using a suitable level of significance.
iv. Do you think all the slope coefficients are simultaneously equal to zero? How do you test
this? Show the detailed calculations and draw your conclusion.
v. Compute the adjusted 𝑅2 and interpret it.
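The arithmetic behind parts (iii) to (v) can be sketched as follows. This is a rough check, assuming the reported figures are taken at face value and that the regression has k = 3 estimated parameters (intercept included). Critical values still have to be read from t and F tables at the chosen significance level.

```python
# Reported results for question 3 (64 observations).
n, k = 64, 3   # k counts the intercept and the two slopes (assumption stated above)
coefs = {"const": 263.6416, "PGNP": -0.0056, "FLR": -2.2316}
ses = {"const": 11.5932, "PGNP": 0.0019, "FLR": 0.2099}
r2 = 0.7077

# (iii) t ratio for H0: coefficient = 0, with n - k degrees of freedom.
t_stats = {name: coefs[name] / ses[name] for name in coefs}

# (iv) F statistic for H0: both slopes are zero, computed from R^2.
f_stat = (r2 / (k - 1)) / ((1 - r2) / (n - k))

# (v) Adjusted R^2.
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k)

for name, t in t_stats.items():
    print(f"t({name}) = {t:.3f}  [df = {n - k}]")
print(f"F({k - 1}, {n - k}) = {f_stat:.2f}")
print(f"adjusted R^2 = {adj_r2:.4f}")
```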
4. The following function describes the relationship between output (𝑌) and two factors,
labor (𝑋2) and capital (𝑋3), for several firms in a country:
$Y_i = \beta_0 X_{2i}^{\beta_2} X_{3i}^{\beta_3} e^{u_i}$
a. Transform the above model into a linear regression model.
b. How do you estimate the parameters of the transformed linear regression model?
c. Describe the process of testing the null hypothesis 𝐻0 : 𝛽2 + 𝛽3 = 1.
d. Based on data for 50 manufacturing firms, a researcher obtains the following results:
$\widehat{\ln Y} = 3.8876 + 0.4683 \ln X_2 + 0.5213 \ln X_3$
$se = (0.3962) \quad (0.0989) \quad (0.0969)$, $R^2 = 0.9642$
i. What is the output elasticity of labor? What does it mean? Is it statistically significant at
the 5 percent level of significance? Show your calculation.
ii. What is the output elasticity of capital? What does it mean? Is it statistically significant
at the 5 percent level of significance? Show your calculation.
iii. Test the overall significance of the model.
iv. Does the economy exhibit constant returns to scale? Show your calculation.
[Note: 𝑐𝑜𝑣(𝛽̂2 , 𝛽̂3 ) = −0.0092]
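The calculations in part (d) can be sketched as below. This is an illustrative check, assuming k = 3 estimated parameters and assuming constant returns to scale is tested with a t ratio on the linear combination 𝛽2 + 𝛽3 = 1, using the covariance given in the note. Variable names are hypothetical.

```python
import math

# Reported results for question 4(d) (50 firms).
n, k = 50, 3
b2, b3 = 0.4683, 0.5213            # output elasticities of labor and capital
se_b2, se_b3 = 0.0989, 0.0969
cov_b2_b3 = -0.0092                # from the note in the question
r2 = 0.9642

# (i), (ii) individual significance: t = estimate / standard error, df = n - k.
t_labor = b2 / se_b2
t_capital = b3 / se_b3

# (iii) overall significance from R^2.
f_overall = (r2 / (k - 1)) / ((1 - r2) / (n - k))

# (iv) H0: beta2 + beta3 = 1.
# var(b2 + b3) = var(b2) + var(b3) + 2 cov(b2, b3)
se_sum = math.sqrt(se_b2 ** 2 + se_b3 ** 2 + 2 * cov_b2_b3)
t_crs = (b2 + b3 - 1) / se_sum

print(f"t(labor) = {t_labor:.3f}, t(capital) = {t_capital:.3f}  [df = {n - k}]")
print(f"F({k - 1}, {n - k}) = {f_overall:.1f}")
print(f"se(b2 + b3) = {se_sum:.4f}, t for b2 + b3 = 1: {t_crs:.3f}")
```

Each statistic is then compared with the tabulated critical value at the 5 percent level.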
5. A researcher estimated the Cobb-Douglas production function for an economy using 20 observations:
(b) $\widehat{\ln\left(\frac{GDP}{Labor}\right)} = -0.4947 + 1.1053 \ln\left(\frac{Capital}{Labor}\right)$
$t = (-4.0612) \quad (28.1056)$
$R_R^2 = 0.9777$, $RSS_R = 0.0166$
6. A researcher estimated the following regression results based on 11 observations of quantity demanded
(Y) and price of coffee (X) (see example 7.2).
(a) $\hat{Y} = 2.6911 - 0.4795\,X$
$se = (0.1216) \quad (0.1140)$, $r^2 = 0.6628$
(b) $\widehat{\ln Y} = 0.7774 - 0.2530 \ln X$
$se = (0.0152) \quad (0.0494)$, $r^2 = 0.7448$
7. From the data of 46 states in the United States for 1992, a researcher obtained the following regression
results:
8. From a sample of 209 firms, an econometrician obtained the following regression results:
Table: Determinants of the log of CEO salaries
Explanatory variable                      Coefficient   Se
log(sales) (sales = annual firm sales)    0.280         0.035
roe (return on equity, in percent)        0.0174        0.0041
ros (return on firm's stock)              0.00024       0.00054
Constant                                  4.32          0.32
i. Interpret the preceding regression taking into account any prior expectations that you may
have about the signs of the various coefficients.
ii. Which of the coefficients are individually statistically significant at the 5 percent level?
iii. What is the overall significance of the regression? Which test do you use? And why?
iv. Can you interpret the coefficients of roe and ros as elasticity coefficients? Why or why not?
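Because the dependent variable is in logs while roe enters in levels, the roe coefficient is read as a semi-elasticity. A small sketch of that arithmetic, assuming the reported coefficient of 0.0174 and a one percentage point rise in roe:

```python
import math

b_roe = 0.0174   # reported coefficient of roe in the log-salary equation

# Usual approximation: 100 * coefficient percent change in salary per unit of roe.
approx_pct = 100 * b_roe

# Exact percentage change implied by the log-level form.
exact_pct = 100 * (math.exp(b_roe) - 1)

print(f"approximate change in salary: {approx_pct:.2f}%")
print(f"exact change in salary:       {exact_pct:.3f}%")
```

The coefficient of log(sales), by contrast, relates a log to a log and can be read directly as an elasticity.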
In the above models, 𝑢𝑖 is the stochastic disturbance term, which satisfies the standard assumptions of
the ordinary least squares (OLS) method.
(a) How would you interpret the coefficient of TFR? A priori, would you expect a positive or negative
relationship between CM and TFR? Justify your answer.
(b) Have the coefficient values of PGNP and FR changed between the two equations? If so, what may
be the reason(s) for such a change? Is the observed difference statistically significant? Which test
do you use and why?
(c) How would you choose between models A and B? Which statistical test would you use to answer
this question? Show the necessary calculations.
(d) We have not given the standard error of the coefficient of TFR. Can you find it out? (Hint: Recall
the relationship between the t and F distributions.)
12. The management of a franchise in America was trying to estimate the effect of advertising on sales (total
revenue) and proposed to estimate the following regression model:
Model A: 𝑆 = 𝛽1 + 𝛽2 𝑃 + 𝛽3 𝐴 + 𝑢
Here, 𝑆 represents sales revenue measured in thousands of dollars, 𝑃 is the per-unit price, and 𝐴 is
advertising cost measured in thousands of dollars.
Later on, the management presumed that advertising may have diminishing returns and therefore
postulated the following form of the model.
Model B: 𝑆 = 𝛽1 + 𝛽2 𝑃 + 𝛽3 𝐴 + 𝛽4 𝐴² + 𝑢
a. Explain how you would test the hypothesis that advertising has no effect on sales in Model A and in
Model B.
b. What are the a priori signs of 𝛽3 and 𝛽4 in Model B if advertising is subject to diminishing
returns? Describe the process of testing the hypothesis that the return to advertising reaches its
optimal level at 40,000 dollars.
c. Using the OLS method, the economist finds the following results:
𝑆̂ = 104.81 − 6.582𝑃 + 3.36𝐴 − 0.0268𝐴²
𝑠𝑒 = (3.74) (1.582) (0.42) (0.0159)
𝑛 = 78, 𝑅𝑆𝑆 = 2592.301
i. Interpret the regression result.
ii. Find the optimal level of advertising.
iii. Do you think that advertising has no effect on sales? How do you know that? Show your
calculation. [𝑅𝑆𝑆_𝑅 = 20907.331, the residual sum of squares of the restricted model]
iv. Suppose that the franchise management, based on experience in other cities, thinks
that the optimal level of advertising found in problem c(ii) is too high, and that the optimal level
of advertising is actually about $40,000. How would you test this hypothesis? Show your
calculation and draw your conclusion. [𝑐𝑜𝑣(𝛽̂3 , 𝛽̂4 ) = −0.0064]
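The calculations in part (c) can be sketched as follows. The question does not say how "optimal" is defined, so two readings are shown and both are assumptions: the sales-maximizing level, where the marginal effect 𝛽3 + 2𝛽4𝐴 equals 0, and the level where the marginal effect equals 1, the marginal cost of advertising when 𝑆 and 𝐴 are both in thousands of dollars. Variable names are hypothetical.

```python
import math

# Reported results for Model B (question 12c).
n, k = 78, 4
b3, b4 = 3.36, -0.0268
se_b3, se_b4 = 0.42, 0.0159
cov_b3_b4 = -0.0064
rss_u, rss_r = 2592.301, 20907.331   # unrestricted and restricted RSS

# (iii) Joint test of H0: beta3 = beta4 = 0 (q = 2 restrictions).
q = 2
f_adv = ((rss_r - rss_u) / q) / (rss_u / (n - k))

# (ii) Marginal effect of advertising on sales: dS/dA = beta3 + 2 * beta4 * A.
a_sales_max = -b3 / (2 * b4)        # marginal effect = 0 (assumed reading 1)
a_mr_eq_mc = (1 - b3) / (2 * b4)    # marginal effect = 1 (assumed reading 2)

# (iv) H0: beta3 + 2 * beta4 * 40 = target, where A = 40 means $40,000.
a0 = 40.0
w = 2 * a0                          # weight on beta4 in the linear combination
se_lin = math.sqrt(se_b3 ** 2 + w ** 2 * se_b4 ** 2 + 2 * w * cov_b3_b4)
t_target0 = (b3 + w * b4 - 0) / se_lin   # under reading 1
t_target1 = (b3 + w * b4 - 1) / se_lin   # under reading 2

print(f"F({q}, {n - k}) = {f_adv:.2f}")
print(f"A* (marginal effect = 0): {a_sales_max:.2f} thousand dollars")
print(f"A* (marginal effect = 1): {a_mr_eq_mc:.2f} thousand dollars")
print(f"se = {se_lin:.4f}, t (target 0) = {t_target0:.3f}, t (target 1) = {t_target1:.3f}")
```

Whichever reading is adopted, the t ratio is compared with the critical value for n − k = 74 degrees of freedom.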
13. Define a dummy variable. Discuss its nature and uses in social science research. What precautions
should be taken when incorporating dummy variables in a regression model?
14. A macroeconomist was studying time series data for the period 1990-2015. S/he wanted to know
whether there is a structural difference in the savings-income relationship in Bangladesh. S/he considers
2002 as the structural break year.
The restricted estimated regression model is
To analyze the structural differences s/he used the dummy variable technique and estimated the
following unrestricted regression result.
where
𝑌𝑡 is savings at time 𝑡,
𝑋𝑡 is income at time 𝑡,
𝐷𝑡 is a dummy variable equal to 1 if the observation comes from 2002-2015 and 0
otherwise, and 𝑡 denotes time.
i. Estimate the regression line for the periods 1990-2001 and 2002-2015.
ii. Do you think the intercept differential is significantly different from zero? Show your
calculation.
iii. Do you think the slope differential is significantly different from zero? Show your
calculation.
iv. Do you think both intercept and slope differentials are simultaneously significantly
different from zero? How do you know it? Describe the process and draw your conclusion.
v. What was the alternative way of analyzing structural differences of savings-income
relationship? What were the limitations of that approach?
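The dummy variable technique described in this question can be sketched on simulated data. Every number below is hypothetical and none of it comes from the estimates in the question. The helpers `ols` and `rss` are assumed names. The sketch fits the restricted model (𝑌 on 𝑋) and the unrestricted model (𝑌 on 𝑋, 𝐷, and 𝐷·𝑋), recovers the two period-specific regression lines, and forms the F statistic for the joint test of the intercept and slope differentials.

```python
import random


def ols(columns, y):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(k)]
    for p in range(k):   # Gauss-Jordan elimination with partial pivoting
        best = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[best] = A[best], A[p]
        A[p] = [v / A[p][p] for v in A[p]]
        for r in range(k):
            if r != p:
                factor = A[r][p]
                A[r] = [vr - factor * vp for vr, vp in zip(A[r], A[p])]
    return [A[r][k] for r in range(k)]


def rss(columns, y, coefs):
    """Residual sum of squares for the fitted coefficients."""
    fitted = [coefs[0] + sum(c * col[i] for c, col in zip(coefs[1:], columns))
              for i in range(len(y))]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


random.seed(1)
years = list(range(1990, 2016))                       # 26 annual observations
d = [1.0 if yr >= 2002 else 0.0 for yr in years]      # 1 for 2002-2015, else 0
x = [10.0 + i + random.gauss(0, 0.5) for i in range(len(years))]   # hypothetical income
y = [1.0 + 0.10 * xi + di * (-2.0 + 0.08 * xi) + random.gauss(0, 0.3)
     for xi, di in zip(x, d)]                         # hypothetical savings
dx = [di * xi for di, xi in zip(d, x)]

restricted = ols([x], y)
a1, b1, a2, b2 = ols([x, d, dx], y)   # a2, b2: intercept and slope differentials
rss_r = rss([x], y, restricted)
rss_u = rss([x, d, dx], y, [a1, b1, a2, b2])

# Period-specific regression lines implied by the unrestricted model.
intercept_pre, slope_pre = a1, b1
intercept_post, slope_post = a1 + a2, b1 + b2

# Separate regression on 2002-2015 only, for comparison.
post = [(xi, yi) for xi, yi, di in zip(x, y, d) if di == 1.0]
sub_a, sub_b = ols([[p[0] for p in post]], [p[1] for p in post])

# Joint test of H0: a2 = b2 = 0 (q = 2 restrictions, k = 4 parameters).
n, q, k = len(y), 2, 4
f_stat = ((rss_r - rss_u) / q) / (rss_u / (n - k))

print(f"1990-2001: Y = {intercept_pre:.3f} + {slope_pre:.3f} X")
print(f"2002-2015: Y = {intercept_post:.3f} + {slope_post:.3f} X")
print(f"F({q}, {n - k}) = {f_stat:.2f}")
```

In this sketch the line for 2002-2015 recovered from the dummy variable regression coincides with the line from a separate regression on that subsample, which is why the technique can stand in for splitting the sample.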
15. A student of DDS collected quarterly data on the number of dozens of red roses sold in various flower
markets in Dhaka city. The student aims to estimate a demand function for red roses. S/he initially
decided to estimate the following two regression models:
where
𝑌 is the quantity of red roses sold, in dozens,
𝑋2 is the average wholesale price of red roses,
𝑋3 is the average wholesale price of white roses,
𝑋4 is the average weekly family disposable income,
𝑋5 is the trend variable taking values of 1, 2, and so on, and
𝑙𝑛 denotes the natural logarithm.
Based on the collected data, she obtained the following regression results:
Model A:
Model B:
      Source |       SS       df       MS              Number of obs =      16
-------------+------------------------------           F(4, 11)      =    9.63
       Model |  1.09893508     4   .27473377           Prob > F      =  0.0013
    Residual |  .313663766    11  .028514888           R-squared     =  0.7780
-------------+------------------------------           Adj R-squared =  0.6972
       Total |  1.41259884    15  .094173256           Root MSE      =  .16886
(a) Briefly discuss the fundamental differences between model A and model B.
(b) Interpret both regression results. Do the results concur with the a priori expectations about the
signs of the parameters? Discuss.
(c) List which regression coefficients are significant in model A and model B at 5 percent level of
significance.
(d) The student concludes that as model A has higher 𝑅2 value than model B, model A is better than
model B. Do you agree with her? Explain your argument.
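The summary statistics printed beside the ANOVA table in the output above follow directly from its sums of squares. The sketch below reproduces them as a consistency check, assuming the table is read as the model, residual, and total rows with 4, 11, and 15 degrees of freedom.

```python
import math

# Entries from the ANOVA table in the regression output.
ss_model, df_model = 1.09893508, 4
ss_resid, df_resid = 0.313663766, 11
ss_total, df_total = 1.41259884, 15

ms_model = ss_model / df_model
ms_resid = ss_resid / df_resid

f_stat = ms_model / ms_resid                       # reported as F(4, 11)
r2 = ss_model / ss_total                           # reported as R-squared
adj_r2 = 1 - ms_resid / (ss_total / df_total)      # reported as Adj R-squared
root_mse = math.sqrt(ms_resid)                     # reported as Root MSE

print(f"F = {f_stat:.2f}, R^2 = {r2:.4f}, adj R^2 = {adj_r2:.4f}, Root MSE = {root_mse:.5f}")
```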
16. ABC Company has recently recruited a manager. The company sells different kinds of bread in the
markets of different cities of the country. It has taken the initiative to raise awareness of its
products through newspaper advertising in the hope of higher sales. The company
has shared data on monthly sales (SALES), a price index for all products sold in a given month (PRICE),
and monthly advertising expenditure (ADVERT). SALES and ADVERT are measured in thousands of dollars
($1000). There are 75 observations in the dataset. The manager plans to estimate the following
regression models and to report the findings to the management.
                                    Model A                   Model B
Variable      Parameter     Coefficient    SE         Coefficient    SE
PRICE         β1            -7.908         1.096      -7.640         1.046
ADVERT        β2            1.863          0.683      12.151         3.556
ADVERT²       β3            -              -          -2.768         0.941
Intercept     β0            118.914        6.352      109.719        6.799
Summary statistics of the model
Number of observations (N)  75                        75
Model sum of squares        1,396.539                 1,583.397
Residual sum of squares     1,718.943                 1,532.084
F                           29.248                    24.459
Note: The average sale is equal to; the average price is 5.6872; and the average advertising is $1844,
the standard deviations of the variables are 6.488537, 0.518432, and 0.8316769 respectively.
17. A young researcher was trying to estimate the following regression models:
Model A: 𝑙𝑛𝑌𝑖 = 𝛼1 + 𝛼2 𝐾𝑊𝑊𝑖 + 𝛼3 𝐼𝑄𝑖 + 𝛼4 𝐸𝐷𝑈𝐶𝑖 + 𝛼5 𝑙𝑛𝑇𝑖 + 𝑢𝑖
Model B: 𝑙𝑛𝑌𝑖 = 𝛽1 + 𝛽2 𝐾𝑊𝑊𝑖 + 𝛽3 𝐼𝑄𝑖 + 𝛽4 𝐸𝐷𝑈𝐶𝑖 + 𝛽5 𝑙𝑛𝑇𝑖 + 𝛽6 𝐵𝐿𝐴𝐶𝐾 + 𝜖𝑖
Model C: 𝑙𝑛𝑌𝑖 = 𝜆1 + 𝜆2 𝐾𝑊𝑊𝑖 + 𝜆3 𝐼𝑄𝑖 + 𝜆4 𝐸𝐷𝑈𝐶𝑖 + 𝜆5 𝑙𝑛𝑇𝑖 + 𝜆7 𝐾𝑊𝑊 ∗ 𝐵𝐿𝐴𝐶𝐾 + 𝜆8 𝐼𝑄 ∗ 𝐵𝐿𝐴𝐶𝐾 + 𝜆9 𝐸𝐷𝑈𝐶 ∗ 𝐵𝐿𝐴𝐶𝐾 + 𝜆10 𝑙𝑛𝑇 ∗ 𝐵𝐿𝐴𝐶𝐾 + 𝜀𝑖
Model D: 𝑙𝑛𝑌𝑖 = 𝛾1 + 𝛾2 𝐾𝑊𝑊𝑖 + 𝛾3 𝐼𝑄𝑖 + 𝛾4 𝐸𝐷𝑈𝐶𝑖 + 𝛾5 𝑙𝑛𝑇𝑖 + 𝛾6 𝐵𝐿𝐴𝐶𝐾 + 𝛾7 𝐾𝑊𝑊 ∗ 𝐵𝐿𝐴𝐶𝐾 + 𝛾8 𝐼𝑄 ∗ 𝐵𝐿𝐴𝐶𝐾 + 𝛾9 𝐸𝐷𝑈𝐶 ∗ 𝐵𝐿𝐴𝐶𝐾 + 𝛾10 𝑙𝑛𝑇 ∗ 𝐵𝐿𝐴𝐶𝐾 + 𝑒𝑖
Here 𝑙𝑛𝑌𝑖 is the log of monthly earnings, 𝐾𝑊𝑊 is the knowledge of the world of work score, IQ is the IQ
score, EDUC is years of schooling, 𝑙𝑛𝑇 is the log of tenure, and BLACK is a binary race variable equal to 1
if the individual is black and 0 otherwise.
Model B assumes that the wage equations of black and non-black labor differ in their intercepts only
(hence, only the race dummy is added), and model C assumes that the wage equations for black and non-
black labor have the same intercept but different slopes (the race dummy is omitted but the
interactions of the race variable with the explanatory variables are added). In model D, the researcher
added a race dummy variable and a set of interaction terms under the presumption that race has both
an intercept differential and slope differentials. Therefore, in this setting, model A assumes that race has
neither an intercept differential nor slope differentials. The results of the models are reported in the
following table:
18. In a study of turnover in the labor market, James F. Ragan, Jr., aimed to estimate the following
regression model for the U.S. economy for the period of 1950–I to 1979–IV:
Where 𝑄 = quit rate in manufacturing industry, defined as number of people leaving jobs voluntarily
per 100 employees; 𝐶𝑌𝐶 = an instrumental or proxy variable for adult male unemployment rate; 𝑌𝑁𝐺 =
percentage of employees younger than 25; 𝐸𝑀𝑃 = 𝑁𝑡−1 /𝑁𝑡−4 = ratio of manufacturing employment in
quarter (t - 1) to that in quarter (t - 4); 𝑊𝑂𝑀 = percentage of women employees; and 𝑇𝐼𝑀𝐸 = time trend
(1950–I = 1).
(i) Briefly describe the factors determining the quit rate in the manufacturing industry in the US
economy based on the results reported in the above table.
(ii) Find the standard errors of the regression coefficients from the given data.
(iii) Test the overall significance of the above regression result.
19. (a) Marc Nerlove estimated the following cost function for electricity generation:
$Y_i = A X^{\beta} P_1^{\alpha_1} P_2^{\alpha_2} P_3^{\alpha_3} e^{u_i}$
where Y = total cost of production, X = output in kilowatt hours, 𝑃1 = price of labor input, 𝑃2 = price of
capital input, 𝑃3 = price of fuel, and u = disturbance term.
By imposing a special restriction, the author transformed the above model as follows:
$\frac{Y_i}{P_3} = A X^{\beta} \left(\frac{P_1}{P_3}\right)^{\alpha_1} \left(\frac{P_2}{P_3}\right)^{\alpha_2} e^{u_i}$
i. What was the special restriction? Explain the meaning of the restriction.
ii. Explain the process of testing whether the restriction is valid or not.
(b). On the basis of a sample of 29 medium-sized firms, and after logarithmic transformation, Nerlove
obtained the following regression results.