AE 2023 Lecture7
Dummy variables or binary variables
► In general, a dummy variable is a variable that equals 0 or 1.
► The coefficient δ on a dummy variable is an intercept shift: when the
dummy variable = 1, the intercept changes by δ. All slope parameters
remain unchanged.
► Example:
Dummy variable: = 1 if the person is a woman, = 0 if the person is a man
Coefficient δ = the wage gain/loss if the person is a woman rather than a man
(holding other things fixed)
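A minimal sketch of the intercept-shift idea, under simplifying assumptions: the wage data below are hypothetical and no other regressors are included, so nothing is held fixed here. With an intercept and a single 0/1 dummy, the OLS intercept equals the mean of the group with dummy = 0, and the dummy coefficient equals the difference in group means. The function name `ols_on_dummy` is made up for this illustration.

```python
from statistics import mean

def ols_on_dummy(y, d):
    """OLS of y on an intercept and one 0/1 dummy d (closed-form simple regression)."""
    y_bar, d_bar = mean(y), mean(d)
    cov = sum((di - d_bar) * (yi - y_bar) for di, yi in zip(d, y))
    var = sum((di - d_bar) ** 2 for di in d)
    delta = cov / var               # coefficient on the dummy (intercept shift)
    alpha = y_bar - delta * d_bar   # intercept (mean of the dummy = 0 group)
    return alpha, delta

# Hypothetical hourly wages; female = 1 for women, 0 for men.
wage = [7.0, 6.0, 8.0, 5.0, 4.5, 5.5]
female = [0, 0, 0, 1, 1, 1]

alpha, delta = ols_on_dummy(wage, female)
mean_men = mean(w for w, f in zip(wage, female) if f == 0)
mean_women = mean(w for w, f in zip(wage, female) if f == 1)

print(alpha, delta)                      # intercept and intercept shift
print(mean_men, mean_women - mean_men)   # same numbers from the group means
```

Once other regressors such as education are added, the dummy coefficient is no longer a raw difference in means, but it keeps the same reading as a shift of the intercept.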
Dummy variables
Alternative interpretation of
coefficient:
Intercept shift
Disadvantages:
1) More difficult to test for differences between the parameters
2) R-squared formula only valid if regression contains intercept
Dummy variables
►Gender and Wage
Holding education,
experience, and tenure fixed,
women earn $1.81 less per
hour than men
Dummy variables
►Effects of training grants on hours of training
Dependent variable: hours of training per employee
Dummy variable: indicates whether the firm received a training grant
Dummy variables
►A new diet program and Weight
$\widehat{weight} = 12.82 - 1.85\,treated$
(2.12) (0.60)
n = 330, R² = 15%
►treated is a dummy variable that equals 1 for a person
who joined a new diet program, 0 otherwise.
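As a hedged illustration of how these estimates can be read: with a single dummy regressor, the intercept is the fitted value for the untreated group and the dummy coefficient is the difference between the two groups. The sketch below only reuses the reported coefficients and the standard error on treated; the function and variable names are chosen for this example.

```python
# Reported estimates from the diet-program regression above.
intercept = 12.82
coef_treated = -1.85
se_treated = 0.60

def predicted_weight(treated: int) -> float:
    """Fitted value of the dependent variable for treated = 0 or treated = 1."""
    return intercept + coef_treated * treated

pred_untreated = predicted_weight(0)
pred_treated = predicted_weight(1)

# t statistic for H0: the coefficient on treated equals zero.
t_stat = coef_treated / se_treated

print(pred_untreated, round(pred_treated, 2), round(t_stat, 2))
```

The t statistic is roughly 3 in absolute value, so under the usual large-sample critical values the coefficient on treated would be judged statistically different from zero.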
Dummy variables
►Using dummy explanatory variables in equations for log(y)
Dummy variable
=1 if a house is of
colonial style.
=0 otherwise
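As the dummy for colonial style changes from 0 to 1, the house price
increases by approximately 5.4 percent (the dependent variable is in logs,
so the coefficient is read as a percentage change, not percentage points).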
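A short sketch of the arithmetic behind this reading, assuming the coefficient on the colonial dummy is 0.054 (the value implied by the 5.4 figure above; the estimated equation itself is not reproduced here). Multiplying the coefficient by 100 is the usual approximation; the exact percentage change for a log dependent variable is 100·(exp(coefficient) − 1).

```python
import math

coef_colonial = 0.054  # assumed: implied by the 5.4 figure on the slide

approx_pct = 100 * coef_colonial                   # log approximation
exact_pct = 100 * (math.exp(coef_colonial) - 1)    # exact percentage change

print(f"approximate: {approx_pct:.1f}%, exact: {exact_pct:.2f}%")
```

For small coefficients the two numbers are close; the gap widens as the coefficient grows.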
Using dummy variables for multiple groups
Several subgroups
► Example: A worker is female or male and married or unmarried
⇒ 4 subgroups:
► female and not married
► female and married
► male and not married
► male and married
► How to proceed:
► (1) Define dummy variables for all subgroups
► (2) Leave out one group, which becomes the base/reference group.
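The two steps above can be sketched as follows. This is only an illustration: the choice of "male and not married" as the base group, the dummy names, and the sample rows are assumptions made for the example.

```python
def subgroup_dummies(female: int, married: int) -> dict:
    """Dummies for three of the four subgroups; male and not married is the base group."""
    return {
        "female_notmarried": int(female == 1 and married == 0),
        "female_married": int(female == 1 and married == 1),
        "male_married": int(female == 0 and married == 1),
    }

# Hypothetical workers as (female, married) pairs, one per subgroup.
workers = [(1, 0), (1, 1), (0, 0), (0, 1)]
rows = [subgroup_dummies(f, m) for f, m in workers]

for worker, row in zip(workers, rows):
    print(worker, row)
```

A worker in the base group has all three dummies equal to 0, so each dummy coefficient in a regression would be read relative to that group.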
Several subgroups
Entering the credit rating (CR) directly as a single regressor would probably
not be appropriate, because the credit rating contains only ordinal information.
A better way to incorporate this information is to define dummies:
Dummies indicating whether the particular rating applies, e.g. CR1 = 1 if CR = 1, and
CR1 = 0 otherwise. All effects are then measured in comparison to the worst rating
(= base group).
University ranking
► University ranking from 1-100
► The quality difference between ranks 1 and 2 may be dramatically
different from the quality difference between ranks 11 and 12.
→ Hence, the rank should not be used directly as an independent variable.
► Instead, we assign a dummy variable Dj to every university except one
(the “reference group”), which introduces several new parameters that
have to be estimated.
► Note: The coefficient of a dummy variable Dj then denotes the
intercept shift between university j and the reference university.
► Sometimes there are too many ranks and hence too many
parameters to be estimated. Then it proves useful to group the
data, e.g., ranks 1-10, 11-20, etc.
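A minimal sketch of the grouping idea, assuming ranks run from 1 to 100 and groups are ten ranks wide. The function names, the choice of the top group (ranks 1-10) as the reference group, and the example ranks are all assumptions made for illustration.

```python
def rank_group(rank: int, width: int = 10, n_ranks: int = 100) -> int:
    """Group index for a rank: 0 for ranks 1-10, 1 for ranks 11-20, and so on."""
    if not 1 <= rank <= n_ranks:
        raise ValueError(f"rank must be between 1 and {n_ranks}")
    return (rank - 1) // width

def group_dummies(rank: int, width: int = 10, n_ranks: int = 100,
                  base_group: int = 0) -> list:
    """One 0/1 dummy per rank group, omitting the base (reference) group."""
    n_groups = -(-n_ranks // width)  # ceiling division
    group = rank_group(rank, width, n_ranks)
    return [int(group == k) for k in range(n_groups) if k != base_group]

print(rank_group(10), rank_group(11), rank_group(100))
print(group_dummies(15))  # dummies for groups 11-20, 21-30, ..., 91-100
```

With ten groups and one reference group, nine dummy coefficients are estimated instead of ninety-nine.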
Interactions involving dummy variables
Interaction terms
► Example: Do returns to schooling differ for men and women?
► Or: is the effect of education on the wage moderated by gender?
[Diagram: educ → wage, with female moderating the effect]
Notes
► Consider two models:
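H0: $\beta_3 = 0$ (where $\beta_3$ is the coefficient on the interaction term
female × educ), indicating that the return to education is the same for men
and women.
Or, equivalently:
The impact of education on wage is the same for men and women.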
Interaction terms
$\log(wage) = \beta_0 + \beta_1\,educ + \beta_2\,female + \beta_3\,female \cdot educ + u$
► Male: $\beta_0 + \beta_1\,educ$
► Female:
$\beta_0 + \beta_1\,educ + \beta_2 + \beta_3\,educ = (\beta_0 + \beta_2) + (\beta_1 + \beta_3)\,educ$
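The male and female lines above can be sketched in code. The coefficient values below are hypothetical and chosen only to make the arithmetic visible; the function names are likewise made up for this example.

```python
def fitted_log_wage(educ: float, female: int,
                    b0: float, b1: float, b2: float, b3: float) -> float:
    """Fitted log(wage) from the interaction model (error term omitted)."""
    return b0 + b1 * educ + b2 * female + b3 * female * educ

def return_to_education(female: int, b1: float, b3: float) -> float:
    """Slope on educ: b1 for men (female = 0), b1 + b3 for women (female = 1)."""
    return b1 + b3 * female

# Hypothetical coefficients, not estimates from any data set.
b0, b1, b2, b3 = 0.40, 0.08, -0.20, -0.005

slope_men = return_to_education(0, b1, b3)
slope_women = return_to_education(1, b1, b3)

# One extra year of education for a woman changes fitted log(wage) by b1 + b3.
step_women = (fitted_log_wage(13, 1, b0, b1, b2, b3)
              - fitted_log_wage(12, 1, b0, b1, b2, b3))

print(slope_men, round(slope_women, 3), round(step_women, 3))
```

If b3 were zero, both groups would share the same slope and differ only through the intercept shift b2.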
Interaction terms
$\widehat{\log(wage)} = [\ldots] + [\ldots]\,educ - [\ldots]\,female - [\ldots]\,female \cdot educ + .005\,exper + .017\,tenure$
(0.12) (0.008) (.17) (.013) (0.002) (0.003)
Example – Investment Inefficiency and Firm
Performance
► We have a proposal with two hypotheses:
H1: Investment inefficiency has a negative impact on firm
performance.
H2: The negative impact of investment inefficiency on firm
performance is stronger for small firms.
[Diagram: Investment inefficiency → Firm performance (H1); Small firms moderate this effect (H2)]
Example – Investment Inefficiency and Firm
Performance
► First, to test the first hypothesis, we estimate the model:
$ROA = \beta_0 + \beta_1\,ABNORMAL\_INVESTMENT + \beta_2\,LEVERAGE + YEAR\ [\ldots] + u$
$\widehat{ROA} = .033 - .317\,ABNORMAL\_INVESTMENT - 0.042\,LEVERAGE$
(0.002) (.013) (.002) (.004)
n = 76,604, R² = 1.38%
Example – Corruption and Firm Performance
► We have a proposal with three hypotheses:
H1: Firm performance is negatively associated with corruption.
H2: The negative impact of corruption on firm performance is
stronger for young firms.
H3: The negative impact of corruption on firm performance is
stronger in financial crisis.
[Diagram: Corruption → Firm performance (H1); Young firms (H2) and Financial crisis (H3) moderate this effect]
Example – Corruption and Firm Performance
► First, to test the first hypothesis, we estimate the model:
$ROE = \beta_0 + \beta_1\,CORRUPTION + \beta_2\,SIZE + \beta_3\,TANGIBLE + u$
$\widehat{ROE} = .156 - .033\,CORRUPTION + .005\,SIZE - .057\,TANGIBLE$
(0.024) (0.006) (.002) (.017)
n = 2,597, R² = 2.61%
► Second, to test the second hypothesis, we add YOUNG_FIRMS and its interaction with CORRUPTION:
$\widehat{ROE} = .113 - .002\,CORRUPTION + .062\,YOUNG\_FIRMS - .039\,CORRUPTION \cdot YOUNG\_FIRMS - .004\,SIZE - .056\,TANGIBLE$
(.032) (.014) (.027) (.016) (.002) (.018)
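Example – Corruption and Firm Performance
► Third, to test the third hypothesis, we add CRISIS and its interaction with CORRUPTION:
$\widehat{ROE} = .153 - .029\,CORRUPTION + .004\,CRISIS - .011\,CORRUPTION \cdot CRISIS + .005\,SIZE - .057\,TANGIBLE$
(.026) (.008) (.027) (.011) (.002) (.018)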
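One hedged way to read the two interaction terms is to compute their t statistics from the reported coefficients and standard errors. The sketch assumes the variable names YOUNG_FIRMS and CRISIS for the two moderators and uses the large-sample two-sided 5% critical value of 1.96; both are assumptions of this example, and the slides do not state which significance level is applied.

```python
# (coefficient, standard error) of each interaction term as reported above.
interactions = {
    "CORRUPTION x YOUNG_FIRMS": (-0.039, 0.016),
    "CORRUPTION x CRISIS": (-0.011, 0.011),
}

CRITICAL_5PCT = 1.96  # assumed large-sample two-sided 5% critical value

def t_statistic(coef: float, se: float) -> float:
    """t statistic for H0: coefficient = 0."""
    return coef / se

t_stats = {name: t_statistic(c, s) for name, (c, s) in interactions.items()}
significant = {name: abs(t) > CRITICAL_5PCT for name, t in t_stats.items()}

for name in interactions:
    print(name, round(t_stats[name], 2), significant[name])
```

Under these assumptions the young-firm interaction is negative and statistically significant, in line with H2, while the crisis interaction is negative but not statistically different from zero, so this regression gives no support for H3.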
Summary
Consider two models:
$y = \beta_0 + \beta_1 x_1 + controls + u$ (model 1)
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3\, x_1 \cdot x_2 + controls + u$ (model 2)
Where x2 is a dummy variable
Main effect in Model (1)    Interaction term in Model (2)    Hypothesis
β1                          β3
> 0                         > 0                              More pronounced
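The table row above can be sketched as a small helper. Only the case "main effect > 0 and interaction > 0 gives more pronounced" comes from the table; the other sign combinations are an extension made for this example, based on the marginal effect of x1 in model 2 being β1 + β3·x2. The "less pronounced" label additionally assumes the interaction is smaller in magnitude than the main effect, and the function name is hypothetical.

```python
def moderation_reading(main_effect: float, interaction: float) -> str:
    """Compare the sign of the main effect with the sign of the interaction term."""
    if main_effect == 0 or interaction == 0:
        return "no reading"
    same_sign = (main_effect > 0) == (interaction > 0)
    # Same sign: the dummy group (x2 = 1) shows a stronger effect of x1.
    # Opposite sign: the interaction offsets part of the main effect.
    return "more pronounced" if same_sign else "less pronounced"

print(moderation_reading(0.5, 0.2))
print(moderation_reading(-0.5, -0.2))
print(moderation_reading(-0.5, 0.2))
```

The reading always refers to the group with x2 = 1 relative to the group with x2 = 0.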
Summary
Consider a model:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3\, x_1 \cdot x_2 + controls + u$
Where x2 is a dummy variable
The coefficient of BTD × LONG TENURE is positive (0.006) and statistically significant at the 10% level.
→ The positive impact of board tenure diversity (BTD) on firm investment efficiency is less pronounced
for firms that have short tenure.
In terms of economic significance, for a one-unit increase in BTD, the increase in investment efficiency
is 0.006 units lower for short-tenured firms than for long-tenured firms.
Example
Corruption and GDP Growth: The moderating Role of FDI
[Diagram: Corruption → GDP Growth, moderated by High FDI]
HIGH FDI is a dummy variable that equals 1 if a country's FDI is higher than
the sample median, 0 otherwise.
Example
Corruption and GDP Growth: The moderating Role of FDI
Dep. Var.: GDP GROWTH
                          (1)
Corruption                -0.089***
                          (-3.501)
Corruption × HIGH FDI     0.050***
                          (4.102)
HIGH FDI                  0.016***
                          (2.738)
Control variables         INCLUDED
Fixed effects             Country, Year
Observations              570
Adjusted R2               0.112
The coefficient of Corruption × HIGH FDI is positive (0.050) and statistically significant at the 1% level.
→ The negative impact of corruption on GDP GROWTH is less pronounced for countries that have
high FDI.
In terms of economic significance, for a one-unit increase in Corruption, the change in GDP Growth is
0.050 units higher (that is, the decline is 0.050 units smaller) in high-FDI countries than in low-FDI countries.
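The economic-significance statement can be checked with a few lines of arithmetic. The sketch uses only the two reported coefficients; the function and variable names are chosen for this example, and control variables and fixed effects are left out because they do not change the effect of Corruption.

```python
coef_corruption = -0.089   # coefficient on Corruption
coef_interaction = 0.050   # coefficient on Corruption x HIGH FDI

def effect_of_corruption(high_fdi: int) -> float:
    """Change in GDP growth for a one-unit increase in Corruption."""
    return coef_corruption + coef_interaction * high_fdi

effect_low_fdi = effect_of_corruption(0)
effect_high_fdi = effect_of_corruption(1)

print(round(effect_low_fdi, 3), round(effect_high_fdi, 3),
      round(effect_high_fdi - effect_low_fdi, 3))
```

The effect remains negative in both groups, and the gap between the groups equals the interaction coefficient.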
A binary dependent variable:
The linear probability model