Lecture 2 - LRM
REGRESSION
Nguyen Quang
quangn@ueh.edu.vn
THE IDEA BEHIND REGRESSION
• $Y$ is the regressand, or dependent/explained variable
• $X$ is a vector of regressors, or independent/explanatory variables
• $e$ is an error term/residual.
REGRESSION COEFFICIENTS
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + e_i$
• $\beta_0$ is the intercept/constant
• $\beta_1$ to $\beta_k$ are the slope coefficients
• In general, 𝛽 are the regression coefficients or regression parameters. THEY ARE
WHAT WE NEED TO ESTIMATE!
• Each slope coefficient measures the (partial) rate of change in the mean value of 𝑌 for a unit
change in the value of a regressor, ceteris paribus
• Roughly speaking: $\beta_1$ tells us that when $X_1$ increases by one unit, $Y$ changes by $\beta_1$, other things (all other Xs) unchanged.
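To make the ceteris paribus reading concrete, the following minimal Python sketch evaluates a fitted equation twice, raising one regressor by one unit while holding the other fixed. The coefficient values and the `predict` helper are hypothetical, chosen only for illustration.

```python
# Hypothetical coefficients: intercept b0 and slopes b1, b2 (illustrative values only).
b0, b1, b2 = 1.0, 0.5, -2.0

def predict(x1, x2):
    """Mean value of Y implied by the fitted equation for given regressor values."""
    return b0 + b1 * x1 + b2 * x2

# Raise x1 by one unit while x2 stays fixed (ceteris paribus).
change = predict(4.0, 2.0) - predict(3.0, 2.0)
print(change)  # equals b1
```

The difference between the two predictions is the slope on the first regressor, whatever value the second regressor is held at.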
METHOD OF ORDINARY LEAST SQUARES
• The method of Ordinary Least Squares (OLS) searches for the coefficients that minimize the residual sum of squares (RSS): $RSS = \sum e_i^2$
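As a minimal sketch of what minimizing RSS means, the Python below fits a model with a single regressor using the textbook closed-form OLS solution and checks that moving the coefficients away from the OLS values raises RSS. The data and the helper names `ols_simple` and `rss` are assumptions made for illustration; with several regressors one would normally rely on a statistics package instead.

```python
# Hypothetical data: x is a single regressor, y the dependent variable.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

def ols_simple(x, y):
    """Return (b0, b1) that minimize RSS for the model y = b0 + b1*x + e."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1

def rss(x, y, b0, b1):
    """Residual sum of squares for a candidate pair of coefficients."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))

b0, b1 = ols_simple(x, y)
rss_ols = rss(x, y, b0, b1)

# Any other candidate line gives a larger RSS than the OLS line.
rss_shifted = rss(x, y, b0 + 0.1, b1)
rss_tilted = rss(x, y, b0, b1 - 0.05)
print(b0, b1, rss_ols, rss_shifted, rss_tilted)
```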
• $R^2$, the coefficient of determination, is an overall measure of the goodness of fit of the estimated regression line.
• $R^2$ gives the percentage of the total variation in the dependent variable explained by the regressors:
• Explained Sum of Squares $ESS = \sum (\hat{Y}_i - \bar{Y})^2$
• Residual Sum of Squares $RSS = \sum e_i^2$
• Total Sum of Squares $TSS = \sum (Y_i - \bar{Y})^2$
• Then: $R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS}$
• It is a value between 0 (no fit) and 1 (perfect fit); a higher $R^2$ indicates a better fit.
• When $R^2 = 1$, $RSS = 0$, that is, $\sum e_i^2 = 0$.
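The sums of squares above can be computed directly. The Python sketch below fits a one-regressor model by OLS on made-up data, then checks that ESS and RSS add up to TSS and that the two expressions for R squared agree. The data and variable names are illustrative assumptions; the decomposition relies on the model including an intercept.

```python
# Hypothetical data for a model with an intercept and one regressor.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

n = len(y)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS estimates (closed form for a single regressor).
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar

y_hat = [b0 + b1 * xi for xi in x]             # fitted values
resid = [yi - fi for yi, fi in zip(y, y_hat)]  # residuals e

ess = sum((fi - y_bar) ** 2 for fi in y_hat)   # explained sum of squares
rss = sum(e ** 2 for e in resid)               # residual sum of squares
tss = sum((yi - y_bar) ** 2 for yi in y)       # total sum of squares

r2 = ess / tss
r2_alt = 1 - rss / tss
print(ess, rss, tss, r2, r2_alt)
```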
DEGREES OF FREEDOM
• $n$ is the total number of observations
• $k$ is the total number of estimated coefficients
• $df$ for $RSS = n - k$
GOODNESS OF FIT: R SQUARED ADJUSTED
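The slide's formula is not reproduced here, so the sketch below assumes the standard textbook definition, $\bar{R}^2 = 1 - (1 - R^2)\frac{n - 1}{n - k}$, where $n - k$ is the degrees of freedom of RSS and $n - 1$ that of TSS. The function name `adjusted_r2` and the input values are hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R squared for n observations and k estimated coefficients
    (k counts the intercept), using the standard textbook definition."""
    if n <= k:
        raise ValueError("need more observations than estimated coefficients")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k)

# Illustrative values: R squared of 0.90 from 50 observations and 5 coefficients.
r2_adj = adjusted_r2(0.90, n=50, k=5)
print(r2_adj)
```

Because the adjustment penalizes each additional estimated coefficient, the adjusted value is below the unadjusted one whenever $k > 1$ and $R^2 < 1$.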
DUMMIES
Regression inclusion:
• Include white and black as regressors
• "Others" serves as the base category
THE WAGE FUNCTION WITH CATEGORICAL VARIABLES
• The $\beta$ on white (black) indicates the difference in wage between the white (black) group and the base category (“others”).
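A minimal Python sketch of this dummy coding follows, assuming a wage equation that contains only an intercept and the two dummies. The sample values and the helpers `dummies`, `solve` and `ols` are made up for illustration. In this simplified setting the intercept equals the mean wage of the base category and each dummy coefficient equals the gap between its group mean and the base mean.

```python
# Hypothetical sample: (race, hourly wage). Values are made up for illustration.
sample = [
    ("white", 14.0), ("white", 16.0), ("white", 15.0),
    ("black", 11.0), ("black", 13.0),
    ("others", 12.0), ("others", 14.0), ("others", 13.0),
]

def dummies(race):
    """Encode race as [intercept, white, black]; 'others' is the base (0, 0)."""
    return [1.0, float(race == "white"), float(race == "black")]

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

X = [dummies(race) for race, _ in sample]
y = [wage for _, wage in sample]
b_const, b_white, b_black = ols(X, y)

def group_mean(group):
    wages = [w for race, w in sample if race == group]
    return sum(wages) / len(wages)

print(b_const, b_white, b_black)
```

Only two of the three categories enter as regressors; adding a third dummy for "others" alongside the intercept would make the regressors perfectly collinear.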
HYPOTHESIS TESTING
Testing individual coefficient: t test
Testing multiple coefficients: F test
• To test the following hypothesis:
• $H_0: \beta_k = 0$
• $H_1: \beta_k \neq 0$
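As a hedged illustration of the t test for an individual coefficient, the Python sketch below assumes the usual statistic, the estimate divided by its standard error, compared with a critical value from a t table with $n - k$ degrees of freedom. The data are made up, the model has one regressor for simplicity, and the critical value 3.182 is the standard two-sided 5% value for 3 degrees of freedom.

```python
import math

# Hypothetical data for a model with an intercept and one regressor.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

n = len(y)
k = 2  # estimated coefficients: intercept and one slope

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

# OLS estimates.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar

# Estimated error variance uses the degrees of freedom of RSS, n - k.
rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
sigma2 = rss / (n - k)

se_b1 = math.sqrt(sigma2 / sxx)  # standard error of the slope
t_stat = b1 / se_b1              # t statistic under H0: slope = 0

T_CRIT = 3.182  # two-sided 5% critical value, 3 degrees of freedom (t table)
reject_h0 = abs(t_stat) > T_CRIT
print(t_stat, reject_h0)
```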