Bivariate Regression - Part I: Indep Var / Dep Var Continuous Discrete
II. Regression.
A. OLS. With OLS (Ordinary Least Squares) Regression, we are interested in how
changes in one set of variables are related to changes in another set. That is, we want to describe
or estimate the value of one variable, called the dependent variable, on the basis of one or more
other variables, called independent variables.
Examples:
• What is the relationship between education and income? For each year of education, how much does income increase (on average)?
• What will be the rate of return on investment? For each dollar invested, how much will sales increase?
• For a political candidate, how many votes will she get for each dollar she spends on advertising?
It is usually not the case that the independent variables will perfectly predict the values of
the dependent variable. For the most part, we are interested in determining the average
relationship between the dependent and independent variables. That is, we want to know
E(Y | X): for a particular value of X, what value, on average, do people have on Y?
For most regression problems, the average relationship between the dependent variable
(Y) and the independent variable (X) is assumed to be linear. That is, the population regression
line is

E(Y | X) = α + βX
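For example, with hypothetical parameter values chosen here for illustration (α = 5,000 and β = 1,500, education in years, income in dollars), the expected income of people with 16 years of education would be E(Y | X = 16) = 5,000 + 1,500 × 16 = $29,000.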
Of course, not all people with 16 years of education will make $29,000. Some will make
more, some will make less. The score a particular individual has on Y can be written as:
yᵢ = α + βxᵢ + εᵢ; or,
yᵢ = E(y | xᵢ) + εᵢ
Since α and β are unknown population parameters, we estimate them from sample data. The estimated (sample) regression line is

Ŷ = a + bX; or
Ŷ = α̂ + β̂X

so that an individual observation can be decomposed into a fitted value plus a residual:

yᵢ = a + bxᵢ + eᵢ = ŷᵢ + eᵢ; or
yᵢ = α̂ + β̂xᵢ + ε̂ᵢ = ŷᵢ + ε̂ᵢ
We therefore want values for a and b that make the residuals, eᵢ = yᵢ - ŷᵢ, collectively as
small as possible. The approach used to do this is called Ordinary Least Squares (OLS). With
OLS, we choose a and b so as to minimize the sum of squared residuals,

∑eᵢ² = ∑(yᵢ - ŷᵢ)²
Minimizing this sum yields

b = [∑(xᵢ - x̄)(yᵢ - ȳ) / (N - 1)] / [∑(xᵢ - x̄)² / (N - 1)] = s_xy / s_x²

a = ȳ - bx̄

That is, the slope is the sample covariance of X and Y divided by the sample variance of X.
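As a concrete illustration (not from the original notes), the short Python sketch below generates hypothetical education/income data, computes b and a from the formulas above, cross-checks them against numpy's least-squares fit, and confirms that perturbing the coefficients only increases ∑eᵢ²:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data for illustration: x = years of education,
# y = income in dollars, generated from y = 5000 + 1500x + noise.
x = rng.uniform(8, 20, size=200)
y = 5000 + 1500 * x + rng.normal(0, 4000, size=200)

# OLS estimates from the formulas above; the 1/(N-1) factors cancel,
# so plain sums suffice: b = s_xy / s_x^2, a = ybar - b*xbar.
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()

# Cross-check against numpy's degree-1 least-squares fit.
b_np, a_np = np.polyfit(x, y, deg=1)
assert np.allclose([a, b], [a_np, b_np])

# OLS minimizes the sum of squared residuals: any perturbation of
# (a, b) makes the SSE strictly larger.
def sse(a_, b_):
    return np.sum((y - (a_ + b_ * x)) ** 2)

for da, db in [(500, 0), (-500, 0), (0, 50), (0, -50)]:
    assert sse(a + da, b + db) > sse(a, b)

print(f"a = {a:.1f}, b = {b:.1f}")
```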
Note that the regression line always passes through the point of means (x̄, ȳ):

ŷ(x̄) = a + bx̄ = (ȳ - bx̄) + bx̄ = ȳ

Hence, if the slope is zero, the best estimate of y is simply ȳ for every value of x; put another
way, knowing x is of no value to you when predicting y.
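A quick numeric check of both facts, under the same hypothetical setup as the sketch above:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(8, 20, size=200)
y = 5000 + 1500 * x + rng.normal(0, 4000, size=200)

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()

# The fitted line passes through the point of means (xbar, ybar).
assert np.isclose(a + b * x.mean(), y.mean())

# If the slope is forced to zero, the prediction is a constant c,
# and sum((y - c)^2) is smallest at c = ybar.
sse0 = lambda c: np.sum((y - c) ** 2)
assert sse0(y.mean()) < sse0(y.mean() + 100)
assert sse0(y.mean()) < sse0(y.mean() - 100)
```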
This is why the most common hypothesis test in regression asks whether the slope is zero:
H0: β = 0
HA: β ≠ 0
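As a sketch of how this test is run in practice (the data are hypothetical, and scipy is assumed available): scipy.stats.linregress reports the two-sided p-value for exactly this null hypothesis.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.uniform(8, 20, size=200)
y = 5000 + 1500 * x + rng.normal(0, 4000, size=200)

# linregress reports the two-sided p-value for H0: slope = 0.
res = stats.linregress(x, y)
print(f"b = {res.slope:.1f}, se(b) = {res.stderr:.1f}, p = {res.pvalue:.3g}")

if res.pvalue < 0.05:           # alpha = .05
    print("Reject H0: the slope differs significantly from zero.")
else:
    print("Fail to reject H0.")
```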
3. E(ε) = 0. The average error is zero; positive errors are offset by
negative errors.
4. COV(εₖ, εⱼ) = 0 for j ≠ k. Knowing one error term tells you nothing
about the value of another error term. A violation of this assumption can occur when cases are
not independent of each other (e.g. husbands and their wives are treated as separate cases in one
sample). Serial correlation is another common violation: errors are correlated across time, as
when you collect data on the same industries at multiple points in time.
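A small simulation sketch (hypothetical AR(1) errors, numpy assumed) may make these two assumptions concrete: well-behaved errors average out to roughly zero and are uncorrelated with their neighbors, while serially correlated errors are not.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000

# Assumption 3: errors drawn with mean zero average out to roughly 0.
eps = rng.normal(0, 1, size=n)
print(f"mean(eps) = {eps.mean():.3f}")

# Assumption 4: independent errors show ~0 lag-1 correlation ...
print(f"corr(eps_t, eps_t-1) = {np.corrcoef(eps[1:], eps[:-1])[0, 1]:.3f}")

# ... while AR(1) errors (a stylized form of serial correlation:
# e_t = rho * e_{t-1} + noise) are strongly correlated across time.
rho = 0.8
ar = np.zeros(n)
for t in range(1, n):
    ar[t] = rho * ar[t - 1] + rng.normal(0, 1)
print(f"corr(ar_t, ar_t-1) = {np.corrcoef(ar[1:], ar[:-1])[0, 1]:.3f}")
```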