CHP 3 Notes, Gujarati
Estimation
We can rewrite the sample regression function as $\hat{u}_i = Y_i - \hat{Y}_i = Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i$. In other words,
the residuals are the differences between the actual and the estimated $Y_i$ values.
With n observations, we want to choose $\hat{\beta}_1$ and $\hat{\beta}_2$ such that the sum of the residuals,
$\sum_{i=1}^{n}\hat{u}_i = \sum_{i=1}^{n}(Y_i - \hat{Y}_i)$, is minimized. This turns out not to be a very good rule because some
residuals are negative and some are positive (and they would cancel each other), and all residuals
have the same weight (importance) even though some are small and some are large.
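As a quick numerical illustration of why the plain sum of residuals is a poor criterion, consider the following sketch (the three data points are made up for illustration, not from the text):

import numpy as np

X = np.array([1.0, 2.0, 3.0])
Y = np.array([2.0, 4.0, 6.0])

def residual_sums(b1, b2):
    # Returns (sum of residuals, sum of squared residuals) for the line b1 + b2*X
    u = Y - (b1 + b2 * X)
    return u.sum(), (u ** 2).sum()

print(residual_sums(0.0, 2.0))        # perfect fit: both sums are 0
print(residual_sums(Y.mean(), 0.0))   # poor flat fit: residual sum is still 0, squared sum is 8

Only the sum of squared residuals distinguishes the good fit from the bad one, which motivates the least-squares criterion below.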
The least-squares criterion instead chooses $\hat{\beta}_1$ and $\hat{\beta}_2$ to minimize the sum of squared residuals,
$\sum \hat{u}_i^2 = \sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i)^2$. Differentiating with respect to each coefficient gives
$\partial(\sum \hat{u}_i^2)/\partial \hat{\beta}_1 = -2\sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i)$ and
$\partial(\sum \hat{u}_i^2)/\partial \hat{\beta}_2 = -2\sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i) X_i$.
Setting these two equations equal to zero and rearranging terms yields the normal equations
$$\sum Y_i = n\hat{\beta}_1 + \hat{\beta}_2 \sum X_i$$
$$\sum Y_i X_i = \hat{\beta}_1 \sum X_i + \hat{\beta}_2 \sum X_i^2$$
Solving these two equations simultaneously gives
$$\hat{\beta}_2 = \frac{\sum x_i y_i}{\sum x_i^2} \qquad \text{and} \qquad \hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X},$$
where $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$ (these are the variables expressed as deviations from their sample means).
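As a minimal Python sketch of these formulas (the data below are made-up illustrative values, not from the text):

import numpy as np

X = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
Y = np.array([15.0, 22.0, 35.0, 41.0, 56.0])

x = X - X.mean()          # deviations from the mean of X
y = Y - Y.mean()          # deviations from the mean of Y

beta2_hat = (x * y).sum() / (x ** 2).sum()
beta1_hat = Y.mean() - beta2_hat * X.mean()

Y_hat = beta1_hat + beta2_hat * X   # fitted values
u_hat = Y - Y_hat                   # residuals

print(beta1_hat, beta2_hat)

The same slope and intercept can be cross-checked with np.polyfit(X, Y, 1), which returns the slope and intercept of the least-squares line.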
A1: Linear regression model [the regression model is linear in the parameters]
A7: The number of observations n must be greater than the number of parameters k.
A10: There is no perfect multicollinearity [there are no perfect linear relationships among
explanatory variables]
The variance and standard error of $\hat{\beta}_1$ are
$$\operatorname{var}(\hat{\beta}_1) = \frac{\sum X_i^2}{n\sum x_i^2}\,\sigma^2 \qquad \text{and} \qquad \operatorname{se}(\hat{\beta}_1) = \sqrt{\frac{\sum X_i^2}{n\sum x_i^2}}\,\sigma.$$
Since $\sigma^2$ is unknown, it is estimated by $\hat{\sigma}^2 = \sum \hat{u}_i^2/(n-2)$, where $\sum \hat{u}_i^2$ is the residual sum of squares (RSS) and $n - 2$ is the number of degrees of freedom (df).
$\hat{\sigma} = \sqrt{\sum \hat{u}_i^2/(n-2)}$ is called the standard error of estimate or standard error of regression. This is the standard
deviation of the Y values around the estimated regression line, and it is often used as a measure
of “goodness of fit”.
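Continuing the same illustrative data used earlier, $\hat{\sigma}$ and $\operatorname{se}(\hat{\beta}_1)$ can be computed as follows (a sketch under the same made-up data, not a textbook example):

import numpy as np

# Same illustrative data and OLS estimates as in the earlier sketch
X = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
Y = np.array([15.0, 22.0, 35.0, 41.0, 56.0])
x, y = X - X.mean(), Y - Y.mean()
beta2_hat = (x * y).sum() / (x ** 2).sum()
beta1_hat = Y.mean() - beta2_hat * X.mean()
u_hat = Y - (beta1_hat + beta2_hat * X)

n = len(X)
rss = (u_hat ** 2).sum()                  # residual sum of squares
sigma_hat = np.sqrt(rss / (n - 2))        # standard error of regression
se_beta1 = sigma_hat * np.sqrt((X ** 2).sum() / (n * (x ** 2).sum()))
print(sigma_hat, se_beta1)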
Properties of Least-squares Estimators
Gauss-Markov Theorem
Given the assumptions of the CLRM, the least-squares estimators, in the class of unbiased linear
estimators, have minimum variance, that is, they are BLUE (best linear unbiased estimators).
The properties presented above are finite (small) sample properties. We will discuss large sample
properties later on.
r2 (two-variable case) or R2 (multiple regression) tells us how well the sample regression line fits
the data.
$$r^2 = \frac{\sum(\hat{Y}_i - \bar{Y})^2}{\sum(Y_i - \bar{Y})^2} = \frac{ESS}{TSS} \qquad \text{or, alternatively,} \qquad r^2 = 1 - \frac{\sum \hat{u}_i^2}{\sum(Y_i - \bar{Y})^2} = 1 - \frac{RSS}{TSS}$$
Where ESS = explained sum of squares; RSS = residual sum of squares; and TSS = total sum of
squares.
r2 is called the Coefficient of Determination and it “measures the percentage of the total variation
in Y explained by the regression model”. r2 is a non-negative number that ranges from zero to
one: zero means no fit and an r-squared of one means a perfect fit.
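The two expressions for r2 give the same value, which the following sketch verifies on the illustrative data used earlier (assumed values, not from the text):

import numpy as np

X = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
Y = np.array([15.0, 22.0, 35.0, 41.0, 56.0])
x, y = X - X.mean(), Y - Y.mean()
beta2_hat = (x * y).sum() / (x ** 2).sum()
beta1_hat = Y.mean() - beta2_hat * X.mean()
Y_hat = beta1_hat + beta2_hat * X
u_hat = Y - Y_hat

tss = (y ** 2).sum()                      # total sum of squares
ess = ((Y_hat - Y.mean()) ** 2).sum()     # explained sum of squares
rss = (u_hat ** 2).sum()                  # residual sum of squares
print(ess / tss, 1 - rss / tss)           # the two r-squared expressions agree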
The sample correlation coefficient can be estimated as
$$r = \pm\sqrt{r^2} = \frac{\sum x_i y_i}{\sqrt{(\sum x_i^2)(\sum y_i^2)}}$$
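A short check of this formula against NumPy's built-in correlation, on the same assumed data used above:

import numpy as np

X = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
Y = np.array([15.0, 22.0, 35.0, 41.0, 56.0])
x, y = X - X.mean(), Y - Y.mean()

r_formula = (x * y).sum() / np.sqrt((x ** 2).sum() * (y ** 2).sum())
r_numpy = np.corrcoef(X, Y)[0, 1]   # symmetric: corrcoef(Y, X) gives the same value
print(r_formula, r_numpy)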
Properties of r:
1. It can be positive or negative [depends on the sign of the numerator]
2. −1 ≤ r ≤ 1
3. It is symmetrical [i.e., you get the same value whether you calculate it between X and Y,
or between Y and X]
4. It is independent of the origin and scale
5. If X and Y are independent then the correlation coefficient is zero [but zero correlation
does not necessarily imply independence]
6. It is a measure of linear association only
7. It does not imply that there is any cause-and-effect relationship
Monte Carlo Experiment (See Example on Page 92)
A Monte Carlo experiment is essentially a computer simulation that is useful to check the
sampling properties of estimators. If you know the true values of the parameters, then you would
choose the sample size, fix the values of the independent variables at given levels, and draw
random values of the residual to obtain values of the dependent variable. You can do this since
you know the X’s, the betas and u. The generated values for Y are then used with the values of X
to get the parameter estimates (the estimated betas).
You would repeat this experiment 100 or 1,000 times, which will generate 100 or 1,000
parameter estimates. If the average values of these estimates are close to the true values then the
Monte Carlo experiment tells you that your estimator is unbiased. In general, Monte Carlo
experiments are useful when we want to know the statistical properties of different ways of
estimating population parameters.
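A compact sketch of such an experiment (the true parameter values, sample size, and error standard deviation below are illustrative assumptions in the spirit of the textbook example, not taken from it):

import numpy as np

rng = np.random.default_rng(0)

# Assumed "true" model for the simulation: Y = beta1 + beta2*X + u
beta1_true, beta2_true = 20.0, 0.6
X = np.arange(10.0, 110.0, 10.0)           # fixed X values, n = 10
n_reps = 1000

b1_estimates, b2_estimates = [], []
for _ in range(n_reps):
    u = rng.normal(0.0, 2.0, size=X.size)  # draw random residuals
    Y = beta1_true + beta2_true * X + u    # generate the dependent variable
    x, y = X - X.mean(), Y - Y.mean()
    b2 = (x * y).sum() / (x ** 2).sum()    # OLS estimates for this replication
    b1 = Y.mean() - b2 * X.mean()
    b1_estimates.append(b1)
    b2_estimates.append(b2)

# If OLS is unbiased, these averages should be close to the true values 20 and 0.6
print(np.mean(b1_estimates), np.mean(b2_estimates))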