UNIT 4 MULTIPLE REGRESSION MODEL
Structure
4.0 Objectives
4.1 Introduction
4.2 Multiple Regression Model
4.3 Additional Assumptions
4.4 Tests of Significance
4.5 Coefficient of Determination
4.6 Matrix Form of Multiple Regression Model
4.7 Structural Stability of Regression Models - Chow Test
4.8 Prediction with Multiple Regression Model
4.9 Let Us Sum Up
4.10 Key Words
4.11 Some Useful Books
4.12 Answers/Hints to Check Your Progress Exercises
4.0 OBJECTIVES
After going through this unit, you will be in a position to:
• explain the concept of multiple regression;
• explain the concept of coefficient of determination;
• test the structural stability of regression models; and
• make predictions with multiple regression.
4.1 INTRODUCTION
In Units 2 and 3, we studied the basic statistical tools and procedures for
analysing relationships between two variables. However, the two-variable
framework is too restrictive for realistic analyses of economic phenomena.
Economic models generally contain more than two variables. When a
regression equation has three or more variables, we call it a multiple
regression model. The statistical formulas for estimating the parameters,
the variance, and testing the parameters are very similar, or in some cases
identical, to those of the two-variable regression model.
The simplest possible multiple regression model (without using matrix
algebra) is three-variable regression, with one dependent variable and two
explanatory variables. If you can understand the analysis of a relationship
between three variables, you should be able to generalise the concept to a
multiple regression model. A conventional example of a three-variable
equation is a demand equation in which quantity demanded depends not only
on the price of the commodity but also on the income of a consumer.
4.2 MULTIPLE REGRESSION MODEL
The multiple regression model with k variables can be written as

Y_i = β₁X_i1 + β₂X_i2 + ... + β_kX_ik + ε_i        ...(4.1)

where Y is the dependent variable, the X's are the independent variables, and ε
is the error term. X_ij represents the i-th observation on explanatory variable
X_j. Explanatory variable X₁ is taken here as 1, so that β₁ is the constant term
or intercept and β₂, ..., β_k are the slopes of the equation.
For the three-variable model

Y_i = β₁ + β₂X_i2 + β₃X_i3 + ε_i        ...(4.2)

the error sum of squares is

ESS = Σ ê_i² = Σ (Y_i − Ŷ_i)²,  where  Ŷ_i = β̂₁ + β̂₂X_i2 + β̂₃X_i3
4.3 ADDITIONAL ASSUMPTIONS

The assumptions of the multiple regression model are quite similar to those of
the two-variable model. Besides the assumptions of the two-variable regression
model, the multiple regression model has another assumption, viz., no exact
linear relationship exists between two or more independent variables, or we
can say there is no multicollinearity. Suppose in eq. (4.2), Y, X₂ and X₃
represent consumption expenditure, income and wealth of the consumer
respectively. In postulating that consumption expenditure is linearly related to
income and wealth, economic theory presumes that wealth and income may
have some independent influence on consumption. If not, there is no sense in
including both income and wealth variables in the model. In the extreme, if
there is an exact linear relationship between income and wealth, we have only
one independent variable, not two, and there is no need to include both the
variables. In short, the assumption of no multicollinearity requires that in the
regression model we include only those variables which are not linear
functions of some of the variables in the model.
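A quick way to see whether an exact linear relationship lurks among the regressors is to check the rank of the data matrix. The following sketch (with made-up income and wealth figures, not taken from the text) shows how the rank drops when wealth is an exact linear function of income:

```python
import numpy as np

# Hypothetical data: income (X2) and wealth (X3) for 5 consumers.
income = np.array([10.0, 12.0, 15.0, 20.0, 24.0])
wealth_ok = np.array([50.0, 48.0, 70.0, 95.0, 110.0])  # not a linear function of income
wealth_bad = 5.0 * income + 2.0                         # exact linear function of income

def design_matrix(x2, x3):
    """Stack the intercept column (X1 = 1) with the two regressors."""
    return np.column_stack([np.ones_like(x2), x2, x3])

X_ok = design_matrix(income, wealth_ok)
X_bad = design_matrix(income, wealth_bad)

# With no exact collinearity, rank equals k = 3; with it, rank drops below k.
print(np.linalg.matrix_rank(X_ok))    # 3
print(np.linalg.matrix_rank(X_bad))   # 2
```

When the rank falls below k, the matrix X′X is singular and the least squares estimates are not uniquely defined, which is precisely why the assumption is needed.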
4.4 TESTS OF SIGNIFICANCE

The F-statistic tests whether the regression as a whole explains a significant
proportion of the variation of Y about its mean. In other words, the F-statistic
tests the joint hypothesis that β₂ = β₃ = ... = β_k = 0. It can be shown that

F = [RSS/(k − 1)] / [ESS/(n − k)]

If the null hypothesis is true, then we would expect RSS, R², and therefore
F to be close to 0. Thus, a high value of the F-statistic is a rationale for
rejecting the null hypothesis. An F-statistic that is not significantly different
from 0 lets us conclude that the explanatory variables do little to explain the
variation of Y about its mean. In the two-variable model, the F-statistic tests
whether the regression line is horizontal. In such a case, R² = 0 and the
regression explains none of the variation in the dependent variable. Note that
we do not test whether the regression passes through the origin; our objective
is simply to see whether we can explain any variation around the mean of Y.
The F-test of the significance of a regression equation may allow for
rejection of the null hypothesis even though none of the regression
coefficients is found to be significant according to individual t tests. This
situation may arise, for example, if the independent variables are highly
correlated with each other. The result may be high standard errors for the
coefficients and low t values, yet the model as a whole may fit the data well.
¹ Recall that in a two-variable model we tested for the significance of β̂ at (N − 2) degrees of
freedom, because the model included two explanatory variables, the constant term and X.
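The situation just described (a significant F alongside insignificant individual t values) can be reproduced with simulated data in which the two regressors are almost perfectly correlated; all numbers below are artificial:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3

# Two almost perfectly correlated regressors (simulated data).
x2 = rng.normal(size=n)
x3 = x2 + 0.01 * rng.normal(size=n)
y = 1.0 + x2 + x3 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x2, x3])
b = np.linalg.solve(X.T @ X, X.T @ y)       # OLS estimates
e = y - X @ b                                # residuals
ess = e @ e                                  # error sum of squares
tss = (y - y.mean()) @ (y - y.mean())
rss = tss - ess                              # regression sum of squares

F = (rss / (k - 1)) / (ess / (n - k))
s2 = ess / (n - k)
se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
t = b / se

print(f"F = {F:.1f}")                # large: the regressors are jointly significant
print("t values:", np.round(t, 2))   # t on x2 and x3 individually small
```

The near-collinearity inflates the standard errors of the individual slopes, driving their t values toward zero, while the joint explanatory power, and hence F, stays large.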
Basic Econometric Theory
We can break down the difference between Y_i and its mean Ȳ as follows:

(Y_i − Ȳ) = (Ŷ_i − Ȳ) + (Y_i − Ŷ_i)
Example 4.1

Following is a numerical example for a three-variable case.

(Table: sample observations on Y, X₂ and X₃)
The above information gives us the sums of squares and cross-products needed
for estimation. Therefore,

ESS = Σ ê_i² = 96.84

so that

(the estimated regression equation follows, with the standard error of each
coefficient in parentheses)

From this equation we can compute

RSS = Σ ŷ_i² = 1383.16

and

F = [RSS/(k − 1)] / [ESS/(n − k)] = 14.28
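The computational sequence of Example 4.1, namely estimating the equation and then obtaining ESS, RSS and the F-statistic, can be sketched as follows. The data are hypothetical stand-ins, since the original table is not reproduced:

```python
import numpy as np

# Hypothetical observations on Y, X2, X3 (stand-ins for the lost table).
y  = np.array([10.0, 12.0, 15.0, 14.0, 18.0, 20.0, 23.0, 21.0])
x2 = np.array([ 2.0,  3.0,  4.0,  4.0,  5.0,  6.0,  7.0,  6.0])
x3 = np.array([ 1.0,  2.0,  2.0,  3.0,  3.0,  4.0,  5.0,  5.0])

n, k = len(y), 3
X = np.column_stack([np.ones(n), x2, x3])

bhat, *_ = np.linalg.lstsq(X, y, rcond=None)  # least squares estimates
yhat = X @ bhat
ess = np.sum((y - yhat) ** 2)                 # error sum of squares
rss = np.sum((yhat - y.mean()) ** 2)          # regression sum of squares
tss = np.sum((y - y.mean()) ** 2)             # total sum of squares

F = (rss / (k - 1)) / (ess / (n - k))
print(f"ESS = {ess:.4f}, RSS = {rss:.4f}, F = {F:.2f}")
```

Because the regression includes an intercept, TSS = RSS + ESS holds exactly, which is the decomposition used above.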
Example 4.2

(Table: year-wise data on the actual inflation rate Y, the unemployment rate X₂
and the expected inflation rate X₃, 1970-1982)
Based on the above data, the OLS method gives the following result:

Ŷ_t = 7.19 − 1.3925 X_2t + 1.4700 X_3t

where figures in the parentheses are the estimated standard errors. The
interpretation of this regression is as follows: for the sample period, if both
X₂ and X₃ were fixed at zero, the average rate of actual inflation would have
been about 7.19 per cent. The partial regression coefficient of −1.3925 means
that, holding X₃ (the expected inflation rate) constant, the actual inflation
rate on the average increased (decreased) by about 1.4 per cent for every one
unit (here one percentage point) decrease (increase) in the unemployment rate
over the period 1970-1982. Likewise, holding the unemployment rate
constant, the coefficient value of 1.4700 implies that over the same time
period the actual inflation rate on the average increased by about 1.47 per cent
for every one percentage point increase in the anticipated or expected rate of
inflation. The R² value of 0.88 means that the two explanatory variables
together account for about 88 per cent of the variation in the actual
inflation rate, a fairly high amount of explanatory power since R² can at best
be one.
Example 4.3

With the information given in Example 4.2, we will now test whether the
coefficients significantly determine the rate of inflation. We can postulate the
following hypotheses:

1) H₀: β₂ = 0 and H₁: β₂ ≠ 0
2) H₀: β₃ = 0 and H₁: β₃ ≠ 0
The null hypothesis in (1) states that, holding X₃ constant, unemployment has
no (linear) influence on the actual rate of inflation. Similarly, the null
hypothesis in (2) states that, holding X₂ constant, the expected inflation rate
has no (linear) influence on the actual rate of inflation.
1) Using the two-tailed t test, the critical t value is 3.169. Since the
computed t value of 4.566 exceeds the critical value of 3.169, we may
reject the null hypothesis and say that β̂₂ is statistically significant, that
is, significantly different from zero.

2) In this case, the computed t value exceeds the critical t value of 3.169
at 1 per cent level of significance. Therefore, we reject the null
hypothesis and find that the coefficient is significantly different from
zero.
The above two t-tests show that the independent variables have
significant influence on the dependent variable. In other words, the
increase (decrease) in actual inflation is due to both the decrease
(increase) in the unemployment rate as well as the increase (decrease) in
expected inflation.
against the alternative that at least one of the β's is nonzero. Present
the appropriate test and state its distribution (including the numbers of
degrees of freedom).
4.6 MATRIX FORM OF MULTIPLE REGRESSION MODEL

The matrix formulation of the above model is

Y = Xβ + ε        ...(4.14)

where

Y = N × 1 column vector of dependent variable observations
X = N × k matrix of independent variable observations
β = k × 1 column vector of unknown parameters
ε = N × 1 column vector of errors
In our representation of the matrix X, each component X_ij has two
subscripts, the first signifying the appropriate column (variable) and the
second signifying the row (observation). Each column of X represents a vector
of N observations on a given variable, with all observations associated with
the intercept equal to 1.
The assumptions of the classical linear regression model can be represented as
follows:

(i) The elements of X are fixed and have finite variance. In addition, X
has rank k, which is less than the number of observations N.

(ii) The error vector ε has E(ε) = 0 and

E(εε′) = σ²I

where I is an N × N identity matrix.
The error sum of squares is

ESS = ê′ê        ...(4.16)

where

ê = Y − Ŷ        ...(4.17)

and

Ŷ = Xβ̂        ...(4.18)

ê represents the N × 1 vector of regression residuals, while Ŷ represents the
N × 1 vector of fitted values for Y. Substituting eqs. (4.17) and (4.18) into eq.
(4.16), we get

ESS = (Y − Xβ̂)′(Y − Xβ̂) = Y′Y − β̂′X′Y − Y′Xβ̂ + β̂′X′Xβ̂ = Y′Y − 2β̂′X′Y + β̂′X′Xβ̂

The last step follows because β̂′X′Y and Y′Xβ̂ are both scalars and are
equal to each other.
From Unit 2 we know that the least-squares estimators can be obtained by
minimising ESS. In order to minimise ESS we take partial derivatives with
respect to β̂ and set them to zero:

∂ESS/∂β̂ = −2X′Y + 2X′Xβ̂ = 0

so that

β̂ = (X′X)⁻¹X′Y

To see that β̂ is unbiased, substitute Y = Xβ + ε:

β̂ = (X′X)⁻¹X′(Xβ + ε) = β + (X′X)⁻¹X′ε

so that

E(β̂) = β since E(ε) = 0
By definition,

Var(β̂) = E[(β̂ − β)(β̂ − β)′] = E[(X′X)⁻¹X′εε′X(X′X)⁻¹]

Since E(εε′) = σ²I, we get

Var(β̂) = σ²(X′X)⁻¹
To show that β̂ is the best linear unbiased estimator, let us take any other
linear estimator

β* = c′Y

where c′ is a k × N matrix. Then

β* = c′(Xβ + ε)

For β* to be unbiased we require

c′X = I_k

so that

E(β*) = c′Xβ = I_k β = β

We have

β* = c′Y = c′(Xβ + ε) = c′Xβ + c′ε = β + c′ε

or

β* − β = c′ε

Now

Var(β*) = E[(β* − β)(β* − β)′] = E(c′εε′c) = σ²c′c

We have

Var(β̂) = σ²(X′X)⁻¹
Let us write

c′ = w′ + d′, where w′ = (X′X)⁻¹X′

Then

c′c = (w′ + d′)(w + d)
    = w′w + d′d + w′d + d′w
    = w′w + d′d

since w′d = 0 and d′w = 0. Now

σ²c′c = σ²w′w + σ²d′d

or

Var(β*) = Var(β̂) + σ²d′d

Since d′d is a sum of squares, Var(β*) can never be smaller than Var(β̂);
hence β̂ is the best linear unbiased estimator.
Since

Y = Xβ̂ + ê

then

Y′Y = (Xβ̂ + ê)′(Xβ̂ + ê) = β̂′X′Xβ̂ + ê′Xβ̂ + β̂′X′ê + ê′ê = β̂′X′Xβ̂ + ê′ê

because X′ê = 0. Therefore

R² = 1 − ESS/TSS = 1 − ê′ê/(Y′Y) = β̂′X′Y/(Y′Y)
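The matrix results derived above, β̂ = (X′X)⁻¹X′Y, Var(β̂) = σ²(X′X)⁻¹ and R² = β̂′X′Y/Y′Y, translate directly into a few lines of code. A sketch using simulated data (the coefficient values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)
N, k = 30, 3

# Simulated design matrix with an intercept column of ones (rank k < N).
X = np.column_stack([np.ones(N), rng.normal(size=N), rng.normal(size=N)])
beta_true = np.array([2.0, -1.0, 0.5])
Y = X @ beta_true + rng.normal(size=N)

XtX_inv = np.linalg.inv(X.T @ X)
bhat = XtX_inv @ X.T @ Y          # beta-hat = (X'X)^{-1} X'Y
e = Y - X @ bhat                  # residual vector e = Y - X beta-hat
s2 = (e @ e) / (N - k)            # unbiased estimate of sigma^2
var_bhat = s2 * XtX_inv           # estimated Var(beta-hat) = s^2 (X'X)^{-1}

# Uncentred R^2 = beta-hat' X'Y / Y'Y, as derived above.
r2 = (bhat @ X.T @ Y) / (Y @ Y)

print("beta-hat:", np.round(bhat, 3))
print("R^2 (uncentred):", round(float(r2), 3))
```

Note that the residuals satisfy X′ê = 0, the property used in the decomposition of Y′Y above.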
Find the least squares estimates of β₁, β₂ and β₃ using the data in the deviation
form. Calculate R² and interpret the estimated relation.

From the above data we construct the following deviation forms:

Now we have:
4.7 STRUCTURAL STABILITY OF REGRESSION MODELS - CHOW TEST

The parameters of a regression model may undergo change over time so that
there is a structural change in the model. In order to test for the structural
stability of a model there are several tests.
The Chow forecast test leads naturally to more general tests of structural
change. A structural change or break occurs if the parameters underlying a
relationship differ from one subset of the data to another. There may, of
course, be several relevant subsets of data, with the possibility of several
structural breaks. For the moment we will consider just two subsets of n₁ and
n₂ observations making up the total sample of n = n₁ + n₂ observations.
Suppose, for example, that we wish to investigate whether savings in a
country differ between pre-liberalisation and post-liberalisation periods.
Suppose we have observations on the relevant variables for n₁ pre-
liberalisation years and n₂ post-liberalisation years. A Chow test could be
performed by using the estimated pre-liberalisation function to forecast
savings in the post-liberalisation years. However, provided n₂ > k, one might
alternatively use the estimated post-liberalisation function to forecast savings
in pre-liberalisation years. It is, however, not clear which choice should be
made, and the two procedures might well yield different answers. If the
subsets are large enough it is better to estimate both functions and test for
common parameters.
To see the structural change, we formulate the savings functions for the two
periods as follows:

Period I:  Y_t = λ₁ + λ₂X_t + ε_1t,   t = 1, 2, ..., n₁
Period II: Y_t = γ₁ + γ₂X_t + ε_2t,   t = 1, 2, ..., n₂

where we assume that

(a) ε_1t ~ N(0, σ²) and ε_2t ~ N(0, σ²), i.e., the two error terms are
normally distributed with the same variance, and
(b) ε_1t and ε_2t are independently distributed.
The Chow test proceeds as follows:

Step I: Pool all the n = n₁ + n₂ observations and estimate a single regression
for the whole period. Obtain its residual sum of squares, say ESS_R (restricted
residual sum of squares), with DF = (n₁ + n₂ − k).

Step II: Estimate the two regressions for the two sub-periods separately and
obtain their residual sums of squares, ESS₁ and ESS₂.

Step III: Add these two residual sums of squares, say ESS_UR (unrestricted
residual sum of squares), with DF = (n₁ + n₂ − 2k). Find
ESS_R − ESS_UR, with DF = k.

Step IV: Given the assumptions of the Chow test, it can be shown that

F = [(ESS_R − ESS_UR)/k] / [ESS_UR/(n₁ + n₂ − 2k)]

follows the F distribution with (k, n₁ + n₂ − 2k) degrees of freedom. If the
computed F exceeds the critical F value, we reject the hypothesis that the two
regressions are the same.
Example 4.5

We present the data on personal savings and personal income in the United
Kingdom for the period 1946-1963 in the following. We want to test whether
the savings function is the same in the two time periods.
Period I                         Period II
1946-1954   Savings   Income     1955-1963   Savings   Income
1946        0.36      8.8        1955        0.59      15.5
1947        0.21      9.4        1956        0.90      16.7
1948        0.08      10.0       1957        0.95      17.7
1949        0.20      10.6       1958        0.82      18.6
1950        0.10      11.0       1959        1.04      19.7
1951        0.12      11.9       1960        1.53      21.1
1952        0.41      12.7       1961        1.94      22.8
1953        0.50      13.5       1962        1.75      23.9
1954        0.43      14.3       1963        1.99      25.2
Step I: The regression based on the pooled observations gives

Ŷ_t = −1.0821 + 0.1178 X_t
       (0.1452)   (0.0088)
t =   (−7.4548)  (13.4316)
Step II: The regressions for the two sub-periods are estimated separately,
giving the residual sums of squares ESS₁ and ESS₂.

Step III:

ESS_UR = ESS₁ + ESS₂ = 0.3327

Step IV: Then we have

F = [(ESS_R − ESS_UR)/k] / [ESS_UR/(n₁ + n₂ − 2k)]
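The Chow test steps can be checked against the data of Example 4.5 with a short script; the pooled (restricted) regression is re-estimated here, and the computed ESS_UR should match the 0.3327 reported in Step III:

```python
import numpy as np

# UK savings and income data, 1946-1963 (from Example 4.5).
income1  = np.array([8.8, 9.4, 10.0, 10.6, 11.0, 11.9, 12.7, 13.5, 14.3])
savings1 = np.array([0.36, 0.21, 0.08, 0.20, 0.10, 0.12, 0.41, 0.50, 0.43])
income2  = np.array([15.5, 16.7, 17.7, 18.6, 19.7, 21.1, 22.8, 23.9, 25.2])
savings2 = np.array([0.59, 0.90, 0.95, 0.82, 1.04, 1.53, 1.94, 1.75, 1.99])

def ess(x, y):
    """Residual sum of squares from an OLS regression of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    return e @ e

k = 2
n1, n2 = len(savings1), len(savings2)

ess_r = ess(np.concatenate([income1, income2]),
            np.concatenate([savings1, savings2]))        # pooled (restricted)
ess_ur = ess(income1, savings1) + ess(income2, savings2)  # unrestricted

F = ((ess_r - ess_ur) / k) / (ess_ur / (n1 + n2 - 2 * k))
print(f"ESS_UR = {ess_ur:.4f}, F = {F:.2f}")
```

The computed F can then be compared with the 5 per cent critical value of F(2, 14), about 3.74; a larger computed value leads us to reject the hypothesis that the savings function is the same in the two periods.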
4.8 PREDICTION WITH MULTIPLE REGRESSION MODEL

Consider the predictor

Ŷ₀ = β̂₁ + β̂₂X_20 + β̂₃X_30

so that

Ŷ₀ − Y₀ = (β̂₁ − β₁) + (β̂₂ − β₂)X_20 + (β̂₃ − β₃)X_30 − ε₀

Since E(β̂₁ − β₁), E(β̂₂ − β₂), E(β̂₃ − β₃) and E(ε₀) are all equal to zero,
we have E(Ŷ₀ − Y₀) = 0. Thus the predictor Ŷ₀ is unbiased.
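As a minimal illustration of the point predictor Ŷ₀ = β̂₁ + β̂₂X_20 + β̂₃X_30, with purely hypothetical coefficient estimates and a hypothetical new observation:

```python
import numpy as np

# Hypothetical OLS estimates (intercept, slope on X2, slope on X3).
bhat = np.array([4.0, 1.5, -0.8])
# New observation, with X1 = 1 for the intercept, X2_0 = 10, X3_0 = 5.
x0 = np.array([1.0, 10.0, 5.0])

# Point prediction Y-hat_0 = b1 + b2*X2_0 + b3*X3_0, written as a dot product.
y0_hat = bhat @ x0
print(y0_hat)   # 4.0 + 15.0 - 4.0 = 15.0
```

Writing the prediction as a dot product generalises directly to any number of regressors, exactly as in the matrix form Ŷ = Xβ̂.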
Period I                              Period II
1981-91     Expenditure   NSDP        1991-2001   Expenditure   NSDP
            (Rs. Cr.)     (Rs. Cr.)               (Rs. Cr.)     (Rs. Cr.)
Besides, this unit discussed testing for structural stability of regression models
and prediction of dependent variables, given the values of the independent
variables.
4.10 KEY WORDS
Adjusted R-square : The coefficient of determination (R²) adjusted for the
degrees of freedom, i.e., for the number of explanatory variables in the model.
4.11 SOME USEFUL BOOKS
Gujarati, Damodar N., 1995, Basic Econometrics, McGraw-Hill Inc.,
Singapore.

Johnston, Jack and John DiNardo, 1997, Econometric Methods, The McGraw-Hill
Companies, Inc.