ECONOMETRICS I
Prepared by:
Firehun Jemal
Fikadu Abera
Edited by:
Tigstu W/silassie
December, 2023
Bonga, Ethiopia
Econometrics Module I
Course Name: Econometrics I
Course Code: Econ 3061
Course Description: The course aims at introducing the theory (and practice) of cross-sectional
econometrics. It first makes an introduction to the basic concepts in econometrics like economic
and econometric modeling as well as types of data; then proceeds to the simple classical linear
regression model and introduces estimation techniques such as the method of moments, ordinary
least squares and maximum likelihood estimation, inference and analyses of residuals. This is
then built into the multiple linear regression framework. After making tests of linear restrictions
emanating from economic theory, the course will finally try to highlight the problems of
multicollinearity, heteroscedasticity and autocorrelation (violations of the basic assumptions of
classical linear regression models). The course builds upon your previous course Statistics for
Economists. Hence, familiarity with the material, particularly sampling distributions, estimation
and hypothesis testing will be of much help. These will be applied on Ethiopian/international
data using statistical packages.
Course Outcomes:
The main outcome of this course is to enable students to have a good background
knowledge of cross-sectional econometric models. More specifically, after the
completion of the course, students are expected to:
Distinguish between economic and econometric models;
Do simple and multiple regression with economic data (both manually and using
statistical packages);
Interpret regression results (like coefficients and R2) and test hypotheses (both
manually and using statistical packages); and
Detect (in) existence of problems of multicollinearity, heteroscedasticity and
autocorrelation as well as suggest how to rectify such problems (both manually
and using statistical packages).
Table of Contents
CHAPTER ONE: Definition and scope of econometrics
1.1 What is Econometrics?
1.2 Economic models vs. econometric models
1.3 Methodology of econometrics
1.4 The Sources, Types and Nature of Data
1.5 Desirable properties of an econometric model
1.6 Goals of Econometrics
CHAPTER TWO: THE CLASSICAL REGRESSION ANALYSIS: Simple Linear Regression Model
2.1 Concept of Regression Function
2.2 Simple Linear Regression Model
2.2.1 Assumptions of the Classical Linear Stochastic Regression Model
2.3 Methods of Estimation
2.3.1 The Ordinary Least Squares (OLS) Method and Method of Moments (MM)
2.3.2 Maximum Likelihood Estimation
2.4 Properties of OLS Estimators and the Gauss-Markov Theorem
2.5 Tests of the 'Goodness of Fit' with R²
2.6 Testing the Significance of OLS Parameters
CHAPTER THREE: THE CLASSICAL REGRESSION ANALYSIS: Multiple Linear Regression Model
3.1 Introduction
3.2 Assumptions of the Multiple Regression Model
3.3 Partial-Correlation Coefficients
3.4 A Model with Two Explanatory Variables
3.4.1 Estimation of Parameters of the Two-Explanatory-Variables Model
3.4.2 The Coefficient of Determination (R²): Two Explanatory Variables Case
3.5 Statistical Properties of the Parameters and the Gauss-Markov Theorem
3.6 Hypothesis Testing in the Multiple Regression Model
3.6.1 Tests of Individual Significance
3.6.2 Test of Overall Significance
3.7 Predictions Using Multiple Linear Regression
CHAPTER FOUR: VIOLATIONS OF CLASSICAL ASSUMPTIONS
4.1 The Assumption of Zero Expected Disturbances
4.2 The Nature of Heteroscedasticity
4.3 The Nature of Autocorrelation
4.4 Multicollinearity
4.5 Specification Errors
CHAPTER ONE
Definition and scope of econometrics
The economic theories we learn in various economics courses suggest many relationships among
economic variables. For instance, in microeconomics we learn demand and supply models in
which the quantities demanded and supplied of a good depend on its price. In macroeconomics,
we study the 'investment function' to explain the amount of aggregate investment in the economy as
the rate of interest changes; and the 'consumption function' that relates aggregate consumption to
the level of aggregate disposable income.
However, economic theories that postulate the relationships between economic variables have to
be checked against data obtained from the real world. If empirical data verify the relationship
proposed by economic theory, we accept the theory as valid. If the theory is incompatible with
the observed behavior, we either reject the theory or in the light of the empirical evidence of the
data, modify the theory. To provide a better understanding of economic relationships and a
better guidance for economic policy making we also need to know the quantitative relationships
between the different economic variables. We obtain these quantitative measurements taken
from the real world. The field of knowledge which helps us to carry out such an evaluation of
economic theories in empirical terms is econometrics.
1.1 WHAT IS ECONOMETRICS?
Literally interpreted, econometrics means “economic measurement”, but the scope of
econometrics is much broader as described by leading econometricians. Various econometricians
used different ways of wordings to define econometrics. But if we distill the fundamental
features/concepts of all the definitions, we may obtain the following definition.
“Econometrics is the science which integrates economic theory, economic statistics, and
mathematical economics to investigate the empirical support of the general schematic law
established by economic theory. It is a special type of economic analysis and research in which
the general economic theories, formulated in mathematical terms, are combined with empirical
measurements of economic phenomena. Starting from the relationships of economic theory, we
express them in mathematical terms so that they can be measured. We then use specific methods,
called econometric methods in order to obtain numerical estimates of the coefficients of the
economic relationships.”
Measurement is an important aspect of econometrics. However, the scope of econometrics is
much broader than measurement. As D. Intriligator rightly stated, the "metric" part of the word
econometrics signifies 'measurement', and hence econometrics is basically concerned with
measuring of economic relationships. In short, econometrics may be considered as the
integration of economics, mathematics, and statistics for the purpose of providing numerical
values for the parameters of economic relationships and verifying economic theories.
Econometrics vs. statistics
Econometrics differs from both mathematical statistics and economic statistics. An economic
statistician gathers empirical data, records them, tabulates them or charts them, and attempts to
describe the pattern in their development over time and perhaps detect some relationship
between various economic magnitudes. Economic statistics is mainly a descriptive aspect of
economics. It does not provide explanations of the development of the various variables and it
does not provide measurements of the coefficients of economic relationships.
Mathematical (or inferential) statistics deals with methods of measurement which are
developed on the basis of controlled experiments. But statistical methods of measurement are
not appropriate for a number of economic relationships because for most economic relationships
controlled or carefully planned experiments cannot be designed due to the fact that the nature of
relationships among economic variables are stochastic or random. Yet the fundamental ideas of
inferential statistics are applicable in econometrics, but they must be adapted to the problems of
economic life.
1.2 Economic models vs. econometric models
i) Economic models:
Any economic theory is an abstraction from the real world. For one reason, the immense
complexity of the real world economy makes it impossible for us to understand all
interrelationships at once. Another reason is that all the interrelationships are not equally
important as such for the understanding of the economic phenomenon under study. The sensible
procedure is therefore, to pick up the important factors and relationships relevant to our problem
and to focus our attention on these alone. Such a deliberately simplified analytical framework is
called an economic model. It is an organized set of relationships that describes the functioning of
an economic entity under a set of simplifying assumptions.
ii) Econometric models
An econometric model restates the economic model in a form suitable for empirical testing: the
exact relationships of economic theory are supplemented with a random disturbance term, since
observed economic data never satisfy exact relationships.
1.3 Methodology of econometrics
Econometric research typically proceeds along the following main steps.
Figure: Anatomy of econometric modeling.
1. Specification of the model
In this step the econometrician has to express the relationships between economic variables in
mathematical form. This step involves the determination of three important tasks:
i) The dependent and independent (explanatory) variables which will be included in the
model.
ii) The a priori theoretical expectations about the size and sign of the parameters of the
function.
iii) The mathematical form of the model (number of equations, specific form of the
equations, etc.)
Note: The specification of the econometric model will be based on economic theory and on any
available information related to the phenomena under investigation. Thus, specification of the
econometric model presupposes knowledge of economic theory and familiarity with the
particular phenomenon being studied.
Specification of the model is the most important and the most difficult stage of any econometric
research. It is often the weakest point of most econometric applications. In this stage there is an
enormous likelihood of committing errors, i.e. of incorrectly specifying the model.
Some of the common reasons for incorrect specification of the econometric models are:
1. The imperfection and looseness of statements in economic theories.
2. The limitation of our knowledge of the factors which are operative in any particular
case.
3. The formidable obstacles presented by data requirements in the estimation of large
models.
The most common errors of specification are:
a. Omissions of some important variables from the function.
b. The omissions of some equations (for example, in simultaneous equations model).
c. The mistaken mathematical form of the functions.
2. Estimation of the model
This is purely a technical stage which requires knowledge of the various econometric methods,
their assumptions and the economic implications for the estimates of the parameters. This stage
includes the following activities.
a. Gathering of the data on the variables included in the model.
b. Examination of the identification conditions of the function (especially for simultaneous
equations models).
c. Examination of the aggregations problems involved in the variables of the function.
d. Examination of the degree of correlation between the explanatory variables (i.e.
examination of the problem of multicollinearity).
e. Choice of the appropriate econometric technique for estimation, i.e. deciding which specific
econometric method to apply, such as OLS, maximum likelihood, Logit, or Probit.
3. Evaluation of the estimates
This stage consists of deciding whether the estimates of the parameters are theoretically
meaningful and statistically satisfactory. This stage enables the econometrician to evaluate the
results of calculations and determine the reliability of the results. For this purpose we use
various criteria which may be classified into three groups:
i. Economic a priori criteria: These criteria are determined by economic theory and refer
to the size and sign of the parameters of economic relationships.
ii. Statistical criteria (first-order tests): These are determined by statistical theory and aim
at the evaluation of the statistical reliability of the estimates of the parameters of the
model. Correlation coefficient test, standard error test, t-test, F-test, and R2-test are some
of the most commonly used statistical tests.
iii. Econometric criteria (second-order tests):
These are set by the theory of econometrics and aim at the investigation of whether the
assumptions of the econometric method employed are satisfied or not in any particular case. The
econometric criteria serve as a second order test (as test of the statistical tests) i.e. they determine
the reliability of the statistical criteria; they help us establish whether the estimates have the
desirable properties of unbiasedness, consistency, etc. Econometric criteria aim at the detection of
the violation or validity of the assumptions of the various econometric techniques.
4) Evaluation of the forecasting power of the model:
Forecasting is one of the aims of econometric research. However, before using an estimated
model for forecasting, we must assess by some means the predictive power of the model. It is
possible that the model may be economically meaningful and statistically and econometrically
correct for the sample period for which the model has been estimated, yet it may not be suitable
for forecasting for various reasons. Therefore, this stage involves the investigation of the
stability of the estimates and their sensitivity to changes in the size of the sample. Consequently,
we must establish whether the estimated function performs adequately outside the sample of
data, i.e. we must test the extra-sample performance of the model.
1.4 The Sources, Types and Nature of Data
Data are records of the actual state of some aspect of the universe at a particular point in time.
Data are not abstract; they are concrete, they are measurements or the tangible features of the
world. Data are an essential part of conducting research and they provide the evidence that links
the research to the real world.
The success of any econometric analysis ultimately depends on the availability of the appropriate
data. It is therefore essential that we spend some time discussing the nature, sources, and
limitations of the data that one may encounter in empirical analysis.
Types of Data
Three types of data may be available for empirical analysis: time series, cross-section, and
pooled (i.e., combination of time series and cross-section) data.
Qualitative data are sometimes called dummy variables or categorical variables. These are
variables that cannot be quantified.
Example: male or female, married or unmarried, religion, etc.
Quantitative data are data that can be quantified.
Example: income, prices, money, etc.
a) Time Series data
✓ These are observations on the values that one or more variables take over successive
periods of time (e.g., annual GDP or monthly price figures).
b) Cross-Section data
✓These data give information on the variables concerning individual agents (consumers or
producers) at a given point of time.
✓ many units observed at one point in time
✓ Generally obtained through official records of individual units, surveys, questionnaires
(data collection instrument that contains a series of questions designed for a specific
purpose)
Example:
- the census of population conducted by CSA.
Note that due to heterogeneity, cross- sectional data have their own problems.
c) Pooled Data
✓ These consist of cross-sectional data sets collected at different points in time.
✓ They contain elements of both time series and cross-sectional data.
✓Consists of cross-sectional data sets that are observed in different time periods and
combined together
✓ At each time period (e.g., year) a different random sample is chosen from population
✓ Individual units are not the same
✓ For example, if we choose a random sample of 400 firms in 2022, choose another
sample in 2023 and combine these cross-sectional data sets, we obtain a pooled
cross-section data set.
d) Panel (longitudinal) data
The panel or longitudinal data also called micro panel data, is a special type of pooled data in
which the same cross-sectional unit is surveyed over time.
Source of Data
Based on the source, the type of data collected could be primary or secondary in nature.
• Primary data are those which are collected afresh and for the first time, and thus happen to be
original in character. Its advantage is its relevance to the user, but it is also likely to be expensive
in time and money terms to collect.
• Secondary data are those which have already been collected by someone else and which have
already been passed through the statistical process. It is information extracted from an existing
source, probably published or held on a computer database.
Nature of Data
a) Nominal data - The nominal scale is used for assigning numbers as the identification of
individual units. For example, the classification of journals according to the discipline they belong
to, may be considered as nominal data. If numbers are assigned to describe the categories, the
numbers represent only the name of the category.
b) Ordinal data - It indicates the ordered or graded relationship among the numbers assigned to
the observations made. These numbers connote ranks of different categories having relationship
in a definite order. For example, to study the responsiveness of library staff a researcher may
assign '1' to indicate poor, '2' to indicate average, '3' to indicate good and '4' to indicate excellent.
The numbers 1,2, 3 and 4 in this case are set of ordinal data which indicate that 4 is better than 3
which in turn is better than 2 and so on. The ordinal data show the direction of the difference but
not the exact amount of difference.
c) Interval data - Interval data are ordered categories of data and the differences between various
categories are of equal measurement.
For example, we can measure the IQ (Intelligence Quotient) of a group of children. After
assigning numerical value to the IQ of each child, the data can be grouped with interval of 10,
like 0 to 10, 10 to 20, 20 to 30 and so on. In this case, '0' does not mean the absence of
intelligence, and children with IQ '20' are not twice as intelligent as children with IQ '10'.
d) Ratio data - Ratio data are the quantitative measurement of a variable in terms of magnitude.
In ratio data, we can say that one thing is twice or thrice of another as for example,
measurements involving weight, distance, price, etc.
CHAPTER TWO
THE CLASSICAL REGRESSION ANALYSIS: Simple Linear Regression
Model
2.1. Concept of Regression Function
In economics the relationships between variables are mainly explained in the form of dependent
and independent variables. The dependent variable is the variable whose average value is
computed using the known values of the explanatory variable(s), while the values of the
explanatory variables are taken as fixed in repeated sampling from the population.
Ex. Suppose the amount of a commodity demanded by an individual depends on the
price of the commodity, the income of the individual, the prices of other goods, etc. From this
statement, quantity demanded is the dependent variable, whose value is determined by
the price of the commodity, the income of the individual, the prices of other goods, etc.; and the
price of the commodity, the income of the individual and the prices of other goods are the
independent (explanatory) variables, whose values are obtained from the population using
repeated sampling. The relationship between these dependent and independent variables is the
concern of regression analysis, i.e.
Qd = f(P, P0, Y, ...)
If we study the relationship between the dependent variable and one independent variable, i.e.
Qd = f(P), this is known as a simple two-variable regression model, because there is one
dependent variable (Qd) and one independent variable (P). However, if the dependent variable
depends upon more than one independent variable, such as Qd = f(P, P0, Y), it is known as
multiple regression analysis. The functional relationship between the dependent and independent
variables may be linear or non-linear.
The key concept underlying regression analysis is the concept of the conditional expectation
function (CEF), or population regression function (PRF). Our objective in regression analysis
is to find out how the average value of the dependent variable (or regressand) varies with the
given value of the explanatory variable (or regressor).
This section largely deals with linear PRFs, that is, regressions that are linear in the parameters.
They may or may not be linear in the regressand or the regressors. For empirical purposes, it is
the stochastic PRF that matters. The stochastic disturbance term ui plays a critical role in
estimating the PRF. The PRF is an idealized concept, since in practice one rarely has access to
the entire population of interest. Usually, one has a sample of observations from the population.
Therefore, one uses the stochastic sample regression function (SRF) to estimate the PRF.
Economic theories are mainly concerned with the relationships among various economic
variables. These relationships, when phrased in mathematical terms, can predict the effect of one
variable on another. The functional relationships of these variables define the dependence of one
variable upon the other variable (s) in the specific form. The specific functional forms may be
linear, quadratic, logarithmic, exponential, hyperbolic, or any other form. Here we consider a
simple linear regression model, i.e. a relationship between two variables related in a linear form.
Such a relationship may be stochastic or non-stochastic, among which we shall be using the
former in econometric analysis.
Yi = α + βXi + ui ……………………………………………………….(2.2)
Thus a stochastic model is a model in which the dependent variable is not only determined by the
explanatory variable(s) included in the model but also by others which are not included in the
model.
2.2. Simple Linear Regression model.
The above stochastic relationship (2.2) with one explanatory variable is called simple linear
regression model.
The true relationship which connects the variables involved is split into two parts:
A part represented by a line and a part represented by the random term 'u'.
The scatter of observations represents the true relationship between Y and X. The line
represents the exact part of the relationship and the deviation of the observation from the line
represents the random component of the relationship. Were it not for the errors in the model, we
would observe all the points on the line Y1′, Y2′, ..., Yn′ corresponding to X1, X2, ..., Xn. However,
because of the random disturbance we observe
Yi = (α + βXi) + ui
where (α + βXi) is the part represented by the regression line and ui is the random variable.
The first component, in the bracket, is the part of Y explained by the changes in X and the second
is the part of Y not explained by X, that is to say, the change in Y is due to the random influence
of ui.
2.2.1 Assumptions of the Classical Linear Stochastic Regression Model.
The classical econometricians made important assumptions in their analysis of regression. The
most important of these assumptions are discussed below.
1. The model is linear in parameters.
The classical theorists assumed that the model should be linear in the parameters regardless of
whether the explanatory and the dependent variables are linear or not. This is because if the
parameters are non-linear they are difficult to estimate, since their values are not known and we
are only given data on the dependent and independent variables.
2. Ui is a random real variable
This means that the value which u may assume in any one period depends on chance; it may be
positive, negative or zero. Every value has a certain probability of being assumed by u in any
particular instance.
3. The mean value of the random variable(U) in any particular period is zero
This means that for each value of X, the random variable (u) may assume various values,
some greater than zero and some smaller than zero, but if we consider all the possible positive
and negative values of u for any given value of X, they would have an average value equal to
zero. In other words, the positive and negative values of u cancel each other.
Mathematically, E(Ui) = 0
4. The variance of the random variable(U) is constant in each period (The assumption
of homoscedasticity)
For all values of X, the u's will show the same dispersion around their mean. This is called the
homoscedasticity assumption, and the constant variance itself is called the homoscedastic
variance.
5. The random variable (U) has a normal distribution
This means the values of u (for each X) have a bell-shaped symmetrical distribution about their
zero mean and constant variance σ², i.e.
Ui ~ N(0, σ²)
6. The random terms of different observations are independent (the assumption of no
autocorrelation)
This means the value which the random term assumed in one period does not depend on the
value which it assumed in any other period.
7. The Xi are a set of fixed values in the hypothetical process of repeated sampling
which underlies the linear regression model.
This means that, in taking a large number of samples on Y and X, the Xi values are the same in all
samples, but the ui values do differ from sample to sample, and so of course do the values of Yi.
The parameters α and β represent their respective population values and are called the true
parameters, since they would be computed from the population values of Y and X. But it is
difficult to obtain the population values of Y and X for technical or economic reasons, so we are
forced to take sample values of Y and X. The parameters estimated from the sample values of Y
and X are called the estimators of the true parameters and are symbolized as α̂ and β̂.
The model Yi = α̂ + β̂Xi + ei is called the estimated relationship between Y and X, since α̂ and β̂
are estimated from a sample of Y and X, and ei represents the sample counterpart of the
population random disturbance Ui.
2.3 Methods of estimation
2.3.1 The ordinary least squares (OLS) method
Estimation of α and β by the least squares method (OLS) or classical least squares (CLS) involves
finding values for the estimates α̂ and β̂ which will minimize the sum of the squared
residuals (Σei²):
Σei² = Σ(Yi − α̂ − β̂Xi)² ……………………….(2.4)
To find the values of α̂ and β̂ that minimize this sum, we partially differentiate Σei² with respect
to α̂ and β̂ and set the partial derivatives equal to zero.
1. ∂(Σei²)/∂α̂ = −2Σ(Yi − α̂ − β̂Xi) = 0 .......... .......... .......... .....(2.5)
which gives α̂ = Ȳ − β̂X̄ .......... .......... .......... .......... ....(2.7)
2. ∂(Σei²)/∂β̂ = −2ΣXi(Yi − α̂ − β̂Xi) = 0 .......... .......... ..........(2.8)
Equations (2.5) and (2.8) are called the Normal Equations. Substituting (2.7) into (2.8) and
rearranging, it follows that:
ΣYiXi = ΣXi(Ȳ − β̂X̄) + β̂ΣXi²
ΣYiXi = ȲΣXi − β̂X̄ΣXi + β̂ΣXi²
ΣYiXi − ȲΣXi = β̂(ΣXi² − X̄ΣXi)
ΣXiYi − nX̄Ȳ = β̂(ΣXi² − nX̄²)
β̂ = (ΣXiYi − nX̄Ȳ) / (ΣXi² − nX̄²) ………………….(2.11)
Equation (2.11) can be rewritten in a somewhat different way; the final steps are:
β̂ = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)²
β̂ = Σxiyi / Σxi² ……………………………………… (2.12)
The expression in (2.12), in which lowercase letters denote deviations from the means
(xi = Xi − X̄, yi = Yi − Ȳ), is termed the formula in deviation form.
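To make the computation concrete, here is a minimal Python sketch (not part of the original module) that evaluates equations (2.7) and (2.12); the data values are purely illustrative:

import numpy as np

# Hypothetical sample data on X (e.g., income) and Y (e.g., consumption)
X = np.array([10, 12, 15, 18, 20, 24, 27, 30], dtype=float)
Y = np.array([ 8, 10, 11, 14, 15, 17, 20, 21], dtype=float)

# Deviation form: x = X - X_bar, y = Y - Y_bar
x = X - X.mean()
y = Y - Y.mean()

beta_hat = (x * y).sum() / (x ** 2).sum()    # equation (2.12)
alpha_hat = Y.mean() - beta_hat * X.mean()   # equation (2.7)
print(f"beta_hat = {beta_hat:.4f}, alpha_hat = {alpha_hat:.4f}")

# Cross-check against NumPy's built-in least-squares line fit
b, a = np.polyfit(X, Y, 1)
print(f"polyfit slope = {b:.4f}, intercept = {a:.4f}")  # matches the manual result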
Estimation of a function with zero intercept
Suppose it is desired to fit the line Yi = α + βXi + Ui, subject to the restriction α = 0. To
estimate β̂, the problem is put in the form of a restricted minimization problem and then the
Lagrange method is applied.
We minimize: Σei² = Σ(Yi − α̂ − β̂Xi)²
Subject to: α̂ = 0
The composite function then becomes
Z = Σ(Yi − α̂ − β̂Xi)² − λα̂, where λ is a Lagrange multiplier.
Setting the partial derivatives of Z to zero and imposing α̂ = 0 gives
ΣXi(Yi − β̂Xi) = 0
Solving for β̂:
β̂ = ΣXiYi / ΣXi² ……………………………………..(2.13)
This formula involves the actual values (observations) of the variables and not their deviation
forms, as in the case of the unrestricted estimate of β̂.
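A similar one-line computation gives the restricted (zero-intercept) estimator of equation (2.13); the data are again hypothetical:

import numpy as np

X = np.array([10, 12, 15, 18, 20], dtype=float)
Y = np.array([21, 25, 29, 37, 41], dtype=float)

# Restricted estimator: actual values, not deviations (equation 2.13)
beta_restricted = (X * Y).sum() / (X ** 2).sum()
print(f"beta_hat through the origin = {beta_restricted:.4f}")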
2.3.2 Maximum Likelihood Estimation
Consider two distributions, A and B. If the true population were B, then the probability that we
would have obtained the sample shown would be quite small. But if the true population were A,
then the probability that we would have drawn the sample would be substantially larger. We
therefore select population A as the one most likely to have yielded the observed data. We
define the maximum likelihood estimator of β as the value of β̂ which would most likely generate
the observed sample observations Y1, Y2, Y3, ..., Yn. If Yi is normally distributed and each of the
Y's is drawn independently, then maximum likelihood estimation maximizes P(Y1)·P(Y2)·...·P(Yn),
where each P represents a probability associated with the normal distribution.
P(Y1)·P(Y2)·...·P(Yn) is often referred to as the likelihood function. The likelihood function
depends not only on the sample values but also on the unknown parameters of the problem.
In describing the likelihood function we often think of the unknown parameters as varying while
the Y's (the dependent variables) are held fixed. This seems reasonable because finding the
maximum likelihood estimate involves a search over alternative parameter estimates which
would be most likely to generate the given sample. For this reason the likelihood function must
be interpreted differently from the joint probability distribution: in the latter case the Y's are
allowed to vary and the underlying parameters are fixed, while the reverse is true in the case of
maximum likelihood.
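As an illustration of this idea, the following sketch (simulated data; SciPy's general-purpose optimizer) maximizes the normal log-likelihood of the two-variable linear model numerically. Under the classical assumptions the resulting intercept and slope estimates coincide with OLS, which the last two lines confirm:

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, 50)
Y = 2.0 + 0.5 * X + rng.normal(0, 1.0, 50)  # true alpha = 2, beta = 0.5

def neg_log_likelihood(params):
    alpha, beta, log_sigma = params
    sigma = np.exp(log_sigma)               # keeps sigma positive
    resid = Y - alpha - beta * X
    ll = (-0.5 * len(Y) * np.log(2 * np.pi * sigma ** 2)
          - (resid ** 2).sum() / (2 * sigma ** 2))
    return -ll                              # minimize the negative log-likelihood

res = minimize(neg_log_likelihood, x0=[0.0, 0.0, 0.0])
print(f"ML: alpha = {res.x[0]:.4f}, beta = {res.x[1]:.4f}")
b, a = np.polyfit(X, Y, 1)                  # OLS line for comparison
print(f"OLS: alpha = {a:.4f}, beta = {b:.4f}")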
2.4 Properties of OLS Estimators and Gauss-Markov Theorem
The ideal or optimum properties that the OLS estimates possess may be summarized by well
known theorem known as the Gauss-Markov Theorem.
Statement of the theorem: “Given the assumptions of the classical linear regression model, the
OLS estimators, in the class of linear and unbiased estimators, have the minimum variance, i.e.
the OLS estimators are BLUE.”
According to the theorem, under the basic assumptions of the classical linear regression model,
the least squares estimators are linear, unbiased and have minimum variance (i.e. are best of all
linear unbiased estimators). Sometimes the theorem is referred to as the BLUE theorem, i.e. Best,
Linear, Unbiased Estimator. An estimator is called BLUE if it is:
a. Linear: a linear function of the random variable, such as, the dependent variable Y.
b. Unbiased: its average or expected value is equal to the true population parameter.
c. Minimum variance: It has a minimum variance in the class of linear and unbiased
estimators. An unbiased estimator with the least variance is known as an efficient
estimator.
According to the Gauss-Markov theorem, the OLS estimators possess all the BLUE properties.
The detailed proofs of these properties are presented below.
The variance of the random variable (Ui)
Dear student! You may observe that the variances of the OLS estimates involve σ², which is the
population variance of the random disturbance term. But it is difficult to obtain the population
data of the disturbance term for technical and economic reasons. Hence it is difficult to
compute σ²; this implies that the variances of the OLS estimates are also difficult to compute. But
we can compute these variances if we take the unbiased estimate of σ², which is σ̂² computed from
the sample values of the disturbance term ei from the expression:
σ̂u² = Σei² / (n − 2) …………………………………..(2.14)
Statistical test of Significance of the OLS Estimators
(First Order tests)
After the estimation of the parameters and the determination of the least square regression line,
we need to know how 'good' the fit of this line is to the sample observations of Y and X, that is to
say we need to measure the dispersion of observations around the regression line. This
knowledge is essential because the closer the observation to the line, the better the goodness of
fit, i.e. the better is the explanation of the variations of Y by the changes in the explanatory
variables.
We divide the available criteria into three groups: the theoretical a priori criteria, the statistical
criteria, and the econometric criteria. Under this section, our focus is on statistical criteria (first
order tests). The two most commonly used first order tests in econometric analysis are:
1) The coefficient of determination (the square of the correlation coefficient i.e. R2). This test is
used for judging the explanatory power of the independent variable(s).
2) The standard error tests of the estimators. This test is used for judging the statistical
reliability of the estimates of the regression coefficients.
R² shows the percentage of the total variation of the dependent variable that can be explained by
the changes in the explanatory variable(s) included in the model. To elaborate this, let us draw a
horizontal line corresponding to the mean value of the dependent variable, Ȳ (see the figure below).
By fitting the line Ŷ = β̂0 + β̂1X, we try to obtain the explanation of the variation of the
dependent variable Y produced by the changes of the explanatory variable X.
[Figure: the fitted line Ŷ = β̂0 + β̂1X with an observation Y; its deviation from the mean, Y − Ȳ,
is split into the explained part Ŷ − Ȳ and the residual e = Y − Ŷ.]
As can be seen from the figure above, Y − Ȳ measures the variation of the sample
observation values of the dependent variable around the mean.
However, the variation in Y that can be attributed to the influence of X (i.e. the regression
line) is given by the vertical distance Ŷ − Ȳ.
The part of the total variation in Y about Ȳ that can't be attributed to X is equal to
e = Y − Ŷ, which is referred to as the residual variation.
In summary:
We may write the observed Y as the sum of the predicted value (Ŷ) and the
residual term (ei):
Yi = Ŷi + ei (observed Yi = predicted Yi + residual)
From equation (2.34) we can write the above equation in deviation form: yi = ŷi + ei.
Squaring and summing both sides over the sample:
Σy² = Σ(ŷ + e)²
Σy² = Σŷ² + Σei² + 2Σŷe
But Σŷe = 0 ………………………………………………(2.46)
Therefore,
Σyi² = Σŷ² + Σei² ………………………………...(2.47)
(Total variation = Explained variation + Unexplained variation)
i.e. TSS = ESS + RSS, so that
ESS/TSS = Σŷ² / Σy² ……………………………………….(2.49)
From equation (2.37) we have ŷ = β̂x. Squaring and summing both sides gives us
Σŷ² = β̂²Σx² ……………………………………………(2.50)
ESS/TSS = β̂²Σx² / Σy² …………………………………(2.51)
Since β̂ = Σxiyi / Σxi², substituting into (2.51) gives
ESS/TSS = (Σxy)² / (Σx²·Σy²) ………………………………………(2.52)
Comparing (2.52) with the formula of the correlation coefficient,
r = Σxy / √(Σx²·Σy²),
we see that ESS/TSS = r².
The limits of R²: The value of R² falls between zero and one, i.e. 0 ≤ R² ≤ 1.
Interpretation of R²
Suppose R² = 0.9; this means that the regression line gives a good fit to the observed data, since
this line explains 90% of the total variation of the Y values around their mean. The remaining
10% of the total variation in Y is unaccounted for by the regression line and is attributed to the
factors included in the disturbance variable ui.
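The decomposition and R² can be verified numerically; the sketch below (hypothetical data, continuing the simple-regression example) computes TSS, ESS and RSS and checks equation (2.47):

import numpy as np

X = np.array([10, 12, 15, 18, 20, 24, 27, 30], dtype=float)
Y = np.array([ 8, 10, 11, 14, 15, 17, 20, 21], dtype=float)

x, y = X - X.mean(), Y - Y.mean()
beta_hat = (x * y).sum() / (x ** 2).sum()
alpha_hat = Y.mean() - beta_hat * X.mean()
Y_hat = alpha_hat + beta_hat * X
e = Y - Y_hat

TSS = (y ** 2).sum()                     # total variation
ESS = ((Y_hat - Y.mean()) ** 2).sum()    # explained variation
RSS = (e ** 2).sum()                     # unexplained variation
print(f"TSS = {TSS:.3f}, ESS + RSS = {ESS + RSS:.3f}")   # equal, per (2.47)
print(f"R^2 = {ESS / TSS:.4f}")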
2.6 Testing the Significance of OLS Parameters
To test the significance of the OLS parameter estimates, three procedures are commonly used:
i) Standard error test ii) Student's t-test iii) Confidence interval
All of these testing procedures lead to the same conclusion. Let us now see these testing
methods one by one.
i) Standard error test
This test helps us decide whether the estimates α̂ and β̂ are significantly different from zero,
i.e. whether the sample from which they have been estimated might have come from a
population whose true parameters are zero (α = 0 and/or β = 0).
Formally, we test the null hypothesis
H0: βi = 0 against the alternative hypothesis H1: βi ≠ 0
First: compute the standard errors of the estimates:
SE(β̂) = √var(β̂)
SE(α̂) = √var(α̂)
Second: compare the standard errors with the numerical values of α̂ and β̂.
Decision rule:
If SE(β̂i) > ½|β̂i|, accept the null hypothesis and reject the alternative hypothesis. We
conclude that β̂i is statistically insignificant.
If SE(β̂i) < ½|β̂i|, reject the null hypothesis and accept the alternative hypothesis. We
conclude that β̂i is statistically significant.
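In code, the decision rule amounts to a single comparison; the numbers below are hypothetical:

# Hypothetical estimate and its standard error
beta_hat, se_beta = 0.93, 0.12

if se_beta < 0.5 * abs(beta_hat):
    print("Reject H0: beta_hat is statistically significant")
else:
    print("Accept H0: beta_hat is statistically insignificant")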
ii) Student’s t-test
Like the standard error test, this test is also important to test the significance of the parameters.
From your statistics course, recall that any (normally distributed) estimate can be transformed
into a t-ratio by subtracting its hypothesized mean and dividing by its standard error.
To undertake the above test we follow these steps.
Step 1: Compute t*, the computed value of t, by taking the value of β in the null
hypothesis. In our case β = 0, so t* becomes:
t* = (β̂ − 0) / SE(β̂) = β̂ / SE(β̂)
Step 2: Choose a level of significance. The level of significance is the probability of making a
'wrong' decision, i.e. the probability of rejecting the hypothesis when it is actually true, or the
probability of committing a type I error. It is customary in econometric research to choose the
5% or the 1% level of significance. This means that in making our decision we allow (tolerate)
five times out of a hundred to be 'wrong', i.e. to reject the hypothesis when it is actually true.
Step 3: Check whether it is a one-tail or a two-tail test. If the inequality sign in the
alternative hypothesis is ≠, it implies a two-tail test: divide the chosen level of
significance by two to decide the critical region or critical value of t, called tc. But if the
inequality sign is either > or <, it indicates a one-tail test and there is no need to divide the
chosen level of significance by two to obtain the critical value from the t-table.
Step 4: Obtain the critical value of t, called tc, at α/2 and n − 2 degrees of freedom for a
two-tail test.
Step 5: Compare t* with tc: if |t*| > tc, reject H0 and conclude that the estimate is statistically
significant; otherwise, accept H0.
iii) Confidence interval
Rather than testing a point estimate directly, we may choose a probability in advance and refer to
it as the 'level of confidence'. In this respect we say that with a given probability the population
parameter will be within the defined confidence interval (confidence limits).
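Putting the steps together, here is a sketch of the t-test on β̂ (hypothetical data; SciPy supplies the critical value, and var(β̂) = σ̂²/Σx² is the standard simple-regression variance):

import numpy as np
from scipy import stats

X = np.array([10, 12, 15, 18, 20, 24, 27, 30], dtype=float)
Y = np.array([ 8, 10, 11, 14, 15, 17, 20, 21], dtype=float)
n = len(Y)

x, y = X - X.mean(), Y - Y.mean()
beta_hat = (x * y).sum() / (x ** 2).sum()
alpha_hat = Y.mean() - beta_hat * X.mean()
e = Y - alpha_hat - beta_hat * X

sigma2_hat = (e ** 2).sum() / (n - 2)            # equation (2.14)
se_beta = np.sqrt(sigma2_hat / (x ** 2).sum())   # standard error of beta_hat

t_star = beta_hat / se_beta                      # Step 1 (H0: beta = 0)
t_crit = stats.t.ppf(1 - 0.05 / 2, n - 2)        # Steps 2-4: 5%, two-tail, n-2 df
print(f"t* = {t_star:.3f}, t_c = {t_crit:.3f}")
print("Reject H0" if abs(t_star) > t_crit else "Accept H0")   # Step 5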
CHAPTER THREE
THE CLASSICAL REGRESSION ANALYSIS: Multiple Linear Regression
Model
3.1 Introduction
In simple regression we study the relationship between a dependent variable and a single
explanatory (independent) variable. But it is rarely the case that economic relationships involve
just two variables. Rather, a dependent variable Y can depend on a whole series of explanatory
variables or regressors. For instance, in demand studies we study the relationship between the
quantity demanded of a good and the price of the good, the prices of substitute goods and the
consumer's income. The model we assume is:
Yi = β0 + β1P1 + β2P2 + β3Xi + ui -------------------- (3.1)
where P1 is the price of the good, P2 is the price of substitute goods, Xi is the
consumer's income, the β's are unknown parameters and ui is the disturbance.
Equation (3.1) is a multiple regression with three explanatory variables. In general for K-
explanatory variable we can write the model as follows:
Yi = β0 + β1X1i + β2X2i + β3X3i + ......... + βkXki + ui ------- (3.2)
where βj (j = 0, 1, 2, ..., k) are unknown parameters and ui is the disturbance term. The disturbance
term is of similar nature to that in simple regression, reflecting:
- the basic random nature of human responses
- errors of aggregation
- errors of measurement
- errors in specification of the mathematical form of the model and any other (minor)
factors, other than the Xi's, that might influence Y.
We first state the assumptions of the multiple regression model, then proceed with our analysis
for the case of two explanatory variables, and finally generalize the multiple regression model to
the case of k explanatory variables.
3.2 Assumptions of Multiple Regression Model
In order to specify our multiple linear regression model and proceed with our analysis with
regard to this model, some assumptions are compulsory. These assumptions are the same as in
the single explanatory variable model developed earlier, except for the assumption of no perfect
multicollinearity. These assumptions are:
1. Randomness of the error term: The variable u is a real random variable.
2. Zero mean of the error term: E(ui) = 0
3. Homoscedasticity: The variance of each ui is the same for all the xi values,
i.e. E(ui²) = σu² (constant).
4. Normality of the error term: the values of each ui are normally distributed,
i.e. Ui ~ N(0, σ²).
5. No autocorrelation: the error terms of different observations are independent,
i.e. E(uiuj) = 0 for i ≠ j.
6. Independence of ui and the explanatory variables: i.e. E(uiX1i) = E(uiX2i) = 0.
This condition is automatically fulfilled if we assume that the values of the X's are a set
of fixed numbers in all (hypothetical) samples.
7. No perfect multicollinearity: The explanatory variables are not perfectly linearly
correlated.
We can't exhaustively list all the assumptions, but the above assumptions are some of the basic
assumptions in multiple regression analysis.
3.3 Partial-correlation coefficients
In order to remove the influence of X2 on Y, we regress Y on X2 and find the residual e1 = Y*.
To remove the influence of X2 on X1, we regress X1 on X2 and find the residual e2 = X1*. Y*
and X1* then represent the variations in Y and X1, respectively, left unexplained after removing
the influence of X2 from both Y and X1. Therefore, the partial correlation coefficient is merely
the simple correlation coefficient between the residuals Y* and X1* (that is, rYX1.X2 = rY*X1*).
Partial correlation coefficients range in value from −1 to +1 (just as in the case of simple
correlation coefficients).
For example, rYX1.X2 = −1 refers to the case where there is an exact or perfect negative
linear relationship between Y and X1 after removing the common influence of X2 from
both Y and X1.
However, rYX1.X2 = 1 indicates a perfect positive linear net relationship between Y and
X1.
And rYX1.X2 = 0 indicates no linear relationship between Y and X1 when the common
influence of X2 has been removed from both Y and X1. As a result, X1 can be omitted
from the regression.
The sign of a partial correlation coefficient is the same as that of the corresponding estimated
parameter. For example, for the estimated regression equation Ŷ = b0 + b1X1 + b2X2, rYX1.X2
has the same sign as b1 and rYX2.X1 has the same sign as b2. Partial correlation coefficients are
used in multiple regression analysis to determine the relative importance of each explanatory
variable in the model.
The independent variable with the highest partial correlation coefficient with respect to the
dependent variable contributes most to the explanatory power of the model and is entered first in
a stepwise multiple regression analysis. It should be noted, however, that partial correlation
coefficients give an ordinal, not a cardinal, measure of net correlation, and the sum of the partial
correlation coefficients between the dependent and all the independent variables in the model
need not add up to 1.
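The residual construction described above is easy to reproduce; the sketch below (simulated data) computes rYX1.X2 as the simple correlation between the two residual series:

import numpy as np

rng = np.random.default_rng(1)
X2 = rng.normal(0, 1, 100)
X1 = 0.6 * X2 + rng.normal(0, 1, 100)
Y = 1.0 + 2.0 * X1 + 1.5 * X2 + rng.normal(0, 1, 100)

def residuals(v, w):
    # Residuals from the simple regression of v on w (with intercept)
    b, a = np.polyfit(w, v, 1)
    return v - (a + b * w)

y_star = residuals(Y, X2)     # Y purged of the influence of X2
x1_star = residuals(X1, X2)   # X1 purged of the influence of X2

r_partial = np.corrcoef(y_star, x1_star)[0, 1]
print(f"r_YX1.X2 = {r_partial:.4f}")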
3.4 A Model with Two Explanatory Variables
In order to understand the nature of multiple regression model easily, we start our analysis with
the case of two explanatory variables, then extend this to the case of k-explanatory variables.
3.4.1 Estimation of parameters of two-explanatory variables model
The model: Y = β0 + β1X1 + β2X2 + Ui ……………………………………(3.3)
is a multiple regression with two explanatory variables. The expected value of the above model is
called the population regression equation, i.e.
E(Y) = β0 + β1X1 + β2X2, since E(Ui) = 0. …………………................(3.4)
where the βi are the population parameters; β0 is referred to as the intercept and β1 and β2 are
sometimes known as the regression slopes. Note that β2, for example, measures
the effect on E(Y) of a unit change in X2 when X1 is held constant. Since the population
regression equation is unknown to any investigator, it has to be estimated from sample data. Let
us suppose that the sample data has been used to estimate the population regression equation.
We leave the method of estimation unspecified for the present and merely assume that equation
(3.4) has been estimated by the sample regression equation, which we write as:
Ŷ = β̂0 + β̂1X1 + β̂2X2 ……………………………………………….(3.5)
Applying OLS as in the simple model, with lowercase letters again denoting deviations from the
means, the estimators are:
β̂1 = (Σx1y·Σx2² − Σx2y·Σx1x2) / (Σx1²·Σx2² − (Σx1x2)²)
β̂2 = (Σx2y·Σx1² − Σx1y·Σx1x2) / (Σx1²·Σx2² − (Σx1x2)²) ………………….……………………… (3.22)
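A direct evaluation of the formulas in (3.22) on simulated data looks as follows; the manual estimates can be checked against any regression package:

import numpy as np

rng = np.random.default_rng(2)
X1 = rng.uniform(0, 10, 60)
X2 = rng.uniform(0, 5, 60)
Y = 4.0 + 1.2 * X1 - 0.8 * X2 + rng.normal(0, 1, 60)

x1, x2, y = X1 - X1.mean(), X2 - X2.mean(), Y - Y.mean()
s11, s22, s12 = (x1 ** 2).sum(), (x2 ** 2).sum(), (x1 * x2).sum()
s1y, s2y = (x1 * y).sum(), (x2 * y).sum()

denom = s11 * s22 - s12 ** 2        # common denominator in (3.22)
beta1_hat = (s1y * s22 - s2y * s12) / denom
beta2_hat = (s2y * s11 - s1y * s12) / denom
beta0_hat = Y.mean() - beta1_hat * X1.mean() - beta2_hat * X2.mean()
print(f"b0 = {beta0_hat:.3f}, b1 = {beta1_hat:.3f}, b2 = {beta2_hat:.3f}")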
3.4.2 The coefficient of determination (R²): two explanatory variables case
As in simple regression, R² measures the proportion of the total variation in Y explained by the
model. If R² is high, there is a close association between the values of Yt and the values predicted
by the model, Ŷt. In this case, the model is said to "fit" the data well. If R² is low, there is no
association between the values of Yt and the values predicted by the model, Ŷt, and the model
does not fit the data well.
3.4.3 Adjusted Coefficient of Determination (R̄²)
One difficulty with R² is that it can be made large by adding more and more variables, even if
the variables added have no economic justification. Algebraically, it is a fact that as
variables are added the sum of squared errors (RSS) goes down (it can remain unchanged, but
this is rare) and thus R² goes up. If the model contains n − 1 variables, then R² = 1. The
manipulation of the model just to obtain a high R² is not wise. An alternative measure of goodness
of fit, called the adjusted R² and often symbolized as R̄², is usually reported by regression
programs. It is computed as:
R̄² = 1 − (Σei²/(n − k)) / (Σy²/(n − 1)) = 1 − (1 − R²)·(n − 1)/(n − k) --------------------------------(3.28)
This measure does not always go up when a variable is added, because of the degrees-of-freedom
term (n − k): as the number of variables k increases, RSS goes down, but so does n − k. The net
effect on R̄² depends on the amount by which RSS falls relative to the loss of degrees of freedom.
While solving one problem, this corrected measure of goodness of fit unfortunately introduces
another one: it loses its interpretation, since R̄² is no longer the percent of variation explained.
This modified R² is sometimes used and misused as a device for selecting the appropriate set of
explanatory variables.
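A short sketch of equation (3.28), refitting the two-variable model from the previous example (here k counts all estimated parameters, including the intercept):

import numpy as np

rng = np.random.default_rng(2)
X1 = rng.uniform(0, 10, 60)
X2 = rng.uniform(0, 5, 60)
Y = 4.0 + 1.2 * X1 - 0.8 * X2 + rng.normal(0, 1, 60)

Xmat = np.column_stack([np.ones_like(X1), X1, X2])   # design matrix with intercept
beta_hat, *_ = np.linalg.lstsq(Xmat, Y, rcond=None)
e = Y - Xmat @ beta_hat

n, k = len(Y), Xmat.shape[1]
R2 = 1 - (e ** 2).sum() / ((Y - Y.mean()) ** 2).sum()
R2_adj = 1 - (1 - R2) * (n - 1) / (n - k)            # equation (3.28)
print(f"R^2 = {R2:.4f}, adjusted R^2 = {R2_adj:.4f}")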
3.5 Statistical Properties of the Parameters and Gauss-Markov Theorem
We have seen in simple linear regression that the OLS estimators (α̂ and β̂) satisfy the small
sample properties of an estimator, i.e. the BLUE property. In multiple regression, the OLS
estimators also satisfy the BLUE property. We now proceed to examine these desired properties:
1. Linearity
We know that: β̂ = (X′X)⁻¹X′Y
Let C = (X′X)⁻¹X′; then
β̂ = CY …………………………………………….(3.33)
i.e. β̂ is a linear function of Y.
2. Unbiasedness
β̂ = (X′X)⁻¹X′Y
β̂ = (X′X)⁻¹X′(Xβ + U) = β + (X′X)⁻¹X′U
so that E(β̂) = β, since E(U) = 0 and the X's are fixed.
3. Minimum variance
Before showing that all the OLS estimators are best (possess the minimum variance property), it
is important to derive their variances.
3.6. Hypothesis Testing in Multiple Regression Model
In multiple regression models we will undertake two tests of significance. One is significance of
individual parameters of the model. This test of significance is the same as the tests discussed in
simple regression model. The second test is overall significance of the model.
3.6.1. Tests of individual significance
If we invoke the assumption that Ui ~ N(0, σ²), then we can use either the t-test or the standard
error test to test a hypothesis about any individual partial regression coefficient. To illustrate
consider the following example.
Let Y = β̂0 + β̂1X1 + β̂2X2 + ei ………………………………… (3.51)
A. H0: β1 = 0
H1: β1 ≠ 0
B. H0: β2 = 0
H1: β2 ≠ 0
The null hypothesis (A) states that, holding X2 constant, X1 has no (linear) influence on Y.
Similarly, hypothesis (B) states that, holding X1 constant, X2 has no influence on the dependent
variable Yi. To test these null hypotheses we will use the following tests:
i- Standard error test: under this and the following testing methods we test only for
β̂1; the test for β̂2 is done in the same way.
SE(β̂1) = √var(β̂1) = √( σ̂²·Σx2i² / (Σx1i²·Σx2i² − (Σx1ix2i)²) ); where σ̂² = Σei² / (n − 3)
If SE(β̂1) > ½|β̂1|, we accept the null hypothesis, that is, we conclude that the
estimate β̂1 is not statistically significant.
If SE(β̂1) < ½|β̂1|, we reject the null hypothesis, that is, we conclude that the
estimate β̂1 is statistically significant.
ii- The student's t-test: we compute
t* = β̂i / SE(β̂i) ~ t(n − k), where n is the number of observations and k is the number of
parameters. If we have 3 parameters, the degrees of freedom will be n − 3. So,
t* = (β̂2 − β2) / SE(β̂2), with n − 3 degrees of freedom;
under H0: β2 = 0 this becomes
t* = β̂2 / SE(β̂2)
If t* < t (tabulated), we accept the null hypothesis, i.e. we conclude that β̂2 is not
significant and hence the regressor does not appear to contribute to the explanation of
the variations in Y.
If t* > t (tabulated), we reject the null hypothesis and accept the alternative one:
β̂2 is statistically significant. Thus, the greater the value of t*, the stronger the
evidence that β̂i is statistically significant.
3.6.2 Test of Overall Significance
Throughout the previous section we were concerned with testing the significance of the
estimated partial regression coefficients individually, i.e. under the separate hypothesis that each
of the true population partial regression coefficients was zero.
In this section we extend this idea to joint test of the relevance of all the included explanatory
variables. Now consider the following:
Y = β0 + β1X1 + β2X2 + ......... + βkXk + Ui
H0: β1 = β2 = β3 = .......... = βk = 0
This null hypothesis is a joint hypothesis that β1, β2, ..., βk are jointly or simultaneously equal
to zero. A test of such a hypothesis is called a test of the overall significance of the observed or
estimated regression line, that is, of whether Y is linearly related to X1, X2, ..., Xk.
Can the joint hypothesis be tested by testing the significance of the β̂i's individually, as
above? The answer is no, and the reasoning is as follows.
In testing the individual significance of an observed partial regression coefficient, we assumed
implicitly that each test of significance was based on a different (i.e. independent) sample. Thus,
in testing the significance of β̂2 under the hypothesis that β2 = 0, it was assumed tacitly that
the testing was based on a different sample from the one used in testing the significance of
β̂3 under the null hypothesis that β3 = 0. But in testing the joint hypothesis above, we would
be violating the assumption underlying the test procedure. Testing a series of single (individual)
hypotheses is not equivalent to testing those same hypotheses jointly. The intuitive reason for
this is that in a joint test of several hypotheses any single hypothesis is affected by the
information in the other hypotheses.
The test procedure for any set of hypothesis can be based on a comparison of the sum of squared
errors from the original, the unrestricted multiple regression model to the sum of squared errors
from a regression model in which the null hypothesis is assumed to be true. When a null
hypothesis is assumed to be true, we in effect place conditions or constraints, on the values that
the parameters can take, and the sum of squared errors increases. The idea of the test is that if
these sums of squared errors are substantially different, then the assumption that the joint null
hypothesis is true has significantly reduced the ability of the model to fit the data, and the data do
not support the null hypothesis.
If the null hypothesis is true, we expect that the data are compatible with the conditions placed
on the parameters. Thus, there would be little change in the sum of squared errors when the null
hypothesis is assumed to be true.
Let the Restricted Residual Sum of Squares (RRSS) be the sum of squared errors in the model
obtained by assuming that the null hypothesis is true, and let URSS be the sum of squared errors
of the original unrestricted model, i.e. the unrestricted residual sum of squares (URSS). It is
always true that RRSS − URSS ≥ 0.
Consider the unrestricted model Yi = β̂0 + β̂1X1 + β̂2X2 + ......... + β̂kXk + ei.
The test of the joint hypothesis is:
H0: β1 = β2 = β3 = .......... = βk = 0
From the unrestricted model,
ei = Yi − Ŷi
Σei² = Σ(Yi − Ŷi)²
This sum of squared errors is called the unrestricted residual sum of squares (URSS). This is the
case when the null hypothesis is not true. If the null hypothesis is assumed to be true, i.e. when
all the slope coefficients are zero, the model reduces to
Y = β̂0 + ei
β̂0 = ΣYi / n = Ȳ (applying OLS) …………………………….(3.52)
e = Y − β̂0, but β̂0 = Ȳ
e = Y − Ȳ
so that RRSS = Σ(Y − Ȳ)² = TSS. The test statistic is then
F = [(TSS − RSS)/(k − 1)] / [RSS/(n − k)]
which is equivalent to
F = [ESS/(k − 1)] / [RSS/(n − k)] ………………………………………………. (3.54)
If we divide the above numerator and denominator by Σy² = TSS, then:
F = [(ESS/TSS)/(k − 1)] / [(RSS/TSS)/(n − k)]
F = [R²/(k − 1)] / [(1 − R²)/(n − k)] …………………………………………..(3.55)
This implies that the computed value of F can be calculated either from ESS and RSS or from R²
and 1 − R². If the null hypothesis is not true, then the difference between RRSS and URSS (TSS
and RSS) becomes large, implying that the constraints placed on the model by the null hypothesis
have a large effect on the ability of the model to fit the data, and the value of F tends to be large.
If the computed value of F is greater than the critical value F(k − 1, n − k), then the parameters of
the model are jointly significant, i.e. the dependent variable Y is linearly related to the independent
variables included in the model.
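A sketch of the overall F-test based on equation (3.55); the R², n and k values are hypothetical, and SciPy supplies the critical value:

from scipy import stats

R2, n, k = 0.85, 30, 3                           # hypothetical regression summary

F = (R2 / (k - 1)) / ((1 - R2) / (n - k))        # equation (3.55)
F_crit = stats.f.ppf(0.95, k - 1, n - k)         # 5% critical value
print(f"F = {F:.2f}, F_crit = {F_crit:.2f}")
print("Jointly significant" if F > F_crit else "Not jointly significant")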
3.7 Predictions using Multiple Linear Regression
Suppose that you estimated the demand for meat (q1), which depends on the price of meat (p1)
and the income of the consumer (y). If the critical value is F0.05, 2, 27 = 3.35, examine whether
the result is valid or not.
[Regression output table from a statistical package, omitted here: an ANOVA panel reporting SS,
df and MS for Model, Residual and Total, and a coefficient panel reporting coef., std. err., t,
P>|t| and the 95% confidence interval for p1, y and _cons.]
Note: SS = sum of squares, df = degrees of freedom, MS = mean sum of squares, q1 = quantity
demanded of meat, p1 = price of meat, y = income of the consumer. "_cons" refers to the constant
term, "coef." refers to the estimated coefficients, "std. err." refers to the standard error, t is the
t-statistic, P>|t| refers to the probability of rejecting H0, and 95% is the confidence interval for
the population parameter.
The interpretation of the results in the above table:
Model refers to the variation in the dependent variable due to the independent variables. In this
table, SS refers to the sum of squares, and the value 105297.059 is the explained sum of squares
(SSE); MS refers to the mean sum of squares (MSE):
MSE = SSE/k = 105297.059/2 = 52648.5295, where k = number of independent variables.
In addition, Residual is the estimated error. It is the source of variation in the dependent variable
which is due to the error. Under the column SS, the value 51734.3635 is the residual sum of
squares (SSR), and the value under column MS, 1916.08754, is the mean residual sum of squares,
which is computed as follows:
MSR = SSR/(n − k − 1) = 51734.3635/27 = 1916.08754, where n = the number of observations,
k = the number of independent variables, and n − k − 1 determines the degrees of freedom in the
residual sum of squares.
Moreover, in the above result, Total refers to the total variation in the dependent variable, which
is decomposed into SSE and SSR. Therefore,
SST = SSE + SSR
MST = SST/(n − 1)
df(total) = k (df for SSE) + (n − k − 1) (df for SSR) = n − 1 = 30 − 1 = 29
The first panel of the above table reports the results used to examine the overall
significance of the regression coefficients (the F-test).
To conduct the test of overall significance of the coefficients:
1. State the hypothesis:
H0: β1 = β2 = 0
H1: H0 is not true, or at least one coefficient is not zero
38
2. Compute the F-statistic: F = MSE/MSR,
where df1 = degrees of freedom for SSE and df2 = degrees of freedom for SSR.
3. Choose the level of significance (here 5%).
4. Find the critical value Fα(df1, df2) from the F-table.
5. Decision rule:
Reject H0 if F > Fα(df1, df2) and conclude that the overall coefficients are significant, meaning
that at least one coefficient is not zero.
In our case, F = 27.48 > 3.35, so we reject the null hypothesis.
The lower panel of the above table presents the estimated result of the demand for meat. So, how
do we examine whether p1 and y affect q1? We need to conduct hypothesis tests about the
significance of each of the coefficients of p1 and y.
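The ANOVA figures quoted above can be reproduced in a few lines from the reported sums of squares (SSE, SSR), n = 30 and k = 2:

# Values reported in the module's table
SSE, SSR, n, k = 105297.059, 51734.3635, 30, 2

MSE = SSE / k              # mean explained sum of squares
MSR = SSR / (n - k - 1)    # mean residual sum of squares
F = MSE / MSR
print(f"MSE = {MSE:.4f}, MSR = {MSR:.4f}, F = {F:.2f}")   # F is about 27.48
print("Reject H0" if F > 3.35 else "Accept H0")           # F_0.05,2,27 = 3.35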
CHAPTER FOUR:
VIOLATIONS OF CLASSICAL ASSUMPTIONS
Recall that in the classical model we have assumed
a) Zero mean of the random term
b) Constant variance of the error term (i.e., the assumption of homoscedasticity)
c) No autocorrelation of the error term
d) Normality of the error term
e) No multicollinearity among the explanatory variables.
4.1 The Assumption of Zero Expected Disturbances
This assumption is imposed by the stochastic nature of economic relationships, which would
otherwise be impossible to estimate with the common rules of mathematics. The assumption
implies that the observations of Y and X must be scattered around the line in a random way (and
hence the estimated line Ŷ = β̂0 + β̂1X is a good approximation of the true line). This defines the
relationship connecting Y and X 'on the average'. The alternative possible assumptions are either
E(U) > 0 or E(U) < 0. Assume that for some reason the U's did not have an average value of zero,
but most of them tended to be positive. This would imply that the observations of Y and X would
lie above the true line.
It can be shown that by using these observations we would get a bad estimate of the true line. If
the true line lies below or above the observations, the estimated line would be biased.
Note that there is no test for the verification of this assumption, because the assumption
E(U) = 0 is forced upon us if we are to establish the true relationship; i.e., we set E(U) = 0 at
the outset of our estimation procedure. Its plausibility should be examined in each particular case
on a priori grounds. In any econometric application we must make sure that the following conditions
are fulfilled so as to be safe from violating the assumption E(U) = 0:
i) All the important variables have been included in the function.
ii) There are no systematically positive or systematically negative errors of measurement in the
dependent variable.
4.2 The Nature of Heteroscedasticity
The assumption of homoscedasticity (or constant variance) about the random variable U is that its
probability distribution remains the same over all observations of X, and in particular that the
variance of each $U_i$ is the same for all values of the explanatory variable. Symbolically,

$Var(U_i) = E[U_i - E(U_i)]^2 = E(U_i^2) = \sigma_u^2$, a constant.

If the above is not satisfied in any particular case, we say that the U's are heteroscedastic:
$Var(U_i) = \sigma_{u_i}^2$, not constant. The meaning of homoscedasticity is that the variation of
each $U_i$ around its zero mean does not depend on the value of X, that is,
$\sigma_{u_i}^2 \neq f(X_i)$. If $\sigma_u^2$ is not constant but its value depends on X, we may
write $\sigma_{u_i}^2 = f(X_i)$. Heteroscedasticity can take various forms; for example, the
variance of $U_i$ may decrease as X increases.
Furthermore, suppose we have a cross-section sample of family budgets from which we want to
measure the savings function, Saving = f(Income). In this case the assumption of constant variance
of the U's is not appropriate, because high-income families show a much greater variability in
their saving behavior than low-income families do. Families with high income tend to stick to a
certain standard of living, and when their income falls they cut down their savings rather than
their consumption expenditure. This is not the case in low-income families. Hence, the variance of
the $U_i$'s increases as income increases.
Note, however, that heteroscedasticity is primarily a problem of cross-sectional data rather than
time-series data. That is, the problem is more serious in cross-section data.
Causes of Heteroscedasticity
Heteroscedasticity can arise for several reasons. The first is the presence of
outliers (i.e., extreme values compared to the majority of a variable). The inclusion or exclusion
of such an observation, especially if the sample size is small, can substantially alter the results of
regression analysis. With outliers it would be hard to maintain the assumption of
homoscedasticity.
Another source of heteroscedasticity arises from violating the assumption that the regression
model is correctly specified. Very often what looks like heteroscedasticity may be due to the fact
that some important variables are omitted from the model. In such situation the residuals
obtained from the regression may give the distinct impression that the error variance may not be
constant. But if the omitted variables are included in the model, the impression may disappear.
Consequences of Heteroscedasticity
If the assumption of homoscedastic disturbance is not fulfilled we have the following
consequences:
i) If U is heteroscedastic, the OLS estimates do not have the minimum variance property in
the class of unbiased estimators; that is, they are inefficient in small samples, and they
remain inefficient in large samples.
ii) The coefficient estimates would still be statistically unbiased; that is, the expected value of
each estimate equals the true parameter value.
iii) The prediction (of Y for a given value of X) would be inefficient because of its high variance.
This is because the variance of the prediction includes the variances of U and of the
parameter estimates, which are not minimal due to the incidence of heteroscedasticity.
In any case, how does one detect whether the problem really exists? One common device is a formal
test such as the Breusch–Pagan test, sketched below.
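The Breusch–Pagan test regresses a function of the squared OLS residuals on the regressors and rejects homoscedasticity when they explain too much of the residual variation. A minimal sketch in Python with hypothetical saving–income data (the variable names and data-generating process are assumptions for illustration only):

```python
# Illustrative Breusch-Pagan test for heteroscedasticity (hypothetical data).
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(0)
income = rng.uniform(1, 10, 200)
saving = 0.5 + 0.2 * income + rng.normal(0, 0.3 * income)  # error spread grows with income

X = sm.add_constant(income)
resid = sm.OLS(saving, X).fit().resid

lm_stat, lm_pval, f_stat, f_pval = het_breuschpagan(resid, X)
print(f"LM p-value = {lm_pval:.4f}")  # a small p-value rejects homoscedasticity
```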
Remedial measures for the problems of heteroscedasticity
As we have seen, heteroscedasticity does not destroy the unbiasedness and consistency properties
of the OLS estimators, but they are no longer efficient.
This lack of efficiency makes the usual hypothesis-testing procedure of dubious value.
If we apply OLS to a heteroscedastic model, the result will be inefficient parameter estimates,
since $var(u_i)$ is not constant.
The remedial measure is to transform the model so that the transformed model satisfies all the
assumptions of the classical regression model, including homoscedasticity.
Applying OLS to the transformed variables is known as the method of Generalized Least
Squares (GLS).
In short GLS is OLS on the transformed variables that satisfy the standard least squares
assumptions.
The estimators thus obtained are known as GLS estimators, and it is these estimators that
are BLUE.
Assume that our original model is

$Y_i = \beta_0 + \beta_1 X_i + U_i$,

where $u_i$ satisfies all the classical assumptions except homoscedasticity:

$E(u_i)^2 = \sigma_i^2 = f(k_i)$.

If we apply OLS to the above model, the estimators are no longer BLUE. Let us consider
heteroscedastic structures under two conditions: when the population variance $\sigma_i^2$ is
known and when $\sigma_i^2$ is not known.
When $\sigma_i^2$ is known, the transformation is obtained by dividing the above model through by
$\sigma_i$, so that the variance of the transformed error term is constant:

$\frac{Y_i}{\sigma_i} = \beta_0 \frac{1}{\sigma_i} + \beta_1 \frac{X_i}{\sigma_i} + \frac{U_i}{\sigma_i}$   (3.19)

The variance of the transformed error term is constant, i.e.

$var\left(\frac{u_i}{\sigma_i}\right) = \frac{1}{\sigma_i^2} E(u_i)^2 = \frac{1}{\sigma_i^2} \sigma_i^2 = 1$, a constant.
Letting $w_i = \frac{1}{\sigma_i^2}$, the method of GLS (WLS) minimizes the weighted residual sum
of squares:

$\sum w_i \hat{u}_i^2 = \sum w_i (Y_i - \hat{\alpha} - \hat{\beta} X_i)^2$

1. $\frac{\partial \sum w_i \hat{u}_i^2}{\partial \hat{\alpha}} = -2 \sum w_i (Y_i - \hat{\alpha} - \hat{\beta} X_i) = 0$

$\Rightarrow \sum w_i Y_i = \hat{\alpha} \sum w_i + \hat{\beta} \sum w_i X_i$

$\Rightarrow \hat{\alpha} = Y^* - \hat{\beta} X^*$, with $Y^* = \frac{\sum w_i Y_i}{\sum w_i}$ and $X^* = \frac{\sum w_i X_i}{\sum w_i}$,

where $Y^*$ and $X^*$ are the weighted means, which are different from the ordinary means we
discussed in 2.1 and 2.2.

2. $\frac{\partial \sum w_i \hat{u}_i^2}{\partial \hat{\beta}} = -2 \sum w_i (Y_i - \hat{\alpha} - \hat{\beta} X_i) X_i = 0$

$\Rightarrow \sum w_i (Y_i X_i - \hat{\alpha} X_i - \hat{\beta} X_i^2) = 0$

$\Rightarrow \hat{\beta} = \frac{\sum w_i Y_i X_i - Y^* X^* \sum w_i}{\sum w_i X_i^2 - X^{*2} \sum w_i} = \frac{\sum w_i x^* y^*}{\sum w_i x^{*2}}$

where $x^* = X_i - X^*$ and $y^* = Y_i - Y^*$ are weighted deviations.
These parameters are now BLUE.
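In practice this weighted estimation can be delegated to a statistical package. A minimal sketch in Python's statsmodels, under the assumption that the variance structure is known up to scale (here $\sigma_i^2 \propto X_i^2$, so $w_i = 1/X_i^2$; the data are hypothetical):

```python
# Weighted least squares (GLS with a known variance structure), as a sketch.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
X = rng.uniform(1, 10, 100)
Y = 2.0 + 1.5 * X + rng.normal(0, 0.5 * X)   # error sd proportional to X

exog = sm.add_constant(X)
wls = sm.WLS(Y, exog, weights=1.0 / X**2).fit()  # weights w_i = 1/sigma_i^2 up to scale
ols = sm.OLS(Y, exog).fit()

print("WLS:", wls.params, wls.bse)   # typically smaller standard errors ...
print("OLS:", ols.params, ols.bse)   # ... than unweighted OLS on the same data
```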
Now let us assume that $E(u_i)^2 = \sigma_i^2 = K^2 X_i^2 = f(X_i)$, i.e. the error variance is
proportional to the square of the explanatory variable. The transformed version of the model is
obtained by dividing through by $X_i$:

$\frac{Y_i}{X_i} = \beta_0 \frac{1}{X_i} + \beta_1 + \frac{U_i}{X_i}$

The variance of the new error term is

$var\left(\frac{u_i}{X_i}\right) = \frac{1}{X_i^2} E(u_i^2) = \frac{1}{X_i^2} K^2 X_i^2 = K^2$, a constant,

which proves that the new random term in the model has a finite constant variance ($K^2$).
We can, therefore, apply OLS to the transformed version of the model
$\frac{Y_i}{X_i} = \beta_0 \frac{1}{X_i} + \beta_1 + \frac{U_i}{X_i}$.
Note that in this transformation the position of the coefficients has changed: the parameter of
the variable $\frac{1}{X_i}$ in the transformed model is the constant intercept of the original
model, while the constant term of the transformed model is the parameter of the explanatory
variable X in the original model.
Therefore, to get back to the original model, we shall have to multiply the estimated regression
through by $X_i$.
Next assume that $E(u_i^2) = \sigma_i^2 = k^2 X_i$. The transformed model is obtained by dividing
through by $\sqrt{X_i}$:

$\frac{Y_i}{\sqrt{X_i}} = \beta_0 \frac{1}{\sqrt{X_i}} + \beta_1 \sqrt{X_i} + \frac{U_i}{\sqrt{X_i}}$

The variance of the transformed error term is

$var\left(\frac{U_i}{\sqrt{X_i}}\right) = \frac{1}{X_i} E(U_i^2) = \frac{1}{X_i} k^2 X_i = k^2$, a constant.

Since the transformed model has no constant term, one will have to use the 'regression through the
origin' model to estimate $\beta_0$ and $\beta_1$.
In this case, therefore, to get back to the original model, we shall have to multiply the
estimated regression through by $\sqrt{X_i}$.
Finally, assume that $E(u_i^2) = \sigma_i^2 = k^2 [E(Y_i)]^2$. The transformed model is

$\frac{Y_i}{E(Y_i)} = \beta_0 \frac{1}{E(Y_i)} + \beta_1 \frac{X_i}{E(Y_i)} + \frac{U_i}{E(Y_i)}$   (i)

with

$var\left(\frac{U_i}{E(Y_i)}\right) = \frac{1}{[E(Y_i)]^2} E(u_i^2) = \frac{1}{[E(Y_i)]^2} k^2 [E(Y_i)]^2 = k^2$, a constant.

The transformed model described in (i) above is, however, not operational in this case, because
$E(Y_i)$ is unknown. But since we can obtain $\hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i$, the
transformation can be made through the following two steps.
1st: we run the usual OLS regression disregarding the heteroscedasticity problem in the data and
obtain $\hat{Y}_i$.
2nd: using the estimated $\hat{Y}_i$, we transform the model as follows:

$\frac{Y_i}{\hat{Y}_i} = \beta_0 \frac{1}{\hat{Y}_i} + \beta_1 \frac{X_i}{\hat{Y}_i} + \frac{U_i}{\hat{Y}_i}$
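A minimal sketch of this two-step procedure in Python (the data are hypothetical; dividing the model through by $\hat{Y}_i$ is equivalent to weighting by $1/\hat{Y}_i^2$):

```python
# Two-step transformation when sigma_i^2 is proportional to [E(Y_i)]^2.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
X = rng.uniform(1, 10, 200)
Y = (2.0 + 1.5 * X) * (1 + rng.normal(0, 0.2, 200))  # error sd proportional to E(Y)

exog = sm.add_constant(X)
yhat = sm.OLS(Y, exog).fit().fittedvalues             # step 1: plain OLS, obtain Y-hat
step2 = sm.WLS(Y, exog, weights=1.0 / yhat**2).fit()  # step 2: reweight by 1/Y-hat^2
print(step2.params)
```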
It should therefore be clear that, in order to adopt the necessary corrective measure (which is
the transformation of the original data in such a way as to obtain a form in which the transformed
disturbance term possesses constant variance), we must have information on the form of
heteroscedasticity.
Also, since the transformed data no longer possess heteroscedasticity, it can be shown that the
estimates of the transformed model are more efficient (i.e., they possess smaller variance) than
the estimates obtained from applying OLS to the original data.
Let's assume that a test reveals that the original data possess heteroscedasticity, and that
heteroscedasticity of the form $\sigma_i^2 = K^2 X_i^2$ is being assumed:

$Y_i = \beta_0 + \beta_1 X_i + U_i$,  $E(U_i^2) = \sigma_i^2 = K^2 X_i^2$.

Applying OLS to this original model gives

$\hat{\beta}_1 = \beta_1 + \frac{\sum x_i u_i}{\sum x_i^2}$

$var(\hat{\beta}_1) = E(\hat{\beta}_1 - \beta_1)^2 = \frac{E(\sum x_i u_i)^2}{(\sum x_i^2)^2} = \frac{\sum x_i^2 E(u_i^2) + 2\sum_{i \neq j} x_i x_j E(u_i u_j)}{(\sum x_i^2)^2} = \frac{\sum x_i^2 \sigma_i^2}{(\sum x_i^2)^2}$,

since $E(u_i u_j) = 0$ for $i \neq j$.
On transforming the original model (dividing through by $X_i$) we obtain

$\frac{Y_i}{X_i} = \beta_1 + \beta_0 \frac{1}{X_i} + \frac{U_i}{X_i}$,

whose error term is homoscedastic with variance $K^2$. In the transformed model $\beta_1$ is
estimated as the constant term,

$\hat{\beta}_1 = \overline{\left(\frac{Y}{X}\right)} - \hat{\beta}_0 \overline{\left(\frac{1}{X}\right)}$,

with

$var(\hat{\beta}_1) = \frac{K^2 \sum (1/X_i)^2}{n\left[\sum (1/X_i)^2 - n\,\overline{(1/X)}^2\right]}$.

A similar comparison can be made for $\hat{\beta}_0$, whose OLS variance in the original model is
$var(\hat{\beta}_0) = \frac{\sigma_u^2 \sum X_i^2}{n \sum x_i^2}$; in both cases the estimates from
the transformed model can be shown to possess the smaller variance, confirming the efficiency gain
from the transformation.
4.3 The Nature of Autocorrelation
An important assumption of the classical linear model is that there is no autocorrelation or serial
correlation among the disturbances $U_i$ entering into the population regression function. This
assumption implies that the covariance of $U_i$ and $U_j$ is equal to zero. That is:

$Cov(U_i, U_j) = E\{[U_i - E(U_i)][U_j - E(U_j)]\} = E(U_i U_j) = 0$ (for $i \neq j$)
But if this assumption is violated, the disturbances are said to be autocorrelated.
This could arise for several reasons.
i) Spatial autocorrelation: In regional cross-section data, a random shock affecting economic
activity in one region may cause economic activity in an adjacent region to change because
of close economic ties between the regions. Shocks due to weather similarities might also
tend to cause the error terms between adjacent regions to be related.
ii) Prolonged influence of shocks: In time-series data, random shocks (disturbances) have
effects that often persist over more than one time period. An earthquake, flood, strike or
war, for example, will probably affect the economy's operation in subsequent periods as well.
iii) Inertia: past actions often have a strong effect on current actions, so that a positive
disturbance in one period is likely to influence activity in succeeding periods.
iv) Data manipulation: published data often undergo interpolation or smoothing, procedures that
average true disturbances over successive time periods.
v) Misspecification: An omitted relevant independent variable that is autocorrelated will make
the disturbance (associated with the misspecified model) autocorrelated. An incorrect
functional form or a misspecification of the equation's dynamics could do the same. In
these instances the appropriate procedure is to correct the misspecification.
Note that autocorrelation is a special case of correlation. Autocorrelation refers to the
relationship not between two (or more) different variables, but between the successive values of
the same variable (in this section we are particularly interested in the autocorrelation of the
U's). Moreover, note that the terms autocorrelation and serial correlation are treated
synonymously.
Consequences of Autocorrelation
When the disturbance term exhibits serial correlation the values as well as the standard errors of
the parameter estimates are affected.
i) If the disturbances are correlated, the previous value of the disturbance has some information
to convey about the current disturbance. If this information is ignored, it is clear that the
sample data are not being used with maximum efficiency. However, the estimates of the parameters
are not statistically biased even when the residuals are serially correlated; that is, the OLS
parameter estimates are statistically unbiased in the sense that their expected value is equal
to the true parameter.
ii) The variance of the random term U may be seriously underestimated. In particular, the
underestimation of the variance of U will be more serious in the case of positive autocorrelation
of the error term ($U_t$). With positive first-order autocorrelated errors, fitting an OLS line
can give an estimate quite wide of the mark. The high variation in these estimates will cause the
variance of the OLS estimates to be greater than it would have been had the errors been
distributed randomly.
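Serial correlation in the residuals is commonly screened with the Durbin–Watson statistic, which is near 2 when the residuals are uncorrelated. A minimal sketch in Python with hypothetical AR(1) disturbances:

```python
# Durbin-Watson screening for first-order autocorrelation (hypothetical data).
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(3)
T = 120
x = rng.normal(size=T)
u = np.zeros(T)
for t in range(1, T):                  # AR(1) disturbances: u_t = 0.7*u_{t-1} + e_t
    u[t] = 0.7 * u[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + u

resid = sm.OLS(y, sm.add_constant(x)).fit().resid
print(durbin_watson(resid))            # values well below 2 suggest positive autocorrelation
```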
4.4 Multicollinearity
One of the assumptions of the classical linear regression model (CLRM) is that there is no
perfect multicollinearity among the regressors included in the regression model. Note that
although the assumption is said to be violated only in the case of exact multicollinearity (i.e., an
exact linear relationship among some of the regressors), the presence of multicollinearity (an
approximate linear relationship among some of the regressors) leads to estimating problems
important enough to warrant treating it as a violation of the classical linear regression model.
Multicollinearity does not depend on any theoretical or actual linear relationship among any of
the regressors; it depends on the existence of an approximate linear relationship in the data set at
hand. Unlike most other estimating problems, this problem is caused by the particular sample
available. Multicollinearity in the data could arise for several reasons. For example, the
independent variables may all share a common time trend, one independent variable might be the
lagged value of another that follows a trend, some independent variables may have varied
together because the data were not collected from a wide enough base, or there could in fact exist
an approximate linear relationship among the variables themselves.
Note that the existence of multicollinearity will seriously affect the parameter estimates.
Intuitively, when any two explanatory variables are changing in nearly the same way, it becomes
extremely difficult to establish the influence of each regressor on the dependent variable
separately. That is, if two explanatory variables change by the same proportion, the influence of
one of them on the dependent variable may be erroneously attributed to the other. Their effects
cannot be sensibly investigated, due to the high intercorrelation.
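One common way to gauge the severity of such intercorrelation is the variance inflation factor (VIF). A minimal sketch in Python with hypothetical, nearly collinear regressors:

```python
# Variance inflation factors as a multicollinearity diagnostic (hypothetical data).
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(4)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.1, size=100)      # x2 is nearly a copy of x1
X = sm.add_constant(np.column_stack([x1, x2]))

for i, name in enumerate(["const", "x1", "x2"]):
    print(name, variance_inflation_factor(X, i))  # VIF > 10 is a common warning sign
```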
Remedial measures
It is more difficult to deal with models exhibiting multicollinearity than to detect the problem
in the first place. Different remedial measures have been suggested by econometricians, depending
on the severity of the problem, the availability of other sources of data, and the importance of
the variables that are found to be multicollinear in the model.
Some suggest that a minor degree of multicollinearity can be tolerated, although one should be a
bit careful while interpreting the model under such conditions. Others suggest removing the
variables that show multicollinearity if they are not important in the model; but by doing so, the
desired characteristics of the model may be affected. The following corrective procedures have
been suggested for cases where the problem of multicollinearity is serious.
1. Increase the size of the sample: multicollinearity may be avoided or reduced if the size of the
sample is increased, since the covariances of the estimators are inversely related to the sample
size. But we should remember that this will help only when the intercorrelation happens to exist
in the sample but not in the population of the variables. If the variables are collinear in the
population, increasing the sample size will not help to reduce multicollinearity.
3. Use extraneous information: extraneous information is information obtained from any source
outside the sample that is being used for the estimation. Extraneous information may be available
from economic theory or from empirical studies already conducted in the field in which we are
interested. There are three methods through which extraneous information can be utilized to deal
with the problem of multicollinearity.
4.5 Specification Errors
In developing an empirical model, one is likely to commit one or more of the following
specification errors:
1. Omission of a relevant variable(s)
2. Inclusion of an unnecessary variable(s)
3. Adopting the wrong functional form
4. Errors of measurement
5. Incorrect specification of the stochastic error term
If, for example, a relevant variable is omitted from the model, the consequences include the
following:
4. The conventionally measured variance of $\hat{\alpha}_2$ ($= \sigma^2/\sum x_i^2$) is a biased
estimator of the variance of the true estimator $\hat{\beta}_2$.
5. In consequence, the usual confidence-interval and hypothesis-testing procedures are likely to
give misleading conclusions about the statistical significance of the estimated parameters.
6. As another consequence, forecasts based on the incorrect model and the forecast (confidence)
intervals will be unreliable.
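The bias from omitting a relevant variable can be seen concretely in a small simulation. A minimal sketch in Python (the data-generating process is a hypothetical example):

```python
# Simulation: omitting a relevant, correlated regressor biases the coefficient kept.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 5000
x2 = rng.normal(size=n)
x1 = 0.8 * x2 + rng.normal(size=n)   # x1 is correlated with the soon-to-be-omitted x2
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

full = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
short = sm.OLS(y, sm.add_constant(x1)).fit()   # misspecified: x2 omitted

print(full.params[1])    # close to the true value 2.0
print(short.params[1])   # biased upward: part of x2's effect is attributed to x1
```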