ECONOMETRICS I
Prepared by:
Firehun Jemal
Fikadu Abera
Edited by:
Tigstu W/silassie
December, 2023
Bonga, Ethiopia
Econometrics Module I
Course Name: Econometrics I
Course Code: Econ 3061
Course Description: The course aims at introducing the theory (and practice) of cross-sectional
econometrics. It first makes an introduction to the basic concepts in econometrics like economic
and econometric modeling as well as types of data; then proceeds to the simple classical linear
regression model and introduces estimation techniques such as the method of moments, ordinary
least squares and maximum likelihood estimation, inference and analyses of residuals. This is
then built into the multiple linear regression framework. After making tests of linear restrictions
emanating from economic theory, the course will finally try to highlight the problems of
multicollinearity, heteroscedasticity and autocorrelation (violations of the basic assumptions of
classical linear regression models). The course builds upon your previous course Statistics for
Economists. Hence, familiarity with the material, particularly sampling distributions, estimation
and hypothesis testing will be of much help. These will be applied on Ethiopian/international
data using statistical packages.
Course Outcomes:
The main outcome of this course is to enable students to have a good background
knowledge of cross-sectional econometric models. More specifically, after the
completion of the course, students are expected to:
Distinguish between economic and econometric models;
Do simple and multiple regression with economic data (both manually and using
statistical packages);
Interpret regression results (like coefficients and R2) and test hypotheses (both
manually and using statistical packages); and
Detect (in) existence of problems of multicollinearity, heteroscedasticity and
autocorrelation as well as suggest how to rectify such problems (both manually
and using statistical packages).
Table of Contents
CHAPTER ONE: Definition and scope of econometrics
1.1 What is Econometrics?
1.2 Economic models vs. econometric models
1.3 Methodology of econometrics
1.4 The Sources, Types and Nature of Data
1.5 Desirable properties of an econometric model
1.6 Goals of Econometrics
CHAPTER TWO: THE CLASSICAL REGRESSION ANALYSIS: Simple Linear Regression Model
2.1 Concept of Regression Function
2.2 Simple Linear Regression Model
2.2.1 Assumptions of the Classical Linear Stochastic Regression Model
2.3 Methods of Estimation
2.3.1 The Ordinary Least Squares (OLS) Method and Method of Moments (MM)
2.3.2 Maximum Likelihood Estimation
2.4 Properties of OLS Estimators and the Gauss-Markov Theorem
2.5 Tests of the 'Goodness of Fit' with R²
2.6 Testing the Significance of OLS Parameters
CHAPTER THREE: THE CLASSICAL REGRESSION ANALYSIS: Multiple Linear Regression Model
3.1 Introduction
3.2 Assumptions of the Multiple Regression Model
3.3 Partial-Correlation Coefficients
3.4 A Model with Two Explanatory Variables
3.4.1 Estimation of Parameters of the Two-Explanatory-Variables Model
3.4.2 The Coefficient of Determination (R²): Two Explanatory Variables Case
3.5 Statistical Properties of the Parameters and the Gauss-Markov Theorem
3.6 Hypothesis Testing in the Multiple Regression Model
3.6.1 Tests of Individual Significance
3.6.2 Test of Overall Significance
3.7 Predictions Using Multiple Linear Regression
CHAPTER FOUR: VIOLATIONS OF CLASSICAL ASSUMPTIONS
4.1 The Assumption of Zero Expected Disturbances
4.2 The Nature of Heteroscedasticity
4.3 The Nature of Autocorrelation
4.4 Multicollinearity
4.5 Specification Errors
CHAPTER ONE
Definition and scope of econometrics
The economic theories we learn in various economics courses suggest many relationships among
economic variables. For instance, in microeconomics we learn demand and supply models in
which the quantities demanded and supplied of a good depend on its price. In macroeconomics,
we study the 'investment function' to explain the amount of aggregate investment in the economy as
the rate of interest changes; and the 'consumption function' that relates aggregate consumption to
the level of aggregate disposable income.
However, economic theories that postulate the relationships between economic variables have to
be checked against data obtained from the real world. If empirical data verify the relationship
proposed by economic theory, we accept the theory as valid. If the theory is incompatible with
the observed behavior, we either reject the theory or in the light of the empirical evidence of the
data, modify the theory. To provide a better understanding of economic relationships and a
better guidance for economic policy making we also need to know the quantitative relationships
between the different economic variables. We obtain these quantitative measurements taken
from the real world. The field of knowledge which helps us to carry out such an evaluation of
economic theories in empirical terms is econometrics.
1.1 WHAT IS ECONOMETRICS?
Literally interpreted, econometrics means “economic measurement”, but the scope of
econometrics is much broader as described by leading econometricians. Various econometricians
used different ways of wordings to define econometrics. But if we distill the fundamental
features/concepts of all the definitions, we may obtain the following definition.
“Econometrics is the science which integrates economic theory, economic statistics, and
mathematical economics to investigate the empirical support of the general schematic law
established by economic theory. It is a special type of economic analysis and research in which
the general economic theories, formulated in mathematical terms, are combined with empirical
measurements of economic phenomena. Starting from the relationships of economic theory, we
express them in mathematical terms so that they can be measured. We then use specific methods,
called econometric methods in order to obtain numerical estimates of the coefficients of the
economic relationships.”
Measurement is an important aspect of econometrics. However, the scope of econometrics is
much broader than measurement. As D. Intriligator rightly stated, the "metric" part of the word
econometrics signifies 'measurement', and hence econometrics is basically concerned with
measuring of economic relationships. In short, econometrics may be considered as the
integration of economics, mathematics, and statistics for the purpose of providing numerical
values for the parameters of economic relationships and verifying economic theories.
Econometrics vs. statistics
Econometrics differs from both mathematical statistics and economic statistics. An economic
statistician gathers empirical data, records them, tabulates them or charts them, and attempts to
describe the pattern in their development over time and perhaps detect some relationship
between various economic magnitudes. Economic statistics is mainly a descriptive aspect of
economics. It does not provide explanations of the development of the various variables and it
does not provide measurements of the coefficients of economic relationships.
Mathematical (or inferential) statistics deals with methods of measurement which are
developed on the basis of controlled experiments. But statistical methods of measurement are
not appropriate for a number of economic relationships because for most economic relationships
controlled or carefully planned experiments cannot be designed due to the fact that the nature of
relationships among economic variables are stochastic or random. Yet the fundamental ideas of
inferential statistics are applicable in econometrics, but they must be adapted to the problems of
economic life.
1.2 Economic models vs. econometric models
i) Economic models:
Any economic theory is an abstraction from the real world. For one reason, the immense
complexity of the real world economy makes it impossible for us to understand all
interrelationships at once. Another reason is that all the interrelationships are not equally
important as such for the understanding of the economic phenomenon under study. The sensible
procedure is therefore, to pick up the important factors and relationships relevant to our problem
and to focus our attention on these alone. Such a deliberately simplified analytical framework is
called an economic model. It is an organized set of relationships that describes the functioning of
an economic entity under a set of simplifying assumptions.
ii) Econometric models
An econometric model restates the economic model in a form suitable for empirical testing: the
exact relationships of economic theory are supplemented with a random disturbance term, since
observed economic data never satisfy exact relationships.
1.3 Methodology of econometrics
Econometric research typically proceeds along the following main steps.
Figure: Anatomy of econometric modeling.
1. Specification of the model
In this step the econometrician has to express the relationships between economic variables in
mathematical form. This step involves the determination of three important tasks:
i) The dependent and independent (explanatory) variables which will be included in the
model.
ii) The a priori theoretical expectations about the size and sign of the parameters of the
function.
iii) The mathematical form of the model (number of equations, specific form of the
equations, etc.)
Note: The specification of the econometric model will be based on economic theory and on any
available information related to the phenomena under investigation. Thus, specification of the
econometric model presupposes knowledge of economic theory and familiarity with the
particular phenomenon being studied.
Specification of the model is the most important and the most difficult stage of any econometric
research. It is often the weakest point of most econometric applications. In this stage there is an
enormous likelihood of committing errors, i.e. of incorrectly specifying the model.
Some of the common reasons for incorrect specification of the econometric models are:
1. The imperfection and looseness of statements in economic theories.
2. The limitation of our knowledge of the factors which are operative in any particular
case.
3. The formidable obstacles presented by data requirements in the estimation of large
models.
The most common errors of specification are:
a. Omissions of some important variables from the function.
b. The omissions of some equations (for example, in simultaneous equations model).
c. The mistaken mathematical form of the functions.
2. Estimation of the model
This is purely a technical stage which requires knowledge of the various econometric methods,
their assumptions and the economic implications for the estimates of the parameters. This stage
includes the following activities.
a. Gathering of the data on the variables included in the model.
b. Examination of the identification conditions of the function (especially for simultaneous
equations models).
c. Examination of the aggregations problems involved in the variables of the function.
d. Examination of the degree of correlation between the explanatory variables (i.e.
examination of the problem of multicollinearity).
e. Choice of the appropriate econometric technique for estimation, i.e. deciding which specific
econometric method to apply, such as OLS, maximum likelihood, Logit, or Probit.
3. Evaluation of the estimates
This stage consists of deciding whether the estimates of the parameters are theoretically
meaningful and statistically satisfactory. This stage enables the econometrician to evaluate the
results of calculations and determine the reliability of the results. For this purpose we use
various criteria which may be classified into three groups:
i. Economic a priori criteria: These criteria are determined by economic theory and refer
to the size and sign of the parameters of economic relationships.
ii. Statistical criteria (first-order tests): These are determined by statistical theory and aim
at the evaluation of the statistical reliability of the estimates of the parameters of the
model. Correlation coefficient test, standard error test, t-test, F-test, and R2-test are some
of the most commonly used statistical tests.
iii. Econometric criteria (second-order tests):
These are set by the theory of econometrics and aim at the investigation of whether the
assumptions of the econometric method employed are satisfied or not in any particular case. The
econometric criteria serve as a second order test (as test of the statistical tests) i.e. they determine
the reliability of the statistical criteria; they help us establish whether the estimates have the
desirable properties of unbiasedness, consistency, etc. Econometric criteria aim at the detection of
the violation or validity of the assumptions of the various econometric techniques.
4) Evaluation of the forecasting power of the model:
Forecasting is one of the aims of econometric research. However, before using an estimated
model for forecasting, we must assess by some means the predictive power of the model. It is
possible that the model may be economically meaningful and statistically and econometrically
correct for the sample period for which the model has been estimated, yet it may not be suitable
for forecasting for various reasons. Therefore, this stage involves the investigation of the
stability of the estimates and their sensitivity to changes in the size of the sample. Consequently,
we must establish whether the estimated function performs adequately outside the sample of
data, i.e. we must test the extra-sample performance of the model.
1.4 The Sources, Types and Nature of Data
Data are records of the actual state of some aspect of the universe at a particular point in time.
Data are not abstract; they are concrete, they are measurements or the tangible features of the
world. Data are an essential part of conducting research and they provide the evidence that links
the research to the real world.
The success of any econometric analysis ultimately depends on the availability of the appropriate
data. It is therefore essential that we spend some time discussing the nature, sources, and
limitations of the data that one may encounter in empirical analysis.
Types of Data
Three types of data may be available for empirical analysis: time series, cross-section, and
pooled (i.e., combination of time series and cross-section) data.
Qualitative data are sometimes called dummy variables or categorical variables. These are
variables that cannot be quantified.
Example: male or female, married or unmarried, religion, etc.
Quantitative data are data that can be quantified.
Example: income, prices, money, etc.
a) Time Series data
✓ These are observations on the values that one or more variables take over successive
periods of time (e.g., annual GDP or monthly price figures).
b) Cross-Section data
✓These data give information on the variables concerning individual agents (consumers or
producers) at a given point of time.
✓ many units observed at one point in time
✓ Generally obtained through official records of individual units, surveys, questionnaires
(data collection instrument that contains a series of questions designed for a specific
purpose)
Example:
- the census of population conducted by CSA.
Note that due to heterogeneity, cross- sectional data have their own problems.
c) Pooled Data
✓ These consist of cross-sectional data sets collected at different points in time.
✓ They contain elements of both time series and cross-sectional data.
✓Consists of cross-sectional data sets that are observed in different time periods and
combined together
✓ At each time period (e.g., year) a different random sample is chosen from population
✓ Individual units are not the same
✓ For example, if we choose a random sample of 400 firms in 2022, choose another
sample in 2023 and combine these cross-sectional data sets, we obtain a pooled
cross-section data set.
d) Panel (longitudinal) data
The panel or longitudinal data also called micro panel data, is a special type of pooled data in
which the same cross-sectional unit is surveyed over time.
Source of Data
Based on the source, the type of data collected could be primary or secondary in nature.
• Primary data are those which are collected afresh and for the first time, and thus happen to be
original in character. Its advantage is its relevance to the user, but it is also likely to be expensive
in time and money terms to collect.
• Secondary data are those which have already been collected by someone else and which have
already been passed through the statistical process. It is information extracted from an existing
source, probably published or held on a computer database.
Nature of Data
a) Nominal data - The nominal scale is used for assigning numbers as the identification of
individual units. For example, the classification of journals according to the discipline they belong
to, may be considered as nominal data. If numbers are assigned to describe the categories, the
numbers represent only the name of the category.
b) Ordinal data - It indicates the ordered or graded relationship among the numbers assigned to
the observations made. These numbers connote ranks of different categories having relationship
in a definite order. For example, to study the responsiveness of library staff a researcher may
assign '1' to indicate poor, '2' to indicate average, '3' to indicate good and '4' to indicate excellent.
The numbers 1,2, 3 and 4 in this case are set of ordinal data which indicate that 4 is better than 3
which in turn is better than 2 and so on. The ordinal data show the direction of the difference but
not the exact amount of difference.
c) Interval data - Interval data are ordered categories of data and the differences between various
categories are of equal measurement.
For example, we can measure the IQ (Intelligence Quotient) of a group of children. After
assigning numerical value to the IQ of each child, the data can be grouped with interval of 10,
like 0 to 10, 10 to 20, 20 to 30 and so on. In this case, '0' does not mean the absence of
intelligence, and children with IQ '20' are not twice as intelligent as children with IQ '10'.
d) Ratio data - Ratio data are the quantitative measurement of a variable in terms of magnitude.
In ratio data, we can say that one thing is twice or thrice of another as for example,
measurements involving weight, distance, price, etc.
CHAPTER TWO
THE CLASSICAL REGRESSION ANALYSIS: Simple Linear Regression
Model
2.1. Concept of Regression Function
In economics the relationships between variables are mainly explained in the form of dependent
and independent variables. The dependent variable is the variable whose average value is
computed using the known values of the explanatory variable(s), while the values of the
explanatory variables are taken as fixed in repeated sampling from the population.
Ex. Suppose the amount of a commodity demanded by an individual depends on the
price of the commodity, the income of the individual, the prices of other goods, etc. From this
statement, quantity demanded is the dependent variable, whose value is determined by
the price of the commodity, the income of the individual, the prices of other goods, etc.; and the
price of the commodity, the income of the individual and the prices of other goods are the
independent (explanatory) variables, whose values are obtained from the population using
repeated sampling. The relationship between these dependent and independent variables is the
concern of regression analysis, i.e.
Qd = f(P, P0, Y, ...)
If we study the relationship between the dependent variable and one independent variable, i.e.
Qd = f(P), this is known as a simple two-variable regression model, because there is one
dependent variable (Qd) and one independent variable (P). However, if the dependent variable
depends upon more than one independent variable, such as Qd = f(P, P0, Y), it is known as
multiple regression analysis. The functional relationship between the dependent and independent
variables may be linear or non-linear.
The key concept underlying regression analysis is the concept of the conditional expectation
function (CEF), or population regression function (PRF). Our objective in regression analysis
is to find out how the average value of the dependent variable (or regressand) varies with the
given value of the explanatory variable (or regressor).
This section largely deals with linear PRFs, that is, regressions that are linear in the parameters.
They may or may not be linear in the regressand or the regressors. For empirical purposes, it is
the stochastic PRF that matters. The stochastic disturbance term ui plays a critical role in
estimating the PRF. The PRF is an idealized concept, since in practice one rarely has access to
the entire population of interest. Usually, one has a sample of observations from the population.
Therefore, one uses the stochastic sample regression function (SRF) to estimate the PRF.
Economic theories are mainly concerned with the relationships among various economic
variables. These relationships, when phrased in mathematical terms, can predict the effect of one
variable on another. The functional relationships of these variables define the dependence of one
variable upon the other variable (s) in the specific form. The specific functional forms may be
linear, quadratic, logarithmic, exponential, hyperbolic, or any other form. Here we consider a
simple linear regression model, i.e. a relationship between two variables related in a linear form.
Such a relationship may be stochastic or non-stochastic, among which we shall be using the
former in econometric analysis.
Yi = α + βXi + ui ……………………………………………………….(2.2)
Thus a stochastic model is a model in which the dependent variable is not only determined by the
explanatory variable(s) included in the model but also by others which are not included in the
model.
2.2. Simple Linear Regression model.
The above stochastic relationship (2.2) with one explanatory variable is called simple linear
regression model.
The true relationship which connects the variables involved is split into two parts:
A part represented by a line and a part represented by the random term 'u'.
The scatter of observations represents the true relationship between Y and X. The line
represents the exact part of the relationship and the deviation of the observation from the line
represents the random component of the relationship. Were it not for the errors in the model, we
would observe all the points on the line Y1′, Y2′, ..., Yn′ corresponding to X1, X2, ..., Xn. However,
because of the random disturbance we observe
Yi = (α + βXi) + ui
where (α + βXi) is the part represented by the regression line and ui is the random variable.
The first component, in the bracket, is the part of Y explained by the changes in X and the second
is the part of Y not explained by X, that is to say, the change in Y is due to the random influence
of ui.
2.2.1 Assumptions of the Classical Linear Stochastic Regression Model.
The classical econometricians made important assumptions in their analysis of regression. The
most important of these assumptions are discussed below.
1. The model is linear in parameters.
The classical theorists assumed that the model should be linear in the parameters regardless of
whether the explanatory and the dependent variables are linear or not. This is because if the
parameters are non-linear they are difficult to estimate, since their values are not known and we
are only given data on the dependent and independent variables.
2. Ui is a random real variable
This means that the value which u may assume in any one period depends on chance; it may be
positive, negative or zero. Every value has a certain probability of being assumed by u in any
particular instance.
3. The mean value of the random variable(U) in any particular period is zero
This means that for each value of X, the random variable (u) may assume various values,
some greater than zero and some smaller than zero, but if we consider all the possible positive
and negative values of u for any given value of X, they would have an average value equal to
zero. In other words, the positive and negative values of u cancel each other.
Mathematically, E(Ui) = 0
4. The variance of the random variable(U) is constant in each period (The assumption
of homoscedasticity)
For all values of X, the u's will show the same dispersion around their mean. This is called the
homoscedasticity assumption, and the constant variance itself is called the homoscedastic
variance.
5. The random variable (U) has a normal distribution
This means the values of u (for each X) have a bell-shaped symmetrical distribution about their
zero mean and constant variance σ², i.e.
Ui ~ N(0, σ²)
6. The random terms of different observations are independent (the assumption of no
autocorrelation)
This means the value which the random term assumed in one period does not depend on the
value which it assumed in any other period.
7. The Xi are a set of fixed values in the hypothetical process of repeated sampling
which underlies the linear regression model.
This means that, in taking a large number of samples on Y and X, the Xi values are the same in all
samples, but the ui values do differ from sample to sample, and so of course do the values of Yi.
The parameters α and β represent their respective population values and are called the true
parameters, since they would be computed from the population values of Y and X. But it is
difficult to obtain the population values of Y and X for technical or economic reasons, so we are
forced to take sample values of Y and X. The parameters estimated from the sample values of Y
and X are called the estimators of the true parameters and are symbolized as α̂ and β̂.
The model Yi = α̂ + β̂Xi + ei is called the estimated relationship between Y and X, since α̂ and β̂
are estimated from a sample of Y and X, and ei represents the sample counterpart of the
population random disturbance Ui.
2.3 Methods of estimation
2.3.1 The ordinary least squares (OLS) method
Estimation of α and β by the least squares method (OLS) or classical least squares (CLS) involves
finding values for the estimates α̂ and β̂ which will minimize the sum of the squared
residuals (Σei²):
Σei² = Σ(Yi − α̂ − β̂Xi)² ……………………….(2.4)
To find the values of α̂ and β̂ that minimize this sum, we partially differentiate Σei² with respect
to α̂ and β̂ and set the partial derivatives equal to zero.
1. ∂(Σei²)/∂α̂ = −2Σ(Yi − α̂ − β̂Xi) = 0 .......... .......... .......... .....(2.5)
which gives α̂ = Ȳ − β̂X̄ .......... .......... .......... .......... ....(2.7)
2. ∂(Σei²)/∂β̂ = −2ΣXi(Yi − α̂ − β̂Xi) = 0 .......... .......... ..........(2.8)
Equations (2.5) and (2.8) are called the Normal Equations. Substituting (2.7) into (2.8) and
rearranging, it follows that:
ΣYiXi = ΣXi(Ȳ − β̂X̄) + β̂ΣXi²
ΣYiXi = ȲΣXi − β̂X̄ΣXi + β̂ΣXi²
ΣYiXi − ȲΣXi = β̂(ΣXi² − X̄ΣXi)
ΣXiYi − nX̄Ȳ = β̂(ΣXi² − nX̄²)
β̂ = (ΣXiYi − nX̄Ȳ) / (ΣXi² − nX̄²) ………………….(2.11)
Equation (2.11) can be rewritten in a somewhat different way; the final steps are:
β̂ = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)²
β̂ = Σxiyi / Σxi² ……………………………………… (2.12)
The expression in (2.12), in which lowercase letters denote deviations from the means
(xi = Xi − X̄, yi = Yi − Ȳ), is termed the formula in deviation form.
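To make the computation concrete, here is a minimal Python sketch (not part of the original module) that evaluates equations (2.7) and (2.12); the data values are purely illustrative:

import numpy as np

# Hypothetical sample data on X (e.g., income) and Y (e.g., consumption)
X = np.array([10, 12, 15, 18, 20, 24, 27, 30], dtype=float)
Y = np.array([ 8, 10, 11, 14, 15, 17, 20, 21], dtype=float)

# Deviation form: x = X - X_bar, y = Y - Y_bar
x = X - X.mean()
y = Y - Y.mean()

beta_hat = (x * y).sum() / (x ** 2).sum()    # equation (2.12)
alpha_hat = Y.mean() - beta_hat * X.mean()   # equation (2.7)
print(f"beta_hat = {beta_hat:.4f}, alpha_hat = {alpha_hat:.4f}")

# Cross-check against NumPy's built-in least-squares line fit
b, a = np.polyfit(X, Y, 1)
print(f"polyfit slope = {b:.4f}, intercept = {a:.4f}")  # matches the manual result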
Estimation of a function with zero intercept
Suppose it is desired to fit the line Yi = α + βXi + Ui, subject to the restriction α = 0. To
estimate β̂, the problem is put in the form of a restricted minimization problem and then the
Lagrange method is applied.
We minimize: Σei² = Σ(Yi − α̂ − β̂Xi)²
Subject to: α̂ = 0
The composite function then becomes
Z = Σ(Yi − α̂ − β̂Xi)² − λα̂, where λ is a Lagrange multiplier.
Setting the partial derivatives of Z to zero and imposing α̂ = 0 gives
ΣXi(Yi − β̂Xi) = 0
Solving for β̂:
β̂ = ΣXiYi / ΣXi² ……………………………………..(2.13)
This formula involves the actual values (observations) of the variables and not their deviation
forms, as in the case of the unrestricted estimate of β̂.
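A similar one-line computation gives the restricted (zero-intercept) estimator of equation (2.13); the data are again hypothetical:

import numpy as np

X = np.array([10, 12, 15, 18, 20], dtype=float)
Y = np.array([21, 25, 29, 37, 41], dtype=float)

# Restricted estimator: actual values, not deviations (equation 2.13)
beta_restricted = (X * Y).sum() / (X ** 2).sum()
print(f"beta_hat through the origin = {beta_restricted:.4f}")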
2.3.2 Maximum Likelihood Estimation
Consider two distributions, A and B. If the true population were B, then the probability that we
would have obtained the sample shown would be quite small. But if the true population were A,
then the probability that we would have drawn the sample would be substantially larger. We
therefore select population A as the one most likely to have yielded the observed data. We
define the maximum likelihood estimator of β as the value of β̂ which would most likely generate
the observed sample observations Y1, Y2, Y3, ..., Yn. If Yi is normally distributed and each of the
Y's is drawn independently, then maximum likelihood estimation maximizes P(Y1)·P(Y2)·...·P(Yn),
where each P represents a probability associated with the normal distribution.
P(Y1)·P(Y2)·...·P(Yn) is often referred to as the likelihood function. The likelihood function
depends not only on the sample values but also on the unknown parameters of the problem.
In describing the likelihood function we often think of the unknown parameters as varying while
the Y's (the dependent variables) are held fixed. This seems reasonable because finding the
maximum likelihood estimate involves a search over alternative parameter estimates which
would be most likely to generate the given sample. For this reason the likelihood function must
be interpreted differently from the joint probability distribution: in the latter case the Y's are
allowed to vary and the underlying parameters are fixed, while the reverse is true in the case of
maximum likelihood.
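As an illustration of this idea, the following sketch (simulated data; SciPy's general-purpose optimizer) maximizes the normal log-likelihood of the two-variable linear model numerically. Under the classical assumptions the resulting intercept and slope estimates coincide with OLS, which the last two lines confirm:

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, 50)
Y = 2.0 + 0.5 * X + rng.normal(0, 1.0, 50)  # true alpha = 2, beta = 0.5

def neg_log_likelihood(params):
    alpha, beta, log_sigma = params
    sigma = np.exp(log_sigma)               # keeps sigma positive
    resid = Y - alpha - beta * X
    ll = (-0.5 * len(Y) * np.log(2 * np.pi * sigma ** 2)
          - (resid ** 2).sum() / (2 * sigma ** 2))
    return -ll                              # minimize the negative log-likelihood

res = minimize(neg_log_likelihood, x0=[0.0, 0.0, 0.0])
print(f"ML: alpha = {res.x[0]:.4f}, beta = {res.x[1]:.4f}")
b, a = np.polyfit(X, Y, 1)                  # OLS line for comparison
print(f"OLS: alpha = {a:.4f}, beta = {b:.4f}")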
2.4 Properties of OLS Estimators and Gauss-Markov Theorem
The ideal or optimum properties that the OLS estimates possess may be summarized by well
known theorem known as the Gauss-Markov Theorem.
Statement of the theorem: “Given the assumptions of the classical linear regression model, the
OLS estimators, in the class of linear and unbiased estimators, have the minimum variance, i.e.
the OLS estimators are BLUE.”
According to the theorem, under the basic assumptions of the classical linear regression model,
the least squares estimators are linear, unbiased and have minimum variance (i.e. are best of all
linear unbiased estimators). Sometimes the theorem is referred to as the BLUE theorem, i.e. Best,
Linear, Unbiased Estimator. An estimator is called BLUE if it is:
a. Linear: a linear function of the random variable, such as, the dependent variable Y.
b. Unbiased: its average or expected value is equal to the true population parameter.
c. Minimum variance: It has a minimum variance in the class of linear and unbiased
estimators. An unbiased estimator with the least variance is known as an efficient
estimator.
According to the Gauss-Markov theorem, the OLS estimators possess all the BLUE properties.
The detailed proofs of these properties are presented below.
The variance of the random variable (Ui)
Dear student! You may observe that the variances of the OLS estimates involve σ², which is the
population variance of the random disturbance term. But it is difficult to obtain the population
data of the disturbance term for technical and economic reasons. Hence it is difficult to
compute σ²; this implies that the variances of the OLS estimates are also difficult to compute. But
we can compute these variances if we take the unbiased estimate of σ², which is σ̂² computed from
the sample values of the disturbance term ei from the expression:
σ̂u² = Σei² / (n − 2) …………………………………..(2.14)
Statistical test of Significance of the OLS Estimators
(First Order tests)
After the estimation of the parameters and the determination of the least square regression line,
we need to know how 'good' the fit of this line is to the sample observations of Y and X, that is to
say we need to measure the dispersion of observations around the regression line. This
knowledge is essential because the closer the observation to the line, the better the goodness of
fit, i.e. the better is the explanation of the variations of Y by the changes in the explanatory
variables.
We divide the available criteria into three groups: the theoretical a priori criteria, the statistical
criteria, and the econometric criteria. Under this section, our focus is on statistical criteria (first
order tests). The two most commonly used first order tests in econometric analysis are:
1) The coefficient of determination (the square of the correlation coefficient i.e. R2). This test is
used for judging the explanatory power of the independent variable(s).
2) The standard error tests of the estimators. This test is used for judging the statistical
reliability of the estimates of the regression coefficients.
R² shows the percentage of the total variation of the dependent variable that can be explained by
the changes in the explanatory variable(s) included in the model. To elaborate this, let us draw a
horizontal line corresponding to the mean value of the dependent variable, Ȳ (see the figure below).
By fitting the line Ŷ = β̂0 + β̂1X, we try to obtain the explanation of the variation of the
dependent variable Y produced by the changes of the explanatory variable X.
[Figure: the fitted line Ŷ = β̂0 + β̂1X with an observation Y; its deviation from the mean, Y − Ȳ,
is split into the explained part Ŷ − Ȳ and the residual e = Y − Ŷ.]
As can be seen from the figure above, Y − Ȳ measures the variation of the sample
observation values of the dependent variable around the mean.
However, the variation in Y that can be attributed to the influence of X (i.e. the regression
line) is given by the vertical distance Ŷ − Ȳ.
The part of the total variation in Y about Ȳ that can't be attributed to X is equal to
e = Y − Ŷ, which is referred to as the residual variation.
In summary:
We may write the observed Y as the sum of the predicted value (Ŷ) and the
residual term (ei):
Yi = Ŷi + ei (observed Yi = predicted Yi + residual)
From equation (2.34) we can write the above equation in deviation form: yi = ŷi + ei.
Squaring and summing both sides over the sample:
Σy² = Σ(ŷ + e)²
Σy² = Σŷ² + Σei² + 2Σŷe
But Σŷe = 0 ………………………………………………(2.46)
Therefore,
Σyi² = Σŷ² + Σei² ………………………………...(2.47)
(Total variation = Explained variation + Unexplained variation)
i.e. TSS = ESS + RSS, so that
ESS/TSS = Σŷ² / Σy² ……………………………………….(2.49)
From equation (2.37) we have ŷ = β̂x. Squaring and summing both sides gives us
Σŷ² = β̂²Σx² ……………………………………………(2.50)
ESS/TSS = β̂²Σx² / Σy² …………………………………(2.51)
Since β̂ = Σxiyi / Σxi², substituting into (2.51) gives
ESS/TSS = (Σxy)² / (Σx²·Σy²) ………………………………………(2.52)
Comparing (2.52) with the formula of the correlation coefficient,
r = Σxy / √(Σx²·Σy²),
we see that ESS/TSS = r².
The limits of R²: The value of R² falls between zero and one, i.e. 0 ≤ R² ≤ 1.
Interpretation of R²
Suppose R² = 0.9; this means that the regression line gives a good fit to the observed data, since
this line explains 90% of the total variation of the Y values around their mean. The remaining
10% of the total variation in Y is unaccounted for by the regression line and is attributed to the
factors included in the disturbance variable ui.
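The decomposition and R² can be verified numerically; the sketch below (hypothetical data, continuing the simple-regression example) computes TSS, ESS and RSS and checks equation (2.47):

import numpy as np

X = np.array([10, 12, 15, 18, 20, 24, 27, 30], dtype=float)
Y = np.array([ 8, 10, 11, 14, 15, 17, 20, 21], dtype=float)

x, y = X - X.mean(), Y - Y.mean()
beta_hat = (x * y).sum() / (x ** 2).sum()
alpha_hat = Y.mean() - beta_hat * X.mean()
Y_hat = alpha_hat + beta_hat * X
e = Y - Y_hat

TSS = (y ** 2).sum()                     # total variation
ESS = ((Y_hat - Y.mean()) ** 2).sum()    # explained variation
RSS = (e ** 2).sum()                     # unexplained variation
print(f"TSS = {TSS:.3f}, ESS + RSS = {ESS + RSS:.3f}")   # equal, per (2.47)
print(f"R^2 = {ESS / TSS:.4f}")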
2.6 Testing the Significance of OLS Parameters
To test the significance of the OLS parameter estimates, three procedures are commonly used:
i) Standard error test ii) Student's t-test iii) Confidence interval
All of these testing procedures lead to the same conclusion. Let us now see these testing
methods one by one.
i) Standard error test
This test helps us decide whether the estimates α̂ and β̂ are significantly different from zero,
i.e. whether the sample from which they have been estimated might have come from a
population whose true parameters are zero (α = 0 and/or β = 0).
Formally, we test the null hypothesis
H0: βi = 0 against the alternative hypothesis H1: βi ≠ 0
First: compute the standard errors of the estimates:
SE(β̂) = √var(β̂)
SE(α̂) = √var(α̂)
Second: compare the standard errors with the numerical values of α̂ and β̂.
Decision rule:
If SE(β̂i) > ½|β̂i|, accept the null hypothesis and reject the alternative hypothesis. We
conclude that β̂i is statistically insignificant.
If SE(β̂i) < ½|β̂i|, reject the null hypothesis and accept the alternative hypothesis. We
conclude that β̂i is statistically significant.
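In code, the decision rule amounts to a single comparison; the numbers below are hypothetical:

# Hypothetical estimate and its standard error
beta_hat, se_beta = 0.93, 0.12

if se_beta < 0.5 * abs(beta_hat):
    print("Reject H0: beta_hat is statistically significant")
else:
    print("Accept H0: beta_hat is statistically insignificant")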
ii) Student’s t-test
Like the standard error test, this test is also important to test the significance of the parameters.
From your statistics course, recall that any (normally distributed) estimate can be transformed
into a t-ratio by subtracting its hypothesized mean and dividing by its standard error.
To undertake the above test we follow these steps.
Step 1: Compute t*, the computed value of t, by taking the value of β in the null
hypothesis. In our case β = 0, so t* becomes:
t* = (β̂ − 0) / SE(β̂) = β̂ / SE(β̂)
Step 2: Choose a level of significance. The level of significance is the probability of making a
'wrong' decision, i.e. the probability of rejecting the hypothesis when it is actually true, or the
probability of committing a type I error. It is customary in econometric research to choose the
5% or the 1% level of significance. This means that in making our decision we allow (tolerate)
five times out of a hundred to be 'wrong', i.e. to reject the hypothesis when it is actually true.
Step 3: Check whether it is a one-tail or a two-tail test. If the inequality sign in the
alternative hypothesis is ≠, it implies a two-tail test: divide the chosen level of
significance by two to decide the critical region or critical value of t, called tc. But if the
inequality sign is either > or <, it indicates a one-tail test and there is no need to divide the
chosen level of significance by two to obtain the critical value from the t-table.
Step 4: Obtain the critical value of t, called tc, at α/2 and n − 2 degrees of freedom for a
two-tail test.
Step 5: Compare t* with tc: if |t*| > tc, reject H0 and conclude that the estimate is statistically
significant; otherwise, accept H0.
iii) Confidence interval
Rather than testing a point estimate directly, we may choose a probability in advance and refer to
it as the 'level of confidence'. In this respect we say that with a given probability the population
parameter will be within the defined confidence interval (confidence limits).
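Putting the steps together, here is a sketch of the t-test on β̂ (hypothetical data; SciPy supplies the critical value, and var(β̂) = σ̂²/Σx² is the standard simple-regression variance):

import numpy as np
from scipy import stats

X = np.array([10, 12, 15, 18, 20, 24, 27, 30], dtype=float)
Y = np.array([ 8, 10, 11, 14, 15, 17, 20, 21], dtype=float)
n = len(Y)

x, y = X - X.mean(), Y - Y.mean()
beta_hat = (x * y).sum() / (x ** 2).sum()
alpha_hat = Y.mean() - beta_hat * X.mean()
e = Y - alpha_hat - beta_hat * X

sigma2_hat = (e ** 2).sum() / (n - 2)            # equation (2.14)
se_beta = np.sqrt(sigma2_hat / (x ** 2).sum())   # standard error of beta_hat

t_star = beta_hat / se_beta                      # Step 1 (H0: beta = 0)
t_crit = stats.t.ppf(1 - 0.05 / 2, n - 2)        # Steps 2-4: 5%, two-tail, n-2 df
print(f"t* = {t_star:.3f}, t_c = {t_crit:.3f}")
print("Reject H0" if abs(t_star) > t_crit else "Accept H0")   # Step 5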
CHAPTER THREE
THE CLASSICAL REGRESSION ANALYSIS: Multiple Linear Regression
Model
3.1 Introduction
In simple regression we study the relationship between a dependent variable and a single
explanatory (independent) variable. But it is rarely the case that economic relationships involve
just two variables. Rather, a dependent variable Y can depend on a whole series of explanatory
variables or regressors. For instance, in demand studies we study the relationship between the
quantity demanded of a good and the price of the good, the prices of substitute goods and the
consumer's income. The model we assume is:
Yi = β0 + β1P1 + β2P2 + β3Xi + ui -------------------- (3.1)
where P1 is the price of the good, P2 is the price of substitute goods, Xi is the
consumer's income, the β's are unknown parameters and ui is the disturbance.
Equation (3.1) is a multiple regression with three explanatory variables. In general for K-
explanatory variable we can write the model as follows:
Yi = β0 + β1X1i + β2X2i + β3X3i + ......... + βkXki + ui ------- (3.2)
where βj (j = 0, 1, 2, ..., k) are unknown parameters and ui is the disturbance term. The disturbance
term is of similar nature to that in simple regression, reflecting:
- the basic random nature of human responses
- errors of aggregation
- errors of measurement
- errors in specification of the mathematical form of the model and any other (minor)
factors, other than the Xi's, that might influence Y.
We first state the assumptions of the multiple regression model, then proceed with our analysis
for the case of two explanatory variables, and finally generalize the multiple regression model to
the case of k explanatory variables.
3.2 Assumptions of Multiple Regression Model
In order to specify our multiple linear regression model and proceed with our analysis with
regard to this model, some assumptions are compulsory. These assumptions are the same as in
the single explanatory variable model developed earlier, except for the assumption of no perfect
multicollinearity. These assumptions are:
1. Randomness of the error term: The variable u is a real random variable.
2. Zero mean of the error term: E(ui) = 0
3. Homoscedasticity: The variance of each ui is the same for all the xi values,
i.e. E(ui²) = σu² (constant).
4. Normality of the error term: the values of each ui are normally distributed,
i.e. Ui ~ N(0, σ²).
5. No autocorrelation: the error terms of different observations are independent,
i.e. E(uiuj) = 0 for i ≠ j.
6. Independence of ui and the explanatory variables: i.e. E(uiX1i) = E(uiX2i) = 0.
This condition is automatically fulfilled if we assume that the values of the X's are a set
of fixed numbers in all (hypothetical) samples.
7. No perfect multicollinearity: The explanatory variables are not perfectly linearly
correlated.
We can't exhaustively list all the assumptions, but the above assumptions are some of the basic
assumptions in multiple regression analysis.
3.3 Partial-correlation coefficients
In order to remove the influence of X2 on Y, we regress Y on X2 and find the residual e1 = Y*.
To remove the influence of X2 on X1, we regress X1 on X2 and find the residual e2 = X1*. Y*
and X1* then represent the variations in Y and X1, respectively, left unexplained after removing
the influence of X2 from both Y and X1. Therefore, the partial correlation coefficient is merely
the simple correlation coefficient between the residuals Y* and X1* (that is, rYX1.X2 = rY*X1*).
Partial correlation coefficients range in value from −1 to +1 (just as in the case of simple
correlation coefficients).
For example, rYX1.X2 = −1 refers to the case where there is an exact or perfect negative
linear relationship between Y and X1 after removing the common influence of X2 from
both Y and X1.
However, rYX1.X2 = 1 indicates a perfect positive linear net relationship between Y and
X1.
And rYX1.X2 = 0 indicates no linear relationship between Y and X1 when the common
influence of X2 has been removed from both Y and X1. As a result, X1 can be omitted
from the regression.
The sign of a partial correlation coefficient is the same as that of the corresponding estimated
parameter. For example, for the estimated regression equation Ŷ = b0 + b1X1 + b2X2, rYX1.X2
has the same sign as b1 and rYX2.X1 has the same sign as b2. Partial correlation coefficients are
used in multiple regression analysis to determine the relative importance of each explanatory
variable in the model.
The independent variable with the highest partial correlation coefficient with respect to the
dependent variable contributes most to the explanatory power of the model and is entered first in
a stepwise multiple regression analysis. It should be noted, however, that partial correlation
coefficients give an ordinal, not a cardinal, measure of net correlation, and the sum of the partial
correlation coefficients between the dependent and all the independent variables in the model
need not add up to 1.
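The residual construction described above is easy to reproduce; the sketch below (simulated data) computes rYX1.X2 as the simple correlation between the two residual series:

import numpy as np

rng = np.random.default_rng(1)
X2 = rng.normal(0, 1, 100)
X1 = 0.6 * X2 + rng.normal(0, 1, 100)
Y = 1.0 + 2.0 * X1 + 1.5 * X2 + rng.normal(0, 1, 100)

def residuals(v, w):
    # Residuals from the simple regression of v on w (with intercept)
    b, a = np.polyfit(w, v, 1)
    return v - (a + b * w)

y_star = residuals(Y, X2)     # Y purged of the influence of X2
x1_star = residuals(X1, X2)   # X1 purged of the influence of X2

r_partial = np.corrcoef(y_star, x1_star)[0, 1]
print(f"r_YX1.X2 = {r_partial:.4f}")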
3.4 A Model with Two Explanatory Variables
In order to understand the nature of multiple regression model easily, we start our analysis with
the case of two explanatory variables, then extend this to the case of k-explanatory variables.
3.4.1 Estimation of parameters of two-explanatory variables model
The model: Y = β0 + β1X1 + β2X2 + Ui ……………………………………(3.3)
is a multiple regression with two explanatory variables. The expected value of the above model is
called the population regression equation, i.e.
E(Y) = β0 + β1X1 + β2X2, since E(Ui) = 0. …………………................(3.4)
where the βi are the population parameters; β0 is referred to as the intercept and β1 and β2 are
sometimes known as the regression slopes. Note that β2, for example, measures
the effect on E(Y) of a unit change in X2 when X1 is held constant. Since the population
regression equation is unknown to any investigator, it has to be estimated from sample data. Let
us suppose that the sample data has been used to estimate the population regression equation.
We leave the method of estimation unspecified for the present and merely assume that equation
(3.4) has been estimated by the sample regression equation, which we write as:
Ŷ = β̂0 + β̂1X1 + β̂2X2 ……………………………………………….(3.5)
Applying OLS as in the simple model, with lowercase letters again denoting deviations from the
means, the estimators are:
β̂1 = (Σx1y·Σx2² − Σx2y·Σx1x2) / (Σx1²·Σx2² − (Σx1x2)²)
β̂2 = (Σx2y·Σx1² − Σx1y·Σx1x2) / (Σx1²·Σx2² − (Σx1x2)²) ………………….……………………… (3.22)
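A direct evaluation of the formulas in (3.22) on simulated data looks as follows; the manual estimates can be checked against any regression package:

import numpy as np

rng = np.random.default_rng(2)
X1 = rng.uniform(0, 10, 60)
X2 = rng.uniform(0, 5, 60)
Y = 4.0 + 1.2 * X1 - 0.8 * X2 + rng.normal(0, 1, 60)

x1, x2, y = X1 - X1.mean(), X2 - X2.mean(), Y - Y.mean()
s11, s22, s12 = (x1 ** 2).sum(), (x2 ** 2).sum(), (x1 * x2).sum()
s1y, s2y = (x1 * y).sum(), (x2 * y).sum()

denom = s11 * s22 - s12 ** 2        # common denominator in (3.22)
beta1_hat = (s1y * s22 - s2y * s12) / denom
beta2_hat = (s2y * s11 - s1y * s12) / denom
beta0_hat = Y.mean() - beta1_hat * X1.mean() - beta2_hat * X2.mean()
print(f"b0 = {beta0_hat:.3f}, b1 = {beta1_hat:.3f}, b2 = {beta2_hat:.3f}")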
3.4.2 The coefficient of determination (R²): two explanatory variables case
As in simple regression, R² measures the proportion of the total variation in Y explained by the
model. If R² is high, there is a close association between the values of Yt and the values predicted
by the model, Ŷt. In this case, the model is said to "fit" the data well. If R² is low, there is no
association between the values of Yt and the values predicted by the model, Ŷt, and the model
does not fit the data well.
3.4.3 Adjusted Coefficient of Determination (R̄²)
One difficulty with R² is that it can be made large by adding more and more variables, even if
the variables added have no economic justification. Algebraically, it is a fact that as
variables are added the sum of squared errors (RSS) goes down (it can remain unchanged, but
this is rare) and thus R² goes up. If the model contains n − 1 variables, then R² = 1. The
manipulation of the model just to obtain a high R² is not wise. An alternative measure of goodness
of fit, called the adjusted R² and often symbolized as R̄², is usually reported by regression
programs. It is computed as:
R̄² = 1 − (Σei²/(n − k)) / (Σy²/(n − 1)) = 1 − (1 − R²)·(n − 1)/(n − k) --------------------------------(3.28)
This measure does not always go up when a variable is added, because of the degrees-of-freedom
term (n − k): as the number of variables k increases, RSS goes down, but so does n − k. The net
effect on R̄² depends on the amount by which RSS falls relative to the loss of degrees of freedom.
While solving one problem, this corrected measure of goodness of fit unfortunately introduces
another one: it loses its interpretation, since R̄² is no longer the percent of variation explained.
This modified R² is sometimes used and misused as a device for selecting the appropriate set of
explanatory variables.
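A short sketch of equation (3.28), refitting the two-variable model from the previous example (here k counts all estimated parameters, including the intercept):

import numpy as np

rng = np.random.default_rng(2)
X1 = rng.uniform(0, 10, 60)
X2 = rng.uniform(0, 5, 60)
Y = 4.0 + 1.2 * X1 - 0.8 * X2 + rng.normal(0, 1, 60)

Xmat = np.column_stack([np.ones_like(X1), X1, X2])   # design matrix with intercept
beta_hat, *_ = np.linalg.lstsq(Xmat, Y, rcond=None)
e = Y - Xmat @ beta_hat

n, k = len(Y), Xmat.shape[1]
R2 = 1 - (e ** 2).sum() / ((Y - Y.mean()) ** 2).sum()
R2_adj = 1 - (1 - R2) * (n - 1) / (n - k)            # equation (3.28)
print(f"R^2 = {R2:.4f}, adjusted R^2 = {R2_adj:.4f}")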
3.5 Statistical Properties of the Parameters and Gauss-Markov Theorem
We have seen in simple linear regression that the OLS estimators (α̂ and β̂) satisfy the small
sample properties of an estimator, i.e. the BLUE property. In multiple regression, the OLS
estimators also satisfy the BLUE property. We now proceed to examine these desired properties:
1. Linearity
We know that: β̂ = (X′X)⁻¹X′Y
Let C = (X′X)⁻¹X′; then
β̂ = CY …………………………………………….(3.33)
i.e. β̂ is a linear function of Y.
2. Unbiasedness
β̂ = (X′X)⁻¹X′Y
β̂ = (X′X)⁻¹X′(Xβ + U) = β + (X′X)⁻¹X′U
so that E(β̂) = β, since E(U) = 0 and the X's are fixed.
3. Minimum variance
Before showing that all the OLS estimators are best (possess the minimum variance property), it
is important to derive their variances.
3.6. Hypothesis Testing in Multiple Regression Model
In multiple regression models we will undertake two tests of significance. One is significance of
individual parameters of the model. This test of significance is the same as the tests discussed in
simple regression model. The second test is overall significance of the model.
3.6.1. Tests of individual significance
If we invoke the assumption that Ui ~ N(0, σ²), then we can use either the t-test or the standard
error test to test a hypothesis about any individual partial regression coefficient. To illustrate
consider the following example.
Let Y = β̂0 + β̂1X1 + β̂2X2 + ei ………………………………… (3.51)
A. H0: β1 = 0
H1: β1 ≠ 0
B. H0: β2 = 0
H1: β2 ≠ 0
The null hypothesis (A) states that, holding X2 constant, X1 has no (linear) influence on Y.
Similarly, hypothesis (B) states that, holding X1 constant, X2 has no influence on the dependent
variable Yi. To test these null hypotheses we will use the following tests:
i- Standard error test: under this and the following testing methods we test only for
β̂1; the test for β̂2 is done in the same way.
SE(β̂1) = √var(β̂1) = √( σ̂²·Σx2i² / (Σx1i²·Σx2i² − (Σx1ix2i)²) ); where σ̂² = Σei² / (n − 3)
If SE(β̂1) > ½|β̂1|, we accept the null hypothesis, that is, we conclude that the
estimate β̂1 is not statistically significant.
If SE(β̂1) < ½|β̂1|, we reject the null hypothesis, that is, we conclude that the
estimate β̂1 is statistically significant.
ii- The student's t-test: we compute
t* = β̂i / SE(β̂i) ~ t(n − k), where n is the number of observations and k is the number of
parameters. If we have 3 parameters, the degrees of freedom will be n − 3. So,
t* = (β̂2 − β2) / SE(β̂2), with n − 3 degrees of freedom;
under H0: β2 = 0 this becomes
t* = β̂2 / SE(β̂2)
If t* < t (tabulated), we accept the null hypothesis, i.e. we conclude that β̂2 is not
significant and hence the regressor does not appear to contribute to the explanation of
the variations in Y.
If t* > t (tabulated), we reject the null hypothesis and accept the alternative one:
β̂2 is statistically significant. Thus, the greater the value of t*, the stronger the
evidence that β̂i is statistically significant.
3.6.2 Test of Overall Significance
Throughout the previous section we were concerned with testing the significance of the
estimated partial regression coefficients individually, i.e. under the separate hypothesis that each
of the true population partial regression coefficients was zero.
In this section we extend this idea to joint test of the relevance of all the included explanatory
variables. Now consider the following:
Y = β0 + β1X1 + β2X2 + ......... + βkXk + Ui
H0: β1 = β2 = β3 = .......... = βk = 0
This null hypothesis is a joint hypothesis that β1, β2, ..., βk are jointly or simultaneously equal
to zero. A test of such a hypothesis is called a test of the overall significance of the observed or
estimated regression line, that is, of whether Y is linearly related to X1, X2, ..., Xk.
Can the joint hypothesis be tested by testing the significance of the β̂i's individually, as
above? The answer is no, and the reasoning is as follows.
In testing the individual significance of an observed partial regression coefficient, we assumed
implicitly that each test of significance was based on a different (i.e. independent) sample. Thus,
in testing the significance of β̂2 under the hypothesis that β2 = 0, it was assumed tacitly that
the testing was based on a different sample from the one used in testing the significance of
β̂3 under the null hypothesis that β3 = 0. But in testing the joint hypothesis above, we would
be violating the assumption underlying the test procedure. Testing a series of single (individual)
hypotheses is not equivalent to testing those same hypotheses jointly. The intuitive reason for
this is that in a joint test of several hypotheses any single hypothesis is affected by the
information in the other hypotheses.
The test procedure for any set of hypothesis can be based on a comparison of the sum of squared
errors from the original, the unrestricted multiple regression model to the sum of squared errors
from a regression model in which the null hypothesis is assumed to be true. When a null
hypothesis is assumed to be true, we in effect place conditions or constraints, on the values that
the parameters can take, and the sum of squared errors increases. The idea of the test is that if
these sums of squared errors are substantially different, then the assumption that the joint null
hypothesis is true has significantly reduced the ability of the model to fit the data, and the data do
not support the null hypothesis.
If the null hypothesis is true, we expect that the data are compatible with the conditions placed
on the parameters. Thus, there would be little change in the sum of squared errors when the null
hypothesis is assumed to be true.
Let the Restricted Residual Sum of Squares (RRSS) be the sum of squared errors in the model
obtained by assuming that the null hypothesis is true, and let URSS be the sum of squared errors
of the original unrestricted model, i.e. the unrestricted residual sum of squares (URSS). It is
always true that RRSS − URSS ≥ 0.
Consider the unrestricted model Yi = β̂0 + β̂1X1 + β̂2X2 + ......... + β̂kXk + ei.
The test of the joint hypothesis is:
H0: β1 = β2 = β3 = .......... = βk = 0
From the unrestricted model,
ei = Yi − Ŷi
Σei² = Σ(Yi − Ŷi)²
This sum of squared errors is called the unrestricted residual sum of squares (URSS). This is the
case when the null hypothesis is not true. If the null hypothesis is assumed to be true, i.e. when
all the slope coefficients are zero, the model reduces to
Y = β̂0 + ei
β̂0 = ΣYi / n = Ȳ (applying OLS) …………………………….(3.52)
e = Y − β̂0, but β̂0 = Ȳ
e = Y − Ȳ
so that RRSS = Σ(Y − Ȳ)² = TSS. The test statistic is then
F = [(TSS − RSS)/(k − 1)] / [RSS/(n − k)]
which is equivalent to
F = [ESS/(k − 1)] / [RSS/(n − k)] ………………………………………………. (3.54)
If we divide the above numerator and denominator by Σy² = TSS, then:
F = [(ESS/TSS)/(k − 1)] / [(RSS/TSS)/(n − k)]
F = [R²/(k − 1)] / [(1 − R²)/(n − k)] …………………………………………..(3.55)
This implies that the computed value of F can be calculated either from ESS and RSS or from R²
and 1 − R². If the null hypothesis is not true, then the difference between RRSS and URSS (TSS
and RSS) becomes large, implying that the constraints placed on the model by the null hypothesis
have a large effect on the ability of the model to fit the data, and the value of F tends to be large.
If the computed value of F is greater than the critical value F(k − 1, n − k), then the parameters of
the model are jointly significant, i.e. the dependent variable Y is linearly related to the independent
variables included in the model.
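A sketch of the overall F-test based on equation (3.55); the R², n and k values are hypothetical, and SciPy supplies the critical value:

from scipy import stats

R2, n, k = 0.85, 30, 3                           # hypothetical regression summary

F = (R2 / (k - 1)) / ((1 - R2) / (n - k))        # equation (3.55)
F_crit = stats.f.ppf(0.95, k - 1, n - k)         # 5% critical value
print(f"F = {F:.2f}, F_crit = {F_crit:.2f}")
print("Jointly significant" if F > F_crit else "Not jointly significant")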
3.7 Predictions using Multiple Linear Regression
Suppose that you estimated the demand for meat (q1), which depends on the price of meat (p1)
and the income of the consumer (y). If the critical value is F0.05, 2, 27 = 3.35, examine whether
the result is valid or not.
[Regression output table from a statistical package, omitted here: an ANOVA panel reporting SS,
df and MS for Model, Residual and Total, and a coefficient panel reporting coef., std. err., t,
P>|t| and the 95% confidence interval for p1, y and _cons.]
Note: SS = sum of squares, df = degrees of freedom, MS = mean sum of squares, q1 = quantity
demanded of meat, p1 = price of meat, y = income of the consumer. "_cons" refers to the constant
term, "coef." refers to the estimated coefficients, "std. err." refers to the standard error, t is the
t-statistic, P>|t| refers to the probability of rejecting H0, and 95% is the confidence interval for
the population parameter.
The interpretation of the results in the above table:
Model refers to the variation in the dependent variable due to the independent variables. In this
table, SS refers to the sum of squares, and the value 105297.059 is the explained sum of squares
(SSE); MS refers to the mean sum of squares (MSE):
MSE = SSE/k = 105297.059/2 = 52648.5295, where k = number of independent variables.
In addition, Residual is the estimated error. It is the source of variation in the dependent variable
which is due to the error. Under the column SS, the value 51734.3635 is the residual sum of
squares (SSR), and the value under column MS, 1916.08754, is the mean residual sum of squares,
which is computed as follows:
MSR = SSR/(n − k − 1) = 51734.3635/27 = 1916.08754, where n = the number of observations,
k = the number of independent variables, and n − k − 1 determines the degrees of freedom in the
residual sum of squares.
Moreover, in the above result, Total refers to the total variation in the dependent variable, which
is decomposed into SSE and SSR. Therefore,
SST = SSE + SSR
MST = SST/(n − 1)
df(total) = k (df for SSE) + (n − k − 1) (df for SSR) = n − 1 = 30 − 1 = 29
The first panel of the above table reports the results used to examine the overall
significance of the regression coefficients (the F-test).
To conduct the test of overall significance of the coefficients:
1. State the hypothesis:
H0: β1 = β2 = 0
H1: H0 is not true, or at least one coefficient is not zero
38
2. Compute the F-statistic: F = MSE/MSR,
where df1 = degrees of freedom for SSE and df2 = degrees of freedom for SSR.
3. Choose the level of significance (here 5%).
4. Find the critical value Fα(df1, df2) from the F-table.
5. Decision rule:
Reject H0 if F > Fα(df1, df2) and conclude that the overall coefficients are significant, meaning
that at least one coefficient is not zero.
In our case, F = 27.48 > 3.35, so we reject the null hypothesis.
The lower panel of the above table presents the estimated result of the demand for meat. So, how
do we examine whether p1 and y affect q1? We need to conduct hypothesis tests about the
significance of each of the coefficients of p1 and y.
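The ANOVA figures quoted above can be reproduced in a few lines from the reported sums of squares (SSE, SSR), n = 30 and k = 2:

# Values reported in the module's table
SSE, SSR, n, k = 105297.059, 51734.3635, 30, 2

MSE = SSE / k              # mean explained sum of squares
MSR = SSR / (n - k - 1)    # mean residual sum of squares
F = MSE / MSR
print(f"MSE = {MSE:.4f}, MSR = {MSR:.4f}, F = {F:.2f}")   # F is about 27.48
print("Reject H0" if F > 3.35 else "Accept H0")           # F_0.05,2,27 = 3.35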
CHAPTER FOUR:
VIOLATIONS OF CLASSICAL ASSUMPTIONS
Recall that in the classical model we have assumed
a) Zero mean of the random term
b) Constant variance of the error term (i.e., the assumption of homoscedasticity)
c) No autocorrelation of the error term
d) Normality of the error term
e) No multicollinearity among the explanatory variables.
4.1 The Assumption of Zero Expected Disturbances
This assumption is imposed by the stochastic nature of economic relationships, which would
otherwise be impossible to estimate with the common rules of mathematics. The assumption
implies that the observations of Y and X must be scattered around the line in a random way (and
hence the estimated line Ŷ = β̂0 + β̂1X is a good approximation of the true line). This defines the
relationship connecting Y and X 'on the average'. The alternative possible assumptions are either
E(U) > 0 or E(U) < 0. Assume that for some reason the U's did not have an average value of zero,
but most of them tended to be positive. This would imply that the observations of Y and X would
lie above the true line.
It can be shown that by using these observations we would get a bad estimate of the true line. If
the true line lies below or above the observations, the estimated line would be biased.
Note that there is no test for the verification of this assumption, because the assumption
E(U) = 0 is forced upon us if we are to establish the true relationship; i.e., we set E(U) = 0 at
the outset of our estimation procedure. Its plausibility should be examined in each particular case
on a priori grounds. In any econometric application we must make sure that the following conditions
are fulfilled so as to be safe from violating the assumption E(U) = 0:
i) All the important variables have been included in the function.
ii) There are no systematically positive or systematically negative errors of measurement in the
dependent variable.
4.2 The Nature of Heteroscedasticity
The assumption of homoscedasticity (or constant variance) about the random variable U is that its
probability distribution remains the same over all observations of X, and in particular that the
variance of each $U_i$ is the same for all values of the explanatory variable. Symbolically,

$Var(U_i) = E[U_i - E(U_i)]^2 = E(U_i^2) = \sigma_u^2$, a constant.

If the above is not satisfied in any particular case, we say that the U's are heteroscedastic:
$Var(U_i) = \sigma_{u_i}^2$, not constant. The meaning of homoscedasticity is that the variation of
each $U_i$ around its zero mean does not depend on the value of X, that is,
$\sigma_{u_i}^2 \neq f(X_i)$. If $\sigma_u^2$ is not constant but its value depends on X, we may
write $\sigma_{u_i}^2 = f(X_i)$. Heteroscedasticity can take various forms; for example, the
variance of $U_i$ may decrease as X increases.
Furthermore, suppose we have a cross-section sample of family budgets from which we want to
measure the savings function, Saving = f(Income). In this case the assumption of constant variance
of the U's is not appropriate, because high-income families show a much greater variability in
their saving behavior than low-income families do. Families with high income tend to stick to a
certain standard of living, and when their income falls they cut down their savings rather than
their consumption expenditure. This is not the case in low-income families. Hence, the variance of
the $U_i$'s increases as income increases.
Note, however, that heteroscedasticity is primarily a problem of cross-sectional data rather than
time-series data. That is, the problem is more serious in cross-section data.
Causes of Heteroscedasticity
Heteroscedasticity can arise for several reasons. The first is the presence of
outliers (i.e., extreme values compared to the majority of a variable). The inclusion or exclusion
of such an observation, especially if the sample size is small, can substantially alter the results of
regression analysis. With outliers it would be hard to maintain the assumption of
homoscedasticity.
Another source of heteroscedasticity arises from violating the assumption that the regression
model is correctly specified. Very often what looks like heteroscedasticity may be due to the fact
that some important variables are omitted from the model. In such situation the residuals
obtained from the regression may give the distinct impression that the error variance may not be
constant. But if the omitted variables are included in the model, the impression may disappear.
Consequences of Heteroscedasticity
If the assumption of homoscedastic disturbance is not fulfilled we have the following
consequences:
i) If U is heteroscedastic, the OLS estimates do not have the minimum variance property in
the class of unbiased estimators; that is, they are inefficient in small samples, and they
remain inefficient in large samples.
ii) The coefficient estimates would still be statistically unbiased; that is, the expected value of
each estimate equals the true parameter value.
iii) The prediction (of Y for a given value of X) would be inefficient because of its high variance.
This is because the variance of the prediction includes the variances of U and of the
parameter estimates, which are not minimal due to the incidence of heteroscedasticity.
In any case, how does one detect whether the problem really exists? One common device is a formal
test such as the Breusch–Pagan test, sketched below.
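The Breusch–Pagan test regresses a function of the squared OLS residuals on the regressors and rejects homoscedasticity when they explain too much of the residual variation. A minimal sketch in Python with hypothetical saving–income data (the variable names and data-generating process are assumptions for illustration only):

```python
# Illustrative Breusch-Pagan test for heteroscedasticity (hypothetical data).
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(0)
income = rng.uniform(1, 10, 200)
saving = 0.5 + 0.2 * income + rng.normal(0, 0.3 * income)  # error spread grows with income

X = sm.add_constant(income)
resid = sm.OLS(saving, X).fit().resid

lm_stat, lm_pval, f_stat, f_pval = het_breuschpagan(resid, X)
print(f"LM p-value = {lm_pval:.4f}")  # a small p-value rejects homoscedasticity
```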
Remedial measures for the problems of heteroscedasticity
As we have seen, heteroscedasticity does not destroy the unbiasedness and consistency properties
of the OLS estimators, but they are no longer efficient.
This lack of efficiency makes the usual hypothesis-testing procedure of dubious value.
If we apply OLS to a heteroscedastic model, the result will be inefficient parameter estimates,
since $var(u_i)$ is not constant.
The remedial measure is to transform the model so that the transformed model satisfies all the
assumptions of the classical regression model, including homoscedasticity.
Applying OLS to the transformed variables is known as the method of Generalized Least
Squares (GLS).
In short GLS is OLS on the transformed variables that satisfy the standard least squares
assumptions.
The estimators thus obtained are known as GLS estimators, and it is these estimators that
are BLUE.
Assume that our original model is

$Y_i = \beta_0 + \beta_1 X_i + U_i$,

where $u_i$ satisfies all the classical assumptions except homoscedasticity:

$E(u_i)^2 = \sigma_i^2 = f(k_i)$.

If we apply OLS to the above model, the estimators are no longer BLUE. Let us consider
heteroscedastic structures under two conditions: when the population variance $\sigma_i^2$ is
known and when $\sigma_i^2$ is not known.
When $\sigma_i^2$ is known, the transformation is obtained by dividing the above model through by
$\sigma_i$, so that the variance of the transformed error term is constant:

$\frac{Y_i}{\sigma_i} = \beta_0 \frac{1}{\sigma_i} + \beta_1 \frac{X_i}{\sigma_i} + \frac{U_i}{\sigma_i}$   (3.19)

The variance of the transformed error term is constant, i.e.

$var\left(\frac{u_i}{\sigma_i}\right) = \frac{1}{\sigma_i^2} E(u_i)^2 = \frac{1}{\sigma_i^2} \sigma_i^2 = 1$, a constant.
Letting $w_i = \frac{1}{\sigma_i^2}$, the method of GLS (WLS) minimizes the weighted residual sum
of squares:

$\sum w_i \hat{u}_i^2 = \sum w_i (Y_i - \hat{\alpha} - \hat{\beta} X_i)^2$

1. $\frac{\partial \sum w_i \hat{u}_i^2}{\partial \hat{\alpha}} = -2 \sum w_i (Y_i - \hat{\alpha} - \hat{\beta} X_i) = 0$

$\Rightarrow \sum w_i Y_i = \hat{\alpha} \sum w_i + \hat{\beta} \sum w_i X_i$

$\Rightarrow \hat{\alpha} = Y^* - \hat{\beta} X^*$, with $Y^* = \frac{\sum w_i Y_i}{\sum w_i}$ and $X^* = \frac{\sum w_i X_i}{\sum w_i}$,

where $Y^*$ and $X^*$ are the weighted means, which are different from the ordinary means we
discussed in 2.1 and 2.2.

2. $\frac{\partial \sum w_i \hat{u}_i^2}{\partial \hat{\beta}} = -2 \sum w_i (Y_i - \hat{\alpha} - \hat{\beta} X_i) X_i = 0$

$\Rightarrow \sum w_i (Y_i X_i - \hat{\alpha} X_i - \hat{\beta} X_i^2) = 0$

$\Rightarrow \hat{\beta} = \frac{\sum w_i Y_i X_i - Y^* X^* \sum w_i}{\sum w_i X_i^2 - X^{*2} \sum w_i} = \frac{\sum w_i x^* y^*}{\sum w_i x^{*2}}$

where $x^* = X_i - X^*$ and $y^* = Y_i - Y^*$ are weighted deviations.
These parameters are now BLUE.
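In practice this weighted estimation can be delegated to a statistical package. A minimal sketch in Python's statsmodels, under the assumption that the variance structure is known up to scale (here $\sigma_i^2 \propto X_i^2$, so $w_i = 1/X_i^2$; the data are hypothetical):

```python
# Weighted least squares (GLS with a known variance structure), as a sketch.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
X = rng.uniform(1, 10, 100)
Y = 2.0 + 1.5 * X + rng.normal(0, 0.5 * X)   # error sd proportional to X

exog = sm.add_constant(X)
wls = sm.WLS(Y, exog, weights=1.0 / X**2).fit()  # weights w_i = 1/sigma_i^2 up to scale
ols = sm.OLS(Y, exog).fit()

print("WLS:", wls.params, wls.bse)   # typically smaller standard errors ...
print("OLS:", ols.params, ols.bse)   # ... than unweighted OLS on the same data
```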
Now let us assume that $E(u_i)^2 = \sigma_i^2 = K^2 X_i^2 = f(X_i)$, i.e. the error variance is
proportional to the square of the explanatory variable. The transformed version of the model is
obtained by dividing through by $X_i$:

$\frac{Y_i}{X_i} = \beta_0 \frac{1}{X_i} + \beta_1 + \frac{U_i}{X_i}$

The variance of the new error term is

$var\left(\frac{u_i}{X_i}\right) = \frac{1}{X_i^2} E(u_i^2) = \frac{1}{X_i^2} K^2 X_i^2 = K^2$, a constant,

which proves that the new random term in the model has a finite constant variance ($K^2$).
We can, therefore, apply OLS to the transformed version of the model
$\frac{Y_i}{X_i} = \beta_0 \frac{1}{X_i} + \beta_1 + \frac{U_i}{X_i}$.
Note that in this transformation the position of the coefficients has changed: the parameter of
the variable $\frac{1}{X_i}$ in the transformed model is the constant intercept of the original
model, while the constant term of the transformed model is the parameter of the explanatory
variable X in the original model.
Therefore, to get back to the original model, we shall have to multiply the estimated regression
through by $X_i$.
Next assume that $E(u_i^2) = \sigma_i^2 = k^2 X_i$. The transformed model is obtained by dividing
through by $\sqrt{X_i}$:

$\frac{Y_i}{\sqrt{X_i}} = \beta_0 \frac{1}{\sqrt{X_i}} + \beta_1 \sqrt{X_i} + \frac{U_i}{\sqrt{X_i}}$

The variance of the transformed error term is

$var\left(\frac{U_i}{\sqrt{X_i}}\right) = \frac{1}{X_i} E(U_i^2) = \frac{1}{X_i} k^2 X_i = k^2$, a constant.

Since the transformed model has no constant term, one will have to use the 'regression through the
origin' model to estimate $\beta_0$ and $\beta_1$.
In this case, therefore, to get back to the original model, we shall have to multiply the
estimated regression through by $\sqrt{X_i}$.
Finally, assume that $E(u_i^2) = \sigma_i^2 = k^2 [E(Y_i)]^2$. The transformed model is

$\frac{Y_i}{E(Y_i)} = \beta_0 \frac{1}{E(Y_i)} + \beta_1 \frac{X_i}{E(Y_i)} + \frac{U_i}{E(Y_i)}$   (i)

with

$var\left(\frac{U_i}{E(Y_i)}\right) = \frac{1}{[E(Y_i)]^2} E(u_i^2) = \frac{1}{[E(Y_i)]^2} k^2 [E(Y_i)]^2 = k^2$, a constant.

The transformed model described in (i) above is, however, not operational in this case, because
$E(Y_i)$ is unknown. But since we can obtain $\hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i$, the
transformation can be made through the following two steps.
1st: we run the usual OLS regression disregarding the heteroscedasticity problem in the data and
obtain $\hat{Y}_i$.
2nd: using the estimated $\hat{Y}_i$, we transform the model as follows:

$\frac{Y_i}{\hat{Y}_i} = \beta_0 \frac{1}{\hat{Y}_i} + \beta_1 \frac{X_i}{\hat{Y}_i} + \frac{U_i}{\hat{Y}_i}$
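A minimal sketch of this two-step procedure in Python (the data are hypothetical; dividing the model through by $\hat{Y}_i$ is equivalent to weighting by $1/\hat{Y}_i^2$):

```python
# Two-step transformation when sigma_i^2 is proportional to [E(Y_i)]^2.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
X = rng.uniform(1, 10, 200)
Y = (2.0 + 1.5 * X) * (1 + rng.normal(0, 0.2, 200))  # error sd proportional to E(Y)

exog = sm.add_constant(X)
yhat = sm.OLS(Y, exog).fit().fittedvalues             # step 1: plain OLS, obtain Y-hat
step2 = sm.WLS(Y, exog, weights=1.0 / yhat**2).fit()  # step 2: reweight by 1/Y-hat^2
print(step2.params)
```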
It should therefore be clear that, in order to adopt the necessary corrective measure (which is
the transformation of the original data in such a way as to obtain a form in which the transformed
disturbance term possesses constant variance), we must have information on the form of
heteroscedasticity.
Also, since the transformed data no longer possess heteroscedasticity, it can be shown that the
estimates of the transformed model are more efficient (i.e., they possess smaller variance) than
the estimates obtained from applying OLS to the original data.
Let's assume that a test reveals that the original data possess heteroscedasticity, and that
heteroscedasticity of the form $\sigma_i^2 = K^2 X_i^2$ is being assumed:

$Y_i = \beta_0 + \beta_1 X_i + U_i$,  $E(U_i^2) = \sigma_i^2 = K^2 X_i^2$.

Applying OLS to this original model gives

$\hat{\beta}_1 = \beta_1 + \frac{\sum x_i u_i}{\sum x_i^2}$

$var(\hat{\beta}_1) = E(\hat{\beta}_1 - \beta_1)^2 = \frac{E(\sum x_i u_i)^2}{(\sum x_i^2)^2} = \frac{\sum x_i^2 E(u_i^2) + 2\sum_{i \neq j} x_i x_j E(u_i u_j)}{(\sum x_i^2)^2} = \frac{\sum x_i^2 \sigma_i^2}{(\sum x_i^2)^2}$,

since $E(u_i u_j) = 0$ for $i \neq j$.
On transforming the original model (dividing through by $X_i$) we obtain

$\frac{Y_i}{X_i} = \beta_1 + \beta_0 \frac{1}{X_i} + \frac{U_i}{X_i}$,

whose error term is homoscedastic with variance $K^2$. In the transformed model $\beta_1$ is
estimated as the constant term,

$\hat{\beta}_1 = \overline{\left(\frac{Y}{X}\right)} - \hat{\beta}_0 \overline{\left(\frac{1}{X}\right)}$,

with

$var(\hat{\beta}_1) = \frac{K^2 \sum (1/X_i)^2}{n\left[\sum (1/X_i)^2 - n\,\overline{(1/X)}^2\right]}$.

A similar comparison can be made for $\hat{\beta}_0$, whose OLS variance in the original model is
$var(\hat{\beta}_0) = \frac{\sigma_u^2 \sum X_i^2}{n \sum x_i^2}$; in both cases the estimates from
the transformed model can be shown to possess the smaller variance, confirming the efficiency gain
from the transformation.
4.3 The Nature of Autocorrelation
An important assumption of the classical linear model is that there is no autocorrelation or serial
correlation among the disturbances $U_i$ entering into the population regression function. This
assumption implies that the covariance of $U_i$ and $U_j$ is equal to zero. That is:

$Cov(U_i, U_j) = E\{[U_i - E(U_i)][U_j - E(U_j)]\} = E(U_i U_j) = 0$ (for $i \neq j$)
But if this assumption is violated, the disturbances are said to be autocorrelated.
This could arise for several reasons.
i) Spatial autocorrelation: In regional cross-section data, a random shock affecting economic
activity in one region may cause economic activity in an adjacent region to change because
of close economic ties between the regions. Shocks due to weather similarities might also
tend to cause the error terms between adjacent regions to be related.
ii) Prolonged influence of shocks: In time-series data, random shocks (disturbances) have
effects that often persist over more than one time period. An earthquake, flood, strike or
war, for example, will probably affect the economy's operation in subsequent periods as well.
iii) Inertia: past actions often have a strong effect on current actions, so that a positive
disturbance in one period is likely to influence activity in succeeding periods.
iv) Data manipulation: published data often undergo interpolation or smoothing, procedures that
average true disturbances over successive time periods.
v) Misspecification: An omitted relevant independent variable that is autocorrelated will make
the disturbance (associated with the misspecified model) autocorrelated. An incorrect
functional form or a misspecification of the equation's dynamics could do the same. In
these instances the appropriate procedure is to correct the misspecification.
Note that autocorrelation is a special case of correlation. Autocorrelation refers to the
relationship not between two (or more) different variables, but between the successive values of
the same variable (in this section we are particularly interested in the autocorrelation of the
U's). Moreover, note that the terms autocorrelation and serial correlation are treated
synonymously.
Consequences of Autocorrelation
When the disturbance term exhibits serial correlation the values as well as the standard errors of
the parameter estimates are affected.
i) If the disturbances are correlated, the previous value of the disturbance has some information
to convey about the current disturbance. If this information is ignored, it is clear that the
sample data are not being used with maximum efficiency. However, the estimates of the parameters
are not statistically biased even when the residuals are serially correlated; that is, the OLS
parameter estimates are statistically unbiased in the sense that their expected value is equal
to the true parameter.
ii) The variance of the random term U may be seriously underestimated. In particular, the
underestimation of the variance of U will be more serious in the case of positive autocorrelation
of the error term ($U_t$). With positive first-order autocorrelated errors, fitting an OLS line
can give an estimate quite wide of the mark. The high variation in these estimates will cause the
variance of the OLS estimates to be greater than it would have been had the errors been
distributed randomly.
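Serial correlation in the residuals is commonly screened with the Durbin–Watson statistic, which is near 2 when the residuals are uncorrelated. A minimal sketch in Python with hypothetical AR(1) disturbances:

```python
# Durbin-Watson screening for first-order autocorrelation (hypothetical data).
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(3)
T = 120
x = rng.normal(size=T)
u = np.zeros(T)
for t in range(1, T):                  # AR(1) disturbances: u_t = 0.7*u_{t-1} + e_t
    u[t] = 0.7 * u[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + u

resid = sm.OLS(y, sm.add_constant(x)).fit().resid
print(durbin_watson(resid))            # values well below 2 suggest positive autocorrelation
```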
4.4 Multicollinearity
One of the assumptions of the classical linear regression model (CLRM) is that there is no
perfect multicollinearity among the regressors included in the regression model. Note that
although the assumption is said to be violated only in the case of exact multicollinearity (i.e., an
exact linear relationship among some of the regressors), the presence of multicollinearity (an
approximate linear relationship among some of the regressors) leads to estimating problems
important enough to warrant treating it as a violation of the classical linear regression model.
Multicollinearity does not depend on any theoretical or actual linear relationship among any of
the regressors; it depends on the existence of an approximate linear relationship in the data set at
hand. Unlike most other estimating problems, this problem is caused by the particular sample
available. Multicollinearity in the data could arise for several reasons. For example, the
independent variables may all share a common time trend, one independent variable might be the
lagged value of another that follows a trend, some independent variables may have varied
together because the data were not collected from a wide enough base, or there could in fact exist
an approximate linear relationship among the variables themselves.
Note that the existence of multicollinearity will seriously affect the parameter estimates.
Intuitively, when any two explanatory variables are changing in nearly the same way, it becomes
extremely difficult to establish the influence of each regressor on the dependent variable
separately. That is, if two explanatory variables change by the same proportion, the influence of
one of them on the dependent variable may be erroneously attributed to the other. Their effects
cannot be sensibly investigated, due to the high intercorrelation.
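One common way to gauge the severity of such intercorrelation is the variance inflation factor (VIF). A minimal sketch in Python with hypothetical, nearly collinear regressors:

```python
# Variance inflation factors as a multicollinearity diagnostic (hypothetical data).
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(4)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.1, size=100)      # x2 is nearly a copy of x1
X = sm.add_constant(np.column_stack([x1, x2]))

for i, name in enumerate(["const", "x1", "x2"]):
    print(name, variance_inflation_factor(X, i))  # VIF > 10 is a common warning sign
```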
Remedial measures
It is more difficult to deal with models exhibiting multicollinearity than to detect the problem
in the first place. Different remedial measures have been suggested by econometricians, depending
on the severity of the problem, the availability of other sources of data, and the importance of
the variables that are found to be multicollinear in the model.
Some suggest that a minor degree of multicollinearity can be tolerated, although one should be a
bit careful while interpreting the model under such conditions. Others suggest removing the
variables that show multicollinearity if they are not important in the model; but by doing so, the
desired characteristics of the model may be affected. The following corrective procedures have
been suggested for cases where the problem of multicollinearity is serious.
1. Increase the size of the sample: multicollinearity may be avoided or reduced if the size of the
sample is increased, since the covariances of the estimators are inversely related to the sample
size. But we should remember that this will help only when the intercorrelation happens to exist
in the sample but not in the population of the variables. If the variables are collinear in the
population, increasing the sample size will not help to reduce multicollinearity.
3. Use extraneous information: extraneous information is information obtained from any source
outside the sample that is being used for the estimation. Extraneous information may be available
from economic theory or from empirical studies already conducted in the field in which we are
interested. There are three methods through which extraneous information can be utilized to deal
with the problem of multicollinearity.
4.5 Specification Errors
In developing an empirical model, one is likely to commit one or more of the following
specification errors:
1. Omission of a relevant variable(s)
2. Inclusion of an unnecessary variable(s)
3. Adopting the wrong functional form
4. Errors of measurement
5. Incorrect specification of the stochastic error term
If, for example, a relevant variable is omitted from the model, the consequences include the
following:
4. The conventionally measured variance of $\hat{\alpha}_2$ ($= \sigma^2/\sum x_i^2$) is a biased
estimator of the variance of the true estimator $\hat{\beta}_2$.
5. In consequence, the usual confidence-interval and hypothesis-testing procedures are likely to
give misleading conclusions about the statistical significance of the estimated parameters.
6. As another consequence, forecasts based on the incorrect model and the forecast (confidence)
intervals will be unreliable.
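The bias from omitting a relevant variable can be seen concretely in a small simulation. A minimal sketch in Python (the data-generating process is a hypothetical example):

```python
# Simulation: omitting a relevant, correlated regressor biases the coefficient kept.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 5000
x2 = rng.normal(size=n)
x1 = 0.8 * x2 + rng.normal(size=n)   # x1 is correlated with the soon-to-be-omitted x2
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

full = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
short = sm.OLS(y, sm.add_constant(x1)).fit()   # misspecified: x2 omitted

print(full.params[1])    # close to the true value 2.0
print(short.params[1])   # biased upward: part of x2's effect is attributed to x1
```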