CHAPTER ONE
INTRODUCTION
Note that β1 and β2 are the parameters of interest that we need to estimate. However, not all points obey such a perfect or exact relationship. Instead, there exists an inexact or stochastic relationship among most economic variables. Thus, the relevant econometric model is Y = β1 + β2X + u.
There are reasons for including an error term (disturbance term), u, in a model. Owing to the inexact nature of the relationship between X and Y, arising from individual variations, not all points will lie on the regression line.
Once the data are available, we can proceed to estimate β1 and β2. For example, we can estimate the econometric model and get, say, β̂1 = −30 and β̂2 = 0.5. Thus the estimated model becomes Ŷt = −30 + 0.5Xt, where Ŷ is read "Y hat". The hat (^) on Yt means that it is an estimated value, not the actual value. Recall that β̂2 = 0.5 is called the marginal propensity to consume (MPC). Regression analysis is the main statistical technique we use to estimate β1 and β2, since it gives us the line of best fit.
The Keynesian consumption theory reminds us that the marginal propensity to consume (MPC), i.e. β2, obeys the rule 0 < MPC < 1. Now we may want to test whether the result we have obtained (MPC = 0.5) conforms to what theory says. This is hypothesis testing. If hypothesis testing confirms that our results indeed conform to what economic theory says, we can comfortably use our model for forecasts or predictions. For example, given, say, X = 100, we can predict Yt as Ŷt = −30 + 0.5(100) = 20.
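This point forecast can be checked with a short Python sketch (the coefficient values are simply the estimates quoted above):

```python
# Point forecast from the fitted consumption function: Y-hat = -30 + 0.5*X
beta1_hat, beta2_hat = -30.0, 0.5  # estimated intercept and MPC from above

def predict(x):
    """Return the fitted value Y-hat for a given level of X."""
    return beta1_hat + beta2_hat * x

print(predict(100))  # -> 20.0, matching the forecast in the text
```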
Apart from forecasting or prediction, the econometric model can also be used for control or policy purposes, especially in evaluating the impact of a particular fiscal or monetary policy of government on, say, consumption. In such a case, the particular policy is the control variable and consumption is the target variable.
Such simplifications ensure that the model can be readily analyzed and conclusions can be reached concerning the real world.
A) Economic models
Any economic theory is an abstraction from the real world. For one thing, the immense complexity of the real-world economy makes it impossible for us to understand all interrelationships at once. For another, not all interrelationships are equally important for understanding the economic phenomenon under study. The sensible procedure is therefore to pick out the important factors and relationships relevant to our problem and to focus our attention on these alone.
B) Econometric models
The most important characteristic of economic relationships is that they contain a random element, which is ignored by mathematical economic models that postulate exact relationships between economic variables.
Example: Economic theory postulates that the demand for a commodity depends on its price, on the prices of other related commodities, on consumers' income and on tastes:

Q = b0 + b1P + b2P0 + b3Y + b4t
The above demand equation is exact. However, many more factors may affect demand. In econometrics, the influence of these "other" factors is taken into account by introducing a random variable. In our example, the demand function studied with the tools of econometrics would be of the stochastic form

Q = b0 + b1P + b2P0 + b3Y + b4t + u,

where u stands for the random factors which affect the quantity demanded. The random term (also called the error term or disturbance term) is a surrogate variable for important variables excluded from the model, errors committed in specifying the model, and measurement errors.
Econometric models consist of four basic structural elements, among them a set of variables. A good econometric model should satisfy the following desirable properties:
a. Theoretical plausibility: The model should be compatible with the postulates of economic
theory and adequately describe the economic phenomena to which it relates.
b. Explanatory ability: The model should be able to explain the observations of the actual
world. It must be consistent with the observed behavior of the economic variables whose
relationship it determines.
c. Accuracy of the estimates of the parameters: The estimates of the coefficients should be
accurate in the sense that they should approximate as best as possible the true parameters
of the structural model. The estimates should if possible possess the desirable properties
of unbiasedness, consistency and efficiency.
d. Forecasting ability: The model should produce satisfactory predictions of future values of
the dependent (endogenous) variables.
e. Simplicity: The model should represent the economic relationships with maximum
simplicity. The fewer the equations and the simpler their mathematical form, the better
the model provided that the other desirable properties are not affected by the
simplifications of the model.
Applied econometrics deals with the application of econometric methods developed in the
theoretical econometrics by employing economic data. In applied econometrics, we use the
tools of theoretical econometrics to study some special field(s) of economics and business,
such as the production function, investment function, demand and supply functions, portfolio
theory, etc.
In summary, an econometrician is concerned with:
As already noted economic theory must always form the basis or starting point of any
econometric analysis. However, relying on theory alone is not sufficient, because theory
has the following limitations:
ii. Theory does not specify the mathematical form of the relationship, i.e., whether it is a simple linear function, e.g. C = β1 + β2Yd, or a semi-log function, e.g. logC = β1 + β2Yd, or even a log-linear function, e.g. logC = logβ1 + β2 logYd, and so on.
iii. Theory provides us only with qualitative information about the relationship, with no attempt at quantitative information. For example, theory just tells us that a rise in Yd will lead to a rise in C, i.e. that MPC lies in the interval 0 < MPC < 1. But an econometrician would like to know a specific value for MPC.
CHAPTER TWO
CORRELATION ANALYSIS
2.1 Introduction
Economic variables have a great tendency of moving together and very often data are given in
pairs of observations in which there is a possibility that the change in one variable is on
average accompanied by the change of the other variable. This situation is known as
correlation.
Correlation may be defined as the degree of relationship existing between two or more variables. The degree of relationship existing between two variables is called simple correlation. The degree of relationship connecting three or more variables is called multiple correlation. In this unit, we shall examine only simple correlation. A correlation is said to be partial if it studies the degree of relationship between two variables keeping all other variables connected with these two constant.
Correlation may be linear, when all points (X, Y) on a scatter diagram seem to cluster near a straight line, or nonlinear, when all points seem to lie near a curve. In other words, correlation is said to be linear if a change in one variable brings a constant change in the other. It is non-linear if a change in one variable brings a varying change in the other.
Correlation may also be positive or negative. Correlation is said to be positive if an increase or a decrease in one variable is accompanied by an increase or a decrease in the other, so that both variables change in the same direction. For example, the correlation between the price of a commodity and its quantity supplied is positive, since as price rises, quantity supplied increases, and vice versa. Correlation is said to be negative if an increase or a decrease in one variable is accompanied by a decrease or an increase in the other, so that the two variables change in opposite directions. For example, the correlation between the price of a commodity and its quantity demanded is negative, since as price rises, quantity demanded decreases, and vice versa.
The types of correlation mentioned before do not show us the strength of co-variation between variables. There are three methods of measuring correlation. These are:
a. The Scatter Diagram or Graphic Method
b. The Simple Linear Correlation Coefficient
c. The Coefficient of Rank Correlation
(Figure: scatter diagrams illustrating the types of correlation, including r = 0, the case of no linear correlation.)
In the light of the above discussions it appears clear that we can determine the kind of
correlation between two variables by direct observation of the scatter diagram. In addition, the
scatter diagram indicates the strength of the relationship between the two variables. This
section is about how to determine the type and degree of correlation using a numerical result.
For a precise quantitative measurement of the degree of correlation between Y and X, we use a parameter called the correlation coefficient, usually designated by the Greek letter ρ. Having as subscripts the variables whose correlation it measures, ρxy refers to the correlation of all the values of the population of X and Y. Its estimate from any particular sample (the sample statistic for correlation) is denoted by r with the relevant subscripts. For example, if we measure the correlation between X and Y, the population correlation coefficient is represented by ρxy and its sample estimate by rxy. The simple correlation coefficient is used to measure relationships which are simple and linear only. It cannot help us in measuring non-linear or multiple correlation. The sample correlation coefficient is defined by the formula
rxy = (nΣXiYi − ΣXiΣYi) / √[(nΣXi² − (ΣXi)²)(nΣYi² − (ΣYi)²)]    (2.1)

or, in deviation form,

rxy = Σxiyi / √(Σxi² · Σyi²)    (2.2)

where xi = Xi − X̄ and yi = Yi − Ȳ.
We will use a simple example from the theory of supply. Economic theory suggests that the
quantity of a commodity supplied in the market depends on its price, ceteris paribus. When
price increases the quantity supplied increases, and vice versa. When the market price falls
producers offer smaller quantities of their commodity for sale. In other words, economic
theory postulates that price (X) and quantity supplied (Y) are positively correlated.
Example 2.1: The following table shows the quantity supplied for a commodity with the
corresponding price values. Determine the type of correlation that exists between these two
variables.
Table 1: Data for computation of correlation coefficient
Time period (in days)   Quantity supplied Yi (in tons)   Price Xi (in Birr)
1 10 2
2 20 4
3 50 6
4 40 8
5 50 10
6 60 12
7 80 14
8 90 16
9 90 18
10 120 20
r = (nΣXY − ΣXΣY) / √[(nΣX² − (ΣX)²)(nΣY² − (ΣY)²)]
  = (10(8520) − (110)(610)) / √[(10(1540) − (110)²)(10(47700) − (610)²)]
  ≈ 0.973

Or, using the deviation form (Equation 2.2), the correlation coefficient can be computed as:

r = 1810 / √((330)(10490)) ≈ 0.973
This result shows that there is a strong positive correlation between the quantity supplied and
the price of the commodity under consideration.
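The computation can be checked with a short Python sketch (only numpy is assumed):

```python
import numpy as np

# Quantity supplied (Y, tons) and price (X, Birr) from Table 1.
Y = np.array([10, 20, 50, 40, 50, 60, 80, 90, 90, 120], dtype=float)
X = np.array([2, 4, 6, 8, 10, 12, 14, 16, 18, 20], dtype=float)

# Deviation form of Equation 2.2: r = sum(x*y) / sqrt(sum(x^2) * sum(y^2))
x, y = X - X.mean(), Y - Y.mean()
r = (x * y).sum() / np.sqrt((x**2).sum() * (y**2).sum())
print(round(r, 3))  # ~0.973; np.corrcoef(X, Y)[0, 1] returns the same value
```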
The simple correlation coefficient always takes a value between −1 and +1: it cannot be less than −1 and cannot be greater than +1. If r = −1, there is perfect negative correlation between the variables. If 0 < r < 1, there is positive correlation between the two variables, and movement from zero towards positive one increases the degree of positive correlation.
If r = +1, there is perfect positive correlation between the two variables. If the correlation coefficient is zero, there is no linear relationship between the two variables. If the two variables are independent, the value of the correlation coefficient is zero, but a zero correlation coefficient does not imply that the two variables are independent.
Properties of Simple Correlation Coefficient
The simple correlation coefficient has the following important properties:
1. The value of correlation coefficient always ranges between -1 and +1.
2. The correlation coefficient is symmetric: rxy = ryx, where rxy is the correlation coefficient of X and Y, and ryx is that of Y and X.
Though the correlation coefficient is the most popular measure in applied statistics and econometrics, it has its own limitations. The major limitations of the method are:
1. The correlation coefficient always assumes a linear relationship, regardless of whether the assumption is true or not.
2. Great care must be exercised in interpreting the value of this coefficient, as it is very often misinterpreted. For example, a high correlation between lung cancer and smoking does not show that smoking causes lung cancer.
3. The value of the coefficient is unduly affected by extreme values.
5. The coefficient requires the quantitative measurement of both variables. If one of the two
variables is not quantitatively measured, the coefficient cannot be computed.
r' = 1 − 6ΣD² / (n(n² − 1))    (2.3)

Where,
D = difference between ranks of corresponding pairs of X and Y
n = number of observations.
The values that r' may assume range from +1 to −1.
Two points are of interest when applying the rank correlation coefficient. First, it does not matter whether we rank the observations in ascending or descending order; however, we must use the same rule of ranking for both variables. Second, if two (or more) observations have the same value, we assign them the mean rank. Let's use an example to illustrate the application of the rank correlation coefficient.
Example 2.2: A market researcher asks experts to express their preference for twelve different
brands of soap. Their replies are shown in the following table.
Table 3: Example for rank correlation coefficient
Brands of soap A B C D E F G H I J K L
Person I 9 10 4 1 8 11 3 2 5 7 12 6
Person II 7 8 3 1 10 12 2 6 5 4 11 9
The figures in this table are ranks but not quantities. We have to use the rank correlation
coefficient to determine the type of association between the preferences of the two persons.
This can be done as follows.
Brand:      A   B   C   D   E   F   G   H   I   J   K   L
Person I    9  10   4   1   8  11   3   2   5   7  12   6
Person II   7   8   3   1  10  12   2   6   5   4  11   9
Di         -2  -2  -1   0   2   1  -1   4   0  -3  -1   3
Di²         4   4   1   0   4   1   1  16   0   9   1   9    ΣDi² = 50

r' = 1 − 6ΣD² / (n(n² − 1)) = 1 − 6(50) / (12(144 − 1)) = 1 − 300/1716 = 0.825
This figure, 0.825, shows a marked similarity of preferences of the two persons for the
various brands of soap.
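The rank correlation can be verified in Python (a minimal sketch; scipy.stats.spearmanr would give the same answer):

```python
import numpy as np

# Preference ranks of the twelve soap brands (A..L) by the two persons.
person1 = np.array([9, 10, 4, 1, 8, 11, 3, 2, 5, 7, 12, 6])
person2 = np.array([7, 8, 3, 1, 10, 12, 2, 6, 5, 4, 11, 9])

# Equation 2.3: r' = 1 - 6*sum(D^2) / (n*(n^2 - 1))
D = person2 - person1
n = len(D)
r_rank = 1 - 6 * (D**2).sum() / (n * (n**2 - 1))
print(round(r_rank, 3))  # -> 0.825
```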
2.4 Limitations of the Theory of Linear Correlation
Correlation analysis has serious limitations as a technique for the study of economic
relationships.
First, the above formulae for r apply only when the relationship between the variables is linear. However, two variables may be strongly connected through a nonlinear relationship.
It should be clear that zero correlation and statistical independence of two variables (X and Y)
are not the same thing. Zero correlation implies zero covariance of X and Y so that r=0.
Chapter Three
Simple Linear Regression
Economic theories are mainly concerned with the relationships among various
economic variables. These relationships, when phrased in mathematical terms, can
predict the effect of one variable on another. The functional relationships of these
variables define the dependence of one variable upon the other variable(s) in a specific form. In this regard, the regression model is the most commonly used and appropriate technique of econometric analysis. Regression analysis refers to estimating functions showing the relationship between two or more variables and the corresponding tests. This section introduces the concept of simple linear regression analysis: estimating a simple linear function between two variables. We restrict our discussion in this part to two variables and deal with more variables in the next section.
Ordinary least squares estimation seeks the simple linear regression function given below for which the errors or residuals are minimized. Thus, it is about minimizing the residuals or the errors.

Yi = β0 + β1Xi + Ui    (2.5)
The above identity represents the population regression function (to be estimated from a total enumeration of data from the entire population). But most of the time it is difficult to generate population data, owing to several reasons, so we use sample data and estimate a sample regression function, Yi = β̂0 + β̂1Xi + ei. We use this sample regression function for the derivation of the parameters and the related analysis.
Before discussing the details of the OLS estimation technique, let's see the major conditions that are necessary for the validity of the analysis, interpretations and conclusions of the regression function. These conditions are known as the classical assumptions. In fact, most of these conditions can be checked and secured very easily.
3. Classical Assumptions
For the validity of a regression function and its attributes the data we use or the
terms related to our regression function should fulfill the following conditions
known as classical assumptions.
i. The error terms Ui are randomly distributed, i.e. the disturbance terms are not correlated. This means that there is no systematic variation or relation among the values of the error terms (Ui and Uj), where i = 1, 2, 3, ..., j = 1, 2, 3, ... and i ≠ j. This is represented by zero covariance among the error terms, summarized as Cov(Ui, Uj) = 0 for i ≠ j. Note that the same argument holds for the residual terms when we use sample data or the sample regression function: Cov(ei, ej) = 0 for i ≠ j. Otherwise, the error terms do not serve an adjustment purpose; rather, they cause an autocorrelation problem.
ii. The disturbance terms Ui have zero mean. This implies that the sum of the individual disturbance terms is zero. The deviations of the values of some of the disturbance terms are negative, some are zero and some are positive, and the sum or the average is zero. This is given by the identity E(Ui) = ΣUi/n = 0. Multiplying both sides by the sample size n, we obtain ΣUi = 0. The same argument is true for the sample residuals: Σei = 0.
If this condition is not met, then the position of the regression function (or curve) will not be where it is supposed to be. This results in an upward (if the mean of the error term or residual term is positive) or downward (if the mean of the error term or residual term is negative) shift in the regression function. For instance, suppose we have the following regression function:

Yi = β0 + β1Xi + Ui
E(Yi | Xi) = β0 + β1Xi if E(Ui) = 0

Otherwise the estimated model will be biased and the regression function will shift. For instance, if E(Ui) > 0, the estimated line shifts upward from the true representative model. A similar argument holds for the residual term of the sample regression function.
iii. The disturbance terms have constant variance in each period. This is given as Var(Ui) = E(Ui²) = σu² for all i, the assumption of homoscedasticity.
iv. The explanatory variables Xi and the disturbance terms Ui are uncorrelated or independent, so that the following identity holds true: ΣeiXi = 0. In addition, the covariances of successive values of the error term are zero: the value the error term assumes in one period does not depend on the value it assumed in any other period. If this condition is not met by our data or variables, our regression function and the conclusions to be drawn from it will be invalid. This latter condition is known as the assumption of non-autocorrelation or non-serial correlation.
v. The explanatory variable Xi is fixed in repeated samples. Each value of Xi does not vary, for instance, owing to a change in sample size. This means the explanatory variables are non-random and hence distribution-free.
vi. Linearity of the model in parameters. Simple linear regression requires linearity in parameters, but not necessarily linearity in variables. The same technique can be applied to estimate regression functions of the forms Y = f(X); Y = f(X²); Y = f(X³); Y = f(X − kX); and so on. What is important is transforming the data as required.
vii. The error terms are normally distributed: Ui ~ N(0, σu²). This assumption is a combination of the zero-mean-of-error-term assumption and the homoscedasticity assumption. It is used in testing hypotheses about the significance of parameters, and it is also useful in both estimating parameters and testing their significance in the maximum likelihood method.
viii. The explanatory variables should not be perfectly, linearly and/or highly correlated. Using explanatory variables which are highly or perfectly correlated in a regression function causes a biased function or model; it results in the multicollinearity problem.
ix. The relationship between the variables (or the model) is correctly specified. For instance, all the necessary variables are included in the model, and the variables are in the form that best describes the functional relationship. For instance, Y = f(X²) may better reflect the relationship between Y and X than Y = f(X).
x. The explanatory variables do not all take an identical value. This assumption is very important for improving the precision of estimators.
Note that some of these assumptions or conditions (those which apply to more than one explanatory variable) are meant for the next chapters (along with all the other assumptions or conditions), so we may not restate them in the next chapter even though they are required there also.
Estimating a linear regression function using the ordinary least squares (OLS) method is simply about calculating the parameters of the regression function for which the sum of squares of the error terms is minimized. The procedure is as follows. Suppose we want to estimate the equation

Yi = β0 + β1Xi + Ui
Since most of the time we use a sample (as it is difficult to get population data), the corresponding sample regression function is given as follows:

Yi = β̂0 + β̂1Xi + ei

From this identity, we solve for the residual term ei, square both sides and then take the sum of both sides. These three steps are given, respectively, as follows:

ei = Yi − β̂0 − β̂1Xi
ei² = (Yi − β̂0 − β̂1Xi)²
Σei² = Σ(Yi − β̂0 − β̂1Xi)²    (2.8)
The method of OLS involves finding the estimates of the intercept and the slope for which the sum of squares given by Equation 2.8 is minimized. To minimize the residual sum of squares, we take the first-order partial derivatives of Equation 2.8 and equate them to zero.
∂Σei²/∂β̂0 = 2Σ(Yi − β̂0 − β̂1Xi)(−1) = 0    (2.9)

Note that Σ(Yi − β̂0 − β̂1Xi)² is a composite function, so we apply the chain rule in finding the partial derivatives with respect to the parameter estimates.
Equations 2.12 and 2.15 are together called the system of normal equations. Solving
the system of normal equations simultaneously we obtain:
β̂1 = (nΣXY − (ΣX)(ΣY)) / (nΣX² − (ΣX)²)

or, equivalently,

β̂1 = (ΣXY − nȲX̄) / (ΣXi² − nX̄²), and we have β̂0 = Ȳ − β̂1X̄ from above.
Now we have the formulae to estimate the simple linear regression function. Let us illustrate with an example.
Example 2.4: Given the following sample data of three pairs of „Y‟ (dependent
variable) and „X‟ (independent variable), find a simple linear regression function; Y
= f(X).
Yi Xi
10 30
20 50
30 60
Solution
Yi   Xi   XiYi   Xi²
10 30 300 900
20 50 1000 2500
30 60 1800 3600
Sum 60 140 3100 7000
Mean   Ȳ = 20   X̄ = 140/3 ≈ 46.67
β̂1 = (nΣXY − (ΣX)(ΣY)) / (nΣX² − (ΣX)²) = (3(3100) − (140)(60)) / (3(7000) − (140)²) = 900/1400 ≈ 0.64
a) β̂0 = Ȳ − β̂1X̄ = 20 − (0.64)(140/3) ≈ −10, so the fitted regression function is Ŷ = −10 + 0.64X.
b) Interpretation: the value of the intercept term, −10, implies that the value of the dependent variable Y is −10 when the value of the explanatory variable is zero. The value of the slope coefficient (β̂1 = 0.64) is a measure of the marginal change in the dependent variable Y when the value of the explanatory variable increases by one. For instance, in this model, the value of Y increases on average by 0.64 units when X increases by one.
That means when X assumes a value of 45, the value of Y on average is expected to be −10 + 0.64(45) = 18.8. The regression coefficients can also be obtained by simple formulae by taking the deviations between the original values and their means. Now, if xi = Xi − X̄ and yi = Yi − Ȳ, then

β̂1 = Σxiyi / Σxi² and β̂0 = Ȳ − β̂1X̄.
Example 2.5: Find the regression equation for the data under Example 2.4, using
the shortcut formula. To solve this problem we proceed as follows.
Yi   Xi   yi   xi   xiyi   xi²   yi²
10 30 -10 -16.67 166.67 277.78 100
20 50 0 3.33 0.00 11.11 0
30 60 10 13.33 133.33 177.78 100
Sum 60 140 0 0 300.00 466.67 200
Mean 20 46.66667
Then
β̂1 = Σxiyi / Σxi² = 300 / 466.67 ≈ 0.64, and β̂0 = Ȳ − β̂1X̄ = 20 − (0.64)(46.67) = −10, the same results as before.
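The same numbers fall out of a few lines of Python (a sketch using only numpy):

```python
import numpy as np

# Data from Example 2.4: three pairs of (Y, X).
Y = np.array([10.0, 20.0, 30.0])
X = np.array([30.0, 50.0, 60.0])

# Deviation form: beta1 = sum(x*y)/sum(x^2), beta0 = Ybar - beta1*Xbar
x, y = X - X.mean(), Y - Y.mean()
beta1 = (x * y).sum() / (x**2).sum()  # 300 / 466.67 ~ 0.64
beta0 = Y.mean() - beta1 * X.mean()   # ~ -10
print(round(beta1, 2), round(beta0, 2))
```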
We have seen how the numerical values of the parameter estimates can be obtained using the OLS estimating technique. Now let us see their distributional nature, i.e. the mean and variance of the parameter estimates. Several samples of the same size can be drawn from the same population, and for each sample the parameter estimates take their own specific numerical values; the values of the estimates differ from one sample to another. The parameter estimates are therefore random in nature and have a distinct distribution with corresponding parameters. Remember that we discussed in the previous sections that both the error term and the dependent variable are assumed to be normally distributed. Thus, the parameter estimates also have a normal distribution with their associated mean and variance. Formulae for the mean and variance of the respective parameter estimates and the error term are given below (the derivation is given in Annex A):
1. The mean of β̂1: E(β̂1) = β1
2. The variance of β̂1: Var(β̂1) = E[(β̂1 − E(β̂1))²] = σu² / Σxi²
3. The mean of β̂0: E(β̂0) = β0
4. The variance of β̂0: Var(β̂0) = E[(β̂0 − E(β̂0))²] = σu²ΣXi² / (nΣxi²)
5. The estimated value of the variance of the error term: σ̂u² = Σei² / (n − 2)
The available test criteria are divided into three groups: theoretical a priori criteria, statistical criteria and econometric criteria. A priori criteria, set by economic theory, concern the consistency of the coefficients of the econometric model with economic theory. Statistical criteria, also known as first-order tests, are set by statistical theory and are used to evaluate the statistical reliability of the model. Econometric criteria refer to whether the assumptions of the econometric model employed in estimating the parameters are fulfilled. The two most commonly used tests in econometrics are:
i. The coefficient of determination (R²), used for judging the explanatory power of the explanatory variable(s); and
ii. The standard error test of the parameter estimates, applied for judging the statistical reliability of the estimates. This test measures the degree of confidence that we may attribute to the estimates.
The coefficient of determination measures the goodness of fit of the model, i.e. the explanatory contribution of the presence of the explanatory variable in the model. The total variation of the dependent variable is split into two additive components: a part explained by the model and a part represented by the random term. The total variation of the dependent variable is measured from its arithmetic mean:

Total variation in Yi = Σ(Yi − Ȳ)²
Total explained variation = Σ(Ŷi − Ȳ)²
Total unexplained variation = Σei²

The total variation of the dependent variable is thus given as TSS = ESS + RSS, which means the total sum of squares of the dependent variable is split into the explained sum of squares and the residual sum of squares.
ei = yi − ŷi
yi = ŷi + ei
yi² = ŷi² + ei² + 2ŷiei
Σyi² = Σŷi² + Σei² + 2Σŷiei

But Σŷiei = 0.

Therefore, Σyi² = Σŷi² + Σei², where Σ(Ŷi − Ȳ)² = Σŷi²    (2.16)
R² = Explained variation in Y / Total variation in Y = Σ(Ŷi − Ȳ)² / Σ(Yi − Ȳ)² = Σŷi² / Σyi²

Since Σŷi² = β̂1Σxiyi, the coefficient of determination can also be given as

R² = β̂1Σxiyi / Σyi²

or
R² = 1 − (Unexplained variation in Y / Total variation in Y) = 1 − Σ(Yi − Ŷi)² / Σ(Yi − Ȳ)² = 1 − Σei² / Σyi²    (2.17)
The higher the coefficient of determination, the better the fit. Conversely, the smaller the coefficient of determination, the poorer the fit. That is why the coefficient of determination is used to compare two or more models. One minus the coefficient of determination is called the coefficient of non-determination; it gives the proportion of the variation in the dependent variable that remains unexplained by the model.
Since the sample values of the intercept and the coefficient are estimates of the true
population parameters, we have to test them for their statistical reliability.
The significance of a model can be seen in terms of the amount of variation in the
dependent variable that it explains and the significance of the regression coefficients.
There are different tests that are available to test the statistical reliability of the
parameter estimates. The following are the common ones;
This test first establishes the two hypotheses to be tested, commonly known as the null and alternative hypotheses. The null hypothesis states that the sample comes from a population whose parameter is not significantly different from zero, while the alternative hypothesis states that the sample comes from a population whose parameter is significantly different from zero. The two hypotheses are given as follows:
H0: βi=0
H1: βi≠0
1. Compute the standard errors of the parameter estimates using the above formulae for the variances of the parameter estimates (the standard error is the positive square root of the variance):

se(β̂1) = √(σ̂u² / Σxi²)
se(β̂0) = √(σ̂u² ΣXi² / (nΣxi²))
2. Compare the standard errors of the estimates with the numerical values of the estimates and make a decision.
A) If the standard error of the estimate is less than half of the numerical value of the estimate, we can conclude that the estimate is statistically significant. That is, if se(β̂i) < ½β̂i, reject the null hypothesis and conclude that the estimate is statistically significant.
B) If the standard error of the estimate is greater than half of the numerical value of the estimate, the parameter estimate is not statistically reliable. That is, if se(β̂i) > ½β̂i, accept the null hypothesis and conclude that the estimate is not statistically significant.
This test is based on the normal distribution and is applicable when the variance of the population is known or the sample size is sufficiently large. The procedure is as follows:
1. Set up the hypotheses:
H0: βi = 0
H1: βi ≠ 0
2. Determine the level of significance (α) at which the test is carried out. It is the probability of committing a Type I error, i.e. the probability of rejecting the null hypothesis while it is true. It is common in applied econometrics to use a 5% level of significance.
3. Determine the theoretical or tabulated value of Z from the table, i.e. find the value of Zα/2 from the standard normal table; Z0.025 = 1.96.
If |Zcal| ≤ Ztab, accept the null hypothesis, while if |Zcal| > Ztab, reject the null hypothesis. Most of the time the null and alternative hypotheses are mutually exclusive: accepting the null hypothesis means rejecting the alternative hypothesis, and rejecting the null hypothesis means accepting the alternative hypothesis.
Example: Suppose a regression yields β̂1 = 29.48 with a standard error of 36, and we wish to test:
H0: β1 = 25
H1: β1 ≠ 25
After setting up the hypotheses to be tested, the next step is to determine the level of
significance in which the test is carried out. In the above example the significance
level is given as 5%.
The third step is to find the theoretical value of Z at the specified level of significance. From the standard normal table, Z0.025 = 1.96.
The fourth step in hypothesis testing is computing the observed or calculated value of the standard normal statistic:

Zcal = (β̂1 − β1) / se(β̂1) = (29.48 − 25) / 36 ≈ 0.12

Since the calculated value of the test statistic is less than the tabulated value (0.12 < 1.96), the decision is to accept the null hypothesis and conclude that the value of the parameter is 25.
In conditions where the Z-test does not apply (in small samples), the t-test can be used to test the statistical reliability of the parameter estimates. The test depends on the degrees of freedom of the sample. The test procedures of the t-test are similar to those of the Z-test and are outlined as follows:
1. Set up the hypotheses. The hypotheses for testing a given regression coefficient are:
H0: βi = 0
H1: βi ≠ 0
2. Determine the level of significance for carrying out the test. We usually use a 5% level of significance in applied econometric research.
3. Determine the tabulated value of t from the table with n − k degrees of freedom, where k is the number of parameters estimated.
4. Determine the calculated value of t. The test statistic (using the t-test) is given by:

tcal = β̂i / se(β̂i)
We have discussed the important tests that can be conducted to check model and parameter validity. One thing that must be clear, however, is that rejecting the null hypothesis does not mean that the parameter estimates are correct estimates of the true population parameters. It means that the estimate comes from a sample drawn from a population whose parameter is significantly different from zero. In order to define the range within which the true parameter lies, we must construct a confidence interval for the parameter. Just as we constructed confidence interval estimates for a given population mean using the sample mean (in Introduction to Statistics), we can construct 100(1 − α)% confidence intervals for the sample regression coefficients. To do so we need the standard errors of the sample regression coefficients. The standard error of a given coefficient is the positive square root of the variance of the coefficient. The formulae for the variances of the regression coefficients are given as follows:
Variance of the intercept: var(β̂0) = σ̂u² ΣXi² / (nΣxi²)    (2.18)

Variance of the slope: var(β̂1) = σ̂u² / Σxi²    (2.19)

where σ̂u² = Σei² / (n − k)    (2.20)

is the estimate of the variance of the random term, and k is the number of parameters to be estimated in the model. The standard errors are the positive square roots of the variances, and the 100(1 − α)% confidence interval for the slope is given by:

β̂1 − tα/2(n − k)·se(β̂1) ≤ β1 ≤ β̂1 + tα/2(n − k)·se(β̂1)
Example 2.6: The following table gives the quantity supplied (Y, in tons) and the price (X, in pounds per ton) for a commodity over a period of twelve years.
Table 5: Data on supply and price for given commodity
Y 69 76 52 56 57 77 58 55 67 53 72 64
X 9 12 6 10 9 10 7 8 12 6 11 8
Time   Y   X   XY   X²   Y²   x   y   xy   x²   y²   Ŷ   ei   ei²
Sum   756   108   6960   1020   48522   0   0   156   48   894   756.00   0.00   387.00
Solution
1. Referring to Example 2.6 above, determine what percentage of the variation in the quantity supplied is explained by the price of the commodity and what percentage remains unexplained.
R² = 1 − Σei²/Σyi² = 1 − 387/894 = 1 − 0.43 = 0.57
This result shows that 57% of the variation in the quantity supplied of the commodity under consideration is explained by the variation in its price, and the remaining 43% is unexplained by the price of the commodity. In other words, there may be other important explanatory variables left out that could contribute to the variation in the quantity supplied of the commodity under consideration.
2. Run significance tests of the regression coefficients using the following test methods.
In testing the statistical significance of the estimates using the standard error test, the following information is needed for a decision. Since there are two parameter estimates in the model, we have to test them separately.
Testing for β̂1
We have the following information about β̂1: β̂1 = 3.25 and se(β̂1) = 0.9.
H0: β1 = 0
H1: β1 ≠ 0
Since the standard error of β̂1 is less than half of the value of β̂1 (0.9 < 1.625), we reject the null hypothesis and conclude that the parameter estimate β̂1 is statistically significant.

Testing for β̂0
Again we have the following information about β̂0: β̂0 = 33.75 and se(β̂0) = 8.3.
H0: β0 = 0
H1: β0 ≠ 0
Since the standard error of β̂0 is less than half of the numerical value of β̂0 (8.3 < 16.875), we reject the null hypothesis and conclude that β̂0 is statistically significant.
In the illustrative example, we can apply the t-test to see whether the price of the commodity is significant in determining the quantity supplied of the commodity under consideration. Use α = 0.05.
H0: β1 = 0
H1: β1 ≠ 0

tcal = β̂1 / se(β̂1) = 3.25 / 0.8979 ≈ 3.62
The tabulated value of t (10 degrees of freedom, α/2 = 0.025) is 2.228. When we compare these two values, the calculated t is greater than the tabulated value; hence, we reject the null hypothesis. Rejecting the null hypothesis means concluding that the price of the commodity is significant in determining the quantity supplied of the commodity.
In this part we have seen how to conduct the statistical reliability test using t-statistic.
Now let us see additional information about this test. When the degrees of freedom is
large, we can conduct t-test without consulting the t-table in finding the theoretical
value of t. This rule is known as “2t-rule”. The rule is stated as follows;
The t-table shows that the values of t change very slowly once the degrees of freedom (n − k) exceed 8. For example, the value of t0.025 changes from 2.30 (when n − k = 8) to 1.96 (when n − k = ∞). The change from 2.30 to 1.96 is obviously very slow. Consequently, we can ignore the degrees of freedom (when they are greater than 8) and take the theoretical value of t to be 2.0. Thus, for a two-tail test of a null hypothesis:
1. If tcal is greater than 2 or less than −2, reject the null hypothesis.
2. If tcal lies between −2 and 2, accept the null hypothesis.
3. Fit the linear regression equation and determine the 95% confidence interval for the
slope.
σ̂u² = Σei² / (n − k) = 387 / (12 − 2) = 387/10 = 38.7

var(β̂1) = σ̂u² / Σxi² = 38.7 / 48 = 0.80625

The standard error of the slope is se(β̂1) = √var(β̂1) = √0.80625 = 0.8979.

The tabulated value of t for 12 − 2 = 10 degrees of freedom and α/2 = 0.025 is 2.228. Hence the 95% confidence interval for the slope is given by:

β̂1 ± t·se(β̂1) = 3.25 ± (2.228)(0.8979) = 3.25 ± 2, i.e. (1.25, 5.25)

The result tells us that at an error probability of 0.05, the true value of the slope coefficient lies between 1.25 and 5.25.
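The whole of Example 2.6 can be reproduced with a short Python sketch (numpy and scipy are assumed to be available):

```python
import numpy as np
from scipy import stats

# Quantity supplied (Y, tons) and price (X, pounds per ton) from Table 5.
Y = np.array([69, 76, 52, 56, 57, 77, 58, 55, 67, 53, 72, 64], dtype=float)
X = np.array([9, 12, 6, 10, 9, 10, 7, 8, 12, 6, 11, 8], dtype=float)
n, k = len(Y), 2

x, y = X - X.mean(), Y - Y.mean()
b1 = (x * y).sum() / (x**2).sum()        # 156/48 = 3.25
b0 = Y.mean() - b1 * X.mean()            # 63 - 3.25*9 = 33.75

e = Y - (b0 + b1 * X)                    # residuals
sigma2 = (e**2).sum() / (n - k)          # 387/10 = 38.7
se_b1 = np.sqrt(sigma2 / (x**2).sum())   # ~0.8979
R2 = 1 - (e**2).sum() / (y**2).sum()     # ~0.57

t_crit = stats.t.ppf(0.975, df=n - k)    # ~2.228
ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)  # ~(1.25, 5.25)
print(round(b1 / se_b1, 2), ci)          # t-ratio ~3.62 and the 95% CI
```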
Statement of the theorem: "Given the assumptions of the classical linear regression model, the OLS estimators, in the class of linear and unbiased estimators, have the minimum variance, i.e. the OLS estimators are BLUE."
According to this theorem, under the basic assumptions of the classical linear regression model, the least squares estimators are linear, unbiased and have minimum variance (i.e. are best of all linear unbiased estimators). The theorem is sometimes referred to as the BLUE theorem, i.e. Best, Linear, Unbiased Estimator. An estimator is called BLUE if it is:
a. Linear: a linear function of the random variable, such as, the dependent variable
Y.
b. Unbiased: its average or expected value is equal to the true population parameter.
c. Minimum variance: It has a minimum variance in the class of linear and unbiased
estimators. An unbiased estimator with the least variance is known as an efficient
estimator.
According to the Gauss-Markov theorem, the OLS estimators possess all the BLUE properties. The detailed proof of these properties is presented in Annex B.
Chapter Four
Multiple Linear Regression
Adding more variables to the simple linear regression model leads us to the discussion of multiple regression models, i.e. models in which the dependent variable (or regressand) depends on two or more explanatory variables (or regressors). The multiple linear regression (population regression function) in which we have one dependent variable Y and k explanatory variables X1, X2, ..., Xk is given by

Yi = β0 + β1X1 + β2X2 + ... + βkXk + ui    (3.1)

In this model, for example, β1 is the amount of change in Yi when X1 changes by one unit, keeping the effect of other variables constant. Similarly, β2 is the amount of
change in Yi when X2 changes by one unit, keeping the effect of other variables constant. The other slopes are interpreted in the same way.
Although a multiple regression equation can be fitted for any number of explanatory variables (Equation 3.1), the simplest possible multiple regression model, the three-variable regression, is presented here for the sake of simplicity. It is characterized by one dependent variable (Y) and two explanatory variables (X1 and X2). The model is given by:

Yi = β0 + β1X1 + β2X2 + ui    (3.2)

where β0 is the constant term and β1, β2 are the partial slope coefficients.
Each econometric method that is used for estimation has its own assumptions. Knowing the assumptions, and the consequences if they are not maintained, is very important for the econometrician. As in the previous section, there are certain assumptions underlying the multiple regression model under the method of ordinary least squares (OLS). Let us see them one by one.
Assumption 1: Randomness of ui — the variable ui is a real random variable.
Assumption 2: Zero mean of ui — the random variable ui has a zero mean for each value of Xi: E(ui) = 0.
Assumption 3: Homoscedasticity — the ui have constant variance. In other words, the variance of each ui is the same for all the Xi values: Var(ui) = σu².
Assumption 4: Normality of ui — the values of ui are normally distributed: ui ~ N(0, σu²).
Assumption 5: No autocorrelation — the values of ui (corresponding to Xi) are independent of the values of any other uj (corresponding to Xj): E(uiuj) = 0 for i ≠ j.
Under the assumption of zero mean of the random term, the sample regression function will look like the following:

Ŷi = β̂0 + β̂1X1 + β̂2X2    (3.3)

We call this equation the fitted equation. Subtracting (3.3) from (3.2), we obtain:

ei = Yi − Ŷi    (3.4)
The method of ordinary least squares (OLS), or classical least squares (CLS), involves obtaining the values β̂0, β̂1 and β̂2 in such a way that Σei² is minimized. These values are obtained by differentiating the sum of squares with respect to each coefficient and equating the derivatives to zero. That is,
∂Σei²/∂β̂0 = ∂Σ(Yi − β̂0 − β̂1X1 − β̂2X2)²/∂β̂0 = 0    (3.5)
∂Σei²/∂β̂1 = ∂Σ(Yi − β̂0 − β̂1X1 − β̂2X2)²/∂β̂1 = 0    (3.6)
∂Σei²/∂β̂2 = ∂Σ(Yi − β̂0 − β̂1X1 − β̂2X2)²/∂β̂2 = 0    (3.7)
Solving equations (3.5), (3.6) and (3.7) simultaneously, we obtain the system of
normal equations given as follows:
ΣYi = nβ̂0 + β̂1ΣX1i + β̂2ΣX2i    (3.8)
ΣX1iYi = β̂0ΣX1i + β̂1ΣX1i² + β̂2ΣX1iX2i    (3.9)
ΣX2iYi = β̂0ΣX2i + β̂1ΣX1iX2i + β̂2ΣX2i²    (3.10)
Then, letting
x1i = X1i − X̄1    (3.11)
x2i = X2i − X̄2    (3.12)
yi = Yi − Ȳ    (3.13)
The above three equations (3.8), (3.9) and (3.10) can be solved using Matrix
operations or simultaneously to obtain the following estimates:
β̂1 = (Σx1y·Σx2² − Σx2y·Σx1x2) / (Σx1²·Σx2² − (Σx1x2)²)    (3.14)

β̂2 = (Σx2y·Σx1² − Σx1y·Σx1x2) / (Σx1²·Σx2² − (Σx1x2)²)    (3.15)

β̂0 = Ȳ − β̂1X̄1 − β̂2X̄2    (3.16)
Like in the case of simple linear regression, the standard errors of the coefficients are vital in statistical inference about the coefficients. We use the standard error of a coefficient to construct confidence interval estimates for the population regression coefficient and to test the significance of the variable to which the coefficient is attached in determining the dependent variable in the model. In this section, we will see these standard errors. The standard error of a coefficient is the positive square root of the variance of the coefficient. Thus, we start with defining the variances of the coefficients.
Variance of β̂0:
Var(β̂0) = σ̂u²[1/n + (X̄1²Σx2² + X̄2²Σx1² − 2X̄1X̄2Σx1x2) / (Σx1²Σx2² − (Σx1x2)²)]    (3.17)

Variance of β̂1:
Var(β̂1) = σ̂u² Σx2² / (Σx1²Σx2² − (Σx1x2)²)    (3.18)

Variance of β̂2:
Var(β̂2) = σ̂u² Σx1² / (Σx1²Σx2² − (Σx1x2)²)    (3.19)
where

σ̂u² = Σei² / (n − 3)    (3.20)

Equation 3.20 gives the estimate of the variance of the random term. The standard errors are then computed as follows:
Standard error of β̂0: SE(β̂0) = √Var(β̂0)    (3.21)
Standard error of β̂1: SE(β̂1) = √Var(β̂1)    (3.22)
Standard error of β̂2: SE(β̂2) = √Var(β̂2)    (3.23)
Note: The OLS estimators of the multiple regression model have properties which
are parallel to those of the two-variable model.
As in simple linear regression, R² is the ratio of the explained variation to the total variation. Mathematically:

R² = Σŷ² / Σy²    (3.24)

R² can also be given in terms of the slope coefficients β̂1 and β̂2 as:

R² = (β̂1Σx1y + β̂2Σx2y) / Σy²    (3.25)
In simple linear regression, a higher R² means the model is better determined by the explanatory variable. In multiple linear regression, however, every time we insert an additional explanatory variable into the model, R² increases irrespective of any improvement in the goodness of fit of the model. That is why we use the adjusted R²:

R²adj = 1 − (1 − R²)(n − 1)/(n − k)    (3.26)

where k is the number of parameters in the model, including the intercept.
In multiple linear regression, therefore, we better interpret the adjusted R² than the ordinary R². The ordinary R² lies between 0 and +1. The adjusted R², however, can lie outside this range, and can even be negative when the goodness of fit is poor. When the adjusted R² value is negative, we consider it as zero and interpret it as meaning that no variation of the dependent variable is explained by the regressors.
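As a small illustration of Equation 3.26 (the numbers here are placeholders, not from the text):

```python
def adjusted_r2(r2, n, k):
    """Equation 3.26: penalize R-squared for the number of parameters k."""
    return 1 - (1 - r2) * (n - 1) / (n - k)

# With a low R^2 of 0.088, n = 12 and k = 3, the adjusted value turns negative.
print(round(adjusted_r2(0.088, 12, 3), 3))  # -> ~ -0.115, treated as zero
```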
Please recall that the 100(1 − α)% confidence interval for βi is given as β̂i ± tα/2, n−k · se(β̂i).
Interpretation of the confidence interval: values of the parameter lying in the interval are plausible with 100(1 − α)% confidence.
These and other types of hypotheses tests can be referred from different
Econometrics books. For the case in point, we will confine ourselves to the major
ones.
a) H0: β1 = 0, H1: β1 ≠ 0    b) H0: β2 = 0, H1: β2 ≠ 0    ...    H0: βk = 0, H1: βk ≠ 0
In a) we would like to test the hypothesis that X1 has no linear influence on Y, holding other variables constant; in b) that X2 has no linear relationship with Y, holding other factors constant; and so on. The above hypotheses lead us to two-tailed tests; however, one-tailed tests might also be important. There are two methods for testing the significance of individual regression coefficients.
a) Standard Error Test: Using the standard error test we can test the above hypotheses. The decision rule is based on the relationship between the numerical value of the parameter estimate and its standard error:
(i) If SE(β̂i) > ½β̂i, we accept H0, i.e. the estimate is not statistically significant.
(ii) If SE(β̂i) < ½β̂i, we fail to accept H0, i.e., we reject the null hypothesis in favour of the alternative, and the estimate is statistically significant.
(b) t-test
The more appropriate and formal way to test the above hypotheses is to use the t-test. As usual, we compute the t-ratios and compare them with the tabulated t-values to make our decision:

tcal = β̂i / SE(β̂i) ~ t(n − k)
If the calculated t value is less than the tabulated value, do not reject H0; otherwise, reject the null hypothesis. Rejecting H0 means the coefficient being tested is significantly different from 0. Not rejecting H0, on the other hand, means we do not have sufficient evidence to conclude that the coefficient differs from zero.
The overall test is based on the F statistic:

Fcal = (Σŷ²/(k − 1)) / (Σe²/(n − k)) = MSR / MSE

The ANOVA table is:

Source of variation       Sum of squares   df      Mean square
Regression (explained)    SSR = Σŷ²        k − 1   MSR = Σŷ²/(k − 1)
Residual (unexplained)    SSE = Σe²        n − k   MSE = Σe²/(n − k)
Total                     SST = Σy²        n − 1
This implies that the total sum of squares is the sum of the explained (regression)
sum of squares and the residual (unexplained) sum of squares. In other words, the
total variation in the dependent variable is the sum of the variation in the dependent
variable due to the variation in the independent variables included in the model and
the variation that remained unexplained by the explanatory variables in the model.
Analysis of variance (ANOVA) is the technique of decomposing the total sum of
squares into its components. As we can see here, the technique decomposes the total
variation in the dependent variable into the explained and the unexplained
variations. The degrees of freedom of the total variation are also the sum of the
degrees of freedom of the two components. By dividing the sum of squares by the
corresponding degrees of freedom, we obtain what is called the Mean Sum of
Squares (MSS).
The mean sums of squares due to regression, error (residual) and total are calculated as the sums of squares divided by their corresponding degrees of freedom (see the mean square column of the above ANOVA table).
The test statistic is then computed as follows:

Fcal = MSR / MSE ~ F(k − 1, n − k) [the F statistic follows the F distribution]
Since R² = 1 − Σe²/Σy², we have Σe²/Σy² = 1 − R², which means Σe² = (1 − R²)Σy². Similarly, Σŷ² = R²Σy². Hence:

Fcal = (Σŷ²/(k − 1)) / (Σe²/(n − k)) = [R²Σy²/(k − 1)] · [(n − k)/((1 − R²)Σy²)]

Fcal = ((n − k)/(k − 1)) · (R²/(1 − R²))
That means the calculated F can also be expressed in terms of the coefficient of
determination.
Consider the model

Yi = β0 + β1X1i + β2X2i + β3X3i + ... + βKXKi + Ui

The null hypothesis says that two slope coefficients are equal, H0: β1 = β2 against H1: β1 ≠ β2, where, for example, X1 is the price of the commodity and X2 is the income of the consumer. The hypothesis suggests that the price and income effects on Y are the same. The test statistic is

t = (β̂2 − β̂1) / SE(β̂2 − β̂1) ~ t distribution with N − K degrees of freedom.

The SE(β̂2 − β̂1) is given as SE(β̂2 − β̂1) = √[Var(β̂2) + Var(β̂1) − 2cov(β̂2, β̂1)]
Note: Using similar procedures one can also test linear equality restrictions, for example β1 + β2 = 1, and other restrictions.
Illustration: The following table shows a particular country's value of imports (Y), the level of Gross National Product (X1) measured in arbitrary units, and the price index of imported goods (X2), over a 12-year period.
Table 7: Data for multiple regression examples
Year 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971
Y 57 43 73 37 64 48 56 50 39 43 69 60
X1 220 215 250 241 305 258 354 321 370 375 385 385
X2 125 147 118 160 128 149 145 150 140 115 155 152
a) Estimate the coefficients of the economic relationship and fit the model.
Table 8: Computations of the summary statistics for coefficients for data of Table 7
Ȳ = ΣY/n = 639/12 = 53.25
X̄1 = ΣX1/n = 3679/12 = 306.5833
X̄2 = ΣX2/n = 1684/12 = 140.3333
Σx1² = 49206.92    Σx2² = 2500.667
Σx1y = 1043.25    Σx2y = −509
Σx1x2 = 960.6667    Σy² = 1536.25
β̂1 = (Σx1y·Σx2² − Σx2y·Σx1x2) / (Σx1²·Σx2² − (Σx1x2)²)
   = [(1043.25)(2500.667) − (−509)(960.6667)] / [(49206.92)(2500.667) − (960.6667)²]
   = 3097800.2 / 122127241 = 0.025365

β̂2 = (Σx2y·Σx1² − Σx1y·Σx1x2) / (Σx1²·Σx2² − (Σx1x2)²)
   = [(−509)(49206.92) − (1043.25)(960.6667)] / [(49206.92)(2500.667) − (960.6667)²]
   = −26048538 / 122127241 = −0.21329

β̂0 = Ȳ − β̂1X̄1 − β̂2X̄2 = 53.25 − (0.025365)(306.5833) − (−0.21329)(140.3333) = 75.40512

The fitted model is then written as: Ŷi = 75.40512 + 0.025365X1 − 0.21329X2
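These hand computations can be checked in Python (a sketch; numpy's least-squares solver is used in place of formulas 3.14-3.16):

```python
import numpy as np

# Imports (Y), GNP (X1) and import price index (X2) from Table 7.
Y  = np.array([57, 43, 73, 37, 64, 48, 56, 50, 39, 43, 69, 60], dtype=float)
X1 = np.array([220, 215, 250, 241, 305, 258, 354, 321, 370, 375, 385, 385], dtype=float)
X2 = np.array([125, 147, 118, 160, 128, 149, 145, 150, 140, 115, 155, 152], dtype=float)

# Design matrix with an intercept column; minimize ||Y - Zb|| by least squares.
Z = np.column_stack([np.ones_like(Y), X1, X2])
b, *_ = np.linalg.lstsq(Z, Y, rcond=None)
print(b)  # ~ [75.405, 0.0254, -0.2133], matching the hand computation
```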
b) To compute the standard errors of the coefficients, first compute the estimate of the variance of the random term as follows:

σ̂u² = Σei² / (n − 3) = 1401.223 / (12 − 3) = 1401.223/9 = 155.69143
Variance of β̂1:
Var(β̂1) = σ̂u² Σx2² / (Σx1²Σx2² − (Σx1x2)²) = 155.69143 × (2500.667 / 122127241) = 0.003188

Standard error of β̂1:
SE(β̂1) = √Var(β̂1) = √0.003188 = 0.056462

Variance of β̂2:
Var(β̂2) = σ̂u² Σx1² / (Σx1²Σx2² − (Σx1x2)²) = 155.69143 × (49206.92 / 122127241) = 0.0627

Standard error of β̂2:
SE(β̂2) = √Var(β̂2) = √0.0627 = 0.25046
Similarly, the standard error of the intercept is found to be 37.98177. The detail is left for
you as an exercise.
c) Calculate and interpret the coefficient of determination.
We can use the following summary results to obtain the R2.
Σŷ² = 135.0262
Σe² = 1401.223
Σy² = 1536.25 (the sum of the above two). Then,

R² = (β̂1Σx1y + β̂2Σx2y) / Σy² = [(0.025365)(1043.25) + (−0.21329)(−509)] / 1536.25 = 0.087894
or R² = 1 − Σe²/Σy² = 1 − 1401.223/1536.25 = 0.087894

That is, only about 8.8% of the variation in Y is explained by X1 and X2; the rest remains unexplained.
e) Construct 95% confidence interval for the true population parameters (partial regression
coefficients).[Exercise: Base your work on Simple Linear Regression]
f) Test the significance of X1 and X2 in determining the changes in Y using the t-test.
The test statistics are tcal = β̂1/SE(β̂1) = 0.025365/0.056462 ≈ 0.45 and tcal = β̂2/SE(β̂2) = −0.21329/0.25046 ≈ −0.85. The critical value (t0.025, 9) to be used here is 2.262. Like the standard error test, the t-test reveals that both X1 and X2 are insignificant in determining the change in Y, since the calculated t values are both less than the critical value in absolute terms.
Exercise: Test the significance of X1 and X2 in determining the changes in Y using the
standard error test.
g) Test the overall significance of the model. (Hint: use = 0.05)
This involves testing whether at least one of the two variables X 1 and X2 determine the
changes in Y. The hypothesis to be tested is given by:
H0: β1 = β2 = 0
H1: βi ≠ 0 for at least one i.

Fcal = ((n − k)/(k − 1)) · (R²/(1 − R²)) = ((12 − 3)/(3 − 1)) × (0.087894/(1 − 0.087894)) = 0.433632

In this case, the calculated F value (0.4336) is less than the tabulated value F0.05(2, 9) = 4.26. Hence, we do not reject the null hypothesis and conclude that there is no significant contribution of the variables X1 and X2 to the changes in Y.
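The F statistic and its critical value can be checked as follows (scipy is assumed):

```python
from scipy import stats

n, k, r2 = 12, 3, 0.087894
F_cal = ((n - k) / (k - 1)) * (r2 / (1 - r2))     # ~0.434
F_crit = stats.f.ppf(0.95, dfn=k - 1, dfd=n - k)  # ~4.26
print(round(F_cal, 3), round(F_crit, 2), F_cal > F_crit)  # 0.434 4.26 False
```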
Chapter Five
Econometric Problems
1. Assumptions Revisited
In many practical cases, two major categories of problems arise in applying the classical linear regression model:
1) those due to assumptions about the specification of the model and about the disturbances, and
2) those due to assumptions about the data.
With these assumptions we can show that the OLS estimators are BLUE and normally distributed; hence it was possible to test hypotheses about the parameters. However, if any such assumption is relaxed, OLS might not work. We shall now examine the violation of some of the assumptions.
2. Violations of Assumptions
Violation of the zero-mean assumption: if this assumption is violated, we obtain a biased estimate of the intercept term. But since the intercept term is often not very important, we can live with this; the slope coefficients remain unaffected even if the assumption is violated. The intercept term also typically has no physical interpretation.
Violation of the normality assumption is not a serious problem in large samples; in addition, because of the central limit theorem, we can argue that the test procedures – the t-tests and F-tests – are still valid asymptotically, i.e. in large samples.

Heteroscedasticity: homoscedasticity requires the variance of the disturbance term to be constant across observations, i.e. Var(Ui) = σ². In the case of heteroscedastic disturbance terms, this variance changes from observation to observation.
Causes of Heteroscedasticity
There are several reasons why the variance of the error term may be variable, some of
which are as follows.
Following the error-learning models, as people learn, their errors of behaviour become smaller over time, in which case the standard error of the regression model decreases.
As income grows, people have more discretionary income and hence more scope for choice about the disposition of their income. Hence, the variance (standard error) of the regression is likely to increase with income.
Improvement in data collection techniques will reduce errors (variance).
Existence of outliers might also cause heteroscedasticity.
Misspecification of a model can also be a cause for heteroscedasticity.
Skewness in the distribution of one or more explanatory variables included in
the model is another source of heteroscedasticity.
Incorrect data transformation and incorrect functional form are other sources of heteroscedasticity.
Note: Heteroscedasticity is likely to be more common in cross-sectional than in time series data. In cross-sectional data, one usually deals with members of a population (such as consumers, producers, etc.) at a given point in time, and such members may be of different sizes. In time series data, by contrast, the variables tend to be of similar orders of magnitude, since the data are collected for the same entity over a period of time.
Consequences of Heteroscedasticity
If the error terms of an equation are heteroscedastic, there are three major consequences.
a) The ordinary least squares estimators are still linear and unbiased, since heteroscedasticity does not cause bias in the coefficient estimates.
b) Heteroscedasticity increases the variance of the partial regression coefficients; it affects the minimum variance property, so the OLS estimators are inefficient.
c) The test statistics – the t-test and F-test – cannot be relied on in the face of uncorrected heteroscedasticity.
Detection of Heteroscedasticity
There are no hard and fast rules (universally agreed upon methods) for detecting the presence of heteroscedasticity, but some rules of thumb can be suggested. Most of these methods are based on the examination of the OLS residuals, ei, since these are the ones we observe, not the disturbances ui. There are informal and formal methods of detecting heteroscedasticity.
b) Graphical method
If there is no a priori or empirical information about the nature of the heteroscedasticity, one can examine the estimated squared residuals, e_i², to see whether they exhibit any systematic pattern. The squared residuals can be plotted either against Y or against one of the explanatory variables. If any systematic pattern appears, heteroscedasticity might exist. These two methods are informal methods.
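As a concrete illustration of the graphical method, the sketch below simulates data whose error spread grows with X and plots the squared OLS residuals against X; the data, variable names and numbers are purely illustrative, not from the text.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1, 10, n)
# Errors whose standard deviation grows with x: heteroscedastic by construction
y = 2.0 + 0.5 * x + rng.normal(0, 0.3 * x)

# Fit OLS and compute residuals
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta

# Plot squared residuals against the explanatory variable;
# a fanning-out pattern suggests heteroscedasticity
plt.scatter(x, e**2)
plt.xlabel("X")
plt.ylabel("squared residuals")
plt.show()
```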
c) Park Test
Park suggested a statistical test for heteroscedasticity based on the assumption that the variance of the disturbance term (σ_i²) is some function of the explanatory variable X_i. Park suggested the functional form
σ_i² = σ² X_i^β e^(v_i),
which can be transformed into a linear function by taking logarithms:
ln σ_i² = ln σ² + β ln X_i + v_i,
where v_i is the stochastic disturbance term. Since σ_i² is generally unknown, the squared OLS residuals e_i² are used in its place, and the regression
ln e_i² = ln σ² + β ln X_i + v_i
is run. If β turns out to be statistically significant, this suggests that heteroscedasticity is present in the data.
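A minimal sketch of the Park test on synthetic data of the same kind, assuming the statsmodels package is available; the slope of the auxiliary regression plays the role of β above.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.uniform(1, 10, 200)
y = 2.0 + 0.5 * x + rng.normal(0, 0.3 * x)   # heteroscedastic errors

# Step 1: run the original OLS regression and keep the residuals
ols = sm.OLS(y, sm.add_constant(x)).fit()
e2 = ols.resid ** 2

# Step 2: Park's auxiliary regression ln(e^2) = ln(sigma^2) + B*ln(X) + v
aux = sm.OLS(np.log(e2), sm.add_constant(np.log(x))).fit()
print(aux.params, aux.pvalues)   # a significant slope suggests heteroscedasticity
```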
d) Spearman's Rank Correlation Test
Step 1: Fit the regression to the data and obtain the residuals e_i.
Step 2: Ignoring the sign of e_i, i.e. taking |e_i|, rank both |e_i| and X_i and compute the Spearman rank correlation coefficient r_s. Its significance can be tested with
t = r_s √(N − 2) / √(1 − r_s²) ~ t(N − 2).
A high rank correlation suggests the presence of heteroscedasticity. If there is more than one explanatory variable, compute the rank correlation coefficient between |e_i| and each explanatory variable separately.
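A sketch of the rank-correlation check using scipy's spearmanr, whose p-value rests on essentially the same t-approximation given above (synthetic data again):

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import spearmanr

rng = np.random.default_rng(2)
x = rng.uniform(1, 10, 200)
y = 2.0 + 0.5 * x + rng.normal(0, 0.3 * x)

e = sm.OLS(y, sm.add_constant(x)).fit().resid
rs, pval = spearmanr(np.abs(e), x)   # rank-correlate |e| with X
print(rs, pval)                      # high r_s suggests heteroscedasticity
```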
e) Goldfeld-Quandt Test
This is the most popular test and is usually suitable for large samples. The test can be used if it is assumed that the variance (σ_i²) is positively related to one of the explanatory variables in the regression model and if the number of observations is at least twice the number of parameters to be estimated.
Given the model
Y_i = β_0 + β_1 X_i + U_i, with σ_i² = σ² X_i²,
Goldfeld and Quandt suggest the following steps:
1. Rank the observations according to the values of X_i in ascending order.
2. Omit the central c observations (usually about the middle third, or c may be specified a priori) and divide the remaining (n − c) observations into two groups, each with (n − c)/2 observations.
3. Fit separate regressions to the two sub-samples and obtain the respective residual sums of squares, RSS_1 and RSS_2, each with (n − c)/2 − k degrees of freedom.
4. Compute the ratio
F = (RSS_2/df) / (RSS_1/df) ~ F(v_1, v_2), where v_1 = v_2 = (n − c − 2k)/2.
If the two variances are the same, F approaches unity; if they differ, F departs from one. The higher the F-ratio, the stronger the evidence of heteroscedasticity.
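The four steps translate directly into code; below is a minimal sketch on synthetic data (statsmodels also ships a ready-made het_goldfeldquandt function that could be used instead):

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import f as f_dist

rng = np.random.default_rng(3)
n, c, k = 60, 20, 2                    # sample size, omitted middle, parameters
x = rng.uniform(1, 10, n)
y = 2.0 + 0.5 * x + rng.normal(0, 0.3 * x)

order = np.argsort(x)                  # step 1: rank by X
x, y = x[order], y[order]
m = (n - c) // 2                       # step 2: drop the central c observations

rss = []
for s in (slice(0, m), slice(n - m, n)):   # step 3: separate regressions
    res = sm.OLS(y[s], sm.add_constant(x[s])).fit()
    rss.append(res.ssr)

df = m - k                             # step 4: F = (RSS2/df)/(RSS1/df)
F = (rss[1] / df) / (rss[0] / df)
print(F, f_dist.sf(F, df, df))         # a large F indicates heteroscedasticity
```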
Note: There are also other methods of testing for the existence of heteroscedasticity in your data, namely the Glejser test, the Breusch-Pagan-Godfrey test, White's general test and the Koenker-Bassett test, for the details of which you should refer to standard econometrics texts.
Remedial Measures
OLS estimators are still unbiased even in the presence of heteroscedasticity. But they are
not efficient, not even asymptotically. This lack of efficiency makes the usual hypothesis
testing procedure a dubious exercise. Remedial measures are, therefore, necessary.
Generally the solution is based on some form of transformation.
a) Weighted Least Squares
The weighted least squares method amounts to running an OLS regression on transformed data, where the transformation is based on the assumed form of the heteroscedasticity.
Assumption One: Given the model Y_i = β_0 + β_1 X_1i + U_i,
suppose Var(U_i) = σ_i² = σ² X_1i², so that E(U_i²) = σ² X_1i².
Dividing the model through by X_1i gives
Y_i/X_1i = β_0 (1/X_1i) + β_1 + V_i, where V_i = U_i/X_1i.
Now E(V_i²) = E(U_i/X_1i)² = (1/X_1i²) E(U_i²) = σ².
Hence the variance of the transformed disturbance V_i is homoscedastic, and we can regress Y/X_1i on 1/X_1i.
Assumption Two: Again given the model
Y_i = β_0 + β_1 X_1i + U_i,
suppose now that Var(U_i) = σ² X_1i. Dividing the model through by √X_1i gives
Y_i/√X_1i = β_0 (1/√X_1i) + β_1 √X_1i + V_i, where V_i = U_i/√X_1i.
Since E(V_i²) = E(U_i²)/X_1i = σ² X_1i/X_1i = σ², the variance of V_i is constant (homoscedastic), and one can apply the OLS technique to regress Y/√X_1i on 1/√X_1i and √X_1i.
To go back to the original model, one can simply multiply the transformed model through by √X_1i.
Assumption Three: Suppose E(U_i²) = σ² [E(Y_i)]², where E(Y_i) = β_0 + β_1 X_1i.
Dividing the model through by E(Y_i) gives
Y_i/E(Y_i) = β_0 (1/E(Y_i)) + β_1 X_1i/E(Y_i) + V_i, where V_i = U_i/E(Y_i).
Again it can be verified that V_i has the constant variance σ²:
E(V_i²) = E[U_i/E(Y_i)]² = E(U_i²)/[E(Y_i)]² = σ² [E(Y_i)]²/[E(Y_i)]² = σ².
Assumption Four: A logarithmic transformation may be used, i.e. run
ln Y_i = β_0 + β_1 ln X_1i + U_i.
Taking logs compresses the scales in which the variables are measured, which often reduces heteroscedasticity.
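A sketch of weighted least squares under Assumption One, Var(U_i) = σ²X_1i²; statsmodels' WLS takes weights proportional to the inverse of the error variance, and the hand-made division by X_1i is shown for comparison (all data synthetic):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
x = rng.uniform(1, 10, 200)
y = 2.0 + 0.5 * x + rng.normal(0, 0.3 * x)   # Var(U) proportional to x^2

# WLS with weights = 1 / Var(U_i), up to scale
wls = sm.WLS(y, sm.add_constant(x), weights=1.0 / x**2).fit()
print(wls.params, wls.bse)

# Equivalent by hand: divide the equation through by x and run OLS.
# The constant now estimates beta_1, and the coefficient on 1/x estimates beta_0.
ols_t = sm.OLS(y / x, sm.add_constant(1.0 / x)).fit()
print(ols_t.params)
```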
b) Other Remedies for Heteroscedasticity
Two other approaches could be adopted to remove the effect of heteroscedasticity:
- Include a previously omitted variable(s) if the heteroscedasticity is suspected to be due to omitted variables.
- Redefine the variables in a way that avoids heteroscedasticity; for example, instead of total income, use income per capita.
Autocorrelation
The classical model assumes that the disturbance terms are uncorrelated across observations, Cov(U_i, U_j) = 0 for i ≠ j. Serial correlation implies that the error term from one time period depends in some systematic way on error terms from other time periods. Autocorrelation is more a problem of time series data than of cross-sectional data; if, by chance, such a correlation is observed in cross-sectional units, it is called spatial autocorrelation. It is therefore important to understand serial correlation and its consequences for the OLS estimators.
Nature of Autocorrelation
The classical model assumes that the disturbance term relating to any observation is not influenced by the disturbance term relating to any other observation:
E(U_i U_j) = 0, i ≠ j.
But if there is any interdependence between the disturbance terms, then we have autocorrelation:
E(U_i U_j) ≠ 0, i ≠ j.
Causes of Autocorrelation
Serial correlation may occur for a number of reasons.
- Inertia (built-in momentum): a salient feature of most economic time series (such as GDP, GNP, price indices, production, employment, etc.) is inertia or sluggishness; such variables exhibit (business) cycles.
- Specification bias: exclusion of important variables or use of an incorrect functional form.
- Lags: in a time series regression, the value of a variable for a certain period may depend on the variable's previous-period value.
- Manipulation of data: if the raw data are manipulated (extrapolated or interpolated), autocorrelation might result.
Autocorrelation can be negative as well as positive. The most common kind of serial correlation is first-order serial correlation, in which the current period's error term is a function of the previous period's error term:
E_t = ρE_(t−1) + U_t, with −1 < ρ < 1.
This is also called the first-order autoregressive scheme. The disturbance term U_t satisfies all the basic assumptions of the classical linear model:
E(U_t) = 0; E(U_t U_(t−s)) = 0 for s ≠ 0; U_t ~ N(0, σ_u²).
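A short simulation may make the scheme concrete; the value ρ = 0.8 and the sample size are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(5)
T, rho = 200, 0.8
u = rng.normal(0, 1, T)          # white-noise innovations U_t
e = np.zeros(T)
for t in range(1, T):            # E_t = rho * E_{t-1} + U_t
    e[t] = rho * e[t - 1] + u[t]

# Successive errors are now correlated: corr(e_t, e_{t-1}) is close to rho
print(np.corrcoef(e[1:], e[:-1])[0, 1])
```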
Consequences of Autocorrelation
1) The OLS estimators are still linear and unbiased.
2) Serial correlation increases the variance of the OLS estimators; the minimum variance property of the OLS parameter estimates is violated, so the OLS estimators are no longer efficient.
[Figure: sampling distributions of the slope estimator without and with serial correlation – the distribution is more spread out with serial correlation, i.e. Var(β̂) without serial correlation is smaller than Var(β̃) with serial correlation.]
[Figure: scatter plots of the residuals e_t against e_(t−1). When successive residuals tend to have the same sign (e_t and e_(t−1) both positive or both negative), positive autocorrelation is indicated; when they tend to alternate in sign, negative autocorrelation is indicated.]
There are more accurate tests for the incidence of autocorrelation. The most common test of autocorrelation is the Durbin-Watson test. For the first-order scheme E_t = ρE_(t−1) + U_t, the hypotheses are
H_0: ρ = 0 (no serial correlation) against H_1: ρ ≠ 0.
This test is, however, applicable only where the following underlying assumptions are met:
- The regression model includes an intercept term.
- The serial correlation is first-order in nature.
- The regression does not include the lagged dependent variable as an explanatory variable.
- There are no missing observations in the data.
The Durbin-Watson d statistic is defined as
d = Σ_(t=2)^N (e_t − e_(t−1))² / Σ_(t=1)^N e_t²,
which is simply the ratio of the sum of squared differences in successive residuals to the residual sum of squares. Note that the numerator has one fewer observation than the denominator, because an observation must be lagged. The d statistic is based on the estimated residuals and is thus often reported together with R², t-statistics, etc.
The d statistic equals zero if there is extreme positive serial correlation, two if there is no serial correlation, and four if there is extreme negative serial correlation. To see this:
1. Extreme positive serial correlation: e_t ≈ e_(t−1), so each difference (e_t − e_(t−1)) ≈ 0 and d ≈ 0.
2. Extreme negative serial correlation: e_t ≈ −e_(t−1), so (e_t − e_(t−1)) ≈ 2e_t and thus
d ≈ Σ(2e_t)²/Σe_t² ≈ 4.
3. No serial correlation: expanding the numerator,
d = [Σe_t² + Σe_(t−1)² − 2Σe_t e_(t−1)]/Σe_t² ≈ 2,
since Σe_t e_(t−1) ≈ 0 when the residuals are uncorrelated, and Σe_t² and Σe_(t−1)² differ only by one observation.
The exact sampling (probability) distribution of the d statistic is not known; therefore, unlike the t, χ² or F tests, there are no unique critical values that lead to the acceptance or rejection of the null hypothesis.
But Durbin and Watson successfully derived upper and lower bounds (d_U and d_L) such that if the computed d lies outside these critical values, a decision can be made regarding the presence of positive or negative serial correlation.
Thus
d = [Σe_t² + Σe_(t−1)² − 2Σe_t e_(t−1)]/Σe_t² ≈ 2(1 − Σe_t e_(t−1)/Σe_(t−1)²) = 2(1 − ρ̂),
since ρ̂ = Σe_t e_(t−1)/Σe_(t−1)². Because −1 ≤ ρ̂ ≤ 1, d must lie between 0 and 4.
[Figure: Durbin-Watson decision regions on the interval from 0 to 4 – reject H_0 (positive autocorrelation) for d < d_L; inconclusive for d_L ≤ d ≤ d_U; accept H_0 (no serial correlation) for d_U < d < 4 − d_U; inconclusive for 4 − d_U ≤ d ≤ 4 − d_L; reject H_0 (negative autocorrelation) for d > 4 − d_L.]
Note: Other tests for autocorrelation include the Runs test and the Breusch-Godfrey (BG) test. There are many tests for autocorrelation because no single test has been judged unequivocally best or most powerful in the statistical sense.
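A sketch of computing d from the OLS residuals of a model with simulated AR(1) errors; statsmodels' durbin_watson gives the same number, and the critical values d_L and d_U must still be looked up in the Durbin-Watson tables.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(6)
T, rho = 200, 0.8
x = rng.uniform(0, 10, T)
e = np.zeros(T)
for t in range(1, T):                       # AR(1) errors
    e[t] = rho * e[t - 1] + rng.normal()
y = 1.0 + 0.5 * x + e

resid = sm.OLS(y, sm.add_constant(x)).fit().resid
d = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)   # the formula above
print(d, durbin_watson(resid))              # both roughly 2 * (1 - rho)
```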
Remedial Measures for Autocorrelation
When ρ is known
For such a first-order scheme, the appropriate transformation is to subtract from the original observations of each period ρ̂ times the value of the variables in the previous period:
Y_t* = b_0* + b_1 X_1t* + ... + b_K X_Kt* + V_t
where:
Y_t* = Y_t − ρ̂Y_(t−1)
X_it* = X_it − ρ̂X_i(t−1)
V_t = U_t − ρU_(t−1)
b_0* = b_0 − ρb_0 = b_0(1 − ρ)
Thus, if the structure of the autocorrelation is known, it is possible to make the above transformation. But often the structure of the autocorrelation is not known, so we need to estimate ρ in order to make the transformation.
When ρ is not known
There are different ways of estimating the autocorrelation coefficient ρ when it is unknown.
1) Estimation of ρ from the d-statistic
Recall that d ≈ 2(1 − ρ̂), or ρ̂ ≈ 1 − d/2, which suggests a simple way of obtaining an estimate of ρ from the estimated d statistic. Once an estimate of ρ is available, one can proceed with the estimation of the OLS parameters after making the necessary transformation.
2) Durbin's two-step method
Given the original function Y_t = β_0 + β_1 X_t + U_t with U_t = ρU_(t−1) + V_t, the generalized difference equation can be written as
Y_t = β_0(1 − ρ) + ρY_(t−1) + β_1 X_t − ρβ_1 X_(t−1) + V_t.
Setting β_0(1 − ρ) = a_0, β_1 = a_1, −ρβ_1 = a_2, etc., this becomes
Y_t = a_0 + ρY_(t−1) + a_1 X_t + a_2 X_(t−1) + V_t.
Applying OLS to this equation, we obtain an estimate of ρ as the coefficient of the lagged variable Y_(t−1).
The methods discussed above for dealing with serial correlation are basically two-step methods: in step 1 we obtain an estimate of the unknown ρ, and in step 2 we use that estimate to transform the variables and estimate the generalized difference equation.
Note: The Cochrane-Orcutt iterative method is another such method.
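A sketch of the two-step procedure via the d-statistic route: estimate ρ̂ = 1 − d/2, form the generalized differences, and re-run OLS; the intercept is recovered from b_0* = b_0(1 − ρ̂). All data are synthetic.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
T, rho = 200, 0.8
x = rng.uniform(0, 10, T)
e = np.zeros(T)
for t in range(1, T):
    e[t] = rho * e[t - 1] + rng.normal()
y = 1.0 + 0.5 * x + e

# Step 1: estimate rho from the Durbin-Watson statistic, rho_hat = 1 - d/2
resid = sm.OLS(y, sm.add_constant(x)).fit().resid
d = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)
rho_hat = 1 - d / 2

# Step 2: generalized differences Y* = Y_t - rho_hat*Y_{t-1}, X* likewise
y_star = y[1:] - rho_hat * y[:-1]
x_star = x[1:] - rho_hat * x[:-1]
gd = sm.OLS(y_star, sm.add_constant(x_star)).fit()
b0 = gd.params[0] / (1 - rho_hat)     # recover b0 from b0* = b0(1 - rho_hat)
print(rho_hat, b0, gd.params[1])
```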
Multicollinearity
Multicollinearity concerns linear relationships among the explanatory variables. If the correlation coefficient between two explanatory variables is 1, they are perfectly collinear; if it is 0, the variables are called orthogonal and there is no problem of multicollinearity. Neither of these two extreme cases is often met, but some degree of inter-correlation is expected among the explanatory variables because of the interdependence of economic variables.
Multicollinearity is thus not a condition that either exists or does not exist in economic functions, but rather a phenomenon inherent in most relationships owing to the nature of economic magnitudes. There is, however, no conclusive evidence as to what degree of multicollinearity will seriously affect the parameter estimates.
Consequences of Multicollinearity
Recall that, if the assumptions of the classical linear regression model are satisfied, the OLS estimators of the regression coefficients are BLUE. As stated above, if there is perfect multicollinearity between the explanatory variables, it is not possible to determine the regression coefficients and their standard errors. But if collinearity among the X-variables is high, though not perfect, the following might be expected. (Nevertheless, the effect of collinearity is controversial and by no means conclusive.)
(1) The OLS estimators remain BLUE, but their variances and standard errors become large, so precise estimation becomes difficult.
(2) Because of the large standard errors, the confidence intervals for the parameters tend to be wide.
[Figure: sampling distributions of β̂ without and with severe multicollinearity – the variance is larger under severe multicollinearity.]
(3) The computed t-ratios will fall, i.e. insignificant t-ratios will be observed in the presence of multicollinearity, so that one may increasingly accept the null hypothesis that the relevant true population value is zero,
H_0: β_i = 0.
Thus, because of the high variances of the estimates, the null hypothesis tends to be accepted too readily.
(4) A high R² but few significant t-ratios are expected in the presence of multicollinearity: one or more of the partial slope coefficients are individually statistically insignificant on the basis of the t-test, yet R² may be very high. Indeed, this is one of the signals of multicollinearity – insignificant t-values but high overall R² and F-values. Because multicollinearity has little effect on the overall fit of the equation, it will also have little effect on the use of that equation for prediction or forecasting.
Detecting Multicollinearity
Having studied the nature and the consequences of multicollinearity, the next question is how to detect it. The main purpose in doing so is to decide how much multicollinearity exists in an equation, not whether any multicollinearity exists; the important question is the degree of multicollinearity. There is no unique test that is universally accepted. Instead, we have some rules of thumb for assessing the severity and importance of multicollinearity in an equation. Some of the most commonly used approaches are the following:
High R² but few significant t-ratios
This is the classical symptom of multicollinearity. If R² is high (say R² > 0.8), the F-test will in most cases reject the hypothesis that the partial slope coefficients are simultaneously equal to zero, but the individual t-tests will show that none or very few of the partial slope coefficients are statistically different from zero. In other words, multicollinearity that is severe enough to substantially lower t-scores does very little to reduce R² or the F-statistic. So the combination of a high R² with low calculated t-values for the individual regression coefficients is an indicator of the possible presence of severe multicollinearity.
Drawback: a non-multicollinear explanatory variable may still have a significant coefficient even if there is multicollinearity between two or more other explanatory variables. Thus, equations with high levels of multicollinearity will often have one or two regression coefficients significantly different from zero, making the 'high R², low t' rule a poor indicator in such cases.
1) High pair-wise (simple) correlation coefficients among the regressors (explanatory variables)
If the pair-wise correlation coefficients are high in absolute value, it is highly probable that the X's are highly correlated and that multicollinearity is a potential problem. The question is how high r should be before multicollinearity is suspected; one common suggestion is that an r in excess of 0.80 is cause for suspicion.
Another rule of thumb is that multicollinearity is a potential problem when the squared pair-wise correlation between two regressors exceeds the overall R² of the regression.
2) Variance Inflation Factor (VIF)
For each explanatory variable X_j, the VIF is defined as VIF_j = 1/(1 − R_j²), where R_j² is the coefficient of determination from the auxiliary regression of X_j on the other explanatory variables. As collinearity becomes perfect (R_j² → 1), the VIF approaches infinity; if there is no collinearity, the VIF equals 1. As a rule of thumb, a VIF value of 10 or more indicates that multicollinearity is a severe problem. Tolerance is defined as the inverse of the VIF.
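A sketch of the VIF computation with statsmodels' variance_inflation_factor, on synthetic regressors made nearly collinear by construction:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(8)
n = 200
x1 = rng.normal(0, 1, n)
x2 = x1 + rng.normal(0, 0.1, n)      # nearly collinear with x1 by construction
x3 = rng.normal(0, 1, n)             # an unrelated regressor

X = sm.add_constant(np.column_stack([x1, x2, x3]))
for i in range(1, X.shape[1]):       # skip the constant column
    print(i, variance_inflation_factor(X, i))   # VIF >= 10 signals a severe problem
```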
3) Other more formal tests for multicollinearity
The use of formal tests to indicate the severity of multicollinearity in a particular sample is controversial. Some econometricians reject even the simple indicators developed above, mainly because of the limitations cited; others use a number of more formal tests, but none of these is accepted as the best.
Remedial Measures
There is no automatic answer to the question "what can be done to minimize the problem of multicollinearity?" The possible solutions, if multicollinearity exists in a function, vary depending on the severity of the multicollinearity, on the availability of other data sources, on the importance of the factors which are multicollinear, and on the purpose for which the function is used. However, some alternative remedies can be suggested for reducing the effect of multicollinearity.
1) Do Nothing
Some writers have suggested that if multicollinearity does not seriously affect the estimates of the coefficients, one may tolerate its presence in the function. In a sense, multicollinearity is similar to a non-life-threatening human disease that requires an operation only if it is causing a significant problem: a remedy for multicollinearity should be considered only if and when the consequences cause insignificant t-scores or wildly unreliable estimated coefficients.
2) Dropping one or more of the multicollinear variables
When faced with severe multicollinearity, one of the simplest remedies is to drop one or more of the collinear variables. Since multicollinearity is caused by correlation between the explanatory variables, dropping one of the multicollinear variables removes that correlation from the equation.
Some people argue, however, that dropping a variable from the model may introduce specification error or specification bias: since OLS estimators are still BLUE despite near collinearity, omitting a variable may seriously mislead us as to the true values of the parameters.
Example: If economic theory says that income and wealth should both be included in
the model explaining the consumption expenditure, dropping the wealth variable
would constitute specification bias.
3) Transformation of the variables
If the variables involved are all extremely important on theoretical grounds, neither
doing nothing nor dropping a variable could be helpful. But it is sometimes possible to
transform the variables in the equation to get rid of at least some of the
multicollinearity.
Two common such transformations are:
(i) to form a linear combination of the multicollinear variables;
(ii) to transform the equation into first differences (or logs).
The technique of forming a linear combination consists of creating a new variable that is a function of the multicollinear variables and using it in the regression in place of the original variables.
If an equation (or some of the variables in an equation) is switched from its normal specification to a first-difference specification, it is quite likely that the degree of multicollinearity will be significantly reduced, for two reasons:
- Since multicollinearity is a sample phenomenon, any change in the definitions of the variables in that sample will change the degree of multicollinearity.
- Multicollinearity occurs most frequently in time-series data, in which first differences are far less likely to move steadily upward than are the aggregates from which they are calculated.
4) Increase the sample size
Since multicollinearity is a sample phenomenon, increasing the sample size (or obtaining additional data) may reduce the severity of the problem.
5) Other Remedies
There are several other methods suggested to reduce the degree of multicollinearity. Multivariate statistical techniques such as factor analysis and principal component analysis, or other techniques such as ridge regression, are often employed to deal with the problem of multicollinearity.
Summary
The ordinary least squares method works when the assumptions of the classical linear regression model hold. One of the critical assumptions of the classical linear regression model is that the disturbances all have the same variance; its violation leads to heteroscedasticity. Heteroscedasticity does not destroy the unbiasedness and consistency properties of the OLS estimators, but it does destroy their efficiency. There are several diagnostic tests available for detecting it, but one cannot tell for sure which will work in a given situation. Even when it is detected, it is not easy to correct; transforming the data might be a possible way out. The other assumption is that there is no multicollinearity (exact or approximately exact linear relationships) among the explanatory variables. If there is perfect collinearity, the regression coefficients are indeterminate. Although there are no sure methods of detecting collinearity, there are several indicators of it; the clearest sign is when R² is very high but none of the regression coefficients is statistically significant. Detection of multicollinearity is half the battle; the other half is concerned with how to get rid of it. Here again there are no sure methods, only a few rules of thumb, such as the use of extraneous or a priori information.
Chapter Six
a) The Log-Linear (Double-Log) Form
The most common functional form that is non-linear in the variables (but still linear in the coefficients) is the log-linear form. A log-linear form is often used because in it the elasticities, not the slopes, are constant, i.e. η = %Δ output / %Δ input = constant. Thus, given the assumption of a constant elasticity, the proper form is the exponential (log-linear) form.
Given: Y_i = β_0 X_i^(β_1) e^(U_i),
the log-linear functional form for the above equation can be obtained by a logarithmic transformation of the equation:
ln Y_i = ln β_0 + β_1 ln X_i + U_i.
The model can then be estimated by OLS if the basic assumptions are fulfilled.
[Figure: a demand curve, Y_i = β_0 X_i^(β_1), and its logarithmic transformation, ln Y_i = ln β_0 + β_1 ln X_i, which is linear in the logs.]
The model is also called a constant elasticity model because the coefficient of elasticity between Y and X (β_1) remains constant:
η = (dY/dX)(X/Y) = d ln Y / d ln X = β_1.
This functional form is used in the estimation of demand and production functions.
Note: We should make sure that there are no negative or zero observations in the data
set before we decide to use the log-linear model. Thus log-linear models should be run
only if all the variables take on positive values.
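A sketch of estimating the constant-elasticity model by OLS on the logged data; the data are synthetic and the "true" parameters (β_0 = 3, β_1 = 0.7) are arbitrary.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
x = rng.uniform(1, 100, 300)                         # strictly positive, as required
y = 3.0 * x**0.7 * np.exp(rng.normal(0, 0.1, 300))   # Y = b0 * X^b1 * e^u

fit = sm.OLS(np.log(y), sm.add_constant(np.log(x))).fit()
b0, b1 = np.exp(fit.params[0]), fit.params[1]
print(b0, b1)   # b1 estimates the constant elasticity of Y with respect to X
```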
b) Semi-log Form
The semi-log functional form is a variant of the log-linear equation in which some but not all of the variables (dependent and independent) are expressed in terms of their logs. Models such as
ln Y_i = β_0 + β_1 X_i + U_i (the log-lin model) and Y_i = β_0 + β_1 ln X_i + U_i (the lin-log model)
are called semi-log models. The semi-log functional form, in the case of taking the log of one of the independent variables, can be used to depict a situation in which the impact of X on Y is expected to 'tail off' as X gets bigger, as long as β_1 is greater than zero.
[Figure: the lin-log form Y = β_0 + β_1 ln X_i, with the curve rising at a decreasing rate for β_1 > 0 and falling at a decreasing rate for β_1 < 0.]
Example: Engel curves tend to flatten out because, as incomes get higher, a smaller percentage of income goes to consumption and a greater percentage goes to saving.
c) Polynomial Form
Y = β_0 + β_1 X_1i + β_2 X_1i² + β_3 X_2i + U_i
Such models produce slopes that change as the independent variables change. The slopes of Y with respect to the X's are
∂Y/∂X_1 = β_1 + 2β_2 X_1 and ∂Y/∂X_2 = β_3.
In most cost functions, the slope of the cost curve changes as output changes.
[Figure: A) a typical cost curve; B) the impact of age on earnings – two shapes that polynomial forms can capture.]
A simple transformation of the polynomial enables us to use the OLS method to estimate the parameters of the model. Setting X_3 = X_1², the model becomes
Y = β_0 + β_1 X_1i + β_2 X_3 + β_3 X_2i + U_i.
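A sketch of the polynomial form estimated as an ordinary multiple regression after the X_3 = X_1² substitution, using a synthetic cost-output relation (names and magnitudes illustrative):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(10)
q = rng.uniform(1, 10, 200)                           # output
cost = 50 + 2 * q + 0.8 * q**2 + rng.normal(0, 5, 200)

X = sm.add_constant(np.column_stack([q, q**2]))       # X_3 = X_1^2 as a regressor
fit = sm.OLS(cost, X).fit()
b1, b2 = fit.params[1], fit.params[2]
# The slope of cost with respect to output changes with output: b1 + 2*b2*q
print(b1 + 2 * b2 * np.array([2.0, 8.0]))
```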
d) Inverse (Reciprocal) Form
The inverse functional form expresses Y as a function of the reciprocal (or inverse) of one or more of the independent variables (in this case X_1):
Y_i = β_0 + β_1 (1/X_1i) + β_2 X_2i + U_i.
The reciprocal form should be used when the impact of a particular independent variable is expected to approach zero as that independent variable grows toward infinity: as X_1 gets larger, its impact on Y diminishes.
As X_1i increases indefinitely, the term β_1(1/X_1i) approaches zero and Y approaches the asymptote (limit value) β_0. The function approaches the asymptote from above if β_1 > 0 and from below if β_1 < 0.
A classic application is the Phillips-curve type relation
W_t = β_0 + β_1 (1/U_t) + E_t,
which relates the rate of change of wages (W_t) to the unemployment rate (U_t): as unemployment rises, the wage change approaches the asymptote β_0.
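A sketch of the reciprocal form fitted to synthetic Phillips-curve-style data; the variable names and magnitudes are illustrative only.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(11)
u_rate = rng.uniform(2, 12, 200)                       # unemployment rate (%)
wage = 1.0 + 8.0 / u_rate + rng.normal(0, 0.3, 200)    # W = b0 + b1*(1/U) + e

fit = sm.OLS(wage, sm.add_constant(1.0 / u_rate)).fit()
print(fit.params)   # as u_rate grows, 1/u_rate -> 0 and W approaches b0
```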