Basic Econometrics
IV Semester
COMPLEMENTARY COURSE
B Sc MATHEMATICS
(2011 Admission)
UNIVERSITY OF CALICUT
SCHOOL OF DISTANCE EDUCATION
STUDY MATERIAL
COMPLEMENTARY COURSE
B Sc Mathematics
IV Semester
MATHEMATICAL ECONOMICS
Prepared by:
Sri. Shabeer K P,
Assistant Professor,
Dept. of Economics,
Govt College Kodanchery.
CONTENTS
MODULE I    INTRODUCTION TO ECONOMETRICS
MODULE II   TWO VARIABLE REGRESSION MODEL
MODULE III  THE CLASSICAL NORMAL LINEAR REGRESSION MODEL
MODULE IV   EXTENSION OF TWO VARIABLE REGRESSION MODEL
MODULE I
INTRODUCTION TO ECONOMETRICS
1.1 Definition and Scope of Econometrics
Literally interpreted, econometrics means economic measurement. Econometrics
deals with the measurement of economic relationships. It is a science which
combines economic theory with economic statistics and tries by mathematical and
statistical methods to investigate the empirical support of the general economic laws
established by economic theory. Econometrics, therefore, makes concrete certain
economic laws by utilising economics, mathematics and statistics. The term
econometrics is formed from two words of Greek origin: oikonomia (οἰκονομία),
meaning economy, and metron (μέτρον), meaning measure.
Although measurement is an important part of econometrics, the scope of
econometrics is much broader, as can be seen from the following quotations. In the
words of Arthur S Goldberger econometrics may be defined as the social science in
which the tools of economic theory, mathematics and statistical inference are
applied to the analysis of economic phenomena. Gerhard Tintner points out that
econometrics, as a result of a certain outlook on the role of economics, consists of
the application of mathematical statistics to economic data to lend empirical support
to the models constructed by mathematical economics and to obtain numerical
results. For H Theil, econometrics is concerned with the empirical determination of
economic laws. In the words of Ragnar Frisch the mutual penetration of
quantitative econometric theory and statistical observation is the essence of
econometrics.
Thus, econometrics may be considered as the integration of economics,
mathematics and statistics for the purpose of providing numerical values for the
parameters of economic relationships and verifying economic theories. It is a special
type of economic analysis and research in which the general economic theory,
formulated in mathematical terms, is combined with empirical measurement of
economic phenomena. Econometrics is the art and science of using statistical
methods for the measurement of economic relations. In the practice of
econometrics, economic theory, institutional information and other assumptions are
relied upon to formulate a statistical model, or a set of statistical hypotheses to
explain the phenomena in question.
Economic theory makes statements or hypotheses that are mostly qualitative in
nature. Econometrics gives empirical content to most economic theory.
Econometrics differs from mathematical economics. The main concern of the
mathematical economics is to express economic theory in mathematical form
(equations) without regard to measurability or empirical verification of the theory. As
noted above, econometrics is mainly interested in the empirical verification of
economic theory.
Methodology of Econometrics
Broadly speaking, traditional econometric methodology proceeds along the following
lines: statement of the theory or hypothesis, specification of the mathematical model
of the theory, specification of the econometric model, obtaining the data, estimation
of the parameters, hypothesis testing, and forecasting or using the model for policy
purposes. To illustrate these steps, let us consider the well known Keynesian
psychological law of consumption: on average, men increase their consumption as
their income increases, but not by as much as the increase in income. The
mathematical form of this law is
Y = β1 + β2X,  0 < β2 < 1   (1.1)
where Y is consumption expenditure, X is income, and β1 and β2 are parameters;
β2 is the marginal propensity to consume (MPC). Since relationships between
economic variables are inexact, the econometric model adds a disturbance term u:
Y = β1 + β2X + u,  0 < β2 < 1   (1.2)
[Figure: the Keynesian consumption function, with consumption expenditure on the
vertical axis and income on the horizontal axis; the slope of the line measures the
MPC.]
In correlation analysis, the primary objective is to measure the strength or degree of
association between two variables, for example between marks on a mathematics
examination and marks on a statistics examination, between smoking and lung
cancer, and so on. In regression analysis, we are not primarily interested in such a
measure. Instead, we try to estimate or predict the average value of one variable on
the basis of the fixed values of other variables. Thus, we want to know whether we
can predict the average mark on a mathematics examination by knowing a student's
marks on a statistics examination.
In regression analysis, there is an asymmetry in the way the dependent and
explanatory variables are treated. The dependent variable is assumed to be
statistical, random or stochastic, that is, to have a probability distribution. On the
other hand, the explanatory variables are assumed to have fixed values (in repeated
sampling). But in correlation analysis we treat any two variables symmetrically;
there is no distinction between the dependent and explanatory variables. The
correlation between marks of mathematics and statistics examinations is the same
as that between marks on statistics and mathematics examinations. Moreover, both
variables are assumed to be random. Most of the correlation theory is based on the
assumption of randomness of variables, whereas most of the regression theory is
based on the assumption that the dependent variable is stochastic but the
explanatory variables are fixed or non-stochastic.
1.5 Two-Variable Regression Analysis
As noted above, regression analysis is largely concerned with estimating
and/or predicting the population mean or average value of the dependent variable
on the basis of the known values of the explanatory variable. We start with the
simple linear regression model, that is, the relationship between two variables, one
dependent and one explanatory, related by a linear function. If we are studying
the dependence of a variable on only a single explanatory variable, such as
consumption expenditure on income, the study is known as simple or two-variable
regression analysis.
Suppose we are interested in studying the relationship between weekly
consumption expenditure Y and weekly after-tax or disposable family income X.
More specifically, we want to predict the population mean level of weekly
consumption expenditure knowing the family's weekly income. For each of the
conditional probability distributions of Y we can compute its mean or average value,
known as the conditional mean or conditional expectation, denoted E(Y/X = Xi) and
read as the expected value of Y given that X takes the specific value Xi, which for
simplicity is written E(Y/Xi). An expected value is simply a population mean or
average value.
Each conditional mean E(Y/Xi) will be a function of Xi. Symbolically,
E(Y/Xi) = f (Xi)
(1.3)
Equation (1.3), known as the population regression function (PRF), states that the
expected value of the distribution of Y given Xi is functionally related to Xi. In other
words, it tells how the mean or average response of Y varies with X. The form of the
function f(Xi) is important because in real situations we do not have the entire
population available for examination. Therefore, the functional form of the PRF is an
empirical question. For example, an economist might hypothesize that consumption
expenditure is linearly related to income. Thus, we assume that the PRF E(Y/Xi) is a
linear function of Xi, say of the type
E(Y/Xi) = β1 + β2Xi
(1.4)
where β1 and β2 are unknown but fixed parameters known as the regression
coefficients; β1 and β2 are also known as the intercept and slope coefficients
respectively. Equation (1.4) is known as the linear population regression function or
simply the linear population regression. Some alternative expressions used are
linear population regression model or linear population regression equation. In
regression analysis our interest is in estimating PRFs like equation (1.4), that is,
estimating the values of the unknowns β1 and β2 on the basis of observations on Y
and X.
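To make the idea of a conditional mean concrete, the following short Python sketch (not part of the original text) computes E(Y/Xi) for a small, purely hypothetical population of families grouped by income level; all income and consumption figures are invented for illustration.

    # Conditional means E(Y/X = Xi) for a hypothetical population.
    # Each key is a weekly income level X; each list holds the weekly
    # consumption expenditures Y of all families at that income level.
    population = {
        80: [55, 60, 65, 70, 75],
        100: [65, 70, 74, 80, 85, 88],
        120: [79, 84, 90, 94, 98],
        140: [80, 93, 95, 103, 110],
    }

    for x, ys in sorted(population.items()):
        print(f"E(Y/X={x}) = {sum(ys) / len(ys):.1f}")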
1.6 The Meaning of the term Linear
The term linear can be interpreted in two different ways, namely, linearity in
the variables and linearity in the parameters. The first and perhaps more natural
meaning of linearity is that the conditional expectation of Y is a linear function of Xi,
such as equation (1.4). Geometrically, the regression curve in this case is a straight
line. In this interpretation, a regression function such as E(Y/Xi) = β1 + β2Xi² is not
a linear function because the variable X appears with a power of 2.
The second interpretation of linearity, that is, linearity in the parameters, is that
the conditional expectation of Y, E(Y/Xi), is a linear function of the parameters, the
βs; it may or may not be linear in the variable X. In this interpretation
E(Y/Xi) = β1 + β2Xi² is a linear regression model, but E(Y/Xi) = β1 + √β2 Xi is not.
The latter is an example of a regression model that is nonlinear in the parameters.
Of the two interpretations of linearity, linearity in the parameters is relevant
for the development of regression theory. Thus, for our analysis, the term linear
regression will always mean a regression that is linear in the parameters, the βs;
that is, the parameters are raised to the first power only. It may or may not be
linear in the explanatory variables, the Xs. It can be noted that E(Y/Xi) = β1 + β2Xi
is linear in both the parameters and the variable.
1.7 Stochastic Specification of PRF
The stochastic nature of the regression model implies that for every value of X
there is a whole probability distribution of values of Y. In other words, the value of Y
can never be predicted exactly. The form of equation (1.4) implies that the
relationship between consumption expenditure and income is exact; that is, all the
variation in Y is due solely to changes in X, and there are no other factors affecting
the dependent variable. In reality, given the income level Xi, an individual family's
consumption expenditure will cluster around the average consumption of all
families at that Xi, that is, around the conditional expectation. Therefore, we can
express the deviation of an individual Yi around its expected value as follows.
ui = Yi − E(Y/Xi)   (1.5)
or
Yi = E(Y/Xi) + ui   (1.6)
Taking the expected value of equation (1.6) on both sides conditional on Xi, we get
E(Yi/Xi) = E[E(Y/Xi)] + E(ui/Xi) = E(Y/Xi) + E(ui/Xi)   (1.7)
since the expected value of a constant is that constant itself. Again, as E(Yi/Xi) is
the same thing as E(Y/Xi), equation (1.7) implies that
E(ui/Xi) = 0   (1.8)
Thus, the assumption that the regression line passes through the conditional
mean of Y implies that the conditional mean values of u i (conditional upon the given
Xs) are zero. The stochastic specification clearly shows that there are other
variables besides income that affect consumption expenditure, and that an
individual family's consumption expenditure cannot be fully explained by the
variables included in the regression model.
1.8 The Significance of the Stochastic Disturbance Term
As noted above, the disturbance term ui is a substitute for all those variables that
are omitted from the model but that collectively affect Y. The reasons for not
introducing those variables into the model, and hence the significance of the
stochastic disturbance term ui, are explained below.
a) Vagueness of theory
The theory determining the behaviour of Y may be incomplete. We might know
for certain that weekly income X influences weekly consumption expenditure Y, but
we might be ignorant or unsure about other variables affecting Y. Therefore, u i may
be used as a substitute for all the excluded or omitted variables from the model.
b) Unavailability of Data
Even if we know some of the excluded variables and therefore consider a multiple
regression rather than a simple regression, we may not have quantitative
information about these variables. For example, we could introduce family wealth as
an explanatory variable in addition to the income variable to explain family
consumption expenditure. But unfortunately, information about family wealth is not
generally available.
c) Core variables vs. Peripheral variables
Assume in the consumption-income example that besides income X1, the
number of children per family X2, sex X3, religion X4, and education X5 also affect
consumption expenditure. But it is quite possible that the joint influence of all or
some of these variables is so small and non-systematic that it is better treated as
random. Thus, as a practical matter and for cost considerations, we do not
introduce them into the model explicitly. Their combined effect can be treated as a
random variable ui.
d) Intrinsic Randomness in Human Behaviour
Human behaviour is not predictable. Even if we succeed in introducing all the
relevant variables into the model, there is bound to be some intrinsic randomness in
individual Ys that cannot be explained no matter how hard we try. The
disturbances, the u's, may very well reflect this intrinsic randomness.
e) Poor Proxy Variables
Although the regression model assumes that the variables Y and X are measured
accurately, in practice the data may be plagued by errors of measurement. The
deviations of the points from the true regression line may be due to errors of
measurement of the variables, which are inevitable due to the methods of collecting
and processing statistical information. The disturbance term u i also represents the
errors of measurement.
f) Principle of Parsimony
Following Occam's razor, which states that descriptions should be kept as simple as
possible until proved inadequate, we would like to keep our regression model as
simple as possible. If we can explain the behaviour of Y substantially with two or
three explanatory variables and if our theory is not strong enough to suggest that
other variables might be included, there is no need to introduce other variables. Let
ui represent all other variables.
1.9 The Sample Regression Function (SRF)
In practice we rarely have the entire population; instead we have a sample of Y
values corresponding to some fixed Xs, and the PRF must be estimated from this
sample information. The sample counterpart of the PRF may be written as
Ŷi = β̂1 + β̂2Xi   (1.9)
where Ŷi = estimator of E(Y/Xi), β̂1 = estimator of β1 and β̂2 = estimator of β2.
An estimator, also known as a sample statistic, is simply a rule or formula or
method that tells us how to estimate the population parameter from the information
provided by the sample at hand. The particular numerical value obtained by the
estimator in an application is known as an estimate. Now we can express the SRF in
its stochastic form as
Yi = β̂1 + β̂2Xi + ûi   (1.10)
where ûi denotes the (sample) residual term.
Thus, to sum up, the primary objective in the regression analysis is to estimate the
PRF
Yi = β1 + β2Xi + ui
on the basis of the SRF
Yi = β̂1 + β̂2Xi + ûi
But because of sampling fluctuations, our estimate of the PRF based on the SRF
is at best an approximate one. This approximation is shown in the figure below.
[Figure: the sample and population regression lines. Weekly consumption
expenditure (Y) is on the vertical axis and weekly income (X) on the horizontal axis.
The SRF Ŷi = β̂1 + β̂2Xi crosses the PRF E(Y/Xi) at point A; for the Xi shown, Ŷi
lies above E(Y/Xi) and ûi is the corresponding residual.]
For X = Xi, we have one sample observation Y = Yi. In terms of the SRF, the observed
Yi can be expressed as
Yi = Ŷi + ûi   (1.11)
and in terms of the PRF it can be expressed as
Yi = E(Y/Xi) + ui   (1.12)
In the figure, Ŷi overestimates the true E(Y/Xi) for the Xi shown. But at the same
time, for any Xi to the left of point A, the SRF will underestimate the true PRF. Such
over- and underestimation is inevitable because of sampling fluctuations. The
important task is to devise a rule or method that will make this approximation as
close as possible. That is, the SRF should be constructed such that β̂1 is as close as
possible to the true β1 and β̂2 is as close as possible to the true β2, even though we
never know the true β1 and β2.
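The following Python simulation (a sketch, not part of the original text) illustrates the point: holding the Xs and the PRF fixed, each fresh sample of disturbances produces a different SRF, so β̂1 and β̂2 fluctuate around the assumed true β1 and β2. All numbers here are invented for illustration.

    import random

    random.seed(1)
    beta1, beta2 = 33.75, 3.25                 # assumed "true" PRF parameters
    X = [6, 7, 8, 9, 9, 10, 10, 11, 12, 12]    # X values fixed in repeated sampling
    Xbar = sum(X) / len(X)
    Sxx = sum((x - Xbar) ** 2 for x in X)

    for sample in range(3):
        # draw a new set of disturbances and hence a new sample of Y
        Y = [beta1 + beta2 * x + random.gauss(0, 5) for x in X]
        Ybar = sum(Y) / len(Y)
        b2 = sum((x - Xbar) * (y - Ybar) for x, y in zip(X, Y)) / Sxx
        b1 = Ybar - b2 * Xbar
        print(f"sample {sample + 1}: SRF is Y_hat = {b1:.2f} + {b2:.2f} X")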
MODULE II
TWO VARIABLE REGRESSION MODEL
2.1 The Method of Ordinary Least Squares
Our task is to estimate the population regression function (PRF) on the basis
of the sample regression function (SRF) as accurately as possible. Though there are
several methods of constructing the SRF, the most popular method to estimate the
PRF from SRF is the method of Ordinary Least Squares (OLS). The method of
ordinary least squares is attributed to German mathematician Carl Friedrich Gauss.
Under certain assumptions, the method of least squares has some very attractive
statistical properties that have made it one of the most popular methods of
regression analysis.
The relationship between X and Y in the PRF is
Yi = β1 + β2Xi + ui
Since the PRF is not directly observable, we estimate it from the SRF
Yi = β̂1 + β̂2Xi + ûi
Yi = Ŷi + ûi
where Ŷi is the estimated (conditional mean) value of Yi. Now, to determine the SRF,
we have
ûi = Yi − Ŷi
   = Yi − (β̂1 + β̂2Xi)
   = Yi − β̂1 − β̂2Xi   (2.1)
which shows that the residuals ûi are simply the differences between the actual
and estimated Y values. Given n pairs of observations on Y and X, we would like to
determine the SRF in such a manner that it is as close as possible to the actual Y.
To this end, we may adopt the following criterion: choose the SRF in such a
way that the sum of the residuals Σûi = Σ(Yi − Ŷi) is as small as possible. The
rationale of this criterion is easy to understand: intuitively, the smaller the
deviations from the regression line, the better the fit of the line to the scatter of
observations. Although intuitively appealing, this is not a good criterion, as can be
seen from the following scatter diagram.
[Figure: a scatter of four observations around the SRF. The residuals at X2 and X3
are small, while those at X1 and X4 are large, yet each residual receives the same
weight in the sum Σûi.]
If we adopt the criterion of minimising Σûi, then all the residuals receive
the same weight in the sum, even though û1 and û4 are much more widely scattered
around the SRF than û2 and û3. In other words, all the residuals receive equal
importance no matter how close or how widely scattered the individual observations
are from the SRF. A consequence of this is that the algebraic sum of the ûi can be
small or even zero although the ûi are widely scattered around the SRF. That is, in
summing these deviations the positive values offset the negative values, so that the
final algebraic sum of the residuals might equal zero.
To avoid this problem the best solution is to square the deviations and
minimise the sum of squares. That is, we adopt the least squares criterion, which
states that the SRF should be chosen so as to minimise
Σûi² = Σ(Yi − Ŷi)² = Σ(Yi − β̂1 − β̂2Xi)²   (2.2)
Thus, under the least squares criterion, we try to make equation (2.2) as small as
possible, where ûi² are the squared residuals. The reason for calling this method
the least squares method is that it seeks the minimisation of the sum of the squared
deviations of the actual observations from the line. By squaring ûi, this method
gives more weight to residuals such as û1 and û4 in the figure above than to the
residuals û2 and û3. As noted previously, under the minimum Σûi criterion the sum
can be small even though the ûi are widely spread around the SRF. This is not
possible under the least squares method, because the larger the absolute value of
ûi, the larger will be ûi². A further justification of the least squares method lies in
the fact that the estimators obtained by it have some very desirable statistical
properties.
The process of differentiation yields the following equations for estimating β1 and
β2 (the formal derivation is given below): minimising
Σûi² = Σ(Yi − β̂1 − β̂2Xi)²   (2.3)
with respect to β̂1 and β̂2 gives
ΣYi = nβ̂1 + β̂2ΣXi   (2.4)
ΣXiYi = β̂1ΣXi + β̂2ΣXi²   (2.5)
where n is the sample size. These simultaneous equations are known as the
normal equations.
2.1.1 Formal Derivation of the Normal Equations
We have to minimise the function
Σûi² = Σ(Yi − β̂1 − β̂2Xi)²
with respect to β̂1 and β̂2. The necessary condition for a minimum is that the
first derivatives of the function be equal to zero. That is,
∂(Σûi²)/∂β̂1 = 0 and ∂(Σûi²)/∂β̂2 = 0
To obtain these derivatives we apply the function-of-a-function (chain) rule of
differentiation. The partial derivative with respect to β̂1 is
∂(Σûi²)/∂β̂1 = 2Σ(Yi − β̂1 − β̂2Xi)(−1) = 0
that is,
Σ(Yi − β̂1 − β̂2Xi) = 0
or
ΣYi = nβ̂1 + β̂2ΣXi   (2.4)
Similarly, the partial derivative with respect to β̂2 is
∂(Σûi²)/∂β̂2 = 2Σ(Yi − β̂1 − β̂2Xi)(−Xi) = 0
This can be written as
Σ(YiXi − β̂1Xi − β̂2Xi²) = 0
or
ΣXiYi = β̂1ΣXi + β̂2ΣXi²   (2.5)
We thus have the two normal equations
ΣYi = nβ̂1 + β̂2ΣXi   (2.4)
ΣXiYi = β̂1ΣXi + β̂2ΣXi²   (2.5)
Solving them simultaneously for β̂2 (for example by Cramer's rule), we obtain
β̂2 = (nΣXiYi − ΣXiΣYi)/(nΣXi² − (ΣXi)²)
   = (ΣXiYi − nX̄Ȳ)/(ΣXi² − nX̄²)
   = Σ(Xi − X̄)(Yi − Ȳ)/Σ(Xi − X̄)²   (2.6)
where X̄ and Ȳ are the sample means of X and Y. Writing xi = Xi − X̄ and
yi = Yi − Ȳ for the deviations from the sample means, equation (2.6) can be written
compactly as
β̂2 = Σxiyi/Σxi²
To derive β̂1, let us reproduce the first normal equation, equation (2.4):
ΣYi = nβ̂1 + β̂2ΣXi
Dividing both sides by n gives Ȳ = β̂1 + β̂2X̄; that is,
β̂1 = Ȳ − β̂2X̄   (2.7)
The estimators β̂1 and β̂2 obtained in this way are known as the ordinary least
squares estimators because they are derived from the least squares principle.
Example:
To illustrate the use of the above formulae, we estimate the supply function of
commodity z from the following 12 pairs of observations.

Observation:    1   2   3   4   5   6   7   8   9  10  11  12
Quantity (Y):  69  76  52  56  57  77  58  55  67  53  72  64
Price (X):      9  12   6  10   9  10   7   8  12   6  11   8
The following is the worksheet for the estimation of the supply function of
commodity z, where xi = Xi − X̄ and yi = Yi − Ȳ.

 n   Quantity (Yi)   Price (Xi)    yi    xi   xiyi   xi²
 1        69             9          6     0      0     0
 2        76            12         13     3     39     9
 3        52             6        -11    -3     33     9
 4        56            10         -7     1     -7     1
 5        57             9         -6     0      0     0
 6        77            10         14     1     14     1
 7        58             7         -5    -2     10     4
 8        55             8         -8    -1      8     1
 9        67            12          4     3     12     9
10        53             6        -10    -3     30     9
11        72            11          9     2     18     4
12        64             8          1    -1     -1     1

n = 12, ΣYi = 756, ΣXi = 108, Σyi = 0, Σxi = 0, Σxiyi = 156, Σxi² = 48

Ȳ = ΣYi/n = 756/12 = 63
X̄ = ΣXi/n = 108/12 = 9
β̂2 = Σxiyi/Σxi² = 156/48 = 3.25
β̂1 = Ȳ − β̂2X̄ = 63 − (3.25)(9) = 33.75
Thus the estimated supply function is Ŷi = 33.75 + 3.25Xi.
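As a quick check, the hand computation above can be reproduced in a few lines of Python (this sketch is not part of the original text; it simply applies the formulae β̂2 = Σxiyi/Σxi² and β̂1 = Ȳ − β̂2X̄ to the 12 observations).

    Y = [69, 76, 52, 56, 57, 77, 58, 55, 67, 53, 72, 64]   # quantity
    X = [9, 12, 6, 10, 9, 10, 7, 8, 12, 6, 11, 8]          # price
    n = len(Y)

    Xbar, Ybar = sum(X) / n, sum(Y) / n
    Sxy = sum((x - Xbar) * (y - Ybar) for x, y in zip(X, Y))   # = 156
    Sxx = sum((x - Xbar) ** 2 for x in X)                      # = 48

    b2 = Sxy / Sxx           # slope: 156/48 = 3.25
    b1 = Ybar - b2 * Xbar    # intercept: 63 - 3.25*9 = 33.75
    print(f"Y_hat = {b1:.2f} + {b2:.2f} X")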
Numerical properties are those that hold as a consequence of the use of ordinary
least squares, regardless of how the data were generated. The following are the
numerical properties of OLS estimators.
I. The OLS estimators are expressed solely in terms of the observable (that is,
sample) quantities of X and Y; therefore they can be easily computed.
II. The OLS estimators are point estimators. That is, given the sample, each
estimator provides only a single (point) value of the relevant population parameter.
III. Once the OLS estimates are obtained from the sample data, the sample
regression line can be easily drawn. The regression line thus obtained has the
following properties:
1) It passes through the sample means of Y and X. This fact is obvious
from equation (2.7), which can also be written as Ȳ = β̂1 + β̂2X̄.
[Figure: the SRF passes through the point of sample means (X̄, Ȳ).]
2) The mean value of the estimated Y, that is, the mean of the Ŷi, is equal to the
mean value of the actual Y.
Proof:
Ŷi = β̂1 + β̂2Xi
Since β̂1 = Ȳ − β̂2X̄, we get
Ŷi = (Ȳ − β̂2X̄) + β̂2Xi = Ȳ + β̂2(Xi − X̄)
Applying summation over the sample,
ΣŶi = nȲ + β̂2Σ(Xi − X̄)
Dividing by n, we have
ΣŶi/n = Ȳ + β̂2Σ(Xi − X̄)/n
Since Σ(Xi − X̄) = 0, we get ΣŶi/n = Ȳ; that is, the mean of the Ŷi equals Ȳ.
3) The mean value of the residuals, ûi, is zero.
Proof:
From the derivation of the first normal equation we have
−2Σ(Yi − β̂1 − β̂2Xi) = 0
that is, Σûi = 0, and hence the mean of the residuals, ū = Σûi/n, is zero.
Next, recall the stochastic form of the SRF, equation (1.10):
Yi = β̂1 + β̂2Xi + ûi
Summing both sides over the sample gives
ΣYi = nβ̂1 + β̂2ΣXi + Σûi   (2.8)
Since Σûi = 0, dividing by n we have
Ȳ = β̂1 + β̂2X̄   (2.9)
Note that equation (2.9) is the same as equation (2.7). Subtracting (2.9) from (1.10),
we obtain
Yi − Ȳ = β̂2(Xi − X̄) + ûi
or
yi = β̂2xi + ûi   (2.10)
where yi and xi are deviations from their respective sample mean values. Equation
(2.10) is known as the deviation form. Note that the intercept term β̂1 is absent; yet
we can find it from equation (2.7). In the deviation form, the SRF can be written as
ŷi = β̂2xi   (2.11)
4) The residuals ûi are uncorrelated with the predicted Yi; that is, ΣŶiûi = 0 or, in
deviation form, Σŷiûi = 0.
Proof:
Σŷiûi = β̂2Σxiûi
      = β̂2Σxi(yi − β̂2xi)
      = β̂2Σxiyi − β̂2²Σxi²
Since β̂2 = Σxiyi/Σxi², we have β̂2Σxiyi = β̂2²Σxi², and therefore
Σŷiûi = 0
5) The residuals ûi are uncorrelated with Xi; that is, ΣûiXi = 0.
Proof:
From the derivation of the second normal equation,
−2Σ(Yi − β̂1 − β̂2Xi)(Xi) = 0
that is, −2ΣûiXi = 0, or ΣûiXi = 0.
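These numerical properties are easy to verify on the supply-function example. The following Python sketch (not part of the original text) computes the residuals from the fitted line Ŷi = 33.75 + 3.25Xi and checks that the sums just derived are zero (up to rounding).

    Y = [69, 76, 52, 56, 57, 77, 58, 55, 67, 53, 72, 64]
    X = [9, 12, 6, 10, 9, 10, 7, 8, 12, 6, 11, 8]
    b1, b2 = 33.75, 3.25                         # OLS estimates from the example

    Y_hat = [b1 + b2 * x for x in X]             # fitted values
    u_hat = [y - yh for y, yh in zip(Y, Y_hat)]  # residuals

    print(sum(u_hat))                                   # property 3: sum u_hat = 0
    print(sum(u * x for u, x in zip(u_hat, X)))         # property 5: sum u_hat*X = 0
    print(sum(u * yh for u, yh in zip(u_hat, Y_hat)))   # property 4: sum u_hat*Y_hat = 0
    print(sum(Y_hat) / len(Y_hat), sum(Y) / len(Y))     # property 2: equal means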
2.3 The Classical Linear Regression Model: The Assumptions Underlying the
Method of Least Squares
If our objective were to estimate β1 and β2 only, the method of OLS would be
sufficient. But our aim is not only to obtain β̂1 and β̂2 but also to draw inferences
about the true β1 and β2. We would like to know how close β̂1 and β̂2 are to their
counterparts in the population, or how close Ŷi is to the true E(Y/Xi). The PRF
Yi = β1 + β2Xi + ui shows that Yi depends on both Xi and ui. Therefore, unless we
are specific about how Xi and ui are generated, there is no way we can make any
statistical inference about Yi, β1 and β2. The classical linear regression model,
which is the cornerstone of most econometric theory, makes 10 assumptions, which
are explained below.
1. Linear Regression Model
The regression model is linear in the parameters, as shown in the PRF
Yi = β1 + β2Xi + ui.
2. X Values are fixed in repeated sampling
Values taken by the regressor X are considered fixed in repeated samples. More
technically, X is assumed to be nonstochastic. Our regression analysis is
conditional regression analysis, that is, conditional upon the given values of the
regressor X. The Xi's are a set of fixed values in the hypothetical process of repeated
sampling which underlies the linear regression model. This means that, in taking a
large number of samples on Y and X, the Xi values are the same in all the samples,
but the ui and Yi differ from sample to sample.
3. Zero mean value of the disturbance ui
Given the value of X, the mean or expected value of the random disturbance term
ui is zero. Technically, the conditional mean value of ui is zero. Symbolically, we have
E(ui/Xi) = 0   (2.12)
The equation states that the mean value of ui conditional upon the given Xi is
zero. This means that for each X, ui may assume various values, some greater than
zero and some smaller than zero; but if we consider all possible values of ui for any
given value of X, they average out to zero. This assumption implies that
E(Yi/Xi) = β1 + β2Xi.
4. Homoscedasticity or equal variance of ui
Given the value of X, the variance of ui is the same for all observations; that is, the
conditional variances of ui are identical. Symbolically, we have
var(ui/Xi) = E[ui − E(ui/Xi)]² = E(ui²/Xi) = σ²   (2.13)
where var stands for variance. Equation (2.13) states that the variance of ui for each
Xi is some positive constant number equal to σ². Technically, this represents the
assumption of homoscedasticity, or equal spread, or equal variance. Stated
differently, the equation means that the Y populations corresponding to the various
X values have the same variance. The assumption also implies that the conditional
variances of Yi are homoscedastic. That is,
var(Yi/Xi) = σ²   (2.14)
(2.14)
5. No autocorrelation between the disturbances
Given any two X values, Xi and Xj (i ≠ j), the correlation between the corresponding
disturbances ui and uj is zero. Symbolically,
cov(ui, uj/Xi, Xj) = E[ui − E(ui)][uj − E(uj)] = E(uiuj) = 0   (2.15)
since E(ui) = E(uj) = 0. Here i and j are two different observations and cov means
covariance. Equation (2.15) postulates that the disturbances ui and uj are
uncorrelated. Technically, this is the assumption of no serial correlation, or no
autocorrelation: the covariance of any ui with any other uj is zero, and the value
which the random term u assumes in one period does not depend on the value it
assumed in any other period.
6. Zero covariance between ui and Xi or E(uiXi)=0
Formally,
cov(ui, Xi) = E[ui − E(ui)][Xi − E(Xi)]
           = E[ui(Xi − E(Xi))]   since E(ui) = 0
           = E(uiXi) − E(Xi)E(ui)
           = E(uiXi)   since E(ui) = 0
           = 0, by assumption   (2.16)
The above assumption states that the disturbance u and explanatory variable X
are uncorrelated. The rationale for this assumption is as follows: when we express
the PRF as Yi = β1 + β2Xi + ui, we assume that the explanatory variable X and the
disturbance u, which represents the influence of all omitted variables, have separate
and additive influences on Y. But if X and u are correlated, it is impossible to assess
their individual effects on Y.
7. The number of observations n must be greater than the number of
parameters to be estimated.
Alternatively, the number of observations n must be greater than the number of
explanatory variables. For instance, if we had only one pair of observations on Y and
X, there is no way to estimate the two unknowns, namely β1 and β2. We need at
least two pairs of observations to estimate the two unknowns.
8. Variability in X values
The X values in a given sample must not all be the same. Technically, var(X)
must be a finite positive number. If all the X values are identical, then Xi = X̄, the
denominator Σ(Xi − X̄)² in the formula for β̂2 becomes zero, and it is impossible to
estimate β2 (and hence β1).
9. The regression model is correctly specified
Alternatively, there is no specification bias or specification error in the model used
in empirical analysis. Important questions that arise in specifying the model
include: (a) What variables should be included in the model? (b) What is the
functional form of the model? Is it linear in the parameters, the variables or both?
(c) What are the probabilistic assumptions made about the Yi, the Xi and the ui
entering the model?
10. There is no perfect multicollinearity
That is, there is no perfect linear relationship among the explanatory variables. If
there is more than one explanatory variable in the relationship, it is assumed that
they are not perfectly correlated with each other. Indeed, the regressors should not
even be strongly correlated; they should not be highly multicollinear.
2.4 Properties of Least Squares Estimators: The Gauss Markov Theorem
As noted earlier, given the assumptions of classical linear regression model,
the least squares estimates possess some ideal or optimum properties. These
properties are contained in the well known Gauss Markov theorem. To understand
this theorem, first we need to consider the best linear unbiasedness properties of an
estimator, which is explained below.
An estimator is best when it has the smallest variance compared with any
other estimator obtained from other econometric methods. Symbolically, suppose
that β has two estimators, β̂ and β*. β̂ is the best if
var(β̂) < var(β*)
that is,
E[β̂ − E(β̂)]² < E[β* − E(β*)]²
An estimator β̂ is unbiased if its bias is zero, that is, E(β̂) = β. This means
that, on average over repeated samples, the estimator gives the true value of the
parameter.
An estimator is a best linear unbiased estimator (BLUE) if it is linear,
unbiased and has the smallest variance compared with all other linear unbiased
estimators of the true β. The BLU estimator has the minimum variance within the
class of linear unbiased estimators of the true β.
The ordinary least squares estimator, say β̂2, is therefore BLUE of β2 if the
following hold: it is linear, it is unbiased, and it has minimum variance among all
linear unbiased estimators. We establish each property in turn.
1) Property of Linearity
β̂2 = Σxiyi/Σxi² = Σxi(Yi − Ȳ)/Σxi² = (ΣxiYi − ȲΣxi)/Σxi² = ΣxiYi/Σxi²   since Σxi = 0
or
β̂2 = ΣkiYi,   where ki = xi/Σxi²
The values of the Xs are fixed in hypothetical repeated sampling. Hence the ki are
fixed constants from sample to sample and may be regarded as constant weights
assigned to the individual values of Y. We write
β̂2 = ΣkiYi = k1Y1 + k2Y2 + ... + knYn = f(Y)
The estimator β̂2 is thus a linear function of the Ys, that is, a linear combination of
the values of the dependent variable.
Similarly,
β̂1 = Ȳ − β̂2X̄ = ΣYi/n − X̄ΣkiYi = Σ(1/n − X̄ki)Yi
Since the X values, and hence X̄ and the ki, are fixed constants from sample to
sample, β̂1 depends only on the values of Yi; that is, β̂1 is also a linear function of
the sample values of Y.
2) Property of Unbiasedness
The property of unbiasedness of β̂1 and β̂2 can be proved by establishing that
E(β̂1) = β1 and E(β̂2) = β2; that is, over hypothetical repeated samples the
estimators equal the true parameter values on average.
We have
β̂2 = ΣkiYi = Σki(β1 + β2Xi + ui) = β1Σki + β2ΣkiXi + Σkiui
Now,
Σki = Σ(xi/Σxi²) = 0   since Σxi = 0
Similarly,
ΣkiXi = ΣxiXi/Σxi²
Since xi = Xi − X̄, we have ΣxiXi = Σ(Xi − X̄)Xi = ΣXi² − X̄ΣXi = Σ(Xi − X̄)² = Σxi²,
so that
ΣkiXi = Σxi²/Σxi² = 1
Therefore we have
β̂2 = β2 + Σkiui = β2 + Σxiui/Σxi²
Taking expectations, and noting that the ki are fixed constants while E(ui) = 0,
E(β̂2) = β2 + ΣkiE(ui) = β2
so β̂2 is unbiased.
For β̂1, recall that
β̂1 = Σ(1/n − X̄ki)Yi
Taking expectations and using E(Yi) = β1 + β2Xi, we have
E(β̂1) = Σ(1/n − X̄ki)(β1 + β2Xi)
      = β1 + β2(ΣXi/n) − β1X̄Σki − β2X̄ΣkiXi
      = β1 + β2X̄ − β2X̄   since Σki = 0 and ΣkiXi = 1
E(β̂1) = β1
so β̂1 is also unbiased.
3) Minimum Variance Property
In this section we prove the Gauss Markov theorem, which states that the
least squares estimators have the smallest variance compared with any other
linear unbiased estimator.
First consider var(β̂2). By definition,
var(β̂2) = E[β̂2 − E(β̂2)]² = E(β̂2 − β2)²
Since β̂2 − β2 = Σkiui, and since the ui are uncorrelated with common variance σ²
while the ki are fixed constants,
var(β̂2) = E(Σkiui)² = σ²Σki² = σ²/Σxi²
using Σki² = Σxi²/(Σxi²)² = 1/Σxi². Similarly,
var(β̂1) = var[Σ(1/n − X̄ki)Yi] = Σ(1/n − X̄ki)² var(Yi) = σ²Σ(1/n − X̄ki)²
        = σ²(1/n + X̄²/Σxi²)
and, since Σxi² = ΣXi² − nX̄², this can also be written as
var(β̂1) = σ²ΣXi²/(nΣxi²)
Now let β2* be any other linear estimator of β2, written as
β2* = ΣciYi
where
ci = ki + di
ki being the weights defined earlier for the OLS estimator and di an arbitrary set of
weights.
The new estimator β2* is also assumed to be an unbiased estimator of β2; that is,
E(β2*) = β2. We have
β2* = ΣciYi = Σci(β1 + β2Xi + ui) = β1Σci + β2ΣciXi + Σciui
Taking expectations, E(β2*) = β1Σci + β2ΣciXi, so unbiasedness requires
Σci = 0 and ΣciXi = 1
Since ci = ki + di while Σki = 0 and ΣkiXi = 1, these conditions imply Σdi = 0 and
ΣdiXi = 0, and hence also Σkidi = ΣdiXi/Σxi² − X̄Σdi/Σxi² = 0.
Then
var(β2*) = var(ΣciYi) = Σci² var(Yi) = σ²Σci² = σ²Σ(ki + di)²
        = σ²Σki² + σ²Σdi² + 2σ²Σkidi
        = var(β̂2) + σ²Σdi²   since Σkidi = 0
Given that the di are arbitrary constant weights, not all of them zero, the second
term is positive; that is, Σdi² > 0. Therefore,
var(β2*) > var(β̂2)
Thus, in the group of linear unbiased estimators of the true β2, the least squares
estimator has the minimum variance. In a similar way, we can prove that the least
squares intercept estimator has the minimum variance.
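A small Monte Carlo experiment makes the theorem tangible. The Python sketch below (not part of the original text) compares the OLS slope with another linear unbiased estimator, the slope through two of the sample points; both estimators centre on the true β2, but OLS shows the smaller sampling variance. The true parameter values and error scale are assumed for illustration.

    import random

    random.seed(0)
    X = [9, 12, 6, 10, 9, 10, 7, 8, 12, 6, 11, 8]   # fixed in repeated sampling
    beta1, beta2, sigma = 33.75, 3.25, 2.0          # assumed "true" values
    Xbar = sum(X) / len(X)
    Sxx = sum((x - Xbar) ** 2 for x in X)

    ols, other = [], []
    for _ in range(5000):
        Y = [beta1 + beta2 * x + random.gauss(0, sigma) for x in X]
        Ybar = sum(Y) / len(Y)
        ols.append(sum((x - Xbar) * (y - Ybar) for x, y in zip(X, Y)) / Sxx)
        other.append((Y[1] - Y[2]) / (X[1] - X[2]))   # uses only X = 12 and X = 6

    def mean(v): return sum(v) / len(v)
    def var(v):
        m = mean(v)
        return sum((u - m) ** 2 for u in v) / len(v)

    print(f"OLS:   mean {mean(ols):.3f}, variance {var(ols):.4f}")
    print(f"other: mean {mean(other):.3f}, variance {var(other):.4f}")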
2.5 The Coefficient of Determination: A Measure of Goodness of Fit
After the estimation of the parameters and the determination of the least
squares regression line, we need to know how good is the fit of this line to the
sample observations of Y and X. That is, we need to measure the dispersion of
observations around the regression line. This knowledge is essential: the closer the
observations are to the line, the better the goodness of fit and the better the
explanation of the variations of Y by the changes in the explanatory variables.
If all the observations were to lie on the sample regression line, we would obtain a
perfect fit, but this is rarely the case. Generally, there will be some positive ûi and
some negative ûi. What we hope for is that these residuals around the regression
line are as small as possible. The square of the correlation coefficient, known as the
coefficient of determination (r² in the two-variable case, R² in multiple regression),
is a summary measure that tells how well the sample regression line fits the data.
As a measure of goodness of fit, r² shows the percentage of the total variation of the
dependent variable Y that is explained by the independent variable X.
To compute the coefficient of determination r² we proceed as follows. Let
Σyi² = Σ(Yi − Ȳ)² = total variation of the actual Y values about their sample mean,
which may be called the Total Sum of Squares (TSS). We compute the total variation
of the dependent variable by comparing each value of Y with the mean value Ȳ and
adding all the resulting deviations. Note that in order to find the TSS we square the
simple deviations, since by definition the sum of the simple deviations of any
variable around its mean is identically equal to zero.
Σŷi² = Σ(Ŷi − Ȳ)² = variation of the estimated Y values about their mean, which
may be called the sum of squares due to regression, or explained by the regression,
or simply the Explained Sum of Squares (ESS). This is the part of the total variation
of Yi which is explained by the regression line.
Σûi² = Σ(Yi − Ŷi)² = variation of the dependent variable which is not explained by
the regression line and is attributed to the existence of the disturbance term. The
Residual Sum of Squares (RSS) is the sum of the squared residuals and gives the
total unexplained variation of the dependent variable Y around its mean.
In summary,
yi = Yi − Ȳ = (Ŷi − Ȳ) + (Yi − Ŷi) = ŷi + ûi
Squaring both sides and summing over the sample (the cross-product term Σŷiûi
vanishes by the numerical properties of OLS proved earlier),
Σyi² = Σŷi² + Σûi²   (2.17)
or
TSS = ESS + RSS
The above equation shows that the total variation in the observed Y values
about their mean value can be partitioned into two parts; one attributable to the
regression line and the other to random forces, because not all actual Y
observations lie on the fitted line.
Now, dividing equation (2.17) through by TSS on both sides, we obtain
1 = ESS/TSS + RSS/TSS
 = Σ(Ŷi − Ȳ)²/Σ(Yi − Ȳ)² + Σûi²/Σ(Yi − Ȳ)²
We now define r² as
r² = ESS/TSS = Σ(Ŷi − Ȳ)²/Σ(Yi − Ȳ)²
or, alternatively,
r² = 1 − RSS/TSS = 1 − Σûi²/Σ(Yi − Ȳ)²
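For the supply-function example, r² can be computed directly; the Python sketch below (not part of the original text) uses the fitted line Ŷi = 33.75 + 3.25Xi obtained earlier.

    Y = [69, 76, 52, 56, 57, 77, 58, 55, 67, 53, 72, 64]
    X = [9, 12, 6, 10, 9, 10, 7, 8, 12, 6, 11, 8]
    b1, b2 = 33.75, 3.25

    Ybar = sum(Y) / len(Y)
    Y_hat = [b1 + b2 * x for x in X]

    TSS = sum((y - Ybar) ** 2 for y in Y)
    ESS = sum((yh - Ybar) ** 2 for yh in Y_hat)
    RSS = sum((y - yh) ** 2 for y, yh in zip(Y, Y_hat))

    print(f"TSS = {TSS:.1f}, ESS = {ESS:.1f}, RSS = {RSS:.1f}")
    print(f"r^2 = {ESS / TSS:.3f} (check: {1 - RSS / TSS:.3f})")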
MODULE III
THE CLASSICAL NORMAL LINEAR REGRESSION MODEL
3.1 The Probability Distribution of Disturbances
For the application of the method of ordinary least squares (OLS) to the
classical linear regression model, we did not make any assumptions about the
probability distribution of the disturbances ui. The only assumptions made about ui
were that they had zero expectation, were uncorrelated and had constant variance.
With these assumptions we saw that the OLS estimators satisfy several desirable
statistical properties, such as unbiasedness and minimum variance. If our objective
is point estimation only, the OLS method will be sufficient. But point estimation is
only one aspect of statistical inference, the other being hypothesis testing.
Thus, our interest is not only in obtaining, say, β̂2 but also in using it to make
statements or inferences about the true β2. That is, the goal is not merely to obtain the
Sample Regression Function (SRF) but to use it to draw inferences about the
Population Regression Function (PRF). Since our objective is estimation as well as
hypothesis testing, we need to specify the probability distribution of disturbances u i.
In Module II we proved that the OLS estimators β̂1 and β̂2 are both linear
functions of ui, which is random by assumption. Therefore, the sampling or
probability distribution of OLS estimators will depend upon the assumptions made
about the probability distribution of ui. Since the probability distributions of these
estimators are necessary to draw inferences about their population values, the
nature of the probability distribution of ui assumes an extremely important role in
hypothesis testing.
But since the method of OLS does not make any assumption about the
probabilistic nature of ui, it is of little help in drawing inferences about the PRF
from the SRF. This gap can be filled if we assume that the u's follow some
probability distribution. In the regression context, it is usually assumed that the u's
follow the normal distribution.
3.2 The Normality Assumption
The classical normal linear regression model assumes that each u i is
distributed normally with
Mean: E(ui) = 0   (3.1)
Variance: var(ui) = E(ui²) = σ²   (3.2)
cov(ui, uj) = 0,  i ≠ j   (3.3)
These three assumptions can be stated more compactly as
ui ~ N(0, σ²)   (3.4)
where ~ means "distributed as" and N stands for the normal distribution, the terms
in the parentheses representing the two parameters of the distribution, namely the
mean and the variance. Thus ui is normally distributed around a zero mean with a
constant finite variance σ². For each ui there is a distribution of the type (3.4): small
values of u have a higher probability of being observed than large values, and
extreme values of u become more and more unlikely the more extreme we get.
For two normally distributed variables zero covariance or correlation means
independence of the two variables. Therefore, with the normality assumption,
equation (3.3) means that ui and uj are not only uncorrelated but also independently
distributed. Therefore, we can write equation (3.4) as
ui ~ NID(0, σ²)   (3.5)
(3.5)
where NID stands for normally and independently distributed. There are several
reasons for the use of the normality assumption, which are summarised below.
1) As noted earlier, ui represents the combined influence of a large number of
independent variables that are not explicitly introduced in the regression
model. We hope that the influence of these omitted or neglected variables is
small or at best random. By the central limit theorem of statistics it can be
shown that if there are large number of independent and identically
distributed random variables, then, with few exceptions, the distribution of
their sum tends to a normal distribution as the number of such variables
increases indefinitely. It is this central limit theorem that provided a
theoretical justification for the assumption of normality of u i.
2) A variant of the central limit theorem states that even if the number of variables
is not very large, or if these variables are not strictly independent, their sum
may still be normally distributed.
3) With the normality assumption, the probability distribution of the OLS
estimators can be easily derived because one property of the normal
distribution is that any linear function of normally distributed variables is
itself normally distributed.
4) The normal distribution is a comparatively simple distribution involving only
two parameters, namely mean and variance.
5) The assumption of normality is necessary for conducting the statistical tests
of significance of the parameter estimates and for constructing confidence
intervals. If this assumption is violated, the estimators β̂1 and β̂2 are still
unbiased and best, but we cannot assess their statistical reliability by the
classical tests of significance, because the latter are based on the normal
distribution.
3.3 Properties of OLS Estimators under the Normality Assumption
With the assumptions of normality the OLS estimators have the following properties
1. They are unbiased.
2. They have the minimum variance. Combined with property 1, this means that
they are minimum-variance unbiased or efficient estimators.
3. As the sample size increases indefinitely, the estimators converge to their
population values. That is, they are consistent.
4. β̂1 is normally distributed with
mean: E(β̂1) = β1
variance: var(β̂1) = σ²ΣXi²/(nΣxi²)
or, more compactly,
β̂1 ~ N(β1, var(β̂1))   (3.6)
Then, by the properties of the normal distribution, the variable Z defined as
Z = (β̂1 − β1)/σβ̂1 follows the standard normal distribution.
5. β̂2 is normally distributed with
mean: E(β̂2) = β2
variance: var(β̂2) = σ²/Σxi²
or, more compactly,
β̂2 ~ N(β2, var(β̂2))   (3.7)
and Z = (β̂2 − β2)/σβ̂2 likewise follows the standard normal distribution.
6. (n − 2)σ̂²/σ² is distributed as χ² (chi-square) with (n − 2) degrees of freedom.
3.4 The Method of Maximum Likelihood (ML)
A method of point estimation with some stronger theoretical properties than OLS is
the method of maximum likelihood, which chooses the values of the unknown
parameters that make the probability of observing the given sample as large as
possible. The function which defines the joint (total) probability of any sample being
observed is called the likelihood function of the variable X.
For n independent observations from a normal population with mean μ and
variance σ², the general expression of the likelihood function is
L(X1, X2, ..., Xn; μ, σ²) = f(X1) f(X2) ... f(Xn)   (3.8)
where
f(Xi) = (1/(σ√(2π))) exp[−(Xi − μ)²/(2σ²)]   (3.9)
is the density function of a normally distributed variable with the given mean and
variance. In the regression model Yi = β1 + β2Xi + ui with ui ~ N(0, σ²), each Yi is
normally distributed with mean β1 + β2Xi and variance σ², so substituting equation
(3.9) into (3.8) gives the joint density of the Ys:
f(Y1, Y2, ..., Yn) = (1/(σ√(2π)))ⁿ exp[−(1/(2σ²)) Σ(Yi − β1 − β2Xi)²]   (3.10)
If Y1, Y2, ..., Yn are known or given, but β1, β2 and σ² are not known, the function
in (3.10) is called a likelihood function, denoted LF(β1, β2, σ²) and written as
LF(β1, β2, σ²) = (1/(σ√(2π)))ⁿ exp[−(1/(2σ²)) Σ(Yi − β1 − β2Xi)²]   (3.11)
Taking logarithms of (3.11),
log LF = −n log σ − (n/2) log(2π) − (1/2) Σ(Yi − β1 − β2Xi)²/σ²
or
log LF = −(n/2) log σ² − (n/2) log(2π) − (1/2) Σ(Yi − β1 − β2Xi)²/σ²
The method of maximum likelihood consists in maximising this function with
respect to β1, β2 and σ². Differentiating partially,
∂ log LF/∂β1 = (1/σ²) Σ(Yi − β1 − β2Xi)
∂ log LF/∂β2 = (1/σ²) Σ(Yi − β1 − β2Xi)Xi
Setting these derivatives equal to zero and denoting the ML estimators by β̃1 and
β̃2, we obtain
ΣYi = nβ̃1 + β̃2ΣXi
ΣYiXi = β̃1ΣXi + β̃2ΣXi²
which are exactly the normal equations of least squares theory. Therefore, the ML
estimators, the β̃s, are the same as the OLS estimators, the β̂s. Differentiating
log LF with respect to σ² and substituting the ML estimators gives the ML estimator
of the variance,
σ̃² = (1/n) Σ(Yi − β̃1 − β̃2Xi)² = Σûi²/n   (3.12)
which differs from the OLS estimator σ̂² = Σûi²/(n − 2); the ML estimator is biased
in small samples, though the bias disappears as n increases.
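The equivalence of ML and OLS can be checked numerically. The Python sketch below (not part of the original text) maximises the log-likelihood for the supply-function data by general-purpose optimisation; scipy is assumed to be available, and the log of σ² is optimised to keep the variance positive. The result should match the OLS values β̂1 = 33.75 and β̂2 = 3.25, with σ̃² = RSS/n.

    import math
    from scipy.optimize import minimize

    Y = [69, 76, 52, 56, 57, 77, 58, 55, 67, 53, 72, 64]
    X = [9, 12, 6, 10, 9, 10, 7, 8, 12, 6, 11, 8]
    n = len(Y)

    def neg_log_lf(params):
        b1, b2, log_s2 = params          # optimise log(sigma^2), keeps s2 > 0
        s2 = math.exp(log_s2)
        rss = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))
        return 0.5 * n * math.log(2 * math.pi * s2) + rss / (2 * s2)

    res = minimize(neg_log_lf, x0=[0.0, 1.0, 3.0], method="Nelder-Mead",
                   options={"maxiter": 20000, "xatol": 1e-9, "fatol": 1e-12})
    b1, b2, log_s2 = res.x
    print(f"ML: b1 = {b1:.2f}, b2 = {b2:.2f}, sigma2 = {math.exp(log_s2):.2f}")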
3.6 Confidence Intervals for the Regression Coefficients
3.6.1 Confidence Interval for β2
(1) If σ² is known
With the normality assumption, β̂2 ~ N(β2, σ²/Σxi²), and hence the variable
Z = (β̂2 − β2)/σβ̂2 = (β̂2 − β2)√(Σxi²)/σ   (3.13)
is a standard normal variable. For a chosen level of significance α, the normal table
gives the critical value Zα/2 such that
P(Z > Zα/2) = α/2   (3.14)
P(Z < −Zα/2) = α/2   (3.15)
so that
P(−Zα/2 < Z < Zα/2) = 1 − α   (3.16)
Substituting for Z,
P(−Zα/2 < (β̂2 − β2)/σβ̂2 < Zα/2) = 1 − α   (3.17)
Multiplying through by σβ̂2 and rearranging so that β2 lies in the middle,
P(β̂2 − Zα/2 σβ̂2 < β2 < β̂2 + Zα/2 σβ̂2) = 1 − α   (3.18)
Therefore, [β̂2 − Zα/2 σβ̂2, β̂2 + Zα/2 σβ̂2] is the 100(1 − α)% confidence interval
for β2   (3.19)
Substituting σβ̂2 = σ/√(Σxi²), as σ is known, we get the operational interval
[β̂2 − Zα/2 σ/√(Σxi²), β̂2 + Zα/2 σ/√(Σxi²)]   (3.20)
(2) If σ² is unknown
When σ² is not known, we replace it by its unbiased estimator
σ̂² = Σûi²/(n − 2)   (3.21)
That is, we use the variable
t = (β̂2 − β2)/se(β̂2)   (3.22)
or
t = (β̂2 − β2)√(Σxi²)/σ̂   (3.23)
which follows the t distribution with (n − 2) degrees of freedom. Hence
P(−tα/2 < t < tα/2) = 1 − α   (3.24)
where tα/2 is the critical t value at the α/2 level of significance for (n − 2) df.
Substituting for t,
P[−tα/2 < (β̂2 − β2)/se(β̂2) < tα/2] = 1 − α   (3.25)
Multiplying through by se(β̂2) and rearranging, we have
P[β̂2 − tα/2 se(β̂2) < β2 < β̂2 + tα/2 se(β̂2)] = 1 − α   (3.26)
Therefore, the 100(1 − α)% confidence interval for β2 is
[β̂2 − tα/2 se(β̂2), β̂2 + tα/2 se(β̂2)]   or   β̂2 ± tα/2 se(β̂2)   (3.27)
3.6.2 Confidence Interval for β1
By following the same procedure, we can obtain the confidence interval for β1:
P[β̂1 − tα/2 se(β̂1) < β1 < β̂1 + tα/2 se(β̂1)] = 1 − α
Therefore, the 100(1 − α)% confidence interval for β1 is
[β̂1 − tα/2 se(β̂1), β̂1 + tα/2 se(β̂1)]   or   β̂1 ± tα/2 se(β̂1)   (3.28)
An important feature of the confidence interval given in (3.27) and (3.28) may
be noted. In both the cases, the width of the confidence interval is proportional to
the standard error of the estimator. That is, the larger the standard error, the larger
is the width of the confidence interval. Put differently, the larger is the standard
error of the estimator, the greater is the uncertainty of estimating the true value of
the unknown parameter. Thus, the standard error of an estimator is often described
as a measure of the precision of the estimator. That is, how precisely the estimator
measures the true population value.
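Using the supply-function example, the 95% confidence intervals can be computed as follows (a sketch, not part of the original text; scipy is assumed for the t critical value).

    import math
    from scipy.stats import t as t_dist

    Y = [69, 76, 52, 56, 57, 77, 58, 55, 67, 53, 72, 64]
    X = [9, 12, 6, 10, 9, 10, 7, 8, 12, 6, 11, 8]
    n = len(Y)
    b1, b2 = 33.75, 3.25                        # OLS estimates from the example

    Xbar = sum(X) / n
    Sxx = sum((x - Xbar) ** 2 for x in X)       # = 48
    rss = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))
    s2 = rss / (n - 2)                          # sigma_hat^2, equation (3.21)

    se_b2 = math.sqrt(s2 / Sxx)
    se_b1 = math.sqrt(s2 * sum(x * x for x in X) / (n * Sxx))
    t_crit = t_dist.ppf(0.975, n - 2)           # two-tailed, alpha = 0.05

    print(f"beta2: {b2:.2f} +/- {t_crit * se_b2:.2f}")
    print(f"beta1: {b1:.2f} +/- {t_crit * se_b1:.2f}")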
3.6.3 Confidence Interval for σ²
As pointed out in the properties of OLS estimators under the normality
assumption (Property 6), the variable
χ² = (n − 2)σ̂²/σ²   (3.29)
follows the χ² distribution with (n − 2) degrees of freedom. Therefore we can use
the χ² distribution to establish a confidence interval for σ²:
P[χ²1−α/2 ≤ (n − 2)σ̂²/σ² ≤ χ²α/2] = 1 − α   (3.30)
where χ²α/2 and χ²1−α/2 are the critical values from the χ² table for (n − 2) df.
Rearranging so that σ² lies in the middle of the inequality,
P[(n − 2)σ̂²/χ²α/2 ≤ σ² ≤ (n − 2)σ̂²/χ²1−α/2] = 1 − α   (3.31)
Therefore, [(n − 2)σ̂²/χ²α/2, (n − 2)σ̂²/χ²1−α/2] gives the 100(1 − α)% confidence
interval for σ².
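Continuing the same example, the interval (3.31) can be evaluated with scipy's chi-square quantiles (a sketch, not part of the original text; σ̂² = 38.7 is the value computed above for n = 12).

    from scipy.stats import chi2

    n, s2_hat = 12, 38.7
    lower = (n - 2) * s2_hat / chi2.ppf(0.975, n - 2)
    upper = (n - 2) * s2_hat / chi2.ppf(0.025, n - 2)
    print(f"95% CI for sigma^2: ({lower:.1f}, {upper:.1f})")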
3.7 Hypothesis Testing
The hypothesis which is tested for possible rejection under the assumption
that it is true is called the null hypothesis and is denoted by H0. Rejection of H0
that it is true is called null hypothesis and is denoted by H o. Rejection of Ho
naturally results in acceptance of some other hypothesis which is called alternative
hypothesis and is denoted by H1. The testable hypothesis is called the null
hypothesis. The term null refers to the idea that there is no difference between the
true value and the value we hypothesise. Since the null hypothesis is a testable
hypothesis, there must also exist a counter-proposition to it in order to test the
hypothesised proposition. This counter-proposition is called the alternative
hypothesis. In other words, the stated hypothesis is known as the null hypothesis;
it is usually tested against the alternative hypothesis, which is also known as the
maintained hypothesis.
The theory of hypothesis testing is concerned with developing rules or
procedures for deciding whether to reject or not reject the null hypothesis. There are
two mutually complementary approaches for devising such rules, namely confidence
interval and test of significance. Both these approaches presuppose that the variable
(statistic or estimator) under consideration has some probability distribution and
that hypothesis testing involves making statements or assertions about the values of
the parameters of such distribution.
In the confidence interval approach of hypothesis testing we construct a
100(1 − α)% confidence interval for the parameter, say β2. If the β2 hypothesised
under H0 falls within this confidence interval, we accept H0; but if it falls outside
this interval, we reject H0. When we reject the null hypothesis, we say that our
finding is statistically
significant. On the other hand, when we do not reject the null hypothesis, we say
that our finding is not statistically significant.
An alternative but complementary approach to the confidence interval method
of testing statistical hypothesis is the test of significance approach developed by R A
Fisher, Neyman and Pearson. Broadly speaking, a test of significance is a procedure
by which sample results are used to verify the truth or falsity of a null hypothesis.
The key idea of test of significance is that of a test statistic (estimator) and the
sampling distribution of such a statistic under the null hypothesis. The decision to
accept or reject H0 is made on the basis of the value of the test statistic obtained
from the data at hand. In the language of significance tests, a statistic is said to be
statistically significant if the value of the test statistic lies in the regions of rejection
(H0) or the critical region. In this case, the null hypothesis is rejected. By the same
token, a test is said to be statistically insignificant if the value of the test statistic
lies in the region of acceptance (of the null hypothesis). In this situation, the null
hypothesis is not rejected.
Thus, the first step in hypothesis testing is that of formulation of the null
hypothesis and its alternative. The next step consists of devising a criterion of test
that would enable us to decide whether the null hypothesis is to be rejected or not.
For this purpose the whole set of values of the population is divided into two
regions, namely the acceptance region and the rejection region. The acceptance
region includes the values of the population which have a high probability of being
observed and the rejection region or critical region includes those values which are
highly unlikely to be observed. Then the test is performed with reference to test
statistic. The empirical tests that are used for testing the hypothesis are called tests
of significance. If the value of the test statistic falls in the critical region, the null
hypothesis is rejected; while if the value of test statistic falls in the acceptance
region, the null hypothesis is not rejected.
The following tables summarise the test of significance approach to
hypothesis testing.
(1) Normal Test of Significance (σ² known)
Suppose β̂2 ~ N(β2, σ²/Σxi²). Then the test statistic is
Z = (β̂2 − β2*)/σβ̂2
where β2* is the hypothesised numerical value of β2.

H0           H1            Critical region (reject H0)
β2 = β2*     β2 ≠ β2*      |Z| > Zα/2
β2 ≤ β2*     β2 > β2*      Z > Zα
β2 ≥ β2*     β2 < β2*      Z < −Zα

Here |Z| means the absolute value of Z, and Zα or Zα/2 means the critical Z value at
the α or α/2 level of significance. The same procedure holds to test hypotheses
about β1.
(2) t Test of Significance (σ² unknown)
The test statistic is
t = (β̂2 − β2*)/se(β̂2)
with (n − 2) degrees of freedom.

Type        H0           H1            Critical region (reject H0)
Two tail    β2 = β2*     β2 ≠ β2*      |t| > tα/2
Right tail  β2 ≤ β2*     β2 > β2*      t > tα
Left tail   β2 ≥ β2*     β2 < β2*      t < −tα

(3) χ² Test of Significance for σ²
For testing the significance of σ², as noted earlier, we use the chi-square statistic
χ² = (n − 2)σ̂²/σ0²
which follows the χ² distribution with (n − 2) degrees of freedom, σ0² being the
hypothesised value.

H0           H1            Critical region (reject H0)
σ² = σ0²     σ² ≠ σ0²      χ² > χ²α/2 or χ² < χ²1−α/2
σ² ≤ σ0²     σ² > σ0²      χ² > χ²α
σ² ≥ σ0²     σ² < σ0²      χ² < χ²1−α
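As an illustration, the t test of H0: β2 = 0 against H1: β2 ≠ 0 for the supply example runs as follows (a sketch, not part of the original text; the standard error is the one computed earlier, and scipy supplies the p-value).

    from scipy.stats import t as t_dist

    n = 12
    b2, se_b2 = 3.25, 0.898          # estimate and standard error from above
    t_stat = (b2 - 0) / se_b2
    p_value = 2 * t_dist.sf(abs(t_stat), n - 2)
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
    # |t| exceeds the 5% critical value t(0.025, 10) = 2.23,
    # so H0: beta2 = 0 is rejected: price has a significant effect.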
Analysis of Variance
We have seen that
Σyi² = β̂2²Σxi² + Σûi²
that is, TSS = ESS + RSS: the total sum of squares is composed of the explained
sum of squares and the residual sum of squares. A study of these components of
TSS is known as the analysis of variance (ANOVA) from the regression viewpoint.
ANOVA is a statistical method developed by R A Fisher for the analysis of
experimental data.
Associated with any sum of squares is its degrees of freedom (df), that is, the
number of independent observations on which it is based. TSS has (n − 1) df
because we lose 1 df in computing the sample mean Ȳ. RSS has (n − 2) df, and ESS
has 1 df, which follows from the fact that ESS = β̂2²Σxi² is a function of β̂2 only,
since Σxi² is known. Both results hold only in the two-variable regression model.
The following table presents the various sums of squares and their associated df;
this is the standard form of the AOV table, sometimes also called the ANOVA table.
Source of variation       Sum of Squares (SS)    df       Mean Sum of Squares (MSS)
Due to regression (ESS)   Σŷi² = β̂2²Σxi²        1        β̂2²Σxi²
Due to residuals (RSS)    Σûi²                   n − 2    Σûi²/(n − 2) = σ̂²
Total (TSS)               Σyi²                   n − 1

In the table the MSS is obtained by dividing each SS by its df. From the table, let us
consider
F = (MSS of ESS)/(MSS of RSS) = β̂2²Σxi²/[Σûi²/(n − 2)]   (3.32)
If we assume that the disturbances ui are normally distributed and H0: β2 = 0
holds, it can be shown that the F of equation (3.32) follows the F distribution with
1 and (n − 2) df.
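For the supply example, the F statistic of (3.32) can be computed from the sums of squares found earlier (a sketch, not part of the original text; in the two-variable model F equals the square of the t statistic for β̂2).

    from scipy.stats import f as f_dist

    n = 12
    ESS, RSS = 507.0, 387.0          # sums of squares computed above
    F = ESS / (RSS / (n - 2))        # equation (3.32), 1 and n-2 df
    p_value = f_dist.sf(F, 1, n - 2)
    print(f"F = {F:.2f}, p = {p_value:.4f}")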
3.8 Application of Regression Analysis: The Problem of Prediction
On the basis of sample data, we obtained the sample regression
Ŷi = β̂1 + β̂2Xi
where Ŷi is the estimator of the true E(Yi). We may want to use it to predict or
forecast Y corresponding to some given level of X. There are two kinds of
predictions, namely mean prediction and individual prediction.
Mean Prediction
Given Xi = X0, the mean prediction E(Y0/X0) is
E(Y0/X0) = β1 + β2X0   (3.33)
Its point predictor is
Ŷ0 = β̂1 + β̂2X0   (3.34)
Taking expectations and using the unbiasedness of β̂1 and β̂2,
E(Ŷ0) = E(β̂1 + β̂2X0) = E(β̂1) + E(β̂2)X0 = β1 + β2X0   (3.35)
That is,
E(Ŷ0) = E(Y0/X0) = β1 + β2X0   (3.36)
so Ŷ0 is an unbiased predictor of E(Y0/X0). Its variance is
var(Ŷ0) = var(β̂1 + β̂2X0)   (3.37)
Now, using the property that var(a + b) = var(a) + var(b) + 2cov(a, b), we obtain
var(Ŷ0) = var(β̂1) + X0² var(β̂2) + 2X0 cov(β̂1, β̂2)   (3.38)
Substituting var(β̂1) = σ²ΣXi²/(nΣxi²), var(β̂2) = σ²/Σxi² and
cov(β̂1, β̂2) = −X̄σ²/Σxi², and simplifying, we get
var(Ŷ0) = σ²[1/n + (X0 − X̄)²/Σxi²]   (3.39)
Replacing the unknown σ² by its estimator σ̂², the variable
t = [Ŷ0 − (β1 + β2X0)]/se(Ŷ0)   (3.40)
follows the t distribution with (n − 2) df. The t distribution can therefore be used to
derive confidence intervals for the true E(Y0/X0) and test hypotheses about it in the
usual manner. That is,
P[β̂1 + β̂2X0 − tα/2 se(Ŷ0) < β1 + β2X0 < β̂1 + β̂2X0 + tα/2 se(Ŷ0)] = 1 − α   (3.41)
where se(Ŷ0) = √var(Ŷ0).
Individual Prediction
We may want to predict an individual Y corresponding to a given X value, say X0;
that is, we want to predict
Y0 = β1 + β2X0 + u0   (3.42)
using the same point predictor Ŷ0 = β̂1 + β̂2X0. The prediction error is
Y0 − Ŷ0 = (β1 − β̂1) + (β2 − β̂2)X0 + u0   (3.43)
Taking expectations,
E(Y0 − Ŷ0) = E(β1 − β̂1) + E(β2 − β̂2)X0 + E(u0) = 0   (3.44)
since β̂1 and β̂2 are unbiased, X0 is a fixed number and E(u0) = 0 by assumption.
The variance of the prediction error is
var(Y0 − Ŷ0) = E[(β1 − β̂1) + (β2 − β̂2)X0 + u0]²   (3.45)
Expanding the square,
var(Y0 − Ŷ0) = E[(β1 − β̂1)² + (β2 − β̂2)²X0² + u0² + 2(β1 − β̂1)(β2 − β̂2)X0
             + 2(β1 − β̂1)u0 + 2(β2 − β̂2)X0u0]   (3.46)
Using the variance and covariance formulae for β̂1 and β̂2, noting that
var(u0) = σ² and that u0 is uncorrelated with β̂1 and β̂2, and slightly rearranging
equation (3.46), we have
var(Y0 − Ŷ0) = σ²[1 + 1/n + (X0 − X̄)²/Σxi²]   (3.47)
Further, it can be shown that Y0 − Ŷ0 follows the normal distribution. Substituting
σ̂² for the unknown σ², it follows that
t = (Y0 − Ŷ0)/se(Y0 − Ŷ0)   (3.48)
also follows the t distribution, which can be used to construct a prediction interval
for an individual Y0.
MODULE IV
EXTENSION OF TWO VARIABLE REGRESSION MODEL
4.1
Introduction
Some aspects of linear regression analysis can be easily introduced within the
framework of the two-variable linear regression model that we have been
discussing so far. First we consider the case of regression through the origin, i.e., a
situation where the intercept term β1 is absent from the model. Then we consider
the question of the functional form of the linear regression model; here we consider
models that are linear in the parameters but not in the variables. Finally we
consider the question of units of measurement, i.e., how the X and Y variables are
measured and whether a change in the units of measurement affects the regression
results.
4.2 Regression through the Origin
There are occasions when the two-variable PRF assumes the following form:
Yi = β2Xi + ui   (4.1)
In this model the intercept term is absent or zero, hence the name regression
through the origin. How do we estimate models like (4.1), and what special problems
do they pose? To answer these questions, let us first write the SRF of (4.1), namely
Yi = β̂2Xi + ûi   (4.2)
so that
ûi = Yi − β̂2Xi   (4.3)
Now, applying the ordinary least squares (OLS) method to (4.2), we obtain the
following formulae for β̂2 and its variance. We want to minimise
Σûi² = Σ(Yi − β̂2Xi)²   (4.4)
with respect to β̂2. Setting the derivative d(Σûi²)/dβ̂2 = 2Σ(Yi − β̂2Xi)(−Xi) equal
to zero gives
β̂2 = ΣXiYi/ΣXi²   (4.5)
and
var(β̂2) = σ²/ΣXi²   (4.6)
To see that β̂2 is unbiased, substitute Yi = β2Xi + ui into (4.5):
β̂2 = ΣXi(β2Xi + ui)/ΣXi² = β2 + ΣXiui/ΣXi²
Note that E(β̂2) = β2. Therefore,
E(β̂2 − β2)² = E(ΣXiui/ΣXi²)²   (4.7)
Expanding the right-hand side of (4.7) and noting that the Xi are nonstochastic and
the ui are homoscedastic and uncorrelated, we obtain
var(β̂2) = E(β̂2 − β2)² = σ²/ΣXi²   (4.8)
where σ² is estimated by
σ̂² = Σûi²/(n − 1)   (4.9)
For comparison, in the model with the intercept term present, we had
β̂2 = Σxiyi/Σxi²   (4.10)
var(β̂2) = σ²/Σxi²   (4.11)
σ̂² = Σûi²/(n − 2)   (4.12)
The difference between the two sets of formulae should be obvious. In the model
with the intercept term absent, we use raw sums of squares and cross products; in
the intercept-present model, we use adjusted (from the mean) sums of squares and
cross products. Second, the degrees of freedom for computing σ̂² are (n − 1) in the
first case and (n − 2) in the second.
Although the zero-intercept model may be appropriate on occasions, there
are some features of it that need to be noted. First, Σûi, which is always
zero for the model with the intercept term, need not be zero when that term is
absent; in short, Σûi need not be zero for the regression through the origin.
Suppose, however, that we were to impose the condition Σûi = 0. Summing the SRF
Yi = β̂2Xi + ûi
over the sample and setting Σûi = 0 would give
ΣYi = β̂2ΣXi   (4.13)
or
β̂2 = ΣYi/ΣXi = Ȳ/X̄   (4.14)
But this estimator is not the same as the β̂2 of equation (4.5), although, like it, the
estimator in (4.14) is unbiased. Incidentally, note from (4.4) that, after equating the
derivative to zero, we obtain
ΣûiXi = 0   (4.15)
Summing (4.2) over the sample gives
ΣYi = β̂2ΣXi + Σûi   (4.16)
or, dividing by n,
Ȳ = β̂2X̄ + ū   (4.17)
Since Σûi, and hence ū, need not be zero, it then follows that
Ȳ need not equal the mean of the Ŷi   (4.18)
That is, the mean of the actual Y values need not be equal to the mean of the
estimated Y values; the two mean values are identical only for the intercept-present
model.
Second, r², the coefficient of determination, which is always non-negative for
the conventional model, can on occasions turn out to be negative for the
interceptless model; therefore the conventionally computed r² may not be
appropriate for the regression-through-origin model. Recall that in the conventional
model
r² = 1 − RSS/TSS   (4.19)
   = 1 − Σûi²/Σ(Yi − Ȳ)²   (4.20)
For the regression through the origin there is no guarantee that this RSS will always
be less than TSS, which suggests that RSS can be greater than TSS, implying that
r² as conventionally defined can be negative. The conventional r² is thus not
appropriate for the regression-through-origin model. But we can compute what is
known as the raw r² for such models, which is defined as
raw r² = (ΣXiYi)²/(ΣXi² ΣYi²)   (4.21)
Although this raw r² satisfies the relation 0 < raw r² < 1, it is not directly
comparable to the conventional r² value.
Because of these special features of the model, one needs to exercise great
caution in using the zero-intercept model. Unless there is a strong a priori
expectation, one would be well advised to stick to the conventional intercept-present
model.
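The contrast between the two models is easy to see on the supply data. The sketch below (not part of the original text) fits the through-origin model by (4.5), computes the raw r² of (4.21), and shows that the residuals no longer sum to zero.

    Y = [69, 76, 52, 56, 57, 77, 58, 55, 67, 53, 72, 64]
    X = [9, 12, 6, 10, 9, 10, 7, 8, 12, 6, 11, 8]

    sxy = sum(x * y for x, y in zip(X, Y))
    sxx = sum(x * x for x in X)
    b2 = sxy / sxx                                      # equation (4.5)
    raw_r2 = sxy ** 2 / (sxx * sum(y * y for y in Y))   # equation (4.21)

    resid = [y - b2 * x for x, y in zip(X, Y)]
    print(f"through-origin slope: {b2:.3f}")            # differs from 3.25
    print(f"raw r^2: {raw_r2:.4f}")
    print(f"sum of residuals: {sum(resid):.2f}   (need not be zero)")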
4.3 Functional forms of regression models
So far we have considered models that are linear in parameters as well as in
the variables. Here we consider some commonly used regression models that may be
nonlinear in the variables but are linear in the parameters or that can be made so
by suitable transformation of the variables. In particular we discuss the following
regression models
1. Log linear model
2. Semi log models
3. Reciprocal models
4.4 How to measure elasticity: the log linear model
Consider the following model, known as the exponential regression model:
Yi = β1 Xi^β2 e^ui   (4.23)
which may be expressed alternatively as
ln Yi = ln β1 + β2 ln Xi + ui   (4.24)
where ln = natural log, i.e., log to the base e (e = 2.718). Writing α = ln β1, we can
express equation (4.24) as
ln Yi = α + β2 ln Xi + ui   (4.25)
This model is linear in the parameters α and β2 and linear in the logarithms of the
variables Y and X; writing Yi* = ln Yi and Xi* = ln Xi, it can be estimated by OLS as
Yi* = α + β2Xi* + ui   (4.26)
Because of this linearity, such models are called log-log, double-log or log-linear
models.
One attractive feature of the log-log model, which has made it popular in
applied work, is that the slope coefficient β2 measures the elasticity of Y with
respect to X, that is, the percentage change in Y for a given (small) percentage
change in X. Thus, if Y represents the quantity of a commodity demanded and X its
unit price, β2 measures the price elasticity of demand.
In the two-variable model, the simplest way to decide whether the log-linear
model fits the data is to plot the scatter diagram of ln Yi against ln Xi and see
whether the scatter points lie approximately on a straight line.
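A short sketch (not part of the original text) shows the mechanics: the made-up price and quantity figures below mimic a demand curve, and regressing ln Y on ln X recovers the price elasticity directly as the slope.

    import math

    price = [2.0, 3.0, 4.0, 5.0, 6.0, 8.0]           # hypothetical X
    quantity = [96.0, 68.0, 52.0, 43.0, 37.0, 28.0]  # hypothetical Y

    lx = [math.log(p) for p in price]
    ly = [math.log(q) for q in quantity]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n

    b2 = sum((x - mx) * (y - my) for x, y in zip(lx, ly)) / \
         sum((x - mx) ** 2 for x in lx)
    b1 = my - b2 * mx
    print(f"ln Y = {b1:.2f} + {b2:.2f} ln X")
    print(f"estimated price elasticity of demand: {b2:.2f}")   # about -0.9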
4.5 Semi log models: Log Lin and Lin Log models:
4.5.1 How to measure the growth rate: the Log Lin model
Economists, business people and governments are often interested in finding
out the rate of growth of certain economic variables, such as population, GDP,
money supply, employment, etc.
Suppose we want to find out the growth rate of personal consumption
expenditure on services. Let Yt denote real expenditure on services at time t and Y0
the initial value of the expenditure on services. We may recall the well-known
compound interest formula
Yt = Y0(1 + r)^t   (4.27)
where r is the compound (that is, over time) rate of growth of Y. Taking the natural
logarithm of equation (4.27), we can write
ln Yt = ln Y0 + t ln(1 + r)   (4.28)
Now letting
β1 = ln Y0   (4.29)
β2 = ln(1 + r)   (4.30)
we can write
ln Yt = β1 + β2t   (4.31)
Adding the disturbance term, we obtain the estimable model
ln Yt = β1 + β2t + ut   (4.32)
This model is like any other linear regression model in that it is linear in the
parameters β1 and β2. The only difference is that the regressand is the logarithm of
Y and the regressor is time, which takes values of 1, 2, 3, etc.
Models like (4.31) are called semilog models because only one variable (in this
case the regressand) appears in logarithmic form. For descriptive purposes, a
model in which the regressand is logarithmic is called a log-lin model; a model
in which the regressand is linear but the regressor is logarithmic is called a lin-log
model.
Let us briefly examine the properties of the model. In this model the slope
coefficient measures the constant proportional or relative change in Y for a given
absolute change in the value of the regressor (in this case the variable t); that is,
β2 = (relative change in regressand)/(absolute change in regressor)   (4.33)
If we multiply the relative change in Y by 100, equation (4.33) will give
the percentage change, or the growth rate, in Y for an absolute change in X, the
regressor. That is, 100 times β2 gives the growth rate in Y; 100 times β2 is known
in the literature as the semielasticity of Y with respect to X.
The slope coefficient of the growth model, β2, gives the instantaneous (at a
point in time) rate of growth, not the compound (over a period of time) rate of
growth. The latter can easily be found from (4.32) by taking the antilog of the
estimated β2, subtracting 1 from it, and multiplying the difference by 100.
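The sketch below (not part of the original text) applies the log-lin model to a made-up expenditure series growing at roughly 3% per period, and reports both the instantaneous and the compound growth rates.

    import math

    Yt = [100.0, 103.2, 106.4, 109.8, 113.1, 116.7, 120.3]   # hypothetical series
    t = list(range(1, len(Yt) + 1))
    ly = [math.log(y) for y in Yt]
    n = len(t)
    mt, my = sum(t) / n, sum(ly) / n

    b2 = sum((x - mt) * (y - my) for x, y in zip(t, ly)) / \
         sum((x - mt) ** 2 for x in t)
    instantaneous = 100 * b2                   # percent per period
    compound = 100 * (math.exp(b2) - 1)        # antilog of b2, minus 1, times 100
    print(f"instantaneous: {instantaneous:.2f}%  compound: {compound:.2f}%")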
Linear trend model: instead of estimating model (4.32), researchers sometimes
estimate the following model:
Yt = β1 + β2t + ut   (4.34)
That is, instead of regressing the log of Y on time, they regress Y itself on time.
Such a model is called a linear trend model, and the time variable t is known as the
trend variable.
4.5.2 The Lin-Log Model
Consider now the model in which the regressand is linear but the regressor is
logarithmic:
Yi = β1 + β2 ln Xi + ui   (4.35)
For descriptive purposes we call such a model a lin-log model. Let us interpret
the slope coefficient. As usual,
β2 = (change in Y)/(change in ln X) = (change in Y)/(relative change in X)
The second step follows from the fact that a change in the log of a number is a
relative change. Symbolically, we have
β2 = ΔY/(ΔX/X)   (4.36)
so that
ΔY = β2(ΔX/X)   (4.38)
That is, the absolute change in Y equals β2 times the relative change in X. Such a
model may therefore be appropriate for short-run production functions.
4.7 Scaling and unit of measurement
Here we consider how the Y and X variables are measured and whether a
change in the units of measurement affects the regression results. Consider the
estimated regression
Yi = β̂1 + β̂2Xi + ûi   (4.39)
Define
Yi* = w1Yi   (4.40)
Xi* = w2Xi   (4.41)
where w1 and w2 are constants, called the scale factors; w1 may be equal to w2
or may be different. From (4.40) and (4.41) it is clear that Yi* and Xi* are rescaled
Yi and Xi. Thus, if Yi and Xi are measured in billions of dollars and one wants to
express them in millions of dollars, we will have Yi* = 1000Yi and Xi* = 1000Xi;
here w1 = w2 = 1000.
Now consider the regression using the Yi* and Xi* variables:
Yi* = β̂1* + β̂2*Xi* + ûi*   (4.42)
We want to find the relationships between the following pairs:
1. β̂1 and β̂1*
2. β̂2 and β̂2*
3. var(β̂1) and var(β̂1*)
4. var(β̂2) and var(β̂2*)
5. σ̂² and σ̂*²
6. r²xy and r²x*y*
From least squares theory we know that
β̂2 = Σxiyi/Σxi²,  β̂1 = Ȳ − β̂2X̄
var(β̂1) = σ²ΣXi²/(nΣxi²),  var(β̂2) = σ²/Σxi²,  σ̂² = Σûi²/(n − 2)
and analogous formulae, with starred quantities throughout, hold for the
regression (4.42).
From these results it is easy to establish the relationships between the two sets of
parameter estimates. All that one has to do is recall the definitional relationships:
Yi* = w1Yi (or yi* = w1yi); Xi* = w2Xi (or xi* = w2xi); ûi* = w1ûi; Ȳ* = w1Ȳ and
X̄* = w2X̄. Making use of these definitions, the reader can easily verify that
β̂2* = (w1/w2)β̂2   (4.43)
β̂1* = w1β̂1   (4.44)
σ̂*² = w1²σ̂²   (4.45)
var(β̂1*) = w1² var(β̂1)   (4.46)
var(β̂2*) = (w1/w2)² var(β̂2)   (4.47)
r²xy = r²x*y*   (4.48)
From the preceding results, it should be clear that, given the regression
results based on one scale of measurement, one can derive the results based on
another scale of measurement once the scaling factors, the ws, are known. In
practice, though, one should choose the units of measurement sensibly; there is
little point in carrying around all those zeros when expressing numbers in millions
or billions of dollars.
From the results given in (4.43) through (4.48) one can easily derive some
special cases. For instance, if w1 = w2, that is, the scaling factors are identical, the
slope coefficient and its standard error remain unaffected in going from the (Xi, Yi)
scale to the (Xi*, Yi*) scale, which should be intuitively clear; however, the intercept
and its standard error are both multiplied by w1. If the X scale is not changed (i.e.,
w2 = 1) and the Y scale is changed by the factor w1, the slope as well as the
intercept coefficients and their respective standard errors are all multiplied by the
same w1 factor. Finally, if the Y scale remains unchanged (i.e., w1 = 1) but the X
scale is changed by the factor w2, the slope coefficient and its standard error are
multiplied by the factor (1/w2), but the intercept coefficient and its standard error
remain unaffected. It should be noted that the transformation from the (Xi, Yi) scale
to the (Xi*, Yi*) scale does not affect the properties of the OLS estimators.
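The scaling results are easy to verify numerically. The sketch below (not part of the original text) rescales the supply data with the arbitrary factors w1 = 10 and w2 = 2 and checks (4.43) and (4.44).

    def ols(X, Y):
        n = len(Y)
        mx, my = sum(X) / n, sum(Y) / n
        b2 = sum((x - mx) * (y - my) for x, y in zip(X, Y)) / \
             sum((x - mx) ** 2 for x in X)
        return my - b2 * mx, b2

    Y = [69, 76, 52, 56, 57, 77, 58, 55, 67, 53, 72, 64]
    X = [9, 12, 6, 10, 9, 10, 7, 8, 12, 6, 11, 8]
    w1, w2 = 10, 2

    b1, b2 = ols(X, Y)
    b1s, b2s = ols([w2 * x for x in X], [w1 * y for y in Y])
    print(b2s, (w1 / w2) * b2)   # (4.43): slope is multiplied by w1/w2
    print(b1s, w1 * b1)          # (4.44): intercept is multiplied by w1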
A special case of scaling arises when each variable is standardised, that is,
expressed as a deviation from its sample mean and divided by its sample standard
deviation:
Yi* = (Yi − Ȳ)/SY   (4.49)
Xi* = (Xi − X̄)/SX   (4.50)
where SY and SX are the sample standard deviations of Y and X. Since the
standardised variables have zero means, the regression of Y* on X* passes through
the origin and can be written as
Yi* = β̂2*Xi* + ûi*   (4.52)
with slope
β̂2* = β̂2(SX/SY)   (4.53)
known as the beta coefficient; it measures the average change in Y, in standard
deviation units, per one standard deviation change in X.