
December 3, 2022

HARAMAYA UNIVERSITY
COLLEGE OF BUSINESS AND ECONOMICS

DEPARTMENT OF ECONOMICS
PROGRAMME: Energy Economics
Assignment for the course Econometrics and Forecasting

SUBMITTED TO: TAMESGEN KENO (PhD)

PREPARED BY: BELISA ALIYI IBRO

ID.NO.SGS.0882/12
1. The difference between economic theory, a mathematical model and an
econometric model.
Econometrics deals with the measurement of economic relationships and it is an integration of
economics, mathematical economics and statistics with an objective to provide numerical values to the
parameters of economic relationships. It is a social science in which the tools of economic theory,
mathematics and statistical inference are applied to the analysis of economic phenomena.
The following are the differences between economic theory, mathematical models and econometric
models:
Economic theories try to define the relationships between different economic variables. These
relationships can be expressed in a variety of forms, such as graphs, tables and mathematical equations,
to make them more understandable and more general.
 It shows the economic relationships between different economic variables.
 The economic model is a theoretical construct that represents a complex economic process or
association between economic variables.
 Economic models are qualitative but, by nature, they are related to mathematical models, since they
ignore residual (random) variables.
 Economic models attempt to exhibit the logical relationships between the different variables
considered in the model.
 The economic model is directly linked with the mathematical model, since in mathematical
economics economic models are expressed in quantitative form.
 The outcome of an economic model is almost certain and exact: all economic models are developed
under a set of fixed assumptions/conditions, so the outcome is also essentially fixed.
 Economic models are deterministic models; deterministic models do not include an error term.
 Economic models forecast the future values of economic variables without attaching any
uncertainty or degree of probability to them.
 Y = β1 + β2X, the Keynesian consumption function, is an example of an economic model. This
relation is deterministic.
 Economic models do not have anything to do with significance testing of the variables and
parameters.
 There is no uncertainty in the values of the variables in the model at any point in time in the case
of an economic model.
 Economic models do not allow for random elements that might affect the exact relationship and
render it stochastic in character.
 Economic models are less powerful for predicting the future.
A mathematical model is the application of mathematical methods to represent theories and analyze
problems in economics. By convention, these applied methods go beyond simple geometry and include
differential and integral calculus, difference and differential equations, matrix algebra, mathematical
programming, and other computational methods. Proponents of this approach claim that it allows the
formulation of theoretical relationships with rigor, generality, and simplicity.

Econometrics deals with the measurement of economic relationships. It is an integration of economics,
mathematical economics and statistics with an objective to provide numerical values to the parameters
of economic relationships.
 It measures the values of the parameters of economic relationships.
 An econometric model is a combination of mathematical, statistical and economic concepts that
represents the mathematical estimation of the variables or parameters in the identified model.
 Econometric models are extensively statistical or forecast oriented and are thus based on
statistical models.
 Econometric models focus on calculating the numerical values and directions of the variables
considered in the model.
 Econometric models are also directly linked with the mathematical model, but they are used for
further empirical forecasting and extension of an economic or mathematical model.
 The econometric model includes the residual variables/uncertainty, so its outcome is not fixed
and exact, unlike that of the economic model.
 All econometric models are stochastic; econometrics treats all economic models as stochastic by
including the error term. The econometric model is therefore much more realistic.
 Econometric models forecast the future values of economic magnitudes with a certain degree
of probability.
 Y = β1 + β2X + u. This stochastic relation is an example of an econometric model.
 Econometric models require significance testing of their parameters.
 There are linear as well as non-linear relations in econometric models, so there is a possibility of
uncertainty in the value of the variables at any particular instant of time.
 Econometric models take randomness as an essential element of the model, so all economic
models appear in econometric models as probabilistic models.
 Econometric models are more powerful for predicting the future.
2. The need to incorporate the error term (stochastic part) in the
econometric model

The significance of incorporating the stochastic error term (ui) in the econometric model. The
following are the main reasons for the inclusion of the error term (ui) in the econometric model:
i. Vagueness of theory (incomplete theory):
 We may be ignorant of, or unsure about, the other variables affecting Y (consumption) besides
income.
 The theory may be incomplete in explaining the behaviour of Y.
 Thus, ui may be used as a substitute for all the variables excluded or omitted from the
model.
ii. Unavailability of data:
 The data may not contain quantitative information about these variables; hence ui may
stand in for the omitted variables.
 Data might simply not be available.
iii. Core variables versus peripheral variables:
 There may be some variables which jointly influence the dependent variable but are not explicitly
in the model; ui then tries to capture the combined effect of these variables in the model
concerned.
 A variable may be omitted because information on it is not available.
iv. Intrinsic randomness in human behaviour:
 Even if all the relevant variables are included in the model, there may be some intrinsic
randomness in human behaviour; ui may well reflect this intrinsic randomness.
 Human errors.
v. Poor proxy variables:
 Although we assume the variables Y and X are measured accurately, in practice the data may be
plagued by errors of measurement.
 The disturbance term ui may in this case also represent the errors of measurement.
 There may not be enough information to substitute better-measured variables.
vi. Principle of parsimony:
 We keep the model as simple as possible by deliberately leaving some variables out of the model;
ui then represents all the omitted variables.
 Irrelevant variables (e.g. population) are excluded from the model.
vii. Wrong functional form:
 Unfortunately, due to the unavailability of data on these variables, we may not know the form
of the functional relationship between the regressand and the regressors.
 Errors in the functional form can also be absorbed by the inclusion of the random variable ui.
For all these reasons, the stochastic disturbance term ui assumes an extremely crucial role in
regression analysis.
 Misspecification of the model.

3. The fundamental causes of the error term in the regression model.


An error term (i) is a residual variables produced by a statistical or mathematical model, which is
created when the model does not fully represents the actual relationships between the dependent
variables and independent variables. The ❑i is the stochastic term in the population regression
function. The main reasons for its existence of error are:
i. Omission of a relevant variables from the function
Some of the factors may not be known even to the person
Some factors cannot be measured statistically
Some factors are random, appearing in unpredictable way and time
Some of the factors may have a very little influence on the dependent variable
Some factors may not be having adequate and reliable data.
ii. Random behaviour of human beings
 limited knowledge of the factors which are operative in any particular case
 formidable obstacles presented by data requirements in the estimation of large models

iii. Imperfect specification of the mathematical form of the model
 Imperfections and looseness of statements in economic theories
 Wrong functional forms
 Incorrect specification of the error term
iv. Errors of aggregation
 To account for pure randomness in human behaviour
 Error terms are assumed normal
v. Errors of measurement
 Incorporation of irrelevant variables
 Inclusion of unnecessary variables

4. Properties of the OLS Estimators


One property of the OLS estimators (in simple or multiple regression) is that they minimize
the sum of squared residuals. There are several properties that the OLS estimators have.
a. Linearity
The OLS estimators are linear functions of the values of Y (the dependent variable), combined
using weights that are a nonlinear function of the values of X (the regressors or explanatory
variables). So the OLS estimator is a 'linear' estimator with respect to how it uses the values of the
dependent variable, irrespective of how it uses the values of the regressors.
Proof of the linearity of the OLS slope estimator:

Let the weights be ki = (Xi − X̄)/Σ(Xi − X̄)² for i = 1, …, N. Then β̂1 = Σ ki Yi.

Result: The OLS slope coefficient estimator is a linear function of the sample values
Yi (i = 1, …, N), where the coefficient of Yi (or yi) is ki.
Properties of the weights ki
In order to establish the remaining properties of β̂1, it is necessary to know the arithmetic
properties of the weights ki. These properties are:

Σki = 0,  ΣkiXi = 1,  Σki² = 1/Σ(Xi − X̄)².
b. Unbiasedness
Unbiasedness is one of the most desirable properties of any estimator. A desirable property of an
estimator is that its expected value equals the true value of the parameter being estimated.
The OLS estimators of β0 and β1 are unbiased if
 E(β̂1) = β1 and E(β̂0) = β0.
 Thus, on average, the estimated values of the coefficients will be equal to the true values.
 The assumption of zero covariance between x and u, or E(u|x) = 0, is crucial for the
unbiasedness of β̂0 and β̂1.

 Remember that we obtain

β̂1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²

 In terms of the population coefficients and error terms, β̂1 can be written as

β̂1 = Σ(xi − x̄)(β0 + β1xi + ui) / Σ(xi − x̄)²

Expanding the numerator,

Σ(xi − x̄)(β0 + β1xi + ui) = β0Σ(xi − x̄) + β1Σ(xi − x̄)xi + Σ(xi − x̄)ui

 Since β0Σ(xi − x̄) = 0, this gives

β̂1 = [β1Σ(xi − x̄)xi + Σ(xi − x̄)ui] / Σ(xi − x̄)²

 It can be shown that Σ(xi − x̄)xi = Σ(xi − x̄)².
 Therefore,

β̂1 = β1 + Σ(xi − x̄)ui / Σ(xi − x̄)²

 Under the assumption that the expected value of each ui conditional on x1, x2, …, xn is
zero, E[Σ(xi − x̄)ui / Σ(xi − x̄)²] = 0.
 Hence, E(β̂1) = β1.
 Unbiasedness also holds without conditioning on {x1, x2, …, xn}.
 Further, β̂0 = ȳ − β̂1x̄ = β0 + β1x̄ + ū − β̂1x̄ = β0 + (β1 − β̂1)x̄ + ū.
 Now, conditional on the values of xi,

E(β̂0) = β0 + E[(β1 − β̂1)x̄] + E(ū) = β0

 because E(ū) = 0 and E[(β1 − β̂1)x̄] = x̄·E(β1 − β̂1) = 0.
Result: The OLS intercept coefficient estimator β̂0 is an unbiased estimator of the coefficient β0.

 In short, E(β̂1) = β1 and E(β̂0) = β0.

c. Efficiency (Best): Minimum Variance


 Any set of regression estimates β̂0 and β̂1 is specific to the sample used in its
estimation. In other words, if a different sample of data were selected from the
population, the data points (xi, yi) would be different, leading to different values of the
OLS estimators.
 An estimator β̂1 of a parameter β1 is said to be efficient if no other unbiased estimator has a
smaller variance. Broadly speaking, if the estimator is efficient, we minimize
the probability that it is a long way off from the true value of β1.
 It is thus useful to know whether one can have confidence in the estimates, and whether
they are likely to vary much from one sample to another within the given
population.
 An idea of the sampling variability, and hence of the precision of the estimates, can be
calculated using any sample of data available. This estimate is given by the sampling
variances of the OLS estimators conditional on the sample values {x1, x2, …, xn}.
 The efficiency property of an estimator says that the estimator is the minimum-variance
unbiased estimator: of all the unbiased estimators of the unknown population parameter,
it has the least variance. As a result, it is more likely to give accurate results than other
estimators with higher variance.

Proof:
The variance of the OLS slope coefficient estimator β̂1 is defined as Var(β̂1) = E[β̂1 − E(β̂1)]², and
it can be shown (see question 9 below) that Var(β̂1) = σ²/Σ(xi − x̄)².
d. Asymptotic Unbiasedness
This property of OLS says that as the sample size increases, any bias of the OLS estimators
disappears.
e. Consistency
An estimator is said to be consistent if its value approaches the actual, true parameter (population)
value as the sample size increases. An estimator is consistent if it satisfies two conditions:
a. It is asymptotically unbiased
b. Its variance converges to 0 as the sample size increases.
Both these hold true for OLS estimators and, hence, they are consistent estimators. For an
estimator to be useful, consistency is the minimum basic requirement. Since there may be several
such estimators, asymptotic efficiency also is considered. Asymptotic efficiency is the sufficient
condition that makes OLS estimators the best estimators.
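As a minimal illustration of these properties, the short simulation below (an assumed numerical sketch, not part of the original assignment) repeatedly draws samples from a known model Y = β0 + β1X + u and shows that the average of the OLS estimates stays close to the true parameters (unbiasedness) while their spread shrinks as the sample size grows (consistency):

```python
import numpy as np

rng = np.random.default_rng(0)
beta0, beta1 = 2.0, 0.5                   # assumed true population parameters

def ols(x, y):
    """Simple-regression OLS estimates of the intercept and slope."""
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    return b0, b1

for n in (25, 400):                       # small vs. large sample
    draws = []
    for _ in range(5000):                 # repeated sampling
        x = rng.uniform(0, 10, n)
        u = rng.normal(0, 3, n)           # stochastic disturbance term
        y = beta0 + beta1 * x + u
        draws.append(ols(x, y))
    draws = np.array(draws)
    print(n, draws.mean(axis=0), draws.std(axis=0))
# The means stay near (2.0, 0.5), and the standard deviations fall as n rises.
```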
5. Which of the following can cause the usual OLS t-statistics to be invalid (that is,
not to have t-distributions under H0)?

a) Heteroskedasticity? Why?
b) A sample correlation coefficient of 0.95 between two independent variables that
are in the model? Why?
c) Omitting an important explanatory variable? Why?
Ordinary Least Squares:

It is a statistical technique used to estimate the coefficients of linear regression problems. Since linear
regression equations describe the relationship between one or more independent variables and a
dependent variable, ordinary least squares is well suited to obtaining the unknown parameter values in
the linear regression equation. The main objective of ordinary least squares is to minimize the
differences between the gathered observations in the data set and the values estimated by the
linear regression approximation of the existing data.
The correct answers are A and C.

The ordinary least squares t-statistics can become invalid under heteroscedasticity and when
important explanatory variables are excluded from the estimation. Heteroscedasticity violates the
homoscedasticity assumption of the classical model, and omitting a relevant explanatory variable
violates the zero conditional mean assumption, so in both cases the usual t-statistics no longer have
t-distributions under H0.

Option (b) should be ruled out: a high (but not perfect) sample correlation of 0.95 between two
independent variables does not violate any assumption of the classical linear regression model, so the
usual t-statistics remain valid (although the standard errors may be large).

Additionally, under heteroscedasticity the usual OLS standard errors are computed with the wrong
formula, so the resulting statistics do not follow t-distributions under the null hypothesis and do not
provide valid tests of hypotheses within the linear regression model.
6. The causes of heteroscedasticity and multicollinearity as two violations of the
OLS assumptions.

6.1. The causes of heteroscedasticity as violations of the OLS assumptions.


There are several reasons why the variance of the error term may be variable, some of which
are as follows.
i The presence of outliers(breakdown in the model):
 Frequently observed in the cross-sectional data sets where demographics are involved
(population, GNP, etc.).
 An outlying observation, or outlier, is an observation that is much different (either very
small or very large) in relation to the observations in the sample.
 An outlier is an observation from a different population to that generating the remaining
sample observations.
 The inclusion or exclusion of such an observation, especially if the sample size is small,
can substantially alter the results of regression analysis.
ii Incorrect specification of the regression model:
 Heteroscedasticity may be due to the fact that some important variables are omitted
from the model.

 The residual obtained from the regression may give the distinct impression that the error
variance may not be constant.
iii Skewness in the distribution of one or more regressors included in the model:
 It is well known that the distribution of income and wealth in most societies is uneven,
with the bulk of the income and wealth being owned by a few at the top.
 Examples are economic variables such as income, wealth, and education.
iv Incorrect data transformation
 E.g., ratio or first difference transformations.
v Improper (incorrect) functional form
 E.g., linear versus log–linear models.
6.2. The causes of multicollinearity as violations of the OLS
assumptions.
 Improper use of dummy variables:
 E.g. failure to exclude one category.
 Including a variable that is computed from other variables in the equation:
 E.g. family income = husband's income + wife's income, and the regression includes all three
income measures.
 In effect, including the same or almost the same variable twice:
 Height in feet and height in inches; or, more commonly, two different
operationalizations of the same identical concept.
 The data collection method employed:
 For example, sampling over a limited range of the values taken by the regressors in the
population.
 Constraints on the model or in the population being sampled:-
 For example, in the regression of electricity consumption on income and house size there
is a physical constraint in the population in that families with higher incomes generally
have larger homes than families with lower incomes.
 Model specification:-
 For example, adding polynomial terms to a regression model, especially when the range
of the X variable is small.
 An over determined model:-
 This happens when the model has more explanatory variables than the number of
observations. This could happen in medical research where there may be a small number
of patients about whom information is collected on a large number of variables.
 An additional reason for multicollinearity:
 Especially in time-series data, the regressors included in the model may share a
common trend, that is, they all increase or decrease over time. Thus, in the regression of
consumption expenditure on income, wealth, and population, the regressors income,
wealth, and population may all be growing over time at more or less the same rate,
leading to collinearity among these variables.
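A common practical check for multicollinearity (not part of the original question; shown here as an assumed illustration with hypothetical data) is the variance inflation factor, VIFj = 1/(1 − Rj²), where Rj² comes from the auxiliary regression of regressor j on the other regressors:

```python
import numpy as np

def vif(X):
    """Variance inflation factors for the columns of a regressor matrix X
    (n observations x k regressors, without the constant column)."""
    n, k = X.shape
    out = []
    for j in range(k):
        y = X[:, j]
        Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        b, *_ = np.linalg.lstsq(Z, y, rcond=None)    # auxiliary regression
        resid = y - Z @ b
        r2 = 1 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

# Hypothetical example: x2 is almost a linear combination of x1.
rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = 2 * x1 + rng.normal(scale=0.05, size=200)
x3 = rng.normal(size=200)
print(vif(np.column_stack([x1, x2, x3])))   # the first two VIFs are very large
```

Values of the VIF far above 10 are usually taken as a sign of serious collinearity.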
7. The consequences and the mechanisms of overcoming the problems of
heteroscedasticity and multicollinearity in a practical exercise.
7.1. The Consequences of heteroskedasticity for OLS estimators

If the error terms of an equation are heteroscedastic, there are several major consequences.
i The OLS estimators of the β's are still unbiased and consistent. This is because
none of the explanatory variables is correlated with the error term.
 So, a correctly specified equation that suffers only from the presence of
heteroskedasticity will still give us coefficient estimates that are relatively good.
ii Heteroskedasticity affects the distribution of the β̂'s, increasing the variances of the
distributions and therefore making the OLS estimators inefficient (it violates the
minimum variance property).
iii Heteroskedasticity also affects the variances (and therefore the standard errors) of
the estimated β̂'s.
 In fact, the presence of heteroskedasticity typically causes the OLS method to underestimate the
variances (and standard errors), leading to higher than expected values of the t-statistics and
F-statistics.
 Therefore, heteroskedasticity has a wide impact on hypothesis testing: both the t-
statistics and the F-statistics are no longer reliable for hypothesis testing, because they
lead us to reject the null hypothesis too often.
iv The prediction (of Y for a given value of X) based on the estimated b's from the original
data would have a high variance, that is, the prediction would be inefficient.
 We cannot apply the usual formulae for the variances of the coefficients to conduct tests of
significance and construct confidence intervals.
v The standard errors are biased when heteroskedasticity is present. This in turn leads to bias in
test statistics and confidence intervals.
vi The usual formulae understate the variances of the estimators, leading to inflated t- and F-
statistics.
7.1.1. The mechanisms of overcoming the problems of heteroscedasticity
OLS estimators are still unbiased even in the presence of heteroscedasticity, but they are not efficient,
not even asymptotically (i.e., in large samples). Generally the solution is based on some form of
transformation. There are two approaches to remediation: when σi² is known and when σi² is not
known.

i) The Method of Weighted Least Squares: when σi is known

The weighted least squares method requires running the OLS regression on transformed data. The
transformation is based on the assumed form of the heteroscedasticity.

 When σi is known:

If σi is known, the most straightforward method of correcting heteroscedasticity is by means of
weighted least squares. The estimators thus obtained are BLUE. To fix this idea, consider the two-
variable regression model

Yi = β0 + β1Xi + ui

Yi = β0X0i + β1Xi + ui,  where X0i = 1 for each i.

Assume that the true error variance σi² is known, that is, the error variance for each observation
is known. Now consider the following transformation of the model, dividing both sides by σi:

Yi/σi = β0(X0i/σi) + β1(Xi/σi) + (ui/σi)

which, for ease of exposition, can be written as

Yi* = β0* + β1*Xi* + ui*

The purpose of transforming the original model is that the transformed error term ui* becomes
homoscedastic:

Var(ui*) = E(ui*²) = E[(ui/σi)²] = (1/σi²)·E(ui²),  since σi is known
         = (1/σi²)·σi²,  since E(ui²) = σi²
         = 1

That is, we deflate (divide) both sides of the regression model by the known σi. Now let ui* = ui/σi,
where ui* is the transformed error term. Since ui* is homoscedastic, the transformed regression
does not suffer from the problem of heteroscedasticity. Thus it can be estimated using the usual
OLS method. Assuming all other assumptions of the CLRM are fulfilled, the OLS estimators of the
parameters of the transformed equation are BLUE and we can then proceed to statistical inference
in the usual manner. WLS is simply OLS applied to the transformed model.
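A minimal numerical sketch of this idea (assumed data and an assumed heteroscedasticity pattern with the standard deviation of ui proportional to Xi, not taken from the assignment) using statsmodels' WLS, which is just OLS applied to the deflated model:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1, 10, n)
u = rng.normal(0, 1, n) * x            # assumed pattern: sd of u_i proportional to x_i
y = 5 + 2 * x + u

X = sm.add_constant(x)
ols_fit = sm.OLS(y, X).fit()
# WLS with weights 1/sigma_i^2; here sigma_i is assumed proportional to x_i,
# so the weights are 1/x_i^2 (equivalent to dividing the model through by x_i).
wls_fit = sm.WLS(y, X, weights=1.0 / x**2).fit()

print(ols_fit.params, ols_fit.bse)     # unbiased but inefficient, unreliable SEs
print(wls_fit.params, wls_fit.bse)     # BLUE under the assumed variance structure
```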

ii) White's Heteroscedasticity-Consistent Standard Errors: when σi² is not
known
 When σi² is not known:
If the true σi² are known, we can use the WLS method to obtain BLUE estimators. However, the true σi²
are rarely known. Therefore, if we still want to use the method of WLS, we have to resort to
some ad hoc assumption about σi² and transform the original regression model so that the
transformed model satisfies the homoscedasticity assumption. Alternatively, White's procedure keeps
the OLS coefficient estimates but replaces the usual standard errors with heteroscedasticity-consistent
standard errors, which are valid in large samples whatever the form of the heteroscedasticity.
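As an assumed illustration of White's correction (simulated data in the same spirit as the sketch above, not the assignment's data), statsmodels can report heteroscedasticity-consistent standard errors directly; HC1 is a small-sample adjusted variant of White's original HC0 estimator:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(1, 10, n)
y = 5 + 2 * x + rng.normal(0, 1, n) * x     # heteroscedastic errors of unknown form

X = sm.add_constant(x)
usual = sm.OLS(y, X).fit()                  # conventional (unreliable) standard errors
robust = sm.OLS(y, X).fit(cov_type="HC1")   # White heteroscedasticity-consistent SEs
print(usual.bse)
print(robust.bse)                           # same coefficients, corrected SEs
```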

iii) Re-specification of the model


Instead of speculating about σi², a re-specification of the model, choosing a different functional form,
can reduce heteroscedasticity. For example, instead of running a linear regression, estimating the
model in log form often reduces heteroscedasticity.
7.2. The Consequences of multicollinearity for OLS estimators
Recall that if the assumptions of the classical linear regression model are satisfied, the OLS estimators
of the regression coefficients are BLUE. As stated above, if there is perfect multicollinearity between the
explanatory variables, it is not possible to determine the regression coefficients and their standard
errors.
The consequences of multicollinearity may occur in two cases:
i. Theoretical consequences of multicollinearity
a. In the case of near multicollinearity, the OLS estimators are still unbiased. But unbiasedness is a
multi-sample or repeated-sampling property.
b. Collinearity does not destroy the property of minimum variance: in the class of all linear
unbiased estimators the OLS estimators have minimum variance, i.e. they are efficient. But
this does not mean that the variance of an OLS estimator will necessarily be small in any given
sample.
c. Multicollinearity is essentially a sample phenomenon. Even if the X variables are not linearly
related in the population, they may be so related in the particular sample at hand.
ii. The practical consequences of multicollinearity
a) If multicollinearity is perfect, the regression coefficients of the X variables are indeterminate and
their standard errors are infinite.
b) Although BLUE, the OLS estimators have large variances and covariances, making precise
estimation difficult.
c) Because of the above consequence, the confidence intervals tend to be much wider, leading to
acceptance of the zero null hypothesis (i.e. that the true population coefficient is zero) more
readily.
d) Also, the t-ratios of one or more coefficients tend to be statistically insignificant.
e) Even when one or more t-ratios are statistically insignificant, the R² value, which is the overall
measure of goodness of fit, can be very high.

f) The OLS estimators and their standard errors can be sensitive to small changes in the data.
7.2.1. The mechanisms of overcoming the problems of multicollinearity

The possible solutions that might be adopted if multicollinearity exists in a function vary, depending
on the severity of the multicollinearity, on the availability of other data sources, on the importance of
the factors which are multicollinear, and on the purpose for which the function is used. However, some
alternative remedies can be suggested for reducing the effect of multicollinearity.
1. Do Nothing
 Multicollinearity is essentially a data-deficiency problem, and sometimes we have no choice over
the data available for the analysis. A remedy for multicollinearity should only be considered
if and when its consequences cause insignificant t-scores or wildly unreliable estimated
coefficients.
2. A priori information:
 If we have a priori information regarding the relationship among the parameters, we can use it to
deal with the multicollinearity. Suppose we consider the model
Yi = β0 + β1X1 + β2X2 + β3X3 + ui
and we know that β3 = 0.10β1.
 It is possible that we have some knowledge of the values of one or more parameters from
previous empirical work. This knowledge can be profitably utilised in the current sample to
reduce multicollinearity.
3. Dropping one or more of the multicollinear variables:-
 When faced with severe multicollinearity, one of the “simplest” things to do is to drop one of the
collinear variables.
 But in dropping a variable from the model we may be committing a specification bias or
specification error.
 Specification bias arises from incorrect specification of the model used in the analysis
4. Transformation of variables.
 Multicollinearity problems can be avoided by transforming the variables, for example to first-
difference form, to ratio form, or by using lagged values. These transformations may lead to
other problems and also require a priori information about the variables and procedures, but it is
sometimes possible to transform the variables in the equation to get rid of at least some of the
multicollinearity.
 The transformation of the model can minimise, if not solve, the problem of collinearity. A
commonly used transformation technique is the first-difference form.
 By transforming the variables, it may be possible to reduce the effect of multicollinearity.
5. Increasing the sample size
 Since multicollinearity is a sample feature, it is possible that in another sample involving the same
variables collinearity may not be so serious as in the first sample
 By increasing the sample, high covariances among estimated parameters resulting from
multicollinearity in an equation can be reduced, because these covariances are inversely
proportional to sample size.
6. Application of the Principal Components Method

13
 With the principal components method we construct some artificial orthogonal variables (from
linear combinations of the X's).
 We thus transform the multicollinear X's in orthogonal variables.
 If these artificial variables can be given any specific economic-meaning, then they can be used as
variables in their own right, and the transformation provides a defensible solution to the
multicollinearity problem.
7. Ridge Regression.
 One of the solutions often suggested for the multicollinearity problem is to use what is known as
ridge regression.
 The addition of a constant λ to the diagonal (variance) terms of X'X produces biased estimators,
but the argument is that if the variance can be decreased, the mean-squared error will decline.
 Here the ridge estimates are shrunk versions of the least squares estimates of the coefficients,
and k is the number of regressors in the model.
 Unfortunately, the optimal value of λ is a function of the regression parameters and the error
variance, which are unknown.
 There are different methods of determining λ, and constrained least squares is one of these
methods.
8. The easiest solution:
 Remove the most intercorrelated variable(s) from analysis. This method is misguided if the
variables were there due to the theory of the model, which they should have been.
9. Combining cross-sectional data and time series data: -
 This is also known as pooling the data. In time-series data the economic variables generally tend to
be highly collinear, more so than in cross-sectional data, where the variables do not vary much at a
particular point in time. Therefore, a combination of cross-sectional data and time-series data can be a
remedy for multicollinearity problems, although this pooling of data may create problems of
interpretation.
10. Use centering:
 Transform the offending independent variables by subtracting the mean from each case. The resulting
centred data may well display considerably lower multicollinearity (a small numerical sketch of this is
given after this list). You should have a theoretical justification for this, consistent with the fact that a
zero b coefficient will now correspond to the independent variable being at its mean, not at zero, and
interpretations of b and beta must be changed accordingly.
11. Leave one intercorrelated variable as is, but then remove the variance in its covariates by
regressing them on that variable and using the residuals.
12. Assign the common variance to each of the covariates by some (probably arbitrary) procedure.
13. Treat the common variance as a separate variable and decontaminate each covariate by
regressing them on the others and using the residuals; that is, analyze the common variance as a
separate variable.
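A small assumed sketch of the centering remedy mentioned in item 10 (illustrative data, not from the assignment); centering mainly helps when collinearity is induced by polynomial or interaction terms:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(50, 60, 500)          # regressor with a narrow range far from zero
X_raw = np.column_stack([x, x**2])    # x and x^2 are almost perfectly correlated
xc = x - x.mean()
X_cen = np.column_stack([xc, xc**2])  # centred versions are far less correlated

print(np.corrcoef(X_raw, rowvar=False)[0, 1])   # close to 1.0
print(np.corrcoef(X_cen, rowvar=False)[0, 1])   # close to 0
```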
8. Consider the following data on the percentage rate of change in electricity consumption
(millions KWH) (Y) and the rate of change in the price of electricity (Birr/KWH) (X) for
the years 1961 – 1984.
Year X Y Year X Y
1961 0.22 15.43 1973 5.47 23.01
1962 0.31 12.35 1974 3.26 24.05
1963 -0.43 14.65 1975 6.14 32.80
1964 0.24 15.89 1976 4.63 49.78
1965 -0.13 17.93 1977 2.57 52.17
1966 0.29 14.56 1978 0.89 39.66
1967 -0.12 32.22 1979 1.80 21.8
1968 0.42 27.20 1980 7.86 -49.51
1969 0.08 54.26 1981 6.59 -25.55
1970 0.8 58.61 1982 -0.37 6.43
1971 0.24 15.13 1983 0.16 15.27
1972 -1.09 39.25 1984 0.5 60.4

Table 2: Computation of the summary statistics for the coefficients, based on the data in the table
above
Year X Y x y x² xy y² Ŷi ê ê²
1961 0.22 15.43 -1.46 -8.23 2.13 12.02 67.73 29.57 -14.14 199.94
1962 0.31 12.35 -1.37 -11.31 1.88 15.49 127.92 29.21 -16.86 284.26
1963 -0.43 14.65 -2.11 -9.01 4.45 19.01 81.18 32.20 -17.55 308.00
1964 0.24 15.89 -1.44 -7.77 2.07 11.19 60.31 29.49 -13.6 184.96
1965 -0.13 17.93 -1.81 -5.73 3.28 10.37 32.83 30.99 -13.06 170.56
1966 0.29 14.56 -1.39 -9.10 1.93 12.65 82.81 29.29 -14.73 216.97
1967 -0.12 32.22 -1.80 8.56 3.24 -15.41 73.27 30.95 1.27 1.61
1968 0.42 27.20 -1.26 3.54 1.59 -4.46 12.53 28.76 -1.56 2.43
1969 0.08 54.26 -1.60 30.60 2.56 -48.96 936.36 30.17 24.09 580.33
1970 0.8 58.61 -0.88 34.95 0.77 -30.76 1221.50 27.22 31.39 985.33
1971 0.24 15.13 -1.44 -8.53 2.07 12.28 72.76 29.49 -14.36 206.21
1972 -1.09 39.25 -2.77 15.59 7.67 -43.18 243.05 34.88 4.37 19.10
1973 5.47 23.01 3.79 -0.65 14.36 -2.46 0.42 8.31 14.70 216.09
1974 3.26 24.05 1.58 0.39 2.50 0.62 0.15 17.26 6.79 46.10
1975 6.14 32.80 4.46 9.14 19.89 40.76 83.54 5.60 27.20 739.84
1976 4.63 49.78 2.95 26.12 8.70 77.05 682.25 11.72 38.09 1,448.56
1977 2.57 52.17 0.89 28.51 0.79 25.37 812.82 20.06 32.11 1,031.05
1978 0.89 39.66 -0.79 16 0.62 -12.64 256 26.86 12.80 163.84
1979 1.80 21.8 0.12 -1.86 0.014 -0.22 3.46 23.17 -1.37 1.88

1980 7.86 -49.51 6.18 -73.17 38.19 -452.19 5,353.85 -1.32 -48.19 2,322.28
1981 6.59 -25.55 4.91 -49.21 24.11 -241.62 2421.62 3.78 -29.33 860.25
1982 -0.37 6.43 -2.05 -17.23 4.20 35.32 296.87 31.96 -25.53 651.78
1983 0.16 15.27 -1.52 -8.39 2.31 12.75 70.39 29.81 -14.54 211.41
1984 0.5 60.4 -1.18 36.74 1.39 -43.35 1349.83 28.44 31.96 1,021.44
Sum 40.33 567.79 0 0 150.74 -610.36 14,343.48 507.63 0 11,874.22
Mean 1.68 23.66 0 0

Required: Based on the above information,


a) Compute the value of the regression coefficients.

The summary results are then given by:

ΣY = 567.79,  ΣX = 40.33,  n = 24

X̄ = ΣX/n = 40.33/24 = 1.68,  Ȳ = ΣY/n = 567.79/24 = 23.66

In deviation form: Σx² = 150.7419,  Σxy = −610.3636,  Σy² = 14,343.4755

β̂1 = Σxiyi / Σxi² = −610.3636 / 150.7419 = −4.049

β̂0 = Ȳ − β̂1X̄ = 23.66 − (−4.049064)(1.68)
   = 23.66 + 6.80243
   = 30.462

The fitted model is then written as: Ŷi = 30.462 − 4.049Xi
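A quick numerical check of these hand computations (a short sketch using only the summary statistics quoted above):

```python
# Verify the hand-computed OLS coefficients from the summary statistics above.
sum_x, sum_y, n = 40.33, 567.79, 24
sum_xy_dev, sum_x2_dev = -610.3636, 150.7419   # deviation-form sums from Table 2

x_bar, y_bar = sum_x / n, sum_y / n
b1 = sum_xy_dev / sum_x2_dev
b0 = y_bar - b1 * x_bar
print(round(b1, 3), round(b0, 3))              # -> -4.049 30.462
```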

b) Estimate the regression equation.

The fitted model is then written as:
Ŷi = 30.462 − 4.049Xi
     (5.70912)  (1.892065)
Interpreting the coefficients:

 The estimated model is Consumption = 30.462 − 4.049·Price, i.e. Ŷi = 30.462 − 4.049Xi.
 The value of the intercept term (β̂0 = 30.462) could be interpreted as the level of electricity
consumption when the price of electricity is zero.
 In general, the intercept does not have an intuitive interpretation.
 The value of the slope coefficient (β̂1 = −4.049) is a measure of the marginal effect of price on
consumption: for every one-unit increase in the price of electricity, consumption is estimated to
decrease on average by 4.049 units.
 The slope always has an intuitive interpretation.
c) Test whether the estimated regression equation is adequate.

Model adequacy:
The fitted model is said to be adequate if it explains the data set adequately, i.e., if the residuals
do not contain (or conceal) any explainable non-randomness left over from the (explained) model.
An important part of assessing the adequacy of a linear regression model is testing statistical
hypotheses about the model parameters and constructing certain confidence intervals. To test
hypotheses about the slope and intercept of the regression model, we must make the additional
assumption that the error component of the model, ε, is normally distributed. Thus, the complete
assumptions are that the errors are normally and independently distributed with mean zero and
constant variance σ², abbreviated NID(0, σ²).

1. Run significance test of regression coefficients using the following test methods
A. Standard Error test
In testing the statistical significance of the estimates using the standard error test, the following
information is needed for the decision.
The fitted regression line for the given data is:

Ŷi = 30.462 − 4.049Xi
     (5.70912)  (1.892065)

Since there are two parameter estimates in the model, we have to test them separately.

Testing for β̂1

We have the following information about β̂1: β̂1 = −4.049 and se(β̂1) = 1.892065.
The following are the null and alternative hypotheses to be tested:

H0: β1 = 0
H1: β1 ≠ 0

Decision rule: accept H0 if se(β̂1) is greater than half the absolute value of β̂1, and reject H0 otherwise.

se(β̂1) = 1.892065 < ½|−4.049064| = 2.024532

Conclusion: Since the standard error of β̂1 is less than half of the numerical value of β̂1, we reject
the null hypothesis and conclude that the parameter estimate β̂1 is statistically significant (this agrees
with the t-test below).

Testing for β̂0

Again, we have the following information about β̂0: β̂0 = 30.46243 and se(β̂0) = 5.70912.
The hypotheses to be tested are given as follows:

H0: β0 = 0
H1: β0 ≠ 0

se(β̂0) = 5.70912 < ½(30.46243) = 15.231215

Conclusion: Since the standard error of β̂0 is less than half of the numerical value of β̂0, we
reject the null hypothesis and conclude that β̂0 is statistically significant.


B. The t-test statistic
The t-test is usually used to conduct hypothesis tests on the regression coefficients (βs) obtained from
simple linear regression. A statistic based on the t-distribution is used to test the two-sided
hypothesis that the true slope β1 equals some constant value β1,0. Use α = 0.05.
The hypothesis to be tested is:

The estimates are β̂1 = −4.049064 and se(β̂1) = 1.892065.

1. Run the regression analysis.
2. H0: β1 = 0
   H1: β1 ≠ 0
   Two-tailed test.
3. Find the t-table value: α = 0.05, so α/2 = 0.025.
4. Degrees of freedom: df = n − 2 = 24 − 2 = 22.
5. t critical value = ±2.074.
   Reject H0 if t < −2.074 or if t > 2.074.
6. Then we can compute tcal as follows:

t = β̂1 / se(β̂1) = −4.049064 / 1.892065 = −2.140

Conclusion: Since −2.140 is less than the critical value of t for n − k = 24 − 2 = 22 degrees of freedom
(t0.025,22 = −2.074), we reject the null hypothesis and conclude that the slope coefficient is statistically
significant: there is a relationship between electricity consumption and the price of electricity.

The estimates are β̂0 = 30.46243 and se(β̂0) = 5.70912.

1. Run the regression analysis.
2. H0: β0 = 0
   H1: β0 ≠ 0
   Two-tailed test.
3. α = 0.05, so α/2 = 0.025.
4. df = n − 2 = 24 − 2 = 22.
5. t critical value = ±2.074.
   Reject H0 if t < −2.074 or if t > 2.074.
6. Then we can compute tcal as follows:

t = β̂0 / se(β̂0) = 30.46243 / 5.70912 = 5.336

Conclusion: Since 5.336 is greater than the critical value of t for n − k = 24 − 2 = 22 degrees of freedom
(t0.025,22 = 2.074), we reject the null hypothesis and conclude that the intercept coefficient is statistically
significant in the electricity consumption equation.
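The same calculations can be checked quickly with scipy (a sketch using only the estimates and standard errors quoted above):

```python
from scipy import stats

n = 24
df = n - 2
t_crit = stats.t.ppf(0.975, df)                 # two-tailed 5% critical value, ~2.074

for name, b, se in [("slope", -4.049064, 1.892065), ("intercept", 30.46243, 5.70912)]:
    t = b / se
    p = 2 * stats.t.sf(abs(t), df)              # two-tailed p-value
    print(name, round(t, 3), round(p, 4), abs(t) > t_crit)
```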
d) Test whether the change in the price of electricity significantly affects its consumption.

i The t-test statistic

We can apply the t-test to see whether the price of electricity is significant in determining the
consumption of electricity under consideration. Use α = 0.05.
The hypothesis to be tested is:

The estimates are β̂1 = −4.049064 and se(β̂1) = 1.892065.

1. Run the regression analysis.
2. H0: β1 = 0
   H1: β1 ≠ 0
   Two-tailed test.
3. Find the t-table value: α = 0.05, so α/2 = 0.025.
4. Degrees of freedom: df = n − 2 = 24 − 2 = 22.
5. t critical value = ±2.074.
   Reject H0 if t < −2.074 or if t > 2.074.
6. Then we can compute tcal as follows:

t = β̂1 / se(β̂1) = −4.049064 / 1.892065 = −2.140

Conclusion: Since −2.140 is less than the critical value of t for n − k = 24 − 2 = 22 degrees of freedom
(t0.025,22 = −2.074), we reject the null hypothesis and conclude that the price of electricity is significant
in determining electricity consumption.

The estimates are β̂0 = 30.46243 and se(β̂0) = 5.70912.

1. Run the regression analysis.
2. H0: β0 = 0
   H1: β0 ≠ 0
   Two-tailed test.
3. α = 0.05, so α/2 = 0.025.
4. df = n − 2 = 24 − 2 = 22.
5. t critical value = ±2.074.
   Reject H0 if t < −2.074 or if t > 2.074.
6. Then we can compute tcal as follows:

t = β̂0 / se(β̂0) = 30.46243 / 5.70912 = 5.336

Conclusion: Since 5.336 is greater than the critical value of t for n − k = 24 − 2 = 22 degrees of
freedom (t0.025,22 = 2.074), we reject the null hypothesis; the intercept is also statistically significant
in the electricity consumption equation.
9. Consider the following simple linear regression model

Yi = β0 + β1Xi + εi

and the estimated fitted line is given as Ŷi = β̂0 + β̂1Xi.
a) Derive the values of the OLS estimators using an appropriate optimization method.

Solution
Estimating a linear regression function using the Ordinary Least Squares (OLS) method is simply about
calculating the parameters of the regression function for which the sum of the squared error terms is
minimized. Suppose we want to estimate the following equation:

Yi = β0 + β1Xi + εi

Since most of the time we use a sample, the corresponding sample regression function is given as
follows:

Ŷi = β̂0 + β̂1Xi

Then we solve for the residual term ei, square both sides and take the sum over the sample. These
three steps are given respectively as follows:

ei = Yi − Ŷi = Yi − β̂0 − β̂1Xi

Σei² = Σ(Yi − β̂0 − β̂1Xi)²

where Σei² = RSS = residual sum of squares.

The method of OLS involves finding the estimates of the intercept and the slope for which this sum of
squares is minimized.

Partial derivative with respect to β̂0:

∂Σei²/∂β̂0 = 2Σ(Yi − β̂0 − β̂1Xi)(−1) = 0

⇒ Σ(Yi − β̂0 − β̂1Xi) = 0
⇒ ΣYi − nβ̂0 − β̂1ΣXi = 0
⇒ ΣYi = nβ̂0 + β̂1ΣXi    (first normal equation), where n is the sample size.

Partial derivative with respect to β̂1:

∂Σei²/∂β̂1 = 2Σ(Yi − β̂0 − β̂1Xi)(−Xi) = 0

⇒ Σ(YiXi − β̂0Xi − β̂1Xi²) = 0
⇒ ΣYiXi − β̂0ΣXi − β̂1ΣXi² = 0
⇒ ΣYiXi = β̂0ΣXi + β̂1ΣXi²    (second normal equation)

Note that (Yi − β̂0 − β̂1Xi)² is a composite function, and we apply the chain rule in finding the partial
derivatives with respect to the parameter estimates.

Solving the two normal equations simultaneously gives

β̂1 = (ΣXiYi − nX̄Ȳ) / (ΣXi² − nX̄²)   and   β̂0 = Ȳ − β̂1X̄
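The same first-order conditions can be obtained symbolically. The short sympy sketch below (an assumed illustration using a tiny three-observation sample, not part of the assignment) differentiates the residual sum of squares and solves the resulting normal equations:

```python
import sympy as sp

b0, b1 = sp.symbols('beta0 beta1')
X = sp.symbols('x1:4')                 # three observations x1, x2, x3
Y = sp.symbols('y1:4')                 # three observations y1, y2, y3

# Residual sum of squares of the sample regression function
RSS = sum((Y[i] - b0 - b1 * X[i])**2 for i in range(3))

# Normal equations: set both partial derivatives to zero and solve
sol = sp.solve([sp.diff(RSS, b0), sp.diff(RSS, b1)], [b0, b1], dict=True)[0]
print(sp.simplify(sol[b1]))            # matches (sum XY - n*Xbar*Ybar)/(sum X^2 - n*Xbar^2)
```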

b) Show that both of the OLS estimators are BLUE (best, linear, unbiased
estimators).
Solution
 Show that the OLS estimators of β0 and β1 are BLUE (best, linear, unbiased estimators).

Yi = β0 + β1Xi + εi,  i = 1, 2, …, n

Since most of the time we use a sample, the corresponding sample regression function is given as
follows:

Ŷi = β̂0 + β̂1Xi

We prove that the OLS estimators are BLUE one property at a time.

I. Linearity (for β̂1)
The OLS estimators are linear in the dependent variable (Y).
Proposition: β̂0 and β̂1 are linear in Y.

Show that β̂1 is linear in Y.

Proof: writing xi = Xi − X̄ and yi = Yi − Ȳ,

β̂1 = Σxiyi / Σxi² = Σxi(Yi − Ȳ) / Σxi² = (ΣxiYi − ȲΣxi) / Σxi²

But Σxi = Σ(Xi − X̄) = ΣXi − nX̄ = nX̄ − nX̄ = 0, so

β̂1 = ΣxiYi / Σxi²

Now let ki = xi / Σxi²  (i = 1, 2, …, n). Then

∴ β̂1 = ΣkiYi

∴ β̂1 is linear in Yi.

Show that β̂0 is linear in Y. Hint: β̂0 = Σ(1/n − X̄ki)Yi; derive this relationship between β̂0 and Y.

II. Unbiasedness
Proposition: β̂0 and β̂1 are unbiased estimators of the true parameters β0 and β1:

E(β̂1) = β1 and E(β̂0) = β0

 Proof: prove that β̂1 is unbiased, i.e. E(β̂1) = β1.

β̂1 = Σ(Xi − X̄)(Yi − Ȳ) / Σxi² = Σ(Xi − X̄)Yi / Σxi² = ΣkiYi

Substituting Yi = β0 + β1Xi + εi,

β̂1 = Σki(β0 + β1Xi + εi) = β0Σki + β1ΣkiXi + Σkiεi

But Σki = 0 and ΣkiXi = 1:

Σki = Σxi / Σxi² = Σ(Xi − X̄) / Σxi² = (ΣXi − nX̄) / Σxi² = (nX̄ − nX̄) / Σxi² = 0

ΣkiXi = ΣxiXi / Σxi² = (ΣXi² − X̄ΣXi) / (ΣXi² − nX̄²) = (ΣXi² − nX̄²) / (ΣXi² − nX̄²) = 1

Hence β̂1 = β1 + Σkiεi, and since the ki are fixed (non-stochastic),

E(β̂1) = β1 + ΣkiE(εi) = β1,  since E(εi) = 0.

Therefore β̂1 is an unbiased estimator of β1.

 Proof: prove that β̂0 is unbiased, i.e. E(β̂0) = β0.

β̂0 = Σ(1/n − X̄ki)Yi
   = Σ(1/n − X̄ki)(β0 + β1Xi + εi),  since Yi = β0 + β1Xi + εi
   = β0 + β1(1/n)ΣXi + (1/n)Σεi − β0X̄Σki − β1X̄ΣkiXi − X̄Σkiεi
   = β0 + β1X̄ + (1/n)Σεi − β1X̄ − X̄Σkiεi,  using Σki = 0 and ΣkiXi = 1
   = β0 + (1/n)Σεi − X̄Σkiεi
   = β0 + Σ(1/n − X̄ki)εi

E(β̂0) = β0 + (1/n)ΣE(εi) − X̄ΣkiE(εi) = β0

∴ β̂0 is an unbiased estimator of β0.


III. Best: minimum variance of β̂0 and β̂1
An estimator is best if it has the smallest variance compared with any other estimator obtained using
other (linear unbiased) methods. By minimum variance, we mean that the values of β̂0 and β̂1 cluster
very closely around the true parameters β0 and β1.

a. Variance of β̂1

Var(β̂1) = E[β̂1 − E(β̂1)]² = E(β̂1 − β1)²

Substituting β̂1 − β1 = Σkiεi, we get

Var(β̂1) = E[(Σkiεi)²]
        = E[k1²ε1² + k2²ε2² + … + kn²εn² + 2k1k2ε1ε2 + … + 2kn−1knεn−1εn]
        = E[Σki²εi²] + E[Σ(i≠j) kikjεiεj]
        = Σki²E(εi²) + 2Σ(i≠j) kikjE(εiεj) = σ²Σki²,  since E(εi²) = σ² and E(εiεj) = 0 for i ≠ j

But ki = xi/Σxi², so Σki² = Σxi²/(Σxi²)² = 1/Σxi², and therefore

∴ Var(β̂1) = σ²Σki² = σ² / Σxi²

b. Variance of β̂0

Var(β̂0) = E[β̂0 − E(β̂0)]² = E(β̂0 − β0)²

Var(β̂0) = E[(Σ(1/n − X̄ki)εi)²]
        = Σ(1/n − X̄ki)²E(εi²)
        = σ²Σ(1/n − X̄ki)²
        = σ²Σ(1/n² − (2X̄/n)ki + X̄²ki²)
        = σ²(1/n − (2X̄/n)Σki + X̄²Σki²)
        = σ²(1/n + X̄²Σki²),  since Σki = 0
        = σ²(1/n + X̄²/Σxi²),  since Σki² = 1/Σxi²

Again,

1/n + X̄²/Σxi² = (Σxi² + nX̄²) / (nΣxi²) = ΣXi² / (nΣxi²)

∴ Var(β̂0) = σ²(1/n + X̄²/Σxi²) = σ²·ΣXi² / (nΣxi²)
To establish that β̂0 and β̂1 possess the minimum-variance property, we compare their variances with
the variances of some other alternative linear and unbiased estimators of β0 and β1, say β0* and β1*.

Let us first show the minimum variance of β̂1 and then that of β̂0.

1. Minimum variance of β̂1
Suppose β1* is an alternative linear and unbiased estimator of β1, and let

β1* = ΣwiYi,  where wi = ki + ci and the ci are arbitrary constants.

β1* = Σwi(β0 + β1Xi + εi),  since Yi = β0 + β1Xi + εi
    = β0Σwi + β1ΣwiXi + Σwiεi

∴ E(β1*) = β0Σwi + β1ΣwiXi,  since E(εi) = 0

Since β1* is assumed to be an unbiased estimator of β1, it must be true that Σwi = 0 and ΣwiXi = 1
in the above equation.

But wi = ki + ci, so Σwi = Σ(ki + ci) = Σki + Σci.
Therefore Σci = 0, since Σki = Σwi = 0.
Again, ΣwiXi = Σ(ki + ci)Xi = ΣkiXi + ΣciXi.
Since ΣwiXi = 1 and ΣkiXi = 1, it follows that ΣciXi = 0.
From these values we can derive Σcixi = 0, where xi = Xi − X̄:
Σcixi = Σci(Xi − X̄) = ΣciXi − X̄Σci = 0.
Thus, from the above calculations we can summarize the following results:
Σwi = 0, ΣwiXi = 1, Σci = 0, ΣciXi = 0 (and hence Σcixi = 0).

To check whether β̂1 has minimum variance, let us compute Var(β1*) and compare it with Var(β̂1):

Var(β1*) = Var(ΣwiYi) = Σwi²Var(Yi) = σ²Σwi²,  since Var(Yi) = σ²

But wi² = (ki + ci)² = ki² + 2kici + ci², and Σkici = Σcixi/Σxi² = 0, so

Σwi² = Σki² + Σci²

Therefore,

Var(β1*) = σ²(Σki² + Σci²) = σ²Σki² + σ²Σci²

Var(β1*) = Var(β̂1) + σ²Σci²

Given that the ci are arbitrary constants (not all zero), σ²Σci² is positive, i.e. greater than zero. Thus

Var(β1*) > Var(β̂1). This proves that β̂1 possesses the minimum-variance property.

2. Minimum variance of β̂0
We take a new estimator β0*, which we assume to be a linear and unbiased estimator of β0. The
least-squares estimator β̂0 is given by

β̂0 = Σ(1/n − X̄ki)Yi

To prove the minimum-variance property of β̂0, we again use the weights wi = ki + ci;
consequently

β0* = Σ(1/n − X̄wi)Yi

Since we want β0* to be an unbiased estimator of the true β0, that is E(β0*) = β0, we substitute
Yi = β0 + β1Xi + εi into β0* and find its expected value:

β0* = Σ(1/n − X̄wi)(β0 + β1Xi + εi)
    = Σ(β0/n + β1Xi/n + εi/n − X̄wiβ0 − X̄β1Xiwi − X̄wiεi)
    = β0 + β1X̄ + (1/n)Σεi − β0X̄Σwi − β1X̄ΣwiXi − X̄Σwiεi

For β0* to be an unbiased estimator of the true β0, the following must hold:
Σwi = 0 and ΣwiXi = 1 (together with E(εi) = 0). These conditions imply Σci = 0 and ΣciXi = 0.

As in the case of β̂1, we compute Var(β0*) to compare it with Var(β̂0):

Var(β0*) = Σ(1/n − X̄wi)²Var(Yi)
         = σ²Σ(1/n − X̄wi)²
         = σ²Σ(1/n² + X̄²wi² − (2X̄/n)wi)
         = σ²(1/n + X̄²Σwi² − (2X̄/n)Σwi)
         = σ²(1/n + X̄²Σwi²),  since Σwi = 0

But Σwi² = Σki² + Σci², so

Var(β0*) = σ²(1/n + X̄²(Σki² + Σci²))
         = σ²(1/n + X̄²/Σxi²) + σ²X̄²Σci²

The first term is Var(β̂0), hence

Var(β0*) = Var(β̂0) + σ²X̄²Σci²

⇒ Var(β0*) > Var(β̂0), since σ²X̄²Σci² > 0.

So we have proved that the least-squares estimators of the linear regression model are best, linear and
unbiased (BLUE) estimators.
c) Derive and interpret the variance of β̂0 and β̂1.
Solution

The variance of β̂1

The variance of β̂1 is derived as follows:

Var(β̂1) = E[β̂1 − E(β̂1)]². But we have seen that E(β̂1) = β1, so

Var(β̂1) = E(β̂1 − β1)². But β̂1 − β1 = Σkiεi, so

Var(β̂1) = E[(Σkiεi)²]
        = E[k1²ε1² + k2²ε2² + … + kn²εn² + 2k1k2ε1ε2 + … + 2kn−1knεn−1εn]
        = E[Σki²εi²] + E[Σ(i≠j) kikjεiεj]
        = Σki²E(εi²) + 2Σ(i≠j) kikjE(εiεj) = σ²Σki²,  since E(εi²) = σ² and E(εiεj) = 0 for i ≠ j

Since Σki² = Σxi²/(Σxi²)² = 1/Σxi²,

∴ Var(β̂1) = σ²Σki² = σ² / Σxi²

Interpretation: Var(β̂1) is a measure of how much variability our estimate would have if we repeated
the sampling over and over again and fitted the regression model each time, obtaining a different
estimate of β1; that is, Var(β̂1) measures how much the β̂1's vary around their mean.

The variance of β̂0

The variance of β̂0 can be computed as follows:

Var(β̂0) = E[β̂0 − E(β̂0)]² = E(β̂0 − β0)²

Var(β̂0) = E[(Σ(1/n − X̄ki)εi)²]
        = Σ(1/n − X̄ki)²E(εi²)
        = σ²Σ(1/n − X̄ki)²
        = σ²Σ(1/n² − (2X̄/n)ki + X̄²ki²)
        = σ²(1/n − (2X̄/n)Σki + X̄²Σki²)
        = σ²(1/n + X̄²Σki²),  since Σki = 0
        = σ²(1/n + X̄²/Σxi²),  since Σki² = 1/Σxi²

Again,

1/n + X̄²/Σxi² = (Σxi² + nX̄²) / (nΣxi²) = ΣXi² / (nΣxi²)

∴ Var(β̂0) = σ²(1/n + X̄²/Σxi²) = σ²·ΣXi² / (nΣxi²)

Interpretation: Var(β̂0) is a measure of how much variability our estimate would have if we repeated
the sampling over and over again and fitted the regression model each time, obtaining a different
estimate of β0; that is, Var(β̂0) measures how much the β̂0's vary around their mean.

d) Explain what would happen to the variances of β̂0 and β̂1 as the sample size increases.
Solution
 As the sample size increases, the total variation in the regressor, Σxi², increases, so the
variances of the OLS estimators decrease and hence the precision of the estimates
increases.
e) How does the variance of the stochastic term affect the variances of β̂0 and β̂1?
 Both Var(β̂1) = σ²/Σxi² and Var(β̂0) = σ²ΣXi²/(nΣxi²) are directly proportional to σ², the
variance of the stochastic (error) term: the larger the variance of the disturbances, the larger
the variances of the OLS estimators, other things being equal.
f) Explain the limitations of the simple linear regression model compared to the multiple
linear regression model.
Solution
 The following are the limitations of the simple linear regression model compared to
the multiple linear regression model:
1) The distinction between the explanatory and response variables is important in regression.
2) Correlation and regression lines describe only linear relationships.
3) Correlation and least-squares regression lines are not resistant.
4) Association does not imply causation.
5) An association between an explanatory variable x and a response variable y is
not by itself good evidence that changes in x actually cause changes in y.
6) Linear regressions are sensitive to outliers.
7) Overfitting.
8) Multicollinearity can also be a problem for linear regression.
9) Missing (omitted) variables can also be a problem.

10. Consider the following regression outputs.

Where Y = consumption expenditure and X = per capita income. Assume that the regression
results were obtained from a sample of 234 households.

a) How do you interpret the results of this regression?

Solution
Interpretation:
The slope coefficient, the marginal propensity to consume (MPC), is about 0.6560, suggesting that if
per capita (real) income goes up by one dollar, average personal consumption expenditure goes up by
about 66 cents. According to Keynesian theory, the MPC is expected to lie between 0 and 1.
The intercept value of about 0.2033 suggests that even if per capita income is zero, average personal
consumption expenditure is about 0.2033.
The r² value of 0.397 means that approximately 40 percent of the variation in personal consumption
expenditure is explained by variation in per capita income. This value is fairly high, considering that
r² can be at most 1.
So there is a positive relationship between consumption expenditure and per capita income in the
households.
b) Test the hypothesis H0: β1 = 1 against H1: β1 < 1. Explain what test you use and why.
What are the underlying assumptions of the test(s) you use?

Solution:
Use the one-tailed t-test, since the alternative of interest is one-sided (an MPC below 1) and the slope
and its standard error are estimated from a single sample. The test assumes that the disturbances are
normally distributed (or that the sample is large enough for the t approximation to hold), so that
(β̂1 − 1)/se(β̂1) follows a t distribution with n − 2 degrees of freedom under H0.

t = (β̂1 − β1) / se(β̂1) = (0.6560 − 1) / 0.196 = −0.344 / 0.196 = −1.755

With 234 − 2 = 232 degrees of freedom, the 5% one-tailed critical value is about 1.65.
Decision: since t = −1.755 < −1.65, we reject the null hypothesis in favour of the alternative. We
can reject the null hypothesis that the true slope coefficient (MPC) is 1 or greater; it is statistically
less than 1.
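A sketch of the same one-tailed test in scipy, using only the figures quoted above:

```python
from scipy import stats

b1, se, n = 0.6560, 0.196, 234
t = (b1 - 1) / se                           # test H0: beta1 = 1 against H1: beta1 < 1
p = stats.t.cdf(t, df=n - 2)                # lower-tail p-value
print(round(t, 3), round(p, 4), p < 0.05)   # -> -1.755, p ~ 0.04, True (reject H0)
```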
11. Suppose the aggregate saving model is explained by the interest rate and income
variables as below, using a sample of size 50,

where y is the saving rate, X1 is the interest rate and X2 is the income level; the
auxiliary regression model is estimated as below.

a) Detect the presence of heteroscedasticity in the saving model using the Breusch-Pagan method.
Solution
Use the following steps to perform a Breusch-Pagan test:
1. Fit the regression model.
2. Calculate the squared residuals.
3. Fit the auxiliary regression, using the squared residuals as the response variable.
4. Calculate the chi-square test statistic X² as n·R²new,
where
 n: the total number of observations
 R²new: the R-squared of the auxiliary regression that used the squared residuals as the response
variable.
If the p-value that corresponds to this chi-square test statistic, with p (the number of predictors)
degrees of freedom, is less than some significance level (e.g. α = 0.05), then reject the null hypothesis
and conclude that heteroskedasticity is present.
Given:
 n = 50
 R² = 0.85
 Thus, the chi-square test statistic for the Breusch-Pagan test is
X² = n·R² = 50 × 0.85 = 42.5
 The degrees of freedom are p = 3 predictor variables. The p-value that corresponds to
X² = 42.5 with 3 degrees of freedom is far below 0.0001 (about 3 × 10⁻⁹).
 Since this p-value is less than 0.05, we reject the null hypothesis of homoscedasticity and conclude
that heteroscedasticity is present in the saving model.
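The p-value step can be checked with scipy (a sketch using the n and R² given above; the 3 degrees of freedom follow the number of predictors assumed in the answer):

```python
from scipy import stats

n, r2, k = 50, 0.85, 3                  # sample size, auxiliary R^2, number of predictors
bp_stat = n * r2                        # Breusch-Pagan LM statistic = 42.5
p_value = stats.chi2.sf(bp_stat, df=k)  # upper-tail chi-square probability
print(bp_stat, p_value)                 # p ~ 3e-09, far below 0.05 -> heteroscedasticity
```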
b) If Var(εi) = σi² = σ²X2i² for the original model, show the transformation mechanism
toward the generalized least squares method to obtain BLUE estimators.
Solution
Given a regression model of the form

Yi = β1X1i + β2X2i + εi

the generalized least squares method requires running the OLS regression on transformed data. The
transformation is based on the assumed form of the heteroscedasticity.

If Var(εi) = σi² = σ²X2i², i.e. E(εi²) = σ²X2i², or equivalently σi = σX2i, where σ² is the constant
variance of a classical error term, then the error variance is proportional to the square of the
explanatory variable X2. So if, as a matter of speculation or because other tests indicate it, the variance
is proportional to the square of X2, we transform the original model by multiplying each variable by
the weight wi = 1/X2i, i.e. by dividing through by X2i:

Yi/X2i = β1(X1i/X2i) + β2 + εi/X2i

or, in transformed notation,

Yi* = β1X1i* + β2 + εi*,  where εi* = εi/X2i.

The transformed error term is homoscedastic:

Var(εi*) = E(εi*²) = E[(εi/X2i)²] = (1/X2i²)·E(εi²) = σ²X2i²/X2i² = σ²

Hence the variance of εi* is now the constant σ², and we can regress Yi/X2i on X1i/X2i (the intercept
of the transformed equation estimates β2) using the usual OLS method; the resulting estimators are
BLUE.
