Econometrics - Basic 1-8


Basic Econometrics

Prof. Ganesh Kawadia


Former Head and Dean
School of Economics
Devi Ahilya University
Indore
Email:ganesh.kawadia@gmail.com
[Diagram: Econometrics at the centre, drawing on Economic Theory, Mathematical Economics, Economic Statistics and Mathematical Statistics]
Raw Material for Econometrics
• Economic theory makes statements that are mostly qualitative in nature, while econometrics gives empirical content to most economic theory.
• Mathematical economics expresses economic theory in mathematical form without empirical verification of the theory, while econometrics is mainly interested in the empirical verification of the theory.
Raw Material for Econometrics
• Economic statistics is mainly concerned with collecting, processing and presenting economic data. It is not concerned with using the collected data to test economic theories.
• Mathematical statistics provides many of the tools for economic studies, but econometrics supplies the latter with many special methods of quantitative analysis based on economic data.
What is Econometrics?
• Definition 1: Economic measurement.
• Definition 2: Application of mathematical statistics to economic data in order to lend empirical support to the models of mathematical economics and obtain numerical results (Gerhard Tintner, 1968).
• Definition 3: The empirical determination of economic laws (H. Theil, 1971).
What is Econometrics?
• Definition 4: The quantitative analysis of actual economic phenomena based on the concurrent development of theory and observation, related by appropriate methods of inference (P. A. Samuelson, T. C. Koopmans and J. R. N. Stone, 1954).
• Definition 5: The social science which applies economics, mathematics and statistical inference to the analysis of economic phenomena (Arthur S. Goldberger, 1964).
Methodology of Econometrics
Anatomy of economic modelling
• 1) Economic Theory
• 2) Mathematical Model of Theory
• 3) Econometric Model of Theory
• 4) Data
• 5) Estimation of Econometric Model
• 6) Hypothesis Testing
• 7) Forecasting or Prediction
• 8) Using the Model for control or policy
purposes
Methodology of Econometrics
(1) Statement of theory or hypothesis:
Keynes stated: "Consumption increases as income increases, but not as much as the increase in income". That is, the marginal propensity to consume (MPC) for a unit change in income is greater than zero but less than unity.
Methodology of Econometrics
(2) Specification of the mathematical model of the theory
$Y = \beta_1 + \beta_2 X$, with $0 < \beta_2 < 1$
Y = consumption expenditure
X = income
$\beta_1$ and $\beta_2$ are parameters; $\beta_1$ is the intercept and $\beta_2$ is the slope coefficient.
Methodology of Econometrics
(3) Specification of the econometric model of the theory
$Y = \beta_1 + \beta_2 X + u$, with $0 < \beta_2 < 1$
Y = consumption expenditure
X = income
$\beta_1$ and $\beta_2$ are parameters; $\beta_1$ is the intercept and $\beta_2$ is the slope coefficient.
u is the disturbance term or error term. It is a random or stochastic variable.
Error Term
• An error term in statistics is a value which represents how observed data differ from actual population data.
• It can also be a variable which represents how a given statistical model differs from reality. The error term is often written as $u_i$.
• In econometric theory, the classical normal linear regression model (CNLRM) involves finding the best fitting linear model for observed data that shows the relationship between two variables.
Classical Normal Linear Regression Model
• Suppose we are studying how the income of a family affects the amount it consumes.
• We could collect data on the income and consumption of each family. These data can be plotted as a scatter plot, with income (X) on the x axis and consumption (Y) per family on the y axis. We would then look for the line $y = \beta_0 + \beta_1 x$ that best fits the data.
Best Fitted Line
Methodology of Econometrics
(4) Obtaining Data
Y = Personal consumption expenditure
X = Gross Domestic Product
Both in million rupees
Types of data
• All empirical analysis requires data. We will now discuss a few different structures of data that we may come across if we do empirical analysis in economics:
  1. Cross-sectional data
  2. Time series data
  3. Pooled cross sections
  4. Panel (or longitudinal) data
Cross-sectional data
• A cross-sectional dataset consists of a sample of individuals, households, firms, … taken at a given point in time.
• Cross-sectional datasets are often obtained by random sampling from the underlying population.
• If the sample has not been drawn randomly, our methods may have to be adjusted. For now, we assume random sampling unless stated otherwise.
Time series data
• A time series data set consists of observations on one or several variables over time.
• Unlike the arrangement of cross-sectional data, the chronological ordering of observations in a time series is important.
• A key feature of time series data that makes them more difficult to analyze than cross-sectional data is that observations are unlikely to be independent over time.
• Special methodological problems arise when we analyze time series data.
Pooled cross-sections
• Some datasets have both cross-sectional and time series features.
• Example: household surveys from 1985 and 1990 which are combined to yield one dataset containing observations from both years.
• Pooled cross-sections can be a useful basis for analysing the effect of a change of policy. We often include time (year) as an additional explanatory variable in regressions based on pooled cross-sections.
Panel (longitudinal) Data
• A panel dataset consists of a time series for each cross-sectional member in the data set.
• Key feature: the same cross-sectional units are followed over time.
• This is the big difference compared to a pooled cross section.
• Panel data allow us to analyze dynamics.
Key ingredients of data
• Data: typically in the form of large samples. Data = information.
  1. Cross-sectional data are data on one or more variables collected at one point in time.
  2. Time series data are collected over a period of time.
  3. Pooled data are a combination of time series and cross-section data.
  4. Panel (longitudinal) data are a special type of pooled data, in which the same cross-sectional unit, say, a family or firm, is surveyed over time.
Econometric Model
• Education and earnings: suppose we want to evaluate the effects of years of education on worker earnings.
• A plausible economic model: wage = f(educ, exper), where wage = hourly wage, educ = years of formal education and exper = years of work experience.
• Regression is a technique enabling us (under certain assumptions that we will study carefully later on) to quantify by how much the average (or expected) wage increases as education increases by, say, 1 year.
An Econometric Model
• WAGE is the dependent variable in the model.
  – In econometrics, the dependent variable is almost always to the left of the equal sign ("on the left-hand side").
  – The dependent variable is the outcome of interest, to be explained by other variables.
• Education is the explanatory (or independent) variable in the model.
  – Explanatory variables are typically written to the right of the equal sign ("on the right-hand side").
OLS Estimation
• Under a set of assumptions that we will study carefully throughout this course, we can estimate the unknown model parameters $\beta_0$ and $\beta_1$ using the wage1.sav dataset.
• The two most important assumptions:
  – the expected value of the error term is zero
  – the error term is uncorrelated with the explanatory variable (here, education)
• Why we need these assumptions will become clearer later on.
Methodology of Econometrics
(4) Obtaining Data
Year    Y         X
1980    2447.1    3776.3
1981    2476.9    3843.1
1982    2503.7    3760.3
1983    2619.4    3906.6
1984    2746.1    4148.5
1985    2865.8    4279.8
1986    2969.1    4404.5
1987    3052.2    4539.9
1988    3162.4    4718.6
1989    3223.3    4838.0
1990    3260.4    4877.5
1991    3240.8    4821.0
Methodology of Econometrics
(5) Estimating the Econometric Model
$\hat{Y} = -231.8 + 0.7194 X$ (1.3.3)
The estimated MPC is about 0.72: over the sample period, an increase of one unit in real income led, on average, to an increase of about 0.72 units in real consumption expenditure.
Note: a hat (^) above a variable signifies an estimator of the relevant population value.
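The estimate in equation (1.3.3) can be reproduced from the twelve annual observations of Y and X tabulated in these slides. The sketch below is a minimal pure-Python illustration of ordinary least squares; the helper name `ols_fit` is hypothetical and is not taken from the slides.

```python
# Annual data from the slides: Y = consumption expenditure, X = GDP (1980-1991).
Y = [2447.1, 2476.9, 2503.7, 2619.4, 2746.1, 2865.8,
     2969.1, 3052.2, 3162.4, 3223.3, 3260.4, 3240.8]
X = [3776.3, 3843.1, 3760.3, 3906.6, 4148.5, 4279.8,
     4404.5, 4539.9, 4718.6, 4838.0, 4877.5, 4821.0]


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = b1 + b2 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-products and squares in deviation-from-mean form.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b1_hat, b2_hat = ols_fit(X, Y)
print(f"Y_hat = {b1_hat:.1f} + {b2_hat:.4f} X")
```

The printed coefficients should come out close to the intercept of -231.8 and the slope of 0.7194 reported on the slide.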
Methodology of Econometrics
(6) Hypothesis Testing
Do the estimates accord with the expectations of the theory that is being tested? Is MPC < 1 statistically? If so, it may support Keynes' theory.
Confirmation or refutation of economic theories on the basis of sample evidence is the object of statistical inference (hypothesis testing).
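One way to make the question "Is MPC < 1 statistically?" concrete is a one-sided t test on the slope. The sketch below is an illustration under assumptions that the slides do not spell out: the null hypothesis is taken to be a slope of 1 against the alternative of a slope below 1, the usual classical regression assumptions are taken to hold, and the 5% critical value 1.812 (t distribution, 10 degrees of freedom) is read from standard tables.

```python
import math

# Annual data from the slides: Y = consumption expenditure, X = GDP (1980-1991).
Y = [2447.1, 2476.9, 2503.7, 2619.4, 2746.1, 2865.8,
     2969.1, 3052.2, 3162.4, 3223.3, 3260.4, 3240.8]
X = [3776.3, 3843.1, 3760.3, 3906.6, 4148.5, 4279.8,
     4404.5, 4539.9, 4718.6, 4838.0, 4877.5, 4821.0]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n
s_xx = sum((x - x_bar) ** 2 for x in X)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))

# OLS estimates of the intercept and the slope (the MPC).
b2_hat = s_xy / s_xx
b1_hat = y_bar - b2_hat * x_bar

# Residual variance with n - 2 degrees of freedom, then the slope's standard error.
residuals = [y - (b1_hat + b2_hat * x) for x, y in zip(X, Y)]
sigma2_hat = sum(e ** 2 for e in residuals) / (n - 2)
se_b2 = math.sqrt(sigma2_hat / s_xx)

# Assumed test: H0: slope = 1 against H1: slope < 1, at the 5% level.
t_stat = (b2_hat - 1.0) / se_b2
T_CRIT_5PCT_10DF = 1.812  # one-tailed critical value from standard t tables
reject_h0 = t_stat < -T_CRIT_5PCT_10DF

print(f"se(b2) = {se_b2:.4f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```

A rejection here is consistent with, but does not prove, Keynes' claim that the MPC is below unity.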
Methodology of Econometrics
(7) Forecasting or Prediction
• With given future value(s) of X, the future value(s) of Y can be predicted.
• If GDP = Rs 6000 billion in 1994, what is the forecast consumption expenditure?
• $\hat{Y} = -231.8 + 0.7194(6000) = 4084.6$
• Income multiplier M = 1/(1 – MPC) = 3.57. A decrease (increase) of Rs 1 in investment will eventually lead to a Rs 3.57 decrease (increase) in income.
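The forecast and the multiplier on this slide are plain arithmetic on the coefficients of equation (1.3.3). A minimal sketch, in which the function name `forecast_consumption` is hypothetical:

```python
# Coefficients of the estimated consumption function, equation (1.3.3).
B1_HAT = -231.8
B2_HAT = 0.7194


def forecast_consumption(gdp):
    """Point forecast of consumption expenditure for a given GDP value."""
    return B1_HAT + B2_HAT * gdp


forecast = forecast_consumption(6000)

# The slides round the MPC to 0.72 when computing the income multiplier.
mpc = 0.72
multiplier = 1 / (1 - mpc)

print(f"Forecast consumption: {forecast:.1f}")
print(f"Income multiplier: {multiplier:.2f}")
```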
Methodology of Econometrics
(8) Using the model for control or policy purposes
$4000 = -231.8 + 0.7194 X \Rightarrow X \approx 5882$
With MPC = 0.72, an income of Rs 5882 billion will produce an expenditure of Rs 4000 billion. Through fiscal and monetary policy, the government can manipulate the control variable X to get the desired level of the target variable Y.
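The control calculation simply inverts the estimated equation: given a target for Y, solve for the X that produces it. A small sketch, with `required_income` as a hypothetical helper name:

```python
# Coefficients of the estimated consumption function, equation (1.3.3).
B1_HAT = -231.8
B2_HAT = 0.7194


def required_income(target_consumption):
    """Solve target = B1_HAT + B2_HAT * X for the control variable X."""
    return (target_consumption - B1_HAT) / B2_HAT


x_needed = required_income(4000)
# Plugging the answer back in should recover the target level of Y.
implied_consumption = B1_HAT + B2_HAT * x_needed

print(f"Income needed for Y = 4000: {x_needed:.0f}")
```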
[Flowchart: Economic Theory → Mathematical Model → Econometric Model → Data Collection → Estimation → Hypothesis Testing → Forecasting / Application in control or policy studies]
Part one

THE NATURE OF REGRESSION ANALYSIS

Origin of the term "Regression"
• The term REGRESSION was introduced by Francis Galton.
• He found a tendency for tall parents to have tall children and for short parents to have short children.
• But the average height of children born to parents of a given height tended to move (or regress) toward the average height in the population as a whole (F. Galton, "Family Likeness in Stature").
Historical origin of the term "Regression"
• Galton's law was confirmed by Karl Pearson: the average height of sons of a group of tall fathers was less than their fathers' height, and the average height of sons of a group of short fathers was greater than their fathers' height.
• Thus tall and short sons alike "regress" toward the average height of all men (K. Pearson and A. Lee, "On the Laws of Inheritance").
Modern Interpretation of Regression Analysis
• The modern interpretation of regression: regression analysis is concerned with the study of the dependence of one variable (the dependent variable) on one or more other variables (the explanatory variables), with a view to estimating and/or predicting the (population) mean or average value of the former in terms of the known or fixed (in repeated sampling) values of the latter.

Dependent Variable Y; Explanatory Variable Xs
1. Y = Son’s Height; X = Father’s Height
2. Y = Height of boys; X = Age of boys
3. Y = Personal Consumption Expenditure
X = Personal Disposable Income
4. Y = Demand; X = Price
5. Y = Rate of Change of Wages
X = Unemployment Rate
6. Y = Money/Income; X = Inflation Rate
7. Y = % Change in Demand; X = % Change in the
advertising budget
8. Y = Crop yield; Xs = temperature, rainfall, sunshine,
fertilizer
Statistical vs. Deterministic Relationships
• In regression analysis we are concerned with STATISTICAL DEPENDENCE among variables, not functional or deterministic dependence. We essentially deal with RANDOM or STOCHASTIC variables, that is, variables that have probability distributions.
Regression vs. Causation
Regression does not necessarily imply causation. A statistical relationship cannot logically imply causation. "A statistical relationship, however strong and however suggestive, can never establish causal connection: the ideas of causation must come from outside statistics, ultimately from some theory or other" (M. G. Kendall and A. Stuart, "The Advanced Theory of Statistics").
Regression vs. Correlation
• Correlation analysis: the primary objective is to measure the strength or degree of linear association between two variables (both are assumed to be random).
• Regression analysis: we try to estimate or predict the average value of one variable (dependent, and assumed to be stochastic) on the basis of the fixed values of other variables (independent, and non-stochastic).
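The contrast between the two can be seen numerically. The sketch below uses the consumption and GDP series tabulated earlier in these slides: the correlation coefficient treats X and Y symmetrically, whereas the regression slope depends on which variable is taken as dependent. The variable names are illustrative choices.

```python
import math

# Annual data from the slides: Y = consumption expenditure, X = GDP (1980-1991).
Y = [2447.1, 2476.9, 2503.7, 2619.4, 2746.1, 2865.8,
     2969.1, 3052.2, 3162.4, 3223.3, 3260.4, 3240.8]
X = [3776.3, 3843.1, 3760.3, 3906.6, 4148.5, 4279.8,
     4404.5, 4539.9, 4718.6, 4838.0, 4877.5, 4821.0]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n
s_xx = sum((x - x_bar) ** 2 for x in X)
s_yy = sum((y - y_bar) ** 2 for y in Y)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))

# Correlation: one symmetric measure of linear association.
r = s_xy / math.sqrt(s_xx * s_yy)

# Regression: the slope changes with the choice of dependent variable.
slope_y_on_x = s_xy / s_xx
slope_x_on_y = s_xy / s_yy

print(f"r = {r:.4f}")
print(f"slope of Y on X = {slope_y_on_x:.4f}, slope of X on Y = {slope_x_on_y:.4f}")
```

In the two-variable case the product of the two slopes equals the squared correlation coefficient, which ties the two analyses together.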
Terminology and Notation
Dependent variable      Explanatory variable(s)
Explained variable      Independent variable(s)
Predictand              Predictor(s)
Regressand              Regressor(s)
Response                Stimulus or control variable(s)
Endogenous              Exogenous
The Nature and Sources of Data for Econometric Analysis

1) Types of data:
   • Time series data
   • Cross-sectional data
   • Pooled data
2) The sources of data
3) The accuracy of data
Summary and Conclusions
1) The key idea behind regression analysis is the statistical dependence of one variable on one or more other variable(s).
2) The objective of regression analysis is to estimate and/or predict the mean or average value of the dependent variable on the basis of known (or fixed) values of the explanatory variable(s).
3) The success of regression depends on the availability of appropriate data.
4) The researcher should clearly state the sources of the data used in the analysis.
Basic Econometrics

TWO-VARIABLE REGRESSION ANALYSIS
1. A Hypothetical Example
• Total population: 60 families
• Y = weekly family consumption expenditure
• X = weekly disposable family income
• The 60 families were divided into 10 groups of approximately the same income level (80, 100, 120, 140, 160, 180, 200, 220, 240, 260)
A Hypothetical Example
• The table gives the conditional distribution of Y for the given values of X.
• It also gives the conditional probabilities of Y: $p(Y \mid X)$.
• Conditional mean (or expectation): $E(Y \mid X = X_i)$.
Table 2-2: Weekly family income X ($) and weekly family consumption expenditure Y ($)

X        80    100   120   140   160   180   200   220   240   260
Y        55     65    79    80   102   110   120   135   137   150
         60     70    84    93   107   115   136   137   145   152
         65     74    90    95   110   120   140   140   155   175
         70     80    94   103   116   130   144   152   165   178
         75     85    98   108   118   135   145   157   175   180
         --     88    --   113   125   140    --   160   189   185
         --     --    --   115    --    --    --   162    --   191
Total   325    462   445   707   678   750   685  1043   966  1211
Mean     65     77    89   101   113   125   137   149   161   173
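The "Total" and "Mean" rows of Table 2-2 can be reproduced directly from the body of the table. The sketch below stores each income group as a list and computes the conditional mean of Y for each X; the dictionary layout and variable names are illustrative choices, and the conditional probability assumes each family within a group is equally likely.

```python
# Table 2-2: weekly consumption (values) for each weekly income level (keys).
population = {
    80: [55, 60, 65, 70, 75],
    100: [65, 70, 74, 80, 85, 88],
    120: [79, 84, 90, 94, 98],
    140: [80, 93, 95, 103, 108, 113, 115],
    160: [102, 107, 110, 116, 118, 125],
    180: [110, 115, 120, 130, 135, 140],
    200: [120, 136, 140, 144, 145],
    220: [135, 137, 140, 152, 157, 160, 162],
    240: [137, 145, 155, 165, 175, 189],
    260: [150, 152, 175, 178, 180, 185, 191],
}

# Conditional mean E(Y | X = x): the average consumption within each income group.
conditional_means = {x: sum(ys) / len(ys) for x, ys in population.items()}

# Conditional probability p(Y | X = x) of any one family's consumption value,
# assuming families within a group are equally likely.
conditional_prob = {x: 1 / len(ys) for x, ys in population.items()}

total_families = sum(len(ys) for ys in population.values())

for x, mean in conditional_means.items():
    print(f"X = {x}: E(Y | X) = {mean:.0f}, p(Y | X) = {conditional_prob[x]:.3f}")
```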


A Hypothetical Example
• The figure shows the population regression line (curve). It is the regression of Y on X.
• The population regression curve is the locus of the conditional means or expectations of the dependent variable for the fixed values of the explanatory variable X.
The concept of the population regression function (PRF)
• $E(Y \mid X = X_i) = f(X_i)$ is the population regression function (PRF) or population regression (PR).
• In the case of a linear function we have the linear population regression function (or equation or model):
$E(Y \mid X = X_i) = f(X_i) = \beta_1 + \beta_2 X_i$
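For the hypothetical population of 60 families, the conditional means in the "Mean" row of Table 2-2 happen to lie exactly on a straight line, so the linear PRF can be read off from any two of them. The sketch below does this; the resulting intercept and slope are derived here from the table and are not figures quoted in the slides.

```python
# Income levels and the corresponding conditional means from Table 2-2.
income_levels = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
conditional_means = [65, 77, 89, 101, 113, 125, 137, 149, 161, 173]

# Slope and intercept implied by the first two conditional means.
beta2 = (conditional_means[1] - conditional_means[0]) / (income_levels[1] - income_levels[0])
beta1 = conditional_means[0] - beta2 * income_levels[0]

# Check that every conditional mean lies on the line beta1 + beta2 * X.
all_on_line = all(
    abs(beta1 + beta2 * x - m) < 1e-9
    for x, m in zip(income_levels, conditional_means)
)

print(f"E(Y | X) = {beta1:.0f} + {beta2:.1f} X, exact for all groups: {all_on_line}")
```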
The concept of the population regression function (PRF)
$E(Y \mid X = X_i) = f(X_i) = \beta_1 + \beta_2 X_i$
• $\beta_1$ and $\beta_2$ are regression coefficients; $\beta_1$ is the intercept and $\beta_2$ is the slope coefficient
• Linearity in the variables
• Linearity in the parameters
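The two senses of linearity can be told apart with a pair of standard textbook forms. The specific functions below are illustrative choices and are not taken from the slides.

```latex
% Linear in the parameters, but not linear in the variable X:
E(Y \mid X_i) = \beta_1 + \beta_2 X_i^2

% Linear in the variable X, but not linear in the parameters:
E(Y \mid X_i) = \beta_1 + \beta_2^2 X_i
```

In the first form each parameter enters with power one, so it still counts as a linear regression in the sense used in these slides, which take "linear" to mean linear in the unknown parameters.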
Stochastic Specification of PRF
• $u_i = Y_i - E(Y \mid X = X_i)$, or $Y_i = E(Y \mid X = X_i) + u_i$
• $u_i$ = stochastic disturbance or stochastic error term. It is the nonsystematic component.
• The component $E(Y \mid X = X_i)$ is systematic or deterministic. It is the mean consumption expenditure of all the families with the same level of income.
• The assumption that the regression line passes through the conditional means of Y implies that $E(u_i \mid X_i) = 0$.
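The zero conditional mean of the disturbance can be checked on the hypothetical population. The sketch below uses only the first three income groups of Table 2-2, for brevity, and computes each family's disturbance as its consumption minus the group mean; the variable names are illustrative.

```python
# First three income groups of Table 2-2 (weekly income -> weekly consumption).
population = {
    80: [55, 60, 65, 70, 75],
    100: [65, 70, 74, 80, 85, 88],
    120: [79, 84, 90, 94, 98],
}

# Systematic component: the conditional mean E(Y | X) of each group.
conditional_means = {x: sum(ys) / len(ys) for x, ys in population.items()}

# Disturbance of each family: u = Y - E(Y | X).
disturbances = {
    x: [y - conditional_means[x] for y in ys] for x, ys in population.items()
}

# Within each income group the disturbances average out to zero.
mean_disturbance = {x: sum(us) / len(us) for x, us in disturbances.items()}

for x, us in disturbances.items():
    print(f"X = {x}: u = {us}, mean u = {mean_disturbance[x]:.1f}")
```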
The Significance of the Stochastic Disturbance Term
• $u_i$, the stochastic disturbance term, is a surrogate for all variables that are omitted from the model but that collectively affect Y.
• There are many reasons for not including such variables in the model, as follows:
The Significance of the Stochastic Disturbance Term
Why not include as many variables as possible in the model (the reasons for using $u_i$):
+ Vagueness of theory
+ Unavailability of data
+ Core variables vs. peripheral variables
+ Intrinsic randomness in human behavior
+ Poor proxy variables
+ Principle of parsimony
+ Wrong functional form
The Sample Regression Function (SRF)
Table: Two random samples from the population

X      Y (first random sample)    Y (second random sample)
80     70                         55
100    65                         88
120    90                         90
140    95                         80
160    110                        118
180    115                        120
200    120                        145
220    140                        135
240    155                        145
260    150                        175
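Fitting a least-squares line to each of the two random samples shows how the sample regression function changes from sample to sample. The sketch below is a minimal illustration; the helper name `ols_fit` is hypothetical.

```python
# The two random samples from the slides (same X values, different Y draws).
X = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
Y_SAMPLE_1 = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]
Y_SAMPLE_2 = [55, 88, 90, 80, 118, 120, 145, 135, 145, 175]


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = b1 + b2 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


srf1 = ols_fit(X, Y_SAMPLE_1)
srf2 = ols_fit(X, Y_SAMPLE_2)

print(f"SRF1: Y_hat = {srf1[0]:.4f} + {srf1[1]:.4f} X")
print(f"SRF2: Y_hat = {srf2[0]:.4f} + {srf2[1]:.4f} X")
```

The two fitted lines differ because each sample contains different families, even though both are drawn from the same population.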
[Figure: sample regression lines SRF1 and SRF2; weekly consumption expenditure (Y) against weekly income (X)]
The Sample Regression Function (SRF)
• Fig.: SRF1 and SRF2
• $\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_i$
• $\hat{Y}_i$ = estimator of $E(Y \mid X_i)$
• $\hat{\beta}_1$ = estimator of $\beta_1$
• $\hat{\beta}_2$ = estimator of $\beta_2$
• Estimate = a particular numerical value obtained by the estimator in an application
• SRF in stochastic form: $Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + \hat{u}_i$, or $Y_i = \hat{Y}_i + \hat{u}_i$
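The stochastic form of the SRF says that each observation splits into a fitted value and a residual. The sketch below illustrates the decomposition on the first random sample from the slides, using ordinary least squares as the estimator; the variable names are illustrative.

```python
# First random sample from the slides.
X = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
Y = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# Least-squares estimates of the slope and intercept.
b2_hat = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sum(
    (x - x_bar) ** 2 for x in X
)
b1_hat = y_bar - b2_hat * x_bar

# Fitted values (Y_hat) and residuals (u_hat) for every observation.
fitted = [b1_hat + b2_hat * x for x in X]
residuals = [y - f for y, f in zip(Y, fitted)]

for y, f, u in zip(Y, fitted, residuals):
    print(f"Y = {y:5.1f} = Y_hat {f:7.3f} + u_hat {u:7.3f}")
```

With an intercept in the model, the least-squares residuals sum to zero, which is the sample counterpart of the zero-mean disturbance in the PRF.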
The Sample Regression Function (SRF)
• The primary objective in regression analysis is to estimate the PRF $Y_i = \beta_1 + \beta_2 X_i + u_i$ on the basis of the SRF $Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + \hat{u}_i$, and to construct the SRF so that $\hat{\beta}_1$ is as close as possible to $\beta_1$ and $\hat{\beta}_2$ is as close as possible to $\beta_2$.
The Sample Regression Function (SRF)
• Population regression function (PRF)
• Linearity in the parameters
• Stochastic PRF
• The stochastic disturbance term $u_i$ plays a critical role in estimating the PRF
• Sample of observations from the population
• The stochastic sample regression function (SRF) is used to estimate the PRF
Summary and Conclusions
• The key concept underlying regression analysis is the concept of the population regression function (PRF).
• This book deals with linear PRFs: linear in the unknown parameters. They may or may not be linear in the variables.
Summary and Conclusions
• For empirical purposes, it is the stochastic PRF that matters. The stochastic disturbance term $u_i$ plays a critical role in estimating the PRF.
• The PRF is an idealized concept, since in practice one rarely has access to the entire population of interest. Generally, one has a sample of observations from the population and uses the stochastic sample regression function (SRF) to estimate the PRF.
