Courses 1-5 Econometrics 2023

Uploaded by

cristina

WHAT IS ECONOMETRICS?

COURSE 1 ECONOMETRICS

Prof. Ph.D. Erika Marin

WHAT IS ECONOMETRICS?
• Literally, econometrics means "economic measurement."
• Although measurement is an important part of econometrics, the scope of econometrics is much broader.

What is Econometrics?

Definition 1: econometrics = econo + metrics

• economics vs. econometrics
  • economics: focus on "how" and "why"
  • econometrics: focus on "how much" and "by how much"
• example:
  ◼ economist: "If the government increases the alcohol excise tax, consumers will cut down on their alcohol consumption."
  ◼ econometrician: "If the government increases the alcohol excise tax by 20%, consumers will reduce their alcohol consumption by 1%."
→ econometrics is absolutely vital in applying economic theories in practice
  ◼ reflected in the number of econometricians among Nobel Prize laureates
What is Econometrics? (cont'd)
• econometrics is not concerned with the numbers themselves (the concrete information in the previous example), but rather with the methods used to obtain the information → crucial role of statistics

Definition 2: econometrics = statistics for economists

• textbook definitions of econometrics:
  • "application of mathematical statistics to economic data to lend empirical support to models constructed by mathematical economics and to obtain numerical estimates." (Samuelson et al., Econometrica, 1954)
  • "application of mathematics and statistical methods to the analysis of economic data." (www.wikipedia.org)
• econometrics vs. statistics:
  • is econometrics a part of statistics? Not quite: economic data give rise to methods unparalleled in any branch of statistics.

What is the methodology?

• Traditional econometric methodology proceeds along the following steps:
  1. Statement of theory or hypothesis
  2. Specification of the mathematical model of the theory
  3. Specification of the statistical, or econometric, model
  4. Obtaining the data
  5. Estimation of the parameters of the econometric model
  6. Hypothesis testing
  7. Forecasting or prediction
  8. Using the model for control or policy purposes
BIBLIOGRAPHY
• Class notes
• ANDREI, T., Statistică şi econometrie, Bucureşti, Editura Economică, 2004
• GUJARATI, D. N., Basic Econometrics, 3rd ed., New York, McGraw-Hill, 1995
• PECICAN, E., Econometrie ... pentru economişti, Bucureşti, Editura Economică, 2004

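RECAP FROM STATS COURSE

AVERAGE
  Sample: $\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n}$
  Population: $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$

VARIANCE
  Sample: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}$
  Population: $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$

STANDARD DEVIATION
  Sample: $s = \sqrt{s^2}$
  Population: $\sigma = \sqrt{\sigma^2}$

COEFFICIENT OF VARIATION
  Sample: $cv = \frac{s}{\bar{X}}$
  Population: $CV = \frac{\sigma}{\mu}$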
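
The recap formulas can be sketched in a few lines of Python. This is a minimal illustration, not part of the course material: the function name `describe` and the data values are hypothetical, chosen only so the arithmetic is easy to check by hand.

```python
import math


def describe(values, population=False):
    """Return mean, variance, standard deviation and coefficient of variation."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    # Sample variance divides by n - 1; population variance divides by N.
    divisor = n if population else n - 1
    variance = squared_deviations / divisor
    sd = math.sqrt(variance)
    return {"mean": mean, "variance": variance, "sd": sd, "cv": sd / mean}


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
sample_stats = describe(data)
population_stats = describe(data, population=True)
print(sample_stats)
print(population_stats)
```

The only difference between the two calls is the divisor, which is why the sample variance is always slightly larger than the population variance computed on the same numbers.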

Constructing Confidence Intervals
Course 2 Econometrics
Introduction
• Statistical inference is the process by which we acquire information about populations from samples.
• There are two types of inference:
  • Estimation
  • Hypothesis testing

Concepts of Estimation
• The objective of estimation is to determine the value of a population parameter on the basis of a sample statistic.
• There are two types of estimators:
  • Point estimator
  • Interval estimator

Estimation Process
[Figure: a random sample is drawn from a population whose mean, $\mu$, is unknown. The sample mean is $\bar{X} = 50$, which leads to the statement "I am 95% confident that $\mu$ is between 40 and 60."]

Point Estimator
A point estimator draws inference about a
population by estimating the value of an
unknown parameter using a single value
or point.

Interval Estimator
An interval estimator draws inferences
about a population by estimating the value
of an unknown parameter using an
interval.

Population Parameters Estimated

Estimate population parameter...    with sample statistic
Mean         $\mu$                  $\bar{X}$
Proportion   $p$                    $p_s$
Variance     $\sigma^2$             $s^2$
Difference   $\mu_1 - \mu_2$        $\bar{x}_1 - \bar{x}_2$

Confidence Interval Estimation
• Provides a range of values
• Based on observations from one sample
• Gives information about closeness to the unknown population parameter
• Stated in terms of probability: never 100% sure

Level of Confidence
• Probability that the interval constructed from the sample contains the unknown population parameter
• Denoted $(1 - \alpha) \cdot 100\%$ = level of confidence
  • e.g. 90%, 95%, 99%
• $\alpha$ is the probability that the parameter is not within the interval = level of significance

The Confidence Interval for $\mu$ ($\sigma$ is known)
• Three commonly used confidence levels

Confidence level    $\alpha$    $\alpha/2$    $z_{\alpha/2}$
0.90                0.10        0.05          1.645
0.95                0.05        0.025         1.96
0.99                0.01        0.005         2.58
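
The critical values in this table can be reproduced with Python's standard library. The sketch below is only an illustration; the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value z_{alpha/2} for a given confidence level."""
    alpha = 1 - confidence
    # z_{alpha/2} leaves an area of alpha/2 in the upper tail of N(0, 1).
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))
```

For the 99% level the unrounded value is about 2.576, which the table rounds to 2.58.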

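Intervals & Level of Confidence
[Figure: sampling distribution of the mean, centered at $\mu_{\bar{X}} = \mu$, with area $1 - \alpha$ in the middle and $\alpha/2$ in each tail.]
• Intervals extend from $\bar{X} - Z\sigma_{\bar{X}}$ to $\bar{X} + Z\sigma_{\bar{X}}$.
• $(1 - \alpha) \cdot 100\%$ of intervals contain $\mu$; $\alpha \cdot 100\%$ do not.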
Factors Affecting Interval Width
Intervals extend from $\bar{X} - Z\sigma_{\bar{X}}$ to $\bar{X} + Z\sigma_{\bar{X}}$.
• Data variation, measured by $\sigma$
• Sample size: $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
• Level of confidence $(1 - \alpha)$


Confidence Interval Estimates
[Diagram: confidence intervals for the mean ($\sigma$ known, $\sigma$ unknown), for a proportion, and for finite populations.]

Confidence Intervals ($\sigma$ Known)

• Assumptions
  • Population standard deviation is known
  • Population is normally distributed
  • If not normal, use large samples
• Confidence interval estimate

$$\bar{X} - Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$$
Example
• A machine is set up to produce juice bottles.
• A sample of 100 bottles yields an average content of 312 ml.
• Calculate a 95% confidence interval for the average content (for the entire statistical population).
• Assume the standard deviation for the population is 50 ml.

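One way to work the juice bottle example is sketched below, applying the known-$\sigma$ interval to the figures given (n = 100, mean 312 ml, $\sigma$ = 50 ml, 95% confidence). The function name `z_interval` is hypothetical.

```python
import math
from statistics import NormalDist


def z_interval(mean, sigma, n, confidence=0.95):
    """Confidence interval for the mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)  # z_{alpha/2} * sigma / sqrt(n)
    return mean - margin, mean + margin


low, high = z_interval(mean=312, sigma=50, n=100, confidence=0.95)
print(f"{low:.1f} ml <= mu <= {high:.1f} ml")
```

With these inputs the margin is about 1.96 · 50 / 10 = 9.8 ml, so the interval runs from roughly 302.2 ml to 321.8 ml.
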

Confidence Intervals ($\sigma$ Unknown)

• Assumptions
  • Population standard deviation is unknown
  • Population must be normally distributed
• Use Student's t distribution
• Confidence interval estimate

$$\bar{X} - t_{\alpha/2,\,n-1} \cdot \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \cdot \frac{S}{\sqrt{n}}$$
Student's t Distribution
[Figure: the standard normal curve (Z) compared with t curves for df = 13 and df = 5, all centered at 0.]
• Bell-shaped
• Symmetric
• "Fatter" tails than the standard normal


Degrees of Freedom (df)

• Number of observations that are free to vary after the sample mean has been calculated
• Example: the mean of 3 numbers is 2
  X1 = 1 (or any number)
  X2 = 2 (or any number)
  X3 = 3 (cannot vary)
  Mean = 2
  degrees of freedom = n − 1 = 3 − 1 = 2

Student's t Table

Assume: n = 3, so df = n − 1 = 2; $\alpha$ = .10, so $\alpha/2$ = .05.

        Upper Tail Area
df      .25      .10      .05
1       1.000    3.078    6.314
2       0.817    1.886    2.920
3       0.765    1.638    2.353

The t value for df = 2 and upper tail area .05 is 2.920.
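Example: Interval Estimation, $\sigma$ Unknown

A random sample of n = 25 has the average $\bar{X}$ = 50 and s = 8. Set up a 95% confidence interval estimate for $\mu$.

$$\bar{X} - t_{\alpha/2,\,n-1} \cdot \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \cdot \frac{S}{\sqrt{n}}$$

$$50 - 2.0639 \cdot \frac{8}{\sqrt{25}} \le \mu \le 50 + 2.0639 \cdot \frac{8}{\sqrt{25}}$$

$$46.70 \le \mu \le 53.30$$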
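
The arithmetic of this example can be checked with a short sketch. The critical value 2.0639 is taken from the example itself (t table, df = 24) rather than computed, so only the standard library is needed. The function name `t_interval` is hypothetical.

```python
import math


def t_interval(mean, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown.

    t_crit is the tabulated value t_{alpha/2, n-1}, supplied by the caller.
    """
    margin = t_crit * s / math.sqrt(n)
    return mean - margin, mean + margin


low, high = t_interval(mean=50, s=8, n=25, t_crit=2.0639)
print(f"{low:.3f} <= mu <= {high:.3f}")
```

The margin is 2.0639 · 8 / 5 = 3.302, so the unrounded bounds are 46.698 and 53.302.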


Estimation for Finite Populations
• Assumptions
  • Sample is large relative to population: n / N > .05
• Use finite population correction factor
• Confidence interval (mean, $\sigma_X$ unknown)

$$\bar{X} - t_{\alpha/2,\,n-1} \cdot \frac{S}{\sqrt{n}} \cdot \sqrt{\frac{N - n}{N - 1}} \le \mu_X \le \bar{X} + t_{\alpha/2,\,n-1} \cdot \frac{S}{\sqrt{n}} \cdot \sqrt{\frac{N - n}{N - 1}}$$

Confidence Interval Estimate: Proportion
• Assumptions
  • Two categorical outcomes
  • Population follows binomial distribution
  • Normal approximation can be used if $n \cdot p \ge 5$ and $n \cdot (1 - p) \ge 5$
• Confidence interval estimate

$$p_s - Z_{\alpha/2} \cdot \sqrt{\frac{p_s (1 - p_s)}{n}} \le p \le p_s + Z_{\alpha/2} \cdot \sqrt{\frac{p_s (1 - p_s)}{n}}$$

Example: Estimating Proportion

A random sample of 400 voters showed 32 preferred Candidate A. Set up a 95% confidence interval estimate for p.

$$p_s - Z_{\alpha/2} \cdot \sqrt{\frac{p_s (1 - p_s)}{n}} \le p \le p_s + Z_{\alpha/2} \cdot \sqrt{\frac{p_s (1 - p_s)}{n}}$$

$$.08 - 1.96 \cdot \sqrt{\frac{.08 (1 - .08)}{400}} \le p \le .08 + 1.96 \cdot \sqrt{\frac{.08 (1 - .08)}{400}}$$

$$.053 \le p \le .107$$
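
The voter example can be reproduced with the sketch below. The function name `proportion_interval` is hypothetical. The check on the normal approximation uses the sample proportion in place of the unknown p, which is an assumption of this sketch.

```python
import math
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_s = successes / n
    # Rule of thumb from the slide, applied to the sample proportion (assumption).
    if n * p_s < 5 or n * (1 - p_s) < 5:
        raise ValueError("normal approximation is not appropriate")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_s * (1 - p_s) / n)
    return p_s - margin, p_s + margin


low, high = proportion_interval(successes=32, n=400)
print(f"{low:.3f} <= p <= {high:.3f}")
```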
Sample Size

• Too big: requires too many resources
• Too small: won't do the job

Example: Sample Size for Mean

What sample size is needed to be 90% confident of being correct within ± 5? A pilot study suggested that the standard deviation is 45.

$$n = \frac{Z^2 \sigma^2}{\text{Error}^2} = \frac{1.645^2 \cdot 45^2}{5^2} = 219.2 \approx 220$$

Round up.
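
The sample size rule, including the rounding up, can be sketched as follows. The function name `sample_size_for_mean` is hypothetical, and the z value 1.645 is the one used in the example.

```python
import math


def sample_size_for_mean(z, sigma, error):
    """Smallest n such that z * sigma / sqrt(n) does not exceed the error."""
    # Always round up: rounding down would give a margin wider than requested.
    return math.ceil((z * sigma / error) ** 2)


n = sample_size_for_mean(z=1.645, sigma=45, error=5)
print(n)
```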
Introduction to Hypothesis Testing
Course 3-4


Hypothesis Testing: Preliminaries

A hypothesis is a statement that something is true.

Null hypothesis: A hypothesis to be tested. We use the symbol H0 to represent the null hypothesis.
Alternative hypothesis: A hypothesis to be considered as an alternative to the null hypothesis. We use the symbol Ha or H1 to represent the alternative hypothesis.
- The alternative hypothesis is the one believed to be true, or what you are trying to prove is true.

In this course, we will always assume that the null hypothesis for a population parameter specifies a single value for that parameter. So, an equal sign always appears:
$H_0: \mu = \mu_0$
If the primary concern is deciding whether a population parameter is different from a specified value, the alternative hypothesis should be:
$H_a: \mu \ne \mu_0$
A hypothesis test whose alternative hypothesis has this form is called a two-tailed test.
Example: You suspect that the equilibrium wage of low-skilled workers is not the federal minimum wage level of $5.15.
* If the primary concern is whether a population parameter, $\mu$, is less than a specified value $\mu_0$, the alternative hypothesis should be:
$H_a: \mu < \mu_0$
A hypothesis test whose alternative hypothesis has this form is called a left-tailed test.
* If the primary concern is whether a population parameter, $\mu$, is greater than a specified value $\mu_0$, the alternative hypothesis should be:
$H_a: \mu > \mu_0$
A hypothesis test whose alternative hypothesis has this form is called a right-tailed test.
A hypothesis test is called a one-tailed test if it is either right- or left-tailed, i.e., if it is not a two-tailed test.

After we have the null hypothesis, we have to determine whether to reject it or fail to reject it.
The decision to reject or fail to reject is based on information contained in a sample drawn from the population of interest.
The sample values are used to compute a single number, corresponding to a point on a line, which operates as a decision maker. This number is called the test statistic.
If the test statistic falls in an interval that supports the alternative hypothesis, we reject the null hypothesis. This interval is called the rejection region.
If the test statistic falls in an interval that supports the null hypothesis, we fail to reject the null hypothesis. This interval is called the acceptance region.
The point that separates the rejection region from the acceptance region is called the critical value.

We can make mistakes in the test.

Type I error: reject the null hypothesis when it is true. The probability of a type I error is denoted by $\alpha$.
Type II error: fail to reject the null hypothesis when it is false. The probability of a type II error is denoted by $\beta$.
Test of hypothesis for a population mean
• We are basically asking: what observed value of $\bar{x}$ would be different enough from my null hypothesis value to convince me that my null is wrong?
• We always talk in terms of the type I error probability, $\alpha$, which is always chosen to be small (.1, .05, .01).
• The smaller $\alpha$ gets, the stronger the evidence required before concluding that the alternative is correct, because the probability of a type I error is reduced; but the chance of a type II error is increased.

Test of hypothesis for a population mean (two tailed and large sample)
1) Hypothesis:
$H_0: \mu = \mu_0$
$H_a: \mu \ne \mu_0$
2) Test statistic: large sample case
$$z_{obs} = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
3) Critical value, rejection and acceptance region:
- The bigger the absolute value of z is, the more likely we are to reject the null hypothesis.
- The critical value depends on the significance level $\alpha$.
- Rejection region: $|z_{obs}| > z_{\alpha/2}$, where $z_{\alpha/2}$ is the critical value $z_{crit}$

• Example
  • The amount of time required to complete a critical part of a production process on an assembly line is normally distributed. The mean was believed to be 130 seconds.
  • To test if this belief is correct, a sample of 100 randomly selected assemblies was drawn, and the processing time recorded. The sample mean was 126.8 seconds.
  • If the process time is really normal with a standard deviation of 15 seconds, can we conclude that the belief regarding the mean is incorrect?

• Solution
  • Is the mean different from 130?
    $H_0: \mu = 130$
    $H_1: \mu \ne 130$
  • Define the rejection region: $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$

If in reality $\mu = 130$, but we mistakenly reject this hypothesis in favor of $\mu \ne 130$ because $\bar{x}$ was very small or very large, we want this mistake to happen no more than 5% of the time. A sample mean far below 130 or far above 130 should be a rare event if $\mu = 130$.

[Figure: sampling distribution of $\bar{x}$ centered at 130 and the corresponding standard normal curve, with area $\alpha/2 = 0.025$ in each tail beyond $-z_{\alpha/2} = -1.96$ and $z_{\alpha/2} = 1.96$. The observed value z = −2.13 lies in the left rejection region.]

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{126.8 - 130}{15 / \sqrt{100}} = -2.13$$

Since the value of the test statistic falls in the rejection region, we reject the null hypothesis in favor of the alternative hypothesis. There is sufficient evidence to infer that the mean is not 130.
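
The assembly-line test can be sketched in Python with the standard library only. The function name `z_test_two_tailed` is hypothetical. The two-sided p-value is an addition of this sketch that the worked solution does not compute.

```python
import math
from statistics import NormalDist


def z_test_two_tailed(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed z test for a population mean with known sigma."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Two-sided p-value: probability of a |Z| at least as large as observed.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, z_crit, p_value, abs(z) > z_crit


z, z_crit, p_value, reject = z_test_two_tailed(126.8, 130, 15, 100)
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, p = {p_value:.4f}, reject H0: {reject}")
```

Both routes give the same decision here: |z| exceeds 1.96, and the p-value (about 0.03) is below $\alpha$ = 0.05.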
Test of hypothesis for a population mean (one tailed test and large sample)
1) Hypothesis:
$H_0: \mu = \mu_0$
$H_a: \mu > \mu_0$ or $H_a: \mu < \mu_0$
2) Test statistic: large sample case
$$z_{obs} = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
3) Critical value, rejection and acceptance region:
rejection region: $z_{obs} > z_{\alpha}$ (right-tailed) or $z_{obs} < -z_{\alpha}$ (left-tailed)

The rejection region is a range of values such that if the test statistic falls into that range, the null hypothesis is rejected in favor of the alternative hypothesis.

• The p-value and rejection region methods
  • The p-value can be used when making decisions based on rejection region methods as follows:
    • Define the hypotheses to test, and the required significance level $\alpha$.
    • Perform the sampling procedure, calculate the test statistic and the p-value associated with it.
    • Compare the p-value to $\alpha$. Reject the null hypothesis only if p < $\alpha$; otherwise, do not reject the null hypothesis.

[Figure: sampling distribution showing $\alpha$ = 0.05 and the p-value, with labeled values $\mu_{\bar{x}}$ = 170, $\bar{x}_L$ = 175.34 and $\bar{x}$ = 178.]

• Describing the p-value
  • If the p-value is less than 1%, there is overwhelming evidence that supports the alternative hypothesis.
  • If the p-value is between 1% and 5%, there is strong evidence that supports the alternative hypothesis.
  • If the p-value is between 5% and 10%, there is weak evidence that supports the alternative hypothesis.
  • If the p-value exceeds 10%, there is no evidence that supports the alternative hypothesis.


Power of a statistical test

- P(reject the null hypothesis when it is false) = $1 - \beta$
- $(1 - \alpha)$ is the probability that we do not reject the null when it is in fact true
- $(1 - \beta)$ is the probability that we reject the null when it is in fact false: this is the power of the test
- A larger power is preferable
- The power changes depending on what the actual population parameter is

Conclusions of a test of Hypothesis

• If we reject the null hypothesis, we conclude that there is enough evidence to infer that the alternative hypothesis is true.
• If we do not reject the null hypothesis, we conclude that there is not enough statistical evidence to infer that the alternative hypothesis is true.

The alternative hypothesis is the more important one. It represents what we are investigating.
Test of hypothesis for a population mean (two tailed and small sample)
1) Hypothesis:
$H_0: \mu = \mu_0$
$H_a: \mu \ne \mu_0$
2) Test statistic: small sample case
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
3) Critical value, rejection and acceptance region:
- The bigger the absolute value of t is, the more likely we are to reject the null hypothesis.
- The critical value depends on the significance level $\alpha$.
- Rejection region: $|t| > t_{\alpha/2}$, d.f. = n − 1

Test of hypothesis for a population mean (one tailed test and small sample)
1) Hypothesis:
$H_0: \mu = \mu_0$
$H_a: \mu > \mu_0$ or $H_a: \mu < \mu_0$
2) Test statistic: small sample case
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
3) Critical value, rejection and acceptance region:
rejection region: $t > t_{\alpha}$ (right-tailed) or $t < -t_{\alpha}$ (left-tailed), d.f. = n − 1


Example:
A consumer group concerned about the average fat content of a certain burger submits to an independent laboratory a random sample of 12 burgers for analysis.
The percentage of fat in each burger is:
21, 18, 19, 18, 16, 24, 22, 19, 24, 14, 18 and 15
The manufacturer claims that the mean fat content of the burger is less than 20%.
Assuming the fat content is normally distributed, test the validity of the manufacturer's claim.
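
One possible way to work this exercise is sketched below as a left-tailed t test with $H_0: \mu = 20$ and $H_a: \mu < 20$. The exercise does not state a significance level, so $\alpha$ = 0.05 is an assumption of this sketch. The critical value 1.796 is the tabulated t for df = 11 and upper tail area .05.

```python
import math
from statistics import mean, stdev

fat = [21, 18, 19, 18, 16, 24, 22, 19, 24, 14, 18, 15]  # data from the example
mu0 = 20        # value under H0; the claim "less than 20%" is Ha
T_CRIT = 1.796  # tabulated t(df=11, upper tail .05); alpha = 0.05 is assumed

n = len(fat)
x_bar = mean(fat)
s = stdev(fat)  # sample standard deviation (divides by n - 1)
t = (x_bar - mu0) / (s / math.sqrt(n))

# Left-tailed test: reject H0 only if t falls below -t_alpha.
reject = t < -T_CRIT
print(f"mean = {x_bar}, s = {s:.3f}, t = {t:.2f}, reject H0: {reject}")
```

Under these assumptions the sample mean is 19 and t is about −1.07, which is not below −1.796. The sample therefore does not provide enough evidence to support the manufacturer's claim at the 5% level.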
Testing Hypotheses
Two-Sample Procedures
Course 5

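TWO RELATIVELY LARGE INDEPENDENT SAMPLES (n1, n2 > 30)

• Difference between means
  • $H_0: \mu_1 = \mu_2$
  • $H_1: \mu_1 \ne \mu_2$
  • Compare $Z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\bar{x}_1 - \bar{x}_2}}$ with $Z_{\alpha/2}$
    ◆ $\sigma_1$ and $\sigma_2$ known: $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
    ◆ $\sigma_1$ and $\sigma_2$ unknown: $\hat{\sigma}_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
    where $s_1$ and $s_2$ are computed using STDEV
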

TWO RELATIVELY LARGE INDEPENDENT SAMPLES (n1, n2 > 30)

• Difference between means (cont'd)
  • Example: In a manufacturing firm, the productivity of two units is measured. Is the productivity of unit A significantly higher than the productivity of unit B?

EXAMPLE (cont'd)
                        Unit A    Unit B
Sample size             50        60
Average productivity    120       116
Sample standard dev.    10        12

$$\hat{\sigma}_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{100}{50} + \frac{144}{60}} = \sqrt{2 + 2.4} = 2.1$$

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\hat{\sigma}_{\bar{x}_1 - \bar{x}_2}} = \frac{120 - 116}{2.1} = 1.9$$

• Since z = 1.9 is below the two-tailed critical value (1.96 at $\alpha$ = 0.05), do not reject the null hypothesis and conclude that the productivity difference between the two units is not statistically significant.
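
The productivity comparison can be reproduced with the sketch below. The function name `two_sample_z` is hypothetical, and $\alpha$ = 0.05 is an assumption, since the example does not state a significance level.

```python
import math
from statistics import NormalDist


def two_sample_z(mean1, s1, n1, mean2, s2, n2):
    """z statistic for the difference between two large-sample means."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)  # standard error of the difference
    return (mean1 - mean2) / se, se


z, se = two_sample_z(mean1=120, s1=10, n1=50, mean2=116, s2=12, n2=60)

alpha = 0.05  # assumed significance level
two_tailed_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96
one_tailed_crit = NormalDist().inv_cdf(1 - alpha)      # about 1.645
reject_two_tailed = abs(z) > two_tailed_crit
reject_one_tailed = z > one_tailed_crit
print(f"se = {se:.2f}, z = {z:.2f}")
print(f"two-tailed reject: {reject_two_tailed}, one-tailed reject: {reject_one_tailed}")
```

The conclusion in the example corresponds to the two-tailed comparison. If the question "significantly higher" were read as a one-tailed test at the same assumed level, the critical value would be about 1.645 and the same z would lead to rejection, so the choice of alternative hypothesis matters here.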

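TWO RELATIVELY LARGE INDEPENDENT SAMPLES (n1, n2 > 30)

• Difference between two percentages
  • $H_0: \pi_1 = \pi_2$
  • $H_1: \pi_1 \ne \pi_2$
  • Compare $z = \frac{p_1 - p_2}{\hat{\sigma}_{p_1 - p_2}}$ with $Z_{\alpha/2}$

where $p_1$ and $p_2$ are the sample percentages and

$$\hat{\sigma}_{p_1 - p_2} = \sqrt{\frac{p_1 (100 - p_1)}{n_1 - 1} + \frac{p_2 (100 - p_2)}{n_2 - 1}}$$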
INDEPENDENT SAMPLES (at least one of n1 or n2 ≤ 30)

• Difference between means
  • The two samples originate from populations assumed to be normally distributed, and the unknown standard deviations for these two populations are equal
  • $H_0: \mu_1 = \mu_2$
  • Compare $t = \frac{\bar{x}_1 - \bar{x}_2}{\hat{\sigma}_{\bar{x}_1 - \bar{x}_2}}$ with $t(n_1 + n_2 - 2,\ \alpha/2)$

where

$$\hat{\sigma}_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}} \cdot \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

and $s_1$ and $s_2$ are computed using STDEV


INDEPENDENT SAMPLES (at least one of n1 or n2 ≤ 30)

• Example: In the U.S., a company is taken to court on the basis of discrimination in its salary policy. Is the allegation of sex-based salary discrimination statistically likely?

                        Women      Men
Sample size             15         22
Mean salary             $24,467    $33,095
Sample standard dev.    $2,806     $4,189

EXAMPLE (cont'd)
• Can the difference of $8,628 be the result of chance?
• To be able to use the test, note that although the variability in men's salaries is somewhat higher than that of women's, the variability is not substantially different between the groups.
• The standard error of the difference between means is:

$$\hat{\sigma}_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{2806^2 (15 - 1) + 4189^2 (22 - 1)}{15 + 22 - 2}} \cdot \sqrt{\frac{1}{15} + \frac{1}{22}} = 1{,}238$$

• With $\alpha$ = .001, the critical ratio 8,628 / 1,238 = 6.97 must be compared with $t(n_1 + n_2 - 2,\ \alpha/2) = t(35, .0005) = 3.591$. This leads to the rejection of the null hypothesis that the two mean salaries are equal.
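
The pooled standard error and the critical ratio of the salary example can be checked with the sketch below. The function name `pooled_standard_error` is hypothetical. The critical value 3.591 is the tabulated t(35, .0005) quoted in the example, not computed here.

```python
import math


def pooled_standard_error(s1, n1, s2, n2):
    """Standard error of the difference between means, assuming equal variances."""
    pooled_var = (s1 ** 2 * (n1 - 1) + s2 ** 2 * (n2 - 1)) / (n1 + n2 - 2)
    return math.sqrt(pooled_var) * math.sqrt(1 / n1 + 1 / n2)


se = pooled_standard_error(s1=2806, n1=15, s2=4189, n2=22)
difference = 33095 - 24467  # men's mean salary minus women's mean salary
t = difference / se
T_CRIT = 3.591  # t(35, .0005), taken from the example

print(f"difference = {difference}, se = {se:.0f}, t = {t:.2f}, reject H0: {t > T_CRIT}")
```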
