Lecture w6 2 Hypothesis Testing 1 PDF
ECON 131
Week 6, Lecture 2
Today’s Agenda
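Test statistic

One sample:
$$ z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} $$

Two samples (pooled standard deviation):
$$ s = \sqrt{\frac{1}{n_1 + n_2 - 2}\left((n_1 - 1)s_1^2 + (n_2 - 1)s_2^2\right)}, \qquad z = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} $$

Paired differences:
$$ s_d = \sqrt{\frac{1}{n - 1}\sum_i (d_i - \bar{d})^2}, \qquad z = \frac{\bar{d}}{s_d / \sqrt{n}} $$

Distribution of test statistic (all three cases): Z ~ N(0,1)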
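The paired-difference statistic can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture: the function name `paired_z` and the small data set are hypothetical, chosen only to show the arithmetic (the N(0,1) reference distribution is a large-sample approximation, so a sample this small is for illustration only).

```python
from math import sqrt
from statistics import mean, stdev


def paired_z(before, after):
    """Return (d_bar, s_d, z) for two paired samples of equal length."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    d = [a - b for a, b in zip(after, before)]  # within-pair differences d_i
    n = len(d)
    d_bar = mean(d)
    s_d = stdev(d)  # sample standard deviation, n - 1 in the denominator
    z = d_bar / (s_d / sqrt(n))
    return d_bar, s_d, z


# Hypothetical illustrative data (not from the lecture).
before = [10, 12, 9, 11, 13]
after = [12, 13, 9, 14, 15]
d_bar, s_d, z = paired_z(before, after)
print(f"d_bar = {d_bar:.3f}, s_d = {s_d:.3f}, z = {z:.3f}")
```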
H0: µ = µ0
Test statistic:
$$ z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} $$
Distribution of test statistic: Z ~ N(0,1)
H0: µ = µ0
Test statistic:
$$ z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} $$
Distribution of test statistic: Z ~ N(0,1)

$\bar{x}$ is an estimate of p.
Under H0:
• $E[X_i] = p = \mu_0$
• $\mathrm{Var}[X_i] = p(1 - p)$
• $\mathrm{SD}[X_i] = \sqrt{p(1 - p)} = \sqrt{\mu_0(1 - \mu_0)}$
We can use this instead of s when computing our test statistic.
H0: µ = µ0
Test statistic:
$$ z = \frac{\bar{x} - \mu_0}{\sqrt{\mu_0(1 - \mu_0) / n}} $$
Distribution of test statistic: Z ~ N(0,1)
One-sample t test
------------------------------------------------------------------------------
Variable | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
---------+--------------------------------------------------------------------
vote08 | 639 .6525822 .018851 .4765229 .6155647 .6895996
------------------------------------------------------------------------------
mean = mean(vote08) t = -4.1068
Ho: mean = .73 degrees of freedom = 638
Ha: mean < .73 Ha: mean != .73 Ha: mean > .73
Pr(T < t) = 0.0000 Pr(|T| > |t|) = 0.0000 Pr(T > t) = 1.0000
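The numbers in the Stata output above can be checked by hand. The sketch below recomputes the reported statistic, which uses the sample standard deviation s, and then the alternative statistic that uses the standard deviation implied by H0. The variable names are illustrative; the inputs are copied from the output above.

```python
from math import sqrt

# Values copied from the Stata output above.
n = 639
x_bar = 0.6525822
s = 0.4765229
mu0 = 0.73

# Statistic using the sample standard deviation (what Stata reports as t).
se_sample = s / sqrt(n)
t_stat = (x_bar - mu0) / se_sample

# Statistic using the standard deviation implied by H0: sqrt(mu0 * (1 - mu0)).
se_null = sqrt(mu0 * (1 - mu0) / n)
z_stat = (x_bar - mu0) / se_null

print(f"using s:           {t_stat:.4f}")
print(f"using H0 variance: {z_stat:.4f}")
```

Both versions are far in the left tail, so the conclusion does not depend on which standard deviation is used here.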
H0: µ1 = µ2
Test statistic:
$$ s = \sqrt{\frac{1}{n_1 + n_2 - 2}\left((n_1 - 1)s_1^2 + (n_2 - 1)s_2^2\right)} $$
$$ z = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} $$
Distribution of test statistic: Z ~ N(0,1)
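As a rough sketch of the pooled two-sample calculation, the Python below computes s and z from two raw samples. The function name `pooled_two_sample_z` and the data are hypothetical and serve only to illustrate the formula; with samples this small the N(0,1) approximation would not be reliable.

```python
from math import sqrt
from statistics import mean, variance


def pooled_two_sample_z(x1, x2):
    """Return (s, z): pooled standard deviation and two-sample statistic."""
    n1, n2 = len(x1), len(x2)
    # variance() is the sample variance (n - 1 denominator), i.e. s1^2 and s2^2.
    pooled_var = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 + n2 - 2)
    s = sqrt(pooled_var)
    z = (mean(x1) - mean(x2)) / (s * sqrt(1 / n1 + 1 / n2))
    return s, z


# Hypothetical illustrative data (not from the lecture).
x1 = [4, 6, 5, 7, 8]
x2 = [3, 5, 4, 4]
s, z = pooled_two_sample_z(x1, x2)
print(f"s = {s:.4f}, z = {z:.4f}")
```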
We can do better: with 0/1 data, the standard deviation does not have to be estimated from s1 and s2.
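H0: µ1 = µ2
Test statistic:
$$ \bar{x} = \frac{x_1 + x_2}{n_1 + n_2}, \qquad s = \sqrt{\bar{x}(1 - \bar{x})} $$
$$ z = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} $$
In the pooled mean, x_1 and x_2 are read here as the sample totals (the number of ones in each sample), so that $\bar{x}$ is the mean of the combined sample.
Distribution of test statistic: Z ~ N(0,1)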
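A minimal sketch of this two-sample test for 0/1 data follows. It assumes the pooled mean is the total number of ones divided by the total number of observations. The function name `two_proportion_z` is hypothetical; the counts are the male and female voters from the vote-by-sex table later in this lecture (564 of 798 men and 740 of 991 women voted).

```python
from math import sqrt


def two_proportion_z(ones1, n1, ones2, n2):
    """Return (pooled_mean, z) for two samples of 0/1 outcomes."""
    x_bar1 = ones1 / n1
    x_bar2 = ones2 / n2
    pooled_mean = (ones1 + ones2) / (n1 + n2)    # mean of the combined sample
    s = sqrt(pooled_mean * (1 - pooled_mean))    # SD implied by H0
    z = (x_bar1 - x_bar2) / (s * sqrt(1 / n1 + 1 / n2))
    return pooled_mean, z


# Counts taken from the vote-by-sex table: voters / respondents in each group.
pooled_mean, z = two_proportion_z(564, 798, 740, 991)
print(f"pooled mean = {pooled_mean:.4f}, z = {z:.4f}")
```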
Contingency tables
Suppose we want to know whether voting behavior in the population
could be the same across the liberal/conservative spectrum.
  did r vote |
     in 2008 |                  think of self as liberal or conservative
    election | extremely    liberal   slightly   moderate  slghtly c  conservat  extrmly c |     Total
-------------+-----------------------------------------------------------------------------+----------
           0 |        20         51         32        222         50         46         16 |       437
             |      4.58      11.67       7.32      50.80      11.44      10.53       3.66 |    100.00
-------------+-----------------------------------------------------------------------------+----------
           1 |        56        175        150        417        196        225         48 |     1,267
             |      4.42      13.81      11.84      32.91      15.47      17.76       3.79 |    100.00
-------------+-----------------------------------------------------------------------------+----------
       Total |        76        226        182        639        246        271         64 |     1,704
             |      4.46      13.26      10.68      37.50      14.44      15.90       3.76 |    100.00
Under H0 that the distributions in the populations are identical, we would expect
something more like this:
  did r vote |
     in 2008 |                  think of self as liberal or conservative
    election | extremely    liberal   slightly   moderate  slghtly c  conservat  extrmly c |     Total
-------------+-----------------------------------------------------------------------------+----------
           0 |      19.5       57.9       46.7      163.9       63.1       69.5       16.4 |       437
-------------+-----------------------------------------------------------------------------+----------
           1 |      56.5      168.0      135.3      475.1      183.0      201.5       47.6 |     1,267
-------------+-----------------------------------------------------------------------------+----------
       Total |        76        226        182        639        246        271         64 |     1,704
             |      4.46      13.26      10.68      37.50      14.44      15.90       3.76 |    100.00
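The expected counts above can be reproduced from the margins alone: under H0, each cell's expected count is (row total × column total) / grand total. The short sketch below applies that rule to the observed table. The variable names are illustrative, and the results should agree with the table above up to rounding in the last digit.

```python
# Observed counts from the vote-by-ideology table (rows: did not vote, voted).
observed = [
    [20, 51, 32, 222, 50, 46, 16],
    [56, 175, 150, 417, 196, 225, 48],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under H0: same column distribution in every row.
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

for row in expected:
    print("  ".join(f"{e:7.1f}" for e in row))
```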
Chi-square test
Define the following test statistic:
$$ X^2 = \frac{(O_{11} - E_{11})^2}{E_{11}} + \frac{(O_{12} - E_{12})^2}{E_{12}} + \dots + \frac{(O_{RC} - E_{RC})^2}{E_{RC}} $$
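Here O denotes an observed cell count and E the corresponding expected count under H0, for a table with R rows and C columns. As a sketch, the statistic can be computed for the vote-by-ideology table as follows. The function name `chi_square_statistic` is hypothetical. The degrees of freedom, (R − 1)(C − 1), are the standard choice for this test and are not stated on the slide.

```python
def chi_square_statistic(observed):
    """Return (chi2, dof) for an R x C table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total  # expected under H0
            chi2 += (o - e) ** 2 / e

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof


# Observed counts from the vote-by-ideology table.
table = [
    [20, 51, 32, 222, 50, 46, 16],
    [56, 175, 150, 417, 196, 225, 48],
]
chi2, dof = chi_square_statistic(table)
print(f"chi2 = {chi2:.2f} with {dof} degrees of freedom")
```

A large value of the statistic relative to the chi-square distribution with `dof` degrees of freedom is evidence against H0.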
+----------------+
| Key |
|----------------|
| frequency |
| row percentage |
+----------------+
did r vote |
in 2008 | respondents sex
election | male female | Total
-------------+----------------------+----------
0 | 234 251 | 485
| 48.25 51.75 | 100.00
-------------+----------------------+----------
1 | 564 740 | 1,304
| 43.25 56.75 | 100.00
-------------+----------------------+----------
Total | 798 991 | 1,789
| 44.61 55.39 | 100.00
• Key insight: The cell counts are draws from multinomial
distributions. In the 2x2 case, they are draws from
binomial random variables.
• Fisher's exact test works well for a 2x2 table, but it quickly
becomes infeasible as the table grows (e.g., 5x5 is not feasible).
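For a 2x2 table the exact calculation is small enough to write out. With the margins held fixed, the top-left cell follows a hypergeometric distribution, and the p-value sums the probabilities of the tables that are at least as extreme as the one observed. The sketch below is an illustration under two assumptions: the function name `fisher_exact_two_sided` is hypothetical, and "at least as extreme" is taken to mean "no more probable than the observed table", which is one common two-sided convention.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, col2 = a + b, a + c, b + d
    n = col1 + col2

    def weight(k):
        # Number of ways to get k in the top-left cell with the margins fixed.
        return comb(col1, k) * comb(col2, row1 - k)

    k_min = max(0, row1 - col2)
    k_max = min(row1, col1)
    observed_weight = weight(a)

    # Sum over all tables that are no more probable than the observed one.
    extreme = sum(
        w for w in (weight(k) for k in range(k_min, k_max + 1))
        if w <= observed_weight
    )
    return extreme / comb(n, row1)


# Vote-by-sex table: rows are did not vote / voted, columns are male / female.
p_value = fisher_exact_two_sided(234, 251, 564, 740)
print(f"two-sided p-value = {p_value:.4f}")
```

The loop visits every possible value of a single cell, which is why the 2x2 case is cheap. In a larger table the number of tables with the same margins grows very quickly, which is the infeasibility the slide refers to.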