Module 5
Hypothesis Testing
                 Decision
True State       Do not reject H0     Reject H0 (conclude HA)
H0 True          Correct Decision     Type I Error
H0 False         Type II Error        Correct Decision
• Type I error - Concluding that the new drug is better than the
standard (HA) when in fact it is no better (H0). Ineffective drug
is deemed better.
– Traditionally α = P(Type I error) = 0.05
T.S.: $z_{obs} = \dfrac{\bar{y}_1 - \bar{y}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
• Rejection Region - Set of values of the test
statistic that are consistent with HA, such that
the probability it falls in this region when H0
is true is α (we will always set α = 0.05)
R.R.: $z_{obs} \geq z_{\alpha}$ (α = 0.05 ⇒ $z_{\alpha}$ = 1.645)
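The test statistic and rejection region above can be sketched in a few lines of Python. The summary statistics below are hypothetical values chosen for illustration, not data from these slides, and the function name `two_sample_z` is likewise an assumption.

```python
from math import sqrt

def two_sample_z(ybar1, ybar2, s1, s2, n1, n2):
    """Large-sample z statistic for H0: mu1 - mu2 = 0."""
    standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
    return (ybar1 - ybar2) / standard_error

Z_CRIT = 1.645  # z_0.05, the one-sided critical value used on the slide

# Hypothetical summary statistics (illustration only).
z_obs = two_sample_z(ybar1=12.0, ybar2=10.0, s1=5.0, s2=5.0, n1=50, n2=50)
reject_h0 = z_obs >= Z_CRIT  # rejection region: z_obs >= 1.645

print(f"z_obs = {z_obs:.3f}, reject H0: {reject_h0}")
```

With these made-up inputs the standard error is 1, so the statistic equals the difference in sample means.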
P-value (aka Observed Significance Level)
• P-value - Measure of the strength of evidence the sample data provides against
the null hypothesis:
T.S.: $z_{obs} = \dfrac{\bar{y}_1 - \bar{y}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
R.R.: $z_{obs} \geq z_{\alpha}$
P-value: $P(Z \geq z_{obs})$
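As a rough sketch, the one-sided P-value P(Z ≥ z_obs) can be computed with the standard library's `statistics.NormalDist`. The helper name `upper_tail_p_value` is an assumption made for this example.

```python
from statistics import NormalDist

def upper_tail_p_value(z_obs):
    """P(Z >= z_obs) for a standard normal Z."""
    return 1.0 - NormalDist().cdf(z_obs)

# At the critical value z_0.05 = 1.645 the P-value is about alpha = 0.05.
p_at_critical = upper_tail_p_value(1.645)
print(f"P-value at z_obs = 1.645: {p_at_critical:.4f}")
```

A P-value at or below α leads to the same decision as z_obs falling in the rejection region.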
difference μ1 − μ2
• H0: μ1 − μ2 = 0   HA: μ1 − μ2 ≠ 0
• Test statistic is the same as before
• Decision Rule:
– Conclude μ1 − μ2 > 0 if $z_{obs} \geq z_{\alpha/2}$ (α = 0.05 ⇒ $z_{\alpha/2}$ = 1.96)
– Conclude μ1 − μ2 < 0 if $z_{obs} \leq -z_{\alpha/2}$ (α = 0.05 ⇒ $-z_{\alpha/2}$ = −1.96)
– Do not reject μ1 − μ2 = 0 if $-z_{\alpha/2} < z_{obs} < z_{\alpha/2}$
• P-value: $2P(Z \geq |z_{obs}|)$
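The two-sided decision rule and P-value can be sketched as follows. This is a minimal illustration using `statistics.NormalDist`. The function name and the decision strings are assumptions, not part of the slides.

```python
from statistics import NormalDist

def two_sided_test(z_obs, alpha=0.05):
    """Return (critical value, two-sided P-value, decision) for a z test."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)          # z_{alpha/2}
    p_value = 2 * (1 - std_normal.cdf(abs(z_obs)))      # 2 P(Z >= |z_obs|)
    if z_obs >= z_crit:
        decision = "conclude mu1 - mu2 > 0"
    elif z_obs <= -z_crit:
        decision = "conclude mu1 - mu2 < 0"
    else:
        decision = "do not reject mu1 - mu2 = 0"
    return z_crit, p_value, decision

# Hypothetical observed statistics (illustration only).
for z in (2.5, -1.0):
    z_crit, p_value, decision = two_sided_test(z)
    print(f"z_obs = {z:+.2f}: z_crit = {z_crit:.2f}, P = {p_value:.4f}, {decision}")
```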
Power of a Test
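· Example:
· H0: μ1 − μ2 = 0   HA: μ1 − μ2 > 0
· σ1² = σ2² = 25   n1 = n2 = 25
$z_{obs} = \dfrac{\bar{y}_1 - \bar{y}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \dfrac{\bar{y}_1 - \bar{y}_2}{\sqrt{2}} \geq 1.645 \;\Rightarrow\; \bar{y}_1 - \bar{y}_2 \geq 2.326$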
Power of a Test
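If the true difference is μ1 − μ2 = 3:
Power $= P(\bar{Y}_1 - \bar{Y}_2 \geq 2.326) = P\left(Z \geq \dfrac{2.326 - 3}{1.41}\right) = P(Z \geq -0.48) = .6844$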
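The power calculation can be checked numerically. The sketch below assumes a true difference of 3, read from the (2.326 − 3)/1.41 term, together with σ1² = σ2² = 25 and n1 = n2 = 25 from the example. The function name `power_one_sided` is an assumption.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided(true_diff, var1, var2, n1, n2, alpha=0.05):
    """Power of the upper-tailed z test of H0: mu1 - mu2 = 0."""
    std_normal = NormalDist()
    standard_error = sqrt(var1 / n1 + var2 / n2)
    # Reject H0 when ybar1 - ybar2 >= cutoff.
    cutoff = std_normal.inv_cdf(1 - alpha) * standard_error
    # Probability of landing in the rejection region under the true difference.
    power = 1 - std_normal.cdf((cutoff - true_diff) / standard_error)
    return cutoff, power

cutoff, power = power_one_sided(true_diff=3, var1=25, var2=25, n1=25, n2=25)
print(f"cutoff = {cutoff:.3f}, power = {power:.4f}")
```

The slide rounds the z value to −0.48 before looking up the probability, so unrounded arithmetic gives a slightly smaller power (about 0.683) than the tabled .6844.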
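• Step 2 - Choose the desired power to
detect the clinically meaningful
difference (1−β, typically at least .80). For 2-
sided test:
$n_1 = n_2 = \dfrac{2\left(z_{\alpha/2} + z_{\beta}\right)^2}{\delta^2}$
where δ = (μ1 − μ2)/σ is the clinically meaningful difference in standard-deviation units.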
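A minimal sketch of the sample-size step, assuming the standard equal-variance normal-approximation formula n1 = n2 = 2(z_{α/2} + z_β)²/δ², where δ is the meaningful difference in standard-deviation units. The function name and the example values of δ are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided, two-sample z test.

    delta is the difference to detect divided by the common standard
    deviation (an assumption of this sketch).
    """
    std_normal = NormalDist()
    z_half_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)   # z_beta with beta = 1 - power
    return ceil(2 * (z_half_alpha + z_beta) ** 2 / delta ** 2)

# Hypothetical standardized differences (illustration only).
for delta in (0.5, 1.0):
    print(f"delta = {delta}: n per group = {n_per_group(delta)}")
```

Rounding up with `ceil` keeps the achieved power at or above the target.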
Example - Rosiglitazone for HIV-1 Lipoatrophy
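$\bar{Y}_1 - \bar{Y}_2 \sim N\left(\mu_1 - \mu_2,\ \dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}\right)$
$(1-\alpha)100\%$ CI for μ1 − μ2: $(\bar{y}_1 - \bar{y}_2) \pm z_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
95% CI for μ1 − μ2: $(\bar{y}_1 - \bar{y}_2) \pm 1.96\sqrt{\dfrac{(41.3)^2}{264} + \dfrac{(42.3)^2}{240}} = 39.7 \pm 7.3 = (32.4, 47.0)$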
Source: Carson, et al (2002)
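The interval arithmetic above can be reproduced directly. The sketch uses only the numbers shown on the slide: the difference in sample means 39.7, standard deviations 41.3 and 42.3, and sample sizes 264 and 240. The variable names are assumptions.

```python
from math import sqrt

diff = 39.7            # ybar1 - ybar2, as reported on the slide
s1, n1 = 41.3, 264     # sample SD and size, group 1
s2, n2 = 42.3, 240     # sample SD and size, group 2
z = 1.96               # z_{alpha/2} for a 95% interval

margin = z * sqrt(s1**2 / n1 + s2**2 / n2)
lower, upper = diff - margin, diff + margin

print(f"{diff} +/- {margin:.1f} = ({lower:.1f}, {upper:.1f})")
```

Because the interval lies entirely above 0, it agrees with rejecting H0: μ1 − μ2 = 0 in a two-sided test at α = 0.05.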
ANOVA
ANOVA: Comparing Several Means
• The statistical methodology for comparing
several means is called analysis of variance, or
ANOVA.
• In this case one variable is categorical.
– This variable forms the groups to be compared.
• The response variable is numeric.
• This methodology is the extension of
comparing two means.
ANOVA: Comparing Several Means
• Examples:
– “An investigator is interested in studying the average number of days rats
live when fed diets that contain different amounts of fat. Three
populations were studied, where rats in population 1 were fed a high-fat
diet, rats in population 2 were fed a medium-fat diet, and rats in
population 3 were fed a low-fat diet. The variable of interest is ‘Days
lived.’” (from Graybill, Iyer and Burdick, Applied Statistics, 1998).
– “A state regulatory agency is studying the effects of secondhand smoke in
the workplace. All companies in the state that employ more than 15
workers must file a report with the agency that describes the company’s
smoking policy. In particular, each company must report whether (1)
smoking is allowed (no restrictions), (2) smoking is allowed only in
restricted areas, or (3) smoking is banned. In order to determine the
effect of secondhand smoke, the state agency needs to measure the
nicotine level at the work site. It is not possible to measure the nicotine
level for every company that reports to the agency, and so a simple
random sample of 25 companies is selected from each category of
smoking policy.” (from Graybill, Iyer and Burdick, Applied Statistics, 1998).
Assumptions for ANOVA
$H_0: \mu_1 = \mu_2 = \cdots = \mu_I$
Note: MSE is the pooled sample variance and SSG + SSE = SSTot
$R^2 = \dfrac{SSG}{SSTot}$ is the proportion of the total variation explained by the
difference in means
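To make the identity SSG + SSE = SSTot concrete, here is a small sketch that computes the one-way ANOVA quantities from raw data. The three toy groups are hypothetical numbers chosen so the arithmetic is easy to follow. They are not data from these slides, and the function name is an assumption.

```python
def one_way_anova(groups):
    """Return the sums of squares, mean squares, F and R^2 for a one-way layout."""
    all_values = [y for g in groups for y in g]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-group and within-group sums of squares.
    ssg = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    sstot = sum((y - grand_mean) ** 2 for y in all_values)

    msg = ssg / (len(groups) - 1)        # df = I - 1
    mse = sse / (n_total - len(groups))  # df = N - I, the pooled sample variance
    return {"SSG": ssg, "SSE": sse, "SSTot": sstot,
            "MSG": msg, "MSE": mse, "F": msg / mse, "R2": ssg / sstot}

# Hypothetical toy data (illustration only).
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
result = one_way_anova(toy_groups)
print(result)
```

For the toy data the group means are 2, 3 and 6, so most of the total variation comes from the differences between groups.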
ANOVA: Comparing Several Means
• Information Given
Sample size: N = 32
Stddev1 = 0.89316
Stddev2 = 0.86603
Stddev3 = 0.64507
Stddev4 = 0.85206

Group I     Group II    Group III   Group IV
16 hours    20 hours    24 hours    28 hours
8.95        7.7         5.99        3.78
Further observations (group assignment not recoverable):
6.21, 6.48, 6.07, 8.04, 5.85, 5.78, 4.27, 4.87,
7.81, 7.5, 5.96, 7.3, 7.6, 5.78, 3.14, 3.98
6.9         7.46        6           2.47
[Figure annotation: Variation within groups]
[Figure: plot of the observations by group, vertical axis from 2.00 to 8.00]
[Figure: four normal Q-Q plots, one per group (Expected Normal vs. Observed Value)]
ANOVA: Comparing Several Means
• Information Given
[Figure: dexter plotted against hours deprived (16, 20, 24, 28 hours); labeled values 6.28 and 3.54; annotations "Variation Within Groups" and "Average Within Group Variation (MSE)"]
ANOVA: Comparing Several Means
• Information Given
[Figure: dexter plotted against hours deprived, with dots/lines showing the group means; annotations "Variation Between Groups" and "Average Between Group Variation (MSG)"]
ANOVA: Comparing Several Means
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
$F = \dfrac{MSG}{MSE}$

ANOVA (DEXTER)
                  Sum of Squares    df    Mean Square    F         Sig.
Between Groups    71.928            3     23.976         35.730    .000
Within Groups     18.789            28    .671
Total             90.716            31
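The mean squares and F statistic in the ANOVA table follow from the sums of squares and degrees of freedom. The short check below uses only the table's own numbers. The variable names are assumptions.

```python
# Values taken from the ANOVA table for DEXTER.
ss_between, df_between = 71.928, 3
ss_within, df_within = 18.789, 28

msg = ss_between / df_between      # mean square between groups
mse = ss_within / df_within        # mean square within groups (pooled variance)
f_stat = msg / mse
r_squared = ss_between / (ss_between + ss_within)

print(f"MSG = {msg:.3f}, MSE = {mse:.3f}, F = {f_stat:.2f}, R^2 = {r_squared:.3f}")
```

The Sig. column would additionally require the upper tail of the F distribution with 3 and 28 degrees of freedom, which the Python standard library does not provide. If SciPy is available, `scipy.stats.f.sf(f_stat, 3, 28)` is one way to obtain it.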
ANOVA: Comparing Several Means
[Figure: observed F statistic, 35.73]
• The default null hypothesis is an even distribution (all categories equally likely).
• Kolmogorov-Smirnov – Compares the distribution of
a variable with a uniform, normal, Poisson, or
exponential distribution.
• Null hypothesis: the observed values were sampled
from a distribution of that type.
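As an illustration of what the Kolmogorov-Smirnov comparison measures, the sketch below computes the one-sample statistic D, the largest gap between the empirical distribution function and a hypothesized CDF. It does not reproduce any particular package's procedure or P-value. The sample values and function names are hypothetical.

```python
from statistics import NormalDist

def ks_statistic(sample, cdf):
    """One-sample Kolmogorov-Smirnov statistic D against a hypothesized CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def uniform_cdf(x, low=0.0, high=1.0):
    """CDF of the uniform distribution on [low, high]."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)

# Hypothetical sample (illustration only).
sample = [0.1, 0.3, 0.5, 0.7, 0.9]
d_uniform = ks_statistic(sample, uniform_cdf)
d_normal = ks_statistic(sample, NormalDist(mu=0.5, sigma=0.3).cdf)
print(f"D vs uniform(0, 1): {d_uniform:.3f}")
print(f"D vs normal(0.5, 0.3): {d_normal:.3f}")
```

A small D means the observed values are consistent with the hypothesized distribution. A large D is evidence against the null hypothesis.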
Runs
• A run is defined as a sequence of cases on the
same side of the cut point (an uninterrupted
course of some state or condition, e.g. a
run of good luck).
• You should use the Runs Test procedure when
you want to test the hypothesis that the
values of a variable are ordered randomly with
respect to a cut point of your choosing
(default cut point: the median).
• E.g., suppose you ask 20 students how well they understand a
lecture on a scale from 1 to 5, and the class median is 3. If the
first 10 students give values higher than 3 and the next 10 give
values lower than 3, there are only 2 runs:
5445444545 2222112211
• In a random ordering there should be more runs, but not close
to 20, which would mean the values alternate exactly (a value
below 3 is always followed by one above it, and vice versa),
for example:
2,4,1,5,1,4,2,5,1,4,2,4
• The Runs Test is often used as a precursor to running tests
that compare the means of two or more groups, including:
– The Independent-Samples T Test procedure.
– The One-Way ANOVA procedure.
– The Two-Independent-Samples Tests procedure.
– The Tests for Several Independent Samples procedure.
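The two example sequences from the runs discussion can be checked with a short sketch. It counts runs around a cut point of 3 and computes the usual large-sample z statistic for the number of runs (the Wald-Wolfowitz normal approximation). Treating values equal to the cut point as the upper side is an assumption of this sketch, and the function names are hypothetical.

```python
from math import sqrt

def count_runs(values, cut):
    """Number of runs of consecutive values on the same side of the cut point."""
    sides = [v >= cut for v in values]   # assumption: ties count as the upper side
    runs = 1 + sum(1 for a, b in zip(sides, sides[1:]) if a != b)
    return runs, sides

def runs_test_z(values, cut):
    """Large-sample z statistic for the observed number of runs."""
    runs, sides = count_runs(values, cut)
    n1 = sum(sides)
    n2 = len(sides) - n1
    expected = 2 * n1 * n2 / (n1 + n2) + 1
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return (runs - expected) / sqrt(variance)

clustered = [int(c) for c in "54454445452222112211"]   # first example
alternating = [2, 4, 1, 5, 1, 4, 2, 5, 1, 4, 2, 4]     # second example

for name, seq in (("clustered", clustered), ("alternating", alternating)):
    runs, _ = count_runs(seq, cut=3)
    print(f"{name}: runs = {runs}, z = {runs_test_z(seq, cut=3):.2f}")
```

Too few runs gives a large negative z and too many gives a large positive z. Either is evidence against a random ordering.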
Sample cases (Related Samples)