Module 5

Hypothesis testing involves making statements about unknown population parameters based on sample data. The key elements of a hypothesis test are the null hypothesis (H0), alternative hypothesis (HA), test statistic, and rejection region. Hypothesis tests balance type I and type II errors. Sample distributions are used to test hypotheses and calculate p-values and confidence intervals. Power refers to the probability of correctly rejecting a false null hypothesis and depends on factors like sample size and effect size. Sample size calculations aim to have sufficient power to detect clinically meaningful differences.

Uploaded by Jagadeswar Babu

Hypothesis testing

Hypothesis Testing

• Goal: Make statement(s) regarding unknown population


parameter values based on sample data
• Elements of a hypothesis test:
– Null hypothesis - Statement regarding the value(s) of
unknown parameter(s). Typically will imply no association
between explanatory and response variables in our
applications (will always contain an equality)
– Alternative hypothesis - Statement contradictory to the
null hypothesis (will always contain an inequality)
– Test statistic - Quantity based on sample data and null
hypothesis used to test between null and alternative
hypotheses
– Rejection region - Values of the test statistic for which we
reject the null in favor of the alternative hypothesis
Hypothesis Testing

                      Test Result
True State      Conclude H0 True    Conclude H0 False
H0 True         Correct Decision    Type I Error
H0 False        Type II Error       Correct Decision

  α = P(Type I Error)      β = P(Type II Error)

• Goal: Keep α and β reasonably small


Example - Efficacy Test for New drug

• Drug company has new drug, wishes to


compare it with current standard treatment
• Federal regulators tell company that they must
demonstrate that new drug is better than
current treatment to receive approval
• Firm runs clinical trial where some patients
receive new drug, and others receive standard
treatment
• Numeric response of therapeutic effect is
obtained (higher scores are better).
• Parameter of interest: μNew − μStd
Example - Efficacy Test for New drug
• Null hypothesis - New drug is no better than standard trt

  H0: μNew − μStd ≤ 0    (μNew ≤ μStd)

• Alternative hypothesis - New drug is better than standard trt

  HA: μNew − μStd > 0

• Experimental (Sample) data:

  New drug:   ȳNew, sNew, nNew
  Standard:   ȳStd, sStd, nStd
Sampling Distribution of Difference in Means

• In large samples, the difference in two sample means is
  approximately normally distributed:

  Ȳ1 − Ȳ2 ~ N( μ1 − μ2 , σ1²/n1 + σ2²/n2 )

• Under the null hypothesis, μ1 − μ2 = 0 and:

  Z = (Ȳ1 − Ȳ2) / √(σ1²/n1 + σ2²/n2) ~ N(0, 1)

• σ1² and σ2² are unknown and are estimated by the sample
  variances s1² and s2² in large samples
Example - Efficacy Test for New drug

• Type I error - Concluding that the new drug is better than the
  standard (HA) when in fact it is no better (H0). Ineffective drug
  is deemed better.
  – Traditionally α = P(Type I error) = 0.05

• Type II error - Failing to conclude that the new drug is better
  (HA) when in fact it is. Effective drug is deemed to be no better.
  – Traditionally a clinically important difference (Δ) is
    assigned and sample sizes chosen so that:

    β = P(Type II error | μ1 − μ2 = Δ) ≤ 0.20
Elements of a Hypothesis Test
• Test Statistic - Difference between the Sample means,
scaled to number of standard deviations (standard errors)
from the null difference of 0 for the Population means:

y1  y 2
T .S . : zobs 
s12 s22

n1 n2
• Rejection Region - Set of values of the test
statistic that are consistent with HA, such that
the probability it falls in this region when H0
is true is a (we will always set a=0.05)
R.R. : zobs  z   0.05  z  1.645
P-value (aka Observed Significance Level)

• P-value - Measure of the strength of evidence the sample data provides against
the null hypothesis:

  P(Evidence this strong or stronger against H0 | H0 is true)

  P-val:  p = P(Z ≥ z_obs)

Large-Sample Test H0: μ1 − μ2 = 0 vs HA: μ1 − μ2 > 0

• H0: μ1 − μ2 = 0 (No difference in population means)
• HA: μ1 − μ2 > 0 (Population Mean 1 > Pop Mean 2)

  T.S.:  z_obs = (ȳ1 − ȳ2) / √(s1²/n1 + s2²/n2)

  R.R.:  z_obs ≥ z_α

  P-value:  P(Z ≥ z_obs)

• Conclusion - Reject H0 if the test statistic falls in the
  rejection region, or equivalently if the P-value is less than
  or equal to α
2-Sided Tests

• Many studies don’t assume a direction with respect to the
  difference μ1 − μ2
• H0: μ1 − μ2 = 0     HA: μ1 − μ2 ≠ 0
• Test statistic is the same as before
• Decision Rule:
  – Conclude μ1 − μ2 > 0 if z_obs ≥ z_{α/2}   (α = 0.05 ⇒ z_{α/2} = 1.96)
  – Conclude μ1 − μ2 < 0 if z_obs ≤ −z_{α/2}  (α = 0.05 ⇒ −z_{α/2} = −1.96)
  – Do not reject μ1 − μ2 = 0 if −z_{α/2} ≤ z_obs ≤ z_{α/2}
• P-value: 2·P(Z ≥ |z_obs|)
Power of a Test

• Power - Probability a test rejects H0 (depends on μ1 − μ2)
  – H0 True:  Power = P(Type I error) = α
  – H0 False: Power = 1 − P(Type II error) = 1 − β

• Example:
  • H0: μ1 − μ2 = 0     HA: μ1 − μ2 > 0
  • σ1² = σ2² = 25      n1 = n2 = 25

• Decision Rule: Reject H0 (at α = 0.05 significance level) if:

  z_obs = (ȳ1 − ȳ2) / √(σ1²/n1 + σ2²/n2) ≥ 1.645   ⇔   ȳ1 − ȳ2 ≥ 2.326
Power of a Test

• Now suppose in reality that μ1 − μ2 = 3.0 (HA is true)

• Power now refers to the probability we (correctly)
  reject the null hypothesis. Note that the sampling
  distribution of the difference in sample means is
  approximately normal, with mean 3.0 and standard
  deviation (standard error) 1.414.
• Decision Rule (from last slide): Conclude population
  means differ if the sample mean for group 1 is at
  least 2.326 higher than the sample mean for group 2
• Power for this case can be computed as:

  P(Ȳ1 − Ȳ2 ≥ 2.326)   where  Ȳ1 − Ȳ2 ~ N(3, 2.0)   (sd = 1.414)

Power of a Test

2.326 3
Power P(Y1 Y 2  2.326)  P(Z   0.48)  .6844
1.41
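This power calculation is easy to check numerically; the exact answer differs slightly from .6844 because the slide rounds the cutoff to 2.326 and the z-score to −0.48:

```python
from math import sqrt
from statistics import NormalDist

sigma_sq, n = 25, 25
se = sqrt(sigma_sq / n + sigma_sq / n)       # standard error = 1.414
cutoff = NormalDist().inv_cdf(0.95) * se     # reject when ybar1 - ybar2 >= 2.326
# Under HA the difference in sample means is N(mean=3, sd=1.414):
power = 1 - NormalDist(mu=3.0, sigma=se).cdf(cutoff)
```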

• All else being equal:


• As sample sizes increase, power increases
• As population variances decrease, power increases
• As the true mean difference increases, power
increases
Power of a Test
[Figure: sampling distributions of the test statistic under H0 and under HA, with the rejection region in the upper tail]
Power of a Test

[Figure: power curves for group sample sizes of 25, 50, 75, 100 and varying true values of μ1 − μ2, with σ1 = σ2 = 5]
Sample Size Calculations for Fixed Power

• Goal - Choose sample sizes to have a favorable chance of
  detecting a clinically meaningful difference
• Step 1 - Define an important difference in means:
  – Case 1: σ approximated from prior experience or pilot
    study - difference Δ = μ1 − μ2 can be stated in units of
    the data
  – Case 2: σ unknown - difference must be stated in units of
    standard deviations of the data
• Step 2 - Choose the desired power to detect the clinically
  meaningful difference (1 − β, typically at least .80). For a
  2-sided test:

  n1 = n2 = 2σ²(z_{α/2} + z_β)² / Δ²
Example - Rosiglitazone for HIV-1 Lipoatrophy

• Trts - Rosiglitazone vs Placebo
• Response - Change in limb fat mass
• Clinically Meaningful Difference - 0.5 (std dev’s)
• Desired Power - 1 − β = 0.80
• Significance Level - α = 0.05

  z_{α/2} = 1.96      z_β = z_.20 = 0.84

  n1 = n2 = 2(1.96 + 0.84)² / (0.5)² ≈ 63

Source: Carr, et al (2004)
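The n1 = n2 = 63 figure can be reproduced directly from the formula, rounding the required n up to the next whole subject (σ = 1 because the difference is stated in standard-deviation units):

```python
from math import ceil
from statistics import NormalDist

alpha, beta = 0.05, 0.20
delta = 0.5                                   # clinically meaningful difference, in sd units
z_half = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2} = 1.96
z_beta = NormalDist().inv_cdf(1 - beta)       # z_beta = 0.84
# sigma = 1 since delta is already expressed in standard deviations
n_per_group = ceil(2 * (z_half + z_beta)**2 / delta**2)
```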
Confidence Intervals

• Normally distributed data - approximately 95%
  of individual measurements lie within 2
  standard deviations of the mean
• Difference between 2 sample means is
  approximately normally distributed in large
  samples (regardless of shape of distribution of
  individual measurements):

  Ȳ1 − Ȳ2 ~ N( μ1 − μ2 , σ1²/n1 + σ2²/n2 )

• Thus, we can expect (with 95% confidence) that the
  difference in sample means lies within 2 standard
  errors of μ1 − μ2
(1 − α)100% Confidence Interval for μ1 − μ2

• Large-sample confidence interval for μ1 − μ2:

  ȳ1 − ȳ2 ± z_{α/2} √(s1²/n1 + s2²/n2)

• Standard level of confidence is 95% (z_.025 = 1.96 ≈ 2)
• (1 − α)100% CI’s and 2-sided tests reach the same conclusions
Example - Viagra for ED

• Comparison of Viagra (Group 1) and Placebo (Group 2) for ED
• Data pooled from 6 double-blind trials
• Subjects - White males
• Response - Percent of successful intercourse attempts in
  past 4 weeks (each subject reports his own percentage)

  ȳ1 = 63.2    s1 = 41.3    n1 = 264
  ȳ2 = 23.5    s2 = 42.3    n2 = 240

95% CI for μ1 − μ2:

  (63.2 − 23.5) ± 1.96 √( (41.3)²/264 + (42.3)²/240 ) = 39.7 ± 7.3 = (32.4, 47.0)

Source: Carson, et al (2002)
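The interval can be verified from the study’s summary numbers:

```python
from math import sqrt

y1, s1, n1 = 63.2, 41.3, 264   # Viagra group
y2, s2, n2 = 23.5, 42.3, 240   # Placebo group

# 95% CI: (ybar1 - ybar2) +/- 1.96 * SE of the difference
margin = 1.96 * sqrt(s1**2 / n1 + s2**2 / n2)
lo, hi = (y1 - y2) - margin, (y1 - y2) + margin
```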
ANOVA
ANOVA: Comparing Several Means
• The statistical methodology for comparing
several means is called analysis of variance, or
ANOVA.
• In this case one variable is categorical.
– This variable forms the groups to be compared.
• The response variable is numeric.
• This methodology is the extension of
comparing two means.
ANOVA: Comparing Several Means

• Examples:
– “An investigator is interested in studying the average number of days rats
live when fed diets that contain different amounts of fat. Three
populations were studied, where rats in population 1 were fed a high-fat
diet, rats in population 2 were fed a medium-fat diet, and rats in
population 3 were fed a low-fat diet. The variable of interest is ‘Days
lived.’” (from Graybill, Iyer and Burdick, Applied Statistics, 1998).
– “A state regulatory agency is studying the effects of secondhand smoke in
the workplace. All companies in the state that employ more than 15
workers must file a report with the agency that describes the company’s
smoking policy. In particular, each company must report whether (1)
smoking is allowed (no restrictions), (2) smoking is allowed only in
restricted areas, or (3) smoking is banned. In order to determine the
effect of secondhand smoke, the state agency needs to measure the
nicotine level at the work site. It is not possible to measure the nicotine
level for every company that reports to the agency, and so a simple
random sample of 25 companies is selected from each category of
smoking policy.” (from Graybill, Iyer and Burdick, Applied Statistics, 1998).
Assumptions for ANOVA

1. Each of the I population or group distributions is normal. -


check with a Normal Quantile Plot (or boxplot) of each group
2. These distributions have identical variances (standard
   deviations). - check whether the largest sd is more than 2
   times the smallest sd
3. Each of the I samples is a random sample.
4. Each of the I samples is selected independently of one another.
ANOVA: Comparing Several Means

H0 : 1   2     I

where I is the number of


populations to be compared
The alternative hypothesis (step 2) is
The null hypothesis (step 1) for comparing several means is

H a : not all of the i are equal


(at least one of the means
is different from the others)
ANOVA: Comparing Several Means

• Step 3: State the significance level
• Step 4: Calculate the F-statistic:

  F = Mean Squares Group / Mean Squares Error = MSG / MSE

This compares the variation between groups
(group mean to group mean) to the variation
within groups (individual values to group
means). This is what gives it the name “Analysis of
Variance.”
ANOVA: Comparing Several Means
Pr( Fdf1 ,df 2  Fcalculated )
where df1 = I – 1 (number of
groups minus 1) and
df2 = N – I (total sample size
minus number of groups).
• Step 5: Find the P-value
– The P-value for an ANOVA F-test is always one-sided.
– The P-value is
P-value
F-
distribution
ANOVA: Comparing Several Means

• Step 6. Reject or fail to reject H0 based on the P-value.


– If the P-value is less than or equal to α, reject H0.
– If the P-value is greater than α, fail to reject H0.
• Step 7. State your conclusion.
– If H0 is rejected, “There is significant statistical
evidence that at least one of the population means
is different from another.”
– If H0 is not rejected, “There is not significant
statistical evidence that at least one of the
population means is different from another.”
ANOVA Table
Source            df      Sum of Squares           Mean Square            F            p-value
Group (between)   I − 1   SSG = Σ nᵢ(x̄ᵢ − x̄)²      MSG = SSG / dfG        F = MSG/MSE  Pr(F ≥ Fcalc)
Error (within)    N − I   SSE = Σ (nᵢ − 1)sᵢ²      MSE = SSE / dfE
Total             N − 1   SSTot = Σ (xᵢⱼ − x̄)²     MSTot = SSTot / dfTot

Note: MSE is the pooled sample variance and SSG + SSE = SSTot

  R² = SSG / SSTot  is the proportion of the total variation explained by the
                    difference in means
ANOVA: Comparing Several Means

• Example: “An experimenter is interested in the effect


of sleep deprivation on manual dexterity. Thirty-two
(N) subjects are selected and randomly divided into
four (I) groups of size 8 (ni). After differing amount of
sleep deprivation, all subjects are given a series of
tasks to perform, each of which requires a high
amount of manual dexterity. A score from 0 (poor
performance) to 10 (excellent performance) is
obtained for each subject. Test at the a = 0.05 level
the hypothesis that the degree of sleep deprivation
has no effect on manual dexterity.” (from Milton,
McTeer, and Corbet, Introduction to Statistics, 1997)
ANOVA: Comparing Several Means

• Information Given

  Sample size: N = 32 (ni = 8 per group)

  Group I     Group II    Group III   Group IV
  16 hours    20 hours    24 hours    28 hours
  8.95        7.7         5.99        3.78
  8.04        5.81        6.79        3.35
  7.72        6.61        6.43        2.45
  6.21        6.07        5.85        4.27
  6.48        8.04        5.78        4.87
  7.81        5.96        7.6         3.14
  7.5         7.3         5.78        3.98
  6.9         7.46        6           2.47

  Stddev1 = 0.89316   Stddev2 = 0.86603   Stddev3 = 0.64507   Stddev4 = 0.85206

  (Variation within groups: spread of values around each group mean;
  variation between groups: differences among the four group means.)

Side by Side Boxplots

[Figure: side-by-side boxplots of dexterity scores (roughly 2.00 to 9.00) for Groups I-IV, with a clear drop for Group IV]


Normal Quantile Plots

[Figure: normal quantile plots (expected normal score vs observed value) for each of the four groups]
ANOVA: Comparing Several Means

• Information Given

[Figure: dot plot of group means with ±1 SE error bars - means 7.45, 6.87, 6.28, and 3.54 for 16, 20, 24, and 28 hours deprived; the spread of scores around each mean is the average within-group variation (MSE)]
ANOVA: Comparing Several Means

• Information Given

[Figure: the same group means, highlighting the spread among the means themselves - the average between-group variation (MSG)]
ANOVA: Comparing Several Means

H 0 : 1   2   3   4

Step 2: The alternative hypothesis is


Ha : not all of the i are equal

Step 3: The significance level is a =


0.05
Step 1: The null hypothesis is
ANOVA: Comparing Several Means

Mean Square Group MSG 23.976  35.73


F or 
Mean Square Error MSE 0.671
MSG and MSE are found in the ANOVA table
when the analysis is run on the computer:
• Step 4: Calculate the F-statistic:

M M
ANOVA

DEXTER
Sum of
SG SE
Squares df Mean Square F Sig.
Between Groups 71.928 3 23.976 35.730 .000
Within Groups 18.789 28 .671
Total 90.716 31
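As a check, the F ratio can be recomputed from the raw dexterity scores (grouped here as read from the data slide) by building the sums of squares by hand:

```python
# Dexterity scores by hours of sleep deprivation (from the data slide)
groups = {
    "16 hours": [8.95, 8.04, 7.72, 6.21, 6.48, 7.81, 7.5, 6.9],
    "20 hours": [7.7, 5.81, 6.61, 6.07, 8.04, 5.96, 7.3, 7.46],
    "24 hours": [5.99, 6.79, 6.43, 5.85, 5.78, 7.6, 5.78, 6.0],
    "28 hours": [3.78, 3.35, 2.45, 4.27, 4.87, 3.14, 3.98, 2.47],
}
values = [v for g in groups.values() for v in g]
grand_mean = sum(values) / len(values)

# Between-group (SSG) and within-group (SSE) sums of squares
ssg = sum(len(g) * (sum(g) / len(g) - grand_mean)**2 for g in groups.values())
sse = sum((v - sum(g) / len(g))**2 for g in groups.values() for v in g)

df1, df2 = len(groups) - 1, len(values) - len(groups)   # 3 and 28
F = (ssg / df1) / (sse / df2)                           # MSG / MSE
```

This reproduces the table: SSG ≈ 71.928, SSE ≈ 18.789, F ≈ 35.73.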
ANOVA: Comparing Several Means

Pr( Fdf1 ,df 2  Fcalculated )  Pr( Fdf1 ,df2  35.73)


 .0001

where df1 = I – 1 (number of groups


minus 1) = 4 – 1 = 3 and df2 = N – I
(total sample size minus I) = 32 – 4 = 28
• Step 5:ANOVA
Find the P-value
DEXTER

Sum of The P-value is
Squares df Mean Square F Sig.
Between Groups 71.928 3 23.976 35.730 .000
Within Groups 18.789 28 .671
Total 90.716 31

35.7
3
ANOVA: Comparing Several Means

• Step 6. Reject or fail to reject H0 based on the P-value.

– Because the P-value is less than α = 0.05, reject H0.
• Step 7. State your conclusion.

– “There is significant statistical


evidence that at least one of the
population means is different from
another.”
An additional test will tell us which
means are different from the others.
Non-Parametric Tests

Level of      One-sample      Two-sample case                          K-sample case
measurement   test            Related samples    Independent samples   Related samples   Independent samples
-----------   -------------   ----------------   -------------------   ---------------   -------------------
Nominal       Binomial        McNemar for        Fisher exact          Cochran Q         Chi-square
                              significance       probability
                              of changes         (dichotomous);
                                                 Chi-square
Ordinal       Kolmogorov-     Sign;              Mann-Whitney U;       Friedman          Kruskal-Wallis
              Smirnov;        Wilcoxon           Kolmogorov-Smirnov;   two-way           one-way
              Runs            matched-pairs      Wald-Wolfowitz runs;  analysis of       analysis of
                              signed-ranks       Moses test of         variance;         variance
                                                 extreme reactions     Kendall’s W
Interval                      Walsh              Randomization
• Chi-square – tests whether the observed distribution is the
  same as a certain hypothesized distribution. The default null
  hypothesis is an even (uniform) distribution.
• Kolmogorov-Smirnov – compares the distribution of a variable
  with a uniform, normal, Poisson, or exponential distribution.
• Null hypothesis: the observed values were sampled from a
  distribution of that type.
Runs
• A run is defined as a sequence of cases on the
same side of the cut point. (An uninterrupted
course of some state or condition, for e.g. a
run of good luck).
• You should use the Runs Test procedure when you want to test
  the hypothesis that the values of a variable are ordered
  randomly with respect to a cut point of your choosing
  (default cut point: the median).
• E.g. suppose you ask 20 students how well they understand a
  lecture on a scale from 1 to 5 (and the median in the class
  is 3). If the first 10 students give a value higher than 3
  and the second 10 give a value lower than 3, there are only
  2 runs:  5445444545 2222112211
• In a random situation there should be more runs (but not
  close to 20, which would mean the values alternate exactly:
  a value below 3 followed by one above it, and vice versa):
  2,4,1,5,1,4,2,5,1,4,2,4
• The Runs Test is often used as a precursor to running tests
that compare the means of two or more groups, including:
– The Independent-Samples T Test procedure.
– The One-Way ANOVA procedure.
– The Two-Independent-Samples Tests procedure.
– The Tests for Several Independent Samples procedure.
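The run-counting step is easy to sketch; this stdlib-only helper reproduces the two examples above (2 runs for the clustered ratings, 12 for the perfectly alternating sequence). Dropping values equal to the cut point is one common convention:

```python
def count_runs(values, cut):
    """Count runs of values strictly above/below the cut point.
    Values equal to the cut are dropped before counting."""
    sides = [v > cut for v in values if v != cut]
    if not sides:
        return 0
    # a new run starts at every change of side
    return 1 + sum(a != b for a, b in zip(sides, sides[1:]))

clustered = [5, 4, 4, 5, 4, 4, 4, 5, 4, 5, 2, 2, 2, 2, 1, 1, 2, 2, 1, 1]
alternating = [2, 4, 1, 5, 1, 4, 2, 5, 1, 4, 2, 4]
```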
Sample cases (Related Samples)

• McNemar – tests whether the changes in proportions are the
  same for pairs of dichotomous variables. McNemar’s test is
  computed like the usual chi-square test, but only the two
  cells in which the classifications don’t match are used.
• Null hypothesis: people are equally likely to fall into the
  two contradictory classification categories.
• Sign test – tests whether the numbers of differences (+ve or
  –ve) between two samples are approximately the same. Each
  pair of scores (before and after) is compared.
• When “after” > “before”, a + sign is recorded; when smaller,
  a – sign. When both are the same, it is a tie.
• The sign test does not use all the information available (the
  size of the difference), but it requires fewer assumptions
  about the sample and can avoid the influence of outliers.
• Example: to test the association between the following two
  perceptions: “Social workers help the disadvantaged” and
  “Social workers bring hope to those in adverse situations”
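A sketch of the two-sided sign-test p-value: under H0 the sign counts are Binomial(n, 0.5), so the p-value is just a binomial tail sum. The 8-vs-2 split below is a made-up example, not data from the slides:

```python
from math import comb

def sign_test_p(n_plus, n_minus):
    """Two-sided sign-test p-value; ties are assumed dropped already.
    Under H0, the number of + signs is Binomial(n, 0.5)."""
    n = n_plus + n_minus
    k = max(n_plus, n_minus)
    one_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2**n
    return min(1.0, 2 * one_tail)

p = sign_test_p(8, 2)   # hypothetical: 8 pluses, 2 minuses out of 10 pairs
```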
• Wilcoxon matched-pairs signed-ranks test – similar to the sign
  test, but takes into consideration the ranking of the magnitude
  of the differences among the pairs of values. (The sign test
  only considers the direction of the difference, not the
  magnitude.)
• The test requires that the differences (of the true values) be
  a sample from a symmetric distribution (but does not require
  normality). It is a good idea to run a stem-and-leaf plot of
  the differences.
Two-sample case (independent samples)

• Mann-Whitney U – similar to the Wilcoxon matched-pairs
  signed-ranks test except that the samples are independent, not
  paired. It is the most commonly used alternative to the
  independent-samples t test.
• Null hypothesis: the population means are the same for the
two groups.
• The actual computation of the Mann-Whitney test is simple.
You rank the combined data values for the two groups. Then
you find the average rank in each group.
• Requirement: the population variances for the two groups
must be the same, but the shape of the distribution does not
matter.
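The ranking computation described above can be sketched directly; this version assigns average ranks to tied values, sums the ranks of the first group, and returns the smaller of the two U statistics (a common reporting convention):

```python
def mann_whitney_u(x, y):
    """Mann-Whitney U from the combined ranks of two independent samples."""
    vals = x + y
    n = len(vals)
    order = sorted(range(n), key=lambda i: vals[i])
    ranks = [0.0] * n
    i = 0
    while i < n:                        # assign average ranks to ties
        k = i
        while k + 1 < n and vals[order[k + 1]] == vals[order[i]]:
            k += 1
        avg_rank = (i + k) / 2 + 1      # ranks are 1-based
        for t in range(i, k + 1):
            ranks[order[t]] = avg_rank
        i = k + 1
    r1 = sum(ranks[:len(x)])            # rank sum of the first group
    u1 = r1 - len(x) * (len(x) + 1) / 2
    u2 = len(x) * len(y) - u1
    return min(u1, u2)

u = mann_whitney_u([1, 2, 3], [4, 5, 6])   # complete separation -> U = 0
```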
• Kolmogorov-Smirnov Z – tests whether two distributions are
  different. It is used when there are only a few values
  available on the ordinal scale. The K-S test is more powerful
  than the M-W U test if the two distributions differ in
  dispersion rather than central tendency.
K-sample case
(Independent samples)

• Kruskal-Wallis one-way ANOVA – more powerful than the
  chi-square test when an ordinal scale can be assumed. It is
  computed exactly like the Mann-Whitney test, except that there
  are more groups. The data must be independent samples from
  populations with the same shape (but not necessarily normal).
