Hypothesis Test1


Hypothesis Testing

Rama Shankar

© Copyright. All rights reserved

Hypothesis Testing

A statistical hypothesis is an assertion about one or more
characteristics of a population. It is often a claim about the
value of a parameter.

Developing Null and Alternative Hypotheses

• Hypothesis testing can be used to determine whether
a statement about the value of a population
parameter should or should not be rejected.
• The null hypothesis, denoted by H0, is a tentative
assumption about a population parameter.
• The alternative hypothesis, denoted by Ha, is the
opposite of what is stated in the null hypothesis.


Hypothesis Testing & the American Justice System

• State the Opposing Conjectures, H0 and HA.
• Determine the amount of evidence required, n, and the risk
of committing a “type I error”, α
• What sort of evaluation of the evidence is required and
what is the justification for this? (type of test)
• What are the conditions which proclaim guilt and those
which proclaim innocence? (Decision Rule)
• Gather & evaluate the evidence.
• What is the verdict? (H0 or HA?)
• Determine a “Zone of Belief” - Confidence Interval.
• What is appropriate justice? --- Conclusions

The American Trial System

             In truth, the defendant is:
Verdict      H0: Innocent                          HA: Guilty
Innocent     Correct decision:                     Incorrect decision:
             innocent individual goes free         guilty individual goes free
Guilty       Incorrect decision:                   Correct decision:
             innocent individual is disciplined    guilty individual is disciplined


Null Hypothesis Ho

• The null hypothesis usually represents a
state of no change or no difference from the
experimenter’s point of view. It is assumed to
be true until sufficient evidence is obtained to
warrant its rejection.

Alternative Hypothesis Ha

• The alternative hypothesis is the statement
on which the researcher places the burden of
proof. Substantial supporting evidence is
required before it will be accepted.


Establish the Hypothesis

• Null hypothesis (Ho) is the statement we assess.
• Ho is usually stated as, “there is no difference.”
• Alternate hypothesis (Ha) is stated as “there is a difference.”
• We fail to reject Ho unless there is convincing evidence to reject it.

For means:        Ho: µa = µb     Ha: µa ≠ µb
For proportions:  Ho: pa = pb     Ha: pa ≠ pb

Steps in Hypothesis Testing
1. Establish the Hypothesis
– State the Null Hypothesis (H0)
– State the Alternative Hypothesis (Ha)
2. Decide on the appropriate statistical test (assumed distribution: Z, t, etc.)
3. State the alpha level (usually 5%)
4. State the beta level (usually 10-20%)
5. Establish the effect size (delta)
6. Establish the sample size
7. Develop a sampling plan
8. Select the samples
9. Conduct the test and collect data
10. Calculate the test statistic (Z, t, etc.) from the data
11. Determine the probability of obtaining a test statistic at least as
extreme as the calculated one if H0 is true (the p-value)
12. If that probability is less than alpha, reject H0
13. If that probability is greater than alpha, do not reject H0


Types of Hypothesis Testing: Single Sample
One Sample Z-Test
• Perform a hypothesis test of the mean when σ is known
• For a two-tailed, one-sample Z
• Sample size (n) greater than or equal to 30
One Sample t-Test
• Perform a hypothesis test of the mean
• σ is unknown
• For a two-tailed, one-sample t
• Sample size (n) less than 30

Types of Hypothesis Testing: Two Samples
Two Sample t-Test
• Use to compare two sample distributions to determine if a difference
exists in their means
• For a two-tailed, two-sample t
• σ is unknown
Paired t-Test
• Use the paired t command to perform a hypothesis test of the difference
between population means when observations are paired.
• A paired t-procedure matches responses that are dependent or related
in a pairwise manner.
• This matching usually results in a smaller error term, since it minimizes
variability in the pair.


A Test of Hypotheses

A test of hypotheses is a method for
using sample data to decide whether
the null hypothesis should be
rejected.

Test Procedure
A test procedure is specified by
1. A test statistic, a function of the
sample data on which the decision is
to be based.
2. A rejection region, the set of all test
statistic values for which H0 will be
rejected (null hypothesis rejected if
the test statistic value falls in this
region.)

Test Statistic
• The test statistic z has a standard normal
probability distribution.
• We can use the standard normal probability
distribution table to find the z-value with
an area of α in the lower (or upper) tail of the
distribution.
• The value of the test statistic that establishes the
boundary of the rejection region for the test is
called the critical value for the test.
• The rejection rule is (for the calculated z value, with zα
taken as the positive critical value):
– Lower tail: Reject H0 if z < −zα.
– Upper tail: Reject H0 if z > zα.
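
As a rough illustration of the rejection rule, the critical values can be computed instead of read from a table. The sketch below uses Python's standard-library `statistics.NormalDist`. The names `z_critical` and `reject_h0` and the tail labels are illustrative assumptions, not part of the slides. It assumes zα is the positive upper-tail cutoff, so the lower-tail rule reads z ≤ −zα.

```python
from statistics import NormalDist

def z_critical(alpha, two_tailed=False):
    """Positive critical value: z_alpha, or z_(alpha/2) for a two-tailed test."""
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

def reject_h0(z, alpha, tail):
    """Apply the rejection rule; tail is 'lower', 'upper', or 'two' (hypothetical labels)."""
    if tail == "upper":
        return z >= z_critical(alpha)
    if tail == "lower":
        return z <= -z_critical(alpha)
    if tail == "two":
        return abs(z) >= z_critical(alpha, two_tailed=True)
    raise ValueError("tail must be 'lower', 'upper', or 'two'")

print(round(z_critical(0.05), 3))                   # 1.645
print(round(z_critical(0.05, two_tailed=True), 3))  # 1.96
```

For example, `reject_h0(-2.6, 0.05, "lower")` returns `True`, while `reject_h0(1.64, 0.05, "two")` returns `False`.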

Type I and Type II Errors

• Since hypothesis tests are based on sample
data, we must allow for the possibility of
errors.
• A Type I error is rejecting H0 when it is true.
• The person conducting the hypothesis test
specifies the maximum allowable probability
of making a Type I error, denoted by α and
called the level of significance.


Type I and Type II Errors

• A Type II error is accepting H0 when it is false.
• Generally, we cannot control the probability of
making a Type II error, denoted by β.
• Statisticians avoid the risk of making a Type II error
by using “fail to reject H0” rather than “accept H0”.

The Risk Truth Table
α is the risk of finding a difference when there really isn’t one.
β is the risk of not finding a difference when there really is one.

                             Action
State of Nature              Ho not rejected               Ho rejected
Ho should not be rejected    Correct                       Type I or producer’s risk
(Ho is true)                                               α = P(Type I)
Ho should be rejected        Type II or consumer’s risk    Correct
(Ho is false)                β = P(Type II)
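
To make the meaning of α concrete, here is a small simulation sketch. It assumes a normal population with mean 0 and known σ = 1, so H0: µ = 0 is true by construction, and counts how often a two-tailed z test at α = .05 rejects anyway. The function name, sample size, trial count, and seed are arbitrary choices made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

def simulate_type_i_rate(alpha=0.05, n=30, trials=2000, seed=1):
    """Fraction of two-tailed z tests that reject H0: mu = 0 when H0 is actually true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # population mean 0, sigma 1
        xbar = sum(sample) / n
        z = (xbar - 0.0) / (1.0 / sqrt(n))                # known sigma = 1
        if abs(z) >= z_crit:
            rejections += 1                               # a Type I error
    return rejections / trials

print(simulate_type_i_rate())  # should land near alpha = 0.05
```

The observed rejection rate fluctuates around α from run to run; it is an estimate of P(Type I), not an exact value.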


Think of…
Pascal’s Wager

                 The TRUTH
Your Decision    God Exists              God Doesn’t Exist
Reject God       BIG MISTAKE             Correct
Accept God       Correct: Big Pay Off    MINOR MISTAKE

Steps of Hypothesis Testing (Manual Method)

1. Determine the null and alternative hypotheses.
2. Specify the level of significance α.
3. Collect the sample data and compute the value of
the test statistic.
4. Use α to determine the rejection region of H0 and
state the rejection rule for H0.
5. Use the value of the test statistic and the rejection
rule to determine whether to reject H0.


Hypothesis Tests for Mean

A Summary of Forms for Null and Alternative
Hypotheses about a Population Mean

• The equality part of the hypotheses always appears
in the null hypothesis.
• In general, a hypothesis test about the value of a
population mean µ must take one of the following
three forms:

H0: µ ≥ µ0      H0: µ ≤ µ0      H0: µ = µ0
Ha: µ < µ0      Ha: µ > µ0      Ha: µ ≠ µ0
One-tailed      One-tailed      Two-tailed


A Normal Population With Known σ

Null hypothesis: $H_0: \mu = \mu_0$

Test statistic value: $z = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$

A Normal Population With Known σ

Alternative Hypothesis     Rejection Region for Level α Test
$H_a: \mu > \mu_0$         $z \ge z_{\alpha}$
$H_a: \mu < \mu_0$         $z \le -z_{\alpha}$
$H_a: \mu \ne \mu_0$       $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$
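
The test statistic and the three rejection regions for the known-σ case can be sketched in a few lines of Python. The function name `z_test_mean`, the `alternative` labels, and the example numbers are hypothetical, chosen only to illustrate the rule.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(xbar, mu0, sigma, n, alpha=0.05, alternative="two-sided"):
    """Return (z, reject) for H0: mu = mu0 when the population sigma is known."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if alternative == "greater":     # Ha: mu > mu0
        reject = z >= std_normal.inv_cdf(1 - alpha)
    elif alternative == "less":      # Ha: mu < mu0
        reject = z <= -std_normal.inv_cdf(1 - alpha)
    else:                            # Ha: mu != mu0
        reject = abs(z) >= std_normal.inv_cdf(1 - alpha / 2)
    return z, reject

# Hypothetical numbers, for illustration only
z, reject = z_test_mean(xbar=5.05, mu0=5.15, sigma=0.25, n=25)
print(round(z, 2), reject)  # -2.0 True
```

Here the standard error is 0.25/√25 = 0.05, so z = −0.10/0.05 = −2.0, which falls beyond the two-tailed cutoff of about 1.96.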

Small Sample Tests (n < 30)

Test statistic: $T = \dfrac{\bar{X} - \mu}{S / \sqrt{n}}$

has a t distribution with n – 1 degrees of freedom.

Small Sample Tests (n < 30)

Null hypothesis: $H_0: \mu = \mu_0$

Test statistic value: $t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$


Small Sample Tests (n < 30)

Alternative Hypothesis     Rejection Region for Level α Test
$H_a: \mu > \mu_0$         $t \ge t_{\alpha,\,n-1}$
$H_a: \mu < \mu_0$         $t \le -t_{\alpha,\,n-1}$
$H_a: \mu \ne \mu_0$       $t \ge t_{\alpha/2,\,n-1}$ or $t \le -t_{\alpha/2,\,n-1}$
Hypothesis Tests for
Proportions


A Population Proportion

Let p denote the proportion of
individuals or objects in a
population who possess a specified
property.

A Summary of Forms for Null and Alternative
Hypotheses about a Population Proportion

• The equality part of the hypotheses always appears
in the null hypothesis.
• In general, a hypothesis test about the value of a
population proportion p must take one of the following
three forms:

H0: p ≥ p0      H0: p ≤ p0      H0: p = p0
Ha: p < p0      Ha: p > p0      Ha: p ≠ p0
One-tailed      One-tailed      Two-tailed


Tests for Proportion p (np ≥ 5 and n(1 − p) ≥ 5)

Null hypothesis: $H_0: p = p_0$

Test statistic value: $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}$

Tests for Proportion p (np ≥ 5 and n(1 − p) ≥ 5)

Alternative Hypothesis     Rejection Region
$H_a: p > p_0$             $z \ge z_{\alpha}$
$H_a: p < p_0$             $z \le -z_{\alpha}$
$H_a: p \ne p_0$           $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$


Student’s t Table

         Upper Tail Area
df       .025      .01       .005
6        2.447     3.143     3.707
7        2.365     2.998     3.500
8        2.306     2.896     3.355

The body of the table contains t values, not probabilities.
Example: for df = 7, an upper tail area of .005 corresponds to t = 3.500.
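
Since the table runs from tail area to t value, it can be useful to go the other way and check a tail area for a given t. The sketch below does this with the standard library only, integrating the t density numerically with Simpson's rule. The function names and the step count are illustrative assumptions; a statistics package would normally be used instead.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) using Simpson's rule on [0, |t|]; steps must be even."""
    if t < 0:
        return 1 - t_upper_tail(-t, df, steps)  # symmetry of the t curve
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t

print(round(t_upper_tail(3.500, 7), 3))  # 0.005, matching the df = 7 row
print(round(t_upper_tail(2.365, 7), 3))  # 0.025
```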
Exercise 1:

1. A two-tailed hypothesis test:
High school students claim that they sleep an
average of 6.0 hours a night. A researcher
asked 30 high school students how many
hours they sleep per night and the sample
average she got was 6.9 hours with a sample
standard deviation of 3.0.
Do you think the students were mistaken in their
claim? Use an alpha risk of 5%.


Answer
1. Define your hypotheses (null, alternative)
H0: µ (average sleep time) = 6.0 hours
Ha: µ (average sleep time) ≠ 6.0 hours
2. Specify your null distribution
We have 30 students and the population standard deviation is unknown,
therefore we use the t distribution with n − 1 = 29 degrees of freedom.

We do not know the true standard deviation of sleep times, but our
sample standard deviation is probably close enough to the true
standard deviation and can be used in our calculation of the standard
error.

Test statistic: $t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$
Standard deviation s = 3.0 hours, sample size n = 30, so
$s / \sqrt{n} = 3.0 / \sqrt{30} \approx 0.55$
Answer
3. Do an experiment
Observed in our experiment: $\bar{x} = 6.9$ hours

4. Calculate the t-value of what you observed
Calculated test statistic: $t_{29} = \dfrac{6.9 - 6.0}{0.55} \approx 1.64$

5. From the t table, the critical value is t(.025, 29 df) = 2.045
(α/2 = .025 in each tail for a two-tailed test at α = .05)


Answer
6. Reject or fail to reject the null hypothesis?
Rejection rule: reject the null hypothesis if the calculated t is ≥ the
critical t or ≤ the negative critical t, that is,
$t \ge t_{\alpha/2,\,n-1}$ or $t \le -t_{\alpha/2,\,n-1}$.
Since the test statistic 1.64 lies between the critical values −2.045
and 2.045, it does not fall in the rejection region.
Fail to reject the null hypothesis at the 5% significance level.
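
The arithmetic in this answer can be checked with a short script. It uses only the numbers given in Exercise 1 and the critical value 2.045 read from the t table; the variable names are illustrative.

```python
from math import sqrt

n, xbar, s, mu0 = 30, 6.9, 3.0, 6.0  # values from Exercise 1
t_crit = 2.045                       # t(.025, 29 df), read from the t table

se = s / sqrt(n)                     # standard error of the mean
t = (xbar - mu0) / se                # calculated test statistic
reject = abs(t) >= t_crit            # two-tailed rejection rule

print(f"SE = {se:.2f}, t = {t:.2f}, reject H0: {reject}")
# SE = 0.55, t = 1.64, reject H0: False
```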
Minitab Method Exercise 1

• Minitab Command:
– Stat> Basic Stat > 1 sample t
• At dialogue box:
– Summarized data
– Sample size = 30
– Mean = 6.9
– Standard Deviation = 3
– Test Mean = 6
• What is your conclusion?


Exercise 2 Cavity hypothesis.mtw

• A single cavity molding press has been
producing insulators with a mean impact
strength of 5.15 ft-lb and a standard deviation of
0.25 ft-lb. A new lot from a new cavity tool shows
the following data from 12 specimens:
• Data in Minitab worksheet “Cavity hypothesis.”
• Is the new lot from the new cavity tool, from
which the sample of 12 was taken different in
mean impact strength from the old cavity tool?
• Assume Alpha = .05
• Source: Juran’s Quality Planning and Analysis, 5th Edition

Minitab Method Exercise 2
• Null Hypothesis: New Mean = Old Mean
• Alternate Hypothesis: New Mean is not = Old Mean
• Minitab Command:
– Stat> Basic Stat > 1 sample t
• At dialogue box:
– Choose “Samples in Column”
– Select “Strength”
– Enter test mean value from exercise: 5.15
– Options: Choose “not equal” and Confidence level = 95
– Graphs: Choose “boxplot”
• What is your conclusion?

Minitab output Exercise 2
[Figure: Boxplot of strength (with Ho and 95% t-confidence interval for the mean); x-axis: strength, 4.85 to 5.15]

What is your conclusion? Reject or fail to reject the null hypothesis?

One-Sample T: strength

Test of mu = 5.15 vs not = 5.15

Variable   N    Mean     StDev    SE Mean   95% CI              T       P
strength   12   4.9517   0.0781   0.0226    (4.9020, 5.0013)    -8.79   0.000
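
As a sanity check, the SE Mean, T, and confidence interval in this output can be approximately reproduced from the printed summary values. The sketch assumes the critical value t(.025, 11 df) = 2.201 from a standard t table, which is not shown on the slides. Small differences from Minitab are expected because the summary values are already rounded.

```python
from math import sqrt

n, xbar, s, mu0 = 12, 4.9517, 0.0781, 5.15  # summary values from the Minitab output
t_crit = 2.201                              # t(.025, 11 df), assumed table lookup

se = s / sqrt(n)                            # standard error of the mean
t = (xbar - mu0) / se                       # test statistic
ci = (xbar - t_crit * se, xbar + t_crit * se)

print(f"SE = {se:.4f}, t = {t:.2f}, 95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```

The hypothesized mean 5.15 lies outside the interval, which agrees with the very small P value in the output.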

Exercise 3 Furnace.mtw
• A study was performed in order to evaluate the effectiveness of two
devices for improving the efficiency of gas home-heating systems.
Energy consumption in houses was measured after one of the two
devices was installed.

• The two devices were an electric vent damper (Furnace 1) and a
thermally activated vent damper (Furnace 2). The energy
consumption data (BTU) are documented in the column for the
appropriate furnace.

• Now you want to compare the effectiveness of these two devices by
determining whether or not there is any evidence that there is a
difference between the two devices.

• Use an alpha level of 5%

Exercise in Minitab

• State the null and alternate hypothesis
– Null Hypothesis: Mu (Furnace 1) = Mu (Furnace 2)
– Alternate Hypothesis: Mu (Furnace 1) not = Mu (Furnace 2)
– Command: Stat> Basic Stat> 2 sample t
– Samples from different columns
– Furnace 1
– Furnace 2
– Assume Equal variance
– Graphs> Box plot
– Options> 95%, Not Equal


Exercise in Minitab
[Figure: Boxplot of Furnace 1, Furnace 2; y-axis: Data]

Are the furnaces different?

T-Test of difference = 0 (vs not =): T-Value = -0.67 P-Value = 0.506 DF = 88

More Exercises
• A new spark plug design is tested for wear. A sample of
6 plugs tested showed wear of: .0058, .0049, .0052,
.0044, .0050 and .0047 inches. The current design has
historically produced an average wear of .0055". With
95% confidence, is the new design better?
• A very expensive experiment has been conducted to
evaluate the manufacture of synthetic diamonds using a
new technique. Five diamonds have been generated and
weights recorded of .46, .61, .52, .57 and .54 carats. An
average diamond weight equal to or greater than .50
carats must be realized for the venture to be profitable.
What is your recommendation assuming 95%
confidence?
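
One possible way to set up the spark plug exercise in code is sketched below. It assumes "better" means less wear, so the test is lower-tailed (Ha: µ < .0055), and it takes the critical value t(.05, 5 df) = 2.015 from a standard t table; both are assumptions rather than something the slides state.

```python
from math import sqrt
from statistics import mean, stdev

wear = [0.0058, 0.0049, 0.0052, 0.0044, 0.0050, 0.0047]  # inches, from the exercise
mu0 = 0.0055                                             # historical average wear
t_crit = 2.015   # t(.05, 5 df), assumed table lookup for a one-tailed test

n = len(wear)
se = stdev(wear) / sqrt(n)       # sample standard deviation uses n - 1
t = (mean(wear) - mu0) / se
reject = t <= -t_crit            # lower-tailed rejection rule

print(f"mean = {mean(wear):.4f}, t = {t:.2f}, reject H0: {reject}")
```

The same pattern applies to the diamond exercise after swapping in its data, hypothesized mean, and the appropriate tail and degrees of freedom.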


P - Values

P – Value (Value shown in statistical software)

The P-value is the smallest level of
significance at which H0 would be
rejected when a specified test procedure
is used on a given data set.
1. P-value ≤ α ⇒ reject H0 at a level of α
2. P-value > α ⇒ do not reject H0 at a level of α

P-Value (area)
Upper-tailed: $P\text{-value} = 1 - \Phi(z)$ (area to the right of z)
Lower-tailed: $P\text{-value} = \Phi(z)$ (area to the left of z)
Two-tailed: $P\text{-value} = 2[1 - \Phi(|z|)]$ (area beyond −|z| and |z|)
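
The three area formulas translate directly into code. In this sketch Φ is taken from Python's standard-library `statistics.NormalDist`; the function name `p_value` and the tail labels are illustrative assumptions.

```python
from statistics import NormalDist

def p_value(z, tail):
    """P-value for a z statistic; tail is 'upper', 'lower', or 'two' (hypothetical labels)."""
    phi = NormalDist().cdf            # standard normal CDF, the slides' Phi
    if tail == "upper":
        return 1 - phi(z)
    if tail == "lower":
        return phi(z)
    if tail == "two":
        return 2 * (1 - phi(abs(z)))
    raise ValueError("tail must be 'upper', 'lower', or 'two'")

print(round(p_value(1.96, "two"), 3))  # 0.05
```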
P–Values for t Tests

The P-value for a t test will be a t
curve area. The number of df for the
one-sample t test is n – 1.


Exercise 4

While talking about the cars that co-workers drive,
John made the claim that at least 15% of them drive
BMWs. Nick found this hard to believe and decided
to check the validity of John’s claim, so he took a
random sample. Does Nick have sufficient evidence
to reject John’s claim if there were 17 BMWs in his
sample of 200 cars in the employee parking lot?

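Answer
1. What is your null hypothesis?
Let P(1) = .15 be John’s claimed value and P(2) be the true proportion that Nick is testing.
Null hypothesis: P(2) ≥ P(1)
Alternative hypothesis: P(2) < P(1)

2. What is your null distribution?
With a large number of trials, the normal distribution approximates the binomial
(here np0 = 30 and n(1 − p0) = 170, both at least 5).

3. Empirical evidence, based on the sample: $\hat{p} = 17/200 = .085$

4. Test statistic: $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}} = \dfrac{.085 - .15}{.025} = -2.6$

5. From the Z table, an area of .05 in the lower tail corresponds to
Z = −1.65. The calculated z of −2.6 lies below this critical value, and the
p-value is P(Z < −2.6), about .005.
6. Since the p-value is low (below α = .05), the null hypothesis must be rejected.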
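
The numbers in this answer can be recomputed without rounding the standard error. The sketch below uses only the exercise values and the .05 level from step 5; variable names are illustrative. With the unrounded standard error the statistic comes out near −2.57 rather than −2.6, which does not change the conclusion.

```python
from math import sqrt
from statistics import NormalDist

n, events, p0, alpha = 200, 17, 0.15, 0.05  # exercise values; alpha as used in step 5

p_hat = events / n                   # sample proportion
se0 = sqrt(p0 * (1 - p0) / n)        # standard error under H0
z = (p_hat - p0) / se0               # test statistic
p_val = NormalDist().cdf(z)          # lower-tailed P-value, P(Z <= z)

print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, P-value = {p_val:.4f}, reject H0: {p_val <= alpha}")
```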


Minitab Method Exercise 4

• Minitab Command:
– Stat> Basic Stat > 2 proportions
• At dialogue box:
– Summarized data
– First Sample: Trials 200; Events 17
– Second Sample: Trials 100; Events 15
– Options: Choose “less than”
– Choose “Pooled standard deviation”
• What is your conclusion?

Exercise

If we have a p-value of 0.03 and so decide that our
effect is statistically significant, what is the
probability that we’re wrong (i.e., that the hypothesis
test gave us a false positive)?


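Answer

If we have a p-value of 0.03 and so decide that our effect is
statistically significant, what is the probability that we’re
wrong (i.e., that the hypothesis test gave us a false positive)?
Answer = .03, i.e., 3%, in this sense: if H0 were true, a result at least
this extreme would occur 3% of the time. The p-value is not the
probability that H0 itself is true.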

Exercise 5
• As your corporation's purchasing manager, you need to
authorize the purchase of twenty new photocopy
machines. After comparing many brands in terms of
price, copy quality, warranty, and features, you have
narrowed the choice to two: Brand X and Brand Y. You
decide that the determining factor will be the reliability of
the brands as defined by the proportion requiring service
within one year of purchase.
• Because your corporation already uses both of these
brands, you were able to obtain information on the
service history of 50 randomly selected machines of
each brand. Records indicate that six Brand X machines
and eight Brand Y machines needed service. Use this
information to guide your choice of brand for purchase.


Minitab solution

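• Choose Stat > Basic Statistics > 2 Proportions.
• Choose Summarized data.
• In First sample (Brand X), under Events, enter 44, the
50 − 6 machines that did not need service. Under Trials, enter 50.
• In Second sample (Brand Y), under Events, enter 42, the
50 − 8 machines that did not need service. Under Trials, enter 50. Click OK.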

Minitab solution
Test and CI for Two Proportions

Sample   X    N    Sample p
1        44   50   0.880000
2        42   50   0.840000

Difference = p (1) - p (2)
Estimate for difference: 0.04
95% CI for difference: (-0.0957903, 0.175790)
Test for difference = 0 (vs not = 0): Z = 0.58 P-Value = 0.564

Fisher's exact test: P-Value = 0.774

What is your conclusion?
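
The Z, P-Value, and confidence interval above can be approximately reproduced with a normal-approximation two-proportion test. The sketch assumes an unpooled standard error, which appears consistent with the printed numbers. It does not attempt Fisher's exact test. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 44, 50   # Brand X: machines not needing service, trials
x2, n2 = 42, 50   # Brand Y: machines not needing service, trials

p1, p2 = x1 / n1, x2 / n2
diff = p1 - p2
se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # unpooled standard error (assumption)
z = diff / se
p_val = 2 * (1 - NormalDist().cdf(abs(z)))          # two-tailed P-value
z975 = NormalDist().inv_cdf(0.975)                  # about 1.96
ci = (diff - z975 * se, diff + z975 * se)

print(f"z = {z:.2f}, P-value = {p_val:.3f}, 95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```

Because the P-value is well above .05 and the interval contains 0, these data do not show a statistically significant difference in reliability between the two brands.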


Exercise
The number of calls to the help desk associated with a new software
release was recorded during May. At the end of May a software
patch was instituted. Data were recorded on the number of calls
associated with report generation during June. Did the patch reduce
the number of calls to the help desk?

1. State the null hypothesis in business terms


2. State the alternate hypothesis in business terms
3. Perform the analysis
4. State your conclusion

Data = Test.mtw columns May and June

Exercise

The number of orders shipped in region 1 and region 2 was recorded.
Management wishes to know if there is a statistically significant
difference in number of orders shipped between the two regions.

1. State the null hypothesis in business terms
2. State the alternate hypothesis in business terms
3. Perform the analysis
4. State your conclusion

Data = Test.mtw columns Region1 and Region2


Exercise
A new type of ink is being proposed by your supplier. The ink is
being tested to see the color saturation obtained using old ink and
new ink (higher is better). QA wishes to know if the new ink is better
than the old ink.

1. State the null hypothesis in business terms


2. State the alternate hypothesis in business terms
3. Perform the analysis
4. State your conclusion

Data = Test.mtw columns 5 & 6
