Hypothesis Testing
Joselito O. Jayme
Learning Competencies
The learner. . .
1 illustrates: (a) null hypothesis; (b) alternative hypothesis;
(c) level of significance; (d) rejection region; and (e) types
of errors in hypothesis testing.
2 calculates the probabilities of committing a Type I and
Type II error.
3 identifies the parameter to be tested given a real-life
problem.
4 formulates the appropriate null and alternative hypotheses
on a population mean.
Learning Objectives
1 Define hypothesis testing
Definition
Hypothesis testing is the process of using sample data to decide
whether there is sufficient statistical evidence to reject a given
hypothesis about a population parameter.
Definition
A null hypothesis (Ho) is an assertion about the value of a
population parameter. It is an assertion that we hold as true
unless we have sufficient statistical evidence to conclude
otherwise.
Definition
The alternative hypothesis (Ha) is the negation of the null
hypothesis.
Example 1
A null hypothesis might assert that the population mean is
equal to 100.
Ho : µ = 100
Ha : µ ≠ 100
Example 2
The null hypothesis may assert that the population proportion
p is at least 40%
Ho : p ≥ 40%
Ha : p < 40%
Example 3
The null hypothesis asserts that the population variance is at
most 50.
Ho : σ² ≤ 50
Ha : σ² > 50
Two-tailed test    Right-tailed test    Left-tailed test
Ho : µ = µo        Ho : µ ≤ µo          Ho : µ ≥ µo
Ha : µ ≠ µo        Ha : µ > µo          Ha : µ < µo
Practice Problem 1
A vendor claims that his company fills any accepted order, on
the average, in at most six working days. You suspect that the
average is greater than six working days and want to test the
claim. How will you set up the null and alternative hypotheses?
Practice Problem 2
A manufacturer of golf balls claims that the variance of the
weights of the company's golf balls is controlled to within
0.0028 oz². If you wish to test this claim, how will you set up
the null and alternative hypotheses?
Practice Problem 3
At least 20% of the visitors to a particular commercial Web site
where an electronic product is sold are said to end up ordering
the product. If you wish to test this claim, how will you set up
the null and alternative hypotheses?
Practice Problem 4
The mean number of sick days used per year nationally is
reported to be 5.5 days. A study is undertaken to determine if
the mean number of sick days used for nonunion members in
Kansas differs from the national mean. Give the null and
alternative hypotheses for this scenario.
Practice Problem 5
A study in 1992 established the mean commuting distance for
workers in a certain city to be 15 miles. Because of the
westward spread of the city, it is hypothesized that the current
mean commuting distance exceeds 15 miles. A traffic engineer
wishes to test the hypothesis that the mean commuting
distance for workers in this city is not equal to 15 miles. Give
the null and alternative hypotheses for this scenario.
Assignment 1
A pharmaceutical company claims that four out of five doctors
prescribe the pain medicine it produces. If you wish to test this
claim, how would you set up the null and alternative
hypotheses?
Assignment 2
It is found that Web surfers will lose interest in a Web page if
downloading takes more than 12 seconds at 28K baud rate. If
you wish to test the effectiveness of a newly designed Web page
in regard to its download time, how will you set up the null and
alternative hypotheses?
Assignment 3
During the sharp increase in gasoline prices in the summer of
the year 2006, oil companies claimed that the average price of
unleaded gasoline with minimum octane rating of 89 in the
Midwest was not more than $3.75. If you want to test this
claim, how would you set up the null and alternative
hypotheses?
Level of Significance
The level of significance is the probability of making a Type
I error when the null hypothesis is true as an equality. The
Greek symbol α (alpha) is used to denote the level of
significance. The common choices of α are 0.05 and 0.01.
p-value
A p-value is a probability that provides a measure of the
evidence against the null hypothesis provided by the sample.
Smaller p-values indicate more evidence against Ho.
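As a minimal sketch of how a p-value can be obtained from a z statistic, the following Python uses the standard library's `statistics.NormalDist` in place of a printed z table. The function name `p_value_from_z` and the `tail` labels are illustrative assumptions, not part of the lesson.

```python
from statistics import NormalDist

def p_value_from_z(z, tail):
    """Return the p-value for a z statistic.

    tail: "lower" (Ha uses <), "upper" (Ha uses >), or "two" (Ha uses ≠).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "lower":
        return std_normal.cdf(z)              # area to the left of z
    if tail == "upper":
        return 1 - std_normal.cdf(z)          # area to the right of z
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))  # both tails
    raise ValueError("tail must be 'lower', 'upper', or 'two'")

# A lower-tail statistic of z = -1.95 gives a p-value close to 0.0256.
print(round(p_value_from_z(-1.95, "lower"), 4))
```

The smaller the value returned, the stronger the evidence against Ho.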
Critical value
For a lower-tail test, the critical value is the value of the test
statistic that corresponds to an area of α in the lower tail of the
sampling distribution of the statistic. In other words, it is the
largest value of the test statistic that will result in the rejection
of the null hypothesis. It is denoted by zα.
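The lower-tail critical value can also be looked up numerically instead of from a table. This is a small sketch using the Python standard library; the helper name `lower_tail_critical_z` is hypothetical.

```python
from statistics import NormalDist

def lower_tail_critical_z(alpha):
    """z value with an area of alpha to its left under the standard normal curve."""
    return NormalDist().inv_cdf(alpha)

z_crit = lower_tail_critical_z(0.05)
print(round(z_crit, 3))  # close to the table value -1.645
```

For an upper-tail test the critical value has the same magnitude with a positive sign.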
Critical region
For a two-tailed test at α = 0.05, the set of z scores outside the
range −1.96 to 1.96 constitutes the critical region, also called the
region of rejection of the hypothesis or the region of significance.
The set of z scores inside the range −1.96 to 1.96 is then called the
region of acceptance of the hypothesis or the region of nonsignificance.
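The two-tailed rule can be sketched in Python as follows. The function name `in_critical_region` is an illustrative assumption; the boundary 1.96 is recovered from the standard normal distribution rather than typed in.

```python
from statistics import NormalDist

def in_critical_region(z, alpha=0.05):
    """True if z falls in the two-tailed rejection region at level alpha."""
    z_half = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    return abs(z) > z_half

print(in_critical_region(2.10))   # z lies outside -1.96 to 1.96
print(in_critical_region(-0.57))  # z lies inside -1.96 to 1.96
```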
Example 1
An automatic bottling machine fills cola into 2-liter (2000 mL)
bottles. A consumer advocate wants to test the null hypothesis
that the average amount filled by the machine into a bottle is at
least 2000 mL. A random sample of 40 bottles coming out of
the machine was selected and the exact contents of the selected
bottles are recorded. The sample mean was 1999.6 mL. The
population standard deviation is known from past experience to
be 1.30 mL. Test the null hypothesis at an α of 5%.
Solution.
1.)
Ho : µ ≥ 2000
Ha : µ < 2000
p-value Approach
4.) p-value = 0.5 − 0.4744 = 0.0256
p-value = 0.0256 ≤ α = 0.05. Reject H0 .
5.) Therefore, we find sufficient statistical evidence to reject
the null hypothesis.
Critical Value Approach
4.) z = −1.95 ≤ zα = −1.645
Reject H0.
5.) Therefore, we find sufficient statistical evidence to reject
the null hypothesis.
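The bottling example can be checked with a short script. This is a sketch that assumes the usual z statistic for a mean with known population standard deviation, z = (x̄ − µ0)/(σ/√n); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the bottling-machine example
mu_0 = 2000.0    # hypothesized mean under H0, in mL
x_bar = 1999.6   # sample mean, in mL
sigma = 1.30     # known population standard deviation, in mL
n = 40           # sample size
alpha = 0.05     # level of significance

z = (x_bar - mu_0) / (sigma / sqrt(n))  # test statistic
p_value = NormalDist().cdf(z)           # lower-tail area, since Ha: mu < 2000
z_alpha = NormalDist().inv_cdf(alpha)   # lower-tail critical value

reject_by_p_value = p_value <= alpha
reject_by_critical_value = z <= z_alpha

print(f"z = {z:.2f}, p-value = {p_value:.4f}, z_alpha = {z_alpha:.3f}")
print("Reject H0" if reject_by_p_value else "Do not reject H0")
```

Because the script does not round z to −1.95 before finding the tail area, its p-value may differ from the table-based 0.0256 in the fourth decimal place. Both approaches lead to the same decision.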
Example 2
Consider the following hypothesis test:
Ho : µ ≤ 25
Ha : µ > 25
A sample of 40 provided a sample mean of 26.4. The population
standard deviation is 6 with α = 0.01. What is your conclusion?
Solution.
1.)
Ho : µ ≤ 25
Ha : µ > 25
p-value Approach
4.) p-value = 0.5 − 0.4306 = 0.0694
p-value = 0.0694 ≥ α = 0.01. Do not reject H0 .
5.) Therefore, we do not find sufficient statistical evidence to
reject the null hypothesis.
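The table area 0.4306 used in step 4 corresponds to z = 1.48. As a sketch of the computation that produces this statistic, assuming the usual z formula for a mean with known population standard deviation:

```latex
z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}
  = \frac{26.4 - 25}{6 / \sqrt{40}}
  \approx 1.48
```

Since the test is right-tailed, the p-value is the area to the right of z = 1.48.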
Example 3
Capehan Sa Calicanan states that the mean filling weight of
its product is at least 3 grams per can. The Department of
Trade and Industry (DTI) periodically conducts studies to test
the claims of Capehan Sa Calicanan. Previous tests show
that σ = 0.18. The DTI takes a sample of 36 cans of
Capehan Sa Calicanan coffee and obtains a sample mean of
x̄ = 2.92 grams. At the 0.01 level of significance, what will
be the DTI's conclusion after the study?
Example 4
A teacher gives his class a test, which as he knows from years of
teaching, yields a mean of 80. His present class of 40 students
obtains a mean of 85 and a standard deviation of 8. Can he
claim that his present class is a superior class? Use α = 0.01.
Assignment 5
An electrical firm manufactures light bulbs that have a length
of life that is approximately normally distributed with a mean
of 800 hours, and a standard deviation of 40 hours. Test the
null hypothesis that µ = 800 hours against the alternative
µ ≠ 800 if a random sample of 30 bulbs has an average life of 700
hours. Use the α = 0.05 level of significance.
Assignment 6
Solution.
1.)
Ho : µ = 150
Ha : µ ≠ 150
Solution.
p-value Approach
4.) p-value = 2(0.5 − 0.4826) = 2(0.0174) = 0.0348
p-value = 0.0348 < α = 0.05. Reject H0 .
5.) Therefore, we find sufficient statistical evidence to reject
the null hypothesis.
Reject H0 .
5.) Therefore, we find sufficient statistical evidence to reject
the null hypothesis.
Assignment 6
Solution.
1.)
Ho : µ = 50
Ha : µ ≠ 50
Solution.
Critical Value approach
4.)
$$
t < t_{\alpha/2,\,v=n-1} = t_{0.1/2,\,v=18-1} = t_{0.05,\,v=17} = 1.740
$$
The learner:
1 identifies the appropriate form of the test-statistic when:
Solution:
1.)
H0 : µ = 15
Ha : µ ≠ 15
Solution:
1.)
H0 : µ ≤ 25
Ha : µ > 25
Solution:
1.)
H0 : µ = 18
Ha : µ ≠ 18
Do not reject H0 .
5.) Therefore, we do not find sufficient statistical evidence to
reject the null hypothesis.
Joselito O. Jayme Statistics and Probability
Hypothesis Testing (EXAMPLE)
Solution:
1.)
H0 : µ ≤ 7
Ha : µ > 7
Reject H0 .
5.) Therefore, we find sufficient statistical evidence to reject
the null hypothesis.
Hypothesis Testing (EXAMINATION)
Solution:
1.)
H0 : µ ≥ 20
Ha : µ < 20
Solution:
1.)
H0 : µ ≥ 80
Ha : µ < 80
Solution:
1.)
H0 : µ = 22
Ha : µ ≠ 22
$$
z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}}
  = \frac{0.46 - 0.5}{\sqrt{\dfrac{0.5(1 - 0.5)}{50}}}
  = -0.56568542495 \approx -0.57
$$
Hypothesis Testing (Population Proportion)
$$
z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}}
  = \frac{0.325 - 0.3}{\sqrt{\dfrac{0.3(1 - 0.3)}{40}}}
  = 0.34503277967 \approx 0.35
$$
4. p-value approach
A(0 ≤ z ≤ 0.35) = 0.1368
A(z ≥ 0.35) = 0.5 − 0.1368 = 0.3632
p-value = 2(0.3632) = 0.7264
p-value = 0.7264 ≥ α = 0.05. Do not reject H0.
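The proportion test can be sketched in Python as follows. The function name `proportion_z_test` is hypothetical, and the null hypothesis Ho : p = 0.3 is an assumption inferred from the two-tailed p-value computed in the text.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(p_hat, p_0, n):
    """Return (z, two-tailed p-value) for the null hypothesis p = p_0."""
    z = (p_hat - p_0) / sqrt(p_0 * (1 - p_0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p_value = proportion_z_test(p_hat=0.325, p_0=0.3, n=40)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < 0.05 else "Do not reject H0")
```

The script keeps z unrounded, so its p-value comes out near 0.73 rather than the table-based 0.7264 obtained with z = 0.35. The decision is the same.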
Learning Competencies
The learner. . .
1 illustrates the nature of bivariate data.
2 constructs a scatter plot.
3 describes shape (form), trend (direction), and variation
(strength) based on a scatter plot.
4 estimates strength of association between the variables
based on a scatter plot.
5 calculates the Pearson’s sample correlation coefficient.
6 solves problems involving correlation analysis.
Scatter Diagram
A scatterplot (scatter diagram, scattergram) is a figure in
which the individual data points are plotted in two-dimensional
space.
Example
A report read by a physician indicated that the maximum heart
rate an individual can reach during intensive exercise decreases
with age. The physician decided to do his own study. Ten
randomly selected members of a jogging club performed exercise
tests and recorded their peak heart rates. The results are shown
in the following table:
Age (X) Peak Heart Rate (Y )
10 210
20 200
20 195
25 195
30 190
30 180
30 185
40 180
45 170
50 165
X Y XY
10 210 2100
20 200 4000
20 195 3900
25 195 4875
30 190 5700
30 180 5400
30 185 5550
40 180 7200
45 170 7650
50 165 8250
We construct column X².
X Y XY X²
10 210 2100 100
20 200 4000 400
20 195 3900 400
25 195 4875 625
30 190 5700 900
30 180 5400 900
30 185 5550 900
40 180 7200 1600
45 170 7650 2025
50 165 8250 2500
X Y XY X² Y²
10 210 2100 100 44100
20 200 4000 400 40000
20 195 3900 400 38025
25 195 4875 625 38025
30 190 5700 900 36100
30 180 5400 900 32400
30 185 5550 900 34225
40 180 7200 1600 32400
45 170 7650 2025 28900
50 165 8250 2500 27225
Σ = 300 1870 54625 10350 351400
Thus, we have n = 10; ΣX = 300; ΣY = 1870; ΣXY = 54625;
ΣX² = 10350; ΣY² = 351400.
$$
r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{\left(n\sum X^2 - (\sum X)^2\right)\left(n\sum Y^2 - (\sum Y)^2\right)}}
$$
$$
= \frac{(10)(54625) - (300)(1870)}{\sqrt{\left((10)(10350) - (300)^2\right)\left((10)(351400) - (1870)^2\right)}}
= -0.970793994105179
$$
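The computation of r can be reproduced with a short script. This sketch applies the same computational formula to the age and peak heart rate data from the table; the function name `pearson_r` is an illustrative choice.

```python
from math import sqrt

# Data from the physician's study: age (X) and peak heart rate (Y)
ages = [10, 20, 20, 25, 30, 30, 30, 40, 45, 50]
peak_heart_rates = [210, 200, 195, 195, 190, 180, 185, 180, 170, 165]

def pearson_r(xs, ys):
    """Pearson sample correlation coefficient via the sums-based formula."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

r = pearson_r(ages, peak_heart_rates)
print(round(r, 4))
```

A value of r close to −1 indicates a strong negative linear association: peak heart rate tends to decrease as age increases.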
ACTIVITY
ACTIVITY 1
ACTIVITY 2
ACTIVITY 3
ACTIVITY 4
THANK YOU.