Sta301 Lec40

The lecture discusses hypothesis testing and confidence intervals using the t-distribution, particularly for small sample sizes from normal populations with unknown variances. It includes examples of testing the mean height of animals and estimating the difference in average song lengths between pop and semi-classical music. The lecture also covers the application of the t-distribution in testing equality of means and paired observations.

Uploaded by

Bilal Haider

Virtual University of Pakistan

Lecture No. 40
of the course on
Statistics and Probability
by
Miss Saleha Naghmi Habibullah
IN THE LAST LECTURE,
YOU LEARNT

• Hypothesis Testing Regarding $p_1 - p_2$ (based on Z-statistic)
• The Student's t-distribution
• Confidence Interval for $\mu$ based on the t-distribution
TOPICS FOR TODAY

• Tests and Confidence Intervals based on the t-distribution
In the last lecture, we introduced the t-distribution, and began the discussion of statistical inference based on the t-distribution.
In particular, we discussed the construction of the confidence interval for $\mu$ in that situation when we are drawing a small sample from a normal population having unknown variance $\sigma^2$.
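When the parent population is normal, the population variance is unknown, and the sample size n is small (less than 30), then the confidence interval for $\mu$ is given by

$$\bar{x} \pm t_{\alpha/2}(n-1)\,\frac{s}{\sqrt{n}}$$

where
$\bar{x} = \dfrac{\sum x}{n}$ is the sample mean,
$s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n-1}}$ is the sample standard deviation,
n = sample size,
and $t_{\alpha/2}(\nu)$ is found by looking in the t-table under the appropriate value of $\alpha$ against $\nu = n - 1$:

$\alpha/2 = 0.005$ if we desire 99% confidence:
[Figure: t-distribution curve centred at 0, with area 0.99 between $-t_{0.005}(\nu)$ and $t_{0.005}(\nu)$ and area 0.005 in each tail]

$\alpha/2 = 0.025$ if we desire 95% confidence:
[Figure: t-distribution curve centred at 0, with area 0.95 between $-t_{0.025}(\nu)$ and $t_{0.025}(\nu)$ and area 0.025 in each tail]

$\alpha/2 = 0.05$ if we desire 90% confidence:
[Figure: t-distribution curve centred at 0, with area 0.90 between $-t_{0.05}(\nu)$ and $t_{0.05}(\nu)$ and area 0.05 in each tail]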
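As a rough illustration of the interval formula above, the following sketch computes the limits from a sample and a tabulated t-value. The function name `t_confidence_interval` is hypothetical, and the data are the ten heights used in Example-1 of this lecture; the critical value 2.262 is the table entry $t_{0.025}(9)$ quoted there, so the result is a 95% interval under those assumptions.

```python
import math
import statistics


def t_confidence_interval(sample, t_crit):
    """Return (lower, upper) for x_bar +/- t_crit * s / sqrt(n).

    t_crit is the tabulated value t_{alpha/2}(n - 1); it is passed in
    rather than computed, mirroring the t-table lookup.
    """
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation, divisor n - 1
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Heights (cm) from Example-1; 2.262 is t_{0.025}(9) from the t-table.
heights = [63, 63, 66, 67, 68, 69, 70, 70, 71, 71]
lower, upper = t_confidence_interval(heights, t_crit=2.262)
print(f"95% CI for the mean: ({lower:.2f}, {upper:.2f})")
```

Passing the critical value as an argument keeps the sketch within the standard library; a different confidence level only requires a different table entry.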
Next, we discuss hypothesis-testing regarding the mean of a normally distributed population for which $\sigma^2$ is unknown and the sample size is small (n < 30).
This procedure is illustrated through the following example:
EXAMPLE-1
Just as human height is approximately
normally distributed, we can expect the heights
of animals of any particular species to be
normally distributed.
Suppose that, for the past five years, a
zoologist has been involved in an extensive
research-project regarding the animals of one
particular species.
Based on his research-experience, the
zoologist believes that the average height of
the animals of this particular species is 66
centimeters.
He selects a random sample of ten
animals of this particular species, and, upon
measuring their heights, the following data
is obtained:
63, 63, 66, 67, 68, 69, 70, 70, 71, 71
In the light of these data, test the
hypothesis that the mean height of the
animals of this particular species is 66
centimeters.
SOLUTION

Hypothesis-Testing Procedure:

i) We state our null and alternative hypotheses as
$H_0 : \mu = 66$ and $H_1 : \mu \neq 66$.

ii) We set the significance level at $\alpha = 0.05$.

iii) Test Statistic:
The test-statistic to be used is

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

which, if $H_0$ is true, has the t-distribution with n – 1 = 9 degrees of freedom.
Important Note:
As indicated in the previous discussion, we
always begin by assuming that H0 is true.
(The entire mathematical logic of the
hypothesis-testing procedure is based on the
assumption that H0 is true.)
iv) CALCULATIONS

Individual No.    x_i     x_i^2
 1                63      3969
 2                63      3969
 3                66      4356
 4                67      4489
 5                68      4624
 6                69      4761
 7                70      4900
 8                70      4900
 9                71      5041
10                71      5041
Total            678     46050
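Now,

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{678}{10} = 67.8 \text{ cm}$$

And

$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1} = \frac{1}{n-1}\left[\sum_{i=1}^{n} X_i^2 - \frac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}\right] = \frac{1}{9}\left[46050 - 45968.4\right] = 9.0667$$

So,

$$s = \sqrt{9.0667} = 3.01 \text{ cm}$$

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{67.8 - 66}{3.01/\sqrt{10}} = 1.89$$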
v) Critical Region:
Since this is a two-tailed test, the critical region is given by
$|t| \geq t_{0.025}(9) = 2.262$.

[Figure: t-distribution with critical values –2.262 and 2.262 marked; REJECT region in each tail, ACCEPT region in between]
vi) Conclusion:
Since the computed value of t = 1.89 does not fall in the critical region, we do not reject $H_0$, and we may conclude that the mean height of the animals of this particular species is 66 centimeters.
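The decision step can be expressed as a small reusable routine. The sketch below is an illustration only: `one_sample_t_test` is a hypothetical helper, and the critical value is supplied from the t-table rather than computed.

```python
import math
import statistics


def one_sample_t_test(sample, mu_0, t_crit):
    """Two-tailed one-sample t-test.

    Returns the t statistic and True if H0 is rejected, i.e. if
    |t| >= t_crit, where t_crit = t_{alpha/2}(n - 1) from the t-table.
    """
    n = len(sample)
    t = (statistics.mean(sample) - mu_0) / (statistics.stdev(sample) / math.sqrt(n))
    return t, abs(t) >= t_crit


heights = [63, 63, 66, 67, 68, 69, 70, 70, 71, 71]
t_value, reject_h0 = one_sample_t_test(heights, mu_0=66, t_crit=2.262)
print(round(t_value, 2), "reject H0" if reject_h0 else "do not reject H0")
```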
Next, we consider the construction of the confidence interval for $\mu_1 - \mu_2$ in that situation when we are drawing small samples from two normally distributed populations having unknown but equal variances:
We illustrate this concept with the help of the following example:
EXAMPLE

A record company executive is interested in estimating the difference in the average play-length of songs pertaining to pop music and semi-classical music.
To do so, she randomly selects 10 semi-classical songs and 9 pop songs.
The play-lengths (in minutes) of the selected songs are listed in the following table:
Semi-Classical Music    Pop Music
3.80                    3.88
3.30                    4.13
3.43                    4.11
3.30                    3.98
3.03                    3.98
4.18                    3.93
3.18                    3.92
3.83                    3.98
3.22                    4.67
3.38
Calculate a 99% confidence interval
to estimate the difference in population
means for these two types of recordings.
SOLUTION
In this problem, we are dealing with a t-distribution with $n_1 + n_2 - 2$ = 10 + 9 – 2 = 17 degrees of freedom.
The table t-value for a 99% level of confidence and 17 degrees of freedom is $t_{0.005}(17) = 2.898$.
Formula

$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}(n_1 + n_2 - 2)\; s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
Calculations:

Semi-Classical Music       Pop Music
n1 = 10                    n2 = 9
$\bar{X}_1$ = 3.465        $\bar{X}_2$ = 4.064
S1 = 0.3575                S2 = 0.2417

Hence,

$$s_p = \sqrt{\frac{(0.3575)^2 (9) + (0.2417)^2 (8)}{10 + 9 - 2}} = \sqrt{\frac{1.1503 + 0.4674}{17}} = 0.31$$
The confidence interval is

$$(3.465 - 4.064) \pm 2.898\,(0.31)\sqrt{\frac{1}{10} + \frac{1}{9}} = -0.599 \pm 0.411$$

i.e. the C.I. is:

$$-1.010 < \mu_1 - \mu_2 < -0.188$$
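The pooled-variance interval can be reproduced from the raw play-lengths. The sketch below assumes the table is read column-wise (ten semi-classical values, nine pop values); `pooled_t_interval` is a hypothetical helper name, and 2.898 is the tabulated $t_{0.005}(17)$ used above.

```python
import math
import statistics


def pooled_t_interval(sample_1, sample_2, t_crit):
    """Confidence interval for mu_1 - mu_2 assuming equal variances.

    t_crit is the tabulated t_{alpha/2}(n1 + n2 - 2).
    """
    n1, n2 = len(sample_1), len(sample_2)
    s1_sq = statistics.variance(sample_1)  # divisor n1 - 1
    s2_sq = statistics.variance(sample_2)  # divisor n2 - 1
    # Pooled standard deviation s_p
    sp = math.sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2))
    diff = statistics.mean(sample_1) - statistics.mean(sample_2)
    margin = t_crit * sp * math.sqrt(1 / n1 + 1 / n2)
    return diff - margin, diff + margin


semi_classical = [3.80, 3.30, 3.43, 3.30, 3.03, 4.18, 3.18, 3.83, 3.22, 3.38]
pop = [3.88, 4.13, 4.11, 3.98, 3.98, 3.93, 3.92, 3.98, 4.67]

lower, upper = pooled_t_interval(semi_classical, pop, t_crit=2.898)
print(f"99% CI for mu_1 - mu_2: ({lower:.3f}, {upper:.3f})")
```

Because the script does not round intermediate values, the limits may differ from the hand calculation in the third decimal place.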
With 99% confidence, the record company executive can conclude that the true difference in the population average play-lengths is between –1.010 minutes and –0.188 minutes.
Zero is not in this interval, so she could conclude that there is a significant difference in the average play-length between semi-classical and pop music recordings.
Examination of the sample results indicates that pop music recordings are longer.
The result and conclusion obtained above can be used in the tactical and strategic planning for programming, marketing, and production of recordings.
Next, we discuss the application of the
t-distribution for testing the equality of two
population means:
EXAMPLE:

From an area planted in one variety of guayule (a rubber-producing plant), 54 plants were selected at random. Of these, 15 were offtypes and 12 were aberrant.
Rubber percentages for these plants were:

Offtypes: 6.21, 5.70, 6.04, 4.47, 5.22, 4.45, 4.84, 5.88, 5.82, 6.09, 6.06, 5.59, 6.74, 5.55
Aberrant: 4.28, 7.71, 6.48, 7.71, 7.37, 7.20, 7.06, 6.40, 8.93, 5.91, 5.51, 6.36
Test the hypothesis that the mean
rubber percentage of the Aberrants is at least
1 percent more than the mean rubber
percentage of Offtypes.
Assume the populations of rubber
percentages are approximately normal and
have equal variances.
Let subscript 1 stand for Aberrants, and
let subscript 2 stand for Offtypes.
Then, we proceed as follows:
i) We formulate our null and alternative hypotheses as

$H_0 : \mu_1 - \mu_2 \geq 1$,
and
$H_1 : \mu_1 - \mu_2 < 1$.

ii) We set the significance level at $\alpha = 0.05$.

iii) The test-statistic, if $H_0$ is true, is

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

which has a Student's t-distribution with $\nu = n_1 + n_2 - 2$, i.e. 25 degrees of freedom.
iv) Computations:
We have

$$\bar{x}_1 = \frac{\sum x_1}{n_1} = \frac{80.92}{12} = 6.74,$$

$$\bar{x}_2 = \frac{\sum x_2}{n_2} = \frac{84.25}{15} = 5.62,$$

And

$$\sum (x_1 - \bar{x}_1)^2 = \sum x_1^2 - \frac{\left(\sum x_1\right)^2}{n_1} = 561.6402 - \frac{(80.92)^2}{12} = 561.6402 - 545.6705 = 15.9697$$

Also

$$\sum (x_2 - \bar{x}_2)^2 = \sum x_2^2 - \frac{\left(\sum x_2\right)^2}{n_2} = 478.9779 - \frac{(84.25)^2}{15} = 478.9779 - 473.2042 = 5.7737$$

Now

$$s_p^2 = \frac{\sum (x_1 - \bar{x}_1)^2 + \sum (x_2 - \bar{x}_2)^2}{n_1 + n_2 - 2} = \frac{15.9697 + 5.7737}{12 + 15 - 2} = 0.8697,$$

so that

$$s_p = \sqrt{0.8697} = 0.93.$$
Hence, the computed value of our test statistic comes out to be

$$t = \frac{(6.74 - 5.62) - 1}{0.93\sqrt{\dfrac{1}{12} + \dfrac{1}{15}}} = \frac{0.12}{0.36} = 0.33$$
v) Critical Region:
Since this is a left-tailed test, the critical region is given by
$t < -t_{0.05}(25)$,
i.e. t < –1.708
vi) Conclusion:
Since the computed value of t = 0.33 falls in the acceptance region, we accept $H_0$.
We may conclude that the mean rubber percentage of the Aberrants is at least 1 percent more than the mean rubber percentage of Offtypes.
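The two-sample computation can be sketched directly from the totals quoted in step (iv). The helper name `pooled_t_from_sums` is hypothetical, and the totals and the critical value 1.708 are taken from the lecture. Because the script does not round the means to two decimals, it gives a value of about 0.35, slightly above the hand-calculated 0.33; the decision is the same.

```python
import math


def pooled_t_from_sums(n1, sum1, sumsq1, n2, sum2, sumsq2, delta_0):
    """Pooled two-sample t statistic for H0: mu_1 - mu_2 = delta_0,
    computed from sample sizes, sums and sums of squares."""
    mean1, mean2 = sum1 / n1, sum2 / n2
    ss1 = sumsq1 - sum1 ** 2 / n1   # sum of squared deviations, sample 1
    ss2 = sumsq2 - sum2 ** 2 / n2   # sum of squared deviations, sample 2
    sp = math.sqrt((ss1 + ss2) / (n1 + n2 - 2))
    return ((mean1 - mean2) - delta_0) / (sp * math.sqrt(1 / n1 + 1 / n2))


# Subscript 1 = Aberrants, subscript 2 = Offtypes (totals from step iv).
t = pooled_t_from_sums(12, 80.92, 561.6402, 15, 84.25, 478.9779, delta_0=1)

# Left-tailed test at alpha = 0.05 with 25 d.f.: reject H0 if t < -1.708.
reject_h0 = t < -1.708
print(round(t, 2), "reject H0" if reject_h0 else "do not reject H0")
```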
Next, we consider the application of the t-distribution in the case of paired observations:
In testing hypotheses about two means, until now we have used independent samples, but there are many situations in which the two samples are not independent. This happens when the observations are found in pairs such that the two observations of a pair are related to each other.
Pairing occurs either naturally or by design.
Natural pairing occurs whenever a measurement is taken on the same unit or individual at two different times. For example, suppose ten young recruits are given a strenuous physical training programme by the Army.
Their weights are recorded before they begin and after they complete the training. The two observations obtained for each recruit, i.e. the before-and-after measurements, constitute natural pairing.

EXAMPLE

Ten young recruits were put through a strenuous physical training programme by the Army.
Their weights were recorded before and after the training with the following results:

Recruit           1    2    3    4    5    6    7    8    9   10
Weight before   125  195  160  171  140  201  170  176  195  139
Weight after    136  201  158  184  145  195  175  190  190  145

Using $\alpha = 0.05$, would you say that the programme affects the average weight of recruits?
Assume the distribution of weights before and after to be approximately normal.
When the observations from two samples are paired, we find the difference between the two observations of each pair, and the test-statistic in this situation is:

$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \frac{\bar{d} - 0}{s_d/\sqrt{n}} = \frac{\bar{d}}{s_d/\sqrt{n}}$$
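A minimal sketch of this statistic for the recruits' data follows. It assumes the differences are taken as "after minus before" (the opposite convention only flips the sign of t) and that $\mu_d = 0$ under the null hypothesis; `paired_t_statistic` is a hypothetical helper name.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """t = d_bar / (s_d / sqrt(n)) for paired observations,
    with d = after - before and mu_d = 0 under H0."""
    d = [a - b for a, b in zip(after, before)]
    n = len(d)
    d_bar = statistics.mean(d)
    s_d = statistics.stdev(d)  # standard deviation of the differences
    return d_bar / (s_d / math.sqrt(n))


before = [125, 195, 160, 171, 140, 201, 170, 176, 195, 139]
after = [136, 201, 158, 184, 145, 195, 175, 190, 190, 145]

t = paired_t_statistic(before, after)
print(round(t, 3))
```

The resulting value would then be compared with the tabulated t-value for n – 1 = 9 degrees of freedom at the chosen significance level.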
IN TODAY’S LECTURE,
YOU LEARNT

• Tests and Confidence Intervals based on the t-distribution
IN THE NEXT LECTURE,
YOU WILL LEARN
• Hypothesis-Testing regarding Two Population Means in the Case of Paired Observations (t-distribution)
• The Chi-square Distribution
• Hypothesis Testing and Interval Estimation Regarding a Population Variance (based on Chi-square Distribution)
