CHAPTER3 Statistic
No.   English                  Bahasa Melayu
1.    Parameter                Parameter
2.    Statistic                Statistik
3.    Sampling error           Ralat pensampelan
4.
5.    Sampling distribution    Taburan pensampelan
6.
Definition 1
The sampling error of a single mean is the difference between a value (a statistic)
computed from a sample and the corresponding value (a parameter) computed from a
population.
Theory 1
Sampling error of a single mean $= \bar{x} - \mu$
where
$\bar{x}$ = sample mean
$\mu$ = population mean.
Definition 2
A parameter is a measure computed from the entire population.
Definition 3
A statistic is a measure computed from a sample that has been selected from a
population.
Example 1
The population mean is 158972 square feet, and a sample of five shopping centres
yields a sample mean of 155072 square feet. Find the sampling error.
Answer Example 1
Sampling error $= \bar{x} - \mu = 155072 - 158972 = -3900$ square feet.
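The calculation in Example 1 can be checked with a short Python sketch. The function name `sampling_error` is illustrative only and does not come from the text.

```python
def sampling_error(sample_mean, population_mean):
    """Sampling error of a single mean: statistic minus parameter."""
    return sample_mean - population_mean


# Figures from Example 1 (square feet).
error = sampling_error(155072, 158972)
print(error)  # -3900
```

A negative result means the sample mean underestimates the population mean.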
Theory 2
Formula for population mean,
$\mu = \frac{\sum x}{N}$
where
$\mu$ = population mean
$x$ = values in the population
$N$ = population size.
Theory 3
Fundamental statistical concepts are:
The size of the sampling error depends on which sample is taken.
The sampling error may be positive or negative.
There is potentially a different value for each possible sample mean.
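The three concepts in Theory 3 can be illustrated by listing every possible sample from a very small population. The sketch below uses a hypothetical population of five values (not taken from the text) and all samples of size two.

```python
from itertools import combinations
from statistics import fmean

# Hypothetical population of N = 5 values, chosen only for illustration.
population = [2, 4, 6, 8, 10]
mu = fmean(population)  # population mean (parameter)

# Every possible sample of size n = 2 and its sample mean (statistic).
sample_means = [fmean(sample) for sample in combinations(population, 2)]

# One sampling error per possible sample.
errors = [x_bar - mu for x_bar in sample_means]

print(sorted(set(sample_means)))  # several different sample means
print(min(errors), max(errors))   # errors range from negative to positive
```

In this small case the sampling errors run from -3 to 3, and the average of all possible sample means equals the population mean.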
Definition 4
A simple random sample is a sample selected in such a manner that each possible
sample of a given size has an equal chance of being selected.
Theory 4
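Formula for sample mean,
$\bar{x} = \frac{\sum x}{n}$
where
$\bar{x}$ = sample mean
$x$ = values in the sample
$n$ = sample size.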
In the inferential statistics process, a researcher selects a random sample from the
population, computes a statistic on the sample and reaches conclusions about the
population parameter from that statistic. In this chapter, we will explore the sample
mean, $\bar{x}$, as the statistic. The sample mean is one of the more common statistics
used in the inferential process. To compute and assign the probability of occurrence
of a particular value of a sample mean, the researcher must know the distribution of
the sample means. One way to examine the distribution possibilities is to take a
population with a particular distribution, randomly select samples of a given size,
compute the sample means and attempt to determine how the means are distributed.
Suppose 23 people are selected randomly from the population of women in Ayer Hitam
Pahat between the ages of 20 and 40 years, and we compute the mean height of
the sample. We would not expect our sample mean to be equal to the mean of all
women in Ayer Hitam. It might be somewhat lower or it might be somewhat higher,
but it would not equal the population mean exactly. Similarly, if we took a second
sample of 23 people from the same population, we would not expect the mean of this
second sample to equal the mean of the first sample. Inferential statistics concerns
generalizing from sample to population. A critical part of inferential statistics
involves determining how far sample statistics are likely to vary from each other and
from the population parameter.
Why do we sample the population? Why not study the whole population? The reasons are:
the physical impossibility of checking all items in the population; the cost of
studying all the items in a population; the fact that sample results are usually
adequate; the time needed to contact the whole population; and the destructive
nature of certain tests, such as a study of light bulb life.
Definition 5
If samples of size n are drawn randomly from a population that has a mean of $\mu$ and
a standard deviation of $\sigma$, the sample means, $\bar{x}$, are approximately normally
distributed for sufficiently large sample size ($n \ge 30$) regardless of the shape of the
population distribution. If the population is normally distributed, the sample means
are normally distributed for any size sample. It can be shown that the mean of the
sample means is the population mean, which is $\mu_{\bar{x}} = \mu$, and the standard
deviation of the sample means (called the standard error of the mean) is the standard
deviation of the population divided by the square root of the sample size, which is
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.
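A small simulation can make the statement in Definition 5 concrete. The sketch below is only an illustration under stated assumptions: the population is a hypothetical, strongly skewed set of 20000 exponential values, the sample size is 36, and 3000 samples are drawn. None of these figures come from the text.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical skewed population (exponential), not taken from the text.
population = [random.expovariate(1.0) for _ in range(20000)]
mu = fmean(population)
sigma = pstdev(population)

n = 36  # sample size, at least 30
sample_means = [fmean(random.sample(population, n)) for _ in range(3000)]

mean_of_means = fmean(sample_means)  # should be close to mu
sd_of_means = pstdev(sample_means)   # should be close to sigma / sqrt(n)

print(round(mu, 3), round(mean_of_means, 3))
print(round(sigma / sqrt(n), 3), round(sd_of_means, 3))
```

Even though the population is far from normal, the two printed pairs should agree closely, which is what the standard error formula predicts.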
Definition 6
A sampling distribution is a distribution of the possible values of a statistic for a
given size sample selected from a population.
Definition 7
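Sampling distribution of the mean: for random samples of n observations taken
from a population with mean $\mu$ and standard deviation $\sigma$, regardless of the
population's distribution, provided the sample size is sufficiently large, the
distribution of the sample mean, $\bar{x}$, will be normal with a mean equal to the
population mean, $\mu_{\bar{x}} = \mu$. Further, the standard deviation will equal the
population standard deviation divided by the square root of the sample size,
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$. The larger the sample size is, the better
the approximation to the normal distribution.
Z-value for sampling distribution of $\bar{x}$ is
$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
where
$\bar{x}$ = sample mean
$\mu$ = population mean
$\sigma$ = population standard deviation
$n$ = sample size.
Step 1 : $\mu_{\bar{x}} = \mu$
Step 2 : $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
Step 3 : $\bar{x} \sim N(\mu_{\bar{x}}, \sigma_{\bar{x}}^2)$
Step 4 : Find the probability of sample mean, $P(\bar{x} > r) = P\left(Z > \frac{r - \mu_{\bar{x}}}{\sigma_{\bar{x}}}\right)$, with the inequality reversed for a lower-tail probability.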
Example 2
What is the probability that a sample of 100 automobile insurance claim files will
yield an average claim of RM4527.77 or less if the average claim for the population is
RM4560 with standard deviation of RM600 ?
Answer Example 2
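Step 1 : $\mu_{\bar{x}} = \mu = 4560$
Step 2 : $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{600}{\sqrt{100}} = 60$
Step 3 : $\bar{x} \sim N(4560, 60^2)$
Step 4 : $P(\bar{x} \le 4527.77) = P\left(Z \le \frac{4527.77 - 4560}{60}\right)$
$= P(Z \le -0.54)$
$= P(Z \ge 0.54)$
$= 0.2946$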
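As a cross-check on Example 2, the four steps can be sketched in Python with the standard library's `statistics.NormalDist`. The function name `prob_sample_mean_at_most` is an assumption made for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_mean_at_most(r, mu, sigma, n):
    """P(sample mean <= r) under the normal sampling distribution."""
    standard_error = sigma / sqrt(n)   # Step 2: sigma / sqrt(n)
    z = (r - mu) / standard_error      # standardise r
    return NormalDist().cdf(z)         # Step 4: lower-tail area


# Figures from Example 2.
p = prob_sample_mean_at_most(4527.77, 4560, 600, 100)
print(round(p, 4))
```

The printed value is close to the table answer 0.2946 but not identical, because the table lookup rounds Z to -0.54 while the code keeps the unrounded Z.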
Example 3
The random variable X, which represents the number of boxes in a container, has the
following probability distribution.
X
P(x)    0.2    0.4    0.3    0.1
(a)
(b)
Find the sample mean and variance for random samples of 36 boxes.
(c)
Answer Example 3
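(a)
$E(X) = \sum x \cdot P(x) = 5.3$
$Var(X) = E(X^2) - [E(X)]^2$
$= 28.9 - (5.3)^2$
$= 0.81$
(b)
Mean of the sample mean, $\mu_{\bar{x}} = \mu = 5.3$
Variance of the sample mean, $\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} = \frac{0.81}{36} = 0.0225$
(c)
Step 1 : $\mu_{\bar{x}} = 5.3$
Step 2 : $\sigma_{\bar{x}} = \sqrt{\frac{0.81}{36}} = \sqrt{0.0225} = 0.15$
Step 3 : $\bar{x} \sim N(5.3, 0.15^2)$
Step 4 : $P(\bar{x} < 5.5) = P\left(Z < \frac{5.5 - 5.3}{0.15}\right)$
$= P(Z < 1.33)$
$= 1 - P(Z > 1.33)$
$= 1 - 0.09176$
$= 0.90824$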
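The moments used in Example 3 can be reproduced with a few lines of Python. Note the assumption: the values of X are not legible in the table above, so the sketch takes them to be 4, 5, 6 and 7, a choice made only because it reproduces the stated results E(X) = 5.3 and E(X^2) = 28.9.

```python
from math import sqrt

# ASSUMPTION: x-values 4..7 are not given in the text; they are chosen
# because they match the stated moments E(X) = 5.3 and E(X^2) = 28.9.
xs = [4, 5, 6, 7]
ps = [0.2, 0.4, 0.3, 0.1]   # probabilities from the example

mean = sum(x * p for x, p in zip(xs, ps))               # E(X)
second_moment = sum(x * x * p for x, p in zip(xs, ps))  # E(X^2)
variance = second_moment - mean ** 2                    # Var(X)

n = 36
variance_of_mean = variance / n            # sigma^2 / n
standard_error = sqrt(variance_of_mean)    # sigma / sqrt(n)

print(round(mean, 4), round(variance, 4))
print(round(variance_of_mean, 4), round(standard_error, 4))
```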
Example 4
An electrical firm manufactures light bulbs that have a length of life that is
approximately normally distributed, with mean equal to 800 hours and a standard
deviation of 40 hours. Find the probability that a random sample of 16 bulbs will have
an average life of less than 775 hours.
Answer Example 4
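Step 1 : $\mu_{\bar{x}} = 800$
Step 2 : $\sigma_{\bar{x}} = \frac{40}{\sqrt{16}} = 10$
Step 3 : $\bar{x} \sim N(800, 10^2)$
Step 4 : $P(\bar{x} < 775) = P\left(Z < \frac{775 - 800}{10}\right)$
$= P(Z < -2.5)$
$= 0.0062$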
Example 5
At a large university, the mean age of the students is 22.3 years and the standard
deviation is 4 years. A random sample of 64 students is drawn. What is the
probability that the average age of these students is greater than 23 years ?
Answer Example 5
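Step 1 : $\mu_{\bar{x}} = 22.3$
Step 2 : $\sigma_{\bar{x}} = \frac{4}{\sqrt{64}} = 0.5$
Step 3 : $\bar{x} \sim N(22.3, 0.5^2)$
Step 4 : $P(\bar{x} > 23) = P\left(Z > \frac{23 - 22.3}{0.5}\right)$
$= P(Z > 1.4)$
$= 0.0808$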
Example 6
The breaking strength (in kg/mm) for a certain type of fabric has mean 1.86 and
standard deviation 0.27. A random sample of 80 pieces of fabric is drawn. What is the
probability that the sample mean breaking strength is less than 1.8 kg/mm ?
Answer Example 6
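Given : $\mu = 1.86$, $\sigma = 0.27$ and $n = 80$
Step 1 : $\mu_{\bar{x}} = 1.86$
Step 2 : $\sigma_{\bar{x}} = \frac{0.27}{\sqrt{80}} = 0.03018$
Step 3 : $\bar{x} \sim N(1.86, 0.03018^2)$
Step 4 : $P(\bar{x} < 1.8) = P\left(Z < \frac{1.8 - 1.86}{0.03018}\right)$
$= P(Z < -1.99)$
$= 0.0233$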
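The worked answers in this chapter round Z to two decimal places before reading the statistical table. The sketch below, using the figures of Example 6, compares that table-style result with the unrounded one; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1.86, 0.27, 80   # figures from Example 6
r = 1.8

standard_error = sigma / sqrt(n)
z = (r - mu) / standard_error

z_table = round(z, 2)                # Z as it would be looked up in a table
p_table = NormalDist().cdf(z_table)  # probability using the rounded Z
p_exact = NormalDist().cdf(z)        # probability using the unrounded Z

print(z_table, round(p_table, 4), round(p_exact, 4))
```

The two probabilities differ only slightly, which is why small discrepancies between software output and table answers are to be expected.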
Example 7
Taking random samples of size n from an infinite population that has a standard
deviation of two, show that $\bar{x}$ would be a more precise estimator of $\mu$ if the
sample size were increased from four to sixteen. Interpret the result.
Answer Example 7
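The standard error of the mean is $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.
For n = 4 :
$\sigma_{\bar{x}} = \frac{2}{\sqrt{4}} = 1$
Increasing n from 4 to 16 :
$\sigma_{\bar{x}} = \frac{2}{\sqrt{16}} = 0.5$
The standard error is halved, so the sample means cluster more tightly around $\mu$ and $\bar{x}$ is a more precise estimator for the larger sample.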
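The effect of sample size on precision in Example 7 can be tabulated with a minimal sketch. The helper name `standard_error` and the extra sample size 64 are assumptions added for illustration.

```python
from math import sqrt


def standard_error(sigma, n):
    """Standard error of the mean, sigma / sqrt(n)."""
    return sigma / sqrt(n)


sigma = 2  # population standard deviation from Example 7
for n in (4, 16, 64):
    print(n, standard_error(sigma, n))
```

Each fourfold increase in n halves the standard error, since the sample size enters through its square root.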
Exercise 3.3
1.
Bags of concrete mix labeled have a population mean weight of 100 kg and a
population standard deviation of 0.5 kg.
(a)
(b)
2.
A report stated that the average time spent watching movies per week by
children aged between two and six years is 22 hours. Assume the variable
is normally distributed and the standard deviation is five hours. A sample of
33 children aged between two and six years is randomly selected. Find
the probability that the average time they watch movies per week will be
greater than 23.5 hours.
3.
For women aged 18 to 24, systolic blood pressures (in mm Hg) are
normally distributed with a mean of 114.4 and a standard deviation of 13.1.
(a)
(b)
4.
5.
If one male is randomly selected, find the probability that his head
breadth is less than 15.75 cm.
(b)
Find the probability that 100 randomly selected men have a mean head
breadth of at least 16.00 cm.
6.
(a)
of x ?
(b)
What is the probability that the average lifetime of these 400 batteries
is between 1097 and 1104 days ?
7.
8.
The amount of sulfur in the daily emissions from a power plant has a normal
distribution with a mean of 94 and a standard deviation of 22. For a random
sample of 5 days, find the probability that the average amount of sulfur
emissions will exceed 80.
9.
According to the growth chart that doctors use as a reference, the heights of
two-year-old boys are normally distributed with mean 34.5 inches and
standard deviation 1.3 inches. If six two-year-old boys are selected, what is
the probability that their average height will be between 34.1 and 35.2 inches.
10.
Casual workers in a certain industry are paid on average RM5.10 per hour
which is normally distributed with standard deviation of RM2.20. A sample of
35 casual workers from the industry was selected to be respondents for the
underpaid issue questionnaires. Find the probability that the average payment
for those casual workers is
11.
(a)
(b)
students was taken in a certain university. Find the probability that the mean
IQ of the sample is
12.
(a)
(b)
population with Poisson distribution 3.5. Find probability that the sample
mean is between 3.4 and 4.3.
13
14
Consider the PVC pipe in the previous question. How is the standard
deviation of the sample mean changed when the sample size is decreased from
64 to 9 ? Explain.
(a)
0.0793
(b)
0.0170
2.
0.0427
3.
(a)
0.1685
(b)
0.5636
4.
0.1359
5.
(a)
0.5793
(b)
0.0014
6.
(a)
1100, 4
7.
0.9936
8.
0.9222
9.
0.6799
10.
(a)
0.0078
(b)
0.7910
11.
(a)
0.0159
(b)
0.9826
(b)
0.6147
12.
0.6296
13.
0.1587
Theory 7
Statistical analyses are very often concerned with the difference between means. A
typical example is an experiment designed to compare the mean of a control group
with the mean of an experimental group. Inferential statistics used in the analysis of
this type of experiment depend on the sampling distribution of the difference between
means.
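Step 1 : $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$
Step 2 : $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
Step 3 : $\bar{x}_1 - \bar{x}_2 \sim N\left(\mu_{\bar{x}_1 - \bar{x}_2}, \sigma_{\bar{x}_1 - \bar{x}_2}^2\right)$
Step 4 : Find the probability of sample mean, $P(\bar{x}_1 - \bar{x}_2 > r) = P\left(Z > \frac{r - \mu_{\bar{x}_1 - \bar{x}_2}}{\sigma_{\bar{x}_1 - \bar{x}_2}}\right)$, with the inequality reversed for a lower-tail probability.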
Example 8
The mature citrus trees of type A have a mean height of 14.8 feet with a standard
deviation of 1.2 feet. The mature citrus trees of type B have a mean height of 12.9
feet with a standard deviation of 1.5 feet. Two samples of size 12 and 15 are
randomly selected from mature citrus tree of type A and B respectively. Find the
probability that
(a)
(b)
(c)
the mean of type A is two feet more than the mean of type B.
Answer Example 8
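                        A        B
Sample mean             14.8     12.9
Standard deviation      1.2      1.5
Sample size             12       15
(a)
Step 1 : $\mu_{\bar{x}_A} = \mu_A = 14.8$
Step 2 : $\sigma_{\bar{x}_A} = \frac{\sigma_A}{\sqrt{n}} = \frac{1.2}{\sqrt{12}} = 0.34641$
Step 4 : $P(\bar{x} > 14) = P\left(Z > \frac{14 - 14.8}{0.34641}\right)$
$= P(Z > -2.31)$
$= 1 - P(Z > 2.31)$
$= 1 - 0.01044$
$= 0.9896$
(b)
Step 1 : $\mu_{\bar{x}_B} = \mu_B = 12.9$
Step 2 : $\sigma_{\bar{x}_B} = \frac{\sigma_B}{\sqrt{n}} = \frac{1.5}{\sqrt{15}} = 0.38729$
Step 4 : $P(12 < \bar{x} < 14) = P\left(\frac{12 - 12.9}{0.38729} < Z < \frac{14 - 12.9}{0.38729}\right)$
$= P(-2.32 < Z < 2.84)$
$= 1 - P(Z > 2.32) - P(Z > 2.84)$
$= 1 - 0.01017 - 0.00226$
$= 0.9876$
(c)
Step 1 : $\mu_{\bar{x}_A - \bar{x}_B} = \mu_A - \mu_B = 14.8 - 12.9 = 1.9$
Step 2 : $\sigma_{\bar{x}_A - \bar{x}_B} = \sqrt{\frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B}} = \sqrt{\frac{1.2^2}{12} + \frac{1.5^2}{15}} = 0.51961$
Step 4 : $P(\bar{x}_A - \bar{x}_B > 2) = P\left(Z > \frac{2 - 1.9}{0.51961}\right)$
$= P(Z > 0.19)$
$= 0.4247$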
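Part (c) of Example 8 can be checked with a short Python sketch of the steps for a difference between two sample means. The function name and its parameter names are assumptions for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def prob_difference_exceeds(r, mu1, sigma1, n1, mu2, sigma2, n2):
    """P(x1_bar - x2_bar > r) for independent samples."""
    mean_diff = mu1 - mu2                                 # Step 1
    se_diff = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)   # Step 2
    z = (r - mean_diff) / se_diff                         # standardise r
    return 1 - NormalDist().cdf(z)                        # Step 4: upper tail


# Figures from Example 8(c): type A versus type B citrus trees.
p = prob_difference_exceeds(2, 14.8, 1.2, 12, 12.9, 1.5, 15)
print(round(p, 4))
```

The result is close to the table answer 0.4247; the small gap comes from rounding Z to 0.19 in the table lookup.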
Example 9
The results of Statistics Test 1 for two groups of management students, Section 1 and
Section 2, are normally distributed with $N(60, 4^2)$ and $N(64, 2^2)$ respectively. Two
samples of size 9 and 12 are randomly selected from Section 1 and Section 2
respectively. Find the probability that the mean of Section 2 is lower than the mean
of Section 1.
Answer Example 9
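Step 1 :
                     Section 1    Section 2
Sample mean          60           64
Sample variance      16           4
Sample size          9            12
Step 2 : $\mu_{\bar{x}_2 - \bar{x}_1} = \mu_2 - \mu_1 = 64 - 60 = 4$
$\sigma_{\bar{x}_2 - \bar{x}_1} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{16}{9} + \frac{4}{12}} = 1.4529$
Step 3 : $\bar{x}_2 - \bar{x}_1 \sim N(4, 1.4529^2)$
Step 4 : $P(\bar{x}_2 < \bar{x}_1) = P(\bar{x}_2 - \bar{x}_1 < 0)$
$= P\left(Z < \frac{0 - 4}{1.4529}\right)$
$= P(Z < -2.75)$
$= P(Z > 2.75)$
$= 0.0030$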
Example 10
Consider two populations of students who participate in a reading programme prior
to taking a Japanese course. The populations are those who earn an A grade and those
who earn a B grade. Let X be the number of books read by the students who
participate in the programme. Find the probability that the mean number of books
read by the students who earn an A grade is greater than that of the students who
earn a B grade, given the data below.
                             Grade A    Grade B
Sample mean                  37         25
Sample standard deviation    8.7014     8.5264
Sample size                  8          6
Answer Example 10
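Step 1 : $\mu_{\bar{x}_A - \bar{x}_B} = \mu_A - \mu_B = 37 - 25 = 12$
Step 2 : $\sigma_{\bar{x}_A - \bar{x}_B} = \sqrt{\frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B}} = \sqrt{\frac{8.7014^2}{8} + \frac{8.5264^2}{6}} = 4.6455$
Step 3 : $\bar{x}_A - \bar{x}_B \sim N(12, 4.6455^2)$
Step 4 : $P(\bar{x}_A > \bar{x}_B) = P(\bar{x}_A - \bar{x}_B > 0)$
$= P\left(Z > \frac{0 - 12}{4.6455}\right)$
$= P(Z > -2.58)$
$= 1 - P(Z > 2.58)$
$= 1 - 0.00494$
$= 0.99506$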
Example 11
The length of a computer desk is approximately normally distributed. Two
factories produce this kind of desk. The summary statistics are given below.
                      Factory A    Factory B
Sample mean           60.5         58.3
Standard deviation    3            4
Sample size           35           40
Find the probability that the sample mean length of computer desks produced by
Factory B is greater than the sample mean length of computer desks produced by
Factory A.
Answer Example 11
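Step 1 : $\mu_{\bar{x}_B - \bar{x}_A} = \mu_B - \mu_A = 58.3 - 60.5 = -2.2$
Step 2 : $\sigma_{\bar{x}_B - \bar{x}_A} = \sqrt{\frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B}} = \sqrt{\frac{3^2}{35} + \frac{4^2}{40}} = 0.81064$
Step 3 : $\bar{x}_B - \bar{x}_A \sim N(-2.2, 0.81064^2)$
Step 4 : $P(\bar{x}_B > \bar{x}_A) = P(\bar{x}_B - \bar{x}_A > 0)$
$= P\left(Z > \frac{0 - (-2.2)}{0.81064}\right)$
$= P(Z > 2.71)$
$= 0.0034$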
Exercise 3.4
1.
2.
A company manufactures two types of cables, brand A and brand B that have
mean breaking strengths of 4000 kg and 4500 kg and standard deviations of
300 kg and 200 kg, respectively. If 100 cables of brand A and 50 cables of
brand B are tested, what is the probability that the mean breaking strengths of
brand B will be at least 600 kg more than brand A ?
3.
4.
The average running time of films produced by Company A is 98.4 minutes
with a standard deviation of 7.8 minutes. Films produced by Company B have a mean
running time of 110.7 minutes with a standard deviation of 29.8 minutes. Assume the
populations are approximately normally distributed. What is the probability
that a random sample of 36 films from Company B will have a mean running
time that is at least 13 minutes more than the mean running time of a random
sample of 49 films from Company A?
5.
6.
7.
Score A
Score B
Sample mean
83
91
12
Sample size
15
14
The mean age at death in Malaysia is 55.5 years and Singapore is 57 years.
The standard deviation is approximately 4.6 years and 5 years for each
country respectively. Samples of 130 deaths from the Malaysia Hospital and
120 from Singapore Hospital were selected. Find the probability that
(a)
the mean age at death in Malaysia is greater than the mean age at
death in Singapore.
(b)
the mean age at death in Singapore is three less than the mean age at
death in Malaysia.
8.
The average life of a hand phone is 8 years for a female and 6 years for a
male, with a standard deviation of 1 and 2 years respectively. Assuming that
the lives of these hand phones follow approximately a normal distribution,
find the probability that the mean life of a random
(a)
(b)
sample of 44 females is not less than 2.5 years than the sample of 55
males hand phones.
9.
10.
11.
of size nine is selected from another normal population with a mean of 70 and
a standard deviation of twelve from sample B. Let X A and X B be the two
sample means. Find
(a)
(b)
12.
13.
14.
Company 2
Sample size = 38
Sample size = 29
15.
(b)
(c)
0.6443
2.
0.0076
3.
0.0073
4.
0.4443
5.
0.9830
6.
0.0166
7.
(a)
0.0070
(b)
0.9931
8.
(a)
0.1844
(b)
0.0526
9.
0.7486
10.
0.9452
11.
(a)
(b)
0.1769
12.
0.983
13.
0.44433
14.
0.9993
15.
(b)
0.5871
0.9719
(c)
0.3483
Theory 9
The t-distribution was introduced by W. S. Gosset (1876 - 1937). He adopted the
pen name "Student", so the distribution is known as Student's t-distribution.
It is used to establish confidence limits and to test hypotheses when the population
variance is not known and the sample size is small (less than 30). If a random sample
$x_1, x_2, \ldots, x_n$ of n values is drawn from a normal population with mean $\mu$ and
standard deviation $\sigma$, then the estimate of the variance from the sample is given by
$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
and the statistic t is defined as
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
where $\bar{x}$ = sample mean, $\mu$ = mean of the population, $n$ = sample size and
$s$ = standard deviation of the sample. The formula for s is
$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$.
Note :
t is distributed as the Student distribution with (n - 1) degrees of freedom (df).
The variable t ranges from minus infinity to plus infinity.
Like the standard normal distribution, it is symmetrical and has mean zero.
The variance $\sigma^2$ of the t-distribution is greater than 1, but approaches 1 as df increases, that is, as the sample size becomes large.
The t-distribution is lower at the mean and higher at the tails than the normal distribution.
The t-distribution has proportionally greater area in its tails than the normal distribution.
The t-distribution is similar in shape to the standard normal distribution: both are symmetric about zero, unimodal and bell-shaped.
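The t statistic defined in Theory 9 can be computed from raw data with the standard library. The sketch below is only an illustration: the five sample values and the hypothesised population mean of 10 are made up, and the function name `t_statistic` is an assumption.

```python
from math import sqrt
from statistics import fmean, stdev


def t_statistic(sample, mu):
    """Return (t, df) where t = (x_bar - mu) / (s / sqrt(n)) and df = n - 1."""
    n = len(sample)
    x_bar = fmean(sample)
    s = stdev(sample)   # sample standard deviation, divisor n - 1
    t = (x_bar - mu) / (s / sqrt(n))
    return t, n - 1


# Hypothetical small sample (n < 30) and hypothesised population mean.
t, df = t_statistic([10, 12, 9, 11, 13], 10)
print(round(t, 4), df)
```

The resulting t would then be compared with a tabulated value for the same degrees of freedom.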
Example 12
Find the value of t-distribution, if given nine degrees of freedom with alpha equal to
0.05.
Answer Example 12
Referring to the t-distribution table, if we go across the row for nine degrees of
freedom and down the column for an area of 0.05, we get the t value of 1.833. That
means, for the $t_9$ distribution, the area under the curve to the right of 1.833 is 0.05.
Example 13
Find the value of t-distribution, if given twenty degrees of freedom with alpha equal
to 0.001.
Answer Example 13
Referring to the t-distribution table, if we go across the row for twenty degrees of
freedom and down the column for an area of 0.001, we get the t value of 3.552. That
means, for the $t_{20}$ distribution, the area under the curve to the right of 3.552 is 0.001.
Example 14
By using the statistical table, find the value of $t_{\alpha, v}$.
(a)
(b)
P(T t , 24 ) 0.005
Answer Example 14
(a)
than 0. But the difference may also be due to sample fluctuation and thus the value of
Example 15
Find the value of the $\chi^2$-distribution, given seventeen degrees of freedom with alpha
equal to 0.95.
Answer Example 15
Referring to the $\chi^2$-distribution table, if we go across the row for seventeen degrees of
freedom and down the column for an area of 0.95, we get the $\chi^2$ value of 8.672. That
means, for the $\chi^2_{17}$ distribution, the area under the curve to the right of 8.672 is 0.95.
Example 16
Find the value of the $\chi^2$-distribution, given twelve degrees of freedom with alpha
equal to 0.02.
Answer Example 16
Referring to the $\chi^2$-distribution table, if we go across the row for twelve degrees of
freedom and down the column for an area of 0.02, we get the $\chi^2$ value of 24.054.
That means, for the $\chi^2_{12}$ distribution, the area under the curve to the right of
24.054 is 0.02.
Theory 13
In probability theory and statistics, the F-distribution is a continuous probability
distribution. It is also known as Snedecor's F distribution or the Fisher-Snedecor
distribution (after R.A. Fisher and George W. Snedecor). The F-distribution becomes
relevant when we calculate the ratio of variances of normally distributed
statistics. Suppose we have two samples with $n_1$ and $n_2$ observations drawn from
normal populations with equal variances. The ratio
$F = \frac{s_1^2}{s_2^2}$
follows an F-distribution with $v_1 = n_1 - 1$ and $v_2 = n_2 - 1$ degrees of freedom.
For each pair of values of $v_1$ and $v_2$, $F_{\alpha, v_1, v_2}$ is tabulated for
$\alpha = 0.05, 0.025, 0.01, 0.001$, the 0.025 values being bracketed. The lower
percentage points of the distribution may be obtained from the relation
$F_{1 - \alpha, v_1, v_2} = \frac{1}{F_{\alpha, v_2, v_1}}$.
For example, $F_{0.95, 12, 8} = \frac{1}{F_{0.05, 8, 12}} = 0.351$.
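The variance ratio and the reciprocal relation for lower percentage points can be sketched in Python. The two small samples are hypothetical, the function names are assumptions, and the figure 2.85 is the tabulated upper 5% point for (8, 12) degrees of freedom that is consistent with the value 0.351 quoted in the text.

```python
from statistics import variance


def f_ratio(sample1, sample2):
    """Return (F, v1, v2) with F = s1^2 / s2^2 and v = n - 1 for each sample."""
    f = variance(sample1) / variance(sample2)  # sample variances, divisor n - 1
    return f, len(sample1) - 1, len(sample2) - 1


def lower_percentage_point(upper_point_swapped_df):
    """F(1 - alpha; v1, v2) = 1 / F(alpha; v2, v1)."""
    return 1 / upper_point_swapped_df


# Hypothetical samples, for illustration only.
f, v1, v2 = f_ratio([2, 4, 6, 8], [1, 2, 3, 4])
print(round(f, 4), v1, v2)

# Tabulated upper 5% point for (8, 12) degrees of freedom taken as 2.85.
print(round(lower_percentage_point(2.85), 3))  # 0.351
```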
Example 17
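If $s_1^2$ and $s_2^2$ are the variances of independent random samples of size $n_1 = 25$ and
$n_2 = 13$ taken from normal populations with equal variances, find
$P\left(\frac{s_1^2}{s_2^2} < 6.25\right)$.
Answer Example 17
The variances of the normal populations are equal for the two independent random samples,
$\sigma^2 = \sigma_1^2 = \sigma_2^2$.
$n_1 = 25$, $v_1 = n_1 - 1 = 25 - 1 = 24$
$n_2 = 13$, $v_2 = n_2 - 1 = 13 - 1 = 12$
$P\left(\frac{s_1^2}{s_2^2} < 6.25\right) = P(F < 6.25)$
$= 1 - P(F > 6.25)$
$= 1 - 0.001$
$= 0.999$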
Exercise 3.4
By using the statistical table, find the probability
1.
P(T 2.898), v 17
2.
P(T 1.415), v 7
3.
P(T 2.042), v 30
5.
6.
In each of the following parts, find $\chi^2_{0.95, v}$. Assume a chi-square distribution with
7.
14 degrees of freedom
8.
29 degrees of freedom
Assume a chi-square distribution with 17 degrees of freedom. Fill in the blanks.
9.
$P(\chi^2 >$ __________$) = 0.05$
10.
$P(\chi^2 <$ __________$) = 0.005$
11.
Assume
distribution
with
degrees
of
freedom,
find
P(2.167 2 6.346 ) .
12.
13.
S12
P 2 5.06 .
S2
14.
s1 2
P 2 4.50 .
s2
15.
s2
P 1 2 4.28 .
s2
0.005
2.
0.90
3.
0.975
4.
2.779
5.
-2.831
6.
4.140
7.
6.571
8.
17.708
9.
27.587
10.
5.697
11.
0.45
12.
0.7
13.
0.01
14.
0.99
15.
0.95
EXERCISE CHAPTER 3
1.
A simple random sample of 100 men is chosen from a population with mean
height 70 inches and standard deviation 2.5 inches. What is the probability that the
average height of the sampled men is greater than 69.5 inches?
2.
A group of ball bearings have a mean weight of 5.02 grams and a standard
deviation of 0.30 grams. A random sample of 100 ball bearings chosen from
this group, find
(a)
(b)
3.
(b)
population.
4.
The mean height of 250 UPM staff members is 158 m and the standard deviation is
5 m. Find the mean and standard deviation of the sampling distribution of the
mean height for a sample of 38 staff members.
5.
A chemical engineer calculates that the population mean yield of a batch is 518
grams per milliliter with a standard deviation of 40 grams. Assume the
distribution of yield to be approximately normal. What is the probability that, in a
certain month, he gets a mean yield of less than 515 grams for 36 batches?
6.
(b)
(c)
7.
Two independent experiments are being run in which two different types of
paints are compared. Twenty specimens are painted using type A and the
drying time in hours is recorded on each. The same is done with type B.
Assume that the mean drying times of the two populations are normal,
X A ~ N ( , 1) and X B ~ N ( , 1) for the two types of paints.
__
(a)
(b)
__
__
Calculate P X A X B 0.3 .
8.
(b)
9.
The population of the usage per sheet of paper for old and new certain
products are distributed N 1 ( 2000, 60) and N 2 ( 2500 , 40) respectively. Two
random samples are taken from each population of size n1 and n 2 .
(a)
(b)
(c)
10.
Line Clear Manufacturing Sdn. Bhd. manufactures two types of cables, A and
B, that have mean breaking strengths of 2500 lb and 2400 lb with
standard deviations of 150 lb and 100 lb respectively. If 50 cables of brand A and
25 cables of brand B are tested, what is the probability that the mean breaking
strength of A will be
11.
(a)
(b)
A light bulbs manufacturer claims that the lifetime of its light bulbs has a
12.
13.
Design B
24.2
23.9
Variance
10
20
Sample size
15
10
14.
(a)
(b)
(c)
Students may choose between a 3 semester course in physics without labs and
4 semester course with labs. The final written examination is the same for
each section. The section with labs made an average examination grade of 84
with a standard deviation of 4 and the section without labs made an average
grade of 77 with a standard deviation of 6. Assume the populations are
approximately normally distributed. Find the probability that the sample mean
for a random sample of scores of 12 students with labs exceeds the sample
mean for a random sample of scores of 18 students without labs by at most 5.
15.
Casual workers in a certain industry are paid on average RM5.10 per hour
which is normally distributed with standard deviation of RM2.20. A sample of
35 casual workers from the industry was selected to be respondents for the
underpaid issue questionnaires. Find the probability that the average payment
for those casual workers is
16.
(a)
(b)
The mean age at death in Malaysia is 55.5 years and Singapore is 57 years.
The standard deviation is approximately 4.6 years and 5 years for each
country respectively. Samples of 130 deaths from the Malaysia Hospital and
120 deaths from Singapore Hospital were selected. Find the probability that
(a)
the mean age at death in Malaysia is greater than the mean age at death
in Singapore.
(b)
the mean age at death in Singapore is three less than the mean age at
death in Malaysia.
17.
18.
(a)
the mean IQ of the sample is greater than 105 and less than 107.
(b)
The average life of a hand phone is 8 years for a female and 6 years for male,
with a standard deviation of 1 and 2 years respectively. Assuming that the
lives of these hand phones follow approximately a normal distribution, find
(a)
the probability that the mean life of a random male hand phone falls
between 6.6 and 7.7 years.
(b)
19.
20.
0.97725
2.
(a)
0.22867
(b)
0.00379
3.
(a)
7.1, 0.2392
(b)
7.1, 0.87368
4.
158m, 0.8111
5.
0.32636
6.
(a)
0.0791
(b)
0.1038
7.
(b)
0.00135
8.
(a)
0.2937
(b)
49
9.
(b)
0.3936
(c)
0.82381
10.
(a)
0.0432
(b)
0.3658
11.
0.00914
12.
0.45620
13.
(a)
(b)
0.84134
14.
0.13567
15.
(a)
0.00776
(b)
0.791
16.
(a)
0.00695
(b)
0.99305
17.
(a)
0.01593
(b)
0.98257
0.00621
(c)
0.8133
(c)
0.34134
18.
(a)
19.
0.7486
0.18443
(b)
0.0526
20.
0.9452
SUMMARY CHAPTER 3
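Population mean, $\mu = \frac{\sum x}{N}$.
Sample mean, $\bar{x} = \frac{\sum x}{n}$.
Z-value for sampling distribution of $\bar{x}$ is $Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$.
Sampling distribution of a single mean:
Step 1 : $\mu_{\bar{x}} = \mu$
Step 2 : $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
Step 3 : $\bar{x} \sim N(\mu_{\bar{x}}, \sigma_{\bar{x}}^2)$
Step 4 : Probability of sample mean, $P(\bar{x} > r) = P\left(Z > \frac{r - \mu_{\bar{x}}}{\sigma_{\bar{x}}}\right)$.
Sampling distribution of the difference between two means:
Step 1 : $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$
Step 2 : $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
Step 3 : $\bar{x}_1 - \bar{x}_2 \sim N\left(\mu_{\bar{x}_1 - \bar{x}_2}, \sigma_{\bar{x}_1 - \bar{x}_2}^2\right)$
Step 4 : Probability of sample mean, $P(\bar{x}_1 - \bar{x}_2 > r) = P\left(Z > \frac{r - \mu_{\bar{x}_1 - \bar{x}_2}}{\sigma_{\bar{x}_1 - \bar{x}_2}}\right)$.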