Sampling Distributions
5.1 Introduction
A Parameter is a number that describes some characteristic of a population.
A Statistic is a number that describes some characteristic of a sample.
A Population consists of all observations of concern.
A Sample is a subset of the population.
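A statistic is used to estimate a population parameter. The value of a statistic varies in repeated
random samples. Examples of parameters and statistics include:

Quantity             Population                                              Sample
Mean                 $\mu$                                                   $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
Variance             $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$     $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{n\sum_{i=1}^{n} x_i^2 - (\sum_{i=1}^{n} x_i)^2}{n(n-1)}$
Standard deviation   $\sigma = \sqrt{\sigma^2}$                              $s = \sqrt{s^2}$
Proportion           $p$                                                     $\hat{p}$

Here N is the population size and n is the sample size.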
The probability distribution of a statistic is called a Sampling distribution.
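A statistic is said to be an unbiased estimator of a parameter if the mean of its sampling distribution
is equal to the parameter. For example, consider a random sample of size n taken from a $N(\mu, \sigma^2)$
population. Then
\[
\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}
\]
\[
\begin{aligned}
E[\bar{x}] &= E\left[\frac{x_1 + x_2 + \dots + x_n}{n}\right] \\
           &= \frac{E[x_1] + E[x_2] + \dots + E[x_n]}{n} \\
           &= \frac{n\mu}{n} \\
           &= \mu
\end{aligned}
\]
so the sample mean $\bar{x}$ is an unbiased estimator of $\mu$.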
Exercise 5.1.1
Show that the sample variance $s^2$ is an unbiased estimator of the population variance $\sigma^2$.
1. The distribution of the means will be approximately normal for large sample sizes.
2. The mean of the distribution of means will be equal to the population mean.
3. The variance of the distribution of means will be equal to $\sigma^2/n$.
The above hold for large n; a common rule of thumb is n > 30. NOTE: If the original population is
normally distributed, then the sample mean is normally distributed for any sample size n.
Therefore, from the central limit theorem, for large n the sampling distribution of the sample mean is
approximately normal, i.e.
\[
\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)
\]
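The three properties above can be checked by simulation. The sketch below is only an illustration: the exponential population (which has mean 1 and variance 1 when the rate is 1), the sample size n = 40, the number of repetitions and the seed are all assumptions chosen for the demonstration and do not come from the notes.

```python
import random
import statistics


def simulate_sample_means(n, reps, rate=1.0, seed=0):
    """Draw `reps` samples of size `n` from an exponential population
    and return the list of their sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(rate) for _ in range(n))
        for _ in range(reps)
    ]


# Exponential(rate=1) population: mu = 1, sigma^2 = 1, strongly skewed.
n, reps = 40, 5000
means = simulate_sample_means(n, reps)

mean_of_means = statistics.fmean(means)
var_of_means = statistics.pvariance(means)

print(f"mean of sample means     = {mean_of_means:.4f} (theory: 1.0)")
print(f"variance of sample means = {var_of_means:.4f} (theory: {1 / n:.4f})")
```

Even though the population is far from normal, the simulated means should centre on µ with variance close to σ²/n; plotting a histogram of `means` would show the roughly bell-shaped form.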
Example 5.2.1
Given a population {2, 6, 8, 10, 10, 12}:
i Find the population mean and variance.
ii List all 36 possible simple random samples of size n = 2 (assume you are sampling with
replacement to maintain independence). Find $\bar{x}$ for each sample.
iii Obtain the sampling distribution of $\bar{x}$. Make a graph of the sampling distribution, obtain the
mean and variance of the sampling distribution, and finally compare the mean and variance of
the sampling distribution with the population mean and variance obtained in part i.
Solution:
The population mean and variance are
\[
\mu = \frac{48}{6} = 8
\]
\[
\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}
         = \frac{(-6)^2 + (-2)^2 + (0)^2 + (2)^2 + (2)^2 + (4)^2}{6}
         = \frac{64}{6}
         = 10.6667
\]
\[
\sigma = 3.266
\]
The probability distribution of X is

x           2     6     8     10    12
P(X = x)    1/6   1/6   1/6   2/6   1/6
and the histogram of the distribution
[Figure: histogram of the population distribution of X]
The 36 possible samples of size 2 are

        2       6       8       10      10      12
2       2,2     2,6     2,8     2,10    2,10    2,12
6       6,2     6,6     6,8     6,10    6,10    6,12
8       8,2     8,6     8,8     8,10    8,10    8,12
10      10,2    10,6    10,8    10,10   10,10   10,12
10      10,2    10,6    10,8    10,10   10,10   10,12
12      12,2    12,6    12,8    12,10   12,10   12,12
and their means
        2     6     8     10    10    12
2       2     4     5     6     6     7
6       4     6     7     8     8     9
8       5     7     8     9     9     10
10      6     8     9     10    10    11
10      6     8     9     10    10    11
12      7     9     10    11    11    12
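The mean and variance of the sample means are given by
\[
\mu_{\bar{x}} = \frac{288}{36} = 8
\]
\[
\sigma_{\bar{x}}^2 = \frac{\sum (\bar{X}_i - \mu_{\bar{x}})^2}{N}
                   = \frac{192}{36}
                   = 5.3333
\]
\[
\sigma_{\bar{x}} = 2.3094
\]
where N = 36 is the number of possible samples.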
The sampling distribution of $\bar{x}$ is

x̄                 2      4      5      6      7      8      9      10     11     12
P(X̄ = x̄)         1/36   2/36   2/36   5/36   4/36   5/36   6/36   6/36   4/36   1/36
and the histogram of the distribution
[Figure: histogram of the sampling distribution of the sample mean]
From the above, $\mu_{\bar{x}} = \mu_x = 8$ and $\sigma_{\bar{x}}^2 = \frac{\sigma_x^2}{n} = \frac{10.6667}{2} = 5.3333$.
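The enumeration in Example 5.2.1 is small enough to reproduce exactly. The following sketch lists all 36 ordered samples drawn with replacement and uses exact fractions to confirm the mean and variance of the sample means. The helper names `mean` and `pvariance` are illustrative and not part of the notes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [2, 6, 8, 10, 10, 12]
n = 2


def mean(values):
    """Exact arithmetic mean as a Fraction."""
    return sum(values, Fraction(0)) / len(values)


def pvariance(values):
    """Exact population variance (divide by the number of values)."""
    m = mean(values)
    return sum(((v - m) ** 2 for v in values), Fraction(0)) / len(values)


# All ordered samples of size n, drawn with replacement: 6 * 6 = 36 of them.
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

# Sampling distribution of the sample mean: value -> probability.
counts = Counter(sample_means)
distribution = {x: Fraction(c, len(samples)) for x, c in sorted(counts.items())}

print("population mean, variance :", mean(population), pvariance(population))
print("mean of sample means      :", mean(sample_means))
print("variance of sample means  :", pvariance(sample_means))
for x, prob in distribution.items():
    print(f"P(xbar = {x}) = {prob.numerator * (36 // prob.denominator)}/36")
```

Because the arithmetic is exact, the variance of the sample means comes out as exactly the population variance divided by n = 2.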
Example 5.2.2
The weights of a population of workers have µ = 167 and σ = 27. A sample of 36 workers is chosen.
Approximate the probability that the sample mean of their weights lies between 163 and 170.
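Solution:
The sample mean is approximately $\bar{X} \sim N(\mu, \frac{\sigma^2}{n})$, therefore
\[
Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{\sqrt{n}(\bar{X} - \mu)}{\sigma}
\]
Hence, with $\sigma / \sqrt{n} = 27/6 = 4.5$,
\[
P(163 < \bar{X} < 170) = P\left(\frac{163 - 167}{4.5} < Z < \frac{170 - 167}{4.5}\right)
                       = P(-0.89 < Z < 0.67)
                       \approx 0.56
\]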
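The probability asked for in Example 5.2.2 can also be evaluated numerically instead of from normal tables. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; it relies on the normal approximation from the central limit theorem, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 167, 27, 36

# Approximate sampling distribution of the sample mean: N(mu, sigma^2 / n).
xbar_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))

prob = xbar_dist.cdf(170) - xbar_dist.cdf(163)
print(f"standard error        = {sigma / sqrt(n):.2f}")
print(f"P(163 < Xbar < 170)   = {prob:.4f}")
```

The result agrees with the table-based answer to two decimal places; small differences arise only from rounding the z-values.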
Exercise 5.2
1. The amount of time that a drive-through bank teller spends on a customer is a random variable
with a mean µ = 3.2 minutes and a standard deviation σ = 1.6 minutes. If a random sample of
64 customers is observed, find the probability that their mean time at the teller’s counter is (a)
at most 2.7 minutes; (b) more than 3.5 minutes; (c) at least 3.2 minutes but less than 3.4 minutes.
2. The average life of a bread-making machine is 7 years, with a standard deviation of 1 year.
Assuming that the lives of these machines follow approximately a normal distribution, find
(a) the probability that the mean life of a random sample of 9 such machines falls between 6.4
and 7.2 years;
(b) the value of $\bar{x}$ to the right of which 15% of the means computed from random samples of
size 9 would fall.
3. A certain type of thread is manufactured with a mean tensile strength of 78.3 kilograms and a
standard deviation of 5.6 kilograms. How is the variance of the sample mean changed when the
sample size is
(a) increased from 64 to 196?
(b) decreased from 784 to 49?
Example 5.3.1
Two independent experiments are being run in which two different types of paints are compared.
Eighteen specimens are painted using type A and the drying time, in hours, is recorded on each. The
same is done with type B. The population standard deviations are both known to be 1.0. Assuming
that the mean drying time is equal for the two types of paint, find $P(\bar{X}_A - \bar{X}_B > 1.0)$, where
$\bar{X}_A$ and $\bar{X}_B$ are average drying times for samples of size $n_1 = n_2 = 18$.
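Solution
\[
\bar{X}_A - \bar{X}_B \sim N\left(\mu_1 - \mu_2 = 0,\ \frac{\sigma_A^2}{n_1} + \frac{\sigma_B^2}{n_2} = \frac{1}{18} + \frac{1}{18} = \frac{1}{9}\right)
\]
Hence
\[
P(\bar{X}_A - \bar{X}_B > 1.0) = P\left(Z > \frac{1.0 - 0}{\sqrt{\frac{1}{9}}}\right)
                               = P(Z > 3.0)
                               = 0.0013
\]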
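The calculation in Example 5.3.1 can be sketched numerically as follows. The code assumes, as the example does, that the two samples are independent and that the difference of sample means is normally distributed; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sigma_a = sigma_b = 1.0
n1 = n2 = 18
mean_diff = 0.0  # the two paints are assumed to have equal mean drying time

# Variance of a difference of independent sample means adds: sA^2/n1 + sB^2/n2.
se_diff = sqrt(sigma_a**2 / n1 + sigma_b**2 / n2)
diff_dist = NormalDist(mu=mean_diff, sigma=se_diff)

prob = 1 - diff_dist.cdf(1.0)
print(f"standard error of the difference = {se_diff:.4f}")
print(f"P(XbarA - XbarB > 1.0)           = {prob:.4f}")
```

A probability this small suggests that an observed difference of more than one hour would be strong evidence against the assumption of equal mean drying times.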
Exercise 5.3
1. Given the following data
Population 1 Population 2
µ1 = 6.5 µ2 = 6.0
σ1 = 0.9 σ2 = 0.8
n1 = 36 n2 = 49
Find $P(\bar{X}_1 - \bar{X}_2 \geq 1.0)$.
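To show this, write
\[
\begin{aligned}
\sum_{i=1}^{n} (x_i - \mu)^2 &= \sum_{i=1}^{n} (x_i - \bar{x} + \bar{x} - \mu)^2 \\
  &= \sum_{i=1}^{n} [(x_i - \bar{x}) + (\bar{x} - \mu)]^2 \\
  &= \sum_{i=1}^{n} (x_i - \bar{x})^2 + \sum_{i=1}^{n} (\bar{x} - \mu)^2 + 2(\bar{x} - \mu) \sum_{i=1}^{n} (x_i - \bar{x}) \\
  &= \sum_{i=1}^{n} (x_i - \bar{x})^2 + n(\bar{x} - \mu)^2 + 2(\bar{x} - \mu)\left[\sum_{i=1}^{n} x_i - n\bar{x}\right] \\
  &= \sum_{i=1}^{n} (x_i - \bar{x})^2 + n(\bar{x} - \mu)^2 + 2(\bar{x} - \mu)[n\bar{x} - n\bar{x}] \\
\sum_{i=1}^{n} (x_i - \mu)^2 &= \sum_{i=1}^{n} (x_i - \bar{x})^2 + n(\bar{x} - \mu)^2
\end{aligned}
\]
Dividing each term by $\sigma^2$,
\[
\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{\sigma^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{\sigma^2} + \frac{n(\bar{x} - \mu)^2}{\sigma^2}
\]
\[
\sum_{i=1}^{n} Z_i^2 = \frac{(n - 1)s^2}{\sigma^2} + Z^2
\]
where $Z_i = \frac{x_i - \mu}{\sigma}$ and $Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$.
But $\sum_{i=1}^{n} Z_i^2 \sim \chi^2_{(n)}$
and $Z^2 \sim \chi^2_{(1)}$.
Therefore
\[
\chi^2_{(n)} = \frac{(n - 1)s^2}{\sigma^2} + \chi^2_{(1)}
\]
and, since $\bar{x}$ and $s^2$ are independent when sampling from a normal population,
\[
\frac{(n - 1)s^2}{\sigma^2} \sim \chi^2_{(n-1)}
\]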
Thus, for a sample from a normal population, the sampling distribution of the sample variance depends
on the population variance, and the scaled statistic $(n - 1)s^2/\sigma^2$ has a chi-square distribution
with n − 1 degrees of freedom.
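A quick simulation can make this result plausible. A chi-square variable with k degrees of freedom has mean k and variance 2k, so the scaled statistic (n − 1)s²/σ² should have mean n − 1 and variance 2(n − 1). In the sketch below the values n = 7, σ = 3, the number of repetitions and the seed are illustrative assumptions.

```python
import random
import statistics


def scaled_variances(n, sigma, reps, mu=0.0, seed=1):
    """Simulate (n - 1) * s^2 / sigma^2 for `reps` normal samples of size n."""
    rng = random.Random(seed)
    out = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        s_sq = statistics.variance(sample)  # sample variance, divisor n - 1
        out.append((n - 1) * s_sq / sigma**2)
    return out


n, sigma = 7, 3.0
values = scaled_variances(n, sigma, reps=10000)

sim_mean = statistics.fmean(values)
sim_var = statistics.pvariance(values)

print(f"simulated mean     = {sim_mean:.3f} (chi-square theory: {n - 1})")
print(f"simulated variance = {sim_var:.3f} (chi-square theory: {2 * (n - 1)})")
```

Matching the first two moments does not prove the distributional claim, but a histogram of `values` would also show the right-skewed chi-square shape.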
Example 5.4.1
The time it takes a central processing unit to process a certain type of job is normally distributed with
mean 20 seconds and standard deviation 3 seconds. If a sample of 7 such jobs is observed, what is
the probability that the sample variance will exceed 12?
Solution
We need to find $P(S^2 > 12)$ given that n = 7 and σ = 3, thus
\[
P(S^2 > 12) = P\left(\frac{(n - 1)S^2}{\sigma^2} > \frac{6 \times 12}{3^2}\right)
            = P(\chi^2_{(6)} > 8)
            \approx 0.238
\]
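The chi-square tail probability for Example 5.4.1 can be computed without tables, taking n = 7 as in the problem statement so that there are 6 degrees of freedom. The sketch relies on a standard closed form that holds only for an even number of degrees of freedom, P(χ²(2m) > x) = e^(−x/2) Σ_{k=0}^{m−1} (x/2)^k / k!. The function name `chi2_sf_even_df` is hypothetical, not a library routine.

```python
from math import exp, factorial


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(chi-square with `df` d.f. > x).

    Valid only for even, positive `df`, using the closed-form
    finite sum exp(-x/2) * sum_{k < df/2} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires an even, positive df")
    half = x / 2
    return exp(-half) * sum(half**k / factorial(k) for k in range(df // 2))


n, sigma, threshold = 7, 3.0, 12.0

# Scale the threshold on S^2 to the chi-square statistic (n - 1) S^2 / sigma^2.
stat = (n - 1) * threshold / sigma**2
prob = chi2_sf_even_df(stat, n - 1)

print(f"chi-square statistic = {stat}")
print(f"P(S^2 > {threshold}) = {prob:.4f}")
```

For odd degrees of freedom this shortcut does not apply, and a chi-square table or a statistics library would be needed instead.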
5.5 Sampling distribution of the ratio of variances
Given that $S_1^2$ and $S_2^2$ are the sample variances of independent random samples of size $n_1$ and $n_2$ taken
from normal populations with variances $\sigma_1^2$ and $\sigma_2^2$ respectively, then
\[
F = \frac{S_1^2 / \sigma_1^2}{S_2^2 / \sigma_2^2} = \frac{S_1^2 \sigma_2^2}{S_2^2 \sigma_1^2} \sim F_{n_1 - 1,\, n_2 - 1}
\]
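As a rough check of this result, the ratio can be simulated and its average compared with the known mean of an F distribution, d₂/(d₂ − 2) for denominator degrees of freedom d₂ > 2. Everything numeric in this sketch (the sample sizes 11 and 21, the standard deviations 2 and 5, the repetitions and the seed) is an illustrative assumption.

```python
import random
import statistics


def variance_ratios(n1, n2, sigma1, sigma2, reps, seed=2):
    """Simulate F = (S1^2 / sigma1^2) / (S2^2 / sigma2^2) for normal samples."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(reps):
        s1_sq = statistics.variance([rng.gauss(0.0, sigma1) for _ in range(n1)])
        s2_sq = statistics.variance([rng.gauss(0.0, sigma2) for _ in range(n2)])
        ratios.append((s1_sq / sigma1**2) / (s2_sq / sigma2**2))
    return ratios


n1, n2 = 11, 21
ratios = variance_ratios(n1, n2, sigma1=2.0, sigma2=5.0, reps=5000)

d2 = n2 - 1  # denominator degrees of freedom
sim_mean = statistics.fmean(ratios)
theory_mean = d2 / (d2 - 2)

print(f"simulated mean of F = {sim_mean:.3f}")
print(f"theoretical mean    = {theory_mean:.3f}")
```

Note that the population variances differ, yet dividing each sample variance by its own σ² removes that difference from the ratio.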
NOTE: When sampling from a finite population without replacement, the observations are not
independent. If the sample is too large a fraction of the population, the usual standard deviation
of the sample proportion, $\sigma_{\hat{p}}$, will be inaccurate, and $\sigma_{\hat{p}}$ is instead calculated using the finite
population correction (FPC). The correction is not needed when the population size is large in
relation to the sample size.
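The effect of the finite population correction can be sketched as follows. The code assumes the standard formulas, which are not written out in these notes: the uncorrected standard deviation √(p(1 − p)/n) and the correction factor √((N − n)/(N − 1)). The function name and the numbers p = 0.3, n = 100 and the two population sizes are hypothetical.

```python
from math import sqrt


def sd_sample_proportion(p, n, population_size=None):
    """Standard deviation of the sample proportion.

    If `population_size` is given, apply the finite population
    correction sqrt((N - n) / (N - 1)); otherwise assume the
    population is effectively infinite.
    """
    sd = sqrt(p * (1 - p) / n)
    if population_size is not None:
        if n > population_size:
            raise ValueError("sample size cannot exceed population size")
        sd *= sqrt((population_size - n) / (population_size - 1))
    return sd


p, n = 0.3, 100  # hypothetical values for illustration

uncorrected = sd_sample_proportion(p, n)
small_pop = sd_sample_proportion(p, n, population_size=400)
large_pop = sd_sample_proportion(p, n, population_size=1_000_000)

print(f"no correction        : {uncorrected:.5f}")
print(f"FPC with N = 400     : {small_pop:.5f}")
print(f"FPC with N = 1000000 : {large_pop:.5f}")
```

When the sample is a quarter of the population the correction shrinks the standard deviation noticeably, while for a very large population the corrected and uncorrected values are practically identical.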
Example 5.6.1
Suppose that 45 percent of the population favors a certain candidate in an upcoming election. If a
random sample of size 200 is chosen, find the expected value and standard deviation of the number
of members of the sample that favor the candidate.
Solution
The population proportion is p = 0.45, the sample size is n = 200, and X is the number of members of
the sample that favor the candidate, hence
\[
E(X) = np = 200 \times 0.45 = 90
\]
\[
\mathrm{stddev}(X) = \sqrt{np(1 - p)} = \sqrt{90(0.55)} = 7.0356
\]
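The formulas used in Example 5.6.1 can be cross-checked against the binomial distribution itself. This sketch assumes X is binomial with parameters n and p, which is what the formulas np and √(np(1 − p)) presuppose. It computes the mean and standard deviation directly from the probability mass function and compares them with the shortcut formulas.

```python
from math import comb, sqrt

n, p = 200, 0.45

# Binomial probability mass function P(X = k) for k = 0, ..., n.
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

# Mean and variance computed directly from the distribution.
mean_from_pmf = sum(k * pk for k, pk in enumerate(pmf))
var_from_pmf = sum((k - mean_from_pmf) ** 2 * pk for k, pk in enumerate(pmf))

# Shortcut formulas used in the example.
mean_formula = n * p
sd_formula = sqrt(n * p * (1 - p))

print(f"mean: from pmf = {mean_from_pmf:.4f}, formula = {mean_formula:.4f}")
print(f"sd  : from pmf = {sqrt(var_from_pmf):.4f}, formula = {sd_formula:.4f}")
```

Dividing both results by n gives the corresponding mean p and standard deviation √(p(1 − p)/n) of the sample proportion.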
Exercise 5.6
1. A local college has 500 students and 54 of them are left-handed. You conduct a survey of 50
students and find that 6 of them are left-handed.
(a) What is the population proportion of left-handed students?
(b) What is the sample proportion of left-handed students?
3. The following table gives the percentages of individuals, categorized by gender, that follow
certain negative health practices.