Confidence Intervals for the Sample Mean with Known σ
Examples for Ch 8
Example 2: A personnel department analyst wishes to estimate the mean number of training
hours needed annually for supervisors in a division of the company within the margin of 3
hours (that is, plus or minus 3 hours) with a 95% confidence level. Based on a large data set from
other similar companies the analyst estimates the standard deviation of required training hours
to be equal to 20 hours. Find the minimum sample size which will give the required estimate
with specified margin of error and level of confidence.
Answer: Here σ = 20 hours, Z_{α/2} = Z_{0.025} = 1.96 (for a 95% confidence level), and the margin of
error E = 3 hours. Therefore, n = (Z_{α/2} × σ / E)^2 = (1.96 × 20 / 3)^2 = 170.7, or 171 observations (always rounded up).
The required sample size increases as the tolerable margin of error is reduced. Find the required
sample size for a margin of error of only 1 hour (I bet it will be about 9 times the sample size we just
obtained, because n is proportional to 1/E^2). Similarly, we can find the required sample size for other
confidence levels, such as 90% (with Z_{α/2} = Z_{0.05} = 1.645) and 99% (with Z_{α/2} = Z_{0.005} = 2.576).
Clearly the sample size increases as the desired confidence level increases, and conversely. Similarly, you see that the
required sample size increases as the standard deviation of the population increases. This makes
sense, because you need a larger sample size to have the same level of confidence if the parent
population involves larger variability, other things remaining the same.
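The sample-size formula above is easy to check with a few lines of Python. This is only a minimal sketch: the function names `z_critical` and `required_sample_size` are made up for illustration, and the Z value comes from the standard library's `NormalDist` rather than from the Z-table, so it carries a few more decimals than 1.96.

```python
import math
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-sided critical value Z_{alpha/2} for the given confidence level."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def required_sample_size(sigma: float, margin: float, confidence: float = 0.95) -> int:
    """n = (Z_{alpha/2} * sigma / E)^2, always rounded up to a whole observation."""
    z = z_critical(confidence)
    return math.ceil((z * sigma / margin) ** 2)


if __name__ == "__main__":
    # Example 2: sigma = 20 hours, E = 3 hours, 95% confidence
    print("E = 3 hours:", required_sample_size(20, 3))
    # Tighter margin of error, same sigma and confidence level
    print("E = 1 hour: ", required_sample_size(20, 1))
    # Other confidence levels with the original margin of 3 hours
    for level in (0.90, 0.99):
        print(f"{level:.0%} confidence:", required_sample_size(20, 3, level))
```

For Example 2 this gives 171, and with a margin of 1 hour it gives 1537, which is roughly 9 times as large.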
Example 3: A small town has 1000 families who make contributions to the only local church. A
poll of 144 randomly selected contributing families reveals that the mean annual family
contribution is $500 with a standard deviation of $72. Construct a 95% confidence interval for
the mean annual family contribution for this population of families who contribute to this
particular church.
Do you see a problem with this question? We are not given the population mean or standard
deviation as in the previous example. This may be because no previous survey was done for
this population. In such cases the sample results are used as surrogates for the unknown
population parameters in the formulas given above, provided the sample size is adequate. In this case the
sample size of 144 is quite large. So we will use $500 and $72 as the surrogates for the unknown
population mean and standard deviation, respectively. But do we need to use the formula for a
large population or a small population? At first sight the population of 1000 families seems to be
large. But remember the rule given above. The population is much smaller than 20 times the
sample (20 × 144 = 2880). So we will use property #3 to find the standard error. We will use the formula given for
a finite population on page 4 of Instructions for Chapter 7.
Therefore,
σ_x̄ = (s/√n) × √((N − n)/(N − 1)) = (72/√144) × √((1000 − 144)/(1000 − 1))
= 6 × 0.9257 = 5.554
Therefore, a 95% confidence interval would be between 500 − 1.96 × 5.554 and 500 + 1.96 × 5.554, or
between 489 and 511 dollars per year (rounded to whole numbers, ignoring cents). We could also
use the t distribution (discussed below) to build the confidence interval in this case, since the
population standard deviation is not given. But the result would be very close to what we
obtained using the Z distribution because the sample size is very large. I will discuss this issue in the
following section.
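The finite-population calculation for Example 3 can be sketched in Python as follows. The helper names `finite_population_se` and `z_interval` are hypothetical, and the sample standard deviation of $72 stands in for the unknown population value, exactly as in the text.

```python
import math
from statistics import NormalDist


def finite_population_se(s: float, n: int, N: int) -> float:
    """Standard error of the mean with the finite population correction factor."""
    return (s / math.sqrt(n)) * math.sqrt((N - n) / (N - 1))


def z_interval(mean: float, se: float, confidence: float = 0.95) -> tuple[float, float]:
    """Confidence interval mean +/- Z_{alpha/2} * se."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * se
    return mean - margin, mean + margin


if __name__ == "__main__":
    # Example 3: N = 1000 families, n = 144 sampled, mean $500, s = $72
    se = finite_population_se(72, 144, 1000)
    lower, upper = z_interval(500, se)
    print(f"standard error = {se:.3f}")
    print(f"95% interval: {lower:.0f} to {upper:.0f} dollars")
```

Without the correction factor the standard error would be 72/√144 = 6, so the correction narrows the interval slightly.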
Thus we will use t_{α/2} in place of Z_{α/2} in our calculation of the margin of error and the
confidence interval whenever df is less than 30 and σ is unknown (assuming, however, that
the parent population is normal). We will follow exactly the same steps (shown above) as in
the case when σ is known, except that we replace σ by s and Z by t. For df greater than or
equal to 30 it is a matter of the researcher's choice. Theoretically t would be more accurate than
Z, but that would involve reading from the t-table instead of using the popularly known Z-values. So it is up to you which one to use.
Example 4: The sample mean operating life for a random sample of 16 light bulbs of a particular
brand is calculated to be 4000 hours with the sample standard deviation of 200 hours. The
operating life of bulbs is generally assumed to be approximately normal. Estimate the mean
operating life for the population of bulbs from which the sample is taken using a 95% confidence
interval.
Here n = 16, df = 15, the population is (approximately) normal, and the population standard
deviation is not given. Therefore, we will use the t-distribution instead of the Z-distribution to
construct the confidence interval. We are given x̄ = 4000 hours and s = 200 hours. Therefore, the
standard deviation (or standard error) of the sample mean, denoted by σ_x̄ (from my previous
Instructions), is given by
σ_x̄ = s/√n, where we have replaced σ by s.
Or σ_x̄ = 200/√16 = 50 hours. Now the confidence level required is 95%. So α = .05 and α/2 = .025.
Therefore, we need to find t_{.025} from the table for df = 15. This value is 2.131. Next,
the margin of error = t_{α/2} × σ_x̄ = 2.131 × 50 = 106.55 hours. Therefore,
the 95% confidence interval for the mean is x̄ ± t_{α/2} × σ_x̄ = x̄ ± t_{α/2} × s/√n =
4000 ± 106.55, or between 3893.45 and 4106.55 hours, or between 3893 and 4107 hours rounded
(because the numbers are very large we can ignore the decimals and round to the nearest whole
number). If we had neglected the fact that the population standard deviation is not known and the
sample size is quite small (consequently the df is small), then we would be estimating a narrower
interval which would be questionable because it would be claiming more precision than
warranted by the nature of the sample.
Now can you build 90% and 99% confidence interval estimates of the mean life of bulbs for this
sample? (Hint: look for t_{.050} and t_{.005}, respectively.)
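Here is a minimal Python sketch of the light bulb calculation in Example 4. Python's standard library has no t-table, so the sketch assumes you read the critical value from the table yourself (2.131 for df = 15 at 95%) and pass it in. The function name `t_interval` is made up for illustration. The sketch also computes the Z-based margin so you can see how much narrower it would have been.

```python
import math
from statistics import NormalDist


def t_interval(mean: float, s: float, n: int, t_crit: float) -> tuple[float, float, float]:
    """Return (lower, upper, margin) for mean +/- t_crit * s / sqrt(n)."""
    se = s / math.sqrt(n)   # estimated standard error of the sample mean
    margin = t_crit * se    # margin of error
    return mean - margin, mean + margin, margin


if __name__ == "__main__":
    # Example 4: n = 16, mean 4000 hours, s = 200 hours
    T_025_DF15 = 2.131  # read from the t-table for df = 15, alpha/2 = .025
    lower, upper, margin = t_interval(4000, 200, 16, T_025_DF15)
    print(f"t margin of error = {margin:.2f} hours")
    print(f"95% interval: {lower:.2f} to {upper:.2f} hours")

    # For comparison only: the margin we would get by (wrongly) using Z here
    z_margin = NormalDist().inv_cdf(0.975) * 200 / math.sqrt(16)
    print(f"Z margin of error = {z_margin:.2f} hours (narrower)")
```

To try another confidence level, pass the corresponding t-table value for df = 15 as `t_crit`.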
Use of Computer
For the example of the auditor's sample of accounts receivable (Example 1 above), the computer output is:
95% confidence level
2600 mean
450 std. dev.
36 n
1.960 z
146.997 half-width
2746.997 upper confidence limit
2453.003 lower confidence limit
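If you do not have MegaStat at hand, the same numbers for Example 1 can be reproduced with a short Python sketch. The function name `z_interval_known_sigma` and the dictionary keys are hypothetical. The keys simply mirror the labels in the output above.

```python
import math
from statistics import NormalDist


def z_interval_known_sigma(mean: float, sigma: float, n: int, confidence: float = 0.95) -> dict:
    """Z-based confidence interval for the mean when sigma is treated as known."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = z * sigma / math.sqrt(n)
    return {
        "z": z,
        "half-width": half_width,
        "upper confidence limit": mean + half_width,
        "lower confidence limit": mean - half_width,
    }


if __name__ == "__main__":
    # Example 1: mean 2600, std. dev. 450, n = 36, 95% confidence
    for label, value in z_interval_known_sigma(2600, 450, 36).items():
        print(f"{value:10.3f}  {label}")
```

The half-width comes out as 146.997 rather than 147 because the Z value is carried to more decimals than 1.96.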
You can also find the required sample size for a given level of confidence and a specified tolerable
margin of error. Let us work on the second example of this instruction using MegaStat.
Example 2 from above: A personnel department analyst wishes to estimate the mean number of
training hours needed annually for supervisors in a division of the company within the margin of
3 hours (that is, plus or minus 3 hours) with a 95% confidence level. Based on a large data set from
other similar companies the analyst estimates the standard deviation of required training hours
to be equal to 20 hours. Find the minimum sample size which will give the required estimate
with specified margin of error and level of confidence.
Go to MegaStat, select Confidence interval/Sample size, then select Sample size-mean in the
dialogue box, and fill 3 for E and 20 for std deviation. Then the sample size for 95%
confidence level is given by MegaStat as:
Sample size - mean
3 E, error tolerance
20 standard deviation
95% confidence level
1.960 z
170.732 sample size
171 rounded up
This is exactly the same answer we derived above using the formula. I want you to learn
everything using the formula as well as the computer. Learning only one way will be half knowledge.
Similarly, you can use MegaStat to find confidence intervals using the t-distribution. Let us solve
Example 4 of this instruction using MegaStat.
Example 4 from above: The sample mean operating life for a random sample of 16 light bulbs
of a particular brand is calculated to be 4000 hours with the sample standard deviation of 200
hours. The operating life of bulbs is generally assumed to be approximately normal. Estimate the
mean operating life for the population of bulbs from which the sample is taken using a 95%
confidence interval.
In this case we will select t instead of Z in the dialogue box and get:
Confidence interval - mean
95% confidence level
4000 mean
200 std. dev.
16 n
2.131 t (df = 15)
106.572 half-width
4106.572 upper confidence limit
3893.428 lower confidence limit
σ_p = √{(0.3)(0.7)/250} = 0.029
Therefore, the 95% confidence interval is 0.3 ± 0.029 × 1.96 = 0.3 ± 0.057 (rounded to three
decimals), or between 0.243 and 0.357 (also found in previous instructions). We could have used
MegaStat to find this (select proportion in the dialogue box instead of sample mean) as follows:
Confidence interval - proportion
95% confidence level
0.3 proportion
250 n
1.960 z
0.057 half-width
0.357 upper confidence limit
0.243 lower confidence limit
Sometimes the population proportion is not given. Then we have to work with the estimated
sample proportion only as in the following example.
Example 6: A sample of 75 retail in-store purchases showed that 24 paid in cash. Construct a
95% confidence interval for the proportion of all retail in-store purchases that are paid in cash.
Here the population proportion is not given. The sample p = 24/75 = 0.32 and n = 75. Thus np = 24
and n(1 − p) = 51. Therefore, the normal approximation can be satisfactorily applied. Do we need a
continuity correction? We have np(1 − p) = 16.3 > 10. So we don't need a continuity correction. We
get s_p = √{(.32)(.68)/75} = 0.0539. Now replacing σ_p by s_p we get the 95% confidence interval
as: p ± Z_{α/2} × s_p = 0.32 ± 1.96 × 0.0539 = 0.32 ± 0.1056, or between 0.2144 and 0.4256.
Using MegaStat
Confidence interval - proportion
95% confidence level
0.32 proportion
75 n
1.960 z
0.106 half-width
0.426 upper confidence limit
0.214 lower confidence limit
You can easily find other confidence intervals.
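The proportion intervals above can also be checked with a small Python sketch. The function name `proportion_interval` is hypothetical, and the test `n * p * (1 - p) > 10` is the continuity-correction rule as it is used in Example 6.

```python
import math
from statistics import NormalDist


def proportion_interval(p: float, n: int, confidence: float = 0.95) -> tuple[float, float, float]:
    """Return (lower, upper, half_width) for p +/- Z_{alpha/2} * sqrt(p(1-p)/n)."""
    se = math.sqrt(p * (1.0 - p) / n)  # standard error of the proportion
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = z * se
    return p - half_width, p + half_width, half_width


if __name__ == "__main__":
    # Example 6: 24 of 75 purchases paid in cash
    p, n = 24 / 75, 75
    # Rule used in the text: no continuity correction when n*p*(1-p) > 10
    print("n*p*(1-p) =", round(n * p * (1 - p), 1))
    lower, upper, half = proportion_interval(p, n)
    print(f"half-width = {half:.3f}, interval: {lower:.3f} to {upper:.3f}")

    # The earlier proportion example: p = 0.3, n = 250
    lower, upper, half = proportion_interval(0.3, 250)
    print(f"half-width = {half:.3f}, interval: {lower:.3f} to {upper:.3f}")
```

The small differences from the hand calculation (0.1056 versus 0.106) come only from rounding the standard error and the Z value.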