Chapter 4: Point Estimators and Confidence Interval: Phan Thi Khanh Van

Chapter 4: Point estimators and Confidence
Interval
Phan Thi Khanh Van
E-mail: khanhvanphan@hcmut.edu.vn
December 15, 2020
Table of Contents
1 Point Estimator
Sampling Distributions
Point Estimation
2 Confidence Interval
Confidence Interval on the Mean of a Normal Distribution, Variance Known
Confidence Interval on the Mean of a Normal Distribution, Variance Unknown
Large-Sample Confidence Interval for a Population Proportion
Random Sample
The random variables X1 , X2 , ..., Xn are a random sample of size n if
the Xi are independent random variables, and
every Xi has the same probability distribution.
Statistic
A statistic is any function of the observations in a random sample.
For example, if $X_1, X_2, \ldots, X_n$ is a random sample of size n, then
the sample mean: $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$,
the sample variance: $S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$,
and the sample standard deviation: $S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}}$
are statistics.
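The three statistics above can be computed directly from their definitions. The sketch below uses a small hypothetical sample (the values are not from the slides) and checks the hand-written variance against Python's `statistics.variance`, which also divides by n − 1.

```python
import math
import statistics

# Hypothetical observations, used only to illustrate the formulas.
sample = [2.0, 4.0, 4.0, 6.0]
n = len(sample)

x_bar = sum(sample) / n                               # sample mean
s2 = sum((x - x_bar) ** 2 for x in sample) / (n - 1)  # sample variance, divisor n - 1
s = math.sqrt(s2)                                     # sample standard deviation

# The standard library uses the same n - 1 divisor.
assert math.isclose(s2, statistics.variance(sample))
print(x_bar, s2, s)
```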
Sampling Distribution
The probability distribution of a statistic is called a sampling distribution.
Example: Sampling distribution of the sample mean of a normal population
Suppose that a random sample of size n is taken from a normal
population with mean µ and variance σ². Then the sample mean
$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
has a normal distribution with mean and variance
$$\mu_{\bar{X}} = \frac{\mu + \mu + \cdots + \mu}{n} = \mu, \qquad \sigma_{\bar{X}}^2 = \frac{\sigma^2 + \sigma^2 + \cdots + \sigma^2}{n^2} = \frac{\sigma^2}{n}.$$
Central Limit Theorem

If X1 , X2 , ..., Xn is a random sample of size n taken from a population
(either finite or infinite) with mean µ and finite variance σ 2 and if X̄ is the
sample mean, the limiting form of the distribution of
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
as n → ∞, is the standard normal distribution.
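A quick simulation can make the theorem concrete. The sketch below is only an illustration under assumed settings (a Uniform(0, 1) population, n = 30, 5000 replications, a fixed seed): it standardizes each sample mean and compares the fraction of Z values below 1 with the standard normal CDF at 1.

```python
import random
from statistics import NormalDist

def standardized_mean(rng, n):
    """Draw n Uniform(0, 1) values and return Z = (xbar - mu) / (sigma / sqrt(n))."""
    mu, sigma = 0.5, (1 / 12) ** 0.5  # mean and standard deviation of Uniform(0, 1)
    x_bar = sum(rng.random() for _ in range(n)) / n
    return (x_bar - mu) / (sigma / n ** 0.5)

rng = random.Random(0)  # fixed seed so the run is repeatable
reps, n = 5000, 30      # assumed simulation sizes
z_values = [standardized_mean(rng, n) for _ in range(reps)]

# Empirical P(Z <= 1) versus the standard normal value Phi(1).
frac_below = sum(z <= 1.0 for z in z_values) / reps
print(frac_below, NormalDist().cdf(1.0))
```

The two printed numbers should be close, although the population itself is far from normal.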
Example
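Suppose that a random variable X has a continuous uniform distribution
$$f(x) = \begin{cases} 1/4, & 4 \le x \le 8, \\ 0, & x < 4 \text{ or } x > 8. \end{cases}$$
a. Find the distribution of the sample mean of a random sample of size n = 40.
b. Find the probability that the sample mean is less than 5.9.
a. $\mu = \frac{4 + 8}{2} = 6$, $\sigma^2 = \frac{(8 - 4)^2}{12} = \frac{4}{3}$. By the central limit theorem, $\bar{X}$ is approximately normal with
$$\mu_{\bar{X}} = 6, \qquad \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n} = \frac{4}{3 \cdot 40} = \frac{1}{30}.$$
b.
$$P(\bar{X} < 5.9) = P\left(Z < \frac{5.9 - \mu_{\bar{X}}}{\sigma_{\bar{X}}}\right) = P\left(Z < \frac{5.9 - 6}{\sqrt{1/30}}\right) \approx P(Z < -0.55) = 0.2912.$$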
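The calculation in this example can be reproduced with the normal approximation from the central limit theorem. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; it does not round z to −0.55, so the probability comes out slightly above the table-based 0.2912.

```python
from statistics import NormalDist

a, b, n = 4, 8, 40            # support of the uniform population and sample size
mu = (a + b) / 2              # population mean: 6
var = (b - a) ** 2 / 12       # population variance: 4/3

# Approximate sampling distribution of the sample mean (CLT).
x_bar_dist = NormalDist(mu, (var / n) ** 0.5)
prob = x_bar_dist.cdf(5.9)    # P(sample mean < 5.9)
print(round(prob, 4))
```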
Example
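Suppose that X has a discrete uniform distribution
$$f(x) = \begin{cases} 1/3, & x = 1, 2, 3, \\ 0, & \text{otherwise.} \end{cases}$$
A random sample of n = 36 is selected from this population. Find the
probability that the sample mean is greater than 2.1 but less than 2.5,
assuming that the sample mean would be measured to the nearest tenth.
$$\mu = (1 + 2 + 3) \cdot \tfrac{1}{3} = 2.$$
$$\sigma = \sqrt{(1^2 + 2^2 + 3^2) \cdot \tfrac{1}{3} - 2^2} = \sqrt{2/3} = 0.8165.$$
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \sqrt{\frac{1}{54}} = 0.1361.$$
$$P(2.1 < \bar{X} < 2.5) = P\left(\frac{2.1 - 2}{\sqrt{1/54}} < Z < \frac{2.5 - 2}{\sqrt{1/54}}\right) \approx P(0.7348 < Z < 3.6742) = \Phi(3.6742) - \Phi(0.7348) \approx 0.2326,$$
using the table value Φ(0.73) = 0.7673; without rounding z, the result is about 0.2311.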
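As a cross-check of this example, the sketch below recomputes the moments of the discrete uniform population and the normal-approximation probability with `statistics.NormalDist`. Because it evaluates Φ at the unrounded z values, it gives roughly 0.231, slightly below the 0.2326 obtained when z = 0.7348 is rounded down to 0.73 for a table lookup.

```python
from math import sqrt
from statistics import NormalDist

values = [1, 2, 3]            # support of X, each value with probability 1/3
n = 36

mu = sum(values) / 3                            # E(X) = 2
var = sum(v ** 2 for v in values) / 3 - mu ** 2  # Var(X) = 2/3
sd_mean = sqrt(var / n)                         # standard deviation of the sample mean

z = NormalDist()
prob = z.cdf((2.5 - mu) / sd_mean) - z.cdf((2.1 - mu) / sd_mean)
print(round(sd_mean, 4), round(prob, 4))
```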
Point Estimator
Point Estimator
A point estimate of some population parameter θ is a single numerical
value of a statistic θ̂. The statistic θ̂ is called the point estimator.
Example
Suppose that the random variable X is normally distributed with an
unknown mean µ. After the sample has been selected, the sample mean
X̄ is a point estimator of the unknown population mean µ.
That is, µ̂ = X̄ .
Thus, if x1 = 25, x2 = 30, x3 = 29, and x4 = 31, the point estimate of µ is
$$\bar{x} = \frac{x_1 + x_2 + x_3 + x_4}{4} = 28.75.$$
Similarly, a point estimator for σ² is the sample variance S², and the numerical value
$$s^2 = \frac{(x_1 - \bar{x})^2 + \cdots + (x_4 - \bar{x})^2}{3} = 6.9167$$
calculated from the sample data is called the point estimate of σ².
Example: Normal Distribution Estimators
A team of analytic specialists has been investigating the cycle time to
process loan applications. The specialists’ experience with the process
informs them that cycle time is normally distributed with parameters µ and
σ². A recent random sample of 10 applications gives the following (in
hours): 24.1514, 27.4145, 20.4000, 22.5151, 28.5152, 28.5611, 21.2489,
20.9983, 24.9840, 22.6245.
Use the sample mean and variance to estimate µ and σ.
$$\hat{\mu} = \bar{X} = \frac{1}{10}\sum_{i=1}^{10} X_i = 24.1413.$$
$$\hat{\sigma} = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}} = \sqrt{9.6974} = 3.1141.$$
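The two estimates in this example can be checked with the Python standard library. This is a small sketch: `statistics.stdev` uses the n − 1 divisor, matching the formula for S.

```python
from statistics import mean, stdev

# Cycle times (hours) from the loan-application example.
cycle_times = [24.1514, 27.4145, 20.4000, 22.5151, 28.5152,
               28.5611, 21.2489, 20.9983, 24.9840, 22.6245]

mu_hat = mean(cycle_times)      # point estimate of mu
sigma_hat = stdev(cycle_times)  # point estimate of sigma (n - 1 divisor)
print(round(mu_hat, 4), round(sigma_hat, 4))
```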
Bias of an Estimator
The point estimator θ̂ is an unbiased estimator for the parameter θ if
E (θ̂) = θ.
If the estimator is not unbiased, then the difference E (θ̂) − θ is called the
bias of the estimator θ̂.
Example: Sample Mean and Variance are Unbiased

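$$E(\bar{X}) = E\left(\frac{X_1 + X_2 + \cdots + X_n}{n}\right) = \mu.$$
The sample variance is an estimator of the population variance, and
$$E(S^2) = E\left[\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}\right] = \frac{1}{n-1} E\sum_{i=1}^{n} \left(X_i^2 + \bar{X}^2 - 2\bar{X} X_i\right)$$
$$= \frac{1}{n-1}\left[\sum_{i=1}^{n} E(X_i^2) - nE(\bar{X}^2)\right] = \frac{1}{n-1}\left[n(\mu^2 + \sigma^2) - n\left(\mu^2 + \frac{\sigma^2}{n}\right)\right] = \sigma^2,$$
so S² is unbiased.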
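The derivation of E(S²) relies on two intermediate facts that are easy to skip over. They are written out below as a worked step, using only the identity E(Y²) = Var(Y) + [E(Y)]² and the mean and variance of the sample mean.

```latex
% Collapsing the cross term: \sum_i X_i = n\bar{X}
\sum_{i=1}^{n}\left(X_i^2 + \bar{X}^2 - 2\bar{X}X_i\right)
  = \sum_{i=1}^{n} X_i^2 + n\bar{X}^2 - 2\bar{X}\cdot n\bar{X}
  = \sum_{i=1}^{n} X_i^2 - n\bar{X}^2 .

% Second moments from E(Y^2) = \mathrm{Var}(Y) + [E(Y)]^2
E(X_i^2) = \sigma^2 + \mu^2 , \qquad
E(\bar{X}^2) = \frac{\sigma^2}{n} + \mu^2 .

% Substituting:
n(\mu^2 + \sigma^2) - n\left(\mu^2 + \frac{\sigma^2}{n}\right) = (n-1)\sigma^2 .
```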
Estimation problems occur frequently in engineering. Reasonable point
estimates:
The mean µ of a single population: µ̂ = x̄: sample mean.
The variance σ 2 (or standard deviation σ) of a single population:
σ̂ 2 = s 2 : sample variance.
The proportion p of items in a population that belong to a class of
interest: p̂ = x/n: sample proportion.
The difference in means of two populations, µ1 − µ2 :
µ̂1 − µ̂2 = x̄1 − x̄2 .
The difference in two population proportions, p1 − p2 : p̂1 − p̂2 .
Example: Exponential Distribution Moment Estimator
The time to failure of an electronic module used in an automobile engine
controller is tested at an elevated temperature to accelerate the failure
mechanism. The time to failure is exponentially distributed with the
parameter λ. Eight units are randomly selected and tested, resulting in the
following failure time (in hours):
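11.96, 5.03, 67.40, 16.07, 31.50, 7.73, 11.10, 22.38. Use the
sample mean to estimate λ.
Because the mean of an exponential distribution is 1/λ,
$$\hat{\lambda} = \frac{1}{\hat{\mu}} = \frac{1}{\frac{1}{8}\sum_{i=1}^{8} X_i} = \frac{1}{21.6462} = 0.0462.$$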
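The moment estimate of λ is simply the reciprocal of the sample mean, as this short sketch confirms for the failure-time data.

```python
from statistics import mean

# Failure times (hours) from the example.
failure_times = [11.96, 5.03, 67.40, 16.07, 31.50, 7.73, 11.10, 22.38]

mu_hat = mean(failure_times)  # sample mean estimates 1 / lambda
lambda_hat = 1 / mu_hat       # moment estimate of lambda
print(round(mu_hat, 4), round(lambda_hat, 4))
```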
Example: Gamma Distribution Moment Estimators
The time to failure of an electronic module used in an automobile engine
controller is tested at an elevated temperature to accelerate the failure
mechanism. The time to failure has Gamma distribution with the
parameters r and λ. Eight units are randomly selected and tested,
resulting in the following failure time (in hours):
11.96, 5.03, 67.40, 16.07, 31.50, 7.73, 11.10, 22.38. Find
the estimate of λ and r using the sample mean and variance.
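Because a gamma distribution has mean r/λ and variance r/λ², the moment equations are
$$\frac{\hat{r}}{\hat{\lambda}} = \hat{\mu} = \bar{X} = \frac{1}{8}\sum_{i=1}^{8} X_i = 21.6462,$$
$$\frac{\hat{r}}{\hat{\lambda}^2} = \hat{\sigma}^2 = s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1} = \frac{1}{7}\sum_{i=1}^{8} (X_i - \bar{X})^2 = 413.8491.$$
Hence,
$$\hat{\lambda} = \frac{21.6462}{413.8491} = 0.0523, \qquad \hat{r} = \hat{\lambda}\hat{\mu} = 1.1322.$$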
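Solving the two moment equations gives λ̂ = x̄ / s² and r̂ = λ̂ x̄. The sketch below applies this to the failure-time data; the function name `gamma_moment_estimates` is an illustrative choice, not something from the slides.

```python
from statistics import mean, variance

def gamma_moment_estimates(data):
    """Return (r_hat, lambda_hat) from mean = r/lambda and variance = r/lambda**2."""
    x_bar = mean(data)
    s2 = variance(data)          # sample variance, n - 1 divisor
    lambda_hat = x_bar / s2
    r_hat = lambda_hat * x_bar
    return r_hat, lambda_hat

failure_times = [11.96, 5.03, 67.40, 16.07, 31.50, 7.73, 11.10, 22.38]
r_hat, lambda_hat = gamma_moment_estimates(failure_times)
print(round(r_hat, 4), round(lambda_hat, 4))
```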
Confidence Interval
Example
There is an ASTM Standard E23 that defines a technique called the
Charpy V-notch method for notched bar impact testing of metallic
materials. The impact energy is often used to determine whether the
material experiences a ductile-to-brittle transition as the temperature
decreases.
Suppose that we have tested a sample of 10 specimens of a particular
material with this procedure. We estimate the true mean impact energy
µ by the sample average X̄ . Our estimate could be very close, or it could
be considerably far from the true mean.
Reporting the estimate as a range of plausible values, called a
confidence interval (CI) I, conveys this uncertainty.
A confidence interval always specifies a confidence level, usually 90%,
95%, or 99%, which is a measure of the reliability of the procedure; for
example, P(µ ∈ I ) = 0.9.
Confidence Interval (CI) on the Mean of a Normal Distribution, Variance Known
Confidence Interval on the Mean of a Normal Distribution

Suppose that X1 , X2 , ..., Xn is a random sample from a normal distribution
with unknown mean µ and known variance σ 2 .
X̄ is normally distributed with mean µ and variance σ 2 /n.
Standardize X̄: $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ is a standard normal random variable.
A confidence interval estimate for µ is an interval of the form l ≤ µ ≤ u,
where the end-points l and u are computed from the sample data.
We have to determine statistics L and U such that
P(L ≤ µ ≤ U) = 1 − α, 0 ≤ α ≤ 1.
The end-points or bounds l and u are called the lower- and upper-confidence
limits (bounds), respectively, and 1 − α is called the confidence coefficient.
Confidence Interval on the Mean, Variance Known

If x̄ is the sample mean of a random sample of size n from a normal
population with known variance σ 2 , a 100(1 − α)% confidence interval on
µ is given by
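$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}},$$
where $z_{\alpha/2}$ is the upper 100α/2 percentage point of the standard normal
distribution, that is, $P(Z \ge z_{\alpha/2}) = P(Z \le -z_{\alpha/2}) = \alpha/2$.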
Example: Metallic Material Transition
ASTM Standard E23 defines standard test methods for notched bar
impact testing of metallic materials. The Charpy V-notch (CVN)
technique measures impact energy and is often used to determine whether
or not a material experiences a ductile-to-brittle transition with decreasing
temperature. Ten measurements of impact energy (J) on specimens of
A238 steel cut at 60°C are as follows: 64.1, 64.7, 64.5, 64.6, 64.5, 64.3,
64.6, 64.8, 64.2, 64.3. Assume that impact energy is normally distributed
with σ = 1 J. We want to find a 95% CI for µ, the mean impact energy.
$$\bar{x} = \frac{1}{10}\sum_{i=1}^{10} x_i = 64.46, \qquad \alpha = 0.05, \quad P(Z \ge z_{\alpha/2}) = 0.025 \Rightarrow z_{\alpha/2} = 1.96.$$
A 95% CI:
$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
$$\Leftrightarrow 64.46 - 1.96\frac{1}{\sqrt{10}} \le \mu \le 64.46 + 1.96\frac{1}{\sqrt{10}}$$
$$\Leftrightarrow 63.84 \le \mu \le 65.08.$$
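The interval in this example can be reproduced in a few lines. The helper `z_interval` below is a hypothetical name for a minimal sketch of the known-variance formula; it obtains the critical value from `statistics.NormalDist().inv_cdf` instead of a table.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_interval(data, sigma, confidence=0.95):
    """Two-sided CI on the mean of a normal population with known sigma."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point, about 1.96 for 95%
    half_width = z * sigma / sqrt(len(data))
    x_bar = mean(data)
    return x_bar - half_width, x_bar + half_width

impact_energy = [64.1, 64.7, 64.5, 64.6, 64.5, 64.3, 64.6, 64.8, 64.2, 64.3]
lower, upper = z_interval(impact_energy, sigma=1.0)
print(round(lower, 2), round(upper, 2))  # 63.84 65.08
```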
One-Sided Confidence Bounds on the Mean, Variance Known

Let x̄ be the sample mean of a random sample of size n from a normal
population with known variance σ².
A 100(1 − α)% upper-confidence bound for µ is given by
$$\mu \le U = \bar{x} + z_{\alpha}\frac{\sigma}{\sqrt{n}},$$
and a 100(1 − α)% lower-confidence bound for µ is given by
$$\mu \ge L = \bar{x} - z_{\alpha}\frac{\sigma}{\sqrt{n}},$$
where $z_{\alpha}$ is the upper 100α percentage point of the standard normal
distribution.
Sample Size for Specified Error on the Mean, Variance Known

If x̄ is used as an estimate of µ, we can be 100(1 − α)% confident that the
error |x̄ − µ| will not exceed a specified amount E when the sample size is
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2.$$
If the right-hand side is not an integer, it must be rounded up. This will
ensure that the level of confidence does not fall below 100(1 − α)%.
Notice that 2E is the length of the resulting CI.
Example
The data below describe temperatures for wheat grown at
Harper Adams Agricultural College between 1982 and 1993. The
temperatures measured in June were obtained as follows:
15.2, 14.2, 14.0, 12.2, 14.4, 12.5, 14.3, 14.2, 13.5, 11.8, 15.2. Assume that
the standard deviation is known to be σ = 0.5.
(a) Construct a 99% two-sided confidence interval on the mean
temperature.
(b) Construct a 95% lower-confidence bound on the mean temperature.
(c) Suppose that you wanted to be 95% confident that the error in
estimating the mean temperature is less than 2 degrees Celsius. What
sample size should be used?
(d) Suppose that you wanted the total width of the two-sided confidence
interval on mean temperature to be 1.5 degrees Celsius at 95% confidence.
What sample size should be used?
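$$\bar{x} = \frac{1}{11}\sum_{i=1}^{11} x_i = 13.77.$$
a) α = 0.01, $P(Z \ge z_{\alpha/2}) = 0.005 \Rightarrow z_{\alpha/2} = 2.57$.
A 99% two-sided CI:
$$\mu \in \left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = \left(13.77 - 2.57\frac{0.5}{\sqrt{11}},\ 13.77 + 2.57\frac{0.5}{\sqrt{11}}\right) = (13.39, 14.16),$$
where the end-points are computed with the unrounded mean x̄ = 13.7727.
b) α = 0.05, $P(Z \ge z_{\alpha}) = 0.05 \Rightarrow z_{\alpha} = 1.64$.
A 95% lower-confidence bound on the mean temperature:
$$\mu \ge \bar{x} - z_{\alpha}\frac{\sigma}{\sqrt{n}} = 13.77 - 1.64\frac{0.5}{\sqrt{11}} = 13.52.$$
c) E = 2, α/2 = 0.025 ⇒ $z_{\alpha/2} = 1.96$.
In order to be 95% confident that the error in estimating the mean
temperature is less than 2°C, the sample size should be:
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2 = \left(\frac{1.96 \cdot 0.5}{2}\right)^2 = 0.2401.$$
Rounding up, we have n = 1.
d) The total width is 2E = 1.5, so E = 1.5/2 = 0.75. The sample size should be:
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2 = \left(\frac{1.96 \cdot 0.5}{0.75}\right)^2 = 1.7074.$$
Rounding up, we have n = 2.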
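Parts (b) to (d) of this example follow directly from the one-sided bound and the sample-size formula. The sketch below is one possible implementation; the function names are illustrative, and the critical values come from `NormalDist().inv_cdf`, so the lower bound may differ in the third decimal from a value computed with the table figure 1.64.

```python
from math import ceil, sqrt
from statistics import NormalDist, mean

def lower_bound(data, sigma, confidence):
    """One-sided lower confidence bound on the mean, sigma known."""
    z = NormalDist().inv_cdf(confidence)  # upper alpha point, alpha = 1 - confidence
    return mean(data) - z * sigma / sqrt(len(data))

def sample_size(sigma, error, confidence):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / error) ** 2)

temps = [15.2, 14.2, 14.0, 12.2, 14.4, 12.5, 14.3, 14.2, 13.5, 11.8, 15.2]
sigma = 0.5

lb = lower_bound(temps, sigma, 0.95)       # part (b)
n_c = sample_size(sigma, 2.0, 0.95)        # part (c): E = 2
n_d = sample_size(sigma, 1.5 / 2, 0.95)    # part (d): total width 1.5, so E = 0.75
print(round(lb, 2), n_c, n_d)
```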
Large-sample confidence interval for µ
Large-Sample Confidence Interval on the Mean

Let X1 , X2 , ..., Xn be a random sample from a population with unknown
mean µ and variance σ². If the sample size n is large, the central limit
theorem implies that X̄ has approximately a normal distribution with
mean µ and variance σ²/n. Replacing σ by the sample standard deviation S,
$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
still has approximately a standard normal distribution. Consequently,
$$\bar{X} - z_{\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\frac{S}{\sqrt{n}}$$
is a large-sample confidence interval for µ, with confidence level of
approximately 100(1 − α)%.
Generally, n should be at least 40 to use this result reliably.

Example: Mercury Contamination

An article in the 1993 volume of the Transactions of the American
Fisheries Society reports the results of a study to investigate the mercury
contamination in large-mouth bass. A sample of fish was selected from 53
Florida lakes, and mercury concentration in the muscle tissue was
measured (ppm). The mercury concentration values were
1.230 1.330 0.040 0.044 1.200 0.270
0.490 0.190 0.830 0.810 0.710 0.500
0.490 1.160 0.050 0.150 0.190 0.770
1.080 0.980 0.630 0.560 0.410 0.730
0.590 0.340 0.340 0.840 0.500 0.340
0.280 0.340 0.750 0.870 0.560 0.170
0.180 0.190 0.040 0.490 1.100 0.160
0.100 0.210 0.860 0.520 0.650 0.270
0.940 0.400 0.430 0.250 0.270
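$$n = 53, \qquad \bar{X} = 0.525, \qquad S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}} = 0.3486.$$
Because n > 40, the assumption of normality is not necessary.
The approximate 95% CI on the mean is
$$\bar{X} - z_{0.025}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + z_{0.025}\frac{S}{\sqrt{n}}$$
$$0.525 - 1.96\frac{0.3486}{\sqrt{53}} \le \mu \le 0.525 + 1.96\frac{0.3486}{\sqrt{53}}$$
$$0.4311 \le \mu \le 0.6189.$$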
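Since the large-sample interval depends on the data only through n, the sample mean and S, it can be computed from those summary values alone. The sketch below uses the figures reported in this example; `large_sample_interval` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def large_sample_interval(n, x_bar, s, confidence=0.95):
    """Approximate CI on the mean for large n, with S in place of sigma."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * s / sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Summary statistics of the 53 mercury concentrations (ppm).
lower, upper = large_sample_interval(n=53, x_bar=0.525, s=0.3486)
print(round(lower, 4), round(upper, 4))
```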
Confidence Interval on the Mean of a Normal Distribution, Variance Unknown
t Distribution
Let X1 , X2 , ..., Xn be a random sample from a normal distribution with
unknown mean µ and unknown variance σ 2 . The random variable
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
has a t distribution with n − 1 degrees of freedom.

The t PDF is
$$f(x) = \frac{\Gamma[(k+1)/2]}{\sqrt{\pi k}\,\Gamma(k/2)} \cdot \frac{1}{[(x^2/k) + 1]^{(k+1)/2}}, \qquad -\infty < x < \infty,$$
where k is the number of degrees of freedom.

The mean and variance of the t distribution are:
$$\mu = 0, \qquad \sigma^2 = \frac{k}{k-2} \quad (k > 2).$$
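The density and variance formulas can be evaluated with `math.gamma`. The sketch below is only an illustration of the formula as stated; for k = 1 it reduces to the Cauchy density, whose value at 0 is 1/π, which gives a simple sanity check.

```python
from math import gamma, pi, sqrt

def t_pdf(x, k):
    """Density of the t distribution with k degrees of freedom."""
    const = gamma((k + 1) / 2) / (sqrt(pi * k) * gamma(k / 2))
    return const * (x * x / k + 1) ** (-(k + 1) / 2)

def t_variance(k):
    """Variance k / (k - 2), defined only for k > 2."""
    if k <= 2:
        raise ValueError("variance is finite only for k > 2")
    return k / (k - 2)

print(t_pdf(0.0, 1), 1 / pi)   # the two values agree for k = 1
print(t_variance(11))          # variance for 11 degrees of freedom
```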
Let tα,k be the value of the random variable T with k degrees of freedom
above which we find an area (or probability) α. Thus, tα,k is an
upper-tailed 100α percentage point of the t distribution with k degrees of
freedom.
The following table provides percentage points of the t distribution.
Confidence Interval on the Mean, Variance Unknown
If X̄ and S are the mean and standard deviation of a random sample from
a normal distribution with unknown variance σ 2 , a 100(1 − α)%
confidence interval on µ is given by
$$\bar{X} - t_{\alpha/2,n-1}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,n-1}\frac{S}{\sqrt{n}},$$
where $t_{\alpha/2,n-1}$ is the upper 100α/2 percentage point of the t distribution
with n − 1 degrees of freedom.
One-Sided Confidence Bounds on the Mean, Variance Unknown

A 100(1 − α)% upper-confidence bound for µ is given by
$$\mu \le U = \bar{X} + t_{\alpha,n-1}\frac{S}{\sqrt{n}}.$$
A 100(1 − α)% lower-confidence bound for µ is given by
$$\mu \ge L = \bar{X} - t_{\alpha,n-1}\frac{S}{\sqrt{n}}.$$
Example
The compressive strength of concrete, which is normally distributed, is
being tested by a civil engineer who tests 12 specimens and obtains the
following data:
2216, 2237, 2249, 2204, 2225, 2301, 2281, 2263, 2318, 2255, 2275, 2295.
(a) Construct a 95% two-sided confidence interval on the mean strength.
(b) Construct a 95% lower confidence bound on the mean strength.
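a) $$\bar{X} = \frac{1}{12}\sum_{i=1}^{12} X_i = 2259.9167, \qquad S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}} = 35.5693.$$
α = 0.05, n = 12, $t_{\alpha/2,n-1} = t_{0.025,11} = 2.201$.
A 95% two-sided CI on the mean strength is:
$$\bar{X} - t_{\alpha/2,n-1}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,n-1}\frac{S}{\sqrt{n}}$$
$$\Leftrightarrow 2259.9167 - 2.201\frac{35.5693}{\sqrt{12}} \le \mu \le 2259.9167 + 2.201\frac{35.5693}{\sqrt{12}}$$
$$\Leftrightarrow 2237.3169 \le \mu \le 2282.5165.$$
b) α = 0.05 ⇒ $t_{\alpha,n-1} = t_{0.05,11} = 1.796$.
A 95% lower confidence bound on the mean strength is:
$$\mu \ge \bar{X} - t_{\alpha,n-1}\frac{S}{\sqrt{n}}$$
$$\Leftrightarrow \mu \ge 2259.9167 - 1.796\frac{35.5693}{\sqrt{12}}$$
$$\Leftrightarrow \mu \ge 2241.4754.$$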
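The concrete-strength results can be checked with the standard library. Python's `statistics` module has no t quantile function, so this sketch takes the critical values 2.201 and 1.796 from the example as inputs rather than computing them; `t_interval` is an illustrative name.

```python
from math import sqrt
from statistics import mean, stdev

def t_interval(data, t_crit):
    """Two-sided CI on the mean, sigma unknown; t_crit is read from a t table."""
    half_width = t_crit * stdev(data) / sqrt(len(data))
    x_bar = mean(data)
    return x_bar - half_width, x_bar + half_width

strength = [2216, 2237, 2249, 2204, 2225, 2301,
            2281, 2263, 2318, 2255, 2275, 2295]

ci_lower, ci_upper = t_interval(strength, t_crit=2.201)  # t_{0.025, 11}
# One-sided lower bound uses t_{0.05, 11} = 1.796.
lower_bound = mean(strength) - 1.796 * stdev(strength) / sqrt(len(strength))
print(round(ci_lower, 2), round(ci_upper, 2), round(lower_bound, 2))
```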
Large-Sample Confidence Interval for a Population
Proportion
It is often necessary to construct confidence intervals on a population
proportion. Suppose that a random sample of size n has been taken
from a large (possibly infinite) population and that X (≤ n) observations in
this sample belong to a class of interest. P̂ = X /n is a point estimator of
the proportion p of the population that belongs to this class. Note that n
and p are the parameters of a binomial distribution.
Normal Approximation for a Binomial Proportion
If n is large, the distribution of
$$Z = \frac{X - np}{\sqrt{np(1-p)}} = \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}}$$
is approximately standard normal. Equivalently, p̂ is approximated by a normal
distribution with mean µ = p and variance p(1 − p)/n. Condition:
np ≥ 5 and n(1 − p) ≥ 5.
Approximate Confidence Interval on a Binomial Proportion
If p̂ is the proportion of observations in a random sample of size n that
belongs to a class of interest, an approximate 100(1 − α)% confidence
interval on the proportion p of the population that belongs to this class is
$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},$$
where $z_{\alpha/2}$ is the upper 100α/2 percentage point of the standard normal
distribution.
Approximate One-Sided Confidence Bounds on a Binomial Proportion

The approximate 100(1 − α)% lower and upper confidence bounds are
$$\hat{p} - z_{\alpha}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le p \qquad \text{and} \qquad p \le \hat{p} + z_{\alpha}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}.$$
Example: Crankshaft Bearings
In a random sample of 85 automobile engine crankshaft bearings, 10 have
a surface finish that is rougher than the specifications allow. Find a 95%
two-sided confidence interval for the proportion of bearings in the
population that exceeds the roughness specification.
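Let p be the proportion of bearings in the population that exceeds the
roughness specification.
$$\hat{p} = \frac{x}{n} = \frac{10}{85} = 0.1176, \qquad \alpha = 0.05 \Rightarrow z_{\alpha/2} = 1.96.$$
A 95% two-sided confidence interval for p is:
$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
$$0.1176 - 1.96\sqrt{\frac{0.1176(1 - 0.1176)}{85}} \le p \le 0.1176 + 1.96\sqrt{\frac{0.1176(1 - 0.1176)}{85}}$$
$$0.0491 \le p \le 0.1861.$$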
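The crankshaft-bearing interval can be reproduced as follows. This is a minimal sketch of the approximate (normal-based) interval; the helper name `proportion_interval` is an assumption for illustration, and the code also checks the np̂ ≥ 5 and n(1 − p̂) ≥ 5 condition using the estimate p̂ in place of the unknown p.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(x, n, confidence=0.95):
    """Approximate two-sided CI on a binomial proportion (normal approximation)."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

x, n = 10, 85
# Rule-of-thumb check for the normal approximation, evaluated at p_hat.
approximation_ok = x >= 5 and (n - x) >= 5

lower, upper = proportion_interval(x, n)
print(approximation_ok, round(lower, 4), round(upper, 4))
```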
Choice of Sample Size
Sample Size for a Specified Error on a Binomial Proportion
Because P̂ is the point estimator of p, we are approximately 100(1 − α)% confident
that the error |p̂ − p| is less than $z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$. Setting the error in estimating p
to $E = z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$ and solving for n, the appropriate sample size is
$$n = \left(\frac{z_{\alpha/2}}{E}\right)^2 \hat{p}(1 - \hat{p}).$$
Example: Crankshaft Bearings

In a random sample of 85 automobile engine crankshaft bearings, 10 have
a surface finish that is rougher than the specifications allow. How large a
sample is required if we want to be 95% confident that the error in using p̂
to estimate p is less than 0.025?
p̂ = 0.1176, $z_{\alpha/2} = 1.96$, E = 0.025, therefore,
$$n = \left(\frac{1.96}{0.025}\right)^2 \hat{p}(1 - \hat{p}) = 638.05$$
(computed with the unrounded p̂ = 10/85). Rounding up, we have n = 639.
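The sample-size calculation is a one-line formula followed by rounding up. The sketch below uses the unrounded estimate p̂ = 10/85 and the table value z = 1.96; `proportion_sample_size` is a hypothetical helper name.

```python
from math import ceil

def proportion_sample_size(p_hat, error, z):
    """Smallest n with z * sqrt(p_hat * (1 - p_hat) / n) <= error."""
    return ceil((z / error) ** 2 * p_hat * (1 - p_hat))

n_required = proportion_sample_size(p_hat=10 / 85, error=0.025, z=1.96)
print(n_required)  # 639
```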
Thank you for your attention!
