
Statistical Inference I

Shirsendu Mukherjee
Department of Statistics, Asutosh College, Kolkata, India.
shirsendu_st@yahoo.co.in

In any given problem of estimation we may have a large, often infinite, class of
competing estimators for g(θ), a real-valued function of the parameter θ. The question
that naturally arises is: are some of the many possible estimators better, in some
sense, than others? In this section we define certain criteria, which an estimator
may or may not possess, that help us compare the performances of rival
estimators and decide which one is perhaps the 'best'.

Closeness
If our object is to estimate a parametric function g(θ), then we would like the estimator
T(X) to be close to g(θ). Since T(X) is a statistic, the distance |T(X) − g(θ)| is itself a
random variable, and as a measure of the closeness of T we use the probability
Pθ(|T − g| < ε) for ε > 0.
Consider two estimators T1 and T2 of a parametric function g = g(θ). The estimator T1
is said to be more concentrated about g(θ) than T2 if, for every ε > 0,

Pθ(|T1 − g| < ε) ≥ Pθ(|T2 − g| < ε), for all θ ∈ Θ.     (1)

Result: A necessary condition for (1) to hold is that

Eθ(T1 − g)² ≤ Eθ(T2 − g)², for all θ ∈ Θ,

provided Eθ(Ti − g)² exists for i = 1, 2.


Proof (for the continuous case): We know that for every non-negative random variable
X such that E(X) exists,

E(X) = ∫_0^∞ P(X > x) dx.

Since (T1 − g)² is a non-negative random variable, we get

Eθ(T1 − g)² = ∫_0^∞ Pθ((T1 − g)² > x) dx = ∫_0^∞ 2ε Pθ(|T1 − g| > ε) dε,

on substituting x = ε². Writing the same identity for T2 and subtracting, it follows that

Eθ(T1 − g)² − Eθ(T2 − g)² = ∫_0^∞ 2ε [Pθ(|T2 − g| < ε) − Pθ(|T1 − g| < ε)] dε.

Hence the inequality Pθ(|T1 − g| < ε) ≥ Pθ(|T2 − g| < ε) for all ε > 0 and all θ ∈ Θ makes
the integrand non-positive, and therefore Eθ(T1 − g)² ≤ Eθ(T2 − g)² for all θ ∈ Θ.
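
To illustrate the result numerically (an addition to these notes, not part of the original argument), the following Python sketch compares two estimators of the mean θ of a N(θ, 1) sample: T1 = X̄ based on all n = 20 observations and T2 based on only the first five. T1 is the more concentrated of the two and, as the result predicts, its MSE is the smaller; the values θ = 2, n = 20 and the ε grid are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, n, reps = 2.0, 20, 100_000

# reps independent samples of size n from N(theta, 1)
X = rng.normal(theta, 1.0, size=(reps, n))

T1 = X.mean(axis=1)         # full-sample mean
T2 = X[:, :5].mean(axis=1)  # mean of the first 5 observations only

for eps in (0.1, 0.3, 0.5):
    p1 = np.mean(np.abs(T1 - theta) < eps)
    p2 = np.mean(np.abs(T2 - theta) < eps)
    print(f"eps={eps}: P(|T1-g|<eps)={p1:.3f} >= P(|T2-g|<eps)={p2:.3f}")

# The result then guarantees E(T1-g)^2 <= E(T2-g)^2; check empirically.
print("MSE(T1) ~", np.mean((T1 - theta) ** 2))  # close to 1/20 = 0.05
print("MSE(T2) ~", np.mean((T2 - theta) ** 2))  # close to 1/5  = 0.20
```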

Mean-squared Error (MSE)
If T is an estimator of g, then the MSE of T is defined by

MSEθ(T) = Eθ(T − g)², for all θ ∈ Θ.

The quantity (T − g) is called the error of T in estimating g, and accordingly Eθ(T − g)² is called
the mean-squared error of T. It measures the average squared difference between the
estimator T and the target g. By the above result, a more concentrated estimator has a smaller
MSE; this suggests that the smaller the MSE, the better the estimator. Naturally, we would
prefer an estimator with a smaller, ideally the smallest, MSE. If such an estimator exists it is
called best for g.
An estimator T is said to be best for g if MSEθ(T) ≤ MSEθ(T′) for all θ ∈ Θ and for
every other estimator T′ of g. The problem is that no best estimator exists in this
sense, as the following argument shows.
For a particular value of θ, say θ0, define the estimator T′ by

T′(x) = g(θ0) for all x in the sample space.

Then

MSEθ0(T′) = Eθ0[g(θ0) − g(θ0)]² = 0.

If T were a best estimator of g, we would need MSEθ0(T) ≤ MSEθ0(T′) = 0, and hence

MSEθ0(T) = 0,

so that T = g(θ0) with probability 1 under θ0. Since θ0 is arbitrary, for every θ we would need

T = g(θ) with probability 1 under θ.

But T, being a statistic, cannot be a function of the unknown θ. Hence such a best
estimator does not exist.
Consider the following example.
Example Let X1, X2, . . . , Xn be i.i.d. N(θ, 1) random variables, θ ∈ R. To estimate
θ let us consider the two estimators T = X̄ = (1/n) Σ_{i=1}^{n} Xi and T′ = θ0.
Then we get

MSEθ(T) = 1/n and MSEθ(T′) = (θ0 − θ)².

Now for values of θ ∈ [θ0 − 1/√n, θ0 + 1/√n] we have MSEθ(T′) ≤ MSEθ(T), and for
other values of θ we have MSEθ(T′) > MSEθ(T).
Here T′ is not a good estimator of θ, since it always estimates θ to be θ0 and does
not depend on the observations at all. The estimator T, on the other hand, utilizes the
observations and is therefore better than T′.
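
This trade-off is easy to see by simulation. The sketch below (an added illustration; θ0 = 0, n = 25 and the grid of true values of θ are arbitrary choices) estimates MSEθ(T) and MSEθ(T′): the constant estimator T′ = θ0 wins only when θ lies within roughly 1/√n of θ0.

```python
import numpy as np

rng = np.random.default_rng(1)
theta0, n, reps = 0.0, 25, 50_000

for theta in (-1.0, -0.2, 0.0, 0.2, 1.0):              # true values of theta
    X = rng.normal(theta, 1.0, size=(reps, n))
    mse_T = np.mean((X.mean(axis=1) - theta) ** 2)      # MSE of T = X-bar, about 1/n
    mse_Tprime = (theta0 - theta) ** 2                   # MSE of T' = theta0, exact
    winner = "T'" if mse_Tprime <= mse_T else "T"
    print(f"theta={theta:+.1f}: MSE(T)={mse_T:.4f}, MSE(T')={mse_Tprime:.4f} -> {winner} has smaller MSE")
```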
From the above discussion it is clear that if MSE is the only criterion in the search for a
good estimator, then there may be some 'freak' estimators (like T′) that are extremely
prejudiced in favour of particular values of θ and would perform better than a
generally good estimator at those points. For instance, in the above example
the estimator T′ is highly partial to θ0 since it always estimates θ to be θ0. One
way to rule out such freak estimators is to consider only estimators that satisfy some
additional property. One such property is unbiasedness.

Unbiasedness
Definition An estimator T is said to be an unbiased estimator (UE) of g(θ) if

Eθ(T) = g(θ) for all θ ∈ Θ.

The bias of T is defined by

Bθ(T) = Eθ(T) − g(θ), θ ∈ Θ,

so T is unbiased precisely when Bθ(T) = 0 for all θ ∈ Θ.

The following result shows a relationship between MSE and variance of an estimator
in terms of the bias.

Result: MSEθ(T) − Varθ(T) = [Bθ(T)]², θ ∈ Θ.


Proof

MSEθ(T) = Eθ[T − g(θ)]²

= Eθ[(T − Eθ(T)) + (Eθ(T) − g(θ))]²

= Eθ[T − Eθ(T)]² + [Eθ(T) − g(θ)]² + 2[Eθ(T) − g(θ)] Eθ[T − Eθ(T)]

= Eθ[T − Eθ(T)]² + [Eθ(T) − g(θ)]², since Eθ[T − Eθ(T)] = 0,

= Varθ(T) + [Bθ(T)]².

Thus, MSE incorporates two components, one measuring the variability of the esti-
mator (precision) and the other measuring its bias (accuracy). An estimator with
good MSE properties has small combined variance and bias; to find one, we need
estimators that control both variance and bias. Clearly, unbiased estimators do a good
job of controlling bias, and for an unbiased estimator T we have MSEθ(T) = Varθ(T).
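
As a quick numerical check of the decomposition (an added illustration, using the deliberately biased estimator T = 0.9 X̄ of θ for a N(θ, 1) sample; θ = 2 and n = 10 are arbitrary choices), the simulated MSE agrees with Var + Bias² up to Monte Carlo error.

```python
import numpy as np

rng = np.random.default_rng(2)
theta, n, reps = 2.0, 10, 200_000

X = rng.normal(theta, 1.0, size=(reps, n))
T = 0.9 * X.mean(axis=1)           # a deliberately biased estimator of theta

mse = np.mean((T - theta) ** 2)    # Monte Carlo estimate of MSE_theta(T)
var = T.var()                      # Monte Carlo estimate of Var_theta(T)
bias = T.mean() - theta            # Monte Carlo estimate of B_theta(T)

print(f"MSE          ~ {mse:.4f}")             # theory: 0.81/n + (0.1*theta)^2 = 0.121
print(f"Var + Bias^2 ~ {var + bias**2:.4f}")   # should match MSE up to simulation error
```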
Although many unbiased estimators are reasonable from the standpoint of MSE,
controlling bias does not guarantee that MSE is controlled. In some cases a
trade-off occurs between the variance and the bias in such a way that a small increase
in bias can be exchanged for a larger decrease in variance, resulting in a smaller MSE. This
is clear from the following example.
Example 1 Let Xi ∼ N(µ, σ²), i = 1, 2, . . . , n, independently, where µ and σ² are un-
known.
Consider all estimators of σ² of the form T = cS², where c > 0 is a constant and

S² = (1/(n − 1)) Σ_{i=1}^{n} (Xi − X̄)².

Now

MSEσ(cS²) = Eσ(cS² − σ²)² = c² Eσ(S⁴) − 2cσ² Eσ(S²) + σ⁴.

Since (n − 1)S²/σ² ∼ χ²_{n−1},

Eσ[(n − 1)S²/σ²] = n − 1,   Varσ[(n − 1)S²/σ²] = 2(n − 1),

which gives Eσ(S²) = σ² and Varσ(S²) = 2σ⁴/(n − 1). After some routine algebra, using
Eσ(S⁴) = Varσ(S²) + [Eσ(S²)]² = ((n + 1)/(n − 1)) σ⁴, we get

MSEσ(cS²) = σ⁴ [ ((n + 1)/(n − 1)) c² − 2c + 1 ],

which attains its minimum at c = (n − 1)/(n + 1). The minimum value is

2σ⁴/(n + 1) < 2σ⁴/(n − 1) = MSEσ(S²),

and hence T = (1/(n + 1)) Σ_{i=1}^{n} (Xi − X̄)² has smaller MSE than the unbiased estimator S² of
σ². But T is not unbiased for σ², since Eσ(T) = ((n − 1)/(n + 1)) σ². Thus, by the MSE criterion,
the biased estimator T is better than the unbiased estimator S² of σ².
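
A simulation makes the comparison concrete. The sketch below (an added illustration; µ = 0, σ² = 1 and n = 10 are arbitrary choices) estimates the MSEs of S² and of T = (1/(n + 1)) Σ (Xi − X̄)²; the values should come out close to 2σ⁴/(n − 1) ≈ 0.222 and 2σ⁴/(n + 1) ≈ 0.182 respectively.

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma2, n, reps = 0.0, 1.0, 10, 200_000

X = rng.normal(mu, np.sqrt(sigma2), size=(reps, n))
ss = ((X - X.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)  # sum of squared deviations from X-bar

S2 = ss / (n - 1)   # unbiased estimator of sigma^2
T = ss / (n + 1)    # biased estimator with smaller MSE

print("MSE(S2) ~", np.mean((S2 - sigma2) ** 2))  # close to 2/(n-1) = 0.2222
print("MSE(T)  ~", np.mean((T - sigma2) ** 2))   # close to 2/(n+1) = 0.1818
print("E(T)    ~", T.mean())                     # close to (n-1)/(n+1) = 0.8182
```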

Example 2 Let Xi ∼ N(θ, 1), i = 1, 2, . . . , n, independently, where |θ| ≤ 1. Here X̄ is
unbiased for θ. Let us consider the following estimator T of θ:

T = −1   if X̄ < −1
  = X̄    if |X̄| ≤ 1
  = 1    if X̄ > 1.

Then

Eθ(T) = Pθ(X̄ > 1) − Pθ(X̄ < −1) + ∫_{-1}^{1} x̄ f(x̄) dx̄ ≠ θ in general,

where f(x̄) denotes the p.d.f. of X̄. Hence T is a biased estimator of θ, and the
MSE of T is given by

MSEθ(T) = (1 − θ)² Pθ(X̄ > 1) + (−1 − θ)² Pθ(X̄ < −1) + ∫_{-1}^{1} (x̄ − θ)² f(x̄) dx̄.

For the estimator X̄ we have

Eθ(X̄) = θ.

Hence X̄ is an unbiased estimator of θ, and

MSEθ(X̄) = ∫_{-∞}^{∞} (x̄ − θ)² f(x̄) dx̄.

Now

MSEθ(X̄) − MSEθ(T) = ∫_{-∞}^{∞} (x̄ − θ)² f(x̄) dx̄ − MSEθ(T)

= ∫_{-∞}^{-1} [(x̄ − θ)² − (−1 − θ)²] f(x̄) dx̄ + ∫_{1}^{∞} [(x̄ − θ)² − (1 − θ)²] f(x̄) dx̄

= I1 + I2, say.

In the first integral I1,

x̄ < −1 ⇒ x̄ − θ < −1 − θ ≤ 0 ⇒ (x̄ − θ)² > (−1 − θ)².

In the second integral I2,

x̄ > 1 ⇒ x̄ − θ > 1 − θ ≥ 0 ⇒ (x̄ − θ)² > (1 − θ)².

Hence I1 + I2 > 0. Thus T is a biased estimator of θ, but the MSE of T is smaller
than that of X̄ for every θ with |θ| ≤ 1.
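
The following Python sketch (an added illustration; n = 4 and the grid of true values of θ in [−1, 1] are arbitrary choices) estimates the MSEs of X̄ and of the truncated estimator T by simulation; T should have the smaller MSE at every θ.

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 4, 200_000

for theta in (-1.0, -0.5, 0.0, 0.5, 1.0):        # true values, restricted to |theta| <= 1
    xbar = rng.normal(theta, 1.0, size=(reps, n)).mean(axis=1)
    T = np.clip(xbar, -1.0, 1.0)                 # truncate X-bar to the known range [-1, 1]
    mse_xbar = np.mean((xbar - theta) ** 2)      # about 1/n = 0.25
    mse_T = np.mean((T - theta) ** 2)
    print(f"theta={theta:+.1f}: MSE(Xbar)={mse_xbar:.4f}  MSE(T)={mse_T:.4f}")
```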
Note In both examples a natural question arises: which estimator should then be
preferred? The answer depends on the purpose for which the estimate is
obtained.
