
IIT JAM

Mathematical Statistics (MS)

SAMPLE

Eduncle.com
Mpa 44, 2nd Floor, Above Bank Of India, Rangbari Main Road,
Mahaveer Nagar 2nd, Near Amber Dairy, Kota, Rajasthan, 324005
Toll Free: 1800-120-1021
Website: www.eduncle.com | Email: Info@Eduncle.com

ESTIMATION
1. ESTIMATION

The theory of estimation was founded by Prof. R. A. Fisher in a series of fundamental papers around 1930.
Parameter Space
Let us consider a random variable X with p.d.f. f(x, ). In most common applications, though not
always, the functional form of the population distribution is assumed to be known except for the value
of some unknown parameter(s)  which may take any value on a set , which is the set of all possible
values of  is called the parameter space. Such a situation gives rise not to one probability distribution
but a family of probability distributions which we write as {f(x, ), }. For example if X ~ N(, 2),
then the parameter space.
 = {(, 2) : –  <  <  ; 0 <  < }
In particular, for 2 = 1, the family of probability distributions is given by
{N(, 1);   }, where  = { : –  <  < }
In the following discussion we shall consider a general family of distributions
{f(x; 1, 2, ..., k) : i  , i = 1, 2, ..., k}.
Let us consider a random sample x1, x2, ..., xn of size n from a population with probability function f(x ; θ1, θ2, ..., θk), where θ1, θ2, ..., θk are the unknown population parameters. There will then always be an infinite number of functions of the sample values, called statistics, which may be proposed as estimates of one or more of the parameters.
Evidently, the best estimate would be one that falls nearest to the true value of the parameter to be estimated. In other words, the statistic whose distribution concentrates as closely as possible near the true value of the parameter may be regarded as the best estimate. Hence the basic problem of estimation in the above case may be formulated as follows :
We wish to determine functions of the sample observations
T1 = θ̂1(x1, x2, ..., xn), T2 = θ̂2(x1, x2, ..., xn), ..., Tk = θ̂k(x1, x2, ..., xn),
such that their distributions are concentrated as closely as possible near the true values of the parameters.
The estimating functions are then referred to as estimators.
Point Estimation
Definition
A point estimate of a parameter θ is a single number that can be regarded as a sensible value for θ.
A point estimate is obtained by selecting a suitable statistic and computing its value from the given sample data. The selected statistic is called the point estimator of θ.
Examples
Suppose, for example, that the parameter of interest is μ, the true average lifetime of batteries of a certain type.
A random sample of n = 3 batteries might yield observed lifetimes (hours)
x1 = 5.0, x2 = 6.4, x3 = 5.9.


The computed value of the sample mean lifetime is x̄ = 5.77.
It is reasonable to regard 5.77 as a very plausible value of μ, our "best guess" for the value of μ based on the available sample information.
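As a quick illustration (an added sketch, reusing only the data from the example above), the point estimate can be computed directly in Python:

```python
# Observed battery lifetimes (hours) from the example above.
lifetimes = [5.0, 6.4, 5.9]

# The chosen statistic (the estimator) is the sample mean; its computed
# value from the data is the point estimate.
point_estimate = sum(lifetimes) / len(lifetimes)
print(round(point_estimate, 2))  # 5.77
```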
Notations
Suppose we want to estimate a parameter of a single population (e.g., μ, σ, or p) based on a random sample of size n.
 When discussing general concepts and methods of inference, it is convenient to have a generic symbol for the parameter of interest.
 We will use the Greek letter θ for this purpose.
 The objective of point estimation is to select a single number, based on sample data, that represents a sensible value for θ.
Some General Concepts of Point Estimation
We know that before the data are available, the sample observations must be considered random variables X1, X2, ..., Xn.
 It follows that any function of the Xi's (that is, any statistic), such as the sample mean X̄ or the sample standard deviation S, is also a random variable.
Example : In the battery example just given, the estimator used to obtain the point estimate of μ was X̄, and the point estimate of μ was 5.77.
 If the three observed lifetimes had instead been x1 = 5.6, x2 = 4.5, and x3 = 6.1, use of the estimator X̄ would have resulted in the estimate x̄ = (5.6 + 4.5 + 6.1)/3 = 5.40.
 The symbol θ̂ ("theta hat") is customarily used to denote both the estimator of θ and the point estimate resulting from a given sample.
 Thus θ̂ = X̄ is read as "the point estimator of μ is the sample mean X̄." The statement "the point estimate of μ is 5.77" can be written concisely as θ̂ = 5.77.

In the best of all possible worlds, we could find an estimator θ̂ for which θ̂ = θ always. However, θ̂ is a function of the sample Xi's, so it is a random variable.
 For some samples, θ̂ will yield a value larger than θ, whereas for other samples θ̂ will underestimate θ. If we write
θ̂ = θ + error of estimation
then an accurate estimator would be one resulting in small estimation errors, so that estimated values are near the true value.
 A sensible way to quantify the idea of θ̂ being close to θ is to consider the squared error (θ̂ − θ)².
For some samples, θ̂ will be quite close to θ and the resulting squared error will be near 0.
 Other samples may give values of θ̂ far from θ, corresponding to very large squared errors.
A measure of accuracy is the expected or mean squared error (MSE)
MSE = E[(θ̂ − θ)²]


 If a first estimator has smaller MSE than does a second, it is natural to say that the first

estimator is the better one.
 However, MSE will generally depend on the value of θ. What often happens is that one estimator will have a smaller MSE for some values of θ and a larger MSE for other values.
 Finding an estimator with the smallest MSE is typically not possible. One way out of this dilemma is to restrict attention to estimators that have some specified desirable property, and then find the best estimator in this restricted group.
 A popular property of this sort in the statistical community is unbiasedness. (A small Monte Carlo comparison of MSEs is sketched below.)
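The following Monte Carlo sketch (added for illustration; the sample median as the competing estimator and all constants are assumptions, not from the text) estimates MSE = E[(θ̂ − θ)²] for two estimators of a normal mean:

```python
# Monte Carlo estimate of MSE for two estimators of the mean of N(mu, 1):
# the sample mean and the sample median (illustrative competitor).
import random
import statistics

random.seed(0)
mu, n, reps = 2.0, 25, 20000

mse_mean = mse_median = 0.0
for _ in range(reps):
    sample = [random.gauss(mu, 1.0) for _ in range(n)]
    mse_mean += (statistics.mean(sample) - mu) ** 2
    mse_median += (statistics.median(sample) - mu) ** 2

print(mse_mean / reps)    # close to 1/n = 0.04
print(mse_median / reps)  # close to pi/(2n) ~ 0.063, i.e. larger
```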
Characteristics of Estimators
The following are some of the criteria that should be satisfied by a good estimator.
(i) Consistency
(ii) Unbiasedness
(iii) Efficiency and
(iv) Sufficiency
We shall now, briefly, explain these terms one by one.

2. CONSISTENCY

An estimator Tn = T(x1, x2, ..., xn), based on a random sample of size n, is said to be a consistent estimator of γ(θ), θ ∈ Θ, the parameter space, if Tn converges to γ(θ) in probability,
i.e., if Tn →p γ(θ) as n → ∞ ...(1)
In other words, Tn is a consistent estimator of γ(θ) if for every ε > 0, η > 0, there exists a positive integer n ≥ m(ε, η) such that
P[|Tn − γ(θ)| < ε] → 1 as n → ∞ ...(2)
i.e., P[|Tn − γ(θ)| < ε] > 1 − η ; ∀ n ≥ m ...(2a)
where m is some very large value of n.
Remark. If X1, X2, ..., Xn is a random sample from a population with finite mean E(Xi) = μ < ∞, then by Khinchine's weak law of large numbers (W.L.L.N.), we have
X̄n = (1/n) Σ Xi →p E(Xi) = μ, as n → ∞.
Hence the sample mean X̄n is always a consistent estimator of the population mean μ.
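A minimal simulation sketch of this remark (the exponential population with mean μ = 2 and all constants are assumed choices): the proportion of samples with |X̄n − μ| < ε climbs towards 1 as n grows, as the W.L.L.N. guarantees.

```python
# Empirical check that the sample mean is consistent for the population mean.
import random

random.seed(1)
mu, eps, reps = 2.0, 0.1, 2000

for n in (10, 100, 1000, 5000):
    hits = sum(
        abs(sum(random.expovariate(1 / mu) for _ in range(n)) / n - mu) < eps
        for _ in range(reps)
    )
    print(n, hits / reps)  # fraction within eps of mu; rises towards 1
```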

3. UNBIASEDNESS

Obviously, consistency is a property concerning the behaviour of an estimator for indefinitely large values of the sample size n, i.e., as n → ∞. Nothing is said about its behaviour for finite n.
Moreover, if there exists a consistent estimator, say Tn, of γ(θ), then infinitely many such estimators can be constructed, e.g.,
Tn′ = [(n − a)/(n − b)] Tn = [(1 − a/n)/(1 − b/n)] Tn →p γ(θ), as n → ∞,
and hence, for different values of a and b, Tn′ is also consistent for γ(θ).
Unbiasedness is a property associated with finite n. A statistic Tn = T(x1, x2, ..., xn) is said to be an unbiased estimator of γ(θ) if

E(Tn) = (), for all    ...(3)
2
We have see that in sampling from a population with mean m and variance  ,

E(x)   and E(s2 )  2 but E(S2 )  2 ,


Hence there is a reason to prefer

1 n 1 n 2
S2 =  (x i  x)2 , to the sample variance s2 =  (x i  x) .
n  1i  1 n i1

Remark. If E(Tn) > , Tn is said to be positively biased and if E(Tn) < , it is said to be negatively
biased, the amount of bias b() being given by
b() = E(Tn) – (),    ...(3a)
Example.
The sample proportion X/n can be used as an estimator of p, where X, the number of sample
successes, has a binomial distribution with parameters n and p. Thus

X 1 1
ˆ  E    E(X)  (np)  p
E(p)
n n n

Invariance Property of Consistent Estimators


Theorem : If Tn is a consistent estimator of () and (()) is a continuous function of (), then
(Tn) is a consistent estimator of (()).
p
Proof. Since Tn is a consistent estimator of (), Tn   () as n  i.e., for every  > 0,
 > 0,  a positive integer n  m (, ) such that
P[|Tn – ()| < ] > 1 – ,  n  m ...()
Since (·) is a continuous function, for every  > 0, however small,  a positive number 1 such
that | (Tn) – y(()) | < 1, whenever | Tn – () | < 
i.e., | Tn – () | <   | (Tn) – (()) | < 1 ...()
For two events A and B,
if A  B, then A  B  P(A)  P(B)  P(B)  P(A) ...()
From () and (), we get
P[|(Tn) – (()| < 1]  P[|Tn – ()| < ]
 P[|(Tn) – (()| < 1]  1 – ;  n  m [Using ()]
p
 (Tn)   (()), as n  
     (Tn) is a consistent estimator of ().
Sufficient Conditions for Consistency
Theorem : Let {Tn} be a sequence of estimators such that, for all θ ∈ Θ,
(i) Eθ(Tn) → γ(θ), as n → ∞
and (ii) Varθ(Tn) → 0, as n → ∞.
Then Tn is a consistent estimator of γ(θ).
Proof. We have to prove that Tn is a consistent estimator of γ(θ),
i.e., Tn →p γ(θ), as n → ∞
i.e., P[|Tn − γ(θ)| < ε] > 1 − η ; ∀ n ≥ m(ε, η) ...(4)
where ε and η are arbitrarily small positive numbers and m is some large value of n.
Applying Chebychev's inequality to the statistic Tn, we get


Var (Tn )
P[|Tn – E(Tn)|  ]  1 – ...(5)
2
We have
 | Tn – E(Tn) | + | E(Tn) – () | ...(6)
Now
|Tn – E(Tn)|     |Tn – ()|   + |E(Tn) – ()| ...(7)
Hence, on using () of Theorem we get
P[|Tn – ()|   + | E(Tn) – ()|]  P[|Tn – E(Tn)|  ]

Var (Tn )
 1 – [From (5)] ...(8)
2
We are given :
Eq(Tn)  ()     as n  .
Hence, for every 1 > 0,  a positive integer n  n0 (1) such that
|E(Tn) – ()|  1,  n  n0 (1) ...(9)
Also Var(Tn)  0 as n  , (Given).

Var (Tn )
  h,  n  n0’ () ...(10)
2
where  is arbitrarily small positive number.
Substituting from (9) and (10) in (8), we get
P[|Tn – () |   + 1]  1 – h ; n  m (1, )
 P[|Tn – () |  ]  1 –  ; n  m
where m = max (n0, n0) and  =  + 1 > 0.
p
 Tn   (), as n   [Using (4)]
 Tn is a consistent estimator of ().
Example : x1, x2, ..., xn is a random sample from a normal population N(, 1). Show that t =

1 n 2
 xi , is an unbiased estimator of 2 + 1.
ni 1

Solution. (a) We are given


E(xi) = , V(xi) = 1  i = 1, 2, ..., n
Now E(xi2) = V(xi) + {E(xi)}2 = 1 + 2

1 n  1 n 1 n
E(t)  E   xi2    E(x i2 )   (1   2 )  1   2
n i  1  n i  1 ni 1

Hence t is an unbiased estimator of 1 + 2.
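A quick numeric check of this example (an added sketch; θ = 1.5 is an arbitrary assumed value):

```python
# With x_i ~ N(theta, 1), t = (1/n) * sum(x_i^2) should average to theta^2 + 1.
import random

random.seed(3)
theta, n, reps = 1.5, 10, 50000

t_sum = 0.0
for _ in range(reps):
    t_sum += sum(random.gauss(theta, 1.0) ** 2 for _ in range(n)) / n

print(t_sum / reps)  # close to theta^2 + 1 = 3.25
```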


Theorem : If T is an unbiased estimator for , show that T2 is a biased estimator for 2.
Solution. Since T is an unbiased estimator for , we have
E(T) = 
Also Var(T) = E(T2) – [E(T)]2 = E(T2) – 2
 E(T2) = 2 + Vaqr(T), (Var T > 0).
Since E(T2)  2, T2 is a biased estimator for 2.


4. EFFICIENT ESTIMATORS

Efficiency. Even if we confine ourselves to unbiased estimates, there will, in general, exist more than one consistent estimator of a parameter. For example, in sampling from a normal population N(μ, σ²), when σ² is known, the sample mean x̄ is an unbiased and consistent estimator of μ (c.f. Example).
From symmetry it follows immediately that the sample median (Md) is an unbiased estimate of μ, which is the same as the population median. Also, for large n,
V(Md) ≈ 1/(4n f1²)
Here f1 = median ordinate of the parent distribution
= modal ordinate of the parent distribution
= [1/(σ√(2π))] exp{−(x − μ)²/(2σ²)} at x = μ, i.e., f1 = 1/(σ√(2π))
⇒ V(Md) = (1/4n) · 2πσ² = πσ²/(2n)
Since E(Md) = μ and V(Md) → 0, as n → ∞,
the median is also an unbiased and consistent estimator of μ.


Thus, there is a need for some further criterion which enables us to choose between estimators sharing the common property of consistency. Such a criterion, which is based on the variances of the sampling distributions of the estimators, is usually known as efficiency.
If, of two consistent estimators T1, T2 of a certain parameter θ, we have
V(T1) < V(T2), for all n, ...(11)
then T1 is more efficient than T2 for all sample sizes.
We have seen above :
For all n, V(x̄) = σ²/n
and for large n, V(Md) = πσ²/(2n) ≈ 1.57 σ²/n
Since V(x̄) < V(Md), we conclude that for the normal distribution the sample mean is a more efficient estimator of μ than the sample median, for large samples at least.
Most Efficient Estimator
If, in a class of consistent estimators for a parameter, there exists one whose sampling variance is less than that of any other such estimator, it is called the most efficient estimator. Whenever such an estimator exists, it provides a criterion for measuring the efficiency of the other estimators.
Efficiency (Def.) If T1 is the most efficient estimator, with variance V1, and T2 is any other estimator, with variance V2, then the efficiency E of T2 is defined as :
E = V1/V2 ...(12)

Obviously, E cannot exceed unity.

If T, T1, T2, ..., Tn are all estimators of () and Var(T) is minimum, then the efficiency Ei of Ti,
(i = 1, 2, ..., n) is defined as :
Var T
Ei  ; i  1, 2, ... , n ...(12a)
Var Ti
Obviously Ei  1, i = 1, 2, ... n.
For example, in the normal samples, since sample mean x is the most efficient estimator of ,
the efficiency E of Md for such sample, (for large n), is :
V(x) 2 / n 2
E  2   0.637
V(Md)  /(2n) 
Example : Suppose we have some prior knowledge that the population from which we are about
to sample is normal. The mean of this population is however unknown to us.

Because the population is normal, we know that the sample mean X̄ and the sample median Md are unbiased :
E(X̄) = μ
E(Md) = μ
However, consider their variances :
V(Md) ≈ πσ²/(2n)
V(X̄) = σ²/n
Clearly, X̄ is the more efficient estimator, since it has the smaller variance.
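An added simulation sketch of this comparison (n = 101 and standard normal data are assumed choices):

```python
# Sampling variances of the mean and the median in normal samples:
# expect Var(mean) ~ sigma^2/n and Var(median) ~ pi*sigma^2/(2n).
import random
import statistics

random.seed(4)
n, reps = 101, 20000

means, medians = [], []
for _ in range(reps):
    x = [random.gauss(0.0, 1.0) for _ in range(n)]
    means.append(statistics.mean(x))
    medians.append(statistics.median(x))

print(statistics.pvariance(means))    # about 1/101  = 0.0099
print(statistics.pvariance(medians))  # about pi/202 = 0.0156, i.e. 1.57x larger
```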


Example : A random sample (X1, X2, X3, X4, X5) of size 5 is drawn from a normal population with unknown mean μ. Consider the following estimators of μ :
(i) t1 = (X1 + X2 + X3 + X4 + X5)/5
(ii) t2 = (X1 + X2)/2 + X3
(iii) t3 = (2X1 + X2 + λX3)/3
where λ is such that t3 is an unbiased estimator of μ.
Find λ. Are t1 and t2 unbiased? State, giving reasons, the estimator which is best among t1, t2 and t3.
Solution. We are given
E(Xi) = , Var(Xi) = 2, (say) ; Cov (Xi, Xj) = 0, (i  j = 1, 2, ..., n) ...()

1 n 1 5 1
(i) E(t1) =  E(Xi )     . 5 = 
5 i1 5i1 5

 t1 is an unbiased estimator of .

1
(ii) E(t2) = E(X1 + X2) + E(X3)
2
1
= ( + ) +  [Using ()]
2
= 2
 t2 is not an unbiased estimator of .

(iii) E(t3) = μ
⇒ (1/3) E(2X1 + X2 + λX3) = μ
⇒ 2E(X1) + E(X2) + λE(X3) = 3μ
⇒ 2μ + μ + λμ = 3μ
⇒ λμ = 0 ⇒ λ = 0
Using (α), we get
V(t1) = (1/25)[V(X1) + V(X2) + V(X3) + V(X4) + V(X5)] = σ²/5
V(t2) = (1/4)[V(X1) + V(X2)] + V(X3) = σ²/2 + σ² = 3σ²/2
V(t3) = (1/9)[4V(X1) + V(X2)] = (1/9)(4σ² + σ²) = 5σ²/9 (∵ λ = 0)
Since V(t1) is the least, t1 is the best estimator (in the sense of least variance) of μ. (A numeric check by simulation is sketched below.)
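The added sketch below (μ = 10 and σ = 2 are assumed values) verifies both the expectations and the variance ranking by simulation:

```python
# Simulate t1, t2, t3 (with lambda = 0) and compare their means and variances.
import random
import statistics

random.seed(5)
mu, sigma, reps = 10.0, 2.0, 40000

t1s, t2s, t3s = [], [], []
for _ in range(reps):
    X = [random.gauss(mu, sigma) for _ in range(5)]
    t1s.append(sum(X) / 5)
    t2s.append((X[0] + X[1]) / 2 + X[2])
    t3s.append((2 * X[0] + X[1]) / 3)  # lambda = 0 drops X3

for t in (t1s, t2s, t3s):
    print(statistics.mean(t), statistics.pvariance(t))
# Expect means near 10, 20, 10 and variances near 0.8, 6.0, 2.22.
```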
Example : X1, X2, X3 is a random sample of size 3 from a population with mean value μ and variance σ². T1, T2, T3 are estimators used to estimate the mean value μ, where
T1 = X1 + X2 − X3, T2 = 2X1 + 3X3 − 4X2, and T3 = (X1 + X2 + λX3)/3
(i) Are T1 and T2 unbiased estimators?
(ii) Find the value of λ such that T3 is an unbiased estimator of μ.
(iii) With this value of λ, is T3 a consistent estimator?
(iv) Which is the best estimator?
Solution. Since X1, X2, X3 is a random sample from a population with mean μ and variance σ²,
E(Xi) = μ, Var(Xi) = σ² and Cov(Xi, Xj) = 0, (i ≠ j = 1, 2, 3) ...(α)
(i) E(T1) = E(X1) + E(X2) − E(X3) = μ + μ − μ = μ
⇒ T1 is an unbiased estimator of μ.
E(T2) = 2E(X1) + 3E(X3) − 4E(X2) = 2μ + 3μ − 4μ = μ
⇒ T2 is an unbiased estimator of μ.
(ii) We are given : E(T3) = μ
⇒ (1/3)[E(X1) + E(X2) + λE(X3)] = μ
⇒ (1/3)(μ + μ + λμ) = μ ⇒ λμ + 2μ = 3μ ⇒ λ = 1.
(iii) With λ = 1, T3 = (X1 + X2 + X3)/3 = X̄
Since the sample mean is a consistent estimator of the population mean μ, by the Weak Law of Large Numbers, T3 is a consistent estimator of μ.
(iv) We have [on using (α)] :
Var(T1) = Var(X1) + Var(X2) + Var(X3) = 3σ²
Var(T2) = 4 Var(X1) + 9 Var(X3) + 16 Var(X2) = 29σ²
Var(T3) = (1/9)[Var(X1) + Var(X2) + Var(X3)] = σ²/3 (∵ λ = 1)
Since Var(T3) is minimum, T3 is the best estimator in the sense of minimum variance.


5. MINIMUM VARIANCE UNBIASED (M.V.U.) ESTIMATORS

If a statistic T = T(x1, x2, ..., xn), based on a sample of size n, is such that :
(i) T is unbiased for γ(θ), for all θ ∈ Θ, and
(ii) it has the smallest variance among the class of all unbiased estimators of γ(θ),
then T is called the minimum variance unbiased estimator (MVUE) of γ(θ).
More precisely, T is MVUE of γ(θ) if
E(T) = γ(θ) for all θ ∈ Θ ...(13)
and Var(T) ≤ Var(T′) for all θ ∈ Θ ...(14)
where T′ is any other unbiased estimator of γ(θ).
We give below some important Theorems concerning MVU estimators.
Theorem : An M.V.U. estimator is unique in the sense that if T1 and T2 are M.V.U. estimators for γ(θ), then T1 = T2, almost surely.
Proof. We are given that
E(T1) = E(T2) = γ(θ), for all θ ∈ Θ
and Var(T1) = Var(T2), for all θ ∈ Θ ...(15)
Consider a new estimator
T = (T1 + T2)/2
which is also unbiased, since
E(T) = (1/2)[E(T1) + E(T2)] = γ(θ)
Var(T) = Var[(T1 + T2)/2] = (1/4) Var(T1 + T2) [∵ Var(cX) = c² Var(X)]
= (1/4)[Var(T1) + Var(T2) + 2 Cov(T1, T2)]
= (1/4)[Var(T1) + Var(T2) + 2ρ √(Var(T1) Var(T2))]
= (1/2) Var(T1)(1 + ρ), ...[From (15)]
where ρ is Karl Pearson's coefficient of correlation between T1 and T2.
Since T1 is the MVU estimator,
Var(T) ≥ Var(T1)
⇒ (1/2) Var(T1)(1 + ρ) ≥ Var(T1)
⇒ (1/2)(1 + ρ) ≥ 1, i.e., ρ ≥ 1
Since |ρ| ≤ 1, we must have ρ = 1, i.e., T1 and T2 must have a linear relation of the form :

T1 =  + T2, ...(16)
where  and  are constants independent of x1, x2, ..., xn but may depend on , i.e., we may have
 = () and  = ().
Taking expectation of both sides in (16) and using (15), we get
 =  +  ...(17)
Also from (16), we get
Var(T1) = Var( + T2) =  2 Var (T2)
 1 = 2     = ± 1 ...[From (15)]
But since (T1, T2) = + 1, the coefficient of regression of T1 and T2 must be positive.
  = 1     = 0 ...[From (17)]
Substituting in (16), we get T1 = T2 as desired:
Theorem : Let T1 and T2 be unbiased estimators of () with efficiencies e1 and e2 respectively
and  =  be the correlation coefficient between them. Then

e1e 2  (1  e1 )(1  e 2 )    e1e 2  (1  e1 )(1  e 2 )


Proof. Let T be the minimum variance unbiased estimator of (). Then we are given :
E(T1) = () = E(T2),     ...(18)

V (T) V V
and e1 =  , (say)  V1 = ...(19)
V (T1 ) V1 e1

V (T) V V
e2 =  , (say)  V2 = ...(20)
V (T2 ) V2 e2

Let us consider another estimator


T3 = T1 + T2 ...(21)
which is also unbiased estimator of (),
i.e., E(T3) = ( + ) () = () ...[Using (18)]
  +  = 1
V(T3) = V(T1 + T2)
= 2 V(T1) + 2 V(T2) + 2 Cov (T1, T2)

  2 2  
 V  2.  ...(22) [Using (19) and (20)]
 e1 e2 e1e2 

But V (T3)  V, since V is the minimum variance.

 2  2 2
    1  (   )2 [Using (22)]
e1 e2 e1e2

1  2  1    
   1     1  2  2   1  0
 e1   e2   ee
 1 2

2
1          1 
   1    2   1      1  0 ...(23)
 e1     e1e2      e2
 

which is quadratic expression in (/).


Note that :
ei < 1 ⇒ (1/ei − 1) > 0, i = 1, 2
We know that
AX² + BX + C ≥ 0 ∀ x, with A > 0, C > 0,
if and only if
Discriminant = B² − 4AC ≤ 0 ...(24)
Using (24), we get from (23) :
(ρ/√(e1e2) − 1)² − (1/e1 − 1)(1/e2 − 1) ≤ 0
⇒ (ρ − √(e1e2))² ≤ (1 − e1)(1 − e2)
⇒ ρ² − 2√(e1e2) ρ + (e1 + e2 − 1) ≤ 0
This implies that ρ lies between the roots of the equation
ρ² − 2√(e1e2) ρ + (e1 + e2 − 1) = 0
which are given by
ρ = (1/2)[2√(e1e2) ± √(4e1e2 − 4(e1 + e2 − 1))]
= √(e1e2) ± √((1 − e1)(1 − e2))
Hence we have :
√(e1e2) − √((1 − e1)(1 − e2)) ≤ ρ ≤ √(e1e2) + √((1 − e1)(1 − e2)) ...(25)


Corollary : If we take e1 = 1 and e2 = e in (25), we get
√e ≤ ρ ≤ √e
This leads to the following important result, which we state in the form of a theorem.
Theorem : If T1 is an MVU estimator of γ(θ), θ ∈ Θ, and T2 is any other unbiased estimator of γ(θ) with efficiency e = eθ, then the correlation coefficient between T1 and T2 is given by
ρ = √e, i.e., ρθ = √(eθ), ∀ θ ∈ Θ ...(26)
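An added simulation sketch of this result: in normal samples the mean is the MVU estimator of μ and the median has efficiency e = 2/π, so the theorem predicts corr(X̄, Md) ≈ √(2/π) ≈ 0.798. (statistics.correlation requires Python 3.10+.)

```python
# Empirical correlation between sample mean and sample median, N(0, 1) data.
import random
import statistics

random.seed(6)
n, reps = 101, 20000

means, medians = [], []
for _ in range(reps):
    x = [random.gauss(0.0, 1.0) for _ in range(n)]
    means.append(statistics.mean(x))
    medians.append(statistics.median(x))

print(statistics.correlation(means, medians))  # close to sqrt(2/pi) = 0.798
```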
Theorem : If T1 is an MVUE of () and T2 is any other unbiased estimator of () with efficiency
e < 1, then no unbiased linear combination of T1 and T2 can be an MVUE of ().
Proof. A linear combination.
T = l1T1 + l2T2 ...(27)
will be unbiased estimator of () if
E(T) = l1E(T1) + l2E(T2) = g(q), for all   
 l1 + l2 = 1 ...(27a)
since we are given E(T1) = E(T2) = ().
We have

e = Var(T1)/Var(T2) ⇒ Var(T2) = Var(T1)/e ...(28)
and ρ = ρ(T1, T2) = √e [From (26)]
From (27), on using (28), we get
Var T = l1² Var(T1) + l2² Var(T2) + 2l1l2 Cov(T1, T2)
= l1² Var(T1) + l2² Var(T2) + 2l1l2 ρ √(Var(T1) Var(T2))
= Var(T1) [l1² + l2²/e + 2l1l2 √e (1/√e)]
= Var(T1) [l1² + 2l1l2 + l2²/e]
≥ Var(T1) [l1² + 2l1l2 + l2²] (∵ 0 < e < 1 ⇒ 1/e > 1)
= Var(T1) (l1 + l2)²
= Var(T1) [From (27a)]
with strict inequality unless l2 = 0, in which case T = T1 identically.
⇒ T cannot be an MVU estimator.
Example : If T1 and T2 be two unbiased estimator of () with variances 12, 22 and correlation
, what is the best unbiased linear combination of T1 and T2 and what is the variance of such a
combination?
Solution. Let T1 and T2 be two unbiased estimators of ().
 E(T1) = E(T2) = () ...(1)
Let T be a linear combination of T1 and T2 given by
T = l1T1 + l2T2 ...()
where l1, l2 are arbitrary constants.
E(T) = l1E(T1) + l2E(T2) = (l1 + l2) () [From (1)]
 T is also an unbiased estimator of () if and only if
l1 + l2 = 1 ...(2)
Now V(T) = V(l1T1 + l2T2)
= l12V(T1) + l22V(T2) + 2l1l2 Cov. (T1, T2)
= l1212 + l2222 + 2l1l2 12 ...(3)
We want the minimum value of (3) for variations in l1 and l2, subject to the condition (2).


 V(T) = 0 = l112 + l2  12
l1


2
l2 V(T) = 0 = l22 + l1  12

Subtracting, we get
l1(12 – 12) = l2(22 – 12)


l1 l l1  l2
  2 2  2
  12 1  12 1  22  212
2
2

1
 2 2 [From (2)]
    21 2
1 2

 22  12 12  12


 l1  and l2  ...(4)
12  22  212 12  22  212

Example : Suppose, in the setting above, that T1 is the minimum variance unbiased estimate, with variance σ², and T2 is any other unbiased estimate with variance σ²/e. Prove that the correlation ρ between T1 and T2 is √e.
Solution. The coefficients of the best linear unbiased combination of T1 and T2, given by (α) in the Example above, are given by (4).
We are given that σ1² = V(T1) = σ²
and e = V(T1)/V(T2) ⇒ V(T2) = σ2² = σ²/e
Substituting in (4) of the Example, we get
l1 = (1 − ρ√e)/D and l2 = (e − ρ√e)/D, where D = 1 + e − 2ρ√e ...(5)
Hence from (α), the unbiased statistic is
T = [(1 − ρ√e) T1 + (e − ρ√e) T2]/D
and from (3) the minimum variance is :
V(T) = (σ²/D²)[(1 − ρ√e)² + (1/e)(e − ρ√e)² + (2ρ/√e)(1 − ρ√e)(e − ρ√e)]
= (σ²/D²)[(1 + e − 2ρ√e) − ρ²(1 + e − 2ρ√e)]
= σ²(1 − ρ²)/D = σ²(1 − ρ²)/(1 + e − 2ρ√e)
Since 1 + e − 2ρ√e = (1 − ρ²) + (√e − ρ)², this gives
V(T)/σ² = (1 − ρ²)/[(1 − ρ²) + (√e − ρ)²] ≤ 1 ...(6)
Since T1 is the most efficient estimator,
V(T) ≥ σ² ⇒ V(T)/σ² ≥ 1 ...(7)
From (6) and (7), we get
V(T)/σ² = 1, i.e., (1 − ρ²)/[(1 − ρ²) + (√e − ρ)²] = 1
⇒ (√e − ρ)² = 0 ⇒ ρ = √e
Aliter. From (5) onwards : since T1 is given to be the most efficient estimator, it cannot be improved upon (c.f. Theorem above). Hence, in order that T defined in (α) be the minimum variance unbiased estimator, we must have
l1 = 1 and l2 = 0 ⇒ e − ρ√e = 0 ⇒ ρ = √e ...[From (5)]

Remark. This problem leads to the following very important result :
"The correlation coefficient between a most efficient estimator and any other estimator with efficiency e is √e."


Sample Questions

SECTION-(A) MULTIPLE CHOICE QUESTIONS (MCQ)

1. Which of the following assumptions are required to show the consistency, unbiasedness and
efficiency of the OLS estimator ?
(i) E(ut) = 0 (ii) Var(ut) = 2
(iii) Cov(ut, ut–j) = 0  j (iv) ut~N(0, 2)
(A) (ii) and (iv) only (B) (i) and (iii) only
(C) (i), (ii) and (iii) only (D) (i), (ii), (iii) and (iv)

2. Which of the following may be consequences of one or more of the CLRM assumptions being
violated ?
(i) The coefficient estimates are not optimal
(ii) The standard error estimates are not optimal
(iii) The distributions assumed for the test statistics are inappropriate
(iv) Conclusions regarding the strength of relationships between the dependent and independent
variables may be invalid.
(A) (ii) and (iv) only (B) (i) and (iii) only
(C) (i), (ii) and (iii) only (D) (i), (ii), (iii) and (iv)

3. What would be the consequences for the OLS estimator if heteroscedasticity is present in a regression model but ignored ?
(A) It will be biased (B) It will be inconsistent
(C) It will be inefficient (D) All of (a), (b) and (c) will be true

4. Including relevant lagged values of the dependent variable on the right hand side of a regression
equation could lead to which one of the following ?
(A) Biased but consistent coefficient estimates
(B) Biased and inconsistent coefficient estimates
(C) Unbiased but inconsistent coefficient estimates
(D) Unbiased and consistent but inefficient coefficient estimates

5. What will be the properties of the OLS estimator in the presence of multicollinearity ?
(A) It will be consistent, unbiased and efficient
(B) It will be consistent and unbiased but not efficient
(C) It will be consistent but not unbiased
(D) It will not be consistent


SECTION-(B) MULTIPLE SELECT QUESTIONS (MSQ)

1. Let X1, X2, ..., Xn be normal random variables with mean μ and variance σ². What are the method of moments estimators of the mean μ and variance σ²?
(A) For variance : σ̂²MM = (1/(n − 1)) Σ (Xi − X̄)² (B) For variance : σ̂²MM = (1/n) Σ (Xi − X̄)²
(C) For mean : μ̂MM = (1/n) Σ Xi = X̄ (D) For mean : μ̂MM = (2/n) Σ Xi = 2X̄

2. Let X1, X2, ..., Xn be gamma random variables with parameters α and θ, so that the probability density function is :
f(xi) = (1/(Γ(α) θ^α)) x^(α−1) e^(−x/θ)
for x > 0. Therefore, the likelihood function
L(α, θ) = [1/(Γ(α) θ^α)]^n (x1 x2 ··· xn)^(α−1) exp(−(1/θ) Σ xi)
is difficult to differentiate because of the gamma function Γ(α). So, rather than finding the maximum likelihood estimators, what are the method of moments estimators of α and θ?
(A) α̂MM = nX̄²/Σ (Xi − X̄)² (B) θ̂MM = (1/(nX̄)) Σ (Xi − X̄)²
(C) α̂MM = 2nX̄²/Σ (Xi − X̄)² (D) θ̂MM = (1/n) Σ (Xi − X̄)²

3. Suppose you compute a sample statistic q to estimate a population quantity Q. Which of the
following is/are false?
(A) the variance of Q is zero
(B) if q is an unbiased estimator of Q, then q = Q
(C) if q is an unbiased estimator of Q, then q is the mean of the sample distribution of Q
(D) a 95% confidence interval for q contains Q with 95% probability

4. Suppose you draw a random sample of n observations, X1, X2, ..., Xn, from a population with unknown mean μ. Which of the following estimators of μ is/are biased?
(A) the first observation you sample, X1
(B) √(X̄²)
(C) √(X̄² − s²/n)
(D) All of these

5. In the linear regression model, the least squares estimator
(A) minimizes the sum of squared residuals
(B) is unbiased
(C) is most efficient among the class of linear estimators
(D) maximizes the value of R2

SECTION-(C) NUMERICAL ANSWER TYPE QUESTIONS (NAT)

1. The following table gives probabilities and observed frequencies in the four classes AB, Ab, aB and ab in a genetical experiment. Estimate the parameter θ by the method of maximum likelihood and find its standard error.
Class Probability Observed frequency
AB (2 + θ)/4 108
Ab (1 − θ)/4 27
aB (1 − θ)/4 30
ab θ/4 8

2. Let X1, X2, ..., Xn be a random sample of size n from a population X with probability density function
f(x; θ) = θ x^(θ−1) if 0 < x < 1, and 0 otherwise,
where 0 < θ < ∞ is an unknown parameter. Using the method of moments, find an estimator of θ. If x1 = 0.2, x2 = 0.6, x3 = 0.5, x4 = 0.3 is a random sample of size 4, then what is the estimate of θ?

3. Suppose X1, X2, ..., X7 is a random sample from a population X with density function
f(x; θ) = x⁶ e^(−x/θ)/(Γ(7) θ⁷) if 0 < x < ∞, and 0 otherwise.
Find the coefficient of X̄ in the estimator of θ obtained by the moment method.

4. Suppose X1, X2, ..., Xn is a random sample from a population X with density function
f(x; θ) = 1/θ if 0 < x < θ, and 0 otherwise.
Find the coefficient of X̄ in the estimator of θ obtained by the moment method.


5. If X1, X2, ..., Xn is a random sample from a distribution with density function
f(x; θ) = x⁶ e^(−x/θ)/(Γ(7) θ⁷) if 0 < x < ∞, and 0 otherwise,
then what is the coefficient of X̄ in the maximum likelihood estimator of θ?


SOLUTIONS

SECTION-(A) MULTIPLE CHOICE QUESTIONS (MCQ)

1. (C) All of the assumptions listed in (i) to (iii) are required to show that the OLS estimator has the desirable properties of consistency, unbiasedness and efficiency. However, it is not necessary to assume normality (iv) to derive the above results for the coefficient estimates. This assumption is only required in order to construct test statistics that follow the standard statistical distributions; in other words, it is only required for hypothesis testing and not for coefficient estimation.
2. (D) If one or more of the assumptions is violated, either the coefficients could be wrong or
their standard errors could be wrong, and in either case, any hypothesis tests used to
investigate the strength of relationships between the explanatory and explained variables
could be invalid. So all of (i) to (iv) are true.
3. (C) Under heteroscedasticity, provided that all of the other assumptions of the classical linear regression model are adhered to, the coefficient estimates will still be consistent and unbiased, but they will be inefficient. Thus (C) is correct. The upshot is that whilst this would not result in wrong coefficient estimates, our measure of the sampling variability of the coefficients, the standard errors, would probably be wrong. The stronger the degree of heteroscedasticity (i.e. the more the variance of the errors changes over the sample), the more inefficient the OLS estimator would be.
4. (A) Including lagged values of the dependent variable y will cause the assumption of the CLRM that the explanatory variables are non-stochastic to be violated. This arises since the lagged value of y is now being used as an explanatory variable and, since y at time t−1 will depend on the value of u at time t−1, it must be the case that lagged values of y are stochastic (i.e. they have some random influences and are not fixed in repeated samples). The result of this is that the OLS estimator in the presence of lags of the dependent variable will produce biased but consistent coefficient estimates. Thus, as the sample size increases towards infinity, we will still obtain the optimal parameter estimates, although these estimates could be biased in small samples. Note that no problem of this kind arises, whatever the sample size, when using only lags of the explanatory variables in the regression equation.
5. (A) In fact, in the presence of near multicollinearity, the OLS estimator will still be consistent, unbiased and efficient. This is the case since none of the four (Gauss-Markov) assumptions of the CLRM has been violated. You may have thought that, since the standard errors are usually wide in the presence of multicollinearity, the OLS estimator must be inefficient. But this is not true: the multicollinearity simply means that it is hard to obtain small standard errors, due to insufficient separate information between the collinear variables, not that the standard errors are wrong.

SECTION-(B) MULTIPLE SELECT QUESTIONS (MSQ)

1. (B,C) The first and second theoretical moments about the origin are :
E(Xi) = μ and E(Xi²) = σ² + μ²
(Incidentally, in case it's not obvious, the second moment can be derived by manipulating the shortcut formula for the variance.) In this case, we have two parameters for which we are trying to derive method of moments estimators. Therefore, we need two equations here. Equating the first theoretical moment about the origin with the corresponding sample moment, we get :
E(X) = μ = (1/n) Σ Xi
And, equating the second theoretical moment about the origin with the corresponding sample moment, we get :
E(X²) = σ² + μ² = (1/n) Σ Xi²
Now, the first equation tells us that the method of moments estimator for the mean μ is the sample mean :
μ̂MM = (1/n) Σ Xi = X̄
And, substituting the sample mean in for μ in the second equation and solving for σ², we get that the method of moments estimator for the variance σ² is :
σ̂²MM = (1/n) Σ Xi² − μ̂² = (1/n) Σ Xi² − X̄²
which can be rewritten as :
σ̂²MM = (1/n) Σ (Xi − X̄)²
2. (A,B) The first theoretical moment about the origin is :
E(Xi) = αθ
And the second theoretical moment about the mean is :
Var(Xi) = E[(Xi − αθ)²] = αθ²
Again, since we have two parameters for which we are trying to derive method of moments estimators, we need two equations. Equating the first theoretical moment about the origin with the corresponding sample moment, we get :
E(X) = αθ = (1/n) Σ Xi = X̄
And, equating the second theoretical moment about the mean with the corresponding sample moment, we get :
Var(X) = αθ² = (1/n) Σ (Xi − X̄)²
Now, we just have to solve for the two parameters α and θ. Let's start by solving for α in the first equation. Doing so, we get :
α = X̄/θ
Now, substituting α = X̄/θ into the second equation, we get :
αθ² = (X̄/θ) θ² = θX̄ = (1/n) Σ (Xi − X̄)²
Solving for θ in that last equation, and putting on its hat, we get that the method of moments estimator for θ is :
θ̂MM = (1/(nX̄)) Σ (Xi − X̄)²
And, substituting that value of θ back into the equation we have for α, and putting on its hat, we get that the method of moments estimator for α is :
α̂MM = X̄/θ̂MM = nX̄²/Σ (Xi − X̄)²
3. (B,C,D)
Since Q simply exists and is a fixed number or constant, it has no variance, so (A) is a true statement. (B) is nonsense, whereas (C) confuses what is from the sample with what is from the population and is incorrect. Finally, (D) misinterprets what a confidence interval captures. So (B), (C) and (D) are the false statements.

4. (B,C) We have seen before that (A) is actually unbiased. The second estimator seems like it might be unbiased, since by taking the square root of a square you just arrive back at X̄. A problem comes up, though, if your original X̄ is negative, say −10. Squaring −10 and then taking the square root, you arrive at 10. This is problematic and leads to (B) being a biased estimator, because we cannot say that its expected value is indeed μ. Likewise for (C), which compounds the problems of (B) by subtracting a further term before the square root, an operation which we know will impart bias. Since (A) is unbiased, (D) is wrong as well; the correct answers are (B) and (C).
5. (A,B,C,D)
We know by definition that the OLS estimator minimizes the sum of squared residuals (thus it produces the "least squares"). We have also noted that it has the desirable property of being both unbiased and most efficient among the class of linear estimators. Finally, since it minimizes the sum of squared residuals (in other words, the RSS), it automatically maximizes the value of R². So all of (A) to (D) are correct.

SECTION-(C) NUMERICAL ANSWER TYPE QUESTIONS (NAT)

1. 0.0576
Using the multinomial probability law, we have
L = L(θ) = [n!/(n1! n2! n3! n4!)] p1^n1 p2^n2 p3^n3 p4^n4, Σ pi = 1, Σ ni = n
⇒ log L = C + n1 log p1 + n2 log p2 + n3 log p3 + n4 log p4,
where C = log[n!/(n1! n2! n3! n4!)] is a constant.
⇒ log L = C + n1 log[(2 + θ)/4] + n2 log[(1 − θ)/4] + n3 log[(1 − θ)/4] + n4 log(θ/4)
The likelihood equation gives :
∂ log L/∂θ = n1/(2 + θ) − n2/(1 − θ) − n3/(1 − θ) + n4/θ = 0 ...(α)
⇒ n1/(2 + θ) − (n2 + n3)/(1 − θ) + n4/θ = 0
Taking n1 = 108, n2 = 27, n3 = 30 and n4 = 8, we get
108/(2 + θ) − 57/(1 − θ) + 8/θ = 0
⇒ 108 θ(1 − θ) − 57 θ(2 + θ) + 8(1 − θ)(2 + θ) = 0
⇒ 173θ² + 14θ − 16 = 0
⇒ θ = [−14 ± √(196 + 11072)]/346 = −0.34 and 0.26
But θ, being a probability, cannot be negative. Hence the M.L.E. of θ is given by
θ̂ ≈ 0.26 ...(β)
Differentiating (α) again partially w.r.t. θ, we get
∂² log L/∂θ² = −n1/(2 + θ)² − (n2 + n3)/(1 − θ)² − n4/θ²
⇒ E[−∂² log L/∂θ²] = E(n1)/(2 + θ)² + [E(n2) + E(n3)]/(1 − θ)² + E(n4)/θ²
= np1/(2 + θ)² + n(p2 + p3)/(1 − θ)² + np4/θ²
= n/[4(2 + θ)] + n/[2(1 − θ)] + n/(4θ)
⇒ I(θ̂) = n/[4(2 + θ̂)] + n/[2(1 − θ̂)] + n/(4θ̂) ; n = Σ ni = 173
= 173 [1/(4 × 2.26) + 1/(2 × 0.74) + 1/(4 × 0.26)]
= 173 [0.11 + 0.68 + 0.96] ≈ 173 × 1.75 ≈ 302
S.E.(θ̂) = 1/√I(θ̂) ≈ 1/√302 ≈ 0.0576
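An added computational sketch of this solution (the quadratic and the information formula are taken from the derivation above):

```python
# Solve 173*theta^2 + 14*theta - 16 = 0 and evaluate the information-based
# standard error at the MLE.
import math

a, b, c = 173.0, 14.0, -16.0
theta = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)  # positive root

n = 173
info = n * (1 / (4 * (2 + theta)) + 1 / (2 * (1 - theta)) + 1 / (4 * theta))
se = 1 / math.sqrt(info)
print(round(theta, 4), round(se, 4))  # ~0.2663 and ~0.058 (0.0576 up to rounding)
```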

2. 0.67 To find an estimator, we equate the population moment to the sample moment. The population moment E(X) is given by
E(X) = ∫₀¹ x f(x; θ) dx
= ∫₀¹ x · θ x^(θ−1) dx
= ∫₀¹ θ x^θ dx
= [θ x^(θ+1)/(θ + 1)]₀¹
= θ/(θ + 1).
We know that M1 = X̄. Now setting M1 equal to E(X) and solving for θ, we get
X̄ = θ/(θ + 1),
that is
θ = X̄/(1 − X̄),
where X̄ is the sample mean. Thus the statistic X̄/(1 − X̄) is an estimator of the parameter θ. Hence
θ̂ = X̄/(1 − X̄).
Since x1 = 0.2, x2 = 0.6, x3 = 0.5, x4 = 0.3, we have X̄ = 0.4 and
θ̂ = 0.4/(1 − 0.4) = 2/3 ≈ 0.67
is an estimate of θ.
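The same estimate in a couple of lines (an added sketch reusing the data above):

```python
# Method of moments estimate theta = xbar / (1 - xbar) for f(x) = theta * x**(theta - 1).
data = [0.2, 0.6, 0.5, 0.3]
xbar = sum(data) / len(data)  # 0.4
print(xbar / (1 - xbar))      # 0.666..., i.e. about 0.67
```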
3. 0.143 Since we have only one parameter, we need to compute only the first population moment E(X) about 0. Thus,
E(X) = ∫₀^∞ x f(x; θ) dx
= ∫₀^∞ x · x⁶ e^(−x/θ)/(Γ(7) θ⁷) dx
= (1/Γ(7)) ∫₀^∞ (x/θ)⁷ e^(−x/θ) dx
= (θ/Γ(7)) ∫₀^∞ y⁷ e^(−y) dy [substituting y = x/θ]
= (θ/Γ(7)) Γ(8)
= 7θ.
Since M1 = X̄, equating E(X) to M1 we get
7θ = X̄,
that is
θ = (1/7) X̄.
Therefore, the estimator of θ by the moment method is given by
θ̂ = (1/7) X̄ ≈ 0.143 X̄.
4. 2 Examining the density function of the population X, we see that X ~ UNIF(0, θ). Therefore
E(X) = θ/2.
Now, equating this population moment to the sample moment, we obtain
θ/2 = E(X) = M1 = X̄.
Therefore, the estimator of θ is
θ̂ = 2X̄.
5. 0.143 The likelihood function of the sample is given by
L(θ) = Π f(xi ; θ), the product running over i = 1, ..., n.
Thus,
ln L(θ) = Σ ln f(xi, θ)
= 6 Σ ln xi − (1/θ) Σ xi − n ln(6!) − 7n ln θ.
Therefore
(d/dθ) ln L(θ) = (1/θ²) Σ xi − 7n/θ.
Setting this derivative of ln L(θ) to zero, we get
(1/θ²) Σ xi − 7n/θ = 0
which yields
θ = (1/7n) Σ xi.
This θ can be shown to be a maximum by the second derivative test, and again we leave this verification to the reader. Hence the estimator of θ is given by
θ̂ = (1/7) X̄ ≈ 0.143 X̄.
