Sample Theory With Ques. - Estimation (JAM MS Unit-14)
ESTIMATION
1. ESTIMATION
The theory of estimation was founded by Prof. R.A. Fisher in a series of fundamental papers
round about 1930.
Parameter Space
Let us consider a random variable X with p.d.f. f(x, θ). In most common applications, though not always, the functional form of the population distribution is assumed to be known except for the value of some unknown parameter(s) θ, which may take any value in a set Θ. This set Θ of all admissible values of θ is called the parameter space. Such a situation gives rise not to one probability distribution but to a family of probability distributions, which we write as {f(x, θ) : θ ∈ Θ}. For example, if X ~ N(μ, σ²), then the parameter space is
Θ = {(μ, σ²) : −∞ < μ < ∞ ; 0 < σ² < ∞}.
In particular, for σ² = 1, the family of probability distributions is given by
{N(μ, 1) : μ ∈ Θ}, where Θ = {μ : −∞ < μ < ∞}.
In the following discussion we shall consider a general family of distributions
{f(x ; θ1, θ2, ..., θk) : θi ∈ Θ, i = 1, 2, ..., k}.
Let us consider a random sample x1, x2, ..., xn of size n from a population with probability function f(x ; θ1, θ2, ..., θk), where θ1, θ2, ..., θk are the unknown population parameters. There will then always be an infinite number of functions of the sample values, called statistics, which may be proposed as estimates of one or more of the parameters.
Evidently, the best estimate would be one that falls nearest to the true value of the parameter to be estimated. In other words, the statistic whose distribution concentrates as closely as possible near the true value of the parameter may be regarded as the best estimate. Hence the basic problem of estimation in the above case may be formulated as follows:
We wish to determine functions of the sample observations, T1(x1, x2, ..., xn), T2(x1, x2, ..., xn), ..., Tk(x1, x2, ..., xn), whose distributions are concentrated as closely as possible near the true values of the parameters.
The symbol θ̂ ("theta hat") is customarily used to denote both the estimator of θ and the point estimate resulting from a given sample.
Thus μ̂ = X̄ is read as "the point estimator of μ is the sample mean X̄." The statement "the point estimate of μ is 5.77" can be written concisely as μ̂ = 5.77.
In the best of all possible worlds, we could find an estimator θ̂ for which θ̂ = θ always. However, θ̂ is a function of the sample Xi's, so it is a random variable.
For some samples, θ̂ will yield a value larger than θ, whereas for other samples θ̂ will underestimate θ. If we write
θ̂ = θ + error of estimation,
then an accurate estimator would be one resulting in small estimation errors, so that estimated values will be near the true value.
A sensible way to quantify the idea of θ̂ being close to θ is to consider the squared error (θ̂ − θ)².
For some samples, θ̂ will be quite close to θ and the resulting squared error will be near 0. Other samples may give values of θ̂ far from θ, corresponding to very large squared errors.
A measure of accuracy is the expected or mean squared error (MSE),
MSE = E[(θ̂ − θ)²].
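As a rough illustration, the MSE of an estimator can be approximated by simulation: draw many samples, compute θ̂ on each, and average the squared errors. The sketch below assumes a normal population with an arbitrarily chosen mean and uses the sample mean as the estimator; all numerical values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 5.0            # true parameter (here a population mean), chosen arbitrarily
n, reps = 30, 100_000  # sample size and number of simulated samples (arbitrary)

# Draw many samples, compute the estimator (the sample mean) on each sample,
# and average the squared estimation errors to approximate MSE = E[(theta_hat - theta)^2].
samples = rng.normal(loc=theta, scale=2.0, size=(reps, n))
theta_hat = samples.mean(axis=1)
mse = np.mean((theta_hat - theta) ** 2)

print(mse)             # should be close to sigma^2 / n = 4 / 30 ≈ 0.133
```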
2. CONSISTENCY
An estimator Tn = T(x1, x2, ..., xn), based on a random sample of size n, is said to be a consistent estimator of γ(θ), θ ∈ Θ, the parameter space, if Tn converges to γ(θ) in probability,
i.e., if Tn → γ(θ) in probability as n → ∞. ...(1)
In other words, Tn is a consistent estimator of γ(θ) if for every ε > 0, η > 0, there exists a positive integer n ≥ m(ε, η) such that
P[|Tn − γ(θ)| < ε] → 1 as n → ∞ ...(2)
i.e., P[|Tn − γ(θ)| < ε] > 1 − η ; ∀ n ≥ m ...(2a)
where m is some very large value of n.
Remark. If X1, X2, ..., Xn is a random sample from a population with finite mean E(Xi) = μ < ∞, then by Khinchine's weak law of large numbers (W.L.L.N.), we have
X̄n = (1/n) Σᵢ₌₁ⁿ Xi → E(Xi) = μ in probability, as n → ∞.
Hence the sample mean X̄n is always a consistent estimator of the population mean μ.
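Consistency can also be illustrated numerically by estimating P[|X̄n − μ| < ε] for increasing n and watching it approach 1. The sketch below assumes an exponential population with an arbitrarily chosen mean; ε, the sample sizes and the replication count are likewise arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, eps, reps = 2.0, 0.1, 1_000    # population mean, tolerance, replications (all illustrative)

# For increasing n, estimate P(|X_bar_n - mu| < eps); consistency means this tends to 1.
for n in (10, 100, 1_000, 10_000):
    xbar = rng.exponential(scale=mu, size=(reps, n)).mean(axis=1)
    print(n, np.mean(np.abs(xbar - mu) < eps))
```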
3. UNBIASEDNESS
Obviously, consistency is a property concerning the behaviour of an estimator for indefinitely large values of the sample size n, i.e., as n → ∞. Nothing is said about its behaviour for finite n.
Moreover, if there exists a consistent estimator, say Tn, of γ(θ), then infinitely many such estimators can be constructed, e.g.,
Tn' = ((n + a)/(n + b)) Tn = ((1 + a/n)/(1 + b/n)) Tn → γ(θ) in probability, as n → ∞,
and hence, for different values of a and b, Tn' is also consistent for γ(θ).
Unbiasedness is a property associated with finite n. A statistic Tn = T(x1, x2, ..., xn) is said to be an unbiased estimator of γ(θ) if
E(Tn) = γ(θ), for all θ ∈ Θ. ...(3)
It is for this reason that, when estimating σ², we often prefer
S² = (1/(n − 1)) Σᵢ₌₁ⁿ (xi − x̄)², which satisfies E(S²) = σ², to the sample variance s² = (1/n) Σᵢ₌₁ⁿ (xi − x̄)², for which E(s²) = ((n − 1)/n) σ² ≠ σ².
Remark. If E(Tn) > γ(θ), Tn is said to be positively biased, and if E(Tn) < γ(θ), it is said to be negatively biased, the amount of bias b(θ) being given by
b(θ) = E(Tn) − γ(θ), θ ∈ Θ. ...(3a)
Example.
The sample proportion X/n can be used as an estimator of p, where X, the number of sample successes, has a binomial distribution with parameters n and p. Thus
E(p̂) = E(X/n) = (1/n) E(X) = (1/n)(np) = p.
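A quick simulation sketch, with arbitrarily chosen n and p, illustrates this: averaging X/n over many binomial samples returns a value close to p.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, reps = 50, 0.3, 200_000      # illustrative values

# The average of X/n over many binomial samples should be very close to p,
# illustrating E(X/n) = p (unbiasedness of the sample proportion).
x = rng.binomial(n, p, size=reps)
print((x / n).mean())              # ≈ 0.3
```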
Theorem (sufficient conditions for consistency). If Tn is a sequence of estimators such that E(Tn) → γ(θ) and Var(Tn) → 0 as n → ∞, then Tn is a consistent estimator of γ(θ).
Proof. By Chebyshev's inequality, for every ε > 0,
P[|Tn − E(Tn)| ≤ ε] ≥ 1 − Var(Tn)/ε². ...(5)
We have
|Tn − γ(θ)| ≤ |Tn − E(Tn)| + |E(Tn) − γ(θ)|. ...(6)
Now
|Tn − E(Tn)| ≥ |Tn − γ(θ)| − |E(Tn) − γ(θ)|, ...(7)
so the event {|Tn − E(Tn)| ≤ ε} implies the event {|Tn − γ(θ)| ≤ ε + |E(Tn) − γ(θ)|}. Hence, on using (5), we get
P[|Tn − γ(θ)| ≤ ε + |E(Tn) − γ(θ)|] ≥ P[|Tn − E(Tn)| ≤ ε] ≥ 1 − Var(Tn)/ε². ...(8)
We are given:
E(Tn) → γ(θ) as n → ∞.
Hence, for every ε1 > 0, there exists a positive integer n0(ε1) such that
|E(Tn) − γ(θ)| < ε1, ∀ n ≥ n0(ε1). ...(9)
Also Var(Tn) → 0 as n → ∞ (given), so
Var(Tn)/ε² < η, ∀ n ≥ n0'(η), ...(10)
where η is an arbitrarily small positive number.
Substituting from (9) and (10) in (8), we get
P[|Tn − γ(θ)| ≤ ε + ε1] ≥ 1 − η ; ∀ n ≥ m(ε1, η)
i.e., P[|Tn − γ(θ)| ≤ ε'] ≥ 1 − η ; ∀ n ≥ m,
where m = max(n0, n0') and ε' = ε + ε1 > 0.
⟹ Tn → γ(θ) in probability as n → ∞ [by (2)]
⟹ Tn is a consistent estimator of γ(θ).
Example : x1, x2, ..., xn is a random sample from a normal population N(μ, 1). Show that t = (1/n) Σᵢ₌₁ⁿ xi² is an unbiased estimator of μ² + 1.
Solution. Since xi ~ N(μ, 1), we have E(xi²) = Var(xi) + [E(xi)]² = 1 + μ². Hence
E(t) = E[(1/n) Σᵢ₌₁ⁿ xi²] = (1/n) Σᵢ₌₁ⁿ E(xi²) = (1/n) Σᵢ₌₁ⁿ (1 + μ²) = 1 + μ².
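A small simulation sketch, with an arbitrarily chosen μ and sample size, can be used to check this result: the average of t over many N(μ, 1) samples should be close to 1 + μ².

```python
import numpy as np

rng = np.random.default_rng(3)
mu, n, reps = 1.5, 25, 100_000     # illustrative choices of mu, sample size, replications

x = rng.normal(loc=mu, scale=1.0, size=(reps, n))
t = (x ** 2).mean(axis=1)          # t = (1/n) * sum(x_i^2), computed for each simulated sample
print(t.mean(), 1 + mu ** 2)       # the two numbers should nearly agree (≈ 3.25)
```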
4. EFFICIENT ESTIMATORS
Efficiency. Even if we confine ourselves to unbiased estimates, there will, in general, exist more than one consistent estimator of a parameter. For example, in sampling from a normal population N(μ, σ²), when σ² is known, the sample mean x̄ is an unbiased and consistent estimator of μ [c.f. Example above].
From symmetry it follows immediately that the sample median (Md) is an unbiased estimate of μ, which is the same as the population median. Also, for large n,
V(Md) ≈ 1/(4n f1²),
where f1 is the value of the density at the median. For the N(μ, σ²) density
f(x) = (1/(σ√(2π))) exp{−(x − μ)²/(2σ²)},
the value at x = μ is f1 = 1/(σ√(2π)), so that
V(Md) ≈ (1/(4n)) · 2πσ² = πσ²/(2n).
Since E(Md) → μ and V(Md) → 0 as n → ∞, the median is also a consistent estimator of μ.
Since V(x̄) = σ²/n < V(Md) = πσ²/(2n), we conclude that for the normal distribution the sample mean is a more efficient estimator of μ than the sample median, for large samples at least.
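The comparison can be checked numerically. The sketch below, with arbitrarily chosen n and replication count, computes the sampling variances of the mean and the median for normal samples; the ratio should come out near π/2 ≈ 1.57, in line with V(Md) ≈ πσ²/(2n) and V(x̄) = σ²/n.

```python
import numpy as np

rng = np.random.default_rng(4)
mu, sigma, n, reps = 0.0, 1.0, 1001, 10_000   # odd n, so the median is a single order statistic

x = rng.normal(mu, sigma, size=(reps, n))
var_mean = x.mean(axis=1).var()
var_median = np.median(x, axis=1).var()

print(var_mean, sigma**2 / n)                   # simulated vs. theoretical sigma^2 / n
print(var_median, np.pi * sigma**2 / (2 * n))   # simulated vs. large-sample pi*sigma^2 / (2n)
print(var_median / var_mean)                    # roughly pi/2 ≈ 1.57
```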
Most Efficient Estimator
If in a class of consistent estimators for a parameter, there exists one whose sampling variance
is less than that of any other such estimator, it is called the most efficient estimator. Whenever such an
estimator exists, it provides a criterion for measurement of efficiency of the other estimators.
Efficiency (Def.) If T1 is the most efficient estimator with variance V1 and T2 is any other estimator with variance V2, then the efficiency E of T2 is defined as
E = V1/V2. ...(12)
For instance, for the sample mean X̄ and the sample median Md from N(μ, σ²) we have
E(X̄) = μ and E(Md) = μ.
However, consider their variances:
V(Md) ≈ πσ²/(2n) and Var(X̄) = σ²/n.
Hence the efficiency of the median relative to the mean is (σ²/n)/(πσ²/(2n)) = 2/π ≈ 0.637.
Example : Let X1, X2, X3, X4, X5 be a random sample from a population with mean μ and variance σ², and consider the estimators
t1 = (X1 + X2 + X3 + X4 + X5)/5, t2 = (X1 + X2)/2 + X3, t3 = (2X1 + X2 + λX3)/3.
Which of these estimators are unbiased for μ, and which is best?
Solution.
(i) E(t1) = (1/5) Σᵢ₌₁⁵ E(Xi) = (1/5) · 5μ = μ
⟹ t1 is an unbiased estimator of μ.
(ii) E(t2) = (1/2) E(X1 + X2) + E(X3) = (1/2)(μ + μ) + μ = 2μ
⟹ t2 is not an unbiased estimator of μ.
(iii) E(t3) = (1/3)[2E(X1) + E(X2) + λE(X3)] = (2μ + μ + λμ)/3, which equals μ only if λ = 0; so with λ = 0, t3 = (2X1 + X2)/3 is unbiased.
Further,
V(t1) = (1/25)[V(X1) + V(X2) + V(X3) + V(X4) + V(X5)] = σ²/5,
V(t2) = (1/4)[V(X1) + V(X2)] + V(X3) = σ²/2 + σ² = 3σ²/2,
V(t3) = (1/9)[4V(X1) + V(X2)] = (1/9)(4σ² + σ²) = 5σ²/9 (with λ = 0).
Since V(t1) is the least, t1 is the best estimator (in the sense of least variance) of μ.
Example : X1, X2 and X3 is a random sample of size 3 from a population with mean value μ and variance σ². T1, T2, T3 are the estimators used to estimate the mean value μ, where
T1 = X1 + X2 − X3, T2 = 2X1 + 3X3 − 4X2, and T3 = λ(X1 + X2 + X3)/3.
(i) Are T1 and T2 unbiased estimators?
(ii) Find the value of λ such that T3 is an unbiased estimator of μ.
(iii) With this value of λ, is T3 a consistent estimator?
(iv) Which is the best estimator?
Solution. Since X1, X2, X3 is a random sample from a population with mean μ and variance σ²,
E(Xi) = μ, Var(Xi) = σ² and Cov(Xi, Xj) = 0, (i ≠ j = 1, 2, 3). ...(*)
(i) E(T1) = E(X1) + E(X2) − E(X3) = μ + μ − μ = μ
⟹ T1 is an unbiased estimator of μ.
E(T2) = 2E(X1) + 3E(X3) − 4E(X2) = 2μ + 3μ − 4μ = μ
⟹ T2 is an unbiased estimator of μ.
(ii) We are given : E(T3) = μ
⟹ (λ/3)[E(X1) + E(X2) + E(X3)] = μ
⟹ (λ/3)(μ + μ + μ) = μ ⟹ λμ = μ ⟹ λ = 1.
(iii) With λ = 1, T3 = (X1 + X2 + X3)/3 = X̄.
Since the sample mean X̄ is a consistent estimator of the population mean μ, by the Weak Law of Large Numbers, T3 is a consistent estimator of μ.
(iv) We have [on using (*)] :
Var(T1) = Var(X1) + Var(X2) + Var(X3) = 3σ²
Var(T2) = 4 Var(X1) + 9 Var(X3) + 16 Var(X2) = 29σ²
Var(T3) = (1/9)[Var(X1) + Var(X2) + Var(X3)] = σ²/3 (with λ = 1)
Since Var(T3) is minimum, T3 is the best estimator in the sense of minimum variance.
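A simulation sketch with arbitrarily chosen μ, σ and replication count can confirm these expectations and variances.

```python
import numpy as np

rng = np.random.default_rng(5)
mu, sigma, reps = 10.0, 2.0, 200_000    # illustrative population mean/sd and replication count

X1, X2, X3 = rng.normal(mu, sigma, size=(3, reps))
T1 = X1 + X2 - X3
T2 = 2 * X1 + 3 * X3 - 4 * X2
T3 = (X1 + X2 + X3) / 3                  # lambda = 1

# All three sample means should be near mu (unbiasedness); the sample variances
# should be near 3*sigma^2 = 12, 29*sigma^2 = 116 and sigma^2/3 ≈ 1.33 respectively.
for name, T in (("T1", T1), ("T2", T2), ("T3", T3)):
    print(name, T.mean(), T.var())
```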
If a statistic T = T(x1, x2, ..., xn), based on a sample of size n, is such that :
(i) T is unbiased for γ(θ), for all θ ∈ Θ, and
(ii) it has the smallest variance among the class of all unbiased estimators of γ(θ),
then T is called the minimum variance unbiased estimator (MVUE) of γ(θ).
More precisely, T is MVUE of γ(θ) if
E(T) = γ(θ) for all θ ∈ Θ ...(13)
and Var(T) ≤ Var(T') for all θ ∈ Θ, ...(14)
where T' is any other unbiased estimator of γ(θ).
We give below some important Theorems concerning MVU estimators.
Theorem : An M.V.U. estimator is unique in the sense that if T1 and T2 are M.V.U. estimators for γ(θ), then T1 = T2, almost surely.
Proof. We are given that T1 and T2 are both MVU estimators of γ(θ), so that
E(T1) = E(T2) = γ(θ) and Var(T1) = Var(T2), ...(15)
both variances being the common minimum. Consider the new estimator
T = (1/2)(T1 + T2),
which is also unbiased since
E(T) = (1/2)[E(T1) + E(T2)] = γ(θ).
Further,
Var(T) = Var[(1/2)(T1 + T2)] = (1/4) Var(T1 + T2) [∵ Var(cX) = c² Var(X)]
= (1/4)[Var(T1) + Var(T2) + 2 Cov(T1, T2)]
= (1/4)[Var(T1) + Var(T2) + 2ρ√(Var(T1) Var(T2))]
= (1/2) Var(T1)(1 + ρ), ...[From (15)]
where ρ is Karl Pearson's coefficient of correlation between T1 and T2.
Since T1 is the MVU estimator,
Var(T) ≥ Var(T1)
⟹ (1/2) Var(T1)(1 + ρ) ≥ Var(T1)
⟹ (1/2)(1 + ρ) ≥ 1, i.e., ρ ≥ 1.
Since |ρ| ≤ 1, we must have ρ = 1, i.e., T1 and T2 must have a linear relation of the form T2 = α + βT1 with β > 0. Since E(T1) = E(T2) = γ(θ) and Var(T1) = Var(T2), it follows that β = 1 and α = 0, i.e., T1 = T2 almost surely.
Theorem : If T1 and T2 are unbiased estimators of γ(θ) with efficiencies e1 and e2 respectively, and ρ = ρ(T1, T2) is the correlation coefficient between them, then
√(e1e2) − √((1 − e1)(1 − e2)) ≤ ρ ≤ √(e1e2) + √((1 − e1)(1 − e2)).
Proof. Let T be the MVU estimator of γ(θ), with variance V, and let V1 = Var(T1), V2 = Var(T2). Then
e1 = V(T)/V(T1) = V/V1, say ⟹ V1 = V/e1 ...(19)
and e2 = V(T)/V(T2) = V/V2, say ⟹ V2 = V/e2. ...(20)
For any λ, the combination λT1 + (1 − λ)T2 is an unbiased estimator of γ(θ), so its variance cannot fall below V:
Var[λT1 + (1 − λ)T2] = λ²V1 + (1 − λ)²V2 + 2λ(1 − λ)ρ√(V1V2) ≥ V
⟹ λ²/e1 + (1 − λ)²/e2 + 2ρλ(1 − λ)/√(e1e2) ≥ 1, for all λ. ...(22) [Using (19) and (20)]
Since 1 = [λ + (1 − λ)]² = λ² + 2λ(1 − λ) + (1 − λ)², (22) can be rewritten as
λ²(1/e1 − 1) + 2λ(1 − λ)(ρ/√(e1e2) − 1) + (1 − λ)²(1/e2 − 1) ≥ 0, for all λ. ...(23)
Dividing by (1 − λ)² and putting x = λ/(1 − λ), this says that
x²(1/e1 − 1) + 2x(ρ/√(e1e2) − 1) + (1/e2 − 1) ≥ 0 for all x,
where, since ei ≤ 1 ⟹ (1/ei − 1) ≥ 0, i = 1, 2, the leading and constant coefficients are non-negative.
We know that
Ax² + Bx + C ≥ 0 ∀ x, A > 0, C > 0,
if and only if
Discriminant = B² − 4AC ≤ 0. ...(24)
Using (24), we get from (23) :
(ρ/√(e1e2) − 1)² − (1/e1 − 1)(1/e2 − 1) ≤ 0
⟹ (ρ − √(e1e2))² − (1 − e1)(1 − e2) ≤ 0
⟹ ρ² − 2ρ√(e1e2) + (e1 + e2 − 1) ≤ 0.
This implies that ρ lies between the roots of the equation
ρ² − 2ρ√(e1e2) + (e1 + e2 − 1) = 0,
which are given by
ρ = (1/2)[2√(e1e2) ± √(4e1e2 − 4(e1 + e2 − 1))] = √(e1e2) ± √((1 − e1)(1 − e2)).
Hence √(e1e2) − √((1 − e1)(1 − e2)) ≤ ρ ≤ √(e1e2) + √((1 − e1)(1 − e2)). ...(25)
This leads to the following important result, which we state in the form of a theorem.
Theorem : If T1 is an MVU estimator of γ(θ) and T2 is any other unbiased estimator of γ(θ) with efficiency e, then the correlation coefficient between T1 and T2 is given by
ρ = √e, i.e., ρ² = e. ...(26)
(This follows from (25) on putting e1 = 1 and e2 = e, since then both bounds reduce to √e.)
Theorem : If T1 is an MVUE of γ(θ) and T2 is any other unbiased estimator of γ(θ) with efficiency e < 1, then no unbiased linear combination of T1 and T2 can be an MVUE of γ(θ).
Proof. A linear combination
T = l1T1 + l2T2 ...(27)
will be an unbiased estimator of γ(θ) if
E(T) = l1E(T1) + l2E(T2) = γ(θ), for all θ ∈ Θ
⟹ l1 + l2 = 1, ...(27a)
since we are given E(T1) = E(T2) = γ(θ).
We have
e = Var(T1)/Var(T2) ⟹ Var(T2) = Var(T1)/e. ...(28)
Also, by the previous theorem, ρ(T1, T2) = √e, so Cov(T1, T2) = √e · √(Var(T1) · Var(T1)/e) = Var(T1). Hence
Var(T) = l1² Var(T1) + l2² Var(T2) + 2 l1l2 Cov(T1, T2)
= Var(T1)[l1² + l2²/e + 2 l1l2]
≥ Var(T1)[l1² + 2 l1l2 + l2²] [∵ 0 < e < 1 ⟹ 1/e > 1]
= Var(T1)(l1 + l2)²
= Var(T1), [From (27a)]
with strict inequality whenever l2 ≠ 0. Hence T cannot be an MVU estimator.
Example : If T1 and T2 are two unbiased estimators of γ(θ) with variances σ1², σ2² and correlation ρ, what is the best unbiased linear combination of T1 and T2, and what is the variance of such a combination?
Solution. Let T1 and T2 be two unbiased estimators of γ(θ), so that
E(T1) = E(T2) = γ(θ). ...(1)
Let T be a linear combination of T1 and T2 given by
T = l1T1 + l2T2, ...(*)
where l1, l2 are arbitrary constants. Then
E(T) = l1E(T1) + l2E(T2) = (l1 + l2) γ(θ). [From (1)]
Hence T is also an unbiased estimator of γ(θ) if and only if
l1 + l2 = 1. ...(2)
Now V(T) = V(l1T1 + l2T2)
= l1² V(T1) + l2² V(T2) + 2 l1l2 Cov(T1, T2)
= l1²σ1² + l2²σ2² + 2 l1l2 ρσ1σ2. ...(3)
We want the minimum value of (3) for variations in l1 and l2, subject to the condition (2). Introducing a Lagrange multiplier k, the stationarity conditions are
∂/∂l1 [V(T) − k(l1 + l2 − 1)] = 0 ⟹ 2l1σ1² + 2l2ρσ1σ2 = k,
∂/∂l2 [V(T) − k(l1 + l2 − 1)] = 0 ⟹ 2l2σ2² + 2l1ρσ1σ2 = k.
Subtracting, we get
l1(σ1² − ρσ1σ2) = l2(σ2² − ρσ1σ2)
⟹ l1/(σ2² − ρσ1σ2) = l2/(σ1² − ρσ1σ2) = (l1 + l2)/(σ1² + σ2² − 2ρσ1σ2) = 1/(σ1² + σ2² − 2ρσ1σ2). [From (2)]
Hence
l1 = (σ2² − ρσ1σ2)/(σ1² + σ2² − 2ρσ1σ2), l2 = (σ1² − ρσ1σ2)/(σ1² + σ2² − 2ρσ1σ2). ...(4)
Substituting these values in (3), the minimum variance works out to
V(T) = σ1²σ2²(1 − ρ²)/(σ1² + σ2² − 2ρσ1σ2).
With these values of l1 and l2, T is the unbiased linear combination of T1 and T2 with minimum variance (see the sketch below for a numerical check).
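As an illustration of the weights in (4) and of the resulting minimum variance, the sketch below plugs in arbitrary values of σ1, σ2 and ρ, and also checks the answer by a brute-force scan over l1.

```python
import numpy as np

# Illustrative numbers: two unbiased estimators with these standard deviations and correlation.
s1, s2, rho = 1.0, 2.0, 0.3
D = s1**2 + s2**2 - 2 * rho * s1 * s2

l1 = (s2**2 - rho * s1 * s2) / D              # weights from equation (4)
l2 = (s1**2 - rho * s1 * s2) / D
v_min = s1**2 * s2**2 * (1 - rho**2) / D      # minimum variance of l1*T1 + l2*T2

print(l1, l2, l1 + l2)                        # the weights sum to 1 (unbiasedness constraint)
print(v_min)

# Brute-force check: scan l1 over a grid (with l2 = 1 - l1) and confirm nothing does better.
grid = np.linspace(-1.0, 2.0, 10_001)
var_grid = grid**2 * s1**2 + (1 - grid)**2 * s2**2 + 2 * grid * (1 - grid) * rho * s1 * s2
print(var_grid.min())                         # matches v_min up to grid resolution
```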
Example : Let T1 be the most efficient estimator of γ(θ), with variance σ², and let T2 be any other unbiased estimator with variance σ²/e. Prove that the correlation ρ between T1 and T2 is √e.
Solution. The coefficients of the best linear unbiased combination of T1 and T2, given by (*) in the example above, are given by (4).
We are given that σ1² = V(T1) = σ² and
e = V(T1)/V(T2) ⟹ V(T2) = σ2² = σ²/e.
Substituting in (4), with σ1σ2 = σ²/√e, we get
l1 = (1 − ρ√e)/D and l2 = (e − ρ√e)/D, where D = 1 + e − 2ρ√e. ...(5)
Substituting these values of l1 and l2 in (3) (or directly using the minimum-variance formula of the previous example) gives
V(T) = σ1²σ2²(1 − ρ²)/(σ1² + σ2² − 2ρσ1σ2) = σ²(σ²/e)(1 − ρ²)/[σ²(1 + e − 2ρ√e)/e],
i.e.,
V(T)/σ² = (1 − ρ²)/(1 + e − 2ρ√e). ...(6)
Since T1 is the most efficient estimator,
V(T) ≥ V(T1) = σ², i.e., V(T)/σ² ≥ 1. ...(7)
From (6) and (7), we get
(1 − ρ²)/(1 + e − 2ρ√e) ≥ 1, i.e., 1 − ρ² ≥ 1 + e − 2ρ√e
⟹ ρ² − 2ρ√e + e ≤ 0 ⟹ (ρ − √e)² ≤ 0 ⟹ ρ = √e.
Aliter. From (5) onwards : since T1 is given to be the most efficient estimator, it cannot be improved upon (c.f. Theorem above). Hence, in order that T defined in (*) be the minimum variance unbiased estimator, we must have
l1 = 1 and l2 = 0
⟹ e − ρ√e = 0 [From (5)]
⟹ ρ = √e.
Sample Questions
1. Which of the following assumptions are required to show the consistency, unbiasedness and
efficiency of the OLS estimator ?
(i) E(ut) = 0 (ii) Var(ut) = σ²
(iii) Cov(ut, ut−j) = 0 ∀ j (iv) ut ~ N(0, σ²)
(A) (ii) and (iv) only (B) (i) and (iii) only
(C) (i), (ii) and (iii) only (D) (i), (ii), (iii) and (iv)
2. Which of the following may be consequences of one or more of the CLRM assumptions being
violated ?
(i) The coefficient estimates are not optimal
(ii) The standard error estimates are not optimal
(iii) The distributions assumed for the test statistics are inappropriate
(iv) Conclusions regarding the strength of relationships between the dependent and independent
variables may be invalid.
(A) (ii) and (iv) only (B) (i) and (iii) only
(C) (i), (ii) and (iii) only (D) (i), (ii), (iii) and (iv)
3. What would be the consequences for the OLS estimator if heteroscedasticity is present in a
regression model but ignored ?
(A) It will be biased (B) It will be inconsistent
(C) It will be inefficient (D) All of (a), (b) and (c) will be true
4. Including relevant lagged values of the dependent variable on the right hand side of a regression
equation could lead to which one of the following ?
(A) Biased but consistent coefficient estimates
(B) Biased and inconsistent coefficient estimates
(C) Unbiased but inconsistent coefficient estimates
(D) Unbiased and consistent but inefficient coefficient estimates
5. What will be the properties of the OLS estimator in the presence of multicollinearity ?
(A) It will be consistent, unbiased and efficient
(B) It will be consistent and unbiased but not efficient
(C) It will be consistent but not unbiased
(D) It will not be consistent
1. Let X1, X2, ..., Xn be normal random variables with mean μ and variance σ². What are the method of moments estimators of the mean μ and the variance σ²?
(A) For the variance: σ̂² = (1/(n − 1)) Σᵢ₌₁ⁿ (Xi − X̄)²
(B) For the variance: σ̂² = (1/n) Σᵢ₌₁ⁿ (Xi − X̄)²
(C) For the mean: μ̂ = (1/n) Σᵢ₌₁ⁿ Xi = X̄
(D) For the mean: μ̂ = (1/n) Σᵢ₌₁ⁿ (Xi − 2X̄)
2. Let X1, X2, ..., Xn be gamma random variables with parameters α and θ, so that the probability density function is
f(xi) = (1/(Γ(α) θ^α)) x^(α−1) e^(−x/θ).
The likelihood function is difficult to differentiate because of the gamma function Γ(α). So, rather than finding the maximum likelihood estimators, what are the method of moments estimators of α and θ?
(A) α̂ = nX̄² / Σᵢ₌₁ⁿ (Xi − X̄)²
(B) θ̂ = (1/(nX̄)) Σᵢ₌₁ⁿ (Xi − X̄)²
(C) α̂ = 2nX̄² / Σᵢ₌₁ⁿ (Xi − X̄)²
(D) θ̂ = (1/n) Σᵢ₌₁ⁿ (Xi − X̄)²
3. Suppose you compute a sample statistic q to estimate a population quantity Q. Which of the
following is/are false?
(A) the variance of Q is zero
(B) if q is an unbiased estimator of Q, then q = Q
(C) if q is an unbiased estimator of Q, then q is the mean of the sample distribution of Q
(D) a 95% confidence interval for q contains Q with 95% probability
4. Suppose you draw a random sample of n observations, X1, X2, ..., Xn, from a population with unknown mean μ. Which of the following estimators of μ is/are biased?
(A) the first observation you sample, X1
(B) √(X̄²)
(C) √(X̄² − s²/n)
(D) All of these
1. The following table gives the probabilities and observed frequencies in the four classes AB, Ab, aB and ab in a genetic experiment. Estimate the parameter θ by the method of maximum likelihood and find its standard error.
Class    Probability    Observed frequency
AB       (2 + θ)/4      108
Ab       (1 − θ)/4      27
aB       (1 − θ)/4      30
ab       θ/4            8
2. Let X1, X2, ..., Xn be a random sample of size n from a population X with probability density function
f(x ; θ) = θ x^(θ−1) if 0 < x < 1, and 0 otherwise,
where 0 < θ < ∞ is an unknown parameter. Using the method of moments, find an estimator of θ. If x1 = 0.2, x2 = 0.6, x3 = 0.5, x4 = 0.3 is a random sample of size 4, then what is the estimate of θ?
3. Suppose X1, X2, ..., X7 is a random sample from a population X with density function
f(x ; θ) = x⁶ e^(−x/θ) / (Γ(7) θ⁷) if 0 < x < ∞, and 0 otherwise.
Find the coefficient of the estimator of θ obtained by the method of moments (that is, the constant c such that θ̂ = c X̄).
4. Suppose X1, X2, ..., Xn is a random sample from a population X with density function
f(x ; θ) = 1/θ if 0 < x < θ, and 0 otherwise.
Find the coefficient of the estimator of θ obtained by the method of moments.
5. If X1, X2, ..., Xn is a random sample from a distribution with density function
f(x ; θ) = x⁶ e^(−x/θ) / (Γ(7) θ⁷) if 0 < x < ∞, and 0 otherwise,
then what is the coefficient of the maximum likelihood estimator of θ?
SOLUTIONS
1. (C) All of the assumptions listed in (i) to (iii) are required to show that the OLS estimator has
the desirable properties of consistency, unbiasedness and efficiency. However, it is not
necessary to assume normality (iv) to derive the above results for the coefficient estimates. This assumption is only required in order to construct test statistics that follow the
standard statistical distributions - in other words, it is only required for hypothesis testing
and not for coefficient estimation.
2. (D) If one or more of the assumptions is violated, either the coefficients could be wrong or
their standard errors could be wrong, and in either case, any hypothesis tests used to
investigate the strength of relationships between the explanatory and explained variables
could be invalid. So all of (i) to (iv) are true.
3. (C) Under heteroscedasticity, provided that all of the other assumptions of the classical linear
regression model are adhered to, the coefficient estimates will still be consistent and
unbiased, but they will be inefficient. Thus (C) is correct. The upshot is that, whilst this would not result in wrong coefficient estimates, our measure of the sampling variability of the coefficients, the standard errors, would probably be wrong. The stronger the degree
of heteroscedasticity (i.e. the more the variance of the errors changed over the sample),
the more inefficient the OLS estimator would be.
4. (A) Including lagged values of the dependent variable y will cause the assumption of the
CLRM that the explanatory variables are non-stochastic to be violated. This arises since
the lagged value of y is now being used as an explanatory variable and, since y at time
t-1 will depend on the value of u at time t-1, it must be the case that lagged values of
y are stochastic (i.e. they have some random influences and are not fixed in repeated
samples). The result of this is that the OLS estimator in the presence of lags of the
dependent variable will produce biased but consistent coefficient estimates. Thus, as the
sample size increases towards infinity, we will still obtain the optimal parameter estimates, although these estimates could be biased in small samples. Note that no problem
of this kind arises whatever the sample size when using only lags of the explanatory
variables in the regression equation.
5. (A) In fact, in the presence of near multicollinearity, the OLS estimator will still be consistent,
unbiased and efficient. This is the case since none of the four (Gauss-Markov) assumptions of the CLRM has been violated. You may have thought that, since the standard errors are usually wide in the presence of multicollinearity, the OLS estimator must be inefficient. But this is not true: multicollinearity will simply mean that it is hard to obtain small standard errors due to insufficient separate information between the collinear variables, not that the standard errors are wrong.
1. (B,C) The first and second theoretical moments about the origin are
E(Xi) = μ and E(Xi²) = σ² + μ².
(Incidentally, in case it's not obvious, the second moment can be derived by manipulating the shortcut formula for the variance.) In this case, we have two parameters to estimate. Equating the first theoretical moment with the corresponding sample moment, we get
E(X) = μ = (1/n) Σᵢ₌₁ⁿ Xi = X̄.
And, equating the second theoretical moment about the mean with the corresponding sample moment, we get
Var(X) = σ² = (1/n) Σᵢ₌₁ⁿ (Xi − X̄)².
Now, we just have to solve for the two parameters μ and σ². Solving the first equation gives μ̂ = X̄, and substituting μ = X̄ into the second equation gives σ̂² = (1/n) Σᵢ₌₁ⁿ (Xi − X̄)².
2. (A,B) Equating the first two population moments to the corresponding sample moments, E(X) = αθ = X̄ and Var(X) = αθ² = (1/n) Σᵢ₌₁ⁿ (Xi − X̄)², and solving, we get
θ̂ = (1/(nX̄)) Σᵢ₌₁ⁿ (Xi − X̄)² and α̂ = nX̄² / Σᵢ₌₁ⁿ (Xi − X̄)².
3. (B,C,D)
Since Q simply exists and is a fixed number or constant, it has no variance, so statement (A) is true. Statement (B) is nonsense, whereas (C) confuses what is from the sample with what is from the population and is incorrect. Finally, (D) misinterprets what a confidence interval means.
4. (B,C) We have seen before that (A) is actually unbiased. The second option seems like it might be unbiased, since by taking the square root of a square you just arrive back at X̄. A problem comes up, though, if your original X̄ is negative, say −10: squaring −10 and then taking the square root you arrive at 10. This leads to (B) being a biased estimator, because we cannot say that its expected value is indeed μ. Likewise for (C), which compounds the problems of (B) by subtracting a further quantity inside the square root, an operation which we know will impart bias. So the correct answers are (B) and (C).
5. (A,B,C,D)
We know by definition that the OLS estimator minimizes the sum of squared residuals (thus, it produces the "least squares"). We have also noted that it has the desirable property of being both unbiased and most efficient. Finally, since it minimizes the sum of squared residuals (in other words, the RSS), it automatically maximizes the value of R².
1. 0.0576
Using the multinomial probability law, we have
L = L(θ) = [n!/(n1! n2! n3! n4!)] p1^n1 p2^n2 p3^n3 p4^n4, Σ pi = 1, Σ ni = n
⟹ log L = C + n1 log((2 + θ)/4) + n2 log((1 − θ)/4) + n3 log((1 − θ)/4) + n4 log(θ/4),
where C = log[n!/(n1! n2! n3! n4!)] is a constant.
The likelihood equation gives :
∂ log L/∂θ = n1/(2 + θ) − (n2 + n3)/(1 − θ) + n4/θ = 0. ...(*)
Taking n1 = 108, n2 = 27, n3 = 30 and n4 = 8, this reduces to 173θ² + 14θ − 16 = 0, whose positive root gives
θ̂ ≈ 0.26. ...(**)
Differentiating (*) again partially w.r.t. θ, we get
∂² log L/∂θ² = −n1/(2 + θ)² − (n2 + n3)/(1 − θ)² − n4/θ²
⟹ I(θ) = E[−∂² log L/∂θ²] = np1/(2 + θ)² + n(p2 + p3)/(1 − θ)² + np4/θ²
= n(2 + θ)/[4(2 + θ)²] + 2n(1 − θ)/[4(1 − θ)²] + nθ/(4θ²)
= n/[4(2 + θ)] + n/[2(1 − θ)] + n/(4θ), where n = Σ ni = 173.
Evaluating at θ = θ̂ = 0.26,
I(θ̂) = 173 [1/(4 × 2.26) + 1/(2 × 0.74) + 1/(4 × 0.26)]
= 173 [0.11 + 0.67 + 0.96] = 173 × 1.74 = 301.02.
Hence S.E.(θ̂) ≈ 1/√I(θ̂) = 1/√301.02 = 0.0576.
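The figures above can be verified with a short numerical sketch: solve the likelihood equation (*) by bisection on (0, 1) and evaluate the information at the root. The bracketing interval and iteration count below are arbitrary choices.

```python
import numpy as np

# Numerical check of the solution above: solve the likelihood equation
# n1/(2+t) - (n2+n3)/(1-t) + n4/t = 0 on (0, 1) by bisection, then evaluate
# the expected information to get the standard error.
n1, n2, n3, n4 = 108, 27, 30, 8
n = n1 + n2 + n3 + n4

def score(t):
    return n1 / (2 + t) - (n2 + n3) / (1 - t) + n4 / t

lo, hi = 1e-6, 1 - 1e-6
for _ in range(100):            # bisection: the score function is decreasing in t on (0, 1)
    mid = (lo + hi) / 2
    if score(mid) > 0:
        lo = mid
    else:
        hi = mid
theta_hat = (lo + hi) / 2

info = n * (1 / (4 * (2 + theta_hat)) + 1 / (2 * (1 - theta_hat)) + 1 / (4 * theta_hat))
print(theta_hat, 1 / np.sqrt(info))   # approximately 0.266 and 0.058
```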
2. 0.67
To find an estimator, we equate the population moment to the sample moment. The population moment E(X) is given by
E(X) = ∫₀¹ x f(x; θ) dx = ∫₀¹ x · θ x^(θ−1) dx = ∫₀¹ θ x^θ dx = θ [x^(θ+1)/(θ + 1)]₀¹ = θ/(θ + 1).
We know that M1 = X̄. Now setting M1 equal to E(X) and solving for θ, we get
X̄ = θ/(θ + 1), that is, θ = X̄/(1 − X̄),
where X̄ is the sample mean. Thus the statistic X̄/(1 − X̄) is an estimator of the parameter θ. Hence
θ̂ = X̄/(1 − X̄).
For the given sample, X̄ = (0.2 + 0.6 + 0.5 + 0.3)/4 = 0.4, so
θ̂ = 0.4/(1 − 0.4) = 2/3 ≈ 0.67
is an estimate of θ.
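A two-line check of this arithmetic:

```python
import numpy as np

# Reproducing the arithmetic of Solution 2: theta_hat = x_bar / (1 - x_bar).
x = np.array([0.2, 0.6, 0.5, 0.3])
x_bar = x.mean()
print(x_bar, x_bar / (1 - x_bar))   # 0.4 and 0.666...
```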
3. 0.143
Since we have only one parameter, we need to compute only the first population moment E(X) about 0. Thus,
E(X) = ∫₀^∞ x f(x; θ) dx
= ∫₀^∞ x · x⁶ e^(−x/θ) / (Γ(7) θ⁷) dx
= (1/(Γ(7) θ⁷)) ∫₀^∞ x⁷ e^(−x/θ) dx
= (θ/Γ(7)) ∫₀^∞ y⁷ e^(−y) dy [putting y = x/θ]
= θ Γ(8)/Γ(7)
= 7θ.
Setting E(X) equal to the sample moment M1 = X̄ gives 7θ = X̄, that is,
θ = X̄/7.
Therefore, the estimator of θ by the moment method is given by
θ̂ = X̄/7 ≈ 0.143 X̄.
4. 2
Examining the density function of the population X, we see that X ~ UNIF(0, θ). Therefore
E(X) = θ/2.
Now, equating this population moment to the sample moment, we obtain
θ/2 = E(X) = M1 = X̄.
Therefore, the estimator of θ is
θ̂ = 2X̄.
5. 0.143
The likelihood function of the sample is given by
L(θ) = Πᵢ₌₁ⁿ f(xi; θ).
Thus,
ln L(θ) = Σᵢ₌₁ⁿ ln f(xi, θ)
= 6 Σᵢ₌₁ⁿ ln xi − (1/θ) Σᵢ₌₁ⁿ xi − n ln(6!) − 7n ln(θ).
Therefore
(d/dθ) ln L(θ) = (1/θ²) Σᵢ₌₁ⁿ xi − 7n/θ.
Setting this derivative of ln L(θ) to zero, we get
(1/θ²) Σᵢ₌₁ⁿ xi − 7n/θ = 0,
which yields
θ = (1/(7n)) Σᵢ₌₁ⁿ xi = X̄/7.
This can be shown to be a maximum by the second derivative test, and again we leave this verification to the reader. Hence the maximum likelihood estimator of θ is given by
θ̂ = X̄/7 ≈ 0.143 X̄.
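As an illustrative check, one can simulate data from this gamma density with a chosen θ and confirm that X̄/7 recovers it approximately; the true θ and sample size below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(6)
theta = 2.0                                # illustrative "true" value of theta
x = rng.gamma(shape=7, scale=theta, size=5_000)

theta_mle = x.mean() / 7                   # theta_hat = X_bar / 7 ≈ 0.143 * X_bar
print(theta_mle)                           # close to 2.0
```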