Exercises and Answers To Chapter 1
1 The continuous type of random variable X has the following density function:

   f(x) = a − x,  if 0 < x < a,
          0,      otherwise.

(1) Find a.
(2) Compute mean and variance of X.
(3) When Y = X², derive the density function of Y.
[Answer]
(1) From the property of the density function, i.e., ∫ f(x) dx = 1, we need to have:

   ∫ f(x) dx = ∫_0^a (a − x) dx = [ax − (1/2)x²]_0^a = (1/2)a² = 1.

Therefore, a = √2 is obtained, taking into account a > 0.
(2) The definitions of mean and variance are given by: E(X) = ∫ x f(x) dx and V(X) = ∫ (x − µ)² f(x) dx, where µ = E(X). Therefore, mean of X is:

   E(X) = ∫ x f(x) dx = ∫_0^a x(a − x) dx = [(1/2)ax² − (1/3)x³]_0^a = (1/6)a³ = √2/3.  ←− a = √2 is substituted.
Variance of X is:

   V(X) = ∫ (x − µ)² f(x) dx = ∫ x² f(x) dx − µ² = ∫_0^a x²(a − x) dx − µ²
        = [(1/3)ax³ − (1/4)x⁴]_0^a − µ² = (1/12)a⁴ − µ² = 1/3 − (√2/3)² = 1/9.
(3) Let f(x) be the density function of X and F(x) be the distribution function of X. And let g(y) be the density function of Y and G(y) be the distribution function of Y. Using Y = X², we obtain:

   G(y) = P(Y < y) = P(X² < y) = P(−√y < X < √y) = F(√y) − F(−√y) = F(√y).  ←− F(−√y) = 0.
Moreover, from the relationship between the density and the distribution functions, we obtain the following:

   g(y) = dG(y)/dy = dF(√y)/dy = (dF(x)/dx)(d√y/dy)  ←− x = √y
        = F′(x) · 1/(2√y) = f(x) · 1/(2√y) = f(√y) · 1/(2√y)
        = (√2 − √y) · 1/(2√y),  for 0 < y < 2.
The range of y is obtained as: 0 < x < √2 =⇒ 0 < x² < 2 =⇒ 0 < y < 2.
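As a quick numerical sanity check (a sketch in Python, not part of the original answer; the inverse distribution function x = √2 − √(2 − 2u) follows from F(x) = √2 x − x²/2, and the seed and sample size are arbitrary), the results a = √2, E(X) = √2/3 and V(X) = 1/9 can be verified by simulation:

    # Minimal numerical check of Exercise 1; a sketch, not from the text.
    import numpy as np

    rng = np.random.default_rng(0)
    a = np.sqrt(2.0)
    u = rng.uniform(size=1_000_000)
    x = a - np.sqrt(2.0 - 2.0 * u)   # inverse of F(x) = sqrt(2)x - x^2/2

    print(x.mean(), a / 3)           # both close to sqrt(2)/3 = 0.4714
    print(x.var(), 1 / 9)            # both close to 1/9 = 0.1111
    y = x**2                         # Y = X^2 should lie in (0, 2)
    print(y.min() > 0, y.max() < 2)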
2 The continuous type of random variable X has the following density function:

   f(x) = (1/√(2π)) e^{−x²/2}.

Answer the following questions.

(1) Compute mean and variance of X.
(2) When Y = X², compute mean and variance of Y.
(3) When Z = e^X, compute mean and variance of Z.
[Answer]
(1) The definitions of mean and variance are: E(X) = ∫ x f(x) dx and V(X) = ∫ (x − µ)² f(x) dx, where µ = E(X). Therefore, mean of X is:

   E(X) = ∫ x f(x) dx = ∫_{−∞}^{∞} x (1/√(2π)) e^{−x²/2} dx = [−(1/√(2π)) e^{−x²/2}]_{−∞}^{∞} = 0.

In the third equality, we utilize: d(e^{−x²/2})/dx = −x e^{−x²/2}.
Variance of X is:

   V(X) = ∫ (x − µ)² f(x) dx = ∫ x² f(x) dx − µ² = ∫_{−∞}^{∞} x² (1/√(2π)) e^{−x²/2} dx − µ²
        = [−x (1/√(2π)) e^{−x²/2}]_{−∞}^{∞} + ∫_{−∞}^{∞} (1/√(2π)) e^{−x²/2} dx − µ² = 1.

In the fourth equality, the following formula is used:

   ∫_a^b h′(x)g(x) dx = [h(x)g(x)]_a^b − ∫_a^b h(x)g′(x) dx,

together with: lim_{x→±∞} x (1/√(2π)) e^{−x²/2} = 0.

In the second term of the fourth equality, we utilize the property that the integration of the density function is equal to one.
(2) Using Y = X², mean of Y is: E(Y) = E(X²) = V(X) + (E(X))² = 1, i.e., µy = 1. Variance of Y is:

   V(Y) = E(Y²) − µy² = E(X⁴) − µy² = ∫_{−∞}^{∞} x³ · x (1/√(2π)) e^{−x²/2} dx − µy²
        = [−x³ (1/√(2π)) e^{−x²/2}]_{−∞}^{∞} + 3 ∫_{−∞}^{∞} x² (1/√(2π)) e^{−x²/2} dx − µy²
        = 3E(X²) − µy² = 3 − 1 = 2.  ←− E(X²) = 1, µy = 1.

In the sixth equality, the following formula on integration is utilized:

   ∫_a^b h′(x)g(x) dx = [h(x)g(x)]_a^b − ∫_a^b h(x)g′(x) dx,

together with: lim_{x→±∞} x³ (1/√(2π)) e^{−x²/2} = 0.
(3) For Z = e^X, mean of Z is:

   E(Z) = E(e^X) = ∫_{−∞}^{∞} e^x (1/√(2π)) e^{−x²/2} dx = ∫_{−∞}^{∞} (1/√(2π)) e^{−(x²−2x)/2} dx
        = ∫_{−∞}^{∞} (1/√(2π)) e^{−(x−1)²/2 + 1/2} dx = e^{1/2} ∫_{−∞}^{∞} (1/√(2π)) e^{−(x−1)²/2} dx = e^{1/2}.

In the sixth equality, (1/√(2π)) e^{−(x−1)²/2} is the density of a normal distribution with mean one and variance one, and accordingly its integration is equal to one.
Variance of Z is:

   V(Z) = E(Z − µz)²  ←− µz = E(Z) = e^{1/2}
        = E(Z²) − µz² = E(e^{2X}) − µz² = ∫_{−∞}^{∞} e^{2x} (1/√(2π)) e^{−x²/2} dx − µz²
        = ∫_{−∞}^{∞} (1/√(2π)) e^{−(x²−4x)/2} dx − µz² = ∫_{−∞}^{∞} (1/√(2π)) e^{−(x−2)²/2 + 2} dx − µz²
        = e² ∫_{−∞}^{∞} (1/√(2π)) e^{−(x−2)²/2} dx − µz² = e² − e.

The eighth equality comes from the facts that (1/√(2π)) e^{−(x−2)²/2} is the density of a normal distribution with mean two and variance one and that its integration is equal to one.
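Since Z = e^X with X ∼ N(0, 1) is lognormal, the results E(Z) = e^{1/2} and V(Z) = e² − e can be checked by Monte Carlo (a sketch, not from the text; seed and sample size are arbitrary):

    # Monte Carlo check of Exercise 2 (3); a sketch, not from the text.
    import numpy as np

    rng = np.random.default_rng(1)
    z = np.exp(rng.standard_normal(2_000_000))   # Z = e^X, X ~ N(0, 1)
    print(z.mean(), np.exp(0.5))                 # E(Z) = e^{1/2} = 1.6487
    print(z.var(), np.exp(2) - np.exp(1))        # V(Z) = e^2 - e = 4.6708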
3 The continuous type of random variable X has the following density function:

   f(x) = (1/λ) e^{−x/λ},  if 0 < x,
          0,               otherwise.

(1) Compute mean and variance of X.
(2) Derive the moment-generating function of X.
(3) Let X1, X2, · · ·, Xn be the random variables, which are mutually independently distributed and have the density function shown above. Prove that the density function of Y = X1 + X2 + · · · + Xn is given by the chi-square distribution with 2n degrees of freedom when λ = 2. Note that the chi-square distribution with m degrees of freedom is given by:

   f(x) = (1/(2^{m/2} Γ(m/2))) x^{m/2−1} e^{−x/2},  if x > 0,
          0,                                        otherwise.
[Answer]
(1) Mean of X is:

   E(X) = ∫ x f(x) dx = ∫_0^∞ x (1/λ) e^{−x/λ} dx = [−x e^{−x/λ}]_0^∞ + ∫_0^∞ e^{−x/λ} dx = [−λ e^{−x/λ}]_0^∞ = λ,

where lim_{x→∞} x e^{−x/λ} = 0 and lim_{x→∞} e^{−x/λ} = 0 are utilized.
Variance of X is:

   V(X) = ∫ (x − µ)² f(x) dx = ∫ x² f(x) dx − µ²  ←− µ = E(X) = λ
        = ∫_0^∞ x² (1/λ) e^{−x/λ} dx − µ² = [−x² e^{−x/λ}]_0^∞ + 2 ∫_0^∞ x e^{−x/λ} dx − µ²
        = [−x² e^{−x/λ}]_0^∞ + 2λ ∫_0^∞ x (1/λ) e^{−x/λ} dx − µ²
        = 2λE(X) − µ²  ←− µ = E(X) = λ
        = 2λ² − λ² = λ².
In the third equality, we utilize:

   ∫_a^b h′(x)g(x) dx = [h(x)g(x)]_a^b − ∫_a^b h(x)g′(x) dx,

where g(x) = x² and h′(x) = (1/λ) e^{−x/λ}.

In the sixth equality, the following formulas are used:

   lim_{x→∞} x² e^{−x/λ} = 0,   µ = E(X) = ∫_0^∞ x (1/λ) e^{−x/λ} dx.
(2) The moment-generating function of X is:

   φ(θ) = E(e^{θX}) = ∫_0^∞ e^{θx} (1/λ) e^{−x/λ} dx = (1/λ) · 1/(1/λ − θ) ∫_0^∞ (1/λ − θ) e^{−(1/λ−θ)x} dx = 1/(1 − λθ),

for θ < 1/λ. In the third equality, the integrand corresponds to the exponential density in which 1/λ in f(x) is replaced by 1/λ − θ, and accordingly its integration is equal to one.
(3) We want to show that the moment-generating function of Y is equivalent to that of a chi-square distribution with 2n degrees of freedom. When λ = 2, the moment-generating function of each Xi is:

   φi(θ) = 1/(1 − 2θ) = φ(θ),

and, since X1, X2, · · ·, Xn are mutually independent, the moment-generating function of Y, φy(θ), is:

   φy(θ) = ∏_{i=1}^n φi(θ) = (1 − 2θ)^{−n}.
A chi-square distribution with m degrees of freedom is given by:

   f(x) = (1/(2^{m/2} Γ(m/2))) x^{m/2−1} e^{−x/2},  for x > 0.

The moment-generating function of the above density function, φχ²(θ), is:

   φχ²(θ) = E(e^{θX}) = ∫_0^∞ e^{θx} (1/(2^{m/2} Γ(m/2))) x^{m/2−1} e^{−x/2} dx
          = ∫_0^∞ (1/(2^{m/2} Γ(m/2))) x^{m/2−1} e^{−(1−2θ)x/2} dx
          = ∫_0^∞ (1/(2^{m/2} Γ(m/2))) (y/(1 − 2θ))^{m/2−1} e^{−y/2} (1/(1 − 2θ)) dy
          = (1/(1 − 2θ))^{m/2} ∫_0^∞ (1/(2^{m/2} Γ(m/2))) y^{m/2−1} e^{−y/2} dy = (1/(1 − 2θ))^{m/2}.

In the fourth equality, use y = (1 − 2θ)x. In the sixth equality, since the function in the integration corresponds to the chi-square distribution with m degrees of freedom, the integration is one. Thus, φy(θ) is equivalent to φχ²(θ) for m = 2n. That is, φy(θ) is the moment-generating function of a chi-square distribution with 2n degrees of freedom. Therefore, Y ∼ χ²(2n).
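The conclusion Y ∼ χ²(2n) can also be checked by simulation (a sketch, not from the text; n, the seed and the sample size are arbitrary choices):

    # Check Exercise 3 (3) by simulation; a sketch, not from the text.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    n = 5
    y = rng.exponential(scale=2.0, size=(200_000, n)).sum(axis=1)  # λ = 2
    print(stats.kstest(y, "chi2", args=(2 * n,)))  # large p-value: Y ~ χ²(2n)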
4 The continuous type of random variable X has the following density function:

   f(x) = 1,  if 0 < x < 1,
          0,  otherwise.

(2) When Y = −2 log X, derive the moment-generating function of Y. Note that the log represents the natural logarithm (i.e., y = −2 log x is equivalent to x = e^{−y/2}).
(3) Let Y1 and Y2 be the random variables which have the density function obtained in (2). Suppose that Y1 is independent of Y2. When Z = Y1 + Y2, compute the density function of Z.
[Answer]
(3) Let Y1 and Y2 be the random variables which have the density function obtained from (2). And, assume that Y1 is independent of Y2. For Z = Y1 + Y2, we want to have the density function of Z. The moment-generating function of Y obtained in (2) is φy(θ) = (1 − 2θ)^{−1}, and therefore the moment-generating function of Z is:

   φz(θ) = φy1(θ)φy2(θ) = (1 − 2θ)^{−2},

which is the moment-generating function of a chi-square distribution with 4 degrees of freedom. Therefore, the density function of Z is:

   g(z) = (1/4) z e^{−z/2},  for z > 0.
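The result Z ∼ χ²(4) can be checked by simulation (a sketch, not from the text; seed and sample size are arbitrary):

    # Check Exercise 4 (3) by simulation; a sketch, not from the text.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    u1, u2 = rng.uniform(size=(2, 200_000))
    z = -2 * np.log(u1) - 2 * np.log(u2)        # Z = Y1 + Y2
    print(stats.kstest(z, "chi2", args=(4,)))   # large p-value: Z ~ χ²(4)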
5 The continuous type of random variable X has the following density function:

   f(x) = (1/(2^{n/2} Γ(n/2))) x^{n/2−1} e^{−x/2},  if x > 0,
          0,                                        otherwise.

Answer the following questions. Γ(a) is called the gamma function, defined as:

   Γ(a) = ∫_0^∞ x^{a−1} e^{−x} dx.

(1) Compute mean and variance of X.
(2) Derive the moment-generating function of X.
[Answer]
E(X²) is obtained as follows:

   E(X²) = ∫_0^∞ x² (1/(2^{n/2} Γ(n/2))) x^{n/2−1} e^{−x/2} dx
         = (2^{(n+4)/2} Γ((n + 4)/2))/(2^{n/2} Γ(n/2)) ∫_0^∞ (1/(2^{n′/2} Γ(n′/2))) x^{n′/2−1} e^{−x/2} dx
         = 4 (n/2)((n + 2)/2) = n(n + 2),

where n′ = n + 4 is set, Γ((n + 4)/2) = ((n + 2)/2)(n/2)Γ(n/2) is used, and the integration corresponds to the chi-square distribution with n′ degrees of freedom and is equal to one. Therefore, V(X) = E(X²) − (E(X))² = n(n + 2) − n² = 2n is obtained, where E(X) = n is utilized.
Similarly, the moment-generating function of X is:

   φ(θ) = E(e^{θX}) = ∫_0^∞ e^{θx} (1/(2^{n/2} Γ(n/2))) x^{n/2−1} e^{−x/2} dx
        = (1/(1 − 2θ))^{n/2} ∫_0^∞ (1/(2^{n/2} Γ(n/2))) y^{n/2−1} exp(−y/2) dy = (1/(1 − 2θ))^{n/2}.

In the above, y = (1 − 2θ)x is used. Note that dx/dy = (1 − 2θ)^{−1}. The last integration corresponds to the chi-square distribution with n degrees of freedom and is equal to one.
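The moments E(X) = n, V(X) = 2n and E(X²) = n(n + 2) can be checked by simulation (a sketch, not from the text; n, seed and sample size are arbitrary):

    # Check Exercise 5 by simulation; a sketch, not from the text.
    import numpy as np

    rng = np.random.default_rng(4)
    n = 6
    x = rng.chisquare(df=n, size=2_000_000)
    print(x.mean(), n)                  # E(X) = n
    print(x.var(), 2 * n)               # V(X) = 2n
    print((x**2).mean(), n * (n + 2))   # E(X²) = n(n + 2)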
6 The continuous type of random variables X and Y are mutually independent and assumed to be X ∼ N(0, 1) and Y ∼ N(0, 1). Define U = X/Y. Answer the following questions. When X ∼ N(0, 1), the density function of X is represented as:

   f(x) = (1/√(2π)) e^{−x²/2}.

(1) Derive the density function of U.
[Answer]
The density functions of X and Y are:

   f(x) = (1/√(2π)) exp(−x²/2),  −∞ < x < ∞,
   g(y) = (1/√(2π)) exp(−y²/2),  −∞ < y < ∞.
Since X is independent of Y, the joint density of X and Y is:

   h(x, y) = f(x)g(y) = (1/√(2π)) exp(−x²/2) · (1/√(2π)) exp(−y²/2) = (1/(2π)) exp(−(x² + y²)/2).

Using u = x/y and v = y, the transformation of the variables is performed. For x = uv and y = v, we have the Jacobian:

   J = | ∂x/∂u  ∂x/∂v | = | v  u | = v.
       | ∂y/∂u  ∂y/∂v |   | 0  1 |
Using transformation of variables, the joint density of U and V, s(u, v), is given by:

   s(u, v) = h(uv, v)|J| = (1/(2π)) exp(−v²(1 + u²)/2) |v|.

Integrating with respect to v, the marginal density of U is:

   s(u) = ∫_{−∞}^{∞} s(u, v) dv = 2 ∫_0^∞ (1/(2π)) v exp(−v²(1 + u²)/2) dv
        = (1/π) [−exp(−v²(1 + u²)/2)/(1 + u²)]_0^∞ = 1/(π(1 + u²)),  −∞ < u < ∞,

which is the Cauchy distribution.
7 The continuous type of random variables X and Y have the following joint density function:

   f(x, y) = x + y,  if 0 < x < 1 and 0 < y < 1,
             0,      otherwise.

Answer the following questions.
[Answer]
(2) We want to obtain the correlation coefficient between X and Y, which is represented as: ρ = Cov(X, Y)/√(V(X)V(Y)). Therefore, E(X), E(Y), V(X), V(Y) and Cov(X, Y) have to be computed.
E(X) is:

   E(X) = ∫_0^1 ∫_0^1 x f(x, y) dx dy = ∫_0^1 ∫_0^1 x(x + y) dx dy
        = ∫_0^1 [(1/3)x³ + (1/2)yx²]_0^1 dy = ∫_0^1 (1/3 + (1/2)y) dy
        = [(1/3)y + (1/4)y²]_0^1 = 7/12.
In the case where x and y are exchangeable, the functional form of f(x, y) is unchanged. Therefore, E(Y) is:

   E(Y) = E(X) = 7/12.
For V(X),

   V(X) = E((X − µ)²)  ←− µ = E(X) = 7/12
        = E(X²) − µ² = ∫_0^1 ∫_0^1 x² f(x, y) dx dy − µ²
        = ∫_0^1 ∫_0^1 x²(x + y) dx dy − µ² = ∫_0^1 [(1/4)x⁴ + (1/3)yx³]_0^1 dy − µ²
        = ∫_0^1 (1/4 + (1/3)y) dy − µ² = [(1/4)y + (1/6)y²]_0^1 − µ²
        = 5/12 − (7/12)² = 11/144.
For V(Y),

   V(Y) = V(X) = 11/144.
For Cov(X, Y),

   Cov(X, Y) = E((X − µx)(Y − µy)) = E(XY) − µxµy = 1/3 − (7/12)(7/12) = −1/144,

where

   µx = E(X) = 7/12,   µy = E(Y) = 7/12,   E(XY) = ∫_0^1 ∫_0^1 xy(x + y) dx dy = 1/3.
Therefore, ρ is:

   ρ = Cov(X, Y)/√(V(X)V(Y)) = (−1/144)/√((11/144)(11/144)) = −1/11.
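The moments above can be checked by numerical integration (a sketch, not from the text; the variable names are illustrative):

    # Check Exercise 7 (2) numerically; a sketch, not from the text.
    from scipy import integrate

    f = lambda y, x: x + y                          # joint density on (0,1)²
    ex = integrate.dblquad(lambda y, x: x * f(y, x),
                           0, 1, lambda x: 0, lambda x: 1)[0]      # 7/12
    ex2 = integrate.dblquad(lambda y, x: x**2 * f(y, x),
                            0, 1, lambda x: 0, lambda x: 1)[0]     # 5/12
    exy = integrate.dblquad(lambda y, x: x * y * f(y, x),
                            0, 1, lambda x: 0, lambda x: 1)[0]     # 1/3
    vx = ex2 - ex**2                                # 11/144
    cov = exy - ex * ex                             # -1/144; E(Y) = E(X) by symmetry
    print(cov / vx)                                 # ρ = -1/11 since V(Y) = V(X)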
8 The discrete type of random variable X has the following density function:

   f(x) = e^{−λ} λ^x / x!,   x = 0, 1, 2, · · · .

Answer the following questions.

(1) Prove ∑_{x=0}^{∞} f(x) = 1.
(2) Compute the moment-generating function of X.
(3) Based on the moment-generating function, compute mean and variance of X.
[Answer]
(1) We can show ∑_{x=0}^{∞} f(x) = 1 as:

   ∑_{x=0}^{∞} f(x) = ∑_{x=0}^{∞} e^{−λ} λ^x / x! = e^{−λ} ∑_{x=0}^{∞} λ^x / x! = e^{−λ} e^{λ} = 1.

Note that e^x = ∑_{k=0}^{∞} x^k / k!, because we have f^{(k)}(x) = e^x for f(x) = e^x. As shown in Appendix 1.3, the formula of Taylor series expansion is:

   f(x) = ∑_{k=0}^{∞} (1/k!) f^{(k)}(x₀)(x − x₀)^k.

The Taylor series expansion around x = 0 is:

   f(x) = ∑_{k=0}^{∞} (1/k!) f^{(k)}(0) x^k = ∑_{k=0}^{∞} (1/k!) x^k = ∑_{k=0}^{∞} x^k / k!.
(2) The moment-generating function of X is:

   φ(θ) = E(e^{θX}) = ∑_{x=0}^{∞} e^{θx} e^{−λ} λ^x / x! = e^{−λ} ∑_{x=0}^{∞} (λe^θ)^x / x! = e^{−λ} e^{λe^θ} = exp(λ(e^θ − 1)).
(3) Based on the moment-generating function, we obtain mean and variance of X. For mean, because of φ(θ) = exp(λ(e^θ − 1)), φ′(θ) = λe^θ exp(λ(e^θ − 1)) and E(X) = φ′(0), we obtain:

   E(X) = φ′(0) = λ.

For variance, from V(X) = E(X²) − (E(X))², we obtain E(X²). Note that E(X²) = φ′′(0) and φ′′(θ) = (1 + λe^θ)λe^θ exp(λ(e^θ − 1)). Therefore,

   E(X²) = φ′′(0) = (1 + λ)λ,   V(X) = E(X²) − (E(X))² = λ(1 + λ) − λ² = λ.
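The equalities E(X) = λ and V(X) = λ can be checked by simulation (a sketch, not from the text; λ, seed and sample size are arbitrary):

    # Check Exercise 8 (3) by simulation; a sketch, not from the text.
    import numpy as np

    rng = np.random.default_rng(6)
    lam = 3.5
    x = rng.poisson(lam, size=2_000_000)
    print(x.mean(), lam)   # E(X) = λ
    print(x.var(), lam)    # V(X) = λ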
(3) We want to test the null hypothesis H0 : µ = µ0 by the likelihood ratio test.
Obtain the test statistic and explain the testing procedure.
[Answer]
The likelihood function of (µ, σ²) is:

   l(µ, σ²) = ∏_{i=1}^n (2πσ²)^{−1/2} exp(−(xi − µ)²/(2σ²)) = (2πσ²)^{−n/2} exp(−(1/(2σ²)) ∑_{i=1}^n (xi − µ)²).
Taking the logarithm, we have:

   log l(µ, σ²) = −(n/2) log(2π) − (n/2) log(σ²) − (1/(2σ²)) ∑_{i=1}^n (xi − µ)².

The derivatives of the log-likelihood function log l(µ, σ²) with respect to µ and σ² are set to be zero:

   ∂ log l(µ, σ²)/∂µ = (1/σ²) ∑_{i=1}^n (xi − µ) = 0,
   ∂ log l(µ, σ²)/∂σ² = −n/(2σ²) + (1/(2σ⁴)) ∑_{i=1}^n (xi − µ)² = 0.
Solving the two equations, we have the solution of (µ, σ²), denoted by (µ̂, σ̂²):

   µ̂ = (1/n) ∑_{i=1}^n xi = x̄,   σ̂² = (1/n) ∑_{i=1}^n (xi − µ̂)² = (1/n) ∑_{i=1}^n (xi − x̄)².

Therefore, the maximum likelihood estimators of µ and σ², (µ̂, σ̂²), are as follows:

   X̄,   S∗∗² = (1/n) ∑_{i=1}^n (Xi − X̄)².
Next, we check whether S∗∗² is unbiased. E(S∗∗²) is:

   E(S∗∗²) = E((1/n) ∑_{i=1}^n (Xi − X̄)²) = (1/n) E(∑_{i=1}^n ((Xi − µ) − (X̄ − µ))²)
           = (1/n) E(∑_{i=1}^n (Xi − µ)² − 2n(X̄ − µ)² + n(X̄ − µ)²)
           = (1/n) E(∑_{i=1}^n (Xi − µ)² − n(X̄ − µ)²)
           = (1/n) E(∑_{i=1}^n (Xi − µ)²) − (1/n) E(n(X̄ − µ)²)
           = (1/n) ∑_{i=1}^n E((Xi − µ)²) − E((X̄ − µ)²)
           = (1/n) ∑_{i=1}^n V(Xi) − V(X̄) = (1/n) ∑_{i=1}^n σ² − σ²/n
           = σ² − (1/n)σ² = ((n − 1)/n)σ² ≠ σ².

Therefore, S∗∗² is not unbiased. Based on S∗∗², we obtain the unbiased estimator of σ². Multiplying n/(n − 1) on both sides of E(S∗∗²) = σ²(n − 1)/n, we obtain:

   (n/(n − 1)) E(S∗∗²) = σ²,

and accordingly the unbiased estimator of σ² is:

   (n/(n − 1)) S∗∗² = (1/(n − 1)) ∑_{i=1}^n (Xi − X̄)² = S².
(3) The likelihood ratio is defined as:

   λ = max_{σ²} l(µ0, σ²) / max_{µ,σ²} l(µ, σ²) = l(µ0, σ̃²) / l(µ̂, σ̂²).

As n goes to infinity, it is known that we have:

   −2 log λ −→ χ²(1).
The log-likelihood function is:

   log l(µ, σ²) = −(n/2) log(2π) − (n/2) log(σ²) − (1/(2σ²)) ∑_{i=1}^n (xi − µ)².
On the numerator, under the restriction µ = µ0, log l(µ0, σ²) is maximized with respect to σ² as follows:

   ∂ log l(µ0, σ²)/∂σ² = −n/(2σ²) + (1/(2σ⁴)) ∑_{i=1}^n (xi − µ0)² = 0.

Solving the above equation, we obtain:

   σ̃² = (1/n) ∑_{i=1}^n (xi − µ0)².
Then, l(µ0, σ̃²) is:

   l(µ0, σ̃²) = (2πσ̃²)^{−n/2} exp(−(1/(2σ̃²)) ∑_{i=1}^n (xi − µ0)²) = (2πσ̃²)^{−n/2} exp(−n/2).
On the denominator, l(µ, σ²) is maximized at:

   µ̂ = (1/n) ∑_{i=1}^n xi,   σ̂² = (1/n) ∑_{i=1}^n (xi − µ̂)²,

so that l(µ̂, σ̂²) = (2πσ̂²)^{−n/2} exp(−n/2). Therefore, the likelihood ratio reduces to λ = (σ̃²/σ̂²)^{−n/2}, i.e., −2 log λ = n log(σ̃²/σ̂²).
When −2 log λ > χ²α(1), the null hypothesis H0 : µ = µ0 is rejected by the significance level α, where χ²α(1) denotes the 100 × α percent point of the chi-square distribution with one degree of freedom.
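A sketch of this testing procedure in Python (assuming the reduction −2 log λ = n log(σ̃²/σ̂²) derived above; the function name lr_test and the simulated data are illustrative, not from the text):

    # Likelihood ratio test of H0: µ = µ0 for normal data; a sketch.
    import numpy as np
    from scipy import stats

    def lr_test(x, mu0, alpha=0.05):
        n = len(x)
        sig2_tilde = np.mean((x - mu0) ** 2)       # restricted MLE of σ²
        sig2_hat = np.mean((x - x.mean()) ** 2)    # unrestricted MLE of σ²
        stat = n * np.log(sig2_tilde / sig2_hat)   # -2 log λ
        return stat, stat > stats.chi2.ppf(1 - alpha, df=1)

    rng = np.random.default_rng(7)
    print(lr_test(rng.normal(1.0, 2.0, size=100), mu0=0.0))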
(1) The discrete type of random variable X is assumed to be Bernoulli. The Bernoulli distribution is given by:

   f(x) = p^x (1 − p)^{1−x},   x = 0, 1.

Let X1, X2, · · ·, Xn be random variables drawn from the Bernoulli trials. Compute the maximum likelihood estimator of p.

(2) Let Y be a random variable from a binomial distribution, denoted by f(y), which is represented as:

   f(y) = (n!/(y!(n − y)!)) p^y (1 − p)^{n−y},   y = 0, 1, 2, · · ·, n.

Show that Y/n converges in probability to p as n goes to infinity.

(3) For the random variable Y in the question (2), let us define:

   Zn ≡ (Y − np)/√(np(1 − p)).

Show that the distribution of Zn converges to N(0, 1) as n goes to infinity.

(4) The continuous type of random variable X has the following density function:

   f(x) = (1/(2^{n/2} Γ(n/2))) x^{n/2−1} e^{−x/2},  if x > 0,
          0,                                        otherwise.

Show that X/n converges in probability to one as n goes to infinity.
[Answer]
(1) The joint probability function of X1, X2, · · ·, Xn is:

   f(x1, x2, · · ·, xn; p) = ∏_{i=1}^n f(xi; p) = ∏_{i=1}^n p^{xi} (1 − p)^{1−xi} = p^{∑i xi} (1 − p)^{n−∑i xi} = l(p).

The derivative of the log-likelihood function log l(p) with respect to p is set to be zero:

   d log l(p)/dp = (∑i xi)/p − (n − ∑i xi)/(1 − p) = (∑i xi − np)/(p(1 − p)) = 0.

Solving the above equation, we obtain:

   p = (1/n) ∑_{i=1}^n xi = x̄.

Replacing xi by Xi, the maximum likelihood estimator of p is:

   p̂ = (1/n) ∑_{i=1}^n Xi = X̄.
(2) Since Y has a binomial distribution, E(Y) = np and V(Y) = np(1 − p). Therefore, we have:

   E(Y/n) = (1/n)E(Y) = p,   V(Y/n) = (1/n²)V(Y) = p(1 − p)/n.
Here, when g(X) = (X − E(X))² and k = ε² are set in Chebyshev's inequality, we can rewrite as:

   P(|X − E(X)| ≥ ε) ≤ V(X)/ε²,

where ε > 0. Replacing X by Y/n, we apply Chebyshev's inequality:

   P(|Y/n − E(Y/n)| ≥ ε) ≤ V(Y/n)/ε².

That is, as n −→ ∞,

   P(|Y/n − p| ≥ ε) ≤ p(1 − p)/(nε²) −→ 0.

Therefore, we obtain:

   Y/n −→ p.
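The convergence Y/n −→ p can be illustrated by simulation (a sketch, not from the text; p, seed and sample sizes are arbitrary):

    # Illustrate Y/n → p; a sketch, not from the text.
    import numpy as np

    rng = np.random.default_rng(8)
    p = 0.3
    for n in (10, 1_000, 100_000):
        y = rng.binomial(n, p)
        print(n, y / n)   # Y/n approaches p = 0.3 as n grows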
(3) Let X1, X2, · · ·, Xn be Bernoulli random variables, where P(Xi = x) = p^x (1 − p)^{1−x} for x = 0, 1. Define Y = X1 + X2 + · · · + Xn. Because Y has a binomial distribution, Y/n is taken as the sample mean from X1, X2, · · ·, Xn, i.e., Y/n = (1/n) ∑_{i=1}^n Xi. Therefore, using E(Y/n) = p and V(Y/n) = p(1 − p)/n, by the central limit theorem, as n −→ ∞, we have:

   (Y/n − p)/√(p(1 − p)/n) −→ N(0, 1).

Moreover,

   Zn ≡ (Y − np)/√(np(1 − p)) = (Y/n − p)/√(p(1 − p)/n).

Therefore,

   Zn −→ N(0, 1).
(4) When X ∼ χ²(n), we have E(X) = n and V(X) = 2n. Therefore, E(X/n) = 1 and V(X/n) = 2/n. Apply Chebyshev's inequality. Then, we have:

   P(|X/n − E(X/n)| ≥ ε) ≤ V(X/n)/ε²,

i.e.,

   P(|X/n − 1| ≥ ε) ≤ 2/(nε²) −→ 0.

Therefore,

   X/n −→ 1.
[Answer]
(1) We want the λ which maximizes log l(λ). Solving the following equation:

   d log l(λ)/dλ = n/λ − ∑_{i=1}^n xi = 0,

and replacing xi by Xi, the maximum likelihood estimator of λ, denoted by λ̂, is:

   λ̂ = n / ∑_{i=1}^n Xi.
(2) X1, X2, · · ·, Xn are mutually independent. Let f(xi; λ) be the density function of Xi. For the maximum likelihood estimator of λ, i.e., λ̂n, as n −→ ∞, we have the following property:

   √n (λ̂n − λ) −→ N(0, σ²(λ)),

where

   σ²(λ) = 1 / E((d log f(X; λ)/dλ)²).

Note that E(X) = 1/λ and E(X²) = 2/λ². Therefore, we have:

   σ²(λ) = 1 / E((d log f(X; λ)/dλ)²) = λ².

Accordingly, when n is large, λ̂n is approximately distributed as:

   λ̂n ∼ N(λ, λ²/n).

Thus, as n goes to infinity, mean and variance are given by λ and λ²/n.
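The asymptotic law λ̂n ∼ N(λ, λ²/n) can be checked by simulation (a sketch, not from the text; λ, n, seed and replication count are arbitrary):

    # Check λ̂ = n/ΣXi and its asymptotic variance λ²/n; a sketch.
    import numpy as np

    rng = np.random.default_rng(9)
    lam, n, reps = 2.0, 400, 20_000
    x = rng.exponential(scale=1 / lam, size=(reps, n))  # f(x; λ) = λe^{-λx}
    lam_hat = n / x.sum(axis=1)
    print(lam_hat.mean(), lam)          # approximately λ
    print(lam_hat.var(), lam**2 / n)    # approximately λ²/n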
12 The n random variables X1, X2, · · ·, Xn are mutually independently distributed with mean µ and variance σ². Consider the following two estimators of µ:

   X̄ = (1/n) ∑_{i=1}^n Xi,   X̃ = (1/2)(X1 + Xn).

(1) Is X̄ unbiased? How about X̃?
(2) Which is more efficient, X̄ or X̃?
(3) Examine whether X̄ and X̃ are consistent.
[Answer]
e are unbiased.
(1) We check whether X and X
1∑ 1 ∑ 1∑ 1∑
n n n n
E(X) = E( Xi ) = E( Xi ) = E(Xi ) = µ = µ,
n i=1 n i=1 n i=1 n i=1
( )
e = 1 E(X1 ) + E(Xn ) = 1 (µ + µ) = µ.
E(X)
2 2
1∑ 1 ∑ 1 ∑ 1 ∑ 2 σ2
n n n n
V(X) = V( Xi ) = 2 V( Xi ) = 2 V(Xi ) = 2 σ = ,
n i=1 n i=1
n i=1 n i=1 n
( )
e = 1 V(X1 ) + V(Xn ) = 1 (σ2 + σ2 ) = σ .
2
V(X)
4 4 2
e X is more efficient than X
Therefore, because of V(X) < V(X), e when n > 2.
(3) In Chebyshev's inequality:

   P(|X̄ − E(X̄)| ≥ ε) ≤ V(X̄)/ε²,

i.e.,

   P(|X̄ − µ| ≥ ε) ≤ σ²/(nε²) −→ 0.

Therefore, we obtain:

   X̄ −→ µ.

Next, for X̃, we have:

   P(|X̃ − E(X̃)| ≥ ε) ≤ V(X̃)/ε²,

i.e.,

   P(|X̃ − µ| ≥ ε) ≤ σ²/(2ε²),

which does not go to zero as n −→ ∞. X̄ is a consistent estimator of µ, but X̃ is not consistent.
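The contrast between the two estimators can be seen by simulation (a sketch, not from the text; the normal population and all constants are arbitrary choices):

    # Compare X̄ with X̃ = (X1 + Xn)/2 by simulation; a sketch.
    import numpy as np

    rng = np.random.default_rng(10)
    mu, sigma, n, reps = 5.0, 2.0, 50, 100_000
    x = rng.normal(mu, sigma, size=(reps, n))
    xbar = x.mean(axis=1)
    xtil = 0.5 * (x[:, 0] + x[:, -1])
    print(xbar.mean(), xtil.mean())   # both approximately µ (unbiased)
    print(xbar.var(), sigma**2 / n)   # σ²/n, shrinks as n grows
    print(xtil.var(), sigma**2 / 2)   # σ²/2, does not shrink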
(3) Test the null hypothesis H0 : µ = 24 and the alternative hypothesis H1 : µ > 24
by the significance level 0.10. How about 0.05?
[Answer]
(1) The unbiased estimators of µ and σ², denoted by X̄ and S², are given by:

   X̄ = (1/n) ∑_{i=1}^n Xi,   S² = (1/(n − 1)) ∑_{i=1}^n (Xi − X̄)².

Their observed values are:

   x̄ = (1/n) ∑_{i=1}^n xi,   s² = (1/(n − 1)) ∑_{i=1}^n (xi − x̄)².
Therefore,

   x̄ = (1/n) ∑_{i=1}^n xi = (1/9)(21 + 23 + 32 + 20 + 36 + 27 + 26 + 28 + 30) = 27,

   s² = (1/(n − 1)) ∑_{i=1}^n (xi − x̄)²
      = (1/8)((21 − 27)² + (23 − 27)² + (32 − 27)² + (20 − 27)² + (36 − 27)² + (27 − 27)² + (26 − 27)² + (28 − 27)² + (30 − 27)²)
      = (1/8)(36 + 16 + 25 + 49 + 81 + 0 + 1 + 1 + 9) = 27.25.
(3) We test the null hypothesis H0 : µ = 24 and the alternative hypothesis H1 : µ > 24 by the significance levels 0.10 and 0.05. The distribution of X̄ is:

   (X̄ − µ)/(S/√n) ∼ t(n − 1).

Therefore, under the null hypothesis H0 : µ = µ0, we obtain:

   (X̄ − µ0)/(S/√n) ∼ t(n − 1).

Note that µ is replaced by µ0. For the alternative hypothesis H1 : µ > µ0, since we have:

   P((X̄ − µ0)/(S/√n) > tα(n − 1)) = α,

we reject the null hypothesis H0 : µ = µ0 by the significance level α when we have:

   (x̄ − µ0)/(s/√n) > tα(n − 1).

Substitute x̄ = 27, s² = 27.25, µ0 = 24, n = 9, t0.10(8) = 1.397 and t0.05(8) = 1.860 into the above formula. Then, we obtain:

   (x̄ − µ0)/(s/√n) = (27 − 24)/√(27.25/9) = 1.724 > t0.10(8) = 1.397.

Therefore, we reject the null hypothesis H0 : µ = 24 by the significance level α = 0.10. And we obtain:

   (x̄ − µ0)/(s/√n) = (27 − 24)/√(27.25/9) = 1.724 < t0.05(8) = 1.860.

Therefore, the null hypothesis H0 : µ = 24 is accepted by the significance level α = 0.05.
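The statistic and the two percent points can be reproduced numerically (a sketch, not from the text):

    # One-sided t test computation for Exercise 13 (3); a sketch.
    import numpy as np
    from scipy import stats

    x = np.array([21, 23, 32, 20, 36, 27, 26, 28, 30])
    t = (x.mean() - 24) / (x.std(ddof=1) / np.sqrt(len(x)))
    print(t)                                                 # 1.724
    print(stats.t.ppf(0.90, df=8), stats.t.ppf(0.95, df=8))  # 1.397, 1.860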
14 The 16 samples X1, X2, · · ·, X16 are randomly drawn from the normal population with mean µ and known variance σ² = 2². The sample average is given by x̄ = 36. Then, answer the following questions.

(1) Obtain the 95% confidence interval of µ.
(2) Test the null hypothesis H0 : µ = 35 and the alternative hypothesis H1 : µ = 36.5 by the significance level 0.05.
(3) Compute the power of the test in the above question (2).
[Answer]
(1) The distribution of X̄ is:

   (X̄ − µ)/(σ/√n) ∼ N(0, 1).

Therefore,

   P(−zα/2 < (X̄ − µ)/(σ/√n) < zα/2) = 1 − α,

where zα/2 denotes the 100 × α/2 percent point, which is obtained given probability α. Therefore,

   P(X̄ − zα/2 σ/√n < µ < X̄ + zα/2 σ/√n) = 1 − α.

Substituting x̄ = 36, σ = 2, n = 16 and z0.025 = 1.960, the 95% confidence interval of µ is:

   (36 − 1.960 × 2/√16, 36 + 1.960 × 2/√16) = (35.02, 36.98).
(2) The distribution of X̄ is:

   (X̄ − µ)/(σ/√n) ∼ N(0, 1).

Under the null hypothesis H0 : µ = µ0, we have:

   (X̄ − µ0)/(σ/√n) ∼ N(0, 1).

For the alternative hypothesis H1 : µ > µ0, we obtain:

   P((X̄ − µ0)/(σ/√n) > zα) = α.

If we have:

   (x̄ − µ0)/(σ/√n) > zα,

the null hypothesis H0 : µ = µ0 is rejected by the significance level α. Substituting x̄ = 36, σ² = 2², n = 16 and z0.05 = 1.645, we obtain:

   (x̄ − µ0)/(σ/√n) = (36 − 35)/(2/√16) = 2 > zα = 1.645.

The null hypothesis H0 : µ = 35 is rejected by the significance level α = 0.05.
(3) We compute the power of the test in the question (2). The power of the test is the probability which rejects the null hypothesis under the alternative hypothesis. That is, under the null hypothesis H0 : µ = µ0, the region which rejects the null hypothesis is: X̄ > µ0 + zα σ/√n, because

   P((X̄ − µ0)/(σ/√n) > zα) = α.

We compute the probability which rejects the null hypothesis under the alternative hypothesis H1 : µ = µ1. That is, under the alternative hypothesis H1 : µ = µ1, the following probability is known as the power of the test:

   P(X̄ > µ0 + zα σ/√n),

where, under H1 : µ = µ1,

   (X̄ − µ1)/(σ/√n) ∼ N(0, 1).

Substituting σ = 2, n = 16, µ0 = 35, µ1 = 36.5 and zα = 1.645, we obtain:

   P((X̄ − µ1)/(σ/√n) > (35 − 36.5)/(2/√16) + 1.645) = P((X̄ − µ1)/(σ/√n) > −1.355)
      = 1 − P((X̄ − µ1)/(σ/√n) > 1.355)
      = 1 − 0.0877 = 0.9123.
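The power 0.9123 can be reproduced directly from the normal survival function (a sketch, not from the text):

    # Verify the power in Exercise 14 (3); a sketch.
    from scipy import stats

    z = (35 - 36.5) / (2 / 16**0.5) + 1.645   # -1.355
    print(stats.norm.sf(z))                   # P(Z > -1.355) = 0.9123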
The discrete type of random variable X is assumed to be Poisson, i.e.:

   P(X = x) = f(x; λ) = λ^x e^{−λ} / x!,   x = 0, 1, 2, · · · .

Then, answer the following questions.

(1) Obtain the maximum likelihood estimator of λ, denoted by λ̂.
(2) Check whether λ̂ is unbiased.
(3) Check whether λ̂ is efficient.
(4) Check whether λ̂ is consistent.
[Answer]
(1) We obtain the maximum likelihood estimator of λ, denoted by λ̂. The Poisson distribution is:

   P(X = x) = f(x; λ) = λ^x e^{−λ} / x!,   x = 0, 1, 2, · · · .

The likelihood function is:

   l(λ) = ∏_{i=1}^n f(xi; λ) = ∏_{i=1}^n λ^{xi} e^{−λ} / xi! = λ^{∑_{i=1}^n xi} e^{−nλ} / ∏_{i=1}^n xi!.
The derivative of the log-likelihood function with respect to λ is:

   ∂ log l(λ)/∂λ = (1/λ) ∑_{i=1}^n xi − n = 0.

Solving the above equation and replacing xi by Xi, we obtain:

   λ̂ = (1/n) ∑_{i=1}^n Xi = X̄.

(2) λ̂ is unbiased because:

   E(λ̂) = E((1/n) ∑_{i=1}^n Xi) = (1/n) ∑_{i=1}^n E(Xi) = (1/n) ∑_{i=1}^n λ = λ.
(3) We prove that λ̂ is an efficient estimator of λ, where we show that the equality holds in the Cramer-Rao inequality. First, we obtain V(λ̂) as:

   V(λ̂) = V((1/n) ∑_{i=1}^n Xi) = (1/n²) ∑_{i=1}^n V(Xi) = (1/n²) ∑_{i=1}^n λ = λ/n.

Next, the Cramer-Rao lower bound is:

   1/(nE((∂ log f(X; λ)/∂λ)²)) = 1/(nE((∂(X log λ − λ − log X!)/∂λ)²))
      = 1/(nE((X/λ − 1)²)) = λ²/(nE((X − λ)²)) = λ²/(nV(X)) = λ²/(nλ) = λ/n.

Therefore,

   V(λ̂) = 1/(nE((∂ log f(X; λ)/∂λ)²)).

That is, V(λ̂) is equal to the lower bound of the Cramer-Rao inequality. Therefore, λ̂ is efficient.
(4) We show that λ̂ is a consistent estimator of λ. Note as follows:

   E(λ̂) = λ,   V(λ̂) = λ/n.

In Chebyshev's inequality:

   P(|λ̂ − E(λ̂)| ≥ ε) ≤ V(λ̂)/ε²,

E(λ̂) and V(λ̂) are substituted. Then, we have:

   P(|λ̂ − λ| ≥ ε) ≤ λ/(nε²) −→ 0,

which implies that λ̂ is consistent.
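That V(λ̂) attains the Cramer-Rao bound λ/n can be checked by simulation (a sketch, not from the text; λ, n, seed and replication count are arbitrary):

    # Check V(λ̂) = λ/n for the Poisson MLE; a sketch.
    import numpy as np

    rng = np.random.default_rng(11)
    lam, n, reps = 4.0, 200, 100_000
    lam_hat = rng.poisson(lam, size=(reps, n)).mean(axis=1)
    print(lam_hat.mean(), lam)      # unbiased: approximately λ
    print(lam_hat.var(), lam / n)   # attains the lower bound λ/n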
(2) Define:

   Z = (X̄ − µ)/(σ/√n).

Show that Z is normally distributed with mean zero and variance one.

(4) Prove that S² is a consistent estimator of σ².
[Answer]
(1) The distribution of the sample mean X̄ = (1/n) ∑_{i=1}^n Xi is derived using the moment-generating function. Note that for X ∼ N(µ, σ²) the moment-generating function φ(θ) is:

   φ(θ) ≡ E(e^{θX}) = ∫_{−∞}^{∞} e^{θx} f(x) dx = ∫_{−∞}^{∞} e^{θx} (1/√(2πσ²)) e^{−(x−µ)²/(2σ²)} dx
        = ∫_{−∞}^{∞} (1/√(2πσ²)) e^{−(x−µ)²/(2σ²)+θx} dx
        = ∫_{−∞}^{∞} (1/√(2πσ²)) e^{−(x²−2(µ+σ²θ)x+µ²)/(2σ²)} dx
        = ∫_{−∞}^{∞} (1/√(2πσ²)) e^{−(x−(µ+σ²θ))²/(2σ²)+(µθ+σ²θ²/2)} dx
        = e^{µθ+σ²θ²/2} ∫_{−∞}^{∞} (1/√(2πσ²)) e^{−(x−(µ+σ²θ))²/(2σ²)} dx = exp(µθ + (1/2)σ²θ²).

In the integration above, N(µ + σ²θ, σ²) is utilized. Therefore, we have:

   φi(θ) = exp(µθ + (1/2)σ²θ²).

Now, consider the moment-generating function of X̄, denoted by φx̄(θ):

   φx̄(θ) ≡ E(e^{θX̄}) = E(e^{(θ/n)∑_{i=1}^n Xi}) = E(∏_{i=1}^n e^{(θ/n)Xi}) = ∏_{i=1}^n E(e^{(θ/n)Xi}) = ∏_{i=1}^n φi(θ/n)
          = ∏_{i=1}^n exp(µ(θ/n) + (1/2)σ²(θ/n)²) = exp(µθ + (1/2)(σ²/n)θ²),

which is equivalent to the moment-generating function of the normal distribution with mean µ and variance σ²/n. Therefore, X̄ ∼ N(µ, σ²/n).
(2) The moment-generating function of Z, denoted by φz(θ), is:

   φz(θ) ≡ E(e^{θZ}) = E(exp(θ(X̄ − µ)/(σ/√n)))
         = exp(−θµ/(σ/√n)) E(exp((θ/(σ/√n))X̄))
         = exp(−θµ/(σ/√n)) φx̄(θ/(σ/√n))
         = exp(−θµ/(σ/√n)) exp(µθ/(σ/√n) + (1/2)(σ²/n)(θ/(σ/√n))²) = exp(θ²/2),

which is the moment-generating function of N(0, 1). Therefore, Z ∼ N(0, 1).
(3) First, as preliminaries, we derive mean and variance of the chi-square distribution with m degrees of freedom. The chi-square distribution with m degrees of freedom is:

   f(x) = (1/(2^{m/2} Γ(m/2))) x^{m/2−1} e^{−x/2},  if x > 0.

Its moment-generating function is:

   φ(θ) = (1/(1 − 2θ))^{m/2−1} (1/(1 − 2θ)) ∫_0^∞ (1/(2^{m/2} Γ(m/2))) y^{m/2−1} e^{−y/2} dy = (1 − 2θ)^{−m/2}.

In the fourth equality, use y = (1 − 2θ)x. The first and second derivatives of the moment-generating function are:

   φ′(θ) = m(1 − 2θ)^{−m/2−1},   φ′′(θ) = m(m + 2)(1 − 2θ)^{−m/2−2}.

Therefore, we obtain:

   E(X) = φ′(0) = m,   E(X²) = φ′′(0) = m(m + 2).
Thus, for the chi-square distribution with m degrees of freedom, mean is given by m and variance is m(m + 2) − m² = 2m. Since (n − 1)S²/σ² ∼ χ²(n − 1), we have:

   E((n − 1)S²/σ²) = n − 1,   V((n − 1)S²/σ²) = 2(n − 1),

which implies

   ((n − 1)/σ²) E(S²) = n − 1,   ((n − 1)/σ²)² V(S²) = 2(n − 1),

i.e.,

   E(S²) = σ²,   V(S²) = 2σ⁴/(n − 1).

(4) Substituting E(S²) and V(S²) into Chebyshev's inequality, we have:

   P(|S² − σ²| ≥ ε) ≤ 2σ⁴/((n − 1)ε²) −→ 0.

Therefore, S² is consistent.
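The moments E(S²) = σ² and V(S²) = 2σ⁴/(n − 1) can be checked by simulation (a sketch, not from the text; µ, σ, n, seed and replication count are arbitrary):

    # Check E(S²) and V(S²) for normal samples; a sketch.
    import numpy as np

    rng = np.random.default_rng(12)
    mu, sigma, n, reps = 0.0, 3.0, 25, 200_000
    s2 = rng.normal(mu, sigma, size=(reps, n)).var(axis=1, ddof=1)
    print(s2.mean(), sigma**2)                # unbiased: approximately σ²
    print(s2.var(), 2 * sigma**4 / (n - 1))   # variance → 0, so S² is consistent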