
Exercises and Answers to Chapter 1

1 The continuous random variable X has the following density function:

f(x) = a − x,   if 0 < x < a,
     = 0,       otherwise.

Answer the following questions.

(1) Find a.

(2) Obtain mean and variance of X.

(3) When Y = X 2 , derive the density function of Y.

[Answer]

(1) From the property of the density function, i.e., ∫ f(x) dx = 1, we need to have:

∫ f(x) dx = ∫_0^a (a − x) dx = [ax − (1/2)x²]_0^a = (1/2)a² = 1.

Therefore, a = √2 is obtained, taking into account a > 0.

(2) The definitions of mean and variance are given by: E(X) = ∫ x f(x) dx and
V(X) = ∫ (x − µ)² f(x) dx, where µ = E(X). Therefore, mean of X is:

E(X) = ∫ x f(x) dx = ∫_0^a x(a − x) dx = [(1/2)ax² − (1/3)x³]_0^a = (1/6)a³
     = √2/3.   ←− a = √2 is substituted.

Variance of X is:

V(X) = ∫ (x − µ)² f(x) dx = ∫ x² f(x) dx − µ² = ∫_0^a x²(a − x) dx − µ²
     = [(1/3)ax³ − (1/4)x⁴]_0^a − µ² = (1/12)a⁴ − µ² = 1/3 − (√2/3)² = 1/9.

(3) Let f(x) be the density function of X and F(x) be the distribution function of X.
And let g(y) be the density function of Y and G(y) be the distribution function
of Y. Using Y = X², we obtain:

G(y) = P(Y < y) = P(X² < y) = P(−√y < X < √y) = F(√y) − F(−√y)
     = F(√y).   ←− F(−√y) = 0.

Moreover, from the relationship between the density and the distribution functions,
we obtain the following:

g(y) = dG(y)/dy = dF(√y)/dy = (dF(x)/dx)(d√y/dy)   ←− x = √y
     = F′(x) · 1/(2√y) = f(x) · 1/(2√y) = f(√y) · 1/(2√y)
     = (√2 − √y) · 1/(2√y),   for 0 < y < 2.

The range of y is obtained as: 0 < x < √2 =⇒ 0 < x² < 2 =⇒ 0 < y < 2.
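As a quick numerical cross-check of (1)-(3) (not part of the original answer), X can be sampled by inverse transform from F(x) = √2 x − x²/2 and the simulated moments compared with √2/3 and 1/9. The sketch below assumes numpy is available; the seed and sample size are arbitrary choices.

    import numpy as np

    rng = np.random.default_rng(0)
    u = rng.uniform(size=1_000_000)
    # Inverse of F(x) = sqrt(2)*x - x**2/2 on (0, sqrt(2)): solve F(x) = u for x.
    x = np.sqrt(2.0) - np.sqrt(2.0 - 2.0 * u)
    print(x.mean(), np.sqrt(2.0) / 3.0)   # mean of X vs sqrt(2)/3
    print(x.var(), 1.0 / 9.0)             # variance of X vs 1/9
    # Y = X**2 should lie in (0, 2) with density (sqrt(2) - sqrt(y)) / (2*sqrt(y)).
    y = x ** 2
    print(y.min(), y.max())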

2 The continuous random variable X has the following density function:

f(x) = (1/√(2π)) e^{−x²/2}.

Answer the following questions.

(1) Compute mean and variance of X.

(2) When Y = X 2 , compute mean and variance of Y.

(3) When Z = eX , obtain mean and variance of Z.

[Answer]

(1) The definitions of mean and variance are: E(X) = ∫ x f(x) dx and V(X) =
∫ (x − µ)² f(x) dx, where µ = E(X). Therefore, mean of X is:

E(X) = ∫ x f(x) dx = ∫_{−∞}^{∞} x (1/√(2π)) e^{−x²/2} dx = [−(1/√(2π)) e^{−x²/2}]_{−∞}^{∞} = 0.

In the third equality, we utilize:   d e^{−x²/2}/dx = −x e^{−x²/2}.

Variance of X is:

V(X) = ∫ (x − µ)² f(x) dx = ∫ x² f(x) dx − µ² = ∫_{−∞}^{∞} x² (1/√(2π)) e^{−x²/2} dx − µ²
     = [−x (1/√(2π)) e^{−x²/2}]_{−∞}^{∞} + ∫_{−∞}^{∞} (1/√(2π)) e^{−x²/2} dx − µ² = 1.

In the fourth equality, the following formula (integration by parts) is used:

∫_a^b h′(x)g(x) dx = [h(x)g(x)]_a^b − ∫_a^b h(x)g′(x) dx,

where g(x) = x and h′(x) = x (1/√(2π)) e^{−x²/2} are set.

And in the first term of the fourth equality, we use:

lim_{x→±∞} x (1/√(2π)) e^{−x²/2} = 0.

In the second term of the fourth equality, we utilize the property that the integration
of the density function is equal to one.

(2) When Y = X², mean of Y is:

E(Y) = E(X²) = V(X) + µ_x² = 1.

From (1), note that V(X) = 1 and µ_x = E(X) = 0.

Variance of Y is:

V(Y) = E(Y − µ_y)²   ←− µ_y = E(Y) = 1
     = E(Y²) − µ_y² = E(X⁴) − µ_y² = ∫_{−∞}^{∞} x⁴ (1/√(2π)) e^{−x²/2} dx − µ_y²
     = ∫_{−∞}^{∞} x³ · x (1/√(2π)) e^{−x²/2} dx − µ_y²
     = [−x³ (1/√(2π)) e^{−x²/2}]_{−∞}^{∞} + 3 ∫_{−∞}^{∞} x² (1/√(2π)) e^{−x²/2} dx − µ_y²
     = 3E(X²) − µ_y²   ←− E(X²) = 1, µ_y = 1
     = 2.

In the sixth equality, the following formula on integration is utilized:

∫_a^b h′(x)g(x) dx = [h(x)g(x)]_a^b − ∫_a^b h(x)g′(x) dx,

where g(x) = x³ and h′(x) = x (1/√(2π)) e^{−x²/2} are set.

In the first term of the sixth equality, we use:

lim_{x→±∞} x³ (1/√(2π)) e^{−x²/2} = 0.

(3) For Z = e^X, mean of Z is:

E(Z) = E(e^X) = ∫_{−∞}^{∞} e^x (1/√(2π)) e^{−x²/2} dx = ∫_{−∞}^{∞} (1/√(2π)) e^{−(x²−2x)/2} dx
     = ∫_{−∞}^{∞} (1/√(2π)) e^{−(x−1)²/2 + 1/2} dx = e^{1/2} ∫_{−∞}^{∞} (1/√(2π)) e^{−(x−1)²/2} dx = e^{1/2}.

In the sixth equality, (1/√(2π)) e^{−(x−1)²/2} is a normal density with mean one and
variance one, and accordingly its integration is equal to one.

Variance of Z is:

V(Z) = E(Z − µ_z)²   ←− µ_z = E(Z) = e^{1/2}
     = E(Z²) − µ_z² = E(e^{2X}) − µ_z² = ∫_{−∞}^{∞} e^{2x} (1/√(2π)) e^{−x²/2} dx − µ_z²
     = ∫_{−∞}^{∞} (1/√(2π)) e^{−(x²−4x)/2} dx − µ_z² = ∫_{−∞}^{∞} (1/√(2π)) e^{−(x−2)²/2 + 2} dx − µ_z²
     = e² ∫_{−∞}^{∞} (1/√(2π)) e^{−(x−2)²/2} dx − µ_z² = e² − e.

The eighth equality comes from the facts that (1/√(2π)) e^{−(x−2)²/2} is a normal density
with mean two and variance one and that its integration is equal to one.
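As an illustrative check on (3) (not in the original answer), E(Z) = e^{1/2} and V(Z) = e² − e can be reproduced by simulating Z = e^X with X ∼ N(0, 1). numpy, the seed and the sample size are assumptions of this sketch.

    import numpy as np

    rng = np.random.default_rng(0)
    z = np.exp(rng.standard_normal(2_000_000))   # Z = e^X with X ~ N(0, 1)
    print(z.mean(), np.exp(0.5))                 # both near 1.6487
    print(z.var(), np.exp(2) - np.exp(1))        # both near 4.6708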

3 The continuous random variable X has the following density function:

f(x) = (1/λ) e^{−x/λ},   if 0 < x,
     = 0,                otherwise.

Answer the following questions.

(1) Compute mean and variance of X.

(2) Derive the moment-generating function of X.

(3) Let X1, X2, · · ·, Xn be the random variables, which are mutually independently
distributed and have the density function shown above. Prove that the density
function of Y = X1 + X2 + · · · + Xn is given by the chi-square distribution with
2n degrees of freedom when λ = 2. Note that the chi-square distribution with m
degrees of freedom is given by:

f(x) = (1/(2^{m/2} Γ(m/2))) x^{m/2−1} e^{−x/2},   if x > 0,
     = 0,                                          otherwise.

[Answer]

(1) Mean of X is:

E(X) = ∫ x f(x) dx = ∫_0^∞ x (1/λ) e^{−x/λ} dx
     = [−x e^{−x/λ}]_0^∞ + ∫_0^∞ e^{−x/λ} dx = [−λ e^{−x/λ}]_0^∞ = λ.

In the third equality, the following formula is used:

∫_a^b h′(x)g(x) dx = [h(x)g(x)]_a^b − ∫_a^b h(x)g′(x) dx,

where g(x) = x and h′(x) = (1/λ) e^{−x/λ} are set.

And we utilize:

lim_{x→∞} x e^{−x/λ} = 0,   lim_{x→∞} e^{−x/λ} = 0.

Variance of X is:

V(X) = ∫ (x − µ)² f(x) dx = ∫ x² f(x) dx − µ²   ←− µ = E(X) = λ
     = ∫_0^∞ x² (1/λ) e^{−x/λ} dx − µ² = [−x² e^{−x/λ}]_0^∞ + 2 ∫_0^∞ x e^{−x/λ} dx − µ²
     = [−x² e^{−x/λ}]_0^∞ + 2λ ∫_0^∞ x (1/λ) e^{−x/λ} dx − µ²
     = 2λE(X) − µ²   ←− µ = E(X) = λ
     = 2λ² − λ² = λ².

In the third equality, we utilize:

∫_a^b h′(x)g(x) dx = [h(x)g(x)]_a^b − ∫_a^b h(x)g′(x) dx,

where g(x) = x² and h′(x) = (1/λ) e^{−x/λ}.

In the sixth equality, the following formulas are used:

lim_{x→∞} x² e^{−x/λ} = 0,   µ = E(X) = ∫_0^∞ x (1/λ) e^{−x/λ} dx.

(2) The moment-generating function of X is:

φ(θ) = E(e^{θX}) = ∫ e^{θx} f(x) dx = ∫_0^∞ e^{θx} (1/λ) e^{−x/λ} dx = (1/λ) ∫_0^∞ e^{−(1/λ − θ)x} dx
     = (1/λ)/(1/λ − θ) ∫_0^∞ (1/λ − θ) e^{−(1/λ − θ)x} dx = 1/(1 − λθ).

In the last equality, since (1/λ − θ) e^{−(1/λ − θ)x} is a density function, its integration is
one. 1/λ in f(x) is replaced by 1/λ − θ.

(3) We want to show that the moment-generating function of Y is equivalent to that
of a chi-square distribution with 2n degrees of freedom.

Because X1, X2, · · ·, Xn are mutually independently distributed, the moment-generating
function of Xi, φ_i(θ), is:

φ_i(θ) = 1/(1 − 2θ) = φ(θ),

which corresponds to the case λ = 2 of (2).

For λ = 2, the moment-generating function of Y = X1 + X2 + · · · + Xn, φ_y(θ), is:

φ_y(θ) = E(e^{θY}) = E(e^{θ(X1+X2+···+Xn)}) = E(e^{θX1}) E(e^{θX2}) · · · E(e^{θXn})
       = φ_1(θ) φ_2(θ) · · · φ_n(θ) = (φ(θ))^n = (1/(1 − 2θ))^n = (1/(1 − 2θ))^{2n/2}.

Therefore, the moment-generating function of Y is:

φ_y(θ) = (1/(1 − 2θ))^{2n/2}.

A chi-square distribution with m degrees of freedom is given by:

f(x) = (1/(2^{m/2} Γ(m/2))) x^{m/2−1} e^{−x/2},   for x > 0.

The moment-generating function of the above density function, φ_χ²(θ), is:

φ_χ²(θ) = E(e^{θX}) = ∫_0^∞ e^{θx} (1/(2^{m/2} Γ(m/2))) x^{m/2−1} e^{−x/2} dx
        = ∫_0^∞ (1/(2^{m/2} Γ(m/2))) x^{m/2−1} e^{−(1−2θ)x/2} dx
        = ∫_0^∞ (1/(2^{m/2} Γ(m/2))) (y/(1 − 2θ))^{m/2−1} e^{−y/2} (1/(1 − 2θ)) dy
        = (1/(1 − 2θ))^{m/2−1} (1/(1 − 2θ)) ∫_0^∞ (1/(2^{m/2} Γ(m/2))) y^{m/2−1} e^{−y/2} dy = (1/(1 − 2θ))^{m/2}.

In the fourth equality, use y = (1 − 2θ)x. In the sixth equality, since the function
in the integration corresponds to the chi-square distribution with m degrees of
freedom, the integration is one. Thus, φ_y(θ) is equivalent to φ_χ²(θ) for m = 2n.
That is, φ_y(θ) is the moment-generating function of a chi-square distribution
with 2n degrees of freedom. Therefore, Y ∼ χ²(2n).
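A simulation-based sanity check of (3) (not part of the original answer): with λ = 2, the sum of n exponential variables should match a χ²(2n) distribution. The sketch assumes numpy and scipy are available; n, the seed and the sample size are arbitrary choices.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n = 5
    # Each X_i has density (1/2) e^{-x/2}, i.e. an exponential with mean 2.
    y = rng.exponential(scale=2.0, size=(200_000, n)).sum(axis=1)
    print(y.mean(), 2 * n)          # chi-square mean, m = 2n
    print(y.var(), 2 * (2 * n))     # chi-square variance, 2m with m = 2n
    # Kolmogorov-Smirnov comparison against chi2(2n); a large p-value is consistent.
    print(stats.kstest(y, "chi2", args=(2 * n,)))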

4 The continuous random variable X has the following density function:

f(x) = 1,   if 0 < x < 1,
     = 0,   otherwise.

Answer the following questions.

(1) Compute mean and variance of X.

(2) When Y = −2 log X, derive the moment-generating function of Y. Note that the
log represents the natural logarithm (i.e., y = −2 log x is equivalent to x = e^{−y/2}).

(3) Let Y1 and Y2 be the random variables which have the density function obtained
in (2). Suppose that Y1 is independent of Y2 . When Z = Y1 + Y2 , compute the
density function of Z.

[Answer]

(1) Mean of X is:

E(X) = ∫ x f(x) dx = ∫_0^1 x dx = [x²/2]_0^1 = 1/2.

Variance of X is:

V(X) = ∫ (x − µ)² f(x) dx = ∫ x² f(x) dx − µ²   ←− µ = E(X) = 1/2
     = ∫_0^1 x² dx − µ² = [x³/3]_0^1 − µ² = 1/3 − (1/2)² = 1/12.

(2) For Y = −2 log X, we obtain the moment-generating function of Y, φ_y(θ).

φ_y(θ) = E(e^{θY}) = E(e^{−2θ log X}) = E(X^{−2θ}) = ∫ x^{−2θ} f(x) dx
       = ∫_0^1 x^{−2θ} dx = [x^{1−2θ}/(1 − 2θ)]_0^1 = 1/(1 − 2θ).

(3) Let Y1 and Y2 be the random variables which have the density function obtained
from (2). And, assume that Y1 is independent of Y2. For Z = Y1 + Y2, we want
to have the density function of Z.

The moment-generating function of Z, φ_z(θ), is:

φ_z(θ) = E(e^{θZ}) = E(e^{θ(Y1+Y2)}) = E(e^{θY1}) E(e^{θY2}) = (φ_y(θ))²
       = (1/(1 − 2θ))² = (1/(1 − 2θ))^{4/2},

which is equivalent to the moment-generating function of the chi-square distribution
with 4 degrees of freedom. Therefore, Z ∼ χ²(4). Note that the chi-square
density function with n degrees of freedom is given by:

f(x) = (1/(2^{n/2} Γ(n/2))) x^{n/2−1} e^{−x/2},   for x > 0,
     = 0,                                          otherwise.

The moment-generating function φ(θ) is:

φ(θ) = (1/(1 − 2θ))^{n/2}.
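The result in (3) can also be illustrated numerically (an add-on, not part of the original answer): −2 log U with U uniform on (0, 1) behaves like χ²(2), and the sum of two independent copies like χ²(4). numpy, scipy, the seed and the sample size are assumptions here.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    y1 = -2.0 * np.log(rng.uniform(size=300_000))
    y2 = -2.0 * np.log(rng.uniform(size=300_000))
    z = y1 + y2
    print(z.mean(), z.var())                     # should be near 4 and 8
    print(stats.kstest(z, "chi2", args=(4,)))    # compare Z with chi2(4)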

5 The continuous random variable X has the following density function:

f(x) = (1/(2^{n/2} Γ(n/2))) x^{n/2−1} e^{−x/2},   if x > 0,
     = 0,                                          otherwise.

Answer the following questions. Γ(a) is called the gamma function, defined as:

Γ(a) = ∫_0^∞ x^{a−1} e^{−x} dx.

(1) What are mean and variance of X?


(2) Compute the moment-generating function of X.

[Answer]

(1) For mean:

E(X) = ∫_{−∞}^{∞} x f(x) dx = ∫_0^∞ x (1/(2^{n/2} Γ(n/2))) x^{n/2−1} e^{−x/2} dx
     = ∫_0^∞ (1/(2^{n/2} Γ(n/2))) x^{(n+2)/2−1} e^{−x/2} dx
     = (2^{(n+2)/2} Γ((n+2)/2))/(2^{n/2} Γ(n/2)) ∫_0^∞ (1/(2^{(n+2)/2} Γ((n+2)/2))) x^{(n+2)/2−1} e^{−x/2} dx
     = 2 (n/2) ∫_0^∞ (1/(2^{n′/2} Γ(n′/2))) x^{n′/2−1} e^{−x/2} dx = n.

Note that Γ(s + 1) = sΓ(s), Γ(1) = 1, and Γ(1/2) = √π. Using n′ = n + 2, from
the property of the density function, we have:

∫_{−∞}^{∞} f(x) dx = ∫_0^∞ (1/(2^{n′/2} Γ(n′/2))) x^{n′/2−1} e^{−x/2} dx = 1,

which is utilized in the fifth equality.

For variance, from V(X) = E(X²) − µ² we compute E(X²) as follows:

E(X²) = ∫_{−∞}^{∞} x² f(x) dx = ∫_0^∞ x² (1/(2^{n/2} Γ(n/2))) x^{n/2−1} e^{−x/2} dx
      = ∫_0^∞ (1/(2^{n/2} Γ(n/2))) x^{(n+4)/2−1} e^{−x/2} dx
      = (2^{(n+4)/2} Γ((n+4)/2))/(2^{n/2} Γ(n/2)) ∫_0^∞ (1/(2^{(n+4)/2} Γ((n+4)/2))) x^{(n+4)/2−1} e^{−x/2} dx
      = 4 ((n+2)/2)(n/2) ∫_0^∞ (1/(2^{n′/2} Γ(n′/2))) x^{n′/2−1} e^{−x/2} dx = n(n + 2),

where n′ = n + 4 is set. Therefore, V(X) = n(n + 2) − n² = 2n is obtained.

(2) The moment-generating function of X is:

φ(θ) = E(e^{θX}) = ∫_{−∞}^{∞} e^{θx} f(x) dx = ∫_0^∞ e^{θx} (1/(2^{n/2} Γ(n/2))) x^{n/2−1} exp(−x/2) dx
     = ∫_0^∞ (1/(2^{n/2} Γ(n/2))) x^{n/2−1} exp(−(1/2)(1 − 2θ)x) dx
     = ∫_0^∞ (1/(2^{n/2} Γ(n/2))) (y/(1 − 2θ))^{n/2−1} exp(−y/2) (1/(1 − 2θ)) dy
     = (1/(1 − 2θ))^{n/2} ∫_0^∞ (1/(2^{n/2} Γ(n/2))) y^{n/2−1} exp(−y/2) dy = (1/(1 − 2θ))^{n/2}.

Use y = (1 − 2θ)x in the fifth equality. Note that dx/dy = (1 − 2θ)^{−1}. In the
seventh equality, the integration corresponds to the chi-square distribution with
n degrees of freedom.

6 The continuous random variables X and Y are mutually independent and
assumed to be X ∼ N(0, 1) and Y ∼ N(0, 1). Define U = X/Y. Answer the following
questions. When X ∼ N(0, 1), the density function of X is represented as:

f(x) = (1/√(2π)) e^{−x²/2}.


(1) Derive the density function of U.

(2) Prove that the first moment of U does not exist.

[Answer]

(1) The density of U is obtained as follows. The densities of X and Y are:

f(x) = (1/√(2π)) exp(−x²/2),   −∞ < x < ∞,
g(y) = (1/√(2π)) exp(−y²/2),   −∞ < y < ∞.

Since X is independent of Y, the joint density of X and Y is:

h(x, y) = f(x)g(y) = (1/√(2π)) exp(−x²/2) · (1/√(2π)) exp(−y²/2)
        = (1/(2π)) exp(−(x² + y²)/2).

Using u = x/y and v = y, the transformation of the variables is performed. For
x = uv and y = v, we have the Jacobian:

J = | ∂x/∂u  ∂x/∂v |  =  | v  u |  = v.
    | ∂y/∂u  ∂y/∂v |     | 0  1 |

Using transformation of variables, the joint density of U and V, s(u, v), is given by:

s(u, v) = h(uv, v)|J| = (1/(2π)) exp(−(1/2)v²(1 + u²)) |v|.

The marginal density of U is:

p(u) = ∫ s(u, v) dv = (1/(2π)) ∫_{−∞}^{∞} |v| exp(−(1/2)v²(1 + u²)) dv
     = (1/π) ∫_0^∞ v exp(−(1/2)v²(1 + u²)) dv
     = (1/π) [−(1/(1 + u²)) exp(−(1/2)v²(1 + u²))]_{v=0}^{∞} = 1/(π(1 + u²)),

which corresponds to the Cauchy distribution.

(2) We prove that the first moment of U does not exist. Consider

E(U) = ∫ u p(u) du = ∫_{−∞}^{∞} u · 1/(π(1 + u²)) du.

The contribution from 0 < u < ∞ is:

∫_0^∞ u · 1/(π(1 + u²)) du = (1/(2π)) ∫_1^∞ (1/x) dx   ←− x = 1 + u² is used
                           = (1/(2π)) [log x]_1^∞      ←− d log x/dx = 1/x
                           = ∞.

For 0 < u < ∞, the range of x = 1 + u² is given by 1 < x < ∞. Since this piece
diverges, the first moment of U does not exist.
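To illustrate (2) numerically (an add-on, not in the original text): because U = X/Y is Cauchy, its sample mean does not settle down as the sample size grows. The sketch below, assuming numpy and an arbitrary seed, prints running means that keep jumping instead of converging.

    import numpy as np

    rng = np.random.default_rng(0)
    u = rng.standard_normal(1_000_000) / rng.standard_normal(1_000_000)  # U = X/Y, Cauchy
    running_mean = np.cumsum(u) / np.arange(1, u.size + 1)
    # With a finite first moment these values would stabilize; here they do not.
    print(running_mean[[10**3 - 1, 10**4 - 1, 10**5 - 1, 10**6 - 1]])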

7 The continuous random variables X and Y have the following joint density function:

f(x, y) = x + y,   if 0 < x < 1 and 0 < y < 1,
        = 0,       otherwise.
Answer the following questions.

(1) Compute the expectation of XY.

(2) Obtain the correlation coefficient between X and Y.

(3) What is the marginal density function of X?

[Answer]

(1) The expectation of XY is:

E(XY) = ∫_0^1 ∫_0^1 xy f(x, y) dx dy = ∫_0^1 ∫_0^1 xy(x + y) dx dy
      = ∫_0^1 [ (1/3)yx³ + (1/2)y²x² ]_0^1 dy = ∫_0^1 ((1/3)y + (1/2)y²) dy
      = [ (1/6)y² + (1/6)y³ ]_0^1 = 1/3.

(2) We want to obtain the correlation coefficient between X and Y, which is represented
as: ρ = Cov(X, Y)/√(V(X)V(Y)). Therefore, E(X), E(Y), V(X), V(Y) and
Cov(X, Y) have to be computed.

E(X) is:

E(X) = ∫_0^1 ∫_0^1 x f(x, y) dx dy = ∫_0^1 ∫_0^1 x(x + y) dx dy
     = ∫_0^1 [ (1/3)x³ + (1/2)yx² ]_0^1 dy = ∫_0^1 (1/3 + (1/2)y) dy
     = [ (1/3)y + (1/4)y² ]_0^1 = 7/12.

In the case where x and y are exchangeable, the functional form of f(x, y) is
unchanged. Therefore, E(Y) is:

E(Y) = E(X) = 7/12.

For V(X),

V(X) = E((X − µ)²)   ←− µ = E(X) = 7/12
     = E(X²) − µ² = ∫_0^1 ∫_0^1 x² f(x, y) dx dy − µ²
     = ∫_0^1 ∫_0^1 x²(x + y) dx dy − µ² = ∫_0^1 [ (1/4)x⁴ + (1/3)yx³ ]_0^1 dy − µ²
     = ∫_0^1 (1/4 + (1/3)y) dy − µ² = [ (1/4)y + (1/6)y² ]_0^1 − µ²
     = 5/12 − (7/12)² = 11/144.

For V(Y),

V(Y) = V(X) = 11/144.

For Cov(X, Y),

Cov(X, Y) = E((X − µ_x)(Y − µ_y)) = E(XY) − µ_x µ_y
          = 1/3 − (7/12)(7/12) = −1/144,

where

µ_x = E(X) = 7/12,   µ_y = E(Y) = 7/12.

Therefore, ρ is:

ρ = Cov(X, Y)/√(V(X)V(Y)) = (−1/144)/√((11/144)(11/144)) = −1/11.

(3) The marginal density function of X, f_x(x), is:

f_x(x) = ∫ f(x, y) dy = ∫_0^1 (x + y) dy = [xy + (1/2)y²]_{y=0}^{1} = x + 1/2,

for 0 < x < 1.
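A symbolic cross-check of (1) and (2) (not part of the original answer), assuming sympy is available; the helper expect is introduced here purely for illustration:

    import sympy as sp

    x, y = sp.symbols("x y", positive=True)
    f = x + y                                     # joint density on the unit square

    def expect(g):
        # E[g(X, Y)] under the joint density f(x, y) = x + y on (0, 1) x (0, 1)
        return sp.integrate(g * f, (x, 0, 1), (y, 0, 1))

    EX, EY, EXY = expect(x), expect(y), expect(x * y)
    VX = expect(x**2) - EX**2
    VY = expect(y**2) - EY**2
    rho = (EXY - EX * EY) / sp.sqrt(VX * VY)
    print(EXY, EX, VX, sp.simplify(rho))          # 1/3, 7/12, 11/144, -1/11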

8 The discrete random variable X has the following density function:

f(x) = e^{−λ} λ^x / x!,   x = 0, 1, 2, · · · .

Answer the following questions.

(1) Prove ∑_{x=0}^{∞} f(x) = 1.

(2) Compute the moment-generating function of X.

(3) From the moment-generating function, obtain mean and variance of X.

[Answer]


(1) We can show ∑_{x=0}^{∞} f(x) = 1 as follows:

∑_{x=0}^{∞} f(x) = ∑_{x=0}^{∞} e^{−λ} λ^x/x! = e^{−λ} ∑_{x=0}^{∞} λ^x/x! = e^{−λ} e^{λ} = 1.

Note that e^x = ∑_{k=0}^{∞} x^k/k!, because we have f^{(k)}(x) = e^x for f(x) = e^x. As shown in
Appendix 1.3, the formula of the Taylor series expansion is:

f(x) = ∑_{k=0}^{∞} (1/k!) f^{(k)}(x_0)(x − x_0)^k.

The Taylor series expansion around x = 0 is:

f(x) = ∑_{k=0}^{∞} (1/k!) f^{(k)}(0) x^k = ∑_{k=0}^{∞} (1/k!) x^k = ∑_{k=0}^{∞} x^k/k!.

Here, replace x by λ and k by x.

(2) The moment-generating function of X is:

φ(θ) = E(e^{θX}) = ∑_{x=0}^{∞} e^{θx} f(x) = ∑_{x=0}^{∞} e^{θx} e^{−λ} λ^x/x! = e^{−λ} ∑_{x=0}^{∞} (e^θ λ)^x/x!
     = e^{−λ} exp(e^θ λ) ∑_{x=0}^{∞} exp(−e^θ λ) (e^θ λ)^x/x! = e^{−λ} exp(e^θ λ) ∑_{x=0}^{∞} e^{−λ′} (λ′)^x/x!
     = exp(−λ) exp(e^θ λ) = exp(λ(e^θ − 1)).

Note that λ′ = e^θ λ is set; the last sum is the sum of the Poisson(λ′) probabilities,
which is equal to one.

(3) Based on the moment-generating function, we obtain mean and variance of X.
For mean, because of φ(θ) = exp(λ(e^θ − 1)), φ′(θ) = λe^θ exp(λ(e^θ − 1)) and
E(X) = φ′(0), we obtain:

E(X) = φ′(0) = λ.

For variance, from V(X) = E(X²) − (E(X))², we obtain E(X²). Note that E(X²) =
φ″(0) and φ″(θ) = (1 + λe^θ)λe^θ exp(λ(e^θ − 1)). Therefore,

V(X) = E(X²) − (E(X))² = φ″(0) − (φ′(0))² = (1 + λ)λ − λ² = λ.
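The differentiation in (3) can be verified symbolically (illustrative only, assuming sympy is available):

    import sympy as sp

    theta, lam = sp.symbols("theta lam", positive=True)
    phi = sp.exp(lam * (sp.exp(theta) - 1))          # Poisson moment-generating function
    mean = sp.diff(phi, theta).subs(theta, 0)        # E(X) = phi'(0)
    second = sp.diff(phi, theta, 2).subs(theta, 0)   # E(X^2) = phi''(0)
    print(sp.simplify(mean))                         # lam
    print(sp.simplify(second - mean**2))             # lam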

9 X1, X2, · · ·, Xn are mutually independently and normally distributed with mean
µ and variance σ², where the density function is given by:

f(x) = (1/√(2πσ²)) e^{−(x−µ)²/(2σ²)}.
Then, answer the following questions.

(1) Obtain the maximum likelihood estimators of mean µ and variance σ2 .

(2) Check whether the maximum likelihood estimator of σ2 is unbiased. If it is


not unbiased, obtain an unbiased estimator of σ2 . (Hint: use the maximum
likelihood estimator.)

(3) We want to test the null hypothesis H0 : µ = µ0 by the likelihood ratio test.
Obtain the test statistic and explain the testing procedure.

[Answer]

(1) The joint density is:

f(x1, x2, · · · , xn; µ, σ²) = ∏_{i=1}^{n} f(xi; µ, σ²)
    = ∏_{i=1}^{n} (1/√(2πσ²)) exp(−(1/(2σ²))(xi − µ)²)
    = (2πσ²)^{−n/2} exp(−(1/(2σ²)) ∑_{i=1}^{n} (xi − µ)²) = l(µ, σ²).

Taking the logarithm, we have:

log l(µ, σ²) = −(n/2) log(2π) − (n/2) log(σ²) − (1/(2σ²)) ∑_{i=1}^{n} (xi − µ)².

The derivatives of the log-likelihood function log l(µ, σ²) with respect to µ and
σ² are set to be zero:

∂ log l(µ, σ²)/∂µ = (1/σ²) ∑_{i=1}^{n} (xi − µ) = 0,
∂ log l(µ, σ²)/∂σ² = −n/(2σ²) + (1/(2σ⁴)) ∑_{i=1}^{n} (xi − µ)² = 0.

Solving the two equations, we have the solution of (µ, σ²), denoted by (µ̂, σ̂²):

µ̂ = (1/n) ∑_{i=1}^{n} xi = x̄,
σ̂² = (1/n) ∑_{i=1}^{n} (xi − µ̂)² = (1/n) ∑_{i=1}^{n} (xi − x̄)².

Therefore, the maximum likelihood estimators of µ and σ², (µ̂, σ̂²), are as follows:

X̄,   S**² = (1/n) ∑_{i=1}^{n} (Xi − X̄)².

(2) Take the expectation to check whether S**² is unbiased.

E(S**²) = E((1/n) ∑_{i=1}^{n} (Xi − X̄)²) = (1/n) E(∑_{i=1}^{n} (Xi − X̄)²)
    = (1/n) E(∑_{i=1}^{n} ((Xi − µ) − (X̄ − µ))²)
    = (1/n) E(∑_{i=1}^{n} ((Xi − µ)² − 2(Xi − µ)(X̄ − µ) + (X̄ − µ)²))
    = (1/n) E(∑_{i=1}^{n} (Xi − µ)² − 2(X̄ − µ) ∑_{i=1}^{n} (Xi − µ) + n(X̄ − µ)²)
    = (1/n) E(∑_{i=1}^{n} (Xi − µ)² − 2n(X̄ − µ)² + n(X̄ − µ)²)
    = (1/n) E(∑_{i=1}^{n} (Xi − µ)² − n(X̄ − µ)²)
    = (1/n) E(∑_{i=1}^{n} (Xi − µ)²) − (1/n) E(n(X̄ − µ)²)
    = (1/n) ∑_{i=1}^{n} E((Xi − µ)²) − E((X̄ − µ)²)
    = (1/n) ∑_{i=1}^{n} V(Xi) − V(X̄) = (1/n) ∑_{i=1}^{n} σ² − σ²/n
    = σ² − σ²/n = ((n − 1)/n) σ² ≠ σ².

Therefore, S**² is not unbiased. Based on S**², we obtain the unbiased estimator
of σ². Multiplying n/(n − 1) on both sides of E(S**²) = σ²(n − 1)/n, we obtain:

E((n/(n − 1)) S**²) = σ².

Therefore, the unbiased estimator of σ² is:

(n/(n − 1)) S**² = (1/(n − 1)) ∑_{i=1}^{n} (Xi − X̄)² = S².

(3) The likelihood ratio is defined as:

λ = max_{σ²} l(µ0, σ²) / max_{µ,σ²} l(µ, σ²) = l(µ0, σ̃²) / l(µ̂, σ̂²).

Since the number of restrictions is one, we have:

−2 log λ −→ χ²(1).

l(µ, σ²) is given by:

l(µ, σ²) = (2πσ²)^{−n/2} exp(−(1/(2σ²)) ∑_{i=1}^{n} (xi − µ)²).

Taking the logarithm, log l(µ, σ²) is:

log l(µ, σ²) = −(n/2) log(2π) − (n/2) log(σ²) − (1/(2σ²)) ∑_{i=1}^{n} (xi − µ)².

On the numerator, under the restriction µ = µ0, log l(µ0, σ²) is maximized with
respect to σ² as follows:

∂ log l(µ0, σ²)/∂σ² = −n/(2σ²) + (1/(2σ⁴)) ∑_{i=1}^{n} (xi − µ0)² = 0.

This solution of σ² is σ̃², which is represented as:

σ̃² = (1/n) ∑_{i=1}^{n} (xi − µ0)².

Then, l(µ0, σ̃²) is:

l(µ0, σ̃²) = (2πσ̃²)^{−n/2} exp(−(1/(2σ̃²)) ∑_{i=1}^{n} (xi − µ0)²) = (2πσ̃²)^{−n/2} exp(−n/2).

On the denominator, from the question (1), we have:

µ̂ = (1/n) ∑_{i=1}^{n} xi,   σ̂² = (1/n) ∑_{i=1}^{n} (xi − µ̂)².

Therefore, l(µ̂, σ̂²) is:

l(µ̂, σ̂²) = (2πσ̂²)^{−n/2} exp(−(1/(2σ̂²)) ∑_{i=1}^{n} (xi − µ̂)²) = (2πσ̂²)^{−n/2} exp(−n/2).

The likelihood ratio is:

λ = max_{σ²} l(µ0, σ²) / max_{µ,σ²} l(µ, σ²) = l(µ0, σ̃²)/l(µ̂, σ̂²)
  = (2πσ̃²)^{−n/2} exp(−n/2) / ((2πσ̂²)^{−n/2} exp(−n/2)) = (σ̃²/σ̂²)^{−n/2}.

As n goes to infinity, we obtain:

−2 log λ = n(log σ̃² − log σ̂²) ∼ χ²(1).

When −2 log λ > χ²_α(1), the null hypothesis H0: µ = µ0 is rejected by the
significance level α, where χ²_α(1) denotes the 100 × α percent point of the chi-square
distribution with one degree of freedom.
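A minimal numerical sketch of this likelihood ratio test (not part of the original answer): it reuses the nine observations from Exercise 13 purely as illustrative data, takes µ0 = 24, and assumes numpy and scipy are available.

    import numpy as np
    from scipy import stats

    x = np.array([21., 23., 32., 20., 36., 27., 26., 28., 30.])   # illustrative sample
    mu0 = 24.0
    n = x.size
    sigma2_hat = np.mean((x - x.mean()) ** 2)        # unrestricted MLE of sigma^2
    sigma2_tilde = np.mean((x - mu0) ** 2)           # restricted MLE under H0: mu = mu0
    lr = n * (np.log(sigma2_tilde) - np.log(sigma2_hat))   # -2 log lambda
    p_value = stats.chi2.sf(lr, df=1)                # asymptotic chi-square(1) p-value
    print(lr, p_value)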

10 Answer the following questions.

(1) The discrete random variable X is assumed to be Bernoulli. The Bernoulli
distribution is given by:

f(x) = p^x (1 − p)^{1−x},   x = 0, 1.

Let X1, X2, · · ·, Xn be random variables drawn from the Bernoulli trials. Compute
the maximum likelihood estimator of p.

(2) Let Y be a random variable from a binomial distribution, denoted by f(y), which
is represented as:

f(y) = nCy p^y (1 − p)^{n−y},   y = 0, 1, 2, · · · , n.

Then, prove that Y/n goes to p as n is large.

(3) For the random variable Y in the question (2), let us define:

Zn ≡ (Y − np)/√(np(1 − p)).

Then, show that Zn goes to a standard normal distribution as n is large.

(4) The continuous random variable X has the following density function:

f(x) = (1/(2^{n/2} Γ(n/2))) x^{n/2−1} e^{−x/2},   if x > 0,
     = 0,                                          otherwise,

where Γ(a) denotes the Gamma function, i.e.,

Γ(a) = ∫_0^∞ x^{a−1} e^{−x} dx.

Then, show that X/n approaches one when n −→ ∞.

[Answer]

(1) When X is a Bernoulli random variable, the probability function of X is given by:

f(x; p) = p^x (1 − p)^{1−x},   x = 0, 1.

The joint probability function of X1, X2, · · ·, Xn is:

f(x1, x2, · · · , xn; p) = ∏_{i=1}^{n} f(xi; p) = ∏_{i=1}^{n} p^{xi} (1 − p)^{1−xi}
    = p^{∑_i xi} (1 − p)^{n − ∑_i xi} = l(p).

Take the logarithm of l(p):

log l(p) = (∑_i xi) log(p) + (n − ∑_i xi) log(1 − p).

The derivative of the log-likelihood function log l(p) with respect to p is set to
be zero:

d log l(p)/dp = (∑_i xi)/p − (n − ∑_i xi)/(1 − p) = (∑_i xi − np)/(p(1 − p)) = 0.

Solving the above equation, we have:

p = (1/n) ∑_{i=1}^{n} xi = x̄.

Therefore, the maximum likelihood estimator of p is:

p̂ = (1/n) ∑_{i=1}^{n} Xi = X̄.

(2) Mean and variance of Y are:

E(Y) = np,   V(Y) = np(1 − p).

Therefore, we have:

E(Y/n) = (1/n) E(Y) = p,   V(Y/n) = (1/n²) V(Y) = p(1 − p)/n.

Chebyshev's inequality indicates that for a random variable X and g(x) ≥ 0 we have:

P(g(X) ≥ k) ≤ E(g(X))/k,

where k > 0.

Here, when g(X) = (X − E(X))² and k = ε² are set, we can rewrite as:

P(|X − E(X)| ≥ ε) ≤ V(X)/ε²,

where ε > 0.

Replacing X by Y/n, we apply Chebyshev's inequality:

P(|Y/n − E(Y/n)| ≥ ε) ≤ V(Y/n)/ε².

That is, as n −→ ∞,

P(|Y/n − p| ≥ ε) ≤ p(1 − p)/(nε²) −→ 0.

Therefore, we obtain:

Y/n −→ p.

(3) Let X1, X2, · · ·, Xn be Bernoulli random variables, where P(Xi = x) = p^x (1 − p)^{1−x}
for x = 0, 1. Define Y = X1 + X2 + · · · + Xn. Because Y has a binomial
distribution, Y/n is taken as the sample mean from X1, X2, · · ·, Xn, i.e., Y/n =
(1/n) ∑_{i=1}^{n} Xi. Therefore, using E(Y/n) = p and V(Y/n) = p(1 − p)/n, by the
central limit theorem, as n −→ ∞, we have:

(Y/n − p)/√(p(1 − p)/n) −→ N(0, 1).

Moreover,

Zn ≡ (Y − np)/√(np(1 − p)) = (Y/n − p)/√(p(1 − p)/n).

Therefore,

Zn −→ N(0, 1).

(4) When X ∼ χ²(n), we have E(X) = n and V(X) = 2n. Therefore, E(X/n) = 1 and
V(X/n) = 2/n.

Apply Chebyshev's inequality. Then, we have:

P(|X/n − E(X/n)| ≥ ε) ≤ V(X/n)/ε²,

where ε > 0. That is, as n −→ ∞, we have:

P(|X/n − 1| ≥ ε) ≤ 2/(nε²) −→ 0.

Therefore,

X/n −→ 1.
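Questions (2) and (4) are both law-of-large-numbers statements; a short simulation (illustrative only, assuming numpy, with arbitrary p, seed and sample sizes) shows Y/n and X/n concentrating as n grows:

    import numpy as np

    rng = np.random.default_rng(0)
    p = 0.3
    for n in (10, 100, 10_000, 1_000_000):
        y = rng.binomial(n, p)          # Y ~ Binomial(n, p)
        chi = rng.chisquare(df=n)       # X ~ chi-square(n)
        print(n, y / n, chi / n)        # Y/n approaches p = 0.3, X/n approaches 1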

11 Consider n random variables X1 , X2 , · · ·, Xn , which are mutually independently


and exponentially distributed. Note that the exponential distribution is given by:

f(x) = λ e^{−λx},   x > 0.

Then, answer the following questions.

(1) Let λ̂ be the maximum likelihood estimator of λ. Obtain λ̂.

(2) When n is large enough, obtain mean and variance of λ̂.

[Answer]

(1) Since X1, · · · , Xn are mutually independently and exponentially distributed, the
likelihood function l(λ) is written as:

l(λ) = ∏_{i=1}^{n} f(xi) = ∏_{i=1}^{n} λ e^{−λxi} = λ^n e^{−λ ∑ xi}.

The log-likelihood function is:

log l(λ) = n log(λ) − λ ∑_{i=1}^{n} xi.

We want the λ which maximizes log l(λ). Solving the following equation:

d log l(λ)/dλ = n/λ − ∑_{i=1}^{n} xi = 0,

and replacing xi by Xi, the maximum likelihood estimator of λ, denoted by λ̂, is:

λ̂ = n / ∑_{i=1}^{n} Xi.

(2) X1, X2, · · ·, Xn are mutually independent. Let f(xi; λ) be the density function of
Xi. For the maximum likelihood estimator of λ, i.e., λ̂n, as n −→ ∞, we have
the following property:

√n (λ̂n − λ) −→ N(0, σ²(λ)),

where

σ²(λ) = 1 / E[(d log f(X; λ)/dλ)²].

Therefore, we obtain σ²(λ). The expectation in σ²(λ) is:

E[(d log f(X; λ)/dλ)²] = E[(1/λ − X)²] = E(1/λ² − (2/λ)X + X²)
    = 1/λ² − (2/λ)E(X) + E(X²) = 1/λ²,

where E(X) and E(X²) are:

E(X) = 1/λ,   E(X²) = 2/λ².

Therefore, we have:

σ²(λ) = 1 / E[(d log f(X; λ)/dλ)²] = λ².

As n is large, λ̂n approximately has the following distribution:

λ̂n ∼ N(λ, λ²/n).

Thus, as n goes to infinity, mean and variance are given by λ and λ²/n.
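The asymptotic result in (2) can be eyeballed by simulation (illustrative only; numpy, the true λ, n, the seed and the number of replications are all assumptions of this sketch):

    import numpy as np

    rng = np.random.default_rng(0)
    lam, n, reps = 2.0, 500, 20_000
    x = rng.exponential(scale=1.0 / lam, size=(reps, n))   # density lam * exp(-lam * x)
    lam_hat = n / x.sum(axis=1)                            # MLE for each replication
    print(lam_hat.mean(), lam)                             # close to lam (bias of order 1/n)
    print(lam_hat.var(), lam**2 / n)                       # close to lam^2 / n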

12 The n random variables X1, X2, · · ·, Xn are mutually independently distributed
with mean µ and variance σ². Consider the following two estimators of µ:

X̄ = (1/n) ∑_{i=1}^{n} Xi,   X̃ = (1/2)(X1 + Xn).

Then, answer the following questions.

(1) Is X̄ unbiased? How about X̃?

(2) Which is more efficient, X̄ or X̃?

(3) Examine whether X̄ and X̃ are consistent.

[Answer]

(1) We check whether X̄ and X̃ are unbiased.

E(X̄) = E((1/n) ∑_{i=1}^{n} Xi) = (1/n) E(∑_{i=1}^{n} Xi) = (1/n) ∑_{i=1}^{n} E(Xi) = (1/n) ∑_{i=1}^{n} µ = µ,

E(X̃) = (1/2)(E(X1) + E(Xn)) = (1/2)(µ + µ) = µ.

Thus, both are unbiased.

(2) We examine which is more efficient, X̄ or X̃.

V(X̄) = V((1/n) ∑_{i=1}^{n} Xi) = (1/n²) V(∑_{i=1}^{n} Xi) = (1/n²) ∑_{i=1}^{n} V(Xi) = (1/n²) ∑_{i=1}^{n} σ² = σ²/n,

V(X̃) = (1/4)(V(X1) + V(Xn)) = (1/4)(σ² + σ²) = σ²/2.

Therefore, because of V(X̄) < V(X̃), X̄ is more efficient than X̃ when n > 2.

(3) We check if X̄ and X̃ are consistent. Apply Chebyshev's inequality. For X̄,

P(|X̄ − E(X̄)| ≥ ε) ≤ V(X̄)/ε²,

where ε > 0. That is, when n −→ ∞, we have:

P(|X̄ − µ| ≥ ε) ≤ σ²/(nε²) −→ 0.

Therefore, we obtain:

X̄ −→ µ.

Next, for X̃, we have:

P(|X̃ − E(X̃)| ≥ ε) ≤ V(X̃)/ε²,

where ε > 0. That is, when n −→ ∞, the following bound is obtained:

P(|X̃ − µ| ≥ ε) ≤ σ²/(2ε²),

which does not go to zero. X̄ is a consistent estimator of µ, but X̃ is not consistent.

13 The 9 random samples:

21   23   32   20   36   27   26   28   30

are obtained from the normal population N(µ, σ²). Then, answer the following
questions.

(1) Obtain the unbiased estimates of µ and σ2 .

(2) Obtain both 90 and 95 percent confidence intervals for µ.

(3) Test the null hypothesis H0 : µ = 24 and the alternative hypothesis H1 : µ > 24
by the significance level 0.10. How about 0.05?

[Answer]

(1) The unbiased estimators of µ and σ², denoted by X̄ and S², are given by:

X̄ = (1/n) ∑_{i=1}^{n} Xi,   S² = (1/(n − 1)) ∑_{i=1}^{n} (Xi − X̄)².

The unbiased estimates of µ and σ² are:

x̄ = (1/n) ∑_{i=1}^{n} xi,   s² = (1/(n − 1)) ∑_{i=1}^{n} (xi − x̄)².

Therefore,

x̄ = (1/n) ∑_{i=1}^{n} xi = (1/9)(21 + 23 + 32 + 20 + 36 + 27 + 26 + 28 + 30) = 27,

s² = (1/(n − 1)) ∑_{i=1}^{n} (xi − x̄)²
   = (1/8)((21 − 27)² + (23 − 27)² + (32 − 27)² + (20 − 27)²
     + (36 − 27)² + (27 − 27)² + (26 − 27)² + (28 − 27)² + (30 − 27)²)
   = (1/8)(36 + 16 + 25 + 49 + 81 + 0 + 1 + 1 + 9) = 27.25.

(2) We obtain the confidence intervals of µ. The following sample distribution is
utilized:

(X̄ − µ)/(S/√n) ∼ t(n − 1).

Therefore,

P(|(X̄ − µ)/(S/√n)| < t_{α/2}(n − 1)) = 1 − α,

where t_{α/2}(n − 1) denotes the 100 × α/2 percent point of the t distribution, which
is obtained given probability α and n − 1 degrees of freedom. Therefore, we have:

P(X̄ − t_{α/2}(n − 1) S/√n < µ < X̄ + t_{α/2}(n − 1) S/√n) = 1 − α.

Replacing X̄ and S² by x̄ and s², the 100 × (1 − α) percent confidence interval
of µ is:

(x̄ − t_{α/2}(n − 1) s/√n,   x̄ + t_{α/2}(n − 1) s/√n).

Since x̄ = 27, s² = 27.25, n = 9, t_{0.05}(8) = 1.860 and t_{0.025}(8) = 2.306, the 90
percent confidence interval of µ is:

(27 − 1.860 √(27.25/9),   27 + 1.860 √(27.25/9)) = (23.76, 30.24),

and the 95 percent confidence interval of µ is:

(27 − 2.306 √(27.25/9),   27 + 2.306 √(27.25/9)) = (22.99, 31.01).

(3) We test the null hypothesis H0: µ = 24 and the alternative hypothesis H1: µ >
24 by the significance levels 0.10 and 0.05. The distribution of X̄ is:

(X̄ − µ)/(S/√n) ∼ t(n − 1).

Therefore, under the null hypothesis H0: µ = µ0, we obtain:

(X̄ − µ0)/(S/√n) ∼ t(n − 1).

Note that µ is replaced by µ0. For the alternative hypothesis H1: µ > µ0, since
we have:

P((X̄ − µ0)/(S/√n) > t_α(n − 1)) = α,

we reject the null hypothesis H0: µ = µ0 by the significance level α when we have:

(x̄ − µ0)/(s/√n) > t_α(n − 1).

Substitute x̄ = 27, s² = 27.25, µ0 = 24, n = 9, t_{0.10}(8) = 1.397 and t_{0.05}(8) =
1.860 into the above formula. Then, we obtain:

(x̄ − µ0)/(s/√n) = (27 − 24)/√(27.25/9) = 1.724 > t_{0.10}(8) = 1.397.

Therefore, we reject the null hypothesis H0: µ = 24 by the significance level
α = 0.10. And we obtain:

(x̄ − µ0)/(s/√n) = (27 − 24)/√(27.25/9) = 1.724 < t_{0.05}(8) = 1.860.

Therefore, the null hypothesis H0: µ = 24 is accepted by the significance level
α = 0.05.
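The numbers in (2) and (3) can be reproduced with scipy (an illustrative check, not part of the original answer; scipy and numpy are assumed to be available):

    import numpy as np
    from scipy import stats

    x = np.array([21., 23., 32., 20., 36., 27., 26., 28., 30.])
    n, xbar, s = x.size, x.mean(), x.std(ddof=1)
    for alpha in (0.10, 0.05):
        t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
        # 90 and 95 percent confidence intervals for mu
        print(1 - alpha, xbar - t_crit * s / np.sqrt(n), xbar + t_crit * s / np.sqrt(n))
    t_stat = (xbar - 24.0) / (s / np.sqrt(n))     # one-sided test of H0: mu = 24
    # t_stat is about 1.724; the p-value lies between 0.05 and 0.10, as in the answer above.
    print(t_stat, stats.t.sf(t_stat, df=n - 1))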

14 The 16 samples X1, X2, · · ·, X16 are randomly drawn from the normal population
with mean µ and known variance σ² = 2². The sample average is given by x̄ = 36.
Then, answer the following questions.

(1) Obtain the 95 percent confidence interval for µ.

(2) Test the null hypothesis H0 : µ = 35 and the alternative hypothesis H1 : µ =
36.5 by the significance level 0.05.

(3) Compute the power of the test in the above question (2).

[Answer]

(1) We obtain the 95 percent confidence interval of µ. The distribution of X̄ is:

(X̄ − µ)/(σ/√n) ∼ N(0, 1).

Therefore,

P(|(X̄ − µ)/(σ/√n)| < z_{α/2}) = 1 − α,

where z_{α/2} denotes the 100 × α/2 percent point, which is obtained given probability
α. Therefore,

P(X̄ − z_{α/2} σ/√n < µ < X̄ + z_{α/2} σ/√n) = 1 − α.

Replacing X̄ by x̄, the 100 × (1 − α) percent confidence interval of µ is:

(x̄ − z_{α/2} σ/√n,   x̄ + z_{α/2} σ/√n).

Substituting x̄ = 36, σ² = 2², n = 16 and z_{0.025} = 1.960, the 95 percent
confidence interval of µ is:

(36 − 1.960 · 2/√16,   36 + 1.960 · 2/√16) = (35.02, 36.98).

(2) We test the null hypothesis H0: µ = 35 and the alternative hypothesis H1: µ =
36.5 by the significance level 0.05. The distribution of X̄ is:

(X̄ − µ)/(σ/√n) ∼ N(0, 1).

Under the null hypothesis H0: µ = µ0,

(X̄ − µ0)/(σ/√n) ∼ N(0, 1).

For the alternative hypothesis H1: µ > µ0, we obtain:

P((X̄ − µ0)/(σ/√n) > z_α) = α.

If we have:

(x̄ − µ0)/(σ/√n) > z_α,

the null hypothesis H0: µ = µ0 is rejected by the significance level α. Substituting
x̄ = 36, σ² = 2², n = 16 and z_{0.05} = 1.645, we obtain:

(x̄ − µ0)/(σ/√n) = (36 − 35)/(2/√16) = 2 > z_α = 1.645.

The null hypothesis H0: µ = 35 is rejected by the significance level α = 0.05.

(3) We compute the power of the test in the question (2). The power of the test is the
probability which rejects the null hypothesis under the alternative hypothesis.
That is, under the null hypothesis H0: µ = µ0, the region which rejects the null
hypothesis is: X̄ > µ0 + z_α σ/√n, because

P((X̄ − µ0)/(σ/√n) > z_α) = α.

We compute the probability which rejects the null hypothesis under the alternative
hypothesis H1: µ = µ1. That is, under the alternative hypothesis
H1: µ = µ1, the following probability is known as the power of the test:

P(X̄ > µ0 + z_α σ/√n).

Under the alternative hypothesis H1: µ = µ1, we have:

(X̄ − µ1)/(σ/√n) ∼ N(0, 1).

Therefore, we want to compute the following probability:

P((X̄ − µ1)/(σ/√n) > (µ0 − µ1)/(σ/√n) + z_α).

Substituting σ = 2, n = 16, µ0 = 35, µ1 = 36.5 and z_α = 1.645, we obtain:

P((X̄ − µ1)/(σ/√n) > (35 − 36.5)/(2/√16) + 1.645) = P((X̄ − µ1)/(σ/√n) > −1.355)
    = 1 − P((X̄ − µ1)/(σ/√n) > 1.355)
    = 1 − 0.0877 = 0.9123.

Note that z_{0.0885} = 1.35 and z_{0.0869} = 1.36.
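The power computation in (3) can be checked directly with the standard normal distribution (illustrative only, assuming scipy is available):

    from scipy import stats

    mu0, mu1, sigma, n, alpha = 35.0, 36.5, 2.0, 16, 0.05
    z_alpha = stats.norm.ppf(1 - alpha)            # about 1.645
    shift = (mu0 - mu1) / (sigma / n ** 0.5)       # (35 - 36.5)/(2/4) = -3.0
    power = stats.norm.sf(shift + z_alpha)         # P(Z > -1.355)
    print(z_alpha, power)                          # power is about 0.9123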

15 X1, X2, · · ·, Xn are assumed to be mutually independent and distributed as Poisson
random variables, where the Poisson distribution is given by:

P(X = x) = f(x; λ) = λ^x e^{−λ} / x!,   x = 0, 1, 2, · · · .

Then, answer the following questions.

(1) Obtain the maximum likelihood estimator of λ, which is denoted by λ̂.

(2) Prove that λ̂ is an unbiased estimator.

(3) Prove that λ̂ is an efficient estimator.

(4) Prove that λ̂ is a consistent estimator.

[Answer]

(1) We obtain the maximum likelihood estimator of λ, denoted by λ̂. The Poisson
distribution is:

P(X = x) = f(x; λ) = λ^x e^{−λ} / x!,   x = 0, 1, 2, · · · .

The likelihood function is:

l(λ) = ∏_{i=1}^{n} f(xi; λ) = ∏_{i=1}^{n} λ^{xi} e^{−λ} / xi! = λ^{∑_{i=1}^{n} xi} e^{−nλ} / ∏_{i=1}^{n} xi!.

The log-likelihood function is:

log l(λ) = log(λ) ∑_{i=1}^{n} xi − nλ − log(∏_{i=1}^{n} xi!).

The derivative of the log-likelihood function with respect to λ is:

∂ log l(λ)/∂λ = (1/λ) ∑_{i=1}^{n} xi − n = 0.

Solving the above equation, the maximum likelihood estimator λ̂ is:

λ̂ = (1/n) ∑_{i=1}^{n} Xi = X̄.

(2) We prove that λ̂ is an unbiased estimator of λ.

E(λ̂) = E((1/n) ∑_{i=1}^{n} Xi) = (1/n) ∑_{i=1}^{n} E(Xi) = (1/n) ∑_{i=1}^{n} λ = λ.

(3) We prove that λ̂ is an efficient estimator of λ, where we show that the equality
holds in the Cramer-Rao inequality. First, we obtain V(λ̂) as:

V(λ̂) = V((1/n) ∑_{i=1}^{n} Xi) = (1/n²) ∑_{i=1}^{n} V(Xi) = (1/n²) ∑_{i=1}^{n} λ = λ/n.

The Cramer-Rao lower bound is given by:

1 / (n E[(∂ log f(X; λ)/∂λ)²]) = 1 / (n E[(∂(X log λ − λ − log X!)/∂λ)²])
    = 1 / (n E[(X/λ − 1)²]) = λ² / (n E[(X − λ)²])
    = λ² / (n V(X)) = λ² / (nλ) = λ/n.

Therefore,

V(λ̂) = 1 / (n E[(∂ log f(X; λ)/∂λ)²]).

That is, V(λ̂) is equal to the lower bound of the Cramer-Rao inequality. Therefore,
λ̂ is efficient.

(4) We show that λ̂ is a consistent estimator of λ. Note as follows:

E(λ̂) = λ,   V(λ̂) = λ/n.

In Chebyshev's inequality:

P(|λ̂ − E(λ̂)| ≥ ε) ≤ V(λ̂)/ε²,

E(λ̂) and V(λ̂) are substituted. Then, we have:

P(|λ̂ − λ| ≥ ε) ≤ λ/(nε²) −→ 0,

which implies that λ̂ is consistent.

16 X1, X2, · · ·, Xn are mutually independently distributed as normal random variables.
Note that the normal density is:

f(x) = (1/√(2πσ²)) e^{−(x−µ)²/(2σ²)}.

Then, answer the following questions.

(1) Prove that the sample mean X̄ = (1/n) ∑_{i=1}^{n} Xi is normally distributed with mean
µ and variance σ²/n.

(2) Define:

Z = (X̄ − µ)/(σ/√n).

Show that Z is normally distributed with mean zero and variance one.

(3) Consider the sample unbiased variance:

S² = (1/(n − 1)) ∑_{i=1}^{n} (Xi − X̄)².

The distribution of (n − 1)S²/σ² is known as a chi-square distribution with n − 1
degrees of freedom. Obtain mean and variance of S². Note that a chi-square
distribution with m degrees of freedom is:

f(x) = (1/(2^{m/2} Γ(m/2))) x^{m/2−1} e^{−x/2},   if x > 0,
     = 0,                                          otherwise.

(4) Prove that S² is a consistent estimator of σ².

[Answer]
(1) The distribution of the sample mean X̄ = (1/n) ∑_{i=1}^{n} Xi is derived using the
moment-generating function. Note that for X ∼ N(µ, σ²) the moment-generating
function φ(θ) is:

φ(θ) ≡ E(e^{θX}) = ∫_{−∞}^{∞} e^{θx} f(x) dx = ∫_{−∞}^{∞} e^{θx} (1/√(2πσ²)) e^{−(x−µ)²/(2σ²)} dx
     = ∫_{−∞}^{∞} (1/√(2πσ²)) e^{−(x−µ)²/(2σ²) + θx} dx
     = ∫_{−∞}^{∞} (1/√(2πσ²)) exp(−(1/(2σ²))(x² − 2(µ + σ²θ)x + µ²)) dx
     = ∫_{−∞}^{∞} (1/√(2πσ²)) exp(−(1/(2σ²))(x − (µ + σ²θ))² + (µθ + (1/2)σ²θ²)) dx
     = e^{µθ + σ²θ²/2} ∫_{−∞}^{∞} (1/√(2πσ²)) exp(−(1/(2σ²))(x − (µ + σ²θ))²) dx = exp(µθ + (1/2)σ²θ²).

In the integration above, N(µ + σ²θ, σ²) is utilized. Therefore, we have:

φ_i(θ) = exp(µθ + (1/2)σ²θ²).

Now, consider the moment-generating function of X̄, denoted by φ_x̄(θ):

φ_x̄(θ) ≡ E(e^{θX̄}) = E(e^{(θ/n) ∑_{i=1}^{n} Xi}) = E(∏_{i=1}^{n} e^{(θ/n)Xi}) = ∏_{i=1}^{n} E(e^{(θ/n)Xi}) = ∏_{i=1}^{n} φ_i(θ/n)
       = ∏_{i=1}^{n} exp(µ(θ/n) + (1/2)σ²(θ/n)²) = exp(µθ + (1/2)(σ²/n)θ²),

which is equivalent to the moment-generating function of the normal distribution
with mean µ and variance σ²/n.

(2) We derive the distribution of Z, which is defined as:

Z = (X̄ − µ)/(σ/√n).

From the question (1), the moment-generating function of X̄, denoted by φ_x̄(θ), is:

φ_x̄(θ) ≡ E(e^{θX̄}) = exp(µθ + (1/2)(σ²/n)θ²).

The moment-generating function of Z, denoted by φ_z(θ), is:

φ_z(θ) ≡ E(e^{θZ}) = E(exp(θ(X̄ − µ)/(σ/√n)))
       = exp(−θµ/(σ/√n)) E(exp((θ/(σ/√n)) X̄))
       = exp(−θµ/(σ/√n)) φ_x̄(θ/(σ/√n))
       = exp(−θµ/(σ/√n)) exp(µθ/(σ/√n) + (1/2)(σ²/n)(θ/(σ/√n))²) = exp((1/2)θ²),

which is the moment-generating function of N(0, 1).

(3) First, as preliminaries, we derive mean and variance of the chi-square distribution
with m degrees of freedom. The chi-square distribution with m degrees of
freedom is:

f(x) = (1/(2^{m/2} Γ(m/2))) x^{m/2−1} e^{−x/2},   if x > 0.

Therefore, the moment-generating function φ_χ²(θ) is:

φ_χ²(θ) = E(e^{θX}) = ∫_0^∞ e^{θx} (1/(2^{m/2} Γ(m/2))) x^{m/2−1} e^{−x/2} dx
        = ∫_0^∞ (1/(2^{m/2} Γ(m/2))) x^{m/2−1} e^{−(1−2θ)x/2} dx
        = ∫_0^∞ (1/(2^{m/2} Γ(m/2))) (y/(1 − 2θ))^{m/2−1} e^{−y/2} (1/(1 − 2θ)) dy
        = (1/(1 − 2θ))^{m/2−1} (1/(1 − 2θ)) ∫_0^∞ (1/(2^{m/2} Γ(m/2))) y^{m/2−1} e^{−y/2} dy = (1 − 2θ)^{−m/2}.

In the fourth equality, use y = (1 − 2θ)x. The first and second derivatives of the
moment-generating function are:

φ′_χ²(θ) = m(1 − 2θ)^{−m/2−1},   φ″_χ²(θ) = m(m + 2)(1 − 2θ)^{−m/2−2}.

Therefore, we obtain:

E(X) = φ′_χ²(0) = m,   E(X²) = φ″_χ²(0) = m(m + 2).

Thus, for the chi-square distribution with m degrees of freedom, mean is given
by m and variance is:

V(X) = E(X²) − (E(X))² = m(m + 2) − m² = 2m.

Therefore, using (n − 1)S²/σ² ∼ χ²(n − 1), we have:

E((n − 1)S²/σ²) = n − 1,   V((n − 1)S²/σ²) = 2(n − 1),

which implies

((n − 1)/σ²) E(S²) = n − 1,   ((n − 1)/σ²)² V(S²) = 2(n − 1).

Finally, mean and variance of S² are:

E(S²) = σ²,   V(S²) = 2σ⁴/(n − 1).

(4) We show that S² is a consistent estimator of σ². Chebyshev's inequality is
utilized, which is:

P(|S² − E(S²)| ≥ ε) ≤ V(S²)/ε².

Substituting E(S²) and V(S²), we obtain:

P(|S² − σ²| ≥ ε) ≤ 2σ⁴/((n − 1)ε²) −→ 0.

Therefore, S² is consistent.
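As a final illustrative check (not in the original answer), the conclusions E(S²) = σ² and V(S²) = 2σ⁴/(n − 1) can be reproduced by simulation; numpy, µ, σ, n, the seed and the replication count are arbitrary choices of this sketch.

    import numpy as np

    rng = np.random.default_rng(0)
    mu, sigma, n, reps = 5.0, 3.0, 10, 200_000
    x = rng.normal(mu, sigma, size=(reps, n))
    s2 = x.var(axis=1, ddof=1)                 # unbiased sample variance per replication
    print(s2.mean(), sigma**2)                 # both near 9
    print(s2.var(), 2 * sigma**4 / (n - 1))    # both near 18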
