
MAST20005 Statistics, Assignment 1

Brendan Hill - Student 699917 (Tutorial Thursday 10am)


November 19, 2016

Question 1
The first and second moments of a Gamma distribution X ~ Γ(α, θ) follow from the mgf M_X(t) = 1/(1 − θt)^α:

M_X'(0) = [d/dt 1/(1 − θt)^α]_{t=0} = αθ

M_X''(0) = [d²/dt² 1/(1 − θt)^α]_{t=0} = α(α + 1)θ²

(Noting that α(α + 1)θ² = E[X²] = Var(X) + E[X]²)

Hence the method of moments estimators are given by:

αθ = (1/n) Σ_{i=1}^n x_i
α(α + 1)θ² = (1/n) Σ_{i=1}^n x_i²

Given n = 15, and computed values (1/n) Σ_{i=1}^n x_i = 5.4 and (1/n) Σ_{i=1}^n x_i² = 32.63733, we have the system of equations:

αθ = 5.4
α(α + 1)θ² = 32.63733
Which yields the solutions:

⇒ θ = 5.4/α
⇒ α(α + 1)(5.4/α)² = 32.63733
⇒ (α² + α)/α² = 32.63733/5.4²
⇒ 1 + 1/α = 1.11925
⇒ 1/α = 0.11925
⇒ α = 8.38574
⇒ θ = 5.4/8.38574 = 0.64395

Hence the estimate provided by the method of moments estimators is:

X ≈ Γ(α = 8.38574, θ = 0.64395)
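The arithmetic above can be double-checked with a short script (sketched in Python for illustration, the assignment itself used R; only the sample moments quoted above are used):

```python
# Method-of-moments estimates for a Gamma(alpha, theta) distribution,
# using the sample moments computed in the text (n = 15).
m1 = 5.4        # (1/n) * sum of x_i
m2 = 32.63733   # (1/n) * sum of x_i squared

# From alpha*theta = m1 and alpha*(alpha+1)*theta**2 = m2:
# m2/m1**2 = (alpha + 1)/alpha, so alpha = 1/(m2/m1**2 - 1).
alpha = 1 / (m2 / m1**2 - 1)
theta = m1 / alpha

print(alpha, theta)  # approximately 8.38574 and 0.64395
```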

Question 2
(a)
Consider the estimator θ̂ = T̄ = (1/n) Σ_{i=1}^n T_i. Now, since:

E[T̄] = E[(1/n) Σ_{i=1}^n T_i]
     = (1/n) E[T_1 + T_2 + … + T_n]
     = (1/n) (E[T_1] + E[T_2] + … + E[T_n])   (by linearity of expectation)
     = (1/n) (n E[T])   (since identically distributed)
     = E[T]
     = θ   (since T is exponentially distributed with scale parameter θ)

Since E[T̄] = θ, θ̂ is an unbiased estimator of θ.

(b)
The variance of θ̂ is:

Var(θ̂) = Var(T̄)
       = Var((1/n)(T_1 + T_2 + … + T_n))
       = (1/n²) (Var(T_1) + Var(T_2) + … + Var(T_n))   (since independent)
       = (1/n²) · n Var(T)   (since identically distributed)
       = (1/n) Var(T)
       = θ²/n   (since T is exponentially distributed with scale parameter θ)

(c)
Let Z = min(T_1, T_2, …, T_n), where Z ~ Exp(θ/n) as stated in the problem.
Let θ̃ be the estimator given by:

θ̃ = nZ

Now, given that:

E[nZ] = n E[Z]
      = n · (θ/n)   (since Z is exponentially distributed with scale parameter θ/n)
      = θ

Hence, θ̃ = nZ is an unbiased estimator of θ.

The variance of θ̃ is:

Var(nZ) = n² Var(Z)
        = n² (θ/n)²   (since Z is exponentially distributed with scale parameter θ/n)
        = θ²

(d)
The estimator θ̂ has a variance of θ2 /n while the estimator θ̃ has a variance of θ2 . Hence, as n gets large, the variance
of θ̂ will decrease while the variance of θ̃ will remain unchanged.

∴ Since lower variance is a desirable property of an estimator, the θ̂ estimator should be preferred.
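The comparison can be illustrated with a small Monte Carlo sketch (Python for illustration; θ = 2 and n = 20 are arbitrary choices, not values from the assignment):

```python
import random

random.seed(1)
theta, n, reps = 2.0, 20, 20000

means, nmins = [], []
for _ in range(reps):
    # Draw an exponential sample with scale (mean) theta; expovariate takes the rate.
    sample = [random.expovariate(1 / theta) for _ in range(n)]
    means.append(sum(sample) / n)  # theta_hat: the sample mean
    nmins.append(n * min(sample))  # theta_tilde: n times the minimum

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

print(var(means))  # close to theta**2 / n = 0.2
print(var(nmins))  # close to theta**2 = 4.0
```

Increasing n shrinks the first variance further while leaving the second near θ².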

Question 3
(a)
Summary statistics:

Min.  1st Qu.  Median  Mean  3rd Qu.  Max.
104   285      840     1382  1410     6084

Boxplot:

[Figure: Assignment 1 pic.png — boxplot of the data]

The data are centered around 1382 (mean) or 840 (median). The distribution is right-skewed with a single outlier at 6084.

(b)
The MLE for a Gamma distribution yields parameters X ~ Γ(α = 0.897317, β = 0.000649), for which the pdf would be:

f_X(x) = (β^α / Γ(α)) x^{α−1} e^{−βx}
       = (0.000649^{0.897317} / Γ(0.897317)) x^{0.897317−1} e^{−0.000649x}
       = 0.00128784 e^{−0.000649x} / x^{0.102683},   0 ≤ x < ∞

The MLE for a Log-normal distribution yields parameters X ~ lnN(µ = 6.579638, σ² = (1.178048)²), for which the pdf would be:

f_X(x) = (1 / (xσ√(2π))) e^{−(ln(x) − µ)² / (2σ²)}
       = (1 / (1.178048 x √(2π))) e^{−(ln(x) − 6.579638)² / (2(1.178048)²)},   0 ≤ x < ∞

The MLE for a Weibull distribution yields parameters X ~ W(k = 0.894421, λ = 1301.860056), for which the pdf would be:

f_X(x) = (k/λ) (x/λ)^{k−1} e^{−(x/λ)^k}
       = (0.894421/1301.860056) (x/1301.860056)^{0.894421−1} e^{−(x/1301.860056)^{0.894421}}
       = 0.00146491 e^{−0.001638 x^{0.894421}} / x^{0.105579}

(c)
The log-likelihood values for the distributions above are:

Gamma       −82.3
Log-normal  −81.6
Weibull     −82.2

The Log-normal model has the highest log-likelihood value, and hence gives the best fit.
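Since the raw data aren't reproduced in this document, the fitting recipe can only be sketched. The snippet below (Python, on a stand-in simulated sample) shows the closed-form log-normal MLE, µ̂ = mean of log x and σ̂² = the 1/n variance of log x, together with the maximised log-likelihood used for the ranking:

```python
import math
import random

# Stand-in sample: the assignment's dataset isn't reproduced here, so we
# simulate log-normal data with parameters near the fitted ones.
random.seed(1)
data = [math.exp(random.gauss(6.58, 1.18)) for _ in range(15)]

# Closed-form MLE for the log-normal: fit a normal to log(x).
logs = [math.log(x) for x in data]
n = len(logs)
mu = sum(logs) / n
sigma2 = sum((y - mu) ** 2 for y in logs) / n  # MLE divides by n, not n - 1

# Maximised log-likelihood of the fitted log-normal model.
loglik = sum(
    -math.log(x) - 0.5 * math.log(2 * math.pi * sigma2)
    - (math.log(x) - mu) ** 2 / (2 * sigma2)
    for x in data
)
print(mu, sigma2, loglik)
```

The same computation for each candidate family yields the table above; the fit with the largest maximised log-likelihood wins.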

The Q-Q plots of each theoretical model against the data are:

[Figure: Assignment 1 q3 gamma.png — Gamma Q-Q plot]

[Figure: Assignment 1 q3 lognormal.png — Log-normal Q-Q plot]

[Figure: Assignment 1 q3 weibull.png — Weibull Q-Q plot]

(d)
Histogram with the pdf of the log-normal model superimposed:

[Figure: Assignment 1 q3 hist.png — histogram with log-normal density overlay]

Question 4
The general form for the lower endpoint of a 90% one-sided confidence interval is derived from:

Pr( (X̄ − µ) / (S/√n) ≤ t_{0.1}(n − 1) ) = 0.9
⇔ Pr( X̄ − t_{0.1}(n − 1) · S/√n ≤ µ ) = 0.9

Hence the lower endpoint of a 90% one-sided confidence interval for µ is given by:

l = x̄ − t_{0.1}(n − 1) · s/√n
  = 35.4 − t_{0.1}(39) · 0.61/√40
  = 35.4 − 1.3036 · 0.61/√40
  = 35.2743

NOTE: I am taking t_{0.1}(39) = 1.3036 (the upper 10% point), consistent with the lecture notes, although R's qt(0.1, df = 39) returns the lower quantile, −1.3036.
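As a numeric check (sketched in Python; the t quantile is hardcoded from tables since only the standard library is used):

```python
import math

# One-sided 90% lower confidence bound for mu: l = xbar - t * s / sqrt(n).
xbar, s, n = 35.4, 0.61, 40
t_upper_10pct = 1.3036  # upper 10% point of t(39), from tables

lower = xbar - t_upper_10pct * s / math.sqrt(n)
print(round(lower, 4))  # 35.2743
```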

Question 5
(a)
The log-likelihood function is as follows (where log denotes the natural logarithm):

l(µ, λ) = log L(µ, λ)
        = log Π_{i=1}^n (λ/(2πx_i³))^{1/2} exp(−λ(x_i − µ)²/(2µ²x_i))
        = Σ_{i=1}^n log [ (λ/(2πx_i³))^{1/2} exp(−λ(x_i − µ)²/(2µ²x_i)) ]
        = Σ_{i=1}^n [ ½ log(λ/(2πx_i³)) − λ(x_i − µ)²/(2µ²x_i) ]
        = Σ_{i=1}^n [ ½ log(λ) − ½ log(2πx_i³) − λ(x_i − µ)²/(2µ²x_i) ]
        = Σ_{i=1}^n [ ½ log(λ) − ½ log(2πx_i³) − λ( x_i²/(2µ²x_i) − 2x_iµ/(2µ²x_i) + µ²/(2µ²x_i) ) ]
        = Σ_{i=1}^n [ ½ log(λ) − ½ log(2πx_i³) − λx_i/(2µ²) + λ/µ − λ/(2x_i) ]

The partial derivative with respect to µ is:

∂l(µ,λ)/∂µ = Σ_{i=1}^n [ λx_i/µ³ − λ/µ² ]
           = (λ/µ³) Σ_{i=1}^n x_i − nλ/µ²

And when ∂l(µ,λ)/∂µ = 0:

⇒ (λ/µ̂³) Σ_{i=1}^n x_i = nλ/µ̂²
⇒ (Σ_{i=1}^n x_i)/n = µ̂
⇒ µ̂ = x̄

Hence the MLE of µ is x̄.

The partial derivative with respect to λ is:

∂l(µ,λ)/∂λ = Σ_{i=1}^n [ 1/(2λ) − x_i/(2µ²) + 1/µ − 1/(2x_i) ]
           = n/(2λ) + Σ_{i=1}^n [ −x_i/(2µ²) + 1/µ − 1/(2x_i) ]

And when equal to zero (letting µ = x̄ as per the result above, and writing Σ for Σ_{i=1}^n throughout):

⇒ n/(2λ̂) = Σ [ x_i/(2µ²) − 1/µ + 1/(2x_i) ]
⇒ n/λ̂ = Σ [ x_i/µ² − 2/µ + 1/x_i ]
⇒ n/λ̂ = (1/µ²) Σ x_i − 2n/µ + Σ 1/x_i
⇒ n/λ̂ = nx̄/µ² − 2n/µ + Σ 1/x_i
⇒ n/λ̂ = n/x̄ − 2n/x̄ + Σ 1/x_i   (since µ = x̄)
⇒ n/λ̂ = Σ 1/x_i − n/x̄
⇒ λ̂ = n / (Σ 1/x_i − n/x̄)
⇒ λ̂ = n / Σ (1/x_i − 1/x̄)

Hence the MLE of λ is n / Σ (1/x_i − 1/x̄).
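The closed forms are straightforward to evaluate; a sketch in Python (the data below are hypothetical, chosen only to exercise the formulas):

```python
# Closed-form MLEs for the inverse Gaussian:
#   mu_hat  = xbar
#   lam_hat = n / sum(1/x_i - 1/xbar)
def inverse_gaussian_mle(xs):
    n = len(xs)
    xbar = sum(xs) / n
    lam = n / sum(1 / x - 1 / xbar for x in xs)
    return xbar, lam

# Hypothetical positive data, not from the assignment.
mu_hat, lam_hat = inverse_gaussian_mle([1.2, 0.8, 2.5, 1.7, 0.9])
print(mu_hat, lam_hat)
```

The denominator Σ(1/x_i − 1/x̄) is non-negative by the AM-HM inequality, so λ̂ > 0 for any non-constant positive sample.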

(b)
Since nλ/λ̂ ≈ χ²(n − 1), a 100(1 − α)% confidence interval can be constructed as follows, writing χ²_p(n − 1) for the p-quantile of the χ²(n − 1) distribution:

Pr( χ²_{α/2}(n − 1) ≤ nλ/λ̂ ≤ χ²_{1−α/2}(n − 1) ) = 1 − α
⇒ Pr( χ²_{α/2}(n − 1) · λ̂/n ≤ λ ≤ χ²_{1−α/2}(n − 1) · λ̂/n ) = 1 − α

Hence the confidence interval is:

[ χ²_{α/2}(n − 1) · λ̂/n , χ²_{1−α/2}(n − 1) · λ̂/n ]

(c)
The MLE for λ is:

λ̂ = n / Σ (1/x_i − 1/x̄)
  = 32 / Σ (1/x_i − 1/21.26562)
  = 7.234316

The 95% confidence interval is given by α = 0.05, λ̂ = 7.234316, n = 32:

[ χ²_{0.025}(31) · λ̂/32 , χ²_{0.975}(31) · λ̂/32 ]
= [ 17.53874 · 7.234316/32 , 48.23189 · 7.234316/32 ]
= [ 3.965025, 10.903898 ]
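The interval arithmetic can be reproduced directly (Python sketch; the χ² quantiles are hardcoded from tables, since the raw data behind λ̂ aren't reproduced here):

```python
# 95% CI for lambda: [chi2_{0.025}(31) * lam_hat / n, chi2_{0.975}(31) * lam_hat / n]
lam_hat, n = 7.234316, 32
q_lo, q_hi = 17.53874, 48.23189  # qchisq(c(0.025, 0.975), df = 31) in R

lower, upper = q_lo * lam_hat / n, q_hi * lam_hat / n
print(lower, upper)  # approximately 3.965025 and 10.903898
```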

Question 6
(a)
Since T_1 and T_2 are unbiased estimators of θ:

E[T_1] = θ   (def. unbiased estimator)
E[T_2] = θ   (def. unbiased estimator)

Let T_3 = αT_1 + (1 − α)T_2 for some α ∈ [0, 1]. Then:

E[T_3] = E[αT_1 + (1 − α)T_2]
       = αE[T_1] + (1 − α)E[T_2]
       = αθ + (1 − α)θ
       = θ

Hence, T_3 is an unbiased estimator of θ.

(b)
Let α* be the value of α which minimizes the variance of T_3, that is:

α* = argmin_α Var(T_3)

This can be found by solving d/dα Var(T_3) = 0 for α as follows:

d/dα Var(T_3) = d/dα [ Var(αT_1 + (1 − α)T_2) ]
              = d/dα [ α² Var(T_1) + (1 − α)² Var(T_2) ]   (since T_1, T_2 are independent)
              = 2α Var(T_1) − 2(1 − α) Var(T_2)

Equating this with 0 and solving for α:

2α Var(T_1) − 2(1 − α) Var(T_2) = 0
⇒ 2α Var(T_1) = 2(1 − α) Var(T_2)
⇒ α Var(T_1) + α Var(T_2) = Var(T_2)
⇒ α = Var(T_2) / (Var(T_1) + Var(T_2))

Note that the second derivative is 2(Var(T_1) + Var(T_2)) > 0, confirming that this extremum is a minimum.
Hence α* = Var(T_2)/(Var(T_1) + Var(T_2)) minimizes Var(T_3).

If Var(T_1) ≫ Var(T_2), then α* would be close to 0. The effect of this is that T_3 would combine a large proportion of T_2 (which has less variance) with a small proportion of T_1 (which has greater variance).
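The formula can be sanity-checked with a grid search (Python sketch; the variances Var(T_1) = 4 and Var(T_2) = 1 are assumed values, giving α* = 0.2):

```python
V1, V2 = 4.0, 1.0  # assumed variances of T1 and T2

def var_T3(a):
    # Var(a*T1 + (1-a)*T2) for independent T1, T2.
    return a**2 * V1 + (1 - a)**2 * V2

alpha_star = V2 / (V1 + V2)  # closed form from the derivation
best = min([i / 1000 for i in range(1001)], key=var_T3)
print(alpha_star, best)  # 0.2 and 0.2
```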

(c)
In the case that T_1 and T_2 are dependent, the calculation can be done with Cov(T_1, T_2) factored in:

d/dα Var(T_3) = d/dα [ Var(αT_1 + (1 − α)T_2) ]
              = d/dα [ α² Var(T_1) + (1 − α)² Var(T_2) + 2α(1 − α) Cov(T_1, T_2) ]
              = d/dα [ α² Var(T_1) + (1 − α)² Var(T_2) + 2α Cov(T_1, T_2) − 2α² Cov(T_1, T_2) ]
              = 2α Var(T_1) − 2(1 − α) Var(T_2) + 2 Cov(T_1, T_2) − 4α Cov(T_1, T_2)

Equating this with 0 and solving for α:

2α Var(T_1) − 2(1 − α) Var(T_2) + 2 Cov(T_1, T_2) − 4α Cov(T_1, T_2) = 0
⇒ α Var(T_1) − Var(T_2) + α Var(T_2) + Cov(T_1, T_2) − 2α Cov(T_1, T_2) = 0
⇒ α (Var(T_1) + Var(T_2) − 2 Cov(T_1, T_2)) = Var(T_2) − Cov(T_1, T_2)
⇒ α = (Var(T_2) − Cov(T_1, T_2)) / (Var(T_1) + Var(T_2) − 2 Cov(T_1, T_2))

Note that the second derivative, 2(Var(T_1) + Var(T_2) − 2 Cov(T_1, T_2)) = 2 Var(T_1 − T_2), is positive whenever T_1 and T_2 are not perfectly correlated, confirming that this extremum is a minimum.
Hence α* = (Var(T_2) − Cov(T_1, T_2)) / (Var(T_1) + Var(T_2) − 2 Cov(T_1, T_2)) minimizes Var(T_3).

The consequence of Var(T_1) ≫ Var(T_2) is similar to (b), though complicated further by the correlation between T_1 and T_2.
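The dependent case admits the same check (Python sketch with assumed values Var(T_1) = 4, Var(T_2) = 1, Cov(T_1, T_2) = 0.5; setting the covariance to 0 recovers the formula from (b)):

```python
V1, V2, C = 4.0, 1.0, 0.5  # assumed variances and covariance

def var_T3(a):
    # Var(a*T1 + (1-a)*T2) with the covariance term included.
    return a**2 * V1 + (1 - a)**2 * V2 + 2 * a * (1 - a) * C

alpha_star = (V2 - C) / (V1 + V2 - 2 * C)  # closed form from the derivation
best = min([i / 10000 for i in range(10001)], key=var_T3)
print(alpha_star, best)  # 0.125 and 0.125
```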

Question 7
Not applicable

