Part 4 Chapter 1
CONTINUOUS DISTRIBUTIONS
BETA DISTRIBUTION
The beta distribution is a family of continuous probability distributions defined on
the interval [0, 1] and parameterized by two positive shape parameters, denoted by α and λ,
that appear as exponents of the random variable and control the shape of the distribution.
The beta distribution has been applied to model the behavior of random variables limited
to intervals of finite length in a wide variety of disciplines.
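\[ f(x) = \frac{1}{\mathrm{B}(\alpha,\lambda)}\, x^{\alpha-1}(1-x)^{\lambda-1}, \qquad 0 \le x \le 1 \]
\[ \mathrm{B}(\alpha,\lambda) = \int_0^1 x^{\alpha-1}(1-x)^{\lambda-1}\,dx = \frac{\Gamma(\alpha)\,\Gamma(\lambda)}{\Gamma(\alpha+\lambda)} \]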
where Γ(z) is the gamma function. The beta function, B, is the normalization constant
that makes the density integrate to 1. When α and λ are both ≥ 10, the beta distribution
can be approximated using the normal distribution.
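The role of B(α, λ) as a normalization constant can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names and the parameter values α = 2, λ = 5 are illustrative assumptions, not taken from the text.

```python
import math

def beta_function(alpha, lam):
    """B(alpha, lam) = Gamma(alpha) * Gamma(lam) / Gamma(alpha + lam)."""
    return math.gamma(alpha) * math.gamma(lam) / math.gamma(alpha + lam)

def beta_pdf(x, alpha, lam):
    """Beta density on [0, 1]; returns 0 outside the interval."""
    if x < 0.0 or x > 1.0:
        return 0.0
    return x ** (alpha - 1) * (1.0 - x) ** (lam - 1) / beta_function(alpha, lam)

def midpoint_integral(func, n=100_000):
    """Midpoint-rule approximation of the integral of func over [0, 1].

    Midpoints never touch 0 or 1, so the endpoints are not evaluated.
    """
    h = 1.0 / n
    return h * sum(func((i + 0.5) * h) for i in range(n))

# Illustrative shape parameters (an assumption for this example).
alpha, lam = 2.0, 5.0
total = midpoint_integral(lambda x: beta_pdf(x, alpha, lam))
print(total)  # should be very close to 1
```

Dividing by B(α, λ) is what makes the area under the curve equal to 1; without it the integral would equal B(α, λ) itself.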
MEAN AND VARIANCE FOR BETA
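\[ E[X] = \frac{1}{\mathrm{B}(\alpha,\lambda)} \int_0^1 x \cdot x^{\alpha-1}(1-x)^{\lambda-1}\,dx \]
\[ E[X] = \frac{1}{\mathrm{B}(\alpha,\lambda)} \int_0^1 x^{\alpha}(1-x)^{\lambda-1}\,dx, \qquad \int_0^1 x^{\alpha}(1-x)^{\lambda-1}\,dx = \mathrm{B}(\alpha+1,\lambda) \]
\[ E[X] = \frac{\mathrm{B}(\alpha+1,\lambda)}{\mathrm{B}(\alpha,\lambda)} = \frac{\alpha}{\alpha+\lambda} \]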
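\[ E[X^2] = \frac{1}{\mathrm{B}(\alpha,\lambda)} \int_0^1 x^2 \cdot x^{\alpha-1}(1-x)^{\lambda-1}\,dx \]
\[ E[X^2] = \frac{1}{\mathrm{B}(\alpha,\lambda)} \int_0^1 x^{\alpha+1}(1-x)^{\lambda-1}\,dx, \qquad \int_0^1 x^{\alpha+1}(1-x)^{\lambda-1}\,dx = \mathrm{B}(\alpha+2,\lambda) \]
\[ E[X^2] = \frac{\mathrm{B}(\alpha+2,\lambda)}{\mathrm{B}(\alpha,\lambda)} = \frac{\alpha(\alpha+1)}{(\alpha+\lambda)(\alpha+\lambda+1)} \]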
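\[ \operatorname{Var}(X) = \frac{\alpha(\alpha+1)}{(\alpha+\lambda)(\alpha+\lambda+1)} - \left(\frac{\alpha}{\alpha+\lambda}\right)^{2} = \frac{\alpha\lambda}{(\alpha+\lambda)^{2}(\alpha+\lambda+1)} \]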
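The moment formulas can be verified by computing the ratios of beta functions directly and comparing them with the closed forms. This is a sketch under assumed values α = 2, λ = 5; the helper names are hypothetical.

```python
import math

def beta_function(alpha, lam):
    """B(alpha, lam) expressed through the gamma function."""
    return math.gamma(alpha) * math.gamma(lam) / math.gamma(alpha + lam)

def raw_moment(k, alpha, lam):
    """E[X^k] = B(alpha + k, lam) / B(alpha, lam), as in the derivation."""
    return beta_function(alpha + k, lam) / beta_function(alpha, lam)

# Illustrative shape parameters (an assumption for this example).
alpha, lam = 2.0, 5.0

# Moments from the beta-function ratios.
mean = raw_moment(1, alpha, lam)
variance = raw_moment(2, alpha, lam) - mean ** 2

# Closed-form results for comparison.
mean_closed = alpha / (alpha + lam)
var_closed = alpha * lam / ((alpha + lam) ** 2 * (alpha + lam + 1))

print(mean, mean_closed)
print(variance, var_closed)
```

For these values the exact answers are E[X] = 2/7 and Var(X) = 5/196, and both routes agree up to floating-point rounding.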
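The moment generating function is obtained by expanding e^{tx} as a power series and integrating term by term:
\[ M_X(t) = E[e^{tX}] = \frac{1}{\mathrm{B}(\alpha,\lambda)} \int_0^1 e^{tx}\, x^{\alpha-1}(1-x)^{\lambda-1}\,dx \]
\[ = \frac{1}{\mathrm{B}(\alpha,\lambda)} \int_0^1 \left( \sum_{k=0}^{\infty} \frac{(tx)^k}{k!} \right) x^{\alpha-1}(1-x)^{\lambda-1}\,dx \]
\[ = \frac{1}{\mathrm{B}(\alpha,\lambda)} \sum_{k=0}^{\infty} \frac{t^k}{k!} \int_0^1 x^{\alpha+k-1}(1-x)^{\lambda-1}\,dx \]
\[ = \sum_{k=0}^{\infty} \frac{t^k}{k!} \left( \frac{\mathrm{B}(\alpha+k,\lambda)}{\mathrm{B}(\alpha,\lambda)} \right) \]
\[ = \frac{\mathrm{B}(\alpha,\lambda)}{\mathrm{B}(\alpha,\lambda)} \cdot \frac{t^0}{0!} + \sum_{k=1}^{\infty} \frac{t^k}{k!} \left( \frac{\mathrm{B}(\alpha+k,\lambda)}{\mathrm{B}(\alpha,\lambda)} \right) \]
\[ = 1 + \sum_{k=1}^{\infty} \left( \frac{\Gamma(\alpha+k)\,\Gamma(\lambda)}{\Gamma(\alpha+\lambda+k)} \cdot \frac{\Gamma(\alpha+\lambda)}{\Gamma(\alpha)\,\Gamma(\lambda)} \right) \frac{t^k}{k!} \]
\[ = 1 + \sum_{k=1}^{\infty} \left( \frac{\Gamma(\alpha+k)}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha+\lambda)}{\Gamma(\alpha+\lambda+k)} \right) \frac{t^k}{k!} \]
\[ = 1 + \sum_{k=1}^{\infty} \left( \frac{\Gamma(\alpha) \prod_{r=0}^{k-1} (\alpha+r)}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha+\lambda)}{\Gamma(\alpha+\lambda) \prod_{r=0}^{k-1} (\alpha+\lambda+r)} \right) \frac{t^k}{k!} \]
\[ = 1 + \sum_{k=1}^{\infty} \left( \prod_{r=0}^{k-1} \frac{\alpha+r}{\alpha+\lambda+r} \right) \frac{t^k}{k!} \]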
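As a sanity check on the series form of the moment generating function, the sketch below compares a truncated version of the series with a direct numerical evaluation of E[e^{tX}]. The function names, the truncation at 60 terms, the midpoint rule, and the values α = 2, λ = 5, t = 1.5 are all assumptions made for illustration.

```python
import math

def beta_function(alpha, lam):
    """B(alpha, lam) expressed through the gamma function."""
    return math.gamma(alpha) * math.gamma(lam) / math.gamma(alpha + lam)

def beta_mgf_series(t, alpha, lam, terms=60):
    """Partial sum of 1 + sum_{k>=1} [prod_{r=0}^{k-1} (alpha+r)/(alpha+lam+r)] t^k / k!."""
    total = 1.0
    term = 1.0  # running value of the k-th summand
    for k in range(1, terms + 1):
        r = k - 1
        # Multiply in the next product factor and the next t / k factor.
        term *= (alpha + r) / (alpha + lam + r) * t / k
        total += term
    return total

def beta_mgf_integral(t, alpha, lam, n=100_000):
    """Midpoint-rule approximation of E[exp(t X)] for X ~ Beta(alpha, lam)."""
    h = 1.0 / n
    acc = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        acc += math.exp(t * x) * x ** (alpha - 1) * (1.0 - x) ** (lam - 1)
    return h * acc / beta_function(alpha, lam)

# Illustrative values (assumptions for this example).
alpha, lam, t = 2.0, 5.0, 1.5
series_value = beta_mgf_series(t, alpha, lam)
integral_value = beta_mgf_integral(t, alpha, lam)
print(series_value, integral_value)
```

The coefficient of t in the series is α/(α+λ), which matches E[X] as expected from M'(0) = E[X].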
NORMAL AND STANDARD NORMAL DISTRIBUTIONS
TEST OF NORMALITY
There are two approaches to testing normality: an informal method (graphs) and a formal
method (tests). The data in the histogram below are NOT normally distributed, and the Normal
Probability Plot for the same data (right) is therefore NOT a straight line.
If your sample size is greater than 2000, look at the Kolmogorov-Smirnov test;
if your sample size is less than 2000, look at the Shapiro-Wilk test.
If the value for “Sig.” (which is the p-value) is greater than 0.05, the test gives no
evidence against normality and your data can be treated as normally distributed. If the
value for “Sig.” is less than 0.05, your data are NOT normally distributed.
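The decision rule above can be written as two small functions. This is only a sketch of the rule as stated: the function names are hypothetical, and the handling of the boundary cases (a sample size of exactly 2000, or a p-value of exactly 0.05) is an assumption, since the text does not specify them.

```python
def choose_normality_test(sample_size, threshold=2000):
    """Return the name of the test the rule recommends for this sample size.

    Assumption: a sample of exactly `threshold` observations uses Shapiro-Wilk.
    """
    if sample_size > threshold:
        return "Kolmogorov-Smirnov"
    return "Shapiro-Wilk"

def looks_normal(p_value, significance=0.05):
    """True when the p-value ("Sig.") is greater than the significance level.

    Assumption: a p-value exactly equal to the level counts as NOT normal.
    """
    return p_value > significance

# Hypothetical sample sizes and p-values, for illustration only.
print(choose_normality_test(150))   # Shapiro-Wilk
print(choose_normality_test(5000))  # Kolmogorov-Smirnov
print(looks_normal(0.20))           # True
print(looks_normal(0.01))           # False
```

The p-value passed to `looks_normal` would come from the “Sig.” column of the statistical package's output for the chosen test.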