3.0 Common Probability Distribution
RAW DATA TO PROBABILITY DISTRIBUTION
[Flowchart: Data Collection → Histogram & Central Moment → Hypothesise Distribution → Parameter Estimation → Goodness-of-Fit Test → Probability Density Function → Probability, P(x)]
NORMAL/GAUSSIAN DISTRIBUTION
• The central limit theorem states that the sum of many independent
random variables with arbitrary distributions asymptotically follows a
normal distribution as the sample size becomes large.
• The Normal distribution is defined by two parameters, the mean value μ_x
and the standard deviation σ_x, and is denoted N(μ_x, σ_x).
• One useful property of the Normal distribution is that it can be
applied to any value of a random variable from −∞ to +∞.
• The areas under the curve within one, two and three standard
deviations of the mean are about 68%, 95.5% and 99.7%, respectively.
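$$ f_X(x) = \frac{1}{\sigma_x \sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{x - \mu_x}{\sigma_x} \right)^2 \right] $$
$$ F_X(x) = \Phi\left( \frac{x - \mu_x}{\sigma_x} \right) $$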
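The Normal PDF and CDF above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names are illustrative and not part of the course material. It evaluates the areas within one, two and three standard deviations of the mean.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Normal PDF f_X(x) with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu, sigma):
    """Normal CDF F_X(x) = Phi((x - mu) / sigma)."""
    return NormalDist().cdf((x - mu) / sigma)


def area_within(k):
    """Probability of falling within k standard deviations of the mean."""
    return normal_cdf(k, 0.0, 1.0) - normal_cdf(-k, 0.0, 1.0)


for k in (1, 2, 3):
    # Roughly 0.68, 0.95 and 0.997, matching the percentages quoted above
    print(k, round(area_within(k), 4))
```

Because the result depends only on the standardised value (x − μ_x)/σ_x, the same areas hold for any mean and standard deviation.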
EXAMPLE 4
a) What is the probability that a lamp will fail in the first 700 burning hours?
b) What is the probability that a lamp will fail between 900 and 1300 burning
hours?
c) How many lamps are expected to fail between 900 and 1300 burning hours?
d) What is the probability that a lamp will burn for exactly 900 hours?
e) What is the probability that a lamp will burn between 899 hours and 901 hours
before it fails?
f) After how many burning hours would we expect 10% of the lamps to be left?
TUTORIAL 1
[Figure: beam with a fixed end carrying two loads, F1 and F2; dimensions labelled L1=6m and L1=9m]
(a) If the two loads are independent, what are the mean and the
standard deviation of the shear and the bending moment at
the fixed end?
(b) If the two random loads are normally distributed, what is the
probability that the bending moment will exceed 235 kNm?
(c) If the two loads are independent, what is the correlation
coefficient between V and M?
LOGNORMAL DISTRIBUTION
• Some engineering problems depend only on positive values of
variables.
• The mean, median and mode values are not the same.
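$$ f_X(x) = \frac{1}{\sigma_x x \sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{\ln x - \mu_x}{\sigma_x} \right)^2 \right] $$
$$ F_X(x) = \frac{1}{\sigma_x \sqrt{2\pi}} \int_0^x \frac{1}{t} \exp\left[ -\frac{1}{2} \left( \frac{\ln t - \mu_x}{\sigma_x} \right)^2 \right] dt $$
$$ F_X(x) = \Phi\left[ \frac{\ln(x) - \mu_x}{\sigma_x} \right] $$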
EXAMPLE 5
Suppose that the reaction time in seconds of a person can be modeled by a
lognormal distribution with parameter values μ = −0.35 and σ = 0.2.
Data: ln X ~ N(−0.35, 0.2)
a) Find the probability that the reaction time is less than 0.6 seconds
b) Find the reaction time that is exceeded by 95% of the population
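Both parts can be worked through the standard normal CDF, since ln X is normally distributed. The sketch below assumes that the second value in N(−0.35, 0.2) is the standard deviation of ln X (not the variance); the variable names are illustrative only.

```python
import math
from statistics import NormalDist

# Parameters of ln X as stated in the example.
# Assumption: 0.2 is the standard deviation of ln X.
mu_ln, sigma_ln = -0.35, 0.2
ln_x = NormalDist(mu_ln, sigma_ln)

# a) P(X < 0.6) = Phi((ln 0.6 - mu) / sigma)
p_a = ln_x.cdf(math.log(0.6))

# b) The time exceeded by 95% of the population satisfies P(X > x) = 0.95,
#    i.e. P(X < x) = 0.05, so x is the 5th percentile of X.
x_b = math.exp(ln_x.inv_cdf(0.05))

print(f"a) P(X < 0.6) = {p_a:.4f}")
print(f"b) x = {x_b:.4f} s")
```

Under that assumption the results are approximately 0.21 for part a) and 0.51 seconds for part b).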
EXPONENTIAL DISTRIBUTION
• This distribution is defined for a positive random variable x ≥ x_o ≥ 0.
$$ f_X(x) = \begin{cases} 0, & x < x_o \\ \lambda e^{-\lambda (x - x_o)}, & x \ge x_o \end{cases} $$
• λ = Exponential parameter, also known as the failure rate.
• x_o = an offset, which is assumed to be known a priori (the smallest value).
• The CDF for the Exponential distribution is defined as
$$ F_X(x) = 1 - \exp\left[ -\lambda (x - x_o) \right] $$
Note: x_o = 0 when the smallest data value is zero. Assume x_o = 0 if x_o
is not mentioned in the question.
• The mean and variance are given respectively by
$$ \mu_x = \frac{1}{\lambda} + x_o $$
$$ \sigma_x^2 = \frac{1}{\lambda^2} $$
CONSTANT FAILURE RATE
• Because a constant failure rate is easy to work with, the exponential distribution
has become the traditional basis for reliability modelling.
EXAMPLE 6
On the average, a certain computer part lasts ten years. The length of time the
computer part lasts is exponentially distributed.
a) What is the probability that a computer part lasts more than 7 years?
b) On the average, how long would five computer parts last if they are used
one after another?
c) Eighty percent of computer parts last at most how long (the 80th percentile)?
d) What is the probability that a computer part lasts between 9 and 11
years?
Solution
a) The decay parameter is m = λ = 1/10 = 0.1 per year.
Since P(X < x) = 1 − e^(−mx), then P(X > x) = 1 − (1 − e^(−mx)) = e^(−mx),
so P(X > 7) = e^(−0.1 × 7) = 0.4966.
The probability that a computer part lasts more than seven years
is 0.4966.
b) Each part lasts ten years on average, so five parts used one after another
last 5 × 10 = 50 years on average.
c) Find the 80th percentile. Draw the graph. Let k = the 80th
percentile.
Solve P(X < k) = 1 − e^(−0.1k) = 0.80 for k:
k = ln(1 − 0.80)/(−0.1) = 16.1 years
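The Example 6 calculations can be reproduced with a few lines of code. This is a minimal sketch assuming x_o = 0 and a rate of 1/10 per year, taken from the stated ten-year average life; the helper name `exp_cdf` is illustrative.

```python
import math

mean_life = 10.0        # years, from the problem statement
rate = 1.0 / mean_life  # exponential parameter (failure rate), 0.1 per year


def exp_cdf(x, rate, x0=0.0):
    """Exponential CDF F(x) = 1 - exp(-rate * (x - x0)); zero below the offset x0."""
    if x < x0:
        return 0.0
    return 1.0 - math.exp(-rate * (x - x0))


# Probability that a part lasts more than 7 years
p_more_than_7 = 1.0 - exp_cdf(7.0, rate)

# Average total life of five parts used one after another
five_parts = 5 * mean_life

# 80th percentile: solve 1 - exp(-rate * k) = 0.80 for k
k80 = math.log(1.0 - 0.80) / (-rate)

# Probability that a part lasts between 9 and 11 years
p_9_to_11 = exp_cdf(11.0, rate) - exp_cdf(9.0, rate)

print(round(p_more_than_7, 4), five_parts, round(k80, 1), round(p_9_to_11, 4))
```

The interval probability for 9 to 11 years comes out at roughly 0.074, which is simply F(11) − F(9).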
This means that the component has a chance of survival over 100 000
hours of about 98%. Suppose that the producer of this component has
distributed 5000 units to different users; then Ns = Nt·R(t) = 5000 × 0.98 ≈
4901. Accordingly, about 99 components (5000 − 4901) are likely to fail during
the 100 000 hours.
WEIBULL DISTRIBUTION
• The Weibull distribution has been widely used to solve many engineering
problems.
• It is widely accepted as the most useful density function for
reliability estimation, and it also applies to problems such as defining
the strength of brittle materials, classifying failure types, and
scheduling preventive maintenance and inspection activities.
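$$ F_X(x) = 1 - \exp\left[ -\left( \frac{x}{\theta} \right)^{\beta} \right] $$
$$ \mu_x = \theta \, \Gamma\left( 1 + \frac{1}{\beta} \right) $$
$$ \sigma_x^2 = \theta^2 \left[ \Gamma\left( 1 + \frac{2}{\beta} \right) - \Gamma^2\left( 1 + \frac{1}{\beta} \right) \right] $$
where β is the shape parameter, θ is the scale parameter and Γ(·) is the Gamma function.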
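The Gamma-function expressions for the Weibull mean and variance are easy to evaluate numerically. The sketch below assumes the two-parameter form with shape β and scale θ; the parameter values are illustrative and not taken from the slides.

```python
import math


def weibull_cdf(x, beta, theta):
    """Two-parameter Weibull CDF, F(x) = 1 - exp(-(x / theta) ** beta)."""
    if x <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((x / theta) ** beta))


def weibull_mean(beta, theta):
    """Mean = theta * Gamma(1 + 1/beta)."""
    return theta * math.gamma(1.0 + 1.0 / beta)


def weibull_variance(beta, theta):
    """Variance = theta^2 * [Gamma(1 + 2/beta) - Gamma(1 + 1/beta)^2]."""
    g1 = math.gamma(1.0 + 1.0 / beta)
    g2 = math.gamma(1.0 + 2.0 / beta)
    return theta ** 2 * (g2 - g1 ** 2)


beta, theta = 2.0, 10.0  # illustrative values only
print(weibull_mean(beta, theta), weibull_variance(beta, theta))
```

With β = 1 these reduce to the exponential results, mean θ and variance θ², which is a convenient sanity check.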
WEIBULL FAILURE RATE
• Weibull distributions with β < 1 have a failure rate that decreases
with time, also known as infant-mortality or early-life failures.
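The effect of the shape parameter on the failure rate can be illustrated with the standard two-parameter Weibull hazard function, h(x) = f(x)/(1 − F(x)) = (β/θ)(x/θ)^(β−1). This formula and the values below are assumptions for illustration; the cases β = 1 (constant rate) and β > 1 (increasing rate) are included only for comparison.

```python
def weibull_failure_rate(x, beta, theta):
    """Weibull hazard h(x) = (beta / theta) * (x / theta) ** (beta - 1), for x > 0."""
    return (beta / theta) * (x / theta) ** (beta - 1.0)


theta = 10.0              # illustrative scale parameter
times = [1.0, 5.0, 20.0]  # illustrative times

for beta in (0.5, 1.0, 2.0):
    rates = [weibull_failure_rate(t, beta, theta) for t in times]
    # beta < 1: rates fall with time; beta = 1: constant; beta > 1: rates rise
    print(beta, [round(r, 4) for r in rates])
```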
b. P(X<5)
c. P(1.8<X<6)
d. P(X>3)
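PROBABILITY PLOTTING
$$ \exp\left[ -\left( \frac{x}{\theta} \right)^{\beta} \right] = 1 - F(x) $$
$$ -\left( \frac{x}{\theta} \right)^{\beta} = \ln\left[ 1 - F(x) \right] $$
• The right side is written in positive form. Thus
$$ \left( \frac{x}{\theta} \right)^{\beta} = \ln\left[ \frac{1}{1 - F(x)} \right] $$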
• Taking the logarithm a second time gives
$$ \ln \ln\left[ \frac{1}{1 - F(x)} \right] = \beta \ln\left( \frac{x}{\theta} \right) $$
• The right side of the above equation is then expanded into a linear
form as follows
$$ \ln \ln\left[ \frac{1}{1 - F(x)} \right] = \beta \ln x - \beta \ln \theta $$
• This is now a straight-line equation where ln[ln(1/(1−F(x)))] is the
dependent variable, ln(x) is the independent variable, β is the slope
and −β ln(θ) is the y-axis intercept.
where, in the straight-line form y = mx + c:
y = ln ln[1/(1 − F(x))]
m = β
x = ln(x)
c = −β ln(θ)
• From the constructed graph, the shape parameter β can be determined from the
slope of the straight line. The scale parameter θ can be calculated from
$$ \theta = \exp\left( -\frac{c}{\beta} \right), \quad c = \text{y-axis intercept} $$
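The plotting procedure can be sketched as a least-squares straight-line fit of ln ln[1/(1 − F(x))] against ln(x). The slides do not show how F(x) is estimated from ranked data, so this sketch assumes Benard's median-rank approximation, F_i = (i − 0.3)/(n + 0.4); the data are synthetic, generated from a known Weibull so the fit can be checked.

```python
import math


def median_rank(i, n):
    """Benard's approximation for the CDF at the i-th ordered value (an assumption)."""
    return (i - 0.3) / (n + 0.4)


def weibull_plot_fit(data):
    """Fit y = m*x + c with x = ln(data), y = ln ln(1/(1-F)); return (beta, theta)."""
    ordered = sorted(data)
    n = len(ordered)
    xs = [math.log(v) for v in ordered]
    ys = [math.log(math.log(1.0 / (1.0 - median_rank(i, n)))) for i in range(1, n + 1)]
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    m = sxy / sxx          # slope = shape parameter beta
    c = y_bar - m * x_bar  # y-axis intercept = -beta * ln(theta)
    return m, math.exp(-c / m)


# Synthetic data: exact Weibull quantiles for beta = 2, theta = 10 (illustrative)
true_beta, true_theta, n = 2.0, 10.0, 20
sample = [
    true_theta * (-math.log(1.0 - median_rank(i, n))) ** (1.0 / true_beta)
    for i in range(1, n + 1)
]

beta_hat, theta_hat = weibull_plot_fit(sample)
print(beta_hat, theta_hat)
```

Because the synthetic points lie exactly on the Weibull line, the fit recovers the generating parameters; real data will scatter around the line.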
$$ y = \ln \ln\left[ \frac{1}{1 - F(x)} \right] $$
n = total frequency
[Figure: probability plot of ln[ln(1/(1-F(t)))] (vertical axis, about -7 to 3) against ln(t-l) (horizontal axis, -2 to 2.5)]
MAXIMUM LIKELIHOOD ESTIMATOR
• In statistics, maximum-likelihood estimation (MLE) is a
method of estimating the parameters of a statistical model.
$$ L(x_i) = \prod_{i=1}^{n} f(x_i) $$
where:
f(xi) = probability density function (PDF).
n = total number of observed data.
xi = observed data for observation order, i.
• Example: The likelihood function for the Exponential distribution is
denoted as
$$ L = \prod_{i=1}^{n} \lambda \exp(-\lambda x_i) $$
$$ \ln L = n \ln \lambda - \lambda \sum_{i=1}^{n} x_i $$
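• The equation can also be written in a simplified form as
below
$$ \ln L = n \left( \ln \lambda - \lambda \bar{x} \right) $$
• where
$$ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \text{sample average} $$
• Differentiating the above equation with respect to λ gives the
maximum likelihood estimator equation for λ:
$$ \frac{\partial \ln L}{\partial \lambda} = n \left( \frac{1}{\lambda} - \bar{x} \right) $$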
• If
$$ \frac{\partial \ln L}{\partial \lambda} = 0 $$
(λ* is the estimate of λ that maximizes L)
• then
$$ \lambda^* = \frac{1}{\bar{x}} $$
(λ* is the inverse of the sample average)
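The closed-form estimator can be checked against the log-likelihood directly. This is a minimal sketch for the Exponential distribution with x_o = 0; the data values are hypothetical and chosen only so that the sample average is 10.

```python
import math


def exp_log_likelihood(lam, data):
    """ln L = n * ln(lam) - lam * sum(x_i) for the Exponential distribution."""
    return len(data) * math.log(lam) - lam * sum(data)


def exp_mle(data):
    """Maximum likelihood estimate: the inverse of the sample average."""
    return len(data) / sum(data)


data = [2.0, 5.0, 9.0, 14.0, 20.0]  # hypothetical observations, average = 10
lam_hat = exp_mle(data)

# The log-likelihood at lam_hat should exceed the values at nearby rates
for lam in (0.09, lam_hat, 0.11):
    print(round(lam, 3), round(exp_log_likelihood(lam, data), 4))
```

Evaluating ln L at a few neighbouring values of λ is a quick way to confirm that the stationary point is a maximum rather than a minimum.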
GOODNESS-OF-FIT TEST
• Estimating the parameters of a probability distribution by MLE or
probability plotting does not guarantee that the distribution represents
the data population accurately, because the distribution type is chosen
from the shape of the plotted histogram of the population data.
• A goodness-of-fit test determines how accurately the chosen distribution
fits the data population.
• Many goodness-of-fit tests can be applied, such as the Chi-square test,
Kolmogorov-Smirnov test, Graphical test, Hollander-Proschan test,
Mann-Scheuer-Fertig test and many more.
• Only the Chi-square and Graphical tests will be discussed
in this section.
• The observed data can be grouped into class intervals, each with an
observed frequency, O.
• For each class of the grouped data, the expected frequency, E, can be
estimated on the basis of the hypothesised distribution.
• If the critical value for the χ² statistic is less than the calculated
value, the hypothesis will be rejected.
$$ \chi^2 = \sum_{i=1}^{k_n} \frac{(O_i - E_i)^2}{E_i} $$
$$ d_g = k_n - 1 $$
where:
χ² = chi-square value.
dg = degree of freedom.
E = expected value.
kn = number of classes.
O = observed value.
CHI-SQUARE GOODNESS OF FIT TEST
• The Chi-square goodness-of-fit test compares the observed sample
distribution with the expected probability distribution.
• The data range is divided into intervals, and the number of points that
fall into each interval is compared with the expected number of points
in that interval.
• The following example demonstrates the calculation of the chi-square value, χ²,
for each bin based on the Chi-square equation.
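HYPOTHESIS
$$ f_X(x) = \frac{\beta (x - \delta)^{\beta - 1}}{\theta^{\beta}} \exp\left[ -\left( \frac{x - \delta}{\theta} \right)^{\beta} \right] $$
$$ F(x) = 1 - \exp\left[ -\left( \frac{x - \delta}{\theta} \right)^{\beta} \right] $$
PROBABILITY PLOT
β = 1.79, θ = 15.7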
CHI-SQUARE TEST EXAMPLE
β = 1.79, θ = 15.7

Class    Probability    Expected
0-5      0.147586008    5.5
5-10     0.246573692    9.1
10-15    0.229921628    8.5
15-20    0.168443234    6.2
20-25    0.104450981    3.9
25-30    0.056626195    2.1

Example of Calculation
Probability (0<x<5) = 0.1476
Total no. of data = 37
Expected frequency for 0<x<5 = 0.1476 × 37 = 5.5
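The expected frequencies in the table follow from multiplying each class probability by the total number of data. The sketch below reproduces that column and adds the standard Pearson statistic, the sum of (O − E)²/E over the classes. The observed frequencies are not given in this section, so the `observed` list is a hypothetical placeholder used only to show the calculation.

```python
# Class probabilities copied from the table above
classes = ["0-5", "5-10", "10-15", "15-20", "20-25", "25-30"]
probabilities = [0.147586008, 0.246573692, 0.229921628,
                 0.168443234, 0.104450981, 0.056626195]
n_total = 37  # total number of data, from the example

# Expected frequency per class = class probability x total number of data
expected = [p * n_total for p in probabilities]


def chi_square(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# HYPOTHETICAL observed counts, for illustration only (not from the source)
observed = [6, 9, 8, 7, 4, 3]

for name, e in zip(classes, expected):
    print(name, round(e, 1))
print("chi-square (hypothetical data):", chi_square(observed, expected))
print("degrees of freedom:", len(classes) - 1)
```

The calculated statistic would then be compared with the critical χ² value for the chosen significance level and degrees of freedom.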
• The R value indicates how well the data fit the straight line.
• It is well known that the extreme values are a compromise between the
critical capacity of engineering elements and their associated extreme
operating conditions.
$$ Y_n = \max(x_1, x_2, x_3, x_4, \ldots, x_n) $$
• The CDF of these extreme values can be written as
$$ F_{Y_n}(y) = \left[ F_X(y) \right]^n $$
and the corresponding PDF as
$$ f_{Y_n}(y) = n \left[ F_X(y) \right]^{n-1} f_X(y) $$
[Figure: curves for n = 2, n = 20 and n = 200]
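The effect of n on the distribution of the maximum can be explored numerically. The sketch below assumes, purely for illustration, that the parent variable X is standard normal and that the n values are independent; neither the parent distribution nor the function names come from the slides.

```python
from statistics import NormalDist

parent = NormalDist()  # assumption: parent variable X is standard normal


def max_cdf(y, n):
    """CDF of the largest of n independent values: [F_X(y)]^n."""
    return parent.cdf(y) ** n


def max_pdf(y, n):
    """PDF of the largest of n independent values: n [F_X(y)]^(n-1) f_X(y)."""
    return n * parent.cdf(y) ** (n - 1) * parent.pdf(y)


for n in (2, 20, 200):
    # P(Y_n <= 2) shrinks as n grows, so the distribution shifts to the right
    print(n, round(max_cdf(2.0, n), 4), round(max_pdf(2.0, n), 4))
```

As n increases, the probability that the maximum stays below a fixed level falls, which is why the curves move towards larger values.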