Evans - Analytics2e - PPT - 05 Data Modelling
Probability Distributions
and Data Modeling
Basic Concepts of Probability
Probability is the likelihood that an outcome
occurs. Probabilities are expressed as values
between 0 and 1.
An experiment is the process that results in an
outcome.
The outcome of an experiment is a result that
we observe.
The sample space is the collection of all possible outcomes of an experiment.
Example: two respondents each either like or dislike a product, giving four equally likely outcomes:
1. like, like
2. like, dislike
3. dislike, like
4. dislike, dislike
Probability at least one dislikes product = 3/4
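As a quick check, this enumeration can be sketched in Python (the like/dislike labels mirror the four outcomes listed above):

```python
from itertools import product

# Enumerate the sample space: two respondents, each of whom
# either likes or dislikes the product.
sample_space = list(product(["like", "dislike"], repeat=2))
print(len(sample_space))  # 4 equally likely outcomes

# Event: at least one respondent dislikes the product.
event = [o for o in sample_space if "dislike" in o]
p_at_least_one_dislike = len(event) / len(sample_space)
print(p_at_least_one_dislike)  # 0.75
```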
Example 5.2: Relative Frequency
Definition of Probability
Use relative frequencies as probabilities
Probability a computer is repaired in 10 days = 0.076
Probability Rules and Formulas
Label the n outcomes in a sample space as O1, O2, …,
On, where Oi represents the ith outcome in the sample
space. Let P(Oi) be the probability associated with the
outcome Oi.
The probability associated with any outcome must be
between 0 and 1.
0 ≤ P(Oi) ≤ 1 for each outcome Oi (5.1)
The sum of the probabilities over all possible outcomes
must be equal to 1.
P(O1) + P(O2) + … + P(On) = 1 (5.2)
Probabilities Associated with Events
An event is a collection of one or more
outcomes from a sample space.
Rule 1. The probability of any event is the sum of the probabilities of the outcomes that comprise that event.
Rule 2. The probability of the complement of any event A is P(Ac) = 1 − P(A).
Rule 3. If events A and B are mutually exclusive, then P(A or B) = P(A) + P(B).
Dice example: A = {7 or 11}: P(A) = 8/36
Ac = {2, 3, 4, 5, 6, 8, 9, 10, 12}: using Rule 2, P(Ac) = 1 − 8/36 = 28/36
Using Rule 3 with B = {2, 3, or 12}, P(B) = 4/36:
P(A or B) = P(A) + P(B)
= 8/36 + 4/36 = 12/36
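These rules are easy to verify by brute-force enumeration; a minimal Python sketch for the two-dice example:

```python
from itertools import product
from fractions import Fraction

# All 36 equally likely outcomes (sums) of rolling two dice.
rolls = [a + b for a, b in product(range(1, 7), repeat=2)]

def prob(sums):
    # Rule 1: an event's probability is the sum of the probabilities
    # of its outcomes (each outcome has probability 1/36 here).
    return Fraction(sum(1 for s in rolls if s in sums), 36)

A = {7, 11}
B = {2, 3, 12}
print(prob(A))            # 8/36 = 2/9
print(1 - prob(A))        # Rule 2: P(Ac) = 28/36 = 7/9
print(prob(A) + prob(B))  # Rule 3 (mutually exclusive): 12/36 = 1/3
```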
Non-Mutually Exclusive Events
The notation (A and B) represents the intersection of
events A and B – that is, all outcomes belonging to
both A and B .
Rule 4. If two events A and B are not mutually
exclusive, then P(A or B) = P(A)+ P(B) - P(A and B).
Example 5.6: Computing the Probability
of Non-Mutually Exclusive Events
Dice Example:
A = {2, 3, 12}: P(A) = 4/36
B = {even number} : P(B) = 18/36
(A and B) = {2, 12}: P(A and B) = 2/36
P(A or B) = P(A) + P(B) − P(A and B) = 4/36 + 18/36 − 2/36 = 20/36
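Counting outcomes directly confirms the inclusion-exclusion calculation; a quick Python sketch:

```python
from itertools import product
from fractions import Fraction

# Sums of all 36 equally likely two-dice outcomes.
rolls = [a + b for a, b in product(range(1, 7), repeat=2)]

n_A = sum(1 for s in rolls if s in (2, 3, 12))   # 4 outcomes
n_B = sum(1 for s in rolls if s % 2 == 0)        # 18 outcomes (even sum)
n_AB = sum(1 for s in rolls if s in (2, 12))     # 2 outcomes in both

# Rule 4: P(A or B) = P(A) + P(B) - P(A and B)
p_or = Fraction(n_A + n_B - n_AB, 36)
print(p_or)  # 5/9, i.e., 20/36
```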
Joint probabilities
Example 5.7: Continued
The marginal probabilities for gender and brand
preference are calculated by adding the joint
probabilities across the rows and columns
◦ E.g., the event F, (respondent is female) is comprised of the
outcomes O1, O2, and O3, and therefore P(F) = P(F and B1) +
P(F and B2) + P(F and B3) = 0.37
Marginal probabilities
Joint/Marginal Probability Rule
Calculating marginal probabilities leads to the following probability rule:
P(Ai) = P(Ai and B1) + P(Ai and B2) + … + P(Ai and Bn)
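This rule can be checked with a small joint table in Python. Only P(F and B1) = 0.09, P(M and B1) = 0.25, and the marginals P(F) = 0.37, P(M) = 0.63, P(B1) = 0.34, P(B2) = 0.23 appear in these slides; the remaining cell values below are illustrative numbers chosen to be consistent with those totals:

```python
# Joint probabilities for gender (F/M) x brand (B1/B2/B3).
# P(F and B1) = 0.09 and P(M and B1) = 0.25 are from the slides;
# the other cells are illustrative values consistent with the
# stated marginal probabilities.
joint = {
    ("F", "B1"): 0.09, ("F", "B2"): 0.06, ("F", "B3"): 0.22,
    ("M", "B1"): 0.25, ("M", "B2"): 0.17, ("M", "B3"): 0.21,
}

# Marginal rule: P(F) = P(F and B1) + P(F and B2) + P(F and B3)
p_F = sum(p for (g, _), p in joint.items() if g == "F")
p_B1 = sum(p for (_, b), p in joint.items() if b == "B1")
print(round(p_F, 2), round(p_B1, 2))  # 0.37 0.34
```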
Example 5.7 Continued
Events F and M are mutually exclusive, as are events B1, B2, and B3
since a respondent may be only male or female and prefer exactly
one of the three brands. We can use Rule 3 to find, for example,
P(B1 or B2) = 0.34 + 0.23 = 0.57.
Events F and B1, however, are not mutually exclusive because a
respondent can be both female and prefer brand 1. Therefore, using
Rule 4, we have P(F or B1) = P(F) + P(B1) – P(F and B1) = 0.37 +
0.34 – 0.09 = 0.62.
Conditional Probability
Conditional probability is the probability of
occurrence of one event A, given that another
event B is known to be true or has already
occurred.
Example 5.8 Computing a Conditional
Probability in a Cross-Tabulation
Suppose we know a respondent is male. What is the probability that
he prefers Brand 1?
Using cross-tabulation: Of 63 males, 25 prefer Brand 1, so the
probability of preferring Brand 1 given that a respondent is male =
25/63
Using joint probability table: divide the joint probability 0.25 (the
probability that the respondent is male and prefers brand 1) by the
marginal probability 0.63 (the probability that the respondent is male).
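Both routes give the same number; in Python:

```python
# Cross-tabulation counts: of 63 male respondents, 25 prefer Brand 1.
p_b1_given_m_counts = 25 / 63

# Joint-probability route: P(B1|M) = P(M and B1) / P(M)
p_b1_given_m_joint = 0.25 / 0.63

print(round(p_b1_given_m_counts, 3))  # 0.397
print(round(p_b1_given_m_joint, 3))   # 0.397
```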
Example 5.9: Conditional Probability in
Marketing
Apple Purchase History
The PivotTable shows the count of the
type of second purchase given that
each product was purchased first.
Probability of purchasing an
iPad given that a customer already
purchased an iMac = 2/13
Conditional Probability Formula
The conditional probability of an event A given that event B is known to have occurred is
P(A|B) = P(A and B) / P(B)
For Example 5.8: P(B1|M) = P(M and B1) / P(M) = 0.25 / 0.63 = 0.397
A back-of-the-envelope expected value calculation would have easily predicted the winner.
Deal or No Deal
Contestant had 5 briefcases left with $100, $400, $1000, $50,000 or
$300,000 in them.
Expected value of briefcases is $70,300.
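The expected value is just the probability-weighted sum of the amounts; in Python:

```python
# Remaining briefcase amounts, each equally likely (probability 1/5).
amounts = [100, 400, 1_000, 50_000, 300_000]
expected_value = sum(amounts) / len(amounts)
print(expected_value)  # 70300.0
```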
Bernoulli Distribution
Models a single trial with two outcomes: success (x = 1) with probability p, and failure (x = 0) with probability 1 − p.
E[X] = p
Var[X] = p(1 − p)
Example 5.24: Using the Bernoulli
Distribution
The Bernoulli distribution can be used to model whether
an individual responds positively (x = 1), or negatively
(x = 0) to a telemarketing promotion.
For example, if you estimate that 20% of customers
contacted will make a purchase, the probability
distribution that describes whether or not a particular
individual makes a purchase is Bernoulli with p = 0.2
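Simulation gives a feel for these properties (a sketch; the sample size and seed are arbitrary choices):

```python
import random

random.seed(1)

# Simulate many Bernoulli(p = 0.2) trials: 1 = purchase, 0 = no purchase.
p = 0.2
trials = [1 if random.random() < p else 0 for _ in range(100_000)]

mean = sum(trials) / len(trials)
var = sum((x - mean) ** 2 for x in trials) / len(trials)
print(round(mean, 2), round(var, 2))  # close to E[X] = 0.2, Var[X] = 0.16
```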
Binomial Distribution
Models n independent replications of a Bernoulli experiment, each
with a probability p of success.
◦ X represents the number of successes in these n experiments
Probability mass function:
f(x) = C(n, x) p^x (1 − p)^(n−x), for x = 0, 1, 2, …, n
Excel function:
=BINOM.DIST(number_s, trials, probability_s, cumulative)
If cumulative is set to TRUE, then this function will provide
cumulative probabilities; otherwise the default is FALSE, and it
provides values of the probability mass function, f(x).
Example 5.26: Using Excel’s Binomial
Distribution Function
The probability that exactly 3 of 10 individuals will make
a purchase is P(x = 3): =BINOM.DIST(3,10,0.2,FALSE) =
0.20133
The probability that 3 or fewer of 10 individuals will make
a purchase is P(x ≤ 3): =BINOM.DIST(3,10,0.2,TRUE)
= 0.87913
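The same numbers can be reproduced without Excel; a Python sketch of the pmf and its cumulative sum:

```python
from math import comb

def binom_pmf(x, n, p):
    # f(x) = C(n, x) * p^x * (1 - p)^(n - x)
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    # Cumulative probability P(X <= x)
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

print(round(binom_pmf(3, 10, 0.2), 5))  # 0.20133 (exactly 3 purchases)
print(round(binom_cdf(3, 10, 0.2), 5))  # 0.87913 (3 or fewer purchases)
```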
Shapes and Skewness of the Binomial
Distribution
The binomial distribution is symmetric when
p = 0.5; positively skewed when p < 0.5,
and negatively skewed when p > 0.5.
Example of a negatively skewed distribution (p > 0.5)
Poisson Distribution
Models the number of occurrences in some unit of
measure (often time or distance).
There is no limit on the number of occurrences.
The average number of occurrence per unit is a constant
denoted as λ.
Probability mass function:
f(x) = (λ^x e^(−λ)) / x!, for x = 0, 1, 2, …
Excel function:
◦ =POISSON.DIST(x, mean, cumulative)
Exponential Distribution
Models the time between randomly occurring events.
Mean = µ = 1/λ
Excel function:
◦ =EXPON.DIST(x, lambda, cumulative)
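The Poisson pmf, f(x) = e^(−λ) λ^x / x!, is easy to evaluate directly (a sketch; the λ = 3 example value is arbitrary):

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    # f(x) = e^(-lambda) * lambda^x / x!
    return exp(-lam) * lam**x / factorial(x)

# e.g., probability of exactly 2 occurrences when lambda = 3 per unit
print(round(poisson_pmf(2, 3), 4))  # 0.224

# The pmf sums to 1 over all nonnegative x (Equation 5.2).
print(round(sum(poisson_pmf(k, 3) for k in range(100)), 6))  # 1.0
```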
If the number of events
occurring during an interval of
time has a Poisson distribution,
then the time between events is
exponentially distributed.
Example 5.34: Using the Exponential
Distribution
The mean time to failure of a critical engine component is µ = 8,000
hours. What is the probability of failing before 5000 hours?
P(X < x) =EXPON.DIST(x, lambda, cumulative)
λ = 1/8000
P(X < 5000) =EXPON.DIST(5000, 1/8000, TRUE)
= 0.4647
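The Excel result matches the closed-form exponential CDF, P(X < x) = 1 − e^(−λx); in Python:

```python
from math import exp

def expon_cdf(x, lam):
    # P(X < x) = 1 - e^(-lambda * x),
    # equivalent to =EXPON.DIST(x, lambda, TRUE)
    return 1 - exp(-lam * x)

print(round(expon_cdf(5000, 1 / 8000), 4))  # 0.4647
```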
Other Useful Distributions
Triangular Distribution
Lognormal Distribution
Beta Distribution
Random Sampling from Probability Distributions
◦ It is not clear what the distribution might be. It does not appear to
be exponential, but it might be lognormal or another distribution.
Goodness of Fit
A better approach than simply visually examining a
histogram and summary statistics is to analytically fit the
data to the best type of probability distribution.
Three statistics measure goodness of fit:
◦ Chi-square (need at least 50 data points)
◦ Kolmogorov-Smirnov (works well for small samples)
◦ Anderson-Darling (puts more weight on the differences between
the tails of the distributions)
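Analytic Solver Platform computes these fit statistics internally. As a rough sketch of what the Kolmogorov-Smirnov statistic measures, the largest gap between the empirical CDF and the fitted CDF, here is a stdlib-only Python version against a fitted exponential (the sample service times below are hypothetical):

```python
from math import exp

def ks_stat_exponential(data, lam):
    # One-sample Kolmogorov-Smirnov statistic against an
    # exponential distribution with rate lam.
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        fitted = 1 - exp(-lam * x)          # fitted exponential CDF
        d = max(d,
                abs((i + 1) / n - fitted),  # empirical CDF just after x
                abs(i / n - fitted))        # empirical CDF just before x
    return d

# Hypothetical service times (minutes); rate estimated as 1 / sample mean.
times = [1.2, 0.4, 2.7, 0.9, 3.5, 1.1, 0.2, 4.8, 1.9, 0.6]
lam_hat = len(times) / sum(times)
print(round(ks_stat_exponential(times, lam_hat), 3))
```

A small statistic indicates a close fit; fitting software compares it against critical values to rank candidate distributions.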
Analytic Solver Platform has the capability of fitting a
probability distribution to data.
Example 5.42: Fitting a Distribution to
Airport Service Times
1. Highlight the data
Analytic Solver Platform >
Tools > Fit
2. Fit Options dialog
Type: Continuous
Test: Kolmogorov-Smirnov
Click Fit button
Example 5.42 Continued
The best-fitting distribution is called an Erlang
distribution.