Chapter 5 (Technical English For Statistics)
Probability Distributions
Definitions
Random Variable:
Variable whose values are determined by chance.
Probability Distribution:
The values a random variable can assume, along with the corresponding probabilities of
each.
Expected Value:
The theoretical mean of the variable.
Binomial Experiment:
An experiment with a fixed number of independent trials. Each trial can have only two
outcomes, or outcomes that can be reduced to two. The probability of each outcome must
remain constant from trial to trial.
Binomial Distribution:
The outcomes of a binomial experiment with their corresponding probabilities.
Multinomial Distribution:
A probability distribution resulting from an experiment with a fixed number of
independent trials. Each trial has two or more mutually exclusive outcomes. The
probability of each outcome must remain constant from trial to trial.
Poisson Distribution:
A probability distribution used when a density of items is distributed over a period of
time. The sample size needs to be large and the probability of success small.
Hypergeometric Distribution:
A probability distribution of a variable with two outcomes when sampling is done without
replacement.
Probability Distributions
Probability Functions
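A probability function is a rule that assigns a probability to each value of a random
variable. It must satisfy two conditions:
1. Each probability is between 0 and 1, inclusive.
2. The sum of all the probabilities is 1.
If these two conditions aren't met, then the function isn't a probability function. There is no
requirement that the values of the random variable be between 0 and 1, only that the
probabilities be between 0 and 1.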
Probability Distributions
A listing of all the values the random variable can assume, with their corresponding
probabilities, makes a probability distribution.
A note about random variables: a random variable does not mean that the values can be
anything (a random number). Random variables have a well-defined set of outcomes and
well-defined probabilities for the occurrence of each outcome. The "random" refers to the fact
that the outcomes happen by chance -- that is, you don't know which outcome will occur next.
Here's an example probability distribution that results from the rolling of a single fair die.
x          1      2      3      4      5      6      sum
p(x)       1/6    1/6    1/6    1/6    1/6    1/6    6/6 = 1
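As a quick check, the two requirements of a probability function (every probability between 0
and 1, and all probabilities summing to 1) can be tested mechanically. The sketch below is only
an illustration in Python; the function name is_probability_function and the counter-example
are assumptions made here, not part of the original notes. Exact fractions are used so that the
sum compares equal to 1 without rounding error.

```python
from fractions import Fraction


def is_probability_function(probabilities):
    """Return True if every probability is in [0, 1] and they sum to exactly 1."""
    probs = list(probabilities)
    in_range = all(0 <= p <= 1 for p in probs)
    return in_range and sum(probs) == 1


# The fair die from the table: six outcomes, each with probability 1/6.
fair_die = {x: Fraction(1, 6) for x in range(1, 7)}
print(is_probability_function(fair_die.values()))

# Hypothetical counter-example: each value is in range, but the sum is 11/10.
not_a_distribution = {0: Fraction(1, 2), 1: Fraction(3, 5)}
print(is_probability_function(not_a_distribution.values()))
```

With floating-point probabilities, the exact comparison to 1 would likely need to be replaced
by a tolerance check such as math.isclose.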
Mean, Variance, and Standard Deviation
Consider the following.
The definitions for population mean and variance used with an ungrouped frequency
distribution were:
    mu = sum [ x * (f/N) ]
    sigma^2 = sum [ x^2 * (f/N) ] - ( sum [ x * (f/N) ] )^2
Some of you might be confused by only dividing by N. Recall that this is the population
variance. The sample variance, which was the unbiased estimator for the population variance,
was the one divided by n-1.
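Recall that a probability is a long-term relative frequency, so every f/N can be replaced by
p(x). This simplifies to:
    mu = sum [ x * p(x) ]
    sigma^2 = sum [ x^2 * p(x) ] - ( sum [ x * p(x) ] )^2
What's even better is that the last portion of the variance is the mean squared. So, the two
formulas that we will be using are:
    mu = sum [ x * p(x) ]
    sigma^2 = sum [ x^2 * p(x) ] - mu^2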
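Here are the calculations for the roll of a single fair die:
x          1      2      3      4      5      6      sum
p(x)       1/6    1/6    1/6    1/6    1/6    1/6    6/6 = 1
x p(x)     1/6    2/6    3/6    4/6    5/6    6/6    21/6 = 3.5
x^2 p(x)   1/6    4/6    9/6    16/6   25/6   36/6   91/6 = 15.1667
The mean is 21/6 = 3.5. The variance is 91/6 - (3.5)^2 = 15.1667 - 12.25 = 2.9167 (approx). The
standard deviation is the square root of the variance = 1.7078 (approx).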
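The row sums in the table can be reproduced with a few lines of Python. This is a minimal
sketch, assuming the distribution is stored as a dictionary from each value x to p(x); the
helper name mean_and_variance is made up for this illustration.

```python
from fractions import Fraction
from math import sqrt


def mean_and_variance(distribution):
    """Return (mu, sigma^2) for a dict mapping each value x to its probability p(x)."""
    mu = sum(x * p for x, p in distribution.items())                  # sum of x * p(x)
    second_moment = sum(x * x * p for x, p in distribution.items())   # sum of x^2 * p(x)
    return mu, second_moment - mu ** 2


fair_die = {x: Fraction(1, 6) for x in range(1, 7)}
mu, variance = mean_and_variance(fair_die)
print(mu, variance, sqrt(variance))
```

Keeping the probabilities as fractions means the mean and variance come out exactly, and only
the final square root is rounded.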
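Binomial Probabilities
Binomial Experiment
A binomial experiment is an experiment which satisfies these four conditions:
1. There is a fixed number of trials.
2. Each trial is independent of the others.
3. There are only two outcomes.
4. The probability of each outcome remains constant from trial to trial.
These can be summarized as: An experiment with a fixed number of independent trials, each
of which can only have two possible outcomes.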
The fact that each trial is independent actually means that the probabilities remain constant.
Example:
What is the probability of rolling exactly two sixes in six rolls of a die?
There are five things you need to do to work a binomial story problem.
1. Define Success first. Success must be for a single trial. Success = "Rolling a 6 on a
single die"
2. Define the probability of success (p): p = 1/6
3. Find the probability of failure: q = 5/6
4. Define the number of trials: n = 6
5. Define the number of successes out of those trials: x = 2
Anytime a six appears, it is a success (denoted S), and anytime something else appears, it is a
failure (denoted F). The ways you can get exactly 2 successes in 6 trials are given below. The
probability of each is written to the right of the way it could occur. Because the trials are
independent, the probability of the event (all six dice) is the product of the probabilities of
the individual outcomes (each die).
1 FFFFSS 5/6 * 5/6 * 5/6 * 5/6 * 1/6 * 1/6 = (1/6)^2 * (5/6)^4
2 FFFSFS 5/6 * 5/6 * 5/6 * 1/6 * 5/6 * 1/6 = (1/6)^2 * (5/6)^4
3 FFFSSF 5/6 * 5/6 * 5/6 * 1/6 * 1/6 * 5/6 = (1/6)^2 * (5/6)^4
4 FFSFFS 5/6 * 5/6 * 1/6 * 5/6 * 5/6 * 1/6 = (1/6)^2 * (5/6)^4
5 FFSFSF 5/6 * 5/6 * 1/6 * 5/6 * 1/6 * 5/6 = (1/6)^2 * (5/6)^4
6 FFSSFF 5/6 * 5/6 * 1/6 * 1/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4
7 FSFFFS 5/6 * 1/6 * 5/6 * 5/6 * 5/6 * 1/6 = (1/6)^2 * (5/6)^4
8 FSFFSF 5/6 * 1/6 * 5/6 * 5/6 * 1/6 * 5/6 = (1/6)^2 * (5/6)^4
9 FSFSFF 5/6 * 1/6 * 5/6 * 1/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4
10 FSSFFF 5/6 * 1/6 * 1/6 * 5/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4
11 SFFFFS 1/6 * 5/6 * 5/6 * 5/6 * 5/6 * 1/6 = (1/6)^2 * (5/6)^4
12 SFFFSF 1/6 * 5/6 * 5/6 * 5/6 * 1/6 * 5/6 = (1/6)^2 * (5/6)^4
13 SFFSFF 1/6 * 5/6 * 5/6 * 1/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4
14 SFSFFF 1/6 * 5/6 * 1/6 * 5/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4
15 SSFFFF 1/6 * 1/6 * 5/6 * 5/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4
Notice that each of the 15 probabilities is exactly the same: (1/6)^2 * (5/6)^4.
Also, note that the 1/6 is the probability of success and you needed 2 successes. The 5/6 is the
probability of failure, and if 2 of the 6 trials were success, then 4 of the 6 must be failures.
Note that 2 is the value of x and 4 is the value of n-x.
Further note that there are fifteen ways this can occur. This is the number of ways 2 successes
can occur in 6 trials without repetition and with order not being important, or a combination
of 6 things, 2 at a time, which is 6! / (2! * 4!) = 15.
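The fifteen arrangements listed above can also be generated rather than written out by hand.
The following Python sketch is one possible way to do it, using itertools.combinations to pick
which 2 of the 6 positions hold a success; the variable names are chosen here for illustration.

```python
from fractions import Fraction
from itertools import combinations

n, x = 6, 2                           # six trials, two successes
p, q = Fraction(1, 6), Fraction(5, 6)

# Choose which positions are successes; every other position is a failure.
arrangements = []
for success_positions in combinations(range(n), x):
    pattern = "".join("S" if i in success_positions else "F" for i in range(n))
    arrangements.append(pattern)

each = p ** x * q ** (n - x)          # probability of any single arrangement
total = len(arrangements) * each      # probability of exactly two sixes
print(len(arrangements), each, float(total))
```

Because every arrangement has the same probability, the total is just the count of arrangements
times that common probability.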
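The probability of getting exactly x successes in n trials, with the probability of success on
a single trial being p, is:
    P(x) = [ n! / (x! * (n-x)!) ] * p^x * q^(n-x)
where q = 1 - p is the probability of failure and n! / (x! * (n-x)!) is the number of
combinations of n things taken x at a time.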
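As a rough sketch, the binomial probability (number of combinations, times p raised to the
number of successes, times q raised to the number of failures) translates directly into Python.
The function name binomial_probability is an assumption for this example; math.comb supplies
the number of combinations.

```python
from math import comb


def binomial_probability(x, n, p):
    """Probability of exactly x successes in n independent trials with success chance p."""
    q = 1 - p                               # probability of failure on a single trial
    return comb(n, x) * p ** x * q ** (n - x)


# Exactly two sixes in six rolls of a die.
print(binomial_probability(2, 6, 1 / 6))
```

Summing binomial_probability(x, n, p) over x = 0, 1, ..., n should give 1 (up to rounding),
which is a handy sanity check.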
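Example:
A coin is tossed 10 times. What is the probability that exactly 6 heads will occur?
1. Success = "A head is flipped on a single coin"
2. p = 0.5
3. q = 0.5
4. n = 10
5. x = 6
There are 10! / (6! * 4!) = 210 ways to get exactly 6 heads in 10 tosses, each with probability
0.5^6 * 0.5^4, so the probability is 210 * 0.5^6 * 0.5^4 = 0.205078125.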
Example:
Find the mean, variance, and standard deviation for the number of sixes that appear when
rolling 30 dice.
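For a binomial distribution, the mean is n * p and the variance is n * p * q. Here n = 30,
p = 1/6, and q = 5/6.
The mean is 30 * (1/6) = 5. The variance is 30 * (1/6) * (5/6) = 25/6. The standard deviation
is the square root of the variance = 2.041241452 (approx).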
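The mean n * p and variance n * p * q of a binomial distribution are shortcuts for the general
sums over x * p(x) and x^2 * p(x). The sketch below, written for illustration only, computes
both versions for the 30-dice example with exact fractions so they can be compared directly.

```python
from fractions import Fraction
from math import comb, sqrt

n, p = 30, Fraction(1, 6)
q = 1 - p

# Shortcut formulas for a binomial distribution.
mean_shortcut = n * p
variance_shortcut = n * p * q

# The same quantities from the full distribution of the number of sixes.
pmf = {x: comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1)}
mean_direct = sum(x * px for x, px in pmf.items())
variance_direct = sum(x * x * px for x, px in pmf.items()) - mean_direct ** 2

print(mean_shortcut, variance_shortcut, sqrt(variance_shortcut))
print(mean_direct == mean_shortcut, variance_direct == variance_shortcut)
```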
Other Discrete Distributions
Multinomial Probabilities
Instead of using a combination, as in the case of the binomial probability, the number of ways
the outcomes can occur is found using distinguishable permutations.
The probability that a person will pass a College Algebra class is 0.55, the probability that a
person will withdraw before the class is completed is 0.40, and the probability that a person
will fail the class is 0.05. Find the probability that in a class of 30 students, exactly 16 pass, 12
withdraw, and 2 fail.
Outcome    x    p(outcome)
Pass       16   0.55
Withdraw   12   0.40
Fail       2    0.05
Total      30   1.00
          30!
P = ---------------- * 0.55^16 * 0.40^12 * 0.05^2
    (16!) (12!) (2!)
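The expression above can be evaluated numerically. The following is a minimal Python sketch,
with the hypothetical helper multinomial_probability taking the list of counts and the list of
per-outcome probabilities; it counts the distinguishable permutations with integer arithmetic
before multiplying by the probabilities.

```python
from math import factorial, prod


def multinomial_probability(counts, probabilities):
    """Probability of seeing exactly counts[i] occurrences of outcome i."""
    # Distinguishable permutations: n! / (x1! * x2! * ... * xk!)
    ways = factorial(sum(counts))
    for count in counts:
        ways //= factorial(count)
    return ways * prod(p ** k for p, k in zip(probabilities, counts))


# 16 pass, 12 withdraw, 2 fail in a class of 30.
probability = multinomial_probability([16, 12, 2], [0.55, 0.40, 0.05])
print(round(probability, 4))
```

With only two outcomes, the same function reduces to the binomial probability.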
Poisson Probabilities
Named after the French mathematician Simeon Poisson, Poisson probabilities are useful when
there is a large number of independent trials with a small probability of success on a single
trial and the events occur over a period of time. They can also be used when a density of items
is distributed over a given area or volume.
If there are 500 customers per eight-hour day in a check-out lane, what is the probability that
there will be exactly 3 in line during any five-minute period?
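The expected value during any one five-minute period would be 500 / 96 = 5.2083333. The 96
is because there are 96 five-minute periods in eight hours. So, you expect about 5.2 customers
in 5 minutes and want to know the probability of getting exactly 3.
The Poisson probability of exactly x occurrences when the expected value is lambda is
P(x) = e^(-lambda) * lambda^x / x!, so
    P(3) = e^(-500/96) * (500/96)^3 / 3! = 0.1288 (approx)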
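The check-out lane example can be worked numerically with the standard Poisson probability,
e^(-mean) * mean^x / x!. The sketch below is illustrative only; the function name
poisson_probability is an assumption, and the mean is the expected number of customers per
five-minute period from the text.

```python
from math import exp, factorial


def poisson_probability(x, mean):
    """Probability of exactly x occurrences when the expected number is mean."""
    return exp(-mean) * mean ** x / factorial(x)


mean_per_period = 500 / 96     # 500 customers spread over 96 five-minute periods
print(round(poisson_probability(3, mean_per_period), 4))
```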
Hypergeometric Probabilities
Hypergeometric experiments occur when the trials are not independent of each other because
the sampling is done without replacement -- as in a five-card poker hand.
Example:
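How many ways can 3 men and 4 women be selected from a group of 7 men and 10 women?
Writing C(n, r) for the number of combinations of n things taken r at a time, there are
C(7,3) * C(10,4) = 35 * 210 = 7350 ways. If 7 people are selected at random from the 17, the
probability of getting exactly 3 men and 4 women is
    C(7,3) * C(10,4) / C(17,7) = 7350 / 19448 = 0.3779 (approx)
Note that the sums of the numbers in the numerator (7 + 10 = 17 and 3 + 4 = 7) are the numbers
used in the combination in the denominator.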
This can be extended to more than two groups and called an extended hypergeometric
problem.
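A short Python sketch can cover both the two-group case and the extended case at once, assuming
the groups and the number selected from each are given as parallel lists. The function name
hypergeometric_probability and the three-group numbers in the second call are hypothetical,
chosen only to illustrate the extension.

```python
from math import comb, prod


def hypergeometric_probability(group_sizes, selected):
    """Probability of drawing selected[i] items from group i, sampling without replacement."""
    # Numerator: ways to pick the requested number from each group.
    numerator = prod(comb(size, k) for size, k in zip(group_sizes, selected))
    # Denominator: ways to pick that many items from everything combined.
    denominator = comb(sum(group_sizes), sum(selected))
    return numerator / denominator


# 3 men and 4 women from a group of 7 men and 10 women.
print(round(hypergeometric_probability([7, 10], [3, 4]), 4))

# Hypothetical extended example with three groups of sizes 5, 6, and 4.
print(hypergeometric_probability([5, 6, 4], [2, 2, 1]))
```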