MULTIPLICATION LAW. Let A and B be events and assume P(B) > 0. Then
P(A ∩ B) = P(A | B) P(B)
Example A. An urn contains three red balls and one blue ball.
Two balls are selected without replacement. What is the
probability that they are both red?
Let R1 and R2 denote the events that a red ball is drawn on the
first trial and on the second trial, respectively. From the
multiplication law,
P(R1 ∩ R2) = P(R2 | R1) P(R1) = (2/3)(3/4) = 1/2
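As a check on this computation, the same probability can be obtained by brute-force enumeration of the ordered draws (a Python sketch; the ball labels are arbitrary):

```python
from itertools import permutations

# Label the urn's contents: three red balls (R) and one blue ball (B).
urn = ["R", "R", "R", "B"]

# Enumerate every ordered pair of distinct balls (drawing without replacement).
draws = list(permutations(urn, 2))

# Count the pairs in which both draws are red.
both_red = sum(1 for first, second in draws if first == "R" and second == "R")

print(both_red, len(draws), both_red / len(draws))  # 6 12 0.5
```

Six of the twelve equally likely ordered draws are red-red, in agreement with the multiplication law.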
LAW OF TOTAL PROBABILITY. Let B1, B2, …, Bn be disjoint events whose
union is the sample space, and suppose P(Bi) > 0 for each i. Then, for
any event A,
P(A) = Σ_{i=1}^{n} P(A | Bi) P(Bi)
EXAMPLE C. Referring to Example A, what is the probability
that a red ball is selected on the second draw?
P(R2) = P(R2 | R1) P(R1) + P(R2 | B1) P(B1) = (2/3)(3/4) + (1)(1/4) = 3/4
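The law of total probability can be verified for this urn by the same kind of enumeration (a sketch, reusing the labels from Example A):

```python
from fractions import Fraction
from itertools import permutations

urn = ["R", "R", "R", "B"]
draws = list(permutations(urn, 2))  # all 12 ordered draws without replacement

# P(R2): the fraction of draws whose second ball is red.
p_r2 = Fraction(sum(1 for first, second in draws if second == "R"), len(draws))

# Agrees with the law of total probability: (2/3)(3/4) + (1)(1/4) = 3/4.
print(p_r2)  # 3/4
```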
INDEPENDENCE.
Intuitively, we would say that two events, A and B, are
independent if knowing that one had occurred gives us no
information about whether the other had or will occur; that is,
P(A | B) = P(A) and P(B | A) = P(B). Now, if
P(A) = P(A | B) = P(A ∩ B) / P(B)
then
P(A ∩ B) = P(A) P(B)
We will use this last relation as the definition of independence.
Note that it is symmetric in A and B and does not require the
existence of a conditional probability; that is, P(B) can be 0.
DEFINITION. A and B are said to be independent events if
P(A ∩ B) = P(A) P(B).
EXAMPLE A.
A card is selected randomly from a deck. Let A denote the
event that it is an ace and D the event that it is a diamond.
Knowing that the card is an ace gives no information about its
suit. Checking formally that the events are independent, we
have P(A) = 4/52 = 1/13 and
P(D) = 1/4.
Also, A ∩ D is the event that the card is the ace of diamonds
and P(A ∩ D) = 1/52. Since P(A) P(D) = (1/4) × (1/13) = 1/52,
the events are in fact independent.
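This check can also be carried out by enumerating the 52 equally likely cards (a sketch; the rank and suit labels are arbitrary):

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))  # 52 equally likely cards

n = len(deck)
p_a = Fraction(sum(1 for r, s in deck if r == "A"), n)           # P(ace)
p_d = Fraction(sum(1 for r, s in deck if s == "diamonds"), n)    # P(diamond)
p_ad = Fraction(sum(1 for r, s in deck if r == "A" and s == "diamonds"), n)

print(p_a, p_d, p_ad, p_ad == p_a * p_d)  # 1/13 1/4 1/52 True
```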
EXAMPLE C.
A fair coin is tossed twice. Let
A denote the event of heads on the first toss,
B the event of heads on the second toss, and
C the event that exactly one head is thrown.
A and B are clearly independent, and P(A) = P(B) = P(C) = .5.
To see that A and C are independent, we observe that P(C | A) = .5. But
P(A ∩ B ∩ C) = 0 ≠ P(A) P(B) P(C)
To encompass situations such as that in Example C, we define a collection
of events, A1, A2, …, An, to be mutually independent if for any
subcollection, Ai1, …, Aim,
P(Ai1 ∩ ··· ∩ Aim) = P(Ai1) ··· P(Aim)
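Example C can be checked by enumerating the four equally likely outcomes of two tosses (a sketch; events are represented as predicates on an outcome):

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=2))  # HH, HT, TH, TT, each probability 1/4

def prob(event):
    """Probability of an event, given as a predicate on an outcome (w0, w1)."""
    return Fraction(sum(1 for w in outcomes if event(w)), len(outcomes))

A = lambda w: w[0] == "H"                      # heads on the first toss
B = lambda w: w[1] == "H"                      # heads on the second toss
C = lambda w: (w[0] == "H") != (w[1] == "H")   # exactly one head

# Each pair is independent...
print(prob(lambda w: A(w) and C(w)) == prob(A) * prob(C))   # True
print(prob(lambda w: B(w) and C(w)) == prob(B) * prob(C))   # True

# ...but A and B together rule out C, so mutual independence fails.
p_abc = prob(lambda w: A(w) and B(w) and C(w))
print(p_abc, prob(A) * prob(B) * prob(C))  # 0 1/8
```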
Discrete Random Variables.
A random variable is essentially a random number. We will be
interested in random numbers that are determined by
experiments. As motivation for a definition, let us consider an
example. A coin is thrown three times, and the sequence of
heads and tails is observed; thus, the sample space is
Ω = {hhh, hht, hth, htt, thh, tht, tth, ttt}
In general, a random variable is a function from the sample space Ω to
the real numbers. Since the outcome of the experiment for which Ω is the
sample space is random, the number produced by the function is random as
well. It is conventional to denote random variables by italic uppercase
letters from the end of the alphabet. For example, we might define X
to be the total number of heads in the experiment described above.
A discrete random variable is a random variable that can take on only
a finite or at most a countably infinite number of values. The random
variable X just defined is a discrete random variable since it can take
on only the values 0, 1, 2, and 3. For an example of a random
variable that can take on a countably infinite number of values,
consider an experiment that consists of tossing a coin until a head
turns up and defining Y to be the total number of tosses. The possible
values of Y are 1, 2, 3, … . In general, a countably infinite set is
one that can be put into one-to-one correspondence with the integers.
If the coin is fair, then each of the eight outcomes in Ω above
has probability 1/8, from which the probabilities that X
takes on the values 0, 1, 2, and 3 can be easily computed:
P(X = 0) = 1/8
P(X = 1) = 3/8
P(X = 2) = 3/8
P(X = 3) = 1/8
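These probabilities follow by counting heads over the eight equally likely sequences; a minimal sketch:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

outcomes = ["".join(seq) for seq in product("ht", repeat=3)]  # the 8 sequences

# X(w) = number of heads in the sequence w.
counts = Counter(w.count("h") for w in outcomes)

pmf = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}
for x in pmf:
    print(x, pmf[x])  # 0 1/8, then 1 3/8, then 2 3/8, then 3 1/8
```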
Generally, the probability measure on the sample space
determines the probabilities of the various values of X; if
those values are denoted by x1, x2, …, then there is a
function p such that p(xi) = P(X = xi) and Σi p(xi) = 1. This
function is called the probability mass function, or the
frequency function, of the random variable X. Figure below
shows a graph of p(x) for the coin tossing experiment. The
frequency function describes completely the probability
properties of the random variable.
In addition to the frequency function, it is sometimes useful to use the
cumulative distribution function (cdf) of a random variable, which is
defined to be
F(x) = P(X ≤ x),   −∞ < x < ∞
Cumulative distribution functions are usually denoted by uppercase
letters and frequency functions by lowercase letters. Figure below is a
graph of the cumulative distribution function of the random variable X of
the preceding paragraph. Note that the cdf jumps where p(x) > 0 and
that the jump at xi is p(xi).
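A cdf for a discrete random variable can be built from the frequency function by accumulating the jumps; a minimal sketch for the three-toss X above:

```python
from fractions import Fraction

# Frequency function of X, the number of heads in three fair tosses.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def cdf(x):
    """F(x) = P(X <= x): sum the mass at every value not exceeding x."""
    return sum((p for v, p in pmf.items() if v <= x), Fraction(0))

print(cdf(-1), cdf(0), cdf(1.5), cdf(3))  # 0 1/8 1/2 1
```

Note that cdf(1.5) picks up the jumps at 0 and 1, matching the step shape of the graph.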
It is useful to define here the concept of independence of random
variables. In the case of two discrete random variables X and Y,
taking on possible values x1, x2, … and y1, y2, …, X and Y are said
to be independent if, for all i and j,
P(X = xi and Y = yj) = P(X = xi) P(Y = yj)
Bernoulli Random Variables.
A Bernoulli random variable takes on only two values: 0 and 1, with
probabilities 1 – p and p, respectively. Its frequency function is thus
p(1) = p
p(0) = 1 – p
p(x) = 0, if x ≠ 0 and x ≠ 1
An alternative and sometimes useful representation of this function is
p(x) = { p^x (1 – p)^(1–x), if x = 0 or x = 1
       { 0,                 otherwise
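The one-line representation translates directly into code (a sketch, with p as a parameter):

```python
def bernoulli_pmf(x, p):
    """Frequency function of a Bernoulli(p) random variable."""
    if x in (0, 1):
        return p**x * (1 - p)**(1 - x)
    return 0.0

print(bernoulli_pmf(1, 0.25), bernoulli_pmf(0, 0.25), bernoulli_pmf(2, 0.25))  # 0.25 0.75 0.0
```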
If A is an event, then the indicator random variable, IA,
takes on the value 1 if A occurs and the value 0 if A
does not occur:
IA(ω) = { 1, if ω ∈ A
        { 0, otherwise
The Binomial Distribution.
Suppose that n independent experiments, or trials, are performed,
where n is a fixed number, and that each experiment results in a
“success” with probability p and a “failure” with probability 1 – p.
The total number of successes, X, is a binomial random variable
with parameters n and p. For example, a coin is tossed 10 times
and the total number of heads is counted (“head” is identified with
“success”).
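The binomial frequency function p(k) = (n choose k) p^k (1 – p)^(n–k) can be written with Python's math.comb (a sketch):

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with parameters n and p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Ten tosses of a fair coin: probability of exactly five heads.
print(binomial_pmf(5, 10, 0.5))  # 0.24609375
```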
Two binomial frequency functions are shown in Figure 2-3. Note
how the shape varies as a function of p.
EXAMPLE. Tay-Sachs disease is a rare but fatal disease of
genetic origin occurring chiefly in infants and children,
especially those of eastern-European Jewish extraction. If
a couple are both carriers of Tay-Sachs disease, a child of
theirs has probability .25 of being born with the disease. If
such a couple has four children, what is the frequency
function for the number of children that will have the
disease?
We assume that the four outcomes are independent of
each other, so, if X denotes the number of children with
the disease,
p(k) = (4 choose k) (.25)^k (.75)^(4–k),   k = 0, 1, 2, 3, 4
k p(k)
------------------
0 .316
1 .422
2 .211
3 .047
4 .004
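The table can be reproduced from the binomial formula (a sketch; probabilities are rounded to three places):

```python
from math import comb

n, p = 4, 0.25  # four children, each with probability .25 of having the disease

table = {k: round(comb(n, k) * p**k * (1 - p)**(n - k), 3) for k in range(n + 1)}
for k, prob in table.items():
    print(k, prob)  # 0 0.316, 1 0.422, 2 0.211, 3 0.047, 4 0.004
```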