Product Rule and Bayes' Rule


USES OF CONDITIONAL PROBABILITY

The Product Rule, Bayes’ Rule, and Extended Independence

Probability and Statistics for Teachers, Math 507, Lecture 5

The Product Rule
• Notation: In advanced probability courses it is common
to denote the intersection of events by concatenating
them rather than writing an intersection symbol
between them. That is, if A and B are events we write
AB instead of A ∩ B. Our book does not use this
notation, but I will use it in this lecture to simplify
what I have to type (i.e., writing the letters adjacent
does not make me use the equation editor!).

The Product Rule
• Example: Suppose a bag contains four green beads and
seven red ones. If I pull two out (without replacement),
what is the probability that both are red?
– Intuitively my probability of getting red on the first draw is
7/11, my probability of then getting another red on the
second draw is 6/10=3/5, and my total probability of getting
both red is (7/11)*(3/5)=21/55, which is about 0.38.
– Formally we can solve the problem by counting. There are
C(11,2)=55 2-subsets of the 11 beads, and C(7,2)=21 of them
contain 2 red beads. Assuming all outcomes are equally
likely, the probability of 2 reds is 21/55. We see that our
intuitive procedure yields the correct answer, but can we
justify it formally?
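
The agreement between the two approaches can be checked numerically. The following is a minimal sketch using exact fractions; the names RED, GREEN, by_counting, and by_sequence are illustrative and not from the lecture.

```python
from fractions import Fraction
from math import comb

RED, GREEN = 7, 4          # bead counts from the example
TOTAL = RED + GREEN

# Counting argument: favourable 2-subsets over all 2-subsets.
by_counting = Fraction(comb(RED, 2), comb(TOTAL, 2))

# Intuitive argument: red on the first draw, then red again
# from the ten beads that remain.
by_sequence = Fraction(RED, TOTAL) * Fraction(RED - 1, TOTAL - 1)

print(by_counting, by_sequence)  # 21/55 21/55
```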

The Product Rule
• Theorem (The Product Rule for Probabilities): Suppose
A and B are events in a sample space S with
probability measure P. We already know that P(B|
A)=P(AB)/P(A). Clearing denominators we see
P(AB)=P(A)P(B|A). In other words, the probability
that A and B both happen equals the probability that A
happens times the probability that B happens if A does.

The Product Rule
• Example Revisited: Again suppose we have four green
and seven red beads in a bag and we choose two of
them without replacement. Let F be the event that the
first bead chosen is red and S be the event that the
second bead chosen is red. Then the event that both
beads are red is FS. We can find its probability from
the product rule as follows: P(FS)=P(F)P(S|
F)=(7/11)*(6/10)=21/55. Our intuition is now justified.

The Product Rule
• The product rule generalizes to more events. For instance,
suppose A, B, C, and D are events in a sample space S with
probability measure P. Then P(ABCD)=P(A)P(B|A)P(C|AB)P(D|
ABC). This same pattern works with any number of events.
• Example Re-revisited: Now suppose we have four green and
seven red beads and we want the probability that four beads
chosen (without replacement) from the bag are all red. Let A, B,
C, and D, be the events that the first, second, third, and fourth
beads are red, respectively. Then the probability that all four
beads are red is P(ABCD)=(7/11)*(6/10)*(5/9)*(4/8)
=(7/11)*(3/5)*(5/9)*(1/2)=7/66, which is about 0.11. Note that,
for instance, the fourth factor 4/8=1/2 is the probability that the
fourth bead is red after we have already removed three red beads.
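
The generalized product rule for this bead example can be sketched as a running product of conditional probabilities. The function name prob_all_red is hypothetical, and the sketch assumes the number of draws does not exceed the number of beads in the bag.

```python
from fractions import Fraction

def prob_all_red(red, green, draws):
    """Probability that `draws` beads taken without replacement are all red.

    Assumes draws <= red + green.
    """
    prob = Fraction(1)
    for k in range(draws):
        # Conditional probability of red on draw k+1, given that
        # the first k draws already removed k red beads.
        prob *= Fraction(red - k, red + green - k)
    return prob

print(prob_all_red(7, 4, 2))  # 21/55
print(prob_all_red(7, 4, 4))  # 7/66
```

Each pass through the loop contributes one factor of the form P(next is red | all earlier draws were red), which matches the pattern P(A)P(B|A)P(C|AB)P(D|ABC).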

Independence of three events
• If A, B, and C are events in some sample space S with
a probability measure P, then we say the events are
independent if each pair is independent (i.e.,
P(AB)=P(A)P(B), P(AC)=P(A)P(C), and
P(BC)=P(B)P(C)) and in addition
P(ABC)=P(A)P(B)P(C).
• Intuitively this is what is needed to guarantee that all
the events and their complements are proportionally
represented in each other in all relevant ways. It is
possible to satisfy some of these equations while
failing to satisfy others.

Independence of three events
• Contrary Example: Roll a red die and a clear die. Let A
be the event “the red die is 1,” B be the event “the clear
die is 3,” and C be the event “both dice have the same
number.” Then P(A)=P(B)=P(C)=6/36=1/6. Clearly
P(AB)=1/36=P(A)P(B), P(AC)=1/36=P(A)P(C), and
P(BC)=1/36=P(B)P(C). The event ABC, however, is
empty (since you cannot have the red die 1, the clear
die 3, and both dice the same), so P(ABC)=0. But
P(A)P(B)P(C)=(1/6)(1/6)(1/6)=1/216. Thus A, B, and
C are not independent.
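
The contrary example can be verified by enumerating all 36 outcomes. This is a small sketch under the uniform model; representing events as Python sets of (red, clear) pairs is a choice made for illustration.

```python
from fractions import Fraction
from itertools import combinations, product

space = set(product(range(1, 7), repeat=2))  # outcomes are (red, clear)

def prob(event):
    """Uniform probability of an event given as a set of outcomes."""
    return Fraction(len(event), len(space))

A = {o for o in space if o[0] == 1}      # the red die is 1
B = {o for o in space if o[1] == 3}      # the clear die is 3
C = {o for o in space if o[0] == o[1]}   # both dice show the same number

# Each pair satisfies the product condition ...
pairwise_ok = all(prob(X & Y) == prob(X) * prob(Y)
                  for X, Y in combinations([A, B, C], 2))
# ... but the triple does not, because A, B, and C cannot all occur.
triple_ok = prob(A & B & C) == prob(A) * prob(B) * prob(C)

print(pairwise_ok, triple_ok)  # True False
```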

Independence of three events
• Example: Roll red, green, and clear dice. Let A be the
event “the red die is 1,” B be the event “the green die
is 1,” and C be the event “the clear die is 1.” It is easy
to test that events A, B, and C are independent under
the uniform model. In particular P(ABC)=1/216=(1/6)
(1/6)(1/6)=P(A)P(B)P(C).

Independence of more events.
• A collection of events is independent if, for every subset
of two or more of them, the probability of the intersection
equals the product of the probabilities of the events in
that subset. For example, events A, B, C, and D are
independent if every pair is independent, every triple is
independent, and P(ABCD)=P(A)P(B)P(C)P(D). (Consider
rolling four dice and getting all 1’s, or flipping four coins
and getting HTTH.)
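
The definition amounts to checking one product equation for every sub-collection of two or more events. A minimal sketch of such a check under the uniform model follows; the function is_independent and the set-based representation of events are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations, product

def is_independent(events, space):
    """Check the product condition for every sub-collection of size >= 2.

    `space` is a set of equally likely outcomes; each event is a subset of it.
    """
    def prob(subset):
        return Fraction(len(subset), len(space))

    for size in range(2, len(events) + 1):
        for group in combinations(events, size):
            joint = set(space).intersection(*group)
            expected = Fraction(1)
            for event in group:
                expected *= prob(event)
            if prob(joint) != expected:
                return False
    return True

# Four coin flips: event i says "coin i shows the i-th letter of HTTH".
coins = set(product("HT", repeat=4))
flips = [{o for o in coins if o[i] == face} for i, face in enumerate("HTTH")]
print(is_independent(flips, coins))  # True
```

For n events the loop examines 2^n - n - 1 sub-collections, which is why the pairwise equations alone are not enough.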

Bayes’ Rule
• Bayes’ Rule is a simple formula relating the values of
P(A|B) and P(B|A). It has several forms and interesting
consequences.

Bayes’ Rule
• Theorem 2.16 (Bayes’ Rule)
– Given events H and E in a sample space S with probability
measure P it holds that P(H|E)=P(H)P(E|H)/P(E)
– Proof: By definition of conditional probability, P(H|
E)=P(HE)/P(E). By the product rule P(HE)=P(H)P(E|H).
Therefore P(H|E)=P(H)P(E|H)/P(E).
– Here we use H and E to stand for Hypothesis and Evidence.
We sometimes conceive of Bayes’ Rule as telling us how to
revise the probability of a hypothesis based on the
observation of some particular piece of evidence.

Bayes’ Rule
• Example: You are living in a dorm. One night the fire alarm
goes off. How likely is it that there is a fire? Here H is the event
“there is a fire” and E is the event “the fire alarm goes off.” You
want to know P(H|E). You estimate that all things being equal a
fire is unlikely on a given night, setting P(H)=0.001 (roughly
one fire in three years). You know that in a typical semester of
about 100 days there are about 3 fire alarms (typically false
alarms), so you estimate P(E)=0.03. Finally you guess that it is
nearly certain someone would set off the alarm if there really
were a fire, so you estimate P(E|H)=0.98. By Bayes’ Rule, P(H|
E)=P(H)P(E|H)/P(E)=(0.001)(0.98)/(0.03)=0.033.

Bayes’ Rule
• Notes to the example: From one point of view the alarm is
almost meaningless. There is only a 3.3% chance of a fire. Why is
it so low? Your probabilities say that in 1000 days you should
expect 30 alarms but only one fire. Thus your chance of having a
fire with the alarm is about 1/30. From another point of view the
alarm carries a lot of weight: The alarm raises the likelihood of a
fire roughly thirtyfold, from 0.1% to 3.3% (that is, from 1/1000 to
about 1/30). This is how the evidence (alarm) causes you to revise
your estimate of the hypothesis (fire). In any case the difference
between P(H|E) and P(E|H) is large: 0.033 versus 0.98, a clear
example of how these quantities need not be equal.
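
The fire alarm computation can be reproduced in a few lines. This is a sketch only; the function name bayes and the variable names are illustrative, and the three input probabilities are the estimates made in the example.

```python
def bayes(prior, likelihood, evidence):
    """Simple form of Bayes' Rule: P(H|E) = P(H) P(E|H) / P(E)."""
    return prior * likelihood / evidence

p_fire = 0.001              # P(H), estimated chance of a fire on a given night
p_alarm = 0.03              # P(E), about 3 alarms per 100 days
p_alarm_given_fire = 0.98   # P(E|H)

posterior = bayes(p_fire, p_alarm_given_fire, p_alarm)
print(round(posterior, 3))           # 0.033
print(round(posterior / p_fire, 1))  # 32.7, the factor by which the alarm raises P(fire)
```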

Bayes’ Rule
• Theorem 2.17 (Bayes’ Rule, extended form)
– Under the same circumstances as before,

$$P(H \mid E) = \frac{P(H)\,P(E \mid H)}{P(H)\,P(E \mid H) + P(\overline{H})\,P(E \mid \overline{H})}$$

Bayes’ Rule

[Figure: Venn diagram of the sample space S containing overlapping events H and E; the part of E inside H is yellow and the part of E outside H is orange.]

• Proof: This is the same equation as in the simpler statement
except that in that case the denominator was simply P(E). It is
easy to see that the sets $EH$ and $E\overline{H}$ partition E. The
Venn diagram makes it clear: The event E is partitioned into the
yellow section $EH$ and the orange section $E\overline{H}$. Since
the sets are disjoint and have union E, we have

$$P(E) = P(EH) + P(E\overline{H}).$$

By the product rule the right-hand side becomes

$$P(H)\,P(E \mid H) + P(\overline{H})\,P(E \mid \overline{H}).$$

So we have the same equation as before, but with a fancy expansion
of P(E) in the denominator.

Bayes’ Rule
• Example (medical testing)
– A drug company has designed a test for a disease. Through
extensive testing, the company reports that the test produces
only 1% false positive results (i.e., a healthy person tests
positive) and only 2% false negative results (i.e., a person
with the disease tests negative). Let P be the event “someone
tests positive,” N be the event “someone tests negative,” H
be the event “someone is healthy,” and D be the event
“someone has the disease.” Then the company is reporting
P(P|H)=0.01 (or equivalently P(N|H)=0.99) and P(N|D)=0.02
(or equivalently P(P|D)=0.98).

Bayes’ Rule
• Example (medical testing)
– Suppose you test positive for the disease. How likely is it that
you in fact have the disease? It is tempting but incorrect to
say 98% since P(P|D)=0.98. But you want to know P(D|P),
which may be quite different. It turns out you do not have
enough information yet. Oddly enough you must also know
P(D), the prevalence of the disease in your population. Why?
As in the case of the fires and fire alarms, if the disease is
rare, then false positives will dominate true ones. If the
disease is common, true positives will dominate false ones.

Bayes’ Rule
• Example (medical testing)
– Suppose the disease is rare, occurring in only 0.05% of the
population. Then applying the extended form of Bayes’ Rule we get

$$P(D \mid P) = \frac{P(D)\,P(P \mid D)}{P(D)\,P(P \mid D) + P(H)\,P(P \mid H)} = \frac{0.0005 \times 0.98}{0.0005 \times 0.98 + 0.9995 \times 0.01} \approx 0.047 = 4.7\%$$

Bayes’ Rule
• Thus with a positive test your chance of having the
disease is still just below 5%. Why? Roughly speaking
among 2000 randomly chosen people you expect to
have 20 positive tests but only 1 person with the
disease. Thus about 95% of your positives are false.
Still this is a dramatic increase in the probability of
having the disease, from 0.05% to 4.7%, almost a
hundredfold increase.

Bayes’ Rule
• On the other hand, suppose the disease is common.
Suppose 10% of people in your “population” have the
disease. Then

$$P(D \mid P) = \frac{P(D)\,P(P \mid D)}{P(D)\,P(P \mid D) + P(H)\,P(P \mid H)} = \frac{0.1 \times 0.98}{0.1 \times 0.98 + 0.9 \times 0.01} \approx 0.916 = 91.6\%$$

Bayes’ Rule
• Now the positive test gives you over a 90% chance of
having the disease. How can this be? Now among 1000
people, 100 will have the disease and 98 of them will
test positive. Similarly 900 will not have the disease
and 9 of them will test positive. Now fewer than 10%
of your positives are false.
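
The dependence of P(D|P) on prevalence can be explored with a short function. This is a minimal sketch of the extended form of Bayes' Rule applied to the test in the example; the function name posterior_disease and its parameter names are hypothetical, with defaults set to the company's reported 1% false positive and 2% false negative rates.

```python
def posterior_disease(prevalence, false_pos=0.01, false_neg=0.02):
    """P(D|P) from the extended form of Bayes' Rule."""
    true_pos = 1.0 - false_neg                              # P(P|D)
    sick_and_positive = prevalence * true_pos               # P(D) P(P|D)
    healthy_and_positive = (1.0 - prevalence) * false_pos   # P(H) P(P|H)
    return sick_and_positive / (sick_and_positive + healthy_and_positive)

for prevalence in (0.0005, 0.10):
    print(f"P(D) = {prevalence:.2%}: P(D|P) = {posterior_disease(prevalence):.1%}")
# P(D) = 0.05%: P(D|P) = 4.7%
# P(D) = 10.00%: P(D|P) = 91.6%
```

Trying other prevalence values shows how quickly false positives stop dominating as the disease becomes more common.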

Bayes’ Rule Extended
• Theorem 2.19 (Bayes’ Rule for multiple hypotheses)
Suppose you have n events (hypotheses) $H_1, \ldots, H_n$
that partition the sample space S and you have an event
(evidence) E in S. Then for i between 1 and n,

$$P(H_i \mid E) = \frac{P(H_i)\,P(E \mid H_i)}{P(H_1)\,P(E \mid H_1) + \cdots + P(H_n)\,P(E \mid H_n)}$$

Bayes’ Rule Extended
• Proof: This is identical to the proof of the extended
form of Bayes’ Rule except that you partition E into n
blocks by dividing it among the H’s. Then you use this
partition to expand the denominator P(E) in the simple
form of Bayes’ Rule.

Bayes’ Rule Extended
• Example
– At a college, 40% of the students are freshmen, 25%
sophomores, 20% juniors, and 15% seniors (these are your
H’s, partitioning the whole population of students). The
honor roll (your evidence E) includes 5% of the freshmen
(the percentage of E among freshmen), 10% of the
sophomores, 18% of the juniors, and 22% of the seniors. What
percentage of honor roll students are sophomores? By Bayes’
Rule the probability is

$$\frac{0.25 \times 0.1}{0.4 \times 0.05 + 0.25 \times 0.1 + 0.2 \times 0.18 + 0.15 \times 0.22} \approx 0.219 = 21.9\%$$
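
The same calculation gives the posterior for every class at once. The sketch below assumes the 5%, 10%, 18%, and 22% figures are the honor roll rates within each class, P(E|H_i); the function name posteriors and the dictionary keys are illustrative.

```python
from fractions import Fraction

def posteriors(priors, likelihoods):
    """Bayes' Rule for multiple hypotheses: P(H_i|E) for every i."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}  # P(H_i) P(E|H_i)
    total = sum(joint.values())                              # P(E)
    return {h: joint[h] / total for h in joint}

class_share = {            # P(H_i)
    "freshman": Fraction(40, 100),
    "sophomore": Fraction(25, 100),
    "junior": Fraction(20, 100),
    "senior": Fraction(15, 100),
}
honor_rate = {             # P(E|H_i), assumed honor roll rate within each class
    "freshman": Fraction(5, 100),
    "sophomore": Fraction(10, 100),
    "junior": Fraction(18, 100),
    "senior": Fraction(22, 100),
}

result = posteriors(class_share, honor_rate)
for name, p in result.items():
    print(f"{name}: {float(p):.1%}")
```

The denominator computed inside the function is P(E), the overall fraction of students on the honor roll, so the four posteriors sum to 1.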
