Bayesrule Stats
Lecture Notes for Introductory Statistics
\[
P(B \mid A) = \frac{P(A \mid B) \times P(B)}{P(A)}
\]
provided of course that P(A) ≠ 0. The above formula is called Bayes' Rule or
Bayes' Theorem.
In the Bayes' Rule formula, there are some terms you need to know. The P(B)
term is referred to as the prior probability of B, and the P(B|A) term is called
the posterior (or updated) probability of B, given that A is known. Intuitively,
P(B) is your initial estimate of how likely B is to occur, and P(B|A) gives you
an updated estimate of this probability once you know some additional
information (i.e. that A has occurred).
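The formula itself is one line of arithmetic, and a short Python sketch makes it concrete (the function name `posterior` and the example numbers are mine, not from the notes):

```python
def posterior(p_a_given_b, p_b, p_a):
    """Bayes' Rule: P(B|A) = P(A|B) * P(B) / P(A), requiring P(A) != 0."""
    if p_a == 0:
        raise ValueError("P(A) must be nonzero")
    return p_a_given_b * p_b / p_a

# Hypothetical example: P(A|B) = .5, prior P(B) = .3, P(A) = .4
print(posterior(0.5, 0.3, 0.4))  # → 0.375
```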
We want to know P(J|S), the updated probability that the email is junk, once we know the
additional information that it contains the word sale. Fortunately, we know that
P (S|J) = .60, as 60 percent of the user’s known junk emails contain the word sale.
Therefore, Bayes’ Rule tells us that
\[
P(CF \mid +) = \frac{P(+ \mid CF) \times P(CF)}{P(+)}
\]
but unfortunately, the probability P (+) (the percentage of the population who
would test positive for CF when tested) is not immediately obvious. However, there
are two scenarios where a subject could test positive for CF: either they have CF
(probability 1/1600) and they subsequently test positive when tested (85 percent
of such subjects will do so based on the information we are given), or a second
possible scenario is the subject does not have CF (probability 1599/1600) and they
subsequently test positive (.001 from the given information). Thus,
\[
P(CF \mid +) = \frac{P(+ \mid CF) \times P(CF)}{P(+)}
= \frac{.85 \times \frac{1}{1600}}{\frac{1}{1600} \times .85 + \frac{1599}{1600} \times .001}
\approx .347
\]
Supplemental topic: Bayes’ Rule N. Smith
Knowledge of this positive test increases the probability the subject has CF from
our initial estimate of 1/1600 to almost 35 percent. Again, this should intuitively seem
reasonable, as the vast majority of people do not have CF, and even a .001
false-positive probability in this large group produces a lot of positive test results. Most
subjects with CF will return a positive test, but the fact there will be so many other
false positives means that while a positive test drastically increases the probability
the subject has CF, it is still far from certain that they do have CF.
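The two-scenario breakdown of the denominator can be checked with a few lines of Python (variable names are my own):

```python
# Cystic fibrosis screening example from the notes.
p_cf = 1 / 1600            # prior: prevalence of CF
p_pos_given_cf = 0.85      # P(+ | CF): true-positive rate
p_pos_given_no_cf = 0.001  # P(+ | no CF): false-positive rate

# Total probability of a positive test: the two scenarios described above.
p_pos = p_cf * p_pos_given_cf + (1 - p_cf) * p_pos_given_no_cf

# Bayes' Rule
p_cf_given_pos = p_pos_given_cf * p_cf / p_pos
print(round(p_cf_given_pos, 3))  # → 0.347
```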
As in the above example, it is not at all uncommon to see the denominator of
Bayes' Rule break into several cases, one for each way the observed event can
occur (this is the law of total probability). Let's look at another example.
Example 3. Suppose that if you arrive on campus before 8 am, there is
a 95% probability you will obtain a premium parking spot. If you arrive between 8
and 9, this probability drops to 50%, and if you arrive after 9, this probability falls
to only 10%.
You have a friend, and based on your knowledge of their habits, you believe that
only 5% of the time do they arrive on campus before 8 am. 20% of the time they
arrive between 8 and 9, and 75% of the time they arrive after 9. However, one day
you see that your friend managed to obtain a premium parking spot. Use Bayes’
Rule to find the probability that they arrived before 8 am (B8), given the evidence
E that they obtained a premium parking spot.
By Bayes’ Rule,
\[
P(B8 \mid E) = \frac{P(E \mid B8) \times P(B8)}{P(E)}
\]
We know P (E|B8) = .95 since a premium spot will be had 95 percent of the
time when your friend arrives before 8. We also know that P (B8) = .05 based on
our knowledge of our friend. To find P (E), there are three cases, as the probability
of getting a premium spot depends on the time of arrival. Thus, we have
\[
P(B8 \mid E) = \frac{.95 \times .05}{.05 \times .95 + .20 \times .50 + .75 \times .10}
\approx .213
\]
Once we see that our friend obtained a premium spot, the scarce availability of
such spots later in the day boosts the likelihood that our friend arrived on campus
before 8 a.m. from 5 percent to more than 20 percent.
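The three-case denominator can again be verified in Python (the dictionary layout is my own):

```python
# Parking example from the notes: priors on arrival time,
# and P(premium spot | arrival time) for each case.
arrival_priors = {"before 8": 0.05, "8 to 9": 0.20, "after 9": 0.75}
p_spot_given_arrival = {"before 8": 0.95, "8 to 9": 0.50, "after 9": 0.10}

# Denominator: total probability of a premium spot across the three cases.
p_spot = sum(arrival_priors[t] * p_spot_given_arrival[t] for t in arrival_priors)

# Bayes' Rule for P(arrived before 8 | premium spot)
p_before8_given_spot = (p_spot_given_arrival["before 8"]
                        * arrival_priors["before 8"] / p_spot)
print(round(p_before8_given_spot, 3))  # → 0.213
```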
\[
P(1W \mid E) = \frac{P(E \mid 1W) \times P(1W)}{P(E)}
\]
\[
= \frac{\frac{1}{3} \times \frac{1}{4}}{P(E)}
\]
Now, the denominator is the probability of drawing a white card from the bag, and
here there are four cases, depending on whether the bag actually contains 0, 1, 2,
or 3 white cards. So,
\[
P(E) = \frac{1}{4} \times 0 + \frac{1}{4} \times \frac{1}{3}
+ \frac{1}{4} \times \frac{2}{3} + \frac{1}{4} \times \frac{3}{3}
= 0 + \frac{1}{12} + \frac{2}{12} + \frac{3}{12} = \frac{6}{12} = \frac{1}{2}
\]
Therefore,
\[
P(1W \mid E) = \frac{\frac{1}{3} \times \frac{1}{4}}{\frac{1}{2}}
= \frac{1/12}{1/2} = \frac{1}{6}
\]
and in exactly the same way,
\[
P(2W \mid E) = \frac{\frac{2}{3} \times \frac{1}{4}}{\frac{1}{2}}
= \frac{2/12}{1/2} = \frac{2}{6}
\]
and
\[
P(3W \mid E) = \frac{\frac{3}{3} \times \frac{1}{4}}{\frac{1}{2}}
= \frac{3/12}{1/2} = \frac{3}{6}
\]
So, if it was initially equally likely that there were 0, 1, 2, or 3 white cards in the
bag, and all we know is that we drew a single white card from the bag, the most
likely scenario is that all three cards in the bag are white!
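The whole four-case calculation can be done exactly with Python's `fractions` module, reproducing the posteriors 1/6, 2/6, and 3/6 above (the variable names are mine):

```python
from fractions import Fraction

# Equally likely that the 3-card bag holds 0, 1, 2, or 3 white cards.
priors = {k: Fraction(1, 4) for k in range(4)}
# P(draw a white card | k white cards in the bag) = k/3
likelihoods = {k: Fraction(k, 3) for k in range(4)}

# Denominator: total probability of drawing a white card (= 1/2).
p_white = sum(priors[k] * likelihoods[k] for k in range(4))

# Bayes' Rule for each case; Fraction reduces 2/6 to 1/3 and 3/6 to 1/2.
posteriors = {k: likelihoods[k] * priors[k] / p_white for k in range(4)}
print(posteriors)  # posterior grows with k; k = 3 is most likely
```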