Mathematical Foundations of Computer Science Lecture Outline
Lecture Outline
October 9, 2018
Example. When three dice are rolled, what is the probability that at least one of the dice
results in a 4?
Solution. Let $F_i$, $i \in \{1, 2, 3\}$, be the event that the $i$th die results in a 4. We are
interested in $\Pr[F_1 \cup F_2 \cup F_3]$. By the inclusion-exclusion formula we have
\[
\Pr[F_1 \cup F_2 \cup F_3] = \Pr[F_1] + \Pr[F_2] + \Pr[F_3] - \Pr[F_1 \cap F_2] - \Pr[F_1 \cap F_3] - \Pr[F_2 \cap F_3] + \Pr[F_1 \cap F_2 \cap F_3]
\]
Since the events F1 , F2 , F3 are mutually independent we can rewrite the above expression
as
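\begin{align*}
\Pr[F_1 \cup F_2 \cup F_3] &= \Pr[F_1] + \Pr[F_2] + \Pr[F_3] - \Pr[F_1]\Pr[F_2] - \Pr[F_1]\Pr[F_3] - \Pr[F_2]\Pr[F_3] \\
&\quad + \Pr[F_1]\Pr[F_2]\Pr[F_3] \\
&= \frac{1}{6} + \frac{1}{6} + \frac{1}{6} - \frac{1}{6} \times \frac{1}{6} - \frac{1}{6} \times \frac{1}{6} - \frac{1}{6} \times \frac{1}{6} + \frac{1}{6} \times \frac{1}{6} \times \frac{1}{6} \\
&= \frac{91}{216}
\end{align*}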
An easier way to solve this is as follows. Let $\overline{F_i}$ be the complement of event $F_i$, $i = 1, 2, 3$. Then
\[
\Pr[F_1 \cup F_2 \cup F_3] = 1 - \Pr[\overline{F_1} \cap \overline{F_2} \cap \overline{F_3}] = 1 - (5/6)^3 = \frac{91}{216}
\]
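The value 91/216 can be checked by brute force. The following is a minimal sketch that enumerates all 216 equally likely outcomes of three fair dice and also evaluates the complement formula; the variable names are ours and are not part of the lecture.

```python
from fractions import Fraction
from itertools import product

# Enumerate all 6^3 equally likely outcomes of rolling three fair dice.
outcomes = list(product(range(1, 7), repeat=3))

# Count the outcomes in which at least one die shows a 4.
favorable = sum(1 for roll in outcomes if 4 in roll)
prob_enumerated = Fraction(favorable, len(outcomes))

# Complement rule: 1 - Pr[no die shows a 4].
prob_complement = 1 - Fraction(5, 6) ** 3

print(prob_enumerated, prob_complement)  # 91/216 91/216
```

Exact fractions are used so that the two results can be compared without rounding error.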
Example. A coin is tossed 10 times. What is the probability that eight or more heads
turn up?
Solution. Let $E_i$ denote the event that exactly $i$ heads turn up. We are interested in
$\Pr[E_8 \cup E_9 \cup E_{10}]$. Since the events $E_i$ are disjoint, we have
\[
\Pr[E_8 \cup E_9 \cup E_{10}] = \Pr[E_8] + \Pr[E_9] + \Pr[E_{10}]
\]
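Because the events of exactly 8, exactly 9, and exactly 10 heads are disjoint, their probabilities add. The sketch below evaluates that sum under the assumption that the coin is fair (the example does not say so explicitly), so that each of the 2^10 toss sequences is equally likely. The helper name prob_exactly is ours.

```python
from fractions import Fraction
from math import comb

def prob_exactly(i, n=10):
    """Pr[exactly i heads in n tosses of a fair coin] = C(n, i) / 2^n."""
    return Fraction(comb(n, i), 2 ** n)

# Disjoint events: add the probabilities of exactly 8, 9, and 10 heads.
prob_eight_or_more = sum(prob_exactly(i) for i in (8, 9, 10))

print(prob_eight_or_more)         # 7/128
print(float(prob_eight_or_more))  # 0.0546875
```

Under the fairness assumption the three counts are 45, 10, and 1, giving 56/1024 = 7/128.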
Example. (Birthday Paradox) Suppose there are k people in a room and n days in
a year. We are interested in the probability that there are at least two people in the room
with the same birthday. What is the smallest value of k for which this probability is at
least 1/2? Assume that it is equally likely for a person to be born on any of the n days of
the year.
Solution. Let B be the event that at least two people in the room have the same birthday.
We are interested in Pr[B].
\[
\Pr[B] = 1 - \Pr[\overline{B}] = 1 - \frac{P(n, k)}{n^k}
\]
For n = 365, the smallest value of k for which the RHS is at least 1/2 is 23. If k = 40 then
Pr[B] ≈ 0.89, and if k = 60 then Pr[B] ≈ 0.994. This means that if there are 60 people
then it is almost certain that there exist two among them sharing the same birthday. To
illustrate how good our model is, consider the set of presidents of the United States of
America. Through Bill Clinton, 41 people belong to this set. The chance of two of them
sharing the same birthday is at least 89%. Indeed, James Polk (11th president) and Warren
Harding (29th president) were both born on Nov. 2.
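The numbers quoted for n = 365 can be reproduced with a short computation. The sketch below evaluates 1 − P(n, k)/n^k as a running product, taking P(n, k) = n(n − 1)···(n − k + 1), the number of ways to give k people distinct birthdays. The function names are ours.

```python
def prob_shared_birthday(k, n=365):
    """Pr[B] = 1 - P(n, k) / n^k, computed as a running product."""
    prob_all_distinct = 1.0
    for i in range(k):
        # The (i+1)-th person must avoid the i birthdays already taken.
        prob_all_distinct *= (n - i) / n
    return 1.0 - prob_all_distinct

def smallest_k(threshold=0.5, n=365):
    """Smallest k for which Pr[B] is at least `threshold`."""
    k = 1
    while prob_shared_birthday(k, n) < threshold:
        k += 1
    return k

print(smallest_k())                        # 23
print(round(prob_shared_birthday(40), 3))  # 0.891
print(round(prob_shared_birthday(60), 3))  # 0.994
```

Multiplying the ratios (n − i)/n one at a time avoids forming the very large integers P(n, k) and n^k.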
Conditional Probability
We now introduce the very important concept of conditional probability. Conditional
probability allows us to calculate the probability of an event when some partial information
about the result of an experiment is known. As we shall see, conditional probability is often
a convenient way to calculate probabilities even when no information about the result of
an experiment is available.
Suppose we want to calculate the probability of event A given that event B has already
occurred. We denote this by Pr[A|B] (read as “the probability of A given B”). Since we
know that event B has occurred, our sample space reduces to the outcomes in B. Is this a
valid probability space? Not in general, because the sum of the probabilities of the outcomes
in B is Pr[B], which may be less than 1. How do we change the probabilities so that this is a
valid probability distribution while making sure that the relative probabilities of outcomes
in B do not change? We do this by scaling the probability of every sample point in B by
1/Pr[B]. Thus for each sample point ω ∈ B,
\[
\Pr[\omega \mid B] = \frac{\Pr[\omega]}{\Pr[B]}
\]
To calculate Pr[A|B] we just sum up the probabilities of sample points in A ∩ B. Thus we
get
\[
\Pr[A \mid B] = \sum_{\omega \in A \cap B} \Pr[\omega \mid B] = \sum_{\omega \in A \cap B} \frac{\Pr[\omega]}{\Pr[B]} = \frac{\Pr[A \cap B]}{\Pr[B]}
\]
In order to avoid division by 0, we only define Pr[A|B] when Pr[B] > 0. Conditional
probabilities can sometimes get tricky. To avoid pitfalls, it is best to use the above
mathematical definition of conditional probability. Note that the probabilities on the
right-hand side of the above equation are unconditional probabilities.
Example. Suppose we flip two fair coins. What is the probability that both tosses give
heads given that one of the flips results in heads? What is the probability that both tosses
give heads given that the first coin results in heads?
[Figure: tree diagram for two flips of a fair coin. Each branch has probability 1/2, and each of the four outcomes HH, HT, TH, TT has probability 1/4.]
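The definition Pr[A|B] = Pr[A ∩ B]/Pr[B] can be applied mechanically by enumerating the sample space. The sketch below does this for the two-coin example. It reads "one of the flips results in heads" as "at least one flip results in heads", which is an assumption about the intended meaning. The helper names prob and cond_prob are ours.

```python
from fractions import Fraction
from itertools import product

# Sample space for two fair coin flips: each outcome has probability 1/4.
sample_space = {outcome: Fraction(1, 4) for outcome in product("HT", repeat=2)}

def prob(event):
    """Sum the probabilities of the sample points that satisfy `event`."""
    return sum(p for w, p in sample_space.items() if event(w))

def cond_prob(a, b):
    """Pr[A | B] = Pr[A and B] / Pr[B]; only defined when Pr[B] > 0."""
    pr_b = prob(b)
    if pr_b == 0:
        raise ValueError("Pr[B] must be positive")
    return prob(lambda w: a(w) and b(w)) / pr_b

both_heads = lambda w: w == ("H", "H")
at_least_one_head = lambda w: "H" in w   # assumed reading of the first question
first_is_head = lambda w: w[0] == "H"

print(cond_prob(both_heads, at_least_one_head))  # 1/3
print(cond_prob(both_heads, first_is_head))      # 1/2
```

The two answers differ because the conditioning events differ: three outcomes contain at least one head, but only two have a head on the first flip.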
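For two events $A_1$ and $A_2$ with $\Pr[A_1] > 0$, the formula
\[
\Pr[A_1 \cap A_2] = \Pr[A_1] \cdot \Pr[A_2 \mid A_1]
\]
follows from the definition of $\Pr[A_2 \mid A_1]$. This formula can be generalized
to n events. We state the generalization without proof.
\[
\Pr[A_1 \cap A_2 \cap \cdots \cap A_n] = \Pr[A_1] \cdot \Pr[A_2 \mid A_1] \cdot \Pr[A_3 \mid A_1 \cap A_2] \cdots \Pr[A_n \mid A_1 \cap A_2 \cap A_3 \cap \cdots \cap A_{n-1}]
\]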
Example. The probability that a new car battery functions for over 10,000 miles is 0.8,
the probability that it functions for over 20,000 miles is 0.4, and the probability that it
functions for over 30,000 miles is 0.1. If a new car battery is still working after 10,000 miles,
what is the probability that (i) its total life will exceed 20,000 miles, (ii) its additional life
will exceed 20,000 miles?
L10 : event that the battery lasts for more than 10K miles.
L20 : event that the battery lasts for more than 20K miles.
L30 : event that the battery lasts for more than 30K miles.
We know that Pr[L10 ] = 0.8, Pr[L20 ] = 0.4 and Pr[L30 ] = 0.1. We are interested in
calculating Pr[L20 |L10 ] and Pr[L30 |L10 ].
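\begin{align*}
\Pr[L_{20} \mid L_{10}] &= \frac{\Pr[L_{20} \cap L_{10}]}{\Pr[L_{10}]} \\
&= \frac{\Pr[L_{20}] \cdot \Pr[L_{10} \mid L_{20}]}{0.8} \\
&= \frac{0.4 \times 1}{0.8} \\
&= \frac{1}{2}
\end{align*}
By doing similar calculations it is easy to verify that $\Pr[L_{30} \mid L_{10}] = \frac{1}{8}$.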
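Both conditional probabilities can be checked in a few lines. The sketch below uses the fact that a battery lasting more than 20K (or 30K) miles has necessarily lasted more than 10K miles, so the intersection with L10 is the smaller event itself. The variable names are ours.

```python
from fractions import Fraction

# Survival probabilities from the example, kept as exact fractions.
pr_l10 = Fraction(8, 10)  # lasts more than 10K miles
pr_l20 = Fraction(4, 10)  # lasts more than 20K miles
pr_l30 = Fraction(1, 10)  # lasts more than 30K miles

# L20 and L30 are subsets of L10, so Pr[L20 ∩ L10] = Pr[L20], and likewise for L30.
pr_l20_given_l10 = pr_l20 / pr_l10
pr_l30_given_l10 = pr_l30 / pr_l10

print(pr_l20_given_l10, pr_l30_given_l10)  # 1/2 1/8
```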
Example. An urn initially contains 5 white balls and 7 black balls. Each time a ball is
selected, its color is noted and it is replaced in the urn along with two other balls of the
same color. Compute the probability that the first two balls selected are black and the next
two white.
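One way to approach this example is the multiplication rule for n events: multiply the probability of each draw, conditioned on the draws before it, while updating the contents of the urn. The following is a sketch of that calculation, not necessarily the solution given in lecture; the function name and its parameters are ours.

```python
from fractions import Fraction

def prob_of_sequence(colors, white=5, black=7, added=2):
    """Multiply the conditional probability of each draw given the earlier draws."""
    counts = {"W": white, "B": black}
    prob = Fraction(1)
    for color in colors:
        total = counts["W"] + counts["B"]
        prob *= Fraction(counts[color], total)
        # The drawn ball goes back along with `added` more balls of its color.
        counts[color] += added
    return prob

print(prob_of_sequence("BBWW"))  # 35/768
```

The four factors are 7/12, 9/14, 5/16, and 7/18, whose product is 35/768, or about 0.046.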
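The Total Probability Theorem. Consider events $E$ and $F$. Consider a sample point
$\omega \in E$. Observe that $\omega$ belongs to either $F$ or $\overline{F}$. Thus, the set $E$ is a disjoint union of two
sets: $E \cap F$ and $E \cap \overline{F}$. Hence, when $\Pr[F]$ and $\Pr[\overline{F}]$ are both positive, we get
\[
\Pr[E] = \Pr[E \cap F] + \Pr[E \cap \overline{F}] = \Pr[F] \times \Pr[E \mid F] + \Pr[\overline{F}] \times \Pr[E \mid \overline{F}]
\]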
In general, if $A_1, A_2, \ldots, A_n$ form a partition of the sample space and if $\forall i$, $\Pr[A_i] > 0$, then
for any event $B$ in the same probability space, we have
\[
\Pr[B] = \sum_{i=1}^{n} \Pr[A_i \cap B] = \sum_{i=1}^{n} \Pr[A_i] \times \Pr[B \mid A_i]
\]
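The theorem can be checked numerically on a small sample space. As an illustration of our own choosing, the sketch below reuses the three-dice experiment: the events "the first die shows v", for v = 1, ..., 6, partition the sample space, and B is the event that at least one die shows a 4. All names are ours.

```python
from fractions import Fraction
from itertools import product

# Three fair dice: 216 equally likely sample points.
outcomes = list(product(range(1, 7), repeat=3))
pr_point = Fraction(1, len(outcomes))

def prob(event):
    """Probability of an event given as a predicate on sample points."""
    return sum((pr_point for w in outcomes if event(w)), Fraction(0))

def b(w):
    """Event B: at least one die shows a 4."""
    return 4 in w

# Partition A_1..A_6 by the value of the first die; sum Pr[A_i] * Pr[B | A_i].
total = Fraction(0)
for v in range(1, 7):
    a_i = lambda w, v=v: w[0] == v
    pr_a = prob(a_i)
    pr_b_given_a = prob(lambda w: a_i(w) and b(w)) / pr_a
    total += pr_a * pr_b_given_a

print(total, prob(b))  # 91/216 91/216
```

The weighted sum over the partition agrees with the direct computation of Pr[B], as the theorem states.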