Lecture 5: Independence
The content of this lecture roughly corresponds to Sec. 3.8 of Sheldon Ross's Introduction to Probability
and Statistics for Engineers and Scientists.
Example 1. (Illustrative example)
Consider the following two events A and B in the Venn diagram below. Note that P(A) = 0.5. What about P(B)?
Figure 5.1: The numbers in red ink correspond to the probabilities in this example, e.g., P(A) = 0.5, P(B) = 0.6.
Caution: It is important to point out that events which are mutually exclusive typically are NOT independent.
In Figure 5.2, P(A ∩ B) = P(∅) = 0. However, as long as P(A) > 0 and P(B) > 0, we have P(A)P(B) > 0,
and thus P(A ∩ B) ≠ P(A)P(B). Therefore, A and B may be mutually exclusive but not independent (i.e.,
dependent), and vice versa. The explanation is that when events are mutually exclusive, knowledge that
one of them occurred implies that the other did not; hence knowledge of one influences the probability
of the other.
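The point of the caution can be checked numerically. A minimal sketch (the probabilities below are chosen for illustration and are not taken from the figure):

```python
# Two mutually exclusive events with positive probabilities can never be independent.
p_A = 0.5
p_B = 0.4
p_A_and_B = 0.0  # mutually exclusive: A ∩ B = ∅, so P(A ∩ B) = 0

# Independence would require P(A ∩ B) == P(A) * P(B), but the right side is positive.
print(p_A * p_B)               # 0.2
print(p_A_and_B == p_A * p_B)  # False: the events are dependent
```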
Figure 5.2: A common misconception about independence: one may be tempted to call these two events
independent because they "don't touch one another", but this is incorrect.
Proposition 5.2. If we know A, B are independent events, then it turns out that
1. A and B^c are independent;
2. A^c and B are independent;
3. A^c and B^c are independent.
You can quickly check this visually for Example 5.1 from the Venn diagram.
Proof. Here, we prove 1. The other two are left as an exercise. From the Venn diagram below,
P(A ∩ B^c) = P(A) − P(A ∩ B)
= P(A) − P(A)P(B)
= P(A)[1 − P(B)]
= P(A)P(B^c),
where the second equality uses the independence of the events. Quick note: these are a lot of conditions
to check for independence. Fortunately, in statistics independence is typically an assumption that is made
or a condition that follows from the design of the experiment.
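The identity proved above can be sanity-checked numerically, for instance with P(A) = 0.5 and P(B) = 0.6 as in Figure 5.1 (a sketch, assuming the events are independent for these values):

```python
p_A, p_B = 0.5, 0.6

# Under independence, P(A ∩ B) = P(A) P(B).
p_A_and_B = p_A * p_B

# First step of the proof: P(A ∩ B^c) = P(A) - P(A ∩ B).
p_A_and_Bc = p_A - p_A_and_B

# Conclusion of the proof: this equals P(A) P(B^c).
assert abs(p_A_and_Bc - p_A * (1 - p_B)) < 1e-12
print(p_A_and_Bc)  # 0.2
```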
Remark. Earlier we were able to show that if A1 and A2 are independent, then A1 and A2^c, A1^c and A2,
and A1^c and A2^c are independent as well. Interestingly, this property holds for any finite number
of independent events. So, for example, if A1, A2, A3, A4 are independent, then A1, A2^c, A3^c, A4 are also
independent, as is any collection obtained from A1, A2, A3, A4 by replacing some of the Ai by Ai^c.
Example 3. (Interlude to the binomial distribution)
Suppose A1, A2, A3, A4 are independent and P(Ai) = p, 0 < p < 1, for all i = 1, 2, 3, 4. Observe that the
complement rule implies P(Ai^c) = 1 − p for all i. Compute the probability that "NONE", "exactly 1",
"exactly 2", "exactly 3" and "exactly 4" of the events A1, A2, A3, A4 occur.
Solution: NONE of the events is the event A1^c ∩ A2^c ∩ A3^c ∩ A4^c. Hence, by independence, its probability is
P(A1^c)P(A2^c)P(A3^c)P(A4^c) = (1 − p)^4.
Exactly 1 of the events is (A1 ∩ A2^c ∩ A3^c ∩ A4^c) ∪ (A1^c ∩ A2 ∩ A3^c ∩ A4^c) ∪ (A1^c ∩ A2^c ∩ A3 ∩ A4^c) ∪ (A1^c ∩ A2^c ∩ A3^c ∩ A4).
Note that the events are independent, thus
P(A1 ∩ A2^c ∩ A3^c ∩ A4^c) = P(A1)P(A2^c)P(A3^c)P(A4^c) = p(1 − p)^3.
Similarly,
P(A1^c ∩ A2 ∩ A3^c ∩ A4^c) = P(A1^c ∩ A2^c ∩ A3 ∩ A4^c) = P(A1^c ∩ A2^c ∩ A3^c ∩ A4) = p(1 − p)^3.
Note that the parenthesized events are mutually exclusive, hence the desired probability that exactly
1 of the events occurs is 4p(1 − p)^3. As an exercise, show that the probability that exactly 2 of these events
occur is 6p^2(1 − p)^2, that exactly 3 occur is 4p^3(1 − p), and that all 4 occur is p^4.
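These "exactly k" probabilities can be verified by brute force: enumerate all 2^4 occurrence patterns of the four independent events and sum the probabilities of the patterns with exactly k occurrences. A sketch in Python (the particular value of p is arbitrary):

```python
from itertools import product
from math import comb

p = 0.3   # any 0 < p < 1; this particular value is only for illustration
n = 4

def prob_exactly(k):
    """Sum P(pattern) over all occurrence patterns with exactly k events occurring."""
    total = 0.0
    for pattern in product([True, False], repeat=n):
        if sum(pattern) == k:
            pr = 1.0
            for occurs in pattern:
                pr *= p if occurs else (1 - p)  # independence: probabilities multiply
            total += pr
    return total

# Matches the closed forms C(4,k) p^k (1-p)^(4-k), i.e. coefficients 1, 4, 6, 4, 1.
for k in range(n + 1):
    assert abs(prob_exactly(k) - comb(n, k) * p**k * (1 - p)**(n - k)) < 1e-12
```

The counts 1, 4, 6, 4, 1 produced by the enumeration are exactly the binomial coefficients that give the example its name.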
Example 4. (Darts game)
In a game of darts, Fred claims that he can hit the bullseye with probability 5% (when the bullseye is his
target). Assume Fred's attempts at hitting the bullseye are independent from throw to throw, so that if we
let Ai be the event that Fred hits the bullseye on the ith attempt, then the sequence A1, A2, A3, · · · is assumed
independent.
1. Compute the probability that Fred hits the bullseye on each of his first 3 throws.
2. Compute the probability that Fred hits the bullseye in at least one of his first 3 throws.
3. How many times should Fred throw the dart in order to have at least a 95% chance of hitting
the bullseye at least once?
Solution:
1. The event of Fred hitting the bullseye on each of his first 3 throws is A1 ∩ A2 ∩ A3. So, the desired
probability is
P(A1 ∩ A2 ∩ A3) = P(A1)P(A2)P(A3) = (0.05)^3 = 0.000125.
2. The event of Fred hitting the bullseye at least once in his first 3 throws is A1 ∪ A2 ∪ A3. Then, by De Morgan's law and independence,
P(A1 ∪ A2 ∪ A3) = 1 − P(A1^c ∩ A2^c ∩ A3^c) = 1 − (0.95)^3 = 0.142625.
3. Here, the problem statement asks us to find the number of throws n such that P(A1 ∪ A2 ∪ · · · ∪ An) ≥
0.95. As in 2., P(A1 ∪ A2 ∪ · · · ∪ An) = 1 − P(A1^c ∩ A2^c ∩ · · · ∩ An^c) = 1 − (0.95)^n. Therefore,
(0.95)^n ≤ 0.05, so n ln(0.95) ≤ ln(0.05), and hence n ≥ ln(0.05)/ln(0.95) ≈ 58.4. Note that the
inequality changes direction because ln(0.95) < 0.
In the end, the number of necessary dart throws is n = 59.
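The computation in part 3 can be sketched in a few lines of Python:

```python
from math import ceil, log

p_miss = 0.95  # probability of missing the bullseye on a single throw

# Smallest n with 1 - 0.95**n >= 0.95, i.e. n >= ln(0.05) / ln(0.95) ≈ 58.4.
n = ceil(log(0.05) / log(p_miss))
print(n)  # 59
```

A quick check confirms the boundary: 58 throws give a success probability just below 0.95, while 59 throws push it just above.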
The main takeaway of this exercise is that, when the independence assumption holds, we can compute the
probabilities of much more complex events by knowing only the probabilities of the individual events.
Remark. Regarding the last example, we showed P(A1 ∪ A2 ∪ A3) = 0.142625 by using the complement rule
and employing De Morgan's law. We could have also used the "tool" of the inclusion-exclusion rule to
solve the problem. To see this,
P(A1 ∪ A2 ∪ A3) = P(A1) + P(A2) + P(A3) − P(A1 ∩ A2) − P(A1 ∩ A3) − P(A2 ∩ A3) + P(A1 ∩ A2 ∩ A3)
= 3(0.05) − 3(0.05)^2 + (0.05)^3 = 0.142625.
Although it is possible to get the correct answer using the inclusion-exclusion rule, it is a more labor-intensive
approach. Nevertheless, sometimes it may be the only viable approach.
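The agreement between the two approaches is easy to confirm numerically:

```python
p = 0.05

# Complement rule + De Morgan's law:
via_complement = 1 - (1 - p) ** 3

# Inclusion-exclusion (by independence, P(Ai ∩ Aj) = p**2 and P(A1 ∩ A2 ∩ A3) = p**3):
via_inclusion_exclusion = 3 * p - 3 * p ** 2 + p ** 3

assert abs(via_complement - via_inclusion_exclusion) < 1e-12
print(round(via_complement, 6))  # 0.142625
```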
Prelude to Combinatorics (alternatively, why we need to learn to count)
Imagine an experiment that leads to a finite sample space Ω of N equally likely outcomes (sample points), say
Ω = {ω1, ω2, · · · , ωN}. Since the sample points are equally likely, P({ωi}) = 1/N for all i. The explanation
is the following: assume that P({ωi}) = c for some constant c. Then,
1 = P(Ω) = P(∪_{i=1}^N {ωi}) = Σ_{i=1}^N P({ωi}) = Σ_{i=1}^N c = cN,
thus c = 1/N. Therefore, given any event A, we see that A is the mutually exclusive union of its sample points,
and since each such sample point has probability 1/N, it follows that P(A) = 1/N + 1/N + · · · + 1/N = |A|/N. The
point of this is that when the outcomes of a finite sample space are equally likely, the probability of an event A is simply the
proportion of sample points in Ω that belong to A. Consequently, all we have to do is count how many
sample points there are in A and in Ω and take their ratio. Many experiments of this type may involve very
large sample spaces and events, so we need more systematic/efficient ways to count.
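The counting principle P(A) = |A|/|Ω| can be illustrated with a small, standard example not taken from these notes: rolling two fair dice, where all 36 ordered outcomes are equally likely.

```python
from itertools import product

# Omega: all 36 equally likely ordered outcomes of rolling two fair dice.
omega = list(product(range(1, 7), repeat=2))

# Event A: the two dice sum to 7.
A = [w for w in omega if w[0] + w[1] == 7]

# P(A) = |A| / |Omega|: just count and take the ratio.
prob = len(A) / len(omega)
print(len(A), len(omega), prob)  # 6 36 0.16666666666666666
```

For 36 outcomes a direct enumeration is trivial, but for larger sample spaces listing every point becomes infeasible, which is exactly why the systematic counting tools of combinatorics are needed.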