Artificial Intelligence Exercises For Tutorial 3 On Probabilistic Inference and Bayesian Networks Including Answers November 2018
Introduction
The following multiple-choice questions are examples of typical questions one can expect
on the AI exam. The questions on the AI exam are also multiple choice, but for this
tutorial one has to explain the answers given. Moreover, at the end one can find some open
questions.
After the tutorial the answers to the MC questions will be available on BB.
1. A taxi is involved in an accident. In this town 30% of the taxis are blue (C = b) and
70% are green (C = g), and a witness identifies the colour of a taxi correctly with
probability 0.7.
a: A witness declares that the taxi was blue. Given the declaration of our witness,
what is the probability that the taxi is indeed blue?
b: Suppose two witnesses independently declare that the taxi was blue. Draw the
Bayesian Network for this case. What is the probability that the taxi was
indeed blue?
c: Now assume that a third independent witness appears on the scene and declares
that the taxi was green. What is now the probability that the taxi was indeed
blue?
Answer:
a: We must compute P(C = b | W = b). Using Bayes' Law this is equal to
P(W = b | C = b) P(C = b) / P(W = b), where
P(W = b) = P(W = b, C = b) + P(W = b, C = g)
= P(W = b | C = b) P(C = b) + P(W = b | C = g) P(C = g)
= 0.7 × 0.3 + 0.3 × 0.7 = 0.42.
Hence P(C = b | W = b) = (0.7 × 0.3) / 0.42 = 0.21 / 0.42 = 0.5.
b: Both witness reports W1 and W2 are children of C in the Bayesian Network, so they
are conditionally independent given C. We compute
P(W1 = b, W2 = b) = P(W1 = b | C = b) P(W2 = b | C = b) P(C = b)
+ P(W1 = b | C = g) P(W2 = b | C = g) P(C = g)
= 0.7 × 0.7 × 0.3 + 0.3 × 0.3 × 0.7 = 0.147 + 0.063 = 0.210.
Hence P(C = b | W1 = b, W2 = b) = 0.147 / 0.210 = 0.7.
c: By the same reasoning, P(C = b | W1 = b, W2 = b, W3 = g) ∝ 0.7 × 0.7 × 0.3 × 0.3 =
0.0441, while P(C = g | W1 = b, W2 = b, W3 = g) ∝ 0.3 × 0.3 × 0.7 × 0.7 = 0.0441.
The two are equal, so after normalization the probability that the taxi was indeed
blue is 0.5.
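These computations (including part c) can be verified with a short script. The function name and structure are my own sketch; the numbers (30% blue taxis, 70% witness reliability) are the ones used in the answer. Since the witnesses are conditionally independent given the colour, the likelihoods simply multiply:

```python
# Posterior probability that the taxi is blue, given independent witness reports.
# Prior: 30% of taxis are blue; a witness reports the colour correctly with p = 0.7.

def taxi_posterior(reports, p_blue=0.3, reliability=0.7):
    """P(C = b | reports); each report is 'b' or 'g'.
    Witnesses are conditionally independent given the true colour C."""
    like_blue, like_green = p_blue, 1 - p_blue
    for r in reports:
        like_blue *= reliability if r == "b" else 1 - reliability
        like_green *= (1 - reliability) if r == "b" else reliability
    return like_blue / (like_blue + like_green)

print(round(taxi_posterior(["b"]), 3))            # part a: 0.5
print(round(taxi_posterior(["b", "b"]), 3))       # part b: 0.7
print(round(taxi_posterior(["b", "b", "g"]), 3))  # part c: 0.5
```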
2. The Prosecution argument. The counsel for the prosecution argues as follows:
Ladies and gentlemen of the jury, the probability of the observed match be-
tween the sample at the scene of the crime and that of the suspect having
arisen by innocent means is 1 in 10 million. This is an entirely negligible
probability, and we must therefore conclude that with a probability over-
whelmingly close to 1 that the suspect is guilty. You have no alternative
but to convict.
This argument is known as the Prosecutor’s Fallacy. Explain the error in the counsel’s
reasoning.
Answer: The confusion is between P(M | ¬G), the probability of a match given that the
suspect is not guilty, which is 1 in 10 million, and P(¬G | M), the probability that the
suspect is not guilty given the match. They are not the same; Bayes' rule gives the
relation: P(¬G | M) = P(M | ¬G) P(¬G) / P(M). The two values coincide only when the priors
P(¬G | B) and P(G | B) (based on all other evidence B) are equal.
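The role of the prior can be made concrete with a hypothetical calculation. Everything here except the 1-in-10-million match probability is an invented assumption: a prior population of 1,000,000 possible culprits and P(M | G) = 1.

```python
# Prosecutor's fallacy: P(M | not G) is tiny, but P(not G | M) depends on the prior.

def p_guilty_given_match(p_match_given_innocent, prior_guilty):
    # Bayes' rule, assuming a guilty suspect always matches: P(M | G) = 1.
    num = prior_guilty
    den = num + p_match_given_innocent * (1 - prior_guilty)
    return num / den

# Prior from the hypothetical population: P(G) = 1/1,000,000.
print(round(p_guilty_given_match(1e-7, 1e-6), 3))  # 0.909: far from certainty
```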
3. “Most car accidents are caused by people that do have a driver’s licence.” What is
suggested by this statement? What are the relevant conditional probabilities?
Answer: The suggestion is that having a driver's licence causes (or at least has a pos-
itive impact on) car accidents, more so than not having a driver's licence. The relevant
probabilities are P(Acc | ¬HasLicence) and P(Acc | HasLicence). The statement itself is
probably true, since most people driving a car have a driver's licence. But this does not
imply that the second conditional probability is larger than the first one. One may
construct two worlds in which the statement is true in both, but the suggestion is true
in one world and not in the other.
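A numerical sketch of such a pair of worlds (all counts are invented for illustration):

```python
# Two invented worlds. In both, most accidents are caused by licensed drivers,
# but only in the first does a licence go with a higher per-capita accident rate.

def stats(n_lic, n_unlic, acc_lic, acc_unlic):
    frac_by_licensed = acc_lic / (acc_lic + acc_unlic)   # share of accidents
    p_acc_given_lic = acc_lic / n_lic                    # P(Acc | HasLicence)
    p_acc_given_unlic = acc_unlic / n_unlic              # P(Acc | not HasLicence)
    return frac_by_licensed, p_acc_given_lic, p_acc_given_unlic

# World 1: licensed drivers are also riskier per capita (suggestion holds).
print(stats(9000, 1000, 900, 10))
# World 2: unlicensed drivers are five times riskier per capita, yet licensed
# drivers still cause most (900 of 1400) accidents (suggestion fails).
print(stats(9000, 1000, 900, 500))
```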
4. Make exercise 14.1 from the book of Russell and Norvig, Artificial Intelligence (3rd
edition).
Solution (from R&N): see Figures 1 and 2.
Figure 1: Solution exercise 14.1 (first part)
5. Make exercise 14.4 from the book of Russell and Norvig, Artificial Intelligence (3rd
edition).
Solution (from R&N): see Figure 3.
Figure 4: The Sprinkler Bayesian network
6. Given the Sprinkler network shown in Figure 4. What is the best approximation
of the value of P(S = True | W = True) (the probability that the Sprinkler was on
given that the grass is Wet)? Use the enumeration method. Indicate where you use
the conditional independence relations represented by the BN.
(a) 0.2781
(b) 0.6471
(c) 0.1945
(d) 0.4298
P(S | W = True)
= α P(S, W = True)
= α Σ_c Σ_r P(C = c, S, R = r, W = True)
= α Σ_c Σ_r P(C = c) P(S | C = c) P(R = r | C = c) P(W = True | S, R = r)
= α Σ_c P(C = c) P(S | C = c) Σ_r P(R = r | C = c) P(W = True | S, R = r)
(factoring P(C = c) and P(S | C = c) out of the summation over r).
We compute this expression for each value of S (i.e. for both S = True and S = False)
to obtain both values of the distribution P(S | W = True).
We start with the conditional probability value for S = True:
P(S = True | W = True)
= α Σ_c P(C = c) P(S = True | C = c) Σ_r P(R = r | C = c) P(W = True | S = True, R = r)
(where α stands for the normalization constant 1/P(W = True))
= α P(C = True) P(S = True | C = True) Σ_r P(R = r | C = True) P(W = True | S = True, R = r)
+ α P(C = False) P(S = True | C = False) Σ_r P(R = r | C = False) P(W = True | S = True, R = r)
= α × 0.5 × 0.1 × (P(R = True | C = True) P(W = True | S = True, R = True)
+ P(R = False | C = True) P(W = True | S = True, R = False))
+ α × 0.5 × 0.5 × (P(R = True | C = False) P(W = True | S = True, R = True)
+ P(R = False | C = False) P(W = True | S = True, R = False))
= α × 0.5 × 0.1 × (0.8 × 0.99 + 0.2 × 0.9) + α × 0.5 × 0.5 × (0.2 × 0.99 + 0.8 × 0.9)
= α × 0.2781
And, for S = False:
P(S = False | W = True)
= α Σ_c P(C = c) P(S = False | C = c) Σ_r P(R = r | C = c) P(W = True | S = False, R = r)
= α P(C = True) P(S = False | C = True) Σ_r P(R = r | C = True) P(W = True | S = False, R = r)
+ α P(C = False) P(S = False | C = False) Σ_r P(R = r | C = False) P(W = True | S = False, R = r)
= α × 0.5 × 0.9 × (P(R = True | C = True) P(W = True | S = False, R = True)
+ P(R = False | C = True) P(W = True | S = False, R = False))
+ α × 0.5 × 0.5 × (P(R = True | C = False) P(W = True | S = False, R = True)
+ P(R = False | C = False) P(W = True | S = False, R = False))
= α × 0.5 × 0.9 × (0.8 × 0.9 + 0.2 × 0.0) + α × 0.5 × 0.5 × (0.2 × 0.9 + 0.8 × 0.0)
= α × 0.369
Thus: P(S | W = True) = α < 0.2781, 0.369 >.
After normalization we obtain: P(S | W = True) = < 0.2781/(0.2781 + 0.369),
0.369/(0.2781 + 0.369) > = < 0.4298, 0.5702 >.
Conclusion: answer d) (0.4298) is the best approximation of P(S = True | W = True).
7. Given the Sprinkler network shown in Figure 4. What is the best approximation of
the value of P(S = True | W = True, R = True) (the probability that the Sprinkler
was on given that the grass is Wet and that it was Raining)?
(a) 0.2781
(b) 0.6471
(c) 0.1945
(d) 0.4298
P(S | W = True, R = True)
= α P(S, W = True, R = True)
= α Σ_c P(C = c, S, R = True, W = True)
= α Σ_c P(C = c) P(S | C = c) P(R = True | C = c) P(W = True | S, R = True)
We first compute the value for S = True:
P(S = True | W = True, R = True)
= α Σ_c P(C = c) P(S = True | C = c) P(R = True | C = c) P(W = True | S = True, R = True)
= α P(C = True) P(S = True | C = True) P(R = True | C = True) P(W = True | S = True, R = True)
+ α P(C = False) P(S = True | C = False) P(R = True | C = False) P(W = True | S = True, R = True)
= α (0.5 × 0.1 × 0.8 × 0.99 + 0.5 × 0.5 × 0.2 × 0.99)
= α × 0.0891
For S = False we compute:
P(S = False | W = True, R = True)
= α Σ_c P(C = c) P(S = False | C = c) P(R = True | C = c) P(W = True | S = False, R = True)
= α P(C = True) P(S = False | C = True) P(R = True | C = True) P(W = True | S = False, R = True)
+ α P(C = False) P(S = False | C = False) P(R = True | C = False) P(W = True | S = False, R = True)
= α (0.5 × 0.9 × 0.8 × 0.9 + 0.5 × 0.5 × 0.2 × 0.9)
= α × 0.369
Normalization: P(S | W = True, R = True) = α < 0.0891, 0.369 >
= < 0.0891/(0.0891 + 0.369), 0.369/(0.0891 + 0.369) > = < 0.1945, 0.8055 >.
Conclusion: c is the correct answer.
Remark: the two exercises show an example of explaining away: observing one of the
possible causes of an event makes the other possible cause less probable. Here, the
added evidence that it has Rained makes it less probable that the Sprinkler was on.
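Both enumeration exercises can be reproduced with a short script. The CPT values are the ones quoted in the derivations above; the function and variable names are my own sketch:

```python
# Inference by enumeration in the Sprinkler network of Figure 4.
from itertools import product

P_C = {True: 0.5, False: 0.5}
P_S = {True: {True: 0.1, False: 0.9}, False: {True: 0.5, False: 0.5}}  # P_S[c][s]
P_R = {True: {True: 0.8, False: 0.2}, False: {True: 0.2, False: 0.8}}  # P_R[c][r]
P_W = {(True, True): 0.99, (True, False): 0.9,
       (False, True): 0.9, (False, False): 0.0}   # P(W = True | s, r)

def joint(c, s, r, w):
    """Full joint probability from the BN factorization."""
    pw = P_W[(s, r)]
    return P_C[c] * P_S[c][s] * P_R[c][r] * (pw if w else 1 - pw)

def p_s_given(evidence):
    """P(S = True | evidence), evidence a dict over {'R', 'W'};
    hidden variables are summed out, then the result is normalized."""
    num = {True: 0.0, False: 0.0}
    for c, s, r in product([True, False], repeat=3):
        if 'R' in evidence and r != evidence['R']:
            continue
        num[s] += joint(c, s, r, evidence['W'])
    return num[True] / (num[True] + num[False])

print(round(p_s_given({'W': True}), 4))             # 0.4298 (question 6)
print(round(p_s_given({'W': True, 'R': True}), 4))  # 0.1945 (question 7)
```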
8. In the Bayesian Network below, with three boolean variables, the probabilities for
L and M are P(M = true) = 0.1 and P(L = true) = 0.7, and the conditional
probabilities for variable V are as shown in the table. What is the value of
P(V = true | L = true)?

[Network structure: L → V ← M]

L       M       P(V = true | L, M)
true    true    0.9
true    false   0.5
false   true    0.3
false   false   0.05
(a) 0.72
(b) 0.54
(c) 0.46
(d) 0.28
Answer: b (0.54)
Following the same strategy, via computing the full joint probability and using the
definition of conditional probability P(X | Y) = P(X, Y)/P(Y):
P(V = t | L = t)
= (summing out M) Σ_m P(V = t, M = m | L = t)
= (definition cond. prob.) Σ_m P(V = t, M = m, L = t) / Σ_m Σ_v P(V = v, M = m, L = t)
= (use BN; L and M are independent) Σ_m P(L = t) P(M = m) P(V = t | L = t, M = m) / P(L = t)
= Σ_m P(M = m) P(V = t | L = t, M = m)
= 0.1 × 0.9 + 0.9 × 0.5 = 0.09 + 0.45 = 0.54
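A quick numerical check of this derivation (the variable names are my own):

```python
# P(V = true | L = true): M is independent of L, so sum out M with its prior.
P_M = 0.1                                          # P(M = true)
P_V = {(True, True): 0.9, (True, False): 0.5,
       (False, True): 0.3, (False, False): 0.05}   # P(V = true | L, M)

p = P_V[(True, True)] * P_M + P_V[(True, False)] * (1 - P_M)
print(round(p, 2))  # 0.54
```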
9. Given the Bayesian Network in Figure 5, in which C is a parent of both D and E.
What is the best approximation of the value of P(D = true)?

[Figure 5: network with arrows C → D and C → E]
(a) 0.50
(b) 0.32
(c) 0.18
(d) 0.90
Answer: a (0.50).
P(D = t) = Σ_c Σ_e P(C = c, E = e, D = t)
= Σ_c Σ_e P(C = c) P(E = e | C = c) P(D = t | C = c) (using the BN semantics).
By factoring out and simplifying, because Σ_e P(E = e | C = f) = 1, as well as
Σ_e P(E = e | C = t) = 1, we obtain:
P(C = t) P(D = t | C = t) + P(C = f) P(D = t | C = f) = 0.4 × 0.8 + 0.6 × 0.3 = 0.5.
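A quick numerical check (the CPT values are the ones used in the answer; names are my own):

```python
# P(D = true) in Figure 5: E sums out to 1, leaving a sum over C.
P_C = 0.4                        # P(C = true)
P_D = {True: 0.8, False: 0.3}    # P(D = true | C)

p = P_C * P_D[True] + (1 - P_C) * P_D[False]
print(round(p, 2))  # 0.5
```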
10. Consider again the Bayesian Network in Figure 5 with the probability distributions
as given in the exercise above. One of the following statements is true. Which one?
11. In the Bayesian Network below, with three boolean variables, the probabilities for
L and M are P(M = true) = 0.2 and P(L = true) = 0.7, and the conditional
probabilities for variable V are as shown in the table.

[Network structure: L → V ← M]

L       M       P(V = true | L, M)
true    true    0.9
true    false   0.5
false   true    0.3
false   false   0.05
(a) 0.3
(b) 0.7
(c) 0.9
(d) 0.1
Answer: c (0.9)
Figure 6: The dependencies for the user tests of my design
12. The Bayesian Network structure given in Figure 6 models the dependencies between
three properties related to my design. IsMale is true when the user is male, IsYoung
is true when the user is young, LikesDesign is true when the user likes my design.
Which of the following statements is true? Give a proof or counterexample.