Exam2 s15 Sol
LAST NAME:
SOLUTIONS
FIRST NAME:
Problem   Score         Max Score
1         ___________   12
2         ___________   13
3         ___________   11
4         ___________   12
5         ___________   12
6         ___________   14
7         ___________   9
8         ___________   17
Total     ___________   100
Each person gets the same menu consisting of one item in each course. Dietary restrictions of the
guests imply the following constraints:
(i) The appetizer must be veggies or the main course must be pasta or fish.
(ii) If you serve salad, the beverage must be water.
(iii) You must serve at least one of milk, ice cream or cheese.
(a) [4] Draw the constraint graph and the initial domain of each variable associated with this
problem. That is, show a graph with 4 nodes labeled A, B, M and D and arcs connecting
appropriate pairs of nodes based on the constraints; show the domains beside each node.
(b) [4] Say we decide to have the appetizer be salad, i.e., A = s. What are the domains of all the
variables after applying the Forward Checking inference algorithm (but no backtracking
search)?
Eliminate values b, m and h, resulting in A = {s}, B = {w},
M = {f, p}, and D = {t, i, c}
(c) [4] Instead of using Forward Checking in (b), say we initially set A = s and then apply the
Arc-Consistency algorithm (AC-3) (but no backtracking search). What are the domains of
all the variables after it halts?
A = {s}, B = {w}, M = {f, p}, D = {i, c}. t is eliminated in
addition to the values eliminated in (b) because there is no
value for B that is consistent with t at D based on constraint
(iii).
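The arc-consistency computation in (c) can be sketched in code. This is a generic AC-3 sketch, not the menu CSP itself: the toy variables, domains, and "X < Y" constraint below are illustrative assumptions of mine, since the full domains for this problem appear on an earlier page.

```python
from collections import deque

# Generic AC-3 sketch: repeatedly remove values with no supporting
# value at the other end of an arc, re-queueing affected arcs.
def ac3(domains, constraints):
    # constraints: dict mapping arc (Xi, Xj) -> predicate(vi, vj)
    queue = deque(constraints.keys())
    while queue:
        xi, xj = queue.popleft()
        pred = constraints[(xi, xj)]
        revised = False
        for vi in list(domains[xi]):
            # delete vi if no value of Xj is consistent with it
            if not any(pred(vi, vj) for vj in domains[xj]):
                domains[xi].remove(vi)
                revised = True
        if revised:
            # re-examine arcs pointing into Xi (except the one from Xj)
            for (xk, xl) in constraints:
                if xl == xi and xk != xj:
                    queue.append((xk, xl))
    return domains

# Toy example (mine, not from the exam): X < Y with X, Y in {1, 2, 3}
doms = {"X": {1, 2, 3}, "Y": {1, 2, 3}}
cons = {("X", "Y"): lambda a, b: a < b,
        ("Y", "X"): lambda a, b: b < a}
print(ac3(doms, cons))  # X loses 3, Y loses 1
```

The same mechanism is what removes t from D in part (c): once B is reduced to {w}, no value of B supports t under constraint (iii), so AC-3 deletes it, whereas Forward Checking only prunes neighbors of the assigned variable.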
(b) [5] Consider a Perceptron with 3 inputs, x1, x2, x3, and one output unit that uses a linear
threshold unit (LTU) as its activation function. Assume initial weights w1 = 0.2, w2 = 0.7, w3 =
0.9, learning rate 0.2, and bias w4 = -0.7 (and the LTU itself uses a fixed threshold value of
0). Hence we have:
[Figure: Perceptron diagram with inputs x1, x2, x3 weighted by w1, w2, w3, a bias input +1 weighted by w4, and an output unit using an LTU with threshold 0.]
(i) [2] Given the inputs x1=1, x2=0, x3=1, what is the output of this Perceptron? Show
your work.
The output is 1 because (0.2)(1) + (0.7)(0) + (0.9)(1) + (-0.7)(1) = 0.4 >= 0.
(ii) [3] What are the four updated weights' values after applying the Perceptron Learning
Rule with the above input and teacher output 0? Show your work.
T = 0 and O = 1, so update the weights using wi = wi + (0.2)(T - O)xi = wi + (0.2)(0 - 1)xi. Thus, the new weights are w1 = 0.2 + (-0.2)(1) = 0.0,
w2 = 0.7 + (-0.2)(0) = 0.7, w3 = 0.9 + (-0.2)(1) = 0.7, and w4
= -0.7 + (-0.2)(1) = -0.9
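The forward pass and the Perceptron Learning Rule update above can be reproduced in a few lines. This is a sketch of the computation in this problem; the function and variable names are mine.

```python
# The fixed bias input +1 is treated as a fourth input with weight w4.
def ltu(net):
    # Linear threshold unit with threshold 0: fire (1) iff net >= 0
    return 1 if net >= 0 else 0

def forward(weights, inputs):
    return ltu(sum(w * x for w, x in zip(weights, inputs)))

def update(weights, inputs, teacher, rate=0.2):
    # Perceptron Learning Rule: w_i <- w_i + rate * (T - O) * x_i
    out = forward(weights, inputs)
    return [w + rate * (teacher - out) * x for w, x in zip(weights, inputs)]

w = [0.2, 0.7, 0.9, -0.7]   # w1, w2, w3 and bias weight w4
x = [1, 0, 1, 1]            # x1, x2, x3 and the bias input +1
print(forward(w, x))        # 1, since net = 0.4 >= 0
print(update(w, x, teacher=0))  # approximately [0.0, 0.7, 0.7, -0.9]
```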
(c) [2] Which one of the following best describes the process of learning in a multilayer, feedforward neural network that uses back-propagation learning?
(i) Activation values are propagated from the input nodes through the hidden layers to the output nodes.
(ii) Activation values are propagated from the output nodes through the hidden layers to the input nodes.
(iii) Weights on the arcs are modified based on values propagated from input nodes to output nodes.
(iv) Weights on the arcs are modified based on values propagated from output nodes to input nodes.
(v) Arcs in the network are modified, gradually shortening the path from input nodes to output nodes.
(vi) Weights on the arcs from the input nodes are compared to the weights on the arcs coming into the output nodes, and then these weights are modified to reduce the difference.
(iv)
(d) [3] Why dont multilayer, feed-forward neural networks use an LTU as the activation function
at nodes? And what do they use instead of an LTU?
Learning in multi-layer feed-forward neural networks uses the
back-propagation algorithm, which does gradient descent in
weight space. Gradient descent requires computing the
derivative of the activation function. But LTUs have a
discontinuity at the threshold value and therefore the
derivative is not defined there. So, we use a continuous
function such as the Sigmoid that is differentiable everywhere.
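The differentiability point can be illustrated numerically. This is a generic sketch (not part of the exam); it contrasts the step-shaped LTU with the sigmoid and its derivative, which is smooth everywhere.

```python
import math

def ltu(net):
    # Step function: jumps from 0 to 1 at net = 0, so its derivative
    # is undefined at the threshold and 0 everywhere else.
    return 1 if net >= 0 else 0

def sigmoid(net):
    # Smooth squashing function used in place of the LTU
    return 1.0 / (1.0 + math.exp(-net))

def sigmoid_deriv(net):
    # d/dnet sigmoid(net) = sigmoid(net) * (1 - sigmoid(net)),
    # defined for every net, which is what gradient descent needs
    s = sigmoid(net)
    return s * (1 - s)

print(sigmoid(0.0))        # 0.5
print(sigmoid_deriv(0.0))  # 0.25, the maximum slope
```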
(b) [4] What is the prior probability that a ball is blue, i.e., P(B)?
(c) [4] What is the posterior probability that a blue ball is metal, i.e., P(M | B)?
        P(A | V)   P(B | V)   P(C | V)   P(V)
False   0.4        0.1        0.6        0.8
True    0.8        0.3        0.1        0.2
(a) [4] Write an expression for computing P(A, S, H, E, C) given only information that is in the
associated CPTs for this network.
P(A) P(S) P(H) P(E | A, S, H) P(C | E, H)
(c) [3] How many numbers must be stored in total in all CPTs associated with this network
(excluding numbers that can be calculated from other numbers)?
1 + 1 + 1 + 2^3 + 2^2 = 15
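The count follows a simple rule for Boolean nodes: a node with k parents needs 2^k stored numbers (one per parent configuration; the complementary probability is derivable). A small sketch, using the parent counts implied by the factorization in (a):

```python
# Parents per node, read off the factorization
# P(A) P(S) P(H) P(E | A, S, H) P(C | E, H):
# A, S, H are roots; E has 3 parents; C has 2 parents.
parent_counts = {"A": 0, "S": 0, "H": 0, "E": 3, "C": 2}

# Each Boolean node stores 2^k numbers, where k = number of parents
total = sum(2 ** k for k in parent_counts.values())
print(total)  # 1 + 1 + 1 + 8 + 4 = 15
```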
(d) [3] C is conditionally independent of ____A and S____ given ____E and H_____
(b) [3] What classification method is used by the Eigenfaces algorithm to recognize the face in a
test image?
(b) [3] Given two arbitrary sentences α and β in PL, if α |= β then α ∧ ¬β is _______________
(one word answer)
Unsatisfiable / contradiction / inconsistent / False under all
interpretations
(c) [3] Is the sentence ((P → Q) ∧ Q) → P valid, satisfiable, or unsatisfiable? Briefly explain how
you determined your answer.
Satisfiable (but not valid), since when P=T and Q=T the sentence is
True, but when P=F and Q=T the sentence is False
(e) [5] Given the three PL sentences P ∨ Q, P ∨ ¬Q, and ¬P ∨ Q defining a KB, prove the goal
sentence (P ∧ Q) using the Resolution Refutation algorithm. Show your answer as a proof tree.
1. P ∨ Q       KB
2. P ∨ ¬Q      KB
3. ¬P ∨ Q      KB
4. ¬P ∨ ¬Q     negation of the goal sentence
5. Q           Resolution rule with 1 and 3
6. ¬Q          Resolution rule with 2 and 4
7. False       Resolution rule with 5 and 6
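The refutation can be checked mechanically. Below is a tiny propositional resolution sketch of mine (clauses as frozensets of string literals, "~P" for ¬P); it saturates the clause set and reports whether the empty clause is derivable.

```python
def negate(lit):
    # "~P" <-> "P"
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    # All resolvents of two clauses: cancel a complementary literal pair
    out = []
    for lit in c1:
        if negate(lit) in c2:
            out.append(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return out

def refutes(clauses):
    # KB plus negated goal is unsatisfiable iff the empty clause appears
    clauses = set(clauses)
    while True:
        new = set()
        for a in clauses:
            for b in clauses:
                if a == b:
                    continue
                for r in resolve(a, b):
                    if not r:
                        return True   # empty clause: contradiction found
                    new.add(r)
        if new <= clauses:
            return False              # saturated without contradiction
        clauses |= new

kb = [frozenset({"P", "Q"}),      # 1. P v Q
      frozenset({"P", "~Q"}),     # 2. P v ~Q
      frozenset({"~P", "Q"})]     # 3. ~P v Q
neg_goal = frozenset({"~P", "~Q"})  # 4. negation of (P ^ Q)
print(refutes(kb + [neg_goal]))  # True: the goal follows from the KB
```

As in the proof tree, clauses 1 and 3 resolve to Q, clauses 2 and 4 resolve to ¬Q, and Q with ¬Q yields the empty clause.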