2nd Exam Question Paper 2
This exam has i) multiple-choice questions and ii) open questions. Multiple-choice questions should
be answered in a text file (question number followed by the selected option). Open questions should
also be answered in the text file if no equations are involved; otherwise, they should be answered on
separate sheets of paper, photographed, and submitted via Fenix together with the text file.
Portuguese-speaking students should answer the questions in Portuguese.
x1 x2 y
1 −1 −5
−1 −3 5
2 −1 7
Problem 1 (2 points)
Consider the table above. We wish to predict the variable y ∈ R knowing the features x1 , x2 , using a
linear regression model ŷ = β0 + β1 x1 + β2 x2 . Please assume that β0 is known and is equal to 1.
Find the coefficients β = (β1 , β2 ) that minimize the sum of squared errors (SSE) criterion.
a) (−12/11, 1/3) b) (1, −2) c) (1/3, −12/11) d) (1, −1) e) none of the others
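With β0 fixed at 1, the SSE minimizer can be checked numerically. A minimal sketch using NumPy (an assumption of this note, not part of the exam): subtract the fixed intercept from y and solve the resulting least-squares problem over (β1, β2).

```python
import numpy as np

# Features and targets from the table above
X = np.array([[1.0, -1.0],
              [-1.0, -3.0],
              [2.0, -1.0]])
y = np.array([-5.0, 5.0, 7.0])

beta0 = 1.0  # fixed intercept, as stated in the problem

# With beta0 fixed, minimize ||(y - beta0) - X @ beta||^2 over beta
beta, *_ = np.linalg.lstsq(X, y - beta0, rcond=None)
print(beta)  # close to [1/3, -12/11]
```

The same sketch with beta0 = -1 handles the other variant of this problem.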
x1 x2 y x1 x2 y
1 2 0 −2 1 2
2 2 0 −3 1 2
3 1 1 −3 0 3
3 3 1 −1 2 3
Problem 2 (1 point)
Consider the training set defined in the tables above, where (x1 , x2 ) ∈ R2 denotes a feature vector
and y ∈ {0, 1, 2, 3} the class label.
Find the class predicted by the Nearest Neighbor (NN) classifier for an input vector (−1/2, 1):
a) much better in the training set than in the test set b) very well in the training set
c) much better in the test set than in the training set d) very well in the test set
e) none of the others
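The NN prediction itself can be sketched as follows: a minimal 1-NN with Euclidean distance over the eight training points from the two tables above (the helper name nn_predict is illustrative, not from the exam).

```python
import math

# Training points (x1, x2, y) from the two tables above
train = [(1, 2, 0), (2, 2, 0), (3, 1, 1), (3, 3, 1),
         (-2, 1, 2), (-3, 1, 2), (-3, 0, 3), (-1, 2, 3)]

def nn_predict(x1, x2):
    # Return the label of the closest training point (Euclidean distance)
    _, label = min((math.hypot(x1 - a, x2 - b), y) for a, b, y in train)
    return label

print(nn_predict(-0.5, 1))  # -> 3 (closest training point is (-1, 2))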
Problem 4 (1 point)
Regularization methods are used to
Problem 5 (1 point)
Ridge regression tends to
Problem 6 (2 points)
Consider a random variable x ∈ [0, +∞[ with conditional probability density functions
p(x|y = 0) = 1 for 0 ≤ x < 1 (0 otherwise),   p(x|y = 1) = αe^(−αx) for x ≥ 0 (0 for x < 0),
where y ∈ {0, 1} is a binary class label. Consider a classifier with decision regions R1 = [0, T [, R0 =
[T, +∞[, where T ∈ [0, 1] is a threshold.
Compute the element P11 of the confusion matrix P = (Pij ), i, j = 0, 1.
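Assuming Pij denotes the probability of deciding class i given true class j (the intended indexing convention may differ), one way to set up the computation:

```latex
P_{11} = P(x \in R_1 \mid y = 1)
       = \int_0^T \alpha e^{-\alpha x}\, dx
       = \left[-e^{-\alpha x}\right]_0^T
       = 1 - e^{-\alpha T}.
```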
2. Consider the squared loss L(y, ŷ) = (y − ŷ)2 . Compute the partial derivative of the loss L(y, ŷ)
with respect to the weight w11 for the training example
x1    x2    y
0     0     −1
0     √2    −1
1     2     +1
−√2   −√2   +1
Problem 8 (2+1+1 points)
Consider the training set defined above. We wish to train a non-linear
support vector machine (SVM) in feature space, using the nonlinear transformation
1. Find the decision hyperplane in feature space using the training set (defined in input space):
a) 2x1²y1² + x2²y2² + x1²y2² + x2²y1²   b) 2x1²y1² + x2²y2²
c) x1²y1² + x2²y2² − x1²y2² − x2²y1²   d) x1²y1² + x2²y2²   e) none of the others
Exam of Machine Learning
x1 x2 y
1 −1 −5
−1 −3 5
2 −1 7
Problem 1 (2 points)
Consider the table above. We wish to predict the variable y ∈ R knowing the features x1 , x2 , using a
linear regression model ŷ = β0 + β1 x1 + β2 x2 . Please assume that β0 is known and is equal to −1.
Find the coefficients β = (β1 , β2 ) that minimize the sum of squared errors (SSE) criterion.
a) (1/3, −12/11) b) (1, −2) c) (−12/11, 1/3) d) (1, −1) e) none of the others
x1 x2 y x1 x2 y
1 2 0 −2 1 2
2 2 0 −3 1 2
3 1 1 −3 0 3
3 3 1 −1 2 3
Problem 2 (1 point)
Consider the training set defined in the tables above, where (x1, x2) ∈ R² denotes a feature vector
and y ∈ {0, 1, 2, 3} the class label.
Find the class predicted by the Nearest Neighbor (NN) classifier for an input vector (−1/2, 1):
a) much better in the test set than in the training set b) very well in the training set
c) much better in the training set than in the test set d) very well in the test set
e) none of the others
Problem 4 (1 point)
Regularization methods are used to
Problem 5 (1 point)
Ridge regression tends to
Problem 6 (2 points)
Consider a random variable x ∈ [0, +∞[ with conditional probability density functions
p(x|y = 0) = 1 for 0 ≤ x < 1 (0 otherwise),   p(x|y = 1) = αe^(−αx) for x ≥ 0 (0 for x < 0),
where y ∈ {0, 1} is a binary class label. Consider a classifier with decision regions R1 = [0, T [, R0 =
[T, +∞[, where T ∈ [0, 1] is a threshold.
Compute the element P00 of the confusion matrix P = (Pij ), i, j = 0, 1.
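Assuming Pij denotes the probability of deciding class i given true class j (the intended indexing convention may differ), one way to set up the computation, using T ∈ [0, 1] so the uniform density vanishes beyond 1:

```latex
P_{00} = P(x \in R_0 \mid y = 0)
       = \int_T^{+\infty} p(x \mid y = 0)\, dx
       = \int_T^1 1\, dx
       = 1 - T.
```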
2. Consider the squared loss L(y, ŷ) = (y − ŷ)2 . Compute the partial derivative of the loss L(y, ŷ)
with respect to the weight w11 for the training example
x1    x2    y
0     0     −1
0     √2    −1
1     2     +1
−√2   −√2   +1
Problem 8 (2+1+1 points)
Consider the training set defined above. We wish to train a non-linear
support vector machine (SVM) in feature space, using the nonlinear transformation
1. Find the decision hyperplane in feature space using the training set (defined in input space):
a) x1²y1² + x2²y2²   b) x1²y1² + x2²y2² − x1²y2² − x2²y1²
c) 2x1²y1² + x2²y2²   d) 2x1²y1² + x2²y2² + x1²y2² + x2²y1²   e) none of the others
Exam of Machine Learning
x1 x2 y
1 −1 −5
−1 −3 5
2 −1 7
Problem 1 (2 points)
Consider the table above. We wish to predict the variable y ∈ R knowing the features x1 , x2 , using a
linear regression model ŷ = β0 + β1 x1 + β2 x2 . Please assume that β0 is known and is equal to 1.
Find the coefficients β = (β1 , β2 ) that minimize the sum of squared errors (SSE) criterion.
a) (−12/11, 1/3) b) (1, −2) c) (1, −1) d) (1/3, −12/11) e) none of the others
x1 x2 y x1 x2 y
1 2 0 −2 1 2
2 2 0 −3 1 2
3 1 1 −3 0 3
3 3 1 −1 2 3
Problem 2 (1 point)
Consider the training set defined in the tables above, where (x1, x2) ∈ R² denotes a feature vector
and y ∈ {0, 1, 2, 3} the class label.
Find the class predicted by the Nearest Neighbor (NN) classifier for an input vector (−1/2, 1):
a) very well in the training set b) much better in the test set than in the training set
c) very well in the test set d) much better in the training set than in the test set
e) none of the others
Problem 4 (1 point)
Regularization methods are used to
Problem 5 (1 point)
Ridge regression tends to
Problem 6 (2 points)
Consider a random variable x ∈ [0, +∞[ with conditional probability density functions
p(x|y = 0) = 1 for 0 ≤ x < 1 (0 otherwise),   p(x|y = 1) = αe^(−αx) for x ≥ 0 (0 for x < 0),
where y ∈ {0, 1} is a binary class label. Consider a classifier with decision regions R1 = [0, T [, R0 =
[T, +∞[, where T ∈ [0, 1] is a threshold.
Compute the element P11 of the confusion matrix P = (Pij ), i, j = 0, 1.
2. Consider the squared loss L(y, ŷ) = (y − ŷ)2 . Compute the partial derivative of the loss L(y, ŷ)
with respect to the weight w11 for the training example
x1    x2    y
0     0     −1
0     √2    −1
1     2     +1
−√2   −√2   +1
Problem 8 (2+1+1 points)
Consider the training set defined above. We wish to train a non-linear
support vector machine (SVM) in feature space, using the nonlinear transformation
1. Find the decision hyperplane in feature space using the training set (defined in input space):
a) x1²y1² + x2²y2² − x1²y2² − x2²y1²   b) 2x1²y1² + x2²y2²
c) 2x1²y1² + x2²y2² + x1²y2² + x2²y1²   d) x1²y1² + x2²y2²   e) none of the others
Exam of Machine Learning
x1 x2 y
1 −1 −5
−1 −3 5
2 −1 7
Problem 1 (2 points)
Consider the table above. We wish to predict the variable y ∈ R knowing the features x1 , x2 , using a
linear regression model ŷ = β0 + β1 x1 + β2 x2 . Please assume that β0 is known and is equal to −1.
Find the coefficients β = (β1 , β2 ) that minimize the sum of squared errors (SSE) criterion.
a) (1, −2) b) (1/3, −12/11) c) (−12/11, 1/3) d) (1, −1) e) none of the others
x1 x2 y x1 x2 y
1 2 0 −2 1 2
2 2 0 −3 1 2
3 1 1 −3 0 3
3 3 1 −1 2 3
Problem 2 (1 point)
Consider the training set defined in the tables above, where (x1, x2) ∈ R² denotes a feature vector
and y ∈ {0, 1, 2, 3} the class label.
Find the class predicted by the Nearest Neighbor (NN) classifier for an input vector (−1/2, 1):
a) very well in the test set b) much better in the training set than in the test set
c) very well in the training set d) much better in the test set than in the training set
e) none of the others
Problem 4 (1 point)
Regularization methods are used to
Problem 5 (1 point)
Ridge regression tends to
Problem 6 (2 points)
Consider a random variable x ∈ [0, +∞[ with conditional probability density functions
p(x|y = 0) = 1 for 0 ≤ x < 1 (0 otherwise),   p(x|y = 1) = αe^(−αx) for x ≥ 0 (0 for x < 0),
where y ∈ {0, 1} is a binary class label. Consider a classifier with decision regions R1 = [0, T [, R0 =
[T, +∞[, where T ∈ [0, 1] is a threshold.
Compute the element P00 of the confusion matrix P = (Pij ), i, j = 0, 1.
2. Consider the squared loss L(y, ŷ) = (y − ŷ)2 . Compute the partial derivative of the loss L(y, ŷ)
with respect to the weight w11 for the training example
x1    x2    y
0     0     −1
0     √2    −1
1     2     +1
−√2   −√2   +1
Problem 8 (2+1+1 points)
Consider the training set defined above. We wish to train a non-linear
support vector machine (SVM) in feature space, using the nonlinear transformation
1. Find the decision hyperplane in feature space using the training set (defined in input space):
a) x1²y1² + x2²y2²   b) 2x1²y1² + x2²y2² + x1²y2² + x2²y1²
c) 2x1²y1² + x2²y2²   d) x1²y1² + x2²y2² − x1²y2² − x2²y1²   e) none of the others