PART A
Answer all questions, each carries 4 marks.
1 Distinguish between classification and regression with an example. (4)
2 Define hypothesis space and version space for a binary classification problem. (4)
Determine the hypothesis space H and version space with respect to the
following data D.
x 2 11 17 0 1 5 7 13 20
Class 0 1 1 0 0 0 0 1 1
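For illustration only, a minimal sketch of how the version space for D could be enumerated. The paper does not state the hypothesis space explicitly, so the sketch assumes H is the set of single-threshold classifiers h_t(x) = 1 if x >= t, else 0.

```python
# A minimal sketch for Q2, assuming H = single-threshold classifiers
# h_t(x) = 1 if x >= t else 0 (an assumption; the paper does not name H).

D = [(2, 0), (11, 1), (17, 1), (0, 0), (1, 0), (5, 0), (7, 0), (13, 1), (20, 1)]

def h(t, x):
    """Threshold hypothesis: predict class 1 when x >= t."""
    return 1 if x >= t else 0

# Version space = all hypotheses in H consistent with every training example.
candidate_thresholds = range(0, 22)          # integer thresholds spanning the data range
version_space = [t for t in candidate_thresholds
                 if all(h(t, x) == y for x, y in D)]

print(version_space)   # thresholds above 7 and at most 11, i.e. [8, 9, 10, 11]
```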
3 State Occam's razor principle. Illustrate its necessity in learning a hypothesis. (4)
4 Define the following terms for a classification problem: (a) Sensitivity (b) Specificity (c) Precision (d) Accuracy. (4)
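A minimal sketch of the four measures written in terms of confusion-matrix counts; the counts below are illustrative placeholders, not taken from the paper.

```python
# Illustrative placeholder counts (not from the paper).
TP, FN, FP, TN = 40, 10, 5, 45

sensitivity = TP / (TP + FN)          # recall / true positive rate
specificity = TN / (TN + FP)          # true negative rate
precision   = TP / (TP + FP)          # positive predictive value
accuracy    = (TP + TN) / (TP + TN + FP + FN)

print(sensitivity, specificity, precision, accuracy)  # 0.8, 0.9, 0.888..., 0.85
```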
5 What is meant by k-fold cross validation? Given a data set with 1200 instances, (4)
how is k-fold cross validation done with k = 1200?
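A minimal sketch of the fold structure when k equals the number of instances: with 1200 instances and k = 1200, every fold holds exactly one instance, so k-fold cross validation degenerates to leave-one-out cross validation.

```python
# Leave-one-out fold structure for n = 1200 instances (Q5).
n = 1200
folds = [([j for j in range(n) if j != i], [i]) for i in range(n)]

# Each of the 1200 iterations trains on 1199 instances and tests on the 1 held out.
assert len(folds) == n
assert all(len(test) == 1 and len(train) == n - 1 for train, test in folds)
```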
6 Calculate the output of the following neuron Y if the activation function is (4)
(a) Binary sigmoid
(b) Bipolar sigmoid
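The figure defining neuron Y is not reproduced in this text, so the inputs and weights below are placeholders; the sketch only shows the two activation functions applied to the net input.

```python
# A minimal sketch for Q6. Inputs and weights are placeholders standing in for
# the values in the missing figure.
import math

inputs  = [0.8, 0.6, 0.4]      # placeholder x_i (not from the paper)
weights = [0.1, 0.3, -0.2]     # placeholder w_i (not from the paper)

net = sum(w * x for w, x in zip(weights, inputs))   # net input y_in = sum(w_i * x_i)

binary_sigmoid  = 1 / (1 + math.exp(-net))                    # output in (0, 1)
bipolar_sigmoid = (1 - math.exp(-net)) / (1 + math.exp(-net)) # output in (-1, 1)

print(net, binary_sigmoid, bipolar_sigmoid)
```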
12 a) Discuss the necessity of dimensionality reduction in machine learning. (3)
b) Illustrate the idea of PCA for two-dimensional data using suitable diagrams. (6)
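A minimal PCA sketch for two-dimensional data, using an illustrative data matrix (not from the paper): centre the data, eigen-decompose the covariance matrix, and project onto the leading eigenvector.

```python
# A minimal PCA sketch for Q12(b); the data matrix is an illustrative placeholder.
import numpy as np

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2],
              [3.1, 3.0], [2.3, 2.7], [2.0, 1.6], [1.0, 1.1]])

Xc = X - X.mean(axis=0)                 # 1. centre the data
cov = np.cov(Xc, rowvar=False)          # 2. 2x2 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)  # 3. eigen-decomposition (ascending eigenvalues)
pc1 = eigvecs[:, -1]                    # 4. principal axis = largest-eigenvalue eigenvector
projected = Xc @ pc1                    # 5. 1-D projection onto the first principal component

print(eigvals, pc1, projected)
```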
13 a) Let X = ℝ² and let C be the set of all possible rectangles in the two-dimensional (6)
plane which are axis-aligned (not rotated). Show that this concept class is PAC
learnable.
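A minimal sketch of the learner used in the standard PAC argument for axis-aligned rectangles: return the tightest rectangle enclosing the positive examples. The labelled sample below is an illustrative placeholder.

```python
# Tightest-fitting axis-aligned rectangle learner (Q13(a) sketch).
def tightest_rectangle(sample):
    """sample: list of ((x, y), label) pairs; returns (xmin, xmax, ymin, ymax) or None."""
    pos = [p for p, label in sample if label == 1]
    if not pos:
        return None                      # no positives: predict negative everywhere
    xs, ys = zip(*pos)
    return (min(xs), max(xs), min(ys), max(ys))

sample = [((1, 1), 1), ((2, 3), 1), ((4, 2), 1), ((0, 5), 0), ((6, 1), 0)]
print(tightest_rectangle(sample))        # (1, 4, 1, 3)
```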
b) Describe the applications of machine learning in any three different domains. (3)
PART C
Answer any two full questions, each carries 9 marks.
14 The following table consists of training data from an employee database. For a (9)
given row entry, count represents the number of data tuples having the values
for department, status, age, and salary given in that row. Let status be the class
label attribute. Given a data tuple having the values “systems”, “31..35”, and
“46–50K” for the attributes department, age, and salary, respectively, what
would a Naive Bayesian classification of the status for the tuple be?
15 With the following data set, generate a decision tree and predict the class label (9)
for a data point with values <Female, 2, standard, high>.
16 a) Point out the benefits of pruning in decision tree induction. Explain different (5)
approaches to tree pruning.
b) Compute the ML estimate for the parameter p in the binomial distribution whose (4)
probability function is
f(x) = nCx p^x (1 − p)^(n−x),  x = 0, 1, 2, ..., n
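A short sketch of the standard derivation, assuming a single observation x drawn from Binomial(n, p):

```latex
% Sketch: ML estimate of p from a single observation x ~ Binomial(n, p).
\begin{align*}
L(p) &= \binom{n}{x} p^{x} (1-p)^{n-x} \\
\log L(p) &= \log\binom{n}{x} + x \log p + (n-x)\log(1-p) \\
\frac{d}{dp} \log L(p) &= \frac{x}{p} - \frac{n-x}{1-p} = 0
  \quad\Rightarrow\quad \hat{p} = \frac{x}{n}
\end{align*}
```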
PART D
Answer any two full questions, each carries 12 marks.
17 a) Explain the basic problems associated with hidden Markov models. (6)
b) Describe the significance of soft margin hyperplane and optimal separating (6)
hyperplane and explain how they are computed.
18 a) Suppose that the data mining task is to cluster the following seven points (with (6)
(x, y) representing location) into two clusters: A1(1,1), A2(1.5,2), A3(3,4),
A4(5,7), A5(3.5,5), A6(4.5,5), A7(3.5,4.5). The distance function is city block
distance. Suppose initially we assign A1 and A5 as the centres of the two clusters
respectively. Use the k-means algorithm to find the two clusters and their
centres after two rounds of execution.
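A minimal k-means sketch using the seven given points, city block (Manhattan) distance, and A1, A5 as the initial centres, run for the two rounds the question asks for.

```python
# k-means sketch for Q18(a): 2 clusters, city block distance, initial centres A1 and A5.
points = {'A1': (1, 1), 'A2': (1.5, 2), 'A3': (3, 4), 'A4': (5, 7),
          'A5': (3.5, 5), 'A6': (4.5, 5), 'A7': (3.5, 4.5)}

def cityblock(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

centres = [points['A1'], points['A5']]
for _ in range(2):                                     # two rounds of assign + update
    clusters = [[], []]
    for name, p in points.items():
        nearest = min(range(2), key=lambda i: cityblock(p, centres[i]))
        clusters[nearest].append(name)
    centres = [(sum(points[n][0] for n in c) / len(c),  # new centre = mean of members
                sum(points[n][1] for n in c) / len(c)) for c in clusters]

# After two rounds: clusters {A1, A2} and {A3, A4, A5, A6, A7},
# with centres (1.25, 1.5) and (3.9, 5.1).
print(clusters, centres)
```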
b) Give the significance of the kernel trick in the context of support vector machines. (6)
Describe different types of standard kernel functions.
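A minimal sketch of the common standard kernel functions (linear, polynomial, RBF, sigmoid); the hyperparameter values are illustrative placeholders.

```python
# Standard kernel functions written out (Q18(b) sketch); hyperparameters are placeholders.
import numpy as np

def linear(x, z):               return x @ z
def polynomial(x, z, c=1, d=3): return (x @ z + c) ** d
def rbf(x, z, gamma=0.5):       return np.exp(-gamma * np.sum((x - z) ** 2))
def sigmoid(x, z, a=0.5, b=0):  return np.tanh(a * (x @ z) + b)

x, z = np.array([1.0, 2.0]), np.array([0.5, -1.0])
print(linear(x, z), polynomial(x, z), rbf(x, z), sigmoid(x, z))
```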
19 a) Describe any one technique for density-based clustering with necessary (6)
diagrams.
b) Given the following distance matrix, construct the dendrogram using the single (6)
linkage, complete linkage and average linkage clustering algorithms.
Item A B C D E
A 0 2 3 3 4
B 2 0 3 5 4
C 3 3 0 2 6
D 3 5 2 0 4
E 4 4 6 4 0
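A minimal sketch using SciPy's hierarchical clustering on the given distance matrix; switching `method` to 'complete' or 'average' gives the other two linkages.

```python
# Hierarchical clustering of the given 5x5 distance matrix (Q19(b) sketch).
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial.distance import squareform
from scipy.cluster.hierarchy import linkage, dendrogram

labels = ['A', 'B', 'C', 'D', 'E']
D = np.array([[0, 2, 3, 3, 4],
              [2, 0, 3, 5, 4],
              [3, 3, 0, 2, 6],
              [3, 5, 2, 0, 4],
              [4, 4, 6, 4, 0]])

Z = linkage(squareform(D), method='single')   # also try 'complete' and 'average'
dendrogram(Z, labels=labels)
plt.show()
```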
****