QBank All Mod
Question Bank
COURSE NAME: Machine Learning COURSE CODE: BCS602
SEMESTER: 6th SECTION: A, B & C
Module 1
1. What is the need for Machine Learning? Explain the knowledge pyramid with a neat
diagram.
2. Define Machine Learning. What is the difference between conventional
programming and Machine Learning?
3. What are the different types of Machine Learning? Explain in brief.
4. What are the various steps involved in the Machine Learning / Data Mining process?
5. What is Data? Define Big data. What are the different elements of Big Data?
6. Explain the different types of data with respect to Big Data. Also describe their storage
and representation.
7. What is Data Analytics? Briefly describe its types.
8. Explain in brief the Big Data Analytics Framework.
9. Explain the Big Data Analytics data processing cycle in detail.
10. What is meant by ‘Dirty Data’? Briefly describe how dirty data can be processed and
cleaned.
11. How do we classify data based on different categories?
Module 2
1. What is the difference between Bivariate and Multivariate Data Analysis? What are
the various visualization techniques used in each case?
2. Discuss the importance of Multivariate Statistics in Machine Learning.
3. What are correlation and covariance?
4. Explain the various types of probability distributions, with relevant formulae.
5. Briefly explain density estimation.
6. Explain the role of Feature Engineering and Dimensionality Reduction techniques in
Machine Learning. Briefly describe some of the algorithms falling under this
category.
7. Explain how a learning system is designed, with an example.
8. Explain the concept of learning in Machine Learning with suitable examples.
9. Explain the following, with examples wherever necessary:
i. Hypothesis
ii. Hypothesis space/ Hypotheses
iii. Heuristic space search
iv. Version space
v. Generalization and Specialization
10. Write the Find-S algorithm and the Candidate Elimination algorithm. (Learn to solve
problems.)
11. What is Modelling in Machine Learning? Mention the steps involved in ML.
12. What is the importance of model selection and model evaluation?
13. Discuss various resampling methods in detail with necessary diagrams and examples.
14. Define all the metrics used in evaluating model performance, with examples.
15. Explain visual classifier performance and scoring methods.
NOTE: Refer to all the problems worked out in the classroom as well as those given in the assignment.
Module 3
1. Explain the following algorithms:
a. K-NN
b. Weighted K-NN
c. Nearest centroid classifier
d. Locally weighted regression
2. Consider the same training dataset given in the previous table/example. Use K-NN
and Weighted K-NN to determine the class for the given test instance: CGPA = 7.6,
Assessment = 60 and Project Submitted = 8.
3. Consider the sample data shown in Table with two features x and y. The target
classes are ‘A’ or ‘B’. Predict the class using Nearest Centroid Classifier.
X Y Class
3 1 A
5 1 A
4 2 A
7 6 B
6 7 B
8 5 B
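Hint for question 3: the nearest centroid procedure can be sketched as below. This is a minimal illustration rather than a model answer. It assumes Euclidean distance, and the test point (6, 5) is a hypothetical instance chosen for the sketch, since the question does not state one.

```python
import math

# Training data copied from the table above: (x, y, class)
samples = [(3, 1, "A"), (5, 1, "A"), (4, 2, "A"),
           (7, 6, "B"), (6, 7, "B"), (8, 5, "B")]

def class_centroids(rows):
    """Return {label: (mean_x, mean_y)} for each class."""
    sums = {}
    for x, y, label in rows:
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}

def predict(point, centroids):
    """Assign the label of the closest centroid (Euclidean distance assumed)."""
    return min(centroids, key=lambda label: math.dist(point, centroids[label]))

centroids = class_centroids(samples)
query = (6, 5)  # hypothetical test instance, not given in the question
predicted = predict(query, centroids)
print(centroids, predicted)
```

With these six rows the class B centroid is (7, 6) and the class A centroid is (4, 4/3), so the hypothetical point falls closer to B.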
4. A COVID care centre decided to develop a case-based reasoning system to predict whether a
person will test positive or negative based on the symptoms. The table below shows the
number of possible symptoms and the results of the previous cases. The training dataset
contains the instances shown in Table 4.13 below.
Table 4.13: Sample Set of Instances
Dry Sore Loss of Taste or Shortness of Chest
S.No. Fever Tiredness Diarrhea Headache Result
Cough Throat Smell Breath Pain
1 Yes Yes Yes Yes Yes Yes Yes Yes Yes Positive
2 Yes No Yes No No Yes No No No Negative
3 No No No No No No No No No Negative
4 Yes Yes No No No No No No Yes Negative
5 Yes Yes Yes No No No No Yes Yes Positive
6 Yes Yes Yes No No Yes No No No Positive
7 Yes Yes Yes No No No No No No Positive
8 Yes Yes Yes No No No No No No Positive
9 Yes Yes Yes No No No No No No Positive
10 No No No No No No No No No Negative
7. Using multiple regression, fit a line for the dataset shown in Table 5.13.
Here, z is the equity (dependent variable), x is the net sales (independent variable), and y is the asset
(independent variable). All data are in million dollars.
z (Equity) x (Net Sales) y (Asset)
4 12 8
6 18 12
7 22 16
8 28 36
11 35 42
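Hint for question 7: one way to check a hand calculation is to solve the normal equations for z = a0 + a1*x + a2*y directly. The sketch below uses only the standard library and the closed form for two predictors. The helper names (centered_sum, fit_two_predictors) are illustrative and not taken from any textbook or library.

```python
# Data copied from Table 5.13 (million dollars)
z = [4, 6, 7, 8, 11]       # equity (dependent variable)
x = [12, 18, 22, 28, 35]   # net sales (independent variable)
y = [8, 12, 16, 36, 42]    # asset (independent variable)

def mean(values):
    return sum(values) / len(values)

def centered_sum(u, v):
    """Sum of (u_i - mean(u)) * (v_i - mean(v))."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

def fit_two_predictors(x, y, z):
    """Least-squares fit of z = a0 + a1*x + a2*y via the normal equations."""
    sxx, syy, sxy = centered_sum(x, x), centered_sum(y, y), centered_sum(x, y)
    sxz, syz = centered_sum(x, z), centered_sum(y, z)
    det = sxx * syy - sxy ** 2          # zero only if x and y are collinear
    a1 = (sxz * syy - sxy * syz) / det
    a2 = (sxx * syz - sxy * sxz) / det
    a0 = mean(z) - a1 * mean(x) - a2 * mean(y)
    return a0, a1, a2

a0, a1, a2 = fit_two_predictors(x, y, z)
print(f"z = {a0:.4f} + {a1:.4f} x + {a2:.4f} y")
```

A quick sanity check on any fitted solution is that the residuals sum to zero and are uncorrelated with both x and y.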
8. Explain the key differences between Linear and Multiple Linear Regression.
9. What are the advantages and disadvantages of Polynomial Regression?
10. How does Logistic Regression differ from Linear Regression?
11. Briefly explain the structure of a decision tree. Also explain the two major procedures
involved with Decision trees.
12. Write the general algorithm for a decision tree. Discuss the stopping criteria.
13. What are the steps involved in Decision Tree Induction Algorithms?
14. Explain the concept of Entropy and Information Gain in Decision Trees.
15. What are the common applications of Decision Tree Learning in real-world problems?
16. Explain the following tree construction methods. Also write the procedure
(algorithm) to construct each of the following trees.
a. ID3
b. C4.5
c. CART
d. Regression trees
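Hint for question 14 of this module: entropy and information gain can be computed as in the sketch below. The four-row dataset and its attribute names ("windy", "play") are hypothetical, invented only to illustrate the calculation. The functions are an illustrative sketch, not the procedure of any specific textbook.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

# Hypothetical toy data, invented for illustration only
rows = [
    {"windy": "yes", "play": "no"},
    {"windy": "yes", "play": "no"},
    {"windy": "no", "play": "yes"},
    {"windy": "no", "play": "yes"},
]
gain = information_gain(rows, "windy", "play")
print(gain)
```

In this toy case the split separates the classes perfectly, so the gain equals the full entropy of the target. ID3 picks the attribute with the largest such gain at each node.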
Module 4
1. Explain the three types of probabilities on which the Naïve Bayes Model relies.
2. Explain the following:
i. Naïve Bayes classification models
ii. Maximum A Posteriori (MAP) Hypothesis
iii. Maximum Likelihood (ML) Hypothesis
iv. Bayes Optimal Classifier with an example.
3. Explain Naïve Bayes algorithm.
4. What is the zero probability error? How can it be overcome?
5. Explain the basic concept of the Human Nervous System.
6. What is an artificial neuron? Compare the artificial neuron with the human
nervous system.
7. Explain a Simple Model of an Artificial Neuron.
8. Explain the structure of an Artificial Neural Network.
9. What is an activation function? What are the different types of activation functions
available in CNN?
10. Explain in detail the concept of perceptron learning.
11. Explain the different types of artificial neural networks.
12. Mention the advantages, limitations, challenges and applications of ANN.
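Hint for question 4 of this module: a common remedy for the zero probability error is additive (Laplace) smoothing. The sketch below assumes that approach. The counts are hypothetical and the function name smoothed_likelihood is illustrative.

```python
def smoothed_likelihood(count, class_total, n_values, alpha=1.0):
    """Estimate P(value | class) with additive (Laplace) smoothing.

    count       : training rows of this class having the attribute value
    class_total : total training rows of this class
    n_values    : number of distinct values the attribute can take
    alpha       : smoothing constant (1.0 gives classic Laplace smoothing)
    """
    return (count + alpha) / (class_total + alpha * n_values)

# Hypothetical counts: the value never occurs among 5 rows of the class,
# and the attribute can take 3 distinct values.
p_unsmoothed = 0 / 5                        # 0.0, wipes out the whole product
p_smoothed = smoothed_likelihood(0, 5, 3)   # (0 + 1) / (5 + 3)
print(p_unsmoothed, p_smoothed)
```

Because the same constant is added to every value's count, the smoothed estimates for one attribute within one class still sum to 1.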
Module 5
1. What is clustering? How does it group unlabeled data into meaningful clusters using
centroids? How does it differ from classification?
2. Explain the challenges, advantages, disadvantages and applications of clustering.
3. What is a proximity measure? Explain the properties of distance measures.
4. What are the different methods to measure the distance between
i. qualitative variables
ii. Binary Attributes
iii. Categorical Variables
iv. Ordinal Variables
v. Vector Type Distance Measures
Explain with examples
5. Explain the following with relevant equations, steps and examples:
i. Agglomerative clustering
ii. Single Linkage or MIN Algorithm
iii. Complete Linkage or MAX or Clique
iv. Average Linkage
v. Mean-Shift Clustering Algorithm
vi. k-means Algorithm
6. Explain the concepts and algorithms that come under:
i. Density based approaches
ii. Grid Based Approaches
7. What are the differences between Reinforcement Learning and Supervised Learning, and
between Reinforcement Learning and Unsupervised Learning?
8. Explain the components of RL with the help of a diagram.
9. Explain MDP with an example.
10. Explain the multi-armed bandit problem and reinforcement problem types.
11. How are Reinforcement Agent types classified? Explain.
12. What are the algorithms available for solving Reinforcement Problems using
Conventional Methods? Explain.
13. What are model-free methods? What are the various formulation techniques used?
14. Explain Monte-Carlo Methods.
15. What is Temporal Difference Learning? How does it differ from Monte-Carlo Methods?
16. Explain and write the algorithm for:
i. Q-LEARNING
ii. SARSA Learning
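Hint for question 16: the two algorithms differ only in the target used by the temporal difference update. The sketch below shows a single update step for each, assuming a tabular Q function stored in a dictionary. The state names, action names, and the values of alpha and gamma are hypothetical.

```python
def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Off-policy TD update: bootstrap from the best action in next_state."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)

def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9):
    """On-policy TD update: bootstrap from the action actually taken next."""
    next_value = q.get((next_state, next_action), 0.0)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * next_value - old)

# Hypothetical two-action example: same transition, different targets
actions = ["left", "right"]
start = {("s1", "left"): 1.0, ("s1", "right"): 2.0}

q_off = dict(start)
q_learning_update(q_off, "s0", "right", 1.0, "s1", actions)

q_on = dict(start)
sarsa_update(q_on, "s0", "right", 1.0, "s1", "left")

print(q_off[("s0", "right")], q_on[("s0", "right")])
```

Q-learning uses the maximum next-state value (2.0 here) whatever the agent does next, while SARSA uses the value of the action the policy actually chose (1.0 for "left"), so the two updated entries differ.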