
BIRLA INSTITUTE OF TECHNOLOGY & SCIENCE, PILANI

WORK INTEGRATED LEARNING PROGRAMMES


COURSE HANDOUT

Part A: Content Design


Course Title Classification
Course No(s) PCAM ZC311
Credit Units 3
Course Author Dr. N.L.Bhanu Murthy
Lead Instructor Dr. Chetana Gavankar
Version No 3.0
(Modified by Dr. Chetana Gavankar)
Date 12/09/2019

Course Description
Classification is a supervised learning task in which the target attribute takes discrete values.
This course emphasizes three types of techniques for solving classification problems – discriminant function,
generative and probabilistic discriminative approaches. It lays a strong foundation in the
algorithmic perspective of popular classification algorithms – k-NN, Naïve Bayes, Decision Tree, Logistic
Regression and SVM. The implementation details of these models, along with the tuning of their parameters, will be
illustrated. Ensemble methods – bagging, boosting, Random Forest and eXtreme Gradient Boosting – will
also be taught. The interpretability/explicability of the models will also be discussed.

Course Objectives
No Objective

CO1 Provide deeper understanding of three types of techniques to solve classification problems

CO2 Provide comprehensive algorithmic perspective of popular classification algorithms

CO3 Provide hands-on experience in solving real-life classification problems

CO4 Provide the skill to interpret the predicted model

CO5 Provide the competence to build ensemble classifiers using well known techniques

Text Book(s)
No Author(s), Title, Edition, Publishing House
T1 Christopher Bishop: Pattern Recognition and Machine Learning, Springer International Edition
T2 Tom M. Mitchell: Machine Learning, The McGraw-Hill Companies, Inc.

Reference Book(s) & other resources


No Author(s), Title, Edition, Publishing House
R1 Pang-Ning Tan, Michael Steinbach, Anuj Karpatne, Vipin Kumar: Introduction to Data
Mining, 2005

Content Structure

No Title of the Module Reference

M1 Overview of the Classification Module Class Notes


1. Introduction to Classification T1 – Ch. 4
2. Types of classification algorithms - Discriminant Functions,
Probabilistic Generative models and Probabilistic
Discriminative models, Tree based models
3. Classification Algorithms covered in the course and type of
these algorithms
4. Applications of classification and case study of the course
M2 Nearest-neighbour Methods T2 – Ch. 8
1. kNN Classifier
2. Measures of prediction accuracies of classifiers – precision,
recall, AUC of ROC etc.
3. Finding optimal k
4. Python Implementation of kNN
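The kNN classifier in this module can be sketched in a few lines of plain Python: store the training points, sort by distance to the query, and take a majority vote among the k nearest. The function names and toy data below are illustrative, not from the course materials:

```python
from collections import Counter
import math

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    dists = sorted(
        (math.dist(x, p), label) for p, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy 2-D data: class 0 near the origin, class 1 near (5, 5)
train_X = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
train_y = [0, 0, 0, 1, 1, 1]
print(knn_predict(train_X, train_y, (0.5, 0.5), k=3))  # → 0
print(knn_predict(train_X, train_y, (5.5, 5.5), k=3))  # → 1
```

Finding the optimal k (item 3 above) then amounts to repeating this prediction over a validation set for several values of k and keeping the one with the best precision/recall trade-off.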
M3 Naïve Bayes Classifier T2 – Ch. 6
1. Probability Foundations – Discrete & Continuous Random
Variables, Conditional Independence, Bayes Theorem (1)
2. Probability Foundations – Discrete & Continuous Random
Variables, Conditional Independence, Bayes Theorem (2)
3. Naïve Bayes Classifier – Derivation
4. An illustrative example
5. Python implementation of Naïve Bayes Classifier
6. Naïve Bayes Classifier is a generative model
7. Advantages of Naïve Bayes Classifier and when to use
Naïve Bayes Classifier?
8. Interpretability of Naïve Bayes Classifier
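The derivation in this module (Bayes theorem plus conditional independence of features given the class) can be sketched directly in Python for categorical features. This is a minimal illustration with a crude Laplace-smoothing scheme and toy weather data, not the course's reference implementation:

```python
import math
from collections import Counter, defaultdict

def train_nb(X, y):
    """Count class priors and per-(feature, class) value frequencies."""
    priors = Counter(y)
    cond = defaultdict(Counter)          # (feature index, class) -> value counts
    for row, c in zip(X, y):
        for i, v in enumerate(row):
            cond[(i, c)][v] += 1
    return priors, cond, len(y)

def predict_nb(model, row, alpha=1.0):
    """argmax_c of log P(c) + sum_i log P(x_i | c), with Laplace smoothing."""
    priors, cond, n = model
    best, best_score = None, float("-inf")
    for c, nc in priors.items():
        score = math.log(nc / n)
        for i, v in enumerate(row):
            counts = cond[(i, c)]
            vocab = len(counts) + 1      # crude per-class vocabulary estimate
            score += math.log((counts[v] + alpha) / (nc + alpha * vocab))
        if score > best_score:
            best, best_score = c, score
    return best

# Toy weather data (illustrative)
X = [("sunny", "hot"), ("sunny", "cool"), ("rain", "cool"), ("rain", "hot")]
y = ["no", "yes", "yes", "no"]
model = train_nb(X, y)
print(predict_nb(model, ("sunny", "cool")))  # → yes
```

Because the model stores P(c) and P(x_i | c) explicitly, it can also generate feature values for a given class, which is why Naïve Bayes is a generative model (item 6 above).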
M4 Logistic Regression T1 – 4.3.
1. Significance of Sigmoid function and finding its derivative
2. Statistics Foundations – Maximum likelihood estimation
3. Cross entropy error function for logistic regression and its
optimal solution
4. Logistic Regression is probabilistic discriminative model
and an illustrative example
5. Implementation of logistic Regression using Python
6. Decision boundary of logistic regression
7. Overfitting of logistic regression and counter measures
8. Interpretability of logistic regression
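The sigmoid function, the cross-entropy error, and its gradient (items 1–3 above) can be exercised in a short numpy sketch that fits a one-feature logistic regression by gradient descent. The learning rate, epoch count, and toy data are illustrative choices, not prescribed by the course:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Gradient descent on the cross-entropy error; returns weights (bias last)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        p = sigmoid(Xb @ w)
        w -= lr * Xb.T @ (p - y) / len(y)   # gradient of mean cross-entropy
    return w

# Linearly separable toy data: boundary near x = 1.5
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])
w = fit_logistic(X, y)
probs = sigmoid(np.hstack([X, np.ones((4, 1))]) @ w)
print((probs > 0.5).astype(int))  # → [0 0 1 1]
```

The decision boundary (item 6) is the set of points where the sigmoid output equals 0.5, i.e. where w·x + b = 0; on perfectly separable data like this the weights grow without bound unless regularized, which is the overfitting issue of item 7.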
M5 Decision Tree T2 – Ch. 3
1. Decision Tree Representation
2. Entropy and Information Gain for an attribute
3. Search in Hypothesis space, ID3 Algorithm for decision
tree learning
4. Implementation of Decision Tree using Python
5. Prefer short hypothesis to longer ones, Occam’s razor
6. Overfitting in Decision Tree
7. Reduced Error Pruning and Rule post pruning
8. Alternative measures for selecting attributes
9. Interpretability of Decision Tree
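The entropy and information-gain measures at the heart of ID3 (items 2–3 above) can be computed directly; this toy example, with illustrative data, shows a perfectly predictive attribute scoring gain 1.0 and an uninformative one scoring 0.0:

```python
import math
from collections import Counter

def entropy(labels):
    """H(S) = -sum over classes of p * log2(p)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr_index):
    """Gain(S, A) = H(S) - sum over values v of |S_v|/|S| * H(S_v)."""
    total, n = entropy(labels), len(labels)
    by_value = {}
    for row, lab in zip(rows, labels):
        by_value.setdefault(row[attr_index], []).append(lab)
    return total - sum(len(sub) / n * entropy(sub) for sub in by_value.values())

# Toy data: attribute 0 perfectly predicts the label, attribute 1 does not
rows = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
labels = ["+", "+", "-", "-"]
print(information_gain(rows, labels, 0))  # → 1.0
print(information_gain(rows, labels, 1))  # → 0.0
```

ID3 greedily splits on the attribute with the highest gain at each node; the alternative measures of item 8 (e.g. gain ratio) replace `information_gain` in the same greedy loop.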
M6 Optimization Foundations for Support Vector Machines Class Notes
1. Constrained and Unconstrained Optimization
2. Primal and Dual of an optimization problem
3. Quadratic Programming
4. KKT conditions
5. Lagrange Multiplier
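The ideas above come together concretely in an equality-constrained quadratic program, where the KKT conditions (stationarity plus primal feasibility) reduce to a single linear system. This is a minimal numpy sketch with an illustrative example problem, not a general QP solver:

```python
import numpy as np

def solve_eq_qp(Q, c, A, b):
    """Solve min (1/2) x'Qx - c'x  s.t.  Ax = b via the KKT linear system.
    Stationarity: Qx - c + A'lam = 0; primal feasibility: Ax = b."""
    n, m = Q.shape[0], A.shape[0]
    K = np.block([[Q, A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([c, b])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:]        # primal x, Lagrange multipliers lam

# Example: min x1^2 + x2^2  s.t.  x1 + x2 = 1  →  x = (0.5, 0.5)
Q = 2 * np.eye(2)                  # Hessian of x1^2 + x2^2
c = np.zeros(2)
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
x, lam = solve_eq_qp(Q, c, A, b)
print(x)  # → [0.5 0.5]
```

With inequality constraints (the SVM case in the next modules) the KKT conditions add dual feasibility and complementary slackness, so the problem no longer reduces to one linear solve; that is where the primal/dual machinery of this module is needed.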
M7 Support Vector Machines T1 – 6.1, 6.2 and 7.1
1. Understanding the spirit and significance of the maximum
margin classifier
2. Posing an optimization problem for SVM in non-
overlapping class scenario
3. Converting the constrained optimization problem into an
unconstrained one using Lagrange multipliers
4. Dual of the optimization problem
5. Appreciation of sparse kernel machine and support vectors
in the solution of the optimization problem
6. Implementation of SVM in python
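One simple route from the optimization view above to working code is subgradient descent on the (soft-margin) hinge-loss form of the SVM objective. This is a hedged sketch, not the dual/QP solution the module derives: the toy data, step size, and epoch count are illustrative, and labels are assumed to be in {-1, +1}:

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=3000):
    """Subgradient descent on lam/2 * ||w||^2 + mean(max(0, 1 - y(w.x + b)))."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1                        # points inside the margin
        grad_w = lam * w - (y[viol] @ X[viol]) / len(y)
        grad_b = -y[viol].sum() / len(y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Separable 1-D toy data: boundary near x = 2
X = np.array([[0.0], [1.0], [3.0], [4.0]])
y = np.array([-1, -1, 1, 1])
w, b = train_linear_svm(X, y)
print(np.sign(X @ w + b))
```

At the optimum only the margin-defining points (here x = 1 and x = 3) have active constraints; these are the support vectors of item 5, and in the dual solution all other points get zero Lagrange multipliers.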
M8 Support Vector Machines in overlapping class distributions &
Kernels T1 – 6.1, 6.2 and 7.1
1. Issues of overlapping class distribution for SVM
2. Posing an optimization problem for SVM in overlapping
class scenario
3. Solving the optimization problem using Lagrange
multipliers, dual representations
4. Kernel Trick and Mercer’s theorem
5. Techniques for constructing Kernels and advantages of
Kernels in SVM
6. Implementation of SVM using different kernels
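The kernel trick of item 4 replaces every inner product in the dual with a kernel evaluation, and Mercer's theorem guarantees this is legitimate exactly when the resulting Gram matrix is positive semidefinite. The sketch below builds an RBF Gram matrix on random illustrative data and checks that condition numerically:

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))
K = rbf_kernel(X)

# Mercer's condition: a valid kernel's Gram matrix is positive semidefinite
eigvals = np.linalg.eigvalsh(K)
print(eigvals.min() >= -1e-10)  # → True
```

The construction rules of item 5 (sums, products, and positive scalings of valid kernels are valid kernels) can be verified the same way: combine Gram matrices and re-check the eigenvalues.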
M9 Ensemble Methods R1 – 5.6 and 5.7
1. Rationale for Ensemble Methods
2. Methods for constructing an Ensemble Classifier
3. Bagging, Boosting, AdaBoost
4. Random Forest
5. eXtreme Gradient Boosting (XGBoost)
6. Python Implementation of Random Forest and XGBoost
7. Class Imbalance Problem & approaches to solve it
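The bagging idea of item 3 is easy to see in miniature: train many weak learners on bootstrap resamples and combine them by majority vote. The sketch below uses 1-D decision stumps as the weak learner on illustrative toy data; Random Forest and boosting refine the same resample-and-combine pattern:

```python
import random
from collections import Counter

def fit_stump(X, y):
    """Best single-threshold classifier on a 1-D feature; returns (t, left, right)."""
    best = None
    for t in sorted(set(X)):
        for left, right in ((0, 1), (1, 0)):
            preds = [left if x <= t else right for x in X]
            err = sum(p != yy for p, yy in zip(preds, y))
            if best is None or err < best[0]:
                best = (err, t, left, right)
    return best[1:]

def bagging(X, y, n_models=25, seed=0):
    """Train stumps on bootstrap resamples; predict by majority vote."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    def predict(x):
        votes = Counter(l if x <= t else r for t, l, r in stumps)
        return votes.most_common(1)[0][0]
    return predict

X = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0, 0, 0, 1, 1, 1]
predict = bagging(X, y)
print([predict(0.5), predict(4.5)])  # → [0, 1]
```

Boosting (AdaBoost, XGBoost) differs in that resampling/weighting is adaptive: each new learner concentrates on the examples the current ensemble gets wrong, rather than on an independent bootstrap sample.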

Weekly coverage of the course


Week Content / Assignments / Exercises
Week1 Video Content: M1
Assignments : Nil
Evaluative Quiz : Nil

Week 2 Video Content: M2


Evaluative Quiz : Q1 (M1 and M2)
Assignments : Nil

Week 3 Video Content: M3


Evaluative Quiz : Nil
Assignment 1 : kNN Implementation

Week 4 Video Content: M4


Evaluative Quiz: Nil
Assignments : Nil

Week 5 Video Content: M5


Evaluative Quiz : Q2 (M3, M4 and M5)
Assignments : Nil

Week 6 Video Content: M6


Evaluative Quiz : Nil
Assignment 2 : Naïve Bayes and Logistic Regression Implementation

Week 7 Video Content: M7


Evaluative Quiz : Q3 (M6 and M7)
Assignment : Nil
Week 8 Video Content: M8
Evaluative Quiz : Nil
Assignment 3 : SVM Implementation

Week 9 Video Content: M9


Evaluative Quiz: Nil
Assignment 4: Ensemble Classifiers Implementation

Evaluation

Evaluation Component Marks Type


Comprehensive Examination 40% Closed
Quizzes (3) 24% Open
Assignments (4) 36% Open

Learning Outcomes:
No Learning Outcomes

LO1 Ability to build appropriate classifier for a given real life business problem

LO2 Demonstrate the capability to understand classification algorithms deeply and to fine-tune
their parameters to enhance the performance of the classifier

LO3 Ability to build ensemble classifier using well known techniques

LO4 Ability to interpret the classification model
