L02 Classification and Regression


Classification and Regression

Objective
• Understand classification and regression tasks

What is Classification?
A flower shop wants to guess a customer's next
purchase based on its similarity to their
most recent purchase.

What is Classification?
Which flower is a customer most likely to
purchase based on similarity to a previous
purchase?

What is Needed for Classification?

• Model data with:
• Features that can be quantified
• Labels that are known
• A method to measure similarity
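The three ingredients above (quantified features, known labels, and a similarity measure) can be sketched with a nearest-neighbor lookup. This is only a minimal illustration: the feature choice (stem length, price), the data values, and the function names are assumptions made for the example, not part of the original slides.

```python
import math

# Hypothetical training data: each previous purchase is described by two
# numeric features (stem length in cm, price in dollars) and a known label.
training_data = [
    ((60.0, 4.5), "rose"),
    ((55.0, 4.0), "rose"),
    ((30.0, 1.5), "tulip"),
    ((35.0, 2.0), "tulip"),
    ((90.0, 3.0), "sunflower"),
]


def euclidean_distance(a, b):
    """Similarity measure: a smaller distance means the items are more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_nearest(features, data):
    """Return the label of the training example closest to `features`."""
    _, nearest_label = min(
        data, key=lambda example: euclidean_distance(features, example[0])
    )
    return nearest_label


# A customer whose most recent purchase looked like this:
print(predict_nearest((58.0, 4.2), training_data))  # prints "rose"
```

In practice the features would usually be scaled first, since a feature with a larger numeric range (here, stem length) dominates the distance.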

Four Main Types of Classification Tasks

• Binary Classification
• Multi-Class Classification
• Multi-Label Classification
• Imbalanced Classification

Binary Classification
• Refers to classification tasks that have two class
labels
• Examples include:
• Email spam detection (spam or not)
• Churn prediction (churn or not)
• Conversion prediction (buy or not)
• Typically involves one class that is the normal state and
another class that is the abnormal state
• The normal state is assigned the class label 0 and the
abnormal state is assigned the class label 1

Popular Binary Classification Algorithms
• Logistic Regression
• K-Nearest Neighbors
• Decision Trees
• Support Vector Machine
• Naive Bayes
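As one illustration of the algorithms listed above, here is a minimal sketch of logistic regression on a single feature, trained with plain gradient descent. The data (a count of suspicious words per email), the learning rate, and the number of epochs are assumptions chosen for the example; a real project would normally use a library implementation.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def predict(x, w, b, threshold=0.5):
    """Return class label 1 (abnormal) or 0 (normal)."""
    return 1 if sigmoid(w * x + b) >= threshold else 0


# Hypothetical feature: number of suspicious words in an email.
# Label 1 = spam (abnormal state), label 0 = not spam (normal state).
xs = [0, 1, 2, 3, 6, 7, 8, 9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(xs, ys)
print(predict(1, w, b), predict(8, w, b))
```

The threshold of 0.5 is a design choice; lowering it trades more false alarms for fewer missed positives.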

Multi-Class Classification

• Classification tasks that have more than two class labels.


• Examples include:
• Face classification
• Plant species classification
• Optical character recognition
• Unlike binary classification, does not have the notion of normal and
abnormal outcomes. Instead, examples are classified as belonging to one
among a range of known classes.
• The number of class labels may be very large on some problems

Multi-Class Classification

• Example problems:
• A model may predict a photo as belonging to one among
thousands or tens of thousands of faces in a face recognition
system
• Predicting a sequence of words, as in text translation
models, may also be considered a special type of multi-class
classification. Each word in the sequence to be
predicted involves a multi-class classification where the size
of the vocabulary defines the number of possible classes,
which could be tens or hundreds of thousands of words

Popular Multi-Class Classification Algorithms
• k-Nearest Neighbors
• Decision Trees
• Naive Bayes
• Random Forest
• Gradient Boosting

Popular Multi-Class Classification Algorithms
• Algorithms designed for binary classification can be adapted to
multi-class problems by fitting multiple binary classification models:
• One-vs-Rest: Fit one binary classification model for each class vs. all
other classes.
• One-vs-One: Fit one binary classification model for each pair of classes.
• Binary classification algorithms that can use these strategies for
multi-class classification include:
• Logistic Regression
• Support Vector Machine
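The one-vs-rest strategy can be sketched as follows. The binary model used here (comparing distances to the mean of the positive and negative groups) is a toy stand-in chosen to keep the example short; it is an assumption of this sketch, not one of the algorithms named on the slide. The last lines only count how many models each strategy would need.

```python
from itertools import combinations


def mean_point(group):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(coords) / len(group) for coords in zip(*group))


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_binary(points, binary_labels):
    """Toy binary model (assumption): remember the mean of each of the two groups."""
    positives = [p for p, y in zip(points, binary_labels) if y == 1]
    negatives = [p for p, y in zip(points, binary_labels) if y == 0]
    return mean_point(positives), mean_point(negatives)


def score(model, point):
    """Higher score means the point looks more like the positive class."""
    positive_mean, negative_mean = model
    return squared_distance(point, negative_mean) - squared_distance(point, positive_mean)


def fit_one_vs_rest(points, labels):
    """Fit one binary model per class: that class (1) vs. all other classes (0)."""
    return {
        cls: fit_binary(points, [1 if y == cls else 0 for y in labels])
        for cls in sorted(set(labels))
    }


def predict_one_vs_rest(models, point):
    """Pick the class whose binary model gives the highest score."""
    return max(models, key=lambda cls: score(models[cls], point))


points = [(1, 1), (1, 2), (5, 5), (6, 5), (9, 1), (9, 2)]
labels = ["a", "a", "b", "b", "c", "c"]

models = fit_one_vs_rest(points, labels)
print(predict_one_vs_rest(models, (5, 4)))  # prints "b"

# Model counts for k = 10 classes: one-vs-rest needs k models,
# one-vs-one needs one model per pair of classes, k * (k - 1) / 2.
k = 10
print(k, len(list(combinations(range(k), 2))))  # prints "10 45"
```

One-vs-one fits more models, but each one is trained only on the examples of two classes.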

Multi-Label Classification
• Classification tasks that have two or more class labels, where
one or more class labels may be predicted for each example.
• Example: photo classification, where a given photo may contain multiple
objects and a model may predict the presence
of multiple known objects in the photo, such as “bicycle,”
“apple,” “person,” etc.

Multi-Label Classification Algorithms
• Classification algorithms used for binary or multi-class classification
cannot be used directly for multi-label classification
• Specialized versions of standard classification algorithms, so-called
multi-label versions, can be used, including:
• Multi-label Decision Trees
• Multi-label Random Forests
• Multi-label Gradient Boosting
• Another approach is to use a separate classification algorithm to
predict each label
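The last approach, one separate binary decision per label, can be sketched like this. The two numeric photo features, the data, and the use of a 3-nearest-neighbor majority vote as the per-label classifier are all assumptions made for illustration.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_nearest_indices(points, query, k):
    """Indices of the k training points closest to `query`."""
    order = sorted(range(len(points)), key=lambda i: squared_distance(points[i], query))
    return order[:k]


def predict_label_set(points, label_sets, all_labels, query, k=3):
    """Treat each label as its own yes/no problem and collect the 'yes' answers."""
    neighbors = k_nearest_indices(points, query, k)
    predicted = set()
    for label in all_labels:
        # Independent binary decision: is this label present among most neighbors?
        votes = sum(1 for i in neighbors if label in label_sets[i])
        if votes > k / 2:
            predicted.add(label)
    return predicted


# Hypothetical photo features and the set of objects known to be in each photo.
points = [(0.9, 0.1), (0.8, 0.2), (0.2, 0.9), (0.1, 0.8), (0.6, 0.6)]
label_sets = [
    {"bicycle", "person"},
    {"bicycle"},
    {"apple"},
    {"apple", "person"},
    {"bicycle", "apple", "person"},
]
all_labels = ["apple", "bicycle", "person"]

print(sorted(predict_label_set(points, label_sets, all_labels, (0.7, 0.5))))
# prints ['bicycle', 'person']
```

Because every label is decided independently, the prediction can contain zero, one, or several labels.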

Imbalanced Classification
• Classification tasks where the number of examples in each class is
unequally distributed
• Typically binary classification tasks where the majority of examples in the
training dataset belong to the normal class and a minority of examples
belong to the abnormal class
• Examples include:
• Fraud detection
• Outlier detection
• Medical diagnostic tests

Imbalanced Classification Algorithms
• Specialized techniques may be used to change the composition of
samples in the training dataset by undersampling the majority class or
oversampling the minority class.
• Examples include:
• Random Undersampling
• SMOTE Oversampling
• Alternative performance metrics may be required, as reporting the
classification accuracy alone may be misleading
• Examples include:
• Precision, Recall and F-Measure
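A small sketch of both points: why accuracy can mislead on imbalanced data, and what random undersampling does to the training set. The 95/5 class split and the function names are assumptions for the example, and the undersampling helper assumes label 1 is the minority class.

```python
import random


def accuracy(y_true, y_pred):
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F-measure for the abnormal class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def random_undersample(examples, labels, seed=0):
    """Randomly drop majority-class (label 0) examples until the classes are balanced.

    Assumes label 1 is the minority class.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [examples[i] for i in kept], [labels[i] for i in kept]


# 95 normal examples, 5 abnormal ones, and a model that always predicts "normal".
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100

print(accuracy(y_true, y_pred))             # prints 0.95
print(precision_recall_f1(y_true, y_pred))  # prints (0.0, 0.0, 0.0)

examples = list(range(100))
balanced_examples, balanced_labels = random_undersample(examples, y_true)
print(len(balanced_labels), sum(balanced_labels))  # prints "10 5"
```

The always-normal model scores 95% accuracy while detecting none of the abnormal cases, which recall makes visible immediately.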

Regression Analysis
• Consists of a set of machine learning methods that
allow us to predict a continuous outcome variable (y)
based on the value of one or more predictor
variables (x)
• The goal of a regression model is to build a mathematical
equation that defines y as a function of the x
variables. This equation can then be used to predict
the outcome (y) from new values of the
predictor variables (x)

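For a single predictor, "building an equation that defines y as a function of x" can be sketched with ordinary least squares for a straight line, y = intercept + slope * x. The data values below are made up so that the fit is easy to check by hand; they are an assumption of the example.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(x, intercept, slope):
    """Use the fitted equation to predict the outcome for a new x."""
    return intercept + slope * x


# Hypothetical data lying exactly on the line y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_line(xs, ys)
print(intercept, slope)              # prints "1.0 2.0"
print(predict(5.0, intercept, slope))  # prints 11.0
```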
Regression Analysis
• Used for prediction
• Fit a function to the available data and use it to predict
the outcome for future or hold-out data points
• Two main purposes:
• Estimate missing data within your data range
(Interpolation)
• Estimate future data outside your data range
(Extrapolation)

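The difference between the two purposes comes down to where the new input falls relative to the observed data. A minimal sketch, assuming an already-fitted model (the line y = 1 + 2x here is hypothetical) and hypothetical observed inputs:

```python
def kind_of_estimate(x_new, observed_xs):
    """Label a prediction by whether x_new lies inside the observed data range."""
    low, high = min(observed_xs), max(observed_xs)
    return "interpolation" if low <= x_new <= high else "extrapolation"


def model(x):
    """Hypothetical fitted regression equation, assumed for this example."""
    return 1.0 + 2.0 * x


observed_xs = [1.0, 2.0, 4.0, 5.0]

for x_new in (3.0, 8.0):
    print(x_new, model(x_new), kind_of_estimate(x_new, observed_xs))
# prints "3.0 7.0 interpolation" then "8.0 17.0 extrapolation"
```

Extrapolated predictions generally deserve more caution, because nothing in the data confirms that the fitted relationship still holds outside the observed range.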
Applications of Regression Analysis
• Real-world examples of regression analysis include:
• Predicting the price of a house given house features
• Predicting the impact of SAT/GRE scores on college
admissions
• Predicting sales based on input parameters
• Predicting the weather, etc.

Interpolation
[Figure: interpolation plot; axes: Input, Prediction]
Source: https://towardsdatascience.com/a-beginners-guide-to-regression-analysis-in-machine-learning-8a828b491bbf

Extrapolation
Source: https://towardsdatascience.com/a-beginners-guide-to-regression-analysis-in-machine-learning-8a828b491bbf

Regression Algorithms
• Linear Regression
• Polynomial Regression

References:
• https://machinelearningmastery.com/types-of-classification-in-machine-learning/
• http://www.sthda.com/english/wiki/regression-analysis-essentials-for-machine-learning
• https://towardsdatascience.com/a-beginners-guide-to-regression-analysis-in-machine-learning-8a828b491bbf
