LECTURE 4: CLASSIFICATION
Classification Models
• The process in which historical records are used to make a prediction
about an uncertain future.
• At a fundamental level, most data science problems fall into one of
two categories: class prediction or numeric prediction.
• In classification, or class prediction, the information in the
predictors, or independent variables, is used to sort the data samples
into two or more distinct classes or buckets.
• In numeric prediction, the values assumed by the independent
variables are used to predict the numeric value of a dependent
variable.
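The contrast between the two problem types can be sketched with a minimal 1-nearest-neighbour rule; the study-hours data and targets below are hypothetical, invented purely for illustration.

```python
# Minimal sketch contrasting class prediction and numeric prediction,
# using a 1-nearest-neighbour rule on made-up (hours-of-study, target) pairs.

def nearest(train, x):
    """Return the training pair whose feature value is closest to x."""
    return min(train, key=lambda pair: abs(pair[0] - x))

# Class prediction: the target is a discrete label (pass / fail).
class_train = [(1.0, "fail"), (2.0, "fail"), (6.0, "pass"), (8.0, "pass")]
print(nearest(class_train, 7.0)[1])    # prediction is a class label

# Numeric prediction: the target is a continuous value (exam score).
numeric_train = [(1.0, 40.0), (2.0, 45.0), (6.0, 75.0), (8.0, 90.0)]
print(nearest(numeric_train, 7.0)[1])  # prediction is a number
```

The same predictor (hours of study) drives both models; only the type of the predicted variable differs.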
Classification Algorithms
• Decision Trees
• Rule induction
• K-NN
• Naive Bayesian
• Neural Networks
• Support Vector Machines
Decision Trees
• A decision tree is a supervised learning algorithm used for
both classification and regression problems.
• Simply put, it takes the form of a tree whose branches
represent the potential answers to a given question.
• Several metrics are used to train decision trees; one of the
most common is information gain, which is based on entropy.
Decision Trees
• Decision trees (also known as classification trees) are
probably among the most intuitive and frequently used data
science techniques.
• From an analyst's point of view, they are easy to set up.
• From a business user's point of view, they are easy to
interpret.
• Classification trees are used to separate a dataset into
classes belonging to the response variable. Usually the
response variable has two classes: Yes or No (1 or 0).
Decision Trees
How It Works
• A decision tree model takes the form of a decision flowchart in
which an attribute is tested at each node.
• At the end of each decision tree path is a leaf node, where a
prediction is made.
• The nodes split the dataset into subsets.
• In a decision tree, the idea is to split the dataset so that each
resulting subset is as homogeneous as possible.
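A single node's test can be sketched as routing rows into two subsets by comparing one attribute against a threshold; the rows, the attribute name ("humidity"), and the class label ("play") below are hypothetical examples, not from the lecture.

```python
# Minimal sketch of one decision-tree node: test an attribute against a
# threshold and split the rows into two subsets.

def split(rows, attr, threshold):
    """Route each row left (attr <= threshold) or right (attr > threshold)."""
    left = [r for r in rows if r[attr] <= threshold]
    right = [r for r in rows if r[attr] > threshold]
    return left, right

rows = [
    {"humidity": 65, "play": "yes"},
    {"humidity": 70, "play": "yes"},
    {"humidity": 90, "play": "no"},
    {"humidity": 95, "play": "no"},
]

left, right = split(rows, "humidity", 80)
print([r["play"] for r in left])   # all "yes": a homogeneous subset
print([r["play"] for r in right])  # all "no": a homogeneous subset
```

Training a tree amounts to searching for the attribute and threshold whose split yields the most homogeneous subsets, as measured by a metric such as entropy.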
Decision Trees
Entropy
• Entropy is an information theory metric that measures the impurity or
uncertainty in a group of observations.
• It determines how a decision tree chooses to split data.
Decision Trees
• The entropy of a single event is defined as log2(1/p), or
equivalently -log2(p), where p is the probability of the event
occurring.
• If the probabilities of the events are not all identical, a weighted
expression is needed and, thus, entropy, H, is adjusted as follows:
H = -Σi pi log2(pi), where pi is the proportion of observations
belonging to class i.
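The weighted entropy formula H = -Σ pi log2(pi) can be computed directly from a list of class labels; the helper name `entropy` and the "yes"/"no" labels below are illustrative choices.

```python
import math

def entropy(labels):
    """H = -sum(p_i * log2(p_i)) over the class proportions p_i."""
    n = len(labels)
    return sum(-p * math.log2(p)
               for p in (labels.count(c) / n for c in set(labels)))

# A perfectly pure group has entropy 0 bits; a 50/50 split has 1 bit,
# the maximum uncertainty for two classes.
print(entropy(["yes", "yes", "yes", "yes"]))  # 0.0
print(entropy(["yes", "yes", "no", "no"]))    # 1.0
```

A decision tree prefers the split whose subsets have the lowest weighted entropy, i.e. the highest information gain.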
Datasets: http://archive.ics.uci.edu/ml/datasets/