Classification & Prediction


TOPIC: CLASSIFICATION & PREDICTION

What is classification?
The following are examples of cases where the data analysis task is classification:

• A bank loan officer wants to analyze the data in order to know which customers (loan applicants) are risky and which are safe.
• A marketing manager at a company needs to predict whether a customer with a given profile will buy a new computer.

In both of the above examples, a model or classifier is constructed to predict categorical labels. These labels are "risky" or "safe" for the loan application data and "yes" or "no" for the marketing data.
Real-Life Examples
There are many real-life applications of classification in data mining. Some of the most common include:

• Email spam classification: classifying emails as spam or non-spam based on their content and metadata.
• Image classification: classifying images into different categories, such as animals, plants, buildings, and people.
• Medical diagnosis: classifying patients into different categories based on their symptoms, medical history, and test results.
• Credit risk analysis: classifying loan applications into different categories, such as low-risk, medium-risk, and high-risk, based on the applicant's credit score, income, and other factors.
• Sentiment analysis: classifying text data, such as reviews or social media posts, into positive, negative, or neutral categories based on the language used.
• Customer segmentation: classifying customers into different segments based on their demographic information, purchasing behavior, and other factors.
• Fraud detection: classifying transactions as fraudulent or non-fraudulent based on features such as transaction amount, location, and frequency.

What is prediction?
The following is an example of a case where the data analysis task is prediction:

Suppose a marketing manager needs to predict how much a given customer will spend during a sale at his company. Here we are asked to predict a numeric value, so the data analysis task is an example of numeric prediction. In this case, a model or predictor is constructed that predicts a continuous-valued function, or ordered value.
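To make this concrete, here is a minimal sketch of numeric prediction in Python, assuming scikit-learn is available; the customer profiles and spend figures are invented purely for illustration.

from sklearn.linear_model import LinearRegression

# Hypothetical training data: [age, income in thousands] -> spend at the last sale.
X_train = [[25, 40], [32, 60], [47, 95], [51, 120], [38, 70]]
y_train = [120.0, 250.0, 480.0, 610.0, 310.0]  # continuous-valued target

# A predictor models a continuous-valued function of the input attributes.
predictor = LinearRegression().fit(X_train, y_train)

# Predict how much a new customer with a given profile will spend.
new_customer = [[45, 90]]
print(predictor.predict(new_customer))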
How Does Classification Work?
Using the bank loan application discussed above, let us understand how classification works. The data classification process includes two steps:

• Building the Classifier or Model
• Using the Classifier for Classification

Building the Classifier or Model


• This step is the learning step, or the learning phase.
• In this step, the classification algorithm builds the classifier.
• The classifier is built from a training set made up of database tuples and their associated class labels.
• Each tuple that constitutes the training set belongs to a predefined category or class. These tuples can also be referred to as samples, objects, or data points.

Using the Classifier for Classification


In this step, the classifier is used for classification. Here the test data are used to estimate the accuracy of the classification rules. If the accuracy is considered acceptable, the rules can be applied to new data tuples.
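A minimal sketch of the two-step process in Python, assuming scikit-learn is available; the bundled iris dataset stands in for the bank-loan data.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Step 1: build the classifier (the learning phase) from a training set
# of tuples and their associated class labels.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
clf = DecisionTreeClassifier().fit(X_train, y_train)

# Step 2: use the classifier; held-out test data estimate the accuracy.
accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"estimated accuracy: {accuracy:.2f}")

# If the accuracy is acceptable, classify new data tuples.
print(clf.predict(X_test[:3]))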
Classification and Prediction Issues
The major issue is preparing the data for classification and prediction. Preparing the data involves the following activities:

1. Data Cleaning: Data cleaning involves removing noise and treating missing values. Noise is removed by applying smoothing techniques, and missing values are handled by replacing a missing value with the most commonly occurring value for that attribute.
2. Relevance Analysis: The database may also contain irrelevant attributes. Correlation analysis is used to determine whether any two given attributes are related.
3. Data Transformation and Reduction: The data can be transformed by any of the following methods:
o Normalization: Normalization involves scaling all values for a given attribute so that they fall within a small specified range. It is used when neural networks, or methods involving distance measurements, are used in the learning step.
o Generalization: The data can also be transformed by generalizing it to higher-level concepts. For this purpose, we can use concept hierarchies; for example, a numeric attribute such as age can be generalized to ranges such as young, middle-aged, and senior.
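A small sketch of two of these preparation steps in plain Python; the attribute values are invented for illustration. Missing values are filled with the most common value, and all values are then min-max normalized into [0, 1].

from statistics import mode

# Hypothetical attribute values with missing entries (None).
incomes = [40, 60, None, 95, 60, 120, None, 70]

# Data cleaning: replace each missing value with the most common value.
most_common = mode(v for v in incomes if v is not None)
filled = [v if v is not None else most_common for v in incomes]

# Normalization: min-max scale all values into the range [0, 1].
lo, hi = min(filled), max(filled)
normalized = [(v - lo) / (hi - lo) for v in filled]
print(filled)
print([round(v, 2) for v in normalized])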
Comparison of Classification and Prediction Methods
o Accuracy: The accuracy of a classifier refers to its ability to predict the class label correctly; the accuracy of a predictor refers to how well it can estimate the unknown value.
o Speed: The speed of the method depends on the computational cost of
generating and using the classifier or predictor.
o Robustness: Robustness is the ability of the classifier or predictor to make correct predictions given noisy data or data with missing values.
o Scalability: Scalability refers to the ability to construct the classifier or predictor efficiently as the amount of data grows.
o Interpretability: Interpretability is how readily we can understand the
reasoning behind predictions or classification made by the predictor or
classifier.
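Of these criteria, accuracy is the easiest to make concrete: a classifier's accuracy on a test set is simply the fraction of tuples it labels correctly. A quick sketch in plain Python; the labels below are invented for illustration.

# Hypothetical true vs. predicted class labels for ten test tuples.
actual    = ["safe", "risky", "safe", "safe", "risky", "safe", "risky", "safe", "safe", "risky"]
predicted = ["safe", "risky", "safe", "risky", "risky", "safe", "safe", "safe", "safe", "risky"]

correct = sum(a == p for a, p in zip(actual, predicted))
print(f"accuracy = {correct}/{len(actual)} = {correct / len(actual):.2f}")  # 8/10 = 0.80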

CLASSIFICATION TECHNIQUES
1. DECISION TREE
o Decision tree mining is a data mining technique used to build classification models. As the name suggests, it builds the model in the form of a tree-like structure. It belongs to the class of supervised learning methods.
o In supervised learning, the target result is already known. Decision trees can be used for both categorical and numerical data. Categorical data represent attributes such as gender or marital status, while numerical data represent attributes such as age or temperature.

Decision Tree Terminologies


Root Node: The root node is where the decision tree starts. It represents the entire dataset, which is further divided into two or more homogeneous sets.
Leaf Node: Leaf nodes are the final output nodes; the tree cannot be split any further once a leaf node is reached.
Splitting: Splitting is the process of dividing a decision node (or the root node) into sub-nodes according to the given conditions.
Branch/Sub-Tree: A tree formed by splitting a node of the larger tree.
Pruning: Pruning is the process of removing unwanted branches from the tree.
Parent/Child Node: A node that is divided into sub-nodes is called the parent of those sub-nodes, and the sub-nodes are called its children.
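To see these terms on an actual tree, the sketch below (assuming scikit-learn is available) fits a small decision tree on the bundled iris dataset and prints its structure: the condition at the top is the root node's split, each indented block is a branch/sub-tree, and the "class:" lines are leaf nodes.

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
# max_depth=2 keeps the tree small; limiting depth is a simple form of pruning.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)

# Print the splits, branches, and leaves of the fitted tree.
print(export_text(tree, feature_names=list(iris.feature_names)))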

QUESTION BASED ON DECISION TREE

[Figure: worked decision-tree question and solution; only the final computed value, 0.289, is recoverable.]
Naive Bayes Classifiers
Bayes’ Theorem
Bayes’ Theorem finds the probability of an event occurring given the
probability of another event that has already occurred. Bayes’ theorem is
stated mathematically as the following equation:
P(A|B) = (P(B|A) * P(A)) / P(B)

where A and B are events and P(B) ≠ 0.
• Basically, we are trying to find the probability of event A given that event B is true. Event B is also termed the evidence.
• P(A) is the prior probability of A, i.e. the probability of the event before the evidence is seen. The evidence is an attribute value of an unknown instance (here, event B).
• P(B) is the marginal probability: the probability of the evidence.
• P(A|B) is the posterior probability of A, i.e. the probability of the event after the evidence is seen.
• P(B|A) is the likelihood: the probability of observing the evidence given that hypothesis A is true.
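The theorem translates directly into a one-line Python function; the probabilities below are illustrative numbers only.

def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) = P(B|A) * P(A) / P(B), assuming P(B) != 0."""
    return p_b_given_a * p_a / p_b

# Example with made-up values: likelihood 0.8, prior 0.3, evidence 0.5.
print(bayes(p_b_given_a=0.8, p_a=0.3, p_b=0.5))  # posterior P(A|B) = 0.48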

Advantages of Naive Bayes Classifier


• Easy to implement and computationally efficient.
• Effective in cases with a large number of features.
• Performs well even with limited training data.
Numerical Example:

A basket contains 1200 fruits with the following feature counts. (A fruit can have more than one feature, so the feature counts in a row need not sum to the class total.)

Fruit    | Yellow | Sweet | Long | Total
Mango    |   350  |  450  |    0 |   650
Banana   |   400  |  300  |  350 |   400
Others   |    50  |  100  |   50 |   150
Total    |   800  |  850  |  400 |  1200

Classify a new fruit X = {Yellow, Sweet, Long} using the naive Bayes classifier.

Solution: By Bayes' theorem, P(A|B) = (P(B|A) * P(A)) / P(B)

1. Mango:

P(X | Mango) = P(Yellow | Mango) * P(Sweet | Mango) * P(Long | Mango)

1.a) P(Yellow | Mango) = (P(Mango | Yellow) * P(Yellow)) / P(Mango)
= ((350/800) * (800/1200)) / (650/1200)
P(Yellow | Mango) = 0.53 → (1)

1.b) P(Sweet | Mango) = (P(Mango | Sweet) * P(Sweet)) / P(Mango)
= ((450/850) * (850/1200)) / (650/1200)
P(Sweet | Mango) = 0.69 → (2)

1.c) P(Long | Mango) = (P(Mango | Long) * P(Long)) / P(Mango)
= ((0/400) * (400/1200)) / (650/1200)
P(Long | Mango) = 0 → (3)

Multiplying (1), (2) and (3): P(X | Mango) = 0.53 * 0.69 * 0

P(X | Mango) = 0

2. Banana:

P(X | Banana) = P(Yellow | Banana) * P(Sweet | Banana) * P(Long | Banana)

2.a) P(Yellow | Banana) = (P(Banana | Yellow) * P(Yellow)) / P(Banana)
= ((400/800) * (800/1200)) / (400/1200)
P(Yellow | Banana) = 1 → (4)

2.b) P(Sweet | Banana) = (P(Banana | Sweet) * P(Sweet)) / P(Banana)
= ((300/850) * (850/1200)) / (400/1200)
P(Sweet | Banana) = 0.75 → (5)

2.c) P(Long | Banana) = (P(Banana | Long) * P(Long)) / P(Banana)
= ((350/400) * (400/1200)) / (400/1200)
P(Long | Banana) = 0.875 → (6)

Multiplying (4), (5) and (6): P(X | Banana) = 1 * 0.75 * 0.875

P(X | Banana) = 0.6562

3. Others:

P(X | Others) = P(Yellow | Others) * P(Sweet | Others) * P(Long | Others)

3.a) P(Yellow | Others) = (P(Others | Yellow) * P(Yellow)) / P(Others)
= ((50/800) * (800/1200)) / (150/1200)
P(Yellow | Others) = 0.33 → (7)

3.b) P(Sweet | Others) = (P(Others | Sweet) * P(Sweet)) / P(Others)
= ((100/850) * (850/1200)) / (150/1200)
P(Sweet | Others) = 0.67 → (8)

3.c) P(Long | Others) = (P(Others | Long) * P(Long)) / P(Others)
= ((50/400) * (400/1200)) / (150/1200)
P(Long | Others) = 0.33 → (9)

Multiplying (7), (8) and (9), using the exact fractions 50/150, 100/150 and 50/150:

P(X | Others) = 0.0741

Finally, from P(X | Mango) = 0, P(X | Banana) = 0.6562 and P(X | Others) = 0.0741, we can conclude that the fruit X = {Yellow, Sweet, Long} is a Banana.
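The same computation can be scripted directly from the count table; here is a sketch in plain Python that reproduces the three class scores above.

# Per-class feature counts and class totals from the fruit table.
counts = {
    "Mango":  {"Yellow": 350, "Sweet": 450, "Long": 0,   "total": 650},
    "Banana": {"Yellow": 400, "Sweet": 300, "Long": 350, "total": 400},
    "Others": {"Yellow": 50,  "Sweet": 100, "Long": 50,  "total": 150},
}

# Naive Bayes: P(X | class) is the product over features of
# P(feature | class) = count(feature, class) / count(class).
for fruit, c in counts.items():
    score = 1.0
    for feature in ("Yellow", "Sweet", "Long"):
        score *= c[feature] / c["total"]
    print(f"P(X | {fruit}) = {score:.4f}")

# Banana scores highest, so X = {Yellow, Sweet, Long} is classified as Banana.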
