Decision Trees - Detailed Notes
Prerequisite-
- Probability and statistics.
Objectives-
- Understand prerequisite terms such as Gini index, entropy, information gain, and pruning.
- Understand the decision tree as a CART algorithm.
- Tuning parameters of the decision tree.
- Advantages and disadvantages of decision trees.
Decision Tree
A Decision Tree is one of the most popular and effective supervised learning techniques for classification problems, and it works equally well with both categorical and quantitative variables. It is a graphical representation of all the possible solutions to a decision based on certain conditions. In this algorithm, the training sample points are split into two or more sets based on a split condition over the input variables. A simple example of a decision tree: a person has to decide whether to go to sleep or to a restaurant based on parameters such as whether he is hungry or has $25 in his pocket.
Terminology
Before moving into the details of the decision tree and its working, let's understand the meaning of some terminology associated with it.
Root node- Represents the entire population, which gets further divided into sets based on splitting decisions.
Decision node- These are the internal nodes of the tree; they are expressed through conditional expressions over the input attributes.
Leaf node- Nodes which do not split further are known as leaf nodes or terminal nodes.
Splitting- The process of dividing a node into two or more sub-nodes.
Pruning- The reverse process of splitting, where sub-nodes are removed.
The accuracy of the tree is heavily affected by the split point chosen at a decision node. Decision trees use different criteria to decide the split of a decision node into two or more sub-nodes. The resulting sub-nodes should increase the homogeneity of the data points, also known as the purity of the nodes, with respect to the target variable. The split decision is tested on all available variables, and the split that produces the purest sub-nodes is selected.
Measures of Impurity:
Decision trees recursively split features with respect to the purity of the target variable. The algorithm is designed to optimize each split so that purity is maximized. Impurity can be measured in several ways, such as Gini impurity, entropy, and information gain.
Gini Impurity-
$\text{Gini impurity} = 1 - \sum_i p_i^2$
where $p_i$ represents the probability of randomly selecting an observation of class $i$.
Consider a simple example of a bag containing some balls (4 red balls and 0 blue balls). The Gini impurity would be $1 - \left[ \left(\tfrac{4}{4}\right)^2 + \left(\tfrac{0}{4}\right)^2 \right] = 0$, i.e. the node is perfectly pure.
Example- Suppose we are given data on 10 students with a target variable indicating whether a student is an athlete or not. The input attributes are gender (Male/Female) and education (UG/PG).
Case 1- First, make a split with respect to the gender variable (figure: split on gender).
Step 1- Calculate the Gini impurity of each sub-node.
$gini_{female} = 1 - \left[ P(Yes)^2 + P(No)^2 \right] = 1 - \left[ \left(\tfrac{3}{6}\right)^2 + \left(\tfrac{3}{6}\right)^2 \right] = 0.5$
$gini_{male} = 1 - \left[ \left(\tfrac{3}{4}\right)^2 + \left(\tfrac{1}{4}\right)^2 \right] = 0.375$
Step 2- Calculate the Gini for the split using the weights of each Gini score.
$gini_{gender} = \tfrac{4}{10}(gini_{male}) + \tfrac{6}{10}(gini_{female}) = 0.4(0.375) + 0.6(0.5) = 0.45$
Case 2- Now make the second split with respect to the education variable (figure: split on education).
Step 1- Calculate the Gini impurity of each sub-node.
$gini_{UG} = 0$ (all UG students belong to the same class)
$gini_{PG} = 1 - \left[ P(Yes)^2 + P(No)^2 \right] = 1 - \left[ \left(\tfrac{1}{5}\right)^2 + \left(\tfrac{4}{5}\right)^2 \right] = 0.32$
Step 2- Calculate the Gini for the split using the weights of each Gini score.
$gini_{education} = \tfrac{5}{10}(gini_{UG}) + \tfrac{5}{10}(gini_{PG}) = 0.5(0.0) + 0.5(0.32) = 0.16$
As is clear from the calculations, the Gini score for the education split is lower than that for the gender split, so education is the best variable for making the split (it gives purer sub-nodes).
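As an illustration, the calculation above can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original notes: the helper functions gini_impurity and gini_of_split are hypothetical names, and the per-node class counts are inferred from the fractions used in the worked example.

```python
# Minimal sketch of the Gini calculation from the worked example above.
# Class counts per sub-node are inferred from the fractions in the example
# (10 students in total: 6 athletes, 4 non-athletes).

def gini_impurity(counts):
    """Gini impurity of a node, given its class counts, e.g. [3, 3]."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

def gini_of_split(groups):
    """Weighted Gini of a split; `groups` is a list of class-count lists."""
    n = sum(sum(g) for g in groups)
    return sum(sum(g) / n * gini_impurity(g) for g in groups)

# Split on gender: males -> [3 athletes, 1 non-athlete], females -> [3, 3]
print(round(gini_of_split([[3, 1], [3, 3]]), 3))   # 0.45
# Split on education: UG -> [5, 0] (pure), PG -> [1, 4]
print(round(gini_of_split([[5, 0], [1, 4]]), 3))   # 0.16 -> education wins
```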
Entropy- In layman's terms, entropy is simply a measure of disorder. We can also think of it as a measure of impurity. The mathematical formula for entropy is as follows:
$E = \sum_{i=1}^{c} -p_i \log_2 p_i$
where $p_i$ is the probability of class $i$.
For the given example we have two labels for the athlete class (Yes/No), so the target can be either Yes or No. The entropy of the given set is therefore:
$P(Yes) = \tfrac{6}{10} \quad \text{and} \quad P(No) = \tfrac{4}{10}$
$Entropy = -\tfrac{6}{10}\log_2\!\left(\tfrac{6}{10}\right) - \tfrac{4}{10}\log_2\!\left(\tfrac{4}{10}\right) = -(-0.442 - 0.529) = 0.971$
Information Gain
Information gain measures the reduction in entropy and decides which attribute should be selected as a decision node. It is calculated as the entropy of the decision node minus the weighted average of the entropies of its children. That is, for m points in the first child node and n points in the second child node, the information gain is:
$IG = Entropy(Decision\ node) - \tfrac{m}{m+n}\,Entropy(First\ child) - \tfrac{n}{m+n}\,Entropy(Second\ child)$
A simple graph of entropy with respect to the class proportion would look like this:
Figure 3- Entropy graph w.r.t. impurity
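To make the entropy and information gain calculations concrete, here is a minimal Python sketch. It is not from the original notes; the function names entropy and information_gain are illustrative, and the class counts come from the worked student example.

```python
# Minimal sketch of entropy and information gain for the student example
# (10 students: 6 athletes, 4 non-athletes).
from math import log2

def entropy(counts):
    """Entropy of a node given its class counts, e.g. [6, 4]."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

def information_gain(parent, children):
    """Entropy of the decision node minus the weighted entropy of its children."""
    n = sum(parent)
    weighted = sum(sum(c) / n * entropy(c) for c in children)
    return entropy(parent) - weighted

print(round(entropy([6, 4]), 3))                               # 0.971
# Information gain of the education split (UG: [5, 0], PG: [1, 4])
print(round(information_gain([6, 4], [[5, 0], [1, 4]]), 3))    # about 0.61
```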
Gini vs Entropy- Both are measures of node impurity and usually lead to very similar splits. Gini impurity is slightly cheaper to compute because it avoids the logarithm, while entropy (used through information gain) penalizes mixed nodes a little more strongly; in practice the choice rarely changes the resulting tree much.
Pruning-
Pruning is very useful in decision trees because a decision tree may fit the training data very well but perform very poorly on test or new data. By removing branches we can reduce the complexity of the tree, which helps in reducing overfitting.
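As a practical illustration (not part of the original notes), scikit-learn supports post-pruning through cost-complexity pruning via the ccp_alpha parameter of DecisionTreeClassifier. The sketch below uses the Iris dataset purely as a stand-in, and the ccp_alpha value is arbitrary.

```python
# Minimal sketch of post-pruning with cost-complexity pruning in scikit-learn.
# ccp_alpha controls how aggressively low-value branches are removed
# (0.0 means no pruning); the value 0.02 here is only an illustration.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

unpruned = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
pruned = DecisionTreeClassifier(ccp_alpha=0.02, random_state=0).fit(X_train, y_train)

print("unpruned:", unpruned.get_n_leaves(), "leaves, test accuracy", unpruned.score(X_test, y_test))
print("pruned:  ", pruned.get_n_leaves(), "leaves, test accuracy", pruned.score(X_test, y_test))
```

A smaller tree with comparable test accuracy indicates that the removed branches were mostly fitting noise.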
To generate decision trees that generalize well to new problems, we can tune different aspects of the tree. We call these aspects of the decision tree "hyperparameters". Some of the important hyperparameters used in decision trees are as follows:
Maximum Depth- The maximum depth of the decision tree is simply the length of the longest path from the root to a leaf. A tree of maximum depth k can have at most 2**k leaves.
Maximum number of features- We may have too many features to build a decision tree efficiently. At every split we would otherwise have to evaluate the entire data-set on each of the features, which can be very expensive. A solution is to limit the number of features considered for each split. If this number is large enough, we are very likely to find a good feature among the ones we look at (although maybe not the perfect one), and if it is smaller than the total number of features, it speeds up the calculations significantly.
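A minimal sketch of tuning these hyperparameters with scikit-learn is shown below (the parameter names max_depth and max_features match the library's DecisionTreeClassifier API; the grid values and the Iris dataset are only illustrative).

```python
# Minimal sketch: searching over max_depth and max_features for a decision tree.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

param_grid = {
    "max_depth": [2, 3, 4, None],       # None lets the tree grow fully
    "max_features": [1, 2, None],       # features considered at each split
    "criterion": ["gini", "entropy"],   # impurity measure discussed above
}
search = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```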
Example-
The decision tree for the example discussed in the Gini calculation would be as shown in the figure (decision tree for the student data, with education as the first split).
Disadvantages of Decision Tree-
- A small change in the data-set can result in a large change in the structure of the decision tree, causing instability in the model.
- Training a decision tree model can take more time than many other algorithms.
- Decision tree calculations can be far more expensive than those of other algorithms.
- It is generally not advised to apply a decision tree for regression, i.e. predicting continuous values.
*********