Decision Tree
Classification Trees (Decision Trees)
[Figure: a decision tree consists of a root node, internal nodes, and leaf nodes]
Homogeneity/Purity of Data
• The basic idea of a decision tree is to split the data set based on the
homogeneity (sameness) of the data, i.e., to reduce “impurity”
• Impurity (uncertainty) is at its maximum when all possible classes are
equally represented, e.g., the same number of “default” and “not default”
records in the following example
[Figure: example split on “Age < 45” separating “default” from “not default”]
Entropy
Entropy = −Σᵢ pᵢ log₂(pᵢ), where pᵢ is the fraction of examples in class i
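The entropy measure can be sketched in a few lines of Python; the helper name and the tiny “default”/“not default” label lists below are illustrative, not from the slides:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

# A 50/50 mix is maximally impure for two classes; a pure node has entropy 0.
print(entropy(["default", "default", "not default", "not default"]))  # 1.0
print(entropy(["default"] * 4))  # 0.0
```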
ID3 Decision Tree Algorithm
Sources:
https://en.wikipedia.org/wiki/ID3_algorithm
http://www.saedsayad.com/decision_tree.htm
- Choose the attribute with the largest information gain as the decision node
- Divide the dataset by its branches and repeat the same process on every branch
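Information gain is the entropy reduction achieved by a split; a minimal sketch (the function name and the tiny windy/outlook example are hypothetical, loosely echoing the weather data the sources use):

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Entropy reduction from splitting `labels` by the parallel
    attribute column `values`."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    n = len(labels)
    weighted = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - weighted

# Tiny hypothetical example: two candidate attributes, one label column.
windy   = [True, True, False, False, False, False]
outlook = ["sunny", "rain", "sunny", "rain", "sunny", "rain"]
play    = ["no", "no", "yes", "yes", "yes", "yes"]

print(information_gain(windy, play))    # ≈ 0.918: windy yields pure branches
print(information_gain(outlook, play))  # 0.0: outlook tells us nothing here
```

Here windy would be chosen as the decision node because its gain is larger.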
[Figure: after the first split, one branch is already pure (entropy is 0) and needs no more splitting; the remaining branches need further splitting]
• A branch with entropy of 0 is a leaf node
• A branch with entropy greater than 0 needs further splitting
• Repeat recursively on the non-leaf branches until all data is classified
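The recursive procedure above can be sketched as a tiny ID3 for categorical attributes; `id3` and the toy rows are hypothetical illustrations (no pruning, rows as dicts):

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

def id3(rows, labels, attrs):
    """Minimal ID3 sketch: `attrs` is the set of attribute names still
    available for splitting; returns a leaf label or (attribute, branches)."""
    if entropy(labels) == 0 or not attrs:
        # pure node (or nothing left to split on): make a leaf
        return Counter(labels).most_common(1)[0][0]

    def gain(a):  # information gain of splitting on attribute a
        groups = defaultdict(list)
        for row, y in zip(rows, labels):
            groups[row[a]].append(y)
        return entropy(labels) - sum(
            len(g) / len(labels) * entropy(g) for g in groups.values())

    best = max(attrs, key=gain)  # attribute with the largest information gain
    branches = defaultdict(lambda: ([], []))
    for row, y in zip(rows, labels):
        sub_rows, sub_labels = branches[row[best]]
        sub_rows.append(row)
        sub_labels.append(y)
    # recurse on every branch with the chosen attribute removed
    return (best, {v: id3(r, l, attrs - {best})
                   for v, (r, l) in branches.items()})

# Hypothetical toy data: "windy" alone determines the label.
rows = [{"outlook": "sunny", "windy": True},
        {"outlook": "rain",  "windy": True},
        {"outlook": "sunny", "windy": False},
        {"outlook": "rain",  "windy": False}]
labels = ["no", "no", "yes", "yes"]
print(id3(rows, labels, {"outlook", "windy"}))
# ('windy', {True: 'no', False: 'yes'}) — both branches are already pure
```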
Windy is chosen to be the next decision node here
Split Points for Numeric Variables
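For a numeric variable there is no fixed set of branches; a common approach (sketched here with hypothetical names, echoing the earlier “Age < 45” example) is to try the midpoints between consecutive sorted values and keep the threshold with the largest information gain:

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_numeric_split(values, labels):
    """Try each midpoint between consecutive distinct sorted values as a
    threshold for the test `value < threshold`; return (threshold, gain)
    with the largest information gain."""
    pairs = sorted(zip(values, labels))
    n = len(labels)
    best = (None, -1.0)
    for (a, _), (b, _) in zip(pairs, pairs[1:]):
        if a == b:
            continue
        t = (a + b) / 2  # candidate split point
        left = [y for v, y in pairs if v < t]
        right = [y for v, y in pairs if v >= t]
        gain = entropy(labels) - (len(left) / n * entropy(left)
                                  + len(right) / n * entropy(right))
        if gain > best[1]:
            best = (t, gain)
    return best

ages  = [22, 30, 38, 47, 52, 60]
label = ["not default"] * 3 + ["default"] * 3
print(best_numeric_split(ages, label))
# (42.5, 1.0): the midpoint between 38 and 47 separates the classes perfectly
```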
When to Stop?
• The ID3 algorithm may lead to a very complex tree as shown below, which
may work well for the training data but give poor predictions on unseen
data; this is called overfitting (we will discuss this more in the future)
Questions?