Decision Trees
What is a Decision Tree?
A decision tree is a decision support tool that uses a tree-like graph or
model of decisions and their possible outcomes. A decision tree is a
classifier in the form of a tree structure where each node is either:
• A leaf node, which indicates the value of the target attribute (class)
for the examples that reach it, or
• A decision node, which specifies a test on a single attribute, with one
branch and subtree for each possible outcome of the test.
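The two node kinds described above can be sketched as a small data structure. This is a minimal illustration, not a production implementation; the `Leaf`, `DecisionNode`, and `classify` names, and the toy "Outlook" tree, are assumptions for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Leaf:
    """Leaf node: holds the value of the target attribute (class)."""
    label: str

@dataclass
class DecisionNode:
    """Decision node: tests one attribute; one branch per outcome."""
    attribute: str
    branches: dict = field(default_factory=dict)  # outcome -> subtree

# Tiny hand-built tree: test "Outlook"; each outcome leads to a leaf.
tree = DecisionNode("Outlook", {"Sunny": Leaf("No"), "Rain": Leaf("Yes")})

def classify(node, example):
    """Follow decision-node branches until a leaf is reached."""
    while isinstance(node, DecisionNode):
        node = node.branches[example[node.attribute]]
    return node.label
```

Classification is just a walk from the root to a leaf, taking at each decision node the branch matching the example's attribute value.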
Why Decision Trees?
• Decision trees can be visualized and are simple to understand and interpret.
• They require very little data preparation, whereas other techniques often require
data normalization, the creation of dummy variables, and removal of blank values.
• The cost of using the tree (for predicting data) is logarithmic in the number of data
points used to train the tree.
• Decision trees can handle both categorical and numerical data whereas other
techniques are specialized for only one type of variable.
• Decision trees can handle multi-output problems.
• They use a white-box model: any prediction can be explained by a simple chain
of Boolean conditions (e.g. yes/no tests) along the path from root to leaf.
• Decision trees can perform well even if their assumptions are somewhat violated
by the true model from which the data were generated.
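The interpretability point above can be seen concretely with scikit-learn, whose `export_text` helper prints a fitted tree as readable if/then rules. A minimal sketch, assuming scikit-learn is installed; the dataset and `max_depth` choice are illustrative:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# Fit a shallow tree so the printed rules stay small and readable.
clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(iris.data, iris.target)

# White-box property: the whole model is a printable set of threshold tests.
rules = export_text(clf, feature_names=list(iris.feature_names))
print(rules)
```

Each line of the printed output is a test on one feature (e.g. `petal width (cm) <= 0.80`), which is exactly the Boolean-logic explanation the bullet list refers to.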
Some Decision Tree Algorithms
ID3 (Iterative Dichotomiser 3)
C4.5 & C5.0
Random Forest
CART (Classification and Regression Trees)
CHAID (CHi-squared Automatic Interaction Detector)
TARGET (Tree Analysis with Randomly Generated and Evolved Trees)
VFDT (Very Fast Decision Tree)
XGBoost
Main Steps of ID3:
• Compute the entropy of the goal/class attribute.
• Compute the entropy of each attribute.
• Compute the information gain of each attribute.
• Select the attribute with the maximum gain as the current node.
• For each branch of the tree, re-compute the attributes' gains and
select the maximum gain.
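The entropy and gain computations in the steps above can be sketched in a few lines. This is a minimal illustration of ID3's attribute selection; the toy "Outlook"/"Wind" data and labels are assumptions for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy of the class minus the weighted entropy after splitting on attr."""
    by_value = {}
    for row, label in zip(rows, labels):
        by_value.setdefault(row[attr], []).append(label)
    remainder = sum((len(subset) / len(labels)) * entropy(subset)
                    for subset in by_value.values())
    return entropy(labels) - remainder

# Toy data: "Outlook" perfectly separates the classes, "Wind" does not.
rows = [
    {"Outlook": "Sunny", "Wind": "Weak"},
    {"Outlook": "Sunny", "Wind": "Strong"},
    {"Outlook": "Rain",  "Wind": "Weak"},
    {"Outlook": "Rain",  "Wind": "Strong"},
]
labels = ["No", "No", "Yes", "Yes"]

# ID3's selection step: pick the attribute with maximum information gain.
best = max(["Outlook", "Wind"],
           key=lambda a: information_gain(rows, labels, a))
```

Here the class entropy is 1.0 bit; splitting on "Outlook" reduces it to 0 (gain 1.0), while splitting on "Wind" leaves it unchanged (gain 0), so ID3 picks "Outlook" as the current node.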
Example for ID3 Algorithm
Decision Tree Rules
If Income = $0 to $15 Then Risk = High
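A rule extracted from a tree maps directly onto code. A minimal sketch: only this first rule appears in the slides, so the remaining branches are left as a placeholder rather than invented.

```python
def credit_risk(income):
    """Apply the first extracted rule: Income in $0..$15 -> Risk = High."""
    if 0 <= income <= 15:
        return "High"
    # The remaining rules of the tree are not shown in the source,
    # so other incomes are left unclassified here.
    return None
```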