Decision Tree


What is Learning ?

• Learning is a branch of AI

• A criticism of classical AI systems:
  – They cannot adapt to new situations
  – They do only whatever they are told

• The remedy: make the system L-E-A-R-N
Types of Learning

• Supervised Learning

  – We know the target, because of previous history
  – Eg. chess playing; loan applications classified as safe or risky
  – Target is categorical
  – Learning by examples
Types of Learning
• Unsupervised Learning

  – No target is given
  – Output may be continuous
  – Eg. predict the economic growth of India in 2004
  – Clustering
  – Learning by observation
Supervised Learning – Classification

Training Data  ->  Classification Algorithm  ->  Classification rules

Name    Age     Income  Credit_rating
Saran   <=30    low     fair
Bill    <=30    low     excellent
Susan   >40     med     fair
Geetha  31..40  high    excellent
Clara   >40     med     fair
Babu    31..40  high    excellent

Example rule learned:
  If age = "31..40" and income = "high"
  then credit_rating = "excellent"
Supervised Learning – Classification

The learned classification rules are validated on test data and then applied to new data.

New Data: ( John Henri, 31..40, high )
Credit rating ?  ->  excellent
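
As a quick illustration, the learned rule can be applied in code. A minimal Python sketch follows; the function name and the "unknown" fallback are illustrative assumptions, not part of these notes:

def credit_rating(age: str, income: str) -> str:
    # Hypothetical encoding of the single rule learned above.
    if age == "31..40" and income == "high":
        return "excellent"
    return "unknown"  # no rule fired; a full rule set would cover every case

# New data: ( John Henri, 31..40, high )
print(credit_rating("31..40", "high"))  # -> excellent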
Learning by Decision Tree
Induction
• Tree

  – A graphical representation
  – Nodes (drawn as rounded rectangles)
  – Branches
  – Leaves (drawn as ovals)
What is a Decision Tree ?
• A flow-chart-like tree structure

• Internal node – a test on an attribute

• Branch – an outcome of the test

• Leaf – a class label (categorical)


Example – "buys_computer"

                        Age ?
         <=30          31..40           >40
           |              |               |
       Student ?         yes       Credit_rating ?
        no    yes                excellent     fair
        |      |                     |           |
        no    yes                    no         yes
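
A minimal Python sketch of this tree, assuming a nested-dict representation (the layout and the function name are illustrative choices, not a prescribed format):

TREE = {
    "attribute": "age",
    "branches": {
        "<=30":   {"attribute": "student",
                   "branches": {"no": "no", "yes": "yes"}},
        "31..40": "yes",
        ">40":    {"attribute": "credit_rating",
                   "branches": {"excellent": "no", "fair": "yes"}},
    },
}

def classify(tree, sample):
    # Walk down the tree until a leaf (a plain class label) is reached.
    while isinstance(tree, dict):
        tree = tree["branches"][sample[tree["attribute"]]]
    return tree

print(classify(TREE, {"age": "<=30", "student": "yes"}))  # -> yes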
How to choose Root Node ?
• Choose the attribute with the highest Information Gain

• Information Gain – measures the reduction in the expected information needed to classify a given sample
How to choose Root Node ?
• Let S be a set of s training samples

• Ci , i = 1..m – the distinct classes

• si – the number of samples of S in class Ci

• I(s1, s2, …, sm) = - Σ_{i=1..m} pi log2(pi)
How to choose Root Node ?
• pi – the probability that an arbitrary sample belongs to class Ci

• pi = si / s
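
A minimal Python sketch of this formula, assuming the class counts s1..sm are passed as a list (the `if si > 0` guard treats 0 · log2 0 as 0):

import math

def expected_information(counts):
    # I(s1, ..., sm) = -sum(pi * log2(pi)), where pi = si / s
    s = sum(counts)
    return -sum((si / s) * math.log2(si / s) for si in counts if si > 0)

print(expected_information([1, 1]))  # -> 1.0 (two equally likely classes)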
Entropy & Gain
• E(A), where A is an attribute with v distinct values that partition S into subsets S1, …, Sv

• sij – the number of samples of class Ci in subset Sj

• Expected information (entropy) based on the partitioning by A:

  E(A) = Σ_{j=1..v} [ (s1j + … + smj) / s ] · I(s1j, …, smj)
Entropy & Gain

• Gain(A) = I(s1, s2, …, sm) - E(A)

• The attribute with the largest gain is selected as the root (test) node
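
E(A) and Gain(A) fit together in a short Python sketch; the function names are illustrative, and the demo uses the Age partition counts worked out on the next slides:

import math

def info(counts):
    # I(s1, ..., sm) over one list of class counts
    s = sum(counts)
    return -sum(c / s * math.log2(c / s) for c in counts if c > 0)

def entropy(partitions):
    # E(A): one class-count list [s1j, ..., smj] per value of attribute A
    total = sum(sum(p) for p in partitions)
    return sum(sum(p) / total * info(p) for p in partitions)

def gain(class_counts, partitions):
    # Gain(A) = I(s1, ..., sm) - E(A)
    return info(class_counts) - entropy(partitions)

# Age splits the 14 samples into (<=30): 2 yes / 3 no, (31..40): 4 / 0, (>40): 3 / 2
print(round(gain([9, 5], [[2, 3], [4, 0], [3, 2]]), 3))
# -> 0.247 (the slides' 0.246 comes from rounding intermediate values)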
Example
RID  Age     Income  Student  Credit_rating  Class: buys_computer
1    <=30    high    no       fair           no
2    <=30    high    no       excel          no
3    31..40  high    no       fair           yes
4    >40     medium  no       fair           yes
5    >40     low     yes      fair           yes
6    >40     low     yes      excel          no
7    31..40  low     yes      excel          yes
Example
RID  Age     Income  Student  Credit_rating  Class: buys_computer
8    <=30    medium  no       fair           no
9    <=30    low     yes      fair           yes
10   >40     medium  yes      fair           yes
11   <=30    medium  yes      excel          yes
12   31..40  medium  no       excel          yes
13   31..40  high    yes      fair           yes
14   >40     medium  no       excel          no
Solution
• Class – buys_computer
• Categories
  – yes – 9 records – s1
  – no – 5 records – s2
  – Total records – 14

I(s1, s2) = - (9/14) log2(9/14) - (5/14) log2(5/14)
          = 0.940
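
This value is easy to verify in Python:

import math

I = -(9/14) * math.log2(9/14) - (5/14) * math.log2(5/14)
print(round(I, 3))  # -> 0.94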
Information Gain Calculation
• Attribute – Age

1. <=30    yes (2) – s11
           no  (3) – s21     5 records

2. 31..40  yes (4) – s12
           no  (0) – s22     4 records

3. >40     yes (3) – s13
           no  (2) – s23     5 records
Information Gain Calculation

• I(s11, s21) = - (2/5) log2(2/5) - (3/5) log2(3/5)
              = 0.971

• I(s12, s22) = 0

• I(s13, s23) = 0.971
Entropy & Gain Calculation
E(age) = (5/14) I(s11, s21) + (4/14) I(s12, s22) + (5/14) I(s13, s23)
       = 0.694

Gain(age) = I(s1, s2) - E(age) = 0.246

Gain(income)        = 0.029
Gain(student)       = 0.151
Gain(credit_rating) = 0.048
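
All four gains can be reproduced directly from the 14 training records. A sketch follows; the variable names and the tuple encoding are my choices, the data is from the example tables above:

import math
from collections import Counter, defaultdict

# (age, income, student, credit_rating, class) for RIDs 1..14
DATA = [
    ("<=30", "high", "no", "fair", "no"),      ("<=30", "high", "no", "excel", "no"),
    ("31..40", "high", "no", "fair", "yes"),   (">40", "medium", "no", "fair", "yes"),
    (">40", "low", "yes", "fair", "yes"),      (">40", "low", "yes", "excel", "no"),
    ("31..40", "low", "yes", "excel", "yes"),  ("<=30", "medium", "no", "fair", "no"),
    ("<=30", "low", "yes", "fair", "yes"),     (">40", "medium", "yes", "fair", "yes"),
    ("<=30", "medium", "yes", "excel", "yes"), ("31..40", "medium", "no", "excel", "yes"),
    ("31..40", "high", "yes", "fair", "yes"),  (">40", "medium", "no", "excel", "no"),
]

def info(counts):
    s = sum(counts)
    return -sum(c / s * math.log2(c / s) for c in counts if c > 0)

def gain(col):
    # Class counts per distinct value of the attribute in column `col`.
    groups = defaultdict(Counter)
    for row in DATA:
        groups[row[col]][row[-1]] += 1
    e = sum(sum(g.values()) / len(DATA) * info(list(g.values()))
            for g in groups.values())
    return info(list(Counter(r[-1] for r in DATA).values())) - e

for col, name in enumerate(["age", "income", "student", "credit_rating"]):
    print(name, round(gain(col), 3))
# age 0.247, income 0.029, student 0.152, credit_rating 0.048
# (0.246 / 0.151 on the slides come from rounding intermediate values)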
Test Attribute

• Age has the highest information gain

• So it is selected as the test attribute


                        Age ?
         <=30          31..40           >40

age = "<=30" partition (5 records):

Income  Student  Credit_rating  Class
high    no       fair           no
high    no       excel          no
medium  no       fair           no
low     yes      fair           yes
medium  yes      excel          yes

age = "31..40" partition (4 records):

Income  Student  Credit_rating  Class
high    no       fair           yes
low     yes      excel          yes
medium  no       excel          yes
high    yes      fair           yes

age = ">40" partition (5 records):

Income  Student  Credit_rating  Class
medium  no       fair           yes
low     yes      fair           yes
low     yes      excel          no
medium  yes      fair           yes
medium  no       excel          no
Generating Classification rules
from a decision tree

• If age = "<=30" AND student = "no" THEN buys_computer = "no"

• If age = "<=30" AND student = "yes" THEN buys_computer = "yes"
Generating Classification rules
from a decision tree

• If age = "31..40" THEN buys_computer = "yes"

• If age = ">40" AND credit_rating = "excel" THEN buys_computer = "no"

• If age = ">40" AND credit_rating = "fair" THEN buys_computer = "yes"
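
The five rules can be folded into one function. A minimal sketch, with an illustrative name and argument defaults that are not part of these notes:

def buys_computer(age, student=None, credit_rating=None):
    if age == "<=30":
        return "yes" if student == "yes" else "no"
    if age == "31..40":
        return "yes"
    if age == ">40":
        return "no" if credit_rating == "excel" else "yes"
    raise ValueError("unknown age range: " + str(age))

print(buys_computer("<=30", student="yes"))         # -> yes
print(buys_computer(">40", credit_rating="excel"))  # -> no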
Problem
• X = ( age = "<=30",
        income = "medium",
        student = "yes",
        credit_rating = "fair" )

Cannot predict ?
Other Learning Schemes

1. Naïve Bayesian Learning

2. Neural Network Learning
