Unit II Part 1
• Prone to Over-fitting
[Figure: the general classification framework. A model is induced (learned) from the Training Set, whose records (Tid, Attrib1, Attrib2, Attrib3) carry known Class labels, and the learned model is then applied to the Test Set, whose records (e.g., Tid 11 and 15) have unknown class labels to be predicted.]
Tree Uses Nodes and Leaves
Example of a Decision Tree

Training Data:

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

Model (decision tree):

Refund?
  Yes -> NO
  No  -> MarSt?
           Married          -> NO
           Single, Divorced -> TaxInc?
                                 < 80K -> NO
                                 > 80K -> YES
Apply Model to Test Data

Test Data:

Refund  Marital Status  Taxable Income  Cheat
No      Married         80K             ?

Starting at the root: Refund = No, so follow the "No" branch to the MarSt node. MarSt = Married, so follow the "Married" branch, which leads to the leaf NO. Assign Cheat to "No".
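As an illustration, this lookup can be written as a short function. This is a minimal sketch, assuming records are plain dictionaries; the function name and key names are illustrative, not from the slides.

```python
def classify(record):
    """Traverse the example tree: Refund -> MarSt -> TaxInc."""
    if record["Refund"] == "Yes":
        return "No"                                  # Refund = Yes is a NO leaf
    if record["MarSt"] == "Married":
        return "No"                                  # Married branch is a NO leaf
    # Single or Divorced: compare taxable income (in thousands) with the 80K split.
    # The slide does not specify the boundary case; >= 80K is treated as YES here.
    return "No" if record["TaxIncK"] < 80 else "Yes"

# Test record: Refund = No, Marital Status = Married, Taxable Income = 80K
print(classify({"Refund": "No", "MarSt": "Married", "TaxIncK": 80}))   # -> No
```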
When to consider Decision Trees
• Instances describable by attribute-value pairs
• Target function is discrete valued
• Disjunctive hypothesis may be required
• Possibly noisy training data
• Missing attribute values
• Examples:
– Medical diagnosis
– Credit risk analysis
– Object classification for robot manipulator (Tan 1993)
Top-Down Induction of Decision Trees
ID3 (Iterative Dichotomiser 3)
In decision tree learning (DTL), ID3 (Iterative Dichotomiser 3) is an algorithm invented by Ross Quinlan to generate a decision tree from a dataset. ID3 is the precursor of the C4.5 algorithm and is typically used in the machine learning and natural language processing domains.
Pseudocode-ID3
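The procedure can be summarized as the following minimal Python sketch, assuming each example is a dictionary of nominal attribute values. The nested-dict tree representation and the helper names (entropy, information_gain, id3) are illustrative choices, not part of the algorithm's definition.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """H(S) = -sum_i p_i * log2(p_i) over the class proportions in S."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, feature, target):
    """Gain(S, F) = H(S) - sum_v |S_v|/|S| * H(S_v)."""
    labels = [r[target] for r in rows]
    gain = entropy(labels)
    for value in {r[feature] for r in rows}:
        subset = [r[target] for r in rows if r[feature] == value]
        gain -= len(subset) / len(rows) * entropy(subset)
    return gain

def id3(rows, features, target):
    """Grow a decision tree top-down; returns a class label or a nested dict."""
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:            # all examples share one class -> leaf
        return labels[0]
    if not features:                     # no attributes left -> majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(features, key=lambda f: information_gain(rows, f, target))
    tree = {best: {}}
    for value in {r[best] for r in rows}:
        branch_rows = [r for r in rows if r[best] == value]
        remaining = [f for f in features if f != best]
        tree[best][value] = id3(branch_rows, remaining, target)
    return tree
```

On the PlayTennis examples analysed below (features Outlook, Temperature, Humidity, Wind), the gain-based selection picks Outlook at the root, matching the hand computation that follows.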
For the PlayTennis training set S = [9+, 5-] (9 positive and 5 negative examples out of 14), the entropy is

H(S) = -(9/14)·log2(9/14) - (5/14)·log2(5/14) = 0.940
Information Gain

Information Gain is a statistical measure that indicates how well a given feature F separates (discriminates) the instances of an arbitrary collection of examples S according to the target classes. |S| denotes the cardinality of S, i.e., the number of elements in the set.

Gain(S, F) = H(S) - Σ_{v ∈ Values(F)} (|S_v| / |S|) · H(S_v)

where S_v is the subset of S for which feature F has value v.

Entropy is a statistical measure from information theory that characterizes the impurity of an arbitrary collection of examples S.
Information Gain for the Wind feature

S_Weak   (Wind = Weak)   = {D1, D3, D4, D5, D8, D9, D10, D13} = [6+, 2-]
S_Strong (Wind = Strong) = {D2, D6, D7, D11, D12, D14}        = [3+, 3-]

Gain(S, Wind) = H(S) - (|S_Weak| / |S|)·H(S_Weak) - (|S_Strong| / |S|)·H(S_Strong)
              = 0.940 - (8/14)·0.811 - (6/14)·1
              = 0.940 - [0.463 + 0.428] = 0.048
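A quick standalone numerical check of this result; the two lists encode the Wind and PlayTennis columns of the Day/Wind table shown in the C4.5 section below.

```python
from collections import Counter
from math import log2

def H(labels):
    """Entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

wind = ["Weak", "Strong", "Weak", "Weak", "Weak", "Strong", "Strong",
        "Weak", "Weak", "Weak", "Strong", "Strong", "Weak", "Strong"]   # D1..D14
play = ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
        "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"]                  # D1..D14

gain = H(play) - sum(
    wind.count(v) / len(wind) * H([p for w, p in zip(wind, play) if w == v])
    for v in set(wind)
)
print(round(gain, 3))   # -> 0.048
```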
Similarly compute for the other features. The information gains for the four features are:

Gain(S, Wind) = 0.048;  Gain(S, Outlook) = 0.246;  Gain(S, Humidity) = 0.151;  Gain(S, Temperature) = 0.029

Outlook has the highest information gain and is therefore the preferred feature to discriminate among the data items: it provides the best prediction of the target attribute, PlayTennis, over the training examples. Therefore, Outlook is selected as the decision attribute for the root node, and branches are created below the root for each of its possible values (i.e., Sunny, Overcast, and Rain).
[Partial tree: Outlook at the root, with branches Sunny -> ?, Overcast -> Yes, Rain -> ?]

The Overcast descendant has only positive examples and therefore becomes a leaf node with classification Yes. The other two nodes (Sunny and Rain) will be further expanded by selecting the attribute with the highest information gain relative to the new subsets of examples.
H(S_Sunny) = -(2/5)·log2(2/5) - (3/5)·log2(3/5) = 0.4·(1.32) + 0.6·(0.73) = 0.970

For the Sunny subset, Humidity splits the 5 examples as: High -> 3 examples (0+, 3-); Normal -> 2 examples (2+, 0-).

H(S_High)   = -(3/3)·log2(3/3) = 0
H(S_Normal) = -(2/2)·log2(2/2) = 0

Gain(S_Sunny, Humidity) = H(S_Sunny) - (|S_High| / |S_Sunny|)·H(S_High) - (|S_Normal| / |S_Sunny|)·H(S_Normal)
                        = 0.970 - (3/5)·0 - (2/5)·0 = 0.970
For the Sunny subset, Temperature splits the 5 examples as: Hot -> 2 examples (0+, 2-); Mild -> 2 examples (1+, 1-); Cool -> 1 example (1+, 0-).

H(S_Hot)  = -(2/2)·log2(2/2) = 0
H(S_Mild) = -(1/2)·log2(1/2) - (1/2)·log2(1/2) = 0.5·(1) + 0.5·(1) = 1
H(S_Cool) = -(1/1)·log2(1/1) = 0

Gain(S_Sunny, Temperature) = H(S_Sunny) - (|S_Hot| / |S_Sunny|)·H(S_Hot) - (|S_Mild| / |S_Sunny|)·H(S_Mild) - (|S_Cool| / |S_Sunny|)·H(S_Cool)
                           = 0.970 - (2/5)·0 - (2/5)·1 - (1/5)·0 = 0.570

Hence, Humidity has the highest information gain (0.970 vs. 0.570 for Temperature) and is the preferred feature to discriminate among the Sunny examples.
[Partial tree: Outlook at the root; Sunny -> Humidity (High -> No [0+, 3-], Normal -> Yes [2+, 0-]); Overcast -> Yes; Rain -> ?]
A node becomes a leaf (and is not expanded further) when either:
• Every attribute has already been included along this path through the tree, or
• The training examples associated with the node all have the same target attribute value (i.e., their entropy is zero).
Hence, the final decision tree for the concept PlayTennis is given below.
[Final tree: Outlook at the root; Sunny -> Humidity (High -> No [0+, 3-], Normal -> Yes [2+, 0-]); Overcast -> Yes; Rain -> Wind (Strong -> No [0+, 2-], Weak -> Yes [3+, 0-])]
C4.5 - Gain Ratio

Wind Attribute

GainRatio(S, A) = Gain(S, A) / SplitInf(S, A)

SplitInf(S, A) = - Σ_{i=1..c} (|S_i| / |S|)·log2(|S_i| / |S|)

Gain(S, Wind) = 0.048;  Gain(S, Outlook) = 0.246;  Gain(S, Humidity) = 0.151;  Gain(S, Temperature) = 0.029

S_Weak   = {D1, D3, D4, D5, D8, D9, D10, D13} = [6+, 2-]
S_Strong = {D2, D6, D7, D11, D12, D14}        = [3+, 3-]

There are 8 decisions for Weak and 6 decisions for Strong.

SplitInf(S, Wind) = -(8/14)·log2(8/14) - (6/14)·log2(6/14) = 0.985

GainRatio(S, Wind) = Gain(S, Wind) / SplitInf(S, Wind) = 0.048 / 0.985 = 0.049

Day  Wind    Play Tennis
D1   Weak    No
D2   Strong  No
D3   Weak    Yes
D4   Weak    Yes
D5   Weak    Yes
D6   Strong  No
D7   Strong  Yes
D8   Weak    No
D9   Weak    Yes
D10  Weak    Yes
D11  Strong  Yes
D12  Strong  Yes
D13  Weak    Yes
D14  Strong  No
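A standalone sketch of this split-information and gain-ratio computation; the helper name split_info is an illustrative choice.

```python
from math import log2

def split_info(sizes, total):
    """SplitInf(S, A) = -sum_i |S_i|/|S| * log2(|S_i|/|S|)."""
    return -sum(s / total * log2(s / total) for s in sizes)

# Wind partitions the 14 examples into Weak (8) and Strong (6)
si_wind = split_info([8, 6], 14)
print(round(si_wind, 3))            # -> 0.985
print(round(0.048 / si_wind, 3))    # GainRatio(S, Wind) -> 0.049
```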
C4.5-Training Examples
Day Outlook Temperature Humidity Wind Play Tennis
D1 Sunny 85 85 Weak No
D2 Sunny 80 90 Strong No
D3 Overcast 83 78 Weak Yes
D4 Rain 70 96 Weak Yes
D5 Rain 68 80 Weak Yes
D6 Rain 65 70 Strong No
D7 Overcast 64 65 Strong Yes
D8 Sunny 72 95 Weak No
D9 Sunny 69 70 Weak Yes
D10 Rain 75 80 Weak Yes
D11 Sunny 75 70 Strong Yes
D12 Overcast 72 90 Strong Yes
D13 Overcast 81 75 Weak Yes
D14 Rain 71 80 Strong No
Outlook Attribute

Gain(S, Outlook) = 0.246

S_Sunny    = {D1, D2, D8, D9, D11}   = [2+, 3-]
S_Overcast = {D3, D7, D12, D13}      = [4+, 0-]
S_Rain     = {D4, D5, D6, D10, D14}  = [3+, 2-]

SplitInf(S, Outlook) = -(5/14)·log2(5/14) - (4/14)·log2(4/14) - (5/14)·log2(5/14) = 1.577

GainRatio(S, Outlook) = Gain(S, Outlook) / SplitInf(S, Outlook) = 0.246 / 1.577 = 0.155

Day  Outlook   Play Tennis
D1   Sunny     No
D2   Sunny     No
D3   Overcast  Yes
D4   Rain      Yes
D5   Rain      Yes
D6   Rain      No
D7   Overcast  Yes
D8   Sunny     No
D9   Sunny     Yes
D10  Rain      Yes
D11  Sunny     Yes
D12  Overcast  Yes
D13  Overcast  Yes
D14  Rain      No
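The same check for Outlook's three-way partition (standalone sketch):

```python
from math import log2

# Outlook partitions the 14 examples into Sunny (5), Overcast (4) and Rain (5)
sizes, total = [5, 4, 5], 14
si_outlook = -sum(s / total * log2(s / total) for s in sizes)
print(round(si_outlook, 3))           # -> 1.577
print(round(0.246 / si_outlook, 3))   # GainRatio(S, Outlook) -> 0.156 (0.155 above, up to rounding)
```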
Humidity Attribute
Humidity is a continuous attribute. We need to convert continuous values to nominal ones. C4.5 proposes to perform a binary split based on a threshold value. The threshold should be the value that offers maximum gain for that attribute. Sort the humidity values from smallest to largest.

Day  Humidity  Play Tennis
D7 65 Yes
D6 70 No
D9 70 Yes
D11 70 Yes
D13 75 Yes
D3 78 Yes
D5 80 Yes
D10 80 Yes
D14 80 No
D01 85 No
D02 90 No
D12 90 Yes
D8 95 No
D4 96 Yes
Separate the dataset into two parts: instances less than or equal to the current value, and instances greater than the current value. Calculate the gain (or gain ratio) for every candidate split; the value that maximizes it is chosen as the threshold.
Step 1: Consider threshold 65
<= 65 : 1 instance (1+, 0-)
>  65 : 13 instances (8+, 5-)

H(S_<=65) = -(1/1)·log2(1/1) - 0 = 0
H(S_>65)  = -(8/13)·log2(8/13) - (5/13)·log2(5/13) = 0.961

Gain(S, Humidity_65) = H(S) - (|S_<=65| / |S|)·H(S_<=65) - (|S_>65| / |S|)·H(S_>65)
                     = 0.940 - (1/14)·0 - (13/14)·0.961 = 0.048

SplitInf(S, Humidity_65) = -(1/14)·log2(1/14) - (13/14)·log2(13/14) = 0.371

GainRatio(S, Humidity_65) = Gain(S, Humidity_65) / SplitInf(S, Humidity_65) = 0.048 / 0.371 = 0.126
Step 2: Consider threshold 70
<= 70 : 4 instances (3+, 1-)
>  70 : 10 instances (6+, 4-)

H(S_<=70) = -(3/4)·log2(3/4) - (1/4)·log2(1/4) = 0.811
H(S_>70)  = -(6/10)·log2(6/10) - (4/10)·log2(4/10) = 0.970

Gain(S, Humidity_70) = H(S) - (|S_<=70| / |S|)·H(S_<=70) - (|S_>70| / |S|)·H(S_>70)
                     = 0.940 - (4/14)·0.811 - (10/14)·0.970 = 0.014

SplitInf(S, Humidity_70) = -(4/14)·log2(4/14) - (10/14)·log2(10/14) = 0.863

GainRatio(S, Humidity_70) = Gain(S, Humidity_70) / SplitInf(S, Humidity_70) = 0.014 / 0.863 = 0.016
The above procedure is applied to all candidate thresholds.
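A standalone sketch of this threshold search using information gain; the gain-ratio variant would additionally divide by the split information of each binary partition. Function and variable names are illustrative.

```python
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def best_threshold(values, labels):
    """Evaluate a binary split (<= t vs > t) at every distinct value; return the best."""
    base = entropy(labels)
    best = None
    for t in sorted(set(values))[:-1]:        # splitting at the maximum value is useless
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        gain = (base
                - len(left) / len(labels) * entropy(left)
                - len(right) / len(labels) * entropy(right))
        if best is None or gain > best[1]:
            best = (t, gain)
    return best

humidity = [85, 90, 78, 96, 80, 70, 65, 95, 70, 80, 70, 90, 75, 80]   # D1..D14
play = ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
        "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"]
print(best_threshold(humidity, play))   # for these values the gain peaks at threshold 80
```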
ID3 uses information gain for splitting. C4.5 uses gain ratio for splitting. CART can handle both classification and regression tasks; it uses the Gini index to create decision points for splitting.

Gini = 1 - Σ_{i=1..M} P_i²

where i = 1 to M indexes the classes.
Outlook Attribute:

Outlook   Yes  No  Number of Instances
Sunny     2    3   5
Overcast  4    0   4
Rain      3    2   5

Gini(Sunny)    = 1 - (2/5)² - (3/5)² = 0.48
Gini(Overcast) = 1 - (4/4)² - (0/4)² = 0
Gini(Rain)     = 1 - (3/5)² - (2/5)² = 0.48

Gini(Outlook) = Σ_i P(Outlook_i)·Gini(Outlook_i) = (5/14)·0.48 + (4/14)·0 + (5/14)·0.48 = 0.342

Similarly, for the Wind attribute (Weak = [6+, 2-], Strong = [3+, 3-]):

Gini(Wind) = Σ_i P(Wind_i)·Gini(Wind_i) = (8/14)·0.375 + (6/14)·0.5 = 0.428
The consolidated table is:

Feature      Gini index
Outlook      0.342 (lowest)
Temperature  0.439
Humidity     0.367
Wind         0.428

The winner is the Outlook feature because its cost (Gini index) is the lowest.
[Tree: Outlook at the root, with branches Sunny, Overcast, and Rainy]
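A standalone sketch of the weighted Gini computation, using the class-count pairs from the Outlook and Wind tables above; helper names are illustrative.

```python
def gini(counts):
    """Gini = 1 - sum_i p_i^2 over the class proportions at a node."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

def weighted_gini(partitions):
    """Weight each branch's Gini index by the fraction of examples it receives."""
    total = sum(sum(p) for p in partitions)
    return sum(sum(p) / total * gini(p) for p in partitions)

# Outlook: Sunny [2+,3-], Overcast [4+,0-], Rain [3+,2-]
print(round(weighted_gini([(2, 3), (4, 0), (3, 2)]), 3))   # -> 0.343 (0.342 in the table)
# Wind: Weak [6+,2-], Strong [3+,3-]
print(round(weighted_gini([(6, 2), (3, 3)]), 3))           # -> 0.429 (0.428 in the table)
```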
Important Algorithm Types
Datasets may have missing values, and this can cause problems
for many machine learning algorithms. As such, it is good practice
to identify and replace missing values for each column in your
input data prior to modeling your prediction task. This is called
missing data imputation, or imputing for short.
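As a sketch of one common approach (mean imputation with scikit-learn, assuming it is installed; the toy matrix is made up for illustration):

```python
import numpy as np
from sklearn.impute import SimpleImputer

# toy feature matrix: the second row has a missing value (np.nan)
X = np.array([[85.0, 85.0],
              [80.0, np.nan],
              [83.0, 78.0]])

imputer = SimpleImputer(strategy="mean")   # replace NaNs with the column mean
print(imputer.fit_transform(X))
```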
EXAMPLE - Construct a DT using Information Gain

Channel     Variance  Image Type
Monochrome  Low       BW
RGB         Low       BW
RGB         High      Color

Solution:

H(S_High) = -(1/1)·log2(1/1) = 0   (the Variance = High subset contains a single Color example)

InformationGain(S, Channel) = 0.256;  InformationGain(S, Variance) = 0.9183

Variance has the higher information gain and is therefore selected as the root attribute.
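A quick standalone check of the Variance gain for this three-row example; binary entropy is computed from class counts, and the function name is illustrative.

```python
from math import log2

def H(pos, neg):
    """Binary entropy from class counts."""
    total = pos + neg
    return -sum((c / total) * log2(c / total) for c in (pos, neg) if c)

h_s = H(2, 1)                                              # S = {BW, BW, Color}
gain_variance = h_s - (2/3) * H(2, 0) - (1/3) * H(0, 1)    # Low -> [2 BW], High -> [1 Color]
print(round(h_s, 4), round(gain_variance, 4))              # -> 0.9183 0.9183
```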
Satellite Image Classification using Decision Tree
High Resolution Images