Uploaded by

riyazkhan7

Presentation on Decision Tree

Submitted to: Smita Agarwal
Submitted by: Group-4

Introduction
•A decision tree is a classification scheme that generates a tree and a set of rules.
•The tree and rules represent the model of the different classes, derived from a given data set.
•The set of records available for developing classification methods is generally divided into two disjoint subsets:
(a) A training set
(b) A test set
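
As a rough illustration of the two disjoint subsets, the sketch below shuffles a list of records and cuts it into a training set and a test set. The function name `split_records`, the 30% test fraction and the fixed seed are assumptions made for this example; the presentation does not prescribe them.

```python
import random


def split_records(records, test_fraction=0.3, seed=0):
    """Return (training_set, test_set) as two disjoint lists."""
    shuffled = list(records)               # copy so the input is left untouched
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle (seed is an assumption)
    n_test = int(round(len(shuffled) * test_fraction))
    test_set = shuffled[:n_test]           # held back to estimate accuracy
    training_set = shuffled[n_test:]       # used to build the tree
    return training_set, test_set


records = list(range(10))                  # stand-in for real data records
training_set, test_set = split_records(records)
print(len(training_set), len(test_set))
```

Every record ends up in exactly one of the two lists, so the test set can be used to judge a tree that was built from the training set only.
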
•The decision tree construction process is concerned with identifying the splitting attributes and the splitting criterion at every level of the tree.
•The main aim of the decision tree construction process is to generate simple, comprehensible rules with high accuracy.
•The classification accuracy of a tree can be improved by revising the tree through processes such as pruning and grafting.

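
To make the link between a tree and its set of rules concrete, here is a minimal sketch that walks a small tree and emits one IF ... THEN rule per leaf. The nested-dictionary layout, the function name `extract_rules` and the example tests on income and age are all hypothetical, chosen only for illustration.

```python
def extract_rules(node, conditions=()):
    """Return one 'IF ... THEN ...' rule for every leaf below `node`."""
    if "label" in node:                       # leaf: the path so far is the rule premise
        premise = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {premise} THEN class = {node['label']}"]
    test = node["test"]                       # internal node: a yes/no question
    rules = extract_rules(node["yes"], conditions + (test,))
    rules += extract_rules(node["no"], conditions + (f"NOT ({test})",))
    return rules


# Hypothetical tree with two splitting attributes.
tree = {
    "test": "income > 50000",
    "yes": {"label": "responder"},
    "no": {
        "test": "age <= 30",
        "yes": {"label": "responder"},
        "no": {"label": "non-responder"},
    },
}

rules = extract_rules(tree)
for rule in rules:
    print(rule)
```

Each root-to-leaf path becomes one rule, which is why shallow trees tend to give short, readable rule sets.
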
Advantages of decision trees

• Decision trees are able to generate understandable rules.
• Decision trees are able to handle both numerical and categorical attributes.
• Decision trees provide a clear indication of which fields are most important for prediction or classification.
• They perform well on large data sets in a short time.
• They are robust: they perform well even if their assumptions are somewhat violated by the true model from which the data were generated.

Products

1. Alice
2. CART
3. Knowledge Seeker
4. See5

Alice

• Product: Alice d'ISoft & Alice Server
• Vendor: ISoft
• Platforms: Metaframe, TSE, Windows, Unix

CART
• Product: CART
• Vendor: Salford Systems
• Functions: Classification
• Platforms: CMS, MVS, Unix, Windows

Knowledge Seeker
• Product: Knowledge Seeker
• Vendor: ANGOSS Software Corporation
• Functions: Classification
• Platforms: Windows, Unix

See5
• Product: See5 / C5.0 1.15
• Vendor: RuleQuest Research Pty Ltd
• Functions: Classification

• See5/C5.0 is available for Windows 2000/XP/Vista/7 and Linux.
• See5/C5.0 has been designed to analyze substantial databases.
• See5/C5.0 also takes advantage of processors with quad cores, up to four CPUs, or Intel Hyper-Threading to speed up the analysis.

ALICE

•Alice d'ISoft, data mining software based on decision trees, is a powerful and inviting tool for creating segmentation models.
•It lets the business user explore data online, interactively and directly.

•As simple to use as a table, ALICE d'ISoft is made for the business user and requires no prior statistical skills.
•It lets users carry out rapid analyses on their own and then improve their relevance.

•Alice d'ISoft is the only software that takes business
knowledge into account in its analysis.

•Thanks to its speed and ease of use, it provides segmentation models of the best quality.

Sectors of application

•Marketing: market studies, segmentation, classification, customer profiles, satisfaction studies.
•Direct Marketing: surveys and opinion polls, return criteria.
•Banking, finance, insurance: scoring, risk analysis, fraud detection.
•Industry: quality control, diagnostics, experience return.
•Health: clinical studies, biometrics, epidemiology, biomedical research.

Analysis: Decision Tree
•Automatic construction
•Interactive construction
•Pruning
•Dynamic field impact on node representation
•One level development / reduction
•One node manual development with freely chosen field
•One node automatic development
•Cut, fold, unfold, isolate branch
•Symbolic values dividing / regrouping
•Hide nodes
•Thresholds-driven node coloration
•Multiple-trees projects
•Segmentation variables
•Tree navigator

CART

•Salford Systems' flagship data mining software, CART, is a robust, easy-to-use decision tree tool that automatically sifts large, complex databases, searching for and isolating significant patterns and relationships.
•This discovered knowledge is then used to generate reliable, easy-to-grasp predictive models for applications such as finding the best prospects and customers, targeted marketing, detecting credit card fraud, and managing credit risk.
•Designed for both non-technical and technical business users, CART can quickly reveal important data relationships that could remain hidden using other analytical tools.
•The most recent release, CART 6.0 (2008), includes modeling automation technology that dramatically accelerates the process of generating accurate and robust models for deployment in core business functions.
•CART was the primary tool used to win the KDD Cup 2000 web-mining competition and is currently in use in major web applications.
•The CART creators continue to collaborate with Salford Systems to enhance CART with proprietary advances.
•With CART 6.0 ProEX, Salford has introduced patented extensions to CART specifically designed to enhance results for market research and web analytics.
•CART supports high-speed deployment, allowing Salford models to predict and score in real time on a massive scale.

CART's features provide Stability and Reliability
CART uses an intuitive, Windows-based interface, making it accessible to both technical and non-technical users.

Underlying the "easy" interface, however, is a mature theoretical foundation that distinguishes CART from other methodologies and other decision trees.

Salford Systems' CART is the only decision tree system based on the original CART code developed by world-renowned Stanford University and University of California at Berkeley statisticians; this code now includes enhancements that were co-developed by Salford Systems and CART's originators.

CART 6.0 ProEX Features
Tree Controls
•Force splitters into nodes
•Confine select splitters to specific regions of a
tree (Structured Tree™)

HotSpot Detector
•Search data for ultra-high performance segments.
•HotSpot Detector trees are specifically designed to yield extraordinarily high-lift or high-risk nodes. The process focuses on individual nodes and generally discards the remainder of the tree.

Train/Test Consistency Assessment
Node-by-node summaries of agreement between train and test data on both class assignment and rank ordering of the nodes.
Quickly identify ideally-performing robust trees.

Modeling Automation
Automatically generate entire collections of trees exploring different control parameters.
Nineteen automated batteries cover exploration of multiple splitting rules, five alternative missing value handling strategies, random selection of alternative predictor lists, progressively smaller (or larger) training sample sizes, and much more.

Predictor Refinement
Includes stepwise backwards predictor elimination
using any of three predictor ranking criteria (lowest
variable importance rank, lowest loss of area under the
ROC curve, highest variable importance rank).

Model Assessment via Monte Carlo Testing

Measure possible overfitting with automated Monte Carlo randomization tests.

Constructed Features
New tools for automatic construction of new features (as linear combinations of predictors). Identification of multiple lists of candidates allows precise control over which predictors may be combined into a single new feature.
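
A constructed feature of this kind can be pictured as a weighted sum of predictors that is then compared with a threshold. The sketch below is only an illustration: the predictor names, weights and threshold are hypothetical, and it does not show how CART itself searches for the weights.

```python
def linear_feature(record, weights):
    """Weighted sum of the chosen predictors (a constructed feature)."""
    return sum(w * record[name] for name, w in weights.items())


def goes_left(record, weights, threshold):
    """Yes/no question asked at a node: is the constructed feature <= threshold?"""
    return linear_feature(record, weights) <= threshold


# Hypothetical candidate list: only these two predictors may be combined.
weights = {"income": 0.5, "balance": 0.25}
threshold = 30.0

record_a = {"income": 40.0, "balance": 20.0}   # 0.5*40 + 0.25*20 = 25.0
record_b = {"income": 60.0, "balance": 40.0}   # 0.5*60 + 0.25*40 = 40.0

print(linear_feature(record_a, weights), goes_left(record_a, weights, threshold))
print(linear_feature(record_b, weights), goes_left(record_b, weights, threshold))
```

Restricting the keys of `weights` plays the role of a candidate list: predictors that are not listed can never enter the combined feature.
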

Unsupervised Learning Mode

Use Breiman's column scrambler to automatically
detect potential clusters with no need to scale data,
address missing values, or select variables for
clustering.
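
The slide does not spell out how the column scrambler works. A common reading of Breiman's idea, assumed here, is to build a second, synthetic copy of the data in which every column is shuffled independently, and then let a tree try to tell the original rows from the scrambled ones. The sketch below only builds that two-class data set; the function names are hypothetical.

```python
import random


def scramble_columns(rows, seed=0):
    """Shuffle each column independently, destroying relations between columns."""
    rng = random.Random(seed)
    columns = [list(col) for col in zip(*rows)]   # row-major -> column-major
    for col in columns:
        rng.shuffle(col)                          # each column keeps its own values
    return [tuple(values) for values in zip(*columns)]


def build_two_class_data(rows, seed=0):
    """Label real rows 'original' and shuffled rows 'scrambled'."""
    original = [(tuple(r), "original") for r in rows]
    scrambled = [(r, "scrambled") for r in scramble_columns(rows, seed)]
    return original + scrambled


rows = [(1, 10), (2, 20), (3, 30), (4, 40)]       # toy data with a strong pattern
data = build_two_class_data(rows)
print(len(data))
```

If a tree can separate the two labels well, the original data contain structure between columns, and the nodes dominated by original rows are candidate clusters.
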

How are CART's decision trees grown?
•CART uses strictly binary, or two-way, splits that divide each parent node into exactly two child nodes by posing questions with yes/no answers at each decision node.
•CART searches for questions that split nodes into relatively homogeneous child nodes, such as a group consisting largely of responders, or high credit risks, or people who bought sport-utility vehicles.
•As the tree evolves, the nodes become increasingly homogeneous, identifying important segments.
•Other methods, such as CHAID, favor multi-way splits that can paint visually appealing trees but that can bog models down with less accurate splits.

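
The search for a yes/no question that yields homogeneous child nodes can be sketched for a single numeric attribute as follows. Gini impurity is used as the measure of homogeneity; the function names and the toy ages and responses are assumptions for illustration, not CART's actual code.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 for a pure node, larger for mixed nodes."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_binary_split(values, labels):
    """Return (threshold, weighted_gini) for the best 'value <= threshold?' question."""
    pairs = list(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))                 # no split: impurity of the parent
    distinct = sorted(set(values))
    for lo, hi in zip(distinct, distinct[1:]):  # candidate thresholds between neighbours
        threshold = (lo + hi) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


ages = [22, 25, 31, 38, 45, 52]
responded = ["yes", "yes", "yes", "no", "no", "no"]
threshold, impurity = best_binary_split(ages, responded)
print(threshold, impurity)
```

Here the question "age <= 34.5?" separates responders from non-responders completely, so both child nodes are pure.
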
Why is CART unique among decision-tree tools?
CART is based on a decade of research, assuring stable performance and reliable results. CART's proven methodology is characterized by:

Reliable pruning strategy - CART's developers determined definitively that no stopping rule could be relied on to discover the optimal tree, so they introduced the notion of over-growing trees and then pruning back; this idea, fundamental to CART, ensures that important structure is not overlooked by stopping too soon. Other decision-tree techniques use problematic stopping rules.

Powerful binary-split search approach - CART's binary decision trees are more sparing with data and detect more structure before too little data are left for learning. Other decision-tree approaches use multi-way splits that fragment the data rapidly, making it difficult to detect rules that require broad ranges of data to discover.

Automatic self-validation procedures - In the search for patterns in databases it is essential to avoid the trap of "overfitting," or finding patterns that apply only to the training data. CART's embedded test disciplines ensure that the patterns found will hold up when applied to new data. Further, the testing and selection of the optimal tree are an integral part of the CART algorithm. Testing in other decision-tree techniques is conducted after the fact and tree selection is left up to the user.
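
One well-known way to realise the over-grow-then-prune idea is cost-complexity (weakest-link) pruning from the original CART monograph by Breiman, Friedman, Olshen and Stone. The slides do not give the formula, so the sketch below is an illustration under that assumption: for each internal node t it computes g(t) = (R(t) - R(T_t)) / (leaves(T_t) - 1), where R(t) is the error if t were made a leaf and R(T_t) is the error of the subtree below it. The tree layout and the error counts are hypothetical.

```python
def leaves(node):
    """Return all leaf nodes below (and including) `node`."""
    if "children" not in node:
        return [node]
    return [leaf for child in node["children"] for leaf in leaves(child)]


def weakest_link(node, path="root"):
    """Return (g, path) of the internal node that is cheapest to collapse."""
    if "children" not in node:
        return None                               # leaves cannot be pruned
    subtree_leaves = leaves(node)
    subtree_errors = sum(leaf["errors"] for leaf in subtree_leaves)
    # Extra errors per leaf removed if this node is turned into a leaf.
    g = (node["errors"] - subtree_errors) / (len(subtree_leaves) - 1)
    best = (g, path)
    for i, child in enumerate(node["children"]):
        candidate = weakest_link(child, f"{path}.{i}")
        if candidate is not None and candidate[0] < best[0]:
            best = candidate
    return best


# Hypothetical over-grown tree; "errors" = training cases misclassified
# if the node were a leaf.
tree = {
    "errors": 40,
    "children": [
        {"errors": 10, "children": [{"errors": 4}, {"errors": 5}]},
        {"errors": 12},
    ],
}

print(weakest_link(tree))
```

Collapsing the weakest link repeatedly gives a nested sequence of smaller trees; the tree finally kept is the one that does best on test data or in cross-validation, which is where the self-validation step comes in.
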

In addition, CART accommodates many different types of real-world modeling problems by providing a unique combination of automated solutions:
1. surrogate splitters intelligently handle missing values
2. adjustable misclassification penalties help avoid the most costly errors
3. multiple-tree, committee-of-experts methods increase the precision of results
4. alternative splitting criteria make progress when other criteria fail.

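
To show what adjustable misclassification penalties change, the sketch below labels a node with the class that gives the lowest total penalty instead of simply the majority class. This covers only the labelling of a node; how CART uses penalties while growing the tree is not described in the slides. The class names, counts and penalty values are hypothetical.

```python
def min_cost_class(class_counts, cost):
    """Pick the label with the lowest total misclassification penalty.

    class_counts maps true class -> number of cases in the node.
    cost maps (true_class, predicted_class) -> penalty; missing pairs cost 0.
    """
    def total_cost(predicted):
        return sum(
            count * cost.get((true, predicted), 0.0)
            for true, count in class_counts.items()
            if true != predicted
        )
    return min(class_counts, key=total_cost)


counts = {"good": 80, "fraud": 20}                      # cases reaching the node

equal_costs = {("good", "fraud"): 1.0, ("fraud", "good"): 1.0}
skewed_costs = {("good", "fraud"): 1.0, ("fraud", "good"): 10.0}

print(min_cost_class(counts, equal_costs))    # majority class wins
print(min_cost_class(counts, skewed_costs))   # missing a fraud is 10x worse
```

With equal penalties the node is labelled "good"; once a missed fraud costs ten times more, the same node is labelled "fraud" even though fraud is the minority.
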
What tree-growing, or "splitting," criteria can CART provide?

CART includes seven single-variable splitting criteria - Gini, Symgini, twoing, ordered twoing and class probability for classification trees, and least squares and least absolute deviation for regression trees - and one multi-variable splitting criterion, the linear combinations method. The default Gini method typically performs best, but, given specific circumstances, other methods can generate more accurate models. CART's unique "twoing" procedure, for example, is tuned for classification problems with many classes, such as modeling which of 170 products would be chosen by a given consumer. Other splitting criteria are available for inherently difficult problems in which even the best models are expected to have a relatively low accuracy. Demographics, for example, are often weak predictors of attitude- and preference-based segments. Special CART tree-growing options can dramatically increase the predictive accuracy of such demographic-based models.

Additional unique tree-growing criteria are available for problems involving unequal misclassification costs, ordered target variables, and continuous dependent variables. To deal more effectively with select data patterns, CART also offers splits on linear combinations of continuous predictor variables. For this option, CART looks for weighted averages of predictor variables to use as splitters; these weighted averages can reveal important database structure and can uncover new critical measures.
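
As an illustration of the twoing criterion mentioned above, the sketch below scores a candidate split with the formula usually given for it in the CART literature: (pL * pR / 4) * (sum over classes of |p(j|left) - p(j|right)|)^2. The slides do not state the formula, so treat this as an assumption; the function name and the toy class labels are hypothetical.

```python
from collections import Counter


def twoing(left_labels, right_labels):
    """Twoing value of a binary split; both sides must be non-empty."""
    n_left, n_right = len(left_labels), len(right_labels)
    n = n_left + n_right
    p_left, p_right = n_left / n, n_right / n
    left_counts, right_counts = Counter(left_labels), Counter(right_labels)
    classes = set(left_counts) | set(right_counts)
    # How differently the classes are distributed in the two children.
    spread = sum(
        abs(left_counts[c] / n_left - right_counts[c] / n_right)
        for c in classes
    )
    return (p_left * p_right / 4) * spread ** 2


# Four product classes; a clean split puts {a, b} left and {c, d} right.
clean = twoing(["a", "a", "b", "b"], ["c", "c", "d", "d"])
# A useless split leaves the same class mix on both sides.
mixed = twoing(["a", "b", "c", "d"], ["a", "b", "c", "d"])
print(clean, mixed)
```

The clean split scores 0.25 and the useless one 0.0, which shows why twoing suits many-class problems: it rewards splits that separate groups of classes even when no child node is pure.
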

Knowledge Seeker
• Knowledge SEEKER is widely used in marketing, sales and risk functions.
• Its flexible, powerful, yet easy-to-use interface enables users to quickly and efficiently develop insights to advance their goals and objectives.

• Data Discovery
– KnowledgeSEEKER features advanced data import, sampling and preparation.
– Knowledge SEEKER can import data from virtually any data source - statistical files, file servers and databases - through native drivers for SAS and SPSS, as well as through text files and ODBC.

• Advanced Visualization
– Knowledge SEEKER has an extensive array of tools for data exploration and visualization.
– Business-friendly graphs and charts offer rapid profiling and can be exported into Microsoft Office applications.
– Users can quickly and conveniently perform advanced visualization.

• Decision Trees
– Knowledge SEEKER provides intuitive decision tree capabilities that help users segment their populations and understand the key drivers of outcomes in their business data.

See5
• See5/C5.0 is easy to use and does not presume any special knowledge of statistics.
• Users of Windows XP/Vista/7 can install and use the 64-bit version.
• On Linux, C5.0 continues to be available in both 32-bit and 64-bit versions.

See5 (Windows 2000/XP/Vista/7):
• Single-computer licence (32-bit)
• Single-computer licence (64-bit)
• Network licence (32-bit and 64-bit)

Enhanced multi-threading
• Additional sections of See5/C5.0 have been
multi-threaded. This can result in speed
improvements when the application has many
discrete attributes, especially with the discrete
value subset option.

Confidence of ruleset predictions
• This affects the output from the public code to
read and interpret See5/C5.0 ruleset
classifiers, and also impacts results with
boosted rulesets.

Small changes to pruning algorithms
• For most applications this should not affect
the final classifier; in some cases, the tree or
ruleset will be larger or smaller, but predictive
accuracy should be similar.

• Sample Applications
– Assessing Churn Risk
– Detecting Advertisements on the web
– Identifying Spam

References

http://alice-soft.com/html/prod_alice.htm
http://salford-systems.com/cart.php
http://rulequest.com/see5-info.html
http://angoss.com/analytics_software/KnowledgeSEEKER.php
Arun K. Pujari, Data Mining Techniques
