06 - Decision Trees

This document provides an overview of decision trees, including classification trees and regression trees. It discusses key algorithms for building decision trees such as ID3, C4.5, CART, and CHAID. The advantages of decision trees include their interpretability, ability to handle different data types, insensitivity to scale factors, and ability to automatically determine relevant attributes. Disadvantages include sensitivity to small data variations and worse performance with many classes. Decision trees are constructed recursively in a top-down manner by selecting attributes to partition the data at each node until reaching pure leaf nodes.

Master in Information Management

2022/23

06
Decision Trees
AGENDA

Decision Trees

Classification Trees

Regression Trees

Overfitting in Decision Trees


1 Decision Trees

What are Decision Trees?


Non-parametric supervised learning algorithm used for classification and regression
1 Decision Trees

- Decision trees can be thought of as classification and estimation tools


- One of their major advantages is that they represent rules, which are fairly
simple to interpret

- In some problems we are only interested in achieving the best possible precision. In others we are
more interested in understanding the results and the way the model produces its estimates
- Sometimes the reasons that underlie certain decisions are of paramount importance.

Interpretability
1 Decision Trees

[Scatter plot of the example data: Income vs. Age]
1 Decision Trees

Age≥3

No Yes

(4,0) Income ≥ 3

No Yes

Age ≥ 6 Age ≥ 5

No Yes No Yes

(4,1) (0,6) (0,4) (4,1)


1 Decision Trees

What if I have a new observation where age is equal to 5 and income equal to 1?
Age≥3

No Yes

(4,0) Income ≥ 3

No Yes

Age ≥ 6 Age ≥ 5

No Yes No Yes

(4,1) (0,6) (0,4) (4,1)


1 Decision Trees
The tree below is annotated with its anatomy: root node, internal nodes, branches and leaf nodes.

The objective is to:
- Discriminate between classes
- Obtain leaves that are as pure as possible; hopefully each leaf only represents individuals of a particular class

Age≥3

No Yes

(4,0) Income ≥ 3

No Yes

Age ≥ 6 Age ≥ 5

No Yes No Yes

(4,1) (0,6) (0,4) (4,1)
1.1 Decision Trees Classification Tree
Classification Trees vs Regression Trees

Age≥3

No Yes

(4,0) Income ≥ 3

No Yes

Age ≥ 6 Age ≥ 5

No Yes No Yes

(4,1) (0,6) (0,4) (4,1)


1.1 Decision Trees Regression Tree
Classification Trees vs Regression Trees

x ≥ 3.2

No Yes

2.0 x ≥ 5.1

No Yes

3.5 x ≥ 6.1

No Yes

4.5 5.0

[Accompanying scatter plot: Income vs. Age]
1.2 Decision Trees
Rule extraction from trees
- Each edge adds a conjunction (∧)
- Each new leaf adds a disjunction (∨)

Age≥3

No Yes

(4,0) Income ≥ 3

No Yes

Age ≥ 6 Age ≥ 5

No Yes No Yes

(4,1) (0,6) (0,4) (4,1)

One rule per leaf (left to right):
⇔ (age < 3)
⇔ (age ≥ 3) ∧ (income < 3) ∧ (age < 6)
⇔ (age ≥ 3) ∧ (income < 3) ∧ (age ≥ 6)
⇔ (age ≥ 3) ∧ (income ≥ 3) ∧ (age < 5)
⇔ (age ≥ 3) ∧ (income ≥ 3) ∧ (age ≥ 5)

⇔ (age < 3) ∨ ((age ≥ 3) ∧ (income < 3) ∧ (age < 6)) ∨ ((age ≥ 3) ∧ (income ≥ 3) ∧ (age ≥ 5))

⇔ ((age ≥ 3) ∧ (income ≥ 3) ∧ (age < 5)) ∨ ((age ≥ 3) ∧ (income ≥ 3) ∧ (age ≥ 5))
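In scikit-learn, a fitted tree can be dumped in exactly this rule-like form. A minimal sketch (the two-feature dataset below is hypothetical, not the slide's data):

```python
# Minimal sketch: fit a small classification tree on a made-up age/income
# dataset and print its decision rules with scikit-learn's export_text.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

X = np.array([[2, 1], [4, 2], [7, 1], [4, 4], [6, 4], [1, 3]])  # [age, income], made up
y = np.array([0, 0, 1, 1, 0, 0])                                # class labels, made up

clf = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)

# Each printed root-to-leaf path is one conjunction of tests; the set of
# leaves predicting the same class forms the disjunction discussed above.
print(export_text(clf, feature_names=["age", "income"]))
```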


1.3 Decision Trees
Advantages of using Decision Trees

Interpretation
- Easy to understand the underlying reasons for the decision

No problems in dealing with different types of data


- Interval, ordinal, nominal, etc.
- Not necessary to define the relative importance of the variables

Insensitive to scale factors


- Different types of measurements can be used without the need for normalization

Automatic definition of the attributes that are more relevant in each case
- The most relevant attributes appear in the top part of the tree

Can be adapted to regression


- Linear local models in the leaves

Decision trees are considered a nonparametric method


- No assumptions about the space distribution and the classifier structure
1.3 Decision Trees
Disadvantages of using Decision Trees

- Most of the algorithms (ID3 and C4.5) require a discrete target


- Small variations in the data can result in very different trees
- Sub-trees can be replicated several times
- Worse results when dealing with many classes
- Decision boundaries are linear and perpendicular to the axes
1.4 Decision Trees
Decision Trees induction (or how to build trees)

Training Set
1.4 Decision Trees
Decision Trees induction (or how to build trees)

“PROBLEMS”: Where should we “cut”?

Training Set Age≥3

How many edges per node?

When to stop?
Which variable to
query?
1.4 Decision Trees
Decision Trees induction (or how to build trees)

BASIC ALGORITHM (a greedy algorithm)


- Tree is constructed in a top-down recursive divide-and-conquer manner
- At start, all the training observations are at the root
- If attributes are continuous-valued, they are discretized in advance (see the split-point selection in the C4.5 section below)
- Observations are partitioned recursively based on selected attributes

Conditions for stopping partitioning


- All observations for a given node belong to the same class
- There are no remaining attributes for further partitioning – majority voting is employed to classify
the leaf
- There are no samples left
1.5 Decision Trees
The algorithms

DECISION TREES

- DDT – Divisive decision tree (Hunt 62)
- ID3, C4.5, C5 – Iterative Dichotomizer 3 (Quinlan 86, 93)
- CART – Classification and Regression Trees (Breiman 84)
- CHAID – Chi-Squared Automatic Interaction Detection (Hartigan 75)
- …
2 Classification Trees

What are
Classification Trees?
Tree models where the target variable can take
a discrete set of values
2.1 Classification Trees DDT
DDT

Major characteristics
- It is a greedy search
- There is no "backtracking" (once a partition is done there is no re-evaluation)
- It can become stuck in a local minimum
- It uses discriminative power as the selection measure

The algorithm (continue on the next slide)


- Start with a dataset of pre-classified individuals (examples or instances)
- Each node specifies an attribute (independent variables) used as a test
- N – is node N
- ASET – attribute set
- ISET – instance set (individuals or examples)
2.1 Classification Trees DDT
DDT

DDT(N, ASET, ISET)

If ISET is empty then
    the terminal node N is an unknown class
elseif all the examples of ISET are of the same class or ASET is empty then
    the terminal node N takes the name of the (majority) class
else
    For each attribute A of the attribute set ASET
        Evaluate A according to its capability to discriminate a class
    Select the attribute B which has the best discriminative value
    For each value V of the best attribute B
        Create a new node C from node N
        Place the attribute-value pair (B, V) in C
        Let JSET be the set of examples of ISET with value V in B
        Let KSET be the set of attributes of ASET with B removed
        DDT(C, KSET, JSET)
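The same procedure as a minimal Python sketch (names are mine; `score` stands for the attribute-selection measure, which for DDT is the discriminative power defined in the next slides):

```python
# Minimal Python transcription of the DDT pseudocode above.
from collections import Counter

def ddt(aset, iset, score):
    """aset: list of attribute names; iset: list of (attributes_dict, class_label)."""
    if not iset:
        return "unknown"                                   # empty ISET -> unknown class
    classes = [c for _, c in iset]
    if len(set(classes)) == 1 or not aset:
        return Counter(classes).most_common(1)[0][0]       # pure node, or no attributes left
    best = max(aset, key=lambda a: score(a, iset))         # attribute with best discrimination
    node = {"attribute": best, "children": {}}
    for value in {attrs[best] for attrs, _ in iset}:       # one branch per value of `best`
        jset = [(attrs, c) for attrs, c in iset if attrs[best] == value]
        kset = [a for a in aset if a != best]
        node["children"][value] = ddt(kset, jset, score)
    return node
```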

Let’s see an example…


2.1 Classification Trees DDT
DDT

Example - Cell analysis (Langley 96)

The class (dependent variable): Lethargia, Burpoma or Healthy

Independent variables:
- The quantity of nuclei (1 or 2)
- The number of tails (1 or 2)
- The color (gray or white)
- The membrane (thin or thick)
2.1 Classification Trees DDT
DDT

Example - Cell analysis (Langley 96)

# nuclei # tails Color Membrane Class


1 1 Light Thin Lethargia
2 1 Light Thin Lethargia
1 1 Light Thick Lethargia
1 1 Dark Thin Lethargia
1 1 Dark Thick Lethargia
2 2 Light Thin Burpoma
2 2 Dark Thin Burpoma
2 2 Dark Thick Burpoma
2 1 Dark Thin Healthy
2 1 Dark Thick Healthy
1 2 Light Thin Healthy
1 2 Light Thick Healthy
2.1 Classification Trees DDT
DDT

The discriminative metric:

Measures an attribute's ability to discriminate between classes:

$$\text{Discriminative power} = \frac{1}{n}\sum_{i=1}^{v} C_i$$

- where n is the total number of examples
- v is the number of values of the attribute
- C_i is the number of examples in partition i correctly classified by its most frequent class

This is a measure of "dominance" or "purity"


2.1 Classification Trees DDT
DDT

Example - Cell analysis (Langley 96)

Using the number of nuclei:

# nuclei    1    2
Lethargia   4    1
Burpoma     0    3
Healthy     2    2

Discriminative power: (4 + 3) / 12 = 0.58
2.1 Classification Trees DDT
DDT

Example - Cell analysis (Langley 96)

Using the number of tails:

# tails     1    2
Lethargia   5    0
Burpoma     0    3
Healthy     2    2

Discriminative power: (5 + 3) / 12 = 0.67
2.1 Classification Trees DDT
DDT

Example - Cell analysis (Langley 96)

Using color:

color       Light    Dark
Lethargia   3        2
Burpoma     1        2
Healthy     2        2

Discriminative power: (3 + 2) / 12 = 0.41
2.1 Classification Trees DDT
DDT

Example - Cell analysis (Langley 96)

Using membrane:

membrane    Thin    Thick
Lethargia   3       2
Burpoma     2       1
Healthy     2       2

Discriminative power: (3 + 2) / 12 = 0.41
2.1 Classification Trees DDT
DDT

Example - Cell analysis (Langley 96)

Summary of discriminative power:

Attribute   # nuclei   # tails   color   membrane
D.P.        0.58       0.67      0.41    0.41

Choice: # tails
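A small sketch that computes this measure for each attribute of the cell data. It reproduces the values above (0.583, 0.667, 0.417, 0.417, which the slides round to 0.58, 0.67, 0.41, 0.41):

```python
# Discriminative power of each attribute of the cell-analysis data.
from collections import Counter

rows = [  # (#nuclei, #tails, color, membrane, class)
    (1, 1, "light", "thin",  "Lethargia"), (2, 1, "light", "thin",  "Lethargia"),
    (1, 1, "light", "thick", "Lethargia"), (1, 1, "dark",  "thin",  "Lethargia"),
    (1, 1, "dark",  "thick", "Lethargia"), (2, 2, "light", "thin",  "Burpoma"),
    (2, 2, "dark",  "thin",  "Burpoma"),   (2, 2, "dark",  "thick", "Burpoma"),
    (2, 1, "dark",  "thin",  "Healthy"),   (2, 1, "dark",  "thick", "Healthy"),
    (1, 2, "light", "thin",  "Healthy"),   (1, 2, "light", "thick", "Healthy"),
]

def discriminative_power(attr_index, data):
    # Sum, over the attribute's values, the count of the majority class in that
    # partition, then divide by the total number of examples.
    correct = 0
    for value in {r[attr_index] for r in data}:
        partition = [r[-1] for r in data if r[attr_index] == value]
        correct += Counter(partition).most_common(1)[0][1]
    return correct / len(data)

for i, name in enumerate(["# nuclei", "# tails", "color", "membrane"]):
    print(f"{name}: {discriminative_power(i, rows):.3f}")
# # nuclei: 0.583, # tails: 0.667, color: 0.417, membrane: 0.417
```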
2.1 Classification Trees DDT
DDT

Example - Cell analysis (Langley 96)

Tails = one:

# Nuclei   Color   Membrane   Class
1          Light   Thin       Lethargia
2          Light   Thin       Lethargia
1          Light   Thick      Lethargia
1          Dark    Thin       Lethargia
1          Dark    Thick      Lethargia
2          Dark    Thin       Healthy
2          Dark    Thick      Healthy

Tails = two:

# Nuclei   Color   Membrane   Class
2          Light   Thin       Burpoma
2          Dark    Thin       Burpoma
2          Dark    Thick      Burpoma
1          Light   Thin       Healthy
1          Light   Thick      Healthy
2.1 Classification Trees DDT
DDT

Example - Cell analysis (Langley 96)

For the branch Tails = one, the remaining examples are:

# Nuclei   Color   Membrane   Class
1          Light   Thin       Lethargia
2          Light   Thin       Lethargia
1          Light   Thick      Lethargia
1          Dark    Thin       Lethargia
1          Dark    Thick      Lethargia
2          Dark    Thin       Healthy
2          Dark    Thick      Healthy

# Nuclei    1    2
Lethargia   4    1
Burpoma     0    0
Healthy     0    2        D.P. = 0.86

Color       Light    Dark
Lethargia   3        2
Burpoma     0        0
Healthy     0        2    D.P. = 0.71

Membrane    Thin    Thick
Lethargia   3       2
Burpoma     0       0
Healthy     1       1     D.P. = 0.71
2.1 Classification Trees DDT
DDT

Example - Cell analysis (Langley 96)

Tails
one two

Nuclei # Nuclei Color Membrane Classe


2 Light Thin Burpoma
one two 2 Dark Thin Burpoma
2 Dark Thick Burpoma
1 Light Thin Healthy
Color Membrane Classe Color Membrane Classe
1 Light Thick Healthy
Light Thin Lethargia Light Thin Lethargia
Light Thick Lethargia Dark Thin Healthy
Dark Thin Lethargia Dark Thick Healthy
Dark Thick Lethargia
2.1 Classification Trees DDT
DDT

Example - Cell analysis (Langley 96)

Tails
one two

Nuclei # Nuclei Color Membrane Classe


2 Light Thin Burpoma
one two 2 Dark Thin Burpoma
2 Dark Thick Burpoma
1 Light Thin Healthy
Lethargia (4) Color Membrane Classe
1 Light Thick Healthy
Light Thin Lethargia
Dark Thin Healthy
Dark Thick Healthy
2.1 Classification Trees DDT
DDT

Example - Cell analysis (Langley 96)

Tails
one two

Nuclei # Nuclei Color Membrane Classe


2 Light Thin Burpoma
one two 2 Dark Thin Burpoma
2 Dark Thick Burpoma
1 Light Thin Healthy
Lethargia (4)
Color 1 Light Thick Healthy

Light Dark
Lethargia (1) Healthy (2)
2.1 Classification Trees DDT
DDT

Example - Cell analysis (Langley 96)

The final tree:

Tails = one:
    Nuclei = one → Lethargia (4)
    Nuclei = two:
        Color = Light → Lethargia (1)
        Color = Dark → Healthy (2)
Tails = two:
    Nuclei = one → Healthy (2)
    Nuclei = two → Burpoma (3)
2.1 Classification Trees DDT
DDT

Example - Cell analysis (Langley 96)

Lethargia (4) Healthy (2) Burpoma (3)

Lethargia (1) Healthy(2)


2.1 Classification Trees DDT
DDT

Example - Cell analysis (Langley 96)


(tails = 1 ∧ nuclei = 1) ∨ (tails = 1 ∧ nuclei = 2 ∧ color = light) ⇒ Lethargia

(tails = 2 ∧ nuclei = 1) ∨ (tails = 1 ∧ nuclei = 2 ∧ color = dark) ⇒ Healthy

(tails = 2 ∧ nuclei = 2) ⇒ Burpoma
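The extracted rules can be written directly as code. A minimal sketch (names are mine) that reproduces the class of all 12 training examples:

```python
# The induced tree expressed as nested rules.
def classify(tails, nuclei, color):
    if tails == 1:
        if nuclei == 1:
            return "Lethargia"
        return "Lethargia" if color == "light" else "Healthy"
    else:  # tails == 2
        return "Healthy" if nuclei == 1 else "Burpoma"

print(classify(tails=1, nuclei=2, color="dark"))  # Healthy
```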


2.2 Classification Trees ID3 / C4.5
Information Gain – ID3 / C4.5

Major characteristics
- It uses entropy to measure the “disorder” in each independent variable
- From entropy, we can calculate the information gain, the selection measure
- ID3 handles only categorical attributes while C4.5 is able to deal also with numeric values

The nomenclature
Let D be a training set of class-labeled tuples
Let C_{i,D} be the set of tuples of class C_i in D
Let |D| and |C_{i,D}| be the number of tuples in D and in C_{i,D}, respectively
Let p_i be the probability that an arbitrary tuple in D belongs to class C_i, estimated by |C_{i,D}| / |D|
2.2 Classification Trees ID3 / C4.5
Information Gain – ID3 / C4.5

Expected information (entropy) needed to classify a tuple in D:

$$Entropy(D) = -\sum_{i=1}^{m} p_i \log_2(p_i)$$

If we choose attribute A to split the node with the tuples in D into v partitions, then the information needed to classify D is:

$$Entropy_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|} \times Entropy(D_j)$$

The information gained by branching on attribute A:

$$Gain(A) = Entropy(D) - Entropy_A(D)$$


2.2 Classification Trees ID3 / C4.5
Information Gain – ID3 / C4.5

AllElectronics example

Our dependent variable has two classes: buys_computer = "yes" or "no"
age income student credit_rating buys_computer
<=30 high no fair no
<=30 high no excellent no
[31…40] high no fair yes
>40 medium no fair yes
>40 low yes fair yes
>40 low yes excellent no
[31…40] low yes excellent yes
<=30 medium no fair no
<=30 low yes fair yes
>40 medium yes fair yes
<=30 medium yes excellent yes
[31…40] medium no excellent yes
[31…40] high yes fair yes
>40 medium no excellent no
2.2 Classification Trees ID3 / C4.5
Information Gain – ID3 / C4.5

$$Entropy(D) = -\sum_{i=1}^{m} p_i \log_2(p_i)$$

Buy Computer: 9 Yes, 5 No

$$Entropy(D) = E(9,5) = -\frac{9}{14}\log_2\left(\frac{9}{14}\right) - \frac{5}{14}\log_2\left(\frac{5}{14}\right) = 0.940$$
2.2 Classification Trees ID3 / C4.5
Information Gain – ID3 / C4.5

We are going to check the entropy associated with the variable age:

Age        Yes   No   Total   E
<=30       2     3    5       0.971
[31…40]    4     0    4       0
>40        3     2    5       0.971

$$Entropy_{Age<=30}(D) = E(2,3) = -\frac{2}{5}\log_2\left(\frac{2}{5}\right) - \frac{3}{5}\log_2\left(\frac{3}{5}\right) = 0.971$$

$$Entropy_{Age\,31..40}(D) = E(4,0) = -\frac{4}{4}\log_2\left(\frac{4}{4}\right) - \frac{0}{4}\log_2\left(\frac{0}{4}\right) = 0$$

$$Entropy_{Age>40}(D) = E(3,2) = -\frac{3}{5}\log_2\left(\frac{3}{5}\right) - \frac{2}{5}\log_2\left(\frac{2}{5}\right) = 0.971$$

$$Entropy_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|} \times Entropy(D_j)$$

$$Entropy_{age}(D) = \frac{5}{14}E(2,3) + \frac{4}{14}E(4,0) + \frac{5}{14}E(3,2) = 0.694$$

$\frac{5}{14}E(2,3)$ means "age <=30" has 5 out of 14 samples, with 2 yes and 3 no
2.2 Classification Trees ID3 / C4.5
Information Gain – ID3 / C4.5

𝐺𝑎𝑖𝑛(𝐴) = 𝐸𝑛𝑡𝑟𝑜𝑝𝑦(𝐷) − 𝐸𝑛𝑡𝑟𝑜𝑝𝑦𝐴 (𝐷)

Thus the information gain for age is:

𝐺𝑎𝑖𝑛(𝑎𝑔𝑒) = 𝐸𝑛𝑡𝑟𝑜𝑝𝑦(𝐷) − 𝐸𝑛𝑡𝑟𝑜𝑝𝑦𝑎𝑔𝑒 (𝐷) = 0.940 − 0.694 = 0.246

Similarly for the variables income, student and credit:


𝐺𝑎𝑖𝑛(𝑖𝑛𝑐𝑜𝑚𝑒) = 0.029
𝐺𝑎𝑖𝑛(𝑠𝑡𝑢𝑑𝑒𝑛𝑡) = 0.151
𝐺𝑎𝑖𝑛(𝑐𝑟𝑒𝑑𝑖𝑡) = 0.048

Since age has the highest information gain it is the selected attribute
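A minimal sketch that reproduces these gains for the AllElectronics data (tiny differences in the third decimal come from the slides rounding intermediate entropies):

```python
# Information gain of each attribute of the AllElectronics data.
from collections import Counter
from math import log2

data = [  # (age, income, student, credit_rating, buys_computer)
    ("<=30", "high", "no", "fair", "no"), ("<=30", "high", "no", "excellent", "no"),
    ("31..40", "high", "no", "fair", "yes"), (">40", "medium", "no", "fair", "yes"),
    (">40", "low", "yes", "fair", "yes"), (">40", "low", "yes", "excellent", "no"),
    ("31..40", "low", "yes", "excellent", "yes"), ("<=30", "medium", "no", "fair", "no"),
    ("<=30", "low", "yes", "fair", "yes"), (">40", "medium", "yes", "fair", "yes"),
    ("<=30", "medium", "yes", "excellent", "yes"), ("31..40", "medium", "no", "excellent", "yes"),
    ("31..40", "high", "yes", "fair", "yes"), (">40", "medium", "no", "excellent", "no"),
]

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gain(attr_index, rows):
    # Entropy(D) minus the size-weighted entropy of the partitions induced by the attribute.
    parts = [[r for r in rows if r[attr_index] == v] for v in {r[attr_index] for r in rows}]
    split_entropy = sum(len(p) / len(rows) * entropy([r[-1] for r in p]) for p in parts)
    return entropy([r[-1] for r in rows]) - split_entropy

for i, name in enumerate(["age", "income", "student", "credit"]):
    print(f"Gain({name}) = {gain(i, data):.3f}")
# Gain(age) = 0.247, Gain(income) = 0.029, Gain(student) = 0.152, Gain(credit) = 0.048
```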
2.2 Classification Trees ID3 / C4.5
Information Gain – ID3 / C4.5

Age

age income student credit_rating buys_computer age income student credit_rating buys_computer
<=30 high no fair no >40 medium no fair yes
<=30 high no excellent no >40 low yes fair yes
<=30 medium no fair no >40 low yes excellent no
<=30 low yes fair yes >40 medium yes fair yes
<=30 medium yes excellent yes >40 medium no excellent no

age income student credit_rating buys_computer


[31…40] high no fair yes
[31…40] low yes excellent yes
[31…40] medium no excellent yes
[31…40] high yes fair yes
2.2 Classification Trees ID3 / C4.5
Information Gain – ID3 / C4.5

But what if I want to use the original age (algorithm C4.5)?

Originally, Age is a continuous-valued attribute
(sorted values: 18, 20, 21, 22, 26, 27, 29, 31, 35, 41, 42, 45, 48, 49, …)

Must determine the best split point for Age
- Sort the values of Age
- Every midpoint between each pair of adjacent values is a possible split point
- Evaluate $Entropy_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|} \times Entropy(D_j)$ with two partitions
  (Partition 1 < split point < Partition 2)
- The point with the minimum entropy for Age is selected as the split point
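A minimal sketch of this split-point search (the labelled ages below are hypothetical, not the column on the slide):

```python
# C4.5-style search for a numeric split point: sort the values, try every
# midpoint between adjacent distinct values, keep the lowest weighted entropy.
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def best_numeric_split(values, labels):
    pairs = sorted(zip(values, labels))
    xs = sorted(set(values))
    best = (float("inf"), None)
    for lo, hi in zip(xs, xs[1:]):
        split = (lo + hi) / 2                      # midpoint between adjacent values
        left = [c for v, c in pairs if v < split]
        right = [c for v, c in pairs if v >= split]
        w = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        best = min(best, (w, split))
    return best                                    # (weighted entropy, split point)

ages = [18, 20, 21, 22, 26, 27, 29, 31, 35, 41]    # hypothetical
buys = ["no", "no", "no", "yes", "yes", "yes", "yes", "yes", "no", "no"]
print(best_numeric_split(ages, buys))
```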
2.3 Classification Trees CART
Gini Index - CART

- If a data set D contains examples from n classes, the Gini index is defined as:

$$gini(D) = 1 - \sum_{j=1}^{n} p_j^2$$

where p_j is the relative frequency of class j in D

- Similar to entropy, but without the log (FASTER!)


2.3 Classification Trees CART
Gini Index - CART

- If a data set D is split on Age into two subsets D1 and D2, the Gini index of the split is defined as:

$$gini_{Age}(D) = \frac{|D_1|}{|D|} gini(D_1) + \frac{|D_2|}{|D|} gini(D_2)$$

- The reduction in impurity is given by:

$$\Delta gini(A) = gini(D) - gini_A(D)$$

- The attribute that maximizes the reduction in impurity is selected as splitting attribute
2.3 Classification Trees CART
Gini Index - CART

$$gini(D) = 1 - \sum_{j=1}^{n} p_j^2$$

For the AllElectronics example (Buy Computer: 9 Yes, 5 No):

$$gini(D) = 1 - \left(\frac{9}{14}\right)^2 - \left(\frac{5}{14}\right)^2 = 0.459$$

For the variable Age:

Age        Yes   No   Total   Gini
<=30       2     3    5       0.48
[31…40]    4     0    4       0
>40        3     2    5       0.48

$$Gini(D_{age<=30}) = 1 - \left(\frac{2}{5}\right)^2 - \left(\frac{3}{5}\right)^2 = 0.48$$
2.3 Classification Trees CART
Gini Index - CART

$$gini_{Age}(D) = \frac{|D_1|}{|D|} gini(D_1) + \frac{|D_2|}{|D|} gini(D_2)$$

Age        Yes   No   Total   Gini
<=30       2     3    5       0.48
[31…40]    4     0    4       0
>40        3     2    5       0.48

$$Gini_{age}(D) = \frac{5}{14}Gini(2,3) + \frac{4}{14}Gini(4,0) + \frac{5}{14}Gini(3,2) = 0.343$$

$$\Delta gini(A) = gini(D) - gini_A(D)$$

$$\Delta gini(Age) = gini(D) - gini_{Age}(D) = 0.459 - 0.343 = 0.116$$
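A minimal sketch that reproduces these numbers (following the worked example, which weights the three age partitions by their sizes):

```python
# Gini index of the AllElectronics data and of the split on age.
def gini(counts):
    n = sum(counts)
    return 1 - sum((c / n) ** 2 for c in counts)

gini_D = gini([9, 5])                                   # 9 "yes", 5 "no"
parts = [[2, 3], [4, 0], [3, 2]]                        # age <=30, 31..40, >40
gini_age = sum(sum(p) / 14 * gini(p) for p in parts)    # weighted by partition size
print(round(gini_D, 3), round(gini_age, 3), round(gini_D - gini_age, 3))
# 0.459 0.343 0.116
```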


3 Regression Trees

What are Regression Trees?


Tree models where the target variable can take a continuous set of values
3 Regression Trees

In a regression problem…
Some problems are easily fitted with a linear regression…

[Scatter plot: Money Spent vs. Age, with a fitted straight line]
3 Regression Trees

Age   Money Spent
18    5
22    7
23    6
27    68
34    72
38    69
43    77
45    48
48    42
53    16
57    18
64    14

[Scatter plot: Money Spent (€) vs. Age, annotated:]
- Middle-aged people spend the most
- Some middle-aged people spend some money
- Younger people don't spend much money
- Older people don't spend much money
3 Regression Trees

A problem with one variable…

[Scatter plot: Money Spent (€) vs. Age]

Fitting a straight line to the data does not seem very useful!

As an example, if someone has the age of 64…
we would predict that the money spent would be around 50€…

The observed data however suggests it should be around 15€…

Very far from reality!
3 Regression Trees

- In some datasets we should use methods other than straight lines to make predictions

- One option is the Regression Tree

- A Regression tree is a type of Decision tree where each leaf represents a numeric value, rather than a
discrete category as in Classification trees.
3 Regression Trees

Regression tree with one variable…


Age < 25

Yes No

6€ spent Age < 44

Yes No

71.5€ spent Age < 50.5

Yes No

45€ spent 16€ spent

However, I could obtain the same answer just by looking at the plot!
3 Regression Trees

Regression tree with more variables…


When we have more than 2 predictors, such as Age, Number of Kids and Number of Months as customer,
predicting the money spent by drawing a plot is very difficult, if not impossible

Age #Kids #Months as customer … Money Spent


18 0 1 … 5
22 1 3 … 7
23 0 6 … 6
27 2 22 … 68
34 3 27 … 72
38 2 29 … 69
43 1 15 … 77
45 0 12 … 48
48 3 17 … 42
53 2 18 … 16
57 3 43 … 18
64 1 32 … 14
3 Regression Trees

Regression tree with more variables…


A regression tree easily supports more than 2 predictors
Age #Kids #Months as … Money Spent
customer
18 0 1 … 5

Age < 24
Yes No

#Kids < 2 …
Yes No

#months < 7 …

Yes No

6€ spent …
3 Regression Trees

Choose the best split

[Scatter plot: Money Spent (€) vs. Age]

Why do we start with age < 25?
What is the best splitting point?
3 Regression Trees

Choose the best split

[Scatter plot: Money Spent (€) vs. Age, with a first candidate split at Age = 20]

Age < 20
Yes: 5€ spent    No: 39.7€ spent
3.1 Regression Trees  MSE
Using MSE

Choose the best split using MSE:

$$MSE = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$$

Candidate split: Age < 20 (Yes: 5€ spent, No: 39.7€ spent)

For Age < 20:  $\bar{y} = 5$,  $MSE(Age < 20) = (5 - 5)^2 = 0$

For Age ≥ 20:  $\bar{y} = \frac{7 + 6 + … + 14}{11} = 39.7$

$MSE(Age ≥ 20) = \frac{1}{11} \times ((7 - 39.7)^2 + (6 - 39.7)^2 + … + (14 - 39.7)^2) = 733.3$

$TOTAL\ MSE = 0 + 733.3 = 733.3$


3.1 Regression Trees  MSE
Using MSE

$$MSE = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$$

Split: Age < 20 (Yes: 5€ spent, No: 39.7€ spent)

Age   Money Spent (y)   ŷ      MSE
18    5                 5      0.0
22    7                 39.7
23    6                 39.7
27    68                39.7
34    72                39.7
38    69                39.7
43    77                39.7   733.3
45    48                39.7
48    42                39.7
53    16                39.7
57    18                39.7
64    14                39.7
                        TOTAL MSE   733.3
3.1 Regression Trees  MSE
Using MSE

[Scatter plot: Money Spent (€) vs. Age, with a candidate split at Age = 22.5]

Age < 22.5
Yes: 6€ spent    No: 43€ spent
3.1 Regression Trees  MSE
Using MSE

$$MSE = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$$

Split: Age < 22.5 (Yes: 6€ spent, No: 43€ spent)

Age   Money Spent (y)   ŷ     MSE
18    5                 6
22    7                 6     1.0
23    6                 43
27    68                43
34    72                43
38    69                43
43    77                43
45    48                43    688.8
48    42                43
53    16                43
57    18                43
64    14                43
                        TOTAL MSE   689.8
3.1 Regression Trees  MSE
Using MSE

[Scatter plot: Money Spent (€) vs. Age, with a candidate split at Age = 25]

Age < 25
Yes: 6€ spent    No: 47.1€ spent
3.1 Regression Trees  MSE
Using MSE

$$MSE = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$$

Split: Age < 25 (Yes: 6€ spent, No: 47.1€ spent)

Age   Money Spent (y)   ŷ      MSE
18    5                 6
22    7                 6      0.7
23    6                 6
27    68                47.1
34    72                47.1
38    69                47.1
43    77                47.1
45    48                47.1   596.3
48    42                47.1
53    16                47.1
57    18                47.1
64    14                47.1
                        TOTAL MSE   597.0
3.1 Regression Trees MSE
Using MSE

Splitting criteria for age    TOTAL MSE

< 20.0    733.3
< 22.5    689.8
< 25.0    597.0
< 30.5    1330.8
< 36.0    1558.1
< 40.5    1526.6
< 44.0    1265.0
< 46.5    1056.8
< 50.5    828.0
< 55.0    816.2
< 60.5    782.1

This is my first splitting criterion! Age lower than 25 (the lowest total MSE)
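A minimal sketch of this threshold scan. Following the slides, the two leaf MSEs are simply added (scikit-learn weights them by the number of samples instead); it reproduces the TOTAL MSE column above:

```python
# Scan the candidate age thresholds (midpoints between adjacent ages).
ages  = [18, 22, 23, 27, 34, 38, 43, 45, 48, 53, 57, 64]
spent = [5, 7, 6, 68, 72, 69, 77, 48, 42, 16, 18, 14]

def mse(ys):
    m = sum(ys) / len(ys)                        # leaf prediction = mean of the leaf
    return sum((y - m) ** 2 for y in ys) / len(ys)

for lo, hi in zip(ages, ages[1:]):
    t = (lo + hi) / 2
    left  = [y for a, y in zip(ages, spent) if a < t]
    right = [y for a, y in zip(ages, spent) if a >= t]
    print(f"< {t:<5} total MSE = {mse(left) + mse(right):.1f}")
# e.g. < 20.0: 733.3, < 22.5: 689.8, < 25.0: 597.0 (the minimum), ...
```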
3.1 Regression Trees MSE
Using MSE

And my second splitting criteria? (MSE)


Age < 25
Yes No

Age < 25, Yes branch (Age, Money Spent): (18, 5), (22, 7), (23, 6)
Age < 25, No branch (Age, Money Spent): (27, 68), (34, 72), (38, 69), (43, 77), (45, 48), (48, 42), (53, 16), (57, 18), (64, 14)

If you continue this process, you will end up with one observation in each leaf
(since all values of money spent are different)…

OVERFITTING!!

The need to define stopping criteria!
3.1 Regression Trees MSE
Using MSE

And my second splitting criteria? (MSE)


If I define, for example, the following stopping criterion:
- Minimum samples to split = 7

Age < 25
Yes: only 3 samples, so I finish here (6€ spent)
No: 9 samples, so I need to check the MSE for each split

Splitting criteria    Total MSE
< 30.5                609.5
< 36.0                577.1
< 40.5                514.4
< 44.0                219.3
< 46.5                226.9
< 50.5                169.9
< 55.0                414.0
< 60.5                516.7
3.1 Regression Trees  MSE
Using MSE

My final tree (MSE)

Age < 25
Yes: 6€ spent
No:  Age < 50.5
     Yes: 62.7€ spent
     No:  16€ spent

[Scatter plot: Money Spent (€) vs. Age, partitioned at 25 and 50.5]
3.1 Regression Trees MSE
Using MSE

Using more than two predictors

Age   #Kids   #Months as customer   …   Money Spent
18    0       1                     …   5
22    1       3                     …   7
23    0       6                     …   6
27    2       22                    …   68
34    3       27                    …   72
38    2       29                    …   69
43    1       15                    …   77
45    0       12                    …   48
48    3       17                    …   42
53    2       18                    …   16
57    3       43                    …   18
64    1       32                    …   14

Just like before, we will try different thresholds for Age and calculate the MSE (or other
measure) at each step, and pick the threshold that gives us the minimum MSE.

The best threshold becomes a candidate for the root: Age < 25
3.1 Regression Trees MSE
Using MSE

Using more than two predictors


(The data table repeats as above.)

Splitting criteria for #Kids    TOTAL MSE
< 1                             1155.8
< 2                             1301.1
< 3                             1321.6
3.1 Regression Trees MSE
Using MSE

Using more than two predictors


Age   #Kids   #Months as customer   …   Money Spent
18    0       1                     …   5
22    1       1                     …   7
23    0       3                     …   6
27    2       2                     …   68
34    3       2                     …   72
38    2       3                     …   69
43    1       2                     …   77
45    0       1                     …   48
48    3       2                     …   42
53    2       1                     …   16
57    3       3                     …   18
64    1       2                     …   14

Splitting criteria for #Months    TOTAL MSE
< 2                               1056.7
< 3                               1501.3
3.1 Regression Trees MSE
Using MSE

Using more than two predictors


Age      TOTAL MSE
< 20.0   733.3
< 22.5   689.8
< 25.0   597.0
< 30.5   1330.8
< 36.0   1558.1
< 40.5   1526.6
< 44.0   1265.0
< 46.5   1056.8
< 50.5   828.0
< 55.0   816.2
< 60.5   782.1

#Kids    TOTAL MSE
< 1      1155.8
< 2      1301.1
< 3      1321.6

#Months  TOTAL MSE
< 2      1056.7
< 3      1501.3

Candidate root splits:
Age < 25    → Yes: 6€ spent,    No: 47.1€ spent
#Kids < 1   → Yes: 19.7€ spent, No: 42.6€ spent
#Months < 2 → Yes: 19€ spent,   No: 45.8€ spent


3 Regression Trees
Other measures

What about other measures?


- Similarly to classification trees, where Gini or entropy can be used to measure the
quality of a split, in regression trees we also have several alternatives…

- MSE

- MAE

- Friedman MSE

- Sum of residuals

- …

- In sklearn, the first three options are available (see the snippet below)
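For instance, in recent scikit-learn versions these map onto the `criterion` parameter of `DecisionTreeRegressor` (parameter values here are illustrative):

```python
# "squared_error" (MSE), "absolute_error" (MAE) and "friedman_mse" are the
# criteria available for regression trees in recent scikit-learn versions.
from sklearn.tree import DecisionTreeRegressor

reg = DecisionTreeRegressor(criterion="absolute_error", min_samples_split=7)
```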


3.2 Regression Trees  MAE
Using MAE

Candidate split: Age < 20 (Yes: 5€ spent, No: 42€ spent)

$$MAE = \frac{1}{n}\sum_{i=1}^{n}|y_i - \hat{y}_i|$$

For Age < 20:  Median(y) = 5,  $MAE(Age < 20) = |5 - 5| = 0$

For Age ≥ 20:  Median(y) = 42

$MAE(Age ≥ 20) = \frac{1}{11} \times (|7 - 42| + |6 - 42| + … + |14 - 42|) = 24.8$

$TOTAL\ MAE = 0 + 24.8 = 24.8$
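A minimal sketch of this evaluation (under MAE the leaf prediction is the median):

```python
# MAE of the Age < 20 split, as in the slide: the two leaf MAEs are added.
from statistics import median

spent = [5, 7, 6, 68, 72, 69, 77, 48, 42, 16, 18, 14]
left, right = spent[:1], spent[1:]                  # age 18 vs the rest (age >= 20)

def mae(ys):
    m = median(ys)
    return sum(abs(y - m) for y in ys) / len(ys)

print(round(mae(left) + mae(right), 1))             # 24.8
```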


3.2 Regression Trees  MAE
Using MAE

$$MAE = \frac{1}{n}\sum_{i=1}^{n}|y_i - \hat{y}_i|$$

Split: Age < 20 (Yes: 5€ spent, No: 42€ spent)

Age   Money Spent (y)   ŷ     MAE
18    5                 5     0.0
22    7                 42
23    6                 42
27    68                42
34    72                42
38    69                42
43    77                42    24.8
45    48                42
48    42                42
53    16                42
57    18                42
64    14                42
                        TOTAL MAE   24.8
3.2 Regression Trees MAE
Using MAE

Splitting criteria for age    TOTAL MAE

< 20.0    24.8
< 22.5    22.7
< 25.0    18.9
< 30.5    23.2
< 36.0    26.8
< 40.5    57.7
< 44.0    80.9
< 46.5    70.9
< 50.5    60.0
< 55.0    60.2
< 60.5    59.3

This is my first splitting criterion! Age lower than 25 (the lowest total MAE)
3.2 Regression Trees MAE
Using MAE

And my second splitting criteria? (MAE)


If I define, for example, the following stopping criterion:
- Minimum samples to split = 7

Age < 25
Yes: only 3 samples, so I finish here (6€ spent)
No: 9 samples, so I need to check the MAE for each split

Splitting criteria    Total MAE
< 30.5                22.0
< 36.0                22.9
< 40.5                21.2
< 44.0                15.0
< 46.5                14.1
< 50.5                11.3
< 55.0                18.4
< 60.5                20.3
3.2 Regression Trees MAE
Using MAE

My final tree (MAE)

Age < 25
Yes: 6€ spent
No:  Age < 50.5
     Yes: 68.5€ spent
     No:  16€ spent

[Scatter plot: Money Spent (€) vs. Age, partitioned at 25 and 50.5]
4 Overfitting in Decision Trees

How to avoid overfitting in DT?


Two main approaches: prepruning and postpruning
4 Overfitting in DT

How to avoid overfitting in decision trees?


- An induced tree may overfit the training data
- Too many branches, some may reflect anomalies due to noise or outliers
- Poor accuracy for unseen samples

Two approaches to avoid overfitting


- Prepruning:
- do not split a node if this would result in the goodness measure falling below a threshold
- Difficult to choose an appropriate threshold
- Postpruning:
- Remove branches from a "fully grown" tree to get a sequence of progressively pruned trees
- Use a set of data different from the training data to decide which is the "best pruned tree" (see the sketch below)
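A minimal scikit-learn sketch of both ideas (the dataset and parameter values are illustrative only): prepruning via stopping criteria, postpruning via cost-complexity pruning, with a validation set choosing the final tree:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

# Prepruning: stop splitting early with thresholds on depth / node size.
pre = DecisionTreeClassifier(max_depth=4, min_samples_split=20, random_state=0)
pre.fit(X_tr, y_tr)

# Postpruning: grow the full tree, then pick the ccp_alpha (cost-complexity
# pruning strength) whose pruned tree does best on the validation set.
alphas = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X_tr, y_tr).ccp_alphas
post = max(
    (DecisionTreeClassifier(ccp_alpha=a, random_state=0).fit(X_tr, y_tr) for a in alphas),
    key=lambda m: m.score(X_val, y_val),
)
print(pre.score(X_val, y_val), post.score(X_val, y_val))
```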
4 Overfitting in DT

How to avoid overfitting in decision trees?

Simple to complex trees

[Plot: training and validation error vs. tree complexity; "Prune here" marks the complexity where the validation error starts to increase]

- Training set – used to develop the tree

- Validation set – used to assess the generalization ability of the tree
References
- Mitchell, T. M. (1997). Machine Learning. McGraw-Hill.
- Langley, P. (1996). Elements of Machine Learning. Morgan Kaufmann Publishers.
- Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and Regression Trees. Chapman & Hall.
- http://www.ise.bgu.ac.il/faculty/liorr/hbchap9.pdf
- http://www.cs.princeton.edu/courses/archive/spr07/cos424/papers/mitchell-dectrees.pdf
Thank you!

Address: Campus de Campolide, 1070-312 Lisboa, Portugal

Tel: +351 213 828 610 | Fax: +351 213 828 611
