
Data Mining

Classification: Alternative Techniques

Lecture Notes for Chapter 4
Rule-Based

Introduction to Data Mining, 2nd Edition

by Tan, Steinbach, Karpatne, Kumar

Rule-Based Classifier

● Classify records by using a collection of "if…then…" rules

● Rule: (Condition) → y
  – where
    u Condition is a conjunction of attribute tests
    u y is the class label
  – LHS: rule antecedent or condition
  – RHS: rule consequent
  – Examples of classification rules:
    u (Blood Type=Warm) ∧ (Lay Eggs=Yes) → Birds
    u (Taxable Income < 50K) ∧ (Refund=Yes) → Evade=No


Rule-Based Classifier (Example)
Name Blood  Type Give  Birth Can  Fly Live  in  Water Class
human warm yes no no mammals
python cold no no no reptiles
salmon cold no no yes fishes
whale warm yes no yes mammals
frog cold no no sometimes amphibians
komodo cold no no no reptiles
bat warm yes yes no mammals
pigeon warm no yes no birds
cat warm yes no no mammals
leopard  shark cold yes no yes fishes
turtle cold no no sometimes reptiles
penguin warm no no sometimes birds
porcupine warm yes no no mammals
eel cold no no yes fishes
salamander cold no no sometimes amphibians
gila  monster cold no no no reptiles
platypus warm no no no mammals
owl warm no yes no birds
dolphin warm yes no yes mammals
eagle warm no yes no birds

R1: (Give Birth = no) ∧ (Can Fly = yes) → Birds
R2: (Give Birth = no) ∧ (Live in Water = yes) → Fishes
R3: (Give Birth = yes) ∧ (Blood Type = warm) → Mammals
R4: (Give Birth = no) ∧ (Can Fly = no) → Reptiles
R5: (Live in Water = sometimes) → Amphibians

Application of Rule-Based Classifier

● A rule r covers an instance x if the attributes of the instance satisfy the condition of the rule

R1: (Give Birth = no) ∧ (Can Fly = yes) → Birds
R2: (Give Birth = no) ∧ (Live in Water = yes) → Fishes
R3: (Give Birth = yes) ∧ (Blood Type = warm) → Mammals
R4: (Give Birth = no) ∧ (Can Fly = no) → Reptiles
R5: (Live in Water = sometimes) → Amphibians

Name Blood  Type Give  Birth Can  Fly Live  in  Water Class
hawk warm no yes no ?
grizzly  bear warm yes no no ?

The rule R1 covers the hawk => Birds
The rule R3 covers the grizzly bear => Mammals
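To make the coverage check concrete, here is a minimal Python sketch. The (condition, class) encoding and the helper name covers() are mine, not from the text; it encodes R1-R5 as conjunctions of equality tests and classifies the two test records above.

# Sketch: encoding R1-R5 as (condition, class) pairs and testing coverage.
# The dict-based representation is illustrative, not from the text.

rules = [
    ({"Give Birth": "no", "Can Fly": "yes"}, "Birds"),         # R1
    ({"Give Birth": "no", "Live in Water": "yes"}, "Fishes"),  # R2
    ({"Give Birth": "yes", "Blood Type": "warm"}, "Mammals"),  # R3
    ({"Give Birth": "no", "Can Fly": "no"}, "Reptiles"),       # R4
    ({"Live in Water": "sometimes"}, "Amphibians"),            # R5
]

def covers(condition, record):
    # A rule covers a record if the record satisfies every
    # attribute test in the rule's condition (a conjunction).
    return all(record.get(a) == v for a, v in condition.items())

hawk = {"Blood Type": "warm", "Give Birth": "no",
        "Can Fly": "yes", "Live in Water": "no"}
grizzly = {"Blood Type": "warm", "Give Birth": "yes",
           "Can Fly": "no", "Live in Water": "no"}

for name, rec in [("hawk", hawk), ("grizzly bear", grizzly)]:
    matched = [cls for cond, cls in rules if covers(cond, rec)]
    print(name, "->", matched)  # hawk -> ['Birds'], grizzly bear -> ['Mammals']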


Rule Coverage and Accuracy

● Coverage of a rule:
  – Fraction of records that satisfy the antecedent of the rule
● Accuracy of a rule:
  – Fraction of records that satisfy the antecedent that also satisfy the consequent of the rule

Tid Refund Marital  Status Taxable  Income Class
1 Yes Single 125K No
2 No Married 100K No
3 No Single 70K No
4 Yes Married 120K No
5 No Divorced 95K Yes
6 No Married 60K No
7 Yes Divorced 220K No
8 No Single 85K Yes
9 No Married 75K No
10 No Single 90K Yes

(Status=Single) → No
Coverage = 40%, Accuracy = 50%
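These numbers can be verified directly; a minimal sketch, assuming the tuple encoding below (the values are taken from the table above):

# Sketch: verifying coverage and accuracy of (Status=Single) -> No
# on the ten training records above. The encoding is illustrative.

records = [  # (Refund, Marital Status, Taxable Income, Class)
    ("Yes", "Single",   125, "No"),  ("No", "Married", 100, "No"),
    ("No",  "Single",    70, "No"),  ("Yes", "Married", 120, "No"),
    ("No",  "Divorced",  95, "Yes"), ("No", "Married",  60, "No"),
    ("Yes", "Divorced", 220, "No"),  ("No", "Single",   85, "Yes"),
    ("No",  "Married",   75, "No"),  ("No", "Single",   90, "Yes"),
]

covered = [r for r in records if r[1] == "Single"]  # antecedent holds
correct = [r for r in covered if r[3] == "No"]      # consequent also holds

print("coverage =", len(covered) / len(records))  # 4/10 = 0.4
print("accuracy =", len(correct) / len(covered))  # 2/4  = 0.5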

How Does a Rule-Based Classifier Work?

R1: (Give Birth = no) ∧ (Can Fly = yes) → Birds
R2: (Give Birth = no) ∧ (Live in Water = yes) → Fishes
R3: (Give Birth = yes) ∧ (Blood Type = warm) → Mammals
R4: (Give Birth = no) ∧ (Can Fly = no) → Reptiles
R5: (Live in Water = sometimes) → Amphibians

Name Blood  Type Give  Birth Can  Fly Live  in  Water Class
lemur warm yes no no ?
turtle cold no no sometimes ?
dogfish  shark cold yes no yes ?

A lemur triggers rule R3, so it is classified as a mammal
A turtle triggers both R4 and R5
A dogfish shark triggers none of the rules


Characteristics of Rule Sets: Strategy 1

● Mutually exclusive rules
  – Classifier contains mutually exclusive rules if the rules are independent of each other
  – Every record is covered by at most one rule

● Exhaustive rules
  – Classifier has exhaustive coverage if it accounts for every possible combination of attribute values
  – Each record is covered by at least one rule

Characteristics of Rule Sets: Strategy 2

● Rules are not mutually exclusive
  – A record may trigger more than one rule
  – Solution?
    u Ordered rule set
    u Unordered rule set – use voting schemes (see the sketch below)

● Rules are not exhaustive
  – A record may not trigger any rules
  – Solution?
    u Use a default class
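For the unordered case, one common voting scheme is an unweighted majority vote over all triggered rules, falling back to the default class when nothing fires. A minimal sketch (helper names are mine):

from collections import Counter

# Sketch: unordered rule set with majority voting and a default class.
# rules is a list of (condition, class) pairs, as in the earlier sketches.

def covers(condition, record):
    return all(record.get(a) == v for a, v in condition.items())

def classify_unordered(rules, record, default):
    votes = Counter(cls for cond, cls in rules if covers(cond, record))
    if not votes:                        # no rule triggered: default class
        return default
    # Unweighted majority vote; ties broken by count order. Votes could
    # also be weighted by rule quality (e.g. accuracy).
    return votes.most_common(1)[0][0]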


Ordered Rule Set

● Rules are rank ordered according to their priority
  – An ordered rule set is known as a decision list
● When a test record is presented to the classifier
  – It is assigned to the class label of the highest-ranked rule it has triggered
  – If none of the rules fired, it is assigned to the default class

R1: (Give Birth = no) ∧ (Can Fly = yes) → Birds
R2: (Give Birth = no) ∧ (Live in Water = yes) → Fishes
R3: (Give Birth = yes) ∧ (Blood Type = warm) → Mammals
R4: (Give Birth = no) ∧ (Can Fly = no) → Reptiles
R5: (Live in Water = sometimes) → Amphibians

Name Blood  Type Give  Birth Can  Fly Live  in  Water Class
turtle cold no no sometimes ?
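A decision list is just a first-match loop over the ranked rules; a minimal sketch (the encoding is mine). Under the ordering above, the turtle is assigned Reptiles, because R4 outranks R5:

# Sketch: an ordered rule set (decision list). Rules are tried in rank
# order; the first rule that covers the record assigns the class.

rules = [
    ({"Give Birth": "no", "Can Fly": "yes"}, "Birds"),         # R1
    ({"Give Birth": "no", "Live in Water": "yes"}, "Fishes"),  # R2
    ({"Give Birth": "yes", "Blood Type": "warm"}, "Mammals"),  # R3
    ({"Give Birth": "no", "Can Fly": "no"}, "Reptiles"),       # R4
    ({"Live in Water": "sometimes"}, "Amphibians"),            # R5
]

def covers(condition, record):
    return all(record.get(a) == v for a, v in condition.items())

def classify_ordered(rules, record, default="Unknown"):
    for cond, cls in rules:      # highest-ranked rule first
        if covers(cond, record):
            return cls
    return default               # no rule fired: default class (illustrative)

turtle = {"Blood Type": "cold", "Give Birth": "no",
          "Can Fly": "no", "Live in Water": "sometimes"}
print(classify_ordered(rules, turtle))  # 'Reptiles' (R4 outranks R5)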

Rule Ordering Schemes

● Rule-based ordering
  – Individual rules are ranked based on their quality
● Class-based ordering
  – Rules that belong to the same class appear together

Rule-based Ordering:
(Refund=Yes) ==> No
(Refund=No, Marital Status={Single,Divorced}, Taxable Income<80K) ==> No
(Refund=No, Marital Status={Single,Divorced}, Taxable Income>80K) ==> Yes
(Refund=No, Marital Status={Married}) ==> No

Class-based Ordering:
(Refund=Yes) ==> No
(Refund=No, Marital Status={Single,Divorced}, Taxable Income<80K) ==> No
(Refund=No, Marital Status={Married}) ==> No
(Refund=No, Marital Status={Single,Divorced}, Taxable Income>80K) ==> Yes


Building Classification Rules

● Direct Method:
  u Extract rules directly from data
  u Examples: RIPPER, CN2, Holte's 1R

● Indirect Method:
  u Extract rules from other classification models (e.g., decision trees, neural networks)
  u Examples: C4.5rules

Direct Method: Sequential Covering

1. Start from an empty rule
2. Grow a rule using the Learn-One-Rule function
3. Remove training records covered by the rule
4. Repeat Steps (2) and (3) until the stopping criterion is met (see the sketch below)
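A skeleton of the loop, with Learn-One-Rule left abstract as on the slide (the function names are mine):

# Sketch of sequential covering. learn_one_rule() and the stopping
# criterion are left abstract here, as in the outline above.

def covers(condition, record):
    return all(record.get(a) == v for a, v in condition.items())

def sequential_covering(records, target_class, learn_one_rule):
    rule_set = []
    remaining = list(records)
    while remaining:
        rule = learn_one_rule(remaining, target_class)  # step 2: grow a rule
        if rule is None:        # stopping criterion: no useful rule found
            break
        rule_set.append(rule)
        cond, _ = rule
        # Step 3: remove the training records covered by the new rule.
        remaining = [r for r in remaining if not covers(cond, r)]
    return rule_set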


Example of Sequential Covering

[Figure: (i) Original Data; (ii) Step 1 – rule R1 is grown to cover part of the positive instances; (iii) Step 2 – records covered by R1 are removed and R2 is grown; (iv) Step 3]


Instance Elimination

● Why do we need to eliminate instances?
  – Otherwise, the next rule is identical to the previous rule
● Why do we remove positive instances?
  – Ensure that the next rule is different
● Why do we remove negative instances?
  – Prevent underestimating the accuracy of a rule
  – Compare rules R2 and R3 in the diagram

[Figure: rules R1, R2, and R3 drawn over a scatter of positive (class = +) and negative (class = –) instances]

Rule Growing

● Two common strategies: general-to-specific and specific-to-general (see the sketch after the figure)

[Figure (a) General-to-specific: start from the empty rule { } => (Class=Yes), which covers Yes: 3, No: 4; candidate conjuncts and the counts they cover: Refund=No (Yes: 3, No: 4), Status=Single (Yes: 2, No: 1), Status=Divorced (Yes: 1, No: 0), Status=Married (Yes: 0, No: 3), ..., Income>80K (Yes: 3, No: 1)]

[Figure (b) Specific-to-general: start from specific rules such as (Refund=No, Status=Single, Income=85K) => (Class=Yes) and (Refund=No, Status=Single, Income=90K) => (Class=Yes), and generalize by dropping conjuncts, e.g. to (Refund=No, Status=Single) => (Class=Yes)]
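As a small illustration of one general-to-specific step, the sketch below scores the candidate conjuncts from panel (a) by accuracy. Note that raw accuracy favors the low-coverage conjunct Status=Divorced, which is one motivation for coverage-aware measures such as FOIL's information gain on the next slide.

# Sketch of one general-to-specific growing step: score each candidate
# conjunct on the records it covers. Counts are from the figure above.

candidates = {  # conjunct: (positives, negatives) covered
    "Refund=No":       (3, 4),
    "Status=Single":   (2, 1),
    "Status=Divorced": (1, 0),
    "Status=Married":  (0, 3),
    "Income>80K":      (3, 1),
}

def accuracy(p, n):
    return p / (p + n)

best = max(candidates, key=lambda c: accuracy(*candidates[c]))
print(best, accuracy(*candidates[best]))  # Status=Divorced 1.0 (but covers
                                          # only one record)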


Rule Evaluation

● FOIL's Information Gain
  (FOIL: First Order Inductive Learner – an early rule-based learning algorithm)

  – R0: {} => class (initial rule)
  – R1: {A} => class (rule after adding a conjunct)

  – Gain(R0, R1) = t [ log2(p1/(p1+n1)) – log2(p0/(p0+n0)) ]

  – where t: number of positive instances covered by both R0 and R1
    p0: number of positive instances covered by R0
    n0: number of negative instances covered by R0
    p1: number of positive instances covered by R1
    n1: number of negative instances covered by R1
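Transcribed into Python (base-2 logs; assumes p0 > 0 and p1 > 0 so the logs are defined):

from math import log2

# FOIL's information gain, a direct transcription of the formula above.
# t : positive instances covered by both R0 and R1
# p0, n0 : positive/negative instances covered by R0
# p1, n1 : positive/negative instances covered by R1

def foil_gain(t, p0, n0, p1, n1):
    return t * (log2(p1 / (p1 + n1)) - log2(p0 / (p0 + n0)))

# Illustrative numbers (mine): R0 covers 100 positives and 400 negatives;
# after adding conjunct A, the rule covers 30 positives and 10 negatives.
print(foil_gain(t=30, p0=100, n0=400, p1=30, n1=10))  # ~57.2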

Direct Method: RIPPER

● For a 2-class problem, choose one of the classes as the positive class and the other as the negative class
  – Learn rules for the positive class
  – Negative class will be the default class
● For a multi-class problem
  – Order the classes according to increasing class prevalence (fraction of instances that belong to a particular class)
  – Learn the rule set for the smallest class first, treating the rest as the negative class
  – Repeat with the next smallest class as the positive class


Direct Method: RIPPER

● Growing a rule:
  – Start from an empty rule
  – Add conjuncts as long as they improve FOIL's information gain
  – Stop when the rule no longer covers negative examples
  – Prune the rule immediately using incremental reduced error pruning
  – Measure for pruning: v = (p–n)/(p+n)
    u p: number of positive examples covered by the rule in the validation set
    u n: number of negative examples covered by the rule in the validation set
  – Pruning method: delete any final sequence of conditions that maximizes v (see the sketch below)
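A sketch of the metric and the final-sequence deletion step. Here count_pn is a hypothetical stand-in that returns the validation-set counts (p, n) for a candidate rule; it assumes every candidate covers at least one validation example.

# Sketch of RIPPER's pruning measure and the "delete any final sequence
# of conditions" step. A rule is represented as a list of conjuncts.

def v_metric(p, n):
    return (p - n) / (p + n)   # pruning measure v = (p-n)/(p+n)

def prune_rule(conjuncts, count_pn):
    # Try deleting every final sequence of conjuncts (i.e. keep each
    # prefix) and return the version that maximizes v on the validation
    # set; ties favor the shorter rule.
    best, best_v = conjuncts, v_metric(*count_pn(conjuncts))
    for k in range(len(conjuncts) - 1, 0, -1):  # keep the first k conjuncts
        cand_v = v_metric(*count_pn(conjuncts[:k]))
        if cand_v >= best_v:
            best, best_v = conjuncts[:k], cand_v
    return best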

Direct Method: RIPPER

● Building a Rule Set:
  – Use the sequential covering algorithm
    u Find the best rule that covers the current set of positive examples
    u Eliminate both positive and negative examples covered by the rule
  – Each time a rule is added to the rule set, compute the new description length
    u Stop adding new rules when the new description length is d bits longer than the smallest description length obtained so far


Direct Method: RIPPER

● Optimize the rule set:
  – For each rule r in the rule set R
    u Consider 2 alternative rules:
      – Replacement rule (r*): grow a new rule from scratch
      – Revised rule (r′): add conjuncts to extend the rule r
    u Compare the rule set containing r against the rule sets containing r* and r′
    u Choose the rule set that minimizes the description length (MDL principle)
  – Repeat rule generation and rule optimization for the remaining positive examples

Indirect Methods

[Figure: decision tree – root P (No → Q, Yes → R); Q splits into No → – and Yes → +; R splits into No → + and Yes → Q, which splits into No → – and Yes → +]

Rule Set:
r1: (P=No, Q=No) ==> -
r2: (P=No, Q=Yes) ==> +
r3: (P=Yes, R=No) ==> +
r4: (P=Yes, R=Yes, Q=No) ==> -
r5: (P=Yes, R=Yes, Q=Yes) ==> +
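Rule extraction here is just an enumeration of root-to-leaf paths; a minimal sketch, using a hypothetical nested-tuple encoding of the tree in the figure above:

# Sketch: extracting one rule per root-to-leaf path from a decision tree.
# A node is either a class label (leaf) or (attribute, {value: child}).
# The nested-tuple encoding is illustrative.

tree = ("P", {"No":  ("Q", {"No": "-", "Yes": "+"}),
              "Yes": ("R", {"No": "+",
                            "Yes": ("Q", {"No": "-", "Yes": "+"})})})

def tree_to_rules(node, path=()):
    if isinstance(node, str):        # leaf: emit (conjuncts, class)
        return [(path, node)]
    attr, branches = node
    rules = []
    for value, child in branches.items():
        rules += tree_to_rules(child, path + ((attr, value),))
    return rules

for cond, cls in tree_to_rules(tree):
    print(", ".join(f"{a}={v}" for a, v in cond), "==>", cls)
# Prints r1 through r5 exactly as in the rule set above.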


Indirect Method: C4.5rules

● Extract rules from an unpruned decision tree
● For each rule, r: A → y,
  – consider an alternative rule r′: A′ → y, where A′ is obtained by removing one of the conjuncts in A
  – Compare the pessimistic error rate for r against that of all the alternative rules r′
  – Prune r if one of the alternative rules has a lower pessimistic error rate
  – Repeat until we can no longer improve the generalization error

Indirect Method: C4.5rules

● Instead of ordering the rules, order subsets of rules (class ordering)
  – Each subset is a collection of rules with the same rule consequent (class)
  – Compute the description length of each subset
    u Description length = L(error) + g L(model)
    u g is a parameter that takes into account the presence of redundant attributes in a rule set (default value = 0.5)


Example
Name Give  Birth Lay  Eggs Can  Fly Live  in  Water Have  Legs Class
human yes no no no yes mammals
python no yes no no no reptiles
salmon no yes no yes no fishes
whale yes no no yes no mammals
frog no yes no sometimes yes amphibians
komodo no yes no no yes reptiles
bat yes no yes no yes mammals
pigeon no yes yes no yes birds
cat yes no no no yes mammals
leopard  shark yes no no yes no fishes
turtle no yes no sometimes yes reptiles
penguin no yes no sometimes yes birds
porcupine yes no no no yes mammals
eel no yes no yes no fishes
salamander no yes no sometimes yes amphibians
gila  monster no yes no no yes reptiles
platypus no yes no no yes mammals
owl no yes yes no yes birds
dolphin yes no no yes no mammals
eagle no yes yes no yes birds


C4.5 versus C4.5rules versus RIPPER

[Figure: C4.5 decision tree – Give Birth? (Yes → Mammals; No → Live In Water? (Yes → Fishes; Sometimes → Amphibians; No → Can Fly? (Yes → Birds; No → Reptiles)))]

C4.5rules:
(Give Birth=No, Can Fly=Yes) → Birds
(Give Birth=No, Live in Water=Yes) → Fishes
(Give Birth=Yes) → Mammals
(Give Birth=No, Can Fly=No, Live in Water=No) → Reptiles
( ) → Amphibians

RIPPER:
(Live in Water=Yes) → Fishes
(Have Legs=No) → Reptiles
(Give Birth=No, Can Fly=No, Live In Water=No) → Reptiles
(Can Fly=Yes, Give Birth=No) → Birds
( ) → Mammals


C4.5 versus C4.5rules versus RIPPER

C4.5 and C4.5rules:
PREDICTED  CLASS
ACTUAL  CLASS Amphibians Fishes Reptiles Birds Mammals
Amphibians 2 0 0 0 0
Fishes 0 2 0 0 1
Reptiles 1 0 3 0 0
Birds 1 0 0 3 0
Mammals 0 0 1 0 6

RIPPER:
PREDICTED  CLASS
ACTUAL  CLASS Amphibians Fishes Reptiles Birds Mammals
Amphibians 0 0 0 0 2
Fishes 0 3 0 0 0
Reptiles 0 0 3 0 1
Birds 0 0 1 2 1
Mammals 0 2 1 0 4

Advantages of Rule-Based Classifiers

● Has characteristics quite similar to decision trees
  – As highly expressive as decision trees
  – Easy to interpret
  – Performance comparable to decision trees
  – Can handle redundant attributes

● Better suited for handling imbalanced classes

● Harder to handle missing values in the test set
