4chap4 BM
Concepts and
Techniques
Chapter 4
May 5, 2015
Summary
Task-relevant data
Background knowledge
Characterization
Discrimination
Association
Classification/prediction
Clustering
Outlier analysis
Background Knowledge:
Concept Hierarchies
Schema hierarchy
E.g., street < city < province_or_state <
country
Set-grouping hierarchy
E.g., {20-39} = young, {40-59} =
middle_aged
Operation-derived hierarchy
email address: login-name < department
< university < country
Rule-based hierarchy
low_profit_margin(X) <= price(X, P1)
and cost(X, P2) and (P1 - P2) < $50
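The four kinds of concept hierarchy listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every function and constant name is hypothetical, ages outside the two listed ranges are left ungrouped because the slide defines no group for them, and the email parser assumes an address of the form login@department.university.country.

```python
# Illustrative sketch only; all names here are hypothetical.

# Schema hierarchy: an ordering over schema attributes, most specific first.
SCHEMA_HIERARCHY = ["street", "city", "province_or_state", "country"]


def generalize(attribute):
    """Return the next more general schema level, or None at the top."""
    i = SCHEMA_HIERARCHY.index(attribute)
    return SCHEMA_HIERARCHY[i + 1] if i + 1 < len(SCHEMA_HIERARCHY) else None


def age_group(age):
    """Set-grouping hierarchy: map a value into a named range."""
    if 20 <= age <= 39:
        return "young"
    if 40 <= age <= 59:
        return "middle_aged"
    return None  # no group is defined for other ages in the source


def email_levels(address):
    """Operation-derived hierarchy: decode levels from an email address.

    Assumes the form login@department.university.country.
    """
    login, domain = address.split("@", 1)
    department, university, country = domain.split(".")
    return {
        "login_name": login,
        "department": department,
        "university": university,
        "country": country,
    }


def low_profit_margin(price, cost):
    """Rule-based hierarchy: holds when (P1 - P2) < $50."""
    return (price - cost) < 50
```

For example, generalize("city") moves one level up the schema hierarchy, and low_profit_margin(120, 100) evaluates the rule for an item priced at $120 that costs $100.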
Measurements of
Pattern Interestingness
Simplicity
e.g., (association) rule length, (decision) tree size
Certainty
e.g., confidence, P(A|B) = n(A and B) / n(B),
classification reliability or accuracy, certainty
factor, rule strength, rule quality, discriminating
weight, etc.
Utility
potential usefulness, e.g., support (association),
noise threshold (description)
Novelty
not previously known, surprising (used to remove
redundant rules, e.g., Canada vs. Vancouver rule
implication support ratio)
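As a small worked illustration of the certainty and utility measures above, the sketch below computes support and confidence over a list of transactions. The function names and the sample transactions are hypothetical; confidence follows the formula given above, P(A|B) = n(A and B) / n(B), with B as the rule antecedent and A as the consequent.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    if not transactions:
        return 0.0
    items = set(itemset)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """P(A|B) = n(A and B) / n(B), with B = antecedent and A = consequent."""
    b = set(antecedent)
    a_and_b = b | set(consequent)
    n_b = sum(1 for t in transactions if b <= set(t))
    if n_b == 0:
        return 0.0  # the rule never fires, so confidence is reported as 0
    n_ab = sum(1 for t in transactions if a_and_b <= set(t))
    return n_ab / n_b


# Hypothetical sample data, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread"},
    {"milk"},
    {"bread", "milk", "eggs"},
]
```

With this sample, support(transactions, {"bread", "milk"}) counts two of four transactions, and confidence(transactions, {"bread"}, {"milk"}) divides those two by the three transactions that contain bread.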
Visualization of Discovered
Patterns
Different backgrounds/usages may require different
forms of representation
Motivation
Design
task-relevant data
interestingness measure
in relevance to att_or_dim_list
order by order_list
group by grouping_list
having condition
Specification of task-relevant
data
Characterization
Mine_Knowledge_Specification ::=
mine characteristics [as pattern_name]
analyze measure(s)
Discrimination
Mine_Knowledge_Specification ::=
mine comparison [as pattern_name]
for target_class where target_condition
{versus contrast_class_i where
contrast_condition_i}
analyze measure(s)
Association
Mine_Knowledge_Specification ::=
mine associations [as pattern_name]
Classification
Mine_Knowledge_Specification ::=
mine classification [as pattern_name]
analyze classifying_attribute_or_dimension
Prediction
Mine_Knowledge_Specification ::=
mine prediction [as pattern_name]
analyze prediction_attribute_or_dimension
{set {attribute_or_dimension_i = value_i}}
Example:
with support threshold = 0.05
with confidence threshold = 0.7
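One way to read the two threshold clauses is as a filter over candidate rules. The sketch below assumes both thresholds are minimums, so a rule is kept only when its support is at least 0.05 and its confidence is at least 0.7. The Rule type, the function name, and the sample rules are hypothetical.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """A candidate association rule with precomputed measures."""
    antecedent: frozenset
    consequent: frozenset
    support: float
    confidence: float


def filter_rules(rules, support_threshold=0.05, confidence_threshold=0.7):
    """Keep rules meeting both minimum thresholds (assumed semantics)."""
    return [
        r for r in rules
        if r.support >= support_threshold
        and r.confidence >= confidence_threshold
    ]


# Hypothetical candidate rules, for illustration only.
candidates = [
    Rule(frozenset({"bread"}), frozenset({"milk"}), 0.10, 0.80),
    Rule(frozenset({"milk"}), frozenset({"eggs"}), 0.02, 0.90),
    Rule(frozenset({"eggs"}), frozenset({"bread"}), 0.20, 0.50),
]

kept = filter_rules(candidates)
```

Here only the first candidate passes: the second fails the support threshold and the third fails the confidence threshold.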
display as <result_form>
To facilitate interactive viewing at different concept
levels, the following syntax is defined:
Multilevel_Manipulation ::=
roll up on attribute_or_dimension
| drill down on attribute_or_dimension
| add attribute_or_dimension
| drop attribute_or_dimension
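The four manipulation operations can be pictured as changes to which concept level each dimension is currently displayed at. The following is a minimal sketch under stated assumptions, not a description of any actual system: the View class, the HIERARCHIES table, and the choice to add a new dimension at its most specific level are all hypothetical.

```python
# Hypothetical hierarchy table: each dimension lists its levels,
# most specific first.
HIERARCHIES = {
    "location": ["street", "city", "province_or_state", "country"],
    "age": ["age_value", "age_group"],
}


class View:
    """Tracks the concept level currently shown for each dimension."""

    def __init__(self, levels):
        self.levels = dict(levels)  # dimension -> current level name

    def roll_up(self, dim):
        """Move dim one level toward the general end, if possible."""
        hierarchy = HIERARCHIES[dim]
        i = hierarchy.index(self.levels[dim])
        if i + 1 < len(hierarchy):
            self.levels[dim] = hierarchy[i + 1]

    def drill_down(self, dim):
        """Move dim one level toward the specific end, if possible."""
        hierarchy = HIERARCHIES[dim]
        i = hierarchy.index(self.levels[dim])
        if i > 0:
            self.levels[dim] = hierarchy[i - 1]

    def add(self, dim):
        """Add dim to the view at its most specific level (an assumption)."""
        self.levels[dim] = HIERARCHIES[dim][0]

    def drop(self, dim):
        """Remove dim from the view; a no-op if it is absent."""
        self.levels.pop(dim, None)
```

For instance, a view created with View({"location": "city"}) shows province_or_state after one roll_up("location"), and returns to city after one drill_down("location").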
Summary