Notes 4 DWM Data Mining
Mining Frequent Patterns, Association and
Correlations
diaper}
i.e., every transaction having {beer, diaper, nuts} also contains {beer, diaper}
Challenges
Multiple scans of transaction database
Huge number of candidates
Tedious workload of support counting for candidates
Improving Apriori: general ideas
Reduce passes of transaction database scans
Shrink number of candidates
Facilitate support counting of candidates
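The challenges listed above are easier to see in code. The following is a minimal sketch of level-wise Apriori, not a tuned implementation; the function name `apriori` and the small basket list are hypothetical and chosen only for illustration. Note the full scan of the database at every level and the candidate set that must be generated and counted.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count single items with one scan of the database.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Candidate generation: join frequent (k-1)-itemsets, then prune any
        # candidate that has an infrequent (k-1)-subset (Apriori property).
        prev = set(frequent)
        candidates = set()
        for a in prev:
            for b in prev:
                union = a | b
                if len(union) == k and all(
                    frozenset(sub) in prev for sub in combinations(union, k - 1)
                ):
                    candidates.add(union)

        # Support counting: one more full scan of the database per level k.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical baskets, for illustration only.
BASKETS = [
    ["beer", "diaper", "nuts"],
    ["beer", "diaper"],
    ["beer", "nuts"],
    ["diaper", "milk"],
]
frequent_itemsets = apriori(BASKETS, min_support=2)
```

The improvement ideas in the list map onto this sketch directly: fewer passes means fewer iterations of the scan loop, fewer candidates means a smaller `candidates` set, and faster support counting means replacing the nested subset test with a better data structure.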
Completeness
Preserve complete information for frequent pattern mining
Never break a long pattern of any transaction
Compactness
Reduce irrelevant info—infrequent items are gone
Patterns containing p
…
Pattern f
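[Figure: FP-tree rooted at {}, with head-of-node-link pointers from the header table. Node labels recoverable from the figure: f:4, c:3, c:1, a:3, b:1, b:1, b:1, m:2, p:1]

Header Table
Item   Frequency
f      4
c      4
a      3
b      3
m      3
p      3

Conditional pattern bases
Item   Cond. pattern base
c      f:3
a      fc:3
b      fca:1, f:1, c:1
m      fca:2, fcab:1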
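The header table and conditional pattern bases above can be reproduced with a short sketch. The five transactions below are an assumption: they are not listed in these notes and were chosen so that the item counts match the header table (f:4, c:4, a:3, b:3, m:3, p:3). The names `Node`, `build_fp_tree`, and `conditional_pattern_base` are hypothetical.

```python
class Node:
    """One FP-tree node: an item, a count, and a link to its parent."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_fp_tree(transactions, order):
    """Insert each transaction, restricted to `order` and sorted by it."""
    rank = {item: r for r, item in enumerate(order)}
    root = Node(None, None)
    header = {item: [] for item in order}  # item -> nodes carrying that item
    for t in transactions:
        # Drop infrequent items, then sort by header-table order.
        items = sorted((i for i in set(t) if i in rank), key=rank.get)
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += 1
            node = child
    return root, header


def conditional_pattern_base(header, item):
    """Collect the prefix path of every node for `item`, with that node's count."""
    base = {}
    for node in header[item]:
        path = []
        parent = node.parent
        while parent is not None and parent.item is not None:
            path.append(parent.item)
            parent = parent.parent
        if path:
            prefix = "".join(reversed(path))
            base[prefix] = base.get(prefix, 0) + node.count
    return base


# Assumed example database (one string of single-letter items per transaction).
TRANSACTIONS = ["facdgimp", "abcflmo", "bfhjow", "bcksp", "afcelpmn"]
ORDER = "fcabmp"  # frequent items in header-table order

root, header = build_fp_tree(TRANSACTIONS, ORDER)
for item in ORDER:
    print(item, conditional_pattern_base(header, item))
```

The item order is passed in explicitly because f and c tie at count 4, and a plain sort by frequency would not guarantee the f-before-c order used in the header table.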
database partition.
Method
For each frequent item, construct its conditional pattern-base, and then its conditional FP-tree
Repeat the process on each newly created conditional FP-tree
Until the resulting FP-tree is empty, or it contains only
Divide-and-conquer:
decompose both the mining task and DB according to
the frequent patterns obtained so far
leads to focused search of smaller databases
Other factors
no candidate generation, no candidate test
compressed database: FP-tree structure
no repeated scan of entire database
basic operations: counting local frequent items and building sub FP-trees; no pattern search and matching
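The divide-and-conquer idea can be sketched as a recursive pattern-growth routine. This is a simplification under stated assumptions: it recurses on projected (conditional) databases held as plain lists rather than on compressed FP-trees, so it illustrates the decomposition but not the memory savings of the tree. The function name `pattern_growth` and the example transactions are hypothetical.

```python
from collections import Counter


def pattern_growth(database, min_support, suffix=()):
    """Yield (pattern, support) pairs.

    `database` is a list of (items, count) pairs; `suffix` is the pattern
    that every transaction in this projected database already contains.
    """
    # Basic operation 1: count local frequent items.
    counts = Counter()
    for items, count in database:
        for item in set(items):
            counts[item] += count
    order = sorted(
        (i for i in counts if counts[i] >= min_support),
        key=lambda i: (-counts[i], i),
    )
    rank = {item: r for r, item in enumerate(order)}

    # Process the least frequent item first, as in FP-growth.
    for item in reversed(order):
        pattern = (item,) + suffix
        yield frozenset(pattern), counts[item]

        # Basic operation 2: build the smaller conditional database for
        # `item`, keeping only the items ranked ahead of it.
        conditional = []
        for items, count in database:
            if item in items:
                prefix = tuple(
                    i for i in items if i in rank and rank[i] < rank[item]
                )
                if prefix:
                    conditional.append((prefix, count))

        # Recurse on the focused, smaller database; no candidates are
        # generated or tested against the full database.
        yield from pattern_growth(conditional, min_support, pattern)


# Assumed example database (one string of single-letter items per transaction).
TRANSACTIONS = ["facdgimp", "abcflmo", "bfhjow", "bcksp", "afcelpmn"]
DATABASE = [(tuple(t), 1) for t in TRANSACTIONS]
patterns = dict(pattern_growth(DATABASE, min_support=3))
```

Each recursive call sees only the transactions that contain the current suffix, which is why the search works on ever smaller databases.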
$\mathrm{lift}(A, B) = \dfrac{P(A \cup B)}{P(A)\,P(B)}$
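If the value is greater than 1, then A and B are positively correlated: the occurrence of one makes the occurrence of the other more likely. A value less than 1 indicates negative correlation, and a value of 1 indicates that A and B are independent.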
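As a small worked example, lift can be computed directly from transaction counts. The sketch below assumes the joint probability in the numerator is the fraction of transactions containing both A and B; the function name `lift` and the baskets are hypothetical.

```python
def lift(transactions, a, b):
    """Lift of itemsets a and b over a list of transactions.

    Assumes both a and b occur at least once; otherwise the
    denominator is zero.
    """
    a, b = set(a), set(b)
    n = len(transactions)
    p_a = sum(a <= set(t) for t in transactions) / n
    p_b = sum(b <= set(t) for t in transactions) / n
    p_ab = sum((a | b) <= set(t) for t in transactions) / n
    return p_ab / (p_a * p_b)


# Hypothetical baskets, for illustration only.
BASKETS = [
    {"beer", "diaper"},
    {"beer", "diaper"},
    {"milk"},
    {"milk", "beer"},
]

beer_diaper = lift(BASKETS, {"beer"}, {"diaper"})  # (2/4) / ((3/4) * (2/4))
beer_milk = lift(BASKETS, {"beer"}, {"milk"})      # (1/4) / ((3/4) * (2/4))
```

In this toy data beer and diaper come out above 1 (positively correlated), while beer and milk come out below 1 (negatively correlated).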
May 18, 2023 Data Mining: Concepts and Techniques 31
Mining Frequent Patterns, Association
and Correlations
Basic concepts and a road map
Efficient and scalable frequent itemset mining
methods
Mining various kinds of association rules
From association mining to correlation analysis
Constraint-based association mining
Summary