Week 8-Association Rules Part 1


CIS 517: Data Mining and Warehousing

Week 8: Association Rules Part 1


(Chapter 6)
Instructor: Dr. Gomathi Krishna
Email: gkrishna@iau.edu.sa

Data Mining and Data Warehousing


Lecture Outline

• Basic Concepts
• Frequent Itemset Mining Methods – Apriori Algorithm
• Generating association rules from frequent itemsets

Frequent Patterns: Association rule

• Frequent patterns are patterns (e.g., itemsets, subsequences, or substructures) that appear frequently in a data set.
• For example, a set of items, such as milk and bread, that appear frequently together in a transaction data set is a frequent itemset.



Frequent Patterns: Association rule

• How to mine such patterns and rules efficiently in large datasets?
• How to use such patterns for classification, clustering, and other applications?

Market Basket Analysis: A Motivating Example

• Market basket analysis examines customer buying habits by finding associations between the different items that customers place in their “shopping baskets”.
• The discovery of these associations can help retailers develop marketing strategies by gaining insight into which items are frequently purchased together by customers.



Market Basket Analysis

Basic Concepts: Association Rules

Tid   Items bought
10    Beer, Nuts, Diaper
20    Beer, Coffee, Diaper
30    Beer, Diaper, Eggs
40    Nuts, Eggs, Milk
50    Nuts, Coffee, Diaper, Eggs, Milk

• Find all the rules X → Y with minimum support and confidence
  - support, s: probability that a transaction contains X ∪ Y
  - confidence, c: conditional probability that a transaction having X also contains Y

[Figure labels: Customer buys beer; Customer buys diaper; Customer buys both]

Let minsup = 50%, minconf = 50%
Freq. Pat.: Beer:3, Nuts:3, Diaper:4, Eggs:3, {Beer, Diaper}:3

• Association rules: (many more!)
  - Beer → Diaper (60%, 100%)
  - Diaper → Beer (60%, 75%)
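The support and confidence figures on this slide can be checked with a few lines of Python. This is only a minimal sketch: the names `transactions`, `count`, `support`, and `confidence` are illustrative and do not come from the slides.

```python
# Transactions from the slide's table (Tid -> items bought).
transactions = {
    10: {"Beer", "Nuts", "Diaper"},
    20: {"Beer", "Coffee", "Diaper"},
    30: {"Beer", "Diaper", "Eggs"},
    40: {"Nuts", "Eggs", "Milk"},
    50: {"Nuts", "Coffee", "Diaper", "Eggs", "Milk"},
}


def count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for items in transactions.values() if itemset <= items)


def support(itemset):
    """Fraction of transactions that contain the whole itemset."""
    return count(itemset) / len(transactions)


def confidence(lhs, rhs):
    """Estimate of P(rhs | lhs): count(lhs ∪ rhs) / count(lhs)."""
    return count(set(lhs) | set(rhs)) / count(lhs)


print(support({"Beer", "Diaper"}))       # 0.6
print(confidence({"Beer"}, {"Diaper"}))  # 1.0
print(confidence({"Diaper"}, {"Beer"}))  # 0.75
```

Both rules share the same support because support depends only on X ∪ Y, while confidence changes with the choice of left-hand side.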
The Apriori Algorithm (Pseudo-Code)
Ck: candidate itemsets of size k
Lk: frequent itemsets of size k

L1 = {frequent items};
for (k = 1; Lk != ∅; k++) do begin
    Ck+1 = candidates generated from Lk;
    for each transaction t in database do
        increment the count of all candidates in Ck+1 that are contained in t
    Lk+1 = candidates in Ck+1 with min_support
end
return ∪k Lk;
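The pseudo-code can be turned into a short runnable Python sketch. The function name `apriori` and the parameter `min_count` are illustrative assumptions, and candidates here are built from unions of pairs of frequent itemsets rather than a prefix-based self-join, which yields the same candidate set after pruning but is not tuned for speed. The sample data is the four-transaction TDB database used in Example 2 of these slides.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {frozenset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]

    # L1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}

    frequent = dict(current)
    k = 1
    while current:
        # C(k+1): unions of two frequent k-itemsets that have k+1 items
        # and whose every k-subset is itself frequent.
        candidates = set()
        for a, b in combinations(current, 2):
            union = a | b
            if len(union) == k + 1 and all(
                frozenset(sub) in current for sub in combinations(union, k)
            ):
                candidates.add(union)

        # One database scan: count the candidates contained in each transaction.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        # L(k+1): candidates that meet the minimum support count.
        current = {s: c for s, c in counts.items() if c >= min_count}
        frequent.update(current)
        k += 1
    return frequent


tdb = [{"A", "C", "D"}, {"B", "C", "E"}, {"A", "B", "C", "E"}, {"B", "E"}]
result = apriori(tdb, min_count=2)
for itemset in sorted(result, key=lambda s: (len(s), sorted(s))):
    print(sorted(itemset), result[itemset])
```

The loop stops as soon as a level produces no frequent itemsets, which mirrors the `Lk != ∅` condition in the pseudo-code.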
Implementation of Apriori

• How to generate candidates?
  - Step 1: self-joining Lk
  - Step 2: pruning
• Example of candidate generation
  - L3 = {abc, abd, acd, ace, bcd}
  - Self-joining: L3 * L3
    - abcd from abc and abd
    - acde from acd and ace
  - Pruning:
    - acde is removed because ade is not in L3
  - C4 = {abcd}
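The two candidate-generation steps can be sketched in Python as follows. The function name `apriori_gen` and the representation of itemsets as sorted tuples are assumptions made for this illustration; the input is the L3 from the example on this slide.

```python
from itertools import combinations


def apriori_gen(prev_frequent):
    """Build C(k+1) from L(k); itemsets are sorted tuples of items."""
    prev = sorted(prev_frequent)
    if not prev:
        return []
    prev_set = set(prev)
    k = len(prev[0])

    candidates = []
    for i, a in enumerate(prev):
        for b in prev[i + 1:]:
            # Step 1 (self-join): merge itemsets that share their first k-1 items.
            if a[:-1] != b[:-1]:
                continue
            candidate = a + (b[-1],)
            # Step 2 (prune): drop the candidate if any k-subset is not in L(k).
            if all(sub in prev_set for sub in combinations(candidate, k)):
                candidates.append(candidate)
    return candidates


l3 = [tuple(s) for s in ["abc", "abd", "acd", "ace", "bcd"]]
print(["".join(c) for c in apriori_gen(l3)])  # ['abcd']
```

Here the join produces abcd and acde, and the prune step discards acde because its subset ade is missing from L3.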
Apriori Algorithm Example 1

• Generation of the candidate itemsets and frequent itemsets, where the minimum support count is 2.

Final Result



Apriori Algorithm Example 2

Supmin = 2

Database TDB
Tid   Items
10    A, C, D
20    B, C, E
30    A, B, C, E
40    B, E

C1 (1st scan)
Itemset   sup
{A}       2
{B}       3
{C}       3
{D}       1
{E}       3

L1
Itemset   sup
{A}       2
{B}       3
{C}       3
{E}       3

C2
Itemset
{A, B}
{A, C}
{A, E}
{B, C}
{B, E}
{C, E}

C2 (2nd scan)
Itemset   sup
{A, B}    1
{A, C}    2
{A, E}    1
{B, C}    2
{B, E}    3
{C, E}    2

L2
Itemset   sup
{A, C}    2
{B, C}    2
{B, E}    3
{C, E}    2

C3
Itemset
{B, C, E}

L3 (3rd scan)
Itemset     sup
{B, C, E}   2
Generating Association Rules from Frequent Itemsets
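As a rough illustration of this step, the sketch below derives rules from the frequent itemsets found in Example 2, using the definition of confidence given earlier in the lecture. The function name `generate_rules` and the 70% confidence threshold are assumptions chosen for the example, not values from the slides.

```python
from itertools import combinations

# Support counts of the frequent itemsets found in Example 2.
frequent = {
    frozenset("A"): 2, frozenset("B"): 3, frozenset("C"): 3, frozenset("E"): 3,
    frozenset("AC"): 2, frozenset("BC"): 2, frozenset("BE"): 3, frozenset("CE"): 2,
    frozenset("BCE"): 2,
}


def generate_rules(frequent, min_conf):
    """Return (lhs, rhs, confidence) for every rule that meets min_conf."""
    rules = []
    for itemset, count in frequent.items():
        # Every non-empty proper subset can serve as a rule's left-hand side.
        for size in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(sorted(itemset), size)):
                # confidence(lhs -> rest) = count(itemset) / count(lhs)
                conf = count / frequent[lhs]
                if conf >= min_conf:
                    rules.append((lhs, itemset - lhs, conf))
    return rules


# min_conf = 0.7 is a hypothetical threshold for this illustration.
for lhs, rhs, conf in generate_rules(frequent, min_conf=0.7):
    print(sorted(lhs), "->", sorted(rhs), f"{conf:.0%}")
```

The lookup `frequent[lhs]` relies on every subset of a frequent itemset also being frequent, so the dictionary must hold all frequent itemsets, not only the largest ones.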

