Data Mining - Module 6


Republic of the Philippines

Province of Cotabato
Municipality of Makilala
MAKILALA INSTITUTE OF SCIENCE AND TECHNOLOGY
Makilala, Cotabato

COLLEGE OF TECHNOLOGY AND INFORMATION SYSTEMS


Bachelor of Science in Information Systems

Course Number      : PROF EL 3
Course Description : DATA MINING
Credit Units       : 3 units (3 hours lecture; 2 hours laboratory)
Module Number      : 6
Duration           : 2 weeks

Instructor         : RONALD L. BAJADOR
Mobile Number      : +639075182943
Email Address      : rbajadorsharp@gmail.com
I. LEARNING OUTCOMES

Upon completion of this material, you should be able to:


• discuss the overview of the Apriori Algorithm and its key concepts
• determine the steps to perform the Apriori Algorithm
• consider market basket analysis
• generate frequent itemsets in a given data set
• apply the Apriori Algorithm to a given data set
• discuss the advantages and disadvantages of the Apriori Algorithm

II. TOPIC(S) - Apriori Algorithm


Lesson 1: Apriori Background and its key concepts
Lesson 2: Steps to perform Apriori Algorithm
Lesson 3: Market Basket Analysis
Lesson 4: Frequent Itemset Generation
Lesson 5: Application of Apriori Algorithm
Lesson 6: Advantages and Disadvantages of Apriori Algorithm

III. REFERENCES

• Main Textbook

- Tan, P.-N., Steinbach, M., Karpatne, A., & Kumar, V. (2019). Introduction to Data Mining, 2nd Edition.

- Han, J., Kamber, M., & Pei, J. (2013). Data Mining: Concepts and Techniques, 3rd Edition.

- Witten, I. H., Frank, E., Hall, M. A., & Pal, C. J. (2016). Data Mining: Practical Machine Learning Tools and Techniques, 4th Edition.

IV. COURSE CONTENT

Lesson 1: Apriori Background and Its Key Concepts

• Proposed by R. Agrawal, T. Imielinski, and A. Swami

- "Mining Association Rules between Sets of Items in Large Databases"
- SIGMOD, June 1993
• The Apriori Algorithm is an influential algorithm for mining frequent itemsets for Boolean association rules.
• Apriori uses a "bottom-up" approach, where frequent subsets are extended one item at a time (a step known as candidate generation), and groups of candidates are tested against the data.
• Apriori is designed to operate on databases containing transactions (for example, collections of items bought by customers, or details of website visits).



• Frequent Itemsets: the sets of items that meet the minimum support (denoted by Li for the set of frequent i-itemsets)
• Apriori Property: any subset of a frequent itemset must itself be frequent.
• Join Operation: to find Lk, a set of candidate k-itemsets is generated by joining Lk-1 with itself.

• The Apriori principle holds due to the following property of the support measure:
∀X, Y : (X ⊆ Y) ⇒ s(X) ≥ s(Y)
- The support of an itemset never exceeds the support of its subsets.
- This is known as the anti-monotone property of support.
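
To see the anti-monotone property in action, here is a minimal Python sketch (the transactions and names below are illustrative, not from the module) that computes support and confirms that a superset never has higher support than its subset:

# Illustrative transactions (not from the module).
transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer"},
    {"milk", "diaper", "beer"},
    {"bread", "milk", "diaper"},
    {"bread", "milk", "beer"},
]

def support(itemset, transactions):
    # Fraction of transactions containing every item in `itemset`.
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

X = {"bread"}
Y = {"bread", "milk"}                      # X is a subset of Y
assert support(X, transactions) >= support(Y, transactions)
print(support(X, transactions), support(Y, transactions))   # 0.8 0.6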

Lesson 2: Steps to Perform Apriori Algorithm

STEP 1: Scan the transaction database to get the support S of each 1-itemset, compare S with min_sup, and get the set of frequent 1-itemsets, L1.
STEP 2: Join Lk-1 with itself to generate the set of candidate k-itemsets.
STEP 3: Scan the transaction database to get the support S of each candidate k-itemset, compare S with min_sup, and get the set of frequent k-itemsets, Lk.
STEP 4: If the candidate set is not null, repeat from STEP 2; otherwise, continue to STEP 5.
STEP 5: For each frequent itemset l, generate all nonempty subsets of l.
STEP 6: For every nonempty subset s of l, output the rule "s => (l - s)" if the confidence C of the rule (= support S of l / support S of s) is at least min_conf.
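
Expressed in code, steps 1-4 look roughly like the following Python sketch (function and variable names are my own; this is one way to realize the flow above, not the module's official implementation):

from itertools import combinations

def apriori(transactions, min_sup):
    # transactions: list of sets of items; min_sup: absolute support count.
    # Returns {frozenset: support count} for every frequent itemset.
    items = {i for t in transactions for i in t}
    # STEP 1: count each 1-itemset and keep those meeting min_sup (L1).
    Lk = {}
    for i in items:
        count = sum(1 for t in transactions if i in t)
        if count >= min_sup:
            Lk[frozenset([i])] = count
    frequent = dict(Lk)
    k = 2
    while Lk:
        # STEP 2: join L(k-1) with itself to form candidate k-itemsets,
        # pruning any candidate with an infrequent (k-1)-subset.
        prev = list(Lk)
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        candidates = {c for c in candidates
                      if all(frozenset(s) in Lk for s in combinations(c, k - 1))}
        # STEP 3: scan the database to count support; keep the frequent ones (Lk).
        Lk = {}
        for c in candidates:
            count = sum(1 for t in transactions if c <= t)
            if count >= min_sup:
                Lk[c] = count
        frequent.update(Lk)
        k += 1   # STEP 4: repeat until no candidates survive
    return frequent

Running this on the six transactions of Example 1 (Lesson 5) with min_sup = 3 reproduces TABLE-3, TABLE-5, and the final frequent itemset {I1, I2, I3}.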
Lesson 3: Market Basket Analysis
• Provides insight into which products tend to be purchased together and which are most amenable to promotion.
• Actionable rules
• Trivial rules
- e.g., people who buy chalk also buy a duster
• Inexplicable rules
- e.g., people who buy a mobile phone also buy a bag

Lesson 4: Frequent Itemset Generation

1. Let k = 1
2. Generate frequent itemsets of length 1
3. Repeat until no new frequent
itemsets are identified
1. Generate length (k+1) candidate
itemsets from length k frequent Itemsets
2. Prune candidate itemsets containing subsets
of length k that are infrequent
- How many k-itemsets contained
in a (k+1)-itemset?
3. Count the support of each candidate
by scanning the DB
4. Eliminate candidates that are infrequent,
5. leaving only those that are frequent

Note: steps 3.2 and 3.4 prune itemsets that are infrequent
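
The subset test in step 3.2 can be written directly; a minimal Python sketch (the helper name has_infrequent_subset is my own), using the 2-itemsets from Lesson 5 as a check:

from itertools import combinations

def has_infrequent_subset(candidate, frequent_k):
    # candidate: frozenset of k+1 items; frequent_k: set of frozen k-itemsets.
    # A (k+1)-itemset has exactly k+1 subsets of size k; all must be frequent.
    k = len(candidate) - 1
    return any(frozenset(s) not in frequent_k
               for s in combinations(candidate, k))

# From Lesson 5: {I1, I4} is infrequent, so {I1, I2, I4} gets pruned.
L2 = {frozenset(s) for s in [("I1","I2"), ("I1","I3"), ("I2","I3"), ("I2","I4")]}
print(has_infrequent_subset(frozenset(["I1","I2","I4"]), L2))  # True  -> prune
print(has_infrequent_subset(frozenset(["I1","I2","I3"]), L2))  # False -> keep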



• Generating Itemsets Efficiently
• How can we efficiently generate all (frequent) itemsets at each iteration?
o Avoid generating repeated itemsets and infrequent itemsets
• Finding one-item sets is easy.
• Idea: use one-item sets to generate two-item sets, two-item sets to generate three-item sets, ...
o If (A B) is a frequent itemset, then (A) and (B) have to be frequent itemsets as well!
o In general: if X is a frequent k-itemset, then all (k-1)-item subsets of X are also frequent.
o ⇒ Compute each k-itemset by merging two (k-1)-itemsets. Which ones? Two that agree on their first k-2 items, as the example below shows.

E.g., merge {Bread, Milk} with {Bread, Diaper} to get {Bread, Diaper, Milk}.

• Example: generating frequent itemsets

• Given: five frequent 3-itemsets (A B C), (A B D), (A C D), (A C E), (B C D)

1. Keep the itemsets lexicographically ordered!
2. Merge (x_1, x_2, ..., x_{k-1}) with (y_1, y_2, ..., y_{k-1}) only if x_1 = y_1, x_2 = y_2, ..., x_{k-2} = y_{k-2}.

• Candidate 4-itemsets:
(A B C D) OK, because (A B C), (A B D), (A C D), (B C D) are all frequent
(A C D E) not OK, because (C D E) is not frequent

3. Final check by counting instances in the dataset!
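
A small Python sketch of this merge rule (the function name merge_step is my own), reproducing the example above:

def merge_step(frequent_k_minus_1):
    # Join sorted (k-1)-itemsets that agree on their first k-2 items.
    itemsets = sorted(tuple(sorted(s)) for s in frequent_k_minus_1)
    candidates = []
    for i in range(len(itemsets)):
        for j in range(i + 1, len(itemsets)):
            x, y = itemsets[i], itemsets[j]
            if x[:-1] == y[:-1]:            # first k-2 items match
                candidates.append(x + (y[-1],))
    return candidates

L3 = [("A","B","C"), ("A","B","D"), ("A","C","D"), ("A","C","E"), ("B","C","D")]
print(merge_step(L3))   # [('A','B','C','D'), ('A','C','D','E')]

The merge alone produces both (A B C D) and (A C D E); the subsequent subset check is what rejects (A C D E), since (C D E) is not frequent.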

Lesson 5: Apriori Algorithm Example

Example 1: Support threshold = 50%, Confidence threshold = 60%


TABLE-1
Transaction   List of items
T1            I1, I2, I3
T2            I2, I3, I4
T3            I4, I5
T4            I1, I2, I4
T5            I1, I2, I3, I5
T6            I1, I2, I3, I4

Solution:
Support threshold = 50% => 0.5 × 6 = 3 => min_sup = 3

1. Count of Each Item

TABLE-2
Item   Count
I1     4
I2     5
I3     4
I4     4
I5     2
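
As a quick check, the counts in TABLE-2 can be reproduced with a few lines of Python (a sketch, not part of the module):

from collections import Counter

transactions = [
    {"I1","I2","I3"}, {"I2","I3","I4"}, {"I4","I5"},
    {"I1","I2","I4"}, {"I1","I2","I3","I5"}, {"I1","I2","I3","I4"},
]
counts = Counter(item for t in transactions for item in t)
print(sorted(counts.items()))
# [('I1', 4), ('I2', 5), ('I3', 4), ('I4', 4), ('I5', 2)]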



2. Prune Step: TABLE-2 shows that item I5 does not meet min_sup = 3, so it is deleted; only I1, I2, I3, and I4 meet the min_sup count.

TABLE-3
Item   Count
I1     4
I2     5
I3     4
I4     4

3. Join Step: Form the 2-itemsets. From TABLE-1, find the occurrences of each 2-itemset.

TABLE-4
Itemset   Count
I1,I2     4
I1,I3     3
I1,I4     2
I2,I3     4
I2,I4     3
I3,I4     2

4. Prune Step: TABLE-4 shows that the itemsets {I1, I4} and {I3, I4} do not meet min_sup, so they are deleted.

TABLE-5
Itemset   Count
I1,I2     4
I1,I3     3
I2,I3     4
I2,I4     3

5. Join and Prune Step: Form the 3-itemsets. From TABLE-1, find the occurrences of each 3-itemset; from TABLE-5, check that every 2-itemset subset meets min_sup.

For itemset {I1, I2, I3}, the subsets {I1, I2}, {I1, I3}, and {I2, I3} all occur in TABLE-5, so {I1, I2, I3} is frequent.

For itemset {I1, I2, I4}, the subsets are {I1, I2}, {I1, I4}, and {I2, I4}; {I1, I4} is not frequent, as it does not occur in TABLE-5, so {I1, I2, I4} is not frequent and is deleted.

TABLE-6
Candidate 3-itemsets
I1,I2,I3
I1,I2,I4
I1,I3,I4
I2,I3,I4

Only {I1, I2, I3} is frequent.



6. Generate Association Rules: From the frequent itemset discovered above, the candidate association rules are:

{I1, I2} => {I3}

Confidence = support{I1, I2, I3} / support{I1, I2} = (3/4) × 100 = 75%

{I1, I3} => {I2}

Confidence = support{I1, I2, I3} / support{I1, I3} = (3/3) × 100 = 100%

{I2, I3} => {I1}

Confidence = support{I1, I2, I3} / support{I2, I3} = (3/4) × 100 = 75%

{I1} => {I2, I3}

Confidence = support{I1, I2, I3} / support{I1} = (3/4) × 100 = 75%

{I2} => {I1, I3}

Confidence = support{I1, I2, I3} / support{I2} = (3/5) × 100 = 60%

{I3} => {I1, I2}

Confidence = support{I1, I2, I3} / support{I3} = (3/4) × 100 = 75%

This shows that all of the above association rules are strong when the minimum confidence threshold is 60%.
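
The same confidences can be recomputed programmatically; a short Python sketch (names are my own) over the TABLE-1 transactions:

from itertools import combinations

transactions = [
    {"I1","I2","I3"}, {"I2","I3","I4"}, {"I4","I5"},
    {"I1","I2","I4"}, {"I1","I2","I3","I5"}, {"I1","I2","I3","I4"},
]

def support(itemset):
    # Absolute support count of `itemset` in the transaction list.
    return sum(1 for t in transactions if itemset <= t)

L = frozenset({"I1", "I2", "I3"})
for r in range(1, len(L)):
    for antecedent in combinations(sorted(L), r):
        a = frozenset(antecedent)
        confidence = support(L) / support(a) * 100
        print(sorted(a), "=>", sorted(L - a), f"{confidence:.0f}%")

This prints the six rules with confidences 75%, 60%, 75%, 75%, 100%, and 75%, matching the hand calculation above.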

Example 2:



Example 3:

Example 4:

Lesson 6: Advantages and Disadvantages of Apriori Algorithm


• Advantages

- Uses the large itemset property
- Easily parallelized
- Easy to implement

• Disadvantages
- The algorithm can be very slow; the bottleneck is candidate generation.
- Assumes the transaction database is memory resident
- Requires many database scans



V. ACTIVITY/ EXERCISES/EVALUATION

(Apply the Apriori Algorithm to the data sets below. Follow the steps to perform the Apriori Algorithm, consider the market basket analysis, and generate the frequent itemsets.)

1. Rule: item/items frequently purchased in at least 50% of transactions

Transaction ID   Items purchased

T1               {Apple, Mango, Pears}
T2               {Mango, Pears, Cabbage, Carrots}
T3               {Pears, Carrots, Mango}
T4               {Carrots, Mango}

2. Rule: item/items frequently purchased in at least 25% of transactions

TID × Item incidence table: 25 transactions over 16 items (Biscuit, Bread, Cheese, Coffee, Yogurt, Cereal, Chocolate, Donuts, Juice, Milk, Tea, Eggs, Newspaper, Pastry, Rolls, Sugar). [The 0/1 cell entries did not survive reproduction; only the marginal totals below are recoverable.]

Items per transaction (TID: count): 1: 5, 2: 4, 3: 5, 4: 5, 5: 5, 6: 2, 7: 5, 8: 3, 9: 5, 10: 5, 11: 3, 12: 5, 13: 3, 14: 5, 15: 2, 16: 1, 17: 3, 18: 4, 19: 5, 20: 4, 21: 3, 22: 4, 23: 5, 24: 3, 25: 3

Transactions per item: Biscuit 4, Bread 13, Cheese 12, Coffee 9, Yogurt 2, Cereal 9, Chocolate 9, Donuts 10, Juice 11, Milk 6, Tea 4, Eggs 2, Newspaper 2, Pastry 1, Rolls 2, Sugar 1

