
Apriori Algorithm

The Apriori Algorithm is used to identify frequent itemsets in transaction data and generate association rules based on minimum support and confidence thresholds. The process involves calculating support counts, filtering infrequent itemsets, and generating strong association rules with confidence values. Finally, the rules are sorted by lift to determine the strength of the associations between items.


Algorithm in a nutshell
1. Set a minimum support and a minimum confidence.
2. Take all the itemsets present in the transactions whose support is higher than the
minimum support.
3. Take all the rules over these itemsets whose confidence is higher than the minimum
confidence.
4. Sort the rules by decreasing lift.
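The four steps above can be sketched end to end in Python. The original transaction table of the worked example was not preserved in this copy, so the dataset below is a hypothetical one, chosen to be consistent with the counts quoted later in the text (minimum support count 2, minimum confidence 50%):

```python
from itertools import combinations

# Hypothetical five-transaction dataset (the document's original table was not
# preserved); chosen so that a minimum support count of 2 and a minimum
# confidence of 50% reproduce the steps described in the text.
transactions = [
    {"I1", "I2", "I5"},
    {"I4"},
    {"I2", "I3"},
    {"I1", "I2"},
    {"I2", "I3", "I5"},
]
MIN_SUPPORT_COUNT = 2
MIN_CONFIDENCE = 0.5

def support_count(itemset):
    # Number of transactions that contain every item of `itemset`.
    return sum(itemset <= t for t in transactions)

# Steps 1-3: grow frequent itemsets level by level (k = 1, 2, 3, ...).
items = sorted({i for t in transactions for i in t})
current = [frozenset([i]) for i in items
           if support_count({i}) >= MIN_SUPPORT_COUNT]
levels = []
k = 2
while current:
    levels.append(current)
    prev = set(current)
    # Candidates: unions of two frequent (k-1)-itemsets, pruned by the Apriori
    # property that every (k-1)-subset of a candidate must itself be frequent.
    candidates = {a | b for a in current for b in current if len(a | b) == k}
    candidates = {c for c in candidates
                  if all(frozenset(s) in prev for s in combinations(c, k - 1))}
    current = [c for c in candidates if support_count(c) >= MIN_SUPPORT_COUNT]
    k += 1

# Steps 4-5: generate rules from the frequent 2-itemsets, keep the confident
# ones, and sort them by decreasing lift.
rules = []
for itemset in levels[1]:
    for a in itemset:
        antecedent, consequent = frozenset([a]), itemset - {a}
        conf = support_count(itemset) / support_count(antecedent)
        if conf >= MIN_CONFIDENCE:
            lift = conf / (support_count(consequent) / len(transactions))
            rules.append((set(antecedent), set(consequent), conf, lift))
rules.sort(key=lambda r: r[3], reverse=True)
for ant, cons, conf, lift in rules:
    print(f"{ant} -> {cons}: confidence={conf:.0%}, lift={lift:.2f}")
```

With this dataset the loop stops after the 2-item level, and six rules survive the confidence threshold, mirroring the walkthrough that follows.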

Mathematical Approach to the Apriori Algorithm

Consider the transaction dataset of a store in which each transaction contains the list of
items purchased by a customer. Our goal is to find the sets of items that are frequently
purchased together and to generate association rules for them.

We assume that the minimum support count is 2 and the minimum confidence is 50%.
Step 1: Create a table with the support count of every item present in the
transaction database.

We compare each item's support count with the minimum support count we have
set. If an item's support count is below the minimum support count, we remove
that item.

Here, the support count of I4 is below the minimum support count, so I4 is removed.
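Step 1 is a simple counting pass. Since the original table was lost in conversion, the transactions below are hypothetical, but they match the text's assumptions (minimum support count 2, I4 infrequent):

```python
from collections import Counter

# Hypothetical transactions (the document's original table was lost);
# the minimum support count is 2, as stated in the text.
transactions = [
    {"I1", "I2", "I5"},
    {"I4"},
    {"I2", "I3"},
    {"I1", "I2"},
    {"I2", "I3", "I5"},
]
MIN_SUPPORT_COUNT = 2

# Support count of an item = number of transactions that contain it.
counts = Counter(item for t in transactions for item in t)
frequent_items = {item: c for item, c in counts.items()
                  if c >= MIN_SUPPORT_COUNT}
print(frequent_items)  # I4 (support count 1) has been removed
```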

Step 2: From the items that survived the last step, generate all candidate itemsets
with 2 items. Check whether every subset of each candidate is frequent, and discard any
candidate with an infrequent subset. (For example, the subsets of { I2, I4 } are { I2 }
and { I4 }; since I4 was not frequent in the previous step, { I2, I4 } is never
considered.) Because I4 was discarded, no candidate containing I4 is generated.

Now remove every candidate whose support count is below the minimum support
count. The surviving 2-itemsets form the frequent set for this level.
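Step 2 can be sketched as follows, again on the hypothetical dataset (the surviving singletons I1, I2, I3, I5 are carried over from Step 1):

```python
from itertools import combinations

# Hypothetical transactions, as in the Step 1 sketch.
transactions = [
    {"I1", "I2", "I5"},
    {"I4"},
    {"I2", "I3"},
    {"I1", "I2"},
    {"I2", "I3", "I5"},
]
MIN_SUPPORT_COUNT = 2

def support_count(itemset):
    # Number of transactions containing every item of `itemset`.
    return sum(set(itemset) <= t for t in transactions)

frequent_1 = ["I1", "I2", "I3", "I5"]  # from Step 1; I4 was dropped

# Candidate 2-itemsets are built from frequent items only, so no candidate
# containing I4 is ever generated.
candidates = [frozenset(p) for p in combinations(frequent_1, 2)]
frequent_2 = {c for c in candidates if support_count(c) >= MIN_SUPPORT_COUNT}
print(sorted(sorted(c) for c in frequent_2))
```

On this dataset the frequent 2-itemsets come out as { I1, I2 }, { I2, I3 } and { I2, I5 }, which is exactly the set of pairs the rules in Step 4 are built from.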

Step 3: From the frequent 2-itemsets, generate candidate itemsets with 3 items,
again checking that every subset of each candidate is frequent and discarding
candidates with an infrequent subset.

In this case, if we select { I1, I2, I3 }, all of its 2-item subsets, that is
{ I1, I2 }, { I2, I3 } and { I1, I3 }, must be frequent, but { I1, I3 } is not in our
frequent set. The same is true for { I1, I3, I5 } and { I2, I3, I5 }.

So we stop here: there are no frequent 3-itemsets.
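The subset-based pruning of Step 3 can be demonstrated directly. Starting from the hypothetical frequent 2-itemsets of the earlier sketch, every 3-item candidate is eliminated:

```python
from itertools import combinations

# Hypothetical frequent 2-itemsets carried over from the Step 2 sketch.
frequent_2 = {frozenset(p) for p in [("I1", "I2"), ("I2", "I3"), ("I2", "I5")]}

# Candidate 3-itemsets: unions of two frequent 2-itemsets.
candidates = {a | b for a in frequent_2 for b in frequent_2 if len(a | b) == 3}

# Apriori pruning: keep a candidate only if ALL of its 2-item subsets are
# frequent. {I1, I2, I3} is pruned because {I1, I3} is not frequent, and so on.
surviving = [c for c in candidates
             if all(frozenset(s) in frequent_2 for s in combinations(c, 2))]
print(surviving)  # [] -> no frequent 3-itemsets; the growth phase stops here
```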

Step 4: Now that we have discovered all the frequent itemsets, we generate the strong
association rules. For that we calculate the confidence of each rule.

The possible association rules are:

1. I1 -> I2
2. I2 -> I3
3. I2 -> I5
4. I2 -> I1
5. I3 -> I2
6. I5 -> I2

So, Confidence( I1 -> I2 ) = SupportCount( I1 ∪ I2 ) / SupportCount( I1 )
= (2 / 2) × 100% = 100%.

We calculate the confidence of the other rules in the same way.

Since all of these rules have confidence ≥ 50%, all of them qualify as
strong association rules.
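The confidence computation can be checked mechanically. The dataset below is the same hypothetical one used in the earlier sketches, chosen to match the counts quoted in the text (SupportCount( I1 ∪ I2 ) = 2, SupportCount( I1 ) = 2):

```python
# Hypothetical transactions consistent with the counts quoted in the text.
transactions = [
    {"I1", "I2", "I5"},
    {"I4"},
    {"I2", "I3"},
    {"I1", "I2"},
    {"I2", "I3", "I5"},
]

def support_count(itemset):
    # Number of transactions containing every item of `itemset`.
    return sum(set(itemset) <= t for t in transactions)

# Confidence(A -> B) = SupportCount(A U B) / SupportCount(A)
rules = [("I1", "I2"), ("I2", "I3"), ("I2", "I5"),
         ("I2", "I1"), ("I3", "I2"), ("I5", "I2")]
confidence = {f"{a} -> {b}": support_count({a, b}) / support_count({a})
              for a, b in rules}
for rule, conf in confidence.items():
    print(f"{rule}: {conf:.0%}")  # every rule meets the 50% threshold
```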

Step 5: We calculate the lift of every strong association rule.

Lift( I1 -> I2 ) = Confidence( I1 -> I2 ) / Support( I2 )

Note that Support( I2 ) here is the fraction of all transactions that contain I2, so
lift is a dimensionless ratio, not a percentage. A lift greater than 1 means the two
items occur together more often than they would if purchases were independent, so the
rule reflects a genuine positive association; a lift below 1 means the items tend to
discourage each other.

Finally, we sort the rules by decreasing lift: the higher the lift, the stronger the
association between the items.
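A sketch of the lift calculation and the final sort, on the same hypothetical dataset as before. With this particular dataset every rule happens to come out at lift 1.25 (lift is symmetric in the two items, and all three frequent pairs have the same structure here), so the sorted order is a tie:

```python
# Hypothetical transactions, as in the earlier sketches; lift uses the
# *fractional* support of the consequent, so the result is a plain ratio.
transactions = [
    {"I1", "I2", "I5"},
    {"I4"},
    {"I2", "I3"},
    {"I1", "I2"},
    {"I2", "I3", "I5"},
]
n = len(transactions)

def support_count(itemset):
    return sum(set(itemset) <= t for t in transactions)

def lift(a, b):
    # Lift(A -> B) = Confidence(A -> B) / Support(B)
    conf = support_count({a, b}) / support_count({a})
    return conf / (support_count({b}) / n)

rules = [("I1", "I2"), ("I2", "I3"), ("I2", "I5"),
         ("I2", "I1"), ("I3", "I2"), ("I5", "I2")]
# Sort by decreasing lift; values above 1 indicate a positive association.
for a, b in sorted(rules, key=lambda r: lift(*r), reverse=True):
    print(f"{a} -> {b}: lift = {lift(a, b):.2f}")
```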
