2nd Place Solution: Instacart Market Basket Analysis

Agenda
• My Background
• Problem Overview
• Main Approach
• Feature Engineering
• Feature Importance
• Important Findings
• F1 maximization
My Background
• Bachelor of Economics
• Programmer in the financial industry
• Consultant in the financial industry
• 2nd Place at KDDCUP2015
• Data Scientist at Yahoo! JAPAN

Problem Overview
• In this competition, we have to predict reorders.
• So it is a little different from general recommendation.
• I mean,
Problem Overview
• How hot (user)?
*The prior set is treated as train

Problem Overview
• How hot (item)?
*Clipped at 500
Problem Overview
• The evaluation metric is the mean F1 score
• Precision and Recall
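
A minimal sketch of the metric, assuming each order is scored on the set of predicted product ids against the set of truly reordered ones, and that an order with no reorders is represented by the single pseudo-item "None". The function names are illustrative, not from the competition code.

```python
def f1_score(true_items, pred_items):
    """F1 between the true and predicted sets of product ids for one order."""
    true_items, pred_items = set(true_items), set(pred_items)
    tp = len(true_items & pred_items)      # correctly predicted items
    if tp == 0:
        return 0.0
    precision = tp / len(pred_items)       # share of predictions that are right
    recall = tp / len(true_items)          # share of true items that were found
    return 2 * precision * recall / (precision + recall)


def mean_f1(orders):
    """orders: list of (true_items, pred_items) pairs, one pair per order."""
    return sum(f1_score(t, p) for t, p in orders) / len(orders)
```

Because the score is averaged per order, an order where we predict items but the truth is None contributes 0 to the mean.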

Problem Overview
• Links between the files
Main Approach
• I built 2 models: one for predicting reorders and one for predicting None*
• The reorder model’s keys are user_id and product_id
• The None model’s key is user_id only
• I thought more training data would give better predictions
• So I decided to use prior as training data
• After tuning, the best number of windows was 3
• See next page for details
*None means the order contains no reorders
Main Approach
• We are given orders.csv
Main Approach
• We are given order_products.csv

Main Approach
• Reorder Prediction
Columns: user_id, product_id, label
Main Approach
• None Prediction
Columns: user_id, label
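
A rough sketch of how the two training tables could be built from one user's order history. It assumes that "window" means the number of most recent orders that are each used in turn as the label order, with everything before it as history; that reading, the function name, and the input layout are assumptions, not taken from the slides.

```python
def build_rows(user_id, orders, n_windows=3):
    """orders: list of sets of product ids for one user, oldest first.

    Returns (reorder_rows, none_rows):
      reorder_rows -> (user_id, product_id, label), one per previously bought item
      none_rows    -> (user_id, label), label 1 when the target order has no reorder
    """
    reorder_rows, none_rows = [], []
    for w in range(1, n_windows + 1):
        t = len(orders) - w                 # index of the order used as the label
        if t < 1:
            break                           # need at least one earlier order
        history = set().union(*orders[:t])  # everything bought before order t
        target = orders[t]
        for pid in sorted(history):
            reorder_rows.append((user_id, pid, int(pid in target)))
        none_rows.append((user_id, int(not (history & target))))
    return reorder_rows, none_rows
```

For example, `build_rows(1, [{"cola", "milk"}, {"cola"}, {"bread"}], n_windows=2)` yields a None label of 1 for the last order (nothing was reordered) and 0 for the one before it.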
Feature Engineering
• I made 4 types of features
1. User
• What this user is like
2. Item
• What this item is like
3. User x Item
• How the user feels about the item
4. Datetime
• What this day and hour are like
*For the None model, only the user and datetime features can be used directly, so I convert
the others to per-user stats (min, mean, max, sum, std…).
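
A minimal sketch of that conversion, assuming a user x item feature is stored as a dict keyed by (user_id, product_id). The function name and the data layout are illustrative; the population standard deviation is used here only so that a user with a single item gets 0 instead of an error.

```python
from collections import defaultdict
from statistics import mean, pstdev


def aggregate_to_user(useritem_feature):
    """Collapse {(user_id, product_id): value} into per-user summary stats."""
    by_user = defaultdict(list)
    for (user_id, _product_id), value in useritem_feature.items():
        by_user[user_id].append(value)
    return {
        user_id: {
            "min": min(values),
            "mean": mean(values),
            "max": max(values),
            "sum": sum(values),
            "std": pstdev(values),
        }
        for user_id, values in by_user.items()
    }
```

With a feature such as total_buy, the "max" entry of the result corresponds to a user-level feature like total_buy-max.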
Feature Importance for reorder
Feature Importance for None
Important Findings for reorder - 1
• Let’s think about the reordering problem. Common sense
tells us that an item purchased many times in the past has a
high probability of being reordered. However, there may be a
pattern for when the item is not reordered. We can try to
figure out this pattern and understand when a user doesn’t
repurchase an item.
• See next page for details

• user_id: 54035
• This user always reorders Cola.
• But at order number 8, the user didn’t. Why not?
• Probably because the user bought Fridge Pack Cola instead.
• I created features to catch this type of behavior.

• days_last_order-max is the difference between days_since_last_order_this_item and
useritem_order_days_max
• days_since_last_order_this_item is a user x item feature: how many days have passed
since the user last ordered this item
• useritem_order_days_max is also a user x item feature: the max span (in days)
between the user's orders of this item
• For more detail, see the next page

• Look at index 0: the user bought this item 14 days
ago, and the max span is 30 days
• So I think this feature says whether or not the user
is bored with that item
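
A small sketch of the feature, assuming a user's purchases of one item are given as day numbers on a common timeline and that "max span" means the longest gap between two consecutive purchases of that item. The function name and inputs are hypothetical.

```python
def days_last_order_minus_max(purchase_days, today):
    """purchase_days: sorted day numbers on which the user bought the item."""
    days_since_last = today - purchase_days[-1]
    gaps = [b - a for a, b in zip(purchase_days, purchase_days[1:])]
    max_span = max(gaps) if gaps else 0
    # Negative: still inside the user's longest usual gap.
    # Positive: the user has waited longer than ever before.
    return days_since_last - max_span
```

With purchases on days 0, 30 and 50 and today being day 64, the item was last bought 14 days ago and the max span is 30 days, so the feature is 14 - 30 = -16.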
• We already know fruits are reordered more frequently than vegetables (3
Million Instacart Orders, Open Sourced)
• I wanted to know how often
• So I made an item_10to1_ratio feature,
defined as the reorder ratio after
an item is ordered vs. not ordered.
• See the next page for more details

• Let’s say userA bought itemA at order_number 1 and 4
• And userB bought itemA at order_number 1 and 3
• item_10to1_ratio is 0.5
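
One reading that reproduces the 0.5 in this example: for each user, write the item's history as a 0/1 sequence per order (1 = bought), find every place where the item was bought and then skipped ("1, 0"), and measure how often the next order is a 1. This interpretation, the helper names, and the choice to end each sequence at the user's last known purchase are assumptions.

```python
def to_sequence(order_numbers):
    """Turn the order numbers containing the item into a 0/1 list, first to last purchase."""
    bought = set(order_numbers)
    return [int(n in bought) for n in range(min(bought), max(bought) + 1)]


def item_10to1_ratio(sequences):
    """Share of 'bought, then skipped' patterns that are followed by a purchase."""
    hits = total = 0
    for seq in sequences:
        for i in range(len(seq) - 2):
            if seq[i] == 1 and seq[i + 1] == 0:
                total += 1
                hits += seq[i + 2]
    return hits / total if total else None
```

userA gives [1, 0, 0, 1] (the "1, 0" is followed by 0) and userB gives [1, 0, 1] (followed by 1), so one of two patterns ends in a reorder.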
Important Findings for None - 1
• Useritem_sum_pos_cart(User A, Item B) is the average position in User A’s cart
that Item B falls into
• Useritem_sum_pos_cart-mean(User A) is the mean of the above feature across all
items
• So this feature essentially captures
the average position of an item in a user’s
cart, and we can see that users who
don’t buy many items at once are
more likely to be None

• total_buy is the total number of times a user has ordered an item
• If userA bought itemA 3 times
in the past, this would be 3
• So total_buy-max is the max of the above
feature per user
• We can see that it predicts
whether or not a user will make a reorder
• t-1_is_None(User A) is a binary feature that says whether or not the
user’s previous order was None.
• If the previous order is None,
then the next order will also be
None with 30% probability.

F1 maximization
• In this competition, the evaluation metric was an F1 score, which is a way of
capturing both precision and recall in a single metric.
• Thus, we needed to convert reorder probabilities into binary 1/0 (Yes/No)
numbers.
• However, in order to perform this conversion, we need to know a threshold. At
first, I used grid search to find a universal threshold of 0.2. But I saw
comments on the Kaggle discussion boards that said different orders should
have different thresholds.
• To understand why, let’s look at an example.

F1 maximization
• In the first example, the best threshold is between 0.3 and 0.9
• In the second example, the best threshold is lower than 0.2
• As shown, each order should have its own threshold
• But with the above calculation, we first have to prepare every pattern of
probabilities
• Thus I needed to come up with another calculation
• See the next page
F1 maximization
• Let’s say our model predicts Item A will be reordered with probability 0.9, and Item B with probability 0.3. I then
simulate 9,999 target labels (whether A and B will be ordered or not) using these probabilities.
• For example, the simulated labels might look like this.
• I then calculate the expected F1 score for each set of labels,
starting from the highest probability items, and then adding items
(e.g., [A], then [A, B], then [A, B, C], etc) until the F1 score
peaks and then decreases.
• We don’t need to evaluate every pattern
like A, B, AB…
• Because if we should select itemB, we should
select itemA as well, since its probability is higher

F1 maximization
• F1score_mean( , [A]) -> 0.809747641431
• F1score_mean( , [A,B]) -> 0.709004233757
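
A minimal sketch of the simulation described above, using only the standard library. The function names, the fixed seed, and the stopping rule written as "stop at the first k that does not improve" are assumptions for illustration; a simulated order with no items scores 0 here because None is not treated as a candidate.

```python
import random


def f1(truth, pred):
    """F1 between a set of true items and a set of predicted items."""
    tp = len(truth & pred)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(pred), tp / len(truth)
    return 2 * precision * recall / (precision + recall)


def expected_f1(probs, selected, n_sim=9999, seed=0):
    """Average F1 of `selected` over labels simulated from `probs` ({item: probability})."""
    rng = random.Random(seed)
    pred = set(selected)
    total = 0.0
    for _ in range(n_sim):
        # Draw one possible outcome: each item is reordered with its own probability.
        truth = {item for item, p in probs.items() if rng.random() < p}
        total += f1(truth, pred)
    return total / n_sim


def select_items(probs, n_sim=9999, seed=0):
    """Add items from highest to lowest probability until expected F1 stops rising."""
    ranked = sorted(probs, key=probs.get, reverse=True)
    best, best_score = [], 0.0
    for k in range(1, len(ranked) + 1):
        score = expected_f1(probs, ranked[:k], n_sim, seed)
        if score <= best_score:
            break
        best, best_score = ranked[:k], score
    return best, best_score
```

For probabilities 0.9 and 0.3 the exact expected values are 0.81 for [A] and 0.71 for [A, B], so the simulated scores should land close to the two numbers on the slide and `select_items` should keep only A.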

F1 maximization - Predicting None
• One way to think about None is as the probability (1 - Item A)
* (1 - Item B) * …
• But another method is to try to predict None as a special
case.
• By using our None model and treating None as just another
item, we can boost the F1 score from 0.400 to 0.407.
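
The first way of thinking about None can be written down directly. This sketch assumes the item probabilities are independent, which is exactly the simplification a dedicated None model avoids; the function name is illustrative.

```python
import math


def none_probability(item_probs):
    """P(no item is reordered), assuming independent item reorder probabilities."""
    return math.prod(1.0 - p for p in item_probs)
```

With Item A at 0.9 and Item B at 0.3, this gives (1 - 0.9) * (1 - 0.3) = 0.07.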