mod4
Item-based similarity focuses on identifying relationships between items based on user preferences.
The core idea is that if two items receive similar ratings from multiple users, they are likely to have
inherent similarities. This approach is widely used in recommendation systems and is particularly
effective when the number of items is less than the number of users, reducing computational
overhead.
How It Works
1. Data Representation: The user-item rating data is represented in a matrix or pivot table
format.
2. Similarity Calculation: Metrics such as cosine similarity, Pearson correlation, or Jaccard
similarity are used to calculate how similar two items are based on user ratings.
o Cosine Similarity: Measures the cosine of the angle between two vectors (rows in
the matrix).
o Pearson Correlation: Measures the linear relationship between two sets of ratings.
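The two metrics described above can be sketched in plain Python. This is a minimal illustration rather than a library implementation; the function names `cosine_similarity` and `pearson_correlation` and the sample vectors are assumptions made for this example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)

def pearson_correlation(a, b):
    """Linear correlation between two equal-length rating vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    # Pearson correlation equals the cosine similarity of the mean-centered vectors.
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)

# Hypothetical ratings of two items by the same four users.
item_x = [5, 3, 4, 1]
item_y = [4, 2, 3, 1]
print(round(cosine_similarity(item_x, item_y), 3))
print(round(pearson_correlation(item_x, item_y), 3))
```

Mean-centering is what separates the two metrics: cosine similarity is sensitive to how high the ratings are overall, while Pearson correlation only looks at how they move together.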
Consider a movie recommendation system with three users (User 1, User 2, and User 3) and five
movies (Movie A, Movie B, Movie C, Movie D, and Movie E). The ratings provided by the users are as
follows:
Movie     User 1   User 2   User 3
Movie A   5        4        5
Movie B   4        3        4
Movie C   1        2        1
Movie D   3        2        3
Movie E   5        4        5
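As a rough illustration, the ratings above can be turned into an item-item cosine similarity matrix with NumPy. The helper name `most_similar` is an assumption for this sketch, and the three rating columns are assumed to correspond to User 1, User 2, and User 3 in that order.

```python
import numpy as np

movies = ["Movie A", "Movie B", "Movie C", "Movie D", "Movie E"]
# Rows are movies; columns are assumed to be User 1, User 2, User 3.
ratings = np.array([
    [5, 4, 5],
    [4, 3, 4],
    [1, 2, 1],
    [3, 2, 3],
    [5, 4, 5],
], dtype=float)

# Scale each row to unit length; dot products are then cosine similarities.
unit_rows = ratings / np.linalg.norm(ratings, axis=1, keepdims=True)
item_similarity = unit_rows @ unit_rows.T

def most_similar(movie):
    """Return the other movie with the highest cosine similarity to `movie`."""
    i = movies.index(movie)
    scores = item_similarity[i].copy()
    scores[i] = -np.inf  # exclude the movie itself
    return movies[int(np.argmax(scores))]

print(most_similar("Movie A"))  # Movie E
```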
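Here, Movie A and Movie E receive identical ratings from all three users, so the cosine similarity
between them is exactly 1. Item-based similarity has several advantages: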
1. Stability: Item similarities tend to remain stable over time, unlike user preferences which
might change.
2. Cold Start: This method performs better for new users since it relies on item relationships,
not user profiles.
3. Efficiency: It is computationally efficient in systems with fewer items than users.
Applications
Movie Recommendations: Suggest movies similar to what a user has watched.
Music Platforms: Suggest songs or albums similar to the ones a user likes.
By identifying similar items using metrics like cosine similarity, item-based collaborative filtering
effectively provides meaningful and personalized recommendations.
User-based collaborative filtering focuses on identifying users who share similar preferences based
on their ratings of common items. The system then recommends items that similar users have rated
highly but the target user has not yet interacted with. This technique is beneficial in scenarios where
user preferences are diverse, and the goal is to leverage community opinions for recommendations.
How It Works
1. Data Representation:
o Values are ratings given by users to items (or NaN for unrated items).
2. Similarity Calculation:
o User-based similarity measures how closely two users are related based on their
rating patterns. Metrics like cosine similarity, Pearson correlation, or mean squared
difference are commonly used.
o For example, if User A and User B rate the same movies similarly, they are
considered similar.
3. Recommendation:
o Items rated highly by similar users are recommended to the target user. For instance,
if User B highly rated a movie that User A has not seen, the system might
recommend it to User A.
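The steps above can be sketched with NumPy as follows. The rating matrix is hypothetical, the helper names `user_similarity` and `predict` are assumptions for this example, and the similarity-weighted average is only one common way to combine neighbors' ratings; the text does not prescribe a specific formula.

```python
import numpy as np

# Hypothetical user-item matrix: rows are users, columns are items, NaN = unrated.
users = ["User A", "User B", "User C"]
ratings = np.array([
    [5.0, 4.0, np.nan, 1.0],  # User A has not rated item 2
    [5.0, 5.0, 4.0, 1.0],     # User B rates much like User A
    [1.0, 2.0, 1.0, 5.0],     # User C rates very differently
])

def user_similarity(u, v):
    """Cosine similarity computed only over items both users have rated."""
    both = ~np.isnan(u) & ~np.isnan(v)
    if not both.any():
        return 0.0
    a, b = u[both], v[both]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def predict(target, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    weighted_sum, weight_total = 0.0, 0.0
    for other in range(len(users)):
        if other == target or np.isnan(ratings[other, item]):
            continue
        sim = user_similarity(ratings[target], ratings[other])
        weighted_sum += sim * ratings[other, item]
        weight_total += abs(sim)
    return weighted_sum / weight_total if weight_total else float("nan")

# Predicted rating of item 2 for User A, who has not rated it.
print(round(predict(target=0, item=2), 2))
```

Because User B is the closer neighbor, User B's rating of the unseen item carries more weight in the prediction than User C's.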
The Surprise library is designed for building recommendation systems and includes pre-built
algorithms for collaborative filtering. It abstracts the complexity of matrix manipulations, making it
easier to implement advanced techniques.
data = Dataset.load_from_df(rating_df[['userId', 'movieId', 'rating']], reader)  # Load data into Surprise format
sim_options = {
    'user_based': True,  # True for user-based similarity, False for item-based
}
# Output results
2. Similarity Options:
o user_based: Set to True for user-based similarity (set to False for item-based
similarity).
3. KNNBasic Algorithm:
o Finds the K nearest neighbors for a user based on the specified similarity metric.
o Combines the ratings of neighbors to predict ratings for the target user.
4. Cross-Validation:
o Evaluates the model on each fold using RMSE (Root Mean Squared Error) and MAE
(Mean Absolute Error).
o The mean RMSE and MAE across folds indicate the model’s performance.
Detailed Output
1. Fold-wise RMSE and MAE: Error rates for each test fold.
2. Overall Performance:
o Average RMSE: Indicates the deviation of predicted ratings from actual ratings.
o Average MAE: Reflects the average absolute difference between predicted and actual
ratings.
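To make the two error measures concrete, here is a small plain-Python sketch. The function names `rmse` and `mae` and the sample ratings are assumptions for this example and are not taken from the Surprise API.

```python
import math

def rmse(actual, predicted):
    """Root Mean Squared Error: penalizes large errors more heavily."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mae(actual, predicted):
    """Mean Absolute Error: average size of the error, ignoring its sign."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical true and predicted ratings for one test fold.
actual = [4, 3, 5, 2]
predicted = [3.5, 3.0, 4.0, 3.0]
print(rmse(actual, predicted))  # 0.75
print(mae(actual, predicted))   # 0.625
```

RMSE is never smaller than MAE on the same predictions, and the gap widens when a few predictions are far off.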
1. Ease of Use: Abstracts the complexity of implementing similarity calculations and
recommendations.
3. Evaluation Tools: Provides in-built methods for cross-validation and accuracy measurement.
Applications
By leveraging the Surprise library, implementing user-based collaborative filtering becomes efficient
and intuitive, allowing for rapid prototyping and experimentation with recommendation systems.