Global Baseline Estimate - 12S21009
RECOMMENDER SYSTEM
By:
Mikhael Janugrah Pakpahan - 12S21009
Bachelor's Program (S1) in Information Systems
We first need to load the dataset, which contains user ratings for different movies. The dataset has
users as rows and movies as columns, with ratings as values. Unrated movies are represented as `NaN`.
• We load the dataset using `pandas` and set the `user_id` column as the index.
• We check for duplicate user IDs to ensure data consistency.
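The loading step above can be sketched as follows. This is a minimal illustration, not the original code: the inline CSV text, the movie column names, and the choice to keep the first row for a duplicated user are all assumptions made so the example runs on its own.

```python
import io

import pandas as pd

# Hypothetical stand-in for the ratings file; a real run would pass a file path.
csv_text = """user_id,movie_a,movie_b,movie_c
u1,5,3,
u2,4,,2
u3,,1,5
"""

# Users become the index; empty cells are read as NaN (unrated).
ratings = pd.read_csv(io.StringIO(csv_text), index_col="user_id")

# Count repeated user IDs; a consistent dataset has none.
n_duplicates = int(ratings.index.duplicated().sum())
if n_duplicates:
    # One possible policy (an assumption): keep the first row per user.
    ratings = ratings[~ratings.index.duplicated(keep="first")]

print(ratings)
print("duplicate user IDs:", n_duplicates)
```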
In collaborative filtering, it’s important to center the ratings by removing the mean. This
helps compare users’ ratings by eliminating individual biases, such as a tendency to rate everything high.
Normalization: Subtract the movie’s mean rating from the user ratings. This helps in comparing users’
ratings more fairly by removing biases.
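The text mentions both removing individual (per-user) bias and subtracting the movie's mean, so the sketch below shows both forms of centering side by side. The small matrix and the variable names are illustrative assumptions; which variant the original code used is not stated.

```python
import numpy as np
import pandas as pd

# Toy user x movie matrix (hypothetical values); NaN marks an unrated movie.
ratings = pd.DataFrame(
    {
        "movie_a": [5.0, 4.0, np.nan],
        "movie_b": [3.0, np.nan, 1.0],
        "movie_c": [np.nan, 2.0, 5.0],
    },
    index=pd.Index(["u1", "u2", "u3"], name="user_id"),
)

# Per-user centering: subtract each user's own mean (NaN is skipped by default).
user_means = ratings.mean(axis=1)
centered_by_user = ratings.sub(user_means, axis=0)

# Per-movie centering: subtract each movie's mean rating instead.
movie_means = ratings.mean(axis=0)
centered_by_movie = ratings.sub(movie_means, axis=1)

print(centered_by_user)
print(centered_by_movie)
```

In both variants the unrated cells stay `NaN`, so later steps can still tell a missing rating apart from an average one.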
Step 4: Finding Similar Users (Neighbors)
Next, we find users who have similar tastes to the target user using cosine similarity. Cosine similarity
measures how similar two users’ rating patterns are.
• Cosine similarity: Measures how similar users are by looking at the angle between their rating
vectors.
• We calculate the cosine similarity between the target user and all other users and retrieve the
top `k` most similar users.
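As a rough sketch of the neighbor search described above, cosine similarity between two rating vectors is their dot product divided by the product of their lengths. The function name `top_k_neighbors`, the toy data, and the choice to treat missing centered ratings as 0 are assumptions for illustration.

```python
import numpy as np
import pandas as pd

def top_k_neighbors(centered, target_user, k=2):
    """Return the k users most similar to target_user by cosine similarity."""
    # Assumption: after centering, a missing rating contributes 0 to the vector.
    filled = centered.fillna(0.0)
    target = filled.loc[target_user].to_numpy()
    others = filled.drop(index=target_user)
    dots = others.to_numpy() @ target
    norms = np.linalg.norm(others.to_numpy(), axis=1) * np.linalg.norm(target)
    # Users with an all-zero vector get similarity 0 instead of a division error.
    sims = np.divide(dots, norms, out=np.zeros_like(dots), where=norms > 0)
    return pd.Series(sims, index=others.index).sort_values(ascending=False).head(k)

# Toy mean-centered ratings (hypothetical values).
centered = pd.DataFrame(
    [
        [1.0, -1.0, np.nan],
        [1.0, np.nan, -1.0],
        [-1.0, 1.0, np.nan],
        [2.0, -2.0, np.nan],
    ],
    index=["u1", "u2", "u3", "u4"],
    columns=["movie_a", "movie_b", "movie_c"],
)

neighbors = top_k_neighbors(centered, "u1", k=2)
print(neighbors)
```

Here `u4` rates in exactly the same direction as `u1`, so it ranks first, while `u3` rates in the opposite direction and is excluded from the top 2.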
To predict how a user will rate a specific movie, we look at their neighbors’ ratings for that movie.
The prediction is adjusted based on how similar the neighbors are to the target user.
• For each of the neighbors, we adjust their rating for the movie by subtracting their
baseline rating and multiplying it by their similarity to the target user.
• The final prediction is the baseline rating for the user-movie pair, adjusted by the
weighted sum of the neighbors’ opinions.
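The prediction step can be sketched as below. The text does not define the baseline, so this example assumes the usual global baseline estimate (global mean plus user deviation plus movie deviation) and divides the weighted sum by the total absolute similarity; both choices, along with the function names and the neighbor similarities, are assumptions rather than the original implementation.

```python
import numpy as np
import pandas as pd

def baseline(ratings, user, movie):
    """Assumed baseline: global mean + user deviation + movie deviation."""
    mu = np.nanmean(ratings.to_numpy())
    b_user = ratings.loc[user].mean() - mu
    b_movie = ratings[movie].mean() - mu
    return mu + b_user + b_movie

def predict_rating(ratings, neighbors, user, movie):
    """Baseline for (user, movie) plus similarity-weighted neighbor deviations."""
    num, den = 0.0, 0.0
    for neighbor, sim in neighbors.items():
        r = ratings.loc[neighbor, movie]
        if pd.isna(r):
            continue  # this neighbor has not rated the movie
        num += sim * (r - baseline(ratings, neighbor, movie))
        den += abs(sim)
    adjustment = num / den if den > 0 else 0.0
    return baseline(ratings, user, movie) + adjustment

# Toy data (hypothetical): u1 has not rated movie_c.
ratings = pd.DataFrame(
    {
        "movie_a": [4.0, 5.0, 3.0],
        "movie_b": [2.0, 3.0, 1.0],
        "movie_c": [np.nan, 5.0, 1.0],
    },
    index=["u1", "u2", "u3"],
)
# Hypothetical similarities of u1's neighbors.
neighbors = pd.Series({"u2": 0.8, "u3": 0.2})

prediction = predict_rating(ratings, neighbors, "u1", "movie_c")
print(round(prediction, 3))
```

If no neighbor has rated the movie, the sketch falls back to the baseline alone.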
To generate recommendations, we predict the ratings for all movies the user hasn’t seen yet and return
the top `n` movies with the highest predicted ratings.
• We loop through all movies the user hasn’t rated, predict a rating for each, and return the
top `n` movies with the highest predicted ratings.
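The recommendation loop can be sketched independently of how ratings are predicted. In this illustration, `recommend` accepts any prediction function; the stand-in predictor that returns the movie's mean rating is a placeholder assumption used only to keep the example self-contained.

```python
import numpy as np
import pandas as pd

def recommend(ratings, user, predict, n=2):
    """Return the n unrated movies with the highest predicted rating for user."""
    unrated = ratings.columns[ratings.loc[user].isna()]
    predictions = {movie: predict(user, movie) for movie in unrated}
    return pd.Series(predictions, dtype=float).sort_values(ascending=False).head(n)

# Toy data (hypothetical): u1 has rated only movie "a".
ratings = pd.DataFrame(
    {
        "a": [5.0, 4.0, 3.0],
        "b": [np.nan, 2.0, 1.0],
        "c": [np.nan, 5.0, 4.0],
        "d": [np.nan, 3.0, 3.0],
    },
    index=["u1", "u2", "u3"],
)

# Stand-in predictor (an assumption for this demo): the movie's mean rating.
def movie_mean_predictor(user, movie):
    return ratings[movie].mean()

top = recommend(ratings, "u1", movie_mean_predictor, n=2)
print(top)
```

In a full system the `predict` argument would be the neighbor-based prediction function rather than a plain movie mean.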
Example Output:
Implementing Item-Based Collaborative Filtering (IBCF)
Step 1
In the original UBCF code, similarities between users were calculated. To implement IBCF,
you need to calculate item-to-item similarities instead. This is done by transposing the data
matrix so that the items become rows and users become columns. After transposing, you can
compute cosine similarity between items.
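A minimal sketch of the transpose-then-compare idea follows. The function name `item_similarity`, the toy ratings, and the choice to center each item by its own mean and treat missing values as 0 are assumptions for illustration.

```python
import numpy as np
import pandas as pd

def item_similarity(ratings):
    """Cosine similarity between items, computed on the transposed matrix."""
    items = ratings.T  # rows are now items, columns are users
    # Assumption: center each item by its mean and treat missing ratings as 0.
    centered = items.sub(items.mean(axis=1), axis=0).fillna(0.0)
    m = centered.to_numpy()
    norms = np.linalg.norm(m, axis=1)
    denom = np.outer(norms, norms)
    # Items with an all-zero vector get similarity 0 instead of a division error.
    sims = np.divide(m @ m.T, denom, out=np.zeros_like(denom), where=denom > 0)
    return pd.DataFrame(sims, index=items.index, columns=items.index)

# Toy user x movie ratings (hypothetical values).
ratings = pd.DataFrame(
    {"a": [5.0, 3.0, 1.0], "b": [4.0, 3.0, 2.0], "c": [1.0, 3.0, 5.0]},
    index=["u1", "u2", "u3"],
)

item_sim = item_similarity(ratings)
print(item_sim.round(2))
```

Movies `a` and `b` are rated in the same order by all three users, so their similarity is 1, while `c` is rated in the reverse order and gets -1.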
In UBCF, the function identifies the most similar users for a given target user. In IBCF, we need to
identify the most similar items for a given target item.
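Given an item-to-item similarity matrix, looking up the nearest items reduces to sorting one row. The sketch below assumes such a matrix already exists; the function name `top_k_similar_items` and the similarity values are hypothetical.

```python
import pandas as pd

def top_k_similar_items(item_sim, target_item, k=2):
    """Return the k items most similar to target_item, excluding the item itself."""
    row = item_sim.loc[target_item].drop(target_item)
    return row.sort_values(ascending=False).head(k)

# Hypothetical item-to-item similarity matrix.
labels = ["a", "b", "c", "d"]
item_sim = pd.DataFrame(
    [
        [1.0, 0.9, -0.2, 0.4],
        [0.9, 1.0, 0.1, 0.3],
        [-0.2, 0.1, 1.0, 0.0],
        [0.4, 0.3, 0.0, 1.0],
    ],
    index=labels,
    columns=labels,
)

similar = top_k_similar_items(item_sim, "a", k=2)
print(similar)
```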
Step 3: Adjust the Prediction Function
In UBCF, rating predictions are based on ratings from similar users. In IBCF, you need to adjust the
logic to predict ratings based on the ratings that the target user has given to similar items.
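One way to sketch the adjusted prediction is a similarity-weighted average of the target user's own ratings on the most similar items. Restricting to positively similar items and falling back to the user's mean rating are assumptions of this example, as are the function name and the data.

```python
import numpy as np
import pandas as pd

def predict_rating_ibcf(ratings, item_sim, user, item, k=2):
    """Predict user's rating for item from the user's ratings of similar items."""
    user_ratings = ratings.loc[user].dropna()
    rated = user_ratings.index.drop(item, errors="ignore")
    sims = item_sim.loc[item, rated]
    # Assumption: use only the k most similar items with positive similarity.
    sims = sims[sims > 0].sort_values(ascending=False).head(k)
    if sims.empty:
        # Fallback (an assumption): the user's own mean rating.
        return float(user_ratings.mean())
    return float((sims * user_ratings[sims.index]).sum() / sims.sum())

labels = ["a", "b", "c", "d"]
# Toy data (hypothetical): u1 has not rated item "c".
ratings = pd.DataFrame([[5.0, 3.0, np.nan, 1.0]], index=["u1"], columns=labels)
item_sim = pd.DataFrame(
    [
        [1.0, 0.3, 0.8, -0.1],
        [0.3, 1.0, 0.2, 0.4],
        [0.8, 0.2, 1.0, -0.5],
        [-0.1, 0.4, -0.5, 1.0],
    ],
    index=labels,
    columns=labels,
)

prediction = predict_rating_ibcf(ratings, item_sim, "u1", "c", k=2)
print(round(prediction, 2))
```

The prediction leans toward the rating of `a` because `a` is far more similar to `c` than `b` is.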
The final function in the UBCF system generates recommendations by predicting ratings for items a
user has not rated, based on user similarities. In IBCF, modify this to predict ratings for unrated items
based on item similarities.
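The item-based recommendation function can be sketched as a loop over the user's unrated items. The similarity values are made up, and ignoring negative similarities (via `clip`) is an assumption of this example rather than something the text specifies.

```python
import numpy as np
import pandas as pd

def recommend_ibcf(ratings, item_sim, user, n=2):
    """Return the n unrated items with the highest item-based predicted rating."""
    user_ratings = ratings.loc[user]
    rated = user_ratings.dropna()
    unrated = user_ratings.index[user_ratings.isna()]
    scores = {}
    for item in unrated:
        # Assumption: negative similarities are ignored.
        sims = item_sim.loc[item, rated.index].clip(lower=0)
        if sims.sum() > 0:
            scores[item] = float((sims * rated).sum() / sims.sum())
    return pd.Series(scores, dtype=float).sort_values(ascending=False).head(n)

labels = ["a", "b", "c", "d", "e"]
# Toy data (hypothetical): u1 rated "a" high and "b" low.
ratings = pd.DataFrame([[5.0, 1.0, np.nan, np.nan, np.nan]], index=["u1"], columns=labels)

# Hypothetical symmetric item similarity matrix.
item_sim = pd.DataFrame(np.eye(5), index=labels, columns=labels)
pairs = [("c", "a", 0.9), ("c", "b", 0.1), ("d", "a", 0.2),
         ("d", "b", 0.8), ("e", "a", 0.5), ("e", "b", 0.5)]
for i, j, s in pairs:
    item_sim.loc[i, j] = item_sim.loc[j, i] = s

top = recommend_ibcf(ratings, item_sim, "u1", n=2)
print(top)
```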
Step 5: Test the IBCF Module
After implementing the above changes, you should test your IBCF module using sample data. Ensure
that the system generates reasonable item recommendations based on item similarities.
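A sanity check along these lines might use sample data with two clearly separated groups of movies and confirm that a fan of one group is recommended the unseen movie from that group. Everything here (movie names, ratings, the compact helper functions) is a hypothetical sketch, not the original test.

```python
import numpy as np
import pandas as pd

def item_similarity(ratings):
    """Cosine similarity between mean-centered item vectors (missing treated as 0)."""
    items = ratings.T
    m = items.sub(items.mean(axis=1), axis=0).fillna(0.0).to_numpy()
    norms = np.linalg.norm(m, axis=1)
    denom = np.outer(norms, norms)
    sims = np.divide(m @ m.T, denom, out=np.zeros_like(denom), where=denom > 0)
    return pd.DataFrame(sims, index=items.index, columns=items.index)

def recommend_ibcf(ratings, item_sim, user, n=2):
    """Rank the user's unrated items by similarity-weighted average rating."""
    rated = ratings.loc[user].dropna()
    unrated = ratings.columns[ratings.loc[user].isna()]
    scores = {}
    for item in unrated:
        sims = item_sim.loc[item, rated.index].clip(lower=0)
        if sims.sum() > 0:
            scores[item] = float((sims * rated).sum() / sims.sum())
    return pd.Series(scores, dtype=float).sort_values(ascending=False).head(n)

# Sample data: u1 likes action movies and has not seen action3 or romance2.
ratings = pd.DataFrame(
    [
        [5.0, 5.0, np.nan, 1.0, np.nan],
        [5.0, 4.0, 5.0, 1.0, 2.0],
        [4.0, 5.0, 5.0, 2.0, 1.0],
        [1.0, 2.0, 1.0, 5.0, 5.0],
        [2.0, 1.0, 2.0, 4.0, 5.0],
    ],
    index=["u1", "u2", "u3", "u4", "u5"],
    columns=["action1", "action2", "action3", "romance1", "romance2"],
)

item_sim = item_similarity(ratings)
recs = recommend_ibcf(ratings, item_sim, "u1", n=2)

# Checks: the unseen action movie ranks first and predictions stay on the 1-5 scale.
assert recs.index[0] == "action3"
assert recs["action3"] > recs["romance2"]
assert all(1.0 - 1e-9 <= v <= 5.0 + 1e-9 for v in recs)
print(recs)
```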