RS2
Experiment No 6
Jay Topiwala 60009220169 D058/D1
Aim: Implement a constraint-based recommender system on an appropriate dataset.
Theory:
Key Features:
1. Rule-Based Filtering: Uses logical constraints to eliminate items that do not satisfy given
conditions.
2. Explicit User Preferences: Users define preferences such as genre, rating, release year, or
specific attributes.
3. No Need for Past User Data: Unlike collaborative filtering, it does not require historical
interactions.
4. Deterministic Recommendations: Always produces the same output given the same
constraints.
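The features above can be illustrated with a minimal sketch. It assumes a tiny, made-up catalogue and a hypothetical helper `filter_by_constraints`; neither comes from the notebook used in this experiment. Each constraint is a plain predicate, an item is kept only if it satisfies all of them, and no interaction history is consulted.

```python
from typing import Callable, Dict, List

# Made-up catalogue for illustration only (not rows from the IMDb dataset).
CATALOGUE: List[dict] = [
    {"name": "Movie A", "genre": "Comedy, Drama", "rating": 8.1, "year": 2015},
    {"name": "Movie B", "genre": "Action", "rating": 6.4, "year": 2019},
    {"name": "Movie C", "genre": "Comedy", "rating": 7.9, "year": 1998},
]

Constraint = Callable[[dict], bool]


def filter_by_constraints(items: List[dict],
                          constraints: Dict[str, Constraint]) -> List[dict]:
    """Keep only the items that satisfy every constraint (hypothetical helper)."""
    return [item for item in items
            if all(check(item) for check in constraints.values())]


# Explicit user preferences expressed as rules.
constraints: Dict[str, Constraint] = {
    "genre": lambda m: "comedy" in m["genre"].lower(),
    "rating": lambda m: m["rating"] >= 7.5,
    "year": lambda m: m["year"] >= 2000,
}

first = filter_by_constraints(CATALOGUE, constraints)
second = filter_by_constraints(CATALOGUE, constraints)

print([m["name"] for m in first])  # ['Movie A']
print(first == second)             # True: same constraints, same output
```

Movie B fails the genre rule and Movie C fails the year rule, so only Movie A remains, and repeating the call gives the identical result.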
[47]: imdb_df
4 Bobby Deol Aishwarya Rai Bachchan Shammi Kapoor
… … … …
15503 Naseeruddin Shah Sumeet Saigal Suparna Anand
15504 Akshay Kumar Twinkle Khanna Aruna Irani
15505 Sangeeta Tiwari NaN NaN
15506 NaN NaN NaN
15507 Dharmendra Jaya Prada Arjun Sarja
[52]: imdb_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 15508 entries, 0 to 15507
Data columns (total 10 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 name 15508 non-null object
1 year 14981 non-null Int64
2 duration 7240 non-null object
3 genre 15508 non-null object
4 rating 15508 non-null float64
5 votes 7920 non-null object
6 director 15508 non-null object
7 actor 1 15508 non-null object
8 actor 2 15508 non-null object
9 actor 3 15508 non-null object
dtypes: Int64(1), float64(1), object(8)
memory usage: 1.2+ MB
[53]: imdb_df.head()
3 …And Once Again 2010 105 min Drama
4 …Aur Pyaar Ho Gaya 1997 147 min Comedy, Drama, Musical
actor 3
0 Arvind Jangid
1 Roy Angana
2 Siddhant Kapoor
3 Antara Mali
4 Shammi Kapoor
[54]: print(imdb_df.isnull().sum())
name 0
year 527
duration 8268
genre 0
rating 0
votes 7588
director 0
actor 1 0
actor 2 0
actor 3 0
dtype: int64
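The counts above show that `duration` and `votes` are the sparsest columns, and `info()` reports both as `object` columns; the sample rows show values such as `105 min` and `11,364`. Before either column could be used in a numeric constraint it would need parsing. A minimal sketch, assuming those two string formats hold throughout the dataset; `parse_duration` and `parse_votes` are hypothetical helpers, not part of the notebook.

```python
import re
from typing import Optional


def parse_duration(value) -> Optional[int]:
    """Turn a string like '105 min' into 105; return None for anything else."""
    if not isinstance(value, str):  # missing values arrive as float NaN
        return None
    match = re.fullmatch(r"\s*(\d+)\s*min\s*", value)
    return int(match.group(1)) if match else None


def parse_votes(value) -> Optional[int]:
    """Turn a string like '11,364' into 11364; return None for anything else."""
    if not isinstance(value, str):
        return None
    digits = value.replace(",", "").strip()
    return int(digits) if digits.isdigit() else None


print(parse_duration("105 min"))     # 105
print(parse_votes("11,364"))         # 11364
print(parse_duration(float("nan")))  # None
```

With pandas, such helpers could be applied column-wise, for example `imdb_df["duration"].map(parse_duration)`.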
[55]: imdb_df.describe()
[56]: imdb_df["genre"].unique()
'Horror, Romance, Thriller', 'Comedy, Drama, Romance', 'Thriller',
'Comedy, Drama', 'Unknown', 'Comedy, Drama, Fantasy',
'Comedy, Drama, Family', 'Crime, Drama, Mystery',
'Horror, Thriller', 'Biography', 'Comedy, Horror', 'Action',
'Drama, Horror, Mystery', 'Comedy', 'Action, Thriller',
'Drama, History', 'Drama, History, Sport',
'Horror, Mystery, Romance', 'Horror, Mystery',
'Drama, Horror, Romance', 'Action, Drama, History',
'Action, Drama, War', 'Comedy, Family',
'Adventure, Horror, Mystery', 'Action, Sci-Fi',
'Crime, Mystery, Thriller', 'War', 'Sport',
'Biography, Drama, History', 'Horror, Romance', 'Crime, Drama',
'Drama, Romance', 'Adventure, Drama', 'Comedy, Mystery, Thriller',
'Action, Crime, Drama', 'Crime, Thriller',
'Horror, Sci-Fi, Thriller', 'Crime, Drama, Thriller',
'Drama, Mystery, Thriller', 'Drama, Sport',
'Drama, Family, Musical', 'Action, Comedy', 'Comedy, Thriller',
'Action, Adventure, Fantasy', 'Drama, Romance, Thriller',
'Action, Drama', 'Drama, Horror, Musical',
'Action, Biography, Drama', 'Adventure, Comedy, Drama', 'Mystery',
'Action, Fantasy, Mystery', 'Adventure, Drama, Mystery',
'Mystery, Thriller', 'Adventure', 'Drama, Musical, Thriller',
'Comedy, Crime, Drama', 'Musical, Romance', 'Documentary, Music',
'Documentary, History, Music', 'Drama, Fantasy, Mystery',
'Drama, Family, Sport', 'Drama, Thriller',
'Documentary, Biography', 'Action, Adventure, Comedy', 'Romance',
'Comedy, Drama, Music', 'Comedy, Horror, Mystery', 'Musical',
'Musical, Romance, Drama', 'Family, Romance',
'Action, Sci-Fi, Thriller', 'Action, Drama, Romance',
'Mystery, Romance', 'Fantasy', 'Family', 'Drama, Family',
'Action, Comedy, Drama', 'Action, Drama, Thriller',
'Drama, Horror, Thriller', 'Drama, Musical, Romance',
'Comedy, Sci-Fi', 'Action, Romance', 'Action, Crime',
'Action, Drama, Crime', 'Drama, Family, Music',
'Action, Mystery, Thriller', 'Action, Drama, Family',
'Action, Mystery', 'Drama, History, Romance',
'Crime, Drama, Music', 'Sci-Fi', 'Animation',
'Crime, Mystery, Romance', 'Action, Adventure, Romance',
'Music, Romance', 'Action, Comedy, Crime',
'Comedy, Family, Fantasy', 'Romance, Drama',
'Drama, Family, Romance', 'Romance, Drama, Family',
'Musical, Romance, Thriller', 'Family, Musical, Romance',
'Action, Drama, Fantasy', 'Family, Drama', 'Crime, Drama, Romance',
'Musical, Drama, Romance', 'Drama, Music, Musical',
'Drama, Mystery', 'Adventure, Comedy, Romance',
'Crime, Drama, Horror', 'Family, Music, Musical',
'Action, Musical, Thriller', 'Action, Romance, Thriller',
'Romance, Thriller', 'Drama, Music', 'Crime, Drama, Musical',
'Action, Crime, Mystery', 'Action, Adventure, Thriller',
'Comedy, Romance, Sci-Fi', 'Crime', 'Action, Drama, Mystery',
'Action, Comedy, Thriller', 'Biography, Drama',
'Action, Comedy, Fantasy', 'Drama, Family, Horror',
'Action, Adventure, Family', 'Documentary, Biography, Musical',
'Action, Drama, Musical', 'Adventure, Thriller', 'Crime, Mystery',
'Drama, Crime', 'Drama, Fantasy, Romance',
'Comedy, Romance, Thriller', 'Musical, Comedy, Drama',
'Biography, History, War', 'Action, Comedy, Romance',
'Drama, History, Musical', 'Action, Crime, Horror',
'Adventure, Fantasy', 'Adventure, Drama, Fantasy',
'Adventure, Fantasy, Romance', 'Action, Adventure, Drama',
'Action, Adventure', 'Comedy, Crime', 'Crime, Drama, Fantasy',
'Adventure, Drama, Romance', 'History', 'Drama, Fantasy, Thriller',
'Musical, Fantasy', 'Documentary, Thriller',
'Mystery, Romance, Musical', 'Family, Drama, Romance',
'History, Musical, Romance', 'Musical, Drama, Crime',
'Adventure, Crime, Romance', 'Musical, Thriller, Mystery',
'Drama, Comedy', 'Biography, Drama, Romance', 'Biography, Music',
'Biography, Drama, Music', 'Drama, Sci-Fi',
'Drama, Family, Thriller', 'Comedy, Musical, Romance',
'Drama, Family, Comedy', 'Action, Thriller, Romance',
'Animation, Adventure', 'Action, Crime, Musical',
'Action, Crime, Romance', 'Animation, Action, Adventure',
'Action, Drama, Sport', 'Comedy, History', 'Documentary, History',
'Drama, Comedy, Family', 'Action, Adventure, Crime',
'Documentary, Biography, Music', 'Comedy, Musical',
'Biography, Crime, Thriller', 'Adventure, Mystery, Thriller',
'Biography, Drama, Sport', 'Action, Comedy, Musical',
'Mystery, Romance, Thriller', 'Action, Adventure, Musical',
'Crime, Musical, Mystery', 'Action, Thriller, Crime',
'Adventure, Comedy, Crime', 'Comedy, Horror, Musical',
'Adventure, Family', 'Family, Thriller', 'Drama, Action, Crime',
'Drama, War', 'Action, Drama, Adventure',
'Adventure, Fantasy, History', 'Fantasy, Musical',
'Comedy, Drama, Thriller', 'Drama, Fantasy', 'Musical, Drama',
'Action, Drama, Horror', 'Biography, Crime, Drama',
'Action, Drama, Music', 'Adventure, Drama, Family',
'Drama, Romance, Musical', 'Comedy, Musical, Drama',
'Adventure, Comedy, Musical', 'Crime, Drama, Family',
'Thriller, Musical, Mystery', 'Documentary, Adventure, Crime',
'Drama, Action, Horror', 'Adventure, Crime, Drama',
'Documentary, Biography, Sport', 'Crime, Fantasy, Mystery',
'Documentary, Biography, Drama', 'Action, Fantasy, Thriller',
'Adventure, Drama, History', 'Animation, Drama, History',
'Comedy, Horror, Thriller', 'Drama, Family, History',
'Animation, History', 'Biography, Drama, Musical', 'Music',
'Family, Comedy', 'Adventure, Mystery', 'Family, Fantasy',
'Documentary, History, News', 'Drama, Mystery, Romance',
'Comedy, Fantasy', 'Action, Crime, Family',
'Drama, Musical, Mystery', 'Action, Thriller, Mystery',
'Drama, Family, Fantasy', 'Action, Family',
'Action, Adventure, Mystery', 'Horror, Fantasy', 'Comedy, Action',
'Adventure, Romance', 'Drama, Adventure',
'Animation, Drama, Romance', 'Comedy, Crime, Romance',
'Adventure, Comedy', 'Comedy, Drama, Sport',
'Documentary, Crime, History', 'Musical, Mystery, Drama',
'Adventure, Drama, Sci-Fi', 'Action, Romance, Western',
'Comedy, Fantasy, Romance', 'Animation, Action, Comedy',
'Drama, Fantasy, Sci-Fi', 'Drama, Horror', 'Family, Drama, Comedy',
'Action, Adventure, History', 'Comedy, Family, Romance',
'Biography, History', 'Animation, Family',
'Drama, Fantasy, History', 'Animation, Adventure, Fantasy',
'Adventure, Comedy, Family', 'Drama, History, War',
'Animation, Drama, Fantasy', 'Action, Musical, Romance',
'Crime, Action, Drama', 'Comedy, Romance, Musical',
'Fantasy, Drama', 'Musical, Action, Crime', 'Documentary, Drama',
'Action, Horror, Thriller', 'Action, Horror, Sci-Fi',
'Mystery, Sci-Fi, Thriller', 'Biography, Family',
'Drama, Action, Comedy', 'Drama, Music, Romance',
'Action, Biography, Crime', 'Adventure, Drama, Musical',
'Family, Music, Romance', 'Fantasy, Mystery, Romance',
'Drama, Crime, Family', 'Drama, Family, Action',
'Romance, Comedy, Drama', 'Animation, Adventure, Comedy',
'Sci-Fi, Thriller', 'Romance, Family, Drama',
'Action, Family, Thriller', 'Adventure, Crime, Thriller',
'Drama, Romance, Sport', 'Comedy, Crime, Mystery',
'Adventure, Comedy, Mystery', 'Action, Fantasy', 'Comedy, Mystery',
'Animation, Adventure, Family', 'Adventure, Drama, Music',
'Biography, Drama, War', 'Documentary, Comedy, Drama',
'Musical, Drama, Family', 'Animation, Comedy, Drama',
'Fantasy, Musical, Drama', 'Adventure, Crime, Mystery',
'Comedy, Drama, Mystery', 'Documentary, News',
'Drama, Musical, Family', 'Action, Romance, Drama',
'Comedy, Crime, Thriller', 'Action, Musical', 'Action, History',
'Action, Comedy, Mystery', 'Drama, Family, Mystery',
'Adventure, Drama, Thriller', 'Documentary, Reality-TV',
'Action, Fantasy, Horror', 'Drama, History, Thriller',
'Documentary, Family', 'Documentary, Biography, Family',
'Comedy, Sport', 'Animation, Comedy, Family',
'Crime, Romance, Thriller', 'Comedy, Musical, Action',
'Action, Mystery, Sci-Fi', 'Comedy, Crime, Musical',
'Drama, Adventure, Action', 'History, Romance', 'Reality-TV',
'Fantasy, History', 'Family, Drama, Thriller',
'Musical, Mystery, Thriller', 'Musical, Comedy, Romance',
'Musical, Action, Drama', 'Action, Musical, War',
'Romance, Comedy', 'Horror, Crime, Thriller',
'Crime, Drama, History', 'Comedy, Drama, Horror',
'Crime, Horror, Thriller', 'Animation, Comedy',
'Romance, Action, Crime', 'Musical, Thriller',
'Action, Romance, Comedy', 'Comedy, Family, Musical',
'Horror, Drama, Mystery', 'Thriller, Mystery, Family',
'Comedy, Drama, Sci-Fi', 'Documentary, Adventure',
'Documentary, Biography, Crime', 'Musical, Action',
'Musical, Mystery', 'Action, Crime, Sci-Fi',
'Action, Horror, Mystery', 'Fantasy, Horror',
'Adventure, Family, Fantasy', 'Fantasy, Sci-Fi', 'Comedy, War',
'Romance, Action, Drama', 'Musical, Family, Romance',
'Romance, Drama, Action', 'Family, Comedy, Drama',
'Comedy, Music, Romance', 'Comedy, Family, Sci-Fi',
'Action, Drama, Western', 'Adventure, Romance, Thriller',
'Biography, Comedy, Drama', 'Action, Mystery, Romance',
'Romance, Sport', 'Crime, Romance', 'Action, Thriller, Western',
'Crime, Musical, Romance', 'Romance, Thriller, Mystery',
'Drama, Crime, Mystery', 'Biography, Drama, Family',
'Action, Family, Mystery', 'Comedy, Mystery, Romance',
'Drama, Thriller, Action', 'Documentary, Short',
'Documentary, Western', 'Musical, Family, Drama',
'Action, Family, Musical', 'Animation, Family, Musical',
'Drama, Fantasy, Horror', 'Action, Adventure, Sci-Fi',
'Drama, Action, Musical', 'Drama, Musical, Sport',
'Action, Comedy, Horror', 'Drama, Fantasy, Musical',
'Action, Fantasy, Musical', 'Animation, Action', 'Comedy, Music',
'Documentary, Drama, Romance', 'Drama, Music, Thriller',
'Fantasy, Musical, Mystery', 'Drama, Fantasy, War', 'Action, War',
'Action, Adventure, War', 'Horror, Musical',
'Fantasy, Mystery, Thriller', 'Adventure, Biography, Drama',
'Family, Romance, Sci-Fi', 'Drama, Romance, Family',
'Animation, Adventure, Drama', 'Family, Romance, Drama',
'Animation, Action, Sci-Fi', 'Adventure, Comedy, Fantasy',
'Comedy, Crime, Family', 'Horror, Musical, Thriller',
'Biography, Drama, Thriller', 'Drama, Western',
'Romance, Sci-Fi, Thriller', 'Comedy, Musical, Family',
'Comedy, Horror, Romance', 'Thriller, Action',
'Fantasy, Thriller, Action', 'Fantasy, Romance',
'Action, Drama, Comedy', 'Family, Fantasy, Romance',
'Comedy, Crime, Horror', 'Horror, Mystery, Sci-Fi',
'Animation, Action, Drama', 'Family, Mystery',
'Adventure, Biography, History', 'Fantasy, Horror, Mystery',
'Family, Musical', 'Drama, Family, Adventure',
'Crime, Horror, Mystery', 'Documentary, Drama, Fantasy',
'Action, Adventure, Biography', 'Biography, History, Thriller',
'Action, Family, Drama', 'Documentary, Drama, Sport',
'Thriller, Mystery', 'Musical, Drama, Comedy',
'Documentary, History, War', 'Adventure, Horror, Thriller',
'Action, Adventure, Horror', 'Action, Crime, War',
'Adventure, Musical, Romance', 'Action, Fantasy, Sci-Fi',
'Drama, Comedy, Action', 'Documentary, Sport',
'Documentary, Adventure, Music', 'Drama, Action, Family',
'Adventure, History, Thriller', 'Adventure, Horror, Romance',
'Adventure, Crime, Horror', 'Mystery, Musical, Romance',
'Action, Crime, History', 'Documentary, Musical',
'Adventure, Fantasy, Musical', 'Documentary, Family, History',
'Documentary, Drama, Family', 'Drama, Mystery, Sci-Fi',
'Animation, Drama, Musical', 'Drama, History, Mystery',
'Drama, Sport, Thriller', 'Action, Crime, Fantasy',
'Comedy, Musical, Mystery', 'Romance, Musical, Action',
'Musical, Drama, Fantasy', 'Animation, Family, History',
'Action, Drama, News', 'Romance, Musical, Comedy',
'Adventure, Fantasy, Horror', 'Adventure, History',
'Comedy, Drama, History', 'Mystery, Sci-Fi',
'Action, Thriller, War', 'Documentary, Drama, News',
'Documentary, Crime, Mystery', 'Adventure, Horror',
'Animation, Drama, Adventure', 'Crime, Horror, Romance',
'Documentary, Adventure, Drama', 'Documentary, Biography, History',
'Fantasy, Horror, Romance', 'Comedy, Fantasy, Musical',
'Crime, Musical, Thriller', 'Documentary, War',
'Action, Comedy, War', 'Crime, Drama, Sport',
'Musical, Adventure, Drama', 'Horror, Romance, Sci-Fi',
'Musical, Mystery, Romance', 'Romance, Musical, Drama',
'Adventure, Fantasy, Sci-Fi'], dtype=object)
[57]: imdb_df["year"].unique()[:10]
[57]: <IntegerArray>
[2019, 2021, 2010, 1997, 2005, 2008, 2012, 2014, 2004, 2016]
Length: 10, dtype: Int64
[67]: imdb_df_filtered.head()
3 …And Once Again 2010 105 min Drama
5 …Yahaan 2005 142 min Drama, Romance, War
actor 3
0 Arvind Jangid
1 Roy Angana
2 Siddhant Kapoor
3 Antara Mali
5 Yashpal Sharma
[60]: print(imdb_df["genre"].value_counts())
genre
Drama 2779
Unknown 1877
Action 1289
Thriller 779
Romance 708
…
Musical, Adventure, Drama 1
Horror, Romance, Sci-Fi 1
Musical, Mystery, Romance 1
Romance, Musical, Drama 1
Adventure, Fantasy, Sci-Fi 1
Name: count, Length: 486, dtype: int64
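The counts above treat every comma-separated combination as its own category (486 distinct strings), so a film tagged `Comedy, Drama` is not counted under `Drama`. To count individual genres instead, each string can be split first. A minimal sketch on a few genre strings copied from the output above; `count_individual_genres` is a hypothetical helper.

```python
from collections import Counter
from typing import Iterable


def count_individual_genres(genre_strings: Iterable[str]) -> Counter:
    """Count each genre separately, splitting combined strings on commas."""
    counts: Counter = Counter()
    for entry in genre_strings:
        for genre in entry.split(","):
            genre = genre.strip()
            if genre:  # ignore empty fragments
                counts[genre] += 1
    return counts


sample = ["Drama", "Comedy, Drama", "Drama, Romance", "Action, Drama", "Unknown"]
counts = count_individual_genres(sample)

print(counts["Drama"])        # 4
print(counts.most_common(1))  # [('Drama', 4)]
```

On the full DataFrame, the pandas equivalent would be along the lines of `imdb_df["genre"].str.split(", ").explode().value_counts()`.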
[62]: array(['Rasika Dugal', 'Sayani Gupta', 'Prateik', 'Rajat Kapoor',
'Bobby Deol', 'Jimmy Sheirgill', 'Unknown', 'Yash Dave',
'Augustine', 'Rati Agnihotri'], dtype=object)
[70]: imdb_df_filtered_actor.head()
actor 2 actor 3
0 Vivek Ghamande Arvind Jangid
4 Aishwarya Rai Bachchan Shammi Kapoor
62 Sunny Deol Amrita Singh
166 Bipasha Basu Priyanka Chopra Jonas
479 Karisma Kapoor Rahul Dev
[72]: imdb_df_filtered_director
7 326 Allyson Patel Yash Dave Muntazir Ahmad
5249 11,364 Shoojit Sircar Amitabh Bachchan Ayushmann Khurrana
8529 23,388 Shoojit Sircar John Abraham Nargis Fakhri
10296 14,013 Shoojit Sircar Varun Dhawan Banita Sandhu
10853 29,786 Shoojit Sircar Deepika Padukone Amitabh Bachchan
14572 29 Shoojit Sircar Kirsty Averton Tim Berrington
14841 40,589 Shoojit Sircar Ayushmann Khurrana Yami Gautam
actor 3
5 Yashpal Sharma
7 Kiran Bhatia
5249 Vijay Raaz
8529 Raashi Khanna
10296 Gitanjali Rao
10853 Irrfan Khan
14572 Nicholas Gecks
14841 Annu Kapoor
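# preferred_genre, preferred_actor and preferred_director (lowercase strings),
# min_rating and min_year are assumed to be set earlier, e.g. from user input.

# Mandatory constraints: genre, minimum rating, minimum release year.
filtered_movies = imdb_df[
    (imdb_df["genre"].str.lower().str.contains(preferred_genre)) &
    (imdb_df["rating"] >= min_rating) &
    (imdb_df["year"] >= min_year)
]

# Optional constraint: the actor may appear in any of the three actor columns.
if preferred_actor:
    filtered_movies = filtered_movies[
        (filtered_movies["actor 1"].str.lower().str.contains(preferred_actor)) |
        (filtered_movies["actor 2"].str.lower().str.contains(preferred_actor)) |
        (filtered_movies["actor 3"].str.lower().str.contains(preferred_actor))
    ]

# Optional constraint: director.
if preferred_director:
    filtered_movies = filtered_movies[
        filtered_movies["director"].str.lower().str.contains(preferred_director)
    ]

# Show at most ten matches, or report that nothing satisfied the constraints.
if not filtered_movies.empty:
    print("\nRecommended Movies:")
    print(filtered_movies[["name", "year", "rating", "genre", "director"]].head(10))
else:
    print("No movies found based on your preferences.")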
Recommended Movies:
      name               year  rating  genre                      director
74    3 Idiots           2009  8.4     Comedy, Drama              Rajkumar Hirani
943   Ammaa Ki Boli      2021  8.1     Comedy, Drama              Narayan Chauhan
1103  Ankhon Dekhi       2013  8.0     Comedy, Drama              Rajat Kapoor
1599  Badhaai Ho         2018  8.0     Comedy, Drama              Amit Ravindernath Sharma
1708  Bahattar Hoorain   2019  8.8     Comedy                     Sanjay Puran Singh Chauhan
1739  Bajrangi Bhaijaan  2015  8.0     Action, Adventure, Comedy  Kabir Khan
1876  Barfi!             2012  8.1     Comedy, Drama, Romance     Anurag Basu
2074  Bhabhipedia        2018  8.3     Comedy, Romance            Saumyy Shivhare
2983  Chhichhore         2019  8.3     Comedy, Drama              Nitesh Tiwari
3201  Colour photo       2018  8.7     Comedy                     Aziz Naser
2 Conclusion
In this experiment, we applied constraint-based filtering to select movies by average rating, year of
release, and genre. The constraints were defined as:
Movies with rating >= 4.
Movies released in 2000 or later.
No genre constraint applied in the final filtering.
The filtered dataset therefore contains Indian movies that satisfy these conditions: a broad selection
that is relatively modern (2000 onwards) and has at least a moderate rating (4 or higher).
Comparison: Constraint-Based Filtering vs. Content-Based Filtering

Aspect: Definition
Constraint-Based Filtering: Filters data based on user-specified criteria (e.g., rating, genre, year).
Content-Based Filtering: Recommends items based on the features and characteristics of the items themselves (e.g., genre, plot keywords, actors).

Aspect: User Input
Constraint-Based Filtering: Requires specific constraints from the user (e.g., rating threshold, genre, year).
Content-Based Filtering: Relies on the analysis of item characteristics to suggest similar items.

Aspect: Personalization
Constraint-Based Filtering: Less personalized, based purely on predefined constraints.
Content-Based Filtering: Highly personalized, as it recommends based on user preferences and item similarity.

Aspect: Filtering Logic
Constraint-Based Filtering: Applies filters based on strict conditions (rating, year, etc.).
Content-Based Filtering: Uses algorithms to match similar items based on content attributes like genres, keywords, or actors.

Aspect: Flexibility
Constraint-Based Filtering: Can be rigid, as it is based on predefined criteria.
Content-Based Filtering: More flexible, adapts to user tastes and patterns over time.

Aspect: Examples of Use Cases
Constraint-Based Filtering: Searching for movies with a specific rating or release year.
Content-Based Filtering: Recommending movies similar to the ones a user has watched or liked in the past.

Aspect: Pros
Constraint-Based Filtering: Simple to implement, easy to use, provides control to users.
Content-Based Filtering: Offers personalized recommendations, adapts to user preferences.

Aspect: Cons
Constraint-Based Filtering: Can be too rigid, leading to missed recommendations or a lack of diversity.
Content-Based Filtering: Can require more complex algorithms and data to analyze content features effectively.

Aspect: Data Requirements
Constraint-Based Filtering: Requires structured data (e.g., rating, year, genre) with minimal processing.
Content-Based Filtering: Requires detailed metadata and content features (e.g., keywords, director, actors).

Aspect: Implementation Complexity
Constraint-Based Filtering: Simple to implement, requires basic filtering techniques.
Content-Based Filtering: More complex, involves content analysis and item profiling, typically using machine learning models.

Aspect: Performance Over Time
Constraint-Based Filtering: Static; does not adapt to new user behavior or preferences unless manually adjusted.
Content-Based Filtering: Dynamic; adapts to user behavior and preferences over time.

Aspect: Handling of Novel Items
Constraint-Based Filtering: Can miss recommendations for new items without sufficient data.
Content-Based Filtering: Can recommend new items if they share similar features with items the user has liked.

Aspect: Scalability
Constraint-Based Filtering: Works well with small to medium datasets where filtering criteria are easy to define.
Content-Based Filtering: Scales well to large datasets but requires more computation power for analyzing content.

Aspect: Interpretability
Constraint-Based Filtering: High interpretability, as users know the exact criteria used for filtering.
Content-Based Filtering: Can be harder to explain, especially when machine learning models are involved.

Aspect: Bias and Diversity
Constraint-Based Filtering: May introduce bias if the constraints are too narrow (e.g., excluding less popular genres or ratings).
Content-Based Filtering: May lead to homogeneity in recommendations, as similar items are repeatedly suggested.

Aspect: Real-time Recommendations
Constraint-Based Filtering: Can be less responsive in real-time settings, especially with large filtering constraints.
Content-Based Filtering: Can provide real-time recommendations, especially when continuously learning from user interactions.
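
To make the contrast in the comparison concrete, here is a minimal sketch of the content-based side: instead of checking items against fixed constraints, it ranks items by how similar their features are to an item the user liked. It assumes Jaccard similarity over genre sets as the similarity measure and uses made-up titles; the experiment itself did not implement this, and `genre_set`, `jaccard` and `most_similar` are hypothetical helpers.

```python
from typing import Dict, List, Set, Tuple


def genre_set(genre_string: str) -> Set[str]:
    """Split 'Comedy, Drama' into {'comedy', 'drama'}."""
    return {g.strip().lower() for g in genre_string.split(",") if g.strip()}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(liked: str, catalogue: Dict[str, str],
                 top_n: int = 2) -> List[Tuple[str, float]]:
    """Rank the other titles by genre similarity to the liked title."""
    liked_genres = genre_set(catalogue[liked])
    scores = [(title, jaccard(liked_genres, genre_set(genres)))
              for title, genres in catalogue.items() if title != liked]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


# Made-up catalogue for illustration only.
catalogue = {
    "Movie A": "Comedy, Drama",
    "Movie B": "Comedy, Drama, Romance",
    "Movie C": "Action, Thriller",
    "Movie D": "Drama",
}

result = most_similar("Movie A", catalogue)
print([title for title, _ in result])  # ['Movie B', 'Movie D']
```

Here nothing is excluded outright: Movie C simply scores 0 and falls to the bottom of the ranking, whereas a constraint-based filter would either keep or drop each item.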