
RS2

The document outlines a laboratory experiment for implementing a Constraint-Based Recommender System using datasets from ratings.csv and movies.csv. It describes the theory behind constraint-based recommendations, which rely on predefined rules rather than user behavior, and details the steps for implementation, including data preprocessing and applying user-defined constraints. Additionally, it includes lab assignments to enhance the recommender system's functionality and compare it with content-based filtering.

Uploaded by

jaytopiwala22oct

Department of Computer Science and Engineering (Data Science)
Subject: Recommender System Laboratory (DJS22DSL6012)
(A.Y. 2024-2025)

Experiment No 6
Jay Topiwala 60009220169 D058/D1
Aim: Implement a constraint-based recommender system on an appropriate dataset.

Theory:

A constraint-based recommendation system suggests items based on predefined constraints
and rules instead of learning from user behavior. This type of system is useful when explicit
user preferences, domain knowledge, or business rules must be enforced. Unlike collaborative
filtering, which relies on user interactions, a constraint-based approach applies logical rules to
filter items. For example, a movie recommendation system can apply constraints such as
minimum rating, preferred genre, release year, or actor/director preference to generate
recommendations.

Key Features:

1. Rule-Based Filtering: Uses logical constraints to eliminate items that do not satisfy given
conditions.
2. Explicit User Preferences: Users define preferences such as genre, rating, release year, or
specific attributes.
3. No Need for Past User Data: Unlike collaborative filtering, it does not require historical
interactions.
4. Deterministic Recommendations: Always produces the same output given the same
constraints.
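
The rule-based and deterministic behaviour listed above can be illustrated with a minimal sketch. It assumes each constraint is a plain predicate function and that an item is recommended only if every predicate holds. The `Movie` class, the `recommend` helper, and the three toy films are hypothetical and invented for illustration; they are not part of the lab notebook.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Movie:
    title: str
    genres: frozenset
    rating: float
    year: int


# Hypothetical toy catalogue, invented purely for illustration.
CATALOGUE = [
    Movie("Film A", frozenset({"Comedy", "Drama"}), 8.1, 2012),
    Movie("Film B", frozenset({"Action"}), 6.4, 2019),
    Movie("Film C", frozenset({"Comedy"}), 7.2, 1995),
]


def recommend(catalogue, constraints):
    """Keep only the movies for which every constraint returns True."""
    return [m for m in catalogue if all(rule(m) for rule in constraints)]


# Each constraint is an explicit rule supplied by the user.
constraints = [
    lambda m: "Comedy" in m.genres,  # preferred genre
    lambda m: m.rating >= 7.0,       # minimum rating
    lambda m: m.year >= 2000,        # minimum release year
]

first = recommend(CATALOGUE, constraints)
second = recommend(CATALOGUE, constraints)
print([m.title for m in first])
```

Because no interaction history or learned state is involved, calling `recommend` twice with the same constraints returns the same list.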

Steps in Implementing Constraint-Based Recommendation:

1. Data Preprocessing: Read the movie and ratings datasets.
2. Calculate average ratings: Compute the mean rating for each movie to filter out
low-rated ones.
3. Define constraints: Set filters like preferred genres, minimum rating, release year, and
other preferences.
4. Apply filters: Extract movies that match the defined constraints.
5. Display recommendations: Present the final list of movies satisfying all constraints.
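
The five steps above can be sketched on a tiny in-memory stand-in for ratings.csv and movies.csv. The column names (`movieId`, `title`, `genres`, `userId`, `rating`) follow the MovieLens layout, where genres are pipe-separated and the release year appears in parentheses at the end of the title. The rows themselves, the constraint values, and the variable names are assumptions made for illustration only.

```python
import pandas as pd

# Step 1: in the lab these frames would come from pd.read_csv; the rows
# below are hypothetical stand-ins in the MovieLens column layout.
movies = pd.DataFrame({
    "movieId": [1, 2, 3],
    "title": ["Film A (1995)", "Film B (1995)", "Film C (2009)"],
    "genres": ["Animation|Comedy", "Action|Thriller", "Adventure|Animation"],
})
ratings = pd.DataFrame({
    "userId": [1, 2, 1, 2, 3],
    "movieId": [1, 1, 2, 3, 3],
    "rating": [4.0, 5.0, 3.0, 4.5, 3.5],
})

# Step 2: mean rating per movie, joined back onto the movie table.
avg = ratings.groupby("movieId")["rating"].mean().rename("avg_rating").reset_index()
catalogue = movies.merge(avg, on="movieId", how="left")

# The release year is parsed from the trailing "(YYYY)" in the title.
year_text = catalogue["title"].str.extract(r"\((\d{4})\)\s*$", expand=False)
catalogue["year"] = pd.to_numeric(year_text, errors="coerce").astype("Int64")

# Step 3: define the constraints (hypothetical values).
preferred_genre = "Animation"
min_rating = 4.0
min_year = 1990

# Step 4: apply every constraint; a movie must satisfy all of them.
has_genre = catalogue["genres"].str.split("|").apply(lambda parts: preferred_genre in parts)
recent = (catalogue["year"] >= min_year).fillna(False).astype(bool)
mask = has_genre & (catalogue["avg_rating"] >= min_rating) & recent

# Step 5: display the recommendations, best rated first.
recommendations = catalogue[mask].sort_values("avg_rating", ascending=False)
print(recommendations[["title", "year", "avg_rating", "genres"]])
```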

Lab Assignments to complete:

Dataset: ratings.csv, movies.csv
(https://www.kaggle.com/datasets/grouplens/movielens20m-dataset?resource=download)

Apply the concept of a constraint-based recommender to the above-mentioned dataset.

1. Modify the constraints to include specific actors or directors.
2. Implement a user input-based constraint selection system where users can enter their
preferred genres, rating threshold, and release year.
3. Compare the constraint-based approach with content-based filtering and document the
differences in recommendations.
Notebook

April 15, 2025

[2]: import pandas as pd
import numpy as np

[46]: imdb_df = pd.read_csv('/content/IMDbRatings_IndianMovies.csv')

[47]: imdb_df

[47]: Name Year Duration \


0 #Gadhvi (He thought he was Gandhi) -2019.0 109 min
1 #Homecoming -2021.0 90 min
2 #Yaaram -2019.0 110 min
3 …And Once Again -2010.0 105 min
4 …Aur Pyaar Ho Gaya -1997.0 147 min
… … … …
15503 Zulm Ko Jala Doonga -1988.0 NaN
15504 Zulmi -1999.0 129 min
15505 Zulmi Raj -2005.0 NaN
15506 Zulmi Shikari -1988.0 NaN
15507 Zulm-O-Sitam -1998.0 130 min

Genre Rating Votes Director \


0 Drama 7.0 8 Gaurav Bakshi
1 Drama, Musical NaN NaN Soumyajit Majumdar
2 Comedy, Romance 4.4 35 Ovais Khan
3 Drama NaN NaN Amol Palekar
4 Comedy, Drama, Musical 4.7 827 Rahul Rawail
… … … … …
15503 Action 4.6 11 Mahendra Shah
15504 Action, Drama 4.5 655 Kuku Kohli
15505 Action NaN NaN Kiran Thej
15506 Action NaN NaN NaN
15507 Action, Drama 6.2 20 K.C. Bokadia

Actor 1 Actor 2 Actor 3


0 Rasika Dugal Vivek Ghamande Arvind Jangid
1 Sayani Gupta Plabita Borthakur Roy Angana
2 Prateik Ishita Raj Siddhant Kapoor
3 Rajat Kapoor Rituparna Sengupta Antara Mali

4 Bobby Deol Aishwarya Rai Bachchan Shammi Kapoor
… … … …
15503 Naseeruddin Shah Sumeet Saigal Suparna Anand
15504 Akshay Kumar Twinkle Khanna Aruna Irani
15505 Sangeeta Tiwari NaN NaN
15506 NaN NaN NaN
15507 Dharmendra Jaya Prada Arjun Sarja

[15508 rows x 10 columns]

[48]: imdb_df.columns = imdb_df.columns.str.strip().str.lower()

[49]: imdb_df["year"] = pd.to_numeric(imdb_df["year"], errors="coerce").abs().astype("Int64")
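
The year clean-up in the cell above chains three operations: coercing unparseable values to missing, dropping the spurious minus sign seen in the raw column, and casting to pandas' nullable integer type so that missing years survive as `<NA>`. A minimal sketch on a hypothetical four-value series (the string inputs are an assumption; the real column may already be numeric):

```python
import pandas as pd

# Hypothetical raw values mimicking the "-2019.0" pattern in the raw table,
# plus one unparseable string and one missing entry.
raw = pd.Series(["-2019.0", "-1997.0", "unknown", None])

as_number = pd.to_numeric(raw, errors="coerce")  # bad values become NaN
positive = as_number.abs()                       # -2019.0 -> 2019.0
clean = positive.astype("Int64")                 # nullable integer keeps <NA>

print(clean)
```

A plain `int64` cast would fail here because NaN has no integer representation, which is why the nullable `Int64` dtype is used.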

[50]: imdb_df["genre"] = imdb_df["genre"].fillna("Unknown")


imdb_df["rating"] = imdb_df["rating"].fillna(0)
imdb_df["director"] = imdb_df["director"].fillna("Unknown")
imdb_df["actor 1"] = imdb_df["actor 1"].fillna("Unknown")
imdb_df["actor 2"] = imdb_df["actor 2"].fillna("Unknown")
imdb_df["actor 3"] = imdb_df["actor 3"].fillna("Unknown")

[52]: imdb_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 15508 entries, 0 to 15507
Data columns (total 10 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 name 15508 non-null object
1 year 14981 non-null Int64
2 duration 7240 non-null object
3 genre 15508 non-null object
4 rating 15508 non-null float64
5 votes 7920 non-null object
6 director 15508 non-null object
7 actor 1 15508 non-null object
8 actor 2 15508 non-null object
9 actor 3 15508 non-null object
dtypes: Int64(1), float64(1), object(8)
memory usage: 1.2+ MB
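
The summary above shows that `votes` and `duration` are stored as `object` (text), so they cannot yet be used in numeric constraints such as a minimum vote count or a maximum running time. The following is a sketch of one possible conversion, assuming the values look like those printed in this notebook ("1,086" and "109 min"). The column names `votes_num` and `duration_min` are hypothetical.

```python
import pandas as pd

# Hypothetical sample in the same text formats as the notebook output.
sample = pd.DataFrame({
    "votes": ["8", "1,086", None],
    "duration": ["109 min", None, "147 min"],
})

# Remove thousands separators, then parse; anything unparseable becomes <NA>.
votes_text = sample["votes"].str.replace(",", "", regex=False)
sample["votes_num"] = pd.to_numeric(votes_text, errors="coerce").astype("Int64")

# Keep only the leading run of digits from strings such as "109 min".
minutes_text = sample["duration"].str.extract(r"(\d+)", expand=False)
sample["duration_min"] = pd.to_numeric(minutes_text, errors="coerce").astype("Int64")

print(sample)
```

With numeric columns in place, a constraint such as `sample["votes_num"] >= 100` could be combined with the rating filter so that titles rated by only a handful of users do not dominate the results.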

[53]: imdb_df.head()

[53]: name year duration genre \


0 #Gadhvi (He thought he was Gandhi) 2019 109 min Drama
1 #Homecoming 2021 90 min Drama, Musical
2 #Yaaram 2019 110 min Comedy, Romance

3 …And Once Again 2010 105 min Drama
4 …Aur Pyaar Ho Gaya 1997 147 min Comedy, Drama, Musical

rating votes director actor 1 actor 2 \


0 7.0 8 Gaurav Bakshi Rasika Dugal Vivek Ghamande
1 0.0 NaN Soumyajit Majumdar Sayani Gupta Plabita Borthakur
2 4.4 35 Ovais Khan Prateik Ishita Raj
3 0.0 NaN Amol Palekar Rajat Kapoor Rituparna Sengupta
4 4.7 827 Rahul Rawail Bobby Deol Aishwarya Rai Bachchan

actor 3
0 Arvind Jangid
1 Roy Angana
2 Siddhant Kapoor
3 Antara Mali
4 Shammi Kapoor

[54]: print(imdb_df.isnull().sum())

name 0
year 527
duration 8268
genre 0
rating 0
votes 7588
director 0
actor 1 0
actor 2 0
actor 3 0
dtype: int64

[55]: imdb_df.describe()

[55]: year rating


count 14981.0 15508.000000
mean 1987.012215 2.982964
std 25.416689 3.082650
min 1913.0 0.000000
25% 1968.0 0.000000
50% 1991.0 2.800000
75% 2009.0 6.000000
max 2022.0 10.000000
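
Because missing ratings were replaced with 0 earlier in the notebook, the statistics above (mean 2.98, first quartile 0.0) count unrated movies as if they had been rated zero. The sketch below, which uses the five rating values from the first rows of the raw table, shows how much that choice shifts the mean. Keeping NaN and filtering on `notna()` is offered as an alternative assumption, not as what the notebook does.

```python
import pandas as pd

# Ratings of the first five rows in the raw table; None marks "not rated".
rating = pd.Series([7.0, None, 4.4, None, 4.7], name="rating")

mean_with_zeros = rating.fillna(0).mean()  # unrated movies counted as 0
mean_rated_only = rating.mean()            # NaN values are skipped

# Alternative: keep NaN and restrict the constraint to rated movies only.
rated = rating[rating.notna()]

print(round(mean_with_zeros, 2), round(mean_rated_only, 2), len(rated))
```

Either way, any positive minimum-rating constraint excludes the unrated movies; the difference matters mainly when summary statistics are reported.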

[56]: imdb_df["genre"].unique()

[56]: array(['Drama', 'Drama, Musical', 'Comedy, Romance',


'Comedy, Drama, Musical', 'Drama, Romance, War', 'Documentary',
'Horror, Mystery, Thriller', 'Action, Crime, Thriller', 'Horror',

'Horror, Romance, Thriller', 'Comedy, Drama, Romance', 'Thriller',
'Comedy, Drama', 'Unknown', 'Comedy, Drama, Fantasy',
'Comedy, Drama, Family', 'Crime, Drama, Mystery',
'Horror, Thriller', 'Biography', 'Comedy, Horror', 'Action',
'Drama, Horror, Mystery', 'Comedy', 'Action, Thriller',
'Drama, History', 'Drama, History, Sport',
'Horror, Mystery, Romance', 'Horror, Mystery',
'Drama, Horror, Romance', 'Action, Drama, History',
'Action, Drama, War', 'Comedy, Family',
'Adventure, Horror, Mystery', 'Action, Sci-Fi',
'Crime, Mystery, Thriller', 'War', 'Sport',
'Biography, Drama, History', 'Horror, Romance', 'Crime, Drama',
'Drama, Romance', 'Adventure, Drama', 'Comedy, Mystery, Thriller',
'Action, Crime, Drama', 'Crime, Thriller',
'Horror, Sci-Fi, Thriller', 'Crime, Drama, Thriller',
'Drama, Mystery, Thriller', 'Drama, Sport',
'Drama, Family, Musical', 'Action, Comedy', 'Comedy, Thriller',
'Action, Adventure, Fantasy', 'Drama, Romance, Thriller',
'Action, Drama', 'Drama, Horror, Musical',
'Action, Biography, Drama', 'Adventure, Comedy, Drama', 'Mystery',
'Action, Fantasy, Mystery', 'Adventure, Drama, Mystery',
'Mystery, Thriller', 'Adventure', 'Drama, Musical, Thriller',
'Comedy, Crime, Drama', 'Musical, Romance', 'Documentary, Music',
'Documentary, History, Music', 'Drama, Fantasy, Mystery',
'Drama, Family, Sport', 'Drama, Thriller',
'Documentary, Biography', 'Action, Adventure, Comedy', 'Romance',
'Comedy, Drama, Music', 'Comedy, Horror, Mystery', 'Musical',
'Musical, Romance, Drama', 'Family, Romance',
'Action, Sci-Fi, Thriller', 'Action, Drama, Romance',
'Mystery, Romance', 'Fantasy', 'Family', 'Drama, Family',
'Action, Comedy, Drama', 'Action, Drama, Thriller',
'Drama, Horror, Thriller', 'Drama, Musical, Romance',
'Comedy, Sci-Fi', 'Action, Romance', 'Action, Crime',
'Action, Drama, Crime', 'Drama, Family, Music',
'Action, Mystery, Thriller', 'Action, Drama, Family',
'Action, Mystery', 'Drama, History, Romance',
'Crime, Drama, Music', 'Sci-Fi', 'Animation',
'Crime, Mystery, Romance', 'Action, Adventure, Romance',
'Music, Romance', 'Action, Comedy, Crime',
'Comedy, Family, Fantasy', 'Romance, Drama',
'Drama, Family, Romance', 'Romance, Drama, Family',
'Musical, Romance, Thriller', 'Family, Musical, Romance',
'Action, Drama, Fantasy', 'Family, Drama', 'Crime, Drama, Romance',
'Musical, Drama, Romance', 'Drama, Music, Musical',
'Drama, Mystery', 'Adventure, Comedy, Romance',
'Crime, Drama, Horror', 'Family, Music, Musical',
'Action, Musical, Thriller', 'Action, Romance, Thriller',

'Romance, Thriller', 'Drama, Music', 'Crime, Drama, Musical',
'Action, Crime, Mystery', 'Action, Adventure, Thriller',
'Comedy, Romance, Sci-Fi', 'Crime', 'Action, Drama, Mystery',
'Action, Comedy, Thriller', 'Biography, Drama',
'Action, Comedy, Fantasy', 'Drama, Family, Horror',
'Action, Adventure, Family', 'Documentary, Biography, Musical',
'Action, Drama, Musical', 'Adventure, Thriller', 'Crime, Mystery',
'Drama, Crime', 'Drama, Fantasy, Romance',
'Comedy, Romance, Thriller', 'Musical, Comedy, Drama',
'Biography, History, War', 'Action, Comedy, Romance',
'Drama, History, Musical', 'Action, Crime, Horror',
'Adventure, Fantasy', 'Adventure, Drama, Fantasy',
'Adventure, Fantasy, Romance', 'Action, Adventure, Drama',
'Action, Adventure', 'Comedy, Crime', 'Crime, Drama, Fantasy',
'Adventure, Drama, Romance', 'History', 'Drama, Fantasy, Thriller',
'Musical, Fantasy', 'Documentary, Thriller',
'Mystery, Romance, Musical', 'Family, Drama, Romance',
'History, Musical, Romance', 'Musical, Drama, Crime',
'Adventure, Crime, Romance', 'Musical, Thriller, Mystery',
'Drama, Comedy', 'Biography, Drama, Romance', 'Biography, Music',
'Biography, Drama, Music', 'Drama, Sci-Fi',
'Drama, Family, Thriller', 'Comedy, Musical, Romance',
'Drama, Family, Comedy', 'Action, Thriller, Romance',
'Animation, Adventure', 'Action, Crime, Musical',
'Action, Crime, Romance', 'Animation, Action, Adventure',
'Action, Drama, Sport', 'Comedy, History', 'Documentary, History',
'Drama, Comedy, Family', 'Action, Adventure, Crime',
'Documentary, Biography, Music', 'Comedy, Musical',
'Biography, Crime, Thriller', 'Adventure, Mystery, Thriller',
'Biography, Drama, Sport', 'Action, Comedy, Musical',
'Mystery, Romance, Thriller', 'Action, Adventure, Musical',
'Crime, Musical, Mystery', 'Action, Thriller, Crime',
'Adventure, Comedy, Crime', 'Comedy, Horror, Musical',
'Adventure, Family', 'Family, Thriller', 'Drama, Action, Crime',
'Drama, War', 'Action, Drama, Adventure',
'Adventure, Fantasy, History', 'Fantasy, Musical',
'Comedy, Drama, Thriller', 'Drama, Fantasy', 'Musical, Drama',
'Action, Drama, Horror', 'Biography, Crime, Drama',
'Action, Drama, Music', 'Adventure, Drama, Family',
'Drama, Romance, Musical', 'Comedy, Musical, Drama',
'Adventure, Comedy, Musical', 'Crime, Drama, Family',
'Thriller, Musical, Mystery', 'Documentary, Adventure, Crime',
'Drama, Action, Horror', 'Adventure, Crime, Drama',
'Documentary, Biography, Sport', 'Crime, Fantasy, Mystery',
'Documentary, Biography, Drama', 'Action, Fantasy, Thriller',
'Adventure, Drama, History', 'Animation, Drama, History',
'Comedy, Horror, Thriller', 'Drama, Family, History',

'Animation, History', 'Biography, Drama, Musical', 'Music',
'Family, Comedy', 'Adventure, Mystery', 'Family, Fantasy',
'Documentary, History, News', 'Drama, Mystery, Romance',
'Comedy, Fantasy', 'Action, Crime, Family',
'Drama, Musical, Mystery', 'Action, Thriller, Mystery',
'Drama, Family, Fantasy', 'Action, Family',
'Action, Adventure, Mystery', 'Horror, Fantasy', 'Comedy, Action',
'Adventure, Romance', 'Drama, Adventure',
'Animation, Drama, Romance', 'Comedy, Crime, Romance',
'Adventure, Comedy', 'Comedy, Drama, Sport',
'Documentary, Crime, History', 'Musical, Mystery, Drama',
'Adventure, Drama, Sci-Fi', 'Action, Romance, Western',
'Comedy, Fantasy, Romance', 'Animation, Action, Comedy',
'Drama, Fantasy, Sci-Fi', 'Drama, Horror', 'Family, Drama, Comedy',
'Action, Adventure, History', 'Comedy, Family, Romance',
'Biography, History', 'Animation, Family',
'Drama, Fantasy, History', 'Animation, Adventure, Fantasy',
'Adventure, Comedy, Family', 'Drama, History, War',
'Animation, Drama, Fantasy', 'Action, Musical, Romance',
'Crime, Action, Drama', 'Comedy, Romance, Musical',
'Fantasy, Drama', 'Musical, Action, Crime', 'Documentary, Drama',
'Action, Horror, Thriller', 'Action, Horror, Sci-Fi',
'Mystery, Sci-Fi, Thriller', 'Biography, Family',
'Drama, Action, Comedy', 'Drama, Music, Romance',
'Action, Biography, Crime', 'Adventure, Drama, Musical',
'Family, Music, Romance', 'Fantasy, Mystery, Romance',
'Drama, Crime, Family', 'Drama, Family, Action',
'Romance, Comedy, Drama', 'Animation, Adventure, Comedy',
'Sci-Fi, Thriller', 'Romance, Family, Drama',
'Action, Family, Thriller', 'Adventure, Crime, Thriller',
'Drama, Romance, Sport', 'Comedy, Crime, Mystery',
'Adventure, Comedy, Mystery', 'Action, Fantasy', 'Comedy, Mystery',
'Animation, Adventure, Family', 'Adventure, Drama, Music',
'Biography, Drama, War', 'Documentary, Comedy, Drama',
'Musical, Drama, Family', 'Animation, Comedy, Drama',
'Fantasy, Musical, Drama', 'Adventure, Crime, Mystery',
'Comedy, Drama, Mystery', 'Documentary, News',
'Drama, Musical, Family', 'Action, Romance, Drama',
'Comedy, Crime, Thriller', 'Action, Musical', 'Action, History',
'Action, Comedy, Mystery', 'Drama, Family, Mystery',
'Adventure, Drama, Thriller', 'Documentary, Reality-TV',
'Action, Fantasy, Horror', 'Drama, History, Thriller',
'Documentary, Family', 'Documentary, Biography, Family',
'Comedy, Sport', 'Animation, Comedy, Family',
'Crime, Romance, Thriller', 'Comedy, Musical, Action',
'Action, Mystery, Sci-Fi', 'Comedy, Crime, Musical',
'Drama, Adventure, Action', 'History, Romance', 'Reality-TV',

'Fantasy, History', 'Family, Drama, Thriller',
'Musical, Mystery, Thriller', 'Musical, Comedy, Romance',
'Musical, Action, Drama', 'Action, Musical, War',
'Romance, Comedy', 'Horror, Crime, Thriller',
'Crime, Drama, History', 'Comedy, Drama, Horror',
'Crime, Horror, Thriller', 'Animation, Comedy',
'Romance, Action, Crime', 'Musical, Thriller',
'Action, Romance, Comedy', 'Comedy, Family, Musical',
'Horror, Drama, Mystery', 'Thriller, Mystery, Family',
'Comedy, Drama, Sci-Fi', 'Documentary, Adventure',
'Documentary, Biography, Crime', 'Musical, Action',
'Musical, Mystery', 'Action, Crime, Sci-Fi',
'Action, Horror, Mystery', 'Fantasy, Horror',
'Adventure, Family, Fantasy', 'Fantasy, Sci-Fi', 'Comedy, War',
'Romance, Action, Drama', 'Musical, Family, Romance',
'Romance, Drama, Action', 'Family, Comedy, Drama',
'Comedy, Music, Romance', 'Comedy, Family, Sci-Fi',
'Action, Drama, Western', 'Adventure, Romance, Thriller',
'Biography, Comedy, Drama', 'Action, Mystery, Romance',
'Romance, Sport', 'Crime, Romance', 'Action, Thriller, Western',
'Crime, Musical, Romance', 'Romance, Thriller, Mystery',
'Drama, Crime, Mystery', 'Biography, Drama, Family',
'Action, Family, Mystery', 'Comedy, Mystery, Romance',
'Drama, Thriller, Action', 'Documentary, Short',
'Documentary, Western', 'Musical, Family, Drama',
'Action, Family, Musical', 'Animation, Family, Musical',
'Drama, Fantasy, Horror', 'Action, Adventure, Sci-Fi',
'Drama, Action, Musical', 'Drama, Musical, Sport',
'Action, Comedy, Horror', 'Drama, Fantasy, Musical',
'Action, Fantasy, Musical', 'Animation, Action', 'Comedy, Music',
'Documentary, Drama, Romance', 'Drama, Music, Thriller',
'Fantasy, Musical, Mystery', 'Drama, Fantasy, War', 'Action, War',
'Action, Adventure, War', 'Horror, Musical',
'Fantasy, Mystery, Thriller', 'Adventure, Biography, Drama',
'Family, Romance, Sci-Fi', 'Drama, Romance, Family',
'Animation, Adventure, Drama', 'Family, Romance, Drama',
'Animation, Action, Sci-Fi', 'Adventure, Comedy, Fantasy',
'Comedy, Crime, Family', 'Horror, Musical, Thriller',
'Biography, Drama, Thriller', 'Drama, Western',
'Romance, Sci-Fi, Thriller', 'Comedy, Musical, Family',
'Comedy, Horror, Romance', 'Thriller, Action',
'Fantasy, Thriller, Action', 'Fantasy, Romance',
'Action, Drama, Comedy', 'Family, Fantasy, Romance',
'Comedy, Crime, Horror', 'Horror, Mystery, Sci-Fi',
'Animation, Action, Drama', 'Family, Mystery',
'Adventure, Biography, History', 'Fantasy, Horror, Mystery',
'Family, Musical', 'Drama, Family, Adventure',

'Crime, Horror, Mystery', 'Documentary, Drama, Fantasy',
'Action, Adventure, Biography', 'Biography, History, Thriller',
'Action, Family, Drama', 'Documentary, Drama, Sport',
'Thriller, Mystery', 'Musical, Drama, Comedy',
'Documentary, History, War', 'Adventure, Horror, Thriller',
'Action, Adventure, Horror', 'Action, Crime, War',
'Adventure, Musical, Romance', 'Action, Fantasy, Sci-Fi',
'Drama, Comedy, Action', 'Documentary, Sport',
'Documentary, Adventure, Music', 'Drama, Action, Family',
'Adventure, History, Thriller', 'Adventure, Horror, Romance',
'Adventure, Crime, Horror', 'Mystery, Musical, Romance',
'Action, Crime, History', 'Documentary, Musical',
'Adventure, Fantasy, Musical', 'Documentary, Family, History',
'Documentary, Drama, Family', 'Drama, Mystery, Sci-Fi',
'Animation, Drama, Musical', 'Drama, History, Mystery',
'Drama, Sport, Thriller', 'Action, Crime, Fantasy',
'Comedy, Musical, Mystery', 'Romance, Musical, Action',
'Musical, Drama, Fantasy', 'Animation, Family, History',
'Action, Drama, News', 'Romance, Musical, Comedy',
'Adventure, Fantasy, Horror', 'Adventure, History',
'Comedy, Drama, History', 'Mystery, Sci-Fi',
'Action, Thriller, War', 'Documentary, Drama, News',
'Documentary, Crime, Mystery', 'Adventure, Horror',
'Animation, Drama, Adventure', 'Crime, Horror, Romance',
'Documentary, Adventure, Drama', 'Documentary, Biography, History',
'Fantasy, Horror, Romance', 'Comedy, Fantasy, Musical',
'Crime, Musical, Thriller', 'Documentary, War',
'Action, Comedy, War', 'Crime, Drama, Sport',
'Musical, Adventure, Drama', 'Horror, Romance, Sci-Fi',
'Musical, Mystery, Romance', 'Romance, Musical, Drama',
'Adventure, Fantasy, Sci-Fi'], dtype=object)

[57]: imdb_df["year"].unique()[:10]

[57]: <IntegerArray>
[2019, 2021, 2010, 1997, 2005, 2008, 2012, 2014, 2004, 2016]
Length: 10, dtype: Int64

[65]: imdb_df_filtered = imdb_df[(imdb_df["year"] >= 2000) & (imdb_df["year"] <= 2023)]

[67]: imdb_df_filtered.head()

[67]: name year duration genre \


0 #Gadhvi (He thought he was Gandhi) 2019 109 min Drama
1 #Homecoming 2021 90 min Drama, Musical
2 #Yaaram 2019 110 min Comedy, Romance

3 …And Once Again 2010 105 min Drama
5 …Yahaan 2005 142 min Drama, Romance, War

rating votes director actor 1 actor 2 \


0 7.0 8 Gaurav Bakshi Rasika Dugal Vivek Ghamande
1 0.0 NaN Soumyajit Majumdar Sayani Gupta Plabita Borthakur
2 4.4 35 Ovais Khan Prateik Ishita Raj
3 0.0 NaN Amol Palekar Rajat Kapoor Rituparna Sengupta
5 7.4 1,086 Shoojit Sircar Jimmy Sheirgill Minissha Lamba

actor 3
0 Arvind Jangid
1 Roy Angana
2 Siddhant Kapoor
3 Antara Mali
5 Yashpal Sharma

[59]: top_movies = imdb_df[imdb_df["rating"] >= 8.0].sort_values(by="rating", ascending=False)

[60]: print(imdb_df["genre"].value_counts())

genre
Drama 2779
Unknown 1877
Action 1289
Thriller 779
Romance 708

Musical, Adventure, Drama 1
Horror, Romance, Sci-Fi 1
Musical, Mystery, Romance 1
Romance, Musical, Drama 1
Adventure, Fantasy, Sci-Fi 1
Name: count, Length: 486, dtype: int64
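
The 486 distinct values above are genre combinations, not individual genres, because each movie stores its genres as one comma-separated string. A short sketch of counting individual genres by splitting and exploding that string, using the genre values of the first five rows of the dataset:

```python
import pandas as pd

# Genre strings of the first five rows shown earlier in the notebook.
genre = pd.Series(
    ["Drama", "Drama, Musical", "Comedy, Romance", "Drama", "Comedy, Drama, Musical"],
    name="genre",
)

# One row per (movie, genre) pair, with surrounding whitespace removed.
tokens = genre.str.split(",").explode().str.strip()
counts = tokens.value_counts()

print(counts)
```

Counting on individual tokens also suggests why the genre constraint later in the notebook uses a substring match and not an equality test: a movie tagged "Comedy, Drama" should satisfy a "Comedy" constraint.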

1 Filtering by Actors and Directors


[61]: imdb_df["director"].unique()[:10]

[61]: array(['Gaurav Bakshi', 'Soumyajit Majumdar', 'Ovais Khan',


'Amol Palekar', 'Rahul Rawail', 'Shoojit Sircar', 'Anirban Datta',
'Allyson Patel', 'Biju Bhaskar Nair', 'Madhu Ambat'], dtype=object)

[62]: imdb_df["actor 1"].unique()[:10]

[62]: array(['Rasika Dugal', 'Sayani Gupta', 'Prateik', 'Rajat Kapoor',
'Bobby Deol', 'Jimmy Sheirgill', 'Unknown', 'Yash Dave',
'Augustine', 'Rati Agnihotri'], dtype=object)

[68]: imdb_df_filtered_actor = imdb_df[(imdb_df["actor 1"] == 'Rasika Dugal') | (imdb_df["actor 1"] == 'Bobby Deol')]

[70]: imdb_df_filtered_actor.head()

[70]: name year duration \


0 #Gadhvi (He thought he was Gandhi) 2019 109 min
4 …Aur Pyaar Ho Gaya 1997 147 min
62 23rd March 1931: Shaheed 2002 188 min
166 A Sublime Love Story: Barsaat 2005 143 min
479 Aashiq 2001 160 min

genre rating votes director actor 1 \


0 Drama 7.0 8 Gaurav Bakshi Rasika Dugal
4 Comedy, Drama, Musical 4.7 827 Rahul Rawail Bobby Deol
62 Biography, Drama, History 5.1 642 Guddu Dhanoa Bobby Deol
166 Drama, Romance 3.8 627 Suneel Darshan Bobby Deol
479 Action, Drama, Romance 3.8 403 Indra Kumar Bobby Deol

actor 2 actor 3
0 Vivek Ghamande Arvind Jangid
4 Aishwarya Rai Bachchan Shammi Kapoor
62 Sunny Deol Amrita Singh
166 Bipasha Basu Priyanka Chopra Jonas
479 Karisma Kapoor Rahul Dev
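
The filter above only inspects the `actor 1` column, so a movie in which a preferred actor is billed second or third would be missed. As a hedged alternative, the sketch below checks all three cast columns at once with `DataFrame.isin`. The three rows are copied from the tables printed in this notebook; the `wanted` list is a hypothetical preference.

```python
import pandas as pd

# Three rows taken from the notebook output above.
cast = pd.DataFrame({
    "name": ["#Gadhvi (He thought he was Gandhi)", "…Aur Pyaar Ho Gaya", "Zulmi"],
    "actor 1": ["Rasika Dugal", "Bobby Deol", "Akshay Kumar"],
    "actor 2": ["Vivek Ghamande", "Aishwarya Rai Bachchan", "Twinkle Khanna"],
    "actor 3": ["Arvind Jangid", "Shammi Kapoor", "Aruna Irani"],
})

wanted = ["Bobby Deol", "Aruna Irani"]  # hypothetical preferred actors
actor_cols = ["actor 1", "actor 2", "actor 3"]

# True for a row if any of its three cast columns holds a wanted name.
mask = cast[actor_cols].isin(wanted).any(axis=1)
hits = cast[mask]

print(hits[["name"] + actor_cols])
```

Here "Zulmi" is matched through `actor 3`, which the single-column comparison would not find.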

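[71]: # Combined constraint: directed by Shoojit Sircar OR starring Yash Dave.
# Note that the second condition tests "actor 1", not "director".
imdb_df_filtered_director = imdb_df[(imdb_df["director"] == 'Shoojit Sircar') | (imdb_df["actor 1"] == 'Yash Dave')]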

[72]: imdb_df_filtered_director

[72]: name year duration genre rating \


5 …Yahaan 2005 142 min Drama, Romance, War 7.4
7 ?: A Question Mark 2012 82 min Horror, Mystery, Thriller 5.6
5249 Gulabo Sitabo 2020 124 min Comedy, Drama 6.3
8529 Madras Cafe 2013 130 min Action, Drama, Thriller 7.7
10296 October 2018 115 min Drama, Romance 7.5
10853 Piku 2015 123 min Comedy, Drama 7.6
14572 Udham Singh 2021 NaN Crime, Drama, History 8.2
14841 Vicky Donor 2012 126 min Comedy, Romance 7.8

votes director actor 1 actor 2 \


5 1,086 Shoojit Sircar Jimmy Sheirgill Minissha Lamba

7 326 Allyson Patel Yash Dave Muntazir Ahmad
5249 11,364 Shoojit Sircar Amitabh Bachchan Ayushmann Khurrana
8529 23,388 Shoojit Sircar John Abraham Nargis Fakhri
10296 14,013 Shoojit Sircar Varun Dhawan Banita Sandhu
10853 29,786 Shoojit Sircar Deepika Padukone Amitabh Bachchan
14572 29 Shoojit Sircar Kirsty Averton Tim Berrington
14841 40,589 Shoojit Sircar Ayushmann Khurrana Yami Gautam

actor 3
5 Yashpal Sharma
7 Kiran Bhatia
5249 Vijay Raaz
8529 Raashi Khanna
10296 Gitanjali Rao
10853 Irrfan Khan
14572 Nicholas Gecks
14841 Annu Kapoor

[73]: preferred_genre = input("Enter preferred genre: ").strip().lower()
min_rating = float(input("Enter minimum rating (e.g., 7.5): "))
min_year = int(input("Enter minimum release year: "))
# Blank input for actor or director means "no constraint".
preferred_actor = input("Enter preferred actor (leave blank if none): ").strip().lower()
preferred_director = input("Enter preferred director (leave blank if none): ").strip().lower()

# Mandatory constraints: genre, rating, and release year.
# regex=False matches the typed text literally, so characters such as
# "(" or "+" in the input cannot break the search.
filtered_movies = imdb_df[
    (imdb_df["genre"].str.lower().str.contains(preferred_genre, regex=False)) &
    (imdb_df["rating"] >= min_rating) &
    (imdb_df["year"] >= min_year)
]

# Optional constraint: the actor may appear in any of the three cast columns.
if preferred_actor:
    filtered_movies = filtered_movies[
        (filtered_movies["actor 1"].str.lower().str.contains(preferred_actor, regex=False)) |
        (filtered_movies["actor 2"].str.lower().str.contains(preferred_actor, regex=False)) |
        (filtered_movies["actor 3"].str.lower().str.contains(preferred_actor, regex=False))
    ]

# Optional constraint: director.
if preferred_director:
    filtered_movies = filtered_movies[
        filtered_movies["director"].str.lower().str.contains(preferred_director, regex=False)
    ]

# Show at most ten movies that satisfy every constraint.
if not filtered_movies.empty:
    print("\nRecommended Movies:")
    print(filtered_movies[["name", "year", "rating", "genre", "director"]].head(10))
else:
    print("No movies found based on your preferences.")

Enter preferred genre: Comedy


Enter minimum rating (e.g., 7.5): 8
Enter minimum release year: 1999
Enter preferred actor (leave blank if none):
Enter preferred director (leave blank if none):

Recommended Movies:
name year rating genre \
74 3 Idiots 2009 8.4 Comedy, Drama
943 Ammaa Ki Boli 2021 8.1 Comedy, Drama
1103 Ankhon Dekhi 2013 8.0 Comedy, Drama
1599 Badhaai Ho 2018 8.0 Comedy, Drama
1708 Bahattar Hoorain 2019 8.8 Comedy
1739 Bajrangi Bhaijaan 2015 8.0 Action, Adventure, Comedy
1876 Barfi! 2012 8.1 Comedy, Drama, Romance
2074 Bhabhipedia 2018 8.3 Comedy, Romance
2983 Chhichhore 2019 8.3 Comedy, Drama
3201 Colour photo 2018 8.7 Comedy

director
74 Rajkumar Hirani
943 Narayan Chauhan
1103 Rajat Kapoor
1599 Amit Ravindernath Sharma
1708 Sanjay Puran Singh Chauhan
1739 Kabir Khan
1876 Anurag Basu
2074 Saumyy Shivhare
2983 Nitesh Tiwari
3201 Aziz Naser
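
The interactive cell above can also be wrapped in a reusable function so that the same constraints can be applied without typing them each time. This is a minimal sketch, not the notebook's implementation: the `recommend` function, its parameter names, and the decision to sort by rating are assumptions. The four toy rows are copied from the Shoojit Sircar table earlier in the notebook.

```python
import pandas as pd

ACTOR_COLS = ["actor 1", "actor 2", "actor 3"]


def recommend(df, genre="", min_rating=0.0, min_year=None,
              actor="", director="", top_n=10):
    """Return up to top_n rows of df that satisfy every supplied constraint."""
    mask = df["rating"] >= min_rating
    if genre:
        mask = mask & df["genre"].str.contains(genre, case=False, regex=False, na=False)
    if min_year is not None:
        # Rows with a missing year never satisfy a year constraint.
        mask = mask & (df["year"] >= min_year).fillna(False).astype(bool)
    if actor:
        in_cast = pd.Series(False, index=df.index)
        for col in ACTOR_COLS:
            in_cast = in_cast | df[col].str.contains(actor, case=False, regex=False, na=False)
        mask = mask & in_cast
    if director:
        mask = mask & df["director"].str.contains(director, case=False, regex=False, na=False)
    return df[mask].sort_values(by="rating", ascending=False).head(top_n)


# Rows copied from the notebook output above.
toy = pd.DataFrame({
    "name": ["Gulabo Sitabo", "Madras Cafe", "Piku", "Vicky Donor"],
    "year": pd.array([2020, 2013, 2015, 2012], dtype="Int64"),
    "genre": ["Comedy, Drama", "Action, Drama, Thriller", "Comedy, Drama", "Comedy, Romance"],
    "rating": [6.3, 7.7, 7.6, 7.8],
    "director": ["Shoojit Sircar"] * 4,
    "actor 1": ["Amitabh Bachchan", "John Abraham", "Deepika Padukone", "Ayushmann Khurrana"],
    "actor 2": ["Ayushmann Khurrana", "Nargis Fakhri", "Amitabh Bachchan", "Yami Gautam"],
    "actor 3": ["Vijay Raaz", "Raashi Khanna", "Irrfan Khan", "Annu Kapoor"],
})

comedies = recommend(toy, genre="comedy", min_rating=7.0, min_year=2010)
with_amitabh = recommend(toy, actor="amitabh")
print(comedies[["name", "year", "rating"]])
print(with_amitabh[["name", "year", "rating"]])
```

Unlike the interactive cell, which returns the first ten matches in dataset order, this sketch sorts by rating before truncating, so the highest-rated matches are the ones shown.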

2 Conclusion
In this experiment, we applied a constraint-based filtering approach to filter movies based on
ratings, year of release, and genre. The constraints were defined as:
Movies with rating >= 4.
Movies released in 2000 or later.
No genre constraint applied in the final filtering.
The filtered dataset includes the movies that meet these conditions, providing a broad selection
of relatively modern Indian movies (from 2000 onwards) with a moderate rating (4 or above).

| Aspect | Constraint-Based Filtering | Content-Based Filtering |
|---|---|---|
| Definition | Filters data based on user-specified criteria (e.g., rating, genre, year). | Recommends items based on the features and characteristics of the items themselves (e.g., genre, plot keywords, actors). |
| User Input | Requires specific constraints from the user (e.g., rating threshold, genre, year). | Relies on the analysis of item characteristics to suggest similar items. |
| Personalization | Less personalized, based purely on predefined constraints. | Highly personalized, as it recommends based on user preferences and item similarity. |
| Filtering Logic | Applies filters based on strict conditions (rating, year, etc.). | Uses algorithms to match similar items based on content attributes like genres, keywords, or actors. |
| Flexibility | Can be rigid, as it is based on predefined criteria. | More flexible, adapts to user tastes and patterns over time. |
| Examples of Use Cases | Searching for movies with a specific rating or release year. | Recommending movies similar to the ones a user has watched or liked in the past. |
| Pros | Simple to implement, easy to use, provides control to users. | Offers personalized recommendations, adapts to user preferences. |
| Cons | Can be too rigid, leading to missed recommendations or a lack of diversity. | Can require more complex algorithms and data to analyze content features effectively. |
| Data Requirements | Requires structured data (e.g., rating, year, genre) with minimal processing. | Requires detailed metadata and content features (e.g., keywords, director, actors). |
| Implementation Complexity | Simple to implement, requires basic filtering techniques. | More complex, involves content analysis and item profiling, typically using machine learning models. |
| Performance Over Time | Static; does not adapt to new user behavior or preferences unless manually adjusted. | Dynamic; adapts to user behavior and preferences over time. |
| Handling of Novel Items | Can miss recommendations for new items without sufficient data. | Can recommend new items if they share similar features with items the user has liked. |
| Scalability | Works well with small to medium datasets where filtering criteria are easy to define. | Scales well to large datasets but requires more computation power for analyzing content. |
| Interpretability | High interpretability, as users know the exact criteria used for filtering. | Can be harder to explain, especially when machine learning models are involved. |
| Bias and Diversity | May introduce bias if the constraints are too narrow (e.g., excluding less popular genres or ratings). | May lead to homogeneity in recommendations, as similar items are repeatedly suggested. |
| Real-time Recommendations | Can be less responsive in real-time settings, especially with large filtering constraints. | Can provide real-time recommendations, especially when continuously learning from user interactions. |
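
To make the comparison concrete, the following is a minimal sketch of content-based filtering on genre alone. It assumes each movie is described by its set of genres and that similarity is measured with the Jaccard index (size of the intersection divided by size of the union). This is one simple choice among many and is not taken from the notebook; the helper names are hypothetical. The four titles and their genres are copied from the notebook output.

```python
def genre_set(genre_string):
    """Turn 'Comedy, Drama' into {'comedy', 'drama'}."""
    return {g.strip().lower() for g in genre_string.split(",") if g.strip()}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Titles and genres as printed earlier in the notebook.
catalogue = {
    "Piku": "Comedy, Drama",
    "Gulabo Sitabo": "Comedy, Drama",
    "Vicky Donor": "Comedy, Romance",
    "Madras Cafe": "Action, Drama, Thriller",
}


def similar_to(title, catalogue, top_n=3):
    """Rank every other movie by genre overlap with the given title."""
    target = genre_set(catalogue[title])
    scores = [
        (other, jaccard(target, genre_set(genres)))
        for other, genres in catalogue.items()
        if other != title
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


ranking = similar_to("Piku", catalogue)
for other, score in ranking:
    print(f"{other}: {score:.2f}")
```

The contrast with the constraint-based cells is that nothing is excluded outright: every movie receives a similarity score relative to a movie the user liked, and the ranking follows from that score, whereas a constraint either admits or rejects a movie.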
