
Chatbot For Banking Project Report - Phase - 1,2,3

The document outlines a project from Karpaga Vinayaga College of Engineering and Technology focused on developing an AI-driven movie recommendation system. It details the problem of content overload on streaming platforms and proposes a solution using collaborative filtering and content-based filtering techniques to provide personalized recommendations. The project includes objectives, methodologies, tools, and team roles, along with plans for future enhancements.

Uploaded by

Hari

KARPAGA VINAYAGA COLLEGE OF ENGINEERING AND TECHNOLOGY

DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING

REGULATIONS 2021

NM1075 - Naan Mudhalvan

Name:

Reg.No:

Department:

Year/Sem:
KARPAGA VINAYAGA
COLLEGE OF ENGINEERING & TECHNOLOGY
(Approved by AICTE, Accredited by NBA & Affiliated to Anna University)
G.S.T Road, Chinna Kolambakkam, Padalam – 603 308,
Madhuranthagam (Tk), Kancheepuram (Dt).

BONAFIDE CERTIFICATE

REGISTER NUMBER:

Certified that this is a Bonafide Record of Practical work done by Mr/Ms…………………………….

of ………….. semester … … … … … … … … Year B.E Degree in Electronics & Communication

Engineering in the laboratory during the Year……………………

Date :

Staff-in-charge Head of the Department

Submitted for the Practical Examination held on ………………………………………

Internal Examiner External Examiner


Phase-1 Submission Template
Student Name: M. Dinesh Karthick

Register Number: 421223106006

Institution: Karpaga Vinayaga College of Engineering and Technology

Department: B.E ECE – 2nd Year


Date of Submission: 30/04/2025

1.Problem Statement

People often spend more time deciding what to watch than actually
watching. With streaming platforms overloaded with
content, it’s crucial to offer users smart, personalized
recommendations. This project aims to solve the challenge of
content overload by recommending movies based on user
preferences and personality traits using AI.

2.Objective

Build an AI system that recommends movies based on user
preferences, past behavior, and matchmaking with similar
user profiles.

Use collaborative filtering, content-based filtering, and
deep learning models to generate recommendations.

Integrate a matchmaking layer that finds users with similar
tastes and suggests trending content from their profiles.
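
The matchmaking idea in the last objective can be sketched with cosine similarity between user rating vectors. This is a minimal illustration, not the project's code: the toy rating matrix and the function name `most_similar_users` are assumptions made for the example.

```python
import numpy as np

# Hypothetical toy data: rows are users, columns are movies, 0 means "not rated".
ratings = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 0.0, 1.0],
    [1.0, 0.0, 5.0, 4.0],
])

def most_similar_users(matrix, user, k=1):
    """Return the row indices of the k users closest to `user` by cosine similarity."""
    norms = np.linalg.norm(matrix, axis=1)
    # Cosine similarity between the chosen user and every user in the matrix
    sims = matrix @ matrix[user] / (norms * norms[user])
    sims[user] = -np.inf  # never match a user with themselves
    return np.argsort(sims)[::-1][:k]

neighbours = most_similar_users(ratings, user=0, k=1)
print(neighbours)
```

Content that the matched users rated highly, and that the target user has not seen, would then be the candidate set for suggestions.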

3.Scope

Features Planned:

Movie recommendation engine

User preference profiling

Matchmaking engine based on similarity metrics

Dashboard or web interface for users

Constraints:

Use of publicly available datasets (e.g., MovieLens)

Limited to prototype (not full production deployment)

Focus on algorithm effectiveness over large-scale deployment

4.Data Sources

Dataset: MovieLens 100K dataset (from GroupLens)

Source: Kaggle / GroupLens official site

Type: Public, static dataset


5.High-Level Methodology

Data Collection: Download MovieLens dataset from Kaggle

Data Cleaning: Handle missing ratings, remove duplicates, normalize ratings

EDA: Use visualizations like histograms, user-movie matrices, heatmaps

Feature Engineering:
Create user profiles based on genre preferences

Derive similarity scores between users

Model Building:

Content-Based Filtering

Collaborative Filtering (KNN, Matrix Factorization)

Hybrid Recommendation System using Neural Networks

Model Evaluation:

Precision, Recall, RMSE, MAE

Visualization & Interpretation:

Use interactive charts to explain recommendations

Deployment:
Optional deployment as a Streamlit app
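
The data cleaning step above (missing ratings, duplicates, normalization) might look like the sketch below. The small table is hypothetical, the column names follow the usual MovieLens layout, and the 0.5 to 5.0 rating scale used for normalization is an assumption that should be checked against the dataset actually downloaded.

```python
import pandas as pd

# Hypothetical ratings table in the MovieLens column layout (userId, movieId, rating).
raw = pd.DataFrame({
    "userId":  [1, 1, 1, 2, 2],
    "movieId": [10, 10, 20, 10, 30],
    "rating":  [4.0, 4.0, None, 2.0, 5.0],
})

clean = (
    raw.dropna(subset=["rating"])                       # drop rows with a missing rating
       .drop_duplicates(subset=["userId", "movieId"])   # keep one rating per user-movie pair
       .copy()
)

# Min-max normalization to [0, 1]; the 0.5-5.0 scale is an assumption about the dataset.
RATING_MIN, RATING_MAX = 0.5, 5.0
clean["rating_norm"] = (clean["rating"] - RATING_MIN) / (RATING_MAX - RATING_MIN)
print(clean)
```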

6.Tools and Technologies

Language: Python

IDE/Notebook: Google Colab, VS Code

Libraries: pandas, numpy, matplotlib, seaborn, scikit-learn, surprise,
TensorFlow/Keras (optional)

Deployment Tools (optional): Streamlit, Flask, or Gradio (for a simple web app)

7. Team Members and Roles

M. Dinesh Karthick

Data Cleaning and EDA

Model Building (Content-Based + Collaborative Filtering)

Writing documentation and coordinating team tasks

P. Varunraj

Feature Engineering

Model Evaluation and Interpretation

Building visual dashboards using matplotlib/seaborn


T. Vishal

Data Collection and Preprocessing

Integration of Matchmaking Module

Optional Web App Deployment using Streamlit or Flask


Phase-2 Submission Template
Student Name: M. Dinesh Karthick
Register Number: 42133010606

Institution: Karpaga Vinayaga College of Engineering and Technology

Department: ECE

Date of Submission: 15/05/2025


Github Repository Link:
https://github.com/DineshKarthick743/ai-movie-recommendation.git

1. Problem Statement
The objective of this project is to develop an AI-driven
movie recommendation system that leverages data
analysis and machine learning algorithms to provide
personalized movie suggestions to users. This system
aims to bridge the gap between user preferences and
available movie content by identifying patterns in user
data and matching them with relevant movie genres,
directors, actors, and other attributes.

3. Project Objectives
To build a robust recommendation model using supervised learning techniques.

To implement data preprocessing techniques to clean and transform the dataset.

To conduct exploratory data analysis (EDA) to understand key data patterns.

To develop and compare multiple machine learning models for generating
personalized recommendations.

To evaluate model performance using accuracy, precision, recall, and F1-score
metrics.

4. Flowchart of the Project Workflow


5. Data Description
Dataset Source: [Insert Dataset Source, e.g., Kaggle, IMDB API]

Data Type: Structured data consisting of movie attributes (e.g., genre, director,
cast) and user interaction data (e.g., ratings, reviews).

Number of Records: [Insert Record Count]

Features: Genre, Director, Cast, User Ratings, Release Year, etc.

Target Variable: Movie Recommendations

6. Data Preprocessing

Handling missing values through imputation and removal of unnecessary records.

Converting categorical features using label encoding and one-hot encoding.

Normalizing numerical features to maintain consistency.

Removing duplicates and detecting outliers using statistical methods.
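
Two of the preprocessing steps above, one-hot encoding and statistical outlier detection, are sketched here on made-up data. The pipe-separated `genres` column follows the MovieLens convention; the 1.5 × IQR rule is one common choice of statistical method, not necessarily the one used in the project.

```python
import pandas as pd

# Hypothetical movie table with a multi-valued, pipe-separated genres column.
movies = pd.DataFrame({
    "title":  ["A", "B", "C"],
    "genres": ["Action|Comedy", "Comedy", "Drama"],
})

# One-hot encode genres: one 0/1 column per distinct genre.
genre_flags = movies["genres"].str.get_dummies(sep="|")
encoded = pd.concat([movies[["title"]], genre_flags], axis=1)
print(encoded)

# Outlier detection with the interquartile range (IQR) rule on hypothetical
# per-user rating counts: values beyond 1.5 * IQR from the quartiles are flagged.
counts = pd.Series([3, 4, 5, 4, 100])
q1, q3 = counts.quantile([0.25, 0.75])
iqr = q3 - q1
outliers = counts[(counts < q1 - 1.5 * iqr) | (counts > q3 + 1.5 * iqr)]
print(outliers)
```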


7. Exploratory Data Analysis (EDA)
Univariate Analysis: Analyzing feature distributions using histograms
and boxplots.

Bivariate Analysis: Identifying correlations between user ratings and
movie attributes using scatter plots and heatmaps.

Multivariate Analysis: Analyzing interactions between multiple
variables to identify patterns in user preferences.

8. Feature Engineering
Creating new features such as movie popularity scores, average ratings, and
genre-based user profiles.

Implementing dimensionality reduction using PCA to enhance model efficiency.

Combining features such as genre, director, and cast to create unique identifiers
for recommendation purposes.
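
As a rough illustration of the feature engineering steps above, the sketch below derives a popularity score and an average rating per movie, then reduces a small feature matrix with PCA. All data is hypothetical. PCA is computed here directly from the SVD of the centered matrix using NumPy; scikit-learn's `PCA` class would be the more usual tool.

```python
import numpy as np
import pandas as pd

# Hypothetical ratings in the MovieLens column layout.
ratings = pd.DataFrame({
    "userId":  [1, 2, 3, 1, 2, 1],
    "movieId": [10, 10, 10, 20, 20, 30],
    "rating":  [4.0, 5.0, 3.0, 2.0, 4.0, 5.0],
})

# Popularity = number of ratings per movie; avg_rating = mean rating per movie.
movie_stats = ratings.groupby("movieId")["rating"].agg(
    popularity="count", avg_rating="mean"
)
print(movie_stats)

# PCA via SVD on a hypothetical feature matrix (rows = items, columns = features).
features = np.array([
    [1.0, 2.0, 0.0],
    [2.0, 4.1, 0.1],
    [3.0, 6.0, 0.0],
    [4.0, 8.1, 0.1],
])
centered = features - features.mean(axis=0)
_, singular_values, components = np.linalg.svd(centered, full_matrices=False)
reduced = centered @ components[:2].T                      # keep 2 principal components
explained = singular_values**2 / np.sum(singular_values**2)  # variance share per component
print(reduced.shape, explained.round(3))
```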

9. Model Building

Implementing multiple models: Logistic Regression, Random Forest, and K-Nearest Neighbors.

Splitting data into training and testing sets with an 80-20 ratio.

Training models and comparing performance using evaluation metrics.

Selecting the best-performing model for deployment.
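
The model-building procedure above can be sketched as follows. The data is synthetic (generated with `make_classification`), standing in for a hypothetical binary "user liked the movie" label; the report does not specify how the target is encoded, so that framing is an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for engineered features (X) and a binary "liked" label (y).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# 80-20 train/test split, as described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
}

# Train each model and record its accuracy on the held-out 20%.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

best = max(scores, key=scores.get)
print(scores, "-> best:", best)
```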


10. Visualization of Results & Model Insights

Confusion matrix to illustrate model accuracy.

ROC curve to assess the trade-off between sensitivity and specificity.

Feature importance plots to identify influential features.
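
For reference, the four cells of a binary confusion matrix and the metrics derived from them can be computed as in this small sketch. The label arrays are made up for illustration and are not project results.

```python
import numpy as np

# Hypothetical true labels and model predictions (1 = liked, 0 = not liked).
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # true positives
fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # false positives
fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # false negatives
tn = int(np.sum((y_true == 0) & (y_pred == 0)))  # true negatives

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print([[tn, fp], [fn, tp]])          # confusion matrix, rows = actual class
print(accuracy, precision, recall, f1)
```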

11. Tools and Technologies Used

Programming Language: Python

IDE/Notebook: Jupyter Notebook, Google Colab

Libraries: pandas, numpy, seaborn, matplotlib, scikit-learn, XGBoost

Visualization Tools: Plotly, Matplotlib

12. Team Members and Contributions

Name: Roles and Responsibilities

M. Dinesh Karthick: Data Cleaning

P. Varunraj: EDA and Feature Engineering

T. Vishal: Model Development


Phase-3 Submission Template
Student Name: M. Dinesh Karthick
Register Number: 421223106006

Institution: Karpaga Vinayaga College of Engineering and Technology
Department: Electronics and Communication Engineering
Date of Submission: 20/05/2025
Github Repository Link:
https://github.com/DineshKarthick743/Ai-movie-.git

1. Problem Statement
With an exponential rise in content, users face difficulty in finding movies that
match their tastes. This project aims to solve that by building an AI-powered
recommendation system that suggests personalized movie content. It is a
collaborative filtering and content-based recommendation problem using machine
learning techniques.

2. Abstract
This project focuses on designing a personalized movie recommendation system
using an AI-driven matchmaking approach. The goal is to enhance user experience
by providing tailored movie suggestions based on user preferences, watch history,
and movie metadata. The system combines collaborative filtering and content-based
techniques. We used publicly available datasets, performed preprocessing and
exploratory analysis, engineered key features, trained models like KNN, SVD, and
neural networks, and evaluated their performance. The final model is deployed using
Streamlit to make real-time predictions.

3. System Requirements
• Hardware:

• RAM: 8GB+

• Processor: Intel i5 or above

• Software:

• Python 3.8+

• Libraries: pandas, numpy, scikit-learn, matplotlib, seaborn, surprise, streamlit

• IDE: Google Colab / Jupyter Notebook

4. Objectives
• Recommend movies based on user preferences
• Improve prediction accuracy using hybrid modeling
• Enable real-time suggestions via a web interface
• Enhance user engagement with relevant content

5. Flowchart of Project Workflow

6. Dataset Description
• Source: Kaggle - MovieLens Dataset
• Type: Public
• Size: ~100,000 ratings by 600 users for 9,000+ movies
• Structure: 4 CSV files - ratings.csv, movies.csv, tags.csv, links.csv

7. Data Preprocessing
• Distribution of ratings
• Most watched and top-rated movies
• Heatmap for user-movie interactions

8. Exploratory Data Analysis (EDA)
• Visual Tools Used: Histograms, boxplots, heatmaps, and scatter plots.

9. Feature Engineering
• Created user profile vector
• Genre one-hot encoding
• TF-IDF on movie descriptions
• Normalized ratings for collaborative filtering
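
One plausible way to build the user profile vector from one-hot genre columns is a rating-weighted sum of the genres of the movies a user has rated. The sketch below assumes that construction; the report does not say how the profile was actually computed, and all data here is hypothetical.

```python
import numpy as np

# Hypothetical one-hot genre matrix: rows = movies, columns = [Action, Comedy, Drama].
genre_matrix = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
], dtype=float)

# One user's ratings for the five movies; 0 means "not rated yet".
user_ratings = np.array([5.0, 1.0, 4.0, 0.0, 0.0])
rated = user_ratings > 0

# Normalize ratings by subtracting the user's mean, so disliked genres get negative weight.
centered = user_ratings[rated] - user_ratings[rated].mean()

# Profile = sum of genre vectors weighted by the centered ratings.
profile = centered @ genre_matrix[rated]

# Score each unrated movie by how well its genres match the profile.
candidates = np.flatnonzero(~rated)
scores = genre_matrix[candidates] @ profile
best_candidate = int(candidates[np.argmax(scores)])
print(profile.round(3), best_candidate)
```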

10. Model Building


Algorithms used:
• KNN (user-based)
• SVD (matrix factorization)
• Content-based using cosine similarity
These were chosen for their effectiveness in recommendation systems.
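
To illustrate the matrix factorization idea behind the SVD model, the sketch below applies a plain truncated SVD to a tiny mean-centered rating matrix using NumPy. This is an assumption-laden simplification: the `SVD` algorithm in the surprise library is a different, SGD-trained factorization with bias terms, and the matrix here is made up.

```python
import numpy as np

# Hypothetical rating matrix: rows = users, columns = movies, 0 = not rated.
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

mask = R > 0
user_means = (R * mask).sum(axis=1) / mask.sum(axis=1)

# Fill unrated cells with the user's mean, then center each row on that mean.
filled = np.where(mask, R, user_means[:, None])
centered = filled - user_means[:, None]

# Keep only the k strongest latent factors and rebuild the matrix from them.
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
k = 1
approx = (U[:, :k] * s[:k]) @ Vt[:k] + user_means[:, None]

# Predicted rating of user 0 for movie 2, which that user has not rated.
print(round(float(approx[0, 2]), 3))
```

With a single latent factor, the reconstruction separates users who prefer the first two movies from those who prefer the last two, so user 0's predicted rating for movie 2 comes out below that user's mean.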
11. Model Evaluation
• Metrics: RMSE, MAE
• KNN RMSE: 0.91 | SVD RMSE: 0.87
• ROC not applicable (not binary classification)
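
For clarity, RMSE is the square root of the mean squared difference between predicted and actual ratings, and MAE is the mean absolute difference. The sketch below computes both on made-up numbers; these values are unrelated to the RMSE figures reported above.

```python
import numpy as np

# Hypothetical held-out ratings and the model's predictions for them.
actual = np.array([4.0, 3.0, 5.0, 2.0])
predicted = np.array([3.5, 3.0, 4.0, 3.0])

errors = predicted - actual
rmse = float(np.sqrt(np.mean(errors**2)))  # penalizes large errors more heavily
mae = float(np.mean(np.abs(errors)))       # average size of an error, in rating points

print(rmse, mae)
```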
12. Source code:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel
from sklearn.neighbors import NearestNeighbors

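# Load CSV files (file names as listed in the Dataset Description: movies.csv, ratings.csv)
movies = pd.read_csv('movies.csv')
ratings = pd.read_csv('ratings.csv')

# Merge both files on movieId
data = pd.merge(ratings, movies, on='movieId')

# Report the number of missing values per column
print(data.isnull().sum())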

# Create pivot table for collaborative filtering (rows = users, columns = titles)
user_movie_matrix = data.pivot_table(index='userId', columns='title', values='rating')
# Unrated movies become 0
user_movie_matrix.fillna(0, inplace=True)

# TF-IDF on genres
movies['genres'] = movies['genres'].fillna('')
tfidf = TfidfVectorizer(stop_words='english')
tfidf_matrix = tfidf.fit_transform(movies['genres'])
# TF-IDF rows are L2-normalized by default, so the linear kernel equals cosine similarity
cosine_sim = linear_kernel(tfidf_matrix, tfidf_matrix)

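def get_recommendations(title, top_n=5):
    # Row of the requested title (movies keeps its default 0..n-1 index)
    idx = movies[movies['title'] == title].index[0]
    # Pair every other movie with its similarity to the requested one;
    # the movie itself is excluded explicitly, since ties at similarity 1.0 are common
    sim_scores = [(i, s) for i, s in enumerate(cosine_sim[idx]) if i != idx]
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)[:top_n]
    movie_indices = [i for i, _ in sim_scores]
    return movies['title'].iloc[movie_indices]

print(get_recommendations('Toy Story (1995)'))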

# KNN for user-based recommendations
model_knn = NearestNeighbors(metric='cosine', algorithm='brute')
model_knn.fit(user_movie_matrix.values)
# Query with the first user; the result normally includes that user (distance 0) plus 3 neighbours
distances, indices = model_knn.kneighbors([user_movie_matrix.iloc[0].values], n_neighbors=4)
print(indices)

# ---------------------- VISUALIZATIONS ----------------------

# 1. Bar Chart - Top 10 most rated movies
top_movies = data['title'].value_counts().head(10)
plt.figure(figsize=(10,5))
sns.barplot(x=top_movies.values, y=top_movies.index, palette='viridis')
plt.title("Top 10 Most Rated Movies")
plt.xlabel("Ratings Count")
plt.ylabel("Movie Title")
plt.show()

# 2. Histogram - Distribution of ratings
plt.figure(figsize=(8,5))
plt.hist(data['rating'], bins=10, color='skyblue', edgecolor='black')
plt.title('Distribution of Ratings')
plt.xlabel('Rating')
plt.ylabel('Count')
plt.show()

# 3. Pie Chart - Proportion of rating values
rating_counts = data['rating'].value_counts().sort_index()
plt.figure(figsize=(6,6))
plt.pie(rating_counts, labels=rating_counts.index, autopct='%1.1f%%', startangle=140, colors=sns.color_palette('pastel'))
plt.title('Rating Value Distribution')
plt.axis('equal')
plt.show()

# 4. Line Graph - Average rating per year (optional, if timestamp available)
if 'timestamp' in data.columns:
    # Year in which each rating was given (timestamps are seconds since the epoch)
    data['year'] = pd.to_datetime(data['timestamp'], unit='s').dt.year
    yearly_avg = data.groupby('year')['rating'].mean()
    plt.figure(figsize=(10,5))
    plt.plot(yearly_avg.index, yearly_avg.values, marker='o', color='coral')
    plt.title('Average Movie Rating by Year')
    plt.xlabel('Year')
    plt.ylabel('Average Rating')
    plt.grid(True)
    plt.show()
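
The listing above stops at printing the neighbour indices. One possible next step, sketched here on a hypothetical four-user matrix, is to average the neighbours' ratings and suggest the highest-scoring titles the target user has not rated. The averaging rule is an assumption for illustration, not something taken from the report.

```python
import pandas as pd
from sklearn.neighbors import NearestNeighbors

# Hypothetical user-movie matrix in the same shape as user_movie_matrix (0 = not rated).
user_movie_matrix = pd.DataFrame(
    [[5, 4, 0, 0],
     [5, 5, 4, 0],
     [4, 5, 5, 1],
     [0, 1, 0, 5]],
    index=[1, 2, 3, 4],
    columns=["Movie A", "Movie B", "Movie C", "Movie D"],
    dtype=float,
)

model_knn = NearestNeighbors(metric="cosine", algorithm="brute")
model_knn.fit(user_movie_matrix.values)

target = 0  # row position of the user to recommend for
_, indices = model_knn.kneighbors(user_movie_matrix.iloc[[target]].values, n_neighbors=3)
neighbour_rows = [int(i) for i in indices[0] if i != target]  # drop the user themselves

# Average the neighbours' ratings per movie (unrated cells count as 0 in this simple version).
neighbour_mean = user_movie_matrix.iloc[neighbour_rows].mean(axis=0)

# Keep only movies the target user has not rated, best first.
unseen = user_movie_matrix.iloc[target] == 0
recommendations = neighbour_mean[unseen].sort_values(ascending=False)
print(recommendations)
```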

13. Future Scope
• Integrate sentiment analysis of user reviews
• Add multilingual content recommendations
• Use deep learning (autoencoders) for improved accuracy

14. Team Members and Roles


Team Members – M. Dinesh Karthick, P. Varunraj, T. Vishal
Roles-
• M. Dinesh Karthick – Data Collection, Preprocessing, EDA
• P. Varunraj – Feature Engineering, Model Building
• T. Vishal – Deployment, Documentation

Hackathon Submission Template (Level-1-Solution)

Use Case Title: Delivering Personalized Movie Recommendations with an AI-driven Matchmaking System

Student Name: P. Varunraj
Register Number: 421223106026
Institution: Karpaga Vinayaga College of Engineering and Technology
Department: ECE
Date of Submission: 20/05/2025

1. Problem Statement

With the rapid growth of online streaming platforms, users are
overwhelmed with choices. Often, they waste time browsing instead
of watching content. Static recommendation systems fail to adapt to
users’ evolving preferences. There is a need for a smart, dynamic
solution that understands a user’s taste and delivers highly relevant
movie suggestions.

2. Proposed Solution

We propose an AI-driven recommendation system that matches
users with movies using machine learning techniques like
collaborative filtering and content-based filtering. The system
analyzes user history, preferences, and viewing patterns to predict
movies they are most likely to enjoy.

Key Features:

Personalized recommendations

Real-time learning from user interactions

Hybrid recommendation combining user behavior and movie features

Scalable for large streaming platforms

3. Technologies & Tools Considered

Programming Languages: Python

Libraries/Frameworks: Scikit-learn, TensorFlow, Pandas, Surprise, Flask

Frontend: React or HTML/CSS for user interface

Database: PostgreSQL or MongoDB

APIs: IMDb, TMDb for movie data


4. Solution Architecture & Workflow

1. User Data Collection: Watching history, ratings, preferences

2. Data Preprocessing: Encoding, normalization, and filtering

3. Model Inference: Hybrid recommendation engine processes inputs

4. Recommendation Engine: Suggests top movies

5. Feedback Loop: User ratings and actions are fed back to retrain and improve the model

5.Feasibility & Challenges

Feasibility:
Technologies for recommender systems are well-documented, and datasets are widely available.
Implementation is practical with modest resources.

Challenges:

Cold start problem for new users

Data sparsity and scalability

Privacy concerns
Solutions: Use hybrid models, matrix factorization, and
anonymized data processing.
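
Two of the proposed solutions can be sketched briefly: a hybrid score that falls back to content-based scoring for cold-start users, and pseudonymizing user IDs before processing. The function names, the blend weight `alpha`, the `min_ratings` threshold, and the salt are all hypothetical choices for illustration. Note that salted hashing is pseudonymization, which is weaker than full anonymization.

```python
import hashlib

def hybrid_score(cf_score, content_score, n_user_ratings, min_ratings=5, alpha=0.7):
    """Blend collaborative and content-based scores.

    Users with fewer than `min_ratings` ratings (cold start) get the
    content-based score only, since collaborative filtering has little to go on.
    """
    if n_user_ratings < min_ratings:
        return content_score
    return alpha * cf_score + (1 - alpha) * content_score

def pseudonymize(user_id, salt):
    """Replace a raw user ID with a salted SHA-256 digest."""
    return hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()

print(hybrid_score(4.0, 3.0, n_user_ratings=20))  # established user: blended score
print(hybrid_score(4.0, 3.0, n_user_ratings=2))   # new user: content-based score only
print(pseudonymize(42, salt="example-salt")[:12])
```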

7. Expected Outcome & Impact

Saves user time and improves streaming experience

Increases user engagement and platform retention

Personalized content delivery enhances satisfaction


Beneficiaries: Streaming platforms, users, and content creators

8. Future Enhancements

Voice and emotion-based recommendations

Social media integration

Cross-platform syncing of preferences

Multilingual support for diverse audiences
