Movie Recommendation
Recommendation System
Dr. Yeresime Suresh¹, Rohit Kumar BR², Jahnavi Reddy C³, P Sai Rohini⁴, Sharath K⁵
¹Associate Professor, ²,³,⁴,⁵Final Year Students
Department of Computer Science and Engineering,
Ballari Institute of Technology & Management, Ballari - 583 104, India.
{¹suresh.vec04, ²bpscrohit, ³jahnavireddy2891, ⁴prohini2906, ⁵sharathkori0526}@gmail.com
I. Introduction
In today's information age, almost all material is accessible online; all we have to do is search for it. Internet users spend considerable time looking up information about movies. We therefore reasoned that making all the necessary information available on a single website would let users find movies more quickly. This was a key factor in our decision to start this project. Recommendation algorithms are increasingly important in daily life. People are constantly pressed for time, so recommendation systems are valuable because they make suggestions without much human effort. A recommendation system essentially seeks out material that is likely to interest a particular person. Such systems save people time in making decisions, and they use artificial intelligence to predict what a person wants to view.

The article is organized as follows. Section II reviews the state-of-the-art literature. Section III describes the proposed approach for recommending movies to the user. Section IV presents the results and their analysis. Section V concludes the work and outlines the scope for future work.

Fig. 1. Categorization of recommendation systems.

II. Literature Review

Lina Chen et al. (2017) make use of collaborative filtering in their study, which concentrates on how to recommend movies effectively and efficiently. The suggested approach handles huge datasets using the Java MapReduce framework. High efficiency and dependability can be attained for the suggested model by utilising the MapReduce architecture [2].

Unnathi et al. (2018) proposed collaborative filtering with Apache Mahout. The collaborative filtering method takes the user's tastes and actions into consideration, and predictions are made on the basis of user similarities. Apache Mahout is used to integrate and apply machine learning tools. The whole system relies on the combined use of Apache Mahout and collaborative filtering [3].

Shreya Agarwal et al. argued that the techniques of the past are now obsolete. The most recent iteration of the hybrid technique was therefore created with the express purpose of raising the standard of the movie recommendation system. The collaborative filtering technique
and content-based filtering both have advantages and disadvantages, and the suggested system combines the two so that the strengths of each offset the weaknesses of the other. A Support Vector Machine is used as the predictor in the implementation. The combined approach showed an improvement in the overall system's performance [4].

Mihhail Matskin et al. (2016) proposed a specialized recommender system for a movie website. The textual metadata is collected and analyzed, and it is found to be unique and varied, which sets the system apart from other movie recommender systems. Following the analysis, similarity is computed and movies are suggested. The model also offers an extra feature for adjusting the weights of the textual information when computing similarity [5].

Jeffrey Lund and Yiu-Kai Ng proposed a system that combines collaborative filtering with encoders. The method trains the suggested model on the MovieLens data set. A specific user's movie ratings are predicted from the ratings given by other users. The system employs collaborative filtering together with a neural network model to suggest movies to the user, and it uses regularisation to lower the recommendation error rate [6].

Vallari Manaci et al. (2020) proposed a hybrid technique, which raises the system's quality and is the primary idea of their recommendation system. The hybrid strategy combines a content-based approach and a collaborative approach. It starts with one-hot encoding and then generates a similarity matrix. The resulting matrix is passed to a deep neural network with the softmax activation function to produce the suggested selection of films [7].

Patrick Adolf et al. (2019) presented a study that examines how text-based classification methods operate. It analyzes the effectiveness and applications of different classification and regression methods, and it briefly explains how each method operates in various situations [8].

Angshuman Paul et al. (2018) conducted an advanced study of the Random Forest algorithm, explaining how performance is enhanced and assessed based on internal parameters. The approach improves classification performance by reducing the number of trees and iteratively removing unimportant features [9].

III. Proposed Methodology

The proposed system mainly concentrates on the following features.
• Providing basic information about the movie
• Sentiment analysis of the movie's reviews
• Recommending similar movies

Figure 2 represents the functionalities of the proposed system through a Use Case diagram.
• Step 1: Input Movie Name
The system lets the user input a movie name.
• Step 2: Extracting Movie Details
The system extracts the basic information about the movie and its cast.
• Step 3: Recommending similar movies
The system recommends similar movies to the user, based on cosine similarity.
• Step 4: Sentiment Analysis of reviews
In this step, the system scrapes review content from the Internet (by parsing the HTML web pages) and performs sentiment analysis of the reviews using the naïve Bayes classification algorithm.

Fig. 2. Use case diagram for proposed system functionalities

Figure 3 depicts the flow chart for the proposed movie recommendation system based on content filtering.

Pre-Processing:
The selected data set cannot be used directly for similarity calculations. We therefore format the dataset in a manner that makes it possible to quickly determine the similarity value. We also introduce the idea of ranking, which makes it easier to find the most relevant document.

Term Frequency - Inverse Document Frequency (TF-IDF)
TF-IDF is a statistical measure that assesses a word's relevance to a document within a collection of documents. It is used for information retrieval and document search. It is based on the number of times a word appears in the documents. Very common terms such as "if", "where", and "and" are given the least priority because they carry little information about any particular document. The simplest way to calculate term frequency is to count the number of times a word appears in the given document and divide by the document's length:

$$TF(i, j) = \frac{\text{frequency of term } i \text{ in document } j}{\text{total words in document } j} \tag{1}$$

Equation 1 gives the number of times a word is repeated in a document, relative to the document's length. Once the inverse document frequency is applied, words that are common across documents receive values closer to 0. By considering all the
documents, dividing the number of documents by the number that contain the word, and then computing the logarithm, the inverse document frequency is obtained, and TF-IDF is its product with the term frequency. Consequently, a word that is repeated across many documents receives a value closer to 0, whereas a rarer word receives a higher value.

Fig. 3. Flow chart for movie recommendation system

A. Recommending Similar movies:
Figure 4 represents the sequence diagram for the proposed system based on the use case diagram (Figure 2). Cosine similarity is used to find the similarity between two documents. In our recommendation system, it is used to find and filter similar movies based on the user's interests.

$$\cos(x, y) = \frac{x \cdot y}{|x|\,|y|} \tag{2}$$

B. Movie Reviews:
Sentiment analysis based on naive Bayes classification is a common technique for analyzing movie reviews. Naive Bayes is a probabilistic algorithm that determines the likelihood that each phrase in a review falls under a specific sentiment class (positive or negative). Figure 5 shows sample reviews obtained through sentiment analysis. The algorithm then classifies the review as favorable or negative using these probabilities. The Mean Square Error (MSE) is a parameter used to address issues with the data's branching patterns from each component. Equation 3 represents the MSE formula used in our approach. The MSE value obtained for the proposed model is 0.0072.

$$MSE = \frac{1}{N} \sum_{i=1}^{D} (x_i - y_i)^2 \tag{3}$$
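The review-classification step and the error measure in Equation 3 can be illustrated with a minimal sketch. It assumes a multinomial naive Bayes model with Laplace smoothing and a two-class (positive/negative) setup; the training reviews, the function names, and the choice to compare predicted probabilities against 0/1 labels in the MSE are all hypothetical, since the paper does not specify them. The sketch also uses the same sample count for the averaging factor and the summation limit.

```python
import math
from collections import Counter

# Hypothetical labelled reviews (1 = positive, 0 = negative); not from the paper.
TRAIN = [
    ("a wonderful and moving film", 1),
    ("great acting and a great story", 1),
    ("boring plot and terrible acting", 0),
    ("a dull and terrible film", 0),
]


def train_naive_bayes(samples):
    """Count words per class and reviews per class."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for text, label in samples:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def predict_positive_probability(model, text):
    """Return P(positive | review) under the naive Bayes assumption."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label in (0, 1):
        # Log prior plus Laplace-smoothed log likelihood of each known word.
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:
                score += math.log((word_counts[label][word] + 1) / denom)
        log_scores[label] = score
    # Normalise the two log scores into a probability.
    top = max(log_scores.values())
    exp_neg = math.exp(log_scores[0] - top)
    exp_pos = math.exp(log_scores[1] - top)
    return exp_pos / (exp_neg + exp_pos)


def mean_squared_error(actual, predicted):
    """Average squared difference between paired values (Equation 3)."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


MODEL = train_naive_bayes(TRAIN)
```

A review is labelled positive when `predict_positive_probability(MODEL, review)` exceeds 0.5, and `mean_squared_error` can then be applied to the true labels and the predicted probabilities of a held-out set.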
Fig. 4. Sequence diagram for movie recommendation system
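The content-based step of Section III-A, TF-IDF vectors compared by cosine similarity (Equation 2), can be sketched as follows. This is an illustration under stated assumptions rather than the system's actual implementation: the catalogue `MOVIES`, the function names, and the use of the standard weighting tf × log(N / df) are hypothetical choices, and the paper does not state which movie attributes are vectorized.

```python
import math
from collections import Counter

# Hypothetical toy catalogue: title -> descriptive text (genres, keywords).
MOVIES = {
    "Space Quest": "space adventure alien crew rescue",
    "Galaxy Run": "space alien chase adventure pilot",
    "Love in Paris": "romance paris love comedy",
    "Paris Nights": "romance paris drama love",
}


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [text.lower().split() for text in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(x, y):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * y.get(term, 0.0) for term, weight in x.items())
    norm_x = math.sqrt(sum(w * w for w in x.values()))
    norm_y = math.sqrt(sum(w * w for w in y.values()))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return dot / (norm_x * norm_y)


def recommend(title, movies=MOVIES, top_n=2):
    """Rank the other movies by cosine similarity to the given title."""
    titles = list(movies)
    vectors = tf_idf_vectors([movies[t] for t in titles])
    query = vectors[titles.index(title)]
    scored = [(cosine(query, vec), other)
              for other, vec in zip(titles, vectors) if other != title]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [other for _, other in scored[:top_n]]
```

Calling `recommend("Space Quest")` ranks the remaining titles by similarity to the query movie. Storing the vectors as sparse dictionaries keeps the sketch free of external dependencies; a production system would more likely use a library vectorizer.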