Rounak Banik
IIT Roorkee
rounakbanik@gmail.com
+91 84398 60325

Movies Final Report

24th October 2017

OVERVIEW
This project is divided into two parts:

1. The Story of Film: This section aims at narrating the history, trivia and facts behind the
world of cinema through the lens of data. Extensive Exploratory Data Analysis is
performed on Movie Metadata about Movie Revenues, Casts, Crews, Budgets, etc.
through the years. Two predictive models are built to predict movie revenues and movie
success. Through these models, we also aim at discovering what features have the most
significant impact in determining revenue and success.
2. Movie Recommender Systems: This part is focused on building various kinds of
recommendation engines; namely the Simple Generic Recommender, the Content Based
Filter and the User Based Collaborative Filter. The performance of the systems is
evaluated in both a qualitative and quantitative manner.

THE CLIENT
The first section of the project does not have a definitive client. But some of the analysis
performed could be of use to anyone in the Movie Making Business (Streaming Providers,
Producers, etc). The Movie Success and Revenue Prediction Models can give valuable insights
into the features that actually determine the end class and value respectively.

The Movie Recommender System is useful to any business that makes money via
recommendations. This includes Amazon, Netflix, Hotstar, etc. Giving good recommendations
directly entails one or more of the following:

1. Customers buy a particular product or service leading to increased revenue or sales.
2. Customers use the platform more frequently due to the quality and relevance of content
shown to them.
3. Better User Experience. Customers spend less time searching and more time watching.
The pain of discovery is eliminated.

THE DATA

The data used in this project has been obtained from two sources: The Movie Database (TMDB)
and MovieLens.

MovieLens has a publicly available dataset that contains 26 million ratings and 750,000 tag
applications applied to 45,000 movies by 270,000 users. It also includes tag genome data with
12 million relevance scores across 1,100 tags. A small subset of this dataset, containing 100,000
ratings for 9,000 movies from 700 users, is also available.

One of the files contains the TMDB ID of every movie listed in the MovieLens dataset. Using this
ID, the metadata, credits and keywords of all 45,000 movies were obtained by running a script
that requested and parsed data from TMDB Open API. The data collected was initially in the
JSON format but was converted into CSV files using Python’s Pandas Library.

The following files were used in the project:

1. movies_metadata.csv: The file containing metadata collected from TMDB for over
45,000 movies. Data includes budget, revenue, date released, genres, etc.
2. credits.csv: Complete information on credits for a particular movie. Data includes Director,
Producer, Actors, Characters, etc.
3. keywords.csv: Contains plot keywords associated with a movie.
4. links_small.csv: Contains the list of movies that are included in the small subset of the
Full MovieLens Dataset.
5. ratings_small.csv: The MovieLens Dataset containing 100,000 ratings on 9,000 movies
from 700 users. The main dataset used for building the Collaborative Filter.

DATA COLLECTION

The MovieLens Full Dataset was readily available at the GroupLens Website
(https://grouplens.org/datasets/movielens/). This dataset contained 26 million ratings from
270,000 users on 45,000 movies. One file in this dataset, links.csv, contained the TMDB and
IMDB IDs for all the movies.

I signed up for an API Key with TMDB. This gave me access to data at 3 endpoints. Each endpoint
gave me details about the movie, its cast and crew information and plot keywords. I wrote 3
separate scrapers to hit each endpoint and collect this data for all 45,000 movies. Since TMDB
has a restriction of 40 requests every 10 seconds, this task took a day to execute.
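
The rate limit described above can be respected with a small sliding-window limiter. The following is a minimal sketch, not the scraper actually used in this project: the class name RateLimiter, the helper fetch_all and the callback fetch_one are hypothetical, and the HTTP call itself is left to the caller.

```python
import time
from collections import deque


class RateLimiter:
    """Allow at most `max_calls` calls in any sliding window of `period` seconds."""

    def __init__(self, max_calls=40, period=10.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock    # injectable so the limiter can be tested without waiting
        self.sleep = sleep
        self.calls = deque()  # timestamps of the calls still inside the window

    def _forget_expired(self, now):
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()

    def wait(self):
        """Block until another call is allowed, then record it."""
        now = self.clock()
        self._forget_expired(now)
        if len(self.calls) >= self.max_calls:
            # Sleep until the oldest call in the window expires.
            self.sleep(self.period - (now - self.calls[0]))
            now = self.clock()
            self._forget_expired(now)
        self.calls.append(now)


def fetch_all(movie_ids, fetch_one, limiter):
    """Call the hypothetical `fetch_one(movie_id)` for every ID, respecting the limit."""
    results = {}
    for movie_id in movie_ids:
        limiter.wait()
        results[movie_id] = fetch_one(movie_id)
    return results
```

At 40 requests per 10 seconds (4 per second), 45,000 movies across 3 endpoints come to 135,000 requests, which need at least 33,750 seconds, or roughly 9.4 hours, before any network latency or retries. This is consistent with the task taking about a day.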

All the data collected was in the form of stringified JSON which demanded more processing.

DATA WRANGLING

Overview
This section describes the various data cleaning and data wrangling methods applied on the
Movie datasets to make them more suitable for further analysis. The following sections are divided
based on the procedures followed.

Conversion to CSV Files

The data obtained from scraping was in the form of stringified JSON. This had to be converted
into CSV Files to enable easier parsing and subsequent upload to public platforms such as
Kaggle.

Removing Unnecessary Features

Some features such as the Backdrop Path, Adult and IMDB ID were unnecessary attributes and
were dropped to reduce the dimensions of the dataset.

Cleaning
The dataset had a lot of features that used 0 as a placeholder for values it did not possess.
These values were converted to NaN. Some features were still in the form of a stringified JSON
object. They were converted into Python dictionaries using Python’s ast library. These were
further reduced into lists since we did not need the ID, timestamp and other attributes.
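
As an illustration of this cleaning step, here is a minimal sketch in plain Python. It assumes each stringified object is a list of dictionaries carrying a "name" key, which is an assumption about the layout rather than something stated in this report. The helper names zero_to_none and parse_names are hypothetical, and None stands in for the NaN that Pandas would use.

```python
import ast


def zero_to_none(value):
    """Treat a 0 placeholder as a missing value."""
    return None if value == 0 else value


def parse_names(raw):
    """Turn a stringified list of dicts into a plain list of names."""
    try:
        items = ast.literal_eval(raw)  # safely evaluates Python literals only
    except (ValueError, SyntaxError, TypeError):
        return []  # missing or unparseable cell
    if not isinstance(items, list):
        return []
    return [item["name"] for item in items if isinstance(item, dict) and "name" in item]


# Illustrative row; the values are made up for the example.
row = {
    "budget": 0,
    "genres": "[{'id': 16, 'name': 'Animation'}, {'id': 35, 'name': 'Comedy'}]",
}
cleaned = {
    "budget": zero_to_none(row["budget"]),
    "genres": parse_names(row["genres"]),
}
```

ast.literal_eval is used here rather than eval because it only accepts literals, so a malformed or hostile cell cannot execute code.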

The dataframe was exploded wherever the analysis demanded it (for instance, genres and
production countries).

Finally, most of the features were converted into a Python basic type (integer, string, float) by
removing all the unclean values. The date string was converted into a Pandas Datetime and from
it, we extracted the month, year and day of release of every movie.
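
The two operations above (exploding list-valued columns and splitting the release date) can be sketched without Pandas as follows. The helper names, the "release_date" column name and the "%Y-%m-%d" format are assumptions made for illustration. Storing the day as a weekday name is also an assumption, chosen because a later section tests whether a film was released on a Friday.

```python
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def explode(rows, column):
    """Emit one copy of each row per element of the list stored in `column`.

    Rows whose list is empty are dropped in this sketch.
    """
    exploded = []
    for row in rows:
        for value in row[column]:
            exploded.append({**row, column: value})
    return exploded


def add_date_parts(row, column="release_date", fmt="%Y-%m-%d"):
    """Parse a date string and add year, month and weekday-name fields."""
    try:
        parsed = datetime.strptime(row[column], fmt)
    except (TypeError, ValueError):
        # Missing or malformed dates become missing date parts.
        return {**row, "year": None, "month": None, "day": None}
    return {
        **row,
        "year": parsed.year,
        "month": parsed.month,
        "day": WEEKDAYS[parsed.weekday()],  # weekday() is 0 for Monday
    }
```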

EXPLORATORY DATA VISUALIZATION AND ANALYSIS

In this section, the various insights produced through descriptive statistics and data visualisation
are presented.

This forms the crux of the first section of my Capstone Project.

Production Countries

1. The Movies in the dataset are overwhelmingly in the English Language and shot in the
United States of America.
2. Europe is also an extremely popular location with the UK, France, Germany and Italy in
the top five.
3. Japan and India are the most popular Asian countries when it comes to movie production.

Franchise Movies
1. The Harry Potter Franchise is the most successful movie franchise by total revenue, raking
in more than 7.707 billion dollars from 8 movies. The Star Wars Movies come in a close second
with 7.403 billion dollars, also from 8 movies.
2. The Avatar Collection, although consisting of just one movie at the moment, is the most
successful franchise by average revenue per movie, with the sole movie raking in close to 3
billion dollars. Among franchises with at least 5 movies, Harry Potter is still the most
successful.
3. The James Bond series is the largest franchise ever with over 26 movies released under
the banner. Friday the 13th and Pokemon come in at a distant second and third with 12
and 11 movies respectively.

Production Companies
1. Warner Bros is the highest earning production company of all time earning a staggering
63.5 billion dollars from close to 500 movies. Universal Pictures and Paramount Pictures
are the second and the third highest earning companies with 55 billion dollars and 48
billion dollars in revenue respectively.
2. Pixar Animation Studios has produced the most successful movies, on average. This is not
surprising considering the amazing array of movies that it has produced in the last few
decades: Up, Finding Nemo, Inside Out, Wall-E, Ratatouille, the Toy Story Franchise, Cars
Franchise, etc. Marvel Studios with an average gross of 615 million dollars comes in
second with movies such as Iron Man and The Avengers under its banner.

Movie Title Wordcloud

The word Love is the most commonly used word in movie titles. Girl, Day and Man are also
among the most commonly occurring words. I think this encapsulates the idea of the ubiquitous
presence of romance in movies pretty well.

Original Languages
There are over 93 languages represented in our dataset. As we had expected, English language
films form the overwhelming majority. French and Italian movies come in at a very distant second
and third respectively.

As mentioned earlier, French and Italian are the most commonly occurring languages after
English. Japanese and Hindi form the majority as far as Asian Languages are concerned.

Popularity, Vote Average and Vote Count

1. Minions is the most popular movie by the TMDB Popularity Score. Wonder Woman and
Beauty and the Beast, two extremely successful woman centric movies come in second
and third respectively.
2. Inception and The Dark Knight, two critically acclaimed and commercially successful
Christopher Nolan movies figure at the top of The Most Voted On Movies Chart.
3. The Shawshank Redemption and The Godfather are the two most critically acclaimed
movies in the TMDB Database. Interestingly, they are the top 2 movies in IMDB's Top 250
Movies list too. They have a rating of over 9 on IMDB as compared to their 8.5 TMDB
Scores.

4. Surprisingly, the Pearson Coefficient between Popularity and Vote Average is a measly
0.097, which suggests that there is no tangible linear correlation. In other words, popularity
and vote average are largely unrelated quantities. It would be interesting to discover how TMDB
assigns numerical popularity scores to its movies.
5. There is a very small correlation between Vote Count and Vote Average. A large number
of votes on a particular movie does not necessarily imply that the movie is good.
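
For reference, the Pearson coefficient quoted above is the covariance of two variables divided by the product of their standard deviations. The following is a minimal sketch of that computation; the function name pearson_r is hypothetical and the analysis itself presumably relied on a library routine.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

A value near 0 only rules out a linear relationship. Two quantities can have a coefficient of 0 and still be related in a non-linear way.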

Movie Release Dates

It appears that January is the most popular month when it comes to movie releases. In Hollywood
circles, this is also known as the dump month, when subpar movies are released by the
dozen.

We see that the months of April, May and June have the highest average gross among high
grossing movies. This can be attributed to the fact that blockbuster movies are usually released
in the summer when the kids are out of school and the parents are on vacation and therefore, the
audience is more likely to spend their disposable income on entertainment.

The months of June and July tend to yield the highest median returns. September is the least
successful month on the aforementioned metrics. Again, the success of June and July movies
can be attributed to them being summer months and times of vacation. September usually
denotes the beginning of the school/college semester and hence a slight reduction in the
consumption of movies.

Friday is clearly the most popular day for movie releases. This is understandable considering the
fact that it usually denotes the beginning of the weekend. Sunday and Monday are the least
popular days and this can be attributed to the same aforementioned reason.

The oldest movie, Passage of Venus, was a series of photographs of the transit of the planet
Venus across the Sun in 1874. They were taken in Japan by the French astronomer Pierre
Janssen using his 'photographic revolver'. This is also the oldest movie on both IMDB and TMDB.

Spoken Languages

The movie with the largest number of languages, Visions of Europe, is actually a collection of 25
short films by 25 different European directors. This explains the sheer diversity of the movie in
terms of language.

There is no correlation between the number of languages and returns of a movie.

Runtime
The average length of a movie is about 1 hour and 30 minutes. The longest movie on record in
this dataset is a staggering 1256 minutes (or nearly 21 hours) long.

There seems to be no relationship between runtime and return. The duration of a movie is
independent of its success.

We notice that films started hitting the 60 minute mark as early as 1914. Starting in 1924, films
settled at the traditional 90 minute duration, which has remained more or less constant ever
since.

Budget

The distribution of movie budgets shows an exponential decay. More than 75% of the movies
have a budget smaller than 25 million dollars.

Two Pirates of the Caribbean films occupy the top spots in this list with a staggering budget of
over 300 million dollars. All the top 10 most expensive films made a profit on their investment
except for The Lone Ranger, which managed to recoup only about 35% of its investment, taking in
a paltry 90 million dollars on a 255 million dollar budget.

The Pearson r value of 0.73 between budget and revenue indicates a strong positive correlation.

Revenue
The mean gross of a movie is 68.7 million dollars whereas the median gross is much lower at 16.8
million dollars, suggesting the skewed nature of revenue. The lowest revenue generated by a
movie is just 1 dollar whereas the highest grossing movie of all time has raked in an astonishing
2.78 billion dollars.

As can be seen from the figure, the maximum gross has steadily risen over the years. The world
of movies broke the 1 billion dollar mark in 1997 with the release of Titanic. It took another 12
years to break the 2 billion dollar mark with Avatar. Both these movies were directed by James
Cameron.

Correlation Matrix

Genres

Drama is the most commonly occurring genre with almost half the movies identifying themselves as
drama films. Comedy comes in at a distant second with 25% of the movies having adequate doses
of humor. Other major genres represented in the top 10 are Action, Horror, Crime, Mystery,
Science Fiction, Animation and Fantasy.

The proportion of movies of each genre has remained fairly constant since the beginning of this
century except for Drama. The proportion of drama films has fallen by over 5%. Thriller movies
have enjoyed a slight increase in their share.

Animation movies have the largest interquartile (25th to 75th percentile) range as well as the
highest median revenue among all the genres plotted. Fantasy and Science Fiction have the second
and third highest median revenue respectively.

Cast and Crew

REGRESSION: PREDICTING MOVIE REVENUES

Predicting Movie Revenues is an extremely popular problem in Machine Learning which has
generated a huge amount of literature. Most of the models proposed in these papers use far more
potent features than what we possess at the moment. These include Facebook Page Likes,
Information on Tweets about the Movie, YouTube Trailer Reaction (Views, Likes, Dislikes, etc.),
Movie Rating (MPAA, CBFC) among many others.

To compensate for the lack of these features, we are going to cheat a little. We will be using
TMDB's Popularity Score and Vote Average as our features in our model to assign a numerical
value to popularity. However, it must be kept in mind that these metrics will not be available when
predicting movie revenues in the real world, when the movie has not been released yet.

Feature Engineering
1. belongs_to_collection will be turned into a Boolean variable. 1 indicates a movie is a part
of collection whereas 0 indicates it is not.
2. genres will be converted into number of genres.
3. homepage will be converted into a Boolean variable that will indicate if a movie has a
homepage or not.
4. original_language will be replaced by a feature called is_foreign to denote if a particular
film is in English or a Foreign Language.
5. production_companies will be replaced with just the number of production companies
collaborating to make the movie.
6. production_countries will be replaced with the number of countries the film was shot in.
7. day will be converted into a binary feature to indicate if the film was released on a Friday.
8. month will be converted into a variable that indicates if the month was a holiday season.
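
The eight transformations listed above can be sketched as a single function over one movie record. This is a hedged illustration, not the project's code: the output feature names, the input keys "day" and "month", and in particular the set of months treated as holiday season are assumptions, since the report does not say which months were used.

```python
def engineer_features(movie, holiday_months=frozenset({4, 5, 6, 11})):
    """Map one cleaned movie record (a dict) to the engineered features.

    `holiday_months` is a hypothetical choice made for this example.
    """
    return {
        # 1 if the movie is part of a collection, 0 otherwise
        "belongs_to_collection": int(bool(movie.get("belongs_to_collection"))),
        # list-valued attributes are reduced to their length
        "num_genres": len(movie.get("genres") or []),
        "has_homepage": int(bool(movie.get("homepage"))),
        # anything other than English counts as foreign
        "is_foreign": int(movie.get("original_language") != "en"),
        "num_production_companies": len(movie.get("production_companies") or []),
        "num_production_countries": len(movie.get("production_countries") or []),
        "is_friday": int(movie.get("day") == "Friday"),
        "is_holiday": int(movie.get("month") in holiday_months),
    }
```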

Model
The model that I chose for regression is the Gradient Boosting Regressor. The Coefficient of
Determination Score obtained by the regressor was 0.78.
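
For readers unfamiliar with the metric, the coefficient of determination compares the model's squared error with that of always predicting the mean. A minimal sketch follows; the function name r2_score_manual is hypothetical, and the reported score was presumably produced by the modelling library rather than by code like this.

```python
def r2_score_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # model error
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # error of the mean
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1 - ss_res / ss_tot
```

A score of 1 means perfect predictions, 0 means the model does no better than predicting the mean revenue for every movie, and negative values mean it does worse.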

Feature Importances

We notice that vote_count, a feature we cheated with, is the most important feature to our
Gradient Boosting Model. This goes on to show the importance of popularity metrics in
determining the revenue of a movie. Budget was the second most important feature followed by
Popularity (Literally, a popularity metric) and Crew Size.

CLASSIFICATION: PREDICTING MOVIE SUCCESS

The Classification model uses the same Feature Engineering steps as those followed by the
Regression Model built in the previous section.

Model
The model that I chose for classification is the Gradient Boosting Classifier. The model
showcased an accuracy of 80% with unseen test cases.

Feature Importances

We see that Vote Count is once again the most significant feature identified by our Classifier.
Other important features include Budget, Popularity and Year. With this, we will conclude our
discussion on the classification model and move on to the main part of the project.

RECOMMENDATION SYSTEMS

The next step was to build a classifier to train the data on and then test its performance against
the test data. With all the feature engineering already done in the previous step, applying
machine learning was a fairly concise step.

The Simple Recommender

The Simple Recommender offers generalized recommendations to every user based on movie
popularity and (sometimes) genre. The basic idea behind this recommender is that movies that
are more popular and more critically acclaimed will have a higher probability of being liked by the
average audience. This model does not give personalized recommendations based on the user.

I used the TMDB Ratings to come up with our Top Movies Chart. I used IMDB's weighted rating
formula to construct my chart.

The next step was to determine an appropriate value for m, the minimum votes required to be
listed in the chart. I used the 95th percentile as the cutoff. In other words, for a movie to
feature in the charts, it must have more votes than at least 95% of the movies in the list.
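
IMDB's weighted rating is commonly written as WR = (v / (v + m)) * R + (m / (v + m)) * C, where v is the number of votes for the movie, m is the minimum number of votes required, R is the movie's average rating and C is the mean rating across all movies. The sketch below applies it with a percentile cutoff. The helper names and the linear-interpolation quantile are assumptions made for illustration, and the record keys "vote_count" and "vote_average" are assumed column names.

```python
def weighted_rating(v, R, m, C):
    """IMDB-style weighted rating; shrinks R towards C when v is small."""
    if v + m == 0:
        return C
    return (v / (v + m)) * R + (m / (v + m)) * C


def quantile(values, q):
    """Quantile with linear interpolation between neighbouring order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def top_chart(movies, q=0.95, n=10):
    """Rank movies whose vote count reaches the q-th quantile by weighted rating."""
    m_cut = quantile([movie["vote_count"] for movie in movies], q)
    C = sum(movie["vote_average"] for movie in movies) / len(movies)
    qualified = [movie for movie in movies if movie["vote_count"] >= m_cut]
    scored = [
        {**movie, "wr": weighted_rating(movie["vote_count"], movie["vote_average"], m_cut, C)}
        for movie in qualified
    ]
    return sorted(scored, key=lambda movie: movie["wr"], reverse=True)[:n]
```

The effect of the formula is that a movie with few votes is pulled towards the global mean C, so a handful of perfect scores cannot outrank a widely voted film.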

Content Based Recommender

My approach to building the recommender was extremely hacky. What I did was create a
metadata dump for every movie which consisted of genres, director, main actors and keywords. I
then used a CountVectorizer to create a count matrix. I then calculated the cosine similarities
between movies and returned the movies that were most similar.
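
The idea behind the count matrix and cosine similarity can be shown in a few lines of plain Python. This is a simplified stand-in for a library vectorizer, assuming whitespace tokenisation of the metadata dump; the function names and the dictionary of dumps are hypothetical.

```python
import math
from collections import Counter


def count_vector(metadata_dump):
    """Bag-of-words counts for one movie's metadata dump."""
    return Counter(metadata_dump.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(title, dumps, n=5):
    """Return the n titles whose dumps are closest to that of `title`."""
    query = count_vector(dumps[title])
    scores = [
        (other, cosine_similarity(query, count_vector(text)))
        for other, text in dumps.items()
        if other != title
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:n]
```

Multi-word names need to be joined into single tokens before building the dump (for example by stripping spaces), otherwise two different people sharing a first name would count as a match.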

I also added a mechanism to remove bad movies and return movies which are popular and have
had a good critical response.

I took the top 25 movies based on similarity scores and calculated the vote count at the 60th
percentile among them. Then, using this as the value of m, I calculated the weighted rating of
each movie using IMDB's formula like I did with the Simple Recommender.

Collaborative Filtering
The content based engine suffers from some severe limitations. It is only capable of suggesting
movies that are close to a certain movie. That is, it is not capable of capturing tastes and
providing recommendations across genres.

Also, the engine that I built is not really personal in that it doesn't capture the personal tastes and
biases of a user. Anyone querying our engine for recommendations based on a movie will
receive the same recommendations for that movie, regardless of who s/he is.

Therefore, I used a technique called Collaborative Filtering to make recommendations to Movie
Watchers. Collaborative Filtering is based on the idea that users similar to me can be used to
predict how much I will like a particular product or service those users have used/experienced
but I have not.

I did not implement Collaborative Filtering from scratch. Instead, I used the Surprise library that
provides extremely powerful algorithms like Singular Value Decomposition (SVD) to minimise
RMSE (Root Mean Square Error) and give great recommendations.
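
To give a feel for what an SVD-style recommender does internally, here is a small matrix factorisation trained by stochastic gradient descent. It is a sketch of the general technique only and is not the Surprise library's code; the function names, hyperparameter values and the (user, item, rating) tuple format are assumptions made for this example.

```python
import math
import random


def train_mf(ratings, n_factors=2, n_epochs=200, lr=0.01, reg=0.02, seed=0):
    """Fit biases and latent factors to (user, item, rating) tuples.

    Returns a function predict(user, item) giving the estimated rating.
    """
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    bu = {u: 0.0 for u in users}                       # user biases
    bi = {i: 0.0 for i in items}                       # item biases
    p = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}

    def predict(u, i):
        # Unknown users or items fall back to the mean plus any known bias.
        est = mu + bu.get(u, 0.0) + bi.get(i, 0.0)
        if u in p and i in q:
            est += sum(pf * qf for pf, qf in zip(p[u], q[i]))
        return est

    for _ in range(n_epochs):
        for u, i, r in ratings:
            err = r - predict(u, i)
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            for f in range(n_factors):
                pu, qi = p[u][f], q[i][f]
                p[u][f] += lr * (err * qi - reg * pu)
                q[i][f] += lr * (err * pu - reg * qi)
    return predict


def rmse(predict, ratings):
    """Root mean square error of the predictions on the given ratings."""
    return math.sqrt(sum((r - predict(u, i)) ** 2 for u, i, r in ratings) / len(ratings))
```

In practice the RMSE should be measured on ratings held out from training, for example through cross-validation, since the training error alone overstates how well the model generalises.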

Hybrid Recommender
The Hybrid Recommender brought together techniques from both Content Based and
Collaborative Filtering Based engines to provide personalized Similar Movie Recommendations
to Users based on their taste.
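
One plausible way to combine the two engines, sketched here under stated assumptions, is to let the content based step propose candidates and the collaborative step rank them for the user. The function hybrid_recommend and its two callback parameters are hypothetical names; the report does not spell out the exact combination rule.

```python
def hybrid_recommend(user, title, similar_titles, predict_rating, n=10):
    """Rank content-similar movies by the rating estimated for this user.

    similar_titles(title)       -> list of candidate titles (content based step)
    predict_rating(user, title) -> estimated rating (collaborative step)
    """
    candidates = similar_titles(title)
    scored = [(candidate, predict_rating(user, candidate)) for candidate in candidates]
    # Highest estimated rating first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:n]
```

With this arrangement two users querying the same movie receive the same candidate pool but in a different order, which addresses the lack of personalisation noted for the content based engine.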

CONCLUSION

This report highlighted the processes of data wrangling, inferential statistics, data visualization,
feature engineering and predictive modelling performed on the Movies Dataset. All the results
and insights gained as part of these processes were also highlighted. With these insights, a
Gradient Boosting Regressor and Classifier were built to predict Movie Revenue and Success,
achieving a coefficient of determination of 0.78 and an accuracy of 0.80 respectively.

In addition, four recommendation engines were built based on different ideas and algorithms:

1. Simple Recommender: This system used overall TMDB Vote Count and Vote Averages to
build Top Movies Charts, in general and for a specific genre. The IMDB Weighted Rating
System was used to calculate ratings on which the sorting was finally performed.
2. Content Based Recommender: We built two content based engines; one that took movie
overview and taglines as input and the other which took metadata such as cast, crew,
genre and keywords to come up with predictions. We also devised a simple filter to give
greater preference to movies with more votes and higher ratings.
3. Collaborative Filtering: We used the powerful Surprise Library to build a collaborative
filter based on singular value decomposition. The RMSE obtained was less than 1 and the
engine gave estimated ratings for a given user and movie.
4. Hybrid Engine: We brought together ideas from content and collaborative filtering to
build an engine that gave movie suggestions to a particular user based on the estimated
ratings that it had internally calculated for that user.

The code associated with this report is available at: https://github.com/rounakbanik/movies
