Project 5: IMDB Movie Analysis

Submitted by,
SHRADDHA S J

Project Description:
The objective of this project is to analyze a dataset of IMDb
movies to understand what factors contribute to a movie's
success, as measured by its IMDb score. This insight can help
movie producers, directors, and investors make informed
decisions. We explored genre, duration, language, director
performance, and budget-related factors.

Approach:
The project was divided into five main tasks:
• Genre Analysis: Examined how different genres
influence IMDb scores.
• Duration Analysis: Studied the relationship between
movie length and ratings.
• Language Analysis: Analyzed the most common movie
languages and their rating impact.
• Director Analysis: Identified top-performing directors
using percentile-based ranking.
• Budget Analysis: Explored the correlation between
budget and gross, and identified the most profitable
movies.
Data cleaning included handling missing values, converting
text to numbers (e.g., IMDb scores, budget), and splitting
multi-genre movies for analysis.
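The cleaning itself was done in Excel. As a rough illustration of the text-to-number step, here is a minimal Python sketch. The column names, the sample values, and the choice to drop incomplete rows are all assumptions made for the example, not details taken from the dataset.

```python
def to_number(text):
    """Parse a numeric cell stored as text; return None if blank or unparseable."""
    cleaned = str(text).replace(",", "").replace("$", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        return None

# Hypothetical raw rows, with numbers stored as text and some values missing.
raw_rows = [
    {"imdb_score": "7.9", "budget": "237,000,000"},
    {"imdb_score": "", "budget": "1500000"},
    {"imdb_score": "6.4", "budget": "N/A"},
]

# Convert every cell, then keep only rows with no missing value (an assumed policy).
cleaned_rows = [{key: to_number(value) for key, value in row.items()} for row in raw_rows]
complete_rows = [row for row in cleaned_rows if None not in row.values()]
```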
Tech-Stack used:
Microsoft Excel 2022
o Used for data cleaning, analysis, and visualization.
o Functions used: AVERAGE, MEDIAN,
MODE.SNGL, STDEV.S, VAR.S, MAX, MIN, COUNTIF,
CORREL, PERCENTILE.INC, IF, IFERROR, INDEX,
MATCH
o Pivot tables and scatter plots with trendlines were
used for summarizing data visually.

Insights:
• Genres like Drama and Biography tend to have higher
average IMDb scores.
• Movie duration shows a weak positive relationship with
IMDb score.
• English is the dominant language, but some non-English
films (such as Hindi or French ones) also perform well.
• Top directors like Christopher Nolan and Quentin
Tarantino consistently score in the top 10% based on IMDb
ratings.
• Higher budgets are positively correlated with gross
earnings.
• The most profitable movies either had a high gross or were
made on a lower budget with strong audience appeal.
Task 1:
Determine the most common genres of movies in the
dataset. Then, for each genre, calculate descriptive
statistics (mean, median, mode, range, variance, standard
deviation) of the IMDB scores.
Approach:
1. Genre Splitting:
o The original dataset had multiple genres listed in
one column (e.g., Action|Adventure|Sci-Fi).
o I split the genres into individual values using
Excel’s Text to Columns and Power Query, so that
each genre could be analyzed separately.
2. Counting Genre Frequency:
o Used COUNTIF to find how many times each genre
appeared in the dataset.
3. IMDb Score Calculation:
o For each genre, used the following Excel functions
to calculate:
▪ AVERAGE – Mean IMDb score
▪ MEDIAN – Middle score
▪ MODE.SNGL – Most common score
▪ MAX / MIN – Highest and lowest scores
(range = MAX - MIN)
▪ STDEV.S – Standard deviation
▪ VAR.S – Variance
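The three steps above can also be sketched outside Excel. The following Python example is an illustration only: the sample rows are hypothetical, and the statistics module stands in for the Excel functions listed. One difference worth noting is that statistics.mode returns the first value when no score repeats, whereas MODE.SNGL returns #N/A in that case.

```python
import statistics
from collections import defaultdict

# Hypothetical sample rows: (pipe-separated genres, IMDb score).
movies = [
    ("Action|Adventure|Sci-Fi", 7.9),
    ("Drama", 8.1),
    ("Drama|Biography", 8.1),
    ("Comedy|Drama", 6.4),
]

# Step 1: split multi-genre strings so each genre collects its own scores.
scores_by_genre = defaultdict(list)
for genres, score in movies:
    for genre in genres.split("|"):
        scores_by_genre[genre].append(score)

def describe(scores):
    """Frequency plus the descriptive statistics named in the task."""
    enough = len(scores) > 1  # sample variance and stdev need at least 2 values
    return {
        "count": len(scores),
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "mode": statistics.mode(scores),
        "range": max(scores) - min(scores),
        "variance": statistics.variance(scores) if enough else None,
        "stdev": statistics.stdev(scores) if enough else None,
    }

# Steps 2 and 3: count each genre and summarize its scores.
genre_stats = {genre: describe(scores) for genre, scores in scores_by_genre.items()}
most_common = max(genre_stats, key=lambda genre: genre_stats[genre]["count"])
```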
Key Insights:
• Drama is the most common genre and has a consistently
good IMDb score.
• Biography films had the highest average IMDb rating
among major genres.
• Comedy showed the widest range in ratings, indicating
varying audience tastes.
• Genres with more emotional or real-life stories (Drama,
Biography) tend to score higher than Action or Sci-Fi.
Task 2:
Analyze the distribution of movie durations and identify
the relationship between movie duration and IMDB score.
Approach:
1. Descriptive Statistics for Duration:
o I used Excel functions to calculate:
▪ AVERAGE(duration)
▪ MEDIAN(duration)
▪ STDEV.S(duration) – to measure variability.
2. Scatter Plot Analysis:
o I plotted movie duration vs IMDb score using a
scatter plot.
o Added a trendline to observe any visible
relationship (correlation) between the two.
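For readers who want to see what the trendline represents, here is a minimal Python sketch of a least-squares line fitted to duration and score. The data points are hypothetical, and the sketch assumes a linear trendline, which is Excel's default type.

```python
import statistics

# Hypothetical (duration in minutes, IMDb score) pairs.
durations = [85, 100, 115, 130, 150]
scores = [6.0, 6.5, 7.0, 7.2, 7.8]

def trendline(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Descriptive statistics for duration, as in step 1.
summary = {
    "mean": statistics.mean(durations),
    "median": statistics.median(durations),
    "stdev": statistics.stdev(durations),
}

# A positive slope means longer movies tend to score higher in this sample.
slope, intercept = trendline(durations, scores)
```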
Observations from Scatter Plot:
• Very short movies (< 80 mins) had a wider range of
IMDb scores, from extremely good to extremely poor.
• Most high-rated movies clustered around 100 to 130
minutes.

Insights:
• While duration alone doesn’t guarantee success,
movies that are too short or too long may perform
worse unless supported by good content.
Task 3:
Determine the most common languages used in movies
and analyze their impact on the IMDB score using
descriptive statistics.
Approach:
1. Count of Movies by Language:
o Used Excel’s COUNTIF to find how many movies
were made in each language.
2. IMDb Score Calculations by Language:
o For each language, calculated:
▪ AVERAGE(IMDb Score)
▪ MEDIAN(IMDb Score)
▪ STDEV.S(IMDb Score)
o Wrapped STDEV.S in IFERROR to handle the #DIV/0!
error it returns for languages with only 1 movie.
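The single-movie case is easy to reproduce outside Excel. In the Python sketch below, which uses hypothetical rows, statistics.stdev raises an error for fewer than two values, and the safe_stdev helper (a name invented for this example) plays the role that IFERROR plays in the workbook.

```python
import statistics
from collections import defaultdict

# Hypothetical (language, IMDb score) rows.
rows = [
    ("English", 7.0), ("English", 6.2), ("English", 8.0),
    ("French", 7.4), ("French", 7.8),
    ("Hindi", 8.2),
]

by_language = defaultdict(list)
for language, score in rows:
    by_language[language].append(score)

def safe_stdev(scores, fallback=None):
    """Sample standard deviation, or the fallback when there are fewer than 2 scores."""
    try:
        return statistics.stdev(scores)
    except statistics.StatisticsError:
        return fallback

language_stats = {
    language: {
        "count": len(scores),
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "stdev": safe_stdev(scores),
    }
    for language, scores in by_language.items()
}
```

Languages with a single film keep a valid count, mean, and median, and only the standard deviation falls back to None.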
Key Insights:
• English-language films dominate the dataset, forming the
majority of entries.
• French movies, while fewer, tend to have slightly higher
average IMDb scores.
• Non-English films generally had fewer entries, so
individual ratings have a bigger effect on their averages.
Task 4:
Identify the top directors based on their average IMDB
score and analyze their contribution to the success of
movies using percentile calculations.
Approach:
1. Average IMDb Score per Director:
o Used a Pivot Table in Excel:
▪ Rows: Director Names
▪ Values: Average of IMDb Score
o Filtered out directors with very few movies (e.g.,
only 1 film) to avoid misleading averages.
2. Top Directors:
o Sorted the directors by average IMDb score to find
the top performers.
o Used the PERCENTILE.INC function to find the 90th
percentile of average rating; directors at or above
that value are in the top 10%.
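The same ranking logic can be sketched in Python. The director names and scores below are hypothetical, MIN_FILMS is an assumed threshold, and percentile_inc is a hand-written helper intended to mirror the inclusive, linearly interpolated percentile that PERCENTILE.INC computes.

```python
import math
import statistics
from collections import defaultdict

def percentile_inc(values, k):
    """Inclusive percentile with linear interpolation, for 0 <= k <= 1."""
    ordered = sorted(values)
    position = k * (len(ordered) - 1)  # fractional 0-based rank
    lower, upper = math.floor(position), math.ceil(position)
    return ordered[lower] + (position - lower) * (ordered[upper] - ordered[lower])

# Hypothetical (director, IMDb score) rows.
films = [
    ("Director A", 8.6), ("Director A", 8.4),
    ("Director B", 6.0), ("Director B", 6.4),
    ("Director C", 7.0), ("Director C", 7.4),
    ("Director D", 9.0),
]

MIN_FILMS = 2  # assumed cutoff for "very few movies"

by_director = defaultdict(list)
for director, score in films:
    by_director[director].append(score)

# Average score per director, skipping directors with too few films.
averages = {
    director: statistics.mean(scores)
    for director, scores in by_director.items()
    if len(scores) >= MIN_FILMS
}

# Directors at or above the 90th percentile of average score.
cutoff = percentile_inc(list(averages.values()), 0.9)
top_directors = [director for director, avg in averages.items() if avg >= cutoff]
```

Director D has the highest single score but is excluded by the film-count filter, which is the situation the filtering step is meant to guard against.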

Observations:
• Directors like Christopher Nolan and Quentin Tarantino
consistently deliver movies with high IMDb scores.
• Possible reasons, which this dataset does not measure
directly, are that high-rated directors tend to:
o Work with strong casts and production teams
o Have distinctive storytelling styles
o Invest more time in script and direction quality
Insights:
• The director has a measurable association with a
movie’s success.
• Movies by top 10% directors averaged above 8.0,
compared to the overall dataset average of ~7.0.
• This insight is useful for investors and studios when
choosing whom to back for future projects.
Task 5:
Analyze the correlation between movie budgets and gross
earnings, and identify the movies with the highest profit
margin.
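Approach:
1. Calculate Profit:
o Added a new column:
Profit = Gross - Budget
o This absolute profit identifies the movies with the
highest earnings over cost. A profit margin in the
strict sense would be the ratio
(Gross - Budget) / Gross.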
2. Correlation Analysis:
o Used Excel’s CORREL function to check the
relationship between budget and gross earnings:
=CORREL([Budget Column], [Gross Column])
3. Find Highest Profit Movies:
o Used MAX function on the profit column to find the
top-performing film(s).
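These three steps can be sketched in Python as follows. The film rows are hypothetical, and pearson is a hand-written helper computing the Pearson coefficient, the quantity CORREL returns. The margin ratio is an addition for illustration: it shows how a low-budget film can rank first by margin without ranking first by absolute profit.

```python
import math

# Hypothetical (title, budget, gross) rows, all in the same currency unit.
films = [
    ("Film A", 10.0, 50.0),
    ("Film B", 100.0, 300.0),
    ("Film C", 200.0, 180.0),
    ("Film D", 1.0, 40.0),
]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

budgets = [budget for _, budget, _ in films]
grosses = [gross for _, _, gross in films]
r = pearson(budgets, grosses)

# Absolute profit, as in the Profit column.
profit = {title: gross - budget for title, budget, gross in films}
# Margin as a share of gross (assumes gross is nonzero).
margin = {title: (gross - budget) / gross for title, budget, gross in films}

most_profitable = max(profit, key=profit.get)
best_margin = max(margin, key=margin.get)
```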
Observations:
• The correlation coefficient was around 0.7, indicating a
strong positive correlation between budget and gross
earnings.
• However, high profit doesn’t always need a big budget
(e.g., Paranormal Activity).
• Some high-budget films still made small profits or even
losses.
Insights:
• Higher budgets are generally associated with higher
gross earnings, but not always with higher profit.
• Low-budget, high-concept films can deliver massive
returns.
• Smart budgeting is just as important as investing
heavily in production.
Result:
The analysis of the IMDb movie dataset revealed several key
factors that contribute to a movie's success, as measured by its
IMDb rating. Genres such as Drama and Action were found to
be the most common, but it was genres like Biography and
Documentary that showed slightly higher average ratings,
indicating that content type can influence audience perception.
Duration also played a role, with moderately long movies
performing better than very short or overly long ones.
The influence of directors was significant: certain directors
consistently produced highly rated movies, showing that
experience, style, and reputation can drive success. These
insights provide valuable information for producers, investors,
and filmmakers by highlighting that while financial
investment is important, content quality, direction, and
audience targeting are equally crucial for a movie's success.
Drive links:
1. https://drive.google.com/file/d/1F4ALw9c3zAPWQXJMyNCD9V32pVgiFJ08/view?usp=sharing
2. https://drive.google.com/file/d/1RxVi2hiOoIZSpQTXJ_JBHalI-756Atsq/view?usp=sharing
3. https://drive.google.com/file/d/1dahF3hwnP3zD0A9cSLLpWhsHLnK5_4py/view?usp=sharing
4. https://drive.google.com/file/d/1dqInUh7Jp5KcqAoxe_D7ENXm4eylmql1/view?usp=sharing
5. https://drive.google.com/file/d/1OftWiR8Dwav8Zu1W1r9vky25-RmNIFIE/view?usp=sharing
