SDM - Task B - Group 1G - Movies

The document describes a project analyzing factors that may affect the IMDb ratings of movies, including duration, Facebook likes of the cast, directors and movies, and the number of reviews. Boxplots, histograms, correlation and multiple regression are used to determine whether, and by how much, these factors affect IMDb scores. A linear regression model is fitted with duration, gross, cast Facebook likes, number of user reviews, budget, movie Facebook likes and age as predictors of the IMDb score.

Uploaded by

Akash Gupta

SDM_Task B_Group 1G_Movies

Akash Gupta, B S S Pramod, Raksha Shetty, Rishabh Agrawal, Sanya Sharma, Vipul Bhatia

14/09/2019

The success of a movie depends on many factors. The cast and crew of some movies aim for
commercial success, while others aim for critical success (i.e., higher ratings from movie
critics and websites such as IMDb and Rotten Tomatoes).
In this project, we check whether factors such as the following affect the IMDb ratings of
movies: duration, Facebook likes of the cast, directors and movies, number of reviews on
IMDb, etc.
We use statistical tools (boxplots, histograms, correlation, multiple regression, etc.) to
determine not only whether these factors affect the IMDb score, but also by how much.
Setting the working directory
setwd("D:/SDM/R")

First, we read the CSV file into R.

M <- read.csv("movie.csv")

Then, we converted a few values that were stored as factors into integers.
M$Gross<-as.integer(M$Gross)
M$Budget<-as.integer(M$Budget)
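
One caveat is worth checking at this step: in R, calling as.integer() directly on a factor returns the internal level codes, not the numbers shown in the labels. The sketch below uses a small made-up factor (`gross_f` is a hypothetical stand-in, not data from movie.csv) to show the difference and the usual conversion through as.character().

```r
# Toy factor standing in for a numeric column that was read as a factor.
# (gross_f is illustrative only; it is not taken from movie.csv.)
gross_f <- factor(c("2500", "130", "47000", "130"))

# Direct conversion returns the level codes (positions in the sorted level set).
codes <- as.integer(gross_f)

# Converting through character first recovers the original numbers.
values <- as.integer(as.character(gross_f))

print(codes)   # 2 1 3 1
print(values)  # 2500 130 47000 130
```

If Gross and Budget in movie.csv hold numeric amounts, a summary range that runs from 1 up to roughly the number of distinct values may indicate level codes, in which case converting through as.character() would be the safer route.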

We view the data.

View(M)

We inspect the structure of the data to confirm that all variables are in the required
formats.
str(M)

## 'data.frame': 3000 obs. of 14 variables:
## $ Movie                    : Factor w/ 2927 levels "[Rec] 2","10 Cloverfield Lane",..: 224 1618 1935 2189 1141 1939 225 933 258 2012 ...
## $ Duration.In.Min.         : int 178 100 148 100 132 156 141 153 183 169 ...
## $ Gross                    : int 2684 1719 1215 2040 2573 1752 2048 1700 1744 1214 ...
## $ Genre                    : Factor w/ 20 levels "Action","Adventure",..: 13 8 20 15 14 17 18 17 19 16 ...
## $ Cast_Total_Facebook_Likes: int 4834 48350 11700 106759 1873 46055 92000 58753 24450 29991 ...
## $ num_user_for_reviews     : int 3054 1238 994 2701 738 1902 1117 973 3018 2367 ...
## $ Language                 : Factor w/ 8 levels "Chinese","English",..: 1 2 4 2 3 8 3 4 7 7 ...
## $ Country                  : Factor w/ 10 levels "Australia","China",..: 2 9 4 9 3 8 3 4 7 7 ...
## $ content_rating           : Factor w/ 8 levels "A","G","NC-17",..: 6 6 6 6 6 6 6 5 6 6 ...
## $ Budget                   : int 137 168 139 141 145 143 141 141 141 127 ...
## $ Year                     : int 2009 2007 2015 2012 2012 2007 2015 2009 2016 2006 ...
## $ imdb_score               : num 7.9 7.1 6.8 8.5 6.6 6.2 7.5 7.5 6.9 6.1 ...
## $ movie_facebook_likes     : int 33000 0 85000 164000 24000 0 118000 10000 197000 0 ...
## $ Age                      : int 10 12 4 7 7 12 4 10 3 13 ...

We check the number of rows and columns of the data.

dim(M)

## [1] 3000 14

We summarise the entire data to get a rough idea of how variables are spread over a range
of values.
summary(M)

##                          Movie      Duration.In.Min.     Gross
##  Pan                        :   3   Min.   : 45.0    Min.   :   1.0
##  The Fast and the Furious   :   3   1st Qu.: 94.0    1st Qu.: 736.8
##  Victor Frankenstein        :   3   Median :104.0    Median :1464.5
##  Alice in Wonderland        :   2   Mean   :108.7    Mean   :1462.5
##  Aloha                      :   2   3rd Qu.:118.0    3rd Qu.:2191.2
##  Around the World in 80 Days:   2   Max.   :300.0    Max.   :2928.0
##  (Other)                    :2985
##        Genre      Cast_Total_Facebook_Likes num_user_for_reviews
##  Biography: 174   Min.   :     0            Min.   :   1.0
##  War      : 170   1st Qu.:  2113            1st Qu.: 114.0
##  Musical  : 167   Median :  4614            Median : 215.0
##  Animation: 164   Mean   : 12260            Mean   : 351.8
##  Crime    : 162   3rd Qu.: 17152            3rd Qu.: 420.0
##  Thriller : 159   Max.   :656730            Max.   :5060.0
##  (Other)  :2004
##      Language        Country       content_rating      Budget
##  English :907   China    : 308   R        :1310   Min.   :  1.00
##  Chinese :308   USA      : 308   PG-13    :1178   1st Qu.: 80.75
##  Japanese:307   Australia: 307   PG       : 406   Median :149.00
##  Hindi   :305   Japan    : 307   G        :  66   Mean   :142.43
##  German  :304   India    : 305   Not Rated:  22   3rd Qu.:213.00
##  Russian :294   Germany  : 304   Unrated  :  15   Max.   :297.00
##  (Other) :575   (Other)  :1161   (Other)  :   3
##       Year        imdb_score    movie_facebook_likes      Age
##  Min.   :1996   Min.   :1.600   Min.   :     0       Min.   : 3.00
##  1st Qu.:2002   1st Qu.:5.800   1st Qu.:     0       1st Qu.: 8.00
##  Median :2006   Median :6.500   Median :   317       Median :13.00
##  Mean   :2006   Mean   :6.389   Mean   : 10797       Mean   :12.56
##  3rd Qu.:2011   3rd Qu.:7.100   3rd Qu.: 13000       3rd Qu.:17.00
##  Max.   :2016   Max.   :9.000   Max.   :349000       Max.   :23.00

We represent the same data visually with boxplots, to check the skewness of the variables.
par(mfrow = c(1, 1))
boxplot(M$Duration.In.Min., xlab = "Duration in Mins", ylab = "",
        horizontal = TRUE, col = "yellow")
boxplot(M$Gross, xlab = "Gross", ylab = "",
        horizontal = TRUE, col = "Green")
boxplot(M$Cast_Total_Facebook_Likes, xlab = "Cast facebook likes", ylab = "",
        horizontal = TRUE, col = "yellow")
boxplot(M$num_user_for_reviews, xlab = "Reviews", ylab = "",
        horizontal = TRUE, col = "red")
boxplot(M$Budget, xlab = "Budget", ylab = "",
        horizontal = TRUE, col = "brown")
boxplot(M$movie_facebook_likes, xlab = "Movie FB Like", ylab = "",
        horizontal = TRUE, col = "magenta")
boxplot(M$Age, xlab = "Movie Age", ylab = "",
        horizontal = TRUE, col = "orange")
boxplot(M$imdb_score, xlab = "IMDB Rating", ylab = "",
        horizontal = TRUE, col = "blue")
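
Boxplots show skewness only qualitatively. As a rough numerical companion, the sketch below computes the moment-based sample skewness in base R. The helper `skewness_moment` and both example vectors are assumptions made for illustration; on the real data one would pass a column such as M$movie_facebook_likes.

```r
# Moment-based sample skewness: third central moment divided by
# the 1.5 power of the second central moment (hypothetical helper).
skewness_moment <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

set.seed(1)
right_skewed <- rexp(1000)          # simulated sample with a long right tail
symmetric    <- c(-2, -1, 0, 1, 2)  # perfectly symmetric toy data

skewness_moment(right_skewed)  # positive for a right-skewed sample
skewness_moment(symmetric)     # 0 for symmetric data
```

A clearly positive value corresponds to a boxplot whose right whisker and outliers stretch far beyond the box, and a value near zero to a roughly symmetric box.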
Coming to the model, we first check the normality of the dependent variable (IMDb rating)
by plotting a histogram.
hist(M$imdb_score)
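The histogram suggests that the dependent variable is approximately normally distributed,
so we proceed with multiple linear regression. Strictly, the normality assumption of linear
regression concerns the residuals of the fitted model rather than the dependent variable itself.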
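
A histogram gives only a visual impression of normality. As a complementary check, the sketch below draws a normal Q-Q plot and runs a Shapiro-Wilk test in base R. The vector `scores` is simulated and only stands in for M$imdb_score; its mean and standard deviation are assumptions, not values estimated from the data.

```r
set.seed(42)
# Simulated stand-in for M$imdb_score (assumption: not the real data).
scores <- rnorm(3000, mean = 6.4, sd = 1)

# Q-Q plot: points close to the reference line suggest approximate normality.
qqnorm(scores)
qqline(scores)

# Shapiro-Wilk test (accepts between 3 and 5000 observations).
# A small p-value is evidence against normality.
sw <- shapiro.test(scores)
sw$p.value
```

On the real data, the same calls could be applied to M$imdb_score or, once the model is fitted, to residuals(M_imdb).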
Next, we check how strongly the numeric variables are associated with one another by
computing the correlation matrix.
round(cor(M[, c(2, 3, 5, 6, 10, 12, 13, 14)]), digits = 4)

##                           Duration.In.Min.   Gross
## Duration.In.Min.                    1.0000  0.0103
## Gross                               0.0103  1.0000
## Cast_Total_Facebook_Likes           0.0830  0.0448
## num_user_for_reviews                0.2105  0.0274
## Budget                             -0.0241  0.0403
## imdb_score                          0.2412  0.0293
## movie_facebook_likes                0.1831  0.0144
## Age                                -0.0577 -0.0354
##                           Cast_Total_Facebook_Likes num_user_for_reviews
## Duration.In.Min.                             0.0830               0.2105
## Gross                                        0.0448               0.0274
## Cast_Total_Facebook_Likes                    1.0000               0.1842
## num_user_for_reviews                         0.1842               1.0000
## Budget                                       0.0147               0.0038
## imdb_score                                   0.1174               0.3339
## movie_facebook_likes                         0.2033               0.3551
## Age                                         -0.1017               0.0592
##                            Budget imdb_score movie_facebook_likes     Age
## Duration.In.Min.          -0.0241     0.2412               0.1831 -0.0577
## Gross                      0.0403     0.0293               0.0144 -0.0354
## Cast_Total_Facebook_Likes  0.0147     0.1174               0.2033 -0.1017
## num_user_for_reviews       0.0038     0.3339               0.3551  0.0592
## Budget                     1.0000    -0.0007              -0.0432  0.0550
## imdb_score                -0.0007     1.0000               0.3091 -0.0420
## movie_facebook_likes      -0.0432     0.3091               1.0000 -0.4472
## Age                        0.0550    -0.0420              -0.4472  1.0000
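
To read the matrix with the dependent variable in mind, the following sketch ranks the predictors by the absolute value of their correlation with imdb_score. The numbers are copied from the imdb_score row of the output above; the vector name `r_imdb` is introduced here for illustration only.

```r
# Correlations with imdb_score, copied from the matrix above.
r_imdb <- c(
  Duration.In.Min.          = 0.2412,
  Gross                     = 0.0293,
  Cast_Total_Facebook_Likes = 0.1174,
  num_user_for_reviews      = 0.3339,
  Budget                    = -0.0007,
  movie_facebook_likes      = 0.3091,
  Age                       = -0.0420
)

# Rank predictors by absolute correlation, strongest first.
ranked <- r_imdb[order(abs(r_imdb), decreasing = TRUE)]
print(ranked)

# Squared correlation: the share of variance in imdb_score that each
# predictor would explain on its own in a simple linear regression.
print(round(ranked^2, 4))
```

Even the strongest single predictor, num_user_for_reviews, would explain only about 11% of the variance on its own (0.3339 squared), so none of these associations is strong.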

Also, to visually represent correlation values, we plot a corrgram.

library(corrgram)

## Registered S3 method overwritten by 'seriation':
##   method         from
##   reorder.hclust gclus

corrgram(M[,c(2,3,5,6,10,12,13,14)],
order=FALSE,
lower.panel=panel.pie,
upper.panel=panel.cor,
text.panel=panel.txt,
main="Corrgram of all Test variables")

Now, we fit a multiple linear regression model using all the numeric predictors.

M_imdb <- lm(imdb_score ~ Duration.In.Min. + Gross + Cast_Total_Facebook_Likes +
               num_user_for_reviews + Budget + movie_facebook_likes + Age,
             data = M)
summary(M_imdb)

##
## Call:
## lm(formula = imdb_score ~ Duration.In.Min. + Gross + Cast_Total_Facebook_Likes +
##     num_user_for_reviews + Budget + movie_facebook_likes + Age,
##     data = M)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -5.2287 -0.5334  0.1142  0.6686  2.2141
##
## Coefficients:
##                            Estimate Std. Error t value Pr(>|t|)
## (Intercept)               5.194e+00  9.964e-02  52.133  < 2e-16 ***
## Duration.In.Min.          6.410e-03  6.993e-04   9.166  < 2e-16 ***
## Gross                     2.374e-05  2.051e-05   1.158  0.24711
## Cast_Total_Facebook_Likes 1.270e-06  8.917e-07   1.424  0.15459
## num_user_for_reviews      5.194e-04  4.588e-05  11.322  < 2e-16 ***
## Budget                    9.739e-05  2.091e-04   0.466  0.64138
## movie_facebook_likes      1.003e-05  9.167e-07  10.942  < 2e-16 ***
## Age                       1.132e-02  3.723e-03   3.042  0.00237 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9446 on 2992 degrees of freedom
## Multiple R-squared: 0.1795, Adjusted R-squared: 0.1776
## F-statistic: 93.52 on 7 and 2992 DF, p-value: < 2.2e-16
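
As a sanity check on how the figures in this output relate to one another, the sketch below recomputes the adjusted R-squared from the multiple R-squared and one t value from its estimate and standard error. All inputs are copied from the output above; the variable names are introduced here for illustration.

```r
# Figures taken from the summary output above.
r2 <- 0.1795   # Multiple R-squared
n  <- 3000     # observations
p  <- 7        # predictors (excluding the intercept)

# Adjusted R-squared penalises R-squared for the number of predictors.
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)
round(adj_r2, 4)   # 0.1776, matching the reported value

# t value = estimate / standard error, here for Duration.In.Min.
t_duration <- 6.410e-03 / 6.993e-04
round(t_duration, 3)   # 9.166
```

With 3000 observations and only 7 predictors the penalty is small, which is why the adjusted value sits so close to the unadjusted one.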

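Looking at the regression model, we can infer a few points.

1) Adjusted R-squared = 0.1776, which indicates that the model is weak: it explains only
about 18% of the variation in IMDb score.

2) At the 5% significance level, we can reject the null hypothesis (that the coefficient is
zero) for Duration, number of user reviews, movie Facebook likes and Age. We cannot reject
it for Gross (p = 0.247), cast Facebook likes (p = 0.155) and Budget (p = 0.641).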
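
Point 2 rests on the p-values in the coefficient table. As a sketch of how those p-values could be extracted programmatically rather than read off the printout, the example below fits a small model on simulated data (the variables `x1`, `x2` and `y` are assumptions, not columns of movie.csv) and lists the predictors that are significant at the 5% level.

```r
set.seed(7)
# Simulated data (assumption): y depends on x1 but not on x2.
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 2 + 0.5 * x1 + rnorm(n)

fit <- lm(y ~ x1 + x2)

# The coefficient table is a matrix; the p-values sit in column "Pr(>|t|)".
coef_table <- coef(summary(fit))
p_values   <- coef_table[, "Pr(>|t|)"]

# Predictors (excluding the intercept) significant at the 5% level.
is_predictor <- names(p_values) != "(Intercept)"
significant  <- names(p_values)[p_values < 0.05 & is_predictor]
print(significant)
```

Applied to the model in this report, the same two lines with coef(summary(M_imdb)) would return the predictors for which the null hypothesis can be rejected.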
