Predictive Modelling Report

Uploaded by

akshaypankar907


AKSHAY PANKAR PREDICTIVE MODELLING REPORT

PREDICTIVE MODELLING REPORT

REGARDS,
AKSHAY PANKAR

GREAT LEARNING 1

Problem 1: Linear Regression

The comp-active database is a collection of computer system activity measures.

The data was collected from a Sun Sparcstation 20/712 with 128 Mbytes of memory running
in a multi-user university department. Users would typically be doing a large variety of tasks,
ranging from accessing the internet and editing files to running very CPU-bound programs.
As a budding data scientist, you want to find a linear equation to build a model
that predicts 'usr' (portion of time (%) that CPUs run in user mode) and to find out how each
attribute affects the system being in 'usr' mode, using a list of system attributes.

1.1 Read the data and do exploratory data analysis. Describe the data briefly. (Check the Data
types, shape, EDA, 5 point summary). Perform Univariate, Bivariate Analysis, Multivariate
Analysis.

A. Most of the columns in the data are numeric ('int64' or 'float64' type). The
runqsz and user name columns are string columns ('object' type). We will be dropping
the 'runqsz' column for prediction purposes.
B. Replace the missing values with the median values of the columns. Note that we do not
need to specify the column names. Each column's missing values are replaced
with that column's median.
C. Univariate Analysis

FIG 1 - Histplot of lread for univariate analysis


FIG 2 - Boxplot of lwrite with outliers for univariate analysis

FIG 3 - Plot of Scall with outliers for univariate analysis

D. Bivariate analysis among the different variables can be done using a scatter matrix plot.
Seaborn's pair plot creates a grid of plots showing the pairwise relationships between the variables.

Output

<seaborn.axisgrid.PairGrid at 0x1aab1315e50>

Image in python file cannot be cropped.


E. Multivariate Analysis

FIG 4 – Heat Map for Multivariate analysis
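The checks in A to E can be sketched roughly as follows. This is a minimal illustration on a small made-up frame, not the report's notebook: only the column names lread, lwrite, runqsz and usr come from the data description, while every value (including the two runqsz labels) is invented for the example. The correlation matrix at the end is the table that a heat map would colour.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the comp-active data; all values are made up.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "lread": rng.poisson(20, size=200),
    "lwrite": rng.poisson(13, size=200),
    "runqsz": rng.choice(["CPU_Bound", "Not_CPU_Bound"], size=200),
    "usr": rng.uniform(0, 100, size=200).round(0),
})

print(df.shape)    # (rows, columns)
print(df.dtypes)   # data type of each column

# describe() covers the numeric columns; these five rows are the 5 point summary.
summary = df.describe()
print(summary.loc[["min", "25%", "50%", "75%", "max"]])

# Pairwise correlations of the numeric columns (the numbers behind a heat map).
corr = df.select_dtypes(include="number").corr()
print(corr.round(2))
```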

1.2 Impute null values if present, also check for the values which are equal to zero. Do
they have any meaning or do we need to change them or drop them? Check for
the possibility of creating new features if required. Also check for outliers and
duplicates if there.

FIG 6 – Treating Null values.


Yes, there is a possibility of creating new features. Null values were imputed here by filling
each column's missing values with that column's median, and the data was checked for
zero values.

FIG 7 – After Treating Null values.

The box plot of lwrite (FIG 2) shows that outliers are present in the data set.
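A minimal sketch of the checks asked for in 1.2, assuming pandas and a tiny made-up frame (the values are invented; only the column names lread and lwrite come from the report). It fills missing values with column medians, counts zeros and duplicate rows, and flags outliers with the 1.5 x IQR rule, which is the rule box plots use for their whiskers. The IQR rule is an assumption here; the report does not say how outliers were treated.

```python
import numpy as np
import pandas as pd

# Made-up frame with one missing value, some zeros, a duplicate row and an outlier.
df = pd.DataFrame({
    "lread": [1.0, 2.0, np.nan, 4.0, 2.0, 100.0],
    "lwrite": [0.0, 3.0, 5.0, 0.0, 3.0, 7.0],
})

# Fill each numeric column's missing values with that column's median.
df_filled = df.fillna(df.median(numeric_only=True))

# Count zeros per column and fully duplicated rows.
zero_counts = (df_filled == 0).sum()
n_duplicates = int(df_filled.duplicated().sum())

# Flag values outside [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR] in each column.
q1 = df_filled.quantile(0.25)
q3 = df_filled.quantile(0.75)
iqr = q3 - q1
outlier_mask = (df_filled < q1 - 1.5 * iqr) | (df_filled > q3 + 1.5 * iqr)

print(zero_counts)
print("duplicate rows:", n_duplicates)
print(outlier_mask.sum())
```

Whether a zero is meaningful depends on the column: a count of zero writes is a valid measurement, so zeros are counted here rather than replaced.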

1.3 Encode the data (having string values) for Modelling. Split the data into train and
test (70:30). Apply Linear regression using scikit learn. Perform checks for significant
variables using appropriate method from statsmodel. Create multiple models and check
the performance of Predictions on Train and Test sets using Rsquare, RMSE & Adj Rsquare.
Compare these models and select the best one with appropriate reasoning.

• Comparing the two regression results, we can see that both models have
similar R-squared values: 0.598 for the first model and 0.602 for the second.
Both models therefore explain approximately the same amount of variability in the
dependent variable (usr) using the independent variables included.

• However, the second model has more independent variables (20) than the
first model, whose number of independent variables is not specified. The second
model also has a lower F-statistic (431.3) than the first model (608.4).
Since the extra variables add almost nothing to R-squared, the first model may be the
better choice for the data.
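To make the comparison concrete, the sketch below fits two nested least-squares models on synthetic data and reports R-squared, adjusted R-squared, the overall F-statistic and RMSE. It is an illustration under stated assumptions, not the report's models: the data, the helper names and the choice of plain NumPy instead of scikit-learn or statsmodels are all made up for the example. The formulas are the standard ones, with n rows and k predictors: adjusted R-squared = 1 - (1 - R²)(n - 1)/(n - k - 1) and F = (R²/k) / ((1 - R²)/(n - k - 1)).

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns [intercept, slopes...]."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef

def predict(coef, X):
    return coef[0] + X @ coef[1:]

def r2_score(y, y_hat):
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

def adjusted_r2(r2, n, k):
    """Penalise R-squared for the number of predictors k."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

def f_statistic(r2, n, k):
    """Overall F-statistic of a model with k predictors fitted on n rows."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

# Synthetic data: 3 informative predictors plus 5 pure-noise predictors.
rng = np.random.default_rng(42)
n = 1000
X = rng.normal(size=(n, 8))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(size=n)

# 70:30 train/test split by shuffled index.
idx = rng.permutation(n)
train, test = idx[:700], idx[700:]

results = {}
for name, cols in {"small": [0, 1, 2], "large": list(range(8))}.items():
    coef = fit_ols(X[train][:, cols], y[train])
    r2 = r2_score(y[train], predict(coef, X[train][:, cols]))
    k = len(cols)
    results[name] = {
        "r2_train": r2,
        "adj_r2_train": adjusted_r2(r2, len(train), k),
        "f_stat": f_statistic(r2, len(train), k),
        "rmse_train": rmse(y[train], predict(coef, X[train][:, cols])),
        "rmse_test": rmse(y[test], predict(coef, X[test][:, cols])),
    }

for name, metrics in results.items():
    print(name, {key: round(float(value), 3) for key, value in metrics.items()})
```

Adding predictors can never lower the training R-squared of a least-squares fit, so a larger model with nearly the same R-squared is better judged on adjusted R-squared, the F-statistic and test RMSE.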

1.4 Inference: Basis on these predictions, what are the business insights and
recommendations.

Overall, the business insights and recommendations from these regression analyses
would likely depend on the specific independent variables included in the models and
the specific goals of the analysis. Further analysis and interpretation of the results
would be necessary to provide more specific insights and recommendations.

Step 1 :- Load all libraries, then load the Excel file.

Step 2 :- EDA and variate analysis (univariate, bivariate and multivariate).
Step 3 :- Scaling of the data.
Step 4 :- Regression.
Step 5 :- Train-test procedure and comparison of models 1 and 2 to draw
conclusions.
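The steps above can be sketched end to end with scikit-learn as follows. This is a hedged outline on invented data, not the report's code: the frame, its values and the runqsz labels are made up, and the string column is one-hot encoded here as section 1.3 asks. One deliberate design choice differs from the listed order: the scaler sits inside a pipeline and is fitted on the training rows only, so that no information from the test rows leaks into the scaling.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical data; only the column names come from the report.
rng = np.random.default_rng(1)
n = 500
df = pd.DataFrame({
    "lread": rng.normal(20, 5, n),
    "lwrite": rng.normal(13, 4, n),
    "runqsz": rng.choice(["CPU_Bound", "Not_CPU_Bound"], n),
})
df["usr"] = 90 - 0.8 * df["lread"] - 0.5 * df["lwrite"] + rng.normal(0, 2, n)

# Encode the string column as a 0/1 dummy variable.
X = pd.get_dummies(df.drop(columns="usr"), columns=["runqsz"],
                   drop_first=True, dtype=float)
y = df["usr"]

# 70:30 split, then scale + regress inside one pipeline.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=1)
model = make_pipeline(StandardScaler(), LinearRegression())
model.fit(X_train, y_train)

for name, X_part, y_part in [("train", X_train, y_train), ("test", X_test, y_test)]:
    pred = model.predict(X_part)
    rmse = float(np.sqrt(mean_squared_error(y_part, pred)))
    print(name, "R2 =", round(r2_score(y_part, pred), 3), "RMSE =", round(rmse, 3))
```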


Problem 2: Logistic Regression, LDA and CART

You are a statistician at the Republic of Indonesia Ministry of Health and you are provided
with a data of 1473 females collected from a Contraceptive Prevalence Survey. The samples
are married women who were either not pregnant or do not know if they were at the time
of the survey.

The problem is to predict whether or not they use a contraceptive method of choice, based on
their demographic and socio-economic characteristics.

2.1 Data Ingestion: Read the dataset. Do the descriptive statistics and do null value condition
check, check for duplicates and outliers and write an inference on it. Perform Univariate and
Bivariate Analysis and Multivariate Analysis.

A. The columns in the data are of 'object' and 'float64' type, with one column of 'int64'
type.
B. Null values were checked and treated.

FIG 7 – After Treating Null values

C. Outlier check using box plots (the outliers have to be treated).


Fig 8 – Boxplot of Children born
Fig 9 – Boxplot of Husband Education

D. Univariate analysis of wife's education: the counts of women at the four education levels,
from lowest to highest, are 150, 330, 398 and 515.
E. Univariate analysis of husband's education: the counts of men at the four education levels,
from lowest to highest, are 44, 175, 347 and 827.
F. Bivariate Analysis :

Fig 10 – count plot

From the above graph we can conclude that, as the use of contraceptives increases, the
birth rate decreases.


Fig 11 – count plot

From the above graph we can conclude that, as the wife's education level increases, the use
of contraceptives increases.

G. Multivariate Analysis

Fig 12 – Heat Map for multivariate analysis.

2.2 Do not scale the data. Encode the data (having string values) for Modelling. Data
Split: Split the data into train and test (70:30). Apply Logistic Regression and LDA
(linear discriminant analysis) and CART.


Logistic Regression Output :-

Fig 13 - Logistic Regression.

LDA Output

Fig 14 - Linear Discriminant Analysis.

CART Output


Fig 15 - CART.
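A minimal sketch of fitting the three classifiers named in 2.2 with scikit-learn, on unscaled features and a 70:30 split. The data is synthetic (generated with make_classification) and stands in for an already encoded feature matrix; the report's actual columns, encoding and model settings are not shown in the text, so the settings below (default hyperparameters, max_iter=1000, fixed random_state, stratified split) are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for an encoded, unscaled feature matrix and a binary target.
X, y = make_classification(n_samples=1000, n_features=9, n_informative=4,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0, stratify=y)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "LDA": LinearDiscriminantAnalysis(),
    "CART": DecisionTreeClassifier(random_state=0),
}

# Fit each model on the training rows and record (train, test) accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = (model.score(X_train, y_train), model.score(X_test, y_test))
    print(f"{name}: train = {scores[name][0]:.3f}, test = {scores[name][1]:.3f}")
```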

2.3 Performance Metrics: Check the performance of Predictions on Train and Test sets using
Accuracy, Confusion Matrix, Plot ROC curve and get ROC_AUC score for each model. Final
Model: Compare the models and write an inference on which model is best/optimized.

Logistic Regression: Accuracy = 0.6507177033492823, Precision = 0.6489510489510489, Recall = 0.6355996944232238, F1 score = 0.6345191040843214
LDA: Accuracy = 0.6483253588516746, Precision = 0.6469681397738951, Recall = 0.6324166030048383, F1 score = 0.6308285719435482
CART: Accuracy = 0.65311004784689, Precision = 0.6491240266963292, Recall = 0.6489686783804431, F1 score = 0.6490425538074917

Fig 16 - ROC for LR.


Fig 17 - ROC for LDA .

Fig 18 - ROC for CART.
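As a reminder of what the ROC AUC score summarises, the sketch below computes it from its pairwise definition: the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as half. The function name and the four-sample example are made up for illustration; in practice a library routine such as scikit-learn's roc_auc_score computes the same quantity from the predicted probabilities.

```python
def roc_auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties = 0.5)."""
    pos = [s for label, s in zip(y_true, scores) if label == 1]
    neg = [s for label, s in zip(y_true, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Tiny made-up example: 3 of the 4 positive/negative pairs are ordered correctly.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y_true, scores))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking.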

Logistic Regression:
Accuracy (train): 0.6687179487179488
Accuracy (test): 0.65311004784689
Confusion Matrix:
[[ 92 95]
[ 50 181]]
ROC AUC Score: 0.6841678820288445

LDA:
Accuracy (train): 0.6707692307692308
Accuracy (test): 0.6483253588516746
Confusion Matrix:
[[ 90 97]
[ 50 181]]
ROC AUC Score: 0.6850707225038777

CART:
Accuracy (train): 0.9794871794871794
Accuracy (test): 0.65311004784689
Confusion Matrix:
[[114 73]
[ 72 159]]
ROC AUC Score: 0.6487487557006274
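The accuracy, precision, recall and F1 figures can be recomputed directly from the confusion matrices. The sketch below assumes rows are true classes and columns are predicted classes, and that precision, recall and F1 are macro averages (the unweighted mean over the two classes). Under those assumptions it reproduces the values reported for LDA and CART. The logistic regression matrix is left out because its two reported test accuracies (0.6507 and 0.6531) do not agree. The function name is made up for the example.

```python
def macro_metrics(cm):
    """Metrics from a confusion matrix where cm[i][j] counts true class i predicted as j."""
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n_classes)) / total

    precisions, recalls, f1s = [], [], []
    for i in range(n_classes):
        tp = cm[i][i]
        predicted_i = sum(cm[r][i] for r in range(n_classes))  # column sum
        actual_i = sum(cm[i])                                   # row sum
        p = tp / predicted_i if predicted_i else 0.0
        r = tp / actual_i if actual_i else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1s) / n_classes,
    }

# Test-set confusion matrices as printed in the report.
lda = macro_metrics([[90, 97], [50, 181]])
cart = macro_metrics([[114, 73], [72, 159]])
print("LDA ", lda)
print("CART", cart)
```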

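In all three models the train accuracy is higher than the test accuracy. The gap is small for
Logistic Regression (0.669 vs 0.653) and LDA (0.671 vs 0.648) but large for CART (0.979 vs
0.653), which suggests that the CART model is overfitting the training data.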
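One common way to narrow a train/test gap in a decision tree is to limit its growth. The sketch below is an illustration on synthetic, deliberately noisy data, not a re-run of the report's model: it compares a fully grown tree with one restricted by max_depth and min_samples_leaf, and both the data and the particular limits (4 and 20) are assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic noisy data (20% of labels flipped) so that overfitting is visible.
X, y = make_classification(n_samples=1000, n_features=9, n_informative=4,
                           flip_y=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0, stratify=y)

# Fully grown tree: keeps splitting until the training rows are separated.
full = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Restricted tree: shallow, and every leaf must hold at least 20 training rows.
pruned = DecisionTreeClassifier(max_depth=4, min_samples_leaf=20,
                                random_state=0).fit(X_train, y_train)

for name, tree in [("full", full), ("restricted", pruned)]:
    train_acc = tree.score(X_train, y_train)
    test_acc = tree.score(X_test, y_test)
    print(f"{name}: depth = {tree.get_depth()}, "
          f"train = {train_acc:.3f}, test = {test_acc:.3f}")
```

In practice the limits would be chosen by cross-validation on the training set rather than fixed by hand.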

2.4 Inference: Basis on these predictions, what are the insights and recommendations .

Overall, the business insights and recommendations from such analyses would likely
depend on the specific independent variables included in the models and the specific
goals of the analysis. Further analysis and interpretation of the results would be
necessary to provide more specific insights and recommendations.

Step 1 :- Load all libraries, then load the Excel file.

Step 2 :- EDA and variate analysis (univariate, bivariate and multivariate).
Step 3 :- Scaling of the data.
Step 4 :- Data encoding, train-test split, Logistic Regression, LDA and CART.
Step 5 :- Train-test procedure and comparison of the models to draw
conclusions using performance metrics (train and test accuracy, confusion matrix, AUC-ROC
curve).
