Critical Analysis Of: Multiple Regression Model


Report on

Critical Analysis of

Multiple Regression Model.


Submitted by-

- Ashish (PF2123-D126)
- Aditi (PH2123-D142)
- Akansha (PF2123-D-040)
- Anirudh (PM2123-D130)
- Mihir (PF2123-D-099)
- Swapnil (PF2123-D-087)

Guided by-

Prof. Pradeep Pai


Contents

1. Introduction and Objective
2. Questions on Multiple Regression
3. Critical Analysis and Observations

Introduction

In today's world, data is everywhere. Data by itself is just facts and
figures, which must be explored to yield meaningful information; hence
data analysis is important. Data analysis is the process of applying
statistical techniques to describe the data and provide better context
for it. Regression analysis is one of the most widely used methods in
data analysis. In multiple linear regression, the relationship between
two or more independent variables and one dependent variable is
estimated.

Objective
- To develop a multiple regression model.
- To interpret the regression coefficients.
- To decide which independent variables to include in the regression
  model.
- To determine which independent variables are most important in
  predicting a dependent variable.
- To use categorical independent variables in a regression model.

Questions on Multiple Regression

Q.1 We want to predict the price of our upcoming BST ride in Mumbai
from some available data. The table below lists the price, duration,
distance, and passenger age for a sample of Mumbai BST rides.

Price (in Rs)   Duration (in minutes) X1   Distance (in miles) X2   Age of Passenger X3
22.3            16                         3.2                      61
12.5            4                          1.5                      24
29              29                         5                        47
36.2            33                         5.8                      32
19.1            14                         2.3                      23
36.5            30                         6.1                      82
66.9            56                         12                       57
17.3            11                         1.9                      36
8               2                          0.8                      42
18              14                         1.7                      47
24.3            15                         2.8                      29
22.1            15                         2.5                      27
44.3            40                         4.2                      19
23              19                         3.6                      45
61.1            49                         10.2                     39

Equation of Regression-
Y (Price) = a + b Age + c Distance + d Duration,
where a is the intercept and b, c and d are the regression coefficients.
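One common way to estimate a, b, c and d is ordinary least squares. The sketch below is an illustration only, not the tool that produced the summary output in this report. It assumes the table above is the complete data set, and the helper names `solve` and `fit_ols` are made up for this example. It builds and solves the normal equations in plain Python.

```python
# Rows copied from the Q.1 table: (price, duration, distance, age)
rows = [
    (22.3, 16, 3.2, 61), (12.5, 4, 1.5, 24), (29, 29, 5, 47),
    (36.2, 33, 5.8, 32), (19.1, 14, 2.3, 23), (36.5, 30, 6.1, 82),
    (66.9, 56, 12, 57), (17.3, 11, 1.9, 36), (8, 2, 0.8, 42),
    (18, 14, 1.7, 47), (24.3, 15, 2.8, 29), (22.1, 15, 2.5, 27),
    (44.3, 40, 4.2, 19), (23, 19, 3.6, 45), (61.1, 49, 10.2, 39),
]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(xs, ys):
    """Return [intercept, b1, ..., bk] that minimise the squared error."""
    design = [[1.0] + list(x) for x in xs]  # leading 1.0 is the intercept column
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)


ys = [r[0] for r in rows]
xs = [r[1:] for r in rows]
coef = fit_ols(xs, ys)
residuals = [
    y - (coef[0] + sum(b * v for b, v in zip(coef[1:], x)))
    for x, y in zip(xs, ys)
]
print("intercept, duration, distance, age:", [round(c, 4) for c in coef])
```

The coefficients come out in the order intercept, Duration, Distance, Age. For larger data sets a statistics package is the more practical choice; the plain-Python solver is used here only to keep the example self-contained.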

Multiple Regression Model-

Equation of Regression according to Summary Output-
Y = 7.747104054 - 0.06705101 Age + 2.067179097 Distance + 0.673838511 Duration.
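To see what the fitted equation implies for a single ride, the sketch below plugs the first row of the table (16 minutes, 3.2 miles, age 61) into it. The coefficients are copied from the equation above; the function name `predict_price` is only illustrative.

```python
# Coefficients copied from the fitted equation in the text.
INTERCEPT = 7.747104054
B_AGE = -0.06705101
B_DISTANCE = 2.067179097
B_DURATION = 0.673838511


def predict_price(duration_min, distance_miles, age_years):
    """Predicted price in Rs for one ride, using the reported equation."""
    return (
        INTERCEPT
        + B_AGE * age_years
        + B_DISTANCE * distance_miles
        + B_DURATION * duration_min
    )


first_row = predict_price(16, 3.2, 61)
print(round(first_row, 2))  # 21.05, against an actual price of 22.3 in the table
```

Each extra minute adds about Rs 0.67 and each extra mile about Rs 2.07 to the predicted price, while each extra year of passenger age lowers it by about Rs 0.07.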
Observations
1. Which variables move in the same direction as Y?

From the summary output, the coefficients of X1 (Duration) and X2
(Distance) are positive, so these variables move in the same direction
as Y (Price). The coefficient of X3 (Age) is negative.

2. Is the regression working or not?

As the value of Significance F is well below 0.05, the model as a whole
is statistically significant, so we can say that yes, the regression is
working.
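The Significance F value is the p-value of the overall F statistic. As a rough cross-check, the sketch below rebuilds that F statistic from the adjusted R square of 0.978 reported for this model, with n = 15 rides and k = 3 predictors. The function names are illustrative, and the 5% critical value 3.59 for (3, 11) degrees of freedom is taken from a standard F table rather than from the report.

```python
def r2_from_adjusted(adj_r2, n, k):
    """Invert the adjusted R square formula to recover plain R square."""
    return 1.0 - (1.0 - adj_r2) * (n - k - 1) / (n - 1)


def f_statistic(r2, n, k):
    """Overall F statistic for a model with k predictors and n observations."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


n, k = 15, 3
r2 = r2_from_adjusted(0.978, n, k)  # 0.978 is the reported adjusted R square
f = f_statistic(r2, n, k)

F_CRITICAL = 3.59  # tabulated 5% value for (3, 11) degrees of freedom
print(round(f, 1), f > F_CRITICAL)
```

An F statistic far above the critical value corresponds to a Significance F far below 0.05, which is the condition used above.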

3. Significance of Adjusted R square.

In our example, Adjusted R Square is 0.978, which means that about
97.8% of the variation in the dependent variable is explained by the
independent variables, after adjusting for the number of predictors.
This figure is higher than 95% and hence is considered a good fit.
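The adjusted R square can be recomputed from the table and the fitted equation. The sketch below is a hedged illustration: it assumes the table is the full data set behind the summary output, and the function names are made up for this example. R square is one minus the ratio of residual to total sum of squares, and the adjustment rescales it by (n - 1) / (n - k - 1).

```python
# Rows copied from the Q.1 table: (price, duration, distance, age)
rows = [
    (22.3, 16, 3.2, 61), (12.5, 4, 1.5, 24), (29, 29, 5, 47),
    (36.2, 33, 5.8, 32), (19.1, 14, 2.3, 23), (36.5, 30, 6.1, 82),
    (66.9, 56, 12, 57), (17.3, 11, 1.9, 36), (8, 2, 0.8, 42),
    (18, 14, 1.7, 47), (24.3, 15, 2.8, 29), (22.1, 15, 2.5, 27),
    (44.3, 40, 4.2, 19), (23, 19, 3.6, 45), (61.1, 49, 10.2, 39),
]


def predict_price(duration, distance, age):
    """Fitted equation reported in the text."""
    return (7.747104054 - 0.06705101 * age
            + 2.067179097 * distance + 0.673838511 * duration)


def r_squared(actual, predicted):
    """Share of the variation in `actual` explained by `predicted`."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, k):
    """Penalise R square for using k predictors on n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


actual = [r[0] for r in rows]
predicted = [predict_price(*r[1:]) for r in rows]
r2 = r_squared(actual, predicted)
adj = adjusted_r_squared(r2, n=len(rows), k=3)
print(round(r2, 3), round(adj, 3))
```

The adjusted value is always a little lower than the plain R square, and the gap widens as more predictors are added to a small sample.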

4. Importance of p-value.

Variables                   P-Value
Intercept                   0.00277151
Duration (in minutes) X1    0.00036701
Distance (in miles) X2      0.01214325
Age of Passenger X3         0.16937769

If the P-Value is less than 0.05 we keep the variable, and if it is
greater than 0.05 we discard it. From the given summary output, we
would keep two variables, Duration and Distance, but we would discard
Age of Passenger, as its P-Value (0.169) is greater than 0.05.
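The keep-or-discard rule can be written as a small filter. The sketch below applies it to the P-Values from the table above; the intercept is left out because the rule concerns predictors only, which is an assumption of this example, as is the function name.

```python
# P-Values copied from the Q.1 summary table (intercept omitted).
p_values = {
    "Duration (X1)": 0.00036701,
    "Distance (X2)": 0.01214325,
    "Age of Passenger (X3)": 0.16937769,
}
ALPHA = 0.05  # significance level used in the text


def split_by_significance(p_values, alpha=ALPHA):
    """Return (keep, drop) lists of variable names for the given cutoff."""
    keep = [name for name, p in p_values.items() if p < alpha]
    drop = [name for name, p in p_values.items() if p >= alpha]
    return keep, drop


keep, drop = split_by_significance(p_values)
print("keep:", keep)
print("drop:", drop)
```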

5. How does Adjusted R square penalize variables?

Below, the regression of Price on Age of Passenger alone is given.

From that summary output, the value of adjusted R square for Age of
Passenger is -0.02531894. A negative value shows that the variable is
irrelevant to the final output; hence we can conclude that adjusted R
square has penalized the variable Age of Passenger.
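The penalty can be made explicit with the standard adjusted R square formula, where n is the number of observations and k the number of predictors (here n = 15 and k = 1). The R square of about 0.048 below is not stated in the report; it is back-calculated from the reported adjusted value, so treat it as an approximation.

```latex
\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1}

\bar{R}^2 < 0 \iff R^2 < \frac{k}{n - 1} = \frac{1}{14} \approx 0.0714

R^2 = 1 - (1 - \bar{R}^2)\,\frac{n - k - 1}{n - 1}
    = 1 - 1.02531894 \cdot \frac{13}{14} \approx 0.048
```

Since 0.048 is below the threshold of 0.0714, the adjustment pushes the value below zero: Age explains less of the variation in Price than a predictor would be expected to explain by chance in a sample of this size.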

Q.2 We want to predict the profit of a company from some available
data. The table below lists Profit, Marketing Spend, Administration and
R&D Spend.

Profit    Marketing Spend-X1   Administration-X2   R&D Spend-X3
192262    471784               136898              165349
191792    443899               151378              162598
191050    407935               101146              153442
182902    383200               118672              144372
166188    366168               91391               142107
156991    362861               99814               131877
156123    127717               147199              134615
155753    323877               145530              130298
152212    311613               148719              120543
149760    304982               109679              123335

Equation of Regression-
Y (Profit) = a + b Marketing + c Administration + d R&D.

Multiple Regression Model-

Equation of Regression according to Summary Output-
Y = 23410.97357 + 0.030632574 Marketing - 0.004759078 Administration + 0.965212461 R&D.

Observations
1) Which variables move in the same direction as Y?

From the summary output, the coefficients of X1 (Marketing) and X3
(R&D) are positive, so these variables move in the same direction as Y
(Profit). The coefficient of X2 (Administration) is negative.

2) Is the regression working or not?

As the value of Significance F is well below 0.05, the model as a whole
is statistically significant, so we can say that yes, the regression is
working.

3) Significance of Adjusted R square.

In our example, Adjusted R Square is 0.881, which means that about
88.1% of the variation in the dependent variable is explained by the
independent variables, after adjusting for the number of predictors.
This figure is below the 95% mark, so the fit is weaker than in Q.1,
although it is still reasonably good.

4) Importance of p-value.

Variables             P-Value
Intercept             0.317536505
Marketing Spend-X1    0.35668363
Administration-X2     0.960879901
R&D Spend-X3          0.001911261

If the P-Value is less than 0.05 we keep the variable, and if it is
greater than 0.05 we discard it. From the given summary output, we
would keep one variable, R&D Spend-X3, and we would discard Marketing
Spend-X1 and Administration-X2, as their P-Values are greater than
0.05.

5) Which independent variables should be included in the regression
model?

Below, the regression of Profit on R&D Spend alone is given.

From that summary output, the value of adjusted R square is
0.893922093, which shows the high relevance of that variable to the
final output. Hence we can conclude that the adjusted R square with
R&D Spend alone is nearly the same as that of all the variables
combined (0.881).
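The single-variable result can be checked directly from the Q.2 table. With one predictor, R square is the squared correlation between the predictor and the response. The sketch below is an illustration under the assumption that the table is the full data set; the function names are made up for this example.

```python
# Columns copied from the Q.2 table.
profit = [192262, 191792, 191050, 182902, 166188,
          156991, 156123, 155753, 152212, 149760]
rd_spend = [165349, 162598, 153442, 144372, 142107,
            131877, 134615, 130298, 120543, 123335]


def simple_r_squared(x, y):
    """R square of a one-predictor regression (squared correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def adjusted_r_squared(r2, n, k):
    """Penalise R square for using k predictors on n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


r2 = simple_r_squared(rd_spend, profit)
adj = adjusted_r_squared(r2, n=len(profit), k=1)
print(round(r2, 4), round(adj, 4))
```

If the table matches the data behind the summary output, the adjusted value should land close to the 0.8939 reported above. That it is no lower than the 0.881 of the three-variable model is consistent with Marketing and Administration adding little once R&D is included.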
