Price Vs SqFt: Best Fit Trend Line Equation: Price = 70.226 * Square Feet - 10091

Q1. Estimate a multiple linear regression model and interpret the result. Formulate and test hypotheses
about the coefficients by applying the ‘t’ and ‘F’ tests based on your theory, and interpret them.
[Figure: Price Vs SqFt. Scatter plot of Price against Square feet with fitted trend line f(x) = 70.23 x − 10091.13]
Best Fit Trend Line equation: Price = 70.226*(Square feet) - 10091
[Figure: Price Vs Bedrooms. Scatter plot of Price against Bedroom with fitted trend line f(x) = 5407.21 x² − 14062.77 x + 120689.63]
Best Fit Trend line equation: Price = 5407.2*(Bedroom)^2 - 14063*(Bedroom) + 120690
[Figure: Price Vs Bathroom. Scatter plot of Price against Bathroom with fitted trend line f(x) = 14445.71 x² − 46238.13 x + 153321.21]
Best Fit Trend line equation: Price = 14446*(Bathroom)^2 - 46238*(Bathroom) + 153321

[Figure: Price Vs Offers. Scatter plot of Price against Offers with fitted trend line f(x) = − 7880.69 x + 150744.74]
Best Fit Trend line equation: Price = -7880.7*(Offers) + 150745
[Figure: Price Vs Brick. Scatter plot of Price against Brick with fitted trend line f(x) = 25810.91 x + 121958.14]
Best Fit Trend line equation: Price = 25811*(Brick) + 121958
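The single-variable trend lines can be evaluated directly. The following is a minimal Python sketch that uses the coefficients reported on the charts; the function names and the example inputs (a 2,000 sq ft house, 3 bedrooms, and so on) are illustrative assumptions, and Brick is assumed to be coded as 1 for a brick house and 0 otherwise.

```python
# Single-variable trend lines, using the coefficients reported on the charts.
# Function names and example inputs are illustrative assumptions.

def price_from_sqft(sqft):
    return 70.226 * sqft - 10091

def price_from_bedrooms(bedrooms):
    return 5407.2 * bedrooms ** 2 - 14063 * bedrooms + 120690

def price_from_bathrooms(bathrooms):
    return 14446 * bathrooms ** 2 - 46238 * bathrooms + 153321

def price_from_offers(offers):
    # The chart shows a straight-line fit for Offers.
    return -7880.7 * offers + 150745

def price_from_brick(brick):
    # Assumption: brick is 1 for a brick house and 0 otherwise.
    return 25811 * brick + 121958

print(round(price_from_sqft(2000), 1))
print(round(price_from_bedrooms(3), 1))
print(round(price_from_bathrooms(2), 1))
print(round(price_from_offers(2), 1))
# Difference between a brick and a non-brick house under the one-variable fit.
print(round(price_from_brick(1) - price_from_brick(0), 1))
```

Each line uses only one predictor, so these estimates ignore the other variables; the multiple regression models in Q2 combine them.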
Q2) Obtain the intercepts & slope coefficients of the model and interpret them. Interpret the overall
regression results with sound theoretical knowledge.
To develop an accurate forecasting model, several models were built on a trial-and-error basis to arrive at
the best model. The intercept and slope coefficients of the models are as follows:
Model 1:
Price = C0 + C1*SqFt + C2*Bedroom + C3*Bathroom + C4*Offers
Regression Statistics
Multiple R   R Square   Adjusted R Square   Standard Error   Observations
0.835573     0.698182   0.688367            14999.25         128

            Coefficients   Standard Error   t Stat    P-value    Lower 95%   Upper 95%
Intercept   -17347.4       12724.9          -1.363    0.1752     -42535.5    7840.775
SqFt        61.83995       8.263774         7.4832    1.2E-11    45.48231    78.19758
Bedroom     9319.753       2148.754         4.3372    2.97E-5    5066.425    13573.08
Bathroom    12646.35       3109.662         4.0667    8.45E-5    6490.962    18801.73
Offers      -13601         1324.819         -10.266   3.09E-18   -16223.4    -10978.6
The intercept and slope coefficients are as shown in the table. Noteworthy points are as follows:
• The coefficient of “Offers” is negative, which indicates that as the number of offers increases, the price goes
down. This also matches our intuition. All other slope coefficients are positive.
• All independent variables are statistically significant, i.e. p-value < 0.05. However, the intercept term’s
p-value is 0.175, which implies that the intercept is not statistically significant.
• Adjusted R² for the model is 0.6884.
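Each t statistic in the regression output is the coefficient divided by its standard error. As a quick check, the sketch below recomputes the t statistics from the Model 1 coefficients and standard errors; the variable and function names are our own and are not part of the Excel output.

```python
# t statistic for each coefficient = coefficient / standard error.
# The (coefficient, standard error) pairs are copied from the Model 1 table.
model_1 = {
    "Intercept": (-17347.4, 12724.9),
    "SqFt": (61.83995, 8.263774),
    "Bedroom": (9319.753, 2148.754),
    "Bathroom": (12646.35, 3109.662),
    "Offers": (-13601, 1324.819),
}

def t_statistic(coefficient, standard_error):
    return coefficient / standard_error

t_stats = {name: t_statistic(b, se) for name, (b, se) in model_1.items()}

for name, t in t_stats.items():
    print(f"{name:<10} t = {t:8.3f}")
```

The sign of a t statistic always follows the sign of its coefficient, so the negative Offers coefficient gives a negative t value.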
Model 2: Price = C0 + C1*Sq.Ft + C2*Bedroom + C3*Bathroom + C4*Offers + C5*Bricks
Regression Statistics
Multiple R   R Square   Adjusted R Square   Standard Error   Observations
0.884907     0.783061   0.77417             12768.46         128

            Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept   -15831.6       10834.59         -1.46121   0.146529   -37279.7    5616.565
SqFt        59.20441       7.045067         8.403669   9.35E-14   45.258      73.15082
Bedrooms    9763.172       1830.303         5.334183   4.49E-07   6139.904    13386.44
Bathrooms   9823.664       2678.515         3.667579   0.000364   4521.276    15126.05
Offers      -12168.5       1146.684         -10.6119   4.9E-19    -14438.5    -9898.56
Brick       17146.94       2481.854         6.908925   2.38E-10   12233.86    22060.02
• The coefficient of “Offers” is negative, which indicates that as the number of offers increases, the price goes
down. This also matches our intuition. All other slope coefficients are positive.
• All independent variables are statistically significant, i.e. p-value < 0.05. However, the intercept term’s
p-value is 0.1465, which implies that the intercept is not statistically significant.
• Adjusted R² for the model is 0.77417, which is better than model 1.
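Adjusted R² penalises R² for the number of predictors, which is why it is the fairer measure when comparing Model 1 (4 predictors) with Model 2 (5 predictors). The sketch below applies the standard formula, adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), to the R² values reported above; the function name is our own.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R-squared for a model with an intercept and n_predictors slopes."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

# R Square and Observations are taken from the regression statistics above.
model_1_adj = adjusted_r_squared(0.698182, n_obs=128, n_predictors=4)
model_2_adj = adjusted_r_squared(0.783061, n_obs=128, n_predictors=5)

print(round(model_1_adj, 6))
print(round(model_2_adj, 6))
```

Both values agree with the Adjusted R Square figures reported by Excel for the two models.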
Model 3: Price = C0 + C1*Sq.Ft + C2*Bedroom + C3*Bathroom + C4*Offers + C5*Bricks + C6*East Loc + C7*West Loc
+ C8*North Loc
Regression Statistics
Multiple R    R Square      Adjusted R Square   Standard Error   Observations
0.931998406   0.868621029   0.852623922         10018.94419      128

                Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept       2159.5         8877.8           0.2      0.81      -15417.9    19736.9
SqFt            53.0           5.7              9.2      0.00      41.6        64.3
Bedrooms        4246.8         1597.9           2.7      0.01      1083.0      7410.5
Bathrooms       7883.3         2117.0           3.7      0.00      3691.7      12074.9
Offers          -8267.5        1084.8           -7.6     0.00      -10415.3    -6119.7
Brick           17297.3        1981.6           8.7      0.00      13373.9     21220.8
East location   -1560.6        2396.8           -0.7     0.52      -6306.0     3184.8
West Loc        20681.0        3149.0           6.6      0.00      14446.3     26915.7
North Loc       0.0            0.0              65535    #NUM!     0.0         0.0
• The intercept and the slope coefficient of East Location are statistically insignificant.
• North Location shows a multicollinearity problem, as can be seen from the above table. Hence, we did not get
any p-value for the North Location slope coefficient.
• Adjusted R² for the model is 0.8526, which is better than model 2.
Note: Because of the collinearity problem of North Location, we have calculated VIF for the independent variables.
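One likely explanation for the zero coefficient on North Location is the dummy variable trap. This is an assumption on our part, since the raw data are not reproduced here: if every house lies in exactly one of the three locations, then East + West + North = 1 in every row, which duplicates the intercept column. The sketch below illustrates this with a few hypothetical rows.

```python
# Hypothetical rows (not the real data): each house is in exactly one location.
houses = ["East", "West", "North", "East", "North"]

rows = []
for location in houses:
    rows.append({
        "const": 1,  # intercept column
        "east": int(location == "East"),
        "west": int(location == "West"),
        "north": int(location == "North"),
    })

# With all three dummies plus an intercept the columns are linearly dependent:
# const - east - west - north is zero in every row.
dependency = [r["const"] - r["east"] - r["west"] - r["north"] for r in rows]
print(dependency)  # every entry is 0

# Dropping one dummy (here North, as the reference category) removes the exact
# dependency: the three remaining columns now take three distinct patterns.
reduced = [(r["const"], r["east"], r["west"]) for r in rows]
print(sorted(set(reduced)))
```

With one category dropped, the coefficients of the remaining location dummies are read as differences from the dropped (reference) location.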
Hypothesis testing based on t values
H0: The intercept and slope coefficients are not significant (equal to zero)
H1: The intercept and slope coefficients are significant (different from zero)
From the above table, for a coefficient to be significant we need |t| > 1.658. For the intercept and East Location,
the null hypothesis cannot be rejected. The other independent variables have |t| > 1.658, hence they are significant.
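The decision rule above can be written as a small check. The sketch below applies the critical value of 1.658 used in the text to the t statistics reported for Model 3 (North Loc is left out because it has no valid t statistic); the names are our own.

```python
T_CRITICAL = 1.658  # critical value used in the text

# t statistics as reported in the Model 3 table.
model_3_t = {
    "Intercept": 0.2,
    "SqFt": 9.2,
    "Bedrooms": 2.7,
    "Bathrooms": 3.7,
    "Offers": -7.6,
    "Brick": 8.7,
    "East location": -0.7,
    "West Loc": 6.6,
}

def is_significant(t_stat, critical=T_CRITICAL):
    # Compare the absolute value, so that negative coefficients such as
    # Offers are judged by the size of t rather than its sign.
    return abs(t_stat) > critical

not_significant = [name for name, t in model_3_t.items() if not is_significant(t)]
print(not_significant)
```

Comparing |t| rather than t matters here: Offers has t = -7.6, which is strongly significant even though it is far below +1.658.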
In the next part, we will attempt to address the multicollinearity problem in the present model and will develop an
improved model for the prediction of real estate prices.
2. Test for the multicollinearity problem with a suitable method. If it is present, solve the problem by any
one of the methods which you think is suitable for your example.
Multicollinearity Test:
VIF values
Independent Variable   VIF
SqFt                   1.862
Bedrooms               1.702
Bathrooms              1.501
Offers                 1.702
Brick                  1.104
West Location          1.732
North Location         1.652
East Location          7.845
Since multicollinearity is shown by East Location (its VIF of 7.845 is far higher than that of any other variable),
we will modify our regression model and use stepwise regression to counter the problem.
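A VIF is defined as 1 / (1 - R²), where R² comes from regressing that predictor on all the other predictors. The sketch below flags high VIF values and converts them back to the implied R². The cut-off of 5 is a commonly used rule of thumb that we assume here; it is not a value stated in the analysis above.

```python
# VIF values copied from the table above.
vif = {
    "SqFt": 1.862,
    "Bedrooms": 1.702,
    "Bathrooms": 1.501,
    "Offers": 1.702,
    "Brick": 1.104,
    "West Location": 1.732,
    "North Location": 1.652,
    "East Location": 7.845,
}

VIF_THRESHOLD = 5.0  # assumed rule of thumb, not taken from the report

def implied_r_squared(vif_value):
    # VIF = 1 / (1 - R^2)  =>  R^2 = 1 - 1 / VIF
    return 1 - 1 / vif_value

flagged = {
    name: round(implied_r_squared(value), 3)
    for name, value in vif.items()
    if value > VIF_THRESHOLD
}
print(flagged)
```

Only East Location is flagged: a VIF of 7.845 means that roughly 87% of its variation is explained by the other predictors.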
Stepwise Regression:
Model Summary (g)
Model   R          R Square   Adj. R Square   Std. Error of Estimate   R Sq. Change   F Change   df1   df2   Sig. F Change   Durbin-Watson
1       .714 (a)   .510       .506            18886.376                .510           131.041    1     126   .000
2       .812 (b)   .659       .654            15814.680                .149           54.700     1     125   .000
3       .885 (c)   .783       .778            12653.739                .124           71.251     1     124   .000
4       .918 (d)   .842       .837            10849.280                .059           45.678     1     123   .000
5       .928 (e)   .861       .855            10226.392                .019           16.440     1     122   .000
6       .932 (f)   .868       .862            9995.067                 .007           6.712      1     121   .011            1.902
a. Predictors: (Constant), West Location
b. Predictors: (Constant), West Location, SqFt
c. Predictors: (Constant), West Location, SqFt, Brick
d. Predictors: (Constant), West Location, SqFt, Brick, Offers
e. Predictors: (Constant), West Location, SqFt, Brick, Offers, Bathrooms
f. Predictors: (Constant), West Location, SqFt, Brick, Offers, Bathrooms, Bedrooms
g. Dependent Variable: Price
Significance and VIF values of independent variables in model 6
(B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient; Tolerance and VIF are
collinearity statistics.)
Model 6         B           Std. Error   Beta    t        Sig.   Tolerance   VIF
(Constant)      3067.471    8746.712             1.351    .072
West Location   21937.572   2482.393     .377    8.837    .000   .598        1.673
SqFt            52.149      5.572        .411    9.359    .000   .566        1.767
Brick           17058.771   1942.805     .299    8.780    .000   .938        1.066
Offers          -8019.003   1013.011     -.319   -7.916   .000   .670        1.492
Bathrooms       7810.698    2109.060     .150    3.703    .000   .668        1.497
Bedrooms        4070.005    1570.921     .110    2.591    .011   .605        1.653
Dependent Variable: Price
• All VIF values are below 2 (they range from 1.066 to 1.767), hence multicollinearity in model 6 is not significant.
• The intercept and slope terms are significant, as the p-value is < 0.1 in all cases.
• Adjusted R² for the model is 0.862, which is better than all previous models.
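The Tolerance and VIF columns in the SPSS output carry the same information, since VIF = 1 / Tolerance. The sketch below checks this against the reported values for model 6; the small differences come from the three-decimal rounding of Tolerance, and the names are our own.

```python
# (Tolerance, VIF) pairs copied from the model 6 coefficients table.
collinearity = {
    "West Location": (0.598, 1.673),
    "SqFt": (0.566, 1.767),
    "Brick": (0.938, 1.066),
    "Offers": (0.670, 1.492),
    "Bathrooms": (0.668, 1.497),
    "Bedrooms": (0.605, 1.653),
}

# Recompute VIF from Tolerance for every predictor.
vif_from_tolerance = {
    name: 1 / tolerance for name, (tolerance, _) in collinearity.items()
}

for name, (_, reported_vif) in collinearity.items():
    print(f"{name:<14} reported {reported_vif:.3f}  recomputed {vif_from_tolerance[name]:.3f}")

max_vif = max(reported for _, reported in collinearity.values())
print(max_vif)
```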
So, our final model after removing multicollinearity:
Model 4: Price = C0 + C1*West Location + C2*SqFt + C3*Brick + C4*Offers + C5*Bathrooms + C6*Bedrooms
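To show how the final model is used for prediction, the sketch below plugs the model 6 coefficients from the SPSS table into the equation. The example house (2,000 sq ft, brick, 2 offers, 2 bathrooms, 3 bedrooms, not in the West location) is hypothetical, and the function name is our own.

```python
# Coefficients of the final model (model 6 in the stepwise output).
COEFFICIENTS = {
    "const": 3067.471,
    "west": 21937.572,
    "sqft": 52.149,
    "brick": 17058.771,
    "offers": -8019.003,
    "bathrooms": 7810.698,
    "bedrooms": 4070.005,
}

def predict_price(west, sqft, brick, offers, bathrooms, bedrooms):
    """Predicted price; west and brick are 0/1 indicator variables."""
    c = COEFFICIENTS
    return (
        c["const"]
        + c["west"] * west
        + c["sqft"] * sqft
        + c["brick"] * brick
        + c["offers"] * offers
        + c["bathrooms"] * bathrooms
        + c["bedrooms"] * bedrooms
    )

# Hypothetical house, used only to illustrate the calculation.
example = predict_price(west=0, sqft=2000, brick=1, offers=2, bathrooms=2, bedrooms=3)
same_house_in_west = predict_price(west=1, sqft=2000, brick=1, offers=2, bathrooms=2, bedrooms=3)

print(round(example, 2))
print(round(same_house_in_west - example, 2))  # premium for the West location
```

Holding everything else fixed, moving the same house to the West location changes the prediction by exactly the West Location coefficient.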
3. Applying a suitable method, test for the heteroscedasticity problem. If it is present, solve the problem of
heteroscedasticity by any one of the methods which you think suitable for your example.
For testing heteroscedasticity, we will use the Breusch-Pagan test in SPSS:

Breusch-Pagan model:
(Residual of predicted model)^2 = C0 + C1*West Location + C2*SqFt + C3*Brick + C4*Offers + C5*Bathrooms +
C6*Bedrooms
From this regression, we obtain the results below:
ANOVA (a)
Model 1      Sum of Squares            df    Mean Square             F      Sig.
Regression   120623666833319424.000    6     20103944472219904.000   .946   .465 (b)
Residual     2570097401831889400.000   121   21240474395304872.000
Total        2690721068665208800.000   127
a. Dependent Variable: sq_residual
b. Predictors: (Constant), West Location, Brick, SqFt, Offers, Bathrooms, Bedrooms
From this table it can be noted that the significance (p-value) of the overall regression is 0.465.
Null Hypothesis, H0: There is no heteroscedasticity in the residuals of our predicted model 4
Alternate Hypothesis, H1: There is heteroscedasticity in the residuals of our predicted model 4
From the Breusch-Pagan test, the p-value of the auxiliary regression is 0.465, which is greater than 0.05, hence we
fail to reject the null hypothesis. So, there is no evidence of heteroscedasticity in the error term of our model 4.
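The F statistic of the auxiliary regression can be recomputed from the ANOVA table. The sketch below also computes the LM form of the test, n * R² of the auxiliary regression compared with a chi-square distribution with 6 degrees of freedom. This LM variant is not what was run in SPSS above; we include it only as a cross-check. The helper chi2_sf_even_df is our own and relies on a closed form that is valid only for even degrees of freedom.

```python
import math

# Sums of squares and degrees of freedom from the auxiliary-regression ANOVA table.
ss_regression = 120623666833319424.0
ss_residual = 2570097401831889400.0
df_regression, df_residual = 6, 121
n_obs = 128

# Overall F test of the auxiliary regression (the statistic reported by SPSS).
f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)

# LM variant: n * R^2 of the auxiliary regression, chi-square with 6 df.
r_squared_aux = ss_regression / (ss_regression + ss_residual)
lm_stat = n_obs * r_squared_aux

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid only for even df."""
    half = df // 2
    term, total = 1.0, 1.0
    for i in range(1, half):
        term *= (x / 2) / i
        total += term
    return math.exp(-x / 2) * total

lm_p_value = chi2_sf_even_df(lm_stat, df_regression)

print(round(f_stat, 3))
print(round(lm_stat, 3), round(lm_p_value, 3))
```

Both versions point the same way: neither statistic is anywhere near significance at the 5% level.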
4. Test whether autocorrelation is present or not in your regression. If it is present, solve the problem of
autocorrelation by any one of the methods which you think suitable for your example.
To test for autocorrelation, we have used the Durbin-Watson method in our model. The model summary is:
Model Summary (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1       .932 (a)   .868       .862                9995.067                     1.902
a. Predictors: (Constant), West Location, Brick, SqFt, Offers, Bathrooms, Bedrooms
b. Dependent Variable: Price
Durbin-Watson value = 1.902
DW critical values from the table for n = 128 and k = 6: dL = 1.60 and dU = 1.805
Null Hypothesis, H0: There is no autocorrelation in the error term
Alternate Hypothesis, H1: Autocorrelation exists in the error term
Since the DW value of the model lies between dU and 4 - dU (1.805 < 1.902 < 2.195), we fail to reject the null
hypothesis, and therefore there is no evidence of autocorrelation in the error term of model 4.
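The Durbin-Watson statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals, and the usual decision rule compares it with dL, dU, 4 - dU and 4 - dL. The sketch below implements both; the function names are our own, and the short residual series is made up purely to exercise the formula.

```python
def durbin_watson(residuals):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2)."""
    numerator = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

def durbin_watson_decision(dw, d_lower, d_upper):
    if dw < d_lower:
        return "positive autocorrelation"
    if dw <= d_upper:
        return "inconclusive"
    if dw < 4 - d_upper:
        return "no autocorrelation"
    if dw <= 4 - d_lower:
        return "inconclusive"
    return "negative autocorrelation"

# Values from the report: DW = 1.902, dL = 1.60, dU = 1.805.
decision = durbin_watson_decision(1.902, 1.60, 1.805)
print(decision)

# Made-up residuals that alternate in sign, to show the formula at work.
alternating = durbin_watson([1, -1, 1, -1])
print(alternating)
```

A DW value close to 2 indicates no first-order autocorrelation, while the sign-alternating example shows how negatively correlated residuals push the statistic towards 4.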
5. Perform the redundant variable or omitted variable tests to check the inclusion or exclusion of a
variable in the model.
To test for redundant or omitted variables, we have conducted two operations:

• Stepwise regression in SPSS
• Ramsey RESET test
Stepwise regression
Model 4 of our regression was derived from stepwise regression, which gave the following model:
Price = C0 + C1*West Location + C2*SqFt + C3*Brick + C4*Offers + C5*Bathrooms + C6*Bedrooms
The test statistics for the model are as follows:
Model Summary (g)
Model   R          R Square   Adj. R Square   Std. Error of Estimate   R Sq. Change   F Change   df1   df2   Sig. F Change   Durbin-Watson
6       .932 (f)   .868       .862            9995.067                 .007           6.712      1     121   .011            1.902
f. Predictors: (Constant), West Location, SqFt, Brick, Offers, Bathrooms, Bedrooms
g. Dependent Variable: Price
ANOVA (a)
Model 6      Sum of Squares    df    Mean Square       F         Sig.
Regression   79597148827.421   6     13266191471.237   132.793   .000 (g)
Residual     12088065469.454   121   99901367.516
Total        91685214296.875   127
g. Predictors: (Constant), West Location, SqFt, Brick, Offers, Bathrooms, Bedrooms
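The ANOVA table ties together several of the summary statistics: F is the ratio of the two mean squares, R² is the regression sum of squares divided by the total, and the standard error of the estimate is the square root of the residual mean square. The sketch below recomputes them from the sums of squares above; the variable names are our own.

```python
import math

# Sums of squares and degrees of freedom from the model 6 ANOVA table.
ss_regression = 79597148827.421
ss_residual = 12088065469.454
df_regression, df_residual = 6, 121

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

f_stat = ms_regression / ms_residual                       # overall F test
r_squared = ss_regression / (ss_regression + ss_residual)  # share of variation explained
std_error = math.sqrt(ms_residual)                         # Std. Error of the Estimate

print(round(f_stat, 3))
print(round(r_squared, 3))
print(round(std_error, 3))
```

The recomputed values agree with the F, R Square and Std. Error of the Estimate figures in the SPSS output.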
Ramsey RESET Test:
For the Ramsey RESET test, the model examined was:

Price = C0 + C1*West Location + C2*SqFt + C3*Brick + C4*Offers + C5*Bathrooms + C6*Bedrooms +
C7*(Pred Value)^2 + C8*(Pred Value)^3
Null Hypothesis, H0: C7 = 0 and C8 = 0
Alternate Hypothesis, H1: Either or both of C7 and C8 are not equal to zero
Coefficients
(B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient; Tolerance and VIF are
collinearity statistics.)
Model 1         B           Std. Error   Beta    t        Sig.   Tolerance   VIF
(Constant)      21144.391   15506.040            1.364    .098
SqFt            41.612      9.312        .328    4.469    .000   .201        4.975
Bedrooms        2959.131    1751.931     .080    1.689    .094   .482        2.073
Bathrooms       5694.238    2582.215     .109    2.205    .029   .442        2.262
Offers          -6373.571   1543.132     -.254   -4.130   .000   .287        3.490
Brick           12946.373   3501.441     .227    3.697    .000   .286        3.491
West Location   16118.918   4812.623     .277    3.349    .001   .158        6.338
(Pred)^3        4.222E-12   .000         .228    1.409    .161   .041        24.205
Note: Pred^2 has been excluded from the regression model as it was showing high multicollinearity.
From the above table, the p-value of the Pred^3 coefficient (C8) is 0.161, which is > 0.10, so we fail to reject the
null hypothesis; hence, C7 and C8 are not significant.
Therefore, the exclusion of variables is appropriate, and our model is correctly specified.
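Two small calculations help in reading the RESET output. First, when only one added term is tested, the F statistic of the test equals the square of that term's t statistic. Second, the coefficient of Pred^3 looks tiny only because the regressor itself is huge. The sketch below illustrates both; the fitted price of 130,000 is a hypothetical value chosen for illustration, and the variable names are our own.

```python
# Values reported for the (Pred)^3 term in the RESET regression.
t_pred_cubed = 1.409
p_value = 0.161
coef_pred_cubed = 4.222e-12

ALPHA = 0.10  # significance level used in the text

# With a single added term, the RESET F statistic equals t squared.
f_equivalent = t_pred_cubed ** 2
reject_null = p_value < ALPHA

# Hypothetical fitted price, used only to show the scale of the regressor.
fitted_price = 130000
pred_cubed = fitted_price ** 3
contribution = coef_pred_cubed * pred_cubed

print(round(f_equivalent, 3))
print(reject_null)
print(round(contribution, 1))
```

For a fitted price of this size the cubed term is of the order of 10^15, so a coefficient of the order of 10^-12 still corresponds to a contribution of several thousand in price units; its size alone says nothing about significance, which is judged from the p-value.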
