Statistics For Business and Economics: Simple Regression
Chapter 11
Simple Regression
Copyright © 2013 Pearson Education, Inc. Publishing as Prentice Hall Ch. 11-1
Chapter Goals
After completing this chapter, you should be
able to:
■ Explain the simple linear regression model
■ Obtain and interpret the simple linear regression
equation for a set of data
■ Describe R² as a measure of the explanatory power of the
regression model
■ Understand the assumptions behind regression
analysis
■ Explain measures of variation and determine whether
the independent variable is significant
Chapter Goals
(continued)
After completing this chapter, you should be
able to:
■ Calculate and interpret confidence intervals for the
regression coefficients
■ Use a regression equation for prediction
■ Form forecast intervals around an estimated Y value
for a given X
■ Use graphical analysis to recognize potential problems
in regression analysis
■ Explain the correlation coefficient and perform a
hypothesis test for zero population correlation
11.1
Overview of Linear Models
Y = β0 + β1X
Introduction to
Regression Analysis
11.2
Linear Regression Model
Simple Linear Regression
Model
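The population regression model:
$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
where:
$y_i$ = Dependent variable
$\beta_0$ = Population Y intercept
$\beta_1$ = Population slope coefficient
$x_i$ = Independent variable
$\varepsilon_i$ = Random error term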
Linear Regression Assumptions
■ The random error terms, εi, are not correlated with one
another, so that
$E[\varepsilon_i \varepsilon_j] = 0$ for all $i \neq j$
Simple Linear Regression
Model
(continued)
[Figure: population regression line with intercept β0 and slope β1. For a given xi, the random error εi is the difference between the observed value of Y and the predicted value of Y.]
Simple Linear Regression
Equation
The simple linear regression equation provides an
estimate of the population regression line
$\hat{y}_i = b_0 + b_1 x_i$
where:
$\hat{y}_i$ = Estimated (or predicted) y value for observation i
$b_0$ = Estimate of the regression intercept
$b_1$ = Estimate of the regression slope
$x_i$ = Value of x for observation i
11.3
Least Squares
Coefficient Estimators
Computer Computation of
Regression Coefficients
Interpretation of the
Slope and the Intercept
Simple Linear Regression
Example
Sample Data for
House Price Model
House Price in $1000s (Y)    Square Feet (X)
245                          1400
312                          1600
279                          1700
308                          1875
199                          1100
219                          1550
405                          2350
324                          2450
319                          1425
255                          1700
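The least squares estimators themselves did not survive on the slides above, so the following is only a sketch under an assumption: it uses the standard closed-form estimators, b1 = Σ(xi - x̄)(yi - ȳ) / Σ(xi - x̄)² and b0 = ȳ - b1·x̄, applied to the ten houses in the table. The function name `least_squares` is illustrative.

```python
# Sample data from the house price table: price in $1000s (y), square feet (x).
prices = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]


def least_squares(x, y):
    """Return (b0, b1), the intercept and slope that minimize the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross products and sum of squares of x around the means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = least_squares(sqft, prices)
print(f"intercept b0 = {b0:.5f}, slope b1 = {b1:.5f}")
```

The printed values should agree, to the digits shown, with the Intercept and Square Feet coefficients that Excel reports for this data set.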
Graphical Presentation
Regression Using Excel
■ Excel will be used to generate the coefficients and
measures of goodness of fit for regression
■ Data / Data Analysis / Regression
Excel Output
Excel Output
(continued)
Regression Statistics
Multiple R           0.76211
R Square             0.58082
Adjusted R Square    0.52842
Standard Error       41.33032
Observations         10

ANOVA
              df    SS            MS            F          Significance F
Regression    1     18934.9348    18934.9348    11.0848    0.01039
Residual      8     13665.5652    1708.1957
Total         9     32600.5000

The regression equation is:
house price = 98.24833 + 0.10977 (square feet)
Graphical Presentation
■ House price model: scatter plot and
regression line
[Figure: scatter plot of house price against square feet with the fitted regression line; slope = 0.10977, intercept = 98.248]
Interpretation of the
Intercept, b0
Interpretation of the
Slope Coefficient, b1
11.4
Explanatory Power of a
Linear Regression Equation
where:
$\bar{y}$ = Average value of the dependent variable
$y_i$ = Observed values of the dependent variable
$\hat{y}_i$ = Predicted value of y for the given $x_i$ value
Analysis of Variance
Analysis of Variance
(continued)
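$SST = \sum (y_i - \bar{y})^2$
$SSE = \sum (y_i - \hat{y}_i)^2$ (unexplained variation)
$SSR = \sum (\hat{y}_i - \bar{y})^2$ (explained variation)
[Figure: for a given xi, the distance from the observed yi to the mean of y is split at the fitted value into unexplained and explained variation]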
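As a rough check on the three sums of squares, the sketch below recomputes them for the house price sample. It assumes the standard least squares estimates for b0 and b1 (the estimator formulas are not shown on the slides), and the variable names are illustrative.

```python
# House price sample: price in $1000s (y), square feet (x).
prices = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

n = len(prices)
x_bar = sum(sqft) / n
y_bar = sum(prices) / n

# Least squares slope and intercept (standard closed form, assumed here).
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, prices)) / sum(
    (x - x_bar) ** 2 for x in sqft
)
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * x for x in sqft]

sst = sum((y - y_bar) ** 2 for y in prices)               # total variation
sse = sum((y - f) ** 2 for y, f in zip(prices, fitted))   # unexplained variation
ssr = sum((f - y_bar) ** 2 for f in fitted)               # explained variation
r_squared = ssr / sst

print(f"SST = {sst:.4f}, SSR = {ssr:.4f}, SSE = {sse:.4f}")
print(f"SSR + SSE = {ssr + sse:.4f}, R^2 = {r_squared:.5f}")
```

The three sums should match the SS column of the ANOVA table in the Excel output, with SSR + SSE reproducing SST up to floating-point rounding.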
Coefficient of Determination, R²
■ The coefficient of determination is the proportion
of the total variation in the dependent variable
that is explained by variation in the
independent variable
■ The coefficient of determination is also called
R-squared and is denoted as R²
$R^2 = \frac{SSR}{SST}$
note: $0 \le R^2 \le 1$
Examples of Approximate r² Values
■ r² = 1
[Figure: plots of Y against X with r² = 1]
■ 0 < r² < 1
[Figure: plot of Y against X with 0 < r² < 1]
■ r² = 0
No linear relationship between X and Y
[Figure: plot of Y against X with r² = 0]
Excel Output
Regression Statistics
R Square    0.58082
58.08% of the variation in house prices is explained by
variation in square feet
Correlation and R²
Estimation of Model
Error Variance
Excel Output
Regression Statistics
Standard Error    41.33032
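The formula for the model error variance is missing from the slide, so the following is a sketch that assumes the usual estimator for simple regression, se² = SSE / (n - 2). It takes SSE and n from the Excel ANOVA table and should reproduce both the residual mean square and the reported Standard Error.

```python
import math

# Values read from the Excel output for the house price model.
sse = 13665.5652   # residual sum of squares
n = 10             # number of observations

# Assumed estimator: divide SSE by the residual degrees of freedom, n - 2.
s_e_squared = sse / (n - 2)
s_e = math.sqrt(s_e_squared)

print(f"s_e^2 = {s_e_squared:.4f}, s_e = {s_e:.5f}")
```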
Comparing Standard Errors
se is a measure of the variation of observed y
values from the regression line
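[Figure: two plots of Y against X comparing the scatter of observed y values around the regression line]
where:
$s_{b_1}$ = Estimate of the standard error of the least squares slope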
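The equation that this "where" clause belongs to was lost in extraction. As a hedged illustration, the sketch below assumes the standard expression for the standard error of the slope, sb1 = se / sqrt(Σ(xi - x̄)²), and evaluates it with the square footage data and the Standard Error from the Excel output.

```python
import math

# Square footage (x) for the ten houses in the sample.
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
s_e = 41.33032  # standard error of the estimate, from the Excel output

x_bar = sum(sqft) / len(sqft)
s_xx = sum((x - x_bar) ** 2 for x in sqft)  # sum of squared deviations of x

# Assumed formula: more spread in x gives a smaller standard error for the slope.
s_b1 = s_e / math.sqrt(s_xx)
print(f"s_b1 = {s_b1:.5f}")
```

The result should agree with the Standard Error that Excel lists for the Square Feet coefficient (0.03297).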
Comparing Standard Errors of
the Slope
[Figure: two plots of Y against X comparing standard errors of the slope]
Inference about the Slope:
t Test
■ t test for a population slope
■ Is there a linear relationship between X and Y?
■ Null and alternative hypotheses
H0: β1 = 0 (no linear relationship)
H1: β1 ≠ 0 (linear relationship does exist)
■ Test statistic
$t = \frac{b_1 - \beta_1}{s_{b_1}}$, with d.f. = n - 2
where:
b1 = regression slope coefficient
β1 = hypothesized slope
sb1 = standard error of the slope
Inference about the Slope:
t Test
(continued)
Inferences about the Slope:
t Test Example
Inferences about the Slope:
t Test Example
(continued)
H0: β1 = 0
H1: β1 ≠ 0
From Excel output:
               Coefficients    Standard Error    t Stat     P-value
Intercept      98.24833        58.03348          1.69296    0.12892
Square Feet    0.10977         0.03297           3.32938    0.01039
Test Statistic: t = 3.329 (b1 = 0.10977, sb1 = 0.03297)
d.f. = n - 2 = 10 - 2 = 8
α/2 = .025 in each tail, t8,.025 = 2.3060
Decision: Reject H0, since t = 3.329 > 2.3060
Conclusion: There is sufficient evidence that house size affects selling price
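The arithmetic of this example can be sketched in a few lines. The inputs are the slope, its standard error, and the critical value t8,.025 = 2.3060 as they appear on the slide; the variable names are illustrative, and the critical value is hard-coded rather than computed.

```python
# Inputs taken from the Excel output and the slide.
b1 = 0.10977        # estimated slope (Square Feet coefficient)
s_b1 = 0.03297      # standard error of the slope
beta1_null = 0.0    # hypothesized slope under H0
n = 10
t_crit = 2.3060     # t(8, .025) as given on the slide

df = n - 2
t_stat = (b1 - beta1_null) / s_b1

# Two-tailed test: reject H0 when |t| exceeds the critical value.
reject_h0 = abs(t_stat) > t_crit
print(f"d.f. = {df}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```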
Confidence Interval Estimate
for the Slope
(continued)
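The interval formula did not survive on this slide. The sketch below assumes the standard form b1 ± t(n-2, α/2) · sb1 and plugs in the slope, standard error, and critical value from the t test example, giving an approximate 95% interval for the population slope.

```python
# Inputs taken from the Excel output and the t test example.
b1 = 0.10977      # estimated slope
s_b1 = 0.03297    # standard error of the slope
t_crit = 2.3060   # t(8, .025)

# Assumed interval form: estimate plus or minus critical value times standard error.
margin = t_crit * s_b1
lower, upper = b1 - margin, b1 + margin
print(f"95% CI for the slope: ({lower:.4f}, {upper:.4f})")

# The interval excludes 0, consistent with rejecting H0: beta1 = 0 at the 5% level.
print("interval excludes zero:", lower > 0 or upper < 0)
```

Because price is in $1000s, an interval of this kind reads as the change in price, in thousands of dollars, per additional square foot.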
Hypothesis Test for Population
Slope Using the F Distribution
■ F Test statistic:
$F = \frac{MSR}{MSE}$
where
$MSR = \frac{SSR}{1}$ and $MSE = \frac{SSE}{n - 2}$
Hypothesis Test for Population
Slope Using the F Distribution
(continued)
Excel Output
ANOVA
              df    SS            MS            F          Significance F
Regression    1     18934.9348    18934.9348    11.0848    0.01039
Residual      8     13665.5652    1708.1957
Total         9     32600.5000
F = 11.0848, with 1 and 8 degrees of freedom
Significance F = 0.01039 is the p-value for the F test
F-Test for Significance
(continued)
H0: β1 = 0
H1: β1 ≠ 0
α = .05
df1 = 1, df2 = 8
Critical Value: F1,8,0.05 = 5.32
Test Statistic: F = 11.0848
Decision: Reject H0 at α = 0.05
Conclusion: There is sufficient evidence that
house size affects selling price
[Figure: F distribution with the rejection region of area α = .05 to the right of F.05 = 5.32]
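A short sketch of the F test arithmetic, using the sums of squares from the ANOVA table and the critical value 5.32 from the slide. It assumes F = MSR / MSE with 1 and n - 2 degrees of freedom, and also checks the well-known identity that, in simple regression, F equals the square of the slope's t statistic.

```python
# Values taken from the Excel ANOVA table and coefficient table.
ssr = 18934.9348   # regression sum of squares
sse = 13665.5652   # residual sum of squares
n = 10
f_crit = 5.32      # F(1, 8, 0.05) as given on the slide
t_stat = 3.32938   # t Stat for the Square Feet coefficient

msr = ssr / 1          # one independent variable
mse = sse / (n - 2)    # residual degrees of freedom
f_stat = msr / mse

reject_h0 = f_stat > f_crit
print(f"F = {f_stat:.4f}, reject H0: {reject_h0}")
print(f"t^2 = {t_stat ** 2:.4f}")  # should be close to F
```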
11.6
Prediction
Predictions Using
Regression Analysis
Predict the price for a house
with 2000 square feet:
house price = 98.24833 + 0.10977 (2000) = 317.79 (in $1000s)
Risky to try to extrapolate far beyond the range
of observed x values
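The prediction and the extrapolation warning can be combined in a small sketch. The coefficients come from the Excel output; the helper names `predict_price` and `within_observed_range` are hypothetical and serve only to illustrate checking whether a requested x lies inside the sampled range before trusting the prediction.

```python
# Coefficients from the Excel output for the house price model.
B0 = 98.24833   # intercept, in $1000s
B1 = 0.10977    # slope, in $1000s per square foot

# Square footage values observed in the sample.
SQFT_OBSERVED = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]


def predict_price(square_feet):
    """Predicted house price in $1000s from the fitted line."""
    return B0 + B1 * square_feet


def within_observed_range(square_feet):
    """True if the input lies inside the range of x values used to fit the line."""
    return min(SQFT_OBSERVED) <= square_feet <= max(SQFT_OBSERVED)


for size in (2000, 5000):
    note = "" if within_observed_range(size) else " (extrapolation, treat with caution)"
    print(f"{size} sq ft: predicted price {predict_price(size):.2f} ($1000s){note}")
```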
Estimating Mean Values and
Predicting Individual Values
Goal: Form intervals around y to express
uncertainty about the value of y for a given xi
[Figure: regression line ŷ = b0 + b1xi, showing for a given xi the confidence interval for the expected value of y and the prediction interval for a single observed y]
Confidence Interval for
the Average Y, Given X
Confidence interval estimate for the
expected value of y given a particular xi
Prediction Interval for
an Individual Y, Given X
Confidence interval estimate for an actual
observed value of y given a particular xi
Example: Confidence Interval for
the Average Y, Given X
Confidence Interval Estimate for E(Yn+1|Xn+1)
Example: Prediction Interval for
an Individual Y, Given X
Confidence Interval Estimate for $\hat{y}_{n+1}$
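The interval formulas for these two examples are not preserved in the slides. As a sketch, the code below assumes the standard forms for simple regression: half-width t · se · sqrt(1/n + (x - x̄)²/Σ(xi - x̄)²) for the mean of y, and the same expression with an extra 1 under the square root for an individual y. It uses x = 2000 square feet and the critical value t8,.025 = 2.3060 from the t test example.

```python
import math

# House price sample: price in $1000s (y), square feet (x).
prices = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

n = len(sqft)
x_bar = sum(sqft) / n
y_bar = sum(prices) / n
s_xx = sum((x - x_bar) ** 2 for x in sqft)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, prices))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate from the residuals.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(sqft, prices))
s_e = math.sqrt(sse / (n - 2))

T_CRIT = 2.3060   # t(8, .025), taken from the t test example
x_new = 2000
y_hat = b0 + b1 * x_new

# Assumed variance factor: grows as x_new moves away from the mean of x.
factor = 1 / n + (x_new - x_bar) ** 2 / s_xx
ci_half_width = T_CRIT * s_e * math.sqrt(factor)       # mean of y at x_new
pi_half_width = T_CRIT * s_e * math.sqrt(1 + factor)   # single new y at x_new

print(f"predicted price: {y_hat:.2f} ($1000s)")
print(f"95% CI for the average price:    +/- {ci_half_width:.2f}")
print(f"95% PI for an individual price:  +/- {pi_half_width:.2f}")
```

The prediction interval is always the wider of the two, since it adds the variability of a single observation to the uncertainty in the estimated line.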
11.7
Correlation Analysis
■ Correlation analysis is used to measure the
strength of the association (linear relationship)
between two variables
■ Correlation is only concerned with strength of the
relationship
■ No causal effect is implied with correlation
■ Correlation was first presented in Chapter 4
Correlation Analysis
■ The population correlation coefficient is
denoted ρ (the Greek letter rho)
■ The sample correlation coefficient is
$r = \frac{s_{xy}}{s_x s_y}$
where
$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$
Test for Zero
Population Correlation
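Decision Rules: Hypothesis Test for Correlation
[Figure: rejection regions of area α for the one-tail tests and of area α/2 in each tail for the two-tail test]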
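The test statistic for zero population correlation is not shown on the slides. The sketch below assumes the standard statistic t = r · sqrt(n - 2) / sqrt(1 - r²) with n - 2 degrees of freedom, applies it to the house price sample as an illustration, and compares it with the two-tail critical value t8,.025 = 2.3060 used earlier in the chapter.

```python
import math

# House price sample: price in $1000s (y), square feet (x).
prices = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

n = len(sqft)
x_bar = sum(sqft) / n
y_bar = sum(prices) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, prices))
s_xx = sum((x - x_bar) ** 2 for x in sqft)
s_yy = sum((y - y_bar) ** 2 for y in prices)

# Sample correlation; the (n - 1) divisors cancel, so raw sums can be used.
r = s_xy / math.sqrt(s_xx * s_yy)

# Assumed test statistic for H0: rho = 0, with n - 2 degrees of freedom.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
T_CRIT = 2.3060   # t(8, .025), two-tail test at alpha = .05

print(f"r = {r:.5f}, t = {t_stat:.5f}, reject H0: {abs(t_stat) > T_CRIT}")
```

For this sample, r should match the Multiple R value in the Excel output, and the t statistic should coincide with the t Stat for the slope, since the two tests are equivalent in simple regression.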
11.8
Beta Measure of Financial Risk
Beta Coefficient Example
■ Slope coefficient is the Beta Coefficient
Information about the quality of the regression
model that provides the estimate of beta
11.9
Graphical Analysis
Chapter Summary
■ Introduced the linear regression model
■ Reviewed correlation and the assumptions of
linear regression
■ Discussed estimating the simple linear
regression coefficients
■ Described measures of variation
■ Described inference about the slope
■ Addressed estimation of mean values and
prediction of individual values
All rights reserved. No part of this publication may be reproduced, stored in a retrieval
system, or transmitted, in any form or by any means, electronic, mechanical, photocopying,
recording, or otherwise, without the prior written permission of the publisher.
Printed in the United States of America.