Chapter 8: Regression Wisdom
• Patterns on residual plots (p. 277)
• Example: Population (in millions) of a country for 2000-2005 (years recorded as 0, 1, 2, 3, 4, 5):
Example
• The regression equation is population = 5.19 + 0.686 year
• R-Sq = 93.5%
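As a quick check of what the fitted line implies, the sketch below plugs each coded year into the equation above. The coefficients are the ones on the slide; the function name `predict_population` is made up for illustration.

```python
# Fitted line from the slide: population = 5.19 + 0.686 * year,
# where year is coded 0..5 for 2000..2005.
INTERCEPT = 5.19
SLOPE = 0.686


def predict_population(coded_year):
    """Predicted population in millions for a coded year (0 means 2000)."""
    return INTERCEPT + SLOPE * coded_year


# Map each calendar year to its predicted population.
predictions = {2000 + t: round(predict_population(t), 3) for t in range(6)}
for year, population in predictions.items():
    print(year, population)
```

Read this way, the slope says the model predicts about 0.686 million more people per year, starting from a predicted 5.19 million in 2000.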
Example - continued
• Nonlinearity is more prominent in the residual plot than in the original scatterplot
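A minimal sketch of why a high R-Sq does not rule out curvature. The populations here are hypothetical (5 million growing 10% a year), not the slide's data, and `fit_line` is a hand-rolled least-squares helper written for this example.

```python
def fit_line(xs, ys):
    """Least-squares line; returns (intercept, slope)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


years = [0, 1, 2, 3, 4, 5]
# Hypothetical curved data: 10% growth per year (NOT the slide's data).
population = [5.0 * 1.10 ** t for t in years]

intercept, slope = fit_line(years, population)
residuals = [y - (intercept + slope * t) for t, y in zip(years, population)]

# R-squared = 1 - (residual sum of squares) / (total sum of squares)
y_bar = sum(population) / len(population)
ss_res = sum(e ** 2 for e in residuals)
ss_tot = sum((y - y_bar) ** 2 for y in population)
r_squared = 1 - ss_res / ss_tot

# The sign pattern of the residuals reveals the bend.
signs = "".join("+" if e > 0 else "-" for e in residuals)
print(f"R-Sq = {r_squared:.1%}, residual signs: {signs}")
```

With these made-up numbers R-Sq is above 99%, yet the residuals are positive at both ends and negative in the middle, which is the curved pattern a residual plot makes obvious.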
Sifting Residuals for Groups
• No regression analysis is complete without a display of the residuals to check that the linear model is reasonable.
Example: Regression Analysis: Self-Esteem versus Age
• It is a good idea to look at both a histogram of the residuals and a scatterplot of the residuals vs. predicted values:
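The two displays can be sketched in plain text. The (age, score) pairs below are hypothetical, built so that two groups share a slope but sit 2 points apart; they are not the self-esteem data from the slide, and `fit_line` is a helper written for this example.

```python
from collections import Counter


def fit_line(xs, ys):
    """Least-squares line; returns (intercept, slope)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data: at each age, one observation from each of two groups.
ages, scores = [], []
for age in range(10, 20):
    for offset in (1.0, -1.0):
        ages.append(age)
        scores.append(20.0 - 0.5 * age + offset)

intercept, slope = fit_line(ages, scores)
predicted = [intercept + slope * a for a in ages]
residuals = [y - p for y, p in zip(scores, predicted)]

# Text histogram: bin each residual to the nearest 0.5.
bins = Counter(round(e * 2) / 2 for e in residuals)
for center in sorted(bins):
    print(f"{center:+.1f} | {'#' * bins[center]}")

# Residuals vs. predicted values, listed in order of predicted value.
for p, e in sorted(zip(predicted, residuals)):
    print(f"predicted={p:5.2f}  residual={e:+.2f}")
```

The histogram has two separate bars with nothing near zero, and the listing shows two horizontal bands, which is the kind of pattern that suggests subgroups.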
• The residuals appear to fall into two groups
Real Data
Example
• An examination of residuals often leads us to discover groups of observations that are different from the rest.
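One possible follow-up, sketched under strong assumptions: once the residuals suggest two groups, fit each group separately. Splitting by the sign of the residual is only a crude stand-in used here because the hypothetical data has no group label; in practice one would look for a real variable that explains the groups. The data and `fit_line` are invented for illustration.

```python
def fit_line(xs, ys):
    """Least-squares line; returns (intercept, slope)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data: two groups with the same slope, 2 points apart.
ages, scores = [], []
for age in range(10, 20):
    for offset in (1.0, -1.0):
        ages.append(age)
        scores.append(20.0 - 0.5 * age + offset)

intercept, slope = fit_line(ages, scores)
residuals = [y - (intercept + slope * a) for a, y in zip(ages, scores)]

# Crude split: observations above the line vs. on or below it.
upper = [(a, y) for a, y, e in zip(ages, scores, residuals) if e > 0]
lower = [(a, y) for a, y, e in zip(ages, scores, residuals) if e <= 0]

upper_fit = fit_line(*zip(*upper))
lower_fit = fit_line(*zip(*lower))

print("all data   :", round(intercept, 2), round(slope, 2))
print("upper group:", round(upper_fit[0], 2), round(upper_fit[1], 2))
print("lower group:", round(lower_fit[0], 2), round(lower_fit[1], 2))
```

Here both groups keep the same slope but get different intercepts, so a single line describes neither group well.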
Outliers, Leverage, and Influence
High Leverage point
Examples
• A data point that has an x-value far from the mean of the x-values is called a high leverage point.
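For a regression with a single predictor, "far from the mean" can be put on a numeric scale: the leverage of point i is h_i = 1/n + (x_i - x̄)² / Σ(x_j - x̄)². The sketch below applies that formula to hypothetical x-values; the function name `leverages` is an assumption of this example.

```python
def leverages(xs):
    """Leverage of each point in a one-predictor least-squares fit."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return [1 / n + (x - x_bar) ** 2 / sxx for x in xs]


# Hypothetical x-values: five clustered points and one far to the right.
xs = [1, 2, 3, 4, 5, 20]
h = leverages(xs)

for x, h_i in zip(xs, h):
    print(f"x={x:2d}  leverage={h_i:.3f}")
```

The leverages always add up to 2 in this setting (one for the intercept, one for the slope), so a single point with leverage close to 1 is carrying almost half of the fit by itself.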
Influential observations
Example
• A data point is influential if omitting it from the analysis gives a very different model (p. 284).
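The definition suggests a direct check: fit the line with and without the point and compare. A minimal sketch with hypothetical data, where the last point has an extreme x-value and lies far from the pattern of the others; `fit_line` is a helper written for this example.

```python
def fit_line(xs, ys):
    """Least-squares line; returns (intercept, slope)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data: the first five points follow y = 2x exactly;
# the sixth has a far-out x and a y well below that line.
xs = [1, 2, 3, 4, 5, 20]
ys = [2, 4, 6, 8, 10, 10]

intercept_with, slope_with = fit_line(xs, ys)
intercept_without, slope_without = fit_line(xs[:-1], ys[:-1])

print(f"with the point   : slope = {slope_with:.3f}")
print(f"without the point: slope = {slope_without:.3f}")
```

Dropping the one point moves the slope from roughly 0.3 to 2, so by the definition above it is influential.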
Example (A high leverage point that is not influential)
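A sketch of this case with hypothetical numbers: the far-out point has an extreme x-value, but its y-value falls close to where the other points' line would put it, so removing it barely changes the fit. `fit_line` is a helper written for this example.

```python
def fit_line(xs, ys):
    """Least-squares line; returns (intercept, slope)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data: five points scattered around y = 2x,
# plus one far-out point (x = 20) that follows the same pattern.
xs = [1, 2, 3, 4, 5, 20]
ys = [2.1, 3.9, 6.2, 7.8, 10.0, 40.0]

intercept_with, slope_with = fit_line(xs, ys)
intercept_without, slope_without = fit_line(xs[:-1], ys[:-1])

print(f"with the point   : slope = {slope_with:.3f}")
print(f"without the point: slope = {slope_without:.3f}")
```

Both slopes come out close to 2, so the point has high leverage without being influential.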
Restricted-range problem
Restricted Range
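As a sketch of the restricted-range problem, assuming its usual meaning: when only a narrow slice of the x-values is observed, the correlation tends to look weaker than it is over the full range. The data and the helper `correlation` below are hypothetical.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation coefficient r."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: y rises with x, plus a fixed +3 / -3 scatter.
xs = list(range(20))
ys = [x + (3 if x % 2 == 0 else -3) for x in xs]

r_full = correlation(xs, ys)

# Keep only a narrow slice of the x-range (8 <= x <= 11).
narrow = [(x, y) for x, y in zip(xs, ys) if 8 <= x <= 11]
r_narrow = correlation(*zip(*narrow))

print(f"full range  : r = {r_full:.2f}")
print(f"narrow range: r = {r_narrow:.2f}")
```

The scatter around the trend is the same in both cases; what changes is how much of the trend is visible within the observed x-values.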
Working with summary statistics
• There is a strong, positive, linear association between weight (in pounds) and height (in inches) for men
Working with summary statistics
• If instead of data on individuals we only had the mean weight for each height value, we would see an even stronger association
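A minimal sketch of why means look more strongly associated than individuals: averaging within each height removes the person-to-person scatter. The heights and weights are hypothetical (three men per height, spread 8 pounds either side of a made-up trend), and `correlation` is a helper written for this example.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation coefficient r."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical individuals: three men at each height from 64 to 72 inches.
heights, weights = [], []
for h in range(64, 73):
    for offset in (-8, 0, 8):
        heights.append(h)
        weights.append(4 * h - 100 + offset)  # made-up trend plus scatter

r_individuals = correlation(heights, weights)

# Summary statistics: one mean weight per height value.
by_height = {}
for h, w in zip(heights, weights):
    by_height.setdefault(h, []).append(w)
mean_heights = sorted(by_height)
mean_weights = [sum(by_height[h]) / len(by_height[h]) for h in mean_heights]

r_means = correlation(mean_heights, mean_weights)

print(f"individuals: r = {r_individuals:.3f}")
print(f"means      : r = {r_means:.3f}")
```

The correlation among the means overstates how well height predicts the weight of any one man, because the variation between individuals of the same height has been averaged away.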