Polynomial Regression
Polynomials, such as the quadratic model y = β0 + β1 x + β2 x² + ε, are widely used in regression when the relationship between the predictor(s) and the response is not linear but curvilinear.
Polynomial Models in One Variable
Definition: The regression model

y = β0 + β1 x + β2 x² + · · · + βk x^k + ε

is called a polynomial regression model of order (degree) k in one variable.
[Figure: scatterplot of tensile paper strength (psi) against hardwood concentration.]
Without centering the predictor variable, the correlation between x and x² is unacceptably high (0.97). Based on the scatterplot, a quadratic model seems like a promising fit:
y = β0 + β1 (x − x̄) + β2 (x − x̄)² + ε
The model was fitted with R; its summary statistics are discussed below.
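A minimal sketch of how such a centered quadratic fit could be produced in R, assuming the hardwood concentration and tensile strength are stored in vectors named conc and strength (hypothetical names, not necessarily those used in class):

    conc.c   <- conc - mean(conc)                  # center the predictor
    quad.fit <- lm(strength ~ conc.c + I(conc.c^2))
    summary(quad.fit)                              # coefficient table and R^2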
Both the linear and quadratic coefficients (β1 and β2) are significant, and the model R² is 0.9085. Compare this to the R² value of 0.3054 for the corresponding simple linear regression model (output not shown).
A plot of the Studentized residuals against the fitted values ŷi reveals no outliers
and no (strong) patterns. A qq-plot of the standardized residuals shows that the
distribution of the residuals is not perfectly Normal.
[Plots: Studentized residuals against the fitted values (left) and normal QQ-plot of the standardized residuals (right) for the quadratic model.]
In this case it is pretty obvious that the quadratic term is a meaningful addition to
the model. But we could also test “by-hand” the hypothesis
H0 : β2 = 0    vs.    Ha : β2 ≠ 0

For the reduced (linear) model

y = β0 + β1 (x − x̄) + ε

the sum of squares of regression is SSR(β1 | β0) = 1043.4. Use the output from the previous page to compute the sum of squares of regression for the quadratic model and to conduct the F-test for the hypotheses above.
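One way to carry out this extra-sum-of-squares F-test in R is to compare the nested fits with anova(); the sketch below reuses the hypothetical conc.c, strength, and quad.fit objects from the earlier sketch.

    lin.fit <- lm(strength ~ conc.c)   # reduced model: linear in the centered predictor
    anova(lin.fit, quad.fit)           # extra SS and partial F-test of H0: beta2 = 0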
Splines are piecewise polynomials joined at knots t1 < t2 < · · · < th; the pieces are built from truncated power terms of the form (x − ti)+ raised to suitable powers, where

(x − ti)+ = (x − ti)  if x − ti > 0,    and    (x − ti)+ = 0  if x − ti ≤ 0.
If the positions of the knots in a spline are known, then fitting a spline function to the data reduces to a linear regression problem. If the knot positions are not known, the problem becomes more complicated: it is not easy to decide how many knots to use and where they should be placed. In general, each piece of the spline should be kept as simple as possible to avoid over-fitting the data.
A special case of spline functions is the piecewise-linear function. As with splines in general, we can require piecewise-linear functions to be continuous, or we can allow discontinuities at the knots.
Example: Consider a (not necessarily continuous) piecewise linear function with a
single knot at t:
S(x) = β00 + β01 x + β10 (x − t)^0_+ + β11 (x − t)^1_+
If x ≤ t (before the knot), the function is the line y = β00 + β01 x. If x > t (after the
knot) the function is
y = β00 + β01 x + β10 + β11 (x − t) = (β00 + β10 − β11 t) + (β01 + β11 )x
Notice that β10 is the height of the vertical "jump" at the knot x = t. If we require the piecewise-linear function to be continuous, then β10 must equal zero, and the model becomes

S(x) = β00 + β01 x + β11 (x − t)^1_+
[Figure: a non-continuous linear spline (left) and a continuous linear spline (right); the slope is β01 before the knot t and β01 + β11 after it, β10 is the jump at the knot, and the extended right-hand piece has intercept β00 − β11 t.]
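Because the continuous linear spline with a known knot is linear in its coefficients, it can be fit by ordinary least squares. The sketch below assumes data vectors x and y and a knot position t (all hypothetical); pmax(x - t, 0) implements the plus function (x − t)+.

    t <- 6                                   # assumed (known) knot position
    spline.fit <- lm(y ~ x + pmax(x - t, 0))
    summary(spline.fit)
    # Coefficients: intercept = beta00, slope before the knot = beta01,
    # change in slope after the knot = beta11.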
So far, we have seen models that use polynomials or lines (polynomials of degree
one) that are pieced together at knots. Alternatively, other functions can be consid-
ered. If the scatterplot shows some periodic behavior, then trigonometric functions
(sine, cosine) may be reasonable to include in the model. The trigonometric terms
can have varying amplitudes and frequencies.
Nonparametric Regression
So far, we have discussed linear regression and polynomial regression (and some
more exotic variants of regression). What all these models have in common is that
they specify a functional relationship (line, plane, parabolic surface etc.) between
the predictors and the response. So far, we (the users) have always been the ones making the ultimate decision about which model to use. There is an alternative: we could decline to specify a model and instead allow the data to "pick" its own model.
In nonparametric regression, the regression function does not take any predetermined shape but is derived entirely from the data. The conventional parametrized regression model is
yi = f(xi, β) + εi
where we specify the general class of the function f (linear, quadratic, etc.) and
use the data to estimate the function parameters β. The nonparametric regression
model is similar
yi = f(xi) + εi
but now our (more ambitious) goal is to estimate the function f itself. Of course,
without any restrictions, there are uncountably infinitely many choices. So we’ll
restrict the problem a bit.
Recall that in ordinary linear least squares regression, the predicted values ŷ can
be written as linear combinations of the observed values y. The coefficients in the
linear combinations are determined by the entries of the hat-matrix.
ŷ = Hy
Most nonparametric regression models also model the predicted values as linear
combinations of the observations, but with different weights.
Kernel Regression
As in the OLS case, the predicted observations are computed as weighted sums of
the actual observations. Let ỹi be the kernel smoother estimate for observation yi .
Then

ỹi = Σ_{j=1}^{n} wij yj,

where the weights wij sum to one over j. Alternatively, we could write ỹ = S y, where S = [wij].
S is called the smoothing matrix. The weights are typically chosen such that wij = 0
for all points yj outside of a specified “neighborhood” of the point yi . The width
of this neighborhood is sometimes also called the bandwidth of the kernel. The larger the bandwidth, the smoother the estimated function becomes.
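As an illustration, the weights and the smoothing matrix S could be computed as below for a Gaussian kernel. This is a sketch of one common weighting scheme (Nadaraya–Watson style weights), not necessarily the exact one used in class; x, y, and the bandwidth h are hypothetical.

    smoother.matrix <- function(x, h) {
      # raw weights K((x_i - x_j)/h), then rescale each row to sum to one
      W <- outer(x, x, function(xi, xj) dnorm((xi - xj) / h))
      sweep(W, 1, rowSums(W), "/")
    }

    S       <- smoother.matrix(x, h = 1)
    y.tilde <- S %*% y            # smoothed values, y~ = S y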
There are many possible kernel functions. They need to satisfy the following properties:

K(t) ≥ 0 for all t,     ∫ K(t) dt = 1,     K(−t) = K(t).

Note that these are the properties of symmetric probability density functions, which means that, for instance, the Normal distribution (Gaussian kernel), the triangular distribution (triangular kernel), and the uniform distribution (uniform or box kernel) all make good kernel functions.
Gaussian kernel:       K(t) = (1/√(2π)) exp(−t²/2)

Triangular kernel:     K(t) = 1 − |t| if |t| ≤ 1,    K(t) = 0 if |t| > 1

Uniform (box) kernel:  K(t) = 0.5 if |t| ≤ 1,        K(t) = 0 if |t| > 1
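Written as R functions, these three kernels might look like the sketch below; plotting them produces a picture much like the figure that follows.

    gaussian.kernel   <- function(t) exp(-t^2 / 2) / sqrt(2 * pi)
    triangular.kernel <- function(t) ifelse(abs(t) <= 1, 1 - abs(t), 0)
    uniform.kernel    <- function(t) ifelse(abs(t) <= 1, 0.5, 0)

    curve(gaussian.kernel(x),   from = -3, to = 3, ylim = c(0, 1), ylab = "K(x)")
    curve(triangular.kernel(x), add = TRUE, lty = 2)
    curve(uniform.kernel(x),    add = TRUE, lty = 3)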
[Figure: the rectangular (uniform), triangular, and Gaussian kernel functions K(x), plotted for −3 ≤ x ≤ 3.]
Example: Suppose you want to fit a nonparametric regression curve to the following
data:
x:  1  2  4  5  7
y:  1  4  2  3  1
The R function ksmooth() computes a kernel smoother (the available kernels are the Gaussian and the uniform/box kernel) with a chosen bandwidth. The results for three different bandwidths (1, 2, and 3) are shown below.
[Figure: kernel smoother fits to the example data for different bandwidths.]
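A sketch of ksmooth() calls that would produce fits of this kind for the small data set above; the pairing of bandwidths with particular panels or line types in the original figure is an assumption.

    x <- c(1, 2, 4, 5, 7)
    y <- c(1, 4, 2, 3, 1)

    plot(x, y)
    lines(ksmooth(x, y, kernel = "normal", bandwidth = 1), lty = 1)
    lines(ksmooth(x, y, kernel = "normal", bandwidth = 2), lty = 2)
    lines(ksmooth(x, y, kernel = "normal", bandwidth = 3), lty = 3)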
Note: The regression “function” in this case is a sequence of points for which we
have computed predictions. There is no algebraic function that is estimated.
For large sample sizes, the bandwidth of the kernel to be used becomes much more
important than the choice of the kernel function.
[Figure: Windmill data, DC current plotted against wind velocity, with two LOESS smoothers overlaid.]
The red curve corresponds to a LOESS smoother that uses 50% of the data points, while the green curve corresponds to a LOESS smoother that uses 75% of the points. For a LOESS smoother that uses 100% of the points, the results would be similar (but not identical) to a parametric regression with a polynomial function in wind velocity. The difference comes from the weighting by inverse distance.
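A hedged sketch of how the two LOESS fits could be reproduced with loess(), assuming the windmill data sit in a data frame named windmill with columns velocity and DC (names are assumptions); span controls the fraction of points used in each local fit.

    fit50 <- loess(DC ~ velocity, data = windmill, span = 0.50)
    fit75 <- loess(DC ~ velocity, data = windmill, span = 0.75)

    plot(windmill$velocity, windmill$DC,
         xlab = "wind velocity", ylab = "DC current")
    v.grid <- seq(min(windmill$velocity), max(windmill$velocity), length.out = 100)
    lines(v.grid, predict(fit50, data.frame(velocity = v.grid)), col = "red")
    lines(v.grid, predict(fit75, data.frame(velocity = v.grid)), col = "green")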
Note: As we have seen so far, nonparametric models can sometimes be quite similar
to parametric models and sometimes they can be quite different. Which model to
prefer depends on the specific situation. If there is a good explanation for a spe-
cific parametric model that comes from the experimental context, then this model
is usually preferred if it provides a reasonable fit.
If no reasonable parametric model provides an adequate fit to the data, then non-
parametric models can provide viable alternatives.
Polynomial Models in Two or More Variables
Recall that for an ordinary least squares regression model, the model "surface" of the additive model

y = β0 + β1 x1 + β2 x2 + ε

is a plane in space, while the model surface of the OLS model with interaction,

y = β0 + β1 x1 + β2 x2 + β12 x1 x2 + ε,

is a curved surface.
The quadratic (second-order) regression model in two variables is an extension of this idea:

y = β0 + β1 x1 + β2 x2 + β11 x1² + β22 x2² + β12 x1 x2 + ε

Recall that, depending on the values of the β's, the response surfaces can take many different shapes.
[Figure: four example response surfaces generated by the second-order model for different choices of the β coefficients.]
Four observations are collected at the center of the square, (x1, x2) = (0, 0), and one observation each at the corners of the square and at the axial runs (±√2, 0) and (0, ±√2). The data for this experiment can be found in the file "ChemicalProcess.txt" on the course website.
We will use R to fit a second order model for the response y as a function of the
coded variables x1 and x2 .
y = β0 + β1 x1 + β2 x2 + β11 x1² + β22 x2² + β12 x1 x2 + ε
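A sketch of the corresponding R call, assuming the file has been read into a data frame chem whose columns are named y, x1, and x2 (the column names are assumptions):

    chem <- read.table("ChemicalProcess.txt", header = TRUE)
    second.order <- lm(y ~ x1 + x2 + I(x1^2) + I(x2^2) + x1:x2, data = chem)
    summary(second.order)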
The coefficient table in R tells us that each component of the model (the linear, quadratic, and mixed terms) is significant. Thus, the model cannot easily be simplified. The residuals are approximately normally distributed. The residual plot shows a pattern, but that is to be expected with such a small sample size. Overall, the residuals fall pretty much within the "pure error" range that we can observe from the repeated observations at the center of the design.
[Plots: Studentized residuals and normal QQ-plot for the second-order model.]
The optimal temperature (T) and concentration (C) combination that would lead to the highest expected response value can now be determined by finding the maximum of the fitted response surface (take the partial derivatives, set them equal to zero, and solve the corresponding system of equations). The maximum occurs at approximately 245°C and 20% concentration.
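In matrix notation the fitted surface is ŷ = b0 + x′b + x′Bx, where b holds the estimated linear coefficients and B is the symmetric matrix built from the quadratic and interaction estimates; setting the gradient to zero gives the stationary point x_s = −(1/2) B⁻¹ b. A sketch using the hypothetical second.order fit from above; the result is in coded units, and translating it to °C and % requires the coding of T and C, which is not reproduced here.

    beta <- coef(second.order)
    b <- beta[c("x1", "x2")]
    B <- matrix(c(beta["I(x1^2)"],   beta["x1:x2"] / 2,
                  beta["x1:x2"] / 2, beta["I(x2^2)"]), nrow = 2)
    x.s <- -0.5 * solve(B, b)    # stationary point (x1, x2) in coded units
    x.s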
Orthogonal Polynomials
We have seen previously that, when fitting polynomial regression models in one variable, ill-conditioning is frequent due to the multicollinearity between the predictors x, x², etc. Some of these problems can be fixed by formulating the model in terms of orthogonal polynomials.
Suppose we want to fit a polynomial regression model with one predictor x and degree k:

yi = β0 + β1 xi + β2 xi² + · · · + βk xi^k + εi
Generally, for this model, the columns of the X matrix will be (highly) correlated. Additionally, if another term β_{k+1} x^{k+1} were added to the model, the matrix (X′X)⁻¹ would have to be recomputed and the estimates of the other β parameters would change.
Instead, consider the polynomial model

yi = α0 P0(xi) + α1 P1(xi) + α2 P2(xi) + · · · + αk Pk(xi) + εi,

where Pj(xi) is an orthogonal polynomial of degree j and P0(xi) = 1.
Then the multiple linear regression model becomes y = Xα + ε, where the predictor matrix is

X = [ P0(x1)  P1(x1)  · · ·  Pk(x1)
      P0(x2)  P1(x2)  · · ·  Pk(x2)
        ⋮       ⋮              ⋮
      P0(xn)  P1(xn)  · · ·  Pk(xn) ]
Since the polynomials are orthogonal, the X′X matrix has the following simple diagonal form, which is very easy to invert:

X′X = diag( Σ_{i=1}^{n} P0(xi)²,  Σ_{i=1}^{n} P1(xi)²,  · · · ,  Σ_{i=1}^{n} Pk(xi)² )
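The diagonal form of X′X is easy to verify numerically for the orthogonal polynomials that R's poly() generates; the predictor values below are made up for illustration.

    x <- c(2, 4, 6, 8, 10, 12)             # any numeric predictor values
    P <- cbind(1, poly(x, degree = 2))     # columns P0, P1, P2
    round(crossprod(P), 10)                # X'X is diagonal (up to rounding)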
Note: There are several different methods by which orthogonal polynomials can be
obtained for a given set of data. R uses a different method than the one described
in your text.
If you do not want to fit orthogonal polynomials, you have to include the argument
raw = TRUE inside the poly() function.
You can see which orthogonal polynomials R uses by typing poly(your predictor,
degree = 2). Caution: they are not the polynomials described in the book.
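A sketch contrasting the two versions of the fit, again with hypothetical vectors x and y; the coefficients differ, but the fitted values agree.

    fit.orth <- lm(y ~ poly(x, degree = 2))              # orthogonal polynomials
    fit.raw  <- lm(y ~ poly(x, degree = 2, raw = TRUE))  # raw powers x, x^2

    poly(x, degree = 2)                            # inspect the orthogonal columns
    all.equal(fitted(fit.orth), fitted(fit.raw))   # identical predictions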
R uses orthogonal polynomials for which Σ_{i=1}^{n} Pj(xi)² = 1 for j = 1, 2, and P0(xi) = 1.
Note: Predictions in R are not influenced by which polynomials are chosen for the regression, but the stability of the estimates is. The method that uses orthogonal polynomials is not subject to the numerical problems that occur in the estimation of the model parameters when the data suffer from multicollinearity. Therefore, in general, predictions made with the orthogonal-polynomial method are preferred. But it can be tricky to derive the "raw" model equation from this fit.