1 logit.gam: Generalized Additive Model for Dichotomous Dependent Variables
This function runs a nonparametric Generalized Additive Model (GAM) for
dichotomous dependent variables.

1.1 Syntax
> z.out <- zelig(y ~ x1 + s(x2), model = "logit.gam", data = mydata)
> x.out <- setx(z.out)
> s.out <- sim(z.out, x = x.out)

where s() indicates a variable to be estimated via a nonparametric smooth. All
variables for which s() is not specified are estimated via standard parametric
methods.

1.2 Additional Inputs


In addition to the standard inputs, zelig() takes the following additional op-
tions for GAM models.
• method: Controls the fitting method to be used. Fitting methods are se-
lected via a list environment within method=gam.method(). See gam.method()
for details.

• scale: Generalized Cross Validation (GCV) is used if scale = 0 (see the
“Model” section for details), except for Logit models, where an Un-Biased
Risk Estimator (UBRE) (also see the “Model” section for details) is used
with a scale parameter assumed to be 1. If scale is greater than 1, it is
assumed to be the scale parameter/variance and UBRE is used. If scale
is negative, GCV is used.
• knots: An optional list of knot values to be used for the construction of
basis functions.
• H: The coefficient matrix of a user-supplied fixed quadratic penalty on
the parameters of the GAM. For example, ridge penalties can be added to
the parameters of the GAM to aid in identification on the scale of the
linear predictor.
• sp: A vector of smoothing parameters for each term.
• ...: additional options passed to the logit.gam model. See the mgcv
library for details.

1.3 Examples
1. Basic Example
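Create some binary data:

> set.seed(0); n <- 400; sig <- 2;
> x0 <- runif(n, 0, 1); x1 <- runif(n, 0, 1)
> x2 <- runif(n, 0, 1); x3 <- runif(n, 0, 1)
> f0 <- function(x) 2 * sin(pi * x)
> f1 <- function(x) exp(2 * x)
> f2 <- function(x) 0.2 * x^11 * (10 * (1 - x))^6 + 10 * (10 *
+ x)^3 * (1 - x)^10
> f <- f0(x0) + f1(x1) + f2(x2)
> g <- (f-5)/3
> g <- binomial()$linkinv(g)
> y <- rbinom(n, 1, g)
> my.data <- as.data.frame(cbind(y, x0, x1, x2, x3))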

Estimate the model, summarize the results, and plot nonlinearities:

> z.out <- zelig(y~s(x0)+s(x1)+s(x2)+s(x3), model="logit.gam", data=my.data)


> summary(z.out)
> plot(z.out$result,pages=1,residuals=TRUE)

Note that the plot() function can be used after model estimation and
before simulation to view the nonlinear relationships in the independent
variables:
[Figure: estimated smooth terms s(x0, 1), s(x1, 1.17), s(x2, 4.8), and
s(x3, 1), each plotted against its covariate on the interval 0 to 1.]

Set values for the explanatory variables to their default (mean/mode)
values, then simulate, summarize and plot quantities of interest:
> x.out <- setx(z.out)
> s.out <- sim(z.out, x = x.out)
> summary(s.out)
> plot(s.out)

[Figure: density of the simulated expected values E(Y|X) (N = 1000,
bandwidth = 0.06488) and bar plot of the predicted values Y|X.]

2. Simulating First Differences


Estimate the risk difference (and risk ratio) between low values (20th
percentile) and high values (80th percentile) of the explanatory variable x3
while all the other variables are held at their default (mean/mode) values:

> x.high <- setx(z.out, x3 = quantile(my.data$x3, 0.8))
> x.low <- setx(z.out, x3 = quantile(my.data$x3, 0.2))
> s.out <- sim(z.out, x = x.high, x1 = x.low)
> summary(s.out)
> plot(s.out)

[Figure: expected values E(Y|X) and E(Y|X1), predicted values Y|X and
Y|X1, and the first difference E(Y|X1) − E(Y|X), each based on N = 1000
simulations.]

3. Variations in GAM model specification. Note that setx and sim work as
shown in the above examples for any GAM model. As such, in the interest
of parsimony, I will not re-specify the simulations of quantities of interest.
An extra ridge penalty (useful with convergence problems):
> z.out <- zelig(y~s(x0)+s(x1)+s(x2)+s(x3), H=diag(0.5,37),
+ model="logit.gam", data=my.data)
> summary(z.out)
> plot(z.out$result,pages=1,residuals=TRUE)

Set the smoothing parameter for the first term, estimate the rest:
> z.out <- zelig(y~s(x0)+s(x1)+s(x2)+s(x3),sp=c(0.01,-1,-1,-1),
+ model="logit.gam", data=my.data)
> summary(z.out)
> plot(z.out$result,pages=1)

Set lower bounds on smoothing parameters:
> z.out <- zelig(y~s(x0)+s(x1)+s(x2)+s(x3),min.sp=c(0.001,0.01,0,10),
+ model="logit.gam", data=my.data)
> summary(z.out)
> plot(z.out$result, pages=1)

A GAM with 3df regression spline term & 2 penalized terms:
> z.out <-zelig(y~s(x0,k=4,fx=TRUE,bs="tp")+s(x1,k=12)+s(x2,k=15),
+ model="logit.gam", data=my.data)
> summary(z.out)
> plot(z.out$result,pages=1)
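
The dimension 37 used in H = diag(0.5, 37) in the ridge-penalty example is the total number of model coefficients. The following is a small sketch of that count, assuming mgcv's default basis dimension of k = 10 for each one-dimensional s() term and one coefficient per smooth absorbed by the identifiability constraint. The variable names are illustrative and not part of Zelig or mgcv.

```r
k_default <- 10          # assumed default basis dimension for a 1-D s() term
n_smooths <- 4           # s(x0), s(x1), s(x2), s(x3)
n_param   <- 1           # parametric part: intercept only

# Each smooth contributes k - 1 coefficients after the identifiability constraint
n_coef <- n_param + n_smooths * (k_default - 1)

# Ridge penalty matrix with one row and column per model coefficient
H <- diag(0.5, n_coef)
dim(H)
```

If the formula changes (for example, a different k or an added parametric term), the dimension of H has to change with it.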

1.4 Model
GAM models use families the same way GLM models do: they specify the
distribution and link function to use in model fitting. In the case of logit.gam
a logistic link function is used. Specifically, let Yi be the binary dependent
variable for observation i which takes the value of either 0 or 1.
• The stochastic component is given by the Bernoulli distribution

$$Y_i \sim \text{Bernoulli}(y_i \mid \pi_i) = \pi_i^{y_i} (1 - \pi_i)^{1 - y_i},$$

where $\pi_i = \Pr(Y_i = 1)$.

• The systematic component is given by:

$$\pi_i = \frac{1}{1 + \exp\left[-\left(x_i \beta + \sum_{j=1}^{J} f_j(Z_j)\right)\right]},$$

where $x_i$ is the vector of covariates, $\beta$ is the vector of coefficients, and
$f_j(Z_j)$ for $j = 1, \ldots, J$ is the set of smooth terms.
Generalized additive models (GAMs) are similar in many respects to gener-
alized linear models (GLMs). Specifically, GAMs are generally fit by penalized
maximum likelihood estimation and GAMs have (or can have) a parametric
component identical to that of a GLM. The difference is that GAMs also in-
clude in their linear predictors a specified sum of smooth functions.
In this GAM implementation, smooth functions are represented using
penalized regression splines. Two techniques may be used to estimate smoothing
parameters: Generalized Cross Validation (GCV),

$$\frac{n D}{(n - DF)^2}, \qquad (1)$$

or an Un-Biased Risk Estimator (UBRE) (which is effectively just a rescaled
AIC),

$$\frac{D}{n} + 2s\,\frac{DF}{n} - s, \qquad (2)$$

where D is the deviance, n is the number of observations, s is the scale pa-
rameter, and DF is the effective degrees of freedom of the model. The use of
GCV or UBRE can be set by the user with the scale command described in
the “Additional Inputs” section and in either case, smoothing parameters are
chosen to minimize the GCV or UBRE score for the model.
Estimation for GAM models proceeds as follows: first, basis functions and a
set (one or more) of quadratic penalty coefficient matrices are constructed for
each smooth term. Second, a model matrix is obtained for the parametric
component of the GAM. These matrices are combined to produce a complete
model matrix and a set of penalty matrices for the smooth terms. Iteratively
Reweighted Least Squares (IRLS) is then used to estimate the model; at each
iteration of the IRLS, a penalized weighted least squares model is run and the
smoothing parameters of that model are estimated by GCV or UBRE. This
process is repeated until convergence is achieved.
Further details of the GAM fitting process are given in Wood (2000, 2004,
2006).
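
The GCV and UBRE criteria described in this section are simple functions of the deviance D, the sample size n, the effective degrees of freedom DF, and (for UBRE) the scale parameter s. A minimal sketch in base R follows. The function names and the numeric inputs are illustrative assumptions, not part of Zelig or mgcv.

```r
# GCV score: n * D / (n - DF)^2
gcv_score <- function(D, n, DF) n * D / (n - DF)^2

# UBRE score: D / n + 2 * s * DF / n - s
ubre_score <- function(D, n, DF, s = 1) D / n + 2 * s * DF / n - s

# Illustrative values only (not taken from any fitted model)
D <- 100; n <- 50; DF <- 10
gcv  <- gcv_score(D, n, DF)
ubre <- ubre_score(D, n, DF, s = 1)
c(gcv = gcv, ubre = ubre)
```

Both scores penalize larger effective degrees of freedom, so minimizing either one trades goodness of fit (D) against model complexity (DF).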

1.5 Quantities of Interest


The quantities of interest for the logit.gam model are the same as those for
the standard logistic regression.

• The expected value (qi$ev) for the logit.gam model is the mean of
simulations from the stochastic component,

$$\pi_i = \frac{1}{1 + \exp\left[-\left(x_i \beta + \sum_{j=1}^{J} f_j(Z_j)\right)\right]}.$$

• The predicted values (qi$pr) are draws from the Binomial distribution
with mean equal to the simulated expected value $\pi_i$.
• The first difference (qi$fd) for the logit.gam model is defined as

$$FD = \Pr(Y = 1 \mid w_1) - \Pr(Y = 1 \mid w)$$

for $w = \{X, Z\}$.
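
The three quantities of interest can be sketched by simulation in base R. The example below is a simplified illustration, not Zelig's implementation: it assumes a linear predictor with only an intercept and one covariate (no smooth terms), and the coefficient estimates beta_hat and covariance V_hat are hypothetical values chosen for the example.

```r
set.seed(1)
n_sims <- 1000

# Hypothetical estimates for (intercept, x); not taken from any fitted model
beta_hat <- c(-0.5, 1.2)
V_hat    <- matrix(c(0.04, 0.01, 0.01, 0.09), 2, 2)

# Draw coefficient vectors from N(beta_hat, V_hat) using a Cholesky factor
Z     <- matrix(rnorm(n_sims * 2), n_sims, 2)
betas <- sweep(Z %*% chol(V_hat), 2, beta_hat, "+")

w  <- c(1, 0.2)   # baseline profile: intercept, x = 0.2
w1 <- c(1, 0.8)   # comparison profile: intercept, x = 0.8

ev  <- plogis(drop(betas %*% w))    # simulated expected values (probabilities)
ev1 <- plogis(drop(betas %*% w1))
pr  <- rbinom(n_sims, 1, ev)        # predicted values: 0/1 draws with mean ev
fd  <- ev1 - ev                     # first differences

c(mean_ev = mean(ev), mean_fd = mean(fd))
```

In a fitted GAM the linear predictor would also include the basis columns of each smooth term, but the mapping from simulated linear predictors to probabilities, draws, and differences is the same.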

1.6 Output Values


The output of each Zelig command contains useful information which you may
view. For example, if you run z.out <- zelig(y ~ x, model = "logit.gam",
data), then you may examine the available information in z.out by using
names(z.out), see the coefficients by using coefficients(z.out), and a de-
fault summary of information through summary(z.out). Other elements avail-
able through the $ operator are listed below.
• From the zelig() output stored in z.out, you may extract:

– coefficients: parameter estimates for the explanatory variables.


– fitted.values: the vector of fitted values for the explanatory vari-
ables.
– residuals: the working residuals in the final iteration of the IRLS
fit.
– linear.predictors: the vector of xi β.
– aic: Akaike’s Information Criterion (minus twice the maximized log-
likelihood plus twice the number of coefficients).
– method: the fitting method used.
– converged: logical indicating whether the model converged or not.
– smooth: information about the smoothed parameters.
– df.residual: the residual degrees of freedom.
– df.null: the residual degrees of freedom for the null model.
– data: the input data frame.
– model: the model matrix used.

• From summary(z.out) (as well as from zelig()), you may extract:
– p.coeff: the coefficients of the parametric components of the model.
– se: the standard errors of the entire model.
– p.table: the coefficients, standard errors, and associated t statistics
for the parametric portion of the model.
– s.table: the table of estimated degrees of freedom, estimated rank,
F statistics, and p-values for the nonparametric portion of the model.
– cov.scaled: a k × k matrix of scaled covariances.
– cov.unscaled: a k × k matrix of unscaled covariances.
• From the sim() output stored in s.out, you may extract:

– qi$ev: the simulated expected probabilities for the specified values
of x.
– qi$pr: the simulated predicted values for the specified values of x.
– qi$fd: the simulated first differences in the expected probabilities
simulated from x and x1.
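
Several of the element names listed above come from the underlying mgcv fit and its summary. The sketch below calls mgcv's gam() directly on simulated data rather than going through zelig(), so it only illustrates where these elements live on an mgcv object. The data-generating process is an assumption made for the example.

```r
library(mgcv)   # the package from which these models are adapted

set.seed(0)
n  <- 400
x1 <- runif(n); x2 <- runif(n)
y  <- rbinom(n, 1, plogis(-1 + x1 + 2 * sin(pi * x2)))   # simulated binary outcome
d  <- data.frame(y, x1, x2)

# One parametric term (x1) and one smooth term (x2), logit link
fit <- gam(y ~ x1 + s(x2), family = binomial, data = d)
sm  <- summary(fit)

fit$converged   # logical convergence flag
sm$p.coeff      # parametric coefficients (intercept and x1)
sm$s.table      # one row per smooth term: edf, reference df, test statistic, p-value
```

The examples in this document reach the same kind of object through z.out$result.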

How to Cite the Logistic Generalized Additive Model


How to Cite the Zelig Software Package
To cite Zelig as a whole, please reference these two sources:

Kosuke Imai, Gary King, and Olivia Lau. 2007. “Zelig: Everyone’s
Statistical Software,” http://GKing.harvard.edu/zelig.

Imai, Kosuke, Gary King, and Olivia Lau. (2008). “Toward A Com-
mon Framework for Statistical Analysis and Development.” Jour-
nal of Computational and Graphical Statistics, Vol. 17, No. 4
(December), pp. 892-913.

See also
The logit.gam model is adapted from the mgcv package by Simon N. Wood
[7]. Advanced users may wish to refer to help(gam), [6], [5], and other docu-
mentation accompanying the mgcv package. All examples are reproduced and
extended from mgcv’s gam() help pages.

2 normal.gam: Generalized Additive Model for Continuous Dependent Variables
This function runs a nonparametric Generalized Additive Model (GAM) for
continuous dependent variables.

2.1 Syntax
> z.out <- zelig(y ~ x1 + s(x2), model = "normal.gam", data = mydata)
> x.out <- setx(z.out)
> s.out <- sim(z.out, x = x.out)

where s() indicates a variable to be estimated via a nonparametric smooth. All
variables for which s() is not specified are estimated via standard parametric
methods.

2.2 Additional Inputs


In addition to the standard inputs, zelig() takes the following additional op-
tions for GAM models.
• method: Controls the fitting method to be used. Fitting methods are se-
lected via a list environment within method=gam.method(). See gam.method()
for details.
• scale: If scale = 0, the smoothing criterion depends on the family: an
Un-Biased Risk Estimator (UBRE) with a scale parameter assumed to be 1 is
used for Poisson and binomial models, and Generalized Cross Validation
(GCV) is used otherwise, which includes normal.gam (see the “Model”
section for details on both criteria). If scale is greater than 1, it is
assumed to be the scale parameter/variance and UBRE is used. If scale
is negative, GCV is used.
• knots: An optional list of knot values to be used for the construction of
basis functions.
• H: The coefficient matrix of a user-supplied fixed quadratic penalty on
the parameters of the GAM. For example, ridge penalties can be added to
the parameters of the GAM to aid in identification on the scale of the
linear predictor.
• sp: A vector of smoothing parameters for each term.
• ...: additional options passed to the normal.gam model. See the mgcv
library for details.

2.3 Examples
1. Basic Example:
Create some data:

> set.seed(0); n <- 400; sig <- 2;


> x0 <- runif(n, 0, 1); x1 <- runif(n, 0, 1)
> x2 <- runif(n, 0, 1); x3 <- runif(n, 0, 1)
> f0 <- function(x) 2 * sin(pi * x)
> f1 <- function(x) exp(2 * x)

> f2 <- function(x) 0.2 * x^11 * (10 * (1 - x))^6 + 10 * (10 *
+ x)^3 * (1 - x)^10
> f3 <- function(x) 0 * x
> f <- f0(x0) + f1(x1) + f2(x2)
> e <- rnorm(n, 0, sig); y <- f + e
> my.data <- as.data.frame(cbind(y, x0, x1, x2, x3))

Estimate the model, summarize the results, and plot nonlinearities:

> z.out <- zelig(y~s(x0)+s(x1)+s(x2)+s(x3), model="normal.gam", data=my.data)


> summary(z.out)
> plot(z.out$result,pages=1,residuals=TRUE)

Note that the plot() function can be used after model estimation and
before simulation to view the nonlinear relationships in the independent
variables:
[Figure: estimated smooth terms s(x0, 5.17), s(x1, 2.36), s(x2, 8.52), and
s(x3, 1), each plotted against its covariate on the interval 0 to 1.]

Set values for the explanatory variables to their default (mean/mode)
values, then simulate, summarize and plot quantities of interest:
> x.out <- setx(z.out)
> s.out <- sim(z.out, x = x.out)
> summary(s.out)
> plot(s.out)

[Figure: densities of the simulated expected values E(Y|X) (N = 1000,
bandwidth = 0.08735) and predicted values Y|X (N = 1000, bandwidth = 0.08694).]

2. Simulating First Differences


Estimate the first difference in the expected value between low values (20th
percentile) and high values (80th percentile) of the explanatory variable x3
while all the other variables are held at their default (mean/mode) values:

> x.high <- setx(z.out, x3 = quantile(my.data$x3, 0.8))
> x.low <- setx(z.out, x3 = quantile(my.data$x3, 0.2))
> s.out <- sim(z.out, x = x.high, x1 = x.low)
> summary(s.out)
> plot(s.out)

[Figure: densities of the expected values E(Y|X) and E(Y|X1), the predicted
values Y|X and Y|X1, and the first difference E(Y|X1) − E(Y|X), each based
on N = 1000 simulations.]

3. Variations in GAM model specification. Note that setx and sim work as
shown in the above examples for any GAM model. As such, in the interest
of parsimony, I will not re-specify the simulations of quantities of interest.
An extra ridge penalty (useful with convergence problems):
> z.out <- zelig(y~s(x0)+s(x1)+s(x2)+s(x3), H=diag(0.5,37),
+ model="normal.gam", data=my.data)
> summary(z.out)
> plot(z.out$result,pages=1,residuals=TRUE)

Set the smoothing parameter for the first term, estimate the rest:
> z.out <- zelig(y~s(x0)+s(x1)+s(x2)+s(x3),sp=c(0.01,-1,-1,-1),
+ model="normal.gam", data=my.data)
> summary(z.out)
> plot(z.out$result,pages=1)

Set lower bounds on smoothing parameters:
> z.out <- zelig(y~s(x0)+s(x1)+s(x2)+s(x3),min.sp=c(0.001,0.01,0,10),
+ model="normal.gam", data=my.data)
> summary(z.out)
> plot(z.out$result, pages=1)

A GAM with 3df regression spline term & 2 penalized terms:
> z.out <-zelig(y~s(x0,k=4,fx=TRUE,bs="tp")+s(x1,k=12)+s(x2,k=15),
+ model="normal.gam", data=my.data)
> summary(z.out)
> plot(z.out$result,pages=1)

2.4 Model
GAM models use families the same way GLM models do: they specify the
distribution and link function to use in model fitting. In the case of normal.gam
a normal distribution with an identity link function is used. Specifically, let Yi
be the continuous dependent variable for observation i.
• The stochastic component is described by a univariate normal model with
a vector of means $\mu_i$ and scalar variance $\sigma^2$:

$$Y_i \sim \text{Normal}(\mu_i, \sigma^2).$$

• The systematic component is given by:

$$\mu_i = x_i \beta + \sum_{j=1}^{J} f_j(Z_j),$$

where $x_i$ is the vector of $k$ explanatory variables, $\beta$ is the vector of
coefficients, and $f_j(Z_j)$ for $j = 1, \ldots, J$ is the set of smooth terms.

Generalized additive models (GAMs) are similar in many respects to gener-
alized linear models (GLMs). Specifically, GAMs are generally fit by penalized
maximum likelihood estimation and GAMs have (or can have) a parametric
component identical to that of a GLM. The difference is that GAMs also in-
clude in their linear predictors a specified sum of smooth functions.
In this GAM implementation, smooth functions are represented using
penalized regression splines. Two techniques may be used to estimate smoothing
parameters: Generalized Cross Validation (GCV),

$$\frac{n D}{(n - DF)^2}, \qquad (3)$$

or an Un-Biased Risk Estimator (UBRE) (which is effectively just a rescaled
AIC),

$$\frac{D}{n} + 2s\,\frac{DF}{n} - s, \qquad (4)$$

where D is the deviance, n is the number of observations, s is the scale pa-
rameter, and DF is the effective degrees of freedom of the model. The use of
GCV or UBRE can be set by the user with the scale command described in
the “Additional Inputs” section and in either case, smoothing parameters are
chosen to minimize the GCV or UBRE score for the model.
Estimation for GAM models proceeds as follows: first, basis functions and a
set (one or more) of quadratic penalty coefficient matrices are constructed for
each smooth term. Second, a model matrix is obtained for the parametric
component of the GAM. These matrices are combined to produce a complete
model matrix and a set of penalty matrices for the smooth terms. Iteratively
Reweighted Least Squares (IRLS) is then used to estimate the model; at each
iteration of the IRLS, a penalized weighted least squares model is run and the
smoothing parameters of that model are estimated by GCV or UBRE. This
process is repeated until convergence is achieved.
Further details of the GAM fitting process are given in Wood (2000, 2004,
2006).
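
For the normal model the IRLS step reduces to a single penalized least squares fit, which makes the roles of the penalty matrix, the effective degrees of freedom, and the GCV score easy to see. The sketch below is in base R and is not mgcv's algorithm: it swaps in a simple truncated-line basis with a ridge penalty on the knot coefficients in place of mgcv's thin plate regression splines, and it chooses the smoothing parameter by a grid search. All names and data are assumptions made for the example.

```r
set.seed(0)
n <- 200
x <- sort(runif(n))
y <- 2 * sin(pi * x) + rnorm(n, 0, 0.5)

knots <- seq(0.1, 0.9, by = 0.1)
# Model matrix: intercept, linear term, and one truncated line per knot
X <- cbind(1, x, outer(x, knots, function(a, b) pmax(a - b, 0)))
# Quadratic penalty matrix: only the knot coefficients are penalized
S <- diag(c(0, 0, rep(1, length(knots))))

fit_pls <- function(lambda) {
  A    <- solve(crossprod(X) + lambda * S, t(X))  # (X'X + lambda S)^{-1} X'
  beta <- A %*% y
  edf  <- sum(diag(X %*% A))                      # trace of the hat matrix
  rss  <- sum((y - X %*% beta)^2)                 # deviance for the normal model
  list(beta = beta, edf = edf, gcv = n * rss / (n - edf)^2)
}

# Choose the smoothing parameter that minimizes the GCV score over a grid
lambdas <- 10^seq(-4, 2, length.out = 25)
gcv     <- sapply(lambdas, function(l) fit_pls(l)$gcv)
best    <- fit_pls(lambdas[which.min(gcv)])
best$edf
```

Larger values of lambda shrink the knot coefficients toward zero, pulling the fit toward a straight line and the effective degrees of freedom toward 2.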

2.5 Quantities of Interest


The quantities of interest for the normal.gam model are the same as those for
the standard Normal regression.

• The expected value (qi$ev) for the normal.gam model is the mean of
simulations from the stochastic component,

$$E(Y) = \mu_i = x_i \beta + \sum_{j=1}^{J} f_j(Z_j).$$

• The predicted value (qi$pr) is a draw from the Normal distribution
defined by the set of parameters $(\mu_i, \sigma^2)$.
• The first difference (qi$fd) for the normal.gam model is defined as

$$FD = E(Y \mid w_1) - E(Y \mid w)$$

for $w = \{X, Z\}$.

2.6 Output Values


The output of each Zelig command contains useful information which you may
view. For example, if you run z.out <- zelig(y ~ x, model = "normal.gam",
data), then you may examine the available information in z.out by using
names(z.out), see the coefficients by using coefficients(z.out), and a de-
fault summary of information through summary(z.out). Other elements avail-
able through the $ operator are listed below.
• From the zelig() output stored in z.out, you may extract:
– coefficients: parameter estimates for the explanatory variables.
– fitted.values: the vector of fitted values for the explanatory vari-
ables.
– residuals: the working residuals in the final iteration of the IRLS
fit.
– linear.predictors: the vector of xi β.
– aic: Akaike’s Information Criterion (minus twice the maximized log-
likelihood plus twice the number of coefficients).
– method: the fitting method used.
– converged: logical indicating whether the model converged or not.
– smooth: information about the smoothed parameters.
– df.residual: the residual degrees of freedom.
– df.null: the residual degrees of freedom for the null model.
– data: the input data frame.
– model: the model matrix used.
• From summary(z.out) (as well as from zelig()), you may extract:
– p.coeff: the coefficients of the parametric components of the model.
– se: the standard errors of the entire model.
– p.table: the coefficients, standard errors, and associated t statistics
for the parametric portion of the model.
– s.table: the table of estimated degrees of freedom, estimated rank,
F statistics, and p-values for the nonparametric portion of the model.
– cov.scaled: a k × k matrix of scaled covariances.
– cov.unscaled: a k × k matrix of unscaled covariances.

• From the sim() output stored in s.out, you may extract:
– qi$ev: the simulated expected values for the specified values
of x.
– qi$pr: the simulated predicted values for the specified values of x.
– qi$fd: the simulated first differences in the expected values
simulated from x and x1.

How to Cite the Normal Generalized Additive Model


How to Cite the Zelig Software Package
To cite Zelig as a whole, please reference these two sources:

Kosuke Imai, Gary King, and Olivia Lau. 2007. “Zelig: Everyone’s
Statistical Software,” http://GKing.harvard.edu/zelig.

Imai, Kosuke, Gary King, and Olivia Lau. (2008). “Toward A Com-
mon Framework for Statistical Analysis and Development.” Jour-
nal of Computational and Graphical Statistics, Vol. 17, No. 4
(December), pp. 892-913.

See also
The normal.gam model is adapted from the mgcv package by Simon N. Wood
[7]. Advanced users may wish to refer to help(gam), [6], [5], and other docu-
mentation accompanying the mgcv package. All examples are reproduced and
extended from mgcv’s gam() help pages.

3 poisson.gam: Generalized Additive Model for Count Dependent Variables
This function runs a nonparametric Generalized Additive Model (GAM) for
count dependent variables.

3.1 Syntax
> z.out <- zelig(y ~ x1 + s(x2), model = "poisson.gam", data = mydata)
> x.out <- setx(z.out)
> s.out <- sim(z.out, x = x.out)

where s() indicates a variable to be estimated via a nonparametric smooth. All
variables for which s() is not specified are estimated via standard parametric
methods.

3.2 Additional Inputs
In addition to the standard inputs, zelig() takes the following additional op-
tions for GAM models.
• method: Controls the fitting method to be used. Fitting methods are se-
lected via a list environment within method=gam.method(). See gam.method()
for details.

• scale: Generalized Cross Validation (GCV) is used if scale = 0 (see the
“Model” section for details), except for Poisson models, where an Un-Biased
Risk Estimator (UBRE) (also see the “Model” section for details) is used
with a scale parameter assumed to be 1. If scale is greater than 1, it is
assumed to be the scale parameter/variance and UBRE is used. If scale
is negative, GCV is used.
• knots: An optional list of knot values to be used for the construction of
basis functions.
• H: The coefficient matrix of a user-supplied fixed quadratic penalty on
the parameters of the GAM. For example, ridge penalties can be added to
the parameters of the GAM to aid in identification on the scale of the
linear predictor.
• sp: A vector of smoothing parameters for each term.
• ...: additional options passed to the poisson.gam model. See the mgcv
library for details.

3.3 Examples
1. Basic Example
Create some count data:

> set.seed(0); n <- 400; sig <- 2;


> x0 <- runif(n, 0, 1); x1 <- runif(n, 0, 1)
> x2 <- runif(n, 0, 1); x3 <- runif(n, 0, 1)
> f0 <- function(x) 2 * sin(pi * x)
> f1 <- function(x) exp(2 * x)
> f2 <- function(x) 0.2 * x^11 * (10 * (1 - x))^6 + 10 * (10 *
+ x)^3 * (1 - x)^10
> f3 <- function(x) 0 * x
> f <- f0(x0) + f1(x1) + f2(x2)
> g <- exp(f/4); y <- rpois(rep(1, n), g)
> my.data <- as.data.frame(cbind(y, x0, x1, x2, x3))

Estimate the model, summarize the results, and plot nonlinearities:

> z.out <- zelig(y~s(x0)+s(x1)+s(x2)+s(x3), model="poisson.gam", data=my.data)
> summary(z.out)
> plot(z.out$result,pages=1,residuals=TRUE)

Note that the plot() function can be used after model estimation and
before simulation to view the nonlinear relationships in the independent
variables:
[Figure: estimated smooth terms s(x0, 3.3), s(x1, 4.56), s(x2, 7.85), and
s(x3, 2.61), each plotted against its covariate on the interval 0 to 1.]

Set values for the explanatory variables to their default (mean/mode)
values, then simulate, summarize and plot quantities of interest:
> x.out <- setx(z.out)
> s.out <- sim(z.out, x = x.out)
> summary(s.out)
> plot(s.out)

[Figure: densities of the simulated expected values E(Y|X) (N = 1000,
bandwidth = 0.01606) and predicted values Y|X (N = 1000, bandwidth = 0.5671).]

2. Simulating First Differences
Estimate the risk difference (and risk ratio) between low values (20th
percentile) and high values (80th percentile) of the explanatory variable x3
while all the other variables are held at their default (mean/mode) values:

> x.high <- setx(z.out, x3 = quantile(my.data$x3, 0.8))
> x.low <- setx(z.out, x3 = quantile(my.data$x3, 0.2))
> s.out <- sim(z.out, x = x.high, x1 = x.low)
> summary(s.out)
> plot(s.out)

[Figure: densities of the expected values E(Y|X) and E(Y|X1), the predicted
values Y|X and Y|X1, and the first difference E(Y|X1) − E(Y|X), each based
on N = 1000 simulations.]

3. Variations in GAM model specification. Note that setx and sim work as
shown in the above examples for any GAM model. As such, in the interest
of parsimony, I will not re-specify the simulations of quantities of interest.
An extra ridge penalty (useful with convergence problems):
> z.out <- zelig(y~s(x0)+s(x1)+s(x2)+s(x3), H=diag(0.5,37),
+ model="poisson.gam", data=my.data)
> summary(z.out)
> plot(z.out$result,pages=1,residuals=TRUE)

Set the smoothing parameter for the first term, estimate the rest:
> z.out <- zelig(y~s(x0)+s(x1)+s(x2)+s(x3),sp=c(0.01,-1,-1,-1),
+ model="poisson.gam", data=my.data)
> summary(z.out)
> plot(z.out$result,pages=1)

Set lower bounds on smoothing parameters:
> z.out <- zelig(y~s(x0)+s(x1)+s(x2)+s(x3),min.sp=c(0.001,0.01,0,10),
+ model="poisson.gam", data=my.data)
> summary(z.out)
> plot(z.out$result, pages=1)

A GAM with 3df regression spline term & 2 penalized terms:
> z.out <-zelig(y~s(x0,k=4,fx=TRUE,bs="tp")+s(x1,k=12)+s(x2,k=15),
+ model="poisson.gam", data=my.data)
> summary(z.out)
> plot(z.out$result,pages=1)

3.4 Model
GAM models use families the same way GLM models do: they specify the
distribution and link function to use in model fitting. In the case of poisson.gam
a Poisson distribution with a log link function is used. Specifically, let Yi be
the dependent variable for observation i. Yi is thus the number of independent
events that occur during a fixed time period. This variable can take any
non-negative integer value.

• The Poisson distribution has stochastic component

$$Y_i \sim \text{Poisson}(\lambda_i),$$

where $\lambda_i$ is the mean and variance parameter.

• The systematic component is given by:

$$\lambda_i = \exp\left(x_i \beta + \sum_{j=1}^{J} f_j(Z_j)\right),$$

where $x_i$ is the vector of explanatory variables, $\beta$ is the vector of
coefficients, and $f_j(Z_j)$ for $j = 1, \ldots, J$ is the set of smooth terms.
Generalized additive models (GAMs) are similar in many respects to gener-
alized linear models (GLMs). Specifically, GAMs are generally fit by penalized
maximum likelihood estimation and GAMs have (or can have) a parametric
component identical to that of a GLM. The difference is that GAMs also in-
clude in their linear predictors a specified sum of smooth functions.

In this GAM implementation, smooth functions are represented using
penalized regression splines. Two techniques may be used to estimate smoothing
parameters: Generalized Cross Validation (GCV),

$$\frac{n D}{(n - DF)^2}, \qquad (5)$$

or an Un-Biased Risk Estimator (UBRE) (which is effectively just a rescaled
AIC),

$$\frac{D}{n} + 2s\,\frac{DF}{n} - s, \qquad (6)$$

where D is the deviance, n is the number of observations, s is the scale pa-
rameter, and DF is the effective degrees of freedom of the model. The use of
GCV or UBRE can be set by the user with the scale command described in
the “Additional Inputs” section and in either case, smoothing parameters are
chosen to minimize the GCV or UBRE score for the model.
Estimation for GAM models proceeds as follows: first, basis functions and a
set (one or more) of quadratic penalty coefficient matrices are constructed for
each smooth term. Second, a model matrix is obtained for the parametric
component of the GAM. These matrices are combined to produce a complete
model matrix and a set of penalty matrices for the smooth terms. Iteratively
Reweighted Least Squares (IRLS) is then used to estimate the model; at each
iteration of the IRLS, a penalized weighted least squares model is run and the
smoothing parameters of that model are estimated by GCV or UBRE. This
process is repeated until convergence is achieved.
Further details of the GAM fitting process are given in Wood (2000, 2004,
2006).
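
The penalized IRLS loop described above can be sketched for the Poisson case in base R. This is a simplified illustration under stated assumptions, not mgcv's implementation: the smoothing parameter sp is held fixed instead of being re-estimated by GCV or UBRE at each iteration, and a truncated-line basis with a ridge penalty on the knot coefficients stands in for mgcv's spline bases. All names and data are hypothetical.

```r
set.seed(0)
n <- 300
x <- sort(runif(n))
y <- rpois(n, exp(1 + sin(2 * pi * x)))

knots <- seq(0.1, 0.9, by = 0.1)
# Model matrix (intercept, linear term, truncated lines) and penalty matrix
X <- cbind(1, x, outer(x, knots, function(a, b) pmax(a - b, 0)))
S <- diag(c(0, 0, rep(1, length(knots))))
sp <- 0.1                            # smoothing parameter, fixed in this sketch

eta <- log(y + 0.5)                  # starting values for the linear predictor
for (iter in 1:50) {
  mu <- exp(eta)
  z  <- eta + (y - mu) / mu          # working response for the log link
  w  <- mu                           # IRLS weights for the Poisson family
  # Penalized weighted least squares step: (X'WX + sp * S) beta = X'Wz
  beta    <- solve(crossprod(X, w * X) + sp * S, crossprod(X, w * z))
  eta_new <- drop(X %*% beta)
  converged <- max(abs(eta_new - eta)) < 1e-8
  eta <- eta_new
  if (converged) break
}
mu_hat <- exp(eta)                   # fitted Poisson means
```

In the full procedure, each pass through the loop would also update the smoothing parameter by minimizing the GCV or UBRE score of the weighted least squares problem.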

3.5 Quantities of Interest


The quantities of interest for the poisson.gam model are the same as those for
the standard Poisson regression.
• The expected value (qi$ev) for the poisson.gam model is the mean of
simulations from the stochastic component,

$$E(Y) = \lambda_i = \exp\left(x_i \beta + \sum_{j=1}^{J} f_j(Z_j)\right).$$

• The predicted value (qi$pr) is a random draw from the Poisson
distribution defined by mean $\lambda_i$.
• The first difference (qi$fd) for the poisson.gam model is defined as

$$FD = E(Y \mid w_1) - E(Y \mid w)$$

for $w = \{X, Z\}$.

3.6 Output Values
The output of each Zelig command contains useful information which you may
view. For example, if you run z.out <- zelig(y ~ x, model = "poisson.gam",
data), then you may examine the available information in z.out by using
names(z.out), see the coefficients by using coefficients(z.out), and a de-
fault summary of information through summary(z.out). Other elements avail-
able through the $ operator are listed below.
• From the zelig() output stored in z.out, you may extract:
– coefficients: parameter estimates for the explanatory variables.
– fitted.values: the vector of fitted values for the explanatory vari-
ables.
– residuals: the working residuals in the final iteration of the IRLS
fit.
– linear.predictors: the vector of xi β.
– aic: Akaike’s Information Criterion (minus twice the maximized log-
likelihood plus twice the number of coefficients).
– method: the fitting method used.
– converged: logical indicating whether the model converged or not.
– smooth: information about the smoothed parameters.
– df.residual: the residual degrees of freedom.
– df.null: the residual degrees of freedom for the null model.
– data: the input data frame.
– model: the model matrix used.
• From summary(z.out) (as well as from zelig()), you may extract:
– p.coeff: the coefficients of the parametric components of the model.
– se: the standard errors of the entire model.
– p.table: the coefficients, standard errors, and associated t statistics
for the parametric portion of the model.
– s.table: the table of estimated degrees of freedom, estimated rank,
F statistics, and p-values for the nonparametric portion of the model.
– cov.scaled: a k × k matrix of scaled covariances.
– cov.unscaled: a k × k matrix of unscaled covariances.
• From the sim() output stored in s.out, you may extract:
– qi$ev: the simulated expected probabilities for the specified values
of x.
– qi$pr: the simulated predicted values for the specified values of x.
– qi$fd: the simulated first differences in the expected probabilities
simulated from x and x1.
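
As a rough illustration of the elements listed above, the following sketch fits a Poisson GAM directly with mgcv's gam() and inspects the fitted object. It assumes that the elements exposed through z.out mirror those of the underlying mgcv fit, which is not verified here; the simulated data and the variable names are made up for the example.

```r
library(mgcv)  # the package the Zelig GAM models are adapted from

# Hypothetical count data with a nonlinear effect of x.
set.seed(3)
n <- 200
x <- runif(n)
y <- rpois(n, exp(1 + sin(2 * pi * x)))

# Fit the Poisson GAM directly (stand-in for zelig(..., model = "poisson.gam")).
fit <- gam(y ~ s(x), family = poisson)

names(fit)                    # list the available elements
head(coef(fit))               # coefficients (intercept plus spline basis)
head(fit$fitted.values)       # fitted means on the response scale
head(fit$linear.predictors)   # fitted values on the log (link) scale
fit$df.residual               # residual degrees of freedom
fit$aic                       # Akaike's Information Criterion
```

With the log link, exponentiating linear.predictors reproduces fitted.values, which is a quick way to check which scale each element is on.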

How to Cite the Poisson Generalized Additive Model
How to Cite the Zelig Software Package
To cite Zelig as a whole, please reference these two sources:

Kosuke Imai, Gary King, and Olivia Lau. 2007. “Zelig: Everyone’s
Statistical Software,” http://GKing.harvard.edu/zelig.

Imai, Kosuke, Gary King, and Olivia Lau. (2008). “Toward A Com-
mon Framework for Statistical Analysis and Development.” Jour-
nal of Computational and Graphical Statistics, Vol. 17, No. 4
(December), pp. 892-913.

See also
The poisson.gam model is adapted from the mgcv package by Simon N. Wood
[7]. Advanced users may wish to refer to help(gam), [6], [5], and other docu-
mentation accompanying the mgcv package. All examples are reproduced and
extended from mgcv’s gam() help pages.

4 probit.gam: Generalized Additive Model for
Dichotomous Dependent Variables
This function runs a nonparametric Generalized Additive Model (GAM) for
dichotomous dependent variables.

4.1 Syntax
> z.out <- zelig(y ~ x1 + s(x2), model = "probit.gam", data = mydata)
> x.out <- setx(z.out)
> s.out <- sim(z.out, x = x.out)

Where s() indicates a variable to be estimated via a nonparametric smooth. All
variables for which s() is not specified are estimated via standard parametric
methods.

4.2 Additional Inputs
In addition to the standard inputs, zelig() takes the following additional
options for GAM models.

• method: Controls the fitting method to be used. Fitting methods are se-
lected via a list environment within method=gam.method(). See gam.method()
for details.

• scale: Generalized Cross Validation (GCV) is used if scale = 0 (see the
“Model” section for details), except for binomial models such as probit.gam,
where an Un-Biased Risk Estimator (UBRE) (also see the “Model” section
for details) is used with a scale parameter assumed to be 1. If scale is
greater than 1, it is assumed to be the scale parameter/variance and UBRE
is used. If scale is negative, GCV is used.
• knots: An optional list of knot values to be used for the construction of
basis functions.
• H: A user supplied fixed quadratic penalty on the parameters of the GAM
can be supplied with this as its coefficient matrix. For example, ridge
penalties can be added to the parameters of the GAM to aid in identifi-
cation on the scale of the linear predictor.
• sp: A vector of smoothing parameters for each term.
• ...: additional options passed to the probit.gam model. See the mgcv
library for details.

4.3 Examples
1. Basic Example
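Create some binary data:

> set.seed(0); n <- 400; sig <- 2;
> x0 <- runif(n, 0, 1); x1 <- runif(n, 0, 1)
> x2 <- runif(n, 0, 1); x3 <- runif(n, 0, 1)
> # Assumption: f is undefined in this listing; it is taken here from mgcv's gam() help example.
> f <- 2 * sin(pi * x0) + exp(2 * x1) +
+   0.2 * x2^11 * (10 * (1 - x2))^6 + 10 * (10 * x2)^3 * (1 - x2)^10
> g <- (f-5)/3
> g <- binomial()$linkinv(g)
> y <- rbinom(g,1,g)
> my.data <- as.data.frame(cbind(y, x0, x1, x2, x3))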

Estimate the model, summarize the results, and plot nonlinearities:

> z.out <- zelig(y~s(x0)+s(x1)+s(x2)+s(x3), model="probit.gam", data=my.data)
> summary(z.out)
> plot(z.out$result,pages=1,residuals=TRUE)

Note that the plot() function can be used after model estimation and
before simulation to view the nonlinear relationships in the independent
variables.

Set values for the explanatory variables to their default (mean/mode)
values, then simulate, summarize and plot quantities of interest:
> x.out <- setx(z.out)
> s.out <- sim(z.out, x = x.out)
> summary(s.out)
> plot(s.out)

[Figure: estimated smooth terms in four panels, s(x0,1), s(x1,1.09), s(x2,4.85), and s(x3,1), plotted against x0, x1, x2, and x3 respectively, each over the range 0 to 1.]

2. Simulating First Differences

Estimate the risk difference (and risk ratio) between low values (20th
percentile) and high values (80th percentile) of the explanatory variable x3
while all the other variables are held at their default (mean/mode) values.

> x.high <- setx(z.out, x3 = quantile(my.data$x3, 0.8))
> x.low <- setx(z.out, x3 = quantile(my.data$x3, 0.2))
> s.out <- sim(z.out, x=x.high, x1=x.low)
> summary(s.out)
> plot(s.out)

3. Variations in GAM model specification. Note that setx and sim work as
shown in the above examples for any GAM model. As such, in the interest
of parsimony, I will not re-specify the simulations of quantities of interest.
An extra ridge penalty (useful with convergence problems):

> z.out <- zelig(y~s(x0)+s(x1)+s(x2)+s(x3), H=diag(0.5,37),
+ model="probit.gam", data=my.data)
> summary(z.out)

> plot(z.out$result,pages=1,residuals=TRUE)

[Figure: simulated quantities of interest for X, with panels "Expected Value (for X): E(Y|X)" (density, N = 1000) and "Predicted Value (for X): Y|X" (counts of 0 and 1).]

[Figure: simulated quantities of interest for X and X1, with panels "Expected Value (for X): E(Y|X)", "Predicted Value (for X): Y|X", "Expected Value (for X1): E(Y|X1)", "Predicted Value (for X1): Y|X1", and "First Difference: E(Y|X1) − E(Y|X)".]

Set the smoothing parameter for the first term, estimate the rest:
> z.out <- zelig(y~s(x0)+s(x1)+s(x2)+s(x3),sp=c(0.01,-1,-1,-1),
+ model="probit.gam", data=my.data)
> summary(z.out)
> plot(z.out$result,pages=1)
Set lower bounds on smoothing parameters:
> z.out <- zelig(y~s(x0)+s(x1)+s(x2)+s(x3),min.sp=c(0.001,0.01,0,10),
+ model="probit.gam", data=my.data)
> summary(z.out)
> plot(z.out$result, pages=1)
A GAM with 3df regression spline term & 2 penalized terms:
> z.out <- zelig(y~s(x0,k=4,fx=TRUE,bs="tp")+s(x1,k=12)+s(x2,k=15),
+ model="probit.gam", data=my.data)
> summary(z.out)
> plot(z.out$result,pages=1)

4.4 Model
GAM models use families the same way GLM models do: they specify the
distribution and link function to use in model fitting. In the case of probit.gam
a probit link function (the inverse of the standard Normal cumulative
distribution function) is used. Specifically, let Yi be the binary dependent
variable for observation i which takes the value of either 0 or 1.
• The Bernoulli distribution has stochastic component

\[ Y_i \sim \text{Bernoulli}(\pi_i), \]

where πi = Pr(Yi = 1).

• The systematic component is given by:

\[ \pi_i = \Phi\left( x_i \beta + \sum_{j=1}^{J} f_j(Z_j) \right), \]

where Φ(µ) is the cumulative distribution function of the Normal distribution
with mean 0 and unit variance and fj(Zj) for j = 1, . . . , J is the set
of smooth terms.
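
As a small numerical illustration of the systematic component, the sketch below evaluates πi for three observations. The coefficient beta and the smooth function f1 are invented for the example and are not estimates from any model in this document.

```r
# Hypothetical inputs: one parametric covariate x and one smoothed covariate z.
x <- c(-1, 0, 1)
z <- c(0.1, 0.5, 0.9)
beta <- 0.5                          # assumed parametric coefficient
f1 <- function(z) sin(2 * pi * z)    # assumed smooth term f_1(Z_1)

eta <- x * beta + f1(z)   # linear predictor: x_i * beta + sum_j f_j(Z_j)
p <- pnorm(eta)           # pi_i = Phi(eta), the standard Normal CDF

print(round(cbind(eta, p), 3))
```

Because Φ maps the whole real line into (0, 1), every value of the linear predictor yields a valid probability.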
Generalized additive models (GAMs) are similar in many respects to gener-
alized linear models (GLMs). Specifically, GAMs are generally fit by penalized
maximum likelihood estimation and GAMs have (or can have) a parametric
component identical to that of a GLM. The difference is that GAMs also in-
clude in their linear predictors a specified sum of smooth functions.
In this GAM implementation, smooth functions are represented using pe-
nalized regression splines. Two techniques may be used to estimate smoothing
parameters: Generalized Cross Validation (GCV),

\[ \frac{n D}{(n - DF)^2}, \tag{7} \]

or an Un-Biased Risk Estimator (UBRE) (which is effectively just a rescaled
AIC),

\[ \frac{D}{n} + 2 s \frac{DF}{n} - s, \tag{8} \]

where D is the deviance, n is the number of observations, s is the scale
parameter, and DF is the effective degrees of freedom of the model. The use of
GCV or UBRE can be set by the user with the scale argument described in
the “Additional Inputs” section; in either case, smoothing parameters are
chosen to minimize the GCV or UBRE score for the model.
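
The two scores in equations (7) and (8) are simple enough to compute by hand. The following sketch writes them as R functions; the function names gcv_score and ubre_score and the input values are hypothetical and chosen only to show the arithmetic.

```r
# GCV score, equation (7): n * D / (n - DF)^2
gcv_score <- function(D, n, DF) n * D / (n - DF)^2

# UBRE score, equation (8): D / n + 2 * s * DF / n - s
ubre_score <- function(D, n, DF, s = 1) D / n + 2 * s * DF / n - s

# Hypothetical fit: deviance 450 from n = 400 observations with 8 effective
# degrees of freedom and scale parameter 1.
gcv  <- gcv_score(D = 450, n = 400, DF = 8)
ubre <- ubre_score(D = 450, n = 400, DF = 8, s = 1)
print(c(GCV = gcv, UBRE = ubre))
```

Both scores increase with DF for a fixed deviance, so a more flexible smooth is preferred only if it reduces the deviance enough to offset the penalty.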

Estimation for GAM models proceeds as follows: first, basis functions and a
set (one or more) of quadratic penalty coefficient matrices are constructed for
each smooth term. Second, a model matrix is obtained for the parametric
component of the GAM. These matrices are combined to produce a complete
model matrix and a set of penalty matrices for the smooth terms. Iteratively
Reweighted Least Squares (IRLS) is then used to estimate the model; at each
iteration of the IRLS, a penalized weighted least squares model is run and the
smoothing parameters of that model are estimated by GCV or UBRE. This
process is repeated until convergence is achieved.
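
The steps above can be sketched in base R for a probit model with a single smooth term. This is a simplified illustration and not the mgcv algorithm: it uses a truncated-line basis with a ridge penalty as a stand-in for mgcv's penalized regression splines, and it holds the smoothing parameter lambda fixed instead of estimating it by GCV or UBRE at each iteration. All names and values are assumptions made for the example.

```r
set.seed(1)
n <- 300
z <- runif(n)
eta_true <- sin(2 * pi * z)
y <- rbinom(n, 1, pnorm(eta_true))

# Step 1: basis functions and quadratic penalty for the smooth term.
knots <- seq(0.1, 0.9, by = 0.1)
B <- outer(z, knots, function(z, k) pmax(z - k, 0))  # truncated-line basis

# Step 2: parametric part (intercept and linear term) and combined matrices.
X <- cbind(1, z, B)
S <- diag(c(0, 0, rep(1, length(knots))))  # penalize only the basis coefficients
lambda <- 0.1                              # fixed smoothing parameter (assumed)

# Step 3: penalized IRLS for the probit link.
beta <- rep(0, ncol(X))
converged <- FALSE
for (iter in 1:100) {
  eta <- drop(X %*% beta)
  mu <- pmin(pmax(pnorm(eta), 1e-8), 1 - 1e-8)  # keep probabilities off 0 and 1
  dmu <- dnorm(eta)                 # d mu / d eta for the probit link
  w <- dmu^2 / (mu * (1 - mu))      # IRLS weights
  zwork <- eta + (y - mu) / dmu     # working response
  # Penalized weighted least squares: (X'WX + lambda * S) beta = X'W zwork
  beta_new <- drop(solve(crossprod(X, w * X) + lambda * S,
                         crossprod(X, w * zwork)))
  converged <- max(abs(beta_new - beta)) < 1e-8
  beta <- beta_new
  if (converged) break
}

fitted_prob <- pnorm(drop(X %*% beta))
```

In a full implementation the smoothing parameter would be re-estimated inside the loop by minimizing the GCV or UBRE score of each penalized weighted least squares fit.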
Further details of the GAM fitting process are given in Wood (2000, 2004,
2006).

4.5 Quantities of Interest
The quantities of interest for the probit.gam model are the same as those for
the standard probit regression.
• The expected value (qi$ev) for the probit.gam model is the mean of
simulations from the stochastic component,

\[ \pi_i = \Phi\left( x_i \beta + \sum_{j=1}^{J} f_j(Z_j) \right). \]

• The predicted values (qi$pr) are draws from the Binomial distribution
with mean equal to the simulated expected value πi .
• The first difference (qi$fd) for the probit.gam model is defined as

FD = Pr(Y = 1 | w1) − Pr(Y = 1 | w)

for w = {X, Z}.
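
The three quantities above can be approximated by simulation. The sketch below is an assumption-laden stand-in for what sim() does: it uses a plain probit GLM from base R in place of a GAM, draws coefficient vectors from a multivariate Normal centered on the estimates, and then computes expected values, predicted values, and first differences. The data and all object names are invented for the example.

```r
set.seed(2)
n <- 500
x <- runif(n)
y <- rbinom(n, 1, pnorm(-0.5 + 1.5 * x))
fit <- glm(y ~ x, family = binomial(link = "probit"))

# Draw 1000 coefficient vectors from N(beta_hat, V_hat) via a Cholesky factor.
sims <- 1000
b_hat <- coef(fit)
R <- chol(vcov(fit))   # upper triangular, V_hat = t(R) %*% R
draws <- matrix(rnorm(sims * length(b_hat)), nrow = sims) %*% R
draws <- sweep(draws, 2, b_hat, "+")

# Covariate profiles: x at its 20th and 80th percentiles (with an intercept).
w_low  <- c(1, quantile(x, 0.2))
w_high <- c(1, quantile(x, 0.8))

ev_low  <- pnorm(drop(draws %*% w_low))    # expected values (qi$ev analogue)
ev_high <- pnorm(drop(draws %*% w_high))
pr_low  <- rbinom(sims, 1, ev_low)         # predicted values (qi$pr analogue)
fd      <- ev_high - ev_low                # first differences (qi$fd analogue)

summary(fd)
```

The spread of fd across the simulations reflects estimation uncertainty in the coefficients, which is why the first difference is reported as a distribution and not a single number.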

4.6 Output Values


The output of each Zelig command contains useful information which you may
view. For example, if you run z.out <- zelig(y ~ x, model = "probit.gam",
data), then you may examine the available information in z.out by using
names(z.out), see the coefficients by using coefficients(z.out), and a de-
fault summary of information through summary(z.out). Other elements avail-
able through the $ operator are listed below.
• From the zelig() output stored in z.out, you may extract:
– coefficients: parameter estimates for the explanatory variables.
– fitted.values: the vector of fitted values for the explanatory variables.
– residuals: the working residuals in the final iteration of the IRLS
fit.
– linear.predictors: the vector of linear predictors, xi β plus the sum of the smooth terms.
– aic: Akaike’s Information Criterion (minus twice the maximized log-
likelihood plus twice the number of coefficients).
– method: the fitting method used.
– converged: logical indicating whether the model converged or not.
– smooth: information about the smoothed parameters.
– df.residual: the residual degrees of freedom.
– df.null: the residual degrees of freedom for the null model.
– data: the input data frame.
– model: the model matrix used.
• From summary(z.out) (as well as from zelig()), you may extract:
– p.coeff: the coefficients of the parametric components of the model.
– se: the standard errors of the entire model.
– p.table: the coefficients, standard errors, and associated t statistics
for the parametric portion of the model.
– s.table: the table of estimated degrees of freedom, estimated rank,
F statistics, and p-values for the nonparametric portion of the model.
– cov.scaled: a k × k matrix of scaled covariances.
– cov.unscaled: a k × k matrix of unscaled covariances.
• From the sim() output stored in s.out, you may extract:
– qi$ev: the simulated expected probabilities for the specified values
of x.
– qi$pr: the simulated predicted values for the specified values of x.
– qi$fd: the simulated first differences in the expected probabilities
simulated from x and x1.
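
To show what the summary elements look like, the following sketch fits a probit GAM directly with mgcv's gam() and inspects its summary. It assumes the summary of a Zelig fit exposes the same elements as mgcv's summary of the underlying model, which is not verified here; the data are simulated for the example.

```r
library(mgcv)

# Hypothetical binary data: x1 enters parametrically, x2 through a smooth.
set.seed(4)
n <- 400
x1 <- runif(n)
x2 <- runif(n)
y <- rbinom(n, 1, pnorm(0.5 * x1 + sin(2 * pi * x2)))

fit <- gam(y ~ x1 + s(x2), family = binomial(link = "probit"))
sm <- summary(fit)

sm$p.coeff           # coefficients of the parametric components
sm$p.table           # estimates, standard errors, test statistics, p-values
sm$s.table           # effective df and tests for the smooth terms
dim(sm$cov.scaled)   # k x k, where k is the total number of coefficients
```

Here p.table has one row per parametric term (the intercept and x1), and s.table has one row per smooth term (s(x2)).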

How to Cite the Probit Generalized Additive Model


How to Cite the Zelig Software Package
To cite Zelig as a whole, please reference these two sources:
Kosuke Imai, Gary King, and Olivia Lau. 2007. “Zelig: Everyone’s
Statistical Software,” http://GKing.harvard.edu/zelig.
Imai, Kosuke, Gary King, and Olivia Lau. (2008). “Toward A Com-
mon Framework for Statistical Analysis and Development.” Jour-
nal of Computational and Graphical Statistics, Vol. 17, No. 4
(December), pp. 892-913.

See also
The probit.gam model is adapted from the mgcv package by Simon N. Wood
[7]. Advanced users may wish to refer to help(gam), [6], [5], and other docu-
mentation accompanying the mgcv package. All examples are reproduced and
extended from mgcv’s gam() help pages.

References
[1] Matt Owen and Skyler Cranmer. Generalized Additive Model for Logistic
Regression of Dichotomous Dependent Variables, 2011.
[2] Matt Owen and Skyler Cranmer. Generalized Additive Model for Normal
Regression of Continuous Dependent Variables, 2011.
[3] Matt Owen and Skyler Cranmer. Generalized Additive Model for Poisson
Regression of Count Dependent Variables, 2011.
[4] Matt Owen and Skyler Cranmer. Generalized Additive Model for Probit
Regression of Dichotomous Dependent Variables, 2011.
[5] Simon N. Wood. Modelling and smoothing parameter estimation with multiple
quadratic penalties. Journal of the Royal Statistical Society, Series B,
62(2):413–428, 2000.
[6] Simon N. Wood. Stable and efficient multiple smoothing parameter esti-
mation for generalized additive models. Journal of the American Statistical
Association, 99:673–686, 2004.

[7] Simon N. Wood. Generalized Additive Models: An Introduction with R. CRC
Press, London, 2006.
