Chapter 36
The LIFEREG Procedure

OVERVIEW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1761
GETTING STARTED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1763
SYNTAX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1767
   PROC LIFEREG Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . 1768
   BY Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1769
   CLASS Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1769
   MODEL Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1770
   OUTPUT Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1775
   WEIGHT Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1777
DETAILS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1777
   Missing Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1777
   Main Effects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1777
   Computational Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1778
   Model Specifications . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1778
   Supported Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . 1780
   Predicted Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1783
   OUTEST= Data Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1784
   Computational Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . 1784
   Displayed Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1785
   ODS Table Names . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1786
EXAMPLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1786
   Example 36.1 Motorette Failure . . . . . . . . . . . . . . . . . . . . . . . 1786
   Example 36.2 Computing Predicted Values for a Tobit Model . . . . . . . . . . 1789
   Example 36.3 Overcoming Convergence Problems by Specifying Initial Values . . 1793
REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1796

Overview

The LIFEREG procedure fits parametric models to failure time data that can be uncensored, right censored, left censored, or interval censored. The models are of the form
y = Xβ + σε

where y is a vector of response values, often the log of the failure times, X is a matrix of covariates or independent variables (usually including an intercept term), β is a vector of unknown regression parameters, σ is an unknown scale parameter, and ε is a vector of errors assumed to come from a known distribution (such as the standard normal distribution). The distribution may depend on additional shape parameters.
These models are equivalent to accelerated failure time models when the log of the
response is the quantity being modeled. The effect of the covariates in an accelerated failure time model is to change the scale, and not the location, of a baseline
distribution of failure times.
The LIFEREG procedure estimates the parameters by maximum likelihood using a
Newton-Raphson algorithm. PROC LIFEREG estimates the standard errors of the
parameter estimates from the inverse of the observed information matrix.
The accelerated failure time model assumes that the effect of independent variables on an event time distribution is multiplicative on the event time. Usually, the scale function is exp(x′β), where x is the vector of covariate values and β is a vector of unknown parameters. Thus, if T₀ is an event time sampled from the baseline distribution corresponding to values of zero for the covariates, then the accelerated failure time model specifies that, if the vector of covariates is x, the event time is T = exp(x′β)T₀. If y = log(T) and y₀ = log(T₀), then

y = x′β + y₀

This is a linear model with y₀ as the error term.
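To make the formulation concrete, the following DATA step sketch simulates responses from the model y = Xβ + σε with a standard extreme value error (a Weibull baseline). All names and parameter values here are illustrative and do not come from this chapter.

data aft;
   call streaminit(1234);
   beta0 = 3.0; beta1 = -0.2; sigma = 0.25;   /* illustrative parameters  */
   do i = 1 to 100;
      x = rand('BERNOULLI', 0.5);             /* one binary covariate     */
      e = log(-log(1 - rand('UNIFORM')));     /* standard extreme value   */
      y = beta0 + beta1*x + sigma*e;          /* log event time           */
      t = exp(y);                             /* event time               */
      output;
   end;
run;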
In terms of survival or exceedance probabilities, this model is

Pr(T > t | x) = Pr(T₀ > exp(−x′β)t)

The probability on the left-hand side of the equal sign is evaluated given the value x for the covariates, and the right-hand side is computed using the baseline probability distribution but at a scaled value of the argument. The right-hand side of the equation represents the value of the baseline survival distribution function evaluated at exp(−x′β)t.
Usually, an intercept parameter μ and a scale parameter σ are allowed in the model. In terms of the original untransformed event times, the effects of the intercept term and the scale term are to scale the event time and power the event time, respectively. That is, if

log(T) = μ + σ log(T₀)

then

T = exp(μ)T₀^σ
Although it is possible to fit these models to the original response variable using
the NOLOG option, it is more common to model the log of the response variable.
Because of this log transformation, zero values for the observed failure times are
not allowed unless the NOLOG option is specified. Similarly, small values for the
observed failure times lead to large negative values for the transformed response. The
NOLOG option should only be used if you want to fit a distribution appropriate for
the untransformed response, the extreme value instead of the Weibull, for example.
The parameter estimates for the normal distribution are sensitive to large negative
values, and care must be taken that the fitted model is not unduly influenced by them.
Likewise, values that are extremely large even after the log transformation have a
strong influence in fitting the extreme value (Weibull) and normal distributions. You
should examine the residuals and check the effects of removing observations with
large residuals or extreme values of covariates on the model parameters. The logistic
distribution gives robust parameter estimates in the sense that the estimates have a
bounded influence function.
The standard errors of the parameter estimates are computed from large sample normal approximations using the observed information matrix. In small samples, these
approximations may be poor. Refer to Lawless (1982) for additional discussion and
references. You can sometimes construct better confidence intervals by transforming
the parameters. For example, large-sample theory is often more accurate for log(σ) than for σ. Therefore, it may be more accurate to construct confidence intervals for log(σ) and transform these into confidence intervals for σ. The parameter estimates
and their estimated covariance matrix are available in an output SAS data set and can
be used to construct additional tests or confidence intervals for the parameters. Alternatively, tests of parameters can be based on log-likelihood ratios. Refer to Cox and
Oakes (1984) for a discussion of the merits of some possible test methods including
score, Wald, and likelihood ratio tests. It is believed that likelihood ratio tests are
generally more reliable in small samples than tests based on the information matrix.
The log-likelihood function is computed using the log of the failure time as a response. This log likelihood differs from the log likelihood obtained using the failure time as the response by an additive term of Σ log(tᵢ), where the sum is over the noncensored failure times. This term does not depend on the unknown parameters and does not affect parameter or standard error estimates. However, many published values of log likelihoods use the failure time as the basic response variable and, hence, differ by the additive term from the value computed by the LIFEREG procedure.
The classic Tobit model (Tobin 1958) also fits into this class of models but with data
usually censored on the left. The data considered by Tobin in his original paper came
from a survey of consumers where the response variable is the ratio of expenditures
on durable goods to the total disposable income. The two explanatory variables are
the age of the head of household and the ratio of liquid assets to total disposable
income. Because many observations in this data set have a value of zero for the
response variable, the model fit by Tobin is
y = max(x′β + σε, 0)
Getting Started
The following examples demonstrate how you can use the LIFEREG procedure to fit
a parametric model to failure time data.
Suppose you have a response variable y that represents failure time, censor is a binary variable indicating censored values, and x1 and x2 are two linearly independent
variables. The following statements perform a typical accelerated failure time model
analysis. Note that no higher-order effects such as interactions are allowed in the
covariables list.
proc lifereg;
model y*censor(0) = x1 x2;
run;
PROC LIFEREG can operate on interval-censored data. The model syntax for specifying the censored interval is
proc lifereg;
model (begin, end) = x1 x2;
run;
You can also express the response with events/trials syntax, as illustrated in the following statements:
proc lifereg;
model r/n=x1 x2;
run;
The variable n represents the number of trials and the variable r represents the number
of events.
The data set headache contains the variable minutes, which represents the reported
time to headache relief, the variable group, the group to which the patient is assigned, and the variable censor, a binary variable indicating whether the observation
is censored. Valid values of the variable censor are 0 (no) and 1 (yes). The first five
records of the data set headache are shown below.
Figure 36.1. Headache Data

   Obs   minutes   group   censor
    1       11       1        0
    2       12       1        0
    3       19       1        0
    4       19       1        0
    5       19       1        0
The CLASS statement specifies the variable group as the classification variable. The
MODEL statement syntax indicates that the response variable minutes is censored
when the variable censor takes the value 1. The MODEL statement specifies the
variable group as the single explanatory variable. Because the MODEL statement
does not specify the DISTRIBUTION= option, the LIFEREG procedure fits the default type 1 extreme value distribution using log(minutes) as the response. This is
equivalent to fitting the Weibull distribution.
The OUTPUT statement creates the output data set new. In addition to the variables
in the original data set headache, the SAS data set new also contains the variable
prob. This new variable is created by the CDF= option to contain the estimates of
the cumulative distribution function evaluated at the observed response.
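The statements themselves are not preserved above; a sketch consistent with this description (data set headache, classification variable group, censoring value 1, and output data set new containing the CDF= variable prob) would be:

proc lifereg data=headache;
   class group;
   model minutes*censor(1) = group;
   output out=new cdf=prob;
run;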
The results of this analysis are displayed in the following figures.
                     The LIFEREG Procedure

                   Class Level Information
                   Name     Levels   Values
                   group       2     1 2

                       Model Information
                   Data Set                   WORK.HEADACHE
                   Dependent Variable         Log(minutes)
                   Censoring Variable         censor
                   Censoring Value(s)         1
                   Number of Observations     38
                   Noncensored Values         30
                   Right Censored Values      8
                   Left Censored Values       0
                   Interval Censored Values   0
                   Name of Distribution       WEIBULL
                   Log Likelihood             -9.37930239

Figure 36.2. Class Level Information and Model Fitting Information
Figure 36.2 displays the class level information and model fitting information. There
are 30 noncensored observations and 8 right-censored observations. The log likelihood for the Weibull distribution is -9.3793. The log-likelihood value can be used to
compare the goodness of fit for different models.
                                 Standard
 Variable    DF   Estimate         Error   Chi-Square   Pr > ChiSq   Label
 Intercept    1    3.30912       0.05885    3161.7000       <.0001   Intercept
 group        1                                 6.0540       0.0139
       1      1   -0.19330       0.07856        6.0540       0.0139
       2      0    0.00000        .              .             .
 Scale        1    0.21219       0.03036                             Extreme value scale

Figure 36.3. Parameter Estimates
The table of parameter estimates is displayed in Figure 36.3. Both the intercept and
the slope parameter for the variable group are significantly different from 0 at the
0.05 level. Because the variable group has only one degree of freedom, parameter
estimates are given for only one level of the variable group (group=1). However, the
estimate for the intercept parameter provides a baseline for group=2. The resulting
model is
log(minutes) = 3.30911843 − 0.1933025   for group=1
log(minutes) = 3.30911843               for group=2
Note that the Weibull shape parameter for this model is the reciprocal of the extreme value scale parameter estimate shown in Figure 36.3 (1/0.21219 = 4.7128).
The following statements produce a graph of the cumulative distribution values versus
the variable minutes. The LEGEND1 statement defines the appearance of the legend
that displays on the plot. The two AXIS statements define the appearance of the plot
axes. The SYMBOL statements control the plotting symbol, color, and method of
smoothing.
legend1 frame cframe=ligr cborder=black
        position=center value=(justify=center);
axis1 label=(angle=90 rotate=0 'Estimated CDF') minor=none;
axis2 minor=none;
symbol1 c=white i=spline;
symbol2 c=yellow i=spline;
proc sort data=new;
by prob;
proc gplot data=new;
plot prob*minutes=group/ frame cframe=ligr
legend=legend1 vaxis=axis1 haxis=axis2;
run;
The SORT procedure sorts the data set new by the variable prob. Then the GPLOT
procedure plots the variable prob versus the variable minutes using the grouping
variable as the identification variable. The LEGEND=, VAXIS=, and HAXIS= options specify the previously defined legend and axis statements.
Figure 36.4 displays the estimated cumulative distribution function for each group.
Figure 36.4. Estimated Cumulative Distribution Functions (graph of prob versus minutes by group)
Syntax
The following statements are available in PROC LIFEREG.
PROC LIFEREG <options> ;
   BY variables ;
   CLASS variables ;
   <label:> MODEL response<*censor(list)>=independents < / options > ;
   OUTPUT <OUT=SAS-data-set> keyword=name <...keyword=name> ;
   WEIGHT variable ;

The PROC LIFEREG statement invokes the procedure. The MODEL statement is required and specifies the variables used in the regression part of the model as well as the distribution used for the error component of the model. The WEIGHT statement identifies a variable with values that are used to weight the observations. Observations with zero or negative weights are not used to fit the model, although predicted values can be computed for them. The OUTPUT statement creates an output data set containing predicted values and residuals.

PROC LIFEREG Statement

PROC LIFEREG <options> ;

The following options can appear in the PROC LIFEREG statement.

DATA=SAS-data-set
specifies the input SAS data set used by PROC LIFEREG. By default, the most recently created SAS data set is used.
NOPRINT
suppresses the display of the output. Note that this option temporarily disables the
Output Delivery System (ODS). For more information, see Chapter 15, Using the
Output Delivery System.
ORDER=DATA | FORMATTED | FREQ | INTERNAL
specifies the sorting order for the levels of the classification variables (specified in the
CLASS statement). This ordering determines which parameters in the model correspond to each level in the data. The following table illustrates how PROC LIFEREG
interprets values of the ORDER= option.
Value of ORDER=   Levels Sorted By
DATA              order of appearance in the input data set
FORMATTED         formatted value
FREQ              descending frequency count; levels with the most observations come first in the order
INTERNAL          unformatted value
OUTEST=SAS-data-set
specifies an output SAS data set containing the parameter estimates, the maximized
log likelihood and, if the COVOUT option is specified, the estimated covariance matrix. See the section OUTEST= Data Set on page 1784 for a detailed description of
the contents of the OUTEST= data set. This data set is not created if class variables
are used.
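For example, a PROC LIFEREG statement combining these options might look like the following sketch (the data set and variable names are illustrative):

proc lifereg data=mydata order=data outest=est covout;
   model y*censor(0) = x1 x2;   /* estimates and covariances go to EST */
run;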
BY Statement
BY variables ;
You can specify a BY statement with PROC LIFEREG to obtain separate analyses on
observations in groups defined by the BY variables. When a BY statement appears,
the procedure expects the input data set to be sorted in order of the BY variables.
If your input data set is not sorted in ascending order, use one of the following alternatives:
Sort the data using the SORT procedure with a similar BY statement.
Specify the BY statement option NOTSORTED or DESCENDING in the BY
statement for the LIFEREG procedure. The NOTSORTED option does not
mean that the data are unsorted but rather that the data are arranged in groups
(according to values of the BY variables) and that these groups are not necessarily in alphabetical or increasing numeric order.
Create an index on the BY variables using the DATASETS procedure.
For more information on the BY statement, refer to the discussion in SAS Language
Reference: Concepts. For more information on the DATASETS procedure, refer to
the discussion in the SAS Procedures Guide.
CLASS Statement
CLASS variables ;
Variables that are classification variables rather than quantitative numeric variables
must be listed in the CLASS statement. For each explanatory variable listed in the
CLASS statement, indicator variables are generated for the levels assumed by the
CLASS variable. If you use a CLASS statement, you cannot output parameter estimates to the OUTEST= data set (you can output them to a data set via ODS). If the
CLASS statement is used, it must appear before any of the MODEL statements.
MODEL Statement
<label:> MODEL response<*censor(list)>=independents < / options > ;
<label:> MODEL (lower,upper)=independents < / options > ;
<label:> MODEL events/trials=independents < / options > ;
Multiple MODEL statements can be used with one invocation of the LIFEREG procedure. The optional label is used to label the model estimates in the output SAS data
set.
The first MODEL syntax allows for right censoring. The variable response is possibly
right censored. If the response variable can be right censored, then a second variable,
denoted censor, must appear after the response variable with a list of parenthesized
values, separated by commas or blanks, to indicate censoring. That is, if the censor
variable takes on a value given in the list, the response is a right-censored value;
otherwise, it is an observed value.
The second MODEL syntax specifies two variables, lower and upper, that contain
values of the endpoints of the censoring interval. If the two values are the same (and
not missing), it is assumed that there is no censoring and the actual response value is
observed. If the lower value is missing, then the upper value is used as a left-censored
value. If the upper value is missing, then the lower value is taken as a right-censored
value. If both values are present and the lower value is less than the upper value, it
is assumed that the values specify a censoring interval. If the lower value is greater
than the upper value or both values are missing, then the observation is not used in
the analysis although predicted values can still be obtained if none of the covariates
are missing. The following table summarizes the ways of specifying censoring.
lower         upper         Comparison      Interpretation
not missing   not missing   equal           no censoring
not missing   not missing   lower < upper   censoring interval
missing       not missing                   upper used as left-censored value
not missing   missing                       lower used as right-censored value
not missing   not missing   lower > upper   observation not used
missing       missing                       observation not used
The third MODEL syntax specifies two variables that contain count data for a binary
response. The value of the first variable, events, is the number of successes. The
value of the second variable, trials, is the number of tries. The values of both events
and (trials-events) must be nonnegative, and trials must be positive for the response
to be valid. The values of the two variables do not need to be integers and are not
modified to be integers.
The variables following the equal sign are the covariates in the model. No higher
order effects, such as interactions, are allowed in the covariables list; only variable
names are allowed to appear in this list. However, a class variable can be used as a
main effect, and indicator variables are generated for the class levels. If you do not
specify any covariates following the equal sign, an intercept-only model is fit.
Examples of three valid MODEL statements are
a: model time*flag(1,3)=temp;
b: model (start, finish)=;
c: model r/n=dose;
Model statement a indicates that the response is contained in a variable named time
and that, if the variable flag takes on the values 1 or 3, the observation is right censored. The explanatory variable is temp, which could be a class variable. Model
statement b indicates that the response is known to be in the interval between the
values of the variables start and finish and that there are no covariates except for a
default intercept term. Model statement c indicates a binary response, with the variable r containing the number of responses and the variable n containing the number
of trials.
The following options can appear in the MODEL statement.
Task                                                Option
Model specification
   specify the distribution type                    DISTRIBUTION=
   suppress the log transformation of the response  NOLOG
   set the initial intercept value                  INTERCEPT=
   hold the intercept term fixed                    NOINT
   set initial values for the regression parameters INITIAL=
   set the initial scale value                      SCALE=
   hold the scale parameter fixed                   NOSCALE
   set the initial first shape value                SHAPE1=
   hold the first shape parameter fixed             NOSHAPE1
Model fitting
   set convergence criterion                        CONVERGE=
   set maximum iterations                           MAXITER=
   set tolerance for testing singularity            SINGULAR=
Output
   display estimated correlation matrix             CORRB
   display estimated covariance matrix              COVB
   display iteration history, final gradient,
      and second derivative matrix                  ITPRINT
CONVERGE=value
sets the convergence criterion. Convergence is declared when the maximum change
in the parameter estimates between Newton-Raphson steps is less than the value specified. The change is a relative change if the parameter is greater than 0.01 in absolute
value; otherwise, it is an absolute change. By default, CONVERGE=0.001.
CONVG=number
sets the relative Hessian convergence criterion. The value of number must be between 0 and 1. After convergence is determined with the change-in-parameter criterion specified with the CONVERGE= option, the quantity

tc = g′H⁻¹g / |f|

is computed and compared to number, where g is the gradient vector, H is the Hessian matrix for the model parameters, and f is the log-likelihood function. If tc is greater than number, a warning that the relative Hessian convergence criterion has been exceeded is printed. This criterion detects the occasional case where the change-in-parameter convergence criterion is satisfied, but a maximum of the log-likelihood function has not been attained. By default, CONVG=1E−4.
CORRB
prints the estimated correlation matrix of the parameter estimates.

COVB
prints the estimated covariance matrix of the parameter estimates.

DISTRIBUTION=distribution-type
specifies the distribution type assumed for the failure time. By default, PROC LIFEREG fits a type 1 extreme value distribution to the log of the response. This
is equivalent to fitting the Weibull distribution, since the scale parameter for the extreme value distribution is related to a Weibull shape parameter and the intercept
is related to the Weibull scale parameter in this case. When the NOLOG option is
specified, PROC LIFEREG models the untransformed response with a type 1 extreme value distribution as the default. See the section Supported Distributions on
page 1780 for descriptions of the distributions. The following are valid values for
distribution-type:
EXPONENTIAL   the exponential distribution, which is treated as a restricted Weibull distribution
GAMMA         a generalized gamma distribution
LLOGISTIC     a loglogistic distribution
LNORMAL       a lognormal distribution
LOGISTIC      a logistic distribution
NORMAL        a normal distribution
WEIBULL       a Weibull distribution
By default, PROC LIFEREG transforms the response with the natural logarithm
before fitting the specified model when you specify the GAMMA, LLOGISTIC,
LNORMAL, or WEIBULL option. You can suppress the log transformation with
the NOLOG option. The following table summarizes the resulting distributions when
the distribution options above are used in combination with the NOLOG option.
DISTRIBUTION=   NOLOG specified?   Resulting distribution
EXPONENTIAL     No                 Exponential
EXPONENTIAL     Yes                One-parameter extreme value
GAMMA           No                 Generalized gamma
GAMMA           Yes                Generalized gamma with untransformed responses
LOGISTIC        No                 Logistic
LOGISTIC        Yes                Logistic (NOLOG has no effect)
LLOGISTIC       No                 Log-logistic
LLOGISTIC       Yes                Logistic
LNORMAL         No                 Lognormal
LNORMAL         Yes                Normal
NORMAL          No                 Normal
NORMAL          Yes                Normal (NOLOG has no effect)
WEIBULL         No                 Weibull
WEIBULL         Yes                Extreme value
INITIAL=values
sets initial values for the regression parameters. This option can be helpful in the case
of convergence difficulty. Specified values are used to initialize the regression coefficients for the covariates specified in the MODEL statement. The intercept parameter
is initialized with the INTERCEPT= option and is not included here. The values are
assigned to the variables in the MODEL statement in the same order in which they
are listed in the MODEL statement. Note that a class variable requires k − 1 values
when the class variable takes on k different levels. The order of the class levels is determined by the ORDER= option. If there is no intercept term, the first class variable
requires k initial values. If a BY statement is used, all class variables must take on
the same number of levels in each BY group or no meaningful initial values can be
specified. The INITIAL option can be specified as follows.
Type of List               Specification
list separated by blanks   initial=3 4 5
list separated by commas   initial=3,4,5
x to y                     initial=3 to 5
x to y by z                initial=3 to 5 by 1
combination of methods     initial=1,3 to 5,9
By default, PROC LIFEREG computes initial estimates with ordinary least squares.
See the section Computational Method on page 1778 for details.
INTERCEPT=value
initializes the intercept term to value. By default, the intercept is initialized by an ordinary least squares estimate.

ITPRINT
displays the iteration history, the final evaluation of the gradient, and the final evaluation of the negative of the second derivative matrix, that is, the negative of the Hessian.
MAXITER=value
sets the maximum allowable number of iterations during the model estimation. By
default, MAXITER=50.
NOINT
holds the intercept term fixed. Because of the usual log transformation of the response, the intercept parameter is usually a scale parameter for the untransformed
response, or a location parameter for a transformed response.
NOLOG
requests that no log transformation of the response variable be performed. By default, PROC LIFEREG models the log of the response variable for the GAMMA, LLOGISTIC, LNORMAL, and WEIBULL distribution options.
NOSCALE
holds the scale parameter fixed. Note that if the log transformation has been applied
to the response, the effect of the scale parameter is a power transformation of the
original response. If no SCALE= value is specified, the scale parameter is fixed at
the value 1.
NOSHAPE1
holds the first shape parameter, SHAPE1, fixed. If no SHAPE1= value is specified, SHAPE1 is fixed at a value that depends on the DISTRIBUTION type.
SCALE=value
initializes the scale parameter to value. If the Weibull distribution is specified, this
scale parameter is the scale parameter of the type 1 extreme value distribution, not the
Weibull scale parameter. Note that, with a log transformation, the exponential model
is the same as a Weibull model with the scale parameter fixed at the value 1.
SHAPE1=value
initializes the first shape parameter to value. If the specified distribution does not
depend on this parameter, then this option has no effect. The only distribution that
depends on this shape parameter is the generalized gamma distribution. See the
Supported Distributions section on page 1780 for descriptions of the parameterizations of the distributions.
SINGULAR=value
sets the tolerance for testing singularity of the information matrix and the crossproducts matrix for the initial least-squares estimates. Roughly, the test requires that a pivot be at least this number times the original diagonal value. By default, SINGULAR=1E−12.
OUTPUT Statement
OUTPUT <OUT=SAS-data-set> keyword=name <...keyword=name> ;
The OUTPUT statement creates a new SAS data set containing statistics calculated
after fitting the model. At least one specification of the form keyword=name is required.
All variables in the original data set are included in the new data set, along with the
variables created as options to the OUTPUT statement. These new variables contain
fitted values and estimated quantiles. If you want to create a permanent SAS data set,
you must specify a two-level name (refer to SAS Language Reference: Concepts for
more information on permanent SAS data sets). Each OUTPUT statement applies to
the preceding MODEL statement. See Example 36.1 for illustrations of the OUTPUT
statement.
The following specifications can appear in the OUTPUT statement:
OUT=SAS-data-set specifies the new data set. By default, the procedure uses the
DATAn convention to name the new data set.
keyword=name
specifies the statistics to include in the output data set and gives
names to the new variables. Specify a keyword for each desired
statistic (see the following list of keywords), an equal sign, and
the variable to contain the statistic.
The keywords allowed and the statistics they represent are as follows:
CENSORED      specifies an indicator variable to signal censoring. The variable takes the value 1 if the observation is censored and 0 otherwise; if the censoring variable is missing, this variable is also set to missing.

CDF           specifies a variable to contain the estimates of the cumulative distribution function evaluated at the observed response. See the Predicted Values section on page 1783 for more information.

CONTROL       specifies a variable in the input data set that controls the estimation of quantiles. If the specified variable has the value 1 for an observation, estimates are computed for that observation; otherwise, no estimates are computed.

PREDICTED | P specifies a variable to contain the quantile estimates. If the response variable in the corresponding model statement is binomial, then this variable contains the estimated probabilities, 1 − F(−x′b).
QUANTILES | QUANTILE | Q gives a list of values for which quantiles are calculated. The values must be between 0 and 1, noninclusive. For each value, a corresponding quantile is estimated. This option is not used if the response variable in the corresponding MODEL statement is binomial. The QUANTILES option can be specified as follows.

Type of List               Specification
list separated by blanks   .2 .4 .6 .8
list separated by commas   .2,.4,.6,.8
x to y                     .2 to .8
x to y by z                .2 to .8 by .1
combination of methods     .1,.2 to .8 by .2
By default, QUANTILES=0.5. When the response is not binomial, a numeric variable, _PROB_, is added to the OUTPUT data set whenever the QUANTILES= option is specified. The variable _PROB_ gives the probability value for the quantile estimates. These are the values taken from the QUANTILES= list and are given as values between 0 and 1, not as values between 0 and 100.

STD_ERR | STD specifies a variable to contain the estimates of the standard errors of the estimated quantiles or x′b. If the response used in the MODEL statement is a binomial response, then these are the standard errors of x′b. Otherwise, they are the standard errors of the
quantile estimates. These estimates can be used to compute confidence intervals for the quantiles. However, if the model is fit to
the log of the event time, better confidence intervals can usually
be computed by transforming the confidence intervals for the log
response. See Example 36.1 for such a transformation.
XBETA         specifies a variable to contain the computed value of x′b, where x is the covariate vector and b is the vector of parameter estimates.
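Putting these keywords together, an OUTPUT statement might look like the following sketch (all data set and variable names are illustrative):

proc lifereg data=mydata;
   model t*flag(1) = x1 x2;
   output out=pred p=median std=se cdf=prob xbeta=lin
          quantiles=.1 .5 .9;   /* _PROB_ identifies each quantile row */
run;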
WEIGHT Statement
WEIGHT variable ;
If you want to use weights for each observation in the input data set, place the weights
in a variable in the data set and specify the name in a WEIGHT statement. The values
of the WEIGHT variable can be nonintegral and are not truncated. Observations with
nonpositive or missing values for the weight variable do not contribute to the fit of
the model. The WEIGHT variable multiplies the contribution to the log likelihood
for each observation.
Details
Missing Values
Any observation with missing values for the dependent variable is not used in the
model estimation unless it is one and only one of the values in an interval specification. Also, if one of the explanatory variables or the censoring variable is missing, the
observation is not used. For any observation to be used in the estimation of a model,
only the variables needed in that model have to be nonmissing. Predicted values are
computed for all observations with no missing explanatory variable values. If the
censoring variable is missing, the CENSORED= variable in the OUT= SAS data set
is also missing.
Main Effects
Unlike the GLM procedure, only main effect terms are allowed in the model specification. For numeric variables, this is a linear term equal to the value of the variable unless the variable appears in the CLASS statement. For variables listed in the
CLASS statement, PROC LIFEREG creates indicator variables (variables taking the
values zero or one) for every level of the variable except the last level. If there is
no intercept term, the first class variable has indicator variables created for all levels
including the last level. The levels are ordered according to the ORDER= option.
Estimates of a main effect depend upon other effects in the model and, therefore, are
adjusted for the presence of other effects in the model.
Computational Method
By default, the LIFEREG procedure computes initial values for the parameters using ordinary least squares (OLS) and ignoring censoring. This might not be the best set of starting values for a given set of data. For example, if there are extreme values in your data, the OLS fit may be excessively influenced by the extreme observations, causing an overflow or convergence problems. See Example 36.3 for one way to deal with convergence problems.

You can specify the INITIAL= option in the MODEL statement to override these starting values. You can also specify the INTERCEPT=, SCALE=, and SHAPE1= options to set initial values of the intercept, scale, and shape parameters.
The rank of the design matrix X is estimated before the model is fit. Columns of X
that are judged linearly dependent on other columns have the corresponding parameters set to zero. The test for linear dependence is controlled by the SINGULAR=
option in the MODEL statement. Variables are included in the model in the order in
which they are listed in the MODEL statement with the nonclass variables included
in the model before any class variables.
The log-likelihood function is maximized by means of a ridge-stabilized Newton-Raphson algorithm. The maximized value of the log-likelihood can take positive or
negative values, depending on the specified model and the values of the maximum
likelihood estimates of the model parameters.
A composite chi-square test statistic is computed for each class variable, testing
whether there is any effect from any of the levels of the variable. This statistic is
computed as a quadratic form in the appropriate parameter estimates using the corresponding submatrix of the asymptotic covariance matrix estimate. The asymptotic
covariance matrix is computed as the inverse of the observed information matrix.
Note that if the NOINT option is specified and class variables are used, the first class
variable contains a contribution from an intercept term.
Model Specifications

Suppose there are n observations from the model y = Xβ + σε, where X is an n × k matrix of covariate values (including the intercept), y is a vector of responses, and ε is a vector of errors with survival distribution function S, cumulative distribution function F, and probability density function f. That is, S(t) = Pr(εᵢ > t), F(t) = Pr(εᵢ ≤ t), and f(t) = dF(t)/dt, where εᵢ is a component of the error vector. Then, if all the responses are observed, the log likelihood, L, can be written as

L = Σ log f(wᵢ)

where wᵢ = (1/σ)(yᵢ − x′ᵢβ).
If some of the responses are left, right, or interval censored, the log likelihood can be written as

L = Σ log f(wᵢ) + Σ log S(wᵢ) + Σ log F(wᵢ) + Σ log(F(wᵢ) − F(vᵢ))

with the first sum over uncensored observations, the second sum over right-censored observations, the third sum over left-censored observations, the last sum over interval-censored observations, and

vᵢ = (1/σ)(zᵢ − x′ᵢβ)

where zᵢ is the lower end of a censoring interval.
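As a concrete illustration, the following sketch evaluates these contributions for the extreme value baseline, for which f(w) = exp(w − exp(w)) and S(w) = exp(−exp(w)); the numeric values are hypothetical.

data loglik_terms;
   xbeta = 4.0; sigma = 0.5;         /* hypothetical parameter values      */
   /* uncensored observation with response y = log(t) */
   y = log(60);
   w = (y - xbeta)/sigma;
   l_uncens = w - exp(w);            /* log f(w) for the extreme value     */
   /* right-censored observation */
   y = log(120);
   w = (y - xbeta)/sigma;
   l_rcens = -exp(w);                /* log S(w) = -exp(w)                 */
run;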
If the response is specified in the binomial format, events/trials, then the log-likelihood function is

L = Σ [rᵢ log(Pᵢ) + (nᵢ − rᵢ) log(1 − Pᵢ)]

where rᵢ is the number of events and nᵢ is the number of trials for the ith observation. In this case, Pᵢ = 1 − F(−x′ᵢβ). For the symmetric distributions, logistic and normal, this is the same as F(x′ᵢβ). Additional information on censored and limited dependent variable models can be found in Kalbfleisch and Prentice (1980) and Maddala (1983).
The estimated covariance matrix of the parameter estimates is computed as the negative inverse of I, which is the information matrix of second derivatives of L with
respect to the parameters evaluated at the final parameter estimates. If I is not positive definite, a positive definite submatrix of I is inverted, and the remaining rows and
columns of the inverse are set to zero. If some of the parameters, such as the scale
and intercept, are restricted, the corresponding elements of the estimated covariance
matrix are set to zero. The standard error estimates for the parameter estimates are
taken as the square roots of the corresponding diagonal elements.
For restrictions placed on the intercept, scale, and shape parameters, one-degree-of-freedom Lagrange multiplier test statistics are computed. These statistics are computed as

χ² = g²/V

where g is the derivative of the log likelihood with respect to the restricted parameter at the restricted maximum and V is the estimated variance of g, also evaluated at the restricted maximum. These statistics are asymptotically distributed as chi-squares with one degree of freedom under the null hypothesis that the restrictions are valid, provided that some regularity conditions are satisfied. See Rao (1973, p. 418) for a more complete discussion. It is possible for these statistics to be missing if the observed information matrix is not positive definite. Higher degree-of-freedom tests for multiple restrictions are not currently computed.
When a parameter is held fixed (for example, when the scale is fixed at one), a Lagrange multiplier test statistic is computed to test this constraint. Notice that this test statistic is comparable to the Wald test statistic for testing that the scale is one. The Wald statistic is the result of squaring the difference of the estimate of the scale parameter from one and dividing this by the square of its estimated standard error.
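For example, the Wald statistic for the constraint that the scale equals one can be computed directly from displayed estimates; the following sketch uses the scale estimate and standard error from the lognormal fit shown later in Output 36.1.3.

data wald;
   scale = 0.60403;  se = 0.11073;   /* estimate and standard error */
   wald  = ((scale - 1)/se)**2;      /* Wald chi-square, about 12.79 */
   p     = 1 - probchi(wald, 1);     /* p-value with 1 df            */
run;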
Supported Distributions

For each distribution, the baseline survival distribution function (S) and the probability density function (f) are listed for the additive random disturbance. These distributions apply when the log of the response is modeled (this is the default analysis). The corresponding survival distribution function (G) and its density function (g) are given for the untransformed baseline distribution. For example, for the WEIBULL distribution, S(w) and f(w) are the baseline survival distribution function and the probability density function for the extreme value distribution (the log of the response), while G(t) and g(t) are the survival distribution function and probability density function of a Weibull distribution (using the untransformed response).

The chosen baseline functions define the meaning of the intercept, scale, and shape parameters. Only the gamma distribution has a free shape parameter in the following parameterizations. Notice that some of the distributions do not have mean zero and that σ is not, in general, the standard deviation of the baseline distribution.

Additionally, it is worth mentioning that, for the Weibull distribution, the accelerated failure time model is also a proportional-hazards model. However, the parameterization for the covariates differs by a multiple of the scale parameter from the parameterization commonly used for the proportional hazards model.
The distributions supported in the LIFEREG procedure follow. In the displayed output, μ = Intercept and σ = Scale.

Exponential

S(w) = exp(−exp(w − μ))
f(w) = exp(w − μ) exp(−exp(w − μ))
G(t) = exp(−αt)
g(t) = α exp(−αt)

where exp(−μ) = α.

Generalized Gamma (with μ = 0, σ = 1)

S(w) = Γ(δ⁻², δ⁻² exp(δw)) / Γ(δ⁻²)       if δ > 0
S(w) = 1 − Γ(δ⁻², δ⁻² exp(δw)) / Γ(δ⁻²)   if δ < 0

f(w) = |δ| (δ⁻² exp(δw))^(δ⁻²) exp(−δ⁻² exp(δw)) / Γ(δ⁻²)

G(t) = Γ(δ⁻², δ⁻² t^δ) / Γ(δ⁻²)           if δ > 0
G(t) = 1 − Γ(δ⁻², δ⁻² t^δ) / Γ(δ⁻²)       if δ < 0

g(t) = |δ| (δ⁻² t^δ)^(δ⁻²) exp(−δ⁻² t^δ) / (t Γ(δ⁻²))

where Γ(a) denotes the complete gamma function, Γ(a, z) denotes the incomplete gamma function, and δ is a free shape parameter. The δ parameter is referred to as Shape by PROC LIFEREG. Refer to Lawless (1982, p. 240) and Klein and Moeschberger (1997, p. 386) for a description of the generalized gamma distribution.

Loglogistic

S(w) = [1 + exp((w − μ)/σ)]⁻¹

f(w) = exp((w − μ)/σ) / (σ[1 + exp((w − μ)/σ)]²)

G(t) = 1 / (1 + αt^γ)

g(t) = αγt^(γ−1) / (1 + αt^γ)²

where γ = 1/σ and α = exp(−μ/σ).

Lognormal

S(w) = 1 − Φ((w − μ)/σ)

f(w) = (1/(σ√(2π))) exp(−(1/2)((w − μ)/σ)²)

G(t) = 1 − Φ((log(t) − μ)/σ)

g(t) = (1/(σt√(2π))) exp(−(1/2)((log(t) − μ)/σ)²)

where Φ denotes the cumulative distribution function of the standard normal distribution.

Logistic

S(w) = [1 + exp((w − μ)/σ)]⁻¹

f(w) = exp((w − μ)/σ) / (σ[1 + exp((w − μ)/σ)]²)

Because the response is not log transformed for this distribution, the baseline functions apply to the response directly.

Normal

S(w) = 1 − Φ((w − μ)/σ)

f(w) = (1/(σ√(2π))) exp(−(1/2)((w − μ)/σ)²)

As with the logistic, the response is not log transformed, so the baseline functions apply to the response directly.

Weibull

S(w) = exp(−exp((w − μ)/σ))

f(w) = (1/σ) exp((w − μ)/σ) exp(−exp((w − μ)/σ))

G(t) = exp(−αt^γ)

g(t) = γαt^(γ−1) exp(−αt^γ)

where γ = 1/σ and α = exp(−μ/σ).

If your parameterization is different from the ones shown here, you can still use the procedure to fit your model. For example, a common parameterization for the Weibull distribution is

g(t; λ, τ) = (τ/λ)(t/λ)^(τ−1) exp(−(t/λ)^τ)

G(t; λ, τ) = exp(−(t/λ)^τ)

so that λ = exp(μ) and τ = 1/σ.
Again note that the expected value of the baseline log response is, in general, not zero and that the distributions are not symmetric in all cases. Thus, for a given set of covariates, x, the expected value of the log response is not always x′β.

Some relations among the distributions are as follows:
- The gamma with Shape=1 is a Weibull distribution.
- The gamma with Shape=0 is a lognormal distribution.
- The Weibull with Scale=1 is an exponential distribution.
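For instance, converting the headache-data Weibull fit in Figure 36.3 to the common parameterization above takes one line of arithmetic (estimates copied from the figure):

data weibull_parms;
   mu = 3.30912; sigma = 0.21219;   /* Intercept and Scale from Figure 36.3 */
   lambda = exp(mu);                /* common Weibull scale, exp(mu)        */
   tau    = 1/sigma;                /* common Weibull shape, 1/sigma = 4.7128 */
run;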
Predicted Values
For a given set of covariates, x (including the intercept term), the pth quantile of the log response, yₚ, is given by

yₚ = x′β + σwₚ

where wₚ is the pth quantile of the baseline distribution. The estimated quantile is computed by replacing the unknown parameters with their estimates, including any shape parameters on which the baseline distribution might depend. The estimated quantile of the original response is obtained by taking the exponential of the estimated log quantile unless the NOLOG option is specified in the preceding MODEL statement, as the sketch below illustrates.
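For the extreme value baseline used with the Weibull distribution, wₚ = log(−log(1 − p)). The following sketch computes the estimated median time to relief for group=1 of the headache data from the estimates in Figure 36.3:

data median;
   p  = 0.5;
   wp = log(-log(1-p));                  /* extreme value baseline quantile */
   yp = 3.30912 - 0.19330 + 0.21219*wp;  /* x'b + sigma*wp for group=1      */
   t50 = exp(yp);                        /* median, about 20.9 minutes      */
run;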
The standard errors of the quantile estimates are computed using the estimated covariance matrix of the parameter estimates and a Taylor series expansion of the quantile estimate. The standard error is computed as

STD = √(z′Vz)

where V is the estimated covariance matrix of the parameter estimates (β′, σ, θ′)′ and z is the vector

z = [x′, ŵₚ, σ̂ ∂ŵₚ/∂θ]′

where θ is the vector of the shape parameters. Unless the NOLOG option is specified, this standard error estimate is converted into a standard error estimate for exp(yₚ) as exp(ŷₚ)STD. It may be more desirable to compute confidence limits for the log response and convert them back to the original response variable than to use the standard error estimates for exp(yₚ) directly. See Example 36.1 for a 90% confidence interval of the response constructed by exponentiating a confidence interval for the log response.
The variable, CDF, is computed as

CDFᵢ = F(wᵢ)

where F is the baseline cumulative distribution function and the residual

wᵢ = (yᵢ − x′ᵢb) / σ̂

is computed using the final parameter estimates.
OUTEST= Data Set

The OUTEST= data set contains parameter estimates and the log likelihood for each model. If the COVOUT option is specified, it also contains observations with the rows of the estimated covariance matrix. The OUTEST= data set contains the following variables:

_MODEL_     a character variable containing the label of the MODEL statement, if a label is specified, or blank otherwise

_NAME_      a character variable of length 8 containing the name of the dependent variable for the parameter estimates observations or the name of the row for the covariance matrix estimates

_TYPE_      a character variable of length 8 containing the type of the observation, either PARMS for parameter estimates or COV for covariance estimates

_DIST_      a character variable containing the name of the distribution modeled

_LNLIKE_    a numeric variable containing the last computed value of the log likelihood

INTERCEPT   a numeric variable containing the intercept parameter estimate

_SCALE_     a numeric variable containing the scale parameter estimate

_SHAPE1_    a numeric variable containing the first shape parameter estimate, if the specified distribution has a shape parameter

Any BY variables specified are also added to the OUTEST= data set.
Computational Resources
Let p be the number of parameters estimated in the model. The minimum working
space (in bytes) needed is
16p² + 100p
However, if sufficient space is available, the input data set is also kept in memory;
otherwise, the input data set is reread for each evaluation of the likelihood function
and its derivatives, with the resulting execution time of the procedure substantially
increased.
Let n be the number of observations used in the model estimation. Each evaluation of the likelihood function and its first and second derivatives requires O(np²) multiplications and additions, n individual function evaluations for the log density or log distribution function, and n evaluations of the first and second derivatives of the function. The calculation of each updating step from the gradient and Hessian requires O(p³) multiplications and additions. The O(v) notation means that, for large values of the argument v, O(v) is approximately a constant times v.
Displayed Output

For each model, PROC LIFEREG displays the model information (the input data set, the dependent variable, the censoring variable and censoring values, the numbers of noncensored and censored observations, the name of the distribution, and the final value of the log likelihood), as illustrated in Figure 36.2.

For each explanatory variable in the model, the LIFEREG procedure displays the degrees of freedom, the parameter estimate, the standard error estimate, a chi-square statistic for testing that the parameter is zero, and the associated p-value, as illustrated in Figure 36.3.

If there are constrained parameters in the model, such as the scale or intercept, then PROC LIFEREG displays a Lagrange multiplier test for the constraint.
ODS Table Names

PROC LIFEREG assigns a name to each table it creates. You can use these names to refer to a table when using the Output Delivery System (ODS) to select tables and create output data sets.

ODS Table Name       Description                              Statement   Option
ClassLevels*         Class variable levels                    CLASS       default
ConvergenceStatus    Convergence status                       MODEL       default
CorrB                Parameter estimate correlation matrix    MODEL       CORRB
CovB                 Parameter estimate covariance matrix     MODEL       COVB
IterHistory          Iteration history                        MODEL       ITPRINT
LagrangeStatistics   Lagrange statistics                      MODEL       NOINT | NOSCALE
LastGradient         Last evaluation of the gradient          MODEL       ITPRINT
LastHessian          Last evaluation of the Hessian           MODEL       ITPRINT
ModelInfo            Model information                        MODEL       default
ParameterEstimates   Parameter estimates                      MODEL       default

* Depends on data.
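For example, the following sketch captures the parameter estimates table from the headache analysis in an output data set, which is useful because OUTEST= is not available when CLASS variables are used:

proc lifereg data=headache;
   class group;
   model minutes*censor(1) = group;
   ods output ParameterEstimates=pe;   /* estimates table as a data set */
run;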
Examples
Example 36.1. Motorette Failure
This example fits a Weibull model and a lognormal model to the example given in
Kalbfleisch and Prentice (1980, p. 5). An output data set called models is specified
to contain the parameter estimates. By default, the natural log of the variable time
is used by the procedure as the response. After this log transformation, the Weibull
model is fit using the extreme value baseline distribution, and the lognormal is fit
using the normal baseline distribution.
Since the extreme value and normal distributions do not contain any shape parameters, the variable _SHAPE1_ is missing in the models data set. An additional output data set, out, is requested that contains the predicted quantiles and their standard errors for values of the covariate corresponding to temp=130 and temp=150. This is
Using the standard error estimates obtained from the output data set, approximate
90% confidence limits for the predicted quantities are then created in a subsequent
DATA step for the log response. The logs of the predicted values are obtained because
the values of the P= variable in the OUT= data set are in the same units as the original
response variable, time. The standard errors of the quantiles of the log(time) are
approximated (using a Taylor series approximation) by the standard deviation of time
divided by the mean value of time. These confidence limits are then converted back
to the original scale by the exponential function. The following statements produce
Output 36.1.1 through Output 36.1.5.
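The statements as published are not preserved here; the following is a sketch consistent with the description above and with the outputs that follow (model labels a and b, covariate z = 1000/(temp + 273.2), and approximate 90% limits using the normal critical value 1.64):

proc print data=motors;   /* produces Output 36.1.1 */
run;

proc lifereg data=motors outest=models covout;
   a: model time*censor(0) = z;
   b: model time*censor(0) = z / distribution=lnormal;
   output out=out quantiles=.1 .5 .9 std=std p=predtime control=control;
run;

data out1;
   set out;
   ltime = log(predtime);
   stde  = std/predtime;             /* std error of the log quantile   */
   upper = exp(ltime + 1.64*stde);   /* approximate 90% confidence      */
   lower = exp(ltime - 1.64*stde);   /* limits for the quantile         */
run;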
Output 36.1.1. Motorette Failure Data

Obs    time   censor   temp   control      z
  1      .      0      130       1      2.48016
  2      .      0      150       1      2.36295
  3    1764     1      170       0      2.25632
  4    2772     1      170       0      2.25632
  5    3444     1      170       0      2.25632
  6    3542     1      170       0      2.25632
  7    3780     1      170       0      2.25632
  8    4860     1      170       0      2.25632
  9    5196     1      170       0      2.25632
 10    5448     0      170       0      2.25632
 11    5448     0      170       0      2.25632
 12    5448     0      170       0      2.25632
 13     408     1      190       0      2.15889
 14     408     1      190       0      2.15889
 15    1344     1      190       0      2.15889
 16    1344     1      190       0      2.15889
 17    1440     1      190       0      2.15889
 18    1680     0      190       0      2.15889
 19    1680     0      190       0      2.15889
 20    1680     0      190       0      2.15889
 21    1680     0      190       0      2.15889
 22    1680     0      190       0      2.15889
 23     408     1      220       0      2.02758
 24     408     1      220       0      2.02758
 25     504     1      220       0      2.02758
 26     504     1      220       0      2.02758
 27     504     1      220       0      2.02758
 28     528     0      220       0      2.02758
 29     528     0      220       0      2.02758
 30     528     0      220       0      2.02758
 31     528     0      220       0      2.02758
 32     528     0      220       0      2.02758

Output 36.1.2. Weibull Model Fit
                       Model Information
                   Data Set                   WORK.MOTORS
                   Dependent Variable         Log(time)
                   Censoring Variable         censor
                   Censoring Value(s)         0
                   Number of Observations     30
                   Noncensored Values         17
                   Right Censored Values      13
                   Left Censored Values       0
                   Interval Censored Values   0
                   Name of Distribution       WEIBULL
                   Log Likelihood             -22.95148315

                                 Standard
 Variable    DF   Estimate         Error   Chi-Square   Pr > ChiSq   Label
 Intercept    1  -11.89122       1.96551      36.6019       <.0001   Intercept
 z            1    9.03834       0.90599      99.5239       <.0001
 Scale        1    0.36128       0.07950                             Extreme value scale
Output 36.1.3. Lognormal Model Fit

                       Model Information
                   Data Set                   WORK.MOTORS
                   Dependent Variable         Log(time)
                   Censoring Variable         censor
                   Censoring Value(s)         0
                   Number of Observations     30
                   Noncensored Values         17
                   Right Censored Values      13
                   Left Censored Values       0
                   Interval Censored Values   0
                   Name of Distribution       LNORMAL
                   Log Likelihood             -24.47381031

                                 Standard
 Variable    DF   Estimate         Error   Chi-Square   Pr > ChiSq   Label
 Intercept    1  -10.47056       2.77192      14.2685       0.0002   Intercept
 z            1    8.32208       1.28412      42.0001       <.0001
 Scale        1    0.60403       0.11073                             Normal scale

Output 36.1.4. The OUTEST= Data Set models
_MODEL_  _NAME_     _TYPE_  _DIST_   _STATUS_      _LNLIKE_  Intercept      time         z    _SCALE_  _SHAPE1_
A        time       PARMS   WEIBULL  0 Converged   -22.9515   -11.8912   -1.0000   9.03834   0.36128      .
A        Intercept  COV     WEIBULL  0 Converged   -22.9515     3.8632  -11.8912  -1.77878   0.03448      .
A        z          COV     WEIBULL  0 Converged   -22.9515    -1.7788    9.0383   0.82082  -0.01488      .
A        Scale      COV     WEIBULL  0 Converged   -22.9515     0.0345    0.3613  -0.01488   0.00632      .
B        time       PARMS   LNORMAL  0 Converged   -24.4738   -10.4706   -1.0000   8.32208   0.60403      .
B        Intercept  COV     LNORMAL  0 Converged   -24.4738     7.6835  -10.4706  -3.55566   0.03267      .
B        z          COV     LNORMAL  0 Converged   -24.4738    -3.5557    8.3221   1.64897  -0.01285      .
B        Scale      COV     LNORMAL  0 Converged   -24.4738     0.0327    0.6040  -0.01285   0.01226      .

Output 36.1.5. Predicted Quantiles and 90% Confidence Limits
temp   time   censor   control      z     _PROB_   PREDTIME       STD     ltime     stde      upper      lower
 130     .      0         1     2.48016    0.1     12033.19   5482.34    9.3954   0.45560   25402.68    5700.09
 130     .      0         1     2.48016    0.5     26095.68  11359.45   10.1695   0.43530   53285.36   12779.95
 130     .      0         1     2.48016    0.9     56592.19  26036.90   10.9436   0.46008  120349.65   26611.42
 150     .      0         1     2.36295    0.1      4536.88   1443.07    8.4200   0.31808    7643.71    2692.83
 150     .      0         1     2.36295    0.5      9838.86   2901.15    9.1941   0.29487   15957.38    6066.36
 150     .      0         1     2.36295    0.9     21336.97   7172.34    9.9682   0.33615   37029.72   12294.62
Example 36.2. Computing Predicted Values for a Tobit Model

The Tobit model can be fit with PROC LIFEREG by treating the limit responses as left censored. Note, however, that if responses below a threshold C were truncated (omitted from the data entirely) rather than censored, the probability density function of an observed response Y′ would be

f_{Y′}(y) = f_Y(y) / Pr(Y > C)   for y > C

where f_Y(y) is the probability density function of Y. PROC LIFEREG cannot compute the proper likelihood function to estimate parameters or predicted values for a truncated distribution.
Suppose the model being fit is specified as follows:

Yᵢ* = x′ᵢβ + εᵢ

where εᵢ is a normal error term with zero mean and standard deviation σ.

Define the censored random variable Yᵢ as

Yᵢ = 0     if Yᵢ* ≤ 0
Yᵢ = Yᵢ*   if Yᵢ* > 0

This is the Tobit model for left-censored normal data. Yᵢ* is sometimes called the latent variable. PROC LIFEREG estimates parameters of the distribution of Yᵢ* by maximum likelihood.

You can use the LIFEREG procedure to compute predicted values based on the mean functions of the latent and observed variables. The mean of the latent variable Yᵢ* is x′ᵢβ, and you can compute values of the mean for different settings of xᵢ by specifying XBETA=variable-name in an OUTPUT statement. Estimates of x′ᵢβ for each observation will be written to the OUT= data set. Predicted values of the observed variable Yᵢ can be computed based on the mean

E(Yᵢ) = Φ(x′ᵢβ/σ)(x′ᵢβ + σλᵢ)

where

λᵢ = φ(x′ᵢβ/σ) / Φ(x′ᵢβ/σ)

and φ and Φ represent the normal probability density and cumulative distribution functions, respectively.
The following table shows a subset of the Mroz (1987) data set. In this data, Hours is the number of hours the wife worked outside the household in a given year, Yrs_Ed is the years of education, and Yrs_Exp is the years of work experience. A Tobit model will be fit to the hours worked with years of education and experience as covariates.

Hours   Yrs_Ed   Yrs_Exp
    0      8         9
    0      8        12
    0      9        10
    0     10        15
    0     11         4
    0     11         6
 1000     12         1
 1960     12        29
    0     13         3
 2100     13        36
 3686     14        11
 1920     14        38
    0     15        14
 1728     16         3
 1568     16        19
 1316     17         7
    0     17        15
If the wife was not employed (worked 0 hours), her hours worked will be left censored
at zero. In order to accommodate left censoring in PROC LIFEREG, you need two
variables to indicate censoring status of observations. You can think of these variables
as lower and upper endpoints of interval censoring. If there is no censoring, set both
variables to the observed value of Hours. To indicate left censoring, set the lower
endpoint to missing and the upper endpoint to the censored value, zero in this case.
The following statements create a SAS data set with the variables Hours, Yrs_Ed, and Yrs_Exp from the data above. A new variable, Lower, is created such that Lower=. if Hours=0 and Lower=Hours if Hours>0.
data subset;
input Hours Yrs_Ed Yrs_Exp @@;
if Hours eq 0
then Lower=.;
else Lower=Hours;
datalines;
0 8 9 0 8 12 0 9 10 0 10 15 0 11 4 0 11 6
1000 12 1 1960 12 29 0 13 3 2100 13 36
3686 14 11 1920 14 38 0 15 14 1728 16 3
1568 16 19 1316 17 7 0 17 15
;
The following statements fit a normal regression model to the left-censored Hours data, using Yrs_Ed and Yrs_Exp as covariates. You will need the estimated standard deviation of the normal distribution to compute the predicted values of the censored distribution from the formulas above. The data set OUTEST contains the standard deviation estimate in a variable named _SCALE_. You also need estimates of x′ᵢβ. These are contained in the data set OUT as the variable Xbeta.
proc lifereg data=subset outest=OUTEST(keep=_scale_);
model (lower, hours) = yrs_ed yrs_exp / d=normal;
output out=OUT xbeta=Xbeta;
run;
Output 36.2.1 shows the results of the model fit. These tables show parameter estimates for the uncensored, or latent variable, distribution.
Output 36.2.1. Parameter Estimates for the Latent Variable Distribution

                       Model Information
                   Data Set                   WORK.SUBSET
                   Dependent Variable         Lower
                   Dependent Variable         Hours
                   Number of Observations     17
                   Noncensored Values         8
                   Right Censored Values      0
                   Left Censored Values       9
                   Interval Censored Values   0
                   Name of Distribution       NORMAL
                   Log Likelihood             -74.9369977

                                 Standard
 Variable    DF   Estimate         Error   Chi-Square   Pr > ChiSq   Label
 Intercept    1    -5598.6        2850.2       3.8583       0.0495   Intercept
 Yrs_Ed       1  373.14771     191.88717       3.7815       0.0518
 Yrs_Exp      1   63.33711      38.36317       2.7258       0.0987
 Scale        1     1582.9     442.67318                             Normal scale
The following statements combine the two data sets created by PROC LIFEREG
to compute predicted values for the censored distribution. The OUTEST= data set
contains the estimate of the standard deviation from the uncensored distribution, and
the OUT= data set contains estimates of x′ᵢβ.
data predict;
   drop lambda _scale_ _prob_;
   set out;
   if _n_ eq 1 then set outest;
   lambda = pdf('NORMAL', Xbeta/_scale_)
            / cdf('NORMAL', Xbeta/_scale_);
   Predict = cdf('NORMAL', Xbeta/_scale_)
             * (Xbeta + _scale_*lambda);
   label Xbeta   = 'MEAN OF UNCENSORED VARIABLE'
         Predict = 'MEAN OF CENSORED VARIABLE';
run;
proc print data=predict noobs label;
var hours lower yrs: xbeta predict;
run;
Output 36.2.2 shows the original variables, the predicted means of the uncensored
distribution, and the predicted means of the censored distribution.
Output 36.2.2. Predicted Means of the Uncensored and Censored Distributions

                                     MEAN OF       MEAN OF
                                    UNCENSORED    CENSORED
Hours   Lower   Yrs_Ed   Yrs_Exp     VARIABLE     VARIABLE
    0     .        8        9        -2043.42        73.46
    0     .        8       12        -1853.41        94.23
    0     .        9       10        -1606.94       128.10
    0     .       10       15         -917.10       276.04
    0     .       11        4        -1240.67       195.76
    0     .       11        6        -1113.99       224.72
 1000   1000      12        1        -1057.53       238.63
 1960   1960      12       29          715.91      1052.94
    0     .       13        3         -557.71       391.42
 2100   2100      13       36         1532.42      1672.50
 3686   3686      14       11          322.14       805.58
 1920   1920      14       38         2032.24      2106.81
    0     .       15       14          885.30      1170.39
 1728   1728      16        3          561.74       951.69
 1568   1568      16       19         1575.13      1708.24
 1316   1316      17        7         1188.23      1395.61
    0     .       17       15         1694.93      1809.97
Example 36.3. Overcoming Convergence Problems by Specifying Initial Values

This example illustrates the use of parameter initial values to overcome convergence difficulties. The data consist of a response x, a censoring indicator censor, and a covariate c1 whose values range from 0.00 to 400.00; the 18 observations are listed in Output 36.3.1. A DATA step consistent with that listing is sketched below, followed by statements that print the data and fit a Weibull model, displaying the iteration history with the ITPRINT option.
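A DATA step consistent with the listing in Output 36.3.1 (the input order is an assumption) is:

data raw;
   input censor x c1 @@;   /* three values per observation */
   datalines;
0 16   0.00   0 17   0.00   0 18   0.00
0 17   0.04   0 18   0.04   0 18   0.04
0 23   0.40   0 22   0.40   0 22   0.40
0 33   4.00   0 34   4.00   0 35   4.00
1 54  40.00   1 54  40.00   1 54  40.00
1 54 400.00   1 54 400.00   1 54 400.00
;
run;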
proc print;
run;
title 'OLS (default) initial values';
proc lifereg data=raw;
model x*censor(1) = c1 / distribution = weibull itprint;
run;
Output 36.3.1. Listing of the Data

censor    x       c1
   0     16     0.00
   0     17     0.00
   0     18     0.00
   0     17     0.04
   0     18     0.04
   0     18     0.04
   0     23     0.40
   0     22     0.40
   0     22     0.40
   0     33     4.00
   0     34     4.00
   0     35     4.00
   1     54    40.00
   1     54    40.00
   1     54    40.00
   1     54   400.00
   1     54   400.00
   1     54   400.00
Convergence was not attained in 50 iterations for this model, as the messages to the
log indicate:
WARNING: Convergence not attained in 50 iterations.
WARNING: The procedure is continuing but the validity of the model
fit is questionable.
The first line (iter=0) of the iteration history table, in Output 36.3.2, shows the default initial ordinary least squares (OLS) estimates of the parameters.

Output 36.3.2. Initial Least Squares Estimates

Iter   Ridge   Loglike       Intercept        c1             Scale
  0      0     -22.891088    3.2324769714     0.0020664542   0.3995754195
The log logistic distribution is more robust to large values of the response than the
Weibull, so one approach to improving the convergence performance is to fit a log
logistic distribution, and if this converges, use the resulting parameter estimates as
initial values in a subsequent fit of a model with the Weibull distribution.
The following statements fit a log logistic distribution to the data.
proc lifereg data=raw;
model x*censor(1) = c1 / distribution = llogistic;
run;
The algorithm converges, and the maximum likelihood estimates for the log logistic distribution are shown in Output 36.3.3.
Output 36.3.3. Log Logistic Model Fit
                       Model Information
                   Data Set                   WORK.RAW
                   Dependent Variable         Log(x)
                   Censoring Variable         censor
                   Censoring Value(s)         1
                   Number of Observations     18
                   Noncensored Values         12
                   Right Censored Values      6
                   Left Censored Values       0
                   Interval Censored Values   0
                   Name of Distribution       LLOGISTC
                   Log Likelihood             12.093136846

                                 Standard
 Variable    DF   Estimate         Error   Chi-Square   Pr > ChiSq   Label
 Intercept    1    2.89828       0.03179    8309.4488       <.0001   Intercept
 c1           1    0.15921       0.01327     143.8537       <.0001
 Scale        1    0.04979       0.01218                             Logistic scale
The following statements re-fit the Weibull model using the maximum likelihood
estimates from the log logistic fit as initial values.
proc lifereg data=raw outest=outest;
model x*censor(1) = c1 / itprint distribution = weibull
intercept=2.898 initial=0.16 scale=0.05;
output out=out xbeta=xbeta;
run;
Examination of the resulting output in Output 36.3.4 shows that the convergence
problem has been solved by specifying different initial values.
Output 36.3.4. Weibull Model Fit with Specified Initial Values

                       Model Information
                   Data Set                   WORK.RAW
                   Dependent Variable         Log(x)
                   Censoring Variable         censor
                   Censoring Value(s)         1
                   Number of Observations     18
                   Noncensored Values         12
                   Right Censored Values      6
                   Left Censored Values       0
                   Interval Censored Values   0
                   Name of Distribution       WEIBULL
                   Log Likelihood             11.232023272

Algorithm converged.

                                 Standard
 Variable    DF   Estimate         Error   Chi-Square   Pr > ChiSq   Label
 Intercept    1    2.96986       0.03264    8278.8602       <.0001   Intercept
 c1           1    0.14346       0.01652      75.4316       <.0001
 Scale        1    0.08437       0.01887                             Extreme value scale
References
Allison, P.D. (1995), Survival Analysis Using the SAS System: A Practical Guide, Cary, NC: SAS Institute Inc.

Cox, D.R. (1972), "Regression Models and Life Tables (with discussion)," Journal of the Royal Statistical Society, Series B, 34, 187-220.

Cox, D.R. and Oakes, D. (1984), Analysis of Survival Data, London: Chapman and Hall.

Elandt-Johnson, R.C. and Johnson, N.L. (1980), Survival Models and Data Analysis, New York: John Wiley & Sons, Inc.

Green, W.H. (1993), Econometric Analysis, Second Edition, New York: Cambridge University Press.

Gross, A.J. and Clark, V.A. (1975), Survival Distributions: Reliability Applications in the Biomedical Sciences, New York: John Wiley & Sons, Inc.

Kalbfleisch, J.D. and Prentice, R.L. (1980), The Statistical Analysis of Failure Time Data, New York: John Wiley & Sons, Inc.

Klein, J.P. and Moeschberger, M.L. (1997), Survival Analysis: Techniques for Censored and Truncated Data, Berlin: Springer.

Lawless, J.E. (1982), Statistical Models and Methods for Lifetime Data, New York: John Wiley & Sons, Inc.

Lee, E.T. (1980), Statistical Methods for Survival Data Analysis, Belmont, CA: Lifetime Learning Publications.

Maddala, G.S. (1983), Limited-Dependent and Qualitative Variables in Econometrics, New York: Cambridge University Press.

Mroz, T.A. (1987), "The Sensitivity of an Empirical Model of Married Women's Work to Economic and Statistical Assumptions," Econometrica, 55, 765-799.

Rao, C.R. (1973), Linear Statistical Inference and Its Applications, New York: John Wiley & Sons, Inc.

Tobin, J. (1958), "Estimation of Relationships for Limited Dependent Variables," Econometrica, 26, 24-36.