Survival Models: 7.1 The Hazard and Survival Functions
Survival Models
Our final chapter concerns models for the analysis of data which have three
main characteristics: (1) the dependent variable or response is the waiting
time until the occurrence of a well-defined event, (2) observations are cen-
sored, in the sense that for some units the event of interest has not occurred
at the time the data are analyzed, and (3) there are predictors or explanatory
variables whose effect on the waiting time we wish to assess or control. We
start with some basic definitions.
It will often be convenient to work with the complement of the c.d.f, the
survival function
$$S(t) = \Pr\{T \ge t\} = 1 - F(t) = \int_t^\infty f(x)\,dx, \qquad (7.1)$$
which gives the probability of being alive just before duration t, or more
generally, the probability that the event of interest has not occurred by
duration t.
$$\lambda(t) = \frac{f(t)}{S(t)}, \qquad (7.3)$$
which some authors give as a definition of the hazard function. In words, the
rate of occurrence of the event at duration t equals the density of events at t,
divided by the probability of surviving to that duration without experiencing
the event.
Note from Equation 7.1 that −f (t) is the derivative of S(t). This suggests
rewriting Equation 7.3 as
$$\lambda(t) = -\frac{d}{dt}\log S(t).$$
Integrating by parts, and making use of the fact that −f (t) is the derivative
of S(t), which has limits or boundary conditions S(0) = 1 and S(∞) = 0,
one can show that
$$\mu = \int_0^\infty S(t)\,dt. \qquad (7.6)$$
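The identities above can be checked numerically. The following is a minimal sketch, assuming a Weibull-type survival function with arbitrarily chosen scale and shape (these values, and all function names, are illustrative and not taken from the text). It compares the hazard computed as f(t)/S(t) with the negative log-derivative of S(t), and the integral of S(t) with the known mean of that distribution.

```python
import math

# Hypothetical example distribution (an assumption, not from the text):
# S(t) = exp(-(t / SCALE) ** SHAPE), a Weibull-type survival function.
SCALE, SHAPE = 2.0, 1.5


def survival(t):
    return math.exp(-((t / SCALE) ** SHAPE))


def density(t):
    # f(t) = -S'(t), written out analytically for this example
    return (SHAPE / SCALE) * (t / SCALE) ** (SHAPE - 1) * survival(t)


def hazard_ratio_form(t):
    # Equation 7.3: density divided by survival
    return density(t) / survival(t)


def hazard_log_derivative(t, h=1e-5):
    # Central-difference approximation to -d/dt log S(t)
    return -(math.log(survival(t + h)) - math.log(survival(t - h))) / (2 * h)


def mean_from_survival(upper=60.0, steps=60000):
    # Trapezoid rule for the integral of S(t) over [0, upper] (Equation 7.6);
    # the tail beyond `upper` is negligible for these parameter values.
    width = upper / steps
    total = 0.5 * (survival(0.0) + survival(upper))
    total += sum(survival(k * width) for k in range(1, steps))
    return total * width


# Known mean of this Weibull-type distribution, for comparison
exact_mean = SCALE * math.gamma(1 + 1 / SHAPE)

print(hazard_ratio_form(1.3), hazard_log_derivative(1.3))
print(mean_from_survival(), exact_mean)
```

The two hazard values and the two means should agree up to numerical error, whatever (proper) distribution is substituted for the example.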
actual waiting time T is always well defined. In this case we can calculate
not just the conditional hazard and survivor functions, but also the mean.
In our marriage example, we could calculate the mean age at marriage for
those who marry. We could even calculate a conventional median, defined
as the age by which half the people who will eventually marry have done so.
It turns out that the conditional density, hazard and survivor function
for those who experience the event are related to the unconditional density,
hazard and survivor for the entire population. The conditional density is
$$f^*(t) = \frac{f(t)}{1 - S(\infty)},$$
$$S^*(t) = \frac{S(t) - S(\infty)}{1 - S(\infty)},$$
$$\lambda^*(t) = \frac{f^*(t)}{S^*(t)} = \frac{f(t)}{S(t) - S(\infty)}.$$
Derivation of the mean waiting time for those who experience the event is
left as an exercise for the reader.
Whichever approach is adopted, care must be exercised to specify clearly
which hazard or survival is being used. For example, the conditional hazard
for those who eventually experience the event is always higher than the
unconditional hazard for the entire population. Note also that in most cases
all we observe is whether or not the event has occurred. If the event has not
occurred, we may be unable to determine whether it will eventually occur.
In this context, only the unconditional hazard may be estimated from data,
but one can always translate the results into conditional expressions, if so
desired, using the results given above.
$$L_i = S(t_i),$$
Taking logs, and recalling the expression linking the survival function S(t)
to the cumulative hazard function Λ(t), we obtain the log-likelihood function
for censored survival data
$$\log L = \sum_{i=1}^{n} \{ d_i \log\lambda(t_i) - \Lambda(t_i) \}. \qquad (7.7)$$
and setting the score to zero gives the maximum likelihood estimator of the
hazard
$$\hat\lambda = \frac{D}{T}, \qquad (7.9)$$
the total number of deaths divided by the total exposure time. Demogra-
phers will recognize this expression as the general definition of a death rate.
Note that the estimator is optimal (in a maximum likelihood sense) only if
the risk is constant and does not depend on age.
We can also calculate the observed information by taking minus the
derivative of the score (that is, minus the second derivative of the
log-likelihood), which is
$$I(\lambda) = \frac{D}{\lambda^2}.$$
To obtain the expected information we need to calculate the expected num-
ber of deaths, but this depends on the censoring scheme. For example under
Type I censoring with fixed duration τ , one would expect n(1−S(τ )) deaths.
Under Type II censoring the number of deaths would have been fixed in ad-
vance. Under some schemes calculation of the expectation may be fairly
complicated if not impossible.
A simpler alternative is to use the observed information, estimated using
the m.l.e. of λ given in Equation 7.9. Using this approach, the large sample
variance of the m.l.e. of the hazard rate may be estimated as
$$\widehat{\mathrm{var}}(\hat\lambda) = \frac{D}{T^2},$$
a result that leads to large-sample tests of hypotheses and confidence inter-
vals for λ.
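As an illustration of these results, here is a minimal sketch that computes the death rate D/T, its estimated standard error sqrt(D)/T and a Wald-type 95% interval. The six observations and the function names are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math


def exponential_mle(times, died):
    """Return (rate, standard error) for censored exponential data.

    times: observed durations (time to death or to censoring)
    died:  1 if the duration ended in death, 0 if it was censored
    """
    total_deaths = sum(died)        # D
    total_exposure = sum(times)     # T
    rate = total_deaths / total_exposure              # Equation 7.9
    std_error = math.sqrt(total_deaths) / total_exposure  # sqrt(D / T^2)
    return rate, std_error


def log_likelihood(rate, times, died):
    # log L = D log(lambda) - lambda * T for a constant hazard
    return sum(died) * math.log(rate) - rate * sum(times)


# Hypothetical data: four deaths and two censored cases
times = [2.0, 3.5, 1.2, 6.0, 4.3, 6.0]
died = [1, 1, 1, 0, 1, 0]

rate, std_error = exponential_mle(times, died)
interval = (rate - 1.96 * std_error, rate + 1.96 * std_error)
print(rate, std_error, interval)
```

With so few deaths the symmetric interval is crude; working on the log scale is a common alternative, but that refinement is outside what the text covers here.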
If there are no censored cases, so that di = 1 for all i and D = n, then the
results obtained here reduce to standard maximum likelihood estimation for
the exponential distribution, and the m.l.e. of λ turns out to be the reciprocal
of the sample mean.
It may be interesting to note in passing that the log-likelihood for cen-
sored exponential data given in Equation 7.8 coincides exactly (except for
constants) with the log-likelihood that would be obtained by treating D as a
Poisson random variable with mean λT . To see this point, you should write
the Poisson log-likelihood when D ∼ P (λT ), and note that it differs from
Equation 7.8 only in the presence of a term D log(T ), which is a constant
depending on the data but not on the parameter λ.
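For readers who want the comparison spelled out, the following sketch writes the Poisson log-likelihood next to the constant-hazard form of Equation 7.7, taking Equation 7.8 to be log L = D log λ − λT (which is what Equation 7.7 gives when λ(t) = λ and Λ(t) = λt). The log D! term comes from the Poisson probability function and, like D log T, does not involve λ.

```latex
\begin{aligned}
\log L_{\text{Poisson}}
  &= D\log(\lambda T) - \lambda T - \log D! \\
  &= \underbrace{D\log\lambda - \lambda T}_{\text{censored exponential}}
     \;+\; \underbrace{D\log T - \log D!}_{\text{free of } \lambda}.
\end{aligned}
```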
Thus, treating the deaths as Poisson conditional on exposure time leads
to exactly the same estimates (and standard errors) as treating the exposure
$$\log T_i = x_i'\beta + \epsilon_i,$$
$$S_1(t) = S_0(t/\gamma).$$
In words, the probability that a member of group one will be alive at age t
is exactly the same as the probability that a member of group zero will be
alive at age t/γ. For γ = 2, this would be half the age, so the probability
that a member of group one would be alive at age 40 (or 60) would be the
same as the probability that a member of group zero would be alive at age
20 (or 30). Thus, we may think of γ as affecting the passage of time. In our
example, people in group zero age ‘twice as fast’.
For the record, the corresponding hazard functions are related by
$$\lambda_1(t) = \lambda_0(t/\gamma)/\gamma,$$
$$F(a) = c\,F_0\!\left(\frac{a - a_0}{k}\right),$$
In this model $\lambda_0(t)$ is a baseline hazard function that describes the risk for
individuals with $x_i = 0$, who serve as a reference cell or pivot, and $\exp\{x_i'\beta\}$
is the relative risk, a proportionate increase or reduction in risk, associated
with the set of characteristics $x_i$. Note that the increase or reduction in risk
is the same at all durations $t$.
To fix ideas consider a two-sample problem where we have a dummy
variable x which serves to identify groups one and zero. Then the model is
$$\lambda_i(t|x) = \begin{cases} \lambda_0(t) & \text{if } x = 0, \\ \lambda_0(t)e^{\beta} & \text{if } x = 1. \end{cases}$$
Thus, λ0 (t) represents the risk at time t in group zero, and γ = exp{β}
represents the ratio of the risk in group one relative to group zero at any
time t. If γ = 1 (or β = 0) then the risks are the same in the two groups. If
γ = 2 (or β = 0.6931), then the risk for an individual in group one at any
given age is twice the risk of a member of group zero who has the same age.
Note that the model separates clearly the effect of time from the effect
of the covariates. Taking logs, we find that the proportional hazards model
is a simple additive model for the log of the hazard, with
$$\log\lambda_i(t|x_i) = \alpha_0(t) + x_i'\beta,$$
where α0 (t) = log λ0 (t) is the log of the baseline hazard. As in all additive
models, we assume that the effect of the covariates x is the same at all times
or ages t. The similarity between this expression and a standard analysis of
covariance model with parallel lines should not go unnoticed.
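Returning to Equation 7.10, we can integrate both sides from 0 to t to
obtain the cumulative hazards
$$\Lambda_i(t|x_i) = \Lambda_0(t)\exp\{x_i'\beta\}.$$
Changing signs and exponentiating gives the survivor functions
$$S_i(t|x_i) = S_0(t)^{\exp\{x_i'\beta\}}, \qquad (7.11)$$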
where $S_0(t) = \exp\{-\Lambda_0(t)\}$ is a baseline survival function. Thus, the effect
of the covariate values $x_i$ on the survivor function is to raise it to a power
given by the relative risk $\exp\{x_i'\beta\}$.
In our two-group example with a relative risk of γ = 2, the probability
that a member of group one will be alive at any given age t is the square of
the probability that a member of group zero would be alive at the same age.
The one case where the two families coincide is the Weibull distribution,
which has survival function
$$S(t) = \exp\{-(\lambda t)^p\}$$
and hazard function
$$\lambda(t) = p\lambda(\lambda t)^{p-1},$$
for parameters λ > 0 and p > 0. If p = 1, this model reduces to the
exponential and has constant risk over time. If p > 1, then the risk increases
over time. If p < 1, then the risk decreases over time. In fact, taking logs
in the expression for the hazard function, we see that the log of the Weibull
risk is a linear function of log time with slope p − 1.
If we pick the Weibull as a baseline risk and then multiply the hazard by
a constant γ in a proportional hazards framework, the resulting distribution
turns out to be still a Weibull, so the family is closed under proportionality
of hazards. If we pick the Weibull as a baseline survival and then speed
up the passage of time in an accelerated life framework, dividing time by a
constant γ, the resulting distribution is still a Weibull, so the family is closed
under acceleration of time.
For further details on this distribution see Cox and Oakes (1984) or
Kalbfleisch and Prentice (1980), who prove the equivalence of the two Weibull
models.
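The two closure properties can be verified directly from the Weibull survival and hazard functions given above. The sketch below is not the proof in the cited references; the starred and daggered rate parameters are notation introduced here for illustration.

```latex
% Proportional hazards: multiply the Weibull hazard by a constant \gamma
\gamma\,\lambda(t) = \gamma\,p\lambda(\lambda t)^{p-1}
                   = p\lambda^{*}(\lambda^{*} t)^{p-1},
\qquad \lambda^{*} = \lambda\,\gamma^{1/p}.

% Accelerated life: divide time by a constant \gamma in the survival function
S_0(t/\gamma) = \exp\{-(\lambda t/\gamma)^{p}\}
              = \exp\{-(\lambda^{\dagger} t)^{p}\},
\qquad \lambda^{\dagger} = \lambda/\gamma.
```

In both cases the shape p is unchanged and only the rate parameter moves, so an acceleration factor of γ corresponds to a relative risk of γ^{-p}.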
Let xi (t) denote the value of a vector of covariates for individual i at time
or duration t. Then the proportional hazards model may be generalized to
$$\lambda_i(t, x_i(t)) = \lambda_0(t)\exp\{x_i(t)'\beta\}. \qquad (7.12)$$
The separation of duration and covariate effects is not so clear now, and on
occasion it may be difficult to identify effects that are highly collinear with
time. If all children were weaned when they were around six months old, for
example, it would be difficult to identify effects of breastfeeding from general
duration effects without additional information. In such cases one might still
prefer a time-varying covariate, however, as a more meaningful predictor of
risk than the mere passage of time.
Calculation of survival functions when we have time-varying covariates is
a little bit more complicated, because we need to specify a path or trajectory
for each variable. In the birth intervals example one could calculate a survival
function for women who breastfeed for six months and then wean. This
would be done by using the hazard corresponding to x(t) = 0 for months 0
to 6 and then the hazard corresponding to x(t) = 1 for months 6 onwards.
Unfortunately, the simplicity of Equation 7.11 is lost; we can no longer
simply raise the baseline survival function to a power.
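The calculation described above can be sketched as follows. The constant baseline hazard and the coefficient are hypothetical values chosen for illustration (the text gives neither), and the covariate path switches from x(t) = 0 to x(t) = 1 at six months as in the example.

```python
import math

# Hypothetical values, not estimates from the text
BASELINE_RATE = 0.05   # constant baseline hazard per month
BETA = 0.7             # log relative risk when x(t) = 1
SWITCH = 6.0           # month at which the covariate changes


def covariate_path(t):
    # x(t) = 0 for months 0 to 6, then x(t) = 1 from month 6 onwards
    return 0 if t < SWITCH else 1


def cumulative_hazard(t, steps=10000):
    # Midpoint rule for the integral of lambda0 * exp{x(s) * beta} over [0, t]
    width = t / steps
    total = sum(
        BASELINE_RATE * math.exp(BETA * covariate_path((k + 0.5) * width))
        for k in range(steps)
    )
    return total * width


def survival(t):
    return math.exp(-cumulative_hazard(t))


# Survival at 12 months along this trajectory
print(survival(12.0))

# For comparison: the baseline survival raised to the relative risk, which
# would be right only if x = 1 had applied from time zero
print(math.exp(-BASELINE_RATE * 12.0) ** math.exp(BETA))
```

The two printed values differ, which is the point made in the text: with a covariate that changes over time, the hazard has to be accumulated along the specified path.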
Time-varying covariates can be introduced in the context of accelerated
life models, but this is not so simple and has rarely been done in applications.
See Cox and Oakes (1984, p.66) for more information.
[Figure 7.1. Two panels: hazard against time (left) and survival against time (right).]
Figure 7.1 shows how a Weibull distribution with λ = 1 and p = 0.8 can
be approximated using a piece-wise exponential distribution with bound-
aries at 0.5, 1.5 and 3.5. The left panel shows how the piece-wise constant
hazard can follow only the broad outline of the smoothly declining Weibull
hazard yet, as shown on the right panel, the corresponding survival curves
are indistinguishable.
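A comparison of this kind can be sketched numerically. The text does not say how the piece-wise constant rates in Figure 7.1 were chosen, so the sketch below makes two assumptions: each rate is the average Weibull hazard over its interval (so the cumulative hazards agree at the boundaries), and the last interval is closed at time 8, the right edge of the plot.

```python
import math

LAM, P = 1.0, 0.8                      # Weibull parameters from the text
CUTS = [0.0, 0.5, 1.5, 3.5, 8.0]       # 8.0 is an assumed upper limit


def weibull_cumhaz(t):
    return (LAM * t) ** P


def weibull_survival(t):
    return math.exp(-weibull_cumhaz(t))


# Assumed rule: constant rate = average Weibull hazard over the interval
RATES = [
    (weibull_cumhaz(end) - weibull_cumhaz(start)) / (end - start)
    for start, end in zip(CUTS[:-1], CUTS[1:])
]


def piecewise_cumhaz(t):
    # Sum of rate * time spent in each interval up to t
    total = 0.0
    for start, end, rate in zip(CUTS[:-1], CUTS[1:], RATES):
        if t <= start:
            break
        total += rate * (min(t, end) - start)
    return total


def piecewise_survival(t):
    return math.exp(-piecewise_cumhaz(t))


grid = [k * 0.01 for k in range(801)]
max_gap = max(abs(weibull_survival(t) - piecewise_survival(t)) for t in grid)
print(RATES)
print(max_gap)
```

Under these assumptions the step hazard declines from one interval to the next, and the largest gap between the two survival curves is small even though the hazards differ visibly within each interval.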
where tij is the exposure time as defined above and λij is the hazard for
individual i in interval j. Taking logs in this expression, and recalling that
the hazard rates satisfy the proportional hazards model in Equation 7.15,
we obtain
$$\log\mu_{ij} = \log t_{ij} + \alpha_j + x_i'\beta,$$
where $\alpha_j = \log\lambda_j$ as before.
Thus, the piece-wise exponential proportional hazards model is equiva-
lent to a Poisson log-linear model for the pseudo observations, one for each
combination of individual and interval, where the death indicator is the re-
sponse and the log of exposure time enters as an offset.
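The construction of the pseudo-observations can be sketched as below. The function name, the record layout and the example cut points are illustrative assumptions; the output is the kind of data set one would pass to a Poisson regression routine with log exposure as an offset.

```python
import math


def split_episode(time, died, cuts):
    """Split one observation into per-interval pseudo-observations.

    time: observed duration (death or censoring time)
    died: 1 if the observation ended in death, 0 if censored
    cuts: interval boundaries, starting at 0; the last may be math.inf
    Returns a list of (interval number, exposure, death indicator).
    """
    records = []
    for j, (start, end) in enumerate(zip(cuts[:-1], cuts[1:]), start=1):
        if time <= start:
            break  # the individual never entered this interval
        exposure = min(time, end) - start
        # The death indicator can be one only in the last interval visited
        death = died if time <= end else 0
        records.append((j, exposure, death))
    return records


# Hypothetical example: death at time 2.0 with boundaries 0.5, 1.5 and 3.5
cuts = [0.0, 0.5, 1.5, 3.5, math.inf]
for record in split_episode(2.0, 1, cuts):
    print(record)
# (1, 0.5, 0)
# (2, 1.0, 0)
# (3, 0.5, 1)
```

Each record would also carry a copy of the individual's covariates; these are omitted to keep the sketch short.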
It is important to note that we do not assume that the dij have indepen-
dent Poisson distributions, because they clearly do not. If individual i died
in interval j(i), then it must have been alive in all prior intervals j < j(i), so
the indicators couldn’t possibly be independent. Moreover, each indicator
can only take the values one and zero, so it couldn’t possibly have a Poisson
distribution, which assigns some probability to values greater than one. The
result is more subtle. It is the likelihood functions that coincide. Given a
realization of a piece-wise exponential survival process, we can find a realiza-
tion of a set of independent Poisson observations that happens to have the
same likelihood, and therefore would lead to the same estimates and tests of
hypotheses.
The proof is not hard. Recall from Section 7.2.2 that the contribution of
the i-th individual to the log-likelihood function has the general form
$$\log L_i = d_i\log\lambda_i(t_i) - \Lambda_i(t_i),$$
where we have written λi (t) for the hazard and Λi (t) for the cumulative
hazard that applies to the i-th individual at time t. Let j(i) denote the
interval where ti falls, as before.
Under the piece-wise exponential model, the first term in the log-likelihood
can be written as
$$d_i \log\lambda_i(t_i) = d_{ij(i)}\log\lambda_{ij(i)},$$
using the fact that the hazard is $\lambda_{ij(i)}$ when $t_i$ is in interval $j(i)$, and that the
death indicator $d_i$ applies directly to the last interval visited by individual
$i$, and therefore equals $d_{ij(i)}$.
The cumulative hazard in the second term is an integral, and can be
written as a sum as follows
$$\Lambda_i(t_i) = \int_0^{t_i} \lambda_i(t)\,dt = \sum_{j=1}^{j(i)} t_{ij}\lambda_{ij},$$
$$\log L_i = \sum_{j=1}^{j(i)} \{ d_{ij}\log\lambda_{ij} - t_{ij}\lambda_{ij} \}.$$
$$\log L_{ij} = d_{ij}\log\mu_{ij} - \mu_{ij} = d_{ij}\log(t_{ij}\lambda_{ij}) - t_{ij}\lambda_{ij}.$$
This expression agrees with the log-likelihood above except for the term
dij log(tij ), but this is a constant depending on the data and not on the
parameters, so it can be ignored from the point of view of estimation. This
completes the proof.
This result generalizes the observation made at the end of Section 7.2.2
noting the relationship between the likelihood for censored exponential data
and the Poisson likelihood. The extension is that instead of having just one
‘Poisson’ death indicator for each individual, we have one for each interval
visited by each individual.
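The equivalence can also be checked numerically for a single individual. In this sketch the interval rates, the boundaries and the observation are hypothetical; the point is only that the survival log-likelihood and the sum of Poisson terms differ by the data-only quantity, the sum of d_ij log t_ij.

```python
import math

# Hypothetical piece-wise constant hazards and one observed death at time 2.0
cuts = [0.0, 0.5, 1.5, 3.5, math.inf]
rates = [1.1, 0.8, 0.7, 0.6]
time, died = 2.0, 1

# Pseudo-observations: (exposure t_ij, death indicator d_ij, hazard lambda_ij)
pseudo = []
for j, (start, end) in enumerate(zip(cuts[:-1], cuts[1:])):
    if time <= start:
        break
    exposure = min(time, end) - start
    death = died if time <= end else 0
    pseudo.append((exposure, death, rates[j]))

# Survival form: d_i log lambda_i(t_i) - Lambda_i(t_i)
hazard_at_exit = pseudo[-1][2]
cumulative_hazard = sum(t * lam for t, _, lam in pseudo)
survival_loglik = died * math.log(hazard_at_exit) - cumulative_hazard

# Poisson form: sum of d_ij log(mu_ij) - mu_ij with mu_ij = t_ij * lambda_ij
# (the log d_ij! term is zero because each indicator is 0 or 1)
poisson_loglik = sum(d * math.log(t * lam) - t * lam for t, d, lam in pseudo)

# The difference should be the term that does not depend on the hazards
constant = sum(d * math.log(t) for t, d, _ in pseudo)
print(poisson_loglik - survival_loglik, constant)
```

Changing the values in `rates` changes both log-likelihoods by the same amount, so the difference stays fixed and the two forms lead to the same estimates.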
where β represents the effect of the predictor on the log of the hazard at any
given time. Exponentiating, we see that the hazard when x = 1 is exp{β}
times the hazard when x = 0, and this effect is the same at all times. This
is a simple additive model on duration and the predictor of interest.
To allow for a time-dependent effect of the predictor, we would write
where βj represents the effect of the predictor on the hazard during interval
j. Exponentiating, we see that the hazard in interval j when x = 1 is
exp{βj } times the hazard in interval j when x = 0, so the effect may vary
from one interval to the next. Since the effect of the predictor depends on the
interval, we have a form of interaction between the predictor and duration,
which might be more obvious if we wrote the model as
                               Birth Cohort
Exact           1941–59             1960–67             1968–76
Age         deaths  exposure    deaths  exposure    deaths  exposure
0–1 m         168     278.4       197     403.2       195     495.3
1–3 m          48     538.8        48     786.0        55     956.7
3–6 m          63     794.4        62    1165.3        58    1381.4
6–12 m         89    1550.8        81    2294.8        85    2604.5
1–2 y         102    3006.0        97    4500.5        87    4618.5
2–5 y          81    8743.5       103   13201.5        70    9814.5
5–10 y         40   14270.0        39   19525.0        10    5802.5
Table 7.1 shows the results of these calculations in terms of the number
of deaths and the total number of person-years of exposure to risk between
birth and age ten, by categories of age of child, for three groups of children
(or cohorts) born in 1941–59, 1960–67 and 1968–76. The purpose of our
the sum of two parts, log tij , an offset or known part of the linear predictor,
and log λij , the log of the hazard rates of interest.
Finally, we introduce a log-linear model for the hazard rates, of the usual
form
$$\log\lambda_{ij} = x_{ij}'\beta,$$
where xij is a vector of covariates. In case you are wondering what happened
to the baseline hazard, we have folded it into the vector of parameters β. The
vector of covariates xij may include a constant, a set of dummy variables
representing the age groups (i.e. the shape of the hazard by age), a set
of dummy variables representing the birth cohorts (i.e. the change in the
hazard over time) and even a set of cross-product dummies representing
combinations of ages and birth cohorts (i.e. interaction effects).
Table 7.2 shows the deviance for the five possible models of interest,
including the null model, the two one-factor models, the two-factor additive
model, and the two-factor model with an interaction, which is saturated for
these data.
Table 7.3: Parameter Estimates for Age, Cohort and Age+Cohort Models
of Infant and Child Mortality in Colombia
Consider now the additive model with effects of both age and cohort,
where the hazard rate is allowed to vary with age and may differ from one
cohort to another, but the age (or cohort) effect is assumed to be the same
for each cohort (or age). This model is equivalent to a proportional hazards
model, where we assume a common shape of the hazard by age, and let cohort
affect the hazard proportionately at all ages. Comparing the proportional
hazards model with the age model we note a reduction in deviance of 66.5
on two d.f., which is highly significant. Thus, we have strong evidence of
cohort effects net of age. On the other hand, the attained deviance of 6.2
on 12 d.f. is clearly not significant, indicating that the proportional hazards
model provides an adequate description of the patterns of mortality by age
and cohort in Colombia. In other words, the assumption of proportionality
of hazards is quite reasonable, implying that the decline in mortality in
Colombia has been the same at all ages.
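The additive (proportional hazards) model can be fitted to the deaths and exposures of Table 7.1 without special software. The sketch below is one possible approach, not necessarily the procedure used to produce the results quoted in the text: it alternates between the likelihood equations for the age rates and for the cohort relative risks, which for a Poisson model with two factors amounts to matching the observed death totals by age and by cohort. The variable names and the fixed number of iterations are assumptions.

```python
import math

# Deaths and person-years of exposure from Table 7.1
# (rows: ages 0-1m, 1-3m, 3-6m, 6-12m, 1-2y, 2-5y, 5-10y;
#  columns: cohorts 1941-59, 1960-67, 1968-76)
deaths = [
    [168, 197, 195], [48, 48, 55], [63, 62, 58], [89, 81, 85],
    [102, 97, 87], [81, 103, 70], [40, 39, 10],
]
exposure = [
    [278.4, 403.2, 495.3], [538.8, 786.0, 956.7], [794.4, 1165.3, 1381.4],
    [1550.8, 2294.8, 2604.5], [3006.0, 4500.5, 4618.5],
    [8743.5, 13201.5, 9814.5], [14270.0, 19525.0, 5802.5],
]
n_age, n_cohort = len(deaths), len(deaths[0])

age_rate = [1.0] * n_age         # hazard by age for the reference cohort
cohort_rr = [1.0] * n_cohort     # relative risk by cohort

for _ in range(500):
    # Match death totals by age, given the current cohort effects
    age_rate = [
        sum(deaths[i]) / sum(exposure[i][j] * cohort_rr[j] for j in range(n_cohort))
        for i in range(n_age)
    ]
    # Match death totals by cohort, given the current age rates
    cohort_rr = [
        sum(deaths[i][j] for i in range(n_age))
        / sum(exposure[i][j] * age_rate[i] for i in range(n_age))
        for j in range(n_cohort)
    ]

# Express results relative to the 1941-59 cohort
age_rate = [rate * cohort_rr[0] for rate in age_rate]
cohort_rr = [rr / cohort_rr[0] for rr in cohort_rr]

fitted = [
    [exposure[i][j] * age_rate[i] * cohort_rr[j] for j in range(n_cohort)]
    for i in range(n_age)
]
deviance = 2 * sum(
    deaths[i][j] * math.log(deaths[i][j] / fitted[i][j]) - (deaths[i][j] - fitted[i][j])
    for i in range(n_age) for j in range(n_cohort)
)
cohort_effects = [math.log(rr) for rr in cohort_rr]

print("log relative risks by cohort:", cohort_effects)
print("deviance:", deviance, "on", n_age * n_cohort - (n_age + n_cohort - 1), "d.f.")
```

If the table has been transcribed correctly, the deviance and cohort effects printed here should be close to the values discussed in the text; a generalized linear model routine with Poisson errors, log link and log exposure as an offset should give the same fit, together with standard errors.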
Let us examine the parameter estimates on the right-most column of
Table 7.3. The constant is the baseline hazard at ages 0–1 months for the
earliest cohort, those born in 1941–59. The age parameters representing the
baseline hazard are practically unchanged from the model with age only, and
trace the dramatic decline in mortality from birth to age ten, with half the
reduction concentrated in the first year of life. The cohort effects adjusted
for age provide a more reasonable picture of the decline in mortality over
time. The multiplicative effects for the cohorts born in 1960–67 and 1968–
76 are exp{−0.3242} = 0.7231 and exp{−0.4784} = 0.6198, corresponding
to mortality declines of 28 and 38 percent at every age, compared to the
cohort born in 1941–59. This is a remarkable decline in infant and child
mortality, which appears to have been the same at all ages. In other words,
neonatal, post-neonatal, infant and toddler mortality have all declined by
approximately 38 percent across these cohorts.
The fact that the gross effect for the youngest cohort was positive but
the net effect is substantially negative can be explained as follows. Because
the survey took place in 1976, children born between 1968 and 1976 have been
exposed mostly to mortality at younger ages, where the rates are substan-
tially higher than at older ages. For example a child born in 1975 would
have been exposed only to mortality in the first year of life. The gross effect
ignores this fact and thus overestimates the mortality of this group at ages
zero to ten. The net effect adjusts correctly for the increased risk at younger
ages, essentially comparing the mortality of this cohort to the mortality of
earlier cohorts when they had the same ages, and can therefore unmask the
actual decline.
A final caveat on interpretation: the data are based on retrospective
reports of mothers who were between the ages of 15 and 49 at the time of the
interview. These women provide a representative sample of both mothers
and births for recent periods, but a somewhat biased sample for older peri-
ods. The sample excludes mothers who have died before the interview, but
also women who were older at the time of birth of the child. For example
births from 1976, 1966 and 1956 come from mothers who were under 50,
under 40 and under 30 at the time of birth of the child. A more careful
analysis of the data would include age of mother at birth of the child as an
additional control variable.
Consider first the baseline group, namely the cohort of children born
before 1960. To obtain the log-hazard for each age group we must add the
constant and the age effect, for example the log-hazard for ages 1–3 months
is −0.4485 − 1.973 = −2.4215. This gives the numbers in column (3) of
Table 7.3. Next we exponentiate to obtain the hazard rates in column (4),
for example the rate for ages 1–3 months is exp{−2.4215} = 0.0888. Next
we calculate the cumulative hazard, multiplying the hazard by the width of the
interval and summing across intervals. In this step it is crucial to express
the width of the interval in the same units used to calculate exposure, in
this case years. Thus, the cumulative hazard at the end of ages 1–3 months
is 0.6386 × 1/12 + 0.0888 × 2/12 = 0.0680. Finally, we change sign and
exponentiate to calculate the survival function. For example the baseline
survival function at 3 months is exp{−0.0680} = 0.9342.
To calculate the survival functions shown in columns (7) and (8) for the
other two cohorts we could multiply the baseline hazards by exp{−0.3242}
and exp{−0.4784} to obtain the hazards for cohorts 1960–67 and 1968–76,
respectively, and then repeat the steps described above to obtain the survival
functions. This approach would be necessary if we had time-varying effects,
but in the present case we can take advantage of a simplification that obtains
for proportional hazard models. Namely, the survival functions for the two
younger cohorts can be calculated as the baseline survival function raised to
the relative risks exp{−0.3242} and exp{−0.4784}, respectively. For example
the probability of surviving to age three months was calculated as 0.9342 for
the baseline group, and turns out to be $0.9342^{\exp\{-0.3242\}} = 0.9520$ for the
cohort born in 1960–67, and $0.9342^{\exp\{-0.4784\}} = 0.9587$ for the cohort born
in 1968–76.
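The steps just described can be reproduced with a few lines of code. The sketch below uses only the coefficients quoted in the text (the constant, the effect for ages 1–3 months and the two cohort effects, the last taken as −0.4784, the value consistent with the survival probability of 0.9587); the variable names are illustrative, and only the first two age intervals are covered.

```python
import math

# Coefficients quoted in the text
constant = -0.4485            # log-hazard at ages 0-1 months, cohort 1941-59
age_effect_1_3m = -1.973      # effect of ages 1-3 months
cohort_effects = {"1941-59": 0.0, "1960-67": -0.3242, "1968-76": -0.4784}

# Step 1: log-hazards for the baseline cohort in the first two age intervals
log_hazards = [constant, constant + age_effect_1_3m]

# Step 2: exponentiate to obtain hazard rates per person-year
hazards = [math.exp(value) for value in log_hazards]

# Step 3: cumulative hazard at three months; interval widths are in years
# because exposure was measured in person-years
widths_in_years = [1 / 12, 2 / 12]
cumulative_hazard = sum(h * w for h, w in zip(hazards, widths_in_years))

# Step 4: change sign and exponentiate to obtain the baseline survival
baseline_survival = math.exp(-cumulative_hazard)

# Step 5: under proportional hazards, raise the baseline survival to the
# relative risk of each cohort
survival_by_cohort = {
    cohort: baseline_survival ** math.exp(effect)
    for cohort, effect in cohort_effects.items()
}

print([round(h, 4) for h in hazards])
print({cohort: round(s, 4) for cohort, s in survival_by_cohort.items()})
```

Up to rounding, the printed values match the hazards and three-month survival probabilities quoted in the text.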
Note that the probability of dying in the first year of life has declined
from 106.7 per thousand for children born before 1960 to 78.3 per thousand
for children born in 1960–67 and finally to 67.5 per thousand for the most
recent cohort. Results presented in terms of probabilities are often more
accessible to a wider audience than results presented in terms of hazard
rates. (Unfortunately, demographers are used to calling the probability of
dying in the first year of life the ‘infant mortality rate’. This is incorrect
because the quantity quoted is a probability, not a rate. In our example the
rate varies substantially within the first year of life. If the probability of
dying in the first year of life is q, say, then the average rate is approximately
− log(1 − q), which is not too different from q for small q.)
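As a quick check of the last remark, take the probability of dying in the first year quoted above for children born before 1960, q = 0.1067:

```latex
-\log(1 - q) = -\log(1 - 0.1067) = -\log(0.8933) \approx 0.1128,
```

so the implied average rate is about 113 deaths per thousand person-years, slightly above the probability of 106.7 per thousand.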
By focusing on events and exposure, we have been able to combine infant
and child mortality in the same analysis and use all available information.
An alternative approach could focus on infant mortality (deaths in the first
year of life), and solve the censoring problem by looking only at children
born at least one year before the survey, for whom the survival status at
age one is known. One could then analyze the probability of surviving to age
one using ordinary logit models. A complementary analysis could then look
at survival from age one to five, say, working with children born at least
five years before the survey who survived to age one, and then analyzing
whether or not they further survive to age five, using again a logit model.
While simple, this approach does not make full use of the information, relying
on cases with complete (uncensored) data. Cox and Oakes (1984) show that
Note that in discrete time the hazard is a conditional probability rather than
a rate. However, the general result expressing the hazard as a ratio of the
density to the survival function is still valid.
A further result of interest in discrete time is that the survival function
at time $t_j$ can be written in terms of the hazard at all prior times $t_1, \ldots, t_{j-1}$,
as
$$S_j = (1 - \lambda_1)(1 - \lambda_2)\cdots(1 - \lambda_{j-1}). \qquad (7.18)$$
In words, this result states that in order to survive to time $t_j$ one must
first survive $t_1$, then one must survive $t_2$ given that one survived $t_1$, and
so on, finally surviving $t_{j-1}$ given survival up to that point. This result is
analogous to the result linking the survival function in continuous time to
the integrated or cumulative hazard at all previous times.
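Equation 7.18 translates directly into a running product. The sketch below uses hypothetical discrete hazards, and also recovers the probability of dying at each time point as the hazard times the survival function, in line with the ratio result mentioned above.

```python
def discrete_survival(hazards):
    """Return [S_1, S_2, ..., S_{J+1}] for discrete hazards lambda_1..lambda_J.

    S_1 = 1, and S_j is the product of (1 - lambda_k) over k < j (Equation 7.18).
    """
    survival = [1.0]
    for hazard in hazards:
        survival.append(survival[-1] * (1.0 - hazard))
    return survival


# Hypothetical conditional probabilities of dying at t_1, t_2, t_3
hazards = [0.10, 0.20, 0.25]
survival = discrete_survival(hazards)

# Probability of dying exactly at t_j: f_j = lambda_j * S_j
density = [h * s for h, s in zip(hazards, survival)]

print(survival)
print(density)
```

The probabilities of dying at each time point, plus the probability of surviving past the last one, add up to one.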
where $\lambda(t_j|x_i)$ is the hazard at time $t_j$ for an individual with covariate values
$x_i$, $\lambda_0(t_j)$ is the baseline hazard at time $t_j$, and $\exp\{x_i'\beta\}$ is the relative risk
associated with covariate values $x_i$.
Taking logs, we obtain a model on the logit of the hazard or conditional
probability of dying at $t_j$ given survival up to that time,
$$\mathrm{logit}\,\lambda(t_j|x_i) = \alpha_j + x_i'\beta,$$
where $\alpha_j = \mathrm{logit}\,\lambda_0(t_j)$ is the logit of the baseline hazard and $x_i'\beta$ is the effect
of the covariates on the logit of the hazard. Note that the model essentially
treats time as a discrete factor by introducing one parameter αj for each
possible time of death tj . Interpretation of the parameters β associated with
the other covariates follows along the same lines as in logistic regression.
In fact, the analogy with logistic regression goes further: we can fit the
discrete-time proportional-hazards model by running a logistic regression on
a set of pseudo observations generated as follows. Suppose individual i dies
or is censored at time point tj(i) . We generate death indicators dij that take
the value one if individual i died at time j and zero otherwise, generating
one for each discrete time from t1 to tj(i) . To each of these indicators we
associate a copy of the covariate vector xi and a label j identifying the time
point. The proportional hazards model 7.19 can then be fit by treating
where S(tj |xi ) is the probability that an individual with covariate values xi
will survive up to time point tj , and S0 (tj ) is the baseline survival function.
Recalling Equation 7.18 for the discrete survival function, we obtain a similar
relationship for the complement of the hazard function, namely
$$1 - \lambda(t_j|x_i) = [1 - \lambda_0(t_j)]^{\exp\{x_i'\beta\}},$$
so that solving for the hazard for individual $i$ at time point $t_j$ we obtain the
model
$$\lambda(t_j|x_i) = 1 - [1 - \lambda_0(t_j)]^{\exp\{x_i'\beta\}}.$$
The transformation that makes the right hand side a linear function of the
parameters is the complementary log-log. Applying this transformation we
obtain the model
$$\log(-\log(1 - \lambda(t_j|x_i))) = \alpha_j + x_i'\beta, \qquad (7.20)$$
where $\alpha_j = \log(-\log(1 - \lambda_0(t_j)))$ is the complementary log-log transformation
of the baseline hazard.
This model can be fitted to discrete survival data by generating pseudo-
observations as before and fitting a generalized linear model with binomial
error structure and complementary log-log link. In other words, the equiv-
alence between the binomial likelihood and the discrete-time survival likeli-
hood under non-informative censoring holds both for the logit and comple-
mentary log-log links.
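The two ingredients of this approach can be sketched in a few lines: generating the pseudo-observations for one individual, and checking that the complementary log-log model reproduces the power relationship between the individual and baseline hazards. The function names and the numerical values are hypothetical; fitting the model itself would require a generalized linear model routine, which is not shown.

```python
import math


def person_period(last_index, died):
    """Pseudo-observations (time index j, death indicator d_ij) for one
    individual who dies or is censored at discrete time point last_index."""
    return [
        (j, 1 if (died and j == last_index) else 0)
        for j in range(1, last_index + 1)
    ]


def cloglog(p):
    # Complementary log-log transformation
    return math.log(-math.log(1.0 - p))


def inverse_cloglog(eta):
    return 1.0 - math.exp(-math.exp(eta))


# An individual who dies at the third time point, and one censored at the second
print(person_period(3, True))    # [(1, 0), (2, 0), (3, 1)]
print(person_period(2, False))   # [(1, 0), (2, 0)]

# Hypothetical baseline hazard and linear predictor x'beta
baseline_hazard = 0.10
linear_predictor = 0.5

# Hazard from the model on the complementary log-log scale (Equation 7.20)
hazard_from_link = inverse_cloglog(cloglog(baseline_hazard) + linear_predictor)

# Hazard from the power relationship with the baseline hazard
hazard_from_power = 1.0 - (1.0 - baseline_hazard) ** math.exp(linear_predictor)

print(hazard_from_link, hazard_from_power)
```

The two hazards agree, which is why a binomial model with this link recovers the proportional hazards coefficients.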
It is interesting to note that this model can be obtained by grouping time
in the continuous-time proportional-hazards model. To see this point let us
assume that time is continuous and we are really interested in the standard
proportional hazards model
$$\lambda(t|x_i) = \lambda_0(t)\exp\{x_i'\beta\}.$$
Suppose, however, that time is grouped into intervals with boundaries $0 =
\tau_0 < \tau_1 < \ldots < \tau_J = \infty$, and that all we observe is whether an individual
survives or dies in an interval. Note that this construction imposes some
constraints on censoring. If an individual is censored at some point inside
an interval, we do not know whether it would have survived the interval or
not. Therefore we must censor it at the end of the previous interval, which
is the last point for which we have complete information. Unlike the piece-
wise exponential set-up, here we can not use information about exposure to
part of an interval. On the other hand, it turns out that we do not need to
assume that the hazard is constant in each interval.
Let λij denote the discrete hazard or conditional probability that in-
dividual i will die in interval j given that it was alive at the start of the
interval. This probability is the same as the complement of the conditional
probability of surviving the interval given that one was alive at the start,
and can be written as
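$$\begin{aligned}
\lambda_{ij} &= 1 - \Pr\{T_i > \tau_j \mid T_i > \tau_{j-1}\} \\
&= 1 - \exp\Big\{-\int_{\tau_{j-1}}^{\tau_j} \lambda(t|x_i)\,dt\Big\} \\
&= 1 - \exp\Big\{-\int_{\tau_{j-1}}^{\tau_j} \lambda_0(t)\,dt\Big\}^{\exp\{x_i'\beta\}} \\
&= 1 - (1 - \lambda_j)^{\exp\{x_i'\beta\}},
\end{aligned}$$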
• If time is truly discrete, then one should probably use the discrete
model with a logit link, which has a direct interpretation in terms of
conditional odds, and is easily implemented using standard software
for logistic regression.
Finally, if time is truly continuous and one wishes to estimate the effects of
the covariates without making any assumptions about the baseline hazard,
then Cox’s (1972) partial likelihood is a very attractive approach.