Lecture 14: Nonparametric Survival Analysis Methods: James J. Dignam
Survival times, in months, for 11 subjects:
11 13 13 13 13 13 14 14 15 15 17
In this example all of the survival times are fully observed. There is no censoring, which is uncommon in medical studies.
\[
\hat{S}(t) =
\begin{cases}
\hat{P}(T \ge 11) = 11/11 = 1, & 0 < t \le 11 \\
\hat{P}(T \ge 13) = 10/11 = 0.909, & 11 < t \le 13 \\
\hat{P}(T \ge 14) = 5/11 = 0.455, & 13 < t \le 14 \\
\hat{P}(T \ge 15) = 3/11 = 0.273, & 14 < t \le 15 \\
\hat{P}(T \ge 17) = 1/11 = 0.091, & 15 < t \le 17 \\
\hat{P}(T \ge 17^{+}) = 0/11 = 0, & 17 < t
\end{cases}
\]
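The arithmetic above can be checked with a short sketch. This is plain Python rather than Stata, and the function name `empirical_survival` is made up for this illustration. It simply counts the subjects whose survival time is at least t and divides by the sample size.

```python
# Survival times in months from the example (all observed, no censoring).
times = [11, 13, 13, 13, 13, 13, 14, 14, 15, 15, 17]


def empirical_survival(times, t):
    """Return (count, n): subjects with survival time >= t, and sample size."""
    count = sum(1 for x in times if x >= t)
    return count, len(times)


# One line per distinct survival time, mirroring the worked steps above.
for t in sorted(set(times)):
    count, n = empirical_survival(times, t)
    print(f"P(T >= {t}) = {count}/{n} = {count / n:.3f}")
```

This estimator is only appropriate when there is no censoring, as in this data set.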
. use pulmonary_metastasis.dta
. stset time
     failure event:  (assumed to fail at time=time)
obs. time interval:  (0, time]
 exit on or before:  failure
------------------------------------------------------------------------------
         11  total observations
          0  exclusions
------------------------------------------------------------------------------
         11  observations remaining, representing
         11  failures in single-record/single-failure data
        151  total analysis time at risk and under observation
                                                at risk from t =         0
                                     earliest observed entry t =         0
                                          last observed exit t =        17
The stset command tells Stata which variable holds the survival time. That variable must have certain properties: it must be non-negative, and it may have a censoring (failure indicator) variable associated with it.
[Figure: plot of the estimated survivor function against analysis time, x-axis from 0 to 20. Stata header: failure _d: 1 (meaning all fail); analysis time _t: time.]
Ŝ(t ) is 1 from the time origin until the time of first death (11 months).
Ŝ(t ) is 0 after the last observed survival time (17 months).
Ŝ(t ) is non-increasing in t .
         failure _d:  1 (meaning all fail)
   analysis time _t:  time
Survival up to the first failure time is 100% (Ŝ(t) = 1.0). Stata does not display this row; other programs do by convention.
The first change in the estimated survivor function occurs at 11 months, Ŝ(11+) = 0.9091.
The second change in the estimated survivor function occurs at 13 months, Ŝ(13+) = 0.4545.
...
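The other approach: we can estimate S(t) using the conditional probability idea:
\[
\begin{aligned}
0 < t \le 11:&\quad \hat{S}(t) = \hat{P}(T \ge 11) = 11/11 = 1 \\
11 < t \le 13:&\quad \hat{S}(t) = \hat{P}(T \ge 11)\,\hat{P}(T \ge 13 \mid T \ge 11) = 1 \times 10/11 = 0.909 \\
13 < t \le 14:&\quad \hat{S}(t) = \hat{P}(T \ge 11)\,\hat{P}(T \ge 13 \mid T \ge 11)\,\hat{P}(T \ge 14 \mid T \ge 13) = 1 \times 10/11 \times 5/10 = 0.455
\end{aligned}
\]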
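As a hedged illustration of the conditional probability idea, the sketch below (plain Python, with hypothetical names such as `conditional_steps`) multiplies, at each distinct death time, the proportion of those still at risk who survive past it. With no censoring the running product reproduces the empirical proportions.

```python
# Survival times in months from the example (all observed, no censoring).
times = [11, 13, 13, 13, 13, 13, 14, 14, 15, 15, 17]


def conditional_steps(times):
    """Running product of conditional survival probabilities.

    Returns rows (t, at_risk, deaths, surv), where surv is the estimate
    of S just after death time t.
    """
    surv = 1.0
    rows = []
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)  # alive just before t
        deaths = sum(1 for x in times if x == t)   # deaths at t
        surv *= (at_risk - deaths) / at_risk       # P(survive past t | at risk)
        rows.append((t, at_risk, deaths, surv))
    return rows


steps = conditional_steps(times)
for t, at_risk, deaths, surv in steps:
    print(f"t = {t}: at risk {at_risk}, deaths {deaths}, S just after t = {surv:.4f}")
```

The values just after 11 and 13 months agree with Ŝ(11+) = 0.9091 and Ŝ(13+) = 0.4545 quoted earlier.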
Paul Meier, a leading medical statistician who had a major influence on how the
federal government assesses and makes decisions about new treatments that can
affect the lives of millions, died on Sunday at his home in Manhattan. He was 87.
The cause was complications of a stroke, his daughter Diane Meier said.
As early as the mid-1950s, Dr. Meier was one of the first and most vocal
proponents of what is called “randomization.”
If the number of subjects is large enough, the two groups will be the same in
every respect except the treatment they receive. Such randomized controlled trials
are considered the most rigorous way to conduct a study and the best way to gather
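For analytic purposes, survival data are recorded using two variables: the observed time (time) and an event indicator (status), coded 1 if the event was observed and 0 if the time is censored.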
+−−−−−−−−−−−−−−−+
| time status |
|−−−−−−−−−−−−−−−|
1. | 10 1 |
2. | 13 0 |
3. | 18 0 |
4. | 19 1 |
5. | 23 0 |
|−−−−−−−−−−−−−−−|
6. | 30 1 |
7. | 36 1 |
8. | 38 0 |
9. | 54 0 |
10. | 56 0 |
|−−−−−−−−−−−−−−−|
11. | 59 1 |
12. | 75 1 |
13. | 93 1 |
14. | 97 1 |
15. | 104 0 |
|−−−−−−−−−−−−−−−|
16. | 107 1 |
17. | 107 0 |
18. | 107 0 |
+−−−−−−−−−−−−−−−+
By convention, when censored survival times occur at the same time as one or
more failures, the censored survival time is taken to occur immediately after the
failure time.
t_(j)   n_j   d_j
   10    18     1
   19    15     1
   30    13     1
   36    12     1
   59     8     1
   75     7     1
   93     6     1
   97     5     1
  107     3     1
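The table of death times, numbers at risk and numbers of deaths can be rebuilt from the (time, status) listing. The sketch below is plain Python with a hypothetical helper `risk_table`, and it assumes status 1 marks an event and 0 a censored time. A subject counts as at risk at t_(j) if their recorded time is at least t_(j), which implements the convention that a censored time tied with a failure occurs immediately after it.

```python
# (time, status) pairs from the listing; status 1 = event, 0 = censored (assumed coding).
data = [
    (10, 1), (13, 0), (18, 0), (19, 1), (23, 0), (30, 1),
    (36, 1), (38, 0), (54, 0), (56, 0), (59, 1), (75, 1),
    (93, 1), (97, 1), (104, 0), (107, 1), (107, 0), (107, 0),
]


def risk_table(data):
    """Return rows (t_j, n_j, d_j) for each distinct event time t_j."""
    event_times = sorted({t for t, s in data if s == 1})
    rows = []
    for tj in event_times:
        # Censored times tied with tj are still counted as at risk at tj.
        n_j = sum(1 for t, _ in data if t >= tj)
        d_j = sum(1 for t, s in data if t == tj and s == 1)
        rows.append((tj, n_j, d_j))
    return rows


table = risk_table(data)
for tj, n_j, d_j in table:
    print(f"{tj:>4} {n_j:>4} {d_j:>4}")
```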
J. Dignam (UChicago) Lecture 14 Feb. 25, 2020 17 / 42
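Kaplan-Meier estimate of S(t)
\[
\hat{S}(t) = \prod_{j=1}^{k} \frac{n_j - d_j}{n_j}
\]
for t_(k) < t ≤ t_(k+1), k = 1, 2, ..., r, with Ŝ(t) = 1 for t ≤ t_(1), and where t_(r+1) = ∞. Here t_(1) < ... < t_(r) are the ordered distinct death times, n_j is the number at risk at t_(j), and d_j is the number of deaths at t_(j).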
Table 2: Kaplan-Meier estimate of the survivor function for the IUD data.
Time interval   n_j   d_j   (n_j - d_j)/n_j   Ŝ(t)
  0-             18     0       1.0000        1.0000
 10-             18     1       0.9444        0.9444
 19-             15     1       0.9333        0.8815
 30-             13     1       0.9231        0.8137
 36-             12     1       0.9167        0.7459
 59-              8     1       0.8750        0.6526
 75-              7     1       0.8571        0.5594
 93-              6     1       0.8333        0.4662
 97-              5     1       0.8000        0.3729
107-              3     1       0.6667        0.2486
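The Ŝ(t) column of Table 2 is a running product of the (n_j - d_j)/n_j column. A minimal sketch in plain Python, with the hypothetical name `kaplan_meier` and the (t_(j), n_j, d_j) rows typed in from the table, is:

```python
# (t_j, n_j, d_j) rows for the IUD data, copied from the table of death times.
rows = [
    (10, 18, 1), (19, 15, 1), (30, 13, 1), (36, 12, 1), (59, 8, 1),
    (75, 7, 1), (93, 6, 1), (97, 5, 1), (107, 3, 1),
]


def kaplan_meier(rows):
    """Return a list of (t_j, factor, surv), surv being the running product."""
    surv = 1.0
    out = []
    for t_j, n_j, d_j in rows:
        factor = (n_j - d_j) / n_j  # conditional probability of surviving t_j
        surv *= factor
        out.append((t_j, factor, surv))
    return out


km = kaplan_meier(rows)
for t_j, factor, surv in km:
    print(f"{t_j:>4}-  {factor:.4f}  {surv:.4f}")
```

Because the estimate changes only at death times, censored observations enter only through the numbers at risk n_j.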
. stset time, failure(status)
------------------------------------------------------------------------------
         18  total observations
          0  exclusions
------------------------------------------------------------------------------
         18  observations remaining, representing
          9  failures in single-record/single-failure data
       1046  total analysis time at risk and under observation
                                                at risk from t =         0
                                     earliest observed entry t =         0
                                          last observed exit t =       107
. sts graph
         failure _d:  status
   analysis time _t:  time
[Figure: Kaplan-Meier survivor function estimate plotted against analysis time, x-axis from 0 to 100.]
. sts graph, gwood
         failure _d:  status
   analysis time _t:  time
[Figure: Kaplan-Meier survivor function estimate with Greenwood confidence bands plotted against analysis time, x-axis from 0 to 100.]
\[
\hat{h}(t) = \frac{d_k}{n_k \tau_k} \tag{9}
\]
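A rough sketch of this hazard estimate on the IUD data follows, in plain Python with the hypothetical name `km_type_hazard`. It assumes that τ_k is the gap between consecutive death times, t_(k+1) - t_(k), and that the estimate applies between those two times. Under that assumption no value is defined after the last death time, so the last row of the table is used only as an endpoint.

```python
# (t_k, n_k, d_k) rows for the IUD data, copied from the table of death times.
rows = [
    (10, 18, 1), (19, 15, 1), (30, 13, 1), (36, 12, 1), (59, 8, 1),
    (75, 7, 1), (93, 6, 1), (97, 5, 1), (107, 3, 1),
]


def km_type_hazard(rows):
    """Return (t_k, t_next, hazard) with hazard = d_k / (n_k * tau_k).

    Assumption: tau_k = t_(k+1) - t_(k), the gap to the next death time.
    """
    out = []
    for (t_k, n_k, d_k), (t_next, _, _) in zip(rows, rows[1:]):
        tau_k = t_next - t_k
        out.append((t_k, t_next, d_k / (n_k * tau_k)))
    return out


hazards = km_type_hazard(rows)
for t_k, t_next, h in hazards:
    print(f"between {t_k} and {t_next}: h = {h:.5f}")
```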
Actuarial Method
Time interval    j   d_j   c_j   n_j
[0 - 10)         1     0     0    18
[10 - 20)        2     2     2    18
[20 - 30)        3     0     1    14
[30 - 40)        4     2     1    13
[40 - 50)        5     0     0    10
[50 - 60)        6     1     2    10
[60 - 70)        7     0     0     7
[70 - 80)        8     1     0     7
[80 - 90)        9     0     0     6
[90 - 100)      10     2     0     6
[100 - 110)     11     1     3     4
Table 4: Life-table estimate of the survivor function for the data on time to discontinuation of the use of an IUD
Time interval    j   d_j   c_j   n_j   1 - d_j/(n_j - c_j/2)   Ŝ(t)
[0 - 10)         1     0     0    18          1                1
[10 - 20)        2     2     2    18          0.8824           0.8824
[20 - 30)        3     0     1    14          1                0.8824
[30 - 40)        4     2     1    13          0.8400           0.7412
[40 - 50)        5     0     0    10          1                0.7412
[50 - 60)        6     1     2    10          0.8889           0.6588
[60 - 70)        7     0     0     7          1                0.6588
[70 - 80)        8     1     0     7          0.8571           0.5647
[80 - 90)        9     0     0     6          1                0.5647
[90 - 100)      10     2     0     6          0.6667           0.3765
[100 - 110)     11     1     3     4          0.6000           0.2259
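The last two columns of Table 4 can be reproduced with a short sketch. This is plain Python, the name `life_table` is hypothetical, and the (d_j, c_j, n_j) counts are typed in from the table. Each interval contributes the factor 1 - d_j/(n_j - c_j/2), and Ŝ(t) is the running product of those factors.

```python
# (start, end, d_j, c_j, n_j) for each 10-unit interval, copied from the table.
intervals = [
    (0, 10, 0, 0, 18), (10, 20, 2, 2, 18), (20, 30, 0, 1, 14),
    (30, 40, 2, 1, 13), (40, 50, 0, 0, 10), (50, 60, 1, 2, 10),
    (60, 70, 0, 0, 7), (70, 80, 1, 0, 7), (80, 90, 0, 0, 6),
    (90, 100, 2, 0, 6), (100, 110, 1, 3, 4),
]


def life_table(intervals):
    """Return (start, end, factor, surv) per interval.

    factor = 1 - d_j / (n_j - c_j / 2); surv is the running product.
    """
    surv = 1.0
    out = []
    for start, end, d_j, c_j, n_j in intervals:
        n_adj = n_j - c_j / 2  # censored subjects count as half at risk
        factor = 1 - d_j / n_adj
        surv *= factor
        out.append((start, end, factor, surv))
    return out


lt = life_table(intervals)
for start, end, factor, surv in lt:
    print(f"[{start} - {end})  {factor:.4f}  {surv:.4f}")
```

Subtracting c_j/2 reflects the actuarial assumption that censoring occurs, on average, halfway through the interval.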
                  Beg.                                  Std.
   Interval      Total   Deaths   Lost   Survival      Error     [95% Conf. Int.]
-------------------------------------------------------------------------------
   10     20        18        2      2     0.8824     0.0781     0.6060    0.9692
   20     30        14        0      1     0.8824     0.0781     0.6060    0.9692
   30     40        13        2      1     0.7412     0.1126     0.4451    0.8951
   50     60        10        1      2     0.6588     0.1267     0.3572    0.8444
   70     80         7        1      0     0.5647     0.1392     0.2642    0.7824
   90    100         6        2      0     0.3765     0.1429     0.1234    0.6337
  100    110         4        1      3     0.2259     0.1448     0.0314    0.5276
-------------------------------------------------------------------------------
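\[
\hat{h}(t) = \frac{d_k}{(n'_k - d_k/2)\,\tau_k} \tag{12}
\]
where τ_k = t_{k+1} - t_k is the length of the k-th interval and n'_k = n_k - c_k/2 is the number at risk adjusted for censoring within the interval.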
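As an illustrative sketch only, the life-table hazard can be evaluated for the IUD intervals as follows. This is plain Python, the name `life_table_hazard` is hypothetical, and n'_k is taken to be n_k - c_k/2. Values printed by a statistical package may be laid out differently.

```python
# (start, end, d_k, c_k, n_k) for each 10-unit interval of the IUD data.
intervals = [
    (0, 10, 0, 0, 18), (10, 20, 2, 2, 18), (20, 30, 0, 1, 14),
    (30, 40, 2, 1, 13), (40, 50, 0, 0, 10), (50, 60, 1, 2, 10),
    (60, 70, 0, 0, 7), (70, 80, 1, 0, 7), (80, 90, 0, 0, 6),
    (90, 100, 2, 0, 6), (100, 110, 1, 3, 4),
]


def life_table_hazard(intervals):
    """Return (start, end, hazard), hazard = d_k / ((n'_k - d_k/2) * tau_k)."""
    out = []
    for start, end, d_k, c_k, n_k in intervals:
        tau_k = end - start      # interval length
        n_adj = n_k - c_k / 2    # n'_k, adjusted for censoring (assumed definition)
        out.append((start, end, d_k / ((n_adj - d_k / 2) * tau_k)))
    return out


lt_hazard = life_table_hazard(intervals)
for start, end, h in lt_hazard:
    print(f"[{start} - {end})  h = {h:.4f}")
```

The denominator approximates the total time at risk in the interval, assuming deaths also occur halfway through it on average.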
Actuarial Estimate
. ltable time status, interval(10) graph
[Figure: life-table estimate of the survivor function. y-axis: Proportion Surviving (.2 to 1); x-axis: time (20 to 120).]
. ltable time status, interval(10) hazard
Estimating S(t)