Charles M. Rowland
2010-Jul-15 20:45 UTC
[R] Standard Error for individual patient survival with survfit and summary.survfit
I am using the coxph, survfit and summary.survfit functions to calculate an
estimate of predicted survival with confidence interval for future patients
based on the survival distribution of an existing cohort of subjects. I am
trying to understand the calculation and interpretation of the std.err and
confidence intervals printed by the summary.survfit function.
Using the default confidence interval type of "log", I would have
expected the confidence intervals to be calculated as
exp( log(surv) +/- z*std.err )
where z is the appropriate quantile of the normal distribution. This does seem
to be the case, but only when using the std.err stored in the survfit object
rather than the std.err that is printed by and returned from summary.survfit.
Can anyone tell me what is the difference between these two standard errors and
how should I interpret the confidence intervals and std.err given these
differences?
An example using R version 2.11.1 with survival package version 2.35-8 on
Windows is below.
Thanks,
Charley Rowland
Celera, Corp.
Example:
> # Fit Cox model to existing data
> fit <- coxph(Surv(futime, fustat) ~ age, data = ovarian)
> # Calculate predicted survival for a new patient
> s.fit <- survfit(fit, newdata = data.frame(age = 60))
> sum.s.fit <- summary(s.fit)
> # Print summary of predicted survival estimates.
> sum.s.fit
Call: survfit(formula = fit, newdata = data.frame(age = 60))
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
   59     26       1    0.978  0.0240        0.932        1.000
  115     25       1    0.952  0.0390        0.878        1.000
  156     24       1    0.917  0.0556        0.814        1.000
  268     23       1    0.880  0.0704        0.752        1.000
  329     22       1    0.818  0.0884        0.662        1.000
  353     21       1    0.760  0.0991        0.588        0.981
  365     20       1    0.698  0.1079        0.516        0.945
  431     17       1    0.623  0.1187        0.429        0.905
  464     15       1    0.549  0.1248        0.352        0.858
  475     14       1    0.480  0.1267        0.286        0.805
  563     12       1    0.382  0.1332        0.193        0.757
  638     11       1    0.297  0.1292        0.127        0.697
### The confidence intervals extracted from the survfit and summary.survfit
### objects are identical and agree with what is printed by summary
> s.fit$lower
[1] 0.9318402 0.8784979 0.8144399 0.7522512 0.6616253 0.5881502 0.5159010
    0.4287437 0.3519573 0.2857688 0.1932689 0.1267205
> sum.s.fit$lower
[1] 0.9318402 0.8784979 0.8144399 0.7522512 0.6616253 0.5881502 0.5159010
    0.4287437 0.3519573 0.2857688 0.1932689 0.1267205
### However, the standard errors extracted from the survfit and summary.survfit
### objects are different
### Note the std.err from survfit does not agree with what is printed by the
### summary.survfit function
> s.fit$std.err
[1] 0.02459050 0.04099162 0.06065735 0.07997872 0.10806000 0.13052489
    0.15448058 0.19052784 0.22719704 0.26416491 0.34826243 0.43480448
### Note the std.err from the summary object agrees with what is printed by
### summary.survfit
> sum.s.fit$std.err
[1] 0.02404586 0.03902366 0.05563833 0.07037450 0.08836045 0.09914814
    0.10787837 0.11866803 0.12481970 0.12669151 0.13320179 0.12919546
### The (lower) confidence interval printed by the summary.survfit function
### appears to be based on the std.err contained in the survfit object (not the
### std.err printed in summary)
> zval <- qnorm(1 - (1 - .95)/2, 0, 1)
> exp( log(s.fit$surv) - zval*s.fit$std.err)
[1] 0.9318402 0.8784979 0.8144399 0.7522512 0.6616253 0.5881502 0.5159010
    0.4287437 0.3519573 0.2857688 0.1932689 0.1267205
> exp( log(s.fit$surv) - zval*sum.s.fit$std.err)
[1] 0.9328355 0.8818930 0.8224912 0.7665456 0.6876705 0.6254552 0.5652418
    0.4935883 0.4301637 0.3741385 0.2945926 0.2306651
Terry Therneau
2010-Jul-16 13:15 UTC
[R] Standard Error for individual patient survival with survfit and summary.survfit
> Can anyone tell me what is the difference between these two standard
> errors and how should I interpret the confidence intervals and std.err
> given these differences?

help(survfit.object) will give you the answer. The std.err in the object is for
the cumulative hazard; the printout uses a Taylor series argument to compute
the se of the survival. (There are several reasons for this choice, including
that se(survival) is not well defined by the standard formulas when S=0.)

Terry Therneau
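
The Taylor series argument mentioned in the reply can be checked against the numbers posted in the question. The sketch below is only an illustration and assumes the first-order form se(S) ~= S * se(H), where H is the cumulative hazard; the reply does not spell out the exact formula. The variable names (se_H, se_S, lower, surv, se_S_delta) are made up for this example, and the vectors are copied from the output in the original message. Because the summary prints survival to only three decimals, the unrounded survival is recovered here from the lower limit, using the relationship lower = exp(log(surv) - zval*std.err) that the question already verified.

```r
# Values copied from the output posted in the question.
se_H  <- c(0.02459050, 0.04099162, 0.06065735, 0.07997872, 0.10806000,
           0.13052489, 0.15448058, 0.19052784, 0.22719704, 0.26416491,
           0.34826243, 0.43480448)   # s.fit$std.err (cumulative hazard scale)
se_S  <- c(0.02404586, 0.03902366, 0.05563833, 0.07037450, 0.08836045,
           0.09914814, 0.10787837, 0.11866803, 0.12481970, 0.12669151,
           0.13320179, 0.12919546)   # sum.s.fit$std.err (survival scale)
lower <- c(0.9318402, 0.8784979, 0.8144399, 0.7522512, 0.6616253,
           0.5881502, 0.5159010, 0.4287437, 0.3519573, 0.2857688,
           0.1932689, 0.1267205)     # s.fit$lower

zval <- qnorm(1 - (1 - 0.95) / 2)

# Recover the unrounded survival from lower = exp(log(surv) - zval * se_H).
surv <- lower * exp(zval * se_H)

# Assumed first-order Taylor (delta-method) step: se(S) ~= S * se(H).
se_S_delta <- surv * se_H

# Side-by-side comparison of the posted and the reconstructed standard errors.
print(round(cbind(surv, se_H, se_S, se_S_delta), 4))
```

If the assumption holds, the last two columns should agree up to the rounding of the posted values. With the fitted objects from the question still in the workspace, the same comparison would presumably be all.equal(sum.s.fit$std.err, s.fit$surv * s.fit$std.err). This also suggests why the "log" interval is built from the std.err in the survfit object: log(S) = -H, so the standard error of log(S) is the standard error of the cumulative hazard.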