Charles M. Rowland
2010-Jul-15  20:45 UTC
[R] Standard Error for individual patient survival with survfit and summary.survfit
I am using the coxph, survfit and summary.survfit functions to calculate an
estimate of predicted survival with confidence interval for future patients
based on the survival distribution of an existing cohort of subjects.  I am
trying to understand the calculation and interpretation of the std.err and
confidence intervals printed by the summary.survfit function.
Using the default confidence interval type of "log", I would have
expected the confidence intervals to be calculated as
                exp( log(surv) +/- z*std.err )
where z is the appropriate quantile of the normal distribution. This does seem
to be the case, but only when using the std.err found in the survfit object rather
than the std.err that is printed by and returned with the summary.survfit function.
Can anyone tell me what is the difference between these two standard errors and
how should I interpret the confidence intervals and std.err given these
differences?
An example using R version 2.11.1 with survival package version 2.35-8 on
Windows is below.
Thanks,
Charley Rowland
Celera, Corp.
Example:
> fit <- coxph(Surv(futime, fustat) ~ age, data = ovarian)   # Fit Cox model to existing data
> s.fit <- survfit(fit, newdata = data.frame(age = 60))      # Calculate predicted survival for a new patient
> sum.s.fit <- summary(s.fit)                                # Print summary of predicted survival estimates
> sum.s.fit
Call: survfit(formula = fit, newdata = data.frame(age = 60))
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
   59     26       1    0.978  0.0240        0.932        1.000
  115     25       1    0.952  0.0390        0.878        1.000
  156     24       1    0.917  0.0556        0.814        1.000
  268     23       1    0.880  0.0704        0.752        1.000
  329     22       1    0.818  0.0884        0.662        1.000
  353     21       1    0.760  0.0991        0.588        0.981
  365     20       1    0.698  0.1079        0.516        0.945
  431     17       1    0.623  0.1187        0.429        0.905
  464     15       1    0.549  0.1248        0.352        0.858
  475     14       1    0.480  0.1267        0.286        0.805
  563     12       1    0.382  0.1332        0.193        0.757
  638     11       1    0.297  0.1292        0.127        0.697
### The confidence intervals extracted from the survfit and summary.survfit
### objects are identical and agree with what is printed by summary
> s.fit$lower
 [1] 0.9318402 0.8784979 0.8144399 0.7522512 0.6616253 0.5881502 0.5159010
0.4287437 0.3519573 0.2857688 0.1932689 0.1267205
> sum.s.fit$lower
 [1] 0.9318402 0.8784979 0.8144399 0.7522512 0.6616253 0.5881502 0.5159010
0.4287437 0.3519573 0.2857688 0.1932689 0.1267205
### However, the standard errors extracted from the survfit and summary.survfit
### objects are different
### Note the std.err from survfit does not agree with what is printed by the
### summary.survfit function
> s.fit$std.err
 [1] 0.02459050 0.04099162 0.06065735 0.07997872 0.10806000 0.13052489
0.15448058 0.19052784 0.22719704 0.26416491 0.34826243 0.43480448
### Note the std.err from the summary object agrees with what is printed by
### summary.survfit
> sum.s.fit$std.err
 [1] 0.02404586 0.03902366 0.05563833 0.07037450 0.08836045 0.09914814
0.10787837 0.11866803 0.12481970 0.12669151 0.13320179 0.12919546
### The (lower) confidence interval printed by the summary.survfit function
### appears to be based on the std.err contained in the survfit object (not the
### std.err printed in summary)
> zval <- qnorm(1 - (1 - .95)/2, 0, 1)
> exp( log(s.fit$surv) - zval*s.fit$std.err)
 [1] 0.9318402 0.8784979 0.8144399 0.7522512 0.6616253 0.5881502 0.5159010
0.4287437 0.3519573 0.2857688 0.1932689 0.1267205
> exp( log(s.fit$surv) - zval*sum.s.fit$std.err)
 [1] 0.9328355 0.8818930 0.8224912 0.7665456 0.6876705 0.6254552 0.5652418
0.4935883 0.4301637 0.3741385 0.2945926 0.2306651
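
The same check can be sketched for the upper limits. This is only an illustration: it uses the survival estimates as printed in the summary table (rounded to 3 digits) together with the s.fit$std.err values listed above, and it assumes that the upper limit is truncated at 1, which is what the printed table suggests. The vector names are made up for this sketch.

```r
# Survival estimates as printed by summary (rounded to 3 digits)
surv <- c(0.978, 0.952, 0.917, 0.880, 0.818, 0.760,
          0.698, 0.623, 0.549, 0.480, 0.382, 0.297)

# Standard errors copied from s.fit$std.err (the survfit object)
se.obj <- c(0.02459050, 0.04099162, 0.06065735, 0.07997872,
            0.10806000, 0.13052489, 0.15448058, 0.19052784,
            0.22719704, 0.26416491, 0.34826243, 0.43480448)

# Upper 95% limits as printed by summary
upper.printed <- c(1.000, 1.000, 1.000, 1.000, 1.000, 0.981,
                   0.945, 0.905, 0.858, 0.805, 0.757, 0.697)

zval <- qnorm(0.975)

# "log" type interval on the upper side.
# Assumption: the result is capped at 1, as in the printed table.
upper <- pmin(1, exp(log(surv) + zval * se.obj))

round(upper, 3)

# Largest discrepancy from the printed limits; because surv is only
# known to 3 digits here, agreement is expected only up to rounding.
max(abs(upper - upper.printed))
```

Since the inputs are rounded, the recomputed limits can only be expected to match the printed ones to about the third decimal place.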
Terry Therneau
2010-Jul-16  13:15 UTC
[R] Standard Error for individual patient survival with survfit and summary.survfit
> Can anyone tell me what is the difference between these two standard
> errors and how should I interpret the confidence intervals and std.err
> given these differences?

help(survfit.object) will give you the answer. The std.err in the object is for the cumulative hazard; the printout uses a Taylor series argument to compute the se of the survival. (There are several reasons for this choice, including that se(survival) is not well defined by the standard formulas when S=0.)

Terry Therneau
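
The Taylor series argument can be checked against the numbers posted in the question. With S = exp(-H), a first-order expansion gives se(S) approximately equal to S * se(H), so multiplying the std.err stored in the survfit object by the survival estimate should reproduce the std.err reported by summary. The sketch below is a rough check only: it uses the survival values as printed (3 digits), and the vector names are invented for the illustration.

```r
# Survival estimates as printed by summary (rounded to 3 digits)
surv <- c(0.978, 0.952, 0.917, 0.880, 0.818, 0.760,
          0.698, 0.623, 0.549, 0.480, 0.382, 0.297)

# s.fit$std.err from the question: se of the cumulative hazard
se.obj <- c(0.02459050, 0.04099162, 0.06065735, 0.07997872,
            0.10806000, 0.13052489, 0.15448058, 0.19052784,
            0.22719704, 0.26416491, 0.34826243, 0.43480448)

# sum.s.fit$std.err from the question: se of the survival
se.sum <- c(0.02404586, 0.03902366, 0.05563833, 0.07037450,
            0.08836045, 0.09914814, 0.10787837, 0.11866803,
            0.12481970, 0.12669151, 0.13320179, 0.12919546)

# First-order Taylor (delta method) approximation: se(S) ~ S * se(H)
se.delta <- surv * se.obj

# Side-by-side comparison; differences reflect the 3-digit rounding of surv
cbind(se.delta = round(se.delta, 4), se.sum = round(se.sum, 4))
max(abs(se.delta - se.sum))
```

On this reading, the confidence limits are built on the log scale from the cumulative hazard standard error, while the std.err column in the printout is that same quantity translated to the survival scale.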