Dear List, Is there an automated way to use the survival package to generate survival rate estimates and their standard errors? To be clear, *not *the survivorship estimates (which are cumulative), but the survival *rate * estimates... Thank you in advance for any help. Best, Brian [[alternative HTML version deleted]]
On May 12, 2011, at 12:40 PM, Brian McLoone wrote:> Dear List, > > Is there an automated way to use the survival package to generate > survival > rate estimates and their standard errors? To be clear, *not *the > survivorship estimates (which are cumulative), but the survival > *rate * > estimates...Not entirely clear, but from context I suspect you mean instantaneous hazard? (Survival is not a rate but rather a proportion. Mortality can be a rate. The instantaneous hazard is the decrement in survival per unit time divided by the survival to that time.) So at each death the non-parametric estimate would divide current deaths (often 1 but ties are possible) by time since last death and then divide by proportion surviving. Or if you have a semi-parametric estimated function for survival (such as might be output from `basehaz` which calls `survfit`) take: -delta_survival/delta_time/survival tdata <- data.frame(time =c(1,1,1,2,2,2,3,3,3,4,4,4), status=rep(c(1,0,2),4), n =c(12,3,2,6,2,4,2,0,2,3,3,5)) fit <- survfit(Surv(time, time, status, type='interval') ~1, data=tdata, weight=n) > T <- c(0, fit$time) > S <- c(1, fit$surv) > (-diff(S)/diff(T) )/fit$surv [1] 0.8602308 0.8247746 0.4044324 1.2115931 I don't know if Therneau's opinion about estimating smoothed hazards has changed: http://finzi.psych.upenn.edu/Rhelp10/2009-March/193104.html There is also a muhaz package which may generate standard errors for its estimates but I have read elsewhere that is does not do Cox models. http://finzi.psych.upenn.edu/R/library/muhaz/html/00Index.html -- David Winsemius, MD West Hartford, CT
---begin included message -- Is there an automated way to use the survival package to generate survival rate estimates and their standard errors? To be clear, *not *the survivorship estimates (which are cumulative), but the survival *rate * estimates... --- end -- The classic method in epidemiology is simple hazard rates = total events / total time. This is done in more general form by the pyears (person-years) routine. pfit <- pyears(Surv(time, status) ~ agegrp + sex, data=mydata) It's advantage over simple tables is the ability to do time-dependent categories. For instance if I want death rates by decade of age: acut <- tcut(mydata$age, 0:12 *10) pfit2 <- pyears(time, status) ~ acut + sex, data=mydata) The first table was based on age at the start, the second on current age. For non-parametric estimates of the hazard function look at the "survival" task view on CRAN; there are several choices. Terry Therneau
>> pfit2 <- pyears(time, status) ~ acut + sex, data=mydata)>I think there was an omitted Surv function call above.Correct>And a follow on question and a tentative answer: Can one generate a >'pyears' table that incorporates categories of time observed by >tcut()- ting the time variable and having it in both the Surv argument >and on the RHS of the formula? Experimentation makes me think this >works very well.The tcut function is simply "cut" with a different class added. In fact, I just now reviewed the code and need to update it: it reprises an old version of cut, due to a flaw in cut at the time, but that issue is long since resolved. I'll update it to call cut and add to the returned class. The pyears function looks for objects of class "tcut" and treats them special: the value contained therein is the category one starts in, but the category can change over time as you age. The key "trick" with tcut objects is that the follow-up time (left hand side of the formula) and the tcut result must be in the same units for program to do this properly. As to the direct question: yes we often break up follow-up time. A simple example: > fugroup <- tcut(rep(0, nrow(lung)), c(0, 182, 365, 730, 100*365), labels=c("0-6m", "6m-1yr", "1-2yr", "2+ yr")) > fit <- pyears(Surv(time, status) ~ sex + fugroup, data=lung) > fit$event 0-6m 6m-1yr 1-2yr 2+ yr 1 51 34 24 3 2 15 21 14 3 In this advanced lung cancer data set, the majority of the deaths happen within 1 year of study enrollment. Don't make the common mistake of "fugroup <- tcut(lung$time, ...", which would start each subject's time scale at the day of their last follow-up. Terry T
Reasonably Related Threads
- Survival: pyears and ratetable: expected events
- Estimate and plot hazard function using "muhaz" package
- Confidence Intervals on Hazard Plots
- Looking for new maintainer of orphans R2HTML SemiPar cghseg hexbin lgtdl monreg muhaz operators pamr
- Nesting in Cox proportional hazards survivorship analysis