Frostygoat
2010-Jul-07 15:25 UTC
[R] Appropriateness of survdiff {survival} for non-censored data
I read through Harrington and Fleming (1982), but it is beyond my statistical comprehension. I have survival data for insects that have a very finite life span. I'm trying to test for differences in survival distributions between different groups. I understand that the medical field most often deals with censored data and that survival analysis, at least in the survival package, is largely built around these conventions and differs from a classical biological perspective. For example, in life-table analysis of insects there is often no need to estimate survival with a Kaplan-Meier estimate, because it is relatively easy to follow a cohort of individuals through the entire course of life. Thus I question the appropriateness of using survdiff in my analysis: I have exact data, yet in survdiff I would be testing on the Kaplan-Meier estimate of these data. Thanks for any help.
Terry Therneau
2010-Jul-08 13:55 UTC
[R] Appropriateness of survdiff {survival} for non-censored data
The query: "Thus I question the appropriateness of using survdiff in my analysis; I have exact data yet I would be testing on the Kaplan-Meier estimate of these data in survdiff. Thanks for any help."

My thoughts: There are two aspects of survival analysis you need to think about. The first, as you've noted, is the nuisance of censored data and the fact that this forces different software. All of that software works fine with uncensored data; the Kaplan-Meier estimate, for instance, simply reduces to one minus the empirical cdf.

The second is that the models commonly used are ones that have been found to work well for this kind of data. It is easy to do a censored-data t-test, for instance [survreg(Surv(y) ~ x, dist='gaussian')], but it is almost never done. The reason is that the effect of covariates on survival times is not well described by a location shift, e.g., "everyone gets 3 more weeks". The log-rank test is most powerful for a proportional shift in the hazard rate, which is how a lot of covariates seem to work for this kind of data. BTW, in uncensored data the log-rank test is equivalent to the Savage exponential scores test, which comes from the nonparametrics literature but is rarely used there: most of that literature deals with problems where the effect of 'x' is not a shift in hazard.

If the way in which covariates affect insect lifetimes is similar to how they work in human biology or industrial reliability, then survival methods would be a good choice. The answer to this is biological, not statistical.

Terry Therneau
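A small sketch of the first point, for anyone who wants to check it on their own data. The lifetimes and group labels below are made up purely for illustration (they are not from the original question), and the variable names are hypothetical. With every death observed, Surv(days) treats all subjects as events, the Kaplan-Meier curve from survfit should match one minus the empirical cdf, and survdiff runs the log-rank test without any special handling.

```r
library(survival)

# Hypothetical, fully observed insect lifetimes in days (illustrative only)
days  <- c(8, 11, 12, 14, 15, 15, 17, 19, 20, 22,
           13, 16, 18, 21, 23, 24, 24, 26, 29, 31)
group <- rep(c("control", "treated"), each = 10)

# With a single argument, Surv() assumes every subject had the event,
# i.e. there is no censoring.
fit <- survfit(Surv(days) ~ 1)

# Compare the Kaplan-Meier estimate with 1 - empirical cdf,
# evaluated at the distinct death times reported by survfit.
km_matches_ecdf <- isTRUE(all.equal(fit$surv, 1 - ecdf(days)(fit$time)))
print(km_matches_ecdf)

# Log-rank test for a difference between the two groups
lr <- survdiff(Surv(days) ~ group)
print(lr)

# p-value from the chi-square statistic (two groups, so 1 degree of freedom)
p_value <- pchisq(lr$chisq, df = 1, lower.tail = FALSE)
print(p_value)
```

Whether the log-rank test is the right comparison still depends on the biological question above, i.e. whether the groups plausibly differ by a shift in hazard.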