Vumani Dlamini
2008-Aug-15 12:11 UTC
[R] estimating the proportion without recurring ailment based on the nelson-aalen estimator
Dear useRs,

I'm trying to estimate the proportion of individuals without a certain recurring ailment at several time points. The data are of the survival type, with "start"-"stop" dates and an indicator of whether the individual had the ailment in that interval. Some cases are observed until database closure and some died or were lost to follow-up. The interest is not in death.

I have tried using "coxph" and "cph" as

    fitCOXPH <- coxph(Surv(time=start, time2=stop, sick) ~ strata(adult) + frailty(individual), data=test.data)
    fitKM <- survfit(fitCOXPH)
    summary(fitKM, times=sort(seq(0, 128, by=16)))

What I noted is that the proportion was decreasing with increasing time, which I felt was incorrect!

I then tried estimating the number without the ailment in each discrete month, but computing the denominator (the number alive and at risk) and getting the confidence intervals becomes a huge undertaking. As an approximation I tend to use the number who survive to the midpoint of the month, which sometimes results in proportions above 1, for instance at the beginning, where no individuals have the ailment but some are censored before the midpoint time.

I have reason to believe there is a way to do this within the survival packages, but I haven't figured out how.

Thanking you in advance,
Vumani

PS: Sorry for cross-posting. I wasn't sure whether "r-sig-epi@stat.math.ethz.ch" or "r-help@R-project.org" was more suitable for this question.
Terry Therneau
2008-Aug-18 13:31 UTC
[R] estimating the proportion without recurring ailment based on the nelson-aalen estimator
--- begin included message ---
Dear useRs, I'm trying to estimate the proportion of individuals without a certain recurring ailment at several time points. The data are of the survival type, with "start"-"stop" dates and whether the individual had the ailment in that interval. Some cases are observed until database closure and some died or are lost to followup. The interest is not on death. I have tried using "coxph" and "cph" as
. . .
--- end included ---

The simple solution is to use survfit:

> fit <- survfit(Surv(start, stop, sick) ~ 1, data=test.data)
> plot(fit, fun='cumhaz')
> summary(fit, times=seq(0, 128, by=16))

You do not say whether the subjects can have multiple episodes. If so, then the standard errors above will be underestimates and one needs to compute a jackknife variance. If each subject can have at most one episode then the above will be correct.

Using coxph + frailty + survfit does not address the problem; the survival curve function is not yet intelligent enough for that.

Note that the cumulative hazard = -log(survival) is an estimate of the mean number of events M(t) per subject, whereas the survival curve S(t) is an estimate of Pr(0 events by time t).

Terry Therneau
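
To make the last point of the reply concrete, here is a minimal sketch in base R that computes the Nelson-Aalen cumulative hazard by hand from start-stop data. The data frame `toy` and its values are made up for illustration and are not the poster's `test.data`; the column names simply mirror those used in the thread. The sketch treats each interval (start, stop] as at risk at time t when start < t <= stop, which is an assumption about how the intervals were recorded.

```r
# Toy start-stop data (hypothetical values, not from the thread).
# Individual 1 has two episodes, so this is a recurrent-event data set.
toy <- data.frame(
  individual = c(1, 1, 1, 2, 2, 3, 4),
  start      = c(0, 10, 30, 0, 20, 0, 0),
  stop       = c(10, 30, 50, 20, 50, 30, 40),
  sick       = c(1, 1, 0, 1, 0, 0, 1)
)

# Distinct times at which an episode was recorded.
event_times <- sort(unique(toy$stop[toy$sick == 1]))

# Nelson-Aalen increment at time t:
# (episodes recorded at t) / (intervals at risk at t).
increment <- sapply(event_times, function(t) {
  at_risk <- sum(toy$start < t & toy$stop >= t)
  events  <- sum(toy$stop == t & toy$sick == 1)
  events / at_risk
})

nelson_aalen <- data.frame(
  time   = event_times,
  cumhaz = cumsum(increment)  # estimate of M(t), mean episodes per subject
)

# The transformation mentioned in the reply: survival = exp(-cumulative hazard).
nelson_aalen$exp_neg_cumhaz <- exp(-nelson_aalen$cumhaz)

print(nelson_aalen)
```

In this toy example the cumulative hazard passes 1 by the last event time, which shows why M(t) has to be read as a mean number of episodes per subject and not as a proportion. The sketch gives point estimates only; it does not attempt the jackknife variance mentioned in the reply.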