Svetlana Eden
2008-Dec-04 21:05 UTC
[R] comparing SAS and R survival analysis with time-dependent covariates
Dear R-help,

I was comparing SAS (I do not know which version) and R (version 2.6.0 (2007-10-03) on Linux) survival analyses with time-dependent covariates. The results differed substantially, so I tried to work out from a short example where I went wrong.

The following example shows that even when the 'method' argument of the R function coxph and the 'ties' option of the SAS procedure phreg are set to the same value, the Cox regression results differ. This seems to happen when event times are tied with covariate change times.

My question is which software, R or SAS, is more reliable for survival analysis with time-dependent covariates, or whether you could point out a problem in the following example.

Example. SAS gives HR=3.236:

    data trythis;
    input id days timedeli stat;
    datalines;
    1 3 .5 1
    2 1.5 1 1
    3 6 1000 0
    4 8 1000 1
    5 8 1 0
    6 21 1000 1
    7 11 3 1
    run;

    proc phreg data=trythis;
      model days*stat(0)=deli/risklimits ties=exact;
      if timedeli>days then deli=0; else deli=1;
    run;

Example (continued). R gives HR=3.91:

    tmp = data.frame(id=c(1, 1, 2, 2, 3, 4, 5, 5, 6, 7, 7),
                     start=c(0.0, 0.5, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 3.0),
                     end=c(0.5, 3.0, 1.0, 1.5, 6.0, 8.0, 1.0, 8.0, 21.0, 3.0, 11.0),
                     delir=c(0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1),
                     outcome=c(0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1))
    tmp
    surv = Surv(time=tmp$start, time2=tmp$end, tmp$outcome)
    cphres = coxph(surv ~ tmp$delir, method="exact")
    summary(cphres)[["coef"]]

After breaking a tie between an event time and a time-dependent covariate observation, R gives the same result as SAS:

    tmp$end[2] = tmp$end[2] + .1
    tmp
    surv = Surv(time=tmp$start, time2=tmp$end, tmp$outcome)
    cphres = coxph(surv ~ tmp$delir, method="exact")
    summary(cphres)[["coef"]]

Thank you so much for your time,
Svetlana
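The SAS step above stores one row per subject, while the R data frame stores one row per (start, end] interval. As a rough sketch of how the two layouts relate, the base-R code below rebuilds the interval layout from the one-row-per-subject layout and then looks for event times that coincide with a covariate change time. The helper name `to_intervals` and the object names `wide` and `long` are made up for illustration; they do not appear in the original post.

```r
# One row per subject, copied from the SAS datalines above
wide <- data.frame(
  id       = 1:7,
  days     = c(3, 1.5, 6, 8, 8, 21, 11),
  timedeli = c(0.5, 1, 1000, 1000, 1, 1000, 3),
  stat     = c(1, 1, 0, 1, 0, 1, 1)
)

# Hypothetical helper: split each subject's follow-up at the time the
# covariate switches from 0 to 1, if that happens before follow-up ends.
to_intervals <- function(w) {
  pieces <- lapply(seq_len(nrow(w)), function(i) {
    r <- w[i, ]
    if (r$timedeli < r$days) {
      data.frame(id = r$id,
                 start = c(0, r$timedeli),
                 end = c(r$timedeli, r$days),
                 delir = c(0, 1),
                 outcome = c(0, r$stat))   # the event belongs to the last piece
    } else {
      data.frame(id = r$id, start = 0, end = r$days,
                 delir = 0, outcome = r$stat)
    }
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

long <- to_intervals(wide)

# Event times that are also covariate change times
tied <- intersect(long$end[long$outcome == 1], long$start[long$start > 0])
tied  # 3 for these data: subject 1 has an event when subject 7's covariate changes
```

The rows of `long` match the `tmp` data frame in the post, so the single tie at time 3 is the only place where the two programs could be assigning a different covariate value.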
Terry Therneau
2008-Dec-05 13:56 UTC
[R] comparing SAS and R survival analysis with time-dependent covariates
This query of "why do SAS and S give different answers for Cox models" comes up every so often. The two most common reasons are that

  a. they are using different options for the ties
  b. the SAS and S data sets are slightly different.

You have both errors.

First, I make sure I have the same data set by reading a common file, and then compare the results.

tmt54% more sdata.txt
1 0.0 0.5 0 0
1 0.5 3.0 1 1
2 0.0 1.0 0 0
2 1.0 1.5 1 1
3 0.0 6.0 0 0
4 0.0 8.0 0 1
5 0.0 1.0 0 0
5 1.0 8.0 1 0
6 0.0 21.0 0 1
7 0.0 3.0 0 0
7 3.0 11.0 1 1

tmt55% more test.sas
options linesize=80;
data trythis;
  infile 'sdata.txt';
  input id start end delir outcome;

proc phreg data=trythis;
  model (start, end)*outcome(0)=delir/ ties=discrete;

proc phreg data=trythis;
  model (start, end)*outcome(0)=delir/ ties=efron;

tmt56% more test.r
trythis <- read.table('sdata.txt',
                      col.names=c("id", "start", "end", "delir", "outcome"))
coxph(Surv(start, end, outcome) ~ delir, data=trythis, ties='exact')
coxph(Surv(start, end, outcome) ~ delir, data=trythis, ties='efron')

-----------------
I now get comparable answers.

Note that Cox's "exact partial likelihood" is the correct form to use for discrete time data. I labeled this as the 'exact' method, whereas SAS calls it the 'discrete' method. The "exact marginal likelihood" of Prentice et al, which SAS calls the 'exact' method, is not implemented in S.

As to which package is more reliable, I can only point to a set of formal test cases that are found in Appendix E of the book by Therneau and Grambsch. These are small data sets where the coefficients, log-likelihood, residuals, etc. have all been worked out exactly in closed form. R gets all of these test cases right, SAS gets almost all.

Terry Therneau
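To make the "slightly different data sets" point concrete, here is a minimal base-R sketch that maximises the partial likelihood by hand under two conventions for a covariate that changes exactly at an event time. It rests on two assumptions: that coxph treats each row as the interval (start, end], so a change at time t does not yet count at t, and that, as I read it, the statement `if timedeli>days then deli=0; else deli=1;` in the original SAS program sets deli=1 whenever timedeli is less than or equal to the event time being evaluated. The function names `loglik` and `hazard_ratio` and the flag `change_counts_at_tie` are invented for this illustration and belong to neither package.

```r
# One row per subject, as in the SAS datalines of the original post
wide <- data.frame(
  id       = 1:7,
  days     = c(3, 1.5, 6, 8, 8, 21, 11),
  timedeli = c(0.5, 1, 1000, 1000, 1, 1000, 3),
  stat     = c(1, 1, 0, 1, 0, 1, 1)
)

# Log partial likelihood for the single 0/1 covariate.
# Valid only when no two events share a time, which holds for these data
# (so the choice of ties option should not change the estimate here).
loglik <- function(beta, w, change_counts_at_tie) {
  event_times <- w$days[w$stat == 1]
  stopifnot(anyDuplicated(event_times) == 0)
  total <- 0
  for (t in event_times) {
    at_risk <- w$days >= t                    # still under observation at t
    x <- if (change_counts_at_tie) {
      as.numeric(w$timedeli <= t)             # assumed SAS-program convention
    } else {
      as.numeric(w$timedeli < t)              # (start, end] convention
    }
    dead <- w$days == t & w$stat == 1         # exactly one subject per event time
    total <- total + beta * x[dead] - log(sum(exp(beta * x[at_risk])))
  }
  total
}

# Maximise over beta and report exp(beta), the hazard ratio
hazard_ratio <- function(flag) {
  fit <- optimize(loglik, c(-5, 5), w = wide,
                  change_counts_at_tie = flag, maximum = TRUE)
  exp(fit$maximum)
}

hr_open   <- hazard_ratio(FALSE)  # about 3.91, the value reported for R
hr_closed <- hazard_ratio(TRUE)   # about 3.236, the value reported for SAS
```

Under these assumptions the only difference between the two fits is subject 7's covariate value in the risk set at time 3, which is enough to move the estimate from about 3.91 to about 3.236. Reading one common (start, end) file into both programs, as in the reply above, removes that ambiguity.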