thr3ads.net - R help - [R] results of a survival analysis change when converting the data to counting process format [Aug 2019]

Ferenci Tamas

2019-Aug-18 17:10 UTC

[R] results of a survival analysis change when converting the data to counting process format

Dear All,

Consider the following simple example:

library( survival )
data( veteran )

coef( coxph(Surv(time, status) ~ trt + prior + karno, data = veteran) )
         trt        prior        karno 
 0.180197194 -0.005550919 -0.033771018

Note that we have neither time-dependent covariates, nor time-varying
coefficients, so the results should be the same if we change to
counting process format, no matter where we cut the times.

That's true if we cut at event times:

veteran2 <- survSplit( Surv(time, status) ~ trt + prior + karno,
                       data = veteran, cut = unique( veteran$time ) )

coef( coxph(Surv(tstart,time, status) ~ trt + prior + karno, data = veteran2 ) )
         trt        prior        karno 
 0.180197194 -0.005550919 -0.033771018 

But, quite interestingly, it is not true if we cut at every day:

veteran3 <- survSplit( Surv(time, status) ~ trt + prior + karno,
                       data = veteran, cut = 1:max(veteran$time) )

coef( coxph(Surv(tstart,time, status) ~ trt + prior + karno, data = veteran3 ) )
         trt        prior        karno 
 0.180197215 -0.005550913 -0.033771016 

The difference is not large, but definitely more than just a rounding
error, or something like that.

What's going on? How can the results go wrong, especially by
including more cutpoints?

Thank you in advance,
Tamas

Göran Broström

2019-Aug-22 19:48 UTC

[R] results of a survival analysis change when converting the data to counting process format

On 2019-08-18 19:10, Ferenci Tamas wrote:
> Dear All,
> 
> Consider the following simple example:
> 
> library( survival )
> data( veteran )
> 
> coef( coxph(Surv(time, status) ~ trt + prior + karno, data = veteran) )
>           trt        prior        karno
>   0.180197194 -0.005550919 -0.033771018
> 
> Note that we have neither time-dependent covariates, nor time-varying
> coefficients, so the results should be the same if we change to
> counting process format, no matter where we cut the times.
> 
> That's true if we cut at event times:
> 
> veteran2 <- survSplit( Surv(time, status) ~ trt + prior + karno,
>                         data = veteran, cut = unique( veteran$time ) )
> 
> coef( coxph(Surv(tstart,time, status) ~ trt + prior + karno, data = veteran2 ) )
>           trt        prior        karno
>   0.180197194 -0.005550919 -0.033771018
> 
> But quite interestingly not true, if we cut at every day:
> 
> veteran3 <- survSplit( Surv(time, status) ~ trt + prior + karno,
>                         data = veteran, cut = 1:max(veteran$time) )
> 
> coef( coxph(Surv(tstart,time, status) ~ trt + prior + karno, data = veteran3 ) )
>           trt        prior        karno
>   0.180197215 -0.005550913 -0.033771016
> 
> The difference is not large, but definitely more than just a rounding
> error, or something like that.
> 
> What's going on? How can the results get wrong, especially by
> including more cutpoints?
All results are wrong, but they are useful (paraphrasing George EP Box).

Göran
> 
> Thank you in advance,
> Tamas
> 
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Göran Broström

2019-Aug-23 09:12 UTC

[R] results of a survival analysis change when converting the data to counting process format

On 2019-08-22 at 21:48, Göran Broström wrote:
> 
> On 2019-08-18 19:10, Ferenci Tamas wrote:
>> Dear All,
>>
>> Consider the following simple example:
>>
>> library( survival )
>> data( veteran )
>>
>> coef( coxph(Surv(time, status) ~ trt + prior + karno, data = veteran) )
>>           trt        prior        karno
>>   0.180197194 -0.005550919 -0.033771018
>>
>> Note that we have neither time-dependent covariates, nor time-varying
>> coefficients, so the results should be the same if we change to
>> counting process format, no matter where we cut the times.
>>
>> That's true if we cut at event times:
>>
>> veteran2 <- survSplit( Surv(time, status) ~ trt + prior + karno,
>>                         data = veteran, cut = unique( veteran$time ) )
>>
>> coef( coxph(Surv(tstart,time, status) ~ trt + prior + karno, data = veteran2 ) )
>>           trt        prior        karno
>>   0.180197194 -0.005550919 -0.033771018
>>
>> But quite interestingly not true, if we cut at every day:
>>
>> veteran3 <- survSplit( Surv(time, status) ~ trt + prior + karno,
>>                         data = veteran, cut = 1:max(veteran$time) )
>>
>> coef( coxph(Surv(tstart,time, status) ~ trt + prior + karno, data = veteran3 ) )
>>           trt        prior        karno
>>   0.180197215 -0.005550913 -0.033771016
>>
>> The difference is not large, but definitely more than just a rounding
>> error, or something like that.
>>
>> What's going on? How can the results get wrong, especially by
>> including more cutpoints?
> 
> All results are wrong, but they are useful (paraphrasing George EP Box).
That said, it is a little surprising: The generated risk sets are 
(should be) identical in all cases, and one would expect rounding errors 
to be the same. But data get stored differently, and ... who knows?
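
The claim that the risk sets coincide can be checked directly. The following is only a sketch: it rebuilds the daily-split data from the original post and counts, at each event time, how many rows are at risk and how many events occur in each format. The helper names (event_times, n_risk_orig, and so on) are made up for this illustration.

```r
library(survival)

# Daily-split data, built as in the original post
veteran3 <- survSplit(Surv(time, status) ~ trt + prior + karno,
                      data = veteran, cut = 1:max(veteran$time))

# Distinct times at which at least one death occurs
event_times <- sort(unique(veteran$time[veteran$status == 1]))

# Original format: a subject is at risk at t if time >= t
n_risk_orig <- sapply(event_times, function(t) sum(veteran$time >= t))

# Counting process format: a row is at risk at t if t lies in (tstart, time]
n_risk_split <- sapply(event_times, function(t)
  sum(veteran3$tstart < t & veteran3$time >= t))

# Number of deaths at each event time in both formats
n_event_orig <- sapply(event_times, function(t)
  sum(veteran$time == t & veteran$status == 1))
n_event_split <- sapply(event_times, function(t)
  sum(veteran3$time == t & veteran3$status == 1))

# Both comparisons come out TRUE when the risk sets agree in size
identical(n_risk_orig, n_risk_split)
identical(n_event_orig, n_event_split)
```

This compares only the sizes of the risk sets and the event counts, not their membership, but with fixed covariates that is what determines the partial likelihood.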

I tried your examples on my computer and got exactly the same results as 
you. Which surprised me.
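
One possible explanation, offered here as an assumption rather than something established in this thread, is that the two fits stop iterating at slightly different points, so the gap reflects the convergence tolerance rather than the data. A minimal sketch to probe this is to refit both models with a much tighter tolerance through coxph.control() and see whether the coefficients move closer together. The object names tight, fit1, fit3 and gap are hypothetical.

```r
library(survival)

# Daily-split data, built as in the original post
veteran3 <- survSplit(Surv(time, status) ~ trt + prior + karno,
                      data = veteran, cut = 1:max(veteran$time))

# Tighter convergence tolerance and more iterations than the defaults
tight <- coxph.control(eps = 1e-12, iter.max = 50)

# Same model on the original data and on the daily-split data
fit1 <- coxph(Surv(time, status) ~ trt + prior + karno,
              data = veteran, control = tight)
fit3 <- coxph(Surv(tstart, time, status) ~ trt + prior + karno,
              data = veteran3, control = tight)

# Largest absolute difference between the two coefficient vectors
gap <- max(abs(coef(fit1) - coef(fit3)))
print(gap)

# For scale: standard errors of the coefficients from the original fit
print(sqrt(diag(vcov(fit1))))
```

If the gap shrinks well below the roughly 2e-8 seen with the default settings, the discrepancy is an iteration artifact; if it does not, something else (for example the order of floating-point summation) is at work. Either way, the gap can be read against the standard errors printed at the end.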

G,
> 
> Göran
> 
>>
>> Thank you in advance,
>> Tamas
>>
