BXC (Bendix Carstensen)
2004-Nov-09 14:01 UTC
[R] survSplit: further exploration and related topics
To Danardonos concern of splitting time for records with delayed entry: This can fairly easily be accomodated, by simply splitting time in small intervals of time since entry into the study, and then compute the value of the other timescales for each of these e.g.: current.age <- time.from.entry + age.at.entry but the cut on the other timescales will not be exactly where you may want them to be, but this should be of minor importance in real life. This approach will also make it clearer what really is going on. The effect of each of the timescales can then be modelled using the usual regression tools available in R. David Clayton har written an R-function that does it correctly, you can find it in: http://biostat.ku.dk/~bxc/Lexis/ along with its .Rd file. It is also included a Lexis-package which is a first shot at an epidemiology package for R available at http://biostat.ku.dk/~bxc/SPE/library/, but built under 1.9. I recently tried to build it under 2.0, but it crashed for me and I was advised to wait till 2.1 before I had another go at it. A little further exploration of what goes on in survSplit gives:> data(aml) > aml$id <- letters[1:dim(aml)[1]] > aml3<-survSplit(aml,cut=c(5,10,50),end="time",start="start",+ event="status",episode="i")> # Sort the rows sensibly > aml3 <- aml3[order(aml3$id,aml3$start),] > aml3$expand <- 1:(dim(aml3)[1]) > aml3[aml3$id=="k",c("id","expand","start","time","status","x")]id expand start time status x 11 k 30 0 5 0 Maintained 34 k 31 5 10 0 Maintained 57 k 32 10 50 0 Maintained 80 k 33 50 161 0 Maintained> > # All records are taken as if entry were at time 0: > aml4<-survSplit(aml3,cut=c(9,12,40),end="time",start="entry",+ event="status",episode="i",id="id2")> aml4 <- aml4[order(aml4$id,aml4$expand,aml4$start),] >aml4[aml4$id=="k",c("id","expand","start","entry","time","status","x")] id expand start entry time status x 30 k 30 0 0 5 0 Maintained 31 k 31 5 0 9 0 Maintained 94 k 31 5 9 10 0 Maintained 32 k 32 10 0 9 0 Maintained 95 k 32 10 9 12 0 Maintained 158 k 32 10 12 40 0 Maintained 221 k 32 10 40 50 0 Maintained 33 k 33 50 0 9 0 Maintained 96 k 33 50 9 12 0 Maintained 159 k 33 50 12 40 0 Maintained 222 k 33 50 40 161 0 Maintained> > # entry time is is "start", but it seems that it is more or less > # assumed that entry is at the first split point given. > aml5<-survSplit(aml3,cut=c(9,12,40),end="time",start="start",+ event="status",episode="i",id="id2")> aml5 <- aml5[order(aml5$id,aml5$expand,aml5$start),] > aml5[aml5$id=="k",c("id","expand","start","time","status","x")]id expand start time status x 30 k 30 0 5 0 Maintained 31 k 31 5 9 0 Maintained 94 k 31 9 10 0 Maintained 95 k 32 9 12 0 Maintained 158 k 32 12 40 0 Maintained 221 k 32 40 50 0 Maintained 96 k 33 9 12 0 Maintained 159 k 33 12 40 0 Maintained 222 k 33 40 161 0 Maintained>It appears that the intention has been to support counting process input, but not quite succeeded. Bendix Carstensen ---------------------- Bendix Carstensen Senior Statistician Steno Diabetes Center Niels Steensens Vej 2 DK-2820 Gentofte Denmark tel: +45 44 43 87 38 mob: +45 30 75 87 38 fax: +45 44 43 07 06 bxc at steno.dk www.biostat.ku.dk/~bxc ----------------------> -----Original Message----- > From: r-help-bounces at stat.math.ethz.ch > [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Thomas Lumley > Sent: Monday, November 08, 2004 4:28 PM > To: Danardono > Cc: R-help at stat.math.ethz.ch > Subject: Re: [R] survSplit > > > On Mon, 8 Nov 2004, Danardono wrote: > > > I am just realized that survival has the facility to do > survival time > > splitting survSplit > > after read some postings about time dependency in the list. > > Is it survSplit only for the survival data input > (time,status) and not for > > the 'counting process' input (start,stop,status)? > > > > I take one example modified from the survSplit help: > >> data(aml) > >> > aml3<-survSplit(aml,cut=c(5,10,50),end="time",start="start",event="st > >> atus",episode="i",id="id") > > > >> coxph(Surv(time,status)~x,data=aml) > > > > coef exp(coef) se(coef) z p > > xNonmaintained 0.916 2.5 0.512 1.79 0.074 > > > > Likelihood ratio test=3.38 on 1 df, p=0.0658 n= 23 > > > > #the same > >> coxph(Surv(time,status)~x,data=aml3) > > > > coef exp(coef) se(coef) z p > > xNonmaintained 0.916 2.5 0.512 1.79 0.074 > > > > Likelihood ratio test=3.38 on 1 df, p=0.0658 n= 63 > > This should NOT be the same, and on my computer is not. You > should have > to use the counting-process syntax. I get > > > coxph(Surv(time,status)~x,data=aml3) > Call: > coxph(formula = Surv(time, status) ~ x, data = aml3) > > > coef exp(coef) se(coef) z p > xNonmaintained 1.07 2.92 0.514 2.08 0.037 > > Likelihood ratio test=4.61 on 1 df, p=0.0317 n= 63 > > coxph(Surv(start,time,status)~x,data=aml3) > Call: > coxph(formula = Surv(start, time, status) ~ x, data = aml3) > > > coef exp(coef) se(coef) z p > xNonmaintained 0.916 2.5 0.512 1.79 0.074 > > Likelihood ratio test=3.38 on 1 df, p=0.0658 n= 63 > > > > > BUT If I split aml3 further: > > > >> > aml4<-survSplit(aml3,cut=c(9,12,40),end="time",start="start",event="s > >> tatus",episode="i",id="id2") > > #not the same! > >> coxph(Surv(start,time,status)~x,data=aml4) > > coef exp(coef) se(coef) z p > > xNonmaintained 1.05 2.85 0.515 2.03 0.042 > > > > Likelihood ratio test=4.38 on 1 df, p=0.0363 n= 105 > > > > Hmm. I suspect this is a <= vs < bug of some sort in handling start > times. > > -thomas > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read > the posting guide! http://www.R-project.org/posting-guide.html >
On Tue, 9 Nov 2004, BXC (Bendix Carstensen) wrote:> To Danardonos concern of splitting time for records with delayed entry: >There was actually no intention of supporting counting-process input, but it turns out to be a one-line change: from starttime <- c(starttime, rep(cut, each = n)) to starttime<-c(starttime, pmax(starttime, rep(cut,each=n))) about 20 lines down from the top. This will be in the next release of survival, but not until after R 2.0.1 -thomas