Dear G?ran, Thank you for responding to my query; please see below:> -----Original Message----- > From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of G?ran > Brostr?m > Sent: August 29, 2015 3:59 PM > To: r-help at r-project.org > Subject: Re: [R] using survreg() in survival package with "long" data > > Dear John, > > I think you are missing that 'survreg' does not handle left truncated data. With > that you should use the 'eha' package, for a PH model the function 'phreg', and > for an AFT model the function 'aftreg' (you didn't tell which model you want to > fit).That's odd, in that it's not true in general, since, e.g., survreg() can be used to fit the left-censored Tobit regression model, as illustrated in this example from ?survreg: tobinfit <- survreg(Surv(durable, durable>0, type='left') ~ age + quant, data=tobin, dist='gaussian') In fact, in my example, the data are right-censored, but in the second data set are represented in counting-process form as one-week intervals. I imagine that you're right in the sense that survreg() can't handle data like this in counting-process form. Yet, I've reviewed the documentation in the survey package but can't find any reference to this -- which is not to say that it's not there, only that I don't see it.> > Your attempt with the 'survreg' function implies that you are satisfied with a > Weibull baseline distribution, in which case you could choose either model.Right, I understand that this is the default. The problem is just a small toy example meant to illustrate the error. Thanks again, John> > G?ran > > > On 08/29/2015 07:06 PM, Fox, John wrote: > > Dear list members, > > > > I'm unable to fit a parametric survival regression using survreg() in > > the survival package with data in "counting-process" ("long") form. > > > > To illustrate using a scaled-down problem with 10 subjects (with data > > placed on the web): > > > > --------------- snip ------------ > >> library(survival) RW <- > >> read.table("http://socserv.mcmaster.ca/jfox/.Pickup/RW.txt") RL <- > >> read.table("http://socserv.mcmaster.ca/jfox/.Pickup/RL.txt") > > > >> RW # "wide" data > > week arrest age 1 20 1 27 2 17 1 18 3 25 1 > > 19 4 52 0 23 5 52 0 19 6 52 0 24 7 23 > > 1 25 8 52 0 21 9 52 0 22 10 52 0 20 > > > >> head(RL, 20) # "long" data, counting-process form > > start stop arrest.time age 1.1 0 1 0 27 1.2 1 > > 2 0 27 1.3 2 3 0 27 1.4 3 4 > > 0 27 1.5 4 5 0 27 1.6 5 6 0 > > 27 1.7 6 7 0 27 1.8 7 8 0 27 > > 1.9 8 9 0 27 1.10 9 10 0 27 1.11 > > 10 11 0 27 1.12 11 12 0 27 1.13 12 > > 13 0 27 1.14 13 14 0 27 1.15 14 15 > > 0 27 1.16 15 16 0 27 1.17 16 17 0 > > 27 1.18 17 18 0 27 1.19 18 19 0 27 > > 1.20 19 20 1 27 > > > > --------------- snip ------------ > > > > I have no trouble fitting a Cox model to both the wide and long forms > > of the data, obtaining (as should be the case) identical results: > > > > --------------- snip ------------ > > > >> coxph(Surv(week, arrest) ~ age, data=RW) # works > > Call: coxph(formula = Surv(week, arrest) ~ age, data = RW) > > > > > > coef exp(coef) se(coef) z p age 0.0963 1.1011 0.2073 0.46 > > 0.64 > > > > Likelihood ratio test=0.21 on 1 df, p=0.643 n= 10, number of events> > 4 > >> coxph(Surv(start, stop, arrest.time) ~ age, data=RL) # works, same > > Call: coxph(formula = Surv(start, stop, arrest.time) ~ age, data > > RL) > > > > > > coef exp(coef) se(coef) z p age 0.0963 1.1011 0.2073 0.46 > > 0.64 > > > > Likelihood ratio test=0.21 on 1 df, p=0.643 n= 397, number of events> > 4 > > > > --------------- snip ------------ > > > > But when I try to fit a parametric survival regression with survreg(), > > I get an error with the long form of the data: > > > > --------------- snip ------------ > > > >> survreg(Surv(week, arrest) ~ age, data=RW) # works > > Call: survreg(formula = Surv(week, arrest) ~ age, data = RW) > > > > Coefficients: (Intercept) age 6.35386771 -0.08982624 > > > > Scale= 0.7363196 > > > > Loglik(model)= -22.1 Loglik(intercept only)= -22.2 Chisq= 0.3 on 1 > > degrees of freedom, p= 0.58 n= 10 > > > >> survreg(Surv(start, stop, arrest.time) ~ age, data=RL) # fails > > Error in survreg(Surv(start, stop, arrest.time) ~ age, data = RL) : > > Invalid survival type > > > > --------------- snip ------------ > > > > I expect that there's something about survreg() that I'm missing. I > > first noted this problem quite some time ago but didn't look into it > > carefully because I didn't really need to use survreg(). > > > > Any help would be appreciated. > > > > Thanks, John ----------------------------- John Fox, Professor > > McMaster University Hamilton, Ontario Canada L8S 4M4 Web: > > socserv.mcmaster.ca/jfox > > > > ______________________________________________ R-help at r-project.org > > mailing list -- To UNSUBSCRIBE and more, see > > https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the > > posting guide http://www.R-project.org/posting-guide.html and provide > > commented, minimal, self-contained, reproducible code. > > > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.
Göran Broström
2015-Aug-30 14:15 UTC
[R] using survreg() in survival package with "long" data
On 08/29/2015 11:56 PM, Fox, John wrote:> Dear G?ran, > > Thank you for responding to my query; please see below: > >> -----Original Message----- From: R-help >> [mailto:r-help-bounces at r-project.org] On Behalf Of G?ran Brostr?m >> Sent: August 29, 2015 3:59 PM To: r-help at r-project.org Subject: Re: >> [R] using survreg() in survival package with "long" data >> >> Dear John, >> >> I think you are missing that 'survreg' does not handle left >> truncated data. With that you should use the 'eha' package, for a >> PH model the function 'phreg', and for an AFT model the function >> 'aftreg' (you didn't tell which model you want to fit). > > That's odd, in that it's not true in general, since, e.g., survreg() > can be used to fit the left-censored Tobit regression model, as > illustrated in this example from ?survreg: tobinfit <- > survreg(Surv(durable, durable>0, type='left') ~ age + quant, > data=tobin, dist='gaussian')Well, it is true that survreg doesn't handle left truncated data. Its ability to cope with left censored data does not change that.> In fact, in my example, the data are right-censored, but in the > second data set are represented in counting-process form as one-week > intervals. I imagine that you're right in the sense that survreg() > can't handle data like this in counting-process form.Which implies left truncation, at least formally.> Yet, I've > reviewed the documentation in the survey package but can't find any > reference to this -- which is not to say that it's not there, only > that I don't see it.Right, it is not explicitly spelled out in the documentation. But this has been discussed earlier on R-help. See e.g. https://stat.ethz.ch/pipermail/r-help/2008-November/178621.html G?ran> >> >> Your attempt with the 'survreg' function implies that you are >> satisfied with a Weibull baseline distribution, in which case you >> could choose either model. > > Right, I understand that this is the default. The problem is just a > small toy example meant to illustrate the error. > > Thanks again, John > >> >> G?ran >> >> >> On 08/29/2015 07:06 PM, Fox, John wrote: >>> Dear list members, >>> >>> I'm unable to fit a parametric survival regression using >>> survreg() in the survival package with data in "counting-process" >>> ("long") form. >>> >>> To illustrate using a scaled-down problem with 10 subjects (with >>> data placed on the web): >>> >>> --------------- snip ------------ >>>> library(survival) RW <- >>>> read.table("http://socserv.mcmaster.ca/jfox/.Pickup/RW.txt") RL >>>> <- >>>> read.table("http://socserv.mcmaster.ca/jfox/.Pickup/RL.txt") >>> >>>> RW # "wide" data >>> week arrest age 1 20 1 27 2 17 1 18 3 25 >>> 1 19 4 52 0 23 5 52 0 19 6 52 0 24 7 >>> 23 1 25 8 52 0 21 9 52 0 22 10 52 0 >>> 20 >>> >>>> head(RL, 20) # "long" data, counting-process form >>> start stop arrest.time age 1.1 0 1 0 27 1.2 >>> 1 2 0 27 1.3 2 3 0 27 1.4 3 >>> 4 0 27 1.5 4 5 0 27 1.6 5 6 >>> 0 27 1.7 6 7 0 27 1.8 7 8 0 >>> 27 1.9 8 9 0 27 1.10 9 10 0 >>> 27 1.11 10 11 0 27 1.12 11 12 0 27 >>> 1.13 12 13 0 27 1.14 13 14 0 27 >>> 1.15 14 15 0 27 1.16 15 16 0 27 1.17 16 >>> 17 0 27 1.18 17 18 0 27 1.19 18 19 >>> 0 27 1.20 19 20 1 27 >>> >>> --------------- snip ------------ >>> >>> I have no trouble fitting a Cox model to both the wide and long >>> forms of the data, obtaining (as should be the case) identical >>> results: >>> >>> --------------- snip ------------ >>> >>>> coxph(Surv(week, arrest) ~ age, data=RW) # works >>> Call: coxph(formula = Surv(week, arrest) ~ age, data = RW) >>> >>> >>> coef exp(coef) se(coef) z p age 0.0963 1.1011 0.2073 >>> 0.46 0.64 >>> >>> Likelihood ratio test=0.21 on 1 df, p=0.643 n= 10, number of >>> events= 4 >>>> coxph(Surv(start, stop, arrest.time) ~ age, data=RL) # works, >>>> same >>> Call: coxph(formula = Surv(start, stop, arrest.time) ~ age, data >>> = RL) >>> >>> >>> coef exp(coef) se(coef) z p age 0.0963 1.1011 0.2073 >>> 0.46 0.64 >>> >>> Likelihood ratio test=0.21 on 1 df, p=0.643 n= 397, number of >>> events= 4 >>> >>> --------------- snip ------------ >>> >>> But when I try to fit a parametric survival regression with >>> survreg(), I get an error with the long form of the data: >>> >>> --------------- snip ------------ >>> >>>> survreg(Surv(week, arrest) ~ age, data=RW) # works >>> Call: survreg(formula = Surv(week, arrest) ~ age, data = RW) >>> >>> Coefficients: (Intercept) age 6.35386771 -0.08982624 >>> >>> Scale= 0.7363196 >>> >>> Loglik(model)= -22.1 Loglik(intercept only)= -22.2 Chisq= 0.3 >>> on 1 degrees of freedom, p= 0.58 n= 10 >>> >>>> survreg(Surv(start, stop, arrest.time) ~ age, data=RL) # >>>> fails >>> Error in survreg(Surv(start, stop, arrest.time) ~ age, data = RL) >>> : Invalid survival type >>> >>> --------------- snip ------------ >>> >>> I expect that there's something about survreg() that I'm missing. >>> I first noted this problem quite some time ago but didn't look >>> into it carefully because I didn't really need to use survreg(). >>> >>> Any help would be appreciated. >>> >>> Thanks, John ----------------------------- John Fox, Professor >>> McMaster University Hamilton, Ontario Canada L8S 4M4 Web: >>> socserv.mcmaster.ca/jfox >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, >>> see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read >>> the posting guide http://www.R-project.org/posting-guide.html and >>> provide commented, minimal, self-contained, reproducible code. >>> >> >> ______________________________________________ R-help at r-project.org >> mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the >> posting guide http://www.R-project.org/posting-guide.html and >> provide commented, minimal, self-contained, reproducible code.
Dear G?ran, Yes, you're right -- I didn't read your message carefully enough and was confusing left-censoring with left-truncation. Thanks for the correction, John ________________________________________ From: G?ran Brostr?m [goran.brostrom at umu.se] Sent: August 30, 2015 10:15 AM To: Fox, John Cc: r-help at r-project.org Subject: Re: [R] using survreg() in survival package with "long" data On 08/29/2015 11:56 PM, Fox, John wrote:> Dear G?ran, > > Thank you for responding to my query; please see below: > >> -----Original Message----- From: R-help >> [mailto:r-help-bounces at r-project.org] On Behalf Of G?ran Brostr?m >> Sent: August 29, 2015 3:59 PM To: r-help at r-project.org Subject: Re: >> [R] using survreg() in survival package with "long" data >> >> Dear John, >> >> I think you are missing that 'survreg' does not handle left >> truncated data. With that you should use the 'eha' package, for a >> PH model the function 'phreg', and for an AFT model the function >> 'aftreg' (you didn't tell which model you want to fit). > > That's odd, in that it's not true in general, since, e.g., survreg() > can be used to fit the left-censored Tobit regression model, as > illustrated in this example from ?survreg: tobinfit <- > survreg(Surv(durable, durable>0, type='left') ~ age + quant, > data=tobin, dist='gaussian')Well, it is true that survreg doesn't handle left truncated data. Its ability to cope with left censored data does not change that.> In fact, in my example, the data are right-censored, but in the > second data set are represented in counting-process form as one-week > intervals. I imagine that you're right in the sense that survreg() > can't handle data like this in counting-process form.Which implies left truncation, at least formally.> Yet, I've > reviewed the documentation in the survey package but can't find any > reference to this -- which is not to say that it's not there, only > that I don't see it.Right, it is not explicitly spelled out in the documentation. But this has been discussed earlier on R-help. See e.g. https://stat.ethz.ch/pipermail/r-help/2008-November/178621.html G?ran> >> >> Your attempt with the 'survreg' function implies that you are >> satisfied with a Weibull baseline distribution, in which case you >> could choose either model. > > Right, I understand that this is the default. The problem is just a > small toy example meant to illustrate the error. > > Thanks again, John > >> >> G?ran >> >> >> On 08/29/2015 07:06 PM, Fox, John wrote: >>> Dear list members, >>> >>> I'm unable to fit a parametric survival regression using >>> survreg() in the survival package with data in "counting-process" >>> ("long") form. >>> >>> To illustrate using a scaled-down problem with 10 subjects (with >>> data placed on the web): >>> >>> --------------- snip ------------ >>>> library(survival) RW <- >>>> read.table("http://socserv.mcmaster.ca/jfox/.Pickup/RW.txt") RL >>>> <- >>>> read.table("http://socserv.mcmaster.ca/jfox/.Pickup/RL.txt") >>> >>>> RW # "wide" data >>> week arrest age 1 20 1 27 2 17 1 18 3 25 >>> 1 19 4 52 0 23 5 52 0 19 6 52 0 24 7 >>> 23 1 25 8 52 0 21 9 52 0 22 10 52 0 >>> 20 >>> >>>> head(RL, 20) # "long" data, counting-process form >>> start stop arrest.time age 1.1 0 1 0 27 1.2 >>> 1 2 0 27 1.3 2 3 0 27 1.4 3 >>> 4 0 27 1.5 4 5 0 27 1.6 5 6 >>> 0 27 1.7 6 7 0 27 1.8 7 8 0 >>> 27 1.9 8 9 0 27 1.10 9 10 0 >>> 27 1.11 10 11 0 27 1.12 11 12 0 27 >>> 1.13 12 13 0 27 1.14 13 14 0 27 >>> 1.15 14 15 0 27 1.16 15 16 0 27 1.17 16 >>> 17 0 27 1.18 17 18 0 27 1.19 18 19 >>> 0 27 1.20 19 20 1 27 >>> >>> --------------- snip ------------ >>> >>> I have no trouble fitting a Cox model to both the wide and long >>> forms of the data, obtaining (as should be the case) identical >>> results: >>> >>> --------------- snip ------------ >>> >>>> coxph(Surv(week, arrest) ~ age, data=RW) # works >>> Call: coxph(formula = Surv(week, arrest) ~ age, data = RW) >>> >>> >>> coef exp(coef) se(coef) z p age 0.0963 1.1011 0.2073 >>> 0.46 0.64 >>> >>> Likelihood ratio test=0.21 on 1 df, p=0.643 n= 10, number of >>> events= 4 >>>> coxph(Surv(start, stop, arrest.time) ~ age, data=RL) # works, >>>> same >>> Call: coxph(formula = Surv(start, stop, arrest.time) ~ age, data >>> = RL) >>> >>> >>> coef exp(coef) se(coef) z p age 0.0963 1.1011 0.2073 >>> 0.46 0.64 >>> >>> Likelihood ratio test=0.21 on 1 df, p=0.643 n= 397, number of >>> events= 4 >>> >>> --------------- snip ------------ >>> >>> But when I try to fit a parametric survival regression with >>> survreg(), I get an error with the long form of the data: >>> >>> --------------- snip ------------ >>> >>>> survreg(Surv(week, arrest) ~ age, data=RW) # works >>> Call: survreg(formula = Surv(week, arrest) ~ age, data = RW) >>> >>> Coefficients: (Intercept) age 6.35386771 -0.08982624 >>> >>> Scale= 0.7363196 >>> >>> Loglik(model)= -22.1 Loglik(intercept only)= -22.2 Chisq= 0.3 >>> on 1 degrees of freedom, p= 0.58 n= 10 >>> >>>> survreg(Surv(start, stop, arrest.time) ~ age, data=RL) # >>>> fails >>> Error in survreg(Surv(start, stop, arrest.time) ~ age, data = RL) >>> : Invalid survival type >>> >>> --------------- snip ------------ >>> >>> I expect that there's something about survreg() that I'm missing. >>> I first noted this problem quite some time ago but didn't look >>> into it carefully because I didn't really need to use survreg(). >>> >>> Any help would be appreciated. >>> >>> Thanks, John ----------------------------- John Fox, Professor >>> McMaster University Hamilton, Ontario Canada L8S 4M4 Web: >>> socserv.mcmaster.ca/jfox >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, >>> see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read >>> the posting guide http://www.R-project.org/posting-guide.html and >>> provide commented, minimal, self-contained, reproducible code. >>> >> >> ______________________________________________ R-help at r-project.org >> mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the >> posting guide http://www.R-project.org/posting-guide.html and >> provide commented, minimal, self-contained, reproducible code.