Dear all,

I have some trouble using the "id"-argument with aftreg (accelerated failure time regression analysis from the eha library).

As far as I understand it, the id argument is used to group individuals together if there are time-varying covariates and the data is arranged in counting process style.

Unfortunately, I cannot figure out how to use the "id"-argument. The most straightforward way would be to simply state the grouping variable, but it throws an error. I've included an example below: the data frame for regression is called "test", with the grouping variable "person".

> test
  start end censor person var1
1     0   1      0      1  0.5
2     1   2      0      1  0.4
3     2   3      0      1  0.6
4     3   4      1      1 -0.3
5     0   1      0      2  0.6
6     1   2      0      2  0.7
7     2   3      0      2  0.6

> fit <- aftreg(Surv(start, end, censor)~var1, data=test, id=person)
Error in order(id, Y[, 1]) : argument 1 is not a vector

> fit <- aftreg(Surv(start, end, censor)~var1, data=test, id=test["person"])
Error in `[.data.frame`(id, ord) : undefined columns selected

What would be the correct way to fit this example model?

Thanks + all the best
Philipp

In case this time-dependent covariate is an internal time-dependent covariate (aka endogenous time-dependent covariate), you can use the jointModel() function from package JM, with the option "weibull-AFT-GH" for the 'method' argument. For more information you may have a look at: rwiki.sciviews.org/doku.php?id=packages:cran:jm

I hope it helps.

Best,
Dimitris

-- 
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014

______________________________________________
R-help at r-project.org mailing list
stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

On Fri, Feb 5, 2010 at 11:30 AM, Philipp Rappold <philipp.rappold at gmail.com> wrote:
>> fit <- aftreg(Surv(start, end, censor)~var1, data=test, id=person)
> Error in order(id, Y[, 1]) : argument 1 is not a vector

You have caught the _function_ 'person' (package: utils) instead of the variable 'person' in the data frame. That explains the odd error message. If you change the variable name to, e.g., "ID", you'll get the error message

Error in order(id, Y[, 1]) : object 'id' not found

which would hint you in the right direction. You need to specify 'id' by a full name, in your case 'test$person'. This is of course a deficiency in the interface of aftreg. I will fix it asap.

So the temporary fix is 'id = test$person'.

Thanks for the report,

Göran

>> fit <- aftreg(Surv(start, end, censor)~var1, data=test, id=test["person"])
> Error in `[.data.frame`(id, ord) : undefined columns selected
>
> What would be the correct way to fit this example model?

-- 
Göran Broström
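
The workaround described above can be sketched in R as follows. This is a minimal illustration rather than code posted in the thread: it rebuilds the "test" data frame from the original post, checks that a bare 'person' resolves to the function from utils, and then passes the column explicitly. The aftreg() call is only attempted if eha and survival are installed, and it is wrapped in try() on the assumption that a data set this small (a single event) may fail to converge.

```r
# Data from the original post, arranged in counting process style.
test <- data.frame(
  start  = c(0, 1, 2, 3, 0, 1, 2),
  end    = c(1, 2, 3, 4, 1, 2, 3),
  censor = c(0, 0, 0, 1, 0, 0, 0),
  person = c(1, 1, 1, 1, 2, 2, 2),
  var1   = c(0.5, 0.4, 0.6, -0.3, 0.6, 0.7, 0.6)
)

# Outside the data frame, the bare name 'person' finds the function
# utils::person, which is why order(id, ...) complains about a non-vector.
bare_name_is_function <- is.function(person)

# The fully qualified column is an ordinary vector with one entry per row.
id_vec <- test$person
id_is_vector <- is.vector(id_vec) && length(id_vec) == nrow(test)

# Only attempt the fit when the packages are available (assumption: the
# toy data may be too small for the optimizer, hence try()).
if (requireNamespace("eha", quietly = TRUE) &&
    requireNamespace("survival", quietly = TRUE)) {
  library(survival)
  library(eha)
  fit <- try(aftreg(Surv(start, end, censor) ~ var1,
                    data = test, id = test$person))
}
```

Renaming the grouping column to something that does not clash with an existing function would make the failure easier to diagnose, but per the reply above the explicit 'test$person' form is still needed until the interface is fixed.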

Göran,

thanks! One more thing that I found: as soon as you have at least one NA in the independent vars, the trick that you mentioned does not work anymore. Example:

> testdata
  start stop censor groupvar      var1
1     0    1      0        1 0.1284928
2     1    2      0        1 0.4896125
3     2    3      0        1 0.7012899
4     3    4      0        1        NA
5     0    1      0        2 0.7964361
6     1    2      0        2 0.8466039
7     2    3      1        2 0.2234271

> aftreg(Surv(start, stop, censor)~var1, data=testdata, id=testdata$groupvar)
Error in order(id, Y[, 1]) : Different length of arguments

(* I translated this from the German output *)

Do you think there is a simple hack which excludes all subjects that have at least one NA in their independent vars? If it was only one dependent var it would probably be easy by just using subset, but I have lots of different combinations of vars that I'd like to test ;)

Best
Philipp

PS: Concerning the benchmark: for a large dataset (~1600 observations on ~300 subjects) processing takes about 40 seconds (Core 2 Duo @ 2.46 GHz, T9300). Interestingly, processing the testdata set above with only 7 observations on 2 subjects takes 2 minutes...
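
One possible way to exclude whole subjects is sketched below in base R. The helper drop_incomplete_subjects() is hypothetical (it is not part of eha); it reads the variable names from the model formula, so it can be reused across different combinations of covariates. The sketch rests on an assumption that the thread does not confirm: that the length mismatch arises because rows with NA are dropped from the model frame while the id vector still has one entry per original row.

```r
# Data from the message above; row 4 has a missing covariate value.
testdata <- data.frame(
  start    = c(0, 1, 2, 3, 0, 1, 2),
  stop     = c(1, 2, 3, 4, 1, 2, 3),
  censor   = c(0, 0, 0, 0, 0, 0, 1),
  groupvar = c(1, 1, 1, 1, 2, 2, 2),
  var1     = c(0.1284928, 0.4896125, 0.7012899, NA,
               0.7964361, 0.8466039, 0.2234271)
)

# Hypothetical helper: remove every subject that has at least one NA in
# any variable used by the formula.
drop_incomplete_subjects <- function(formula, data, id_col) {
  # Variables named in the formula that are columns of the data frame.
  vars <- intersect(all.vars(formula), names(data))
  # Rows with a missing value in any of those variables.
  row_has_na <- !complete.cases(data[, vars, drop = FALSE])
  # Subjects owning at least one such row are dropped entirely.
  bad_ids <- unique(data[[id_col]][row_has_na])
  data[!(data[[id_col]] %in% bad_ids), , drop = FALSE]
}

# Building a formula does not evaluate Surv(), so no package is needed here.
model <- Surv(start, stop, censor) ~ var1
clean <- drop_incomplete_subjects(model, testdata, "groupvar")

# With eha loaded, the id vector is then taken from the same subset so
# that its length matches the rows actually used:
#   aftreg(model, data = clean, id = clean$groupvar)
```

In this toy example only subject 2 survives the filter, so the data are too small for a meaningful fit; the point is only that 'data' and 'id' come from the same filtered data frame. If dropping single rows rather than whole subjects is acceptable, filtering with complete.cases() alone and passing the id column of the filtered data frame would follow the same pattern.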