Georges Dupret
2012-Nov-17 02:05 UTC
[R] survfit & number of variables != number of variable names
This works ok:

> cox = coxph(surv ~ bucket*(today + accor + both) + activity, data = data)
> fit = survfit(cox, newdata=data[1:100,])

but using strata leads to problems:

> cox.s = coxph(surv ~ bucket*(today + accor + both) + strata(activity), data = data)
> fit.s = survfit(cox.s, newdata=data[1:100,])
Error in model.frame.default(data = data[1:100, ], formula = ~bucket + :
  number of variables != number of variable names

Note that the following gives rise to the same error:

> fit.s = survfit(cox.s, newdata=data)
Error in model.frame.default(data = data, formula = ~bucket + today + :
  number of variables != number of variable names

but if I use data implicitly, everything works fine:

> fit.s = survfit(cox.s)

Any idea on how I could solve this?

Best, and thank you,

ge
David Winsemius
2012-Nov-17 16:12 UTC
On Nov 16, 2012, at 6:05 PM, Georges Dupret wrote:

> This works ok:
>
>> cox = coxph(surv ~ bucket*(today + accor + both) + activity, data = data)
>> fit = survfit(cox, newdata=data[1:100,])
>
> but using strata leads to problems:
>
>> cox.s = coxph(surv ~ bucket*(today + accor + both) + strata(activity), data = data)
>> fit.s = survfit(cox.s, newdata=data[1:100,])
>
> Error in model.frame.default(data = data[1:100, ], formula = ~bucket + :
>   number of variables != number of variable names
>
> Note that the following give rise to the same error:
>
>> fit.s = survfit(cox.s, newdata=data)
> Error in model.frame.default(data = data, formula = ~bucket + today + :
>   number of variables != number of variable names
>
> but if I use data implicitly, all is working fine:
>> fit.s = survfit(cox.s)
>
> Any idea on how I could solve this?
>

I noticed that you were using what might be called an "externally created Surv object". I have a memory that Terry Therneau has criticized that practice. I cannot remember if it was in exactly this situation but I might ask if setting up the model as:

cox = coxph(Surv(stime, event) ~ bucket*(today + accor + both) + activity, data = data)

... might give the survival machinery a better handle on where everything might be found.

-- 
David.

David Winsemius, MD
Alameda, CA, USA
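As an illustration of the suggestion above, here is a minimal sketch of building the Surv object inside the formula so that every variable is looked up in `data`. It assumes the `lung` data set shipped with the survival package as a stand-in for the poster's data, which was not available to the list; the covariates `age` and `sex` are likewise only placeholders.

```r
library(survival)

# Assumption: `lung` (bundled with survival) replaces the poster's data frame.
# In `lung`, status is coded 1 = censored, 2 = dead.
fit_cox <- coxph(Surv(time, status == 2) ~ age + sex, data = lung)

# Predicted survival curves for the first five rows of the same data frame.
# With newdata, survfit() returns one curve per row of newdata.
curves <- survfit(fit_cox, newdata = lung[1:5, ])
ncol(curves$surv)
```

Because the response is defined in the formula, `coxph` and `survfit` can find `time` and `status` in whatever data frame they are handed, rather than relying on an object in the calling environment.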
Terry Therneau
2012-Nov-19 17:01 UTC
I can't reproduce the problem. Tell us what version of R and what version of the survival package you are using. Create a reproducible example. I don't know if some variables are numeric and some are factors, how/where the "surv" object was defined, etc.

Terry Therneau
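A report of the kind requested above could look roughly like the following sketch. The data are simulated and purely hypothetical: the column names mirror those used in this thread, but the values, the bucket level "200", and the activity break points are invented for illustration. Whether the `survfit` call fails may depend on the installed survival version, so the outcome is captured rather than assumed.

```r
library(survival)

# Version information to paste into the report.
R.version.string
packageVersion("survival")

# Hypothetical simulated data; names follow the thread, values are made up.
set.seed(1)
n <- 100
small <- data.frame(
  absence  = rexp(n, rate = 0.1),
  censored = runif(n) < 0.2,
  bucket   = factor(sample(c("200", "575"), n, replace = TRUE)),
  today    = runif(n) < 0.8,
  activity = cut(runif(n, 11, 200), breaks = c(11, 50, 100, 200),
                 include.lowest = TRUE)
)

# Show the type of every column (numeric, logical, factor).
str(small)

cox.s <- coxph(Surv(absence, !censored) ~ bucket * today + strata(activity),
               data = small)

# Capture either the fitted curves or the error condition.
res <- tryCatch(survfit(cox.s, newdata = small[1:50, ]),
                error = function(e) e)
class(res)
```

Posting the output of `str()` together with the two version lines answers the questions about variable types and software versions in one go.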
Georges Dupret
2012-Nov-19 19:07 UTC
Hi!

In answer to:

--------
I noticed that you were using what might be called an "externally created Surv object". I have a memory that Terry Therneau has criticized that practice. I cannot remember if it was in exactly this situation but I might ask if setting up the model as:

cox = coxph(Surv(stime, event) ~ bucket*(today + accor + both) + activity, data = data)

... might give the survival machinery a better handle on where everything might be found.
------------

I tried to create the Surv object "internally" but I face the same issue:

> (cox.s = coxph(Surv(time=absence, event=(censored==FALSE)) ~ bucket*(today) + strata(activity), data = small))
Call:
coxph(formula = Surv(time = absence, event = (censored == FALSE)) ~
    bucket * (today) + strata(activity), data = small)

                       coef exp(coef) se(coef)      z    p
bucket575            0.4526     1.572    0.740  0.612 0.54
todayTRUE           -0.0886     0.915    0.676 -0.131 0.90
bucket575:todayTRUE -0.1670     0.846    0.794 -0.210 0.83

Likelihood ratio test=2.32 on 3 df, p=0.509 n= 100, number of events= 100

> fit = survfit(cox.s, newdata=small[1:50,])
Error in model.frame.default(data = small[1:50, ], formula = ~bucket + :
  number of variables != number of variable names

Best, and thank you for the suggestion.

ge
David Winsemius
2012-Nov-20 02:33 UTC
On Nov 19, 2012, at 5:33 PM, Georges Dupret wrote:

> Hi David,
>
> Sorry for the signature files... this is automatic. I should disable that.
>
> Please find in attachment a copy of small.csv.gz

I found it but I suspect nobody else will. I think Terry Therneau already got a copy when you attached it earlier. But the rest of Rhelp did not, since .gz files will get scrubbed by the list-serv.

> Best,
>
> ge
>
> On 11/19/2012 02:37 PM, David Winsemius wrote:
>>
>> On Nov 19, 2012, at 2:23 PM, David Winsemius wrote:
>>
>>>
>>> On Nov 19, 2012, at 11:07 AM, Georges Dupret wrote:
>>>
>>>> Hi!
>>>>
>>>> In answer to:
>>>>
>>>> --------
>>>> I noticed that you were using what might be called an "externally
>>>> created Surv object". I have a memory that Terry Therneau has
>>>> criticized that practice. I cannot remember if it was in exactly this
>>>> situation but I might ask if setting up the model as:
>>>>
>>>> cox = coxph(Surv(stime, event) ~ bucket*(today + accor + both) +
>>>> activity, data = data)
>>>>
>>>> ... might give the survival machinery a better handle on where
>>>> everything might be found.
>>>> ------------
>>>>
>>>> I tried to create the Surv object "internally" but I face the same issue:
>>>>
>>>>> (cox.s = coxph(Surv(time=absence, event=(censored==FALSE)) ~
>>>>> bucket*(today) + strata(activity), data = small))
>>>> Call:
>>>> coxph(formula = Surv(time = absence, event = (censored == FALSE)) ~
>>>> bucket * (today) + strata(activity), data = small)

All of your 'censored' were FALSE so all of your events were TRUE.

My guess is that you are having problems because you end up with different model designs in the different strata:

> with( small, table(activity, today))
                today
activity         FALSE TRUE
  (100,121]          1   13
  (121,149]          2    8
  (149,196]          0    4
  (196,1.33e+03]     1    8
  (30,42]            1    8
  (42,55]            4   12
  (55,68]            2    9
  (68,83]            2    9
  (83,100]           2    6
  [11,30]            0    8

I do not think it matters that your levels for the factor variable will not be in the expected order:

> table(small$activity)

     (100,121]      (121,149]      (149,196] (196,1.33e+03]        (30,42]        (42,55]        (55,68]        (68,83]
            14             10              4              9              9             16             11             11
      (83,100]        [11,30]
             8              8

But I do also wonder if the small numbers in each stratum might be causing problems. Is it really necessary to stratify so finely?

-- 
David.

>>>>
>>>>                        coef exp(coef) se(coef)      z    p
>>>> bucket575            0.4526     1.572    0.740  0.612 0.54
>>>> todayTRUE           -0.0886     0.915    0.676 -0.131 0.90
>>>> bucket575:todayTRUE -0.1670     0.846    0.794 -0.210 0.83
>>>>
>>>> Likelihood ratio test=2.32 on 3 df, p=0.509 n= 100, number of events= 100
>>>>> fit = survfit(cox.s, newdata=small[1:50,])
>>>> Error in model.frame.default(data = small[1:50, ], formula = ~bucket + :
>>>> number of variables != number of variable names
>>>
>>> OK. Thanks for doing that. You might want to know that the only attachment that made it through to the emailing list was a file named small.csv.gz.sig. That's not a format that my system knows how to decompress (I tried downloading GnuPG and compiling it but
>>>
>>
>> (hit send button too soon.) .... was unable to figure out how to decompress with GnuPG either. (It's hard to imagine this needed to be encrypted.)
>>
> <small.csv.gz>

David Winsemius, MD
Alameda, CA, USA
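The two checks described above can be sketched as follows. Since `small.csv.gz` never reached the list, the cross-tabulation is rebuilt by hand from the counts printed in this message; the object names `tab`, `empty`, and `ordered_lev` are introduced here only for illustration. The sketch flags strata in which one level of `today` never occurs, and reorders the interval labels by their lower bound.

```r
# Counts copied from the with(small, table(activity, today)) output above.
lev <- c("(100,121]", "(121,149]", "(149,196]", "(196,1.33e+03]", "(30,42]",
         "(42,55]", "(55,68]", "(68,83]", "(83,100]", "[11,30]")
tab <- matrix(
  c(1, 2, 0, 1, 1, 4, 2, 2, 2, 0,      # today == FALSE
    13, 8, 4, 8, 8, 12, 9, 9, 6, 8),   # today == TRUE
  ncol = 2,
  dimnames = list(activity = lev, today = c("FALSE", "TRUE"))
)

# Strata in which at least one level of `today` has no observations.
empty <- rownames(tab)[apply(tab == 0, 1, any)]
empty

# Reorder the interval labels by their numeric lower bound:
# drop the opening bracket, then keep the text before the comma.
lower <- as.numeric(sub(",.*$", "", substring(lev, 2)))
ordered_lev <- lev[order(lower)]
ordered_lev
```

With the data frame at hand, `factor(small$activity, levels = ordered_lev)` would put the strata in numeric order, and merging adjacent intervals would give coarser strata with fewer empty cells.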