Paul Johnson
2012-Jan-05 20:26 UTC
[Rd] delete.response leaves response in attribute dataClasses
I posted this one as an R bug (https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=14767), but Prof. Ripley says I'm premature and that I should raise the question here.

Here's the behavior I assert is a bug: The output from delete.response on a terms object alters the formula by removing the dependent variable. It removes the response from the "variables" attribute and it changes the "response" attribute from 1 to 0. The response is also removed from "predvars".

But it leaves the name of the dependent variable first in "dataClasses". It caused an unexpected behavior in my code, so (as usual) the bug may be mine, but in my heart, I believe it belongs to delete.response.

To illustrate, here's a terms object from a regression.

> tt
y ~ x1 * x2 + x3 + x4
attr(,"variables")
list(y, x1, x2, x3, x4)
attr(,"factors")
   x1 x2 x3 x4 x1:x2
y   0  0  0  0     0
x1  1  0  0  0     1
x2  0  1  0  0     1
x3  0  0  1  0     0
x4  0  0  0  1     0
attr(,"term.labels")
[1] "x1"    "x2"    "x3"    "x4"    "x1:x2"
attr(,"order")
[1] 1 1 1 1 2
attr(,"intercept")
[1] 1
attr(,"response")
[1] 1
attr(,".Environment")
<environment: R_GlobalEnv>
attr(,"predvars")
list(y, x1, x2, x3, x4)
attr(,"dataClasses")
        y        x1        x2        x3        x4
"numeric" "numeric" "numeric" "numeric" "numeric"

Now observe that delete.response removes the response from all attributes except dataClasses.

> delete.response(tt)
~x1 * x2 + x3 + x4
attr(,"variables")
list(x1, x2, x3, x4)
attr(,"factors")
   x1 x2 x3 x4 x1:x2
x1  1  0  0  0     1
x2  0  1  0  0     1
x3  0  0  1  0     0
x4  0  0  0  1     0
attr(,"term.labels")
[1] "x1"    "x2"    "x3"    "x4"    "x1:x2"
attr(,"order")
[1] 1 1 1 1 2
attr(,"intercept")
[1] 1
attr(,"response")
[1] 0
attr(,".Environment")
<environment: R_GlobalEnv>
attr(,"predvars")
list(x1, x2, x3, x4)
attr(,"dataClasses")
        y        x1        x2        x3        x4
"numeric" "numeric" "numeric" "numeric" "numeric"

pj

--
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas
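For anyone who wants to reproduce the report, here is a minimal sketch. The data are simulated and are an assumption of this example: the original post does not show the data or the model call, only the printed terms object. The variable names follow the post. The script reports which names, if any, appear in "dataClasses" but not in "variables" after delete.response; the result may depend on the R version in use.

```r
# Simulated data: variable names match the post, the values are made up.
set.seed(1)
dat <- data.frame(y = rnorm(20), x1 = rnorm(20), x2 = rnorm(20),
                  x3 = rnorm(20), x4 = rnorm(20))

m     <- lm(y ~ x1 * x2 + x3 + x4, data = dat)
tt    <- terms(m)              # terms object stored in the fitted model
tt_nr <- delete.response(tt)   # same terms with the response dropped

# Variables still listed in the "variables" attribute (a call to list()).
vars_nr  <- all.vars(attr(tt_nr, "variables"))

# Names recorded in "dataClasses" after delete.response.
dc_names <- names(attr(tt_nr, "dataClasses"))

# Names present in dataClasses but no longer among the variables.
leftover <- setdiff(dc_names, vars_nr)

print(attr(tt_nr, "response"))
print(vars_nr)
print(leftover)
```

If leftover is non-empty, the "dataClasses" attribute still carries the response even though the other attributes do not.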
William Dunlap
2012-Jan-05 20:56 UTC
[Rd] delete.response leaves response in attribute dataClasses
I had noticed the same thing but figured that most people (writers of predict methods) would be looking up entries in dataClasses by name and not by position, since predict's newdata argument need not have entries in the same order as the data used to fit the model. Hence the extra entry would not be noticed (nor would it be missed if it were omitted).

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
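The by-name lookup described in this reply can be sketched as follows. The data, the model, and the newdata frame are assumptions made for illustration; .MFclass is the helper from the stats package that computes the class labels stored in "dataClasses". This is only one way a predict method might do the check, not a description of how any particular method is written.

```r
# Simulated fit; names and values are illustrative only.
set.seed(2)
dat <- data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10))
m   <- lm(y ~ x1 + x2, data = dat)

tt_nr <- delete.response(terms(m))
dc    <- attr(tt_nr, "dataClasses")

# newdata has no response column, and its columns are in a different order
# from the data used to fit the model.
newdata <- data.frame(x2 = c(0, 1), x1 = c(1, 0))

# Look entries up by name: only names found in both places are compared,
# so an extra "y" entry in dc (if present) is simply never consulted.
common     <- intersect(names(dc), names(newdata))
observed   <- vapply(newdata[common], .MFclass, character(1))
by_name_ok <- all(observed == dc[common])

# Prediction works regardless of the column order in newdata.
p <- predict(m, newdata)

print(common)
print(by_name_ok)
```

Code that instead pairs dataClasses with the columns of newdata by position would be sensitive both to the column order and to a leftover response entry, which is the situation described in the original report.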