Allowing partial matching on $-extraction has always been a source of accidents. Recently, someone who shall remain nameless tried names(mydata) <- "d^2" followed by mydata$d^2.

As variables in a data frame are generally considered similar to variables in, say, the global environment, it seems strange that foo$bar can give you the content of foo$bartender.

In R-devel (i.e., *not* R-3.0.0 beta, but 3.1.0-to-be) partial matches now give a warning.

Of course, it is inevitable that lazy programmers will have been using code like

> anova(fit1)$P
[1] 0.0008866369 NA
Warning message:
In `$.data.frame`(anova(fit1), P) : Name partially matched in data frame

and now get the warning during package checks. This can always be removed by spelling out the column name, as in

> anova(fit1)$`Pr(>F)`
[1] 0.0008866369 NA

or by explicitly specifying a partial match with

> anova(fit1)[["P", exact=FALSE]]
[1] 0.0008866369 NA

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
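The accident and the two workarounds described in this message can be reproduced with a small sketch. The data frames `mydata` and `foo` are made up for illustration (the thread does not show the originals), and `suppressWarnings()` is there only because whether a warning appears depends on the R version and options in use.

```r
# Hypothetical data, for illustration only (not from the thread).
mydata <- data.frame(x = c(1, 2, 3))
names(mydata) <- "d^2"

# Parsed as (mydata$d)^2: `$` partially matches "d" to the column "d^2",
# and the result is then squared.
accident <- suppressWarnings(mydata$d^2)

# Backticks refer to the column by its full name.
intended <- mydata$`d^2`

foo <- data.frame(bartender = c("Al", "Bo"), stringsAsFactors = FALSE)

via_dollar   <- suppressWarnings(foo$bar)    # partial match to "bartender"
via_exact    <- foo[["bar"]]                 # `[[` is exact by default: NULL
via_explicit <- foo[["bar", exact = FALSE]]  # partial matching requested
```

Spelling out the name, or asking for a partial match explicitly through `[[`, makes the intent visible to anyone reading the code.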
On Wed, Mar 20, 2013 at 7:28 AM, peter dalgaard <pdalgd at gmail.com> wrote:
> Allowing partial matching on $-extraction has always been a source of accidents. Recently, someone who shall remain nameless tried names(mydata) <- "d^2" followed by mydata$d^2.
>
> As variables in a data frame are generally considered similar to variables in, say, the global environment, it seems strange that foo$bar can give you the content of foo$bartender.
>
> In R-devel (i.e., *not* R-3.0.0 beta, but 3.1.0-to-be) partial matches now give a warning.

Just for data frames, or also for lists?

I think this is a fantastic change, but I do worry a little that it is going to generate warnings for a _lot_ of existing code.

Hadley

-- 
Chief Scientist, RStudio
http://had.co.nz/
Will you be doing the same for attribute names?

> options(prompt=with(version, paste0(language,"-",major,".",minor,"> ")))
R-2.15.3> x <- structure(17, AnAttr="an attribute", Abcd="a b c d")
R-2.15.3> attr(x, "A")
NULL
R-2.15.3> attr(x, "An")
[1] "an attribute"
R-2.15.3> attr(x, "Ab")
[1] "a b c d"

How will you deal with the common idiom of using is.null(x$n) to see if x has a component named "n"? One would not want a warning if x had a component called "nn".

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org] On Behalf Of peter dalgaard
> Sent: Wednesday, March 20, 2013 5:28 AM
> To: r-devel at r-project.org
> Subject: [Rd] Deprecating partial matching in $.data.frame
>
> Allowing partial matching on $-extraction has always been a source of accidents.
> Recently, someone who shall remain nameless tried names(mydata) <- "d^2" followed by
> mydata$d^2.
>
> As variables in a data frame are generally considered similar to variables in, say, the
> global environment, it seems strange that foo$bar can give you the content of
> foo$bartender.
>
> In R-devel (i.e., *not* R-3.0.0 beta, but 3.1.0-to-be) partial matches now give a warning.
>
> Of course, it is inevitable that lazy programmers will have been using code like
>
> > anova(fit1)$P
> [1] 0.0008866369 NA
> Warning message:
> In `$.data.frame`(anova(fit1), P) : Name partially matched in data frame
>
> and now get the warning during package checks.
> This can always be removed by spelling out the column name, as in
>
> > anova(fit1)$`Pr(>F)`
> [1] 0.0008866369 NA
>
> or by explicitly specifying a partial match with
>
> > anova(fit1)[["P", exact=FALSE]]
> [1] 0.0008866369 NA
>
> --
> Peter Dalgaard, Professor
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
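The two points raised in the reply above (partial matching in attr(), and the is.null(x$n) idiom) can be sketched as follows. The object `x` is the one from the message; the list `obj` is a hypothetical stand-in for "x has a component called nn". The sketch uses the `exact` argument of attr() and the exact-by-default behaviour of `[[`.

```r
# Same object as in the message above.
x <- structure(17, AnAttr = "an attribute", Abcd = "a b c d")

ambiguous <- attr(x, "A")                 # two candidates, so NULL
partial   <- attr(x, "An")                # unique prefix of "AnAttr"
strict    <- attr(x, "An", exact = TRUE)  # exact matching only: NULL

# Hypothetical list with a component "nn" but no component "n".
obj <- list(nn = 1)

looks_present  <- !is.null(obj$n)         # TRUE: `$` partially matched "nn"
really_present <- !is.null(obj[["n"]])    # FALSE: `[[` matches exactly
by_name        <- "n" %in% names(obj)     # FALSE: tests the names directly
```

Either `[[` or a test on names() answers "does a component named n exist?" without going through partial matching at all, so neither should be affected by a partial-match warning.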
On 20/03/13 17:58, Hadley Wickham wrote:
> On Wed, Mar 20, 2013 at 11:26 AM, peter dalgaard <pdalgd at gmail.com> wrote:
>>
>> On Mar 20, 2013, at 16:59 , William Dunlap wrote:
>>
>>> Will you be doing the same for attribute names?
>>
>> Not at this point.
>
> It would be really nice to have consistent behaviour across argument names, attributes, lists
> and data frames, at least for R CMD check.

I agree with Hadley that consistency is quite important. This is especially true for data.frames and lists, as this concerns the data itself, and not names or attributes of the data.

I would very much like to see warnings for *all* partial matching, at least at the level of R CMD check, so that they can be ironed out before warnings are given to the user in the next stage (as mentioned by Milan).

I was bitten at least once by a bug caused by partial completion, which cost me quite some time to figure out, and would very much like to see it go (or at least have the option to show warnings if it occurs).

Cheers,

Rainer

>
> Hadley
>

-- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany)
Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa
Tel : +33 - (0)9 53 10 27 44
Cell: +33 - (0)6 85 62 59 98
Fax : +33 - (0)9 58 10 27 44
Fax (D): +49 - (0)3 21 21 25 22 44
email: Rainer at krugs.de
Skype: RMkrug
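On the wish for an option to show warnings: R documents three global options in ?options, named warnPartialMatchDollar, warnPartialMatchAttr and warnPartialMatchArgs, that switch on warnings for partial matching in `$`, in attr(), and in argument matching. A minimal sketch follows, with a hypothetical list `fit`; it does not rely on the exact wording of the warning.

```r
# Turn on the partial-matching warnings, remembering the old settings.
old <- options(warnPartialMatchDollar = TRUE,
               warnPartialMatchAttr   = TRUE,
               warnPartialMatchArgs   = TRUE)

# Hypothetical list, for illustration only.
fit <- list(coefficients = c(a = 1, b = 2))

# Capture the warning text instead of printing it; NA means no warning.
msg <- tryCatch(
  {
    fit$coef
    NA_character_
  },
  warning = function(w) conditionMessage(w)
)

options(old)  # restore the previous settings
```

Setting these options in a development session or a test script is one way to find abbreviated names before they cause trouble.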
On Mar 21, 2013, at 09:25 , Rainer M Krug wrote:

> On 20/03/13 17:58, Hadley Wickham wrote:
>> On Wed, Mar 20, 2013 at 11:26 AM, peter dalgaard <pdalgd at gmail.com> wrote:
>>>
>>> On Mar 20, 2013, at 16:59 , William Dunlap wrote:
>>>
>>>> Will you be doing the same for attribute names?
>>>
>>> Not at this point.
>>
>> It would be really nice to have consistent behaviour across argument names, attributes, lists
>> and data frames, at least for R CMD check.
>
> I agree with Hadley that consistency is quite important. This is especially true for data.frames
> and lists, as this concerns the data itself, and not names or attributes of the data.

Well, maybe consistency is important, but partial matching never worked for $-extraction in environments, so the current change could be considered mainly a nudge of data frames in the direction of environments. After all, both can be thought of as collections of named objects.

General lists are a somewhat different issue. They often, formally or informally, represent classed objects with a defined set of names, typically obtained as return values from functions. Since the names are known, people will have used the expedient of abbreviating them. This can happen with data frames as well, but less commonly, since it is in general unsafe to rely on column names being uniquely defined by any particular prefix.

I.e., deprecating partial matching for lists opens a rather larger can of worms, and might require more extensive code revisions. Also, the performance hit of a runtime check for partial matching might be more important for lists than it is for data frames.

It could be worth it to implement an R CMD check warning as you suggest, but perhaps not just now.

-Peter

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
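The contrast drawn above between environments and lists can be seen in a short sketch. The names `e`, `l` and `bartender` are hypothetical, chosen to echo the foo$bartender example from the start of the thread.

```r
# Hypothetical environment and list holding the same named object.
e <- new.env()
assign("bartender", 1, envir = e)
l <- list(bartender = 1)

from_env  <- e$bar   # environments: no partial matching, so NULL
from_list <- l$bar   # lists: "bar" partially matches "bartender"
```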