Muenchen, Robert A (Bob)
2007-Aug-27  15:54 UTC
[R] FW: subset using noncontiguous variables by name (not index)
Thomas, that's a good point. I was thinking of anscombe[x1::y1] making it
clear which one, but you would then want just x1::y1 to have unambiguous
meaning on its own, which is impossible.

As for x1:xN, it's unambiguous on its own. I thought one of the great
advantages of R was that it could use different methods so that a new
operator would not be needed. The colon operator would just have a new
method for when stringN appeared. One that would be very useful & have
obvious meaning.

Thanks,
Bob

> -----Original Message-----
> From: Thomas Lumley [mailto:tlumley at u.washington.edu]
> Sent: Monday, August 27, 2007 10:25 AM
> To: Muenchen, Robert A (Bob)
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] subset using noncontiguous variables by name (not index)
>
> On Mon, 27 Aug 2007, Muenchen, Robert A (Bob) wrote:
>
> > Gabor, That works great!
> >
> > I think this would be a very helpful addition to the main R
> > distribution. Perhaps with a single colon representing numerical order
> > (exactly as you have written it) and two colons representing the order
> > of the variables as they appear in the data frame (your first example).
> > That's analogous to SAS' x1-xN, which you know gets those N variables,
> > and a--z, which selects an unknown number of variables a through z. How
> > many that is depends upon their order in the data frame. That would not
> > only be very useful in general, but it would also make transitioning to
> > R from SAS or SPSS less confusing.
> >
> > Is R still being extended in such basic ways, or does that muck up
> > existing programs too much?
> >
>
> In principle base R can be extended like that, but a strong case is needed
> for non-standard evaluation rules and for depleting the restricted supply
> of short binary operator names.
>
> The reason for subset() and its behaviour is that 'variables as they
> appear in the data frame' is typically ambiguous -- which data frame? In
> SPSS you have only one and in SAS there is a default one, so there is no
> ambiguity in X1--Y2, but in R it needs another argument specifying the
> data frame, so it can't really be a binary operator.
>
> The double colon :: and triple colon ::: are already used for namespaces,
> and a search of r-help reveals two previous, different, suggestions for
> %:%.
>
> -thomas
>
> Thomas Lumley                 Assoc. Professor, Biostatistics
> tlumley at u.washington.edu   University of Washington, Seattle
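
For readers following the thread, here is a minimal sketch of the subset() behaviour discussed above, assuming the built-in anscombe data set (columns x1..x4, y1..y4) that the reply refers to. The variable names by_position and picked are illustrative only. It shows that a name range is resolved against whichever data frame is passed to subset(), and that :: is already taken by namespace lookups.

```r
# anscombe ships with base R; its columns are x1, x2, x3, x4, y1, y2, y3, y4.

# Inside subset(), select= is evaluated with each column name standing for
# its position in the data frame, so x1:y1 means "columns x1 through y1
# in the order they appear in anscombe".
by_position <- subset(anscombe, select = x1:y1)
names(by_position)   # "x1" "x2" "x3" "x4" "y1"

# Noncontiguous ranges can be combined with c().
picked <- subset(anscombe, select = c(x1:x2, y1:y2))
names(picked)        # "x1" "x2" "y1" "y2"

# The data frame is an explicit argument here. A bare x1:y1 at top level
# has no data frame to resolve the names against.

# The double colon is already the namespace operator:
stats::sd(anscombe$x1)
```

The range depends on column order in the data frame passed in, which is the ambiguity raised in the quoted message: the same expression can select a different set of columns in a different data frame.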
