Quiz: what does this produce?

    d=data.frame( a=1:1000, b=2001:3000, z= 5001:6000 )
    attach(d); c <- (a+b)>25; detach(d)
    d= subset(d, TRUE, select=c( a, b, c ))

Yes, I know I have made a mistake, in that the code does not do what I presumably would have wanted. It does seem like unexpected behavior, though, without an error. There probably is some reason why this does not ring an alarm bell...

/iaw
----
Ivo Welch (ivo.welch at brown.edu, ivo.welch at gmail.com)

ivo welch wrote:
> d=data.frame( a=1:1000, b=2001:3000, z= 5001:6000 )
> attach(d); c <- (a+b)>25; detach(d)
> d= subset(d, TRUE, select=c( a, b, c ))
>
> yes, I know I have made a mistake, in that the code does not do what I
> presumably would have wanted.

What exactly did you want?

I would not have wanted a data set with 1000 variables, but an error message. The intent was

    d=data.frame( a=1:1000, b=2001:3000, z= 5001:6000 )
    attach(d); d$c <- (a+b)>25; detach(d)
    d= subset(d, TRUE, select=c( a, b, c ))

-iaw

On Mon, Aug 23, 2010 at 6:04 PM, Erik Iverson <eriki at ccbr.umn.edu> wrote:
> What exactly did you want?

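To make the surprise concrete, here is a scaled-down sketch of the original quiz (5 rows instead of 1000, which is my own simplification). Inside select=, the column names a and b stand for the column positions 1 and 2, while the stray workspace vector c is not a column, so its TRUE values are coerced to the number 1 and each one selects column a again.

```r
# Scaled-down version of the quiz: 5 rows instead of 1000 (an assumption
# made only to keep the output small).
d <- data.frame(a = 1:5, b = 2001:2005, z = 5001:5005)

# A logical vector in the workspace, NOT a column of d; every element is TRUE.
c <- with(d, (a + b) > 25)

# select= sees a -> 1, b -> 2, and c -> five TRUE values coerced to 1,
# so column 1 is selected five more times.
res <- subset(d, TRUE, select = c(a, b, c))

dim(res)   # 5 rows, 7 columns: a, b, plus five copies of a
```

With the original 1000 rows the same mechanism gives 1002 columns and no warning.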
On Aug 23, 2010, at 5:51 PM, ivo welch wrote:
> d=data.frame( a=1:1000, b=2001:3000, z= 5001:6000 )
> attach(d); c <- (a+b)>25; detach(d)
> d= subset(d, TRUE, select=c( a, b, c ))
>
> [...] it does seem like unexpected behavior, though, without an error.
> there probably is some reason why this does not ring an alarm bell...

You have created a perfect example of why it is a bad idea to attach data.frames.

    ?attach   # yes, I am yet again saying: "read the help page..."

... especially the 4th paragraph of the Details section.

--
David.

David Winsemius, MD
West Hartford, CT

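As a rough illustration of the point about attach() (my reading of the behaviour, not a quotation from the help page): the attached data frame only makes the columns visible by name, and a plain assignment made while it is attached creates an ordinary object in the workspace. The name flag_made is hypothetical.

```r
d <- data.frame(a = 1:5, b = 2001:2005)

attach(d)
flag_made <- (a + b) > 25   # a and b are found via the attached copy
detach(d)

"flag_made" %in% names(d)   # FALSE: d itself was never modified
exists("flag_made")         # TRUE: the result lives in the workspace
```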
On Aug 23, 2010, at 6:28 PM, David Winsemius wrote:
> You have created a perfect example for why it is a bad idea to
> attach data.frames.
>
> ?attach # yes, I am yet again saying: "read the help page..."
>
> ... especially the 4th paragraph of the Details section.

I think it might be helpful to consider the right way and the wrong way to do the same assignment using with(), which is my choice as an alternative to attach().

Right:

    d$c <- with(d, a+b > 25)   # note: using "c" as an object name is a really confusing strategy

Wrong:

    with(d, c <- a+b > 25)

The wrong way is similar to what you might have thought would be happening: the assignment is made in a temporary environment that is then discarded, so "d" is unchanged. The attach() operation created its own environment, but that did not mean that assignments would create new columns inside "d".

David Winsemius, MD
West Hartford, CT

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

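The two with() forms can be compared side by side in a small sketch. The column names flag_kept and flag_lost are hypothetical, chosen to avoid the confusing name c, and the data are scaled down from the thread's example.

```r
d <- data.frame(a = 1:5, b = 21:25)

# "Right": with() computes the vector, and the assignment to d$... stores it.
d$flag_kept <- with(d, a + b > 25)

# "Wrong": the assignment happens inside with()'s temporary environment,
# which is thrown away as soon as with() returns.
with(d, flag_lost <- a + b > 25)

names(d)                                # "a" "b" "flag_kept"
exists("flag_lost", inherits = FALSE)   # FALSE: nothing was kept anywhere
```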
On Mon, 2010-08-23 at 17:51 -0400, ivo welch wrote:
> quizz---what does this produce?

Henrique has provided an answer to the question, but...

> d=data.frame( a=1:1000, b=2001:3000, z= 5001:6000 )
> attach(d); c <- (a+b)>25; detach(d)

...this is ugly and will potentially catch you out one day if you forget to detach. These three calls can be achieved using a single with():

    c <- with(d, (a + b) > 25)

And the version you wanted:

    attach(d); d$c <- (a+b)>25; detach(d)

can be done using within():

    d <- within(d, c <- (a + b) > 25)

and with the latter, the intention is pretty clear.

HTH

G

--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

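A small runnable sketch of the with() / within() distinction described above, on scaled-down data; the name flag is a hypothetical stand-in for the thread's c.

```r
d <- data.frame(a = 1:5, b = 21:25, z = 101:105)

# with(): evaluate the expression using d's columns and return the result
# as a free-standing vector; d is not changed.
flag <- with(d, (a + b) > 25)

# within(): evaluate the assignment using d's columns and return a modified
# copy of d that contains the new column.
d <- within(d, flag <- (a + b) > 25)

names(d)   # "a" "b" "z" "flag"
```

Either form avoids the attach()/detach() pair, so there is nothing left on the search path if a step is forgotten.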
Thanks, everyone. I did not even know about transform() and with(), but they look quite useful.

Actually, what I had intended to state was that Boolean variables in the select part of subset statements, especially when mixed with other variables that are just part of the data frame, lead to unexpected results. (My d statement was intended to show how easy it is to forget that a variable is not part of the data frame, but just part of the global environment.) Unless this mixed treatment covers an important functional aspect, it might be better for this to cause a warning (or an error) than a silent recycling of variables.

(Related: The recycling rules are generally convenient, but can also be rather problematic in catching errors. It would be nice to be able to turn them off.)

regards,

/iaw

On Tue, Aug 24, 2010 at 3:55 AM, Gavin Simpson <gavin.simpson at ucl.ac.uk> wrote:
> [...] These three calls can be achieved using a single with() :
>
>   c <- with(d, (a + b) > 25)
>
> [...] can be done using within():
>
>   d <- within(d, c <- (a + b) > 25)
>
> and with the latter, the intention is pretty clear.

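One possible way to get the error asked for here, offered as a sketch and not as something proposed in the thread: select columns by character name instead of by unquoted symbol. Character names are matched against the columns only, so a same-named object in the workspace cannot slip into the selection.

```r
d <- data.frame(a = 1:5, b = 2001:2005, z = 5001:5005)
c <- with(d, (a + b) > 25)   # stray workspace vector, not a column of d

# Selecting by name fails because d has no column called "c";
# tryCatch() is used here only to capture the error message.
res <- tryCatch(
  d[, c("a", "b", "c")],
  error = function(e) conditionMessage(e)
)
res   # an error message about undefined columns, not a data frame
```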
On Aug 24, 2010, at 7:36 AM, ivo welch wrote:
> actually, what I had intended to state was that Boolean variables in
> select parts of subset statements, especially when mixed with other
> variables that are just part of the data frame, leads to unexpected
> results. [...] unless this mixed-treatment covers an
> important functional aspect, this might be better to cause a warning
> (or an error) than a silent recycling of variables.

I think you will find intense resistance to the request for a warning when functions like table() and subset() are given vectors of the correct length that are not part of a data.frame or of the data argument. I suppose the subset situation might arguably be different from the table situation, but it is rather common practice to construct utility index or flag vectors that are never incorporated into the main data.frame that is being analyzed.

> (Related: The recycling rules are generally convenient, but can also
> be rather problematic in catching errors. It would be nice to be able
> to turn them off.)

You do get warnings, and it is possible to raise the level of action taken by the system to that of an error.

    ?options

... and take note of the various options beginning with "warn" ...

... or take note of the error option and write your own.

David Winsemius, MD
West Hartford, CT
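A minimal sketch of the warn option mentioned above. Note one limitation: R only warns about recycling when the longer length is not a multiple of the shorter one, so recycling across exact multiples stays silent regardless of this setting.

```r
# Recycling across an exact multiple is silent, whatever the warn level.
res_silent <- 1:4 + 1:2           # 2 4 4 6, no warning

# With warn = 2, warnings are converted to errors. The old setting is
# saved and restored so the change does not leak into the session.
old <- options(warn = 2)
res_strict <- tryCatch(
  1:3 + 1:2,                      # lengths 3 and 2: normally just a warning
  error = function(e) conditionMessage(e)
)
options(old)

res_strict   # the text of the former warning, now raised as an error
```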