Dear R developers,

I just spent half a day debugging an R program that had two bugs. First, I selected a wrongly named variable, which turned out to be a scalar and which then happily multiplied as if it were a matrix. Second, I selected a wrongly named variable from a data frame, which triggered no error when used as a[["name"]] or a$name. There should be an option that, when turned on, makes R throw an error when one does this. I cannot imagine that there is much code that wants to reference non-existent columns in data frames.

I know you guys are saints for developing without financial support. But maybe we non-insider end users can help: could a bounty list be put up on R-project for us end users to contribute to? I would pledge $500 to a $10,000 fund for a project to comprehensively enhance the programming and debugging aspects of R. It would only take 20 of us to make this possible.

Personally, I think basic nudgeware is the way to go. When a user starts R in interactive mode, there should be a note that says:

    Please donate $20 to the R Foundation to support the development.
    Press enter to continue, or enter your contribution number to avoid
    this message in the future.

You can even accept the same string if need be. It's a nudge only, not a requirement.

Regards,

/iaw

----
Ivo Welch (ivo.welch at gmail.com)
http://www.ivo-welch.info/
J. Fred Weston Professor of Finance
Anderson School at UCLA, C519
Director, UCLA Anderson Fink Center for Finance and Investments
Free Finance Textbook, http://book.ivo-welch.info/
Editor, Critical Finance Review, http://www.critical-finance-review.org/
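For readers who want to reproduce the two silent behaviours described above, here is a minimal base R sketch. The object names (DF, m, s) are made up for illustration and are not taken from the original program.

```r
# A data frame with columns a and b only (illustrative data).
DF <- data.frame(a = 1:3, b = 4:6)

# Selecting a column that does not exist returns NULL,
# with no error and no warning.
missing_dollar  <- DF$foo
missing_bracket <- DF[["foo"]]
print(is.null(missing_dollar))
print(is.null(missing_bracket))

# A length-one vector is recycled in element-wise arithmetic, so
# multiplying a matrix by an accidental "scalar" also succeeds silently.
m <- matrix(1:4, nrow = 2)
s <- 2
scaled <- m * s
print(scaled)
```

Both behaviours are documented features of base R, which is why neither raises a condition by default.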
Here is a JIT suggestion: add this message to error/core dump reports:

    "If you would like to submit a patch for this error, please contact
    r-devel@r-project.org for further instructions. If you cannot, or
    would prefer not to, fix this error yourself, please consider
    donating $20 to the Fix My Bug (R) fund."

patch: ;-)

    if(R_Interactive) {
        REprintf("\nPossible actions:\n1: %s\n2: %s\n3: %s\n4: %s\n5: %s\n",
                 "abort (with core dump, if enabled)",
                 "normal R exit",
                 "exit R without saving workspace",
                 "exit R saving workspace",
                 "donate $20");

--t

nb. If I am not mistaken, Dr. Gentleman is fairly busy these days at Genentech, sequencing anything he/they can get his/their hands on. Prof. Ripley might be a more sensible choice for these sorts of suggestions if you are going to cc: people besides the r-devel list. If there is history between you and Dr. Gentleman, then by all means please ignore whatever irrelevant crap I typed above.

On Thu, Jan 3, 2013 at 10:00 AM, ivo welch <ivo.welch@anderson.ucla.edu> wrote:
> [...]

--
*A model is a lie that helps you see the truth.*
Howard Skipper <http://cancerres.aacrjournals.org/content/31/9/1173.full.pdf>
Well...

On Thu, Jan 3, 2013 at 10:00 AM, ivo welch <ivo.welch at anderson.ucla.edu> wrote:
> Dear R developers---I just spent half a day debugging an R program,
> which had two bugs---I selected the wrongly named variable, which
> turns out to have been a scalar, which then happily multiplied as if
> it was a matrix; and another wrongly named variable from a data frame,
> that triggered no error when used as a[["name"]] or a$name . there
> should be an option to turn on that throws an error inside R when one
> does this. I cannot imagine that there is much code that wants to
> reference non-existing columns in data frames.

But I can -- and do it all the time: To add a new variable, "d", to a data frame, df, containing only "a" and "b" (with 10 rows, say):

    df[["d"]] <- 1:10

Trying to outguess documentation to create error triggers is a very bad idea.

R already has plenty of debugging tools -- and there is even a "debug" package. Perhaps you need a better programming editor/IDE. There are several listed on CRAN, RStudio, etc.

-- Bert

> [...]

--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info:
Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Ivo,

That's standard R behaviour, but I've had bugs similar to yours. If you really want to change it, one way would be to create your own helper function, say strictselect() or a shorter name, and make sure to use that instead of [[ and $. Or how about something like this?

> DF = data.frame(a=1:3,b=4:6)
> DF$foo
NULL
> DF[["foo"]]
NULL
> "$.data.frame" = "[[.data.frame" = function(x,...) {if (!..1 %in% names(x)) stop("Column not found!") else base::"[[.data.frame"(x,...) }
> DF$foo
Error in `$.data.frame`(DF, foo) : Column not found!
> DF[["foo"]]
Error in `[[.data.frame`(DF, "foo") : Column not found!
> DF[["newcol"]] <- 7:9
> DF
  a b newcol
1 1 4      7
2 2 5      8
3 3 6      9

Masking those methods in your .GlobalEnv shouldn't break packages that may rely on missing columns returning NULL, because all packages now have namespaces. So this mask should affect only your own code, iiuc. You could place those masks in your Rprofile.site.

It was quickly typed and tested, and not thoroughly thought through, so *it is just a straw man*.

Matthew
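The first suggestion above, a dedicated helper, could look roughly like the sketch below. Only the name strictselect() comes from the message; the argument names, the input checks and the error text are assumptions made for illustration.

```r
# strictselect(): hypothetical helper. Returns one column by exact name
# and stops with an error if the data frame has no such column.
strictselect <- function(df, name) {
  stopifnot(is.data.frame(df), is.character(name), length(name) == 1L)
  if (!name %in% names(df)) {
    stop("Column not found: ", name, call. = FALSE)
  }
  df[[name]]
}

DF <- data.frame(a = 1:3, b = 4:6)

# An existing column is returned as usual.
existing <- strictselect(DF, "a")

# A missing column raises an error; the message is captured here
# so that the script keeps running.
result <- tryCatch(strictselect(DF, "foo"),
                   error = function(e) conditionMessage(e))
print(result)
```

Compared with masking `$` and `[[`, a helper leaves base behaviour untouched, at the cost of having to remember to call it everywhere.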
ivo welch <ivo.welch <at> anderson.ucla.edu> writes:
> Dear R developers---I just spent half a day debugging an R program,
> which had two bugs---I selected the wrongly named variable, which
> turns out to have been a scalar, which then happily multiplied as if
> it was a matrix; and another wrongly named variable from a data frame,
> that triggered no error when used as a[["name"]] or a$name . there
> should be an option to turn on that throws an error inside R when one
> does this. I cannot imagine that there is much code that wants to
> reference non-existing columns in data frames.
>
> I know you guys are saints for developing without financial support.
> but maybe we non-insider end-users can help by putting up a bounty
> list on R-project for us end-users to contribute to? I would pledge
> $500 to a $10,000 fund that funds a project to comprehensively enhance
> the programming and debugging aspects of R. it would only take 20 of
> us to make this possible.
>
> personally, I think basic nudgeware is the way to go. when a user
> starts R in interactive mode, there should be a note that says,
>
> please donate $20 to the R foundation to support the development.
> press enter to continue or enter your contribution number to avoid
> this message in the future .
>
> you can even accept the same string if need be. it's a nudge only,
> not a requirement.

I did bring this idea up briefly 5 years ago (for whatever that's worth): http://tolstoy.newcastle.edu.au/R/e2/devel/07/05/3202.html.

I very much doubt R-core will go for this, but there's nothing stopping some private citizen with time and energy on their hands from setting up their own private bounty system. As I see it, the challenges would be:

* setting up and administering the web site and the bounty system (i.e. figuring out rules for deciding when a bounty should be paid);
* convincing the R community that their money is safe with you;
* figuring out an appropriate payment/escrow system (Paypal?);
* dealing with any tax and reporting issues relevant to your locality of receiving and disbursing money.

It's conceivable that some existing R-oriented entity (Mango Solutions, Revolution, RStudio?) would want/be willing to partner.

This won't take care of getting stuff into core R, but (1) well-worked-out proofs of concept would go a long way to convincing R-core; (2) a lot can be done outside of core R if (for example) you moved over to using data.table everywhere instead of data frames (only translating to data frames where absolutely necessary).

(I would love a scalar data type for R, but I don't think that can be done without a near-complete rewrite ...)

Ben Bolker
On 2013-01-04 12:00, r-devel-request at r-project.org wrote:
> Message: 16
> Date: Thu, 3 Jan 2013 22:52:44 +0000
> From: Ben Bolker <bbolker at gmail.com>
> To: <r-devel at stat.math.ethz.ch>
> Subject: Re: [Rd] Bounty on Error Checking
> Message-ID: <loom.20130103T234406-301 at post.gmane.org>
>
> ivo welch <ivo.welch <at> anderson.ucla.edu> writes:
>> [...]
>
> I did bring this idea up briefly 5 years ago (for whatever that's
> worth): http://tolstoy.newcastle.edu.au/R/e2/devel/07/05/3202.html.
>
> I very much doubt R-core will go for this, but there's nothing stopping
> some private citizen with time and energy on their hands from setting
> up their own private bounty system. As I see it the challenges would
> be:
>
> * setting up and administering the web site and the bounty system
> (i.e. figuring out rules for deciding when a bounty should be paid)
> * convincing the R community that their money is safe with you;
> * figuring out an appropriate payment/escrow system (Paypal?)
> * dealing with any tax and reporting issues relevant to your locality of
> receiving and disbursing money
>
> It's conceivable that some existing R-oriented entity (Mango Solutions,
> Revolution, RStudio?) would want/be willing to partner.
>
> This won't take care of getting stuff into core R, but (1)
> well-worked out proofs of concept would go a long way to convincing
> R-core; (2) a lot can be done outside of core R if (for
> example) you moved over to using data.table everywhere instead of
> data frames (only translating to data frames where absolutely necessary).
>
> (I would love a scalar data type for R, but I don't think that
> can be done without a near-complete rewrite ...)
>
> Ben Bolker

The PyPy project funds the development of new features this way (http://pypy.org/: on the right side of the page there are proposals, how much they cost to implement, and how much has been donated). There must be others; I am just more aware of that one.

A potential difficulty is that all of R-core is possibly already funded (tenured positions in academia, I'd guess) and might be moderately sensitive to the idea that a given feature should be implemented because people are paying to see it appear.
On Fri, Jan 3, 2013, Bert Gunter wrote:
> Well...
>
> On Thu, Jan 3, 2013 at 10:00 AM, ivo welch <ivo.welch <at>
> anderson.ucla.edu> wrote:
>> Dear R developers---I just spent half a day debugging an R program,
>> which had two bugs---I selected the wrongly named variable, which
>> turns out to have been a scalar, which then happily multiplied as if
>> it was a matrix; and another wrongly named variable from a data
>> frame, that triggered no error when used as a[["name"]] or a$name .
>> there should be an option to turn on that throws an error inside R
>> when one does this. I cannot imagine that there is much code that
>> wants to reference non-existing columns in data frames.
>
> But I can -- and do it all the time: To add a new variable, "d" to a
> data frame, df, containing only "a" and "b" (with 10 rows, say):
>
> df[["d"]] <- 1:10

Yes, but that's `[[<-`. Ivo was talking about `[[` and `$`, i.e. select only, not assign, if I understood correctly.

> Trying to outguess documentation to create error triggers is a very
> bad idea.

Why exactly is it a very bad idea? (I don't necessarily disagree, just asking for more colour.)

> R already has plenty of debugging tools -- and there is even a "debug"
> package. Perhaps you need a better programming editor/IDE. There are
> several listed on CRAN, RStudio, etc.

True, but that relies on you knowing there's a bug to hunt for. What if you don't know you're getting incorrect results, silently? In the same way that options(warn=2) turns known warnings into errors, letting you be stricter if you wish, an option to turn on warnings from `[[` and `$` when the column is missing (select only, not assign) doesn't seem like a bad option to have. Maybe it would reveal some previously silent bugs.

Anyway, I'm hoping Ivo will let us know whether he likes the simple mask I proposed. That's already an option that can be turned on or off. But if his bug was selecting the wrong column, not a missing one, then I'm not sure anything could (or needs to) be done about that.

Matthew
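As a rough illustration of the "option that can be turned on" idea, the sketch below gates a missing-column warning behind an option. Everything here is hypothetical: get_col() and the option name "strict.columns" are invented for this example and do not exist in R. Only options(warn = 2), which promotes warnings to errors, is real.

```r
# get_col(): hypothetical accessor. The option "strict.columns" is made up
# for this sketch; it is not a real R option.
get_col <- function(df, name) {
  if (isTRUE(getOption("strict.columns", FALSE)) && !name %in% names(df)) {
    warning("Column not found: ", name, call. = FALSE)
  }
  df[[name]]
}

DF <- data.frame(a = 1:3, b = 4:6)

# Option off (the default): a missing column comes back as NULL,
# exactly as in base R.
quiet <- get_col(DF, "foo")

# Option on: the same call signals a warning, captured here by tryCatch().
old <- options(strict.columns = TRUE)
caught <- tryCatch(get_col(DF, "foo"),
                   warning = function(w) conditionMessage(w))
options(old)  # restore the previous (unset) state
print(caught)
```

Because the check only warns, a user who also sets options(warn = 2) would get an error instead, which matches the strictness ladder described above. Assignment through `[[<-` is untouched, since only selection goes through the accessor.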