Paul Johnson
2011-Apr-09 19:51 UTC
[Rd] Wish there were a "strict mode" for R interpreter. What about You?
Years ago, I did lots of Perl programming. Perl will let you be lazy
and write functions that refer to undefined variables (like R does),
but there is also a strict mode so the interpreter will block anything
when a variable is mentioned that has not been defined. I wish there
were a strict mode for checking R functions.

Here's why. We have a lot of students writing R functions around here
and they run into trouble because they use the same name for things
inside and outside of functions. When they call functions that have
mistaken or undefined references to names that they use elsewhere,
then variables that are in the environment are accidentally used. Know
what I mean?

dat <- whatever

someNewFunction <- function(z, w){
   #do something with z and w and create a new "dat"
   # but forget to name it "dat"
    lm (y, x, data=dat)
   # lm just used wrong data
}

I wish R had a strict mode to return an error in that case. Users
don't realize they are getting nonsense because R finds things to fill
in for their mistakes.

Is this possible? Does anybody agree it would be good?

--
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas
Duncan Murdoch
2011-Apr-09 20:37 UTC
[Rd] Wish there were a "strict mode" for R interpreter. What about You?
On 11-04-09 3:51 PM, Paul Johnson wrote:
> Years ago, I did lots of Perl programming. Perl will let you be lazy
> and write functions that refer to undefined variables (like R does),
> but there is also a strict mode so the interpreter will block anything
> when a variable is mentioned that has not been defined. I wish there
> were a strict mode for checking R functions.
>
> Here's why. We have a lot of students writing R functions around here
> and they run into trouble because they use the same name for things
> inside and outside of functions. When they call functions that have
> mistaken or undefined references to names that they use elsewhere,
> then variables that are in the environment are accidentally used. Know
> what I mean?
>
> dat<- whatever
>
> someNewFunction<- function(z, w){
>    #do something with z and w and create a new "dat"
>    # but forget to name it "dat"
>     lm (y, x, data=dat)
>    # lm just used wrong data
> }
>
> I wish R had a strict mode to return an error in that case. Users
> don't realize they are getting nonsense because R finds things to fill
> in for their mistakes.
>
> Is this possible? Does anybody agree it would be good?
>

It would be really bad, unless done carefully.

In your function the free (undefined) variables are dat and lm. You
want to be warned about dat, but you don't want to be warned about lm.
What rule should R use to determine that?

(One possible rule would work in a package with a namespace. In that
case, all variables must be found in declared dependencies, the search
could stop before it got to globalenv(). But it seems unlikely that
your students are writing packages with namespaces.)

Duncan Murdoch
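The rule sketched in the parenthetical above (stop the search before it reaches globalenv()) can be approximated statically for ordinary scripts. What follows is a minimal sketch under stated assumptions, not a built-in R feature. It uses findGlobals() from the codetools package to list a function's free names, then flags any name that cannot be found when the lookup starts at the first attached package instead of the workspace. The helper name unresolved_without_global is hypothetical. The sketch assumes the default attached packages (including stats) and that none of them exports an object called dat, x, or y.

```r
library(codetools)

# Paul's example, with the bug left in place.
someNewFunction <- function(z, w) {
  lm(y, x, data = dat)
}

# Hypothetical helper: names of free variables and functions that
# cannot be resolved without looking in the global environment.
unresolved_without_global <- function(f) {
  # All free names used by f, functions and variables together.
  globals <- findGlobals(f, merge = TRUE)
  # Start the lookup at the first attached package, skipping globalenv().
  start <- parent.env(globalenv())
  found <- vapply(globals, exists, logical(1), envir = start, inherits = TRUE)
  globals[!found]
}

flagged <- unresolved_without_global(someNewFunction)
print(flagged)
```

Because the lookup skips the workspace, dat is flagged even when a global dat exists, while lm passes because it is found in the attached stats package. A function that legitimately relies on a workspace object would also be flagged, which is the trade-off raised in the reply.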
Hadley Wickham
2011-Apr-09 21:31 UTC
[Rd] Wish there were a "strict mode" for R interpreter. What about You?
On Sat, Apr 9, 2011 at 2:51 PM, Paul Johnson <pauljohn32 at gmail.com> wrote:
> Years ago, I did lots of Perl programming. Perl will let you be lazy
> and write functions that refer to undefined variables (like R does),
> but there is also a strict mode so the interpreter will block anything
> when a variable is mentioned that has not been defined. I wish there
> were a strict mode for checking R functions.
>
> Here's why. We have a lot of students writing R functions around here
> and they run into trouble because they use the same name for things
> inside and outside of functions. When they call functions that have
> mistaken or undefined references to names that they use elsewhere,
> then variables that are in the environment are accidentally used. Know
> what I mean?
>
> dat <- whatever
>
> someNewFunction <- function(z, w){
>    #do something with z and w and create a new "dat"
>    # but forget to name it "dat"
>     lm (y, x, data=dat)
>    # lm just used wrong data
> }
>
> I wish R had a strict mode to return an error in that case. Users
> don't realize they are getting nonsense because R finds things to fill
> in for their mistakes.
>
> Is this possible? Does anybody agree it would be good?

> library(codetools)
> checkUsage(someNewFunction)
<anonymous>: no visible binding for global variable ‘y’
<anonymous>: no visible binding for global variable ‘x’
<anonymous>: no visible binding for global variable ‘dat’

Which also picks up another bug in your function ;)

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
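To get closer to the error Paul asked for, the report from checkUsage() can be collected and turned into a failure. This is a rough sketch only. The helper name strict_check is hypothetical, and it relies on the report argument of checkUsage() to capture messages instead of printing them. It also assumes that dat, x, and y are not defined in the workspace when the check runs. As far as can be told from the output above, checkUsage() reports only names it cannot see, so a global dat that already exists would likely go unreported.

```r
library(codetools)

# Paul's example, with the bug left in place.
someNewFunction <- function(z, w) {
  lm(y, x, data = dat)
}

# Hypothetical helper: run checkUsage() and stop if it reports anything.
strict_check <- function(f, name = "<anonymous>") {
  notes <- character(0)
  # Collect each report line instead of printing it.
  checkUsage(f, name = name,
             report = function(msg) notes <<- c(notes, msg))
  if (length(notes) > 0) {
    stop("strict check failed:\n",
         paste(trimws(notes), collapse = "\n"), call. = FALSE)
  }
  invisible(TRUE)
}

# The buggy function fails the check; keep the message for inspection.
result <- tryCatch(strict_check(someNewFunction, "someNewFunction"),
                   error = function(e) conditionMessage(e))
cat(result, "\n")

# A function with no free variables passes.
clean <- function(z, w) z + w
strict_check(clean)
```

Students could call such a helper on each function right after defining it, ideally in a fresh session so that leftover workspace objects do not hide a missing definition.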