I noticed this oddity about [ and setMethod.

First, I define testFunc, which sorts a data frame by the first column and
returns the entries that aren't NAs, and testIt, which runs testFunc
repeatedly on a random large data frame, each time saving the return into a
dummy placeholder (for demonstration's sake).

> require(methods)
Loading required package: methods
[1] TRUE
> testFunc <- function(cur) {
+   sorted <- cur[order(cur[,1]),]
+   sorted[!is.na(sorted[,1]),]
+ }
> testIt <- function(num)
+   for (i in 1:num) {
+     cat("iteration", i, "...\n")
+     dummy <- testFunc(as.data.frame(matrix(rnorm(5000 * 9),9)))
+   }

Let's test out testIt.

> testIt(10)
iteration 1 ...
iteration 2 ...
iteration 3 ...
iteration 4 ...
iteration 5 ...
iteration 6 ...
iteration 7 ...
iteration 8 ...
iteration 9 ...
iteration 10 ...

So far, no problems. Next, I define myClass.

> setClass("myClass", representation(mySlot = "numeric"))
[1] "myClass"
> setMethod("[", signature(x = "myClass"),
+           function(x, i, j, drop) "[ for myClass")
[1] "["

Again, everything is fine. I've expanded [ to handle objects of class
myClass. But when we attempt to use testIt now, problems emerge:

> testIt(10)
iteration 1 ...
Error in "[.data.frame"(cur, order(cur[, 1]), ) :
        unused argument(s) (.Method ...)

Maybe that was just a fluke. Trying again, we get:

> testIt(10)
iteration 1 ...
iteration 2 ...
Error in sorted[, 1] : incorrect number of dimensions

A different message every time, and the error seems no bug of mine (a
.Method argument? What is that?). In fact, testIt produces a seemingly
random response from R on each subsequent attempt--everything from
successful execution to similarly ambiguous errors to a process-killing "Bus
Error."

All in all, it seems that once I've defined a new method for [ I can no
longer count on it to operate properly in any context.

Is this the intended behavior? And if so, is there any way I can have one
class present that overloads [ and still use [ to reliably subset data
frames?
Thanks,

Kevin

> R.version
         _
platform sparc-sun-solaris2.6
arch     sparc
os       solaris2.6
system   sparc, solaris2.6
status
major    1
minor    5.1
year     2002
month    06
day      17
language R
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
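To make concrete what testFunc in the message above computes, here is a minimal sketch that runs it on a tiny data frame. The data frame `df` and its values are made up for this illustration and do not come from the original post.

```r
# testFunc as defined in the original message.
testFunc <- function(cur) {
  sorted <- cur[order(cur[, 1]), ]   # order() places NA last by default
  sorted[!is.na(sorted[, 1]), ]      # drop rows whose first column is NA
}

# Hypothetical toy data, not from the original post.
df <- data.frame(a = c(3, NA, 1, 2), b = c("w", "x", "y", "z"))

res <- testFunc(df)
print(res)
```

The row with the NA in column `a` is sorted to the end by the first subset and removed by the second, so two separate "[" calls on a data frame are involved, each with another "[" call nested in its index argument.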
Definitely something funny going on, but perhaps a little less sweeping
than your comments suggest.

No explanation at the moment, but a couple of clues.

1. The problem doesn't seem to arise unless the data frame is large
enough.  I parametrized the 5000 rows in your example and couldn't get
the bizarre behavior until the number of rows was larger than 1000.

2. There is some reason to think that nested "[" expressions are
related.  After rewriting your testFunc as follows, the random errors
seemed to go away.

testFunc <- function(cur) {
    ii <- order(cur[,1])
    sorted <- cur[ii,]
    ll <- !is.na(sorted[,1])
    sorted[ll,]
}

(The change is just to pull out the subset expressions that were
arguments to other "[" expressions.)

I'm not an expert on R internals, but the combination of these might
suggest garbage collection taking place in nested calls to the dispatch
code for "[".

John Chambers

PS: this thread might be more suited to r-devel than r-help.

--
John M. Chambers                  jmc at bell-labs.com
Bell Labs, Lucent Technologies    office: (908)582-2681
700 Mountain Avenue, Room 2C-282  fax:    (908)582-3340
Murray Hill, NJ  07974            web: http://www.cs.bell-labs.com/~jmc
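The two clues in the reply above (parametrizing the size, and pulling the nested "[" calls apart) can be put side by side in a short sketch. The names `testFuncNested`, `testFuncFlat`, and `makeData`, and the sizes tried, are assumptions made for this illustration. The sketch only checks that the rewrite leaves the result unchanged; reproducing the failures themselves would presumably also need the setMethod call from the original post and an affected R version.

```r
# Nested form from the original post.
testFuncNested <- function(cur) {
  sorted <- cur[order(cur[, 1]), ]
  sorted[!is.na(sorted[, 1]), ]
}

# Un-nested form from the reply: each inner "[" result is stored first.
testFuncFlat <- function(cur) {
  ii <- order(cur[, 1])
  sorted <- cur[ii, ]
  ll <- !is.na(sorted[, 1])
  sorted[ll, ]
}

# Hypothetical helper: the same construction as the original call, with
# the size n as a parameter. matrix(x, 9) yields 9 rows and n columns.
makeData <- function(n) as.data.frame(matrix(rnorm(n * 9), 9))

# Compare both forms over a few sizes (sizes chosen arbitrarily).
set.seed(1)
sizes <- c(10, 100, 1000, 5000)
agree <- sapply(sizes, function(n) {
  d <- makeData(n)
  identical(testFuncNested(d), testFuncFlat(d))
})
print(data.frame(n = sizes, agree = agree))
```

Since the two forms differ only in whether intermediate results are bound to names, any disagreement between them would point at the evaluator or dispatch code rather than at the user's function.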
The problem was, in fact, a failure to protect the argument list in
certain circumstances.  It should be fixed in the currently checked-in
version.

John Chambers

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
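For anyone wanting to confirm on their own installation that an S4 method for "[" and ordinary data frame subsetting coexist, a rough check along the following lines may help. It is a sketch, not the test used for the fix: the method formals are adapted to include `...` and a default for `drop` so that they match the generic in current R, and the loop records failures with try() instead of stopping at the first error.

```r
library(methods)

# Class and "[" method as in the original post, with formals adapted to
# match the generic in current R (an assumption, not the original code).
setClass("myClass", representation(mySlot = "numeric"))
setMethod("[", signature(x = "myClass"),
          function(x, i, j, ..., drop = TRUE) "[ for myClass")

# The S4 method should be used for myClass objects only.
obj <- new("myClass", mySlot = c(1, 2, 3))
s4_result <- obj[1]

# testFunc as defined in the original message.
testFunc <- function(cur) {
  sorted <- cur[order(cur[, 1]), ]
  sorted[!is.na(sorted[, 1]), ]
}

# Exercise the data frame path repeatedly, as testIt did, recording
# whether each iteration returned a data frame with all 9 rows.
set.seed(42)
ok <- logical(10)
for (k in seq_along(ok)) {
  d <- as.data.frame(matrix(rnorm(5000 * 9), 9))
  out <- try(testFunc(d), silent = TRUE)
  ok[k] <- is.data.frame(out) && nrow(out) == 9
}

print(s4_result)
print(all(ok))
```

A single clean run does not prove the absence of a memory-protection problem, since the original failures were intermittent; repeating the loop with more iterations gives somewhat more confidence.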