Hervé Pagès
2009-Jun-02 01:42 UTC
[Rd] formal argument "envir" matched by multiple actual arguments
Hi list,

This looks similar to the problem reported here

  https://stat.ethz.ch/pipermail/r-devel/2006-April/037199.html

by Henrik Bengtsson a long time ago. It is very sporadic and
non-reproducible.

Henrik, do you remember if your code was using reg.finalizer()?
I tend to suspect it but I'm not sure.

I've been hunting this bug for months but today, with the help of other
Bioconductor users, I was able to isolate it and to write some code that
seems to "almost" reproduce it (i.e. not systematically but most of the
time).

(Just to put some context to the code below: it's a simplified version
of some more complex code that we use in Bioconductor to manage memory
caching of some big objects stored on disk. The idea is that objects of
class A can be named. All A objects with the same name form a group.
The code below implements a simple mechanism to trigger some action when
a group is completely removed from memory, i.e. when the last object in
a group is garbage collected.)

  setClassUnion("environmentORNULL", c("environment", "NULL"))

  setClass("A",
    representation(
      aa="integer",
      groupname="character",
      groupanchor="environmentORNULL"
    )
  )

  .A.group.sizes <- new.env(hash=TRUE, parent=emptyenv())

  .inc.A.group.size <- function(groupname)
  {
    group.size <- 1L
    if (exists(groupname, envir=.A.group.sizes, inherits=FALSE))
        group.size <- group.size +
                      get(groupname, envir=.A.group.sizes, inherits=FALSE)
    assign(groupname, group.size, envir=.A.group.sizes, inherits=FALSE)
  }

  .dec.A.group.size <- function(groupname)
  {
    group.size <- get(groupname, envir=.A.group.sizes, inherits=FALSE) - 1L
    assign(groupname, group.size, envir=.A.group.sizes, inherits=FALSE)
    return(group.size)
  }

  newA <- function(groupname="")
  {
    a <- new("A", groupname=groupname)
    if (!identical(groupname, "")) {
        .inc.A.group.size(groupname)
        groupanchor <- new.env(parent=emptyenv())
        reg.finalizer(groupanchor,
                      function(e)
                      {
                          group.size <- .dec.A.group.size(groupname)
                          if (group.size == 0L) {
                              cat("no more object of group",
                                  groupname, "in memory\n")
                              # take some action
                          }
                      }
        )
        a@groupanchor <- groupanchor
    }
    return(a)
  }

The following commands seem to trigger the problem:

  > for (i in 1:2000) {a1 <- newA("group1")}
  > as.list(.A.group.sizes)
  > gc()
  > as.list(.A.group.sizes)
  > for (i in 1:2000) {a2 <- newA("group2")}
  Error in assign(".Method", method, envir = envir) :
    formal argument "envir" matched by multiple actual arguments

If it doesn't, then adding more rounds should finally do it:

  gc()
  for (i in 1:2000) {a3 <- newA("group3")}
  gc()
  for (i in 1:2000) {a4 <- newA("group4")}

  etc...

Thanks in advance for any help with this!

H.

> sessionInfo()
R version 2.9.0 (2009-04-17)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_CA.UTF-8;LC_NUMERIC=C;LC_TIME=en_CA.UTF-8;LC_COLLATE=en_CA.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_CA.UTF-8;LC_PAPER=en_CA.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_CA.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319
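The group-counting mechanism described above can be looked at in isolation, without the S4 class. What follows is only a minimal sketch, not the Bioconductor code: the names group_sizes, group_size and new_anchor are hypothetical, and it assumes the finalizer does nothing more than update a counter.

```r
# Hypothetical names throughout; this is not the Bioconductor implementation.
group_sizes <- new.env(hash = TRUE, parent = emptyenv())

# Number of anchors currently counted for a group (0 if the group is unknown).
group_size <- function(groupname) {
  if (exists(groupname, envir = group_sizes, inherits = FALSE))
    get(groupname, envir = group_sizes, inherits = FALSE)
  else
    0L
}

# Create an anchor environment; its finalizer decrements the group count.
new_anchor <- function(groupname) {
  force(groupname)
  assign(groupname, group_size(groupname) + 1L, envir = group_sizes)
  anchor <- new.env(parent = emptyenv())
  reg.finalizer(anchor, function(e) {
    assign(groupname, group_size(groupname) - 1L, envir = group_sizes)
  })
  anchor
}

anchors <- lapply(1:3, function(i) new_anchor("group1"))
group_size("group1")   # 3 anchors are reachable at this point
rm(anchors)
invisible(gc())        # finalizers normally run as part of this call
group_size("group1")   # the count drops once the finalizers have run
```

As in the original code, the finalizer is a closure that still sees the anchor through its enclosing frame; that by itself does not keep the anchor alive, since the finalizer is only reachable through the anchor it is registered on.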
hpages at fhcrc.org
2009-Jun-02 08:05 UTC
[Rd] formal argument "envir" matched by multiple actual arguments
In fact reg.finalizer() looks like a dangerous feature. If the
finalizer itself triggers (implicitly or explicitly) garbage
collection, then bad things happen. In the following example,
garbage collection is triggered explicitly (using R-2.9.0):

  setClass("B", representation(bb="environment"))

  newB <- function()
  {
    ans <- new("B", bb=new.env())
    reg.finalizer(ans@bb,
                  function(e)
                  {
                      gc()
                      cat("cleaning", class(ans), "object...\n")
                  }
    )
    return(ans)
  }

  > for (i in 1:500) {cat(i, "\n"); b1 <- newB()}
  1
  2
  3
  4
  5
  6
  ...
  13
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  14
  ...
  169
  170
  171
  Error: not a weak reference
  Error: not a weak reference
  [repeat the above line thousands of times]
  ...
  Error: not a weak reference
  Error: not a weak reference
  cleaning B object...
  Error: SET_VECTOR_ELT() can only be applied to a 'list', not a 'integer'
  Error: SET_VECTOR_ELT() can only be applied to a 'list', not a 'integer'
  [repeat the above line thousands of times]
  ...
  Error: SET_VECTOR_ELT() can only be applied to a 'list', not a 'integer'
  Error: SET_VECTOR_ELT() can only be applied to a 'list', not a 'integer'
  172
  ...
  246
  247
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...

   *** caught segfault ***
  address 0x41, cause 'memory not mapped'

  Traceback:
   1: gc()
   2: function (e) {    gc()    cat("cleaning", class(ans), "object...\n")}(<environment>)

  Possible actions:
  1: abort (with core dump, if enabled)
  2: normal R exit
  3: exit R without saving workspace
  4: exit R saving workspace
  Selection: 2
  Save workspace image? [y/n/c]: n
  Segmentation fault

So apparently, if the finalizer triggers garbage collection,
then we can end up with a corrupted session. Then anything can
happen, from the strange 'formal argument "envir" matched by
multiple actual arguments' error I reported in the previous post,
to a segfault. In the worst case, nothing apparently happens but
the output produced by the code is wrong.

Maybe garbage collection requests should be ignored during the
execution of the finalizer? (and more generally during garbage
collection itself)

Cheers,
H.
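One way to keep explicit collection requests out of a finalizer is to let the finalizer only record that an object is gone, and to do the real cleanup later from ordinary code. The following is a minimal sketch of that idea, not a fix for the underlying problem: the names pending, new_tracked and drain_pending are hypothetical, and because any allocation in R can trigger a collection implicitly, this only avoids the explicit gc() call shown above.

```r
# Hypothetical helper names; a sketch under the assumptions stated above.
pending <- new.env(hash = TRUE, parent = emptyenv())

# Create an environment whose finalizer only records an id.
new_tracked <- function(id) {
  force(id)  # evaluate the argument now, not when the finalizer runs
  e <- new.env(parent = emptyenv())
  reg.finalizer(e, function(env) {
    # No gc() here: just note that this object has been collected.
    assign(id, TRUE, envir = pending)
  })
  e
}

# Called from ordinary code (not from a finalizer) to do the real cleanup.
drain_pending <- function() {
  ids <- ls(pending)
  rm(list = ids, envir = pending)
  for (id in ids) cat("cleaning", id, "\n")
  ids
}

objs <- lapply(c("obj1", "obj2"), new_tracked)
rm(objs)
invisible(gc())            # finalizers normally run as part of this call
cleaned <- drain_pending()
```

Any heavy work, including a deliberate gc() if one is wanted, would then go into drain_pending() rather than into the finalizer.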
Henrik Bengtsson
2009-Jun-02 08:13 UTC
[Rd] formal argument "envir" matched by multiple actual arguments
Hi.

2009/6/1 Hervé Pagès <hpages at fhcrc.org>:
> Hi list,
>
> This looks similar to the problem reported here
>   https://stat.ethz.ch/pipermail/r-devel/2006-April/037199.html
> by Henrik Bengtsson a long time ago. It is very sporadic and
> non-reproducible.
> Henrik, do you remember if your code was using reg.finalizer()?
> I tend to suspect it but I'm not sure.

Yes. This was/is observed with objects extending the Object class of
R.oo, and the constructor of Object uses reg.finalizer() [which then
calls finalize(), which can be "overloaded"]. The fact that the garbage
collector is involved could explain why this bug(?) is hard to
reproduce.

It's been a while since I saw this problem (and we do instantiate way
more Objects these days). Looking at my source code comments and the
post you refer to, I suspect that I managed to circumvent the issue by
the following trick (looking at my code, I have several of those
statements):

  envir2 <- envir
  get(name, envir=envir2)

Also, on March 6, 2008 I reported to R-devel on a related problem with
'%in%':

  http://tolstoy.newcastle.edu.au/R/e4/devel/08/03/0708.html

That one I circumvent by now only using is.element(a,b) instead of
a %in% b.

Maybe this gives you further clues.

/Henrik

BTW. You need to be careful when you register a finalizer that uses
code in a package, which may have been detached. This may cause an
error in the finalizer which can give further side effects. See here:

  http://tolstoy.newcastle.edu.au/R/e2/devel/07/08/4251.html
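The two workarounds mentioned in this reply can be written out as below. This is only a sketch of how they might be applied: the wrapper name lookup is hypothetical, and whether these changes avoid the error in any given session is not established here.

```r
# Workaround 1: copy 'envir' to a differently named local variable before
# passing it on, as described in the post. 'lookup' is a hypothetical wrapper.
lookup <- function(name, envir) {
  envir2 <- envir
  get(name, envir = envir2)
}

# Workaround 2: call is.element(a, b) instead of a %in% b.
e <- new.env()
assign("x", 42L, envir = e)
value <- lookup("x", e)
found <- is.element(c("a", "z"), c("a", "b", "c"))
```

Both forms return the same results as the direct get(name, envir=envir) and a %in% b calls; they only change how the call is spelled.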