Ben Bolker
2014-Mar-03 00:38 UTC
[Rd] reference classes, LAZY_DUPLICATE_OK, and external pointers
We (the lme4 authors) are having a problem with doing a proper deep copy of a reference class object in recent versions of R-devel with the LAZY_DUPLICATE_OK flag in src/main/bind.c enabled.

Apologies in advance for any improper terminology.

TL;DR Is there an elegant way to force non-lazy/deep copying in our case? Is anyone else using reference classes with a field that is an external pointer?

This is how copying of reference classes works in a normal situation:

    library(data.table) ## for address() function
    setRefClass("defaultRC",fields="theta")
    d1 <- new("defaultRC")
    d1$theta <- 1
    address(d1$theta) ## "0xbbbbb70"
    d2 <- d1$copy()
    address(d2$theta) ## same as above
    d2$theta <- 2
    address(d2$theta) ## now modified, by magic
    d1$theta ## unmodified

The extra complication in our case is that many of the objects within our reference class are actually accessed via an external pointer, which is initialized when necessary -- details are copied below for those who want them, or you can see the code at https://github.com/lme4/lme4

The problem is that this sneaky way of copying the object's contents doesn't trigger R's (new) rules for recognizing that a non-lazy copy should be made.

    library(lme4)
    fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
    pp <- fm1@pp
    pp$theta ## [1] 0.96673279 0.01516906 0.23090960
    address(pp$theta) ## something
    pp$Ptr ## <pointer: ...>
    xpp <- pp$copy() ## default is deep copy
    xpp$Ptr ## <pointer: (nil)>
    address(xpp$theta) ## same as above
    xpp$setTheta(c(0,0,0)) ## referenced through Ptr field
    xpp$Ptr ## now set to non-nil
    fm1@pp$theta ## changes to (0,0,0). oops.

So apparently when the xpp$theta object is copied into the external pointer, a reference/lazy copy is made. (xpp$theta itself is read-only, so I can't do the assignment that way.)

I can hack around this in a very ugly way by doing a trivial modification when assigning inside the copy method:

    assign("theta",get("theta",envir=selfEnv)+0, envir=vEnv)

... but (a) this is very ugly and (b) it seems very unsafe -- as R gets smarter it should start to recognize trivial changes like x+0 and x*1 and *not* copy in these cases ...

Method details:

    ## from R/AllClass.R, merPredD RC definition

    ptr = function() {
        'returns the external pointer, regenerating if necessary'
        if (length(theta)) {
            if (.Call(isNullExtPtr, Ptr)) initializePtr()
        }
        Ptr
    },

    ## ditto

    initializePtr = function() {
        Ptr <<- .Call(merPredDCreate, as(X, "matrix"), Lambdat,
                      LamtUt, Lind, RZX, Ut, Utr, V, VtV, Vtr,
                      Xwts, Zt, beta0, delb, delu, theta, u0)
        ...
    }

merPredDCreate in turn just copies the relevant bits into a new C++ class object:

    /* see src/external.cpp */

    SEXP merPredDCreate(SEXP Xs, SEXP Lambdat, SEXP LamtUt, SEXP Lind,
                        SEXP RZX, SEXP Ut, SEXP Utr, SEXP V, SEXP VtV,
                        SEXP Vtr, SEXP Xwts, SEXP Zt, SEXP beta0,
                        SEXP delb, SEXP delu, SEXP theta, SEXP u0) {
        BEGIN_RCPP;
        merPredD *ans = new merPredD(Xs, Lambdat, LamtUt, Lind, RZX,
                                     Ut, Utr, V, VtV,
                                     Vtr, Xwts, Zt, beta0, delb, delu,
                                     theta, u0);
        return wrap(XPtr<merPredD>(ans, true));
        END_RCPP;
    }
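For readers who want to see the aliasing described above in isolation, here is a minimal, self-contained C++ sketch. It is only a toy model and not lme4 or R code: `std::shared_ptr` stands in for R's lazily shared vector storage, and `Fields`, `Pred`, `lazy_copy` and `demo_aliasing` are hypothetical names invented for the illustration. It assumes that the object behind the external pointer keeps the handle it was given and writes through it in place.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-in for shared vector storage (hypothetical, not an R type).
using Vec = std::shared_ptr<std::vector<double>>;

// Toy stand-in for a reference-class object with a single field.
struct Fields {
    Vec theta;
};

// Toy stand-in for the C++ object behind the external pointer: it keeps
// the handle it was given and later writes through it in place.
class Pred {
public:
    explicit Pred(Vec theta) : theta_(std::move(theta)) {}

    void setTheta(const std::vector<double>& v) {
        assert(v.size() == theta_->size());
        std::copy(v.begin(), v.end(), theta_->begin());
    }

private:
    Vec theta_;
};

// Lazy copy: the new object shares storage with the old one.
Fields lazy_copy(const Fields& f) { return Fields{f.theta}; }

// Mutates through the copy's Pred and returns the ORIGINAL's first value.
double demo_aliasing() {
    Fields pp{std::make_shared<std::vector<double>>(
        std::vector<double>{0.96673279, 0.01516906, 0.23090960})};
    Fields xpp = lazy_copy(pp);     // shares storage with pp
    Pred ptr(xpp.theta);            // stores the shared handle, no copy
    ptr.setTheta({0.0, 0.0, 0.0});  // in-place write through that handle
    return (*pp.theta)[0];          // the original has changed as well
}
```

In this model nothing ever notices that the storage has two owners, so the write made on behalf of `xpp` also shows up in `pp`, which mirrors the `fm1@pp$theta` symptom reported above.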
Simon Urbanek
2014-Mar-03 01:05 UTC
[Rd] reference classes, LAZY_DUPLICATE_OK, and external pointers
Ben,

On Mar 2, 2014, at 7:38 PM, Ben Bolker <bbolker at gmail.com> wrote:

> We (the lme4 authors) are having a problem with doing a proper deep
> copy of a reference class object in recent versions of R-devel with
> the LAZY_DUPLICATE_OK flag in src/main/bind.c enabled.
>
> Apologies in advance for any improper terminology.
>
> TL;DR Is there an elegant way to force non-lazy/deep copying in our
> case? Is anyone else using reference classes with a field that is an
> external pointer?
>
> This is how copying of reference classes works in a normal
> situation:
>
> library(data.table) ## for address() function
> setRefClass("defaultRC",fields="theta")
> d1 <- new("defaultRC")
> d1$theta <- 1
> address(d1$theta) ## "0xbbbbb70"
> d2 <- d1$copy()
> address(d2$theta) ## same as above
> d2$theta <- 2
> address(d2$theta) ## now modified, by magic
> d1$theta ## unmodified
>
> The extra complication in our case is that many of the objects within
> our reference class are actually accessed via an external pointer,
> which is initialized when necessary -- details are copied below for
> those who want them, or you can see the code at
> https://github.com/lme4/lme4
>
> The problem is that this sneaky way of copying the object's contents
> doesn't trigger R's (new) rules for recognizing that a non-lazy copy
> should be made.
>

This is not R's decision - AFAICS your code is incorrectly assuming that there is no other reference where there is no such guarantee. Your code that assigns into the external pointer has to make that decision - it's not R's to make, since you are taking full responsibility for external pointers by circumventing R's handling. External pointers have always had reference semantics. Note that this is not new - you had to inspect the NAMED bits and call duplicate() yourself to guarantee a copy even in previous R versions.
It just so happened that bugs from not doing so were often masked by R being more conservative, such that in some circumstances there were enough references to function arguments that R would defensively create a new copy. So, the same applies as it did before - if you store something that you want to be mutable in C/C++, you have to check the references and call duplicate() if you don't own the only reference.

Cheers,
Simon
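The rule of checking the references and duplicating when you are not the sole owner can be sketched in plain C++ as below. This is a toy model under stated assumptions, not R's API and not lme4 code: `std::shared_ptr::use_count()` stands in for inspecting the NAMED bits, a deep copy of the vector stands in for `duplicate()`, and `own_or_duplicate`, `Fields`, `Pred`, `demo_fixed` and `sole_owner_is_not_copied` are hypothetical names made up for the illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-in for shared vector storage (hypothetical, not an R type).
using Vec = std::shared_ptr<std::vector<double>>;

// Hypothetical helper: return a handle that is safe to mutate in place.
// If any other handle refers to the same storage, make a private deep
// copy first; otherwise hand back the storage unchanged.
Vec own_or_duplicate(Vec v) {
    // use_count() == 1 means this by-value parameter is the only handle.
    if (v.use_count() > 1) return std::make_shared<std::vector<double>>(*v);
    return v;
}

struct Fields {
    Vec theta;
};

// Toy object behind the external pointer: it takes ownership (copying
// if needed) at construction, so later in-place writes are private.
class Pred {
public:
    explicit Pred(Vec theta) : theta_(own_or_duplicate(std::move(theta))) {}

    void setTheta(const std::vector<double>& v) {
        assert(v.size() == theta_->size());
        std::copy(v.begin(), v.end(), theta_->begin());
    }

private:
    Vec theta_;
};

// Mutates through the copy's Pred and returns the ORIGINAL's first value.
double demo_fixed() {
    Fields pp{std::make_shared<std::vector<double>>(
        std::vector<double>{0.96673279, 0.01516906, 0.23090960})};
    Fields xpp{pp.theta};           // lazy copy: shared storage
    Pred ptr(xpp.theta);            // storage is shared, so Pred duplicates
    ptr.setTheta({0.0, 0.0, 0.0});  // writes only to Pred's private copy
    return (*pp.theta)[0];
}

// A sole owner that hands over its handle is not copied.
bool sole_owner_is_not_copied() {
    auto v = std::make_shared<std::vector<double>>(3, 1.0);
    const std::vector<double>* raw = v.get();
    Vec owned = own_or_duplicate(std::move(v));
    return owned.get() == raw;
}
```

The point of the design is that the decision is made at the moment the data is stored on the C/C++ side, rather than relying on the caller to have passed an unshared object; the copy is paid for only when another reference actually exists.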