Does R have a function like the S/S++ unset() function?
unset(name) would remove 'name' from the current evaluation
frame and return its value. It allowed you to safely avoid
some memory copying when calling .C or .Call.

E.g., suppose you had C code like
  #include <R.h>
  #include <Rinternals.h>
  SEXP add1(SEXP pX)
  {
      int nProtected = 0;
      int n = Rf_length(pX);
      int i;
      double* x;
      Rprintf("NAMED(pX)=%d: ", NAMED(pX));
      if (NAMED(pX)) {
          /* pX may be bound to a variable: work on a private copy */
          Rprintf("Copying pX before adding 1\n");
          PROTECT(pX = duplicate(pX)); nProtected++;
      } else {
          /* no binding known: safe to overwrite the input */
          Rprintf("Changing pX in place\n");
      }
      x = REAL(pX);
      for(i=0 ; i<n ; i++) {
          x[i] = x[i] + 1.0;
      }
      UNPROTECT(nProtected);
      return pX;
  }

If I call this from an R function
  add1 <- function(x) {
      stopifnot(inherits(x, "numeric"))
      .Call("add1", x)
  }
it will always copy 'x', even though not copying would
be safe (since add1 doesn't use 'x' after calling .Call()).
  > add1(c(1.2, 3.4))
  NAMED(pX)=2: Copying pX before adding 1
  [1] 2.2 4.4
If I make the .Call directly, without a nice R function around it,
then I can avoid the copy
  > .Call("add1", c(1.2, 3.4))
  NAMED(pX)=0: Changing pX in place
  [1] 2.2 4.4

If something like S's unset() were available I could avoid the copy,
when safe to do so, by making the .Call in add1
  .Call("add1", unset(x))

If you called this new add1 with a named variable from another
function the copying would be done, since NAMED(x) would be
2 even after the local binding was removed. It actually requires some
care to eliminate the copying, as all the functions in the call
chain would have to use unset() when possible.

I ask this because I ran across a function in the 'bit' package that
does not have its C code call duplicate but instead assumes that
x[1] <- x[1] will force x to be copied:
  "!.bit" <- function(x){
    if (length(x)){
      ret <- x
      ret[1] <- ret[1]  # force duplication
      .Call("R_bit_not", ret, PACKAGE="bit")
    }else{
      x
    }
  }
If you optimize things so that 'ret[1] <- ret[1]' does not copy 'ret',
then this function alters its input. If a function like unset()
were there then the .Call could be
  .Call("R_bit_not", unset(x))

I suppose the compiler could analyze the code and see that
x was not used after the .Call and thus feel free to avoid the
copy.

In any case bit's maintainer should add something like
  if (NAMED(x)) {
      PROTECT(x = duplicate(x));
      nProtect++;
  }
  ...
  UNPROTECT(nProtect);
in the C code, but unset() would help avoid unneeded duplications.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

[[alternative HTML version deleted]]
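The copy-or-modify decision in add1() can be tried outside R with a small standalone model. This is only a sketch under stated assumptions: Vec, vec_new, vec_duplicate and vec_add1 are hypothetical names invented for illustration, the 'named' field merely stands in for R's NAMED flag, and none of this is R's actual implementation.

```c
/* Standalone model of the check in add1(); NOT R's internal code.
   Vec, vec_new, vec_duplicate and vec_add1 are hypothetical names. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int named;     /* 0 = no binding known, nonzero = may be bound to a name */
    size_t n;      /* number of elements */
    double *data;  /* element storage owned by this Vec */
} Vec;

/* Allocate a vector with named = 0, like a fresh temporary value. */
Vec *vec_new(const double *src, size_t n)
{
    Vec *v = malloc(sizeof *v);
    assert(v != NULL);
    v->named = 0;
    v->n = n;
    /* Request at least one element so a zero-length vector still gets storage. */
    v->data = malloc((n ? n : 1) * sizeof *v->data);
    assert(v->data != NULL);
    if (n > 0)
        memcpy(v->data, src, n * sizeof *v->data);
    return v;
}

/* Deep copy; the copy starts out with named = 0. */
Vec *vec_duplicate(const Vec *v)
{
    return vec_new(v->data, v->n);
}

/* Add 1 to every element, copying first only if the input may be bound. */
Vec *vec_add1(Vec *v)
{
    Vec *out = v->named ? vec_duplicate(v) : v;
    for (size_t i = 0; i < out->n; i++)
        out->data[i] += 1.0;
    return out;
}
```

Calling vec_add1() on a freshly created Vec returns the same pointer with its data changed in place. Setting named to 2 beforehand makes it return a new Vec and leaves the original data untouched, which mirrors the two Rprintf branches in the message.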
> Does R have a function like the S/S++ unset() function?

That should be 'S+' or 'S-Plus', not the typo 'S++'.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, Aug 21, 2015 at 1:43 PM, William Dunlap <wdunlap at tibco.com> wrote:
> Does R have a function like the S/S++ unset() function?
> unset(name) would remove 'name' from the current evaluation
> frame and return its value. It allowed you to safely avoid
> some memory copying when calling .C or .Call.
> [...]
This wouldn't actually work at present, as evaluating a promise always
sets NAMED to 2. With reference counting it would work, so it might be
worth considering when we switch.

Going forward it would be best to use MAYBE_REFERENCED to test whether
a duplicate is needed -- this macro is defined appropriately whether R
is compiled to use NAMED or reference counting.

Best,

luke

On Fri, 21 Aug 2015, William Dunlap wrote:
> Does R have a function like the S/S++ unset() function?
> unset(name) would remove 'name' from the current evaluation
> frame and return its value. It allowed you to safely avoid
> some memory copying when calling .C or .Call.
> [...]
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney at uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
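The difference between a sticky NAMED flag and a reference count can be seen in a small standalone model. This is a sketch under stated assumptions, not R's implementation: Obj, obj_bind_arg, obj_unset and the two needs_copy_* predicates are hypothetical names, and setting the flag to 2 on binding is a simplification of "evaluating a promise always sets NAMED to 2". In real C code against R's API, the test suggested above is MAYBE_REFERENCED(x) rather than NAMED(x).

```c
/* Standalone model contrasting a sticky NAMED flag with a reference count.
   NOT R's implementation; all names below are hypothetical. */
#include <assert.h>

typedef struct {
    int named;   /* sticky flag: raised on binding, never lowered */
    int refcnt;  /* number of live bindings to this value */
} Obj;

/* Model of passing a value as a function argument: the flag is set to 2
   (as evaluating a promise is said to do) and the count goes up by one. */
void obj_bind_arg(Obj *o)
{
    o->named = 2;
    o->refcnt += 1;
}

/* Model of unset(): drop the binding and hand back the value.
   Only the count can record that the binding is gone. */
Obj *obj_unset(Obj *o)
{
    o->refcnt -= 1;
    /* o->named is left alone: the flag records history, not current state */
    return o;
}

/* "Is a duplicate needed?" under each bookkeeping scheme. */
int needs_copy_named(const Obj *o)  { return o->named != 0; }
int needs_copy_refcnt(const Obj *o) { return o->refcnt != 0; }
```

After obj_bind_arg() followed by obj_unset(), needs_copy_named() still reports that a copy is needed while needs_copy_refcnt() does not. That is the sense in which an unset()-style function could only pay off once reference counting is in use.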
Hi,

I was playing around with this idea earlier this year. This would
allow you to remove a variable with NAMED<2 while still passing its
value, e.g.

  x1 <- log(r(x1))

where the returned value/variable has NAMED<=1. At first I was quite
excited about the results, but it turned out that it only worked for a
few functions. If you want to play around with it, I've created the
'recycle' package:

  https://github.com/HenrikBengtsson/recycle

Have a look at the package tests for examples and what works and what
doesn't work:

  https://github.com/HenrikBengtsson/recycle/tree/master/tests

However, basically due to what Luke says, I've decided not to pursue
this any further for now. But I certainly agree that if the internals
of R could be made less conservative (not force NAMED=2), this idea
would be worth pursuing and could save quite a bit of memory. The
downside would be that code would be cluttered up with lots of
explicit r() statements. On the other hand, maybe those could be
added automatically by code compilers, e.g.

  x1 <- log(x1)

would become

  x1 <- log(r(x1))

/Henrik

On Sat, Aug 22, 2015 at 4:50 PM, <luke-tierney at uiowa.edu> wrote:
> This wouldn't actually work at present as evaluating a promise always
> sets NAMED to 2. With reference counting it would work so might be
> worth considering when we switch.
> [...]