thr3ads.net - R devel - [Rd] unset() function? [Aug 2015]

William Dunlap

2015-Aug-21 20:43 UTC

[Rd] unset() function?

Does R have a function like the S/S++ unset() function?
unset(name) would remove 'name' from the current evaluation
frame and return its value.  It allowed you to safely avoid
some memory copying when calling .C or .Call.

E.g., suppose you had C code like
  #include <R.h>
  #include <Rinternals.h>
  SEXP add1(SEXP pX)
  {
      int nProtected = 0;
      int n = Rf_length(pX);
      int i;
      double* x;
      Rprintf("NAMED(pX)=%d: ", NAMED(pX));
      if (NAMED(pX)) {
          Rprintf("Copying pX before adding 1\n");
          PROTECT(pX = duplicate(pX)); nProtected++;
      } else {
          Rprintf("Changing pX in place\n");
      }
      x = REAL(pX);
      for(i=0 ; i<n ; i++) {
        x[i] = x[i] + 1.0;
      }
      UNPROTECT(nProtected);
      return pX;
  }

If I call this from an R function
  add1 <- function(x) {
      stopifnot(inherits(x, "numeric"))
     .Call("add1", x)
  }
it will always copy 'x', even though not copying would
be safe (since add1 doesn't use 'x' after calling .Call()).
  > add1(c(1.2, 3.4))
  NAMED(pX)=2: Copying pX before adding 1
  [1] 2.2 4.4
If I make the .Call directly, without a nice R function around it
then I can avoid the copy
  > .Call("add1", c(1.2, 3.4))
  NAMED(pX)=0: Changing pX in place
  [1] 2.2 4.4

If something like S's unset() were available I could avoid the copy,
when safe to do so, by making the .Call in add1
   .Call("add1", unset(x))

If you called this new add1 with a named variable from another
function the copying would be done, since NAMED(x) would be
2 even after the local binding was removed.  It actually requires some
care to eliminate the copying, as all the functions in the call
chain would have to use unset() when possible.
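
[Editor's sketch] The behaviour described above can be modelled in a few lines of standalone C. This is a toy, not R's or S's implementation: the Vec and Binding types, the refs counter, and the helper names are all hypothetical, and it assumes an exact count of live bindings is available. It shows that unset() on the only binding lets add1 work in place, while a second binding held by a caller still forces the copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an R vector (NOT R's SEXP): a payload plus a
   hypothetical count of the variable bindings that point at it. */
typedef struct {
    int refs;
    size_t n;
    double *data;
} Vec;

/* Toy variable slot in an evaluation frame; NULL means "unbound". */
typedef struct {
    Vec *value;
} Binding;

static void *xmalloc(size_t size) {
    void *p = malloc(size ? size : 1);
    if (p == NULL) abort();
    return p;
}

Vec *vec_new(size_t n, const double *src) {
    Vec *v = xmalloc(sizeof *v);
    v->refs = 0;
    v->n = n;
    v->data = xmalloc(n * sizeof *v->data);
    memcpy(v->data, src, n * sizeof *v->data);
    return v;
}

/* Assign a value to a variable: one more binding sees it. */
void bind_value(Binding *b, Vec *v) {
    b->value = v;
    v->refs++;
}

/* Model of unset(name): remove the binding and return its value. */
Vec *unset(Binding *b) {
    assert(b->value != NULL);
    Vec *v = b->value;
    b->value = NULL;
    v->refs--;
    return v;
}

/* Model of the C routine: duplicate only if a binding still sees v.
   *copied reports which path was taken. */
Vec *add1(Vec *v, int *copied) {
    *copied = v->refs > 0;
    if (*copied) v = vec_new(v->n, v->data);
    for (size_t i = 0; i < v->n; i++) v->data[i] += 1.0;
    return v;
}
```

In this model, add1(unset(&local), &copied) reuses the buffer when local held the only binding. If a caller's variable is bound to the same vector, refs is still 1 after the callee's unset(), so the copy is made, which is why every function in the call chain would need to cooperate.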

I ask this because I ran across a function in the 'bit' package that
does not have its C code call duplicate() but instead assumes that
x[1] <- x[1] will force x to be copied:
  "!.bit" <- function(x){
    if (length(x)){
      ret <- x
      ret[1] <- ret[1]  # force duplication
      .Call("R_bit_not", ret, PACKAGE="bit")
    }else{
      x
    }
  }
If you optimize things so that 'ret[1] <- ret[1]' does not copy 'ret',
then this function alters its input.  If a function like unset()
were available, the .Call could be
     .Call("R_bit_not", unset(x))

I suppose the compiler could analyze the code and see that
x was not used after the .Call and thus feel free to avoid the
copy.

In any case bit's maintainer should add something like
    if (NAMED(x)) {
        PROTECT(x=duplicate(x));
        nProtect++;
    }
    ...
    UNPROTECT(nProtect);
in the C code, but unset() would help avoid unneeded duplications.


Bill Dunlap
TIBCO Software
wdunlap tibco.com

William Dunlap

2015-Aug-21 21:12 UTC

> Does R have a function like the S/S++ unset() function?
That should be 'S+' or 'S-Plus', not the typo 'S++'.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

luke-tierney at uiowa.edu

2015-Aug-22 14:50 UTC

This wouldn't actually work at present as evaluating a promise always
sets NAMED to 2. With reference counting it would work so might be
worth considering when we switch.

Going forward it would be best to use MAYBE_REFERENCED to test whether
a duplicate is needed -- this macro is defined appropriately whether R
is compiled to use NAMED or reference counting.
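
[Editor's sketch] The difference between the two schemes can be illustrated with a toy model in standalone C. It is not R's code: the Header type and the on_* event functions are hypothetical. It assumes, as a simplification, that the NAMED-style field is only ever raised (with promise evaluation forcing it to 2, as stated above), while a reference count is lowered again when a binding is removed and is not affected by promise evaluation.

```c
#include <stdbool.h>

/* Simplified per-object bookkeeping; both schemes tracked side by side. */
typedef struct {
    int named;   /* sticky sharing hint: raised, never lowered, max 2 */
    int refcnt;  /* count of live bindings: raised and lowered */
} Header;

/* A variable is bound to the object. */
void on_bind(Header *h) {
    if (h->named < 2) h->named++;
    h->refcnt++;
}

/* The object is produced by evaluating a promise (a function argument). */
void on_promise_eval(Header *h) {
    h->named = 2;            /* assumption taken from the message above */
}

/* A binding is removed, e.g. by a hypothetical unset(). */
void on_unbind(Header *h) {
    h->refcnt--;             /* the named field has no way back down */
}

/* "Is a duplicate needed before modifying?" under each scheme. */
bool must_copy_named(const Header *h)  { return h->named > 0; }
bool must_copy_refcnt(const Header *h) { return h->refcnt > 0; }
```

After on_bind, on_promise_eval, and on_unbind in sequence, the NAMED-style test still demands a copy while the count-based test does not, which is the sense in which unset() could only pay off once reference counting is in use.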

Best,

luke

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney at uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu

Henrik Bengtsson

2015-Aug-22 17:15 UTC

Hi,

I was playing around with this idea earlier this year. This would
allow you to remove a variable with NAMED<2 while still passing its
value, e.g.

    x1 <- log(r(x1))

where the returned value/variable has NAMED<=1.  At first I was quite
excited about the results, but it turned out that it only worked for a
few functions.  If you want to play around with it, I've created the
'recycle' package:

    https://github.com/HenrikBengtsson/recycle

Have a look at the package tests for examples and what works and what
doesn't work:

    https://github.com/HenrikBengtsson/recycle/tree/master/tests

However, basically due to what Luke says, I've decided not to pursue
this any further for now.

But, I certainly agree that if the internals of R could be made less
conservative (not force NAMED=2), this idea would certainly be worth
pursuing and could save quite a bit of memory.  The downside would be
that code would be cluttered up with lots of explicit r() statements.
On the other hand, maybe those could be added automatically by code
compilers, e.g.

    x1 <- log(x1)

would become

    x1 <- log(r(x1))

/Henrik
