thr3ads.net - R devel - [Rd] iterated lapply [Feb 2015]

If this information is useful, please help other people find it:
Share via:

Radford Neal

2015-Feb-24 11:39 UTC

[Rd] iterated lapply

From: Daniel Kaschek <daniel.kaschek at
physik.uni-freiburg.de>> ... When I evaluate this list of functions by
> another lapply/sapply, I get an unexpected result: all values coincide.
> However, when I uncomment the print(), it works as expected. Is this a
> bug or a feature?
> 
> conditions <- 1:4
> test <- lapply(conditions, function(mycondition){
>   #print(mycondition)
>   myfn <- function(i) mycondition*i
>   return(myfn)
> })
> 
> sapply(test, function(myfn) myfn(2))
From: Jeroen Ooms <jeroenooms at gmail.com>> I think it is a bug. If we use substitute to inspect the promise, it
> appears the index number is always equal to its last value:
From: Duncan Temple Lang <dtemplelang at
ucdavis.edu>> Not a bug, but does surprise people. It is lazy evaluation.

I think it is indeed a bug.  The lapply code saves a bit of time by
reusing the same storage for the scalar index number every iteration.
This amounts to modifying the R code that was used for the previous
function call.  There's no justification for doing this in the
documentation for lapply.  It is certainly not desired behaviour,
except in so far as it allows a slight savings in time (which is
minor, given the time that the function call itself will take).

   Radford Neal

luke-tierney at uiowa.edu

2015-Feb-24 15:50 UTC

head link

[Rd] iterated lapply

The documentation is not specific enough on the indented semantics in
this situation to consider this a bug. The original R-level
implementation of lapply was

     lapply <- function(X, FUN, ...) {
         FUN <- match.fun(FUN)
         if (!is.list(X))
         X <- as.list(X)
         rval <- vector("list", length(X))
         for(i in seq(along = X))
         rval[i] <- list(FUN(X[[i]], ...))
         names(rval) <- names(X)           # keep `names' !
         return(rval)
     }

and the current internal implementation is consistent with this. With
a loop like this lazy evaluation and binding assignment interact in
this way; the force() function was introduced to help with this.

That said, the expression FUN(X[[i]], ...) could be replaced by

     local({
         i <- i
         list(FUN(X[[i]], ...)
     })

which would produce the more desirable result

     > sapply(test, function(myfn) myfn(2))
     [1] 2 4 6 8

The C implementation could use this approach, or could rebuild the
expression being evaluated at each call to get almost the same semantics.
Both would add a little overhead. Some code optimization might reduce
the overhead in some instances (e.g. if FUN is a BUILTIN), but it's
not clear that would be worth while.

Variants of this issue arise in a couple of places so it may be worth
looking into.

Best,

luke

On Tue, 24 Feb 2015, Radford Neal wrote:
> From: Daniel Kaschek <daniel.kaschek at physik.uni-freiburg.de>
>> ... When I evaluate this list of functions by
>> another lapply/sapply, I get an unexpected result: all values coincide.
>> However, when I uncomment the print(), it works as expected. Is this a
>> bug or a feature?
>>
>> conditions <- 1:4
>> test <- lapply(conditions, function(mycondition){
>>   #print(mycondition)
>>   myfn <- function(i) mycondition*i
>>   return(myfn)
>> })
>>
>> sapply(test, function(myfn) myfn(2))
>
> From: Jeroen Ooms <jeroenooms at gmail.com>
>> I think it is a bug. If we use substitute to inspect the promise, it
>> appears the index number is always equal to its last value:
>
> From: Duncan Temple Lang <dtemplelang at ucdavis.edu>
>> Not a bug, but does surprise people. It is lazy evaluation.
>
>
> I think it is indeed a bug.  The lapply code saves a bit of time by
> reusing the same storage for the scalar index number every iteration.
> This amounts to modifying the R code that was used for the previous
> function call.  There's no justification for doing this in the
> documentation for lapply.  It is certainly not desired behaviour,
> except in so far as it allows a slight savings in time (which is
> minor, given the time that the function call itself will take).
>
>   Radford Neal
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney at uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu

Michael Weylandt

2015-Feb-24 18:26 UTC

head link

[Rd] iterated lapply

> On Feb 24, 2015, at 10:50 AM, <luke-tierney at uiowa.edu> wrote:
> 
> The documentation is not specific enough on the indented semantics in
> this situation to consider this a bug. The original R-level
> implementation of lapply was
> 
>    lapply <- function(X, FUN, ...) {
>        FUN <- match.fun(FUN)
>        if (!is.list(X))
>        X <- as.list(X)
>        rval <- vector("list", length(X))
>        for(i in seq(along = X))
>        rval[i] <- list(FUN(X[[i]], ...))
>        names(rval) <- names(X)           # keep `names' !
>        return(rval)
>    }
> 
> and the current internal implementation is consistent with this. With
> a loop like this lazy evaluation and binding assignment interact in
> this way; the force() function was introduced to help with this.
> 
> That said, the expression FUN(X[[i]], ...) could be replaced by
> 
>    local({
>        i <- i
>        list(FUN(X[[i]], ...)
>    })
> 
> which would produce the more desirable result
> 
>    > sapply(test, function(myfn) myfn(2))
>    [1] 2 4 6 8
> 
Would the same semantics be applied to parallel::mclapply and friends?

sapply(lapply(1:4, function(c){function(i){c*i}}), function(f) f(2))

# [1] 8 8 8 8

sapply(mclapply(1:4, function(c){function(i){c*i}}), function(f) f(2))

# [1] 6 8 6 8

I understand why they differ, but making mclapply easier for 'drop-in'
parallelism might be a good thing.

Michael

William Dunlap

2015-Feb-26 17:56 UTC

head link

[Rd] iterated lapply

Would introducing the new frame, with the call to local(), cause problems
when you use frame counting instead of <<- to modify variables outside the
scope of lapply's FUN, I think the frame counts may have to change.  E.g.,
here is code from actuar::simul() that might be affected:

        x <- unlist(lapply(nodes[[i]], seq))
        lapply(nodes[(i + 1):(nlevels - 1)],
               function(v) assign("x", rep.int(x, v), envir
parent.frame(2)))
        m[, i] <- x

(I think the parent.frame(2) might have to be changed to parent.frame(8)
for that to work.  Such code looks pretty ugly to me but seems to be rare.)

It also seems to cause problems with some built-in functions:
newlapply <- function (X, FUN, ...)
{
    FUN <- match.fun(FUN)
    if (!is.list(X))
        X <- as.list(X)
    rval <- vector("list", length(X))
    for (i in seq(along = X)) {
        rval[i] <- list(local({
            i <- i
            FUN(X[[i]], ...)
        }))
    }
    names(rval) <- names(X)
    return(rval)
}
newlapply(1:2,log)
#Error in FUN(X[[i]], ...) : non-numeric argument to mathematical function
newlapply(1:2,function(x)log(x))
#[[1]]
#[1] 0
#
#[[2]]
#[1] 0.6931472



Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Tue, Feb 24, 2015 at 7:50 AM, <luke-tierney at uiowa.edu> wrote:
> The documentation is not specific enough on the indented semantics in
> this situation to consider this a bug. The original R-level
> implementation of lapply was
>
>     lapply <- function(X, FUN, ...) {
>         FUN <- match.fun(FUN)
>         if (!is.list(X))
>         X <- as.list(X)
>         rval <- vector("list", length(X))
>         for(i in seq(along = X))
>         rval[i] <- list(FUN(X[[i]], ...))
>         names(rval) <- names(X)           # keep `names' !
>         return(rval)
>     }
>
> and the current internal implementation is consistent with this. With
> a loop like this lazy evaluation and binding assignment interact in
> this way; the force() function was introduced to help with this.
>
> That said, the expression FUN(X[[i]], ...) could be replaced by
>
>     local({
>         i <- i
>         list(FUN(X[[i]], ...)
>     })
>
> which would produce the more desirable result
>
>     > sapply(test, function(myfn) myfn(2))
>     [1] 2 4 6 8
>
> The C implementation could use this approach, or could rebuild the
> expression being evaluated at each call to get almost the same semantics.
> Both would add a little overhead. Some code optimization might reduce
> the overhead in some instances (e.g. if FUN is a BUILTIN), but it's
> not clear that would be worth while.
>
> Variants of this issue arise in a couple of places so it may be worth
> looking into.
>
> Best,
>
> luke
>
>
> On Tue, 24 Feb 2015, Radford Neal wrote:
>
>  From: Daniel Kaschek <daniel.kaschek at physik.uni-freiburg.de>
>>
>>> ... When I evaluate this list of functions by
>>> another lapply/sapply, I get an unexpected result: all values
coincide.
>>> However, when I uncomment the print(), it works as expected. Is
this a
>>> bug or a feature?
>>>
>>> conditions <- 1:4
>>> test <- lapply(conditions, function(mycondition){
>>>   #print(mycondition)
>>>   myfn <- function(i) mycondition*i
>>>   return(myfn)
>>> })
>>>
>>> sapply(test, function(myfn) myfn(2))
>>>
>>
>> From: Jeroen Ooms <jeroenooms at gmail.com>
>>
>>> I think it is a bug. If we use substitute to inspect the promise,
it
>>> appears the index number is always equal to its last value:
>>>
>>
>> From: Duncan Temple Lang <dtemplelang at ucdavis.edu>
>>
>>> Not a bug, but does surprise people. It is lazy evaluation.
>>>
>>
>>
>> I think it is indeed a bug.  The lapply code saves a bit of time by
>> reusing the same storage for the scalar index number every iteration.
>> This amounts to modifying the R code that was used for the previous
>> function call.  There's no justification for doing this in the
>> documentation for lapply.  It is certainly not desired behaviour,
>> except in so far as it allows a slight savings in time (which is
>> minor, given the time that the function call itself will take).
>>
>>   Radford Neal
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa                  Phone:             319-335-3386
> Department of Statistics and        Fax:               319-335-3017
>    Actuarial Science
> 241 Schaeffer Hall                  email:   luke-tierney at uiowa.edu
> Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
>
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
	[[alternative HTML version deleted]]

Apparently Analagous Threads

Search for more maybe matching threads

R devel - Feb 2015 - iterated lapply

[Rd] iterated lapply

[Rd] iterated lapply

[Rd] iterated lapply

[Rd] iterated lapply

Apparently Analagous Threads