Gregory Werbin
2017-Aug-04 04:32 UTC
[Rd] Why is as.function() slower than eval(call("function"())?
(Apologies if this is better suited for R-help.)

On my system (macOS Sierra, late 2014 MacBook Pro; R 3.4.1, Homebrew build), I found that it is faster to construct a function using eval(call("function", ...)) than using as.function(list(...)). Example:

    make_fn_1 <- function(a, b) eval(call("function", a, b), env = parent.frame())
    make_fn_2 <- function(a, b) as.function(c(a, list(b)), env = parent.frame())

    a <- as.pairlist(alist(x = , y = ))
    b <- quote(x + y)

    library("microbenchmark")
    microbenchmark(make_fn_1(a, b), make_fn_2(a, b))

    # Unit: microseconds
    #             expr   min     lq    mean median     uq    max neval cld
    #  make_fn_1(a, b) 1.671 1.8855 2.13297  2.039 2.1950  9.852   100  a
    #  make_fn_2(a, b) 3.541 3.7230 4.13400  3.906 4.1055 23.153   100   b

At first I thought the gap was due to the overhead of calling c(a, list(b)). But this turns out not to be the case:

    make_fn_weird <- function(a, b) as.function(c(a, b), env = parent.frame())
    b_wrapped <- list(b)

    make_fn_weirder <- function(a_b) as.function(a_b, env = parent.frame())
    a_b <- c(a, b_wrapped)

    microbenchmark(make_fn_1(a, b), make_fn_2(a, b),
                   make_fn_weird(a, b_wrapped), make_fn_weirder(a_b))

    # Unit: microseconds
    #                         expr   min     lq    mean median     uq    max neval cld
    #              make_fn_1(a, b) 1.718 1.8990 2.12119 1.9860 2.1605  8.057   100 a
    #              make_fn_2(a, b) 3.393 3.5865 4.03029 3.6655 3.9615 27.499   100   c
    #  make_fn_weird(a, b_wrapped) 3.354 3.5005 3.77190 3.6405 3.9425  6.839   100   c
    #         make_fn_weirder(a_b) 2.488 2.6290 2.83352 2.7215 2.8800  7.007   100  b

One IRC user pointed out that as.function() takes its own path through the code, namely do_asfunction() (in src/main/coerce.c). What is it about this code path that's 50% slower than whatever happens during eval(call("function", a, b))?

Obviously this is a trivial micro-optimization and it doesn't matter to 99% of users. Mostly asking out of curiosity, but also wondering if there's a more general lesson to be learned here.

Thanks!
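Before comparing timings, it may help to confirm that the two constructors really build the same function. The following is a minimal base-R sketch under that assumption; it reuses make_fn_1, make_fn_2, a and b from the post, but spells the argument out as envir (the post's env relies on partial argument matching). The names f1, f2, same_args, same_body and same_env are introduced here purely for illustration.

```r
a <- as.pairlist(alist(x = , y = ))
b <- quote(x + y)

# Same definitions as in the post, with `envir` written out in full.
make_fn_1 <- function(a, b) eval(call("function", a, b), envir = parent.frame())
make_fn_2 <- function(a, b) as.function(c(a, list(b)), envir = parent.frame())

f1 <- make_fn_1(a, b)
f2 <- make_fn_2(a, b)

# Compare argument names, body, and enclosing environment.
same_args <- identical(names(formals(f1)), names(formals(f2)))
same_body <- identical(body(f1), body(f2))
same_env  <- identical(environment(f1), environment(f2))

f1(1, 2)  # 3
f2(1, 2)  # 3
```

If all three comparisons come out TRUE, any timing gap is down to how the function is constructed, not to what is constructed.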
Joshua Ulrich
2017-Aug-04 11:18 UTC
[Rd] Why is as.function() slower than eval(call("function"())?
On Thu, Aug 3, 2017 at 11:32 PM, Gregory Werbin <outthere at me.gregwerbin.com> wrote:
> [...]
> One IRC user pointed out that as.function() takes its own path through the code, namely do_asfunction() (in src/main/coerce.c). What is it about this code path that's 50% slower than whatever happens during eval(call("function", a, b))?
>
> Obviously this is a trivial micro-optimization and it doesn't matter to 99% of users. Mostly asking out of curiosity, but also wondering if there's a more general lesson to be learned here.

Agreed that this is minor (~2us), but the majority of the difference seems to be from S3 method dispatch. as.function() is generic and has to dispatch to as.function.default(). The times are very similar if you call the method directly.

    R> make_fn_3 <- function(a, b) as.function.default(c(a, list(b)), env = parent.frame())
    R> microbenchmark(make_fn_1(a, b), make_fn_2(a, b), make_fn_3(a, b))
    Unit: microseconds
                expr   min     lq     mean median    uq      max neval
     make_fn_1(a, b) 1.615 1.7595 12.78339 1.9115 2.145 1077.657   100
     make_fn_2(a, b) 3.077 3.3390 19.89423 3.5215 3.862 1589.505   100
     make_fn_3(a, b) 1.629 1.7975 15.40389 1.9505 2.227 1335.306   100

Now the difference is <100ns, which is much harder to investigate.

-- 
Joshua Ulrich
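The dispatch explanation above can be checked without extra packages. This is a rough base-R sketch, not a substitute for microbenchmark: system.time() is far too coarse for a single call, so the hypothetical helper time_calls (a name introduced here) repeats each constructor many times. The absolute numbers will vary by machine and R version, so no expected timings are shown.

```r
a <- as.pairlist(alist(x = , y = ))
b <- quote(x + y)

# as.function() is an S3 generic: its body is a single UseMethod() call.
print(body(as.function))
is_generic <- "UseMethod" %in% all.names(body(as.function))

# Through the generic (dispatch) versus calling the default method directly.
make_fn_2 <- function(a, b) as.function(c(a, list(b)), envir = parent.frame())
make_fn_3 <- function(a, b) as.function.default(c(a, list(b)), envir = parent.frame())

# Hypothetical helper: elapsed seconds for n repeated calls of f(a, b).
time_calls <- function(f, n = 1e4) {
  system.time(for (i in seq_len(n)) f(a, b))[["elapsed"]]
}

t_generic <- time_calls(make_fn_2)
t_direct  <- time_calls(make_fn_3)
```

Comparing t_generic with t_direct over a large n should give a coarse view of the dispatch cost, but for sub-microsecond differences the microbenchmark output quoted above is the more reliable measurement.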
Duncan Murdoch
2017-Aug-04 11:21 UTC
[Rd] Why is as.function() slower than eval(call("function"())?
On 04/08/2017 12:32 AM, Gregory Werbin wrote:
> [...]
> One IRC user pointed out that as.function() takes its own path through the code, namely do_asfunction() (in src/main/coerce.c). What is it about this code path that's 50% slower than whatever happens during eval(call("function", a, b))?
>
> Obviously this is a trivial micro-optimization and it doesn't matter to 99% of users. Mostly asking out of curiosity, but also wondering if there's a more general lesson to be learned here.

The main difference is that `function` is a primitive, while as.function() is a generic. You will get much closer timing if you skip the method dispatch by calling as.function.default() directly.

The next part of the difference is that as.function.default is a regular R closure:

    as.function.default <- function (x, envir = parent.frame(), ...)
        if (is.function(x)) x else .Internal(as.function.default(x, envir))

If I skip the is.function(x) test and call .Internal directly, I find it is about 10% faster than `function`. But that is an extremely risky optimization; it wouldn't be accepted in a CRAN package.

Duncan Murdoch
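The distinction drawn above between a primitive and a closure can be inspected directly. The sketch below is base R only and deliberately does not call .Internal(); the variable names (prim_function, kind_generic, kind_default, g, unchanged) are illustrative and not part of the thread.

```r
# `function` is a primitive, so calling it involves no R-level closure.
prim_function <- is.primitive(`function`)

# The generic and its default method are both ordinary R closures.
kind_generic <- typeof(as.function)
kind_default <- typeof(as.function.default)

# Show the is.function() test that runs before .Internal() is reached.
print(body(as.function.default))

# Because of that test, an existing function is returned unchanged.
g <- function(x) x + 1
unchanged <- identical(as.function(g), g)
```

Each closure call adds argument matching and a new evaluation frame, which is consistent with the layered overhead described in the reply.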