Xiao He
2013-Aug-17 23:05 UTC
[R] A function taken out of its original environment performs more slowly.
Hi dear R-users, I encountered an interesting pattern. Take for example the function combn(), I copied and pasted the function definition and saved it as a new function named combn2() (see the end of this email). As it turned out, combn2() seems to be substantially slower than the original function combn() (see benchmark below),> system.time(combn(30, 5)); system.time(combn2(30, 5))user system elapsed 0.304 0.003 0.308 user system elapsed 1.591 0.007 1.602 I wonder if there is any reason for this difference and if there is any way to reduce the performance difference. Thanks! combn2 <- function (x, m, FUN = NULL, simplify = TRUE, ...) { stopifnot(length(m) == 1L) if (m < 0) stop("m < 0", domain = NA) if (is.numeric(x) && length(x) == 1L && x > 0 && trunc(x) = x) x <- seq_len(x) n <- length(x) if (n < m) stop("n < m", domain = NA) m <- as.integer(m) e <- 0 h <- m a <- seq_len(m) nofun <- is.null(FUN) if (!nofun && !is.function(FUN)) stop("'FUN' must be a function or NULL") len.r <- length(r <- if (nofun) x[a] else FUN(x[a], ...)) count <- as.integer(round(choose(n, m))) if (simplify) { dim.use <- if (nofun) c(m, count) else { d <- dim(r) if (length(d) > 1L) c(d, count) else if (len.r > 1L) c(len.r, count) else c(d, count) } } if (simplify) { out <- matrix(r, nrow = len.r, ncol = count) } else { out <- vector("list", count) out[[1L]] <- r } if (m > 0) { i <- 2L nmmp1 <- n - m + 1L while (a[1L] != nmmp1) { if (e < n - h) { h <- 1L e <- a[m] j <- 1L } else { e <- a[m - h] h <- h + 1L j <- 1L:h } a[m - h + j] <- e + j r <- if (nofun) x[a] else FUN(x[a], ...) if (simplify) out[, i] <- r else out[[i]] <- r i <- i + 1L } } if (simplify) array(out, dim.use) else out } [[alternative HTML version deleted]]
Uwe Ligges
2013-Aug-17 23:13 UTC
[R] A function taken out of its original environment performs more slowly.
On 18.08.2013 01:05, Xiao He wrote:> Hi dear R-users, > > I encountered an interesting pattern. Take for example the function > combn(), I copied and pasted the function definition and saved it as a new > function named combn2() (see the end of this email). As it turned out, > combn2() seems to be substantially slower than the original function > combn() (see benchmark below), > >> system.time(combn(30, 5)); system.time(combn2(30, 5))combn is bytecode compiled, combn2 not: try library("compiler") combn2 <- cmpfun(combn2) and you won't see such a difference anymore. Best, Uwe Ligges> user system elapsed > 0.304 0.003 0.308 > user system elapsed > 1.591 0.007 1.602 > > > I wonder if there is any reason for this difference and if there is any way > to reduce the performance difference. Thanks! > > combn2 <- function (x, m, FUN = NULL, simplify = TRUE, ...) > { > stopifnot(length(m) == 1L) > if (m < 0) > stop("m < 0", domain = NA) > if (is.numeric(x) && length(x) == 1L && x > 0 && trunc(x) => x) > x <- seq_len(x) > n <- length(x) > if (n < m) > stop("n < m", domain = NA) > m <- as.integer(m) > e <- 0 > h <- m > a <- seq_len(m) > nofun <- is.null(FUN) > if (!nofun && !is.function(FUN)) > stop("'FUN' must be a function or NULL") > len.r <- length(r <- if (nofun) x[a] else FUN(x[a], ...)) > count <- as.integer(round(choose(n, m))) > if (simplify) { > dim.use <- if (nofun) > c(m, count) > else { > d <- dim(r) > if (length(d) > 1L) > c(d, count) > else if (len.r > 1L) > c(len.r, count) > else c(d, count) > } > } > if (simplify) { > out <- matrix(r, nrow = len.r, ncol = count) > } > else { > out <- vector("list", count) > out[[1L]] <- r > } > if (m > 0) { > i <- 2L > nmmp1 <- n - m + 1L > while (a[1L] != nmmp1) { > if (e < n - h) { > h <- 1L > e <- a[m] > j <- 1L > } > else { > e <- a[m - h] > h <- h + 1L > j <- 1L:h > } > a[m - h + j] <- e + j > r <- if (nofun) > x[a] > else FUN(x[a], ...) > if (simplify) > out[, i] <- r > else out[[i]] <- r > i <- i + 1L > } > } > if (simplify) > array(out, dim.use) > else out > } > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >
Duncan Murdoch
2013-Aug-19 15:30 UTC
[R] A function taken out of its original environment performs more slowly.
On 13-08-17 7:05 PM, Xiao He wrote:> Hi dear R-users, > > I encountered an interesting pattern. Take for example the function > combn(), I copied and pasted the function definition and saved it as a new > function named combn2() (see the end of this email). As it turned out, > combn2() seems to be substantially slower than the original function > combn() (see benchmark below),Besides the difference Uwe pointed out, those functions likely have different environments, so searching for symbols will take a different amount of time. Usually this will be longer from globalenv() than from the namespace of the package, but sometimes the reverse could be true. Duncan Murdoch> >> system.time(combn(30, 5)); system.time(combn2(30, 5)) > user system elapsed > 0.304 0.003 0.308 > user system elapsed > 1.591 0.007 1.602 > > > I wonder if there is any reason for this difference and if there is any way > to reduce the performance difference. Thanks! > > combn2 <- function (x, m, FUN = NULL, simplify = TRUE, ...) > { > stopifnot(length(m) == 1L) > if (m < 0) > stop("m < 0", domain = NA) > if (is.numeric(x) && length(x) == 1L && x > 0 && trunc(x) => x) > x <- seq_len(x) > n <- length(x) > if (n < m) > stop("n < m", domain = NA) > m <- as.integer(m) > e <- 0 > h <- m > a <- seq_len(m) > nofun <- is.null(FUN) > if (!nofun && !is.function(FUN)) > stop("'FUN' must be a function or NULL") > len.r <- length(r <- if (nofun) x[a] else FUN(x[a], ...)) > count <- as.integer(round(choose(n, m))) > if (simplify) { > dim.use <- if (nofun) > c(m, count) > else { > d <- dim(r) > if (length(d) > 1L) > c(d, count) > else if (len.r > 1L) > c(len.r, count) > else c(d, count) > } > } > if (simplify) { > out <- matrix(r, nrow = len.r, ncol = count) > } > else { > out <- vector("list", count) > out[[1L]] <- r > } > if (m > 0) { > i <- 2L > nmmp1 <- n - m + 1L > while (a[1L] != nmmp1) { > if (e < n - h) { > h <- 1L > e <- a[m] > j <- 1L > } > else { > e <- a[m - h] > h <- h + 1L > j <- 1L:h > } > a[m - h + j] <- e + j > r <- if (nofun) > x[a] > else FUN(x[a], ...) > if (simplify) > out[, i] <- r > else out[[i]] <- r > i <- i + 1L > } > } > if (simplify) > array(out, dim.use) > else out > } > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >