Henrik Bengtsson
2014-Jun-17 21:27 UTC
[Rd] PATCH: Avoiding extra copies (NAMED bumped) with source(..., print.eval=FALSE) ...and with print.eval=TRUE?
OBJECTIVE:
To update source(..., print.eval=FALSE) to not use withVisible() unless really needed. This avoids unnecessary increases of reference counts/NAMED introduced by withVisible(), which in turn avoids unnecessary memory allocations and garbage collection overhead. This has an impact on all source():ed scripts, e.g. pre-allocation of large matrices to save memory does *not always* help in such setups. It is likely to also affect the evaluation of code chunks of various vignette engines.

BACKGROUND:
If you run the following at the prompt, the assignment of the first element does *not* cause an extra copy of 'x':

> x <- 1:2
> tracemem(x)
[1] "<0x000000001c5a7b28>"
> x[1] <- 0L
> x
[1] 0 2

However, if you source() the same code (with print.eval=FALSE; the default), you get:

> code <- "
x <- 1:2
tracemem(x)
x[1] <- 0L
"
> source(textConnection(code))
tracemem[0x0000000010504e20 -> 0x0000000010509cd0]: eval eval withVisible source
> x
[1] 0 2

Looking at how source() works, this is because it effectively does:

> invisible(withVisible(x <- 1:2))
> invisible(withVisible(tracemem(x)))
> invisible(withVisible(x[1] <- 0L))
tracemem[0x00000000104b68a8 -> 0x00000000104b6788]: withVisible
> x
[1] 0 2

WORKAROUND HACK:
I understand that wrapping up multiple expressions into one avoids this:

> code <- "{
x <- 1:2
tracemem(x)
x[1] <- 0L
}"
> source(textConnection(code))

which in this case you can narrow down to:

code <- "
{x <- 1:2; {}}
tracemem(x)
x[1] <- 0L
"
source(textConnection(code))

but that's not my point here. Instead, I believe R can handle this better itself.

DISCUSSION / PATCH:
It's quite easy to patch base::source(..., print.eval=FALSE) to avoid the extra copies, because source() uses withVisible() so that it (a) can show()/print() the value of each expression (when print.eval=TRUE), and (b) can return the value of the last expression evaluated. Thus, with print.eval=FALSE, withVisible() is only needed for the very last expression evaluated.
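The reasoning above can be sketched as a stripped-down evaluation loop. This is only an illustration, not the code in base::source(): the function name source_sketch() and its arguments are hypothetical, and it ignores echoing, srcrefs, and everything else source() does. It shows only the decision of when withVisible() is called.

```r
# Hypothetical helper (not part of base R): a minimal stand-in for the
# evaluation loop in source(), showing when withVisible() is needed.
source_sketch <- function(text, envir = parent.frame(), print.eval = FALSE) {
  exprs <- parse(text = text)
  n <- length(exprs)
  yy <- NULL
  for (i in seq_len(n)) {
    ei <- exprs[[i]]
    if (print.eval || i == n) {
      # Visibility is needed here: either to decide whether to print the
      # value, or because this is the last value and it will be returned.
      yy <- withVisible(eval(ei, envir))
      if (print.eval && yy$visible) print(yy$value)
    } else {
      # Neither printed nor returned, so evaluate without withVisible().
      eval(ei, envir)
    }
  }
  invisible(yy)
}

# Usage: evaluate three expressions in a fresh environment.
env <- new.env()
res <- source_sketch("x <- 1:2\nx[1] <- 0L\nx", envir = env)
```

Whether skipping withVisible() actually avoids a copy depends on the R version and its NAMED/reference-counting behaviour, so the sketch makes no claim about tracemem() output.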
Here is a patch to source() that avoids calling withVisible() unless needed:

$ svn diff src/library/base/R/source.R
Index: src/library/base/R/source.R
==================================================================
--- src/library/base/R/source.R (revision 65900)
+++ src/library/base/R/source.R (working copy)
@@ -206,7 +206,10 @@
             }
         }
         if (!tail) {
-            yy <- withVisible(eval(ei, envir))
+            if (print.eval || i == Ne+echo)
+                yy <- withVisible(eval(ei, envir))
+            else
+                eval(ei, envir)
             i.symbol <- mode(ei[[1L]]) == "name"
             if (!i.symbol) {
                 ## ei[[1L]] : the function "<-" or other

With this patch you get:

> source(textConnection(code), echo=TRUE, print.eval=FALSE)
> x <- 1:2
> tracemem(x)
> x[1] <- 0L

> source(textConnection(code), echo=TRUE, print.eval=TRUE)
> x <- 1:2
> tracemem(x)
[1] "<0x000000001c5675c0>"
> x[1] <- 0L
tracemem[0x000000001c5675c0 -> 0x000000001c564ad0]: eval eval withVisible source

FURTHER IMPROVEMENTS:
Looking at the internals of withVisible():

/* This is a special .Internal */
SEXP attribute_hidden do_withVisible(SEXP call, SEXP op, SEXP args, SEXP rho)
{
    SEXP x, nm, ret;

    checkArity(op, args);
    x = CAR(args);
    x = eval(x, rho);
    PROTECT(x);
    PROTECT(ret = allocVector(VECSXP, 2));
    PROTECT(nm = allocVector(STRSXP, 2));
    SET_STRING_ELT(nm, 0, mkChar("value"));
    SET_STRING_ELT(nm, 1, mkChar("visible"));
    SET_VECTOR_ELT(ret, 0, x);
    SET_VECTOR_ELT(ret, 1, ScalarLogical(R_Visible));
    setAttrib(ret, R_NamesSymbol, nm);
    UNPROTECT(3);
    return ret;
}

I'm not sure exactly where the reference count (NAMED) is bumped, but *if it is possible to evaluate the expression and inspect whether the value is "visible" or not before that happens*, then one could imagine adding an option to withVisible() that tells it to return the value only if the evaluated value is visible (otherwise NULL).
That way one could avoid extra copies in most cases also with print.eval=TRUE, e.g.

> withVisible(x[1] <- 0L)
$value
[1] 0

$visible
[1] FALSE

In other words, whenever withVisible() returns visible=FALSE, the returned value is not used by source().

Comments?

/Henrik

> sessionInfo()
R Under development (unstable) (2014-06-12 r65926)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base