Hi all, The function call series genericForPrimitive -> .findBasicFuns -> .findAll happens 4400 times while the GenomicRanges package is loading. Each time .findAll follows a chain of environments to determine that the methods namespace is the only one that holds a variable called .BasicFunsList. This accounts for ~10% of package loading time. I'm sure there is some history to that design, but would it be possible shortcut this operation? Could .BasicFunsList be initialized in the methods namespace at startup and might genericForPrimitive just go straight there? Does anyone on the list know why it works this way? There are some other cases of seemingly redundant work, but this seems like an easy one to address. I have included some code below that was used to investigate some of the above. # Try this to count calls to a function .count <- 0; trace(methods:::.findBasicFuns,tracer=function() { .count <<- .count + 1 }); library(GenomicRanges); print(.count) # Try this to capture the input and output of a set of functions you wish to refactor .init_test_data_collection <- function(ns = asNamespace("methods")) { funs = c("isClassUnion", "getClass", "genericForPrimitive", "possibleExtends", ".dataSlot", ".requirePackage", ".classEnv", "getClassDef", "outerLabels", ".getClassFromCache", "getFunction") message(paste0("\nCollecting data for unit tests on ", paste(funs, collapse=", "), " ...\n")) # Make env with list to hold test input/output TEST_ENV <- new.env() for (fname in funs) { # Make placeholder for input/output for future runs of this function TEST_ENV[[fname]] = list() # Actually probably not necessary, will just be c(NULL, list(first result)) the first time # Construct test version of function unlockBinding(fname, ns) fun = get(fname, envir=ns, mode="function") funbody = deparse(body(fun)) newfun <- fun newfun.body = c( sprintf("fname = '%s'", fname), "TEST_INFO = list()", "TEST_INFO$input = mget(names(formals(fname)))", c("realfun <- function()", funbody), "TEST_INFO$output = realfun()", "TEST_ENV[[fname]] = c(TEST_ENV[[fname]], list(TEST_INFO))", "return(TEST_INFO$output)") body(newfun) = as.call(c(as.name("{"), as.list(parse(text=newfun.body)))) assign(fname, newfun, envir=ns) } return(TEST_ENV) } # run code, print items in TEST_ENV The relevant code is in methods/R/BasicFunsList.R and methods/R/ClassExtensions.R Pete ____________________ Peter M. Haverty, Ph.D. Genentech, Inc. phaverty at gene.com [[alternative HTML version deleted]]
Note that setMethod() resolves .BasicFunsList in the methods namespace directly when setting a method on a primitive. Somehow there should be consistency between genericForPrimitive() and the check in setMethod(). Also, we can probably step away from the use of elNamed(), given that [[ now uses exact matching. Have you tried patching methods to use .BasicFunsList directly as in setMethod? On Wed, Jan 21, 2015 at 10:41 AM, Peter Haverty <haverty.peter at gene.com> wrote:> Hi all, > > The function call series genericForPrimitive -> .findBasicFuns -> .findAll > happens 4400 times while the GenomicRanges package is loading. Each time > .findAll follows a chain of environments to determine that the methods > namespace is the only one that holds a variable called .BasicFunsList. This > accounts for ~10% of package loading time. I'm sure there is some history > to that design, but would it be possible shortcut this operation? Could > .BasicFunsList be initialized in the methods namespace at startup and might > genericForPrimitive just go straight there? > > Does anyone on the list know why it works this way? > > There are some other cases of seemingly redundant work, but this seems like > an easy one to address. > > I have included some code below that was used to investigate some of the > above. > > # Try this to count calls to a function > > .count <- 0; trace(methods:::.findBasicFuns,tracer=function() { .count <<- > .count + 1 }); library(GenomicRanges); print(.count) > > # Try this to capture the input and output of a set of functions you wish > to refactor > > .init_test_data_collection <- function(ns = asNamespace("methods")) { > > funs = c("isClassUnion", "getClass", "genericForPrimitive", > "possibleExtends", ".dataSlot", ".requirePackage", ".classEnv", > "getClassDef", "outerLabels", ".getClassFromCache", "getFunction") > > message(paste0("\nCollecting data for unit tests on ", paste(funs, > collapse=", "), " ...\n")) > > # Make env with list to hold test input/output > > TEST_ENV <- new.env() > > for (fname in funs) { > > # Make placeholder for input/output for future runs of this > function > > TEST_ENV[[fname]] = list() # Actually probably not necessary, will > just be c(NULL, list(first result)) the first time > > # Construct test version of function > > unlockBinding(fname, ns) > > fun = get(fname, envir=ns, mode="function") > > funbody = deparse(body(fun)) > > newfun <- fun > > newfun.body = c( > > sprintf("fname = '%s'", fname), > > "TEST_INFO = list()", > > "TEST_INFO$input = mget(names(formals(fname)))", > > c("realfun <- function()", funbody), > > "TEST_INFO$output = realfun()", > > "TEST_ENV[[fname]] = c(TEST_ENV[[fname]], list(TEST_INFO))", > > "return(TEST_INFO$output)") > > body(newfun) = as.call(c(as.name("{"), > as.list(parse(text=newfun.body)))) > > assign(fname, newfun, envir=ns) > > } > > return(TEST_ENV) > > } > # run code, print items in TEST_ENV > > The relevant code is in methods/R/BasicFunsList.R and > methods/R/ClassExtensions.R > Pete > > ____________________ > Peter M. Haverty, Ph.D. > Genentech, Inc. > phaverty at gene.com > > [[alternative HTML version deleted]] > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel >[[alternative HTML version deleted]]
Doing it like this: genericForPrimitive <- function(f, where = topenv(parent.frame()), mustFind = TRUE) { ans = .BasicFunsList[[f]] ## this element may not exist (yet, during loading), dom't test null if(mustFind && identical(ans, FALSE)) stop(gettextf("methods may not be defined for primitive function %s in this version of R", sQuote(f)), domain = NA) ans } or this: genericForPrimitive <- function(f, where = topenv(parent.frame()), mustFind = TRUE) { env = asNamespace("methods") funs <- env[[".BasicFunsList"]] ans = funs[[f]] ## this element may not exist (yet, during loading), dom't test null if(mustFind && identical(ans, FALSE)) stop(gettextf("methods may not be defined for primitive function %s in this version of R", sQuote(f)), domain = NA) ans } Seems to work just fine. Yes, "el" and "elNamed" can probably go now. Pete ____________________ Peter M. Haverty, Ph.D. Genentech, Inc. phaverty at gene.com On Wed, Jan 21, 2015 at 2:26 PM, Michael Lawrence <lawrence.michael at gene.com> wrote:> Note that setMethod() resolves .BasicFunsList in the methods namespace > directly when setting a method on a primitive. Somehow there should be > consistency between genericForPrimitive() and the check in setMethod(). > > Also, we can probably step away from the use of elNamed(), given that [[ > now uses exact matching. > > Have you tried patching methods to use .BasicFunsList directly as in > setMethod? > > > On Wed, Jan 21, 2015 at 10:41 AM, Peter Haverty <haverty.peter at gene.com> > wrote: > >> Hi all, >> >> The function call series genericForPrimitive -> .findBasicFuns -> .findAll >> happens 4400 times while the GenomicRanges package is loading. Each time >> .findAll follows a chain of environments to determine that the methods >> namespace is the only one that holds a variable called .BasicFunsList. >> This >> accounts for ~10% of package loading time. I'm sure there is some history >> to that design, but would it be possible shortcut this operation? Could >> .BasicFunsList be initialized in the methods namespace at startup and >> might >> genericForPrimitive just go straight there? >> >> Does anyone on the list know why it works this way? >> >> There are some other cases of seemingly redundant work, but this seems >> like >> an easy one to address. >> >> I have included some code below that was used to investigate some of the >> above. >> >> # Try this to count calls to a function >> >> .count <- 0; trace(methods:::.findBasicFuns,tracer=function() { .count >> <<- >> .count + 1 }); library(GenomicRanges); print(.count) >> >> # Try this to capture the input and output of a set of functions you wish >> to refactor >> >> .init_test_data_collection <- function(ns = asNamespace("methods")) { >> >> funs = c("isClassUnion", "getClass", "genericForPrimitive", >> "possibleExtends", ".dataSlot", ".requirePackage", ".classEnv", >> "getClassDef", "outerLabels", ".getClassFromCache", "getFunction") >> >> message(paste0("\nCollecting data for unit tests on ", paste(funs, >> collapse=", "), " ...\n")) >> >> # Make env with list to hold test input/output >> >> TEST_ENV <- new.env() >> >> for (fname in funs) { >> >> # Make placeholder for input/output for future runs of this >> function >> >> TEST_ENV[[fname]] = list() # Actually probably not necessary, >> will >> just be c(NULL, list(first result)) the first time >> >> # Construct test version of function >> >> unlockBinding(fname, ns) >> >> fun = get(fname, envir=ns, mode="function") >> >> funbody = deparse(body(fun)) >> >> newfun <- fun >> >> newfun.body = c( >> >> sprintf("fname = '%s'", fname), >> >> "TEST_INFO = list()", >> >> "TEST_INFO$input = mget(names(formals(fname)))", >> >> c("realfun <- function()", funbody), >> >> "TEST_INFO$output = realfun()", >> >> "TEST_ENV[[fname]] = c(TEST_ENV[[fname]], list(TEST_INFO))", >> >> "return(TEST_INFO$output)") >> >> body(newfun) = as.call(c(as.name("{"), >> as.list(parse(text=newfun.body)))) >> >> assign(fname, newfun, envir=ns) >> >> } >> >> return(TEST_ENV) >> >> } >> # run code, print items in TEST_ENV >> >> The relevant code is in methods/R/BasicFunsList.R and >> methods/R/ClassExtensions.R >> Pete >> >> ____________________ >> Peter M. Haverty, Ph.D. >> Genentech, Inc. >> phaverty at gene.com >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-devel at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-devel >> > >[[alternative HTML version deleted]]