Henrik Bengtsson
2013-Nov-11 12:31 UTC
[Rd] SUGGESTION: Environment variable R_MAX_MC_CORES for maximum number of cores
I'd like to propose a unified/standard system environment variable that specifies the maximum number of cores an R session should use, e.g. R_MAX_MC_CORES. This could then be used to *guide* multicore implementations on the number of cores to use. This is different from parallel::detectCores(), which reports how many cores the machine has rather than how many the session should use.

ENVIRONMENT VARIABLE:

    library(parallel)
    mc.cores <- as.integer(Sys.getenv("R_MAX_MC_CORES", 1L))
    res <- mclapply(1:10, FUN=fib, mc.cores=mc.cores)

R OPTION:
Analogously to several other env.var./options, R_MAX_MC_CORES could set an option on startup for convenience, e.g.

    options(max.mc.cores=as.integer(Sys.getenv("R_MAX_MC_CORES", 1L)))

R COMMAND-LINE OPTION:
One could also imagine a command-line option for R/Rscript that sets this, e.g.

    Rscript --max.mc.cores=3 batch.R

EXAMPLE OF USAGE:
This would, for instance, simplify multicore processing on a PBS cluster, where the PBS job script can be:

    Rscript --max.mc.cores=$PBS_NUM_PPN batch.R

such that R and the 'batch.R' script do not have to be aware of settings/variables specific to PBS (or whatever cluster system is used).

Finally, getOption("max.mc.cores", 1L) could possibly also be the new default for the 'mc.cores' argument in 'parallel' functions.

/Henrik
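
For illustration, here is a minimal sketch of how the lookup could be done in user code today, without any changes to R itself. The helper name max_mc_cores() is hypothetical and not part of 'parallel'. The precedence (option first, then environment variable, then 1) and the fallback to 1 for invalid values are assumptions of this sketch, not part of the proposal.

```r
library(parallel)

# Hypothetical helper (NOT part of 'parallel'): resolve the core limit from
# the proposed option 'max.mc.cores', then from the proposed environment
# variable R_MAX_MC_CORES, and otherwise fall back to 'default'.
max_mc_cores <- function(default = 1L) {
  value <- getOption("max.mc.cores", default = NULL)
  if (is.null(value)) {
    value <- Sys.getenv("R_MAX_MC_CORES", unset = NA_character_)
  }
  # Non-numeric strings become NA here; the warning is suppressed on purpose
  value <- suppressWarnings(as.integer(value))
  # Assumption: unset, malformed, or non-positive values fall back to 'default'
  if (length(value) != 1L || is.na(value) || value < 1L) {
    value <- as.integer(default)
  }
  value
}

# Example: the limit would normally be set outside R, e.g. by a job script
Sys.setenv(R_MAX_MC_CORES = "3")

# mclapply() does not support mc.cores > 1 on Windows, so use 1 there
cores <- if (.Platform$OS.type == "windows") 1L else max_mc_cores()
res <- mclapply(1:10, FUN = sqrt, mc.cores = cores)
```

With something like this, a PBS job script could simply export R_MAX_MC_CORES=$PBS_NUM_PPN before calling Rscript, and 'batch.R' would not need to know about PBS.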