Waichler, Scott R
2008-May-14 15:05 UTC
[R] Parallel computing with rgenoud and snow: external file I/O possible?
I am trying to use rgenoud and snow with an external model and file I/O. Unlike the typical application of rgenoud and snow, I need to run an external executable and do pre- and post-processing of input and output files for each parameter set generated by genoud(). I'm hoping that someone can suggest improvements to my approach, and a work-around for the file I/O problems I've encountered when trying to do this in parallel under snow.

I'm using the "SOCK" cluster type. An external executable (the model) and file input and output for that model are used. For this example, the model is simply R's sin() function, packaged so that it is called by a shell script. The model reads an input file and writes an output file.

My example involves two nodes on different machines. Each model run launched by genoud() uses a working directory named after the node to hold the temporary input and output files. The function called by genoud(), drive.calib(), saves the parameter and objective function values in a file also named after the node; after genoud() finishes, the results from all the runs are combined into one file for convenience. I realize that the communication in this simple example takes much longer than the model execution. In my real application the model runtimes will be much longer than the communication time.

To simplify the scripts further, I tried to pass the variable working.dir as an additional argument in genoud(), but it didn't work. I seem to have to specify all the working.dir pathnames before the genoud() call, as well as the specific one for the particular model execution inside the function drive.calib(). Also, a relative pathname for the working directory does not work; I found I had to use absolute pathnames. If I wanted to use more than one chip on the same machine, I believe I would have to use random numbers to create unique directory names and prevent conflicts between model executions.
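For what it's worth, the unique-directory scheme I have in mind is an extension of the commented-out lines in drive.calib() below: build a per-run directory name inside the function and create it there, rather than ahead of time. Roughly like this (a sketch only; tempdir() stands in for my absolute working directory, and adding Sys.getpid() to the name is just one way to separate simultaneous runs on the same machine):

```r
# Sketch: make a unique per-run directory from hostname, process id, and a
# random number, created by the worker itself rather than before genoud().
working.dir <- paste(sep="", tempdir(), "/")   # stand-in for the absolute working directory
this.host <- as.character(system("hostname", intern=TRUE))
this.rn <- sample(1:1e6, 1)                    # random tag for runs on the same machine
this.dir <- paste(sep="", working.dir, this.host, "_", Sys.getpid(), "_", this.rn)
dir.create(this.dir, showWarnings=FALSE)       # created at run time, inside the function
print(file.exists(this.dir))                   # check that the worker could create it
```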
However, in my trial-and-error development I found that genoud and/or the cluster are very touchy about file handling, and it was necessary to create the working directories before genoud() was called, which effectively prevents me from using multiple chips on the same machine. If you have any insights as to why, or can suggest a work-around, it would be appreciated.

The pasted-in files below are as follows: test_cluster.r, the main R script; call_R.csh, a shell script that calls the model; sin.r, an R script that is the "model". Together call_R.csh and sin.r make up the external model.

Thanks for any help,

Scott Waichler
Pacific Northwest National Laboratory
Richland, WA  USA
509-372-4423 (voice)
509-372-6089 (fax)
scott.waichler at pnl.gov

------------ test_cluster.r -----------------------------------------------------------
# test_cluster.r
#
# R script to test the parallelization capability.
# This uses rgenoud and external processes outside of R, and file I/O.
# These are the key characteristics of work with stand-alone models.

library(snow)      # Simple Network of Workstations
#library(rlecuyer) # for random number generator
library(rgenoud)   # calibrator of choice

# Set the working directory--must be an absolute pathname.  Also put this statement inside
# drive.calib(), the function that is called by genoud().  I cannot get genoud() to pass this
# as an argument without breaking.
working.dir <- "/projects/dhsvm/uvm/test/rhelp/"
results.file <- "test_results.txt"  # file to save results in

# Set up the cluster
this.host <- system("hostname", intern=T)
node <- c(this.host, "escalante")   # add additional nodes here
setDefaultClusterOptions(master=this.host, type="SOCK", homogeneous=T, outfile="/tmp/cluster1")
cl <- makeCluster(node)
#clusterSetupRNG(cl)  # init random number generator to ensure each node has a different seed

# Define the function that will be called by genoud()
drive.calib <- function(xx) {  # xx: parameter value that is being adjusted
   working.dir <- "/projects/dhsvm/uvm/test/rhelp/"  # HARDWIRED WORKING DIRECTORY
   this.host <- as.character(system("hostname", intern=T))
   #this.rn <- sample(1:1e6, 1)  # get a random number
   #this.dir <- paste(sep="", working.dir, this.host, "_", this.rn)  # hostname and random number
   this.dir <- paste(sep="", working.dir, this.host)  # hostname only
   infile  <- paste(sep="/", this.dir, "tmp.in")   # input to external model
   outfile <- paste(sep="/", this.dir, "tmp.out")  # output from external model
   # file that holds input parameter values and results for all model runs on this node
   results.file <- paste(sep="", this.dir, ".out")

   cat(file=infile, append=F, xx)  # write the input file for the model

   # run the external model
   system(paste(sep="", working.dir, "call_R.csh ", this.dir), intern=F, ignore.stderr=TRUE)

   # read the result from the external model run
   score <- scan(file=outfile, quiet=T)

   # save the parameter value and result for this model run
   cat(file=results.file, append=T, xx, score, "\n")

   return(score)  # return the objective function result to genoud()
}  # end drive.calib()

# Initialize a file to hold the results for each host, and create a working
# directory for each host to hold the temporary input and output files for the runs on that host.
result.files <- paste(sep="", node, ".out")
for(i in 1:length(result.files)) {
   file.create(result.files[i])         # result file for runs on node[i]
   dir.create(node[i], showWarnings=F)  # working directory for model runs on node[i]
}

# Run genoud
system.time(
   optim.results <- genoud(fn=drive.calib, nvars=1, max=TRUE, boundary.enforcement=1,
                           pop.size=20, max.generations=6, wait.generations=2,
                           default.domains=1, cluster=cl, balance=T, debug=T)
)

stopCluster(cl)  # terminate cluster

# Consolidate the results into one file.
cat(file=results.file, sep="",
    "# Results from testing parallel computing with R,\n",
    "# from test_cluster.r\n",
    "x y\n")
for(i in 1:length(result.files)) {
   x <- readLines(result.files[i])
   cat(file=results.file, append=T, sep="\n", x)
}
# end of script
----------------- end of test_cluster.r ----------------------------------

----------------- call_R.csh ---------------------------------------------
#!/bin/tcsh
# A shell script used in testing parallel computing in R with the snow package.
set rscript="/projects/dhsvm/uvm/scripts/sin.r"
set thisdir=$1  # use the argument passed to this script
R --vanilla --slave --args $thisdir < $rscript
----------------- end of call_R.csh --------------------------------------

----------------- sin.r --------------------------------------------------
# Script for testing parallel computing in R with the snow package.
# This is called by the shell script call_R.csh.
args <- commandArgs(TRUE)
this.dir <- args[1]
infile  <- paste(sep="/", this.dir, "tmp.in")
outfile <- paste(sep="/", this.dir, "tmp.out")
x <- scan(file=infile, quiet=T)  # get the input
y <- round(sin(x), 3)            # get the output; round the answer to speed up solution
cat(file=outfile, append=F, y)
----------------- end of sin.r --------------------------------------
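P.S. One workaround I am considering for the working.dir problem, instead of hard-wiring the path inside drive.calib(), is snow's clusterExport(), which assigns a variable in the global environment of every worker so the function run by genoud() could pick it up as a global. I have not tested this in my genoud() setup; here is a minimal sketch with two local workers standing in for the two machines, and a trivial function in place of drive.calib():

```r
library(snow)

# Sketch: export working.dir to the workers, then read it back from each one
# to confirm every worker sees the same value.
cl <- makeCluster(2, type="SOCK")                  # two workers on this machine
working.dir <- "/projects/dhsvm/uvm/test/rhelp/"   # path from the real script
clusterExport(cl, "working.dir")                   # copy into each worker's global env
seen <- clusterCall(cl, function() working.dir)    # each worker returns its copy
stopCluster(cl)
print(unlist(seen))
```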