First of all, LSF is batch scheduling software. It usually expects an .lsf
script. The compilers on a cluster can usually be swapped via
'module switch <unload module> <load module>', and MPI-2 is version 2 of
the Message Passing Interface standard. This is also rather a topic for the
high-performance R list (R-sig-hpc).
Next, doMC is a multicore package that registers cores on one machine (AFAIK),
i.e. you have to work on one machine that has the 24 cores. Inform yourself
about the hardware on your cluster; there should also be introduction
presentations online. Knowing what hardware you use and what architecture it
has is the first step! Try 'bhosts' on your shell to see which hosts are
available. If you want to use several machines, your backend for foreach
should be doMPI and not doMC
(see http://cran.r-project.org/web/packages/doMPI/vignettes/doMPI.pdf).
Once you have found your host, you have to write an LSF script like the one
below. It is for OpenMP on ONE machine using 24 cores, which suffices in most
cases. Further, you usually do not have to wait as long for the job to start,
because the scheduler only has to find ONE free machine. If you have BULL
clusters, take these: they have a lot of cores (32/64?) and a lot of memory.
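A minimal sketch of that first step, assuming a POSIX shell. It prints the
core count of the machine you are logged in on and calls 'bhosts' only if LSF
is on your PATH. Note that a login node can have different hardware than the
compute nodes, so take the number as a hint only. The `_NPROCESSORS_ONLN` name
works on Linux and macOS; on other systems that is an assumption.

```shell
#!/bin/sh
# Cores on the machine you are currently logged in on.
# Falls back to "unknown" if getconf does not know the variable.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo unknown)
echo "cores on this machine: $cores"

# List the LSF hosts only when the LSF commands are available.
if command -v bhosts >/dev/null 2>&1; then
    bhosts
else
    echo "bhosts not found: LSF commands are not on this PATH"
fi
```

In the 'bhosts' output the MAX column should show the maximum number of job
slots per host, which tells you whether a single host can take all 24.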
So in your case, write a script with extension .lsf containing:
#!/usr/bin/env zsh
### using the zsh shell (the shebang has to be the first line of the script)
### Job name
#BSUB -J OpenMP
### File/path where output will be written, the %J is the job ID
#BSUB -o OpenMP.%J
### (OFF) Different file for STDERR, if not to be merged with STDOUT
# #BSUB -e OpenMP.e%J
### Request the time you need for execution in minutes
### The format for the parameter is: [hour:]minute,
### that means for 80 minutes you could also use this: 1:20
#BSUB -W 3:00
### Request virtual memory you need for your job in MB (per process)
#BSUB -M 1024
### Request a higher stack size (per process)
#BSUB -S 1024
### Request the number of compute slots you want to use
#BSUB -n 24
### Specify your mail address
#BSUB -u pkount at bgc-jena.mpg.de
### Send a mail when job is done
#BSUB -N
### Use esub for OpenMP
#BSUB -a openmp
### (OFF) As R is usually compiled via gcc I would load the gcc module on your
### cluster
# module switch pgi gcc/4.6
### (OFF) load another OpenMP version than the default one (check which one is
### usually loaded!! should be now OpenMP 4.0)
# module switch openmp openmp/3.0
### Set stack and address limits
ulimit -s unlimited
ulimit -v unlimited
### Change to the work directory
cd /home/your_username/
### Execute your application (make sure that R can be loaded via 'R' on
### the shell!!!)
R --no-restore --no-save --quiet --slave < your_R_script.R
------------------------
In your R script file, load the packages
library(doMC)
library(foreach)
registerDoMC(24) ## now, foreach knows the backend.
foreach(...) %dopar% ...
## save your stuff to your work- or home directory (csv or database)
quit()
-----------------------
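Instead of hard-coding 24 in two places, the job script could hand the slot
count to R. A minimal sketch, assuming your LSF version sets LSB_DJOB_NUMPROC
inside a running job (AFAIK it holds the number of slots granted via -n). The
name NCORES is my own choice, not something LSF or doMC defines.

```shell
#!/bin/sh
# LSB_DJOB_NUMPROC is assumed to be set by LSF inside a job.
# Outside a job it is unset, so fall back to a single core.
NCORES="${LSB_DJOB_NUMPROC:-1}"
export NCORES   # hypothetical variable name, visible to the R process
echo "R workers to register: $NCORES"

# Start R afterwards exactly as in the job script above.
```

In the R script you could then read the value with
as.integer(Sys.getenv("NCORES", "1")) and pass it to registerDoMC(), so the
number of workers always matches what the scheduler actually granted.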
Then you send the script to LSF via
bsub < my_LSF_script.lsf
Check via 'bjobs' whether it has been submitted and what its status is (PEND
or RUN). If the status is RUN, you can look via 'bpeek your_job_ID' at what
the output looks like while it runs.
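The submit-and-check steps can be sketched in one small script. It assumes
bsub confirms a submission with a line of the form
'Job <12345> is submitted to queue <normal>.'; the helper name extract_job_id
is my own. Without LSF on the PATH it only demonstrates the parsing on a
sample line.

```shell
#!/bin/sh
# Read bsub's confirmation line on stdin and print the numeric job ID.
extract_job_id() {
    sed -n 's/^Job <\([0-9][0-9]*\)>.*/\1/p'
}

if command -v bsub >/dev/null 2>&1; then
    # Submit the job script and remember the job ID.
    job_id=$(bsub < my_LSF_script.lsf | extract_job_id)
    # Show the status (PEND or RUN) of this job only.
    bjobs "$job_id"
    # Once the status is RUN, peek at the output so far:
    # bpeek "$job_id"
else
    echo "bsub not found; parsing a sample confirmation line instead:"
    echo 'Job <12345> is submitted to queue <normal>.' | extract_job_id   # prints 12345
fi
```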
Best
Simon
On Sep 27, 2013, at 10:48 PM, pakoun <pkount at bgc-jena.mpg.de> wrote:
> Dear R users,
> I am struggling with memory issues and try to understand a few things. I am
> using an LSF cluster with PGI compiler and parallel mpi2 computing (whatever
> does that means..) and i submit a job like:
>
> bsub -R "rusage[mem=30000]" -q queue -n 24 R CMD BATCH <arguments..> myjob.r ..log
>
> According to that I am asking for 24 cores and 30GB RAM.
>
> Then I am using
> library(doMC)
> registerDoMC(24)
>
> and a foreach loop either simple or nested with the %dopar% command.
>
> 1. this 30 GB will be distributed among the 24 jobs or each will take 30?
> 2. If i dont ask the -n 24 argument still the foreach loop will run in
> parallel as i check with TOP command. What is the purpose of using it? Just
> to "reserve" the nodes from other users?
>
> Thank you