André Ziervogel
2014-Aug-07 10:19 UTC
[R] Some hpc problems with doMPI and runmpi (I know there is R-SIG-HPC)
Dear R people,

I've been doing some HPC using R and Open MPI. Unfortunately I've encountered a major problem, and its nature is hard to pin down:

Essentially, I call mpirun Rscript, and as soon as the script reaches a foreach() %dopar% it halts indefinitely. I've attached the qsub script:

#!/bin/bash
#$ -S /bin/bash
#$ -N test_14
#$ -cwd

#$ -V
#$ -o "/fhgfs/g61570/Spectral Databases/log/test_14_"$JOB_ID
#$ -j y
#$ -q regular
#$ -pe openmpi 8
#$ -l h_rt=00:15:00
#$ -l h_vmem=1.9G
#$ -m eas
#$ -M andre.ziervogel at psychol.uni-giessen.de

module add gcc
module add openmpi/gcc/64/1.6.5
module add R/gcc/3.0.1

date #log start time

echo "Number of slots " . $NSLOTS

mpirun Rscript /fhgfs/g61570/Spectral\ Databases/test_10.r > /fhgfs/g61570/Spectral\ Databases/log/test_14.Rout

date

exit

and the R file:

suppressMessages(library('doMPI'))

skylla.cluster <- startMPIcluster()
registerDoMPI(skylla.cluster)

cat(paste("COMM SIZE: ", mpi.comm.size(0), " cluster size: ", clusterSize(skylla.cluster), "\n",sep = ""))

tmp.time <- proc.time()
sample <- foreach(i=seq(from=0, to=1000, by =1),.combine='c',.inorder=TRUE) %do%
{
r <- sqrt(i^2 + i^2) + .Machine$double.eps * factorial(i)
sin(r) / r
}
cat(paste("Processing serial time: ", "\n", sep = " "))
print(proc.time() - tmp.time)
#print(sample)

tmp.time <- proc.time()
sample <- foreach(i=seq(from=0, to=1000, by =1),.combine='c',.inorder=TRUE) %dopar%
{
r <- sqrt(i^2 + i^2) + .Machine$double.eps * factorial(i)
sin(r) / r
}
cat(paste("Processing parallel time: ", "\n", sep = " "))
print(proc.time() - tmp.time)
#print(sample)

closeCluster(skylla.cluster)
# mpi.close.Rslaves()
# mpi.exit()
mpi.quit(save='no')

Any suggestions would be highly appreciated! Thanks!

Best

André

------------------------------------------------------
Dipl. Psych André Ziervogel
andre.ziervogel at psychol.uni-giessen.de
------------------------------------------------------
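Editorial note: the job script above passes a path containing a space ("Spectral Databases") to mpirun. The following is a minimal sketch, not a diagnosis of the hang, of how one might confirm that such a path reaches mpirun as a single argument before submitting a job. The helper name show_args and the fallback of 8 slots are assumptions made for illustration; -np is the Open MPI mpirun option for the process count, which the original script leaves implicit.

```shell
#!/bin/bash
# Sketch only: check how the shell splits the mpirun command line when
# the path contains a space ("Spectral Databases").

# Assumption: NSLOTS is set by the scheduler inside a job; fall back to 8,
# the value requested with "-pe openmpi 8", when run interactively.
nslots="${NSLOTS:-8}"

# Path taken from the original post.
script="/fhgfs/g61570/Spectral Databases/test_10.r"

# show_args is a hypothetical helper: it prints one argument per line,
# so a path split at its space would show up as two lines.
show_args() {
    printf '%s\n' "$@"
}

# Dry run: list the arguments mpirun would receive, without starting MPI.
show_args mpirun -np "$nslots" Rscript "$script"
```

If the last printed line is the complete script path, the quoting is equivalent to the backslash-escaped form in the original post, and the same arguments can then be passed to mpirun itself with the output redirected to the log file.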
Pascal Oettli
2014-Aug-08 06:56 UTC
[R] Some hpc problems with doMPI and runmpi (I know there is R-SIG-HPC)
Hello,

In your email, you speak about foreach() %dopar%, but in your script, it is foreach() %do%.

Best,
Pascal

--
Pascal Oettli
Project Scientist
JAMSTEC
Yokohama, Japan