I can do mpi.apply but not foreach with doMPI
Since you are running in batch, consider using the SLURM cluster the way it was designed to be used: SPMD style. Below is a simple code inspired by your examples that does a sort to find the bottom 10 numbers.

library(pbdMPI, quiet=TRUE)
init()

a <- sort(runif(1e7))[1:10]
comm.print(a, all.rank=TRUE)

b <- as.numeric(unlist(gather(a)))
c <- sort(b)[1:10]
comm.print(c)

finalize()

Here is how I run the code in serial:

-bash-4.1$ time Rscript bottomten.r
COMM.RANK = 0
 [1] 2.596062e-07 3.082678e-07 3.138557e-07 6.444753e-07 7.168856e-07
 [6] 7.280615e-07 1.073349e-06 1.138775e-06 1.226086e-06 1.244014e-06
COMM.RANK = 0
 [1] 2.596062e-07 3.082678e-07 3.138557e-07 6.444753e-07 7.168856e-07
 [6] 7.280615e-07 1.073349e-06 1.138775e-06 1.226086e-06 1.244014e-06

real 0m5.047s
user 0m4.734s
sys 0m0.157s

And now a parallel run on 8 cores:

-bash-4.1$ time mpirun -np 8 Rscript bottomten.r
COMM.RANK = 0
 [1] 1.641456e-07 2.663583e-07 7.601921e-07 1.008157e-06 1.064735e-06
 [6] 1.178822e-06 1.366483e-06 1.381151e-06 1.406297e-06 1.461012e-06
COMM.RANK = 1
 [1] 3.492460e-08 6.798655e-08 1.867302e-07 3.015157e-07 3.234018e-07
 [6] 3.348105e-07 4.756730e-07 5.729962e-07 5.888287e-07 6.936025e-07
COMM.RANK = 2
 [1] 1.094304e-07 1.136214e-07 2.984889e-07 3.867317e-07 6.183982e-07
 [6] 8.104835e-07 9.895303e-07 1.240522e-06 1.284061e-06 1.376960e-06
COMM.RANK = 3
 [1] 3.050082e-08 6.728806e-08 8.335337e-08 4.125759e-07 5.690381e-07
 [6] 6.437768e-07 1.186039e-06 1.340872e-06 1.558103e-06 1.562294e-06
COMM.RANK = 4
 [1] 4.889444e-09 1.490116e-08 1.576264e-07 1.578592e-07 1.718290e-07
 [6] 1.958106e-07 2.747402e-07 7.252675e-07 9.618234e-07 9.881333e-07
COMM.RANK = 5
 [1] 1.862645e-08 6.728806e-08 1.268927e-07 1.578592e-07 2.654269e-07
 [6] 3.289897e-07 3.348105e-07 6.000046e-07 6.633345e-07 7.471536e-07
COMM.RANK = 6
 [1] 1.394656e-07 2.512243e-07 2.977904e-07 3.096648e-07 3.606547e-07
 [6] 6.635674e-07 1.054723e-06 1.059147e-06 1.180219e-06 1.305714e-06
COMM.RANK = 7
 [1] 1.785811e-07 1.816079e-07 2.454035e-07 3.625173e-07 4.067552e-07
 [6] 4.153699e-07 4.447066e-07 4.516915e-07 4.768372e-07 5.601906e-07
COMM.RANK = 0
 [1] 4.889444e-09 1.490116e-08 1.862645e-08 3.050082e-08 3.492460e-08
 [6] 6.728806e-08 6.728806e-08 6.798655e-08 8.335337e-08 1.094304e-07

real 0m5.847s
user 0m40.735s
sys 0m2.358s

Note that real time barely increased even though we did about 8 times the work. User time reflects the actual total CPU time added across the 8 cores. The communication operation is gather(), which gathers its argument to rank 0 by default. See the pbdDEMO package for other examples.

George

-----Original Message-----
From: R-sig-hpc <r-sig-hpc-bounces at r-project.org> on behalf of Seija Sirkiä <seija.sirkia at csc.fi>
Date: Wednesday, August 26, 2015 at 4:12 AM
To: <r-sig-hpc at r-project.org>
Subject: [R-sig-hpc] I can do mpi.apply but not foreach with doMPI

Hi all,
I'm trying to learn to do parallel computing with R and foreach on this
cluster of ours but clearly I'm doing something wrong and I can't figure
out what.
Briefly, I'm sitting on a Linux cluster, about which the user guide says
that the login nodes are based on RHEL6, while the computing nodes
use CentOS 6. Jobs are submitted using SLURM.
So there I go, requesting a short interactive test session using:
srun -p test -n4 -t 0:15:00 --pty Rmpi --no-save
Here Rmpi is the modified R_home_dir/bin/R mentioned in the Rprofile file
that comes with Rmpi ("This R profile can be used when a cluster does not
allow spawning --- Another way is to modify R_home_dir/bin/R by
adding...").
When my session starts, I get these messages:
master (rank 0, comm 1) of size 4 is running on: c1
slave1 (rank 1, comm 1) of size 4 is running on: c1
slave2 (rank 2, comm 1) of size 4 is running on: c1
slave3 (rank 3, comm 1) of size 4 is running on: c1
before the prompt. Sounds good, and if I go check top on the c1 node,
there I see 3 R's churning away happily at 100% cpu time, and one not
doing much. As it should be, as far as I can tell?
If I then run this little test:
funtorun<-function(k) {
system.time(sort(runif(1e7)))
}
system.time(a<-mpi.apply(1:3,funtorun))
a
b<-a
system.time(for(i in 1:3) b[[i]]<-system.time(sort(runif(1e7))))
b
it goes through nicely, and the mpi.apply part takes about 2.6 seconds in
total, with each of the 3 sorts taking about that same time, while the
latter for-loop takes about 7 seconds in total, each of the three sorts
taking about 2.3 seconds. Nice, that tells me the workers will do stuff,
simultaneously, when requested correctly.
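
The timings quoted above imply a speedup that can be worked out directly. A small sketch using the rounded figures from the text (about 7 seconds for the serial loop, about 2.6 seconds for mpi.apply on 3 workers); the variable names are illustrative only:

```r
# Rounded wall-clock totals taken from the description above
serial_total   <- 7.0   # seconds, for-loop running three sorts one after another
parallel_total <- 2.6   # seconds, mpi.apply running three sorts on three workers
workers        <- 3

speedup    <- serial_total / parallel_total   # how many times faster
efficiency <- speedup / workers               # fraction of the ideal 3x speedup

cat(sprintf("speedup: %.2f, efficiency: %.2f\n", speedup, efficiency))
```

That works out to a speedup of roughly 2.7 on 3 workers, or about 90 percent of the ideal, which is consistent with the three sorts really running side by side.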
But if I try this instead:
library(doMPI)
cl<-startMPIcluster()
registerDoMPI(cl)
system.time(a<-foreach(i=1:3) %dopar% system.time(sort(runif(1e7))))
it just hangs up at the foreach line, and never gets through, and only
gets killed at the end of the reserved 15 minutes or when I scancel the
whole job myself. None of the lines give any errors.
So what am I doing wrong? I have a hunch this has something to do with
how my workers are started, since I never get to do those mpirun commands
that the doMPI manual speaks of. But despite my efforts of reading the
manual and the documentation of startMPIcluster I haven't figured out
what else to try.
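
For reference, the mpirun-style launch that the doMPI manual speaks of can be sketched roughly as below. This is a minimal sketch under stated assumptions, not a tested fix for this cluster: the file name foreach_test.R is hypothetical, and it assumes the script is started with plain R under mpirun (or an equivalent launcher) rather than through the modified Rmpi wrapper, whose profile already puts the non-master ranks into their own command loop.

```r
# foreach_test.R (hypothetical file name)
# Assumed launch from a batch job, using plain R and not the Rmpi wrapper:
#   mpirun -n 4 R --slave -f foreach_test.R
suppressMessages(library(doMPI))   # attaches foreach and Rmpi as dependencies

# With more than one MPI process already running, every rank executes this
# script; as far as I understand doMPI, ranks above 0 turn into workers
# inside startMPIcluster() and only rank 0 continues past this line.
cl <- startMPIcluster()
registerDoMPI(cl)

# Same test as above: three sorts, one per worker, collecting elapsed seconds
elapsed <- foreach(i = 1:3, .combine = c) %dopar% {
  system.time(sort(runif(1e7)))[["elapsed"]]
}
print(elapsed)

closeCluster(cl)   # shut the workers down
mpi.quit()         # finalize MPI and exit R
```

Whether srun can stand in for mpirun here depends on how MPI was built on the cluster, so treat the launch line as an assumption to check against the local user guide.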
Many thanks in advance for your time!
BR,
Seija Sirkiä
_______________________________________________
R-sig-hpc mailing list
R-sig-hpc at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-hpc