
Multicore computation in Windows network: How to set up Rmpi

Hello!

I finally got MPICH 1.06 + R 2.6.1 + Rmpi 0.5-5 working across multiple 
computers. The key was realizing that the number of processes should be 
one when launching Rgui via mpiexec, not the total number of 
master+slaves, as I had first misunderstood.
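For reference, the launch looks roughly like this on my setup (the machine-file name and the Rgui path are from my installation and will differ on yours; the `-n`/`-machinefile` options are standard MPICH mpiexec options):

```
mpiexec -n 1 -machinefile mpd.hosts "C:\Program Files\R\R-2.6.1\bin\Rgui.exe"
```

The point is that mpiexec starts only the single master R process; the slave R processes are created later from inside R by mpi.spawn.Rslaves().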

However, I seem to have a new problem which I have not been able to 
figure out:

After loading Rmpi, the first call to mpi.spawn.Rslaves() always 
spawns the slaves on the local machine instead of on both machines. If I 
close the slaves and spawn again, one slave lands on the remote 
machine. Each time I close and spawn again, the assignment of machines 
changes, and eventually I get back to the situation where all 
slaves are on the local machine. Repeated spawn-close cycles seem to 
follow a pattern. I see similar behaviour with more than two machines; 
it just takes more spawn-close cycles before slaves end up on all of 
my slave machines.
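The manual cycling described above could be automated with a small retry loop, a sketch assuming Rmpi's standard API (mpi.spawn.Rslaves, mpi.remote.exec, mpi.close.Rslaves); the function name and the max_tries parameter are my own, and this is a workaround, not a fix for the underlying placement behaviour:

```r
library(Rmpi)

# Hypothetical workaround: respawn until the slaves report more than
# one distinct hostname, i.e. they are not all on the local machine.
spawn_until_spread <- function(max_tries = 10) {
  for (i in seq_len(max_tries)) {
    mpi.spawn.Rslaves()
    # Ask every slave for its hostname.
    hosts <- unlist(mpi.remote.exec(Sys.info()[["nodename"]]))
    if (length(unique(hosts)) > 1) {
      return(hosts)  # slaves are spread across machines; keep them
    }
    mpi.close.Rslaves()  # all local: tear down and try again
  }
  stop("slaves never spread across hosts after ", max_tries, " tries")
}
```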

Below is an example session with two machines. This pattern appears 
every time I start R and run this script. How can I control the spawning 
so that everything is right on the first call of mpi.spawn.Rslaves()?

Regards,

Samu

<R>

 >
 > library(Rmpi)
 > sessionInfo()
R version 2.6.1 (2007-11-26)
i386-pc-mingw32

locale:
LC_COLLATE=Finnish_Finland.1252;LC_CTYPE=Finnish_Finland.1252;LC_MONETARY=Finnish_Finland.1252;LC_NUMERIC=C;LC_TIME=Finnish_Finland.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] Rmpi_0.5-5
 > mpi.universe.size()
[1] 2
 > mpichhosts()
          master          slave1          slave2
"clustermaster" "clustermaster" "clusterslave1"
 > mpi.spawn.Rslaves()
         2 slaves are spawned successfully. 0 failed.
master (rank 0, comm 1) of size 3 is running on: ClusterMaster
slave1 (rank 1, comm 1) of size 3 is running on: ClusterMaster
slave2 (rank 2, comm 1) of size 3 is running on: ClusterMaster
 > mpi.close.Rslaves()
[1] 1
 > mpi.spawn.Rslaves()
         2 slaves are spawned successfully. 0 failed.
master (rank 0, comm 1) of size 3 is running on: ClusterMaster
slave1 (rank 1, comm 1) of size 3 is running on: ClusterSlave1
slave2 (rank 2, comm 1) of size 3 is running on: ClusterMaster
 > mpi.close.Rslaves()
[1] 1
 > mpi.spawn.Rslaves()
         2 slaves are spawned successfully. 0 failed.
master (rank 0, comm 1) of size 3 is running on: ClusterMaster
slave1 (rank 1, comm 1) of size 3 is running on: ClusterMaster
slave2 (rank 2, comm 1) of size 3 is running on: ClusterSlave1
 > mpi.close.Rslaves()
[1] 1
 > mpi.spawn.Rslaves()
         2 slaves are spawned successfully. 0 failed.
master (rank 0, comm 1) of size 3 is running on: ClusterMaster
slave1 (rank 1, comm 1) of size 3 is running on: ClusterMaster
slave2 (rank 2, comm 1) of size 3 is running on: ClusterMaster
 > mpi.close.Rslaves()
[1] 1
 > mpi.spawn.Rslaves()
         2 slaves are spawned successfully. 0 failed.
master (rank 0, comm 1) of size 3 is running on: ClusterMaster
slave1 (rank 1, comm 1) of size 3 is running on: ClusterSlave1
slave2 (rank 2, comm 1) of size 3 is running on: ClusterMaster
 > mpi.close.Rslaves()
[1] 1
 > mpi.spawn.Rslaves()
         2 slaves are spawned successfully. 0 failed.
master (rank 0, comm 1) of size 3 is running on: ClusterMaster
slave1 (rank 1, comm 1) of size 3 is running on: ClusterMaster
slave2 (rank 2, comm 1) of size 3 is running on: ClusterSlave1
 > mpi.close.Rslaves()
[1] 1
 >
 >
 > mpi.spawn.Rslaves()
         2 slaves are spawned successfully. 0 failed.
master (rank 0, comm 1) of size 3 is running on: ClusterMaster
slave1 (rank 1, comm 1) of size 3 is running on: ClusterMaster
slave2 (rank 2, comm 1) of size 3 is running on: ClusterMaster
 > mpi.close.Rslaves()
[1] 1
 >

</R>


Samu Mäntyniemi wrote: