Rmpi and cpu usage on slaves
On 21 April 2009 at 16:40, Sean Davis wrote:
| I am running sge6.2, openmpi 1.3.1, and Rmpi 0.5.7 on openSUSE linux. I can
| start up an arbitrarily-sized cluster using sge, see the appropriate
| universe.size using Rmpi, and start a cluster using mpi.spawn.Rslaves().
| However, it appears that all the slaves then run at 100% cpu on all nodes.
| Even using Rmpi under openmpi with a simple hostfile produces the same
| result. Any suggestions to figure out what is going on on the slaves?

There is a known issue with Open MPI and blocking which you may be hitting
here. Upstream Open MPI considers it a feature. But as this has come up a few
times on their mailing list as well, I believe the last word was that it will
go away in a future release.

Hth, Dirk

| Thanks,
| Sean
|
|
| > library(Rmpi)
| library(Rmpi)
| > mpi.universe.size()
| mpi.universe.size()
| [1] 24
| > mpi.spawn.Rslaves()
| mpi.spawn.Rslaves()
| 24 slaves are spawned successfully. 0 failed.
| master (rank 0 , comm 1) of size 25 is running on: Mahfouz
| slave1 (rank 1 , comm 1) of size 25 is running on: Mahfouz
| slave2 (rank 2 , comm 1) of size 25 is running on: Mahfouz
| slave3 (rank 3 , comm 1) of size 25 is running on: Mahfouz
| slave4 (rank 4 , comm 1) of size 25 is running on: Mahfouz
| slave5 (rank 5 , comm 1) of size 25 is running on: Mahfouz
| slave6 (rank 6 , comm 1) of size 25 is running on: Mahfouz
| slave7 (rank 7 , comm 1) of size 25 is running on: Mahfouz
| slave8 (rank 8 , comm 1) of size 25 is running on: Grass
| slave9 (rank 9 , comm 1) of size 25 is running on: Grass
| slave10 (rank 10, comm 1) of size 25 is running on: Grass
| slave11 (rank 11, comm 1) of size 25 is running on: Grass
| slave12 (rank 12, comm 1) of size 25 is running on: Grass
| slave13 (rank 13, comm 1) of size 25 is running on: Grass
| slave14 (rank 14, comm 1) of size 25 is running on: Grass
| slave15 (rank 15, comm 1) of size 25 is running on: Grass
| slave16 (rank 16, comm 1) of size 25 is running on: shakespeare
| slave17 (rank 17, comm 1) of size 25 is running on: shakespeare
| slave18 (rank 18, comm 1) of size 25 is running on: shakespeare
| slave19 (rank 19, comm 1) of size 25 is running on: shakespeare
| slave20 (rank 20, comm 1) of size 25 is running on: shakespeare
| slave21 (rank 21, comm 1) of size 25 is running on: shakespeare
| slave22 (rank 22, comm 1) of size 25 is running on: shakespeare
| slave23 (rank 23, comm 1) of size 25 is running on: shakespeare
| slave24 (rank 24, comm 1) of size 25 is running on: Mahfouz
| > mpi.close.Rslaves()
| mpi.close.Rslaves()
| [1] 1
|
| > sessionInfo() # on the master
| R version 2.9.0 Under development (unstable) (2009-02-21 r47969)
| x86_64-unknown-linux-gnu
|
| locale:
| LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
|
| attached base packages:
| [1] stats graphics grDevices utils datasets methods base
|
| other attached packages:
| [1] Rmpi_0.5-7
|
| [[alternative HTML version deleted]]
|
| _______________________________________________
| R-sig-hpc mailing list
| R-sig-hpc at r-project.org
| https://stat.ethz.ch/mailman/listinfo/r-sig-hpc

--
Three out of two people have difficulties with fractions.