Rmpi spawning across nodes.
3 messages · Ben Weinstein, Brian G. Peterson, Stephen Weston
On Thu, 2012-03-29 at 14:26 -0400, Ben Weinstein wrote:
Hello all, I've seen multiple posts on this subject but haven't been able to clearly understand the issue; there must be something small I'm missing. I am trying to get Rmpi, foreach, and snow working on a Beowulf cluster running Debian. I have succeeded in installing Rmpi, changing the paths, etc. However, I am left with my original problem: when cl <- makeCluster(4, type = "MPI") spawns slaves, they are always on the same node!
<... snip ...>
My dream scenario is to use both processors on each node and pass information between nodes in separate workflows. I appreciate any suggestions. There is nothing wrong with my R script in terms of code; it works great using doMC and foreach with 8 cores on my desktop. However, I have not been able to register the correct cores when using the university cluster. I am new at this, so please forgive my lack of precise terms.
I'll suggest a few things. First, if you want to use MPI, doMPI is less fragile than doSNOW for an MPI cluster. Second, I recall that there was a config file that needed to be set to define the worker machines for our MPI cluster (I don't use MPI routinely anymore). It had some interaction, which I don't recall in detail, that should be described in the doMPI vignette. Third, if you want to use foreach and parallelization in an interactive session, as you can with doMC or doParallel, I recommend looking into doRedis. Regards, - Brian
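[For readers following along: a minimal doMPI registration might look like the sketch below. This is an illustrative example, not from the original thread; it assumes the doMPI and foreach packages are installed, and the loop body is a placeholder.]

```r
# Sketch: register a doMPI backend for foreach.
# Launch under mpirun with -np 1; startMPIcluster() spawns the
# workers across the nodes mpirun knows about.
library(doMPI)

cl <- startMPIcluster()      # workers come from mpirun's allocation
registerDoMPI(cl)

# Placeholder parallel loop; replace with real work.
results <- foreach(i = 1:8, .combine = c) %dopar% {
  sqrt(i)
}
print(results)

closeCluster(cl)
mpi.quit()
```

With this pattern the number of workers is controlled by the MPI launch, not by an argument inside the script, which is what keeps the spawned slaves off a single node.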
Brian G. Peterson http://braverock.com/brian/ Ph: 773-459-4973 IM: bgpbraverock
Hi Ben, You have to run R via mpirun, otherwise all of the workers start on the one node.
I have tried using mpirun -np 4 in front of the R call, but this just fails without a message.
You have to use '-np 1'; otherwise your script will be executed by mpirun four times, each copy trying to spawn four workers. I'm not sure if that explains it failing without a message, however. Try something like this:

#!/bin/bash
#PBS -o 'qsub.out'
#PBS -e 'qsub.err'
#PBS -l nodes=4:ppn=1
#PBS -m bea

cat $PBS_NODEFILE
hostname
cd $PBS_O_WORKDIR

# Run an R script
mpirun -hostfile $PBS_NODEFILE -np 1 R --slave -f /nfs/user08/bw4sz/Files/Seawulf.R

You may not need to use '-hostfile $PBS_NODEFILE', depending on how your Open MPI was built, but I don't think it ever hurts, and it may be required for your installation. - Steve
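[A quick way to confirm the workers actually landed on different nodes is to ask each one for its hostname. The sketch below is illustrative, not from the thread; it uses Rmpi directly and assumes it is run via the mpirun command above with -np 1.]

```r
# Sketch: spawn slaves from the single mpirun-launched process and
# report which node each slave is running on.
library(Rmpi)

mpi.spawn.Rslaves(nslaves = 4)   # one spawn call; placement follows the hostfile

# Each slave evaluates Sys.info() and returns its node name; with a
# correct hostfile these should span the nodes in $PBS_NODEFILE.
print(mpi.remote.exec(Sys.info()[["nodename"]]))

mpi.close.Rslaves()
mpi.quit()
```

If every slave reports the same node name, the hostfile is not reaching Open MPI and the '-hostfile $PBS_NODEFILE' argument is likely required.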