Problem Running 'snow' on Several Machines
Unfortunately, there are dozens of possible reasons why makeSOCKcluster
could hang, so it's impossible for us to guess what your particular problem is.
If you're unlucky, you may have several issues to resolve before you get
this running.
However, one common problem is that one of the workers (in your case
"192.168.1.101") may be unable to connect back to the master process using
the default name used by makeSOCKcluster. It may work better to use the
dot-separated IP address, which you might be able to compute using:
> master <- nsl(Sys.info()["nodename"])
Then create the cluster with:
> cl <- makeSOCKcluster(c("localhost", "192.168.1.101"), master=master)
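If you'd like to confirm that name resolution really is the issue, you could ask the worker to look up the master's host name over the SSH login you already have. This is only a rough sketch: resolve_check_cmd is a hypothetical helper (not part of snow), and it assumes that getent is installed on the worker, which is usual on Linux but not guaranteed.

```r
# Hypothetical helper (not part of snow): build an ssh command that asks
# the worker to resolve the master's host name with getent.
resolve_check_cmd <- function(worker, master_name) {
  sprintf("ssh %s getent hosts %s",
          shQuote(worker, type = "sh"),
          shQuote(master_name, type = "sh"))
}

# The node name of this (master) machine, as reported by the OS.
master_name <- Sys.info()[["nodename"]]

cmd <- resolve_check_cmd("192.168.1.101", master_name)
cat(cmd, "\n")

# Uncomment to actually run the check from R (needs the worker to be
# reachable). A non-zero status suggests the worker cannot resolve the name.
# status <- system(cmd)
```

If the lookup fails, or returns a loopback address such as 127.0.0.1, passing the dot-separated IP address via master= as above is more likely to help.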
But if that shot in the dark doesn't work, I suggest debugging it by setting
the manual argument to TRUE:
> cl <- makeSOCKcluster(c("localhost", "192.168.1.101"), manual=TRUE)
makeSOCKcluster will then tell you the command to use to start each of the
workers. Open a terminal on each worker, and execute the commands as it
prompts you. You should then see the actual error message, and finding the
solution will be much easier.
- Steve
On Fri, Apr 22, 2011 at 5:30 AM, Michael Smith <my.r.help at gmail.com> wrote:
All,
I want to run the following program:
library("snow")
cl <- makeSOCKcluster(c("localhost", "192.168.1.101"))
clusterCall(cl, function() Sys.info()[c("nodename","machine")])
stopCluster(cl)
I have set up SSH keys so I don't need to type a password when logging
in to 192.168.1.101. I'm using Fedora 13 on both machines.
What happens is that the program hangs at 'makeSOCKcluster': it does not
return to the R prompt and I have to hit Ctrl-C. The cluster is never
established:
$ R -q
> library("snow")

Attaching package: 'snow'

The following object(s) are masked from 'package:base':

    enquote

> cl <- makeSOCKcluster(c("localhost", "192.168.1.101"))

Attaching package: 'snow'

The following object(s) are masked from 'package:base':

    enquote


Attaching package: 'snow'

The following object(s) are masked from 'package:base':

    enquote

^C
> clusterCall(cl, function() Sys.info()[c("nodename","machine")])
Error in inherits(cl, "cluster") : object 'cl' not found
When I use makeSOCKcluster(c("localhost", "localhost")) everything works
fine locally. Below is more information on the system.
Thanks,
Michael
> sessionInfo()
R version 2.12.2 (2011-02-25)
Platform: x86_64-redhat-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
 [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
 [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
 [7] LC_PAPER=en_US.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] snow_0.3-3

> Sys.getlocale()
[1] "LC_CTYPE=en_US.utf8;LC_NUMERIC=C;LC_TIME=en_US.utf8;LC_COLLATE=en_US.utf8;LC_MONETARY=C;LC_MESSAGES=en_US.utf8;LC_PAPER=en_US.utf8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.utf8;LC_IDENTIFICATION=C"

> library(help=snow)

                Information on package 'snow'

Description:

Package:       snow
Title:         Simple Network of Workstations
Version:       0.3-3
Author:        Luke Tierney, A. J. Rossini, Na Li, H. Sevcikova
Description:   Support for simple parallel computing in R.
Maintainer:    Luke Tierney <luke at stat.uiowa.edu>
Suggests:      Rmpi,rpvm,rlecuyer,rsprng,nws
License:       GPL
Depends:       R (>= 2.3), utils
Packaged:      Fri Jul 4 18:07:30 2008; luke
Built:         R 2.12.2; ; 2011-04-21 03:02:00 UTC; unix

Index:

clusterSetupRNG         Uniform Random Number Generation in SNOW
                        Clusters
clusterSplit            Cluster-Level SNOW Functions
getMPIcluster           Starting and Stopping SNOW Clusters
parLapply               Higher Level SNOW Functions
_______________________________________________
R-sig-hpc mailing list
R-sig-hpc at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-hpc