Can't start multi node SNOW cluster

6 messages · Xiaobo Gu, Stephen Weston

#
Hi,

First of all, my environment is:
64-bit Windows 7 Home Basic with the firewall off and the IP address 192.168.72.1, 64-bit R 2.14.1, snow 0.3.8 and doSNOW 1.0.5.

The first test scenario works:

cl <- makeCluster(4, type = "SOCK")
clusterApply(cl, 1:2, get("+"), 3)
stopCluster(cl)

Before starting the two-node cluster, I tried the multi-node syntax on the local machine:

winOptions <-
    list(host="192.168.72.1",
         rscript="D:/Amber/Program/R/R-2.14.1/bin/x64/Rscript.exe",
         snowlib="C:/Users/dell/Documents/R/win-library/2.14")
cl <- makeCluster(c(rep(winOptions, 2)), type = "SOCK")

The makeCluster call just hung.

Then I ran the same test on a single-node 64-bit CentOS Linux system, using the same versions of R, snow and doSNOW.

Again, cl <- makeCluster(4, type = "SOCK") works, but with the multi-node syntax I get ssh errors:

lixOptions <-
  list(host="hdp1",
       rscript="/opt/r2141/lib64/R/bin/Rscript",
       snowlib="/opt/r2141/lib64/R/library")
The authenticity of host '192.168.72.7 (192.168.72.7)' can't be established.
RSA key fingerprint is 03:61:3b:4f:72:60:a7:a3:2a:8f:25:16:be:02:56:7c.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.72.7' (RSA) to the list of known hosts.
ssh: /opt/r2141/lib64/R/bin/Rscript: Name or service not known
ssh: /opt/r2141/lib64/R/bin/Rscript: Name or service not known

Can you help me find what's wrong here?

I have another question about multi-computer snow clusters:
because the master process must log on to the remote machine to start the worker processes, where can I specify the user name and password? In modern Windows server environments, users must log on before doing anything on the machine.


Regards,

Xiaobo Gu
#
It seems there is a bug in snow or in the sample code.
It seems snow treats every entry in the options list as a separate worker:
+     list(host="192.168.72.1",
+          rscript="D:/Amber/Program/R/R-2.14.1/bin/x64/Rscript.exe",
+          snowlib="C:/Users/dell/Documents/R/win-library/2.14")
Manually start worker on 192.168.72.1 with
     D:/Amber/Program/R/R-214~1.1/bin/Rscript.exe C:/Users/dell/Documents/R/win-library/2.14/snow/RSOCKnode.R MASTER=DELL-PC PORT=10187 OUT=D:/rworkout.txt SNOWLIB=C:/Users/dell/Documents/R/win-library/2.14 
Manually start worker on D:/Amber/Program/R/R-2.14.1/bin/x64/Rscript.exe with
     D:/Amber/Program/R/R-214~1.1/bin/Rscript.exe C:/Users/dell/Documents/R/win-library/2.14/snow/RSOCKnode.R MASTER=DELL-PC PORT=10187 OUT=D:/rworkout.txt SNOWLIB=C:/Users/dell/Documents/R/win-library/2.14 
Manually start worker on C:/Users/dell/Documents/R/win-library/2.14 with
     D:/Amber/Program/R/R-214~1.1/bin/Rscript.exe C:/Users/dell/Documents/R/win-library/2.14/snow/RSOCKnode.R MASTER=DELL-PC PORT=10187 OUT=D:/rworkout.txt SNOWLIB=C:/Users/dell/Documents/R/win-library/2.14 
Manually start worker on 192.168.72.1 with
     D:/Amber/Program/R/R-214~1.1/bin/Rscript.exe C:/Users/dell/Documents/R/win-library/2.14/snow/RSOCKnode.R MASTER=DELL-PC PORT=10187 OUT=D:/rworkout.txt SNOWLIB=C:/Users/dell/Documents/R/win-library/2.14 
Manually start worker on D:/Amber/Program/R/R-2.14.1/bin/x64/Rscript.exe with
     D:/Amber/Program/R/R-214~1.1/bin/Rscript.exe C:/Users/dell/Documents/R/win-library/2.14/snow/RSOCKnode.R MASTER=DELL-PC PORT=10187 OUT=D:/rworkout.txt SNOWLIB=C:/Users/dell/Documents/R/win-library/2.14 
Manually start worker on C:/Users/dell/Documents/R/win-library/2.14 with
     D:/Amber/Program/R/R-214~1.1/bin/Rscript.exe C:/Users/dell/Documents/R/win-library/2.14/snow/RSOCKnode.R MASTER=DELL-PC PORT=10187 OUT=D:/rworkout.txt SNOWLIB=C:/Users/dell/Documents/R/win-library/2.14
[[1]]
[1] 4

[[2]]
[1] 5
Xiaobo Gu

#
Further tests show that it is probably a connection problem on the Windows platform.

On the CentOS Linux platform, the following works:
cl <- makeCluster(c(rep("192.168.72.21", 2),rep("192.168.72.22", 2),rep("192.168.72.23", 2)), type = "SOCK") 

And on the Windows platform, the following works too:
cl <- makeCluster(c("192.168.72.1","192.168.72.1"), type = "SOCK", manual=TRUE) 
even if 192.168.72.1 is a remote machine, but the following does not work:

cl <- makeCluster(c("192.168.72.1","192.168.72.1"), type = "SOCK") 

I think this is because snow uses ssh to log on to the remote machine, but Windows does not have an ssh application, and the remote server does not accept ssh connections.
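
A quick way to test the first half of that guess from the master's R session is sketched below. It assumes that snow starts SOCK workers by running a command called "ssh" found on the PATH, which is an inference from the error messages in this thread rather than something checked against the snow source. It only checks for a client on the master; it says nothing about whether the remote machine runs an ssh server.

```r
# Minimal check, assuming snow launches remote SOCK workers through an
# "ssh" command on the master's PATH (an assumption, see above).
ssh_path <- Sys.which("ssh")            # "" when no ssh executable is found
has_ssh  <- unname(nzchar(ssh_path))    # TRUE when an ssh client is available

if (has_ssh) {
  message("ssh client found at: ", ssh_path)
} else {
  message("no ssh client on the PATH; automatic startup of remote ",
          "SOCK workers is unlikely to work from this machine")
}
```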

Has anyone successfully run a SNOW cluster on multiple Windows computers?

Regards,
Xiaobo Gu

1 day later
#
On Sat, Feb 4, 2012 at 11:35 PM, Xiaobo Gu <guxiaobo1982 at gmail.com> wrote:
I think you should use something like:

    cl <- makeCluster(lapply(1:2, function(i) winOptions), type = "SOCK", manual=TRUE)

To allow options to be passed in with the host names, you need to
specify a list of lists.  Using "c(rep(winOptions, 2))" concatenates winOptions
to itself, producing one flat list of six strings, which makes snow think
that you're passing in a simple list of hostnames.
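
The difference between the two structures can be seen without snow at all. The sketch below uses only base R and the winOptions values from the original post; the variable names flat and nested are made up for illustration.

```r
winOptions <- list(host    = "192.168.72.1",
                   rscript = "D:/Amber/Program/R/R-2.14.1/bin/x64/Rscript.exe",
                   snowlib = "C:/Users/dell/Documents/R/win-library/2.14")

# rep() on a list repeats its elements, so the result is one flat list.
flat   <- c(rep(winOptions, 2))
# lapply() returns one element per worker, each a complete option list.
nested <- lapply(1:2, function(i) winOptions)

length(flat)          # 6: six entries, each a bare string
length(nested)        # 2: two workers
is.list(flat[[2]])    # FALSE: the second "host" is really the Rscript path
is.list(nested[[2]])  # TRUE: a full option list for the second worker

# rep(list(winOptions), 2) builds the same nested structure as lapply().
```

This matches the manual-mode output earlier in the thread, where six workers were requested and the Rscript and snowlib paths showed up as host names.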

- Steve
1 day later
#
Thanks, that works, but snow still can't start remote workers automatically.
#
On Tue, Feb 7, 2012 at 11:19 AM, Xiaobo Gu <guxiaobo1982 at gmail.com> wrote:
I thought that you had already figured out that the problem was
not having sshd running on your Windows cluster.

With snow, I believe you have three options for running on multiple
nodes:

1) Use a socket cluster which depends on ssh/sshd;
2) Use an MPI cluster which depends on an MPI installation that
works with Rmpi;
3) Use manual mode which obviously isn't automatic.
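
As a rough sketch, the three options might map onto makeCluster arguments as follows. The helper cluster_args is hypothetical (it is not part of snow), and the argument name spec is an assumption about makeCluster's signature; the host addresses are the ones used earlier in the thread. Nothing here starts a cluster, so it runs without snow, ssh or MPI installed.

```r
# Hypothetical helper, not part of snow: returns the arguments one might
# pass to snow::makeCluster() for each of the three options above.
cluster_args <- function(mode, hosts) {
  switch(mode,
    ssh    = list(spec = hosts, type = "SOCK"),                 # needs ssh/sshd
    mpi    = list(spec = length(hosts), type = "MPI"),          # needs Rmpi
    manual = list(spec = hosts, type = "SOCK", manual = TRUE),  # start by hand
    stop("unknown mode: ", mode))
}

hosts <- c("192.168.72.21", "192.168.72.22")
args  <- cluster_args("manual", hosts)
str(args)

# With snow installed, the cluster could then be created with something like:
# cl <- do.call(snow::makeCluster, args)
# snow::stopCluster(cl)
```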

I thought that some people have managed to use snow on Windows clusters,
at least using MPI, but I don't really use Windows, so I can't really
help there.

- Steve