--
Steve Weston
REvolution Computing
One Century Tower | 265 Church Street, Suite 1006
New Haven, CT 06510
P: 203-777-7442 x266 | www.revolution-computing.com
On Thu, Apr 16, 2009 at 4:52 AM, Matthieu Stigler
<matthieu.stigler at gmail.com> wrote:
luke at stat.uiowa.edu wrote:
On Wed, 15 Apr 2009, Matthieu Stigler wrote:
On Tue, Apr 14, 2009 at 5:29 AM, Matthieu Stigler
<matthieu.stigler at gmail.com> wrote:
So it is now working on the local computer. However, when I try to use
the external computer, it seems to work, but nothing happens after it
asks for the last password...
This tells you that "something went wrong". The basic strategy in this
case is to use the "outfile" option to hopefully capture an error message.
You might need to set outfile differently for different slaves,
particularly if you're starting more than one on the same machine, but I
suggest just starting one slave on 210 to avoid the issue. So do something
like:
host210 <- list(host = "mat@192.100.100.210", rscript = "/usr/bin/Rscript",
                outfile = "/tmp/log.txt")
cl2 <- makeCluster(list(host210), type = "SOCK")
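If you do start more than one slave on the same machine, each one can be given its own log file. This is a minimal sketch, assuming snow accepts a list of per-host option lists as in the call above; the host entries and the log file naming scheme are only illustrative:

```r
# Illustrative host entries: two slaves on the same machine (210).
hosts <- c("mat@192.100.100.210", "mat@192.100.100.210")

# One spec per slave, each with a distinct outfile so that the logs
# of different slaves do not overwrite each other.
specs <- lapply(seq_along(hosts), function(i) {
  list(host = hosts[i],
       rscript = "/usr/bin/Rscript",
       outfile = sprintf("/tmp/log-%d.txt", i))
})

# Not run here: needs the snow package and ssh access to the hosts.
# library(snow)
# cl2 <- makeCluster(specs, type = "SOCK")
```

After a failed start, /tmp/log-1.txt and /tmp/log-2.txt on 210 can then be inspected separately.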
Ok, thanks for pointing out this method.
I tried it and got the following error message. It does not seem to be
computer specific (I also tried with another host, 213, and from host 213
to 212, and always got the same error message):
starting worker for ubuntu:10187
Error in socketConnection(master, port = port, blocking = TRUE, open = "a+b") :
  unable to open connection
Calls: local ... slaveLoop -> recvData -> makeSOCKmaster -> socketConnection
In addition: Warning message:
In socketConnection(master, port = port, blocking = TRUE, open = "a+b") :
  ubuntu:10187 cannot be opened
Execution halted
Is it related to ssh or snow? I did not find any reference to this problem
when googling for it...
It is an issue with your ability to make a socket connection to the
master. Most likely the master computer has a firewall that is
blocking connections to the port snow uses. Try turning the firewall
off or at least enabling the port in the error message.
A simple test is to do
socketConnection(port = 10187, server = TRUE)
in an R session on the master and
telnet ubuntu 10187
in a shell on your worker machine (assuming your master is called
ubuntu) (or you can use R and
socketConnection("ubuntu", port = 10187)
in an R session on the worker).
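The client side of this test can be wrapped in a small helper that reports TRUE or FALSE instead of stopping with an error. This is only a sketch: can_connect is a hypothetical name, not part of R or snow, and the 5 second timeout is an arbitrary choice. Note that the server-side call on the master waits for a client to connect (or for its timeout to expire), so it is expected to look as if nothing happens until the worker connects.

```r
# Hypothetical helper: try to open a client socket to host:port and
# report whether the connection could be established.
can_connect <- function(host, port, timeout = 5) {
  con <- tryCatch(
    suppressWarnings(
      socketConnection(host, port = port, blocking = TRUE,
                       open = "a+b", timeout = timeout)
    ),
    error = function(e) NULL   # connection refused, timed out, etc.
  )
  if (is.null(con)) return(FALSE)
  close(con)
  TRUE
}

# On the worker, while the master is listening on the port:
# can_connect("ubuntu", 10187)
```

A FALSE result while the master is listening points to a name resolution, routing, or firewall problem between the two machines rather than to snow itself.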
luke
Thanks Luke and Dirk for your help!
I don't think it is a firewall problem, as both machines have all ports
open (the default with iptables, as I understand it), and the network admin
even opened port 10187.
I first tried the three tests suggested; none of them seems to give good
results:
$telnet 192.100.100.212 10187
Trying 192.100.100.212...
telnet: Unable to connect to remote host: Connection refused
R>socketConnection(port = 10187, server=TRUE)
#nothing happens... is it right?
R > socketConnection("192.100.100.212", port = 10187)
Error in socketConnection("192.100.100.212", port = 10187) :
unable to open connection
In addition: Warning message:
In socketConnection("192.100.100.212", port = 10187) :
192.100.100.212:10187 cannot be opened
I get the same error message when using "ubuntu", dsge@192.100.100.212, etc.
On an Ubuntu forum, someone told me that one has to start a server
listening on the port (sorry, my explanations are not good, as I don't
understand the subject that well :-( ).
So, after launching on the master (212):
$nc -l -p 10187
one is then able to connect from 210:
$telnet 192.100.100.212 10187
Trying 192.100.100.212...
Connected to 192.100.100.212.
Escape character is '^]'.
So it seems to work, but it has no effect on the previous commands:
socketConnection and makeCluster still claim that 10187 can't be opened.
With this information, is the picture clearer to you, or even murkier?
Thanks a lot for your help!
Matthieu
Thanks a lot for your help!!
If it hangs, go to another terminal, ssh to 192.100.100.210, and look at
the contents of /tmp/log.txt, and hopefully that will provide a clue to
the problem.
Another approach is to use the "manual" option. That will print the
command that you should use to manually start each of the slaves.
You just ssh to that machine from another terminal, and cut and paste
the printed command to start the slave. If you set "outfile" to an empty
string, then output messages will go right to that terminal.
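Put together, that could look like the following sketch. It assumes the same host entry as before; start_manual is just a hypothetical wrapper name so that nothing is started until you call it.

```r
# Same host spec as before, but with outfile = "" so that the slave's
# messages go to the terminal in which it is started.
host210 <- list(host = "mat@192.100.100.210",
                rscript = "/usr/bin/Rscript",
                outfile = "")

# Hypothetical wrapper: needs the snow package. With manual = TRUE,
# makeCluster prints the command to run on the slave machine and then
# waits for that slave to connect.
start_manual <- function(spec) {
  library(snow)
  makeCluster(list(spec), type = "SOCK", manual = TRUE)
}

# Not run here:
# cl2 <- start_manual(host210)
```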
--
Steve Weston