BUG?: On Linux setTimeLimit() fails to propagate timeout error when it occurs (works on Windows)

1 message · Henrik Bengtsson

On Mon, Oct 31, 2016 at 9:36 AM, <luke-tierney at uiowa.edu> wrote:
Thanks.

So, if I understand correctly, my example suggesting that
setTimeLimit() doesn't work properly on Linux was misleading, mainly
because I chose Sys.sleep() for it; setTimeLimit() does indeed work in
most cases (connections being the exception).  For example, this works:

slowfcn <- function(time) {
  t0 <- Sys.time()
  while (Sys.time() - t0 < time) Sys.sleep(0.1)
  TRUE
}
setTimeLimit(elapsed = 1.0)
system.time(slowfcn(3))
## Error in Sys.sleep(0.1) : reached elapsed time limit
## Timing stopped at: 0.004 0 1.008
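
To make the timeout something the caller can act on, the error can be
caught with tryCatch().  The following is only a minimal sketch, not
code from this thread: with_time_limit() is a hypothetical helper name,
slowfcn() is redefined with explicit units so the block is
self-contained, and matching on the English error message is an
assumption that will not hold in a localized session.

```r
## Same idea as slowfcn() above, with explicit units for the comparison.
slowfcn <- function(time) {
  t0 <- Sys.time()
  while (as.numeric(difftime(Sys.time(), t0, units = "secs")) < time) {
    Sys.sleep(0.1)
  }
  TRUE
}

## Hypothetical helper (not part of base R): evaluate 'expr' under an
## elapsed-time limit and return 'on_timeout' if the limit is reached.
with_time_limit <- function(expr, elapsed, on_timeout = NULL) {
  setTimeLimit(elapsed = elapsed, transient = TRUE)
  ## Always remove the limit again, whether or not it was reached.
  on.exit(setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE))
  tryCatch(expr, error = function(e) {
    ## Assumption: English message text, as in the output shown above.
    if (grepl("reached elapsed time limit", conditionMessage(e),
              fixed = TRUE)) {
      on_timeout
    } else {
      stop(e)  # not a timeout: re-signal the original error
    }
  })
}

res_slow <- with_time_limit(slowfcn(3), elapsed = 1.0, on_timeout = NA)
res_fast <- with_time_limit(slowfcn(0.2), elapsed = 5.0, on_timeout = NA)
```

Here res_slow should come back as the on_timeout value and res_fast as
TRUE, provided the code being timed reaches a point where R checks the
limit, which, per the above, is not the case for connections.
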
Yes, true. I didn't want to sidetrack the discussion too much, but
I've started to make some standalone improvements based on
parallel:::newPSOCKnode() & parallel:::.slaveRSOCK(), such as more
control options for launching remote workers, say, over SSH with
reverse tunneling (no need for port forwarding) and then running
Rscript within a Docker container, e.g.

https://github.com/HenrikBengtsson/future/blob/develop/R/makeClusterPSOCK.R
https://github.com/HenrikBengtsson/future/blob/develop/incl/makeClusterPSOCK.R

This part is fully backward compatible with makePSOCKcluster() and
could eventually be implemented in parallel itself.  The next level
up could be to make the worker loop handle connection-setup
timeouts and similar.

By now I'm fairly ok with testing and validating remote SSH access
etc., but I think it's possible to make exception handling a little bit
more automatic and informative, particularly the part detecting when
the connection and worker setup never actually happens.  For a
newcomer, it can be quite a challenge to troubleshoot why the setup of
remote workers doesn't work.

Thanks,

Henrik