nested parallel workers

5 messages · Valerie Obenchain, Simon Urbanek

#
Hi Simon,

I'm having trouble with nested parallel workers, specifically, forking
inside socket connections.

When mclapply is called inside a SOCK, PSOCK or FORK worker I get an
error in unserialize().

cl <- makeCluster(1, "SOCK")

fun = function(i) {
   library(parallel)
   mclapply(1:2, sqrt)
}

Failure occurs after multiple calls to clusterApply:

 > clusterApply(cl, 1, fun)
[[1]]
[[1]][[1]]
[1] 1

[[1]][[2]]
[1] 1.414214

 > clusterApply(cl, 1, fun)
[[1]]
[[1]][[1]]
[1] 1

[[1]][[2]]
[1] 1.414214

 > clusterApply(cl, 1, fun)
Error in unserialize(node$con) : error reading from connection


This example is from Martin and may be a different problem.

~/tmp >cat test1.R
## like mclapply
## should run 'forever' but terminates semi-randomly
library(parallel)
children <- parallel:::children

while (TRUE) {
     n <- 8            ## n == detectCores()
     jobs <- lapply(seq_len(n), function(i) mcparallel(Sys.sleep(20)))
     mccollect(children(jobs), FALSE)
     parallel:::mckill(children(jobs), tools::SIGTERM)
     leni <- length(mccollect(children(jobs)))
     message("leni: ", leni)
}

~/tmp >R-dev --vanilla --slave -f test1.R
leni: 6
leni: 7
leni: 7
leni: 7
leni: 7
leni: 7
leni: 7
leni: 7
leni: 8
leni: 7
leni: 7
leni: 7
~/tmp >


Thanks.
Valerie


 > sessionInfo()
R Under development (unstable) (2015-03-18 r68009)
Platform: x86_64-unknown-linux-gnu (64-bit)
Running under: Fedora 21 (Twenty One)

locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

loaded via a namespace (and not attached):
[1] snow_0.3-13
#
On Mar 25, 2015, at 3:46 PM, Valerie Obenchain <vobencha at fredhutch.org> wrote:

            
You simply can't, by definition: when you fork, *all* the workers share the same connection inherited from the parent, so you cannot use any I/O operations that you didn't start in the worker, since reading in one worker affects all the workers.

Cheers,
Simon
#
On 03/25/2015 07:48 PM, Simon Urbanek wrote:
Sorry if I'm missing the obvious here. I thought that since the fork
workers were shut down by the time the SOCK worker returned to its
master, conflicting I/O wouldn't be a problem.

There are quite a few examples floating around where SOCK workers are
spawned on a cluster and multicore workers are called within them. If I
understand correctly, this should not be done (or at least not
encouraged). Instead, nested parallelism should only be done with
distributed-memory workers (SOCK, MPI, etc.).

Thanks.
Valerie
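
[Editor's sketch] The alternative described above, nesting only socket-based workers, might look roughly like the following. This is a minimal sketch, not something posted in the thread: the names nested_fun and inner are hypothetical, and it assumes the worker is allowed to start further R processes and open another local port.

```r
library(parallel)

## Outer cluster: one socket worker, as in the original report.
cl <- makeCluster(1, type = "PSOCK")

## Hypothetical replacement for 'fun': instead of forking with mclapply(),
## the worker starts its own small socket cluster, so no child process
## inherits the worker's connection to the master.
nested_fun <- function(i) {
    library(parallel)
    inner <- makeCluster(2, type = "PSOCK")
    on.exit(stopCluster(inner))      # always shut the inner workers down
    parLapply(inner, 1:2, sqrt)
}

res <- clusterApply(cl, 1, nested_fun)
stopCluster(cl)
```

Starting fresh R processes is slower than forking, so this trades start-up cost for workers that share no file descriptors with the outer worker.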
#
On Mar 30, 2015, at 4:40 PM, Valerie Obenchain <vobencha at fredhutch.org> wrote:

            
If the workers are done and don't use I/O then all is well. However, it's not easy to guarantee that they don't use I/O, since they all already come with active sockets, so e.g. on exit they may flush the socket buffers, which would confuse the recipient. Interestingly, your example works fine on OS X but fails on Linux. I'll try to dig deeper in a quiet minute. In principle it should be sufficient to close all FDs right away, which you can do when using mcparallel() but not when using mclapply().

Cheers,
Simon
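
[Editor's sketch] For reference, the mclapply(1:2, sqrt) call from the original report can be written with mcparallel() and mccollect(), which puts the whole expression evaluated in each child under the caller's control. This is only a sketch under assumptions: the thread does not say which call would close the inherited FDs, so that step is left as a comment rather than guessed, and the code is Unix-only because it relies on forking.

```r
library(parallel)

## One forked child per input value; each child evaluates the braces.
jobs <- lapply(1:2, function(x) {
    mcparallel({
        ## Any cleanup of inherited file descriptors would have to happen
        ## here, before other work. The thread does not name the call,
        ## so none is shown.
        sqrt(x)
    })
})

## Wait for the children and collect their results, in the order of 'jobs'.
res <- unname(mccollect(jobs))
```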
#
On 03/30/2015 02:51 PM, Simon Urbanek wrote:
I see. Thanks for the explanation.

Valerie