[Bioc-devel] BiocParallel-devel error
Hi Valerie,
Thanks for looking into this.
Yes, if I include the bogus 'MYR' in *.tmpl then I am getting the same
error in R-release as well.
To double-check whether it is related to some nodes on our cluster (ours
has different node architectures and the IB interconnect can be flaky at
times), I restricted the computation to two specific nodes for all
comparisons using nodes="1:ppn=1+n02+n03". As you can see below, the same
computation works in R-release with both BiocParallel and BatchJobs. However,
if I run it in R-devel it only works with BatchJobs.
Certainly, there could still be another problem with our specific
environment on the cluster; I am not sure.
For my specific application there is no rush to get things working in
BiocParallel right away. BatchJobs works fine for now.
Thomas
###############
## R-release ##
###############
library(BiocParallel); library(BatchJobs)
f <- function(i) system("hostname", intern=TRUE)
funs <- makeClusterFunctionsTorque("~/tmp/torque.tmpl")
param <- BatchJobsParam(4, resources=list(walltime="00:05:00", nodes="1:ppn=1+n02+n03", memory="1gb"), cluster.functions=funs)
register(param)
xx <- bplapply(1:4, f)
xx
[[1]]
[1] "n03"
[[2]]
[1] "n03"
[[3]]
[1] "n03"
[[4]]
[1] "n02"
library(BatchJobs)
loadConfig(conffile = "~/tmp/.BatchJobs.R")
reg <- makeRegistry(id="BatchJobTest", work.dir="results")
ids <- batchMap(reg, fun=f, 1:4)
done <- submitJobs(reg, resources=list(walltime="00:05:00", nodes="1:ppn=1+n02+n03", memory="1gb"))
sapply(1:4, function(x) loadResult(reg, x))
[1] "n03" "n03" "n03" "n02"
sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] C
attached base packages:
[1] stats graphics utils datasets grDevices methods base
other attached packages:
[1] BatchJobs_1.2 BBmisc_1.7 BiocParallel_0.6.1
loaded via a namespace (and not attached):
[1] BiocGenerics_0.10.0 DBI_0.2-7 RSQLite_0.11.4 Rcpp_0.11.2 brew_1.0-6 checkmate_1.0 codetools_0.2-8 digest_0.6.4 fail_1.2 foreach_1.4.2
[11] iterators_1.0.7 parallel_3.1.0 plyr_1.8.1 sendmailR_1.1-2 stringr_0.6.2 tools_3.1.0
#############
## R-devel ##
#############
library(BiocParallel); library(BatchJobs)
f <- function(i) system("hostname", intern=TRUE)
funs <- makeClusterFunctionsTorque("~/tmp/torque.tmpl")
param <- BatchJobsParam(4, resources=list(walltime="00:05:00", nodes="1:ppn=1+n02+n03", memory="1gb"), cluster.functions=funs)
register(param)
xx <- bplapply(1:4, f)
Error: 10 errors; first error:
For more information, use bplasterror(). To resume calculation, re-call the
function and set the argument 'BPRESUME' to TRUE or wrap the previous call in
bpresume().
bplasterror()
Error in vapply(head(which(is.error), n.print), f, character(1L)) :
values must be length 1, but FUN(X[[1]]) result is length 0
library(BatchJobs)
loadConfig(conffile = "~/tmp/.BatchJobs.R")
reg <- makeRegistry(id="BatchJobTest", work.dir="results")
ids <- batchMap(reg, fun=f, 1:4)
done <- submitJobs(reg, resources=list(walltime="00:05:00", nodes="1:ppn=1+n02+n03", memory="1gb"))
sapply(1:4, function(x) loadResult(reg, x))
[1] "n03" "n03" "n03" "n02"
sessionInfo()
R Under development (unstable) (2014-05-05 r65530)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] C
attached base packages:
[1] stats graphics utils datasets grDevices methods base
other attached packages:
[1] BatchJobs_1.3 BBmisc_1.7 BiocParallel_0.99.19
loaded via a namespace (and not attached):
[1] BiocGenerics_0.11.4 DBI_0.3.0 RSQLite_0.11.4 brew_1.0-6 checkmate_1.4 codetools_0.2-9 digest_0.6.4 fail_1.2 foreach_1.4.2 iterators_1.0.7
[11] parallel_3.2.0 sendmailR_1.1-2 stringr_0.6.2 tools_3.2.0
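For reference, the vapply() error reported by bplasterror() can be reproduced in plain R without a cluster. The sketch below is not BiocParallel's internal code. It assumes, following the diagnosis in the reply quoted further down, that a job which never ran ends up as an error object with an empty message. The names empty_err, results and safe_message are hypothetical and used only for illustration.

```r
# Assumption: a worker that returned nothing is represented as a simpleError
# whose message has length 0 (this mimics a NULL coerced to an error).
empty_err <- simpleError(character(0))

# Hypothetical list of job results: two hostnames and one failed job.
results <- list("n02", empty_err, "n03")
is.error <- vapply(results, inherits, logical(1L), what = "error")

# Collecting messages with vapply() requires exactly one string per error,
# so the zero-length message triggers an error of its own.
msg <- tryCatch(
    vapply(results[is.error], conditionMessage, character(1L)),
    error = function(e) conditionMessage(e)
)
print(msg)

# One possible defensive variant: substitute a placeholder when the
# message is missing or empty.
safe_message <- function(e) {
    m <- conditionMessage(e)
    if (length(m) == 1L && nzchar(m)) m else "<no error message returned by worker>"
}
safe <- vapply(results[is.error], safe_message, character(1L))
print(safe)
```

Whether a placeholder like this is the right way to report such jobs is a design decision for the package maintainers; the sketch only shows why an empty message breaks the reporting step.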
On Tue, Sep 23, 2014 at 09:41:44PM +0000, Valerie Obenchain wrote:
Hi,
Martin and I looked into this a bit. It looks like a problem with handling an 'undefined error' returned from a worker (i.e., job did not run). When there is a problem executing the tmpl script no error message is sent back. The NULL is coerced to simpleError and becomes a problem downstream when the error processing is expecting messages of length > 0.
You can reproduce the error by putting a typo in the script. For example replace R with something bogus such as MYR in this line:
MYR CMD --no-save --no-restore "<%= rscript %>" /dev/stdout
You said the script worked with release but not devel. Is it possible there's a problem with how R devel is being called on the cluster?
Michel Lang (cc'd) implemented BatchJobs in BiocParallel. I'd like to get his opinion on how he wants to handle this type of error.
Michel, let me know if you need more details, I can send another example off-line.
Valerie
On 09/22/2014 02:58 PM, Valerie Obenchain wrote:
Hi Thomas,
Just wanted to let you know I saw this and am looking into it.
Valerie
On 09/20/2014 02:54 PM, Thomas Girke wrote:
Hi Martin, Micheal and Vincent,
If I run the following code, with the release version of BiocParallel then it works (took me some time to actually realize that), but with the development version I am getting an error shown after the test code below. If I run the same test with BatchJobs from the devel branch alone then there is no problem. Thus, it seems there is some change in the devel version of BiocParallel causing this error?
The torque.tmpl file I am using on our cluster is the standard one from BatchJobs here:
https://github.com/tudo-r/BatchJobs/blob/master/examples/cfTorque/simple.tmpl
For my application, I could stick with BatchJobs, but it would be nicer if I could get things to work with BiocParallel.
Thanks,
Thomas
###############
## Test Code ##
###############
FUN <- function(i) system("hostname", intern=TRUE)
library(BiocParallel); library(BatchJobs)
funs <- makeClusterFunctionsTorque("torque.tmpl")
param <- BatchJobsParam(4, resources=list(walltime="48:00:00", nodes="1:ppn=4", memory="4gb"), cluster.functions=funs)
register(param)
xx <- bplapply(1:4, FUN)
Error: 4 errors; first error:
For more information, use bplasterror(). To resume calculation, re-call the
function and set the argument 'BPRESUME' to TRUE or wrap the previous call in
bpresume()
bplasterror()
Error in vapply(head(which(is.error), n.print), f, character(1L)) : values must be length 1, but FUN(X[[1]]) result is length 0
sessionInfo()
R Under development (unstable) (2014-05-05 r65530)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] C
attached base packages:
[1] stats graphics utils datasets grDevices methods base
other attached packages:
[1] BatchJobs_1.3 BBmisc_1.7 BiocParallel_0.99.19
loaded via a namespace (and not attached):
[1] BiocGenerics_0.11.4 DBI_0.3.0 RSQLite_0.11.4 brew_1.0-6 checkmate_1.4 codetools_0.2-9 digest_0.6.4 fail_1.2 foreach_1.4.2 iterators_1.0.7
[11] parallel_3.2.0 sendmailR_1.1-2 stringr_0.6.2 tools_3.2.0
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel