[Bioc-devel] BiocParallel-devel error

2 messages · Thomas Girke, Vincent Carey

#
Hi Valerie,

Excellent. In addition to collecting log outputs, I have a few more 
suggestions that may be worth considering:

- Collecting the results from parallel computing tasks directly in an R
  object is a great convenience, which I like a lot. However, for slow
  computations there should be an option to write the results to files
  instead and then assemble them in R in a second step that the user
  controls. Perhaps this is possible already, but it is not clear to me
  what the intended way of doing this is.
  
- A much higher level of fault tolerance, through options to restart
  failed jobs, is another extremely important feature for parallel
  computations. This may only be possible if results are temporarily
  stored in files. For instance, if I farm out a computation to 10
  compute nodes and one of them crashes, I want to be able to use the
  results from the 9 completed tasks and easily restart the computation
  assigned to the crashed node, so that I get the final result quickly.
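
To make the two points above concrete, here is a rough base R sketch of
the pattern, not of anything BiocParallel provides. The helper names
run_to_files() and collect_results() are made up for illustration, and
the plain for loop only stands in for whatever parallel back end would
really run the tasks. Each task writes its result to its own file, so a
rerun skips finished tasks and recomputes only the ones that failed.

```r
# Hypothetical helper: run each task and save its result to one file per
# task. Tasks whose result file already exists are skipped, so a rerun
# only recomputes what failed or never finished.
run_to_files <- function(inputs, fun, out_dir) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  for (i in seq_along(inputs)) {
    out_file <- file.path(out_dir, sprintf("task_%03d.rds", i))
    if (file.exists(out_file)) next            # finished in an earlier run
    res <- tryCatch(fun(inputs[[i]]), error = function(e) e)
    if (!inherits(res, "error")) saveRDS(res, out_file)
  }
  invisible(out_dir)
}

# Hypothetical helper for the second, user-controlled step: assemble
# whatever has completed so far and report which tasks are still missing.
collect_results <- function(n, out_dir) {
  files <- file.path(out_dir, sprintf("task_%03d.rds", seq_len(n)))
  done <- file.exists(files)
  results <- vector("list", n)
  results[done] <- lapply(files[done], readRDS)
  list(results = results, failed = which(!done))
}

# Simulated run in which the third task "crashes".
out_dir <- tempfile("results_")
crashy <- function(x) if (x == 3) stop("simulated node crash") else x^2
run_to_files(1:5, crashy, out_dir)
first <- collect_results(5, out_dir)           # results of 4 tasks usable

# Rerun with a working function: only the missing task is recomputed.
run_to_files(1:5, function(x) x^2, out_dir)
second <- collect_results(5, out_dir)
```

After the first pass, first$failed identifies the task to restart while
the other results are already available; after the second pass nothing
is missing.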
  
BatchJobs provides most of these facilities. Making it easier and/or
more obvious how to use these utilities from within BiocParallel may be
all that is needed.

Thomas
On Thu, Nov 20, 2014 at 04:43:54PM +0000, Valerie Obenchain wrote:
#
On Thu, Nov 20, 2014 at 12:17 PM, Thomas Girke <thomas.girke at ucr.edu> wrote:

            
I would agree with this.  I just noticed that setting cleanup=FALSE in
the BatchJobsParam allows retention of the work.dir and thus the
registry and jobs files.

It is not clear to me how to use BiocParallel when all one wants to do
is establish a registry and populate it, without waiting for the
loadResults that is carried out with bplapply.  Currently I just work
with BatchJobs directly.
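
For readers unfamiliar with that workflow, working with BatchJobs
directly might look roughly like the sketch below. It assumes the
BatchJobs package is installed and uses its default configuration; the
registry id, the directory under tempdir(), and the toy function are
placeholders chosen for illustration. With the default interactive
cluster functions submitJobs() runs the jobs in the current session,
whereas on a configured scheduler it should return once the jobs are
queued.

```r
library(BatchJobs)

# Create a file-backed registry. 'file.dir' holds the job definitions and,
# later, the results. The location here is a placeholder.
reg_dir <- file.path(tempdir(), "demo_registry")
reg <- makeRegistry(id = "demo", file.dir = reg_dir)

# Populate the registry: one job per input element.
batchMap(reg, function(x) x^2, 1:10)

# Submit the jobs. Results are written below 'reg_dir', not returned here.
submitJobs(reg)

# Later, possibly from a fresh R session: reopen the registry, check
# progress, and load only the jobs that have finished.
reg <- loadRegistry(reg_dir)
showStatus(reg)
done <- findDone(reg)
partial <- loadResults(reg, ids = done)
```

The point of the sketch is that populating the registry and loading the
results are separate steps, so the user decides when to collect.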