parallel::mclapply() dummy function on Windows?
On Sat, 8 Oct 2011, John Fox wrote:
Dear Martin, I don't have an opinion about whether what Tim wants to do is a good idea, but was responding to his comment that he would need "parallel=FALSE flags all over the place." Why could he not simply define mclapply <- if (.Platform$OS.type == "windows") base::lapply else parallel::mclapply in his package?
Because mclapply has additional arguments that would be passed by lapply to FUN as part of ... . We are contemplating having wrappers of mclapply and pvec on Windows equivalent to the behaviour with mc.cores = 1 on Unix. But that is nothing to do with the original specious claim to which I responded: if you want good parallel performance for most users you also need to support both parLapply and mclapply (or at least, parLapply with a fork cluster). I think the import issue is a red herring: these functions are not called often enough for parallel::mclapply to be inefficient. And really importFrom is only better practice for things that will always be used, since it moves the computation from as-needed to every time the package is loaded.
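The point about additional arguments can be illustrated with a short sketch. It assumes two hypothetical helpers, drop_mc_args and mclapply_compat (neither is part of 'parallel'), which discard mc.* arguments before falling back to lapply, roughly the mc.cores = 1 behaviour mentioned above. One caveat of this sketch: an argument of FUN whose own name begins with "mc." would also be dropped in the serial branch.

```r
# Hypothetical helper: remove arguments meant for mclapply (mc.cores,
# mc.preschedule, ...) so that lapply does not forward them to FUN.
drop_mc_args <- function(args) {
  nms <- names(args)
  if (is.null(nms)) nms <- rep("", length(args))
  args[!startsWith(nms, "mc.")]
}

# Hypothetical wrapper, not part of 'parallel'.
mclapply_compat <- function(X, FUN, ...) {
  if (.Platform$OS.type == "windows") {
    # Serial fallback: strip mc.* arguments, then call lapply.
    do.call(lapply, c(list(X, FUN), drop_mc_args(list(...))))
  } else {
    parallel::mclapply(X, FUN, ...)
  }
}

# A bare alias to lapply fails because mc.cores is passed on to FUN,
# which does not accept it.
alias_result <- tryCatch(
  lapply(1:3, function(x) x^2, mc.cores = 2L),
  error = function(e) "error"
)
```

With this in place, a call such as mclapply_compat(X, FUN, mc.cores = 4L) can be written once and run on either platform, at the cost of silently running serially on Windows.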
Best, John
-----Original Message-----
From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org] On Behalf Of Martin Morgan
Sent: October-08-11 8:16 AM
To: John Fox
Cc: ttriche at usc.edu; 'Prof Brian Ripley'; 'r-devel'
Subject: Re: [Rd] parallel::mclapply() dummy function on Windows?

On 10/07/2011 06:03 PM, John Fox wrote:
Dear Tim,
-----Original Message-----
From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org] On Behalf Of Tim Triche, Jr.
Sent: October-07-11 3:05 PM
To: Prof Brian Ripley
Cc: r-devel
Subject: Re: [Rd] parallel::mclapply() dummy function on Windows?

On Thu, Oct 6, 2011 at 11:25 PM, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
Why would it make it easier? And how could using a dummy for 'most users' (who are on Windows) offer them 'good parallel support'?
Good point. Most of my users are on unix, because my use of mclapply() is primarily to expedite processing of raw scanner data. Only a handful of users for the packages that call mclapply() are on Windows. Right now, I default to having parallel=FALSE flags all over the place, but I'd prefer for the default to be "go as fast as practical in the common case", i.e., Unix.
It would have been more accurate for me to say "I would like to parallelize by default, without having the methods fail on Windows in the default configuration" than to claim that I want "good parallel support" for Windows.
When I have tried using the foreach/doMC combination in the past, it has not worked out satisfactorily, so I don't know how well I can support Windows users... period.
Why don't you just apply the approach you initially suggested in your own package, defining mclapply() the way you want it?
Hi John et al.,
Individual packages will become littered with ad hoc solutions, constructed without, for instance, the wisdom and experience of Prof. Ripley about platforms or environments in which it is appropriate to use mclapply. For instance, Tim's pseudo-code if (Windows) ... translated as if (.Platform$OS.type == "windows") doesn't sound like it's the correct test; at least
exists("mclapply", getNamespace("parallel"))
but probably more. Also, doesn't parallel's name space differ between platforms, requiring the package author to import(parallel) rather than the better practice of importFrom(parallel, mclapply)?
Martin
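A minimal sketch of the feature test suggested above, as opposed to a test on the operating system. The helper name has_mclapply is hypothetical, and the check only establishes that the function exists in the installed 'parallel' namespace; as noted, deciding whether it is appropriate to use in a given environment probably needs more than this.

```r
# Hypothetical helper: TRUE if the installed 'parallel' namespace
# defines an object called mclapply.
has_mclapply <- function() {
  requireNamespace("parallel", quietly = TRUE) &&
    exists("mclapply", envir = asNamespace("parallel"), inherits = FALSE)
}

# Pick the apply function once, based on the feature test rather than
# on .Platform$OS.type.
apply_fun <- if (has_mclapply()) parallel::mclapply else base::lapply
```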
I hope this helps, John
Take a look at e.g. package 'boot' to see how to offer alternatives. (A version that uses 'parallel' is pending on CRAN, or see http://www.stats.ox.ac.uk/pub/R/boot_1.3-3.tar.gz.) Package 'parallel' may in future offer a higher-level abstraction layer that offers such a choice, but as the 'boot' code shows, deciding what to send to the workers in a snow-style cluster is not simple.
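One way a package might offer such alternatives is sketched below. This is an illustration under stated assumptions, not the code used in 'boot': the function name lapply_backend and its backend, ncpus and cl arguments are hypothetical. It chooses between a serial lapply, a forked mclapply and a snow-style parLapply.

```r
# Hypothetical dispatcher; names and defaults are illustrative assumptions.
lapply_backend <- function(X, FUN, ...,
                           backend = c("no", "multicore", "snow"),
                           ncpus = 1L, cl = NULL) {
  backend <- match.arg(backend)
  # Run serially when no parallelism is requested, or when forking is
  # requested on a platform where it is unavailable.
  if (ncpus <= 1L && is.null(cl)) backend <- "no"
  if (backend == "multicore" && .Platform$OS.type == "windows") backend <- "no"
  switch(backend,
    no = lapply(X, FUN, ...),
    multicore = parallel::mclapply(X, FUN, ..., mc.cores = ncpus),
    snow = {
      if (is.null(cl)) {
        # Create a temporary socket cluster and stop it on exit.
        cl <- parallel::makePSOCKcluster(ncpus)
        on.exit(parallel::stopCluster(cl), add = TRUE)
      }
      # Only X, FUN and ... are sent to the workers; anything else FUN
      # relies on must be exported separately (e.g. parallel::clusterExport).
      parallel::parLapply(cl, X, FUN, ...)
    }
  )
}
```

The snow branch is where the difficulty mentioned above shows up: the workers start as fresh R sessions, so the caller has to decide which objects and packages they need.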
It seems similar to what I do (off topic: why do you use the file extension '.q' for all of the R/S code files?): pass flags around. I suppose I was just being lazy, but I would love to default to "go as fast as possible" without having Windows users get left out in the cold (unless they add flags to their function calls). Thank you for your suggestions, I will look into this further.
--
Tim Triche, Jr.
USC Biostatistics
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595