doMC and reproducible parallel numbers under plyr
Since the doMC package now uses the "parallel" package, you can use
the same techniques as when using mclapply directly. The documentation
(actually for mcparallel) says:
    The behaviour with 'mc.set.seed = TRUE' is different only if
    'RNGkind("L'Ecuyer-CMRG")' has been selected. Then each time a
    child is forked it is given the next stream (see 'nextRNGStream').
    So if you select that generator, set a seed and call
    'mc.reset.stream' just before the first use of 'mcparallel' the
    results of simulations will be reproducible provided the same
    tasks are given to the first, second, ... forked process.
I haven't tried this with plyr, but it's worth a try.
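
A minimal sketch of that recipe, assuming a Unix-alike system (forking is not available on Windows) and two cores. The helper names run_once and run_plyr are made up for illustration. The first part uses mclapply directly, which is the case the documentation describes; the second part applies the same steps through plyr with .parallel=TRUE and doMC, and runs only if both packages are installed. Whether the plyr variant gives identical draws is exactly the untested part, so treat it as something to check rather than a guarantee.

```r
library(parallel)

# Select the generator that supports independent streams for forked children.
RNGkind("L'Ecuyer-CMRG")

# Hypothetical helper: one random draw per task, using mclapply directly.
run_once <- function(seed, n_tasks = 4, cores = 2) {
  set.seed(seed)       # seed the generator in the parent process
  mc.reset.stream()    # restart the child streams from that seed
  unlist(mclapply(seq_len(n_tasks), function(i) rnorm(1),
                  mc.cores = cores, mc.set.seed = TRUE))
}

a <- run_once(42)
b <- run_once(42)
print(identical(a, b))

# Same steps via plyr + doMC (untested assumption: doMC passes the
# work on to mclapply with its default seed handling).
if (requireNamespace("plyr", quietly = TRUE) &&
    requireNamespace("doMC", quietly = TRUE)) {
  doMC::registerDoMC(cores = 2)
  # Hypothetical helper mirroring run_once, but going through llply.
  run_plyr <- function(seed) {
    set.seed(seed)
    mc.reset.stream()
    unlist(plyr::llply(1:4, function(i) rnorm(1), .parallel = TRUE))
  }
  print(identical(run_plyr(42), run_plyr(42)))
}
```

Note the condition in the quoted documentation: the same tasks must go to the same forked processes, so the number of cores and the scheduling need to stay fixed between runs.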
- Steve
On Wed, Feb 26, 2014 at 1:06 PM, Aaron King <kingaa at umich.edu> wrote:
I've run into a little bit of frustration trying to combine plyr, foreach,
and doMC to get reproducible results. It's straightforward to achieve
reproducibility when using one of the other foreach backends (doSNOW,
doMPI, for instance), but there are times when you want to take advantage
of the shared-memory capacity of multicore machines. doRNG is nice in that
it makes it easy to get fully reproducible results when using foreach
directly, but when you use foreach only via plyr + .parallel=TRUE, you don't
have the option of using doRNG.
Has anyone figured out how to get fully reproducible results when using
plyr + doMC?
Aaron
--
Aaron A. King, Ph.D.
Ecology & Evolutionary Biology
Mathematics
Center for the Study of Complex Systems
University of Michigan
GPG Public Key: 0x15780975
_______________________________________________ R-sig-hpc mailing list R-sig-hpc at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-hpc