[Bioc-devel] reproducible with mclapply?
set.seed can be used in different ways. The approach suggested in the Stack Overflow post mentioned in this thread is basically a two-stage process. First, the user provides a seed (set.seed(1)); this lets the user change the outcome from run to run. Based on that seed, a vector of randomized seeds is generated (seeds <- sample.int(length(input), replace=TRUE)). Those seeds are then passed as arguments to the function called under mclapply/lapply, where they control random number generation for each iteration (set.seed(seeds[idx])). So set.seed plays two different roles. The first lets the user control random number generation; the second (within the function) makes sure the random numbers are the same for each individual iteration regardless of how the loop is executed. Does that make sense?
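The two-stage scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the Stack Overflow post: the names permute_one and run_permutations are hypothetical, and the per-iteration seeds are drawn from the full integer range rather than from length(input), because drawing from a small range with replacement can hand two iterations the same seed.

```r
library(parallel)

# Hypothetical worker: reseed from the precomputed vector, then permute.
permute_one <- function(idx, seeds) {
  set.seed(seeds[idx])   # stage 2: fixes the random numbers for this iteration
  sample(1:10)
}

# Hypothetical wrapper; `user_seed` is the only seed the user has to choose.
run_permutations <- function(n, cores, user_seed) {
  set.seed(user_seed)    # stage 1: user-controlled seed
  # Assumption: draw seeds from the full integer range, not from 1:n.
  seeds <- sample.int(.Machine$integer.max, n)
  mclapply(seq_len(n), permute_one, seeds = seeds, mc.cores = cores)
}

# mc.cores > 1 is not supported on Windows, so fall back to 1 there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
serial <- run_permutations(10, cores = 1L, user_seed = 1)
forked <- run_permutations(10, cores = cores, user_seed = 1)
identical(serial, forked)
```

Because every iteration reseeds itself before sampling, the comparison at the end should not depend on how the iterations are distributed over cores.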
On Wed, Jun 3, 2015 at 7:07 PM, Yu, Guangchuang <gcyu at connect.hku.hk> wrote:
There is one possible solution posted at http://stackoverflow.com/questions/30610375/how-to-run-permutations-using-mclapply-in-a-reproducible-way-regardless-of-numbe/30627984#30627984 . As Kasper suggested, using set.seed inside a package is not proper. I suggest adding a parameter, for example seed=FALSE, that disables set.seed by default; if the user wants reproducible results, e.g. in a demonstration, they can set seed=TRUE explicitly and set.seed will be run inside the function.

Bests,
Guangchuang

On Wed, Jun 3, 2015 at 8:42 PM, Kasper Daniel Hansen <kasperdanielhansen at gmail.com> wrote:
For this situation, generate the permutation indices outside of the mclapply call, and then run mclapply over a list of those indices. And by the way, please don't use set.seed inside a package; that control should be left completely to the user.

Best,
Kasper

On Wed, Jun 3, 2015 at 7:08 AM, Vincent Carey <stvjc at channing.harvard.edu> wrote:
This document indicates how to achieve reproducibility independent of the underlying physical environment.
http://cran.r-project.org/web/packages/doRNG/vignettes/doRNG.pdf
Let me know if that satisfies the question.

On Wed, Jun 3, 2015 at 5:32 AM, Yu, Guangchuang <gcyu at connect.hku.hk> wrote:
Dear Vincent,
RNGkind("L'Ecuyer-CMRG") behaves the same as using mc.set.seed=FALSE: when mc.cores changes, the output is no longer reproducible.
I think this issue is also of concern within the Bioconductor community, as parallel versions of permutation tests are commonly used now.
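One way to make the result independent of mc.cores while staying with L'Ecuyer-CMRG is to give every task its own RNG stream, created up front with nextRNGStream() from the parallel package. This appears to be the idea behind the doRNG vignette linked in this thread, but the sketch below uses only base R and is not taken from that package; reproducible_mclapply is a hypothetical helper name.

```r
library(parallel)

# Hypothetical helper: run FUN(i) for i in 1..n, each with its own RNG stream.
reproducible_mclapply <- function(n, FUN, seed, cores) {
  old_kind <- RNGkind("L'Ecuyer-CMRG")        # returns the previous kinds
  on.exit(RNGkind(old_kind[1]), add = TRUE)   # restore the generator on exit
  set.seed(seed)

  # One stream per task, derived only from `seed`, never from `cores`.
  streams <- vector("list", n)
  s <- get(".Random.seed", envir = globalenv())
  for (i in seq_len(n)) {
    streams[[i]] <- s
    s <- nextRNGStream(s)
  }

  mclapply(seq_len(n), function(i) {
    # Install this task's stream before drawing any random numbers.
    assign(".Random.seed", streams[[i]], envir = globalenv())
    FUN(i)
  }, mc.cores = cores, mc.set.seed = FALSE)
}

# mc.cores > 1 is not supported on Windows, so fall back to 1 there.
many <- if (.Platform$OS.type == "windows") 1L else 4L
a <- reproducible_mclapply(10, function(i) sample(1:10), seed = 0, cores = 1L)
b <- reproducible_mclapply(10, function(i) sample(1:10), seed = 0, cores = many)
identical(a, b)
```

Since each task installs its own stream before sampling, the two results are expected to match whatever core count is used. Note that the helper changes the caller's RNG state as a side effect.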
Best Regards,
Guangchuang

On Wed, Jun 3, 2015 at 5:17 PM, Vincent Carey <stvjc at channing.harvard.edu> wrote:
Hi, this question belongs on R-help, but perhaps
will be useful.

Best regards

On Wed, Jun 3, 2015 at 3:11 AM, Yu, Guangchuang <gcyu at connect.hku.hk> wrote:
Dear all,
I have an issue with setting the seed value when using the parallel package.
library("parallel")
library("digest")
set.seed(0)
m <- mclapply(1:10, function(x) sample(1:10),
              mc.cores=2)
digest(m, 'crc32')
[1] "4827c80c"
set.seed(0)
m <- mclapply(1:10, function(x) sample(1:10),
              mc.cores=2)
digest(m, 'crc32')
[1] "e95b9134"

By default, set.seed() is ignored, since mclapply sets the seed internally. If we use mc.set.seed=FALSE to disable this feature, it works as indicated below:
set.seed(0)
m <- mclapply(1:10, function(x) sample(1:10),
              mc.cores=2, mc.set.seed = FALSE)
digest(m, 'crc32')
[1] "6bbada78"
set.seed(0)
m <- mclapply(1:10, function(x) sample(1:10),
              mc.cores=2, mc.set.seed = FALSE)
digest(m, 'crc32')
[1] "6bbada78"

The problem is that the results also depend on the number of cores.
set.seed(0)
m <- mclapply(1:10, function(x) sample(1:10),
              mc.cores=4, mc.set.seed = FALSE)
digest(m, 'crc32')
[1] "a22e0aab"

Any idea?

Best Regards,
Guangchuang

--
Guangchuang Yu, PhD Candidate
State Key Laboratory of Emerging Infectious Diseases
School of Public Health
The University of Hong Kong
Hong Kong SAR, China
www: http://ygc.name
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel
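Kasper's suggestion earlier in this thread (generate the permutation indices first, then run mclapply over them) can be sketched as below. All randomness is consumed in the parent process before any worker starts, so the number of cores cannot change the result, and the seed stays in the user's script rather than in package code. The data x, the grouping group, and the helper perm_stat are made up for illustration.

```r
library(parallel)

# User's script, not package code: the user decides whether and how to seed.
set.seed(0)

# Made-up data: 20 values in two groups of 10.
x <- rnorm(20)
group <- rep(c("a", "b"), each = 10)

# Draw all permutation indices up front, in the parent process.
perms <- replicate(100, sample(length(x)), simplify = FALSE)

# Hypothetical statistic: difference in group means after permuting x.
# It uses no random numbers, so workers never touch the RNG.
perm_stat <- function(idx) {
  xp <- x[idx]
  mean(xp[group == "a"]) - mean(xp[group == "b"])
}

# mc.cores > 1 is not supported on Windows, so fall back to 1 there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
res_serial <- lapply(perms, perm_stat)
res_forked <- mclapply(perms, perm_stat, mc.cores = cores)
identical(res_serial, res_forked)
```

The trade-off is memory: the full list of permutation indices has to exist before the parallel step begins.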