[Bioc-devel] bpparam Non-deterministic Default

7 messages · Martin Morgan, Dario Strbenac, Spencer Nystrom +2 more

#
Good day,

I maintain an R package which makes use of functions such as bplapply, which take bpparam() as the default BPPARAM. I have received feedback from a beginner user that his results changed when he knitted his R Markdown document a second time. This stems from the default constructor of bpparam(), which sets no RNGseed. I am wondering about the desirability of changing the RNGseed default in BiocParallel to a particular uncontroversial number, such as 12345, so that beginners get deterministic behaviour.

--------------------------------------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia
#
I'm not sure that this is a good idea? For instance R does not set the random number stream to be the same by default. Not sure what others might think... Martin

#
Hello,

Might it instead be made possible to set an RNGseed value by passing one to bpparam while still getting the automated back-end selection, so that the seed could easily be set to a particular value in an R package?

--------------------------------------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia
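
A rough sketch of how a package might approximate this today, assuming the installed BiocParallel provides the bpRNGseed<- setter. The helper name seededDefaultParam is hypothetical and 12345 is only the number suggested earlier in the thread. Params are reference objects, so it is an assumption here that changing the seed on the object returned by bpparam() is acceptable; it may also alter the registered default.

```r
library(BiocParallel)

# Hypothetical helper: keep the automatically selected back-end,
# but attach a fixed RNG seed to it.
seededDefaultParam <- function(seed = 12345L) {
    param <- bpparam()        # back-end chosen by BiocParallel's registry
    bpRNGseed(param) <- seed  # assumption: may modify the registered param too
    param
}

drawFour <- function() {
    unlist(bplapply(1:4, function(i) rnorm(1),
                    BPPARAM = seededDefaultParam()))
}

res1 <- drawFour()
res2 <- drawFour()
identical(res1, res2)  # both calls start from the same seed
```

Letting the user pass their own BPPARAM, with this only as a fallback, would keep the choice of seed visible to them.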
#
I agree with Martin. I think it is worse for beginners to falsely believe
their results are deterministic when they are not. This sounds like a
problem that should be solved with documentation and maybe even examples of
setting the RNG seed manually.

I also worry about advanced users and pollution of RNG seed global state by
having packages assign their own seed. Just sounds like a mess for little
gain.

  -Spencer
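
As a minimal sketch of what a documented example of setting the seed manually could look like: this assumes BiocParallel 1.28 or later, uses a two-worker SnowParam purely for illustration, and takes 12345 only because it is the number mentioned earlier in the thread. The helper name drawOnce is hypothetical.

```r
library(BiocParallel)

# Hypothetical helper: draw one random number per task on a freshly
# constructed back-end. seed = NULL leaves the RNG streams unseeded.
drawOnce <- function(seed = NULL) {
    param <- SnowParam(workers = 2, RNGseed = seed)
    unlist(bplapply(1:4, function(i) rnorm(1), BPPARAM = param))
}

# Same explicit seed, so the two runs produce the same draws.
seeded1 <- drawOnce(seed = 12345)
seeded2 <- drawOnce(seed = 12345)
identical(seeded1, seeded2)

# No seed given, so nothing guarantees that these two runs agree.
unseeded1 <- drawOnce()
unseeded2 <- drawOnce()
identical(unseeded1, unseeded2)
```

Because the seed is an explicit argument in visible code, the reader can see where the determinism comes from and change it.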


#
Hi All,

Sorry, I'm not an advanced user. What is the disadvantage of pulling the default from .Random.seed? As a non-advanced user, it took me way too long to figure out that calling set.seed() doesn't make these results reproducible.

Thanks,
Ellis

#
This GitHub issue, although lengthy, discusses some of the technical
difficulties associated with pulling from the random stream. I've linked to
a comment from Martin that discusses this in particular, but there's other
good stuff in the rest of the issue.

https://github.com/Bioconductor/BiocParallel/pull/140#issuecomment-921627153

   -Spencer


2 days later
#
This should be solved by the vignette calling set.seed in a visible code
chunk, and explaining why the seed is set and why this is not done
automatically.
