[R-pkg-devel] Change in normal random numbers between R 3.5.3 and R 3.6.0

9 messages · Mark Sharp, Ulrike Grömping, Martin Maechler +1 more

#
Dear R package authors,

I am currently struggling with differences in test results between R 
versions 3.5.3 and 3.6.0: There are expected ones (from the new behavior 
of sample(), which can be switched off by 
RNGkind(sample.kind="Rounding")) and unexpected ones: The normal random 
numbers from a call "y <- rnorm(36)" have changed between R versions, in 
spite of working with a seed. Consequently, most other outcomes change 
as well, as they depend on those random numbers.

Does anyone have an idea what causes this behavior? You see 
samples below (ignore the first columns, these can be made alike by 
changing sample.kind); the normal random numbers in the last column 
were the bottom ones before version 3.6.0 and are the top ones afterwards.
...

Can anyone explain why this is the case, or how I could possibly 
circumvent it (for some noLD checks)?

Best regards,

Ulrike
#
On 09/05/2019 9:15 a.m., Ulrike Grömping wrote:
I am not seeing a difference in rnorm(), but I would expect to see one 
if the new sample() code was used, as it can make a different number of 
calls to the underlying RNG.

That is:  I'd expect this code to give identical results in both 
versions, whether I used val <- 10 or any other value:

val <- 10
set.seed(val)
rnorm(36)

This code does not give identical results, because the calls to sample() 
will result in different changes to the seed:

val <- 10
set.seed(val)
discard <- sample(1000, 100)
rnorm(36)
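
A minimal sketch of the point above, assuming R >= 3.6.0 (where RNGkind() accepts sample.kind). The helper name draw_after_sample() is hypothetical and not part of the thread. It runs the same seed and the same sample() call under both samplers and then compares the normal draws that follow:

```r
# Sketch only: needs R >= 3.6.0, where RNGkind() gained the sample.kind argument.
# draw_after_sample() is a hypothetical helper, not part of the thread.
draw_after_sample <- function(sample_kind, seed = 10) {
  # "Rounding" warns about the non-uniform sampler, so the warning is muffled here.
  suppressWarnings(RNGkind(sample.kind = sample_kind))
  set.seed(seed)
  discard <- sample(1000, 100)  # advances the RNG stream
  rnorm(5)                      # drawn from wherever the stream is now
}

with_rounding  <- draw_after_sample("Rounding")
with_rejection <- draw_after_sample("Rejection")  # also leaves the 3.6.0 default in place

# Whether this is TRUE depends on how many uniforms each sampler consumed.
identical(with_rounding, with_rejection)
```

Calling the helper twice with the same sampler should always give the same normals, so any difference comes from the sampler choice rather than from rnorm() itself.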

Duncan Murdoch
#
Hmmmh, but does that also apply if the sample.kind has been set to the 
old version? I.e., would

if (getRversion()>="3.6.0") RNGkind(sample.kind="Rounding")
val <- 10
set.seed(val)
discard <- sample(1000, 100)
rnorm(36)

produce the same normal random numbers in 3.5.3 and 3.6.0? I would have 
expected it to, but it seems to produce the same normal random numbers 
as R version 3.6.0 in the previous version of the test code without the 
RNGkind call.

Best, Ulrike

On 09.05.2019 at 16:19, Duncan Murdoch wrote:
#
On 09/05/2019 12:43 p.m., Ulrike Grömping wrote:
I'm not seeing that, but I'm not using the exact versions you tested. 
If I run your code in  "R version 3.5.2 (2018-12-20)" and "R Under 
development (unstable) (2019-05-02 r76454)" I get this output from both:

 > if (getRversion()>="3.6.0") RNGkind(sample.kind="Rounding")
 > val <- 10
 > set.seed(val)
 > discard <- sample(1000, 100)
 > rnorm(36)
  [1] -0.4006375 -0.3345566  1.3679540  2.1377671  0.5058193  0.7863424 -0.9022119  0.5328970 -0.6458943  0.2909875 -1.2375945
 [12] -0.4561763 -0.8303227  0.3401156  1.0663764  1.2161258  0.7356907 -0.4812086  0.5627448 -1.2463197  0.3809222 -1.4304273
 [23] -1.0484455 -0.2185036 -1.4899362  1.1727063 -1.4798270 -0.4303878 -1.0516386  1.5225863  0.5928281 -0.2226615  0.7128943
 [34]  0.7166008  0.4402419  0.1588306

Okay, I just installed 3.6.0, and I get the same values there.  I don't 
see a Mac binary for 3.5.3, so I can't test that one.

Duncan Murdoch
#
I was dealing with a similar issue but in the context of getting the same unit test code to work on multiple versions of R in a Travis-CI build. It seems RNGkind(sample.kind="Rounding") does not work prior to version 3.6, so I resorted to using version-dependent construction of the argument list to set.seed() in do.call().

A better solution will be greatly appreciated.

#' Workaround for unit tests using sample()
#'
#' @param seed argument to \code{set.seed}
set_seed <- function(seed = 1) {
  # sample.kind only exists in R >= 3.6.0, so build the argument list by version.
  # getRversion() compares version components, unlike major + minor / 10 arithmetic.
  if (getRversion() >= "3.6.0") {
    args <- list(seed, sample.kind = "Rounding")
  } else {
    args <- list(seed)
  }
  # Asking for "Rounding" raises a warning in R >= 3.6.0; muffle it.
  suppressWarnings(do.call(set.seed, args))
}

Mark

R. Mark Sharp, Ph.D.
Data Scientist and Biomedical Statistical Consultant
7526 Meadow Green St.
San Antonio, TX 78251
mobile: 210-218-2868
rmsharp at me.com
#
Mark,

I used

if (getRversion()>="3.6.0") RNGkind(sample.kind="Rounding")

And that works. Actually, using rnorm afterwards also yields the same random numbers.
My question arose from the fact that I confused myself about the noLD output I was supposed to reproduce. Therefore, my problem should be entirely explained by Duncan Murdoch's initial explanation: the sample() change does not only lead to different results in discrete sampling but also to different results from random number calls for other functions (like rnorm).

Best, Ulrike

On 10.05.2019 at 04:58, R. Mark Sharp wrote:
#
Ulrike,

RNGkind() worked on 3.4.1 and 3.6.0 but generated a warning on R 3.5.3. The message follows:

 checking whether package ‘nprcmanager’ can be installed ... WARNING
Found the following significant warnings:
  Note: possible error in 'RNGkind(sample.kind = "Rounding")': unused argument (sample.kind = "Rounding") 
See ‘/tmp/RtmpA4S3Ki/file32b81c218ac8/nprcmanager.Rcheck/00install.out’ for details.
Information on the location(s) of code generating the ‘Note’s can be
obtained by re-running with environment variable R_KEEP_PKG_SOURCE set
to ‘yes’.

Mark
R. Mark Sharp
rmsharp at me.com
#
> Mark,
    > I used

    > if (getRversion()>="3.6.0") RNGkind(sample.kind="Rounding")

    > And that works. Actually, using rnorm afterwards also
    > yields the same random numbers.

Yes, "of course",  'sample.kind' was only introduced into 3.6.0.
We had always recommended

   RNGversion("3.5.0")

possibly wrapped in  suppressWarnings().
That *does* work in old and new versions of R.

Note that in R >= 3.6.0 , e.g., inside your if(.) { ** }
you could also use  set.seed(<n>, sample.kind="Rounding")

Martin


    > My question arose from the fact that I confused myself about the noLD output I was supposed to reproduce. Therefore, my problem should be entirely explained by Duncan Murdoch's initial explanation: the sample() change does not only lead to different results in discrete sampling but also to different results from random number calls for other functions (like rnorm)

*IFF* called after sample() [etc].
So yes, do call   RNGversion("3.5.0")
before set.seed() before the first call to sample() / sample.int()
or functions using those [or 'rwilcox()', see its help in R >= 3.6.0!].
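
As a rough sketch of that ordering, assuming a test script that compares against saved reference output (the seed and the calls are taken from the example earlier in this thread; the variable names idx and y are only illustrative):

```r
# Sketch of the recommended order in a test script with saved results.
# 1. Select the pre-3.6.0 RNG behaviour; this call is accepted by old and new R.
#    On R >= 3.6.0 it warns about the non-uniform sampler, hence suppressWarnings().
suppressWarnings(RNGversion("3.5.0"))

# 2. Seed only after the RNG version has been chosen.
set.seed(10)

# 3. Only now call sample() and anything that depends on the stream afterwards.
idx <- sample(1000, 100)
y <- rnorm(36)
```

If the setup matches the one Duncan tested, y would be expected to start with the values he posted, but that is worth verifying on the R versions you actually target.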

Martin


    > Best, Ulrike

    > -- 
    > ##############################################
    > ## Prof. Ulrike Groemping
    > ## FB II
    > ## Beuth University of Applied Sciences Berlin
    > ##############################################
    > ## prof.beuth-hochschule.de/groemping
    > ## Phone: +49(0)30 4504 5127
    > ## Fax:   +49(0)30 4504 66 5127
    > ## Home office: +49(0)30 394 04 863
    > ##############################################

    > ______________________________________________
    > R-package-devel at r-project.org mailing list
    > https://stat.ethz.ch/mailman/listinfo/r-package-devel
#
On 10/05/2019 8:55 a.m., Martin Maechler wrote:
That's good advice.  There's a couple of other things I'd add:

  - Be sure to do this only in test code with saved results.  Functions 
should never need to do this, and you really don't want to leave that 
setting in place after an example.

  - At some point in the future (maybe in a year or so), update your 
package to depend on "R (>= 3.6.0)", remove the RNGversion() line and 
update saved test results.
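
One way to keep the setting from leaking out of a test, sketched under the assumption that a small wrapper is acceptable in your test code. The name with_legacy_rng() is hypothetical; it records the current RNGkind() and puts it back on exit:

```r
# Hypothetical test helper: run `expr` with the pre-3.6.0 RNG settings and a
# fixed seed, then restore whatever RNG kinds were active before.
with_legacy_rng <- function(seed, expr) {
  old_kind <- RNGkind()  # length 2 before R 3.6.0, length 3 from 3.6.0 on
  on.exit(suppressWarnings(do.call(RNGkind, as.list(old_kind))), add = TRUE)
  suppressWarnings(RNGversion("3.5.0"))
  set.seed(seed)
  expr  # the promise is evaluated here, under the legacy settings
}

res <- with_legacy_rng(10, {
  discard <- sample(1000, 100)
  rnorm(36)
})
```

Restoring the kinds does not restore the previous seed, so this is meant for test code with saved results only, in line with the advice above.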

Duncan Murdoch