Message-ID: <3d7901f0-6b06-89a9-d841-738d2a8307ec@gmail.com>
Date: 2018-09-19T14:03:35Z
From: Ben Bolker
Subject: Bias in R's random integers?
In-Reply-To: <CAARY7khVi==mZs5o_8fZkTUDDuY-ABwE=gA45COX7qbRgBLhsg@mail.gmail.com>

On 2018-09-19 09:40 AM, David Hugh-Jones wrote:
> On Wed, 19 Sep 2018 at 13:43, Duncan Murdoch <murdoch.duncan at gmail.com>
> wrote:
> 
>>
>> I think the analyses are correct, but I doubt if a change to the default
>> is likely to be accepted as it would make it more difficult to reproduce
>> older results.
> 
> 
> I'm a bit alarmed by the logic here. Unbiased sampling seems basic for a
> statistical language. As a consumer of R I'd like to think that e.g. my
> bootstrapped p values are correct.
> Surely if the old results depend on the biased algorithm, then they are
> false results?
> 

   Balancing backward compatibility and correctness is a tough problem
here.  If this goes into base R, what's the best way to do it?  What was
the protocol for migrating away from the "buggy Kinderman-Ramage"
generator, back in the day?   (Version 1.7 was sometime between 2001 and
2004).

  I couldn't find the exact commit in the GitHub mirror, but this one is
related:

https://github.com/wch/r-source/commit/7ad3044639fd1fe093c655e573fd1a67aa7f55f6#diff-dbcad570d4fb9b7005550ff630543b37



===
‘normal.kind’ can be ‘"Kinderman-Ramage"’, ‘"Buggy
     Kinderman-Ramage"’ (not for ‘set.seed’), ‘"Ahrens-Dieter"’,
     ‘"Box-Muller"’, ‘"Inversion"’ (the default), or ‘"user-supplied"’.
     (For inversion, see the reference in ‘qnorm’.)  The
     Kinderman-Ramage generator used in versions prior to 1.7.0 (now
     called ‘"Buggy"’) had several approximation errors and should only
     be used for reproduction of old results.
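
===

For illustration, a minimal sketch of how the opt-in described in that help
text might be used, assuming a current R session: the old generator is
selected through RNGkind() (the help text says set.seed() does not accept
it), and the previous setting is restored afterwards. The variable names
and the seed are made up for this example, and R may warn when the buggy
generator is selected.

```r
# Sketch only: variable names and the seed value are illustrative.
old <- RNGkind()                 # character vector; element 2 is the normal.kind in use

# Opt in to the pre-1.7.0 normal generator to reproduce old results.
# R may emit a warning here, so it is suppressed for the demonstration.
suppressWarnings(RNGkind(normal.kind = "Buggy Kinderman-Ramage"))
set.seed(42)
x_old <- rnorm(5)                # draws from the old ("Buggy") generator

# Restore whatever normal generator was active before.
RNGkind(normal.kind = old[2])
set.seed(42)
x_new <- rnorm(5)                # draws from the restored generator
```

The design point is that the corrected generator is the default, and the
old behaviour stays available only to users who ask for it by name.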