difference between rnorm(1000, 0, 1) and running rnorm(500, 0, 1) twice - R-help

Tue, Feb 7, 2006 7:21 PM

Hi R users

This looks like a simple question.

Is there any difference between rnorm(1000,0,1) and running
rnorm(500,0,1) twice in terms of outcome?

TM

Romain Francois

Wed, Feb 8, 2006 12:51 AM

On 08.02.2006 04:21, Taka Matzmoto wrote:

Not here :

R> set.seed(1)
R> x <- rnorm(1000, 0, 1)
R> set.seed(1)
R> y <- rnorm(500, 0, 1)
R> z <- rnorm(500, 0, 1)
R> all(x == c(y,z))
[1] TRUE

Romain

+---------------------------------------------------------------+
| Romain FRANCOIS - http://francoisromain.free.fr               |
| Doctorant INRIA Futurs / EDF                                  |
+---------------------------------------------------------------+

Philippe GROSJEAN

Wed, Feb 8, 2006 1:08 AM

Romain Francois wrote:

Indeed! The pseudo-random number generator is initialized to the same
state in both cases, and thus returns the same 1000 pseudo-random
numbers. So, no difference.
Best,

Philippe Grosjean

Bjørn-Helge Mevik

Wed, Feb 8, 2006 1:53 AM

Why don't you test it yourself?

E.g.,

set.seed(42)
bob1 <- rnorm(1000,0,1)
set.seed(42)
bob2 <- rnorm(500,0,1)
bob3 <- rnorm(500,0,1)
identical(bob1, c(bob2, bob3))

I won't tell you the answer. :-)

Bjørn-Helge Mevik

Duncan Murdoch

Wed, Feb 8, 2006 4:22 AM

On 2/8/2006 4:53 AM, Bjørn-Helge Mevik wrote:

This isn't really something that can be proved by a test.  Perhaps the 
current implementation makes those equal only because 500 is even, or 
divisible by 5, or whatever...

I think the intention is that those should be equal, but in a quick 
search I've been unable to find a documented guarantee of that.  So I 
would take a defensive stance and assume that there may be conditions 
where c(rnorm(m), rnorm(n)) is not equal to rnorm(m+n).

If someone can point out the document I missed, I'd appreciate it.

Duncan Murdoch

Ted Harding

Wed, Feb 8, 2006 5:26 AM

On 08-Feb-06 Duncan Murdoch wrote:

As I understand it, once the seed is set, the sequence generated
by the underlying RNG is determined, whether it is produced by
a single call for a long sequence or by multiple calls for
many shorter sequences. Example:

[1] 0

[1] 0

In other words: identical!
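
The code for the example above did not survive in the archive; only its two lines of output remain. As a rough sketch of the kind of check that would print 0 in this way (an assumption, not Ted's original code; the names long, two_halves, ten_chunks, d1 and d2 are made up here), one can compare a single long draw against the same number of values drawn in shorter calls from the same seed:

```r
# Reconstruction under assumptions: default RNG settings
# (normal.kind "Inversion"); all variable names are hypothetical.
set.seed(1)
long <- rnorm(1000, 0, 1)

# Same seed, two calls of 500 values each.
set.seed(1)
two_halves <- c(rnorm(500, 0, 1), rnorm(500, 0, 1))

# Same seed, ten calls of 100 values each.
set.seed(1)
ten_chunks <- unlist(lapply(1:10, function(i) rnorm(100, 0, 1)))

# Largest absolute difference between the long draw and each split draw.
d1 <- max(abs(long - two_halves))
d2 <- max(abs(long - ten_chunks))
print(d1)
print(d2)
```

With the default settings both differences should print as 0, which matches the surviving output, but this only demonstrates the behaviour for this seed and these sizes.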

Whether there are possible exceptions, in some implementations
of r<dist> where <dist> is other than "norm", has to be answered
by people who are familiar with the internals of these functions.

Best wishes to all,
Ted.




--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 08-Feb-06                                       Time: 13:26:10
------------------------------ XFMail ------------------------------

Brian Ripley

Wed, Feb 8, 2006 5:30 AM

On Wed, 8 Feb 2006, Duncan Murdoch wrote:

It's various source files in R_HOME/src/main.

Barring bugs, they will be the same.  As you know

	R is free software and comes with ABSOLUTELY NO WARRANTY.

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Duncan Murdoch

Wed, Feb 8, 2006 6:07 AM

On 2/8/2006 8:30 AM, Brian D Ripley wrote:

I didn't mean guarantee in the sense of warranty, just guarantee in the 
sense that if someone found a situation where they weren't equal, we 
would consider it a bug and fix it or document it as an exception.

Should we add a statement to the RNG man page or manuals somewhere that 
says this is the intention?

For others who aren't as familiar with the issues as Brian: this isn't 
necessarily a good idea.  We have a lot of RNGs, and it's fairly easy to 
write one so that this isn't true.  For example, the Box-Muller method 
naturally generates pairs of normals; a naive implementation would just 
throw one away at the end if asked for an odd number.  (Ours doesn't do 
that.)
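
To make the Box-Muller point concrete, here is a minimal sketch of such a naive sampler. It is purely illustrative: naive_box_muller and split_equals_single are hypothetical helpers written for this note, and this is not how R's own rnorm() is implemented. The sampler draws uniforms in pairs, turns each pair into two normals, and silently drops the spare value when an odd count is requested.

```r
# Hypothetical naive Box-Muller sampler, for illustration only.
# It is NOT R's rnorm() implementation.
naive_box_muller <- function(n) {
  n_pairs <- ceiling(n / 2)
  u <- runif(2 * n_pairs)                     # two uniforms per pair
  u1 <- u[seq(1, by = 2, length.out = n_pairs)]
  u2 <- u[seq(2, by = 2, length.out = n_pairs)]
  radius <- sqrt(-2 * log(u1))
  angle <- 2 * pi * u2
  # Interleave the two normals produced by each pair.
  z <- as.vector(rbind(radius * cos(angle), radius * sin(angle)))
  z[seq_len(n)]                               # spare value is thrown away
}

# Does one call of size m + n match two calls of sizes m and n?
split_equals_single <- function(m, n, seed = 1) {
  set.seed(seed)
  single <- naive_box_muller(m + n)
  set.seed(seed)
  split <- c(naive_box_muller(m), naive_box_muller(n))
  isTRUE(all.equal(single, split))
}

even_split <- split_equals_single(500, 500)   # no value is discarded
odd_split <- split_equals_single(499, 501)    # one value is discarded mid-stream
print(c(even = even_split, odd = odd_split))
```

In this sketch the 500 + 500 split reproduces the single call, while the 499 + 501 split does not, because the discarded spare shifts every later value by one position. A test that only uses even sizes would never notice.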

Duncan Murdoch

Peter Dalgaard

Wed, Feb 8, 2006 7:25 AM

Duncan Murdoch <murdoch at stats.uwo.ca> writes:

I think it's a fair assumption that *uniform* random numbers have the
property, since these are engines that produce a continuous stream of
values, of which we select the next n and m values. 

As long as the normal.kind (see ?RNGkind) is "Inversion", we can be
sure that the property carries over to rnorm, but it might not hold
for other methods. In particular, the ones that generate normal
variates in batches are suspect. However, empirically, I can't seem to
provoke the effect with any of R's built-in generators. One *could* of
course check the source code and see whether any state information is
kept between invocations...

O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907

Brian Ripley

Wed, Feb 8, 2006 7:34 AM

On Wed, 8 Feb 2006, Duncan Murdoch wrote:

I think that is part of the sense of 'no warranty': we allow ourselves to
change anything which is not documented, and so some things are
deliberately left undocumented.

I think we should allow future methods to do things like that, and
preferably document that they do them.
