Hi R users, this looks like a simple question: is there any difference between rnorm(1000,0,1) and running rnorm(500,0,1) twice in terms of outcome? TM
difference between rnorm(1000, 0, 1) and running rnorm(500, 0, 1) twice
10 messages · Taka Matzmoto, Romain Francois, Philippe GROSJEAN +5 more
On 08.02.2006 04:21, Taka Matzmoto wrote:
Hi R users, this looks like a simple question: is there any difference between rnorm(1000,0,1) and running rnorm(500,0,1) twice in terms of outcome? TM
Not here:

R> set.seed(1)
R> x <- rnorm(1000, 0, 1)
R> set.seed(1)
R> y <- rnorm(500, 0, 1)
R> z <- rnorm(500, 0, 1)
R> all(x == c(y,z))
[1] TRUE

Romain
Indeed! The pseudo-random number generator is initialized to the same state and thus returns the same 1000 pseudo-random numbers in both cases. So, no difference.

Best,
Philippe Grosjean
Why don't you test it yourself? E.g.,

set.seed(42)
bob1 <- rnorm(1000,0,1)
set.seed(42)
bob2 <- rnorm(500,0,1)
bob3 <- rnorm(500,0,1)
identical(bob1, c(bob2, bob3))

I won't tell you the answer. :-)

Bjørn-Helge Mevik
This isn't really something that can be proved by a test. Perhaps the current implementation makes those equal only because 500 is even, or divisible by 5, or whatever... I think the intention is that those should be equal, but in a quick search I've been unable to find a documented guarantee of that. So I would take a defensive stance and assume that there may be conditions where c(rnorm(m), rnorm(n)) is not equal to rnorm(m+n). If someone can point out the document I missed, I'd appreciate it. Duncan Murdoch
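A somewhat broader check than the single 500/500 split can be sketched as follows. It tries every way of splitting 25 draws into two calls, so odd, even and zero-length pieces are all covered. The helper name `split_matches` is made up for illustration, and a passing result only says something about the R installation it runs on; it is evidence, not the documented guarantee asked about above.

```r
# Hypothetical helper (not part of R): does one draw of m + n normals
# equal two consecutive draws of m and n started from the same seed?
split_matches <- function(m, n, seed = 42) {
  set.seed(seed)
  whole <- rnorm(m + n)
  set.seed(seed)
  parts <- c(rnorm(m), rnorm(n))
  identical(whole, parts)
}

# Try every split of 25 draws into m and 25 - m.
total <- 25
results <- vapply(0:total, function(m) split_matches(m, total - m), logical(1))
all(results)
```

With the default settings this is expected to report TRUE, but it remains a test of particular cases rather than a proof.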
On my understanding, once the seed is set the sequence generated by the underlying RNG is determined, whether it is the result of a single call to produce a long sequence or multiple calls to generate many shorter sequences. Example:
set.seed(42)
multi <- numeric(20)
set.seed(42)
single <- rnorm(20)
set.seed(42)
for (i in 1:20) multi[i] <- rnorm(1)
print(max(multi - single), digits = 22)
[1] 0
print(min(multi - single), digits = 22)
[1] 0

In other words: identical!

Whether there are possible exceptions, in some implementations of r<dist> where <dist> is other than "norm", has to be answered by people who are familiar with the internals of these functions.

Best wishes to all,
Ted.

(Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Date: 08-Feb-06 Time: 13:26:10
On Wed, 8 Feb 2006, Duncan Murdoch wrote:
This isn't really something that can be proved by a test. Perhaps the current implementation makes those equal only because 500 is even, or divisible by 5, or whatever... I think the intention is that those should be equal, but in a quick search I've been unable to find a documented guarantee of that. So I would take a defensive stance and assume that there may be conditions where c(rnorm(m), rnorm(n)) is not equal to rnorm(m+n). If someone can point out the document I missed, I'd appreciate it.
It's various source files in R_HOME/src/main. Barring bugs, they will be the same. As you know R is free software and comes with ABSOLUTELY NO WARRANTY.
Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
I didn't mean guarantee in the sense of warranty, just guarantee in the sense that if someone found a situation where they weren't equal, we would consider it a bug and fix it or document it as an exception.

Should we add a statement to the RNG man page or manuals somewhere that says this is the intention?

For others who aren't as familiar with the issues as Brian: this isn't necessarily a good idea. We have a lot of RNGs, and it's fairly easy to write one so that this isn't true. For example, the Box-Muller method naturally generates pairs of normals; a naive implementation would just throw one away at the end if asked for an odd number. (Ours doesn't do that.)

Duncan Murdoch
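To make the Box-Muller point concrete, here is a minimal sketch of such a naive generator. The names `naive_bm` and `check_split` are hypothetical, and this is deliberately not how R's own rnorm behaves; it only illustrates how discarding the spare variate breaks the property for odd-sized requests.

```r
# Naive Box-Muller sketch (hypothetical, NOT R's implementation):
# consumes uniforms two at a time, produces normals in pairs, and
# throws the second one of the last pair away when n is odd.
naive_bm <- function(n) {
  pairs <- ceiling(n / 2)
  u <- runif(2 * pairs)
  u1 <- u[c(TRUE, FALSE)]   # 1st, 3rd, 5th, ... uniform
  u2 <- u[c(FALSE, TRUE)]   # 2nd, 4th, 6th, ... uniform
  r <- sqrt(-2 * log(u1))
  # Interleave the two normals of each pair: z1a, z1b, z2a, z2b, ...
  z <- as.vector(rbind(r * cos(2 * pi * u2), r * sin(2 * pi * u2)))
  z[seq_len(n)]             # an odd n silently drops the last variate
}

# Compare one call of size m + n with two calls of sizes m and n.
check_split <- function(m, n, seed = 42) {
  set.seed(seed)
  whole <- naive_bm(m + n)
  set.seed(seed)
  parts <- c(naive_bm(m), naive_bm(n))
  identical(whole, parts)
}

even_split <- check_split(2, 2)  # no variate is discarded in either path
odd_split  <- check_split(3, 1)  # the first call of 3 discards a variate
```

Here the even split agrees while the odd split does not, because after the call for 3 values the second call starts from fresh uniforms instead of reusing the discarded normal. A generator that stored the spare value between calls would avoid this.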
Duncan Murdoch <murdoch at stats.uwo.ca> writes:
This isn't really something that can be proved by a test. Perhaps the current implementation makes those equal only because 500 is even, or divisible by 5, or whatever... I think the intention is that those should be equal, but in a quick search I've been unable to find a documented guarantee of that. So I would take a defensive stance and assume that there may be conditions where c(rnorm(m), rnorm(n)) is not equal to rnorm(m+n). If someone can point out the document I missed, I'd appreciate it.
I think it's a fair assumption that *uniform* random numbers have the property, since these are engines that produce a continuous stream of values, of which we select the next n and m values. As long as the normal.kind (see ?RNGkind) is "Inversion", we can be sure that the property carries over to rnorm, but it might not be the case for other methods. In particular the ones that generate normal variates in batches are suspect. However, empirically, I can't seem to provoke the effect with any of R's built-in generators. One *could* of course check the source code and see whether there is state information being kept between invocations...
Peter Dalgaard
Dept. of Biostatistics, University of Copenhagen
Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark
(p.dalgaard at biostat.ku.dk)
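The empirical check described above could be sketched roughly as follows, looping over several values of normal.kind. The helper name `split_holds` and the sizes 7 and 6 are arbitrary choices for illustration, and the list of kinds is taken from ?RNGkind rather than from this thread. The result for the non-default kinds is whatever the installed R version produces; it is not stated here as a guarantee.

```r
# Remember the current RNG settings so they can be restored afterwards.
old_kind <- RNGkind()

# Hypothetical helper: compare rnorm(m + n) with c(rnorm(m), rnorm(n))
# under a given normal generator, using an odd first piece on purpose.
split_holds <- function(normal_kind, m = 7, n = 6, seed = 42) {
  # set.seed() accepts normal.kind, so generator and seed are set together.
  set.seed(seed, normal.kind = normal_kind)
  whole <- rnorm(m + n)
  set.seed(seed, normal.kind = normal_kind)
  parts <- c(rnorm(m), rnorm(n))
  identical(whole, parts)
}

kinds <- c("Inversion", "Box-Muller", "Ahrens-Dieter", "Kinderman-Ramage")
results <- vapply(kinds, split_holds, logical(1))
print(results)

# Put the generators back the way they were.
RNGkind(kind = old_kind[1], normal.kind = old_kind[2])
```

Any FALSE in the printed vector would point to a generator that keeps or discards state between calls in a way that depends on the request size.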
On Wed, 8 Feb 2006, Duncan Murdoch wrote:
I didn't mean guarantee in the sense of warranty, just guarantee in the sense that if someone found a situation where they weren't equal, we would consider it a bug and fix it or document it as an exception.
Should we add a statement to the RNG man page or manuals somewhere that says this is the intention?
I think that is part of the sense of `no warranty': we allow ourselves to change anything which is not documented, and so things are as a result deliberately not documented.
For others who aren't as familiar with the issues as Brian: this isn't necessarily a good idea. We have a lot of RNGs, and it's fairly easy to write one so that this isn't true. For example, the Box-Muller method naturally generates pairs of normals; a naive implementation would just throw one away at the end if asked for an odd number. (Ours doesn't do that.)
I think we should allow future methods to do things like that, and preferably document that they do them.
Brian D. Ripley, ripley at stats.ox.ac.uk