portable parallel seeds project: request for critiques

Tue, Feb 21, 2012 5:04 AM

On Fri, Feb 17, 2012 at 02:57:26PM -0600, Paul Johnson wrote:

Hi.

In the description of your project in the file

  http://winstat.quant.ku.edu/svn/hpcexample/trunk/Ex66-ParallelSeedPrototype/README

you argue as follows:

  Question: Why is this better than the simple old approach of
  setting the seeds within each run with a formula like
   
  set.seed(2345 + 10 * run)
   
  Answer: That does allow replication, but it does not assure
  that each run uses non-overlapping random number streams. It
  offers absolutely no assurance whatsoever that the runs are
  actually non-redundant.

The following demonstrates that set.seed() with the default
generator can indeed produce correlated streams.

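  # One step of the recurrence x -> (69069 * x + 1) mod 2^32,
  # with seeds treated as signed 32-bit integers.
  step <- function(x)
  {
      x[x < 0] <- x[x < 0] + 2^32        # signed -> unsigned
      x <- (69069 * x + 1) %% 2^32
      x[x > 2^31] <- x[x > 2^31] - 2^32  # unsigned -> signed
      x
  }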

  n <- 1000
  seed1 <- 124370417 # any seed
  seed2 <- step(seed1)

  set.seed(seed1)
  x <- runif(n)
  set.seed(seed2)
  y <- runif(n)

  rbind(seed1, seed2)
  table(x[-1] == y[-n])

The output is

             [,1]
  seed1 124370417
  seed2 205739774
  
  FALSE  TRUE 
      5   994 

This means that the streams x and y generated from the two seeds
above are nearly identical up to a shift of 1: y[i] equals x[i+1]
in 994 of the 999 compared positions.
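
The same check can be wrapped in a small helper that applies the
recurrence k times and measures how often the two streams agree at
lag k. This is only a sketch: the names lcg_step and shift_match_rate
are hypothetical, and it assumes the default generator is in effect.
With the seed above and k = 1 it repeats the comparison shown, so the
rate should correspond to the 994 matches out of 999. What happens
for larger k is left to be observed rather than assumed.

```r
# Hypothetical helper names; only set.seed() and runif() are base R.

# One step of the recurrence from the message above, on signed 32-bit values.
lcg_step <- function(x) {
    x[x < 0] <- x[x < 0] + 2^32
    x <- (69069 * x + 1) %% 2^32
    x[x > 2^31] <- x[x > 2^31] - 2^32
    x
}

# Fraction of positions where the stream from the k-times stepped seed
# equals the stream from the original seed shifted by k.
shift_match_rate <- function(seed, k = 1, n = 1000) {
    seed_k <- seed
    for (i in seq_len(k)) seed_k <- lcg_step(seed_k)
    set.seed(seed)
    x <- runif(n)
    set.seed(seed_k)
    y <- runif(n)
    mean(x[(k + 1):n] == y[1:(n - k)])
}

rate1 <- shift_match_rate(124370417, k = 1)
print(rate1)
```

Calling shift_match_rate() with other seeds or larger k is one way
to see how far the effect extends.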

What is the current state of your project?

Petr Savicky.
