
parallel random numbers: set.seed(i), rsprng, rlecuyer, [one solution]

On Tue, 2010-12-14 at 13:51 -0800, Ross Boylan wrote:
I have some good news to report with rsprng, based on sprng v2.
Scenario A) is quite possible.  That is, if you have 1,000 simulations
you can operate using 1,000 streams.  You just have to remember to kill
the old stream if the same process is handling multiple simulations.

The stream number to pass to the initialization routine is the job
or simulation number, rather than the rank of the process on which it is
running.

According to the sprng docs, MPI is used only in transmitting a seed
to all nodes.  My code set the seed through other means, and so there was
never any messaging.  This made it possible to start the streams at
different times.

I also did a small experiment to see if stream 1 was the same regardless
of whether the total number of streams was 500 or 1000.  It appeared to
be.  However, the documentation provides no guarantees that this is the
case.
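
As a point of comparison only (this is not sprng, and not the code I ran), the
same "one stream per job, indexed by job number" idea can be sketched with the
L'Ecuyer-CMRG generator in R's base parallel package.  The helper names
streamForJob and runJob are hypothetical, and the seed 1066 is just the value
used in the excerpts.  In this sketch stream k is reached by stepping
nextRNGStream() k-1 times from the seeded state, so it cannot depend on how
many streams exist in total or on which process runs the job.

```r
library(parallel)

# Hypothetical helper: the .Random.seed vector for job k (1-based),
# derived deterministically from one master seed.
# Note: this switches the session's RNG kind to L'Ecuyer-CMRG.
streamForJob <- function(k, seed = 1066) {
  RNGkind("L'Ecuyer-CMRG")
  set.seed(seed)
  s <- get(".Random.seed", envir = .GlobalEnv)
  # Advance one whole stream per preceding job.
  for (i in seq_len(k - 1)) s <- nextRNGStream(s)
  s
}

# Hypothetical job: install the stream for job k, then draw n uniforms.
runJob <- function(k, n = 3, seed = 1066) {
  assign(".Random.seed", streamForJob(k, seed), envir = .GlobalEnv)
  runif(n)
}
```

Calling runJob(5) twice, on any process and in any order, gives the same
draws, while runJob(1) and runJob(2) use different streams.  Stepping k-1
times is linear in k, which is acceptable for a thousand jobs but would be
worth caching for many more.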

Here are some excerpts from my code, which runs under snow:
# master and slave setup
setup0 <- function(){
  library("rsprng")
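  # set these in the global environment for use by slaveDo
  # nTotalSimulations must be at least the number of jobs, since
  # stream numbers run from 0 to nTotalSimulations - 1
  nTotalSimulations <<- 1000
  seed <<- 1066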
}

# setup to run on master
# cl is cluster
setupMaster <- function(cl, nsim, fname){
  clusterEvalQ(cl, source("ascertain.R")) # file with this code
  clusterEvalQ(cl, setup0())
  # print the timing; otherwise the result of system.time is discarded
  print(system.time(r <- clusterApplyLB(cl, seq_len(nsim), "slaveDo")))
  invisible(save(r, file=fname))
  stopCluster(cl)
}


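# sample slave job
# uses global nTotalSimulations and seed
slaveDo <- function(k){
  # kill the stream left over from any earlier job on this process
  free.sprng()
  # stream numbers start at 0, so job k uses stream k-1
  init.sprng(nTotalSimulations, k-1, seed)
  # do and return simulation results
}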

cl <- getMPIcluster()
if (mpi.comm.rank(0) == 0) {
  setupMaster(cl, 1000, "r1.RData")
}