Although this query was inspired by distributed random number generation, one of the questions (#2 below) is a single-machine issue.

I call C++ code from R to generate simulated data. I'm doing this on a cluster, and use rmpi and rsprng. While rsprng randomizes R-level random numbers (e.g., from runif), it has no effect on the C code, which is completely SPRNG and MPI ignorant.

Currently I generate a seed to pass into the C code, using

    as.integer(runif(1, max=.Machine$integer.max)-.Machine$integer.max/2)

It seems to work.

Any comments on this approach? Here are some issues I see:

1) The much simpler method of using consecutive integers as seeds also seemed to work. This also has the advantage of repeatability. I avoided it because I was concerned it wouldn't be random enough. Would consecutive integers, as in

    parLapply(cluster, seq(nSimulations), function(i) myfunction(seed=i))

be sufficient? I suppose I could also generate all the random seeds on the master.

2) This got me thinking about how to generate random integers that span the whole range of 32 bit signed integers. The method shown above only spans half the range, since .Machine$integer.max = 2^31 - 1. It also makes some assumptions about the relation between the value in .Machine$integer.max and the seed for random numbers. Interestingly, integer.max was 2^31 - 1 despite running on a 64 bit powerpc, albeit under the mostly 32 bit OS-X (I think Leopard--not the current one; Darwin Kernel 7.9.0).

My understanding is that random number generators internally produce 32 bit integers, which then get converted into the desired distribution. I'm a little surprised there doesn't seem to be a way to get at them. Or is one supposed to do runif(1)*2^32-2^31?

3) Vagaries of the underlying C++ random number generator could also complicate life.
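For issues 1 and 2 together, one possible approach is to keep passing consecutive integers from R (so runs stay repeatable) and scramble them on the C++ side before seeding. The sketch below is only an illustration, not something from the existing code: seed_for_task and base_seed are hypothetical names, and the mixer is the SplitMix64 finalizer, swapped in here as a generic bit mixer. It assumes the C++ generator accepts a signed 32 bit seed.

```cpp
#include <cstdint>

// SplitMix64 finalizer: a bijective mixer on 64-bit values, so distinct
// inputs always give distinct 64-bit outputs.
inline std::uint64_t mix64(std::uint64_t x) {
    x += 0x9E3779B97F4A7C15ULL;
    x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
    x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
    return x ^ (x >> 31);
}

// Hypothetical helper: turn (base_seed, task_index) into a seed that can
// land anywhere in the signed 32-bit range [-2^31, 2^31 - 1].
// base_seed identifies the whole run; task_index is the consecutive
// integer handed to each simulation.
std::int32_t seed_for_task(std::uint32_t base_seed, std::uint32_t task_index) {
    const std::uint64_t key =
        (static_cast<std::uint64_t>(base_seed) << 32) | task_index;
    // Keep the top 32 bits of the mixed value (0 .. 2^32 - 1).
    const std::uint32_t top = static_cast<std::uint32_t>(mix64(key) >> 32);
    // Shift down by 2^31 in 64-bit arithmetic, then narrow; the result
    // always fits, so the conversion is well defined.
    return static_cast<std::int32_t>(static_cast<std::int64_t>(top) - 2147483648LL);
}
```

With this, myfunction(seed=i) could stay as it is and the C++ entry point would call seed_for_task(base, i) before seeding its generator. Truncating to 32 bits means two tasks can in principle receive the same seed, and the mixing does nothing for the quality of the underlying C++ generator. If the seed were computed in C++ and returned to R, note that R uses the value -2^31 for NA_integer_.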
random numbers
3 messages · Dirk Eddelbuettel, Ross Boylan
On 30 June 2007 at 12:12, Ross Boylan wrote:
| I call C++ code from R to generate simulated data. I'm doing this on a
| cluster, and use rmpi and rsprng. While rsprng randomizes R-level
| random numbers (e.g., from runif), it has no effect on the C code, which
| is completely SPRNG and MPI ignorant.
|
| Currently I generate a seed to pass into the C code, using
| as.integer(runif(1, max=.Machine$integer.max)-.Machine$integer.max/2)
| It seems to work.
|
| Any comments on this approach? Here are some issues I see:

I may be missing something but given that rsprng is running on your cluster, you are bound to also have sprng itself -- so why don't you use that from C or C++ for this purpose?

Hth, Dirk
Hell, there are no rules here - we're trying to accomplish something.
-- Thomas A. Edison
On Sat, 2007-06-30 at 14:50 -0500, Dirk Eddelbuettel wrote:
> On 30 June 2007 at 12:12, Ross Boylan wrote:
> | I call C++ code from R to generate simulated data. I'm doing this on a
> | cluster, and use rmpi and rsprng. While rsprng randomizes R-level
> | random numbers (e.g., from runif), it has no effect on the C code, which
> | is completely SPRNG and MPI ignorant.
> |
> | Currently I generate a seed to pass into the C code, using
> | as.integer(runif(1, max=.Machine$integer.max)-.Machine$integer.max/2)
> | It seems to work.
> |
> | Any comments on this approach? Here are some issues I see:
>
> I may be missing something but given that rsprng is running on your
> cluster, you are bound to also have sprng itself -- so why don't you use
> that from C or C++ for this purpose?
>
> Hth, Dirk
Doing so would add considerable complexity, at least as far as I know.

Sometimes I run within an MPI session and sometimes not. My understanding is that SPRNG will not work if MPI is absent. I think someone on the SPRNG list told me that there wasn't a good way to handle this at run-time. Unfortunately, a lot of SPRNG options seem to be compile-time settings.

Using SPRNG would also complicate my build process, as I'd need autoconf magic to support it. Part of the issue is that I want something I can redistribute, not just something that will work for me on a one-off basis.

One simple solution would be to build several versions of the library. A not so simple solution would be to build various random number generators as separate libraries, and dynamically load the appropriate one.
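To make the last idea a bit more concrete, here is a rough sketch of what the simulation code might program against if the generator were chosen at run time. Everything in it is hypothetical (UniformSource, StdSource, make_source and the "std" backend name are made up for illustration). Only a standard-library backend is shown; an SPRNG-backed class, or one loaded from a separate shared library, would have to be written against that library's real API and is left out.

```cpp
#include <cstdint>
#include <memory>
#include <random>
#include <stdexcept>
#include <string>

// Hypothetical interface: the simulation code only ever sees this class.
class UniformSource {
public:
    virtual ~UniformSource() = default;
    virtual double next() = 0;  // one uniform draw, nominally in [0, 1)
};

// Serial backend built on the standard library's Mersenne Twister.
class StdSource : public UniformSource {
public:
    explicit StdSource(std::uint32_t seed) : engine_(seed) {}
    double next() override { return dist_(engine_); }
private:
    std::mt19937 engine_;
    std::uniform_real_distribution<double> dist_{0.0, 1.0};
};

// Run-time selection by name. A build that links against a parallel
// generator would add another branch here; this sketch has only "std".
std::unique_ptr<UniformSource> make_source(const std::string& backend,
                                           std::uint32_t seed) {
    if (backend == "std") {
        return std::make_unique<StdSource>(seed);
    }
    throw std::invalid_argument("unknown RNG backend: " + backend);
}
```

The R wrapper could then pass the backend name as a string argument, for example "std" outside an MPI session. Whether a second backend lives in the same library or in a separately loaded one, the calling code would not need to change.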