user supplied random number generators

4 messages · Ross Boylan, Christophe Dutang, William Dunlap

#
?Random.user says (in svn trunk)
  Optionally,
  functions \code{user_unif_nseed} and \code{user_unif_seedloc} can be
  supplied which are called with no arguments and should return pointers
  to the number of seeds and to an integer array of seeds.  Calls to
  \code{GetRNGstate} and \code{PutRNGstate} will then copy this array to
  and from \code{.Random.seed}.
And it offers as an example
  void  user_unif_init(Int32 seed_in) { seed = seed_in; }
  int * user_unif_nseed() { return &nseed; }
  int * user_unif_seedloc() { return (int *) &seed; }

First question: what is the lifetime of the buffers pointed to by the
user_unif_* functions, and who is responsible for cleaning them up?  In
the help file they are static variables, but in general they might be
allocated on the heap or might be in structures that only persist as
long as the generator does.

Since the example uses static variables, it seems reasonable to conclude
the core R code is not going to try to free them.

Second, are the types really correct?  The documentation seems quite
explicit, all the more so because it uses Int32 in places.  However, the
code in RNG.c (RNG_Init) says

	    ns = *((int *) User_unif_nseed());
	    if (ns < 0 || ns > 625) {
		warning(_("seed length must be in 0...625; ignored"));
		break;
	    }
	    RNG_Table[kind].n_seed = ns;
	    RNG_Table[kind].i_seed = (Int32 *) User_unif_seedloc();
consistent with the earlier definition of RNG_Table entries as
typedef struct {
    RNGtype kind;
    N01type Nkind;
    char *name; /* print name */
    int n_seed; /* length of seed vector */
    Int32 *i_seed;
} RNGTAB;

This suggests that the type of user_unif_seedloc is Int32*, not int *.
It also suggests that user_unif_nseed should return the number of 32 bit
integers.  The code for PutRNGstate(), for example, uses them in just
that way.
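
As a minimal sketch of what that reading implies, the help-file example
could keep its state as Int32 words and report the count in those units.
The two-word state, the NSEED constant, and the local Int32 typedef are
assumptions made for illustration (a real package would take Int32 from
R's own headers); the function names and the int * return types follow
the ?Random.user text quoted above.

```c
#include <stdint.h>

/* Assumption: local stand-in for R's Int32, fixed at 32 bits. */
typedef uint32_t Int32;

/* Hypothetical toy generator state: two 32-bit words. */
#define NSEED 2

static Int32 seed[NSEED];
static int nseed = NSEED;   /* counts 32-bit words, not bytes or ints */

/* Scramble the single user seed into the full state. */
void user_unif_init(Int32 seed_in)
{
    seed[0] = seed_in;
    seed[1] = seed_in ^ 0x9E3779B9u;
}

/* Pointer to the number of 32-bit seed words. */
int *user_unif_nseed(void) { return &nseed; }

/* Pointer to the seed words; the cast only satisfies the documented
   int * return type, the storage itself is Int32. */
int *user_unif_seedloc(void) { return (int *) seed; }
```

Because the buffers are static, nothing ever needs to free them, which
matches the lifetime the help-file example implies.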

While the dominant model, even on 64 bit hardware, is probably to leave
int as 32 bit, it doesn't seem wise to assume that is always the case.

I got into this because I'm trying to extend the rsprng code; sprng
returns its state as a vector of bytes.  Converting these to a vector of
integers depends on the integer length, hence my interest in the exact
definition of integer.  I'm interested in lifetime because I believe
those bytes are associated with the stream and become invalid when the
stream is freed; furthermore, I probably need to copy them into a buffer
that is padded to full wordlength.  This means I allocate the buffer
whose address is returned to the core R RNG machinery.  Eventually
somebody needs to free the memory.
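
A rough sketch of the copy-and-pad step described above, assuming the
state arrives as a plain byte array.  The names pack_state and
words_for_bytes and the local Int32 typedef are hypothetical; nothing
here calls sprng or R.  The caller owns the returned buffer and must
free() it, which is exactly the responsibility in question.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: local stand-in for R's Int32, fixed at 32 bits. */
typedef uint32_t Int32;

/* Number of 32-bit words needed to hold nbytes bytes, rounded up. */
static int words_for_bytes(size_t nbytes)
{
    return (int) ((nbytes + sizeof(Int32) - 1) / sizeof(Int32));
}

/* Copy a byte-oriented state into a freshly allocated buffer of whole
   32-bit words, zero-padding the tail.  Returns NULL on allocation
   failure.  The caller must free() the result. */
static Int32 *pack_state(const unsigned char *bytes, size_t nbytes,
                         int *nwords_out)
{
    int nwords = words_for_bytes(nbytes);
    /* Allocate at least one word so a zero-length state is still valid. */
    Int32 *buf = calloc(nwords > 0 ? (size_t) nwords : 1, sizeof(Int32));
    if (buf == NULL)
        return NULL;
    if (nbytes > 0)
        memcpy(buf, bytes, nbytes);
    *nwords_out = nwords;
    return buf;
}
```

The word count written to nwords_out is what user_unif_nseed would have
to report, and per the RNG_Init excerpt above it must not exceed 625.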

Far more of my rsprng adventures are on
http://wiki.r-project.org/rwiki/doku.php?id=packages:cran:rsprng.  Feel
free to read, correct, or extend it.

Thanks.

Ross Boylan
#
Hello,

On 30 Jul 2009, at 08:21, Ross Boylan wrote:
You can test the size of an int with a configure script.  See, for
example, the package foreign, or the package randtoolbox (found in the
Rmetrics R-Forge project), which I maintain with Petr Savicky.

By the way, I'm sure he has an answer about RNGkind, because he wrote
the runif interface in the randtoolbox and rngWELL19937 packages.

Christophe
--
Christophe Dutang
Ph.D. student at ISFA, Lyon, France
website: http://dutangc.free.fr
#
On Thu, 2009-07-30 at 12:32 +0200, Christophe Dutang wrote:
http://cran.r-project.org/doc/manuals/R-admin.html#Choosing-between-32_002d-and-64_002dbit-builds says "All current versions of R use 32-bit integers".

Also, sizeof(int) works at runtime.  But my question was really about
whether code for user defined RNGs should be written using Int32 or int
as the target type for the state vector.  The R core code suggests to me
one should use Int32, but the documentation says int.

Ross
#
src/arithmetic.c will not compile if ints are not 32 bits, and I suspect
that there is other code that will not work correctly if ints are not
32 bits:
   src/arithmetic.c:
    #ifndef INT_32_BITS
    /* configure checks whether int is 32 bits.  If not this code will
       need to be rewritten.  Since 32 bit ints are pretty much universal,
       we can worry about writing alternate code when the need arises.
       To be safe, we signal a compiler error if int is not 32 bits. */
    # error code requires that int have 32 bits
    #else
http://www.unix.org/version2/whatsnew/lp64_wp.html is rather old
but says that keeping int at 32 bits is the way to go for the foreseeable
future.
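
Package code that relies on the same assumption could guard it in a
similar way without a configure test.  This is only a sketch using the
standard limits.h macros; the helper int_bits is hypothetical and just
gives a run-time view of the same fact.

```c
#include <limits.h>

/* Stop the build, rather than misbehave later, if unsigned int does not
   have exactly the 32-bit range. */
#if UINT_MAX != 0xFFFFFFFFu
# error "this code requires that int have 32 bits"
#endif

/* Run-time counterpart: width of int in bits on this platform. */
static int int_bits(void)
{
    return (int) (sizeof(int) * CHAR_BIT);
}
```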

I would recommend using Int32, as it makes the intent clearer.  The
seed does not need to be stored as nseed Sint's (the C type used to
store R/S/S+ integers: int in R, long in S and S+) but as nseed Int32's.

When I implemented this interface in S+ I used unsigned 32-bit ints
internally, so they would not suffer sign extension when naively
converted to the (signed) longs S+ uses for integer datasets like
.Random.seed.
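
The sign-extension hazard can be illustrated with a small sketch.  The
function names are hypothetical, and long long stands in for the wider
signed type (long is not 64 bits on every platform); this is not the
S+ code.

```c
#include <stdint.h>

/* Naive route: reinterpret the word as signed 32-bit first, then widen.
   A word with its top bit set comes out negative.  (The unsigned-to-
   signed conversion is implementation-defined in C; two's-complement
   wraparound is assumed here.) */
static long long widen_via_signed(uint32_t w)
{
    return (long long) (int32_t) w;
}

/* Keeping the word unsigned while widening preserves its value. */
static long long widen_via_unsigned(uint32_t w)
{
    return (long long) w;
}
```

For the word 0x80000000, the first function yields -2147483648 and the
second yields 2147483648; for words below 2^31 the two agree.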

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com