?Random.user says (in svn trunk)

  Optionally, functions \code{user_unif_nseed} and
  \code{user_unif_seedloc} can be supplied which are called with no
  arguments and should return pointers to the number of seeds and to an
  integer array of seeds.  Calls to \code{GetRNGstate} and
  \code{PutRNGstate} will then copy this array to and from
  \code{.Random.seed}.

And it offers as an example

  void user_unif_init(Int32 seed_in) { seed = seed_in; }
  int * user_unif_nseed() { return &nseed; }
  int * user_unif_seedloc() { return (int *) &seed; }

First question: what is the lifetime of the buffers pointed to by the
user_unif_* functions, and who is responsible for cleaning them up?  In
the help file they are static variables, but in general they might be
allocated on the heap or might be in structures that only persist as
long as the generator does.

Since the example uses static variables, it seems reasonable to conclude
the core R code is not going to try to free them.

Second, are the types really correct?  The documentation seems quite
explicit, all the more so because it uses Int32 in places.  However, the
code in RNG.c (RNG_Init) says

  ns = *((int *) User_unif_nseed());
  if (ns < 0 || ns > 625) {
      warning(_("seed length must be in 0...625; ignored"));
      break;
  }
  RNG_Table[kind].n_seed = ns;
  RNG_Table[kind].i_seed = (Int32 *) User_unif_seedloc();

consistent with the earlier definition of RNG_Table entries as

  typedef struct {
      RNGtype kind;
      N01type Nkind;
      char *name;    /* print name */
      int n_seed;    /* length of seed vector */
      Int32 *i_seed;
  } RNGTAB;

This suggests that the type of user_unif_seedloc is Int32 *, not int *.
It also suggests that user_unif_nseed should return the number of 32-bit
integers.  The code for PutRNGstate(), for example, uses them in just
that way.

While the dominant model, even on 64-bit hardware, is probably to leave
int as 32 bits, it doesn't seem wise to assume that is always the case.
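To make the type question concrete, here is a minimal sketch (not taken from R or from any package) of a user generator whose seed storage is static and declared as Int32. It assumes that Int32 is unsigned int, redeclared locally so the fragment compiles without R headers, and it uses a hypothetical seed length SKETCH_NSEED. The static_assert turns the "int is 32 bits" assumption into a compile-time failure instead of a silent mismatch.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */

/* Assumption: stands in for R's Int32 typedef; redeclared here so the
   sketch builds without any R headers. */
typedef unsigned int Int32;

/* Refuse to compile if int is not exactly 32 bits wide, since the seed
   array is handed to the caller through an int pointer. */
static_assert(sizeof(int) * CHAR_BIT == 32,
              "this seed layout assumes a 32-bit int");

#define SKETCH_NSEED 4            /* hypothetical seed length */

/* Static storage: it lives for the whole program, so nobody frees it. */
static Int32 seed[SKETCH_NSEED];
static int nseed = SKETCH_NSEED;

/* Hypothetical initialiser: spreads one scrambled seed over the array. */
void user_unif_init(Int32 seed_in)
{
    int i;
    for (i = 0; i < SKETCH_NSEED; i++)
        seed[i] = seed_in + (Int32) i;
}

/* Pointer to the number of seed words. */
int *user_unif_nseed(void) { return &nseed; }

/* Pointer to the seed words themselves, viewed as int. */
int *user_unif_seedloc(void) { return (int *) seed; }
```

With static storage the ownership question disappears, which is presumably why the help file's example is written that way; a heap-allocated buffer would need an explicit owner on the package side.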
I got into this because I'm trying to extend the rsprng code; sprng
returns its state as a vector of bytes.  Converting these to a vector of
integers depends on the integer length, hence my interest in the exact
definition of integer.  I'm interested in lifetime because I believe
those bytes are associated with the stream and become invalid when the
stream is freed; furthermore, I probably need to copy them into a buffer
that is padded to full word length.  This means I allocate the buffer
whose address is returned to the core R RNG machinery.  Eventually
somebody needs to free the memory.

Far more of my rsprng adventures are on
http://wiki.r-project.org/rwiki/doku.php?id=packages:cran:rsprng.  Feel
free to read, correct, or extend it.

Thanks.

Ross Boylan
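The padding step described above can be sketched as follows. This is an illustration only: words_for_bytes, pack_state, and unpack_state are hypothetical helpers, no SPRNG or R function is called, and the words are taken to be exactly 32 bits wide (uint32_t from <stdint.h>) rather than whatever width int happens to have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of 32-bit words needed to hold nbytes bytes, rounding up. */
static size_t words_for_bytes(size_t nbytes)
{
    return (nbytes + sizeof(uint32_t) - 1) / sizeof(uint32_t);
}

/* Copy nbytes of generator state into words[], zero-padding the last
   word.  Returns the number of words used, or 0 if words[] is too small
   (an empty input also yields 0). */
static size_t pack_state(const unsigned char *bytes, size_t nbytes,
                         uint32_t *words, size_t max_words)
{
    size_t nwords = words_for_bytes(nbytes);

    assert(bytes != NULL && words != NULL);
    if (nwords > max_words)
        return 0;

    memset(words, 0, nwords * sizeof(uint32_t)); /* clear the padding */
    memcpy(words, bytes, nbytes);                /* native byte order */
    return nwords;
}

/* Inverse of pack_state: recover the original nbytes bytes. */
static void unpack_state(const uint32_t *words, unsigned char *bytes,
                         size_t nbytes)
{
    assert(bytes != NULL && words != NULL);
    memcpy(bytes, words, nbytes);
}
```

Because memcpy preserves the native byte order, a state packed this way round-trips on the same machine but is not portable between machines of different endianness. The caller also has to remember the original byte count, since the padding makes it unrecoverable from the word count alone.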
Hello,

On 30 July 2009 at 08:21, Ross Boylan wrote:
> ?Random.user says (in svn trunk)
>
>   Optionally, functions \code{user_unif_nseed} and
>   \code{user_unif_seedloc} can be supplied which are called with no
>   arguments and should return pointers to the number of seeds and to
>   an integer array of seeds.  Calls to \code{GetRNGstate} and
>   \code{PutRNGstate} will then copy this array to and from
>   \code{.Random.seed}.
>
> And it offers as an example
>
>   void user_unif_init(Int32 seed_in) { seed = seed_in; }
>   int * user_unif_nseed() { return &nseed; }
>   int * user_unif_seedloc() { return (int *) &seed; }
>
> First question: what is the lifetime of the buffers pointed to by the
> user_unif_* functions, and who is responsible for cleaning them up?
> In the help file they are static variables, but in general they might
> be allocated on the heap or might be in structures that only persist
> as long as the generator does.
>
> Since the example uses static variables, it seems reasonable to
> conclude the core R code is not going to try to free them.
>
> Second, are the types really correct?  The documentation seems quite
> explicit, all the more so because it uses Int32 in places.  However,
> the code in RNG.c (RNG_Init) says
>
>   ns = *((int *) User_unif_nseed());
>   if (ns < 0 || ns > 625) {
>       warning(_("seed length must be in 0...625; ignored"));
>       break;
>   }
>   RNG_Table[kind].n_seed = ns;
>   RNG_Table[kind].i_seed = (Int32 *) User_unif_seedloc();
>
> consistent with the earlier definition of RNG_Table entries as
>
>   typedef struct {
>       RNGtype kind;
>       N01type Nkind;
>       char *name;    /* print name */
>       int n_seed;    /* length of seed vector */
>       Int32 *i_seed;
>   } RNGTAB;
>
> This suggests that the type of user_unif_seedloc is Int32 *, not
> int *.  It also suggests that user_unif_nseed should return the
> number of 32-bit integers.  The code for PutRNGstate(), for example,
> uses them in just that way.
>
> While the dominant model, even on 64-bit hardware, is probably to
> leave int as 32 bits, it doesn't seem wise to assume that is always
> the case.

You can test the size of an int with a configure script.  See, for
example, the package foreign, or the package randtoolbox (available from
the Rmetrics project on R-Forge), which I maintain with Petr Savicky.

By the way, I'm sure he has an answer about RNGkind, because he wrote
the runif interface in the randtoolbox and rngWELL19937 packages.

Christophe

> I got into this because I'm trying to extend the rsprng code; sprng
> returns its state as a vector of bytes.  Converting these to a vector
> of integers depends on the integer length, hence my interest in the
> exact definition of integer.  I'm interested in lifetime because I
> believe those bytes are associated with the stream and become invalid
> when the stream is freed; furthermore, I probably need to copy them
> into a buffer that is padded to full word length.  This means I
> allocate the buffer whose address is returned to the core R RNG
> machinery.  Eventually somebody needs to free the memory.
>
> Far more of my rsprng adventures are on
> http://wiki.r-project.org/rwiki/doku.php?id=packages:cran:rsprng.
> Feel free to read, correct, or extend it.
>
> Thanks.
>
> Ross Boylan
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

--
Christophe Dutang
Ph.D. student at ISFA, Lyon, France
website: http://dutangc.free.fr
On Sun, 2009-08-16 at 21:24 +0200, Petr Savicky wrote:
> Dear Ross Boylan:
>
> Some time ago, you sent an email to R-devel with the following.
> > I got into this because I'm trying to extend the rsprng code; sprng
> > returns its state as a vector of bytes.  Converting these to a vector of
> > integers depends on the integer length, hence my interest in the exact
> > definition of integer.  I'm interested in lifetime because I believe
> > those bytes are associated with the stream and become invalid when the
> > stream is freed; furthermore, I probably need to copy them into a buffer
> > that is padded to full word length.  This means I allocate the buffer
> > whose address is returned to the core R RNG machinery.  Eventually
> > somebody needs to free the memory.
> >
> > Far more of my rsprng adventures are on
> > http://wiki.r-project.org/rwiki/doku.php?id=packages:cran:rsprng.  Feel
> > free to read, correct, or extend it.
>
> I am interested to know what the current state of your project is.

I did figure out some of the lifetime issues; SPRNG does allocate memory
when you ask it for its state.  I also realized that, for several
reasons, it would not be appropriate to hand that buffer to R.

I've reworked the page extensively since it had the section you quote.
See particularly the "Getting and Setting Stream State" section near the
bottom.

I submitted patches to hook rsprng into R's standard machinery for
stream state (the user-visible part of which is .Random.seed).  The
package developer has reservations about applying them.

As a practical matter, I shifted my package's C code to call back to R
to get random numbers.  If rsprng is loaded and activated, my code will
use it.  I also eliminated all attempts to set the seed in my code.

For rsprng, in its current form, the R set.seed() function is a no-op
and you have to use an rsprng function to set the seed (generally when
activating the library).

> There is a package rngwell19937 with a random number generator, which I
> develop and use for several parallel processes.  Setting a seed may be
> done by a vector, one of whose components is the process number.  The
> initialization then provides unrelated sequences for different
> processes.

That sounds interesting; thanks for pointing it out.

> Seeding by a vector is also available in the initialization of Mersenne
> Twister from 2002.  See mt19937ar.c (ar for array) at
> http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html
> Unfortunately, seeding by a vector is not available in R base.  R uses
> Mersenne Twister, but with an initialization by a single number.

I think that one could write to .Random.seed directly to set a vector
for many of the generators.  ?.Random.seed does not recommend this and
notes various limits and hazards of this strategy.

Ross
On Mon, Aug 17, 2009 at 12:25:57PM -0700, Ross Boylan wrote:
> > Seeding by a vector is also available in the initialization of Mersenne
> > Twister from 2002.  See mt19937ar.c (ar for array) at
> > http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html
> > Unfortunately, seeding by a vector is not available in R base.  R uses
> > Mersenne Twister, but with an initialization by a single number.
>
> I think that one could write to .Random.seed directly to set a vector
> for many of the generators.  ?.Random.seed does not recommend this and
> notes various limits and hazards of this strategy.

Initializing a generator by writing the initial state into .Random.seed
is possible, but it requires knowing the exact form of the admissible
states of the generator.  An error leads to a corrupted sequence.  So
this is indeed not recommended.

By seeding by a vector, I mean something different.  The user provides a
vector of arbitrary length, and this vector is used as the input to a
hash function, which produces a correct initial state.  The advantage
over seeding by a single number is that the set of possible initial
states is much larger, even with a seed of length 2 or 3.

Seeding by a vector is available, for example, in Mersenne Twister from
2002 using the C function init_by_array() in mt19937ar.c.  In the
package rngwell19937, the R function set.vector.seed() may be used for
the same purpose.

I will reply to the rest of your message in a private email (as was also
the email you were replying to).

Petr Savicky.
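The idea of hashing a seed vector into an initial state can be illustrated with a toy mixer. This is neither init_by_array() from mt19937ar.c nor set.vector.seed() from rngwell19937: seed_from_vector, STATE_LEN, and the mixing constants are chosen for illustration only, and the sketch makes no claim about statistical quality. It only shows the shape of the interface, where a key of any positive length is folded into a fixed-size state that is never all zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STATE_LEN 8   /* hypothetical state size, in 32-bit words */

/* Fold a key of arbitrary positive length into a fixed-size state.
   Illustrative mixing only; not the algorithm of any real generator. */
static void seed_from_vector(const uint32_t *key, size_t key_len,
                             uint32_t state[STATE_LEN])
{
    uint32_t h = 19650218u;   /* arbitrary nonzero starting value */
    size_t rounds = key_len > STATE_LEN ? key_len : STATE_LEN;
    size_t r;

    assert(key != NULL && key_len > 0);

    for (r = 0; r < STATE_LEN; r++)
        state[r] = 0;

    /* Every key word and every state word is visited at least once. */
    for (r = 0; r < rounds; r++) {
        h = (h ^ (h >> 30)) * 1664525u + key[r % key_len] + (uint32_t) r;
        state[r % STATE_LEN] ^= h;
    }

    state[0] |= 0x80000000u;  /* guarantee the state is not all zero */
}
```

A seed vector such as {2009, 1} for one process and {2009, 2} for the next then yields different states from the same base seed. Whether the resulting sequences are actually unrelated depends on the quality of the hash, which is what a real generator's initialization has to provide and this toy does not.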