Faheem Mitha
2002-Jan-24 22:39 UTC
[R] random number generation issues with r and compiled C code
Dear People, I have been writing a simulation routine that is currently entirely written in c++. I've been using the R standalone math library to give me random number generation for this. However, I have had to set the seed myself, and I am using set_seed(time(NULL), clock) for every call to unif_rand(). However, this works really badly. The random numbers aren't uniform at all. Does anyone have better suggestions? I think the problem is that every time I call the random number generation function, I have to set the seed, and this messes up the uniformity of the random number generation. To be more explicit, if I was to call the seed once, and then generate a thousand unif rand variates, these would be distributed more or less correctly, but if I set the seed for each of the thousand variates separately, this screws up the algorithm. Is my understanding correct? I've been looking on the web, and it looks like there is no perfect solution to the problem of generating random numbers in C/C++. So, I'm thinking of running my program inside R by compiling my code as a shared library, and using R's builtin random number facilities. I have a couple of questions about this. 1) The functions GetRNGstate() and PutRNGetate() read and write .Random.Seed. The question is what is the best strategy for calling these. Would it be best to call these routines as infrequently as possible? My remarks above (if correct) suggest that. That is, would be best to call GetRNGstate() once before generating all random numbers necessary and then call it again when I have finished calling all the random variates? Or does it not matter that much? Also, is scoping a problem here? I have not tried this yet, but suppose I was to do as follows foo() { GetRNGstate(); randcall(); randcall(); PutRNGetate(); } where randcall() { int i; for(i=0; i < 100; i++) cout << unif_rand(); } would this be Ok, or do calls to GetRNGstate() and PutRNGetate() have to be inside randcall()? If this is the case, then the above strategy of reading in seed as early as possible and writing it out as late as possible will not work. Also, what is the answer to the corresponding question in the purely C/C++ situation? Ie. can I call set_seed at top level? eg, in main()? If so, this would make a lot of the problems go away. 2) What is the difference between doing GetRNGstate(); unif_rand(); PutRNGetate(); and calling runif()? or are these equivalent? Ie. does runif() read in and out seed itself? Thanks in advance. Sincerely, Faheem Mitha. -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Prof Brian Ripley
2002-Jan-24 23:20 UTC
[R] random number generation issues with r and compiled C code
All the answers here are taken from R-exts.texi. 1) However, before these are used, the user must call GetRNGstate(); and after all the required variates have been generated, call PutRNGstate(); These essentially read in (or create) @code{.Random.seed} and write it out after use. So, call each just once in each call from R to your compiled code. 2) The C code behind @R{}'s @code{r at var{xxx}} functions can be accessed by including the header file @file{Rmath.h}; @xref{Distribution functions}. Those calls generate a single variate and should also be enclosed in calls to @code{GetRNGstate} and @code{PutRNGstate}. (The difference between unif_rand() and runif(a, b) lies in the extra arguments.) 3) When using standalone Rmathlib A little care is needed to use the random-number routines. You will need to supply the uniform random number generator double unif_rand(void) or use the one supplied (and with a shared library or DLL you will have to use the one supplied, which is the Marsaglia-multicarry with an entry point set_seed(unsigned int, unsigned int) to set its seeds). So, I'm not clear if you have supplied your own or used the one supplied, but if the latter, you need to call set_seed, *but only once*. Finally, R's random number generators are of course written in C and are also not perfect. (They were chosen by an expert based on the consensus of experts, though.) On Thu, 24 Jan 2002, Faheem Mitha wrote:> > Dear People, > > I have been writing a simulation routine that is currently entirely > written in c++. I've been using the R standalone math library to give me > random number generation for this. However, I have had to set the seed > myself, and I am using > > set_seed(time(NULL), clock) > > for every call to unif_rand(). > > However, this works really badly. The random numbers aren't uniform at > all. Does anyone have better suggestions? > > I think the problem is that every time I call the random number generation > function, I have to set the seed, and this messes up the uniformity of the > random number generation. To be more explicit, if I was to call the seed > once, and then generate a thousand unif rand variates, these would be > distributed more or less correctly, but if I set the seed for each of the > thousand variates separately, this screws up the algorithm. Is my > understanding correct? > > I've been looking on the web, and it looks like there is no perfect > solution to the problem of generating random numbers in C/C++. So, I'm > thinking of running my program inside R by compiling my code as a shared > library, and using R's builtin random number facilities. > > I have a couple of questions about this. > > 1) The functions GetRNGstate() and PutRNGetate() read and write > .Random.Seed. The question is what is the best strategy for calling these. > Would it be best to call these routines as infrequently as possible? My > remarks above (if correct) suggest that. That is, would be best to call > GetRNGstate() once before generating all random numbers necessary and then > call it again when I have finished calling all the random variates? Or > does it not matter that much? > > Also, is scoping a problem here? I have not tried this yet, but suppose I > was to do as follows > > foo() > { > GetRNGstate(); > randcall(); > randcall(); > PutRNGetate(); > } > > where > > randcall() > { > int i; > > for(i=0; i < 100; i++) > cout << unif_rand(); > } > > would this be Ok, or do calls to GetRNGstate() and PutRNGetate() have > to be inside randcall()? If this is the case, then the above strategy > of reading in seed as early as possible and writing it out as late as > possible will not work. > > Also, what is the answer to the corresponding question in the purely > C/C++ situation? Ie. can I call set_seed at top level? eg, in main()? > If so, this would make a lot of the problems go away. > > 2) What is the difference between doing > > GetRNGstate(); > unif_rand(); > PutRNGetate(); > > and calling runif()? > > or are these equivalent? Ie. does runif() read in and out seed itself? > > Thanks in advance. > > Sincerely, Faheem Mitha. > > -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- > r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html > Send "info", "help", or "[un]subscribe" > (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch > _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._ >-- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272860 (secr) Oxford OX1 3TG, UK Fax: +44 1865 272595 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
hzi@uol.com.br
2002-Jan-29 02:29 UTC
[R] random number generation issues with r and compiled C code
On Thu, Jan 24, 2002 at 05:39:14PM -0500, Faheem Mitha wrote:> > Dear People, > > I have been writing a simulation routine that is currently entirely > written in c++. I've been using the R standalone math library to give me > random number generation for this. However, I have had to set the seed > myself, and I am using > > set_seed(time(NULL), clock) > > for every call to unif_rand(). > > However, this works really badly. The random numbers aren't uniform at > all. Does anyone have better suggestions?Hi- One question about "random number generation": AFAIK, true random number generation has to receive random input from the "outside." For instance, when you generate GnuPG or PGP keys for encryption, you have to type something on the keyboard. With all the other stuff I've seen that purports to generate random numbers, real randomo is no being generated, right? Shoudn't R have some function similar to the above mentioned one that exists in PGP or GNuPG? TIA Regs, Henry -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._