GILLIBERT, Andre
2021-Aug-12 09:51 UTC
[Rd] Problem in random number generation for Marsaglia-Multicarry + Kinderman-Ramage
Dear R developers, In my opinion, I discovered a severe flaw that occur with the combination of the Marsaglia-Multicarry pseudo-random number generator associated to the Kinderman-Ramage algorithm to generate normally distributed numbers. The sample program is very simple (tested on R-4.1.1 x86_64 on Windows 10): set.seed(1, "Marsaglia-Multicarry", normal.kind="Kinderman-Ramage") v=rnorm(1e7) poisson.test(sum(v < (-4)))$conf.int # returns c(34.5, 62.5) poisson.test(sum(v > (4)))$conf.int # returns c(334.2, 410.7) pnorm(-4)*1e7 # returns 316.7 There should be approximatively 316 values less than -4 and 316 values greater than +4, bug there are far too few values less than -4. Results are similar with other random seeds, and things are even more obvious with larger sample sizes. The Kinderman-Ramage algorithm is fine when combined to Mersenne-Twister, and Marsaglia-Multicarry is fine when combined with the normal.kind="Inversion" algorithm, but the combination of Marsaglia-Multicarry and Kinderman-Ramage seems to have severe flaws. R should at least warn for that combination ! What do you think? Should I file a bug report? -- Sincerely Andr? GILLIBERT [[alternative HTML version deleted]]
peter dalgaard
2021-Aug-12 13:07 UTC
[Rd] Problem in random number generation for Marsaglia-Multicarry + Kinderman-Ramage
With these matters, one has to be careful to distinguish between method error and implementation error. The reason for changing the RNG setup in R v. 1.7.0 was pretty much this kind of unfortunate interaction between M-M and K-R. There are even more egregious examples for the distribution of maxima of normal variables. Try e.g. RNGversion("1.6.0") # Marsaglia-Multicarry, Kinderman-Ramage s <- replicate(1e6,max(rnorm(10))) plot(density(s)) (A further bug in K-R was fixed in 1.7.1, but that is tangential to this.) A glimpse of the source of the problem is seen in the "microcorrelations" in this: RNGkind("Mar");m <- matrix(runif(4e7),2) plot(m[1,],m[2,],xlim=c(0,1e-3),pch=".") m <- matrix(runif(4e7),2) points(m[1,],m[2,],pch=".") These examples are from 2003, so the issue has been known for almost 2 decades. However, to the best of our knowledge, the M-M RNG is a faithful implementation of their method, so we have left the RNG in R's arsenal, in case someone needed it for some specific purpose. - pd> On 12 Aug 2021, at 11:51 , GILLIBERT, Andre <Andre.Gillibert at chu-rouen.fr> wrote: > > Dear R developers, > > > In my opinion, I discovered a severe flaw that occur with the combination of the Marsaglia-Multicarry pseudo-random number generator associated to the Kinderman-Ramage algorithm to generate normally distributed numbers. > > > The sample program is very simple (tested on R-4.1.1 x86_64 on Windows 10): > > set.seed(1, "Marsaglia-Multicarry", normal.kind="Kinderman-Ramage") > v=rnorm(1e7) > poisson.test(sum(v < (-4)))$conf.int # returns c(34.5, 62.5) > poisson.test(sum(v > (4)))$conf.int # returns c(334.2, 410.7) > pnorm(-4)*1e7 # returns 316.7 > > > There should be approximatively 316 values less than -4 and 316 values greater than +4, bug there are far too few values less than -4. > > Results are similar with other random seeds, and things are even more obvious with larger sample sizes. > > The Kinderman-Ramage algorithm is fine when combined to Mersenne-Twister, and Marsaglia-Multicarry is fine when combined with the normal.kind="Inversion" algorithm, but the combination of Marsaglia-Multicarry and Kinderman-Ramage seems to have severe flaws. > > R should at least warn for that combination ! > > What do you think? Should I file a bug report? > > -- > Sincerely > Andr? GILLIBERT > > [[alternative HTML version deleted]] > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com