Hi, Our department has detected a bug in the implementation of the Kinderman-Ramage generator for normal random variates in version 1.7.0, which can be seen from the below R session. (Consecutive calls for chisq.test(...) always gives p-values very close to 0.) We have already encountered this bug in version 1.6.2 The error is in file R-1.7.0/src/nmath/snorm.c Here is a patch for this file to correct the error (the added two lines are crucial): --- R-1.7.0/src/nmath/snorm.c Wed Feb 26 16:51:17 2003 +++ snorm.c Fri Apr 25 09:22:28 2003 @@ -197,7 +197,9 @@ u1 = unif_rand(); if(u1 < 0.884070402298758) { u2 = unif_rand(); - return A*(1.13113163544180*u1+u2-1); + /** <modified constant> **/ + return A*(1.131131635444180*u1+u2-1); + /** </modified> **/ } if(u1 >= 0.973310954173898) { /* tail: */ @@ -241,6 +243,10 @@ tt = 0.479727404222441-0.595507138015940*fmin2(u2,u3); if(fmax2(u2,u3) <= 0.805577924423817) return (u2<u3) ? tt : -tt; + /** <added> **/ + if(0.053377549506886*fabs(u2-u3) <= g(tt)) + return (u2<u3) ? tt : -tt; + /** </added> **/ } case BOX_MULLER: if(BM_norm_keep != 0.0) { /* An exact test is intentional */ ---------------------------------------------------------- $ uname -a Linux 2.4.18-3 #1 Thu Apr 18 07:37:53 EDT 2002 i686 unknown $ R R : Copyright 2003, The R Development Core Team Version 1.7.0 (2003-04-16) R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type `license()' or `licence()' for distribution details. R is a collaborative project with many contributors. Type `contributors()' for more information. Type `demo()' for some demos, `help()' for on-line help, or `help.start()' for a HTML browser interface to help. Type `q()' to quit R.> RNGkind(normal.kind="Kinderman-Ramage") > chisq.test(table(trunc(100*pnorm(rnorm(1e6)))))Chi-squared test for given probabilities data: table(trunc(100 * pnorm(rnorm(1e+06)))) X-squared = 204.1378, df = 99, p-value = 2.746e-09 Josef -- ----------------------------------------------------------------------------- Josef Leydold | University of Economics and Business Administration | Department for Applied Statistics and Data Processing ----------------------------------------------------------------------------- Augasse 2-6 | Tel. *43/1/31336-4695 A-1090 Vienna | FAX *43/1/31336-738 European Union | email Josef.Leydold@statistik.wu-wien.ac.at ----------------------------------------------------------------------------- Alles Unglueck kam daher, dass die Denkenden nicht mehr handeln konnten, und die Handelnden keine Zeit mehr fanden zu denken. (Marlen Haushofer)
leydold@statistik.wu-wien.ac.at writes:> Hi, > > Our department has detected a bug in the implementation of the > Kinderman-Ramage generator for normal random variates in version > 1.7.0, which can be seen from the below R session. > (Consecutive calls for chisq.test(...) always gives p-values very > close to 0.) > We have already encountered this bug in version 1.6.2 > > The error is in file > R-1.7.0/src/nmath/snorm.c > > Here is a patch for this file to correct the error > (the added two lines are crucial):Thanks. Both look like transcription errors. What was the source of the corrected constants? The JASA paper cited in snorm.c? (Another way of seeing the effect is m <- sapply(1:50,function(i)table(trunc(100*pnorm(rnorm(1e6))))) matplot(m,type="l") Two small, but clear, spikes at around 30 and 70 can be seen. The effect is not seen with the Inversion method, which is the default in R-1.7.0.) -- O__ ---- Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk) FAX: (+45) 35327907
Prof Brian Ripley <ripley@stats.ox.ac.uk> writes:> Unfortunately it was the default prior to 1.7.0, so for reproducibility I > think we can't change this. We could add a new option, a corrected > version, but is it worth it? I would definitely add a warning if the > buggy version was selected, and perhaps make it an error to select it via > RNGkind (rather than RNGversion or a saved .Random.seed).Argh... We do normally correct errors leading to incorrect values. I mean, if we found a bug in qnorm() we wouldn't hesitate to fix it even if though it affects the Inversion RNG, would we? How about this: Keep the old generator, but call it "K-R-bug" or so. Fix the real K-R. Make K-R-bug the version set by RNGversion. One or more of the methods to set it will generate warnings. -- O__ ---- Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk) FAX: (+45) 35327907
Martin Maechler <maechler@stat.math.ethz.ch> writes:> PD> How about this: Keep the old generator, but call it > PD> "K-R-bug" or so. Fix the real K-R. Make K-R-bug the > PD> version set by RNGversion. One or more of the methods to > PD> set it will generate warnings. > > This proposal sounds good to my ears. Since we haven't heard > other comments -- this should go into R-patched, right? > {and who from R-core does it?}I intend to, but not just now. The consequences of not being prepared for my course on Tuesday are worse than leaving a buggy K-R generator in 1.7.0 for a while (especially since it is no longer the default)... -- O__ ---- Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk) FAX: (+45) 35327907
leydold@statistik.wu-wien.ac.at writes:> Hi, > > Our department has detected a bug in the implementation of the > Kinderman-Ramage generator for normal random variates in version > 1.7.0, which can be seen from the below R session. > (Consecutive calls for chisq.test(...) always gives p-values very > close to 0.) > We have already encountered this bug in version 1.6.2Josef, I'm in the process of implementing your fixes, but while they certainly improve things, I'm still seeing anomalies. Could you try this (it takes a while, even on a fast machine) m <- sapply(1:200,function(i)table(trunc(100*pnorm(rnorm(1e6))))) plot(rowMeans(m), type="l") and tell me if you see a double peak in the middle? It's only about 1.3% off-target, but given the accuracy of the constants in the routine, it does seem a bit more than you should expect. I see the same effect both with the Mersenne-Twister and with the Wichmann-Hill uniform generators.> The error is in file > R-1.7.0/src/nmath/snorm.c > > Here is a patch for this file to correct the error > (the added two lines are crucial): > > --- R-1.7.0/src/nmath/snorm.c Wed Feb 26 16:51:17 2003 > +++ snorm.c Fri Apr 25 09:22:28 2003 > @@ -197,7 +197,9 @@ > u1 = unif_rand(); > if(u1 < 0.884070402298758) { > u2 = unif_rand(); > - return A*(1.13113163544180*u1+u2-1); > + /** <modified constant> **/ > + return A*(1.131131635444180*u1+u2-1); > + /** </modified> **/ > } > > if(u1 >= 0.973310954173898) { /* tail: */ > @@ -241,6 +243,10 @@ > tt = 0.479727404222441-0.595507138015940*fmin2(u2,u3); > if(fmax2(u2,u3) <= 0.805577924423817) > return (u2<u3) ? tt : -tt; > + /** <added> **/ > + if(0.053377549506886*fabs(u2-u3) <= g(tt)) > + return (u2<u3) ? tt : -tt; > + /** </added> **/ > } > case BOX_MULLER: > if(BM_norm_keep != 0.0) { /* An exact test is intentional */ > > > > ---------------------------------------------------------- > > $ uname -a > Linux 2.4.18-3 #1 Thu Apr 18 07:37:53 EDT 2002 i686 unknown > > $ R > R : Copyright 2003, The R Development Core Team > Version 1.7.0 (2003-04-16) > > R is free software and comes with ABSOLUTELY NO WARRANTY. > You are welcome to redistribute it under certain conditions. > Type `license()' or `licence()' for distribution details. > > R is a collaborative project with many contributors. > Type `contributors()' for more information. > > Type `demo()' for some demos, `help()' for on-line help, or > `help.start()' for a HTML browser interface to help. > Type `q()' to quit R. > > > RNGkind(normal.kind="Kinderman-Ramage") > > chisq.test(table(trunc(100*pnorm(rnorm(1e6))))) > > Chi-squared test for given probabilities > > data: table(trunc(100 * pnorm(rnorm(1e+06)))) > X-squared = 204.1378, df = 99, p-value = 2.746e-09 > > > Josef > > -- > > ----------------------------------------------------------------------------- > Josef Leydold | University of Economics and Business Administration > | Department for Applied Statistics and Data Processing > ----------------------------------------------------------------------------- > Augasse 2-6 | Tel. *43/1/31336-4695 > A-1090 Vienna | FAX *43/1/31336-738 > European Union | email Josef.Leydold@statistik.wu-wien.ac.at > ----------------------------------------------------------------------------- > Alles Unglueck kam daher, dass die Denkenden nicht mehr handeln konnten, > und die Handelnden keine Zeit mehr fanden zu denken. (Marlen Haushofer) > > ______________________________________________ > R-devel@stat.math.ethz.ch mailing list > https://www.stat.math.ethz.ch/mailman/listinfo/r-devel >-- O__ ---- Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk) FAX: (+45) 35327907