There appears to be a mild bug, or at least a deficiency, in rnorm. The bug becomes apparent when one looks at extremes of the squares of the values generated by rnorm; rnorm is not generating quite enough extreme values. The R version that I am using is 1.4.1; I never got around to installing 1.5.0, and now since 1.5.1 is about to come out .... However, checking the 1.5.0 release notes revealed no mention of fixing a bug in rnorm. Version details: platform sparc-sun-solaris2.7 arch sparc os solaris2.7 system sparc, solaris2.7 status major 1 minor 4.1 year 2002 month 01 day 30 language R More detail regarding how the bug revealed itself: ================================================= For n = 100, 200, ..., 1500 o I generated 1000 sequences of length n via rnorm(n), o for each sequence x, I calculated m = the max of x^2 o I then calculated pval = 1 - pchisq(m,1)^n o I then calculated s.hat.n = #{pval: pval < 0.05}/1000 I then plotted s.hat.n versus n. This ***should*** give a result close to a horizontal straight line, at height 0.05 --- but it didn't. For the larger values of n, the values of s.hat.n were displaced significantly below 0.05. After some discussion with colleagues, I replaced the calls to rnorm() by calls to myrnorm() defined by myrnorm <- function(n,mu=0,sigma=1){ mu + sigma*cos(2*pi*runif(n))*sqrt(-2*log(runif(n))) } which uses the ``(r,theta)'' method of generating random normals. When I did so, the resulting values were indeed all ``close to'' 0.05, as they should be. I also tried the experiment using rchisq(n,1) instead of rnorm(n) (and then of course taking m = max of x --- rather than max of x^2). Again all the resulting values were close to 0.05 as ought to be the case. (So rchisq() appears to be OK in this regard.) Enclosed below is a script to demonstrate the bug. cheers, Rolf Turner rolf@math.unb.ca #==+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+==# # Script to demonstrate the bug in rnorm. # myrnorm <- function(n,mu=0,sigma=1){ mu + sigma*cos(2*pi*runif(n))*sqrt(-2*log(runif(n))) } # If RFUN <- rnorm we get ``wrong'' answers; if RFUN <- myrnorm, # we get ``right'' answers. RFUN <- rnorm NSER <- 1000 set.seed(350734) rslt <- list() for(K in 1:15) { N <- 100*K M <- matrix(RFUN(NSER*N),N,NSER) T2 <- apply(M,2,function(x){max(x**2)}) PV <- 1 - pchisq(T2,1)**N SZ <- sum(PV < 0.05)/NSER rslt[[K]] <- SZ cat(K,"\n") } rslt <- unlist(rslt) plot(100*(1:15),rslt,ylim=c(0,0.1),xlab='n',ylab='s.hat.n') abline(h=0.05) error.bar(100*(1:15),rslt,lower=1.96*sqrt(0.05*0.95/1000),add=TRUE) # Clean up: rm(RFUN,NSER,K,N,M,T2,PV,SZ) #==+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+== -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
rolf@math.unb.ca writes:> There appears to be a mild bug, or at least a deficiency, in > rnorm. The bug becomes apparent when one looks at extremes > of the squares of the values generated by rnorm; rnorm is not > generating quite enough extreme values. > > The R version that I am using is 1.4.1; I never got around to installing > 1.5.0, and now since 1.5.1 is about to come out .... However, checking > the 1.5.0 release notes revealed no mention of fixing a bug in rnorm....and I see the effect too with an r-patched from a few days back. [snip]> After some discussion with colleagues, I replaced the calls to rnorm() > by calls to myrnorm() defined by > > myrnorm <- function(n,mu=0,sigma=1){ > mu + sigma*cos(2*pi*runif(n))*sqrt(-2*log(runif(n))) > } > > which uses the ``(r,theta)'' method of generating random normals. > > When I did so, the resulting values were indeed all ``close to'' 0.05, > as they should be. > > I also tried the experiment using rchisq(n,1) instead of rnorm(n) (and > then of course taking m = max of x --- rather than max of x^2). Again > all the resulting values were close to 0.05 as ought to be the case. > (So rchisq() appears to be OK in this regard.)Also qnorm(runif(n)) seems to be closer to the target. -- O__ ---- Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk) FAX: (+45) 35327907 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Peter Dalgaard BSA <p.dalgaard@biostat.ku.dk> writes:> rolf@math.unb.ca writes: > > > There appears to be a mild bug, or at least a deficiency, in > > rnorm. The bug becomes apparent when one looks at extremes > > of the squares of the values generated by rnorm; rnorm is not > > generating quite enough extreme values. > > > > The R version that I am using is 1.4.1; I never got around to installing > > 1.5.0, and now since 1.5.1 is about to come out .... However, checking > > the 1.5.0 release notes revealed no mention of fixing a bug in rnorm. > > ...and I see the effect too with an r-patched from a few days back.No I do not! My quick check code took the square of the max rather than the max of the square... With your script, I get a nice horizontal line both with 1.4.0 and 1.5.0-patched. (RedHat Linux 7.1 on a PC) -- O__ ---- Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk) FAX: (+45) 35327907 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
R supplies a choice of random-number generators as well as a choice of generating normals. All methods are compromises, and I suspect one of the non-default ones may be better for your purposes. Please investigate the choices. On 13 Jun 2002, Peter Dalgaard BSA wrote:> rolf@math.unb.ca writes: > > > There appears to be a mild bug, or at least a deficiency, in > > rnorm. The bug becomes apparent when one looks at extremes > > of the squares of the values generated by rnorm; rnorm is not > > generating quite enough extreme values. > > > > The R version that I am using is 1.4.1; I never got around to installing > > 1.5.0, and now since 1.5.1 is about to come out .... However, checking > > the 1.5.0 release notes revealed no mention of fixing a bug in rnorm. > > ...and I see the effect too with an r-patched from a few days back. > > > [snip] > > After some discussion with colleagues, I replaced the calls to rnorm() > > by calls to myrnorm() defined by > > > > myrnorm <- function(n,mu=0,sigma=1){ > > mu + sigma*cos(2*pi*runif(n))*sqrt(-2*log(runif(n))) > > } > > > > which uses the ``(r,theta)'' method of generating random normals. > > > > When I did so, the resulting values were indeed all ``close to'' 0.05, > > as they should be. > > > > I also tried the experiment using rchisq(n,1) instead of rnorm(n) (and > > then of course taking m = max of x --- rather than max of x^2). Again > > all the resulting values were close to 0.05 as ought to be the case. > > (So rchisq() appears to be OK in this regard.) > > Also qnorm(runif(n)) seems to be closer to the target. > > -- > O__ ---- Peter Dalgaard Blegdamsvej 3 > c/ /'_ --- Dept. of Biostatistics 2200 Cph. N > (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 > ~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk) FAX: (+45) 35327907 > -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- > r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html > Send "info", "help", or "[un]subscribe" > (in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch > _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._ >-- Brian D. Ripley, ripley@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272860 (secr) Oxford OX1 3TG, UK Fax: +44 1865 272595 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._