lbaring@stochastik.uni-hannover.de
2000-Mar-24 12:23 UTC
[Rd] quantiles of the hypergeometric distribution (PR#502)
Hello! I use R-version 1.0.0 To get the 0.95 quantile of the hypergeometric distribution with the parameters m=45000,n=5000 and k=600 I use the R-command> qhyper(0.95,45000,5000,600).The value obtained is 600. However, the true value is 552. The latter can be obtained for example by calling the corresponding distribution function with the R commands> x<-540:580 > phyper(x,45000,5000,600)The value 552 is obtained also by the usual normal approximation. Overall, it seems that R produces errors when calling qhyper(p,m,n,k) for large values of the parametes m,n or k. Sincerely L. Baringhaus -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Torsten Hothorn
2000-Mar-24 13:40 UTC
[Rd] quantiles of the hypergeometric distribution (PR#502)
> Hello! > > I use R-version 1.0.0 > To get the 0.95 quantile of the hypergeometric > distribution with the parameters m=45000,n=5000 and > k=600 I use the R-command > > > qhyper(0.95,45000,5000,600). > > > The value obtained is 600. However, the true value > is 552.Well, had a short look. The problem arises in qhyper.c at line 56: term = exp(lfastchoose(NR, xr) + lfastchoose(NB, xb) - lfastchoose(N, n)); The expression inside the exp() function gives -1415.411856 and exp() returns 0. So term and sum in while(sum < p && xr < xend) { xr++; NB++; term *= (NR / xr) * (xb / NB); sum += term; xb--; NR--; } are 0 all the time and xend = 600 is returned. R-Core: How to fix this? Torsten ***************************************************************** * * * Torsten Hothorn, Statistician * * at the moment: Institut fuer Statistik, TU Wien * * Tel: 0043 1 58801 10772 * * Fax: 0043 1 58801 10798 * * * ***************************************************************** -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
thomas@biostat.washington.edu
2000-Mar-24 16:10 UTC
[Rd] quantiles of the hypergeometric distribution (PR#502)
On 24 Mar 2000, Peter Dalgaard BSA wrote:> Torsten Hothorn <hothorn@ci.tuwien.ac.at> writes: > > > term = exp(lfastchoose(NR, xr) + lfastchoose(NB, xb) - lfastchoose(N, n)); > > > > The expression inside the exp() function gives -1415.411856 and exp() > > returns 0. So term and sum in > > > > while(sum < p && xr < xend) { > > xr++; > > NB++; > > term *= (NR / xr) * (xb / NB); > > sum += term; > > xb--; > > NR--; > > } > > > > are 0 all the time and xend = 600 is returned. > > > > R-Core: How to fix this? > > Ick! That is of course only to be expected with recurrence relations > like this applied to large numbers. Since the term formula (as far as > I remember!) applies to all the point probabilities, I suppose that > one could start the recurrence from a "known positive" value and work > both forwards and backwards from there. Since you still want to do the > sums from the low end (although that is perhaps not all that > important?) you might need an intermediate array to hold the values. > One idea could be to start from the normal approximation of the > desired quantile. Probably a bit more than an afternoons work, but > hardly several weeks...As phyper seems to work we could just start at the Normal approximation to qhyper and search for x such that phyper(x)=p. Since it's a discrete distribution this shouldn't take that long. -thomas -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
p.dalgaard@biostat.ku.dk
2000-Mar-24 16:29 UTC
[Rd] quantiles of the hypergeometric distribution (PR#502)
Thomas Lumley <thomas@biostat.washington.edu> writes:> As phyper seems to work we could just start at the Normal approximation to > qhyper and search for x such that phyper(x)=p. Since it's a discrete > distribution this shouldn't take that long.Or apply the same fixup: sum += (small_N ? term : exp(term)); ... if(small_N) term *= (NR / xr) * (xb / NB); else term += log((NR / xr) * (xb / NB)); which is pretty obviously not optimized for speed, but certainly avoids the multiply-by-zero effect. -- O__ ---- Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk) FAX: (+45) 35327907 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._