Hello R-users, When I didn't know about the internal 'choose' function, I made such function, 'my.choose' below. But when I used them instead of choose(6000,20), they didn't give me any answer. What is the difference between 'choose', 'my.choose1', and 'my.choose2' below? That is, what is behind 'choose' function and what's the problem using 'prod' or 'gamma' function? Thanks a lot. John ##########> choose(6000,20)[1] 1.455904e+57> > my.choose1 <- function(x,y) {prod(1:x)/(prod(1:y)*prod(1:(x-y))) }> my.choose1(6000,20)[1] NaN> > my.choose2 <- function(x,y) {gamma(x+1)/(gamma(y+1)*gamma(x-y+1)) }> my.choose2(6000,20)[1] NaN>
John wrote:> Hello R-users, > > When I didn't know about the internal 'choose' > function, I made such function, 'my.choose' below. But > when I used them instead of choose(6000,20), they > didn't give me any answer. > > What is the difference between 'choose', 'my.choose1', > and 'my.choose2' below? That is, what is behind > 'choose' function and what's the problem using 'prod' > or 'gamma' function?prod() calculates the whole product while choose() can do it the clever way avoiding overflows. On my machine, .Machine$double.xmax is 1.797693e+308, hence prod(1:170) works while prod(1:171) is too much to be represented correctly ... Uwe Ligges> Thanks a lot. > > John > > ########## > > >>choose(6000,20) > > [1] 1.455904e+57 > >>my.choose1 <- function(x,y) { > > prod(1:x)/(prod(1:y)*prod(1:(x-y))) } > >>my.choose1(6000,20) > > [1] NaN > >>my.choose2 <- function(x,y) { > > gamma(x+1)/(gamma(y+1)*gamma(x-y+1)) } > >>my.choose2(6000,20) > > [1] NaN > > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Try with less ambitious numbers such as my.choose1(60,20). I think it works fine. I think the problem is that gamma(6001) and prod(1:6000) are so large that it gives Inf as the answer. Hence the numerator and denominator approaches Inf and division of two Inf gives NaN. You could use the natural log version of gamma (or even lchoose) to handle these large numbers as below my.choose3 <- function(x,y){ y <- lgamma(x+1) - lgamma(y+1) - lgamma(x-y+1) return( exp(y) ) } But have you tested the case when your inputs are not integers ? Regards, Adai On Mon, 2004-11-08 at 11:58, John wrote:> Hello R-users, > > When I didn't know about the internal 'choose' > function, I made such function, 'my.choose' below. But > when I used them instead of choose(6000,20), they > didn't give me any answer. > > What is the difference between 'choose', 'my.choose1', > and 'my.choose2' below? That is, what is behind > 'choose' function and what's the problem using 'prod' > or 'gamma' function? > > Thanks a lot. > > John > > ########## > > > choose(6000,20) > [1] 1.455904e+57 > > > > my.choose1 <- function(x,y) { > prod(1:x)/(prod(1:y)*prod(1:(x-y))) } > > my.choose1(6000,20) > [1] NaN > > > > my.choose2 <- function(x,y) { > gamma(x+1)/(gamma(y+1)*gamma(x-y+1)) } > > my.choose2(6000,20) > [1] NaN > > > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html >
On Mon, 8 Nov 2004 11:58:04 +0000 (GMT), John <cyracules at yahoo.co.uk> wrote :>What is the difference between 'choose', 'my.choose1', >and 'my.choose2' below? That is, what is behind >'choose' function and what's the problem using 'prod' >or 'gamma' function? > >Thanks a lot. > >John > >########## > >> choose(6000,20) >[1] 1.455904e+57 >> >> my.choose1 <- function(x,y) { >prod(1:x)/(prod(1:y)*prod(1:(x-y))) } >> my.choose1(6000,20) >[1] NaNGenerally 1:x will be taken to be of mode integer, so I would have expected prod(1:x) to overflow around x==13, but whoever programmed it was smart, and switched the values to floating point. Floating point evaluation of prod(1:x) overflows around x==171. The ratio of two overflows is undefined, so you get a NaN result.>> >> my.choose2 <- function(x,y) { >gamma(x+1)/(gamma(y+1)*gamma(x-y+1)) } >> my.choose2(6000,20) >[1] NaNSame problem. Duncan Murdoch
On 08-Nov-04 John wrote:> Hello R-users, > > When I didn't know about the internal 'choose' > function, I made such function, 'my.choose' below. But > when I used them instead of choose(6000,20), they > didn't give me any answer. > > What is the difference between 'choose', 'my.choose1', > and 'my.choose2' below? That is, what is behind > 'choose' function and what's the problem using 'prod' > or 'gamma' function? > > Thanks a lot. > > John > >########## > >> choose(6000,20) > [1] 1.455904e+57 >> >> my.choose1 <- function(x,y) { > prod(1:x)/(prod(1:y)*prod(1:(x-y))) } >> my.choose1(6000,20) > [1] NaN >> >> my.choose2 <- function(x,y) { > gamma(x+1)/(gamma(y+1)*gamma(x-y+1)) } >> my.choose2(6000,20) > [1] NaNYour problem arises because the first thing that gets computed in your functions is the factorial of a large number, which goes out of range; you then compound this by computing the factorial of another large number, and dividing one by the other: > prod(1:6000) [1] Inf > prod(1:5980) [1] Inf > prod(1:6000)/prod(1:5980) [1] NaN It doesn't matter whether you do this using 'prod' or 'gamma'. Those of us who grew up in the old days when you had to do things the hard way learned tricks like: my.choose3 <- function(x,y){ if((x==y)||(y==0)) return(1); m <- min(y,x-y) prod((x:(x-m+1))/(m:1)) } i.e. implementing, using term-by-term ratios (which keeps the numbers within bounds all the way) either (x/y)*((x-1)/(y-1))*...*((x-y+2)/2)*((x-y+1)/1) = x*(x-1)*...*(x-y+1)/{y*(y-1)*...*2*1} or (x/(x-y))*((x-1)/(x-y-1))*...*((y+2)/2)*((y+1)/1) = x*(x-1)*...*(y+1)/{(x-y)*(x-y-1)*...*2*1} whichever gives the shorter product. (Actually, in those days we did it with loops too, or even by turning a handle -- all the more reason to keep it short!). Anyway: > choose(6000,20) [1] 1.455904e+57 > my.choose3(6000,20) [1] 1.455904e+57 (And, if you try it, you'll see that 'my.choose3' is pretty fast). However, as written above 'my.choose3' doesn't like really large arguments: > choose(60000000000,31) [1] 1.612899e+300 > my.choose3(60000000000,31) Error in x:(x - m + 1) : argument too large in magnitude because the result of ":" is integer. However, it works OK in the following form: my.choose3<-function(x,y){ if((x==y)||(y==0)) return(1); m <- min(y,x-y) prod(seq(x,(x-m+1),by=-1)/(seq(m,1,by=-1))) } when my.choose3(60000000000,31) [1] 1.613121e+300 which has a slight difference (0.014% greater) from the result of 'choose'. Given the method of computation, I might feel inclined to trust 'my.choose3' rather than 'choose', but I'm not at sure of this without studying the internal code of 'choose', and would welcome comments! Best wishes to all, Ted. -------------------------------------------------------------------- E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk> Fax-to-email: +44 (0)870 094 0861 [NB: New number!] Date: 08-Nov-04 Time: 14:29:46 ------------------------------ XFMail ------------------------------