thr3ads.net - R help - [R] About 'choose' function [Nov 2004]

If this information is useful, please help other people find it:
Share via:

John

2004-Nov-08 11:58 UTC

[R] About 'choose' function

Hello R-users,

When I didn't know about the internal 'choose'
function, I made such function, 'my.choose' below. But
when I used them instead of choose(6000,20), they
didn't give me any answer.

What is the difference between 'choose', 'my.choose1',
and 'my.choose2' below? That is, what is behind
'choose' function and what's the problem using 'prod'
or 'gamma' function?

Thanks a lot.

John

##########
> choose(6000,20)
[1] 1.455904e+57>
> my.choose1 <- function(x,y) {
prod(1:x)/(prod(1:y)*prod(1:(x-y))) }> my.choose1(6000,20)
[1] NaN>
> my.choose2 <- function(x,y) {
gamma(x+1)/(gamma(y+1)*gamma(x-y+1)) }> my.choose2(6000,20)
[1] NaN>

Uwe Ligges

2004-Nov-08 13:17 UTC

head link

[R] About 'choose' function

John wrote:
> Hello R-users,
> 
> When I didn't know about the internal 'choose'
> function, I made such function, 'my.choose' below. But
> when I used them instead of choose(6000,20), they
> didn't give me any answer.
> 
> What is the difference between 'choose', 'my.choose1',
> and 'my.choose2' below? That is, what is behind
> 'choose' function and what's the problem using 'prod'
> or 'gamma' function?
prod() calculates the whole product while choose() can do it the clever 
way avoiding overflows. On my machine, .Machine$double.xmax is 
1.797693e+308, hence prod(1:170) works while prod(1:171) is too much to 
be represented correctly ...

Uwe Ligges

> Thanks a lot.
> 
> John
> 
> ##########
> 
> 
>>choose(6000,20)
> 
> [1] 1.455904e+57
> 
>>my.choose1 <- function(x,y) {
> 
> prod(1:x)/(prod(1:y)*prod(1:(x-y))) }
> 
>>my.choose1(6000,20)
> 
> [1] NaN
> 
>>my.choose2 <- function(x,y) {
> 
> gamma(x+1)/(gamma(y+1)*gamma(x-y+1)) }
> 
>>my.choose2(6000,20)
> 
> [1] NaN
> 
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

Adaikalavan Ramasamy

2004-Nov-08 13:30 UTC

head link

[R] About 'choose' function

Try with less ambitious numbers such as my.choose1(60,20). I think it
works fine.

I think the problem is that gamma(6001) and prod(1:6000) are so large
that it gives Inf as the answer. Hence the numerator and denominator
approaches Inf and division of two Inf gives NaN.

You could use the natural log version of gamma (or even lchoose) to
handle these large numbers as below 

my.choose3 <- function(x,y){
  y <- lgamma(x+1) - lgamma(y+1) - lgamma(x-y+1)
  return( exp(y) )
}

But have you tested the case when your inputs are not integers ?

Regards, Adai


On Mon, 2004-11-08 at 11:58, John wrote:> Hello R-users,
> 
> When I didn't know about the internal 'choose'
> function, I made such function, 'my.choose' below. But
> when I used them instead of choose(6000,20), they
> didn't give me any answer.
> 
> What is the difference between 'choose', 'my.choose1',
> and 'my.choose2' below? That is, what is behind
> 'choose' function and what's the problem using 'prod'
> or 'gamma' function?
> 
> Thanks a lot.
> 
> John
> 
> ##########
> 
> > choose(6000,20)
> [1] 1.455904e+57
> >
> > my.choose1 <- function(x,y) {
> prod(1:x)/(prod(1:y)*prod(1:(x-y))) }
> > my.choose1(6000,20)
> [1] NaN
> >
> > my.choose2 <- function(x,y) {
> gamma(x+1)/(gamma(y+1)*gamma(x-y+1)) }
> > my.choose2(6000,20)
> [1] NaN
> >
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
>

Duncan Murdoch

2004-Nov-08 13:55 UTC

head link

[R] About 'choose' function

On Mon, 8 Nov 2004 11:58:04 +0000 (GMT), John <cyracules at yahoo.co.uk>
wrote :
>What is the difference between 'choose', 'my.choose1',
>and 'my.choose2' below? That is, what is behind
>'choose' function and what's the problem using 'prod'
>or 'gamma' function?
>
>Thanks a lot.
>
>John
>
>##########
>
>> choose(6000,20)
>[1] 1.455904e+57
>>
>> my.choose1 <- function(x,y) {
>prod(1:x)/(prod(1:y)*prod(1:(x-y))) }
>> my.choose1(6000,20)
>[1] NaN
Generally 1:x will be taken to be of mode integer, so I would have
expected prod(1:x) to overflow around x==13, but whoever programmed it
was smart, and switched the values to floating point.  Floating point
evaluation of prod(1:x) overflows around x==171.  The ratio of two
overflows is undefined, so you get a NaN result.>>
>> my.choose2 <- function(x,y) {
>gamma(x+1)/(gamma(y+1)*gamma(x-y+1)) }
>> my.choose2(6000,20)
>[1] NaN
Same problem.

Duncan Murdoch

(Ted Harding)

2004-Nov-08 14:29 UTC

head link

[R] About 'choose' function

On 08-Nov-04 John wrote:> Hello R-users,
> 
> When I didn't know about the internal 'choose'
> function, I made such function, 'my.choose' below. But
> when I used them instead of choose(6000,20), they
> didn't give me any answer.
> 
> What is the difference between 'choose', 'my.choose1',
> and 'my.choose2' below? That is, what is behind
> 'choose' function and what's the problem using 'prod'
> or 'gamma' function?
> 
> Thanks a lot.
> 
> John
> 
>##########
> 
>> choose(6000,20)
> [1] 1.455904e+57
>>
>> my.choose1 <- function(x,y) {
> prod(1:x)/(prod(1:y)*prod(1:(x-y))) }
>> my.choose1(6000,20)
> [1] NaN
>>
>> my.choose2 <- function(x,y) {
> gamma(x+1)/(gamma(y+1)*gamma(x-y+1)) }
>> my.choose2(6000,20)
> [1] NaN
Your problem arises because the first thing that gets computed
in  your functions is the factorial of a large number, which
goes out of range; you then compound this by computing the
factorial of another large number, and dividing one by the
other:

  > prod(1:6000)
  [1] Inf
  > prod(1:5980)
  [1] Inf
  > prod(1:6000)/prod(1:5980)
  [1] NaN

It doesn't matter whether you do this using 'prod' or
'gamma'.

Those of us who grew up in the old days when you had to do things
the hard way learned tricks like:

  my.choose3 <- function(x,y){
    if((x==y)||(y==0)) return(1);
    m <- min(y,x-y)
    prod((x:(x-m+1))/(m:1))
  }

i.e. implementing, using term-by-term ratios (which keeps the
numbers within bounds all the way)  either

  (x/y)*((x-1)/(y-1))*...*((x-y+2)/2)*((x-y+1)/1)

  = x*(x-1)*...*(x-y+1)/{y*(y-1)*...*2*1}

or

  (x/(x-y))*((x-1)/(x-y-1))*...*((y+2)/2)*((y+1)/1)

  = x*(x-1)*...*(y+1)/{(x-y)*(x-y-1)*...*2*1}

whichever gives the shorter product. (Actually, in those days
we did it with loops too, or even by turning a handle -- all
the more reason to keep it short!).

Anyway:

  > choose(6000,20)
  [1] 1.455904e+57
  > my.choose3(6000,20)
  [1] 1.455904e+57

(And, if you try it, you'll see that 'my.choose3' is pretty fast).

However, as written above 'my.choose3' doesn't like really large
arguments:

  > choose(60000000000,31)
  [1] 1.612899e+300
  > my.choose3(60000000000,31)
  Error in x:(x - m + 1) : argument too large in magnitude

because the result of ":" is integer. However, it works OK
in the following form:

  my.choose3<-function(x,y){
    if((x==y)||(y==0)) return(1);
    m <- min(y,x-y)
   prod(seq(x,(x-m+1),by=-1)/(seq(m,1,by=-1)))
  }

when

  my.choose3(60000000000,31)
  [1] 1.613121e+300

which has a slight difference (0.014% greater) from the result of
'choose'.

Given the method of computation, I might feel inclined to trust
'my.choose3' rather than 'choose', but I'm not at sure of
this
without studying the internal code of 'choose', and would welcome
comments!

Best wishes to all,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861  [NB: New number!]
Date: 08-Nov-04                                       Time: 14:29:46
------------------------------ XFMail ------------------------------

Possibly Parallel Threads

Search for more maybe matching threads

R help - Nov 2004 - About 'choose' function

[R] About 'choose' function

[R] About 'choose' function

[R] About 'choose' function

[R] About 'choose' function

[R] About 'choose' function

Possibly Parallel Threads