Aloha all, I know too little about what I'm about to write and hope I'm not wasting your time. For a class I'm teaching in archaeological data analysis, I'm trying to put together a routine that calculates the so-called Petersen index and, especially, confidence intervals for the index. This was introduced to archaeologists by N.R.J. Fieller and A. Turner in an article in Journal of Archaeological Science (1982) called Number Estimation in Vertebrate Samples. They say that "calculation of precise confidence intervals for population sizes is, in principle at least, straightforward. It involves calculation of cumulative hypergeometric probabilities (i.e. the summation of probabilities given by equation 3.1 of Seber, 1973)." The reference is to G.A.F. Seber's book, The Estimation of Animal Abundance. I went to equation 3.1 and wrote a small function to sum its probabilities, modeled after phyper() and taking the arguments in the same order (the names have changed to suit the archaeological situation):> seber <- function(p,l,n,r) > { > y <- 0 > for (x in 0:p) > y <- y + exp(lchoose(l,x) + lchoose(n-l,r-x) - lchoose(n,r)) > y > }When used in the larger routine, this yields results that very closely approximate the results in Fieller and Turner's table 1. I initially thought I could use the function phyper() for this because, as I interpret the help files, this routine yields cumulative hypergeometric probabilities. But I'm finding that it gives different results than seber(). I apologize if I am in too far over my head, but I am wondering if this is a bug in dhyper/phyper? Perhaps I have misunderstood what phyper() actually does, or am calling it incorrectly? Or, were Fieller and Turner in error? All the best, Tom Thomas Dye Dean Hall 201, Tuesday 1:00-1:55 pm
tdye at hawaii.edu wrote:> Aloha all, > > I know too little about what I'm about to write and hope I'm not > wasting your time. > > For a class I'm teaching in archaeological data analysis, I'm trying > to put together a routine that calculates the so-called Petersen > index and, especially, confidence intervals for the index. This was > introduced to archaeologists by N.R.J. Fieller and A. Turner in an > article in Journal of Archaeological Science (1982) called Number > Estimation in Vertebrate Samples. They say that "calculation of > precise confidence intervals for population sizes is, in principle at > least, straightforward. It involves calculation of cumulative > hypergeometric probabilities (i.e. the summation of probabilities > given by equation 3.1 of Seber, 1973)." The reference is to G.A.F. > Seber's book, The Estimation of Animal Abundance. > > I went to equation 3.1 and wrote a small function to sum its > probabilities, modeled after phyper() and taking the arguments in the > same order (the names have changed to suit the archaeological > situation): > > >> seber <- function(p,l,n,r) >> { >> y <- 0 >> for (x in 0:p) >> y <- y + exp(lchoose(l,x) + lchoose(n-l,r-x) - lchoose(n,r)) >> y >> } >> > > When used in the larger routine, this yields results that very > closely approximate the results in Fieller and Turner's table 1. > > I initially thought I could use the function phyper() for this > because, as I interpret the help files, this routine yields > cumulative hypergeometric probabilities. But I'm finding that it > gives different results than seber(). >Could you give an example?> I apologize if I am in too far over my head, but I am wondering if > this is a bug in dhyper/phyper? Perhaps I have misunderstood what > phyper() actually does, or am calling it incorrectly? Or, were > Fieller and Turner in error?I'd guess it's your error, but without seeing what you did can't really say for sure. By the way, please don't post things like this to the R-bugs list unless you're reasonably sure it's a bug. Entries on that list take work by busy people to clear up. Duncan Murdoch