All,
I have some data on parasites on apple leaves and want to do a
goodness of fit test to a Poisson distribution. This seems to
do it:
  mites <- c(rep(0,70),
             rep(1,38),
             rep(2,17), 
             rep(3,10), 
             rep(4,9), 
             rep(5,3), 
             rep(6,2), 
             rep(7,1))
  tab <- table(mites)
  NSU <- length(mites)
  N <- sum(mites)
  NSU.Mean <- N / NSU
  exp <- dpois(as.numeric(dimnames(tab)[[1]]), NSU.Mean) *  NSU
  chi2 <- sum(((as.vector(tab) - exp)^2) / exp)
  tab
  exp
  chi2
  
I have some small expected values and so need to collapse some
categories. If anyone has code to do that already ... but my
question is how would I do the same for a negative binomial
distribution?
Mark
--
Mark Myatt
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
> > All, > > I have some data on parasites on apple leaves and want to do a > goodness of fit test to a Poisson distribution. This seems to > do it: > > mites <- c(rep(0,70), > rep(1,38), > rep(2,17), > rep(3,10), > rep(4,9), > rep(5,3), > rep(6,2), > rep(7,1)) > > tab <- table(mites) > NSU <- length(mites) > N <- sum(mites) > NSU.Mean <- N / NSU > exp <- dpois(as.numeric(dimnames(tab)[[1]]), NSU.Mean) * NSU > chi2 <- sum(((as.vector(tab) - exp)^2) / exp) > > tab > exp > chi2 > > I have some small expected values and so need to collapse some > categories. If anyone has code to do that already ... but my > question is how would I do the same for a negative binomial > distribution?The negative binomial density function is available in my rmutil library. (www.luc.ac.be/~jlindsey/rcode.html) Jim> > Mark > > -- > Mark Myatt > > > -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- > r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html > Send "info", "help", or "[un]subscribe" > (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch > _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._ >-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
I had a very similar example before. This is for Mark's data:
(the last frequency is for k>= 5). (Mark should be happy with
the neg-binomial fit.)
-YP-
x_ c(70, 38, 17, 10,  9, 6) 
kk_ 0:5
n_ sum(x)
fn_ function(p){
    r_ p[1]
    th_ p[2]
    ll _ -sum(x[1:max(kk)]*log(dnbinom(kk[-length(kk)],r,th))) -
          x[length(kk)]*log(1-pnbinom((max(kk)-1),r,th))
    return(ll)
}
a_ nlm(fn,p=c(.9,.5),hes=T)
est_ a$est
phat_ c(dnbinom(kk[-length(kk)],est[1],est[2]),
        1-pnbinom((max(kk)-1),est[1],est[2]))
E_ n*phat
chi2_ sum((x-E)^2/E)
cat('Chi2 goodness-of-fit = ',chi2,'\n')
cat('P-value=',1-pchisq(chi2,df=length(kk)-1-2),'\n')
print(cbind(O=x,E=round(E,1),err=round((x-E)^2/E,1)))
=========================
At 09:14 AM 2/7/01 +0000, Mark Myatt <mark at myatt.demon.co.uk>
wrote:>All,
>
>I have some data on parasites on apple leaves and want to do a
>goodness of fit test to a Poisson distribution. This seems to
>do it:
>
>  mites <- c(rep(0,70),
>             rep(1,38),
>             rep(2,17), 
>             rep(3,10), 
>             rep(4,9), 
>             rep(5,3), 
>             rep(6,2), 
>             rep(7,1))
>
>  tab <- table(mites)
>  NSU <- length(mites)
>  N <- sum(mites)
>  NSU.Mean <- N / NSU
>  exp <- dpois(as.numeric(dimnames(tab)[[1]]), NSU.Mean) *  NSU
>  chi2 <- sum(((as.vector(tab) - exp)^2) / exp)
>
>  tab
>  exp
>  chi2
>  
>I have some small expected values and so need to collapse some
>categories. If anyone has code to do that already ... but my
>question is how would I do the same for a negative binomial
>distribution?
>
>Mark
>
>--
>Mark Myatt
>
>
>-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
.-.->r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
>Send "info", "help", or "[un]subscribe"
>(in the "body", not the subject !)  To: r-help-request at
stat.math.ethz.ch
>_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
._._>
Yudi Pawitan     yudi at stat.ucc.ie
Department of Statistics UCC
Cork, Ireland
Ph   353-21-490 2906
Fax 353-21-427 1040 
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
I (Mark Myatt) wrote:> > I have some data on parasites on apple leaves and want to do a > > goodness of fit test to a Poisson distribution. This seems to > > do it: > > > > mites <- c(rep(0,70), > > rep(1,38), > > rep(2,17), > > rep(3,10), > > rep(4,9), > > rep(5,3), > > rep(6,2), > > rep(7,1)) > > > > tab <- table(mites) > > NSU <- length(mites) > > N <- sum(mites) > > NSU.Mean <- N / NSU > > exp <- dpois(as.numeric(dimnames(tab)[[1]]), NSU.Mean) * NSU > > chi2 <- sum(((as.vector(tab) - exp)^2) / exp) > > > > tab > > exp > > chi2 > > > > I have some small expected values and so need to collapse some > > categories. If anyone has code to do that already ... but my > > question is how would I do the same for a negative binomial > > distribution?Jim Lindsey wrote:>> The negative binomial density function is available in my rmutil >> library. (www.luc.ac.be/~jlindsey/rcode.html)Brian Ripley wrote:>It is in R too, dnbinom.That was silly of me ... It is even there on the dpois() help page that I had already visited. I have a further question ... I am not familiar with the parameters required for dnbinom: dnbinom(x, size, prob, log = FALSE) I am used to 'mu' and 'k'. The parameter 'prob' is I think the failure probability: tab[[1]] / NSU Is that right? I am unsure of how to specify the 'size' parameter in this context. I have a proto-function working without dnbinom(): neg.bin.gof <- function(data) { case.count <- vector(mode = "numeric", length = max(data) + 2) observed <- vector(mode = "numeric", length = max(data) + 2) neg.bin.expected <- vector(mode = "numeric", length = max(data) + 2) neg.bin.partial.chi.square <- vector(mode = "numeric", length = max(data) + 2) # # calculate frequencies # for (i in 0:(max(data) + 1)) { case.count[i + 1] <- i observed[i + 1] <- sum(data == i) } # # calculate summary measures # N.SU <- length(data) cases <- sum(data) cases.per.SU.mean <- cases / N.SU cases.per.SU.var <- var(data) # # find negative parameter 'k' by successive approximation # delta.direction <- 0 delta.difference <- 1 k <- cases.per.SU.mean ^ 2 / (cases.per.SU.var - cases.per.SU.mean) repeat { difference <- log10(N.SU / observed[1]) - k * log10(1 + (cases.per.SU.mean / k)) if(abs(difference) < 0.0000001) break if (delta.direction != sign(difference)) delta.difference <- delta.difference / 10 delta.direction <- sign(difference) k <- k + delta.direction * delta.difference } # # calculate expected number of sampling units with ZERO cases # neg.bin.expected[1] <- (1 + (cases.per.SU.mean / k)) ^ (0 - k) * N.SU neg.bin.sum.expected <- neg.bin.expected[1] # # calculate expected number of sampling units with 1, 2, etc. cases # for (i in 2:(max(data) + 1)) { neg.bin.expected[i] <- (cases.per.SU.mean / (cases.per.SU.mean + k)) * ((k + i - 1 - 1) / (i - 1)) * neg.bin.expected[i - 1] neg.bin.sum.expected <- neg.bin.sum.expected + neg.bin.expected[i] } neg.bin.expected[max(data) + 2] <- N.SU - neg.bin.sum.expected # # calculate chi-square statistic # neg.bin.partial.chi.square <- (observed - neg.bin.expected) ^ 2 / neg.bin.expected neg.bin.chi.square.statistic <- sum(neg.bin.partial.chi.square) # # some output ... punt out as a list when function debugged # output.table <- cbind(case.count, observed, neg.bin.expected, neg.bin.partial.chi.square) print(output.table) print(neg.bin.chi.square.statistic) print(k) } But I really should plug a gap in my ignorance. Mark -- Mark Myatt -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._