If you write out the likelihood equations for an independent sample of size n
from the beta(a, b) distribution:
L \propto \prod_i dbeta(y_i, a, b)
log(L) = constant + \sum_i dbeta(y_i, a, b, log=TRUE)
log(L) = constant + \sum_i [ (a-1) log(y_i) + (b-1) log(1-y_i) ]
you see that your problem comes from trying to calculate log(0.0).
So one pragmatic approach is to replace your measured 0's by some epsilon
and your measured 1's by (1 - epsilon), and then do some sensitivity
analysis for the choice of epsilon.
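A minimal sketch of that epsilon approach, fitting with MASS::fitdistr (the
p-values and the epsilon grid here are made up for illustration; fitdistr may
emit warnings from the optimizer along the way):

```r
library(MASS)  # for fitdistr()

## Illustrative data: "p-values" including an exact 0 and an exact 1
p <- c(0, 0.02, 0.15, 0.30, 0.45, 0.60, 0.75, 0.90, 1)

## Clamp into (0, 1) and refit for several choices of epsilon
for (eps in c(1e-2, 1e-4, 1e-6)) {
  p.adj <- pmin(pmax(p, eps), 1 - eps)
  fit <- fitdistr(p.adj, "beta",
                  start = list(shape1 = 1, shape2 = 1))
  cat("eps =", eps,
      " shape1 =", round(fit$estimate["shape1"], 3),
      " shape2 =", round(fit$estimate["shape2"], 3), "\n")
}
```

Comparing the estimates across the epsilon grid is the sensitivity analysis:
if shape1 and shape2 move a lot as epsilon shrinks, the boundary values are
driving the fit.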
If you have exactly one measured y_i = 0.0, and the rest in (0, 1), then the
log-likelihood will be constant + (a-1)*(-\infty) + an ordinary (finite)
log-likelihood term, suggesting that the maximization will choose a = 1 to
avoid the -\infty term. This indicates that choosing epsilon too small will
give a huge bias in the direction of estimating a = 1.
Kjetil
On Wed, Mar 16, 2011 at 11:14 AM, Jim Silverton <jim.silverton at gmail.com> wrote:
> I want to fit some p-values to a beta distribution. But the problem is some
> of the values have 0s and 1's. I am getting an error if I use the MASS
> function to do this. Is there anyway to get around this?
>
> --
> Thanks,
> Jim.
>