If you write out the likelihood equations for an independent sample of size n
from the beta(a, b) distribution:
L \propto \prod_i dbeta(y_i, a, b)
log(L) = constant + \sum_i dbeta(y_i, a, b, log=TRUE)
log(L) = constant + \sum_i [ (a-1) log(y_i) + (b-1) log(1-y_i) ]
you see that your problem comes from trying to calculate log(0.0).
So one pragmatic approach is to replace your measured 0's by some epsilon
and your measured 1's by (1 - epsilon), and then do some sensitivity
analysis for the choice of epsilon.
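A minimal sketch of that epsilon approach, fitting with MASS::fitdistr (the
p-values and the epsilon grid here are made up for illustration; fitdistr may
emit warnings from the optimizer along the way):

```r
library(MASS)  # for fitdistr()

## Illustrative data: "p-values" including an exact 0 and an exact 1
p <- c(0, 0.02, 0.15, 0.30, 0.45, 0.60, 0.75, 0.90, 1)

## Clamp into (0, 1) and refit for several choices of epsilon
for (eps in c(1e-2, 1e-4, 1e-6)) {
  p.adj <- pmin(pmax(p, eps), 1 - eps)
  fit <- fitdistr(p.adj, "beta",
                  start = list(shape1 = 1, shape2 = 1))
  cat("eps =", eps,
      " shape1 =", round(fit$estimate["shape1"], 3),
      " shape2 =", round(fit$estimate["shape2"], 3), "\n")
}
```

Comparing the estimates across the epsilon grid is the sensitivity analysis:
if shape1 and shape2 move a lot as epsilon shrinks, the boundary values are
driving the fit.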
If you have exactly one measured y_i = 0.0, and the rest in (0, 1), then the
log-likelihood will be constant + (a-1)*(-\infty) + an ordinary (finite)
log-likelihood term, suggesting that the maximization will choose a = 1 to
avoid the -\infty term. This indicates that choosing epsilon too small will
give a huge bias in the direction of estimating a = 1.
Kjetil
On Wed, Mar 16, 2011 at 11:14 AM, Jim Silverton <jim.silverton at gmail.com> wrote:
> I want to fit some p-values to a beta distribution. But the problem is some
> of the values have 0s and 1's. I am getting an error if I use the MASS
> function to do this. Is there anyway to get around this?
>
> --
> Thanks,
> Jim.
>