thr3ads.net - R help - [R] MLE for bimodal distribution [Apr 2009]

If this information is useful, please help other people find it:
Share via:

_nico_

2009-Apr-08 16:54 UTC

[R] MLE for bimodal distribution

Hello everyone,

I'm trying to use mle from package stats4 to fit a bi/multi-modal
distribution to some data, but I have some problems with it.
Here's what I'm doing (for a bimodal distribution):

# Build some fake binormally distributed data, the procedure fails also with
real data, so the problem isn't here
data = c(rnorm(1000, 3, 0.5), rnorm(500, 5, 0.3))
# Just to check it's bimodal
plot(density(data))
f = function(m, s, m2, s2, w)
{
-log( sum(w*dnorm(data, mean=m, sd=s)) + sum((1-w)*dnorm(data, mean=m2,
sd=s2)) )
}

res = mle(f, start=list("m"=3, "s"=0.5, "m2"=5,
"s2"=0.35, "w"=0.6))

This gives an error:
Error in optim(start, f, method = method, hessian = TRUE, ...) : 
  non-finite finite-difference value [2]
In addition: There were 50 or more warnings (use warnings() to see the first
50)
And the warnings are about dnorm producing NaNs

So, my questions are:
1) What does "non-finite finite-difference value" mean? My guess would
be
that an Inf was passed somewhere where a finite number was expected...? I'm
not sure how optim works though...
2) Is the log likelihood function I wrote correct? Can I use the same type
of function for 3 modes?
3) What should I do to avoid function failure? I tried by changing the
parameters, but it did not work.
4) Can I put constraints to the parameters? In particular, I would like w to
be between 0 and 1.
5) Is there a way to get the log-likelihood value, so that I can compare
different extimations?
6) Do you have any (possibly a bit "pratically oriented") readings
about MLE
to suggest?

Thanks in advance
Nico
-- 
View this message in context:
http://www.nabble.com/MLE-for-bimodal-distribution-tp22954970p22954970.html
Sent from the R help mailing list archive at Nabble.com.

Ben Bolker

2009-Apr-08 19:47 UTC

head link

[R] MLE for bimodal distribution

_nico_ wrote:> 
> Hello everyone,
> 
> I'm trying to use mle from package stats4 to fit a bi/multi-modal
> distribution to some data, but I have some problems with it.
> Here's what I'm doing (for a bimodal distribution):
> 
> # Build some fake binormally distributed data, the procedure fails also
> with real data, so the problem isn't here
> data = c(rnorm(1000, 3, 0.5), rnorm(500, 5, 0.3))
> # Just to check it's bimodal
> plot(density(data))
> f = function(m, s, m2, s2, w)
> {
> -log( sum(w*dnorm(data, mean=m, sd=s)) + sum((1-w)*dnorm(data, mean=m2,
> sd=s2)) )
> }
> 
> res = mle(f, start=list("m"=3, "s"=0.5,
"m2"=5, "s2"=0.35, "w"=0.6))
> 
> This gives an error:
> Error in optim(start, f, method = method, hessian = TRUE, ...) : 
>   non-finite finite-difference value [2]
> In addition: There were 50 or more warnings (use warnings() to see the
> first 50)
> And the warnings are about dnorm producing NaNs
> 
> So, my questions are:
> 1) What does "non-finite finite-difference value" mean? My guess
would be
> that an Inf was passed somewhere where a finite number was expected...?
> I'm not sure how optim works though...
> 2) Is the log likelihood function I wrote correct? Can I use the same type
> of function for 3 modes?
> 3) What should I do to avoid function failure? I tried by changing the
> parameters, but it did not work.
> 4) Can I put constraints to the parameters? In particular, I would like w
> to be between 0 and 1.
> 5) Is there a way to get the log-likelihood value, so that I can compare
> different extimations?
> 6) Do you have any (possibly a bit "pratically oriented")
readings about
> MLE to suggest?
> 
> Thanks in advance
> Nico
> 
  Here's some tweaked code that works.
Read about the "L-BFGS-B" method in ?optim to constrain parameters,
although you will have some difficulty making this work for more than
two components.  I think there's also a mixture model problem in
Venables & Ripley (MASS).

  There are almost certainly more efficient approaches for mixture
model fitting, although this "brute force" approach should be fine
if you don't need to do anything too complicated.
(RSiteSearch("mixture model"))

set.seed(1001)
data = c(rnorm(1000, 3, 0.5), rnorm(500, 5, 0.3))
# Just to check it's bimodal
plot(density(data))
f = function(m, s, m2, s2, w)
{
  -sum(log(w*dnorm(data, mean=m, sd=s)+
           (1-w)*dnorm(data, mean=m2, sd=s2)))
}

library(stats4)
start0 <- list("m"=3, "s"=0.5, "m2"=5,
"s2"=0.35, "w"=0.6)
do.call("f",start0) ## OK
res = mle(f, start=start0)

with(as.list(coef(res)),
     curve(w*dnorm(x,m,s)+(1-w)*dnorm(x,m2,s2),add=TRUE,col=2))


-- 
View this message in context:
http://www.nabble.com/MLE-for-bimodal-distribution-tp22954970p22957613.html
Sent from the R help mailing list archive at Nabble.com.

Rubén Roa-Ureta

2009-Apr-08 20:24 UTC

head link

[R] MLE for bimodal distribution

_nico_ wrote:> Hello everyone,
>
> I'm trying to use mle from package stats4 to fit a bi/multi-modal
> distribution to some data, but I have some problems with it.
> Here's what I'm doing (for a bimodal distribution):
>
> # Build some fake binormally distributed data, the procedure fails also
with
> real data, so the problem isn't here
> data = c(rnorm(1000, 3, 0.5), rnorm(500, 5, 0.3))
> # Just to check it's bimodal
> plot(density(data))
> f = function(m, s, m2, s2, w)
> {
> -log( sum(w*dnorm(data, mean=m, sd=s)) + sum((1-w)*dnorm(data, mean=m2,
> sd=s2)) )
> }
>
> res = mle(f, start=list("m"=3, "s"=0.5,
"m2"=5, "s2"=0.35, "w"=0.6))
>
> This gives an error:
> Error in optim(start, f, method = method, hessian = TRUE, ...) : 
>   non-finite finite-difference value [2]
> In addition: There were 50 or more warnings (use warnings() to see the
first
> 50)
> And the warnings are about dnorm producing NaNs
>
> So, my questions are:
> 1) What does "non-finite finite-difference value" mean? My guess
would be
> that an Inf was passed somewhere where a finite number was expected...?
I'm
> not sure how optim works though...
> 2) Is the log likelihood function I wrote correct? Can I use the same type
> of function for 3 modes?
>   I think it is correct but it is difficult to fit as others have pointed out.
As an alternative, you may discretise your variable so the mixture is 
fit to counts on a histogram using a multinomial likelihood.
HTH
Rub?n

_nico_

2009-Apr-11 12:38 UTC

head link

[R] MLE for bimodal distribution

Just wanted to thank everyone for their help, I think I mostly managed to
solve my problem.
-- 
View this message in context:
http://www.nabble.com/MLE-for-bimodal-distribution-tp22954970p23000785.html
Sent from the R help mailing list archive at Nabble.com.

Reasonably Related Threads

Search for more seemingly similar threads

R help - Apr 2009 - MLE for bimodal distribution

[R] MLE for bimodal distribution

[R] MLE for bimodal distribution

[R] MLE for bimodal distribution

[R] MLE for bimodal distribution

Reasonably Related Threads