R 2.3.0 Linux, SuSE 10.0 Hi I have two problems with mle - probably I am using it the wrong way so please let me know. I want to fit different distributions to an observed count of seeds and in the next step use AIC or BIC to identify the best distribution. But when I run the script below (which is part of my original script), I get one error message for the first call of mle: Error in solve.default(oout$hessian) : Lapack routine dgesv: system is exactly singular In addition: Warning messages: 1: NaNs produced in: dweibull(x, shape, scale, log) 2: NaNs produced in: dweibull(x, shape, scale, log) 3: NaNs produced in: dweibull(x, shape, scale, log) 4: NaNs produced in: dweibull(x, shape, scale, log) What can I do to avoid this problem? The second one is following the second call of mle: summary(fit.NE) gives me the summary of the fit, but profile(fit.NE) gives me the following error message: Error in onestep(step) : profiling has found a better solution, so original fit had not converged In addition: Warning messages: 1: NaNs produced in: dexp(x, 1/rate, log) 2: NaNs produced in: dexp(x, 1/rate, log) 3: NaNs produced in: dexp(x, 1/rate, log) 4: NaNs produced in: dexp(x, 1/rate, log) 5: NaNs produced in: dexp(x, 1/rate, log) 6: NaNs produced in: dexp(x, 1/rate, log) What can I do in this case - which parameters do I have to change that it converges? Rainer ###################################### library(stats4) SumSeeds <- c(762, 406, 288, 260, 153, 116, 57, 33, 40, 44, 36, 24, 35, 23, 36, 25, 35, 30) X <- c(1.74924, 3.49848, 5.24772, 6.99696, 8.74620, 17.49240, 26.23860, 34.98480, 43.73100, 52.47720, 61.22340, 69.96960, 71.71880, 73.46810, 75.21730, 76.96650, 78.71580, 80.46500 ) #Sum of log of values in vector SumLog <- function (x) { sum(log(x)) } #-ln(L) with Poisson Likelihood estimator NeglogPoisL <- function (obs, est) { sel <- est != 0 SumLogObs <- SumLog(obs[sel]) -sum( obs[sel] * log(est[sel]) - est[sel]- SumLogObs ) } #-ln(L) for Negative Exponential NENeglogPoisL <- function(a=0.2, rate=0.04) { NeglogPoisL( SumSeeds, a * dexp( X, rate) ) } #-ln(L) for Weibull WBNeglogPoisL <- function(a=1000000, shape=0.12, scale=1e+37) { NeglogPoisL( SumSeeds, a * dweibull(X, shape, scale) ) } fit.WB <- mle( WBNeglogPoisL , start=list(a=1, shape=0.1, scale=1) , fixed=list() , control=list(maxit=10000) ) fit.NE <- mle( NENeglogPoisL , start=list(a=1, rate=1) , fixed=list() , control=list(maxit=10000) ) ######################################
Rainer M KRug <RMK <at> krugs.de> writes:> > R 2.3.0 > Linux, SuSE 10.0 > > Hi > > I have two problems with mle - probably I am using it the wrong way so > please let me know. > > I want to fit different distributions to an observed count of seeds and > in the next step use AIC or BIC to identify the best distribution.library(stats4) SumSeeds <- c(762, 406, 288, 260, 153, 116, 57, 33, 40, 44, 36, 24, 35, 23, 36, 25, 35, 30) X <- c(1.74924, 3.49848, 5.24772, 6.99696, 8.74620, 17.49240, 26.23860, 34.98480, 43.73100, 52.47720, 61.22340, 69.96960, 71.71880, 73.46810, 75.21730, 76.96650, 78.71580, 80.46500 ) plot(X,SumSeeds) curve(7000*dexp(x,0.1),add=TRUE,from=0) ## exponential function (a*exp(-rate*X)) makes more sense ## than using dexp, which is normalized to 1 (thus makes ## it harder to interpret a as an intercept -- I don't ## know what X is? (You could easily reverse this decision ## though.) negexplik <- function(a=7000,rate=0.1) { -sum(dpois(SumSeeds,lambda=a*dexp(X,rate),log=TRUE)) } ## start from eyeball-estimated parameters m1 = mle(minuslogl=negexplik,start=list(a=7000,rate=0.1)) ## add estimated curve with(as.list(coef(m1)),curve(a*dexp(x,rate),col="red",add=TRUE,from=0)) ## conclusion so far: the exponential fit is really bad, because ## the numbers don't fall to zero as fast as expected ## changed parameterization of Weibull slightly to agree with ## the parameterization of the exponential above weiblik <- function(a=7000,shape=1,rate=0.1) { -sum(dpois(SumSeeds,lambda=a*dweibull(X,scale=1/rate,shape=shape), log=TRUE)) } weiblik(a=7222,shape=1,rate=0.05) ## check ## start from exponential a/rate estimates, plus shape=1 which reduces ## to exponential m2 = mle(minuslogl=weiblik,start=list(a=7222,rate=0.05,shape=1)) with(as.list(coef(m2)),curve(a*dweibull(x,scale=1/rate,shape=shape), col="blue",add=TRUE,from=0)) as.numeric(pchisq(-2*(logLik(m1)-logLik(m2)),df=1, lower.tail=FALSE,log.p=TRUE)) ## insanely significant I still had some problems profiling. Various things that may help: (1) set parscale= option (2) do fits on the log-parameters (which all have to be positive) OR (3) use L-BFGS-B and set lower=0 (4) summary(m1) will give you approximate (Fisher information) confidence limits even if profiling doesn't work. cheers Ben Bolker
Hi Ben Thanks a lot for your help - it solved my problem and I understand a little bit more. Just one more question along this line: If I want to use another likelihood estimator, e.g. negative binominal, I guess I have to use -sum(dpois(SumSeeds,size=???, prob=???,log=TRUE)) instead of -sum(dnbinom(SumSeeds,lambda=est,log=TRUE)) But what do I use for size and prob? Rainer -- Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation Biology (UCT) Department of Conservation Ecology and Entomology University of Stellenbosch Matieland 7602 South Africa Tel: +27 - (0)72 808 2975 (w) Fax: +27 - (0)21 808 3304 Cell: +27 - (0)83 9479 042 email: RKrug at sun.ac.za Rainer at krugs.de