Hello I am trying to find an automated way of fitting a mixture of normal and log-normal distributions to data which is clearly bimodal. Here's a simulated example: x.1<-rnorm(6000, 2.4, 0.6)x.2<-rlnorm(10000, 1.3,0.1)X<-c(x.1, x.2) hist(X,100,freq=FALSE, ylim=c(0,1.5))lines(density(x.1), lty=2, lwd=2)lines(density(x.2), lty=2, lwd=2)lines(density(X), lty=4) Currently i am using mixtools and just run: library(mixtools)mixmdl = normalmixEM(X, k=2, epsilon = 1e-08, maxit = 1000, maxrestarts=20, verb = TRUE, fast=FALSE, ECM = FALSE, arbmean = TRUE, arbvar = TRUE) plot(mixmdl,which=2)lines(density(X), lty=2, lwd=2) This is obviously not the best way of doing this. The estimates it gives me are:mu 3.6595737(x.2 log()=1.29) 2.3113135(x.1) These are not too far off but I was wondering if someone knows of a better R package/way of doing this or has any hints that would help me coding it from scratch (EM+writing down the formulae for ML? but... would this really be better than using a "more-advanced-already-available-package-made-by-pros"?). The objective is to estimate threshold values at specific FDRs. (some help with that would also be most helpful.) Thanks to all in advance!To' [[alternative HTML version deleted]]
Hello I am trying to find an automated way of fitting a mixture of normal and log-normal distributions to data which is clearly bimodal. Here's a simulated example: x.1<-rnorm(6000, 2.4, 0.6) x.2<-rlnorm(10000, 1.3,0.1) X<-c(x.1, x.2) hist(X,100,freq=FALSE, ylim=c(0,1.5))lines(density(x.1), lty=2, lwd=2) lines(density(x.2), lty=2, lwd=2) lines(density(X), lty=4) Currently i am using mixtools and just run: library(mixtools) mixmdl = normalmixEM(X, k=2, epsilon = 1e-08, maxit = 1000, maxrestarts=20, verb = TRUE, fast=FALSE, ECM = FALSE, arbmean = TRUE, arbvar = TRUE) plot(mixmdl,which=2)lines(density(X), lty=2, lwd=2) This is obviously not the best way of doing this.? The estimates it gives are: mu 3.6595737(x.2 log()=1.29) 2.3113135(x.1) These are not too far off but I was wondering if someone knows of a better R package/way of doing this or has any hints that would help me coding it from scratch (EM+writing down the formulae for ML? but... would this really be better than using a "more-advanced-already-available-package-made-by-pros"?). The objective is to estimate threshold values at specific FDRs. (some help with that would also be most helpful.) Thanks to all in advance! To'
Marc Girondot
2013-Mar-19 10:43 UTC
[R] Fit a mixture of lognormal and normal distributions
Le 18/03/13 16:24, To . . a ?crit :> x.1<-rnorm(6000, 2.4, 0.6)x.2<-rlnorm(10000, 1.3,0.1)X<-c(x.1, x.2) > hist(X,100,freq=FALSE, ylim=c(0,1.5))lines(density(x.1), lty=2, lwd=2)lines(density(x.2), lty=2, lwd=2)lines(density(X), lty=4)Here is a solution: x.1<-rnorm(6000, 2.4, 0.6) x.2<-rlnorm(10000, 1.3,0.1) X<-c(x.1, x.2) hist(X,100,freq=FALSE, ylim=c(0,1.5)) lines(density(x.1), lty=2, lwd=2) lines(density(x.2), lty=2, lwd=2) lines(density(X), lty=4) fitnormlnorm<-function(par, val) { p <- 1/(1+exp(-par[5])) return(-sum(log(p*dnorm(val, par[1], abs(par[2]), log = FALSE)+ (1-p)*dlnorm(val, par[3], abs(par[4]), log = FALSE)))) } # Mean 1 m1=2.3; s1=0.5 # Mean 2 m2=1.3; s2=0.1 # proportion of 1 - logit transform p=0 par<-c(m1, s1, m2, s2, p) result2<-optim(par, fitnormlnorm, method="BFGS", val=X, hessian=FALSE, control=list(trace=1)) lines(seq(from=0, to=5, length=100), dnorm(seq(from=0, to=5, length=100), result2$par[1], result2$par[2]), col="red") lines(seq(from=0, to=5, length=100), dlnorm(seq(from=0, to=5, length=100), result2$par[3], result2$par[4]), col="green") p <- 1/(1+exp(-result2$par[5])) paste("Proportion of Gaussian data", p) lines(seq(from=0, to=5, length=100), p*dnorm(seq(from=0, to=5, length=100), result2$par[1], result2$par[2])+ (1-p)*dlnorm(seq(from=0, to=5, length=100), result2$par[3], result2$par[4]), col="blue") Sincerely Marc Girondot -- __________________________________________________________ Marc Girondot, Pr Laboratoire Ecologie, Syst?matique et Evolution Equipe de Conservation des Populations et des Communaut?s CNRS, AgroParisTech et Universit? Paris-Sud 11 , UMR 8079 B?timent 362 91405 Orsay Cedex, France Tel: 33 1 (0)1.69.15.72.30 Fax: 33 1 (0)1.69.15.73.53 e-mail: marc.girondot at u-psud.fr Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html Skype: girondot