>>>>> Bert Gunter <bgunter.4567 at gmail.com>
>>>>>     on Wed, 7 Sep 2016 23:47:40 -0700 writes:

> "please suggest what can I do to resolve this
> issue."

> Fitting normal mixtures can be difficult, and sometimes the
> optimization algorithm (EM) will get stuck with very slow convergence.
> Presumably there are options in the package to either increase the max
> number of steps before giving up or make the convergence criteria less
> sensitive. The former will increase the run time and the latter will
> reduce the optimality (possibly leaving you farther from the true
> optimum). So you should look into changing these as you think
> appropriate.

I'm jumping in late, without having read everything preceding.

One of the last messages seemed to indicate that you are looking
at mixtures of *one*-dimensional Gaussians.

If this is the case, I strongly recommend looking at (my) CRAN
package 'nor1mix' (the "1" is for "*one*-dimensional").

For a while now that small package has provided an alternative
to the EM, namely direct MLE, simply using optim(<likelihood>) where the
likelihood uses a somewhat smart parametrization.

Of course, *like the EM*, this also depends on the starting value,
but my (limited) experience has been that
     nor1mix::norMixMLE()
works considerably faster and more reliably than the EM (which I
also provide as nor1mix::norMixEM()).

Apropos 'starting value': The help page shows how to use
kmeans() for "somewhat" reliable starts; alternatively, I'd
recommend using cluster::pam() to get a start there.

I'd be glad to hear about experiences using these / comparing
these with other approaches.

Martin

--
Martin Maechler,
ETH Zurich

> On Wed, Sep 7, 2016 at 3:51 PM, Aanchal Sharma
> <aanchalsharma833 at gmail.com> wrote:
>> Hi Simon
>>
>> I am facing the same problem as described above. I am trying to fit a Gaussian
>> mixture model to my data using normalmixEM. I am running an Rscript which
>> has this function running as part of it for about 17000 datasets (in a loop).
>> The script runs fine for some datasets, but it terminates when it
>> encounters one dataset with the following error:
>>
>> Error in normalmixEM(expr_glm_residuals, lambda = c(0.75, 0.25), k = 2, :
>>   Too many tries!
>>
>> (command used: expr_mix_gau <- normalmixEM(expr_glm_residuals, lambda =
>> c(0.75, 0.25), k = 2, epsilon = 1e-08, maxit = 10000, maxrestarts = 200,
>> verb = TRUE))
>> (expr_glm_residuals is my dataset, which has residual values for different
>> samples)
>>
>> It is suggested that one should define the mu and sigma in the command by
>> looking at the dataset. But in my case there are many datasets and the values
>> will keep on changing every time. Please suggest what I can do to resolve this
>> issue.
>>
>> Regards
>> Anchal
>>
>> On Tuesday, 16 July 2013 17:53:09 UTC-4, Simon Zehnder wrote:
>>>
>>> Hi Tjun Kiat Teo,
>>>
>>> you are trying to fit a Normal mixture to some data. The Normal mixture is very
>>> delicate when it comes to parameter search: if the variance gets closer and
>>> closer to zero, the log-likelihood becomes larger and larger for any values
>>> of the remaining parameters. Furthermore, the EM algorithm is known to
>>> sometimes take very long until convergence is reached.
>>>
>>> Try the following:
>>>
>>> Use as starting values for the component parameters:
>>>
>>> start.par <- mean(your.data, na.rm = TRUE) + sd(your.data, na.rm = TRUE) *
>>> runif(K)
>>>
>>> For the weights just use either 1/K or the R cluster function with K
>>> clusters.
>>>
>>> Here K is the number of components. Further, enlarge the maximum number of
>>> iterations. What you could also try is to randomize the start parameters and
>>> run an SEM (Stochastic EM). In my opinion the better method in this case is
>>> a Bayesian method: MCMC.
>>>
>>> Best
>>>
>>> Simon
>>>
>>> On Jul 16, 2013, at 10:59 PM, Tjun Kiat Teo <teot... at gmail.com> wrote:
>>>
>>> > I was trying to use normalmixEM in mixtools and I got this error
>>> > message:
>>> >
>>> > One of the variances is going to zero; trying new starting values.
>>> > Error in normalmixEM(as.matrix(temp[[gc]][, -(f + 1)])) : Too many tries!
>>> >
>>> > Are there any other packages for fitting mixture distributions?
>>> >
>>> > Tjun Kiat Teo

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
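
The "direct MLE via optim()" idea described in the message above can be sketched in base R for the two-component, one-dimensional case. This is only an illustration, not the code of nor1mix: the parametrization (logit of the weight, log of the standard deviations) and the names nll_mix2, start_from_kmeans and fit_mix2 are assumptions made for this sketch, and the package may well parametrize differently. Starting values come from kmeans(), as the message suggests.

```r
# Negative log-likelihood of a two-component 1-D normal mixture.
# par = c(logit(weight1), mu1, mu2, log(sd1), log(sd2)); all five are
# unconstrained, so optim() cannot propose a weight outside (0, 1) or an sd <= 0.
# (Illustrative parametrization, assumed for this sketch.)
nll_mix2 <- function(par, x) {
  w  <- plogis(par[1])
  mu <- par[2:3]
  s  <- exp(par[4:5])
  dens <- w * dnorm(x, mu[1], s[1]) + (1 - w) * dnorm(x, mu[2], s[2])
  # Guard against log(0) for points far from both components.
  -sum(log(pmax(dens, .Machine$double.xmin)))
}

# Starting values from a two-cluster kmeans() run, clusters ordered by centre.
start_from_kmeans <- function(x) {
  km  <- kmeans(x, centers = 2, nstart = 10)
  ord <- order(km$centers[, 1])
  w1  <- km$size[ord[1]] / length(x)
  sds <- tapply(x, km$cluster, sd)[as.character(ord)]
  # Fall back to the overall sd if a cluster is degenerate.
  sds[!is.finite(sds) | sds <= 0] <- sd(x)
  unname(c(qlogis(w1), km$centers[ord, 1], log(sds)))
}

# Fit by direct minimisation of the negative log-likelihood.
fit_mix2 <- function(x) {
  x   <- x[is.finite(x)]
  opt <- optim(start_from_kmeans(x), nll_mix2, x = x, method = "BFGS")
  list(weight    = plogis(opt$par[1]),
       mu        = opt$par[2:3],
       sd        = exp(opt$par[4:5]),
       loglik    = -opt$value,
       converged = opt$convergence == 0)
}

# Simulated example: 75% N(0, 1) and 25% N(5, 1).
set.seed(1)
x   <- c(rnorm(300, mean = 0, sd = 1), rnorm(100, mean = 5, sd = 1))
fit <- fit_mix2(x)
print(fit)
```

Note that the log-sd parametrization keeps the standard deviations positive but does not remove the unbounded likelihood mentioned in the quoted 2013 reply: a component can still collapse onto a single observation, so good starting values remain important.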
Thanks for the reply.

I have another related issue, this time with a gamma mixture model. Here is the
description:

I am trying to fit a 2-component gamma mixture model to my data (residual
values obtained after running a generalized linear model), using the following
command (part of the code):

    expr_mix_gamma <- gammamixEM(expr_glm_residuals, lambda = c(0.75, 0.25),
                                 k = 2, epsilon = 1e-08, maxit = 1000,
                                 maxrestarts = 20, verb = TRUE)

The code runs for multiple gene files (in a loop). It runs fine for some files,
whereas for others it throws the following error:

    Error in gammamixEM(expr_glm_residuals, lambda = c(0.75, 0.25), k = 2, :
      Try different number of components?

I tried increasing the number of iterations and decreasing the convergence
value, but that doesn't seem to work. Is there anything else that I can try?

Thanks

On Thu, Sep 8, 2016 at 8:38 AM, Martin Maechler <maechler at stat.math.ethz.ch> wrote:
> If this is the case, I strongly recommend looking at (my) CRAN
> package 'nor1mix' (the "1" is for "*one*-dimensional").
>
> For a while now that small package has provided an alternative
> to the EM, namely direct MLE, simply using optim(<likelihood>) where the
> likelihood uses a somewhat smart parametrization.

--
Anchal Sharma, PhD
Postdoctoral Fellow
195, Little Albany street,
Cancer Institute of New Jersey
Rutgers University
NJ-08901
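
Both reports in this thread describe a loop that stops at the first dataset on which the fit fails. One way to keep such a loop running, and to record which datasets failed, is to wrap each fit in tryCatch(). The sketch below uses only base R; safe_fit and toy_fitter are hypothetical names, and toy_fitter is a stand-in that merely imitates a fitter raising an error, so that the example runs without mixtools.

```r
# Run `fitter` on one dataset; on error, return the message instead of stopping.
safe_fit <- function(x, fitter, ...) {
  tryCatch(
    list(fit = fitter(x, ...), error = NA_character_),
    error = function(e) list(fit = NULL, error = conditionMessage(e))
  )
}

# Hypothetical stand-in for a mixture fitter: it fails on non-positive data,
# as any model with strictly positive support would have to.
toy_fitter <- function(x) {
  if (any(x <= 0)) stop("data must be strictly positive")
  list(mean = mean(x))
}

datasets <- list(a = c(1.2, 0.4, 2.5),
                 b = c(-0.3, 0.8, 1.1),
                 c = c(3.0, 2.2, 0.9))

# The loop now completes; failures are recorded rather than fatal.
results <- lapply(datasets, safe_fit, fitter = toy_fitter)
failed  <- names(results)[!vapply(results, function(r) is.na(r$error), logical(1))]
print(failed)
```

With mixtools loaded, the same wrapper could presumably be called as safe_fit(expr_glm_residuals, gammamixEM, lambda = c(0.75, 0.25), k = 2, maxit = 1000), passing the arguments used in the message above. It may also be worth checking any(expr_glm_residuals <= 0) for the failing files: the gamma density is defined only for positive values, so residuals that include zeros or negative numbers cannot be described by a gamma mixture. Whether that is what triggers the error here is not established in the thread.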
Do you mean "increase the convergence value"? Decreasing it should make it
harder to converge (I believe, depending on exactly how "convergence value" is
defined, so double-check).

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Mon, Sep 12, 2016 at 4:40 PM, Aanchal Sharma
<aanchalsharma833 at gmail.com> wrote:
> I tried increasing iterations and decreasing the convergence value, but that
> doesn't seem to work. Is there anything else that I can try?
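
The point about the direction of the tolerance can be illustrated with a toy iteration. This is not the EM algorithm; it is a simple fixed-point iteration, and it assumes that epsilon acts as a threshold on the change between successive iterations (the help page for the fitting function should be consulted for how mixtools actually defines it). The function name iterations_needed is made up for this sketch.

```r
# Iterate x <- cos(x) and stop once successive values differ by less than
# epsilon. Returns the number of iterations used, or NA if maxit is reached.
iterations_needed <- function(epsilon, maxit = 10000) {
  x <- 1
  for (i in seq_len(maxit)) {
    x_new <- cos(x)
    if (abs(x_new - x) < epsilon) return(i)
    x <- x_new
  }
  NA_integer_
}

loose  <- iterations_needed(1e-2)  # larger epsilon: stops early
strict <- iterations_needed(1e-8)  # smaller epsilon: needs more iterations
print(c(loose = loose, strict = strict))
```

Under that assumption, a smaller epsilon demands more iterations before convergence is declared, so lowering it makes a "did not converge" outcome more likely for a fixed maxit, not less.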