I used to consider using R and "Optim" to replace my commercial packages, Gauss and Matlab. But it turns out that "Optim" does not converge completely: the same data converge very well in Gauss and Matlab. I see that there are many packages built on "optim", and I really doubt whether they can be trusted!

--
View this message in context: http://r.789695.n4.nabble.com/Poor-performance-of-Optim-tp3862229p3862229.html
Sent from the R help mailing list archive at Nabble.com.
-----Original Message-----
From: r-help-bounces@r-project.org on behalf of yehengxin
Sent: Sat 10/1/2011 8:12 AM
To: r-help@r-project.org
Subject: [R] Poor performance of "Optim"

> I used to consider using R and "Optim" to replace my commercial packages:
> Gauss and Matlab. But it turns out that "Optim" does not converge
> completely. The same data for Gauss and Matlab are converged very well. I
> see that there are too many packages based on "optim" and really doubt if
> they can be trusted!

Considering that your post is pure whining without any evidence or reproducible example, and considering that you speak of 'data' being converged, methinks it's your fault: you can't control optim well enough to get sensible results. There are many ways to use optim, eh? You can pass in the gradients, you can use a variety of methods, you can increase the number of iterations, et cetera. Read optim's help and come back with a reproducible example, or quietly stick to your commercial software, keeping the whining to yourself.

HTH
Ruben

--
Rubén H. Roa-Ureta, Ph. D.
AZTI Tecnalia, Txatxarramendi Ugartea z/g,
Sukarrieta, Bizkaia, SPAIN

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
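Ruben's pointers can be made concrete. A minimal sketch (the Rosenbrock test function here is an illustrative choice of mine, not from the thread): optim() accepts an analytic gradient through its `gr` argument, and the iteration cap through `control = list(maxit = ...)`.

```r
# Toy objective: the Rosenbrock function, minimized at c(1, 1).
rosen <- function(p) (1 - p[1])^2 + 100 * (p[2] - p[1]^2)^2

# Its analytic gradient, passed to optim() via the `gr` argument.
rosen_gr <- function(p) c(
  -2 * (1 - p[1]) - 400 * p[1] * (p[2] - p[1]^2),  # d/dp[1]
  200 * (p[2] - p[1]^2)                            # d/dp[2]
)

# BFGS with an exact gradient and a raised iteration cap.
fit <- optim(c(-1.2, 1), rosen, gr = rosen_gr,
             method = "BFGS", control = list(maxit = 500))
fit$par  # close to c(1, 1)
```

With an exact gradient, the quasi-Newton search avoids the finite-difference error that a numerically differentiated objective would introduce.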
Is there a question or point to your message, or did you simply feel the urge to inform the entire R-help list of the things that you consider?

Josh

On Fri, Sep 30, 2011 at 11:12 PM, yehengxin <xye78 at hotmail.com> wrote:
> I used to consider using R and "Optim" to replace my commercial packages:
> Gauss and Matlab. But it turns out that "Optim" does not converge
> completely. The same data for Gauss and Matlab are converged very well. I
> see that there are too many packages based on "optim" and really doubt if
> they can be trusted!

--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/
On 01/10/11 08:12, yehengxin wrote:
> I used to consider using R and "Optim" to replace my commercial packages:
> Gauss and Matlab. But it turns out that "Optim" does not converge
> completely.

What does "completely" mean?

> The same data for Gauss and Matlab are converged very well. I
> see that there are too many packages based on "optim" and really doubt if
> they can be trusted!

I don't understand the "too many". If a package needs optimization, it is normal that it uses optim! I use the same model in R, in Excel's solver (the new version is rather good), and in Profit (a Mac application, very powerful), and R is among the best solutions. But there are many different choices that can influence the optimization, so you must give an example of the problem. I do find convergence problems when the criterion to be minimized is the output of a stochastic model (i.e., when the same set of parameters produces different objective values from run to run). In that case the fit stops prematurely, and the method "SANN" should be preferred. In conclusion: give us more information, but take into account that non-linear optimization is a complex world!

Marc

--
__________________________________________________________
Marc Girondot, Pr
Laboratoire Ecologie, Systématique et Evolution
Equipe de Conservation des Populations et des Communautés
CNRS, AgroParisTech et Université Paris-Sud 11, UMR 8079
Bâtiment 362
91405 Orsay Cedex, France
Tel: 33 1 (0)1.69.15.72.30
Fax: 33 1 (0)1.69.15.73.53
e-mail: marc.girondot at u-psud.fr
Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
Skype: girondot
What I tried is just a simple binary probit model: create random data and use "optim" to maximize the log-likelihood function to estimate the coefficients. (E.g. u = 0.1 + 0.2*x + e, where e is standard normal, and y = (u > 0), so y is a binary choice indicator.)

If I estimate the coefficient of x, I should get a value close to 0.2 if the sample is large enough. Say I got 0.18. If I multiply x by two and re-estimate the model, which coefficient should I get? 0.09, right? But with "optim" I got something different. When I do the same thing in both Gauss and Matlab, I get exactly 0.09, evidencing that the coefficient estimator is reliable. But R's "optim" does not give me a reliable estimator.
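For concreteness, here is a minimal sketch of the setup described above (the sample size and the use of glm() as a reference fit are my own assumptions, not the poster's code). With a fully converged fit, doubling x should exactly halve the estimated slope:

```r
set.seed(1)
n <- 10000
x <- rnorm(n)
u <- 0.1 + 0.2 * x + rnorm(n)  # latent variable with standard-normal error
y <- as.integer(u > 0)         # observed binary choice

# Reference probit fits on x and on 2*x.
fit1 <- glm(y ~ x, family = binomial("probit"))
fit2 <- glm(y ~ I(2 * x), family = binomial("probit"))

# The slope on 2*x is exactly half the slope on x (up to numerical tolerance),
# because the two models are reparametrizations of each other.
ratio <- as.numeric(coef(fit1)["x"] / coef(fit2)[2])
ratio  # essentially 2
```

Any discrepancy from this exact halving is a property of the optimizer's stopping rule, not of the model.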
With respect, your statement that R's optim does not give you a reliable estimator is bogus. As pointed out before, this depends on when optim believes it is good enough and stops optimizing. In particular, if you stretch out x, it is plausible that the likelihood function becomes flat enough "earlier," so that the numerical optimization stops earlier (i.e., optim "thinks" the slope of the likelihood function is flat enough to be considered zero, and stops earlier than it would for more condensed data). After all, numerical maximization of the likelihood is an approximation, and I would venture that what you describe lies in the nature of the method. You could also follow the good advice given earlier and increase the number of iterations or decrease the tolerance. However, check the example below: for all purposes it is really close enough, and has nothing to do with optim being "unreliable."

n <- 1000
x <- rnorm(n)
y <- 0.5 * x + rnorm(n)
z <- ifelse(y > 0, 1, 0)
X <- cbind(1, x)
b <- matrix(c(0, 0), nrow = 2)

# Probit
reg <- glm(z ~ x, family = binomial("probit"))

# Optim reproducing probit (with minor deviations due to difference in method)
LL <- function(b) {
  -sum(z * log(pnorm(X %*% b)) + (1 - z) * log(1 - pnorm(X %*% b)))
}
optim(c(0, 0), LL)

# Multiply x by 2 and repeat optim
X[, 2] <- 2 * X[, 2]
optim(c(0, 0), LL)

HTH,
Daniel

yehengxin wrote:
> What I tried is just a simple binary probit model. Create random data
> and use "optim" to maximize the log-likelihood function to estimate the
> coefficients. (e.g. u = 0.1+0.2*x + e, e is standard normal. And y = (u >
> 0), y indicating a binary choice variable)
>
> If I estimate coefficient of "x", I should be able to get a value close to
> 0.2 if sample is large enough. Say I got 0.18.
>
> If I expand x by twice and reestimate the model, which coefficient should
> I get? 0.09, right?
>
> But with "optim", I got something different. When I do the same thing in
> both Gauss and Matlab, I can exactly get 0.09, evidencing that the
> coefficient estimator is reliable. But R's "optim" does not give me a
> reliable estimator.
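Daniel's advice about iterations and tolerance can be sketched as follows (the specific reltol value is an illustrative choice of mine): tightening optim()'s stopping rule and using the gradient-based "BFGS" method makes the x-versus-2x estimates agree to many more decimals.

```r
set.seed(42)
n <- 1000
x <- rnorm(n)
z <- ifelse(0.5 * x + rnorm(n) > 0, 1, 0)
X <- cbind(1, x)

# Negative probit log-likelihood, same construction as Daniel's example.
LL <- function(b) -sum(z * log(pnorm(X %*% b)) + (1 - z) * log(1 - pnorm(X %*% b)))

# BFGS with a much tighter relative tolerance than the default (~1e-8).
f1 <- optim(c(0, 0), LL, method = "BFGS", control = list(reltol = 1e-12))
X[, 2] <- 2 * X[, 2]
f2 <- optim(c(0, 0), LL, method = "BFGS", control = list(reltol = 1e-12))

f1$par[2] / f2$par[2]  # very close to 2
```

The residual discrepancy is far below anything that matters for inference, which is the point: the two slope estimates differ only by the optimizer's stopping tolerance.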
Oh, I think I got it: commercial packages limit the number of decimals shown.
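The same is true in R, incidentally: the console prints about seven significant digits by default, but the stored value is not truncated. A small illustration:

```r
est <- 0.0912345678
print(est)               # default: 7 significant digits -> 0.09123457
print(est, digits = 10)  # show more of the stored value
sprintf("%.10f", est)    # "0.0912345678" -- nothing was lost
```

So two packages can hold slightly different estimates internally while printing identical-looking rounded values.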
Hi,

You really need to study the documentation of "optim" carefully before you make broad generalizations. There are several algorithms available in optim. The default is a simplex-type algorithm called Nelder-Mead, which I think is an unfortunate choice of default: Nelder-Mead is a robust algorithm that can work for almost any kind of objective function (smooth or nasty), but the trade-off is that its convergence rate is very slow. For simple, smooth problems such as yours, you should use "BFGS" (or "L-BFGS-B" if you have simple box constraints). Also take a look at the "optimx" package, and at the recent paper on optimx in the Journal of Statistical Software, for a better understanding of the wide array of optimization options available in R.

Best,
Ravi.
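To make Ravi's comparison concrete, here is a hedged sketch (the simulated data are mine, not Ravi's): the same probit negative log-likelihood fitted with the default Nelder-Mead and with "BFGS". On a smooth problem like this, BFGS reaches an equal or better objective value.

```r
set.seed(7)
n <- 1000
x <- rnorm(n)
z <- ifelse(0.2 * x + rnorm(n) > 0, 1, 0)
X <- cbind(1, x)

# Negative probit log-likelihood.
LL <- function(b) -sum(z * log(pnorm(X %*% b)) + (1 - z) * log(1 - pnorm(X %*% b)))

nm   <- optim(c(0, 0), LL)                   # default method: Nelder-Mead
bfgs <- optim(c(0, 0), LL, method = "BFGS")  # quasi-Newton alternative

c(NelderMead = nm$value, BFGS = bfgs$value)  # BFGS's value is <= Nelder-Mead's
```

Inspecting `nm$counts` and `bfgs$counts` also shows how the two methods spend their function (and gradient) evaluations differently.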