Greetings,

I am in great anguish as the routine stats::optim shows inexplicable behaviour
of various sorts.
For one, it is immune to the choice of optimization method and seems to always do the same.
The following trace log

N = 21, M = 5 machine precision = 2.22045e-16
At X0, 0 variables are exactly at the bounds
At iterate     0  f=       1756.8  |proj g|=      0.73581
At iterate     1  f =      911.52  |proj g|=      0.70136
At iterate     2  f =      791.62  |proj g|=      0.68563
At iterate     3  f =      749.81  |proj g|=            1
.....
.....
At iterate    87  f =      666.91  |proj g|=      0.98217
At iterate    88  f =       666.9  |proj g|=      0.96966

Bad direction in the line search;
   refresh the lbfgs memory and restart the iteration.
At iterate    89  f =      9022.8  |proj g|=       1.0426

iterations 89
function evaluations 132
segments explored during Cauchy searches 128
BFGS updates skipped 0
active bounds at final generalized Cauchy point 18
norm of the final projected gradient 1.04257
final function value 9022.84

F = 9022.84
final  value 9022.836050
converged

is produced by each of the following calls to optim:

    optPars <- optim( pars, OF, #gradientOF,
                      method = "CG",
                      lower = pars_lb, upper = pars_ub,
                      control = list(fnscale=1, trace=3, REPORT=1)
               )
    optPars <- optim( pars, OF, #gradientOF,
                      method = "Nelder-Mead",
                      lower = pars_lb, upper = pars_ub,
                      control = list(fnscale=1, trace=3, REPORT=1)
               )
    optPars <- optim( pars, OF, #gradientOF,
                      method = "L-BFGS-B",
                      lower = pars_lb, upper = pars_ub,
                      control = list(fnscale=1, trace=3, REPORT=1)
               )

If method != "L-BFGS-B", then the routine complains about the use of bounds for
the parameters, as expected; however, the trace log above remains the same.

Note also that the routine makes fine progress toward a minimum (as desired)
but in the last iteration reverses course and returns a function value much larger than the starting value.

What is going on here?
All help is much appreciated.

Michael Meyer
I don't think anyone can do much to help you unless you show us
(a) your objective function "OF" and (b) your starting value for "pars",
neither of which I see in your posting.

Examples should be ***reproducible***!!!

My personal experience with optim() has always been very good.

    cheers,

        Rolf Turner

On 09/04/13 01:08, Michael Meyer wrote:
> Greetings,
>
> I am in great anguish as the routine stats::optim shows inexplicable behaviour
> of various sorts.
> For one it is immune to the choice of optimization method and seems to always do the same.
> [...]
> What is going on here?
> All help is much appreciated.
It would take some effort to extract self-contained code from the mass of code in which this optimization is embedded. Moreover, I would have to obtain permission from my employer to do so.

This is not efficient.
However, some things are evident from the trace log which I have submitted:
(a) L-BFGS-B does not identify itself, even though it was called, overriding the method
    parameter in optim.
(b) Optim reports as the final converged minimum a function value that is much larger than
    others computed during the optimization.

I think we can agree on calling this a bug.
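Regarding point (b): one defensive pattern, offered here only as a sketch and not as a fix for whatever happens inside L-BFGS-B, is to wrap the objective so that the best point evaluated so far is remembered independently of what optim() finally returns. The helper make_tracked() and the toy objective OF_toy are hypothetical names introduced for illustration; the poster's actual OF, pars and bounds were not shown.

```r
# Hypothetical helper (not part of stats::optim): wraps an objective so that
# the lowest finite value seen so far, and where it occurred, is remembered.
make_tracked <- function(fn) {
  best <- list(value = Inf, par = NULL)
  list(
    fn = function(p, ...) {
      v <- fn(p, ...)
      # Record every improvement, including points evaluated while optim()
      # builds its finite-difference gradient.
      if (is.finite(v) && v < best$value) best <<- list(value = v, par = p)
      v
    },
    best = function() best
  )
}

# Toy objective standing in for the poster's OF (an assumption).
OF_toy <- function(p) sum((p - c(1, 2))^2)
tracked <- make_tracked(OF_toy)

res <- optim(c(0, 0), tracked$fn, method = "L-BFGS-B",
             lower = c(-5, -5), upper = c(5, 5))

# Should the optimizer ever return a value worse than one it has already
# evaluated, the tracked record still holds the better point.
tracked$best()$value <= res$value
tracked$best()$par
```

With a trace like the one posted (f = 666.9 at iterate 88, then 9022.8 at iterate 89), such a wrapper would at least let the earlier, lower point be recovered for inspection.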
Sometimes one has to really read the manual carefully.

"If non-trivial bounds are supplied, this method will be selected, with a warning." (re L-BFGS-B)

Several of us have noted problems occasionally with this code. You might want to look at the
box-constrained codes offered in the optimx package through other packages
(bobyqa, nmkb, Rvmmin, Rcgmin).

JN

On 13-09-04 06:00 AM, r-help-request at r-project.org wrote:
> Message: 67
> Date: Wed, 4 Sep 2013 16:34:54 +0800 (SGT)
> From: Michael Meyer <spyqqqdia at yahoo.com>
> To: "r-help at r-project.org" <r-help at r-project.org>
> Subject: [R] optim evils
> Message-ID:
>     <1378283694.77272.YahooMailNeo at web193402.mail.sg3.yahoo.com>
> Content-Type: text/plain
>
> It would take some effort to extract selfcontained code from the mass of code wherein this
> optimization is embedded. Moreover I would have to obtain permission from my employer to do so.
>
> This is not efficient.
> However some things are evident from the trace log which I have submitted:
> (a) L-BFGS-B does not identify itself even though it was called overriding the method
>     parameter in optim.
> (b) Optim reports as final converged minimum value a function value that is much larger than
>     others computed during the optimization.
>
> I think we can agree on calling this a bug.
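The behaviour described in that manual sentence can be checked with a small experiment. The sketch below uses a toy quadratic (OF_toy, a hypothetical stand-in, since the poster's OF was not shown), asks for Nelder-Mead while also supplying bounds, and captures the resulting warning. It assumes nothing beyond base R's stats::optim.

```r
# Toy objective (an assumption; not the poster's OF).
OF_toy <- function(p) sum((p - c(1, 2))^2)

warned <- character(0)
res <- withCallingHandlers(
  optim(c(0, 0), OF_toy,
        method = "Nelder-Mead",               # the method that was requested
        lower = c(-5, -5), upper = c(5, 5)),  # non-trivial bounds
  warning = function(w) {
    warned <<- c(warned, conditionMessage(w)) # keep the warning text
    invokeRestart("muffleWarning")
  }
)

print(warned)       # the warning about bounds and L-BFGS-B
print(res$message)  # non-NULL here; a plain Nelder-Mead run leaves it NULL
```

So with bounds present, all three calls in the original post would be expected to run the same L-BFGS-B code and produce the same trace, which accounts for observation (a) but not for (b).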
Thanks for all replies.

The problem occurred in the following context:
A Gaussian one-dimensional mixture (number of constituents, locations, variances all unknown)
is to be fitted to data (as a starting value for, or in lieu of, mixtools).
A likelihood maximization is performed.

I'll try to distill the code so that a reproducible failure of L-BFGS-B occurs and post it here.

Michael Meyer
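For readers who want something runnable in the meantime, here is a minimal, self-contained sketch of the kind of problem described: a bounded maximum-likelihood fit of a one-dimensional Gaussian mixture with L-BFGS-B. Everything in it is an assumption made for illustration: the data are simulated, the number of components is fixed at two (the poster's is unknown, with N = 21 parameters), and the names negloglik, start, lb and ub are hypothetical. It is not the poster's code and is not claimed to reproduce the failure.

```r
set.seed(1)  # simulated data, NOT the poster's data
x <- c(rnorm(150, mean = -2, sd = 1), rnorm(100, mean = 3, sd = 0.7))

# Negative log-likelihood of a two-component Gaussian mixture.
# par = (w, mu1, mu2, sd1, sd2), where w is the weight of component 1.
negloglik <- function(par, x) {
  w    <- par[1]
  dens <- w * dnorm(x, par[2], par[4]) + (1 - w) * dnorm(x, par[3], par[5])
  -sum(log(pmax(dens, .Machine$double.xmin)))  # guard against log(0)
}

start <- c(0.5,  -1,   1, 1,    1)     # starting values
lb    <- c(0.01, -10, -10, 0.05, 0.05) # keep the weight in (0, 1), sds positive
ub    <- c(0.99,  10,  10, 10,   10)

fit <- optim(start, negloglik, x = x, method = "L-BFGS-B",
             lower = lb, upper = ub)

fit$par          # fitted (w, mu1, mu2, sd1, sd2)
fit$value        # negative log-likelihood at the returned point
fit$convergence  # 0 indicates that optim reports convergence
fit$message
```

A posting of this shape (seeded data, objective, start values, bounds and the exact optim() call) is what makes a report like the one above checkable by others.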
Greetings,

In obedient deference to the demands of the collective, I emailed a bug report containing code
and data to r-bugs at r-project.org, but subsequently found that I am unable to load the page

http://bugs.r-project.org/

to check on the status of this report.
Can anyone else load this page?

Thanks,

Michael Meyer