Abby Spurdle
2020-Mar-12 20:22 UTC
[R] Relatively Simple Maximization Using Optim Doesnt Optimize
I'm sorry, Duncan, but I disagree.

This is not a "bug" in the optim function as such (or at least, there's nothing in this discussion to suggest that there is a bug), but rather a problem related to floating-point arithmetic.

The OP's function looks simple enough at first glance, but it isn't. Plotting a numerical approximation of the derivative makes the problem more apparent:

----------
plot_derivative <- function (f, a = sol - offset, b = sol + offset,
    sol, offset=0.001, N=200)
{   FIRST <- 1:(N - 2)
    LAST <- 3:N
    MP <- 2:(N - 1)

    x <- seq (a, b, length.out=N)
    y <- f (x)
    dy <- (y [LAST] - y [FIRST]) / (x [LAST] - x [FIRST])

    plot (x [MP], dy, type="l", xlab="x", ylab="dy/dx (approx)")
}

optim.sol <- optim (1001, production1, method="CG",
    control = list (fnscale=-1) )$par
plot_derivative (production1, sol=optim.sol)
abline (v=optim.sol, lty=2, col="grey")
----------

So, I would say the optim function (including the CG method) is doing what it's supposed to do.

And collating/expanding on Nash's, Jeff's and Eric's comments:
(1) An exact solution can be derived quickly, so using a numerical method is unnecessary and inefficient.
(2) Possible problems with the CG method are noted in the documentation.
(3) Numerical approximations of the function's derivative need to be well behaved for gradient-based numerical methods to work properly.

On Fri, Mar 13, 2020 at 3:42 AM Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>
> It looks like a bug in the CG method.  The other methods in optim() all
> work fine.  CG is documented to be a good choice in high dimensions; why
> did you choose it for a 1 dim problem?
>
> Duncan Murdoch
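To make point (3) concrete, here is a minimal, self-contained sketch, using x^(2/3) as a stand-in objective (not the OP's production1), of how a central-difference derivative degrades as the step size shrinks, purely because of floating-point cancellation:

----------
## Stand-in function with large values and a small slope at the evaluation
## point; NOT the OP's production1, just an illustration of cancellation.
f <- function (x) x^(2/3)

x0 <- 1e6          # f(x0) is about 1e4; the exact derivative is about 0.0067
h <- 10^-(2:10)    # progressively smaller central-difference steps
dy <- (f (x0 + h) - f (x0 - h) ) / (2 * h)
cbind (h, dy, error = dy - (2/3) * x0^(-1/3) )
## For the smallest steps, f(x0 + h) - f(x0 - h) retains almost no
## significant digits, so the approximate derivative is mostly rounding noise.
----------

The same loss of significant digits is what a gradient-based method has to contend with when the objective's derivative is only available through finite differences.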
Abby Spurdle
2020-Mar-12 21:54 UTC
[R] Relatively Simple Maximization Using Optim Doesnt Optimize
> (1) An exact solution can be derived quickly

Please disregard note (1) above; I'm not sure it was right.

And one more comment: the conjugate gradient method is an established method. So the question is whether the optim function is applying this method or not. Assuming that it is, then R is definitely doing what it should be doing. If not, then I guess it would be a bug.
Mark Leeds
2020-Mar-12 23:58 UTC
[R] Relatively Simple Maximization Using Optim Doesnt Optimize
Hi Abby,

Either way, thanks for your efforts with the derivative plot. Note that John Nash is a SERIOUS EXPERT in optimization, so I would just go by what he said earlier. Also, I don't want to speak for Duncan, but I have a feeling that he meant an "inadequacy" in the CG method rather than a bug in the R code.

Mark

On Thu, Mar 12, 2020 at 5:55 PM Abby Spurdle <spurdle.a at gmail.com> wrote:
>
> > (1) An exact solution can be derived quickly
>
> Please disregard note (1) above; I'm not sure it was right.
>
> And one more comment: the conjugate gradient method is an established
> method. So the question is whether the optim function is applying this
> method or not. Assuming that it is, then R is definitely doing what it
> should be doing. If not, then I guess it would be a bug.
Duncan Murdoch
2020-Mar-13 01:42 UTC
[R] Relatively Simple Maximization Using Optim Doesnt Optimize
On 12/03/2020 1:22 p.m., Abby Spurdle wrote:
> I'm sorry, Duncan, but I disagree.
>
> This is not a "bug" in the optim function as such (or at least, there's
> nothing in this discussion to suggest that there is a bug), but rather a
> problem related to floating-point arithmetic.
>
> The OP's function looks simple enough at first glance, but it isn't.
> Plotting a numerical approximation of the derivative makes the problem
> more apparent:

There is nothing in that plot to indicate that the result given by optim() should be accepted as optimal. The numerical approximation to the derivative is 0.055851 everywhere in your graph, with numerical errors out in the 8th decimal place or later. Clearly the max occurs somewhere to the right of that.

Yes, the 2nd derivative calculation will be terrible if R chooses a step size of 0.00001 when calculating it, but why would it do that, given that the 1st derivative is 3 orders of magnitude larger?

> ----------
> plot_derivative <- function (f, a = sol - offset, b = sol + offset,
>     sol, offset=0.001, N=200)
> {   FIRST <- 1:(N - 2)
>     LAST <- 3:N
>     MP <- 2:(N - 1)
>
>     x <- seq (a, b, length.out=N)
>     y <- f (x)
>     dy <- (y [LAST] - y [FIRST]) / (x [LAST] - x [FIRST])
>
>     plot (x [MP], dy, type="l", xlab="x", ylab="dy/dx (approx)")
> }
>
> optim.sol <- optim (1001, production1, method="CG",
>     control = list (fnscale=-1) )$par
> plot_derivative (production1, sol=optim.sol)
> abline (v=optim.sol, lty=2, col="grey")
> ----------
>
> So, I would say the optim function (including the CG method) is doing
> what it's supposed to do.
>
> And collating/expanding on Nash's, Jeff's and Eric's comments:
> (1) An exact solution can be derived quickly, so using a numerical
> method is unnecessary and inefficient.
> (2) Possible problems with the CG method are noted in the documentation.
> (3) Numerical approximations of the function's derivative need to be
> well behaved for gradient-based numerical methods to work properly.
>
> On Fri, Mar 13, 2020 at 3:42 AM Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>>
>> It looks like a bug in the CG method.  The other methods in optim() all
>> work fine.  CG is documented to be a good choice in high dimensions; why
>> did you choose it for a 1 dim problem?
>>
>> Duncan Murdoch
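One way to probe the step-size question raised here is to change the finite-difference step optim uses for its numerical gradient (the control parameter ndeps, which defaults to 1e-3 on the par/parscale scale), or to bypass finite differences entirely by supplying an analytic gradient. A sketch, assuming production1 is defined as in the OP's original post (not reproduced in this thread) and with grad1 as a hypothetical hand-derived gradient:

----------
## Assumes production1 is the OP's objective function, defined earlier in
## the thread; grad1 below is a hypothetical analytic gradient of it.

## 1. Enlarge the finite-difference step for the numerical gradient
##    (optim's default is ndeps = 1e-3).
optim (1001, production1, method="CG",
    control = list (fnscale=-1, ndeps=1e-1) )$par

## 2. Supply an analytic gradient via 'gr', so no finite differences are
##    used at all (uncomment once grad1 has been derived).
## optim (1001, production1, gr=grad1, method="CG",
##     control = list (fnscale=-1) )$par
----------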
Abby Spurdle
2020-Mar-13 02:25 UTC
[R] Relatively Simple Maximization Using Optim Doesnt Optimize
> There is nothing in that plot to indicate that the result given by
> optim() should be accepted as optimal.  The numerical approximation to
> the derivative is 0.055851 everywhere in your graph

That wasn't how I intended the plot to be interpreted.

By default, the step size (in x) in plot_derivative is about 1e-5, which seems like a moderate step size. However, at that level the numerical approximation is very badly behaved, and if the step size is decreased, things get worse.

I haven't checked all the technical details of the optim function, but any reliance on numerical approximations of the derivative has a high chance of running into problems with a function like this.
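The claim that things get worse with a smaller step can be checked directly with the plot_derivative function from the earlier post by shrinking offset (this again assumes production1 from the OP's post, and optim.sol from the earlier code, are in the workspace):

----------
## Re-draw the derivative plot with progressively narrower windows, and
## hence progressively smaller finite-difference steps.  Assumes
## production1, plot_derivative and optim.sol from earlier in the thread.
op <- par (mfrow = c (1, 3) )
for (off in c (1e-3, 1e-5, 1e-7) )
    plot_derivative (production1, sol=optim.sol, offset=off)
par (op)
----------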