peter dalgaard
2019-May-03 11:13 UTC
[Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments
Yes, I think you are right. I was at first confused by the fact that after the optim() call,> environment(fn)$xx[1] 10> environment(fn)$ret[1] 100.02 so not 9.999, but this could come from x being assigned the final value without calling fn. -pd> On 3 May 2019, at 11:58 , Duncan Murdoch <murdoch.duncan at gmail.com> wrote: > > Your results below make it look like a bug in optim(): it is not duplicating a value when it should, so changes to x affect xx as well. > > Duncan Murdoch > > On 03/05/2019 4:41 a.m., Serguei Sokol wrote: >> On 03/05/2019 10:31, Serguei Sokol wrote: >>> On 02/05/2019 21:35, Florian Gerber wrote: >>>> Dear all, >>>> >>>> when using optim() for a function that uses the parent environment, I >>>> see the following unexpected behavior: >>>> >>>> makeFn <- function(){ >>>> xx <- ret <- NA >>>> fn <- function(x){ >>>> if(!is.na(xx) && x==xx){ >>>> cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="") >>>> return(ret) >>>> } >>>> xx <<- x; ret <<- sum(x^2) >>>> cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="") >>>> ret >>>> } >>>> fn >>>> } >>>> fn <- makeFn() >>>> optim(par=10, fn=fn, method="L-BFGS-B") >>>> # x=10, ret=100 (calculate) >>>> # x=10.001, ret=100.02 (calculate) >>>> # x=9.999, ret=100.02 (memory) >>>> # $par >>>> # [1] 10 >>>> # >>>> # $value >>>> # [1] 100 >>>> # (...) >>>> >>>> I would expect that optim() does more than 3 function evaluations and >>>> that the optimization converges to 0. >>>> >>>> Same problem with optim(par=10, fn=fn, method="BFGS"). >>>> >>>> Any ideas? >>> I don't have an answer but may be an insight. For some mysterious >>> reason xx is getting changed when in should not. Consider: >>>> fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in >>> x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx >>> <<- x; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}}) >>>> optim(par=10, fn=fn, method="L-BFGS-B") >>> 1 in x,xx,ret= 10 NA NA >>> out x,xx,ret= 10 10 100 >>> 2 in x,xx,ret= 10.001 10 100 >>> out x,xx,ret= 10.001 10.001 100.02 >>> 3 in x,xx,ret= 9.999 9.999 100.02 >>> $par >>> [1] 10 >>> >>> $value >>> [1] 100 >>> >>> $counts >>> function gradient >>> 1 1 >>> >>> $convergence >>> [1] 0 >>> >>> $message >>> [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL" >>> >>> At the third call, xx has value 9.999 while it should have kept the >>> value 10.001. >>> >> A little follow-up: if you untie the link between xx and x by replacing >> the expression "xx <<- x" by "xx <<- x+0" it works as expected: >> > fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in >> x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<- >> x+0; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}}) >> > optim(par=10, fn=fn, method="L-BFGS-B") >> 1 in x,xx,ret= 10 NA NA >> out x,xx,ret= 10 10 100 >> 2 in x,xx,ret= 10.001 10 100 >> out x,xx,ret= 10.001 10.001 100.02 >> 3 in x,xx,ret= 9.999 10.001 100.02 >> out x,xx,ret= 9.999 9.999 99.98 >> 4 in x,xx,ret= 9 9.999 99.98 >> out x,xx,ret= 9 9 81 >> 5 in x,xx,ret= 9.001 9 81 >> out x,xx,ret= 9.001 9.001 81.018 >> 6 in x,xx,ret= 8.999 9.001 81.018 >> out x,xx,ret= 8.999 8.999 80.982 >> 7 in x,xx,ret= 1.776357e-11 8.999 80.982 >> out x,xx,ret= 1.776357e-11 1.776357e-11 3.155444e-22 >> 8 in x,xx,ret= 0.001 1.776357e-11 3.155444e-22 >> out x,xx,ret= 0.001 0.001 1e-06 >> 9 in x,xx,ret= -0.001 0.001 1e-06 >> out x,xx,ret= -0.001 -0.001 1e-06 >> 10 in x,xx,ret= -1.334475e-23 -0.001 1e-06 >> out x,xx,ret= -1.334475e-23 -1.334475e-23 1.780823e-46 >> 11 in x,xx,ret= 0.001 -1.334475e-23 1.780823e-46 >> out x,xx,ret= 0.001 0.001 1e-06 >> 12 in x,xx,ret= -0.001 0.001 1e-06 >> out x,xx,ret= -0.001 -0.001 1e-06 >> $par >> [1] -1.334475e-23 >> $value >> [1] 1.780823e-46 >> $counts >> function gradient >> 4 4 >> $convergence >> [1] 0 >> $message >> [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL" >> Serguei. >> ______________________________________________ >> R-devel at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-devel >> > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
Duncan Murdoch
2019-May-03 12:18 UTC
[Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments
It looks as though this happens when calculating numerical gradients: x is reduced by eps, and fn is called; then x is increased by eps, and fn is called again. No check is made that x has other references after the first call to fn. I'll put together a patch if nobody else gets there first... Duncan Murdoch On 03/05/2019 7:13 a.m., peter dalgaard wrote:> Yes, I think you are right. I was at first confused by the fact that after the optim() call, > >> environment(fn)$xx > [1] 10 >> environment(fn)$ret > [1] 100.02 > > so not 9.999, but this could come from x being assigned the final value without calling fn. > > -pd > > >> On 3 May 2019, at 11:58 , Duncan Murdoch <murdoch.duncan at gmail.com> wrote: >> >> Your results below make it look like a bug in optim(): it is not duplicating a value when it should, so changes to x affect xx as well. >> >> Duncan Murdoch >> >> On 03/05/2019 4:41 a.m., Serguei Sokol wrote: >>> On 03/05/2019 10:31, Serguei Sokol wrote: >>>> On 02/05/2019 21:35, Florian Gerber wrote: >>>>> Dear all, >>>>> >>>>> when using optim() for a function that uses the parent environment, I >>>>> see the following unexpected behavior: >>>>> >>>>> makeFn <- function(){ >>>>> xx <- ret <- NA >>>>> fn <- function(x){ >>>>> if(!is.na(xx) && x==xx){ >>>>> cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="") >>>>> return(ret) >>>>> } >>>>> xx <<- x; ret <<- sum(x^2) >>>>> cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="") >>>>> ret >>>>> } >>>>> fn >>>>> } >>>>> fn <- makeFn() >>>>> optim(par=10, fn=fn, method="L-BFGS-B") >>>>> # x=10, ret=100 (calculate) >>>>> # x=10.001, ret=100.02 (calculate) >>>>> # x=9.999, ret=100.02 (memory) >>>>> # $par >>>>> # [1] 10 >>>>> # >>>>> # $value >>>>> # [1] 100 >>>>> # (...) >>>>> >>>>> I would expect that optim() does more than 3 function evaluations and >>>>> that the optimization converges to 0. >>>>> >>>>> Same problem with optim(par=10, fn=fn, method="BFGS"). >>>>> >>>>> Any ideas? >>>> I don't have an answer but may be an insight. For some mysterious >>>> reason xx is getting changed when in should not. Consider: >>>>> fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in >>>> x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx >>>> <<- x; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}}) >>>>> optim(par=10, fn=fn, method="L-BFGS-B") >>>> 1 in x,xx,ret= 10 NA NA >>>> out x,xx,ret= 10 10 100 >>>> 2 in x,xx,ret= 10.001 10 100 >>>> out x,xx,ret= 10.001 10.001 100.02 >>>> 3 in x,xx,ret= 9.999 9.999 100.02 >>>> $par >>>> [1] 10 >>>> >>>> $value >>>> [1] 100 >>>> >>>> $counts >>>> function gradient >>>> 1 1 >>>> >>>> $convergence >>>> [1] 0 >>>> >>>> $message >>>> [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL" >>>> >>>> At the third call, xx has value 9.999 while it should have kept the >>>> value 10.001. >>>> >>> A little follow-up: if you untie the link between xx and x by replacing >>> the expression "xx <<- x" by "xx <<- x+0" it works as expected: >>> > fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in >>> x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<- >>> x+0; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}}) >>> > optim(par=10, fn=fn, method="L-BFGS-B") >>> 1 in x,xx,ret= 10 NA NA >>> out x,xx,ret= 10 10 100 >>> 2 in x,xx,ret= 10.001 10 100 >>> out x,xx,ret= 10.001 10.001 100.02 >>> 3 in x,xx,ret= 9.999 10.001 100.02 >>> out x,xx,ret= 9.999 9.999 99.98 >>> 4 in x,xx,ret= 9 9.999 99.98 >>> out x,xx,ret= 9 9 81 >>> 5 in x,xx,ret= 9.001 9 81 >>> out x,xx,ret= 9.001 9.001 81.018 >>> 6 in x,xx,ret= 8.999 9.001 81.018 >>> out x,xx,ret= 8.999 8.999 80.982 >>> 7 in x,xx,ret= 1.776357e-11 8.999 80.982 >>> out x,xx,ret= 1.776357e-11 1.776357e-11 3.155444e-22 >>> 8 in x,xx,ret= 0.001 1.776357e-11 3.155444e-22 >>> out x,xx,ret= 0.001 0.001 1e-06 >>> 9 in x,xx,ret= -0.001 0.001 1e-06 >>> out x,xx,ret= -0.001 -0.001 1e-06 >>> 10 in x,xx,ret= -1.334475e-23 -0.001 1e-06 >>> out x,xx,ret= -1.334475e-23 -1.334475e-23 1.780823e-46 >>> 11 in x,xx,ret= 0.001 -1.334475e-23 1.780823e-46 >>> out x,xx,ret= 0.001 0.001 1e-06 >>> 12 in x,xx,ret= -0.001 0.001 1e-06 >>> out x,xx,ret= -0.001 -0.001 1e-06 >>> $par >>> [1] -1.334475e-23 >>> $value >>> [1] 1.780823e-46 >>> $counts >>> function gradient >>> 4 4 >>> $convergence >>> [1] 0 >>> $message >>> [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL" >>> Serguei. >>> ______________________________________________ >>> R-devel at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-devel >>> >> >> ______________________________________________ >> R-devel at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-devel >
Ravi Varadhan
2019-May-06 14:06 UTC
[Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments
Optim's Nelder-Mead works correctly for this example.> optim(par=10, fn=fn, method="Nelder-Mead")x=10, ret=100.02 (memory) x=11, ret=121 (calculate) x=9, ret=81 (calculate) x=8, ret=64 (calculate) x=6, ret=36 (calculate) x=4, ret=16 (calculate) x=0, ret=0 (calculate) x=-4, ret=16 (calculate) x=-4, ret=16 (memory) x=2, ret=4 (calculate) x=-2, ret=4 (calculate) x=1, ret=1 (calculate) x=-1, ret=1 (calculate) x=0.5, ret=0.25 (calculate) x=-0.5, ret=0.25 (calculate) x=0.25, ret=0.0625 (calculate) x=-0.25, ret=0.0625 (calculate) x=0.125, ret=0.015625 (calculate) x=-0.125, ret=0.015625 (calculate) x=0.0625, ret=0.00390625 (calculate) x=-0.0625, ret=0.00390625 (calculate) x=0.03125, ret=0.0009765625 (calculate) x=-0.03125, ret=0.0009765625 (calculate) x=0.015625, ret=0.0002441406 (calculate) x=-0.015625, ret=0.0002441406 (calculate) x=0.0078125, ret=6.103516e-05 (calculate) x=-0.0078125, ret=6.103516e-05 (calculate) x=0.00390625, ret=1.525879e-05 (calculate) x=-0.00390625, ret=1.525879e-05 (calculate) x=0.001953125, ret=3.814697e-06 (calculate) x=-0.001953125, ret=3.814697e-06 (calculate) x=0.0009765625, ret=9.536743e-07 (calculate) $par [1] 0 $value [1] 0 $counts function gradient 32 NA $convergence [1] 0 $message NULL ________________________________ From: R-devel <r-devel-bounces at r-project.org> on behalf of Duncan Murdoch <murdoch.duncan at gmail.com> Sent: Friday, May 3, 2019 8:18:44 AM To: peter dalgaard Cc: Florian Gerber; r-devel at r-project.org Subject: Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments It looks as though this happens when calculating numerical gradients: x is reduced by eps, and fn is called; then x is increased by eps, and fn is called again. No check is made that x has other references after the first call to fn. I'll put together a patch if nobody else gets there first... Duncan Murdoch On 03/05/2019 7:13 a.m., peter dalgaard wrote:> Yes, I think you are right. I was at first confused by the fact that after the optim() call, > >> environment(fn)$xx > [1] 10 >> environment(fn)$ret > [1] 100.02 > > so not 9.999, but this could come from x being assigned the final value without calling fn. > > -pd > > >> On 3 May 2019, at 11:58 , Duncan Murdoch <murdoch.duncan at gmail.com> wrote: >> >> Your results below make it look like a bug in optim(): it is not duplicating a value when it should, so changes to x affect xx as well. >> >> Duncan Murdoch >> >> On 03/05/2019 4:41 a.m., Serguei Sokol wrote: >>> On 03/05/2019 10:31, Serguei Sokol wrote: >>>> On 02/05/2019 21:35, Florian Gerber wrote: >>>>> Dear all, >>>>> >>>>> when using optim() for a function that uses the parent environment, I >>>>> see the following unexpected behavior: >>>>> >>>>> makeFn <- function(){ >>>>> xx <- ret <- NA >>>>> fn <- function(x){ >>>>> if(!is.na(xx) && x==xx){ >>>>> cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="") >>>>> return(ret) >>>>> } >>>>> xx <<- x; ret <<- sum(x^2) >>>>> cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="") >>>>> ret >>>>> } >>>>> fn >>>>> } >>>>> fn <- makeFn() >>>>> optim(par=10, fn=fn, method="L-BFGS-B") >>>>> # x=10, ret=100 (calculate) >>>>> # x=10.001, ret=100.02 (calculate) >>>>> # x=9.999, ret=100.02 (memory) >>>>> # $par >>>>> # [1] 10 >>>>> # >>>>> # $value >>>>> # [1] 100 >>>>> # (...) >>>>> >>>>> I would expect that optim() does more than 3 function evaluations and >>>>> that the optimization converges to 0. >>>>> >>>>> Same problem with optim(par=10, fn=fn, method="BFGS"). >>>>> >>>>> Any ideas? >>>> I don't have an answer but may be an insight. For some mysterious >>>> reason xx is getting changed when in should not. Consider: >>>>> fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in >>>> x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx >>>> <<- x; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}}) >>>>> optim(par=10, fn=fn, method="L-BFGS-B") >>>> 1 in x,xx,ret= 10 NA NA >>>> out x,xx,ret= 10 10 100 >>>> 2 in x,xx,ret= 10.001 10 100 >>>> out x,xx,ret= 10.001 10.001 100.02 >>>> 3 in x,xx,ret= 9.999 9.999 100.02 >>>> $par >>>> [1] 10 >>>> >>>> $value >>>> [1] 100 >>>> >>>> $counts >>>> function gradient >>>> 1 1 >>>> >>>> $convergence >>>> [1] 0 >>>> >>>> $message >>>> [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL" >>>> >>>> At the third call, xx has value 9.999 while it should have kept the >>>> value 10.001. >>>> >>> A little follow-up: if you untie the link between xx and x by replacing >>> the expression "xx <<- x" by "xx <<- x+0" it works as expected: >>> > fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in >>> x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<- >>> x+0; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}}) >>> > optim(par=10, fn=fn, method="L-BFGS-B") >>> 1 in x,xx,ret= 10 NA NA >>> out x,xx,ret= 10 10 100 >>> 2 in x,xx,ret= 10.001 10 100 >>> out x,xx,ret= 10.001 10.001 100.02 >>> 3 in x,xx,ret= 9.999 10.001 100.02 >>> out x,xx,ret= 9.999 9.999 99.98 >>> 4 in x,xx,ret= 9 9.999 99.98 >>> out x,xx,ret= 9 9 81 >>> 5 in x,xx,ret= 9.001 9 81 >>> out x,xx,ret= 9.001 9.001 81.018 >>> 6 in x,xx,ret= 8.999 9.001 81.018 >>> out x,xx,ret= 8.999 8.999 80.982 >>> 7 in x,xx,ret= 1.776357e-11 8.999 80.982 >>> out x,xx,ret= 1.776357e-11 1.776357e-11 3.155444e-22 >>> 8 in x,xx,ret= 0.001 1.776357e-11 3.155444e-22 >>> out x,xx,ret= 0.001 0.001 1e-06 >>> 9 in x,xx,ret= -0.001 0.001 1e-06 >>> out x,xx,ret= -0.001 -0.001 1e-06 >>> 10 in x,xx,ret= -1.334475e-23 -0.001 1e-06 >>> out x,xx,ret= -1.334475e-23 -1.334475e-23 1.780823e-46 >>> 11 in x,xx,ret= 0.001 -1.334475e-23 1.780823e-46 >>> out x,xx,ret= 0.001 0.001 1e-06 >>> 12 in x,xx,ret= -0.001 0.001 1e-06 >>> out x,xx,ret= -0.001 -0.001 1e-06 >>> $par >>> [1] -1.334475e-23 >>> $value >>> [1] 1.780823e-46 >>> $counts >>> function gradient >>> 4 4 >>> $convergence >>> [1] 0 >>> $message >>> [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL" >>> Serguei. >>> ______________________________________________ >>> R-devel at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-devel >>> >> >> ______________________________________________ >> R-devel at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-devel >______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel [[alternative HTML version deleted]]
Maybe Matching Threads
- R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments
- R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments
- R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments
- R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments
- R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments