thr3ads.net - R devel - [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments [May 2019]

peter dalgaard

2019-May-03 11:13 UTC

[Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments

Yes, I think you are right. I was at first confused by the fact that after the
optim() call,

> environment(fn)$xx
[1] 10
> environment(fn)$ret
[1] 100.02

so not 9.999, but this could come from x being assigned the final value without
calling fn.

-pd

> On 3 May 2019, at 11:58 , Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
> 
> Your results below make it look like a bug in optim():  it is not duplicating a value when it should, so changes to x affect xx as well.
> 
> Duncan Murdoch
> 
> On 03/05/2019 4:41 a.m., Serguei Sokol wrote:
>> On 03/05/2019 10:31, Serguei Sokol wrote:
>>> On 02/05/2019 21:35, Florian Gerber wrote:
>>>> Dear all,
>>>> 
>>>> when using optim() for a function that uses the parent environment, I
>>>> see the following unexpected behavior:
>>>> 
>>>> makeFn <- function(){
>>>>      xx <- ret <- NA
>>>>      fn <- function(x){
>>>>         if(!is.na(xx) && x==xx){
>>>>             cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="")
>>>>             return(ret)
>>>>         }
>>>>         xx <<- x; ret <<- sum(x^2)
>>>>         cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="")
>>>>         ret
>>>>      }
>>>>      fn
>>>> }
>>>> fn <- makeFn()
>>>> optim(par=10, fn=fn, method="L-BFGS-B")
>>>> # x=10, ret=100 (calculate)
>>>> # x=10.001, ret=100.02 (calculate)
>>>> # x=9.999, ret=100.02 (memory)
>>>> # $par
>>>> # [1] 10
>>>> #
>>>> # $value
>>>> # [1] 100
>>>> # (...)
>>>> 
>>>> I would expect that optim() does more than 3 function evaluations and
>>>> that the optimization converges to 0.
>>>> 
>>>> Same problem with optim(par=10, fn=fn, method="BFGS").
>>>> 
>>>> Any ideas?
>>> I don't have an answer but maybe an insight. For some mysterious
>>> reason xx is getting changed when it should not. Consider:
>>>> fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<- x; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}})
>>>> optim(par=10, fn=fn, method="L-BFGS-B")
>>> 1 in x,xx,ret= 10 NA NA
>>> out x,xx,ret= 10 10 100
>>> 2 in x,xx,ret= 10.001 10 100
>>> out x,xx,ret= 10.001 10.001 100.02
>>> 3 in x,xx,ret= 9.999 9.999 100.02
>>> $par
>>> [1] 10
>>> 
>>> $value
>>> [1] 100
>>> 
>>> $counts
>>> function gradient
>>>        1        1
>>> 
>>> $convergence
>>> [1] 0
>>> 
>>> $message
>>> [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"
>>> 
>>> At the third call, xx has value 9.999 while it should have kept the
>>> value 10.001.
>>> 
>> A little follow-up: if you untie the link between xx and x by replacing
>> the expression "xx <<- x" by "xx <<- x+0" it works as expected:
>>  > fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<- x+0; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}})
>>  > optim(par=10, fn=fn, method="L-BFGS-B")
>> 1 in x,xx,ret= 10 NA NA
>> out x,xx,ret= 10 10 100
>> 2 in x,xx,ret= 10.001 10 100
>> out x,xx,ret= 10.001 10.001 100.02
>> 3 in x,xx,ret= 9.999 10.001 100.02
>> out x,xx,ret= 9.999 9.999 99.98
>> 4 in x,xx,ret= 9 9.999 99.98
>> out x,xx,ret= 9 9 81
>> 5 in x,xx,ret= 9.001 9 81
>> out x,xx,ret= 9.001 9.001 81.018
>> 6 in x,xx,ret= 8.999 9.001 81.018
>> out x,xx,ret= 8.999 8.999 80.982
>> 7 in x,xx,ret= 1.776357e-11 8.999 80.982
>> out x,xx,ret= 1.776357e-11 1.776357e-11 3.155444e-22
>> 8 in x,xx,ret= 0.001 1.776357e-11 3.155444e-22
>> out x,xx,ret= 0.001 0.001 1e-06
>> 9 in x,xx,ret= -0.001 0.001 1e-06
>> out x,xx,ret= -0.001 -0.001 1e-06
>> 10 in x,xx,ret= -1.334475e-23 -0.001 1e-06
>> out x,xx,ret= -1.334475e-23 -1.334475e-23 1.780823e-46
>> 11 in x,xx,ret= 0.001 -1.334475e-23 1.780823e-46
>> out x,xx,ret= 0.001 0.001 1e-06
>> 12 in x,xx,ret= -0.001 0.001 1e-06
>> out x,xx,ret= -0.001 -0.001 1e-06
>> $par
>> [1] -1.334475e-23
>> $value
>> [1] 1.780823e-46
>> $counts
>> function gradient
>>         4        4
>> $convergence
>> [1] 0
>> $message
>> [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"
>> Serguei.
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>> 
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
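
A minimal sketch of the workaround quoted above, not code from the thread: the objective caches its last argument but stores a copy (x + 0) rather than x itself, so that later in-place changes made by optim() cannot alter the cache. The names make_cached_fn and n_calc are hypothetical, introduced only for illustration.

```r
# Hypothetical helper: a memoising objective that stores a copy of x
# instead of a reference to the vector handed over by optim().
make_cached_fn <- function() {
  xx <- NULL      # last argument seen
  ret <- NULL     # value computed for xx
  n_calc <- 0L    # number of real (non-cached) evaluations
  fn <- function(x) {
    if (!is.null(xx) && identical(x, xx)) {
      return(ret)             # cache hit
    }
    xx <<- x + 0              # arithmetic yields a new vector, as reported in the thread
    ret <<- sum(x^2)
    n_calc <<- n_calc + 1L
    ret
  }
  fn
}

fn <- make_cached_fn()
res <- optim(par = 10, fn = fn, method = "L-BFGS-B")
res$par                    # expected to be very close to 0
environment(fn)$n_calc     # more than the 3 evaluations seen in the report
```

Using identical() rather than == keeps the comparison safe for parameter vectors of length greater than one.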

Duncan Murdoch

2019-May-03 12:18 UTC

[Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments

It looks as though this happens when calculating numerical gradients:  x 
is reduced by eps, and fn is called; then x is increased by eps, and fn 
is called again.  No check is made that x has other references after the 
first call to fn.

I'll put together a patch if nobody else gets there first...

Duncan Murdoch
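
The mechanism described above can be illustrated at the R level, where copy-on-modify protects a stored reference automatically. This is only a sketch under stated assumptions, not optim()'s C code: num_grad is a hypothetical central-difference helper, and its step of 1e-3 is chosen to match the default of optim()'s ndeps control.

```r
# Hypothetical central-difference gradient written in plain R.
num_grad <- function(fn, x, eps = 1e-3) {
  g <- numeric(length(x))
  for (i in seq_along(x)) {
    xp <- x; xp[i] <- x[i] + eps   # step up in coordinate i
    xm <- x; xm[i] <- x[i] - eps   # step down in coordinate i
    g[i] <- (fn(xp) - fn(xm)) / (2 * eps)
  }
  g
}

# An objective that keeps a reference to its last argument, as in the thread.
last_x <- NULL
f <- function(x) { last_x <<- x; sum(x^2) }

x <- 10
f(x)                 # last_x and x now share the same value
x[1] <- x[1] - 1e-3  # R copies x before modifying it
kept <- last_x       # still 10: the stored value was not touched

g <- num_grad(f, c(1, 2))
g                    # approximately 2 4
```

Code written in C that modifies a vector in place has to make this copy itself, which appears to be the check described as missing above.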


Ravi Varadhan

2019-May-06 14:06 UTC

[Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments

Optim's Nelder-Mead works correctly for this example.

> optim(par=10, fn=fn, method="Nelder-Mead")
x=10, ret=100.02 (memory)
x=11, ret=121 (calculate)
x=9, ret=81 (calculate)
x=8, ret=64 (calculate)
x=6, ret=36 (calculate)
x=4, ret=16 (calculate)
x=0, ret=0 (calculate)
x=-4, ret=16 (calculate)
x=-4, ret=16 (memory)
x=2, ret=4 (calculate)
x=-2, ret=4 (calculate)
x=1, ret=1 (calculate)
x=-1, ret=1 (calculate)
x=0.5, ret=0.25 (calculate)
x=-0.5, ret=0.25 (calculate)
x=0.25, ret=0.0625 (calculate)
x=-0.25, ret=0.0625 (calculate)
x=0.125, ret=0.015625 (calculate)
x=-0.125, ret=0.015625 (calculate)
x=0.0625, ret=0.00390625 (calculate)
x=-0.0625, ret=0.00390625 (calculate)
x=0.03125, ret=0.0009765625 (calculate)
x=-0.03125, ret=0.0009765625 (calculate)
x=0.015625, ret=0.0002441406 (calculate)
x=-0.015625, ret=0.0002441406 (calculate)
x=0.0078125, ret=6.103516e-05 (calculate)
x=-0.0078125, ret=6.103516e-05 (calculate)
x=0.00390625, ret=1.525879e-05 (calculate)
x=-0.00390625, ret=1.525879e-05 (calculate)
x=0.001953125, ret=3.814697e-06 (calculate)
x=-0.001953125, ret=3.814697e-06 (calculate)
x=0.0009765625, ret=9.536743e-07 (calculate)
$par
[1] 0

$value
[1] 0

$counts
function gradient
      32       NA

$convergence
[1] 0

$message
NULL
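
The first line of the trace above reports x=10 as a cache hit with ret=100.02, which suggests the closure still held the state left behind by the earlier L-BFGS-B run. A hedged sketch of one way to avoid carrying such state from one run to the next is to build a fresh closure for each call to optim(). It reuses makeFn from the original post with the x+0 change and without the cat() output; the names fn_nm and res_nm are hypothetical.

```r
# makeFn as in the original post, storing a copy of x (x + 0) in the cache.
makeFn <- function() {
  xx <- ret <- NA
  function(x) {
    if (!is.na(xx) && x == xx) return(ret)  # cache hit
    xx <<- x + 0
    ret <<- sum(x^2)
    ret
  }
}

# A fresh closure per optimisation run, so no stale xx/ret is reused.
fn_nm <- makeFn()

# optim() warns that Nelder-Mead is unreliable in one dimension;
# the warning is suppressed here only to keep the example quiet.
res_nm <- suppressWarnings(
  optim(par = 10, fn = fn_nm, method = "Nelder-Mead")
)
res_nm$par
res_nm$value
```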




