Skip to content

R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments

10 messages · Florian Gerber, Duncan Murdoch, Peter Dalgaard +4 more

#
Dear all,

when using optim() for a function that uses the parent environment, I
see the following unexpected behavior:

makeFn <- function(){
??? xx <- ret <- NA
??? fn <- function(x){
?????? if(!is.na(xx) && x==xx){
?????????? cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="")
?????????? return(ret)
?????? }
?????? xx <<- x; ret <<- sum(x^2)
?????? cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="")
?????? ret
??? }
??? fn
}
fn <- makeFn()
optim(par=10, fn=fn, method="L-BFGS-B")
# x=10, ret=100 (calculate)
# x=10.001, ret=100.02 (calculate)
# x=9.999, ret=100.02 (memory)
# $par
# [1] 10
#
# $value
# [1] 100
# (...)

I would expect that optim() does more than 3 function evaluations and
that the optimization converges to 0.

Same problem with optim(par=10, fn=fn, method="BFGS").

Any ideas?

See also my related post:
https://stackoverflow.com/questions/53826521/r-optim-unexpected-behavior-when-working-with-parent-environments
???
platform?????? x86_64-pc-linux-gnu????????
arch?????????? x86_64?????????????????????
os???????????? linux-gnu??????????????????
system???????? x86_64, linux-gnu??????????
status????????????????????????????????????
major????????? 3??????????????????????????
minor????????? 6.0????????????????????????
year?????????? 2019???????????????????????
month????????? 04?????????????????????????
day??????????? 26?????????????????????????
svn rev??????? 76424??????????????????????
language?????? R??????????????????????????
version.string R version 3.6.0 (2019-04-26)
nickname?????? Planting of a Tree?????????

Best,
Florian
#
On 02/05/2019 21:35, Florian Gerber wrote:
I don't have an answer but may be an insight. For some mysterious reason 
xx is getting changed when in should not. Consider:
 > fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in 
x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<- 
x; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}})
 > optim(par=10, fn=fn, method="L-BFGS-B")
1 in x,xx,ret= 10 NA NA
out x,xx,ret= 10 10 100
2 in x,xx,ret= 10.001 10 100
out x,xx,ret= 10.001 10.001 100.02
3 in x,xx,ret= 9.999 9.999 100.02
$par
[1] 10

$value
[1] 100

$counts
function gradient
 ?????? 1??????? 1

$convergence
[1] 0

$message
[1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"

At the third call, xx has value 9.999 while it should have kept the 
value 10.001.

Serguei.
#
On 03/05/2019 10:31, Serguei Sokol wrote:
A little follow-up: if you untie the link between xx and x by replacing 
the expression "xx <<- x" by "xx <<- x+0" it works as expected:
 > fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in 
x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<- 
x+0; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}})
 > optim(par=10, fn=fn, method="L-BFGS-B")
1 in x,xx,ret= 10 NA NA
out x,xx,ret= 10 10 100
2 in x,xx,ret= 10.001 10 100
out x,xx,ret= 10.001 10.001 100.02
3 in x,xx,ret= 9.999 10.001 100.02
out x,xx,ret= 9.999 9.999 99.98
4 in x,xx,ret= 9 9.999 99.98
out x,xx,ret= 9 9 81
5 in x,xx,ret= 9.001 9 81
out x,xx,ret= 9.001 9.001 81.018
6 in x,xx,ret= 8.999 9.001 81.018
out x,xx,ret= 8.999 8.999 80.982
7 in x,xx,ret= 1.776357e-11 8.999 80.982
out x,xx,ret= 1.776357e-11 1.776357e-11 3.155444e-22
8 in x,xx,ret= 0.001 1.776357e-11 3.155444e-22
out x,xx,ret= 0.001 0.001 1e-06
9 in x,xx,ret= -0.001 0.001 1e-06
out x,xx,ret= -0.001 -0.001 1e-06
10 in x,xx,ret= -1.334475e-23 -0.001 1e-06
out x,xx,ret= -1.334475e-23 -1.334475e-23 1.780823e-46
11 in x,xx,ret= 0.001 -1.334475e-23 1.780823e-46
out x,xx,ret= 0.001 0.001 1e-06
12 in x,xx,ret= -0.001 0.001 1e-06
out x,xx,ret= -0.001 -0.001 1e-06
$par
[1] -1.334475e-23

$value
[1] 1.780823e-46

$counts
function gradient
 ?????? 4??????? 4

$convergence
[1] 0

$message
[1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"

Serguei.
#
Your results below make it look like a bug in optim():  it is not 
duplicating a value when it should, so changes to x affect xx as well.

Duncan Murdoch
On 03/05/2019 4:41 a.m., Serguei Sokol wrote:
#
Yes, I think you are right. I was at first confused by the fact that after the optim() call,
[1] 10
[1] 100.02

so not 9.999, but this could come from x being assigned the final value without calling fn.

-pd

  
    
#
It looks as though this happens when calculating numerical gradients:  x 
is reduced by eps, and fn is called; then x is increased by eps, and fn 
is called again.  No check is made that x has other references after the 
first call to fn.

I'll put together a patch if nobody else gets there first...

Duncan Murdoch
On 03/05/2019 7:13 a.m., peter dalgaard wrote:
3 days later
#
Optim's Nelder-Mead works correctly for this example.
x=10, ret=100.02 (memory)
x=11, ret=121 (calculate)
x=9, ret=81 (calculate)
x=8, ret=64 (calculate)
x=6, ret=36 (calculate)
x=4, ret=16 (calculate)
x=0, ret=0 (calculate)
x=-4, ret=16 (calculate)
x=-4, ret=16 (memory)
x=2, ret=4 (calculate)
x=-2, ret=4 (calculate)
x=1, ret=1 (calculate)
x=-1, ret=1 (calculate)
x=0.5, ret=0.25 (calculate)
x=-0.5, ret=0.25 (calculate)
x=0.25, ret=0.0625 (calculate)
x=-0.25, ret=0.0625 (calculate)
x=0.125, ret=0.015625 (calculate)
x=-0.125, ret=0.015625 (calculate)
x=0.0625, ret=0.00390625 (calculate)
x=-0.0625, ret=0.00390625 (calculate)
x=0.03125, ret=0.0009765625 (calculate)
x=-0.03125, ret=0.0009765625 (calculate)
x=0.015625, ret=0.0002441406 (calculate)
x=-0.015625, ret=0.0002441406 (calculate)
x=0.0078125, ret=6.103516e-05 (calculate)
x=-0.0078125, ret=6.103516e-05 (calculate)
x=0.00390625, ret=1.525879e-05 (calculate)
x=-0.00390625, ret=1.525879e-05 (calculate)
x=0.001953125, ret=3.814697e-06 (calculate)
x=-0.001953125, ret=3.814697e-06 (calculate)
x=0.0009765625, ret=9.536743e-07 (calculate)
$par
[1] 0

$value
[1] 0

$counts
function gradient
      32       NA

$convergence
[1] 0

$message
NULL
#
That's consistent/not surprising if the problem lies in the numerical
gradient calculation step ...
On 2019-05-06 10:06 a.m., Ravi Varadhan wrote:
#
It seems that it's an old bug that was found in some other packages, but 
at that time not optim:

https://bugs.r-project.org/bugzilla/show_bug.cgi?id=15958

and that Duncan Murdoch posted a patch already last Friday :)

Thomas

Am 06.05.2019 um 16:40 schrieb Ben Bolker:
#
On 06/05/2019 18:21, Thomas Petzoldt wrote:
I think that the bug description is a little bit misleading. The bug is 
not in fact that "<<-" produce a reference instead of a copy (that's 
normal) but in fact that some C or Fortran code modifies a variable "in 
place" without taking care if there are some references on it or not.

Serguei (just splitting hairs)