Dear all,
when using optim() for a function that uses the parent environment, I
see the following unexpected behavior:
makeFn <- function(){
  xx <- ret <- NA
  fn <- function(x){
    if(!is.na(xx) && x==xx){
      cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="")
      return(ret)
    }
    xx <<- x; ret <<- sum(x^2)
    cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="")
    ret
  }
  fn
}
fn <- makeFn()
optim(par=10, fn=fn, method="L-BFGS-B")
# x=10, ret=100 (calculate)
# x=10.001, ret=100.02 (calculate)
# x=9.999, ret=100.02 (memory)
# $par
# [1] 10
#
# $value
# [1] 100
# (...)
I would expect that optim() does more than 3 function evaluations and
that the optimization converges to 0.
Same problem with optim(par=10, fn=fn, method="BFGS").
Any ideas?
See also my related post:
https://stackoverflow.com/questions/53826521/r-optim-unexpected-behavior-when-working-with-parent-environments
platform       x86_64-pc-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status
major          3
minor          6.0
year           2019
month          04
day            26
svn rev        76424
language       R
version.string R version 3.6.0 (2019-04-26)
nickname       Planting of a Tree
Best,
Florian
R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments
10 messages · Florian Gerber, Duncan Murdoch, Peter Dalgaard +4 more
On 02/05/2019 21:35, Florian Gerber wrote:
[...]
I don't have an answer, but maybe an insight. For some mysterious reason,
xx is getting changed when it should not. Consider:
> fn=local({n=0; xx=ret=NA; function(x) {
+   n <<- n+1; cat(n, "in x,xx,ret=", x, xx, ret, "\n")
+   if (!is.na(xx) && x==xx) ret
+   else {xx <<- x; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}
+ }})
> optim(par=10, fn=fn, method="L-BFGS-B")
1 in x,xx,ret= 10 NA NA
out x,xx,ret= 10 10 100
2 in x,xx,ret= 10.001 10 100
out x,xx,ret= 10.001 10.001 100.02
3 in x,xx,ret= 9.999 9.999 100.02
$par
[1] 10
$value
[1] 100
$counts
function gradient
       1        1
$convergence
[1] 0
$message
[1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"
At the third call, xx has value 9.999 while it should have kept the
value 10.001.
Serguei.
On 03/05/2019 10:31, Serguei Sokol wrote:
[...]
A little follow-up: if you untie the link between xx and x by replacing
the expression "xx <<- x" with "xx <<- x+0", it works as expected:
> fn=local({n=0; xx=ret=NA; function(x) {
+   n <<- n+1; cat(n, "in x,xx,ret=", x, xx, ret, "\n")
+   if (!is.na(xx) && x==xx) ret
+   else {xx <<- x+0; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}
+ }})
> optim(par=10, fn=fn, method="L-BFGS-B")
1 in x,xx,ret= 10 NA NA
out x,xx,ret= 10 10 100
2 in x,xx,ret= 10.001 10 100
out x,xx,ret= 10.001 10.001 100.02
3 in x,xx,ret= 9.999 10.001 100.02
out x,xx,ret= 9.999 9.999 99.98
4 in x,xx,ret= 9 9.999 99.98
out x,xx,ret= 9 9 81
5 in x,xx,ret= 9.001 9 81
out x,xx,ret= 9.001 9.001 81.018
6 in x,xx,ret= 8.999 9.001 81.018
out x,xx,ret= 8.999 8.999 80.982
7 in x,xx,ret= 1.776357e-11 8.999 80.982
out x,xx,ret= 1.776357e-11 1.776357e-11 3.155444e-22
8 in x,xx,ret= 0.001 1.776357e-11 3.155444e-22
out x,xx,ret= 0.001 0.001 1e-06
9 in x,xx,ret= -0.001 0.001 1e-06
out x,xx,ret= -0.001 -0.001 1e-06
10 in x,xx,ret= -1.334475e-23 -0.001 1e-06
out x,xx,ret= -1.334475e-23 -1.334475e-23 1.780823e-46
11 in x,xx,ret= 0.001 -1.334475e-23 1.780823e-46
out x,xx,ret= 0.001 0.001 1e-06
12 in x,xx,ret= -0.001 0.001 1e-06
out x,xx,ret= -0.001 -0.001 1e-06
$par
[1] -1.334475e-23
$value
[1] 1.780823e-46
$counts
function gradient
       4        4
$convergence
[1] 0
$message
[1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"
Serguei.
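The same workaround can be applied to the makeFn() closure from the original post. What follows is a minimal sketch rather than an official fix. It assumes, as the trace above suggests, that an arithmetic expression such as x + 0 yields a fresh vector, so that the cached xx no longer shares storage with the x that optim() reuses. The cat() calls are dropped for brevity.

```r
# Memoizing closure as in the original post, but caching a copy of x.
makeFn <- function() {
  xx <- ret <- NA
  function(x) {
    if (!is.na(xx) && x == xx) return(ret)  # reuse the cached value
    xx <<- x + 0      # x + 0 creates a new vector instead of aliasing x
    ret <<- sum(x^2)
    ret
  }
}

fn <- makeFn()
res <- optim(par = 10, fn = fn, method = "L-BFGS-B")
res$par  # expected to be very close to 0
```

On R versions where optim() has been patched, the plain assignment xx <<- x should behave the same way, so the extra + 0 is only a defensive measure.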
Your results below make it look like a bug in optim(): it is not duplicating a value when it should, so changes to x affect xx as well.
Duncan Murdoch
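For context, this is the R-level behaviour that the closure relies on, shown as a small illustration with made-up values. At the R level, assigning a vector to a second name and then modifying the first leaves the second unchanged. C code that writes into a vector in place without duplicating it first, which is what the thread suspects optim() of doing, bypasses this guarantee.

```r
x  <- c(10, 20)
xx <- x          # xx and x may share the same memory at this point
x[1] <- 10.001   # R copies x before modifying it, so xx is left untouched

xx  # still the original values
```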
On 03/05/2019 4:41 a.m., Serguei Sokol wrote:
[...]
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Yes, I think you are right. I was at first confused by the fact that after the optim() call,
> environment(fn)$xx
[1] 10
> environment(fn)$ret
[1] 100.02
so not 9.999, but this could come from x being assigned the final value without calling fn.
-pd
On 3 May 2019, at 11:58, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
[...]
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
It looks as though this happens when calculating numerical gradients: x is reduced by eps, and fn is called; then x is increased by eps, and fn is called again. No check is made that x has other references after the first call to fn.
I'll put together a patch if nobody else gets there first...
Duncan Murdoch
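A worked calculation may help show why the optimizer stops immediately. The sketch below assumes that optim() uses a central difference with its documented default step ndeps = 1e-3, which matches the evaluations at 10.001 and 9.999 in the traces. The variable names are illustrative only. Because the cached xx is overwritten in place, the second evaluation returns the stale value from the first, and the estimated gradient is exactly zero.

```r
eps <- 1e-3                    # default ndeps of optim()

f_plus        <- (10 + eps)^2  # value computed at x = 10.001
f_minus_stale <- f_plus        # cached value wrongly returned at x = 9.999
f_minus_true  <- (10 - eps)^2  # value that should have been computed

grad_stale <- (f_plus - f_minus_stale) / (2 * eps)  # exactly 0
grad_true  <- (f_plus - f_minus_true)  / (2 * eps)  # approximately 20
```

A zero gradient satisfies the projected-gradient test, which is consistent with the "NORM OF PROJECTED GRADIENT <= PGTOL" message after a single iteration.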
On 03/05/2019 7:13 a.m., peter dalgaard wrote:
[...]
3 days later
Optim's Nelder-Mead works correctly for this example.
optim(par=10, fn=fn, method="Nelder-Mead")
x=10, ret=100.02 (memory)
x=11, ret=121 (calculate)
x=9, ret=81 (calculate)
x=8, ret=64 (calculate)
x=6, ret=36 (calculate)
x=4, ret=16 (calculate)
x=0, ret=0 (calculate)
x=-4, ret=16 (calculate)
x=-4, ret=16 (memory)
x=2, ret=4 (calculate)
x=-2, ret=4 (calculate)
x=1, ret=1 (calculate)
x=-1, ret=1 (calculate)
x=0.5, ret=0.25 (calculate)
x=-0.5, ret=0.25 (calculate)
x=0.25, ret=0.0625 (calculate)
x=-0.25, ret=0.0625 (calculate)
x=0.125, ret=0.015625 (calculate)
x=-0.125, ret=0.015625 (calculate)
x=0.0625, ret=0.00390625 (calculate)
x=-0.0625, ret=0.00390625 (calculate)
x=0.03125, ret=0.0009765625 (calculate)
x=-0.03125, ret=0.0009765625 (calculate)
x=0.015625, ret=0.0002441406 (calculate)
x=-0.015625, ret=0.0002441406 (calculate)
x=0.0078125, ret=6.103516e-05 (calculate)
x=-0.0078125, ret=6.103516e-05 (calculate)
x=0.00390625, ret=1.525879e-05 (calculate)
x=-0.00390625, ret=1.525879e-05 (calculate)
x=0.001953125, ret=3.814697e-06 (calculate)
x=-0.001953125, ret=3.814697e-06 (calculate)
x=0.0009765625, ret=9.536743e-07 (calculate)
$par
[1] 0
$value
[1] 0
$counts
function gradient
32 NA
$convergence
[1] 0
$message
NULL
From: R-devel <r-devel-bounces at r-project.org> on behalf of Duncan Murdoch <murdoch.duncan at gmail.com>
Sent: Friday, May 3, 2019 8:18:44 AM
To: peter dalgaard
Cc: Florian Gerber; r-devel at r-project.org
Subject: Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments
[...]
That's consistent/not surprising if the problem lies in the numerical gradient calculation step ...
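If the finite-difference step is indeed the culprit, supplying an analytic gradient through the gr argument of optim() should sidestep it, because no numerical gradient is then computed. This is a sketch under that assumption and has not been checked against the affected R version. The gradient function gr below is introduced here for illustration and is not part of the original post.

```r
# Memoizing closure from the original post (cat() calls omitted).
makeFn <- function() {
  xx <- ret <- NA
  function(x) {
    if (!is.na(xx) && x == xx) return(ret)
    xx <<- x
    ret <<- sum(x^2)
    ret
  }
}

fn <- makeFn()
gr <- function(x) 2 * x  # analytic gradient of sum(x^2)

res <- optim(par = 10, fn = fn, gr = gr, method = "L-BFGS-B")
res$par  # expected to be very close to 0
```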
On 2019-05-06 10:06 a.m., Ravi Varadhan wrote:
[...]
It seems that it's an old bug that was found in some other packages, but at that time not in optim:
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=15958
and that Duncan Murdoch already posted a patch last Friday :)
Thomas
On 06.05.2019 at 16:40, Ben Bolker wrote:
That's consistent/not surprising if the problem lies in the numerical gradient calculation step ... On 2019-05-06 10:06 a.m., Ravi Varadhan wrote:
Optim's Nelder-Mead works correctly for this example.
optim(par=10, fn=fn, method="Nelder-Mead")
x=10, ret=100.02 (memory)
x=11, ret=121 (calculate)
x=9, ret=81 (calculate)
x=8, ret=64 (calculate)
x=6, ret=36 (calculate)
x=4, ret=16 (calculate)
x=0, ret=0 (calculate)
x=-4, ret=16 (calculate)
x=-4, ret=16 (memory)
x=2, ret=4 (calculate)
x=-2, ret=4 (calculate)
x=1, ret=1 (calculate)
x=-1, ret=1 (calculate)
x=0.5, ret=0.25 (calculate)
x=-0.5, ret=0.25 (calculate)
x=0.25, ret=0.0625 (calculate)
x=-0.25, ret=0.0625 (calculate)
x=0.125, ret=0.015625 (calculate)
x=-0.125, ret=0.015625 (calculate)
x=0.0625, ret=0.00390625 (calculate)
x=-0.0625, ret=0.00390625 (calculate)
x=0.03125, ret=0.0009765625 (calculate)
x=-0.03125, ret=0.0009765625 (calculate)
x=0.015625, ret=0.0002441406 (calculate)
x=-0.015625, ret=0.0002441406 (calculate)
x=0.0078125, ret=6.103516e-05 (calculate)
x=-0.0078125, ret=6.103516e-05 (calculate)
x=0.00390625, ret=1.525879e-05 (calculate)
x=-0.00390625, ret=1.525879e-05 (calculate)
x=0.001953125, ret=3.814697e-06 (calculate)
x=-0.001953125, ret=3.814697e-06 (calculate)
x=0.0009765625, ret=9.536743e-07 (calculate)
$par
[1] 0
$value
[1] 0
$counts
function gradient
32 NA
$convergence
[1] 0
$message
NULL
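Nelder-Mead never evaluates a gradient, so it does not go through the finite-difference step at all. For the gradient-based methods, optim() only uses finite differences when `gr` is NULL, so one way to skip that step is to pass an analytic gradient. The sketch below is only an illustration: `gr_sq` is a hypothetical helper that does not appear in the thread, the cat() calls are dropped for brevity, and nobody in the thread tested whether this avoids the problem on an affected R version.

```r
# Closure from the original post, without the cat() output.
makeFn <- function() {
  xx <- ret <- NA
  function(x) {
    if (!is.na(xx) && x == xx) return(ret)  # reuse the remembered value
    xx <<- x
    ret <<- sum(x^2)
    ret
  }
}

fn <- makeFn()

# Hypothetical helper (assumption, not from the thread): analytic gradient of sum(x^2).
gr_sq <- function(x) 2 * x

# With gr supplied, optim() calls gr_sq instead of differencing fn numerically.
res <- optim(par = 10, fn = fn, gr = gr_sq, method = "L-BFGS-B")
res$par
```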
________________________________
From: R-devel <r-devel-bounces at r-project.org> on behalf of Duncan Murdoch <murdoch.duncan at gmail.com>
Sent: Friday, May 3, 2019 8:18:44 AM
To: peter dalgaard
Cc: Florian Gerber; r-devel at r-project.org
Subject: Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments
It looks as though this happens when calculating numerical gradients: x
is reduced by eps, and fn is called; then x is increased by eps, and fn
is called again. No check is made that x has other references after the
first call to fn.
I'll put together a patch if nobody else gets there first...
Duncan Murdoch
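The effect on the gradient can be sketched in plain R. This is an illustration of a central-difference estimate, not optim()'s actual C code: `central_diff` is a hypothetical helper, and the step 1e-3 is the documented default of optim()'s `ndeps` control, which matches the evaluations at 10.001 and 9.999 in the trace. If the second evaluation wrongly returns the value remembered from the first, the difference is zero and the optimizer sees a zero gradient, which is consistent with the immediate convergence message.

```r
sq <- function(x) sum(x^2)

# Hypothetical helper: central-difference estimate from two function values.
central_diff <- function(f_plus, f_minus, eps) (f_plus - f_minus) / (2 * eps)

eps <- 1e-3
x0 <- 10

# Correct case: both perturbed points are really evaluated.
g_ok <- central_diff(sq(x0 + eps), sq(x0 - eps), eps)

# Aliased case: the call at x0 - eps returns the value cached for x0 + eps.
stale <- sq(x0 + eps)
g_bug <- central_diff(stale, stale, eps)

c(g_ok = g_ok, g_bug = g_bug)
```

Here `g_ok` is close to the true derivative 2 * x0 = 20, while `g_bug` is exactly 0.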
On 03/05/2019 7:13 a.m., peter dalgaard wrote:
Yes, I think you are right. I was at first confused by the fact that after the optim() call,
environment(fn)$xx
[1] 10
environment(fn)$ret
[1] 100.02
so not 9.999, but this could come from x being assigned the final value without calling fn.
-pd
On 3 May 2019, at 11:58 , Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
Your results below make it look like a bug in optim(): it is not duplicating a value when it should, so changes to x affect xx as well.
Duncan Murdoch
On 03/05/2019 4:41 a.m., Serguei Sokol wrote:
On 03/05/2019 10:31, Serguei Sokol wrote:
On 02/05/2019 21:35, Florian Gerber wrote:
Dear all,
when using optim() for a function that uses the parent environment, I
see the following unexpected behavior:
makeFn <- function(){
    xx <- ret <- NA
    fn <- function(x){
        if(!is.na(xx) && x==xx){
            cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="")
            return(ret)
        }
        xx <<- x; ret <<- sum(x^2)
        cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="")
        ret
    }
    fn
}
fn <- makeFn()
optim(par=10, fn=fn, method="L-BFGS-B")
# x=10, ret=100 (calculate)
# x=10.001, ret=100.02 (calculate)
# x=9.999, ret=100.02 (memory)
# $par
# [1] 10
#
# $value
# [1] 100
# (...)
I would expect that optim() does more than 3 function evaluations and
that the optimization converges to 0.
Same problem with optim(par=10, fn=fn, method="BFGS").
Any ideas?
I don't have an answer but maybe an insight. For some mysterious
reason xx is getting changed when it should not. Consider:
fn=local({n=0; xx=ret=NA; function(x) {
  n <<- n+1; cat(n, "in x,xx,ret=", x, xx, ret, "\n")
  if (!is.na(xx) && x==xx) ret else {
    xx <<- x; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}
}})
optim(par=10, fn=fn, method="L-BFGS-B")
1 in x,xx,ret= 10 NA NA
out x,xx,ret= 10 10 100
2 in x,xx,ret= 10.001 10 100
out x,xx,ret= 10.001 10.001 100.02
3 in x,xx,ret= 9.999 9.999 100.02
$par
[1] 10
$value
[1] 100
$counts
function gradient
1 1
$convergence
[1] 0
$message
[1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"
At the third call, xx has value 9.999 while it should have kept the
value 10.001.
A little follow-up: if you untie the link between xx and x by replacing
the expression "xx <<- x" by "xx <<- x+0" it works as expected:
fn=local({n=0; xx=ret=NA; function(x) {
  n <<- n+1; cat(n, "in x,xx,ret=", x, xx, ret, "\n")
  if (!is.na(xx) && x==xx) ret else {
    xx <<- x+0; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}
}})
optim(par=10, fn=fn, method="L-BFGS-B")
1 in x,xx,ret= 10 NA NA
out x,xx,ret= 10 10 100
2 in x,xx,ret= 10.001 10 100
out x,xx,ret= 10.001 10.001 100.02
3 in x,xx,ret= 9.999 10.001 100.02
out x,xx,ret= 9.999 9.999 99.98
4 in x,xx,ret= 9 9.999 99.98
out x,xx,ret= 9 9 81
5 in x,xx,ret= 9.001 9 81
out x,xx,ret= 9.001 9.001 81.018
6 in x,xx,ret= 8.999 9.001 81.018
out x,xx,ret= 8.999 8.999 80.982
7 in x,xx,ret= 1.776357e-11 8.999 80.982
out x,xx,ret= 1.776357e-11 1.776357e-11 3.155444e-22
8 in x,xx,ret= 0.001 1.776357e-11 3.155444e-22
out x,xx,ret= 0.001 0.001 1e-06
9 in x,xx,ret= -0.001 0.001 1e-06
out x,xx,ret= -0.001 -0.001 1e-06
10 in x,xx,ret= -1.334475e-23 -0.001 1e-06
out x,xx,ret= -1.334475e-23 -1.334475e-23 1.780823e-46
11 in x,xx,ret= 0.001 -1.334475e-23 1.780823e-46
out x,xx,ret= 0.001 0.001 1e-06
12 in x,xx,ret= -0.001 0.001 1e-06
out x,xx,ret= -0.001 -0.001 1e-06
$par
[1] -1.334475e-23
$value
[1] 1.780823e-46
$counts
function gradient
4 4
$convergence
[1] 0
$message
[1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"
Serguei.
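For readability, the same workaround can be written against the original makeFn(). This is a sketch only: the `x + 0` trick is the one from the follow-up above, while the layout and the removal of the cat() output are choices made here for brevity.

```r
makeFn <- function() {
  xx <- ret <- NA
  function(x) {
    if (!is.na(xx) && x == xx) return(ret)  # reuse the remembered value
    xx <<- x + 0      # x + 0 unties xx from x, as in the follow-up above
    ret <<- sum(x^2)
    ret
  }
}

fn <- makeFn()
res <- optim(par = 10, fn = fn, method = "L-BFGS-B")
res$par
```

With the link untied, the remembered `xx` keeps the value from the previous call, so the comparison `x == xx` only succeeds when the same point is genuinely requested twice.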
On 06/05/2019 18:21, Thomas Petzoldt wrote:
It seems that it's an old bug that was found in some other packages, but at that time not optim: https://bugs.r-project.org/bugzilla/show_bug.cgi?id=15958
I think that the bug description is a little misleading. The bug is not that "<<-" produces a reference instead of a copy (that's normal), but that some C or Fortran code modifies a variable "in place" without checking whether there are other references to it.
Serguei (just splitting hairs)
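The distinction can be seen from the R side. In R code, assignment shares the underlying vector, but a later modification through one name is made on a copy, so the other name keeps its value. Compiled code that writes into the vector directly bypasses that safeguard. A minimal sketch of the R-level behaviour, with illustrative variable names:

```r
x <- c(10.001)
xx <- x          # no copy yet: both names refer to the same vector
x[1] <- 9.999    # R duplicates the vector before modifying it, so xx is untouched
c(x = x, xx = xx)
```

After these lines `xx` is still 10.001, which is what the closure in the original post relies on.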