Parameter scaling problems with optim and Nelder-Mead method (bug?)

Dear all,

I'm having some problems getting optim with method="Nelder-Mead" to work
properly. It seems like there is no way of controlling the step size,
and the step size seems to depend on the *magnitude* of the
initial values, which makes no sense. Example:

    f=function(xy, mu1, mu2) {
      print(xy)
      dnorm(xy[1]-mu1)*dnorm(xy[2]-mu2)
    }
    f1=function(xy) -f(xy, 0, 0)
    optim(c(1,1), f1)

The first four values evaluated are

    1.0, 1.0
    1.1, 1.0
    1.0, 1.1
    0.9, 1.1

which is reasonable (step size of 0.1) for this function. But if I
translate both the function and the initial values by 5000

    f2=function(xy) -f(xy, 5000, 5000)
    optim(c(5001,5001), f2)

the first four values are

    5001.0, 5001.0
    5501.1, 5001.0
    5001.0, 5501.1
    4500.9, 5501.1

With

    f3=function(xy) -f(xy, 0, 5000)
    optim(c(1,5001), f3)

they are

       1.0, 5001.0
     501.1, 5001.0
       1.0, 5501.1
    -499.1, 5501.1

and with

    f4=function(xy) -f(xy, -3000, 50000)
    optim(c(-2999,50001), f4)

the first four values are

    -2999.0, 50001.0
     2001.1, 50001.0
    -2999.0, 55001.1
    -7999.1, 55001.1
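
For what it is worth, all four traces above look consistent with an initial step of 0.1 times the largest absolute initial value, rather than a fixed 0.1. The sketch below only checks that guess against the numbers. `guess_step` is a name I made up, and the rule itself is inferred from the printed output, not taken from the documentation.

```r
# Hypothetical helper (not part of optim): guessed initial step size,
# inferred only from the first evaluations printed above.
guess_step <- function(par) 0.1 * max(abs(par))

# The four starting points used for f1, f2, f3 and f4
starts <- list(f1 = c(1, 1), f2 = c(5001, 5001),
               f3 = c(1, 5001), f4 = c(-2999, 50001))
steps <- sapply(starts, guess_step)

# Second point tried under this guess: the start plus the step
# in the first coordinate (one row per starting point)
second <- t(sapply(starts, function(p) p + c(guess_step(p), 0)))

print(steps)
print(second)
```

If the guess is right, the rows of `second` should match the second line of each trace above.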

However, the function to optimise is the same in all cases, only
translated, not scaled, so the step size *should* be the same. From
reading the documentation, it looks like changing the parscale should
work, and *relative* changes have the intended effect. Example:

    optim(c(1,1), f1, control=list(parscale=c(1,5)))

gives the function evaluations

    1.0, 1.0
    1.1, 1.0
    1.0, 1.5
    1.1, 0.5

But changing both values, e.g.,

    optim(c(1,1), f1, control=list(parscale=c(500,500)))

gives the same first four values as without parscale. There *are*
eventually some differences in the values tried, but these don't seem
to correspond to parscale as described in ?optim. For example, for
parscale=c(1,1), the parameter values tried are

1: 1, 1
2: 1.1, 1
3: 1, 1.1
4: 0.9, 1.1
5: 0.95, 1.075
6: 0.9, 1
7: 0.85, 0.95
8: 0.95, 0.85
9: 0.9375, 0.9125
10: 0.8, 0.8
11: 0.7, 0.7
12: 0.8, 0.6
13: 0.8125, 0.6875
14: 0.55, 0.45

while for parscale=c(500,500) they are

1: 1, 1
2: 1.1, 1
3: 1, 1.1
4: 0.9, 1.1
5: 0.95, 1.075
6: 0.9, 1
7: 0.85, 0.95
8: 0.95, 0.85
9: 0.975, 0.725
10: 0.825, 0.675
11: 0.7375, 0.5125
12: 0.8625, 0.2875
13: 0.859375, 0.453125
14: 0.625000000000001, 0.0750000000000004

and for parscale=1/c(50000,50000) they are (identical to the
parscale=c(1,1) case)

1: 1, 1
2: 1.1, 1
3: 1, 1.1
4: 0.9, 1.1
5: 0.95, 1.075
6: 0.9, 1
7: 0.85, 0.95
8: 0.95, 0.85
9: 0.9375, 0.9125
10: 0.8, 0.8
11: 0.7, 0.7
12: 0.8, 0.6
13: 0.8125, 0.6875
14: 0.55, 0.45
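
The lists above were read off the printed output. A sketch of how the trial points could be collected and compared without printing follows. `f_quiet` and `trace_optim` are names I made up for this; only `optim` itself and its `control` argument come from R.

```r
# Same function as f above, but without the print() call
f_quiet <- function(xy, mu1, mu2) dnorm(xy[1] - mu1) * dnorm(xy[2] - mu2)

# Hypothetical helper: run Nelder-Mead and return every point at which
# the objective was evaluated, one row per evaluation
trace_optim <- function(par, mu, ...) {
  pts <- list()
  fn <- function(xy) {
    pts[[length(pts) + 1]] <<- xy   # record the trial point
    -f_quiet(xy, mu[1], mu[2])
  }
  optim(par, fn, method = "Nelder-Mead", ...)
  do.call(rbind, pts)
}

t1   <- trace_optim(c(1, 1), c(0, 0))
t500 <- trace_optim(c(1, 1), c(0, 0), control = list(parscale = c(500, 500)))

# Compare the two runs row by row over their common length
n <- min(nrow(t1), nrow(t500))
differs <- rowSums(abs(t1[1:n, , drop = FALSE] -
                       t500[1:n, , drop = FALSE]) > 1e-12) > 0

# Index of the first evaluation where the two runs part ways (NA if none)
print(which(differs)[1])
```

The small tolerance is there only so that rounding noise from the rescaling is not reported as a difference.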

And there seems to be no way of actually changing the step size to
reasonable values (i.e., the same values for optimising f1 to f4).
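
The closest thing to a workaround I can think of is to optimise over the offset from the starting point, so that optim always starts from c(0,0). This is only a sketch: `shifted_optim` and `f4_quiet` are hypothetical names, and I am assuming (without having found it in the documentation) that an all-zero start gives a small fixed first step.

```r
# Same function as f above, but without the print() call
f_quiet <- function(xy, mu1, mu2) dnorm(xy[1] - mu1) * dnorm(xy[2] - mu2)

# Hypothetical wrapper: minimise fn over d = xy - start instead of xy,
# so that the optimiser itself always starts at the origin
shifted_optim <- function(start, fn, ...) {
  res <- optim(rep(0, length(start)), function(d) fn(start + d),
               method = "Nelder-Mead", ...)
  res$par <- start + res$par   # translate the solution back
  res
}

# The f4 case from above: minimum expected near (-3000, 50000)
f4_quiet <- function(xy) -f_quiet(xy, -3000, 50000)
res <- shifted_optim(c(-2999, 50001), f4_quiet)
print(res$par)
print(res$counts)
```

This hides the translation from optim rather than controlling the step size directly, so it does not answer the question of how the step is meant to be set.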

Is there something I have missed in how one is supposed to use optim
with Nelder-Mead? Or is this actually a bug in the implementation?

$ sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-suse-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=nn_NO.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=nn_NO.UTF-8        LC_COLLATE=nn_NO.UTF-8    
 [5] LC_MONETARY=nn_NO.UTF-8    LC_MESSAGES=nn_NO.UTF-8   
 [7] LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=nn_NO.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base