
R 'arima' discrepancies

As the original author of what became "BFGS" in optim(), I would point out that BFGS is a catch-all
phrase that should be applied only to the formula used to update EITHER (my emphasis) the approximate
Hessian or the approximate INVERSE Hessian. The starting approximation can vary as well, along with
the choice of line search applied along the search direction that the approximate (inverse?) Hessian generates.

The optim::BFGS starts with a unit matrix approximation to the inverse Hessian and uses a backtracking
line search. There are choices of how fast to backtrack and of the stopping criteria. Also, optim::BFGS
allows a relatively crude finite-difference gradient approximation to be used.
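To illustrate the difference the gradient choice makes, here is a minimal sketch using the classic Rosenbrock test function (my own example, not from the discussion above): optim() falls back to a finite-difference gradient when gr is omitted, and accepts an analytic gradient when one is supplied.

```r
# Rosenbrock banana function: a standard optimization test problem
fr <- function(x) 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2
# Its analytic gradient
grr <- function(x) c(-400 * x[1] * (x[2] - x[1]^2) - 2 * (1 - x[1]),
                     200 * (x[2] - x[1]^2))

# gr omitted: optim uses a finite-difference gradient approximation
res_num  <- optim(c(-1.2, 1), fr, method = "BFGS")
# gr supplied: optim uses the analytic gradient
res_anal <- optim(c(-1.2, 1), fr, gr = grr, method = "BFGS")

c(numeric_grad = res_num$value, analytic_grad = res_anal$value)
```

Both runs should approach the minimum at (1, 1), but the iterates and counts generally differ between the two gradient choices.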

Rvmmin (now part of optimx) is an all-R implementation of the same ideas, but with some changes in local
strategies. Even putting in what I believe are the same choices, I don't get identical iterates to
optim::BFGS, likely due to details of how the C implementation carries out the arithmetic. The "vm" in the
name is for the original "Variable Metric" algorithm of Fletcher (1970). I sat with Roger in Dundee in
Jan. 1976 and we used a red pencil and simplified his Fortran code. I made just one more change when I
returned to Ottawa: my approach always tries a steepest descent (i.e., a new unit inverse Hessian) before quitting.

One choice I have made with Rvmmin is to insist on a gradient function. I've found gradient approximations
don't work so well, not for speed but for confirming that an answer has a nearly zero gradient.
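The near-zero-gradient check can be sketched in a few lines (again with the Rosenbrock function as an illustrative stand-in; Rvmmin itself lives in the optimx package, so this sketch uses base optim for self-containment):

```r
# Rosenbrock function and analytic gradient, as a stand-in test problem
fr  <- function(x) 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2
grr <- function(x) c(-400 * x[1] * (x[2] - x[1]^2) - 2 * (1 - x[1]),
                     200 * (x[2] - x[1]^2))

res <- optim(c(-1.2, 1), fr, gr = grr, method = "BFGS")

# With an analytic gradient available, we can verify the answer directly:
# the gradient at a genuine (interior) minimizer should be nearly zero.
gmax <- max(abs(grr(res$par)))
gmax
```

With only a finite-difference gradient, this verification is muddied by the truncation error of the approximation itself, which is the point being made above.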

There is likely a useful but modest project to explore the differences Rodrigo has pointed out. It should not
be too difficult to set up the optimization to try different optimizers. I'll be happy to collaborate in
this. Note that Google Summer of Code is a possibility, but prospective students and mentors need to be
starting now.
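As a starting point for such a comparison, one can already loop over the methods built into base optim() and tabulate the results (my own sketch, again on the Rosenbrock function; the optimx package generalizes this pattern to many more solvers, including Rvmmin):

```r
# Test problem and analytic gradient
fr  <- function(x) 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2
grr <- function(x) c(-400 * x[1] * (x[2] - x[1]^2) - 2 * (1 - x[1]),
                     200 * (x[2] - x[1]^2))

# Run the same problem through several optim() methods
# (Nelder-Mead ignores the supplied gradient)
methods <- c("Nelder-Mead", "BFGS", "CG", "L-BFGS-B")
fits <- lapply(methods, function(m)
  optim(c(-1.2, 1), fr, gr = grr, method = m))

# Tabulate final objective values and function-evaluation counts
data.frame(method = methods,
           value  = sapply(fits, `[[`, "value"),
           fevals = sapply(fits, function(f) f$counts["function"]))
```

Even on this one small problem, the methods typically differ noticeably in final accuracy and in evaluation counts, which is the kind of discrepancy the proposed project would explore systematically.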

Cheers,

John Nash
On 2023-01-05 14:52, Rodrigo Ribeiro Remédio wrote: