optim-Bug (PR#6720) - R-devel

Peter Dalgaard · 2004-04-21T00:53:17Z

kestler@neuro.informatik.uni-ulm.de writes:
> Full_Name: Dr. Hans A. Kestler
> Version: 1.8.1.
> OS: Linux, Win, Mac OSX
> Submission from: (NULL) (134.60.73.116)
>
>
> The code below produces after a different number of iterations i the following
> error:
>
> Error in optim(par = rep(0.5, length(edges)), loglik, method = "L-BFGS-B", :
> non-finite value supplied by optim
>
> This was reproducible on different machines (Mac G4 OSX, AMD Opteron Linux SUSE
> 9.0, Intel P4 Suse 9.0, P

Peter Dalgaard

Tue, Apr 20, 2004 5:53 PM

kestler@neuro.informatik.uni-ulm.de writes:

I have this down to the use of R_alloc in the vect() function (line
40, optim.c). Replacing with S_alloc (which zeros memory) removes the
issue for me on this machine and on the Opteron. Could you please
verify on other platforms? And if anyone actually understands the code
could you verify that it makes sense to require the workspace to be
initialized?
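
A minimal standalone sketch of the difference being described, not the code in optim.c. The names vect_uninit and vect_zeroed are hypothetical, and malloc/calloc stand in for R_alloc/S_alloc so that the example compiles without the R headers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a vect()-style helper built on a
 * non-zeroing allocator (the role R_alloc plays in the thread).
 * The returned doubles have indeterminate values. */
double *vect_uninit(size_t n)
{
    assert(n > 0);
    return malloc(n * sizeof(double));
}

/* Zeroing variant (the role S_alloc plays in the thread).
 * calloc yields all-bits-zero, which reads as 0.0 on IEEE 754
 * platforms. */
double *vect_zeroed(size_t n)
{
    assert(n > 0);
    return calloc(n, sizeof(double));
}
```

Code that reads an element of the first buffer before writing it gets whatever happened to be in memory; with the second it always gets 0.0, which can hide a missing initialization rather than fix it.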

O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907

Brian Ripley

Wed, Apr 21, 2004 1:57 AM

On 21 Apr 2004, Peter Dalgaard wrote:

Zeroing the workspace is not a requirement of the original L-BFGS-B code
that I can see.  Given that it was originally in Fortran, and Fortran
often does zero memory, this seems a likely symptom, but it does mean that
a variable is being used uninitialized somewhere in the code (converted to
C).  It would be better to leave vect alone and to zero the workspace with
a memset call in lbfgsb.  (Incidentally, I don't know why S_alloc does not
use memset -- we do require standard C and use it in several other places.)
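
As a rough illustration of zeroing at the call site instead of inside the allocator, here is a standalone sketch. The helper name zero_workspace is hypothetical and this is not the change made to lbfgsb; it only shows the memset idiom on an array of doubles.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: clear a workspace of n doubles in place
 * before an optimizer uses it.  The allocator is left untouched;
 * only this one caller pays for the zeroing.  memset writes zero
 * bytes, which read back as 0.0 on IEEE 754 platforms. */
void zero_workspace(double *wa, size_t n)
{
    assert(wa != NULL);
    memset(wa, 0, n * sizeof(double));
}
```

Note that the byte count is n * sizeof(double), not n; passing the element count alone would clear only part of the array.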

Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Peter Dalgaard

Wed, Apr 21, 2004 7:23 AM

Prof Brian Ripley <ripley@stats.ox.ac.uk> writes:

...

Yes, sorry but it got a bit late when I sent that. I found this by
backtracking, so I know quite well where the problem is: The
allocation of "wa" on line 1020 in optim.c, specifically the wa[lsnd]
part in setulb which becomes snd in mainlb and wn1 in formk. Parts of
wn1 never get initialized. This appears to be *relatively* harmless,
except that sometimes there are NaN which propagate to wn and into
dpofa(), etc.
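
To illustrate why a single stray NaN in a work array matters, here is a small standalone sketch. It does not reproduce formk or dpofa; sum_squares and count_nonfinite are hypothetical helpers, the first standing in for any reduction over the array and the second for a debugging check.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Sum of squares, standing in for any reduction over a work array.
 * One NaN entry makes the whole result NaN. */
double sum_squares(const double *x, size_t n)
{
    assert(x != NULL);
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += x[i] * x[i];
    return s;
}

/* Count entries that are NaN or infinite, e.g. as a debugging
 * check on a workspace before it is factorized. */
size_t count_nonfinite(const double *x, size_t n)
{
    assert(x != NULL);
    size_t bad = 0;
    for (size_t i = 0; i < n; i++)
        if (!isfinite(x[i]))
            bad++;
    return bad;
}
```

Most bit patterns found in uninitialized memory decode to ordinary (often tiny) doubles, which fits the observation that the failure shows up only some of the time.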


Douglas Bates

Wed, Apr 21, 2004 12:46 PM

Prof Brian Ripley <ripley@stats.ox.ac.uk> writes:
...

I had planned to suggest that we use memset more widely and perhaps
add a macro Memset, defined like the current Memcpy, to R_ext/RS.h.  I
felt that memset was likely to be more efficient for zeroing large
areas of memory than running a loop would be.  Recently Peter Dalgaard
and Andy Liaw and I discussed this with regard to a test to see if
vectors using more than 4 GB of memory could be allocated on an
Opteron.  Andy installed my suggested modification to do_makevector to
use memset for zeroing freshly allocated vectors and found that the
running time for a test on an Opteron/Linux system actually increased.
It appears that, at least on that system, memset runs a loop at the
level of characters.
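
One way such a macro could look, as a sketch only. The definition below is hypothetical and is not taken from R_ext/RS.h; it assumes the convention that the length argument counts elements of *p rather than bytes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical Memset macro: n counts elements of *p, not bytes.
 * memset fills bytes, so for non-char element types only c == 0
 * (or another repeated byte pattern) gives a meaningful result. */
#define Memset(p, c, n) memset((p), (c), (size_t)(n) * sizeof(*(p)))

/* Example use: zero n doubles through the macro. */
void zero_doubles(double *x, int n)
{
    assert(x != NULL && n >= 0);
    Memset(x, 0, n);
}
```

Taking the element size from sizeof(*(p)) means the same macro works unchanged for int, double, or struct arrays.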

I still think it would be a good idea to use memset but we would want
to do some timing comparisons to make sure we don't slow things down.
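
A timing comparison of the kind suggested might be sketched as follows. The function names are hypothetical, clock() measures CPU time only coarsely, and an optimizing compiler may turn the explicit loop into a memset call or drop repeated stores, so any numbers from this depend heavily on compiler flags and platform.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Zero buf with an explicit loop, reps times; return CPU seconds. */
double time_loop_zero(double *buf, size_t n, int reps)
{
    assert(buf != NULL && reps > 0);
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        for (size_t i = 0; i < n; i++)
            buf[i] = 0.0;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Zero buf with memset, reps times; return CPU seconds. */
double time_memset_zero(double *buf, size_t n, int reps)
{
    assert(buf != NULL && reps > 0);
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        memset(buf, 0, n * sizeof(double));
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

For a meaningful comparison the buffer should be large (well beyond cache size), and both variants should be compiled with the same flags that R itself is built with.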

Brian Ripley

Wed, Apr 21, 2004 1:32 PM

I think S_alloc is little used (at least in R), but I have altered it to
use memset in R-devel. As I understand it all decent compilers recognize
memset calls and try to inline them with optimized code (probably even
optimizing the bzero case).

On 21 Apr 2004, Douglas Bates wrote:

(You mean char* bytes?  S_alloc was a byte-level loop, but looking in the
headers suggests that gcc3 has optimized inlined code for memset(s, 0, n).)

That's very machine- and compiler-dependent though.
