Request for help with UBSAN and total absence of CRAN response
On Tue, 2015-01-13 at 10:34 -0600, Dirk Eddelbuettel wrote:
> On 13 January 2015 at 08:21, Dan Tenenbaum wrote:
> | Where should the package source be downloaded from? I see it in CRAN
> | (but presumably the latest version that causes the issue is not yet
> | downloadable) and in github.
>
> The "presumable" assumption is incorrect AFAIK. The error should
> presumably come up in both versions as annoylib.h did not change there.
> Feel free to prove me wrong :) and just use whatever is easiest.
>
> Dirk
This is a curious case.
Here is where the first error occurs:
Executing test function test01getNNsByVector ...
Breakpoint 1, 0x00000000009c0440 in __ubsan_handle_out_of_bounds ()
(gdb) frame 1
#1 0x00007fffe777935b in AnnoyIndex<int, float, Angular<int, float> >::_get_all_nns (this=0x3a7e8f0, v=0x37d95d0, n=3, result=0x7ffffffee1e8)
at ../inst/include/annoylib.h:532
532 nns.insert(nns.end(), nd->children, &(nd->children[nd->n_descendants]));
(gdb) p nd->children
$48 = {0, 1}
(gdb) p nd->n_descendants
$49 = 3
(gdb) p nns
$50 = std::vector of length 0, capacity 0
So we are trying to insert 3 values from an array of length 2 into an
STL vector.
Comments in the header file annoylib.h (lines 114-130) show that this is
a result of a "memory optimization". Small objects have a completely
different format but are allocated in the same memory. When the
optimization is used the array is deliberately allowed to overflow:
S children[2]; // Will possibly store more than 2
T v[1]; // We let this one overflow intentionally
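To make the overflow concrete, here is a minimal sketch of one way to read
the extra child ids without indexing past children[2]. The Node struct is a
simplified stand-in for the layout quoted above, and copy_children and
make_node are hypothetical helpers, not part of annoylib.h. The idea is to
treat the node as raw bytes and copy from the offset of children, so no
array subscript goes out of bounds. Whether this would satisfy UBSAN in the
real package has not been tested here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Simplified stand-in for the node layout quoted above (an assumption,
// not the real annoylib.h definition).
struct Node {
  int n_descendants;
  int children[2];  // may be followed by more ids in the same allocation
  float v[1];
};

// Copy n child ids out of a node that lives in a raw byte buffer.
// The caller must guarantee that the buffer holds at least
// offsetof(Node, children) + n * sizeof(int) bytes.
std::vector<int> copy_children(const unsigned char* raw, std::size_t n) {
  std::vector<int> out(n);
  if (n > 0) {
    std::memcpy(out.data(), raw + offsetof(Node, children), n * sizeof(int));
  }
  return out;
}

// Hypothetical helper: build a byte buffer holding a node with these ids.
std::vector<unsigned char> make_node(const std::vector<int>& ids) {
  std::size_t bytes = offsetof(Node, children) + ids.size() * sizeof(int);
  if (bytes < sizeof(Node)) bytes = sizeof(Node);
  std::vector<unsigned char> buf(bytes, 0);
  int n = static_cast<int>(ids.size());
  std::memcpy(buf.data() + offsetof(Node, n_descendants), &n, sizeof(n));
  if (!ids.empty()) {
    std::memcpy(buf.data() + offsetof(Node, children), ids.data(),
                ids.size() * sizeof(int));
  }
  return buf;
}
```

With a node built from the ids {0, 1, 2}, copy_children(buf.data(), 3)
returns all three ids, which is the case that fails in the gdb session.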
A workaround is to turn off the optimization by reducing the threshold
for the alternate data format (_K) to such a low level that it is never
used (annoylib.h, line 259):
//_K = (sizeof(T) * f + sizeof(S) * 2) / sizeof(S);
_K = 2; //Turn off memory optimization
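For reference, the commented-out formula can be evaluated on its own to see
how large the threshold is before the workaround. This is only a sketch:
descendant_threshold is a hypothetical name, and f is assumed to be the
number of T elements stored per node, which is an assumption about
annoylib.h rather than something shown above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Number of S-sized ids that fit in the space taken by children[2] plus
// f elements of type T (same arithmetic as the commented-out line).
template <typename S, typename T>
constexpr std::size_t descendant_threshold(std::size_t f) {
  return (sizeof(T) * f + sizeof(S) * 2) / sizeof(S);
}
```

When S and T have the same size the result is f + 2, so with 4-byte ids and
4-byte floats and f = 40 the threshold would be 42, against the fixed value
of 2 used in the workaround.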
I think this is a case of "being too clever by half".
Martyn