Message-ID: <CAC2h7uuE518YkAWqWbBrnKVWfQZjes4SLWRJ=gvNu6gJjKWNug@mail.gmail.com>
Date: 2018-12-17T04:02:53Z
From: Kasper Daniel Hansen
Subject: [Bioc-devel] Compilation flags, CHECK errors and BiocNeighbors
In-Reply-To: <41BCDD06-172D-4E19-9EF7-07E76AABB22C@gmail.com>
I would hope we do not distribute binaries compiled with
-march=native
That seems very wrong.
On Sun, Dec 16, 2018 at 1:56 AM Aaron Lun <
infinite.monkeys.with.keyboards at gmail.com> wrote:
> Sometime between 6-18 November, BiocNeighbors' BioC-devel builds began
> failing on Windows 64-bit, and have continued to fail since:
>
> http://bioconductor.org/checkResults/devel/bioc-LATEST/BiocNeighbors/
>
> The most interesting part is the nature of the failures. They are not
> segmentation faults but rather "incorrect" output in the unit tests:
>
> - BiocNeighbors uses the Annoy algorithm for approximate nearest neighbor
> search, which is provided as a header-only C++ library in the RcppAnnoy
> package.
>
> - I have compiled the BiocNeighbors C++ code with an "#include" for these
> libraries to use the Annoy routines. For testing, I compared the output of
> my C++ code to the output of the code in the RcppAnnoy package.
>
> - It is these tests that are failing (i.e., the output does not match up)
> during CHECK on Windows 64-bit only, despite the fact that the same library
> is being "#include"d in both the BiocNeighbors and RcppAnnoy sources!
>
> What makes this particularly intriguing is that the differences between
> BiocNeighbors and RcppAnnoy are very minor. Less than 1% of the neighbor
> identities differ, and only for some of the scenarios, so it's not an
> obvious bug that would be changing the output en masse. Now, the package
> also uses/tests Annoy in BioC-release but builds fine on tokay1:
>
> http://bioconductor.org/checkResults/release/bioc-LATEST/BiocNeighbors/
>
> The major difference between the Bioc-release/devel builds is the
> compilation flags, which have changed from "-O2 -mtune=generic" to "-O3
> -march=native -mtune=native" in tokay2. I am told (thanks Val) that the
> timing of this change is consistent with the start of the BiocNeighbors
> build failures on tokay2. I would guess that RcppAnnoy is also compiled
> with "-O2 -mtune=generic" on the CRAN build systems, introducing
> differences in optimization levels between the BiocNeighbors and RcppAnnoy
> binaries. These could be responsible for the discrepancies in the search
> results.
>
> I was able to reproduce this on my Unix cluster (gcc 6.5.0) where setting
> "-march=native" with either "-O3" or "-O2" caused a difference in the
> calculations. After much trial and error, I eventually narrowed this down
> to the "-mfma" flag, which seems to change the precision of
> multiply-and-add operations and thus the search results. This occurs even
> when AVX support is turned off; I guess the compiler tries to be smart if
> it detects you are doing some kind of simultaneous multiply and addition,
> which is a pretty common thing to do when computing Euclidean distances.
>
> In summary: can we not use "-march=native" on tokay2? (Val, I know we
> discussed this, but whatever changes you made to the compilation flags
> don't seem to have propagated to the build machines.) As the case study
> with BiocNeighbors shows, this leads to inconsistencies between the CRAN
> and BioC-devel binaries for the same code, which unnecessarily complicates
> downstream usage and unit tests. I also wonder how binaries specialized for
> tokay2's architecture would behave on other CPUs with different instruction
> sets, if they would run at all.
>
> Cheers,
>
> Aaron
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
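
For anyone following along, here is a rough sketch of why "-mfma" can change
results, as described in the quoted message. A fused multiply-add rounds once,
after computing a*b + c exactly, while a separate multiply and add rounds
twice. The function names and input values below are made up for illustration
and are not taken from BiocNeighbors, RcppAnnoy or Annoy. The fused result is
emulated with exact rational arithmetic rather than a hardware instruction,
and double precision is used only for convenience.

```python
from fractions import Fraction


def unfused_mul_add(a, b, c):
    # Two roundings: the product is rounded to a double,
    # then the sum is rounded again.
    return a * b + c


def fused_mul_add(a, b, c):
    # One rounding: evaluate a*b + c exactly as a rational number,
    # then round to a double a single time. This emulates the result
    # of a fused multiply-add for finite inputs (an assumption of this
    # sketch; no FMA instruction is actually executed here).
    return float(Fraction(a) * Fraction(b) + Fraction(c))


# Hypothetical inputs chosen so that the exact product, 1 - 2**-54,
# is not representable as a double and rounds to 1.0.
a = 1.0 + 2.0 ** -27
b = 1.0 - 2.0 ** -27
c = -1.0

unfused = unfused_mul_add(a, b, c)
fused = fused_mul_add(a, b, c)

print("unfused:", unfused)
print("fused:  ", fused)
print("identical:", unfused == fused)
```

A discrepancy this small is usually harmless, but it can be enough to swap
the order of two candidates whose distances are nearly tied, which would fit
with only a small fraction of the neighbor identities changing.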