[R-pkg-devel] Reproducing CRAN checks

4 messages · Ivan Krylov, Pepijn de Vries, Dirk Eddelbuettel

#
Dear all,

After submitting a new package to CRAN, I now receive some check results that need to be addressed. There are issues raised by the gcc-ASAN, gcc-UBSAN and M1mac checks. More specifically, here are the check results:

https://cran.r-project.org/web/checks/check_results_blosc.html

I think I know how to address these issues. However, how can I test whether my fixes (https://github.com/pepijn-devries/blosc/pull/27/files) are adequate? When I check the submitted package with r-hub's gcc-ASAN Docker container, the issues reported by CRAN don't show up. I'm also not sure how to reproduce the M1mac results. In short: how can I make sure that my fixes are adequate, other than by resubmitting to CRAN and hoping for the best?

Kind regards,

Pepijn
#
On Thu, 4 Sep 2025 07:07:32 +0000,
Pepijn de Vries <pepijn.devries at outlook.com> wrote:
Is it this one?
https://github.com/r-hub/containers/blob/main/containers/gcc-asan/Dockerfile

Looks like it's based on Fedora 40 with GCC 14, while Prof. Brian
Ripley runs the gcc-ASAN checks on Fedora 42 with GCC 15:
https://www.stats.ox.ac.uk/pub/bdr/memtests/README.txt

(If README.txt and 00check.log ever disagree, trust 00check.log.)

Sometimes the issues are trivial to reproduce with any version of gcc
-fsanitize. Sometimes you need the right compiler version. Sometimes
you need to compile R and all dependencies from source so that they are
all instrumented by sanitizers. Sometimes even a fully reproducible
build environment, if it existed, wouldn't be enough because the issue
is only visible when the CPU is fully loaded with some threads being
starved for CPU time [*].
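
As a rough illustration of the simplest case above (reproducing with whatever gcc is at hand), the sketch below writes a throwaway user Makevars that adds sanitizer flags when R compiles a package. The flags follow the pattern shown in WRE 4.3.3; the temporary file location, the use of R_MAKEVARS_USER, the set of overridden variables and the tarball name are assumptions. This is one of the shortcuts: R itself and all dependencies stay uninstrumented, so it may well miss what CRAN sees.

```shell
#!/bin/sh
# Sketch only: compile a package with ASAN/UBSAN using the local gcc.
# Assumption: a throwaway Makevars in a temp directory, selected through the
# R_MAKEVARS_USER environment variable, so ~/.R/Makevars stays untouched.

R_MAKEVARS_USER="$(mktemp -d)/Makevars"
export R_MAKEVARS_USER

# Override the C and C++ compilers so that compile and link steps both
# carry the sanitizer flags.
cat > "$R_MAKEVARS_USER" <<'EOF'
CC = gcc -fsanitize=address,undefined -fno-omit-frame-pointer
CXX = g++ -fsanitize=address,undefined -fno-omit-frame-pointer
CXX17 = g++ -fsanitize=address,undefined -fno-omit-frame-pointer
EOF

echo "Sanitizer flags written to $R_MAKEVARS_USER"

# Next step (not run here; the tarball name is a placeholder):
#   R CMD INSTALL blosc_x.y.z.tar.gz
# With an R binary that was not itself built with ASAN, the ASAN runtime
# usually has to be preloaded (LD_PRELOAD) before R starts.
```
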

I think that currently, the approach most likely to reproduce the
problem is to start with a Fedora 42 container and follow the
README.txt together with WRE 4.3.3 and 4.3.4 to build R and all
dependencies from source. There are many shortcuts that could be taken
(and the ones taken by r-hub containers save a lot of electricity and
time running CI!), but they carry a risk of missing the problem.

If you skip install.libs('blosc') in CI on macOS, the failure should be
reproducible, but then there's no way for the configure script to
succeed. Were you asked to fix the error messages produced by the
configure script?

If you'd like to provably prevent the dereference of the [-1] subscript
from happening, consider moving the if (target_unit < 0) stop(...)
inside the previous if (target_unit < 0) branch, before accessing the
arrays. This way even a strange new unit (or a damaged file) won't cause
undefined behaviour.
#
Hi Ivan,

Thank you for your very helpful response!

Yes, that's one of the sanitizer containers I've used for testing.

Thank you for the thorough explanation. I see that reproducing sanitizer
results isn't for the faint of heart. I will have to ponder whether I really
want to put all this effort into trying to reproduce the reported results,
or just make sensible adjustments based on the reported issues.

Prof. Ripley did send an e-mail with specific requests. Three of the requests
are related to the configure script: 1) I should use C++ flags instead of
C flags. That was my mistake, and I have fixed it. 2) In case of failure the
configure script should exit with an informative message. I have updated the
configure script so that it tells the user how to install the system requirements
when not found. 3) "Could not find the required static library BLOSC"
Here Prof. Ripley implies that blosc is not static on Linux. I don't think this is
the case. So I'm not sure how to proceed with this.
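
For request 2, a configure fragment along the following lines is one way to fail with an informative message. It is a sketch, not the package's actual script: the function name find_blosc, the pkg-config module name "blosc" (taken from the pkgconf call quoted in this thread), the wording of the message and the package names in the install hints are all assumptions.

```shell
#!/bin/sh
# Sketch of a configure fragment; not the package's actual script.
# Assumptions: pkg-config knows the library as "blosc", and the install
# hints below name the usual distribution packages.

# Print compiler and linker flags for blosc, or explain how to install it.
# The optional argument overrides the pkg-config module name.
# Returns 0 on success and 1 if pkg-config cannot find the module.
find_blosc() {
  lib="${1:-blosc}"
  if pkg-config --exists "$lib" 2>/dev/null; then
    echo "PKG_CPPFLAGS=$(pkg-config --cflags "$lib")"
    echo "PKG_LIBS=$(pkg-config --libs "$lib")"
    return 0
  fi
  cat >&2 <<EOF
------------------------- [CONFIGURE ERROR] -------------------------
Could not find the system library '$lib' via pkg-config.
Please install it and try again, for example:
  * deb:  libblosc-dev  (Debian, Ubuntu)
  * rpm:  blosc-devel   (Fedora)
  * brew: c-blosc       (macOS)
----------------------------------------------------------------------
EOF
  return 1
}

# In a real configure script a failure would end with a non-zero exit:
#   find_blosc || exit 1
```

Because the flags come from pkg-config --libs, this variant links against whatever the system provides (normally the shared library) and never mentions a static library in its message.
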
Thanks for the suggestion. I think you are right that it is safer to move the
check (target_unit < 0) up. And I agree that this should address the reported
undefined behaviour.
#
On 4 September 2025 at 21:30, Pepijn de Vries wrote:
| when not found. 3) "Could not find the required static library BLOSC"
| Here Prof. Ripley implies that blosc is not static on Linux. I don't think this is
| the case. So I'm not sure how to proceed with this.

What is your system / your reference?  "Here" on Ubuntu libblosc* is set up
like thousands of other libraries: the shared library in the run-time
package, the static library in the -dev package.

Your build picks up the shared library as we'd expect:

$ ldd /usr/local/lib/R/site-library/blosc/libs/blosc.so | grep blosc
        libblosc.so.1 => /lib/x86_64-linux-gnu/libblosc.so.1 (0x00007f4b56501000)
$

To pick just one other package with compressor libraries:

$ ldd /usr/local/lib/R/site-library/zip/libs/zip.so | grep 'lib.*z'
        libzstd.so.1 => /lib/x86_64-linux-gnu/libzstd.so.1 (0x000078af0e90f000)
        liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x000078af0e8cf000)
        libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x000078af0e8bb000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x000078af0e89d000)
$

Now, it has become a little 'fashionable' to ship libraries with their CRAN
package and embed them, but the default is still to rely on established
system libraries:

$ ldd /usr/local/lib/R/site-library/RPostgreSQL/libs/RPostgreSQL.so | grep libpq
        libpq.so.5 => /lib/x86_64-linux-gnu/libpq.so.5 (0x000071fe7fc53000)
$ ldd /usr/local/lib/R/site-library/RPostgres/libs/RPostgres.so | grep libpq
        libpq.so.5 => /lib/x86_64-linux-gnu/libpq.so.5 (0x000072e718fd6000)
$

Maybe Prof Ripley wanted you to play along with the common theme of relying on an
external shared library?  For libblosc the default (as seen from `pkgconf`
aka `pkg-config`) appears to be standard linking just as we saw above:

$ pkgconf --libs blosc
-lblosc 
$ 
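
The ldd checks above can be wrapped in a small helper to re-check a package after changing its configure script. This is a minimal sketch assuming a Linux system with ldd; the function name linked_against is made up for illustration, and the path in the example call is the one from the output above.

```shell
#!/bin/sh
# Sketch: wrap the ldd checks from this message in a reusable helper.
# Assumption: a Linux system where ldd is available.

# List the shared libraries matching a pattern that a shared object is
# dynamically linked against. Exit status is 0 if at least one matches.
linked_against() {
  ldd "$1" 2>/dev/null | grep -- "$2"
}

# Example call (adjust the path to your own library tree):
#   linked_against /usr/local/lib/R/site-library/blosc/libs/blosc.so libblosc \
#     || echo "no dynamic libblosc dependency (static, bundled, or file missing)"
```
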

Hope this helps,  Dirk