Dear all,

After submitting a newbie package to CRAN, I now receive some check results that need to be addressed. There are issues raised by the gcc-ASAN, gcc-UBSAN and M1mac checks. More specifically, here are the check results: https://cran.r-project.org/web/checks/check_results_blosc.html

I think I know how to address these issues, however... How can I test whether my fixes (https://github.com/pepijn-devries/blosc/pull/27/files) were adequate? When I check the submitted package with rhub's gcc-ASAN docker file, the issues reported by CRAN don't pop up. I'm also not sure how to reproduce the M1mac results.

In short: how can I assure that my fixes are adequate, other than resubmitting to CRAN and hoping for the best?

Kind regards, Pepijn
[R-pkg-devel] Reproducing CRAN checks
4 messages · Ivan Krylov, Pepijn de Vries, Dirk Eddelbuettel
On Thu, 4 Sep 2025 07:07:32 +0000, Pepijn de Vries <pepijn.devries at outlook.com> wrote:
When I check the submitted package with rhub's gcc-ASAN docker file, the issues reported by CRAN don't pop up.
Is it this one? https://github.com/r-hub/containers/blob/main/containers/gcc-asan/Dockerfile

Looks like it's based on Fedora 40 with GCC 14, while Prof. Brian Ripley runs the gcc-ASAN checks on Fedora 42 with GCC 15: https://www.stats.ox.ac.uk/pub/bdr/memtests/README.txt (If README.txt and 00check.log ever disagree, trust 00check.log.)

Sometimes the issues are trivial to reproduce with any version of gcc -fsanitize. Sometimes you need the right compiler version. Sometimes you need to compile R and all dependencies from source so that they are all instrumented by sanitizers. Sometimes even a fully reproducible build environment, if it existed, wouldn't be enough because the issue is only visible when the CPU is fully loaded with some threads being starved for CPU time [*].

I think that currently, the approach most likely to reproduce the problem is to start with a Fedora 42 container and follow the README.txt together with WRE 4.3.3 and 4.3.4 to build R and all dependencies from source. There are many shortcuts that could be taken (and the ones taken by r-hub containers save a lot of electricity and time running CI!), but they carry a risk of missing the problem.
I'm also not sure how to reproduce the M1mac results.
If you skip install.libs('blosc') in CI on macOS, the failure should be
reproducible, but then there's no way for the configure script to
succeed. Were you asked to fix the error messages produced by the
configure script?
In short: how can I assure that my fixes are adequate, other than resubmitting to CRAN and hoping for the best?
If you'd like to provably prevent the dereference of the [-1] subscript from happening, consider moving the if (target_unit < 0) stop(...) inside the previous if (target_unit < 0) branch, before accessing the arrays. This way even a strange new unit (or a damaged file) won't cause undefined behaviour.
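A minimal sketch of that reordering, assuming the code looks a unit up in a table and then indexes a parallel array with the result. The names find_unit, seconds_to_unit, kUnitNames and kUnitsPerSecond are hypothetical and do not come from the blosc package; the exception stands in for whatever stop()/error call the package actually uses.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical lookup tables; the real package's units and factors may differ.
static const std::array<std::string, 4> kUnitNames = {"s", "ms", "us", "ns"};
static const std::array<double, 4> kUnitsPerSecond = {1.0, 1e3, 1e6, 1e9};

// Returns the index of `unit` in kUnitNames, or -1 if the unit is unknown.
int find_unit(const std::string &unit) {
  for (std::size_t i = 0; i < kUnitNames.size(); ++i) {
    if (kUnitNames[i] == unit) return static_cast<int>(i);
  }
  return -1;
}

// Converts a value in seconds to the requested unit.
// The index is validated BEFORE any array is touched, so an unknown unit
// (or a damaged file) raises an error instead of reading element [-1].
double seconds_to_unit(double seconds, const std::string &unit) {
  const int target_unit = find_unit(unit);
  if (target_unit < 0) {
    // Assumption: in an R package this would be the package's own stop() call.
    throw std::invalid_argument("unknown unit: " + unit);
  }
  return seconds * kUnitsPerSecond[static_cast<std::size_t>(target_unit)];
}
```

With the check placed first, no code path can reach the array access with a negative index, which is the property the sanitizers are testing for.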
Best regards, Ivan [*] e.g. https://github.com/Rdatatable/data.table/issues/7051
Hi Ivan, Thank you for your very helpful response!
On Thu, 4 Sep 2025 07:07:32 +0000, Pepijn de Vries <pepijn.devries at outlook.com> wrote:
When I check the submitted package with rhub's gcc-ASAN docker file, the issues reported by CRAN don't pop up.
Yes, that's one of the sanitizers I've used for testing
Looks like it's based on Fedora 40 with GCC 14, while Prof. Brian Ripley runs the gcc-ASAN checks on Fedora 42 with GCC 15: https://www.stats.ox.ac.uk/pub/bdr/memtests/README.txt (If README.txt and 00check.log ever disagree, trust 00check.log.) Sometimes the issues are trivial to reproduce with any version of gcc -fsanitize. Sometimes you need the right compiler version. Sometimes you need to compile R and all dependencies from source so that they are all instrumented by sanitizers. Sometimes even a fully reproducible build environment, if it existed, wouldn't be enough because the issue is only visible when the CPU is fully loaded with some threads being starved for CPU time [*]. I think that currently, the approach most likely to reproduce the problem is to start with a Fedora 42 container and follow the README.txt together with WRE 4.3.3 and 4.3.4 to build R and all dependencies from source. There are many shortcuts that could be taken (and the ones taken by r-hub containers save a lot of electricity and time running CI!), but they carry a risk of missing the problem.
Thank you for the thorough explanation. I see that reproducing sanitizer results isn't for the faint of heart. I will have to ponder whether I really want to put all this effort into reproducing the reported results, or just make sensible adjustments based on the reported issues.
I'm also not sure how to reproduce the M1mac results.
If you skip install.libs('blosc') in CI on macOS, the failure should be
reproducible, but then there's no way for the configure script to
succeed. Were you asked to fix the error messages produced by the
configure script?
Prof. Ripley did send an e-mail with specific requests. Three of the requests are related to the configure script:
1) I should use C++ flags instead of C flags. That was my mistake; I have fixed those.
2) In case of failure, the configure script should exit with an informative message. I have updated the configure script so that it tells the user how to install the system requirements when they are not found.
3) "Could not find the required static library BLOSC". Here Prof. Ripley implies that blosc is not static on Linux. I don't think this is the case, so I'm not sure how to proceed with this.
In short: how can I assure that my fixes are adequate, other than resubmitting to CRAN and hoping for the best?
If you'd like to provably prevent the dereference of the [-1] subscript from happening, consider moving the if (target_unit < 0) stop(...) inside the previous if (target_unit < 0) branch, before accessing the arrays. This way even a strange new unit (or a damaged file) won't cause undefined behaviour.
Thanks for the suggestion. I think you are right that it is safer to move the check (target_unit < 0) up. And I agree that this should address the reported undefined behaviour.
On 4 September 2025 at 21:30, Pepijn de Vries wrote:
| when not found. 3) "Could not find the required static library BLOSC"
| Here Prof. Ripley implies that blosc is not static on Linux. I don't think this is
| the case. So I'm not sure how to proceed with this.
What is your system / your reference? "Here" on Ubuntu libblosc* is set up
like thousands of other libraries: the shared library in the run-time
package, the static library in the -dev package.
Your build picks up the shared library as we'd expect:
$ ldd /usr/local/lib/R/site-library/blosc/libs/blosc.so | grep blosc
libblosc.so.1 => /lib/x86_64-linux-gnu/libblosc.so.1 (0x00007f4b56501000)
$
To pick just one other package with compressor libraries:
$ ldd /usr/local/lib/R/site-library/zip/libs/zip.so | grep 'lib.*z'
libzstd.so.1 => /lib/x86_64-linux-gnu/libzstd.so.1 (0x000078af0e90f000)
liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x000078af0e8cf000)
libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x000078af0e8bb000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x000078af0e89d000)
$
Now, it has become a little 'fashionable' to ship libraries with their CRAN
package and embed them, but the default is still to rely on established
system libraries:
$ ldd /usr/local/lib/R/site-library/RPostgreSQL/libs/RPostgreSQL.so | grep libpq
libpq.so.5 => /lib/x86_64-linux-gnu/libpq.so.5 (0x000071fe7fc53000)
$ ldd /usr/local/lib/R/site-library/RPostgres/libs/RPostgres.so | grep libpq
libpq.so.5 => /lib/x86_64-linux-gnu/libpq.so.5 (0x000072e718fd6000)
$
Maybe Prof Ripley wanted you to play along the common theme of relying on an
external shared library? For libblosc the default (as seen from `pkgconf`
aka `pkg-config`) appears to be standard linking just as we saw above:
$ pkgconf --libs blosc
-lblosc
$
Hope this helps, Dirk
dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org