
[R-pkg-devel] How reproduce CRAN check

6 messages · Cristiane Hayumi Taniguti, Duncan Murdoch, Henrik Bengtsson +2 more

#
I'm having difficulties updating my package onemap. When I run the standard
checks with --as-cran on Linux, on win-builder, and in Docker images,
everything is fine, but 'CRAN-submissions' reports this problem in my C++ code:

onemap.Rcheck/onemap-Ex.Rout:
.............../usr/local/bin/../include/c++/v1/memory:1825:35: runtime
error: nan is outside the range of representable values of type 'int'
onemap.Rcheck/onemap-Ex.Rout:twopts_out.cpp:147:7: runtime error: nan is
outside the range of representable values of type 'int'
onemap.Rcheck/onemap-Ex.Rout:twopts_out.cpp:147:26: runtime error: nan
is outside the range of representable values of type 'int'
onemap.Rcheck/onemap-Ex.Rout:
...........................twopts_f2.cpp:101:7: runtime error: nan is
outside the range of representable values of type 'int'
onemap.Rcheck/onemap-Ex.Rout:twopts_f2.cpp:101:26: runtime error: nan is
outside the range of representable values of type 'int'

This problem was also present in the last CRAN version of my package:
<https://www.stats.ox.ac.uk/pub/bdr/memtests/clang-UBSAN/onemap>

I am having a hard time reproducing this error "at home" so that I can fix
the problem. Without reproducing it, I am not able to find what is causing
the NaN in my code.

The source package for the new version is here:
<https://drive.google.com/open?id=1L9Qcz9qraZOo-jaq-fwkF-PQfmg73mSv>

Any help is much appreciated.

Cris
#
On 13/09/2019 10:15 a.m., Cristiane Hayumi Taniguti wrote:
I would add some kind of Rprintf statement near line 101 of 
twopts_f2.cpp to show the value that you are apparently trying to coerce 
to int at that point.  Apparently it is sometimes equal to NaN, and this 
is being caught in the CRAN checks, but not in yours.  That might be 
because they are doing more stringent checks (UBSAN is very stringent), 
or it might be that the value is actually different on their system.
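
A minimal sketch of such a diagnostic, written as plain C++ so it can be
tried outside R. The helper name report_if_nan is hypothetical, and
std::fprintf(stderr, ...) only stands in for Rprintf/REprintf, which accept
the same printf-style format string inside a package.

```cpp
#include <cmath>
#include <cstdio>

// Hypothetical helper: report a suspicious value before it is coerced to int.
// Inside an R package, Rprintf/REprintf would replace std::fprintf.
// Returns true when the value is NaN so the caller can decide what to do.
bool report_if_nan(double value, const char* name, int index) {
    if (std::isnan(value)) {
        std::fprintf(stderr, "%s[%d] is NaN before conversion to int\n",
                     name, index);
        return true;
    }
    return false;
}
```

Calling something like report_if_nan(x, "x", i) just before the flagged
assignment would print only for the offending elements, which keeps the
output readable when the examples loop over many markers.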

Duncan Murdoch
#
As a first step toward being able to reproduce the error, have a look at
R-Hub (https://blog.r-hub.io/2019/03/26/why-care/).  It allows you
to use their servers to test your package on many different
configurations.  It has helped me several times. You can launch it all
from your local R prompt, e.g.
── Choose build platform

 1: Debian Linux, R-devel, clang, ISO-8859-15 locale (debian-clang-devel)
 2: Debian Linux, R-devel, GCC (debian-gcc-devel)
 3: Debian Linux, R-devel, GCC, no long double (debian-gcc-devel-nold)
 4: Debian Linux, R-patched, GCC (debian-gcc-patched)
 5: Debian Linux, R-release, GCC (debian-gcc-release)
 6: Fedora Linux, R-devel, clang, gfortran (fedora-clang-devel)
 7: Fedora Linux, R-devel, GCC (fedora-gcc-devel)
 8: CentOS 6, stock R from EPEL (linux-x86_64-centos6-epel)
 9: CentOS 6 with Redhat Developer Toolset, R from EPEL
(linux-x86_64-centos6-epel-rdt)
10: Debian Linux, R-devel, GCC ASAN/UBSAN (linux-x86_64-rocker-gcc-san)
11: macOS 10.11 El Capitan, R-release (experimental) (macos-elcapitan-release)
12: Oracle Solaris 10, x86, 32 bit, R-patched (experimental)
(solaris-x86-patched)
13: Ubuntu Linux 16.04 LTS, R-devel, GCC (ubuntu-gcc-devel)
14: Ubuntu Linux 16.04 LTS, R-release, GCC (ubuntu-gcc-release)
15: Ubuntu Linux 16.04 LTS, R-devel with rchk (ubuntu-rchk)
16: Windows Server 2008 R2 SP1, R-devel, 32/64 bit (windows-x86_64-devel)
17: Windows Server 2012, R-devel, Rtools4.0, 32/64 bit (experimental)
(windows-x86_64-devel-rtools4)
18: Windows Server 2008 R2 SP1, R-oldrel, 32/64 bit (windows-x86_64-oldrel)
19: Windows Server 2008 R2 SP1, R-patched, 32/64 bit (windows-x86_64-patched)
20: Windows Server 2008 R2 SP1, R-release, 32/64 bit (windows-x86_64-release)

Selection: 10

/Henrik

PS. "Google Drive"; the sooner you migrate to Git and an online Git
repository (GitHub, GitLab, Bitbucket, ...) the sooner you'll thank
yourself for doing so and for saving yourself lots of time. The world of
continuous integration (read "automagic instant R CMD check on Linux,
macOS, Windows - for free in the cloud") will also open up to you.

On Fri, Sep 13, 2019 at 7:15 AM Cristiane Hayumi Taniguti
<chtaniguti at usp.br> wrote:
#
On Fri, 13 Sep 2019 at 16:16, Cristiane Hayumi Taniguti
<chtaniguti at usp.br> wrote:
Here: https://github.com/augusto-garcia/onemap/blob/master/src/twopts_f2.cpp#L101
-> k1=segreg_type(i); k2=segreg_type(j);

k1 and k2 are vectors of integers, but segreg_type is a NumericVector
(double), which may contain NaN.
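
One way to guard such an assignment, sketched in plain C++ under stated
assumptions: std::vector<double> stands in for the Rcpp NumericVector, and
to_int_checked / to_int_vector are hypothetical helper names, not part of
onemap or Rcpp. The range test is slightly conservative at the edges but
rejects NaN, infinities, and anything a plain cast could not represent.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Convert one double to int only when the result is well defined.
// Returns false (and leaves `out` untouched) for NaN, infinities, and
// values outside the range of int.
bool to_int_checked(double x, int& out) {
    const double lo = static_cast<double>(std::numeric_limits<int>::min());
    const double hi = static_cast<double>(std::numeric_limits<int>::max());
    if (!(x >= lo && x <= hi)) {  // every comparison with NaN is false
        return false;
    }
    out = static_cast<int>(x);
    return true;
}

// Convert every element; stop at the first value that cannot be converted
// and report its position through bad_index.
bool to_int_vector(const std::vector<double>& in, std::vector<int>& out,
                   std::size_t& bad_index) {
    out.assign(in.size(), 0);
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (!to_int_checked(in[i], out[i])) {
            bad_index = i;
            return false;
        }
    }
    return true;
}
```

The sanitizer message quoted at the top of the thread comes from the
float-cast-overflow check, which clang enables with
-fsanitize=float-cast-overflow. GCC's documentation notes that this check is
not enabled by -fsanitize=undefined alone, which may be one reason a
GCC-based sanitizer build stays silent on the same code.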

Iñaki
4 days later
#
Hello, thanks for all the answers.

I ran new tests following all the suggestions.

I added the Rprintf to my code; however, it did not show any NaN when
running my examples (the same ones for which the CRAN check reported the
error).

The check on R-hub with Debian Linux, R-devel, GCC ASAN/UBSAN
(linux-x86_64-rocker-gcc-san) did not show the error either.

r-debug seems to be a good tool, but I had problems building my package
with it. First, the library libglu1-mesa-dev was missing for the rgl package
installation. After I installed it, I got the following errors when
trying to install the required MDSMap package:

"Shadow memory range interleaves with an existing memory mapping. ASan
cannot proceed correctly. ABORTING."
and
"/usr/bin/ld: cannot find -lgfortran"

I made the modifications in the Rcpp files (see the update here:
<https://github.com/Cristianetaniguti/onemap/commit/8d90eab994c9c5cc56202c2c15c5eae7c639315b>),
but I still cannot test the package with them. Would it be a problem if I
submit anyway?
On Fri, Sep 13, 2019 at 12:35 PM Iñaki Ucar <iucar at fedoraproject.org> wrote:

            

  
  
8 days later
#
On 9/17/19 6:17 PM, Cristiane Hayumi Taniguti wrote:
The approach of adding a runtime check that looks for NaNs should work. You
can reproduce the problem with the following code (extracted from the
vignette for which the CRAN checks report NaNs, and tested on the CRAN
version of the package):

---

library(onemap)
data("onemap_example_f2")
data("vcf_example_f2")
comb_example <- combine_onemap(onemap_example_f2, vcf_example_f2)
twopts_f2 <- rf_2pts(comb_example)
mark_all_f2 <- make_seq(twopts_f2, "all")

CHR_mks <- group_seq(input.2pts = twopts_f2, seqs = "CHROM",
                     unlink.mks = mark_all_f2, repeated = FALSE)

---

With a smaller example like this, it is easier to iterate using added 
runtime checks. Also it is easier to debug as one can have it in an 
interactive R session.

Now one can check the "geno" vector before it is passed to the range
constructor of std::vector<int>; you will see that it contains NaNs, as
reported by UBSAN. One way to check is to add a function to the code that
searches the vector and prints a message, and to call it before the line
indicated in the UBSAN reports:

std::vector<int> k_sub(&geno[i*n_ind],&geno[i*n_ind+n_ind]);
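
Such a search function could look roughly like the sketch below. It is
plain C++ for illustration only: std::vector<double> stands in for the
package's "geno" vector, count_nan and slice_as_int are hypothetical names,
and std::printf would become Rprintf inside the package.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: count NaNs in geno[begin, end) and print a message
// when any are found. Inside the package, Rprintf would replace std::printf.
std::size_t count_nan(const std::vector<double>& geno,
                      std::size_t begin, std::size_t end) {
    std::size_t n_nan = 0;
    for (std::size_t k = begin; k < end && k < geno.size(); ++k) {
        if (std::isnan(geno[k])) ++n_nan;
    }
    if (n_nan > 0) {
        std::printf("geno[%zu, %zu): %zu NaN value(s)\n", begin, end, n_nan);
    }
    return n_nan;
}

// Build the slice for marker i only when it is free of NaN; otherwise
// return an empty vector so the caller can handle the problem explicitly.
// Only NaN is screened here; out-of-range finite values are not checked.
std::vector<int> slice_as_int(const std::vector<double>& geno,
                              std::size_t i, std::size_t n_ind) {
    const std::size_t begin = i * n_ind;
    const std::size_t end = begin + n_ind;
    if (end > geno.size() || count_nan(geno, begin, end) > 0) {
        return std::vector<int>();
    }
    return std::vector<int>(geno.begin() + begin, geno.begin() + end);
}
```

Printing the count rather than every element makes it easy to spot a slice
in which every value is NaN.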

It turns out that when there is a NaN in the vector, all values are
NaN, so it is natural to continue debugging in the surrounding R code.
There, est_rf_f2 in cpp_utils.R sometimes gets called with a
vector of integers (not doubles), and this vector has NAs, which are
later converted to NaN.

So perhaps you could continue debugging from there using the R debugger
in an interactive session. It might also make sense to add some
diagnostics to the C code to make similar problems easier to debug if
they appear later. At the very least, it would be worth checking the type
of the argument: this is fast and would detect that the type is integer
instead of double. It might even make sense to check the validity of the
individual elements.

Best
Tomas