Skip to content

[R-pkg-devel] R package with external C++ library

8 messages · Duncan Murdoch, Ege Rubak, MTurgeon +1 more

#
Hi,

I would like to port Google's s2-library for spherical geometry (see 
e.g. https://github.com/micolous/s2-geometry-library for a fork on 
GitHub). It is not a standard library that can easily be installed on 
various systems, so I would like to include the source code in the R 
package. The catch is that I would like to modify the source code as 
little as possible :-)

I have package everything and added configure scripts and a tiny 
R-function that calls one of the C++-functions (using the antiquated .C 
interface for now -- that will of course be changed) in this repo:
https://github.com/spatstat/s2

It compiles into a working package on Ubuntu (travis-ci + my laptop), 
OSX (travis-ci), and Windows (appveyor + my surface pro), but R CMD 
check produces some warnings (and a note about the size of the shared 
object, but I assume that is less important).

The main things seem to be related to (travis log is at 
https://travis-ci.org/spatstat/s2/jobs/149578339):

1. Deprecated C++ headers <ext/hash_set> and <ext/hash_map>.

2. Compiled code that calls entry points which might terminate R or 
write to stdout/stderr.

Is it hopeless to get on CRAN with warnings like these?
I'm not very used to writing C/C++ code, but I guess 1. can be fixed by 
a few sed commands with the replacement headers and corresponding new 
function names. Point 2. can probably also be fixed with a reasonable 
effort, but I haven't investigated yet, and I would like an opinion from 
the list before spending more time on this. In more generality the 
question could be phrased something like:

"When including C++ code from an upstream library which you do not 
control should R CMD check be completely spotless or is some flexibility 
to be expected in these circumstances?"

Cheers,
Ege

PS: Extra question (prehaps particularly aimed at Dirk): When I will 
actually start to use the C++ library I expect it could be beneficial to 
use Rcpp. I have seen RcppModules mentioned somewhere, and I wonder if 
such an external C++ library would make sense to interface via 
RcppModules (again aiming at changing upstream sources as little as 
possible)?
#
On 03/08/2016 5:36 PM, Ege Rubak wrote:
I don't set CRAN policy, but I would say yes.  Problem 1 limits your 
package to systems using compilers that support those antiquated 
headers; R tries very hard to be portable across many systems.  Problem 
2 makes R potentially unstable.

Duncan Murdoch
#
Thanks for the quick reply. The non-portability issue you mention has 
not shown up so far but I was actually fearing it would show up once I 
(or more likely CRAN) try to compile on solaris -- is there a way to 
test code on solaris without bothering the brave CRAN souls? Anyway, I 
should probably start executing some sed commands and see if I can get 
rid of the problem in a fairly reproducible way.
Regarding problem 2 even if I never use the part of the C++ code that 
writes to stdout etc. I guess it is impossible for CRAN to know this, so 
it makes sense for them to be safe rather than sorry?

/Ege
#
On 03/08/2016 5:59 PM, Ege Rubak wrote:
I don't know.


Anyway, I
Yes.

Duncan Murdoch
#
Hi Ege,

For writing to standard output/error, you can use Rcout or Rcerr 
(defined by Rcpp; they even have a vignette showing how to use it in the 
Rcpp gallery). Alternatively, if you're using C code, you can replace 
printf by Rprintf (this is explained in Writing R extensions, section 6.5).

For abort, you can use error() instead (this is documented in WRE, 
section 6.2).

Hope this helps,

Max
On 2016-08-03 03:36 PM, Ege Rubak wrote:

  
    
#
It looks like most of these usages of std::cout, std::cerr and abort
are coming from the logging infrastructure:

https://github.com/micolous/s2-geometry-library/blob/master/geometry/base/logging.h

If you wanted to make this work on CRAN, you could patch these files
in a number of ways:

You could replace calls to 'std::cout' with a helper function
'cout()', and 'std::cerr' with a helper function 'cerr()', to instead
point to an output stream object that delegates to Rprintf and
REprintf for logging. You can see a sample implementation of something
like this in the infamous test obfuscation suite, testthat:

https://github.com/hadley/testthat/blob/46d15dad93fc2bfca339963cb66ffb1a309fa8a0/inst/include/testthat/testthat.h#L58-L96

https://github.com/hadley/testthat/blob/46d15dad93fc2bfca339963cb66ffb1a309fa8a0/inst/include/testthat/testthat.h#L119-L129

Similarly, calls to 'abort()' could instead throw a C++ exception,
which you would then catch at the top level with your R wrapper
entrypoints, and forwarded to the user appropriately (e.g. with
Rf_error called with no C++ objects on the stack).

As for the use of deprecated C++ headers, you could patch those to
instead use C++11's 'unordered_set' and 'unordered_map', or try
getting similar classes from e.g. Boost.

All in all, it would likely be a lot of work to get the package ready
for a CRAN submission, but it's definitely doable.

Finally, the CRAN policies state:

Package authors should make all reasonable efforts to provide
cross-platform portable code. Packages will not normally be accepted
that do not run on at least two of the major R platforms. Cases for
Windows-only packages will be considered, but CRAN may not be the most
appropriate place to host them.

So if compilation on Solaris really is a non-starter for some reason,
I think you can indicate that you do not intend for your package to be
used / distributed on Solaris (but you would probably have to provide
sufficient justification during the submission).

Cheers,
Kevin
On Wed, Aug 3, 2016 at 2:59 PM, Ege Rubak <rubak at math.aau.dk> wrote:
#
On Wed, Aug 3, 2016 at 3:09 PM, MTurgeon <maxime.turgeon at mail.mcgill.ca> wrote:
Although, please note that calling Rf_error from a C++ context is not
safe as it will skip the execution of any destructors for C++ objects
on the stack, and lead to undefined behavior in general. It's much
better to use Rcpp, Rcpp attributes, and call Rcpp::stop (which
handles this for you), or else implement this 'by hand' and throw a
C++ exception that is caught by your wrapper functions at the top
level.
#
Thanks for the good specific suggestions Kevin and Max. It is amazing 
that great help is available from the R community in such a short time.

Clearly everybody thinks that these issues should be fixed before I even 
try to submit to CRAN (and I fully agree that they shouldn't lower their 
standards, but I just wanted to be sure before throwing myself into the 
unknown land of seriously editing other people C++ code). I will give it 
a go sometime soon and keep your pointers in mind.

Cheers,
Ege