undefined symbol errors when compiling package using ALTREP API

7 messages · Luke Tierney, Gabriel Becker, Mark Klik

#
Hello,

I'm developing a package (lazyvec) that makes full use of the ALTREP
framework (R >= 3.6.0).
One application of the package is to wrap existing ALTREP vectors in a new
ALTREP vector and pass all calls from R to the contained object. The
purpose of this is to provide a diagnostic framework for working with
ALTREP vectors and show information about internal calls.

The package builds on Windows and OSX but fails to build on Linux as can be
seen from the link to the Travis build:
https://travis-ci.org/fstpackage/lazyvec/jobs/539442806

The reason for the build failure is that many ALTREP methods generate
'undefined symbol' errors when the package is built on Linux. I've checked
the R source code and the undefined symbols seem to be related to the
'attribute_hidden' before the function definition. For example, the method
'ALTVEC_EXTRACT_SUBSET' is defined as:

SEXP attribute_hidden ALTVEC_EXTRACT_SUBSET(SEXP x, SEXP indx, SEXP call)

My question is why these differences between Windows / OSX and Linux exist
and whether they are intentional.
Do I need special build parameters to make sure my package builds correctly
on Linux?

thanks for all the hard work!

best,
Mark

PS: some additional info:

package github repository: https://github.com/fstpackage/lazyvec
AppVeyor package build logs:
https://ci.appveyor.com/project/fstpackage/lazyvec
Travis package build logs: https://travis-ci.org/fstpackage/lazyvec/builds
#
On Tue, 4 Jun 2019, Mark Klik wrote:

            
It is intentional that this not be part of the public API. This is
true of almost all functions with an ALTREP prefix. You need a
different approach that avoids using these directly.
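
The effect of a hidden-visibility attribute can be sketched in plain C. This is only an illustration, assuming that 'attribute_hidden' maps to the GCC/Clang visibility attribute on platforms that support it; the macro 'demo_hidden' and both function names are hypothetical and are not part of R.

```c
#include <assert.h>

/* Illustrative macro (hypothetical name). It is assumed to mirror what
   attribute_hidden expands to on compilers that support ELF symbol
   visibility; elsewhere it expands to nothing. */
#if defined(__GNUC__) && !defined(_WIN32)
# define demo_hidden __attribute__((visibility("hidden")))
#else
# define demo_hidden
#endif

/* Internal helper: callable from code linked into the same library, but
   left out of the shared library's dynamic symbol table, so a separately
   built shared object that references it fails with 'undefined symbol'. */
int demo_hidden demo_internal_helper(int x)
{
    return x * 2;
}

/* Exported entry point: the supported way for outside code to reach the
   helper's behaviour. */
int demo_public_api(int x)
{
    assert(x >= 0);
    return demo_internal_helper(x) + 1;
}
```

A package shared object corresponds to the "separately built" case: it can link against 'demo_public_api' but not against 'demo_internal_helper'.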

Best,

luke

  
    
#
thanks for clearing that up. So these methods are actually not meant to be
exported on Windows and OSX either?
Some of the ALTREP methods that now use 'attribute_hidden' would be very
useful to packages that aim to be ALTREP aware. Should the currently
exported API be considered final?

thanks for your time & best,
Mark
On Tue, Jun 4, 2019 at 6:52 PM Tierney, Luke <luke-tierney at uiowa.edu> wrote:

            

  
  
#
Hi Mark,

So depending pretty strongly on what you mean by "ALTREP aware", packages
aren't necessarily supposed to be ALTREP aware. What I mean by this is that
as of right now, ALTREP objects are designed to be interacted with by
non-ALTREP-implementing package code *more or less* exactly as standard
(non-ALTREP) SEXPs are: via the published C API. The "more or less" comes
from the fact that in some cases, doing things that are good ideas on
standard SEXPs will work, but may not be a good idea for ALTREPs.

The most "low-hanging-fruit" example of something that was best practice
for standard vectors but is not a good idea for ALTREP vectors is grabbing
a DATAPTR and iterating over the values without modification in a tight
loop. This will work (absent allocation failure or, I suppose, the ALTREP
being specifically designed to refuse to give you a full DATAPTR), but with
ALTREP in place it's no longer what you want to do.

That said, you don't want to check whether something is an ALTREP yourself
and branch your code. What you want to do is use the ITERATE_BY_REGION
macro in R_ext/Itermacros.h for ALL SEXPs, which will be nearly as fast for
standard vectors and work safely for ALTREP vectors.

Basically any time you find yourself wanting to check if something is an
ALTREP and if so, call a specific ALT*_BLAH method, the intention is that
there should be a universal API point you can call which will work for both
types.

This is true, e.g., of INTEGER_IS_SORTED (which will always work and just
returns UNKNOWN_SORTEDNESS, i.e. INT_MIN, i.e. NA_INTEGER, for non-ALTREPs),
of REAL_GET_REGION (which populates a double* with the requested values
for both standard and ALTREP REALSXPs), etc.
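
The region-based access pattern described above can be modelled in plain C without R headers. This is a sketch under stated assumptions: 'lazy_seq', 'lazy_seq_get_region' and 'lazy_seq_sum' are hypothetical stand-ins that only imitate the shape of a GET_REGION-style accessor; they are not R's types or API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a vector whose values are produced on demand:
   first, first + 1, first + 2, ... (this is NOT R's SEXP type). */
typedef struct {
    size_t length;
    int first;
} lazy_seq;

/* Model of a GET_REGION-style accessor: copy up to n values starting at
   index i into buf and return how many values were actually copied. */
static size_t lazy_seq_get_region(const lazy_seq *x, size_t i, size_t n,
                                  int *buf)
{
    assert(x != NULL && buf != NULL);
    if (i >= x->length) return 0;
    size_t avail = x->length - i;
    size_t count = n < avail ? n : avail;
    for (size_t k = 0; k < count; k++)
        buf[k] = x->first + (int)(i + k);
    return count;
}

/* Sum all elements while holding at most CHUNK values in memory at once,
   instead of asking for one pointer to the fully expanded data. */
#define CHUNK 512
static long long lazy_seq_sum(const lazy_seq *x)
{
    int buf[CHUNK];
    long long total = 0;
    size_t i = 0, got;
    while ((got = lazy_seq_get_region(x, i, CHUNK, buf)) > 0) {
        for (size_t k = 0; k < got; k++)
            total += buf[k];
        i += got;
    }
    return total;
}
```

The caller never branches on how the vector is stored; the same loop would work for a vector backed by ordinary memory, where the accessor would simply copy from the underlying array.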

Does the above make sense?

If you feel a universal API point is missing, you can raise that here,
though I can't promise that will ultimately result in the method being
added.

Best,
~G
On Tue, Jun 4, 2019 at 2:22 PM Mark Klik <markklik at gmail.com> wrote:

            

  
  
#
Hi Gabriel,

thanks for your detailed explanation, that definitely clarifies the design
choices that were made in setting up the ALTREP framework and I can see how
those choices make sure existing code won't break.

My specific use-case for wanting to check whether a vector is an ALTREP is
the following: the fst package wraps an external C++ library (fstlib,
independent from R) that was made for high speed serialization of
data frames. Sequences are fairly common in data frames and I'm planning to
add the concept of a sequence to the (R-agnostic) fst format. If I can
detect, e.g., a 'compact_intseq' ALTREP vector and just retrieve its
3-integer internal representation, serialization could be very fast.
Alternatively, as you describe, the vector needs to be expanded first
before serialization, which will actually be slower than using an already
expanded vector and can take a lot of RAM for large datasets.

So being able to make use of the internal representation of (a few of the)
base ALTREP vectors can be very interesting for (non-R) serialization
schemes.
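
To make the idea concrete, here is a minimal sketch of storing a sequence as a fixed-size record rather than as expanded data. Everything in it is an assumption for illustration: the 'seq_desc' layout and the 'seq_write' / 'seq_read' helpers are hypothetical, and they describe neither the actual fst format nor R's internal 'compact_intseq' layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-value sequence descriptor. */
typedef struct {
    long long length;
    long long first;
    long long step;   /* +1 or -1 */
} seq_desc;

/* Serialize the descriptor as three values. The cost is constant and
   independent of the sequence length. Returns the number of values written. */
static size_t seq_write(const seq_desc *d, long long out[3])
{
    assert(d != NULL && out != NULL);
    out[0] = d->length;
    out[1] = d->first;
    out[2] = d->step;
    return 3;
}

/* Expand a stored record into dst, which has room for cap ints.
   Returns the number of elements produced, or 0 if the record is invalid
   or does not fit. */
static size_t seq_read(const long long in[3], int *dst, size_t cap)
{
    assert(in != NULL && dst != NULL);
    long long length = in[0], first = in[1], step = in[2];
    if (length < 0 || (unsigned long long)length > cap) return 0;
    if (step != 1 && step != -1) return 0;
    for (long long i = 0; i < length; i++)
        dst[i] = (int)(first + step * i);
    return (size_t)length;
}
```

With such a record, the writer never needs the expanded vector; only the reader materializes it, and only if the consumer asks for the full data.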

thanks for your time!
Mark


On Tue, Jun 4, 2019 at 11:50 PM Gabriel Becker <gabembecker at gmail.com>
wrote:

  
  
#
For now you can use

R_altrep_inherits(x, R_compact_intseq_class)

The variable R_compact_intseq_class should currently be visible to
packages on all platforms, though that may change if we eventually
provide a string-based lookup mechanism, e.g. something like

R_find_altrep_class("compact_intseq", "base")

Best,

luke
On Tue, 4 Jun 2019, Mark Klik wrote:

            

  
    
#
thanks Luke, I can work with that and will watch out for changes and new
developments in the ALTREP code with great interest.

all the best,
Mark
On Wed, Jun 5, 2019 at 6:02 PM Tierney, Luke <luke-tierney at uiowa.edu> wrote: