
Any plans for ALTREP lists (VECSXP)?

1 message · Bemis, Kylie

Thanks for the suggestions, everyone.

It is not a pressing issue requiring alternatives, since the 'matter_list' object already behaves like a list; I am just looking for a way to present a native R list (VECSXP) when a regular list is required.

In my typical use case, the 'matter_list' is homogeneous and I use it like a ragged array; however, in general each element could be a different atomic vector type (specifically raw, logical, integer, or double).

Here, as.altrep() is an S4 method for converting my custom 'matter'-class out-of-memory objects into their native R representations using ALTREP.

It seems to work well for the 'matter' vectors, matrices, and arrays, where it just .Call()s my C function for making the corresponding ALTREP object. The lists were giving me trouble because there I use lapply() to extract and uncompress the 'matter_list' metadata for each list element into a separate S4 'matter_vec' out-of-memory vector, each of which is then used to create an ALTREP object for the corresponding list element. So it gets costly...

The cost is mostly in re-creating all of the metadata as regular R objects that end up occupying the R_altrep_data1() spot for all of the individual list elements. If I could make an ALTREP list, I could leave the metadata as-is and avoid all of that.

Anyway, not a pressing issue for me either, just something I noticed where having an ALTREP list could be useful, so I was wondering if it was in the plans, which Luke answered.

Thanks,

-Kylie
On Jul 23, 2019, at 8:27 PM, Gabriel Becker <gabembecker at gmail.com> wrote:
Hi Kylie,

Is it a list with only numerics in it? (I only see REALSXPs there, but obviously inspect isn't showing all of them). If so, you could load it up into one big vector and then also keep partitioning information around. Bioconductor does this (see ?IRanges::CompressedList ). The potential benefit here being that the underlying large vector could then be a big out-of-memory altrep. How helpful this would be depends somewhat on what you want to do with it, of course, but it is something that comes to mind.
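To make the "one big vector plus partitioning information" idea concrete, here is a minimal sketch in plain C, independent of R and of the actual CompressedList implementation. The `ragged_list` type and its helper functions are hypothetical names invented for illustration; the only assumption is that every element is a double vector, stored back to back in a single buffer with an offsets array marking the boundaries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flat layout for a ragged list of doubles:
   element i occupies data[offsets[i]] .. data[offsets[i + 1] - 1]. */
typedef struct {
    const double *data;     /* all elements concatenated */
    const size_t *offsets;  /* n + 1 entries, offsets[0] == 0 */
    size_t n;               /* number of list elements */
} ragged_list;

/* Length of element i, derived from adjacent offsets. */
static size_t ragged_length(const ragged_list *x, size_t i)
{
    assert(i < x->n);
    return x->offsets[i + 1] - x->offsets[i];
}

/* Pointer to the first value of element i (no copy is made). */
static const double *ragged_element(const ragged_list *x, size_t i)
{
    assert(i < x->n);
    return x->data + x->offsets[i];
}

/* Sum of element i, as an example of per-element work. */
static double ragged_sum(const ragged_list *x, size_t i)
{
    const double *p = ragged_element(x, i);
    size_t len = ragged_length(x, i);
    double s = 0.0;
    for (size_t k = 0; k < len; k++)
        s += p[k];
    return s;
}
```

With this layout only the single `data` buffer would need to be backed by an out-of-memory vector; the per-element views are just pointer arithmetic on the offsets.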

Also, I would expect some overhead but that seems like a lot (without having done super much in the way of benchmarking). What exactly is as.altrep doing?

Best,
~G
On Tue, Jul 23, 2019 at 9:54 AM Michael Lawrence via R-devel <r-devel at r-project.org> wrote:
Hi Kylie,

As an alternative in the short term, you could consider deriving from
S4Vectors' List class, implementing the getListElement() method to
lazily create the objects.
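The lazy-creation pattern suggested here can be sketched in plain C, outside of R and S4Vectors. Everything in the block is hypothetical and invented for illustration: `lazy_list`, `lazy_get()`, and in particular `load_element()`, which fabricates values where a real implementation would read from out-of-memory storage. The point is only that an element is materialized on first access and reused afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical loader: fills out[0..len-1] with element i's values.
   It fabricates values in place of a real out-of-memory read. */
static void load_element(size_t i, double *out, size_t len)
{
    for (size_t k = 0; k < len; k++)
        out[k] = (double)(i * 100 + k);
}

typedef struct {
    size_t n;               /* number of list elements */
    const size_t *lengths;  /* length of each element */
    double **cache;         /* cache[i] is NULL until element i is requested */
    size_t loads;           /* how many elements have been materialized */
} lazy_list;

/* Set up an empty cache; returns 1 on success, 0 on allocation failure. */
static int lazy_init(lazy_list *x, size_t n, const size_t *lengths)
{
    assert(n > 0);
    x->n = n;
    x->lengths = lengths;
    x->loads = 0;
    x->cache = malloc(n * sizeof *x->cache);
    if (x->cache == NULL)
        return 0;
    for (size_t i = 0; i < n; i++)
        x->cache[i] = NULL;
    return 1;
}

/* Return element i, materializing it only on first access. */
static const double *lazy_get(lazy_list *x, size_t i)
{
    assert(i < x->n);
    if (x->cache[i] == NULL) {
        size_t len = x->lengths[i];
        /* Allocate at least one slot so zero-length elements are cached too. */
        double *buf = malloc((len ? len : 1) * sizeof *buf);
        if (buf == NULL)
            return NULL;
        load_element(i, buf, len);
        x->cache[i] = buf;
        x->loads++;
    }
    return x->cache[i];
}

/* Release every materialized element and the cache itself. */
static void lazy_free(lazy_list *x)
{
    for (size_t i = 0; i < x->n; i++)
        free(x->cache[i]);
    free(x->cache);
    x->cache = NULL;
}
```

Elements that are never requested cost nothing beyond one NULL pointer, which is the saving a lazy getListElement() method would aim for.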

Michael
On Tue, Jul 23, 2019 at 9:09 AM Bemis, Kylie <k.bemis at northeastern.edu> wrote:
--
Michael Lawrence
Scientist, Bioinformatics and Computational Biology
Genentech, A Member of the Roche Group
Office +1 (650) 225-7760
michafla at gene.com


______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel