
removeSource() vs. function literals

4 messages · Ivan Krylov, Duncan Murdoch, Lionel Henry +1 more

#
Dear R-devel,

In a package of mine, I use removeSource on expression objects in order
to make expressions that are semantically the same serialize to the
same byte sequences:
https://github.com/cran/depcache/blob/854d68a/R/fixup.R#L8-L34

Today I learned that expressions containing function definitions also
contain the source references for the functions, not as an attribute,
but as a separate argument to the `function` call:

str(quote(function() NULL)[[4]])
# 'srcref' int [1:8] 1 11 1 25 11 25 1 1
# - attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile'
#   <environment:0x55aba55a8a50>

This means that removeSource() on an expression that would define a
function when evaluated doesn't actually remove the source reference
from the object.
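
The layout described above can be checked without an interactive session by parsing with keep.source = TRUE explicitly. This is only a small sketch; the variable name e is illustrative.

```r
# Parse with source references kept, as an interactive session does by default.
e <- parse(text = "function() NULL", keep.source = TRUE)[[1]]

length(e)                   # 4: `function`, formals, body, srcref
inherits(e[[4]], "srcref")  # TRUE
# The srcref is an element of the call itself, not an attribute of the call.
is.null(attr(e, "srcref"))  # TRUE
```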

Do you think it would be appropriate to teach removeSource() to remove
such source references? What could be a good way to implement that?
if (is.call(fn) && identical(fn[[1]], as.name('function'))) fn[[4]] <- NULL
sounds too arbitrary. if (inherits(fn, 'srcref')) return(NULL) sounds
too broad.
#
On 30/03/2023 10:32 a.m., Ivan Krylov wrote:
I don't think there's a simple way to do that.  Functions can define 
functions within themselves.  If you're talking about code that was 
constructed by messing with language objects, it could contain both 
function objects and calls to `function` to construct them.  You'd need 
to recurse through all expressions in the object.  Some of those 
expressions might be environments, so your changes could leak out of the 
function you're working on.
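
A rough sketch of the recursion described above, limited to calls and expression vectors. The helper name strip_function_srcref is hypothetical, not part of base R. It deliberately does not descend into function objects, environments, or default values stored in formals, so it cannot leak changes outside the object it is given, but it also does not cover every case mentioned here.

```r
# Hypothetical helper: blank out the srcref stored as the 4th element of
# calls to `function`, recursing through calls and expression vectors only.
strip_function_srcref <- function(x) {
  if (is.name(x)) return(x)  # symbols, including the empty "missing argument" symbol
  if (is.call(x) && identical(x[[1]], quote(`function`)) &&
      length(x) >= 4L && inherits(x[[4]], "srcref")) {
    x[4] <- list(NULL)       # keep the slot, drop the srcref
  }
  if (is.call(x) || is.expression(x)) {
    for (i in seq_along(x)) {
      x[i] <- list(strip_function_srcref(x[[i]]))
    }
  }
  x
}

e <- parse(text = "f <- function(a) function(b) a + b", keep.source = TRUE)[[1]]
s <- strip_function_srcref(e)

inherits(e[[3]][[4]], "srcref")  # TRUE: the outer definition carries a srcref
is.null(s[[3]][[4]])             # TRUE: removed from the outer definition
is.null(s[[3]][[3]][[4]])        # TRUE: removed from the nested definition
```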

Things are simpler if you know the expression is the unmodified result 
of parsing source code, but if you know that, wouldn't you usually be 
able to control things by setting keep.source = FALSE?

Maybe a workable solution is something like parse(text = deparse(expr,
control = "exact"), keep.source = FALSE).  Wouldn't work on environments or
various exotic types, but would probably warn you if it wasn't working.
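
A minimal sketch of that round trip, assuming R >= 4.0.0 (where the "exact" deparse option is available) and passing the deparsed code through parse()'s text argument. As far as I can tell, the "exact" option wraps a language object in quote(), so the parsed result is evaluated once to get the call back. The example avoids braces on purpose: `{` calls carry extra source attributes that I have not checked against this approach.

```r
expr <- parse(text = "f <- function(x) x + 1", keep.source = TRUE)[[1]]

txt <- deparse(expr, control = "exact")
# Evaluating the parsed text only unwraps the quote(); it does not run `f <- ...`.
clean <- eval(parse(text = txt, keep.source = FALSE)[[1]])

inherits(expr[[3]][[4]], "srcref")  # TRUE: the original still carries the srcref
is.call(clean)                      # TRUE: we got a call back
```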

Duncan Murdoch
#
If you can afford a dependency on rlang, `rlang::zap_srcref()` deals
with this. It's recursive over expression vectors, calls (including
calls to `function` and their hidden srcref arg), and function
objects. It's implemented in C for efficiency as we found it to be a
bottleneck in some applications (IIRC caching). I'd be happy to
upstream this in base if R core is interested.
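
For reference, a short usage sketch, guarded so that it only runs when rlang is installed. It checks only that no srcref object is left in the `function` call; how zap_srcref() represents the emptied slot is not assumed here.

```r
expr <- parse(text = "f <- function(x) x + 1", keep.source = TRUE)[[1]]

if (requireNamespace("rlang", quietly = TRUE)) {
  clean <- rlang::zap_srcref(expr)
  fn_call <- clean[[3]]  # the call to `function`
  # Either the 4th slot is gone or it no longer holds a srcref.
  no_srcref <- length(fn_call) < 4L || !inherits(fn_call[[4]], "srcref")
  print(no_srcref)
}
```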

Best,
Lionel
On 3/30/23, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
#
On 3/31/23 08:49, Lionel Henry via R-devel wrote:
That would be very helpful. When having to implement caching, I have 
been hit by this issue several times in the past, too (before 
rlang::zap_srcref() existed).

Regards,
Denes