
True length - length(unclass(x)) - without having to call unclass()?

On 09/03/2018 03:59 PM, Dénes Tóth wrote:
Hi Denes,
indeed, and it is not your fault, but the function is cheating. That it is 
in a widely used package, and even exported from it, does not make it any 
safer. The related optimization in base R (shallow copying) mentioned in 
the documentation of data.table::setattr is, on the other hand, sound: it 
does not break the semantics.
Extreme care is not enough, as the internals can and do change (and within 
the limits given by the documentation, they are likely to change soon with 
respect to NAMED/reference counting), not to mention that they are very 
complicated. The approach of "modify in place because we know the 
reference count is 0" is both error prone and unnecessary. It is 
unnecessary because there is a documented C API for legitimate use in 
packages to find out whether an object may be referenced/shared 
(it indirectly checks the reference count). If the object is not shared, 
it can be modified in place without cheating, and some packages do that. 
It is error prone because the reference count can change due to many 
things package developers cannot be expected to know (and again, these 
things change). In set* functions, for example, it will never be 0 (!), 
so these functions, with their current API, can never be implemented in 
current R without breaking the semantics.
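
To make "the semantics" concrete: what must be preserved is R's value semantics, where modifying one variable never changes another variable that was assigned from it. The following is only a minimal base R sketch with illustrative variable names, not code from any package:

```r
x <- c(a = 1, b = 2)
y <- x                    # x and y may share one underlying vector
names(y) <- c("p", "q")   # R copies first if the object is shared
names(x)                  # still "a" "b"
names(y)                  # "p" "q"
```

A function that changed the attributes of y in place without checking whether the object is shared would change x as well, which is the breakage described above.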

In principle, one can do similar things legitimately by wrapping objects 
in an environment, passing that environment around (environments can 
legitimately be modified in place), checking that the contained objects 
have a reference count of 1 (not shared), and if so, modifying them in 
place. But indeed, as soon as such objects become shared, there is no way 
out: one has to copy (in current R).
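
At the R level, the environment-wrapping idea might look roughly like the sketch below. The helpers new_box() and set_first() are hypothetical names made up for illustration. In pure R code the check for sharing is left to R itself: it modifies the contained vector in place when it can and copies when it must, so the semantics are preserved either way.

```r
# Hypothetical helper: wrap a value in an environment.
new_box <- function(value) {
  box <- new.env(parent = emptyenv())
  box$value <- value
  box
}

# Hypothetical helper: the environment is modified in place, so the
# caller sees the change without reassigning anything.
set_first <- function(box, new_value) {
  box$value[1L] <- new_value
  invisible(box)
}

b <- new_box(c(1, 2, 3))
set_first(b, 10)
b$value                # 10 2 3

snapshot <- b$value    # the contained vector is now shared
set_first(b, 99)       # so R has to copy before modifying
snapshot               # 10 2 3 (unchanged)
b$value                # 99 2 3
```

The last four lines illustrate the "no way out" case: once the contained object is shared, the modification goes to a copy and the other reference keeps its old value.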

Best
Tomas