Destructive str(...)?

7 messages · Simon Urbanek, Luke Tierney, Peter Dalgaard +1 more

#
I have encountered a strange behavior of the str function - it seems to 
modify the object that is displayed. Probably I'm using something 
unsupported (objects consisting just of an external reference), but 
still I'm curious as to why this happens. I create (in C code) an 
EXTPTRSXP and associate a class with it via SET_CLASS. Such objects work 
fine until they are passed to str, as the following output demonstrates:

 > c<-.MCall("RController","getRController")
 > c
[1] "<RController: 0x3be5d0>"
 > str(c)
Class 'ObjCid' length 1 <pointer: 0x3be5d0>
 > c
<pointer: 0x3be5d0>
 > str(c)
length 1 <pointer: 0x3be5d0>

The .MCall basically produces an external reference and assigns a class 
(ObjCid) to it. There's a corresponding print method and it works fine. 
However, when str is called, it strips the class information from the 
object as a repeated call to str also shows:

  > str(c); str(c)
Class 'ObjCid' length 1 <pointer: 0x3be5d0>
length 1 <pointer: 0x3be5d0>

Is this behavior intentional, undocumented or simply wrong?

Cheers,
Simon

[Tested with R 2.0.0 release (2004-10-04) on Mac OS X 10.3.5 - I have 
currently no other machine to test it on, but I very much suspect that 
this is platform-independent.]

The C code used to generate the object:

SEXP class, sref;
/* Protect sref: the allocVector call below may trigger a garbage collection. */
PROTECT(sref = R_MakeExternalPtr((void*) obj, R_NilValue, R_NilValue));
PROTECT(class = allocVector(STRSXP, 1));
SET_STRING_ELT(class, 0, mkChar("ObjCid"));
SET_CLASS(sref, class);
UNPROTECT(2);
#
On Fri, 29 Oct 2004, Simon Urbanek wrote:

            
Yes, and I think it is documented somewhere, but I can't lay my hands on 
it right now.
The issue is almost certainly that something has forgotten/decided not to
either set or respect SET_NAMED on the object, so when str does

	object <- unclass(object)

or some such, the original object gets changed.  Now the `something' has 
to be C code: possibly yours but probably something in R itself.

I think this is intentional.  External references do not get copied, and
the advice I recall is to wrap them in a list for use at R level (and
before setting a class on them).  In RODBC I took another tack, and attach
the reference as an attribute to a `documentation' object.

str() probably ought to be more cautious when it encounters an external 
reference or similar exotic object, since it will look at list elements 
and attributes.

Brian
#
On Sat, 30 Oct 2004, Prof Brian Ripley wrote:

            
It's probably just unclass itself, not an issue with NAMED. External
references are one of a handful of objects that are handled as
references to mutable objects rather than as immutable values (the
main other one being environments).  unclass is destructive when
applied to a reference object.  At some point it might make sense to
make unclass signal an error when used on a reference object, and
clean up the things this breaks, including str and a number of other
print methods.  On the other hand, the same issue exists with all
attributes on reference objects, so the safest approach is to use a
wrapper as Brian suggests.

luke
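
The difference between a bare reference object and a wrapped one can be
sketched in plain C. This is only a toy model under stated assumptions, not
R's implementation: RefObject, Wrapper and all function names below are
hypothetical, and nothing here uses the R API. The bare reference is shared
through a pointer, so an unclass-like operation changes what every holder
sees; the wrapper is passed by value, so the same operation only touches a
copy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference object (e.g. an external pointer):
   every variable holding it points at this one shared record. */
typedef struct {
    void *addr;
    const char *klass;   /* NULL models "no class attribute" */
} RefObject;

/* Hypothetical stand-in for an ordinary value (e.g. a list) that wraps
   the reference and carries the class itself. */
typedef struct {
    RefObject *ref;
    const char *klass;
} Wrapper;

/* Models unclass on a reference object: there is no copy to work on,
   so the shared record is modified in place. */
static RefObject *unclass_reference(RefObject *obj) {
    assert(obj != NULL);
    obj->klass = NULL;
    return obj;
}

/* Models unclass on an ordinary value: the argument is a copy,
   so the caller's object keeps its class. */
static Wrapper unclass_wrapper(Wrapper w) {
    w.klass = NULL;
    return w;
}

/* Returns 1 if the caller's object still has its class afterwards. */
int bare_reference_keeps_class(void) {
    static int payload;
    RefObject obj = { &payload, "ObjCid" };
    RefObject *c = &obj;
    (void) unclass_reference(c);      /* what a str-like function might do */
    return c->klass != NULL;
}

/* Same question for the wrapped variant. */
int wrapped_reference_keeps_class(void) {
    static int payload;
    RefObject obj = { &payload, NULL };
    Wrapper c = { &obj, "ObjCid" };
    Wrapper seen = unclass_wrapper(c);
    return c.klass != NULL && seen.klass == NULL;
}
```

In this model bare_reference_keeps_class() returns 0 (the class is gone
after one call), while wrapped_reference_keeps_class() returns 1, which
mirrors the behavior reported at the top of the thread and the effect of the
suggested wrapper.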
#
Luke Tierney <luke@stat.uiowa.edu> writes:
Argh. I think this means that there is a bug in the tcltk code since
tclObj class objects are exactly external references with a class
attribute. It doesn't seem to have bitten anyone yet, though. Or were
you saying that we should fix str() instead?

Anyways, Tcl objects do provide a rather nice illustration of why
reference objects are non-duplicatable (which is the reason behind
unclass being destructive). They have a finalizer that decrements the
Tcl reference count when the R object is destroyed. To avoid bad
things resulting from decreasing the refcount multiple times,
duplication would require an increment of the reference count, and R
just isn't geared to do that: we'd need to introduce something like an
R_RegisterCDuplicator function.
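
The reference-count argument above can be illustrated with a small plain-C
model. It is a sketch under assumptions, not tcltk or R code: ForeignObj,
Handle and the functions below are hypothetical, and duplicate_with_hook
only plays the role that a duplication hook such as the imagined
R_RegisterCDuplicator would have to play.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted foreign object
   (such as a Tcl object). */
typedef struct {
    int refcount;
} ForeignObj;

/* Hypothetical stand-in for an R-side handle with a finalizer. */
typedef struct {
    ForeignObj *target;
} Handle;

/* Finalizer: gives back the one reference this handle is assumed to hold. */
static void finalize(Handle *h) {
    assert(h != NULL && h->target != NULL);
    h->target->refcount--;
    h->target = NULL;
}

/* Naive duplication: copies the pointer but takes no new reference. */
static Handle duplicate_naive(const Handle *h) {
    Handle copy = { h->target };
    return copy;
}

/* Duplication with a hook: the copy takes its own reference. */
static Handle duplicate_with_hook(const Handle *h) {
    Handle copy = { h->target };
    copy.target->refcount++;
    return copy;
}

/* Final count after both the original and a naive copy are finalized. */
int refcount_after_naive_copy(void) {
    ForeignObj obj = { 1 };            /* one reference, held by handle a */
    Handle a = { &obj };
    Handle b = duplicate_naive(&a);
    finalize(&a);
    finalize(&b);                      /* second decrement of one reference */
    return obj.refcount;
}

/* Final count after both the original and a hooked copy are finalized. */
int refcount_after_hooked_copy(void) {
    ForeignObj obj = { 1 };
    Handle a = { &obj };
    Handle b = duplicate_with_hook(&a);
    finalize(&a);
    finalize(&b);
    return obj.refcount;
}
```

With the naive copy the count ends at -1, i.e. the single reference is
released twice; with the hook it ends at 0. Not copying the handle at all
avoids the problem without needing such a hook.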
#
Thank you all for your replies. I wrapped the reference in a LISTSXP 
and everyone's happy (I know that the docs say one ought to do so right 
away, but I was curious what breaks ;)).
On Oct 30, 2004, at 5:55 PM, Peter Dalgaard wrote:

            
Now, hold on a second - I thought the main point of EXTPTR is that the 
finalizer is called only once, that is when the last instance of the 
reference is disposed of by the gc (no matter how many copies existed 
meanwhile). Am I wrong and/or did I miss something? I did some tests 
which support my view, but one never knows ...

Cheers,
Simon
#
Simon Urbanek <simon.urbanek@math.uni-augsburg.de> writes:
How do you ensure that the finalizer is called once? By *not* copying
the reference object! You can have as many references to it as you
like (i.e. assign it to multiple variables), and the object itself is
not removed until the last reference is gone, but if you modify the
object (most likely by setting attributes, but you might also change
the C pointer payload in a C routine), all "copies" are changed:
<Tcl> 3.14159265359
<Tcl> 3.14159265359
[1] "externalptr"
$class
[1] "tclObj"

$Simon
[1] "Urbanek"
#
Just to be 100% clear, the finalizer is called *at most* once if (as in
tcltk) R_RegisterCFinalizer is called.  If you want it to be called
exactly once, you need to use R_RegisterCFinalizerEx.

The issue is that there may not be a final gc().

BTW, str(x) is destructive here too, so we do need to improve str().
I have code written, but access to svn.r-project.org is down (yet again).
Class 'tclObj' length 1 <pointer: 0x860c3f8>
length 1 <pointer: 0x860c3f8>
On 31 Oct 2004, Peter Dalgaard wrote: