Segfault: .Call and classes with logical slots

6 messages · Torsten Hothorn, John Chambers, Douglas Bates +1 more

#
Hi,

the following example, which creates an object of a class with a logical
slot, segfaults under R-1.9.0 when `gctorture(on = TRUE)' is used:

C code (dummy.c):

#include <Rdefines.h>

SEXP foo() {

    SEXP ans;

    PROTECT(ans = NEW_OBJECT(MAKE_CLASS("test")));
    SET_SLOT(ans, install("lgl"), allocVector(LGLSXP, 1));
    LOGICAL(GET_SLOT(ans, install("lgl")))[0] = TRUE;
    UNPROTECT(1);
    return(ans);
}

R code (dummy.R):

dyn.load("dummy.so")

setClass("test", representation = representation(lgl = "logical"))

a = .Call("foo")
a # OK

gctorture(on = TRUE)
a = .Call("foo")
gctorture(on = FALSE)
a # segfault

which gives

R>
R>
R> dyn.load("dummy.so")
R>
R> setClass("test", representation = representation(lgl = "logical"))
[1] "test"
R>
R> a = .Call("foo")
R> a
An object of class "test"
Slot "lgl":
[1] TRUE

R>
R> gctorture(on = TRUE)
R> a = .Call("foo")
R> gctorture(on = FALSE)
Segmentation fault

Best,

Torsten

R> version
         _
platform i686-pc-linux-gnu
arch     i686
os       linux-gnu
system   i686, linux-gnu
status
major    1
minor    9.0
year     2004
month    04
day      12
language R
R>



 _______________________________________________________________________
|									|
|	Dr. rer. nat. Torsten Hothorn					|
|	Institut fuer Medizininformatik, Biometrie und Epidemiologie	|
|	Waldstrasse 6, D-91054 Erlangen, Deutschland			|
|	Tel: ++49-9131-85-22707	(dienstl.)				|
|	Fax: ++49-9131-85-25740						|
|	Email:  Torsten.Hothorn@rzmail.uni-erlangen.de			|
|       	PLEASE send emails cc to torsten@hothorn.de		|
|	Web: http://www.imbe.med.uni-erlangen.de/~hothorn		|
|_______________________________________________________________________|
#
I think you need to PROTECT the vector you're putting in the slot as
well as the overall object.  At any rate, the problem goes away for me
with the revised version of dummy.c below.  (Empirically, PROTECT'ing
the class definition didn't seem to be needed, but experience suggests
that too much protection is better than too little.)

#include <Rdefines.h>

SEXP foo() {

    SEXP ans, cl, el;

    PROTECT(cl = MAKE_CLASS("test"));
    PROTECT(ans = NEW_OBJECT(cl));
    PROTECT(el = allocVector(LGLSXP, 1));
    SET_SLOT(ans, install("lgl"), el);
    LOGICAL(GET_SLOT(ans, install("lgl")))[0] = TRUE;
    UNPROTECT(3);
    return(ans);
}
#
On Mon, 26 Apr 2004, John Chambers wrote:

            
Yes, and it seems that PROTECT'ing the logical variable is sufficient,
while PROTECT'ing the class but not the logical variable causes a
segfault again. I tried with numeric slots too: no problems.
I had tried to save (UN)PROTECT calls for efficiency reasons. Anyway,
this helps me a lot, thanks!

Torsten
#
torsten@hothorn.de writes:
Perhaps this example is an indication that gctorture is too
aggressive.  I use constructions like

   PROTECT(ans = ...);

   SET_SLOT(ans, install("lgl"), allocVector(LGLSXP,1));
   LOGICAL(GET_SLOT(ans, install("lgl")))[0] = TRUE;

in many places in my code, having been assured by a usually reliable
source (Luke) that SET_SLOT applied to a freshly allocated vector
would be atomic with respect to garbage collection.  That is, under
the usual conditions there would be no chance of a garbage
collection being triggered between the allocVector and SET_SLOT
operations.  It may be that gctorture is causing a garbage collection
at a place where it otherwise could not occur and the additional
(UN)PROTECT are redundant except when gctorture is active.

In trying to avoid (UN)PROTECT calls I'm not as concerned about
efficiency as I am about clarity of the code.  I would prefer not to
clutter the code with (UN)PROTECT calls if they are known to be
redundant.

At the time we discussed this Luke suggested that we document a set of
C calls that are atomic with respect to garbage collection.  I think
this would be a good idea but I suspect that no one has the time to do
it right now.

  
    
#
On 27 Apr 2004, Douglas Bates wrote:

            
It can't be--it only forces gc in places that _could_ result in a gc
without gctorture; it will not result in a gc in places that otherwise
could not.  The result may be a gc in places that otherwise would be
very unlikely to cause one but not in places that could not.

There are two different scenarios.  Some things are guaranteed not to
allocate, for example low level operations like SET_VECTOR_ELT.
Others, like setAttrib, do allocate, but when they do they protect
(some of) their arguments.  So code that uses setAttrib does not need
to protect the arguments to the call (at least the value one--I'd have
to double check the others).  Other variables that are alive before
and after the setAttrib call will need to be protected.

Since slots are stored in attributes we can at most hope for the
second behavior.  But we do not have it.  SET_SLOT is a macro that
expands to R_do_slot_assign, which starts out as

SEXP R_do_slot_assign(SEXP obj, SEXP name, SEXP value) {
    SEXP input = name; int nprotect = 0;
    if(isSymbol(name) ) {
	input = PROTECT(allocVector(STRSXP, 1)); nprotect++; /******/
	SET_STRING_ELT(input, 0, PRINTNAME(name));
    }
    else if(!(isString(name) && LENGTH(name) == 1))
	error("invalid type or length for slot name");
    ...

The actual assignment uses setAttrib, which does operate in a way that
protects the value being assigned, but the unprotected allocation at
/******/ happens before we get there.

So unless we modify SET_SLOT to protect the value argument (and the
others as well to be safe), the value needs to be protected (as do any
other objects that might be needed after the call).
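
The failure mode described above can be reproduced in a small toy model.
This is only a sketch in plain C, not R's memory manager: the heap, the
protect stack, and every name in it (Obj, alloc_obj, slot_assign,
build_object, torture) are made up for illustration.  The one property it
borrows from the excerpt is that the slot assignment allocates before it
attaches the value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only; none of these names belong to R's real API. */
#define HEAP_SIZE  8
#define STACK_SIZE 8

typedef struct Obj {
    int in_use;        /* cell currently holds an object */
    int marked;        /* scratch flag used while collecting */
    int value;         /* payload, standing in for LOGICAL(x)[0] */
    struct Obj *attr;  /* single attribute, standing in for a slot */
} Obj;

static Obj  heap[HEAP_SIZE];
static Obj *protect_stack[STACK_SIZE];
static int  n_protected = 0;
static int  torture = 0;   /* nonzero: collect at every allocation */

static void protect(Obj *o) {
    assert(n_protected < STACK_SIZE);
    protect_stack[n_protected++] = o;
}

static void unprotect(int n) {
    assert(n <= n_protected);
    n_protected -= n;
}

/* Free every cell not reachable from the protect stack. */
static void collect(void) {
    for (int i = 0; i < HEAP_SIZE; i++) heap[i].marked = 0;
    for (int i = 0; i < n_protected; i++)
        for (Obj *o = protect_stack[i]; o != NULL && !o->marked; o = o->attr)
            o->marked = 1;
    for (int i = 0; i < HEAP_SIZE; i++)
        if (!heap[i].marked) heap[i].in_use = 0;
}

/* Allocation is the only place a collection can start. */
static Obj *alloc_obj(void) {
    if (torture) collect();
    for (int pass = 0; pass < 2; pass++) {
        for (int i = 0; i < HEAP_SIZE; i++) {
            if (!heap[i].in_use) {
                heap[i] = (Obj){ .in_use = 1 };  /* fresh, zeroed cell */
                return &heap[i];
            }
        }
        collect();  /* heap full: the rare, ordinary collection */
    }
    assert(!"toy heap exhausted");
    return NULL;
}

/* Same shape as the excerpt: allocate first, attach the value later. */
static void slot_assign(Obj *obj, Obj *value) {
    Obj *name = alloc_obj();  /* stands in for allocVector(STRSXP, 1) */
    (void)name;
    obj->attr = value;        /* value is reachable only from here on */
}

/* Returns 1 if the slot still holds the value 1 after assignment. */
int build_object(int protect_value, int torture_on) {
    memset(heap, 0, sizeof heap);
    n_protected = 0;
    torture = torture_on;

    Obj *ans = alloc_obj();
    protect(ans);
    Obj *el = alloc_obj();
    if (protect_value) protect(el);
    el->value = 1;
    slot_assign(ans, el);
    int ok = (ans->attr->value == 1);
    unprotect(protect_value ? 2 : 1);
    return ok;
}
```

In this model build_object(0, 0) happens to work because no collection
runs, build_object(0, 1) loses the value because the allocation inside
slot_assign frees and reuses its cell, and build_object(1, 1) keeps it.
The torture flag does not add a new collection point; it only makes the
existing one fire every time.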

Best,

luke
#
I was about to comment with some of the same points Luke makes here. 
It's hard to see how gctorture could be less aggressive and still
guarantee to find problems.  Yes, some of the problems look unlikely,
but that's partly what makes them so insidious.

Two additional small points, one of detail, one about style.

1. A slight expansion on SET_SLOT.  There are two situations:  ordinary
slots and the ".Data" slot, a way to set the "data part" of an object.

The allocation for ordinary slots is trivial and maybe there is a way to
avoid it.  If the name argument is a symbol (as it usually is), SET_SLOT
allocates a corresponding character vector, because that's what
setAttrib wants.  Seems like each symbol could have a corresponding
object of this form (maybe there is one already?) to avoid allocation in
this case.

The .Data case involves much more code.  In this special case, should
SET_SLOT PROTECT the value argument?  or for that matter, would it be a
serious overhead for SET_SLOT to PROTECT the value argument always?

2.  On the other hand, this style of example might be characterized as
"hand compiling" S language code.  Many of us have had the experience
that such hand compiling is very error prone.  I know it's sometimes
strongly motivated, but it's likely to be an unpleasant experience.

If it's at all possible, the long-standing advice applies:  Try to
pre-allocate the data needed and keep the C code to dealing with
existing objects (e.g., DATAPTR() pointers); in particular, so the heavy
C code uses only pointers to data, not R objects.

And hopefully real compiling will eventually relieve us of some of the
need.
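
The pre-allocation advice above might look roughly like the following.
This is a minimal sketch, assuming the routine is called through the .C
interface, where a logical vector arrives as int * and a numeric vector as
double *; the name mark_positive and its arguments are hypothetical.  The
C code never allocates an R object, so there is nothing to PROTECT.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical routine for the .C interface: R allocates every vector,
   C only reads and writes through the pointers it is handed.
   x    : numeric input of length *n
   flag : pre-allocated logical output of length *n (an int array in C)
   Missing values are not treated specially in this sketch. */
void mark_positive(const double *x, int *flag, const int *n) {
    assert(x != NULL && flag != NULL && n != NULL);
    for (int i = 0; i < *n; i++)
        flag[i] = (x[i] > 0.0);   /* 1 for TRUE, 0 for FALSE */
}
```

On the R side the call would be something like
.C("mark_positive", as.double(x), flag = logical(length(x)), length(x))$flag,
and the object itself can then be built in R with
new("test", lgl = ...) rather than with NEW_OBJECT and SET_SLOT in C.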

John