
match gets confused by S4 objects

6 messages · Brian Ripley, Martin Maechler, Seth Falcon +1 more

#
If one accidentally calls match(x, obj), where obj is any S4 instance,
the result is NA.  

I was expecting an error because, in general, if a match method is not
defined for a particular S4 class, I don't know what a reasonable
default could be.  Specifically, here's what I see

setClass("FOO", representation(a="numeric"))
foo <- new("FOO", a=10)
match("a", foo)
[1] NA

And my thinking is that this should be an error, along the lines of
match("a", function(x) x)

Unless, of course, a specific method for match, table="FOO" has been
defined.
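
For illustration, here is a minimal sketch of what such a method could look like. Matching against the `a` slot is only an assumption made for this example, and the `nomatch`/`incomparables` defaults follow the current `base::match` signature rather than anything stated above.

```r
library(methods)

setClass("FOO", representation(a = "numeric"))
foo <- new("FOO", a = 10)

## Turn match() into an S4 generic; base::match stays the default method.
setGeneric("match")

## Illustrative method only: treat the 'a' slot as the lookup table.
setMethod("match", signature(table = "FOO"),
          function(x, table, nomatch = NA_integer_, incomparables = NULL)
              base::match(x, table@a, nomatch = nomatch,
                          incomparables = incomparables))

match(10, foo)   # now looks 10 up in foo@a
```

With a method like this in place the call has a defined meaning for FOO; other tables still go through the default.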

+ seth
#
An S4 object is just a list with attributes, so a vector type.  match() 
works with all vector types including lists, as you found out (or could 
have read).

If in the future those proposing it do re-implement an S4 object as a new 
SEXP then this will change, but for now the cost of detecting objects 
which might have an S4 class defined somewhere is just too high (and would 
fall on those who do not use S4 classes).
On Mon, 6 Feb 2006, Seth Falcon wrote:
#
    BDR> An S4 object is just a list with attributes, so a
    BDR> vector type.  match() works with all vector types
    BDR> including lists, as you found out (or could have read).

yes, the internal representation of S4 objects is such -- seen
from a non-S4 perspective.

    BDR> If in the future those proposing it do re-implement an
    BDR> S4 object as a new SEXP then this will change, but for
    BDR> now the cost of detecting objects which might have an
    BDR> S4 class defined somewhere is just too high (and would
    BDR> fall on those who do not use S4 classes).

Just for further explanation, put into other words and a
slightly changed point of view: 

Yes, many R functions get confused by S4 objects, 
most notably,  c()  (!)

 - because they only look at the "internal representation"

 - and because it's expensive to always ``look twice'';
   particularly from the internal C code.
   There's a relatively simple check from R code which we've
   been using for str() :

   >> if(has.class <- !is.null(cl <- attr(object, "class"))) { # S3 or S4 class
   >>    ## FIXME: a kludge
   >>    S4 <- !is.null(attr(cl, "package")) || cl == "classRepresentation"
   >>    ## better, but needs 'methods':   length(methods::getSlots(cl)) > 0
   >> }

   which --- when only testing for S4-presence --- you could collapse to

      if(!is.null(cl <- attr(object, "class")) &&
         (!is.null(attr(cl, "package")) ||
          cl == "classRepresentation")) {

          ## ... have an S4 object ...

      }

  but note the comment  >>>>   ## FIXME: a kludge   <<<
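
A minimal sketch of the check above packaged as a function, for illustration only. The name `looks_like_S4` is hypothetical, and `any()` is added here (it is not in the str() code) so the test stays length-one when an S3 object carries several class names. Current versions of R also provide `isS4()`, which is the preferable test where available.

```r
library(methods)

## Hypothetical helper wrapping the class-attribute check from str().
looks_like_S4 <- function(object) {
    cl <- attr(object, "class")
    if (is.null(cl)) return(FALSE)          # no class attribute at all
    ## any() keeps the test length-one for multi-element S3 class vectors
    !is.null(attr(cl, "package")) || any(cl == "classRepresentation")
}

setClass("FOO", representation(a = "numeric"))
foo <- new("FOO", a = 10)

looks_like_S4(foo)            # S4 instance
looks_like_S4(data.frame())   # S3 object
looks_like_S4(1)              # plain vector
```

The heuristic relies on the "package" attribute that the methods package attaches to the class attribute of S4 instances, which is why it remains a kludge.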

The solution has been agreed to be changing the internal
representation of S4 objects making them a new SEXP (basic R
"type"); and as Brian alludes to, the problem is that those in
R-core that want to and are able to do this didn't have the time
for that till now.

Martin Maechler, ETH Zurich


#
On 7 Feb 2006, maechler at stat.math.ethz.ch wrote:
The explanations from you and Brian are helpful, thanks.  I was aware
that the issue is the internal representation of S4 objects and was
hoping there might be a cheap work around until a new SEXP comes
around.

It seems that S4 instances are less trivial to detect than one might
expect before actually trying it.  

I suppose one work around is to have an S4Basic class that defines
methods for match(), c(), etc and raises an error.  Then extending
this class gives you some protection.
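
A rough sketch of that idea, covering match() only. The virtual class, the error message, and the FOO class extending it are all assumptions made for illustration, not an existing API.

```r
library(methods)

## Hypothetical virtual base class whose only job is to refuse match().
setClass("S4Basic", representation("VIRTUAL"))

setGeneric("match")
setMethod("match", signature(table = "S4Basic"),
          function(x, table, nomatch = NA_integer_, incomparables = NULL)
              stop("no match() method defined for class ", class(table)[1]))

## Classes extending S4Basic inherit the guard.
setClass("FOO", contains = "S4Basic", representation(a = "numeric"))
foo <- new("FOO", a = 10)

res <- tryCatch(match("a", foo), error = function(e) conditionMessage(e))
res
```

Ordinary vectors are unaffected because base::match remains the default method of the new generic.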

+ seth
#

        
    Seth> On 7 Feb 2006, maechler at stat.math.ethz.ch wrote:
    >> The solution has been agreed to be changing the internal
    >> representation of S4 objects making them a new SEXP (basic R
    >> "type"); and as Brian alludes to, the problem is that those in
    >> R-core that want to and are able to do this didn't have the time
    >> for that till now.

    Seth> The explanations from you and Brian are helpful, thanks.  I was aware
    Seth> that the issue is the internal representation of S4 objects and was
    Seth> hoping there might be a cheap work around until a new SEXP comes
    Seth> around.

    Seth> It seems that S4 instances are less trivial to detect than one might
    Seth> expect before actually trying it.  

    Seth> I suppose one work around is to have an S4Basic class that defines
    Seth> methods for match(), c(), etc and raises an error.  Then extending
    Seth> this class gives you some protection.

well; not so easy for c() !! {see the hoops we had to jump through to do
this for cbind() / rbind() (used in 'Matrix')}.

But it might be interesting; particularly since some have said
they'd expect a considerable performance penalty when all these basic
functions would become S4 generics...

Martin