[R-pkg-devel] S3 length method behavior

4 messages · Hadley Wickham, Barry Rowlingson, Nathan Wendt +1 more

#
I've found that it's a very bad idea to provide length or names
methods for just this reason.
Hadley
On Sat, Jan 30, 2016 at 1:25 PM, Nathan Wendt <nawendt84 at gmail.com> wrote:
#
On Tue, Feb 2, 2016 at 3:28 PM, Hadley Wickham <h.wickham at gmail.com> wrote:
Defining a str on your class will at least fix the out of bounds error:

Create a trivial S3 class:

 > z=list(1,22)
 > class(z)="foo"

length method looks at the second element:

 > length.foo=function(x){x[[2]]}
 > length(z)
[1] 22

 and str barfs:

 > str(z)
List of 22
 $ : num 1
 $ : num 22
 $ :Error in object[[i]] : subscript out of bounds

Define a str method:

 > str.foo=function(object,...){for(i in 1:length(unclass(object))){str(unclass(object[[i]]))}}
 > str(z)
 num 1
 num 22
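
A possibly simpler variant of the str method above, offered only as a sketch: drop the class attribute once and let the default method walk the underlying list. It assumes the object is an ordinary list underneath, as z is here, and that showing the raw structure is acceptable.

```r
# Same trivial S3 class as above, with the misleading length method
z <- list(1, 22)
class(z) <- "foo"
length.foo <- function(x) x[[2]]

# Sketch: strip the class so that length() and [[ ]] fall back to the
# built-in list behaviour, then delegate to the default str()
str.foo <- function(object, ...) {
  str(unclass(object), ...)
  invisible(NULL)
}

str(z)  # describes the two underlying elements, no out-of-bounds error
```

This avoids the explicit loop, at the cost of hiding the class from the output.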

BUT... the real problem is that S3 classes are seriously informal and
there's no concept of what methods you need to define on a class
because there's no concept of an "interface" that new classes have to
conform to. So stuff breaks, seemingly at random, and via action at a
distance. Somewhere something is going to expect z[[1]] to
z[[length(z)]] to exist, which is what the default str is doing...

+1 on Hadley - don't override any basic R structural methods, create
new ones with new names. You can make them more meaningful too. For
your example, maybe "messageCount(myObject)"?
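
A minimal sketch of that suggestion, assuming a hypothetical class "myClass" that stores a file path and a list of messages (the constructor and field names are made up for illustration; only the name messageCount comes from the suggestion above):

```r
# Hypothetical constructor: the real object in this thread also held an
# external pointer, which is left out here
new_myClass <- function(file, messages) {
  structure(list(file = file, messages = messages), class = "myClass")
}

# A new generic with its own name, instead of a length() method
messageCount <- function(x, ...) UseMethod("messageCount")
messageCount.myClass <- function(x, ...) length(x$messages)

myObject <- new_myClass("my/file/location", list("a", "b", "c"))

messageCount(myObject)  # counts the messages
length(myObject)        # still the number of fields in the underlying list
str(myObject)           # the default str() keeps working
```

Because length() is left alone, code that indexes from 1 to length(myObject) still sees the structure it expects.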

Barry
#
After toying around with some of the suggested fixes, I have to
agree that it's more trouble than it is worth to define my own methods
for primitives such as length(). It'll work just fine to move on and write
my own, separately named functions. Thanks to all for the explanations and
suggestions.

On Tue, Feb 2, 2016 at 11:23 AM, Barry Rowlingson <b.rowlingson at lancaster.ac.uk> wrote:
#

        
    > On Tue, Feb 2, 2016 at 3:28 PM, Hadley Wickham <h.wickham at gmail.com> wrote:
    >> I've found that it's a very bad idea to provide length or names
    >> methods for just this reason.

well, not quite, see below ..

    >>> After looking
    >>> for memory leaks and other errors I finally noticed that the str() on the
    >>> object of myClass looked odd. It returned something like this:
    >>> 
    >>> List of 82
    >>> $ file  : chr "my/file/location"
    >>> $ handle:<externalptr>
    >>> $ NA:
    >>> Error in object[[i]] : subscript out of bounds


    >>> My questions are, then, whether this behavior makes sense and what to do
    >>> about it. If I define my own str() method, will that fix it? I think I am
    >>> just misunderstanding what is going on with the methods I have defined.
    >>> Hopefully, someone can offer some clarity.

    > Defining a str on your class will at least fix the out of bounds error:

    > Create a trivial S3 class:

    >> z=list(1,22)
    >> class(z)="foo"

    > length method looks at the second element:

    >> length.foo=function(x){x[[2]]}
    >> length(z)
    > [1] 22

    > and str barfs:

    >> str(z)
    > List of 22
    > $ : num 1
    > $ : num 22
    > $ :Error in object[[i]] : subscript out of bounds

    > Define a str method:

    >> str.foo=function(object,...){for(i in 1:length(unclass(object))){str(unclass(object[[i]]))}}
    >> str(z)
    > num 1
    > num 22

    > BUT... the real problem is that S3 classes are seriously informal and
    > there's no concept of what methods you need to define on a class
    > because there's no concept of an "interface" that new classes have to
    > conform to. So stuff breaks, seemingly at random, and via action at a
    > distance. Somewhere something is going to expect z[[1]] to
    > z[[length(z)]] to exist, which is what the default str is doing...

Indeed.
Still, it can also be advantageous to define such methods *consistently*.

By *consistency*, I mean that at least

- names(obj) either returns NULL or a character vector of
  length length(obj), and that as Barry mentions,
- obj[[ i ]]  is meaningful  for (i in  seq_along(obj))
           [yes, seq_along(.) automatically works with your length() method !]

- often you'd want  obj[ i ]  to work consistently as well
  (sometimes identically to `[[`)

I'd say that oftentimes it may be easier (and more "rewarding")
to define such `[` and `[[` methods for your class anyway.

As author of str(), I'll declare the design(*) of str() to be such
that with these methods (length, names, `[`, `[[`) defined
consistently, str.default(obj) already works sensibly.  
The alternative is indeed to define your own str() method.  

You'd often want one of the two, because e.g.,
  str( list( <obj1>, <obj2> ) )
or similar things should work too.
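
To make the consistency conditions above concrete, here is a rough sketch for a hypothetical class "msgs" (the class, its constructor and its fields are assumptions for illustration, not anything from the original package). All four methods are defined in terms of the same underlying list of messages, so they agree with each other:

```r
# Hypothetical constructor: a file path plus a named list of messages
new_msgs <- function(file, messages) {
  structure(list(file = file, messages = messages), class = "msgs")
}

# All four methods describe the same thing: the messages, not the raw fields
length.msgs <- function(x) length(unclass(x)$messages)
names.msgs  <- function(x) names(unclass(x)$messages)
`[[.msgs`   <- function(x, i, ...) unclass(x)$messages[[i]]
`[.msgs`    <- function(x, i, ...) {
  raw <- unclass(x)
  new_msgs(raw$file, raw$messages[i])
}

obj <- new_msgs("my/file/location", list(a = "hello", b = "world"))

# The consistency conditions from the list above
n <- length(obj)
stopifnot(is.null(names(obj)) || length(names(obj)) == n)
for (i in seq_len(n)) obj[[i]]   # every index up to length(obj) is valid
stopifnot(length(obj[1]) == 1L)  # subsetting returns a shorter "msgs" object
```

With methods like these in place, str(obj) would be expected to go through the default method without running past the end of the object, though that is worth checking on your own class.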

--
(*) to be honest, str() grew and developed largely for historical
    reasons, so the above is more of an "implementation principle"

    > +1 on Hadley - don't override any basic R structural methods, create
    > new ones with new names. You can make them more meaningful too. For
    > your example, maybe "messageCount(myObject)"?

    > Barry