S4 class extending data.frame?

Ben, Oleg --

Some solutions, which you've probably already thought of, are (a) move
the data.frame into its own slot, instead of extending it, (b) manage
the data.frame attributes yourself, or (c) reinvent the data.frame
from scratch as a proper S4 class (e.g., extending 'list' with
validity constraints on element length and homogeneity of element
content).

(b) places a lot of dependence on understanding the data.frame
implementation, and is probably too tricky (for me) to get right. (c)
is probably also tricky, and probably carries significant performance
overhead (e.g., object duplication during validity checking).
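
For what it's worth, a starting point for (c) might look roughly like
the sketch below. The class name 'ListFrame' and the particular checks
are assumptions made for illustration only; a real replacement for
data.frame would need a good deal more (row names, column names,
subsetting methods, and so on).

```r
library(methods)

# Hypothetical class: a list whose elements must all be atomic vectors
# of the same length. The name and the checks are illustrative only.
setClass("ListFrame", contains = "list",
    validity = function(object) {
        elts <- object@.Data
        # constraint on element length: every element equally long
        lens <- vapply(elts, length, integer(1))
        if (length(unique(lens)) > 1L)
            return("all elements must have the same length")
        # constraint on element content: each element is an atomic vector
        if (!all(vapply(elts, is.atomic, logical(1))))
            return("all elements must be atomic vectors")
        TRUE
    })

ok <- new("ListFrame", list(1:3, c("a", "b", "c")))

# A ragged list fails the validity check; capture the error for inspection
bad <- tryCatch(new("ListFrame", list(1:3, 1:2)), error = function(e) e)
```

The validity function runs over every element each time an object is
created or explicitly validated, which is where the overhead mentioned
above would come from.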

(a) means that you don't get automatic method inheritance. On the plus
side, you still get the structure. It is trivial to implement methods
like [, [[, etc. to dispatch on your object and act on the appropriate
slot. And in some sense you now know which methods (i.e., those you've
implemented) are supported on your object.
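
A minimal sketch of (a), assuming a class called 'DFHolder' with a
slot called 'df' (both names are made up for this example). The
methods simply forward to the data.frame in the slot; '[' re-wraps the
result so that you get your own class back.

```r
library(methods)

# Hypothetical class: the data.frame lives in its own slot
setClass("DFHolder", representation(df = "data.frame"))

# '[' forwards to the data.frame and returns a new DFHolder.
# Only the two-index form obj[i, j] is handled in this sketch.
setMethod("[", "DFHolder", function(x, i, j, ..., drop = FALSE) {
    # drop = FALSE so the result is always a data.frame and fits the slot
    initialize(x, df = x@df[i, j, drop = FALSE])
})

# '[[' extracts a single column from the data.frame in the slot
setMethod("[[", "DFHolder", function(x, i, j, ...) x@df[[i]])

obj <- new("DFHolder", df = data.frame(a = 1:3, b = letters[1:3]))
sub <- obj[2:3, ]     # still a DFHolder, now with two rows
col <- obj[["a"]]     # the integer column 'a'
```

Anything not defined this way (e.g., dim, names, $) is simply not
available on the object until you write the corresponding method.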

Oleg, here's my cautionary tale for extending list, where manually
subsetting the .Data slot mixes up the names (callNextMethod would
have done the right thing, but was not appropriate). This was quite a
subtle bug for me, because I hadn't been expecting named lists in my
object; the problem surfaced when sapply used the (incorrectly subset)
names attribute of the list. My solution in this case was to make sure
'names' were removed from lists used to construct objects. As a
consequence I lose a nice little bit of sapply magic.
[1] "A"
+     x@.Data <- x@.Data[i]
+     x
+ })
[1] "["
[1] "x"
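
The workaround described above (removing 'names' before constructing
the object) could be sketched as follows. The constructor 'makeA' is a
hypothetical helper, not something from the original session, and the
sketch does not try to reproduce the bug itself, whose details may
depend on the R version.

```r
library(methods)

setClass("A", contains = "list")

# Hypothetical constructor: strip names so that later subsetting of the
# data part cannot leave a stale names attribute behind
makeA <- function(lst) new("A", unname(lst))

lst <- list(x = 1, y = 2, z = 3)
a <- makeA(lst)

# The price: sapply no longer carries element names through to its result
named   <- sapply(lst,     function(elt) elt * 2)
unnamed <- sapply(a@.Data, function(elt) elt * 2)
```

Here 'named' keeps the names x, y, z, whereas 'unnamed' is a plain
numeric vector.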

Martin

Oleg Sklyar <osklyar at ebi.ac.uk> writes: