[R-pkg-devel] tibbles are not data frames
What I think is troublesome is that data.frame is part of the definition of the R language, and the expectation based on R's normal behaviour is that testing with is.data.frame() should be enough to ensure that an object can be treated as a data frame. We can think of different solutions for use in our packages, but the naive R user will always be surprised by the behaviour of tibbles, because package 'tibble' breaks the expectations of the R language with an exception. I do not know what the best solution could be, though. Maybe we should think of tibbles as a step towards R++ or R 4 or whatever future enhanced version of R, in which they will replace data frames completely. Hadley is correct in that they are a very significant improvement to R, but the problem is the inconsistent behaviour.

Pedro.
On 2017-09-26 16:01, Göran Broström wrote:
Thanks Gábor, that is OK. However, if I would like an input tibble to remain a tibble (after massaging) in the output, as a courtesy to the user, this will fail. I think that it works if I instead treat the input as a list: That's all 'the tibble way' does (in my case at least).

Göran

On 2017-09-26 14:17, Gábor Csárdi wrote:
Yes, basically tibbles violate the substitution principle. A lot of other packages do, probably base R as well, although it is sometimes hard to say, because there is no clear object hierarchy.

Let's take a step back, and see how you can check for a data frame argument.

1. Weak check.

    is.data.frame(arg)

This essentially means that you trust subclasses of data.frame to adhere to the substitution principle. While this is nice in theory, a lot of packages (including both major packages implementing subclasses of data.frame!) do not always adhere. So this is not really a safe solution.

Base R does this as well, sometimes, e.g. aggregate.data.frame has:

    if (!is.data.frame(x))
        x <- as.data.frame(x)

which is essentially equivalent to the weak check, since it leaves data.frame subclasses untouched.

2. Strong "check".

    arg <- as.data.frame(arg)

This is safer, because it does not rely on subclass implementors. It also has the additional benefit that your code is polymorphic: it works with any input, as long as it can be converted to a data frame.

Base R also uses this often, e.g. in merge.data.frame:

    nx <- nrow(x <- as.data.frame(x))
    ny <- nrow(y <- as.data.frame(y))

Gabor

Disclaimer: I do not represent the tibble authors in any way.

On Tue, Sep 26, 2017 at 11:21 AM, David Hugh-Jones <davidhughjones at gmail.com> wrote:
These replies seem to be missing the point, which is that old code has to be rewritten because tibbles don't behave like data frames. It is true that subclasses can override behaviour, but there is an implicit contract that the same methods should do the same things.

The as.xxx pattern seems weird to me, though I see it a lot. What is the point of inheritance if you always have to convert an object upwards before you can treat it as a member of the superclass?

I can see this argument will run...

David

On 26 September 2017 at 11:15, Gábor Csárdi <csardi.gabor at gmail.com> wrote:
What is the benefit here, compared to just calling as.data.frame() on it?

Gabor

On Tue, Sep 26, 2017 at 11:11 AM, Daniel Lüdecke <d.luedecke at uke.de> wrote:
Since tibbles add their class attributes first, you could use:

    tb <- tibble(a = 5)
    inherits(tb, "data.frame", which = TRUE) == 1

If "tb" is a data frame (only), TRUE is returned; for a tibble, FALSE. You could then coerce to a data frame:

    as.data.frame(tb)

-----Original Message-----
From: R-package-devel [mailto:r-package-devel-bounces at r-project.org] On Behalf Of Göran Broström
Sent: Tuesday, 26 September 2017 12:09
To: r-package-devel at r-project.org
Subject: Re: [R-pkg-devel] tibbles are not data frames

On 2017-09-26 11:56, Gábor Csárdi wrote:
On Tue, Sep 26, 2017 at 10:35 AM, Joris Meys <Joris.Meys at ugent.be> wrote:
I don't like the dropping of dimensions either. That doesn't change the fact that a tibble reacts differently from a data.frame. So tibbles do not inherit correctly from the class data.frame, and it can thus be argued that it's against OOP paradigms to pretend tibbles inherit from the class data.frame.
I have yet to see an OOP system in which a subclass cannot override the methods of its superclass. Not only is this in line with OOP paradigms, it is actually one of the essential OOP features. To be more constructive, if you have a function that only works with data frame inputs, then it is good practice to check that the supplied input is indeed a data frame. This is independent of tibbles.
It is not. I check input for being a data frame, but tibbles pass that test. That's the essence of the problem.
In practice it seems to me that an easy fix is to just call as.data.frame on the input. This should either convert it to a data frame, or throw an error.
Sure, but I still need to rewrite the package. Göran
For tibbles it drops the tbl* classes. Gabor
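To make the coercion approach from this exchange concrete, here is a minimal base-R sketch. The helper name as_plain_df and the object fake_tbl are hypothetical names chosen for this illustration, not code from any package mentioned in the thread. The stand-in only mimics a tibble's class vector, c("tbl_df", "tbl", "data.frame"), so that the example runs without the tibble package; a real tibble may differ in other respects.

```r
# Hypothetical helper: coerce any input to a plain data frame, or fail loudly.
as_plain_df <- function(arg) {
  tryCatch(
    as.data.frame(arg),
    error = function(e) {
      stop("'arg' cannot be converted to a data frame: ",
           conditionMessage(e), call. = FALSE)
    }
  )
}

# Stand-in for a tibble (assumption): a data frame with extra classes in front.
fake_tbl <- structure(data.frame(a = 1:3),
                      class = c("tbl_df", "tbl", "data.frame"))

class(fake_tbl)                     # three classes, "data.frame" last
class(as_plain_df(fake_tbl))        # classes in front of "data.frame" are dropped
class(as_plain_df(list(a = 1:3)))   # a list is converted to a data frame

# An input that cannot be coerced (here, a function) raises an error.
res <- try(as_plain_df(mean), silent = TRUE)
inherits(res, "try-error")
```

The trade-off raised earlier in the thread still applies: after coercion the output is a plain data frame, so an input tibble does not come back as a tibble unless the caller converts it again.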
Defensive coding techniques would check if it's a tibble and return an error saying a data.frame is expected. Unless tibbles inherit correctly from data.frame. I have nothing against tibbles. But calling them "data.frame" raises expectations that can't be fulfilled.
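The defensive check described here could be sketched as follows, building on the inherits(..., which = TRUE) idea suggested elsewhere in the thread. The function name require_plain_df and the stand-in object are hypothetical; the stand-in imitates only a tibble's class vector, so the example needs base R alone.

```r
# Hypothetical strict check: accept only objects whose FIRST class is
# "data.frame", i.e. reject every subclass, well-behaved or not.
require_plain_df <- function(arg) {
  pos <- inherits(arg, "data.frame", which = TRUE)  # position in class vector, 0 if absent
  if (!isTRUE(pos == 1)) {
    stop("a plain data.frame is expected, got an object of class: ",
         paste(class(arg), collapse = ", "), call. = FALSE)
  }
  invisible(arg)
}

df <- data.frame(a = 5)
# Stand-in for a tibble (assumption): same data, tibble-like class vector.
fake_tbl <- structure(data.frame(a = 5),
                      class = c("tbl_df", "tbl", "data.frame"))

require_plain_df(df)                                # passes silently
res <- try(require_plain_df(fake_tbl), silent = TRUE)
inherits(res, "try-error")                          # the subclass is rejected
```

Note that such a check is deliberately blunt: it also rejects subclasses that do honour the data.frame contract, which is one reason coercion with as.data.frame() may be preferable in practice.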
[...]
______________________________________________ R-package-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-package-devel
Pedro J. Aphalo
University Lecturer, Principal Investigator
Department of Biosciences, Plant Biology
University of Helsinki, Finland
e-mail: pedro.aphalo at helsinki.fi