[R-pkg-devel] tibbles are not data frames
On Tue, Sep 26, 2017 at 10:40 AM, Joris Meys <Joris.Meys at ugent.be> wrote:
On Tue, Sep 26, 2017 at 5:33 PM, Hadley Wickham <h.wickham at gmail.com> wrote:
I for one am happy this discussion pops up, because it's a piece of advice I give to my students as well: convert to a data.frame when you start your analysis, just to play it safe. And this discussion shows why that is (for the time being!) good advice. The moment tibbles become the default data format in R, or some R++, or in Julia for all I care, I'll be more than happy to burn that drop = FALSE at the stake. But for now we can't ignore the differences and the potential for conflicts when you try to use a tibble instead of a data.frame.
I think this is sub-optimal advice because most functions do work fine with tibbles.
Most. Not all. Either tibbles work exactly like a data.frame, or they don't. If they do, I wouldn't give that advice. But they don't.
They work 95% like a data frame. Seems odd to recommend that you coerce 100% of the time for a <5% of the time problem.
It is only a few packages (largely written some time ago) that don't. And typically, if they don't work with tibbles, you'll get a (usually slightly confusing) error message because some function will get a data frame instead of a vector. So as far as I can tell, you only need to as.data.frame() retrospectively, not prospectively. Are you aware of any code that returns an incorrect result (i.e. no error) when given a tibble instead of a data frame?
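library(tibble)
x <- tibble(a = 1:5, b = 5:1)
relcount <- function(x, id){
  # x[, id] is a vector for a data.frame, but a one-column tibble for a tibble
  table(x[,id]) / length(x[,id])
}
relcount(x, "a")                 # tibble: length() is 1, so every value is 1
relcount(as.data.frame(x), "a")  # data.frame: length() is 5, so every value is 0.2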
You're welcome.
Obviously you can contrive an example that fails (why wouldn't you use nrow() here?). I meant an existing function in a package.

Hadley
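
For readers following along, here is a minimal sketch of one way the relcount() example above could be written so that it gives the same answer for both classes. The name relcount2 is hypothetical and is not from any package. It relies only on `[[`, which extracts a single column as a vector from both a tibble and a data.frame.

```r
library(tibble)

x <- tibble(a = 1:5, b = 5:1)

# relcount2 is a hypothetical rewrite of relcount() from the thread.
# `[[` extracts the column as a plain vector for tibbles and data frames
# alike, so length() counts rows in both cases.
relcount2 <- function(x, id) {
  col <- x[[id]]
  table(col) / length(col)
}

res_tbl <- relcount2(x, "a")
res_df  <- relcount2(as.data.frame(x), "a")
```

Both calls should return the same relative frequencies, because the function never depends on whether `[` drops a single column to a vector.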